I have a method-free Python class that is implemented as a doubly linked list node:
class DataNode:
def __init__(self, my_value):
self.my_value = my_value
self.prev = None
self.next = None
After initializing a number of DataNode instances, they are put in a list and linked:
# Suppose data_nodes is already filled with DataNode elements
for e1, e2 in zip(data_nodes, data_nodes[1:]):
e1.next = e2
e2.prev = e1
My question is: what would be the best way to serialize this list to JSON? In other words, how feasible would it be to write a somewhat generalized encoder for doubly linked lists? My program has a number of classes implemented this way, and it seems a bit wasteful to write a custom encoder for each doubly linked node class:
from json import JSONEncoder

class DataNodeEncoder(JSONEncoder):
def default(self, o):
return {'my_value': o.my_value}
class DataNode2Encoder(JSONEncoder):
def default(self, o):
return {'my_other_value1': o.my_other_value1,
'my_other_value2': o.my_other_value2}
I would serialize to a standard list, and then decode from the list back to a doubly linked list. You could give a DataList class a constructor that takes a list or separate values as argument(s) and builds the doubly linked list. I would use a sentinel node to keep the code simple.
Add an __iter__ method that iterates the values (not the node objects). This way you can easily convert a doubly linked list to a plain list, and from there to JSON.
Here is the code:
class DataNode:
def __init__(self, my_value, prev=None, nxt=None):
self.my_value = my_value
self.prev = prev
if prev:
prev.next = self
self.next = nxt
if nxt:
nxt.prev = self
class DataList(DataNode):
def __init__(self, *values):
super().__init__(None, self, self) # Sentinel node
# Support a single argument that is a list or tuple
if len(values) == 1 and type(values[0]) in (list, tuple):
values = values[0]
for value in values:
self.append(value)
def append(self, my_value):
DataNode(my_value, self.prev, self)
return self
def __iter__(self):
node = self.next
while node != self:
yield node.my_value
node = node.next
Here are some example uses:
# Create the doubly linked list from some given values
lst = DataList(1, 2, 3, 4, 5, 6)
# Add a few more
lst.append(7).append(8)
# Get as standard list of values (not nodes)
print(list(lst))
# Get as JSON:
import json
print(json.dumps(list(lst)))
# Build from JSON:
lst = DataList(json.loads("[2,3,5,7,11]"))
# Print the values
print(*lst)
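If you would still rather have a single reusable encoder instead of one per node class, a minimal sketch (the NodeEncoder name is mine): emit every attribute from the instance's __dict__ except the structural prev/next links.
import json

class NodeEncoder(json.JSONEncoder):
    def default(self, o):
        # keep the payload attributes, drop the structural links
        return {k: v for k, v in vars(o).items() if k not in ('prev', 'next')}

# data_nodes is the linked list from the question; works for DataNode, DataNode2, ...
print(json.dumps(data_nodes, cls=NodeEncoder))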
You could give each node a unique id and use a custom serializer and deserializer to regenerate the references. Mark the reference values so you can rebuild the list afterwards; with this you should be able to go through the list, generate the first part of the JSON, and then include the references:
{
  "$id": "1",
  "myValue": "lorem ipsum",
  "next": {
    "$id": "2",
    "myValue": "lorem ipsum",
    "next": {
      "$id": "3",
      "myValue": "lorem ipsum",
      "next": {
        "$id": "4",
        "myValue": "lorem ipsum",
        "prev": {
          "$ref": "3"
        }
      },
      "prev": {
        "$ref": "2"
      }
    },
    "prev": {
      "$ref": "1"
    }
  }
}
This can be done in two passes: one setting the next elements, and one back-referencing the prev elements where they exist. This keeps your linked list structure in JSON. However, if it is not necessary to preserve the linked-list structure, you could also serialize it to a regular array and rebuild the linked list afterward.
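A minimal sketch of those two passes, assuming the DataNode class from the question; it flattens the nodes into an array keyed by "$id" rather than nesting them, which is a variation on the same idea:
import json

def serialize_list(head):
    # pass 1: number every node so it can be referenced
    nodes, ids = [], {}
    node = head
    while node is not None:
        ids[id(node)] = str(len(nodes) + 1)
        nodes.append(node)
        node = node.next
    # pass 2: emit values, with prev/next as "$ref" entries
    return json.dumps([
        {"$id": ids[id(n)],
         "myValue": n.my_value,
         "prev": {"$ref": ids[id(n.prev)]} if n.prev else None,
         "next": {"$ref": ids[id(n.next)]} if n.next else None}
        for n in nodes])

def deserialize_list(payload):
    records = json.loads(payload)
    nodes = {r["$id"]: DataNode(r["myValue"]) for r in records}
    for r in records:  # rebuild the links from the "$ref" marks
        nodes[r["$id"]].prev = nodes[r["prev"]["$ref"]] if r["prev"] else None
        nodes[r["$id"]].next = nodes[r["next"]["$ref"]] if r["next"] else None
    return nodes[records[0]["$id"]] if records else None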
I have an API that can respond with an array like this:
[
{
"id": 1,
"title": "Warning"
},
{
"id": 2,
"title": "Warning"
}
]
Sometimes it can respond with just an empty array:
[]
In my case I created a class for this object, something like this:
from typing import Dict, List

class Warning:
    def __init__(self, data: Dict):
        if data:
            self.id: int = data["id"]
            self.title: str = data["title"]
if __name__ == '__main__':
data = {
"id": 123455,
"title": "Warning"
}
empty_data = {}
object1: List[Warning] = [Warning(data)]
object2: List[Warning] = [Warning(empty_data)]
if object1:
print("we have a warnings")
if object2:
print("we don't have warnings")
What I can't figure out is: how can I check whether I got a list of objects with empty fields, like object2?
I would suggest looking at the __bool__ special method, which lets you define the truth value of a Python class's instances.
However, you will also need to decide what the boolean value of the list should be e.g. should bool([Warning(empty_data), Warning(data)]) return False or True?
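A minimal sketch of that idea: give the class a __bool__ so empty instances are falsy, then test the elements rather than the list itself (here with any(), which asks "is there at least one real warning?"):
from typing import Dict, List

class Warning:
    def __init__(self, data: Dict):
        # .get() leaves the fields as None when the payload is empty
        self.id = data.get("id")
        self.title = data.get("title")

    def __bool__(self) -> bool:
        # an instance is truthy only when it actually carries data
        return self.id is not None

warnings: List[Warning] = [Warning({})]
if any(warnings):  # the list itself is non-empty, so test its elements
    print("we have warnings")
else:
    print("we don't have warnings")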
I'm a beginner programmer, and I just started learning about nested lists and dictionaries. I have a task to create a file system, with a Directory class and its attributes.
class Directory:
def __init__(self, name: str, parent: Optional['Directory'], children: List[Optional['Directory']]):
self.name = name
self.parent = parent
self.children = children
I'm supposed to build a function that creates this file system recursively, given the root and its directories from a dictionary. A parent is a directory that includes the current dir as one of its children. Any dir that doesn't have children is supposed to be an empty directory.
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"]
"dirG": ["dirX", "dirY"]}
I've been trying to do this, and I think I know how to create the directories recursively; however, I have no idea what to put in dir.parent's place without any additional imports. With root there's no problem, because its parent is None, but further in the process I don't know how to pass a child's parent (which is supposed to be a Directory) as one of its attributes, since I'm recursing from there. Do you have any idea how to do that? Here's the code I have so far:
def create_system(system: Dict[str, List[str]], parent_children: List[str]) -> Optional[List[Optional['Directory']]]:
children: List[Optional['Directory']] = []
for child in parent_children:
if child in system.keys():
children.append(Directory(child, parent, create_system(system, list(system.get(child)))))
else:
children.append(Directory(child, parent, []))
return children
def root(system: Dict[str, List[str]]) -> Optional['Directory']:
return Directory("root", None, create_system(system, list(system.get("root"))))
Thank you for any responses!
Your goal is to transform the dictionary
system = {
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"]
"dirG": ["dirX", "dirY"]
}
into the following tree:
          root
         /    \
      dirA    dirB
      /          \
   dirC          dirE
   /   \
dirH    dirG
        /   \
     dirX   dirY
Hopefully it's clear that the return value of the process can be just the root. To guarantee that you visit every folder only after its parent has been created, you can use either a queue-based BFS approach or a recursive DFS approach.
Let's look at a simple BFS approach:
def create_system_bfs(system):
root = Directory('root', None, [])
    queue = [root]  # in practice, use collections.deque([root])
    while queue:
        current = queue.pop(0)
        for child in system.get(current.name, []):
            d = Directory(child, current, [])
            current.children.append(d)
            queue.append(d)
return root
The DFS version of that could be something like:
def create_system_dfs(system):
def create_node(name, parent):
d = Directory(name, parent, [])
d.children = [create_node(child, d) for child in system.get(name, [])]
return d
return create_node('root', None)
Keep in mind that there are other possible approaches. In both cases, the separate root helper function is completely unnecessary. The BFS approach is limited only by available heap memory; the DFS approach may also be limited by Python's recursion limit.
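A quick way to sanity-check either version is to build the tree and print it with indentation (print_tree is just an illustrative helper):
def print_tree(directory, depth=0):
    print("  " * depth + directory.name)
    for child in directory.children:
        print_tree(child, depth + 1)

print_tree(create_system_dfs(system))  # create_system_bfs(system) yields the same tree
# root
#   dirA
#     dirC
#       dirH
#       dirG
#         dirX
#         dirY
#   dirB
#     dirE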
Before getting all tangled up in classes, we can think in terms of an ordinary function -
def paths(t, init="root"):
def loop(q, path):
if isinstance(q, dict):
for (dir, v) in q.items():
yield from loop(v, [*path, dir])
elif isinstance(q, list):
for dir in q:
yield from loop \
( t[dir] if dir in t else None
, [*path, dir]
)
else:
yield "/".join(path)
yield from loop(t[init], ["."])
Using paths is easy; simply call it on your input tree -
input = {
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"],
"dirG": ["dirX", "dirY"]
}
for path in paths(input):
print(path)
./dirA/dirC/dirH
./dirA/dirC/dirG/dirX
./dirA/dirC/dirG/dirY
./dirB/dirE
Using paths allows us to easily create the directories we need -
import os
for path in paths(input):
os.makedirs(path)
This is an opportunity to learn about reusable modules and mutual recursion. The solution in this answer solves your specific problem without any modification of the modules written in another answer. The distinct advantage of this approach is that tree has zero knowledge of your node shape and lets you define any output shape.
Below we create the tree using a plain dict with name, parent, and children properties. tree does not make this choice for you; a different structure or a custom class can be used if desired -
from tree import tree
input = {
None: ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"],
"dirG": ["dirX", "dirY"]
}
result = tree \
( flatten(input)
, parent
, lambda node, children:
dict \
( name=name(node)
, parent=parent(node)
, children=children(name(node))
)
)
print(result)
[
{
"name": "dirA",
"parent": None,
"children": [
{
"name": "dirC",
"parent": "dirA",
"children": [
{
"name": "dirH",
"parent": "dirC",
"children": []
},
{
"name": "dirG",
"parent": "dirC",
"children": [
{
"name": "dirX",
"parent": "dirG",
"children": []
},
{
"name": "dirY",
"parent": "dirG",
"children": []
}
]
}
]
}
]
},
{
"name": "dirB",
"parent": None,
"children": [
{
"name": "dirE",
"parent": "dirB",
"children": []
}
]
}
]
In order to use tree, we defined a way to flatten the input nodes -
from itertools import chain

def flatten(t):
  seq = chain.from_iterable \
    ( map(lambda _: (_, k), v)
      for (k, v) in t.items()
    )
  return list(seq)
print(flatten(input))
[ ('dirA', None)
, ('dirB', None)
, ('dirC', 'dirA')
, ('dirH', 'dirC')
, ('dirG', 'dirC')
, ('dirE', 'dirB')
, ('dirX', 'dirG')
, ('dirY', 'dirG')
]
And we also defined the primary key and foreign key. Here we use name and parent, but you can choose whichever names you like -
def name(t):
return t[0]
def parent(t):
return t[1]
To learn more about the tree module and some of its benefits, see the original Q&A.
I have a strategic issue with writing a program to do a particular job.
I have CSV files like:
Column1   Column2
------- ----------
parent1 [child1, child2, child3]
parent2 [child4, child5, child6]
child1 [child7, child8]
child5 [child10, child33]
... ...
It is unknown how deep each of those lists extends, and I want to loop through them.
Code:
def make_parentClass(self):
for i in self.csv_rows_list:
self.parentClassList.append(parentClass(i))
# after first Parent
for i in self.parentClassList:
        if i.children != []:
for child in i.children:
for z in self.parentClassList:
if str(child) == str(z.node_parent):
i.node_children.append(z)
self.parentClassList.remove(z)
class parentClass():
    def __init__(self, the_list):
        self.node_parent = the_list[0]
        self.children = the_list[1]
        self.node_children = []  # per-instance list, not a shared class attribute
The above code might be a solution if I can find a way to iterate. I hope the question makes sense now.
Output:
My aim is to build up a tree view in another language, but first I need to produce this output in JSON format. So the output is expected to be something like:
{
'parent1': {'child1': {'child7': {}, 'child8': {}},
'child2': {},
'child3': {},
},
'parent2': {
'child4':{},
'child5': {
'child10':{},
'child33':{}
},
'child6':{}
}
}
I would recommend a solution using two dictionaries: one nested dictionary holding the actual data structure you plan to convert to JSON, and one flat dictionary that lets you find the keys. Since everything is a reference in Python, you can make sure that both dictionaries share the exact same values. Carefully modifying the flat dictionary will build the nested structure for you.
The following code assumes that you have already managed to split each line into a string parent and a list children, containing the values from the two columns; a possible file_iterator is sketched at the end of this answer.
json_dict = {}
flat_dict = {}
for parent, children in file_iterator():
if parent in flat_dict:
value = flat_dict[parent]
else:
value = {}
flat_dict[parent] = json_dict[parent] = value
for child in children:
flat_dict[child] = value[child] = {}
Running this produces json_dict like this:
{
'parent1': {
'child1': {
'child7': {},
'child8': {}
},
'child2': {},
'child3': {}
},
'parent2': {
'child4': {},
'child5': {
'child10': {},
'child33': {}
},
'child6': {}
}
}
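For completeness, a possible file_iterator, assuming the first whitespace separates the two columns and the second column is a bracketed, comma-separated list as shown in the question (the filename is a placeholder):
def file_iterator(filename="data.csv"):
    with open(filename) as f:
        for line in f:
            # split off the parent, then parse "[child1, child2, ...]"
            parent, _, rest = line.strip().partition(" ")
            children = [c.strip() for c in rest.strip().strip("[]").split(",") if c.strip()]
            yield parent, children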
I have a variable declaration as follows
my_var = typing.List[typing.Tuple[int, int]]
and I want to write a validator as follows
schema_validator = "my_var": {
"type": "list",
"empty": False,
"items": [
{"type": "tuple"},
{"items": [
{"type": "int"}, {"type": "int"}
]}
]
}
The Cerberus documentation does not give an example validator for tuples.
How can this be accomplished?
Given your type annotation typing.List[typing.Tuple[int, int]], you expect an arbitrary-length list of two-value tuples where each value is an integer.
from cerberus import TypeDefinition, Validator

class MyValidator(Validator):
    # add a type definition to a validator subclass
    types_mapping = Validator.types_mapping.copy()
    types_mapping['tuple'] = TypeDefinition('tuple', (tuple,), ())

schema = {
    'my_var': {
        'type': 'list',
        'empty': False,
        'schema': {  # the number of items is undefined
            'type': 'tuple',
            'items': 2 * ({'type': 'integer'},)  # Cerberus' built-in name is 'integer'
        }
    }
}

validator = MyValidator(schema)
It's important to understand the difference between the items and the schema rules.
Mind that the default list type actually maps to the more abstract Sequence type, so you might want to add another, stricter type for that.
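Hypothetical usage of the validator defined above:
print(validator.validate({'my_var': [(1, 2), (3, 4)]}))  # True
print(validator.validate({'my_var': [(1, 'a')]}))        # False
print(validator.errors)                                  # points at the offending item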
While this isn't the cleanest solution, it will certainly do what you want.
from cerberus import Validator, TypeDefinition
class MyValidator(Validator):
def __init__(self, *args, **kwargs):
# Add the tuple type
tuple_type = TypeDefinition("tuple", (tuple,), ())
Validator.types_mapping["tuple"] = tuple_type
# Call the Validator constructor
super(MyValidator, self).__init__(*args, **kwargs)
def _validate_is_int_two_tuple(self, is_int_two_tuple, field, value):
''' Test that the value is a 2-tuple of ints
The rule's arguments are validated against this schema:
{'type': 'boolean'}
'''
if is_int_two_tuple:
# Check the type
if type(value) != tuple:
self._error(field, "Must be of type 'tuple'")
            # Check the length first so the element check below is safe
            if len(value) != 2:
                self._error(field, "Tuple must have two elements")
            # Check the element types (only when there are exactly two)
            elif type(value[0]) != int or type(value[1]) != int:
                self._error(field, "Both tuple values must be of type 'int'")
data = {"mylist": [(1,1), (2,2), (3,3)]}
schema = {
"mylist": {
"type": "list",
"schema": {
"type": "tuple",
"is_int_two_tuple": True
}
}
}
v = MyValidator(schema)
print("Validated: {}".format(v.validate(data)))
print("Validation errors: {}".format(v.errors))
print("Normalized result: {}".format(v.normalized(data)))
So, as bro-grammer pointed out, a custom data type will get you validation of the type, but that's it. From the schema you provided, it looks like you also want to validate other features, such as the length of the tuple and the types of the elements in the tuple. Doing that requires more than a simple TypeDefinition for tuples.
Extending Validator to include a rule for this specific use case isn't ideal, but it will do what you want. A more comprehensive solution would be to create a TupleValidator subclass that has rules for validating the length, element types, order, etc. of tuples.
Given data organized in JSON format (code example below), how can we get the path of keys and sub-keys associated with a given value?
i.e.
Given the input "23314", we need to return the list:
Fanerozoico, Cenozoico, Quaternario, Pleistocenico, Superior.
Since the data is a JSON file, we decoded it using Python and the json lib:
import json
def decode_crono(crono_file):
with open(crono_file) as json_file:
data = json.load(json_file)
From here on, though, we do not know how to handle it in a way that gets what we need.
We can access keys like this:
k = data["Fanerozoico"]["Cenozoico"]["Quaternario"]["Pleistocenico"].keys()
or values like this:
v = data["Fanerozoico"]["Cenozoico"]["Quaternario"]["Pleistocenico"]["Superior"].values()
but this is still far from what we need.
{
"Fanerozoico": {
"id": "20000",
"Cenozoico": {
"id": "23000",
"Quaternario": {
"id": "23300",
"Pleistocenico": {
"id": "23310",
"Superior": {
"id": "23314"
},
"Medio": {
"id": "23313"
},
"Calabriano": {
"id": "23312"
},
"Gelasiano": {
"id": "23311"
}
}
}
}
}
}
It's a little hard to understand exactly what you are after here, but it seems that you have a bunch of nested JSON and you want to search it for an id and return a list that represents the path down the nesting. If so, the quick and easy approach is to recurse on the dictionary (that you got from json.load) and collect the keys as you go. When you find an 'id' key that matches the id you are searching for, you are done. Here is some code that does that:
def all_keys(search_dict, key_id):
def _all_keys(search_dict, key_id, keys=None):
if not keys:
keys = []
for i in search_dict:
if search_dict[i] == key_id:
return keys + [i]
if isinstance(search_dict[i], dict):
potential_keys = _all_keys(search_dict[i], key_id, keys + [i])
if 'id' in potential_keys:
keys = potential_keys
break
return keys
return _all_keys(search_dict, key_id)[:-1]
The reason for the nested function is to strip off the 'id' key that would otherwise be on the end of the list.
This is really just to give you an idea of what a solution might look like. Beware the Python recursion limit!
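For example, assuming data is the dict decoded from the JSON above:
print(all_keys(data, "23314"))
# ['Fanerozoico', 'Cenozoico', 'Quaternario', 'Pleistocenico', 'Superior']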
Based on the assumption that you need the full dictionary path until a key named id has a particular value, here's a recursive solution that iterates the whole dict. Bear in mind that:
The code is not optimized at all
For huge JSON objects it might blow the recursion limit :)
It will stop at the first value found (in theory there shouldn't be more than one if the JSON is semantically correct)
The code:
import json
SEARCH_KEY_NAME = "id"
FOUND_FLAG = ()
CRONO_FILE = "a.jsn"
def decode_crono(crono_file):
with open(crono_file) as json_file:
return json.load(json_file)
def traverse_dict(dict_obj, value):
for key in dict_obj:
key_obj = dict_obj[key]
if key == SEARCH_KEY_NAME and key_obj == value:
return FOUND_FLAG
        elif isinstance(key_obj, dict):
inner = traverse_dict(key_obj, value)
if inner is not None:
return (key,) + inner
return None
if __name__ == "__main__":
value = "23314"
json_dict = decode_crono(CRONO_FILE)
result = traverse_dict(json_dict, value)
    print(result)