Python JSON to tree with anytree - python

I have JSON data that looks something like
{'Tree': [
{"From": "1",
"To": "2"},
{"From": "2",
"To": "3"}
]}
basically an array of all existing references between nodes. I want to be able to draw trees starting from a chosen root node using this data.
I find that using anytree I must link a node to a parent node, but ideally I want to simply link it to the name and have it put together the tree structure on its own.
# I want to do this
child[0] = Node(data[0]["To"], parent=data[0]["From"])
# But I think I need to assign a node, not its name
child[1] = Node(data[1]["To"], parent=child[?])
Suggestions on how to do this?

Do this in two steps:
1. Convert the input format into an adjacency list: a dictionary keyed by the node keys, with associated lists that hold the keys of the connected nodes.
2. Choose a root node, and then create the anytree tree using the adjacency list.
Here are two functions that deal with these two tasks:
from anytree import Node, RenderTree
from collections import defaultdict

def makegraph(edges):
    adj = defaultdict(list)
    for edge in edges:
        a, b = edge.values()
        adj[a].append(b)
        adj[b].append(a)
    return adj

def maketree(adj, nodekey):
    visited = set()

    def dfs(nodekey, parent):
        if nodekey not in visited:
            visited.add(nodekey)
            node = Node(nodekey, parent)
            for childkey in adj[nodekey]:
                dfs(childkey, node)
            return node

    return dfs(nodekey, None)
Here is how to use it:
edges = [
{"From": "1", "To": "2"},
{"From": "2", "To": "3"}
]
adj = makegraph(edges)
# Choose "1" as root:
root = maketree(adj, '1')
print(RenderTree(root))
# Now let's choose "2" as root:
root = maketree(adj, '2')
print(RenderTree(root))
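To check the two steps without installing anytree, the same idea can be sketched with nested dictionaries standing in for anytree nodes. The `to_nested` helper and the `{"name": ..., "children": [...]}` shape are illustrative assumptions, not part of anytree; the edge keys are read by name rather than by position.

```python
from collections import defaultdict

def makegraph(edges):
    # Undirected adjacency list: every edge is stored in both directions.
    adj = defaultdict(list)
    for edge in edges:
        a, b = edge["From"], edge["To"]
        adj[a].append(b)
        adj[b].append(a)
    return adj

def to_nested(adj, rootkey):
    # Hypothetical helper: depth-first walk that returns plain dicts
    # instead of anytree Node objects.
    visited = set()

    def dfs(key):
        visited.add(key)
        return {
            "name": key,
            "children": [dfs(k) for k in adj[key] if k not in visited],
        }

    return dfs(rootkey)

edges = [
    {"From": "1", "To": "2"},
    {"From": "2", "To": "3"},
]
adj = makegraph(edges)
tree_from_1 = to_nested(adj, "1")
tree_from_2 = to_nested(adj, "2")
```

Rooting at "1" gives a chain 1, 2, 3, while rooting at "2" gives "1" and "3" as siblings, which is the effect of choosing a different root on the same adjacency list.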

Related

How to serialize double linked list in Python

I have a method-free Python class which is implemented as a double-linked list node:
class DataNode:
    def __init__(self, my_value):
        self.my_value = my_value
        self.prev = None
        self.next = None
After initializing a number of DataNode instances, they are put in a list and linked:
# Suppose data_nodes is already filled with DataNode elements
for e1, e2 in zip(data_nodes, data_nodes[1:]):
    e1.next = e2
    e2.prev = e1
My question is: What would be the best way to serialize this list to .json? In other words, how plausible would it be to write a somewhat generalized encoder for doubly linked lists? In my program I have a number of classes implemented this way, and it seems a bit wasteful to write a custom encoder for each doubly linked node class:
from json import JSONEncoder

class DataNodeEncoder(JSONEncoder):
    def default(self, o):
        return {'my_value': o.my_value}

class DataNode2Encoder(JSONEncoder):
    def default(self, o):
        return {'my_other_value1': o.my_other_value1,
                'my_other_value2': o.my_other_value2}
I would serialise it as a standard list, and then decode it from a list back to a doubly linked list. You could make a constructor for a DataList class that can take a list or separate values as argument(s), and will create the doubly linked list. I would use a sentinel node to keep the code simple.
Add an __iter__ method that will iterate the values (not the node objects). This way you can easily convert a doubly linked list to a list, and from there to JSON.
Here is the code:
class DataNode:
    def __init__(self, my_value, prev=None, nxt=None):
        self.my_value = my_value
        self.prev = prev
        if prev:
            prev.next = self
        self.next = nxt
        if nxt:
            nxt.prev = self

class DataList(DataNode):
    def __init__(self, *values):
        super().__init__(None, self, self)  # Sentinel node
        # Support a single argument that is a list or tuple
        if len(values) == 1 and type(values[0]) in (list, tuple):
            values = values[0]
        for value in values:
            self.append(value)

    def append(self, my_value):
        DataNode(my_value, self.prev, self)
        return self

    def __iter__(self):
        node = self.next
        while node != self:
            yield node.my_value
            node = node.next
Here are some example uses:
# Create the doubly linked list from some given values
lst = DataList(1, 2, 3, 4, 5, 6)
# Add a few more
lst.append(7).append(8)
# Get as standard list of values (not nodes)
print(list(lst))
# Get as JSON:
import json
print(json.dumps(list(lst)))
# Build from JSON:
lst = DataList(json.loads("[2,3,5,7,11]"))
# Print the values
print(*lst)
You could give each node a unique ID and use a custom serializer and deserializer to regenerate the references. Mark the reference values so you can rebuild the list afterwards. With this you should be able to go through the list, generate the first part of the JSON, and then include the references:
{
    "$id": "1",
    "myValue": "lorem ipsum",
    "next": {
        "$id": "2",
        "myValue": "lorem ipsum",
        "next": {
            "$id": "3",
            "myValue": "lorem ipsum",
            "next": {
                "$id": "4",
                "myValue": "lorem ipsum",
                "prev": {
                    "$ref": "3"
                }
            },
            "prev": {
                "$ref": "2"
            }
        },
        "prev": {
            "$ref": "1"
        }
    }
}
This can be done in two passes: one setting the next elements and one back-referencing the prev elements where they exist. This is a way to keep your linked list in JSON. However, if it is not necessary to preserve the linked-list structure, you could also serialize it to a regular array and rebuild the linked list afterward.
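The "$id"/"$ref" idea might be sketched as follows. This is only an illustration under assumptions: the `encode` and `decode` helper names are hypothetical, the IDs are simply running numbers, and the sketch does both links in a single pass rather than the two passes described above.

```python
import json

class DataNode:
    def __init__(self, my_value):
        self.my_value = my_value
        self.prev = None
        self.next = None

def encode(head):
    # Walk forward from the head; nest each node under its predecessor's
    # "next" and point back to the predecessor with a "$ref".
    root = last = None
    node, count = head, 0
    while node is not None:
        count += 1
        out = {"$id": str(count), "myValue": node.my_value}
        if last is None:
            root = out
        else:
            out["prev"] = {"$ref": last["$id"]}
            last["next"] = out
        last = out
        node = node.next
    return root

def decode(obj):
    # Rebuild the nodes, resolving each "$ref" through a lookup by "$id".
    by_id = {}
    head = None
    while obj is not None:
        node = DataNode(obj["myValue"])
        by_id[obj["$id"]] = node
        if head is None:
            head = node
        ref = obj.get("prev")
        if ref is not None:
            node.prev = by_id[ref["$ref"]]
            node.prev.next = node
        obj = obj.get("next")
    return head

# Round trip through a JSON string
nodes = [DataNode(v) for v in ("a", "b", "c")]
for e1, e2 in zip(nodes, nodes[1:]):
    e1.next = e2
    e2.prev = e1
text = json.dumps(encode(nodes[0]))
head = decode(json.loads(text))
```

Because every node is nested inside its predecessor, the JSON depth grows with the list length, so very long lists may run into recursion limits in the json module; the flat array approach avoids that.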

How can I create nested folders with attributes in python?

I'm a beginner programmer, and I just started learning about nested lists and dictionaries. I have a task to create a system of files, with a class Directory and its attributes.
class Directory:
    def __init__(self, name: str, parent: Optional['Directory'], children: List[Optional['Directory']]):
        self.name = name
        self.parent = parent
        self.children = children
I'm supposed to build a function to create this system of files recursively, given the root and its directories from a dictionary. A parent is a directory which includes the current dir as one of its children. Any dir which doesn't have any children is supposed to be an empty directory.
{
    "root": ["dirA", "dirB"],
    "dirA": ["dirC"],
    "dirC": ["dirH", "dirG"],
    "dirB": ["dirE"],
    "dirG": ["dirX", "dirY"]
}
I've been trying to do this and I think I know how to create directories recursively, however I have no idea what to put in dir.parent place without any additional imports. With root, there's no problem because it is None but further in process I don't know how to place child's parent (which is supposed to be Directory) as one of his attributes since I'm going recursively from there. Do you have any idea how to do that? Here's code which I have so far:
def create_system(system: Dict[str, List[str]], parent_children: List[str]) -> Optional[List[Optional['Directory']]]:
    children: List[Optional['Directory']] = []
    for child in parent_children:
        if child in system.keys():
            children.append(Directory(child, parent, create_system(system, list(system.get(child)))))
        else:
            children.append(Directory(child, parent, []))
    return children

def root(system: Dict[str, List[str]]) -> Optional['Directory']:
    return Directory("root", None, create_system(system, list(system.get("root"))))
Thank you for any responses!
Your goal is to transform the dictionary
system = {
    "root": ["dirA", "dirB"],
    "dirA": ["dirC"],
    "dirC": ["dirH", "dirG"],
    "dirB": ["dirE"],
    "dirG": ["dirX", "dirY"]
}
into the following tree:
          root
         /    \
      dirA    dirB
      /          \
   dirC          dirE
   /  \
dirH  dirG
      /  \
   dirX  dirY
Hopefully, it's clear that the return value of the process can be just the root. To guarantee that you hit every folder only after its parent has been created, you can use either a queue-based BFS approach or a recursive DFS approach.
Let's look at a simple BFS approach:
def create_system_bfs(system):
    root = Directory('root', None, [])
    queue = [root]  # in practice, use collections.deque([root])
    while queue:
        current = queue.pop(0)
        for child in system.get(current.name, []):
            d = Directory(child, current, [])
            current.children.append(d)
            queue.append(d)
    return root
The DFS version of that could be something like:
def create_system_dfs(system):
    def create_node(name, parent):
        d = Directory(name, parent, [])
        d.children = [create_node(child, d) for child in system.get(name, [])]
        return d
    return create_node('root', None)
Keep in mind that there are other possible approaches. In both cases, the separate root function from the question is unnecessary. The BFS approach is limited only by available heap memory. The DFS approach may also be limited by the recursion depth (stack size).
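As a self-contained check that each child ends up with its parent set, here is a sketch of the BFS variant using collections.deque, as the code comment suggests. The `full_path` helper and the `root_name` parameter are assumptions added for illustration.

```python
from collections import deque
from typing import Dict, List, Optional

class Directory:
    def __init__(self, name: str, parent: Optional["Directory"],
                 children: List["Directory"]):
        self.name = name
        self.parent = parent
        self.children = children

def create_system_bfs(system: Dict[str, List[str]],
                      root_name: str = "root") -> Directory:
    root = Directory(root_name, None, [])
    queue = deque([root])
    while queue:
        current = queue.popleft()  # O(1), unlike list.pop(0)
        for child in system.get(current.name, []):
            d = Directory(child, current, [])  # parent is known here
            current.children.append(d)
            queue.append(d)
    return root

def full_path(d: Directory) -> str:
    # Hypothetical helper: follow the parent links back up to the root.
    parts = []
    node: Optional[Directory] = d
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return "/".join(reversed(parts))

system = {
    "root": ["dirA", "dirB"],
    "dirA": ["dirC"],
    "dirC": ["dirH", "dirG"],
    "dirB": ["dirE"],
    "dirG": ["dirX", "dirY"],
}
root = create_system_bfs(system)
dir_g = root.children[0].children[0].children[1]
print(full_path(dir_g.children[1]))
```

The key point is that the parent object already exists when its children are created, so it can be passed straight into the Directory constructor.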
Before we get all tangled up in classes, we can think in terms of an ordinary function -
def paths(t, init="root"):
  def loop(q, path):
    if isinstance(q, dict):
      for (dir, v) in q.items():
        yield from loop(v, [*path, dir])
    elif isinstance(q, list):
      for dir in q:
        yield from loop \
          ( t[dir] if dir in t else None
          , [*path, dir]
          )
    else:
      yield "/".join(path)
  yield from loop(t[init], ["."])
Using paths is easy, simply call it on your input tree -
input = {
  "root": ["dirA", "dirB"],
  "dirA": ["dirC"],
  "dirC": ["dirH", "dirG"],
  "dirB": ["dirE"],
  "dirG": ["dirX", "dirY"]
}
for path in paths(input):
  print(path)
./dirA/dirC/dirH
./dirA/dirC/dirG/dirX
./dirA/dirC/dirG/dirY
./dirB/dirE
Using paths allows us to easily create the directories we need -
import os
for path in paths(input):
  os.makedirs(path)
This is an opportunity to learn about reusable modules and mutual recursion. The solution in this answer solves your specific problem without any modification of the modules written in another answer. The distinct advantage of this approach is that tree has zero knowledge of your node shape and allows you to define any output shape.
Below we create a tree using plain dict with name, parent, and children properties. tree does not make this choice for you. A different structure or a custom class can be used, if desired -
from tree import tree

input = {
  None: ["dirA", "dirB"],
  "dirA": ["dirC"],
  "dirC": ["dirH", "dirG"],
  "dirB": ["dirE"],
  "dirG": ["dirX", "dirY"]
}

result = tree \
  ( flatten(input)
  , parent
  , lambda node, children:
      dict \
        ( name=name(node)
        , parent=parent(node)
        , children=children(name(node))
        )
  )

print(result)
[
  {
    "name": "dirA",
    "parent": None,
    "children": [
      {
        "name": "dirC",
        "parent": "dirA",
        "children": [
          {
            "name": "dirH",
            "parent": "dirC",
            "children": []
          },
          {
            "name": "dirG",
            "parent": "dirC",
            "children": [
              {
                "name": "dirX",
                "parent": "dirG",
                "children": []
              },
              {
                "name": "dirY",
                "parent": "dirG",
                "children": []
              }
            ]
          }
        ]
      }
    ]
  },
  {
    "name": "dirB",
    "parent": None,
    "children": [
      {
        "name": "dirE",
        "parent": "dirB",
        "children": []
      }
    ]
  }
]
In order to use tree, we defined a way to flatten the input nodes -
from itertools import chain

def flatten(t):
  seq = chain.from_iterable \
    ( map(lambda _: (_, k), v)
      for (k,v) in t.items()  # iterate the argument, not the global input
    )
  return list(seq)
print(flatten(input))
[ ('dirA', None)
, ('dirB', None)
, ('dirC', 'dirA')
, ('dirH', 'dirC')
, ('dirG', 'dirC')
, ('dirE', 'dirB')
, ('dirX', 'dirG')
, ('dirY', 'dirG')
]
And we also defined the primary key and foreign key. Here we use name and parent, but you can choose whichever names you like -
def name(t):
  return t[0]

def parent(t):
  return t[1]
To learn more about the tree module and some of its benefits, see the original Q&A
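The tree module itself is not reproduced in this answer. As a rough idea of how a function with the same call shape could work, here is a stand-in sketch: it indexes the flat nodes by their foreign key and lets the maker callback request the children for any primary key. The body of `tree` below is an assumption for illustration only and may differ from the real module.

```python
from collections import defaultdict
from itertools import chain

def tree(nodes, foreign_key, maker, root=None):
    # Hypothetical stand-in for the imported `tree`; the real module may differ.
    index = defaultdict(list)
    for node in nodes:
        index[foreign_key(node)].append(node)

    def many(key):
        # All nodes whose foreign key equals `key`, each built by `one`.
        return [one(node) for node in index.get(key, [])]

    def one(node):
        # `maker` decides the output shape and calls `many` for the children.
        return maker(node, many)

    return many(root)

def flatten(t):
    # (child, parent) pairs, as in the answer above
    return list(chain.from_iterable(((child, k) for child in v) for k, v in t.items()))

def name(t):
    return t[0]

def parent(t):
    return t[1]

data = {
    None: ["dirA", "dirB"],
    "dirA": ["dirC"],
    "dirC": ["dirH", "dirG"],
    "dirB": ["dirE"],
    "dirG": ["dirX", "dirY"],
}

result = tree(
    flatten(data),
    parent,
    lambda node, children: dict(
        name=name(node),
        parent=parent(node),
        children=children(name(node)),
    ),
)
```

Here `many` and `one` call each other, which is the mutual recursion mentioned above, and the output shape is decided entirely by the lambda passed as maker.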

Increment the value of a node in the json after converting the xml to json structure

I have an xml of the current format. I convert this xml to a json using the xmltodict library in python.
<?xml version="1.0" encoding="UTF-8" ?>
<MyHouse>
    <Garden>
        <InfoList>
            <status value = "0"/>
        </InfoList>
        <Flowers>
            <InfoList>
                <status value = "0"/>
            </InfoList>
        </Flowers>
    </Garden>
</MyHouse>
I want my json dict to look something like this after I send it to the xmltodict method.
json_tree =
{
    "MyHouse": {
        "Tid": "1",  --> Need to add this node and its value increments from '1'.
        "status": "0",  --> This node is added to the root level node ONLY as it
                            is not in the xml shown above !!
        "Garden": {
            "Tid": "2",  --> Incremented to 2
            "InfoList": {
                "status": {
                    "#value": "0"
                }
            },
            "Flowers": {
                "Tid": "3",  --> Incremented to 3
                "InfoList": {
                    "status": {
                        "#value": "0"
                    }
                }
            }
        }
    }
}
As we can see in the json structure above, I want to be able to add a default "status": "0" to the root node, which is "MyHouse" in this case.
I also want to be able to add the "Tid" for each of the nodes such as "Garden" and "Flowers". Note that there could be many more levels in the xml, but they are not shown here for simplicity. I would like to have a generic method.
My current implementation is as follows.
import xml.etree.ElementTree as ET
import xmltodict

def add_status(root, el_to_insert):
    # Add "id":"#" to the nodes of the xml
    for el in root:
        if len(list(el)):  # check if element has child nodes
            el.insert(1, el_to_insert)
            # This line of code doesn't seem to work. I want to increment
            # the value of "Tid" every time it's added to the tree.
            el_to_insert = el_to_insert.text + 1
            add_status(el, el_to_insert)

def ConverxmltoJson(target):
    xmlConfigFile = ET.parse(target)
    root = xmlConfigFile.getroot()
    # Create `Tid` node, not sure how to add the "status" node to the root "Garden" node.
    state_el = ET.Element("Tid")
    state_el.text = "0"
    root.insert(1, state_el)
    add_status(root, state_el)
    json_str = xmltodict.parse(ET.tostring(root, encoding="utf8"))

with open("xmlconfig.xml") as xmlConfigFile:
    ConverxmltoJson(xmlConfigFile)
I would be glad if someone could help me to resolve the issue.
Thank you.
I was able to resolve the part in which I had to add the "status" and "Tid" nodes to the root node with the following changes.
state_el = ET.Element("state") # Create `state` node for root node
state_el.text = "0"
root.insert(1, state_el)
# Adding the Tid node to root level
id_node = ET.Element("Tid") # Create `Tid` node
id_node.text = "0"
root.insert(1, id_node)
I have a new issue now and opened a new question at link: Updating the "Tid" node value updates to the final value of the global variable
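For the incrementing "Tid" itself, one possible approach is to create a fresh element per node and draw its value from a shared counter, rather than reusing one element and adding 1 to its text (which is a string). The sketch below uses only xml.etree.ElementTree. The `add_tids` name and the rule "number every element that has children, except InfoList" are assumptions inferred from the desired output, not something stated in the question.

```python
import itertools
import xml.etree.ElementTree as ET

XML = """
<MyHouse>
    <Garden>
        <InfoList><status value="0"/></InfoList>
        <Flowers>
            <InfoList><status value="0"/></InfoList>
        </Flowers>
    </Garden>
</MyHouse>
"""

def add_tids(root, skip=("InfoList",)):
    # Hypothetical helper: one counter shared by the whole traversal.
    counter = itertools.count(1)

    def visit(el):
        children = list(el)  # snapshot before inserting the Tid element
        if not children or el.tag in skip:
            return
        tid = ET.Element("Tid")  # a new element each time, never reused
        tid.text = str(next(counter))
        el.insert(0, tid)
        for child in children:
            visit(child)

    visit(root)

root = ET.fromstring(XML)
add_tids(root)
print(ET.tostring(root, encoding="unicode"))
```

The resulting tree can then be passed on to xmltodict in the same way as in the question.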

Recursively build hierarchical JSON tree?

I have a database of parent-child connections. The data look like the following but could be presented in whichever way you want (dictionaries, list of lists, JSON, etc).
links=(("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Bob","Leroy"),("Bob","Earl"))
The output that I need is a hierarchical JSON tree, which will be rendered with d3. There are discrete sub-trees in the data, which I will attach to a root node. So I need to recursively go through the links and build up the tree structure. The furthest I can get is to iterate through all the people and append their children, but I can't figure out how to do the higher-order links (e.g. how to append a person with children to the child of someone else). This is similar to another question here, but I have no way to know the root nodes in advance, so I can't implement the accepted solution.
I am going for the following tree structure from my example data.
{
    "name":"Root",
    "children":[
        {
            "name":"Tom",
            "children":[
                {
                    "name":"Dick",
                    "children":[
                        {"name":"Harry"}
                    ]
                },
                {
                    "name":"Larry"
                }
            ]
        },
        {
            "name":"Bob",
            "children":[
                {
                    "name":"Leroy"
                },
                {
                    "name":"Earl"
                }
            ]
        }
    ]
}
This structure renders like this in my d3 layout.
To identify the root nodes you can unzip links and look for parents who are not children:
parents, children = zip(*links)
root_nodes = {x for x in parents if x not in children}
Then you can apply the recursive method:
import json

links = [("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Bob","Leroy"),("Bob","Earl")]
parents, children = zip(*links)
root_nodes = {x for x in parents if x not in children}
for node in root_nodes:
    links.append(('Root', node))

def get_nodes(node):
    d = {}
    d['name'] = node
    children = get_children(node)
    if children:
        d['children'] = [get_nodes(child) for child in children]
    return d

def get_children(node):
    return [x[1] for x in links if x[0] == node]

tree = get_nodes('Root')
print(json.dumps(tree, indent=4))
I used a set to get the root nodes, but if order is important you can use a list and remove the duplicates.
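If the order of the root nodes matters, or the list of links is long, a variant along these lines could be used. It is only a sketch under assumptions: the `build_tree` name is made up for illustration, the links are assumed to contain no cycles, and the children are indexed once in a dictionary instead of scanning all links for every node.

```python
import json
from collections import defaultdict

def build_tree(links, root_name="Root"):
    children_of = defaultdict(list)
    all_children = set()
    for parent, child in links:
        children_of[parent].append(child)
        all_children.add(child)

    # Parents that never appear as a child, in first-seen order
    # (dict.fromkeys removes duplicates while keeping order).
    roots = list(dict.fromkeys(p for p, _ in links if p not in all_children))

    def to_node(name):
        node = {"name": name}
        kids = children_of.get(name)
        if kids:
            node["children"] = [to_node(k) for k in kids]
        return node

    return {"name": root_name, "children": [to_node(r) for r in roots]}

links = [("Tom", "Dick"), ("Dick", "Harry"), ("Tom", "Larry"),
         ("Bob", "Leroy"), ("Bob", "Earl")]
tree = build_tree(links)
print(json.dumps(tree, indent=4))
```

Leaf nodes carry no "children" key here, which matches the structure requested in the question.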
Try the following code:
import json

links = (("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Tom","Hurbert"),("Tom","Neil"),("Bob","Leroy"),("Bob","Earl"),("Tom","Reginald"))
name_to_node = {}
root = {'name': 'Root', 'children': []}
for parent, child in links:
    parent_node = name_to_node.get(parent)
    if not parent_node:
        name_to_node[parent] = parent_node = {'name': parent}
        root['children'].append(parent_node)
    name_to_node[child] = child_node = {'name': child}
    parent_node.setdefault('children', []).append(child_node)
print(json.dumps(root, indent=4))
In case you want to format the data as a hierarchy in the HTML/JS itself, take a look at:
Generate (multilevel) flare.json data format from flat json
In case you have tons of data, the conversion in the browser may be faster, since that approach is built on reduce.
BTW: I am also working on the same topic i.e. generating the collapsible tree structure in d3.js. If you want to work along, my email is: erprateek.vit#gmail.com.

Python - how to convert parent child into a nested dictionary

I have a parent-child dict that looks like this, where the key is the child and 0 is the root node.
node[0]=[{"parms":{"meta1":"foo"},"name":"RootNode"}]
node[1]=[{"parent":0,"data":{"parms":{"meta2":"bar"},"name":"country"} }]
node[2]=[{"parent":1,"data":{"parms":{"meta3":"baz"},"name":"day"} }]
I need to create a nested json object that looks like this:
test = {
    "parms": {"meta1":"foo"},
    "name": "RootNode",
    "children": [
        {
            "parms":{"meta2":"bar"},
            "name":"country",
            "children": [
                {"parms":{"meta3":"baz"},
                 "name":"day","children": []}
            ]
        }]
}
How do I do that in python?
You could construct the tree from the definition you have in a loop.
for element in node:
    if 'parent' in element:
        if 'children' not in node[element['parent']]:
            node[element['parent']]['children'] = []
        node[element['parent']]['children'].append(element)
        del element['parent']
test = node[0]
children needs to be present for this to work, but I'm hoping you get the gist.
Also note that this modifies the node sequence.
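The loop above assumes a flat sequence of dicts. Adapted to the exact shape in the question (a dict keyed by node id, each value a one-element list, non-root entries wrapping their payload in "data"), the same idea might look like the sketch below. The `nest` helper is hypothetical, and unlike the loop above it builds new dicts instead of modifying node.

```python
node = {
    0: [{"parms": {"meta1": "foo"}, "name": "RootNode"}],
    1: [{"parent": 0, "data": {"parms": {"meta2": "bar"}, "name": "country"}}],
    2: [{"parent": 1, "data": {"parms": {"meta3": "baz"}, "name": "day"}}],
}

def nest(node, root_key=0):
    # First pass: one output dict per key, each with an empty children list.
    built = {}
    for key, (entry,) in node.items():
        data = entry.get("data", entry)  # the root entry has no "data" wrapper
        built[key] = {**data, "children": []}
    # Second pass: attach every non-root node to its parent.
    for key, (entry,) in node.items():
        if "parent" in entry:
            built[entry["parent"]]["children"].append(built[key])
    return built[root_key]

test = nest(node)
```

Doing it in two passes means a child may appear in node before its parent without causing a lookup error.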
