Recursively build hierarchical JSON tree? - python

I have a database of parent-child connections. The data look like the following but could be presented in whichever way you want (dictionaries, list of lists, JSON, etc).
links=(("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Bob","Leroy"),("Bob","Earl"))
The output that I need is a hierarchical JSON tree, which will be rendered with d3. There are discrete sub-trees in the data, which I will attach to a root node. So I need to recursively go through the links and build up the tree structure. The furthest I can get is to iterate through all the people and append their children, but I can't figure out how to do the higher-order links (e.g. how to append a person with children to the child of someone else). This is similar to another question here, but I have no way to know the root nodes in advance, so I can't implement the accepted solution.
I am going for the following tree structure from my example data.
{
    "name": "Root",
    "children": [
        {
            "name": "Tom",
            "children": [
                {
                    "name": "Dick",
                    "children": [
                        {"name": "Harry"}
                    ]
                },
                {
                    "name": "Larry"
                }
            ]
        },
        {
            "name": "Bob",
            "children": [
                {
                    "name": "Leroy"
                },
                {
                    "name": "Earl"
                }
            ]
        }
    ]
}
This structure renders like this in my d3 layout.

To identify the root nodes you can unzip links and look for parents who are not children:
parents, children = zip(*links)
root_nodes = {x for x in parents if x not in children}
Then you can apply the recursive method:
import json

links = [("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Bob","Leroy"),("Bob","Earl")]
parents, children = zip(*links)
root_nodes = {x for x in parents if x not in children}
# Attach every root to a synthetic 'Root' node
for node in root_nodes:
    links.append(('Root', node))

def get_nodes(node):
    d = {}
    d['name'] = node
    children = get_children(node)
    if children:
        d['children'] = [get_nodes(child) for child in children]
    return d

def get_children(node):
    return [x[1] for x in links if x[0] == node]

tree = get_nodes('Root')
print(json.dumps(tree, indent=4))
I used a set to get the root nodes, but if order is important you can use a list and remove the duplicates.
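One way to keep the root order stable, as a minimal sketch: `dict.fromkeys` drops duplicates while preserving first-seen order (dicts keep insertion order in Python 3.7+). The helper name `ordered_roots` is made up for illustration and is not part of the answer above.

```python
links = [("Tom", "Dick"), ("Dick", "Harry"), ("Tom", "Larry"),
         ("Bob", "Leroy"), ("Bob", "Earl")]

def ordered_roots(links):
    """Return parents that never appear as a child, in first-seen order."""
    children = {child for _, child in links}
    # dict.fromkeys removes duplicates but keeps insertion order
    return list(dict.fromkeys(p for p, _ in links if p not in children))

roots = ordered_roots(links)
print(roots)  # ['Tom', 'Bob']
```

The resulting list can be used in place of the `root_nodes` set when appending the `('Root', node)` links.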

Try the following code:
import json

links = (("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Tom","Hurbert"),("Tom","Neil"),("Bob","Leroy"),("Bob","Earl"),("Tom","Reginald"))
name_to_node = {}
root = {'name': 'Root', 'children': []}
for parent, child in links:
    parent_node = name_to_node.get(parent)
    if not parent_node:
        # First time we see this name: treat it as a top-level node
        name_to_node[parent] = parent_node = {'name': parent}
        root['children'].append(parent_node)
    name_to_node[child] = child_node = {'name': child}
    parent_node.setdefault('children', []).append(child_node)
print(json.dumps(root, indent=4))
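The loop above depends on link order: a name that is first seen as a parent is attached to the root and stays there, even if a later link makes it someone's child. A minimal sketch of an order-independent variant follows. The function name `build_tree` and the two-pass layout are illustrative assumptions, and the links are assumed to form a forest (no cycles, at most one parent per child).

```python
import json

def build_tree(links, root_name="Root"):
    """Build a nested name/children dict from (parent, child) pairs in any order."""
    nodes = {}          # name -> node dict, created on first sight
    has_parent = set()  # names that appear as a child somewhere

    def node_for(name):
        return nodes.setdefault(name, {"name": name})

    # Pass 1: wire every child under its parent.
    for parent, child in links:
        node_for(parent).setdefault("children", []).append(node_for(child))
        has_parent.add(child)

    # Pass 2: anything that was never a child hangs off the synthetic root.
    top = [node for name, node in nodes.items() if name not in has_parent]
    return {"name": root_name, "children": top}

# "Dick" appears as a parent before it appears as a child.
shuffled = [("Dick", "Harry"), ("Tom", "Dick"), ("Tom", "Larry")]
tree = build_tree(shuffled)
print(json.dumps(tree, indent=4))
```

Deferring the root decision to a second pass is what removes the ordering requirement; the cost is one extra walk over the node dictionary.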

In case you want to format the data as a hierarchy in the HTML/JS itself, take a look at:
Generate (multilevel) flare.json data format from flat json
If you have a lot of data, the in-browser conversion may be a good fit, since it is built on JavaScript's reduce. (Python has an equivalent in functools.reduce, so the same approach can be ported.)
BTW: I am also working on the same topic i.e. generating the collapsible tree structure in d3.js. If you want to work along, my email is: erprateek.vit#gmail.com.

Related

Python JSON to tree with anytree

I have JSON data that looks something like
{'Tree': [
{"From": "1",
"To": "2"},
{"From": "2",
"To": "3"}
]}
basically an array of all existing references between nodes. I want to be able to draw trees starting from a chosen root node using this data.
I find that using anytree I must link a node to a parent node, but ideally I want to simply link it to the name and have it put together the tree structure on its own.
# I want to do this
child[0] = Node(data[0]["To"], parent=data[0]["From"])
# But I think I need to assign a node, not its name
child[1] = Node(data[1]["To"], parent=child[?])
Suggestions on how to do this?
Do this in two steps:
Convert the input format into an adjacency list: a dictionary keyed by the node keys, and with associated lists that hold the keys of the connected nodes.
Choose a root node, and then create the anytree tree using the adjacency list.
Here are two functions that deal with these two tasks:
from anytree import Node, RenderTree
from collections import defaultdict

def makegraph(edges):
    adj = defaultdict(list)
    for edge in edges:
        a, b = edge.values()
        adj[a].append(b)
        adj[b].append(a)
    return adj

def maketree(adj, nodekey):
    visited = set()

    def dfs(nodekey, parent):
        if nodekey not in visited:
            visited.add(nodekey)
            node = Node(nodekey, parent)
            for childkey in adj[nodekey]:
                dfs(childkey, node)
            return node

    return dfs(nodekey, None)
Here is how to use it:
edges = [
{"From": "1", "To": "2"},
{"From": "2", "To": "3"}
]
adj = makegraph(edges)
# Choose "1" as root:
root = maketree(adj, '1')
print(RenderTree(root))
# Now let's choose "2" as root:
root = maketree(adj, '2')
print(RenderTree(root))
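If every edge is known to point from parent to child, the adjacency list only needs one direction. The sketch below shows that variant with plain dictionaries instead of anytree, so it runs without extra packages. `make_children_map` and `to_nested` are hypothetical names, and the input is assumed to be acyclic.

```python
from collections import defaultdict

def make_children_map(edges):
    """Map each "From" key to the list of its "To" keys (directed edges)."""
    children = defaultdict(list)
    for edge in edges:
        children[edge["From"]].append(edge["To"])
    return children

def to_nested(children, key):
    """Recursively expand `key` into a {"name", "children"} dict."""
    # .get avoids inserting empty entries into the defaultdict for leaves
    return {
        "name": key,
        "children": [to_nested(children, c) for c in children.get(key, [])],
    }

edges = [{"From": "1", "To": "2"}, {"From": "2", "To": "3"}]
nested = to_nested(make_children_map(edges), "1")
print(nested)
```

Unlike the undirected version, choosing "2" as the root here yields only the subtree below "2", because edges are never followed backwards.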

How can I create nested folders with attributes in python?

I'm a beginner programmer, and I just started learning about nested lists and dictionaries. I have a task to create a file system with a class Directory and its attributes.
from typing import List, Optional

class Directory:
    def __init__(self, name: str, parent: Optional['Directory'], children: List[Optional['Directory']]):
        self.name = name
        self.parent = parent
        self.children = children
I'm supposed to write a function that creates this file system recursively, given the root and its directories from a dictionary. A parent is a directory that includes the current directory as one of its children. Any directory that doesn't have children is supposed to be an empty directory.
{"root": ["dirA", "dirB"],
 "dirA": ["dirC"],
 "dirC": ["dirH", "dirG"],
 "dirB": ["dirE"],
 "dirG": ["dirX", "dirY"]}
I've been trying to do this and I think I know how to create directories recursively, however I have no idea what to put in dir.parent place without any additional imports. With root, there's no problem because it is None but further in process I don't know how to place child's parent (which is supposed to be Directory) as one of his attributes since I'm going recursively from there. Do you have any idea how to do that? Here's code which I have so far:
def create_system(system: Dict[str, List[str]], parent_children: List[str]) -> Optional[List[Optional['Directory']]]:
    children: List[Optional['Directory']] = []
    for child in parent_children:
        if child in system.keys():
            children.append(Directory(child, parent, create_system(system, list(system.get(child)))))
        else:
            children.append(Directory(child, parent, []))
    return children

def root(system: Dict[str, List[str]]) -> Optional['Directory']:
    return Directory("root", None, create_system(system, list(system.get("root"))))
Thank you for any responses!
Your goal is to transform the dictionary
system = {
    "root": ["dirA", "dirB"],
    "dirA": ["dirC"],
    "dirC": ["dirH", "dirG"],
    "dirB": ["dirE"],
    "dirG": ["dirX", "dirY"]
}
into the following tree:
         root
        /    \
     dirA    dirB
      /        \
    dirC       dirE
    /  \
 dirH  dirG
       /  \
    dirX  dirY
Hopefully, it's clear that the return value of the process can be just the root. To guarantee that you visit every folder only after its parent has been created, you can use either a queue-based BFS approach or a recursive DFS approach.
Let's look at a simple BFS approach:
def create_system_bfs(system):
    root = Directory('root', None, [])
    queue = [root]  # in practice, use collections.deque([root])
    while queue:
        current = queue.pop(0)  # take from the front: first in, first out
        for child in system.get(current.name, []):
            d = Directory(child, current, [])
            current.children.append(d)
            queue.append(d)
    return root
The DFS version of that could be something like:
def create_system_dfs(system):
    def create_node(name, parent):
        d = Directory(name, parent, [])
        d.children = [create_node(child, d) for child in system.get(name, [])]
        return d
    return create_node('root', None)
Keep in mind that there are other possible approaches. In both cases, a separate function for creating the root (like root in the question) is unnecessary. The BFS approach is limited only by available heap memory. The DFS approach may also be limited by the call-stack depth (Python's recursion limit).
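If recursion depth is a concern, the depth-first construction can also be written with an explicit stack. This is a sketch rather than part of the answer above: `create_system_iterative` is a made-up name, and the `Directory` class is repeated from the question so the block runs on its own.

```python
from typing import Dict, List, Optional

class Directory:
    def __init__(self, name: str, parent: Optional["Directory"],
                 children: List["Directory"]):
        self.name = name
        self.parent = parent
        self.children = children

def create_system_iterative(system: Dict[str, List[str]]) -> Directory:
    """Depth-first construction using a list as an explicit LIFO stack."""
    root = Directory("root", None, [])
    stack = [root]
    while stack:
        current = stack.pop()  # pop from the end: last in, first out
        for child in system.get(current.name, []):
            d = Directory(child, current, [])
            current.children.append(d)
            stack.append(d)
    return root

system = {
    "root": ["dirA", "dirB"],
    "dirA": ["dirC"],
    "dirC": ["dirH", "dirG"],
    "dirB": ["dirE"],
    "dirG": ["dirX", "dirY"],
}
tree = create_system_iterative(system)
print([d.name for d in tree.children])  # ['dirA', 'dirB']
```

The only difference from the BFS loop is which end of the list is popped; the order of each node's `children` list is the same either way because children are appended in dictionary order.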
Before we get all tangled up in classes, we can think in terms of an ordinary function -
def paths(t, init="root"):
  def loop(q, path):
    if isinstance(q, dict):
      for (dir, v) in q.items():
        yield from loop(v, [*path, dir])
    elif isinstance(q, list):
      for dir in q:
        yield from loop \
          ( t[dir] if dir in t else None
          , [*path, dir]
          )
    else:
      yield "/".join(path)
  yield from loop(t[init], ["."])
Using paths is easy, simply call it on your input tree -
input = {
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"],
"dirG": ["dirX", "dirY"]
}
for path in paths(input):
  print(path)
./dirA/dirC/dirH
./dirA/dirC/dirG/dirX
./dirA/dirC/dirG/dirY
./dirB/dirE
Using paths allows us to easily create the directories we need -
import os
for path in paths(input):
  os.makedirs(path)
This is an opportunity to learn about reusable modules and mutual recursion. The solution in this answer solves your specific problem without any modification of the modules written in another answer. The distinct advantage of this approach is that tree has zero knowledge of your node shape and allows you to define any output shape.
Below we create a tree using plain dict with name, parent, and children properties. tree does not make this choice for you. A different structure or a custom class can be used, if desired -
from tree import tree

input = {
  None: ["dirA", "dirB"],
  "dirA": ["dirC"],
  "dirC": ["dirH", "dirG"],
  "dirB": ["dirE"],
  "dirG": ["dirX", "dirY"]
}

result = tree \
  ( flatten(input)
  , parent
  , lambda node, children:
      dict \
        ( name=name(node)
        , parent=parent(node)
        , children=children(name(node))
        )
  )

print(result)
[
  {
    "name": "dirA",
    "parent": None,
    "children": [
      {
        "name": "dirC",
        "parent": "dirA",
        "children": [
          {
            "name": "dirH",
            "parent": "dirC",
            "children": []
          },
          {
            "name": "dirG",
            "parent": "dirC",
            "children": [
              {
                "name": "dirX",
                "parent": "dirG",
                "children": []
              },
              {
                "name": "dirY",
                "parent": "dirG",
                "children": []
              }
            ]
          }
        ]
      }
    ]
  },
  {
    "name": "dirB",
    "parent": None,
    "children": [
      {
        "name": "dirE",
        "parent": "dirB",
        "children": []
      }
    ]
  }
]
In order to use tree, we defined a way to flatten the input nodes -
from itertools import chain

def flatten(t):
  # Turn {parent: [children]} into a list of (child, parent) pairs
  seq = chain.from_iterable \
    ( map(lambda _: (_, k), v)
      for (k,v) in t.items()
    )
  return list(seq)

print(flatten(input))
[ ('dirA', None)
, ('dirB', None)
, ('dirC', 'dirA')
, ('dirH', 'dirC')
, ('dirG', 'dirC')
, ('dirE', 'dirB')
, ('dirX', 'dirG')
, ('dirY', 'dirG')
]
And we also defined the primary key and foreign key. Here we use name and parent, but you can choose whichever names you like -
def name(t):
  return t[0]

def parent(t):
  return t[1]
To learn more about the tree module and some of its benefits, see the original Q&A
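The tree module itself is not shown in this answer. Purely as an illustration of the idea, a function with the same call shape could look like the sketch below. This is an assumption about its behavior inferred from the usage above, not the original module's code; the parameter names `foreign_key`, `make`, and `root` are invented here.

```python
from collections import defaultdict
from itertools import chain

def tree(nodes, foreign_key, make, root=None):
    """Hypothetical stand-in: group nodes by foreign key, build each with `make`."""
    index = defaultdict(list)
    for node in nodes:
        index[foreign_key(node)].append(node)

    def many(key):
        # All nodes whose foreign key equals `key`, each fully built.
        # `make` receives `many` so it can ask for a node's own children.
        return [make(node, many) for node in index.get(key, [])]

    return many(root)

def flatten(t):
    """Turn {parent: [children]} into a list of (child, parent) pairs."""
    return list(chain.from_iterable(((c, k) for c in v) for k, v in t.items()))

def name(t):
    return t[0]

def parent(t):
    return t[1]

data = {None: ["dirA"], "dirA": ["dirC"]}
result = tree(
    flatten(data),
    parent,
    lambda node, children: dict(
        name=name(node),
        parent=parent(node),
        children=children(name(node)),
    ),
)
print(result)
```

The point of the design is that the builder never inspects a node directly: it only calls `foreign_key` to group and `make` to construct, which is why the caller is free to pick any output shape.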

Python: how to loop through all unknown depth of a tree?

I have a strategic issue of writing a program doing a job.
I have CSV files like:
Column1 Column 2
------- ----------
parent1 [child1, child2, child3]
parent2 [child4, child5, child6]
child1 [child7, child8]
child5 [child10, child33]
... ...
It is unknown how deep each element of those lists will be extended and I want to loop through them.
Code:
def make_parentClass(self):
    for i in self.csv_rows_list:
        self.parentClassList.append(parentClass(i))
    # after first Parent
    for i in self.parentClassList:
        if i.children !=[]:
            for child in i.children:
                for z in self.parentClassList:
                    if str(child) == str(z.node_parent):
                        i.node_children.append(z)
                        self.parentClassList.remove(z)

class parentClass():
    node_children = []
    def __init__(self, the_list):
        self.node_parent = the_list[0]
        self.children = the_list[1]
The above code might work if I can find a way to iterate. Let me know whether the question makes sense now.
Output:
My aim is to build a tree view in another language, but first I need to produce this output in JSON format. So the output is expected to be something like:
{
    'parent1': {
        'child1': {'child7': {}, 'child8': {}},
        'child2': {},
        'child3': {}
    },
    'parent2': {
        'child4': {},
        'child5': {
            'child10': {},
            'child33': {}
        },
        'child6': {}
    }
}
I would recommend a solution using two dictionaries: a nested one with the actual data structure you plan to convert to JSON, and a flat one that lets you find the keys. Since everything is a reference in Python, you can make sure that both dictionaries hold the exact same value objects. Carefully modifying the flat dictionary will build your structure for you.
The following code assumes that you have already managed to split each line into a string parent and a list children, containing the values from the two columns.
json_dict = {}
flat_dict = {}

for parent, children in file_iterator():
    if parent in flat_dict:
        value = flat_dict[parent]
    else:
        value = {}
        flat_dict[parent] = json_dict[parent] = value
    for child in children:
        flat_dict[child] = value[child] = {}
Running this produces json_dict like this:
{
    'parent1': {
        'child1': {
            'child7': {},
            'child8': {}
        },
        'child2': {},
        'child3': {}
    },
    'parent2': {
        'child4': {},
        'child5': {
            'child10': {},
            'child33': {}
        },
        'child6': {}
    }
}
Here is an IDEOne link to play with.
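For a version that runs as-is, here is a minimal sketch with the sample rows hard-coded. `file_iterator` is a hypothetical stand-in for whatever reads the CSV, and the rows are assumed to list each parent before any row that uses one of its children as a parent.

```python
import json

def file_iterator():
    """Hypothetical stand-in for the CSV reader: yields (parent, children)."""
    yield "parent1", ["child1", "child2", "child3"]
    yield "parent2", ["child4", "child5", "child6"]
    yield "child1", ["child7", "child8"]
    yield "child5", ["child10", "child33"]

json_dict = {}  # nested structure, top-level parents only
flat_dict = {}  # every name -> the same dict object used in json_dict

for parent, children in file_iterator():
    if parent in flat_dict:
        value = flat_dict[parent]
    else:
        value = {}
        flat_dict[parent] = json_dict[parent] = value
    for child in children:
        flat_dict[child] = value[child] = {}

# Both dictionaries point at the same object, so filling one fills the other.
print(flat_dict["child5"] is json_dict["parent2"]["child5"])  # True
print(json.dumps(json_dict, indent=4))
```

The identity check is the whole trick: `flat_dict` gives constant-time access to any node, while `json_dict` keeps the nesting.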

Create jstree hierarchy from parent child pair from table

I have a parent child pair from table,
example:
links=(("Tom","Dick"),
("Dick","Harry"),
("Tom","Larry"),
("Bob","Leroy"),
("Bob","Earl"),
("Earl","Joy"),
("Joy","Joy child"),
("Joy","Joy child2"),
("Joy child2", "Maria"))
and I am trying to create a jstree from these pairs. I went through various links but can't get this tuple to work. Can anyone please provide a recursive function in Python that takes the tuple above, or any combination of parent-child-grandchild pairs, as input and creates a hierarchy as JSON similar to this format:
{
    "name": "Tom",
    "children": [
        {
            "name": "Dick"
        }
    ]
}
Thank you in Advance:) I really appreciate your help!
import json

links = (("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Tom","Hurbert"),("Tom","Neil"),("Bob","Leroy"),("Bob","Earl"),("Tom","Reginald"))
name_to_node = {}
root = {'name': 'Root', 'children': []}
for parent, child in links:
    parent_node = name_to_node.get(parent)
    if not parent_node:
        # First time we see this name: treat it as a top-level node
        name_to_node[parent] = parent_node = {'name': parent}
        root['children'].append(parent_node)
    name_to_node[child] = child_node = {'name': child}
    parent_node.setdefault('children', []).append(child_node)
print(json.dumps(root, indent=4))

Python - how to convert parent child into a nested dictionary

I have a parent-child dict that looks like this, where the key is the child and 0 is the root node.
node[0]=[{"parms":{"meta1":"foo"},"name":"RootNode"}]
node[1]=[{"parent":0,"data":{"parms":{"meta2":"bar"},"name":"country"} }]
node[2]=[{"parent":1,"data":{"parms":{"meta3":"baz"},"name":"day"} }]
I need to create a nested json object that looks like this:
test = {
    "parms": {"meta1": "foo"},
    "name": "RootNode",
    "children": [
        {
            "parms": {"meta2": "bar"},
            "name": "country",
            "children": [
                {"parms": {"meta3": "baz"},
                 "name": "day", "children": []}
            ]
        }
    ]
}
How do I do that in python?
You could construct the tree from the definition you have in a loop.
for element in node:
    if 'parent' in element:
        if 'children' not in node[element['parent']]:
            node[element['parent']]['children'] = []
        node[element['parent']]['children'].append(element)
        del element['parent']
test = node[0]
children needs to be present for this to work, but I'm hoping you get the gist.
Also note that this modifies the node sequence.
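The loop above treats node as a sequence of plain dicts. For data shaped exactly like the question (a dict of one-element lists, with non-root payloads wrapped in "data"), a minimal sketch could look like this. The function name `nest` and the `root_key` parameter are illustrative assumptions, and every parent key is assumed to exist in node.

```python
import json

# Data shaped like the question: each value is a one-element list.
node = {
    0: [{"parms": {"meta1": "foo"}, "name": "RootNode"}],
    1: [{"parent": 0, "data": {"parms": {"meta2": "bar"}, "name": "country"}}],
    2: [{"parent": 1, "data": {"parms": {"meta3": "baz"}, "name": "day"}}],
}

def nest(node, root_key=0):
    """Return a nested structure built from `node`, leaving the input as it was."""
    built = {}  # key -> output dict with its own "children" list
    for key, (entry,) in node.items():
        payload = entry.get("data", entry)  # the root has no "data" wrapper
        built[key] = {**payload, "children": []}
    # Second pass: every key exists in `built`, so order does not matter.
    for key, (entry,) in node.items():
        if "parent" in entry:
            built[entry["parent"]]["children"].append(built[key])
    return built[root_key]

test = nest(node)
print(json.dumps(test, indent=4))
```

Building new output dicts in a first pass avoids both the missing-"children" problem and the mutation of node noted above.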
