I'm a beginner programmer, and I just started learning about Nested lists and dictionaries. I have a task to create a system of files, with class Directory and it's attributes.
class Directory:
def __init__(self, name: str, parent: Optional['Directory'], children: List[Optional['Directory']]):
self.name = name
self.parent = parent
self.children = children
I'm supposed to build a function to create this system of files recursively, given root and it's directories from dictionary. Parent is a directory which includes current dir as one of his children. Any dir which doesn't have any children is supposed to be an empty directory.
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"]
"dirG": ["dirX", "dirY"]}
I've been trying to do this and I think I know how to create directories recursively, however I have no idea what to put in dir.parent place without any additional imports. With root, there's no problem because it is None but further in process I don't know how to place child's parent (which is supposed to be Directory) as one of his attributes since I'm going recursively from there. Do you have any idea how to do that? Here's code which I have so far:
def create_system(system: Dict[str, List[str]], parent_children: List[str]) -> Optional[List[Optional['Directory']]]:
children: List[Optional['Directory']] = []
for child in parent_children:
if child in system.keys():
children.append(Directory(child, parent, create_system(system, list(system.get(child)))))
else:
children.append(Directory(child, parent, []))
return children
def root(system: Dict[str, List[str]]) -> Optional['Directory']:
return Directory("root", None, create_system(system, list(system.get("root"))))
Thank you for any responses!
Your goal is to transform the dictionary
system = {
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"]
"dirG": ["dirX", "dirY"]
}
into the following tree:
root
/ \
dirA dirB
/ \
dirC dirE
/ \
dirH dirG
/ \
dirX dirY
Hopefully, it's clear that the return value of the process can be just the root. To guarantee that you hit every folder only after its parent has been created, you can use either a stack-based BFS approach, or a recursive DFS approach.
Let's look at a simple BFS approach:
def create_system_bfs(system):
root = Directory('root', None, [])
stack = [root] # in practice, use collections.deque([root])
while stack:
current = stack.pop(0)
for child in system.get(current.name, []):
d = Directory(child, current, [])
current.children.append(d)
stack.append(d)
return root
The DFS version of that could be something like:
def create_system_dfs(system):
def create_node(name, parent):
d = Directory(name, parent, [])
d.children = [create_node(child, d) for child in system.get(name, [])]
return d
return create_node('root', None)
Keep in mind that there are other possible approaches. In both cases, the create_root method is completely unnecessary. The BFS approach is limited only by available heap memory. The DFS approach may also be limited by stack size.
Before get all tangled up in classes we can think in terms of an ordinary function -
def paths(t, init="root"):
def loop(q, path):
if isinstance(q, dict):
for (dir, v) in q.items():
yield from loop(v, [*path, dir])
elif isinstance(q, list):
for dir in q:
yield from loop \
( t[dir] if dir in t else None
, [*path, dir]
)
else:
yield "/".join(path)
yield from loop(t[init], ["."])
Using paths is easy, simply call it on your input tree -
input = {
"root": ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"],
"dirG": ["dirX", "dirY"]
}
for path in paths(input):
print(path)
./dirA/dirC/dirH
./dirA/dirC/dirG/dirX
./dirA/dirC/dirG/dirY
./dirB/dirE
Using paths allows us to easily create the directories we need -
import os
for path in paths(input):
os.makedirs(path)
This is an opportunity to learn about reusable modules and mutual recursion. This solution in this answer solves your specific problem without any modification of the modules written in another answer. The distinct advantage of this approach is that tree has zero knowledge of your node shape and allows you to define any output shape.
Below we create a tree using plain dict with name, parent, and children properties. tree does not make this choice for you. A different structure or a custom class can be used, if desired -
from tree import tree
input = {
None: ["dirA", "dirB"],
"dirA": ["dirC"],
"dirC": ["dirH", "dirG"],
"dirB": ["dirE"],
"dirG": ["dirX", "dirY"]
}
result = tree \
( flatten(input)
, parent
, lambda node, children:
dict \
( name=name(node)
, parent=parent(node)
, children=children(name(node))
)
)
print(result)
[
{
"name": "dirA",
"parent": None,
"children": [
{
"name": "dirC",
"parent": "dirA",
"children": [
{
"name": "dirH",
"parent": "dirC",
"children": []
},
{
"name": "dirG",
"parent": "dirC",
"children": [
{
"name": "dirX",
"parent": "dirG",
"children": []
},
{
"name": "dirY",
"parent": "dirG",
"children": []
}
]
}
]
}
]
},
{
"name": "dirB",
"parent": None,
"children": [
{
"name": "dirE",
"parent": "dirB",
"children": []
}
]
}
]
In order to use tree, we defined a way to flatten the input nodes -
def flatten(t):
seq = chain.from_iterable \
( map(lambda _: (_, k), v)
for (k,v) in input.items()
)
return list(seq)
print(flatten(input))
[ ('dirA', None)
, ('dirB', None)
, ('dirC', 'dirA')
, ('dirH', 'dirC')
, ('dirG', 'dirC')
, ('dirE', 'dirB')
, ('dirX', 'dirG')
, ('dirY', 'dirG')
]
And we also defined the primary key and foreign key. Here we use name and parent, but you can choose whichever names you like -
def name(t):
return t[0]
def parent(t):
return t[1]
To learn more about the tree module and some of its benefits, see the original Q&A
Related
I need convert one file XML to JSON with python, but I need put the attributes of the father after the child nodes in the JSON
My current code is like that.
def generate_json(self, event=None):
# opening the xml file
with open(self.fn ,"r") as xmlfileObj:
data_dict = lib.xmltodict.parse(xmlfileObj.read(),attr_prefix='_')
xmlfileObj.close()
jsonObj= json.dumps(data_dict, sort_keys=False)
restored = json.loads(jsonObj)
#storing json data to json file
with open("data.json", "w") as jsonfileObj:
jsonfileObj.write(jsonObj)
jsonfileObj.close()
and I need this;
{
"datasetVersion": {
"metadataBlocks": {
"citation": {
"fields": [
{
"value": "Darwin's Finches",
"typeClass": "primitive",
"multiple": false,
"typeName": "title"
}
],
"displayName": "Citation Metadata"
}
}
}
}
in place of:
{
"datasetVersion": {
"metadataBlocks": {
"citation": {
"displayName": "Citation Metadata",
"fields": [
{
"value": "Darwin's Finches",
"typeClass": "primitive",
"multiple": false,
"typeName": "title"
}
]
}
}
}
}
No in alphabetic order changing sort_keys=False, I need only to change the attributes of node father to the final.
on some website make how I need:
https://www.convertjson.com/xml-to-json.htm
and another no how:
http://www.utilities-online.info/xmltojson/#.X_crINhKiUk
can somebody help me?
I could format making use of the https://pypi.org/project/xmljson/ library, I modified the BadgerFish style
class BadgerFish(XMLData): # the convention was changed from the project the prefix _ and order the attributers of father nodes
'''Converts between XML and data using the BadgerFish convention'''
def init(self, **kwargs):
super(BadgerFish, self).init(attr_prefix='_', text_content='$', **kwargs)
I changed the prefix of the attribute to "_"
by otherwise only changed the **# modify to put the father attributes to finish, how we can see in the code
def data(self, root):
'''Convert etree.Element into a dictionary'''
value = self.dict()
children = [node for node in root if isinstance(node.tag, basestring)]
if root.text and self.text_content is not None:
text = root.text
if text.strip():
if self.simple_text and len(children) == len(root.attrib) == 0:
value = self._fromstring(text)
else:
value[self.text_content] = self._fromstring(text)
count = Counter(child.tag for child in children)
for child in children:
# if child.tag == "System_State_Entry": print(child.tag)
if count[child.tag] == 1:
value.update(self.data(child))
else:
result = value.setdefault(child.tag, self.list())
result += self.data(child).values()
# if simple_text, elements with no children nor attrs become '', not {}
if isinstance(value, dict) and not value and self.simple_text:
value = ''
**# modify to put the father atributes to finish
for attr, attrval in root.attrib.items():
attr = attr if self.attr_prefix is None else self.attr_prefix + attr
value[attr] = self._fromstring(attrval)**
return self.dict([(root.tag, value)])
How to create folders with name of all dictionaries in nested dictionary?
The code should iterate through dictionary and create the directories structure like in nested dictionary with keeping all hierarchy.
dic = {
"root": {
'0_name': {
"0_name_a": {
"0_name_a_a": {
},
"0_name_a_b": {
"file": "file"
}
},
"0_name_b": {
}
},
"1_name": {
},
"2_name": {
},
"3_name": {
"3_name": {
},
}
}
}
Should make directories like:
root/0_name
root/0_name/0_name_a
root/0_name/0_name_a/0_name_a_a
root/0_name/0_name_a/0_name_a_b
root/0_name/0_name_b
root/1_name/1_name_a
root/2_name/
root/3_name/3_name(the same name)
The script needs to determine if the value is final, and create a folder with that path, then remove that value from a dictionary and start over. Also somehow to recognize "file" type and skip it. I couldn't determine a recursive way to iterate all values and add them to a path.
My approach (absolutely not working, just pasted it in to show something):
def rec(dic):
path = []
def nova(dic):
for k, v in dic.copy().items():
if isinstance(v, dict):
path.append(k)
if not v:
print(os.path.join(*path))
path.clear()
dic.pop(k)
nova(v)
if path == []:
nova(dic)
You're going to need a recursive program, and the value of "path" needs to be part of that recursion.
This isn't exactly what you want, but it's close.
def handle_problem(dic):
def one_directory(dic, path):
for name, info in dic.items():
next_path = path + "/" + name
if isinstance(info, dict):
print("Creating " + next_path) # actually use mkdir here!
one_directory(info, next_path)
one_directory(dic, '')
I have a strategic issue of writing a program doing a job.
I have CSV files like:
Column1 Column 2
------- ----------
parent1 [child1, child2, child3]
parent2 [child4, child5, child6]
child1 [child7, child8]
child5 [child10, child33]
... ...
It is unknown how deep each element of those lists will be extended and I want to loop through them.
Code:
def make_parentClass(self):
for i in self.csv_rows_list:
self.parentClassList.append(parentClass(i))
# after first Parent
for i in self.parentClassList:
if i.children !=[]:
for child in i.children:
for z in self.parentClassList:
if str(child) == str(z.node_parent):
i.node_children.append(z)
self.parentClassList.remove(z)
class parentClass():
node_children = []
def __init__(self, the_list):
self.node_parent = the_list[0]
self.children = the_list[1]
The above code might be a solution if I will find a way to iterate. Let me see if you like the question and makes sense now.
Output:
My aim is to build up a treeview through another language but first I need to make this output in JSON format. So the output expected to be something like:
{
paren1:{'child1':{'child7':{}, 'child8':{}},
'child2': {},
'child3': {},
},
parent2: {
'child4':{},
'child5': {
'child10':{},
'child33':{}
},
'child6':{}
}
}
I would recommend a solution using two dictionaries. One nested one with the actually data structure you plan to convert to JSON, and one flat one that will let you actually find the keys. Since everything is a reference in Python, you can make sure that both dictionaries have the exact same values. Carefully modifying the flat dictionary will build your structure for you.
The following code assumes that you have already managed to split each line into a string parent and list children, containing values form the two columns.
json_dict = {}
flat_dict = {}
for parent, children in file_iterator():
if parent in flat_dict:
value = flat_dict[parent]
else:
value = {}
flat_dict[parent] = json_dict[parent] = value
for child in children:
flat_dict[child] = value[child] = {}
Running this produces json_dict like this:
{
'parent1': {
'child1': {
'child7': {},
'child8': {}
},
'child2': {},
'child3': {}
},
'parent2': {
'child4': {},
'child5': {
'child10': {},
'child33': {}
},
'child6': {}
}
}
Here is an IDEOne link to play with.
I have a database of parent-child connections. The data look like the following but could be presented in whichever way you want (dictionaries, list of lists, JSON, etc).
links=(("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Bob","Leroy"),("Bob","Earl"))
The output that I need is a hierarchical JSON tree, which will be rendered with d3. There are discrete sub-trees in the data, which I will attach to a root node. So I need to recursively go though the links, and build up the tree structure. The furthest I can get is to iterate through all the people and append their children, but I can't figure out to do the higher order links (e.g. how to append a person with children to the child of someone else). This is similar to another question here, but I have no way to know the root nodes in advance, so I can't implement the accepted solution.
I am going for the following tree structure from my example data.
{
"name":"Root",
"children":[
{
"name":"Tom",
"children":[
{
"name":"Dick",
"children":[
{"name":"Harry"}
]
},
{
"name":"Larry"}
]
},
{
"name":"Bob",
"children":[
{
"name":"Leroy"
},
{
"name":"Earl"
}
]
}
]
}
This structure renders like this in my d3 layout.
To identify the root nodes you can unzip links and look for parents who are not children:
parents, children = zip(*links)
root_nodes = {x for x in parents if x not in children}
Then you can apply the recursive method:
import json
links = [("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Bob","Leroy"),("Bob","Earl")]
parents, children = zip(*links)
root_nodes = {x for x in parents if x not in children}
for node in root_nodes:
links.append(('Root', node))
def get_nodes(node):
d = {}
d['name'] = node
children = get_children(node)
if children:
d['children'] = [get_nodes(child) for child in children]
return d
def get_children(node):
return [x[1] for x in links if x[0] == node]
tree = get_nodes('Root')
print json.dumps(tree, indent=4)
I used a set to get the root nodes, but if order is important you can use a list and remove the duplicates.
Try follwing code:
import json
links = (("Tom","Dick"),("Dick","Harry"),("Tom","Larry"),("Tom","Hurbert"),("Tom","Neil"),("Bob","Leroy"),("Bob","Earl"),("Tom","Reginald"))
name_to_node = {}
root = {'name': 'Root', 'children': []}
for parent, child in links:
parent_node = name_to_node.get(parent)
if not parent_node:
name_to_node[parent] = parent_node = {'name': parent}
root['children'].append(parent_node)
name_to_node[child] = child_node = {'name': child}
parent_node.setdefault('children', []).append(child_node)
print json.dumps(root, indent=4)
In case you want to format the data as a hierarchy in the HTML/JS itself, take a look at:
Generate (multilevel) flare.json data format from flat json
In case you have tons of data the Web conversion will be faster since it uses the reduce functionality while Python lacks functional programming.
BTW: I am also working on the same topic i.e. generating the collapsible tree structure in d3.js. If you want to work along, my email is: erprateek.vit#gmail.com.
I have a parent child dict that looks like this where the key is child and 0 is root node.
node[0]=[{"parms":{"meta1":"foo"},"name":"RootNoe"}]
node[1]=[{"parent":0,"data":{"parms":{"meta2":"bar"},"name":"country"} }]
node[2]=[{"parent":1,"data":{"parms":{"meta3":"baz"},"name":"day"} }]
I need to create a nested json object that looks like this:
test = {
"params": {"parms":{"meta1":"foo"},
"name": "RootNode",
"children": [
{
"parms":{"meta2":"bar"},
"name":"country",
"children": [
{"parms":{"meta3":"baz"},
"name":"day","children": []}
]
}]
}
How do I do that in python?
You could construct the tree from the definition you have in a loop.
for element in node:
if 'parent' in element:
if 'children' not in node[element['parent']]:
node[element['parent']]['children'] = []
node[element['parent']]['children'].append(element)
del element['parent']
test = node[0]
children needs to be present for this to work, but I'm hoping you get the gist.
Also note that this modifies the node sequence.