How to parse a directory structure into dictionary?

I have a list of directory paths such as:
['/a/b', '/a/b/c', '/a/b/c/d', '/a/b/c/e', '/a/b/c/f/g', '/a/b/c/f/h', '/a/b/c/f/i']
I want to convert it into a dict with a tree-like structure:
{'/': {'a': {'b': {'c':
    [{'d': None},
     {'e': None},
     {'f': [{'g': None}, {'h': None}, {'i': None}]}
    ]
}}}}
I'm stuck on where to start. Which data structure would be suitable?
Thanks.

Basically:
lst = ['/a/b', '/a/b/c', '/a/b/c/d', '/a/b/c/e', '/a/b/c/f/g', '/a/b/c/f/h', '/a/b/c/f/i']
dct = {}
for item in lst:
    p = dct
    for x in item.split('/'):
        p = p.setdefault(x, {})
print(dct)
produces
{'': {'a': {'b': {'c': {'d': {}, 'e': {}, 'f': {'g': {}, 'h': {}, 'i': {}}}}}}}
This is not exactly your structure, but it should give you the basic idea. The empty-string key at the top comes from the leading '/' in each path.
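If the leaves should map to None, as in the question, one possible variation is sketched below. The name paths_to_tree is made up for illustration, and the sketch assumes that every path is absolute, uses '/' as separator, and that a name with no children is a leaf.

```python
def paths_to_tree(paths):
    """Build nested dicts from '/'-separated paths; childless names map to None."""
    tree = {}
    for path in paths:
        node = tree
        # strip('/') drops the leading slash so no empty-string key is created
        for part in path.strip('/').split('/'):
            node = node.setdefault(part, {})

    def prune(node):
        # Replace every empty dict (a name without children) by None
        return {name: prune(child) if child else None for name, child in node.items()}

    return {'/': prune(tree)}


lst = ['/a/b', '/a/b/c', '/a/b/c/d', '/a/b/c/e', '/a/b/c/f/g', '/a/b/c/f/h', '/a/b/c/f/i']
print(paths_to_tree(lst))
```

This keeps plain nested dicts throughout instead of mixing dicts and lists, which makes lookups such as tree['/']['a']['b'] uniform.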

As Sven Marnach said, the output data structure should be more consistent, e.g. only nested dictionaries, where folders map to a dict and files map to None.
Here is a script which uses os.walk. It does not take a list as input, but it should do what you want in the end if your goal is to parse an actual directory tree.
import os
from pprint import pprint
def set_leaf(tree, branches, leaf):
    """ Set a terminal element to *leaf* within nested dictionaries.
    *branches* defines the path through dictionaries.
    Example:
    >>> t = {}
    >>> set_leaf(t, ['b1','b2','b3'], 'new_leaf')
    >>> print(t)
    {'b1': {'b2': {'b3': 'new_leaf'}}}
    """
    if len(branches) == 1:
        tree[branches[0]] = leaf
        return
    if branches[0] not in tree:  # dict.has_key() no longer exists in Python 3
        tree[branches[0]] = {}
    set_leaf(tree[branches[0]], branches[1:], leaf)
startpath = '.'
tree = {}
for root, dirs, files in os.walk(startpath):
    branches = [startpath]
    if root != startpath:
        # split on os.sep so the script also works where the separator is not '/'
        branches.extend(os.path.relpath(root, startpath).split(os.sep))
    set_leaf(tree, branches, dict([(d, {}) for d in dirs] +
                                  [(f, None) for f in files]))
print('tree:')
pprint(tree)

Start by looking at os.listdir or os.walk. They will allow you to traverse directories recursively. Either automatically (os.walk) or semi-automatically (with os.listdir). You could then store what you find in a dictionary.
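As a rough sketch of the semi-automatic os.listdir route: the function name dir_to_dict is hypothetical, directories become dicts and everything else becomes None. It follows symlinks through os.path.isdir, so a symlink loop would recurse forever.

```python
import os


def dir_to_dict(path):
    """Map each directory under `path` to a dict of its entries and each file to None."""
    tree = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        # Recurse into directories; anything else is treated as a leaf
        tree[name] = dir_to_dict(full) if os.path.isdir(full) else None
    return tree


if __name__ == "__main__":
    from pprint import pprint
    pprint(dir_to_dict("."))
```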

Related

How to filter a nested dict by the key of the nested element?

I have a nested dictionary of source words, target words, and their frequency counts. It looks like this:
src_tgt_dict = {"each":{"chaque":3}, "in-front-of":{"devant":4}, "next-to":{"à-côté-de":5}, "for":{"pour":7}, "cauliflower":{"chou-fleur":4}, "on":{"sur":2, "panda-et":2}}
I am trying to filter the dictionary so that only key-value pairs that are prepositions (including multi-word prepositions) remain. To that end, I've written the following:
tgt_preps = set(["devant", "pour", "sur", "à"]) # set of initial target prepositions
src_tgt_dict = {"each":{"chaque":3}, "in-front-of":{"devant":4}, "next-to":{"à-côté-de":5}, "for":{"pour":7}, "cauliflower":{"chou-fleur":4}, "on":{"sur":2, "panda-et":2}}
new_tgt_preps = [] # list of new target prepositions
for src, d in src_tgt_dict.items(): # loop into the dictionary
    for tgt, count in d.items(): # loop into the nested dictionary
        check_prep = []
        if "-" in tgt: # check to see if hyphen occurs in the target word (this is to capture multi-word prepositions that are not in the original preposition set)
            check_prep.append(tgt[0:(tgt.index("-"))]) # if there's a hyphen, append the preceding word to the check_prep list
        for t in check_prep:
            if t in tgt_preps: # check to see if the token preceding the hyphen is a preposition
                new_tgt_preps.append(tgt) # if yes, append the multi-word preposition to the list of new target prepositions
tgt_preps.update(new_tgt_preps) # update the set of prepositions to include the multi-word prepositions
temp_2_src_tgt_dict = {} # create new dict for filtering
for src, d in src_tgt_dict.items(): # loop into the dictionary
    for tgt, count in d.items(): # loop into the nested dictionary
        if tgt in tgt_preps: # if the target is in the set of target prepositions
            temp_2_src_tgt_dict[tgt] = count # add to the new dict with the tgt as the key and the count as the value
When I print the new dict, I get the following:
{'devant': 4, 'pour': 7, 'sur': 2, 'à-côté-de': 5}
And it totally makes sense why I get that, because that's what I told the machine to do. But that's not my intention!
What I want is:
{"in-front-of":{"devant":4}, "for":{"pour":7}, "on":{"sur":2}, "next-to":{"à-côté-de":5}}
I've tried to instantiate the nested dictionary by writing:
temp_2_src_tgt_dict[tgt][src] = count
but that raises a KeyError.
I've also tried:
new_tgt_dict = {}
for i in src_tgt_dict.items():
    for j in tgt_preps:
        if j in list(i[1].keys())[0][:len(j)]:
            new_tgt_dict.update({i[0]: i[1]})
But that outputs {'in-front-of': {'devant': 4}, 'next-to': {'à-côté-de': 5}, 'for': {'pour': 7}, 'on': {'sur': 2, 'panda-et': 2}}, which is correct in format, but the value 'panda-et' should not be included because it does not occur in tgt_preps when updated with new_tgt_preps.
Can anyone provide any suggestions or advice? Thank you in advance for your help.

Maybe something like this:
from collections import defaultdict
new_tgt_dict = defaultdict(dict)
for k, v in src_tgt_dict.items():
    for k1, v1 in v.items():
        k_temp = k1
        if "-" in k1:
            k_temp = k1[0:(k1.index("-"))]
        if k_temp in tgt_preps:
            new_tgt_dict[k].update({k1: v1})
print(dict(new_tgt_dict))
{'in-front-of': {'devant': 4}, 'next-to': {'à-côté-de': 5}, 'for': {'pour': 7}, 'on': {'sur': 2}}
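The same filter can also be written without defaultdict. The sketch below is only one way to phrase it; the function name filter_by_prepositions is made up, and it assumes (as the question does) that a multi-word target counts as a preposition when the token before its first hyphen is in the preposition set.

```python
def filter_by_prepositions(src_tgt, preps):
    """Keep only target words whose first hyphen-separated token is in `preps`."""
    filtered = {}
    for src, targets in src_tgt.items():
        kept = {tgt: count for tgt, count in targets.items()
                if tgt.split("-", 1)[0] in preps}
        if kept:  # drop source words left without any target
            filtered[src] = kept
    return filtered


tgt_preps = {"devant", "pour", "sur", "à"}
src_tgt_dict = {"each": {"chaque": 3}, "in-front-of": {"devant": 4},
                "next-to": {"à-côté-de": 5}, "for": {"pour": 7},
                "cauliflower": {"chou-fleur": 4}, "on": {"sur": 2, "panda-et": 2}}
print(filter_by_prepositions(src_tgt_dict, tgt_preps))
```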

You could use a NestedDict. First install ndicts:
pip install ndicts
Then:
from ndicts.ndicts import NestedDict
tgt_preps = set(["devant", "pour", "sur", "à", "à-côté-de"]) # I added "à-côté-de"
src_tgt_dict = {
    "each": {"chaque": 3},
    "in-front-of": {"devant": 4},
    "next-to": {"à-côté-de": 5},
    "for": {"pour": 7},
    "cauliflower": {"chou-fleur": 4},
    "on": {"sur": 2, "panda-et": 2}
}
nd = NestedDict(src_tgt_dict)  # keys of a NestedDict are tuples such as ("on", "sur")
for key, value in nd.copy().items():
    if not set(key) & tgt_preps:
        nd.pop(key)
If you need a dictionary as a result:
result = nd.to_dict()

Python: update dict string that has placeholders?

Consider this string: "{'a': A, 'b': B, 'c': 10}". I want to update this "string" and add a new key d with, say, the value 20, so that the result is "{'a': A, 'b': B, 'c': 10, 'd': 20}".
Normally you could just evaluate the string (with eval or literal_eval) into a dict, update it the way you want, and convert it back to a string. But in this case there are placeholders, which would not be recognized when evaluating.
What would be the best way to update it so that the old values are kept the same but the "dict-string" is updated properly?

For a more robust solution that properly parses the dict, you can subclass lib2to3.refactor.RefactoringTool to refactor the code using a fixer that is a subclass of lib2to3.fixer_base.BaseFix with a pattern that looks for a dictsetmaker node, and a transform method that extends the children list with leaf nodes that consist of the tokens that will make for a new key-value pair in the dict:
from lib2to3 import fixer_base, refactor, pytree
from lib2to3.pgen2 import token
class AddKeyValue(fixer_base.BaseFix):
    PATTERN = "dictsetmaker"
    def transform(self, node, results):
        node.children.extend((
            pytree.Leaf(token.COMMA, ','),
            pytree.Leaf(token.STRING, "'d'", prefix=' '),
            pytree.Leaf(token.COLON, ':'),
            pytree.Leaf(token.NUMBER, 20, prefix=' ')
        ))
        return node
class Refactor(refactor.RefactoringTool):
    def __init__(self, fixers):
        self._fixers= [cls(None, None) for cls in fixers]
        super().__init__(None)
    def get_fixers(self):
        return self._fixers, []
s = "{'a': A, 'b': B, 'c': 10}"
print(Refactor([AddKeyValue]).refactor_string(s + '\n', ''))
This outputs:
{'a': A, 'b': B, 'c': 10, 'd': 20}
lib2to3 is round-trip stable, so all whitespace is preserved after the transformation; a new node should be given a prefix if whitespace is to be inserted before it.
You can find the definition of the Python grammar in Grammar.txt of the lib2to3 module.
Demo: https://repl.it/#blhsing/RudeLimegreenConcentrate

This is by no means the best solution, but here is one approach:
dict_str = "{'a': A, 'b': B, 'c': 10}"
def update_dict(dict_str, **keyvals):
    """creates an updated dict_str
    Parameters:
        dict_str (str): current dict_str
        **keyvals: variable amounts of key-values
    Returns:
        str: updated string
    """
    new_entries = ", ".join(f"'{key}': {val}" for key, val in keyvals.items()) # create a string representation for each key-value and join by ', '
    head, _, tail = dict_str.rpartition("}") # split at the last '}' only, so nested dicts are left alone
    return f"{head}, {new_entries}}}{tail}" # re-append the closing '}' after the new entries
Usage and output:
updated = update_dict(dict_str,
    d = 20,
    e = 30
)
print(updated)
{'a': A, 'b': B, 'c': 10, 'd': 20, 'e': 30}
some_dict = {
    'g': 2,
    'h': 3
}
updated = update_dict(dict_str,
    **some_dict
)
print(updated)
{'a': A, 'b': B, 'c': 10, 'g': 2, 'h': 3}

I think you can do one of the following:
Option 1 - Adding
Insert the new string ", key: value" at the end of the string, before the "}".
Option 2 - RegEx for adding/updating
1 - Use find() to search for the key. If it exists, use a regex to substitute the value:
re.sub(regex_search, regex_replace, contents)
So using something like:
string = re.sub(r"'key': [^,}]+", "'key': value", contents)
2 - If find() fails, fall back to the addition from option 1.
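The two options can be combined into one helper. This is only a sketch: the name upsert_dict_string is made up, and it assumes a flat dict string whose values contain no commas or braces. The value is passed as source text, so placeholders such as A stay untouched.

```python
import re


def upsert_dict_string(dict_str, key, value_src):
    """Replace the value of `key` if present, else append `key: value_src` before the final '}'."""
    # Match "'key': <old value>", where the old value runs up to the next ',' or '}'
    pattern = re.compile("(" + re.escape(repr(key)) + r"\s*:\s*)[^,}]+")
    if pattern.search(dict_str):
        # A function replacement avoids backslash handling in value_src
        return pattern.sub(lambda m: m.group(1) + value_src, dict_str, count=1)
    head = dict_str.rstrip()[:-1].rstrip()     # everything before the closing '}'
    sep = "" if head.endswith("{") else ", "   # no comma when the dict is empty
    return f"{head}{sep}{key!r}: {value_src}}}"


s = "{'a': A, 'b': B, 'c': 10}"
print(upsert_dict_string(s, 'd', '20'))
print(upsert_dict_string(s, 'b', 'C'))
```

For nested or otherwise complex dict strings a real parser, such as the lib2to3 approach above, is the safer choice.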

If it's just about adding at the end of the string...
this_string = "{'a': A, 'b': B, 'c': 10}"
this_add = "'d': 20"
this_string = f"{this_string[:-1]}, {this_add}{this_string[-1]}"
print(this_string)
will output
{'a': A, 'b': B, 'c': 10, 'd': 20}
If you need to insert the new string somewhere in the middle, you can do something similar: use str.find to locate the position and slice at that index instead.
This basically rewrites the entire string, but strings are immutable, so there is no way around that.

merge a dict into another, overwriting values including updating lists and sub-dicts (not overwriting the list itself)

I have a dictionary D which contains default settings for my application. It has a complex hierarchy, such as lists, and more dicts inside those lists (e.g. it might have a list of modules, and within each module there are further dicts, sometimes with more lists and more dicts etc).
I also have a small preferences dictionary P which contains an arbitrary subset of this dict (I'm 100% sure that this is a perfect subset).
I'd like to merge this subset P over the default dictionary D.
I thought D.update(P) would work, but this overwrites the lists.
E.g.
D={'i':0, 'j':1, 'modules':[{'a':1}, {'b':2}, {'c':3}] }
P={'i':10, 'modules':[{'c':30}] }
D.update(P)
# gives {'i': 10, 'j': 1, 'modules': [{'c': 30}]}
# I'd like {'i': 10, 'j': 1, 'modules': [{'a': 1}, {'b': 2}, {'c': 30}]}
There are a lot of similar posts regarding merging dictionaries in different ways, adding entries etc, but none of them seem to address this exact issue. This seems like a very common task but I couldn't figure out how to do it so I'd appreciate any pointers.
Cheers,
(P.S. I'd also like to maintain the order of all of the lists, as it gets reflected in the GUI)
EDIT:
It seems I wasn't very clear in my explanation. Sorry about that. The example above is a very simple toy example. My actual data (when saved to JSON) is about 50K. The hierarchy goes quite deep and I have dicts inside lists inside dicts inside lists etc. The atomic update rule apparently wasn't clear either (i.e. is 0 to 10 an addition or an overwrite?). To be clear, the atomic update is overwriting: P overwrites D. Only dicts and lists of dicts need to be iterated further. (I was hoping that user Preferences overwriting Default settings would help visualise this.) I also omitted an important detail in the toy example above: the dictionaries in a list should be matched not by key name (as in the example above, where the dict with key 'c' is common to P and D) but by the value of a specific key. See the new toy example below.
D={'i':'Hello', 'j':'World', 'modules':[{'name':'a', 'val':1}, {'name':'b', 'val':2}, {'name':'c', 'val':3}, {'name':'d', 'val':4}] }
P={'i':'Goodbye', 'modules':[{'name':'a', 'val':10}, {'name':'c', 'val':30}] }
EDIT2:
I've added a solution which seems to work. I was hoping for a more concise pythonic solution, but this does the job for now.

Here is a hack that merges your two current dicts.
I'm aware that it is not the "most pythonic" way to do it, but it can handle dicts like yours and give the desired output.
In my answer, I'm using groupby and zip_longest from the itertools module.
Here is my answer:
from itertools import groupby, zip_longest
D = {'i':0, 'j':1, 'modules':[{'a':1}, {'b':2}, {'c':3}] }
P = {'i':10, 'modules':[{'c':30}] }
sub = list(D.items()) + list(P.items())
final = {}
for k,v in groupby(sorted(sub, key=lambda x: x[0]), lambda x: x[0]):
    bb = list(v)
    if not isinstance(bb[0][1], list):
        # scalar values: keep the largest one (this happens to pick P's value here)
        final[k] = max(bb, key=lambda x: x[1])[1]
    else:
        kk, ff = [], []
        for k_ in zip_longest(*[k[1] for k in bb]):
            kk += [j for j in k_ if j is not None]
        for j,m in groupby(sorted(kk, key= lambda x: list(x.keys())[0]), lambda x: list(x.keys())[0]):
            ff += [dict(max([list(k.items()) for k in list(m)], key=lambda x:x))]
        final[k] = ff
print(final)
Output:
{'i': 10, 'j': 1, 'modules': [{'a': 1}, {'b': 2}, {'c': 30}]}

I was hoping for a more pythonic solution (much more concise). Here is a C-like solution (which is more where I come from).
Note: D and P below are very simplified toy examples. In reality they are quite deep with dicts inside lists inside dicts inside lists. This might not cover all cases, but it seems to work with my data (~50KBish when saved to json).
Output:
In [2]: P
Out[2]:
{'i': 'Goodbye',
'modules': [{'name': 'a', 'val': 10}, {'name': 'c', 'val': 30}]}
In [3]: D
Out[3]:
{'i': 'Hello',
'j': 'World',
'modules': [{'name': 'a', 'val': 1},
{'name': 'b', 'val': 2},
{'name': 'c', 'val': 3},
{'name': 'd', 'val': 4}]}
In [4]: merge_dicts_by_name(P, D)
merge_dicts_by_name <type 'dict'> <type 'dict'>
key: .i : Hello overwritten by Goodbye
key: .modules :
merge_dicts_by_name .modules <type 'list'> <type 'list'>
list item: .modules[0]
merge_dicts_by_name .modules[0] <type 'dict'> <type 'dict'>
key: .modules[0].name : a overwritten by a
key: .modules[0].val : 1 overwritten by 10
list item: .modules[1]
merge_dicts_by_name .modules[1] <type 'dict'> <type 'dict'>
key: .modules[1].name : c overwritten by c
key: .modules[1].val : 3 overwritten by 30
In [5]: D
Out[5]:
{'i': 'Goodbye',
'j': 'World',
'modules': [{'name': 'a', 'val': 10},
{'name': 'b', 'val': 2},
{'name': 'c', 'val': 30},
{'name': 'd', 'val': 4}]}
Code:
def merge_dicts_by_name(P, D, id_key='name', root='', depth=0, verbose=True, indent=' '):
    '''
    merge from dict (or list of dicts) P into D.
    i.e. can think of D as Default settings, and P as a subset containing user Preferences.
    Any value in P or D can be a dict or a list of dicts
    in which case same behaviour will apply (through recursion):
        lists are iterated and dicts are matched between P and D
        dicts are matched via an id_key (only at same hierarchy depth / level)
        matching dicts are updated with same behaviour
        for anything else P overwrites D
    P : dict or list of dicts (e.g. containing user Preferences, subset of D)
    D : dict or list of dicts (e.g. Default settings)
    id_key : the key by which sub-dicts are compared against (e.g. 'name')
    root : for keeping track of full path during recursion
    depth : keep track of recursion depth (for indenting)
    verbose : dump progress to console
    indent : with what to indent (if verbose)
    '''
    indent_full = indent * depth # computed unconditionally because the warnings below use it too
    if verbose:
        print(indent_full, 'merge_dicts_by_name', root, type(P), type(D))
    if type(P)==list: # D and P are lists of dicts
        assert(type(D)==type(P))
        for p_i, p_dict in enumerate(P): # iterate dicts in P
            path = root + '[' + str(p_i) + ']'
            if verbose: print(indent_full, 'list item:', path)
            d_id = p_dict[id_key] # get name of current dict
            # find corresponding dict in D
            d_dict = D[ next(i for (i,d) in enumerate(D) if d[id_key] == d_id) ]
            merge_dicts_by_name(p_dict, d_dict, id_key=id_key, root=path, depth=depth+1, verbose=verbose, indent=indent)
    elif type(P)==dict:
        assert(type(D)==type(P))
        for k in P:
            path = root + '.' + k
            if verbose: print(indent_full, 'key:', path, end=' : ')
            if k in D:
                if type(P[k]) in [dict, list]:
                    if verbose: print()
                    merge_dicts_by_name(P[k], D[k], id_key=id_key, root=path, depth=depth+1, verbose=verbose, indent=indent)
                else:
                    if verbose: print(D[k], 'overwritten by', P[k])
                    D[k] = P[k]
            else:
                print(indent_full, 'Warning: Key {} in P not found in D'.format(path))
    else:
        print(indent_full, "Warning: Don't know what to do with these types", type(P), type(D))
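For comparison, a more compact take on the same rules might look like the sketch below. It is not the code above rewritten line for line: the name merge_prefs is hypothetical, logging is left out, keys in P that are missing from D are simply added, and list items in P without a match in D are silently skipped.

```python
def merge_prefs(defaults, prefs, id_key="name"):
    """Merge `prefs` into `defaults` in place; list items are matched on `id_key`."""
    if isinstance(prefs, dict) and isinstance(defaults, dict):
        for key, value in prefs.items():
            current = defaults.get(key)
            if isinstance(value, (dict, list)) and isinstance(current, type(value)):
                merge_prefs(current, value, id_key)   # recurse into matching containers
            else:
                defaults[key] = value                 # atomic value: prefs overwrite defaults
    elif isinstance(prefs, list) and isinstance(defaults, list):
        # Index the default dicts by their id so the list order of `defaults` is kept
        by_id = {d[id_key]: d for d in defaults if isinstance(d, dict) and id_key in d}
        for item in prefs:
            target = by_id.get(item.get(id_key)) if isinstance(item, dict) else None
            if target is not None:
                merge_prefs(target, item, id_key)
    return defaults


D = {'i': 'Hello', 'j': 'World',
     'modules': [{'name': 'a', 'val': 1}, {'name': 'b', 'val': 2},
                 {'name': 'c', 'val': 3}, {'name': 'd', 'val': 4}]}
P = {'i': 'Goodbye', 'modules': [{'name': 'a', 'val': 10}, {'name': 'c', 'val': 30}]}
print(merge_prefs(D, P))
```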

Using a loop to .setdefault on Dict Creates Nested Dict

I'm trying to understand why
tree = {}
def add_to_tree(root, value_string):
    """Given a string of characters `value_string`, create or update a
    series of dictionaries where the value at each level is a dictionary of
    the characters that have been seen following the current character.
    """
    for character in value_string:
        root = root.setdefault(character, {})
add_to_tree(tree, 'abc')
creates {'a': {'b': {'c': {}}}}
while
root = {}
root.setdefault('a', {})
root.setdefault('b', {})
root.setdefault('c', {})
creates {'a': {}, 'b': {}, 'c': {}}
What is putting us into the assigned dict value on each iteration of the loop?

root.setdefault(character, {}) returns root[character] if character is already a key in root; otherwise it inserts root[character] = {} and returns that newly inserted dict. It is the same as root.get(character, {}) except that it also performs the assignment when character is not already a key in root.
root = root.setdefault(character, {})
rebinds root to the dict stored under character (a newly created one if character was not already a key in the original root), so the next iteration works one level deeper.
In [4]: root = dict()
In [5]: newroot = root.setdefault('a', {})
In [6]: root
Out[6]: {'a': {}}
In [7]: newroot
Out[7]: {}
In contrast, calling root.setdefault(character, {}) without assigning its return value back to root keeps every key at the top level:
tree = {}
def add_to_tree(root, value_string):
    """Given a string of characters `value_string`, create or update a
    series of dictionaries where the value at each level is a dictionary of
    the characters that have been seen following the current character.
    """
    for character in value_string:
        root.setdefault(character, {})
add_to_tree(tree, 'abc')
print(tree)
# {'a': {}, 'b': {}, 'c': {}}
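The key point is that the dict returned by setdefault is the very object stored in the parent, not a copy. The sketch below spells out the rebinding from the question's loop with an explicit intermediate name (child is introduced only for illustration).

```python
def add_to_tree(root, value_string):
    """Same loop as in the question, with the rebinding made explicit."""
    for character in value_string:
        child = root.setdefault(character, {})  # the dict stored at root[character]
        assert child is root[character]         # same object, not a copy
        root = child                            # step one level down for the next character


tree = {}
add_to_tree(tree, "abc")
add_to_tree(tree, "abd")
print(tree)  # {'a': {'b': {'c': {}, 'd': {}}}}
```

The name tree keeps pointing at the outermost dict the whole time; only the local name root moves down.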

For anyone else who is as slow as me. The answer to, "Why does the (above) function produce {'a': {'b': {'c': {}}}} and not {'a': {}, 'b': {}, 'c': {}}?" is:
Because we're looping within a function and reassigning the result to root each time, every call to .setdefault runs on the dict returned by the previous call rather than on the original one. It's kind of like the TV infomercials where they keep saying, "but wait! there's more!". On the first iteration .setdefault is called on tree with 'a'; it stores {} under 'a' and returns that inner dict, which root now refers to. On the second iteration .setdefault is called on that inner dict with 'b', so {'b': {}} ends up inside the value for 'a', and the same happens again for 'c'. Nothing is returned from the loop and applied to tree afterwards: tree is modified in place the whole time, because the name tree still refers to the outermost dict. Note that each time, what is actually returned by .setdefault IS the default, which in this case is {} (the very object that was just stored, not a copy). Here is a Python Visualizer illustration of the process.

How to assert a dict contains another dict without assertDictContainsSubset in python? [duplicate]

I know assertDictContainsSubset can do this in Python 2.7, but for some reason it's deprecated in Python 3.2. So is there any way to assert that a dict contains another one without assertDictContainsSubset?
This doesn't seem good:
for item in dic2:
    self.assertIn(item, dic)
Is there any other good way? Thanks

Although I'm using pytest, I found the following idea in a comment. It worked really great for me, so I thought it could be useful here.
Python 3:
assert dict1.items() <= dict2.items()
Python 2:
assert dict1.viewitems() <= dict2.viewitems()
It works with non-hashable items, but you can't know exactly which item eventually fails.

>>> d1 = dict(a=1, b=2, c=3, d=4)
>>> d2 = dict(a=1, b=2)
>>> set(d2.items()).issubset( set(d1.items()) )
True
And the other way around:
>>> set(d1.items()).issubset( set(d2.items()) )
False
Limitation: the dictionary values have to be hashable.

The big problem with the accepted answer is that it does not work if your dicts contain unhashable values. The second problem is that you get no useful output: the test passes or fails but doesn't tell you which field within the object is different.
As such, it is easier to simply create a subset dictionary and then test that. This way you can use the TestCase.assertDictEqual() method, which gives you very useful formatted output in your test runner showing the diff between the actual and the expected values.
I think the most pleasing and pythonic way to do this is with a simple dictionary comprehension, like this:
from unittest import TestCase
actual = {}
expected = {}
subset = {k:v for k, v in actual.items() if k in expected}
TestCase().assertDictEqual(subset, expected)
NOTE obviously if you are running your test in a method that belongs to a child class that inherits from TestCase (as you almost certainly should be) then it is just self.assertDictEqual(subset, expected)

John1024's solution worked for me. However, in case of a failure it only tells you False instead of showing you which keys are not matching. So, I tried to avoid the deprecated assert method by using other assertion methods that will output helpful failure messages:
expected = {}
response_keys = set(response.data.keys())
for key in input_dict.keys():
    self.assertIn(key, response_keys)
    expected[key] = response.data[key]
self.assertDictEqual(input_dict, expected)

You can use assertGreaterEqual or assertLessEqual.
user = {'id': 28027, 'email': 'chungs.lama#gmail.com', 'created_at': '2005-02-13'}
data = {"email": "chungs.lama#gmail.com"}
self.assertGreaterEqual(user.items(), data.items())
self.assertLessEqual(data.items(), user.items()) # Reversed alternative
Be sure to specify .items() or it won't work.

In Python 3 and Python 2.7, you can create a set-like "item view" of a dict without copying any data. This allows you to use comparison operators to test for a subset relationship.
In Python 3, this looks like:
# Test if d1 is a sub-dict of d2
d1.items() <= d2.items()
# Get items in d1 not found in d2
difference = d1.items() - d2.items()
In Python 2.7 you can use the viewitems() method in place of items() to achieve the same result.
In Python 2.6 and below, your best bet is to iterate over the keys in the first dict and check for inclusion in the second.
# Test if d1 is a subset of d2
all(k in d2 and d2[k] == d1[k] for k in d1)
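If the test should also report which entries are missing or different, and the values may be unhashable, a small helper along these lines could be used. The name missing_items is made up for this sketch, and it compares top-level entries only.

```python
def missing_items(subset, superset):
    """Return the entries of `subset` that are absent from, or different in, `superset`."""
    sentinel = object()  # distinguishes "key missing" from a stored value of None
    return {k: v for k, v in subset.items() if superset.get(k, sentinel) != v}


d1 = {"a": 1, "tags": ["x", "y"]}
d2 = {"a": 1, "b": 2, "tags": ["x", "y"]}
print(missing_items(d1, d2))            # {}
print(missing_items({"a": 2, "c": 3}, d2))  # {'a': 2, 'c': 3}
```

Inside a TestCase this could be used as self.assertEqual(missing_items(d1, d2), {}), so that a failure prints the offending entries.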

This answers a little broader question than you're asking but I use this in my test harnesses to see if the container dictionary contains something that looks like the contained dictionary. This checks keys and values. Additionally you can use the keyword 'ANYTHING' to indicate that you don't care how it matches.
def contains(container, contained):
    '''ensure that `contained` is present somewhere in `container`
    EXAMPLES:
    contains(
        {'a': 3, 'b': 4},
        {'a': 3}
    ) # True
    contains(
        {'a': [3, 4, 5]},
        {'a': 3},
    ) # True
    contains(
        {'a': 4, 'b': {'a':3}},
        {'a': 3}
    ) # True
    contains(
        {'a': 4, 'b': {'a':3, 'c': 5}},
        {'a': 3, 'c': 5}
    ) # True
    # if `contained` has a list, then every item from that list must be present
    # in the corresponding `container` list
    contains(
        {'a': [{'b':1}, {'b':2}, {'b':3}], 'c':4},
        {'a': [{'b':1},{'b':2}], 'c':4},
    ) # True
    # You can also use the string literal 'ANYTHING' to match anything
    contains(
        {'a': [{'b':3}]},
        {'a': 'ANYTHING'},
    ) # True
    # You can use 'ANYTHING' as a dict key and it indicates to match the corresponding value anywhere
    # below the current point
    contains(
        {'a': [ {'x':1,'b1':{'b2':{'c':'SOMETHING'}}}]},
        {'a': {'ANYTHING': 'SOMETHING', 'x':1}},
    ) # True
    contains(
        {'a': [ {'x':1, 'b':'SOMETHING'}]},
        {'a': {'ANYTHING': 'SOMETHING', 'x':1}},
    ) # True
    contains(
        {'a': [ {'x':1,'b1':{'b2':{'c':'SOMETHING'}}}]},
        {'a': {'ANYTHING': 'SOMETHING', 'x':1}},
    ) # True
    '''
    ANYTHING = 'ANYTHING'
    if contained == ANYTHING:
        return True
    if container == contained:
        return True
    if isinstance(container, list):
        if not isinstance(contained, list):
            contained = [contained]
        true_count = 0
        for contained_item in contained:
            for item in container:
                if contains(item, contained_item):
                    true_count += 1
                    break
        if true_count == len(contained):
            return True
    if isinstance(contained, dict) and isinstance(container, dict):
        contained_keys = set(contained.keys())
        if ANYTHING in contained_keys:
            contained_keys.remove(ANYTHING)
            if not contains(container, contained[ANYTHING]):
                return False
        container_keys = set(container.keys())
        if len(contained_keys - container_keys) == 0:
            # then all the contained keys are in this container ~ recursive check
            if all(
                contains(container[key], contained[key])
                for key in contained_keys
            ):
                return True
    # well, we're here, so I guess we didn't find a match yet
    if isinstance(container, dict):
        for value in container.values():
            if contains(value, contained):
                return True
    return False

Here is a comparison that works even if you have lists in the dictionaries:
superset = {'a': 1, 'b': 2}
subset = {'a': 1}
common = { key: superset[key] for key in set(superset.keys()).intersection(set(subset.keys())) }
self.assertEqual(common, subset)
