I need to reduce the nesting of a dictionary: whenever a node has exactly one child, the child's key should be appended to the parent's key.
Example:
Given this dictionary:
{'A': {'a': {'1': {}}},
 'B': {'b': {'2': {}},
       'c': {'3': {'x': {}}},
       'd': {},
       'e': {'0': {},
             '1': {},
            },
      },
}
I need to return:
{'A a 1': {},
 'B': {'b 2': {},
       'c 3 x': {},
       'd': {},
       'e': {'0': {},
             '1': {},
            },
      },
}
It should be generic for any number of levels, and the last element is always an empty dict.
You can first flatten the structure to retrieve all root-to-leaf paths and then rebuild it, grouping the paths by their first key with collections.defaultdict:
import collections
data = {'A': {'a': {'1': {}}}, 'B': {'b': {'2': {}}, 'c': {'3': {'x': {}}}, 'd': {}, 'e': {'0': {}, '1': {}}}}
def flatten(d, c = []):
    # yield (path, leaf) for every leaf, where path is the list of keys leading to it
    for a, b in d.items():
        if not b:
            yield (c+[a], b)
        else:
            yield from flatten(b, c+[a])
def compress(d):
    # group the paths by their first key
    _d, r = collections.defaultdict(list), {}
    for [a, *b], c in d:
        _d[a].append((b, c))
    for a, b in _d.items():
        # several paths under this key: recurse; a single path: collapse it into one key
        val = compress(b) if len(b) > 1 and all(j for j, _ in b) else b[0][-1]
        # join with single spaces so that a bare leaf keeps its key unchanged
        r[a if len(b) > 1 else ' '.join([a, *b[0][0]])] = val
    return r
print(compress(list(flatten(data))))
Output:
{'A a 1': {},
 'B': {'b 2': {},
       'c 3 x': {},
       'd': {},
       'e': {'0': {},
             '1': {}}
      }
}
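To make the intermediate step of the flatten-and-rebuild approach concrete, here is a minimal sketch that only lists the root-to-leaf paths. The name iter_paths is a hypothetical helper used for illustration, not part of the answer above, and it assumes every leaf is an empty dict, as stated in the question.

```python
def iter_paths(tree, prefix=()):
    """Yield one tuple of keys per root-to-leaf path in a dict of dicts."""
    for key, child in tree.items():
        path = prefix + (key,)
        if child:
            # non-empty dict: keep descending
            yield from iter_paths(child, path)
        else:
            # an empty dict marks a leaf
            yield path


data = {'A': {'a': {'1': {}}},
        'B': {'b': {'2': {}}, 'c': {'3': {'x': {}}}, 'd': {}, 'e': {'0': {}, '1': {}}}}

paths = list(iter_paths(data))
for path in paths:
    print(' / '.join(path))
```

Paths that share a first key with other paths (everything under 'B' here) stay nested, while a key that has only one path below it ('A', 'b', 'c') can be collapsed into a single joined key.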
I believe this recursive function works for your example:
def flatten_keys(key_so_far = '', d={}):
    if len(d) > 1:
        # several children: keep this level and process each child separately
        sub_dict = {}
        for (k,v) in d.items():
            sub_dict.update(flatten_keys(k, v))
        return {key_so_far: sub_dict} if key_so_far else sub_dict
    elif d == {}:
        # leaf: emit the key accumulated so far
        return {key_so_far: {}}
    else:
        # exactly one child: append its key and keep descending
        k,v = list(d.items())[0]
        key_so_far += (' ' if key_so_far else '') + k
        return(flatten_keys(key_so_far, v))
input_d = {'A': {'a': {'1': {}}},
           'B': {'b': {'2': {}},
                 'c': {'3': {'x': {}}},
                 'd': {},
                 'e': {'0': {},
                       '1': {},
                      },
                },
          }
# pass the dictionary by keyword; the first positional parameter is the key prefix
flatten_keys(d=input_d)
# {'A a 1': {}, 'B': {'b 2': {}, 'c 3 x': {}, 'd': {}, 'e': {'0': {}, '1': {}}}}
Related
I have a large dataset of item codes and components. Each item code maps to a component, and a component can in turn be the item code of another component. How can I build a nested dictionary from this in Python?
item code    component
a            q
b            w
c            r
d            t
e            y
q            u
q            v
Desired output:
{a:{q:[u,v]},b:w,c:r etc}
I tried defaultdict, but it only gave me a flat dictionary, not a nested one.
In [108]: df = pd.DataFrame({'item_code': list('abcdeqq'), 'component': list('qwrtyuv')})
In [109]: import networkx as nx
In [110]: g = nx.DiGraph([(k,v) for k,v in zip(df['item_code'], df['component'])])
In [111]: {k:v if len(v) > 1 else v[0] for k,v in nx.convert.to_dict_of_lists(g).items() if v}
Out[111]: {'a': 'q', 'q': ['u', 'v'], 'b': 'w', 'c': 'r', 'd': 't', 'e': 'y'}
Using networkx you can get something like this.
Based on this answer, I was able to arrive at this solution:
import networkx as nx
# df is the item_code/component DataFrame from the question
G = nx.DiGraph()
G.add_edges_from(df.values)
def comb_tup(li_tup):
    d = {}
    crnt = d   # memo the crnt subtree
    stck = []  # stack of (sub)trees along current path
    for k, v in li_tup:
        # climb back up until we reach the subtree that contains k
        while stck and k not in crnt:
            crnt = stck.pop()
        if k not in crnt:
            crnt[k] = {}
        stck.append(crnt)
        crnt = crnt[k]
        crnt[v] = {}
    return d
final_di = {}
for node in G.nodes:
    # depth-first edges reachable from this node
    vi = list(nx.dfs_edges(G,node))
    d = comb_tup(vi)
    if len(d.keys()):
        for k,v in d.items():
            final_di[k] = v
final_di:
{'a': {'q': {'u': {}, 'v': {}}},
'q': {'u': {}, 'v': {}},
'b': {'w': {}},
'c': {'r': {}},
'd': {'t': {}},
'e': {'y': {}}}
If you have this data:
item_code component
0 a q
1 b w
2 c r
3 d t
4 e y
5 q u
6 q v
7 u x
final_di:
{'a': {'q': {'u': {'x': {}}, 'v': {}}},
'q': {'u': {'x': {}}, 'v': {}},
'b': {'w': {}},
'c': {'r': {}},
'd': {'t': {}},
'e': {'y': {}},
'u': {'x': {}}}
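For comparison, the same parent-to-children idea can be sketched without networkx. This is a minimal illustration under stated assumptions, not the method of the answers above: the names pairs, children, build_subtree and roots are made up for the example, the data is assumed to contain no cycles, and only items that never appear as a component are used as top-level keys (so 'q' does not get its own top-level entry here).

```python
from collections import defaultdict

# (item_code, component) pairs from the question
pairs = [('a', 'q'), ('b', 'w'), ('c', 'r'), ('d', 't'),
         ('e', 'y'), ('q', 'u'), ('q', 'v')]

# map every item code to the list of its direct components
children = defaultdict(list)
for item, component in pairs:
    children[item].append(component)


def build_subtree(node):
    """Return the nested dict below node; leaves become empty dicts.

    Assumes the data has no cycles, otherwise this recursion never ends.
    """
    return {child: build_subtree(child) for child in children.get(node, [])}


# roots are item codes that are never listed as somebody's component
components = {component for _, component in pairs}
roots = [item for item in children if item not in components]

tree = {root: build_subtree(root) for root in roots}
print(tree)
```

Using children.get instead of children[node] avoids silently adding empty entries to the defaultdict while the tree is being built.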
Say I have two dicts with the same structure (the dict below shows the structure shared by each of them):
{
    '0': {
        'A': a,
        'B': b,
    },
    '1': {
        'C': c,
        'D': d,
    }
}
#Sample input and output:
#dict1
{
    '0': {
        'A': 0,
        'B': 1,
    },
    '1': {
        'C': 2,
        'D': 3,
    }
}
#dict2
{
    '0': {
        'A': 5,
        'B': 5,
    },
    '1': {
        'C': 5,
        'D': 5,
    },
    '3': { 'E': 5 } #this will be ignored when combining
}
#merged output with addition:
{
    '0': {
        'A': 5,
        'B': 6,
    },
    '1': {
        'C': 7,
        'D': 8,
    }
}
Ideally, everything about both dicts is the same except for the values a, b, c, d. If a subsequent dict has parts in its structure that differ from the first dict, those parts are ignored when merging, like the '3' and 'E' keys in dict2 above.
How can I combine both dicts into one dict that keeps the same structure but holds the merged values? I would like to make this generic, so that the 'merge' operation could be addition, subtraction, etc., and the number of dicts to merge can vary (not just 2 at a time).
Thanks, I hope to learn more about Python from your solutions.
Is this the kind of code you are looking for?
merged = {}
for i in a:
    for j in a[i]:
        # keep the first value seen for each inner key
        merged.setdefault(j, a[i].get(j))
print(merged)
Output:
{'A': 'a', 'B': 'b', 'K': 'k', 'L': 'l', 'C': 'c', 'D': 'd'}
Input:
a = {
    '0': {
        'A': "a",
        'B': "b",
        'K': "k",
        'L': "l"
    },
    '1': {
        'C': "c",
        'D': "d"
    }
}
You can simply use the update method.
If a is a dictionary and you want to add the contents of another dictionary b to it, use:
a.update(b)
This combines the dicts, i.e., merges them.
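One caveat is worth illustrating: dict.update works only on the top level. When both dicts have the same outer key, the inner dict from the second one replaces the inner dict from the first instead of being combined with it. The sketch below uses made-up sample values in the shape of the question's data to show this.

```python
dict1 = {'0': {'A': 0, 'B': 1}, '1': {'C': 2, 'D': 3}}
dict2 = {'0': {'A': 5}, '3': {'E': 5}}

combined = dict(dict1)   # shallow copy, so dict1 itself keeps its keys
combined.update(dict2)   # top-level keys from dict2 win

# the whole inner dict under '0' was replaced: 'B' is gone and nothing was added up
print(combined)
```

So update alone neither keeps the first dict's structure nor applies an operation such as addition to the values.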
d = {
    '0': {
        'A': 'a',
        'B': 'b',
    },
    '1': {
        'C': 'c',
        'D': 'd',
    },
}
def merge(d):
    merged_dict = {}
    for k in d:
        for key in d[k]:
            # only set the key if it is missing (or currently holds a falsy value)
            if not merged_dict.get(key, False):
                merged_dict[key] = d[k][key]
    return merged_dict
md = merge(d)
This should just merge the different inner dicts into one.
For supporting arbitrary dict structure and nesting, you will have to use a recursive function. Pass the operator as a parameter and use e.g. reduce to calculate the value for all the passed dicts (one or more). This assumes only dicts of dicts, but could also handle e.g. lists with another block if isinstance(first, list) and a corresponding list comprehension.
import functools, operator
def merge(op, first, *more):
    if isinstance(first, dict):
        # recurse per key of first; dicts in more that lack the key are skipped
        return {key: merge(op, first[key], *(d[key] for d in more if key in d))
                for key in first}
    else:
        # leaf values: fold the operator over all of them
        return functools.reduce(op, more, first)
A = {'0': {'A': 1, 'B': 2}, '1': {'C': 3, 'D': 4}}
B = {'0': {'A': 5, 'B': 6}, '1': {'C': 7, 'D': 8}}
print(merge(operator.add, A, B))
# {'0': {'A': 6, 'B': 8}, '1': {'C': 10, 'D': 12}}
Also works with more than two dicts, different operator, or if the more dicts have more or fewer keys than first:
C = {'0': {'A': 9, 'B': 10}, '1': {'C': 11, 'D': 12}}
D = {'0': {'A': 13, 'X': 14}, '2': {'C': 15, 'D': 16}}
print(merge(lambda x, y: f"{x}-{y}", A, B, C, D))
# {'0': {'A': '1-5-9-13', 'B': '2-6-10'}, '1': {'C': '3-7-11', 'D': '4-8-12'}}
It still assumes that values corresponding to the same keys in first and more have the same type, though.
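The list handling mentioned above could look roughly like the following sketch. It is one possible reading, not the answer's own code: merge_nested is a hypothetical name, and it assumes lists are merged position by position, with positions that are missing from a later list simply skipped (mirroring how missing keys are skipped for dicts).

```python
import functools
import operator


def merge_nested(op, first, *more):
    """Merge nested dicts/lists with the structure of first, combining leaves with op."""
    if isinstance(first, dict):
        # same rule as before: only keys of first survive
        return {key: merge_nested(op, first[key],
                                  *(d[key] for d in more if key in d))
                for key in first}
    if isinstance(first, list):
        # assumption: merge element-wise, ignoring extra or missing positions
        return [merge_nested(op, item,
                             *(lst[i] for lst in more if i < len(lst)))
                for i, item in enumerate(first)]
    # leaf values: fold the operator over all of them
    return functools.reduce(op, more, first)


A = {'k': [1, {'x': 2}]}
B = {'k': [10, {'x': 20}, 99]}   # the extra third element is ignored
result = merge_nested(operator.add, A, B)
print(result)
```

As with the dict-only version, this still assumes that values found at the same position have compatible types.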
I have a nested dictionary d as below:
d = {'id': {"a,b": {'id': {"x": {'id': None},
                           "y": {'id': {"a": {'id': None},
                                        "b": {'id': None}}}}},
            "c,d": {'id': {"c": {'id': None},
                           "d": {'id': {"x": {'id': None},
                                        "y": {'id': None}}}}}}}
and would like to unnest some levels and compress it to the following output:
{"a,b": {"x": None,
         "y": {"a": None,
               "b": None}},
 "c,d": {"c": None,
         "d": {"x": None,
               "y": None}}}
I would like to unnest every nested dictionary stored under the key id and replace it with its inner dictionary.
My starting point is:
def unnest_dictionary(d):
    for k,v in d.items():
        if isinstance(v, dict):
            unnest_dictionary(v)
        if k=='id':
            ......
I'm not sure how to unnest it from there.
Here is how I ended up solving it.
I flattened the dictionary, removed the levels named id, then nested it back again:
import re
d = {'id': {"a,b": {'id': {"x": {'id': None},
                           "y": {'id': {"a": {'id': None},
                                        "b": {'id': None}}}}},
            "c,d": {'id': {"c": {'id': None},
                           "d": {'id': {"x": {'id': None},
                                        "y": {'id': None}}}}}}}
def flatten_dict(dd, separator ='_', prefix =''):
    # join the keys along each path with the separator
    return { prefix + separator + k if prefix else k : v
             for kk, vv in dd.items()
             for k, v in flatten_dict(vv, separator, kk).items()
             } if isinstance(dd, dict) else { prefix : dd }
def nest_dict(dict1):
    result = {}
    for k, v in dict1.items():
        split_rec(k, v, result)
    return result
def split_rec(k, v, out):
    # split off the first path component and descend into (or create) its dict
    k, *rest = k.split('_', 1)
    if rest:
        split_rec(rest[0], v, out.setdefault(k, {}))
    else:
        out[k] = v
flat_d = flatten_dict(d)
for k in list(flat_d.keys()):
    # drop every 'id' path component from the flattened key
    new_key = re.sub(r'_id|id_','',k)
    flat_d[new_key] = flat_d.pop(k)
nested_d = nest_dict(flat_d)
print(nested_d)
# {'a,b': {'x': None, 'y': {'a': None, 'b': None}}, 'c,d': {'c': None, 'd': {'x': None, 'y': None}}}
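A direct recursive version is also possible and avoids the string round trip, which could misbehave if a real key contained "_" or the text "id". The following is a minimal sketch under the assumption that a wrapper level is always a dict whose only key is 'id'; strip_id_levels is a hypothetical name chosen for this example.

```python
def strip_id_levels(node):
    """Remove every {'id': ...} wrapper level from a nested dict."""
    if not isinstance(node, dict):
        # leaf value such as None: nothing to unwrap
        return node
    if set(node) == {'id'}:
        # a dict whose only key is 'id' is a wrapper: replace it by its content
        return strip_id_levels(node['id'])
    # ordinary level: keep the keys, clean up the values
    return {key: strip_id_levels(value) for key, value in node.items()}


d = {'id': {"a,b": {'id': {"x": {'id': None},
                           "y": {'id': {"a": {'id': None},
                                        "b": {'id': None}}}}},
            "c,d": {'id': {"c": {'id': None},
                           "d": {'id': {"x": {'id': None},
                                        "y": {'id': None}}}}}}}

unnested = strip_id_levels(d)
print(unnested)
```

A dict that mixes 'id' with other keys is left as an ordinary level in this sketch; whether that is the desired behaviour depends on the data.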
I have a dict code snippet that is not behaving as expected:
a = {"d1":{"a":1,"b":2,"c":4},"d2":{"a":1,"b":2,"c":4},"d3":{"a":1,"b":2,"c":4}}
b = {"d1":{"a":1,"b":0},"d2":{"a":0,"c":4},"d3":{"a":1,"b":2,"c":4}}
c = dict.fromkeys(a.keys(),{})
print(c)
for doc in b.keys():
    for word in b[doc].keys():
        c[doc][word] = a[doc][word]*b[doc][word]
print(c)
The output is:
{'d1': {}, 'd2': {}, 'd3': {}}
{'d1': {'a': 1, 'b': 4, 'c': 16}, 'd2': {'a': 1, 'b': 4, 'c': 16}, 'd3': {'a': 1, 'b': 4, 'c': 16}}
instead of:
{'d1': {}, 'd2': {}, 'd3': {}}
{'d1': {'a': 1, 'b': 0}, 'd2': {'a': 0, 'c': 16}, 'd3': {'a': 1, 'b': 4, 'c': 16}}
I'm very confused now; any insights would be helpful.
The problem is that you are passing a mutable object as the second argument to fromkeys, so every key ends up referring to the same dict.
This is much clearer here:
d = dict.fromkeys(['a', 'b'], [])
d['a'].append(1)
print(d)
Outputs
{'a': [1], 'b': [1]}
I made a modification to your for loop, starting from an empty c so that each doc gets its own dict:
c = {}
for doc in b.keys():
    for word in b[doc].keys():
        if doc not in c:
            c[doc]={}
        c[doc][word] = a[doc][word]*b[doc][word]
print(c)
#{'d1': {'a': 1, 'b': 0}, 'd2': {'a': 0, 'c': 16}, 'd3': {'a': 1, 'b': 4, 'c': 16}}
Use a dictionary comprehension to create c instead:
c = {k: {} for k in a.keys()}
for doc in b.keys():
    for word in b[doc].keys():
        c[doc][word] = a[doc][word]*b[doc][word]
print(c)
# {'d1': {'a': 1, 'b': 0}, 'd2': {'a': 0, 'c': 16}, 'd3': {'a': 1, 'b': 4, 'c': 16}}
Notice the difference when you use fromkeys vs dictionary comprehension:
c = dict.fromkeys(a.keys(),{})
print([id(o) for o in c.values()])
# [53649152, 53649152, 53649152]
# same object reference id!
c = {k: {} for k in a.keys()}
print([id(o) for o in c.values()])
# [53710208, 53649104, 14445232]
# each object has different reference id
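Another way to get a separate inner dict per key is collections.defaultdict(dict), which creates a fresh dict the first time a key is accessed. This is a sketch of an alternative, not what the answers above use; note that, unlike the comprehension, it only creates entries for docs that actually occur in b.

```python
from collections import defaultdict

a = {"d1": {"a": 1, "b": 2, "c": 4}, "d2": {"a": 1, "b": 2, "c": 4}, "d3": {"a": 1, "b": 2, "c": 4}}
b = {"d1": {"a": 1, "b": 0}, "d2": {"a": 0, "c": 4}, "d3": {"a": 1, "b": 2, "c": 4}}

# the factory dict() is called once per missing key, so no inner dict is shared
c = defaultdict(dict)
for doc, words in b.items():
    for word, weight in words.items():
        c[doc][word] = a[doc][word] * weight

result = dict(c)  # convert back to a plain dict for printing
print(result)
```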
Having a dict like:
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
I'd like to have a new key total with the sum of each key in the subdictionaries, like:
x['total'] = {'a': 3, 'b': 7}
I've tried adapting the answer from this question but found no success.
Could someone shed some light on this?
Assuming all the values of x are dictionaries, you can iterate over their items to compose your new dictionary.
from collections import defaultdict
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
total = defaultdict(int)
for d in x.values():
    for k, v in d.items():
        total[k] += v
print(total)
# defaultdict(<class 'int'>, {'a': 3, 'b': 7})
A variation of Patrick's answer, using collections.Counter and just update, since the sub-dicts are already in the proper format:
from collections import Counter
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
total = Counter()
for d in x.values():
    total.update(d)
print(total)
result:
Counter({'b': 7, 'a': 3})
(update works differently for Counter, it doesn't overwrite the keys but adds to the current value, that's one of the subtle differences with defaultdict(int))
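The difference between the two update methods can be seen side by side in this small sketch (the variable names plain and counts are made up for the illustration):

```python
from collections import Counter

plain = {'a': 1, 'b': 3}
plain.update({'a': 2, 'b': 4})     # dict.update overwrites existing keys

counts = Counter({'a': 1, 'b': 3})
counts.update({'a': 2, 'b': 4})    # Counter.update adds to the existing counts

print(plain)         # values were replaced
print(dict(counts))  # values were summed
```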
You can use a dictionary comprehension:
x = {'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}}
full_sub_keys = {i for b in map(dict.keys, x.values()) for i in b}
x['total'] = {i:sum(b.get(i, 0) for b in x.values()) for i in full_sub_keys}
Output:
{'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}, 'total': {'b': 7, 'a': 3}}
from collections import defaultdict
dictionary = defaultdict(int)
x = {
    '1': {'a': 1, 'b': 3},
    '2': {'a': 2, 'b': 4}
}
for numbers in x.values():
    for key, num in numbers.items():
        dictionary[key] += num
x['total'] = {key: value for key, value in dictionary.items()}
print(x)
We can use a defaultdict while iterating through each of the key, value pairs in the nested dictionaries to sum up the total for each key. That makes a evaluate to 3 and b evaluate to 7. After accumulating the values, a simple dictionary comprehension turns the totals into another nested dictionary, with a and b as the keys and their sums as the values. Here is your output:
{'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}, 'total': {'a': 3, 'b': 7}}