If I have a nested dictionary in Python, is there any way to restructure it based on keys?
I'm bad at explaining, so I'll give a little example.
d = {'A':{'a':[1,2,3],'b':[3,4,5],'c':[6,7,8]},
'B':{'a':[7,8,9],'b':[4,3,2],'d':[0,0,0]}}
Re-organize like this
newd = {'a':{'A':[1,2,3],'B':[7,8,9]},
'b':{'A':[3,4,5],'B':[4,3,2]},
'c':{'A':[6,7,8]},
'd':{'B':[0,0,0]}}
Given some function with inputs like
def mysteryfunc(olddict,newkeyorder):
    ????
mysteryfunc(d,[1,0])
Here the list [1,0] means: put the dictionary's 2nd level of keys at the first level and the first level of keys at the 2nd level. The values must of course stay associated with their original pair of keys.
Edit:
Looking for an answer that covers the general case, with arbitrary unknown nested dictionary depth.
Input:
d = {'A':{'a':[1,2,3],'b':[3,4,5],'c':[6,7,8]},
'B':{'a':[7,8,9],'b':[4,3,2],'d':[0,0,0]}}
inner_dict = {}
for k, v in d.items():
    for ka, va in v.items():
        # Create the new outer entry on first sight of ka,
        # then file the value under the old outer key k.
        if ka not in inner_dict:
            inner_dict[ka] = {}
        inner_dict[ka][k] = va
print(inner_dict)
Output:
{'a': {'A': [1, 2, 3], 'B': [7, 8, 9]},
'b': {'A': [3, 4, 5], 'B': [4, 3, 2]},
'c': {'A': [6, 7, 8]},
'd': {'B': [0, 0, 0]}}
You can use two for loops: the outer one iterates over each key/value pair and the inner one iterates over the nested dict. At each step of the inner loop you add one entry to the desired output:
from collections import defaultdict
new_dict = defaultdict(dict)
for k0, v0 in d.items():
    for k1, v1 in v0.items():
        new_dict[k1][k0] = v1
print(dict(new_dict))
output:
{'a': {'A': [1, 2, 3], 'B': [7, 8, 9]},
'b': {'A': [3, 4, 5], 'B': [4, 3, 2]},
'c': {'A': [6, 7, 8]},
'd': {'B': [0, 0, 0]}}
You can use recursion with a generator to handle input of arbitrary depth:
def paths(d, c = []):
    # Yield (reversed key path, leaf value) for every non-dict value
    for a, b in d.items():
        yield from ([((c+[a])[::-1], b)] if not isinstance(b, dict) else paths(b, c+[a]))
from collections import defaultdict
def group(d):
    # Group the paths by their first key, then recurse on the remaining keys
    _d = defaultdict(list)
    for [a, *b], c in d:
        _d[a].append([b, c])
    return {a:b[-1][-1] if not b[0][0] else group(b) for a, b in _d.items()}
print(group(list(paths(d))))
Output:
{'a': {'A': [1, 2, 3], 'B': [7, 8, 9]}, 'b': {'A': [3, 4, 5], 'B': [4, 3, 2]}, 'c': {'A': [6, 7, 8]}, 'd': {'B': [0, 0, 0]}}
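The recursive answer above reverses the whole key path. For an arbitrary order such as the [1,0] list in the question, one possible sketch is to flatten the dict into key paths, permute each path, and rebuild. The helper name flatten is hypothetical, and the sketch assumes newkeyorder is non-empty and that every branch is at least len(newkeyorder) levels deep:

```python
def flatten(d, depth, prefix=()):
    """Yield (key_path, value) pairs, descending exactly `depth` levels."""
    if depth == 0:
        yield prefix, d
        return
    for key, value in d.items():
        yield from flatten(value, depth - 1, prefix + (key,))


def mysteryfunc(olddict, newkeyorder):
    """Level i of the result uses the keys of old level newkeyorder[i]."""
    result = {}
    for path, value in flatten(olddict, len(newkeyorder)):
        new_path = [path[i] for i in newkeyorder]
        node = result
        # Walk (and create) the intermediate dicts, then store the value
        for key in new_path[:-1]:
            node = node.setdefault(key, {})
        node[new_path[-1]] = value
    return result


d = {'A': {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [6, 7, 8]},
     'B': {'a': [7, 8, 9], 'b': [4, 3, 2], 'd': [0, 0, 0]}}
newd = mysteryfunc(d, [1, 0])
print(newd)
```

Anything below the levels named in newkeyorder is carried along untouched as the value.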
Related
I would like to merge two dictionaries, but if they have the same key, I would only merge non-duplicate values.
The following code works, but is it possible to rewrite it as a union using | or {**dict1, **dict2}? When I tried |, the result of dict_merge({ 'A': [1, 2, 3] }, { 'A': [2, 3, 4] }) was {'A': [2, 3, 4]} instead of the merged list.
def dict_merge(dict1, dict2):
    for key in dict2.keys():
        if key in dict1.keys():
            d3 = dict1[key] + dict2[key]
            d3 = set(d3)
            dict1[key] = list(d3)
        else:
            dict1[key] = dict2[key]
    return dict1
dict_merge({ 'A': [1, 2, 3] }, { 'B': [2, 4, 5, 6]})
Output
{ 'A': [1, 2, 3], 'B': [2, 4, 5, 6] }
Giving your two dictionaries names, let's get the union of their keys.
>>> d1 = { 'A': [1, 2, 3] }
>>> d2 = { 'A': [2, 3, 4] }
>>> d1.keys() | d2.keys()
{'A'}
Assuming, based on your code, that the lists are really meant to be sets, we can iterate over the union of the keys in a dictionary comprehension, take the union of the two sets for each key, and turn the result back into a list.
>>> {k: list(set(d1.get(k, [])) | set(d2.get(k, []))) for k in d1.keys() | d2.keys()}
{'A': [1, 2, 3, 4]}
If we incorporate some more interesting dictionaries and repeat the same dictionary comprehension:
>>> d1 = {'A': [1,2,3], 'B': [4,5,6]}
>>> d2 = {'B': [5,6,7,8], 'C': [9,10]}
>>> {k: list(set(d1.get(k, [])) | set(d2.get(k, []))) for k in d1.keys() | d2.keys()}
{'C': [9, 10], 'A': [1, 2, 3], 'B': [4, 5, 6, 7, 8]}
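On the | question itself: the dict union operator (and {**dict1, **dict2}) keeps the right-hand value whenever a key appears in both dicts, which is why it returned {'A': [2, 3, 4]}. Below is a sketch of the comprehension above wrapped in a function that leaves its inputs unmodified and, unlike set, keeps the first-seen order of the values. It assumes the list elements are hashable:

```python
def dict_merge(dict1, dict2):
    """Return a new dict; lists under shared keys are merged without duplicates."""
    merged = {}
    # {**dict1, **dict2} is used only for its keys (dict1's keys first)
    for key in {**dict1, **dict2}:
        combined = dict1.get(key, []) + dict2.get(key, [])
        # dict.fromkeys drops duplicates while preserving order
        merged[key] = list(dict.fromkeys(combined))
    return merged


result = dict_merge({'A': [1, 2, 3]}, {'A': [2, 3, 4], 'B': [2, 4, 5, 6]})
print(result)
```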
Is that the solution you're wanting?
In [61]: my_dict = {}
    ...: d1 = {'A': [1, 2, 3], 'C': 123}
    ...: d2 = {'B': [2, 3, 4], 'A': [1, 2, 3]}
    ...: for i in set(d1).symmetric_difference(set(d2)):
    ...:     my_dict[i] = d1[i] if i in d1 else d2[i]
    ...: print(my_dict)
Output :
{'B': [2, 3, 4], 'C': 123}
I have list of identical dictionaries:
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
I need to get something like this:
a = [1, 4, 7]
b = [2, 5, 8]
c = [3, 6, 9]
I know how to do it using for .. in .., but is there a way to do it without looping?
If I do
a, b, c = zip(*my_list)
I'm getting
a = ('a', 'a', 'a')
b = ('b', 'b', 'b')
c = ('c', 'c', 'c')
Any solution?
You need to extract the values from every dict in my_list. You could try:
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
a, b, c = zip(*map(lambda d: d.values(), my_list))
print(a, b, c)
# (1, 4, 7) (2, 5, 8) (3, 6, 9)
As @Alexandre pointed out, this works only when every dict yields its values in the same order (dicts preserve insertion order only from Python 3.7 onward). If you cannot guarantee the order, consider yatu's answer.
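One way to avoid depending on key order is to ask for the keys by name with operator.itemgetter. This is only a sketch: it assumes the keys are known up front (the keys tuple is introduced here for illustration) and that every dict contains all of them. The second dict is deliberately reordered to show that order no longer matters:

```python
from operator import itemgetter

my_list = [{'a': 1, 'b': 2, 'c': 3},
           {'c': 6, 'a': 4, 'b': 5},   # keys in a different order
           {'a': 7, 'b': 8, 'c': 9}]

keys = ('a', 'b', 'c')
# itemgetter(*keys)(d) returns (d['a'], d['b'], d['c']) for each dict
a, b, c = zip(*map(itemgetter(*keys), my_list))
print(a, b, c)
```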
You will have to loop to obtain the values from the inner dictionaries. Probably the most appropriate structure is a dictionary that maps each letter to a list of its values. Assigning to separate variables is usually not the best idea, as it only works with a fixed number of variables.
You can iterate over the inner dictionaries, and append to a defaultdict as:
from collections import defaultdict
out = defaultdict(list)
for d in my_list:
    for k,v in d.items():
        out[k].append(v)
print(out)
#defaultdict(list, {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]})
pandas' DataFrame has a factory method for exactly this, which is worth using if pandas is already a dependency or if the input data is large enough:
import pandas as pd
my_list = ...
df = pd.DataFrame.from_records(my_list)
a = list(df['a']) # df['a'] is a pandas Series, essentially a wrapped C array
b = list(df['b'])
c = list(df['c'])
Please find the code below. I believe that the version with a loop is much easier to read.
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
# we assume that all dictionaries have the same keys
a, b, c = map(list, map(lambda k: map(lambda d: d[k], my_list), my_list[0]))
print(a,b,c)
I have written code that works, but when I tried to do it with only one for loop it didn't work out.
Here is the Python code:
lst_one=[1,2,3,4,5,6,7,8]
lst_two=['a','b','c','a','b','c','a','a']
result={}
for createname in range(len(lst_one)):
    result[lst_two[createname]]=[]
for value in range(len(lst_one)):
    result[lst_two[value]].append(lst_one[value])
print(result)
The code above prints {'a': [1, 4, 7, 8], 'b': [2, 5], 'c': [3, 6]}.
It works fine using two loops.
Is it possible to use one loop instead of two?
I am using a range loop, not lambda, zip, and so on.
Use zip + setdefault:
lst_one = [1, 2, 3, 4, 5, 6, 7, 8]
lst_two = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'a']
result = {}
for o, t in zip(lst_one, lst_two):
    result.setdefault(t, []).append(o)
print(result)
Output
{'a': [1, 4, 7, 8], 'b': [2, 5], 'c': [3, 6]}
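If the range-based loop from the question is to be kept, the same setdefault idea works with an index instead of zip. A minimal sketch, assuming both lists have the same length:

```python
lst_one = [1, 2, 3, 4, 5, 6, 7, 8]
lst_two = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'a']

result = {}
for i in range(len(lst_one)):
    # setdefault creates the empty list the first time a key is seen
    result.setdefault(lst_two[i], []).append(lst_one[i])

print(result)
```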
You can use defaultdict, which creates a dictionary whose default value type you choose (list, int, dict, ...). It handles missing keys for you: if the key is present, the operation applies to its existing value; if not, the key is first created with an empty default value.
lst_one=[1,2,3,4,5,6,7,8]
lst_two=['a','b','c','a','b','c','a','a']
from collections import defaultdict
result = defaultdict(list)
for a,b in zip(lst_one, lst_two):
    result[b].append(a)
print(dict(result))
output
{'a': [1, 4, 7, 8], 'b': [2, 5], 'c': [3, 6]}
If you don't want to use defaultdict, you can use the code below, which does the same thing by hand:
lst_one=[1,2,3,4,5,6,7,8]
lst_two=['a','b','c','a','b','c','a','a']
result ={}
for a, b in zip(lst_one, lst_two):
    if b not in result:
        result[b] = [a]
    else:
        result[b].append(a)
print(result)
output
{'a': [1, 4, 7, 8], 'b': [2, 5], 'c': [3, 6]}
I'd recommend using groupby from the itertools package if you want to condense this:
from itertools import groupby
{a[0]:[e[1] for e in b] for a,b in groupby(sorted(zip(lst_two, lst_one)), lambda x:x[0])}
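Note that groupby only groups consecutive items, which is why the pairs are sorted first. Sorting whole tuples also sorts the values inside each group; the sketch below sorts on the key alone, so the values keep their original order (Python's sort is stable). The variable names are illustrative:

```python
from itertools import groupby
from operator import itemgetter

lst_one = [1, 2, 3, 4, 5, 6, 7, 8]
lst_two = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'a']

# Sort by the letter only, then group consecutive pairs sharing a letter
pairs = sorted(zip(lst_two, lst_one), key=itemgetter(0))
result = {key: [value for _, value in group]
          for key, group in groupby(pairs, key=itemgetter(0))}

print(result)
```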
I'd like to do the cartesian product of multiple dicts, based on their keys, and then sum the produced tuples, and return that as a dict. Keys that don't exist in one dict should be ignored (this constraint is ideal, but not necessary; i.e. you may assume all keys exist in all dicts if needed). Below is basically what I'm trying to achieve (example shown with two dicts). Is there a simpler way to do this, and with N dicts?
import itertools
from collections import defaultdict
def doProdSum(inp1, inp2):
    prod = defaultdict(lambda: 0)
    for key in set(list(inp1.keys())+list(inp2.keys())):
        if key not in prod:
            prod[key] = []
        if key not in inp1 or key not in inp2:
            # Key exists in only one dict: keep its list unchanged
            prod[key] = inp1[key] if key in inp1 else inp2[key]
            continue
        for values in itertools.product(inp1[key], inp2[key]):
            prod[key].append(values[0] + values[1])
    return prod
x = doProdSum({"a":[0,1,2],"b":[10],"c":[1,2,3,4]}, {"a":[1,1,1],"b":[1,2,3,4,5]})
print(x)
Output (as expected):
{'c': [1, 2, 3, 4], 'b': [11, 12, 13, 14, 15], 'a': [1, 1, 1, 2, 2, 2, 3, 3, 3]}
You can do it like this, by first reorganizing your data by key:
from collections import defaultdict
from itertools import product
def doProdSum(list_of_dicts):
    # We reorganize the data by key
    lists_by_key = defaultdict(list)
    for d in list_of_dicts:
        for k, v in d.items():
            lists_by_key[k].append(v)
    # lists_by_key looks like {'a': [[0, 1, 2], [1, 1, 1]], 'b': [[10], [1, 2, 3, 4, 5]], 'c': [[1, 2, 3, 4]]}
    # Then we generate the output
    out = {}
    for key, lists in lists_by_key.items():
        out[key] = [sum(prod) for prod in product(*lists)]
    return out
Example output:
list_of_dicts = [{"a":[0,1,2],"b":[10],"c":[1,2,3,4]}, {"a":[1,1,1],"b":[1,2,3,4,5]}]
doProdSum(list_of_dicts)
# {'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
# 'b': [11, 12, 13, 14, 15],
# 'c': [1, 2, 3, 4]}
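Because product(*lists) accepts any number of lists, the same approach extends to N dicts, and a key present in only one dict simply passes through (the product of a single list yields 1-tuples). A compact sketch with three dicts; the name prod_sum and the sample data are made up for illustration:

```python
from collections import defaultdict
from itertools import product


def prod_sum(*dicts):
    """Sum every combination of values that share a key across the dicts."""
    lists_by_key = defaultdict(list)
    for d in dicts:
        for key, values in d.items():
            lists_by_key[key].append(values)
    return {key: [sum(combo) for combo in product(*lists)]
            for key, lists in lists_by_key.items()}


result = prod_sum({"a": [0, 1], "b": [5]}, {"a": [10, 20]}, {"a": [100]})
print(result)
```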
I have two dictionaries where each value is a list of floats
d1 = {'a': [10,11,12], 'b': [9,10,11], 'c': [8,9,10], 'd': [7,8,9]}
d2 = {'a': [1,1,1], 'b': [2,3,2], 'c': [1,2,2], 'd': [4,3,4]}
I want to subtract the values between dictionaries d1-d2 and get the result:
d3 = {'a': [9,10,11], 'b': [7,7,9], 'c': [7,7,8], 'd': [3,5,5] }
I have found on this site entries on how to subtract dictionaries with only one float value per key, and how to subtract lists within each dictionary, but not between dictionaries.
Also, speed needs to be taken into account because I am going to run this ~200,000 times with different dictionaries each time.
Use a dict comprehension with zip:
>>> {k:[x-y for x, y in zip(d1[k], d2[k])] for k in d1}
{'a': [9, 10, 11], 'c': [7, 7, 8], 'b': [7, 7, 9], 'd': [3, 5, 5]}
or map:
>>> from operator import sub
>>> {k:list(map(sub, d1[k], d2[k])) for k in d1}
{'a': [9, 10, 11], 'c': [7, 7, 8], 'b': [7, 7, 9], 'd': [3, 5, 5]}
If speed is important then you can try numpy:
import numpy as np
def sub(x, y):
    # probably it would be better if x and y already had numpy arrays as the values.
    return {key: np.array(x[key]) - np.array(y[key]) for key in x}
print(sub(d1, d2))