I want to combine my 3 dictionaries into 1 nested dictionary. I wrote the following code to do it using 3 nested for loops. Is there a more efficient way, or a recursive function, to do the same thing?
X = {"X1":["O","E","P"],"X2":["M"]}
Y = {"O":["a"],"E":["b","c"],"P":["d"],"M":["r"]}
Z = {"a":["1"],"b":["2","3"],"c":[],"d":["4","5"],"r":["6"]}
d1 = {}
for k in X:
    A = X[k]
    d2 = {}
    for v in A:
        B = Y[v]
        d3 = {}
        for i in B:
            C = Z[i]
            d3.update({i:C})
        d2.update({v:d3})
    d1.update({k:d2})
You can use simple recursion:
X = {"X1":["O","E","P"],"X2":["M"]}
Y = {"O":["a"],"E":["b","c"],"P":["d"],"M":["r"]}
Z = {"a":["1"],"b":["2","3"],"c":[],"d":["4","5"],"r":["6"]}
start = [X, Y, Z]
def group(d):
    # stop when none of the items is a key in any lookup dictionary
    return d if all(all(c not in i for i in start) for c in d) else \
        {i:group([c[i] for c in start if i in c][0]) for i in d}
r = {a:group(b) for a, b in X.items()}
print(r)
print(r == d1) #d1 generated from OP's solution
Output:
{'X1': {'O': {'a': ['1']}, 'E': {'b': ['2', '3'], 'c': []}, 'P': {'d': ['4', '5']}}, 'X2': {'M': {'r': ['6']}}}
True
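If the number of lookup tables can vary, the recursion can also take the tables as an explicit argument instead of searching all of them. This is only a sketch, not the approach from the answer above: nest and its levels parameter are hypothetical names, and it assumes every value in one table is a key in the next.

```python
def nest(keys, levels):
    """Resolve each key through the remaining lookup tables (hypothetical helper)."""
    if not levels:
        # no tables left: keep the values as they are
        return keys
    first, *rest = levels
    # one level of nesting per remaining table
    return {k: nest(first[k], rest) for k in keys}

X = {"X1": ["O", "E", "P"], "X2": ["M"]}
Y = {"O": ["a"], "E": ["b", "c"], "P": ["d"], "M": ["r"]}
Z = {"a": ["1"], "b": ["2", "3"], "c": [], "d": ["4", "5"], "r": ["6"]}

result = {k: nest(v, [Y, Z]) for k, v in X.items()}
print(result)
```

Adding a fourth table would then only mean appending it to the list passed as levels.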
A dictionary comprehension gives you a one-liner; it is basically the same procedure as your nested for loop:
{k: {v0:{v1: Z[v1] for v1 in Y[v0]} for v0 in v} for k, v in X.items()}
outputs:
{'X1': {'O': {'a': ['1']},
        'E': {'b': ['2', '3'], 'c': []},
        'P': {'d': ['4', '5']}},
 'X2': {'M': {'r': ['6']}}}
explanation:
OP's algorithm looks up the values list in the next dictionary using as keys each of the values in the current list until the last dictionary is reached. In pseudocode, the nesting looks like:
# pseudo code
for key, values in X:
    for valX in values:
        for valY in Y[valX]: # note Y[valX] is a list
            Z[valY]
Translating this into a comprehension, we start from the innermost loop and work outwards, adding the necessary decoration.
step 1:
{y:Z[y] for ys in Y.values() for y in ys}
# out:
{'a': ['1'], 'b': ['2', '3'], 'c': [], 'd': ['4', '5'], 'r': ['6']}
step 2: now we're looking up the ys directly
{x:{y:Z[y] for y in Y[x]} for xs in X.values() for x in xs}
# out:
{'O': {'a': ['1']},
 'E': {'b': ['2', '3'], 'c': []},
 'P': {'d': ['4', '5']},
 'M': {'r': ['6']}}
step 3: now we put in the keys from X & add another layer of dictionary nesting
{k:{x:{y:Z[y] for y in Y[x]} for x in xs} for k, xs in X.items()}
which yields the desired result
In general, when converting nested loops to comprehensions, start from the innermost loop and work outwards.
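To check that the comprehension really mirrors the loops, the two can be put side by side. A minimal sketch, reusing X, Y and Z from the question; the function names with_loops and with_comprehension are made up for illustration.

```python
X = {"X1": ["O", "E", "P"], "X2": ["M"]}
Y = {"O": ["a"], "E": ["b", "c"], "P": ["d"], "M": ["r"]}
Z = {"a": ["1"], "b": ["2", "3"], "c": [], "d": ["4", "5"], "r": ["6"]}

def with_loops(X, Y, Z):
    # explicit version: one loop per nesting level
    result = {}
    for k, xs in X.items():
        result[k] = {}
        for x in xs:
            result[k][x] = {}
            for y in Y[x]:
                result[k][x][y] = Z[y]
    return result

def with_comprehension(X, Y, Z):
    # same traversal, written from the innermost loop outwards
    return {k: {x: {y: Z[y] for y in Y[x]} for x in xs} for k, xs in X.items()}

print(with_loops(X, Y, Z) == with_comprehension(X, Y, Z))  # True
```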
I am trying to convert the dataframe below to a dictionary.
I want to group by the first column (A in the example) and collect each repeating sequence into a list, e.g.:
Example 1:
n1 v1 v2
2 A C 3
3 A D 4
4 A C 5
5 A D 6
Expected output:
{'A': [{'C':'3','D':'4'},{'C':'5','D':'6'}]}
Example 2:
n1 n2 v1 v2
s1 A C 3
s1 A D 4
s1 A C 5
s1 A D 6
s1 B P 6
s1 B Q 3
Expected Output:
{'s1': {'A': [{'C': 3, 'D': 4}, {'C': 5, 'D': 6}], 'B': {'P': 6, 'Q': 3}}}
So basically C and D repeat as a sequence. I want to combine C and D into one dictionary and build a list of such dictionaries when the sequence occurs multiple times.
Currently I am using the code below:
def recur_dictify(frame):
    if len(frame.columns) == 1:
        if frame.values.size == 1: return frame.values[0][0]
        return frame.values.squeeze()
    grouped = frame.groupby(frame.columns[0])
    d = {k: recur_dictify(g.iloc[:,1:]) for k,g in grouped}
    return d
This returns:
{'s1': {'A': {'C': array(['3', '5'], dtype=object), 'D': array(['4', '6'], dtype=object)}, 'B': {'E': '5', 'F': '6'}}}
Also, there can be another series, s2, with E, F, G, E, F, G repeating, and some keys such as X and Y that have single values.
Let's create a function dictify that builds a dictionary with top-level keys from the n1 column and groups the repeating occurrences of values in column v1 into separate sub-dictionaries:
from collections import defaultdict
def dictify(df):
    dct = defaultdict(list)
    # cumcount numbers each repeat of a (n1, v1) pair, so rows of the same repeat share a group
    for k, g in df.groupby(['n1', df.groupby(['n1', 'v1']).cumcount()]):
        dct[k[0]].append(dict([*g[['v1', 'v2']].values]))
    return dict(dct)
dictify(df)
{'A': [{'C': 3, 'D': 4}, {'C': 5, 'D': 6}]}
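The grouping trick is easier to see without pandas. The sketch below is a plain-Python illustration of the same idea (counting how often each (n1, v1) pair has already appeared, then grouping by that count); it is not the code from the answer, and dictify_rows and the rows list are hypothetical.

```python
from collections import defaultdict

def dictify_rows(rows):
    """Group (n1, v1, v2) rows into {n1: [{v1: v2, ...}, ...]} (illustrative helper)."""
    seen = defaultdict(int)      # how many times each (n1, v1) pair has appeared so far
    groups = defaultdict(dict)   # (n1, occurrence number) -> {v1: v2}
    for n1, v1, v2 in rows:
        occurrence = seen[(n1, v1)]   # plays the role of cumcount()
        seen[(n1, v1)] += 1
        groups[(n1, occurrence)][v1] = v2
    result = defaultdict(list)
    for (n1, _), sub in groups.items():   # insertion order keeps the repeats in sequence
        result[n1].append(sub)
    return dict(result)

rows = [("A", "C", 3), ("A", "D", 4), ("A", "C", 5), ("A", "D", 6)]
print(dictify_rows(rows))
# {'A': [{'C': 3, 'D': 4}, {'C': 5, 'D': 6}]}
```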
UPDATE:
In case there can be a variable number of primary grouping keys, i.e. [n1, n2, ...], we can use a more generic method:
import numpy as np

def update(dct, keys, val):
    k, *_ = keys
    # descend while keys remain; at the last key, store val or extend the existing entry into a list
    dct[k] = update(dct.get(k, {}), _, val) if _ \
        else [*np.hstack([dct[k], [val]])] if k in dct else val
    return dct
def dictify(df, keys):
    dct = dict()
    for k, g1 in df.groupby(keys):
        for _, g2 in g1.groupby(g1.groupby('v1').cumcount()):
            update(dct, k, dict([*g2[['v1', 'v2']].values]))
    return dict(dct)
dictify(df, ['n1', 'n2'])
{'s1': {'A': [{'C': 3, 'D': 4}, {'C': 5, 'D': 6}], 'B': {'P': 6, 'Q': 3}}}
Here is a simple one-statement function that solves your problem (it assumes the rows come in pairs and the frame has a default integer index):
def df_to_dict(df):
    return {name: [dict(x.to_dict('split')['data'])
                   for _, x in d.drop(columns='name').groupby(d.index // 2)]
            for name, d in df.groupby('name')}
Here is an example:
import pandas as pd
df = pd.DataFrame({'name': ['A'] * 4,
                   'v1': ['C', 'D'] * 2,
                   'v2': [3, 4, 5, 6]})
print(df_to_dict(df))
Output:
{'A': [{'C': 3, 'D': 4}, {'C': 5, 'D': 6}]}
I couldn't find any similar code, so I need your help with my beginner question. I summarize the code as follows:
A=[{'a': '1', 'b': '2'}]
L=['x', 'y']
B=[]
for i in A:
    for j in L:
        i["c"]=j
        B.append(i)
print(B)
The output is:
[{'a': '1', 'b': '2', 'c': 'y'}, {'a': '1', 'b': '2', 'c': 'y'}]
What I need is:
[{'a': '1', 'b': '2', 'c': 'x'}, {'a': '1', 'b': '2', 'c': 'y'}]
Thanks for your help.
A more concise, though less readable version:
A = [{'a': '1', 'b': '2'}]
L = ['x', 'y']
B = [ {**z, 'c': j } for j in L for z in A ]
You need to make a copy of the dictionary inside your loop; otherwise you append the same dictionary twice.
Under the hood, the list stores a reference to the original dictionary, not a copy. Each assignment to i["c"] edits that one object, so after the loop the list simply contains the same object twice, showing its final state.
Here is the code:
import copy
A=[{'a': '1', 'b': '2'}]
L=['x', 'y']
B=[]
for i in A:
    for j in L:
        i2 = copy.deepcopy(i)
        i2["c"]=j
        B.append(i2)
print(B)
The key thing is that you need to make sure you're creating a new dictionary rather than modifying the original dictionary in A. Here's a way to do it with zip:
B = [{**a, **c} for a, c in zip(A*len(L), [{'c': v} for v in L])]
As others have said, you have to copy the dictionary because the variable only holds a reference to the dictionary object; modifying the object therefore affects every place it appears.
a = {'a': 'example'}
B = [a]
#B[0] is a, so modifying a, also updates B[0]
a['a'] = 'test'
print(a) #{'a': 'test'}
print(B[0]) #{'a': 'test'}
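The same effect can be checked directly with the is operator, which tests whether two names refer to one object. A small sketch using the data from the question; the list names shared and copies are made up for illustration.

```python
A = [{'a': '1', 'b': '2'}]
L = ['x', 'y']

shared = []
for i in A:
    for j in L:
        i['c'] = j
        shared.append(i)                 # appends the very same dict each time

copies = []
for i in A:
    for j in L:
        copies.append({**i, 'c': j})     # builds a new dict each time

print(shared[0] is shared[1])   # True: one object, listed twice
print(copies[0] is copies[1])   # False: two independent dicts
print(copies)
```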
I'd like to point out the update function as a possibility.
The update function updates the existing dictionary in place: it adds the keys from a source dictionary and overwrites the values of keys that already exist.
Example:
a = {'a': 1, 'b': 2}
c = {'c': 3}
a.update(c) #{'a': 1, 'b': 2, 'c': 3}
Applying it to your code:
import copy
A={'a': '1', 'b': '2'}
L=['x', 'y']
B = []
for j in L:
    mod_info = {'c': j}
    A.update(mod_info)
    B.append(copy.copy(A))
print(B)
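The answers above copy in three different ways: {**z, ...}, copy.copy and copy.deepcopy. For the flat dictionaries in the question they behave the same; they only differ when a value is itself mutable. A small sketch with a made-up nested value (the 'tags' key is purely illustrative):

```python
import copy

original = {'a': '1', 'tags': ['t1']}   # hypothetical dict with a nested list

shallow = copy.copy(original)       # new outer dict, but the inner list is shared
deep = copy.deepcopy(original)      # new outer dict and a new inner list

original['tags'].append('t2')

print(shallow['tags'])   # ['t1', 't2']  (sees the change through the shared list)
print(deep['tags'])      # ['t1']        (fully independent)
```

So a shallow copy is enough here, and deepcopy only matters once the dictionaries contain lists, sets or other dictionaries.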
I have a dictionary like this:
d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E','F','G'}}
and I want a result like this:
d2 = {'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
So I tried
d2 = dict ((v, k) for k, v in d1.items())
but each value is a set, so it didn't work.
Is there any way I can fix it?
You could use a dictionary comprehension:
{v:k for k,vals in d1.items() for v in vals}
# {'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
Note that you need an extra level of iteration over the values in each key here to get a flat dictionary.
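One caveat with this kind of inversion: if the same element appears in more than one set, the later key silently overwrites the earlier one. A minimal sketch of one way to keep all keys, assuming that collecting them in a list is acceptable; invert_keep_all is a hypothetical helper name and the sample data is made up.

```python
from collections import defaultdict

def invert_keep_all(d):
    """Map each set element to the list of keys it appeared under (illustrative)."""
    inverted = defaultdict(list)
    for key, values in d.items():
        for value in values:
            inverted[value].append(key)
    return dict(inverted)

overlapping = {'0': {'a'}, '1': {'a', 'b'}}   # 'a' occurs under two keys
print(invert_keep_all(overlapping))
# {'a': ['0', '1'], 'b': ['1']}
```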
Another dict comprehension:
>>> {v: k for k in d1 for v in d1[k]}
{'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
Benchmark comparison with yatu's:
from timeit import repeat
setup = "d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E','F','G'}}"
yatu = "{v:k for k,vals in d1.items() for v in vals}"
heap = "{v:k for k in d1 for v in d1[k]}"
for _ in range(3):
    print('yatu', min(repeat(yatu, setup)))
    print('heap', min(repeat(heap, setup)))
    print()
Results:
yatu 1.4274586000000227
heap 1.4059823000000051
yatu 1.4562267999999676
heap 1.3701727999999775
yatu 1.4313863999999512
heap 1.3878657000000203
Another benchmark, with a million keys/values:
setup = "d1 = {k: {k+1, k+2} for k in range(0, 10**6, 3)}"
for _ in range(3):
    print('yatu', min(repeat(yatu, setup, number=10)))
    print('heap', min(repeat(heap, setup, number=10)))
    print()
yatu 1.071519999999964
heap 1.1391495000000305
yatu 1.0880677000000105
heap 1.1534022000000732
yatu 1.0944767999999385
heap 1.1526202000000012
Here's another possible solution to the given problem:
def flatten_dictionary(dct):
    d = {}
    for k, st_values in dct.items():
        for v in st_values:
            d[v] = k
    return d
if __name__ == '__main__':
    d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E', 'F', 'G'}}
    d2 = flatten_dictionary(d1)
    print(d2)
I have a file that contains distance values for a set of nodes in matrix form. I extracted those values and want to save them in a nested dictionary.
I already tried, but my dictionary contains only the values from the last iteration.
d={}
i, j = 0,0
for f in tmp:
    for k in range(3,len(f),3):
        d[nodes[i]] = {}
        d[nodes[i]][nodes[j]]= f[k-2]+f[k-1]
        j += 1
    i += 1
    j = 0
return d
d={'A': {'P': '5'},
'B': {'P': '3'},
'C': {'P': '6'},
'D': {'P': '5'},
'E': {'P': '3'},
'F': {'P': '33'},
'G': {'P': '21'},
'H': {'P': '39'},
'I': {'P': '4'}}
But d should contain:
d={"A":{"A":5,"B":6, "C":7, "D":8, "E":9, "F":10, "G":11,"H":12, "I":13},
"B":{"A":3,"B":4, "C":5, "D":8, "E":9, "F":14, "G":11,"H":12,
"I":16}},.....
You're re-initializing the second-level dict each iteration of your inner loop. That is what is causing it to "lose data".
Instead, you could use a defaultdict:
from collections import defaultdict
d = defaultdict(dict)
i, j = 0,0
for f in tmp:
    for k in range(3,len(f),3):
        d[nodes[i]][nodes[j]]= f[k-2]+f[k-1]
        j += 1
    i += 1
    j = 0
return d
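The same fix also works with a plain dict and dict.setdefault, which creates the inner dictionary only the first time a row key is seen. The sketch below uses a small made-up matrix, since the format of tmp is not shown in the question; nodes and matrix here are illustrative values only.

```python
nodes = ['A', 'B', 'C']     # hypothetical node names
matrix = [                  # hypothetical, already-parsed distance rows
    [0, 5, 6],
    [5, 0, 3],
    [6, 3, 0],
]

d = {}
for row_name, row in zip(nodes, matrix):
    for col_name, dist in zip(nodes, row):
        # setdefault returns the existing inner dict, or inserts {} once if missing
        d.setdefault(row_name, {})[col_name] = dist

print(d['A'])   # {'A': 0, 'B': 5, 'C': 6}
```

Iterating with zip also removes the need for the manual i and j counters.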
I have the following data set that I read in from a text file:
all_examples= ['A,1,1', 'B,2,1', 'C,4,4', 'D,4,5']
I need to create a list of dictionaries as follows:
lst = [
    {"A":1, "B":2, "C":4, "D":4 },
    {"A":1, "B":1, "C":4, "D":5 }
]
I tried using a generator function, but it was hard to create such a list.
attributes = 'A,B,C'
def get_examples():
    for value in examples:
        yield dict(zip(attributes, value.strip().replace(" ", "").split(',')))
A one liner, just for fun:
all_examples = ['A,1,1', 'B,2,1', 'C,4,4', 'D,4,5']
list(map(dict, zip(*[[(s[0], int(x)) for x in s.split(',')[1:]] for s in all_examples])))
Produces:
[{'A': 1, 'C': 4, 'B': 2, 'D': 4},
{'A': 1, 'C': 4, 'B': 1, 'D': 5}]
As a bonus, this will work for longer sequences too:
all_examples = ['A,1,1,1', 'B,2,1,2', 'C,4,4,3', 'D,4,5,6']
Output:
[{'A': 1, 'C': 4, 'B': 2, 'D': 4},
{'A': 1, 'C': 4, 'B': 1, 'D': 5},
{'A': 1, 'C': 3, 'B': 2, 'D': 6}]
Explanation:
list(map(dict, zip(*[[(s[0], int(x)) for x in s.split(',')[1:]] for s in all_examples])))
[... for s in all_examples]    For each element in your list:
s.split(',')[1:]               split it by commas, then take each element after the first
(...) for x in                 and turn it into a list of tuples
s[0], int(x)                   of the first letter, with that element converted to integer
zip(*[...])                    now transpose your lists of tuples
list(map(dict, ...))           and turn each one into a dictionary!
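The steps of the explanation can also be written out with intermediate variables. A minimal sketch in Python 3; the names pairs_per_line, columns and result are made up for illustration, and s[0] assumes each key is a single character, as in the question.

```python
all_examples = ['A,1,1', 'B,2,1', 'C,4,4', 'D,4,5']

# one list of (key, value) tuples per input line
pairs_per_line = [[(s[0], int(x)) for x in s.split(',')[1:]] for s in all_examples]
# [[('A', 1), ('A', 1)], [('B', 2), ('B', 1)], [('C', 4), ('C', 4)], [('D', 4), ('D', 5)]]

# transpose: the first tuple of every line, then the second tuple of every line
columns = list(zip(*pairs_per_line))

# each transposed column is a sequence of (key, value) pairs, which dict() accepts
result = [dict(col) for col in columns]
print(result)
# [{'A': 1, 'B': 2, 'C': 4, 'D': 4}, {'A': 1, 'B': 1, 'C': 4, 'D': 5}]
```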
Also just for fun, but with a focus on understandability:
all_examples = ['A,1,1', 'B,2,1', 'C,4,4', 'D,4,5']
ll = [ x.split(",") for x in all_examples ]
ld = list()
for col in range(1, len(ll[0])):
    ld.append({ l[0] : int(l[col]) for l in ll })
print(ld)
will print
[{'A': 1, 'C': 4, 'B': 2, 'D': 4}, {'A': 1, 'C': 4, 'B': 1, 'D': 5}]
This works as long as the input is CSV with integer values and all lines have the same length.
Dissection: I will use the terminology "thing" for A, B, C and D and "measurement" for the "columns" in the data, i.e. the values in the same CSV column of the input data.
Get the string input data into a list for each line: A,1,1 -> ["A","1","1"]
ll = [ x.split(",") for x in all_examples ]
The result is supposed to be a list of dicts, so let's initialize one:
ld = list()
For each measurement (assuming that all lines have the same number of columns):
for col in range(1, len(ll[0])):
Take the thing l[0], e.g. "A", from the line and assign the numeric value int(), e.g. 1, of the measurement in the respective column l[col], e.g. "1", to the thing. Then use a dictionary comprehension to combine it into the next line of the desired result. Finally append() the dict to the result list ld.
ld.append({ l[0] : int(l[col]) for l in ll })
View it unformatted, or use print(json.dumps(ld, indent=4)) (after import json) for a more convenient display:
print(ld)
Hope this helps. Find more on dict comprehensions e.g. here (Python3 version of this great book).
You actually have a list of strings, and you'd like a pair of dictionaries built from the key and the two values in each comma-separated triplet.
To keep this relatively simple, I'll use a for loop instead of a complicated dictionary comprehension structure.
my_dictionary_list = list()
d1 = dict()
d2 = dict()
for triplet_str in all_examples:
    key, val1, val2 = triplet_str.split(',')
    d1[key] = val1
    d2[key] = val2
my_dictionary_list.append(d1)
my_dictionary_list.append(d2)
>>> my_dictionary_list
[{'A': '1', 'B': '2', 'C': '4', 'D': '4'},
{'A': '1', 'B': '1', 'C': '4', 'D': '5'}]
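If the number of values per line is not fixed at two, transposing the split lines first avoids keeping one variable per dictionary. This is a sketch under the assumption that every line has the same number of integer fields; lines_to_dicts is a hypothetical name, not part of the answer above.

```python
def lines_to_dicts(lines):
    """Turn 'key,v1,v2,...' strings into one dict per value column (illustrative)."""
    # transpose the split lines: first the tuple of keys, then one tuple per value column
    keys, *columns = zip(*(line.split(',') for line in lines))
    return [dict(zip(keys, map(int, column))) for column in columns]

all_examples = ['A,1,1', 'B,2,1', 'C,4,4', 'D,4,5']
print(lines_to_dicts(all_examples))
# [{'A': 1, 'B': 2, 'C': 4, 'D': 4}, {'A': 1, 'B': 1, 'C': 4, 'D': 5}]
```

Unlike the loop above, this converts the values to integers, which matches the output requested in the question.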
Your question should be "How to create a list of dictionaries?". Here's something you might consider.
>>> dict1={}
>>> dict2={}
>>> new_list = []
>>> all_examples=['A,1,1', 'B,2,1', 'C,4,4', 'D,4,5']
>>> for k in all_examples:
...     ele=k.split(",")
...     dict1[str(ele[0])]=ele[1]
...     dict2[str(ele[0])]=ele[2]
...
>>> new_list.append(dict1)
>>> new_list.append(dict2)
>>> dict1
{'A': '1', 'C': '4', 'B': '2', 'D': '4'}
>>> dict2
{'A': '1', 'C': '4', 'B': '1', 'D': '5'}