How to create nested dictionary without losing data? - python

I have a file that contains distance values for a set of nodes in matrix form. I extracted those values and want to save them in a nested dictionary.
I tried the code below, but my dictionary only contains the values from the last iteration.
d={}
i, j = 0,0
for f in tmp:
    for k in range(3,len(f),3):
        d[nodes[i]] = {}
        d[nodes[i]][nodes[j]]= f[k-2]+f[k-1]
        j += 1
    i += 1
    j = 0
return d
d={'A': {'P': '5'},
'B': {'P': '3'},
'C': {'P': '6'},
'D': {'P': '5'},
'E': {'P': '3'},
'F': {'P': '33'},
'G': {'P': '21'},
'H': {'P': '39'},
'I': {'P': '4'}}
But d should contain:
d={"A":{"A":5,"B":6, "C":7, "D":8, "E":9, "F":10, "G":11,"H":12, "I":13},
"B":{"A":3,"B":4, "C":5, "D":8, "E":9, "F":14, "G":11,"H":12,
"I":16}, ...}

You're re-initializing the second-level dict each iteration of your inner loop. That is what is causing it to "lose data".
Instead, you could use a defaultdict:
from collections import defaultdict
d = defaultdict(dict)
i, j = 0,0
for f in tmp:
    for k in range(3,len(f),3):
        d[nodes[i]][nodes[j]]= f[k-2]+f[k-1]
        j += 1
    i += 1
    j = 0
return d
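For a version that runs on its own, here is a minimal sketch of the same defaultdict idea. The node labels, the rows matrix, and the build_distance_dict helper are made-up for illustration, since the original file format is not shown, and the manual i/j counters are swapped for zip.

```python
from collections import defaultdict

# Hypothetical sample data: node labels and one row of distances per node.
nodes = ["A", "B", "C"]
rows = [
    [0, 5, 7],
    [5, 0, 3],
    [7, 3, 0],
]

def build_distance_dict(nodes, rows):
    """Return {source: {target: distance}} for a square distance matrix."""
    d = defaultdict(dict)
    for source, row in zip(nodes, rows):
        for target, distance in zip(nodes, row):
            # The inner dict is created once per source on first access,
            # so earlier entries are never overwritten.
            d[source][target] = distance
    return dict(d)  # convert to a plain dict for a cleaner repr

distances = build_distance_dict(nodes, rows)
print(distances)
```

Pairing labels with values through zip removes the counters that had to be reset by hand, which is where this kind of bug tends to creep in.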

Related

I tried to create a dictionary within a dictionary but I struggled

I tried to create a get_dict function that takes a filename as a parameter and then creates and returns a dictionary in which
each key is a product code, and
each value is a dictionary in which
each key is a size string (S, M, L, or XL), and
each value is the quantity of that product.
I tried this.
def get_dict(file_name):
    d={}
    e={}
    with open(file_name) as f:
        for line in f:
            line = line.strip()
            alist = line.split()
            e[alist[1]] = alist[2]
            d[alist[0]] = e
    print (d)
The output looks like this:
{'4125': {'M': '4', 'L': '7', 'XL': '3'}, '5645': {'M': '4', 'L': '7', 'XL': '3'}, '7845': {'M': '4', 'L': '7', 'XL': '3'}}
but I expected the output to look like this:
{4125: {'S': 1, 'M': 4}, 5645: {'L': 7}, 9874: {'S': 8}, 9875: {'M': 8}, 7845: {'S': 10, 'XL': 3}}
Text file example
7845 XL 3
4125 S 1
5645 L 7
9874 S 3
4125 M 4
def get_dict(file_name):
    d={}
    with open(file_name) as f:
        for line in f:
            line = line.strip()
            alist = line.split()
            if alist[0] not in d:
                d[alist[0]] = {alist[1]: alist[2]}
            else:
                d[alist[0]].update({alist[1]: alist[2]})
    print(d)
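You have to update the inner dictionary for each product code instead of assigning the same dictionary to every key. In your code, every key of d points to the one shared dictionary e, so all product codes end up showing the same sizes. The above solution should work.
Output: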
{'7845': {'XL': '3'}, '4125': {'S': '1', 'M': '4'}, '5645': {'L': '7'}, '9874': {'S': '3'}}
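The expected output in the question uses integers rather than strings. A minimal sketch of that variant follows, assuming every well-formed line has exactly three whitespace-separated fields. The helper name get_dict_from_lines is hypothetical, and it reads from an in-memory buffer instead of a real file so that it runs on its own.

```python
import io

# Sample lines taken from the question's text file example.
sample = io.StringIO("7845 XL 3\n4125 S 1\n5645 L 7\n9874 S 3\n4125 M 4\n")

def get_dict_from_lines(lines):
    """Hypothetical helper: build {product_code: {size: count}} from lines."""
    d = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        code, size, count = parts
        # setdefault creates a fresh inner dict the first time a code is seen.
        d.setdefault(int(code), {})[size] = int(count)
    return d

result = get_dict_from_lines(sample)
print(result)
```

Because an open file is also an iterable of lines, the same function should work when passed the f from a with open(file_name) as f block.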

How to convert data into a nested dictionary in Python (DataFrame)

I have a large dataset of item codes and components. Each item code corresponds to a component, and a component can in turn be the item code of another component. How can I build a nested dictionary from this in Python?
item code component
a q
b w
c r
d t
e y
q u
q v
Desired output:
{a:{q:[u,v]},b:w,c:r etc}
How can I build this nested dictionary in Python, given that the dataset is large?
I tried defaultdict, but it only gave me a flat dictionary, not a nested one.
In [108]: df = pd.DataFrame({'item_code': list('abcdeqq'), 'component': list('qwrtyuv')})
In [109]: import networkx as nx
In [110]: g = nx.DiGraph([(k,v) for k,v in zip(df['item_code'], df['component'])])
In [111]: {k:v if len(v) > 1 else v[0] for k,v in nx.convert.to_dict_of_lists(g).items() if v}
Out[111]: {'a': 'q', 'q': ['u', 'v'], 'b': 'w', 'c': 'r', 'd': 't', 'e': 'y'}
Using networkx you can get something like the above. Based on this answer, I was able to arrive at the following solution:
import networkx as nx

G = nx.DiGraph()
G.add_edges_from(df.values)

def comb_tup(li_tup):
    d = {}
    crnt = d  # memo the crnt subtree
    stck = []  # stack of (sub)trees along current path
    for k, v in li_tup:
        while stck and k not in crnt:
            crnt = stck.pop()
        if k not in crnt:
            crnt[k] = {}
        stck.append(crnt)
        crnt = crnt[k]
        crnt[v] = {}
    return d

final_di = {}
for node in G.nodes:
    vi = list(nx.dfs_edges(G,node))
    d = comb_tup(vi)
    if len(d.keys()):
        for k,v in d.items():
            final_di[k] = v
final_di:
{'a': {'q': {'u': {}, 'v': {}}},
'q': {'u': {}, 'v': {}},
'b': {'w': {}},
'c': {'r': {}},
'd': {'t': {}},
'e': {'y': {}}}
If you have this data:
item_code component
0 a q
1 b w
2 c r
3 d t
4 e y
5 q u
6 q v
7 u x
final_di:
{'a': {'q': {'u': {'x': {}}, 'v': {}}},
'q': {'u': {'x': {}}, 'v': {}},
'b': {'w': {}},
'c': {'r': {}},
'd': {'t': {}},
'e': {'y': {}},
'u': {'x': {}}}
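If networkx is not available, a similar nesting can be sketched in plain Python. This is only an illustration under two assumptions: the data contains no cycles, and only items that are never a component of something else should appear at the top level (so 'q' and 'u' are not repeated as top-level keys, unlike final_di above). The helper name build_tree is hypothetical.

```python
from collections import defaultdict

# Edge list copied from the sample data above: (item_code, component).
edges = [("a", "q"), ("b", "w"), ("c", "r"), ("d", "t"),
         ("e", "y"), ("q", "u"), ("q", "v"), ("u", "x")]

def build_tree(edges):
    """Hypothetical helper: nest components under the items that use them."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    def subtree(node):
        # Recurse into each child; leaves become empty dicts.
        # Assumes no cycles, otherwise this would recurse forever.
        return {child: subtree(child) for child in children.get(node, [])}

    components = {child for _, child in edges}
    # Only start from items that are never a component of something else.
    roots = [parent for parent in children if parent not in components]
    return {root: subtree(root) for root in roots}

tree = build_tree(edges)
print(tree)
```

With a DataFrame, the edge list could be obtained with list(zip(df['item_code'], df['component'])), as in the networkx answer.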

Switch key and value in a dictionary of sets

I have a dictionary like this:
d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E','F','G'}}
and I want a result like this:
d2 = {'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
So I tried
d2 = dict ((v, k) for k, v in d1.items())
but each value is a set, and sets are not hashable, so they cannot be used as dictionary keys and this did not work.
Is there any way I can fix it?
You could use a dictionary comprehension:
{v:k for k,vals in d1.items() for v in vals}
# {'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
Note that you need an extra level of iteration over the values in each key here to get a flat dictionary.
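One caveat with the flat result: if the same element appears in more than one set, the comprehension silently keeps only the last key it sees. Should that matter for your data, a minimal sketch that keeps every key could look like this. The input below is made up to show the collision, and invert_to_sets is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical input where 'c' belongs to two different keys.
d1 = {'0': {'a'}, '1': {'b', 'c'}, '2': {'c', 'd'}}

def invert_to_sets(mapping):
    """Map each member to the set of all keys whose set contains it."""
    inverted = defaultdict(set)
    for key, members in mapping.items():
        for member in members:
            inverted[member].add(key)
    return dict(inverted)

d2 = invert_to_sets(d1)
print(d2['c'])  # contains both '1' and '2'
```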
Another dict comprehension:
>>> {v: k for k in d1 for v in d1[k]}
{'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
Benchmark comparison with yatu's:
from timeit import repeat
setup = "d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E','F','G'}}"
yatu = "{v:k for k,vals in d1.items() for v in vals}"
heap = "{v:k for k in d1 for v in d1[k]}"
for _ in range(3):
    print('yatu', min(repeat(yatu, setup)))
    print('heap', min(repeat(heap, setup)))
    print()
Results:
yatu 1.4274586000000227
heap 1.4059823000000051
yatu 1.4562267999999676
heap 1.3701727999999775
yatu 1.4313863999999512
heap 1.3878657000000203
Another benchmark, with a million keys/values:
setup = "d1 = {k: {k+1, k+2} for k in range(0, 10**6, 3)}"
for _ in range(3):
    print('yatu', min(repeat(yatu, setup, number=10)))
    print('heap', min(repeat(heap, setup, number=10)))
    print()
yatu 1.071519999999964
heap 1.1391495000000305
yatu 1.0880677000000105
heap 1.1534022000000732
yatu 1.0944767999999385
heap 1.1526202000000012
Here's another possible solution to the given problem:
def flatten_dictionary(dct):
    d = {}
    for k, st_values in dct.items():
        for v in st_values:
            d[v] = k
    return d

if __name__ == '__main__':
    d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E', 'F', 'G'}}
    d2 = flatten_dictionary(d1)
    print(d2)

How to combine dictionaries into a nested dictionary?

I want to combine my 3 dictionaries into 1 nested dictionary. I wrote the following code to do it using 3 nested for loops. Is there a more efficient way, or a recursive function, to do the same thing?
X = {"X1":["O","E","P"],"X2":["M"]}
Y = {"O":["a"],"E":["b","c"],"P":["d"],"M":["r"]}
Z = {"a":["1"],"b":["2","3"],"c":[],"d":["4","5"],"r":["6"]}
d1 = {}
for k in X:
    A = X[k]
    d2 = {}
    for v in A:
        B = Y[v]
        d3 = {}
        for i in B:
            C = Z[i]
            d3.update({i:C})
        d2.update({v:d3})
    d1.update({k:d2})
You can use simple recursion:
X = {"X1":["O","E","P"],"X2":["M"]}
Y = {"O":["a"],"E":["b","c"],"P":["d"],"M":["r"]}
Z = {"a":["1"],"b":["2","3"],"c":[],"d":["4","5"],"r":["6"]}
start = [X, Y, Z]
def group(d):
    return d if all(all(c not in i for i in start) for c in d) else \
        {i:group([c[i] for c in start if i in c][0]) for i in d}
r = {a:group(b) for a, b in X.items()}
print(r)
print(r == d1) #d1 generated from OP's solution
Output:
{'X1': {'O': {'a': ['1']}, 'E': {'b': ['2', '3'], 'c': []}, 'P': {'d': ['4', '5']}}, 'X2': {'M': {'r': ['6']}}}
True
A dictionary comprehension gives a one-liner that follows basically the same procedure as your nested for loops:
{k: {v0:{v1: Z[v1] for v1 in Y[v0]} for v0 in v} for k, v in X.items()}
outputs:
{'X1': {'O': {'a': ['1']},
'E': {'b': ['2', '3'], 'c': []},
'P': {'d': ['4', '5']}},
'X2': {'M': {'r': ['6']}}}
Explanation:
OP's algorithm takes each value in the current list and uses it as a key to look up the next list of values in the next dictionary, until the last dictionary is reached. In pseudocode, the nesting looks like:
# pseudo code
for key, values in X
    for valX in values:
        for valY in Y[valX]: # note Y[valX] is a list
            Z[valY]
To translate this into a comprehension, we start from the innermost loop and work outwards, adding the necessary decoration at each step.
step 1:
{y:Z[y] for ys in Y.values() for y in ys}
# out:
{'a': ['1'], 'b': ['2', '3'], 'c': [], 'd': ['4', '5'], 'r': ['6']}
step 2: now we're looking up the ys directly
{x:{y:Z[y] for y in Y[x]} for xs in X.values() for x in xs}
# out:
{'O': {'a': ['1']},
'E': {'b': ['2', '3'], 'c': []},
'P': {'d': ['4', '5']},
'M': {'r': ['6']}}
step 3: now we put in the keys from X & add another layer of dictionary nesting
{k:{x:{y:Z[y] for y in Y[x]} for x in xs} for k, xs in X.items()}
which yields the desired result.
In general, when converting nested loops to comprehensions, start from the innermost loop and work outwards.
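The question also asked about a recursive function. As a rough sketch, the chained lookups can be written so that they work for any number of dictionaries, assuming every value in one level exists as a key in the next. The helper name nest is hypothetical and not part of the answers above.

```python
X = {"X1": ["O", "E", "P"], "X2": ["M"]}
Y = {"O": ["a"], "E": ["b", "c"], "P": ["d"], "M": ["r"]}
Z = {"a": ["1"], "b": ["2", "3"], "c": [], "d": ["4", "5"], "r": ["6"]}

def nest(keys, levels):
    """Hypothetical helper: chain lookups through a list of dictionaries."""
    if not levels:
        return keys  # past the last dictionary: keep the list as the leaf
    first, *rest = levels
    # Each key is looked up in the current level; the values found there
    # become the keys for the next level.
    return {k: nest(first[k], rest) for k in keys}

nested = nest(list(X), [X, Y, Z])
print(nested)
```

A missing key raises KeyError here, which matches the behaviour of the nested loops and the comprehension.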

Change a dictionary of dictionaries in Python

I have a dictionary of dictionaries like this small example:
dict = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518}, 2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
I want to make another dictionary of dictionaries in which, instead of absolute numbers, each value is the frequency (as a percentage) of that number within its sub-dictionary. For example, for the first inner dictionary I would get the following sub-dictionary:
1: {'A': 31.25, 'C': 21, 'T': 27.75, 'G': 20}
here is the expected output:
dict2 = {1: {'A': 31.25, 'C': 21, 'T': 27.75, 'G': 20}, 2: {'A': 32.5, 'C': 22.50, 'T': 25, 'G': 20}}
I am trying to do that in python using the following command:
dict2 = {}
for item in dict.items():
    freq = item.items/sum(item.items())
    dict2[] = freq
However, this code does not produce the result I want. Do you know how to fix it?
What you want is to process the inner dictionaries without modifying the keys of the big one. Outsource the frequency into a function:
def get_frequency(d):
    total = sum(d.values())
    return {key: value / total * 100 for key, value in d.items()}
Then use a dict comprehension to apply the function to all your sub-dictionaries (dict1 here stands for your original dictionary; naming it dict would shadow the built-in type):
dict2 = {key: get_frequency(value) for key, value in dict1.items()}
Note that I added a * 100, it appears from your output that you are looking for percents from 0-100 and not a float from 0-1.
Edit:
If you're using Python 2, / performs integer division on integers, so convert to float like so:
    return {key: float(value) / total * 100 for key, value in d.items()}
You could do the following:
dct = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518}, 2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
result = {}
for key, d in dct.items():
    total = sum(d.values())
    result[key] = {k : a / total for k, a in d.items()}
print(result)
Output
{1: {'C': 0.21050328227571116, 'T': 0.2775346462436178, 'G': 0.2012399708242159, 'A': 0.31072210065645517}, 2: {'C': 0.22447118891320203, 'T': 0.25018234865062, 'G': 0.20076586433260393, 'A': 0.32458059810357404}}
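The two answers differ only in scale (fractions versus percentages). If rounded percentages are wanted, a minimal sketch could look like the following. The to_percentages helper and its ndigits parameter are hypothetical, and the guard for an all-zero sub-dictionary is an added assumption about how such input should be handled.

```python
counts = {
    1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518},
    2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505},
}

def to_percentages(table, ndigits=2):
    """Hypothetical helper: convert inner counts to rounded percentages."""
    result = {}
    for key, inner in table.items():
        total = sum(inner.values())
        if total == 0:
            # Assumption: report 0.0 rather than dividing by zero.
            result[key] = {k: 0.0 for k in inner}
            continue
        result[key] = {k: round(v / total * 100, ndigits)
                       for k, v in inner.items()}
    return result

percentages = to_percentages(counts)
print(percentages[1])
```

Rounding each value separately means the percentages in a sub-dictionary may not add up to exactly 100.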
