Python - restructure dictionary of lists in a loop

Hi guys, I have a dict id_dict with some keys. Each key has a list as its value (sometimes only one item, sometimes four). These are names of components.
My dict id_dict is created dynamically, and the keys and values will always change.

S = {'G': ['crypto'], 'T': ['update', 'monitor', 'ipforum'], 'F': ['update'], 'M': ['crypto','update']}
R = {}
for key, value_list in S.items():
    for value in value_list:
        if value not in R:
            R[value] = []
        R[value].append(key)
print(R)
Output:
{'crypto': ['G', 'M'], 'update': ['T', 'F', 'M'], 'monitor': ['T'], 'ipforum': ['T']}

You are actually trying to use the components list you're iterating over as a dictionary key; this is what causes the error.
Try simply doing:
component_dict[component] = id

maybe try something like:
new_dic = {}
for k, v in dic.items():
    for ele in v:
        if ele not in new_dic:
            new_dic[ele] = []
        new_dic[ele].append(k)
new_dic
Note the append

It seems that you use components instead of component as the key, and a list cannot be a dict key; this is what causes the error.
In addition, you didn't store a list and just assigned the value. You need to assign a list if the key doesn't exist, and append to it if it does.
So:
if dict_components.get(component):
    dict_components[component].append(id)
else:
    dict_components[component] = [id]

try this
from collections import defaultdict
d = {'G': ['crypto'], 'T': ['update', 'monitor', 'ipforum'], 'F': ['update'], 'M': ['crypto', 'update']}
d1 = defaultdict(list)
for k, v in d.items():
    for x in v:
        d1[x].append(k)
print(d1)
output
defaultdict(<class 'list'>, {'crypto': ['G', 'M'], 'update': ['T', 'F', 'M'], 'monitor': ['T'], 'ipforum': ['T']})
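If the defaultdict wrapper in the printed output is unwanted, one option is to convert the result to a plain dict once it is built. A minimal sketch with a single entry for illustration; the name `plain` is just a placeholder:

```python
from collections import defaultdict

d1 = defaultdict(list)
d1['crypto'].append('G')   # the key is created on first access

# Converting to a regular dict drops the default factory, so a later
# lookup of a missing key raises KeyError instead of silently adding
# an empty list.
plain = dict(d1)
print(plain)
```

The conversion is shallow: `plain` and `d1` still share the same list objects.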

The easiest way is to use collections.defaultdict, which saves you from repeated membership tests
from collections import defaultdict
source = {'G': ['crypto'],
          'T': ['update', 'monitor', 'ipforum'],
          'F': ['update'],
          'M': ['crypto','update']}
components = defaultdict(list)
for key, values_list in source.items():
    for value in values_list:
        components[value].append(key)
And the result will be
defaultdict(list,
{'crypto': ['G', 'M'],
'update': ['T', 'F', 'M'],
'monitor': ['T'],
'ipforum': ['T']})
As an alternative, you may create a regular dict and use the setdefault method to create the lists
components = {}
for key, values_list in source.items():
    for value in values_list:
        components.setdefault(value, []).append(key)
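The setdefault variant can also be wrapped in a small reusable function. This is only a sketch; the function name `invert_lists` is a hypothetical helper and not part of any library:

```python
def invert_lists(source):
    """Map each list item back to the keys whose lists contain it."""
    inverted = {}
    for key, values in source.items():
        for value in values:
            # setdefault inserts an empty list the first time a value is seen
            inverted.setdefault(value, []).append(key)
    return inverted


source = {'G': ['crypto'],
          'T': ['update', 'monitor', 'ipforum'],
          'F': ['update'],
          'M': ['crypto', 'update']}
components = invert_lists(source)
print(components)
```

The result is a regular dict, so it prints without the defaultdict wrapper.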

Related

How to merge keys of dictionary which have the same value?

I need to combine two dictionaries by their value, resulting in a new key which is the list of keys with the shared value. All I can find online is how to add two values with the same key or how to simply combine two dictionaries, so perhaps I am just searching in the wrong places.
To give an idea:
dic1 = {'A': 'B', 'C': 'D'}
dic2 = {'D': 'B', 'E': 'F'}
Should result in:
dic3 = {['A', 'D']: 'B', 'C': 'D', 'E': 'F'}
I am not sure why you would need such a data structure, you can probably find a better solution to your problem. However, just for the sake of answering your question, here is a possible solution:
dic1 = {'A':'B', 'C':'D'}
dic2 = {'D':'B', 'E':'F'}
key_list = list(dic2.keys())
val_list = list(dic2.values())
r = {}
for k,v in dic1.items():
    if v in val_list:
        i = val_list.index(v) #get index at value
        k2 = key_list[i] #use index to retrieve the key at value
        r[(k, k2)] = v #make the dict entry
    else:
        r[k] = v
val_list = list(r.values()) #get all the values already processed
for k,v in dic2.items():
    if v not in val_list: #if missing value
        r[k] = v #add new entry
print(r)
output:
{('A', 'D'): 'B', 'C': 'D', 'E': 'F'}
You can't use a list as a key in a Python dictionary, since keys must be hashable and a list is not a hashable object, so I have used a tuple instead.
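The hashability point can be checked directly. A short sketch; the names `lookup` and `error_message` are only for illustration:

```python
lookup = {}

try:
    lookup[['A', 'D']] = 'B'      # a list is mutable and therefore unhashable
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

lookup[('A', 'D')] = 'B'          # a tuple of hashable items works as a key
print(lookup)
```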
I would use a defaultdict of lists and build a reversed dict and in the end reverse it while converting the lists to tuples (because lists are not hashable and can't be used as dict keys):
from collections import defaultdict
dic1 = {'A':'B', 'C':'D'}
dic2 = {'D':'B', 'E':'F'}
temp = defaultdict(list)
for d in (dic1, dic2):
    for key, value in d.items():
        temp[value].append(key)
print(temp)
res = {}
for key, value in temp.items():
    if len(value) == 1:
        res[value[0]] = key
    else:
        res[tuple(value)] = key
print(res)
The printout from this (showing the middle step of temp) is:
defaultdict(<class 'list'>, {'B': ['A', 'D'], 'D': ['C'], 'F': ['E']})
{('A', 'D'): 'B', 'C': 'D', 'E': 'F'}
If you are willing to accept 1-element tuples as keys, the second part becomes much simpler:
res = {tuple(value): key for key, value in temp.items()}
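The same idea can be generalized to any number of input dicts. A sketch only; `merge_by_value` is a hypothetical helper name, and it assumes each key maps to a single value across all inputs:

```python
from collections import defaultdict


def merge_by_value(*dicts):
    """Group keys that share a value; grouped keys become a tuple."""
    keys_by_value = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            if key not in keys_by_value[value]:   # skip repeated keys
                keys_by_value[value].append(key)

    merged = {}
    for value, keys in keys_by_value.items():
        # a single key stays as-is, several keys are combined into a tuple
        merged[keys[0] if len(keys) == 1 else tuple(keys)] = value
    return merged


result = merge_by_value({'A': 'B', 'C': 'D'}, {'D': 'B', 'E': 'F'})
print(result)
```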

Compare dicts and merge them. No overwrite and no duplicate values

I made a mistake in my question here (wrong requested input and expected output):
Comparing dicts, updating NOT overwriting values
I am not looking for this solution:
Combining 2 dictionaries with common key
So this question is not a duplicate
Problem statement:
requested input:
d1 = {'a': ['a'], 'b': ['b', 'c']}
d2 = {'b': ['c', 'd'], 'c': ['e','f']}
expected output (I don't care about the order of the keys / values!):
new_dict = {'a': ['a'], 'b': ['b', 'c', 'd'], 'c': ['e', 'f']}
The solution in Combining 2 dictionaries with common key
gives following output:
new_dict = {'a': ['a'], 'b': ['b', 'c', 'c', 'd'], 'c': ['e', 'f']}
I don't want the duplicates to be stored.
My solution (it works but it is not so efficient):
unique_vals = []
new_dict = {}
for key in list(d1.keys())+list(d2.keys()) :
    unique_vals = []
    try:
        for val in d1[key]:
            try:
                for val1 in d2[key]:
                    if(val1 == val) and (val1 not in unique_vals):
                        unique_vals.append(val)
            except:
                continue
    except:
        new_dict[key] = unique_vals
    new_dict[key] = unique_vals
for key in d1.keys():
    for val in d1[key]:
        if val not in new_dict[key]:
            new_dict[key].append(val)
for key in d2.keys():
    for val in d2[key]:
        if val not in new_dict[key]:
            new_dict[key].append(val)
Here is how I would go about it:
d1 = {'a': ['a'], 'b': ['b', 'c']}
d2 = {'b': ['c', 'd'], 'c': ['e','f']}
dd1 = {**d1, **d2}
dd2 = {**d2, **d1}
{k:list(set(dd1[k]).union(set(dd2[k]))) for k in dd1}
Produces the desired result.
I suggest using a default dictionary collection with a set as a default value.
It guarantees that all values will be unique and makes the code cleaner.
Regarding efficiency, it is O(n^2) in time.
from collections import defaultdict
d1 = {'a': ['a'], 'b': ['b', 'c']}
d2 = {'b': ['c', 'd'], 'c': ['e','f']}
new_dict = defaultdict(set)
for k, v in d1.items():
    new_dict[k] = new_dict[k].union(set(v))
for k, v in d2.items():
    new_dict[k] = new_dict[k].union(set(v))
Try this code. You can remove the deep copy if modifying the input dict is fine for you.
import copy
def merge(left, right):
    res = copy.deepcopy(left)
    for k, v in right.items():
        res[k] = list(set(res[k]).union(v)) if k in res else v
    return res
Simple if statement if you don't want to use a Set.
d3 = dict(d2)
for k,v in d1.items():
    if k not in d3:
        d3[k] = v
    else:
        for n in d1[k]:
            if n not in d3[k]:
                d3[k].append(n)
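For comparison, here is a sketch of a merge that keeps the original order of the values and leaves both inputs untouched. The function name `merge_unique` is hypothetical, and the membership test on a list is linear, so this is meant for small lists:

```python
def merge_unique(left, right):
    """Merge two dicts of lists without duplicates or overwriting."""
    merged = {}
    for source in (left, right):
        for key, values in source.items():
            target = merged.setdefault(key, [])   # fresh list per key
            for value in values:
                if value not in target:           # skip duplicates
                    target.append(value)
    return merged


d1 = {'a': ['a'], 'b': ['b', 'c']}
d2 = {'b': ['c', 'd'], 'c': ['e', 'f']}
new_dict = merge_unique(d1, d2)
print(new_dict)
```

Because every key gets a newly created list, appending to `new_dict` later does not change `d1` or `d2`.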

File to dictionary only prints one

I have a text file that reads:
a;b
a;c
a;d
b;h
c;e
e;f
e;g
e;j
f;b
g;d
h;b
h;e
i;d
i;e
but when I print it after making it into a dictionary
def read_graph(file_name):
    graph = {}
    for line in open(file_name):
        if ";" in line:
            key, val = map(str.strip, line.split(";"))
            graph[key] = val
    return dict(sorted(graph.items()))
It prints:
{'a': 'b', 'b': 'd', 'c': 'e', 'd': 'g', 'e': 'd', 'f': 'd'}
how do I make it where it prints the keys that repeat?
I assume for this you'd want to use a list of strings instead of a single string as the value, otherwise your dictionary will keep replacing the value for the same key.
Instead of:
{'a': 'b'}
You would probably want a structure such as:
{'a': ['b','c','d']}
Using your function:
def read_graph(file_name):
    graph = {}
    for line in open(file_name):
        if ";" not in line: continue
        key, val = line.strip().split(';')
        if key not in graph: graph[key] = list()
        if val not in graph[key]: graph[key].append(val)
    return dict(sorted(graph.items()))
read_graph('file.txt')
{'a': ['b', 'c', 'd'], 'c': ['e'], 'b': ['h'], 'e': ['f', 'g', 'j'], 'g': ['d'], 'f': ['b'], 'i': ['d', 'e'], 'h': ['b', 'e']}
Dictionaries in python (and every other language I know) have unique values for each key, and will overwrite them when you put a new value in for an existing key.
Consider a different kind of data structure, like a set of tuples, e.g.
{('a','b'), ('a','c'), ...}
Or, as it looks like you are making a graph, a dictionary where the values are lists of vertices instead of individual vertices, e.g.
{'a':['b','c'],...}
To make the set of tuples, initialize graph = set() and replace the line
graph[key] = val
with
graph.add((key, val))
To make a dictionary-to-lists, use
if key in graph:
    graph[key].append(val)
else:
    graph[key] = [val]
Hope this helps!
You cannot, because that is a dictionary, and it is not allowed to have two identical keys or it would be ambiguous. You could group by key.
def read_graph(file_name):
    graph = {}
    for line in open(file_name):
        if ";" in line:
            key, val = map(str.strip, line.split(";"))
            if key not in graph:
                graph[key] = [val]
            else:
                graph[key].append(val)
    return dict(sorted(graph.items()))
So now you have for every key, an array with its values.
Since you seem to be working with a graph structure, I would recommend you look at the NetworkX package for Python. They have pre-built graph data-structures for you to use and many algorithms that can operate on them.
import networkx as nx
graph = nx.Graph()
with open(file_name) as f: # This closes the file automatically when you're done
    for line in f:
        if ";" in line:
            source, dest = map(str.strip, line.split(";"))
            graph.add_edge(source, dest)
In case you still want to use vanilla Python only:
Python's dictionaries can only have one value per key. To store multiple values for a single key, you have to store the values in a list.
my_dict = {
    'a': ['b', 'c', 'd'],
    'b': ['h'],
    ...
}
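Putting the answers above together, a sketch of the grouping with collections.defaultdict might look like this. To keep it self-contained it reads from an in-memory io.StringIO instead of a file on disk, and only the first few lines of the sample data are used:

```python
import io
from collections import defaultdict


def read_graph(lines):
    """Group 'key;value' lines into a dict of lists, sorted by key."""
    graph = defaultdict(list)
    for line in lines:
        if ";" not in line:
            continue                       # skip blank or malformed lines
        key, val = (part.strip() for part in line.split(";", 1))
        if val not in graph[key]:          # avoid duplicate edges
            graph[key].append(val)
    return dict(sorted(graph.items()))


sample = io.StringIO("a;b\na;c\na;d\nb;h\nc;e\n")
graph = read_graph(sample)
print(graph)
```

Any iterable of lines works, so an open file object can be passed in place of `sample`.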

How to create a dictionary whose values are lists

I read some code in the book 'Think Python'. This code gets stuck at inverse[val].append(key) with an error:
'str' object has no attribute 'append'
Which makes sense as inverse[val] contains a string object.
Here d is the input dictionary.
def invert_dict(d):
inverse = dict()
for key in d:
val = d[key]
if val not in inverse:
inverse[val] = [key]
else:
inverse[val].append(key)
return inverse
The input dictionary is {'a': 1, 'p': 1, 'r': 2, 't': 1, 'o': 1}
The expected output is {1: ['a', 'p', 't', 'o'], 2: ['r']}
How do I implement this, by modifying the given block of code?
You can use collections.defaultdict to create a dictionary of lists. Then append to dictionary values while iterating your input dictionary.
from collections import defaultdict
d_in = {'a': 1, 'p': 1, 'r': 2, 't': 1, 'o': 1}
d_out = defaultdict(list)
for k, v in d_in.items():
d_out[v].append(k)
print(d_out)
defaultdict(<class 'list'>, {1: ['a', 'p', 't', 'o'], 2: ['r']})
Your code can be improved by iterating keys and values simultaneously via dict.items, instead of iterating keys and manually extracting the value. In addition, your indentation is incorrect. After resolving these issues:
def invert_dict(d):
    inverse = dict()
    for key, val in d.items():
        if val not in inverse:
            inverse[val] = [key]
        else:
            inverse[val].append(key)
    return inverse
try this:
def invert_dict(data):
    inverse = {}
    for key, value in data.items():
        if value not in inverse:
            inverse[value] = [key]
        else:
            inverse[value].append(key)
    return inverse
A one-liner using reduce:
from functools import reduce
inverted_dict = reduce((lambda inverted_dict, key: inverted_dict.setdefault(d[key], []).append(key) or inverted_dict), d, {})
Output:
{1: ['t', 'o', 'p', 'a'], 2: ['r']}
You can also follow a different approach in which you take all values from your dictionary and match each value with the keys that have this value in the initial dictionary:
def invert_dict(d):
    values = set(d.values())
    inverse = dict((v,[k for k in d.keys() if d[k]==v]) for v in values)
    return inverse
inv = invert_dict({'a': 1, 'p': 1, 'r': 2, 't': 1, 'o': 1})
print(inv)
Output:
{1: ['a', 'p', 't', 'o'], 2: ['r']}
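The reported error is consistent with the first assignment storing the bare key rather than a one-element list. The sketch below reproduces that under this assumption; `invert_dict_broken` is a hypothetical name and is not the code from the book:

```python
def invert_dict_broken(d):
    inverse = {}
    for key, val in d.items():
        if val not in inverse:
            inverse[val] = key          # stores a str, not a list
        else:
            inverse[val].append(key)    # fails on the second key with this value
    return inverse


try:
    invert_dict_broken({'a': 1, 'p': 1})
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Writing `inverse[val] = [key]` in the first branch, as in the answers above, avoids the error.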

Making a dictionary from a list of lists

I have been unable to figure this out, I think the problem might be in the way I am making the list of lists. Can anyone help out? Thanks!
My desired outcome is
codondict = {'A': ['GCT','GCC','GCA','GCG'], 'C': ['TGT','TGC'], &c
but what i get is:
{'A': 'A', 'C': 'C', &c.
Here's my terminal:
A=['GCT','GCC','GCA','GCG']
C=['TGT','TGC']
D=['GAT','GAC']
E=['GAA','GAG']
F=['TTT','TTC']
G=['GGT','GGC','GGA','GGG']
H=['CAT','CAC']
I=['ATT','ATC','ATA']
K=['AAA','AAG']
L=['TTA','TTG','CTT','CTC','CTA','CTG']
M=['ATG']
N=['AAT','AAC']
P=['CCT','CCC','CCA','CCG']
Q=['CAA','CAG']
R=['CGT','CGC','CGA','CGG','AGA','AGG']
S=['TCT','TCC','TCA','TCG','AGT','AGC']
T=['ACT','ACC','ACA','ACG']
V=['GTT','GTC','GTA','GTG']
W=['TGG']
Y=['TAT','TAC']
aminoacids=['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
from collections import defaultdict
codondict=defaultdict(list)
for i in aminoacids:
... for j in i:(ALSO TRIED for j in list(i))
... ... codondict[i]=j
...
codondict
defaultdict(<type 'list'>, {'A': 'A', 'C': 'C', 'E': 'E', 'D': 'D', 'G': 'G', 'F': 'F', 'I': 'I', 'H': 'H', 'K': 'K', 'M': 'M', 'L': 'L', 'N': 'N', 'Q': 'Q', 'P': 'P', 'S': 'S', 'R': 'R', 'T': 'T', 'W': 'W', 'V': 'V', 'Y': 'Y'})
You can try this:
condondict= dict(A=['GCT','GCC','GCA','GCG'],
C=['TGT','TGC'],
D=['GAT','GAC'],
E=['GAA','GAG'],
F=['TTT','TTC'],
G=['GGT','GGC','GGA','GGG'],
H=['CAT','CAC'],
I=['ATT','ATC','ATA'],
K=['AAA','AAG'],
L=['TTA','TTG','CTT','CTC','CTA','CTG'],
M=['ATG'],
N=['AAT','AAC'],
P=['CCT','CCC','CCA','CCG'],
Q=['CAA','CAG'],
R=['CGT','CGC','CGA','CGG','AGA','AGG'],
S=['TCT','TCC','TCA','TCG','AGT','AGC'],
T=['ACT','ACC','ACA','ACG'],
V=['GTT','GTC','GTA','GTG'],
W=['TGG'],
Y=['TAT','TAC'])
The reason to use defaultdict() is to allow access/creation of dictionary values without causing a KeyError, or to bypass the form:
if key not in mydict.keys():
    mydict[key] = []
mydict[key].append(something)
If you're not creating new keys dynamically, you don't really need to use defaultdict().
Also, if your keys already represent the amino acids, you can just iterate over the keys themselves.
for aminoacid, sequence in condondict.items():
    # do stuff with the data...
    pass
Another way to do what you need is using the locals() function, which returns a dictionary containing the whole set of variables of the local scope, with the variable names as the keys and its contents as values.
for i in aminoacids:
    codondict[i] = locals()[i]
So, you could get the A list, for example, using: locals()['A'].
That's kind of verbose, and is confusing the name of a variable 'A' with its value A. Keeping to what you've got:
aminoacids = { 'A': A, 'C': C, 'D': D ... }
should get you the dictionary you ask for:
{ 'A' : ['GCT', 'GCC', 'GCA', 'GCG'], 'C' : ['TGT', 'TGC'], ... }
where the order of keys 'A' and 'C' may not be what you get back, because dictionaries are not ordered (before Python 3.7).
You can use globals() built-in too, and dict comprehension:
codondict = {k:globals()[k] for k in aminoacids}
it's better to rely on locals() instead of globals(), like stummjr's solution, but you can't do so with dict comprehension directly
codondict = dict([(k,locals()[k]) for k in aminoacids])
However you can do this:
loc = locals()
codondict = {k:loc[k] for k in aminoacids}
If you dynamically change your aminoacids list or the aminoacid assignments, it's better to use something lazier, like:
codondict = lambda: {k:globals()[k] for k in aminoacids}
With this last version you always get the updated dictionary, but codondict is now a callable, so use codondict()[x] instead of codondict[x]. This way you can also store a snapshot of the dict, like hist = codondict(), in case you need to compare different historical versions. That's small enough to be useful in interactive sessions, but it is not recommended in larger code bases.
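If looking variables up by name through locals() or globals() feels too indirect, another option is to pair the names with the list objects explicitly. A sketch using only three of the amino acids from the question; the `names` list is introduced here for illustration:

```python
A = ['GCT', 'GCC', 'GCA', 'GCG']
C = ['TGT', 'TGC']
M = ['ATG']

names = ['A', 'C', 'M']
# zip pairs each name with the matching list; no name lookup is involved
codondict = dict(zip(names, [A, C, M]))
print(codondict)
```

The dict holds references to the same list objects, so appending to `A` later is also visible through `codondict['A']`.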
