Making a dictionary from a list of lists - python

I have been unable to figure this out; I think the problem might be in the way I am making the list of lists. Can anyone help out? Thanks!
My desired outcome is
codondict = {'A': ['GCT','GCC','GCA','GCG'], 'C': ['TGT','TGC'], &c
but what I get is:
{'A': 'A', 'C': 'C', &c.
Here's my terminal:
A=['GCT','GCC','GCA','GCG']
C=['TGT','TGC']
D=['GAT','GAC']
E=['GAA','GAG']
F=['TTT','TTC']
G=['GGT','GGC','GGA','GGG']
H=['CAT','CAC']
I=['ATT','ATC','ATA']
K=['AAA','AAG']
L=['TTA','TTG','CTT','CTC','CTA','CTG']
M=['ATG']
N=['AAT','AAC']
P=['CCT','CCC','CCA','CCG']
Q=['CAA','CAG']
R=['CGT','CGC','CGA','CGG','AGA','AGG']
S=['TCT','TCC','TCA','TCG','AGT','AGC']
T=['ACT','ACC','ACA','ACG']
V=['GTT','GTC','GTA','GTG']
W=['TGG']
Y=['TAT','TAC']
aminoacids=['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
from collections import defaultdict
codondict=defaultdict(list)
for i in aminoacids:
...     for j in i:  # also tried: for j in list(i)
...         codondict[i]=j
...
codondict
defaultdict(<type 'list'>, {'A': 'A', 'C': 'C', 'E': 'E', 'D': 'D', 'G': 'G', 'F': 'F', 'I': 'I', 'H': 'H', 'K': 'K', 'M': 'M', 'L': 'L', 'N': 'N', 'Q': 'Q', 'P': 'P', 'S': 'S', 'R': 'R', 'T': 'T', 'W': 'W', 'V': 'V', 'Y': 'Y'})

You can try this:
condondict= dict(A=['GCT','GCC','GCA','GCG'],
C=['TGT','TGC'],
D=['GAT','GAC'],
E=['GAA','GAG'],
F=['TTT','TTC'],
G=['GGT','GGC','GGA','GGG'],
H=['CAT','CAC'],
I=['ATT','ATC','ATA'],
K=['AAA','AAG'],
L=['TTA','TTG','CTT','CTC','CTA','CTG'],
M=['ATG'],
N=['AAT','AAC'],
P=['CCT','CCC','CCA','CCG'],
Q=['CAA','CAG'],
R=['CGT','CGC','CGA','CGG','AGA','AGG'],
S=['TCT','TCC','TCA','TCG','AGT','AGC'],
T=['ACT','ACC','ACA','ACG'],
V=['GTT','GTC','GTA','GTG'],
W=['TGG'],
Y=['TAT','TAC'])
The reason to use defaultdict() is to allow access/creation of dictionary values without causing a KeyError, or to bypass using the form:
if key not in mydict.keys():
    mydict[key] = []
mydict[key].append(something)
If you're not creating new keys dynamically, you don't really need to use defaultdict().
Also, if your keys already represent the amino acids, you can just iterate over the dictionary itself.
for aminoacid, sequence in condondict.items():
    pass  # do stuff with the data...
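For what it's worth, the loop in the question produces {'A': 'A', ...} because aminoacids holds the strings 'A', 'C', ..., so "for j in i" walks over the characters of each one-letter string and never touches the lists named A, C, and so on. A minimal sketch, trimmed to two amino acids for brevity (the name broken is only for illustration), contrasting that with pairing each name explicitly with its list:

```python
A = ['GCT', 'GCC', 'GCA', 'GCG']
C = ['TGT', 'TGC']
aminoacids = ['A', 'C']  # these are strings, not the lists A and C

# What the question's loop does: iterate over the characters of 'A', 'C', ...
broken = {}
for name in aminoacids:
    for ch in name:          # the only "element" of the string 'A' is 'A'
        broken[name] = ch    # plain assignment, so the value is a string

# Pair each name with the list object it is supposed to refer to.
codondict = dict(zip(aminoacids, [A, C]))

print(broken)
print(codondict)
```

The second form still repeats every variable name once, which is why building the dictionary literal directly, as above, is usually simpler.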

Another way to do what you need is using the locals() function, which returns a dictionary containing the whole set of variables of the local scope, with the variable names as the keys and its contents as values.
for i in aminoacids:
    codondict[i] = locals()[i]
So, you could get the A list, for example, using: locals()['A'].

That's kind of verbose, and is confusing the name of a variable 'A' with its value A. Keeping to what you've got:
aminoacids = { 'A': A, 'C': C, 'D': D ... }
should get you the dictionary you ask for:
{ 'A' : ['GCT', 'GCC', 'GCA', 'GCG'], 'C' : ['TGT', 'TGC'], ... }
where the order of keys 'A' and 'C' may not be what you get back, because dictionaries were not ordered before Python 3.7.

You can use globals() built-in too, and dict comprehension:
codondict = {k:globals()[k] for k in aminoacids}
it's better to rely on locals() instead of globals(), like stummjr's solution, but you can't do so with dict comprehension directly
codondict = dict([(k,locals()[k]) for k in aminoacids])
However you can do this:
loc = locals()
codondict = {k:loc[k] for k in aminoacids}
If you change your aminoacids list or the amino acid assignments dynamically, it's better to use something lazier, like:
codondict = lambda: {k:globals()[k] for k in aminoacids}
With this last version you always get the up-to-date dictionary, but codondict is now a callable: calling codondict() returns an actual dict, so use codondict()[x] instead of codondict[x]. This way you can also store a snapshot of the entire dict, like hist = codondict(), in case you need to compare different historical versions of codondict. That's small enough to be useful in interactive sessions, but not recommended in larger codebases.

Related

Reverse lookup dict of tuples, or use a different data structure?

I currently have a dictionary of tuples like this:
d = {
    0: ('f', 'farm'),
    1: ('m', 'mountain'),
    2: ('h', 'house'),
    3: ('t', 'forest'),
    4: ('d', 'desert')
}
It's been working fine until I realized that I need to be able to do a reverse lookup, so given 'f' return 0, or given 'm' return 1.
I know that's possible here by creating lists of the keys and values in the dict and cross-referencing them to find the position of the key, but that seems counterproductive. I was wondering if there's a different data structure that would be better suited.
All the relationships here are one-to-one. 0 will always map to f, and f will always map to 0.
There are already similar questions that you could use as a starting point.
How to implement an efficient bidirectional hash table?
Reverse / invert a dictionary mapping
In your specific case, the dict in its current form may be pointless. Dicts with integer keys starting from zero are just inefficient lists, but lists already have a reverse lookup method called index.
So what you could do is:
places_order = ['f', 'm', 'h', 't', 'd']
and then use places_order.index(some_letter) to get the corresponding integer.
This is an O(n) operation, so I'm assuming you don't need to perform these lookups millions of times with high performance. (Otherwise, especially if the real places_order is a long list, consider dict(zip(places_order, range(len(places_order)))).)
In addition, you could keep a map for the abbreviations of place names.
abbreviations = {
    'f': 'farm',
    'm': 'mountain',
    'h': 'house',
    't': 'forest',
    'd': 'desert'
}
It's hard to say more without knowing any specifics about what you are trying to achieve.
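If the lookups do need to be fast, the dict(zip(...)) idea mentioned above amounts to building a second dictionary that maps each abbreviation back to its integer. A minimal sketch, assuming the one-to-one relationship stated in the question (the names by_abbr and index_of are made up for this example):

```python
d = {
    0: ('f', 'farm'),
    1: ('m', 'mountain'),
    2: ('h', 'house'),
    3: ('t', 'forest'),
    4: ('d', 'desert'),
}

# Reverse index built once in O(n); each lookup afterwards is O(1).
by_abbr = {abbr: key for key, (abbr, name) in d.items()}

# The same mapping built from a plain list of abbreviations.
places_order = ['f', 'm', 'h', 't', 'd']
index_of = dict(zip(places_order, range(len(places_order))))

print(by_abbr['f'], by_abbr['m'])
```

The reverse index has to be rebuilt or updated whenever d changes, which is the price for the constant-time lookup.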
A most straightforward search would be:
def search(dictionary, word):
    for key, value in dictionary.items():
        if word in value:
            return key
This could then be used as:
>>> search(d, 'h')
2
Not sure I fully understand the use case, but if you are not restricted to built-in Python, what about pandas.Series:
import pandas as pd
d = pd.Series(['farm', 'mountain', 'house', 'forest', 'desert'], index=['f', 'm', 'h', 't', 'd'])
so both d.iloc[0] and d['f'] will output 'farm', d.index[0] will give 'f' and d.index.get_loc('f') will give 0. (d[0] also works positionally in older pandas versions, but that fallback is deprecated.)
For the case of built-ins see @timgeb's answer or consider a sort of namedtuple.
Here is another approach:
d = {
    0: ('f', 'farm'),
    1: ('m', 'mountain'),
    2: ('h', 'house'),
    3: ('t', 'forest'),
    4: ('d', 'desert')
}
for key, value in d.items():
    if 'h' in value:
        print(key)
        break
You can use a dict like this:
abbrs = {
    'f': 'farm',
    'm': 'mountain',
    'h': 'house',
    't': 'forest',
    'd': 'desert'
}
when you need to loop with the index:
for i, (key, value) in enumerate(abbrs.items()):
    print(i, key, value) # first line: 0 f farm
when you want an item by index:
list(abbrs.items())[i] # ('f', 'farm') for i = 0

Replace symbols in a string with conflicting keys in a dictionary

I need a translator that uses a dictionary with keys like
's': 'd'
and
'sch': 'b'
.
That's a rough example, but the point is this: when I have an input word like "schto", it should be translated to "bkr", substituting 'sch' with 'b'. But because there is also a key 's', the word is translated as "dnokr": each character is looked up separately, so 'sch' is never looked up. What is a workaround to replace the input word using the key 'sch' first, rather than the separate 's', 'c', and 'h'?
Here is the example of the code.
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}
code = input("Type: ")
code = "".join([newdict[w] for w in code])
print(code)
In a regular expression, alternatives separated by | are tried from left to right, and the first one that matches wins. If you're using a version of Python in which the insertion order of key-value pairs in a dictionary is guaranteed (3.7+), and you insert the pairs so that the longer keys come first, something like this should work for you. re.sub takes either a replacement string or a callable (function, lambda, etc.) that accepts the current match as an argument and must return the string to replace it with:
import re
lookup = {
    "sch": "b",
    "sh": "q",
    "s": "d"
}
def replace(match):
    return lookup[match.group()]
pattern = "|".join(lookup)
print(re.sub(pattern, replace, "schush swim"))
Output:
buq dwim
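To avoid depending on the order in which the keys were inserted, one option is to sort the keys by length before building the pattern, and to escape them in case a key contains a regex metacharacter. A sketch under those assumptions, using the dictionary from the question (the function name translate is hypothetical):

```python
import re

newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd',
           'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}

def translate(text, table):
    # Longest keys first, so 'sch' is tried before 'sh' and 's'.
    keys = sorted(table, key=len, reverse=True)
    # re.escape guards against keys such as '.' or '+'.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Single pass over the input: replaced text is never scanned again.
    return pattern.sub(lambda m: table[m.group()], text)

print(translate("schto", newdict))  # bkr
```

Characters that have no key in the table are left unchanged, unlike the list comprehension in the question, which would raise a KeyError for them.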
If you are using Python 3.7+, dictionaries maintain the insertion order of keys (in CPython 3.6 this was only an implementation detail), so you can achieve this using str.replace() while iterating over dict.items().
Note that this updates the string cumulatively: each replacement is applied to the result of the previous ones. For example, if 'h' is replaced by 'o', then that 'o' will later be replaced by 'r'.
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}
my_word = "schto"
for k, v in newdict.items():
    my_word = my_word.replace(k, v)
After the loop, my_word holds your desired string 'bkr'.
Here, since dict.items() maintains the insertion order, keys which are defined first will be applied first during the iteration. Hence, you can set the priority of your rules by declaring the keys you want to take precedence before the other keys.
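The cumulative behaviour is worth checking against your own inputs before relying on this approach. A small sketch (the helper name chained_replace is made up here, and Python 3.7+ dict ordering is assumed) showing one input where it gives the intended result and one where it does not:

```python
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd',
           'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}

def chained_replace(word, table):
    # Applies every rule in insertion order to the running result.
    for k, v in table.items():
        word = word.replace(k, v)
    return word

print(chained_replace("schto", newdict))  # bkr
# 'h' becomes 'o', and then every 'o' (old and new) becomes 'r'.
print(chained_replace("ho", newdict))     # rr
```

A character-by-character translation of "ho" would be "or", so the chained version only matches it when no replacement value is itself a key that is applied later.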

Python - restructure dictionary of lists in a loop

Hi guys, I have a dict id_dict with some keys. The values are lists (sometimes with only one item, sometimes with 4); these are names of components.
My dict id_dict is created dynamically, and the keys and values will always change.
S = {'G': ['crypto'], 'T': ['update', 'monitor', 'ipforum'], 'F': ['update'], 'M': ['crypto','update']}
R = {}
for key, value_list in S.items():
    for value in value_list:
        if value not in R:
            R[value] = []
        R[value].append(key)
print(R)
Output:
{'crypto': ['G', 'M'], 'update': ['T', 'F', 'M'], 'monitor': ['T'], 'ipforum': ['T']}
You are actually trying to use the components list you're iterating over as a key of the dictionary; this is what causes the error.
Try simply doing:
component_dict[component] = id
maybe try something like:
new_dic = {}
for k, v in dic.items():
    for ele in v:
        if ele not in new_dic:
            new_dic[ele] = []
        new_dic[ele].append(k)
new_dic
Note the append
It seems that you use components instead of component as the key; a list cannot be a dict key, and this is what causes the error.
In addition, you didn't store a list but just assigned the value. You need to assign a list if the key doesn't exist, and append to it if it does.
So:
if dict_components.get(component):
    dict_components[component].append(id)
else:
    dict_components[component] = [id]
try this
from collections import defaultdict
d = {'G': ['crypto'], 'T': ['update', 'monitor', 'ipforum'], 'F': ['update'], 'M': ['crypto', 'update']}
d1 = defaultdict(list)
for k, v in d.items():
    for x in v:
        d1[x].append(k)
print(d1)
output
defaultdict(<class 'list'>, {'crypto': ['G', 'M'], 'update': ['T', 'F', 'M'], 'monitor': ['T'], 'ipforum': ['T']})
The easiest way will be to use collections.defaultdict, which will save you from multiple tests
from collections import defaultdict
source = {'G': ['crypto'],
          'T': ['update', 'monitor', 'ipforum'],
          'F': ['update'],
          'M': ['crypto','update']}
components = defaultdict(list)
for key, values_list in source.items():
    for value in values_list:
        components[value].append(key)
And the result will be
defaultdict(list,
{'crypto': ['G', 'M'],
'update': ['T', 'F', 'M'],
'monitor': ['T'],
'ipforum': ['T']})
As an alternative, you may create regular dict and use setdefault method to create lists
components = {}
for key, values_list in source.items():
    for value in values_list:
        components.setdefault(value, []).append(key)

File to dictionary only prints one

I have a text file that reads:
a;b
a;c
a;d
b;h
c;e
e;f
e;g
e;j
f;b
g;d
h;b
h;e
i;d
i;e
but when I print it after making it into a dictionary
def read_graph(file_name):
    graph = {}
    for line in open(file_name):
        if ";" in line:
            key, val = map(str.strip, line.split(";"))
            graph[key] = val
    return dict(sorted(graph.items()))
It prints:
{'a': 'b', 'b': 'd', 'c': 'e', 'd': 'g', 'e': 'd', 'f': 'd'}
How do I make it keep all the values for keys that repeat?
I assume for this you'd want to use a list of strings instead of a single string as the value, otherwise your dictionary will keep replacing the value for the same key.
Instead of:
{'a': 'b'}
You would probably want a structure such as:
{'a': ['b','c','d']}
Using your function:
def read_graph(file_name):
    graph = {}
    for line in open(file_name):
        if ";" not in line:
            continue
        key, val = line.strip().split(';')
        if key not in graph:
            graph[key] = list()
        if val not in graph[key]:
            graph[key].append(val)
    return dict(sorted(graph.items()))
read_graph('file.txt')
{'a': ['b', 'c', 'd'], 'c': ['e'], 'b': ['h'], 'e': ['f', 'g', 'j'], 'g': ['d'], 'f': ['b'], 'i': ['d', 'e'], 'h': ['b', 'e']}
Dictionaries in Python (and every other language I know) hold exactly one value per key, and will overwrite it when you assign a new value to an existing key.
Consider a different kind of data structure, like a set of tuples, e.g.
{('a','b'), ('a','c'), ...}
Or, as it looks like you are making a graph, a dictionary where the values are lists of vertices instead of individual vertices, e.g.
{'a':['b','c'],...}
To make the set of tuples, initialise graph with set() instead of {}, return sorted(graph) at the end, and replace the line
graph[key] = val
with
graph.add((key, val))
To make a dictionary-to-lists, use
if key in graph:
    graph[key].append(val)
else:
    graph[key] = [val]
Hope this helps!
You cannot, because that is a dictionary, and it is not allowed to have the same key twice or lookups would be ambiguous. You could group by key.
def read_graph(file_name):
    graph = {}
    for line in open(file_name):
        if ";" in line:
            key, val = map(str.strip, line.split(";"))
            if key not in graph:
                graph[key] = [val]
            else:
                graph[key].append(val)
    return dict(sorted(graph.items()))
So now you have for every key, an array with its values.
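The same grouping can be written a little more compactly with collections.defaultdict. The sketch below is only an illustration: it takes any iterable of lines rather than a file name (the function name read_graph_lines is made up), so it can be tried without a file on disk, and an open file object can be passed in the same way.

```python
from collections import defaultdict

def read_graph_lines(lines):
    graph = defaultdict(list)
    for line in lines:
        if ";" not in line:
            continue  # skip blank or malformed lines
        key, val = (part.strip() for part in line.split(";", 1))
        graph[key].append(val)  # the list is created on first use
    # Convert back to a plain dict, sorted by key.
    return dict(sorted(graph.items()))

sample = ["a;b\n", "a;c\n", "a;d\n", "b;h\n", "\n"]
print(read_graph_lines(sample))
```

With a real file this would be called as read_graph_lines(f) inside a "with open(file_name) as f:" block, which also closes the file when done.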
Since you seem to be working with a graph structure, I would recommend you look at the NetworkX package for Python. They have pre-built graph data-structures for you to use and many algorithms that can operate on them.
import networkx as nx
graph = nx.Graph()
with open(file_name) as f: # This closes the file automatically when you're done
    for line in f:
        if ";" in line:
            source, dest = map(str.strip, line.split(";"))
            graph.add_edge(source, dest)
In case you still want to use vanilla Python only:
Python's dictionaries can only have one value per key. To store multiple values for a single key, you have to store the values in a list.
my_dict = {
    'a': ['b', 'c', 'd'],
    'b': ['h'],
    ...
}

"TypeError: unhashable type: 'list'" yet I'm trying to only slice the value of the list, not use the list itself

I've been having issues trying to create a dictionary by using the values from a list.
alphabetList = list(string.ascii_lowercase)
alphabetList.append(list(string.ascii_lowercase))
alphabetDict = {}
def makeAlphabetDict (Dict, x):
    count = 0
    while count <= len(alphabetList):
        item1 = x[(count + (len(alphabetList) / 2))]
        item2 = item1
        Dict[item1] = item2
        count += 1
makeAlphabetDict(alphabetDict , alphabetList)
Which returns:
TypeError: unhashable type: 'list'
I tried here and other similar questions yet I still can't see why Python thinks I'm trying to use the list, rather than just a slice from a list.
Your list contains a nested list:
alphabetList.append(list(string.ascii_lowercase))
You now have a list with ['a', 'b', ..., 'z', ['a', 'b', ..., 'z']]. It is that last element in the outer list that causes your problem.
You would normally use list.extend() to add additional elements:
alphabetList.extend(string.ascii_lowercase)
You are using string.ascii_lowercase twice there; perhaps you meant to use ascii_uppercase for one of those strings instead? Even so, your code always uses the same character for both key and value so it wouldn't really matter here.
If you are trying to map lowercase to uppercase or vice-versa, just use zip() and dict():
alphabetDict = dict(zip(string.ascii_lowercase, string.ascii_uppercase))
where zip() produces pairs of characters, and dict() takes those pairs as key-value pairs. The above produces a dictionary mapping lowercase ASCII characters to uppercase:
>>> import string
>>> dict(zip(string.ascii_lowercase, string.ascii_uppercase))
{'u': 'U', 'v': 'V', 'o': 'O', 'k': 'K', 'n': 'N', 'm': 'M', 't': 'T', 'l': 'L', 'h': 'H', 'e': 'E', 'p': 'P', 'i': 'I', 'b': 'B', 'x': 'X', 'q': 'Q', 'g': 'G', 'd': 'D', 'r': 'R', 'z': 'Z', 'c': 'C', 'w': 'W', 'a': 'A', 'y': 'Y', 'j': 'J', 'f': 'F', 's': 'S'}
As Martijn Pieters noted, the problem is the list append, which adds a list inside your other list. You can join the two lists in either of the following ways:
alphabetList = list(string.ascii_lowercase)
alphabetList += list(string.ascii_lowercase)
# Concatenates in place; equivalent to alphabetList.extend(string.ascii_lowercase)
alphabetList = list(string.ascii_lowercase) * 2
# Just for your use case, to iterate twice over the alphabet
In either case, your alphabetDict will have only 26 keys and not 52, as you cannot have repeated keys within the dict.
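The difference between append and extend, and where the TypeError comes from, can be checked directly. A short sketch (the variable names nested, flat, probe and error are only for illustration):

```python
import string

nested = list(string.ascii_lowercase)
nested.append(list(string.ascii_lowercase))  # adds ONE element: a whole list

flat = list(string.ascii_lowercase)
flat.extend(string.ascii_lowercase)          # adds 26 separate characters

print(len(nested), len(flat))  # 27 52

# Using the nested list as a dict key is what raises the error.
probe = {}
try:
    probe[nested[-1]] = 'x'
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

Indexing into flat always yields a one-character string, which is hashable, so the same assignment with flat[-1] succeeds.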
