Adding two strings to a dictionary - python

I'm trying to check whether each key and its value in a dictionary are the same. If they are, I count the word as correct; if not, I want to check how many letters match exactly.
E.g. {'KEY':'KET'}
The output should be 1 mismatch, for Y!=T.
I tried the zip function to add the key and value to a new dictionary, but it doesn't add repeated letters to the dictionary, as shown below.
word_dict = {'PRETTY': 'PRESEN'}
count_correct = 0
for key, value in word_dict.items():
    if key == value:
        count_correct += 1
    elif key != value and len(key) == len(value):
        new_dict = dict(zip(key, value))
        print(new_dict)
output of above code is:
{'P': 'P', 'T': 'E', 'E': 'E', 'Y': 'N', 'R': 'R'}
which is missing one 'T':'S'
I know I could convert the key and value into separate lists and compare them index by index. But I would also like to know whether creating a dictionary can hold all the letter pairs from both strings.
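On the last point: a dictionary cannot hold every pair produced by zip(key, value), because dictionary keys are unique, so the pair for the second 'T' overwrites the pair for the first. A minimal sketch of one alternative keeps the pairs in a list instead; the helper name mismatched_pairs is made up here for illustration:

```python
def mismatched_pairs(first, second):
    """Return the position-wise letter pairs that differ."""
    # zip pairs the letters by position; a list keeps repeated letters,
    # whereas dict(zip(...)) keeps only the last pair for each key.
    return [(a, b) for a, b in zip(first, second) if a != b]


word_dict = {'PRETTY': 'PRESEN', 'KEY': 'KET'}
for key, value in word_dict.items():
    if len(key) == len(value):
        diffs = mismatched_pairs(key, value)
        print(key, value, len(diffs), diffs)
# PRETTY PRESEN 3 [('T', 'S'), ('T', 'E'), ('Y', 'N')]
# KEY KET 1 [('Y', 'T')]
```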

Related

Reverse lookup dict of tuples, or use a different data structure?

I currently have a dictionary of tuples like this:
d = {
0: ('f', 'farm'),
1: ('m', 'mountain'),
2: ('h', 'house'),
3: ('t', 'forest'),
4: ('d', 'desert')
}
It's been working fine until I realized that I need to be able to do a reverse lookup: given 'f' return 0, or given 'm' return 1.
I know that's possible here by creating lists of the keys and values in the dict and cross-referencing them to find the position of the key, but that seems counterproductive. I was wondering if there's a different data structure that would be better suited.
All the relationships here are one-to-one: 0 will always map to f, and f will always map to 0.
There are already similar questions that you could use as a starting point.
How to implement an efficient bidirectional hash table?
Reverse / invert a dictionary mapping
In your specific case, the dict in its current form may be pointless. Dicts with integer keys starting from zero are just inefficient lists, but lists already have a reverse lookup method called index.
So what you could do is:
places_order = ['f', 'm', 'h', 't', 'd']
and then use places_order.index(some_letter) to get the corresponding integer.
This is an O(n) operation, so I'm assuming you don't need to perform these lookups millions of times with high performance. (Otherwise, especially if the real places_order is a long list, consider dict(zip(places_order, range(len(places_order)))).)
In addition, you could keep a map for the abbreviations of place names.
abbreviations = {
'f': 'farm',
'm': 'mountain',
'h': 'house',
't': 'forest',
'd': 'desert'
}
It's hard to say more without knowing any specifics about what you are trying to achieve.
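If the lookups do need to be fast in both directions, one option is to keep the list for index-to-letter access and build a dict once for the reverse direction. This is only a sketch that relies on the one-to-one relationship stated in the question; index_of is a name chosen here for illustration:

```python
places_order = ['f', 'm', 'h', 't', 'd']

# Reverse map built once: letter -> position in the list.
index_of = {letter: i for i, letter in enumerate(places_order)}

print(places_order[2])  # h
print(index_of['h'])    # 2
```

Each lookup in index_of is a hash lookup rather than a scan of the list, at the cost of keeping the two structures in sync if the list ever changes.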
The most straightforward search would be:
def search(dictionary, word):
    for key, value in dictionary.items():
        if word in value:
            return key
This could then be used as:
>>> search(d, 'h')
2
I'm not sure I fully understand the use case, but if you are not restricted to Python built-ins, what about pandas.Series:
import pandas as pd
d = pd.Series(['farm', 'mountain', 'house', 'forest', 'desert'], index=['f', 'm', 'h', 't', 'd'])
Both d.iloc[0] and d['f'] will output 'farm', d.index[0] will give 'f' and d.index.get_loc('f') will give 0. (Positional access written as d[0] is deprecated in recent pandas versions when the index holds strings, so d.iloc[0] is the safer spelling.)
For the case of built-ins, see @timgeb's answer or consider some sort of namedtuple.
Here is another approach:
d = {
0: ('f', 'farm'),
1: ('m', 'mountain'),
2: ('h', 'house'),
3: ('t', 'forest'),
4: ('d', 'desert')
}
for key, value in d.items():
    if 'h' in value:
        print(key)
        break
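You can use a dict like this (named abbreviations here, because abs would shadow the built-in function):
abbreviations = {
    'f': 'farm',
    'm': 'mountain',
    'h': 'house',
    't': 'forest',
    'd': 'desert'
}
When you need the position of each entry, loop with enumerate:
for i, (key, value) in enumerate(abbreviations.items()):
    print(i, key, value) # first line printed: 0 f farm
When you want an item by index:
list(abbreviations.items())[i] # ('f', 'farm') for i = 0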

Replace symbols in a string with conflicting keys in a dictionary

I need a translator that has a dictionary with keys like 's': 'd' and 'sch': 'b'.
That's a rough example, but the point is: when I have an input word like "schto", it needs to be translated as "bkr", substituting 'sch' with 'b'. BUT there is also the key 's', so the word is translated as "dnokr": 'sch' is never looked up, because the single symbol 's' is translated first. What is a workaround to match the key 'sch' first, rather than the separate 's', 'c', and 'h'?
Here is an example of the code.
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}
code = input("Type: ")
code = "".join([newdict[w] for w in code])
print(code)
In Python's re module, alternation (|) tries the alternatives from left to right and takes the first one that matches, so longer keys must come before their shorter prefixes in the pattern. If you're using a version of Python in which the insertion order of key-value pairs in a dictionary is guaranteed (3.7+), and you insert the pairs so that the longer keys come first, something like this should work for you. re.sub takes either a replacement string or a callable (function, lambda, etc.) that accepts the current match as an argument and returns the string to replace it with:
import re
lookup = {
"sch": "b",
"sh": "q",
"s": "d"
}
def replace(match):
    return lookup[match.group()]
pattern = "|".join(lookup)
print(re.sub(pattern, replace, "schush swim"))
Output:
buq dwim
If you are using Python 3.7+, dictionaries maintain the insertion order of their keys (CPython 3.6 already did so as an implementation detail). Hence you can achieve this using str.replace() while iterating over dict.items().
Note that the replacements cascade: each rule is applied to the result of the previous ones. For example, if 'h' is replaced by 'o', then that 'o' will later be replaced by 'r'.
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}
my_word = "schto"
for k, v in newdict.items():
    my_word = my_word.replace(k, v)
After the loop, my_word holds the desired string 'bkr'.
Since dict.items() maintains the insertion order, keys which are defined first are applied first during the iteration. Hence, you can set the priority of your rules by declaring the keys that should take precedence before the other keys.
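Because the str.replace() loop feeds each rule the output of the previous one, a word such as "hot" becomes "rrk" rather than "ork". A sketch of a single-pass alternative, assuming that behaviour is not wanted: sort the keys by length so the pattern does not depend on insertion order, and escape them in case a key contains regex metacharacters. The names rules and translate are illustrative only:

```python
import re

rules = {'s': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r',
         'sch': 'b', 'sh': 'q', 'ch': 'w'}

# Longest keys first, so 'sch' is tried before 'sh', 'ch' and 's'.
keys_by_length = sorted(rules, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys_by_length))


def translate(text):
    # Each match is replaced exactly once; replaced text is never rescanned.
    return pattern.sub(lambda m: rules[m.group()], text)


print(translate("schto"))  # bkr
print(translate("hot"))    # ork
```

Characters with no rule are left unchanged, whereas the list comprehension in the question would raise a KeyError for them.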

Python: why a dictionary type data could automatically exclude some items?

I created a dictionary containing six items as below:
>>> dict1 = {
    'A': ['A','A'],
    'AB':['A','B'],
    'A':['A','O'],
    'B':['B','B'],
    'B':['B','O'],
    'O':['O','O']
}
But when I checked the dictionary, I found that the items "{'A': ['A', 'A'], 'B': ['B', 'B']}" had been excluded.
>>> dict1
Out[19]: {'A': ['A', 'O'], 'AB': ['A', 'B'], 'B': ['B', 'O'], 'O': ['O', 'O']}
>>> len(dict1)
Out[17]: 4
However, if I create a new dictionary with the excluded items, it behaves normally.
>>> dict2 ={'A': ['A', 'A'], 'B': ['B', 'B']}
>>> dict2
Out[21]: {'A': ['A', 'A'], 'B': ['B', 'B']}
Could anybody explain to me why that is?
You cannot have duplicate keys but you can have multiple values. In other words, each key is unique.
So each time you assign new values to the same key, you override the previous values of the key.
A way to assign 2 values (or lists) like in your example can be the following:
dict1 = {'A': [['A','A'],['A','O']], 'B':[['B','B'],['B','O']], 'O':['O','O'], 'AB':['A','B']}
Result
{'A': [['A', 'A'], ['A', 'O']], 'B': [['B', 'B'], ['B', 'O']], 'AB': ['A', 'B'], 'O': ['O', 'O']}
Finally, you can access each key as follows:
dict1['A']
Result
[['A', 'A'], ['A', 'O']]
This seems to be what you want to do.
Hope this helps.
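If the pairs come from somewhere else rather than being typed by hand, the same nested structure can be built in a loop. A small sketch, assuming the six entries from the question are available as (key, value) tuples; the names pairs and grouped are illustrative:

```python
pairs = [
    ('A', ['A', 'A']),
    ('AB', ['A', 'B']),
    ('A', ['A', 'O']),
    ('B', ['B', 'B']),
    ('B', ['B', 'O']),
    ('O', ['O', 'O']),
]

grouped = {}
for key, value in pairs:
    # setdefault creates the list the first time a key is seen;
    # every value for that key is then appended to it.
    grouped.setdefault(key, []).append(value)

print(grouped['A'])  # [['A', 'A'], ['A', 'O']]
print(len(grouped))  # 4
```

Unlike the hand-written dictionary above, every value here is a list of lists, including the single-entry keys 'AB' and 'O'.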
The thing with dictionaries in Python is that each key is unique. That is, when you assign to a key that already exists, the previously stored value is overwritten by the new one.
When you typed:
dict1 = {
    'A': ['A','A'],
    'AB':['A','B'],
    'A':['A','O'],  # Overrides ['A', 'A']
    'B':['B','B'],
    'B':['B','O'],  # Overrides previous entry
    'O':['O','O']
}
You gave the dictionary two values for each of the keys 'A' and 'B'. That is, you asked the dict to change the value previously stored.
I hope my answer was clear enough :)
A Python dictionary cannot contain duplicate keys. If the same key appears more than once, Python keeps only the last value assigned to it; each key is unique.
In your example:
dict1 = {
    'A': ['A','A'],
    'AB':['A','B'],
    'A':['A','O'],  # replaces 'A': ['A','A']
    'B':['B','B'],
    'B':['B','O'],  # replaces 'B': ['B','B']
    'O':['O','O']
}
Then your dictionary will be:
dict1 = {
    'A': ['A','O'],
    'AB':['A','B'],
    'B':['B','O'],
    'O':['O','O']
}
I think this will be helpful.
As the Python documentation says:
It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output.
The main operations on a dictionary are storing a value with some key and extracting the value given the key. It is also possible to delete a key:value pair with del. If you store using a key that is already in use, the old value associated with that key is forgotten. It is an error to extract a value using a non-existent key.
Reference : https://docs.python.org/3/tutorial/datastructures.html
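The operations described in the quoted passage can be tried out in a few lines. This is just an illustrative sketch; the variable name blood is made up:

```python
blood = {'A': ['A', 'A']}

blood['A'] = ['A', 'O']      # same key: the old value is forgotten
print(blood)                 # {'A': ['A', 'O']}

del blood['A']               # deletes the key:value pair
try:
    blood['A']               # extracting a missing key is an error
except KeyError as exc:
    print('KeyError:', exc)  # KeyError: 'A'
```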

Python: How do I consistently find the next index after another index in a list with duplicates?

Sorry if the question wasn't clear enough, I am very new to python. I also apologize in advance if there are any typos in my code.
Say I have a list
list = ['a','b','c','a','x','y','b','m','a','z']
And I want to get the index value of the element after each 'a' using a for loop and store it in a dict. (This assumes dict = {} already exists)
for store in list:
    if dict.has_key(store) == False:
        if list.index(store) != len(list)-1:
            dict[store] = []
            dict[store].append(list[list.index(store)+1])
    else:
        if list.index(store) != len(list)-1:
            dict[store].append(list[list.index(store)+1])
Now ideally, I would want my dict to be
dict = {'a':['b','x','z'], 'b':['c','m'], 'c':['a']....etc.}
Instead, I get
dict = {'a':['b','b','b'], 'b':['c','c'], 'c':['a']...etc.}
I realized this is because index only finds the first occurrence of variable store. How would I structure my code so that for every value of store I can find the next index of that specific value instead of only the first one?
Also, I want to know how to do this only using a for loop; no recursions or while, etc (if statements are fine obviously).
I apologize again if my question isn't clear or if my code is messy.
You can do it like this:
l = ['a','b','c','a','x','y','b','m','a','z']
d = {}
for i in range(len(l)-1):
    if l[i] not in d:
        d[l[i]] = []
    d[l[i]].append(l[i+1])
Then d is
{'a': ['b', 'x', 'z'],
'b': ['c', 'm'],
'c': ['a'],
'm': ['a'],
'x': ['y'],
'y': ['b']}
Regarding your code, there is no need to use index: you are already iterating over the list, so you do not need to search for the position of the current element. Also, you can just iterate up to len(l)-1, which simplifies the code. The problem in your code was that list.index(store) always finds the first occurrence of store in list.
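The difference between index() and tracking the position yourself can be seen in isolation. A short sketch using the list from the question; the names items, positions_of_a and followers are illustrative:

```python
items = ['a', 'b', 'c', 'a', 'x', 'y', 'b', 'm', 'a', 'z']

# index() always reports the first occurrence, whichever 'a' we are on.
print(items.index('a'))  # 0

# enumerate() yields the real position of every element instead.
positions_of_a = [i for i, item in enumerate(items) if item == 'a']
print(positions_of_a)    # [0, 3, 8]

# Element following each 'a', skipping an 'a' at the very end of the list.
followers = [items[i + 1] for i in positions_of_a if i + 1 < len(items)]
print(followers)         # ['b', 'x', 'z']
```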
This looks like a job for defaultdict. Also, you should avoid using list and dict as variable names, since they shadow the built-in types of the same name.
from collections import defaultdict
# create a dictionary that has default value of an empty list
# for any new key
d = defaultdict(list)
# create the list
my_list = 'a,b,c,a,x,y,b,m,a,z'.split(',')
# create tuples of each item with its following item
for k, v in zip(my_list, my_list[1:]):
    d[k].append(v)
d
# returns:
defaultdict(list,
{'a': ['b', 'x', 'z'],
'b': ['c', 'm'],
'c': ['a'],
'm': ['a'],
'x': ['y'],
'y': ['b']})

Making a dictionary from a list of lists

I have been unable to figure this out. I think the problem might be in the way I am making the list of lists. Can anyone help out? Thanks!
My desired outcome is
codondict = {'A': ['GCT','GCC','GCA','GCG'], 'C': ['TGT','TGC'], &c
but what I get is:
{'A': 'A', 'C': 'C', &c.
Here's my terminal:
A=['GCT','GCC','GCA','GCG']
C=['TGT','TGC']
D=['GAT','GAC']
E=['GAA','GAG']
F=['TTT','TTC']
G=['GGT','GGC','GGA','GGG']
H=['CAT','CAC']
I=['ATT','ATC','ATA']
K=['AAA','AAG']
L=['TTA','TTG','CTT','CTC','CTA','CTG']
M=['ATG']
N=['AAT','AAC']
P=['CCT','CCC','CCA','CCG']
Q=['CAA','CAG']
R=['CGT','CGC','CGA','CGG','AGA','AGG']
S=['TCT','TCC','TCA','TCG','AGT','AGC']
T=['ACT','ACC','ACA','ACG']
V=['GTT','GTC','GTA','GTG']
W=['TGG']
Y=['TAT','TAC']
aminoacids=['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
from collections import defaultdict
codondict=defaultdict(list)
for i in aminoacids:
...     for j in i:  # also tried: for j in list(i)
...         codondict[i] = j
...
codondict
defaultdict(<type 'list'>, {'A': 'A', 'C': 'C', 'E': 'E', 'D': 'D', 'G': 'G', 'F': 'F', 'I': 'I', 'H': 'H', 'K': 'K', 'M': 'M', 'L': 'L', 'N': 'N', 'Q': 'Q', 'P': 'P', 'S': 'S', 'R': 'R', 'T': 'T', 'W': 'W', 'V': 'V', 'Y': 'Y'})
You can try this:
condondict= dict(A=['GCT','GCC','GCA','GCG'],
C=['TGT','TGC'],
D=['GAT','GAC'],
E=['GAA','GAG'],
F=['TTT','TTC'],
G=['GGT','GGC','GGA','GGG'],
H=['CAT','CAC'],
I=['ATT','ATC','ATA'],
K=['AAA','AAG'],
L=['TTA','TTG','CTT','CTC','CTA','CTG'],
M=['ATG'],
N=['AAT','AAC'],
P=['CCT','CCC','CCA','CCG'],
Q=['CAA','CAG'],
R=['CGT','CGC','CGA','CGG','AGA','AGG'],
S=['TCT','TCC','TCA','TCG','AGT','AGC'],
T=['ACT','ACC','ACA','ACG'],
V=['GTT','GTC','GTA','GTG'],
W=['TGG'],
Y=['TAT','TAC'])
The reason to use defaultdict() is to allow access to, and creation of, dictionary values without causing a KeyError, bypassing the form:
if key not in mydict:
    mydict[key] = []
mydict[key].append(something)
If you're not creating new keys dynamically, you don't really need to use defaultdict().
Also, if your keys already represent the amino acids, you can just iterate over the dictionary itself (use .iteritems() instead of .items() on Python 2):
for aminoacid, sequence in condondict.items():
    # do stuff with the data...
    pass
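As for why the loop in the question produced {'A': 'A', ...}: aminoacids holds the strings 'A', 'C', and so on, not the lists bound to the variables A, C, and so on, so iterating over each string yields just that one character. A sketch with a shortened two-entry table; the names wrong and codons are illustrative:

```python
from collections import defaultdict

aminoacids = ['A', 'C']

# What the loop in the question does: i is the string 'A', not the list A,
# so iterating over it yields the single character 'A'.
wrong = defaultdict(list)
for i in aminoacids:
    for j in i:
        wrong[i] = j  # plain assignment also replaces the default list
print(dict(wrong))    # {'A': 'A', 'C': 'C'}

# Mapping the names to the lists explicitly avoids the mix-up.
codons = {'A': ['GCT', 'GCC', 'GCA', 'GCG'], 'C': ['TGT', 'TGC']}
for aminoacid, sequence in codons.items():
    print(aminoacid, sequence)
```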
Another way to do what you need is to use the locals() function, which returns a dictionary containing the whole set of variables of the local scope, with the variable names as the keys and their contents as the values.
for i in aminoacids:
    codondict[i] = locals()[i]
So, you could get the A list, for example, using: locals()['A'].
That's kind of verbose, and is confusing the name of a variable 'A' with its value A. Keeping to what you've got:
aminoacids = { 'A': A, 'C': C, 'D': D ... }
should get you the dictionary you ask for:
{ 'A' : ['GCT', 'GCC', 'GCA', 'GCG'], 'C' : ['TGT', 'TGC'], ... }
where the order of the keys 'A' and 'C' may differ from what you wrote on Python versions before 3.7, because dictionaries there are not guaranteed to preserve insertion order.
You can use globals() built-in too, and dict comprehension:
codondict = {k:globals()[k] for k in aminoacids}
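It's better to rely on locals() instead of globals(), as in stummjr's solution, but you can't do so inside a dict comprehension directly, because the comprehension has its own local scope. In Python 2, a list comprehension passed to dict() works: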
codondict = dict([(k,locals()[k]) for k in aminoacids])
However you can do this:
loc = locals()
codondict = {k:loc[k] for k in aminoacids}
If you change your aminoacids list or the amino acid assignments dynamically, it's better to use something lazier, like:
codondict = lambda: {k:globals()[k] for k in aminoacids}
With this last version you always get an up-to-date dictionary, but codondict is now a callable, so call codondict() to get an actual dict and use codondict()[x] instead of codondict[x]. You can also store a snapshot, e.g. hist = codondict(), in case you need to compare different historical versions of codondict. That's small enough to be useful in interactive sessions, but not recommended in larger programs.
