KeyError: '\n' python 2.7.5 - python

I want to compare my dictionary to my string character by character and, where they match, replace the string's character with the dictionary's value for that key. For example, if A is in the string, it matches A in the dictionary and is replaced with T, which is written to line2_u_rev_comp. However, the error KeyError: '\n' occurs instead. What does this error mean, and how can I get rid of it?
REV_COMP = {
'A': 'T',
'T': 'A',
'C': 'G',
'G': 'C',
'N': 'N',
'U': 'A'
}
tbl = REV_COMP
line2_u_rev_comp = [tbl[k] for k in line2_u_rev[::-1]]
''.join(line2_u_rev_comp)

'\n' means new line, and you can get rid of it (and other extraneous whitespace) using str.strip, e.g.:
line2_u_rev_comp = [tbl[k] for k in line2_u_rev.strip()[::-1]]

line2_u_rev_comp = [tbl.get(k,k) ... ]
this will either get it from the dictionary or return itself
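Putting the two suggestions together, a minimal sketch might look like the following. It is written for Python 3, and the function name reverse_complement and the sample input are made up for illustration; they are not from the question.

```python
REV_COMP = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C', 'N': 'N', 'U': 'A'}

def reverse_complement(line, table=REV_COMP):
    # strip() drops the trailing '\n' (and other surrounding whitespace)
    # that a line read from a file usually carries.
    cleaned = line.strip()
    # table.get(k, k) falls back to the character itself when it has no
    # entry in the table, so unexpected characters never raise KeyError.
    return ''.join(table.get(k, k) for k in reversed(cleaned))

print(reverse_complement("AACGTN\n"))
```

Whether unknown characters should be passed through silently or reported is a design choice; keeping tbl[k] and only stripping the line makes bad input fail loudly instead.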

The problem is that tbl[k] is evaluated without checking whether the key exists in the dict; when it does not, you need to return k itself.
You also need to reverse the list again, since your for statement iterates over the reversed string.
Try this code:
line2_u_rev = "MY TEST IS THIS"
REV_COMP = {
'A': 'T',
'T': 'A',
'C': 'G',
'G': 'C',
'N': 'N',
'U': 'A'
}
tbl = REV_COMP
line2_u_rev_comp = [tbl[k] if k in tbl else k for k in line2_u_rev[::-1]][::-1]
print ''.join(line2_u_rev_comp)
Output:
MY AESA IS AHIS

Related

Decrypt message with random shift of letters

I am writing a program to decrypt a message, given only the assumption that the most frequent letter of the decrypted message is "e". No shift number is given. The code below is what I have so far. I can only hardcode the shift number to decrypt the given message, so if the message changes, my code no longer works.
from collections import Counter
import string
message = "Zyp cpxpxmpc ez wzzv fa le esp delcd lyo yze ozhy le jzfc qppe Ehz ypgpc rtgp fa hzcv Hzcv rtgpd jzf xplytyr lyo afcazdp lyo wtqp td pxaej hteszfe te Escpp tq jzf lcp wfnvj pyzfrs ez qtyo wzgp cpxpxmpc te td espcp lyo ozye esczh te lhlj Depaspy Slhvtyr"
#frequency of each letter
letter_counts = Counter(message)
print(letter_counts) # Print the count of each element in string
#find max letter
maxFreq = -1
maxLetter = None
letter_counts[' '] = 0 # Don't count spaces zero count
for letter, freq in letter_counts.items():
    print(letter, ":", freq)
maxLetter = max(letter_counts, key = letter_counts.get) # Find max freq letter in the string
print("Max Ocurring Letter:", maxLetter)
#right shift for encrypting and left shift for decrypting.
#predict shift
#assume max letter is 'e'
letters = string.ascii_letters #contains 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
shift = 15 #COMPUTE SHIFT HERE (hardcode)
print("Predicted Shift:", shift)
totalLetters = 26
keys = {} #use dictionary for letter mapping
invkeys = {} #use dictionary for inverse letter mapping, you could use inverse search from original dict
for index, letter in enumerate(letters):
    # cypher setup
    if index < totalLetters: #lowercase
        # Dictionary for encryption
        letter = letters[index]
        keys[letter] = letters[(index + shift) % 26]
        # Dictionary for decryption
        invkeys = {val: key for key, val in keys.items()}
    else: #uppercase
        # Dictionary for encryption
        keys[letter] = letters[(index + shift) % 26 + 26]
        # Dictionary for decryption
        invkeys = {val: key for key, val in keys.items()}
print("Cypher Dict", keys)
#decrypt
decryptedMessage = []
for letter in message:
    if letter == ' ': #spaces
        decryptedMessage.append(letter)
    else:
        decryptedMessage.append(keys[letter])
print("Decrypted Message:", ''.join(decryptedMessage)) #join is used to put list into string
# Checking if message is the same as the encrypt message provided
#Encrypt
encryptedMessage = []
for letter in decryptedMessage:
    if letter == ' ': #spaces
        encryptedMessage.append(letter)
    else:
        encryptedMessage.append(invkeys[letter])
print("Encrypted Message:", ''.join(encryptedMessage)) #join is used to put list into string
The encrypt part of the code is not strictly necessary; it is there for checking only. It would be great if someone could help me modify my code or give me some hints for the shift prediction part. Thanks!
Output of the code:
Cypher Dict {'a': 'p', 'b': 'q', 'c': 'r', 'd': 's', 'e': 't', 'f': 'u', 'g': 'v', 'h': 'w', 'i': 'x', 'j': 'y', 'k': 'z', 'l': 'a', 'm': 'b', 'n': 'c', 'o': 'd', 'p': 'e', 'q': 'f', 'r': 'g', 's': 'h', 't': 'i', 'u': 'j', 'v': 'k', 'w': 'l', 'x': 'm', 'y': 'n', 'z': 'o', 'A': 'P', 'B': 'Q', 'C': 'R', 'D': 'S', 'E': 'T', 'F': 'U', 'G': 'V', 'H': 'W', 'I': 'X', 'J': 'Y', 'K': 'Z', 'L': 'A', 'M': 'B', 'N': 'C', 'O': 'D', 'P': 'E', 'Q': 'F', 'R': 'G', 'S': 'H', 'T': 'I', 'U': 'J', 'V': 'K', 'W': 'L', 'X': 'M', 'Y': 'N', 'Z': 'O'}
Decrypted Message: One remember to look up at the stars and not down at your feet Two never give up work Work gives you meaning and purpose and life is empty without it Three if you are lucky enough to find love remember it is there and dont throw it away Stephen Hawking
Encrypted Message: Zyp cpxpxmpc ez wzzv fa le esp delcd lyo yze ozhy le jzfc qppe Ehz ypgpc rtgp fa hzcv Hzcv rtgpd jzf xplytyr lyo afcazdp lyo wtqp td pxaej hteszfe te Escpp tq jzf lcp wfnvj pyzfrs ez qtyo wzgp cpxpxmpc te td espcp lyo ozye esczh te lhlj Depaspy Slhvtyr
This has three components:
Finding the character with max frequency:
test_str = "" # your string
counter = Counter(test_str)
keys = sorted(counter, key=counter.get, reverse=True)
res = keys[1] if keys[0] == " " else keys[0]
Calculating the shift:
shift = ord('e') - ord(res)
Encrypting/decrypting the string, which is trivial since you know the shift now
Something like this should allow you to calculate the shift based on the assumption that the letter in the original message with the highest frequency is 'e':
letter_counts = Counter(message)
e_encrypted = [k for k, v in letter_counts.items() if v == max(count for c, count in letter_counts.items() if c != ' ')][0]
shift = (ord('e') - ord(e_encrypted)) % 26
Or, to unroll the comprehensions for ease of understanding:
letter_counts = Counter(message)
e_encrypted, max_v = None, 0
for k, v in letter_counts.items():
    if v > max_v and k != ' ':
        e_encrypted, max_v = k, v
shift = (ord('e') - ord(e_encrypted)) % 26
It does the following:
take frequency counts of characters in message using the Counter class
find the maximum frequency, and the character with that maximum frequency
set the shift equal to the difference between the ascii value of that character and the letter 'e' (modulo 26)
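The steps above can be combined into a small end-to-end sketch. The helper names guess_shift and apply_shift are hypothetical, and the sketch assumes Python 3 and a plain Caesar shift over ASCII letters; it also folds upper and lower case together when counting, which is a choice of this sketch rather than something the answers above do.

```python
from collections import Counter
import string

message = "Zyp cpxpxmpc ez wzzv fa le esp delcd lyo yze ozhy le jzfc qppe Ehz ypgpc rtgp fa hzcv Hzcv rtgpd jzf xplytyr lyo afcazdp lyo wtqp td pxaej hteszfe te Escpp tq jzf lcp wfnvj pyzfrs ez qtyo wzgp cpxpxmpc te td espcp lyo ozye esczh te lhlj Depaspy Slhvtyr"

def guess_shift(ciphertext):
    # Count letters only (case-insensitively), so spaces never win.
    counts = Counter(ch for ch in ciphertext.lower()
                     if ch in string.ascii_lowercase)
    most_common_letter = counts.most_common(1)[0][0]
    # Shift that maps the most frequent ciphertext letter onto 'e'.
    return (ord('e') - ord(most_common_letter)) % 26

def apply_shift(text, shift):
    # Rotate each letter by `shift` places, keeping its case;
    # anything that is not an ASCII letter is copied unchanged.
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)

shift = guess_shift(message)
print("Predicted Shift:", shift)
print("Decrypted Message:", apply_shift(message, shift))
```

Applying apply_shift again with -shift restores the original text, which gives the same round-trip check as the encrypt step in the question. The guess is only as good as the assumption that 'e' is the most frequent plaintext letter; on short messages it can easily be wrong.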

Replace symbols in a string with conflicting keys in a dictionary

I need a translator that has a dictionary with keys like 's': 'd' and 'sch': 'b'.
That's a rough example, but the point is that when I have an input word like "schto", it needs to be translated as "bkr", substituting 'sch' with 'b'. But there is also the key 's', so the word is translated as "dnokr": 'sch' is never looked up, because 's', 'c' and 'h' are each translated separately first. What is a workaround to make the longer key 'sch' take precedence over the separate 's', 'c' and 'h'?
Here is an example of the code.
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}
code = input("Type: ")
code = "".join([newdict[w] for w in code])
print(code)
In a regular expression alternation, the alternatives are tried from left to right and the first one that matches wins. If you're using a version of Python in which the insertion order of key-value pairs in a dictionary is guaranteed (Python 3.7+), and you insert the key-value pairs so that the longer keys come first, something like this should work for you. re.sub takes either a string with which to replace a match, or a callable (function/lambda/whatever) that accepts the current match as an argument and must return the replacement string:
import re
lookup = {
"sch": "b",
"sh": "q",
"s": "d"
}
def replace(match):
    return lookup[match.group()]
pattern = "|".join(lookup)
print(re.sub(pattern, replace, "schush swim"))
Output:
buq dwim
If you are using Python 3.7+, dictionaries maintain the insertion order of their keys (in CPython 3.6 this was an implementation detail). Hence you can achieve this using str.replace() while iterating over dict.items().
Note that this updates the string cumulatively: each replacement runs on the result of the previous ones. For example, if 'h' is replaced by 'o', then that 'o' will later be replaced by 'r'.
newdict = {'sch': 'b', 'sh': 'q', 'ch': 'w', 's': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r'}
my_word = "schto"
for k, v in newdict.items():
    my_word = my_word.replace(k, v)
where my_word will hold your desired string, 'bkr'.
Here, since dict.items() maintains the insertion order, keys which are defined first are applied first during the iteration. Hence, you can set the priority of your rules by declaring the keys you want to take precedence before the other keys.
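If relying on the order in which the dictionary was written feels fragile, one possible variation is to sort the keys by length and do the whole translation in a single regex pass. This is only a sketch: the function name translate is made up here, and re.escape is added on the assumption that keys might contain regex metacharacters.

```python
import re

# Deliberately written with the short keys first.
newdict = {'s': 'd', 'c': 'n', 'h': 'o', 't': 'k', 'o': 'r',
           'sch': 'b', 'sh': 'q', 'ch': 'w'}

def translate(word, mapping):
    # Try longer keys first so 'sch' is matched before 'sh' or 's',
    # regardless of the order in which the dict was written.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape keeps keys containing regex metacharacters literal.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is replaced exactly once; the replacement text is
    # never scanned again, so 'h' -> 'o' is not turned into 'r' later.
    return pattern.sub(lambda m: mapping[m.group()], word)

print(translate("schto", newdict))
```

Characters that have no key in the mapping are left as they are.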

Replacing Multiple Letters in a String with Each Other in Python [duplicate]

So I understand how to use str.replace() to replace single letters in a string, and I also know how to use the following replace_all function:
def replace_all(text, dic):
    for i, j in dic.iteritems():
        text = text.replace(i,j)
    return text
But I am trying to replace letters with each other. For example replace each A with T and each T with A, each C with G and each G with C, but I end up getting a string composed of only two letters, either A and G or C and T, for example, and I know the output should be composed of four letters. Here is the code I have tried (I'd rather avoid built in functions):
d={'A': 'T', 'C': 'G', 'A': 'T', 'G': 'C'}
DNA_String = open('rosalind_rna.txt', 'r')
DNA_String = DNA_String.read()
reverse = str(DNA_String[::-1])
def replace_all(text, dic):
    for i, j in dic.iteritems():
        text = text.replace(i,j)
    return text
complement = replace_all(reverse, d)
print complement
I also tried using:
complement = str.replace(reverse, 'A', 'T')
complement = str.replace(reverse, 'T', 'A')
complement = str.replace(reverse, 'G', 'C')
complement = str.replace(reverse, 'C', 'G')
But I end up getting a string that is four times as long as it should be.
I've also tried:
complement = str.replace(reverse, 'A', 'T').replace(reverse, 'T', 'A').replace(reverse, 'G', 'C')str.replace(reverse, 'C', 'G')
But I get an error message that an integer input is needed.
You can map each letter to another letter.
>>> M = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
>>> STR = 'CGAATT'
>>> S = "".join([M.get(c,c) for c in STR])
>>> S
'GCTTAA'
You should probably use str.translate for this. Use string.maketrans to create an according transition table.
>>> import string
>>> d ={'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
>>> s = "ACTG"
>>> _from, to = map(lambda t: ''.join(t), zip(*d.items()))
>>> t = string.maketrans(_from, to)
>>> s.translate(t)
'TGAC'
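string.maketrans only exists in Python 2. In Python 3 the equivalent is the static method str.maketrans, which also accepts a dict of single characters directly. A minimal sketch, with the names complement_table and reverse_complement_dna chosen here for illustration:

```python
# Python 3: build the translation table straight from a dict.
complement_table = str.maketrans({'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'})

def reverse_complement_dna(dna):
    # translate() maps every character in one pass, so an A turned
    # into a T is never turned back into an A by a later step.
    return dna[::-1].translate(complement_table)

print("ACTG".translate(complement_table))
print(reverse_complement_dna("CGAATT"))
```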
By the way, the error you get with this line
complement = str.replace(reverse, 'A', 'T').replace(reverse, 'T', 'A')...
is that you are explicitly passing the self argument when it is already passed implicitly. Doing str.replace(reverse, 'A', 'T') is equivalent to reverse.replace('A', 'T'). Accordingly, when you do str.replace(...).replace(reverse, 'T', 'A'), this is equivalent to str.replace(str.replace(...), reverse, 'T', 'A'), i.e. the result of the first replace is inserted as self in the second replace, the other parameters are shifted, and the 'A' is interpreted as the count parameter, which has to be an int.
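A small demonstration of that equivalence, assuming Python 3 (the sample string is made up, and the exact wording of the error message differs between Python versions):

```python
reverse = "GATTACA"

# Calling the method through the class or on the instance is the same.
assert str.replace(reverse, 'A', 'T') == reverse.replace('A', 'T')

error_message = None
try:
    # The string returned by the first replace becomes self, so
    # reverse is taken as old, 'T' as new and 'A' as count.
    str.replace(reverse, 'A', 'T').replace(reverse, 'T', 'A')
except TypeError as exc:
    error_message = str(exc)

print("TypeError:", error_message)
```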
I think this is happening because you're replacing all the As with Ts and then replacing all those Ts (as well as those in the original string) with As. Try replacing with lower-case letters and then converting the whole string with upper():
dic = {'A': 't', 'T': 'a', 'C': 'g', 'G': 'c'}
text = 'GATTCCACCGT'
for i, j in dic.iteritems():
    text = text.replace(i,j)
text = text.upper()
gives:
'CTAAGGTGGCA'

"TypeError: unhashable type: 'list'" yet I'm trying to only slice the value of the list, not use the list itself

I've been having issues trying to create a dictionary by using the values from a list.
alphabetList = list(string.ascii_lowercase)
alphabetList.append(list(string.ascii_lowercase))
alphabetDict = {}
def makeAlphabetDict (Dict, x):
    count = 0
    while count <= len(alphabetList):
        item1 = x[(count + (len(alphabetList) / 2))]
        item2 = item1
        Dict[item1] = item2
        count += 1
makeAlphabetDict(alphabetDict , alphabetList)
Which returns:
TypeError: unhashable type: 'list'
I tried here and other similar questions yet I still can't see why Python thinks I'm trying to use the list, rather than just a slice from a list.
Your list contains a nested list:
alphabetList.append(list(string.ascii_lowercase))
You now have a list with ['a', 'b', ..., 'z', ['a', 'b', ..., 'z']]. It is that last element in the outer list that causes your problem.
You'd normally use list.extend() to add additional elements:
alphabetList.extend(string.ascii_lowercase)
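The difference between the two calls can be seen in a short sketch (Python 3; the variable names nested and flat are only for illustration):

```python
import string

nested = list(string.ascii_lowercase)
nested.append(list(string.ascii_lowercase))  # adds ONE element: a list

flat = list(string.ascii_lowercase)
flat.extend(string.ascii_lowercase)          # adds 26 separate characters

print(len(nested), len(flat))

lookup = {}
try:
    # The last element of nested is itself a list, and lists are
    # unhashable, so it cannot be used as a dictionary key.
    lookup[nested[-1]] = nested[-1]
except TypeError as exc:
    print(exc)
```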
You are using string.ascii_lowercase twice there; perhaps you meant to use ascii_uppercase for one of those strings instead? Even so, your code always uses the same character for both key and value so it wouldn't really matter here.
If you are trying to map lowercase to uppercase or vice-versa, just use zip() and dict():
alphabetDict = dict(zip(string.ascii_lowercase, string.ascii_uppercase))
where zip() produces pairs of characters, and dict() takes those pairs as key-value pairs. The above produces a dictionary mapping lowercase ASCII characters to uppercase:
>>> import string
>>> dict(zip(string.ascii_lowercase, string.ascii_uppercase))
{'u': 'U', 'v': 'V', 'o': 'O', 'k': 'K', 'n': 'N', 'm': 'M', 't': 'T', 'l': 'L', 'h': 'H', 'e': 'E', 'p': 'P', 'i': 'I', 'b': 'B', 'x': 'X', 'q': 'Q', 'g': 'G', 'd': 'D', 'r': 'R', 'z': 'Z', 'c': 'C', 'w': 'W', 'a': 'A', 'y': 'Y', 'j': 'J', 'f': 'F', 's': 'S'}
As Martijn Pieters noted, the problem is the list append, which adds a list inside your other list. You can join the two lists in either of the following ways:
alphabetList = list(string.ascii_lowercase)
alphabetList += list(string.ascii_lowercase)
# Concatenates the two lists; same as alphabetList.extend(string.ascii_lowercase)
alphabetList = list(string.ascii_lowercase) * 2
# Just for your use case, to iterate twice over the alphabet
In either case, your alphabetDict will have only 26 letters and not 52, as you cannot have repeated keys within a dict.

encoding using a random cipher

I'm trying to write a program that takes a long string of letters and characters, and creates a dictionary of {original character:random character}. It should remove characters that have already been assigned a random value.
This is what I have:
import random
all_chars='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,.!?'
def make_encoder(all_chars):
    all_chars=list(all_chars)
    encoder = {}
    for c in range (0,len(all_chars)):
        e = random.choice(all_chars)
        all_chars.remove(e)
        key = all_chars[c]
        encoder[key] = e
    return encoder
I keep getting "index out of range: 33" on line 10, key = all_chars[c]
Here's my whole code, with the first problem fixed:
import random
all_chars='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,.!?'
def make_encoder(all_chars):
    list_chars= list(all_chars)
    all_chars= list(all_chars)
    encoder = {}
    i=0
    while len(encoder) < len(all_chars):
        e = random.choice(all_chars)
        key = all_chars[i]
        if key not in encoder.keys():
            encoder[key] = e
            i += 1
    return encoder
def encode_message(encoder,msg):
    encoded_msg = ""
    for x in msg:
        c = encoder[x]
        encoded_msg = encoded_msg + c
def make_decoder(encoder):
    decoder = {}
    for k in encoder:
        v = encoder[k]
        decoder[v] = k
    return decoder
def decode_message(decoder,msg):
    decoded_msg = ""
    for x in msg:
        c = decoder[x]
        decoded_msg = decoded_msg + c
def main():
    alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ,.!?"
    e = make_encoder(alphabet)
    d = make_decoder(e)
    print(e)
    print(d)
    phrase = input("enter a phrase")
    print(phrase)
    encoded = encode_message(e,phrase)
    print(encoded)
    decoded = decode_message(d,encoded)
    print(decoded)
I now get TypeError: iteration over non-sequence of type NoneType for the line for x in msg:
You are altering the list. Rule of thumb: never alter a list while iterating over it.
The line for c in range (0,len(all_chars)): iterates up to the original length of the list, but at the same time you are removing elements, so the list gets shorter. That is why you get "list index out of range".
Try it like this:
import random
all_chars='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,.!?'
def make_encoder(all_chars):
    all_char = list(all_chars)
    encoder = {}
    i=0
    while len(encoder) < len(all_char):
        e = random.choice(all_char)
        key = all_char[i]
        if key not in encoder.keys():
            encoder[key] = e
            i += 1
    return encoder
output:
>>> make_encoder(all_chars)
{'!': '3', ',': 'l', '.': 'J', '1': 'y', '0': 'l', '3': 'G', '2': ',', '5': '6', '4': 'f', '7': 'f', '6': 'C', '9': 'F', '8': 'y', '?': 'S', 'A': 'm', 'C': 'z', 'B': 'b', 'E': 'J', 'D': '0', 'G': 'S', 'F': 'v', 'I': 'v', 'H': '?', 'K': 'd', 'J': 'X', 'M': 'o', 'L': 'O', 'O': 'Q', 'N': 'P', 'Q': 'Z', 'P': '8', 'S': 'r', 'R': 'h', 'U': 'o', 'T': 'M', 'W': 'l', 'V': '.', 'Y': 'R', 'X': 'C', 'Z': 'a', 'a': 's', 'c': 'Y', 'b': 'X', 'e': 's', 'd': 'd', 'g': 'L', 'f': 'G', 'i': 'm', 'h': 'k', 'k': 'f', 'j': '1', 'm': 'J', 'l': 'L', 'o': '2', 'n': 'N', 'q': 'n', 'p': 'l', 's': 'W', 'r': '7', 'u': 'y', 't': 'S', 'w': 'J', 'v': 'E', 'y': 'r', 'x': 'C', 'z': 'i'}
You're modifying the list as you iterate over it:
for c in range(0,len(all_chars)):
    e = random.choice(all_chars)
    all_chars.remove(e)
The range item range(0,len(all_chars)) is only generated when the for loop starts. That means it will always assume its length is what it started as.
After you remove a character, all_chars.remove(e), now the list is one item shorter than when the for loop started, leading to the eventual over-run.
How about this instead:
while all_chars: # While there are chars left in the list
    ...
You should never modify an iterable while you are iterating over it.
Think about it: you told Python to loop from 0 to the length of the list all_chars, which is 66 in the beginning. But you are constantly shrinking this length with all_chars.remove(e). So, the loop still loops 66 times, but all_chars only has 66 items for the first iteration. Afterwards, it has 65, then 64, then 63, etc.
Eventually, you will run into an IndexError when c equals the length of the list (which happens at c==33). Note that it is not when c is greater than the length because Python indexes start at 0:
>>> [1, 2, 3][3] # There is no index 3 because 0 is the first index
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> [1, 2, 3][2] # 2 is the greatest index
3
>>>
To fix the problem, you can either:
Stop removing elements from all_chars inside the loop. That way, its length will always be 66.
Use a while True: loop and break when all_chars is empty (you run out of characters).
I would recommend making two strings, or at least keeping two separate lists: one to iterate over and one to draw from.
import random
all_chars='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,.!?'
def make_encoder(all_chars):
    list_chars= list(all_chars)
    all_chars= list(all_chars) #<-------------EDIT
    encoder = {}
    for c in all_chars:
        e = random.choice(list_chars)
        list_chars.remove(e)
        key = c #<---------------EDIT
        encoder[key] = e
    return encoder #<--------EDIT, unindented this line.
That was your issue: you were removing items from the list you were iterating through. Making two lists, although a little messy, is the best way.
You don't have to remove it from the initial string (it's bad practice to change an item while iterating over it).
Just check that the item isn't already in the dictionary.
import random
all_chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,.!?'
encoder = {}
n = 0
while len(all_chars) != len(encoder):
    rand = random.choice(all_chars)
    if rand not in encoder:
        encoder[all_chars[n]] = rand
        n += 1
for k,v in sorted(encoder.iteritems()):
    print k,v
By the way, your encoder may work fine this way, but because it is built randomly you cannot regenerate the same mapping later to decode a message. You can fix this by seeding the generator with random.seed('KEY').
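For comparison, here is a sketch of one way to get a mapping that can always be decoded: shuffle a copy of the characters, so every character is used exactly once as a value. It assumes Python 3; the seed parameter and the use of a private random.Random instance are choices made for this sketch, not something taken from the answers above. It also returns the encoded string from encode_message. In the question's code, encode_message and decode_message build a string but never return it, so they return None, which is consistent with the reported TypeError about iterating over NoneType.

```python
import random

ALL_CHARS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ,.!?'

def make_encoder(all_chars, seed=None):
    # A private generator: the same seed always gives the same mapping.
    rng = random.Random(seed)
    shuffled = list(all_chars)
    rng.shuffle(shuffled)          # a permutation, so no value repeats
    return dict(zip(all_chars, shuffled))

def make_decoder(encoder):
    # Safe to invert because the encoder is one-to-one.
    return {v: k for k, v in encoder.items()}

def encode_message(mapping, msg):
    # Returns the translated string (works for decoding too, when
    # given the decoder mapping).
    return ''.join(mapping[ch] for ch in msg)

encoder = make_encoder(ALL_CHARS, seed='KEY')
decoder = make_decoder(encoder)
secret = encode_message(encoder, "Hello, world!")
print(secret)
print(encode_message(decoder, secret))
```

A character that is not in ALL_CHARS still raises KeyError here; whether to pass such characters through unchanged is left open.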
