Breaking the Caesar Cipher with brute force - python

The first two functions are mine:
def rotated(n: int):
    '''Returns a rotated letter if parameter is greater than 26'''
    ALPHABET = 'abcdefghijklmnopqrstuvwxyz'
    if n >= 26:
        n %= 26
    return ALPHABET[n:26] + ALPHABET[:n]

assert rotated(0) == 'abcdefghijklmnopqrstuvwxyz'
assert rotated(26) == 'abcdefghijklmnopqrstuvwxyz'

def Caesar_decrypt(text: str, key: int) -> str:
    '''Returns a decryption of parameter text and key'''
    text = text.lower()
    key_to_zero = str.maketrans(rotated(key),rotated(0))
    return text.translate(key_to_zero)
But my partner worked on the 3rd function:
def Caesar_break(code: str)-> str:
    'Decrypts the coded text without a key'
    file = open('wordlist.txt', 'r')
    dic = []
    dlist = file.readlines()
    wl = []
    l = []
    cl = []
    swl = []
    sw = ''
    for words in code:
        if words.isalnum() or words.isspace():
            l.append(words)
        else:
            l.append(' ')
    Ncode = ''.join(l)
    codelist = Ncode.split()
    high = 0
    for i in range(1,27):
        highesthit = 0
        hit = 0
        out = Caesar_decrypt(Ncode, i)
        e = 0
        l = 0
        while l < len(dlist):
            dic.append(dlist[l].split()[0])
            l += 1
        while e < len(dic):
            if out == dic[e]:
                hit += 1
            e += 1
        if hit > highesthit:
            high = i
            highesthit = hit
    return(Caesar_decrypt(Ncode, high))
I can't contact him right now, so I was wondering if there is a simpler way to break the Caesar code using brute force. My partner used too many random letters in his code, so I can't really understand it.
Note: "wordlist.txt" is a document we downloaded that contains all of the words in the dictionary. Here is the link for reference.
The Caesar_break code is supposed to work like this:
Caesar_break('amknsrcp qagclac') == 'computer science'

Code breaking! Yay! The simplest way to break the Caesar cipher is to assume that your encoded text is representative of the actual language it's in with respect to the frequency of letters. In English, that relative frequency looks kind of like:
"etaoinshrdlcumwfgypbvkjxqz"
# most to least common characters in English according to
# https://en.wikipedia.org/wiki/Letter_frequency
The fastest way to break a Caesar cipher, then, is to create a collections.Counter of the letters in your encrypted phrase, find the most common few, and assume each one (in turn) is e. Calculate the shift from there, and apply the decryption. Test to see if the result is valid English, and ta-da!
import collections

def difference(a: str, b: str) -> int:
    a, b = a.lower(), b.lower()
    return ord(b) - ord(a)

def english_test(wordlist: "sequence of valid english words",
                 text: str) -> bool:
    """english_test checks that every word in `text` is in `wordlist`"""
    # split into words first; iterating over `text` directly yields characters
    return all(word in wordlist for word in text.split())

def bruteforce_caeser(text: str) -> str:
    with open('path/to/wordlist.txt') as words:
        wordlist = {word.strip() for word in words}
        # set comprehension!
    c = collections.Counter(filter(lambda ch: not ch.isspace(), text))
    most_common = c.most_common()  # ordered by most -> least common
    for ch, _ in most_common:
        diff = difference('e', ch)
        # Caesar_decrypt is the function defined in the question
        plaintext = Caesar_decrypt(text, diff)
        if english_test(wordlist, plaintext):
            return plaintext
There's a subtle logic error in this code though, w.r.t. an assumption made about the input text. I'll leave it as an exercise to the student to find the logic error and think of what small change could be made to ensure a result on any input. As a hint: try rotating and then decrypting the following phrase:
Judy I don't think it's right for you to contact such a marksman as this man, for without warning this marksman could shoot and kill you.
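For the original question (a simpler brute-force breaker), one option is to skip the frequency guess and just try all 26 keys, keeping whichever decryption contains the most dictionary words. This is only a sketch: `shift_back`, `break_caesar` and the `wordlist` parameter are names made up here, and the word list is assumed to be a set of lowercase words already loaded from a file such as wordlist.txt.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_back(text: str, key: int) -> str:
    """Undo a Caesar shift of `key` positions (lowercases the text first)."""
    rotated = ALPHABET[key:] + ALPHABET[:key]
    return text.lower().translate(str.maketrans(rotated, ALPHABET))


def break_caesar(ciphertext: str, wordlist: set) -> str:
    """Try every key and return the candidate with the most known words."""
    def score(key: int) -> int:
        candidate = shift_back(ciphertext, key)
        # strip punctuation so "you." still counts as "you"
        return sum(word.strip(string.punctuation) in wordlist
                   for word in candidate.split())

    best_key = max(range(26), key=score)
    return shift_back(ciphertext, best_key)


# tiny stand-in for the real word list (assumption, for illustration only)
print(break_caesar('amknsrcp qagclac', {'computer', 'science'}))
# computer science
```

Counting matching words instead of requiring every word to match means a few unknown words or names in the message should not make the whole attempt fail.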

If you are sure that a ciphertext was encrypted with Caesar, (x+3) mod 26, you can just shift the letters back. I would make all the text lowercase first, then get the ASCII values of all the characters. For example, ASCII(a)=97, so make it 97-97=0; for b make it 98-97=1. Then I would make 2 arrays: 1 for the characters, 1 for the integer values of the characters....

Related

python string.split() and loops

Disclaimer: I'm new to Python.
I need to split a string input and send it to a function that substitutes each character in the string with a different character (like a substitution cipher), but I just don't know how to go about this.
print('Welcome to the encryption protocol for top secret governemt cover ups')
string=input('whats your message?')
def encrypt(string):
    alpha = "abcdefghijklmnopqrstuvwyz"
    sub_alpha = "pokmenliuytrwqazxcvsdfgbhn"
    index=0
    while index < len(string):
        letter=string[index]
I'm not really sure what I'm doing. I'm really bad at Python, and this has had me stumped for 3 days now. I've reviewed my course material and tried videos on YouTube. I'm probably just really, really dumb.
I think the key piece of knowledge you're missing is that strings are iterable. So you can do things like:
for c in "FOO":
    print(c)
# prints "F\nO\nO\n"
And you can find the index of a character within a string with str.index. So you can build up your cyphertext like this:
alpha = "abcdefghijklmnopqrstuvwxyz "  # 'x' was missing from the question's string
cypher = "pokmenliuytrw qazxcvsdfgbhn"
plaintext = "some string"
cyphertext = ""
for c in plaintext:
    char_index = alpha.index(c)
    cyphertext += cypher[char_index]
You can also iterate over things inline - this is called a comprehension. So to transform your string you can do this instead of using the for loop:
cyphertext = "".join(cypher[alpha.index(c)] for c in plaintext)
The example above uses the str.join function to concatenate each character of cyphertext.
Here is a solution that asks the question and then iterates through each letter, finding the index in the alpha key, and replacing it with the sub_alpha key equivalent.
Note this example also checks if it should be lowercase or uppercase.
EDIT: if the input character does not have a valid cipher, it doesn't get altered.
EDIT 2: expanded answer to convert both forwards and backwards.
alpha = "abcdefghijklmnopqrstuvwxyz"  # 'x' was missing from the question's string
sub_alpha = "pokmenliuytrwqazxcvsdfgbhn"

def encrypt(in_char):
    is_lower_case = in_char.islower()
    index = alpha.find(in_char.lower())
    if index < 0:
        return in_char
    elif is_lower_case:
        return sub_alpha[index]
    else:
        return sub_alpha[index].upper()

def decrypt(in_char):
    is_lower_case = in_char.islower()
    index = sub_alpha.find(in_char.lower())
    if index < 0:
        return in_char
    elif is_lower_case:
        return alpha[index]
    else:
        return alpha[index].upper()

print('Welcome to the encryption protocol for top secret governemt cover ups')
input_str=input('whats your message? ')
output_str=""
for letter in input_str:
    output_str += encrypt(letter)
print("Encrypted: ")
print(output_str)
input_str=""
for letter in output_str:
    input_str+= decrypt(letter)
print("Decrypted: ")
print(input_str)
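As an alternative to looking up each character by hand, the same substitution can be sketched with translation tables. Two things here are assumptions rather than part of the question: the table names and helper functions are made up for illustration, and the last letter of the key is changed from 'n' to 'j'. The key as posted contains 'n' twice and no 'j', so a message containing 'z' could not be decrypted back to itself.

```python
import string

ALPHA = string.ascii_lowercase
# Key from the question with its final 'n' replaced by 'j' (assumption),
# so that every letter appears exactly once and decryption is unambiguous.
SUB_ALPHA = "pokmenliuytrwqazxcvsdfgbhj"

# Cover both cases in one table; anything not listed is left untouched.
_LETTERS = ALPHA + ALPHA.upper()
_SUBS = SUB_ALPHA + SUB_ALPHA.upper()
ENCRYPT_TABLE = str.maketrans(_LETTERS, _SUBS)
DECRYPT_TABLE = str.maketrans(_SUBS, _LETTERS)


def encrypt(message: str) -> str:
    return message.translate(ENCRYPT_TABLE)


def decrypt(message: str) -> str:
    return message.translate(DECRYPT_TABLE)


secret = encrypt("Hello, World! xyz")
print(secret)
print(decrypt(secret))
```

Building the tables once and calling str.translate avoids the per-character find call, which may matter for long messages.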

Brute Force Dictionary Attack Caesar Cipher Python Code not working past 18th shift

This was made to brute-force Caesar ciphers using a dictionary file from http://www.math.sjsu.edu/~foster/dictionary.txt. It runs through three functions: lang_lib(), which loads the text of the dictionary into a list; isEnglish(), which checks what percentage of the phrase's words appear in the dictionary and returns True if at least 60% of them match; and caeser(), which runs through all the shifts and checks each result for English words. It should return the result with the highest percentage, but it only seems to work for shifts 1-18. I can't figure out why it isn't working.
def lang_lib():
    file = open('dictionary.txt', 'r')
    file_read = file.read()
    file_split = file_read.split()
    words = []
    for word in file_split:
        words.append(word)
    file.close()
    return words

dictionary = lang_lib()

def isEnglish(text):
    split_text = text.lower().split()
    counter = 0
    not_in_dict = []
    for word in split_text:
        if word in dictionary:
            counter += 1
        else:
            not_in_dict.append(word)
    length = len(split_text)
    text_percent = ((counter / length) * 100)
    #print(text_percent)
    if text_percent >= 60.0:
        return True
    else:
        return False

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%/."

def caeser(text): #Put in text, and it will spit out all possible values
    lower_text = text.lower()
    ciphertext = "" #stores current cipher value
    matches = [] #stores possible matches
    for i in range(len(alphabet)): #loops for the length of input alphabet
        for c in lower_text:
            if c in alphabet:
                num = alphabet.find(c)
                newnum = num - i
                if newnum >= len(alphabet):
                    newnum -= len(alphabet)
                elif newnum < 0:
                    newnum += len(alphabet)
                ciphertext = ciphertext + alphabet[newnum]
            else:
                ciphertext = ciphertext + c
            testing = isEnglish(ciphertext)
            for text in ciphertext:
                if testing == True and len(ciphertext) == len(lower_text):
                    matches.append(ciphertext)
                    return i, matches
        ciphertext = "" #clears ciphertext so it doesn't get cluttered

print(caeser('0x447 #0x$x 74w v0%5')) #shift of 19
print(caeser('zw336 @zw9w 63v uz#4')) #shift of 18
Thanks guys.
This part is indented too far as @tripleee suggested:
        testing = isEnglish(ciphertext)
        for text in ciphertext:
            if testing == True:
                matches.append(ciphertext)
                return i, matches
Also you don't need to check the length if you have the indentation right and let the previous loop complete....
I found out that the dictionary.txt does not contain 2 or 3 letter words, so it would skew long inputs with many of these words, and return False. I added a list of common words, so now all inputs work accurately.
If anyone wants to help me make this code more efficient, I'd love some pointers. I am very new to Python.
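On the efficiency request, a couple of possible pointers, sketched below: keep the dictionary in a set so each lookup is fast, build each candidate with a single join, and score every shift so the best one can be picked instead of returning at the first match. The function names are hypothetical, the alphabet is assumed to be the one from the question, and the small word set in the usage line stands in for the real dictionary file.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%/."


def load_words(path: str) -> set:
    """Read a whitespace-separated word file into a set for fast lookups."""
    with open(path) as handle:
        return set(handle.read().lower().split())


def shift_text(text: str, shift: int) -> str:
    """Move every character found in ALPHABET back by `shift` places."""
    size = len(ALPHABET)
    return "".join(
        ALPHABET[(ALPHABET.index(c) - shift) % size] if c in ALPHABET else c
        for c in text.lower()
    )


def english_fraction(text: str, words: set) -> float:
    """Fraction of whitespace-separated tokens that are known words."""
    tokens = text.split()
    if not tokens:
        return 0.0  # avoids dividing by zero on empty input
    return sum(token in words for token in tokens) / len(tokens)


def best_shift(ciphertext: str, words: set):
    """Return (shift, plaintext) for the highest-scoring shift."""
    candidates = ((s, shift_text(ciphertext, s)) for s in range(len(ALPHABET)))
    return max(candidates, key=lambda pair: english_fraction(pair[1], words))


# stand-in word set (assumption); normally this would be load_words('dictionary.txt')
print(best_shift("zw336 @zw9w 63v uz#4", {"hello", "there", "old", "chum"}))
```

The modulo in shift_text replaces the two manual wrap-around branches, and picking the maximum score removes the need for a fixed 60% cutoff.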

Need fresh eyes on a cipher program

I have been working for a while on a cipher program in Python for an online course. I keep going back and forth between successes and setbacks, and recently thought I had figured it out. That is, until I compared the output I was getting to what the course said I should actually be getting. When I input "The crow flies at midnight!" and a key of "boom", I should be getting back "Uvs osck rmwse bh auebwsih!" but instead get back "Tvs dfci tzufg mu auebwsih!" I am at a loss for what my program is doing, and could use a second look at my program from someone. Unfortunately, I don't have a person in real life to go to lol. Any help is greatly appreciated.
alphabet = "abcdefghijklmnopqrstuvwxyz"
def alphabet_position(letter):
    lower_letter = letter.lower() #Makes any input lowercase.
    return alphabet.index(lower_letter) #Returns the position of input as a number.

def vigenere(text,key):
    m = len(key)
    newList = ""
    for i in range(len(text)):
        if text[i] in alphabet:
            text_position = alphabet_position(text[i])
            key_position = alphabet_position(key[i % m])
            value = (text_position + key_position) % 26
            newList += alphabet[value]
        else:
            newList += text[i]
    return newList

print (vigenere("The crow flies at midnight!", "boom"))
# Should print out Uvs osck rmwse bh auebwsih!
# Actually prints out Tvs dfci tzufg mu auebwsih!
OK. The problem is that the expected cipher skips non-alphabetic characters and carries on with the next key letter at the next letter of the text. In your implementation, the non-alphabetic characters use up a key letter too.
The crow
boo mboo // expected
boo boom // your version
So here is the corrected code:
alphabet = "abcdefghijklmnopqrstuvwxyz"
def alphabet_position(letter):
    lower_letter = letter.lower() #Makes any input lowercase.
    return alphabet.index(lower_letter) #Returns the position of input as a number.

def vigenere(text,key):
    text_lower = text.lower()
    m = len(key)
    newList = ""
    c = 0  # counts only the letters encrypted so far, so the key skips non-letters
    for i in range(len(text)):
        if text_lower[i] in alphabet:
            text_position = alphabet_position(text[i])
            key_position = alphabet_position(key[c % m])
            value = (text_position + key_position) % 26
            if text[i].isupper():
                newList += alphabet[value].upper()
            else:
                newList += alphabet[value]
            c += 1
        else:
            newList += text[i]
    return newList

print (vigenere("The crow flies at midnight!", "boom"))
# Prints out Uvs osck rmwse bh auebwsih!
In your vigenere function, set text = text.lower().
To find such problems, just follow one letter through the code and see what happens. It is easy to see that it doesn't work because 'T' is not in the alphabet but 't' is, so you should convert the text to lowercase.
It looks like the problem is that you didn't remember to handle the spaces. The "m" of "boom" should be used to encrypt the "c" of "crow", not the space between "The" and "crow".
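Both points raised in these answers (case handling, and only using up a key letter when a letter is actually encrypted) can be combined in one short sketch. This is not the course's reference solution, just one way to write it; it uses itertools.cycle for the key and assumes the key contains only letters.

```python
import string
from itertools import cycle


def vigenere(text: str, key: str) -> str:
    """Encrypt `text`, using up a key letter only when a letter is encrypted."""
    # assumes `key` is letters only; index() raises ValueError otherwise
    key_shifts = cycle([string.ascii_lowercase.index(k) for k in key.lower()])
    out = []
    for ch in text:
        if ch in string.ascii_letters:
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + next(key_shifts)) % 26 + base))
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)


print(vigenere("The crow flies at midnight!", "boom"))
# Uvs osck rmwse bh auebwsih!
```

Because next(key_shifts) is only called inside the letter branch, there is no separate counter to keep in step with the text index.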

Selecting specific int values from list and changing them

I have been playing with Python and came across a task from MIT, which is to create a coded message (a Caesar cipher where, for example, you change the letters ABCD in the message to CDEF). This is what I came up with:
Phrase = input('Type message to encrypt: ')
shiftValue = int(input('Enter shift value: '))
listPhrase = list(Phrase)
listLenght = len(listPhrase)
ascii = []
for ch in listPhrase:
    ascii.append(ord(ch))
print (ascii)
asciiCoded = []
for i in ascii:
    asciiCoded.append(i+shiftValue)
print (asciiCoded)
phraseCoded = []
for i in asciiCoded:
    phraseCoded.append(chr(i))
print (phraseCoded)
stringCoded = ''.join(phraseCoded)
print (stringCoded)
The code works but I have to implement not shifting the ascii value of spaces and special signs in message.
So my idea is to select values in list in range of range(65,90) and range(97,122) and change them while I do not change any others. But how do I do that?
If you want to use that gigantic code :) to do something as simple as that, then you keep a check like so:
asciiCoded = []
for i in ascii:
    if 65 <= i <= 90 or 97 <= i <= 122: # only letters get changed
        asciiCoded.append(i+shiftValue)
    else:
        asciiCoded.append(i)
But you know what, python can do the whole of that in a single line, using list comprehension. Watch this:
Phrase = input('Type message to encrypt: ')
shiftValue = int(input('Enter shift value: '))
# encoding to cypher, in single line
stringCoded = ''.join(chr(ord(c)+shiftValue) if c.isalpha() else c for c in Phrase)
print(stringCoded)
A little explanation: the list comprehension boils down to this for loop, which is easier to comprehend. Caught something? :)
temp_list = []
for c in Phrase:
    if c.isalpha():
        # shift if the c is alphabet
        temp_list.append(chr(ord(c)+shiftValue))
    else:
        # no shift if c is no alphabet
        temp_list.append(c)
# join the list to form a string
stringCoded = ''.join(temp_list)
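One thing the versions above do not handle is wrap-around: shifting 'z' by 3 gives '}' rather than 'c'. A minimal sketch that wraps within the alphabet and keeps the case is shown here. The function name caesar_shift is made up for illustration, and it only treats the ASCII letters A-Z and a-z as shiftable.

```python
def caesar_shift(message: str, shift: int) -> str:
    """Shift ASCII letters by `shift`, wrapping around; leave the rest alone."""
    result = []
    for c in message:
        if 'a' <= c <= 'z':
            base = ord('a')
        elif 'A' <= c <= 'Z':
            base = ord('A')
        else:
            result.append(c)  # spaces, digits and punctuation are not shifted
            continue
        # position in the alphabet, shifted, wrapped to 0-25, back to a character
        result.append(chr((ord(c) - base + shift) % 26 + base))
    return ''.join(result)


print(caesar_shift("xyz ABC!", 3))
# abc DEF!
```

A negative shift undoes the encoding, since Python's % always returns a non-negative result for a positive modulus.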
It is much easier to use the maketrans function from the string module:
>>> import string
>>> caesar = string.maketrans('ABCD', 'CDEF')
>>> s = 'CAD BA'
>>> print s
CAD BA
>>> print s.translate(caesar)
ECF DC
EDIT: This was for Python 2.7
With 3.5 just do
caesar = str.maketrans('ABCD', 'CDEF')
And an easy function to return a mapping.
>>> import string
>>> def encrypt(shift):
...     alphabet = string.ascii_uppercase
...     move = (len(alphabet) + shift) % len(alphabet)
...     map_to = alphabet[move:] + alphabet[:move]
...     return str.maketrans(alphabet, map_to)
...
>>> "ABC".translate(encrypt(4))
'EFG'
This function uses modulo addition to construct the encrypted caesar string.
asciiCoded = []
final_ascii = ""
for i in ascii:
    final_ascii = i+shiftValue #add shiftValue to ascii value of character
    if final_ascii in range(65,91) or final_ascii in range(97,123): #Condition to skip the special characters
        asciiCoded.append(final_ascii)
    else:
        asciiCoded.append(i)
print (asciiCoded)

Caesar Cipher Recursion

I am attempting to finish a problem involving decoding a string of text encoded with multiple levels of a Caesar cipher. It seems to work for the first shift by returning Do but it will not recurse. I have print statements throughout showing the snippets I am using and putting into the functions and they seem to be correct.
def build_decoder(shift):
    cipher = build_coder(shift)
    decoder = {}
    for k, v in cipher.items():
        decoder[v] = k
    return decoder

def is_word(wordlist, word):
    word = word.lower()
    word = word.strip(" !@#$%^&*()-_+={}[]|\:;'<>?,./\"")
    return word in wordlist

def apply_coder(text, coder):
    encrypted = []
    for character in text:
        if character in coder.keys():
            encrypted.append(coder[character])
        else:
            encrypted.append(character)
    return ''.join(encrypted)

def apply_shift(text, shift):
    coder = build_coder(shift)
    return apply_coder(text, coder)

def apply_shifts(text, shifts):
    for index, shift in shifts:
        text = (text[:index]) + (apply_coder(text[index:], build_coder(shift)))
    return text

def find_best_shifts_rec(wordlist, text, start=0):
    """
    text: scrambled text to try to find the words for
    start: where to start looking at shifts
    returns: list of tuples. each tuple is (position in text, amount of shift)
    """
    key = []
    for shift in range(28):
        message = text[:start] + apply_shift(text[start:], -shift) #Concatenate text from beginning to start with an apply_shift from start to end of text.
        space = message[start:].find(" ") #Find next space from start to " " character.
        if is_word(wordlist, message[start:space]): #If text from start to space is a word.
            print message[start:space]
            key.append((start, shift)) #Add position and shift as tuple in list key.
            print key
            print len(message[:start]), message[:start]
            print len(message[start:space]), message[start:space]
            print len(message), message
            print message[:space]
            print message[space+1:]
            print message[space+1]
            if not(is_word(wordlist, message[start:])):
                return message[:space] + find_best_shifts_rec(wordlist, message, space+1) #Return text from beginning to space(decrypted) and recursively call find_best_shifts_rec on rest of text.
            else:
                return message[start:]
    print "No shift match found, closest match:"
    print key
    return ''

s = apply_shifts("Do Androids Dream of Electric Sheep?", [(0,6), (3, 18), (12, 16)])
print find_best_shifts_rec(wordlist, s)
Output:
Do
[(0, 6)]
0
2 Do
36 Do Sevif vjrKylhtgvmgLslj ypjgZollw?
Do
Sevif vjrKylhtgvmgLslj ypjgZollw?
S
No shift match found, closest match:
[]
Do
I assume this is for the MIT 6.00 Course? I wrote a working find_best_shifts and find_best_shifts_rec. This is my first experience coding so I'm sure my code can be improved, but it does work, so you might be able to use it as a baseline to improve upon.
def find_best_shifts(wordlist, text):
    global shifts
    shifts = []
    return find_best_shifts_rec(wordlist, text, 0)

def find_best_shifts_rec(wordlist, text, start):
    for shift in range(28):
        decoded = apply_shift(text[start:], shift)
        words = decoded.split()
        decoded = text[:start] + decoded
        string_split = decoded.split()
        size = len(string_split)
        correct_words = 0
        if is_word(wordlist, words[0]):
            if shift != 0:
                shifts.append((start,shift))
            new_start = start + len(words[0]) + 1
            if new_start >= len(text)-1:
                return shifts
            else:
                return find_best_shifts_rec(wordlist, decoded, start=new_start)
        for j in string_split:
            if is_word(wordlist, j):
                correct_words += 1
        if correct_words == size:
            return shifts
I'm pretty sure these lines don't do what you intend them to do:
space = message[start:].find(" ")
if is_word(wordlist, message[start:space]):
space is the index of the first space within the slice message[start:], but you're using it as an index into the whole message. For your later slice to work, it should be message[start:start+space]. The other places you use space in the later code should also probably be start+space.
Now, that may not be the only error, but it is the first obvious one I see. I can't actually run your code to test for other errors, because you haven't provided the build_coder function that is called by your other stuff (nor a wordlist, and who knows what else).
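To illustrate the index mix-up on its own, here is a small Python 3 sketch (the question's code is Python 2). str.find accepts an optional start position and then returns an index into the whole string, which avoids the slice-relative offset altogether. The helper name next_space is made up for this example.

```python
def next_space(message: str, start: int) -> int:
    """Index of the first space at or after `start`, or len(message) if none."""
    space = message.find(" ", start)
    return len(message) if space == -1 else space


message = "Do Androids Dream"
start = 3

relative = message[start:].find(" ")   # index within the slice
absolute = next_space(message, start)  # index within the whole message

# the slice-relative index has to be offset by `start` before it is reused
print(message[start:start + relative], message[start:absolute])
```

Returning len(message) when no space is left also covers the last word, where find would otherwise give -1 and the slice would silently drop the final character.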
