binary to string, better than a dictionary?

Objective: Convert binary to string
Example: 0111010001100101011100110111010001100011011011110110010001100101 -> testcode (without space)
I currently use a dictionary and the function below. I am looking for a better, more efficient way.
from textwrap import wrap
DICO = {'\x00': '00', '\x04': '0100', '\x08': '01000', '\x0c': '01100',
'\x10': '010000', '\x14': '010100', '\x18': '011000', '\x1c': '011100',
' ': '0100000', '$': '0100100', '(': '0101000', ',': '0101100', '0': '0110000',
'4': '0110100', '8': '0111000', '<': '0111100', '@': '01000000',
'D': '01000100', 'H': '01001000', 'L': '01001100', 'P': '01010000',
'T': '01010100', 'X': '01011000', '\\': '01011100', '`': '01100000',
'd': '01100100', 'h': '01101000', 'l': '01101100', 'p': '01110000',
't': '01110100', 'x': '01111000', '|': '01111100', '\x03': '011',
'\x07': '0111', '\x0b': '01011', '\x0f': '01111', '\x13': '010011',
'\x17': '010111', '\x1b': '011011', '\x1f': '011111', '#': '0100011',
"'": '0100111', '+': '0101011', '/': '0101111', '3': '0110011', '7': '0110111',
';': '0111011', '?': '0111111', 'C': '01000011', 'G': '01000111',
'K': '01001011', 'O': '01001111', 'S': '01010011', 'W': '01010111',
'[': '01011011', '_': '01011111', 'c': '01100011', 'g': '01100111',
'k': '01101011', 'o': '01101111', 's': '01110011', 'w': '01110111',
'{': '01111011', '\x7f': '01111111', '\x02': '010', '\x06': '0110',
'\n': '01010', '\x0e': '01110', '\x12': '010010', '\x16': '010110',
'\x1a': '011010', '\x1e': '011110', '"': '0100010', '&': '0100110',
'*': '0101010', '.': '0101110', '2': '0110010', '6': '0110110', ':': '0111010',
'>': '0111110', 'B': '01000010', 'F': '01000110', 'J': '01001010',
'N': '01001110', 'R': '01010010', 'V': '01010110', 'Z': '01011010',
'^': '01011110', 'b': '01100010', 'f': '01100110', 'j': '01101010',
'n': '01101110', 'r': '01110010', 'v': '01110110', 'z': '01111010',
'~': '01111110', '\x01': '01', '\x05': '0101', '\t': '01001', '\r': '01101',
'\x11': '010001', '\x15': '010101', '\x19': '011001', '\x1d': '011101',
'!': '0100001', '%': '0100101', ')': '0101001', '-': '0101101',
'1': '0110001', '5': '0110101', '9': '0111001', '=': '0111101',
'A': '01000001', 'E': '01000101', 'I': '01001001', 'M': '01001101',
'Q': '01010001', 'U': '01010101', 'Y': '01011001', ']': '01011101',
'a': '01100001', 'e': '01100101', 'i': '01101001', 'm': '01101101',
'q': '01110001', 'u': '01110101', 'y': '01111001', '}': '01111101'}
def decrypt(binary):
    """Function to convert binary into string"""
    binary = wrap(binary, 8)
    ch = ''
    for b in binary:
        for i, j in DICO.items():
            if j == b:
                ch += i
    return ch
Thanks in advance.

''.join([ chr(int(p, 2)) for p in wrap(binstr, 8) ])
What this does: wrap first splits your string into chunks of 8. Then I iterate through each chunk and convert it to an integer (base 2). Each of those integers is then converted to a character with chr. Finally, I wrap it all up with ''.join to smash everything together.
A bit more of a breakdown of each step of the chr(int(p, 2)):
>>> int('01101010', 2)
106
>>> chr(106)
'j'
To make it fit into your pattern above:
def decrypt(binary):
    """Function to convert binary into string"""
    binary = wrap(binary, 8)
    ch = ''
    for b in binary:
        ch += chr(int(b, 2))
    return ch
or
def decrypt(binary):
    """Function to convert binary into string"""
    return ''.join([ chr(int(p, 2)) for p in wrap(binary, 8) ])
This is definitely faster since it is just doing the math in place, not iterating through the dictionary over and over. Plus, it is more readable.
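As a minimal sketch of the same idea without textwrap, the chunks can also be taken by plain slicing. The names `SAMPLE` and `decrypt_via_bytes` are made up for this illustration, and the length check is an assumption about how malformed input should be handled:

```python
SAMPLE = "0111010001100101011100110111010001100011011011110110010001100101"


def decrypt(binary):
    """Convert a string of 8-bit groups into text, one chr() per group."""
    if len(binary) % 8:
        # Assumption: reject input that is not a whole number of bytes.
        raise ValueError("length must be a multiple of 8")
    return "".join(
        chr(int(binary[i:i + 8], 2)) for i in range(0, len(binary), 8)
    )


def decrypt_via_bytes(binary):
    """Same result for ASCII data: parse everything as one big integer."""
    if not binary or len(binary) % 8:
        raise ValueError("length must be a non-zero multiple of 8")
    return int(binary, 2).to_bytes(len(binary) // 8, "big").decode("ascii")


print(decrypt(SAMPLE))
print(decrypt_via_bytes(SAMPLE))
```

Both functions decode the sample from the question; which one is faster for a given input size is something to measure rather than assume.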

If execution speed is the most important thing for you, why not invert the roles of keys and values in your dict? (If you also need the current dict, you could create an inverted version like this: DICO_INVERTED = {v: k for k, v in DICO.items()})
Now you find the translation directly by key instead of looping through the whole dict.
Your new function would look like this:
def decrypt2(binary):
    """Function to convert binary into string"""
    binary = wrap(binary, 8)
    ch = ''
    for b in binary:
        if b in DICO_INVERTED:
            ch += DICO_INVERTED[b]
    return ch
Depending on the size of your binary string, you could gain some time by changing the way you construct your output-string (see Efficient String Concatenation in Python or performance tips - string concatenation). Using join seems promising. I would give it a try: ''.join(DICO_INVERTED.get(b, '') for b in binary)
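Put together, the inverted lookup plus join could look roughly like the sketch below. To keep it self-contained, `DICO_INVERTED` is built here as a hypothetical stand-in (7-bit ASCII, zero-padded to 8 bits) rather than from the question's DICO:

```python
# Hypothetical stand-in for the inverted dictionary: '01110100' -> 't', etc.
DICO_INVERTED = {format(code, "08b"): chr(code) for code in range(128)}


def decrypt2(binary):
    """Translate each 8-bit chunk with a single dict lookup."""
    chunks = (binary[i:i + 8] for i in range(0, len(binary), 8))
    # Unknown chunks are silently skipped, like the `if b in ...` test above.
    return "".join(DICO_INVERTED.get(chunk, "") for chunk in chunks)


print(decrypt2("011101000110010101110011"))
```

Each lookup is a constant-time hash access on average, so the cost grows with the length of the input instead of with the input length times the dictionary size.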

Did you try
def decrypt(binary):
    """Function to convert binary into string"""
    return ''.join(chr(int(''.join(p), 2)) for p in grouper(8, binary, ''))
where grouper is taken from here http://docs.python.org/library/itertools.html#recipes (it yields tuples of characters, so each group is joined back into a string before int is called)
or
def decrypt2(binary):
    """Function to convert binary into string"""
    return ''.join(DICO_INVERTED[''.join(p)] for p in grouper(8, binary, ''))
That avoids creating a temporary list.
EDIT
Since this was chosen as the accepted answer, I have to confess that I built on the other answers. The point here is to use a generator expression and iterators rather than a list comprehension.
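For reference, here is a self-contained sketch of the grouper approach in Python 3. The recipe yields tuples of characters rather than strings, so each group is joined before int is called. The argument order `grouper(iterable, n, fillvalue)` follows the current itertools documentation and differs from the call shown above:

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length groups (itertools recipe)."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def decrypt(binary):
    """Decode 8-bit groups lazily, without building an intermediate list."""
    return "".join(
        chr(int("".join(bits), 2)) for bits in grouper(binary, 8, "")
    )


print(decrypt("0110001101101111011001000110010101110010"))
```

The generator expression feeds join one character at a time, so no list of chunks is held in memory.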

Related

Regex to turn a list of strings into numbers following a specific rule

Suppose I have a list of the form:
lst = ["5kxn"] # 1 string only for example
Here 5k denotes 5000 and xn denotes n repetitions, so the processed list should be:
[5*1e3 for i in range(n)] # float values
#Not this literally but a list of n 5000's.
I am aware I can do this using non-re methods, but that could be bug-prone, and my re skills are not good enough to come up with a method to pull off this conversion.
Here is a dictionary of multipliers:
replace_dict = {'a': '1e-18', 'f': '1e-15', 'p': '1e-12',
'n': '1e-9', 'u': '1e-6', 'm': '1e-3',
'c': '1e-2', 'd': '1e-1', 'da': '1e1',
'h': '1e2', 'k': '1e3', 'M': '1e6',
'G': '1e9', 'T': '1e12', "P": '1e15',
'E': '1e18'}
Desired output is a list. For example ["2kx1","3kx2","4k"] will be [2000.0,3000.0,3000.0,4000.0]

import re
replace_dict = {'a': '1e-18', 'f': '1e-15', 'p': '1e-12',
'n': '1e-9', 'u': '1e-6', 'm': '1e-3',
'c': '1e-2', 'd': '1e-1', 'da': '1e1',
'h': '1e2', 'k': '1e3', 'M': '1e6',
'G': '1e9', 'T': '1e12', "P": '1e15',
'E': '1e18'}
def str_to_list(list_str):
    regex = re.compile(r"([0-9]+)([^x]+)(x[0-9]+)?")
    list_numbers = []
    for string in list_str:
        parsed = re.findall(regex, string)[0]
        n = 1 if parsed[2] == '' else int(parsed[2].replace('x', ''))
        list_numbers += [float(parsed[0]) * eval(replace_dict[parsed[1]])] * n
    return list_numbers
result = str_to_list(["2kx1","3kx2","4k"])
print(result) # [2000.0, 3000.0, 3000.0, 4000.0]
A bit of explanation:
([0-9]+): captures what comes before the unit prefix, e.g. k.
([^x]+): captures the unit prefix (anything that is not an "x"). This could be refined so it only accepts the letters defined in replace_dict. The + is needed because of the 'da' prefix.
(x[0-9]+)?: captures the multiplier, e.g. x2, if exists.
The method re.findall returns a list, in this case containing a tuple with the groups captured by the regex, e.g.: [('22', 'k', 'x2')] for "22kx2". We take [0] to work directly with the tuple.
If the "xn" is missing, re.findall will return an empty string for the group (x[0-9]+)? since there's not any match, e.g.: [('2', 'k', '')] for "2k". That's why n is 1 if that string is empty, else we discard the x (replacing it with an empty string, i.e. replace('x', '')), so we only take the number, e.g. 2 in "x2".
Finally, we concatenate the resulting list to list_numbers, e.g. list_numbers += [2 * eval("1e3")] * 2 in case of "2kx2".
Hope that was clear enough :)
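The refinement mentioned above (accepting only the prefixes defined in the dictionary) could be sketched as follows. This variant is an assumption-laden rewrite, not the code above: it stores the multipliers as floats so that eval is not needed, and the names `PREFIXES` and `PATTERN` are made up for the example:

```python
import re

# Same multipliers as replace_dict, stored as floats instead of strings.
PREFIXES = {'a': 1e-18, 'f': 1e-15, 'p': 1e-12, 'n': 1e-9, 'u': 1e-6,
            'm': 1e-3, 'c': 1e-2, 'd': 1e-1, 'da': 1e1, 'h': 1e2,
            'k': 1e3, 'M': 1e6, 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18}

# Longest prefixes first, so that 'da' is tried before 'd'.
_alternation = "|".join(
    sorted(map(re.escape, PREFIXES), key=len, reverse=True)
)
PATTERN = re.compile(rf"(\d+)({_alternation})(?:x(\d+))?")


def str_to_list(items):
    """Expand strings like '3kx2' into repeated float values."""
    numbers = []
    for item in items:
        match = PATTERN.fullmatch(item)
        if match is None:
            raise ValueError(f"cannot parse {item!r}")
        value, prefix, count = match.groups()
        numbers.extend([float(value) * PREFIXES[prefix]] * int(count or 1))
    return numbers


print(str_to_list(["2kx1", "3kx2", "4k"]))
```

Using fullmatch means that trailing garbage such as "2kx" or an unknown prefix raises an error instead of being partially parsed.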

Here is a regex based approach which ultimately uses the eval() function to evaluate a string expression to generate a list:
import re
def get_list(inp):
    replace_dict = {'a': '1e-18', 'f': '1e-15', 'p': '1e-12',
                    'n': '1e-9', 'u': '1e-6', 'm': '1e-3',
                    'c': '1e-2', 'd': '1e-1', 'da': '1e1',
                    'h': '1e2', 'k': '1e3', 'M': '1e6',
                    'G': '1e9', 'T': '1e12', 'P': '1e15',
                    'E': '1e18'}
    # Groups: leading digits, unit prefix, repeat count (after the "x")
    parts = re.findall(r'^(\d+)(\w+)x(\w+)$', inp)
    expr = "[" + parts[0][0] + "*" + replace_dict[parts[0][1]] + " for i in range(" + parts[0][2] + ")]"
    return expr
expr = get_list("5kxn")
print(expr) # [5*1e3 for i in range(n)]
n = 5
lst = eval(expr)
print(lst) # [5000.0, 5000.0, 5000.0, 5000.0, 5000.0]

Characters incorrectly translated from Morse Code to English

I started to translate Morse code to English and there is an issue. Here is my code:
morse_dict = {
'a': '.-',
'b': '-...',
'c': '-.-.',
'd': '-..',
'e': '.',
'f': '..-.',
'g': '--.',
'h': '....',
'i': '..',
'j': '.---',
'k': '-.-',
'l': '.-..',
'm': '--',
'n': '-.',
'o': '---',
'p': '.--.',
'q': '--.-',
'r': '.-.',
's': '...',
't': '-',
'u': '..-',
'v': '...-',
'w': '.--',
'x': '-..-',
'y': '-.--',
'z': '--..',
}
def morse_decrypt(message):
    m1 = message.split()
    new_str = []
    letter = ''
    for i,n in morse_dict.items():
        if n in m1:
            letter = str(i)
            new_str.append(letter)
    return ''.join(new_str)
print(morse_decrypt('... --- ...'))
>>>os
But when I use the function, it prints each character only once. I don't know what the problem is. What am I doing wrong?

Your morse_dict translates alphabetical letters into Morse code letters. But you want the reverse since you're trying to decrypt rather than encrypt. Either rewrite your dictionary or use
morse_to_alpha = dict(map(reversed, morse_dict.items()))
to flip the key-value pairs.
Once you've done that, then you can look up each message chunk in the translation dictionary (rather than the other way around):
def morse_decrypt(message):
    morse_to_alpha = dict(map(reversed, morse_dict.items()))
    return "".join(map(morse_to_alpha.get, message.split()))
This still breaks encapsulation. morse_to_alpha should be made into a parameter so you're not accessing global state, and it's wasteful to flip the dict for every translation. I'll leave these adjustments for you to handle.
It's also unclear how to handle errors; this raises a (not particularly clearly named) exception if the Morse code is invalid.
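One possible way to make those adjustments is sketched below. The reverse table is built once, passed in as a parameter, and an unknown symbol raises a ValueError. The choice of ValueError and the parameter name `table` are assumptions for this illustration:

```python
MORSE_DICT = {
    'a': '.-', 'b': '-...', 'c': '-.-.', 'd': '-..', 'e': '.', 'f': '..-.',
    'g': '--.', 'h': '....', 'i': '..', 'j': '.---', 'k': '-.-', 'l': '.-..',
    'm': '--', 'n': '-.', 'o': '---', 'p': '.--.', 'q': '--.-', 'r': '.-.',
    's': '...', 't': '-', 'u': '..-', 'v': '...-', 'w': '.--', 'x': '-..-',
    'y': '-.--', 'z': '--..',
}

# Flip the mapping once, at import time, instead of on every call.
MORSE_TO_ALPHA = {code: letter for letter, code in MORSE_DICT.items()}


def morse_decrypt(message, table=MORSE_TO_ALPHA):
    """Translate space-separated Morse symbols, in message order."""
    letters = []
    for symbol in message.split():
        if symbol not in table:
            raise ValueError(f"unknown Morse symbol: {symbol!r}")
        letters.append(table[symbol])
    return "".join(letters)


print(morse_decrypt('... --- ...'))
```

Because the loop runs over the message rather than over the dictionary, repeated letters are kept and the output follows the order of the input.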

You have a dictionary with the letter as the key and the code as the value.
Python dictionaries are looked up by key, not by value (unfortunately), but there is a way around this, as you probably found.
Pull the dictionary into items for letter and code like you were doing, but put the code to look up in the outer for loop. :)
def morse_decrypt(message):
    global morse_dict
    msgList = message.split(" ")
    msgEnglish = ""
    for codeLookup in msgList:
        for letter, code in morse_dict.items():
            if(code == codeLookup):
                msgEnglish += letter
    return msgEnglish
print(morse_decrypt('... --- ...'))
references:
Get key by value in dictionary
https://www.w3schools.com/python/python_dictionaries.asp

Syntax Error: Unexpected EOF While Parsing (Small Code)

https://imgur.com/a/AVsyR
My code. No errors show up on the editor.
def word_to_code(word):
    #TODO1
    myTranslatedWord = ""
    for a in range(0, len(word)):
        for b in range(0, len(code)):
            if(word[a] == code[b]):
                myTranslatedWord += code[b]
    print(myTranslatedWord)
    return(myTranslatedWord)
code = {'A': '=.===',
'B': '===.=.=.=',
'C': '===.=.===.=',
'D': '===.=.=',
'E': '=',
'F': '=.=.===.=',
'G': '===.===.=',
'H': '=.=.=.=',
'I': '=.=',
'J': '=.===.===.===',
'K': '===.=.===',
'L': '=.===.=.=',
'M': '===.===',
'N': '===.=',
'O': '===.===.===',
'P': '=.===.===.=',
'Q': '===.===.=.===',
'R': '=.===.=',
'S': '=.=.=',
'T': '===',
'U': '=.=.===',
'V': '=.=.=.===',
'W': '=.===.===',
'X': '===.=.=.=.===',
'Y': '===.=.===.===',
'Z': '===.====.=.='}
print((word_to_code("PAPI"))
This is for a class where I'm trying to independently problem solve an objective. For some reason though my code is not working.

You have one too many parentheses in the last line. It should be
print(word_to_code("PAPI"))
In the future, copy the actual text of the error into the question instead of as a picture.
Note that there's also a fairly large problem with how you do the translation. Remember what dictionaries are good for - you should be able to write that function with only a single loop.
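A single-loop version might look like the sketch below. It is only an illustration: `CODE` is a three-letter excerpt of the table from the question, and the `sep` parameter is a hypothetical addition for placing a separator between letters (the default keeps the plain concatenation of the original code):

```python
# Excerpt of the question's table, just enough for the example word.
CODE = {'A': '=.===', 'I': '=.=', 'P': '=.===.===.='}


def word_to_code(word, table=CODE, sep=''):
    """Look each letter up directly by key: one pass over the word."""
    return sep.join(table[letter] for letter in word.upper())


print(word_to_code("PAPI"))
```

A letter that is missing from the table raises a KeyError here, which at least makes the gap visible instead of silently dropping the letter.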

Printing out all the possibilites of ambiguous Morse code

I've been tasked with a problem for school and it's left me stumped. What I have to do is read in an ambiguous Morse Code string (i.e. without any spaces to state what is a letter and what is not) and print out what all the possible valid english translations for that Morse Code could be. I've seen an algorithm to solve this exact problem somewhere on the internet but have no idea how to convert it to Python 3 and can not find it for the life of me.
Some helpful things:
I have a list of words which the program considers valid: Download
The program does not need to output grammatically correct sentences, only sentences that form words that are valid and in words.txt.
Some extra things that define if a sentence is valid or not is that the sentence cannot have two identical words; all words must be unique, and there cannot be more than one 1-letter word and one 2-letter word in the sentence.
My code, which at the moment is incomplete but sorts all the words into their corresponding Morse Code definitions:
# Define the mapping from letter to Morse code.
CODES = {
'A': '.-',
'B': '-...',
'C': '-.-.',
'D': '-..',
'E': '.',
'F': '..-.',
'G': '--.',
'H': '....',
'I': '..',
'J': '.---',
'K': '-.-',
'L': '.-..',
'M': '--',
'N': '-.',
'O': '---',
'P': '.--.',
'Q': '--.-',
'R': '.-.',
'S': '...',
'T': '-',
'U': '..-',
'V': '...-',
'W': '.--',
'X': '-..-',
'Y': '-.--',
'Z': '--..',
}
words={}
f=open('words.txt').read()
a=f
for i in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
    a=a.replace(i,CODES[i])
f=f.split('\n')
a=a.split('\n')
for i in f:
    words[i]=a[f.index(i)]
q=input('Morse: ')
An example test case of how this would work is:
Morse: .--....-....-.-..-----.
A BED IN DOG
A DID IN DOG
A BLUE DOG
A TEST IF DOG
WEST I IN DOG
WEST EVEN A ON
WEST IF DOG

To complete the program you need a recursive algorithm, because there are so many possible combinations of words.
I have renamed your variables so that it is easier to see what data they hold.
The decode function is the recursive part. Its first check is the base case: if the remaining Morse is empty, this branch is finished and the English built so far is printed.
Otherwise the function checks whether the first i symbols of the Morse form a word. i starts at 1, the shortest possible Morse word, and goes up to the length of the longest Morse word in the file. The while loop also guards against running past the end of the input by checking that i is not greater than the length of the remaining Morse.
The function must not modify its own arguments, because other words may still be found in the same call and would clash, so new variables are created for the extended English and the shortened Morse. Before recursing, it checks whether a one-letter or two-letter word has already been used and whether another one is still allowed.
from string import ascii_uppercase
#Defines the letter to Morse mapping
code = {
'A': '.-',
'B': '-...',
'C': '-.-.',
'D': '-..',
'E': '.',
'F': '..-.',
'G': '--.',
'H': '....',
'I': '..',
'J': '.---',
'K': '-.-',
'L': '.-..',
'M': '--',
'N': '-.',
'O': '---',
'P': '.--.',
'Q': '--.-',
'R': '.-.',
'S': '...',
'T': '-',
'U': '..-',
'V': '...-',
'W': '.--',
'X': '-..-',
'Y': '-.--',
'Z': '--..'
}
#Opens the file and reads all words
with open("words.txt", "r") as file:
    words = file.read()
morse = words
# Converts all words to morse
for letter in ascii_uppercase:
    morse = morse.replace(letter, code[letter])
# Creates list of the morse and english words from strings
morsewords = morse.split("\n")
engwords = words.split("\n")
# Finds the max length of morse words
maxlength = max(len(i) for i in morsewords)
# Creates a dictionary of {morse words : english words}
words = dict(zip(morsewords, engwords))
# MorseInput = input("Morse code :")
MorseInput = ".--....-....-.-..-----."
# This is the recursive function
def decode(morse, eng="", oneWord=False, twoWord=False):
    # Print the english when finished
    if morse == "":
        print(eng)
    else:
        i = 1
        # While loop allows to go through all possWord where condition met
        while len(morse) >= i and i <= maxlength:
            possWord = morse[:i]
            # Checks if the word is a real word
            if possWord in words.keys():
                # Real word therefore add to english and the morse
                newEng = eng + " " + words[possWord]
                newMorse = morse[i:]
                # Checks if that not more than one, One length word used
                if len(words[possWord]) == 1:
                    if not oneWord:
                        decode(newMorse, newEng, True, twoWord)
                # Checks if that not more than one, Two length word used
                elif len(words[possWord]) == 2:
                    if not twoWord:
                        decode(newMorse, newEng, oneWord, True)
                # Word is greater than two so doesn't matter
                else:
                    decode(newMorse, newEng, oneWord, twoWord)
            i += 1
decode(MorseInput)
I hope that my comments make some sense.
I am sure that the code could be made better and shorter but I did it in under an hour.
It prints
A TEST IF DOG
A DID IN DOG
A BLUE DOG
WEST I IN DOG
WEST IF DOG
WEST EVEN A ON

Map character to its escaped version

I have this mapping:
mapping = {'a': '\a', 'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r',
't': '\t', 'v': '\v'}
Is there a way to do this without using a dictionary? Perhaps something like:
if c in "abfnrtv": c = '\\' + c

>>> ('\\' + 'a').decode('string-escape')
'\x07'
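The string-escape codec shown above exists only in Python 2. In Python 3, where str has no decode method, a rough equivalent is to go through bytes and the unicode_escape codec. The helper name `escape_char` is made up for this sketch:

```python
def escape_char(c):
    """Return the control character that the escape sequence '\\' + c denotes."""
    if c in "abfnrtv":
        # Build the two-character text backslash + c, then let the codec
        # interpret it the way a string literal would.
        return ("\\" + c).encode("ascii").decode("unicode_escape")
    return c


mapping = {c: escape_char(c) for c in "abfnrtv"}
print(mapping)
```

For seven fixed characters the explicit dictionary from the question is arguably just as readable. The codec route mainly pays off when arbitrary escape sequences need to be interpreted.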
