Replace numbers with letters and offer all permutations - python

I need to generate every string that can result from replacing each digit in a string with the letters it visually resembles.
Using the dictionary:
number_appearance = {
    '1': ['l', 'i'],
    '2': ['r', 'z'],
    '3': ['e', 'b'],
    '4': ['a'],
    '5': ['s'],
    '6': ['b', 'g'],
    '7': ['t'],
    '8': ['b'],
    '9': ['g', 'p'],
    '0': ['o', 'q']}
I want to write a function that takes an input and creates all possible letter combinations. For example:
import re
text = 'l4t32'
def convert_numbers(text):
    return re.sub('[0-9]', lambda x: number_appearance[x[0]][0], text)
I want the output to be a list with all possible permutations:
['later', 'latbr', 'latbz', 'latez']
The function above works if you are just grabbing the first letter in each list from number_appearance, but I'm trying to figure out the best way to iterate through all possible combinations. Any help would be much appreciated!

As an upgrade from your own answer, I suggest the following:
import itertools
def convert_numbers(text):
    all_items = [number_appearance.get(char, [char]) for char in text]
    return [''.join(elem) for elem in itertools.product(*all_items)]
The improvements are that:
it doesn't convert text to a list (there is no need for that)
you don't need regex
it still works if you later decide to add mappings for characters other than digits

import itertools
import re
def convert_num_appearance(text):
    string_characters = [character for character in text]
    all_items = []
    for item in string_characters:
        if re.search('[a-zA-Z]', item):
            all_items.append([item])
        elif re.search(r'\d', item):
            all_items.append(number_appearance[item])
    return [''.join(elem) for elem in itertools.product(*all_items)]

I would break down the problem like so:
First, create a function that can do the replacement for a given set of replacement letters. My input specification is a sequence of letters, where the first letter is the replacement for the '0' character, next for 1 etc. This allows me to use the index in that sequence to determine the character being replaced, while generating a plain sequence rather than a dict or other complex structure. To do the replacement, I will use the built-in translate method of the original string. That requires a dictionary as described in the documentation, which I can easily build with a dict comprehension, or with the provided helper method str.maketrans (a static method of the str type).
Use itertools.product to generate those sequences.
Use a list comprehension to apply the replacement for each sequence.
Thus:
from itertools import product
def replace_digits(original, replacement):
    # translation = {ord(str(i)): c for i, c in enumerate(replacement)}
    translation = str.maketrans('0123456789', ''.join(replacement))
    print(translation)
    return original.translate(translation)
replacements = product(
    ['o', 'q'], ['l', 'i'], ['r', 'z'], ['e', 'b'], ['a'],
    ['s'], ['b', 'g'], ['t'], ['b'], ['g', 'p']
)
[replace_digits('14732', r) for r in replacements]
(You will notice there are duplicates in the result; this is because of variant replacements for symbols that don't appear in the input.)
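One way to avoid those duplicates, as a rough sketch: build the product only over the digits that actually occur in the input. The helper name replace_present_digits is hypothetical, and the mapping is copied from the question. Because it still relies on translate, every occurrence of the same digit receives the same letter within one result.

```python
from itertools import product

# Same mapping as in the question.
number_appearance = {
    '1': ['l', 'i'], '2': ['r', 'z'], '3': ['e', 'b'], '4': ['a'],
    '5': ['s'], '6': ['b', 'g'], '7': ['t'], '8': ['b'],
    '9': ['g', 'p'], '0': ['o', 'q'],
}

def replace_present_digits(original):
    # Hypothetical helper: only digits found in the input take part in the product.
    digits = sorted({ch for ch in original if ch in number_appearance})
    results = []
    for combo in product(*(number_appearance[d] for d in digits)):
        # Map each present digit to the letter chosen for it in this combination.
        table = str.maketrans(''.join(digits), ''.join(combo))
        results.append(original.translate(table))
    return results

print(replace_present_digits('14732'))
```

For '14732' this yields 8 distinct strings instead of the longer list with repeats.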

Related

python - how to use the join method and sort method

My purpose is to take a string as input and return a list of the lowercase letters in that string, without repeats, without punctuation, in alphabetical order. For example, the input "happy!" would give ['a','h','p','y']. I tried to use the join function to get rid of the punctuation, but somehow it doesn't work. Does anybody know why? Also, can .sort() sort letters alphabetically? Am I using it in the right way? Thanks!
def split(a):
    a.lower()
    return [char for char in a]
def f(a):
    i=split(a)
    s=set(i)
    l=list(s)
    v=l.join(u for u in l if u not in ("?", ".", ";", ":", "!"))
    v.sort()
    return v
.join() is a string method, but here it is being called on a list, so the code raises an exception. join isn't really needed here anyway.
You're on the right track with set(). It only stores unique items, so create a set of your input and compute the intersection(&) with lower case letters. Sort the result:
>>> import string
>>> s = 'Happy!'
>>> sorted(set(s.lower()) & set(string.ascii_lowercase))
['a', 'h', 'p', 'y']
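You could use the following (bearing in mind that strip only removes the listed characters from the start and end of the string):
def f(a):
    return sorted(set(a.lower().strip('?.;:!')))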
>>> f('Happy!')
['a', 'h', 'p', 'y']
You could also use regex for this:
import re
pattern = re.compile(r'[^a-z]')
string = 'Hello# W0rld!!##'
print(sorted(set(pattern.sub('', string))))
Output:
['d', 'e', 'l', 'o', 'r']
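Two details in the snippets above are easy to miss: str.strip only trims characters from both ends of a string, and the pattern [^a-z] drops uppercase letters unless the input is lowercased first. A minimal sketch that lowercases first and filters every character might look like this (unique_letters is a hypothetical name, and it assumes only ASCII letters a-z should be kept):

```python
def unique_letters(text):
    # Hypothetical helper: lowercase first, then keep only ASCII letters a-z.
    # The set comprehension removes repeats; sorted() returns an ordered list.
    return sorted({ch for ch in text.lower() if 'a' <= ch <= 'z'})

print(unique_letters('Hap.py!'))  # ['a', 'h', 'p', 'y']
```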

How would I turn this array of letters, with choices, into a list of possible words?

7h47 --> should be 'that' in 1337speak
[['j', 't'], ['h'], ['a', 'h'], ['j', 't']]
So the output should be: possible_words = [jhaj, jhat, jhhj, jhht, thaj, that, thhj, thht], just having trouble getting to this step.
What you want is the Cartesian product of these lists.
Happily, Python does the hard work for you: all you need is itertools.product:
import itertools
letters = [['j', 't'], ['h'], ['a', 'h'], ['j', 't']]
words = [''.join(x) for x in itertools.product(*letters)]
And you get:
['jhaj', 'jhat', 'jhhj', 'jhht', 'thaj', 'that', 'thhj', 'thht']
In more detail, itertools.product returns an iterator that generates all possible combinations of its arguments (here, the lists of letters), each as a tuple. We then iterate over these combinations within a list comprehension and use join to get a word from each tuple of "chars" (str of length 1 in Python).
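To make the intermediate values visible, here is a small sketch that inspects what itertools.product yields before join is applied. It assumes Python 3.8 or later for math.prod; the variable name combos is only illustrative.

```python
import itertools
from math import prod

letters = [['j', 't'], ['h'], ['a', 'h'], ['j', 't']]

# Each item produced is a tuple holding one choice per position.
combos = list(itertools.product(*letters))
print(combos[0])  # ('j', 'h', 'a', 'j')

# The number of combinations is the product of the number of choices per position.
print(len(combos) == prod(len(choices) for choices in letters))  # True
```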
If you're a recursion fan, here's an approach using generators:
def combine(combinations, prefix=''):
    head, *tail = combinations
    # If it's the last word from the combinations
    if not tail:
        # Yield the full word using the head as suffixes
        for letter in head:
            yield prefix + letter
    else:
        # Yield from the tail combinations using the head as suffixes
        for letter in head:
            yield from combine(tail, prefix + letter)
print(list(combine([['j', 't'], ['h'], ['a', 'h'], ['j', 't']])))
# ['jhaj', 'jhat', 'jhhj', 'jhht', 'thaj', 'that', 'thhj', 'thht']

How do I create a new list with a nested list comprehension?

Say I have a list of words
word_list = ['cat','dog','rabbit']
and I want to end up with a list of letters (not including any repeated letters), like this:
['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
Without a list comprehension the code would look like this:
letter_list=[]
for a_word in word_list:
    for a_letter in a_word:
        if a_letter not in letter_list:
            letter_list.append(a_letter)
print(letter_list)
Is there a way to do this with a list comprehension?
I have tried
letter_list = [a_letter for a_letter in a_word for a_word in word_list]
but I get a
NameError: name 'a_word' is not defined
error. I have seen answers for similar problems, but they usually iterate over a nested collection (list or tuple). Is there a way to do this from a non-nested list like a_word?
Trying
letter_list = [a_letter for a_letter in [a_word for a_word in word_list]]
Results in the initial list: ['cat','dog','rabbit']
And trying
letter_list = [[a_letter for a_letter in a_word] for a_word in word_list]
Results in: [['c', 'a', 't'], ['d', 'o', 'g'], ['r', 'a', 'b', 'b', 'i', 't']], which is closer to what I want except it's nested lists. Is there a way to do this and have just the letters be in letter_list?
Update. How about this:
word_list = ['cat','dog','rabbit']
new_list = [letter for letter in ''.join(word_list)]
new_list = sorted(set(new_list), key=new_list.index)
print(new_list)
Output:
['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
word_list = ['cat','dog','rabbit']
letter_list = list(set([letter for word in word_list for letter in word]))
This works and removes the duplicate letters, but the order is not preserved. If you want to keep the order you can do this.
from collections import OrderedDict
word_list = ['cat','dog','rabbit']
letter_list = list(OrderedDict.fromkeys("".join(word_list)))
You can flatten the words with a nested list comprehension (note that this keeps repeated letters):
l = [j for i in word_list for j in i]
print(l)
Output:
['c', 'a', 't', 'd', 'o', 'g', 'r', 'a', 'b', 'b', 'i', 't']
You can use a list comprehension. It is faster than looping in cases like yours when you call .append on each iteration, as explained by this answer.
But if you want to keep only unique letters (i.e. without repeating any letter), you can use a set comprehension by changing the square brackets [] to curly braces {}, as in
letter_set = {letter for word in word_list for letter in word}
This way you avoid checking the partial list on every iteration to see if the letter is already present. Instead you make use of Python's built-in hashing, which makes the membership test much faster. Note that a set does not preserve the order in which the letters were first seen.
Another solution:
>>> s = set()
>>> word_list = ['cat', 'dog', 'rabbit']
>>> [c for word in word_list for c in word if (c not in s, s.add(c))[0]]
['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
This will test whether the letter is already in the set or not, and it will unconditionally add it to the set (having no effect if it is already present). The None returned from s.add is stored in the temporary tuple but otherwise ignored. The first element of the temporary tuple (that is, the result of the c not in s) is used to filter the items.
This relies on the fact that the elements of the temporary tuple are evaluated from left to right.
Could be considered a bit hacky :-)
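If the tuple trick feels too clever, the same "seen set" idea can be written as an explicit loop. This is only a sketch; unique_in_order is a hypothetical name.

```python
def unique_in_order(words):
    # Hypothetical helper: a set gives fast membership tests,
    # while the list records the order of first appearance.
    seen = set()
    result = []
    for word in words:
        for letter in word:
            if letter not in seen:
                seen.add(letter)
                result.append(letter)
    return result

print(unique_in_order(['cat', 'dog', 'rabbit']))
# ['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
```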

Nested list loop indexing

I am trying to create a list of characters based on a list of words
i.e.
["BOARD", "GAME"] -> [["B","O"...], ["G","A","M"...]]
From my understanding, I get an IndexError because my initial boardlist does not contain a predetermined number of lists.
Is there a way to create a new list in boardlist for each object in board?
I don't know if I'm being clear.
Thank you.
board=["BOARD", "GAME"]
boardlist=[[]]
i=0
for word in board:
    for char in word:
        boardlist[i].append(char)
    i=i+1
print(boardlist)
IndexError: list index out of range
Note that this can be done much more simply by calling list on each string in the board: the list constructor converts any iterable, in this case a string, into a list of its elements (here, single-character strings):
l = ["BOARD", "GAME"]
[list(i) for i in l]
# [['B', 'O', 'A', 'R', 'D'], ['G', 'A', 'M', 'E']]
Let's also fix your current approach. Firstly, boardlist=[[]] creates a list holding a single empty list, not one empty list per word (check what it returns), so boardlist[1] does not exist. You might want to check this post. Also, instead of incrementing a counter, you can use enumerate:
boardlist = [[] for _ in range(len(board))]
for i, word in enumerate(board):
    for char in word:
        boardlist[i].extend(char)
print(boardlist)
# [['B', 'O', 'A', 'R', 'D'], ['G', 'A', 'M', 'E']]
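To see where the IndexError comes from, the following sketch checks the length of the original boardlist and then builds one empty list per word. The names error_message and fixed are illustrative only.

```python
board = ["BOARD", "GAME"]

# [[]] is a list containing exactly one empty list.
boardlist = [[]]
print(len(boardlist))  # 1

# Index 1 therefore does not exist, which is what the second word runs into.
try:
    boardlist[1].append("G")
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# One independent empty list per word avoids the problem.
fixed = [[] for _ in board]
for i, word in enumerate(board):
    fixed[i].extend(word)  # extend adds each character of the word
print(fixed)
```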

Function that retrieves and returns letters from a list of lists

I'm writing a function that needs to go through a list of lists, collect all letters uppercase or lowercase and then return a list with 1 of each letter that it found in order. If the letter appears multiple times in the list of lists the function only has to report the first time it sees the letter.
For example, if the list of lists was [['.', 'M', 'M', 'N', 'N'],['.', '.', '.', '.', 'g'], ['B', 'B', 'B', '.','g']] then the function output should return ["M","N","g","B"].
The code I have so far looks like it should work, but it doesn't. Any help is appreciated.
def get_symbols(lot):
    symbols = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    newlot = []
    for i in lot:
        if i == symbols:
            newlot.append(symbols)
            return newlot
        else:
            return None
To build on your existing code:
import string
def get_symbols(lot):
    symbols = string.ascii_lowercase + string.ascii_uppercase
    newlot = []
    for sublot in lot:
        for x in sublot:
            if x in symbols and x not in newlot:
                newlot.append(x)
    return newlot
print(get_symbols([['.', 'M', 'M', 'N', 'N'],['.', '.', '.', '.', 'g'], ['B', 'B', 'B', '.','g']]))
Using string gets us the letters a little more neatly. We then loop over each list provided (each sublot of the lot), and then for each element (x), we check if it is both in our list of all letters and not in our list of found letters. If this is the case, we add it to our output.
There are a few things wrong with your code. You are using return in the wrong place, looping only over the outer list (not over the items in the sublists), and appending symbols to newlot instead of the matched item.
def get_symbols(lot):
    symbols = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' # You should define this OUTSIDE of the function
    newlot = []
    for i in lot: # You are iterating over the outer list only here
        if i == symbols: # == does not check if an item is in a list, use `in` here
            newlot.append(symbols) # You are appending symbols which is the alphabet
            return newlot # This will cause your function to exit as soon as the first iteration is over
        else:
            return None # No need for this
You can use a double for loop and use in to check if the character is in symbols and isn't already in newlot:
l = [['.', 'M', 'M', 'N', 'N'],['.', '.', '.', '.', 'g'], ['B', 'B', 'B', '.','g']]
symbols = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
def get_symbols(lot):
    newlot = []
    for sublist in lot:
        for i in sublist:
            if i in symbols and i not in newlot:
                newlot.append(i)
    return newlot
This is the output for your list:
>>> get_symbols(l)
['M', 'N', 'g', 'B']
This can also be done using chain, OrderedDict and isalpha, as follows:
>>> from collections import OrderedDict
>>> from itertools import chain
>>> data = [['.', 'M', 'M', 'N', 'N'],['.', '.', '.', '.', 'g'], ['B', 'B', 'B', '.','g']]
>>> temp = OrderedDict.fromkeys(chain.from_iterable(data))
>>> [x for x in temp if x.isalpha()]
['M', 'N', 'g', 'B']
>>>
chain.from_iterable serves the same purpose as concatenating all the sublists into one.
As the order is relevant, OrderedDict serves the same purpose as a set by removing duplicates, with the added bonus of preserving the order in which each item was first added. The fromkeys class method creates a dictionary with the given keys and the same value for each, which defaults to None; since we don't care about the values, for our purposes it acts as an ordered set.
Finally, isalpha tells you whether the string is a letter or not.
You can also take a look at the unique_everseen recipe. Because itertools is your best friend, I recommend keeping all those recipes in a file that is always at hand; they are always helpful.
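For reference, a simplified sketch of the unique_everseen idea is shown below. The recipe in the itertools documentation also accepts a key function, which is left out here, so treat this as an approximation rather than the official recipe.

```python
from itertools import chain

def unique_everseen(iterable):
    # Simplified version: yield each item the first time it is seen.
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

data = [['.', 'M', 'M', 'N', 'N'], ['.', '.', '.', '.', 'g'], ['B', 'B', 'B', '.', 'g']]
letters = [x for x in unique_everseen(chain.from_iterable(data)) if x.isalpha()]
print(letters)  # ['M', 'N', 'g', 'B']
```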
