Coding words as products of integers - python

I am trying to write a program that checks whether smaller words can be found within a larger word. For example, the word "computer" contains the words "put", "rum", "cut", etc. To perform the check, I am trying to code each word as a product of prime numbers, so that the code of each smaller word is a factor of the code of the larger word. I have a list of letters and a list of primes and have (I think) assigned an integer value to each letter:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101]
index = 0
while index <= len(letters)-1:
    letters[index] = primes[index]
    index += 1
The problem I am having now is how to get the integer code for a given word and be able to create the codes for a whole list of words. For example, I want to be able to input the word "cab," and have the code generate its integer value of 5*2*3 = 30.
Any help would be much appreciated.

from functools import reduce # only needed for Python 3.x
from operator import mul
primes = [
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
]
lookup = dict(zip("abcdefghijklmnopqrstuvwxyz", primes))
def encode(s):
    # start from 1 so that an empty string encodes to 1 instead of raising TypeError
    return reduce(mul, (lookup.get(ch, 1) for ch in s.lower()), 1)
then
encode("cat") # => 710
encode("act") # => 710
Edit: more to the point,
def is_anagram(s1, s2):
    """
    s1 consists of the same letters as s2, rearranged
    """
    return encode(s1) == encode(s2)
def is_subset(s1, s2):
    """
    s1 consists of some letters from s2, rearranged
    """
    return encode(s2) % encode(s1) == 0
then
is_anagram("cat", "act") # => True
is_subset("cat", "tactful") # => True
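Applied to the original "computer" example, the divisibility idea could be sketched as follows. This is only an illustration, not part of the answer above: it assumes Python 3.8+ for math.prod, and the names LOOKUP, contained_words and the candidate word list are made up here.

```python
from math import prod  # available from Python 3.8

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
LOOKUP = dict(zip("abcdefghijklmnopqrstuvwxyz", PRIMES))

def encode(word):
    """Product of the primes assigned to each letter; other characters count as 1."""
    return prod(LOOKUP.get(ch, 1) for ch in word.lower())

def contained_words(big_word, candidates):
    """Return the candidates whose letters can all be taken from big_word."""
    big_code = encode(big_word)
    return [w for w in candidates if big_code % encode(w) == 0]

# hypothetical candidate list, chosen only for this example
found = contained_words("computer", ["put", "rum", "cut", "cat", "poor"])
print(found)  # ['put', 'rum', 'cut']
```

Because each letter contributes its own prime factor, the check respects letter counts: "poor" is rejected since "computer" has only one "o".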

I would use a dict here to look up the prime for a given letter:
In [1]: letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
In [2]: primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101]
In [3]: lookup = dict(zip(letters, primes))
In [4]: lookup['a']
Out[4]: 2
This will let you easily determine the list of primes for a given word:
In [5]: [lookup[letter] for letter in "computer"]
Out[5]: [5, 47, 41, 53, 73, 71, 11, 61]
To find the product of those primes:
In [6]: import operator
In [7]: from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
In [8]: reduce(operator.mul, [lookup[letter] for letter in "cab"])
Out[8]: 30

You've got your two lists set up, so now you just need to iterate over each character in a word and determine what value that letter gives you.
Something like
total = 1
for letter in word:
    index = letters.index(letter)
    total *= primes[index]
Or whichever operation you decide to use.
You would generalize that to a list of words.
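One way that generalization might look is sketched below. The function name word_code and the sample word list are assumptions made for this example; the loop itself is the one shown above.

```python
letters = list("abcdefghijklmnopqrstuvwxyz")
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def word_code(word):
    """Multiply together the primes that correspond to each letter of word."""
    total = 1
    for letter in word:
        total *= primes[letters.index(letter)]
    return total

# hypothetical word list; any iterable of lowercase words would do
words = ["cab", "cat", "act"]
codes = {word: word_code(word) for word in words}
print(codes)  # {'cab': 30, 'cat': 710, 'act': 710}
```

Note that letters.index() raises a ValueError for characters outside a-z, so input would need to be lowercased and cleaned first.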

Hmm... it isn't very clear how this code is supposed to run. If it is meant to find words in the English dictionary, consider using PyEnchant, a module for checking whether words are in a dictionary. Something you could try is this:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
word = input('What is your word? ')  # Python 3; use raw_input() on Python 2
word = list(word)
total = 1
nums = []
for k in word:
    nums.append(primes[letters.index(k)])
for k in nums:
    total = total*k
print(total)
This will output:
>>> What is your word? cat
710
>>>
This is correct, as 5*2*71 equals 710

Related

Is it possible to find the index of elements with char value higher than `n` with numpy?

Basically I have something like this:
letters = "ABNJDSJHIUOIUIYEIUWEYIUJHAJHSGJHASNMVFDJHKIUYEIUWYEWUIEYUIUYIEJSGCDJHDS"
I want to find the indices of the letters above, say, M. I want to do something like:
import numpy as np
letters = "ABNJDSJHIUOIUIYEIUWEYIUJHAJHSGJHASNMVFDJHKIUYEIUWYEWUIEYUIUYIEJSGCDJHDS"
# - test
np_array = np.array(np.where(letters > chr(77))[0])
Is this possible? Or do I have to do something like letters not in ...?
Convert letters to a character array:
>>> ar = np.array(list(letters))
>>> ar
array(['A', 'B', 'N', 'J', 'D', 'S', 'J', 'H', 'I', 'U', 'O', 'I', 'U',
'I', 'Y', 'E', 'I', 'U', 'W', 'E', 'Y', 'I', 'U', 'J', 'H', 'A',
'J', 'H', 'S', 'G', 'J', 'H', 'A', 'S', 'N', 'M', 'V', 'F', 'D',
'J', 'H', 'K', 'I', 'U', 'Y', 'E', 'I', 'U', 'W', 'Y', 'E', 'W',
'U', 'I', 'E', 'Y', 'U', 'I', 'U', 'Y', 'I', 'E', 'J', 'S', 'G',
'C', 'D', 'J', 'H', 'D', 'S'], dtype='<U1')
>>> np.where(ar > 'M')[0]
array([ 2, 5, 9, 10, 12, 14, 17, 18, 20, 22, 28, 33, 34, 36, 43, 44, 47,
48, 49, 51, 52, 55, 56, 58, 59, 63, 70], dtype=int64)
A byte array can also be used:
>>> ar = np.array(bytearray(letters.encode()))
>>> ar
array([65, 66, 78, 74, 68, 83, 74, 72, 73, 85, 79, 73, 85, 73, 89, 69, 73,
85, 87, 69, 89, 73, 85, 74, 72, 65, 74, 72, 83, 71, 74, 72, 65, 83,
78, 77, 86, 70, 68, 74, 72, 75, 73, 85, 89, 69, 73, 85, 87, 89, 69,
87, 85, 73, 69, 89, 85, 73, 85, 89, 73, 69, 74, 83, 71, 67, 68, 74,
72, 68, 83], dtype=uint8)
>>> np.where(ar > ord('M'))[0]
array([ 2, 5, 9, 10, 12, 14, 17, 18, 20, 22, 28, 33, 34, 36, 43, 44, 47,
48, 49, 51, 52, 55, 56, 58, 59, 63, 70], dtype=int64)
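For comparison, the same indices can be found without NumPy. This is just a plain-Python sketch of the comparison used above, not a replacement for the array approach; the variable name indices is made up here.

```python
letters = "ABNJDSJHIUOIUIYEIUWEYIUJHAJHSGJHASNMVFDJHKIUYEIUWYEWUIEYUIUYIEJSGCDJHDS"

# keep the position of every character that sorts after 'M'
indices = [i for i, ch in enumerate(letters) if ch > "M"]
print(indices[:5])  # [2, 5, 9, 10, 12]
```

For long strings the NumPy version should scale better, since the comparison runs over the whole array at once.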

Print statements won't execute after the for loop

"""
ID: kunalgu1
LANG: PYTHON3
TASK: ride
"""
fin = open ('ride.in', 'r')
fout = open ('ride.out', 'w')
lines = fin.readlines()
cometString = lines[0]
cometValue = 1
groupString = lines[1]
groupValue = 1
def orderS (val):
    arrL = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    arrN = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
    indexVal = arrL.index(val.lower())
    return arrN[indexVal]
for x in cometString:
    print(orderS(x))
    cometValue *= orderS(x)
    print(cometValue)
Here is the main error: it won't print
cometValue = cometValue % 47
print(cometValue)
fout.close()
The loop raises an error because the lines returned by readlines() include the newline terminator. When orderS() is called for that character, arrL.index() raises a ValueError because there's no newline in arrL.
You can remove the newline with the rstrip() method:
cometString = lines[0].rstrip()
You could also have orderS() return a default value when the character can't be found. Keep in mind that the default is multiplied into cometValue, so any default other than 1 affects the result:
def orderS (val):
    arrL = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    arrN = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
    try:
        indexVal = arrL.index(val.lower())
        return arrN[indexVal]
    except ValueError:
        return 27
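Putting the rstrip() idea together, the whole calculation could be sketched like this. The helper name word_value and the sample strings are assumptions for illustration, and the sketch assumes each line contains only letters plus the trailing newline.

```python
def word_value(line):
    """Product of letter positions (A=1 ... Z=26), reduced modulo 47."""
    value = 1
    for ch in line.strip().upper():  # strip() drops the trailing newline
        value *= ord(ch) - ord("A") + 1
    return value % 47

# sample lines as readlines() would return them, newline included
print(word_value("COMETQ\n"))  # 27
print(word_value("HVNGAT\n"))  # 27
```

Using ord() avoids the two lookup lists, but it is only a design choice; the list-based orderS() works just as well once the newline is removed.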

Merge function will only work for ordered list

I have these two lists as input:
list1 = [['A', 14, 'I', 10, 20], ['B', 15, 'S', 30, 40], ['C', 16, 'F', 50, 60]]
list2 = [['A', 14, 'Y', 0, 200], ['B', 15, 'M', 0, 400], ['C', 17, 'G', 0, 600]]
and my desired output will be this:
finalList = [['A', 14, 'Y', 10, 200], ['B', 15, 'M', 30, 400], ['C', 16, 'F', 50, 60],['C', 17, 'G', 0, 600]]
Using this function:
def custom_merge(list1, list2):
    finalList = []
    for sub1, sub2 in zip(list1, list2):
        if sub1[1]==sub2[1]:
            out = sub1.copy()
            out[2] = sub2[2]
            out[4] = sub2[4]
            finalList.append(out)
        else:
            finalList.append(sub1)
            finalList.append(sub2)
    return finalList
I do get my desired output, but what if I swap list2[1] and list2[2], so that list2 becomes:
list2 = [['A', 14, 'Y', 0, 200], ['C', 17, 'G', 0, 600], ['B', 15, 'M', 0, 400]]
Then the output will be this:
[['A', 14, 'Y', 10, 200], ['B', 15, 'S', 30, 40], ['C', 17, 'G', 0, 600], ['C', 16, 'F', 50, 60], ['B', 15, 'M', 0, 400]]
(notice the extra ['B', 15, 'M', 0, 400])
What do I have to modify in my function to get my desired output when the sublists are in a different order? I am using Python 3. Thank you!
LATER EDIT:
Merge rules:
When list1[listindex][1] == list2[listindex][1] (e.g. when 14 == 14), replace elements 2 and 4 of the list1 sublist with list2[2] and list2[4] (e.g. 'Y' and 200). If there is no match, add the unmatched sublist from list2 to list1 as it is (as in my desired output), and also keep the sublists in list1 that aren't matched (e.g. ['C', 16, 'F', 50, 60]).
Note that list1 and list2 can have different lengths (list1 can have more sublists than list2, or vice versa).
EDIT.2
I found this:
def combine(list1,list2):
    combined_list = list1 + list2
    final_dict = {tuple(i[:2]):tuple(i[2:]) for i in combined_list}
    merged_list = [list(k) + list (final_dict[k]) for k in final_dict]
    return merged_list
^^ That could work, still testing!
You can sort both lists before merging them, so that matching sublists line up. Note that zip() still pairs the sublists by position, so this assumes both lists have the same length and that matching sublists end up at the same index after sorting.
def custom_merge(list1, list2):
    finalList = []
    for sub1, sub2 in zip(sorted(list1), sorted(list2)):
        if sub1[1]==sub2[1]:
            out = sub1.copy()
            out[2] = sub2[2]
            out[4] = sub2[4]
            finalList.append(out)
        else:
            finalList.append(sub1)
            finalList.append(sub2)
    return finalList
tests:
list1 = [['A', 14, 'I', 10, 20], ['B', 15, 'S', 30, 40], ['C', 16, 'F', 50, 60]]
list2 = [['A', 14, 'Y', 0, 200], ['C', 17, 'G', 0, 600], ['B', 15, 'M', 0, 400]]
custom_merge(list1, list2)
# returns:
[['A', 14, 'Y', 10, 200],
['B', 15, 'M', 30, 400],
['C', 16, 'F', 50, 60],
['C', 17, 'G', 0, 600]]
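Since the question notes that the two lists can differ in length, a dictionary keyed on the matching element may be more robust than pairing by position. The following is a rough sketch of that alternative, not the method from the answer above: the name merge_by_key is made up, and it assumes the value at index 1 is unique within each list.

```python
def merge_by_key(list1, list2):
    """Merge rows of list2 into list1, matching rows on the value at index 1."""
    merged = {}
    for row in list1:
        merged[row[1]] = list(row)   # copy so the inputs are left untouched
    for row in list2:
        key = row[1]
        if key in merged:
            # matched: take elements 2 and 4 from the list2 row
            merged[key][2] = row[2]
            merged[key][4] = row[4]
        else:
            # unmatched: keep the list2 row as it is
            merged[key] = list(row)
    return list(merged.values())     # dicts keep insertion order (Python 3.7+)

list1 = [['A', 14, 'I', 10, 20], ['B', 15, 'S', 30, 40], ['C', 16, 'F', 50, 60]]
list2 = [['A', 14, 'Y', 0, 200], ['C', 17, 'G', 0, 600], ['B', 15, 'M', 0, 400]]
result = merge_by_key(list1, list2)
print(result)
```

With these inputs the result has the rows of list1 first (updated where a match exists), followed by the unmatched row from list2, regardless of the order of list2.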

How can you convert a Python identifier into a number?

Reference: Is there a faster way of converting a number to a name?
In the question referenced above, a solution was found for turning a number into a name. This question asks just the opposite: how can you convert a name back into a number? So far, this is what I have:
>>> import string
>>> HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
>>> TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
>>> HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
>>> def number_to_name(number):
        "Convert a number into a valid identifier."
        if number < HEAD_BASE:
            return HEAD_CHAR[number]
        q, r = divmod(number - HEAD_BASE, TAIL_BASE)
        return number_to_name(q) + TAIL_CHAR[r]
>>> [number_to_name(n) for n in range(117)]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A0', 'A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9', 'AA', 'AB', 'AC', 'AD', 'AE', 'AF', 'AG', 'AH', 'AI', 'AJ', 'AK', 'AL', 'AM', 'AN', 'AO', 'AP', 'AQ', 'AR', 'AS', 'AT', 'AU', 'AV', 'AW', 'AX', 'AY', 'AZ', 'A_', 'Aa', 'Ab', 'Ac', 'Ad', 'Ae', 'Af', 'Ag', 'Ah', 'Ai', 'Aj', 'Ak', 'Al', 'Am', 'An', 'Ao', 'Ap', 'Aq', 'Ar', 'As', 'At', 'Au', 'Av', 'Aw', 'Ax', 'Ay', 'Az', 'B0']
>>> def name_to_number(name):
        assert name, 'Name must exist!'
        head, *tail = name
        number = HEAD_CHAR.index(head)
        for position, char in enumerate(tail):
            if position:
                number *= TAIL_BASE
            else:
                number += HEAD_BASE
            number += TAIL_CHAR.index(char)
        return number
>>> [name_to_number(number_to_name(n)) for n in range(117)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 54]
The function number_to_name works perfectly, and name_to_number works up until it gets to number 116. At that point, the function returns 54 instead. Does anyone see the code's problem?
Solution based on recursive's answer:
import string
HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
def name_to_number(name):
    if not name.isidentifier():
        raise ValueError('Name must be a Python identifier!')
    head, *tail = name
    number = HEAD_CHAR.index(head)
    for char in tail:
        number *= TAIL_BASE
        number += TAIL_CHAR.index(char)
    return number + sum(HEAD_BASE * TAIL_BASE ** p for p in range(len(tail)))
Unfortunately, these identifiers don't yield to traditional constant-base encoding techniques. For example, "A" acts like a zero, but leading "A"s change the value, whereas leading zeroes in a normal number system do not. There could be multiple approaches, but I settled on one that calculates the total number of identifiers with fewer characters and starts from that.
from functools import reduce  # reduce is not a builtin on Python 3
def name_to_number(name):
    assert name, 'Name must exist!'
    skipped = sum(HEAD_BASE * TAIL_BASE ** i for i in range(len(name) - 1))
    val = reduce(
        lambda a,b: a * TAIL_BASE + TAIL_CHAR.index(b),
        name[1:],
        HEAD_CHAR.index(name[0]))
    return val + skipped
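A quick way to gain confidence in the "skip the shorter identifiers" idea is a round-trip check against number_to_name from the question. The sketch below restates both functions so that it runs on its own; the range of 10000 and the variable round_trip_ok are arbitrary choices made for this example.

```python
import string

HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)

def number_to_name(number):
    """Convert a number into a valid identifier (as in the question)."""
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return number_to_name(q) + TAIL_CHAR[r]

def name_to_number(name):
    """Invert number_to_name by adding the count of all shorter identifiers."""
    head, *tail = name
    number = HEAD_CHAR.index(head)
    for char in tail:
        number = number * TAIL_BASE + TAIL_CHAR.index(char)
    skipped = sum(HEAD_BASE * TAIL_BASE ** p for p in range(len(tail)))
    return number + skipped

# every number in the range should survive the round trip unchanged
round_trip_ok = all(name_to_number(number_to_name(n)) == n for n in range(10000))
print(round_trip_ok)            # True
print(name_to_number('B0'))     # 116
```

The range covers one-, two- and three-character names, so it exercises the case (116, i.e. 'B0') that the original attempt got wrong.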

Extract specific data entries(as groups) from a list and store in separate lists?

For example, here is a sample of the list I am looping over:
['n', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 82, 83, 84, 85, 86, 87, 88, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 178, 179, 180]
This list is generated from a previously called function(the n's have been inserted to hide unwanted values).
I am attempting to group the numbers which are separated between n's and send them to a list, for instance take the numbers 1-37->put in a list, take numbers 82-88->different list, 178-180->send to different list.
The tricky part is that the list will not always have the same set of data inside, the 'groups' can be of arbitrary size and location. The only defining feature is that they are separated by n's.
My attempt so far:
for i in range(0, len(lists)):
    for index, item in enumerate(lines):
        if item != 'n': #if item is not n send to list
            lists[i].append(item)
        elif lines[index+1] == 'n':#if the next item is an n
            del lines[:index]
Where 'lists' is actually a list of lists created outside this function, to store each of the groups, the number of lists is determined by the number of groups that is needed to be stored.
'lines' is the list of values I wish to loop over.
My logic is: append every value that is not an 'n' to the first list; if the next value is an 'n', delete all the values before it and loop over the shortened list, putting the next set of values into the next list, and so on.
Except I get: list index out of range
I understand why, but I was hoping there was a way to hack around it. I have also tried breaking out of the loop at the elif, but then I cannot continue the loop where I left off.
My final attempt was to restart the loop at a location set after the first run like so:
place=0
for i in range(0, len(lists)):
    for index, item in enumerate(lines[place:]):
        if item != 'n': #if item is not n send to list
            lists[i].append(item)
        elif lines[index+1] == 'n':#if the next item is an n
            place=[index]
            break
But this raises: TypeError: slice indices must be integers or None or have an __index__ method
I hope that was clear. Can anyone lend a hand?
from itertools import groupby
data = ['n', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 82, 83, 84, 85, 86, 87, 88, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 178, 179, 180]
def keyfunc(n):
    return n == 'n'
groups = [list(g) for k, g in groupby(data, keyfunc) if not k]
>>> import itertools
>>> data = ['n', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 82, 83, 84, 85, 86, 87, 88, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 178, 179, 180]
>>> [list(g) for k, g in itertools.groupby(data, lambda x: x != 'n') if k]
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37],
[82, 83, 84, 85, 86, 87, 88],
[178, 179, 180]]
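To see why filtering on the key works, it may help to look at what groupby yields before the filter is applied. The sketch below uses a shortened, made-up list; the names runs and groups are only for illustration.

```python
from itertools import groupby

# shortened sample in the same shape as the data above
data = ['n', 1, 2, 'n', 'n', 3]

# groupby emits one (key, group) pair per run of consecutive items with the same key
runs = [(is_number, list(group))
        for is_number, group in groupby(data, lambda x: x != 'n')]
print(runs)
# [(False, ['n']), (True, [1, 2]), (False, ['n', 'n']), (True, [3])]

# keeping only the runs whose key is True drops the separators
groups = [items for is_number, items in runs if is_number]
print(groups)  # [[1, 2], [3]]
```

Each run of consecutive 'n' values collapses into a single group with key False, so any number of separators between two blocks of numbers is handled the same way.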
This works:
li=['n', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 82, 83, 84, 85, 86, 87, 88, 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 'n', 178, 179, 180]
from itertools import groupby
def is_n(c): return c=='n'
print([list(t1) for t0,t1 in groupby(li, is_n) if not t0])
Prints:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37], [82, 83, 84, 85, 86, 87, 88], [178, 179, 180]]
If you want to do this 'raw' (without itertools) this works:
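def is_n(c): return c=='n'
def divide(li, f):
    r=[]
    while li:
        while li and f(li[0]):
            li.pop(0)
        sub=[]
        while li and not f(li[0]):
            sub.append(li.pop(0))
        if sub:  # skip the empty group left over when the list ends with separators
            r.append(sub)
    return r
print(divide(list(li), is_n))  # pass a copy, because divide() empties its argument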
Or, if you want to use a for loop:
def divide4(li,f):
    r=[]
    sub=[]
    for e in li:
        if f(e):
            if len(sub)==0:
                continue
            else:
                r.append(sub)
                sub=[]
        else:
            sub.append(e)
    else:
        if len(sub): r.append(sub)
    return r
print(divide4(li,is_n))
In either case, this prints:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37], [82, 83, 84, 85, 86, 87, 88], [178, 179, 180]]
itertools is faster, easier, proven. Use that.
