Changing multiple values in a list - python

I'm writing encryption code, so I don't know in advance where each character I need to replace will be.
I have read a text file into a list and then converted the list to ASCII codes, but I need to replace 32 with a space. This is what gets printed so far; I need to replace every 32 in the second list with " ".
Original = ['S', 'o', 'm', 'e', 'w', 'h', 'e', 'r', 'e', ' ', 'i', 'n', ' ', 'l', 'a', ' ', 'M', 'a', 'n', 'c', 'h', 'a', ',', ' ', 'i', 'n', ' ', 'a', ' ', 'p', 'l', 'a', 'c', 'e', ' ', 'w', 'h', 'o', 's', 'e', ' ', 'n', 'a', 'm', 'e', ' ', 'I', ' ', 'd', 'o', ' ', 'n', 'o', 't', ' ', 'c', 'a', 'r', 'e', ' ', 't', 'o', ' ', 'r', 'e', 'm', 'e', 'm', 'b', 'e', 'r', ',', ' ', 'a', ' ', 'g', 'e', 'n', 't', 'l', 'e', 'm', 'a', 'n', ' ', 'l', 'i', 'v', 'e', 'd', ' ', 'n', 'o', 't', ' ', 'l', 'o', 'n', 'g', ' ', 'a', 'g', 'o', ',', ' ', 'o', 'n', 'e', ' ', 'o', 'f', ' ', 't', 'h', 'o', 's', 'e', ' ', 'w', 'h', 'o', ' ', 'h', 'a', 's', ' ', 'a', ' ', 'l', 'a', 'n', 'c', 'e', ' ', 'a', 'n', 'd', ' ', 'a', 'n', 'c', 'i', 'e', 'n', 't', ' ', 's', 'h', 'i', 'e', 'l', 'd', ' ', 'o', 'n', ' ', 'a', ' ', 's', 'h', 'e', 'l', 'f', ' ', 'a', 'n', 'd', ' ', 'k', 'e', 'e', 'p', 's', ' ', 'a', ' ', 's', 'k', 'i', 'n', 'n', 'y', ' ', 'n', 'a', 'g', ' ', 'a', 'n', 'd', ' ', 'a', ' ', 'g', 'r', 'e', 'y', 'h', 'o', 'u', 'n', 'd', ' ', 'f', 'o', 'r', ' ', 'r', 'a', 'c', 'i', 'n', 'g', '.']
ASCII_conversion = [83, 111, 109, 101, 119, 104, 101, 114, 101, 32, 105, 110, 32, 108, 97, 32, 77, 97, 110, 99, 104, 97, 44, 32, 105, 110, 32, 97, 32, 112, 108, 97, 99, 101, 32, 119, 104, 111, 115, 101, 32, 110, 97, 109, 101, 32, 73, 32, 100, 111, 32, 110, 111, 116, 32, 99, 97, 114, 101, 32, 116, 111, 32, 114, 101, 109, 101, 109, 98, 101, 114, 44, 32, 97, 32, 103, 101, 110, 116, 108, 101, 109, 97, 110, 32, 108, 105, 118, 101, 100, 32, 110, 111, 116, 32, 108, 111, 110, 103, 32, 97, 103, 111, 44, 32, 111, 110, 101, 32, 111, 102, 32, 116, 104, 111, 115, 101, 32, 119, 104, 111, 32, 104, 97, 115, 32, 97, 32, 108, 97, 110, 99, 101, 32, 97, 110, 100, 32, 97, 110, 99, 105, 101, 110, 116, 32, 115, 104, 105, 101, 108, 100, 32, 111, 110, 32, 97, 32, 115, 104, 101, 108, 102, 32, 97, 110, 100, 32, 107, 101, 101, 112, 115, 32, 97, 32, 115, 107, 105, 110, 110, 121, 32, 110, 97, 103, 32, 97, 110, 100, 32, 97, 32, 103, 114, 101, 121, 104, 111, 117, 110, 100, 32, 102, 111, 114, 32, 114, 97, 99, 105, 110, 103, 46]
any help?

Add a new line of code after your current ASCII_conversion list:
ASCII_conversion = [x if x!=32 else " " for x in ASCII_conversion]

A simple list comprehension with TrueValue if condition else FalseValue will do:
ASCII_conversion = [' ' if x == 32 else x for x in ASCII_conversion]
Or, combining the two operations:
ASCII_conversion = [' ' if x == ' ' else ord(x) for x in Original]
Note #1: The above piece of code will produce a list containing both strings and integers. This might not be a good approach. Try to have all elements of a list of the same type. It helps reduce errors.
Note #2: Please work on your variable naming convention. It sort of hurts readers' eyes.
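As a minimal sketch of Note #1, assuming the goal is to keep every element an integer and only turn codes back into characters when a string is needed (the names `text`, `codes` and `restored` are illustrative, not from the question):

```python
text = "Somewhere in la Mancha"

# Keep the list homogeneous: every element is an int code point.
codes = [ord(ch) for ch in text]

# Convert back to characters only when a string is needed.
restored = "".join(chr(code) for code in codes)

print(codes[:10])
print(restored)
```

With this layout a space stays 32 inside the list and only becomes " " again in the final string.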

Related

Is it possible to find the index of elements with char value higher than `n` with numpy?

Basically I have something like this:
letters = "ABNJDSJHIUOIUIYEIUWEYIUJHAJHSGJHASNMVFDJHKIUYEIUWYEWUIEYUIUYIEJSGCDJHDS"
And I want to find the indices of the letters above, let's say, M. I want to do something like:
import numpy as np
letters = "ABNJDSJHIUOIUIYEIUWEYIUJHAJHSGJHASNMVFDJHKIUYEIUWYEWUIEYUIUYIEJSGCDJHDS"
# - test
np_array = np.array(np.where(letters > chr(77))[0])
Is this possible? Or do I have to do something like letters not in ...?
Convert letters to a character array:
>>> ar = np.array(list(letters))
>>> ar
array(['A', 'B', 'N', 'J', 'D', 'S', 'J', 'H', 'I', 'U', 'O', 'I', 'U',
'I', 'Y', 'E', 'I', 'U', 'W', 'E', 'Y', 'I', 'U', 'J', 'H', 'A',
'J', 'H', 'S', 'G', 'J', 'H', 'A', 'S', 'N', 'M', 'V', 'F', 'D',
'J', 'H', 'K', 'I', 'U', 'Y', 'E', 'I', 'U', 'W', 'Y', 'E', 'W',
'U', 'I', 'E', 'Y', 'U', 'I', 'U', 'Y', 'I', 'E', 'J', 'S', 'G',
'C', 'D', 'J', 'H', 'D', 'S'], dtype='<U1')
>>> np.where(ar > 'M')[0]
array([ 2, 5, 9, 10, 12, 14, 17, 18, 20, 22, 28, 33, 34, 36, 43, 44, 47,
48, 49, 51, 52, 55, 56, 58, 59, 63, 70], dtype=int64)
Byte arrays can also be used:
>>> ar = np.array(bytearray(letters.encode()))
>>> ar
array([65, 66, 78, 74, 68, 83, 74, 72, 73, 85, 79, 73, 85, 73, 89, 69, 73,
85, 87, 69, 89, 73, 85, 74, 72, 65, 74, 72, 83, 71, 74, 72, 65, 83,
78, 77, 86, 70, 68, 74, 72, 75, 73, 85, 89, 69, 73, 85, 87, 89, 69,
87, 85, 73, 69, 89, 85, 73, 85, 89, 73, 69, 74, 83, 71, 67, 68, 74,
72, 68, 83], dtype=uint8)
>>> np.where(ar > ord('M'))[0]
array([ 2, 5, 9, 10, 12, 14, 17, 18, 20, 22, 28, 33, 34, 36, 43, 44, 47,
48, 49, 51, 52, 55, 56, 58, 59, 63, 70], dtype=int64)
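A possible variation on the byte-array approach, sketched under the assumption that the string is pure ASCII: `np.frombuffer` views the encoded bytes without copying them, and `np.flatnonzero` returns the matching indices directly. The names `codes` and `indices` are illustrative.

```python
import numpy as np

letters = "ABNJDSJHIUOIUIYEIUWEYIUJHAJHSGJHASNMVFDJHKIUYEIUWYEWUIEYUIUYIEJSGCDJHDS"

# View the ASCII bytes as a read-only uint8 array (no copy is made).
codes = np.frombuffer(letters.encode("ascii"), dtype=np.uint8)

# flatnonzero returns the indices at which the condition is True.
indices = np.flatnonzero(codes > ord("M"))

print(indices)
```

`letters.encode("ascii")` raises `UnicodeEncodeError` if the string contains non-ASCII characters, which makes the assumption explicit.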

Find the product of the minimum height of defenders lower than 180 cm and the maximum height of midfielders higher than 185 cm

positions = ['GK', 'M', 'A', 'D', 'M', 'D', 'M', 'M', 'M', 'A', 'M', 'M', 'A', 'A', 'A', 'M', 'D', 'A', 'D', 'M', 'GK', 'D', 'D', 'M', 'M', 'M', 'M', 'D', 'M', 'GK', 'D', 'GK', 'D', 'D', 'M']
heights = [191, 184, 185, 180, 181, 187, 170, 179, 183, 186, 185, 170, 187, 183, 173, 188, 183, 180, 188, 175, 193, 180, 185, 170, 183, 173, 185, 185, 168, 190, 178, 185, 185, 193, 183]
np_positions = np.array(positions)
np_heights = np.array(heights)
My code is:
print(np.min(positions=='D'[heights<180]) * np.max(positions=='A'[heights > 185]))
I get a TypeError. I solved it another way, but I need to do this in one line.
IIUC, use your numpy arrays and boolean slicing!
NB. assuming Defender is D and Midfielder is M.
# position is D/M condition on height
(np_heights[(np_positions=='D')&(np_heights<180)].min()
*np_heights[(np_positions=='M')&(np_heights>185)].max()
)
output: 33464 (178*188)
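The same boolean masking can be sketched with named masks, which may be easier to read than a single expression. The mask names `short_defenders` and `tall_midfielders` are illustrative, and the sketch keeps the assumption that Defender is D and Midfielder is M.

```python
import numpy as np

positions = ['GK', 'M', 'A', 'D', 'M', 'D', 'M', 'M', 'M', 'A', 'M', 'M',
             'A', 'A', 'A', 'M', 'D', 'A', 'D', 'M', 'GK', 'D', 'D', 'M',
             'M', 'M', 'M', 'D', 'M', 'GK', 'D', 'GK', 'D', 'D', 'M']
heights = [191, 184, 185, 180, 181, 187, 170, 179, 183, 186, 185, 170,
           187, 183, 173, 188, 183, 180, 188, 175, 193, 180, 185, 170,
           183, 173, 185, 185, 168, 190, 178, 185, 185, 193, 183]
np_positions = np.array(positions)
np_heights = np.array(heights)

# Each mask combines a position test and a height test element-wise.
short_defenders = (np_positions == 'D') & (np_heights < 180)
tall_midfielders = (np_positions == 'M') & (np_heights > 185)

result = np_heights[short_defenders].min() * np_heights[tall_midfielders].max()
print(result)
```

Note that `.min()` and `.max()` raise `ValueError` on an empty selection, so a mask that matches no player would need separate handling.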
What you could do is to create a 2D list with position and height using zip.
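Python's min and max functions accept a key function that determines the value each element is compared by. Here I used a lambda with a conditional expression: elements that do not match the filter get a sentinel key (1e9 for min, -1e9 for max), so they can never be selected as long as at least one element matches.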
positions = ['GK', 'M', 'A', 'D', 'M', 'D', 'M', 'M', 'M', 'A', 'M', 'M', 'A', 'A', 'A', 'M', 'D', 'A', 'D', 'M', 'GK', 'D', 'D', 'M', 'M', 'M', 'M', 'D', 'M', 'GK', 'D', 'GK', 'D', 'D', 'M']
heights = [191, 184, 185, 180, 181, 187, 170, 179, 183, 186, 185, 170, 187, 183, 173, 188, 183, 180, 188, 175, 193, 180, 185, 170, 183, 173, 185, 185, 168, 190, 178, 185, 185, 193, 183]
# create a 2d array:
d = list(zip(positions, heights))
min_height_defender = min(d, key=lambda x: x[1] if x[0] == 'D' and x[1] < 180 else 1e9)
max_height_midfielder = max(d, key=lambda x: x[1] if x[0] == 'A' and x[1] > 185 else -1e9)
print( min_height_defender[1] * max_height_midfielder[1] )
>>> 33286
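Or you can use numpy. I would suggest filtering first and checking the height limits on the minimum and maximum afterwards, otherwise your code becomes unreadable. Note that the next two snippets filter on position only; they give the same result here because the shortest 'D' player is already under 180 cm and the tallest 'A' player is already over 185 cm: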
# or with numpy:
positions = np.array(positions)
heights = np.array(heights)
print( heights[np.argwhere(positions == 'D')].min() * heights[np.argwhere(positions == 'A')].max() )
>>> 33286
# or without argwhere:
print( heights[positions == 'D'].min() * heights[positions == 'A'].max() )
>>> 33286
If you want to filter inline you can, but as I said it is very ugly (in my opinion), and you should avoid doing more than one thing per line of code. But if you want it just for the kick of doing things in one line:
print( heights[heights<180][positions[heights<180] == 'D'].min() * heights[heights>185][positions[heights>185] == 'A'].max() )
>>> 33286
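A plain-Python alternative to the sentinel keys, sketched with generator expressions that filter before taking the minimum and maximum. It follows this answer's use of 'A' for the second group; the names `pairs`, `min_defender` and `max_attacker` are illustrative.

```python
positions = ['GK', 'M', 'A', 'D', 'M', 'D', 'M', 'M', 'M', 'A', 'M', 'M',
             'A', 'A', 'A', 'M', 'D', 'A', 'D', 'M', 'GK', 'D', 'D', 'M',
             'M', 'M', 'M', 'D', 'M', 'GK', 'D', 'GK', 'D', 'D', 'M']
heights = [191, 184, 185, 180, 181, 187, 170, 179, 183, 186, 185, 170,
           187, 183, 173, 188, 183, 180, 188, 175, 193, 180, 185, 170,
           183, 173, 185, 185, 168, 190, 178, 185, 185, 193, 183]

pairs = list(zip(positions, heights))

# Filter on position and height first, then take the extreme value.
min_defender = min(h for p, h in pairs if p == 'D' and h < 180)
max_attacker = max(h for p, h in pairs if p == 'A' and h > 185)

print(min_defender * max_attacker)
```

Unlike the sentinel approach, `min` and `max` raise `ValueError` here when nothing matches, instead of silently returning a non-matching player.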

Python3.7 i can't convert this byte to string

I have this code:
byte = b'\x7f\x9fKL\xaa\xe6\xc8\x8d\xdf865\xf1s\t`R\xd6\xe8\x9c\x07\xae\x97\xe4\x0e\xe6\x08_CZY(1\x94\xca1\x165m\xd6m\x90xs\xc7\x90d\x0c\xe3\xe9;\x9ec\xd3Q\xe6\x11<z\xff:\x97\x9cz\x86{\xdd\x82S\xfc_\xbcow,`i<\xdd\x0f\xe0^\xb12\xdc,\xf5\x08\xdeey\xbb\xf4o\xadx\xc8(\xd0\xab)\xc1\x7f\xbe<z\xderLp\xa0\x02\x0c\x87!+q\x90\xae\x17\xd0\\y04\x1f\xae\xd2x\xc2\x92\xd4\xd5\x04\x9c\x9c\xc7\x0e\xcbxb\x81\xab\xe4w\xf4\xa1\x9f5\xb1p\xf1\xdf\x12^\x00lA\x83\xe1KP\xdb\xa93\x83\x13\x19\xb8\xf7RA\xe8\xe7\xdcU\xfc\xff\xbcJ\x9d\xc2\xba \xd5\xd5>\x15X#=\xf9\xdf\xbe\xee.\xc5\x82c\r\xd6\xad\x88=\xfc\x9f\xf4%+\xf5\ry\xb7\xb2\xabN\x1a\xb5$\xb6\x8b\x7f2sT\x9eo//\xb3\xbe\xdc\xc8\xbc\xc40\xae/P\xef\x1a\x0bP\x96R\xa0p\xe5\x8a\xad\x11\xe5u\xaa\xcbR'
print(str(byte,'utf-8'))
I want to convert these bytes to a string and add it to a JSON file, so that I can read the string back and convert it to bytes again when I want to use it.
But when I try to convert it, I get this error:
Traceback (most recent call last):
File "wallet.py", line 126, in <module>
print(str(byte,'utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9f in position 1: invalid start byte
You could decode it like this,
>>> byte
b'\x7f\x9fKL\xaa\xe6\xc8\x8d\xdf865\xf1s\tR\xd6\xe8\x9c\x07\xae\x97\xe4\x0e\xe6\x08_CZY(1\x94\xca1\x165m\xd6m\x90xs\xc7\x90d\x0c\xe3\xe9;\x9ec\xd3Q\xe6\x11<z\xff:\x97\x9cz\x86{\xdd\x82S\xfc_\xbcow,i<\xdd\x0f\xe0^\xb12\xdc,\xf5\x08\xdeey\xbb\xf4o\xadx\xc8(\xd0\xab)\xc1\x7f\xbe\x15X#=\xf9\xdf\xbe\xee.\xc5\x82c\r\xd6\xad\x88=\xfc\x9f\xf4%+\xf5\ry\xb7\xb2\xabN\x1a\xb5$\xb6\x8b\x7f2sT\x9eo//\xb3\xbe\xdc\xc8\xbc\xc40\xae/P\xef\x1a\x0bP\x96R\xa0p\xe5\x8a\xad\x11\xe5u\xaa\xcbR'
>>> print(''.join(chr(x) for x in byte))
KLªæÈß865ñs RÖè®ä_CZY(1Ê15mÖmxsÇd
y·²«N▒µ$¶2sTo//³¾ÜȼÄ0®/Pï▒ ãé;cÓQæ<zÿ:z{ÝSü_¼ow,i<Ýà^±2Ü,Þey»ôo­xÈ(Ы)Á¾X#=ùß¾î.Åc
PR på­åuªËR
You can see what is going on here,
>>> y = [chr(x) for x in byte]
>>> y
['\x7f', '\x9f', 'K', 'L', 'ª', 'æ', 'È', '\x8d', 'ß', '8', '6', '5', 'ñ', 's', '\t', 'R', 'Ö', 'è', '\x9c', '\x07', '®', '\x97', 'ä', '\x0e', 'æ', '\x08', '_', 'C', 'Z', 'Y', '(', '1', '\x94', 'Ê', '1', '\x16', '5', 'm', 'Ö', 'm', '\x90', 'x', 's', 'Ç', '\x90', 'd', '\x0c', 'ã', 'é', ';', '\x9e', 'c', 'Ó', 'Q', 'æ', '\x11', '<', 'z', 'ÿ', ':', '\x97', '\x9c', 'z', '\x86', '{', 'Ý', '\x82', 'S', 'ü', '_', '¼', 'o', 'w', ',', 'i', '<', 'Ý', '\x0f', 'à', '^', '±', '2', 'Ü', ',', 'õ', '\x08', 'Þ', 'e', 'y', '»', 'ô', 'o', '\xad', 'x', 'È', '(', 'Ð', '«', ')', 'Á', '\x7f', '¾', '\x15', 'X', '#', '=', 'ù', 'ß', '¾', 'î', '.', 'Å', '\x82', 'c', '\r', 'Ö', '\xad', '\x88', '=', 'ü', '\x9f', 'ô', '%', '+', 'õ', '\r', 'y', '·', '²', '«', 'N', '\x1a', 'µ', '$', '¶', '\x8b', '\x7f', '2', 's', 'T', '\x9e', 'o', '/', '/', '³', '¾', 'Ü', 'È', '¼', 'Ä', '0', '®', '/', 'P', 'ï', '\x1a', '\x0b', 'P', '\x96', 'R', '\xa0', 'p', 'å', '\x8a', '\xad', '\x11', 'å', 'u', 'ª', 'Ë', 'R']
>>> [ord(x) for x in y]
[127, 159, 75, 76, 170, 230, 200, 141, 223, 56, 54, 53, 241, 115, 9, 82, 214, 232, 156, 7, 174, 151, 228, 14, 230, 8, 95, 67, 90, 89, 40, 49, 148, 202, 49, 22, 53, 109, 214, 109, 144, 120, 115, 199, 144, 100, 12, 227, 233, 59, 158, 99, 211, 81, 230, 17, 60, 122, 255, 58, 151, 156, 122, 134, 123, 221, 130, 83, 252, 95, 188, 111, 119, 44, 105, 60, 221, 15, 224, 94, 177, 50, 220, 44, 245, 8, 222, 101, 121, 187, 244, 111, 173, 120, 200, 40, 208, 171, 41, 193, 127, 190, 21, 88, 35, 61, 249, 223, 190, 238, 46, 197, 130, 99, 13, 214, 173, 136, 61, 252, 159, 244, 37, 43, 245, 13, 121, 183, 178, 171, 78, 26, 181, 36, 182, 139, 127, 50, 115, 84, 158, 111, 47, 47, 179, 190, 220, 200, 188, 196, 48, 174, 47, 80, 239, 26, 11, 80, 150, 82, 160, 112, 229, 138, 173, 17, 229, 117, 170, 203, 82]
>>> bytes([ord(x) for x in y])
b'\x7f\x9fKL\xaa\xe6\xc8\x8d\xdf865\xf1s\tR\xd6\xe8\x9c\x07\xae\x97\xe4\x0e\xe6\x08_CZY(1\x94\xca1\x165m\xd6m\x90xs\xc7\x90d\x0c\xe3\xe9;\x9ec\xd3Q\xe6\x11<z\xff:\x97\x9cz\x86{\xdd\x82S\xfc_\xbcow,i<\xdd\x0f\xe0^\xb12\xdc,\xf5\x08\xdeey\xbb\xf4o\xadx\xc8(\xd0\xab)\xc1\x7f\xbe\x15X#=\xf9\xdf\xbe\xee.\xc5\x82c\r\xd6\xad\x88=\xfc\x9f\xf4%+\xf5\ry\xb7\xb2\xabN\x1a\xb5$\xb6\x8b\x7f2sT\x9eo//\xb3\xbe\xdc\xc8\xbc\xc40\xae/P\xef\x1a\x0bP\x96R\xa0p\xe5\x8a\xad\x11\xe5u\xaa\xcbR'
>>>
>>> len(y)
171
>>> len(byte)
171
Try this code (latin-1 maps every byte value from 0 to 255 to a character, so the decoding cannot fail):
print(byte.decode('latin-1'))
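Since the goal is to store the bytes in a JSON file and restore them later, here is a minimal round-trip sketch. It shows the latin-1 approach from above and, as an alternative that is not from the answers, Base64 from the standard library. The short `raw` value and the `"key"` field name are stand-ins chosen for illustration.

```python
import base64
import json

raw = b"\x7f\x9fKL\xaa\xe6\xc8\x8d"  # short stand-in for the bytes in the question

# Option 1: latin-1 maps each byte to the code point with the same number,
# so decoding and re-encoding is lossless.
as_text = raw.decode("latin-1")
latin1_restored = json.loads(json.dumps({"key": as_text}))["key"].encode("latin-1")

# Option 2 (assumption: not from the answers): Base64 yields plain ASCII text.
encoded = base64.b64encode(raw).decode("ascii")
document = json.dumps({"key": encoded})
restored = base64.b64decode(json.loads(document)["key"])

print(latin1_restored == raw, restored == raw)
```

Base64 output contains only ASCII letters, digits, `+`, `/` and `=`, so the stored JSON stays readable, at the cost of being about a third longer than the raw data.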

Coding words as products of integers

I am trying to write a program that checks if smaller words are found within a larger word. For example, the word "computer" contains the words "put", "rum", "cut", etc. To perform the check I am trying to code each word as a product of prime numbers, that way the smaller words will all be factors of the larger word. I have a list of letters and a list of primes and have assigned (I think) an integer value to each letter:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101]
index = 0
while index <= len(letters)-1:
    letters[index] = primes[index]
    index += 1
The problem I am having now is how to get the integer code for a given word and be able to create the codes for a whole list of words. For example, I want to be able to input the word "cab," and have the code generate its integer value of 5*2*3 = 30.
Any help would be much appreciated.
from functools import reduce # only needed for Python 3.x
from operator import mul
primes = [
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
]
lookup = dict(zip("abcdefghijklmnopqrstuvwxyz", primes))
def encode(s):
    # The initial value 1 makes encode("") return 1 instead of raising TypeError
    return reduce(mul, (lookup.get(ch, 1) for ch in s.lower()), 1)
then
encode("cat") # => 710
encode("act") # => 710
Edit: more to the point,
def is_anagram(s1, s2):
    """
    s1 consists of the same letters as s2, rearranged
    """
    return encode(s1) == encode(s2)
def is_subset(s1, s2):
    """
    s1 consists of some letters from s2, rearranged
    """
    return encode(s2) % encode(s1) == 0
then
is_anagram("cat", "act") # => True
is_subset("cat", "tactful") # => True
I would use a dict here to look-up the prime for a given letter:
In [1]: letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
In [2]: primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101]
In [3]: lookup = dict(zip(letters, primes))
In [4]: lookup['a']
Out[4]: 2
This will let you easily determine the list of primes for a given word:
In [5]: [lookup[letter] for letter in "computer"]
Out[5]: [5, 47, 41, 53, 73, 71, 11, 61]
To find the product of those primes:
In [6]: import operator
In [7]: reduce(operator.mul, [lookup[letter] for letter in "cab"])
Out[7]: 30
You've got your two lists set up, so now you just need to iterate over each character in a word and determine what value that letter gives you.
Something like
total = 1
for letter in word:
    index = letters.index(letter)
    total *= primes[index]
Or whichever operation you decide to use.
You would generalize that to a list of words.
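One way to generalize this to a list of words, as a sketch rather than a definitive implementation: it assumes Python 3.8+ for `math.prod` and lowercase a-z input, and the helper name `contained_words` is hypothetical.

```python
from math import prod  # available from Python 3.8
from string import ascii_lowercase

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
lookup = dict(zip(ascii_lowercase, primes))

def encode(word):
    """Return the product of the primes assigned to the letters of word."""
    return prod(lookup[ch] for ch in word.lower())

def contained_words(big_word, candidates):
    """Return the candidates whose letters can all be taken from big_word."""
    big_code = encode(big_word)
    return [w for w in candidates if big_code % encode(w) == 0]

print(encode("cab"))
print(contained_words("computer", ["put", "rum", "cut", "cat"]))
```

Because repeated letters multiply the same prime several times, the divisibility test also respects letter counts. Characters outside a-z raise `KeyError` in this sketch.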
Hmmmm... It isn't very clear how this code is supposed to run. If it is built to find words in the English dictionary, think about using PyEnchant, a module for checking if words are in the dictionary. Something you could try is this:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
word = input('What is your word? ')  # raw_input in Python 2
word = list(word)
total = 1
nums = []
for k in word:
    nums.append(primes[letters.index(k)])
for k in nums:
    total = total*k
print(total)
This will output as:
>>> What is your word? cat
710
>>>
This is correct, as 5*2*71 equals 710

How can you convert a Python identifier into a number?

Reference: Is there a faster way of converting a number to a name?
In the question referenced above, a solution was found for turning a number into a name. This question asks just the opposite. How can you convert a name back into a number? So far, this is what I have:
>>> import string
>>> HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
>>> TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
>>> HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
>>> def number_to_name(number):
    "Convert a number into a valid identifier."
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return number_to_name(q) + TAIL_CHAR[r]
>>> [number_to_name(n) for n in range(117)]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A0', 'A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9', 'AA', 'AB', 'AC', 'AD', 'AE', 'AF', 'AG', 'AH', 'AI', 'AJ', 'AK', 'AL', 'AM', 'AN', 'AO', 'AP', 'AQ', 'AR', 'AS', 'AT', 'AU', 'AV', 'AW', 'AX', 'AY', 'AZ', 'A_', 'Aa', 'Ab', 'Ac', 'Ad', 'Ae', 'Af', 'Ag', 'Ah', 'Ai', 'Aj', 'Ak', 'Al', 'Am', 'An', 'Ao', 'Ap', 'Aq', 'Ar', 'As', 'At', 'Au', 'Av', 'Aw', 'Ax', 'Ay', 'Az', 'B0']
>>> def name_to_number(name):
    assert name, 'Name must exist!'
    head, *tail = name
    number = HEAD_CHAR.index(head)
    for position, char in enumerate(tail):
        if position:
            number *= TAIL_BASE
        else:
            number += HEAD_BASE
        number += TAIL_CHAR.index(char)
    return number
>>> [name_to_number(number_to_name(n)) for n in range(117)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 54]
The function number_to_name works perfectly, and name_to_number works up until it gets to number 116. At that point, the function returns 54 instead. Does anyone see the code's problem?
Solution based on recursive's answer:
import string
HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
def name_to_number(name):
    if not name.isidentifier():
        raise ValueError('Name must be a Python identifier!')
    head, *tail = name
    number = HEAD_CHAR.index(head)
    for char in tail:
        number *= TAIL_BASE
        number += TAIL_CHAR.index(char)
    return number + sum(HEAD_BASE * TAIL_BASE ** p for p in range(len(tail)))
Unfortunately, these identifiers don't yield to traditional constant base encoding techniques. For example "A" acts like a zero, but leading "A"s change the value. In normal number systems leading zeroes do not. There could be multiple approaches, but I settled on one that calculates the total number of identifiers with fewer digits, and starts from that.
from functools import reduce  # reduce is not a builtin in Python 3
def name_to_number(name):
    assert name, 'Name must exist!'
    skipped = sum(HEAD_BASE * TAIL_BASE ** i for i in range(len(name) - 1))
    val = reduce(
        lambda a, b: a * TAIL_BASE + TAIL_CHAR.index(b),
        name[1:],
        HEAD_CHAR.index(name[0]))
    return val + skipped
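A round-trip check is a quick way to gain confidence in the corrected conversion. The sketch below repeats the definitions from this thread so that it runs on its own; the range of 10000 and the name `round_trip_ok` are arbitrary choices for illustration.

```python
import string
from functools import reduce

HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)

def number_to_name(number):
    "Convert a number into a valid identifier."
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return number_to_name(q) + TAIL_CHAR[r]

def name_to_number(name):
    "Convert an identifier back into its number."
    # Count every identifier that is shorter than this one.
    skipped = sum(HEAD_BASE * TAIL_BASE ** i for i in range(len(name) - 1))
    val = reduce(
        lambda a, b: a * TAIL_BASE + TAIL_CHAR.index(b),
        name[1:],
        HEAD_CHAR.index(name[0]))
    return val + skipped

# Every number should survive the trip to a name and back.
round_trip_ok = all(name_to_number(number_to_name(n)) == n for n in range(10000))
print(round_trip_ok)
```

The range covers one-, two- and three-character names, including 116 ('B0'), the value at which the original function went wrong.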
