I have a charset defined as follows.
charset =set([ '$', '^', '#', '(', ')', '-', '.', '/', '1', '2', '3', '4', '5', '6', '7', '=', 'Br',
'C', 'Cl', 'F', 'I', 'N', 'O', 'P', 'S', '[2H]', '[Br-]', '[C##H]', '[C##]', '[C#H]', '[C#]',
'[Cl-]', '[H]', '[I-]', '[N+]', '[N-]', '[N#+]', '[N##+]', '[NH+]', '[NH2+]', '[NH3+]', '[N]',
'[Na+]', '[O-]', '[P+]', '[S+]', '[S-]', '[S#+]', '[S##+]', '[SH]', '[Si]', '[n+]', '[n-]',
'[nH+]', '[nH]', '[o+]', '[se]', '\\', 'c', 'n', 'o', 's', '!', 'E'])
On the basis of this charset, I create char_to_int as follows.
char_to_int = dict((c,i) for i,c in enumerate(charset))
{'[nH]': 0, '[2H]': 1, '2': 2, 'N': 3, 'Cl': 4, 'c': 5, '$': 6,
'O': 7, '(': 8, '6': 9, 's': 10, '[S#+]': 11, '[C##H]': 12, 'C':
13, '[nH+]': 14, '/': 15, '[NH+]': 16, '[Br-]': 17, '[Si]': 18,
'4': 19, '[N#+]': 20, '[se]': 21, 'P': 22, '[SH]': 23, '[N+]':
24, '[N]': 25, '^': 26, '5': 27, '7': 28, 'n': 29, '!': 30,
'\\': 31, '[n-]': 32, 'S': 33, '[NH3+]': 34, '#': 35, 'I': 36,
'[O-]': 37, '1': 38, '[NH2+]': 39, '[S##+]': 40, 'Br': 41, 'F':
42, '[Na+]': 43, 'E': 44, '[S-]': 45, '.': 46, ')': 47, '[C#]':
48, '=': 49, '3': 50, '-': 51, '[C#H]': 52, '[Cl-]': 53, '[I-]':
54, '[H]': 55, '[P+]': 56, '[S+]': 57, 'o': 58, '[N##+]': 59,
'[N-]': 60, '[n+]': 61, '[o+]': 62, '[C##]': 63}
and int_to_char as follows.
int_to_char = dict((i,c) for i,c in enumerate(charset))
{0: '[nH]', 1: '[2H]', 2: '2', 3: 'N', 4: 'Cl', 5: 'c', 6: '$',
7: 'O', 8: '(', 9: '6', 10: 's', 11: '[S#+]', 12: '[C##H]', 13:
'C', 14: '[nH+]', 15: '/', 16: '[NH+]', 17: '[Br-]', 18: '[Si]',
19: '4', 20: '[N#+]', 21: '[se]', 22: 'P', 23: '[SH]', 24:
'[N+]', 25: '[N]', 26: '^', 27: '5', 28: '7', 29: 'n', 30: '!',
31: '\\', 32: '[n-]', 33: 'S', 34: '[NH3+]', 35: '#', 36: 'I',
37: '[O-]', 38: '1', 39: '[NH2+]', 40: '[S##+]', 41: 'Br', 42:
'F', 43: '[Na+]', 44: 'E', 45: '[S-]', 46: '.', 47: ')', 48:
'[C#]', 49: '=', 50: '3', 51: '-', 52: '[C#H]', 53: '[Cl-]', 54:
'[I-]', 55: '[H]', 56: '[P+]', 57: '[S+]', 58: 'o', 59: '[N##+]',
60: '[N-]', 61: '[n+]', 62: '[o+]', 63: '[C##]'}
I have a string that I want to one-hot encode using char_to_int and int_to_char.
string = 'N[C#H]1C[C##H](N2Cc3nn4cccnc4c3C2)CC[C##H]1c1cc(F)c(F)cc1F'
Is there an efficient way to convert a string to a one-hot vector using the self-defined char_to_int and int_to_char?
from itertools import chain, repeat, islice
import re
string = 'N[C#H]1C[C##H](N2Cc3nn4cccnc4c3C2)CC[C##H]1c1cc(F)c(F)cc1F'
items_list=[ '$', '^', '#', '(', ')', '-', '.', '/', '1', '2', '3', '4', '5', '6', '7', '=', 'Br',
'C', 'Cl', 'F', 'I', 'N', 'O', 'P', 'S', '[2H]', '[Br-]', '[C##H]', '[C##]', '[C#H]', '[C#]',
'[Cl-]', '[H]', '[I-]', '[N+]', '[N-]', '[N#+]', '[N##+]', '[NH+]', '[NH2+]', '[NH3+]', '[N]',
'[Na+]', '[O-]', '[P+]', '[S+]', '[S-]', '[S#+]', '[S##+]', '[SH]', '[Si]', '[n+]', '[n-]',
'[nH+]', '[nH]', '[o+]', '[se]', '\\', 'c', 'n', 'o', 's', '!', 'E']
charset = set(items_list)
char_to_int = dict((c,i) for i,c in enumerate(charset))
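# Try longer tokens first; otherwise 'C' would match before 'Cl' and 'Cl' would never be found.
pattern = '|'.join(re.escape(item) for item in sorted(items_list, key=len, reverse=True))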
tokens = re.findall(pattern, string)
x=[char_to_int[k] for k in tokens]
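Here, x is the integer-encoded token sequence (one index per token). It is not yet one-hot encoded; each index still has to be expanded into a vector of length len(charset).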
x=[3, 52, 38, 13, 12, 8, 3, 2, 13, 5, 50, 29, 29, 19, 5, 5, 5, 29, 5, 19, 5, 50, 13, 2, 47, 13, 13, 12, 38, 5, 38, 5, 5, 8, 42, 47, 5, 8, 42, 47, 5, 5, 38, 42]
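The integer indices above can be expanded into one-hot rows in a second step. What follows is a minimal sketch, not the original poster's code: it assumes a reduced vocabulary for brevity, and the helper names one_hot_encode and one_hot_decode are hypothetical. The vocabulary is kept as a sorted list because the iteration order of a set of strings can differ between interpreter runs, which would make the indices unstable.

```python
import re

# Reduced vocabulary for illustration; the full charset works the same way.
vocab = sorted(['(', ')', '1', 'C', 'Cl', 'F', 'N', '[C#H]', 'c'])

# A sorted list gives the same index for each token on every run.
char_to_int = {token: index for index, token in enumerate(vocab)}
int_to_char = {index: token for token, index in char_to_int.items()}

# Longest tokens first, so that 'Cl' is matched before 'C'.
pattern = re.compile(
    '|'.join(re.escape(token) for token in sorted(vocab, key=len, reverse=True))
)

def one_hot_encode(text):
    """Return one row per token, with a single 1 at the token's index."""
    indices = [char_to_int[token] for token in pattern.findall(text)]
    return [[1 if column == index else 0 for column in range(len(vocab))]
            for index in indices]

def one_hot_decode(rows):
    """Invert one_hot_encode by locating the 1 in each row."""
    return ''.join(int_to_char[row.index(1)] for row in rows)

encoded = one_hot_encode('N[C#H]1CCl')
print(len(encoded), len(encoded[0]))
print(one_hot_decode(encoded))
```

For large inputs, the same rows could be built as a NumPy array instead of nested lists; the list version is used here only to keep the sketch free of dependencies.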
I'm a Korean. English translation may be wrong.
I am writing a Python program on a Raspberry Pi 4 that outputs the data read by a QR reader connected as a USB input device.
The line below raises KeyError: 74. What is the workaround?
ss += hid[int(ord(c))]
Below is the full code.
import sys
hid = {4: 'a', 5: 'b', 6: 'c', 7: 'd', 8: 'e', 9: 'f', 10: 'g', 11: 'h', 12: 'i', 13: 'j', 14: 'k', 15: 'l', 16: 'm',
17: 'n', 18: 'o', 19: 'p', 20: 'q', 21: 'r', 22: 's', 23: 't', 24: 'u', 25: 'v', 26: 'w', 27: 'x', 28: 'y',
29: 'z', 30: '1', 31: '2', 32: '3', 33: '4', 34: '5', 35: '6', 36: '7', 37: '8', 38: '9', 39: '0', 44: ' ',
45: '-', 46: '=', 47: '[', 48: ']', 49: '\\', 51: ';', 52: '\'', 53: '~', 54: ',', 55: '.', 56: '/'}
hid2 = {4: 'A', 5: 'B', 6: 'C', 7: 'D', 8: 'E', 9: 'F', 10: 'G', 11: 'H', 12: 'I', 13: 'J', 14: 'K', 15: 'L', 16: 'M',
17: 'N', 18: 'O', 19: 'P', 20: 'Q', 21: 'R', 22: 'S', 23: 'T', 24: 'U', 25: 'V', 26: 'W', 27: 'X', 28: 'Y',
29: 'Z', 30: '!', 31: '@', 32: '#', 33: '$', 34: '%', 35: '^', 36: '&', 37: '*', 38: '(', 39: ')', 44: ' ',
45: '_', 46: '+', 47: '{', 48: '}', 49: '|', 51: ':', 52: '"', 53: '~', 54: '<', 55: '>', 56: '?'}
fp = open('/dev/hidraw4', 'rb')
ss = ""
shift = False
done = False
while not done:
    ## Get the character from the HID
    buffer = fp.read(8)
    for c in buffer:
        if ord(c) > 0:
            ## 40 is carriage return which signifies
            ## we are done looking for characters
            if int(ord(c)) == 40:
                done = True
                break
            ## If we are shifted then we have to
            ## use the hid2 characters.
            if shift:
                ## If it is a '2' then it is the shift key
                if int(ord(c)) == 2:
                    shift = True
                ## if not a 2 then lookup the mapping
                else:
                    ss += hid2[int(ord(c))]
                    shift = False
            ## If we are not shifted then use
            ## the hid characters
            else:
                ## If it is a '2' then it is the shift key
                if int(ord(c)) == 2:
                    shift = True
                ## if not a 2 then lookup the mapping
                else:
                    ss += hid[int(ord(c))]
print(ss)
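A KeyError is raised when you try to access a key in a dict that does not contain that key. Here the value 74 (the same number as int(ord("J"))) has no entry in hid, so you probably want to re-check and update your mapping so that it contains the values your device actually sends as keys.
You can avoid the KeyError by changing hid[int(ord(c))] to hid.get(int(ord(c)), ''). A plain hid.get(int(ord(c))) returns None when the key does not exist, and ss += None would then raise a TypeError; passing '' as the default skips the unknown code instead.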
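The dict.get fallback described above can be sketched as follows. The mapping is a small illustrative subset, and the decode helper and its default parameter are hypothetical names, not part of the original script.

```python
# Partial, illustrative mapping of key codes to characters.
hid = {4: 'a', 5: 'b', 6: 'c', 30: '1', 31: '2'}

def decode(codes, mapping, default=''):
    """Translate codes to text; codes missing from mapping become default."""
    return ''.join(mapping.get(code, default) for code in codes)

text = decode([4, 74, 5, 30], hid)       # 74 is not in the mapping and is skipped
flagged = decode([4, 74, 5], hid, '?')   # or mark unknown codes visibly

print(text)
print(flagged)
```

Silently dropping unknown codes hides gaps in the mapping, so a visible placeholder (or logging the code) may be the better choice while the mapping is still being completed.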
Can a dictionary comprehension iterate over more than one level of a string, for example when each item must be split twice?
For example, we have a string in the following format:
"6: 14, 11: 28, 17: 74, 22: 7, 38: 59, 49: 12, 57: 76, 61: 54, 81: 98, 88: 4"
I want to use 6, 11, 17, 22, 38, ... as the keys
and 14, 28, 74, 7, ... as the corresponding values.
How can this be achieved with a dictionary comprehension?
You can use ast.literal_eval to convert a string to a dictionary:
import ast
my_string = "6: 14, 11: 28, 17: 74, 22: 7, 38: 59, 49: 12, 57: 76, 61: 54, 81: 98, 88: 4"
my_dict = ast.literal_eval(f"{{{my_string}}}")
A dictionary comprehension combined with split() on ':' should be good enough (here string holds the input string from the question):
dic = {elt.split(':')[0].strip(): elt.split(':')[1].strip() for elt in string.split(',')}
Output:
{'6': '14', '11': '28', '17': '74', '22': '7', '38': '59', '49': '12', '57': '76', '61': '54', '81': '98', '88': '4'}
If you want the keys and values to be int objects:
dic = {int(elt.split(':')[0].strip()): int(elt.split(':')[1].strip()) for elt in string.split(',')}
Output:
{6: 14, 11: 28, 17: 74, 22: 7, 38: 59, 49: 12, 57: 76, 61: 54, 81: 98, 88: 4}
Use the dict constructor (here s holds the input string):
dict(x.replace(' ', '').split(':') for x in s.split(','))
Or use the dict constructor together with the map function:
dict(map(lambda x: x.replace(' ', '').split(':'), s.split(',')))
Output
{'6': '14', '11': '28', '17': '74', '22': '7', '38': '59', '49': '12', '57': '76', '61': '54', '81': '98', '88': '4'}
There are multiple possibilities:
With Regex:
import re
sample = "6: 14, 11: 28, 17: 74, 22: 7, 38: 59, 49: 12, 57: 76, 61: 54, 81: 98, 88: 4"
my_dict = {key: value for key, value in re.findall(r'(\d+): (\d+)', sample)}
print(my_dict)
output:
{'6': '14', '11': '28', '17': '74', '22': '7', '38': '59', '49': '12', '57': '76', '61': '54', '81': '98', '88': '4'}
With Split():
sample = "6: 14, 11: 28, 17: 74, 22: 7, 38: 59, 49: 12, 57: 76, 61: 54, 81: 98, 88: 4"
my_dict = {elem.split(": ")[0]: elem.split(": ")[1] for elem in sample.split(", ")}
print(my_dict)
output:
{'6': '14', '11': '28', '17': '74', '22': '7', '38': '59', '49': '12', '57': '76', '61': '54', '81': '98', '88': '4'}
The required output can be achieved with the comprehension below (keys stay strings, values become ints):
>>> s1 = "6: 14, 11: 28, 17: 74, 22: 7, 38: 59, 49: 12, 57: 76, 61: 54, 81: 98, 88: 4"
>>> dict((s[0].strip(),int(s[1].strip())) for s in [s.split(":") for s in s1.split(",")])
{'11': 28, '38': 59, '17': 74, '22': 7, '49': 12, '57': 76, '61': 54, '88': 4, '6': 14, '81': 98}
>>>
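The answers above call split(':') twice per item. A comprehension can also iterate in two steps so that each item is split only once. This is a minimal sketch on a shortened input string; the variable names are illustrative.

```python
s = "6: 14, 11: 28, 17: 74, 22: 7"

# Variant 1: an inner generator yields [key, value] pairs, the outer
# comprehension unpacks them. int() tolerates the surrounding spaces.
pairs = (item.split(':') for item in s.split(','))
result = {int(key): int(value) for key, value in pairs}

# Variant 2: two for-clauses in one comprehension. The second clause
# iterates over a one-element list purely to bind key and value.
result_nested = {
    int(key): int(value)
    for item in s.split(',')
    for key, value in [item.split(':')]
}

print(result)
print(result_nested)
```

Both variants build the same dictionary; the second one is the closer match to "multiple iterations inside one comprehension".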
My data frame is as follows:
ex = {'group': {0: '0', 1: '0', 2: '0', 3: '0', 4: '0', 5: '0', 6: '0', 7: '0', 8: '0', 9: '0', 10: '0', 11: '0', 12: '0', 13: '0', 14: '0', 15: '0', 16: '0', 17: '0', 18: '0', 19: '0', 20: '0', 21: '1', 22: '1', 23: '1', 24: '1', 25: '1', 26: '1', 27: '1', 28: '1', 29: '1', 30: '1', 31: '1', 32: '1', 33: '1', 34: '1', 35: '1', 36: '1', 37: '1', 38: '1', 39: '1'}, 'order': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16, 17: 17, 18: 18, 19: 19, 20: 20, 21: 0, 22: 1, 23: 2, 24: 3, 25: 4, 26: 5, 27: 6, 28: 7, 29: 8, 30: 9, 31: 10, 32: 11, 33: 12, 34: 13, 35: 14, 36: 15, 37: 16, 38: 17, 39: 18}, 'id': {0: '102', 1: '302', 2: '302', 3: '302', 4: '102', 5: '302', 6: '302', 7: '302', 8: '302', 9: '302', 10: '102', 11: '308', 12: '308', 13: '308', 14: '308', 15: '302', 16: '102', 17: '302', 18: '102', 19: '302', 20: '102', 21: '102', 22: '102', 23: '308', 24: '312', 25: '312', 26: '312', 27: '308', 28: '102', 29: '302', 30: '312', 31: '302', 32: '302', 33: '102', 34: '102', 35: '302', 36: '312', 37: '308', 38: '102', 39: '302'}, 'type': {0: 'A', 1: 'B', 2: 'C', 3: 'A', 4: 'D', 5: 'E', 6: 'D', 7: 'E', 8: 'A', 9: 'E', 10: 'E', 11: 'D', 12: 'A', 13: 'A', 14: 'A', 15: 'D', 16: 'D', 17: 'D', 18: 'A', 19: 'D', 20: 'A', 21: 'D', 22: 'F', 23: 'A', 24: 'D', 25: 'A', 26: 'E', 27: 'A', 28: 'E', 29: 'D', 30: 'E', 31: 'E', 32: 'G', 33: 'A', 34: 'D', 35: 'D', 36: 'H', 37: 'I', 38: 'A', 39: 'E'}, 'of_interest': {0: False, 1: False, 2: True, 3: False, 4: False, 5: True, 6: False, 7: True, 8: True, 9: True, 10: True, 11: True, 12: True, 13: False, 14: True, 15: True, 16: True, 17: True, 18: False, 19: False, 20: True, 21: False, 22: False, 23: False, 24: True, 25: False, 26: True, 27: True, 28: False, 29: True, 30: True, 31: False, 32: True, 33: True, 34: True, 35: True, 36: True, 37: False, 38: True, 39: False}}
import pandas as pd
ex = pd.DataFrame(ex)
ex.head()
  group  order   id type  of_interest
0     0      0  102    A        False
1     0      1  302    B        False
2     0      2  302    C         True
3     0      3  302    A        False
4     0      4  102    D        False
I want to create a column that, for each combination of group and id, returns the previous type where of_interest == True.
My first attempt filtered on of_interest == True first, so it returned values only for those rows:
ex['prev_type_of_interest'] = ex \
.query('of_interest == True') \
.groupby(['group', 'id'])['type'] \
.shift(1)
How can I return the previous type of interest for every row?
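I believe you need to shift all rows within each group, then use Series.where to keep the shifted type only where the shifted of_interest is True (everything else becomes missing), and finally fill those missing values with the previous non-missing value per group using GroupBy.ffill: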
ex1 = ex.groupby(['group', 'id']).shift()
ex['prev_type_of_interest'] = ex1['type'].where(ex1['of_interest'] == True)
ex['prev_type_of_interest'] = ex.groupby(['group', 'id'])['prev_type_of_interest'].ffill()
print (ex.head(10))
  group  order   id type  of_interest prev_type_of_interest
0     0      0  102    A        False                   NaN
1     0      1  302    B        False                   NaN
2     0      2  302    C         True                   NaN
3     0      3  302    A        False                     C
4     0      4  102    D        False                   NaN
5     0      5  302    E         True                     C
6     0      6  302    D        False                     E
7     0      7  302    E         True                     E
8     0      8  302    A         True                     E
9     0      9  302    E         True                     A
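The three steps (shift, where, ffill) are easier to follow on a tiny frame with a single group/id pair. This is a sketch on made-up data, assuming pandas is installed; the frame df and its values are illustrative and not taken from the question.

```python
import pandas as pd

# One (group, id) combination with five rows, for illustration only.
df = pd.DataFrame({
    'group': ['0'] * 5,
    'id': ['302'] * 5,
    'type': ['B', 'C', 'A', 'E', 'D'],
    'of_interest': [False, True, False, True, False],
})

keys = ['group', 'id']

# Step 1: move every row's values down by one position within its group.
prev = df.groupby(keys)[['type', 'of_interest']].shift()

# Step 2: keep the previous type only if the previous row was of interest.
df['prev_type_of_interest'] = prev['type'].where(prev['of_interest'] == True)

# Step 3: carry the last kept value forward within each group.
df['prev_type_of_interest'] = df.groupby(keys)['prev_type_of_interest'].ffill()

result = df['prev_type_of_interest'].tolist()
print(df)
```

The first two rows stay missing because no earlier row of interest exists for them; from the third row on, each row shows the type of the most recent earlier row with of_interest == True.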
I have set up a Raspberry Pi with a USB barcode scanner for a little project. It works with my generated barcodes and prints the output of the scanned code in the terminal. I want to save this output to a txt file that doesn't overwrite itself. I have tried changing all the functions and I just can't get it to work. I'm a novice in Python, I have been stuck on this for a long time, and I have looked all over the internet. If you can point me to the specific place in the code I need to change in order to write the output to a file, I would be very appreciative.
Source: Instructables
#!/usr/bin/python
import sys
import requests
import json
api_key = "" #https://upcdatabase.org/
def barcode_reader():
    hid = {4: 'a', 5: 'b', 6: 'c', 7: 'd', 8: 'e', 9: 'f', 10: 'g', 11: 'h', 12: 'i', 13: 'j', 14: 'k', 15: 'l', 16: 'm',
           17: 'n', 18: 'o', 19: 'p', 20: 'q', 21: 'r', 22: 's', 23: 't', 24: 'u', 25: 'v', 26: 'w', 27: 'x', 28: 'y',
           29: 'z', 30: '1', 31: '2', 32: '3', 33: '4', 34: '5', 35: '6', 36: '7', 37: '8', 38: '9', 39: '0', 44: ' ',
           45: '-', 46: '=', 47: '[', 48: ']', 49: '\\', 51: ';', 52: '\'', 53: '~', 54: ',', 55: '.', 56: '/'}
    ## Shifted characters; key 31 is '2', whose shifted form is '@'
    hid2 = {4: 'A', 5: 'B', 6: 'C', 7: 'D', 8: 'E', 9: 'F', 10: 'G', 11: 'H', 12: 'I', 13: 'J', 14: 'K', 15: 'L', 16: 'M',
            17: 'N', 18: 'O', 19: 'P', 20: 'Q', 21: 'R', 22: 'S', 23: 'T', 24: 'U', 25: 'V', 26: 'W', 27: 'X', 28: 'Y',
            29: 'Z', 30: '!', 31: '@', 32: '#', 33: '$', 34: '%', 35: '^', 36: '&', 37: '*', 38: '(', 39: ')', 44: ' ',
            45: '_', 46: '+', 47: '{', 48: '}', 49: '|', 51: ':', 52: '"', 53: '~', 54: '<', 55: '>', 56: '?'}
    fp = open('/dev/hidraw0', 'rb')
    ss = ""
    shift = False
    done = False
    while not done:
        ## Get the character from the HID
        buffer = fp.read(8)
        for c in buffer:
            if ord(c) > 0:
                ## 40 is carriage return which signifies
                ## we are done looking for characters
                if int(ord(c)) == 40:
                    done = True
                    break
                ## If we are shifted then we have to
                ## use the hid2 characters.
                if shift:
                    ## If it is a '2' then it is the shift key
                    if int(ord(c)) == 2:
                        shift = True
                    ## if not a 2 then lookup the mapping
                    else:
                        ss += hid2[int(ord(c))]
                        shift = False
                ## If we are not shifted then use
                ## the hid characters
                else:
                    ## If it is a '2' then it is the shift key
                    if int(ord(c)) == 2:
                        shift = True
                    ## if not a 2 then lookup the mapping
                    else:
                        ss += hid[int(ord(c))]
    return ss
def UPC_lookup(api_key,upc):
    '''V3 API'''
    url = "https://api.upcdatabase.org/product/%s/%s" % (upc, api_key)
    headers = {
        'cache-control': "no-cache",
    }
    response = requests.request("GET", url, headers=headers)
    print("-----" * 5)
    print(upc)
    print(json.dumps(response.json(), indent=2))
    print("-----" * 5 + "\n")
if __name__ == '__main__':
    try:
        while True:
            UPC_lookup(api_key,barcode_reader())
    except KeyboardInterrupt:
        pass
If it is already printing to the console it means it's coming from this part of the code:
print("-----" * 5)
print(upc)
print(json.dumps(response.json(), indent=2))
print("-----" * 5 + "\n")
In order to save it to a file you can use the following:
with open('FILENAME.txt', 'a', encoding='utf-8') as file:
    file.write('CONTENT THAT YOU WANT TO WRITE!\n')
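Or in your particular case, inside UPC_lookup next to the print calls. Unlike print, file.write does not add a newline, so each line needs an explicit "\n":
with open('FILENAME.txt', 'a', encoding='utf-8') as file:
    file.write("-----" * 5 + "\n")
    file.write(upc + "\n")
    file.write(json.dumps(response.json(), indent=2) + "\n")
    file.write("-----" * 5 + "\n\n")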
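The append-mode write can be wrapped in a small helper so that it is easy to call from the lookup function. This is a sketch under assumptions: the function name append_scan, the file name scans.txt and the sample payloads are hypothetical, and a temporary directory is used only so the example can run anywhere.

```python
import json
import os
import tempfile

def append_scan(path, upc, payload):
    """Append one scan record to path; mode 'a' keeps earlier records."""
    with open(path, 'a', encoding='utf-8') as log_file:
        log_file.write("-----" * 5 + "\n")
        log_file.write(upc + "\n")
        log_file.write(json.dumps(payload, indent=2) + "\n")
        log_file.write("-----" * 5 + "\n\n")

# Write two records to a scratch file to show that nothing is overwritten.
log_path = os.path.join(tempfile.mkdtemp(), 'scans.txt')
append_scan(log_path, '0123456789012', {'title': 'example item'})
append_scan(log_path, '0987654321098', {'title': 'another item'})

with open(log_path, encoding='utf-8') as log_file:
    contents = log_file.read()
print(contents)
```

In the script from the question, the call would go where the print statements are, with response.json() passed as the payload.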
Reference: Is there a faster way of converting a number to a name?
In the question referenced above, a solution was found for turning a number into a name. This question asks just the opposite: how can you convert a name back into a number? So far, this is what I have:
>>> import string
>>> HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
>>> TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
>>> HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
>>> def number_to_name(number):
    "Convert a number into a valid identifier."
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return number_to_name(q) + TAIL_CHAR[r]
>>> [number_to_name(n) for n in range(117)]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A0', 'A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9', 'AA', 'AB', 'AC', 'AD', 'AE', 'AF', 'AG', 'AH', 'AI', 'AJ', 'AK', 'AL', 'AM', 'AN', 'AO', 'AP', 'AQ', 'AR', 'AS', 'AT', 'AU', 'AV', 'AW', 'AX', 'AY', 'AZ', 'A_', 'Aa', 'Ab', 'Ac', 'Ad', 'Ae', 'Af', 'Ag', 'Ah', 'Ai', 'Aj', 'Ak', 'Al', 'Am', 'An', 'Ao', 'Ap', 'Aq', 'Ar', 'As', 'At', 'Au', 'Av', 'Aw', 'Ax', 'Ay', 'Az', 'B0']
>>> def name_to_number(name):
    assert name, 'Name must exist!'
    head, *tail = name
    number = HEAD_CHAR.index(head)
    for position, char in enumerate(tail):
        if position:
            number *= TAIL_BASE
        else:
            number += HEAD_BASE
        number += TAIL_CHAR.index(char)
    return number
>>> [name_to_number(number_to_name(n)) for n in range(117)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 54]
The function number_to_name works perfectly, and name_to_number works up until it gets to number 116. At that point, the function returns 54 instead. Does anyone see the code's problem?
Solution based on recursive's answer:
import string
HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
def name_to_number(name):
    if not name.isidentifier():
        raise ValueError('Name must be a Python identifier!')
    head, *tail = name
    number = HEAD_CHAR.index(head)
    for char in tail:
        number *= TAIL_BASE
        number += TAIL_CHAR.index(char)
    return number + sum(HEAD_BASE * TAIL_BASE ** p for p in range(len(tail)))
Unfortunately, these identifiers don't yield to traditional constant-base encoding techniques. For example, "A" acts like a zero, but leading "A"s change the value, whereas in normal number systems leading zeroes do not. There could be multiple approaches, but I settled on one that calculates the total number of identifiers with fewer characters and starts counting from that.
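from functools import reduce  # reduce is not a builtin in Python 3

def name_to_number(name):
    assert name, 'Name must exist!'
    # Number of valid identifiers that are shorter than name
    skipped = sum(HEAD_BASE * TAIL_BASE ** i for i in range(len(name) - 1))
    # Position of name among identifiers of the same length
    val = reduce(
        lambda a,b: a * TAIL_BASE + TAIL_CHAR.index(b),
        name[1:],
        HEAD_CHAR.index(name[0]))
    return val + skipped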
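A round-trip check makes it easy to confirm that the "count the shorter identifiers first" approach inverts number_to_name, including the value 116 that failed in the question. The sketch below repeats the definitions from the posts so that it runs on its own; the range of 10000 is an arbitrary choice.

```python
import string
from functools import reduce

HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)

def number_to_name(number):
    """Convert a number into a valid identifier (from the question)."""
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return number_to_name(q) + TAIL_CHAR[r]

def name_to_number(name):
    """Invert number_to_name by offsetting past all shorter identifiers."""
    skipped = sum(HEAD_BASE * TAIL_BASE ** i for i in range(len(name) - 1))
    val = reduce(lambda a, b: a * TAIL_BASE + TAIL_CHAR.index(b),
                 name[1:],
                 HEAD_CHAR.index(name[0]))
    return val + skipped

# Every number in the range should survive the trip to a name and back.
round_trip_ok = all(name_to_number(number_to_name(n)) == n for n in range(10000))
print(round_trip_ok)
print(number_to_name(116), name_to_number('B0'))
```

The original attempt returned 54 for 'B0' because it added HEAD_BASE at the first tail character without first multiplying the head index by TAIL_BASE.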