Python find CRC32 of string - python

I tried to compute the CRC32 of a string variable, but I get the following error:
>>> message='hello world!'
>>> import binascii
>>> binascii.crc32(message)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
For a string literal it can be done with binascii.crc32(b'hello world!'), but I would like to know how to do this for a variable of type str.

When you compute the CRC32 of some data, you need to know the exact bytes you are hashing. The same string maps to different bytes in different encodings, so passing a str as the parameter is ambiguous.
When you write binascii.crc32(b'hello world!'), the b prefix makes the literal a bytes object whose ASCII characters map directly to byte values.
To convert any string, encode it explicitly:
import binascii
text = 'hello'
binascii.crc32(text.encode('utf8'))
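To illustrate why the encoding matters, here is a minimal sketch. The helper name crc32_of_text is made up for this example and is not part of binascii. For pure ASCII text the common encodings yield the same bytes, while a non-ASCII character such as é does not:

```python
import binascii

def crc32_of_text(text, encoding="utf-8"):
    """Hypothetical helper: encode the text explicitly, then checksum the bytes."""
    return binascii.crc32(text.encode(encoding))

# Pure ASCII: UTF-8 and Latin-1 produce identical bytes, so the CRC matches.
ascii_utf8 = crc32_of_text("hello", "utf-8")
ascii_latin1 = crc32_of_text("hello", "latin-1")

# Non-ASCII: the encoded bytes differ, so crc32 is fed different data.
utf8_bytes = "héllo".encode("utf-8")      # é becomes two bytes
latin1_bytes = "héllo".encode("latin-1")  # é becomes one byte
print(ascii_utf8 == ascii_latin1, len(utf8_bytes), len(latin1_bytes))
```

Whichever encoding you pick, both sides that compare checksums need to agree on it.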

This can be done using binascii.crc32 or zlib.crc32. This answer improves upon the prior answer by Tomas by documenting both modules and by producing a string output besides just an integer.
# Define data
> text = "hello"
> data = text.encode()
> data
b'hello'
# Using binascii
> import binascii
> crc32 = binascii.crc32(data)
> crc32
907060870
> hex(crc32)
'0x3610a686'
> f'{crc32:#010x}'
'0x3610a686'
# Using zlib
> import zlib
> zlib.crc32(data)
907060870 # Works the same as binascii.crc32.
If you don't want the string output to have the 0x prefix:
> import base64
> crc32 = 907060870
> digest = crc32.to_bytes(4, 'big')
> digest
b'6\x10\xa6\x86'
> base64.b16encode(digest).decode().lower()
'3610a686'
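As an alternative sketch for the prefix-free form, a format spec or bytes.hex() should give the same result without importing base64. Both are standard Python 3 features; the variable names are only for illustration:

```python
import zlib

crc = zlib.crc32("hello".encode("utf-8"))

# Zero-padded to 8 hex digits, lowercase, no 0x prefix.
hex_digest = f"{crc:08x}"

# Same digits, going through the 4-byte big-endian representation.
bytes_digest = crc.to_bytes(4, "big").hex()

print(hex_digest, bytes_digest)
```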

Related

struct.error: bad char in struct format when i want to use struct.pack()

I want to pack my data to send it with socket.
This is what I did:
sensor = b'cam'
msg = struct.pack('3s >I >I', sensor, len(channel), len(inf_bytes)) + channel + inf_bytes
And then I got: struct.error: bad char in struct format
Could you tell me where I am wrong?
The > that selects big-endian byte order may only appear as the first character of the format string:
>>> struct.pack('3s>I>I', b'A', 2, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: bad char in struct format
>>> struct.pack('>3sII', b'A', 2, 3)
b'A\x00\x00\x00\x00\x00\x02\x00\x00\x00\x03'
The documentation states this (note "the first character" in the second paragraph):
By default, C types are represented in the machine’s native format and byte order, and properly aligned by skipping pad bytes if necessary (according to the rules used by the C compiler).
Alternatively, the first character of the format string can be used to indicate the byte order, size and alignment of the packed data, according to the following table:
at https://docs.python.org/3/library/struct.html#struct-format-strings
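Applied to the message from the question, a minimal sketch could look like the following. The function names and the sample values for channel and inf_bytes are assumptions made up for this example; only the '>3sII' format comes from the answer above:

```python
import struct

# Big-endian, no padding: 3-byte tag followed by two unsigned 32-bit lengths.
HEADER_FORMAT = ">3sII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def build_message(sensor, channel, inf_bytes):
    """Prefix the two payloads with a fixed-size header holding their lengths."""
    header = struct.pack(HEADER_FORMAT, sensor, len(channel), len(inf_bytes))
    return header + channel + inf_bytes

def parse_message(msg):
    """Inverse of build_message: read the header, then slice out the payloads."""
    sensor, channel_len, inf_len = struct.unpack(HEADER_FORMAT, msg[:HEADER_SIZE])
    channel_end = HEADER_SIZE + channel_len
    channel = msg[HEADER_SIZE:channel_end]
    inf_bytes = msg[channel_end:channel_end + inf_len]
    return sensor, channel, inf_bytes

# Hypothetical payloads, just to exercise the round trip.
msg = build_message(b"cam", b"front", b"\x01\x02\x03")
print(parse_message(msg))
```

Keeping the format string in one constant means the sender and the receiver cannot drift apart on byte order or field sizes.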

invalid literal for int() with base 10: '328.94'(while converting bytes to int())

This is my code:
import serial
print('Arduino is setting up')
# Setting up the Arduino board
arduinoSerialData = serial.Serial('com4', 9600)
while True:
    if arduinoSerialData.inWaiting() > 1:
        myData = arduinoSerialData.readline()
        myData = str(myData)
        myData = myData.replace("b'", '')
        myData = myData.replace("\\r\\n'", '')
        myData1=myData
        if myData1.find("a"):
            myData1= myData1.replace("a",str(0))
        if int(myData1)<100:
            print(myData)
What this code does is import the data from the ultrasonic sensor that's attached to the Arduino board and print it. myData is initially in bytes, so I convert it to a string, but I cannot seem to convert it to an int. When I try the above code, I get this error. Does anyone know how to troubleshoot this? Thanks!
It seems that your bytes-to-string conversion is not correct. Why not try this:
1. Bytes-to-string conversion:
myData = myData.decode("utf-8")
2. Eliminating trailing newline characters:
myData = myData.strip("\r\n")
Make sure that the resulting string contains only numeric characters before converting it to int. You can do this check:
if myData1.isdigit() and int(myData1) < 100:
    <your code>
If your string contains a float number, you can do this instead:
if myData1.replace(".", "", 1).isdigit() and int(float(myData1)) < 100:
If you give a string to int(), the string must represent an integer. If it represents a non-integer instead, you can convert it with float() first, then use int() to turn that floating point value into an integer, as per the following transcript:
>>> print(int("328.94")) # Will not work.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '328.94'
>>> print(float("328.94")) # Convert string to float.
328.94
>>> print(int(float("328.94"))) # Convert string to float to int.
328
>>> print(int(float("328.94") + 0.5)) # Same but rounded.
329
That last one is an option if you want it rounded to the nearest integer, rather than truncated.
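Putting the pieces together, here is a rough sketch of parsing one line as it might come back from readline(). The function name parse_reading and the sample inputs are assumptions for illustration, and no serial port is needed to try it:

```python
def parse_reading(raw_line):
    """Hypothetical helper: turn one raw serial line into an int, or None.

    raw_line is expected to be bytes such as b'328.94\\r\\n'.
    """
    # Decode the bytes and drop the trailing \r\n (and any other whitespace).
    text = raw_line.decode("utf-8", errors="replace").strip()
    try:
        # float() accepts both '328' and '328.94'; int() then truncates.
        return int(float(text))
    except (ValueError, OverflowError):
        # Not a number at all, or something like 'nan' / 'inf'.
        return None

print(parse_reading(b"328.94\r\n"))
print(parse_reading(b"42\r\n"))
print(parse_reading(b"garbage\r\n"))
```

Returning None for bad lines lets the read loop skip them instead of crashing on the first malformed reading.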

Some conversion issues between the byte and strings

Here is the code I am trying:
import struct
#binary_data = open("your_binary_file.bin","rb").read()
#your binary data would show up as a big string like this one when you .read()
binary_data = ('\x44\x69\x62\x65\x6e\x7a\x6f\x79\x6c\x70\x65\x72\x6f\x78\x69\x64\x20\x31'
               '\x32\x30\x20\x43\x20\x30\x33\x2e\x30\x35\x2e\x31\x39\x39\x34\x20\x31\x34\x3a\x32'
               '\x34\x3a\x33\x30')
def search(text):
    #convert the text to binary first
    s = ""
    for c in text:
        s+=struct.pack("b", ord(c))
    results = binary_data.find(s)
    if results == -1:
        print ("no results found")
    else:
        print ("the string [%s] is found at position %s in the binary data"%(text, results))
search("Dibenzoylperoxid")
search("03.05.1994")
And this is the error I am getting:
Traceback (most recent call last):
File "dec_new.py", line 22, in <module>
search("Dibenzoylperoxid")
File "dec_new.py", line 14, in search
s+=struct.pack("b", ord(c))
TypeError: Can't convert 'bytes' object to str implicitly
Kindly, let me know what I can do to make it functional properly.
I am using Python 3.5.0.
s = ""
for c in text:
    s+=struct.pack("b", ord(c))
This won't work because s is a string, and struct.pack returns a bytes, and you can't add a string and a bytes.
One possible solution is to make s a bytes.
s = b""
... But it seems like a lot of work to convert a string to a bytes this way. Why not just use encode()?
def search(text):
    #convert the text to binary first
    s = text.encode()
    results = binary_data.find(s)
    #etc
Also, "your binary data would show up as a big string like this one when you .read()" is not, strictly speaking, true. The binary data won't show up as a big string, because it is a bytes, not a string. If you want to create a bytes literal that resembles what might be returned by open("your_binary_file.bin","rb").read(), use the bytes literal syntax binary_data = b'\x44\x69<...etc...>\x33\x30'
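For completeness, a self-contained sketch of the whole search with both sides as bytes. The readable bytes literal below is simply the decoded form of the hex escapes from the question; returning the position instead of printing it is a choice made for this example:

```python
# Same data as in the question, written as a readable bytes literal.
binary_data = b"Dibenzoylperoxid 120 C 03.05.1994 14:24:30"

def search(text, data=binary_data):
    """Return the offset of text inside data, or -1 if it is absent."""
    needle = text.encode()  # str -> bytes (UTF-8 by default)
    return data.find(needle)

for term in ("Dibenzoylperoxid", "03.05.1994", "missing"):
    position = search(term)
    if position == -1:
        print("no results found for [%s]" % term)
    else:
        print("the string [%s] is found at position %s in the binary data" % (term, position))
```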

Conversion between ISO-8859-2 and UTF-8 in Python

I'm wondering how I can convert ISO-8859-2 (Latin-2) characters (I mean the integer or hex values that represent ISO-8859-2 encoded characters) to UTF-8 characters.
What I need to do in my Python project:
1. Receive hex values from the serial port, which are characters encoded in ISO-8859-2.
2. Decode them, that is, get "standard" Python unicode strings from them.
3. Prepare and write an XML file.
Using Python 3.4.3
txt_str = "ąęłóźć"
txt_str.decode('ISO-8859-2')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
The main problem is still how to prepare valid input for the decode method (it works in Python 2.7.10, and that's the version I'm using in this project). How do I build a valid string from the decimal values, which are Latin-2 code numbers?
Note that it would be overly complicated to receive UTF-8 characters from the serial port, due to the devices I'm using and the limitations of the communication protocol.
Sample data, on request:
68632057
62206A75
7A647261
B364206F
20616775
777A616E
616A2061
6A65696B
617A20B6
697A7970
6A65B361
70697020
77F36469
62202C79
6E647572
75206A65
7963696C
72656D75
6A616E20
73726F67
206A657A
65647572
77207972
73772065
00000069
This is some sample data: ISO-8859-2 characters packed into uint32 values, 4 chars per int.
The bit of code that manages unpacking:
l = l[7:].replace(",", "").replace(".", "").replace("\n","").replace("\r","") # crop string from uart, only data left
vl = [l[0:2], l[2:4], l[4:6], l[6:8]] # list of bytes
vl = vl[::-1] # reverse them - now in actual order
To get integer value out of hex string I can simply use:
int_vals = [int(hs, 16) for hs in vl]
Your example doesn't work because you've tried to use a str to hold bytes. In Python 3 you must use byte strings.
In reality, if you're using PySerial then you'll be reading byte strings anyway, which you can convert as required:
with serial.Serial('/dev/ttyS1', 19200, timeout=1) as ser:
    s = ser.read(10)
    # Py3: s == bytes
    # Py2.x: s == str
    my_unicode_string = s.decode('iso-8859-2')
If your ISO-8859-2 data is additionally encoded as an ASCII hex representation of the bytes, then you have to apply an extra layer of decoding:
import codecs
with serial.Serial('/dev/ttyS1', 19200, timeout=1) as ser:
    hex_repr = ser.read(10)
    # Py3: hex_repr == bytes
    # Py2.x: hex_repr == str
    # Decodes hex representation to bytes
    # E.g. b"A3" -> b'\xa3'
    hex_decoded = codecs.decode(hex_repr, "hex")
    my_unicode_string = hex_decoded.decode('iso-8859-2')
Now you can pass my_unicode_string to your favourite XML library.
Interesting sample data. Ideally your sample data should be a direct print of the raw data received from PySerial. If you actually are receiving the raw bytes as 8-digit hexadecimal values, then:
#!python3
from binascii import unhexlify
data = b''.join(unhexlify(x)[::-1] for x in b'''\
68632057
62206A75
7A647261
B364206F
20616775
777A616E
616A2061
6A65696B
617A20B6
697A7970
6A65B361
70697020
77F36469
62202C79
6E647572
75206A65
7963696C
72656D75
6A616E20
73726F67
206A657A
65647572
77207972
73772065
00000069'''.splitlines())
print(data.decode('iso-8859-2'))
Output:
W chuj bardzo długa nazwa jakiejś zapyziałej pipidówy, brudnej ulicyumer najgorszej rudery we wsi
Google Translate of Polish to English:
The dick very long name some zapyziałej Small Town , dirty ulicyumer worst hovel in the village
This topic is closed. Working code that handles what needs to be done:
x=177
x.to_bytes(1, byteorder='big').decode("ISO-8859-2")
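The same idea extends from a single code number to the uint32 words in the sample data. This is only a sketch: the helper name decode_words is invented here, and it assumes each 8-digit hex word holds 4 characters with the first character in the lowest byte, as the unpacking code in the question suggests:

```python
def decode_words(hex_words, encoding="iso-8859-2"):
    """Hypothetical helper: decode 8-digit hex words into text.

    Each word is assumed to carry 4 characters with the byte order
    reversed, so the bytes are flipped before joining.
    """
    raw = b"".join(bytes.fromhex(word)[::-1] for word in hex_words)
    # The last word may be padded with NUL bytes; drop them before decoding.
    return raw.rstrip(b"\x00").decode(encoding)

# Two words taken from the sample data, plus the padded final word.
text = decode_words(["7A647261", "B364206F"])
tail = decode_words(["00000069"])
print(text, tail)
```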

How to base64 encode/decode a variable with string type in Python 3?

I get an error saying that the value to encode needs to be bytes, not str/dict.
I know that adding a "b" before the text solves that and prints the encoded result.
import base64
s = base64.b64encode(b'12345')
print(s)
>>b'MTIzNDU='
But how do I encode a variable?
such as
import base64
s = "12345"
s2 = base64.b64encode(s)
print(s2)
It gives me an error both with the b added and without it. I don't understand why.
I'm also trying to encode/decode a dictionary with base64.
You need to encode the unicode string. If it's just normal characters, you can use ASCII. If it might have other characters in it, or just for general safety, you probably want utf-8.
>>> import base64
>>> s = "12345"
>>> s2 = base64.b64encode(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ". . . /lib/python3.3/base64.py", line 58, in b64encode
raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str
>>> s2 = base64.b64encode(s.encode('ascii'))
>>> print(s2)
b'MTIzNDU='
>>>
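The question also asks about decoding and about dictionaries, which the transcript above does not cover. One possible sketch follows; serialising the dict with json first is an assumption on my part (it only works for JSON-compatible data), and the helper names are made up:

```python
import base64
import json

def b64_encode_text(text):
    """str -> UTF-8 bytes -> Base64, returned as a plain str."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def b64_decode_text(encoded):
    """Inverse of b64_encode_text."""
    return base64.b64decode(encoded).decode("utf-8")

def b64_encode_dict(data):
    """Serialise the dict to JSON text first, then Base64-encode that text."""
    return b64_encode_text(json.dumps(data))

def b64_decode_dict(encoded):
    """Inverse of b64_encode_dict."""
    return json.loads(b64_decode_text(encoded))

s = "12345"
encoded = b64_encode_text(s)
settings = {"name": "sensor", "value": 42}  # hypothetical example dict
round_tripped = b64_decode_dict(b64_encode_dict(settings))
print(encoded, b64_decode_text(encoded), round_tripped)
```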
