Python int to bytes conversion error with some values

I am using the int.to_bytes method within Python to convert integer values into bytes.
With certain values, this seems to fail. Attached is the output from the Python console:
>>> value = 2050
>>> value.to_bytes(2, 'big')
b'\x08\x02'
>>> value = 2082
>>> value.to_bytes(2, 'big')
b'\x08"'
With a value of 2050, the conversion seems to be correct. But when the value is 2082, for some reason only the upper byte seems to be extracted. Any reason as to why this is happening?

Both bytes are extracted. Try:
value = 2082
x = value.to_bytes(2, 'big')
print(x[0]) # Output: 8
print(x[1]) # Output: 34
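When a bytes object is displayed, every byte that is a printable ASCII character is shown as that character instead of a \x escape. Byte 34 (0x22) is the ASCII double quote ", which is what you see.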
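To make the point above concrete, here is a small sketch (the variable names `raw` and `restored` are only illustrative) that shows the same two bytes in forms that do not depend on how printable characters are displayed:

```python
value = 2082
raw = value.to_bytes(2, "big")

# repr() shows printable ASCII bytes as characters, so 0x22 appears as '"'.
print(raw)        # b'\x08"'

# list() and hex() show the numeric value of every byte unambiguously.
print(list(raw))  # [8, 34]
print(raw.hex())  # 0822

# Converting back recovers the original integer, so nothing was lost.
restored = int.from_bytes(raw, "big")
print(restored)   # 2082
```

If you need a stable textual form for logging or comparison, `raw.hex()` is usually easier to read than the default repr.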

Related

Decoding Error: Int too big to convert while decrypting

I have encoded a string to integer in the following way in python:
b = bytearray()
b.extend(input_number_or_text.encode('ascii'))
input_number_or_text = int.from_bytes(b,byteorder='big', signed=False)
I am encrypting this integer to get a new value and subsequently decrypting to get back the original integer.
Now how do I get the string back from the integer?
I have tried the following for decryption:
decrypted_data.to_bytes(1,byteorder='big').decode('ascii')
but I get an "int too big to convert" error.
How can I fix this?
You told it the int should be convertible to a byte string of length 1. If the value needs more bytes than that, the conversion fails. You can remember the original length, or you can compute the minimum length from the value:
num_bytes = (decrypted_data.bit_length() + 7) // 8
decrypted_data.to_bytes(num_bytes, byteorder='big').decode('ascii')
Adding 7 and floor-dividing by 8 ensures enough bytes for the data. -(-decrypted_data.bit_length() // 8) also works (and is trivially faster on Python), but is a bit more magical looking.
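As a rough sketch of the full round trip, assuming the encryption step returns exactly the integer it was given, the two directions could be wrapped like this (`text_to_int` and `int_to_text` are hypothetical helper names, not part of any library):

```python
def text_to_int(text):
    """Encode ASCII text and read the bytes as a big-endian unsigned integer."""
    return int.from_bytes(text.encode("ascii"), byteorder="big", signed=False)


def int_to_text(number):
    """Invert text_to_int, sizing the buffer from the integer's bit length."""
    num_bytes = (number.bit_length() + 7) // 8
    return number.to_bytes(num_bytes, byteorder="big").decode("ascii")


original = "hello 123"
as_int = text_to_int(original)
print(int_to_text(as_int))  # hello 123
```

One caveat with guessing the length: leading NUL characters ('\x00') in the original text contribute no bits to the integer, so they are lost on the way back. If that matters, store the original length alongside the integer.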
The byte representation of an integer is different from that of a string.
For example, 1, '1', and 1.0 all have different byte representations.
From the code you supplied:
b.extend(input_number_or_text.encode('ascii'))
and int.from_bytes(b,byteorder='big', signed=False)
it seems you are encoding the string form of a number and then trying to decode it as an int.
See the next example:
In [3]: b = bytearray()
In [4]: a = '1'
In [5]: b.extend(a.encode('ascii'))
In [6]: int.from_bytes(b,byteorder='big',signed=False)
Out[6]: 49
If you encoded a string, decode the bytes back to a string first, and then convert that string to an int.
In [1]: b = bytearray()
In [2]: a = '123'
In [3]: b.extend(a.encode('ascii'))
In [4]: decoded = int(b.decode('ascii'))
In [5]: decoded
Out[5]: 123
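To illustrate the claim that 1, '1', and 1.0 have different byte representations, here is a minimal sketch. The chosen encodings (one byte for the int, ASCII for the text, a big-endian IEEE 754 double for the float) are assumptions for the example; other sizes and byte orders are equally valid:

```python
import struct

as_int = (1).to_bytes(1, "big")    # the integer 1 stored in one byte
as_text = "1".encode("ascii")      # the character '1', ASCII code 49
as_float = struct.pack(">d", 1.0)  # 1.0 as a big-endian 8-byte double

print(as_int.hex())    # 01
print(as_text.hex())   # 31
print(as_float.hex())  # 3ff0000000000000
```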

How to turn a binary string into a byte?

If I take the letter 'à' and encode it in UTF-8 I obtain the following result:
>>> 'à'.encode('utf-8')
b'\xc3\xa0'
Now from a bytearray I would like to convert 'à' into a binary string and turn it back into 'à'. To do so I execute the following code:
byte = bytearray('à','utf-8')
for x in byte:
    print(bin(x))
I get 0b11000011 and 0b10100000, which are 195 and 160. Then I join them together and drop the 0b prefixes. Now I execute this code:
s = '1100001110100000'
value1 = s[0:8].encode('utf-8')
value2 = s[9:16].encode('utf-8')
value = value1 + value2
print(chr(int(value, 2)))
>> 憠
No matter how I write the later part, I get other symbols and never manage to get back my 'à'. Why is that, and how can I get an 'à'?
>>> bytes(int(s[i:i+8], 2) for i in range(0, len(s), 8)).decode('utf-8')
'à'
There are multiple parts to this. The bytes constructor creates a byte string from a sequence of integers. The integers are formed from strings using int with a base of 2. The range combined with the slicing peels off 8 characters at a time. Finally decode converts those bytes back into Unicode characters.
Your second slice needs to be s[8:16] (or just s[8:]); s[9:16] skips a bit, so you get 0100000.
You also need to convert each "bit string" back to an integer, as in int("0010101",2), before treating it as a byte.
s = '1100001110100000'
value1 = bytearray([int(s[:8], 2),   # bits 0..7 (8 total)
                    int(s[8:], 2)])  # bits 8..15 (8 total)
print(value1.decode("utf8"))
Convert the base-2 value back to an integer with int(s,2), convert that integer to a number of bytes (int.to_bytes) based on the original length divided by 8 and big-endian conversion to keep the bytes in the right order, then .decode() it (default in Python 3 is utf8):
>>> s = '1100001110100000'
>>> int(s,2)
50080
>>> int(s,2).to_bytes(len(s)//8,'big')
b'\xc3\xa0'
>>> int(s,2).to_bytes(len(s)//8,'big').decode()
'à'
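Putting the answers together, one possible pair of helpers for going from text to a bit string and back might look like the sketch below. The names `text_to_bits` and `bits_to_text` are made up for this example. Note that `format(byte, "08b")` pads each byte to 8 digits, which `bin()` does not do:

```python
def text_to_bits(text, encoding="utf-8"):
    """Return the encoded bytes of text as a string of 0s and 1s, 8 per byte."""
    return "".join(format(byte, "08b") for byte in text.encode(encoding))


def bits_to_text(bits, encoding="utf-8"):
    """Invert text_to_bits by reading the bit string 8 characters at a time."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


bits = text_to_bits("à")
print(bits)                # 1100001110100000
print(bits_to_text(bits))  # à
```

Padding matters for bytes below 128: without it, a byte such as 0b00001000 would contribute only four characters and the 8-character slicing would fall out of step.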

Why does a Python bytearray work with value >= 256

The Python documentation for bytearray states:
The bytearray type is a mutable sequence of integers in the range 0 <= x < 256.
However the following code suggests values can be >= 256. I store a 9 bit binary number which has a maximum value of: 2^9-1 = 512-1 = 511
ba = bytes([0b111111111])
print '%s' % (ba)
The 9 bit binary number is printed as decimal 511:
[511]
I don't know what the intended behavior is, but I assumed the most significant bit(s) would be dropped to give an 8 bit number.
You aren't actually creating a bytearray or a bytes object, you're just creating a string containing '[511]', since bytes in Python 2 is just a synonym for str. In Python 3, you would get an error message:
ValueError: byte must be in range(0, 256)
The following code works in Python 2 or Python 3; note that I'm passing an 8 bit number, so it's in range.
ba = bytearray([0b11111111])
print(repr(ba))
output
bytearray(b'\xff')
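The question expected the extra high bit to be dropped silently. Python 3 does not do that, but the truncation can be requested explicitly. The following is a sketch for Python 3 only; the mask `0xFF` and the two-byte width are choices made for this example:

```python
value = 0b111111111  # 511, one bit too wide for a single byte

# Out-of-range values are rejected rather than truncated.
try:
    bytes([value])
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# Keep only the low 8 bits if truncation is really what you want.
low_byte = bytes([value & 0xFF])
print(low_byte)  # b'\xff'

# Or keep every bit by spreading the value over two bytes.
wide = value.to_bytes(2, "big")
print(wide)      # b'\x01\xff'
```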
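Alternatively, in Python 3, a value such as 511 that does not fit in one byte can be stored with int.to_bytes, using a length large enough to hold it:
a = 511
byte = a.to_bytes(2, 'little')  # 511 needs 2 bytes; use whatever byte length your data requires
To decode:
a = int.from_bytes(byte, 'little')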

Converting a float to bytearray

So, what I am trying to do is convert a float to a bytearray, but I keep getting no output and EXTREME slowing/freezing of my computer.
My code is
import struct
def float_to_hex(f):
    return hex(struct.unpack('<I', struct.pack('<f', f))[0])
value = 5.1 #example value
...
value = bytearray(int(float_to_hex(float(value)), 16))
I found on another article a function to convert floats to hex which is
def float_to_hex(f):
    return hex(struct.unpack('<I', struct.pack('<f', f))[0])
and then I converted it from hex to an int.
What is the problem with this? How could I better convert it from a float to bin or bytearray?
It depends what you want, and what you are going to do with it. If all you want is a bytearray then:
import struct
value = 5.1
ba = bytearray(struct.pack("f", value))
Where ba is a bytearray. However, if you wish to display the hex values (which I suspect), then:
print([ "0x%02x" % b for b in ba ])
EDIT:
This gives (for value 5.1):
['0x33', '0x33', '0xa3', '0x40']
However, CPython uses the C type double to store even small floats (there are good reasons for that), so:
value = 5.1
ba = bytearray(struct.pack("d", value))
print([ "0x%02x" % b for b in ba ])
Gives:
['0x66', '0x66', '0x66', '0x66', '0x66', '0x66', '0x14', '0x40']
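A likely explanation for the freeze described in the question: when bytearray is called with a single integer, it creates that many zero bytes instead of encoding the integer. The sketch below shows this with a small size and only prints the large number instead of allocating it:

```python
import struct

value = 5.1
# Reinterpret the 4 bytes of the float32 as an unsigned 32-bit integer,
# which is what the question's float_to_hex does before calling hex().
as_int = struct.unpack("<I", struct.pack("<f", value))[0]
print(as_int)       # 1084437299
print(hex(as_int))  # 0x40a33333

# bytearray(n) with an int argument creates n zero bytes, so
# bytearray(as_int) would try to allocate roughly 1 GB.
print(bytearray(4))  # bytearray(b'\x00\x00\x00\x00')

# Passing the packed bytes instead gives the 4 bytes of the float.
ba = bytearray(struct.pack("<f", value))
print(list(ba))      # [51, 51, 163, 64]
```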
The result I would want from 5.1 is 0x40 a3 33 33 or 64 163 51 51. Not as a string.
To get the desired list of integers from the float:
>>> import struct
>>> list(struct.pack("!f", 5.1))
[64, 163, 51, 51]
Or the same as a bytearray type:
>>> bytearray(struct.pack("!f", 5.1))
bytearray(b'@\xa333')
Note: the bytestring (bytes type) contains exactly the same bytes:
>>> struct.pack("!f", 5.1)
b'@\xa333'
>>> for byte in struct.pack("!f", 5.1):
...     print(byte)
...
64
163
51
51
The difference is only in mutability: list and bytearray are mutable sequences, while the bytes type represents an immutable sequence of bytes. Otherwise, the bytes and bytearray types have a very similar API.
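For completeness, here is a short sketch of the reverse direction with struct.unpack. It also shows why the "f" and "d" formats differ: a 4-byte float cannot hold 5.1 exactly to double precision, while an 8-byte double round-trips a Python float unchanged.

```python
import struct

packed = struct.pack("!f", 5.1)  # 4 bytes, big-endian float32
restored = struct.unpack("!f", packed)[0]
print(list(packed))              # [64, 163, 51, 51]
print(restored)                  # close to 5.1, but not exactly equal

packed_d = struct.pack("!d", 5.1)  # 8 bytes, big-endian float64
restored_d = struct.unpack("!d", packed_d)[0]
print(restored_d == 5.1)           # True
```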

Read string from binary file

I want to read bytes 1,2 and 3 from a file. I know it corresponds to a string (in this case it's ELF of a Linux binary header)
Following examples I could find on the net I came up with this:
import struct
with open('hello', 'rb') as f:
    f.seek(1)
    bytes = f.read(3)
    st = struct.unpack('s', bytes)
    print st
Looking at the official documentation of struct it seems that passing s as argument should allow me to read a string.
I get the error:
st = struct.unpack('s', bytes)
struct.error: unpack requires a string argument of length 1
EDIT: Using Python 2.7
In your special case, it is enough to just check
if bytes == 'ELF':
to test all three bytes in one step to be the three characters E, L and F.
But also if you want to check the numerical values, you do not need to unpack anything here. Just use ord(bytes[i]) (with i in 0, 1, 2) to get the byte values of the three bytes.
Alternatively you can use
byte_values = struct.unpack('bbb', bytes)
to get a tuple of the three bytes. You can also unpack that tuple on the fly in case the bytes have nameable semantics like this:
width, height, depth = struct.unpack('bbb', bytes)
Use 'BBB' instead of 'bbb' in case your byte values shall be unsigned.
In Python 2, read returns a string; in the sense "string of bytes". To get a single byte, use bytes[i], it will return another string but with a single byte. If you need the numeric value of a byte, use ord: ord(bytes[i]). Finally, to get numeric values for all bytes use map(ord, bytes).
In [4]: s = "foo"
In [5]: s[0]
Out[5]: 'f'
In [6]: ord(s[0])
Out[6]: 102
In [7]: map(ord, s)
Out[7]: [102, 111, 111]
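The answers above target Python 2.7. As a hedged Python 3 sketch of the same task: read returns a bytes object whose items are already integers, and the original struct error goes away once the format carries a count, because a bare 's' means a 1-byte string while '3s' means a 3-byte one. The temporary file here is only a stand-in for a real binary such as the question's 'hello':

```python
import os
import struct
import tempfile

# Stand-in for a real binary: the 4-byte ELF magic followed by padding.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x7fELF" + b"\x00" * 12)
    path = tmp.name

try:
    with open(path, "rb") as f:
        f.seek(1)         # skip the leading 0x7f byte
        data = f.read(3)  # bytes 1, 2 and 3
finally:
    os.remove(path)

print(data == b"ELF")                 # True
print(list(data))                     # [69, 76, 70]
(magic,) = struct.unpack("3s", data)  # '3s' is one 3-byte string field
print(magic.decode("ascii"))          # ELF
```

In Python 3 the comparison must use a bytes literal (b"ELF"), since bytes and str never compare equal.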
