Why does a Python bytearray work with a value >= 256?

The Python documentation for bytearray states:
The bytearray type is a mutable sequence of integers in the range 0 <=
x < 256.
However, the following code suggests values can be >= 256. I store a 9-bit binary number, whose maximum value is 2^9 - 1 = 511:
ba = bytes([0b111111111])
print '%s' % (ba)
The 9 bit binary number is printed as decimal 511:
[511]
I don't know what the intended behavior is, but I assumed the most significant bit(s) would be dropped to give an 8-bit number.

You aren't actually creating a bytearray or a bytes object, you're just creating a string containing '[511]', since bytes in Python 2 is just a synonym for str. In Python 3, you would get an error message:
ValueError: byte must be in range(0, 256)
The following code works in Python 2 or Python 3; note that I'm passing an 8-bit number, so it's in range:
ba = bytearray([0b11111111])
print(repr(ba))
Output:
bytearray(b'\xff')
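For contrast, here is a minimal sketch (assuming Python 3) of what happens when an out-of-range value is passed:

```python
# Python 3 rejects any byte value outside range(0, 256)
try:
    ba = bytearray([0b111111111])  # 511 does not fit in one byte
except ValueError as e:
    print(e)  # the out-of-range error described above
    ba = None
```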

code:
a = 511
byte = a.to_bytes(2, 'little')  # 511 needs 2 bytes
to decode:
a = int.from_bytes(byte, 'little')
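If the value's width isn't known in advance, the byte length can be computed from bit_length() rather than hard-coded; a small sketch, assuming Python 3:

```python
a = 511
nbytes = (a.bit_length() + 7) // 8   # 9 bits round up to 2 bytes
b = a.to_bytes(nbytes, 'little')
print(b)                             # b'\xff\x01'
print(int.from_bytes(b, 'little'))   # 511
```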

Python int to bytes conversion error with some values

I am using the int.to_bytes method within Python to convert integer values into bytes.
With certain values, this seems to fail. Attached is the output from the Python console:
value = 2050
value.to_bytes(2, 'big')
>>> b'\x08\x02'
value = 2082
value.to_bytes(2, 'big')
>>> b'\x08"'
With a value of 2050, the conversion seems to be correct. But when the value is 2082, for some reason only the upper byte seems to be extracted. Any reason as to why this is happening?
It extracts all bytes. Try
value = 2082
x = value.to_bytes(2, 'big')
print(x[0]) # Output: 8
print(x[1]) # Output: 34
Both bytes are there; when the bytes object is displayed, byte 34 renders as its ASCII character ", which is what you see in b'\x08"'.
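A quick check (Python 3) confirming both bytes are present and reconstruct the original value:

```python
value = 2082
x = value.to_bytes(2, 'big')
print(list(x))            # [8, 34] -- both bytes are there
print(x[0] * 256 + x[1])  # 2082 -- they reconstruct the value
print(chr(34))            # " -- which is why the repr shows b'\x08"'
```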

How to turn a binary string into a byte?

If I take the letter 'à' and encode it in UTF-8 I obtain the following result:
'à'.encode('utf-8')
>> b'\xc3\xa0'
Now from a bytearray I would like to convert 'à' into a binary string and turn it back into 'à'. To do so I execute the following code:
byte = bytearray('à', 'utf-8')
for x in byte:
    print(bin(x))
I get 0b11000011 and 0b10100000, which are 195 and 160. Then I concatenate them and strip off the 0b prefixes. Now I execute this code:
s = '1100001110100000'
value1 = s[0:8].encode('utf-8')
value2 = s[9:16].encode('utf-8')
value = value1 + value2
print(chr(int(value, 2)))
>> 憠
No matter how I rework the later part, I get symbols and never seem to be able to get back my 'à'. I would like to know why that is, and how I can get an 'à'.
>>> bytes(int(s[i:i+8], 2) for i in range(0, len(s), 8)).decode('utf-8')
'à'
There are multiple parts to this. The bytes constructor creates a byte string from a sequence of integers. The integers are formed from strings using int with a base of 2. The range combined with the slicing peels off 8 characters at a time. Finally decode converts those bytes back into Unicode characters.
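The same pipeline can be unrolled step by step to show each intermediate value (a sketch, Python 3):

```python
s = '1100001110100000'
chunks = [s[i:i + 8] for i in range(0, len(s), 8)]
print(chunks)             # ['11000011', '10100000']
ints = [int(c, 2) for c in chunks]
print(ints)               # [195, 160]
b = bytes(ints)
print(b)                  # b'\xc3\xa0'
print(b.decode('utf-8'))  # à
```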
You need your second slice to be s[8:16] (or just s[8:]); otherwise you get the 7-character string 0100000.
You also need to convert your "bit string" back to an integer before treating it as a byte, e.g. int("0010101", 2):
s = '1100001110100000'
value1 = bytearray([int(s[:8], 2),    # bits 0..7 (8 total)
                    int(s[8:], 2)])   # bits 8..15 (8 total)
print(value1.decode("utf8"))
Convert the base-2 string back to an integer with int(s, 2), convert that integer to bytes (int.to_bytes) using a length of the original string length divided by 8, with big-endian order to keep the bytes in the right sequence, then .decode() it (the default in Python 3 is UTF-8):
>>> s = '1100001110100000'
>>> int(s,2)
50080
>>> int(s,2).to_bytes(len(s)//8,'big')
b'\xc3\xa0'
>>> int(s,2).to_bytes(len(s)//8,'big').decode()
'à'

How to turn the first n bits of a digest into an integer?

I'm working with Python 3, trying to get an integer out of a digest. I'm only interested in the first n bits of the digest, though.
What I have right now is this:
n = 3
int(hashlib.sha1(b'test').digest()[0:n])
This however results in a ValueError: invalid literal for int() with base 10: b'\xa9J' error.
Thanks.
The Py3 solution is to use int.from_bytes to convert bytes to int, then shift off the part you don't care about:
def bitsof(bt, nbits):
    # Directly convert enough bytes to an int to ensure you have at least as many bits
    # as needed, but no more
    neededbytes = (nbits + 7) // 8
    if neededbytes > len(bt):
        raise ValueError("Require {} bytes, received {}".format(neededbytes, len(bt)))
    i = int.from_bytes(bt[:neededbytes], 'big')
    # If a non-byte-aligned number of bits was requested,
    # shift off the excess from the right (which came from the last byte processed)
    if nbits % 8:
        i >>= 8 - nbits % 8
    return i
Example use:
>>> bitsof(hashlib.sha1(b'test').digest(), 3)
5 # the three leftmost bits of 0xa9, the first byte of the hash
On Python 2, the function can be used almost as is, aside from adding a binascii import and changing the bytes-to-int conversion to the slightly less efficient two-step version (from str to hex representation, then int with a base of 16 to parse it):
i = int(binascii.hexlify(bt[:neededbytes]), 16)
Everything else works as is (even the // operator works as expected; Python 2's / operator is different from Py 3's, but // works the same on both).
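As a sanity check, the hexlify route and int.from_bytes agree; the snippet below runs under Python 3 as well (the two-byte input here is the start of sha1(b'test')'s digest):

```python
import binascii

bt = b'\xa9J'  # first two bytes of hashlib.sha1(b'test').digest()
via_hexlify = int(binascii.hexlify(bt), 16)   # Python 2 style conversion
via_from_bytes = int.from_bytes(bt, 'big')    # Python 3 style conversion
print(via_hexlify, via_from_bytes)            # 43338 43338
```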

The best way to base64 encode the last 6k bits of a python integer

I illustrate a case where k = 2 (so, the bottom 12 bits):
import base64
# Hi
# 7 - 34
# 000111 - 100010
# 0001 - 1110 - 0010 = 0x1E2 = 482
# 1
integer = int(bin(482)[-12:] + '0' * 20, 2)
encoded = base64.b64encode(base64.b16decode('{0:08X}'.format(integer)))
print encoded
# 2
encoded = base64.b64encode(base64.b16decode('{0:08X}'.format(482 << 20)))
print encoded
Both output HiAAAA== as desired
An ideone link for your convenience: http://ideone.com/O73kQs
Intuitively these are very clear, and I'm favoring #2 by quite a bit.
One thing that "irks" me about #1, is that if the integers in python are not 32 bits, then I'm in trouble.
How can I get the proper size of an int? (total Python newbie question?) (edit: yes, apparently a newbie-ish question: How can I determine the exact size of a type used by Python)
It would be nice, however, if there was a way to simply do something like
encoded = base64.b64encode('{0:08X}'.format(482 << 20))
Moreover, how can I go from
bin(1)
which equals
'0b1'
to the actual binary literal
0b1
You can go from bin back with int, which takes an optional second parameter that is the base:
int(bin(18)[2:], 2)
Since you use this earlier you must know about it... so I can only assume you mean something else by "binary literal" than its integer representation, although for the life of me I'm not sure what that is...
You can do
print 0b1
and see that the actual repr is the decimal value...
To get the last 12 bits of an int, mask it:
my_int = 482
k = 2
mask = int("1" * (6 * k), 2)
last_bits = my_int & mask
Then you can just shift it left by 20 or whatever you need...
First get the last 12 bits as demonstrated above:
import struct
print struct.pack('H',last_bits)
print struct.pack('H',0b100001)
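In Python 3, struct.pack returns bytes rather than a str, and the bare 'H' format uses native endianness; pinning it with '<' makes the result deterministic (a sketch):

```python
import struct

last_bits = 482                        # the masked 12-bit value from above
packed = struct.pack('<H', last_bits)  # explicit little-endian unsigned short
print(packed)                          # b'\xe2\x01'
print(struct.unpack('<H', packed)[0])  # 482
```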
alternatively you could
def get_chars(int_val):
    while int_val > 0:
        yield chr(int_val & 0xFF)
        int_val >>= 8  # shift right, not left, or the loop never terminates
print repr("".join(get_chars(last_bits)))
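Putting the mask, the shift, and the encoding together, a Python 3 sketch that reproduces the HiAAAA== output without depending on the platform's int size:

```python
import base64

my_int = 482
k = 2
mask = (1 << (6 * k)) - 1            # same value as int("1" * 12, 2)
last_bits = my_int & mask            # low 12 bits
shifted = last_bits << 20            # pad to a whole number of bytes
encoded = base64.b64encode(shifted.to_bytes(4, 'big'))
print(encoded)                       # b'HiAAAA=='
```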

Z3py: Array of specific integer type?

In Z3Py, I want to declare an array of bytes (meaning each member of the array is an 8-bit integer). I tried the following code, but apparently it reports that Int(8) is an illegal type.
Any idea on how to fix the problem? Thanks!
I = IntSort()
I8 = Int(8)
A = Array('A', I, I8)
You cannot pass a number as the argument of the Int() function. It expects a string (the name of the integer) rather than its size in bits. You might want to consider using bit vectors instead:
Byte = BitVecSort(8)
i8 = BitVec('i8', Byte)
A = Array('A', IntSort(), Byte)
