Convert a Python int into a big-endian string of bytes - python

I have a non-negative int and I would like to efficiently convert it to a big-endian string containing the same data. For example, the int 1245427 (which is 0x1300F3) should result in a string of length 3 containing three characters whose byte values are 0x13, 0x00, and 0xf3.
My ints are on the scale of 35 (base-10) digits.
How do I do this?

In Python 3.2+, you can use int.to_bytes:
If you don't want to specify the size
>>> n = 1245427
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big') or b'\0'
b'\x13\x00\xf3'
If you don't mind specifying the size
>>> (1245427).to_bytes(3, byteorder='big')
b'\x13\x00\xf3'

You can use the struct module:
import struct
print(struct.pack('>I', your_int))
'>I' is a format string. > means big endian and I means unsigned int. Check the documentation for more format chars.

This is fast and works for small and (arbitrary) large ints:
def Dump(n):
s = '%x' % n
if len(s) & 1:
s = '0' + s
return s.decode('hex')
print repr(Dump(1245427)) #: '\x13\x00\xf3'

Probably the best way is via the built-in struct module:
>>> import struct
>>> x = 1245427
>>> struct.pack('>BH', x >> 16, x & 0xFFFF)
'\x13\x00\xf3'
>>> struct.pack('>L', x)[1:] # could do it this way too
'\x13\x00\xf3'
Alternatively -- and I wouldn't usually recommend this, because it's mistake-prone -- you can do it "manually" by shifting and the chr() function:
>>> x = 1245427
>>> chr((x >> 16) & 0xFF) + chr((x >> 8) & 0xFF) + chr(x & 0xFF)
'\x13\x00\xf3'
Out of curiosity, why do you only want three bytes? Usually you'd pack such an integer into a full 32 bits (a C unsigned long), and use struct.pack('>L', 1245427) but skip the [1:] step?

def tost(i):
result = []
while i:
result.append(chr(i&0xFF))
i >>= 8
result.reverse()
return ''.join(result)

Single-source Python 2/3 compatible version based on #pts' answer:
#!/usr/bin/env python
import binascii
def int2bytes(i):
hex_string = '%x' % i
n = len(hex_string)
return binascii.unhexlify(hex_string.zfill(n + (n & 1)))
print(int2bytes(1245427))
# -> b'\x13\x00\xf3'

The shortest way, I think, is the following:
import struct
val = 0x11223344
val = struct.unpack("<I", struct.pack(">I", val))[0]
print "%08x" % val
This converts an integer to a byte-swapped integer.

Using the bitstring module:
>>> bitstring.BitArray(uint=1245427, length=24).bytes
'\x13\x00\xf3'
Note though that for this method you need to specify the length in bits of the bitstring you are creating.
Internally this is pretty much the same as Alex's answer, but the module has a lot of extra functionality available if you want to do more with your data.

Very easy with pwntools , the tools created for software hacking
(Un-ironically, I stumbled across this thread and tried solutions here, until I realised there exists conversion functionality in pwntools)
import pwntools
x2 = p32(x1)

Related

get bytes result from secrets.randbits() [duplicate]

I want to convert an 32-byte (although I might need other lengths) integer to a bytes object in python. Is there a clean and simple way to do this?
to_bytes(length, byteorder[, signed]) is all you need starting from 3.2. In this case, someidentifier.to_bytes(4,'big') should give you the bytes string you need.
I'm guessing you need a 32-bit integer, and big-endian to boot:
>>> from ctypes import c_uint32
>>> l = c_uint32(0x12345678)
>>> bytes(l)
b'xV4\x12'
There is c_uint8, c_uint16 and c_uint64 as well. For longer ints you need to make it manually, using divmod(x, 256).
>>> def bytify(v):
... v, r = divmod(v, 256)
... yield r
... if v == 0:
... raise StopIteration
... for r in bytify(v):
... yield r
...
>>> [x for x in bytify(0x12345678)]
[120, 86, 52, 18]
>>> bytes(bytify(0x12345678))
b'xV4\x12
>>> bytes(bytify(0x123456789098765432101234567890987654321))
b'!Ce\x87\t\x89gE#\x01!Ce\x87\t\x89gE#\x01'
You can use bytes("iterable") directly. Where every value in iterable will be specific byte in bytes(). Example for little endian encoding:
>>> var=0x12345678
>>> var_tuple=((var)&0xff, (var>>8)&0xff, (var>>16)&0xff, (var>>24)&0xff)
>>> bytes(var_tuple)
b'xV4\x12'
Suppose you have
var = 'і' # var is ukrainian і
We want to get binary from it.
Flow is this. value/which is string => bytes => int => binary
binary_var = '{:b}'.format(int.from_bytes(var.encode('utf-8'), byteorder='big'))
Now binary_var is '1101000110010110'. It type is string.
Now go back, you want get unicode value from binary:
int_var = int(binary_var, 2) # here we get int value, int_var = 53654
Now we need convert integer to bytes. Ukrainian 'і' is not gonna fit into 1 byte but in 2. We convert to actual bytes bytes_var = b'\xd1\x96'
bytes_var = int_var.to_bytes(2, byteorder='big')
Finally we decode our bytes.
ukr_i = bytes_var.decode('utf-8') # urk_i = 'і'

Convert bytes to int?

I'm currently working on an encryption/decryption program and I need to be able to convert bytes to an integer. I know that:
bytes([3]) = b'\x03'
Yet I cannot find out how to do the inverse. What am I doing terribly wrong?
Assuming you're on at least 3.2, there's a built in for this:
int.from_bytes( bytes, byteorder, *, signed=False )
...
The argument bytes must either be a bytes-like object or an iterable
producing bytes.
The byteorder argument determines the byte order used to represent the
integer. If byteorder is "big", the most significant byte is at the
beginning of the byte array. If byteorder is "little", the most
significant byte is at the end of the byte array. To request the
native byte order of the host system, use sys.byteorder as the byte
order value.
The signed argument indicates whether two’s complement is used to
represent the integer.
## Examples:
int.from_bytes(b'\x00\x01', "big") # 1
int.from_bytes(b'\x00\x01', "little") # 256
int.from_bytes(b'\x00\x10', byteorder='little') # 4096
int.from_bytes(b'\xfc\x00', byteorder='big', signed=True) #-1024
Lists of bytes are subscriptable (at least in Python 3.6). This way you can retrieve the decimal value of each byte individually.
>>> intlist = [64, 4, 26, 163, 255]
>>> bytelist = bytes(intlist) # b'#\x04\x1a\xa3\xff'
>>> for b in bytelist:
... print(b) # 64 4 26 163 255
>>> [b for b in bytelist] # [64, 4, 26, 163, 255]
>>> bytelist[2] # 26
list() can be used to convert bytes to int (works in Python 3.7):
list(b'\x03\x04\x05')
[3, 4, 5]
int.from_bytes( bytes, byteorder, *, signed=False )
doesn't work with me
I used function from this website, it works well
https://coderwall.com/p/x6xtxq/convert-bytes-to-int-or-int-to-bytes-in-python
def bytes_to_int(bytes):
result = 0
for b in bytes:
result = result * 256 + int(b)
return result
def int_to_bytes(value, length):
result = []
for i in range(0, length):
result.append(value >> (i * 8) & 0xff)
result.reverse()
return result
In case of working with buffered data I found this useful:
int.from_bytes([buf[0],buf[1],buf[2],buf[3]], "big")
Assuming that all elements in buf are 8-bit long.
An old question that I stumbled upon while looking for an existing solution. Rolled my own and thought I'd share because it allows you to create a 32-bit integer from a list of bytes, specifying an offset.
def bytes_to_int(bList, offset):
r = 0
for i in range(4):
d = 32 - ((i + 1) * 8)
r += bList[offset + i] << d
return r
#convert bytes to int
def bytes_to_int(value):
return int.from_bytes(bytearray(value), 'little')
bytes_to_int(b'\xa231')

How to split 16-bit unsigned integer into array of bytes in python?

I need to split a 16-bit unsigned integer into an array of bytes (i.e. array.array('B')) in python.
For example:
>>> reg_val = 0xABCD
[insert python magic here]
>>> print("0x%X" % myarray[0])
0xCD
>>> print("0x%X" % myarray[1])
0xAB
The way I'm currently doing it seems very complicated for something so simple:
>>> import struct
>>> import array
>>> reg_val = 0xABCD
>>> reg_val_msb, reg_val_lsb = struct.unpack("<BB", struct.pack("<H", (0xFFFF & reg_val)))
>>> myarray = array.array('B')
>>> myarray.append(reg_val_msb)
>>> myarray.append(reg_val_lsb)
Is there a better/more efficient/more pythonic way of accomplishing the same thing?
(using python 3 here, there are some nomenclature differences in 2)
Well first, you could just leave everything as bytes. This is perfectly valid:
reg_val_msb, reg_val_lsb = struct.pack('<H', 0xABCD)
bytes allows for "tuple unpacking" (not related to struct.unpack, tuple unpacking is used all over python). And bytes is an array of bytes, which can be accessed via index as you wanted.
b = struct.pack('<H',0xABCD)
b[0],b[1]
Out[52]: (205, 171)
If you truly wanted to get it into an array.array('B'), it's still rather easy:
ary = array('B',struct.pack('<H',0xABCD))
# ary = array('B', [205, 171])
print("0x%X" % ary[0])
# 0xCD
For non-complex numbers you can use divmod(a, b), which returns a tuple of the quotient and remainder of arguments.
The following example uses map() for demonstration purposes. In both examples we're simply telling divmod to return a tuple (a/b, a%b), where a=0xABCD and b=256.
>>> map(hex, divmod(0xABCD, 1<<8)) # Add a list() call here if your working with python 3.x
['0xab', '0xcd']
# Or if the bit shift notation is distrubing:
>>> map(hex, divmod(0xABCD, 256))
Or you can just place them in the array:
>>> arr = array.array('B')
>>> arr.extend(divmod(0xABCD, 256))
>>> arr
array('B', [171, 205])
You can write your own function like this.
def binarray(i):
while i:
yield i & 0xff
i = i >> 8
print list(binarray(0xABCD))
#[205, 171]

Convert bytes to bits in python

I am working with Python3.2. I need to take a hex stream as an input and parse it at bit-level. So I used
bytes.fromhex(input_str)
to convert the string to actual bytes. Now how do I convert these bytes to bits?
Another way to do this is by using the bitstring module:
>>> from bitstring import BitArray
>>> input_str = '0xff'
>>> c = BitArray(hex=input_str)
>>> c.bin
'0b11111111'
And if you need to strip the leading 0b:
>>> c.bin[2:]
'11111111'
The bitstring module isn't a requirement, as jcollado's answer shows, but it has lots of performant methods for turning input into bits and manipulating them. You might find this handy (or not), for example:
>>> c.uint
255
>>> c.invert()
>>> c.bin[2:]
'00000000'
etc.
What about something like this?
>>> bin(int('ff', base=16))
'0b11111111'
This will convert the hexadecimal string you have to an integer and that integer to a string in which each byte is set to 0/1 depending on the bit-value of the integer.
As pointed out by a comment, if you need to get rid of the 0b prefix, you can do it this way:
>>> bin(int('ff', base=16))[2:]
'11111111'
... or, if you are using Python 3.9 or newer:
>>> bin(int('ff', base=16)).removepreffix('0b')
'11111111'
Note: using lstrip("0b") here will lead to 0 integer being converted to an empty string. This is almost always not what you want to do.
Operations are much faster when you work at the integer level. In particular, converting to a string as suggested here is really slow.
If you want bit 7 and 8 only, use e.g.
val = (byte >> 6) & 3
(this is: shift the byte 6 bits to the right - dropping them. Then keep only the last two bits 3 is the number with the first two bits set...)
These can easily be translated into simple CPU operations that are super fast.
using python format string syntax
>>> mybyte = bytes.fromhex("0F") # create my byte using a hex string
>>> binary_string = "{:08b}".format(int(mybyte.hex(),16))
>>> print(binary_string)
00001111
The second line is where the magic happens. All byte objects have a .hex() function, which returns a hex string. Using this hex string, we convert it to an integer, telling the int() function that it's a base 16 string (because hex is base 16). Then we apply formatting to that integer so it displays as a binary string. The {:08b} is where the real magic happens. It is using the Format Specification Mini-Language format_spec. Specifically it's using the width and the type parts of the format_spec syntax. The 8 sets width to 8, which is how we get the nice 0000 padding, and the b sets the type to binary.
I prefer this method over the bin() method because using a format string gives a lot more flexibility.
I think simplest would be use numpy here. For example you can read a file as bytes and then expand it to bits easily like this:
Bytes = numpy.fromfile(filename, dtype = "uint8")
Bits = numpy.unpackbits(Bytes)
input_str = "ABC"
[bin(byte) for byte in bytes(input_str, "utf-8")]
Will give:
['0b1000001', '0b1000010', '0b1000011']
Here how to do it using format()
print "bin_signedDate : ", ''.join(format(x, '08b') for x in bytevector)
It is important the 08b . That means it will be a maximum of 8 leading zeros be appended to complete a byte. If you don't specify this then the format will just have a variable bit length for each converted byte.
To binary:
bin(byte)[2:].zfill(8)
Use ord when reading reading bytes:
byte_binary = bin(ord(f.read(1))) # Add [2:] to remove the "0b" prefix
Or
Using str.format():
'{:08b}'.format(ord(f.read(1)))
The other answers here provide the bits in big-endian order ('\x01' becomes '00000001')
In case you're interested in little-endian order of bits, which is useful in many cases, like common representations of bignums etc -
here's a snippet for that:
def bits_little_endian_from_bytes(s):
return ''.join(bin(ord(x))[2:].rjust(8,'0')[::-1] for x in s)
And for the other direction:
def bytes_from_bits_little_endian(s):
return ''.join(chr(int(s[i:i+8][::-1], 2)) for i in range(0, len(s), 8))
One line function to convert bytes (not string) to bit list. There is no endnians issue when source is from a byte reader/writer to another byte reader/writer, only if source and target are bit reader and bit writers.
def byte2bin(b):
return [int(X) for X in "".join(["{:0>8}".format(bin(X)[2:])for X in b])]
I came across this answer when looking for a way to convert an integer into a list of bit positions where the bitstring is equal to one. This becomes very similar to this question if you first convert your hex string to an integer like int('0x453', 16).
Now, given an integer - a representation already well-encoded in the hardware, I was very surprised to find out that the string variants of the above solutions using things like bin turn out to be faster than numpy based solutions for a single number, and I thought I'd quickly write up the results.
I wrote three variants of the function. First using numpy:
import math
import numpy as np
def bit_positions_numpy(val):
"""
Given an integer value, return the positions of the on bits.
"""
bit_length = val.bit_length() + 1
length = math.ceil(bit_length / 8.0) # bytelength
bytestr = val.to_bytes(length, byteorder='big', signed=True)
arr = np.frombuffer(bytestr, dtype=np.uint8, count=length)
bit_arr = np.unpackbits(arr, bitorder='big')
bit_positions = np.where(bit_arr[::-1])[0].tolist()
return bit_positions
Then using string logic:
def bit_positions_str(val):
is_negative = val < 0
if is_negative:
bit_length = val.bit_length() + 1
length = math.ceil(bit_length / 8.0) # bytelength
neg_position = (length * 8) - 1
# special logic for negatives to get twos compliment repr
max_val = 1 << neg_position
val_ = max_val + val
else:
val_ = val
binary_string = '{:b}'.format(val_)[::-1]
bit_positions = [pos for pos, char in enumerate(binary_string)
if char == '1']
if is_negative:
bit_positions.append(neg_position)
return bit_positions
And finally, I added a third method where I precomputed a lookuptable of the positions for a single byte and expanded that given larger itemsizes.
BYTE_TO_POSITIONS = []
pos_masks = [(s, (1 << s)) for s in range(0, 8)]
for i in range(0, 256):
positions = [pos for pos, mask in pos_masks if (mask & i)]
BYTE_TO_POSITIONS.append(positions)
def bit_positions_lut(val):
bit_length = val.bit_length() + 1
length = math.ceil(bit_length / 8.0) # bytelength
bytestr = val.to_bytes(length, byteorder='big', signed=True)
bit_positions = []
for offset, b in enumerate(bytestr[::-1]):
pos = BYTE_TO_POSITIONS[b]
if offset == 0:
bit_positions.extend(pos)
else:
pos_offset = (8 * offset)
bit_positions.extend([p + pos_offset for p in pos])
return bit_positions
The benchmark code is as follows:
def benchmark_bit_conversions():
# for val in [-0, -1, -3, -4, -9999]:
test_values = [
# -1, -2, -3, -4, -8, -32, -290, -9999,
# 0, 1, 2, 3, 4, 8, 32, 290, 9999,
4324, 1028, 1024, 3000, -100000,
999999999999,
-999999999999,
2 ** 32,
2 ** 64,
2 ** 128,
2 ** 128,
]
for val in test_values:
r1 = bit_positions_str(val)
r2 = bit_positions_numpy(val)
r3 = bit_positions_lut(val)
print(f'val={val}')
print(f'r1={r1}')
print(f'r2={r2}')
print(f'r3={r3}')
print('---')
assert r1 == r2
import xdev
xdev.profile_now(bit_positions_numpy)(val)
xdev.profile_now(bit_positions_str)(val)
xdev.profile_now(bit_positions_lut)(val)
import timerit
ti = timerit.Timerit(10000, bestof=10, verbose=2)
for timer in ti.reset('str'):
for val in test_values:
bit_positions_str(val)
for timer in ti.reset('numpy'):
for val in test_values:
bit_positions_numpy(val)
for timer in ti.reset('lut'):
for val in test_values:
bit_positions_lut(val)
for timer in ti.reset('raw_bin'):
for val in test_values:
bin(val)
for timer in ti.reset('raw_bytes'):
for val in test_values:
val.to_bytes(val.bit_length(), 'big', signed=True)
And it clearly shows the str and lookup table implementations are ahead of numpy. I tested this on CPython 3.10 and 3.11.
Timed str for: 10000 loops, best of 10
time per loop: best=20.488 µs, mean=21.438 ± 0.4 µs
Timed numpy for: 10000 loops, best of 10
time per loop: best=25.754 µs, mean=28.509 ± 5.2 µs
Timed lut for: 10000 loops, best of 10
time per loop: best=19.420 µs, mean=21.305 ± 3.8 µs

How to convert a string of bytes into an int?

How can I convert a string of bytes into an int in python?
Say like this: 'y\xcc\xa6\xbb'
I came up with a clever/stupid way of doing it:
sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))
I know there has to be something builtin or in the standard library that does this more simply...
This is different from converting a string of hex digits for which you can use int(xxx, 16), but instead I want to convert a string of actual byte values.
UPDATE:
I kind of like James' answer a little better because it doesn't require importing another module, but Greg's method is faster:
>>> from timeit import Timer
>>> Timer('struct.unpack("<L", "y\xcc\xa6\xbb")[0]', 'import struct').timeit()
0.36242198944091797
>>> Timer("int('y\xcc\xa6\xbb'.encode('hex'), 16)").timeit()
1.1432669162750244
My hacky method:
>>> Timer("sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))").timeit()
2.8819329738616943
FURTHER UPDATE:
Someone asked in comments what's the problem with importing another module. Well, importing a module isn't necessarily cheap, take a look:
>>> Timer("""import struct\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""").timeit()
0.98822188377380371
Including the cost of importing the module negates almost all of the advantage that this method has. I believe that this will only include the expense of importing it once for the entire benchmark run; look what happens when I force it to reload every time:
>>> Timer("""reload(struct)\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""", 'import struct').timeit()
68.474128007888794
Needless to say, if you're doing a lot of executions of this method per one import than this becomes proportionally less of an issue. It's also probably i/o cost rather than cpu so it may depend on the capacity and load characteristics of the particular machine.
In Python 3.2 and later, use
>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='big')
2043455163
or
>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='little')
3148270713
according to the endianness of your byte-string.
This also works for bytestring-integers of arbitrary length, and for two's-complement signed integers by specifying signed=True. See the docs for from_bytes.
You can also use the struct module to do this:
>>> struct.unpack("<L", "y\xcc\xa6\xbb")[0]
3148270713L
As Greg said, you can use struct if you are dealing with binary values, but if you just have a "hex number" but in byte format you might want to just convert it like:
s = 'y\xcc\xa6\xbb'
num = int(s.encode('hex'), 16)
...this is the same as:
num = struct.unpack(">L", s)[0]
...except it'll work for any number of bytes.
I use the following function to convert data between int, hex and bytes.
def bytes2int(str):
return int(str.encode('hex'), 16)
def bytes2hex(str):
return '0x'+str.encode('hex')
def int2bytes(i):
h = int2hex(i)
return hex2bytes(h)
def int2hex(i):
return hex(i)
def hex2int(h):
if len(h) > 1 and h[0:2] == '0x':
h = h[2:]
if len(h) % 2:
h = "0" + h
return int(h, 16)
def hex2bytes(h):
if len(h) > 1 and h[0:2] == '0x':
h = h[2:]
if len(h) % 2:
h = "0" + h
return h.decode('hex')
Source: http://opentechnotes.blogspot.com.au/2014/04/convert-values-to-from-integer-hex.html
import array
integerValue = array.array("I", 'y\xcc\xa6\xbb')[0]
Warning: the above is strongly platform-specific. Both the "I" specifier and the endianness of the string->int conversion are dependent on your particular Python implementation. But if you want to convert many integers/strings at once, then the array module does it quickly.
In Python 2.x, you could use the format specifiers <B for unsigned bytes, and <b for signed bytes with struct.unpack/struct.pack.
E.g:
Let x = '\xff\x10\x11'
data_ints = struct.unpack('<' + 'B'*len(x), x) # [255, 16, 17]
And:
data_bytes = struct.pack('<' + 'B'*len(data_ints), *data_ints) # '\xff\x10\x11'
That * is required!
See https://docs.python.org/2/library/struct.html#format-characters for a list of the format specifiers.
>>> reduce(lambda s, x: s*256 + x, bytearray("y\xcc\xa6\xbb"))
2043455163
Test 1: inverse:
>>> hex(2043455163)
'0x79cca6bb'
Test 2: Number of bytes > 8:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAA"))
338822822454978555838225329091068225L
Test 3: Increment by one:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAB"))
338822822454978555838225329091068226L
Test 4: Append one byte, say 'A':
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))
86738642548474510294585684247313465921L
Test 5: Divide by 256:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))/256
338822822454978555838225329091068226L
Result equals the result of Test 4, as expected.
I was struggling to find a solution for arbitrary length byte sequences that would work under Python 2.x. Finally I wrote this one, it's a bit hacky because it performs a string conversion, but it works.
Function for Python 2.x, arbitrary length
def signedbytes(data):
"""Convert a bytearray into an integer, considering the first bit as
sign. The data must be big-endian."""
negative = data[0] & 0x80 > 0
if negative:
inverted = bytearray(~d % 256 for d in data)
return -signedbytes(inverted) - 1
encoded = str(data).encode('hex')
return int(encoded, 16)
This function has two requirements:
The input data needs to be a bytearray. You may call the function like this:
s = 'y\xcc\xa6\xbb'
n = signedbytes(s)
The data needs to be big-endian. In case you have a little-endian value, you should reverse it first:
n = signedbytes(s[::-1])
Of course, this should be used only if arbitrary length is needed. Otherwise, stick with more standard ways (e.g. struct).
int.from_bytes is the best solution if you are at version >=3.2.
The "struct.unpack" solution requires a string so it will not apply to arrays of bytes.
Here is another solution:
def bytes2int( tb, order='big'):
if order == 'big': seq=[0,1,2,3]
elif order == 'little': seq=[3,2,1,0]
i = 0
for j in seq: i = (i<<8)+tb[j]
return i
hex( bytes2int( [0x87, 0x65, 0x43, 0x21])) returns '0x87654321'.
It handles big and little endianness and is easily modifiable for 8 bytes
As mentioned above using unpack function of struct is a good way. If you want to implement your own function there is an another solution:
def bytes_to_int(bytes):
result = 0
for b in bytes:
result = result * 256 + int(b)
return result
In python 3 you can easily convert a byte string into a list of integers (0..255) by
>>> list(b'y\xcc\xa6\xbb')
[121, 204, 166, 187]
A decently speedy method utilizing array.array I've been using for some time:
predefined variables:
offset = 0
size = 4
big = True # endian
arr = array('B')
arr.fromstring("\x00\x00\xff\x00") # 5 bytes (encoding issues) [0, 0, 195, 191, 0]
to int: (read)
val = 0
for v in arr[offset:offset+size][::pow(-1,not big)]: val = (val<<8)|v
from int: (write)
val = 16384
arr[offset:offset+size] = \
array('B',((val>>(i<<3))&255 for i in range(size)))[::pow(-1,not big)]
It's possible these could be faster though.
EDIT:
For some numbers, here's a performance test (Anaconda 2.3.0) showing stable averages on read in comparison to reduce():
========================= byte array to int.py =========================
5000 iterations; threshold of min + 5000ns:
______________________________________code___|_______min______|_______max______|_______avg______|_efficiency
⣿⠀⠀⠀⠀⡇⢀⡀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⡀⠀⢰⠀⠀⠀⢰⠀⠀⠀⢸⠀⠀⢀⡇⠀⢀⠀⠀⠀⠀⢠⠀⠀⠀⠀⢰⠀⠀⠀⢸⡀⠀⠀⠀⢸⠀⡇⠀⠀⢠⠀⢰⠀⢸⠀
⣿⣦⣴⣰⣦⣿⣾⣧⣤⣷⣦⣤⣶⣾⣿⣦⣼⣶⣷⣶⣸⣴⣤⣀⣾⣾⣄⣤⣾⡆⣾⣿⣿⣶⣾⣾⣶⣿⣤⣾⣤⣤⣴⣼⣾⣼⣴⣤⣼⣷⣆⣴⣴⣿⣾⣷⣧⣶⣼⣴⣿⣶⣿⣶
val = 0 \nfor v in arr: val = (val<<8)|v | 5373.848ns | 850009.965ns | ~8649.64ns | 62.128%
⡇⠀⠀⢀⠀⠀⠀⡇⠀⡇⠀⠀⣠⠀⣿⠀⠀⠀⠀⡀⠀⠀⡆⠀⡆⢰⠀⠀⡆⠀⡄⠀⠀⠀⢠⢀⣼⠀⠀⡇⣠⣸⣤⡇⠀⡆⢸⠀⠀⠀⠀⢠⠀⢠⣿⠀⠀⢠⠀⠀⢸⢠⠀⡀
⣧⣶⣶⣾⣶⣷⣴⣿⣾⡇⣤⣶⣿⣸⣿⣶⣶⣶⣶⣧⣷⣼⣷⣷⣷⣿⣦⣴⣧⣄⣷⣠⣷⣶⣾⣸⣿⣶⣶⣷⣿⣿⣿⣷⣧⣷⣼⣦⣶⣾⣿⣾⣼⣿⣿⣶⣶⣼⣦⣼⣾⣿⣶⣷
val = reduce( shift, arr ) | 6489.921ns | 5094212.014ns | ~12040.269ns | 53.902%
This is a raw performance test, so the endian pow-flip is left out.
The shift function shown applies the same shift-oring operation as the for loop, and arr is just array.array('B',[0,0,255,0]) as it has the fastest iterative performance next to dict.
I should probably also note efficiency is measured by accuracy to the average time.

Categories

Resources