Logical OR for Bit-string in Python - python

What I want to do is compute the logical OR of two bit-strings. For example:
a='010010'
b='000101'
c=LOGIC_OR(a,b)
c
010111
The error I encounter most often is that converting 'b' from a string to binary removes the leading zeros. Other methods I have tried convert 'a' and 'b' to integers. Nothing has worked so far, and help would be much appreciated.
Thanks in advance

You can convert them to integers with int specifying the base to be 2. Then, perform a bitwise OR operation and convert the result to a bit string with bin.
>>> c = int(a, 2) | int(b, 2)
>>> c
23
If you want to print the result as a bit string, use str.format. If you're on Python 3.6 or later, you can also use f-strings.
>>> '{:b}'.format(c)
'10111'
>>> print(f"{c:b}")
10111
To capture leading zeros with respect to a/b, use str.zfill -
>>> f"{c:b}".zfill(len(a))
'010111'
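Putting the steps above together, a small helper might look like the following. This is only a sketch: the name bitstring_or is made up for illustration, it assumes both inputs contain nothing but '0' and '1', and it pads to the longer of the two inputs rather than to len(a).

```python
def bitstring_or(a, b):
    """OR two bit-strings and keep the leading zeros (hypothetical helper)."""
    width = max(len(a), len(b))          # pad to the longer input
    value = int(a, 2) | int(b, 2)        # parse as base 2, then bitwise OR
    return format(value, 'b').zfill(width)

print(bitstring_or('010010', '000101'))  # 010111
```

Padding to the longer input means inputs of unequal length still give a result as wide as the widest operand.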

Here are a couple of alternative methods.
Third-party bitarray library:
from bitarray import bitarray
a='010010'
b='000101'
logical_or_bitarray = bitarray(a) | bitarray(b) # output: bitarray('010111')
logical_or_string = ''.join(map(str, map(int, logical_or_bitarray))) # output: '010111'
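Python strings:
a='010010'
b='000101'
def compare_bits(A, B):
    # Parse as base 2; int(A) alone would read the digits as decimal.
    c_1 = format(int(A, 2) | int(B, 2), 'b')
    # Restore the leading zeros dropped by the integer conversion.
    c = (len(A) - len(c_1))*'0' + c_1
    return c
compare_bits(a, b)  # '010111'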
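If you would rather stay entirely in string space, the OR can also be done one character pair at a time, which never drops leading zeros. A minimal sketch, assuming both strings have the same length and contain only '0' and '1' (the function name is illustrative, not from any library):

```python
def or_bits_charwise(a, b):
    """OR two equal-length bit-strings without converting to int."""
    if len(a) != len(b):
        raise ValueError("bit-strings must have the same length")
    # A result bit is '1' when either input character is '1'.
    return ''.join('1' if '1' in (x, y) else '0' for x, y in zip(a, b))

print(or_bits_charwise('010010', '000101'))  # 010111
```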

You should convert to int objects and do numerical operations in the numerical data type. Then you use string-formatting when you need to see it. If you have Python 3.6, using f-strings makes this trivial:
>>> a='010010'
>>> b='000101'
>>> a = int(a, base=2) # we should be ints
>>> b = int(b, base=2) # we should be ints
>>> c = a | b # operations natural and built in
>>> print(f"{c:b}") # use formatting when you need it
10111
Read the string formatting specs; you can make them do whatever you want. Using a fill value of '0' and a width of 6:
>>> print(f"{c:0>6b}")
010111
And this is cool too:
>>> pad='0'
>>> width = 6
>>> print(f"{c:{pad}>{width}b}")
010111

Related

Zylabs 7.14 Lab: Reverse binary [duplicate]

Are there any canned Python methods to convert an Integer (or Long) into a binary string in Python?
There are a myriad of dec2bin() functions out on Google... But I was hoping I could use a built-in function / library.
Python's string format method can take a format spec.
>>> "{0:b}".format(37)
'100101'
Format spec docs for Python 2
Format spec docs for Python 3
If you're looking for bin() as an equivalent to hex(), it was added in python 2.6.
Example:
>>> bin(10)
'0b1010'
Python actually does have something already built in for this, the ability to do operations such as '{0:b}'.format(42), which will give you the bit pattern (in a string) for 42, or 101010.
For a more general philosophy, no language or library will give its user base everything that they desire. If you're working in an environment that doesn't provide exactly what you need, you should be collecting snippets of code as you develop to ensure you never have to write the same thing twice. Such as, for example, the pseudo-code:
define intToBinString, receiving intVal:
    if intVal is equal to zero:
        return "0"
    set strVal to ""
    while intVal is greater than zero:
        if intVal is odd:
            prefix "1" to strVal
        else:
            prefix "0" to strVal
        divide intVal by two, rounding down
    return strVal
which will construct your binary string based on the decimal value. Just keep in mind that's a generic bit of pseudo-code which may not be the most efficient way of doing it though, with the iterations you seem to be proposing, it won't make much difference. It's really just meant as a guideline on how it could be done.
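For reference, the pseudo-code above could be transcribed into Python roughly as follows. This is a sketch of that same loop, not a replacement for format() or bin(); the function name and the explicit check for negative input are additions made for illustration.

```python
def int_to_bin_string(int_val):
    """Binary digits of a non-negative int, built by repeated halving."""
    if int_val < 0:
        raise ValueError("the pseudo-code only covers non-negative values")
    if int_val == 0:
        return "0"
    str_val = ""
    while int_val > 0:
        # Prefix the lowest bit, then drop it with floor division.
        str_val = ("1" if int_val % 2 else "0") + str_val
        int_val //= 2
    return str_val

print(int_to_bin_string(42))  # 101010
```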
The general idea is to use code from (in order of preference):
the language or built-in libraries.
third-party libraries with suitable licenses.
your own collection.
something new you need to write (and save in your own collection for later).
If you want a textual representation without the 0b-prefix, you could use this:
get_bin = lambda x: format(x, 'b')
>>> get_bin(3)
'11'
>>> get_bin(-3)
'-11'
When you want an n-bit representation:
get_bin = lambda x, n: format(x, 'b').zfill(n)
>>> get_bin(12, 32)
'00000000000000000000000000001100'
>>> get_bin(-12, 32)
'-00000000000000000000000000001100'
Alternatively, if you prefer having a function:
def get_bin(x, n=0):
    """
    Get the binary representation of x.

    Parameters
    ----------
    x : int
    n : int
        Minimum number of digits. If x needs less digits in binary, the rest
        is filled with zeros.

    Returns
    -------
    str
    """
    return format(x, 'b').zfill(n)
I am surprised there is no mention of a nice way to accomplish this using formatting strings that are supported in Python 3.6 and higher. TLDR:
>>> number = 1
>>> f'0b{number:08b}'
'0b00000001'
Longer story
This is functionality of formatting strings available from Python 3.6:
>>> x, y, z = 1, 2, 3
>>> f'{x} {y} {2*z}'
'1 2 6'
You can request binary as well:
>>> f'{z:b}'
'11'
Specify the width:
>>> f'{z:8b}'
'      11'
Request zero padding:
>>> f'{z:08b}'
'00000011'
And add common prefix to signify binary number:
>>> f'0b{z:08b}'
'0b00000011'
You can also let Python add the prefix for you but I do not like it so much as the version above because you have to take the prefix into width consideration:
>>> f'{z:#010b}'
'0b00000011'
More info is available in official documentation on Formatted string literals and Format Specification Mini-Language.
As a reference:
def toBinary(n):
    return ''.join(str(1 & int(n) >> i) for i in range(64)[::-1])
This function can convert a positive integer as large as 18446744073709551615, represented as string '1111111111111111111111111111111111111111111111111111111111111111'.
It can be modified to serve a much larger integer, though it may not be as handy as "{0:b}".format() or bin().
This is for python 3 and it keeps the leading zeros !
print(format(0, '08b'))
A simple way to do that is to use string format, see this page.
>>> "{0:b}".format(10)
'1010'
And if you want to have a fixed length of the binary string, you can use this:
>>> "{0:{fill}8b}".format(10, fill='0')
'00001010'
If two's complement is required, then the following line can be used:
'{0:{fill}{width}b}'.format((x + 2**n) % 2**n, fill='0', width=n)
where n is the width of the binary string.
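Wrapped in a function, the two's complement one-liner above might look like this. It is a sketch: the name to_twos_complement and the range check are assumptions added for illustration, and x % 2**n is used because in Python it gives the same result as (x + 2**n) % 2**n.

```python
def to_twos_complement(x, n):
    """n-bit two's complement bit-string of x (hypothetical helper)."""
    if not -(1 << (n - 1)) <= x < (1 << (n - 1)):
        raise ValueError("x does not fit in n bits")
    # x % 2**n maps negatives onto 2**n + x, same as (x + 2**n) % 2**n.
    return format(x % (1 << n), '0{}b'.format(n))

print(to_twos_complement(10, 8))   # 00001010
print(to_twos_complement(-10, 8))  # 11110110
```

Without the range check, a value that is too large for n bits would silently wrap around.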
one-liner with lambda:
>>> binary = lambda n: '' if n==0 else binary(n//2) + str(n%2)
test:
>>> binary(5)
'101'
EDIT:
but then :(
from time import time
t1 = time()
for i in range(1000000):
    binary(i)
t2 = time()
print(t2 - t1)
# 6.57236599922
compared to
t1 = time()
for i in range(1000000):
    '{0:b}'.format(i)
t2 = time()
print(t2 - t1)
# 0.68017411232
As the preceding answers mostly used format(),
here is an f-string implementation.
integer = 7
bit_count = 5
print(f'{integer:0{bit_count}b}')
Output:
00111
For convenience here is the python docs link for formatted string literals: https://docs.python.org/3/reference/lexical_analysis.html#f-strings.
Summary of alternatives:
n=42
assert "-101010" == format(-n, 'b')
assert "-101010" == "{0:b}".format(-n)
assert "-101010" == (lambda x: x >= 0 and str(bin(x))[2:] or "-" + str(bin(x))[3:])(-n)
assert "0b101010" == bin(n)
assert "101010" == bin(n)[2:] # But this won't work for negative numbers.
Contributors include John Fouhy, Tung Nguyen, mVChr, Martin Thoma, and Martijn Pieters.
>>> format(123, 'b')
'1111011'
For those of us who need to convert signed integers (range -2**(digits-1) to 2**(digits-1)-1) to 2's complement binary strings, this works:
def int2bin(integer, digits):
    if integer >= 0:
        return bin(integer)[2:].zfill(digits)
    else:
        return bin(2**digits + integer)[2:]
This produces:
>>> int2bin(10, 8)
'00001010'
>>> int2bin(-10, 8)
'11110110'
>>> int2bin(-128, 8)
'10000000'
>>> int2bin(127, 8)
'01111111'
You can do it like this:
bin(10)[2:]
or:
f = bin(10)  # bin() already returns a str
c = []
c.append("".join(f[2:]))  # join needs strings, so the digits stay as characters
print(c)  # ['1010']
Using numpy pack/unpackbits, they are your best friends.
Examples
--------
>>> a = np.array([[2], [7], [23]], dtype=np.uint8)
>>> a
array([[ 2],
       [ 7],
       [23]], dtype=uint8)
>>> b = np.unpackbits(a, axis=1)
>>> b
array([[0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1, 1, 1],
       [0, 0, 0, 1, 0, 1, 1, 1]], dtype=uint8)
Yet another solution with another algorithm, by using bitwise operators.
def int2bin(val):
    res=''
    while val>0:
        res += str(val&1)
        val=val>>1 # val=val/2
    return res[::-1] # reverse the string
A faster version without reversing the string.
def int2bin(val):
    res=''
    while val>0:
        res = chr((val&1) + 0x30) + res
        val=val>>1
    return res
numpy.binary_repr(num, width=None)
Examples from the documentation link above:
>>> np.binary_repr(3)
'11'
>>> np.binary_repr(-3)
'-11'
>>> np.binary_repr(3, width=4)
'0011'
The two’s complement is returned when the input number is negative and width is specified:
>>> np.binary_repr(-3, width=3)
'101'
>>> np.binary_repr(-3, width=5)
'11101'
The accepted answer didn't address negative numbers, which I'll cover.
In addition to the answers above, you can also just use the bin and hex functions. And in the opposite direction, use binary notation:
>>> bin(37)
'0b100101'
>>> 0b100101
37
But with negative numbers, things get a bit more complicated. The question doesn't specify how you want to handle negative numbers.
Python just adds a negative sign so the result for -37 would be this:
>>> bin(-37)
'-0b100101'
In computer/hardware binary data, negative signs don't exist. All we have is 1's and 0's. So if you're reading or producing binary streams of data to be processed by other software/hardware, you need to first know the notation being used.
One notation is sign-magnitude notation, where the first bit represents the negative sign, and the rest is the actual value. In that case, -37 would be 0b1100101 and 37 would be 0b0100101. This looks like what python produces, but just add a 0 or 1 in front for positive / negative numbers.
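As an illustration of sign-magnitude, here is a minimal sketch that prepends the sign bit to a fixed-width magnitude. The helper name and the choice to treat n as the total width (1 sign bit plus n - 1 magnitude bits) are assumptions made for this example.

```python
def to_sign_magnitude(x, n):
    """n-bit sign-magnitude bit-string: 1 sign bit + (n-1) magnitude bits."""
    magnitude = abs(x)
    if magnitude >= 1 << (n - 1):
        raise ValueError("magnitude does not fit in n - 1 bits")
    sign = '1' if x < 0 else '0'
    return sign + format(magnitude, '0{}b'.format(n - 1))

print(to_sign_magnitude(-37, 7))  # 1100101
print(to_sign_magnitude(37, 7))   # 0100101
```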
More common is Two's complement notation, which seems more complicated and the result is very different from python's string formatting. You can read the details in the link, but with an 8bit signed integer -37 would be 0b11011011 and 37 would be 0b00100101.
Python has no easy way to produce these binary representations. You can use numpy to turn Two's complement binary values into python integers:
>>> import numpy as np
>>> np.int8(0b11011011)
-37
>>> np.uint8(0b11011011)
219
>>> np.uint8(0b00100101)
37
>>> np.int8(0b00100101)
37
But I don't know an easy way to do the opposite with builtin functions. The bitstring package can help though.
>>> from bitstring import BitArray
>>> arr = BitArray(int=-37, length=8)
>>> arr.uint
219
>>> arr.int
-37
>>> arr.bin
'11011011'
>>> BitArray(bin='11011011').int
-37
>>> BitArray(bin='11011011').uint
219
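For what it's worth, plain Python can also read a two's complement bit-string without third-party packages. The sketch below assumes the string's length is the bit width and that it contains only '0' and '1'; the function name is made up for illustration.

```python
def from_twos_complement(bits):
    """Interpret a bit-string as a signed two's complement integer."""
    value = int(bits, 2)
    if bits[0] == '1':               # top bit set means negative
        value -= 1 << len(bits)
    return value

print(from_twos_complement('11011011'))   # -37
print(from_twos_complement('00100101'))   # 37
print(format(-37 & 0xFF, '08b'))          # 11011011 (masking gives the 8-bit pattern back)
```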
Python 3.6 added a new string formatting approach called formatted string literals or “f-strings”.
Example:
name = 'Bob'
number = 42
f"Hello, {name}, your number is {number:>08b}"
Output will be 'Hello, Bob, your number is 00101010'
A discussion of this question can be found here - Here
Unless I'm misunderstanding what you mean by binary string I think the module you are looking for is struct
n = int(input())  # input() returns a str on Python 3
print(bin(n).replace("0b", ""))
def binary(decimal):
    otherBase = ""
    while decimal != 0:
        otherBase = str(decimal % 2) + otherBase
        decimal //= 2
    return otherBase
print(binary(10))
output:
1010
Here is the code I've just implemented. This is not a method but you can use it as a ready-to-use function!
def inttobinary(number):
    if number == 0:
        return str(0)
    result = ""
    while number != 0:
        remainder = number % 2
        number = number // 2  # floor division keeps number an int on Python 3
        result += str(remainder)
    return result[::-1] # to invert the string
Calculator with all necessary functions for DEC, BIN, HEX:
(made and tested with Python 3.5)
You can change the input test numbers and get the converted ones.
# CONVERTER: DEC / BIN / HEX
def dec2bin(d):
    # dec -> bin
    b = bin(d)
    return b
def dec2hex(d):
    # dec -> hex
    h = hex(d)
    return h
def bin2dec(b):
    # bin -> dec
    bin_numb="{0:b}".format(b)
    d = int(bin_numb, 2)  # parse the digits as base 2 instead of eval()
    return d,bin_numb
def bin2hex(b):
    # bin -> hex
    h = hex(b)
    return h
def hex2dec(h):
    # hex -> dec
    d = int(h)
    return d
def hex2bin(h):
    # hex -> bin
    b = bin(h)
    return b
## TESTING NUMBERS
numb_dec = 99
numb_bin = 0b0111
numb_hex = 0xFF
## CALCULATIONS
res_dec2bin = dec2bin(numb_dec)
res_dec2hex = dec2hex(numb_dec)
res_bin2dec,bin_numb = bin2dec(numb_bin)
res_bin2hex = bin2hex(numb_bin)
res_hex2dec = hex2dec(numb_hex)
res_hex2bin = hex2bin(numb_hex)
## PRINTING
print('------- DECIMAL to BIN / HEX -------\n')
print('decimal:',numb_dec,'\nbin: ',res_dec2bin,'\nhex: ',res_dec2hex,'\n')
print('------- BINARY to DEC / HEX -------\n')
print('binary: ',bin_numb,'\ndec: ',numb_bin,'\nhex: ',res_bin2hex,'\n')
print('----- HEXADECIMAL to BIN / DEC -----\n')
print('hexadec:',hex(numb_hex),'\nbin: ',res_hex2bin,'\ndec: ',res_hex2dec,'\n')
Somewhat similar solution
def to_bin(dec):
    flag = True
    bin_str = ''
    while flag:
        remainder = dec % 2
        quotient = dec // 2  # floor division keeps the quotient an int
        if quotient == 0:
            flag = False
        bin_str += str(remainder)
        dec = quotient
    bin_str = bin_str[::-1] # reverse the string
    return bin_str
Here is a simple solution using the divmod() function, which returns the quotient and the remainder of an integer division.
def dectobin(number):
    bits = ''
    while number >= 1:
        number, rem = divmod(number, 2)
        bits = str(rem) + bits  # prepend: remainders arrive least-significant first
    return bits
Here's yet another way using regular math, no loops, only recursion. (Trivial case 0 returns nothing).
def toBin(num):
    if num == 0:
        return ""
    return toBin(num//2) + str(num%2)
print ([(toBin(i)) for i in range(10)])
['', '1', '10', '11', '100', '101', '110', '111', '1000', '1001']
To print a number in binary:
print("Binary is {0:>08b}".format(16))
To print a number in hexadecimal:
print("Hexadecimal is {0:>0x}".format(15))
To print the binary form of every number up to 16:
for i in range(17):
    print("{0:>2}: binary is {0:>08b}".format(i))
To print the hexadecimal form of every number up to 16:
for i in range(17):
    print("{0:>2}: Hexadecimal is {0:>0x}".format(i))
# two digits are enough for the hexadecimal representation of these numbers
try:
    while True:
        p = ""
        a = int(input())  # input() returns a str on Python 3
        while a != 0:
            l = a % 2
            b = a - l
            a = b // 2
            p = str(l) + p
        print(p)
except (ValueError, EOFError):
    print("write 1 number")
I found a method using matrix operation to convert decimal to binary.
import numpy as np
E_mat = np.tile(E,[1,M])
M_order = pow(2,(M-1-np.array(range(M)))).T
bindata = np.remainder(np.floor(E_mat /M_order).astype(int),2)  # np.int was removed in NumPy 1.24
E is the input decimal data and M is the number of binary digits. bindata is the output binary data, in the form of a 1-by-M binary matrix.

How can I merge strings of 0 or 1 as if I were doing an OR on bitfields in python?

For example, for the strings "000100", "010000", and "100000", I want the result to be "110100".
Is there a simple approach to this in Python?
You can convert each binary string to their actual integer value by using int(<str>, 2), then use the binary or operation (|) to merge them together and get the binary representation back by using bin:
>>> binstrings = ['000100', '010000', '100000']
>>> result = 0
>>> for s in binstrings:
... result |= int(s, 2)
...
>>> result
52
>>> bin(result)
'0b110100'
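Note that bin() drops leading zeros and adds the 0b prefix, so the result above is not yet in the same shape as the inputs. A possible wrapper that restores the original width is sketched below; the function name is illustrative, and it assumes a non-empty list of strings made only of '0' and '1'.

```python
from functools import reduce
from operator import or_

def merge_bitstrings(strings):
    """OR any number of bit-strings, padding to the longest one."""
    width = max(len(s) for s in strings)
    merged = reduce(or_, (int(s, 2) for s in strings), 0)
    return format(merged, 'b').zfill(width)

print(merge_bitstrings(['000100', '010000', '100000']))  # 110100
```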
Here is a simple example of how to use a logical OR | operator.
a = 0b000100
b = 0b010000
c = 0b100000
merge = a | b | c
print(bin(merge))

How to split 16-bit unsigned integer into array of bytes in python?

I need to split a 16-bit unsigned integer into an array of bytes (i.e. array.array('B')) in python.
For example:
>>> reg_val = 0xABCD
[insert python magic here]
>>> print("0x%X" % myarray[0])
0xCD
>>> print("0x%X" % myarray[1])
0xAB
The way I'm currently doing it seems very complicated for something so simple:
>>> import struct
>>> import array
>>> reg_val = 0xABCD
>>> reg_val_lsb, reg_val_msb = struct.unpack("<BB", struct.pack("<H", (0xFFFF & reg_val)))
>>> myarray = array.array('B')
>>> myarray.append(reg_val_lsb)
>>> myarray.append(reg_val_msb)
Is there a better/more efficient/more pythonic way of accomplishing the same thing?
(using python 3 here, there are some nomenclature differences in 2)
Well first, you could just leave everything as bytes. This is perfectly valid:
reg_val_lsb, reg_val_msb = struct.pack('<H', 0xABCD)  # '<' is little-endian, so the low byte comes first
bytes allows for "tuple unpacking" (not related to struct.unpack, tuple unpacking is used all over python). And bytes is an array of bytes, which can be accessed via index as you wanted.
b = struct.pack('<H',0xABCD)
b[0],b[1]
Out[52]: (205, 171)
If you truly wanted to get it into an array.array('B'), it's still rather easy:
ary = array.array('B',struct.pack('<H',0xABCD))
# ary = array('B', [205, 171])
print("0x%X" % ary[0])
# 0xCD
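On Python 3.2 and later, int.to_bytes is another option that skips struct entirely. The following is a sketch of that alternative, not the approach from the answer above; the variable names are taken from the question.

```python
import array

reg_val = 0xABCD
# Mask to 16 bits, then emit two bytes, least significant first.
myarray = array.array('B', (reg_val & 0xFFFF).to_bytes(2, byteorder='little'))

print("0x%X" % myarray[0])  # 0xCD
print("0x%X" % myarray[1])  # 0xAB
```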
For non-complex numbers you can use divmod(a, b), which returns a tuple of the quotient and remainder of arguments.
The following example uses map() for demonstration purposes. In both examples we're simply telling divmod to return a tuple (a//b, a%b), where a=0xABCD and b=256.
>>> list(map(hex, divmod(0xABCD, 1<<8))) # the list() call is only needed on Python 3.x
['0xab', '0xcd']
# Or if the bit shift notation is disturbing:
>>> list(map(hex, divmod(0xABCD, 256)))
Or you can just place them in the array:
>>> arr = array.array('B')
>>> arr.extend(divmod(0xABCD, 256))
>>> arr
array('B', [171, 205])
You can write your own function like this.
def binarray(i):
    while i:
        yield i & 0xff
        i = i >> 8
print(list(binarray(0xABCD)))
#[205, 171]

Convert a Python int into a big-endian string of bytes

I have a non-negative int and I would like to efficiently convert it to a big-endian string containing the same data. For example, the int 1245427 (which is 0x1300F3) should result in a string of length 3 containing three characters whose byte values are 0x13, 0x00, and 0xf3.
My ints are on the scale of 35 (base-10) digits.
How do I do this?
In Python 3.2+, you can use int.to_bytes:
If you don't want to specify the size
>>> n = 1245427
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big') or b'\0'
b'\x13\x00\xf3'
If you don't mind specifying the size
>>> (1245427).to_bytes(3, byteorder='big')
b'\x13\x00\xf3'
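Since the question mentions ints of about 35 decimal digits, here is a sketch showing that the same call handles values of that size. The helper name int_to_big_endian and the sample 35-digit number are made up for illustration.

```python
def int_to_big_endian(n):
    """Shortest big-endian byte string for a non-negative int."""
    if n < 0:
        raise ValueError("n must be non-negative")
    length = max(1, (n.bit_length() + 7) // 8)   # at least one byte for 0
    return n.to_bytes(length, byteorder='big')

big = 12345678901234567890123456789012345       # a 35-digit value
encoded = int_to_big_endian(big)
print(int.from_bytes(encoded, 'big') == big)     # True: the round trip is lossless
print(int_to_big_endian(1245427))                # b'\x13\x00\xf3'
```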
You can use the struct module:
import struct
print(struct.pack('>I', your_int))
'>I' is a format string. > means big endian and I means unsigned int. Check the documentation for more format chars.
This is fast and works for small and (arbitrary) large ints:
def Dump(n):
    s = '%x' % n
    if len(s) & 1:
        s = '0' + s
    return bytes.fromhex(s)  # Python 3; the Python 2 spelling was s.decode('hex')
print(repr(Dump(1245427))) #: b'\x13\x00\xf3'
Probably the best way is via the built-in struct module:
>>> import struct
>>> x = 1245427
>>> struct.pack('>BH', x >> 16, x & 0xFFFF)
'\x13\x00\xf3'
>>> struct.pack('>L', x)[1:] # could do it this way too
'\x13\x00\xf3'
Alternatively -- and I wouldn't usually recommend this, because it's mistake-prone -- you can do it "manually" by shifting and the chr() function:
>>> x = 1245427
>>> chr((x >> 16) & 0xFF) + chr((x >> 8) & 0xFF) + chr(x & 0xFF)
'\x13\x00\xf3'
Out of curiosity, why do you only want three bytes? Usually you'd pack such an integer into a full 32 bits (a C unsigned long), and use struct.pack('>L', 1245427) but skip the [1:] step?
def tost(i):
    result = []
    while i:
        result.append(chr(i&0xFF))
        i >>= 8
    result.reverse()
    return ''.join(result)
Single-source Python 2/3 compatible version based on @pts' answer:
#!/usr/bin/env python
import binascii
def int2bytes(i):
    hex_string = '%x' % i
    n = len(hex_string)
    return binascii.unhexlify(hex_string.zfill(n + (n & 1)))
print(int2bytes(1245427))
# -> b'\x13\x00\xf3'
The shortest way, I think, is the following:
import struct
val = 0x11223344
val = struct.unpack("<I", struct.pack(">I", val))[0]
print("%08x" % val)
This converts an integer to a byte-swapped integer.
Using the bitstring module:
>>> bitstring.BitArray(uint=1245427, length=24).bytes
'\x13\x00\xf3'
Note though that for this method you need to specify the length in bits of the bitstring you are creating.
Internally this is pretty much the same as Alex's answer, but the module has a lot of extra functionality available if you want to do more with your data.
Very easy with pwntools, the toolkit created for software hacking
(Un-ironically, I stumbled across this thread and tried solutions here, until I realised there exists conversion functionality in pwntools)
from pwn import p32  # the pwntools package is imported as "pwn"
x2 = p32(x1)

How to convert a string of bytes into an int?

How can I convert a string of bytes into an int in python?
Say like this: 'y\xcc\xa6\xbb'
I came up with a clever/stupid way of doing it:
sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))
I know there has to be something builtin or in the standard library that does this more simply...
This is different from converting a string of hex digits for which you can use int(xxx, 16), but instead I want to convert a string of actual byte values.
UPDATE:
I kind of like James' answer a little better because it doesn't require importing another module, but Greg's method is faster:
>>> from timeit import Timer
>>> Timer('struct.unpack("<L", "y\xcc\xa6\xbb")[0]', 'import struct').timeit()
0.36242198944091797
>>> Timer("int('y\xcc\xa6\xbb'.encode('hex'), 16)").timeit()
1.1432669162750244
My hacky method:
>>> Timer("sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))").timeit()
2.8819329738616943
FURTHER UPDATE:
Someone asked in comments what's the problem with importing another module. Well, importing a module isn't necessarily cheap, take a look:
>>> Timer("""import struct\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""").timeit()
0.98822188377380371
Including the cost of importing the module negates almost all of the advantage that this method has. I believe that this will only include the expense of importing it once for the entire benchmark run; look what happens when I force it to reload every time:
>>> Timer("""reload(struct)\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""", 'import struct').timeit()
68.474128007888794
Needless to say, if you're doing a lot of executions of this method per import, then this becomes proportionally less of an issue. It's also probably I/O cost rather than CPU, so it may depend on the capacity and load characteristics of the particular machine.
In Python 3.2 and later, use
>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='big')
2043455163
or
>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='little')
3148270713
according to the endianness of your byte-string.
This also works for bytestring-integers of arbitrary length, and for two's-complement signed integers by specifying signed=True. See the docs for from_bytes.
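To make the effect of signed=True concrete, here is a short sketch. The two-byte value b'\xff\x10' is an arbitrary example chosen for illustration and is not from the question.

```python
data = b'y\xcc\xa6\xbb'
print(int.from_bytes(data, byteorder='big'))     # 2043455163

# With signed=True the top bit is treated as a two's complement sign bit.
raw = b'\xff\x10'
unsigned_value = int.from_bytes(raw, byteorder='big')               # 65296
signed_value = int.from_bytes(raw, byteorder='big', signed=True)    # -240
print(unsigned_value, signed_value)
```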
You can also use the struct module to do this:
>>> struct.unpack("<L", "y\xcc\xa6\xbb")[0]
3148270713L
As Greg said, you can use struct if you are dealing with binary values, but if you just have a "hex number" but in byte format you might want to just convert it like:
s = 'y\xcc\xa6\xbb'
num = int(s.encode('hex'), 16)
...this is the same as:
num = struct.unpack(">L", s)[0]
...except it'll work for any number of bytes.
I use the following function to convert data between int, hex and bytes.
def bytes2int(str):
    return int(str.encode('hex'), 16)
def bytes2hex(str):
    return '0x'+str.encode('hex')
def int2bytes(i):
    h = int2hex(i)
    return hex2bytes(h)
def int2hex(i):
    return hex(i)
def hex2int(h):
    if len(h) > 1 and h[0:2] == '0x':
        h = h[2:]
    if len(h) % 2:
        h = "0" + h
    return int(h, 16)
def hex2bytes(h):
    if len(h) > 1 and h[0:2] == '0x':
        h = h[2:]
    if len(h) % 2:
        h = "0" + h
    return h.decode('hex')
Source: http://opentechnotes.blogspot.com.au/2014/04/convert-values-to-from-integer-hex.html
import array
integerValue = array.array("I", 'y\xcc\xa6\xbb')[0]
Warning: the above is strongly platform-specific. Both the "I" specifier and the endianness of the string->int conversion are dependent on your particular Python implementation. But if you want to convert many integers/strings at once, then the array module does it quickly.
In Python 2.x, you could use the format specifiers <B for unsigned bytes, and <b for signed bytes with struct.unpack/struct.pack.
E.g:
Let x = '\xff\x10\x11'
data_ints = struct.unpack('<' + 'B'*len(x), x) # [255, 16, 17]
And:
data_bytes = struct.pack('<' + 'B'*len(data_ints), *data_ints) # '\xff\x10\x11'
That * is required!
See https://docs.python.org/2/library/struct.html#format-characters for a list of the format specifiers.
>>> reduce(lambda s, x: s*256 + x, bytearray("y\xcc\xa6\xbb"))
2043455163
Test 1: inverse:
>>> hex(2043455163)
'0x79cca6bb'
Test 2: Number of bytes > 8:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAA"))
338822822454978555838225329091068225L
Test 3: Increment by one:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAB"))
338822822454978555838225329091068226L
Test 4: Append one byte, say 'A':
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))
86738642548474510294585684247313465921L
Test 5: Divide by 256:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))/256
338822822454978555838225329091068226L
Result equals the result of Test 3, as expected.
I was struggling to find a solution for arbitrary length byte sequences that would work under Python 2.x. Finally I wrote this one, it's a bit hacky because it performs a string conversion, but it works.
Function for Python 2.x, arbitrary length
def signedbytes(data):
    """Convert a bytearray into an integer, considering the first bit as
    sign. The data must be big-endian."""
    negative = data[0] & 0x80 > 0
    if negative:
        inverted = bytearray(~d % 256 for d in data)
        return -signedbytes(inverted) - 1
    encoded = str(data).encode('hex')
    return int(encoded, 16)
This function has two requirements:
The input data needs to be a bytearray. You may call the function like this:
s = 'y\xcc\xa6\xbb'
n = signedbytes(bytearray(s))
The data needs to be big-endian. In case you have a little-endian value, you should reverse it first:
n = signedbytes(bytearray(s[::-1]))
Of course, this should be used only if arbitrary length is needed. Otherwise, stick with more standard ways (e.g. struct).
int.from_bytes is the best solution if you are at version >=3.2.
The "struct.unpack" solution requires a string so it will not apply to arrays of bytes.
Here is another solution:
def bytes2int( tb, order='big'):
    if order == 'big': seq=[0,1,2,3]
    elif order == 'little': seq=[3,2,1,0]
    i = 0
    for j in seq: i = (i<<8)+tb[j]
    return i
hex( bytes2int( [0x87, 0x65, 0x43, 0x21])) returns '0x87654321'.
It handles big and little endianness and is easily modifiable for 8 bytes
As mentioned above, using the unpack function of struct is a good way. If you want to implement your own function, here is another solution:
def bytes_to_int(bytes):
    result = 0
    for b in bytes:
        result = result * 256 + int(b)
    return result
In python 3 you can easily convert a byte string into a list of integers (0..255) by
>>> list(b'y\xcc\xa6\xbb')
[121, 204, 166, 187]
A decently speedy method utilizing array.array I've been using for some time:
predefined variables:
offset = 0
size = 4
big = True # endian
arr = array('B')
arr.fromstring("\x00\x00\xff\x00") # 5 bytes (encoding issues) [0, 0, 195, 191, 0]
to int: (read)
val = 0
for v in arr[offset:offset+size][::pow(-1,not big)]: val = (val<<8)|v
from int: (write)
val = 16384
arr[offset:offset+size] = \
array('B',((val>>(i<<3))&255 for i in range(size)))[::pow(-1,not big)]
It's possible these could be faster though.
EDIT:
For some numbers, here's a performance test (Anaconda 2.3.0) showing stable averages on read in comparison to reduce():
========================= byte array to int.py =========================
5000 iterations; threshold of min + 5000ns:
code                                     | min        | max           | avg          | efficiency
val = 0 \nfor v in arr: val = (val<<8)|v | 5373.848ns | 850009.965ns  | ~8649.64ns   | 62.128%
val = reduce( shift, arr )               | 6489.921ns | 5094212.014ns | ~12040.269ns | 53.902%
This is a raw performance test, so the endian pow-flip is left out.
The shift function shown applies the same shift-oring operation as the for loop, and arr is just array.array('B',[0,0,255,0]) as it has the fastest iterative performance next to dict.
I should probably also note efficiency is measured by accuracy to the average time.
