Convert base-2 binary number string to int - python

I'd simply like to convert a base-2 binary number string into an int, something like this:
>>> '11111111'.fromBinaryToInt()
255
Is there a way to do this in Python?

You use the built-in int() function, and pass it the base of the input number, i.e. 2 for a binary number:
>>> int('11111111', 2)
255
Here is documentation for Python 2, and for Python 3.
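In case the input is not guaranteed to be clean, here is a minimal sketch of how int() with a base behaves on a few edge cases. The helper name parse_binary is hypothetical and only exists for this illustration.

```python
def parse_binary(text):
    """Return the integer value of a base-2 string, or None if it is not valid binary.

    Hypothetical helper, not part of the standard library.
    """
    try:
        # int() with base 2 accepts an optional sign, an optional '0b' prefix
        # and surrounding whitespace; anything else raises ValueError.
        return int(text, 2)
    except ValueError:
        return None


print(parse_binary('11111111'))
print(parse_binary('0b101'))
print(parse_binary('102'))
```

Returning None on bad input is just one possible design; letting the ValueError propagate is equally reasonable.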

Just type 0b11111111 in python interactive interface:
>>> 0b11111111
255

Another way to do this is by using the bitstring module:
>>> from bitstring import BitArray
>>> b = BitArray(bin='11111111')
>>> b.uint
255
Note that the unsigned integer (uint) is different from the signed integer (int):
>>> b.int
-1
Your question is really asking for the unsigned integer representation; this is an important distinction.
The bitstring module isn't a requirement, but it has lots of performant methods for turning input into and from bits into other forms, as well as manipulating them.

Using int with a base is the right way to go. I used to do this before I found that int also takes a base. It is basically a reduce applied to a list comprehension that implements the primitive way of converting binary to decimal (e.g. 110 = 2**0 * 0 + 2**1 * 1 + 2**2 * 1):
from functools import reduce  # reduce is not a builtin in Python 3
binstr = '110'  # example input
add = lambda x, y: x + y
reduce(add, [int(x) * 2 ** y for x, y in zip(list(binstr), range(len(binstr) - 1, -1, -1))])
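The same manual conversion can also be sketched with a running accumulator (Horner's scheme), which avoids computing the powers of two explicitly. This is an illustrative variant, not the code from the answer above, and bin_to_int is a hypothetical name.

```python
from functools import reduce


def bin_to_int(binstr):
    # Walk the digits left to right: double the running total, then add the digit.
    # No validation is done here, so digits other than 0 and 1 are not rejected.
    return reduce(lambda acc, digit: acc * 2 + int(digit), binstr, 0)


print(bin_to_int('110'))
```

For real code, int(binstr, 2) remains the better choice because it validates the input.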

If you wanna know what is happening behind the scene, then here you go.
class Binary():
    def __init__(self, binNumber):
        self._binNumber = binNumber
        self._binNumber = self._binNumber[::-1]
        self._binNumber = list(self._binNumber)
        self._x = [1]
        self._count = 1
        self._change = 2
        self._amount = 0
        print(self._ToNumber(self._binNumber))
    def _ToNumber(self, number):
        self._number = number
        for i in range (1, len (self._number)):
            self._total = self._count * self._change
            self._count = self._total
            self._x.append(self._count)
        self._deep = zip(self._number, self._x)
        for self._k, self._v in self._deep:
            if self._k == '1':
                self._amount += self._v
        return self._amount
mo = Binary('101111110')

Here's another concise way to do it not mentioned in any of the above answers:
>>> eval('0b' + '11111111')
255
Admittedly, it's probably not very fast, and it's a very very bad idea if the string is coming from something you don't have control over that could be malicious (such as user input), but for completeness' sake, it does work.
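If the literal-parsing idea is appealing but eval() is too risky, one possible middle ground is ast.literal_eval, which only accepts Python literals. This is a sketch under that assumption; parse_binary_literal is a hypothetical helper name, and int(s, 2) is still the simpler tool.

```python
import ast


def parse_binary_literal(digits):
    # literal_eval only evaluates literals, so arbitrary expressions are rejected
    # with ValueError or SyntaxError instead of being executed.
    value = ast.literal_eval('0b' + digits)
    if not isinstance(value, int):
        raise ValueError('not an integer literal: %r' % digits)
    return value


print(parse_binary_literal('11111111'))
```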

A recursive Python implementation:
def int2bin(n):
    # goes the other way: returns the bits of a non-negative int as a list, most significant bit first
    return int2bin(n >> 1) + [n & 1] if n > 1 else [n]

If you are using python3.6 or later you can use f-string to do the
conversion:
Binary to decimal:
>>> print(f'{0b1011010:#0}')
90
>>> bin_2_decimal = int(f'{0b1011010:#0}')
>>> bin_2_decimal
90
binary to octal hexa and etc.
>>> f'{0b1011010:#o}'
'0o132' # octal
>>> f'{0b1011010:#x}'
'0x5a' # hexadecimal
>>> f'{0b1011010:#0}'
'90' # decimal
Pay attention to the two pieces of information separated by the colon.
In this way, you can convert between any of {binary, octal, hexadecimal, decimal} by changing the right side of the colon [:]
:#b -> converts to binary
:#o -> converts to octal
:#x -> converts to hexadecimal
:#0 -> converts to decimal as above example
Try changing left side of colon to have octal/hexadecimal/decimal.

For large matrix (10**5 rows and up) it is better to use a vectorized matmult. Pass in all rows and cols in one shot. It is extremely fast. There is no looping in python here. I originally designed it for converting many binary columns like 0/1 for like 10 different genre columns in MovieLens into a single integer for each example row.
import numpy as np
def BitsToIntAFast(bits):
    m, n = bits.shape
    a = 2**np.arange(n)[::-1]  # -1 reverses array of powers of 2 of same length as bits
    return bits @ a  # matrix product: one integer per row

For the record to go back and forth in basic python3:
a = 10
bin(a)
# '0b1010'
int(bin(a), 2)
# 10
eval(bin(a))
# 10

Related

Zylabs 7.14 Lab: Reverse binary [duplicate]

Are there any canned Python methods to convert an Integer (or Long) into a binary string in Python?
There are a myriad of dec2bin() functions out on Google... But I was hoping I could use a built-in function / library.
Python's string format method can take a format spec.
>>> "{0:b}".format(37)
'100101'
Format spec docs for Python 2
Format spec docs for Python 3
If you're looking for bin() as an equivalent to hex(), it was added in python 2.6.
Example:
>>> bin(10)
'0b1010'
Python actually does have something already built in for this, the ability to do operations such as '{0:b}'.format(42), which will give you the bit pattern (in a string) for 42, or 101010.
For a more general philosophy, no language or library will give its user base everything that they desire. If you're working in an environment that doesn't provide exactly what you need, you should be collecting snippets of code as you develop to ensure you never have to write the same thing twice. Such as, for example, the pseudo-code:
define intToBinString, receiving intVal:
    if intVal is equal to zero:
        return "0"
    set strVal to ""
    while intVal is greater than zero:
        if intVal is odd:
            prefix "1" to strVal
        else:
            prefix "0" to strVal
        divide intVal by two, rounding down
    return strVal
which will construct your binary string based on the decimal value. Just keep in mind that's a generic bit of pseudo-code which may not be the most efficient way of doing it though, with the iterations you seem to be proposing, it won't make much difference. It's really just meant as a guideline on how it could be done.
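For reference, the pseudo-code above translates almost line for line into Python. This is a straightforward sketch for non-negative integers; the name int_to_bin_string is just a Python-style rendering of the pseudo-code's name.

```python
def int_to_bin_string(int_val):
    # Mirrors the pseudo-code: only meant for non-negative integers.
    if int_val == 0:
        return "0"
    str_val = ""
    while int_val > 0:
        # Prefix the lowest bit, then drop it with integer division.
        str_val = ("1" if int_val % 2 else "0") + str_val
        int_val //= 2
    return str_val


print(int_to_bin_string(42))
```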
The general idea is to use code from (in order of preference):
the language or built-in libraries.
third-party libraries with suitable licenses.
your own collection.
something new you need to write (and save in your own collection for later).
If you want a textual representation without the 0b-prefix, you could use this:
get_bin = lambda x: format(x, 'b')
print(get_bin(3))
>>> '11'
print(get_bin(-3))
>>> '-11'
When you want a n-bit representation:
get_bin = lambda x, n: format(x, 'b').zfill(n)
>>> get_bin(12, 32)
'00000000000000000000000000001100'
>>> get_bin(-12, 32)
'-00000000000000000000000000001100'
Alternatively, if you prefer having a function:
def get_bin(x, n=0):
    """
    Get the binary representation of x.
    Parameters
    ----------
    x : int
    n : int
        Minimum number of digits. If x needs less digits in binary, the rest
        is filled with zeros.
    Returns
    -------
    str
    """
    return format(x, 'b').zfill(n)
I am surprised there is no mention of a nice way to accomplish this using formatting strings that are supported in Python 3.6 and higher. TLDR:
>>> number = 1
>>> f'0b{number:08b}'
'0b00000001'
Longer story
This is functionality of formatting strings available from Python 3.6:
>>> x, y, z = 1, 2, 3
>>> f'{x} {y} {2*z}'
'1 2 6'
You can request binary as well:
>>> f'{z:b}'
'11'
Specify the width:
>>> f'{z:8b}'
'      11'
Request zero padding:
>>> f'{z:08b}'
'00000011'
And add common prefix to signify binary number:
>>> f'0b{z:08b}'
'0b00000011'
You can also let Python add the prefix for you but I do not like it so much as the version above because you have to take the prefix into width consideration:
>>> f'{z:#010b}'
'0b00000011'
More info is available in official documentation on Formatted string literals and Format Specification Mini-Language.
As a reference:
def toBinary(n):
    return ''.join(str(1 & int(n) >> i) for i in range(64)[::-1])
This function can convert a positive integer as large as 18446744073709551615, represented as string '1111111111111111111111111111111111111111111111111111111111111111'.
It can be modified to serve a much larger integer, though it may not be as handy as "{0:b}".format() or bin().
This is for python 3 and it keeps the leading zeros !
print(format(0, '08b'))
A simple way to do that is to use string format, see this page.
>>> "{0:b}".format(10)
'1010'
And if you want to have a fixed length of the binary string, you can use this:
>>> "{0:{fill}8b}".format(10, fill='0')
'00001010'
If two's complement is required, then the following line can be used:
'{0:{fill}{width}b}'.format((x + 2**n) % 2**n, fill='0', width=n)
where n is the width of the binary string.
one-liner with lambda:
>>> binary = lambda n: '' if n==0 else binary(n//2) + str(n%2)
test:
>>> binary(5)
'101'
EDIT:
but then :(
from time import time
t1 = time()
for i in range(1000000):
    binary(i)
t2 = time()
print(t2 - t1)
# 6.57236599922
compared to
t1 = time()
for i in range(1000000):
    '{0:b}'.format(i)
t2 = time()
print(t2 - t1)
# 0.68017411232
As the preceding answers mostly used format(),
here is an f-string implementation.
integer = 7
bit_count = 5
print(f'{integer:0{bit_count}b}')
Output:
00111
For convenience here is the python docs link for formatted string literals: https://docs.python.org/3/reference/lexical_analysis.html#f-strings.
Summary of alternatives:
n=42
assert "-101010" == format(-n, 'b')
assert "-101010" == "{0:b}".format(-n)
assert "-101010" == (lambda x: x >= 0 and str(bin(x))[2:] or "-" + str(bin(x))[3:])(-n)
assert "0b101010" == bin(n)
assert "101010" == bin(n)[2:] # But this won't work for negative numbers.
Contributors include John Fouhy, Tung Nguyen, mVChr, Martin Thoma, and Martijn Pieters.
>>> format(123, 'b')
'1111011'
For those of us who need to convert signed integers (range -2**(digits-1) to 2**(digits-1)-1) to 2's complement binary strings, this works:
def int2bin(integer, digits):
    if integer >= 0:
        return bin(integer)[2:].zfill(digits)
    else:
        return bin(2**digits + integer)[2:]
This produces:
>>> int2bin(10, 8)
'00001010'
>>> int2bin(-10, 8)
'11110110'
>>> int2bin(-128, 8)
'10000000'
>>> int2bin(127, 8)
'01111111'
You can do it like this:
bin(10)[2:]
or:
f = str(bin(10))
c = []
c.append("".join(str(int(ch)) for ch in f[2:]))  # join() needs strings, not ints
print(c)
Using numpy pack/unpackbits, they are your best friends.
Examples
--------
>>> a = np.array([[2], [7], [23]], dtype=np.uint8)
>>> a
array([[ 2],
       [ 7],
       [23]], dtype=uint8)
>>> b = np.unpackbits(a, axis=1)
>>> b
array([[0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1, 1, 1],
       [0, 0, 0, 1, 0, 1, 1, 1]], dtype=uint8)
Yet another solution with another algorithm, by using bitwise operators.
def int2bin(val):
    res = ''
    while val > 0:
        res += str(val & 1)
        val = val >> 1  # val=val/2
    return res[::-1]  # reverse the string
A faster version without reversing the string.
def int2bin(val):
    res = ''
    while val > 0:
        res = chr((val & 1) + 0x30) + res
        val = val >> 1
    return res
numpy.binary_repr(num, width=None)
Examples from the documentation link above:
>>> np.binary_repr(3)
'11'
>>> np.binary_repr(-3)
'-11'
>>> np.binary_repr(3, width=4)
'0011'
The two’s complement is returned when the input number is negative and width is specified:
>>> np.binary_repr(-3, width=3)
'101'
>>> np.binary_repr(-3, width=5)
'11101'
The accepted answer didn't address negative numbers, which I'll cover.
In addition to the answers above, you can also just use the bin and hex functions. And in the opposite direction, use binary notation:
>>> bin(37)
'0b100101'
>>> 0b100101
37
But with negative numbers, things get a bit more complicated. The question doesn't specify how you want to handle negative numbers.
Python just adds a negative sign so the result for -37 would be this:
>>> bin(-37)
'-0b100101'
In computer/hardware binary data, negative signs don't exist. All we have is 1's and 0's. So if you're reading or producing binary streams of data to be processed by other software/hardware, you need to first know the notation being used.
One notation is sign-magnitude notation, where the first bit represents the negative sign, and the rest is the actual value. In that case, -37 would be 0b1100101 and 37 would be 0b0100101. This looks like what python produces, but just add a 0 or 1 in front for positive / negative numbers.
More common is Two's complement notation, which seems more complicated and the result is very different from python's string formatting. You can read the details in the link, but with an 8bit signed integer -37 would be 0b11011011 and 37 would be 0b00100101.
Python has no easy way to produce these binary representations. You can use numpy to turn Two's complement binary values into python integers:
>>> import numpy as np
>>> np.int8(0b11011011)
-37
>>> np.uint8(0b11011011)
219
>>> np.uint8(0b00100101)
37
>>> np.int8(0b00100101)
37
But I don't know an easy way to do the opposite with builtin functions. The bitstring package can help though.
>>> from bitstring import BitArray
>>> arr = BitArray(int=-37, length=8)
>>> arr.uint
219
>>> arr.int
-37
>>> arr.bin
'11011011'
>>> BitArray(bin='11011011').int
-37
>>> BitArray(bin='11011011').uint
219
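For what it's worth, a fixed-width two's complement string can also be produced with builtins only, by masking the value before formatting. The following is a minimal sketch under that assumption; both helper names are hypothetical.

```python
def to_twos_complement(value, bits):
    # Reject values that do not fit into a signed integer of the given width.
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError('%d does not fit in %d bits' % (value, bits))
    # Masking maps negative values onto their unsigned two's complement pattern.
    return format(value & ((1 << bits) - 1), '0{}b'.format(bits))


def from_twos_complement(text):
    # The width is taken from the length of the string.
    value = int(text, 2)
    if text[0] == '1':  # sign bit set: subtract 2**width
        value -= 1 << len(text)
    return value


print(to_twos_complement(-37, 8))
print(from_twos_complement('11011011'))
```

The range check is a design choice made for this sketch; without it, out-of-range values would silently wrap around.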
Python 3.6 added a new string formatting approach called formatted string literals or “f-strings”.
Example:
name = 'Bob'
number = 42
f"Hello, {name}, your number is {number:>08b}!"
Output will be 'Hello, Bob, your number is 00101010!'
A discussion of this question can be found here - Here
Unless I'm misunderstanding what you mean by binary string I think the module you are looking for is struct
n = int(input())  # input() returns a string in Python 3
print(bin(n).replace("0b", ""))
def binary(decimal):
    otherBase = ""
    while decimal != 0:
        otherBase = str(decimal % 2) + otherBase
        decimal //= 2
    return otherBase
print(binary(10))
output:
1010
Here is the code I've just implemented. This is not a method but you can use it as a ready-to-use function!
def inttobinary(number):
    if number == 0:
        return str(0)
    result = ""
    while number != 0:
        remainder = number % 2
        number = number // 2  # integer division; / would produce a float in Python 3
        result += str(remainder)
    return result[::-1]  # to invert the string
Calculator with all neccessary functions for DEC,BIN,HEX:
(made and tested with Python 3.5)
You can change the input test numbers and get the converted ones.
# CONVERTER: DEC / BIN / HEX
def dec2bin(d):
    # dec -> bin
    b = bin(d)
    return b
def dec2hex(d):
    # dec -> hex
    h = hex(d)
    return h
def bin2dec(b):
    # bin -> dec
    bin_numb = "{0:b}".format(b)
    d = int(bin_numb, 2)  # eval(bin_numb) would read the digits as a decimal number
    return d, bin_numb
def bin2hex(b):
    # bin -> hex
    h = hex(b)
    return h
def hex2dec(h):
    # hex -> dec
    d = int(h)
    return d
def hex2bin(h):
    # hex -> bin
    b = bin(h)
    return b
## TESTING NUMBERS
numb_dec = 99
numb_bin = 0b0111
numb_hex = 0xFF
## CALCULATIONS
res_dec2bin = dec2bin(numb_dec)
res_dec2hex = dec2hex(numb_dec)
res_bin2dec,bin_numb = bin2dec(numb_bin)
res_bin2hex = bin2hex(numb_bin)
res_hex2dec = hex2dec(numb_hex)
res_hex2bin = hex2bin(numb_hex)
## PRINTING
print('------- DECIMAL to BIN / HEX -------\n')
print('decimal:',numb_dec,'\nbin: ',res_dec2bin,'\nhex: ',res_dec2hex,'\n')
print('------- BINARY to DEC / HEX -------\n')
print('binary: ',bin_numb,'\ndec: ',numb_bin,'\nhex: ',res_bin2hex,'\n')
print('----- HEXADECIMAL to BIN / DEC -----\n')
print('hexadec:',hex(numb_hex),'\nbin: ',res_hex2bin,'\ndec: ',res_hex2dec,'\n')
Somewhat similar solution
def to_bin(dec):
    flag = True
    bin_str = ''
    while flag:
        remainder = dec % 2
        quotient = dec // 2  # integer division; / would produce a float in Python 3
        if quotient == 0:
            flag = False
        bin_str += str(remainder)
        dec = quotient
    bin_str = bin_str[::-1]  # reverse the string
    return bin_str
Here is a simple solution using the divmod() function, which returns the quotient and the remainder of an integer division.
def dectobin(number):
    bits = ''  # avoids shadowing the builtin bin()
    while number >= 1:
        number, rem = divmod(number, 2)
        bits = str(rem) + bits  # prepend, so the most significant bit comes first
    return bits
Here's yet another way using regular math, no loops, only recursion. (Trivial case 0 returns nothing).
def toBin(num):
    if num == 0:
        return ""
    return toBin(num//2) + str(num%2)
print ([(toBin(i)) for i in range(10)])
['', '1', '10', '11', '100', '101', '110', '111', '1000', '1001']
To calculate the binary form of a number:
print("Binary is {0:>08b}".format(16))
To calculate the hexadecimal form of a number:
print("Hexa Decimal is {0:>0x}".format(15))
To calculate the binary form of every number up to 16:
for i in range(17):
    print("{0:>2}: binary is {0:>08b}".format(i))
To calculate the hexadecimal form of every number up to 16:
for i in range(17):
    print("{0:>2}: Hexa Decimal is {0:>0x}".format(i))
## 2 digits are enough for the hexadecimal representation of these numbers
try:
    while True:
        p = ""
        a = int(input())  # input() returns a string in Python 3
        while a != 0:
            l = a % 2
            b = a - l
            a = b // 2  # integer division keeps a an int
            p = str(l) + p
        print(p)
except:
    print("write 1 number")
I found a method using matrix operation to convert decimal to binary.
import numpy as np
E_mat = np.tile(E,[1,M])
M_order = pow(2,(M-1-np.array(range(M)))).T
bindata = np.remainder(np.floor(E_mat /M_order).astype(int),2)  # np.int was removed in NumPy 1.24; the builtin int works
E is the input decimal data and M is the number of binary digits. bindata is the output binary data, in the format of a 1 by M binary matrix.

Format decimal without trailing zeros [duplicate]

I have a long list of Decimals and that I have to adjust by factors of 10, 100, 1000,..... 1000000 depending on certain conditions. When I multiply them there is sometimes a useless trailing zero (though not always) that I want to get rid of. For example...
from decimal import Decimal
# outputs 25.0, PROBLEM! I would like it to output 25
print Decimal('2.5') * 10
# outputs 2567.8000, PROBLEM! I would like it to output 2567.8
print Decimal('2.5678') * 1000
Is there a function that tells the decimal object to drop these insignificant zeros? The only way I can think of doing this is to convert to a string and replace them using regular expressions.
Should probably mention that I am using python 2.6.5
EDIT
senderle's fine answer made me realize that I occasionally get a number like 250.0 which when normalized produces 2.5E+2. I guess in these cases I could try to sort them out and convert to an int
You can use the normalize method to remove extra precision.
>>> print decimal.Decimal('5.500')
5.500
>>> print decimal.Decimal('5.500').normalize()
5.5
To avoid stripping zeros to the left of the decimal point, you could do this:
def normalize_fraction(d):
    normalized = d.normalize()
    sign, digits, exponent = normalized.as_tuple()
    if exponent > 0:
        return decimal.Decimal((sign, digits + (0,) * exponent, 0))
    else:
        return normalized
Or more compactly, using quantize as suggested by user7116:
def normalize_fraction(d):
    normalized = d.normalize()
    sign, digit, exponent = normalized.as_tuple()
    return normalized if exponent <= 0 else normalized.quantize(1)
You could also use to_integral() as shown here but I think using as_tuple this way is more self-documenting.
I tested these both against a few cases; please leave a comment if you find something that doesn't work.
>>> normalize_fraction(decimal.Decimal('55.5'))
Decimal('55.5')
>>> normalize_fraction(decimal.Decimal('55.500'))
Decimal('55.5')
>>> normalize_fraction(decimal.Decimal('55500'))
Decimal('55500')
>>> normalize_fraction(decimal.Decimal('555E2'))
Decimal('55500')
There's probably a better way of doing this, but you could use .rstrip('0').rstrip('.') to achieve the result that you want.
Using your numbers as an example:
>>> s = str(Decimal('2.5') * 10)
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
25
>>> s = str(Decimal('2.5678') * 1000)
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
2567.8
And here's the fix for the problem that @gerrit pointed out in the comments:
>>> s = str(Decimal('1500'))
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
1500
Answer from the Decimal FAQ in the documentation:
>>> def remove_exponent(d):
...     return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
>>> remove_exponent(Decimal('5.00'))
Decimal('5')
>>> remove_exponent(Decimal('5.500'))
Decimal('5.5')
>>> remove_exponent(Decimal('5E+3'))
Decimal('5000')
Answer is mentioned in FAQ (https://docs.python.org/2/library/decimal.html#decimal-faq) but does not explain things.
To drop trailing zeros for fraction part you should use normalize:
>>> Decimal('100.2000').normalize()
Decimal('100.2')
>>> Decimal('0.2000').normalize()
Decimal('0.2')
But this works differently for numbers whose integer part ends in zeros and whose fractional part is all zeros:
>>> Decimal('100.0000').normalize()
Decimal('1E+2')
In this case we should use `to_integral`:
>>> Decimal('100.000').to_integral()
Decimal('100')
So we could check if there's a fraction part:
>>> Decimal('100.2000') == Decimal('100.2000').to_integral()
False
>>> Decimal('100.0000') == Decimal('100.0000').to_integral()
True
And use appropriate method then:
def remove_exponent(num):
    return num.to_integral() if num == num.to_integral() else num.normalize()
Try it:
>>> remove_exponent(Decimal('100.2000'))
Decimal('100.2')
>>> remove_exponent(Decimal('100.0000'))
Decimal('100')
>>> remove_exponent(Decimal('0.2000'))
Decimal('0.2')
Now we're done.
Use the format specifier %g. It seems to remove trailing zeros.
>>> "%g" % (Decimal('2.5') * 10)
'25'
>>> "%g" % (Decimal('2.5678') * 1000)
'2567.8'
It also works without the Decimal function
>>> "%g" % (2.5 * 10)
'25'
>>> "%g" % (2.5678 * 1000)
'2567.8'
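One caveat worth checking before relying on %g: it converts the value to a float and, by default, keeps only six significant digits, so larger values switch to exponent notation. The sketch below shows that effect next to a string-based alternative; strip_trailing_zeros is a hypothetical helper name.

```python
from decimal import Decimal


def strip_trailing_zeros(d):
    # Format in fixed-point notation first, so no exponent can appear,
    # then strip zeros only when there is a fractional part.
    text = format(d, 'f')
    if '.' in text:
        text = text.rstrip('0').rstrip('.')
    return text


print('%g' % Decimal('1234567.8'))                    # six significant digits only
print(strip_trailing_zeros(Decimal('1234567.8000')))  # keeps every digit
```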
I ended up doing this:
import decimal
def dropzeros(number):
    mynum = decimal.Decimal(number).normalize()
    # e.g 22000 --> Decimal('2.2E+4')
    return mynum.__trunc__() if not mynum % 1 else float(mynum)
print dropzeros(22000.000)
22000
print dropzeros(2567.8000)
2567.8
note: casting the return value as a string will limit you to 12 significant digits
Slightly modified version of A-IV's answer
NOTE that Decimal('0.99999999999999999999999999995').normalize() will round to Decimal('1')
def trailing(s: str, char="0"):
    return len(s) - len(s.rstrip(char))
def decimal_to_str(value: decimal.Decimal):
    """Convert decimal to str
    * Uses exponential notation when there are more than 4 trailing zeros
    * Handles decimal.InvalidOperation
    """
    # to_integral_value() removes decimals
    if value == value.to_integral_value():
        try:
            value = value.quantize(decimal.Decimal(1))
        except decimal.InvalidOperation:
            pass
        uncast = str(value)
        # use exponential notation if there are more than 4 zeros
        return str(value.normalize()) if trailing(uncast) > 4 else uncast
    else:
        # normalize values with decimal places
        return str(value.normalize())
        # or str(value).rstrip('0') if rounding edgecases are a concern
You could use :g to achieve this:
'{:g}'.format(3.140)
gives
'3.14'
This should work:
'{:f}'.format(decimal.Decimal('2.5') * 10).rstrip('0').rstrip('.')
Just to show a different possibility, I used as_tuple() to achieve the same result.
def my_normalize(dec):
    """
    >>> my_normalize(Decimal("12.500"))
    Decimal('12.5')
    >>> my_normalize(Decimal("-0.12500"))
    Decimal('-0.125')
    >>> my_normalize(Decimal("0.125"))
    Decimal('0.125')
    >>> my_normalize(Decimal("0.00125"))
    Decimal('0.00125')
    >>> my_normalize(Decimal("125.00"))
    Decimal('125')
    >>> my_normalize(Decimal("12500"))
    Decimal('12500')
    >>> my_normalize(Decimal("0.000"))
    Decimal('0')
    """
    if dec is None:
        return None
    sign, digs, exp = dec.as_tuple()
    for i in list(reversed(digs)):
        if exp >= 0 or i != 0:
            break
        exp += 1
        digs = digs[:-1]
    if not digs and exp < 0:
        exp = 0
    return Decimal((sign, digs, exp))
Why not use modulo 10 on a multiple of 10 to check whether there is a remainder? No remainder means you can force int()
if (x * 10) % 10 == 0:
    x = int(x)
x = 2/1
Output: 2
x = 3/2
Output: 1.5

Convert bytes to bits in python

I am working with Python3.2. I need to take a hex stream as an input and parse it at bit-level. So I used
bytes.fromhex(input_str)
to convert the string to actual bytes. Now how do I convert these bytes to bits?
Another way to do this is by using the bitstring module:
>>> from bitstring import BitArray
>>> input_str = '0xff'
>>> c = BitArray(hex=input_str)
>>> c.bin
'0b11111111'
And if you need to strip the leading 0b:
>>> c.bin[2:]
'11111111'
The bitstring module isn't a requirement, as jcollado's answer shows, but it has lots of performant methods for turning input into bits and manipulating them. You might find this handy (or not), for example:
>>> c.uint
255
>>> c.invert()
>>> c.bin[2:]
'00000000'
etc.
What about something like this?
>>> bin(int('ff', base=16))
'0b11111111'
This will convert the hexadecimal string you have to an integer and that integer to a string in which each byte is set to 0/1 depending on the bit-value of the integer.
As pointed out by a comment, if you need to get rid of the 0b prefix, you can do it this way:
>>> bin(int('ff', base=16))[2:]
'11111111'
... or, if you are using Python 3.9 or newer:
>>> bin(int('ff', base=16)).removeprefix('0b')
'11111111'
Note: using lstrip("0b") here will lead to 0 integer being converted to an empty string. This is almost always not what you want to do.
Operations are much faster when you work at the integer level. In particular, converting to a string as suggested here is really slow.
If you want bit 7 and 8 only, use e.g.
val = (byte >> 6) & 3
(this is: shift the byte 6 bits to the right - dropping them. Then keep only the last two bits 3 is the number with the first two bits set...)
These can easily be translated into simple CPU operations that are super fast.
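To make the shift-and-mask idea concrete for a whole byte string, here is a minimal sketch that pulls an arbitrary bit field out of bytes without going through a string. The function name extract_bits and its parameters are assumptions made for this illustration.

```python
def extract_bits(data, start, count):
    """Return `count` bits of `data`, starting `start` bits from the most significant bit."""
    value = int.from_bytes(data, 'big')
    total_bits = len(data) * 8
    # Shift the wanted field down to the low end, then mask off everything above it.
    shift = total_bits - start - count
    return (value >> shift) & ((1 << count) - 1)


data = bytes.fromhex('c5')          # 0b11000101
print(extract_bits(data, 0, 2))     # same as (byte >> 6) & 3
print(extract_bits(data, 5, 3))     # the lowest three bits
```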
using python format string syntax
>>> mybyte = bytes.fromhex("0F") # create my byte using a hex string
>>> binary_string = "{:08b}".format(int(mybyte.hex(),16))
>>> print(binary_string)
00001111
The second line is where the magic happens. All byte objects have a .hex() function, which returns a hex string. Using this hex string, we convert it to an integer, telling the int() function that it's a base 16 string (because hex is base 16). Then we apply formatting to that integer so it displays as a binary string. The {:08b} is where the real magic happens. It is using the Format Specification Mini-Language format_spec. Specifically it's using the width and the type parts of the format_spec syntax. The 8 sets width to 8, which is how we get the nice 0000 padding, and the b sets the type to binary.
I prefer this method over the bin() method because using a format string gives a lot more flexibility.
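One thing to watch with a fixed width of 8 is input that is longer than one byte: going through a single integer drops leading zero bytes. A possible way around that, sketched here with a hypothetical helper name, is to format each byte separately.

```python
def bytes_to_bits(data):
    # Iterating over a bytes object yields ints, so each byte is padded on its own.
    return ''.join(format(byte, '08b') for byte in data)


two_bytes = bytes.fromhex('000F')
print(bytes_to_bits(two_bytes))                       # 16 digits, one group of 8 per byte
print('{:08b}'.format(int(two_bytes.hex(), 16)))      # only 8 digits: the zero byte is lost
```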
I think simplest would be use numpy here. For example you can read a file as bytes and then expand it to bits easily like this:
Bytes = numpy.fromfile(filename, dtype = "uint8")
Bits = numpy.unpackbits(Bytes)
input_str = "ABC"
[bin(byte) for byte in bytes(input_str, "utf-8")]
Will give:
['0b1000001', '0b1000010', '0b1000011']
Here is how to do it using format():
print("bin_signedDate : ", ''.join(format(x, '08b') for x in bytevector))
The 08b is important. It means each value is padded with leading zeros to a width of 8 digits, i.e. a full byte. If you don't specify this, each converted byte will have a variable bit length.
To binary:
bin(byte)[2:].zfill(8)
Use ord when reading bytes:
byte_binary = bin(ord(f.read(1))) # Add [2:] to remove the "0b" prefix
Or
Using str.format():
'{:08b}'.format(ord(f.read(1)))
The other answers here provide the bits in big-endian order ('\x01' becomes '00000001')
In case you're interested in little-endian order of bits, which is useful in many cases, like common representations of bignums etc -
here's a snippet for that:
def bits_little_endian_from_bytes(s):
    return ''.join(bin(ord(x))[2:].rjust(8,'0')[::-1] for x in s)
And for the other direction:
def bytes_from_bits_little_endian(s):
    return ''.join(chr(int(s[i:i+8][::-1], 2)) for i in range(0, len(s), 8))
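The two functions above rely on ord() and chr() over a str, which matches Python 2 byte strings. Assuming Python 3 bytes objects instead, a sketch of the same pair could look like this (the names are reused here only for illustration):

```python
def bits_little_endian_from_bytes(data):
    # Each byte becomes 8 digits, reversed so the least significant bit comes first.
    return ''.join(format(byte, '08b')[::-1] for byte in data)


def bytes_from_bits_little_endian(bits):
    # Take 8 digits at a time, undo the reversal and rebuild the byte value.
    return bytes(int(bits[i:i + 8][::-1], 2) for i in range(0, len(bits), 8))


print(bits_little_endian_from_bytes(b'\x01'))
print(bytes_from_bits_little_endian('10000000'))
```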
A one-line function to convert bytes (not a string) to a list of bits. There is no endianness issue when the data goes from one byte reader/writer to another byte reader/writer, only when the source and target are bit readers and bit writers.
def byte2bin(b):
    return [int(X) for X in "".join(["{:0>8}".format(bin(X)[2:]) for X in b])]
I came across this answer when looking for a way to convert an integer into a list of bit positions where the bitstring is equal to one. This becomes very similar to this question if you first convert your hex string to an integer like int('0x453', 16).
Now, given an integer - a representation already well-encoded in the hardware, I was very surprised to find out that the string variants of the above solutions using things like bin turn out to be faster than numpy based solutions for a single number, and I thought I'd quickly write up the results.
I wrote three variants of the function. First using numpy:
import math
import numpy as np
def bit_positions_numpy(val):
    """
    Given an integer value, return the positions of the on bits.
    """
    bit_length = val.bit_length() + 1
    length = math.ceil(bit_length / 8.0)  # bytelength
    bytestr = val.to_bytes(length, byteorder='big', signed=True)
    arr = np.frombuffer(bytestr, dtype=np.uint8, count=length)
    bit_arr = np.unpackbits(arr, bitorder='big')
    bit_positions = np.where(bit_arr[::-1])[0].tolist()
    return bit_positions
Then using string logic:
def bit_positions_str(val):
    is_negative = val < 0
    if is_negative:
        bit_length = val.bit_length() + 1
        length = math.ceil(bit_length / 8.0)  # bytelength
        neg_position = (length * 8) - 1
        # special logic for negatives to get two's complement repr
        max_val = 1 << neg_position
        val_ = max_val + val
    else:
        val_ = val
    binary_string = '{:b}'.format(val_)[::-1]
    bit_positions = [pos for pos, char in enumerate(binary_string)
                     if char == '1']
    if is_negative:
        bit_positions.append(neg_position)
    return bit_positions
And finally, I added a third method where I precomputed a lookuptable of the positions for a single byte and expanded that given larger itemsizes.
BYTE_TO_POSITIONS = []
pos_masks = [(s, (1 << s)) for s in range(0, 8)]
for i in range(0, 256):
    positions = [pos for pos, mask in pos_masks if (mask & i)]
    BYTE_TO_POSITIONS.append(positions)
def bit_positions_lut(val):
    bit_length = val.bit_length() + 1
    length = math.ceil(bit_length / 8.0)  # bytelength
    bytestr = val.to_bytes(length, byteorder='big', signed=True)
    bit_positions = []
    for offset, b in enumerate(bytestr[::-1]):
        pos = BYTE_TO_POSITIONS[b]
        if offset == 0:
            bit_positions.extend(pos)
        else:
            pos_offset = (8 * offset)
            bit_positions.extend([p + pos_offset for p in pos])
    return bit_positions
The benchmark code is as follows:
def benchmark_bit_conversions():
    # for val in [-0, -1, -3, -4, -9999]:
    test_values = [
        # -1, -2, -3, -4, -8, -32, -290, -9999,
        # 0, 1, 2, 3, 4, 8, 32, 290, 9999,
        4324, 1028, 1024, 3000, -100000,
        999999999999,
        -999999999999,
        2 ** 32,
        2 ** 64,
        2 ** 128,
        2 ** 128,
    ]
    for val in test_values:
        r1 = bit_positions_str(val)
        r2 = bit_positions_numpy(val)
        r3 = bit_positions_lut(val)
        print(f'val={val}')
        print(f'r1={r1}')
        print(f'r2={r2}')
        print(f'r3={r3}')
        print('---')
        assert r1 == r2
    import xdev
    xdev.profile_now(bit_positions_numpy)(val)
    xdev.profile_now(bit_positions_str)(val)
    xdev.profile_now(bit_positions_lut)(val)
    import timerit
    ti = timerit.Timerit(10000, bestof=10, verbose=2)
    for timer in ti.reset('str'):
        for val in test_values:
            bit_positions_str(val)
    for timer in ti.reset('numpy'):
        for val in test_values:
            bit_positions_numpy(val)
    for timer in ti.reset('lut'):
        for val in test_values:
            bit_positions_lut(val)
    for timer in ti.reset('raw_bin'):
        for val in test_values:
            bin(val)
    for timer in ti.reset('raw_bytes'):
        for val in test_values:
            val.to_bytes(val.bit_length(), 'big', signed=True)
And it clearly shows the str and lookup table implementations are ahead of numpy. I tested this on CPython 3.10 and 3.11.
Timed str for: 10000 loops, best of 10
time per loop: best=20.488 µs, mean=21.438 ± 0.4 µs
Timed numpy for: 10000 loops, best of 10
time per loop: best=25.754 µs, mean=28.509 ± 5.2 µs
Timed lut for: 10000 loops, best of 10
time per loop: best=19.420 µs, mean=21.305 ± 3.8 µs

Convert hex to binary

I have ABC123EFFF.
I want to have 001010101111000001001000111110111111111111 (i.e. binary repr. with, say, 42 digits and leading zeroes).
How?
For solving the leading-zero problem:
my_hexdata = "1a"
scale = 16 ## equals to hexadecimal
num_of_bits = 8
bin(int(my_hexdata, scale))[2:].zfill(num_of_bits)
It will give 00011010 instead of the trimmed version.
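If the width should simply follow the input, one option is to derive it from the number of hex digits, since each hex digit stands for exactly four bits. A minimal sketch, assuming the string carries no 0x prefix; hex_to_bits is a hypothetical helper name.

```python
def hex_to_bits(hex_string):
    # Four bits per hex digit, so leading zero digits are preserved.
    width = len(hex_string) * 4
    return format(int(hex_string, 16), '0{}b'.format(width))


print(hex_to_bits('1a'))
print(len(hex_to_bits('ABC123EFFF')))
```

For a width that is not a multiple of four, such as the 42 digits asked for in the question, the explicit zfill(num_of_bits) shown above is still needed.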
import binascii
binary_string = binascii.unhexlify(hex_string)
Read
binascii.unhexlify
Return the binary data represented by the hexadecimal string specified as the parameter.
Convert hex to binary
I have ABC123EFFF.
I want to have 001010101111000001001000111110111111111111 (i.e. binary
repr. with, say, 42 digits and leading zeroes).
Short answer:
The new f-strings in Python 3.6 allow you to do this using very terse syntax:
>>> f'{0xABC123EFFF:0>42b}'
'001010101111000001001000111110111111111111'
or to break that up with the semantics:
>>> number, pad, rjust, size, kind = 0xABC123EFFF, '0', '>', 42, 'b'
>>> f'{number:{pad}{rjust}{size}{kind}}'
'001010101111000001001000111110111111111111'
Long answer:
What you are actually saying is that you have a value in a hexadecimal representation, and you want to represent an equivalent value in binary.
The value of equivalence is an integer. But you may begin with a string, and to view in binary, you must end with a string.
Convert hex to binary, 42 digits and leading zeros?
We have several direct ways to accomplish this goal, without hacks using slices.
First, before we can do any binary manipulation at all, convert to int (I presume this is in a string format, not as a literal):
>>> integer = int('ABC123EFFF', 16)
>>> integer
737679765503
alternatively we could use an integer literal as expressed in hexadecimal form:
>>> integer = 0xABC123EFFF
>>> integer
737679765503
Now we need to express our integer in a binary representation.
Use the builtin function, format
Then pass to format:
>>> format(integer, '0>42b')
'001010101111000001001000111110111111111111'
This uses the formatting specification's mini-language.
To break that down, here's the grammar form of it:
[[fill]align][sign][#][0][width][,][.precision][type]
To make that into a specification for our needs, we just exclude the things we don't need:
>>> spec = '{fill}{align}{width}{type}'.format(fill='0', align='>', width=42, type='b')
>>> spec
'0>42b'
and just pass that to format
>>> bin_representation = format(integer, spec)
>>> bin_representation
'001010101111000001001000111110111111111111'
>>> print(bin_representation)
001010101111000001001000111110111111111111
String Formatting (Templating) with str.format
We can use that in a string using str.format method:
>>> 'here is the binary form: {0:{spec}}'.format(integer, spec=spec)
'here is the binary form: 001010101111000001001000111110111111111111'
Or just put the spec directly in the original string:
>>> 'here is the binary form: {0:0>42b}'.format(integer)
'here is the binary form: 001010101111000001001000111110111111111111'
String Formatting with the new f-strings
Let's demonstrate the new f-strings. They use the same mini-language formatting rules:
>>> integer = 0xABC123EFFF
>>> length = 42
>>> f'{integer:0>{length}b}'
'001010101111000001001000111110111111111111'
Now let's put this functionality into a function to encourage reusability:
def bin_format(integer, length):
return f'{integer:0>{length}b}'
And now:
>>> bin_format(0xABC123EFFF, 42)
'001010101111000001001000111110111111111111'
Aside
If you actually just wanted to encode the data as a string of bytes in memory or on disk, you can use the int.to_bytes method, which is only available in Python 3:
>>> help(int.to_bytes)
to_bytes(...)
int.to_bytes(length, byteorder, *, signed=False) -> bytes
...
And since 42 bits need 6 bytes (42 / 8 = 5.25, rounded up):
>>> integer.to_bytes(6, 'big')
b'\x00\xab\xc1#\xef\xff'
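The byte count can also be computed rather than worked out by hand. A small sketch, assuming Python 3 and a non-negative value (the helper name int_to_min_bytes is hypothetical), that rounds the bit width up to whole bytes and checks the round trip with int.from_bytes:

```python
def int_to_min_bytes(value, bit_width):
    """Encode a non-negative int big-endian in the fewest bytes holding bit_width bits."""
    n_bytes = (bit_width + 7) // 8   # round up to whole bytes
    return value.to_bytes(n_bytes, 'big')

encoded = int_to_min_bytes(0xABC123EFFF, 42)
decoded = int.from_bytes(encoded, 'big')
print(encoded, decoded == 0xABC123EFFF)
```

to_bytes raises OverflowError if the value does not fit in the requested number of bytes, so an undersized width fails loudly rather than truncating.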
bin(int("abc123efff", 16))[2:]
>>> bin( 0xABC123EFFF )
'0b1010101111000001001000111110111111111111'
Use Built-in format() function and int() function
It's simple and easy to understand. It's a slightly simplified version of Aaron's answer.
int()
int(string, base)
format()
format(integer, # of bits)
Example
# w/o 0b prefix
>>> format(int("ABC123EFFF", 16), "040b")
'1010101111000001001000111110111111111111'
# with 0b prefix
>>> format(int("ABC123EFFF", 16), "#042b")
'0b1010101111000001001000111110111111111111'
# w/o 0b prefix + 64bit
>>> format(int("ABC123EFFF", 16), "064b")
'0000000000000000000000001010101111000001001000111110111111111111'
See also this answer
"{0:020b}".format(int('ABC123EFFF', 16))
Here's a fairly raw way to do it using bit fiddling to generate the binary strings.
The key bit to understand is:
(n & (1 << i)) and 1
Which will generate either a 0 or 1 if the i'th bit of n is set.
import binascii
def byte_to_binary(n):
return ''.join(str((n & (1 << i)) and 1) for i in reversed(range(8)))
def hex_to_binary(h):
return ''.join(byte_to_binary(ord(b)) for b in binascii.unhexlify(h))
print hex_to_binary('abc123efff')
>>> 1010101111000001001000111110111111111111
Edit: using the "new" ternary operator this:
(n & (1 << i)) and 1
Would become:
1 if n & (1 << i) else 0
(Which TBH I'm not sure how readable that is)
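The snippet in this answer is written for Python 2 (print statement, ord() on each character of a str). A possible Python 3 adaptation, offered as a sketch rather than the author's code, could look like this; it assumes the hex string has an even number of digits, as bytes.fromhex requires:

```python
def byte_to_binary(n):
    # Test each of the 8 bit positions, most significant first.
    return ''.join('1' if n & (1 << i) else '0' for i in reversed(range(8)))

def hex_to_binary(h):
    # Iterating over a bytes object in Python 3 already yields ints, so no ord().
    return ''.join(byte_to_binary(b) for b in bytes.fromhex(h))

print(hex_to_binary('abc123efff'))
```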
This is a slight touch up to Glen Maynard's solution, which I think is the right way to do it. It just adds the padding element.
def hextobin(self, hexval):
    '''
    Takes a string representation of hex data with
    arbitrary length and converts to string representation
    of binary. Includes padding 0s
    '''
    thelen = len(hexval)*4
    binval = bin(int(hexval, 16))[2:]
    while ((len(binval)) < thelen):
        binval = '0' + binval
    return binval
Pulled it out of a class. Just take out self, if you're working in a stand-alone script.
I added the calculation for the number of bits to fill to Onedinkenedi's solution. Here is the resulting function:
def hextobin(h):
return bin(int(h, 16))[2:].zfill(len(h) * 4)
Where 16 is the base you're converting from (hexadecimal), and 4 is how many bits you need to represent each digit, or log base 2 of the scale.
Replace each hex digit with the corresponding 4 binary digits:
1 - 0001
2 - 0010
...
a - 1010
b - 1011
...
f - 1111
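The digit-by-digit substitution can be sketched with a table that is generated rather than typed out. This assumes Python 3; the names HEX_TO_BITS and hex_to_binary_lut are illustrative only:

```python
# Build the 16-entry table: '0' -> '0000', ..., 'f' -> '1111'.
HEX_TO_BITS = {format(i, 'x'): format(i, '04b') for i in range(16)}

def hex_to_binary_lut(hex_string):
    # lower() lets the same table serve both 'ABC' and 'abc'.
    return ''.join(HEX_TO_BITS[digit] for digit in hex_string.lower())

print(hex_to_binary_lut('ABC123EFFF'))
```

Because every digit maps to exactly four characters, leading zeros are kept automatically; a non-hex character raises KeyError.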
hex --> decimal then decimal --> binary
#decimal to binary
def d2b(n):
    bStr = ''
    if n < 0: raise ValueError("must be a non-negative integer")
    if n == 0: return '0'
    while n > 0:
        bStr = str(n % 2) + bStr
        n = n >> 1
    return bStr
#hex to binary
def h2b(hex):
    return d2b(int(hex,16))
# Python Program - Convert Hexadecimal to Binary
hexdec = input("Enter Hexadecimal string: ")
print(hexdec, " in Binary = ", end="")  # end defaults to "\n"; "" keeps the output on one line
for _hex in hexdec:
    dec = int(_hex, 16)  # 16 means base 16, which is hexadecimal
    print(bin(dec)[2:].rjust(4, "0"), end="")  # [2:] skips the 0b prefix; rjust pads each digit to 4 bits
Just use the module coden (note: I am the author of the module)
You can convert hexadecimal to binary there.
Install using pip
pip install coden
Convert
a_hexadecimal_number = "f1ff"
binary_output = coden.hex_to_bin(a_hexadecimal_number)
The conversion keywords are:
hex for hexadecimal
bin for binary
int for decimal
_to_ - the converting keyword for the function
So you can also write, e.g.:
hexadecimal_output = bin_to_hex(a_binary_number)
Another way:
def hextobinary(hex_string):
    s = int(hex_string, 16)
    # bit_length() is exact; ceil(log2(s)) is one digit short for powers of two
    num_digits = max(s.bit_length(), 1)
    digit_lst = ['0'] * num_digits
    idx = num_digits
    while s > 0:
        idx -= 1
        if s % 2 == 1: digit_lst[idx] = '1'
        s = s // 2  # integer division, also under Python 3
    return ''.join(digit_lst)
print(hextobinary('abc123efff'))
The binary version of ABC123EFFF is actually 1010101111000001001000111110111111111111
For almost all applications you want the binary version to have a length that is a multiple of 4 with leading padding of 0s.
To get this in Python:
def hex_to_binary( hex_code ):
bin_code = bin( hex_code )[2:]
padding = (4-len(bin_code)%4)%4
return '0'*padding + bin_code
Example 1:
>>> hex_to_binary( 0xABC123EFFF )
'1010101111000001001000111110111111111111'
Example 2:
>>> hex_to_binary( 0x7123 )
'0111000100100011'
Note that this also works in Micropython :)
I have a short snippet; I hope it helps :-)
input = 'ABC123EFFF'
for index, value in enumerate(input):
print(value)
print(bin(int(value,16)+16)[3:])
string = ''.join([bin(int(x,16)+16)[3:] for y,x in enumerate(input)])
print(string)
First I take your input and iterate over it to get each symbol. Then I convert each symbol to binary and slice from index 3 to the end. The trick to keep the leading zeros is to add 16 to each digit: that sets a fifth bit, so bin() always returns '0b1' followed by exactly four digits, and [3:] strips the '0b1'.
The short form uses the join method. Enjoy.
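To see why adding 16 works, here is a small sketch for a single digit, assuming Python 3 (the function name nibble_bits is made up for illustration):

```python
def nibble_bits(hex_digit):
    value = int(hex_digit, 16) + 16   # 16 == 0b10000, so bit 4 is always set
    text = bin(value)                 # e.g. '3' -> 19 -> '0b10011'
    return text[3:]                   # drop '0b1', keeping exactly four digits

print(nibble_bits('3'), nibble_bits('F'), nibble_bits('0'))
```

Joining nibble_bits(c) over every character of the input gives the same string as the one-liner above.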
HEX_TO_BINARY_CONVERSION_TABLE = {
'0': '0000',
'1': '0001',
'2': '0010',
'3': '0011',
'4': '0100',
'5': '0101',
'6': '0110',
'7': '0111',
'8': '1000',
'9': '1001',
'a': '1010',
'b': '1011',
'c': '1100',
'd': '1101',
'e': '1110',
'f': '1111'}
def hex_to_binary(hex_string):
binary_string = ""
for character in hex_string:
binary_string += HEX_TO_BINARY_CONVERSION_TABLE[character]
return binary_string
when I time hex_to_binary("123ade")
%timeit hex_to_binary("123ade")
here is the result:
316 ns ± 2.52 ns per loop
Alternatively, you could use "join" method:
def hex_to_binary_join(hex_string):
hex_array=[]
for character in hex_string:
hex_array.append(HEX_TO_BINARY_CONVERSION_TABLE[character])
return "".join(hex_array)
I timed this too:
%timeit hex_to_binary_join("123ade")
397 ns ± 4.64 ns per loop
a = raw_input('hex number\n')
length = len(a)
ab = bin(int(a, 16))[2:]
while len(ab)<(length * 4):
ab = '0' + ab
print ab
hexa_input = input('Enter hex String to convert to Binary: ')
pad_bits = len(hexa_input) * 4
Integer_output = int(hexa_input, 16)
Binary_output = bin(Integer_output)[2:].zfill(pad_bits)
print(Binary_output)
"""zfill(x) pads the string on the left with 0s until it is x characters long;
the binary digits fill the right-hand side and the remaining positions stay 0.
[y:] drops the first y characters of the string (here the '0b' prefix)."""
def conversion():
e=raw_input("enter hexadecimal no.:")
e1=("a","b","c","d","e","f")
e2=(10,11,12,13,14,15)
e3=1
e4=len(e)
e5=()
while e3<=e4:
e5=e5+(e[e3-1],)
e3=e3+1
print e5
e6=1
e8=()
while e6<=e4:
e7=e5[e6-1]
if e7=="A":
e7=10
if e7=="B":
e7=11
if e7=="C":
e7=12
if e7=="D":
e7=13
if e7=="E":
e7=14
if e7=="F":
e7=15
else:
e7=int(e7)
e8=e8+(e7,)
e6=e6+1
print e8
e9=1
e10=len(e8)
e11=()
while e9<=e10:
e12=e8[e9-1]
a1=e12
a2=()
a3=1
while a3<=1:
a4=a1%2
a2=a2+(a4,)
a1=a1/2
if a1<2:
if a1==1:
a2=a2+(1,)
if a1==0:
a2=a2+(0,)
a3=a3+1
a5=len(a2)
a6=1
a7=""
a56=a5
while a6<=a5:
a7=a7+str(a2[a56-1])
a6=a6+1
a56=a56-1
if a5<=3:
if a5==1:
a8="000"
a7=a8+a7
if a5==2:
a8="00"
a7=a8+a7
if a5==3:
a8="0"
a7=a8+a7
else:
a7=a7
print a7,
e9=e9+1
no=raw_input("Enter your number in hexa decimal :")
def convert(a):
if a=="0":
c="0000"
elif a=="1":
c="0001"
elif a=="2":
c="0010"
elif a=="3":
c="0011"
elif a=="4":
c="0100"
elif a=="5":
c="0101"
elif a=="6":
c="0110"
elif a=="7":
c="0111"
elif a=="8":
c="1000"
elif a=="9":
c="1001"
elif a=="A":
c="1010"
elif a=="B":
c="1011"
elif a=="C":
c="1100"
elif a=="D":
c="1101"
elif a=="E":
c="1110"
elif a=="F":
c="1111"
else:
c="invalid"
return c
a=len(no)
b=0
l=""
while b<a:
l=l+convert(no[b])
b+=1
print l

How to convert a string of bytes into an int?

How can I convert a string of bytes into an int in python?
Say like this: 'y\xcc\xa6\xbb'
I came up with a clever/stupid way of doing it:
sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))
I know there has to be something builtin or in the standard library that does this more simply...
This is different from converting a string of hex digits for which you can use int(xxx, 16), but instead I want to convert a string of actual byte values.
UPDATE:
I kind of like James' answer a little better because it doesn't require importing another module, but Greg's method is faster:
>>> from timeit import Timer
>>> Timer('struct.unpack("<L", "y\xcc\xa6\xbb")[0]', 'import struct').timeit()
0.36242198944091797
>>> Timer("int('y\xcc\xa6\xbb'.encode('hex'), 16)").timeit()
1.1432669162750244
My hacky method:
>>> Timer("sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))").timeit()
2.8819329738616943
FURTHER UPDATE:
Someone asked in comments what's the problem with importing another module. Well, importing a module isn't necessarily cheap, take a look:
>>> Timer("""import struct\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""").timeit()
0.98822188377380371
Including the cost of importing the module negates almost all of the advantage that this method has. I believe that this will only include the expense of importing it once for the entire benchmark run; look what happens when I force it to reload every time:
>>> Timer("""reload(struct)\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""", 'import struct').timeit()
68.474128007888794
Needless to say, if you're doing a lot of executions of this method per import, then this becomes proportionally less of an issue. It's also probably I/O cost rather than CPU, so it may depend on the capacity and load characteristics of the particular machine.
In Python 3.2 and later, use
>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='big')
2043455163
or
>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='little')
3148270713
according to the endianness of your byte-string.
This also works for bytestring-integers of arbitrary length, and for two's-complement signed integers by specifying signed=True. See the docs for from_bytes.
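As a quick illustration of the signed flag, assuming Python 3.2+, the same two bytes read very differently depending on signedness and byte order:

```python
data = b'\xff\xfe'

unsigned = int.from_bytes(data, byteorder='big')                       # 0xfffe
signed = int.from_bytes(data, byteorder='big', signed=True)            # two's complement
signed_little = int.from_bytes(data, byteorder='little', signed=True)  # bytes read as 0xfeff
print(unsigned, signed, signed_little)
```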
You can also use the struct module to do this:
>>> import struct
>>> struct.unpack("<L", "y\xcc\xa6\xbb")[0]
3148270713L
As Greg said, you can use struct if you are dealing with binary values, but if you just have a "hex number" but in byte format you might want to just convert it like:
s = 'y\xcc\xa6\xbb'
num = int(s.encode('hex'), 16)
...this is the same as:
num = struct.unpack(">L", s)[0]
...except it'll work for any number of bytes.
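The 'hex' codec used by s.encode('hex') exists only in Python 2. A rough Python 3 equivalent, sketched here on the assumption that the data is a bytes object, uses bytes.hex() (available from Python 3.5):

```python
import struct

s = b'y\xcc\xa6\xbb'

num = int(s.hex(), 16)              # works for any number of bytes
same = struct.unpack('>L', s)[0]    # fixed 4-byte, big-endian
print(num, same)
```

Like the original, the hex route treats the bytes as big-endian and unsigned.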
I use the following function to convert data between int, hex and bytes.
def bytes2int(str):
return int(str.encode('hex'), 16)
def bytes2hex(str):
return '0x'+str.encode('hex')
def int2bytes(i):
h = int2hex(i)
return hex2bytes(h)
def int2hex(i):
return hex(i)
def hex2int(h):
if len(h) > 1 and h[0:2] == '0x':
h = h[2:]
if len(h) % 2:
h = "0" + h
return int(h, 16)
def hex2bytes(h):
if len(h) > 1 and h[0:2] == '0x':
h = h[2:]
if len(h) % 2:
h = "0" + h
return h.decode('hex')
Source: http://opentechnotes.blogspot.com.au/2014/04/convert-values-to-from-integer-hex.html
import array
integerValue = array.array("I", 'y\xcc\xa6\xbb')[0]
Warning: the above is strongly platform-specific. Both the "I" specifier and the endianness of the string->int conversion are dependent on your particular Python implementation. But if you want to convert many integers/strings at once, then the array module does it quickly.
In Python 2.x, you could use the format specifiers <B for unsigned bytes, and <b for signed bytes with struct.unpack/struct.pack.
E.g:
x = '\xff\x10\x11'
data_ints = struct.unpack('<' + 'B'*len(x), x) # (255, 16, 17)
And:
data_bytes = struct.pack('<' + 'B'*len(data_ints), *data_ints) # '\xff\x10\x11'
That * is required!
See https://docs.python.org/2/library/struct.html#format-characters for a list of the format specifiers.
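The same approach carries over to Python 3 if the input is a bytes object. A short sketch (variable names follow the example above; signed_ints is added for illustration) that also uses a repeat count instead of repeating the format character:

```python
import struct

x = b'\xff\x10\x11'

# '<3B' is equivalent to '<BBB': three unsigned bytes.
data_ints = struct.unpack('<%dB' % len(x), x)
signed_ints = struct.unpack('<%db' % len(x), x)   # lowercase b: signed bytes
data_bytes = struct.pack('<%dB' % len(data_ints), *data_ints)
print(data_ints, signed_ints, data_bytes)
```

Note that struct.unpack returns a tuple, not a list.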
>>> reduce(lambda s, x: s*256 + x, bytearray("y\xcc\xa6\xbb"))
2043455163
Test 1: inverse:
>>> hex(2043455163)
'0x79cca6bb'
Test 2: Number of bytes > 8:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAA"))
338822822454978555838225329091068225L
Test 3: Increment by one:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAB"))
338822822454978555838225329091068226L
Test 4: Append one byte, say 'A':
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))
86738642548474510294585684247313465921L
Test 5: Divide by 256:
>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))/256
338822822454978555838225329091068226L
Result equals the result of Test 3, as expected.
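The one-liner relies on Python 2's builtin reduce and on bytearray accepting a str. A possible Python 3 version, given as a sketch with a hypothetical function name, imports reduce from functools and iterates over bytes directly:

```python
from functools import reduce  # reduce is not a builtin in Python 3

def bytes_to_int(data):
    # Shift the running total left by one byte, then add the next byte.
    return reduce(lambda total, byte: total * 256 + byte, data, 0)

print(bytes_to_int(b'y\xcc\xa6\xbb'))
```

The explicit initial value 0 makes an empty input return 0 instead of raising TypeError.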
I was struggling to find a solution for arbitrary length byte sequences that would work under Python 2.x. Finally I wrote this one, it's a bit hacky because it performs a string conversion, but it works.
Function for Python 2.x, arbitrary length
def signedbytes(data):
"""Convert a bytearray into an integer, considering the first bit as
sign. The data must be big-endian."""
negative = data[0] & 0x80 > 0
if negative:
inverted = bytearray(~d % 256 for d in data)
return -signedbytes(inverted) - 1
encoded = str(data).encode('hex')
return int(encoded, 16)
This function has two requirements:
The input data needs to be a bytearray. You may call the function like this:
s = 'y\xcc\xa6\xbb'
n = signedbytes(bytearray(s))
The data needs to be big-endian. In case you have a little-endian value, you should reverse it first:
n = signedbytes(bytearray(s)[::-1])
Of course, this should be used only if arbitrary length is needed. Otherwise, stick with more standard ways (e.g. struct).
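For comparison, the same two's-complement interpretation can be sketched in Python 3 without the string round trip. This is an illustration under the same big-endian assumption, not the author's function; the name signed_from_bytes is made up:

```python
def signed_from_bytes(data):
    """Interpret big-endian bytes as a two's-complement signed integer."""
    value = int.from_bytes(data, 'big')       # unsigned reading first
    if data and data[0] & 0x80:               # top bit set means negative
        value -= 1 << (8 * len(data))         # subtract 2**(number of bits)
    return value

print(signed_from_bytes(b'\xff'), signed_from_bytes(b'y\xcc\xa6\xbb'))
```

It should agree with int.from_bytes(data, 'big', signed=True) for any length.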
int.from_bytes is the best solution if you are at version >=3.2.
The "struct.unpack" solution requires a string so it will not apply to arrays of bytes.
Here is another solution:
def bytes2int( tb, order='big'):
if order == 'big': seq=[0,1,2,3]
elif order == 'little': seq=[3,2,1,0]
i = 0
for j in seq: i = (i<<8)+tb[j]
return i
hex( bytes2int( [0x87, 0x65, 0x43, 0x21])) returns '0x87654321'.
It handles big and little endianness and is easily modifiable for 8 bytes.
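One way to lift the fixed 4-byte limit is to iterate over the input itself instead of a hard-coded index list. A sketch of that variation, assuming the input is any sequence of ints in 0..255:

```python
def bytes2int(tb, order='big'):
    # reversed() puts little-endian input into most-significant-first order.
    seq = tb if order == 'big' else reversed(tb)
    i = 0
    for byte in seq:
        i = (i << 8) + byte
    return i

print(hex(bytes2int([0x87, 0x65, 0x43, 0x21])))
```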
As mentioned above, using the unpack function of struct is a good way. If you want to implement your own function, here is another solution:
def bytes_to_int(bytes):
result = 0
for b in bytes:
result = result * 256 + int(b)
return result
In python 3 you can easily convert a byte string into a list of integers (0..255) by
>>> list(b'y\xcc\xa6\xbb')
[121, 204, 166, 187]
A decently speedy method utilizing array.array I've been using for some time:
predefined variables:
from array import array
offset = 0
size = 4
big = True # endian
arr = array('B')
arr.fromstring("\x00\x00\xff\x00") # 5 bytes (encoding issues) [0, 0, 195, 191, 0]
to int: (read)
val = 0
for v in arr[offset:offset+size][::pow(-1,not big)]: val = (val<<8)|v
from int: (write)
val = 16384
arr[offset:offset+size] = \
array('B',((val>>(i<<3))&255 for i in range(size)))[::pow(-1,not big)]
It's possible these could be faster though.
EDIT:
For some numbers, here's a performance test (Anaconda 2.3.0) showing stable averages on read in comparison to reduce():
========================= byte array to int.py =========================
5000 iterations; threshold of min + 5000ns:
code | min | max | avg | efficiency
val = 0 \nfor v in arr: val = (val<<8)|v | 5373.848ns | 850009.965ns | ~8649.64ns | 62.128%
val = reduce( shift, arr ) | 6489.921ns | 5094212.014ns | ~12040.269ns | 53.902%
This is a raw performance test, so the endian pow-flip is left out.
The shift function shown applies the same shift-oring operation as the for loop, and arr is just array.array('B',[0,0,255,0]) as it has the fastest iterative performance next to dict.
I should probably also note efficiency is measured by accuracy to the average time.
