I'm using a Python to transmit two integers (range 0...4095) via SPI. The package seems to expect a byte array in form of [0xff,0xff,0xff].
So e.g. 1638(hex:666) and 1229(hex:4cd) should yield [0x66,0x64,0xcd].
So would an effective conversion look like as the mixed byte in the middle seems quite nasty?
You can do it by left shifting and then bitwise OR'ing the two 12-bit values together and using the int_to_bytes() function shown below, which will work in Python 2.x.
In Python 3, the int type has a built-in method called to_bytes() that will do this and more, so in that version you wouldn't need to supply your own.
def int_to_bytes(n, minlen=0):
""" Convert integer to bytearray with optional minimum length.
"""
if n > 0:
arr = []
while n:
n, rem = n >> 8, n & 0xff
arr.append(rem)
b = bytearray(reversed(arr))
elif n == 0:
b = bytearray(b'\x00')
else:
raise ValueError('Only non-negative values supported')
if minlen > 0 and len(b) < minlen: # zero padding needed?
b = (minlen-len(b)) * '\x00' + b
return b
a, b = 1638, 1229 # two 12 bit values
v = a << 12 | b # shift first 12 bits then OR with second
ba = int_to_bytes(v, 3) # convert to array of bytes
print('[{}]'.format(', '.join(hex(b) for b in ba))) # -> [0x66, 0x64, 0xcd]
Related
I have int in python that I want to reverse
x = int(1234567899)
I want to result will be 3674379849
explain : = 1234567899 = 0x499602DB and 3674379849 = 0xDB029649
How to do that in python ?
>>> import struct
>>> struct.unpack('>I', struct.pack('<I', 1234567899))[0]
3674379849
>>>
This converts the integer to a 4-byte array (I), then decodes it in reverse order (> vs <).
Documentation: struct
If you just want the result, use sabiks approach - if you want the intermediate steps for bragging rights, you would need to
create the hex of the number (#1) and maybe add a leading 0 for correctness
reverse it 2-byte-wise (#2)
create an integer again (#3)
f.e. like so
n = 1234567899
# 1
h = hex(n)
if len(h) % 2: # fix for uneven lengthy inputs (f.e. n = int("234",16))
h = '0x0'+h[2:]
# 2 (skips 0x and prepends 0x for looks only)
bh = '0x'+''.join([h[i: i+2] for i in range(2, len(h), 2)][::-1])
# 3
b = int(bh, 16)
print(n, h, bh, b)
to get
1234567899 0x499602db 0xdb029649 3674379849
I have a string of booleans and I want to create a binary file using these booleans as bits. This is what I am doing:
# first append the string with 0s to make its length a multiple of 8
while len(boolString) % 8 != 0:
boolString += '0'
# write the string to the file byte by byte
i = 0
while i < len(boolString) / 8:
byte = int(boolString[i*8 : (i+1)*8], 2)
outputFile.write('%c' % byte)
i += 1
But this generates the output 1 byte at a time and is slow. What would be a more efficient way to do it?
It should be quicker if you calculate all your bytes first and then write them all together. For example
b = bytearray([int(boolString[x:x+8], 2) for x in range(0, len(boolString), 8)])
outputFile.write(b)
I'm also using a bytearray which is a natural container to use, and can also be written directly to your file.
You can of course use libraries if that's appropriate such as bitarray and bitstring. Using the latter you could just say
bitstring.Bits(bin=boolString).tofile(outputFile)
Here's another answer, this time using an industrial-strength utility function from the PyCrypto - The Python Cryptography Toolkit where, in version 2.6 (the current latest stable release), it's defined inpycrypto-2.6/lib/Crypto/Util/number.py.
The comments preceeding it say:
Improved conversion functions contributed by Barry Warsaw, after careful benchmarking
import struct
def long_to_bytes(n, blocksize=0):
"""long_to_bytes(n:long, blocksize:int) : string
Convert a long integer to a byte string.
If optional blocksize is given and greater than zero, pad the front of the
byte string with binary zeros so that the length is a multiple of
blocksize.
"""
# after much testing, this algorithm was deemed to be the fastest
s = b('')
n = long(n)
pack = struct.pack
while n > 0:
s = pack('>I', n & 0xffffffffL) + s
n = n >> 32
# strip off leading zeros
for i in range(len(s)):
if s[i] != b('\000')[0]:
break
else:
# only happens when n == 0
s = b('\000')
i = 0
s = s[i:]
# add back some pad bytes. this could be done more efficiently w.r.t. the
# de-padding being done above, but sigh...
if blocksize > 0 and len(s) % blocksize:
s = (blocksize - len(s) % blocksize) * b('\000') + s
return s
You can convert a boolean string to a long using data = long(boolString,2). Then to write this long to disk you can use:
while data > 0:
data, byte = divmod(data, 0xff)
file.write('%c' % byte)
However, there is no need to make a boolean string. It is much easier to use a long. The long type can contain an infinite number of bits. Using bit manipulation you can set or clear the bits as needed. You can then write the long to disk as a whole in a single write operation.
You can try this code using the array class:
import array
buffer = array.array('B')
i = 0
while i < len(boolString) / 8:
byte = int(boolString[i*8 : (i+1)*8], 2)
buffer.append(byte)
i += 1
f = file(filename, 'wb')
buffer.tofile(f)
f.close()
A helper class (shown below) makes it easy:
class BitWriter:
def __init__(self, f):
self.acc = 0
self.bcount = 0
self.out = f
def __del__(self):
self.flush()
def writebit(self, bit):
if self.bcount == 8 :
self.flush()
if bit > 0:
self.acc |= (1 << (7-self.bcount))
self.bcount += 1
def writebits(self, bits, n):
while n > 0:
self.writebit( bits & (1 << (n-1)) )
n -= 1
def flush(self):
self.out.write(chr(self.acc))
self.acc = 0
self.bcount = 0
with open('outputFile', 'wb') as f:
bw = BitWriter(f)
bw.writebits(int(boolString,2), len(boolString))
bw.flush()
Use the struct package.
This can be used in handling binary data stored in files or from network connections, among other sources.
Edit:
An example using ? as the format character for a bool.
import struct
p = struct.pack('????', True, False, True, False)
assert p == '\x01\x00\x01\x00'
with open("out", "wb") as o:
o.write(p)
Let's take a look at the file:
$ ls -l out
-rw-r--r-- 1 lutz lutz 4 Okt 1 13:26 out
$ od out
0000000 000001 000001
000000
Read it in again:
with open("out", "rb") as i:
q = struct.unpack('????', i.read())
assert q == (True, False, True, False)
I'm reading a binary file in python and the documentation for the file format says:
Flag (in binary)Meaning
1 nnn nnnn Indicates that there is one data byte to follow
that is to be duplicated nnn nnnn (127 maximum)
times.
0 nnn nnnn Indicates that there are nnn nnnn bytes of image
data to follow (127 bytes maximum) and that
there are no duplications.
n 000 0000 End of line field. Indicates the end of a line
record. The value of n may be either zero or one.
Note that the end of line field is required and
that it is reflected in the length of line record
field mentioned above.
When reading the file I'm expecting the byte I'm at to return 1 nnn nnnn where the nnn nnnn part should be 50.
I've been able to do this using the following:
flag = byte >> 7
numbytes = int(bin(byte)[3:], 2)
But the numbytes calculation feels like a cheap workaround.
Can I do more bit math to accomplish the calculation of numbytes?
How would you approach this?
The classic approach of checking whether a bit is set, is to use binary "and" operator, i.e.
x = 10 # 1010 in binary
if x & 0b10: # explicitly: x & 0b0010 != 0
print('First bit is set')
To check, whether n^th bit is set, use the power of two, or better bit shifting
def is_set(x, n):
return x & 2 ** n != 0
# a more bitwise- and performance-friendly version:
return x & 1 << n != 0
is_set(10, 1) # 1 i.e. first bit - as the count starts at 0-th bit
>>> True
You can strip off the leading bit using a mask ANDed with a byte from file. That will leave you with the value of the remaining bits:
mask = 0b01111111
byte_from_file = 0b10101010
value = mask & byte_from_file
print bin(value)
>> 0b101010
print value
>> 42
I find the binary numbers easier to understand than hex when doing bit-masking.
EDIT: Slightly more complete example for your use case:
LEADING_BIT_MASK = 0b10000000
VALUE_MASK = 0b01111111
values = [0b10101010, 0b01010101, 0b0000000, 0b10000000]
for v in values:
value = v & VALUE_MASK
has_leading_bit = v & LEADING_BIT_MASK
if value == 0:
print "EOL"
elif has_leading_bit:
print "leading one", value
elif not has_leading_bit:
print "leading zero", value
If I read your description correctly:
if (byte & 0x80) != 0:
num_bytes = byte & 0x7F
there you go:
class ControlWord(object):
"""Helper class to deal with control words.
Bit setting and checking methods are implemented.
"""
def __init__(self, value = 0):
self.value = int(value)
def set_bit(self, bit):
self.value |= bit
def check_bit(self, bit):
return self.value & bit != 0
def clear_bit(self, bit):
self.value &= ~bit
Instead of int(bin(byte)[3:], 2), you could simply use: int(bin(byte>>1),2)
not sure I got you correctly, but if I did, this should do the trick:
>>> x = 154 #just an example
>>> flag = x >> 1
>>> flag
1
>>> nb = x & 127
>>> nb
26
You can do it like this:
def GetVal(b):
# mask off the most significant bit, see if it's set
flag = b & 0x80 == 0x80
# then look at the lower 7 bits in the byte.
count = b & 0x7f
# return a tuple indicating the state of the high bit, and the
# remaining integer value without the high bit.
return (flag, count)
>>> testVal = 50 + 0x80
>>> GetVal(testVal)
(True, 50)
I want to generate 64 bits long int to serve as unique ID's for documents.
One idea is to combine the user's ID, which is a 32 bit int, with the Unix timestamp, which is another 32 bits int, to form an unique 64 bits long integer.
A scaled-down example would be:
Combine two 4-bit numbers 0010 and 0101 to form the 8-bit number 00100101.
Does this scheme make sense?
If it does, how do I do the "concatenation" of numbers in Python?
Left shift the first number by the number of bits in the second number, then add (or bitwise OR - replace + with | in the following examples) the second number.
result = (user_id << 32) + timestamp
With respect to your scaled-down example,
>>> x = 0b0010
>>> y = 0b0101
>>> (x << 4) + y
37
>>> 0b00100101
37
>>>
foo = <some int>
bar = <some int>
foobar = (foo << 32) + bar
This should do it:
(x << 32) + y
For the next guy (which was me in this case was me). Here is one way to do it in general (for the scaled down example):
def combineBytes(*args):
"""
given the bytes of a multi byte number combine into one
pass them in least to most significant
"""
ans = 0
for i, val in enumerate(args):
ans += (val << i*4)
return ans
for other sizes change the 4 to a 32 or whatever.
>>> bin(combineBytes(0b0101, 0b0010))
'0b100101'
None of the answers before this cover both merging and splitting the numbers. Splitting can be as much a necessity as merging.
NUM_BITS_PER_INT = 4 # Replace with 32, 48, 64, etc. as needed.
MAXINT = (1 << NUM_BITS_PER_INT) - 1
def merge(a, b):
c = (a << NUM_BITS_PER_INT) | b
return c
def split(c):
a = (c >> NUM_BITS_PER_INT) & MAXINT
b = c & MAXINT
return a, b
# Test
EXPECTED_MAX_NUM_BITS = NUM_BITS_PER_INT * 2
for a in range(MAXINT + 1):
for b in range(MAXINT + 1):
c = merge(a, b)
assert c.bit_length() <= EXPECTED_MAX_NUM_BITS
assert (a, b) == split(c)
I have a file where the first byte contains encoded information. In Matlab I can read the byte bit by bit with var = fread(file, 8, 'ubit1'), and then retrieve each bit by var(1), var(2), etc.
Is there any equivalent bit reader in python?
Read the bits from a file, low bits first.
def bits(f):
bytes = (ord(b) for b in f.read())
for b in bytes:
for i in xrange(8):
yield (b >> i) & 1
for b in bits(open('binary-file.bin', 'r')):
print b
The smallest unit you'll be able to work with is a byte. To work at the bit level you need to use bitwise operators.
x = 3
#Check if the 1st bit is set:
x&1 != 0
#Returns True
#Check if the 2nd bit is set:
x&2 != 0
#Returns True
#Check if the 3rd bit is set:
x&4 != 0
#Returns False
With numpy it is easy like this:
Bytes = numpy.fromfile(filename, dtype = "uint8")
Bits = numpy.unpackbits(Bytes)
More info here:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html
You won't be able to read each bit one by one - you have to read it byte by byte. You can easily extract the bits out, though:
f = open("myfile", 'rb')
# read one byte
byte = f.read(1)
# convert the byte to an integer representation
byte = ord(byte)
# now convert to string of 1s and 0s
byte = bin(byte)[2:].rjust(8, '0')
# now byte contains a string with 0s and 1s
for bit in byte:
print bit
Joining some of the previous answers I would use:
[int(i) for i in "{0:08b}".format(byte)]
For each byte read from the file. The results for an 0x88 byte example is:
>>> [int(i) for i in "{0:08b}".format(0x88)]
[1, 0, 0, 0, 1, 0, 0, 0]
You can assign it to a variable and work as per your initial request.
The "{0.08}" is to guarantee the full byte length
To read a byte from a file: bytestring = open(filename, 'rb').read(1). Note: the file is opened in the binary mode.
To get bits, convert the bytestring into an integer: byte = bytestring[0] (Python 3) or byte = ord(bytestring[0]) (Python 2) and extract the desired bit: (byte >> i) & 1:
>>> for i in range(8): (b'a'[0] >> i) & 1
...
1
0
0
0
0
1
1
0
>>> bin(b'a'[0])
'0b1100001'
There are two possible ways to return the i-th bit of a byte. The "first bit" could refer to the high-order bit or it could refer to the lower order bit.
Here is a function that takes a string and index as parameters and returns the value of the bit at that location. As written, it treats the low-order bit as the first bit. If you want the high order bit first, just uncomment the indicated line.
def bit_from_string(string, index):
i, j = divmod(index, 8)
# Uncomment this if you want the high-order bit first
# j = 8 - j
if ord(string[i]) & (1 << j):
return 1
else:
return 0
The indexing starts at 0. If you want the indexing to start at 1, you can adjust index in the function before calling divmod.
Example usage:
>>> for i in range(8):
>>> print i, bit_from_string('\x04', i)
0 0
1 0
2 1
3 0
4 0
5 0
6 0
7 0
Now, for how it works:
A string is composed of 8-bit bytes, so first we use divmod() to break the index into to parts:
i: the index of the correct byte within the string
j: the index of the correct bit within that byte
We use the ord() function to convert the character at string[i] into an integer type. Then, (1 << j) computes the value of the j-th bit by left-shifting 1 by j. Finally, we use bitwise-and to test if that bit is set. If so return 1, otherwise return 0.
Supposing you have a file called bloom_filter.bin which contains an array of bits and you want to read the entire file and use those bits in an array.
First create the array where the bits will be stored after reading,
from bitarray import bitarray
a=bitarray(size) #same as the number of bits in the file
Open the file,
using open or with, anything is fine...I am sticking with open here,
f=open('bloom_filter.bin','rb')
Now load all the bits into the array 'a' at one shot using,
f.readinto(a)
'a' is now a bitarray containing all the bits
This is pretty fast I would think:
import itertools
data = range(10)
format = "{:0>8b}".format
newdata = (False if n == '0' else True for n in itertools.chain.from_iterable(map(format, data)))
print(newdata) # prints tons of True and False
I think this is a more pythonic way:
a = 140
binary = format(a, 'b')
The result of this block is:
'10001100'
I was to get bit planes of the image and this function helped me to write this block:
def img2bitmap(img: np.ndarray) -> list:
if img.dtype != np.uint8 or img.ndim > 2:
raise ValueError("Image is not uint8 or gray")
bit_mat = [np.zeros(img.shape, dtype=np.uint8) for _ in range(8)]
for row_number in range(img.shape[0]):
for column_number in range(img.shape[1]):
binary = format(img[row_number][column_number], 'b')
for idx, bit in enumerate("".join(reversed(binary))[:]):
bit_mat[idx][row_number, column_number] = 2 ** idx if int(bit) == 1 else 0
return bit_mat
Also by this block, I was able to make primitives image from extracted bit planes
img = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE)
out = img2bitmap(img)
original_image = np.zeros(img.shape, dtype=np.uint8)
for i in range(original_image.shape[0]):
for j in range(original_image.shape[1]):
for data in range(8):
x = np.array([original_image[i, j]], dtype=np.uint8)
data = np.array([data], dtype=np.uint8)
flag = np.array([0 if out[data[0]][i, j] == 0 else 1], dtype=np.uint8)
mask = flag << data[0]
x[0] = (x[0] & ~mask) | ((flag[0] << data[0]) & mask)
original_image[i, j] = x[0]