Python Bits and bytes - python

I was wondering how I could extract the last 2 bits of a byte. I receive the bytes when reading in from a file.
byte = b'\xfe'
bits = bin(byte)
output: 0b00110001
I want to know how I can get the 7th and 8th bits from that.
Any help would be appreciated.

There is always the old fashioned trick of masking:
>>> bits = bin(byte[0] & 0x03)
>>> bits
'0b10'
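The masking trick above generalizes to any bit field: shift the field down to bit 0, then AND with a mask of the right width. Here is a minimal sketch, assuming Python 3 (where indexing a bytes object yields an int). The helper name extract_bits is made up for illustration and is not part of any library. Whether "7th and 8th bit" means the two lowest or the two highest bits depends on which end you count from, so both are shown.

```python
def extract_bits(value, shift, width):
    """Return `width` bits of `value`, starting `shift` bits above the least significant bit."""
    mask = (1 << width) - 1
    return (value >> shift) & mask

byte = b'\xfe'                           # 0b11111110
low_two = extract_bits(byte[0], 0, 2)    # two least significant bits
high_two = extract_bits(byte[0], 6, 2)   # two most significant bits
print(format(low_two, '02b'), format(high_two, '02b'))
```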

Related

How can I convert from bytes to binary numbers in Python?

So I'm a total Python beginner and I got this byte object:
byte_obj = b'\x45\x10\x00\x4c\xcc\xde\x40\x00\x40\x06\x6c\x80\xc0\xa8\xd9\x17\x8d\x54\xda\x28'
But I have no idea how to put this in a binary number, I only know it's gonna have 32 bits.
You could try int.from_bytes(...), for example:
>>> byte_obj = b'\x45\x10\x00\x4c\xcc\xde\x40\x00\x40\x06\x6c\x80\xc0\xa8\xd9\x17\x8d\x54\xda\x28'
>>> int.from_bytes(byte_obj, byteorder='big')
394277201243797802270421732363840487422965373480
Where byteorder is used to specify whether the input is big- or little-endian (i.e. most or least significant byte first).
(Looks a bit bigger than 32 bits though!)
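The result is larger than 32 bits because the object holds 20 bytes, i.e. 160 bits. The following sketch, assuming Python 3, shows one way to check the size, print the binary digits at full width, and take only the first four bytes if a 32-bit value is what is wanted. That the first four bytes are the relevant ones is purely an assumption here.

```python
byte_obj = b'\x45\x10\x00\x4c\xcc\xde\x40\x00\x40\x06\x6c\x80\xc0\xa8\xd9\x17\x8d\x54\xda\x28'

total_bits = len(byte_obj) * 8                      # 20 bytes -> 160 bits
as_int = int.from_bytes(byte_obj, byteorder='big')

# Zero-pad to the full width so leading zero bits are not dropped.
binary_text = format(as_int, '0{}b'.format(total_bits))

# Assumption: only the first 4 bytes form the 32-bit value of interest.
first_word = int.from_bytes(byte_obj[:4], byteorder='big')
print(total_bits, len(binary_text), hex(first_word))
```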

Python 3 wave module byteorder..?

[Edit: In summary, this question was the result of me making (clearly incorrect) assumptions about what endian means (I assumed it was 00000001 vs 10000000, i.e. reversing the bits, rather than the bytes). Many thanks @tripleee for clearing up my confusion.]
As far as I can tell, the byte order of frames returned by the Python 3 wave module [1] (which I'll now refer to as pywave) isn't documented. I've had a look at the source code [2] [3], but haven't quite figured it out.
Firstly, it looks like pywave only supports 'RIFF' wave files [2]. 'RIFF' files are little-endian; samples are unsigned at a bit depth of 8 bits or lower, and signed (two's complement) otherwise.
However, it looks like pywave converts the bytes it reads from the file to sys.byteorder [2]:
data = self._data_chunk.read(nframes * self._framesize)
if self._sampwidth != 1 and sys.byteorder == 'big':
    data = audioop.byteswap(data, self._sampwidth)
Except in the case of sampwidth==1, which corresponds to an 8 bit file. So 8 bit files aren't converted to sys.byteorder? Why would this be? (Maybe because they are unsigned?)
Currently my logic looks like:
if sampwidth == 1:
    signed = False
    byteorder = 'little'
else:
    signed = True
    byteorder = sys.byteorder
Is this correct?
8 bit wav files are incredibly rare nowadays, so this isn't really a problem. But I would still like to find answers...
[1] https://docs.python.org/3/library/wave.html
[2] https://github.com/python/cpython/blob/3.9/Lib/wave.py
[3] https://github.com/python/cpython/blob/3.9/Lib/chunk.py
A byte is a byte; little- or big-endian only makes sense for data that is more than one byte long.
0xf0 is a single, 8-bit byte. The bits are 0b11110000 on any modern architecture. Without a sign bit, the range is 0 through 255 (8 bits of storage gives 2^8 = 256 possible values).
0xf0eb is a 16-bit number which takes two 8-bit bytes to represent. This can be represented as
0xf0 0xeb big-endian (0b11110000 0b11101011), or
0xeb 0xf0 little-endian (0b11101011 0b11110000)
The range of possible values without a sign bit is 0 through 65,535 (2^16 = 65,536 values).
You can also have different byte orders for 32-bit numbers etc, but I'll defer to Wikipedia etc for the full exposition.
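The point that byte order rearranges whole bytes, not the bits inside them, can be checked with the built-in int.to_bytes and int.from_bytes. This is a small sketch assuming Python 3, using the same 0xf0eb value as above.

```python
value = 0xf0eb

big = value.to_bytes(2, byteorder='big')        # most significant byte first
little = value.to_bytes(2, byteorder='little')  # least significant byte first

# Reading the bytes back with the matching byte order recovers the value.
assert int.from_bytes(big, 'big') == value
assert int.from_bytes(little, 'little') == value

# Reading with the wrong byte order swaps the two bytes; the bits within
# each byte keep their order.
swapped = int.from_bytes(big, 'little')
print(hex(swapped))

# A single byte reads the same under either byte order.
single = bytes([0xf0])
assert int.from_bytes(single, 'big') == int.from_bytes(single, 'little') == 0xf0
```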

Extract 12-bit integer from 2 byte big endian (motorola) bytearray

I am trying to extract an integer which occupies up to 12 bits in a 2 byte (16 bit) message, which is in big-endian format. I have done some research already and expect that I will have to use bit_manipulation (bit shifting) to achieve this, but I am unsure how this can be applied to big-endian format.
A couple of answers on here used the Python 'NumPy' package, but I don't have access to that on MicroPython. I do have access to the 'ustruct' module, which I use to unpack certain other parts of the message, but it only seems to apply to 8-bit, 16-bit and 32-bit messages.
So far the only thing I have come up with is:
int12 = (byte1 << 4) + (byte2)
expected_value = int.from_bytes(int12)
but this isn't giving me the numbers I am expecting. For example, 0x02, 0x15 should give decimal 533.
Where am I going wrong?
I'm new to bit manipulation and extracting data from bytes so any help is greatly appreciated, Thanks!
This should work:
import struct
val, = struct.unpack('!h', b'23')  # '!h' yields a 1-tuple, so unpack exactly one value
val = (val >> 4) & 0xFFF
gives:
>>> hex(val)
'0x323'
However, you should check which 12 bits out of the 16 are occupied. The code above assumes that they are the upper 3 nibbles. If the number occupies the lower 3 nibbles, you don't need any shift, just the mask with 0xFFF.
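For the example in the question (0x02, 0x15 should give 533), the value appears to sit in the lower 12 bits of the big-endian word, so the first byte has to be shifted by 8 rather than 4, and no int.from_bytes call is needed once you already hold an int. Here is a sketch using only shifts and masks, which should also suit environments without NumPy. The function names are hypothetical, and the bit layout is an assumption to verify against your message format.

```python
def low_12_bits(byte1, byte2):
    """Combine two big-endian bytes and keep the lower 12 bits."""
    word = (byte1 << 8) | byte2    # byte1 is the most significant byte
    return word & 0x0FFF

def high_12_bits(byte1, byte2):
    """Combine two big-endian bytes and keep the upper 12 bits."""
    word = (byte1 << 8) | byte2
    return (word >> 4) & 0x0FFF

print(low_12_bits(0x02, 0x15))
```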

how to re-order bytes in a python hex string and convert to long

I have this long hex string: 20D788028A4B59FB3C07050E2F30. In Python 2.7 I want to extract the first 4 bytes, change their order, convert the result to a signed number, divide it by 2^20 and then print it out. In C this would be very easy for me :) but here I'm a little stuck.
For example the correct answer would extract the 4 byte number from the string above as 0x288D720. Then divided by 2^20 would be 40.5525. Mainly I'm having trouble figuring out the right way to do byte manipulation in python. In C I would just grab pointers to each byte and shift them where I wanted them to go and cast into an int or a long.
Python is great with strings, so let's use what we have:
s = "20D788028A4B59FB3C07050E2F30"
t = "".join([s[i-2:i] for i in range(8,0,-2)])
print int(t, 16) * 1.0 / pow(2,20)
But dividing by 2**20 looks a bit odd when working with bits, so shifting is at least worth a mention too (note that it discards the fractional part):
print int(t, 16) >> 20
In the end, I would write:
print int(t, 16) * 1.0 / (1 << 20)
For an extraction you can just do
foo[:8]
Hex to bytes: hexadecimal string to byte array in python
Rearrange bytes: byte reverse AB CD to CD AB with python
You can use struct for conversion to long
And just do a normal division by (2**20)
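The steps listed above (extract, hex to bytes, reorder, convert, divide) can be sketched in a few lines. Note that this version assumes Python 3, where bytes.fromhex and int.from_bytes are available, whereas the question targets Python 2.7. Treating the four bytes as a little-endian signed value is what "change their order" amounts to here.

```python
s = "20D788028A4B59FB3C07050E2F30"

first_four = bytes.fromhex(s[:8])    # 8 hex digits -> 4 bytes: 20 D7 88 02

# Little-endian read reverses the byte order; signed=True applies two's complement.
raw = int.from_bytes(first_four, byteorder='little', signed=True)
scaled = raw / 2**20                 # true division keeps the fractional part

print(hex(raw), scaled)
```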

Convert binary data to signed integer

I read a binary file and get an array with characters. When converting two bytes to an integer I do 256*ord(p1) + ord(p0). It works fine for positive integers but when I get a negative number it doesn't work. I know there is something with the first bit in the most significant byte but with no success.
I also understand there is something called struct and after reading I ended up with the following code
import struct
p1 = chr(231)
p0 = chr(174)
a = struct.unpack('h',p0+p1)
print str(a)
a becomes -6226 and if I swap p0 and p1 I get -20761.
a should have been -2
-2 is not correct for the values you have specified, and byte order matters. struct uses > for big-endian (most-significant byte first) and < for little-endian (least-significant byte first):
>>> import struct
>>> struct.pack('>h',-2)
'\xff\xfe'
>>> struct.pack('<h',-2)
'\xfe\xff'
>>> p1=chr(254) # 0xFE
>>> p0=chr(255) # 0xFF
>>> struct.unpack('<h',p1+p0)[0]
-2
>>> struct.unpack('>h',p0+p1)[0]
-2
Generally, when using struct, your format string should start with one of the byte-order specifiers (@, =, <, > or !). The default, native one differs from machine to machine.
Therefore, the correct result is
>>> struct.unpack('!h',p0+p1)[0]
-20761
The representation of -2 in big endian is:
1111 1111 1111 1110 # binary
255 254 # decimal bytes
f f f e # hexadecimal bytes
You can easily verify that by adding two, which results in 0.
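That check can be written out as follows. This is a sketch assuming Python 3; the 0xFFFF mask stands in for the 16-bit register width, since Python integers do not overflow by themselves.

```python
raw = b'\xff\xfe'

unsigned = int.from_bytes(raw, 'big')              # bit pattern read as 0..65535
signed = int.from_bytes(raw, 'big', signed=True)   # same bits as two's complement

# Adding two overflows 16 bits; masking keeps only the low 16 bits.
wrapped = (unsigned + 2) & 0xFFFF
print(signed, wrapped)
```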
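With the first method (256*ord(p1) + ord(p0)), you could check whether the sign bit is set with if ord(p1) & 0x80. If it is, subtract 65536 from the end result, since the value is stored in two's complement. Clearing the bit with ord(p1) & 0x7f and negating would only be correct for a sign-magnitude encoding.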
For the record, you can do it without struct. Your original equation can be used, but if the result is greater than 32767, subtract 65536. (Or if the high-order byte is greater than 127, which is the same thing.) Look up two's complement, which is how all modern computers represent negative integers.
p1 = chr(231)
p0 = chr(174)
a = 256 * ord(p1) + ord(p0) - (65536 if ord(p1) > 127 else 0)
This gets you the correct answer of -6226. (The correct answer is not -2.)
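As a cross-check, the manual subtraction can be compared with struct and with int.from_bytes. This sketch assumes Python 3, where the two bytes are plain integers and ord() is not needed. The helper name to_signed16 is made up for illustration.

```python
import struct

data = bytes([174, 231])    # p0 = 0xAE (low byte), p1 = 0xE7 (high byte)

def to_signed16(hi, lo):
    """Combine two byte values and apply the two's complement correction."""
    value = 256 * hi + lo
    return value - 65536 if value > 32767 else value

manual = to_signed16(data[1], data[0])
via_struct = struct.unpack('<h', data)[0]               # explicit little-endian
via_int = int.from_bytes(data, 'little', signed=True)
print(manual, via_struct, via_int)
```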
If you are converting values from a file that is large, use the array module.
For a file, it is the endianness of the file format that matters, not the endianness of the machine that wrote it or is reading it.
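A sketch of the array approach, assuming Python 3 and a file format that stores little-endian signed 16-bit values (both are assumptions; adjust for your format). The array module always uses the machine's native byte order, so the bytes are swapped only when the machine disagrees with the file. The function name read_int16_le is hypothetical.

```python
import sys
from array import array

def read_int16_le(data):
    """Decode little-endian signed 16-bit values from a bytes object."""
    values = array('h')              # 'h' is a signed short
    if values.itemsize != 2:
        raise RuntimeError("expected 2-byte shorts on this platform")
    values.frombytes(data)           # interpreted in native byte order
    if sys.byteorder == 'big':
        values.byteswap()            # file is little-endian, machine is not
    return values.tolist()

print(read_int16_le(b'\xae\xe7\xfe\xff'))
```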
Alex Martelli, of course, has the definitive answer.
Your original equation will work fine if you use masking to take off the extra 1 bits in a negative number:
256*(ord(p0) & 0xff) + (ord(p1) & 0xff)
Edit: I think I might have misunderstood your question. You're trying to convert two positive byte values into a negative integer? This should work:
a = 256*ord(p0) + ord(p1)
if a > 32767: # 0x7fff
    a -= 65536 # 0x10000
