I have a bmp file that I read in my Python program. Once I have read in the bytes, I want to do bit-wise operations on each byte I read in. My program is:
with open("ship.bmp", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        # Do stuff with byte.
        byte = f.read(1)
        print(byte)
output:
b'\xfe'
I was wondering how I can manipulate that value, i.e. convert it to bits. Some general pointers would be good. I lack experience with Python, so any help would be appreciated!
bytes objects yield integers from 0 through 255 inclusive when indexed. So, just perform the bit manipulation on the result of indexing.
>>> b'\xfe'[0]
254
>>> b'\xfe'[0] ^ 0x55
171
f.read(1) constructs a length-1 bytes object, which is overkill when you want the byte as an integer. To access each byte as an integer, the following is more succinct and has the benefit of using a for loop.
with open("ship.bmp", "rb") as f:
    byte_data = f.read()
for byte in byte_data:
    # do stuff with byte, e.g.
    result = byte & 0x2
    ...
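Since each item is a plain int, the usual integer operators apply. A minimal sketch of a few common manipulations, using the b'\xfe' byte from the question (the variable names are only illustrative):

```python
byte = b'\xfe'[0]              # indexing a bytes object yields an int: 254

bits = format(byte, '08b')     # zero-padded binary string
low_nibble = byte & 0x0F       # keep the lowest four bits
shifted = byte >> 2            # drop the two lowest bits
bit1_set = bool(byte & 0x02)   # test a single bit (bit 1)
flipped = byte ^ 0xFF          # invert all eight bits

print(bits, low_nibble, shifted, bit1_set, flipped)
```

format(byte, '08b') is probably the closest thing to "convert it to bits"; the other lines show masking, shifting and toggling.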
Related
Let's say I have the following ELF file in python:
>>> data=open('file','rb').read()
>>> data
b'\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00>\x00\x01\x00\x00\x00x\x00#\x00\x00\x00\x00\x00#\x00\x00\x00\x00\x00\x00\x00X\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00#\x008\x00\x01\x00#\x00\x05\x00\x04\x00\x01\x00\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00#\x00\x00\x00\x00\x00\x00\x00#\x00\x00\x00\x00\x00\x84\x00\x00\x00\x00\x00\x00\x00\x84\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\xbf\x03\x00\x00\x00\xb8<\x00\x00\x00\x0f\x05\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x01\x00x\x00#\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x10\x00\x01\x00\x84\x00`\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\r\x00\x00\x00\x10\x00\x01\x00\x84\x00`\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x14\x00\x00\x00\x10\x00\x01\x00\x88\x00`\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00__bss_start\x00_edata\x00_end\x00\x00.symtab\x00.strtab\x00.shstrtab\x00.text\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x1b\x00\x00\x00\x01\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00x\x00#\x00\x00\x00\x00\x00x\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x88\x00\x00\x00\x00\x00\x00\x00\x90\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x18\x00\x00\x00\x00\x00\x00\x00\t\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
00\x00\x00\x00\x00\x00\x18\x01\x00\x00\x00\x00\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x11\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x001\x01\x00\x00\x00\x00\x00\x00!\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
The first 7 bytes are, in hex:
0x7F 0x45 ("E") 0x4c ("L") 0x46 ("F") 0x02 0x01 0x01
How would I change the 5th byte to 1 and save the file? Something like:
data[5]=1 # gives a 'bytes' assignment error
open('newfile','wb').write(data)
Convert it to a bytearray, which is a mutable sequence of bytes, for modifying and re-saving the file:
The bytearray class is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has
# open the file in binary mode and convert to a bytearray
with open('file', 'rb') as f:
    barray = bytearray(f.read())
# modify the byte in the array (index 4 is the 5th byte)
barray[4] = 1
# write out in binary mode
with open('newfile', 'wb') as f:
    f.write(barray)
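The difference between the two types can be seen without touching a file. A small sketch using the first 7 header bytes quoted in the question (header, patched and result are illustrative names):

```python
header = b'\x7fELF\x02\x01\x01'   # first 7 bytes from the question

try:
    header[4] = 1                 # bytes is immutable, so this raises TypeError
except TypeError as exc:
    error = str(exc)

patched = bytearray(header)       # mutable copy of the same data
patched[4] = 1                    # the 5th byte has index 4
result = bytes(patched)           # back to an immutable bytes object if needed

print(error)
print(result)
```

Note that the question's data[5] would address the 6th byte, because indexing starts at 0.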
I would like to scan through data files from a GPS receiver byte-wise (eventually it will be a continuous stream, but for now I want to test the code with offline data). If I find a match, I then check the next 2 bytes for the 'length', get the next 2 bytes and shift them 2 bits (not bytes) to the right, etc. I haven't handled binary data before, so I am stuck on a simple task. I can read the binary file byte by byte, but cannot find a way to match my desired pattern (i.e. D3).
with open("COM6_200417.ubx", "rb") as f:
    byte = f.read(1)  # read 1 byte at a time
    while byte != b"":
        # Do stuff with byte.
        byte = f.read(1)
        print(byte)
The output file is:
b'\x82'
b'\xc2'
b'\xe3'
b'\xb8'
b'\xe0'
b'\x00'
b'#'
b'\x13'
b'\x05'
b'!'
b'\xd3'
b'\x00'
b'\x13'
....
How do I check whether that byte is == '\xd3' (D3)?
I would also like to know how to shift bit-wise, as I need to check a decimal value consisting of 6 bits
(1-byte and next byte's first 2-bits). I am considering taking 2-bytes(8-bits) and then a 2-bit right-shift
to get 6-bits. Is that possible in Python? Any improvements/additions/changes are very much appreciated.
P.S. Can I get rid of that pesky 'b' at the front? If ignoring it has no effect, then it's no problem.
Thanks in advance.
'That byte' is represented with a b'' in front, indicating that it is a bytes object. To get rid of it, you can convert it to an int:
thatbyte = b'\xd3'
byteint = thatbyte[0]  # or
byteint = int.from_bytes(thatbyte, 'big')  # 'big' or 'little' endian gives the same result for a single byte
To compare, you can do:
thatbyte == b'\xd3'
That is, compare a bytes object with another bytes object.
The shift operators << and >> work on int only, so convert to int before shifting.
To convert an int back to bytes (assuming it is in the range 0..255) you can use:
bytes([byteint]) # note the extra brackets!
As for improvements, I would suggest reading the whole binary file at once:
with open("COM6_200417.ubx", "rb") as f:
    allbytes = f.read()  # read all
for val in allbytes:
    # Do stuff with val, val is int !!!
    print(bytes([val]))
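Putting the pieces together, here is a rough sketch of finding the D3 marker and shifting the bytes that follow it. The sample data is copied from the output in the question; the field layout (two bytes after the marker, combined big-endian) is only an assumption for illustration and should be checked against the receiver's protocol documentation:

```python
# Sample bytes taken from the output shown in the question.
data = b'\x82\xc2\xe3\xb8\xe0\x00#\x13\x05!\xd3\x00\x13'

SYNC = 0xD3                       # marker byte to look for

pos = data.find(bytes([SYNC]))    # index of the first 0xD3, or -1 if absent
if pos != -1 and pos + 2 < len(data):
    # Assumed layout: the two bytes after the marker form one big-endian field.
    word = (data[pos + 1] << 8) | data[pos + 2]
    shifted = word >> 2           # shift right by 2 bits, not 2 bytes
    low6 = word & 0x3F            # keep only the lowest 6 bits
    print(pos, word, shifted, low6)
```

For a live stream the same comparison works one byte at a time: f.read(1) == b'\xd3', or f.read(1)[0] == 0xD3 once the result is known to be non-empty.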
I just finished creating a Huffman compression algorithm. I converted my compressed text from a string to a byte array with bytearray(), and I'm now attempting to decompress it. My only concern is that I cannot convert my byte array back into a string. Is there any built-in function I could use to convert my byte array (stored in a variable) back into a string? If not, is there a better way to convert my compressed string to something else? I attempted to use byte_array.decode() and I get this:
print("Index: ", Index) # The Index
# Substituting text with our compressed index
for x in range(len(TextTest)):
    TextTest[x] = Index[TextTest[x]]
NewText = ''.join(TextTest)
# print(NewText)
# NewText=int(NewText)
byte_array = bytearray()  # Converts the compressed string text to bytes
for i in range(0, len(NewText), 8):
    byte_array.append(int(NewText[i:i + 8], 2))
NewSize = ("Compressed file Size:",sys.getsizeof(byte_array),'bytes')
print(byte_array)
print(byte_array)
print(NewSize)
x=bytes(byte_array)
x.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 0: invalid start byte
You can use .decode('ascii') (leave the argument empty for UTF-8).
>>> print(bytearray("abcd", 'utf-8').decode())
abcd
Source : Convert bytes to a string?
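Note that .decode() only succeeds when the bytes really are encoded text. A byte such as 0x88 from the traceback is outside the ASCII range and is not a valid UTF-8 start byte, so packed Huffman output will generally not decode. A possible alternative, sketched here under the assumption that the original bit count is stored alongside the data, is to reverse the int(chunk, 2) packing from the question and rebuild the bit string (bit_string is a made-up example value):

```python
bit_string = '1000100001101'   # hypothetical Huffman output, 13 bits

# Pack: same approach as in the question, 8 bits per byte.
packed = bytearray(
    int(bit_string[i:i + 8], 2) for i in range(0, len(bit_string), 8)
)

# Unpack: every full byte becomes 8 bits. The final chunk may be shorter,
# so its width is recovered from the original bit count.
n_bits = len(bit_string)
last_width = n_bits % 8 or 8
chunks = [format(b, '08b') for b in packed[:-1]]
chunks.append(format(packed[-1], '0{}b'.format(last_width)))
restored = ''.join(chunks)

print(packed, restored)
```

Without the stored bit count, leading zeros in the last chunk cannot be recovered, which is why n_bits has to travel with the compressed data in this sketch.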
I am building a parser, and I'm kind of new to this.
I have a problem decoding specific bytes: they always return the same int (and they shouldn't), so I must be doing it wrong.
byte = ser.read(1)
byte += ser.read(ser.inWaiting())
a = 0
for i in byte:
    if i == 0x04:
        value = struct.unpack("<h", bytes([i, a]))[0]
        print(value)
I receive bytes like this:
b'\xaa\x04\x80\x02\xff\xfb\x83\xaa\xaa\x04\x80\
And I need to decode packet 0x04. I am using Python 3.6
Try something like :
value = int.from_bytes(byte, byteorder='little')
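The code in the question always prints the same number because bytes([i, a]) is always b'\x04\x00' (the marker itself plus the constant 0), which unpacks to 4. A hedged sketch of decoding the bytes that follow the marker instead; it assumes, purely for illustration, that 0x04 is followed by a 2-byte little-endian signed value, which the question does not confirm:

```python
import struct

# Sample taken from the question; the real stream comes from the serial port.
buf = b'\xaa\x04\x80\x02\xff\xfb\x83\xaa'

values = []
for idx, b in enumerate(buf):
    # Assumption: 0x04 is followed by a 2-byte little-endian signed payload.
    if b == 0x04 and idx + 2 < len(buf):
        payload = buf[idx + 1:idx + 3]
        values.append(struct.unpack('<h', payload)[0])

# int.from_bytes gives the same result for the same two bytes.
same = int.from_bytes(buf[2:4], byteorder='little', signed=True)

print(values, same)
```

If a payload byte can itself equal 0x04, a simple scan like this will misfire, so a real parser would usually track packet boundaries (for example via the 0xaa bytes, if those are frame markers).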
Suppose I have a number like 824 and I write it to a text file using Python. In the text file, it will take 3 bytes of space. However, if I represent it using bits, it has the representation 0000001100111000, which is 2 bytes (16 bits). I was wondering how I can write bits to a file in Python, not bytes. If I can do that, the size of the file will be 2 bytes, not 3.
Please provide code. I am using Python 2.6. Also, I do not want to use any external modules that do not come with the basic installation.
I tried the code below and it gave me 12 bytes!
a =824;
c=bin(a)
handle = open('try1.txt','wb')
handle.write(c)
handle.close()
The struct module is what you want. From your example, 824 = 0000001100111000 binary or 0338 hexadecimal. This is the two bytes 03H and 38H. struct.pack will convert 824 to a string of these two bytes, but you also have to decide little-endian (write the 38H first) or big-endian (write the 03H first).
Example
>>> import struct
>>> struct.pack('>H',824) # big-endian
'\x038'
>>> struct.pack('<H',824) # little-endian
'8\x03'
>>> struct.pack('H',824) # Use system default
'8\x03'
struct returns a two-byte string. The '\x##' notation means a byte with hexadecimal value ##. The '8' is an ASCII '8' (value 38H). Python byte strings use ASCII for printable characters, and \x## notation for unprintable characters.
Below is an example writing and reading binary data to a file. You should always specify the endian-ness when writing to and reading from a binary file, in case it is read on a system with a different endian default:
import struct
a = 824
bin_data = struct.pack('<H', a)
print 'bin_data length:', len(bin_data)
with open('data.bin', 'wb') as f:
    f.write(bin_data)
with open('data.bin', 'rb') as f:
    bin_data = f.read()
print 'Value from file:', struct.unpack('<H', bin_data)[0]
print 'bin_data representation:', repr(bin_data)
for i, c in enumerate(bin_data):
    print 'Byte {0} as binary: {1:08b}'.format(i, ord(c))
Output
bin_data length: 2
Value from file: 824
bin_data representation: '8\x03'
Byte 0 as binary: 00111000
Byte 1 as binary: 00000011
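The example above is Python 2, as requested in the question. For readers on Python 3, a rough equivalent is sketched below; it uses io.BytesIO as an in-memory stand-in for the file so that it runs without creating data.bin, which is a choice made for this sketch only:

```python
import io
import struct

a = 824
bin_data = struct.pack('<H', a)   # a bytes object in Python 3: b'8\x03'

buf = io.BytesIO()                # in-memory stand-in for open('data.bin', 'wb')
buf.write(bin_data)

raw = buf.getvalue()
value = struct.unpack('<H', raw)[0]
# Iterating over bytes yields ints in Python 3, so ord() is not needed.
lines = ['Byte {0} as binary: {1:08b}'.format(i, b) for i, b in enumerate(raw)]

print(len(raw), value)
print('\n'.join(lines))
```

The main differences are that print is a function, struct.pack returns bytes rather than str, and the per-byte ord() call goes away.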
Have a look at struct:
>>> struct.pack("h", 824)
'8\x03'
I think what you want is to open the file in binary mode:
open("file.bla", "wb")
However, this will write an integer to the file, which will probably be 4 bytes in size. I do not know if Python has a 2-byte integer type, but you can work around that by encoding two 16-bit numbers in one 32-bit number:
a = 824
b = 1234
c = (a << 16) + b
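To get the two numbers back out, shift and mask. A short sketch, assuming both values are unsigned and fit in 16 bits (high and low are illustrative names):

```python
a = 824
b = 1234
c = (a << 16) + b      # a in the high 16 bits, b in the low 16 bits

high = c >> 16         # recovers a
low = c & 0xFFFF       # recovers b

print(c, high, low)
```

If either value exceeds 0xFFFF, the two halves overlap and the round trip no longer works.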