I am writing a Python program where I process a file byte-by-byte, and I am trying to write a function that splits a byte into its upper and lower halves. To elaborate, let's say I want to run this function on the byte with the decimal value 18 and the hexadecimal value 12. I would want it to be split into two bytes with values of 1 and 2.
Here is a function I wrote to do this:
# split byte into upper and lower halves
def splitByte(b):
    lowerMask = b'\x0F'
    lowerHalf = bytes(b & lowerMask[0])[0]
    upperMask = b'\xF0'
    upperHalf = bytes(b & upperMask[0])[0]
    upperHalf = upperHalf >> 4
    return [upperHalf,lowerHalf]
Here is where I am calling the function:
info = stream.read(1)
result = splitByte(info[0])
print(result)
However, when I run a file with just the above code and the function, the following occurs:
[0, 0]
[0, 0]
[0, 0]
[0, 0]
Traceback (most recent call last):
  File "./test.py", line 8, in <module>
    result = splitByte(info[0])
  File "<home folder>/byteops.py", line 21, in splitByte
    lowerHalf = bytes(b & lowerMask[0])[0]
IndexError: index out of range
Not only is the function returning 0 for both values, but it errors out on some inputs, with an 'index out of range' error. For context, here is the file I'm reading from, as viewed in a hex editor:
00000000: 4C 49 54 35 30 0A 09 09 02 01
I am running Manjaro Linux with Python 3.7.1. How should I fix my splitByte function, or is there a library function that does it for me?
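Your problem is the conversion from int to bytes. bytes(2) is a request for a bytes object of two zero bytes, not a single byte with the value 2. So bytes(n)[0] is always 0, and when the masked value is 0 the result is empty and indexing it raises the IndexError. You can simply use the int manipulations you already know: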
# split byte into upper and lower halves
def splitByte(b):
    lowerHalf = b & 15
    upperHalf = (b >> 4) & 15
    return [upperHalf,lowerHalf]

result = splitByte(18)
print(result)
Output:
[1, 2]
I left these as integers, since your original program only needs the two halves of the byte, not a bytes object.
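As for a built-in that does this: a minimal sketch, assuming each value is an int in the range 0 to 255, is divmod, since dividing by 16 gives the upper half as the quotient and the lower half as the remainder. The name split_byte is made up for illustration, and the sample bytes are the first five from the hex dump in the question.

```python
def split_byte(b):
    """Return (upper, lower) 4-bit halves of an int in the range 0..255."""
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a value between 0 and 255")
    # Quotient is the upper half, remainder is the lower half.
    return divmod(b, 16)

# Iterating over a bytes object in Python 3 yields ints directly.
sample = bytes([0x4C, 0x49, 0x54, 0x35, 0x30])
halves = [split_byte(b) for b in sample]
print(halves)
```

Note that the fifth byte, 0x30, has a lower half of 0, which is the input that triggered the IndexError in the original function.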
There is a much simpler way to do that. You can use the ord function to convert a single character to its ASCII value (as a decimal integer). Then you use the hex function to convert this value into a hexadecimal string. You can now easily access the upper and lower digits of your value.
Here is an example:
val = 'a'
print(hex(ord(val))[2]) # 6
print(hex(ord(val))[3]) # 1
You get 6 and 1 because the hexadecimal value of 'a' is 0x61.
Now, if you already get the integer value of each byte of your source file, you can drop the ord call:
val = 97
print(hex(val)[2]) # 6
print(hex(val)[3]) # 1
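One caveat with indexing the result of hex(): it does not pad its output, so hex(9) is '0x9' and index 3 does not exist for values below 16. A minimal sketch of a padded variant, assuming the value is an int between 0 and 255; the helper name hex_digits is made up for this example.

```python
def hex_digits(val):
    """Return the two hex digits of a byte value as an (upper, lower) pair of strings."""
    text = format(val, '02x')  # zero-padded to two digits, no '0x' prefix
    return text[0], text[1]

print(hex_digits(97))  # ('6', '1')
print(hex_digits(9))   # ('0', '9')
```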
Related
I want to combine two bytes (8 bits each) to form a signed value (one bit for the sign and 15 for the value) using the two's complement method.
I receive the MSByte (note that the leftmost bit of the MSByte is the sign bit) and the LSByte. So I wrote a function that shifts the MSByte left by 8 bits and adds the LSByte to form a 16-bit binary sequence. Then I calculate the ones' complement and finally add 1 to the result. However, it does not work.
def twos_comp_two_bytes(msb, lsb):
    a = (msb<<8) + lsb
    r = ~(a)+1
    return r

For example, 0b1111110111001001 is -567, but with the above function I get -64969.
EDIT : call of the function
twos_comp_two_bytes(0b11111101,0b11001001) => -64969
Python integers can have any length; they are not restricted to 16 bits, so negating the value does not wrap around. To get -567 you would instead need
r = a - (256*256)
but that must only be applied when the sign bit is set, so other values need a little more code:
def twos_comp_two_bytes(msb, lsb):
    a = (msb<<8) + lsb
    if a >= (256*256)//2:
        a = a - (256*256)
    return a

print(twos_comp_two_bytes(0b11111101, 0b11001001))
print(twos_comp_two_bytes(0b0, 0b0))
print(twos_comp_two_bytes(0b0, 0b1))
print(twos_comp_two_bytes(0b10000000, 0b0))
print(twos_comp_two_bytes(0b10000000, 0b1))
Results:
-567
0
1
-32768
-32767
It would be better to use the struct module for this:
import struct

def twos_comp_two_bytes(msb, lsb):
    return struct.unpack('>h', bytes([msb, lsb]))[0]
    #return struct.unpack('<h', bytes([lsb, msb]))[0] # different order `[lsb, msb]`
    #return struct.unpack( 'h', bytes([lsb, msb]))[0] # different order `[lsb, msb]`

print(twos_comp_two_bytes(0b11111101, 0b11001001))
print(twos_comp_two_bytes(0b0, 0b0))
print(twos_comp_two_bytes(0b0, 0b1))
print(twos_comp_two_bytes(0b10000000, 0b0))
print(twos_comp_two_bytes(0b10000000, 0b1))
Results:
-567
0
1
-32768
-32767
The letter h means a short integer (a signed int with 2 bytes).
The characters > and < specify the byte order (big-endian and little-endian, respectively).
See Format Characters in the struct documentation for more.
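On Python 3, another option is int.from_bytes with signed=True, which interprets the bytes as a two's complement number directly. This is a minimal sketch, assuming msb and lsb are ints in the range 0 to 255; the function name is reused from the question purely for illustration.

```python
def twos_comp_two_bytes(msb, lsb):
    # 'big' means the most significant byte comes first.
    return int.from_bytes(bytes([msb, lsb]), 'big', signed=True)

print(twos_comp_two_bytes(0b11111101, 0b11001001))  # -567
print(twos_comp_two_bytes(0b10000000, 0b0))         # -32768
```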
I ask a measurement device to give me some data. First it tells me how many bytes of data are in its storage; it is always 14. Then it gives me the data, which I have to convert to hex. This is Python 2.7; I can't use newer versions. Lines 6 to 10 tell the device to give me the measured data.
Lines 12 to 14 do the conversion to hex. It works in other programs, but when I print result (line 14) I get a hex number with 13 bytes plus 1, which cannot be correct because it has an L at the end. I guess it marks some kind of long. I don't need the last byte, but I think it also changes the data that is sliced out from line 15 onward, first as hex and then converted to int.
Is it possible that the L has an effect on the data or not?
How can I fix it?
1 ap.write(b"ML\0")
rmemb = ap.read(2)
print(rmemb)
rmemb = int(rmemb)+1
5 rmem = rmemb #must be and is 14 Bytes
addmem = ("MR:%s\0" % rmem)
# addmem = ("MR:14\0")
ap.write(addmem.encode())
10 time.sleep(1)
test = ap.read(rmem)
result = hex(int(test.encode('hex'), 16))
print(result)
15 ftflash = result[12:20]
ftbg = result[20:28]
print(ftflash)
print(ftbg)
ftflash = int(ftflash, 16)
20 # print(ftflash)
ftbg = int(ftbg, 16)
# print(ftbg)
OUTPUT:
14
0x11bd5084c0b000001ce00000093L
b000001c
e0000009
Python 2 has two built-in integer types, int and long. hex returns a string representing a Python hexadecimal literal, and in Python 2, that means that longs get an L at the end, to signify that it's a long.
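The L is only a suffix on the string and does not change any digits. One way to avoid it, sketched here under assumptions, is to skip the round trip through int and hex and slice the hex text of the raw bytes directly. hex(int(...)) also drops leading zero digits, which shifts every slice offset, whereas the hex text of the raw bytes always has exactly two digits per byte. The sample bytes below are hypothetical (chosen to resemble the printed output, assuming the first byte was 0x01), the helper field and its offsets are made up, and binascii.hexlify is used because it exists in both Python 2.7 and Python 3.

```python
import binascii

# Hypothetical 14-byte reply from the device (not real measurement data).
raw = binascii.unhexlify("011bd5084c0b000001ce00000093")

# Two hex digits per byte: no '0x' prefix, no trailing 'L', leading zeros kept.
hex_text = binascii.hexlify(raw).decode('ascii')
print(len(hex_text))

def field(hex_text, first_byte, byte_count):
    """Return the hex digits of byte_count bytes starting at byte first_byte."""
    return hex_text[2 * first_byte: 2 * (first_byte + byte_count)]

# The byte offsets here are illustrative only; use the device's real layout.
value = int(field(hex_text, 5, 4), 16)
print(value)
```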
I am writing a Python script to hide data in an image. It hides the message bits in the two least significant bits of the red channel of each pixel in a .PNG file. The script works fine for lowercase letters but fails when the message contains a full stop. It produces this error:
Traceback (most recent call last):
  File "E:\Python\Steganography\main.py", line 65, in <module>
    print(unhide('coded-img.png'))
  File "E:\Python\Steganography\main.py", line 60, in unhide
    message = bin2str(binary)
  File "E:\Python\Steganography\main.py", line 16, in bin2str
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 6: invalid start byte
Here is my code:
from PIL import Image

def str2bin(message):
    binary = bin(int.from_bytes(message.encode('utf-8'), 'big'))
    return binary[2:]

def bin2str(binary):
    n = int(binary, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()

def hide(filename, message):
    image = Image.open(filename)
    binary = str2bin(message) + '00000000'
    data = list(image.getdata())
    newData = []
    index = 0
    for pixel in data:
        if index < len(binary):
            pixel = list(pixel)
            pixel[0] >>= 2
            pixel[0] <<= 2
            pixel[0] += int('0b' + binary[index:index+2], 2)
            pixel = tuple(pixel)
            index += 2
        newData.append(pixel)
    print(binary)
    image.putdata(newData)
    image.save('coded-'+filename, 'PNG')

def unhide(filename):
    image = Image.open(filename)
    data = image.getdata()
    binary = '0'
    index = 0
    while binary[-8:] != '00000000':
        binary += bin(data[index][0])[-2:]
        index += 1
    binary = binary[:-1]
    print(binary)
    print(index*2)
    message = bin2str(binary)
    return message

hide('img.png', 'alpha.')
print(unhide('coded-img.png'))
Please help. Thanks!
There are at least two problems with your code.
The first problem is that your encoding can be misaligned by 1 bit since leading null bits are not included in the output of the bin() function:
>>> bin(int.from_bytes('a'.encode('utf-8'), 'big'))[2:]
'1100001'
# This string is of odd length and (when joined with the terminating '00000000')
# turns into a still odd-length '110000100000000' which is then handled by your
# code as if there was an extra trailing zero (so that the length is even).
# During decoding you compensate for that effect with the
#
# binary = binary[:-1]
#
# line. The latter is responsible for your stated problem when the binary
# representation of your string is in fact of even length and doesn't need
# the extra bit as in the below example:
>>> bin(int.from_bytes('.'.encode('utf-8'), 'big'))[2:]
'101110'
It is better to pad your binary string to an even length by prepending an extra zero bit (if needed).
The other problem is that, while restoring the hidden message, the stopping condition binary[-8:] == '00000000' can be satisfied prematurely when the trailing bits of one symbol are concatenated with the leading bits of the next (partially restored) symbol. This can happen, for example, in the following cases:
the symbol @ (ASCII code 64, i.e. 6 low-order bits unset) followed by any character with an ASCII code less than 64 (i.e. with the 2 highest-order bits unset);
a space character (ASCII code 32, i.e. 4 low-order bits unset) followed by a linefeed/newline character (ASCII code 10, i.e. 4 high-order bits unset).
You can fix that bug by requiring that a full byte is decoded at the time when the last 8 bits appear to all be unset:
while not (len(binary) % 8 == 0 and binary[-8:] == '00000000'):
# ...
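Both problems become easier to handle if every byte of the message is always written as a full 8-bit group. A minimal sketch of replacement str2bin and bin2str helpers, assuming UTF-8 text and a bit string whose length is a multiple of 8 (detecting and stripping the terminator is left to the caller):

```python
def str2bin(message):
    # format(byte, '08b') keeps leading zeros, so every byte is exactly 8 bits.
    return ''.join(format(byte, '08b') for byte in message.encode('utf-8'))

def bin2str(binary):
    if len(binary) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(binary[i:i + 8], 2) for i in range(0, len(binary), 8))
    return data.decode('utf-8')

bits = str2bin('alpha.')
print(len(bits))      # 48
print(bin2str(bits))  # alpha.
```

With an encoding like this, the decoder would presumably start from an empty string and remove the final eight terminator bits rather than a single bit.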
Here's the code snippet from the RFID Wiegand reader that I already use on my Raspberry Pi.
def main():
    set_procname("Wiegand Reader")
    global bits
    global timeout
    GPIO.add_event_detect(D0, GPIO.FALLING, callback=one)
    GPIO.add_event_detect(D1, GPIO.FALLING, callback=zero)
    GPIO.add_event_detect(S1, GPIO.FALLING, callback=unlockDoor)
    while 1:
        if bits:
            timeout = timeout -1
            time.sleep(0.001)
            if len(bits) > 1 and timeout == 0:
                #print "Binary:", int(str(bits),2)
                c1 = int(str(bits),2)
                result = ((~c1) >> 1) & 0x0FFFFFF;
                checkAccess(result, doorID)
        else:
            time.sleep(0.001)

if __name__ == '__main__':
    main()
On a normal USB RFID reader, I get 0000119994, and that's what's printed on the card. But with this code it reads 119994. I've tried multiple cards. It always drops the zeros at the front.
I even tried a card with a zero in the middle of the number, 0000120368, and it shows 120368.
I thought it was taking off the first 4 characters, but then I tried a key fob that only had 3 zeros in front, 0004876298, and it reads 4876298. It only drops the leading zeros.
Integers do not store leading zeros, so Python drops them whenever it displays a number. This applies to decimal and binary output alike. For example:
>>> a = 0003
>>> a
3
>>> b = 0b0011
>>> bin(b)
'0b11'
From what I see, all RFIDs have 10 digits. You can write a simple function that prepends the missing zeros and stores the value as a string:
def rfid_formatter(value):
    str_value = str(value)
    for s in range(10 - len(str_value)):
        str_value = "0" + str_value
    return str_value

Your test cases:
print rfid_formatter(120368)
print "0000120368"
print rfid_formatter(4876298)
print "0004876298"
As mentioned already, leading zeros are removed in binary sequences and also when you explicitly convert a string to an integer using int().
What hasn't been mentioned already is that, in Python 2.x, integers with leading zeros are treated as octal values.
>>> a = 0003
>>> a
3
>>> a = 000127
>>> a
87
Since this was causing confusion, the implicit octal conversion was removed in Python 3, and a non-zero decimal literal written with leading zeros now raises a SyntaxError.
>>> a = 000127
File "<stdin>", line 1
a = 000127
^
SyntaxError: invalid token
>>>
You can read the rationale behind these decisions in PEP 3127.
Anyway, I mention all of this simply to arrive at an assumption: you're probably not working with octal representations. Instead, I think you're converting result to a string in checkAccess so you can do a string comparison. If this assumption is correct, you can simply use the string method zfill (zero fill):
>>> str(119994).zfill(10)
'0000119994'
>>>
>>> str(4876298).zfill(10)
'0004876298'
>>>
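If result stays an integer until it is displayed or compared, a format specification does the same padding. A small sketch, assuming card numbers are non-negative and shown with 10 digits as in the question; format_card is a made-up name.

```python
def format_card(number, width=10):
    # '0{width}d' pads a decimal integer with leading zeros up to the given width.
    return format(number, '0{}d'.format(width))

print(format_card(119994))   # 0000119994
print(format_card(4876298))  # 0004876298
```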
I am facing a little corner case of the famous struct.pack.
The situation is the following: I have a DLL with a thin Python wrapper layer. One of the Python methods in the wrapper accepts a byte array as an argument. This byte array is the representation of a register on a specific hardware bus. Each bus has a different register width, typically 8, 16 or 24 bits (the alignment is the same in all cases).
When calling this method I need to convert my value (whatever that is) to a byte array of 8, 16 or 24 bits. Such a conversion is relatively easy for 8 or 16 bits using struct.pack:
byteList = struct.pack( '>B', regValue ) # For 8 bits case
byteList = struct.pack( '>H', regValue ) # for 16 bits case
I am now looking to make it flexible enough for all three cases: 8, 16 and 24 bits. I could use a mix of the two previous lines to handle the three cases, but I find that quite ugly.
I was hoping this would work:
packformat = ">{0}B".format(regSize)
byteList = struct.pack( packformat, regValue )
But it does not, because struct.pack expects as many arguments as the format string has items.
Any idea how I can (neatly) convert my register value into an arbitrary number of bytes?
You are always packing unsigned integers, and only big endian to boot. Take a look at what happens when you pack them:
>>> import struct
>>> struct.pack('>B', 255)
'\xff'
>>> struct.pack('>H', 255)
'\x00\xff'
>>> struct.pack('>I', 255)
'\x00\x00\x00\xff'
Essentially the value is padded with null bytes at the start. Use this to your advantage:
>>> struct.pack('>I', 255)[-3:]
'\x00\x00\xff'
>>> struct.pack('>I', 255)[-2:]
'\x00\xff'
>>> struct.pack('>I', 255)[-1:]
'\xff'
With this approach you no longer get an exception if your value is too large for the register, but it simplifies your code enormously. You can always add a separate validation step:
def packRegister(value, size):
    if value < 0 or value.bit_length() > size:
        raise ValueError("Value won't fit in register of size {} bits".format(size))
    return struct.pack('>I', value)[-(size // 8):]

Demo:
>>> packRegister(255, 8)
'\xff'
>>> packRegister(1023, 16)
'\x03\xff'
>>> packRegister(324353, 24)
'\x04\xf3\x01'
>>> packRegister(324353, 8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in packRegister
ValueError: Value won't fit in register of size 8 bits
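On Python 3 (3.2 or later), int.to_bytes covers this case directly and raises OverflowError when the value does not fit, so no slicing is needed. A minimal sketch, assuming the register width is a multiple of 8 bits; pack_register is a hypothetical name mirroring the function above.

```python
def pack_register(value, size_bits):
    if size_bits % 8 != 0:
        raise ValueError("register size must be a multiple of 8 bits")
    # Big-endian and unsigned: raises OverflowError if value is negative or too large.
    return value.to_bytes(size_bits // 8, 'big')

print(pack_register(255, 8))      # b'\xff'
print(pack_register(324353, 24))  # b'\x04\xf3\x01'
```

Unlike the struct-based version, this is not limited to 32 bits, since the byte count is passed explicitly.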