How can I unpack binary hex formatted data in Python?

How can I unpack binary hex formatted data in Python? - python

Using the PHP pack() function, I have converted a string into a binary hex representation:
$string = md5(time); // 32 character length
$packed = pack('H*', $string);
The H* formatting means "Hex string, high nibble first".
To unpack this in PHP, I would simply use the unpack() function with the H* format flag.
How would I unpack this data in Python?

There's an easy way to do this with the binascii module:
>>> import binascii
>>> print binascii.hexlify("ABCZ")
'4142435a'
>>> print binascii.unhexlify("4142435a")
'ABCZ'
Unless I'm misunderstanding something about the nibble ordering (high-nibble first is the default… anything different is insane), that should be perfectly sufficient!
Furthermore, Python's hashlib.md5 objects have a hexdigest() method to automatically convert the MD5 digest to an ASCII hex string, so that this method isn't even necessary for MD5 digests. Hope that helps.

There's no corresponding "hex nibble" code for struct.pack, so you'll either need to manually pack into bytes first, like:
hex_string = 'abcdef12'
hexdigits = [int(x, 16) for x in hex_string]
data = ''.join(struct.pack('B', (high <<4) + low)
for high, low in zip(hexdigits[::2], hexdigits[1::2]))
Or better, you can just use the hex codec. ie.
>>> data = hex_string.decode('hex')
>>> data
'\xab\xcd\xef\x12'
To unpack, you can encode the result back to hex similarly
>>> data.encode('hex')
'abcdef12'
However, note that for your example, there's probably no need to take the round-trip through a hex representation at all when encoding. Just use the md5 binary digest directly. ie.
>>> x = md5.md5('some string')
>>> x.digest()
'Z\xc7I\xfb\xee\xc96\x07\xfc(\xd6f\xbe\x85\xe7:'
This is equivalent to your pack()ed representation. To get the hex representation, use the same unpack method above:
>>> x.digest().decode('hex')
'acbd18db4cc2f85cedef654fccc4a4d8'
>>> x.hexdigest()
'acbd18db4cc2f85cedef654fccc4a4d8'
[Edit]: Updated to use better method (hex codec)

In Python you use the struct module for this.
>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
>>> calcsize('hhl')
8
HTH

Related

Convert hex string to usably hex

I want to convert a hex string I read from a file
"0xbffffe43" to a value written in little endian "\x43\xfe\xff\xbf".
I've tried using struct.pack but it requires a valid integer. Everytime I try to cast hex functions it will convert the 43. I need this for an assignment around memory exploits.
I have access to python 2.7
a = "0xbffffe43"
...
out = "\x43\xfe\xff\xbf"
Is what I want to achieve

You have a string in input. You can convert it to an integer using int and a base.
>>> a = "0xbffffe43"
>>> import struct
>>> out = struct.pack("<I",int(a,16))
>>> out
b'C\xfe\xff\xbf'
The b prefix is there because solution was tested with python 3. But it works as python 2 as well.
C is printed like this because python interprets printable characters. But
>>> b'C\xfe\xff\xbf' == b'\x43\xfe\xff\xbf'
True
see:
Convert hex string to int in Python
Convert a Python int into a big-endian string of bytes

You can try doing:
my_hex = 0xbffffe43
my_little_endian = my_hex.to_bytes(4, 'little')
print(my_little_endian)

in Python, trying to convert integer to character and put in a binary "string"

In Python 3.3 I need to convert an integer into the middle of three bytes to send it over a serial connection.
That is, I need to have a value of: b'\x4c\x00\x46', except that the \x00 byte will need to take the single-byte value of an integer variable that may vary from 0 to 255. I thought chr(value) would work, but that gives a string rather than a byte.
For example, if value is 255, I want to get b'\x4c\xff\x46'.

Using bytearray:
>>> b'\x4c\x00\x46'
b'L\x00F'
>>> a = bytearray(b'\x4c\x00\x46')
>>> a[1] = 255
>>> a
bytearray(b'L\xffF')
>>> bytes(a)
b'L\xffF'
You can also use list in place of bytearray. But using list does not work in Python 2.x.

Convert decimal int to little endian string ('\x##\x##...')

I want to convert an integer value to a string of hex values, in little endian. For example, 5707435436569584000 would become '\x4a\xe2\x34\x4f\x4a\xe2\x34\x4f'.
All my googlefu is finding for me is hex(..) which gives me '0x4f34e24a4f34e180' which is not what I want.
I could probably manually split up that string and build the one I want but I'm hoping somone can point me to a better option.

You need to use the struct module:
>>> import struct
>>> struct.pack('<Q', 5707435436569584000)
'\x80\xe14OJ\xe24O'
>>> struct.pack('<Q', 5707435436569584202)
'J\xe24OJ\xe24O'
Here < indicates little-endian, and Q that we want to pack a unsigned long long (8 bytes).
Note that Python will use ASCII characters for any byte that falls within the printable ASCII range to represent the resulting bytestring, hence the 14OJ, 24O and J parts of the above result:
>>> struct.pack('<Q', 5707435436569584202).encode('hex')
'4ae2344f4ae2344f'
>>> '\x4a\xe2\x34\x4f\x4a\xe2\x34\x4f'
'J\xe24OJ\xe24O'

I know it is an old thread, but it is still useful. Here my two cents using python3:
hex_string = hex(5707435436569584202) # '0x4f34e24a4f34e180' as you said
bytearray.fromhex(hex_string[2:]).reverse()
So, the key is convert it to a bytearray and reverse it.
In one line:
bytearray.fromhex(hex(5707435436569584202)[2:])[::-1] # bytearray(b'J\xe24OJ\xe24O')
PS: You can treat "bytearray" data like "bytes" and even mix them with b'raw bytes'
Update:
As Will points in coments, you can also manage negative integers:
To make this work with negative integers you need to mask your input with your preferred int type output length. For example, -16 as a little endian uint32_t would be bytearray.fromhex(hex(-16 & (2**32-1))[2:])[::-1], which evaluates to bytearray(b'\xf0\xff\xff\xff')

CRC32 checksum in Python with hex input

I'm wanting to calculate the CRC32 checksum of a string of hex values in python. I found zlib.crc32(data) and binascii.crc32(data), but all the examples I found using these functions have 'data' as a string ('hello' for example). I want to pass hex values in as data and find the checksum. I've tried setting data as a hex value (0x18329a7e for example) and I get a TypeError: must be string or buffer, not int. The function evaluates when I make the hex value a string ('0x18329a7e' for example), but I don't think it's evaluating the correct checksum. Any help would be appreciated. Thanks!

I think you are looking for binascii.a2b_hex():
>>> binascii.crc32(binascii.a2b_hex('18329a7e'))
-1357533383

>>> import struct,binascii
>>> ncrc = lambda numVal: binascii.crc32(struct.pack('!I', numVal))
>>> ncrc(0x18329a7e)
-1357533383

Try converting the list of hex values to a string:
t = ['\x18', '\x32', '\x9a', '\x7e']
chksum = binascii.crc32(str(t))

How to convert hexadecimal string to bytes in Python?

I have a long Hex string that represents a series of values of different types. I need to convert this Hex String into bytes or bytearray so that I can extract each value from the raw data. How can I do this?
For example, the string "ab" should convert to the bytes b"\xab" or equivalent byte array. Longer example:
>>> # what to use in place of `convert` here?
>>> convert("8e71c61de6a2321336184f813379ec6bf4a3fb79e63cd12b")
b'\x8eq\xc6\x1d\xe6\xa22\x136\x18O\x813y\xeck\xf4\xa3\xfby\xe6<\xd1+'

Suppose your hex string is something like
>>> hex_string = "deadbeef"
Convert it to a bytearray (Python 3 and 2.7):
>>> bytearray.fromhex(hex_string)
bytearray(b'\xde\xad\xbe\xef')
Convert it to a bytes object (Python 3):
>>> bytes.fromhex(hex_string)
b'\xde\xad\xbe\xef'
Note that bytes is an immutable version of bytearray.
Convert it to a string (Python ≤ 2.7):
>>> hex_data = hex_string.decode("hex")
>>> hex_data
"\xde\xad\xbe\xef"

There is a built-in function in bytearray that does what you intend.
bytearray.fromhex("de ad be ef 00")
It returns a bytearray and it reads hex strings with or without space separator.

provided I understood correctly, you should look for binascii.unhexlify
import binascii
a='45222e'
s=binascii.unhexlify(a)
b=[ord(x) for x in s]

Assuming you have a byte string like so
"\x12\x45\x00\xAB"
and you know the amount of bytes and their type you can also use this approach
import struct
bytes = '\x12\x45\x00\xAB'
val = struct.unpack('<BBH', bytes)
#val = (18, 69, 43776)
As I specified little endian (using the '<' char) at the start of the format string the function returned the decimal equivalent.
0x12 = 18
0x45 = 69
0xAB00 = 43776
B is equal to one byte (8 bit) unsigned
H is equal to two bytes (16 bit) unsigned
More available characters and byte sizes can be found here
The advantages are..
You can specify more than one byte and the endian of the values
Disadvantages..
You really need to know the type and length of data your dealing with

You can use the Codecs module in the Python Standard Library, i.e.
import codecs
codecs.decode(hexstring, 'hex_codec')

You should be able to build a string holding the binary data using something like:
data = "fef0babe"
bits = ""
for x in xrange(0, len(data), 2)
bits += chr(int(data[x:x+2], 16))
This is probably not the fastest way (many string appends), but quite simple using only core Python.

A good one liner is:
byte_list = map(ord, hex_string)
This will iterate over each char in the string and run it through the ord() function. Only tested on python 2.6, not too sure about 3.0+.
-Josh

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How can I unpack binary hex formatted data in Python? - python

In Python you use the struct module for this. >>> from struct import * >>> pack('hhl', 1, 2, 3) '\x00\x01\x00\x02\x00\x00\x00\x03' >>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03') (1, 2, 3) >>> calcsize('hhl') 8 HTH

Related

Convert hex string to usably hex

in Python, trying to convert integer to character and put in a binary "string"

Convert decimal int to little endian string ('\x##\x##...')

CRC32 checksum in Python with hex input

How to convert hexadecimal string to bytes in Python?

Categories

Resources