Print bytes to hex - python

I want to encode a string to bytes.
To convert to bytes, I used bytes.fromhex():
>>> bytes.fromhex('7403073845')
b't\x03\x078E'
But some of the bytes are displayed as characters.
How can it be displayed as hex, like the following?
b't\x03\x078E' => '\x74\x03\x07\x38\x45'

I want to encode string to bytes.
bytes.fromhex() already converts your hex string into bytes. Don't confuse an object with its text representation: the REPL uses sys.displayhook, which calls repr(), and repr() shows bytes in the printable ASCII range as the corresponding characters. That doesn't affect the value in any way:
>>> b't' == b'\x74'
True
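To make the point concrete, here is a small check (a sketch using only built-ins) showing that the printable characters in the repr are just another spelling of the same byte values:

```python
data = bytes.fromhex('7403073845')

# The repr mixes printable ASCII characters with \x escapes...
print(repr(data))    # b't\x03\x078E'

# ...but the underlying values are plain integers in range(256).
print(list(data))    # [116, 3, 7, 56, 69]

# Writing every byte as an escape produces an equal object.
assert data == b'\x74\x03\x07\x38\x45'
```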
Print bytes to hex
To convert bytes back into a hex string, you can use the bytes.hex() method (available since Python 3.5):
>>> b't\x03\x078E'.hex()
'7403073845'
On older Python versions you could use binascii.hexlify():
>>> import binascii
>>> binascii.hexlify(b't\x03\x078E').decode('ascii')
'7403073845'
How can it be displayed as hex like following? b't\x03\x078E' => '\x74\x03\x07\x38\x45'
>>> print(''.join(['\\x%02x' % b for b in b't\x03\x078E']))
\x74\x03\x07\x38\x45

The Python repr can't be changed. If you want to do something like this, you'd need to do it yourself; bytes objects are trying to minimize spew, not format output for you.
If you want to print it like that, you can do:
from itertools import repeat
hexstring = '7403073845'
# Makes the individual \x## strings using iter reuse trick to pair up
# hex characters, and prefixing with \x as it goes
escapecodes = map(''.join, zip(repeat(r'\x'), *[iter(hexstring)]*2))
# Print them all with quotes around them (or omit the quotes, your choice)
print("'", *escapecodes, "'", sep='')
Output is exactly as you requested:
'\x74\x03\x07\x38\x45'
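If you need this formatting in more than one place, it may be convenient to wrap it in a pair of small helpers. The names escape_all and unescape_all are made up for this sketch; they are not part of the standard library:

```python
def escape_all(data: bytes) -> str:
    """Return every byte of `data` written as a \\xNN escape."""
    return ''.join(f'\\x{b:02x}' for b in data)


def unescape_all(text: str) -> bytes:
    """Inverse of escape_all: turn the text '\\x74\\x03' back into bytes."""
    return bytes.fromhex(text.replace('\\x', ''))


print(escape_all(bytes.fromhex('7403073845')))   # \x74\x03\x07\x38\x45
```

Note that the result of escape_all is ordinary text that happens to contain backslashes; it is only useful for display, not as a replacement for the bytes object itself.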

Related

Python format hex number

I need to send a string via tcp. One of the first sections of the string is the length of the command variable
Example:
command = STATUS?USER
I need to send the following string below
sendCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'+command+'\x0D\x0A'
My string length is 11 so I need STRINGLENGTH to be the hex equivalent of 11, which is 0xB, except that I need it to output as \x0B
Padding it with the leading 0 is easy, but I cannot get it to output as \x instead of 0x, and if I do a string replace it is treated as text and not as hex, so it doesn't work.
My final hex string should be:
\x00\x00\x00\x0B\x02\x53\x54\x41\x54\x55\x53\x3f\x55\x53\x45\x52\x0D\x0A
I am instead getting:
\x00\x00\x000x0B\x02\x53\x54\x41\x54\x55\x53\x3f\x55\x53\x45\x52\x0D\x0A
Any ideas on how to format it correctly?
This is a bit roundabout, but use a bytes object:
>>> STRINGLENGTH = bytes([11]).decode()
>>> endCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'
>>> endCommand
'\x00\x00\x00\x0b\x02'
Almost certainly, you are going to want to change your str object back to a bytes object, but the above should get you going.
I suspect what you were doing was using the hex function:
>>> STRINGLENGTH = hex(11)
>>> endCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'
>>> endCommand
'\x00\x00\x000xb\x02'
The fundamental thing you need to understand is that you aren't working with "hex", you are working with bytes. Hex is just how bytes are traditionally represented. The hex() built-in returns the hexadecimal representation of an integer as a string. But that isn't what you want. You want the byte corresponding to the value 11.
Note that for the ASCII range, chr(i) works as well:
>>> STRINGLENGTH = chr(11)
>>> endCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'
>>> endCommand
'\x00\x00\x00\x0b\x02'
But be careful, say you wanted the number 129, you have to care about the encoding...
>>> chr(129)
'\x81'
But encoded as UTF-8, that character is actually represented by two bytes:
>>> chr(129).encode()
b'\xc2\x81'
>>> list(chr(129).encode())
[194, 129]
Which of course, depends on the encoding:
>>> chr(129).encode('latin')
b'\x81'
>>> list(chr(129).encode('latin'))
[129]
>>>
For that reason, I think it is safer to stick with the slightly wordier:
>>> bytes([129])
b'\x81'
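Putting this together, one way to build the whole message as bytes from the start is sketched below. build_frame is a hypothetical helper name, and the layout (three zero bytes, one length byte, 0x02, the command, CR LF) is taken from the question. The example uses the 11-character command that the expected hex in the question spells out (STATUS?USER), and assumes the command is pure ASCII:

```python
import struct


def build_frame(command: str) -> bytes:
    """Hypothetical helper: 3 zero bytes, 1 length byte, 0x02, command, CR LF."""
    payload = command.encode('ascii')
    if len(payload) > 255:
        raise ValueError('command does not fit in a single length byte')
    return b'\x00\x00\x00' + bytes([len(payload)]) + b'\x02' + payload + b'\r\n'


frame = build_frame('STATUS?USER')
print(frame.hex())   # 0000000b025354415455533f555345520d0a

# If the four leading bytes are really one big-endian 32-bit length field
# (an assumption; the question doesn't say), struct gives the same prefix:
assert frame[:4] == struct.pack('>I', 11)
```

The resulting bytes object can be handed to socket.sendall() directly, with no str-to-bytes conversion step where an encoding could change the data.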

replace padding in base64 encoding in python 3

import base64
s = "05052020"
python2.7
base64.b64encode(s)
output is string 'MDUwNTIwMjA='
python 3.7
base64.b64encode(b"05052020")
output is bytes
b'MDUwNTIwMjA='
I want to replace = with "a"
s = str(base64.b64encode(b"05052020"))[2:-1]
s = s.replace("=", "a")
I realise this is a dirty way of doing it, so how can I do it better?
EDIT:
Expected result:
A Python 3 str with the padding replaced.
In Python 3, a byte string supports almost the same methods as a unicode string (except for encode/decode). So you can just do:
s = base64.b64encode(b"05052020").replace(b'=', b'a')
to get the b'MDUwNTIwMjAa' byte string.
If you want a Unicode string, just decode it:
s = base64.b64encode(b"05052020").replace(b'=', b'a').decode()
will give 'MDUwNTIwMjAa' as a plain (unicode) Python 3 string.
Why do you need to replace the padding? If the = character breaks something, just remove it; these characters carry no information, and the base64 data can be decoded perfectly well once the padding is restored.
When decoding, you may append a few = characters just in case (at most 2 are ever needed, and extra padding characters don't break anything):
>>> import base64
>>> base64.b64encode(b'aa')
b'YWE='
>>> base64.b64decode('YWE==')
b'aa'
>>> base64.b64decode('YWE===')
b'aa'
>>> base64.b64decode('YWE======')
b'aa'
>>>
On the other hand, replacing the padding with a character that is itself a valid base64 character might corrupt your decoded data:
>>> base64.b64encode(b'aa')
b'YWE='
>>> base64.b64decode('YWEa')
b'aa\x1a'
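As a sketch of the "just remove the padding" approach in Python 3, the two helpers below strip the = on encoding and restore exactly the right amount on decoding. The function names are invented for this example:

```python
import base64


def b64encode_nopad(data: bytes) -> str:
    """Base64-encode and drop the trailing '=' padding."""
    return base64.b64encode(data).rstrip(b'=').decode('ascii')


def b64decode_nopad(text: str) -> bytes:
    """Restore the padding (0 to 2 '=' characters) and decode."""
    return base64.b64decode(text + '=' * (-len(text) % 4))


token = b64encode_nopad(b'05052020')
print(token)                    # MDUwNTIwMjA
print(b64decode_nopad(token))   # b'05052020'
```

The expression -len(text) % 4 gives the number of characters missing to reach a multiple of four, which is how many = were stripped.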

Python 3: Get Bytes from File

I'm trying to get bytes from a png file in python 3, and print a string showing the bytes from the png file. However, it gives me this output:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00(\x00\x00\x00(\x08\x02\x00\x00\x00\x03\x9c/:\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x04gAMA\x00\x00\xb1\x8f\x0b\xfca\x05\x00\x00\x00\tpHYs\x00\x00\x0e\xc3\x00\x00\x0e\xc3\x01\xc7o\xa8d\x00\x00\x01XIDATXG\xe5\xcd\xb1m\x031\x14\x04\xd1\xebF\xad\xb8+\xd5\xe0\x8a\xe5`f\x19|,.\xa0\x0fL\xf4\xc0h\x08.\xafo\xf5>\xc8/a;\xc2/a;\xc2/a;\xc2/a;\xc2/a\x0b\xebC\x1c\r+la}\x88\xa3a\x85-\x88\xbf?\xff=p4\xac\xb0\x05q\xacl\x1c8\x1aV\xd8\x828V6\x0e\x1c\r+lA\x1c+\x1b\x07\x8e\x86\x15\xb6 \x8e\x95\x8d\x03G\xc3\n[\x10\xeb\xca\xbd\xfa\xc4\xd1\xb0\xc2\x16\xc4\xbar\xaf>q4\xac\xb0\x05\xb1\xae|\xde\xafz\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xf7\xea\x13G\xc3\n[\x10\xeb\xca\xbd\xfa\xc4\xd1\xb0\xc2\x16\xc4\xb1\xb2q\xe0hXa\x0b\xe2X\xd98p4\xac\xb0\x05q\xacl\x1c8\x1aV\xd8\x828V6\x0e\x1c\r+lA\x1c+\x1b\x07\x8e\x86\x15\xb6\xb0>\xc4\xd1\xb0\xc2\x16\xd6\x878\x1aV\xd8\x8e\xf0K\xd8\x8e\xf0K\xd8\x8e\xf0K\xd8\x8e\xf0K\xd8\x8e\xf0\xcb/s]\x7f\xf8o$|7\xc4\xdf\xeb\x00\x00\x00\x00IEND\xaeB`\x82'
instead of normal bytes (here are the bytes it should show): 89504E470D0A1A0A0000000D4948445200000028000000280802000000039C2F3A000000017352474200AECE1CE90000000467414D410000B18F0BFC6105000000097048597300000EC300000EC301C76FA86400000158494441545847E5CDB16D03311404D1EB46ADB82BD5E08AE56066197C2C2EA00F4CF4C068082EAF6FF53EC82F613BC22F613BC22F613BC22F613BC22F610BEB431C0D2B6C617D88A361852D88BF3FFF3D7034ACB00571AC6C1C381A56D8823856360E1C0D2B6C411C2B1B078E8615B6208E958D0347C30A5B10EBCABDFAC4D1B0C216C4BA72AF3E7134ACB005B1AE7CDEAF7AB8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BE3BF75B8AD4F1C0D2B6C41AC2BF7EA1347C30A5B10EBCABDFAC4D1B0C216C4B1B271E06858610BE258D9387034ACB00571AC6C1C381A56D8823856360E1C0D2B6C411C2B1B078E8615B6B03EC4D1B0C216D687381A56D88EF04BD88EF04BD88EF04BD88EF04BD88EF0CB2F735D7FF86F247C37C4DFEB0000000049454E44AE426082
Here is the code that I wrote to do this:
fileread = input("Input File: ")
with open(fileread, 'rb') as readfile:
    string = str(readfile.read())
    readfile.close()
print("String: "+string)
newstr = str(bytes(string, 'utf-8').decode('utf-8'))
Can anyone help me?
You've got it right. It's just showing the ASCII representation of the data as that's usually the more useful form
>>> s = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00(\x00\x00\x00(\x08\x02\x00\x00\x00\x03\x9c/:\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x04gAMA\x00\x00\xb1\x8f\x0b\xfca\x05\x00\x00\x00\tpHYs\x00\x00\x0e\xc3\x00\x00\x0e\xc3\x01\xc7o\xa8d\x00\x00\x01XIDATXG\xe5\xcd\xb1m\x031\x14\x04\xd1\xebF\xad\xb8+\xd5\xe0\x8a\xe5`f\x19|,.\xa0\x0fL\xf4\xc0h\x08.\xafo\xf5>\xc8/a;\xc2/a;\xc2/a;\xc2/a;\xc2/a\x0b\xebC\x1c\r+la}\x88\xa3a\x85-\x88\xbf?\xff=p4\xac\xb0\x05q\xacl\x1c8\x1aV\xd8\x828V6\x0e\x1c\r+lA\x1c+\x1b\x07\x8e\x86\x15\xb6 \x8e\x95\x8d\x03G\xc3\n[\x10\xeb\xca\xbd\xfa\xc4\xd1\xb0\xc2\x16\xc4\xbar\xaf>q4\xac\xb0\x05\xb1\xae|\xde\xafz\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xe3\xbfu\xb8\xadO\x1c\r+lA\xac+\xf7\xea\x13G\xc3\n[\x10\xeb\xca\xbd\xfa\xc4\xd1\xb0\xc2\x16\xc4\xb1\xb2q\xe0hXa\x0b\xe2X\xd98p4\xac\xb0\x05q\xacl\x1c8\x1aV\xd8\x828V6\x0e\x1c\r+lA\x1c+\x1b\x07\x8e\x86\x15\xb6\xb0>\xc4\xd1\xb0\xc2\x16\xd6\x878\x1aV\xd8\x8e\xf0K\xd8\x8e\xf0K\xd8\x8e\xf0K\xd8\x8e\xf0K\xd8\x8e\xf0\xcb/s]\x7f\xf8o$|7\xc4\xdf\xeb\x00\x00\x00\x00IEND\xaeB`\x82'
>>> s[0]
137
>>> s[1]
80
>>> s[2]
78
>>> hex(s[0])
'0x89'
>>> hex(s[1])
'0x50'
>>> hex(s[2])
'0x4e'
>>>
You don't need the UTF-8 decode step, since this is just binary data.
If you actually want an ASCII representation of the data in hex form, to match what you have in the question, you could use:
>>> ''.join('%02x' % c for c in s)
'89504e470d0a1a0a0000000d4948445200000028000000280802000000039c2f3a000000017352474200aece1ce90000000467414d410000b18f0bfc6105000000097048597300000ec300000ec301c76fa86400000158494441545847e5cdb16d03311404d1eb46adb82bd5e08ae56066197c2c2ea00f4cf4c068082eaf6ff53ec82f613bc22f613bc22f613bc22f613bc22f610beb431c0d2b6c617d88a361852d88bf3fff3d7034acb00571ac6c1c381a56d8823856360e1c0d2b6c411c2b1b078e8615b6208e958d0347c30a5b10ebcabdfac4d1b0c216c4ba72af3e7134acb005b1ae7cdeaf7ab8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2be3bf75b8ad4f1c0d2b6c41ac2bf7ea1347c30a5b10ebcabdfac4d1b0c216c4b1b271e06858610be258d9387034acb00571ac6c1c381a56d8823856360e1c0d2b6c411c2b1b078e8615b6b03ec4d1b0c216d687381a56d88ef04bd88ef04bd88ef04bd88ef04bd88ef0cb2f735d7ff86f247c37c4dfeb0000000049454e44ae426082'
You're getting the bytes fine; you just want to print them differently from the default Python method (which uses characters for printable ASCII codes so you can read them more easily). Just iterate over the bytes and format them however you like:
for byte in string:
    print(("%02x" % byte).upper(), end="")
If the file isn't too large, you could also do it with one print() call by doing the formatting all at once and printing that:
print("".join(("%02x" % byte).upper() for byte in string))
This will build a string using approximately 6 times the amount of memory as your file before printing it. Use the first method if this could be a problem.
Actually, I just remembered... Python has a module for this!
from binascii import hexlify
# hexlify() returns bytes; decode so print() shows plain hex rather than b'...'
print(hexlify(string).upper().decode('ascii'))
This will actually use even more memory, since it converts the letters in the hex string to uppercase after building it, but if you're OK with lowercase letters in your hex, this is probably the best solution.
BTW, it's advisable not to call what you read from your file string; it's binary data, not text.
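For large files, a middle ground between the per-byte loop and the all-at-once approaches is to read and convert the file in chunks. The sketch below assumes Python 3.5+ (for bytes.hex()). The function names and the default chunk size are arbitrary choices for illustration:

```python
def iter_hex_upper(path, chunk_size=4096):
    """Yield the file's contents as uppercase hex, one chunk at a time."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk.hex().upper()


def print_hex_upper(path):
    """Print the whole file as one line of uppercase hex."""
    for piece in iter_hex_upper(path):
        print(piece, end='')
    print()
```

Only one chunk and its hex text are held in memory at a time, so memory use stays bounded regardless of file size.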

How to remove '\x' from a hex string in Python?

I'm reading a wav audio file in Python using the wave module. The readframes() function in this library returns frames as a hex string. I want to remove the \x from this string, but the translate() function doesn't work as I expect:
>>> input = wave.open(r"G:\Workspace\wav\1.wav",'r')
>>> input.readframes (1)
'\xff\x1f\x00\xe8'
>>> '\xff\x1f\x00\xe8'.translate(None,'\\x')
'\xff\x1f\x00\xe8'
>>> '\xff\x1f\x00\xe8'.translate(None,'\x')
ValueError: invalid \x escape
>>> '\xff\x1f\x00\xe8'.translate(None,r'\x')
'\xff\x1f\x00\xe8'
>>>
Anyway, I want to divide the resulting values by 2, then add the \x back and generate a new wav file containing these new values. Does anyone have a better idea?
What's wrong?
Indeed, there are no backslashes in your string, which is why you can't remove them.
If you inspect the characters of this string (using the ord() and len() functions), you'll see their real values. Besides, the length of your string is just 4, not 16.
You can play with several solutions to achieve your result:
'hex' encode:
'\xff\x1f\x00\xe8'.encode('hex')
'ff1f00e8'
Or use repr() function:
repr('\xff\x1f\x00\xe8').translate(None,r'\\x')
One way to do what you want is:
>>> s = '\xff\x1f\x00\xe8'
>>> ''.join('%02x' % ord(c) for c in s)
'ff1f00e8'
The reason why translate is not working is that what you are seeing is not the string itself, but its representation. In other words, \x is not contained in the string:
>>> '\\x' in '\xff\x1f\x00\xe8'
False
\xff, \x1f, \x00 and \xe8 are the hexadecimal representations of four characters (in fact, len(s) == 4, not 16).
Use the encode method:
>>> s = '\xff\x1f\x00\xe8'
>>> print s.encode("hex")
ff1f00e8
As this is a hexadecimal representation, encode with the hex codec:
>>> '\xff\x1f\x00\xe8'.encode('hex')
'ff1f00e8'
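The answers above are written for Python 2. In Python 3, readframes() returns a bytes object, str.encode('hex') no longer exists, and bytes.hex() gives the hex text instead. For the actual goal of halving the values, no hex text is needed at all. The sketch below assumes 16-bit signed little-endian PCM samples (the question does not state the sample width, so check getsampwidth() on the real file). halve_samples is a made-up name:

```python
import struct


def halve_samples(frames: bytes) -> bytes:
    """Halve each 16-bit little-endian signed sample in `frames`.

    Assumes len(frames) is a multiple of 2 (whole samples only).
    """
    count = len(frames) // 2
    samples = struct.unpack('<%dh' % count, frames)
    # Floor division keeps every result an integer within the 16-bit range.
    return struct.pack('<%dh' % count, *(s // 2 for s in samples))


frames = b'\xff\x1f\x00\xe8'
print(frames.hex())                  # ff1f00e8
print(halve_samples(frames).hex())   # ff0f00f4
```

The returned bytes can be passed straight to writeframes() on a wave object opened for writing; there is no \x to add back, because it was never part of the data.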

Python: decode bin using base64

If I have a binary string, let's say str = "010100011010101001001101100101100110101", which is a base64-encoded version of some other string, how can I decode it?
It would have been great if your example string were actually something meaningful rather than something made up, which makes this question rather unclear, but I will try my best to work out what you might have meant, as verbosely as possible.
Assuming your actual input is a str that looks like this:
s = '101100101010011010000100011000001011010010110000100111000110000'
You can get the hexadecimal form of this by first converting it to an int using the base keyword argument:
>>> i = int(s, base=2) # 6436561067884170800
Then turn it back into a string by formatting it like so:
>>> h = '%x' % i # '595342305a584e30'
Then use the binascii.a2b_hex function on the hexadecimal string to get the raw bytes:
>>> b64 = binascii.a2b_hex(h) # b'YSB0ZXN0'
If it is some valid base 64 encoded stream of bytes, you may then use base64.b64decode on that to get the actual bytes
>>> r = base64.b64decode(b64) # b'a test'
To turn that into a string, apply the correct codec to it (i.e. use bytes.decode).
Finally, if you cared to know how I generated that input, this is all the above, reversed into a single one-line function:
>>> '{0:b}'.format(int(binascii.b2a_hex(base64.b64encode(b'a test')), base=16))
'101100101010011010000100011000001011010010110000100111000110000'
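The same steps can be collapsed into one function that skips the intermediate hex text by using int.to_bytes. This is a sketch under the same assumption as above, namely that the bits are the big-endian binary form of base64 text, possibly with leading zero bits dropped. decode_bit_string is a name invented here:

```python
import base64


def decode_bit_string(bits: str) -> bytes:
    """Interpret `bits` as big-endian binary of base64 text and decode it."""
    n = int(bits, 2)
    # Round the bit count up to whole bytes so dropped leading zeros are restored.
    raw = n.to_bytes((len(bits) + 7) // 8, 'big')
    return base64.b64decode(raw)


s = '101100101010011010000100011000001011010010110000100111000110000'
print(decode_bit_string(s))   # b'a test'
```

Going through to_bytes also avoids a pitfall of the '%x' route: if the number happens to format to an odd number of hex digits, binascii.a2b_hex rejects it.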
