Convert hex-string to string using binascii - python

By hex-string, it is a regular string except every two characters represents some byte, which is mapped to some ASCII char.
So for example the string
abc
Would be represented as
979899
I am looking at the binascii module but don't really know how to take the hex-string and turn it back into the ascii string.
Which method can I use?
Note: I am starting with 979899 and want to convert it back to abc

You can use ord() to get the integer value of each character:
>>> map(ord, 'abc')
[97, 98, 99]
>>> ''.join(map(lambda c: str(ord(c)), 'asd'))
'979899'
>>> ''.join((str(ord(c)) for c in 'abc'))
'979899'

You don't need binascii to get the integer representation of a character in a string, all you need is the built in function ord().
s = 'abc'
print(''.join(map(lambda x:str(ord(x)),s))) # outputs "979899"

To get the string back from the hexadecimal number you can use
s=str(616263)
print "".join([chr(int(s[x:x+2], 16)) for x in range(0,len(s),2)])
See http://ideone.com/dupgs

Related

Python format hex number

I need to send a string via tcp. One of the first sections of the string is the length of the command variable
Example:
command = STATUS?UPDATE
I need to send the following string below
sendCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'+command+'\x0D\x0A'
My string length is 11 so I need STRINGLENGTH to be the hex equivalent of 11, which is 0xB, except that I need it to output as \x0B
Padding it with the leading 0 is easy, but I cannot get it to output as \x instead of 0x, and if I do a string replace it is treated as text and not as hex, so it doesn't work.
My final hex string should be:
\x00\x00\x00\x0B\x02\x53\x54\x41\x54\x55\x53\x3f\x55\x53\x45\x52\x0D\x0A
I am instead getting:
\x00\x00\x000x0B\x02\x53\x54\x41\x54\x55\x53\x3f\x55\x53\x45\x52\x0D\x0A
Any ideas on how to format it correctly?
So, this is a bit of a round-about fashion, but use a bytes object:
>>> STRINGLENGTH = bytes([11]).decode()
>>> endCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'
>>> endCommand
'\x00\x00\x00\x0b\x02'
Almost certainly, you are going to want to change your str object back to a bytes object, but the above should get you going.
I suspect what you were doing was using the hex function:
>>> STRINGLENGTH = hex(11)
>>> endCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'
>>> endCommand
'\x00\x00\x000xb\x02'
The fundamental thing you need to understand is that you aren't working with "hex", you are working with bytes. Hex is just how bytes are traditionally represented. The hex helper function returns a hexadecimal representation, as a string of an integer. But that isn't what you want. You want the byte corresponding to the value 11.
Note, for the ascii-range, chr(i) might works as well, so
>>> STRINGLENGTH = chr(11)
>>> endCommand = '\x00\x00\x00'+STRINGLENGTH+'\x02'
>>> endCommand
'\x00\x00\x00\x0b\x02'
But be careful, say you wanted the number 129, you have to care about the encoding...
>>> chr(129)
'\x81'
But in bytes, in UTF-8, that's actually represented by two different bytes
>>> chr(129).encode()
b'\xc2\x81'
>>> list(chr(129).encode())
[194, 129]
Which of course, depends on the encoding:
>>> chr(129).encode('latin')
b'\x81'
>>> list(chr(129).encode('latin'))
[129]
>>>
For that reason, I think it is safer to stick with the slightly wordier:
>>> bytes([129])
b'\x81'

How do I convert hex to utf-8?

I want to convert a hex string to utf-8
a = '0xb3d9'
to
동 (http://www.unicodemap.org/details/0xB3D9/index.html)
First, obtain the integer value from the string of a, noting that a is expressed in hexadecimal:
a_int = int(a, 16)
Next, convert this int to a character. In python 2 you need to use the unichr method to do this, because the chr method can only deal with ASCII characters:
a_chr = unichr(a_int)
Whereas in python 3 you can just use the chr method for any character:
a_chr = chr(a_int)
So, in python 3, the full command is:
a_chr = chr(int(a, 16))

Print bytes to hex

I want to encode string to bytes.
To convert to byes, I used byte.fromhex()
>>> byte.fromhex('7403073845')
b't\x03\x078E'
But it displayed some characters.
How can it be displayed as hex like following?
b't\x03\x078E' => '\x74\x03\x07\x38\x45'
I want to encode string to bytes.
bytes.fromhex() already transforms your hex string into bytes. Don't confuse an object and its text representation -- REPL uses sys.displayhook that uses repr() to display bytes in ascii printable range as the corresponding characters but it doesn't affect the value in any way:
>>> b't' == b'\x74'
True
Print bytes to hex
To convert bytes back into a hex string, you could use bytes.hex method since Python 3.5:
>>> b't\x03\x078E'.hex()
'7403073845'
On older Python version you could use binascii.hexlify():
>>> import binascii
>>> binascii.hexlify(b't\x03\x078E').decode('ascii')
'7403073845'
How can it be displayed as hex like following? b't\x03\x078E' => '\x74\x03\x07\x38\x45'
>>> print(''.join(['\\x%02x' % b for b in b't\x03\x078E']))
\x74\x03\x07\x38\x45
The Python repr can't be changed. If you want to do something like this, you'd need to do it yourself; bytes objects are trying to minimize spew, not format output for you.
If you want to print it like that, you can do:
from itertools import repeat
hexstring = '7403073845'
# Makes the individual \x## strings using iter reuse trick to pair up
# hex characters, and prefixing with \x as it goes
escapecodes = map(''.join, zip(repeat(r'\x'), *[iter(hexstring)]*2))
# Print them all with quotes around them (or omit the quotes, your choice)
print("'", *escapecodes, "'", sep='')
Output is exactly as you requested:
'\x74\x03\x07\x38\x45'

How to remove '\x' from a hex string in Python?

I'm reading a wav audio file in Python using wave module. The readframe() function in this library returns frames as hex string. I want to remove \x of this string, but translate() function doesn't work as I want:
>>> input = wave.open(r"G:\Workspace\wav\1.wav",'r')
>>> input.readframes (1)
'\xff\x1f\x00\xe8'
>>> '\xff\x1f\x00\xe8'.translate(None,'\\x')
'\xff\x1f\x00\xe8'
>>> '\xff\x1f\x00\xe8'.translate(None,'\x')
ValueError: invalid \x escape
>>> '\xff\x1f\x00\xe8'.translate(None,r'\x')
'\xff\x1f\x00\xe8'
>>>
Any way I want divide the result values by 2 and then add \x again and generate a new wav file containing these new values. Does any one have any better idea?
What's wrong?
Indeed, you don't have backslashes in your string. So, that's why you can't remove them.
If you try to play with each hex character from this string (using ord() and len() functions - you'll see their real values. Besides, the length of your string is just 4, not 16.
You can play with several solutions to achieve your result:
'hex' encode:
'\xff\x1f\x00\xe8'.encode('hex')
'ff1f00e8'
Or use repr() function:
repr('\xff\x1f\x00\xe8').translate(None,r'\\x')
One way to do what you want is:
>>> s = '\xff\x1f\x00\xe8'
>>> ''.join('%02x' % ord(c) for c in s)
'ff1f00e8'
The reason why translate is not working is that what you are seeing is not the string itself, but its representation. In other words, \x is not contained in the string:
>>> '\\x' in '\xff\x1f\x00\xe8'
False
\xff, \x1f, \x00 and \xe8 are the hexadecimal representation of for characters (in fact, len(s) == 4, not 24).
Use the encode method:
>>> s = '\xff\x1f\x00\xe8'
>>> print s.encode("hex")
'ff1f00e8'
As this is a hexadecimal representation, encode with hex
>>> '\xff\x1f\x00\xe8'.encode('hex')
'ff1f00e8'

I want one backslash - not two

I have a string that after print is like this: \x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71
But I want to change this string to "\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71" which is not printable (it is necessary to write to serial port). I know that it ist problem with '\'. how can I replace this printable backslashes to unprintable?
If you want to decode your string, use decode() with 'string_escape' as parameter which will interpret the literals in your variable as python literal string (as if it were typed as constant string in your code).
mystr.decode('string_escape')
Use decode():
>>> st = r'\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71'
>>> print st
\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71
>>> print st.decode('string-escape')
MÿýfHq
That last garbage is what my Python prints when trying to print your unprintable string.
You are confusing the printable representation of a string literal with the string itself:
>>> c = '\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71'
>>> c
'M\xff\xfd\x00\x02\x8f\x0e\x80fHq'
>>> len(c)
11
>>> len('\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71')
11
>>> len(r'\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71')
44
your_string.decode('string_escape')

Categories

Resources