How to convert a string containing a bytes literal into a string in Python

I have a string that contains the text of a bytes literal, like this:
string1 = "b'\xe6\x88\x91\xe4\xbb\xac \xe7\xb4\xa2\xe8\xa6\x81 \xe6\x8e\xa8\xe5\xb9\xbf \xe7\x9a\x84 \xe6\x98\xaf\xe4\xb8\x80 \xe5\xbe\x97 \xe6\x96\xb9\xe6\x96\xb9 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\xba\xe5\x9f\xba \xe7\xa1\x80 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\x80 \xe5\x8c\x97\xe4\xba\xac \xe5\xb7\xb2 \xe5\x9b\xa0 \xe4\xb8\xba \xe6\xa0\x87\xe5\x87\x86 \xe7\x9a\x84 \xe6\x99\xae\xe9\x80\x9a \xe8\xaf\x9d \xe4\xbb\x96 \xe4\xbb\x8e \xe5\x84\xbf\xe7\xab\xa5 \xe6\x97\xb6\xe4\xbb\xa3 \xe8\xb5\xb7 \xe5\xb0\xb1 \xe5\x96\x9c\xe6\xac\xa2 \xe4\xb8\x8b \xe5\x9b\xb4\xe6\xa3\x8b \xe5\x9c\xa8 \xe5\x8d\x81\xe4\xba\x94 \xe5\xb2\x81 \xe7\x9a\x84 \xe6\x97\xb6\xe5\x80\x99 \xe5\xb0\xb1 \xe6\x98\xaf\xe6\x9c\x89 \xe5\x90\x8d \xe5\x85\xb6 \xe5\xb0\x91 \xe4\xba\x86'"
I want to convert this string into a real bytes object so that I can call decode() on it and get the normal text.

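First, prefix the literal with r to make it a raw string, so that each \x escape is kept as literal characters (backslash, x, and two hex digits) instead of being collapsed into a single character. Then ast.literal_eval() can parse the text as a bytes literal.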
import ast
string1 = r"b'\xe6\x88\x91\xe4\xbb\xac \xe7\xb4\xa2\xe8\xa6\x81 \xe6\x8e\xa8\xe5\xb9\xbf \xe7\x9a\x84 \xe6\x98\xaf\xe4\xb8\x80 \xe5\xbe\x97 \xe6\x96\xb9\xe6\x96\xb9 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\xba\xe5\x9f\xba \xe7\xa1\x80 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\x80 \xe5\x8c\x97\xe4\xba\xac \xe5\xb7\xb2 \xe5\x9b\xa0 \xe4\xb8\xba \xe6\xa0\x87\xe5\x87\x86 \xe7\x9a\x84 \xe6\x99\xae\xe9\x80\x9a \xe8\xaf\x9d \xe4\xbb\x96 \xe4\xbb\x8e \xe5\x84\xbf\xe7\xab\xa5 \xe6\x97\xb6\xe4\xbb\xa3 \xe8\xb5\xb7 \xe5\xb0\xb1 \xe5\x96\x9c\xe6\xac\xa2 \xe4\xb8\x8b \xe5\x9b\xb4\xe6\xa3\x8b \xe5\x9c\xa8 \xe5\x8d\x81\xe4\xba\x94 \xe5\xb2\x81 \xe7\x9a\x84 \xe6\x97\xb6\xe5\x80\x99 \xe5\xb0\xb1 \xe6\x98\xaf\xe6\x9c\x89 \xe5\x90\x8d \xe5\x85\xb6 \xe5\xb0\x91 \xe4\xba\x86'"
bytes1 = ast.literal_eval(string1)
print(bytes1.decode('utf8')) # 我们 索要 ...
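If the text only arrives at runtime (for example, read from a file or produced by calling str() on a bytes object), there is no literal to put an r in front of, and the backslashes are already plain characters. A minimal sketch under that assumption follows; the helper name bytes_repr_to_text is hypothetical and not part of the answer above.

```python
import ast


def bytes_repr_to_text(text, encoding="utf-8"):
    """Hypothetical helper: parse the text of a bytes literal, then decode it."""
    value = ast.literal_eval(text)  # evaluates Python literals only, not arbitrary code
    if not isinstance(value, bytes):
        raise TypeError("expected the text of a bytes literal")
    return value.decode(encoding)


# str() on a bytes object yields exactly this kind of string at runtime.
sample = str("我们".encode("utf-8"))
print(bytes_repr_to_text(sample))
```

The isinstance check is there because ast.literal_eval() will happily return an int, list, or str if the input happens to be one of those literals instead.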

Related

Python Hex values in ascii encoded string

I have a problem in Python reading a string from a .txt file.
The file contains this data: \xce\x97
It is encoded in ASCII (the same as "\\xce\\x97" written as a Python string).
I want to decode it as UTF-8.
file = open("file.txt", "r")
a = file.read()  # a = "\\xce\\x97"
file.close()
The correct value of this string is "Η" (a Greek letter, capital eta).
I can use
>>> a = b'\xce\x97'
>>> print(a.decode("utf-8"))
Η
How can I do it using the variable a?
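Decode the backslash escapes first, then turn the resulting characters back into bytes with latin-1 and decode those bytes as UTF-8:
a = "\\xce\\x97"
print(a.encode().decode('unicode-escape').encode("latin-1").decode('utf-8'))
Η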
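The chain above is easier to follow when each step is named. The sketch below wraps it in a hypothetical helper, unescape_utf8, and assumes the input text contains only ASCII characters, as in the question.

```python
def unescape_utf8(escaped):
    # Hypothetical helper; assumes `escaped` contains only ASCII characters.
    raw = escaped.encode("ascii")         # the escape text as bytes
    chars = raw.decode("unicode_escape")  # each \xNN escape becomes one character
    data = chars.encode("latin-1")        # latin-1 maps code points 0-255 to the same byte values
    return data.decode("utf-8")           # finally interpret those bytes as UTF-8


a = "\\xce\\x97"  # what file.read() returns for the file in the question
print(unescape_utf8(a))
```

Encoding with "ascii" instead of the default makes the helper fail loudly on non-ASCII input, which the unicode_escape step would otherwise mangle silently.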

How to join hex values

I am reading four bytes from a file and would like to join them:
g = f.read(60)
f.seek(60)
k60 = f.read(1)
print('byte60', k60)
k61 = f.read(1)
print('byte61', k61)
k62 = f.read(1)
print('byte62', k62)
k63 = f.read(1)
print('byte63', k63)
print(k63, k62, k61, k60)
print(b''.join([k63, k62, k61, k60]))
Result is:
b'\x00\x00\x00\x80'
I would like to receive:
00000080
To convert a byte string to its hex representation, you can use the hexlify() function from the binascii module:
>>> from binascii import hexlify
>>> ...
>>> raw = b''.join([k63,k62,k61,k60])
>>> print(hexlify(raw))
b'00000080'
>>> print(hexlify(raw).decode('ascii'))  # if you want to convert it to a string
00000080
The same could be accomplished by using codecs.encode(raw, 'hex').
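In Python 3 the four bytes can also be read in one call and converted without binascii. The sketch below uses an in-memory io.BytesIO object as a stand-in for the real file, which is an assumption made so the example runs on its own; the byte values match the output shown in the question.

```python
import io

# Stand-in for the real file: 60 filler bytes followed by 80 00 00 00.
f = io.BytesIO(b"\x00" * 60 + b"\x80\x00\x00\x00")

f.seek(60)
chunk = f.read(4)                        # bytes 60..63 in file order

reversed_hex = chunk[::-1].hex()         # reverse the byte order, then hex-encode
value = int.from_bytes(chunk, "little")  # or read them as a little-endian integer

print(reversed_hex)                      # 00000080
print(format(value, "08x"))              # 00000080
```

Reading the bytes as a little-endian integer is usually the better fit when the four bytes represent a number, since the value can then be used directly in arithmetic.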

Is there a better way to unpack a binary string in Python

At the moment I have a byte stream of a string that is received by my Python code and must be converted into a string. For now I managed to extract each character, convert them and append them to a string individually. The code looks something like this:
import struct
# The byte stream is received and stored in byte_stream
text = ''
i = 0
while i < len(byte_stream):
    text = text + struct.unpack('c', byte_stream[i])[0]
    i += 1
print(text)
But that surely cannot be the most efficient way... Is there a more elegant way to achieve the same result?
From "Convert bytes to a Python string":
byte_stream = [112, 52, 52]
''.join(map(chr, byte_stream))
# 'p44'
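If byte_stream is already a bytes object in Python 3, its decode() method does the whole conversion in one step. A small sketch, assuming the data is ASCII text (substitute the real encoding if it differs):

```python
byte_stream = bytes([112, 52, 52])  # same values as the list above

text = byte_stream.decode("ascii")  # assumed encoding; use the real one
print(text)                         # p44
```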

Python3 print in hex representation

I can find lots of threads that tell me how to convert values to and from hex. I do not want to convert anything. Rather, I want to print the bytes I already have in hex representation, e.g.
byteval = '\x60'.encode('ASCII')
print(byteval) # b'\x60'
Instead when I do this I get:
byteval = '\x60'.encode('ASCII')
print(byteval) # b'`'
Because ` is the ASCII character that my byte corresponds to.
To clarify: type(byteval) is bytes, not string.
>>> print("b'" + ''.join('\\x{:02x}'.format(x) for x in byteval) + "'")
b'\x60'
See this:
hexify = lambda s: [hex(ord(i)) for i in list(str(s))]
And
print(hexify("abcde"))
# ['0x61', '0x62', '0x63', '0x64', '0x65']
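Another example (note that str() of a bytes object in Python 3 returns its repr, so the b and the quote characters are hexified too):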
byteval='\x60'.encode('ASCII')
hexify = lambda s: [hex(ord(i)) for i in list(str(s))]
print(hexify(byteval))
# ['0x62', '0x27', '0x60', '0x27']
Taken from https://helloacm.com/one-line-python-lambda-function-to-hexify-a-string-data-converting-ascii-code-to-hexadecimal/
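To get only the byte values rather than the characters of the repr, the bytes object can be iterated directly. This is a sketch for Python 3, where iterating over bytes yields integers:

```python
byteval = "\x60".encode("ASCII")

# Iterating over bytes in Python 3 yields integers, so ord() is not needed.
hex_list = [hex(b) for b in byteval]
escaped = "".join("\\x{:02x}".format(b) for b in byteval)

print(hex_list)       # ['0x60']
print(byteval.hex())  # 60
print(escaped)        # \x60
```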

How to define a long hex literal over several lines?

How can I define a very long hex literal over several lines in Python? E.g.
p = 0xB10B8F96 A080E01D DE92DE5E AE5D54EC 52C99FBC FB06A3C6
9A6A9DCA 52D23B61 6073E286 75A23D18 9838EF1E 2EE652C0
13ECB4AE A9061123 24975C3C D49B83BF ACCBDD7D 90C4BD70
98488E9C 219A7372 4EFFD6FA E5644738 FAA31A4F F55BCCC0
A151AF5F 0DC8B4BD 45BF37DF 365C1A65 E68CFDA7 6D4DA708
DF1FB2BC 2E4A4371
It would be nice if I can keep the spaces or another separator like _ too.
Here is one attempt, which saves it as a string, and then uses ast.literal_eval to calculate the actual number:
from ast import literal_eval
hex_string_literal = (
"0xB10B8F96" "A080E01D" "DE92DE5E" "AE5D54EC" "52C99FBC" "FB06A3C6"
"9A6A9DCA" "52D23B61" "6073E286" "75A23D18" "9838EF1E" "2EE652C0"
"13ECB4AE" "A9061123" "24975C3C" "D49B83BF" "ACCBDD7D" "90C4BD70"
"98488E9C" "219A7372" "4EFFD6FA" "E5644738" "FAA31A4F" "F55BCCC0"
"A151AF5F" "0DC8B4BD" "45BF37DF" "365C1A65" "E68CFDA7" "6D4DA708"
"DF1FB2BC" "2E4A4371")
p = literal_eval(hex_string_literal)
The string above is built using implicit string literal concatenation.
EDIT
As said by @nneonneo in the comments below, you could also use int(hex_string_literal, 16) or int(hex_string_literal, 0) in the example above, so that you don't have to import anything extra.
p = int(''.join('''
B10B8F96 A080E01D DE92DE5E AE5D54EC 52C99FBC FB06A3C6
9A6A9DCA 52D23B61 6073E286 75A23D18 9838EF1E 2EE652C0
13ECB4AE A9061123 24975C3C D49B83BF ACCBDD7D 90C4BD70
98488E9C 219A7372 4EFFD6FA E5644738 FAA31A4F F55BCCC0
A151AF5F 0DC8B4BD 45BF37DF 365C1A65 E68CFDA7 6D4DA708
DF1FB2BC 2E4A4371
'''.split()), 16)
You can use int(s, 16) to convert a hex string s to an integer.
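On the wish to keep a separator: in Python 3.6 and later, int() accepts single underscores between digits, and bytes.fromhex() skips spaces between bytes. The sketch below uses only the first row of the number from the question to stay short; the variable names are illustrative.

```python
# Python 3.6+: int() accepts single underscores between digits.
p_underscore = int(
    "B10B8F96_A080E01D_DE92DE5E_"
    "AE5D54EC_52C99FBC_FB06A3C6",
    16,
)

# bytes.fromhex() ignores the spaces; it needs an even number of hex digits.
p_spaces = int.from_bytes(
    bytes.fromhex(
        "B10B8F96 A080E01D DE92DE5E "
        "AE5D54EC 52C99FBC FB06A3C6"
    ),
    "big",
)

print(p_underscore == p_spaces)  # True
```

The underscore form also works directly in source code as a numeric literal (0xB10B8F96_A080E01D), though a single literal cannot span several lines.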
