Converting string to raw bytes - python

I wrote a program that works with raw bytes (I don't know if this is the right name!), but the user will input the data as plain strings.
How can I convert them?
I've tried with a method, but it returns a string with length 0!
Here's the starting string:
5A05705DC25CA15123C8E4750B80D0A9
Here's the result that I need:
\x5A\x05\x70\x5D\xC2\x5C\xA1\x51\x23\xC8\xE4\x75\x0B\x80\xD0\xA9
And here's the method I wrote:
def convertStringToByte(string):
    byte_char = "\\x"
    n=2
    result = ""
    bytesList = [string[i:i+n] for i in range(0, len(string), n)]
    for i in range(0, len(bytesList)):
        bytesList[i] = byte_char + bytesList[i]
    return result

Use binascii.unhexlify():
import binascii
binary = binascii.unhexlify(text)
The same module has binascii.hexlify() too, to mirror the operation.
Demo:
>>> import binascii
>>> binary = '\x5A\x05\x70\x5D\xC2\x5C\xA1\x51\x23\xC8\xE4\x75\x0B\x80\xD0\xA9'
>>> text = '5A05705DC25CA15123C8E4750B80D0A9'
>>> binary == binascii.unhexlify(text)
True
>>> text == binascii.hexlify(binary).upper()
True
The hexlify() operation produces lowercase hex, but that is easily fixed with a .upper() call.
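The demo above compares str values, which matches Python 2 behaviour. On Python 3 the same round trip can be done without an import, since bytes objects have fromhex() and hex() methods. A minimal sketch, assuming Python 3.5 or later and reusing the hex string from the question:

```python
# Python 3 sketch: hex text -> raw bytes -> hex text
text = '5A05705DC25CA15123C8E4750B80D0A9'

binary = bytes.fromhex(text)        # 16 raw bytes
round_trip = binary.hex().upper()   # hex() yields lowercase, like hexlify()

print(len(binary))
print(round_trip == text)
```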

You must get from 5A (a string representing a hexadecimal number) to 0x5A or 90 (integers) and feed them into chr(). You can do the first conversion with int('0x5A', 16), so you'll get something like:
chr(int('0x5A', 16))
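Applied to the whole string, that idea might look like the sketch below. The function name hex_to_bytes is made up for illustration. It assumes Python 3, where chr() returns text, so it builds a bytes object from the parsed integers instead of joining chr() results.

```python
def hex_to_bytes(hex_string):
    """Convert a string of hex digit pairs into raw bytes (hypothetical helper)."""
    if len(hex_string) % 2:
        raise ValueError("expected an even number of hex digits")
    # Take two characters at a time and parse each pair as a base-16 integer.
    pairs = (hex_string[i:i + 2] for i in range(0, len(hex_string), 2))
    return bytes(int(pair, 16) for pair in pairs)

result = hex_to_bytes('5A05705DC25CA15123C8E4750B80D0A9')
print(result)
```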

Related

Why does a string converted from an array behave differently from another initialized with the same value?

The goal of the program is converting the little_endian string to another string equal to clean_data_little_endian and then to convert it using struct.unpack. However the string clean_data_little_endian behaves differently from the other, that is the result of a conversion from an array.
During debugging, clean_data_little_endian is à1ÿÏÿÊÿÄ and strBinary_Values is \xE0\x31\xFF\xCF\xFF\xCA\xFF\xC4, and if I try to print them I obtain:
clean_data_little_endian: b'\xe01\xff\xcf\xff\xca\xff\xc4' <class 'str'>
strBinary_Values: b'\\xE0\\x31\\xFF\\xCF\\xFF\\xCA\\xFF\\xC4' <class 'str'>
(strBinary_Values has two backslashes instead of one)
There must be a difference that I don't know how to remove between them, so that struct.unpack works only with clean_data_little_endian and not with strBinary_Values.
The error returned is:
unpack requires a buffer of 8 bytes
and if I change the buffer, the number of bytes required doubles, and so on.
Here's the code I used, even if I think it will not be necessary to read it.
little_endian = '#800000100?xE0??x31??xFF??xCF??xFF??xCA??xFF??xC4?'
clean_data_little_endian = '\xE0\x31\xFF\xCF\xFF\xCA\xFF\xC4'
#from raw string to clean string
j=0
i=0
listBinary_Values = []  # initialization was missing from the posted snippet
listValuesToClean = list(little_endian[10:len(little_endian)])
for i in range(0,len(listValuesToClean)-1):
    mod = i % 5
    if ((mod == 2) or (mod == 3) or (mod == 1)):
        listBinary_Values.append(listValuesToClean[i])
        j=j+1
    if (mod == 0):
        listBinary_Values.append('\\')
        j=j+1
strBinary_Values=''.join(listBinary_Values)
print('expected: ',clean_data_little_endian.encode('raw_unicode_escape'),type(strBinary_Values), '\n' 'real: ', strBinary_Values.encode('raw_unicode_escape'),type(clean_data_little_endian))
#from clean string to initial values
iqty_of_values = len(strBinary_Values)/8
h = "H" * int(iqty_of_values)
#correct result:
ivalues = struct.unpack("<"+h,clean_data_little_endian.encode('raw_unicode_escape'))
#wrong result:
ivalues = struct.unpack("<"+h,strBinary_Values.encode('raw_unicode_escape'))
The double backslashes indicate a literal backslash, so the string holds the text of the escape codes rather than the byte values you want. The line below fixes it. Encoding with latin1 maps Unicode code points 1:1 to byte values, which gives unicode_escape a bytes object to decode; that step turns the literal escape codes into the code points they name. Encoding to latin1 again then produces the bytes required for unpack:
ivalues = struct.unpack("<"+h,strBinary_Values.encode('latin1').decode('unicode_escape').encode('latin1'))
print(ivalues)
# (12768, 53247, 51967, 50431)
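The difference between the two strings is easiest to see by comparing lengths. The following is a small sketch, with variable names chosen here for illustration, showing that the escaped text holds four characters per byte while the real literal holds one:

```python
import struct

literal = '\xE0\x31'    # two characters: U+00E0 and '1'
escaped = '\\xE0\\x31'  # eight characters: backslash, 'x', 'E', '0', ...

print(len(literal), len(escaped))

# Interpreting the escape sequences turns the escaped text into the literal one.
decoded = escaped.encode('latin1').decode('unicode_escape')
print(decoded == literal)

# Only the decoded form packs into the two bytes that "<H" expects.
(value,) = struct.unpack('<H', decoded.encode('latin1'))
print(value)
```

This also explains why the required buffer size appears to grow: the format string is derived from the length of the escaped text, which is four times the intended byte count.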
From the looks of it, a regular expression to capture the hexadecimal bytes and a direct conversion using bytes.fromhex would be more straightforward:
import re
import struct
little_endian = '#800000100?xE0??x31??xFF??xCF??xFF??xCA??xFF??xC4?'
s = ''.join(re.findall(r'x([0-9A-F]{2})',little_endian))
print(s)
b = bytes.fromhex(s)
print(b)
data = struct.unpack(f'<{len(b)//2}H',b)
print(data)
Output:
E031FFCFFFCAFFC4
b'\xe01\xff\xcf\xff\xca\xff\xc4'
(12768, 53247, 51967, 50431)

How can I concatenate string to a string as a hex in python?

data = "\xAA\x12\xFF\x01\x21\x33"
ser.write(data)
This is the original code. How can I concatenate a value into a string that contains hex escapes and send the whole thing as hex, as in the second snippet?
var = 21
data = "\xAA\x12\xFF\x01" + var + "\x33"
ser.write(data)
What you're looking for is the hex() function:
>>> var = 21
>>> data = "\xAA\x12\xFF\x01" + hex(var) + "\x33"
>>> data
'\xaa\x12\xff\x010x153'
>>>
The job of hex() is to convert an integer (of any size) to a lowercase hexadecimal string prefixed with "0x".
EDIT:
I noticed you need the backslash form to keep the formatting. chr() returns the character with that value, which is displayed with the backslash escape.
>>> chr(var)
'\x15'
>>> hex(var)
'0x15'
chr(i) returns a string of one character whose ASCII code is the integer i.
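On Python 3 the same concatenation is usually done with bytes rather than str. The sketch below assumes that ser is a pySerial port whose write() takes bytes, and that var is an integer in the range 0 to 255. The ser.write(data) call is left out so the snippet runs without a serial port.

```python
var = 21

# bytes([var]) builds a single byte whose value is var (0x15 for 21).
data = b"\xAA\x12\xFF\x01" + bytes([var]) + b"\x33"
print(data)

# If var was meant as the hex digits "21", parse it as base 16 first.
data_hex = b"\xAA\x12\xFF\x01" + bytes([int(str(var), 16)]) + b"\x33"
print(data_hex == b"\xAA\x12\xFF\x01\x21\x33")
```

Which of the two forms is right depends on whether 21 in the question is a decimal value or the hex byte from the original string.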

Python strings as long number

I have a long string and I want to represent it as one long number.
I tried:
l=[ord (i)for i in str1]
but this is not what I need.
I need a single long number, not a list of numbers.
This line gives me [23,21,45,34,242,32]
and I want to turn it into one long number that I can convert back to the same string.
Any ideas?
Here is a translation of Paulo Bu's answer (with base64 encoding) into Python 3:
>>> import base64
>>> s = 'abcde'
>>> e = base64.b64encode(s.encode('utf-8'))
>>> print(e)
b'YWJjZGU='
>>> base64.b64decode(e).decode('utf-8')
'abcde'
Basically the difference is that your workflow has gone from:
string -> base64
base64 -> string
To:
string -> bytes
bytes -> base64
base64 -> bytes
bytes -> string
Is this what you are looking for?
>>> str = 'abcdef'
>>> ''.join([chr(y) for y in [ ord(x) for x in str ]])
'abcdef'
>>>
#! /usr/bin/python2
# coding: utf-8
def encode(s):
    result = 0
    for ch in s.encode('utf-8'):
        result *= 256
        result += ord(ch)
    return result
def decode(i):
    result = []
    while i:
        result.append(chr(i%256))
        i /= 256
    result = reversed(result)
    result = ''.join(result)
    result = result.decode('utf-8')
    return result
orig = u'Once in Persia reigned a king …'
cipher = encode(orig)
clear = decode(cipher)
print '%s -> %s -> %s'%(orig, cipher, clear)
This is code that I found that works.
str='sdfsdfsdfdsfsdfcxvvdfvxcvsdcsdcs sdcsdcasd'
I=int.from_bytes(bytes([ord (i)for i in str]),byteorder='big')
print(I)
print(I.to_bytes(len(str),byteorder='big'))
What about using Base64 encoding? Are you fine with it? Here's an example:
>>> import base64
>>> s = 'abcde'
>>> e = base64.b64encode(s)
>>> print e
YWJjZGU=
>>> base64.b64decode(e)
'abcde'
The encoding is not pure numbers but you can go back and forth from string without much trouble.
You can also try encoding the string to hexadecimal. This yields digits only, although I'm not sure you can always get back from the encoded string to the original string:
>>> s='abc'
>>> n=s.encode('hex')
>>> print n
616263
>>> n.decode('hex')
'abc'
If you need actual integers, you can extend the trick:
>>> s='abc'
>>> n=int(s.encode('hex'), 16) #convert it to integer
>>> print n
6382179
>>> hex(n)[2:].decode('hex') # return from integer to string
'abc'
Note: I'm not sure this works out of the box in Python 3.
UPDATE: To make it work with Python 3, I suggest using the binascii module this way:
>>> import binascii
>>> s = 'abcd'
>>> n = int(binascii.hexlify(s.encode()), 16) # encode is needed to convert unicode to bytes
>>> print(n)
1633837924
>>> binascii.unhexlify(hex(n)[2:].encode()).decode()
'abcd'
The encode and decode methods are needed to convert between strings and bytes. If you plan to include special (non-ASCII) characters, you will probably need to specify the encodings explicitly.
Hope this helps!
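On Python 3 the string-to-integer round trip can also be written with int.from_bytes() and int.to_bytes(), which avoids slicing the output of hex(). The following is a minimal sketch; the helper names text_to_int and int_to_text are made up for illustration, and UTF-8 is an assumed encoding.

```python
def text_to_int(text):
    """Pack the UTF-8 bytes of text into one big-endian integer."""
    return int.from_bytes(text.encode('utf-8'), byteorder='big')

def int_to_text(number):
    """Invert text_to_int; the byte count is derived from the bit length."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, byteorder='big').decode('utf-8')

n = text_to_int('abcd')
print(n)
print(int_to_text(n))
```

One caveat of any scheme like this: leading NUL characters in the string contribute nothing to the integer, so they cannot be recovered unless the original length is stored separately.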

hex string to character in python

I have a hex string like:
data = "437c2123"
I want to convert this string to a sequence of characters according to the ASCII table.
The result should be like:
data_con = "C|!#"
Can anyone tell me how to do this?
In Python2
>>> "437c2123".decode('hex')
'C|!#'
In Python3 (also works in Python2, for <2.6 you can't have the b prefixing the string)
>>> import binascii
>>> binascii.unhexlify(b"437c2123")
b'C|!#'
In [17]: data = "437c2123"
In [18]: ''.join(chr(int(data[i:i+2], 16)) for i in range(0, len(data), 2))
Out[18]: 'C|!#'
Here:
for i in range(0, len(data), 2) iterates over every second position in data: 0, 2, 4 etc.
data[i:i+2] looks at every pair of hex digits '43', '7c', etc.
chr(int(..., 16)) converts the pair of hex digits into the corresponding character.
''.join(...) merges the characters into a single string.
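In Python 3 you can simply use bytes.fromhex(), which returns a bytes object (b'C|!#' here); call .decode('ascii') on the result if you need a str: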
data_con = bytes.fromhex(data)
The ord function converts characters to numerical values and the chr function does the inverse. So to convert 97 to "a", do chr(97)
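Putting the Python 3 pieces together for the string in the question gives the short sketch below. The 'ascii' codec is an assumption that fits this input; data containing bytes above 0x7F would need a different codec.

```python
data = "437c2123"

raw = bytes.fromhex(data)        # bytes object: b'C|!#'
data_con = raw.decode('ascii')   # text: 'C|!#'
print(data_con)

# chr() maps an integer to its character; ord() maps it back.
print(chr(0x43), ord('C'))
```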

How do I convert a single character into its hex ASCII value in Python?

I am interested in taking in a single character.
c = 'c' # for example
hex_val_string = char_to_hex_string(c)
print hex_val_string
output:
63
What is the simplest way of going about this? Any predefined string library stuff?
There are several ways of doing this:
>>> hex(ord("c"))
'0x63'
>>> format(ord("c"), "x")
'63'
>>> import codecs
>>> codecs.encode(b"c", "hex")
b'63'
On Python 2, you can also use the hex encoding like this (doesn't work on Python 3+):
>>> "c".encode("hex")
'63'
This might help
import binascii
x = b'test'
x = binascii.hexlify(x)
y = str(x,'ascii')
print(x) # Outputs b'74657374' (hex encoding of "test")
print(y) # Outputs 74657374
x_unhexed = binascii.unhexlify(x)
print(x_unhexed) # Outputs b'test'
x_ascii = str(x_unhexed,'ascii')
print(x_ascii) # Outputs test
This code contains examples for converting ASCII characters to and from hexadecimal. In your situation, the line you'd want to use is str(binascii.hexlify(c),'ascii'), where c must be a bytes object such as b'c'.
If your input string is in the inputString variable, you can call .encode('utf-8').hex() on it to get the result.
inputString = "Hello"
outputString = inputString.encode('utf-8').hex()
The result from this will be 48656c6c6f.
You can do this:
your_letter = input()
def ascii2hex(source):
    return hex(ord(source))
print(ascii2hex(your_letter))
For extra information, go to:
https://www.programiz.com/python-programming/methods/built-in/hex
To get the ASCII code, use ord("a").
To convert an ASCII code to a character, use chr(97).
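The char_to_hex_string function that the question calls could be sketched as follows for Python 3. The two-digit zero padding is an assumption made here so that characters below 0x10 still produce two digits.

```python
def char_to_hex_string(c):
    """Return the lowercase hex code of a single character, at least two digits."""
    if len(c) != 1:
        raise ValueError("expected exactly one character")
    return format(ord(c), '02x')

hex_val_string = char_to_hex_string('c')
print(hex_val_string)
```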
