how to split this string of HEX bytes - python

I have the following string of hex bytes from a smart meter:
'~\xa0\x1e\x03\x00\x02\xfe\xff4\xca\xec\xe6\xe7\x00\xc4\x01A\x00\x02\x04\x12\x00\x05\x11\x01\x11\x01\x11\x00\xc7\x11 ~'
I want to separate them into a list and then convert them to decimal integers. Python's .split() function won't work. Any ideas?
Thanks!

You can convert a string to a list of integer character codes with ord.
values = [ord(c) for c in data]
Although, depending on what you want to do, you might not even need to cast your data as a list since a str is already iterable.
Instead, iterate over your characters and recover their value. Here is a simplified example.
dt = '\xa0\x1e\x03\x00\x02\xfe'
for x in map(ord, dt):
    print(x)
Output
160
30
3
0
2
254
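In Python 3 the same data would normally arrive as a bytes object, and iterating over bytes already yields integers, so ord is not needed. A minimal sketch, assuming the meter frame is available as a bytes literal (the name frame is only illustrative, and only the first few bytes of the question's data are used):

```python
# Assumption: the frame is a Python 3 bytes object, not a str.
frame = b'~\xa0\x1e\x03\x00\x02\xfe\xff4'

values = list(frame)                      # each element is an int in 0..255
hex_values = [f'{b:02x}' for b in frame]  # two-digit hex text per byte

print(values)
print(hex_values)
```

Printable bytes such as ~ and 4 show up as their character codes (126 and 52), which is why the original string mixes \x escapes with ordinary characters.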

Related

How to get the last byte item from a bytes list in Python?

I have a bytes list and want to get the last item, while preserving its bytes type. Using [-1] it gives out an int type, so this is not a direct solution.
Example code:
x = b'\x41\x42\x43'
y = x[-1]
print (y, type(y))
# outputs:
67 <class 'int'>
For an arbitrary index I know how to do it:
x = b'\x41\x42\x43'
i = 2 # assume here a valid index in reference to list length
y = x[i:i+1]
print (y, type(y))
# outputs:
b'C' <class 'bytes'>
Probably I can calculate the list length and then point an absolute length-1 number, rather than relative to list end.
However, is there a more elegant way to do this ? (i.e. similar to the simple [-1]) I also cannot imagine how to adapt the [i:i+1] principle in reverse from list end.
There are a number of ways. I think the one you're interested in is:
someBytes[-1:]
Edit: I decided to elaborate a bit on what is going on and why this works.
A bytes object is an immutable arbitrary memory buffer, so a single element of a bytes object is a byte, which is best represented by an int. This is why someBytes[-1] will be the last int in the buffer.
This can be counterintuitive when you're using a bytes object like a string for whatever reason (pattern matching, handling ASCII data and not bothering to convert to a string), because a string in Python (or a str, to be pedantic) represents the idea of textual data and isn't tied to any particular binary encoding (though encoding and decoding default to UTF-8). So the last element of "hello" is "o", which is itself a string, since Python has no single char type, just strings of length 1.
So if you're treating a bytes object like a memory buffer you likely want an int, but if you're treating it like a string you want a bytes object of length 1. The slice above tells Python to return the part of the bytes object from the last element to the end, which is a slice of length one containing only the last value, and a slice of a bytes object is a bytes object.
You can trivially cast that int back to a bytes object:
>>> z = bytes([y])
>>> z == b'C'
True
...in the event that you can't easily get around fetching the values as ints, say because another function you don't have control of returns them that way.
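The difference between indexing, slicing, and rebuilding can be seen side by side in a short Python 3 sketch (the variable names are only illustrative):

```python
x = b'\x41\x42\x43'

last_int = x[-1]               # indexing yields an int
last_bytes = x[-1:]            # slicing yields a bytes object of length 1
rebuilt = bytes([last_int])    # wrap the int in a list to get bytes back
# Note: bytes(last_int) without the list would create 67 zero bytes instead.

print(last_int, last_bytes, rebuilt)

# Slicing also works on empty input, where indexing would raise IndexError.
empty_tail = b''[-1:]
print(empty_tail)
```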
If you have:
x = b'\x41\x42\x43'
Then you will get:
>>> x
b'ABC'
As you said, x[-1] will give you the integer value of the last byte (the same number ord() returns for the character).
>>> x[-1]
67
However, if you want to get the value of this, you can give:
>>> x.decode()[-1]
'C'
If you want the hexadecimal value 43 instead, you can format it as follows:
>>> "{0:x}".format(x[-1])
'43'
Example above
>>> z = bytes([y])
>>> z == b'C'
True
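You can get the same result with a slice (note that strip() first removes leading and trailing whitespace bytes, so plain x[-1:] is safer if those matter):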
x.strip()[-1:]
Output
b'C'
So,
bytes(b'\x41\x42\x43')
gives
b'ABC'

How to turn a binary string into a byte?

If I take the letter 'à' and encode it in UTF-8 I obtain the following result:
'à'.encode('utf-8')
>> b'\xc3\xa0'
Now from a bytearray I would like to convert 'à' into a binary string and turn it back into 'à'. To do so I execute the following code:
byte = bytearray('à','utf-8')
for x in byte:
print(bin(x))
I get 0b11000011 and 0b10100000, which are 195 and 160. Then I join them together and remove the 0b prefixes. Now I execute this code:
s = '1100001110100000'
value1 = s[0:8].encode('utf-8')
value2 = s[9:16].encode('utf-8')
value = value1 + value2
print(chr(int(value, 2)))
>> 憠
No matter how I develop the later part, I get symbols and never seem to be able to get back my 'à'. Why is that, and how can I get 'à' back?
>>> bytes(int(s[i:i+8], 2) for i in range(0, len(s), 8)).decode('utf-8')
'à'
There are multiple parts to this. The bytes constructor creates a byte string from a sequence of integers. The integers are formed from strings using int with a base of 2. The range combined with the slicing peels off 8 characters at a time. Finally decode converts those bytes back into Unicode characters.
Your second group of bits needs to be s[8:16] (or just s[8:]); otherwise you skip a bit and get 0100000.
You also need to convert your "bit string" back to an integer before treating it as a byte, e.g. with int("0010101", 2).
s = '1100001110100000'
value1 = bytearray([int(s[:8], 2),   # bits 0..7 (8 total)
                    int(s[8:], 2)])  # bits 8..15 (8 total)
print(value1.decode("utf8"))
Convert the base-2 value back to an integer with int(s,2), convert that integer to a number of bytes (int.to_bytes) based on the original length divided by 8 and big-endian conversion to keep the bytes in the right order, then .decode() it (default in Python 3 is utf8):
>>> s = '1100001110100000'
>>> int(s,2)
50080
>>> int(s,2).to_bytes(len(s)//8,'big')
b'\xc3\xa0'
>>> int(s,2).to_bytes(len(s)//8,'big').decode()
'à'
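Putting the answers above together, a round trip from text to a bit string and back might be sketched as follows. The helper names text_to_bits and bits_to_text are hypothetical, and the sketch assumes the bit string contains whole bytes with no separators:

```python
def text_to_bits(text, encoding='utf-8'):
    """Encode text and render each byte as 8 binary digits."""
    return ''.join(format(byte, '08b') for byte in text.encode(encoding))


def bits_to_text(bits, encoding='utf-8'):
    """Split the bit string into 8-digit groups and decode the bytes."""
    if len(bits) % 8:
        raise ValueError('bit string length must be a multiple of 8')
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


bits = text_to_bits('à')
print(bits)
print(bits_to_text(bits))
```

Using format(byte, '08b') keeps leading zeros, which bin() would drop and which matter when the groups are sliced back apart.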

How to convert hexadecimal string to character with that code point?

I have the string x = '0x32' and would like to turn it into y = '\x32'.
Note that len(x) == 4 and len(y) == 1.
I've tried to use z = x.replace("0", "\\"), but that causes z = '\\x32' and len(z) == 4. How can I achieve this?
You do not have to make it that hard: you can use int(.., 16) to parse a hex string of the form 0x.... Next you simply use chr(..) to convert that number into the character with that Unicode code point (for values below 128 this is the same as the ASCII code):
y = chr(int(x,16))
This results in:
>>> chr(int(x,16))
'2'
But \x32 is equal to '2' (you can look it up in the ASCII table):
>>> chr(int(x,16)) == '\x32'
True
and:
>>> len(chr(int(x,16)))
1
Try this (Python 2 only; in Python 3 a str has no decode method):
z = x[2:].decode('hex')
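In Python 3 a rough equivalent goes through bytes.fromhex. This is only a sketch: it assumes the text after the 0x prefix has an even number of hex digits, and it decodes as Latin-1 so that each byte maps to the character with the same code point:

```python
x = '0x32'

# Parse the hex digits after the '0x' prefix into raw bytes.
raw = bytes.fromhex(x[2:])

# Latin-1 maps every byte value 0..255 to the code point of the same number.
z = raw.decode('latin-1')

print(repr(z))
```

For a single byte this matches chr(int(x, 16)). For longer inputs the two differ: chr treats the whole number as one code point, while bytes.fromhex produces one byte per pair of digits.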
The ability to include code points like '\x32' inside a quoted string is a convenience for the programmer that only works in literal values inside the source code. Once you're manipulating strings in memory, that option is no longer available to you, but there are other ways of getting a character into a string based on its code point value.
Also note that '\x32' results in exactly the same string as '2'; it's just typed out differently.
Given a string containing a hexadecimal literal, you can convert it to its numeric value with int(str,16). Once you have a numeric value, you can convert it to the character with that code point via chr(). So putting it all together:
x = '0x32'
print(chr(int(x,16)))
#=> 2

How do you convert a python sequence item to an integer

I need to convert the elements of a Python 2.7 bytearray(), string, or bytes() into integers for processing. In many languages (e.g. C), bytes and chars are more or less 8-bit ints that you can perform math operations on. How can I convince Python to let me use (appropriate) bytearrays or strings interchangeably?
Consider toHex(stringlikeThing):
zerof = '0123456789ABCDEF'
def toHex(strg):
    ba = bytearray(len(strg)*2)
    for xx in range(len(strg)):
        vv = ord(strg[xx])
        ba[xx*2] = zerof[vv>>4]
        ba[xx*2+1] = zerof[vv&0xf]
    return ba
which should take a string like thing (ie bytearray or string) and make a printable string like thing of hexadecimal text. It converts "string" to the hex ASCII:
>>> toHex("string")
bytearray(b'737472696E67')
However, when given a bytearray:
>>> nobCom.toHex(bytearray("bytes"))
EX ord() expected string of length 1, but int found: 0 bytes
The ord() in the 'for' loop gets strg[xx], an item of a bytearray, which seems to be an integer (Whereas an item of a str is a single element string)
So ord() wants a char (single element string) not an int.
Is there some method or function that takes an argument that is a byte, char, small int, or one-element string and returns its value?
Of course you could check the type(strg[xx]) and handle the cases laboriously.
The unvoiced question is: what is the reasoning behind Python being so picky about the difference between a byte and a char (normal or unicode), i.e. a single-element string?
When you index a bytearray object in python, you get an integer. This integer is the code for the corresponding character in the bytearray, or in other words, the very thing that the ord function would return.
There is no built-in function in Python that takes a byte, character, small integer, or one-element string and returns its value. Writing one is simple, however.
def toInt(x):
    return x if isinstance(x, int) else ord(x)
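In Python 3 the hex conversion from the question can lean on the built-in bytes.hex method instead of a per-item loop. The following is a sketch rather than a drop-in replacement: the name to_hex is hypothetical, it returns a str instead of a bytearray, and it assumes any text input should be encoded as Latin-1 (one byte per character, which fails for characters above U+00FF):

```python
def to_hex(data):
    """Return uppercase hex text for bytes, bytearray, or str input."""
    if isinstance(data, str):
        # Assumption: text is encoded as Latin-1, one byte per character.
        data = data.encode('latin-1')
    return bytes(data).hex().upper()


print(to_hex('string'))
print(to_hex(bytearray(b'bytes')))
```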

Python read a binary file and decode

I am quite new in python and I need to solve this simple problem. Already there are several similar questions but still I cannot solve it.
I need to read a binary file, which is composed of several blocks of bytes. For example, the header is composed of 6 bytes, and I would like to extract those 6 bytes and transform them into a sequence of binary digits, like 000100110 011001 for example.
navatt_dir='C:/PROCESSING/navatt_read/'
navatt_filename='OSPS_FRMT_NAVATT____20130621T100954_00296_caseB.bin'
navatt_path=navatt_dir+navatt_filename
navatt_file=open(navatt_path, 'rb')
header=list(navatt_file.read(6))
print header
As a result, the list contains the following:
%run C:/PROCESSING/navatt_read/navat_read.py
['\t', 'i', '\xc0', '\x00', '\x00', 't']
which is not what I want.
I would also like to read a particular value from the binary file, knowing its position and length, without reading the whole file. Is it possible?
Thanks
ByteArray
A bytearray is a mutable sequence of bytes (Integers where 0 ≤ x ≤ 255). You can construct a bytearray from a string (If it is not a byte-string, you will have to provide encoding), an iterable of byte-sized integers, or an object with a buffer interface. You can of course just build it manually as well.
An example using a byte-string:
string = b'DFH'
b = bytearray(string)
# Print it as a string
print b
# Prints the individual bytes, showing you that it's just a list of ints
print [i for i in b]
# Let's add one to the D
b[0] += 1
# And print the string again to see the result!
print b
The result:
DFH
[68, 70, 72]
EFH
This is the type you want for raw byte manipulation. If you want to read 4 bytes as a 32-bit int, you would normally use the unpack function from the struct module, but I usually just shift them together myself from a bytearray.
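Both approaches mentioned above can be compared in a short Python 3 sketch. It assumes the four bytes are stored big-endian (most significant byte first); the sample data is made up for illustration:

```python
import struct

data = bytearray(b'\x00\x00\x01\x02')  # illustrative sample, not real file data

# struct: '>I' means big-endian unsigned 32-bit integer.
(value_struct,) = struct.unpack('>I', data)

# Manual shifting, most significant byte first.
value_shift = (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]

# Python 3 also offers int.from_bytes for the same job.
value_from_bytes = int.from_bytes(data, 'big')

print(value_struct, value_shift, value_from_bytes)
```

For little-endian data the struct format would be '<I' and the shifts would be applied in the opposite order.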
Printing the header in binary
What you seem to want is to take the string you have, convert it to a bytearray, and print them as a string in base 2/binary.
So here is a short example for how to write the header out (I read random data from a file named "dump"):
with open('dump', 'rb') as f:
    header = f.read(6)
b = bytearray(header)
print ' '.join([bin(i)[2:].zfill(8) for i in b])
After converting it to a bytearray, I call bin() on every single one, which gives back a string with the binary representation we need, in the format of "0b1010". I don't want the "0b", so I slice it off with [2:]. Then, I use the string method zfill, which allows me to have the required amount of 0's prepended for the string to be 8 long (which is the amount of bits we need), as bin will not show any unneeded zeroes.
If you're new to the language, the last line might look quite dense. It uses a list comprehension to make a list of all the binary strings we want to print, and then joins them into the final string with spaces between the elements.
A more explicit, less compact variant of the last line would be:
result = []
for byte in b:
    string = bin(byte)[2:]  # Make a binary string and slice off the leading "0b"
    result.append(string.zfill(8))  # Append a 0-padded version to the results list
# Join the list into a space-separated string and print it!
print ' '.join(result)
I hope this helps!
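The question also asks about reading a value at a known position without reading the whole file. One way to sketch that in Python 3 is with file.seek. The helper name read_bits and the temporary demo file are assumptions for illustration; the first six demo bytes match the header shown in the question:

```python
import os
import tempfile


def read_bits(path, offset, length):
    """Read `length` bytes starting at `offset`; return them as bit strings."""
    with open(path, 'rb') as f:
        f.seek(offset)         # jump to the position without reading earlier data
        chunk = f.read(length)
    return ' '.join(format(byte, '08b') for byte in chunk)


# Demo with a throwaway file (hypothetical data after the six header bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\ti\xc0\x00\x00t' + b'\x01\x02\x03')
    demo_path = tmp.name

header_bits = read_bits(demo_path, 0, 6)
tail_bits = read_bits(demo_path, 6, 2)
os.remove(demo_path)

print(header_bits)
print(tail_bits)
```

Here format(byte, '08b') does the same job as bin(i)[2:].zfill(8) in the answer above, in one step.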
