Python Encoding Issues

Python Encoding Issues - python

I have data that I would like to decode from and its in Windows-1252 basically I send code to a socket and it sends it back and I have to decode the message and use IEEE-754 to get a certain value from it but I can seem to figure out all this encoding stuff. Here is my code.
def printKinds ():
test = "x40\x39\x19\x99\x99\x99\x99\x9A"
print (byt1Hex(test))
test = byt1Hex(test).replace(' ', '')
struct.unpack('<d', binascii.unhexlify(test))
print (test)
printKinds()
def byt1Hex( bytStr ):
return ' '.join( [ "%02X" % ord( x ) for x in bytStr ] )
So I use that and then I have to get the value from that.. But it's not working and I can not figure out why.
The current output I am getting is
struct.unpack('<d', binascii.unhexlify(data))
struct.error: unpack requires a bytes object of length 8
That the error the expected output I am looking for is 25.1
but when I encode it, It actually changes the string into the wrong values so when I do this:
print (byt1Hex(data))
I expect to get this.
40 39 19 99 99 99 99 9A
But I actually get this instead
78 34 30 39 19 99 99 99 99 9A

>>> import struct
>>> struct.pack('!d', 25.1)
b'#9\x19\x99\x99\x99\x99\x9a'
>>> struct.unpack('!d', _) #NOTE: no need to call byt1hex, unhexlify
(25.1,)
You send, receive bytes over the network. No need hexlify/unhexlify them; unless the protocol requires it (you should mention the protocol in the question then).

You have:
test = "x40\x39\x19\x99\x99\x99\x99\x9A"
You need:
test = "\x40\x39\x19\x99\x99\x99\x99\x9A"

Related

Read and write to port with unit8 format?

I have Matlab code that communicates over serial port, and I am trying to translate the code into python, however I am not getting the same "read" messages.
Here is the matlab code:
s = serial('COM3','BaudRate',115200,'InputBufferSize',1e6,'Timeout',2); %OPEN COM PORT
fopen(s);
string=[];
while(length(st)<1)
fwrite(s,30,'uint8'); %REQUEST CONNECTION STRING
pause(0.1);
st = fread(s,5); %READ CONNECTION (5BYTES, "Ready")
disp(st)
end
fwrite(s,18,'uint8'); % START ACQUISITION
while(1)
st(1:131) = fread(s,131); .....
disp(st)
OUT:
%first disp(st)
82
101
97
100
121
%from disp(st) second time
106
85
106
59
106
61
106
0
106...
Here is my attempt of python code:
# Open serial -
import serial
import time
s = serial.Serial('COM3', baudrate=115200, timeout=2 )
s.set_buffer_size(rx_size = 1000000, tx_size = 1000000)
#serieal is open
print ("serial is open: ", s.isOpen())
s.write(30) #request connection string
time.sleep(0.1)
string = s.read(5) #read connection string (5 BYTESm "Ready)
print (string)
# start aquisition
s.write(18) #request connection string
print (s.read(131))
however the output is, OUT:
serial is open: True
b'Uj.jg'
b'jVj3j-i\xefjOj8jajJj"i\xb8j\x19j4j,j\x17\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00j\x9aj\x9aj\x8djfjkj/j\xa0j\x97jbjKj#i\xb9j\x1bj5j-j\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00/\xaaUj\xe5j\xb7'
As you can see they aren't the same, so:
How do I send via pyserial a 'uint8' encoded number like in matlabs: fwrite(s,30,'uint8');
How do I read and display from pyserial similar to matlabs: st = fread(s,5);

How do I read and display from pyserial similar to matlabs: st = fread(s,5);
uint8 means 8-bit number. So 1 byte.
In python, you get a bytes object which is a sequence of bytes. You can iterate over it to get each value. - Which is exactly what you want, because that would be a value of one byte, 0-255
first = b'Uj.jg'
for i in first:
print(i)
This gives you:
85
106
46
106
103
How do I send via pyserial a 'uint8' encoded number like in matlabs: fwrite(s,30,'uint8');
You can convert your int to bytes object using int.to_bytes:
print((30).to_bytes(1, byteorder="big"))
Results in
b'\x1e'
First argument is number of bytes - in our case 1. Second argument is byte order - which is useless when we use 1 byte but it's still required.
So what you had as s.write(30) will be
s.write(b'\x1e')
or to keep the "30" thing visible just directly paste the conversion:
s.write((30).to_bytes(1, byteorder="big"))
Same for sending 18: s.write((18).to_bytes(1, byteorder="big"))

Encode data to HEX and get an L at the end in Python 2.7. Why?

I ask a Measurement Device to give me some Data. At first it tells me how many bytes of data are in the storage. It is always 14. Then it gives me the data which i have to encode into hex. It is Python 2.7 can´t use newer versions. Line 6 to 10 tells the Device to give me the measured data.
Line 12 to 14 is the encoding to Hex. In other Programs it works. but when i print result(Line 14) then i get a Hex number with 13 Bytes PLUS 1 which can not be correct because it has an L et the end. I guess it is some LONG or whatever. and i dont need the last Byte. but i do think it changes the Data too, which is picked out from Line 15 and up. at first in Hex. Then it is converted into Int.
Is it possible that the L has an effect on the Data or not?
How can i fix it?
1 ap.write(b"ML\0")
rmemb = ap.read(2)
print(rmemb)
rmemb = int(rmemb)+1
5 rmem = rmemb #must be and is 14 Bytes
addmem = ("MR:%s\0" % rmem)
# addmem = ("MR:14\0")
ap.write(addmem.encode())
10 time.sleep(1)
test = ap.read(rmem)
result = hex(int(test.encode('hex'), 16))
print(result)
15 ftflash = result[12:20]
ftbg = result[20:28]
print(ftflash)
print(ftbg)
ftflash = int(ftflash, 16)
20 # print(ftflash)
ftbg = int(ftbg, 16)
# print(ftbg)
OUTPUT:
14
0x11bd5084c0b000001ce00000093L
b000001c
e0000009

Python 2 has two built-in integer types, int and long. hex returns a string representing a Python hexadecimal literal, and in Python 2, that means that longs get an L at the end, to signify that it's a long.

RAW CAN decoding

I'm trying to import CAN data using a virtual CAN network and am getting strange results when I unpack my CAN packet of data. I'm using Python 3.3.7
Code:
import socket, sys, struct
sock = socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
interface = "vcan0"
try:
sock.bind((interface,))
except OSError:
sys.stderr.write("Could not bind to interface '%s'\n" % interface)
fmt = "<IB3x8s"
while True:
can_pkt = sock.recv(16)
can_id, length, data = struct.unpack(fmt, can_pkt)
can_id &= socket.CAN_EFF_MASK
data = data[:length]
print(data, can_id , can_pkt)
So when I have a CAN packet looking like this.
candump vcan0: vcan0 0FF [8] 77 9C 3C 21 A2 9A B9 66
output in Python: b'\xff\x00\x00\x00\x08\x00\x00\x00w\x9c<!\xa2\x9a\xb9f'
Where vcan0 is the interface, [x] is the number of bytes in the payload, the rest is an 8 byte hex payload.
Do I have the wrong formatting? Has PF_CAN been updated for newer Python version? Am I using CAN_RAW when I should be using CAN_BCM for my protocol family? Or am I just missing how to decode the unpacked data?
Any direction or answer would be much appreciated.
Also, here are some script outputs to can-utils values I've plucked. If I can't find anything, I'm probably just going to make collect a ton of data then decode for the bytes of data that don't translate over properly. I feel that i'm over complicating things, and possibly missing one key aspect.
Python3 output == can-utils/socketCAN (hex)
M= == 4D 3D
~3 == 7E 33
p == 70
. == 2E
# == 40
r: == 0D 3A
c == 63
5g == 35 67
y == 79
a == 61
) == 29
E == 45
M == 4D
C == 43
P> == 50 3E
SGN == 53 47 4E
8 == 38

Rather than laboriously complete that table you started, just look at any ASCII code chart. When you simply print a string, any characters that are actually ASCII text will print as that character: only unprintable characters get shown as hexadecimal escapes. If you want everything in hex, you need to explicitly request that:
import binascii
print(binascii.hexlify(data))
for example.

I'm sure you've already run into the python-can library? If not we have a native python version of socketcan that correctly parse data out of CAN messages. Some of the source might help you out - or you might want to use it directly. CAN_RAW is probably what you want, if you plan on leaving virtual can for real hardware you might also want to get the timestamp from the hardware.
Not all constants have been exposed in Python's socket module so there is also a ctypes version which made in easier to experiment with things like the socketcan broadcast manager. Docs for both are here.

How to read part of binary file with numpy?

I'm converting a matlab script to numpy, but have some problems with reading data from a binary file. Is there an equivelent to fseek when using fromfile to skip the beginning of the file? This is the type of extractions I need to do:
fid = fopen(fname);
fseek(fid, 8, 'bof');
second = fread(fid, 1, 'schar');
fseek(fid, 100, 'bof');
total_cycles = fread(fid, 1, 'uint32', 0, 'l');
start_cycle = fread(fid, 1, 'uint32', 0, 'l');
Thanks!

You can use seek with a file object in the normal way, and then use this file object in fromfile. Here's a full example:
import numpy as np
import os
data = np.arange(100, dtype=np.int)
data.tofile("temp") # save the data
f = open("temp", "rb") # reopen the file
f.seek(256, os.SEEK_SET) # seek
x = np.fromfile(f, dtype=np.int) # read the data into numpy
print x
# [64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
# 89 90 91 92 93 94 95 96 97 98 99]

There probably is a better answer… But when I've been faced with this problem, I had a file that I already wanted to access different parts of separately, which gave me an easy solution to this problem.
For example, say chunkyfoo.bin is a file consisting of a 6-byte header, a 1024-byte numpy array, and another 1024-byte numpy array. You can't just open the file and seek 6 bytes (because the first thing numpy.fromfile does is lseek back to 0). But you can just mmap the file and use fromstring instead:
with open('chunkyfoo.bin', 'rb') as f:
with closing(mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ)) as m:
a1 = np.fromstring(m[6:1030])
a2 = np.fromstring(m[1030:])
This sounds like exactly what you want to do. Except, of course, that in real life the offset and length to a1 and a2 probably depend on the header, rather than being fixed comments.
The header is just m[:6], and you can parse that by explicitly pulling it apart, using the struct module, or whatever else you'd do once you read the data. But, if you'd prefer, you can explicitly seek and read from f before constructing m, or after, or even make the same calls on m, and it will work, without affecting a1 and a2.
An alternative, which I've done for a different non-numpy-related project, is to create a wrapper file object, like this:
class SeekedFileWrapper(object):
def __init__(self, fileobj):
self.fileobj = fileobj
self.offset = fileobj.tell()
def seek(self, offset, whence=0):
if whence == 0:
offset += self.offset
return self.fileobj.seek(offset, whence)
# ... delegate everything else unchanged
I did the "delegate everything else unchanged" by generating a list of attributes at construction time and using that in __getattr__, but you probably want something less hacky. numpy only relies on a handful of methods of the file-like object, and I think they're properly documented, so just explicitly delegate those. But I think the mmap solution makes more sense here, unless you're trying to mechanically port over a bunch of explicit seek-based code. (You'd think mmap would also give you the option of leaving it as a numpy.memmap instead of a numpy.array, which lets numpy have more control over/feedback from the paging, etc. But it's actually pretty tricky to get a numpy.memmap and an mmap to work together.)

This is what I do when I have to read arbitrary in an heterogeneous binary file.
Numpy allows to interpret a bit pattern in arbitray way by changing the dtype of the array.
The Matlab code in the question reads a char and two uint.
Read this paper (easy reading on user level, not for scientists) on what one can achieve with changing the dtype, stride, dimensionality of an array.
import numpy as np
data = np.arange(10, dtype=np.int)
data.tofile('f')
x = np.fromfile('f', dtype='u1')
print x.size
# 40
second = x[8]
print 'second', second
# second 2
total_cycles = x[8:12]
print 'total_cycles', total_cycles
total_cycles.dtype = np.dtype('u4')
print 'total_cycles', total_cycles
# total_cycles [2 0 0 0] !endianness
# total_cycles [2]
start_cycle = x[12:16]
start_cycle.dtype = np.dtype('u4')
print 'start_cycle', start_cycle
# start_cycle [3]
x.dtype = np.dtype('u4')
print 'x', x
# x [0 1 2 3 4 5 6 7 8 9]
x[3] = 423
print 'start_cycle', start_cycle
# start_cycle [423]

There is a quite new feature of numpy.fromfile()
offset int
The offset (in bytes) from the file’s current position. Defaults to 0. Only permitted for binary files.
New in version 1.17.0.
import numpy as np
import os
data = np.arange(100, dtype=np.int32)
data.tofile("temp") # save the data
x = np.fromfile("temp", dtype=np.int32, offset=256) # use the offset
print (x)
# [64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
# 89 90 91 92 93 94 95 96 97 98 99]

Problem sending Bytes with pySerial and socat

I want to send some bytes via pySerial. I created virtual serial ports with socat for testing purposes:
socat PTY,link=./ptyp1,b9600 PTY,link=./ptyp2,b9600
Here's the python code:
ser = serial.Serial('./ptyp1')
x = struct.pack('B',2)
print binascii.hexlify(x) # 02
ser.write(x)
y = ser.read(2)
print binascii.hexlify(y) # 5e42
The ouput I get:
02 # x
5e42 # y
The output I expect:
02 # x
02 # y
What am I doing wrong here? Is it socat or python?
Edit:
I just noticed some other strange behavior for different x values. Here the ouput:
x = 12 => y = 5E 52 0D 0A 5E 50
x = 100 => y = 100 # why does it work here?
Solution:
The problem was that I read on the same port I wrote to. If I get it right socat "connects" the two ports as "in" and "out". So I have to read on ./ptyp2 if I write to ./ptyp1. After that, everything is fine.

The problem was that I read on the same port I wrote to. If I get it right socat "connects" the two ports as "in" and "out". So I have to read on ./ptyp2 if I write to ./ptyp1. After that, everything is fine.

I have installed socat to test your code. I have run this line :
socat PTY,link=./ptyp1,b9600 PTY,link=./ptyp2,b9600
Then, the following code works :
from binascii import hexlify
from serial import Serial, struct
ser = Serial('ptyp1')
x = struct.pack('B', 2)
print hexlify(x) # 02
ser.write(x)
y = ser.read()
print hexlify(y) # 5E
y = ser.read()
print hexlify(y) # 42
Ouput :
02
5e
42

What you seem to be getting back is the string "^B". It's possible that socat (or something else along the way) is interpreting the byte you're sending (\x02) as a control code of some sort.
Off the top of my head, Ctrl-B is the page-back mnemonic, but I'm not sure.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python Encoding Issues - python

You have: test = "x40\x39\x19\x99\x99\x99\x99\x9A" You need: test = "\x40\x39\x19\x99\x99\x99\x99\x9A"

Related

Read and write to port with unit8 format?

Encode data to HEX and get an L at the end in Python 2.7. Why?

RAW CAN decoding

How to read part of binary file with numpy?

Problem sending Bytes with pySerial and socat

Categories

Resources