Weird Python output when printing

Weird Python output when printing - python

I'm very new to python and I'm trying to read pressure data from a Honeywell differential pressure sensor using the LibMPSSE library. I'm reading it from a Adafruit FT232H chip and I'm using python 2.7.6 on ubuntu Linux.
#!/usr/bin/env python
from mpsse import *
SIZE = 2 # 2 bytes MSB first
WCMD = "\x50" # Write start address command
RCMD = "\x51" # Read command
FOUT = "eeprom.txt" # Output file
try:
eeprom = MPSSE(I2C)
# print "%s initialized at %dHz (I2C)" % (eeprom.GetDescription(), eeprom.GetClock())
eeprom.Start()
eeprom.Write(WCMD)
# Send Start condition to i2c-slave (Pressure Sensor)
if eeprom.GetAck() == ACK:
# ACK received,resend START condition and set R/W bit to 1 for read
eeprom.Start()
eeprom.Write(RCMD)
if eeprom.GetAck() == ACK:
# ACK recieved, continue supply the clock to slave
data = eeprom.Read(SIZE)
eeprom.SendNacks()
eeprom.Read(1)
else:
raise Exception("Received read command NACK2!")
else:
raise Exception("Received write command NACK1!")
eeprom.Stop()
print(data)
eeprom.Close()
except Exception, e:
print "MPSSE failure:", e
According to the library, Read returns a string of size bytes and whenever I want to print the data, the only output I see is �����2��T_`ʋ�Q}�*/�eE�
. I've tried encoding with utf-8 and still no luck.

Python may print "weird" output when bytes/characters in the string (data in your case) contain non-printable characters. To "view" the content of the individual bytes (as integers or hex), do the following:
print(','.join(['{:d}'.format(x) for x in map(ord, data)])) # decimal
or
print(','.join(['{:02X}'.format(x) for x in map(ord, data)]))
Since the length of your data buffer is set by SIZE=2, to extract each byte from this buffer as an integer you can do the following:
hi, lo = map(ord, data)
# or:
hi = ord(data[0])
lo = ord(data[1])
To read more about ord and what it does - see https://docs.python.org/2/library/functions.html#ord

Related

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 398: invalid start byte || book python for everyone

hey am trying to pull image from web server using socket programming in python while going through python for everyone book there is example in networked programming chapter i copied the code from example urljpeg.py
import socket
import time
#HOST = 'data.pr4e.org'
#PORT = 80
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
mysock.sendall(b'GET http://data.pr4e.org/cover3.jpg HTTP/1.0\r\n\r\n')
count = 0
picture = b""
while True:
data = mysock.recv(5120)
if len(data) < 1: break
# time .sleep(0.25)
count = count + len(data)
print( len(data),count)
picture = picture + data
mysock.close()
# look for the end of the header (2crlf)
pos = picture.find(b"r\n\r\n")
print("Header length ", pos)
print(picture[:pos].decode())
# skip pasr the header and save the picture data
picture = picture[pos+4:]
fhand = open("stuff.jpg","wb")
fhand.write(picture)
fhand.close()

The error message indicates that you are trying to decode data which is not utf-8. So why is this happening? Let's take a step back and look at what the code is doing:
# look for the end of the header (2crlf)
pos = picture.find(b"r\n\r\n")
print("Header length ", pos)
print(picture[:pos].decode())
We're trying to find a sequence of \r\n\r\n, i.e. CR LF CR LF in the data. This would be the empty line that separates the HTTP header (which should be in ASCII, which is a subset of UTF-8) from the actual image data. Then we try to decode everything up to that point as a string. So why does it fail? The program conveniently prints the header length, and in the bit you posted earlier we could see that this was -1, which means that the picture.find call did not find anything! Why not? Well, look carefully at what the code actually does:
# look for the end of the header (2crlf)
pos = picture.find(b"r\n\r\n")
It should be looking for \r\n\r\n, but it is actually looking for r\n\r\n!

on-the-fly parsing of binary serial data in python

I'm new to using Python3 for data acquisition. I'm trying to find a way to parse binary data from a serial port on Linux.
import serial
ser = serial.Serial(
port='/dev/ttyS0',
baudrate = 9600,
parity=serial.PARITY_NONE,
stopbits=serial.STOPBITS_ONE,
bytesize=serial.EIGHTBITS,
timeout=1)
counter = 0
while 1:
x = ser.read(31)
print (x)
This gives me a string which I'm not sure about the format of:
x='\x00\x00\x91\x00\x02\x88BM\x00\x1c\x00\x00\x00\x01\x00\x01\x00\x00\x00\x01\x00\x01\x00\xe1\x00K\x00\x1a\x00\x02\x00\x00'
using
x.encode('hex')
gives a string of hex values
x='000091000288**424d**001c00000001000100000001000100e1004b001a00020000'
where 0x42 is the end of message and 0x4d is start of message.
I can convert it into a base 10 list using
y = map(ord,x)
print(y)
Then I have a way to re-order the message using the indexes but surely there is a neater way? How do I create a list which starts at 0x4d to parse with?

If you are using python3, this is likely already bytes:
x='\x00\x00\x91\x00\x02\x88BM\x00\x1c\x00\x00\x00\x01\x00\x01\x00\x00\x00\x01\x00\x01\x00\xe1\x00K\x00\x1a\x00\x02\x00\x00'
It likely looks this way because Python printed it for you, and all of the non-ascii characters are shown in hex. Your start of message is in 0x42, 0x4d which is BM in ascii and can be seen in the data above between 0x88 and 0x00 as \x88BM\x00.
I would suggest just iterating over the byte array in x to do your parsing. The encoding and mapping should not be needed.
for b in x:
if b == 0x4d:
found_byte1 = True
... # etc

Convertion between ISO-8859-2 and UTF-8 in Python

I'm wondering how can I convert ISO-8859-2 (latin-2) characters (I mean integer or hex values that represents ISO-8859-2 encoded characters) to UTF-8 characters.
What I need to do with my project in python:
Receive hex values from serial port, which are characters encoded in ISO-8859-2.
Decode them, this is - get "standard" python unicode strings from them.
Prepare and write xml file.
Using Python 3.4.3
txt_str = "ąęłóźć"
txt_str.decode('ISO-8859-2')
Traceback (most recent call last): File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
The main problem is still to prepare valid input for the "decode" method (it works in python 2.7.10, and thats the one I'm using in this project). How to prepare valid string from decimal value, which are Latin-2 code numbers?
Note that it would be uber complicated to receive utf-8 characters from serial port, thanks to devices I'm using and communication protocol limitations.
Sample data, on request:
68632057
62206A75
7A647261
B364206F
20616775
777A616E
616A2061
6A65696B
617A20B6
697A7970
6A65B361
70697020
77F36469
62202C79
6E647572
75206A65
7963696C
72656D75
6A616E20
73726F67
206A657A
65647572
77207972
73772065
00000069
This is some sample data. ISO-8859-2 pushed into uint32, 4 chars per int.
bit of code that manages unboxing:
l = l[7:].replace(",", "").replace(".", "").replace("\n","").replace("\r","") # crop string from uart, only data left
vl = [l[0:2], l[2:4], l[4:6], l[6:8]] # list of bytes
vl = vl[::-1] # reverse them - now in actual order
To get integer value out of hex string I can simply use:
int_vals = [int(hs, 16) for hs in vl]

Your example doesn't work because you've tried to use a str to hold bytes. In Python 3 you must use byte strings.
In reality, if you're using PySerial then you'll be reading byte strings anyway, which you can convert as required:
with serial.Serial('/dev/ttyS1', 19200, timeout=1) as ser:
s = ser.read(10)
# Py3: s == bytes
# Py2.x: s == str
my_unicode_string = s.decode('iso-8859-2')
If your iso-8895-2 data is actually then encoded to ASCII hex representation of the bytes, then you have to apply an extra layer of encoding:
with serial.Serial('/dev/ttyS1', 19200, timeout=1) as ser:
hex_repr = ser.read(10)
# Py3: hex_repr == bytes
# Py2.x: hex_repr == str
# Decodes hex representation to bytes
# Eg. b"A3" = b'\xa3'
hex_decoded = codecs.decode(hex_repr, "hex")
my_unicode_string = hex_decoded.decode('iso-8859-2')
Now you can pass my_unicode_string to your favourite XML library.

Interesting sample data. Ideally your sample data should be a direct print of the raw data received from PySerial. If you actually are receiving the raw bytes as 8-digit hexadecimal values, then:
#!python3
from binascii import unhexlify
data = b''.join(unhexlify(x)[::-1] for x in b'''\
68632057
62206A75
7A647261
B364206F
20616775
777A616E
616A2061
6A65696B
617A20B6
697A7970
6A65B361
70697020
77F36469
62202C79
6E647572
75206A65
7963696C
72656D75
6A616E20
73726F67
206A657A
65647572
77207972
73772065
00000069'''.splitlines())
print(data.decode('iso-8859-2'))
Output:
W chuj bardzo długa nazwa jakiejś zapyziałej pipidówy, brudnej ulicyumer najgorszej rudery we wsi
Google Translate of Polish to English:
The dick very long name some zapyziałej Small Town , dirty ulicyumer worst hovel in the village

This topic is closed. Working code, that handles what need to be done:
x=177
x.to_bytes(1, byteorder='big').decode("ISO-8859-2")

read and stock various data from various usb devices in python

I am a beginner in python, and I am trying to read the data from several sensors (humidity, temperature, pressure sensors...) that I connect with a usb hub to my computer. My main goal is to record every five minutes the different values of those sensors and then store it to analyse it.
I have got all the data sheets and manuals of my sensors (which are from Hygrosens Instruments), I know how they work and what kind of data they are sending. But I do not know how to read them. Below is what I tried, using pyserial.
import serial #import the serial library
from time import sleep #import the sleep command from the time library
import binascii
output_file = open('hygro.txt', 'w') #create a file and allow you to write in it only. The name of this file is hygro.txt
ser = serial.Serial("/dev/tty.usbserial-A400DUTI", 9600) #load into a variable 'ser' the information about the usb you are listening. /dev/tty.usbserial.... is the port after plugging in the hygrometer, 9600 is for bauds, it can be diminished
count = 0
while 1:
read_byte = ser.read(size=1)
So now I want to find the end of the line of the data as the measurement informations that I need are in a line that begins with 'V', and if the data sheet of my sensor, it said that a line ends by , so I want to read one byte at a time and look for '<', then 'c', then 'r', then '>'. So I wanted to do this:
while 1:
read_byte = ser.read(size=8) #read a byte
read_byte_hexa =binascii.hexlify(read_byte) #convert the byte into hexadecimal
trad_hexa = int(read_byte_hexa , 16) #convert the hexadecimal into an int in purpose to compare it with another int
trad_firstcrchar = int('3c' , 16) #convert the hexadecimal of the '<' into a int to compare it with the first byte
if (trad_hexa == trad_firstcrchar ): #compare the first byte with the '<'
read_byte = ser.read(size=1) #read the next byte (I am not sure if that really works)
read_byte_hexa =binascii.hexlify(read_byte)# from now I am doing the same thing as before
trad_hexa = int(read_byte_hexa , 16)
trad_scdcrchar = int('63' , 16)
print(trad_hexa, end='/')# this just show me if it gets in the condition
print(trad_scdcrchar)
if (trad_hexa == trad_scdcrchar ):
read_byte = ser.read(size=1) #read the next byte
read_byte_hexa =binascii.hexlify(read_byte)
trad_hexa = int(read_byte_hexa , 16)
trad_thirdcrchar = int('72' , 16)
print(trad_hexa, end='///')
print(trad_thirdcrchar)
if (trad_hexa == trad_thirdcrchar ):
read_byte = ser.read(size=1) #read the next byte
read_byte_hexa =binascii.hexlify(read_byte)
trad_hexa = int(read_byte_hexa , 16)
trad_fourthcrchar = int('3e' , 16)
print(trad_hexa, end='////')
print(trad_fourthcrchar)
if (trad_hexa == trad_fourthcrchar ):
print ('end of the line')
But I am not sure that it works, I mean I think it does not have the time to read the second one, the second byte I am reading, it's not exactly the second one. So that's why I want to use a buffer, but I don't really get how I can do that. I am going to look for it, but if someone knows an easier way to do what I want, I am ready to try it!
Thank you

You seem to be under the impression that the end-of-line character for that sensor's communication protocol is 4 different characters: <, c, r and >. However, what is being referred to is the carriage return, often denoted by <cr> and in many programming languages just by \r (even though it looks like 2 characters, it represents just one character).
You could simplify your code greatly by reading in the data from the sensors line by line, as the protocol is structured. Here's something to help you get started:
import time
def parse_info_line(line):
# implement to your own liking
logical_channel, physical_probe, hardware_id, crc = [line[index:index+2] for index in (1, 3, 5, 19)]
serialno = line[7:19]
return physical_probe
def parse_value_line(line):
channel, crc = [line[ind:ind+2] for ind in (1,7)]
encoded_temp = line[3:7]
return twos_comp(int(encoded_temp, 16), 16)/100.
def twos_comp(val, bits):
"""compute the 2's compliment of int value `val`"""
if (val & (1 << (bits - 1))) != 0: # if sign bit is set e.g., 8bit: 128-255
val = val - (1 << bits) # compute negative value
return val # return positive value as is
def listen_on_serial(ser):
ser.readline() # do nothing with the first line: you have no idea when you start listening to the data broadcast from the sensor
while True:
line = ser.readline()
try:
first_char = line[0]
except IndexError: # got no data from sensor
break
else:
if first_char == '#': # begins a new sensor record
in_record = True
elif first_char == '$':
in_record = False
elif first_char == 'I':
parse_info_line(line)
elif first_char == 'V':
print(parse_value_line(line))
else:
print("Unexpected character at the start of the line:\n{}".format(line))
time.sleep(2)
The twos_comp function was written by travc and you are encouraged to upvote his answer when you have enough reputation and if you intend to use his code (and even if you won't, it's still a good answer, I upvoted it just now). The listen_on_serial could be improved as well (many Python programmers will recognize the switch-structure and implement it with a dictionary rather than if... elif... elif...), but this is only intended to get you started.
As a test, the following code extract simulates the sensor sending some data (which is line-delimited, using the carriage return as the end-of-line marker), which I copied from the pdf you linked to (FAQ_terminalfenster_E.pdf).
>>> import serial
>>> import io
>>>
>>> ser = serial.serial_for_url('loop://', timeout=1)
>>> serio = io.TextIOWrapper(io.BufferedRWPair(ser, ser), newline='\r', line_buffering=True)
>>> serio.write(u'A1A0\r' # simulation of starting to listen halfway between 2 records
... '$\r' # marks the end of the previous record
... '#\r' # marks the start of a new sensor record
... 'I0101010000000000001B\r' # info about a sensor's probe
... 'V0109470D\r' # data matching that probe
... 'I0202010000000000002B\r' # other probe, same sensor
... 'V021BB55C\r') # data corresponding with 2nd probe
73L
>>>
>>> listen_on_serial(serio)
23.75
70.93
>>>
Note that it is recommended by the pyserial docs to be using TextIOWrapper when the end-of-line character is not \n (the linefeed character), as was also answered here.

Python pySerial read data from arduino breaks when sending "(char)0"

I send some data from an arduino using pySerial.
My Data looks like
bytearray(DST, SRC, STATUS, TYPE, CHANNEL, DATA..., SIZEOFDATA)
where sizeofData is a test that all bytes are received.
The problem is, every time when a byte is zero, my python program just stops reading there:
serial_port = serial.Serial("/dev/ttyUSB0")
while serial_port.isOpen():
response_header_str = serial_port.readline()
format = '>';
format += ('B'*len(response_header_str));
response_header = struct.unpack(format, response_header_str)
pprint(response_header)
serial_port.close()
For example, when I send bytearray(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) everything is fine. But when I send something like bytearray(1,2,3,4,0,1,2,3,4) I don't see everything beginning with the zero.
The problem is that I cannot avoid sending zeros as I am just sending the "memory dump" e.g. when I send a float value, there might be zero bytes.
how can I tell pyserial not to ignore zero bytes.

I've looked through the source of PySerial and the problem is in PySerial's implementation of FileLike.readline (in http://svn.code.sf.net/p/pyserial/code/trunk/pyserial/serial/serialutil.py). The offending function is:
def readline(self, size=None, eol=LF):
"""\
Read a line which is terminated with end-of-line (eol) character
('\n' by default) or until timeout.
"""
leneol = len(eol)
line = bytearray()
while True:
c = self.read(1)
if c:
line += c
if line[-leneol:] == eol:
break
if size is not None and len(line) >= size:
break
else:
break
return bytes(line)
With the obvious problem being the if c: line. When c == b'\x00' this evaluates to false, and the routine breaks out of the read loop. The easiest thing to do would be to reimplement this yourself as something like:
def readline(port, size=None, eol="\n"):
"""\
Read a line which is terminated with end-of-line (eol) character
('\n' by default) or until timeout.
"""
leneol = len(eol)
line = bytearray()
while True:
line += port.read(1)
if line[-leneol:] == eol:
break
if size is not None and len(line) >= size:
break
return bytes(line)
To clarify from your comments, this is a replacement for the Serial.readline method that will consume null-bytes and add them to the returned string until it hits the eol character, which we define here as "\n".
An example of using the new method, with a file-object substituted for the socket:
>>> # Create some example data terminated by a newline containing nulls.
>>> handle = open("test.dat", "wb")
>>> handle.write(b"hell\x00o, w\x00rld\n")
>>> handle.close()
>>>
>>> # Use our readline method to read it back in.
>>> handle = open("test.dat", "rb")
>>> readline(handle)
'hell\x00o, w\x00rld\n'
Hopefully this makes a little more sense.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Weird Python output when printing - python

Related

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 398: invalid start byte || book python for everyone

on-the-fly parsing of binary serial data in python

Convertion between ISO-8859-2 and UTF-8 in Python

read and stock various data from various usb devices in python

Python pySerial read data from arduino breaks when sending "(char)0"

Categories

Resources