Pyserial buffer fills faster than I can read - python

I am reading data from a microcontroller via serial, at a baudrate of 921600. I'm reading a large amount of ASCII csv data, and since it comes in so fast, the buffer gets filled and all the rest of the data gets lost before I can read it. I know I could manually edit the pyserial source code for serialwin32 to increase the buffer size, but I was wondering if there is another way around it?
I can only estimate the amount of data I will receive, but it is somewhere around 200kB of data.

Have you considered reading from the serial interface in a separate thread that is running prior to sending the command to uC to send the data?
This would remove some of the delay after the write command and starting the read. There are other SO users who have had success with this method, granted they weren't having buffer overruns.
If this isn't clear let me know and I can throw something together to show this.
EDIT
Thinking about it a bit more: if you're trying to read from the buffer and write it out to the file system at the same time, even the standalone thread might not save you. To minimize the processing time, you might consider reading, say, 100 bytes at a time with myserialobj.read(size=100) and pushing that data into a Queue, then processing it all after the transfer has completed.
Pseudo Code Example
import queue

def thread_main_loop(myserialobj, data_queue, stop_event):
    # Reader thread: only move bytes off the serial port, nothing else.
    while not stop_event.is_set():
        chunk = myserialobj.read(size=100)  # may return fewer bytes on timeout
        if chunk:
            data_queue.put_nowait(chunk)

def process_queue_when_done(data_queue):
    # Call this after the transfer has completed.
    while True:
        try:
            popped_data = data_queue.get_nowait()
        except queue.Empty:
            break
        # Process the data as needed
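To show the ordering that matters here (reader running first, command sent second), the following is a minimal sketch rather than a drop-in solution. The function name capture, the send_command callable and the duration parameter are made up for illustration; port is assumed to be any object whose read(size) returns bytes and gives up after a short timeout, such as a pySerial port opened with a small timeout value.

```python
import queue
import threading
import time


def capture(port, send_command, duration):
    """Start a reader thread, then send the command and collect the reply.

    `port` is any object with a read(size) method that returns bytes and
    gives up after a short timeout. `send_command` is a hypothetical
    callable that tells the device to start transmitting.
    """
    chunks = queue.Queue()
    stop = threading.Event()

    def reader():
        # Do nothing but move bytes from the port into the queue.
        while not stop.is_set():
            data = port.read(100)
            if data:
                chunks.put(data)

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()          # the reader is already listening...
    send_command()          # ...before the device starts sending
    time.sleep(duration)    # let the transfer run
    stop.set()
    worker.join()

    # Processing happens only after the transfer is over.
    received = bytearray()
    while True:
        try:
            received += chunks.get_nowait()
        except queue.Empty:
            break
    return bytes(received)
```

The duration has to be chosen generously from the expected amount of data, since this sketch has no other way of knowing when the device is done.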

There's a "Receive Buffer" slider that's accessible from the com port's Properties Page in Device Manager. It is found by following the Advanced button on the "Port Settings" tab.
More info:
http://support.microsoft.com/kb/131016 under heading Receive Buffer
http://tldp.org/HOWTO/Serial-HOWTO-4.html under heading Interrupts
Try knocking it down a notch or two.

You do not need to manually change pyserial code.
If you run your code on Windows, you simply need to add a line in your code
ser.set_buffer_size(rx_size = 12800, tx_size = 12800)
Where 12800 is an arbitrary number I chose. You can make the receiving (rx) and transmitting (tx) buffers as big as 2147483647.
See also:
https://docs.python.org/3/library/ctypes.html
https://msdn.microsoft.com/en-us/library/system.io.ports.serialport.readbuffersize(v=vs.110).aspx
You might be able to set up the serial port from the DLL
// Setup serial
mySerialPort.BaudRate = 9600;
mySerialPort.PortName = comPort;
mySerialPort.Parity = Parity.None;
mySerialPort.StopBits = StopBits.One;
mySerialPort.DataBits = 8;
mySerialPort.Handshake = Handshake.None;
mySerialPort.RtsEnable = true;
mySerialPort.ReadBufferSize = 32768;
Property Value
Type: System.Int32
The buffer size, in bytes. The default value is 4096; the maximum value is that of a positive int, or 2147483647
And then open and use it in Python

I am somewhat surprised that nobody has yet mentioned the correct solution to such problems (when available), which is effective flow control through either software (XON/XOFF) or hardware flow control between the microcontroller and its sink. The issue is well described by this web article.
It may be that the source device doesn't honour such protocols, in which case you are stuck with a series of solutions that delegate the problem upwards to where more resources are available (move it from the UART buffer to the driver and upwards towards your application code). If you are losing data, it would certainly seem sensible to try and implement a lower data rate if that's a possibility.
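As a small sketch of how flow control could be requested from the PC side: pySerial's Serial constructor accepts the boolean keyword arguments xonxoff, rtscts and dsrdtr. The helper name flow_control_kwargs and its mode strings are invented for this example, and none of it helps unless the microcontroller firmware (and, for hardware flow control, the wiring) honours the same protocol.

```python
def flow_control_kwargs(mode):
    """Map a flow-control mode name to pySerial constructor keyword arguments."""
    options = {"xonxoff": False, "rtscts": False, "dsrdtr": False}
    if mode == "software":
        options["xonxoff"] = True   # XON/XOFF characters in the data stream
    elif mode == "hardware":
        options["rtscts"] = True    # RTS/CTS lines must be wired on both ends
    elif mode != "none":
        raise ValueError("mode must be 'none', 'software' or 'hardware'")
    return options


# Intended use (needs pySerial and a real port; "COM3" is a placeholder):
#   import serial
#   ser = serial.Serial("COM3", 921600, timeout=1,
#                       **flow_control_kwargs("hardware"))
```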

For me the problem was it was overloading the buffer when receiving data from the Arduino.
All I had to do was mySerialPort.flushInput() and it worked.
I don't know why mySerialPort.flush() didn't work. flush() must only flush the outgoing data?
All I know is mySerialPort.flushInput() solved my problems.

Related

Best approach to parse buffer string in python

I'm working on an embedded system that sends commands via UART.
The UART runs at 115200 baud.
On the PC side I want to read these commands, parse them and execute the related action.
I chose Python as the language for the script.
This is a typical command received from the embedded system:
S;SEND;40;{"ID":"asg01","T":1,"P":{"T":180}};E
Each message starts with S and ends with E.
The command associated to the message is "SEND" and the payload length is 40.
My idea is to read the bytes coming from the UART and:
check if the message starts with S
check if the message ends with E
if the above assumptions are true, split the message in order to find the command and the payload.
What is the best way to parse all the bytes coming from an asynchronous UART?
My concern is losing messages due to wrong (or slow) parsing.
Thanks for the help!
BR,
Federico
In my day job, I wrote the software for an embedded system and a PC communicating with each other by a USB cable, using the UART protocol at 115,200 baud.
I see that you tagged your post with PySerial, so you already know about Python's most popular package for serial port communication. I will add that if you are using PyQt, there's a serial module included in that package as well.
115,200 baud is not fast for a modern desktop PC. I doubt that any parsing you do on the PC side will fail to keep up. I parse data streams and plot graphs of my data in real time using PyQt.
What I have noticed in my work with communication between an embedded system and a PC over a UART is that some data gets corrupted occasionally. A byte can be garbled, repeated, or dropped. Also, even if no bytes are added or dropped, you can occasionally perform a read while only part of a packet is in the buffer, and the read will terminate early. If you use a fixed read length of 40 bytes and trust that each read will always line up exactly with a data packet as you show above, you will frequently be wrong.
To solve these kinds of problems, I wrote a FIFO class in Python which consumes serial port data at the head of the FIFO, yields valid data packets at the tail, and discards invalid data. My FIFO holds 3 times as many bytes as my data packets, so if I am looking for packet boundaries using specific sequences, I have plenty of signposts.
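A minimal sketch of such a FIFO, assuming the S;...;E framing from the question. It is not the class described above: the name PacketFifo, its parameters and the max_size limit are made up for illustration. It also assumes the two-byte sequence ;E never occurs inside the payload, which real JSON could violate.

```python
class PacketFifo:
    """Accumulate raw serial bytes and return complete frames."""

    def __init__(self, start=b"S;", end=b";E", max_size=4096):
        self.start = start
        self.end = end
        self.max_size = max_size
        self.buf = bytearray()

    def feed(self, data):
        """Append newly read bytes and return a list of complete frames."""
        self.buf += data
        frames = []
        while True:
            begin = self.buf.find(self.start)
            if begin < 0:
                # No start marker yet: keep only a tail that could be the
                # beginning of a marker, discard everything before it.
                keep = len(self.start) - 1
                del self.buf[:max(0, len(self.buf) - keep)]
                break
            del self.buf[:begin]  # discard garbage before the start marker
            stop = self.buf.find(self.end, len(self.start))
            if stop < 0:
                break  # frame incomplete, wait for more bytes
            frame_len = stop + len(self.end)
            frames.append(bytes(self.buf[:frame_len]))
            del self.buf[:frame_len]
        if len(self.buf) > self.max_size:
            self.buf.clear()  # runaway frame: drop it rather than grow forever
        return frames
```

Each chunk returned by a serial read is passed to feed(), whatever its size; a read that ends in the middle of a packet simply leaves the partial frame in the buffer until the rest arrives.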
A few more recommendations: work in Python 3 if you have the choice, it's cleaner. Use bytes and bytearray objects. Don't use str, because you will find yourself converting back and forth between Unicode and ASCII.
This format is almost parseable as a csv, but not quite, because the fourth field is JSON, and you may not be able to guarantee that the JSON doesn't contain any strings with embedded semicolons. So, I think you probably want to just use string (or, rather, bytes) manipulation functions:
import json

def parsemsg(buf):
    # Split off the first three fields only; the JSON payload may contain ';'
    s, cmd, length, rest = buf.split(b';', 3)
    # The end marker is whatever follows the last ';'
    j, _, e = rest.rpartition(b';')
    if s != b'S' or e != b'E':
        raise ValueError('must start with S and end with E')
    return cmd.decode('utf-8'), int(length), json.loads(j)
Then:
>>> parsemsg(b'S;SEND;40;{"ID":"asg01","T":1,"P":{"T":180}};E')
('SEND', 40, {'ID': 'asg01', 'T': 1, 'P': {'T': 180}})
The actual semicolon-parsing part takes 602ns on my laptop. The decode and int raise that to 902ns. The json.loads, on the other hand, takes 10us. So, if you're worried about performance, the JSON part is really the only part that matters (trying third-party JSON libs I happen to have installed, the fastest one is still 8.1us, which isn't much better). You might as well keep everything else simple and robust.
Also, considering that you're reading this at 115200 baud, these messages can't arrive any faster than one every few milliseconds, so spending 11us parsing them is not even close to a problem in the first place.
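The order of magnitude of that estimate can be checked with a little arithmetic. This sketch assumes 8N1 framing, so each byte costs 10 bits on the wire (1 start bit, 8 data bits, 1 stop bit); the function name wire_time_ms is made up for the example.

```python
def wire_time_ms(n_bytes, baud, bits_per_byte=10):
    """Time in milliseconds needed to transmit n_bytes at the given baud rate.

    bits_per_byte=10 assumes 8N1 framing: 1 start, 8 data and 1 stop bit.
    """
    return n_bytes * bits_per_byte / baud * 1000.0


message = b'S;SEND;40;{"ID":"asg01","T":1,"P":{"T":180}};E'
# Transmission time of the sample message, in milliseconds.
print(round(wire_time_ms(len(message), 115200), 2))
```

Whatever the exact figure, it is hundreds of times longer than the roughly 11us needed to parse the message.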

Weird (py)serial linux corruption

I have a Linux SBC based on the Atmel SAMA5D36. I have another device hooked up to it via /dev/ttyS2 via TTL lines (115200 8N1). Using pyserial, I have a pretty high bandwidth query/response conversation with that device.
Periodically (at least once a minute), I see a very repeatable corruption of the data coming back from the other device. If it were to respond with some text like
"123456" (ascii character values)
It will drop one character AND add character-0 after the following character:
"13\x00456"
Hopefully that's clear. It will drop the 2, the next character is as expected, a character-0 follows, and then back to normal.
I am using kernel 4.1.10. Via some debug statements, I'm pretty sure this is not happening in my python loop, because the 0's show up in random spots of the read() buffer. I have also hooked up a scope on the incoming lines and have verified that the wire is not carrying this corruption.
I am looking for an answer that can get me in the right direction of figuring out why this is happening. CPU load does seem to increase the frequency (for example, when I'm doing a bunch of DBUS traffic for a BLE adaptor attached).
This could be a result of overflow errors. If you look at atmel_serial you can see if there are any errors.
cat /proc/tty/driver/atmel_serial
For example on ttyS2 you might see something like this (oe: shows the overflow errors):
2: uart:ATMEL_SERIAL mmio:0xF0020000 irq:31 tx:266758 rx:361385 oe:51 RTS|DTR|DSR|CD|RI
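To watch the overrun counter while the application runs, a line like the one above can be parsed with a short helper. This is only a sketch: the function name parse_uart_counters is invented, and it assumes the key:value layout shown in the example line.

```python
import re


def parse_uart_counters(line):
    """Extract numeric counters such as tx, rx and oe from one status line."""
    counters = {}
    for key, value in re.findall(r"\b(tx|rx|fe|pe|brk|oe):(\d+)", line):
        counters[key] = int(value)
    return counters


sample = ("2: uart:ATMEL_SERIAL mmio:0xF0020000 irq:31 "
          "tx:266758 rx:361385 oe:51 RTS|DTR|DSR|CD|RI")
print(parse_uart_counters(sample))
```

If the oe value grows each time a corrupted response shows up, the dropped characters are most likely receiver overruns.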
Since you are running the serial line at a high rate, you might try enabling DMA on the USART lines. Tweak the appropriate dts file in your kernel by adding the following to your usart settings:
atmel,use-dma-rx;
atmel,use-dma-tx;
For my kernel, I had to disable SPI and I2C so that there would be enough DMA channels available for the USART.

PySerial does not receive data correctly

I have a little problem receiving data correctly via pySerial: it often does not read the full data, or reads too much of it. Fairly often there are additional characters, or some characters or parts of the sent data are missing. It seems that the PC and the emitter of the data are not synchronised correctly.
In the current example I use an Arduino, sending 'Hello World' to the serial port of my PC (OS is Ubuntu 14.04), with the following simple code:
void setup() {
  Serial.begin(9600);
  Serial.print("Programme initiated\n");
}

void loop() {
  // nothing to do; the message is sent once from setup()
}
I use the following python3 code to receive the data:
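import serial
import time
port = '/dev/ttyACM0'  # placeholder: replace with your device path
arduino = serial.Serial(port, baudrate=9600, timeout=2)
print(arduino.isOpen())  # call the method; without () it prints the method object
print(arduino)
time.sleep(1)
while True:
    print(arduino.readline())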
This is pretty much a simple tutorial example, and here is what I receive (apart from the correct stuff):
b'PrProgramme initiated\n'
or
b'PProgramme initiated\n'
or
b'ProgProgramme initiated\n'
or
b'ogramme initiated\n'
I moved on to more complex problems in my code, but I still haven't solved this one. When sending a message in a loop from the Arduino (the standard hello world code), it often needs time to stabilise (during which it again shows only the middle fragment of the data) and after that it runs quite stably, but even then it sometimes breaks up single lines.
I faced the same difficulties when communicating with a multimeter device. There, it often does not read the first characters or mixes them up with previous data.
Has anyone faced this problem before? I think it is a question of synchronisation, but I don't know how to solve it. What puzzles me is that I really only used tutorial code and it doesn't seem to work properly. Or is it a configuration problem of my PC?
What you are seeing happens because several different things are going on.
First of all, every time you open the serial port you cause what is called an "autoreset", and the Arduino reboots. That can be avoided in hardware, or even in software by explicitly disabling the reset signal on open. How to do that varies and is out of scope of the question.
Then we have to understand that serial does NOT wait for the other side to be listening before sending data; so if you disable the autoreset and connect to the Arduino, you will see some arbitrary part of the program's output, depending on its current state.
Finally, there is some buffering on the PC (and sometimes even on the UART-to-USB side), so it's not true that data is lost if you are not listening; it may still be in the buffer.
We could say the first three artifacts may be caused by buffered data plus the reboot (this happens a lot when you send a lot of data; it can even break automatic code upload, so that you have to do a manual procedure), while the last one may come from something that prevented the buffer from filling: maybe buffering was disabled, maybe some odd timing when opening the serial port, maybe you disabled the autoreset, or maybe part of the message was already gone by the time the Arduino was enumerated.

pySerial reading data from AT commands

I'm having trouble reading the response from an RS232 OBD2 interface via pySerial.
The code successfully sends the data, as I can see from a terminal screen running in parallel, but fails to read and print the response, whatever the response is.
Right now the code is not capable of printing the response in either version of Python.
The code looks something like this :
from serial import * # I also tried using /from serial import Serial
import time
ser = Serial("/dev/rfcomm1", 38400, timeout=1)
#print ('Starting up, formatting responses')
#ser.write("ATZ\r"),
#ser.write("ATSP0\r"),
#ser.write("ATS1\r"),
#ser.write("ATL1\r"),
#ser.write("ATH1\r"),
#ser.write("ATF1\r")
#time.sleep(1)
#print ('We have lift-off !')
if ser.inWaiting() > 0:
    ser.flushInput()
#ser.timeout = 1.
time.sleep(1)
#print (raw_data)
ser.write("AT RV\r") #The response should be something like 13.5V, but nothing
ser.timeout = 1.
msg = ser.read(size=1024)
print msg
ser.close()
I left only the AT RV command active because I had already sent the text-formatting commands while working on it. Right now when I send it, it just gives me a blank line (although the terminal running on the same machine displays the desired output).
There are no errors in the code, and the commands go through and are responded to by the interface, and I can see that in another live term, but nothing appears when running the Python code.
What should I do ?
You should read after writing, not before.
# before writing anything, ensure there is nothing in the buffer
if ser.inWaiting() > 0:
    ser.flushInput()
# set the timeout to something reasonable, e.g., 1 s
ser.timeout = 1.
# send the commands (bytes literals work in both Python 2 and 3):
ser.write(b"ATZ\r")
# ...
# read the response, guess a length that is more than the message
msg = ser.read(1024)
print(msg)
# send more commands
# read more responses
# ...
The point here is that there is no way to know when the response has been received. This code waits for one second after each command sent, unless more than 1024 bytes arrive during that time. There are more clever algorithms, but let's try with this one, first.
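One slightly more clever approach is to stop reading as soon as a known terminator arrives instead of always waiting out the timeout. The sketch below assumes the adapter ends each reply with a > prompt, as ELM327-style interfaces commonly do; that terminator is an assumption, not something confirmed for the device in the question. The function name read_response and its parameters are made up, and port can be any object whose read(1) returns one byte or an empty result on timeout.

```python
import time


def read_response(port, terminator=b">", overall_timeout=5.0):
    """Read from port until terminator is seen or overall_timeout elapses."""
    deadline = time.monotonic() + overall_timeout
    response = bytearray()
    while time.monotonic() < deadline:
        chunk = port.read(1)   # returns b"" when the port's own timeout expires
        if not chunk:
            continue           # nothing yet, keep waiting until the deadline
        response += chunk
        if response.endswith(terminator):
            break              # complete reply received, stop early
    return bytes(response)
```

With a pySerial port this would be called right after ser.write(...), with the port's own timeout set to a small value so the loop can check the deadline regularly.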
If you want to do something more complicated with the serial line, have a look at the pexpect module.
Some thoughts on debugging Python serial problems
Serial communication problems are sometimes a bit sticky to solve. pySerial is a reliable library, but as different platforms have different types of serial API, there are a lot of details. Things have not become any easier by the removal of physical serial ports, as the USB converters bring an extra layer into the game. Bluetooth converters are even worse.
The best way to debug the physical layer is to have some monitor hardware with two serial ports tapped into the serial lines. This kind of sniffer helps to isolate the problem to either end of the connection. Unfortunately, such sniffers are very rarely at hand when needed.
The next best thing is to short the RD and TD (RXD, TXD) pins of the serial line. This way all data will be echoed. If the data is received as sent, the physical connection is good. One thing to take care of is handshaking. If you do not know what you are doing, disable all flow control (xon/xoff, rts/cts, dtr/dsr). pySerial disables all of these unless otherwise instructed.
In the case of the question above the physical connection is ok, as another piece of software demonstrates that the data is sent and understood by the other device. (Seeing that something is sent does not prove anything, as that information does not go through the physical layer, but seeing something produced by another device is received proves that the physical connection is ok.)
Now we know the data comes into the operating system, but pySerial does not see it. Or then our code is still somehow bad (no, it shouldn't, but...)
Let us suspect our own code and try someone else's code. This can be run from the command prompt:
python -m serial.tools.miniterm /dev/rfcomm1 38400
Now we have a terminal which can be used to manually send/receive data from the other party. If the behaviour can be repeated (sends ok, data is received into the system, but not shown on the terminal) with this, then the problem is probably not in our code.
The next step then is to try:
sudo python -m serial.tools.miniterm /dev/rfcomm1 38400
In principle access right problems lead to situations where we can receive but not send. But it does not harm to test this, because odd rights cause odd problems.
pySerial has a handy function readline which should read one line at a time from the serial line. This is often what is wanted. However, in this specific case the lines seem to end with \r instead of \n. The same may be repeated elsewhere in code, so with special data special care is needed. (The simple "read with timeout" is safe but slow in this sense.) This is discussed in: pySerial 2.6: specify end-of-line in readline()
The same issue plagues all terminal programs. For the pySerial miniterm, see its documentation (command-line option --cr).
If there are timeouts, they can and should be made longer for debugging purposes. A one-second timeout may be changed into a ten-second timeout to make sure the other device has ample time to answer.
I had exactly the same problem, through Python 2 IDLE no results displayed on the IDLE screen, but the results were redirected to picocom active at the terminal. I needed to capture results because my goal is to read incoming SMSs.
The following code solved my problem, I do not know the reason yet, ongoing analysis.
import time
import serial
modem1 = serial.Serial("/dev/ttyUSB3",baudrate=115200,timeout=0,rtscts=0,xonxoff=0)
def sendat1(cmd):
    if cmd == 'res' : modem1.write('Z'); return
    if cmd == 'out' : modem1.write(chr(26)); return
    modem1.write('AT+'+cmd+'\r')
    time.sleep(3)
    obu = str(modem1.inWaiting())
    msg = modem1.read(32798)
    print(obu+':\n'+msg)
    return
try:
    if modem1.inWaiting()>0: modem1.flushInput()
    sendat1('res')
    sendat1('CMGF=1')
    sendat1('CMGL')
    sendat1('out')
finally:
    modem1.close()

How to expand input buffer size of pyserial

I want to communicate with a phone via the serial port. After writing a command to the phone, I used ser.read(ser.inWaiting()) to get its return value, but I always got a total of 1020 bytes, while the expected response is over 50KB.
I have tried ser.read(50000), but then the interpreter hangs.
How can I expand the input buffer to get the whole response at once?
If you run your code on Windows platform, you simply need to add a line in your code.
from serial import Serial
ser = Serial(port='COM1', baudrate=115200, timeout=1, writeTimeout=1)
ser.set_buffer_size(rx_size = 12800, tx_size = 12800)
Where 12800 is an arbitrary number I chose. You can make the receiving (rx) and transmitting (tx) buffers as big as 2147483647 (equal to 2^31 - 1).
This will allow you to expand the input buffer to get all of the returns at once.
Be aware that this will work only with some drivers, since it is only a recommendation. The driver might not take your recommendation and stick with its original buffer size.
I have had exactly the same problem, including the 1020 byte buffer size and haven't found a way to change this. My solution has been to implement a loop like:
import time
in_buff = b''
while mbed.inWaiting():
    in_buff += mbed.read(mbed.inWaiting())  # read the contents of the buffer
    time.sleep(0.11)  # depending on your hardware, it can take time to refill the buffer
I would be very pleased if someone can come up with a buffer-resize solution!
I'm guessing that you are reading 1020 bytes because that is all there is in the buffer, which is what ser.inWaiting() is returning. Depending on the baud rate 50 KB may take a while to transfer, or the phone is expecting something different from you. Handshaking?
Inspect the value of ser.inWaiting(), and then the contents of what you are receiving, for hints.
pySerial uses the native OS drivers for serial receiving. In the case of Windows, the size of the input buffer is determined by the device driver.
You may be able to increase the size in your Device Manager settings if it is possible, but ultimately you just need to read the data in fast enough.
