I am using python 2.7.2 with pyserial 2.6.
What is the best way to use pyserial.readline() when talking to a device that has a character other than "\n" for eol? The pyserial doc points out that pyserial.readline() no longer takes an 'eol=' argument in python 2.6+, but recommends using io.TextIOWrapper as follows:
ser = serial.serial_for_url('loop://', timeout=1)
sio = io.TextIOWrapper(io.BufferedRWPair(ser, ser))
However the python io.BufferedRWPair doc specifically warns against that approach, saying "BufferedRWPair does not attempt to synchronize accesses to its underlying raw streams. You should not pass it the same object as reader and writer; use BufferedRandom instead."
Could someone point to a working example of pyserial.readline() working with an eol other than '\n'?
Thanks,
Tom
read() takes a user-settable maximum size for the data it reads (in bytes). If your data strings are a predictable length, you could simply set that to capture a fixed-length string. It's a bit of 'Kentucky windage' in execution, but so long as your data strings are consistent in size it won't break.
Beyond that, your real option is to capture and write the data stream to another file and split out your entries manually or programmatically.
For example, you could write your data stream to a .csv file and set the delimiter to your EOL character.
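The fixed-size read suggested above can be sketched as follows. This is a sketch, not your actual device protocol: the 6-byte FRAME_SIZE is an assumed value, and io.BytesIO stands in for the open serial port (pyserial's ser.read(n) has the same calling shape):

```python
import io

FRAME_SIZE = 6  # hypothetical frame length, chosen for this sketch

# Stand-in for an open serial port; replace with your serial.Serial object.
port = io.BytesIO(b"AB1234CD5678")

def read_frame(stream):
    # read() with a size argument returns at most that many bytes, so
    # consistent frame sizes let us skip EOL handling entirely.
    return stream.read(FRAME_SIZE)

print(read_frame(port))  # b'AB1234'
print(read_frame(port))  # b'CD5678'
```

Note that with a real port in blocking mode, read(n) waits until n bytes arrive (or the timeout expires), which is exactly what makes fixed-length framing workable.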
Assume s is an open serial.Serial object and the newline character is '\r'.
Then this will read until '\r' and return everything received (terminator included) as a string.
def read_value(encoded_command):
    s.write(encoded_command)
    temp = ''
    response = ''
    # assumes no read timeout is set, so s.read() blocks until a byte arrives
    while '\r' not in response:
        response = s.read().decode()
        temp += response
    return temp
And, BTW, I implemented the io.TextIOWrapper recommendation you mention above and it 1) is much slower and 2) somehow causes the port to close.
In my python code I wrote the following function to receive self-defined binary package from stdin.
import json
import sys

def recvPkg():
    # The first 4 bytes give the remaining package length
    Len = int.from_bytes(sys.stdin.buffer.read(4), byteorder='big', signed=True)
    # Then read the remaining package
    data = json.loads(str(sys.stdin.buffer.read(Len), 'utf-8'))
    # do something...

while True:
    recvPkg()
Then, in another Node.js program I spawn this python program as a child process, and send bytes to it.
childProcess = require('child_process').spawn('./python_code.py');
childProcess.stdin.write(someBinaryPackage)
I expect the child process to read from its stdin buffer as soon as a package arrives and produce output. But it doesn't work, and I think the reason is that the child process won't begin to read unless its stdin receives a signal, like an EOF. As a proof: if I close childProcess's stdin after stdin.write, the Python code works and receives all the buffered packages at once. That is not what I want, because I need childProcess's stdin to stay open. So is there any other way for Node.js to signal the child process to read from its stdin buffer?
From Wikipedia (emphasis mine):
Input from a terminal never really "ends" (unless the device is disconnected), but it is useful to enter more than one "file" into a terminal, so a key sequence is reserved to indicate end of input. In UNIX the translation of the keystroke to EOF is performed by the terminal driver, so a program does not need to distinguish terminals from other input files.
There is no way to send an EOF character the way you are expecting, because EOF isn't really a character that exists. When you're in a terminal, you can press Ctrl+Z on Windows or Ctrl+D in UNIX-like environments. These produce control characters (code 26 on Windows, code 04 on UNIX) which are read by the terminal. The terminal, upon reading this code, essentially stops writing to a program's stdin and closes it.
In Python, you can keep calling .read() on a file object forever. The EOF condition is that .read() returns ''. In some other languages, this might be -1, or some other condition.
Consider:
>>> my_file = open("file.txt", "r")
>>> my_file.read()
'This is a test file'
>>> my_file.read()
''
The last character here isn't an EOF marker; there's just nothing left. Python has read to the end of the file and can't .read() any more.
Because stdin is a special type of 'file', it doesn't have an end. You have to define that end. The terminal defines that end with the control characters, but here you are not passing data to stdin via a terminal, so you'll have to manage it yourself.
Just closing the file
Input [...] never really "ends" (unless the device is disconnected)
Closing stdin is probably the simplest solution here. stdin is an infinite file, so once you're done writing to it, just close it.
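The close-as-EOF behavior can be demonstrated end to end from Python alone, spawning a child with subprocess (a sketch; the child script is a made-up stand-in for the real one):

```python
import subprocess
import sys

# Child: blocks in read() until the parent closes the pipe; that close
# is the "EOF" described above.
child_code = "import sys; print(len(sys.stdin.buffer.read()))"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
proc.stdin.write(b"hello world")
proc.stdin.close()  # EOF: the child's read() now returns
print(proc.stdout.read().strip())  # b'11'
proc.wait()
```

Until the close, the child sits in read() no matter how many bytes have been written, which is exactly the hang described in the question.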
Expect your own control character
Another option is to define your own control character. You can use whatever you want here. The example below uses a NULL byte.
Python
class FileWithEOF:
    def __init__(self, file_obj):
        self.file = file_obj
        self.value = bytes()

    def __enter__(self):
        return self

    def __exit__(self, *args, **kwargs):
        pass

    def read(self):
        while True:
            val = self.file.buffer.read(1)
            if val == b"\x00":
                break
            self.value += val
        return self.value
data = FileWithEOF(sys.stdin).read()
Node
childProcess = require('child_process').spawn('./python_code.py');
childProcess.stdin.write("Some text I want to send.");
childProcess.stdin.write(Buffer.from([00]));
You might be reading the wrong length
I think the value you're capturing in Len is less than the length of your file.
Python
import sys

while True:
    length = int(sys.stdin.read(2))
    with open("test.txt", "a") as f:
        f.write(sys.stdin.read(length))
Node
childProcess = require('child_process').spawn('./test.py');
// Python reads the first 2 characters (`.read(2)`)
childProcess.stdin.write("10");
// Python reads 9 characters, but does nothing because it's
// expecting 10. `stdin` is still capable of producing bytes from
// Python's point of view.
childProcess.stdin.write("123456789");
// Writing the final byte hits 10 characters, and the contents
// are written to `test.txt`.
childProcess.stdin.write("A");
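The 4-byte length-prefix framing from the question can also be exercised without any pipes, using io.BytesIO as the "stdin". This is a sketch with a made-up JSON payload, but the pack/unpack logic mirrors recvPkg():

```python
import io
import json

def pack_msg(obj):
    # Encode the payload, then prepend its length as 4 big-endian bytes,
    # mirroring the framing recvPkg() expects.
    payload = json.dumps(obj).encode("utf-8")
    return len(payload).to_bytes(4, byteorder="big", signed=True) + payload

def unpack_msg(stream):
    # Same logic as recvPkg(), but reading from any binary stream.
    length = int.from_bytes(stream.read(4), byteorder="big", signed=True)
    return json.loads(stream.read(length).decode("utf-8"))

buf = io.BytesIO(pack_msg({"cmd": "ping"}) + pack_msg({"cmd": "pong"}))
print(unpack_msg(buf))  # {'cmd': 'ping'}
print(unpack_msg(buf))  # {'cmd': 'pong'}
```

A round trip like this is a quick way to confirm the length you write from Node matches the length Python reads before involving the child process at all.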
I have a plain ASCII file. When I try to open it with codecs.open(..., "utf-8"), I am unable to read single characters. ASCII is a subset of UTF-8, so why can't codecs open such a file in UTF-8 mode?
# test.py
import codecs
f = codecs.open("test.py", "r", "utf-8")
# ASCII is supposed to be a subset of UTF-8:
# http://www.fileformat.info/info/unicode/utf8.htm
assert len(f.read(1)) == 1 # OK
f.readline()
c = f.read(1)
print len(c)
print "'%s'" % c
assert len(c) == 1 # fails
# max% p test.py
# 63
# '
# import codecs
#
# f = codecs.open("test.py", "r", "utf-8")
#
# # ASC'
# Traceback (most recent call last):
# File "test.py", line 15, in <module>
# assert len(c) == 1 # fails
# AssertionError
# max%
system:
Linux max 4.4.0-89-generic #112~14.04.1-Ubuntu SMP Tue Aug 1 22:08:32 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Of course it works with regular open. It also works if I remove the "utf-8" option. Also what does 63 mean? That's like the middle of the 3rd line. I don't get it.
Found your problem:
When passed an encoding, codecs.open returns a StreamReaderWriter, which is really just a wrapper around (not a subclass of; it's a "composed of" relationship, not inheritance) StreamReader and StreamWriter. Problem is:
StreamReaderWriter provides a "normal" read method (that is, it takes a size parameter and that's it).
It delegates to the internal StreamReader.read method, where the size argument is only a hint as to the number of bytes to read, not a limit; the second argument, chars, is a strict limiter, but StreamReaderWriter never passes that argument along (it doesn't accept it).
When size is hinted but not capped using chars, if StreamReader has buffered data and it's large enough to match the size hint, StreamReader.read blindly returns the contents of the buffer rather than limiting it in any way based on the size hint (after all, only chars imposes a maximum return size).
The API of StreamReader.read and the meaning of size/chars for the API is the only documented thing here; the fact that codecs.open returns StreamReaderWriter is not contractual, nor is the fact that StreamReaderWriter wraps StreamReader, I just used ipython's ?? magic to read the source code of the codecs module to verify this behavior. But documented or not, that's what it's doing (feel free to read the source code for StreamReaderWriter, it's all Python level, so it's easy).
The best solution is to switch to io.open, which is faster and more correct in every standard case (codecs.open supports the weirdo codecs that don't convert between bytes [Py2 str] and str [Py2 unicode], but rather, handle str to str or bytes to bytes encodings, but that's an incredibly limited use case; most of the time, you're converting between bytes and str). All you need to do is import io instead of codecs, and change the codecs.open line to:
f = io.open("test.py", encoding="utf-8")
The rest of your code can remain unchanged (and will likely run faster to boot).
As an alternative, you could explicitly bypass StreamReaderWriter to get the StreamReader's read method and pass the limiting argument directly, e.g. change:
c = f.read(1)
to:
# Pass second, character limiting argument after size hint
c = f.reader.read(6, 1) # 6 is sort of arbitrary; should ensure a full char read in one go
I suspect Python Bug #8260, which covers intermingling readline and read on codecs.open-created file objects, applies here. Officially it's "fixed", but if you read the comments the fix wasn't complete (and may not be possible to complete given the documented API); arbitrarily weird combinations of read and readline will still be able to break it.
Again, just use io.open; as long as you're on Python 2.6 or higher, it's available, and it's just plain better.
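The io.open fix can be verified directly: after an intervening readline(), read(1) still returns exactly one character, even a non-ASCII one. A self-contained sketch (the file path and contents here are made up for the demonstration):

```python
import io
import os
import tempfile

# Write a small UTF-8 file including a non-ASCII character.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"abc\n\u00e9def\n")

with io.open(path, encoding="utf-8") as f:
    assert len(f.read(1)) == 1   # 'a'
    f.readline()                 # consumes the rest of the first line
    c = f.read(1)                # exactly one character: 'é'
    print(repr(c), len(c))
```

This is precisely the mixed read/readline pattern that trips up codecs.open, and io.open handles it correctly.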
I converted a huge file which I wrote in Python 2.7.3, and now I want to upgrade to Python 3+ (I have 3.5).
what I have done so far:
installed the python interpreter 3.5+
updated the environment path to read from python3+ folder
upgraded numpy and pandas
I used >python 2to3.py -w viterbi.py to convert to version 3+
The section where I get the error:
import sys
import numpy as np
import pandas as pd
# Counting number of lines in the text file
lines = 0
buffer = bytearray(2048)
with open(inputFilePatheName) as f:
    while f.readinto(buffer) > 0:
        lines += buffer.count('\n')
My error is:
AttributeError: '_io.TextIOWrapper' object has no attribute 'readinto'
This is the first error, so I cannot proceed to see if there are any others. I don't know what the equivalent of readinto is.
In 3.x, the readinto method is only available on binary I/O streams. Thus: with open(inputFilePatheName, 'rb') as f:.
Separately, buffer.count('\n') will not work any more, because Python 3.x handles text properly, as something distinct from a raw sequence of bytes. buffer, being a bytearray, stores bytes; it still has a .count method, but it has to be given either an integer (representing the numeric value of a byte to look for) or a "bytes-like object" (representing a subsequence of bytes to look for). So we also have to update that, as buffer.count(b'\n') (using a bytes literal).
Finally, we need to be aware that processing the file this way means we don't get universal newline translation by default any more.
Open the file as binary.
As long as you can guarantee it's utf-8 or CP encoded, all \ns will necessarily be newlines:
with open(inputFilePatheName, "rb") as f:
    while f.readinto(buffer) > 0:
        lines += buffer.count(b'\n')
That way you also save the time of decoding the file, and use your buffer in the most efficient way possible.
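A self-contained version of the snippet above, with one extra precaution: readinto() returns the number of bytes actually read, and counting only within that range avoids picking up stale bytes left in the buffer from an earlier iteration. The file path and contents are made up for the demonstration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

lines = 0
buffer = bytearray(2048)
# 'rb' gives a binary stream, which is what readinto() needs
with open(path, "rb") as f:
    while (n := f.readinto(buffer)) > 0:
        # only the first n bytes were refreshed by this read
        lines += buffer.count(b"\n", 0, n)

print(lines)  # 3
```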
A better approach to what you're trying to achieve is using memory mapped files.
In case of Windows:
import mmap
import os

file_handle = os.open(r"yourpath", os.O_RDONLY | os.O_BINARY | os.O_SEQUENTIAL)
try:
    with mmap.mmap(file_handle, 0, access=mmap.ACCESS_READ) as f:
        pos = -1
        total = 0
        while (pos := f.find(b"\n", pos + 1)) != -1:
            total += 1
finally:
    os.close(file_handle)
Again, make sure the text is not encoded as UTF-16, which is common on Windows.
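A portable sketch of the same idea: os.O_BINARY and os.O_SEQUENTIAL exist only on Windows, so falling back to 0 via getattr lets the snippet run anywhere (the file path and contents here are invented for the demonstration):

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(path, "wb") as f:
    f.write(b"alpha\nbeta\ngamma\n")

# Windows-only flags degrade gracefully to 0 on other platforms
flags = os.O_RDONLY | getattr(os, "O_BINARY", 0) | getattr(os, "O_SEQUENTIAL", 0)
file_handle = os.open(path, flags)
try:
    with mmap.mmap(file_handle, 0, access=mmap.ACCESS_READ) as m:
        pos = -1
        total = 0
        # find() scans the mapped pages directly; no Python-level buffer copies
        while (pos := m.find(b"\n", pos + 1)) != -1:
            total += 1
finally:
    os.close(file_handle)

print(total)  # 3
```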
I'm writing a serial adapter for some scientific hardware whose command set uses UTF-8 character encodings. All responses from the hardware are terminated with a carriage return (u'\r'). I would like to be able to use pySerial's readline() function with an EOL character specified, so I have this setup, as suggested in this thread:
import serial
import io
ser = serial.Serial(port='COM10', baudrate=128000)
sio = io.TextIOWrapper(io.BufferedRWPair(ser, ser, 1), encoding='utf-8', newline=u'\r')
ser.open()
# these commands move to coordinates (25000, 0, 25000)
cmd = 'M\x80\x1a\x06\x00\x00\x00\x00\x00\x80\x1a\x06\x00'
ucmd = u'M\x80\x1a\x06\x00\x00\x00\x00\x00\x80\x1a\x06\x00'
#this works
ser.write(cmd)
print sio.readline()
#this does not
sio.write(ucmd)
sio.flush()
print sio.readline()
Strangely, the first command string (non-unicode using pySerial directly) elicits the correct behavior from the hardware. The second (unicode via Python's io module) causes it to move erratically and then hang. Why would this be? Sending unicode command strings to the hardware does work IF the command string is only a couple of characters. Once you start sending bytes with hex(ord(byte)) values > 0x7F (outside the ASCII range), you start running into trouble. I can work around this problem without too much trouble, but would like to know what is going on. Thanks!
From io docs:
BufferedRWPair does not attempt to synchronize accesses to its
underlying raw streams. You should not pass it the same object as
reader and writer; use BufferedRandom instead.
I'm guessing that's your problem, as you are passing the same object ser as reader and writer. BufferedRandom doesn't look like it quite fits the bill either.
So is your problem with serial that it hangs waiting for the EOL?
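One thing worth checking, separate from the buffering question: a TextIOWrapper opened with encoding='utf-8' encodes everything it writes, and any code point above 0x7F becomes more than one byte in UTF-8. This sketch shows how the command string from the question changes size on the way out (the 'latin-1' encoding is used here only to reproduce the one-byte-per-code-point bytes that a plain ser.write would send in Python 2):

```python
cmd = u'M\x80\x1a\x06\x00\x00\x00\x00\x00\x80\x1a\x06\x00'

raw = cmd.encode('latin-1')   # one byte per code point: 13 bytes on the wire
utf8 = cmd.encode('utf-8')    # what the TextIOWrapper layer actually emits

print(len(raw))   # 13
print(len(utf8))  # 15 -- each u'\x80' became the two bytes b'\xc2\x80'
```

That would explain why short ASCII-only commands work through sio.write but anything containing bytes above 0x7F confuses the hardware.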
I'm using a script importing PySerial to read from COM4.
The messages I would like to intercept end with a pair of #, so I tried to use
bus.readline(eol='##')
where bus is my connection.
I expected to read like:
*#*3##
*#*3##
*#*3##
Unfortunately I also found
*#*1##*1*1*99##
which I expected to read split into 2 lines:
*#*1##
*1*1*99##
Clearly readline is not working but why?
The readline() method in pyserial reads one character at a time and compares it to the EOL character; you cannot specify multiple characters as the EOL. You'll have to read the data in and then split it afterwards using str.split() or re.split().
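The read-then-split approach can be sketched like this, with io.BytesIO standing in for the open serial connection (the function name and byte-at-a-time loop are illustrative, not pyserial API):

```python
import io

def read_messages(stream, terminator=b"##"):
    # Read byte-by-byte (as pyserial would) and yield each message,
    # terminator included, as soon as the terminator is seen.
    buf = b""
    while True:
        b = stream.read(1)
        if not b:
            return  # stream exhausted (or read timeout on a real port)
        buf += b
        if buf.endswith(terminator):
            yield buf
            buf = b""

# io.BytesIO stands in for the open serial connection
bus = io.BytesIO(b"*#*3##*#*1##*1*1*99##")
print(list(read_messages(bus)))
# [b'*#*3##', b'*#*1##', b'*1*1*99##']
```

Unlike readline(eol='##'), this splits `*#*1##*1*1*99##` into the two messages the question expected, because it matches the full two-character terminator rather than a single EOL byte.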