I am trying to write a "cleaner" program to release a potential writer that is blocked on a named pipe (because no reader is reading from the pipe). However, the cleaner itself should not block when no writer is blocked writing to the pipe. In other words, the "cleaner" must return/terminate immediately, whether there is a blocked writer or not.
Therefore I searched for "Python non-blocking read from named pipe", and got these:
How to read named FIFO non-blockingly?
fifo - reading in a loop
What conditions result in an opened, nonblocking named pipe (fifo) being "unavailable" for reads?
Why does a read-only open of a named pipe block?
It seems they suggest that simply using os.open(file_name, os.O_RDONLY | os.O_NONBLOCK) should be fine, but that didn't really work on my machine. I may have messed up somewhere or misunderstood some of their suggestions/situations, but I really couldn't figure out what's wrong myself.
I found the Linux man page for open (http://man7.org/linux/man-pages/man2/open.2.html), and its explanation of O_NONBLOCK seems consistent with their suggestions, but not with what I observe on my machine...
Just in case it is related, my OS is Ubuntu 14.04 LTS 64-bit.
Here is my code:
import os
import errno

BUFFER_SIZE = 65536

ph = None
try:
    ph = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
    os.read(ph, BUFFER_SIZE)
except OSError as err:
    if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
        raise err
    else:
        raise err
finally:
    if ph is not None:  # "if ph:" would wrongly skip closing a descriptor of 0
        os.close(ph)
Originally there was only the second raise, but I found that os.open and os.read, though not blocking, don't raise any exception either... I don't really know in advance how much the writer will write to the pipe! If the non-blocking read does not raise an exception, how should I know when to stop reading?
Updated on 8/8/2016:
This seems to be a workaround/solution that satisfied my need:
import os
import errno

BUFFER_SIZE = 65536

ph = None
try:
    ph = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
    while True:
        buffer = os.read(ph, BUFFER_SIZE)
        if len(buffer) < BUFFER_SIZE:
            break
except OSError as err:
    if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
        pass  # It is supposed to raise one of these exceptions
    else:
        raise err
finally:
    if ph is not None:
        os.close(ph)
It loops on read. Each time it reads something, it compares the size of the content read with the specified BUFFER_SIZE, and stops once it gets less than a full buffer, taking that to mean EOF (the writer then unblocks and continues/exits).
I still want to know why no exception is raised in that read.
Updated on 8/10/2016:
To make it clear, my overall goal is like this.
My main program (Python) has a thread serving as the reader. It normally blocks on the named pipe, waiting for "commands". There is a writer program (Shell script) which will write a one-liner "command" to the same pipe in each run.
In some cases, a writer starts before my main program starts, or after my main program terminates. In that case, the writer will block on the pipe waiting for a reader. If my main program then starts later, it will immediately read that "command" from the blocked writer - this is NOT what I want. I want my program to disregard writers that started before it.
Therefore, my solution is, during initialization of my reader thread, I do non-blocking read to release the writers, without really executing the "command" they were trying to write to the pipe.
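For concreteness, here is a minimal sketch of what that initialization step could look like (the function name drain_stale_commands is my own, hypothetical choice; see the answer below for why the loop should run until a zero-length read):

import os
import errno

def drain_stale_commands(path="pipe.fifo", bufsize=65536):
    # Open the read end without blocking, even if no writer is present.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        while True:
            try:
                data = os.read(fd, bufsize)
            except OSError as err:
                if err.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                    # A writer holds the pipe open but has not written
                    # anything yet; give up rather than block.
                    return
                raise
            if not data:
                return  # Zero-length read: all stale writers released.
            # Otherwise: discard the stale "command" and keep reading.
    finally:
        os.close(fd)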
This solution is incorrect.
while True:
    buffer = os.read(ph, BUFFER_SIZE)
    if len(buffer) < BUFFER_SIZE:
        break
This will not actually read everything; it will only read until it gets a partial read. Remember: you are only guaranteed to fill the buffer with regular files; in all other cases it is possible to get a partial buffer before EOF. The correct way to do this is to loop until the actual end of file is reached, which is signaled by a read of length 0. End of file indicates that there are no writers (they have all exited or closed the fifo).
while True:
    buffer = os.read(ph, BUFFER_SIZE)
    if not buffer:
        break
However, this will not work correctly in the face of non-blocking IO. It turns out non-blocking IO is completely unnecessary here.
import os
import fcntl

h = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)

# Now that we have successfully opened it without blocking,
# we no longer want the handle to be non-blocking
flags = fcntl.fcntl(h, fcntl.F_GETFL)
flags &= ~os.O_NONBLOCK
fcntl.fcntl(h, fcntl.F_SETFL, flags)

try:
    while True:
        # Only blocks if there is a writer
        buf = os.read(h, 65536)
        if not buf:
            # This happens when there are no writers
            break
finally:
    os.close(h)
The only scenario which will cause this code to block is if there is an active writer which has opened the fifo but is not writing to it. From what you've described, it doesn't sound like this is the case.
Non-blocking IO doesn't do that
Your program wants to do two things, depending on circumstance:
If there are no writers, return immediately.
If there are writers, read data from the FIFO until the writers are done.
Non-blocking read() has no effect whatsoever in situation #1. Whether you use O_NONBLOCK or not, read() will return immediately in situation #1. So the only difference is in situation #2.
In situation #2, your program's goal is to read the entire block of data from the writers. That is exactly how blocking IO works: it waits for the writers to finish, and then read() returns. The whole point of non-blocking IO is to return early if the operation can't complete immediately, which is the opposite of your program's goal of waiting until the operation is complete.
If you use non-blocking read(), in situation #2, your program will sometimes return early, before the writers have finished their jobs. Or maybe your program will return after reading half of a command from the FIFO, leaving the other (now corrupted) half there. This concern is expressed in your question:
If the non blocking read does not raise exception, how should I know when to stop reading?
You know when to stop reading because read() returns zero bytes when all writers have closed the pipe. (Conveniently, this is also what happens if there were no writers in the first place.) This is unfortunately not what happens if the writers do not close their end of the pipe when they are done. It is far simpler and more straightforward if the writers close the pipe when done, so this is the recommended solution, even if you need to modify the writers a little bit. If the writers cannot close the pipe for whatever reason, the solution is more complicated.
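If you control the writers, the cooperative pattern is simply: open, write, close. A hypothetical writer following that pattern (a sketch of my own, not code from the question) would look like:

import os

# Opening for write blocks until some reader opens the read end.
fd = os.open("pipe.fifo", os.O_WRONLY)
try:
    os.write(fd, b"some command\n")
finally:
    # Closing the write end is what lets the reader see EOF.
    os.close(fd)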
The main use case for non-blocking read() is if your program has some other task to complete while IO goes on in the background.
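As an illustration of that use case (again my own sketch, not part of the question): a non-blocking reader can use select to check the FIFO periodically while doing unrelated work between reads:

import os
import select

fd = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
try:
    while True:
        # Wait up to 0.1s for the pipe to become readable.
        readable, _, _ = select.select([fd], [], [], 0.1)
        if readable:
            data = os.read(fd, 65536)
            if not data:
                break  # EOF: all writers have closed their end.
            # ... handle data ...
        # ... do other background work here between reads ...
finally:
    os.close(fd)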
In POSIX C programs, if read() attempts to read from an empty pipe or a FIFO special file, it has one of the following results:
If no process has the pipe open for writing, read() returns 0 to indicate the end of the file.
If some process has the pipe open for writing and O_NONBLOCK is set to 1, read() returns -1 and sets errno to EAGAIN.
If some process has the pipe open for writing and O_NONBLOCK is set to 0, read() blocks (that is, does not return) until some data is written, or the pipe is closed by all other processes that have the pipe open for writing.
So, first check whether any writer still has the fifo open for writing. If there is none, the read will return an empty string and raise no exception. Otherwise, an exception will be raised.
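Translated to Python, those outcomes can be observed with a small sketch like this (assuming a fifo named pipe.fifo; in Python 3, EAGAIN surfaces as BlockingIOError, a subclass of OSError):

import os
import errno

fd = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
try:
    data = os.read(fd, 65536)
    if data == b"":
        print("EOF: no process has the fifo open for writing")
    else:
        print("read %d bytes" % len(data))
except OSError as err:
    if err.errno == errno.EAGAIN:
        print("a writer has the fifo open, but no data is available yet")
    else:
        raise
finally:
    os.close(fd)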
Related
I wrote the following code to understand how non-blocking writes operate:
import os, time

def takeAnap():
    print('I am sleeping a bit while it is writing!')
    time.sleep(50)

fd = os.open('t.txt', os.O_CREAT | os.O_NONBLOCK)
for i in range(100):
    # Non-blocking write
    fd = os.open('t.txt', os.O_APPEND | os.O_WRONLY | os.O_NONBLOCK)
    os.write(fd, str(i))
    os.close(fd)
    time.sleep(2)
takeAnap()
As you can see, I created takeAnap() to run while the loop is being processed, so that I can convince myself that the writing is performed without blocking! However, the loop still blocks, and the method does not run until the loop has finished. I am not sure if my understanding is wrong, but as far as I know, a non-blocking operation allows you to do other tasks while the writing is being processed. Is that correct? If so, where is the problem in my code?
Thank you.
I think you misunderstand what the O_NONBLOCK flag is used for. Here's what the flag actually does:
This prevents open from blocking for a “long time” to open the file. This is only meaningful for some kinds of files, usually devices such as serial ports; when it is not meaningful, it is harmless and ignored.
Excerpt from https://www.gnu.org/software/libc/manual/html_node/Open_002dtime-Flags.html.
So, the flag does not specify non-blocking write, but non-blocking open. The writing is still serial, and blocking, and slow.
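For contrast, here is a case where O_NONBLOCK does change write behavior: a FIFO whose kernel buffer is full. This is my own sketch (it assumes a fifo named some_fifo whose reader has opened it but is not consuming data):

import os

# Opening the write end with O_NONBLOCK raises ENXIO
# if no reader has the fifo open yet.
fd = os.open('some_fifo', os.O_WRONLY | os.O_NONBLOCK)
try:
    while True:
        try:
            os.write(fd, b'x' * 4096)
        except BlockingIOError:
            # The pipe's kernel buffer is full. A blocking write
            # would stall here until the reader drained the buffer;
            # a non-blocking one returns control to us instead.
            print('write would block; free to do something else')
            break
finally:
    os.close(fd)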
I have a reader and writer on a FIFO, where the reader must not block indefinitely. To do this, I open the read end with O_NONBLOCK.
The write end can block, so I open it as a regular file. Large writes perform unacceptably badly - reading/writing a 4MB block takes minutes instead of the expected fraction of a second (expected, because on Linux the same code takes a fraction of a second).
Example code in Python replicating the issue. First, create a fifo using mkfifo, e.g. mkfifo some_fifo, then run the reading end, then the writing end.
Reading End:
import os, time

# mkfifo some_fifo before starting python
fd = os.open('some_fifo', os.O_RDONLY | os.O_NONBLOCK)
while True:
    try:
        read = len(os.read(fd, 8192))  # read up to 8kb (FIFO buffer size in mac os)
        print(read)
        should_block = read < 8192  # linux
    except BlockingIOError:
        should_block = True  # mac os
    if should_block:
        print('blocking')
        time.sleep(0.5)
Write End:
import os
fd = os.open('some_fifo',os.O_WRONLY)
os.write(fd, b'aaaa'*1024*1024) # 4MB write
Note: The original code where I hit on this issue is cross-platform Java code that also runs on linux. Unfortunately, this means I can't use kqueue with a kevent's data field to figure out how much I can read without blocking - this data is lost in the abstraction over epoll/kqueue that I use. This means a solution of using a blocking fd à la this answer is unacceptable.
Edit: the original code used kqueue to block on the file descriptor in the read end, which performed worse
Edit 2: Linux os.read() doesn't throw a BlockingIOError before the other side of the pipe is connected, despite the docs stating that it should (the call succeeds (returns 0) but sets errno to EAGAIN). I updated the code to be friendly to the Linux behavior too.
Edit 3: The code for macOS was originally:
import select, os

# mkfifo some_fifo before starting python
fd = os.open('some_fifo', os.O_RDONLY | os.O_NONBLOCK)
kq = select.kqueue()
ke = select.kevent(fd)
while True:
    try:
        read = len(os.read(fd, 8192))  # read up to 8kb (FIFO buffer size in mac os)
    except BlockingIOError:
        evts = kq.control([ke], 1, 10)  # 10-second timeout, wait for 1 event
        print(evts)
This performs as poorly as the version with sleeps, but sleeping makes sure the issue isn't with the blocking mechanism, and is cross-platform.
I have a C program that writes data to a named pipe and a Python program that reads data from the named pipe, like this:
p = open('/path/to/named/pipe', 'r')
...
data = p.read(size)
When the C program exits, it closes the pipe.
How can I detect this from the Python side? I've tried installing a handler for SIGPIPE, but it seems that SIGPIPE only happens when attempting to write to a closed pipe, not read from it. I also expected that p.read(size) might return a length-zero string because of the EOF at the other end, but actually it just hangs waiting for data.
How can I detect this situation and handle it?
You can use the select module to monitor the state of your pipe. On Linux (where select.poll() is available), the following code will detect the presence of a closed pipe:
import select

# ...

poller = select.poll()
# Register the "hangup" event on p
poller.register(p, select.POLLHUP)

# Call poller.poll with 0s as timeout
for descriptor, mask in poller.poll(0):
    # Can contain at most one element, but still:
    if descriptor == p.fileno() and mask & select.POLLHUP:
        print('The pipe is closed on the other end.')
        p.close()
Analogous mechanisms exist on other operating systems for detecting this situation.
The reason it hangs when calling read is that the IO is blocking. You can turn it into non-blocking IO (and have read return an empty string) by using os.set_blocking, but this would still not allow you to detect when the pipe on the other end is closed.
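A short sketch of that os.set_blocking approach (my own illustration, reusing the question's placeholder path):

import os

p = open('/path/to/named/pipe', 'rb')
# Make reads on the underlying descriptor return immediately.
os.set_blocking(p.fileno(), False)

try:
    data = os.read(p.fileno(), 4096)  # b'' on EOF, data otherwise
except BlockingIOError:
    data = None  # a writer is attached but nothing is available yet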
#!/usr/bin/python
import pty
import os
import sys

pid, out = pty.fork()
if pid:
    try:
        for line in os.fdopen(out):
            sys.stdout.write(line)
    except IOError:
        pass
else:
    sys.stdout.write("foobar")
    sys.stdout.flush()
prints nothing. How do I make it print the incomplete line emitted by the child?
The following is pure speculation: iterating over an input stream is likely implemented as reading characters into a buffer until either a newline or an end-of-file condition is encountered. If the child dies, some implementations (platform-dependent) lose the remaining characters from the buffer.
Maybe using some more low-level I/O can avoid this issue. When I run the original script on my GNU/Linux system, I don't get the “foobar” but an IOError instead. However, when I change it to
with os.fdopen(out) as istr:
    sys.stdout.write(istr.read())
it prints “foobar” without throwing any exception.
Update: In order to read the stream one piece at a time, we'll need to resort to even more low-level I/O. I found that the following works for me:
import pty
import os
import sys

pid, out = pty.fork()
if pid:
    while True:
        buffsz = 10  # Use a larger number in production code.
        buff = b''
        died = False
        try:
            buff = os.read(out, buffsz)
        except IOError:
            died = True
        sys.stdout.write(buff.decode())
        if len(buff) == 0 or died:
            break
else:
    with sys.stdout:
        # Also try writing a longer string with newlines.
        sys.stdout.write("foobar")
Unfortunately, this means we'll need to reassemble the buffer chunks manually and scan for newlines. This is inconvenient but certainly can be done.
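One possible way to do that reassembly (my own sketch, building on the loop above): accumulate chunks in a byte buffer and split out complete lines as they arrive:

import pty
import os
import sys

pid, out = pty.fork()
if pid:
    pending = b''  # bytes read so far that do not yet end in a newline
    while True:
        try:
            chunk = os.read(out, 1024)
        except IOError:
            chunk = b''
        pending += chunk
        # Emit every complete line accumulated so far.
        while b'\n' in pending:
            line, pending = pending.split(b'\n', 1)
            sys.stdout.write(line.decode() + '\n')
        if not chunk:
            # Child is gone: flush the final, incomplete line too.
            if pending:
                sys.stdout.write(pending.decode())
            break
else:
    sys.stdout.write("foo\nbar")
    sys.stdout.flush()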
I have been trying to write an application that runs subprocesses and (among other things) displays their output in a GUI and allows the user to click a button to cancel them. I start the processes like this:
queue = Queue.Queue(500)
process = subprocess.Popen(
    command,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT)
iothread = threading.Thread(
    target=simple_io_thread,
    args=(process.stdout, queue))
iothread.daemon = True
iothread.start()
where simple_io_thread is defined as follows:
def simple_io_thread(pipe, queue):
    while True:
        line = pipe.readline()
        queue.put(line, block=True)
        if line == "":
            break
This works well enough. In my UI I periodically do non-blocking "get"s from the queue. However, my problems come when I want to terminate the subprocess. (The subprocess is an arbitrary process, not something I wrote myself.) I can use the terminate method to terminate the process, but I do not know how to guarantee that my I/O thread will terminate. It will normally be doing blocking I/O on the pipe. This may or may not end some time after I terminate the process. (If the subprocess has spawned another subprocess, I can kill the first subprocess, but the second one will still keep the pipe open. I'm not even sure how to get such grand-children to terminate cleanly.) After that the I/O thread will try to enqueue the output, but I don't want to commit to reading from the queue indefinitely.
Ideally I would like some way to request termination of the subprocess, block for a short (<0.5s) amount of time and after that be guaranteed that the I/O thread has exited (or will exit in a timely fashion without interfering with anything else) and that I can stop reading from the queue.
It's not critical to me that a solution uses an I/O thread. If there's another way to do this that works on Windows and Linux with Python 2.6 and a Tkinter GUI that would be fine.
EDIT - Will's answer and other things I've seen on the web about doing this in other languages suggest that the operating system expects you just to close the file handle on the main thread and then the I/O thread should come out of its blocking read. However, as I described in the comment, that doesn't seem to work for me. If I do this on the main thread:
process.stdout.close()
I get:
IOError: close() called during concurrent operation on the same file object.
...on the main thread. If I do this on the main thread:
os.close(process.stdout.fileno())
I get:
close failed in file object destructor: IOError: [Errno 9] Bad file descriptor
...later on in the main thread when it tries to close the file handle itself.
I know this is an old post, but in case it still helps anyone, I think your problem could be solved by passing the subprocess.Popen instance to io_thread, rather than its output stream.
If you do that, then you can replace your while True: line with while process.poll() is None:.
process.poll() checks the subprocess return code; if the process hasn't finished, there isn't one yet (i.e. process.poll() is None). You can then do away with if line == "": break.
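A sketch of the suggested change (one caveat to hedge on: polling the return code can in principle end the loop before the last of the buffered output has been read, so treat this as an illustration rather than a drop-in replacement):

def simple_io_thread(process, queue):
    # Stop once the subprocess has exited, instead of waiting
    # for readline() to return an empty string.
    while process.poll() is None:
        line = process.stdout.readline()
        queue.put(line, block=True)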
The reason I'm here is that I wrote a very similar script today, and I got those IOError: close() called during concurrent operation on the same file object. errors.
Again, in case it helps, I think my problems stemmed from (my) io_thread doing some overly efficient garbage collection and closing a file handle I gave it (I'm probably wrong, but it works now...). Mine is different, though, in that it is not daemonic, and it iterates through subprocess.stdout rather than using a while loop, i.e.:
def io_thread(subprocess, logfile, lock):
    for line in subprocess.stdout:
        lock.acquire()
        print line,
        lock.release()
        logfile.write(line)
I should also probably mention that I pass the bufsize argument to subprocess.Popen, so that it's line buffered.
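For reference, that would look something like the following (bufsize=1 requests line buffering; command stands in for the actual argument list):

process = subprocess.Popen(
    command,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    bufsize=1)  # line-buffered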
This is probably old enough, but still useful to someone coming from a search engine...
The reason it shows that message is that after the subprocess has completed, it closes the file descriptors; the daemon thread (which is running concurrently) then tries to use those closed descriptors, raising the error.
Joining the thread before calling the subprocess's wait() or communicate() methods should be more than enough to suppress the error.
my_thread.join()
print my_thread.is_alive()
my_popen.communicate()
In the code that terminates the process, you could also explicitly os.close() the pipe that your thread is reading from?
You should close the write end of the pipe instead... but as the code is written you cannot access it. To do so you should:
create a pipe
pass the write end's file descriptor to Popen's stdout
use the read end in simple_io_thread to read lines.
Now you can close the write end, and the read thread will close gracefully.
queue = Queue.Queue(500)
r, w = os.pipe()
process = subprocess.Popen(
    command,
    stdout=w,
    stderr=subprocess.STDOUT)
iothread = threading.Thread(
    target=simple_io_thread,
    args=(os.fdopen(r), queue))
iothread.daemon = True
iothread.start()
Now, calling
os.close(w)
closes the pipe, and iothread will shut down without any exception.