Python async subprocess pipe read nbytes

After looking at the async subprocess documentation, I'm left wondering how anyone would run something equivalent to await process.stdout.read(NUMBER_OF_BYTES_TO_READ). The documentation itself advises against that snippet and suggests the communicate() method instead, but from what I can tell communicate() offers no way to specify how many bytes should be read.
What am I missing?
How would I tell communicate to return after reading a certain number of bytes?
Edit: I'm creating my subprocess with async pipes, and I am trying to use the pipe asynchronously.

Short answer: stdout.read is blocking.
When there are enough bytes to read, it returns. That is the happy and unlikely case. More likely, there will be few or no bytes available, so it will wait, blocking the process.
The pipe can be created to be non-blocking, but that behavior is system-specific and fickle in my experience.
The "right" way to use stdout.read is either to be prepared for the reading process to block on this operation, possibly indefinitely, or to use a separate thread that reads and pushes data to a shared buffer. The main thread can then decide whether to poll the buffer or wait on it, retaining control.
In practical terms, and I have written code like this several times, there will be a listener thread attached to the pipe, reading it until it closes or until the main thread signals it to die. The reader and the main thread communicate through queue.Queue (Queue.Queue in Python 2), which is trivial to use in this scenario -- it's thread safe.
So, stdout.read comes with so many caveats that nobody in their right mind would advise anyone to use it.
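The reader-thread pattern described above might look roughly like the following. This is only a minimal sketch: the helper names start_reader and collect_output, the chunk size, and the timeout are assumptions made for illustration and do not come from the answer.

```python
import queue
import subprocess
import sys
import threading


def start_reader(pipe, chunk_size=4096):
    """Pump `pipe` into a queue from a background thread (hypothetical helper)."""
    chunks = queue.Queue()

    def pump():
        # read1() returns as soon as some data is available; b"" means EOF.
        for chunk in iter(lambda: pipe.read1(chunk_size), b""):
            chunks.put(chunk)
        chunks.put(None)  # sentinel: the pipe was closed

    threading.Thread(target=pump, daemon=True).start()
    return chunks


def collect_output(cmd, timeout=5.0):
    """Run `cmd` and gather its stdout without the main thread touching the pipe."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    chunks = start_reader(proc.stdout)
    data = b""
    while True:
        # The main thread keeps control: get() raises queue.Empty on timeout.
        chunk = chunks.get(timeout=timeout)
        if chunk is None:
            break
        data += chunk
    proc.wait()
    proc.stdout.close()
    return data


if __name__ == "__main__":
    print(collect_output([sys.executable, "-c", "print('hello')"]))
```

Only the background thread ever blocks on the pipe. The main thread waits on the queue with a timeout, so it can give up or do other work if the child goes quiet.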

Related

Send a string to another python thread

I need to send strings to a separate thread asynchronously (meaning no one knows when). I am using threading because my problem is IO bound.
My current best bet is to set a flag and have the receiver thread poll it. When it detects a set flag, it reads from some shared memory.
This sounds horribly complicated. Can't I send messages via a mailbox? Or something even better like implicit variable semantics?

The queue.Queue class is ideal for this. Your thread can either block waiting for input, or it can check whether anything new has arrived, and it is thread-safe.
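As a rough illustration of the mailbox idea, here is a minimal sketch in which a receiver thread blocks on a queue.Queue until strings arrive. The function name run_receiver and the use of None as a shutdown sentinel are assumptions made for this example.

```python
import queue
import threading


def run_receiver(messages):
    """Send each string in `messages` to a receiver thread and return what it saw."""
    inbox = queue.Queue()
    received = []

    def receiver():
        while True:
            item = inbox.get()  # blocks until a message arrives
            if item is None:    # sentinel chosen for this sketch: stop the thread
                break
            received.append(item)

    worker = threading.Thread(target=receiver)
    worker.start()
    for message in messages:
        inbox.put(message)      # safe to call from any thread, at any time
    inbox.put(None)
    worker.join()
    return received
```

The sender never needs to know when the receiver is ready; the queue holds each message until the receiver picks it up.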

How do I send a variable/flag to another thread in a non-blocking thread safe way in Python?

I've been looking around and see some answers saying to use globals, but that doesn't seem thread safe. I have also tried using queues, but that is apparently blocking, at least in how I did it. Can someone help or show an example of how to launch a thread from the main thread and communicate from one thread to another in a non-blocking, thread-safe way? Basically, the use case is that the threads will be looping and checking fairly constantly whether there's something that needs to be done or changed, and acting accordingly. Thanks for the help

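Python Queues are thread safe according to the documentation, so pushing to and popping from a shared queue within threads should not be a problem. Note that get() blocks by default; get_nowait() (or get(block=False)) returns immediately and raises queue.Empty when nothing is waiting. https://docs.python.org/3/library/queue.html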
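A minimal sketch of a looping thread that checks for commands without blocking could look like this. The worker function, the "stop" command, and the short sleep standing in for real work are all assumptions made for illustration.

```python
import queue
import threading
import time


def worker(commands, handled):
    """Loop doing work, checking the queue each pass without ever blocking on it."""
    while True:
        try:
            command = commands.get_nowait()  # returns at once or raises queue.Empty
        except queue.Empty:
            command = None
        if command == "stop":
            break
        if command is not None:
            handled.append(command)
        time.sleep(0.01)  # stand-in for the thread's real work


if __name__ == "__main__":
    commands = queue.Queue()
    handled = []
    thread = threading.Thread(target=worker, args=(commands, handled))
    thread.start()
    commands.put("reload")
    commands.put("stop")
    thread.join()
    print(handled)
```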

Why does Python's multiprocessing Queue have a buffer and a Pipe

Context
I have been looking at the source code for multiprocessing.Queue in Python 2.7 and have some questions.
A deque is used for a buffer and any items put on the Queue are appended to the deque but for get(), a pipe is used.
We can see that during put, if the feeder thread has not been started yet it will start.
The thread will pop objects off the deque and send them through the above pipe.
Questions
So, why use a deque and a pipe?
Couldn't one just use a deque (or any other data structure with FIFO behavior) and synchronize push and pop?
Likewise couldn't one also just use a Pipe, wrapping send and recv?
Maybe there is something here that I am missing but the feeder thread popping items and putting them on the Pipe seems like overkill.

The multiprocessing.Queue is a port of the standard Queue capable of running on multiple processes. Therefore it tries to reproduce the same behaviour.
A deque is a list with fast insertion/extraction on both sides with, theoretically, infinite size. It's very well suited for representing a stack or a queue. It does not work across different processes though.
A Pipe works more like a socket and allows data to be transferred across processes. Pipes are operating system objects and their implementation differs from OS to OS. Moreover, pipes have a limited size. If you fill a pipe, your next call to send will block until the other side drains it.
If you want to expose a Queue that works across multiple processes in a similar fashion to the standard one, you need the following features.
A buffer capable of storing messages in arrival order which have not been consumed yet.
A channel capable of transferring such messages across different processes.
Atomic put and get methods able to leave the control to the User on when to block the program flow.
The use of a deque, a Thread and a Pipe is one of the simplest ways to deliver these features, but it's not the only one.
I personally prefer the use of bare pipes to let processes communicate as it gives me more control on my application.
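To make the deque, thread, and pipe combination more concrete, here is a toy sketch. It is not the actual multiprocessing.Queue implementation: the class name ToyQueue and its internals are assumptions made for illustration, and it omits the locking that the real class needs so that several processes can share the pipe ends.

```python
import collections
import multiprocessing
import queue
import threading


class ToyQueue:
    """Toy queue: deque as buffer, feeder thread, one-way pipe as channel."""

    def __init__(self):
        # duplex=False gives a receive-only end and a send-only end.
        self._reader, self._writer = multiprocessing.Pipe(duplex=False)
        self._buffer = collections.deque()
        self._not_empty = threading.Condition()
        self._feeder = None

    def put(self, item):
        # put() never touches the pipe, so it cannot block on a full pipe.
        with self._not_empty:
            self._buffer.append(item)
            if self._feeder is None:  # start the feeder lazily on first put
                self._feeder = threading.Thread(target=self._feed, daemon=True)
                self._feeder.start()
            self._not_empty.notify()

    def _feed(self):
        while True:
            with self._not_empty:
                while not self._buffer:
                    self._not_empty.wait()
                item = self._buffer.popleft()
            # Only this background thread blocks if the pipe is full.
            self._writer.send(item)

    def get(self, timeout=None):
        # poll(None) waits indefinitely; otherwise give up after `timeout` seconds.
        if not self._reader.poll(timeout):
            raise queue.Empty
        return self._reader.recv()
```

The buffer is what lets put() return immediately even when the pipe is full, and the pipe is what lets the data cross a process boundary, which is why the design needs both.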

A deque can only live in one process's memory, so using it to pass data between processes is impossible(...*)
You could use just a Pipe, but then you would need to protect it with locks, and I guess this is why a deque was introduced.

Is there any way to terminate a running function from a thread?

I've recently been trying to write my own socket server in Python.
While writing a thread to handle server commands (a sort of command line in the server), I tried to implement code that restarts the server when raw_input() receives a specific command.
Basically, I want to restart the server as soon as the "Running" variable changes its state from True to False. When it does, I would like to stop the function (the function that started the thread) from running, get back to the main function, and then run it again. Is there a way to do it?
Thank you very much, and I hope I was clear about my problem,
Idan :)

Communication between threads can be done with Events, Queues, Semaphores, etc. Check them out and choose the one that fits your problem best.
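For the restart case in the question, a threading.Event is probably the simplest of these. The following is a minimal sketch; the serve function, the log list, and the polling interval are assumptions made for illustration, and a real server would handle connections where the sketch appends "tick".

```python
import threading
import time


def serve(stop, log):
    """Stand-in for a server loop that checks a stop flag between units of work."""
    while not stop.is_set():
        log.append("tick")  # placeholder for handling one request
        stop.wait(0.01)     # sleeps, but wakes early as soon as stop is set


if __name__ == "__main__":
    stop = threading.Event()
    log = []
    server = threading.Thread(target=serve, args=(stop, log))
    server.start()
    time.sleep(0.05)
    stop.set()     # what the command thread would do on a "restart" command
    server.join()  # after this returns, the main function can start serve again
    print(len(log) > 0)
```

This only works if the loop comes back to check the flag regularly, so any blocking call inside it needs a timeout.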

You can't abort a thread, or raise an exception into it asynchronously, in Python.
The standard Unix solution to this problem is to use a non-blocking socket, create a pipe with os.pipe, and replace all your blocking sock.recv calls with a blocking r, _, _ = select.select([sock, pipe], [], []), where pipe is the read end of the pipe. Another thread can then write to the pipe to wake up the waiting thread.
To make this portable to Windows you'll need to create a UDP localhost socket instead of a pipe, which makes things slightly more complicated, but it's still not hard.
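The select-plus-pipe technique can be sketched as follows for Unix. The function name recv_or_stop and its return convention (None when woken through the pipe) are assumptions made for this example; as noted above, on Windows select only accepts sockets, so the pipe would have to be replaced.

```python
import os
import select
import socket
import threading


def recv_or_stop(sock, stop_r, bufsize=4096):
    """Block until `sock` has data or the stop pipe is written to.

    Returns the received bytes, or None if another thread asked us to stop.
    """
    readable, _, _ = select.select([sock, stop_r], [], [])
    if stop_r in readable:
        return None
    return sock.recv(bufsize)


if __name__ == "__main__":
    a, b = socket.socketpair()  # stand-in for a real client connection
    stop_r, stop_w = os.pipe()
    # Another thread wakes the waiting call by writing a byte to the pipe.
    threading.Timer(0.1, os.write, args=(stop_w, b"x")).start()
    print(recv_or_stop(a, stop_r))
```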
Or, of course, you can use a higher-level framework, like asyncio in 3.4+, or twisted or another third-party lib, which will wrap this up for you. (Most of them are already running the equivalent of a loop around select to service lots of clients in one thread or a small thread pool, so it's trivial to toss in a stop pipe.)
Are there other alternatives? Yes, but all less portable and less good in a variety of other ways.
Most platforms have a way to asynchronously kill or signal another thread, which you can access via, e.g., ctypes. But this is a bad idea, because it will prevent Python from doing any normal cleanup. Even if you don't get a segfault, this could mean files never get flushed and end up with incomplete/garbage data, locks are left acquired to deadlock your program somewhere completely unrelated a short time later, memory gets leaked, etc.
If you're specifically trying to interrupt the main thread, and you only care about CPython on Unix, you can use a signal handler and the kill function. The signal will take effect on the next Python bytecode, and if the interpreter is blocked on any kind of I/O (or most other syscalls, e.g., inside a sleep), the system will return to the interpreter with an EINTR, allowing it to interrupt immediately. If the interpreter is blocked on something else, like a call to a C library that blocks signals or just does nothing but CPU work for 30 seconds, then you'll have to wait 30 seconds (although that doesn't come up that often, and you should know if it will in your case). Also, threads and signals don't play nice on some older *nix platforms. And signals don't work the same way on Windows, or in some other Python implementations like Jython.
On some platforms (including Windows--but not most modern *nix platforms), you can wake up a blocking socket call just by closing the socket out from under the waiting thread. On other platforms, this will not unblock the thread, or will do it sometimes but not other times (and theoretically it could even segfault your program or leave the socket library in an unusable state, although I don't think either of those will happen on any modern platform).

As far as I understand the documentation, and from some experiments I've run over the last few weeks, there is no way to really force another thread to 'stop' or 'abort', unless the function is aware of the possibility of being stopped and has a foolproof method of avoiding getting stuck in one of the I/O functions. In that case you can use some communication method such as semaphores. The only exception is the specialized Timer class, which has a cancel method.
So, if you really want to stop the server thread forcefully, you might want to think about running it in a separate process, not a thread.
EDIT: I'm not sure why you want to restart the server - I just thought it was in case of a failure. Normal procedure in a server is to loop waiting for connections on the socket, and when a connection appears, attend it and return to that loop.
A better way is to use the GIO library (part of glib) and connect methods to the connection event, to attend the connection even asynchronously. This avoids the loop completely. I don't have any real code for this in Python, but here's an example of a client in Python (which uses GIO for reception events) and a server in C, which uses GIO for connections.
Use of GIO makes life so much easier...

Avoid hang when writing to named pipe which disappears and comes back

I have a program which dumps information into a named pipe like this:
cmd=open(destination,'w')
cmd.write(data)
cmd.close()
This works pretty well until the pipe (destination) disappears while my program is writing to it. The problem is that it keeps hanging on the write part(?)
I was expecting some exception to happen, but that's not the case.
How can I avoid this situation?
Thanks,
Jay

If the process reading from the pipe is not reading as fast as you're writing, your script will block when it tries to write to the pipe. From the Wikipedia article:
"If the queue buffer fills up, the sending program is suspended (blocked) until the receiving program has had a chance to read some data and make room in the buffer. In Linux, the size of the buffer is 65536 bytes."
Luckily you have a few options:
The signal module will allow you to set an alarm to break out of the write call. After the prescribed amount of time, a SIGALRM signal will be sent to your process; if your handler for the signal raises an exception, it will break you out of the write.
With threading, you can spawn a new thread to handle the writing so that the main thread is never blocked. Python has no supported way to kill a thread, so make it a daemon thread and abandon it if it blocks for too long.
You can also use the fcntl module to make the pipe nonblocking (meaning the call will not wait, it will fail immediately if the pipe is full): Non-blocking read on a subprocess.PIPE in python.
Finally, you can use the select module to check whether the pipe is ready for writing before attempting your write. Just be careful: the check-then-write sequence is not atomic (e.g. the pipe could fill up between the check and the write).
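As a sketch of the nonblocking option on Unix, the FIFO can be opened with os.open and O_NONBLOCK rather than configured afterwards with fcntl, which is a slightly different route from the one named above. The function name try_write_fifo and its return convention are assumptions made for illustration.

```python
import errno
import os


def try_write_fifo(path, data):
    """Try to write `data` to the FIFO at `path` without ever hanging.

    Returns the number of bytes written, or None if nobody is reading
    or the pipe buffer is full.
    """
    try:
        # Without O_NONBLOCK, opening a FIFO for writing waits for a reader.
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:  # no process has the FIFO open for reading
            return None
        raise
    try:
        return os.write(fd, data)
    except BlockingIOError:           # the pipe buffer is full right now
        return None
    finally:
        os.close(fd)
```

A caller can retry later when None comes back. If the reader goes away between the open and the write, the write may fail with BrokenPipeError instead, which this sketch does not handle.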

I think that the signal module can help you. Check this example:
http://docs.python.org/library/signal.html#example
(The example handles a possibly non-finishing open() call, but can be trivially modified to do the same thing for your cmd.write() call.)
