I have a bunch of clients connecting to a server via 0MQ. I have a Manager queue used for a pool of workers to communicate back to the main process on each client machine.
On just one client machine with 250 worker processes, I see a bunch of EOFErrors almost instantly. They occur at the point where the put() is being performed.
I would expect a lot of communication to slow everything down, but I should never see EOFErrors coming from internal multiprocessing logic. I'm not using gevent or anything else that might break standard socket functionality.
Any thoughts on what could make puts to a Manager queue start raising EOFErrors?
For me the error was actually that my receiving process had thrown an exception and terminated, so the put() in the sending process raised an EOFError, meaning that the interprocess communication pipe had closed.
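Here is a minimal sketch (not the asker's setup, just an illustration with made-up names) of why put() only reports the symptom: the real cause is that the process on the other end of the connection, here the Manager process itself, has gone away.

from multiprocessing import Manager
import time

if __name__ == "__main__":
    manager = Manager()
    q = manager.Queue()
    q.put("ok")              # works while the Manager process is alive

    manager.shutdown()       # simulate the other end dying unexpectedly
    time.sleep(0.5)
    try:
        q.put("too late")    # the proxy now talks to a dead process
    except (EOFError, ConnectionError) as exc:
        # Depending on timing/platform this surfaces as EOFError or a BrokenPipeError.
        print("put() failed because the other end is gone:", type(exc).__name__)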
I have a flask-socketio server running on multiple pods, using redis as a message queue. I want to ensure that emits from external processes reach their destination 100% of the time, or to know when they have failed.
When process A emits an event to a socket that's connected to process B, the event goes through the message queue to process B, and then to the client. Is there any way I can intercept the outgoing emit on process B? Ideally I'd then use a worker to check after a few seconds whether the message reached the client (via a confirm event emitted from the client), and re-emit it if not.
This code runs on process A:
@app.route('/ex')
def ex_route():
    socketio.emit('external', {'text': f'sender: {socket.gethostname()}, welcome!'}, room='some_room')
    return jsonify(f'sending message to room "some_room" from {socket.gethostname()}')
This is the output from process A
INFO:socketio.server:emitting event "external" to some_room [/]
INFO:geventwebsocket.handler:127.0.0.1 - - [2019-01-11 13:33:44] "GET /ex HTTP/1.1" 200 177 0.003196
This is the output from process B
INFO:engineio.server:9aab2215a0da4816a45e3fdc1e449fce: Sending packet MESSAGE data 2["external",{"text":"sender: *******, welcome!"}]
There is currently no mechanism to do what you ask, unfortunately.
I think you basically have two approaches to go about this:
Always run your emits from the main server(s). If you need to emit from an auxiliary process, use an IPC mechanism to notify the server so that it can run the emit on your behalf. And now you can use callbacks.
Ignore the callbacks, and instead have the client acknowledge receipt of the event by emitting back to the server.
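Here is a rough sketch of that second approach; this is not built-in Flask-SocketIO functionality, and the confirm event name, the pending_acks store, and the retry idea are just placeholders (the socketio object is the same one as in your question):

import uuid

pending_acks = {}  # msg_id -> (event, data, room); in a multi-pod setup this would live in Redis

def emit_with_ack(event, data, room):
    msg_id = uuid.uuid4().hex
    pending_acks[msg_id] = (event, data, room)
    socketio.emit(event, dict(data, msg_id=msg_id), room=room)

@socketio.on('confirm')
def handle_confirm(data):
    # The client echoes back the msg_id it received; drop it from the pending set.
    pending_acks.pop(data.get('msg_id'), None)

# A background task (or your external worker) can periodically re-emit anything
# still sitting in pending_acks after a few seconds.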
Adding callback support for auxiliary processes should not be terribly difficult, by the way. I never needed that functionality myself and you are the first to ask about it. Maybe I should look into that at some point.
Edit: after some thought, I came up with a 3rd option:
You can connect your external process to the server as a client, instead of using the "emit-only" option. If this process is a client, it can emit an event to the server, which in turn the server can relay to the external client. When that client replies to the server, the server can once again relay the response to the external process, which is now just another client with full send and receive capabilities.
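For example, using the python-socketio client package, the external process could look roughly like this (the URL and event names are placeholders, and the server would need matching handlers):

import socketio

sio = socketio.Client()

@sio.on('relayed_reply')
def on_relayed_reply(data):
    # The server relays the end client's confirmation back to us here.
    print('client confirmed receipt:', data)

sio.connect('http://your-server:5000')
# Ask the server to forward this to the real client; on the server side this is
# an ordinary event handler that calls socketio.emit(..., room='some_room').
sio.emit('relay_to_room', {'room': 'some_room', 'text': 'welcome!'})
sio.wait()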
Using IPC is not very robust: if the server is receiving a lot of requests, you can end up in a situation where a message arrives but never gets relayed onward, which matters when that message is vital.
Use Celery, ZeroMQ, or Redis itself for the interconnect. The most natural option is using Socket.IO itself, as Miguel mentioned, since it is already waiting for requests, has the full environment, and can emit at any time.
I've used a greenlet hack instead of threads: a greenlet is lighter than a thread and runs in the same environment, which lets it send the message while your main thread awaits the socket in non-blocking mode. Basically you write a thread, then apply eventlet or gevent to the whole code via monkey-patching, and the thread becomes a greenlet, essentially an in-between function call. You put a sleep in it so it doesn't hog all resources, and you have your sender. Greenlets share the environment easily; they are bound only by CPU, not I/O (which is also true of threads in Python, but greenlets are even more lightweight because there is no OS-level context switch at all).
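A rough sketch of that setup with eventlet (the outbox queue and the 0.5-second sleep are arbitrary choices, not a fixed recipe):

import eventlet
eventlet.monkey_patch()  # sockets, sleep, etc. become cooperative

import queue
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)
outbox = queue.Queue()   # the rest of your code drops (event, data) pairs in here

def sender_loop():
    # Runs as a greenlet next to the server; sleep() yields control so it never
    # hogs the event loop.
    while True:
        try:
            event, data = outbox.get_nowait()
            socketio.emit(event, data)
        except queue.Empty:
            pass
        eventlet.sleep(0.5)

socketio.start_background_task(sender_loop)

if __name__ == '__main__':
    socketio.run(app)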
But as soon as CPU load increased, I switched over to client/server. Introducing IPC would have required massive rewrites from the ground up.
I am developing a small Python program for the Raspberry Pi that listens for some events on a Zigbee network.
The way I've written this is rather simplistic: I have a while True: loop checking for a Unique ID (UID) from the Zigbee. If a UID is received, it's looked up in a dictionary containing some callback methods. So, for instance, in the dictionary the key 101 is tied to a method called PrintHello().
So if that key/UID is received, the method PrintHello will be executed. Pretty simple, like so:
if UID in self.expectedCallBacks:
    self.expectedCallBacks[UID]()
I know this approach is probably too simplistic. My main concern is: what if the system is busy handling a method and receives another message?
On an embedded MCU I could handle this easily with a circular buffer + interrupts, but I'm a bit lost when it comes to doing this on an RPi. Do I need to implement a new thread for the Zigbee module that basically fills a buffer the callback handler can then retrieve/read from?
I would appreciate any suggestions on how to implement this more robustly.
Threads can definitely help to some degree here. Here's a simple example using a ThreadPool:
from multiprocessing.pool import ThreadPool

pool = ThreadPool(2)  # create a pool with 2 worker threads

while True:
    uid = zigbee.get_uid()
    if uid in self.expectedCallbacks:
        pool.apply_async(self.expectedCallbacks[uid])
That will kick off the callback in a thread in the thread pool, and should help prevent events from getting backed up before you can send them to a callback handler. The ThreadPool will internally handle queuing up any tasks that can't be run when all the threads in the pool are already doing work.
However, remember that the original Raspberry Pi has only a single CPU core, so you can't execute more than one CPU-bound operation concurrently (and that's before even considering the limitations the GIL places on threading in Python, which are normally worked around by using multiple processes instead of threads). That means that no matter how many threads/processes you have, only one can get access to the CPU at a time. For that reason, you probably don't want more than one thread actually running the callbacks: as you add more, you just slow things down, because the OS has to keep switching between them.
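Along the same lines, here is a sketch of the "reader thread fills a buffer" idea from the question, with a single dispatcher running the callbacks; zigbee.get_uid() and self.expectedCallbacks are the same placeholders as above:

import queue
import threading

events = queue.Queue()          # thread-safe buffer between reader and dispatcher

def reader():
    # Dedicated thread: do nothing but pull UIDs off the Zigbee and buffer them,
    # so nothing is missed while a callback is running.
    while True:
        events.put(zigbee.get_uid())

def dispatcher(callbacks):
    # Single consumer: callbacks run one at a time, in the order received.
    while True:
        uid = events.get()
        callback = callbacks.get(uid)
        if callback is not None:
            callback()

threading.Thread(target=reader, daemon=True).start()
dispatcher(self.expectedCallbacks)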
I am implementing an app consuming a few http streams at the same time.
All threads (a pycurl object each) are spawned in the same loop.
The trick is how to build a proper architecture for handling reconnects.
Is it good practice to create a separate controller thread that somehow checks which connections are not alive or need a forced reconnect? Or should such a task perhaps be done in separate processes?
I would suggest having one controlling thread that spawns the HTTP streaming threads, with each streaming thread implementing the proper handling for a connection loss or timeout (e.g. either terminating itself or telling the controlling thread that a new streaming thread should be spawned for a reconnect). Depending on your HTTP serving peer, you could also try to continue an interrupted stream by using the HTTP Content-Range feature.
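A minimal sketch of that controller/streamer split with pycurl (the URLs and the no-op write handler are placeholders):

import queue
import threading
import pycurl

failed = queue.Queue()          # streaming threads report their URL here when they die

def stream(url):
    try:
        c = pycurl.Curl()
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.WRITEFUNCTION, lambda data: None)  # replace with your real handler
        c.perform()                                         # returns or raises when the stream ends
        c.close()
    except pycurl.error:
        pass
    failed.put(url)             # either way, ask the controller for a reconnect

def controller(urls):
    for url in urls:
        threading.Thread(target=stream, args=(url,), daemon=True).start()
    while True:
        dead = failed.get()     # block until some stream dies...
        threading.Thread(target=stream, args=(dead,), daemon=True).start()  # ...then respawn it

controller(['http://example.com/stream1', 'http://example.com/stream2'])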
from multiprocessing import Process

a = Process(target=worker, args=())
a.start()
I am making a multi-worker-process app (don't laugh yet) in which each worker can be gracefully reloaded. Whenever the code is updated, new requests are served by new worker processes with the new code. This is so that:
a newly launched worker contains the updated code
no requests are dropped
I already made a worker that listens and:
serves a request when it gets a request signal
kills itself when the next signal is a control signal
I did this in ZeroMQ. The clients connect to this server using ZeroMQ; they do not interact over HTTP.
What is a good way to reload the code? Can you explain a scheme that is simple and stupid enough to be robust?
What I have in mind / can do:
Launch a thread within the main process that iterates:
Signal every worker process to die
Launch new worker processes
But this approach will drop requests (I configured it that way) between the death of the last old worker and the spawning of the first new worker.
And no, I am not a college student. The "homework" just means a curiosity-driven pursuit.
Reloading code in Python is a notoriously difficult problem.
Here's how I would deal with it:
at server startup, listen on your HTTP port (but do not start accepting connections)
Use multiprocessing or some such to create some worker processes; this should happen after the socket starts listening so that the subprocesses inherit the socket.
each worker can then accept connections and service requests.
when the parent process learns that it should reload, it shuts down the listening socket.
when a worker tries to accept on the closed socket, it receives a socket.error exception and should terminate
the parent process can start a new main process (as in subprocess.Popen(sys.argv)); the new process can start accepting connections immediately.
the old process can now wait for the child workers to finish; the children cannot accept new connections (since the listening socket is shut down). Once all child processes have finished handling in-flight requests and exited, the parent process can also terminate.
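A rough sketch of this scheme on Linux (fork start method); the port, the handle() function, and using SIGHUP as the reload trigger are placeholders I picked, not part of the scheme itself:

import signal
import socket
import subprocess
import sys
from multiprocessing import Process

def handle(conn):
    conn.sendall(b'hello\n')                 # placeholder request handler

def worker(listener):
    while True:
        try:
            conn, _ = listener.accept()      # inherited listening socket
        except OSError:
            return                           # parent shut the socket down: exit
        handle(conn)
        conn.close()

if __name__ == '__main__':
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR lets the replacement process bind while the old, shut-down
    # socket still exists.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(('0.0.0.0', 8000))
    listener.listen(128)

    workers = [Process(target=worker, args=(listener,)) for _ in range(4)]
    for w in workers:
        w.start()

    def reload_handler(signum, frame):
        # Shutting down the listening socket makes the blocked accept() calls in
        # the children fail, so they finish their current request and exit.
        listener.shutdown(socket.SHUT_RDWR)
        subprocess.Popen([sys.executable] + sys.argv)   # new parent with the new code
        for w in workers:
            w.join()
        sys.exit(0)

    signal.signal(signal.SIGHUP, reload_handler)
    while True:
        signal.pause()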
The way that I did this in Python is fairly simple, but it depends on a middleman or message broker. Basically, I receive a message, process it, and ack it. If a message is not acked, then after a timeout the broker requeues it.
With this in place, you simply kill and restart the process. In my case the process traps SIGINT and does an orderly shutdown. However it dies, the supervisor notices that the worker has died and starts a new one, which continues processing messages from the queue.
I was inspired by Erlang's supervision tree model where everything is designed to survive the death of a process. I even made my workers send a heartbeat to the supervisor periodically (ZeroMQ PUB to supervisor SUB) so that the supervisor can kill and restart a process if it hangs for any reason.
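A hedged sketch of that heartbeat with ZeroMQ PUB/SUB (the endpoint, the 1-second interval, and the 5-second timeout are arbitrary choices, not the values I actually used):

import time
import zmq

HEARTBEAT_ADDR = 'tcp://127.0.0.1:5599'

def worker_heartbeat(worker_id):
    # Runs inside each worker (e.g. in its own thread): publish "I'm alive" periodically.
    ctx = zmq.Context.instance()
    pub = ctx.socket(zmq.PUB)
    pub.connect(HEARTBEAT_ADDR)
    while True:
        pub.send_string(worker_id)
        time.sleep(1)

def supervisor(worker_ids, timeout=5.0):
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.bind(HEARTBEAT_ADDR)
    sub.setsockopt_string(zmq.SUBSCRIBE, '')
    last_seen = {w: time.time() for w in worker_ids}
    while True:
        if sub.poll(1000):                       # wait up to 1 s for a heartbeat
            last_seen[sub.recv_string()] = time.time()
        for w, t in last_seen.items():
            if time.time() - t > timeout:
                print(f'{w} looks hung; kill and restart it here')
                last_seen[w] = time.time()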
I am using the multiprocessing python module with Queue for communication between processes. Some processes only send (i.e. queue.put) and I can't seem to find a way to detect when the receiving end gets terminated abruptly.
Is there a way to detect if the process at the other end of the Queue gets terminated, without having to get from the Queue? Isn't there a signal I could trap somehow? Or do I have to periodically get from the Queue and trap the EOFError manually?
I don't believe multiprocessing sets up a "watch-dog" process for you to take care of crashes or kills of some of your processes. It may be worth your while to set one up (pretty hard to do cross-platform, but if, say, you're only worried about Linux, it's not that terrible).
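One way to sketch such a watchdog in current Python (3.3+) is to wait on process sentinels, so you never have to touch the Queue; the crashing consumer here is just a stand-in:

from multiprocessing import Process, Queue
from multiprocessing.connection import wait

def consumer(q):
    raise RuntimeError('simulated crash')    # the receiving end dies abruptly

if __name__ == '__main__':
    q = Queue()
    c = Process(target=consumer, args=(q,))
    c.start()

    # Watchdog: wait() returns as soon as the consumer's sentinel becomes ready,
    # i.e. when that process dies for any reason, without getting from the Queue.
    wait([c.sentinel])
    print('consumer terminated with exit code', c.exitcode)
    # ...from here, notify the sending processes (an Event, a flag, a signal, etc.)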