Controlling http streams with python threads - python

I am implementing an app that consumes a few HTTP streams at the same time.
All threads (one pycurl object each) are spawned in the same loop.
The tricky part is building a proper architecture for handling reconnects.
Is it good practice to create a separate controller thread that somehow
checks which connections are not alive or need a forced reconnect?
Or should such a task be done in separate processes?

I would suggest having one controlling thread that spawns the HTTP streaming threads, where each streaming thread implements the proper handling for a connection loss or timeout (e.g. either terminating itself, or telling the controlling thread that a new streaming thread should be spawned for a reconnect). Depending on your HTTP serving peer, you could also try to resume an interrupted stream by using the HTTP Content-Range feature.
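A minimal sketch of that controller/streamer split, using only the standard library. Here stream_once is a hypothetical stand-in for your pycurl transfer loop, and it simply simulates a dropped connection so the reconnect path is exercised:

```python
import queue
import threading
import time

RECONNECT = "reconnect"  # message a streamer sends when its connection dies

def stream_once(url):
    # Stand-in for a pycurl perform() loop; pretend the connection drops.
    time.sleep(0.01)
    raise ConnectionError(f"stream to {url} lost")

def streamer(url, events):
    try:
        stream_once(url)
    except ConnectionError:
        # Don't retry here: report back so the controller decides.
        events.put((RECONNECT, url))

def controller(urls, max_reconnects=3):
    events = queue.Queue()
    reconnects = 0
    for url in urls:
        threading.Thread(target=streamer, args=(url, events)).start()
    # React to streamer reports; respawn a streaming thread per reconnect.
    while reconnects < max_reconnects:
        kind, url = events.get()
        if kind == RECONNECT:
            reconnects += 1
            threading.Thread(target=streamer, args=(url, events)).start()
    return reconnects

count = controller(["http://example.com/stream"])
```

The controller blocks on the queue rather than polling connection state, so it only wakes up when a streamer actually reports a problem.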

Related

Ensuring socket messages were sent using flask-socketio, redis

I have a flask-socketio server running on multiple pods, using redis as a message queue. I want to ensure that emits from external processes reach their destination 100% of the time, or to know when they have failed.
When process A emits an event to a socket that's connected to process B, the event goes through the message queue to process B, and then to the client. Is there any way I can intercept the outgoing emit on process B? Ideally I'd then use a worker to check after a few seconds whether the message reached the client (via a confirm event emitted from the client); if not, it would be emitted again.
This code runs on process A:
@app.route('/ex')
def ex_route():
    socketio.emit('external', {'text': f'sender: {socket.gethostname()}, welcome!'}, room='some_room')
    return jsonify(f'sending message to room "some_room" from {socket.gethostname()}')
This is the output from process A
INFO:socketio.server:emitting event "external" to some_room [/]
INFO:geventwebsocket.handler:127.0.0.1 - - [2019-01-11 13:33:44] "GET /ex HTTP/1.1" 200 177 0.003196
This is the output from process B
INFO:engineio.server:9aab2215a0da4816a45e3fdc1e449fce: Sending packet MESSAGE data 2["external",{"text":"sender: *******, welcome!"}]
There is currently no mechanism to do what you ask, unfortunately.
I think you basically have two approaches to go about this:
Always run your emits from the main server(s). If you need to emit from an auxiliary process, use an IPC mechanism to notify the server so that it can run the emit on its behalf. And now you can use callbacks.
Ignore the callbacks, and instead have the client acknowledge receipt of the event by emitting back to the server.
Adding callback support for auxiliary processes should not be terribly difficult, by the way. I never needed that functionality myself and you are the first to ask about it. Maybe I should look into that at some point.
Edit: after some thought, I came up with a 3rd option:
You can connect your external process to the server as a client, instead of using the "emit-only" option. If this process is a client, it can emit an event to the server, which in turn the server can relay to the external client. When the client replies to the server, the server can once again relay the response to the external process, which is now just another client with full send and receive capabilities.
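The second approach (client-side acknowledgement with re-emit) can be sketched generically. This is plain Python illustrating only the retry bookkeeping, not real Socket.IO code; emit_with_retry, on_client_confirm, and the delivered list are hypothetical stand-ins for the transport:

```python
import threading
import time
import uuid

pending = {}              # msg_id -> (event, data); emits awaiting an ack
lock = threading.Lock()
delivered = []            # stand-in for the wire; a real app calls socketio.emit

def emit_with_retry(event, data, retries=3, timeout=0.05):
    msg_id = uuid.uuid4().hex
    with lock:
        pending[msg_id] = (event, data)
    for _ in range(retries):
        delivered.append((msg_id, event, data))  # the actual emit would go here
        time.sleep(timeout)                      # wait for the client's confirm
        with lock:
            if msg_id not in pending:
                return True                      # client acknowledged
    return False                                 # gave up after N attempts

def on_client_confirm(msg_id):
    # Handler for the confirm event the client emits back to the server.
    with lock:
        pending.pop(msg_id, None)

# Simulate a client that acknowledges the first delivery from another thread.
def fake_client():
    while not delivered:
        time.sleep(0.005)
    on_client_confirm(delivered[0][0])

threading.Thread(target=fake_client).start()
ok = emit_with_retry("external", {"text": "welcome"})
```

In a real deployment the confirm would arrive as a Socket.IO event from the browser, and the retry loop would live in a background worker rather than blocking the request.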
Using IPC is not very robust: if the server is receiving a lot of requests, you can end up in a situation where you receive a message but never relay it, and that message may be vital.
Use celery, zmq, or redis itself for the interconnect. The most natural option is using socketio itself, as Miguel mentioned, since it is already waiting for requests, has the environment, and can emit at any time.
I've used a greenlet hack instead of threads. A greenlet is lighter than a thread and runs in the same environment, which lets it send the message while your main thread awaits the socket in non-blocking mode. Basically, you write a thread, then apply eventlet or gevent to the whole code via monkeypatching, and the thread becomes a greenlet (in effect, an in-between function call). You put a sleep in it so it doesn't hog all resources, and you have your sender. Greenlets share the environment easily; they are bound only by CPU, not by I/O (which is also true for threads in Python, but greenlets are even more lightweight because there is no OS-level context switch at all).
But as soon as CPU load increased, I switched over to client/server. Introducing IPC would have required massive rewrites from the ground up.

EOFError with multiprocessing Manager

I have a bunch of clients connecting to a server via 0MQ. I have a Manager queue used for a pool of workers to communicate back to the main process on each client machine.
On just one client machine running 250 worker processes, I see a bunch of EOFErrors almost instantly. They occur at the point where the put() is performed.
I would expect heavy communication to slow everything down, but I should never see EOFErrors inside multiprocessing's internal logic. I'm not using gevent or anything else that might break standard socket functionality.
Any thoughts on what could make puts to a Manager queue start raising EOFErrors?
For me the error was actually that my receiving process had thrown an exception and terminated, and so the sending process was receiving an EOFError, meaning that the interprocess communication pipeline had closed.

Writing a Java server for queueing incoming HTTP requests and processing them a little while later?

I want to write a Java server, maybe using Netty or anything else suggested.
The whole purpose is to queue incoming HTTP requests for a while, because the systems I'm targeting run super memory- and compute-intensive tasks, so if they are burdened with heavy load they eventually tend to crash.
I want a queue in place that allows at most 5 requests through to the destination at any given time and holds the rest of the requests in the queue.
Can this be achieved using Netty in Java? I'm equally open to an implementation in Scala, Python, or Clojure.
I did something similar with Scala Akka actors. Instead of HTTP requests, I had an unlimited number of job requests coming in and being added to a queue (a regular Queue). A Worker Manager would manage that queue and dispatch work to worker actors whenever they finished processing previous tasks. Workers would notify the Worker Manager that a task was complete, and it would send them a new one from the queue. So in this case there is no busy waiting or looping; everything happens on message reception. You can do the same with your HTTP requests. Akka can be used from Scala or Java, and the process I described is easier to implement than it sounds.
As a web server you could use anything, really: Jetty, a servlet container like Tomcat, or even spray-can. All it needs to do is receive a request and send a message to the Worker Manager. The whole system would be asynchronous and non-blocking.
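Since the asker is open to Python, the "at most 5 in flight, rest held back" part of the design can be sketched with a semaphore guarding the handoff to the backend. handle_request is a hypothetical stand-in for the real memory/compute-intensive destination call; the counters exist only to demonstrate the concurrency cap:

```python
import threading
import time

MAX_IN_FLIGHT = 5
gate = threading.Semaphore(MAX_IN_FLIGHT)  # at most 5 requests pass at once
in_flight = 0
peak = 0                                   # highest concurrency observed
counter_lock = threading.Lock()

def handle_request(req):
    # Stand-in for the heavy backend call; track concurrency for the demo.
    global in_flight, peak
    with counter_lock:
        in_flight += 1
        peak = max(peak, in_flight)
    time.sleep(0.02)
    with counter_lock:
        in_flight -= 1

def queued_handler(req):
    with gate:  # blocks here while 5 requests are already in flight
        handle_request(req)

threads = [threading.Thread(target=queued_handler, args=(i,)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The same idea carries over to Netty or Akka: the semaphore (or a bounded mailbox/dispatcher) is what turns "accept everything" into "process 5, queue the rest".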

Making Tornado websocket handler thread safe

I am randomly getting error 1006 (the WebSocket connection was closed by dropping the TCP connection) when trying to write messages from threads using Tornado's websocket server handler.
I created N threads and passed my ws_handler to them.
But when I start using
self.ws_handler.write_message(jsondata)
for a large number of threads, I keep getting the same error.
From what I understand, 1006 means the TCP connection was dropped, e.g. when a 'heartbeat' message between the websocket peers is missed. I am guessing this happens because the threads run in parallel and try to send messages concurrently. I tested it with 2-3 threads and it works fine, but for a large number it doesn't.
I wonder if there is any way to safely send messages from within threads (i.e. with the locking handled internally by ws_handler).
One solution I am considering is to push jsondata into a queue and have a single other thread push the messages out, but I fear that would create a bottleneck.
My client is AutobahnPython.
Tornado is based on a single-threaded event loop; all interactions with Tornado objects must be on the event loop's thread. Use IOLoop.current().add_callback() from another thread when you need to transfer control back to the event loop.
See also http://www.tornadoweb.org/en/stable/web.html#thread-safety-notes
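The same pattern expressed in stdlib asyncio terms (Tornado's IOLoop wraps the asyncio loop in current releases, and add_callback plays the role asyncio gives to call_soon_threadsafe): capture the loop object on the loop's own thread, then hand it callables from worker threads instead of touching handler objects directly. write_message here is a plain function standing in for the handler method:

```python
import asyncio
import threading

received = []

async def main():
    loop = asyncio.get_running_loop()  # captured on the event-loop thread
    done = asyncio.Event()

    def write_message(msg):
        # Runs on the loop thread, so touching loop-owned state is safe here.
        received.append(msg)
        done.set()

    def worker():
        # From a foreign thread, never call loop/handler methods directly;
        # schedule the call onto the loop instead.
        loop.call_soon_threadsafe(write_message, "hello from a thread")

    threading.Thread(target=worker).start()
    await asyncio.wait_for(done.wait(), timeout=5)

asyncio.run(main())
```

With Tornado the worker line would be io_loop.add_callback(self.write_message, msg), where io_loop was captured via IOLoop.current() on the event-loop thread; the queue-plus-single-writer design from the question becomes unnecessary because the event loop already serializes the callbacks.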

How to ensure that send method of the protocol is thread safe

I'm working on a TCP client-server application using the IntNReceiver protocol. The server accepts multiple TCP connections from clients. I would like to let other threads use the protocol's sendString method, on both the client and the server. I tried using a synchronized queue, monitored in a separate thread, and reactor.callFromThread() to call sendString from there. This seems to work, but there is a weird delay of about 20 seconds before sendString actually sends the string. It does not block, and returns immediately. I ran strace, and the send() system call is definitely delayed. What is the proper way to do this kind of thing with Twisted?
Just use callFromThread directly as your queue. The reactor is already synchronizing on and monitoring it. Anywhere you want to call foo.sendString() from a non-reactor thread, just do reactor.callFromThread(foo.sendString). Building additional infrastructure to do this (your own custom synchronized queues, for example) is just additional code that might break – as you have already discovered.
