Short brief about my situation:
I'm writing a server (Twisted-powered) which handles WebSocket connections with multiple clients (over 1000). Messages sent from server to client are handled via a Redis pub/sub interface (because messages can also be submitted via REST) in this flow:
* the REST endpoint appends a command for the client and publishes it,
* Twisted gets poked because it is subscribed to that Redis channel,
* the message is added to the client's queue and waits for further processing.
Now, when a client connects and gets registered, I launch an inlineCallbacks loop for each client to sweep through its queue, like this:
from twisted.internet.defer import inlineCallbacks

@inlineCallbacks
def client_queue_handler(self, uuid):
    queue = self.send_queue[uuid]
    client = self.get_client(uuid)
    while True:
        uniqueID = yield queue.get()
        client_context = self.redis_handler.get_single(uuid)
        msg_context = next((msg
                            for msg in client_context
                            if msg['unique'] == uniqueID),
                           None)
        client.sendMessage(msg_context)
As I said previously, many clients may connect. Is it perfectly fine that each client has its own inlineCallbacks loop which runs forever? As far as I know, Twisted has a customizable thread pool limit. What will happen if there are more clients (inlineCallbacks) than threads in the thread pool? Will queue.get() block/sleep that "virtual thread" and pass control to another one? Maybe one "global" loop which sweeps all clients is a better option?
inlineCallbacks doesn't start any OS threads. It's just a different interface to using Deferred. A Deferred is just an API for dealing with callbacks.
queue.get() returns a Deferred. When you yield it, inlineCallbacks internally adds a callback to it and your function remains suspended. When the callback fires, inlineCallbacks resumes your function with the value passed to the callback - which is the "result" of the Deferred you yielded.
All that's happening is some Deferred objects are being created and some callbacks are being added to them. Somewhere inside the implementation of your redis client, some event sources are "firing" the Deferred with a result which starts the process of calling its callbacks.
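For illustration, here is a minimal sketch of the same mechanics spelled out with explicit callbacks, using Twisted's DeferredQueue (process_item is a hypothetical callback):

from twisted.internet.defer import DeferredQueue

queue = DeferredQueue()

def process_item(item):
    # Runs when the Deferred fires; no OS thread is involved
    print("got", item)

d = queue.get()              # returns a Deferred immediately, nothing blocks
d.addCallback(process_item)  # what inlineCallbacks does for you behind a yield
queue.put("hello")           # fires the Deferred, calling process_item("hello")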
You can have as many of these:
* as you have system memory to hold
* as the redis client can keep track of at a time
I don't know the details of how your redis client is implemented. If it has to open a socket per queue then you'll probably be limited to the number of file descriptors you can open or the number of sockets your system can support. These numbers will be somewhere in the tens of thousands and when you bump into them, there are tricks you can deploy to raise the limit further.
If it doesn't have to open a socket per queue (for example, if it can multiplex notification for all the queues over a single socket) then it probably has a limit that's much, much higher (maybe imposed by algorithmic complexity of the slowest part of its implementation).
I have a flask-socketio server running on multiple pods, using redis as a message queue. I want to ensure that emits from external processes reach their destination 100% of the time, or to know when they have failed.
When process A emits an event to a socket that's connected to process B, the event goes through the message queue to process B, then to the client. Is there any way I can intercept the outgoing emit on process B? Ideally I'd then use a worker to check after a few seconds whether the message reached the client (via a confirm event emitted from the client), and emit it again if not.
This code runs on process A:
@app.route('/ex')
def ex_route():
    socketio.emit('external', {'text': f'sender: {socket.gethostname()}, welcome!'}, room='some_room')
    return jsonify(f'sending message to room "some_room" from {socket.gethostname()}')
This is the output from process A
INFO:socketio.server:emitting event "external" to some_room [/]
INFO:geventwebsocket.handler:127.0.0.1 - - [2019-01-11 13:33:44] "GET /ex HTTP/1.1" 200 177 0.003196
This is the output from process B
INFO:engineio.server:9aab2215a0da4816a45e3fdc1e449fce: Sending packet MESSAGE data 2["external",{"text":"sender: *******, welcome!"}]
There is currently no mechanism to do what you ask, unfortunately.
I think you basically have two approaches to go about this:
Always run your emits from the main server(s). If you need to emit from an auxiliary process, use an IPC mechanism to notify the server so that it can run the emit on its behalf. And now you can use callbacks.
Ignore the callbacks, and instead have the client acknowledge receipt of the event by emitting back to the server.
Adding callback support for auxiliary processes should not be terribly difficult, by the way. I never needed that functionality myself and you are the first to ask about it. Maybe I should look into that at some point.
Edit: after some thought, I came up with a 3rd option:
You can connect your external process to the server as a client, instead of using the "emit-only" option. If this process is a client, it can emit an event to the server, which in turn the server can relay to the external client. When the client replies to the server, the server can once again relay the response to the external process, which is now another client and has full send and receive capabilities.
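As a rough sketch of that third option, assuming the python-socketio client library (the event name 'external' is from the question; on_ack is a hypothetical callback):

import socketio

sio = socketio.Client()

def on_ack(data):
    # Fires when the server acknowledges the emit
    print("delivered:", data)

sio.connect('http://localhost:5000')
# The server can relay this event to the room and, when the browser
# client answers, relay the response back to this process
sio.emit('external', {'text': 'welcome!'}, callback=on_ack)
sio.wait()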
Using IPC is not very robust: if the server is receiving a lot of requests, a message might be received but never relayed onward, which is a problem when delivery is vital.
Use celery, zmq, or redis itself for the interconnect. The most natural option is using socketio itself, as Miguel mentioned, since it is already waiting for requests, has the environment, and can emit at any time.
I've used a greenlet hack instead of threads: a greenlet is lighter than a thread and runs in the same environment, which lets it send the message while your main thread awaits the socket in non-blocking mode. Basically you write a thread, then apply eventlet or gevent to the whole code via monkey-patching, and the thread becomes a greenlet, an in-between function call. You put a sleep in it so it doesn't hog all resources, and you have your sender. Greenlets share the environment easily; they are not bound by I/O, just CPU (which is the same for threads in Python, but greenlets are even more lightweight since there is no OS-level context switch at all). A rough sketch follows.
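This is roughly what that looked like, assuming eventlet (send_queue and the FakeWS stand-in are hypothetical names; replace FakeWS with your real websocket object):

import eventlet
eventlet.monkey_patch()  # after this, threads run as greenlets

import queue
import threading

send_queue = queue.Queue()

def sender_loop(ws):
    # Drain the queue and send over the websocket, yielding between polls
    while True:
        try:
            ws.send(send_queue.get_nowait())
        except queue.Empty:
            pass
        eventlet.sleep(0.1)  # sleep so this greenlet doesn't hog the CPU

class FakeWS:  # stand-in for the real websocket object
    def send(self, data):
        print("sending:", data)

threading.Thread(target=sender_loop, args=(FakeWS(),)).start()
send_queue.put("hello")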
But as soon as CPU load increased, I switched over to client/server. Introducing IPC would have required massive rewrites from the ground up.
I am writing a Python program where in the main thread I am continuously (in a loop) receiving data through a TCP socket, using the recv function. In a callback function, I am sending data through the same socket, using the sendall function. What triggers the callback is irrelevant. I've set my socket to blocking.
My question is, is this safe to do? My understanding is that a callback function is called on a separate thread (not the main thread). Is the Python socket object thread-safe? From my research, I've been getting conflicting answers.
Sockets in Python are not thread-safe.
You're trying to solve a few problems at once:
Sockets are not thread-safe.
recv is blocking and blocks the main thread.
sendall is being used from a different thread.
You may solve these by either using asyncio or solving it the way asyncio solves it internally: by using select.select together with a socketpair, and using a queue for the outgoing data.
import select
import socket
import queue
import threading

# Any data put into this queue will be sent
send_queue = queue.Queue()

# Any data sent to ssock shows up on rsock
rsock, ssock = socket.socketpair()

main_socket = socket.socket()
# Create the connection with main_socket, fill this up with your code

# Your callback thread
def different_thread(data):
    # Put the data to send inside the queue
    send_queue.put(data)
    # Trigger the main thread by sending data to ssock which goes to rsock
    ssock.send(b"\x00")

# Run the callback thread
threading.Thread(target=different_thread, args=(b"example",), daemon=True).start()

while True:
    # When either main_socket or rsock has data, select.select will return
    rlist, _, _ = select.select([main_socket, rsock], [], [])
    for ready_socket in rlist:
        if ready_socket is main_socket:
            data = main_socket.recv(1024)
            # Do stuff with data, fill this up with your code
        else:
            # ready_socket is rsock
            rsock.recv(1)  # Dump the ready mark
            # Send the data
            main_socket.sendall(send_queue.get())
We use multiple constructs here. You will have to fill in the empty spaces with your code of choice. As for the explanation:
We first create a send_queue which is a queue of data to send. Then, we create a pair of connected sockets (socketpair()). We need this later on in order to wake up the main thread as we don't wish recv() to block and prevent writing to the socket.
Then, we connect the main_socket and start the callback thread. Now here's the magic:
In the main thread, we use select.select to know if the rsock or main_socket has any data. If one of them has data, the main thread wakes up.
Upon adding data to the queue, we wake up the main thread by signaling ssock which wakes up rsock and thus returns from select.select.
In order to fully understand this, you'll have to read select.select(), socketpair() and queue.Queue().
@tobias.mcnulty asked a good question in the comments: Why should we use a Queue instead of sending all the data through the socket?
You can use the socketpair to send the data as well, which has its benefits, but sending over a queue might be preferable for multiple reasons:
Sending data over a socket is an expensive operation. It requires a syscall, requires passing data back and forth through system buffers, and entails a full pass through the network stack. Using a Queue guarantees we'll have only one syscall - for the single-byte signal - and not more (apart from the queue's internal lock, but that one is pretty cheap). Sending large data through the socketpair would result in multiple syscalls. As a tip, you may as well use a collections.deque, which CPython guarantees to be thread-safe because of the GIL; that way you won't require any syscall besides the socketpair's (see the sketch after this list).
Architecture-wise, using a queue allows you to have finer-grained control later on. For example, the data can be sent in whichever type you wish and be decoded afterwards. This allows the main loop to be a little smarter and can help you create an easier interface.
You don't have size limits. This can be a bug or a feature. Changing the system's buffer size is not exactly encouraged, so a socket imposes a natural throttle on the amount of data you can send. That might be a benefit, but the application may wish to control throttling on its own, and relying on that "natural" limit will cause the calling thread to hang.
Just as with the socketpair's recv syscalls, large data will require passing through multiple select calls as well. TCP does not have message boundaries, so you'll either have to create artificial ones, set the socket to non-blocking and deal with asynchronous sockets, or treat it as a stream and keep going through select calls, which might be expensive depending on your OS.
Support for multiple threads on the same socketpair. Sending 1 byte for signalling over a socket from multiple threads is fine, and is exactly how asyncio works. Sending more than that may cause the data to be sent in an incorrect order.
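Here is the deque variant mentioned in the first point, as a sketch against the example above (same names; only the queue changes):

from collections import deque

send_queue = deque()

# Producer thread: append is thread-safe under CPython's GIL
#   send_queue.append(data)
#   ssock.send(b"\x00")

# Main loop, in place of send_queue.get():
#   main_socket.sendall(send_queue.popleft())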
All in all, transferring the data back and forth between the kernel and userspace is possible and will work, but I personally do not recommend it.
Is there a way to run all messages that arrive on the same websocket sequentially, in a blocking way, without blocking messages arriving on different websockets?
So let's assume someone is using ThreadPoolExecutor with 8 threads (to utilize all available cores), together with the yield statement and the @gen.coroutine decorator; every time the server runs executor.submit, the task goes to an arbitrary thread. I'd like to enforce that, for a given WebSocket, only one thread handles its tasks, in order to ensure things run sequentially.
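For concreteness, a sketch of the setup described above, assuming Tornado's @gen.coroutine and concurrent.futures (handle_message is a hypothetical blocking task):

from concurrent.futures import ThreadPoolExecutor
from tornado import gen

executor = ThreadPoolExecutor(max_workers=8)

def handle_message(message):
    # Blocking work for a single message
    return message

@gen.coroutine
def on_message(message):
    # executor.submit may pick any of the 8 threads, so two messages from
    # the same websocket can be processed concurrently and out of order
    result = yield executor.submit(handle_message, message)
    raise gen.Return(result)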
I am randomly getting error 1006 ("I failed the WebSocket connection by dropping the TCP connection") when trying to write messages from threads using Tornado's websocket server handler.
I created N threads and passed my ws_handler to them.
But when I start using
self.ws_handler.write_message(jsondata)
for a large number of threads, I keep getting the same error.
From what I understand, 1006 means the TCP connection was dropped, e.g. when a 'heartbeat' communication is missed between websockets. I am guessing this is because of threads running in parallel and trying to send messages. I tested it using 2-3 threads and it works fine, but for a large number it doesn't.
I wonder if there's any method to achieve message sending from within threads (meaning the lock is handled internally by ws_handler, which sends accordingly).
One solution I am thinking of is to push jsondata into a queue and have another single thread push the messages, but I fear that would create a bottleneck.
My client is AutobahnPython.
Tornado is based on a single-threaded event loop; all interactions with Tornado objects must be on the event loop's thread. Use IOLoop.current().add_callback() from another thread when you need to transfer control back to the event loop.
See also http://www.tornadoweb.org/en/stable/web.html#thread-safety-notes
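A minimal sketch of that pattern, capturing the IOLoop while on the event loop's thread (ws_handler and jsondata are the names from the question; Sender is a hypothetical wrapper):

from tornado.ioloop import IOLoop

class Sender:
    def __init__(self, ws_handler):
        self.ws_handler = ws_handler
        # Capture the loop while still on the event loop's thread
        self.io_loop = IOLoop.current()

    def send_from_thread(self, jsondata):
        # Safe from any worker thread: write_message itself runs
        # on the event loop's thread
        self.io_loop.add_callback(self.ws_handler.write_message, jsondata)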
I have an idea. Write a WebSocket-based RPC that would process messages according to the scenario below.
Client connects to a WS (web socket) server
Client sends a message to the WS server
WS server puts the message into the incoming queue (can be a multiprocessing.Queue or RabbitMQ queue)
One of the workers in the process pool picks up the message for processing
Message is being processed (can be blazingly fast or extremely slow - it is irrelevant for the WS server)
After the message is processed, the results of the processing are pushed to the outgoing queue
WS server pops the result from the queue and sends it to the client
NOTE: the key point is that the WS server should be non-blocking and responsible only for:
connection acceptance
getting messages from the client and putting them into the incoming queue
popping messages from the outgoing queue and sending them back to the client
NOTE2: it might be a good idea to store client identifier somehow and pass it around with the message from the client
NOTE3: it is completely fine that, because of queueing the messages back and forth, the speed of simple message processing (e.g. get message as input and push it back as a result) will be lower. The target goal is to be able to run processor-expensive operations (rough non-practical example: several nested "for" loops) in the pool with the same code style as handling fast messages, i.e. pop a message from the input queue together with some sort of client identifier, process it (might take a while), and push the processing results together with the client ID to the output queue.
Questions:
In TornadoWeb, if I have a queue (multiprocessing or Rabbit), how can I make Tornado's IOLoop trigger some callback whenever there is a new item in that queue? Can you point me to an existing implementation if there is any?
Is there any ready implementation of such a design? (Not necessarily with Tornado)
Maybe I should use another language (not python) to implement such a design?
Acknowledgments:
Recommendations to use REST and WSGI for whatever goal I aim to achieve are not welcome
Comments like "Here is a link to the code that I found by googling for 2 seconds. It has some imports from tornado and multiprocessing. I am not sure what it does, however I am 99% certain that it is exactly what you need" are not welcome either
Recommendations to use asynchronous libraries instead of normal blocking ones are ... :)
Tornado's IOLoop allows you to handle events from any file object by its file descriptor, so you could try this:
* connect with each of your worker processes through multiprocessing.Pipe
* call add_handler for each pipe's parent end (using the connection's fileno())
* make the workers write some random garbage each time they put something into the output queue, whether that's a multiprocessing.Queue or any MQ
* handle the answers from the workers in the event handlers
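A hedged sketch of those steps, assuming Tornado's IOLoop.add_handler and multiprocessing (worker, on_worker_ready, and result_queue are hypothetical names):

import multiprocessing
from tornado.ioloop import IOLoop

result_queue = multiprocessing.Queue()
parent_conn, child_conn = multiprocessing.Pipe()

def worker(conn, results):
    # ... pop a task from the input queue and process it ...
    results.put({"client_id": 1, "result": "done"})
    conn.send(b"x")  # the "random garbage" that wakes the IOLoop

def on_worker_ready(fd, events):
    parent_conn.recv()           # drain the wake-up token
    result = result_queue.get()  # pop the real payload
    # ... look up the client by result["client_id"] and send it the result ...

multiprocessing.Process(target=worker, args=(child_conn, result_queue)).start()
IOLoop.current().add_handler(parent_conn.fileno(), on_worker_ready, IOLoop.READ)
IOLoop.current().start()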