Can pyzmq pub/sub sockets be used bidirectionally?


I'm using a pyzmq pub/sub socket for a server to advertise notifications to client subscribers. It works nicely but I have a question:
Is there any way to use the same socket to send information back to the server? Or do I need a separate socket for that?
Use case: I just want to allow the server to see who's actively subscribing to notifications, so I was hoping I could allow clients to send back periodic "heartbeat" messages. I have a use case where if no clients are listening, I want the server to spawn one. (This is a multiprocess system that uses localhost only.)

You need a separate socket. From the ZMQ guide (http://zguide.zeromq.org/page:all#Pros-and-Cons-of-Pub-Sub):
Killing back-chatter is essential to real scalability. With pub-sub, it's how the pattern can map cleanly to the PGM multicast protocol, which is handled by the network switch. In other words, subscribers don't connect to the publisher at all, they connect to a multicast group on the switch, to which the publisher sends its messages.
For this to work, a SUB socket cannot send data back to the publisher (at least not in a way visible to the user). The heartbeating problem is discussed in depth in the guide: http://zguide.zeromq.org/page:all#The-Asynchronous-Client-Server-Pattern
Also, check out the 7/MDP and 18/MDP protocols (http://rfc.zeromq.org/spec:7 -- this is also discussed in the guide) if you want to keep track of clients.
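As a minimal sketch of the "separate socket" approach, the server could keep its PUB socket for notifications and bind a second PULL socket on which clients push periodic heartbeats. The SubscriberTracker class, the run_server function, the port numbers and the 5-second timeout are all assumptions made for illustration; they do not come from the question or from the guide.

```python
import time


class SubscriberTracker:
    """Remembers when each client last sent a heartbeat (hypothetical helper)."""

    def __init__(self, timeout=5.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = {}

    def beat(self, client_id):
        # Record the arrival time of a heartbeat from this client.
        self.last_seen[client_id] = self.clock()

    def active(self):
        # Clients whose most recent heartbeat is within the timeout window.
        now = self.clock()
        return [c for c, t in self.last_seen.items() if now - t <= self.timeout]


def run_server(pub_addr="tcp://127.0.0.1:5556", hb_addr="tcp://127.0.0.1:5557"):
    # Imported here so the tracker above can be used without pyzmq installed.
    import zmq

    ctx = zmq.Context.instance()
    pub = ctx.socket(zmq.PUB)    # notifications: server -> clients
    pub.bind(pub_addr)
    hb = ctx.socket(zmq.PULL)    # heartbeats: clients -> server
    hb.bind(hb_addr)

    tracker = SubscriberTracker(timeout=5.0)
    while True:
        # Wait up to one second for a heartbeat, then publish regardless.
        if hb.poll(1000):
            tracker.beat(hb.recv_string())
        if not tracker.active():
            pass  # no live subscribers: this is where the server would spawn one
        pub.send_string("notification")
```

On the client side, the matching setup would be a SUB socket connected to pub_addr plus a PUSH socket connected to hb_addr that sends the client's id every second or so.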

Related

How to efficiently separate socket pollers using asyncio and zmq?

Let's say I want to implement an echo server and client using ZeroMQ (pyzmq) and asyncio (for its event loop, coroutines, etc.).
Now I want to add more reliability by adding a heartbeat. As I don't want to interact too much with my wonderful echo protocol, this heartbeat is done by both client and server on a dedicated pair of sockets.
From what I understand, the way to go™ is to create a new zmq socket in the server class, register it to the existing Poller and let the server class handle everything, from timeout calculation to sending beats. That works, of course.
But this is more complicated than it should be (that's a personal view). From the server's point of view, heartbeats are an implementation detail. They exist to answer one simple question: "is the client still there?". More technically, I would like to set up a Heartbeat object that takes a timeout and an address. That Heartbeat object would do all the socket setup and beat-related socket polling, and would send and receive the actual beats.
From the server point of view, I would just use client.is_alive() when required. But that would require two socket pollers to work in parallel. I can achieve that with an executor, but that does not seem right. How would you do that?
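One way to picture the Heartbeat object described above, sketched with asyncio only: instead of a second poller, the object owns a background task that awaits incoming beats. The recv argument is an assumption; it stands for any coroutine function that returns when a beat arrives (with pyzmq this could be the recv method of a zmq.asyncio socket, while the demo below uses an asyncio.Queue in its place).

```python
import asyncio
import time


class Heartbeat:
    """Tracks liveness of a peer from beats received in a background task."""

    def __init__(self, recv, timeout):
        self._recv = recv                  # awaitable source of beats (assumed)
        self._timeout = timeout
        self._last = time.monotonic()
        self._task = None

    def start(self):
        # Must be called from inside a running event loop.
        self._task = asyncio.create_task(self._listen())

    async def _listen(self):
        while True:
            await self._recv()
            self._last = time.monotonic()  # a beat arrived: refresh the timestamp

    def is_alive(self):
        return time.monotonic() - self._last <= self._timeout

    def stop(self):
        if self._task is not None:
            self._task.cancel()


async def demo():
    beats = asyncio.Queue()                # stands in for the heartbeat socket
    hb = Heartbeat(beats.get, timeout=0.2)
    hb.start()
    await beats.put(b"beat")
    await asyncio.sleep(0.05)
    alive_after_beat = hb.is_alive()
    await asyncio.sleep(0.4)               # silence longer than the timeout
    alive_after_silence = hb.is_alive()
    hb.stop()
    return alive_after_beat, alive_after_silence
```

Because the listener is just another task on the same event loop, the echo server's own receive loop and the heartbeat run side by side without an executor.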

How do chat servers distribute messages to multiple clients?

This is really a programming design question more than a specific language or library question. I'm tinkering with the idea of a standalone chat server for websockets that will accept several remote browser-based javascript clients. I'm going for something super simple at first, then might build it up. The server just keeps accepting client connections and listens for messages. When a message is received, it will be sent back to all the clients.
What I need to better understand is which approach is best for sending the messages out to all clients, specifically, sending immediately to all clients, or queuing the messages to each client's queue to be sent when a client connection handler's turn comes up. Below are the two examples in a python-like pseudo-code:
Broadcast Method
def client_handler(client):
    while True:
        if client.pending_msg:
            rmsg = client.recv()
            # Send the message to every connected client right away.
            for c in clients:
                c.send(rmsg)
        client.sleep(1)
Queue Method
def client_handler(client):
    while True:
        if client.pending_msg:
            rmsg = client.recv()
            # Append the message to every client's outgoing queue.
            for c in clients:
                c.queue_msg(rmsg)
        if client.has_queued:
            client.send_queue()
        client.sleep(1)
What is the best approach? Or, perhaps they are good for different use-cases, in which case, what are the pros, cons and circumstances for which they should be used. Thanks!

First of all, it seems odd to me that a single client handler would know about all the other existing clients. This is the first thing to abstract away: create a central message-processing handler that the individual clients talk to instead.
That handler can then either send the message directly to the clients (like in your broadcast example), or add them to queues of the clients (like your queue example). Which would be the preferred version depends a bit on your network protocol.
Since you said that you will be using websockets, you have a persistent network connection to the clients anyway, so you can just send them out immediately. There is no real gain to queue (and buffer) the messages. Ideally, a client would just have a send() method anyway, and the client would then internally decide whether that means appending it to a queue or sending it immediately over the network.
Furthermore, since websockets are kind of asynchronous in their nature, you don’t need busy wait loops anyway. You can just listen for messages from the client directly, process those, and broadcast them using your central handler. And since you then don’t have a wait loop anymore, there also would be no place where you work off your queue anymore, making the immediate broadcast the more natural decision.
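The central handler described above could look roughly like the following asyncio sketch. The Hub and FakeClient names are made up for illustration, and the only assumption about a real websocket connection object is that it exposes an awaitable send() method.

```python
import asyncio


class Hub:
    """Central message handler; clients never see each other directly."""

    def __init__(self):
        self.clients = set()

    def register(self, client):
        self.clients.add(client)

    def unregister(self, client):
        self.clients.discard(client)

    async def broadcast(self, message):
        # Send to every client concurrently; one failure does not block the rest.
        targets = list(self.clients)
        results = await asyncio.gather(
            *(c.send(message) for c in targets), return_exceptions=True
        )
        # Drop clients whose send raised (for example, a closed connection).
        for client, result in zip(targets, results):
            if isinstance(result, Exception):
                self.unregister(client)


class FakeClient:
    """Stand-in for a websocket connection, used only for this demo."""

    def __init__(self):
        self.received = []

    async def send(self, message):
        self.received.append(message)


async def demo():
    hub = Hub()
    a, b = FakeClient(), FakeClient()
    hub.register(a)
    hub.register(b)
    await hub.broadcast("hello")
    return a.received, b.received
```

Each connection handler then only awaits messages from its own client and calls hub.broadcast(); there is no sleep loop and no per-client queue to drain.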

Which protocol should I use for pyzmq?

I am working on a project where I have a client-server model in Python. I set up a server to monitor requests and send back data. pyzmq supports: tcp, udp, pgm, epgm, inproc and ipc. I have been using tcp for interprocess communication, but have no idea what I should use for sending a request over the internet to a server. I simply need something to put in:
socket.bind(BIND_ADDRESS)
DIAGRAM: Client Communicating over internet to server running a program

Any particular reason you're not using ipc or inproc for interprocess communication?
Other than that, generally, you can consider tcp the universal communicator; it's not always the best choice, but no matter what (so long as you actually have an IP address) it will work.
Here's what you need to know when making a choice between transports:
PGM/EPGM are multicast transports - the idea is that you send one message and it gets delivered as a single message until the last possible moment where it will be broken up into multiple messages, one for each receiver. Unless you absolutely know you need this, you don't need this.
IPC/Inproc are for interprocess communication... if you're communicating between different threads in the same process, or different processes on the same logical host, then these might be appropriate. You get the benefit of a little less overhead. If you might ever add new logical hosts, this is probably not appropriate.
Russell Borogove enumerates the difference between TCP and UDP well. Typically you'll want to use TCP. Use UDP only if raw speed matters more than reliability.
It was always my understanding that UDP wasn't supported by ZMQ, so if it's there it's probably added by the pyzmq binding.
Also, I took a look at your diagram - you probably want the server ZMQ socket to bind and the client ZMQ socket to connect... there are some reasons why you might reverse this, but as a general rule the server is considered the "reliable" peer, and the client is the "transient" peer, and you want the "reliable" peer to bind, the "transient" peer to connect.
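To make the bind/connect advice concrete, here is a small sketch. The endpoints helper, the example port and the REQ/REP socket pair are assumptions chosen for illustration; the question does not say which socket types are in use.

```python
def endpoints(host, port, transport="tcp"):
    """Return (bind_address, connect_address) for a server and its clients."""
    if transport == "tcp":
        # The server listens on all interfaces; clients name the server's host.
        return f"tcp://*:{port}", f"tcp://{host}:{port}"
    if transport == "ipc":
        # Same machine only; here `host` is treated as a filesystem path.
        return f"ipc://{host}", f"ipc://{host}"
    raise ValueError(f"unsupported transport: {transport}")


def round_trip(port=5555):
    # Imported here so endpoints() can be used without pyzmq installed.
    import zmq

    bind_addr, connect_addr = endpoints("127.0.0.1", port)
    ctx = zmq.Context.instance()
    server = ctx.socket(zmq.REP)   # stable peer: binds
    server.bind(bind_addr)
    client = ctx.socket(zmq.REQ)   # transient peer: connects
    client.connect(connect_addr)

    client.send_string("ping")
    request = server.recv_string()
    server.send_string("pong")
    reply = client.recv_string()

    client.close()
    server.close()
    return request, reply
```

For a server reachable over the internet, BIND_ADDRESS would be the first value returned for the tcp transport, and remote clients would connect to the second one with the server's public host name or IP address.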

Over the internet, TCP or UDP are the usual choices. I don't know if pyzmq has its own delivery guarantees on top of the transport protocol. If it doesn't, TCP will guarantee in-order delivery of all messages, while UDP may drop messages if the network is congested.
If you don't know what you want, TCP is the simplest and safest choice.

Multi-threaded UDP server with Python

I want to create a simple video streaming (actually, image streaming) server that can manage different protocols (TCP Push/Pull, UDP Push/Pull/Multicast).
I managed to get TCP Push/Pull working with the SocketServer.TCPServer class and ThreadingMixIn for processing each connected client in a different thread.
But now that I'm working on the UDP protocol, I just realized that ThreadingMixIn creates a thread per call of handle() per client query (as there's no such thing as a "connection" in UDP).
The problem is that I need to process a sequence of queries from the same client, for all the clients. How could I manage that?
The only way I can see to handle that is to keep a list of (client address, processing thread) pairs and send each query to the matching thread (or create a new one if the client hasn't sent any query yet). Is there an easier way to do that?
Thanks!
P.S.: I can't use any external or too "high-level" library for this, as it's a school assignment meant to teach how sockets work.

Take a look at Twisted. This will remove the need to do any thread dispatch from your application. You still have to match up packets to a particular session in order to handle them, but this isn't difficult (use a port per client and dispatch based on the port, or require packets in a session to always come from the same address and use the peer address, or use one of the existing protocols that solves this problem such as SIP).
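Since the question rules out external libraries, the "dispatch on the peer address" idea can also be sketched with the standard library alone rather than Twisted. The SessionDispatcher class, the serve function and the port number below are hypothetical names chosen for this example.

```python
import queue
import socket
import threading


class SessionDispatcher:
    """Routes each datagram to a worker thread dedicated to its sender."""

    def __init__(self, handler):
        self.handler = handler      # called as handler(addr, data), in order per client
        self.sessions = {}          # addr -> (queue, thread)

    def dispatch(self, addr, data):
        # Called only from the receiving loop, so the dict needs no lock.
        if addr not in self.sessions:
            q = queue.Queue()
            t = threading.Thread(target=self._work, args=(addr, q), daemon=True)
            self.sessions[addr] = (q, t)
            t.start()
        self.sessions[addr][0].put(data)

    def _work(self, addr, q):
        while True:
            data = q.get()
            if data is None:        # sentinel: end of session
                break
            self.handler(addr, data)

    def close(self):
        for q, _ in self.sessions.values():
            q.put(None)
        for _, t in self.sessions.values():
            t.join()


def serve(dispatcher, host="127.0.0.1", port=9999):
    # One receiving loop; per-client work happens in the session threads.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, addr = sock.recvfrom(65535)
            dispatcher.dispatch(addr, data)
```

Each client gets one queue and one thread, so its queries are processed in arrival order while different clients proceed independently. A real server would also expire sessions that have been idle for too long.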

Design question on Python network programming

I'm currently writing a project in Python which has a client and a server part. I have troubles with the network communication, so I need to explain some things...
The client mainly performs operations the server tells it to and sends the results of those operations back to the server. I need a way to communicate bidirectionally on a TCP socket.
Current Situation
I currently use a LineReceiver of the Twisted framework on the server side, and a plain Python socket (and ssl) on client side (because I was unable to correctly implement a Twisted PushProducer). There is a Queue on the client side which gets filled with data which should be sent to the server; a subprocess continuously pulls data from the queue and sends it to the server (see code below).
This scenario works well as long as only the client pushes its results to the manager. There is no way for the server to send data to the client. More accurately, there is no way for the client to receive data the server has sent.
The Problem
I need a way to send commands from the server to the client.
I thought about listening for incoming data in the client loop I use to send data from the queue:
def run(self):
    while True:
        data = self.queue.get()
        logger.debug("Sending: %s", repr(data))
        data = cPickle.dumps(data)
        self.socket.write(data + "\r\n")
        # Here would be a good place to listen on the socket
But there are several problems with this solution:
the SSLSocket.read() method is a blocking one
if there is no data in the queue, the client will never receive any data
Yes, I could use Queue.get_nowait() instead of Queue.get(), but all in all it's not a good solution, I think.
The Question
Is there a good way to achieve these requirements with Twisted? I really don't have enough Twisted experience to find my way around in it. I don't even know if using the LineReceiver is a good idea for this kind of problem, because it cannot send any data if it does not receive data from the client. There is only a lineReceived event.
Is Twisted (or, more generally, any event-driven framework) able to solve this problem? I don't even have a real event on the communication side. If the server decides to send data, it should be able to send it; if possible, there should be no need to wait for any event on the communication side.

"I don't even know if using the LineReceiver is a good idea for this kind of problem, because it cannot send any data, if it does not receive data from the client. There is only a lineReceived event."
You can send data using protocol.transport.write from anywhere, not just in lineReceived.

"I need a way to send commands from the server to the client."
Don't do this. It inverts the usual meaning of "client" and "server". Clients take the active role and send stuff or request stuff from the server.
"Is Twisted (or more general any event driven framework) able to solve this problem?"
It shouldn't. You're inverting the role of client and server.
"If the server decides to send data, it should be able to send it;"
False, actually.
The server is constrained to wait for clients to request data. That's generally the accepted meaning of "client" and "server".
"One to send commands to the client and one to transmit the results to the server. Does this solution sound more like a standard client-server communication for you?"
No.
If a client sent messages to a server and received responses from the server, it would meet more usual definitions.
Sometimes, this sort of thing is described as having "Agents" which are -- each -- a kind of server and a "Controller" which is a single client of all these servers.
The controller dispatches work to the agents. The agents are servers -- they listen on a port, accept work from the controller, and do work. Each Agent must do two concurrent things (usually via the select API):
Monitor a well-known socket on which it will receive work from the one-and-only client.
Do the work (in the background).
This is what Client-Server usually means.
If each Agent is a Server, you'll find lots of libraries will support this. This is the way everyone does it.
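A rough sketch of one Agent doing its two concurrent jobs, using the standard selectors module (a wrapper around the select API) and a thread pool. The run_agent function, the newline-delimited job format and the demo controller are assumptions made for illustration; socket.socketpair() stands in for a real listening socket to keep the example short.

```python
import selectors
import socket
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def run_agent(conn, work):
    """Read newline-terminated jobs from conn, run work() on each in the
    background, and write every result back as one line."""
    sel = selectors.DefaultSelector()
    sel.register(conn, selectors.EVENT_READ)
    pending = []          # futures for jobs not yet reported
    buffer = b""
    connected = True
    with ThreadPoolExecutor(max_workers=4) as pool:
        while connected or pending:
            if connected:
                # 1. Monitor the socket for new work from the controller.
                readable = sel.select(timeout=0.05)
            else:
                # Nothing left to read: just wait for running jobs.
                wait(pending, timeout=0.05)
                readable = []
            if readable:
                chunk = conn.recv(4096)
                if not chunk:
                    connected = False      # controller finished sending
                buffer += chunk
                *lines, buffer = buffer.split(b"\n")
                for line in lines:
                    # 2. Do the work in the background.
                    pending.append(pool.submit(work, line.decode()))
            for future in [f for f in pending if f.done()]:
                pending.remove(future)
                conn.sendall(future.result().encode() + b"\n")
    sel.close()


def demo():
    controller, agent = socket.socketpair()
    controller.settimeout(5)
    thread = threading.Thread(target=run_agent, args=(agent, str.upper))
    thread.start()
    controller.sendall(b"alpha\nbeta\n")
    controller.shutdown(socket.SHUT_WR)    # no more work to send
    replies = b""
    while replies.count(b"\n") < 2:
        chunk = controller.recv(4096)
        if not chunk:
            break
        replies += chunk
    thread.join()
    controller.close()
    agent.close()
    return sorted(replies.split())
```

Results come back in completion order, not submission order, which is why the demo sorts them before returning.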
