ZMQ in Python: New socket object for each incoming connection

Sockets in ZMQ are simply bound to an interface and are then able to receive messages right away, like this:
socket.bind("tcp://*:5555")
message = socket.recv()
Since multiple connections can send data to that socket simultaneously, how can I distinguish the different senders?
On the other hand, with regular sockets, incoming connections are first accepted, which spawns a new socket, like this:
serversocket.bind((socket.gethostname(), 5555))
serversocket.listen()
(clientsocket, address) = serversocket.accept()
Here, the different senders can be easily distinguished since each is received through a different socket.
What is the best way to benefit from the convenient message-based, queue-buffered communication of ZMQ while still creating an arbitrary number of distinguishable one-to-one connections as soon as they are requested?

How to distinguish the different clients depends on which socket type you're using as your 'server'. The explanations below will hopefully answer the second question too.
REP - Will reply to the client that sent the request. A recv call on a REP socket must be followed by a send, so you can't service the next request until you have processed the first. However, multiple requests from different clients will be queued.
ROUTER - Will prepend a frame to the message you recv that contains the client id of the sender. When sending a message, the first frame will be removed and used to identify which connected client to reply to. You should store all frames up to and including the empty delimiter frame and prepend them to your reply message when you send the reply. Unlike REP, there is no need to send any message before another call to recv. The client id will be generated by ZeroMQ if not specified, but if you want 'persistence' you can set the id via setsockopt with the zmq.IDENTITY flag.
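The ROUTER behaviour described above can be sketched roughly as follows. This is only an illustration under some assumptions: it uses pyzmq with the in-process transport so that it runs in a single script, and the endpoint name "inproc://router-demo", the client names and the function name demo are made up for the example.

```python
import zmq

def demo():
    ctx = zmq.Context()
    router = ctx.socket(zmq.ROUTER)
    router.bind("inproc://router-demo")  # hypothetical endpoint name

    # Two REQ clients, each with an explicit identity (set before connect).
    clients = {}
    for name in (b"alice", b"bob"):
        req = ctx.socket(zmq.REQ)
        req.setsockopt(zmq.IDENTITY, name)
        req.connect("inproc://router-demo")
        req.send(b"hello from " + name)
        clients[name] = req

    seen = []
    for _ in clients:
        # ROUTER delivers [identity, empty delimiter, payload] for a REQ peer.
        identity, empty, payload = router.recv_multipart()
        seen.append(identity)
        # Prepending the same envelope routes the reply back to that client.
        router.send_multipart([identity, empty, b"reply to " + identity])

    replies = {name: req.recv() for name, req in clients.items()}

    for sock in list(clients.values()) + [router]:
        sock.close(linger=0)
    ctx.term()
    return sorted(seen), replies
```

Calling demo() should show that each client receives only the reply addressed to its own identity, even though all traffic goes through the single ROUTER socket.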

Related

Alternative to timeout socket

Is there an alternative to a timeout for ending communication with a server once all the messages have been received?
Let me explain, I want to communicate through an SSL socket with a server (XMPP) unknown to me. The communication is done through XML messages.
For each request I can receive one or more response messages of different types. Or a single message that is sent in multiple chunks. Therefore I create a loop to receive all messages related to a request from the server. So I do not know the number and the size of messages for each request a priori. Once the messages are finished, the client waits for no more responses from the Server to stop listening (i.e., it waits for the timeout). However, this results in a long wait (e.g., for each request I have to wait 2s which at the end of the communication could be as long as 1 minute for all requests).
Is there a faster way to stop listening when we have received all the messages for each request?
I attach here my code:
import socket
import ssl
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# note: ssl.wrap_socket() was removed in Python 3.12; newer versions use ssl.SSLContext.wrap_socket()
sock = ssl.wrap_socket(s, ssl_version=ssl.PROTOCOL_TLSv1_2)
sock.connect((hostname, port))
sock.settimeout(2)
# example for a generic unique message request
sock.sendall(message.encode())
data = ""
while True:
    try:
        response = sock.recv(1024)
        if not response:
            break
        data += response.decode()
    except socket.timeout:
        break

You know how much data to expect because of the protocol.
The XMPP protocol says (section 4.2) what the start of a stream looks like; (section 4.3) tells you about stream negotiation process which the <stream:features> tag is used for. Don't understand them yet? Section 4.1 tells you about what a stream is and how it works.
In one comment, you complained about receiving a response in multiple chunks. This is absolutely normal in TCP because TCP does not care about chunks. TCP can split up a 1kB message into 1000 1-byte chunks if it feels like it (usually it doesn't feel like it). So you must keep receiving chunks until you see the end of the message.
Actually, XMPP is not even a request-response protocol. The server can send you messages at any time, which are not responses to something you sent, for example it could tell you that someone said "hello" to you. So you might send an IQ request, but the next message you receive might not be an IQ response, it might be a message saying that your friend is now online. You know which message is the response because: the message type is IQ; the 'type' is 'result' or 'error'; and the 'id' is the same as in the request. Section 8.2.3.
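To make the idea of "keep receiving chunks until you see the end of the message" concrete, here is a rough sketch, not a full XMPP client. It assumes the standard library's xml.etree.ElementTree.XMLPullParser is acceptable for incremental parsing; the function names iter_stanzas and wait_for_iq are made up for the example. It feeds arbitrary chunks to the parser, yields each complete top-level stanza, and stops at the IQ whose id matches the request.

```python
import xml.etree.ElementTree as ET

def iter_stanzas(chunks):
    """Yield each complete top-level stanza from an iterable of byte chunks."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # a direct child of the stream root just closed
                    yield elem

def wait_for_iq(chunks, request_id):
    """Return the IQ result/error stanza answering request_id, or None."""
    for stanza in iter_stanzas(chunks):
        local_name = stanza.tag.rsplit("}", 1)[-1]  # strip the XML namespace
        if (local_name == "iq"
                and stanza.get("id") == request_id
                and stanza.get("type") in ("result", "error")):
            return stanza
    return None
```

With a real connection, the chunks could come from something like iter(lambda: sock.recv(4096), b""), so the loop ends as soon as the matching stanza is complete instead of waiting for a timeout.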

Receiving multicast data with socket recvfrom in Python

I have a multicast server sending data that must be captured by a Python client. The problem is that recvfrom does not receive any data, or at least it receives the first packet and sort of caches it. If I use recvfrom in a loop, then my data is received correctly.
My question is why I should use recvfrom in a loop to have the expected behavior?
from socket import *
s=socket(AF_INET, SOCK_DGRAM)
s.bind(('172.30.102.141',12345))
m=s.recvfrom(1024)
print m[0]
# sleep for x seconds here
m=s.recvfrom(1024)
print m[0]
# print the exact same thing as previously...

One thing is for sure: multicast basically means sending UDP packets, and you have to keep listening for new packets. The same is true even for TCP-based communication.
When you use a low-level interface for network communication, as socket is, it's up to both sides to define the application-level protocol.
That means you define how the receiving party concludes that a message is complete. This matters because a message can be split into multiple parts/packets on its way through the network, so the receiving side has to assemble them properly and then check whether the message is whole. After that, you push it up through the message-processing pipeline, or whatever you do on the receiving side.
When using UDP, the receiving side doesn't know whether another packet is on its way, so a single recvfrom(1024) call just reads up to 1024 bytes from one datagram and returns. It doesn't know, and should not care, whether more data is on its way. It's up to you to take care of that.
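As a small illustration of the "keep listening" point, here is a sketch of a receive loop. It assumes plain UDP on the loopback interface rather than real multicast, so that it can run anywhere; the names receive_datagrams and demo are invented for the example.

```python
import socket

def receive_datagrams(sock, count, timeout=2.0):
    """Collect up to `count` datagrams, one per recvfrom call."""
    sock.settimeout(timeout)
    messages = []
    while len(messages) < count:
        try:
            data, sender = sock.recvfrom(1024)  # returns one datagram per call
        except socket.timeout:
            break  # nothing arrived in time, so stop waiting
        messages.append(data)
    return messages

def demo():
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i in range(3):
        sender.sendto(b"packet %d" % i, receiver.getsockname())
    received = receive_datagrams(receiver, 3)
    sender.close()
    receiver.close()
    return received
```

Each pass through the loop picks up the next datagram waiting in the socket's receive buffer; how many to collect, and when a message counts as complete, is something the application protocol has to define.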

Prevent python2's ThreadingTCPServer from calling handle on the request handler, on multiple requests

I'm trying to write a Network Application using Python's socket and SocketServer Modules.
In the Network Model, there are only clients (nodes).
Each node is connected to some other nodes(Neighbours) and can interchange "messages" with them.
There are two types of messages request_data and response_data, the response_data string is a message generated based on a request_data message (messages are basically two line strings).
In order for a Node to generate a response_data message, it must send request_data messages to the nodes it's connected to, and generate the response_data based on the received data.
I'm implementing these Connections using TCP i.e: when two nodes are connected (using socket.connect() and socket.accept()) they will stay connected and will pass messages from the same connection.
Now here's the problem.
I've implemented the nodes using SocketServer.ThreadingTCPServer and a custom request handler, so when a node gets a request_data it sends request_data messages to its neighbours. But when it gets the responses, the ThreadingTCPServer might capture them as a new request (I assume that's how select.select works when there's data to be read), and I might not be able to get the response message from where I sent the request message, because a new request handler has been instantiated by the ThreadingTCPServer instead.
Basically I'm doing this in my request handler and I'm afraid it might not work:
# conn : a connected socket object created from socket.accept
conn.sendall(requestMessage)
# I think this will not work because it might be considered a new request by the ThreadingTCPServer
response = conn.recv(1024)
I haven't actually tried this, and don't know if it will work or not. However, even if it works for some limited tests, I can't be sure it will always work, since the problem (if it does in fact exist) stems from a race condition.
So does this work? If not, what are some other approaches I can take without reinventing the wheel?

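This approach does indeed work. Each accepted TCP connection gets its own socket, identified by the address and port of both endpoints, so data arriving on an established connection is delivered to that socket and has nothing to do with the listening socket the server uses to accept new connections.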
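A quick way to convince yourself is a sketch like the one below. It assumes Python 3, where the module is called socketserver rather than SocketServer; NodeHandler and run_demo are names invented for the example. The handler sends on its connected socket and then reads the reply from the same socket, and no second handler gets created for that reply.

```python
import socket
import socketserver
import threading

class NodeHandler(socketserver.BaseRequestHandler):
    def handle(self):
        conn = self.request  # the connected socket returned by accept()
        conn.sendall(b"request_data\n")
        response = conn.recv(1024)  # read on the same connection
        self.server.responses.append(response)

def run_demo():
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), NodeHandler)
    server.responses = []  # attribute added for this demo only
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    with socket.create_connection(server.server_address) as peer:
        request = peer.recv(1024)
        peer.sendall(b"response_data\n")
        peer.recv(1024)  # returns b"" once the handler has finished
    server.shutdown()
    server.server_close()
    return request, server.responses
```

Only one handler instance runs per accepted connection, so server.responses ends up with exactly one entry for the one peer that connected.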

Python server client communication

I have a client and a server communicating as follows:
# server
running = 1
while running:
    data = self.client.recv(self.size)
    self.client.send(str(self.vel))
# client
def runit(self):
    self.s.send('test')
    data = float(self.s.recv(self.size))
    self.master_.after(5, self.runit)
So both are in infinite loops. Although this does transfer data, it is inefficient for my application. I am making a game and I want the server to send data to the client at specific instants, and I also want the client to receive that data at that instant. Something similar to a callback would work. Unfortunately, I wasn't able to find anything suitable for my needs.

First, I don't see anything "inefficient". A while loop is very common in such a case, and as the comment says, the loop will just block in recv until data arrives from the client.
So your question is how to send data from the server to the client? If a connection is established, assuming you're using TCP, then just call the send() method on that socket. Maybe you want to set the SO_KEEPALIVE socket option; see How to change tcp keepalive timer using python script?.
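If you want something callback-like on the client, one possible sketch is a reader thread that blocks in recv and calls a function for every complete message. The assumptions here are mine, not the question's: messages are newline-terminated, socket.socketpair() stands in for a real client/server connection, and the names listen and demo are made up.

```python
import socket
import threading

def listen(sock, on_message):
    """Call on_message(text) for every newline-terminated message received."""
    buffer = b""
    while True:
        chunk = sock.recv(1024)  # blocks until data arrives or the peer closes
        if not chunk:
            break
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            on_message(line.decode())

def demo():
    server_side, client_side = socket.socketpair()
    received = []
    reader = threading.Thread(target=listen, args=(client_side, received.append))
    reader.start()
    # The "server" pushes values whenever it wants; the client never asks.
    for vel in (1.5, 2.0):
        server_side.sendall(f"{vel}\n".encode())
    server_side.close()
    reader.join()
    client_side.close()
    return received
```

In a Tkinter program the callback runs in the reader thread, so it would normally hand the value to the GUI thread (for example through a queue) rather than touch widgets directly.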

ZeroMQ PUB socket buffers all my out going data when it is connecting

I noticed that a ZeroMQ PUB socket buffers all outgoing data while it is connecting, for example:
import zmq
import time
context = zmq.Context()
# create a PUB socket
pub = context.socket (zmq.PUB)
pub.connect("tcp://127.0.0.1:5566")
# push some message before connected
# they should be dropped
for i in range(5):
    pub.send('a message should not be dropped')
time.sleep(1)
# create a SUB socket
sub = context.socket (zmq.SUB)
sub.bind("tcp://127.0.0.1:5566")
sub.setsockopt(zmq.SUBSCRIBE, "")
time.sleep(1)
# this is the only message we should see in SUB
pub.send('hi')
while True:
    print sub.recv()
The SUB socket binds after those messages were sent, so they should be dropped, because PUB should drop messages if no one is connected to it. But instead of dropping the messages, it buffers all of them.
a message should not be dropped
a message should not be dropped
a message should not be dropped
a message should not be dropped
a message should not be dropped
hi
As you can see, those "a message should not be dropped" messages are buffered by the socket; once it gets connected, it flushes them to the SUB socket. If I bind the PUB socket and connect the SUB socket, then it works correctly.
import zmq
import time
context = zmq.Context()
# create a PUB socket
pub = context.socket (zmq.PUB)
pub.bind("tcp://127.0.0.1:5566")
# push some message before connected
# they should be dropped
for i in range(5):
    pub.send('a message should not be dropped')
time.sleep(1)
# create a SUB socket
sub = context.socket (zmq.SUB)
sub.connect("tcp://127.0.0.1:5566")
sub.setsockopt(zmq.SUBSCRIBE, "")
time.sleep(1)
# this is the only message we should see in SUB
pub.send('hi')
while True:
    print repr(sub.recv())
And the only output you see is
'hi'
This kind of strange behavior causes a problem, since it buffers all data on a connecting socket. I have two servers, and server A publishes data to server B:
Server A -- publish --> Server B
It works fine if server B is online. But what if I start server A and do not start server B?
As a result, the connecting PUB socket on server A keeps all that data, and the memory usage gets higher and higher.
Here is the question: is this kind of behavior a bug or a feature? If it is a feature, where can I find a document that mentions this behavior? And how can I stop the connecting PUB socket from buffering all data?
Thanks.

Whether the socket blocks or drops messages depends on the socket type as described in the ZMQ::Socket documentation (emphasis below is mine):
ZMQ::HWM: Retrieve high water mark
The ZMQ::HWM option shall retrieve the high water mark for the
specified socket. The high water mark is a hard limit on the maximum
number of outstanding messages 0MQ shall queue in memory for any
single peer that the specified socket is communicating with.
If this limit has been reached the socket shall enter an exceptional
state and depending on the socket type, 0MQ shall take appropriate
action such as blocking or dropping sent messages. Refer to the
individual socket descriptions in ZMQ::Socket for details on the exact
action taken for each socket type.
The default ZMQ::HWM value of zero means “no limit”.
You can see if it will block or drop by looking through the documentation for the socket type for ZMQ::HWM option action which will either be Block or Drop.
The action for ZMQ::PUB is Drop, so if it is not dropping you should check the HWM (High Water Mark) value and heed the warning that the default ZMQ::HWM value of zero means “no limit”, meaning that the socket will not enter an exceptional state until the system runs out of memory (at which point I don't know how it behaves).

I feel this behavior is part of the semantics of zmq_connect().
That is: when zmq_connect() returns success, the connection is conceptually established, and thus your connecting PUB starts queuing messages instead of dropping them.
The following excerpt from the "ZMQ Guide" hints at this:
In theory with ØMQ sockets, it does not matter which end connects, and
which end binds. However with PUB-SUB sockets, if you bind the SUB
socket and connect the PUB socket, the SUB socket may receive old
messages, i.e. messages sent before the SUB started up. This is an
artifact of the way bind/connect works. It's best to bind the PUB and
connect the SUB, if you can.
The following section in zmq_connect() also has some hints:
Key differences to conventional sockets
Generally speaking, conventional sockets present a synchronous
interface to either connection-oriented reliable byte streams
(SOCK_STREAM), or connection-less unreliable datagrams (SOCK_DGRAM).
In comparison, ØMQ sockets present an abstraction of an asynchronous
message queue, with the exact queueing semantics depending on the
socket type in use. Where conventional sockets transfer streams of
bytes or discrete datagrams, ØMQ sockets transfer discrete messages.
ØMQ sockets being asynchronous means that the timings of the physical
connection setup and tear down, reconnect and effective delivery are
transparent to the user and organized by ØMQ itself. Further, messages
may be queued in the event that a peer is unavailable to receive them.

Try setting the HWM option on the socket.

So bind() and connect() result in two different behaviors. Why don't you just choose which one you prefer (it seems like bind()) and use that?
It is indeed a feature of ZeroMQ in general that it buffers outgoing messages until a connection is made.

You should be able to set a high water mark on the socket using the HWM setting on the PUB socket. It lets you define how many messages are kept.
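In pyzmq that could look roughly like the sketch below. It assumes a pyzmq built against libzmq 3 or later, where the single HWM option is split into zmq.SNDHWM and zmq.RCVHWM; the helper name make_bounded_pub and the limit of 10 are just for illustration. The option is set before connect() so that it applies to the connection being created.

```python
import zmq

def make_bounded_pub(context, endpoint, max_queued=10):
    """Create a connecting PUB socket that queues at most max_queued messages per peer."""
    pub = context.socket(zmq.PUB)
    pub.setsockopt(zmq.SNDHWM, max_queued)  # set before connect()
    pub.setsockopt(zmq.LINGER, 0)  # don't wait for unsent messages on close
    pub.connect(endpoint)
    return pub

context = zmq.Context()
pub = make_bounded_pub(context, "tcp://127.0.0.1:5566")
queue_limit = pub.getsockopt(zmq.SNDHWM)  # read the limit back
pub.close()
context.term()
```

Once the limit is reached, a PUB socket drops further messages for that peer instead of blocking, so memory use on the publishing side stays bounded even if the other end never comes up.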

Here's a hack that might help...
Set your ZMQ::HWM to a fixed number, say 10. Upon connection, call the subscriber socket's recv method in a loop until it discards all the buffered messages, and only THEN start your main receiving loop.
