ZeroMQ: Waiting for a message without consuming a thread - python

I have some simple python code which waits for messages for a topic.
However...when running this, the python process will hog CPU. I know with other languages, such as with sockets, there is a way to wait for messages without eating the entire processing power. Essentially the thread just remains halted waiting for a response. Is that possible with ZeroMQ?
import zmq
import sys

port = "5556"
if len(sys.argv) > 1:
    port = sys.argv[1]
    int(port)

# Socket to talk to server
context = zmq.Context()
socket = context.socket(zmq.SUB)
print "Collecting updates from weather server..."
socket.connect("tcp://localhost:%s" % port)

# Subscribe to zipcode, default is NYC, 10001
topicfilter = "10001"
socket.setsockopt(zmq.SUBSCRIBE, topicfilter)

# Process 5 updates
total_value = 0
for update_nbr in range(5):
    string = socket.recv()
    topic, messagedata = string.split()
    total_value += int(messagedata)
    print('{} {}'.format(topic, messagedata))

Yes, this is possible.
As posted, the code above is not a reproducible example (MCVE/MWE) of the claimed problem.
First: the code already blocks. A plain .recv() call blocks until a new, matching message arrives (if ever).
That does not overload the CPU. The SUB side simply sits waiting until the first such message (if any) arrives.
If you want a better scheme, do not call .recv() this way; instead, detect that a message is ready with .poll() and then read it with .recv(zmq.NOBLOCK).
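A minimal sketch of that poll-then-receive approach, assuming the same port and topic filter as the question (poll timeouts are in milliseconds):

import zmq

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:5556")
socket.setsockopt(zmq.SUBSCRIBE, b"10001")  # same topic filter as above

poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)

while True:
    events = dict(poller.poll(1000))        # sleeps inside poll(); no busy-waiting
    if socket in events:
        message = socket.recv(zmq.NOBLOCK)  # poll() said it's ready, so this won't block
        topic, messagedata = message.split()
        print(topic, messagedata)
    else:
        pass  # timed out with no message; do other work here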

Related

How to publish and subscribe the latest message correctly using pyzmq?

I have a process A that publishes a message constantly and processes B and C subscribe to the topic and get the latest message published by the publisher in process A.
So, I set zmq.CONFLATE to both publisher and subscriber. However, I found that one subscriber was not able to receive messages.
import time
import zmq
from multiprocessing import Process

def publisher(sleep_time=1.0, port="5556"):
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    socket.setsockopt(zmq.CONFLATE, 1)
    socket.bind("tcp://*:%s" % port)
    print("Running publisher on port: ", port)
    while True:
        localtime = time.asctime(time.localtime(time.time()))
        string = "Message published time: {}".format(localtime)
        socket.send_string("{}".format(string))
        time.sleep(sleep_time)

def subscriber(name="sub", sleep_time=1, ports="5556"):
    print("Subscriber Name: {}, Sleep Time: {}, Port: {}".format(name, sleep_time, ports))
    context = zmq.Context()
    print("Connecting to publisher with ports %s" % ports)
    socket = context.socket(zmq.SUB)
    socket.setsockopt(zmq.CONFLATE, 1)
    socket.setsockopt_string(zmq.SUBSCRIBE, "")
    socket.connect("tcp://localhost:%s" % ports)
    while True:
        message = socket.recv()
        localtime = time.asctime(time.localtime(time.time()))
        print("\nSubscriber [{}]\n[RECV]: {} at [TIME]: {}".format(name, message, localtime))
        time.sleep(sleep_time)

if __name__ == "__main__":
    Process(target=publisher).start()
    Process(target=subscriber, args=("SUB1", 1.2)).start()
    Process(target=subscriber, args=("SUB2", 1.1)).start()
I tried to unset the socket.setsockopt(zmq.CONFLATE, 1) in the publisher, and that seemed to solve the problem. Both subscribers in processes B and C could receive messages and the messages seemed to be the latest ones.
I'm trying to find out why setting the publisher with CONFLATE caused the problem I had. I could not find information about it. Does anyone know what causes this behavior?
Also, I want to know: in the situation of one publisher to multiple subscribers, what is the correct code setup, so that every subscriber always gets the latest message?
It's most likely a timing issue: the ZMQ_CONFLATE socket option limits both the inbound and the outbound queue to one message.
The way PUB/SUB works, the subscriber sends a subscription message to the publisher when you set the ZMQ_SUBSCRIBE option. If you start both subscribers at the same time, it is possible that one of the subscription messages arriving in the publisher's (conflated) queue is discarded.
Try adding a sleep between starting each subscriber, as in the sketch below.
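For example, a minimal variant of the __main__ block above with staggered starts (the half-second delay is an arbitrary choice; publisher and subscriber are the functions from the question):

import time
from multiprocessing import Process

if __name__ == "__main__":
    Process(target=publisher).start()
    Process(target=subscriber, args=("SUB1", 1.2)).start()
    time.sleep(0.5)  # give SUB1's subscription message time to reach the publisher
    Process(target=subscriber, args=("SUB2", 1.1)).start()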
From the zeromq docs
If set, a socket shall keep only one message in its inbound/outbound
queue, this message being the last message received/the last message
to be sent. Ignores ZMQ_RCVHWM and ZMQ_SNDHWM options. Does not
support multi-part messages, in particular, only one part of it is
kept in the socket internal queue.
I am not saying this is the solution to your problem, but if that is the case, we may need to post a change to libzmq to make the conflate option more granular, so you can choose whether conflate should be applied to the inbound or the outbound queue.
There is a way to get "last message only" behavior on a ZMQ SUB socket, using the CONFLATE option.
You need it on the subscriber side.
Here is an example:
import zmq

port = "5556"
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.setsockopt(zmq.SUBSCRIBE, '')
socket.setsockopt(zmq.CONFLATE, 1)  # last msg only
socket.connect("tcp://localhost:%s" % port)  # must be placed after the options above
while True:
    data = socket.recv()
    print data
In other words, this removes any buffered queue on the subscriber side.
In addition:
With the zmq.SNDBUF and zmq.RCVBUF options you can also put a limit on the underlying ZMQ buffer sizes, as sketched below.
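A minimal sketch of those options on the subscriber side (the 2048-byte sizes are arbitrary; like CONFLATE, set them before connect so they apply to the connection):

import zmq

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.setsockopt(zmq.SUBSCRIBE, b"")
socket.setsockopt(zmq.RCVBUF, 2048)  # underlying OS receive buffer, in bytes
socket.setsockopt(zmq.SNDBUF, 2048)  # underlying OS transmit buffer, in bytes
socket.connect("tcp://localhost:5556")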

UDP Client sends ping once a second, and also prints anything sent to it?

Good afternoon everyone reading this, I am new to programming with sockets, as well as new to asynchronous coding (I feel async may be part of the solution to my problem), so forgive me for any silly mistakes I make.
To start, I have a UDP Echo server that acts as a game server. Anytime it gets a ping sent to it, it adds the source ip and port to a list of "connected clients", and sends that exact ping out to everyone on the list, excluding the sender. This works fairly well, because it reacts upon receiving a message, so it can always just listen. The problem with the client however, is that I need to be constantly sending pings, while also listening.
This is currently what my client looks like:
import socket
from time import sleep
from contextlib import contextmanager

UDP_IP_ADDRESS = "127.0.0.1"
UDP_PORT_NO = 14004
Message = b"Hello World, From Client B"

@contextmanager
def socket_ctx():
    """Context manager for the socket. Makes sure the socket will close regardless of why it exited."""
    my_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Assign IP address and a RANDOM available port number to socket
    my_socket.bind(('127.0.0.1', 0))
    try:
        # Let the rest of the app use the socket and wait for it to finish
        yield my_socket
    finally:
        my_socket.close()

def send_data(client_sock):
    client_sock.sendto(Message, (UDP_IP_ADDRESS, UDP_PORT_NO))

def listen(client_sock):
    print(client_sock.recvfrom(100))

with socket_ctx() as sock:
    while True:
        send_data(sock)
        listen(sock)
        sleep(2)
Currently, it sends a ping once, then just idles as it presumably is listening. If it does happen to get a ping back (say, another client sends a ping to the server, and the server forwards it to this client), it hears it, prints it, and starts the loop again. The issue is that without another client sending something to jolt this one out of listen(), it doesn't send its pings.
I think async might be my solution, but I would have no clue how to go about that. Does anyone have a solution for this problem?
Here's how I would implement a client with "receive and handle incoming UDP packets, plus do some packet-sending once per second" behavior. Note that this uses the select() function to multiplex the two tasks, rather than asynchronous I/O; hopefully that is okay.
import socket
import select
import time

UDP_IP_ADDRESS = "127.0.0.1"
UDP_PORT_NO = 14004
Message = b"Hello World, From Client B"

udp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_socket.bind(('127.0.0.1', 0))
print "UDP socket is listening for incoming packets on port", udp_socket.getsockname()[1]

# When we want to send the next periodic-ping-message out
nextPingTime = time.time()

while True:
    secondsUntilNextPing = nextPingTime - time.time()
    if (secondsUntilNextPing < 0):
        secondsUntilNextPing = 0

    # select() won't return until udp_socket has some data
    # ready-for-read, OR until secondsUntilNextPing seconds
    # have passed, whichever comes first
    inReady, outReady, exReady = select.select([udp_socket], [], [], secondsUntilNextPing)
    if (udp_socket in inReady):
        # There's an incoming UDP packet ready to receive!
        print(udp_socket.recvfrom(100))

    now = time.time()
    if (now >= nextPingTime):
        # Time to send out the next ping!
        print "Sending out scheduled ping at time ", now
        udp_socket.sendto(Message, (UDP_IP_ADDRESS, UDP_PORT_NO))
        nextPingTime = now + 1.0  # we'll do it again in another second
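Since the question mentioned async: the same send-and-listen behavior can also be sketched with asyncio's datagram support, as an alternative to the select() loop above (EchoClientProtocol is a made-up name; this is a sketch, not a drop-in replacement):

import asyncio

UDP_IP_ADDRESS = "127.0.0.1"
UDP_PORT_NO = 14004
Message = b"Hello World, From Client B"

class EchoClientProtocol(asyncio.DatagramProtocol):
    def datagram_received(self, data, addr):
        # Called whenever a packet arrives; no explicit listen loop needed.
        print("Received", data, "from", addr)

async def main():
    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_datagram_endpoint(
        EchoClientProtocol,
        local_addr=("127.0.0.1", 0),  # random available port, as in the question
    )
    try:
        while True:
            # Send a ping, then let the event loop deliver any replies
            # to datagram_received() while we sleep.
            transport.sendto(Message, (UDP_IP_ADDRESS, UDP_PORT_NO))
            await asyncio.sleep(1.0)
    finally:
        transport.close()

asyncio.run(main())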

When/why to use s.shutdown(socket.SHUT_WR)?

I have just started learning python network programming. I was reading Foundations of Python Network Programming and could not understand the use of s.shutdown(socket.SHUT_WR) where s is a socket object.
Here is the code (where sys.argv[2] is the number of bytes the user wants to send, rounded up to a multiple of 16) in which it is used:
import socket, sys

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
HOST = '127.0.0.1'
PORT = 1060

if sys.argv[1:] == ['server']:
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((HOST, PORT))
    s.listen(1)
    while True:
        print 'Listening at', s.getsockname()
        sc, sockname = s.accept()
        print 'Processing up to 1024 bytes at a time from', sockname
        n = 0
        while True:
            message = sc.recv(1024)
            if not message:
                break
            sc.sendall(message.upper())  # send it back uppercase
            n += len(message)
            print '\r%d bytes processed so far' % (n,),
            sys.stdout.flush()
        print
        sc.close()
        print 'Completed processing'

elif len(sys.argv) == 3 and sys.argv[1] == 'client' and sys.argv[2].isdigit():
    bytes = (int(sys.argv[2]) + 15) // 16 * 16  # round up to a multiple of 16
    message = 'capitalize this!'  # 16-byte message to repeat over and over
    print 'Sending', bytes, 'bytes of data, in chunks of 16 bytes'
    s.connect((HOST, PORT))
    sent = 0
    while sent < bytes:
        s.sendall(message)
        sent += len(message)
        print '\r%d bytes sent' % (sent,),
        sys.stdout.flush()
    print
    s.shutdown(socket.SHUT_WR)

    print 'Receiving all the data the server sends back'
    received = 0
    while True:
        data = s.recv(42)
        if not received:
            print 'The first data received says', repr(data)
        received += len(data)
        if not data:
            break
        print '\r%d bytes received' % (received,),
    s.close()

else:
    print >>sys.stderr, 'usage: tcp_deadlock.py server | client <bytes>'
And this is the explanation that the author provides which I am finding hard to understand:
Second, you will see that the client makes a shutdown() call on the socket after it finishes sending its transmission. This solves an important problem: if the server is going to read forever until it sees end-of-file, then how will the client avoid having to do a full close() on the socket and thus forbid itself from doing the many recv() calls that it still needs to make to receive the server’s response? The solution is to “half-close” the socket—that is, to permanently shut down communication in one direction but without destroying the socket itself—so that the server can no longer read any data, but can still send any remaining reply back in the other direction, which will still be open.
My understanding of what it will do is that it will prevent the client application from sending further data, and thus will also prevent the server side from further attempting to read any data.
What I can't understand is why it is used in this program, and in what situations I should consider using it in my own programs.
My understanding of what it will do is that it will prevent the client
application from sending further data, and thus will also prevent
the server side from further attempting to read any data.
Your understanding is correct.
What I can't understand is why it is used in this program …
As your own statement suggests, without the client's s.shutdown(socket.SHUT_WR) the server would not quit waiting for data, but instead stick in its sc.recv(1024) forever, because there would be no connection termination request sent to the server.
Since the server then would never get to its sc.close(), the client on his part also would not quit waiting for data, but instead stick in its s.recv(42) forever, because there would be no connection termination request sent from the server.
Reading this answer to "close vs shutdown socket?" might also be enlightening.
The explanation is half-baked; it applies only to this specific code, and overall I would argue strongly that relying on it is bad practice.
To understand why, look at the server code. The server blocks in message = sc.recv(1024) until data arrives (recv() returns whatever is available, up to 1024 bytes at a time), sends it back upper-cased, and then loops to read again. It keeps reading until recv() returns an empty string, which only happens once the peer signals end-of-file.
So you need to tell the server that no more data is coming its way, making it return from message = sc.recv(1024) with an empty result, and you do this by shutting down the socket in one direction.
You do not want to fully close the socket, because then the server would not be able to send you the reply.
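To make the pattern stand alone, here is a stripped-down sketch of a client using the half-close (it assumes an echo-style server on 127.0.0.1:1060 that reads until end-of-file, like the one above; written in Python 3):

import socket

# Connect to a server that reads until EOF, then finishes its reply.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1', 1060))

s.sendall(b'capitalize this!')
s.shutdown(socket.SHUT_WR)  # half-close: "no more data from me"; recv() still works

# Read the full reply until the server closes its end (recv returns b'').
reply = b''
while True:
    chunk = s.recv(1024)
    if not chunk:
        break
    reply += chunk
print(reply)
s.close()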

More than one socket stops all sockets from working

I have a few computers on a network and I'm trying to coordinate work between them by broadcasting instructions and receiving replies from individual workers. When I use zmq to assign a single socket to each program it works fine, but when I try to assign another, none of them work. For example, the master program runs on one machine. With the code as such it works fine as a publisher, but when I uncomment the commented lines neither socket works. I've seen example code extremely similar to this so I believe it should work, but I must be missing something.
Here's some example code, first with the master program and then the worker program. The idea is to control the worker programs from the master based on input from the workers to the master.
import zmq
import time
import sys

def master():
    word = sys.argv[1]
    numWord = sys.argv[2]
    port1 = int(sys.argv[3])
    port2 = int(sys.argv[4])
    context = zmq.Context()
    publisher = context.socket(zmq.PUB)
    publisher.bind("tcp://*:%s" % port1)
    #receiver = context.socket(zmq.REP)
    #receiver.bind("tcp://*:%s" % port2)
    for i in range(int(numWord)):
        print str(i)+": "+word
        print "Publishing 1"
        publisher.send("READY_FOR_NEXT_WORD")
        print "Publishing 2"
        publisher.send(word)
        #print "Published. Waiting for REQ"
        #word = receiver.recv()
        #receiver.send("Master IRO")
        time.sleep(1)
        print "Received: "+word
    publisher.send("EXIT_NOW")

master()
Ditto for the workers:
import zmq
import random
import zipfile
import sys

def worker(workerID, fileFirst, fileLast):
    print "Worker "+ str(workerID) + " started"
    port1 = int(sys.argv[4])
    port2 = int(sys.argv[5])
    # Socket to talk to server
    context = zmq.Context()
    #pusher = context.socket(zmq.REQ)
    #pusher.connect("tcp://10.122.102.45:%s" % port2)
    receiver = context.socket(zmq.SUB)
    receiver.connect("tcp://10.122.102.45:%s" % port1)
    receiver.setsockopt(zmq.SUBSCRIBE, '')
    found = False
    done = False
    while True:
        print "Ready to receive"
        word = receiver.recv()
        print "Received order: "+word
        #pusher.send("Worker #"+str(workerID)+" IRO "+ word)
        #pusher.recv()
        #print "Confirmed receipt"

worker(sys.argv[1], sys.argv[2], sys.argv[3])
Well, PUB-SUB patterns are not meant to be reliable, especially during initialization (while the connection is being established).
Your "master" publishes the first two messages in that loop and then waits for a request from the "worker". Now, if those messages get lost (something that may happen with the first messages sent on a PUB-SUB pair), the "worker" will be stuck waiting for a publication from the "master". So, basically, they are both stuck waiting for an incoming message.
Apart from that, notice that you are publishing two messages from the "master" node while only processing one in the "worker". Your "worker" won't be able to keep up with your "master", and therefore messages will be dropped or you'll get a crash.
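If you want to keep the PUB/SUB pair, one common (if crude) mitigation for this slow-joiner problem is to pause after binding before the first publish; a minimal sketch (the one-second delay is an arbitrary choice, and the zguide's "synchronized publisher" pattern is the robust alternative):

import time
import zmq

context = zmq.Context()
publisher = context.socket(zmq.PUB)
publisher.bind("tcp://*:5556")

# Crude slow-joiner workaround: give subscribers time to connect and
# for their subscription messages to propagate before publishing.
time.sleep(1.0)

publisher.send_string("READY_FOR_NEXT_WORD")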

zeromq: how to prevent infinite wait?

I just got started with ZMQ. I am designing an app whose workflow is:
one of many clients (which have random PULL addresses) PUSHes a request to a server at 5555
the server is forever waiting for client PUSHes. When one comes, a worker process is spawned for that particular request. Yes, worker processes can exist concurrently.
When that process completes its task, it PUSHes the result to the client.
I assume that the PUSH/PULL architecture is suited for this. Please correct me on this.
But how do I handle these scenarios?
the client_receiver.recv() will wait forever if the server fails to respond.
the client may send a request but fail immediately afterwards; the worker process will then remain stuck at server_sender.send() forever.
So how do I setup something like a timeout in the PUSH/PULL model?
EDIT: Thanks to user938949's suggestions, I got a working answer, and I am sharing it for posterity.
If you are using zeromq >= 3.0, then you can set the RCVTIMEO socket option:
client_receiver.RCVTIMEO = 1000 # in milliseconds
But in general, you can use pollers:
poller = zmq.Poller()
poller.register(client_receiver, zmq.POLLIN) # POLLIN for recv, POLLOUT for send
And poller.poll() takes a timeout:
evts = poller.poll(1000) # wait *up to* one second for a message to arrive.
evts will be an empty list if there is nothing to receive.
You can poll with zmq.POLLOUT, to check if a send will succeed.
Or, to handle the case of a peer that might have failed, a:
worker.send(msg, zmq.NOBLOCK)
might suffice, which will always return immediately - raising a ZMQError(zmq.EAGAIN) if the send could not complete.
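Putting those pieces together, a minimal receive-with-timeout sketch (the client_receiver name and port 5555 follow the question; the one-second timeout is arbitrary):

import zmq

context = zmq.Context()
client_receiver = context.socket(zmq.PULL)
client_receiver.connect("tcp://127.0.0.1:5555")

# Option 1: socket-wide receive timeout (zeromq >= 3.0);
# recv() raises zmq.Again if nothing arrives within 1000 ms.
client_receiver.RCVTIMEO = 1000

# Option 2: an explicit poll before each receive.
poller = zmq.Poller()
poller.register(client_receiver, zmq.POLLIN)
events = dict(poller.poll(1000))  # wait up to one second
if client_receiver in events:
    msg = client_receiver.recv(zmq.NOBLOCK)  # poll() said it's ready; won't block
else:
    print("timed out waiting for a message")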
This is a quick hack I put together after reading user938949's answer and http://taotetek.wordpress.com/2011/02/02/python-multiprocessing-with-zeromq/ . If you can do better, please post your answer and I will recommend it.
For those wanting a lasting solution on reliability, see http://zguide.zeromq.org/page:all#toc64
Version 3.0 of zeromq (beta at the time of writing) supports timeouts via ZMQ_RCVTIMEO and ZMQ_SNDTIMEO: http://api.zeromq.org/3-0:zmq-setsockopt
Server
The zmq.NOBLOCK ensures that when a client does not exist, the send() does not block.
import time
import zmq

context = zmq.Context()
ventilator_send = context.socket(zmq.PUSH)
ventilator_send.bind("tcp://127.0.0.1:5557")
i = 0
while True:
    i = i + 1
    time.sleep(0.5)
    print ">>sending message ", i
    try:
        ventilator_send.send(repr(i), zmq.NOBLOCK)
        print " succeed"
    except zmq.ZMQError:
        # no peer connected yet, or the queue is full
        print " failed"
Client
The poller object can listen in on many receiving sockets (see "Python Multiprocessing with ZeroMQ", linked above); here it is hooked up only to work_receiver. In the infinite loop, the client polls with an interval of 1000 ms. The socks object comes back empty if no message has been received in that time.
import time
import zmq

context = zmq.Context()
work_receiver = context.socket(zmq.PULL)
work_receiver.connect("tcp://127.0.0.1:5557")
poller = zmq.Poller()
poller.register(work_receiver, zmq.POLLIN)

# Loop and accept messages from both channels, acting accordingly
while True:
    socks = dict(poller.poll(1000))
    if socks:
        if socks.get(work_receiver) == zmq.POLLIN:
            print "got message ", work_receiver.recv(zmq.NOBLOCK)
    else:
        print "error: message timeout"
The send won't block if you use ZMQ_NOBLOCK, but if you try closing the socket and context, that step will block the program from exiting.
The reason is that the socket waits for a peer so that the outgoing messages are guaranteed to get queued. To close the socket immediately and flush the outgoing messages from the buffer, set ZMQ_LINGER to 0, as sketched below.
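A minimal sketch of that shutdown sequence, reusing the PUSH server from above:

import zmq

context = zmq.Context()
sender = context.socket(zmq.PUSH)
sender.setsockopt(zmq.LINGER, 0)  # on close, drop unsent messages instead of waiting
sender.bind("tcp://127.0.0.1:5557")

try:
    sender.send(b"last message", zmq.NOBLOCK)
except zmq.ZMQError:
    pass  # no peer connected; nothing was queued

sender.close()   # returns immediately thanks to LINGER=0
context.term()   # no longer blocks waiting for pending outgoing messages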
If you're only waiting for one socket, rather than create a Poller, you can do this:
if work_receiver.poll(1000, zmq.POLLIN):
    print "got message ", work_receiver.recv(zmq.NOBLOCK)
else:
    print "error: message timeout"
You can use this if your timeout changes depending on the situation, instead of setting work_receiver.RCVTIMEO.
