How to use the inproc transport with pyzmq?

I have set up two small scripts imitating a publish/subscribe procedure with pyzmq. However, I am unable to send messages to my subscriber client using the inproc transport. I am able to use tcp://127.0.0.1:8080 fine, just not inproc.
pub_server.py
import zmq
import random
import sys
import time
context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind("inproc://stream")
while True:
    socket.send_string("Hello")
    time.sleep(1)
sub_client.py
import sys
import zmq
# Socket to talk to server
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.setsockopt_string(zmq.SUBSCRIBE, '')
socket.connect("inproc://stream")
for x in range(5):
    string = socket.recv()
    print(string)
How can I successfully alter my code so that I'm able to use the inproc transport method between my two scripts?
EDIT:
I have updated my code to reflect @larsks' comment. I am still not receiving my published string - what is it that I am doing wrong?
import threading
import zmq
def pub():
    context = zmq.Context()
    sender = context.socket(zmq.PUB)
    sender.connect("inproc://hello")
    lock = threading.RLock()
    with lock:
        sender.send(b"")

def sub():
    context = zmq.Context()
    receiver = context.socket(zmq.SUB)
    receiver.bind("inproc://hello")
    pub()
    # Wait for signal
    string = receiver.recv()
    print(string)
    print("Test successful!")
    receiver.close()

if __name__ == "__main__":
    sub()

As the name implies, inproc sockets can only be used within the same process. If you were to rewrite your client and server such that there were two threads in the same process you could use inproc, but otherwise this transport simply isn't suitable for what you're doing.
The documentation is very clear on this point:
The in-process transport passes messages via memory directly between threads sharing a single ØMQ context.
Update
Taking a look at the updated code, the problem that stands out first is that while the documentation quoted above says "...between threads sharing a single ØMQ context", you are creating two contexts in your code. Typically, you will only call zmq.Context() once in your program.
Next, you are never subscribing your subscriber to any messages, so even in the event that everything else was working correctly you would not actually receive any messages.
Lastly, your code is going to experience the slow joiner problem:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
The pub/sub model isn't meant for single messages, nor is it meant to be a reliable transport.
So, to sum up:
You need to create a shared ZMQ context before you create your sockets.
You probably want your publisher to publish in a loop instead of publishing a single message. Since you're trying to use inproc sockets you're going to need to put your two functions into separate threads.
You need to set a subscription filter in order to receive messages.
There is an example using PAIR sockets in the ZMQ documentation that might provide a useful starting point. PAIR sockets are designed for coordinating threads over inproc sockets, and unlike pub/sub sockets they are bidirectional and are not impacted by the "slow joiner" issue.
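For illustration, here is a minimal sketch of that pattern: two threads coordinating over PAIR sockets on an inproc endpoint, sharing a single context (the endpoint name inproc://coord is made up for this sketch):
import threading
import zmq

context = zmq.Context()  # one shared context for both threads

def worker():
    # PAIR sockets are bidirectional and need no subscription filter
    sock = context.socket(zmq.PAIR)
    sock.connect("inproc://coord")
    msg = sock.recv()          # wait for the signal from the main thread
    sock.send(b"ack: " + msg)  # reply on the same socket

main_sock = context.socket(zmq.PAIR)
main_sock.bind("inproc://coord")  # bind before the worker connects

t = threading.Thread(target=worker)
t.start()

main_sock.send(b"hello")
print(main_sock.recv())  # b'ack: hello'
t.join()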

As mentioned earlier by @larsks, the context object should be the same. Declare the context object globally and use it in both the pub and sub functions instead of creating a new one for each.
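Putting the pieces together, a minimal sketch of the asker's updated example with all three fixes applied (one shared context, a subscription filter, and the publisher in its own thread) might look like this; the short sleep is a crude stand-in for proper synchronization against the slow joiner problem:
import threading
import time
import zmq

context = zmq.Context()  # single shared context for both threads

def pub():
    sender = context.socket(zmq.PUB)
    sender.connect("inproc://hello")
    time.sleep(0.1)  # crude guard against the slow-joiner problem
    sender.send(b"hello")

def sub():
    receiver = context.socket(zmq.SUB)
    receiver.bind("inproc://hello")
    receiver.setsockopt_string(zmq.SUBSCRIBE, "")  # subscribe to everything
    threading.Thread(target=pub).start()
    print(receiver.recv())  # b'hello'
    print("Test successful!")
    receiver.close()

if __name__ == "__main__":
    sub()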

Related

Receiving zmq messages on background thread fails on Windows

I'm trying to set up a hello world style example of asynchronous communication between two peers with zmq.PAIR by receiving messages on a background thread while using console input to send messages:
server.py:
import zmq
import threading
context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.bind('tcp://*:5556')
def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from client: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
client.py:
import zmq
import threading
context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.connect('tcp://127.0.0.1:5556')
def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from server: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
This works completely fine on a Linux machine but socket.send_string blocks in either process when run from the Windows 10 command prompt. What is the reason for this discrepancy?
The socket is set up properly, and flushing all outputs makes no difference. The reading itself also works as expected, as may be verified by navigating to 127.0.0.1:5556 in a browser. Looking at the loopback interface in Wireshark also reveals that the connection is set up properly, yet no messages are sent.
If I comment out recv_thread.start() in the client, however, messages are sent through as may be verified in Wireshark, which suggests that somehow socket.recv_string is blocking the socket from sending even though it isn't doing so on Linux.
I am also able to achieve the desired behavior by using two sets of PUSH/PULL (cf. this answer) but that doesn't quite help explain what's going on in the example at hand.
This is on Python 3.7.1, pyzmq 18.0.0, and libzmq 4.3.1 on both systems.
zmq sockets are not threadsafe, so running send and recv on the same socket in different threads should not be expected to work. Different threading behaviors on different platforms may be responsible for the difference in behavior you are seeing, but this code could also result in segfaults eventually due to the thread-unsafety of zmq sockets. Using a Lock might solve the problem.
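One way to apply the Lock suggestion, sketched here as an untested assumption: serialize every socket operation with a threading.Lock, and poll with a short timeout in the receive thread so the lock is released between polls and the main thread gets a chance to send:
import threading
import zmq

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.bind('tcp://*:5556')
lock = threading.Lock()  # serializes all access to the shared socket

def print_incoming_messages():
    poller = zmq.Poller()
    poller.register(socket, zmq.POLLIN)
    while True:
        with lock:
            # Poll briefly, then release the lock so the main thread can send
            if poller.poll(timeout=100):
                print(f'Message from client: {socket.recv_string()}')

threading.Thread(target=print_incoming_messages, daemon=True).start()

while True:
    msg = input('Message to send: ')
    with lock:
        socket.send_string(msg)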
As a side note, PAIR is a rarely-used socket type, and not often intended for use in production or inter-process communication. Most real-world instances of PAIR are as inproc sockets for inter-thread communication. PAIR can have weird behavior on reconnect, for example. Using PUSH-PULL for one-way or DEALER-DEALER for two-way communication is likely to behave in a more expected fashion.
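For instance, the PUSH-PULL variant gives each thread its own socket, so no locking is needed at all; a sketch of the server side (the ports are arbitrary, and the client would mirror this by connecting a PUSH to 5556 and a PULL to 5557):
import threading
import zmq

context = zmq.Context()

def print_incoming_messages():
    pull = context.socket(zmq.PULL)  # this thread's own socket
    pull.bind('tcp://*:5556')
    while True:
        print(f'Message from client: {pull.recv_string()}')

threading.Thread(target=print_incoming_messages, daemon=True).start()

push = context.socket(zmq.PUSH)  # the main thread's own socket
push.bind('tcp://*:5557')
while True:
    push.send_string(input('Message to send: '))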

Accessing Sockets with Python SocketServer.ThreadingTCPServer

I'm using a SocketServer.ThreadingTCPServer to serve socket connections to clients. This provides an interface where users can connect, type commands and get responses. That part I have working well.
However, in some cases I need a separate thread to broadcast a message to all connected clients. I can't figure out how to do this because there is no way to pass arguments to the class instantiated by ThreadingTCPServer. I don't know how to gather a list of socket connections that have been created.
Consider the example here. How could I access the socket created in the MyTCPHandler class from the __main__ thread?
You should not write to the same TCP socket from multiple threads. The writes may be interleaved if you do ("Hello" and "World" may become "HelWloorld").
That being said, you can create a global list to contain references to all the server objects (who would register themselves in __init__()). The question is, what to do with this list? One idea would be to use a queue or pipe to send the broadcast data to each server object, and have the server objects look in that queue for the "extra" broadcast data to send each time their handle() method is invoked.
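A rough sketch of that queue idea (all names here are hypothetical, and note the caveat: broadcast data only goes out the next time handle() gets around to draining the queue):
import queue
import socketserver
import threading

broadcast_queues = []           # one outbox per connected handler
queues_lock = threading.Lock()  # guards the list itself

class MyTCPHandler(socketserver.BaseRequestHandler):
    def handle(self):
        outbox = queue.Queue()
        with queues_lock:
            broadcast_queues.append(outbox)
        try:
            while True:
                # Drain any pending broadcast data first
                try:
                    while True:
                        self.request.sendall(outbox.get_nowait())
                except queue.Empty:
                    pass
                data = self.request.recv(1024)  # blocks until client traffic
                if not data:
                    break
                self.request.sendall(data)      # echo back per-client traffic
        finally:
            with queues_lock:
                broadcast_queues.remove(outbox)

def broadcast(message: bytes):
    # Called from any other thread to queue data for every client
    with queues_lock:
        for q in broadcast_queues:
            q.put(message)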
Alternatively, you could use the Twisted networking library, which is more flexible and will let you avoid threading altogether - usually a superior alternative.
Here is what I've come up with. It isn't thread safe yet, but that shouldn't be a hard fix:
When the socket is accepted:
if not hasattr(self.server, 'socketlist'):
    self.server.socketlist = dict()
thread_id = threading.current_thread().ident
self.server.socketlist[thread_id] = self.request
When the socket closes:
del self.server.socketlist[thread_id]
When I want to write to all sockets:
def broadcast(self, message):
    if hasattr(self._server, 'socketlist'):
        for socket in self._server.socketlist.values():
            socket.sendall(message + "\r\n")
It seems to be working well and isn't as messy as I thought it might end up being.
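To close the remaining thread-safety gap, one option (an assumption, not tested against the code above) is to replace the bare dict with a small locked registry, so concurrent connects, disconnects, and broadcasts cannot corrupt it:
import threading

class SocketRegistry:
    """Thread-safe registry of client sockets for broadcasting."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sockets = {}

    def add(self, key, sock):
        with self._lock:
            self._sockets[key] = sock

    def remove(self, key):
        with self._lock:
            self._sockets.pop(key, None)

    def broadcast(self, message):
        with self._lock:
            sockets = list(self._sockets.values())
        # Send outside the lock so one slow client cannot block the others
        for sock in sockets:
            sock.sendall(message + b"\r\n")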

ZeroMQ: multiple remote (LAN) publishers

I have a basic ZeroMQ scenario consisting of two publishers and a subscriber. This has been working fine on a local computer until I decided to separate all process in different computers within my LAN. This is how I'm creating the ZeroMQ sockets (simplified Python code):
(Subscriber process running on machine with IP 192.168.1.52)
Publisher code (common for both publishers):
context = zmq.Context()
self.pub_socket = context.socket(zmq.PUB)
self.pub_socket.connect("tcp://192.168.1.52:5556")
Subscriber code:
context = zmq.Context()
self.sub_socket = context.socket(zmq.SUB)
self.sub_socket.bind("tcp://192.168.1.52:5556")
self.sub_socket.setsockopt(zmq.SUBSCRIBE, "")
I've tried entering tcp://127.0.0.1:5556 as the binding address:port for the subscriber but that makes no difference.
I would suspect your issue might be related to the openness of the ports between your machines. Some operating systems have their own software firewalls so you may need to check if you need to open them up.
First I would check that you can do a simple REQ/REP exchange between the two machines:
# machine 1
import zmq
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5556")
req = socket.recv()
socket.send(req)
# machine 2
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://192.168.1.52:5556")
socket.send("FOO")
print socket.recv()
If you are having a problem with that, then you might want to check those ports.
Secondly, you also might try binding to all interfaces with: socket.bind("tcp://*:5556")
And for your actual goal, if all you need is a multi-sender / single-receiver setup, you can probably just use PUSH/PULL instead of PUB/SUB:
# one receiver
import zmq
context = zmq.Context()
socket = context.socket(zmq.PULL)
socket.bind("tcp://*:5556")
while True:
    print(socket.recv())
# many senders
import zmq
context = zmq.Context()
socket = context.socket(zmq.PUSH)
socket.connect("tcp://192.168.1.52:5556")
socket.send("FOO")
Did you walk through the "Missing Message Problem Solver" in the ZMQ guide?
Note that when using the PUB/SUB pattern there is a slow-joiner syndrome that always loses some messages. The syndrome can be mitigated if we connect in the SUB and bind in the PUB; however, with multiple publishers, the subscriber needs to connect to all of them.
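In code, that multi-publisher arrangement might look like this sketch (the publisher endpoints listed are assumptions; each publisher binds its own port and the single subscriber connects to all of them):
import zmq

context = zmq.Context()

# Each publisher binds its own endpoint:
#   pub_socket = context.socket(zmq.PUB)
#   pub_socket.bind("tcp://*:5556")

# ...and the lone subscriber connects to every publisher.
sub_socket = context.socket(zmq.SUB)
sub_socket.setsockopt_string(zmq.SUBSCRIBE, "")
for endpoint in ("tcp://192.168.1.50:5556", "tcp://192.168.1.51:5556"):
    sub_socket.connect(endpoint)

while True:
    print(sub_socket.recv_string())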
Thanks guys for your suggestions.
Firewalls were indeed disabled, but I finally found a PC which can receive from both publishers. It seems to be a problem related to the ZMQ versions installed on each computer: the senders had v2.2 whilst the receiver had v2.1. It's weird because I thought the zmq protocol was version agnostic. Need to remember this for next time.
Thanks again!
The protocol should work between 2.1 and 2.2, but it got broken by 3.1. In 3.2 we fixed things to work with older versions again.

Constantly running python script, calling functions via terminal

Quick question that I'm not even sure is possible :3
I have a python script, a network script, that connects to a server and remains connected until I either disconnect or it kicks me (which it normally shouldn't). It constantly receives data and does other tasks.
I was curious whether it's at all possible to trigger functions from within the script while it is running? Say, while the script was running, if I had the urge to send some sort of data to the server, I could type it up and have it passed to the function that handles sending?
Wasn't quite sure if it was possible or not, as I've never had to attempt or even seen it done. If it helps, I'm on Ubuntu linux running the script from the terminal.
The usual 'UNIX-way' to solve such problems is to poll or select on both the socket and the standard input file descriptors. You then handle network input on 'IN' event on the socket and terminal input on 'IN' event on the stdin file descriptor.
This is not portable to Windows (which sucks), but that is the most natural way to do it on UNIX-like systems. And you don't get all the problems which come with threads (which often need polling in Python too, as they get 'unkillable' otherwise).
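A minimal sketch of that approach on a UNIX-like system (the server address is a placeholder for wherever your script actually connects):
import select
import socket
import sys

sock = socket.create_connection(("example.com", 12345))  # placeholder server

while True:
    # Block until either the network socket or stdin is readable
    readable, _, _ = select.select([sock, sys.stdin], [], [])
    for fd in readable:
        if fd is sock:
            data = sock.recv(4096)  # 'IN' event on the socket
            if not data:
                raise SystemExit("server closed the connection")
            print("from server:", data)
        else:
            line = sys.stdin.readline()  # 'IN' event on stdin
            sock.sendall(line.encode())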
Take a look at gevent:
gevent is a coroutine-based Python networking library that uses
greenlet to provide a high-level synchronous API on top of the
libevent event loop.
and gevent.socket.
Jacek Konieczny's solution is good and simple. Should you want more flexible message passing, consider ZeroMQ. This gives you lots of power to easily create various messaging solutions around your main program. Using a single thread, your main program would look something like this:
#!/usr/bin/env python
import zmq
from time import sleep

CTX = zmq.Context()

incoming = CTX.socket(zmq.PULL)
incoming.bind("tcp://127.0.0.1:3000")

outgoing = CTX.socket(zmq.PUB)
outgoing.bind("tcp://127.0.0.1:3001")

# Poller for the incoming messages
poller = zmq.Poller()
poller.register(incoming, zmq.POLLIN)

def main():
    while True:
        # Do things on the network
        print("[Did things on the network]")
        # Send messages if you want
        outgoing.send(b"Important message")
        # Poll for incoming messages without blocking
        socks = dict(poller.poll(timeout=0))
        if incoming in socks and socks[incoming] == zmq.POLLIN:
            message = incoming.recv()
            # Handle message
            print("[Handled message '%s']" % message)
        sleep(1)  # Only for this dummy program

if __name__ == "__main__":
    main()
You would then write a client (in any language that has ZeroMQ bindings) that pushes and subscribes to messages from the main program. Example pusher:
#!/usr/bin/env python
import zmq

CTX = zmq.Context()

pusher = CTX.socket(zmq.PUSH)
pusher.connect("tcp://127.0.0.1:3000")

def main():
    pusher.send(b"Message to main program")

if __name__ == "__main__":
    main()
Example subscriber:
#!/usr/bin/env python
import zmq

CTX = zmq.Context()

subscriber = CTX.socket(zmq.SUB)
subscriber.connect("tcp://127.0.0.1:3001")
subscriber.setsockopt_string(zmq.SUBSCRIBE, "")

def main():
    while True:
        msg = subscriber.recv()
        print("[Received message] %s" % msg)

if __name__ == "__main__":
    main()
It sounds like you will want to combine the pusher and subscriber programs into one. If you decide to use ZeroMQ have a look at the excellent user guide.
You can of course also use ZeroMQ with multiple threads or processess (just be careful not to share individual ZeroMQ sockets between threads).
Without more details, I can only provide you with general ideas. In order to do two things at once (download from the server and wait for data to send) you will need to use either multiple threads or processes. There is a tutorial with some examples of multiple threads here. If you use multiple processes, you would be using the multiprocessing package.
With either solution, you would need a similar setup. I'll use the term thread for the rest, but you could easily replace that with process if you used multiple processes instead. You would probably have (at least) a thread to send and receive data (this might be two threads) and a separate thread to wait for something to send. This is a simplified example of the producer/consumer problem. The thread that waits for the commands/data would be a simple input loop that produces data to send, while the thread that sends data would consume the data as it sends it to the server.
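A bare-bones sketch of that producer/consumer split using only the standard library (the network side is stubbed out with a print):
import queue
import threading

outbox = queue.Queue()  # commands waiting to be sent to the server

def input_loop():
    # Producer: read commands from the terminal
    while True:
        outbox.put(input("> "))

def network_loop():
    # Consumer: send queued commands to the server
    while True:
        command = outbox.get()  # blocks until a command is queued
        print(f"sending to server: {command!r}")  # stand-in for sock.sendall()

threading.Thread(target=network_loop, daemon=True).start()
input_loop()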
Stick your server stuff in another thread (investigate the threading module) and use the main thread for interaction with the user via raw_input/input.

zeromq persistence patterns

Who has to manage persistence in ZeroMQ?
When using ZeroMQ clients in Python, what plug-ins/modules are available to manage persistence?
I would like to know the recommended patterns for using ZeroMQ.
As far as I know, ZeroMQ does not have any persistence. It is out of scope and needs to be handled by the end user, just like serializing the message.
In C#, I have used db4o to add persistence. Typically I persist the object in its raw state, then serialize it and send it to the ZMQ socket. Btw, this was for a PUB/SUB pair.
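Translated to Python, that persist-then-send pattern might look roughly like this sketch, with sqlite3 standing in for db4o (all names are illustrative):
import json
import sqlite3
import zmq

db = sqlite3.connect("outbox.db")
db.execute("CREATE TABLE IF NOT EXISTS outbox (id INTEGER PRIMARY KEY, body TEXT)")

context = zmq.Context()
pub = context.socket(zmq.PUB)
pub.bind("tcp://*:5556")

def publish(obj):
    body = json.dumps(obj)
    # Persist first, so the message survives a crash before or during the send
    db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
    db.commit()
    pub.send_string(body)

publish({"event": "hello"})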
On the application ends you can persist accordingly. For example, I've built a persistence layer in node.js which communicated with back-end PHP calls via websockets.
The persistence layer held messages for a certain period of time (a time to live, http://en.wikipedia.org/wiki/Time_to_live) to give clients a chance to connect. I used in-memory data structures, but I toyed with the idea of using redis to gain on-disk persistence.
We needed to persist the received messages from a subscriber before processing them. The messages are received in a separate thread and stored on disk, while the persisted message queue is manipulated in the main thread.
The module is available at: https://pypi.org/project/persizmq. From the documentation:
import pathlib
import zmq
import persizmq
context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.setsockopt_string(zmq.SUBSCRIBE, "")
subscriber.connect("ipc:///some-queue.zeromq")
persistent_dir = pathlib.Path("/some/dir")
storage = persizmq.PersistentStorage(persistent_dir=persistent_dir)
def on_exception(exception: Exception) -> None:
    print("an exception in the listening thread: {}".format(exception))

with persizmq.ThreadedSubscriber(
        callback=storage.add_message, subscriber=subscriber,
        on_exception=on_exception):
    msg = storage.front()  # non-blocking
    if msg is not None:
        print("Received a persistent message: {}".format(msg))
        storage.pop_front()
