Receiving zmq messages on background thread fails on Windows - python

I'm trying to set up a hello world style example of asynchronous communication between two peers with zmq.PAIR by receiving messages on a background thread while using console input to send messages:
server.py:
import zmq
import threading

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.bind('tcp://*:5556')

def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from client: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
client.py:
import zmq
import threading

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.connect('tcp://127.0.0.1:5556')

def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from server: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
This works completely fine on a Linux machine but socket.send_string blocks in either process when run from the Windows 10 command prompt. What is the reason for this discrepancy?
The socket is set up properly, and flushing all outputs makes no difference. The reading itself also works as expected, as may be verified by navigating to 127.0.0.1:5556 in a browser. Looking at the loopback interface in Wireshark also reveals that the connection is set up properly, yet no messages are sent.
If I comment out recv_thread.start() in the client, however, messages are sent through as may be verified in Wireshark, which suggests that somehow socket.recv_string is blocking the socket from sending even though it isn't doing so on Linux.
I am also able to achieve the desired behavior by using two sets of PUSH/PULL (cf. this answer) but that doesn't quite help explain what's going on in the example at hand.
This is on Python 3.7.1, pyzmq 18.0.0, and libzmq 4.3.1 on both systems.

zmq sockets are not threadsafe, so running send and recv on the same socket in different threads should not be expected to work. Different threading behaviors on different platforms may be responsible for the difference in behavior you are seeing, but this code could also result in segfaults eventually due to the thread-unsafety of zmq sockets. Using a Lock might solve the problem.
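For illustration, here is one minimal sketch of the lock approach on the server side (the 100 ms poll interval and the daemon flag are my own choices, not from the question): the receive thread only holds the lock while polling with a timeout, so the input thread can grab it to send.

import threading
import time
import zmq

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.bind('tcp://*:5556')
socket_lock = threading.Lock()

def print_incoming_messages():
    poller = zmq.Poller()
    poller.register(socket, zmq.POLLIN)
    while True:
        with socket_lock:
            # Poll with a timeout so the lock is released regularly
            if dict(poller.poll(timeout=100)).get(socket) == zmq.POLLIN:
                print(f'Message from client: {socket.recv_string()}')
        time.sleep(0.01)  # give the sending thread a chance at the lock

recv_thread = threading.Thread(target=print_incoming_messages, daemon=True)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    with socket_lock:
        socket.send_string(msg)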
As a side note, PAIR is a rarely-used socket type, and not often intended for use in production or inter-process communication. Most real-world instances of PAIR are as inproc sockets for inter-thread communication. PAIR can have weird behavior on reconnect, for example. Using PUSH-PULL for one-way or DEALER-DEALER for two-way communication is likely to behave in a more expected fashion.
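As a rough sketch of the DEALER-DEALER variant (the port number and single-thread-per-process structure are my own choices, not from the question), keeping all socket use on one thread per process also sidesteps the thread-safety issue entirely:

# dealer_server.py
import zmq

context = zmq.Context()
socket = context.socket(zmq.DEALER)
socket.bind('tcp://*:5557')

while True:
    msg = socket.recv_string()
    print(f'Message from client: {msg}')
    socket.send_string(f'Echo: {msg}')

# dealer_client.py
import zmq

context = zmq.Context()
socket = context.socket(zmq.DEALER)
socket.connect('tcp://127.0.0.1:5557')

while True:
    socket.send_string(input('Message to send: '))
    print(f'Message from server: {socket.recv_string()}')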


How to use inproc transport with pyzmq?

I have set up two small scripts imitating a publish and subscribe procedure with pyzmq. However, I am unable to send messages over to my subscriber client using the inproc transport. I am able to use tcp://127.0.0.1:8080 fine, just not inproc.
pub_server.py
import zmq
import random
import sys
import time

context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind("inproc://stream")

while True:
    socket.send_string("Hello")
    time.sleep(1)
sub_client.py
import sys
import zmq

# Socket to talk to server
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.setsockopt_string(zmq.SUBSCRIBE, '')
socket.connect("inproc://stream")

for x in range(5):
    string = socket.recv()
    print(string)
How can I successfully alter my code so that I'm able to use the inproc transport method between my two scripts?
EDIT:
I have updated my code to further reflect @larsks' comment. I am still not receiving my published string - what is it that I am doing wrong?
import threading
import zmq

def pub():
    context = zmq.Context()
    sender = context.socket(zmq.PUB)
    sender.connect("inproc://hello")
    lock = threading.RLock()
    with lock:
        sender.send(b"")

def sub():
    context = zmq.Context()
    receiver = context.socket(zmq.SUB)
    receiver.bind("inproc://hello")
    pub()

    # Wait for signal
    string = receiver.recv()
    print(string)
    print("Test successful!")
    receiver.close()

if __name__ == "__main__":
    sub()
As the name implies, inproc sockets can only be used within the same process. If you were to rewrite your client and server such that there were two threads in the same process you could use inproc, but otherwise this socket type simply isn't suitable for what you're doing.
The documentation is very clear on this point:
The in-process transport passes messages via memory directly between threads sharing a single ØMQ context.
Update
Taking a look at the updated code, the problem that stands out first is that while the documentation quoted above says "...between threads sharing a single ØMQ context", you are creating two contexts in your code. Typically, you will only call zmq.Context() once in your program.
Next, you are never subscribing your subscriber to any messages, so even in the event that everything else was working correctly you would not actually receive any messages.
Lastly, your code is going to experience the slow joiner problem:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
The pub/sub model isn't meant for single messages, nor is it meant to be a reliable transport.
So, to sum up:
You need to create a shared ZMQ context before you create your sockets.
You probably want your publisher to publish in a loop instead of publishing a single message. Since you're trying to use inproc sockets you're going to need to put your two functions into separate threads.
You need to set a subscription filter in order to receive messages.
There is an example using PAIR sockets in the ZMQ documentation that might provide a useful starting point. PAIR sockets are designed for coordinating threads over inproc sockets, and unlike pub/sub sockets they are bidirectional and are not impacted by the "slow joiner" issue.
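To illustrate, here is a minimal sketch combining those fixes (keeping the question's "inproc://hello" address; the publish interval is an arbitrary choice):

import threading
import time
import zmq

context = zmq.Context()  # one shared context for both sockets

sender = context.socket(zmq.PUB)
sender.bind("inproc://hello")  # bind before connect (required for inproc on older libzmq)

receiver = context.socket(zmq.SUB)
receiver.setsockopt(zmq.SUBSCRIBE, b"")  # subscribe to all messages
receiver.connect("inproc://hello")

def pub():
    # publish in a loop so the slow-joining subscriber still gets a message
    while True:
        sender.send(b"hello")
        time.sleep(0.1)

threading.Thread(target=pub, daemon=True).start()
print(receiver.recv())
print("Test successful!")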
As mentioned earlier by @larsks, the context object should be the same. Declare the context object globally and use it in both the pub and sub functions instead of creating a new one for each.

Client(1) to Server to Client(2) in Python 3.x (Sockets?)

I've been working on a project that involves sending information to a public server (to demonstrate how key-exchange schemes work) and then sending it to a specific client. There are only two clients.
I'm hoping to get pushed in the right direction on how to get information from client(1) to the server, then have the server redirect that information to client(2). I've messed with the code somewhat, getting comfortable with how to send and receive information from the server, but I have no idea (~2 hours of research so far) how to differentiate clients and send information to specific clients.
My current server code (pretty much unchanged from the python3 docs):
import socketserver

class MyTCPHandler(socketserver.BaseRequestHandler):
    """
    The RequestHandler class for our server.

    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """

    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # just send back the same data, but upper-cased
        self.request.sendall(self.data.upper())

if __name__ == "__main__":
    HOST, PORT = "localhost", 9999

    # Create the server, binding to localhost on port 9999
    server = socketserver.TCPServer((HOST, PORT), MyTCPHandler)

    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
My client code (pretty much unchanged from the python3 docs):
import socket
import time

data = "matt is ok"

def contactserver(data):
    HOST, PORT = "localhost", 9999

    # Create a socket (SOCK_STREAM means a TCP socket)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Connect to server and send data
    sock.connect((HOST, PORT))
    sock.sendall(bytes(data, "utf-8"))

    # Receive data from the server and shut down
    received = str(sock.recv(1024), "utf-8")

    print("Sent: {}".format(data))
    print("Received: {}".format(received))
    return format(received)

while True:
    k = contactserver('banana')
    time.sleep(1)
    print(k)
First, a base socketserver.TCPServer can't even talk to two clients at the same time. As the docs explain:
These four classes process requests synchronously; each request must be completed before the next request can be started.
As the same paragraph tells you, you can solve that problem by using a forking or threading mix-in. That's pretty easy.
But there's a bigger problem. A threaded socketserver server creates a separate, completely independent, object for each connected client, and has no means of communicating between them, or even letting them find out about each other. So, what can you do?
You can always build it yourself. Put some kind of shared data somewhere, and some kind of synchronization on it, and all of the threads can talk to each other the same way any threads can, socketserver or otherwise.
For your design, a queue has all the magic built in for everything we need: client 1 can put a message on the queue (whether client 2 has shown up yet or not), and client 2 can get a message off the same queue (automatically waiting around if the message isn't there yet), and it's all automatically synchronized.
The big question is: how does the server know who's client 1 and who's client 2? Unless you want to switch based on address and port, or add some kind of "login" mechanism, the only rule I can think of is that whoever connects first is client 1, whoever connects second is client 2, and anyone who connects after that, who cares, they don't belong here. So, we can use a simple shared flag with a Lock on it.
Putting it all together:
import queue
import socketserver
import threading

class MyTCPHandler(socketserver.BaseRequestHandler):
    q = queue.Queue()
    got_first = False
    got_first_lock = threading.Lock()

    def handle(self):
        with MyTCPHandler.got_first_lock:
            if MyTCPHandler.got_first:
                first = False
            else:
                first = True
                MyTCPHandler.got_first = True
        if first:
            self.data = self.request.recv(1024).strip()
            print("{} wrote:".format(self.client_address[0]))
            print(self.data)
            # just send back the same data, but upper-cased
            self.request.sendall(self.data.upper())
            # and also queue it up for client 2
            MyTCPHandler.q.put(self.data)
        else:
            # get the message off the queue, waiting if necessary
            self.data = MyTCPHandler.q.get()
            self.request.sendall(self.data)

if __name__ == "__main__":
    # the threading mix-in belongs on the server, not the handler
    server = socketserver.ThreadingTCPServer(("localhost", 9999), MyTCPHandler)
    server.serve_forever()
If you want to build a more complicated chat server, where everyone talks to everyone… well, that gets a bit more complicated, and you're stretching socketserver even farther beyond its intended limits.
I would suggest either (a) dropping to a lower level and writing a threaded or multiplexing server manually, or (b) going to a higher-level, more-powerful framework that can more easily handle interdependent clients.
The stdlib comes with a few alternatives for writing servers, but all of them suck except for asyncio—which is great, but unfortunately brand new (it requires 3.4, which is still in beta, or can be installed as a back-port for 3.3). If you don't want to skate on the bleeding edge, there are some great third-party choices like twisted or gevent. All of these options have a higher learning curve than socketserver, but that's only to be expected from something much more flexible and powerful.
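For a flavor of the asyncio route, here is a hypothetical sketch of the same two-client relay. It is not part of the original answer and uses the modern (Python 3.7+) API, which did not exist when this was written:

import asyncio

q = asyncio.Queue()
got_first = False

async def handle(reader, writer):
    global got_first
    if not got_first:
        got_first = True
        data = await reader.read(1024)
        writer.write(data.upper())   # echo back upper-cased, as before
        await q.put(data)            # and queue it up for client 2
    else:
        writer.write(await q.get())  # wait for client 1's message
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, "localhost", 9999)
    async with server:
        await server.serve_forever()

asyncio.run(main())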

Is it OK to send asynchronous notifications from server to client via the same TCP connection?

As far as I understand the basics of the client-server model, generally only client may initiate requests; server responds to them. Now I've run into a system where the server sends asynchronous messages back to the client via the same persistent TCP connection whenever it wants. So, a couple of questions:
Is it a right thing to do at all? It seems to really overcomplicate implementation of a client.
Are there any nice patterns/methodologies I could use to implement a client for such a system in Python? Changing the server is not an option.
Obviously, the client has to watch both the local request queue (i.e. requests to be sent to the server) and the incoming messages from the server. Launching two threads (Rx and Tx) per connection does not feel right to me. Using select() is a major PITA here. Am I missing something?
When dealing with asynchronous I/O in Python I typically use a library such as gevent or eventlet. The objective of these libraries is to allow applications written in a synchronous style to be multiplexed by a back-end reactor.
This basic example demonstrates launching two green threads/coroutines/fibers to handle either side of the TCP duplex. The send side of the duplex listens on an asynchronous queue.
This is all performed within a single hardware thread. Both gevent and eventlet have more substantive examples in their documentation than what I have provided below.
If you run nc -l -p 8000 you will see "012" printed out. As soon as netcat exits, this code will terminate.
from eventlet import connect, sleep, GreenPool
from eventlet.queue import Queue

def handle_i(sock, queue):
    while True:
        data = sock.recv(8)
        if data:
            print(data.decode())
        else:
            queue.put(None)  # <- signal send side of duplex to exit
            break

def handle_o(sock, queue):
    while True:
        data = queue.get()
        if data:
            sock.send(data)
        else:
            break

queue = Queue()
sock = connect(('127.0.0.1', 8000))

gpool = GreenPool()
gpool.spawn(handle_i, sock, queue)
gpool.spawn(handle_o, sock, queue)

for i in range(0, 3):
    queue.put(str(i).encode())  # sockets want bytes on Python 3
    sleep(1)

gpool.waitall()  # <- waits until nc exits
I believe what you are trying to achieve is a bit similar to JSONP: when sending to the client, wrap the data in a call to a callback method that you know exists on the client.
For example, if you are sending "some data xyz", send it like server.send("callback('some data xyz')"). This suggestion comes from JavaScript, which executes the returned code as if it were called through that method; I believe you can port this idea to Python with some difficulty, but I am not sure.
Yes, this is very normal; a server can also send messages to the client after the connection is made, as in the case of a telnet server: when you initiate a connection, it sends you a message for the capability exchange, and after that it asks for your username and password.
You could very well use select(), or, if I were in your shoes, I would spawn a separate thread to receive the asynchronous messages from the server and leave the main thread free for further processing.
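A minimal sketch of that threaded approach (the server address is a placeholder; plain TCP sockets, unlike zmq sockets, do allow one thread to recv while another sends):

import socket
import threading

sock = socket.create_connection(("localhost", 9999))  # hypothetical server

def rx():
    # handle unsolicited messages from the server in the background
    while True:
        data = sock.recv(1024)
        if not data:
            break  # server closed the connection
        print("async message:", data.decode())

threading.Thread(target=rx, daemon=True).start()

while True:
    sock.sendall(input("request> ").encode())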

ZeroMQ: multiple remote (LAN) publishers

I have a basic ZeroMQ scenario consisting of two publishers and a subscriber. This worked fine on a local computer until I decided to separate the processes onto different computers within my LAN. This is how I'm creating the ZeroMQ sockets (simplified Python code):
(Subscriber process running on machine with IP 192.168.1.52)
Publisher code (common for both publishers):
context = zmq.Context()
self.pub_socket = context.socket(zmq.PUB)
self.pub_socket.connect("tcp://192.168.1.52:5556")
Subscriber code:
context = zmq.Context()
self.sub_socket = context.socket(zmq.SUB)
self.sub_socket.bind("tcp://192.168.1.52:5556")
self.sub_socket.setsockopt(zmq.SUBSCRIBE, "")
I've tried entering tcp://127.0.0.1:5556 as the binding address:port for the subscriber but that makes no difference.
I would suspect your issue might be related to the openness of the ports between your machines. Some operating systems have their own software firewalls so you may need to check if you need to open them up.
First I would check that you can do one of the simple req/rep between two machines:
# machine 1
import zmq
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5556")
req = socket.recv()
socket.send(req)
# machine 2
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://192.168.1.52:5556")
socket.send("FOO")
print socket.recv()
If you are having a problem with that, then you might want to check those ports.
Secondly, you also might try binding to all interfaces with: socket.bind("tcp://*:5556")
And for your actual goal, if all you need is multi-sender / single-receiver, you can probably just use PUSH/PULL instead of PUB/SUB:
# one receiver
import zmq
context = zmq.Context()
socket = context.socket(zmq.PULL)
socket.bind("tcp://*:5556")
while True:
    print(socket.recv())
# many senders
import zmq
context = zmq.Context()
socket = context.socket(zmq.PUSH)
socket.connect("tcp://192.168.1.52:5556")
socket.send("FOO")
Did you walk through the "Missing Message Problem Solver" in the ZMQ guide?
Note that when using the PUB/SUB pattern, there is a slow-joiner syndrome that always loses some messages. The syndrome can be eliminated if we connect in the SUB and bind in the PUB; however, when there are multiple publishers, the subscriber needs to connect to all of them.
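For example, with each publisher binding its own endpoint, the subscriber side might look like this (the addresses are made up):

import zmq

context = zmq.Context()
sub = context.socket(zmq.SUB)
sub.setsockopt(zmq.SUBSCRIBE, b"")

# connect out to every publisher; each publisher binds its own port
for address in ("tcp://192.168.1.50:5556", "tcp://192.168.1.51:5556"):
    sub.connect(address)

while True:
    print(sub.recv())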
Thanks guys for your suggestions.
Firewalls were indeed disabled, but I finally found a PC which can receive from both publishers. It seems to be a problem related to the ZMQ versions installed on each computer: the senders had v2.2 whilst the receiver had v2.1. It's weird because I thought that the zmq protocol was version agnostic. Need to remember this for next time.
Thanks again!
The protocol should work between 2.1 and 2.2, but it got broken by 3.1. In 3.2 we fixed things to work with older versions again.

Constantly running python script, calling functions via terminal

quick question that I'm never even sure is possible :3
I have a python script, a network script that connects to a server and remains connected until I either disconnect or it kicks me (which it normally shouldn't), and which is constantly receiving data and doing other tasks.
I was curious whether it's at all possible to trigger functions from within the script while it is running? Say, while the script was running, if I had the urge to send some sort of data to the server, I could type it up and send it to the function that handles this?
Wasn't quite sure if it was possible or not, as I've never attempted it or even seen it done. If it helps, I'm on Ubuntu Linux running the script from the terminal.
The usual 'UNIX-way' to solve such problems is to poll or select on both the socket and the standard input file descriptors. You then handle network input on 'IN' event on the socket and terminal input on 'IN' event on the stdin file descriptor.
This is not portable to Windows (which sucks), but that is the most natural way to do it on UNIX-like systems. And you don't get all the problems which come with threads (which often need polling in Python too, as they get 'unkillable' otherwise).
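A minimal select() sketch of that idea (the server address is a placeholder; this works on UNIX-like systems because sys.stdin can be passed to select()):

import select
import socket
import sys

sock = socket.create_connection(("localhost", 9999))  # hypothetical server

while True:
    # block until the socket or stdin has data ready
    readable, _, _ = select.select([sock, sys.stdin], [], [])
    for fd in readable:
        if fd is sock:
            data = sock.recv(1024)
            if not data:
                sys.exit(0)  # server closed the connection
            print("from server:", data.decode())
        else:
            line = sys.stdin.readline()
            sock.sendall(line.encode())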
Take a look at gevent:
gevent is a coroutine-based Python networking library that uses
greenlet to provide a high-level synchronous API on top of the
libevent event loop.
and gevent.socket.
Jacek Konieczny's solution is good and simple. Should you want more flexible message passing, consider ZeroMQ. This gives you lots of power to easily create various messaging solutions around your main program. Using a single thread, your main program would look something like this:
#!/usr/bin/env python
import zmq
from time import sleep

CTX = zmq.Context()

incoming = CTX.socket(zmq.PULL)
incoming.bind("tcp://127.0.0.1:3000")
outgoing = CTX.socket(zmq.PUB)
outgoing.bind("tcp://127.0.0.1:3001")

# Poller for the incoming messages
poller = zmq.Poller()
poller.register(incoming, zmq.POLLIN)

def main():
    while True:
        # Do things on the network
        print("[Did things on the network]")

        # Send messages if you want
        outgoing.send(b"Important message")

        # Poll for incoming messages without blocking
        socks = dict(poller.poll(0))
        if incoming in socks and socks[incoming] == zmq.POLLIN:
            message = incoming.recv()
            # Handle message
            print("[Handled message '%s']" % message)

        sleep(1)  # Only for this dummy program

if __name__ == "__main__":
    main()
You would then write a client (in any language that has ZeroMQ bindings) that pushes and subscribes to messages from the main program. Example pusher:
#!/usr/bin/env python
import zmq

CTX = zmq.Context()

pusher = CTX.socket(zmq.PUSH)
pusher.connect("tcp://127.0.0.1:3000")

def main():
    pusher.send(b"Message to main program")

if __name__ == "__main__":
    main()
Example subscriber:
#!/usr/bin/env python
import zmq

CTX = zmq.Context()

subscriber = CTX.socket(zmq.SUB)
subscriber.connect("tcp://127.0.0.1:3001")
subscriber.setsockopt(zmq.SUBSCRIBE, b"")

def main():
    while True:
        msg = subscriber.recv()
        print("[Received message] %s" % msg)

if __name__ == "__main__":
    main()
It sounds like you will want to combine the pusher and subscriber programs into one. If you decide to use ZeroMQ have a look at the excellent user guide.
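A hypothetical combination of the two, reusing the ports from the examples above, might look like this:

#!/usr/bin/env python
import zmq

CTX = zmq.Context()

pusher = CTX.socket(zmq.PUSH)
pusher.connect("tcp://127.0.0.1:3000")

subscriber = CTX.socket(zmq.SUB)
subscriber.connect("tcp://127.0.0.1:3001")
subscriber.setsockopt(zmq.SUBSCRIBE, b"")

poller = zmq.Poller()
poller.register(subscriber, zmq.POLLIN)

while True:
    pusher.send(input("command> ").encode())
    # drain any pending broadcasts from the main program
    while dict(poller.poll(100)).get(subscriber) == zmq.POLLIN:
        print("[Received message] %s" % subscriber.recv())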
You can of course also use ZeroMQ with multiple threads or processess (just be careful not to share individual ZeroMQ sockets between threads).
Without more details, I can only provide you with general ideas. In order to do two things at once (download from the server and wait for data to send) you will need to use either multiple threads or processes. There is a tutorial with some examples of multiple threads here. If you use multiple processes, you would be using the multiprocessing package.
With either solution, you would need a similar setup. I'll use the term thread for the rest, but you could easily replace that with process if you used multiple processes instead. You would probably have (at least) a thread to send and receive data (this might be two threads) and a separate thread to wait for something to send. This is a simplified example of the producer/consumer problem. The thread that waits for the commands/data would be a simple input loop that produces data to send, while the thread that sends data would consume the data as it sends it to the server.
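As a concrete sketch of that producer/consumer split (the address, 1024-byte reads, and 0.5 s timeout are all placeholder choices, not from the question):

import queue
import socket
import threading

outbox = queue.Queue()

def network_loop():
    sock = socket.create_connection(("localhost", 9999))  # hypothetical server
    sock.settimeout(0.5)
    while True:
        try:
            data = sock.recv(1024)          # consume incoming data
            if data:
                print("received:", data.decode())
        except socket.timeout:
            pass
        try:
            msg = outbox.get_nowait()       # send anything the user queued
            sock.sendall(msg.encode())
        except queue.Empty:
            pass

threading.Thread(target=network_loop, daemon=True).start()

while True:
    outbox.put(input("command> "))          # producer: console input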
Stick your server stuff in another thread (investigate the threading module) and use the main thread for interaction with the user via raw_input/input.
