ZeroMQ: multiple remote (LAN) publishers - python

I have a basic ZeroMQ scenario consisting of two publishers and a subscriber. This has been working fine on a single local computer, until I decided to separate the processes onto different computers within my LAN. The subscriber process runs on the machine with IP 192.168.1.52. This is how I'm creating the ZeroMQ sockets (simplified Python code):
Publisher code (common for both publishers):
context = zmq.Context()
self.pub_socket = context.socket(zmq.PUB)
self.pub_socket.connect("tcp://192.168.1.52:5556")
Subscriber code:
context = zmq.Context()
self.sub_socket = context.socket(zmq.SUB)
self.sub_socket.bind("tcp://192.168.1.52:5556")
self.sub_socket.setsockopt(zmq.SUBSCRIBE, "")
I've tried entering tcp://127.0.0.1:5556 as the binding address:port for the subscriber but that makes no difference.

I would suspect your issue might be related to the openness of the ports between your machines. Some operating systems have their own software firewalls, so you may need to check whether you need to open them up.
First, I would check that you can do a simple REQ/REP exchange between the two machines:
# machine 1
import zmq
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5556")
req = socket.recv()
socket.send(req)
# machine 2
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://192.168.1.52:5556")
socket.send("FOO")
print socket.recv()
If you are having a problem with that, then you might want to check those ports.
Secondly, you also might try binding to all interfaces with: socket.bind("tcp://*:5556")
And for your actual goal, if all you need is a multi-sender / single-receiver setup, you can probably just use PUSH/PULL instead of PUB/SUB:
# one receiver
import zmq
context = zmq.Context()
socket = context.socket(zmq.PULL)
socket.bind("tcp://*:5556")
while True:
    print socket.recv()
# many senders
import zmq
context = zmq.Context()
socket = context.socket(zmq.PUSH)
socket.connect("tcp://192.168.1.52:5556")
socket.send("FOO")

Did you walk through the "Missing Message Problem Solver" in the ZMQ guide?
Note that when using the PUB/SUB pattern, there is a slow-joiner syndrome that always loses some messages. The syndrome can be eliminated if we connect in the SUB and bind in the PUB; however, with multiple publishers, the subscriber then needs to connect to all of them.
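For reference, here is a minimal sketch of that reversed arrangement, with each publisher binding on its own machine and the single subscriber connecting to every publisher. The publisher addresses 192.168.1.50 and 192.168.1.51 are made up for illustration.
Publisher code (each publisher binds):
import zmq
context = zmq.Context()
pub_socket = context.socket(zmq.PUB)
pub_socket.bind("tcp://*:5556")
Subscriber code (connects to every publisher):
import zmq
context = zmq.Context()
sub_socket = context.socket(zmq.SUB)
sub_socket.setsockopt(zmq.SUBSCRIBE, b"")
sub_socket.connect("tcp://192.168.1.50:5556")  # publisher 1
sub_socket.connect("tcp://192.168.1.51:5556")  # publisher 2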

Thanks guys for your suggestions.
Firewalls were indeed disabled, but I finally found a PC which can receive from both publishers. It seems to be a problem related to the ZMQ versions installed on each computer: the senders had v2.2 whilst the receiver had v2.1. It's weird because I thought that the zmq protocol was version agnostic. Need to remember this for next time.
Thanks again!

The protocol should work between 2.1 and 2.2, but it got broken by 3.1. In 3.2 we fixed things to work with older versions again.


Receiving zmq messages on background thread fails on Windows

I'm trying to set up a hello world style example of asynchronous communication between two peers with zmq.PAIR by receiving messages on a background thread while using console input to send messages:
server.py:
import zmq
import threading

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.bind('tcp://*:5556')

def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from client: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
client.py:
import zmq
import threading

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.connect('tcp://127.0.0.1:5556')

def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from server: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
This works completely fine on a Linux machine but socket.send_string blocks in either process when run from the Windows 10 command prompt. What is the reason for this discrepancy?
The socket is set up properly, and flushing all outputs makes no difference. The reading itself also works as expected, as may be verified by navigating to 127.0.0.1:5556 in a browser. Looking at the loopback interface in Wireshark also reveals that the connection is set up properly, yet no messages are sent.
If I comment out recv_thread.start() in the client, however, messages are sent through as may be verified in Wireshark, which suggests that somehow socket.recv_string is blocking the socket from sending even though it isn't doing so on Linux.
I am also able to achieve the desired behavior by using two sets of PUSH/PULL (cf. this answer) but that doesn't quite help explain what's going on in the example at hand.
This is on Python 3.7.1, pyzmq 18.0.0, and libzmq 4.3.1 on both systems.
zmq sockets are not threadsafe, so running send and recv on the same socket in different threads should not be expected to work. Different threading behaviors on different platforms may be responsible for the difference in behavior you are seeing, but this code could also result in segfaults eventually due to the thread-unsafety of zmq sockets. Using a Lock might solve the problem.
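For example, here is a minimal sketch of the client with both threads sharing the socket through a Lock; the receive loop polls with a short timeout instead of blocking inside recv_string(), so the lock is never held while waiting and the main thread can still send:
import threading
import zmq

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.connect('tcp://127.0.0.1:5556')

lock = threading.Lock()

def print_incoming_messages():
    while True:
        with lock:
            # only receive if a message is already waiting, so the lock
            # is never held while blocked in recv_string()
            if socket.poll(100):  # timeout in milliseconds
                msg = socket.recv_string(zmq.NOBLOCK)
                print(f'Message from server: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages, daemon=True)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    with lock:
        socket.send_string(msg)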
As a side note, PAIR is a rarely-used socket type, and not often intended for use in production or inter-process communication. Most real-world instances of PAIR are as inproc sockets for inter-thread communication. PAIR can have weird behavior on reconnect, for example. Using PUSH-PULL for one-way or DEALER-DEALER for two-way communication is likely to behave in a more expected fashion.

How to use inproc transport with pyzmq?

I have set up two small scripts imitating a publish and subscribe procedure with pyzmq. However, I am unable to send messages over to my subscriber client using the inproc transport. I am able to use tcp://127.0.0.1:8080 fine, just not inproc.
pub_server.py
import zmq
import random
import sys
import time

context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind("inproc://stream")

while True:
    socket.send_string("Hello")
    time.sleep(1)
sub_client.py
import sys
import zmq

# Socket to talk to server
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.setsockopt_string(zmq.SUBSCRIBE, '')
socket.connect("inproc://stream")

for x in range(5):
    string = socket.recv()
    print(string)
How can I successfully alter my code so that I'm able to use the inproc transport method between my two scripts?
EDIT:
I have updated my code to further reflect @larsks' comment. I am still not receiving my published string - what is it that I am doing wrong?
import threading
import zmq

def pub():
    context = zmq.Context()
    sender = context.socket(zmq.PUB)
    sender.connect("inproc://hello")
    lock = threading.RLock()
    with lock:
        sender.send(b"")

def sub():
    context = zmq.Context()
    receiver = context.socket(zmq.SUB)
    receiver.bind("inproc://hello")
    pub()
    # Wait for signal
    string = receiver.recv()
    print(string)
    print("Test successful!")
    receiver.close()

if __name__ == "__main__":
    sub()
As the name implies, inproc sockets can only be used within the same process. If you were to rewrite your client and server such that there were two threads in the same process you could use inproc, but otherwise this socket type simply isn't suitable for what you're doing.
The documentation is very clear on this point:
The in-process transport passes messages via memory directly between threads sharing a single ØMQ context.
Update
Taking a look at the updated code, the problem that stands out first is that while the documentation quoted above says "...between threads sharing a single ØMQ context", you are creating two contexts in your code. Typically, you will only call zmq.Context() once in your program.
Next, you are never subscribing your subscriber to any messages, so even in the event that everything else was working correctly you would not actually receive any messages.
Lastly, your code is going to experience the slow joiner problem:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
The pub/sub model isn't meant for single messages, nor is it meant to be a reliable transport.
So, to sum up:
You need to create a shared ZMQ context before you create your sockets.
You probably want your publisher to publish in a loop instead of publishing a single message. Since you're trying to use inproc sockets, you're going to need to put your two functions into separate threads.
You need to set a subscription filter in order to receive messages.
There is an example using PAIR sockets in the ZMQ documentation that might provide a useful starting point. PAIR sockets are designed for coordinating threads over inproc sockets, and unlike pub/sub sockets they are bidirectional and are not impacted by the "slow joiner" issue.
As mentioned earlier by @larsks, the context object should be the same. Declare the context object globally and use it in both the pub and sub functions instead of creating a new one in each.
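Putting those points together, here is a minimal sketch (not the original code, just an illustration) with one shared context, a bind-before-connect inproc endpoint, a subscription filter, and the publisher running in a second thread:
import threading
import time
import zmq

context = zmq.Context()  # one shared context for both threads

def pub(sender):
    # publish repeatedly so the subscriber still gets a message if it joins late
    while True:
        sender.send(b"Hello")
        time.sleep(0.1)

def sub(receiver):
    string = receiver.recv()
    print(string)
    print("Test successful!")

if __name__ == "__main__":
    receiver = context.socket(zmq.SUB)
    receiver.setsockopt(zmq.SUBSCRIBE, b"")  # subscribe to everything
    receiver.bind("inproc://hello")          # bind before the publisher connects

    sender = context.socket(zmq.PUB)
    sender.connect("inproc://hello")

    threading.Thread(target=pub, args=(sender,), daemon=True).start()
    sub(receiver)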

If using sockets to pass sensitive data between two scripts in the same Tkinter app, over localhost, are there any security concerns?

I am implementing a socket in Python to pass data back and forth between two scripts running on the same machine as part of a single Tkinter application.
This data, in many cases, will be highly sensitive (i.e. personal credit card numbers).
Does passing the data between scripts in this way open me up to any security concerns?
Server side:
import socket

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(('localhost', 8089))
serversocket.listen(5)  # become a server socket, maximum 5 connections

while True:
    connection, address = serversocket.accept()
    buf = connection.recv(64)
    if len(buf) > 0:
        print buf
        break
Client side:
import socket
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(('localhost', 8089))
clientsocket.send('hello')
Code source.
Additional considerations:
This will only ever function as part of a single Tkinter application, on a single machine. Localhost will always be specified.
I am unable to use multiprocessing or threading; please no suggestions for using one of those or an alternative, other than varieties of socket. For more info as to why, see this SO question, answers, and comments. It has to do with this needing to function on Windows 7 and *nix, as well as my desired set-up.
Yes, passing the data between scripts in this way may raise security concerns. If an attacker has access to the same machine, they can easily sniff the traffic using a tool like tcpdump, for example.
To avoid this you should encrypt your traffic; I have posted a comment below your question with an example solution.
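For illustration only (this is not necessarily the solution from that comment), one way to encrypt the same localhost traffic is to wrap both sockets with the standard-library ssl module. The sketch below assumes Python 3 and a self-signed certificate pair (cert.pem / key.pem) generated ahead of time, e.g. with openssl req -x509 -newkey rsa:2048 -nodes -keyout key.pem -out cert.pem -days 365.
Server side:
import socket
import ssl

server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.load_cert_chain(certfile="cert.pem", keyfile="key.pem")

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(('localhost', 8089))
serversocket.listen(5)

connection, address = serversocket.accept()
tls_conn = server_ctx.wrap_socket(connection, server_side=True)  # TLS handshake happens here
print(tls_conn.recv(64))
Client side:
import socket
import ssl

client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client_ctx.check_hostname = False               # hostname checks add little on localhost
client_ctx.load_verify_locations("cert.pem")    # trust our own self-signed certificate

clientsocket = client_ctx.wrap_socket(
    socket.socket(socket.AF_INET, socket.SOCK_STREAM))
clientsocket.connect(('localhost', 8089))
clientsocket.send(b'hello')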

Client(1) to Server to Client(2) in Python 3.x (Sockets?)

I've been working on a project that involves sending information to a public server (to demonstrate how key-exchange schemes work) and then on to a specific client. There are only two clients.
I'm hoping to get pushed in the right direction on how to get information from client(1) to the server, and then have the server redirect that information to client(2). I've messed with the code somewhat and am getting comfortable with how to send and receive information from the server, but I have no idea (~2 hours of research so far) how to differentiate clients and send information to specific clients.
My current server code (pretty much unchanged from the python3 docs):
import socketserver

class MyTCPHandler(socketserver.BaseRequestHandler):
    """
    The RequestHandler class for our server.

    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # just send back the same data, but upper-cased
        self.request.sendall(self.data.upper())

if __name__ == "__main__":
    HOST, PORT = "localhost", 9999

    # Create the server, binding to localhost on port 9999
    server = socketserver.TCPServer((HOST, PORT), MyTCPHandler)

    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
My client code (pretty much unchanged from the python3 docs):
import socket
import time

data = "matt is ok"

def contactserver(data):
    HOST, PORT = "localhost", 9999

    # Create a socket (SOCK_STREAM means a TCP socket)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Connect to server and send data
    sock.connect((HOST, PORT))
    sock.sendall(bytes(data, "utf-8"))

    # Receive data from the server and shut down
    received = str(sock.recv(1024), "utf-8")

    print("Sent: {}".format(data))
    print("Received: {}".format(received))
    return format(received)

while True:
    k = contactserver('banana')
    time.sleep(1)
    print(k)
First, a base socketserver.TCPServer can't even talk to two clients at the same time. As the docs explain:
These four classes process requests synchronously; each request must be completed before the next request can be started.
As the same paragraph tells you, you can solve that problem by using a forking or threading mix-in. That's pretty easy.
But there's a bigger problem. A threaded socketserver server creates a separate, completely independent object for each connected client, and has no means of communicating between them, or even letting them find out about each other. So, what can you do?
You can always build it yourself. Put some kind of shared data somewhere, and some kind of synchronization on it, and all of the threads can talk to each other the same way any threads can, socketserver or otherwise.
For your design, a queue has all the magic built in for everything we need: client 1 can put a message on the queue (whether client 2 has shown up yet or not), and client 2 can get a message off the same queue (automatically waiting around if the message isn't there yet), and it's all automatically synchronized.
The big question is: how does the server know who's client 1 and who's client 2? Unless you want to switch based on address and port, or add some kind of "login" mechanism, the only rule I can think of is that whoever connects first is client 1, whoever connects second is client 2, and anyone who connects after that, who cares, they don't belong here. So, we can use a simple shared flag with a Lock on it.
Putting it all together:
import queue
import socketserver
import threading

class MyTCPHandler(socketserver.BaseRequestHandler):
    q = queue.Queue()
    got_first = False
    got_first_lock = threading.Lock()

    def handle(self):
        with MyTCPHandler.got_first_lock:
            if MyTCPHandler.got_first:
                first = False
            else:
                first = True
                MyTCPHandler.got_first = True
        if first:
            self.data = self.request.recv(1024).strip()
            print("{} wrote:".format(self.client_address[0]))
            print(self.data)
            # just send back the same data, but upper-cased
            self.request.sendall(self.data.upper())
            # and also queue it up for client 2
            MyTCPHandler.q.put(self.data)
        else:
            # get the message off the queue, waiting if necessary
            self.data = MyTCPHandler.q.get()
            self.request.sendall(self.data)

# the threading mix-in belongs on the server class, so each client
# connection is handled in its own thread
class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    pass
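To wire that up, the startup block from the question's server code would then use the threaded server class instead of the plain TCPServer (a small sketch mirroring the question's __main__ block):
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    server = ThreadedTCPServer((HOST, PORT), MyTCPHandler)
    server.serve_forever()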
If you want to build a more complicated chat server, where everyone talks to everyone… well, that gets a bit more complicated, and you're stretching socketserver even farther beyond its intended limits.
I would suggest either (a) dropping to a lower level and writing a threaded or multiplexing server manually, or (b) going to a higher-level, more-powerful framework that can more easily handle interdependent clients.
The stdlib comes with a few alternatives for writing servers, but all of them suck except for asyncio—which is great, but unfortunately brand new (it requires 3.4, which is still in beta, or can be installed as a back-port for 3.3). If you don't want to skate on the bleeding edge, there are some great third-party choices like twisted or gevent. All of these options have a higher learning curve than socketserver, but that's only to be expected from something much more flexible and powerful.

Twisted Python: multicast server not working as expected

I am experimenting with twisted python's multicast protocol. This is a simple example:
I created two servers, listening on 224.0.0.1 and 224.0.0.2 like below:
from twisted.internet.protocol import DatagramProtocol
from twisted.internet import reactor
from twisted.application.internet import MulticastServer

class MulticastServerUDP(DatagramProtocol):
    def __init__(self, group, name):
        self.group = group
        self.name = name

    def startProtocol(self):
        print '%s Started Listening' % self.group
        # Join a specific multicast group, which is the IP we will respond to
        self.transport.joinGroup(self.group)

    def datagramReceived(self, datagram, address):
        print "%s Received:" % self.name + repr(datagram) + repr(address)

reactor.listenMulticast(10222, MulticastServerUDP('224.0.0.1', 'SERVER1'), listenMultiple=True)
reactor.listenMulticast(10222, MulticastServerUDP('224.0.0.2', 'SERVER2'), listenMultiple=True)
reactor.run()
Then I run this code to send "HELLO":
import socket
MCAST_GRP = '224.0.0.1'
MCAST_PORT = 10222
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
sock.sendto("HELLO", (MCAST_GRP, MCAST_PORT))
The results were quite confusing. There are several cases:
-When I set all group IP and MCAST_GRP to 224.0.0.1, both servers received the message (expected)
-When I set both servers' group IP to 224.0.0.1 and MCAST_GRP in the sending script to 224.0.0.2 (or something different from 224.0.0.1), neither server received the message (expected)
-When I set one server's group IP to 224.0.0.1 and the other 224.0.0.2, strange things happen. When I set MCAST_GRP to 224.0.0.1 or 224.0.0.2, I expected only ONE of the two servers to receive the message. The result was that BOTH servers received the message. I am not sure what is going on. Can someone explain this?
Note: I am running these on the same machine.
SL
It's a little tricky, indeed.
You must write it this way:
reactor.listenMulticast(
10222,
MulticastServerUDP('224.0.0.1', 'SERVER1'),
listenMultiple=True,
interface='224.0.0.1'
)
reactor.listenMulticast(
10222,
MulticastServerUDP('224.0.0.2', 'SERVER2'),
listenMultiple=True,
interface='224.0.0.2'
)
I had the same problem before and had to look at the source to figure it out. I managed to solve it thanks to my background in network programming in C.
Multicast is wacky and platform (Linux, Windows, OS X, etc) implementations of multicast are even wackier.
Twisted is just reflecting the platform's multicast behavior here, so this is only vaguely a Twisted-related question. Really, it's a multicast question and a platform question.
Here's a slightly educated guess as to what's going on.
Multicast works by having hosts subscribe to addresses. When a program running on the host joins a group (eg 224.0.0.1), the host makes a note of this locally and does some network operations (IGMP) to tell nearby hosts (probably via a router, but I'm fuzzy on the details of this part) that it is now interested in messages for that group.
In the ideal universe of the creators of multicast, that subscription propagates all the way through the internet. This is necessary so that whenever anyone anywhere on the internet sends a message to that group, whichever routers get their hands on it can deliver it to all the hosts that have subscribed to the group. This is supposed to be more efficient than broadcast because only hosts that have subscribed need the message delivered to them. Since routers are tracking subscriptions, they can skip sending the traffic down links that have no subscribed hosts.
In the real universe, multicast subscriptions usually don't get propagated very far (eg, they reach the first router, probably the one running your house LAN, and stop there).
So it may seem like all that information about the ideal universe is irrelevant to this scenario. However! My suspicion is that most of the people implementing multicast thought really, really hard about that first part and were pretty tired by the time they were done implementing it.
Once a message for a multicast group actually gets to a host, the host needs to deliver it to the programs that are actually interested in it. Here, I suspect, implementers were too tired to do the right thing. Instead, they did a variety of lazy, easy things (depending on your platform). For example, some of them just visited every single open socket on the system that was subscribed to a multicast group and delivered the message to them.
On other platforms, you'll sometimes find that a single multicast message is delivered to a single listening multicast socket more than once. And of course there's the popular issue of multicast messages never being delivered at all.
Enjoy your wacky times in multicast land!
