Dynamically connect to an endpoint from a ZeroMQ client

My client, built with pyzmq, connects to a service that provides it with the address it needs to connect to. It might do this several times, each time connecting to a different worker.
What I have created so far, based on the zguide, is a simple broker that accepts connections from clients on a frontend port, then connects to one of the workers and asks it a question (right now the answer is just a random choice of yes or no). If the reply is 'yes', my idea was to let the client know that that specific worker is ready and have it connect directly to the worker.
In the examples I have seen, clients mostly connect to a single server or broker once. What would be the best way to connect to an address given to me at runtime, potentially multiple times?

Related

python zmq many client to many server discovery message patterns

I've been struggling with this problem for a while, so I'm finally asking for some help from the experts.
Language: python
The problem/setup:
I have many clients: client[1], client[2], ... client[n]
I have many servers: server[1], server[2], ... server[n]
Each server can plug in to 5 external ws connections. At any time I may need to open [x] ws connections, maybe 2, maybe 32; the total number of ws connections I need, and thus the number of servers needed, is dynamic.
Each client may be using 1 ws connection from server[1], 1 ws connection from server[2], etc.
How I imagine the flow working
New client[1] is loaded, needing 2 ws feeds
New client[1] broadcasts a message [xpub/xsub?] to all servers saying, 'hey, I need these 2 ws connections, who has them?'
Server[1], which has the ws connections, replies to client[1] (and only that client): 'I've got what you're looking for, talk to me'
client[1] engages in req/rep communication with server[1] so that client[1] can use server[1]'s ws connection to make queries against it, e.g., 'hey, server[1] with access to ws[1], can you request [x]?' and server[1] replies to client[1]: 'here's the reply from the ws request you made'
tldr
clients will be having multiple req/rep with many servers
servers will be dealing with many clients
clients need to broadcast/find the appropriate servers to be messaging with

The Zyre protocol is specifically designed for brokerless "gossip" discovery. Pyre (https://github.com/zeromq/pyre) is the Python implementation. It provides mechanisms for nodes to join a "discovery group" and share information. Among other things, it allows group members to WHISPER to individual members or SHOUT (multicast) to all members.
Zyre uses UDP broadcast beacons to initiate contact, so it is generally limited to a single subnet (UDP broadcast is generally not forwarded beyond a subnet). However, you could bridge a group across different subnets via your server(s) in each subnet (see below).
You could use zyre to distribute topology information (in this case, your server list) to your clients.
I have only played around with pyre a little, so I may not have all the details exactly right, but I would try to set it up like this:
- Define a Zyre group.
- Each server...
  - Joins the group.
  - Sets its server address (ip or fqdn, and maybe port) in its beacon header.
- Each client...
  - Joins the group.
  - Reads server address from the HELLO messages it receives from servers.
  - Makes REQ connections to server(s).
  - Adds/removes server connections based on HELLO/LEAVE/AVOID messages received over time.
If servers are not in the same subnet (e.g., maybe they are in different AWS availability zones), you could preconfigure the servers to know what all the server IPs are, periodically verify that they are up (via REQ/REP or PUB/SUB between the servers), and SHOUT active-servers information to the local group. Clients could use this information to inform/adjust their list of active servers/connections.
I've thought about doing exactly the above, but it unfortunately hasn't risen above other priorities in the backlog, so I haven't gotten past the above level of detail.

I'll focus on the discovery problem. How do clients know which servers are available and which ws connections each one has?
One approach is to add a third type of node; call it the broker. There is a single broker, and all clients and servers know how to reach it (e.g., all clients and servers are configured with the broker's IP or hostname).
When a server starts it registers itself with the broker: "I have ws feeds x,y,z and accept requests on 1.2.3.5:1234". The broker tracks this state, maybe in a hash table.
When a client needs ws feed y, it first contacts the broker: "Which server has ws feed y?" If the broker knows who has feed y, it gives the client the server's IP and port. The client can then contact the server directly. (If multiple servers can access feed y, the broker could return a list of servers instead of a single one.)
If servers run for a "long" time, clients can cache the "server X has feed y" information and only talk to the broker when they need to access a new feed.
With this design, clients use the broker to find servers of interest. Servers don't have to know anything about clients at all. And the "real" traffic (clients accessing feeds via servers) is still done directly between clients and servers - no broker involved.
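The broker's bookkeeping described above could look roughly like the sketch below. It only covers the hash-table state, not the transport (ZeroMQ or otherwise), and the `FeedRegistry` class and its method names are made up for illustration.

```python
from collections import defaultdict


class FeedRegistry:
    """In-memory state a broker could keep: which servers serve which feeds."""

    def __init__(self):
        # feed name -> set of "host:port" addresses that can serve it
        self._servers_by_feed = defaultdict(set)

    def register(self, address, feeds):
        """Called when a server announces itself and the feeds it holds."""
        for feed in feeds:
            self._servers_by_feed[feed].add(address)

    def unregister(self, address):
        """Forget a server, e.g. after it shuts down or stops responding."""
        for addresses in self._servers_by_feed.values():
            addresses.discard(address)

    def lookup(self, feed):
        """Return a sorted list of addresses serving the feed (may be empty)."""
        return sorted(self._servers_by_feed.get(feed, ()))


registry = FeedRegistry()
registry.register("1.2.3.5:1234", ["x", "y", "z"])
registry.register("1.2.3.6:1234", ["y"])

servers_for_y = registry.lookup("y")  # both servers can serve feed y
servers_for_q = registry.lookup("q")  # nobody has registered feed q
```

A client would send the feed name to the broker, get back the list from `lookup`, and then connect directly to one of the returned addresses.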
HTH. And for the record I am definitely not an expert.

Why are redis pub and sub considered different clients when only one connection is opened?

Why is it that, even when only one Redis connection instance is created, every call to publish or subscribe on that instance is counted as another client? When I connect to redis using python
import redis
redis_server = redis.Redis()
it is not recognized as a new client. Only when I call one of these
redis_server.publish("channel", message)
redis_server.subscribe("channel")
can I see that there are 2 clients connected. Are the pub/sub clients treated separately in redis? Why isn't the client registered as connected when the new connection is opened?

By default redis-py gives you a connection pool with a maximum number of connections. Only when you issue the first command is a real connection made, and that is when you'll see it appear in the CLIENT LIST on the server.
Whenever any client library for Redis issues a subscribe command, that connection is dedicated to the subscription from then on, so redis-py is probably creating a separate connection for it.
This should explain why you see no clients connected, then 2. It's not necessarily 1 connection for every command issued, as the connections in the pool will be reused.

Only allow connections from custom clients

I'm writing a Socket Server in Python, and also a Socket Client to connect to the Server.
The Client interacts with the Server in a way that the Client sends information when an action is invoked, and the Server processes the information.
The problem I'm having is that I am able to connect to my Server with Telnet, and probably with other tools that I haven't tried yet. I want to disallow connections from these other Clients and only allow connections from Python Clients (preferably only my custom-made client, as it sends the information needed to communicate).
Is there a way I could set up authentication on connection to differentiate Python Clients from others?
Currently there is no code, as this is a problem I want to be able to solve before getting my hands dirty.

When a new connection is made to your server, your protocol will have to specify some way for the client to authenticate. Ultimately there is nothing that the network infrastructure can do to determine what sort of process initiated the connection, so you will have to specify some exchange that allows the server to be sure that it really is talking to a valid client process.

@holdenweb has already given a good answer with basic info.
If a (terminal) software sends the bytes that your application expects as a valid identification, your app will never know whether it talks to an original client or anything else.
A possible way to test for valid clients is for your server to send an encrypted and authenticated question (which should be different for each test!), e.g. something like "what is 18:37:12 (current date and time) plus 2 (random) hours?"
Encryption/Authentication would be another issue then.
If you keep this algorithm secret, only your clients can answer it and validate themselves successfully. It can be hacked/reverse engineered, but it is safe against basic attackers.
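A minimal sketch of such a question-and-answer exchange, using only the standard library. Note that it swaps the "secret algorithm" for a standard technique: an HMAC challenge-response over a shared secret. The secret value and the function names are assumptions made for illustration, and the socket handling around it is left out.

```python
import hashlib
import hmac
import secrets

# Assumption: the same secret is shipped with the server and the custom client.
SHARED_SECRET = b"replace-with-a-long-random-secret"


def make_challenge():
    """Server side: a fresh random challenge for every connection attempt."""
    return secrets.token_bytes(32)


def answer_challenge(challenge, secret=SHARED_SECRET):
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify_answer(challenge, answer, secret=SHARED_SECRET):
    """Server side: recompute the expected answer, compare in constant time."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, answer)


challenge = make_challenge()
genuine_ok = verify_answer(challenge, answer_challenge(challenge))
telnet_ok = verify_answer(challenge, b"hello from telnet")
```

The server would send `challenge` right after accepting the connection and close the socket if `verify_answer` returns False. As with the secret algorithm, anyone who extracts the secret from the client can still pass the check.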

Python - Multiple client servers for scaling

For my current setup, I have a single client server using Tornado, a standalone database server and another standalone server for my website.
I'm looking at having a second client server process running on the same system (to take advantage of its multiple cores) and I would like some advice in locating which server my "clients" have connected to. Each client can have multiple connections (instances).
I've already looked at using memcached to hold a list of user identifiers and link them to the server(s) they are connected to, but that doesn't seem like it would scale very well (e.g., with six digits of connected users).
I see the same issue with database lookups.
I have already optimized my server as much as possible without going into micro-optimization, which I personally frown upon.
Current server methodology:
On connect:
  - Accept connection, rate limit for max connections per IP.
  - Append client instance to a list named "clientList".
On data from client:
  - Rate limit for max messages per second.
  - Append data to a client work queue.
  - If the client already has a thread dedicated to its work queue, return; that thread will work through the queue.
  - Otherwise, create a new thread for this user's work queue and start it.
TLDR:
How do I efficiently store which servers a client has connected to, in order to forward messages to that user?

Python server to receive specific data from two clients on same remote device

I was able to set up a simple socket server and client connection between two devices, with the ability to send and receive values. My issue is with setting up the remote server to accept two clients from the same device and to differentiate the data received from each of them.
Specifically, each client will be running similar code to read encoder/decoder values from its respective motor. My main program, attached to the server, needs to use the data from each client separately in order to carry out the appropriate calculations. How do I differentiate the incoming signals coming from the two clients?

When the communication between clients and server isn't heavy, one way to do this is to have each client do a handshake with the server, and have the server enumerate the clients and send back IDs to use for communication.
Each client then sends its ID along with every message it sends to the server, so that the server can identify it. At least that is what I did.
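A rough sketch of that handshake-and-ID scheme, with the socket layer left out so it stays short. The `ClientRegistry` class, the JSON message shape (`{"id": ..., "value": ...}`) and the motor names are assumptions for illustration, not what I actually ran.

```python
import itertools
import json


class ClientRegistry:
    """Hands out IDs at handshake time and files incoming data under them."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.readings = {}  # client ID -> list of values received so far

    def handshake(self):
        """Register a new client and return the ID it must send from now on."""
        client_id = next(self._next_id)
        self.readings[client_id] = []
        return client_id

    def handle(self, raw_message):
        """Route one JSON message of the form {"id": ..., "value": ...}."""
        message = json.loads(raw_message)
        client_id = message["id"]
        if client_id not in self.readings:
            raise KeyError(f"unknown client ID: {client_id}")
        self.readings[client_id].append(message["value"])


server = ClientRegistry()
left_motor = server.handshake()
right_motor = server.handshake()

# Each client tags its encoder value with the ID it was given.
server.handle(json.dumps({"id": left_motor, "value": 120}))
server.handle(json.dumps({"id": right_motor, "value": 95}))
server.handle(json.dumps({"id": left_motor, "value": 124}))
```

The main program can then read `server.readings[left_motor]` and `server.readings[right_motor]` separately, even though both clients run on the same device.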
