PyZmq ensure connect() after bind()


To establish communication between two Python processes, I've come to use pyzmq. Since the communication is simple enough, I'm using the zmq.PAIR messaging pattern with a tcp socket. Basically, one process binds to an address and the other one connects to the same address. However, both operations happen at startup, and since I cannot control the order in which the processes start, I often hit the case in which connect() is called before bind(), and communication is then never established.
Is there a way to know that a socket is not yet ready to be connected to?
What strategies can be employed in such situations to obtain a safe connection?

Put a short sleep before connecting, so that bind() runs first and connect() proceeds only after waiting for some time.
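A variant of the sleep suggestion is to poll instead of sleeping for a fixed time: keep trying a plain TCP connection to the address until something is listening there, and only then call connect() on the real socket. The sketch below uses only the standard socket module; the helper name wait_for_port and its parameters are made up for illustration. It assumes the binding side tolerates a short-lived probe connection that is opened and closed immediately, which has not been verified here for a zmq.PAIR socket.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.1, connect_timeout=1.0):
    """Return True once a TCP connection to (host, port) succeeds,
    or False if `timeout` seconds pass without success."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)  # back off briefly before the next probe
```

The connecting process would call something like wait_for_port("127.0.0.1", 5555) (address and port are placeholders) and only then connect its own socket to tcp://127.0.0.1:5555.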

Related

Is it safe to run server.accept() constantly with sockets?

Right now I'm building a server-client program using TCP in Python with the sockets module. Having looked all over the internet, it has become apparent that a conn, addr = server.accept() line is required in the server code, however there is no way for the server to know when the client will connect. It could be from seconds to minutes after the server is run.
So my question is this: can I use threading to constantly run a server.accept() line of code so any client that chooses to connect can? Or could this lead to something malicious connecting?

As per Can 'connect' call on socket return successfully without server calling 'accept'? ,
TCP establishes the connection - the 3-way handshake - under the
covers and puts it in a completed connection queue when it is ready.
Accept() returns the next waiting connection from the front of this
queue.
From the client's perspective it is "connected" but it won't be
talking to anyone until the server accepts and begins processing. Sort
of like when you call a company and are immediately put in the hold
queue. You are "connected" but no business is going to be done until
someone actually picks up and starts talking.
So, you won't "miss" connections if you're not doing that. But accept() is typically run in an infinite loop anyway, in the main thread or otherwise, because servicing clients is the server's primary job.
According to Is accept() thread-safe? , accept() is thread-safe, you can very well have it running in a separate thread, or even have multiple accept() calls in different threads (or even different processes in OSes with fork) at the same time.
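The accept loop described above could look roughly like the following. This is a minimal sketch with the standard socket and threading modules; the function names and the echo behaviour are illustrative only, and port=0 simply asks the OS for any free port.

```python
import socket
import threading


def handle_client(conn, addr):
    """Echo one message back to the client, then close the connection."""
    with conn:
        data = conn.recv(1024)
        if data:
            conn.sendall(data)


def serve_forever(server):
    """Accept clients in a loop; each one gets its own handler thread."""
    while True:
        try:
            conn, addr = server.accept()  # blocks until a client connects
        except OSError:
            break  # accept() failed, e.g. the listening socket was closed
        threading.Thread(
            target=handle_client, args=(conn, addr), daemon=True
        ).start()


def start_server(host="127.0.0.1", port=0):
    """Bind, listen, and run the accept loop in a background thread."""
    server = socket.create_server((host, port))
    threading.Thread(target=serve_forever, args=(server,), daemon=True).start()
    return server
```

Anyone who can reach the port can connect, so whether a connection is trusted has to be decided in the handler (or by binding to a restricted address), not by when accept() is called.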

Moving a bound ZMQ socket to another process

I have a Python program which spawns several other Python programs as subprocesses. One of these subprocesses is supposed to open and bind a ZMQ publisher socket, such that other subprocesses can subscribe to it.
I cannot give guarantees about which tcp ports will be available, so when I bind to a random port in the subprocess, my main program will not know what to tell the other subprocesses.
Is there a way to bind the socket in the main process and then somehow pass the socket to my subprocess? Or either some other way to preregister the socket or a standard way to pass the port information from the subprocess back to my main process (stdout and stderr are already used by other data)?
Just checking for a free port in the main process and passing that to the subprocess is not really optimal, because this could still fail if the socket is being assigned in the meantime. Also, since my program should work on Unix and Windows, I cannot really use ipc sockets, which would otherwise solve my problem.

The simplest is to create a logic for a pool-of-ports manager ( rather avoid attempts to share / pass ZeroMQ sockets to / among other processes )
One may create a persistent .bind() access point at an address known in advance, for example tcp://A.B.C.D:8765, and expose it to all client processes as a port-assignment service. Client processes .connect() to it, handshake in whatever manner is needed to prove identity, credentials, purpose, etc., and .recv(), in a coordinated manner, one actually free messaging/signalling port number. That port is guaranteed system-wide not to be in use at that moment, and it stays reserved until it is returned to the port manager. The rotating pool of ports is thus managed centrally, under your code's control, while every socket is still created locally in the process or thread that .connect()s or .bind()s to the port number announced by the pool manager. The sockets themselves therefore remain, as they should, unshared, consistent with the ZeroMQ advice.
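The bookkeeping side of such a pool manager might be sketched as below. This is only the in-memory lease logic, under stated assumptions: the class name PortPool and the port range are hypothetical, the network front end (the .bind() service that clients talk to) is left out, and the pool can only guarantee uniqueness among processes that ask it, not against unrelated programs grabbing the same port.

```python
import threading
from collections import deque


class PortPool:
    """Hands out port numbers from a fixed range, one holder at a time."""

    def __init__(self, first_port, last_port):
        self._free = deque(range(first_port, last_port + 1))
        self._leased = set()
        self._lock = threading.Lock()

    def acquire(self):
        """Return a port no other pool client holds, or None if exhausted."""
        with self._lock:
            if not self._free:
                return None
            port = self._free.popleft()
            self._leased.add(port)
            return port

    def release(self, port):
        """Put a leased port at the back of the rotation; ignore unknown ports."""
        with self._lock:
            if port in self._leased:
                self._leased.remove(port)
                self._free.append(port)
```

The manager process would call acquire() when a client asks for a port and release() when the client reports it is done with it.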

How to abruptly disconnect a socket without closing it appropriately

I have a Python test program for testing features of another software component, let's call the latter the component under test (COT).
The Python test program is connected to the COT via a persistent TCP connection.
The Python program is using the Python socket API for this.
Now in order to simulate a failure of the physical link, I'd like to have the Python program shut the socket down, but without disconnecting appropriately.
I.e. I don't want anything to be sent on the TCP channel any more, including any TCP SYN/ACK/FIN. I just want the socket to go silent. It must not respond to the remote packets any more.
This is not as easy as it seems, since calling close on a socket will send TCP FIN packets to the remote end. (graceful disconnection).
So how can I kill the socket without sending any packets out?
I cannot shut down the Python program itself, because it needs to maintain other connections to other components.
For information, the socket runs in a separate thread. So I thought of abruptly killing the thread, but this is also not so easy. (Is there any way to kill a Thread?)
Any ideas?

You can't do that from a userland process, since the in-kernel network stack still holds the resources and state related to the given TCP connection. Even if you kill your whole process, the kernel is going to send a FIN to the other side, since it knows what file descriptors your process had and will try to clean them up properly.
One way to get around this is to engage firewall software (on local or intermediate machine). Call a script that tells the firewall to drop all packets from/to given IP and port (that of course would need appropriate administrative privileges).

Contrary to Nikolai's answer, there is indeed a way to reset the connection from userland such that an RST is sent and pending data discarded, rather than a FIN after all the pending data. However as it is more abused than used, I won't publish it here. And I don't know whether it can be done from Python. Setting one of the three possible SO_LINGER configurations and closing will do it. I won't say more than that, and I will say that this technique should only be used for the purpose outlined in the question.

How-To - Update Live Running Python Application

I have a Python application, more precisely a network application, that can't go down. I can't simply kill the PID, since it talks with other servers and clients: many € per minute of downtime, the usual 24/7 system.
In my hobby projects I also work a lot with WSGI frameworks, and I noticed that I have the same problem there, even during off-peak hours.
Imagine a normal server using TCP/UDP (put your favourite WSGI/SIP/Classified Information Server/etc. here).
Now you perform a git pull on the remote server and the new Python files land on the server. These files only affect the data processing and not the actual sockets, so there is no need to re-create the sockets or touch the network part in any way.
I don't usually use file monitors, since I prefer to use a signal to wake up the internal app updater.
Now imagine the following code:
from mysuper.app import handler
while True:
    data = socket.recv()
    if data:
        socket.send(handler(data))
Let's imagine that handler is an app with DB connections, cache connections, etc.
What is the best way to update the handler?
Is it safe to call reload(handler)?
Will this break DB connections?
Will DB connections survive this restart?
Will current transactions be lost?
Will this create anti-matter?
What best-practice patterns do you usually use, if there are any?

It's safe to call reload(handler).
Depends where you initialize your connections. If you make the connections inside handler(), then yes, they'll be garbage collected when the handler() object falls out of scope. But you wouldn't be connecting inside your main loop, would you? I'd highly recommend something like:
dbconnection = connect(...)
while True:
    ...
    socket.send(handler(data, dbconnection))
if for no other reason than that you won't be making an expensive connection inside a tight loop.
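One detail worth spelling out: reload() works on module objects, so after from mysuper.app import handler it is the module that has to be reloaded and the function looked up again. Below is a minimal sketch of that idea for Python 3 (importlib.reload), where the reload is only requested by a signal handler and actually performed between two requests. The class name ReloadableHandler and its attributes are made up for illustration. Objects owned by the main program, such as a dbconnection passed in as an argument, are not touched by the reload, whereas module-level code in the reloaded module runs again.

```python
import importlib


class ReloadableHandler:
    """Callable wrapper that re-imports its module when asked to."""

    def __init__(self, module_name, func_name="handler"):
        self._module = importlib.import_module(module_name)
        self._func_name = func_name
        self._pending = False

    def request_reload(self, signum=None, frame=None):
        # Cheap enough to use directly as a signal handler:
        # it only sets a flag, the real work happens in __call__.
        self._pending = True

    def __call__(self, *args, **kwargs):
        if self._pending:
            self._pending = False
            # Re-executes the module's source and returns the module object.
            self._module = importlib.reload(self._module)
        # Look the function up on every call so the new code is picked up.
        return getattr(self._module, self._func_name)(*args, **kwargs)
```

In the loop above this could be wired up as handler = ReloadableHandler("mysuper.app") followed by signal.signal(signal.SIGHUP, handler.request_reload) on platforms that have SIGHUP, so that a request already being processed always finishes with the old code.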
That said, I'd recommend going with an entirely different architecture. Make a listener process that does basically nothing more than listen for UDP datagrams, sends them to a messaging queue like RabbitMQ, then waits for the reply message to send the results back to the client. Then write your actual servers that get their requests from the messaging queue, process them, and send a reply message back.
If you want to upgrade the UDP server, launch the new instance listening on another port. Update your firewall rules to redirect incoming traffic to the new port. Reload the rules. Kill the old process. Voila: seamless cutover.
The real win is from uncoupling your backend. Since multiple processes can listen for the same messages from your frontend "proxy" service, you can run several in parallel - on different machines, if you want to. To upgrade the backend, start a new instance then kill the old one so that there's no time when at least one instance isn't running.
To scale your proxy, have multiple instances running on different ports or different hosts, and configure your firewall to randomly redirect incoming datagrams to one of the proxies.
To scale your backend, run more instances.

MySQLdb execute timeout

Sometimes in our production environment the connection between the service (a Python program that uses MySQLdb) and the MySQL server is flaky: some packets are lost, some black magic happens, and .execute() on a MySQLdb.Cursor object never returns (or takes a very long time to return).
This is very bad because it wastes service worker threads. Sometimes it exhausts the worker pool and the service stops responding altogether.
So the question is: is there a way to interrupt the .execute() operation after a given amount of time?

If the communication is such a problem, consider writing a 'proxy' that receives your SQL commands over the flaky connection and relays them to the MySQL server over a reliable channel (maybe running on the same box as the MySQL server). This way you have total control over failure detection and retrying.

You need to analyse exactly what the problem is. MySQL connections should eventually timeout if the server is gone; TCP keepalives are generally enabled. You may be able to tune the OS-level TCP timeouts.
If the database is "flaky", then you definitely need to investigate how. It seems unlikely that the database really is the problem, more likely that networking in between is.
If you are using (some) stateful firewalls of any kind, it's possible that they're losing some of the state, thus causing otherwise good long-lived connections to go dead.
You might want to consider changing the idle timeout parameter in MySQL; otherwise, a long-lived, unused connection may go "stale", where the server and client both think it's still alive, but some stateful network element in between has "forgotten" about the TCP connection. An application trying to use such a "stale" connection will have a long wait before receiving an error (but it should eventually).
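For reference, this is roughly what per-socket TCP keepalive tuning looks like in Python. It is shown on a plain socket from the standard library only; whether and how a given database driver lets you apply such options to its own connection is not covered here, and the default values in the signature are arbitrary placeholders. The three tuning constants are platform-specific (they exist on Linux), so the sketch applies each one only if the running Python defines it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive for `sock` and, where the platform exposes
    the options, tune how quickly a dead peer is detected."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:  # skip options this platform does not define
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With settings like these, a connection whose peer has silently disappeared is reported as broken after about idle + interval * probes seconds of inactivity instead of relying on system-wide defaults.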
