Let's say we have a server application written in Python.
Let's also say that this main server process forked two more processes at startup.
The server awaits its clients, and when one connects, it decides which of the two forked processes should receive the client's socket.
I do not want to fork a process each time a client connects; I want a fixed number of server processes, with one main server that accepts a connection and then passes it to the server that handles the specific kind of work the client asked for.
This is meant to provide DoS protection, job separation, and so on.
Is there any trick to pass a Python object between already-running Python programs?
Some shared memory or something like that?
Would pickling the socket object and pushing it through IPC work?
No. Inside that object is a file descriptor or handle to the kernel socket. It's just a number that the process uses to identify the socket when making system calls.
If you pickle that Python socket object and send it to another process, that process will be using a handle for a socket it didn't open. Or worse, that handle may refer to a different open file.
The most efficient way to handle this (on Linux) is like this:
1) The master process opens a listening socket (e.g. TCP port 80)
2) The master process forks N children, who all inherit that open socket
3) They all call accept() and block, waiting for a new connection
4) When a new client connects, the kernel selects one of the processes holding a handle to that socket to accept the connection; the others continue to wait
This way, you let the kernel handle the load balancing.
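A minimal sketch of that pattern (the port, worker count, and response are illustrative; error handling is omitted):

import os
import socket

NUM_WORKERS = 2  # illustrative

# The master opens the listening socket before forking.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8080))
listener.listen(128)

for _ in range(NUM_WORKERS):
    if os.fork() == 0:
        # Child: inherits the listening socket and accepts on it directly.
        while True:
            conn, addr = listener.accept()
            with conn:
                conn.sendall(b"handled by pid %d\n" % os.getpid())

# Master: just wait for the children.
for _ in range(NUM_WORKERS):
    os.wait()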
If you don't want this behavior, there is a way (on UNIX) to pass an open socket to another process. Again, this is more than just passing the handle; the kernel effectively copies the open socket into the receiving process's open file list. This mechanism is known as SCM_RIGHTS, and you can see an example (in C) here:
http://man7.org/tlpi/code/online/dist/sockets/scm_rights_send.c.html
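In Python 3.9+, the socket module wraps this mechanism as socket.send_fds() and socket.recv_fds(); a rough sketch, assuming the master and worker are already connected by a Unix domain socket chan:

import socket

# Master side: hand the accepted client socket's descriptor to a worker.
def send_client_socket(chan, client_sock):
    # At least one byte of ordinary payload is sent alongside the ancillary fd data.
    socket.send_fds(chan, [b"x"], [client_sock.fileno()])

# Worker side: rebuild a usable socket object from the received descriptor.
def recv_client_socket(chan):
    msg, fds, flags, addr = socket.recv_fds(chan, 1, 1)
    return socket.socket(fileno=fds[0])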
Otherwise, your master process will need to effectively proxy the connection to the child processes, reducing the efficiency of the system.
I would like to have a server process (preferably Python) that accepts simple messages and multiple clients (again, preferably Python) that connect to the server and send messages to it. The server and clients will only ever be running on the same local machine and the OS is Linux based. The server will be automatically started by the OS and the clients started later independent of the server. I strongly want to avoid installing a whole separate messaging framework/server to do this. The messages will be simple strings such as "kick" or even just a single byte representing the message type. It also needs to know when a connection is made and lost.
From these requirements, I think named pipes would be a feasible solution, with a new instance of that pipe created for each client connection. However, when I search for examples, all of the ones I have come across deal with processes that are spawned from the same parent process and not independently started which means they can pass a parent reference to the child.
Windows seems to allow multiple instances of a named pipe (one for each client connection), but is this possible on a Linux-based OS?
Please could someone point me in the right direction, preferably with a basic example, even if it's just pseudo-code.
I've looked at the multiprocessing module in Python, but this seems to be oriented around the server and client sharing the same process or having one spawn the other.
Edit
May be important, the host device is not guaranteed to have networking capabilities (embedded device).
I've used zeromq for this sort of thing before. It's a relatively lightweight library that exposes exactly this sort of functionality.
Otherwise, you could implement it yourself by binding a socket in the server process and having clients connect to it. This works fine with Unix domain sockets; just pass AF_UNIX when creating the socket, e.g.:
import socket

with socket.socket(socket.AF_UNIX) as s:
    s.bind('/tmp/srv')
    s.listen(1)
    (c, addr) = s.accept()
    with c:
        c.send(b"hello world")
for the server, and:
with socket.socket(socket.AF_UNIX) as c:
    c.connect('/tmp/srv')
    print(c.recv(8192))
for the client.
Writing a protocol on top of this is more involved, which is where something like zmq really helps: it lets you push JSON messages around easily.
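For comparison, a rough sketch of the same exchange using pyzmq over its ipc transport (the socket path and message are illustrative; server and client are shown together for brevity but would run in separate processes):

import zmq

# Server: reply socket bound to a Unix-domain path via the ipc transport.
ctx = zmq.Context()
server = ctx.socket(zmq.REP)
server.bind("ipc:///tmp/srv.ipc")
msg = server.recv()          # e.g. b"kick"
server.send(b"ok")

# Client: connects to the same path and sends a one-shot message.
client = ctx.socket(zmq.REQ)
client.connect("ipc:///tmp/srv.ipc")
client.send(b"kick")
print(client.recv())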
I have a Python program which spawns several other Python programs as subprocesses. One of these subprocesses is supposed to open and bind a ZMQ publisher socket, such that other subprocesses can subscribe to it.
I cannot give guarantees about which tcp ports will be available, so when I bind to a random port in the subprocess, my main program will not know what to tell the other subprocesses.
Is there a way to bind the socket in the main process and then somehow pass the socket to my subprocess? Or is there some other way to preregister the socket, or a standard way to pass the port information from the subprocess back to my main process (stdout and stderr are already used for other data)?
Just checking for a free port in the main process and passing that to the subprocess is not really optimal, because the port could still be taken by something else in the meantime. Also, since my program should work on both Unix and Windows, I cannot really use ipc sockets, which would otherwise solve my problem.
The simplest approach is to implement a pool-of-ports manager (and to avoid trying to share or pass ZeroMQ sockets to or among other processes).
One may create a persistent, a-priori known, tcp://A.B.C.D:8765-based .bind() access point, exposed to all client processes as a port-assignment service. Client processes .connect() to it, handshake in whatever manner is needed to prove identity/credentials/purpose, and then .recv() in a coordinated manner one actually free messaging/signalling port number that is guaranteed, system-wide, not to be in use at that moment (and until it is returned to the port manager). A rotating pool of ports is thus centrally managed, under your code's control, while all the sockets are created locally in the distributed process(es)/thread(s) that .connect() / .bind() to the pool-manager-announced port number; the socket objects themselves still remain, and ought to remain, unshared, consistent with ZeroMQ's advice.
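A minimal sketch of such a port-assignment service, assuming pyzmq, an illustrative well-known port 8765, and no authentication:

import zmq

FREE_PORTS = list(range(49152, 49162))   # illustrative pool of ports to lease out
IN_USE = set()

ctx = zmq.Context()
manager = ctx.socket(zmq.REP)
manager.bind("tcp://*:8765")             # the a-priori known access point

while True:
    request = manager.recv_string()
    if request == "LEASE" and FREE_PORTS:
        port = FREE_PORTS.pop()
        IN_USE.add(port)
        manager.send_string(str(port))
    elif request.startswith("RELEASE "):
        port = int(request.split()[1])
        IN_USE.discard(port)
        FREE_PORTS.append(port)
        manager.send_string("OK")
    else:
        manager.send_string("ERROR")

A subprocess would then .connect() to tcp://localhost:8765, request a lease, bind its PUB socket on the returned port, and release the port when it shuts down.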
I've come to the realization that I need to change my design for a file synchronization program I am writing.
Currently, my program goes as follows:
1) client connects to server (and is verified)
2) if the client is verified, create a thread and begin a loop using the socket the client connected with
3) if a file on the client or server changes, send the change through that socket (using select for asynchronous communication)
My code sucks because I am torn between using one socket for all file transfers and using a socket for each file transfer. Either case (in my opinion) will work, but for the first case I would have to create some sort of protocol to determine which bytes go where (some sort of header), and for the second case I would have to create new sockets on a new thread (that do not need to be verified again), so that files can be sent on each thread without worrying about asynchronous transfer.
I would prefer the second option, so I'm investigating using SocketServer. Would this kind of problem be solved with SocketServer.ThreadingTCPServer and SocketServer.ThreadingMixIn? I'm having trouble thinking about it because I would assume SocketServer.ThreadingMixIn works for newly connected clients, unless I somehow have an "outer" socket server which serves "inner" socket servers?
SocketServer will work for you. You create one SocketServer per port you want to listen on. Your choice is whether you have one listener that handles the client/server connection plus per file connections (you'd need some sort of header to tell the difference) or two listeners that separate client/server connection and per file connections (you'd still need a header so that you knew which file was coming in).
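A rough sketch of the single-listener variant, using a one-byte type header plus a length prefix to tell control messages from file data (the header values and port are illustrative; in Python 3 the module is named socketserver):

import socketserver
import struct

MSG_CONTROL = 0   # illustrative message types
MSG_FILE = 1

class SyncHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Each request starts with a 1-byte type and a 4-byte big-endian length.
        header = self.rfile.read(5)
        if len(header) < 5:
            return
        msg_type, length = struct.unpack("!BI", header)
        payload = self.rfile.read(length)
        if msg_type == MSG_FILE:
            self.handle_file(payload)
        else:
            self.handle_control(payload)

    def handle_file(self, payload):
        pass  # write the received file data somewhere

    def handle_control(self, payload):
        pass  # e.g. verification, change notifications

class ThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    allow_reuse_address = True

if __name__ == "__main__":
    ThreadedServer(("0.0.0.0", 9000), SyncHandler).serve_forever()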
Alternately, you could choose something like zeromq that provides a message transport for you.
I have a Python test program for testing features of another software component, let's call the latter the component under test (COT).
The Python test program is connected to the COT via a persistent TCP connection.
The Python program is using the Python socket API for this.
Now in order to simulate a failure of the physical link, I'd like to have the Python program shut the socket down, but without disconnecting appropriately.
I.e. I don't want anything to be sent on the TCP channel any more, including any TCP SYN/ACK/FIN. I just want the socket to go silent. It must not respond to the remote packets any more.
This is not as easy as it seems, since calling close() on a socket sends TCP FIN packets to the remote end (a graceful disconnection).
So how can I kill the socket without sending any packets out?
I cannot shut down the Python program itself, because it needs to maintain other connections to other components.
For information, the socket runs in a separate thread. So I thought of abruptly killing the thread, but this is also not so easy. (Is there any way to kill a Thread?)
Any ideas?
You can't do that from a userland process, since the in-kernel network stack still holds resources and state related to the TCP connection. Even if you kill your whole process, the kernel is going to send a FIN to the other side, because it knows which file descriptors your process had and will try to clean them up properly.
One way to get around this is to engage firewall software (on the local or an intermediate machine). Call a script that tells the firewall to drop all packets from/to the given IP and port (this of course requires appropriate administrative privileges).
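A rough sketch of that approach from Python, assuming a Linux host with iptables available and root privileges (the peer address and port are illustrative):

import subprocess

def drop_traffic(remote_ip, remote_port):
    # Drop outbound packets to the peer, including any FIN the kernel would send on close().
    subprocess.run(
        ["iptables", "-I", "OUTPUT", "-d", remote_ip,
         "-p", "tcp", "--dport", str(remote_port), "-j", "DROP"],
        check=True,
    )
    # Drop inbound packets from the peer, so the local stack never responds to them.
    subprocess.run(
        ["iptables", "-I", "INPUT", "-s", remote_ip,
         "-p", "tcp", "--sport", str(remote_port), "-j", "DROP"],
        check=True,
    )

drop_traffic("192.0.2.10", 5000)  # hypothetical peer of the connection under test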
Contrary to Nikolai's answer, there is indeed a way to reset the connection from userland such that an RST is sent and pending data is discarded, rather than a FIN after all the pending data. However, as it is more abused than used, I won't publish it here. And I don't know whether it can be done from Python. Setting one of the three possible SO_LINGER configurations and then closing will do it. I won't say more than that, other than that this technique should only be used for the purpose outlined in the question.
I have a Python application, to be more precise a network application, that can't go down. This means I can't kill the PID, since it actually talks with other servers, clients, and so on: many € per minute of downtime, you know, the usual 24/7 system.
Anyway, in my hobby projects I also work a lot with WSGI frameworks, and I've noticed that I have the same problem even during off-peak hours.
Anyway, imagine a normal server using TCP/UDP (put here your favourite WSGI/SIP/classified-information server, etc.).
Now you perform a git pull on the remote server and the new Python files land on the server (these files will of course ONLY affect the data processing and not the actual sockets, so there is no need to re-create the sockets or touch the network part in any way).
I don't usually use file monitors, since I prefer to use a SIGNAL to wake up the internal app updater.
Now imagine the following code:
from mysuper.app import handler
while True:
    data = socket.recv()
    if data:
        socket.send(handler(data))
Let's imagine that handler is an app with DB connections, cache connections, etc.
What is the best way to update the handler?
Is it safe to call reload(handler)?
Will this break DB connections?
Will DB connections survive this restart?
Will current transactions be lost?
Will this create anti-matter?
What are the best-practice patterns that you usually use, if there are any?
It's safe to call reload(handler).
It depends on where you initialize your connections. If you make the connections inside handler(), then yes, they'll be garbage collected when the handler() object falls out of scope. But you wouldn't be connecting inside your main loop, would you? I'd highly recommend something like:
dbconnection = connect(...)
while True:
    ...
    socket.send(handler(data, dbconnection))
if for no other reason than that you won't be making an expensive connection inside a tight loop.
That said, I'd recommend going with an entirely different architecture. Make a listener process that does basically nothing more than listen for UDP datagrams, send them to a message queue like RabbitMQ, and then wait for the reply message so it can send the results back to the client. Then write your actual servers so that they get their requests from the message queue, process them, and send a reply message back.
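A rough sketch of that frontend proxy, assuming RabbitMQ is reachable on localhost via the pika client (the UDP port and queue name are illustrative; relaying replies back to clients, e.g. via a reply queue and a correlation id, is omitted to keep the sketch short):

import socket

import pika

# Listen for UDP datagrams and push each one into a RabbitMQ queue.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.bind(("0.0.0.0", 5005))

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="requests")

while True:
    data, client_addr = udp.recvfrom(65535)
    channel.basic_publish(exchange="", routing_key="requests", body=data)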
If you want to upgrade the UDP server, launch the new instance listening on another port. Update your firewall rules to redirect incoming traffic to the new port. Reload the rules. Kill the old process. Voila: seamless cutover.
The real win is from decoupling your backend. Since multiple processes can listen for the same messages from your frontend "proxy" service, you can run several in parallel, on different machines if you want to. To upgrade the backend, start a new instance and then kill the old one, so that at least one instance is running at all times.
To scale your proxy, have multiple instances running on different ports or different hosts, and configure your firewall to randomly redirect incoming datagrams to one of the proxies.
To scale your backend, run more instances.