Twisted - Pass Protocol (and socket handle) object to Twisted subprocess - python

Using the following code, it seems I can fairly easily reconstruct a socket in a child process using multiprocessing.reduction:
import socket, os
import multiprocessing
from multiprocessing.reduction import reduce_handle, rebuild_handle

client = socket.socket()
client.connect(('google.com', 80))
rd = reduce_handle(client.fileno())
print "Parent: %s" % (os.getpid())

def test(x):
    print "Child: %s" % (os.getpid())
    build = rebuild_handle(x)
    rc = socket.fromfd(build, socket.AF_INET, socket.SOCK_STREAM)
    rc.send('GET / HTTP/1.1\n\n')
    print rc.recv(1024)

p = multiprocessing.Process(target=test, args=(rd,))
p.start()
p.join()
I have a Twisted game server that runs multiple matches at the same time. These matches may contain several players, each of whom has a Protocol instance. What I'd like to do is have matches split across a pool of Twisted subprocesses, and have the pools handle the clients of the matches they're processing themselves. It seems like reading/writing the client's data and passing that data to and from the subprocesses would be unnecessary overhead.
The Protocols are guaranteed to be TCP instances so I believe I can (like the above code) reduce the socket like this:
rd = reduce_handle(myclient.transport.fileno())
Looking at the Twisted source, it seems that after passing that data to a subprocess I can reconstruct it there like this:
import socket
from twisted.internet import reactor, tcp
from multiprocessing.reduction import reduce_handle, rebuild_handle
handle = rebuild_handle(rd)
sock = socket.fromfd(handle, socket.AF_INET, socket.SOCK_STREAM)
protocol = MyProtocol(...)
transport = tcp.Connection(sock, protocol, reactor=reactor)
protocol.transport = transport
I would just try this, but since I'm not super familiar with the Twisted internals, even if this works I don't really know what the implications might be.
Can anyone tell me whether this looks right and whether it would work? Is this inadvisable for some reason (I've never seen it mentioned in Twisted documentation or posts even though it seems quite relevant)? If it works, is there anything I should be wary of?
Thanks in advance.

Twisted and the multiprocessing module are incompatible with each other. If the code appears to work, it's only by luck and accident, and a future version of either (there may well be no future versions of multiprocessing, but there will probably be future versions of Twisted) might turn this good luck into bad luck.
twisted.internet.tcp also isn't a great module to use in your applications. It's not exactly private, but you also can't rely on it always working with the reactor your application uses. For example, the IOCP reactor uses twisted.internet.iocpreactor.tcp instead and will not work at all with twisted.internet.tcp. (I don't expect it's very likely you'll be using the IOCP reactor with this code, and the rest of the reactors Twisted ships with do use twisted.internet.tcp, but third-party reactors may not, and future versions of Twisted may change how the reactors are implemented.)
There are two parts of the problem you're solving. One part is conveying the file descriptor between two processes. The other part is convincing the reactor to start monitoring the file descriptor and dispatching its events.
It's possible the risk of using multiprocessing.reduction with Twisted is minimal, because there doesn't seem to be anything to do with process management in that module; it's just about pickling sockets. So you may be able to continue conveying your file descriptors that way (and you might want to, if you wanted to avoid using Twisted in the parent process for some reason - I'm not sure, but it doesn't sound like this is the case). However, an alternative is to use twisted.python.sendmsg to pass these descriptors over a UNIX socket - or better yet, to use a higher-level layer that handles the fiddly sendmsg bits for you: twisted.protocols.amp. AMP supports an argument type that is a file descriptor, letting you pass file descriptors between processes (again, only over a UNIX socket) just like you'd pass any other Python object.
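For instance, here is a minimal sketch of the AMP side, assuming parent and child are already connected over a UNIX socket. The command, class, and helper names are illustrative; amp.Descriptor is the argument type that handles the sendmsg() details:
from twisted.protocols import amp

class AdoptConnection(amp.Command):
    # Descriptor transfers the descriptor itself over the UNIX
    # socket, not just its integer value.
    arguments = [(b'fd', amp.Descriptor())]
    response = []

class Worker(amp.AMP):
    @AdoptConnection.responder
    def adoptConnection(self, fd):
        # 'fd' arrives as an integer descriptor duplicated into this
        # process; hand it to the reactor (see the next step).
        handOffToReactor(fd)  # hypothetical helper, defined elsewhere
        return {}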
As for the second part, you can add an already-established TCP connection to the reactor using reactor.adoptStreamConnection. This is a public interface that you can rely on (as long as the reactor actually implements it - which not all reactors do: you can introspect the reactor using twisted.internet.interfaces.IReactorSocket.providedBy(reactor) if you want to do some kind of graceful degradation or user-friendly error reporting).
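A minimal sketch of the receiving side, assuming fd is the integer descriptor received above and MyFactory is your protocol factory (both names are placeholders); the descriptor must refer to an already-connected TCP socket:
import socket
from twisted.internet import reactor
from twisted.internet.interfaces import IReactorSocket

if IReactorSocket.providedBy(reactor):
    # The reactor takes over monitoring the descriptor and builds a
    # protocol from the factory, as if it had accepted the connection.
    reactor.adoptStreamConnection(fd, socket.AF_INET, MyFactory())
else:
    raise RuntimeError("this reactor cannot adopt connections")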

Related

Local machine interprocess communication with multiple independent processes (1 server, n clients)

I would like to have a server process (preferably Python) that accepts simple messages and multiple clients (again, preferably Python) that connect to the server and send messages to it. The server and clients will only ever be running on the same local machine and the OS is Linux based. The server will be automatically started by the OS and the clients started later independent of the server. I strongly want to avoid installing a whole separate messaging framework/server to do this. The messages will be simple strings such as "kick" or even just a single byte representing the message type. It also needs to know when a connection is made and lost.
From these requirements, I think named pipes would be a feasible solution, with a new instance of that pipe created for each client connection. However, when I search for examples, all of the ones I have come across deal with processes that are spawned from the same parent process and not independently started which means they can pass a parent reference to the child.
Windows seems to allow multiple instances of a named pipe (one for each client connection), but I'm unsure whether this is possible on a Linux-based OS.
Please could someone point me in the right direction, preferably with a basic example, even if it's just pseudo-code.
I've looked at the multiprocessing module in Python, but this seems to be oriented around the server and client sharing the same process or having one spawn the other.
Edit
May be important, the host device is not guaranteed to have networking capabilities (embedded device).
I've used ZeroMQ for this sort of thing before. It's a relatively lightweight library that exposes this sort of functionality.
Otherwise, you could implement it yourself by binding a socket in the server process and having clients connect to it. This works fine for UNIX domain sockets; just pass AF_UNIX when creating the socket, e.g.:
import socket

with socket.socket(socket.AF_UNIX) as s:
    s.bind('/tmp/srv')
    s.listen(1)
    (c, addr) = s.accept()
    with c:
        c.send(b"hello world")
for the server, and:
with socket.socket(socket.AF_UNIX) as c:
    c.connect('/tmp/srv')
    print(c.recv(8192))
for the client.
Writing a protocol around this is more involved, which is where things like ZeroMQ really help: you can easily push JSON messages around.
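As a minimal sketch of the ZeroMQ approach with pyzmq over an ipc:// endpoint (a UNIX domain socket underneath, so no TCP/IP networking is needed on the device); the path and messages are illustrative:
import zmq

# server process
ctx = zmq.Context()
server = ctx.socket(zmq.REP)
server.bind('ipc:///tmp/srv.ipc')
msg = server.recv()          # e.g. b"kick" from a client
server.send(b"ok")

# client, run in a separate process
ctx = zmq.Context()
client = ctx.socket(zmq.REQ)
client.connect('ipc:///tmp/srv.ipc')
client.send(b"kick")
print(client.recv())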

Moving a bound ZMQ socket to another process

I have a Python program which spawns several other Python programs as subprocesses. One of these subprocesses is supposed to open and bind a ZMQ publisher socket, such that other subprocesses can subscribe to it.
I cannot give guarantees about which tcp ports will be available, so when I bind to a random port in the subprocess, my main program will not know what to tell the other subprocesses.
Is there a way to bind the socket in the main process and then somehow pass the socket to my subprocess? Or either some other way to preregister the socket or a standard way to pass the port information from the subprocess back to my main process (stdout and stderr are already used by other data)?
Just checking for a free port in the main process and passing that to the subprocess is not really optimal, because this could still fail if the socket is being assigned in the meantime. Also, since my program should work on Unix and Windows, I cannot really use ipc sockets, which would otherwise solve my problem.
The simplest approach is to create a pool-of-ports manager (and rather avoid attempts to share or pass ZeroMQ sockets among processes).
Create a persistent, a-priori known access point - a tcp://A.B.C.D:8765-transport-class based .bind() - exposed to all client processes as a port-assignment service. Client processes .connect() to it, handshake in whatever manner is needed to prove identity/credentials/purpose, and .recv() one actually free messaging/signalling port number, guaranteed system-wide not to be in use until it is returned to the port manager. The rotating pool of ports is centrally managed, under your code's control, whereas all the sockets are created locally in the distributed processes/threads, .connect()/.bind()-ing to the pool-manager-announced port number; the sockets themselves thus remain, consistent with ZeroMQ advice, unshared.
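A minimal sketch of such a manager over a REQ/REP pair; the endpoint, port range, and message format are all illustrative, and a real manager would also handle an exhausted pool and dead clients:
import zmq

ctx = zmq.Context()
rep = ctx.socket(zmq.REP)
rep.bind('tcp://127.0.0.1:8765')        # a-priori known access point

free_ports = list(range(9000, 9100))    # centrally managed pool
while True:
    req = rep.recv()                    # b"LEASE" or b"RETURN <port>"
    if req == b"LEASE":
        rep.send(str(free_ports.pop()).encode())
    elif req.startswith(b"RETURN "):
        free_ports.append(int(req.split()[1]))
        rep.send(b"OK")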

Strengthen python UDP server

I'm a beginner in python (2.6/2.7) who has been thrown in the deep end to create a network service to an existing python app.
I've got a UDP server up and running which works just great, but I'm asking for help in making it slightly more bulletproof.
Here is the base code I have written, the usual standard boilerplate:
import sys
import socket
from threading import Thread

def handleInput():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 5005))
    # socket always binded?
    while True:
        data, addr = sock.recvfrom(512)
        # is 'data' usable?

t = Thread(target=handleInput)
t.daemon = True
t.start()
# can thread die?
Firstly, I understand that the socket is in a sense always available; it doesn't have to listen, so there is no such thing as failing and having to reconnect. Is this correct?
Secondly, the issue with the data returned from recvfrom: is a simple 'if data:' check enough to tell whether it's valid?
Thirdly, and finally, the thread. Can a thread bomb out? If it does, what's the best way to restart it?
Any other tips would be welcome.
(Note: I cannot use external libraries, e.g. twisted etc.)
Some answers for your questions:
UDP sockets are connectionless, and as such "always available". Please also be aware that UDP message length is limited: at most 65,507 bytes of payload fit in a single IPv4 datagram.
Data is just a string, and its validity will depend on the protocol used. Your if data: ... will work as a test for receiving something, but the recvfrom() call will block anyway until it receives something, so this seems superfluous.
I have no knowledge about python thread reliability in general.
I would also check the SocketServer module and the associated UDPServer class - this might make for an easier implementation. SocketServer is part of Python's standard library.
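A minimal sketch using SocketServer.UDPServer (Python 2 standard library, so no external dependencies); the handler class name and the reply are illustrative:
import SocketServer

class Handler(SocketServer.BaseRequestHandler):
    def handle(self):
        # for UDP servers, self.request is (datagram payload, socket)
        data, sock = self.request
        if data:
            sock.sendto(b"ack", self.client_address)

server = SocketServer.UDPServer(("127.0.0.1", 5005), Handler)
server.serve_forever()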

pure python socket module

The socket module in Python wraps the _socket module, which is the C implementation. Also, socket.socket will take a _sock parameter that must implement the _socket interface. In some regards _sock must be an actual instance of the underlying socket type from _socket, since the C code does type checking (unlike pure Python).
Given that you can pass in a socket-like object for _sock, it seems like you could write a socket emulator that you could pass in to socket.socket. It would need to emulate the underlying socket behavior but in memory and not over a real network or anything. It should not otherwise be distinguishable from _socket.
What would it take to build out this sort of emulation?
I know, I know, this is not terribly practical. In fact, I learned the hard way that using regular sockets was easier and fake sockets were unnecessary. I had thought that I would have better control of a test environment with fake sockets. Regardless of the time I "wasted", I found that I learned a bunch about sockets and about python in the process.
I was guessing any solution would have to be a stack of interacting objects like this:
something_that_uses_sockets (like XMLRPCTransport for ServerProxy)
|
V
socket.socket
|
V
FakeSocket
|
V
FakeNetwork
|
V
FakeSocket ("remote")
|
V
socket.socket
|
V
something_else_that_uses_sockets (like SimpleXMLRPCServer)
It seems like this is basically what is going on for real sockets, except for a real network instead of a fake one (plus OS-level sockets), and _socket instead of FakeSocket.
Anyway, just for fun, any ideas on how to approach this?
Incidentally, with a FakeSocket you could do some socket-like stuff in Google Apps...
It's already been done. Twisted uses this extensively for unit tests of its protocol implementations. A good starting place would be looking at some of Twisted's unit tests.
In essence, you'd just call makeConnection on your protocol with a transport that isn't connected to a real socket. Super easy!
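A minimal sketch in that style, using twisted.test.proto_helpers.StringTransport as the fake transport; MyProtocol stands in for the protocol under test:
from twisted.test import proto_helpers

transport = proto_helpers.StringTransport()
protocol = MyProtocol()
protocol.makeConnection(transport)     # no real socket involved
protocol.dataReceived(b"bytes from the fake network")
print(transport.value())               # everything the protocol wrote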

Non Blocking Server in Python

Can someone please tell me how to write a non-blocking server using the socket library alone. Thanks.
Frankly, just don't (unless it's for an exercise). The Twisted framework will do everything network-related for you, so you only have to write your protocol without caring about the transport layer. Writing socket code is not easy, so why not use code somebody else wrote and tested?
Why socket alone? It's so much simpler to use another standard library module, asyncore -- and if you can't, at the very least select!
If you're constrained by your homework's conditions to only use socket, then I hope you can at least add threading (or multiprocessing); otherwise you're seriously out of luck -- you can make sockets with timeouts, but juggling timing-out sockets without the needed help from any of the other obvious standard library modules (to support either async or threaded serving) is a serious mess indeed-y... ;-). A threaded sketch follows.
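A minimal sketch of the threaded fallback, using only socket and threading: one thread per accepted connection, echoing whatever it receives. The address is illustrative:
import socket
import threading

def serve_client(conn):
    while True:
        data = conn.recv(1024)
        if not data:
            break            # client closed the connection
        conn.sendall(data)   # echo back
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 8000))
server.listen(5)
while True:
    conn, addr = server.accept()
    t = threading.Thread(target=serve_client, args=(conn,))
    t.daemon = True
    t.start()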
Not sure what you mean by "socket library alone" - you surely will need other modules from the standard Python library.
The lowest level of non-blocking code is the select module. This allows you to have many simultaneous client connections and reports which of them have input pending to process. So you select on both the server (accept) socket and any client connections that you have already accepted; a sketch follows. A thin layer on top of that is the asyncore module.
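A minimal select()-based echo server, using only the socket and select modules; the address and buffer size are illustrative:
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(False)
server.bind(("127.0.0.1", 8000))
server.listen(5)

sockets = [server]
while True:
    readable, _, _ = select.select(sockets, [], [])
    for s in readable:
        if s is server:
            conn, addr = server.accept()   # new client connection
            conn.setblocking(False)
            sockets.append(conn)
        else:
            data = s.recv(1024)
            if data:
                s.send(data)               # echo back
            else:
                sockets.remove(s)          # client closed
                s.close()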
Use eventlet or gevent. They monkey-patch existing libraries, so the socket module can be used without any changes: though the code appears synchronous, it executes asynchronously.
Example:
http://eventlet.net/doc/examples.html#socket-connect

Categories

Resources