So, imagine I have only one socket that I need to manage for I/O completion while it's alive. With select.select() I would do:
import socket
import select
a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.setblocking(True)
connected = True
while connected:
    read, write, error = select.select([a], [a], [a])
    if read:
        pass  # do stuff with data
    if write:
        pass  # write to socket
Is there not a better solution for checking whether only one socket is readable/writable? I will only ever be dealing with a single socket object, no more. I feel like this method was built with managing multiple sockets in mind, so there could be a more efficient way to handle only one.
select() is still the way to go, even for a single socket. The only alternative (ignoring for the sake of discussion alternatives like poll(), which are similar to select()) would be blocking I/O, but if you're blocked in recv() you have no way to be woken up when the socket becomes ready-for-write, and vice versa if you're blocked in send().
So you might as well use select(). There's nothing inefficient about using it for a single socket.
I'm using a SocketServer.ThreadingTCPServer to serve socket connections to clients. This provides an interface where users can connect, type commands and get responses. That part I have working well.
However, in some cases I need a separate thread to broadcast a message to all connected clients. I can't figure out how to do this because there is no way to pass arguments to the class instantiated by ThreadingTCPServer. I don't know how to gather a list of socket connections that have been created.
Consider the example here. How could I access the socket created in the MyTCPHandler class from the __main__ thread?
You should not write to the same TCP socket from multiple threads. The writes may be interleaved if you do ("Hello" and "World" may become "HelWloorld").
That being said, you can create a global list to contain references to all the server objects (who would register themselves in __init__()). The question is, what to do with this list? One idea would be to use a queue or pipe to send the broadcast data to each server object, and have the server objects look in that queue for the "extra" broadcast data to send each time their handle() method is invoked.
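A minimal sketch of the queue idea, with hypothetical names (broadcast_queues is a module-level list the handlers register into; nothing here beyond handle() comes from the SocketServer API):

import Queue
import SocketServer

broadcast_queues = []  # one Queue per connected client

class MyTCPHandler(SocketServer.StreamRequestHandler):
    def handle(self):
        q = Queue.Queue()
        broadcast_queues.append(q)
        try:
            while True:
                # first, drain any pending broadcast data for this client
                try:
                    while True:
                        self.wfile.write(q.get_nowait())
                except Queue.Empty:
                    pass
                data = self.rfile.readline()
                if not data:
                    break
                # ... handle the client's command here ...
        finally:
            broadcast_queues.remove(q)

def broadcast(message):
    for q in broadcast_queues:
        q.put(message)

Note that handle() blocks in the read, so a queued broadcast is only flushed after the client's next command arrives; a socket timeout (or select) would be needed for prompt delivery.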
Alternatively, you could use the Twisted networking library, which is more flexible and will let you avoid threading altogether - usually a superior alternative.
Here is what I've come up with. It isn't thread safe yet, but that shouldn't be a hard fix:
When the socket is accepted:
if not hasattr(self.server, 'socketlist'):
    self.server.socketlist = dict()
thread_id = threading.current_thread().ident
self.server.socketlist[thread_id] = self.request
When the socket closes:
del self.server.socketlist[thread_id]
When I want to write to all sockets:
def broadcast(self, message):
    if hasattr(self._server, 'socketlist'):
        for sock in self._server.socketlist.values():  # 'sock', to avoid shadowing the socket module
            sock.sendall(message + "\r\n")
It seems to be working well and isn't as messy as I thought it might end up being.
Using the following code, it seems I can fairly easily reconstruct a socket in a child process using multiprocessing.reduction:
import socket,os
import multiprocessing
from multiprocessing.reduction import reduce_handle, rebuild_handle
client = socket.socket()
client.connect(('google.com', 80))
rd = reduce_handle(client.fileno())
print "Parent: %s" % (os.getpid())
def test(x):
    print "Child: %s" % (os.getpid())
    build = rebuild_handle(x)
    rc = socket.fromfd(build, socket.AF_INET, socket.SOCK_STREAM)
    rc.send('GET / HTTP/1.1\n\n')
    print rc.recv(1024)
p = multiprocessing.Process(target=test, args=(rd,))
p.start()
p.join()
I have a Twisted game server that runs multiple matches at the same time. These matches may contain several players, each of whom has a Protocol instance. What I'd like to do is have matches split across a pool of Twisted subprocesses, and have the pools handle the clients of the matches they're processing themselves. It seems like reading/writing the client's data and passing that data to and from the subprocesses would be unnecessary overhead.
The Protocols are guaranteed to be TCP instances so I believe I can (like the above code) reduce the socket like this:
rd = reduce_handle(myclient.transport.fileno())
After passing that data to a subprocess, it seems (from looking at the Twisted source) that I can reconstruct it in the subprocess like this:
import socket
from twisted.internet import reactor, tcp
from multiprocessing.reduction import reduce_handle, rebuild_handle
handle = rebuild_handle(rd)
sock = socket.fromfd(handle, socket.AF_INET, socket.SOCK_STREAM)
protocol = MyProtocol(...)
transport = tcp.Connection(sock, protocol, reactor=reactor)
protocol.transport = transport
I would just try this, but since I'm not very familiar with the Twisted internals, even if it works I don't really know what the implications might be.
Can anyone tell me whether this looks right and whether it would work? Is this inadvisable for some reason (I've never seen it mentioned in the Twisted documentation or posts, even though it seems quite relevant)? If it works, is there anything I should be wary of?
Thanks in advance.
Twisted and the multiprocessing module are incompatible with each other. If the code appears to work, it's only by luck and accident, and a future version of either (there may well be no future versions of multiprocessing, but there will probably be future versions of Twisted) might turn this good luck into bad luck.
twisted.internet.tcp also isn't a great module to use in your applications. It's not exactly private, but you also can't rely on it always working with the reactor your application uses. For example, the IOCP reactor uses twisted.internet.iocpreactor.tcp instead and will not work at all with twisted.internet.tcp. (I don't expect it's very likely you'll be using the IOCP reactor with this code, and the rest of the reactors Twisted ships with do use twisted.internet.tcp, but third-party reactors may not, and future versions of Twisted may change how the reactors are implemented.)
There are two parts of the problem you're solving. One part is conveying the file descriptor between two processes. The other part is convincing the reactor to start monitoring the file descriptor and dispatching its events.
It's possible the risk of using multiprocessing.reduction with Twisted is minimal because there doesn't seem to be anything to do with process management in that module. Instead, it's just about pickling sockets. So you may be able to continue to convey your file descriptors using that method (and you might want to do this if you wanted to avoid using Twisted in the parent process for some reason - I'm not sure, but it doesn't sound like this is the case). However, an alternative to this is to use twisted.python.sendmsg to pass these descriptors over a UNIX socket - or better yet, to use a higher-level layer that handles the fiddly sendmsg bits for you: twisted.protocols.amp. AMP supports an argument type that is a file descriptor, letting you pass file descriptors between processes (again, only over a UNIX socket) just like you'd pass any other Python object.
As for the second part, you can add an already-established TCP connection to the reactor using reactor.adoptStreamConnection. This is a public interface that you can rely on (as long as the reactor actually implements it - which not all reactors do: you can introspect the reactor using twisted.internet.interfaces.IReactorSocket.providedBy(reactor) if you want to do some kind of graceful degradation or user-friendly error reporting).
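A minimal sketch of the adoption step in the child process (fd is assumed to be the descriptor you conveyed from the parent, and MyFactory your protocol factory; both are placeholder names):

import socket
from twisted.internet import reactor
from twisted.internet.interfaces import IReactorSocket

if IReactorSocket.providedBy(reactor):
    # The reactor builds the transport itself and wires it to a protocol
    # created by the factory, so there's no poking at tcp.Connection.
    reactor.adoptStreamConnection(fd, socket.AF_INET, MyFactory())
    reactor.run()
else:
    raise RuntimeError("this reactor cannot adopt existing connections")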
I am working on writing a network-oriented application in Python. I had earlier worked with blocking sockets, but after gaining a better understanding of the requirements and concepts, I want to write the application using non-blocking sockets, and thus an event-driven server.
I understand that the functions in Python's select module are used to conveniently see which socket interests us, and so forth. Toward that end, I was flipping through a couple of examples of event-driven servers and came across this one:
"""
An echo server that uses select to handle multiple clients at a time.
Entering any line of input at the terminal will exit the server.
"""
import select
import socket
import sys
host = ''
port = 50000
backlog = 5
size = 1024
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((host,port))
server.listen(backlog)
input = [server,sys.stdin]
running = 1
while running:
    inputready, outputready, exceptready = select.select(input, [], [])
    for s in inputready:
        if s == server:
            # handle the server socket
            client, address = server.accept()
            input.append(client)
        elif s == sys.stdin:
            # handle standard input
            junk = sys.stdin.readline()
            running = 0
        else:
            # handle all other sockets
            data = s.recv(size)
            if data:
                s.send(data)
            else:
                s.close()
                input.remove(s)
server.close()
The parts that I didn't seem to understand are the following:
In the code snippet inputready,outputready,exceptready = select.select(input,[],[]), I believe the select() function returns three possibly empty lists of waitable objects for input, output and exceptional conditions. So it makes sense that the first argument to the select() function is the list containing the server socket and the stdin. However, where I face confusion is in the else block of the code.
Since we are for-looping over the list of inputready sockets, it is clear that the select() function will choose a client socket that is ready to be read. However, after we read data using recv() and find that the socket has actually sent data, we would want to echo it back to the client. My question is how can we write to this socket without adding it to the list passed as second argument to the select() function call? Meaning, how can we call send() on the new socket directly without 'registering' it with select() as a writable socket?
Also, why do we loop only over the sockets ready to be read (inputready in this case)? Isn't it necessary to loop over even the outputready list to see which sockets are ready to be written?
Obviously, I am missing something here.
It would also be really helpful if somebody could explain in a little more detailed fashion the working of select() function or point to good documentation.
Thank you.
Probably that snippet of code is just a simple example, and so it is not exhaustive. You are free to write to and read from any socket even if select has not told you it is ready; but, of course, if you do that you cannot be sure your send() won't block.
So, yes, it would be best practice to rely on select for write operations as well.
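A minimal sketch (with hypothetical names: inputs, outputs, pending) of how the echo server above could be extended to rely on select for writes too, by registering a socket for write readiness only while it has data queued:

import select

outputs = []
pending = {}  # socket -> string of bytes waiting to be sent

readable, writable, _ = select.select(inputs, outputs, inputs)
for s in readable:
    data = s.recv(1024)
    if data:
        pending[s] = pending.get(s, '') + data
        if s not in outputs:
            outputs.append(s)   # start watching for write readiness
for s in writable:
    sent = s.send(pending[s])   # send() may accept only part of the buffer
    pending[s] = pending[s][sent:]
    if not pending[s]:
        outputs.remove(s)       # buffer drained; stop watching

Removing drained sockets from outputs matters: a connected socket is almost always writable, so leaving it registered would make select return immediately in a busy loop.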
There are also many other functions with a similar purpose, and in many cases they are better than select (e.g. epoll), but they are not available on all platforms.
Information about select, epoll and the other functions may be found in the Linux man pages.
However, in Python there are many nice libraries for handling many connections; two of these are Twisted and gevent.
I am trying to find the easiest way to read from multiple (around 100) udp datagram sockets in python. I have looked at tornado, but tornado touts http/tcp rather than udp support.
Right now I have threads dedicated to each udp socket; however, this doesn't seem very efficient.
The SocketServer module has a built-in UDP server with options for threading and forking.
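For instance, a minimal sketch of a threaded UDP echo server with that module (the handler class and port are placeholders):

import SocketServer

class EchoHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        data, sock = self.request  # for UDP, request is a (data, socket) pair
        sock.sendto(data, self.client_address)

server = SocketServer.ThreadingUDPServer(('', 9999), EchoHandler)
server.serve_forever()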
Another option is the use the select module which will allow you to focus only on the sockets where data is already available for reading.
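A minimal sketch of that approach, assuming socks is your list of already-bound UDP sockets:

import select

while True:
    readable, _, _ = select.select(socks, [], [])
    for s in readable:
        data, addr = s.recvfrom(4096)
        # process the datagram from addr here

One select call replaces the hundred blocked threads; only sockets with a datagram already queued show up in readable.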
I must confess I never used it, but maybe Twisted will suit your needs.
It supports lots of protocols, even serial connections.
I'd like to add some comments on the initial question even though it already has an accepted answer.
If you have multiple connections which need independent processing (or at least processing without much synchronization), it's okay to use one thread per connection and do blocking reads. Modern schedulers won't kill you for that. It is a fairly efficient way of handling the connections. If you are concerned about memory footprint, you can reduce the stack size of the threads accordingly (this does not apply to Python).
The threads/processes will stay in a non-busy waiting state for most of the time (while waiting for new data) and not consume any CPU time.
If you do not want to or cannot use threads, the select call is definitely the way to go. It is also a low-level and efficient way of waiting and, as a bonus, it gives you the list of sockets that triggered.
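A minimal sketch of the thread-per-socket variant described above (socks and handle_datagram are placeholder names):

import threading

def reader(sock):
    while True:
        data, addr = sock.recvfrom(4096)
        handle_datagram(data, addr)  # your processing function

for sock in socks:
    t = threading.Thread(target=reader, args=(sock,))
    t.daemon = True   # don't keep the process alive on exit
    t.start()

Each thread spends nearly all its time blocked in recvfrom(), which is the non-busy waiting mentioned above.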
asyncoro supports asynchronous TCP and UDP sockets (among many other features). Unlike with other frameworks, programming with asyncoro is very similar to that of threads. A simple UDP client/server program to illustrate:
import socket, asyncoro
def server_proc(n, sock, coro=None):
    for i in xrange(n):
        msg, addr = yield sock.recvfrom(1024)
        print('Received "%s" from %s:%s' % (msg, addr[0], addr[1]))
    sock.close()

def client_proc(host, port, coro=None):
    sock = asyncoro.AsynCoroSocket(socket.socket(socket.AF_INET, socket.SOCK_DGRAM))
    msg = 'client socket: %s' % (sock.fileno())
    yield sock.sendto(msg, (host, port))
    sock.close()

if __name__ == '__main__':
    sock = asyncoro.AsynCoroSocket(socket.socket(socket.AF_INET, socket.SOCK_DGRAM))
    sock.bind(('127.0.0.1', 0))
    host, port = sock.getsockname()
    n = 100
    server_coro = asyncoro.Coro(server_proc, n, sock)
    for i in range(n):
        asyncoro.Coro(client_proc, host, port)
asyncoro uses efficient polling mechanisms where possible. Only with UDP sockets on Windows does it fall back to the less efficient 'select' (for TCP it uses efficient Windows I/O Completion Ports if pywin32 is installed).
I think if you do insist on using tornado's ioloop and want to do UDP socket processing, you should use a UDP version of the tornado IOStream. I have done this with success in my own projects. It is a little bit of a misnomer to call it UDPStream (since it is not quite a stream), but the basic use should be very easy for you to integrate into your application.
See the code at: http://kyle.graehl.org/coding/2012/12/07/tornado-udpstream.html
I want two-way communication in Python:
I want to bind to a socket that one client can connect to, and then the server and client can "chat" with each other.
I already have the basic listener:
import socket
HOST='' #localhost
PORT=50008
s=socket.socket(socket.AF_INET, socket.SOCK_STREAM ) #create an INET, STREAMing socket
s.bind((HOST,PORT)) #bind to that port
s.listen(1) #listen for user input and accept 1 connection at a time.
conn, addr = s.accept()
print "The connection has been set up"
bool=1
while bool==1:
    data=conn.recv(1024)
    print data
    if "#!END!#" in data:
        print "closing the connection"
        s.close()
        bool=0
What I want to do now is implement something so this script also accepts user input and after the enter key is hit, send it back to the client.
But I can't figure out how to do this, because if I did it like this:
while bool==1:
    data=conn.recv(1024)
    print data
    u_input = raw_input("input now")
    if u_input != "":
        conn.send(u_input)
        u_input = ""
the problem is that it would probably hang at the user input prompt, so it would not let my client send data.
How do I solve this ?
I want to keep it in one window, can this be solved with threads ?
(I've never used threads in python)
Python's sockets have a makefile tool to make this sort of interaction much easier. After creating a socket s, then run f = s.makefile(). That will return an object with a file-like interface (so you can use readline, write, writelines and other convenient method calls). The Python standard library itself makes use of this approach (see the source for ftplib and poplib for example).
To get text from the client and display it on the server console, write a loop with print f.readline().
To get text from the server console and send it to the client, write a loop with f.write(raw_input('+ ') + '\n').
To send and receive at the same time, run those two steps in separate threads:
Thread(target=read_client_and_print_to_console).start()
Thread(target=read_server_console_and_send).start()
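Putting it together, a minimal sketch (conn is assumed to be the accepted socket from your listener; the two target functions are the hypothetical names used above):

from threading import Thread

f = conn.makefile()

def read_client_and_print_to_console():
    for line in iter(f.readline, ''):   # '' means the client disconnected
        print line.rstrip()

def read_server_console_and_send():
    while True:
        f.write(raw_input('+ ') + '\n')
        f.flush()                       # makefile() buffers; push it out

Thread(target=read_client_and_print_to_console).start()
Thread(target=read_server_console_and_send).start()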
If you prefer async over threads, here are two examples to get you started:
Basic Async HTTP Client
Basic Async Echo Server
The basic problem is that you have two sources of input you're waiting for: the socket and the user. The three main approaches I can think of are to use asynchronous I/O, to use synchronous (blocking) I/O with multiple threads, or to use synchronous I/O with timeouts. The last approach is conceptually the simplest: wait for data on the socket for up to some timeout period, then switch to waiting for the user to enter data to send, then back to the socket, etc.
I know that at a lower level you could implement this relatively easily by treating both the socket and stdin as I/O handles and using select to wait on both of them simultaneously, but I can't recall if that functionality is mapped into Python, or if so, how. That's potentially a very good way of handling this if you can make it work. EDIT: I looked it up, and Python does have a select module, but it sounds like it only functions like this under Unix operating systems; on Windows, it can only accept sockets, not stdin or files.
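A minimal sketch of that select-based approach (Unix only, for the reason above; conn is assumed to be the accepted socket from the listener):

import select
import sys

while True:
    readable, _, _ = select.select([conn, sys.stdin], [], [])
    for src in readable:
        if src is conn:
            data = conn.recv(1024)
            if not data:        # peer closed the connection
                raise SystemExit
            print data.rstrip()
        else:
            conn.send(sys.stdin.readline())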
Have you checked Twisted? Twisted is a Python event-driven networking engine and library. There is also oidranot, a Python library made especially for that, based on the Tornado web server.