I'm trying to figure out how to make a server that can accept multiple clients at one time. While doing so, I need the client to be able to send and receive data from the server at the same time.
Would I have to make a threaded server, with one thread listening for data
and another thread sending information out to the client?
Then, on the client side, do I need to use threads to send and receive data?
Use async IO. There are dozens of async IO socket libs for Python. Here is a brief benchmark.
I also tested gevent, eventlet, asyncore, twisted, pyev, pycurl, tornado.
Twisted
is stable but the slowest, and also not easy to start with.
gevent, eventlet (libevent)
easy to start with and fast (the code looks like blocking code), but they have some issues with forking.
pycurl (libcurl)
fast and easy (if you are OK with some flag magic, though there are examples), but HTTP only.
pyev (libev)
you must understand what you are doing; it is almost like polling yourself.
tornado (polling in Python)
fast enough, stable I think, and also easy to start with.
asyncore
really fast, but don't use it; it is ugly.
Don't use threads in Python unless you really know what you are doing.
Python and threads are not great friends (at least in versions before 3.2; 3.2 is supposed to have a new GIL).
On the server side you clearly need a Socket Server. This server creates a new thread for every incoming client connection.
Once a connection is established and the communication is synchronous, both the client and the server thread created for that connection need an additional thread if they have to do other work in parallel with listening on the socket. If asynchronous communication is what you need, Python provides an excellent Asynchronous Socket Handler.
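A minimal sketch of the thread-per-connection server described above, using only the standard library. In Python 3 the module is named `socketserver`, and the per-connection threads come from `ThreadingMixIn`. The names `EchoHandler`, `ThreadedTCPServer`, `start_server`, and `ask` are made up for illustration, and the upper-casing "protocol" is only a stand-in for real work.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """handle() runs in its own thread for each connected client."""

    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:  # empty read: the client closed the connection
                break
            # Placeholder for real work: reply with the upper-cased bytes.
            self.request.sendall(data.upper())


class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    allow_reuse_address = True
    daemon_threads = True  # handler threads do not block interpreter exit


def start_server(host="127.0.0.1", port=0):
    """Start the accept loop in a background thread; port 0 picks a free port."""
    server = ThreadedTCPServer((host, port), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(address, message):
    """Hypothetical client helper: send one short message, return the reply."""
    with socket.create_connection(address) as sock:
        sock.sendall(message)
        return sock.recv(1024)
```

Calling `start_server()` and then `ask(server.server_address, b"hello")` exercises one round trip. A client that must send and receive at the same time would still need its own reader thread or a non-blocking approach.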
Use an asynchronous socket. An example server can be found here and the client code here. No direct hassle with threads. Depending on your needs, you probably don't need the asynchronous client.
You don't need threads for either client or server; you can instead select() to multiplex all the I/O inside a single thread.
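A rough sketch of that single-threaded approach with `select.select()` from the standard library. The helper names `make_listener` and `poll_once` are invented for this example, and the echo reply is a placeholder. It assumes replies are small enough that `sendall()` does not stall; a fuller version would also track writable sockets.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


def poll_once(listener, clients, timeout=1.0):
    """Wait for readable sockets and service each one; return how many were ready."""
    readable, _, _ = select.select([listener] + clients, [], [], timeout)
    for sock in readable:
        if sock is listener:
            # A new client is waiting: accept it and start watching it.
            conn, _addr = listener.accept()
            clients.append(conn)
        else:
            data = sock.recv(1024)
            if data:
                sock.sendall(data)  # placeholder protocol: echo the bytes back
            else:
                # Empty read means the peer closed; stop watching the socket.
                clients.remove(sock)
                sock.close()
    return len(readable)
```

A server would call `poll_once(listener, clients)` in a loop; every connection is serviced from the same thread, so no locking is needed.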
I need to develop an application in Python that handles a few thousand persistent TCP connections in parallel. Clients connect to the server at startup and send messages (in binary format) from time to time. The server also sends binary messages, both in reply to clients' messages and asynchronously. Basically it is a persistent connection initiated by the client, because I have no way to reach clients that are behind a NAT.
The question is: which libraries/frameworks should I consider for this task? Spawning a thread for each client is not an option. I'm not aware of a thread pool library for Python. I also recently discovered gevent. Which other options do I have?
This link is an excellent read. It lists all the available event driven and asynchronous network frameworks within Python and also has good analysis of the performance for each framework.
It appears that the Tornado framework is one of the most performant for developing such applications.
Hope this helps
'greenlets' is a lightweight concurrency package. See http://greenlet.readthedocs.org/en/latest/.
Besides greenlets, you might also want to consider multiprocessing. See http://docs.python.org/2/library/multiprocessing.html.
The end result I am trying to achieve is to allow a server to assign specific tasks to a client when it makes its connection. A simplified version would be like this:
Client connects to Server
Server tells Client to run some network task
Client receives task and fires up another process to complete task
Client tells Server it has started
Server tells Client it has another task to do (and so on...)
A couple of notes
There would be a cap on how many tasks a client can do
The client would need to be able to monitor the task/process (running? died?)
It would be nice if the client could receive data back from the process to send to the server if needed
At first, I was going to try threading, but I have heard Python doesn't do threading correctly (is that right/wrong?)
Then I thought of firing off a system call from Python and recording the PID, then sending certain signals to the process for status and stop (SIGUSR1, SIGUSR2, SIGINT). But I'm not sure that will work, because I don't know if I can capture data from another process. If I can, I don't have a clue how that would be accomplished (stdout or a socket file?).
What would you guys suggest as far as the best way to handle this?
Use spawnProcess to spawn a subprocess. If you're using Twisted already, then this should integrate pretty seamlessly into your existing protocol logic.
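For comparison, here is a small sketch that does not use Twisted's `spawnProcess` at all but the standard library's `subprocess` module. It covers the points raised in the question: recording the PID, checking whether the task is running or has died, stopping it with a signal, and reading its stdout through a pipe. The function names `start_task`, `task_status`, `stop_task`, and `collect_output` are hypothetical.

```python
import subprocess
import sys


def start_task(args):
    """Start a child process with its stdout connected to a pipe."""
    return subprocess.Popen(args, stdout=subprocess.PIPE, text=True)


def task_status(proc):
    """Report 'running' or 'exited(<code>)' without blocking."""
    code = proc.poll()
    return "running" if code is None else f"exited({code})"


def stop_task(proc, timeout=10):
    """Ask the child to terminate (SIGTERM on POSIX) and wait for it to exit."""
    proc.terminate()
    return proc.wait(timeout=timeout)


def collect_output(proc, timeout=10):
    """Wait for the child to finish and return everything it wrote to stdout."""
    out, _ = proc.communicate(timeout=timeout)
    return out


if __name__ == "__main__":
    task = start_task([sys.executable, "-c", "print('task done')"])
    print("pid:", task.pid)
    print("output:", collect_output(task).strip())
    print("status:", task_status(task))
```

A cap on concurrent tasks could be enforced by keeping the `Popen` objects in a list and refusing new work while too many report `running`.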
Use Celery, a Python distributed task queue. It probably does everything you want or can be made to do everything you want, and it will also handle a ton of edge cases you might not have considered yet (what happens to existing jobs if the server crashes, etc.)
You can communicate with Celery from your other software using a messaging queue like RabbitMQ; see the Celery tutorials for details on this.
It will probably be most convenient to use a database such as MySQL or PostgreSQL to store information about tasks and their results, but you may be able to engineer a solution that doesn't use a database if you prefer.
I use Tornado as the web server. I have written some daemons in Python, which run on the server hardware. Sometimes the web server needs to send some data to a daemon and receive some computed results. There are two modes of operation:
1. Asynchronous mode: the server sends some data to the daemons, and it doesn't need the results soon. Would a message queue be a good fit for this?
2. Synchronous mode: the server sends data to the daemons, and it waits until it gets the results. Should I use sockets?
So what's the best way of communication between tornado and Python based daemon?
ZeroMQ can be used for this purpose. It has various sockets for different purposes and it's fast enough to never be your bottleneck. For asynchronous you can use DEALER/ROUTER sockets and for strict synchronous mode you can use REQ/REP sockets.
You can use the python binding for this --> http://www.zeromq.org/bindings:python.
For the async mode you can try something like this, from the zguide chapter 3, "Router-to-dealer async routing":
In your case, the "client" in the diagram will be your web server and your daemon will be the "worker".
For synchronous you can try a simple request-reply broker or some variant to suit your need.
The diagram above shows a strictly synchronous cycle of send/recv at the REQ/REP sockets. Read through the zguide link to understand how it works. They also have a python code snippet on the page.
Depending on the scale, the simple thing is to just use HTTP and the AsyncHTTPClient in Tornado. For the request<->response case in our application we're doing 300 connections/second with such an approach.
For the first case (fire and forget), you could also use AsyncHTTPClient and just have the server close out the connection and continue working...
I'm looking for some general information on how I should approach a problem that I think Twisted is a great fit for. (I'm new to Twisted but not Python)
I have a home automation controller that can support a single TCP socket connection, sending and receiving binary data. I'd like to use XMPP as a bridge to the socket so a user can send commands and receive events.
I got a rudimentary socket connection working with Twisted that was able to send and receive commands from one of the examples in the O'Reilly book. I also have a fully working Python XMPP bot written with the SleekXMPP library that I'm happy with. I'm just not sure how to bring these together.
The basic scenario is:
1. User sends message to XMPP bot, which figures out what command to send to the socket
2. ASCII Socket command is converted to binary and sent to socket
3. Socket receives command and sends binary response
4. Binary response converted to ASCII
5. XMPP bot sends response back to user.
6. Network events (independent from user action) can also be received by network socket and should be sent to user
It's #6 that is presenting the challenge; otherwise I'd just open and close the socket on demand whenever I need to write something.
The part that I'm having trouble wrapping my head around with Twisted is the best approach to make these two event loops communicate. I've seen lots of info on using Queues, deferred, threads, select, etc. I have a feeling that Twisted can handle much of the complexity if I just learn to use the tool properly.
If someone can point me in the right direction, I'll take the ball and run with it. As I mentioned, I'm happy with my XMPP bot and I'd like to use the existing code. I think my problem now comes down to creating the socket in the background, then sending and receiving data from that socket in the foreground.
By the way, I'm very happy to share back my code once it's working so someone else can benefit from the help I'm asking for.
-- Scott
One of the problems with a non-blocking IO engine is that it's pretty much all-or-nothing. As soon as you introduce blocking code, you can quickly lose most of the benefits of the event-driven async approach. Wherever possible (as a rule of thumb), it's best to have the entire app running off the same reactor.
As I see it, you have two options:
Twisted is not thread safe. That said, you can use mechanisms like deferToThread and callFromThread to interact with other threads. This is by far the most confusing and needlessly complex approach for your application design. It's particularly painful if you're new to twisted.
Use twisted.words.protocols.jabber, and implement your XMPP stuff in a non-blocking manner using the Twisted reactor. That way it will happily exist alongside all your other Twisted code and allow you to cleanly interact between protocols. It will result in less code, and a robust implementation that is easy to extend, maintain, and test.
I want to implement a lightweight Message Queue proxy. Its job is to receive messages from a web application (PHP) and send them to the Message Queue server asynchronously. The reason for this proxy is that the MQ isn't always available and is sometimes lagging, or even down, but I want to make sure the messages are delivered and the web application returns immediately.
So, PHP would send the message to the MQ proxy running on the same host. That proxy would save the messages to SQLite for persistence, in case of crashes. At the same time it would send the messages from SQLite to the MQ in batches when the connection is available, and delete them from SQLite.
Now, the way I understand, there are these components in this service:
1. message listener (listens to the messages from PHP and writes them to an Incoming Queue)
2. DB flusher (reads messages from the Incoming Queue and saves them to a database; needed due to SQLite's single-threadedness)
3. MQ connection handler (keeps the connection to the MQ server online by reconnecting)
4. message sender (collects messages from the SQLite db and sends them to the MQ server, then removes them from the db)
I was thinking of using Twisted for #1 (TCPServer), but I'm having problems integrating it with the other points, which aren't event-driven. Intuition tells me that each of these points should run in a separate thread, because all are IO-bound and independent of each other, but I could easily put them in a single thread. Even so, I couldn't find any good and clear (to me) examples of how to implement this worker thread alongside Twisted's main loop.
The example I've started with is chatserver.py, which uses service.Application and internet.TCPServer objects. If I start my own thread prior to creating the TCPServer service, it runs a few times, but then it stops and never runs again. I'm not sure why this is happening, but it's probably because I don't use threads with Twisted correctly.
Any suggestions on how to implement a separate worker thread and keep Twisted? Do you have any alternative architectures in mind?
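The persistence part of the design described in this question (save to SQLite, send in batches, delete after delivery) can be sketched independently of Twisted and of any particular MQ. Everything named here is an assumption for illustration: the `outbox` table, the helpers `open_store`, `enqueue`, and `flush`, and the `send_batch` callable that stands in for the real MQ client and is assumed to raise `ConnectionError` when the MQ is unreachable.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the SQLite file that holds undelivered messages."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body BLOB NOT NULL)"
    )
    db.commit()
    return db


def enqueue(db, body):
    """Persist one message; the 'with' block commits the insert."""
    with db:
        db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))


def flush(db, send_batch, batch_size=100):
    """Send the oldest messages; delete them only after send_batch succeeds."""
    rows = db.execute(
        "SELECT id, body FROM outbox ORDER BY id LIMIT ?", (batch_size,)
    ).fetchall()
    if not rows:
        return 0
    try:
        send_batch([body for _row_id, body in rows])
    except ConnectionError:
        return 0  # MQ unavailable: keep the rows for the next attempt
    with db:
        db.executemany(
            "DELETE FROM outbox WHERE id = ?", [(row_id,) for row_id, _ in rows]
        )
    return len(rows)
```

Because rows are deleted only after a successful send, a crash between sending and deleting can deliver a batch twice, so the receiving side should tolerate duplicates. All database calls here should stay on one thread, since a `sqlite3` connection is by default usable only from the thread that created it.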
You're basically considering writing an ad-hoc extension to your messaging server, whose job is to provide whatever reliability guarantees you've asked of it.
Instead, perhaps you should take the hardware where you were planning to run this new proxy and run another MQ node on it. The new node should take care of persisting and relaying messages that you deliver to it while the other nodes are overloaded or offline.
Maybe it's not the best bang for your buck to use a separate thread in Twisted to get around a blocking call, but sometimes the least evil solution is the best. Here's a link that shows you how to integrate threading into Twisted:
http://twistedmatrix.com/documents/10.1.0/core/howto/threading.html
Sometimes in a pinch easy-to-implement is faster than hours/days of research which may all turn out to be for nought.
A neat solution to this problem would be to use the key-value store Redis. It's a high-speed persistent data store with plenty of clients, including a PHP and a Python client. If you want to use a timed/batch process to process messages, it saves you creating a database and also deals with your persistence stories. It runs fine on Cygwin/Windows and POSIX environments.
PHP Redis client is here.
Python client is here.
Both have a very clean and simple API. Redis also offers a publish/subscribe mechanism, should you need it, although it sounds like it would be of limited value if you're publishing to an inconsistent queue.