I am running Django on Twisted in a WSGI container. Obviously I am avoiding all the async stuff with Deferreds inside my Django code because, according to the documentation, Twisted's async abilities are not allowed inside WSGI apps.
However, I would like to use twisted.words inside my WSGI app to send requests to a Jabber server. Does this count as async stuff, or can I use it inside my app? What could happen if I sent twisted.words Jabber requests to an XMPP server inside a WSGI app anyway?
Moreover, I have a more general question. Is there any reason Twisted's WSGI container is multithreaded (is it multithreaded?), given that it is well known that Python's GIL only reduces the overall performance of a threaded script?
Thanks for any replies.
To call a function in the main event loop (the I/O thread) in Twisted from another thread (a non-I/O thread, i.e., a WSGI application thread) you could use reactor.callFromThread(). If you'd like to wait for the result, use threads.blockingCallFromThread(). That way you can call functions that use twisted.words. See Using Threads in Twisted.
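A minimal sketch of both calls; send_jabber_message is a hypothetical helper built on twisted.words that must run in the reactor thread:

from twisted.internet import reactor
from twisted.internet.threads import blockingCallFromThread

def send_jabber_message(jid, body):
    # Hypothetical helper that uses twisted.words; reactor thread only.
    ...

# Fire-and-forget from a WSGI worker thread: schedule the call
# in the reactor (I/O) thread and return immediately.
reactor.callFromThread(send_jabber_message, 'alice@example.com', 'hello')

# Or block the WSGI thread until the reactor-side call completes
# and hand back its result.
result = blockingCallFromThread(
    reactor, send_jabber_message, 'alice@example.com', 'hello')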
To find out whether a WSGI container is multithreaded, inspect environ['wsgi.multithread']; it should be True for the Twisted container.
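A minimal sketch of checking the flag from inside a WSGI app:

def app(environ, start_response):
    # wsgi.multithread is a required key in the WSGI environ dict.
    flag = environ['wsgi.multithread']
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('multithreaded: %r' % flag).encode('ascii')]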
WSGI containers are multithreaded to support more than one request at a time (it is not strictly necessary, but it makes life easier when using existing software). Otherwise (unless you use other means to solve it) your whole server blocks while a request handler waits for an answer from the database. Some people find it simpler to write request handlers without worrying much about blocking other requests, which works if there are not many concurrent requests.
Functions in Python that perform CPU-intensive jobs can, when performance matters, use libraries that release the GIL during calculations or offload the work to other processes. The network and disk I/O that are frequent in webapps are usually much slower than the CPU anyway.
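For example, a sketch of the first approach, assuming numpy (whose BLAS-backed routines release the GIL internally):

from concurrent.futures import ThreadPoolExecutor
import numpy as np

# numpy's BLAS-backed operations release the GIL, so these matrix
# multiplications can run on multiple cores despite using threads.
m = np.random.rand(1000, 1000)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda x: np.dot(x, x), [m] * 4))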
Related
What exactly does passing threaded = True to app.run() do?
My application processes input from the user, and takes a bit of time to do so. During this time, the application is unable to handle other requests. I have tested my application with threaded=True and it allows me to handle multiple requests concurrently.
As of Flask 1.0, the WSGI server included with Flask is run in threaded mode by default.
Prior to 1.0, or if you disable threading, the server is run in single-threaded mode and can only handle one request at a time. Any parallel requests will have to wait until they can be handled, which can lead to issues if you try to contact your own server from within a request.
With threaded=True, requests are each handled in a new thread. How many threads your server can handle concurrently depends entirely on your OS and what limits it sets on the number of threads per process. The implementation uses the SocketServer.ThreadingMixIn class, which sets no limit on the number of threads it can spin up.
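For illustration, a minimal sketch (the route and the sleep are made up to simulate a slow request):

from flask import Flask
import time

app = Flask(__name__)

@app.route('/slow')
def slow():
    time.sleep(5)  # simulates a long-running request
    return 'done'

if __name__ == '__main__':
    # Each request gets its own thread, so /slow no longer blocks other requests.
    app.run(threaded=True)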
Note that the Flask server is designed for development only. It is not a production-ready server. Don't rely on it to run your site on the wider web. Use a proper WSGI server (like gunicorn or uWSGI) instead.
How many requests will my application be able to handle concurrently with this statement?
This depends drastically on your application. Each new request will have a thread launched; it depends on how many threads your machine can handle. I don't see an option to limit the number of threads (like uwsgi offers in a production deployment).
What are the downsides to using this? If I'm not expecting more than a few requests concurrently, can I just continue to use this?
Switching from a single thread to multithreaded can lead to concurrency bugs... if you use this, be careful about how you handle global objects (see the g object in the documentation!) and state.
I've got a Flask app that connects to external services at a given URL (with different, but usually long, response times) and searches for some stuff there. After that there are some CPU-heavy operations on the retrieved data. This takes some time too.
My problem: the response from an external service may take some time. You can't do much about that, but it becomes a big problem when you have multiple requests at once: the Flask request to the external service blocks the thread, and the rest are left waiting.
That's an obvious waste of time, and it's killing the app.
I heard about this asynchronous library called Tornado. And here are my questions:
Does that mean it can manage to handle multiple requests and just trigger a callback right after the response from the external service arrives?
Can I achieve that with my current Flask app (probably not, because of WSGI, I guess?) or do I need to rewrite the whole app in Tornado?
What about those CPU-heavy operations? Would they block my thread? It's a good idea to do some load balancing anyway, but I'm curious how Tornado handles that.
Possible traps, gotchas?
The web server built into Flask isn't meant to be used in production, for exactly the reasons you're listing: it's single-threaded and easily bogged down if any request blocks for a non-trivial amount of time. The Flask documentation lists several options for deploying it in a production environment: mod_wsgi, gunicorn, uWSGI, etc. All of those deployment options provide mechanisms for handling concurrency, either via threads, processes, or non-blocking I/O. Note, though, that if you're doing CPU-bound operations, the only option that will give true concurrency is to use multiple processes.
If you want to use Tornado, you'll need to rewrite your application in the Tornado style. Because its architecture is based on explicit asynchronous I/O, you can't use its asynchronous features if you deploy it as a WSGI application. The "Tornado style" basically means using non-blocking APIs for all I/O operations and using sub-processes for handling any long-running CPU-bound operations. The Tornado documentation covers how to make asynchronous I/O calls, but here's a basic example of how it works:
from tornado import gen
from tornado.httpclient import AsyncHTTPClient

@gen.coroutine
def fetch_coroutine(url):
    http_client = AsyncHTTPClient()
    # yield suspends the coroutine until the HTTP response arrives
    response = yield http_client.fetch(url)
    return response.body
The response = yield http_client.fetch(url) call is actually asynchronous; it returns control to the Tornado event loop when the request begins, and resumes again once the response is received. This allows multiple asynchronous HTTP requests to run concurrently, all within one thread. Do note, though, that anything you do inside of fetch_coroutine that isn't asynchronous I/O will block the event loop, and no other requests can be handled while that code is running.
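For a quick test outside of a request handler, you can drive the coroutine with run_sync (the URL is illustrative):

from tornado.ioloop import IOLoop

# run_sync starts the event loop, runs the coroutine to completion,
# then stops the loop and returns the result.
body = IOLoop.current().run_sync(
    lambda: fetch_coroutine('http://example.com/'))
print(body)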
To deal with long-running CPU-bound operations, you need to send the work to a subprocess to avoid blocking the event loop. For Python, that generally means using either multiprocessing or concurrent.futures. I'd take a look at this question for more information on how best to integrate those libraries with Tornado. Do note that you won't want to maintain a process pool larger than the number of CPUs you have on the system, so consider how many concurrent CPU-bound operations you expect to be running at any given time when you're figuring out how to scale this beyond a single machine.
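A minimal sketch of one such integration, assuming Tornado 4.x, where coroutines can yield concurrent.futures.Future objects directly (crunch and the pool size are illustrative):

from concurrent.futures import ProcessPoolExecutor
from tornado import gen

# Keep the pool no larger than the machine's CPU count.
executor = ProcessPoolExecutor(max_workers=4)

def crunch(data):
    # CPU-heavy work runs in a separate process, so the event loop stays free.
    return sum(x * x for x in data)

@gen.coroutine
def process(data):
    # executor.submit() returns a concurrent.futures.Future, which
    # Tornado's coroutine machinery knows how to wait on.
    result = yield executor.submit(crunch, data)
    return result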
The Tornado documentation also has a section dedicated to running behind a load balancer; they recommend using NGINX for this purpose.
Tornado seems more fit for this task than Flask. A subclass of tornado.web.RequestHandler run in an instance of tornado.ioloop.IOLoop should give you non-blocking request handling. I expect it would look something like this:
import tornado.web
import tornado.ioloop
import json

class handler(tornado.web.RequestHandler):
    def post(self):
        self.write(json.dumps({'aaa': 'bbbbb'}))

if __name__ == '__main__':
    app = tornado.web.Application([('/', handler)])
    app.listen(80, address='0.0.0.0')
    loop = tornado.ioloop.IOLoop.instance()
    loop.start()
If you want your post handler to be asynchronous, you could decorate it with tornado.gen.coroutine and use AsyncHTTPClient or grequests. This will give you non-blocking requests. You could potentially put your calculations in a coroutine as well, though I'm not entirely sure.
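A sketch of what that might look like with AsyncHTTPClient (the upstream URL is made up):

import tornado.gen
import tornado.httpclient
import tornado.web

class AsyncHandler(tornado.web.RequestHandler):
    @tornado.gen.coroutine
    def post(self):
        client = tornado.httpclient.AsyncHTTPClient()
        # The handler yields here; other requests can be served meanwhile.
        response = yield client.fetch('http://upstream.example.com/api')
        self.write(response.body)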
To achieve something similar to Google App Engine's 'deferred calls' (i.e., the request is handled, and afterwards the deferred task is handled), I experimented a little and came up with the solution of spawning a thread in which my deferred call is handled.
I am now trying to determine whether this is an acceptable approach.
Is it possible (according to the WSGI specification) that the process is terminated by the webserver after the actual request is handled, but before all threads have finished?
(if there's a better way, that would also be fine)
WSGI does not specify the lifetime of an application process (a WSGI application is just a Python callable object). You can run it in a way that is completely independent of the web server, in which case only you control the lifetime.
There is also nothing in WSGI that would prohibit you from spawning threads, or processes, or doing whatever the hell you want.
FWIW, also have a read of:
http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode
Hooking actions to the close() of the returned iterable is the only way, within the context of the WSGI specification itself, of doing deferred work. That isn't in a separate thread though, and it occurs within the context of the actual request, albeit after the response is supposed to have been flushed back to the client. Thus your deferred action will consume that request thread until the work is complete, and so that request thread will not be able to handle other requests until then.
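The referenced modwsgi wiki page shows the pattern; roughly, it is a middleware along these lines (the names are illustrative):

class CallbackOnClose(object):
    # Wraps the response iterable; runs a callback when close() is called.
    def __init__(self, result, callback):
        self.result = result
        self.callback = callback
    def __iter__(self):
        return iter(self.result)
    def close(self):
        try:
            if hasattr(self.result, 'close'):
                self.result.close()
        finally:
            # Runs after the response has been sent, but still on the
            # request thread, so it occupies that thread until it finishes.
            self.callback()

def execute_on_completion(application, callback):
    def wrapper(environ, start_response):
        return CallbackOnClose(application(environ, start_response), callback)
    return wrapper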
In general, if you do use background threads, there is no guarantee that any hosting mechanism will wait until those background threads complete before shutting the process down. In fact, I can't even think of any standard deployment mechanism that does wait. There isn't even a guarantee that atexit handlers will be called on process shutdown, something the referenced documentation also briefly talks about.
So, I'm writing a Python web application using the Twisted web2 framework. There's a library that I need to use (SQLAlchemy, to be specific) that doesn't have asynchronous code. Would it be bad to spawn a thread to handle the request, fetch any data from the DB, and then return a response? I'm afraid that if there were a flood of requests, too many threads would be started and the server would be overwhelmed. Is there something built into Twisted that prevents this from happening (e.g., request throttling)?
See the docs, and specifically the thread pool, which lets you control how many threads are active at most. Spawning one new thread per request would definitely be an inferior idea!
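A minimal sketch of the usual pattern, deferToThread with a capped reactor thread pool (the query function and pool size are illustrative):

from twisted.internet import reactor
from twisted.internet.threads import deferToThread

# Cap the reactor's thread pool so a flood of requests can't spawn
# an unbounded number of threads; excess work queues up instead.
reactor.suggestThreadPoolSize(10)

def query_db(user_id):
    # Blocking SQLAlchemy code runs here, in a pool thread.
    ...

def on_result(rows):
    ...

# Returns a Deferred that fires in the reactor thread with the result.
d = deferToThread(query_db, 42)
d.addCallback(on_result)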