I am developing a REST API using Python Flask (the client is a mobile app).
However, the important functionality is batch-like: when a user sends a POST request with their data,
the server reads data from the DB, processes it, and then updates (or inserts) the result.
Considering the large amount of reading, writing, and computation involved,
how would you develop it?
These are the options I'm considering:
Use stored procedures in the DB.
Create an external worker program that is deployed independently of the API.
Create a separate batch server.
Just run it on the API server.
I can't judge which is right with my current knowledge.
And the important thing is that execution must not feel slow.
To the user, it should seem as though everything is running on their own device.
I would appreciate any advice on back-end development.
I would recommend considering asyncio. This is pretty much your use case: the I/O is time-consuming but doesn't require much CPU, so essentially you want that I/O to be done asynchronously while the rest of the server carries on.
The server receives a request that requires I/O.
It hands that request off to your asyncio machinery so the I/O can be performed.
The server is immediately available to receive other requests while the previous I/O request is being processed.
The previous I/O request finishes. Asyncio offers a few ways to deal with this.
See the docs, but you could provide a callback, or build your logic to take advantage of asyncio's event loop (which essentially manages switching back and forth between contexts, e.g. the "main" context of your server serving requests and the async I/O operations you have queued up).
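A minimal sketch of that pattern with modern asyncio; the function names and the one-second sleep are placeholders for your real DB work, not anything from the question:

import asyncio

async def process_user_data(user_id, payload):
    # Stand-in for the real read -> compute -> write cycle; an async DB
    # driver (e.g. asyncpg or aiomysql) would be awaited here instead.
    await asyncio.sleep(1)
    return {"user_id": user_id, "status": "done"}

def on_done(task):
    # Callback fired by the event loop once a task completes.
    print("finished:", task.result())

async def main():
    # Three overlapping "requests": the loop interleaves their waits,
    # so total wall time is about 1 second, not 3.
    tasks = [asyncio.create_task(process_user_data(i, {})) for i in range(3)]
    for t in tasks:
        t.add_done_callback(on_done)
    await asyncio.gather(*tasks)

asyncio.run(main())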
Related
I have the following scenario:
There is one thread that manages a long-polling HTTP connection (running non-stop) to an API. When a new message arrives, it must be processed within a special process() method.
I want to design it so that incoming messages are processed concurrently, but there is another important point: at the end of each processing step, an answer should be passed to the outgoing queue, which is managed in a separate thread. From there the answers will be sent via HTTP.
Here is a scheme:
Let's consider that there can be 30-50 messages per second, and that the process() method will take from 1 up to 10 seconds.
The question is: what library or framework can I use to implement this architecture?
As far as I have researched, Python's Tornado has good benchmarks, but I do not need a web framework here, just a tool that can provide concurrent running of the message processors.
Your message rate is pretty low, so you may freely use "standard" tools like RabbitMQ/Redis, Celery, and asyncio.
RabbitMQ or Redis with Celery are great tools to implement queues and manage your tasks and worker processes.
Asyncio is faster than Tornado, but that doesn't matter for your task. What matters more is that asyncio gives you all the benefits of the modern async/await coroutine technique.
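For illustration, a minimal Celery sketch; the module name, broker URL, and task body are assumptions for the example, not something from the question:

import time
from celery import Celery

# The broker/backend URLs are assumptions; any Redis or RabbitMQ
# instance Celery supports will do.
app = Celery('tasks', broker='redis://localhost:6379/0',
             backend='redis://localhost:6379/0')

@app.task
def process(message):
    # The 1-10 second processing from the question happens here, inside
    # a worker process, so the long-polling thread is never blocked.
    time.sleep(1)  # placeholder for the real work
    return message.upper()  # placeholder answer

The polling thread just calls process.delay(msg) for each incoming message; workers started with "celery -A tasks worker" pick them up concurrently, and the answers can be collected from the returned AsyncResult objects and pushed to your outgoing queue.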
I've got a Flask app that connects with a given URL to external services (with different, but usually long, response times) and searches for some stuff there. After that there are some CPU-heavy operations on the retrieved data. These take some time too.
My problem: the response from an external service may take a while. You can't do much about that, but it becomes a big problem when you have multiple requests at once - a Flask request to an external service blocks the thread, and the rest are left waiting.
An obvious waste of time, and it's killing the app.
I heard about this asynchronous library called Tornado. And here are my questions:
Does that mean it can handle multiple requests and just trigger a callback right after the response from the external service arrives?
Can I achieve that with my current Flask app (probably not, because of WSGI, I guess?), or do I need to rewrite the whole app in Tornado?
What about those CPU-heavy operations - would they block my thread? Load balancing is a good idea anyway, but I'm curious how Tornado handles that.
Possible traps, gotchas?
The web server built into Flask isn't meant to be used in production, for exactly the reasons you're listing - it's single-threaded, and easily bogged down if any request blocks for a non-trivial amount of time. The Flask documentation lists several options for deploying it in a production environment: mod_wsgi, gunicorn, uWSGI, etc. All of those deployment options provide mechanisms for handling concurrency, either via threads, processes, or non-blocking I/O. Note, though, that if you're doing CPU-bound operations, the only option that will give true concurrency is to use multiple processes.
If you want to use Tornado, you'll need to rewrite your application in the Tornado style. Because its architecture is based on explicit asynchronous I/O, you can't use its asynchronous features if you deploy it as a WSGI application. The "Tornado style" basically means using non-blocking APIs for all I/O operations, and using sub-processes for handling any long-running CPU-bound operations. The Tornado documentation covers how to make asynchronous I/O calls, but here's a basic example of how it works:
from tornado import gen
from tornado.httpclient import AsyncHTTPClient

@gen.coroutine
def fetch_coroutine(url):
    http_client = AsyncHTTPClient()
    # Suspends this coroutine until the HTTP response arrives.
    response = yield http_client.fetch(url)
    return response.body  # on Python 2, use: raise gen.Return(response.body)
The response = yield http_client.fetch(url) call is actually asynchronous; it returns control to the Tornado event loop when the request begins, and resumes again once the response is received. This allows multiple asynchronous HTTP requests to run concurrently, all within one thread. Do note, though, that anything you do inside of fetch_coroutine that isn't asynchronous I/O will block the event loop, and no other requests can be handled while that code is running.
To deal with long-running CPU-bound operations, you need to send the work to a subprocess to avoid blocking the event loop. For Python, that generally means using either multiprocessing or concurrent.futures. I'd take a look at this question for more information on how best to integrate those libraries with tornado. Do note that you won't want to maintain a process pool larger than the number of CPUs you have on the system, so consider how many concurrent CPU-bound operations you expect to be running at any given time when you're figuring out how to scale this beyond a single machine.
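For example, something along these lines (a sketch assuming Tornado 4+; crunch() is a stand-in for your real CPU-bound work):

import os
from concurrent.futures import ProcessPoolExecutor
from tornado import gen

# Sized to the CPU count, per the advice above, and created once at startup.
executor = ProcessPoolExecutor(max_workers=os.cpu_count())

def crunch(data):
    # CPU-bound work runs in a separate process, so it can't block
    # the event loop.
    return sum(x * x for x in data)

@gen.coroutine
def handle(data):
    # Yielding the concurrent.futures.Future suspends this coroutine
    # until the subprocess finishes, while other requests keep running.
    result = yield executor.submit(crunch, data)
    raise gen.Return(result)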
The tornado documentation has a section dedicated to running behind a load balancer, as well. They recommend using NGINX for this purpose.
Tornado seems a better fit for this task than Flask. A subclass of tornado.web.RequestHandler run in an instance of tornado.ioloop should give you non-blocking request handling. I expect it would look something like this:
import tornado
import tornado.web
import tornado.ioloop
import json

class Handler(tornado.web.RequestHandler):
    def post(self):
        self.write(json.dumps({'aaa': 'bbbbb'}))

if __name__ == '__main__':
    app = tornado.web.Application([('/', Handler)])
    app.listen(80, address='0.0.0.0')
    loop = tornado.ioloop.IOLoop.instance()
    loop.start()
If you want your post handler to be asynchronous, you could decorate it with tornado.gen.coroutine and use AsyncHTTPClient or grequests. This will give you non-blocking requests. You could potentially put your calculations in a coroutine as well, though I'm not entirely sure.
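Roughly like this, with AsyncHTTPClient (the handler name and URL are placeholders):

from tornado import gen, web
from tornado.httpclient import AsyncHTTPClient

class AsyncHandler(web.RequestHandler):
    @gen.coroutine
    def post(self):
        client = AsyncHTTPClient()
        # Non-blocking: the IOLoop serves other requests while this
        # coroutine waits for the external service to respond.
        response = yield client.fetch('http://example.com/api')
        self.write(response.body)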
Our server has a lot of CPUs, and some web requests could be faster if the request handlers did some parallel processing.
Example: some work needs to be done on N (about 1-20) pictures to serve one web request.
Caching, or doing the work before the request comes in, is not possible.
What can be done to use several CPUs of the hardware?
threads: I don't like them
multiprocessing: every request would need to start N processes; many CPU cycles would be lost starting new processes and importing libraries
a special (hand-made) service which has N processes ready for processing
celery (RabbitMQ): I don't know how big the communication overhead is...
Other solutions?
Platform: Django (Python)
Regarding your second and third alternatives: you do not need to start a new process for every request. This is what process pools are for. New processes are created when your app starts up. When you submit a request to the pool, it is automatically queued until a worker is available. The disadvantage is that requests are blocking: if no worker is available at the moment, your user will sit and wait.
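A sketch of that process-pool approach (process_picture and the request-handling function are placeholders, not part of the question):

import multiprocessing

def process_picture(path):
    # Stand-in for the real per-picture work.
    return path

# Created once at startup (e.g. at module import time in Django),
# so no fork/import cost is paid per request.
pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())

def handle_request(picture_paths):
    # Blocks this request until all N pictures are done, but the work
    # itself runs across the CPUs in parallel.
    return pool.map(process_picture, picture_paths)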
You could use the standard library module asyncore.
This module provides the basic infrastructure for writing asynchronous socket service clients and servers.
There is an example of how to create a basic HTTP client.
Then there's Twisted, it can do lots and lots of things, which is why it's somewhat daunting. Here is an example using its HTTP client.
Twisted "speaks HTTP"; asyncore does not, so you'll have to implement that part yourself.
Other libraries:
Tornado's httpclient
asynchttp
Short version: How can I prevent blocking Pika in a Remote Procedure Call situation?
Long version:
None of the Pika examples demonstrate my use case.
I have a Tornado server which communicates with other processes/machines over AMQP (RabbitMQ, Pika). These other processes are not very well-defined, but they will, for the most part, be returning data (see the RPC example on RabbitMQ's website). Sometimes, a process might need to take an extremely long time to process a large amount of information, but it shouldn't completely block smaller requests from being taken by the process. Or maybe the remote server is blocking because it sent out a web request. Think of it like a web server, but using AMQP instead of HTTP.
Since Pika documentation claims that it's not thread-safe, I cannot pass the connection to multiple threads (or processes, for that matter). What I want to do is start a new process, and add a socket event (for the pipe to that program) to the Pika IOLoop, as I would be able to do with Tornado. The Pika IOLoop is much different from the Tornado IOLoop, and it doesn't seem to support adding multiple handlers; it seems to operate using one "poller" on one socket.
I'd like to avoid requiring the Tornado package for this package, because I would only be using the IOLoop. It's not out of the question, but I want to see what my other options are, or if there is a solution to my problem by somehow connecting multiple Pika IOLoops/Pollers. RabbitMQ's documentation says that workers can often be "scaled up" by adding more. I'd like to avoid creating a connection for every request that comes in (if they're coming in fast).
From what you described, I believe you unfortunately either need a different communication model or need multiple Pika IOLoops/Pollers/Redundant Connections.
It sounds like from documentation and from other sites that RPC in Pika is always a blocking statement and unable to be passed around between threads. See http://www.rabbitmq.com/tutorials/tutorial-six-python.html where the author points out that RPC in Pika is inherently blocking once you actually call the ioloop.
"When in doubt avoid RPC. If you can, you should use an asynchronous pipeline - instead of RPC-like blocking"
If you want to keep sending multiple RPC calls on the same connection before one completes, you'll need a different asynchronous model. Multiple RPC calls on the same connection before completion isn't the usual implementation of the RPC model, though it's not technically forbidden (http://pic.dhe.ibm.com/infocenter/aix/v6r1/index.jsp?topic=%2Fcom.ibm.aix.progcomm%2Fdoc%2Fprogcomc%2Frpc_mod.htm). I don't think Pika operates with this model, though it does have asynchronous support via callbacks (not what you are looking for, I think).
If you just want to easily be able to generate new connections on the fly you could use a thread or process wrapper on a connection, where you create and block on the RPC in the other context and push to a common Queue which the main thread can monitor. Tornado might give you this, but I agree that it's a bit of overkill, and making such a connection wrapper shouldn't be all that difficult as I've done something similar for other I/O ops in less than 100 lines of Python (see Queue package for Threaded wrapper version). I think you already saw this possibility though based on your talk of multiple IOLoops.
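A rough sketch of that wrapper idea; rpc_call is a hypothetical stand-in for a blocking Pika RPC performed over its own dedicated connection:

import threading
import queue

answers = queue.Queue()  # the main thread monitors this

def rpc_call(request):
    # Hypothetical: open a dedicated Pika connection, perform the
    # blocking RPC, and return the reply. No connection is shared
    # between threads, so Pika's thread-safety restriction is respected.
    ...

def rpc_in_thread(request):
    def worker():
        answers.put(rpc_call(request))
    threading.Thread(target=worker, daemon=True).start()

# Main thread: fire off calls, then drain the queue as replies arrive.
# rpc_in_thread(some_request)
# reply = answers.get()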
I'm quite new to python threading/network programming, but have an assignment involving both of the above.
One of the requirements of the assignment is that for each new request I spawn a new thread, but I need to both send to and receive from the browser at the same time.
I'm currently using the asyncore library in Python to catch each request, but as I said, I need to spawn a thread for each request, and I was wondering whether using both threads and the asynchronous library is overkill, or the correct way to do it.
Any advice would be appreciated.
Thanks
EDIT:
I'm writing a proxy server, and I'm not sure whether my client connection is persistent. My client is my browser (using Firefox for simplicity).
It seems to reconnect for each request. My problem is that if I open one tab with http://www.google.com and another with http://www.stackoverflow.com, I only get one request at a time from each tab, instead of multiple requests from Google and from SO.
I answered a question that sounds amazingly similar to yours, where someone had a homework assignment to create a client-server setup, with each connection being handled in a new thread: https://stackoverflow.com/a/9522339/496445
The general idea is that you have a main server loop constantly looking for a new connection to come in. When one does, you hand it off to a thread, which then does its own monitoring for new communication.
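In code, the idea looks roughly like this (an echo handler stands in for the real per-connection logic, and the port is arbitrary):

import socket
import threading

def handle_client(conn, addr):
    # Each connection gets its own thread, so a slow client here never
    # stops the main loop from accepting new ones.
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)  # echo; replace with real handling

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('0.0.0.0', 8080))
server.listen(5)

while True:
    conn, addr = server.accept()  # blocking call in the main loop
    threading.Thread(target=handle_client, args=(conn, addr), daemon=True).start()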
An extra bit about asyncore vs threading
From the asyncore docs:
There are only two ways to have a program on a single processor do "more than one thing at a time." Multi-threaded programming is the simplest and most popular way to do it, but there is another very different technique, that lets you have nearly all the advantages of multi-threading, without actually using multiple threads. It's really only practical if your program is largely I/O bound. If your program is processor bound, then pre-emptive scheduled threads are probably what you really need. Network servers are rarely processor bound, however.
As this quote suggests, using asyncore and threading should be for the most part mutually exclusive options. My link above is an example of the threading approach, where the server loop (either in a separate thread or the main one) does a blocking call to accept a new client. And when it gets one, it spawns a thread which will then continue to handle the communication, and the server goes back into a blocking call again.
In the pattern of using asyncore, you would instead use its async loop which will in turn call your own registered callbacks for various activity that occurs. There is no threading here, but rather a polling of all the open file handles for activity. You get the sense of doing things all concurrently, but under the hood it is scheduling everything serially.
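For contrast, here is a minimal asyncore version of the same echo server. Note that asyncore was removed in Python 3.12, so this sketch targets the older interpreters the question assumes:

import asyncore
import socket

class EchoHandler(asyncore.dispatcher_with_send):
    def handle_read(self):
        # Fired by the polling loop when this socket is readable;
        # no threads anywhere.
        data = self.recv(4096)
        if data:
            self.send(data)

class Server(asyncore.dispatcher):
    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((host, port))
        self.listen(5)

    def handle_accepted(self, sock, addr):
        # Also invoked from the single-threaded loop for each new client.
        EchoHandler(sock)

Server('0.0.0.0', 8080)
asyncore.loop()  # schedules everything serially, as described above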