Flask: spawning a single async sub-task within a request - python

I have seen a few variants of my question but not quite exactly what I am looking for, hence opening a new question.
I have a Flask/Gunicorn app that for each request inserts some data in a store and, consequently, kicks off an indexing job. The indexing is 2-4 times longer than the main data write and I would like to do that asynchronously to reduce the response latency.
The overall request lifespan is 100-150ms for a large request body.
I have thought about a few ways to do this, that is as resource-efficient as possible:
Use Celery. This seems the most obvious way to do it, but I don't want to introduce a large library and most of all, a dependency on Redis or other system packages.
Use subprocess.Popen. This may be a good route but my bottleneck is I/O, so threads could be more efficient.
Using threads? I am not sure how and if that can be done. All I know is how to launch multiple processes concurrently with ThreadPoolExecutor, but I only need to spawn one additional task, and return immediately without waiting for the results.
asyncio? This too I am not sure how to apply to my situation. asyncio has always a blocking call.
Launching data write and indexing concurrently: not doable. I have to wait for a response from the data write to launch indexing.
Any suggestions are welcome!
Thanks.

Celery will be your best bet - it's exactly what it's for.
If you have a need to introduce dependencies, it's not a bad thing to have dependencies. Just as long as you don't have unneeded dependencies.
Depending on your architecture, though, more advanced and locked-in solutions might be available. You could, if you're using AWS, launch an AWS Lambda function by firing off an AWS SNS notification, and have that handle what it needs to do. The sky is the limit.

I actually should have perused the Python manual section on concurrency better: the threading module does just what I needed: https://docs.python.org/3.5/library/threading.html
And I confirmed with some dummy sleep code that the sub-thread gets completed even after the Flask request is completed.

Related

python async/await callback prevalence

I feel like there is a gap in my understanding regarding async/await functionality in python. From my understanding, once is a task is created via asyncio.create_task() it is automatically scheduled, and then a future await call will actually block other code execution until the task has completed. So if you create two tasks, and then await them sequentially, the second task could finish first, but the first task must be completed before code execution can continue. However, code in between the task creation and the await call will obviously proceed immediately (unlike the sync case), which is what I think is the benefit of async/await (please correct me if I am wrong). Are there also other benefits?
Alternatively, one can send off multiple tasks and then use as_completed or gather to handle them as they finish (perhaps out of order).
This flow makes sense in some data aggregating workflow, if you want to send off like 1000 requests simultaneously and want to aggregate or operate on the results sequentially. This all makes sense if you know how many requests you will have before hand, and what exactly the calls will look like, since you are essentially creating async tasks en masse. But what if you want to do async tasks frequently, but not all at once, like sending quotes to a trade matching engine?
I many situations, you likely want to fire a request, continue doing some other work, maybe fire more requests, and then handle the responses upon receiving them (in order of receipt since speed is critical), in a callback mechanism fashion. I don't see much recommendation about add_done_callback which leads me to believe that while I could probably achieve what I am looking for, there must be a gap in my understanding since its not recommended much. What alternatives in the asyncio sphere exist to achieve what I'm talking about? Can an asyncio.Queue be used to achieve this? I'm just confused because nearly every tutorial on the internet involves sending 1000 http requests at once, and handling them using gather or as_completed, but I feel that is such a synthetic and non-real-world workflow.

Concurrency: multiprocessing, threading, greenthreads and asyncio

I'm currently working on Python project that receives a lot os AWS SQS messages (more than 1 million each day), process these messages, and send then to another SQS queue with additional data. Everything works fine, but now we need to speed up this process a lot!
From what we have seen, or biggest bottleneck is in regards to HTTP requests to send and receive messages from AWS SQS api. So basically, our code is mostly I/O bound due to these HTTP requests.
We are trying to escalate this process by one of the following methods:
Using Python's multiprocessing: this seems like a good idea, but our workers run on small machines, usually with a single core. So creating different process may still give some benefit, since the CPU will probably change process as one or another is stuck at an I/O operation. But still, that seems a lot of overhead of process managing and resources for an operations that doesn't need to run in parallel, but concurrently.
Using Python's threading: since GIL locks all threads at a single core, and threads have less overhead than processes, this seems like a good option. As one thread is stuck waiting for an HTTP response, the CPU can take another thread to process, and so on. This would get us to our desired concurrent execution. But my question is how dos Python's threading know that it can switch some thread for another? Does it knows that some thread is currently on an I/O operation and that he can switch her for another one? Will this approach absolutely maximize CPU usage avoiding busy wait? Do I specifically has to give up control of a CPU inside a thread or is this automatically done in Python?
Recently, I also read about a concept called green-threads, using Eventlet on Python. From what I saw, they seem the perfect match for my project. The have little overhead and don't create OS threads like threading. But will we have the same problems as threading referring to CPU control? Does a green-thread needs to warn the CPU that it may take another one? I saw on some examples that Eventlet offers some built-in libraries like Urlopen, but no Requests.
The last option we considered was using Python's AsyncIo and async libraries such as Aiohttp. I have done some basic experimenting with AsyncIo and wasn't very pleased. But I can understand that most of it comes from the fact that Python is not a naturally asynchronous language. From what I saw, it would behave something like Eventlet.
So what do you think would be the best option here? What library would allow me to maximize performance on a single core machine? Avoiding busy waits as much as possible?

Python Multiprocessing: worker pool vs process

I have a discord bot I need to scale.
The main features of the bot is to fetch data from a 3rd party website and also keep a database with member info.
These 2 operations are quite time consuming and I wanted to have a separate worker/process for each of them.
My constraints:
There is a limit of GET's per min with the 3rd party website.
The database can't be accessed simultaneously for same guild.
I've been researching online for the best way to do this but I come into several libraries/ways to implement this kind of solution. What are the options I have and their strengths and weaknesses?
Since there is a limit on the amount of requests from the host, I would first try to run a synchronous program and check whether the limit is reached before the minute ends. If it does then there would be no need to concurrently run other workers. However if the limit is not reached, then I would recommend you use both asyncio and aiohttp to asynchronously get the requests. There's a ton of information out there on how to get started using these libraries.
The other option would be to use the good old threading module (or concurrent.futures for a higher level use case). Both options have their pros and cons. What I would do is first try the concurrent.futures (namely, the ThreadPoolExecutor context manager) module since you only have to add like one line of code. If it does not get the job done, then remember: use asyncio if you have to, and threading if you must. Both of these modules are easy to use and understand as well, but they do need to follow a general structure, which means you'll most likely have to change your code.

Running as many instances of a program as possible

I'm trying to implement some code to import user's data from another service via the service's API. The way I'm going to set it up is all the request jobs will be kept in a queue which my simple importer program will draw from. Handling one task at a time won't come anywhere close to maxing out any of the computer's resources so I'm wondering what is the standard way to structure a program to run multiple "jobs" at once? Should I be looking into threading or possibly a program that pulls the jobs from the queue and launches instances of the importer program? Thanks for the help.
EDIT: What I have right now is in Python although I'm open to rewriting it in another language if need be.
Use a Producer-Consumer queue, with as many Consumer threads as you need to optimize resource usage on the host (sorry - that's very vague advice, but the "right number" is problem-dependent).
If requests are lightweight you may well only need one Producer thread to handle them.
Launching multiple processes could work too - best choice depends on your requirements. Do you need the Producer to know whether the operation worked, or is it 'fire-and-forget'? Do you need retry logic in the event of failure? How do you keep count of concurrent Consumers in this model? And so on.
For Python, take a look at this.

Threaded vs. asynchronous image processing?

I have a Python function which generates an image once it is accessed. I can either invoke it directly upon a HTTP request, or do it asynchronously using Gearman. There are a lot of requests.
Which way is better:
Inline - create an image inline, will result in many images being generated at once
Asynchronous - queue jobs (with Gearman) and generate images in a worker
Which option is better?
In this case "better" would mean the best speed / load combinations. The image generation example is symbolical, as this can also be applied to Database connections and other things.
I have a Python function which
generates an image once it is
accessed. I can either invoke it
directly upon a HTTP request, or do it
asynchronously using Gearman. There
are a lot of requests.
You should not do it inside you request because then you can't throttle(your server could get overloaded). All big sites use a message queue to do the processing offline.
Which option is better?
In this case "better" would mean the
best speed / load combinations. The
image generation example is
symbolical, as this can also be
applied to Database connections and
other things.
You should do it asynchronous because the most compelling reason to do it besides it speeds up your website is that you can throttle your queue if you are on high load. You could first execute the tasks with the highest priority.
I believe forking processes is expensive. I would create a couple worker processes(maybe do a little threading inside process) to handle the load. I would probably use redis because it is fast, actively developed(antirez/pietern commits almost everyday) and has a very good/stable python client library. blpop/rpush could be used to simulate a queue(job)
If your program is CPU bound in the interpreter then spawning multiple threads will actually slow down the result even if there are enough processors to run them all. This happens because the GIL (global interpreter lock) only allows one thread to run in the interpreter at a time.
If most of the work happens in a C library it's likely the lock is not held and you can productively use multiple threads.
If you are spawning threads yourself you'll need to make sure to not create too many - 10K threads at one would be bad news - so you'd need to setup a work queue that the threads read from instead of just spawning them in a loop.
If I was doing this I'd just use the standard multiprocessing module.

Categories

Resources