I wrote an API which does some database operations with values requested by the API caller. How does this whole API system work when more than one person calls a function at the same time?
Do different instances of my API code start when a number of API calls are made?
If I need to handle, say, 2500 parallel API calls, what exact precautions (like paying attention to database load) do I need to take?
Do you plan to call your Python API from some other Python code? If so, how is the parallelism achieved? Do you plan to spawn many threads and use your API in every thread?
Anyway, it's worthwhile to take a look at the multiprocessing module, which allows you to create separate Python processes. There are also several threading modules that let you parallelize code execution within the same process, but keep in mind that the latter case is subject to the Global Interpreter Lock - search for more info.
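For instance, a minimal sketch of the multiprocessing approach (the busy function is just a made-up CPU-bound stand-in):

```python
from multiprocessing import Pool

def busy(n):
    # CPU-bound stand-in: the GIL would serialize this across threads,
    # but separate processes each get their own interpreter and GIL
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # each chunk of work runs in its own process, truly in parallel
        results = pool.map(busy, [100_000] * 4)
```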
I have a discord bot I need to scale.
The main features of the bot are fetching data from a 3rd party website and keeping a database with member info.
These 2 operations are quite time consuming and I wanted to have a separate worker/process for each of them.
My constraints:
There is a limit of GET's per min with the 3rd party website.
The database can't be accessed simultaneously for same guild.
I've been researching online for the best way to do this, but I keep coming across several libraries/ways to implement this kind of solution. What are the options I have, and their strengths and weaknesses?
Since there is a limit on the amount of requests to the host, I would first try to run a synchronous program and check whether the limit is reached before the minute ends. If it is, then there would be no need to run other workers concurrently. However, if the limit is not reached, then I would recommend you use both asyncio and aiohttp to get the requests asynchronously. There's a ton of information out there on how to get started with these libraries.
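A rough sketch of what the asyncio approach could look like, with asyncio.sleep standing in for the actual aiohttp GET (the URLs are made up):

```python
import asyncio

async def fetch(url):
    # stand-in for `async with session.get(url) as resp: ...` with aiohttp
    await asyncio.sleep(0.01)
    return f"data from {url}"

async def main(urls):
    # fire off all the GETs concurrently and wait for every response
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = [f"https://example.com/page/{i}" for i in range(5)]
results = asyncio.run(main(urls))
```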
The other option would be to use the good old threading module (or concurrent.futures for a higher-level interface). Both options have their pros and cons. What I would do is first try the concurrent.futures module (namely, the ThreadPoolExecutor context manager), since you only have to add about one line of code. If it does not get the job done, then remember: use asyncio if you have to, and threading if you must. Both of these modules are easy to use and understand as well, but they do need to follow a general structure, which means you'll most likely have to change your code.
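And a hedged sketch of the ThreadPoolExecutor route, again with a placeholder fetch function in place of a real blocking request:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # stand-in for a blocking call such as requests.get(url)
    return f"data from {url}"

urls = [f"https://example.com/page/{i}" for i in range(5)]

# the "one line" in question: map the blocking fetch over a thread pool
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fetch, urls))
```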
I’m working on an ETL project using Azure Functions where I extract data from blob storage, transform the data in Python and pandas, and load the data using pandas to_sql(). I’m trying to make this process more efficient by using asyncio and language workers.
I’m a little confused because I was under the impression that asyncio works using one thread, but the Azure Functions documentation says you can use multiple language workers if you change your config and that even a method that doesn’t use the async keyword runs in a thread pool.
Does that mean that if I don’t use the async keyword my methods will run concurrently using language workers? Do I have to use asyncio to utilize language workers?
Also, the documentation says that Azure Functions can scale to up to 200 instances. How can I scale to that many instances if I’m only allowed a maximum of 10 language workers?
Edit: Thanks Anatoli. Just to clarify, if I have a Timer Trigger with the following code:
import azure.functions as func
from . import client_one_etl
from . import client_two_etl
def main(mytimer: func.TimerRequest) -> None:
client_one_etl.main()
client_two_etl.main()
If I have increased the number of language workers, does that mean both client_one_etl.main() and client_two_etl.main() are automatically run in separate threads even without using asyncio? And if client_two_etl.main() needs client_one_etl.main() to finish before executing, I will need to use async await to prevent them from running concurrently?
And for the separate instances, if client_one_etl.main() and client_two_etl.main() do not rely on each other does that mean I can execute them in one Azure Function app as separate .py scripts that run in their own VM? Is it possible to run multiple Timer Triggers (calling multiple __init__.py scripts each in their own VM for one Azure Function)? Then all scripts will need to complete within 10 minutes if I increase functionTimeout in the host.json file?
FUNCTIONS_WORKER_PROCESS_COUNT limits the maximum number of worker processes per Functions host instance. If you set it to 10, each host instance will be able to run up to 10 Python functions concurrently. Each worker process will still execute Python code on a single thread, but now you have up to 10 of them running concurrently. You don't need to use asyncio for this to happen. (Having said that, there are legitimate reasons to use asyncio to improve scalability and resource utilization, but you don't have to do that to take advantage of multiple Python worker processes.)
The 200 limit applies to the number of Functions host instances per Function app. You can think of these instances as separate VMs. The FUNCTIONS_WORKER_PROCESS_COUNT limit is applied to each of them individually, which brings the total number of concurrent threads to 2000.
UPDATE (answering the additional questions):
As soon as your function invocation starts on a certain worker, it will run on this worker until complete. Within this invocation, code execution will not be distributed to other worker processes or Functions host instances, and it will not be automatically parallelized for you in any other way. In your example, client_two_etl.main() will start after client_one_etl.main() exits, and it will start on the same worker process, so you will not observe any concurrency, regardless of the configured limits (unless you do something special in client_*_etl.main()).
When multiple invocations happen around the same time, these invocations may be automatically distributed to multiple workers, and this is where the limits mentioned above apply. Each invocation will still run on exactly one worker, from start to finish. In your example, if you manage to invoke this function twice around the same time, each invocation can get its own worker and they can run concurrently, but each will execute both client_one_etl.main() and client_two_etl.main() sequentially.
Please also note that because you are using a timer trigger on a single function, you will not experience any concurrency at all: by design, timer trigger will not start a new invocation until the previous invocation is complete. If you want concurrency, either use a different trigger type (for example, you can put a queue message on timer, and then the function triggered by the queue can scale out to multiple workers automatically), or use multiple timer triggers with multiple functions, like you suggested.
If what you actually want is to run independent client_one_etl.main() and client_two_etl.main() concurrently, the most natural thing to do is to invoke them from different functions, each implemented in a separate __init__.py with its own trigger, within the same or different Function apps.
functionTimeout in host.json is applied per function invocation. So, if you have multiple functions in your app, each invocation should complete within the specified limit. This does not mean all of them together should complete within this limit (if I understood your question correctly).
UPDATE 2 (answering more questions):
@JohnT Please note that I'm not talking about the number of Function apps or __init__.py scripts. A function (described by __init__.py) is a program that defines what needs to be done. You can create way more than 10 functions per app, but don't do this to increase concurrency - this will not help. Instead, add functions to separate logically independent and coherent programs. A function invocation is a process that actively executes the program, and this is where the limits I'm talking about apply. You will need to be very clear on the difference between a function and a function invocation.
Now, in order to invoke a function, you need a worker process dedicated to this invocation until this invocation is complete. Next, in order to run a worker process, you need a machine that will host this process. This is what the Functions host instance is (not a very accurate definition of Functions host instance, but good enough for the purposes of this discussion). When running on the Consumption plan, your app can scale out to 200 Functions host instances, and each of them will start a single worker process by default (because FUNCTIONS_WORKER_PROCESS_COUNT = 1), so you can run up to 200 function invocations simultaneously. Increasing FUNCTIONS_WORKER_PROCESS_COUNT will allow each Functions host instance to create more than one worker process, so up to FUNCTIONS_WORKER_PROCESS_COUNT function invocations can be handled by each Functions host instance, bringing the potential total to 2000.
Please note though that "can scale out" does not necessarily mean "will scale out". For more details, see Azure Functions scale and hosting and Azure Functions limits.
I am developing a REST API using Python Flask (the client is a mobile app).
However, the important functions are batch operations that read data from the DB, process it, and then update (or insert) the data when a user sends a POST request with user data.
Considering the heavy amount of reads, writes, and computation, how would you develop it?
These are the approaches I can think of:
Use procedures in DB
Create an external deployment program that is independent of API.
Create a separate batch server
Just run it on the API server
I cannot judge which of these is right with my knowledge.
And the important thing is that execution speed must not be slow: to the user, the operations should feel as though they are running on their own device.
I would like to ask you for advice on back-end development.
I would recommend considering asyncio. This is pretty much the use case you have - i/o is time-consuming, but doesn't require lots of CPU. So essentially you would want that i/o to be done asynchronously, while the rest of the server carries on.
The server receives a request that requires i/o.
It spins that request off into your asyncio architecture, so it can be performed.
The server is already available to receive other requests while the previous i/o request is being processed.
The previous i/o request finishes. Asyncio offers a few ways to deal with this.
See the docs, but you could provide a callback, or build your logic to take advantage of asyncio's event loop (which essentially manages switching back and forth between contexts, e.g. the "main" context of your server serving requests and the async i/o operations that you have queued up).
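The flow above can be sketched roughly like this (slow_io is a hypothetical stand-in for your time-consuming database/network call):

```python
import asyncio

async def slow_io(request_id):
    # simulated time-consuming i/o (database call, network request, ...)
    await asyncio.sleep(0.05)
    return f"result for request {request_id}"

async def server():
    # request 1 arrives: spin its i/o off as a background task
    task = asyncio.create_task(slow_io(1))
    # the server is immediately free to handle request 2
    second = await slow_io(2)
    # later, the event loop hands back the result of the first request
    first = await task
    return [second, first]

results = asyncio.run(server())
```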
I have a Python function which generates an image once it is accessed. I can either invoke it directly upon a HTTP request, or do it asynchronously using Gearman. There are a lot of requests.
Which way is better:
Inline - create an image inline, will result in many images being generated at once
Asynchronous - queue jobs (with Gearman) and generate images in a worker
Which option is better?
In this case "better" would mean the best speed / load combinations. The image generation example is symbolical, as this can also be applied to Database connections and other things.
I have a Python function which generates an image once it is accessed. I can either invoke it directly upon a HTTP request, or do it asynchronously using Gearman. There are a lot of requests.
You should not do it inside your request handler, because then you can't throttle (your server could get overloaded). All big sites use a message queue to do the processing offline.
Which option is better?
In this case "better" would mean the best speed / load combinations. The image generation example is symbolical, as this can also be applied to database connections and other things.
You should do it asynchronously, because the most compelling reason to do it - besides speeding up your website - is that you can throttle your queue when under high load. You could execute the tasks with the highest priority first.
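A minimal illustration of priority-based throttling using the standard library's queue.PriorityQueue (the job names are invented for the example):

```python
import queue

jobs = queue.PriorityQueue()
# lower number = higher priority; under high load, low-priority
# jobs simply wait at the back of the queue
jobs.put((1, "thumbnail for paying customer"))
jobs.put((5, "refresh cached image"))
jobs.put((1, "thumbnail for trial signup"))

processed = []
while not jobs.empty():
    priority, job = jobs.get()
    processed.append(job)  # a worker would generate the image here
```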
I believe forking processes is expensive. I would create a couple of worker processes (maybe with a little threading inside each process) to handle the load. I would probably use Redis, because it is fast, actively developed (antirez/pietern commit almost every day), and has a very good, stable Python client library. BLPOP/RPUSH can be used to simulate a job queue.
If your program is CPU bound in the interpreter then spawning multiple threads will actually slow down the result even if there are enough processors to run them all. This happens because the GIL (global interpreter lock) only allows one thread to run in the interpreter at a time.
If most of the work happens in a C library it's likely the lock is not held and you can productively use multiple threads.
If you are spawning threads yourself, you'll need to make sure not to create too many - 10K threads at once would be bad news - so you'd need to set up a work queue that the threads read from, instead of just spawning them in a loop.
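Such a work queue might be sketched like this, with a fixed number of threads pulling tasks from a queue.Queue (the doubling is a placeholder for real work):

```python
import queue
import threading

tasks = queue.Queue()
results = []
lock = threading.Lock()

def worker():
    while True:
        item = tasks.get()
        if item is None:  # sentinel value: time for this worker to exit
            break
        with lock:
            results.append(item * 2)  # placeholder for real work

# a small, fixed pool instead of one thread per task
workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
for i in range(100):
    tasks.put(i)
for _ in workers:
    tasks.put(None)  # one sentinel per worker
for t in workers:
    t.join()
```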
If I was doing this I'd just use the standard multiprocessing module.
I have a web service that is required to handle significant concurrent utilization and volume and I need to test it. Since the service is fairly specialized, it does not lend itself well to a typical testing framework. The test would need to simulate multiple clients concurrently posting to a URL, parsing the resulting Http response, checking that a database has been appropriately updated and making sure certain emails have been correctly sent/received.
The current opinion at my company is that I should write this framework using Python. I have never used Python with multiple threads before and as I was doing my research I came across the Global Interpreter Lock which seems to be the basis of most of Python's concurrency handling. It seems to me that the GIL would prevent Python from being able to achieve true concurrency even on a multi-processor machine. Is this true? Does this scenario change if I use a compiler to compile Python to native code? Am I just barking up the wrong tree entirely and is Python the wrong tool for this job?
The Global Interpreter Lock prevents threads simultaneously executing Python code. This doesn't change when Python is compiled to bytecode, because the bytecode is still run by the Python interpreter, which will enforce the GIL. threading works by switching threads every sys.getcheckinterval() bytecodes.
This doesn't apply to multiprocessing, because it creates multiple Python processes instead of threads. You can have as many of those as your system will support, running truly concurrently.
So yes, you can do this with Python, either with threading or multiprocessing.
You can use Python's multiprocessing library to achieve this:
http://docs.python.org/library/multiprocessing.html
Assuming general network conditions, as long as you have sufficient system resources, Python's regular threading module will allow you to simulate a concurrent workload at a higher rate than any real workload you'd encounter.
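For example, a sketch of simulating many clients with the plain threading module (post_request is a hypothetical placeholder for the actual HTTP POST and response parsing):

```python
import threading

def post_request(client_id, results):
    # placeholder for posting to the URL and parsing the HTTP response;
    # real network i/o releases the GIL, so these threads overlap well
    results[client_id] = 200  # pretend status code

results = {}
clients = [threading.Thread(target=post_request, args=(i, results))
           for i in range(50)]
for t in clients:
    t.start()
for t in clients:
    t.join()
```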