Not sure if I'm taking the right approach. I need to dynamically change the number of requests I'm sending; for this I thought I could dynamically change the number of semaphore permits at run-time. Let's say I want 10 parallel calls during the first minute, then increase it to 20 for 2 minutes, then decrease it to 15, and so on, like a load test. Is it possible to control dynamic_value at run-time?
async def send_cus(cus_id):
    async with ClientSession(connector=TCPConnector(limit=0)) as session:
        num_parallel = asyncio.Semaphore(dynamic_value)

        async def send_cust_2(cust_id):
            async with num_parallel:
                ...  # do something ...

        tasks = list(send_cust_2(cust_id) for cust_id in my_ordered_lst)
        await asyncio.gather(*tasks)
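One possible approach, just a sketch and not a tested answer: create the semaphore once, keep using async with num_parallel: in the workers, and let a separate controller coroutine raise the limit by releasing extra permits and lower it by acquiring and holding permits. The 10 -> 20 -> 15 schedule below is only an example.

import asyncio

async def change_limit(sem: asyncio.Semaphore, current: int, new: int) -> int:
    # Raising the limit: release extra permits so more workers can enter.
    # Lowering the limit: acquire and hold permits so fewer workers can enter.
    if new > current:
        for _ in range(new - current):
            sem.release()
    else:
        for _ in range(current - new):
            await sem.acquire()
    return new

async def controller(sem: asyncio.Semaphore):
    limit = 10               # must match the value the semaphore was created with
    await asyncio.sleep(60)
    limit = await change_limit(sem, limit, 20)
    await asyncio.sleep(120)
    limit = await change_limit(sem, limit, 15)

You would run controller(num_parallel) as a task (e.g. with asyncio.create_task) alongside the gather, so the limit changes while the workers are running.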
I have a function that makes a POST request with a lot of processing. All of that takes about 30 seconds.
I need to execute this function every 6 minutes, so I used asyncio for that ... but it's not asynchronous: my API is blocked until the end of the function. Later I will have processing that takes 5 minutes to execute.
def update_all():
    # do request and treatment (30 secs)
    ...

async def run_update_all():
    while True:
        await asyncio.sleep(6 * 60)
        update_all()

loop = asyncio.get_event_loop()
loop.create_task(run_update_all())
So I don't understand why, while update_all() is executing, all incoming requests are left pending until update_all() finishes instead of being handled asynchronously.
I found an answer thanks to the hint from larsks.
I did this:
def update_all():
    # Do the synchronous POST request and processing that take a long time
    ...

async def launch_async():
    loop = asyncio.get_event_loop()
    while True:
        await asyncio.sleep(120)
        loop.run_in_executor(None, update_all)

asyncio.create_task(launch_async())
With that code I'm able to launch a synchronous function every X seconds without blocking the main thread of FastAPI :D
I hope that will help other people in the same situation as me.
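For context, here is a minimal sketch of how this could be wired into a FastAPI startup hook, so that create_task() runs on the already-running event loop. The app object and the 120-second interval are assumptions for illustration, not part of the original answer.

import asyncio
from fastapi import FastAPI

app = FastAPI()  # hypothetical app, for illustration only

def update_all():
    # Synchronous POST request and long-running processing go here.
    ...

async def launch_async():
    loop = asyncio.get_event_loop()
    while True:
        await asyncio.sleep(120)  # assumed interval
        # Run the blocking function in the default thread pool so the
        # event loop (and FastAPI) stays responsive.
        loop.run_in_executor(None, update_all)

@app.on_event("startup")
async def start_periodic_update():
    # create_task() needs a running loop, which exists inside the startup event.
    asyncio.create_task(launch_async())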
I have a dataframe where each row is a record and I need to send each record in the body of a post request. Right now I am looping through the dataframe to accomplish this. I am constrained by the fact that each record must be posted individually. Is there a faster way to accomplish this?
Iterating over the data frame is not the issue here. The issue is that you have to wait for the server to respond to each of your requests. A network request takes eons compared to the CPU time needed to iterate over the data frame. In other words, your program is I/O bound, not CPU bound.
One way to speed it up is to use coroutines. Let's say you have to make 1000 requests. Instead of firing one request, waiting for the response, then firing the next request and so on, you fire all 1000 requests at once and tell Python to wait until you have received all 1000 responses.
Since you didn't provide any code, here's a small program to illustrate the point:
import aiohttp
import asyncio
import numpy as np
import time
from typing import List

async def send_single_request(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as response:
        return await response.json()

async def send_all_requests(urls: List[str]):
    async with aiohttp.ClientSession() as session:
        # Make 1 coroutine for each request
        coroutines = [send_single_request(session, url) for url in urls]
        # Wait until all coroutines have finished
        return await asyncio.gather(*coroutines)

# We will make 10 requests to httpbin.org. Each request will take at least d
# seconds. If you were to fire them sequentially, they would have taken at least
# delays.sum() seconds to complete.
np.random.seed(42)
delays = np.random.randint(0, 5, 10)
urls = [f"https://httpbin.org/delay/{d}" for d in delays]

# Instead, we will fire all 10 requests at once, then wait until all 10 have
# finished.
t1 = time.time()
result = asyncio.run(send_all_requests(urls))
t2 = time.time()

print(f"Expected time: {delays.sum()} seconds")
print(f"Actual time: {t2 - t1:.2f} seconds")
Output:
Expected time: 28 seconds
Actual time: 4.57 seconds
You have to read up a bit on coroutines and how they work, but for the most part they are not too complicated for your use case. This comes with a couple of caveats:
All your requests must be independent of each other.
The rate limit on the server must be sufficient to handle your workload. For example, if it restricts you to 2 requests per minute, there is no way around that other than upgrading to a different service tier. You can, however, cap how many requests you fire at once on the client side; see the sketch below.
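A minimal sketch of client-side throttling with asyncio.Semaphore (not from the original answer; MAX_CONCURRENT is an assumed value you would tune to the server):

import asyncio
import aiohttp
from typing import List

MAX_CONCURRENT = 5  # assumption: tune this to what the server tolerates

async def send_single_request(session: aiohttp.ClientSession,
                              sem: asyncio.Semaphore, url: str):
    # At most MAX_CONCURRENT coroutines can be inside this block at a time.
    async with sem:
        async with session.get(url) as response:
            return await response.json()

async def send_all_requests(urls: List[str]):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        coroutines = [send_single_request(session, sem, url) for url in urls]
        return await asyncio.gather(*coroutines)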
The ContextVar 'value' doesn't change as it goes through the 'while' loop.
import contextvars
import keyboard
import asyncio
import random

value = contextvars.ContextVar('value')
value.set('-')

async def q():
    while True:
        await asyncio.sleep(1)
        print(value.get())

async def s():
    while True:
        x = random.choice(list(range(10)))
        value.set(x)
        await asyncio.sleep(1)

async def main():
    t1 = asyncio.create_task(q())
    t2 = asyncio.create_task(s())
    await t1

asyncio.run(main())
The output is '---'. I want to set a new value on this context var, but I can't find any similar cases.
It's my first time here, so I don't know if all the images show up correctly or how to paste code here, so please help.
contextvars are designed exactly to isolate values across different tasks. The idea is that any calls awaited from within your task t2 will see the values set in that task, while t1 and anything called from there will see the values set in t1.
To put it in more concrete terms, think of the common use of async functions to handle HTTP requests in a web framework. A framework might choose to put details about each request into contextvars rather than passing them as parameters to every function (HTTP headers, cookies, and so on) - each function can then retrieve these as "context", and those values must be isolated from the values the same function would see when called to answer another request being handled in parallel.
If you want to communicate data across several tasks or call stacks, either use plain global variables or a queue - https://docs.python.org/3/library/asyncio-queue.html - if you want to pipe values that are to be consumed in other tasks.
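For example, a minimal rework of the snippet from the question using an asyncio.Queue (a sketch of the suggestion above, not part of the original answer; names are kept from the question):

import asyncio
import random

async def q(queue: asyncio.Queue):
    while True:
        # Waits until s() puts a new value on the queue.
        x = await queue.get()
        print(x)

async def s(queue: asyncio.Queue):
    while True:
        x = random.choice(range(10))
        await queue.put(x)
        await asyncio.sleep(1)

async def main():
    queue = asyncio.Queue()
    t1 = asyncio.create_task(q(queue))
    t2 = asyncio.create_task(s(queue))
    await asyncio.gather(t1, t2)

asyncio.run(main())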
I am pretty new to asyncio and I am using it with Discord.py to create a bot. Once a day, I need to update a spreadsheet, but the spreadsheet has gotten a little long, so the update now triggers the loop's default timeout. Is there any way to overcome this? I have seen run_until_complete, but as you can see below there is an await asyncio.sleep(86400), which from my understanding will not work with run_until_complete because it would wait for a day. I would also be fine with just changing the timeout for that function and then changing it back after it completes, but I have not been able to find any resources on that.
Here is the function that needs to repeat every day:
async def updateSheet():
    while True:
        print("Updating Sheet at " + datetime.now().strftime("%H:%M"))
        user.updateAllUsers(os.getenv('CID'), os.getenv('CS'), subs)  # This is the function that takes too long
        print("Done Updating")
        await asyncio.sleep(86400)
and here is how I am adding it to the loop (because I am using Discord.py):
@client.event
async def on_ready():
    print('We have logged in as {0.user}'.format(client))
    client.loop.create_task(updateSheet())
Any and all help will be appreciated since as long as this is down my project loses precious time. :)
If something is blocking, the direct approach would be to convert it into a task, which might not be possible in your case. So we would have to use something like APScheduler to schedule jobs.
sched = Scheduler()
sched.start()

@sched.cron_schedule(day='mon-fri')
def task():
    user.updateAllUsers(os.getenv('CID'), os.getenv('CS'), subs)
Make sure you do this in a separate file, and use the async scheduler when working with asyncio.
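A rough sketch of what that could look like with APScheduler's asyncio scheduler (the trigger is an assumption about your setup; user, subs and the environment variables come from the question's code):

import os
from apscheduler.schedulers.asyncio import AsyncIOScheduler

def task():
    # Synchronous, long-running update; APScheduler runs plain functions in a
    # thread pool by default, so the event loop is not blocked.
    user.updateAllUsers(os.getenv('CID'), os.getenv('CS'), subs)

sched = AsyncIOScheduler()
sched.add_job(task, 'interval', days=1)  # or a cron trigger such as hour=0
sched.start()  # call this once the event loop is running (e.g. in on_ready)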
You can simply measure how much time the function takes to execute and subtract it from 86400:
import time

async def updateSheet():
    while True:
        start = time.monotonic()
        print("Updating Sheet at " + datetime.now().strftime("%H:%M"))
        user.updateAllUsers(os.getenv('CID'), os.getenv('CS'), subs)  # This is the function that takes too long
        end = time.monotonic()
        total = end - start
        sleep_time = 86400 - total
        await asyncio.sleep(sleep_time)
I really suggest that you run blocking functions in a non-blocking way; refer to one of my previous answers for more info (What does "blocking" mean).
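One way to do that, as a sketch: push the slow call into a worker thread with asyncio.to_thread (Python 3.9+; user, subs and the environment variables come from the question's code):

import asyncio
import os
from datetime import datetime

async def updateSheet():
    while True:
        print("Updating Sheet at " + datetime.now().strftime("%H:%M"))
        # Run the slow, synchronous update in a worker thread so the Discord
        # client (and the rest of the event loop) keeps running meanwhile.
        await asyncio.to_thread(user.updateAllUsers,
                                os.getenv('CID'), os.getenv('CS'), subs)
        print("Done Updating")
        await asyncio.sleep(86400)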
In one of my projects, I need to run three different database updater functions at different intervals.
For instance, function one needs to run every 30 seconds, function two needs to run every 60 seconds and function 3 every 5 minutes (notably due to API call restrictions).
I've been trying to achieve this in python, looking up every possible solution but I cannot seem to find anything that works for my use case. I am rather fresh in python.
Here is (somewhat) what I have, using asyncio.
import asyncio

def updater1(url1, url2, time):
    print(f"Doing my thing here every {time} seconds")

def updater2(url1, url2, time):
    print(f"Doing my thing here every {time} seconds")

def updater3(url, time):
    print(f"Doing my thing here every {time} seconds")

async def func1():
    updater1(rankUrl, statsUrl, 30)
    await asyncio.sleep(30)

async def func2():
    updater2(rankUrl, statsUrl, 60)
    await asyncio.sleep(60)

async def func3():
    updater3(url, 300)
    await asyncio.sleep(300)

# Initiate async loops
while True:
    asyncio.run(func1())
    asyncio.run(func2())
    asyncio.run(func3())
The issue is that these tasks run one after another, while what I am trying to achieve is that they run independently of each other, each starting when the script is launched and repeating on its own interval.
Any idea on how this could be done is much appreciated - I am open to new concepts and ideas if you have any for me to explore :)
Don't use asyncio.run() on individual coroutines, as asyncio.run() is itself not asynchronous. The call to asyncio.run() won't return until the funcN() coroutine is done.
Create a single top-level coroutine that then runs others as tasks:
async def main():
    task1 = asyncio.create_task(func1())
    task2 = asyncio.create_task(func2())
    task3 = asyncio.create_task(func3())
    await asyncio.wait([task1, task2, task3])
The above kicks off three independent tasks, then waits for all 3 to complete.
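Note that for the updaters to keep repeating, each funcN also needs its own loop; a sketch using the names from the question (intervals are yours to adjust) could look like this:

import asyncio

async def func1():
    while True:
        updater1(rankUrl, statsUrl, 30)
        await asyncio.sleep(30)

# func2 and func3 follow the same pattern with their own intervals.

async def main():
    task1 = asyncio.create_task(func1())
    task2 = asyncio.create_task(func2())
    task3 = asyncio.create_task(func3())
    await asyncio.wait([task1, task2, task3])

asyncio.run(main())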