I have a long-running synchronous Python program, where I'd like to run ~10 "fire and forget" tasks every second. These tasks hit a remote API and do not need to return any value. I tried this answer, but it requires too much CPU/memory to spawn and maintain all the separate threads so I've been looking into asyncio.
This answer explained nicely how to run "fire and forget" using asyncio. However, it requires using run_until_complete(), which waits until all the asyncio tasks are done. My program is using sync Python so this doesn't work for me. Ideally, the code should be as simple as this, where log_remote won't block the loop:
while True:
latest_state, metrics = expensive_function(latest_state)
log_remote(metrics) # <-- this should be run as "fire and forget"
I'm on Python 3.7. How can I easily run this using asyncio on another thread?
You can start a single event loop in a single background thread and use it for all your fire&forget tasks. For example:
import asyncio, threading
_loop = None
def fire_and_forget(coro):
global _loop
if _loop is None:
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()
_loop.call_soon_threadsafe(asyncio.create_task, coro)
With that in place, you can just call fire_and_forget on a coroutine object, created by calling an async def:
# fire_and_forget defined as above
import time
async def long_task(msg):
print(msg)
await asyncio.sleep(1)
print('done', msg)
fire_and_forget(long_task('foo'))
fire_and_forget(long_task('bar'))
print('continuing with something else...')
time.sleep(3)
Note that log_remote will need to actually be written as an async def using asyncio, aiohttp instead of requests, etc.
Related
I have a asyncio running loop, and from the coroutine I'm calling a sync function, is there any way we can call and get result from an async function in a sync function
tried below code, it is not working
want to print output of hel() in i() without changing i() to async function
is it possible, if yes how?
import asyncio
async def hel():
return 4
def i():
loop = asyncio.get_running_loop()
x = asyncio.run_coroutine_threadsafe(hel(), loop) ## need to change
y = x.result() ## this lines
print(y)
async def h():
i()
asyncio.run(h())
This is one of the most commonly asked type of question here. The tools to do this are in the standard library and require only a few lines of setup code. However, the result is not 100% robust and needs to be used with care. This is probably why it's not already a high-level function.
The basic problem with running an async function from a sync function is that async functions contain await expressions. Await expressions pause the execution of the current task and allow the event loop to run other tasks. Therefore async functions (coroutines) have special properties that allow them to yield control and resume again where they left off. Sync functions cannot do this. So when your sync function calls an async function and that function encounters an await expression, what is supposed to happen? The sync function has no ability to yield and resume.
A simple solution is to run the async function in another thread, with its own event loop. The calling thread blocks until the result is available. The async function behaves like a normal function, returning a value. The downside is that the async function now runs in another thread, which can cause all the well-known problems that come with threaded programming. For many cases this may not be an issue.
This can be set up as follows. This is a complete script that can be imported anywhere in an application. The test code that runs in the if __name__ == "__main__" block is almost the same as the code in the original question.
The thread is lazily initialized so it doesn't get created until it's used. It's a daemon thread so it will not keep your program from exiting.
The solution doesn't care if there is a running event loop in the main thread.
import asyncio
import threading
_loop = asyncio.new_event_loop()
_thr = threading.Thread(target=_loop.run_forever, name="Async Runner",
daemon=True)
# This will block the calling thread until the coroutine is finished.
# Any exception that occurs in the coroutine is raised in the caller
def run_async(coro): # coro is a couroutine, see example
if not _thr.is_alive():
_thr.start()
future = asyncio.run_coroutine_threadsafe(coro, _loop)
return future.result()
if __name__ == "__main__":
async def hel():
await asyncio.sleep(0.1)
print("Running in thread", threading.current_thread())
return 4
def i():
y = run_async(hel())
print("Answer", y, threading.current_thread())
async def h():
i()
asyncio.run(h())
Output:
Running in thread <Thread(Async Runner, started daemon 28816)>
Answer 4 <_MainThread(MainThread, started 22100)>
In order to call an async function from a sync method, you need to use asyncio.run, however this should be the single entry point of an async program so asyncio makes sure that you don't do this more than once per program, so you can't do that.
That being said, this project https://github.com/erdewit/nest_asyncio patches the asyncio event loop to do that, so after using it you should be able to just call asyncio.run in your sync function.
My discord bot has command that makes the bot it say "I'm running" every 30 seconds. However that makes the main thread sleep making the bot commands unusable.
the code is as follows:
while true:
await channel.send("I'm running")
time.sleep(30)
I tried using https://github.com/Rapptz/discord.py/issues/82 and https://docs.python.org/3/library/asyncio-task.html#asyncio.run_coroutine_threadsafe as a reference, but I couldn't understand how it worked.
You should use some non-blocking sleep. Instead of using time.sleep you could for example try await asyncio.sleep(30) Maybe it works straight?
If not, then you must implement manual timer to call channel.send after certain time has passed with if statements for example. Asynchronous methods might be confusing at first.
Examples about threading you checked, are outdated, since discord.py uses async methods nowadays.
If its purpose is only to send a certain message every 30 seconds. You could also create a task which runs every 30 seconds. You can start/stop the task anytime you want.
Example:
class example_class():
def __init__():
self.channel = None
async def some_command_starting_our_task(ctx):
self.channel = ctx.channel
botStatus.start() # Used to start the task
async def some_command_stopping_our_task():
botStatus.stop() # Used to stop the task
#task.loop(seconds=30)
async def botStatus():
self.channel.send("Im running")
For more information about tasks you can read the documentation.
Let's assume I'm new to asyncio. I'm using async/await to parallelize my current project, and I've found myself passing all of my coroutines to asyncio.ensure_future. Lots of stuff like this:
coroutine = my_async_fn(*args, **kwargs)
task = asyncio.ensure_future(coroutine)
What I'd really like is for a call to an async function to return an executing task instead of an idle coroutine. I created a decorator to accomplish what I'm trying to do.
def make_task(fn):
def wrapper(*args, **kwargs):
return asyncio.ensure_future(fn(*args, **kwargs))
return wrapper
#make_task
async def my_async_func(*args, **kwargs):
# usually making a request of some sort
pass
Does asyncio have a built-in way of doing this I haven't been able to find? Am I using asyncio wrong if I'm lead to this problem to begin with?
asyncio had #task decorator in very early pre-released versions but we removed it.
The reason is that decorator has no knowledge what loop to use.
asyncio don't instantiate a loop on import, moreover test suite usually creates a new loop per test for sake of test isolation.
Does asyncio have a built-in way of doing this I haven't been able to
find?
No, asyncio doesn't have decorator to cast coroutine-functions into tasks.
Am I using asyncio wrong if I'm lead to this problem to begin with?
It's hard to say without seeing what you're doing, but I think it may happen to be true. While creating tasks is usual operation in asyncio programs I doubt you created this much coroutines that should be tasks always.
Awaiting for coroutine - is a way to "call some function asynchronously", but blocking current execution flow until it finished:
await some()
# you'll reach this line *only* when some() done
Task on the other hand - is a way to "run function in background", it won't block current execution flow:
task = asyncio.ensure_future(some())
# you'll reach this line immediately
When we write asyncio programs we usually need first way since we usually need result of some operation before starting next one:
text = await request(url)
links = parse_links(text) # we need to reach this line only when we got 'text'
Creating task on the other hand usually means that following further code doesn't depend of task's result. But again it doesn't happening always.
Since ensure_future returns immediately some people try to use it as a way to run some coroutines concurently:
# wrong way to run concurrently:
asyncio.ensure_future(request(url1))
asyncio.ensure_future(request(url2))
asyncio.ensure_future(request(url3))
Correct way to achieve this is to use asyncio.gather:
# correct way to run concurrently:
await asyncio.gather(
request(url1),
request(url2),
request(url3),
)
May be this is what you want?
Upd:
I think using tasks in your case is a good idea. But I don't think you should use decorator: coroutine functionality (to make request) still is a separate part from it's concrete usage detail (it will be used as task). If requests synchronization controlling is separate from their's main functionalities it's also make sense to move synchronization into separate function. I would do something like this:
import asyncio
async def request(i):
print(f'{i} started')
await asyncio.sleep(i)
print(f'{i} finished')
return i
async def when_ready(conditions, coro_to_start):
await asyncio.gather(*conditions, return_exceptions=True)
return await coro_to_start
async def main():
t = asyncio.ensure_future
t1 = t(request(1))
t2 = t(request(2))
t3 = t(request(3))
t4 = t(when_ready([t1, t2], request(4)))
t5 = t(when_ready([t2, t3], request(5)))
await asyncio.gather(t1, t2, t3, t4, t5)
if __name__ == '__main__':
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())
loop.close()
I've written a library of objects, many which make HTTP / IO calls. I've been looking at moving over to asyncio due to the mounting overheads, but I don't want to rewrite the underlying code.
I've been hoping to wrap asyncio around my code in order to perform functions asynchronously without replacing all of my deep / low level code with await / yield.
I began by attempting the following:
async def my_function1(some_object, some_params):
#Lots of existing code which uses existing objects
#No await statements
return output_data
async def my_function2():
#Does more stuff
while True:
loop = asyncio.get_event_loop()
tasks = my_function(some_object, some_params), my_function2()
output_data = loop.run_until_complete(asyncio.gather(*tasks))
print(output_data)
I quickly realised that while this code runs, nothing actually happens asynchronously, the functions complete synchronously. I'm very new to asynchronous programming, but I think this is because neither of my functions are using the keyword await or yield and thus these functions are not cooroutines, and do not yield, thus do not provide an opportunity to move to a different cooroutine. Please correct me if I am wrong.
My question is, is it possible to wrap complex functions (where deep within they make HTTP / IO calls ) in an asyncio await keyword, e.g.
async def my_function():
print("Welcome to my function")
data = await bigSlowFunction()
UPDATE - Following Karlson's Answer
Following and thanks to Karlsons accepted answer, I used the following code which works nicely:
from concurrent.futures import ThreadPoolExecutor
import time
#Some vars
a_var_1 = 0
a_var_2 = 10
pool = ThreadPoolExecutor(3)
future = pool.submit(my_big_function, object, a_var_1, a_var_2)
while not future.done() :
print("Waiting for future...")
time.sleep(0.01)
print("Future done")
print(future.result())
This works really nicely, and the future.done() / sleep loop gives you an idea of how many CPU cycles you get to use by going async.
The short answer is, you can't have the benefits of asyncio without explicitly marking the points in your code where control may be passed back to the event loop. This is done by turning your IO heavy functions into coroutines, just like you assumed.
Without changing existing code you might achieve your goal with greenlets (have a look at eventlet or gevent).
Another possibility would be to make use of Python's Future implementation wrapping and passing calls to your already written functions to some ThreadPoolExecutor and yield the resulting Future. Be aware, that this comes with all the caveats of multi-threaded programming, though.
Something along the lines of
from concurrent.futures import ThreadPoolExecutor
from thinair import big_slow_function
executor = ThreadPoolExecutor(max_workers=5)
async def big_slow_coroutine():
await executor.submit(big_slow_function)
As of python 3.9 you can wrap a blocking (non-async) function in a coroutine to make it awaitable using asyncio.to_thread(). The exampe given in the official documentation is:
def blocking_io():
print(f"start blocking_io at {time.strftime('%X')}")
# Note that time.sleep() can be replaced with any blocking
# IO-bound operation, such as file operations.
time.sleep(1)
print(f"blocking_io complete at {time.strftime('%X')}")
async def main():
print(f"started main at {time.strftime('%X')}")
await asyncio.gather(
asyncio.to_thread(blocking_io),
asyncio.sleep(1))
print(f"finished main at {time.strftime('%X')}")
asyncio.run(main())
# Expected output:
#
# started main at 19:50:53
# start blocking_io at 19:50:53
# blocking_io complete at 19:50:54
# finished main at 19:50:54
This seems like a more joined up approach than using concurrent.futures to make a coroutine, but I haven't tested it extensively.
I have successfully built a RESTful microservice with Python asyncio and aiohttp that listens to a POST event to collect realtime events from various feeders.
It then builds an in-memory structure to cache the last 24h of events in a nested defaultdict/deque structure.
Now I would like to periodically checkpoint that structure to disc, preferably using pickle.
Since the memory structure can be >100MB I would like to avoid holding up my incoming event processing for the time it takes to checkpoint the structure.
I'd rather create a snapshot copy (e.g. deepcopy) of the structure and then take my time to write it to disk and repeat on a preset time interval.
I have been searching for examples on how to combine threads (and is a thread even the best solution for this?) and asyncio for that purpose but could not find something that would help me.
Any pointers to get started are much appreciated!
It's pretty simple to delegate a method to a thread or sub-process using BaseEventLoop.run_in_executor:
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
def cpu_bound_operation(x):
time.sleep(x) # This is some operation that is CPU-bound
#asyncio.coroutine
def main():
# Run cpu_bound_operation in the ProcessPoolExecutor
# This will make your coroutine block, but won't block
# the event loop; other coroutines can run in meantime.
yield from loop.run_in_executor(p, cpu_bound_operation, 5)
loop = asyncio.get_event_loop()
p = ProcessPoolExecutor(2) # Create a ProcessPool with 2 processes
loop.run_until_complete(main())
As for whether to use a ProcessPoolExecutor or ThreadPoolExecutor, that's kind of hard to say; pickling a large object will definitely eat some CPU cycles, which initially would make you think ProcessPoolExecutor is the way to go. However, passing your 100MB object to a Process in the pool would require pickling the instance in your main process, sending the bytes to the child process via IPC, unpickling it in the child, and then pickling it again so you can write it to disk. Given that, my guess is the pickling/unpickling overhead will be large enough that you're better off using a ThreadPoolExecutor, even though you're going to take a performance hit because of the GIL.
That said, it's very simple to test both ways and find out for sure, so you might as well do that.
I also used run_in_executor, but I found this function kinda gross under most circumstances, since it requires partial() for keyword args and I'm never calling it with anything other than a single executor and the default event loop. So I made a convenience wrapper around it with sensible defaults and automatic keyword argument handling.
from time import sleep
import asyncio as aio
loop = aio.get_event_loop()
class Executor:
"""In most cases, you can just use the 'execute' instance as a
function, i.e. y = await execute(f, a, b, k=c) => run f(a, b, k=c) in
the executor, assign result to y. The defaults can be changed, though,
with your own instantiation of Executor, i.e. execute =
Executor(nthreads=4)"""
def __init__(self, loop=loop, nthreads=1):
from concurrent.futures import ThreadPoolExecutor
self._ex = ThreadPoolExecutor(nthreads)
self._loop = loop
def __call__(self, f, *args, **kw):
from functools import partial
return self._loop.run_in_executor(self._ex, partial(f, *args, **kw))
execute = Executor()
...
def cpu_bound_operation(t, alpha=30):
sleep(t)
return 20*alpha
async def main():
y = await execute(cpu_bound_operation, 5, alpha=-2)
loop.run_until_complete(main())
Another alternative is to use loop.call_soon_threadsafe along with an asyncio.Queue as the intermediate channel of communication.
The current documentation for Python 3 also has a section on Developing with asyncio - Concurrency and Multithreading:
import asyncio
# This method represents your blocking code
def blocking(loop, queue):
import time
while True:
loop.call_soon_threadsafe(queue.put_nowait, 'Blocking A')
time.sleep(2)
loop.call_soon_threadsafe(queue.put_nowait, 'Blocking B')
time.sleep(2)
# This method represents your async code
async def nonblocking(queue):
await asyncio.sleep(1)
while True:
queue.put_nowait('Non-blocking A')
await asyncio.sleep(2)
queue.put_nowait('Non-blocking B')
await asyncio.sleep(2)
# The main sets up the queue as the communication channel and synchronizes them
async def main():
queue = asyncio.Queue()
loop = asyncio.get_running_loop()
blocking_fut = loop.run_in_executor(None, blocking, loop, queue)
nonblocking_task = loop.create_task(nonblocking(queue))
running = True # use whatever exit condition
while running:
# Get messages from both blocking and non-blocking in parallel
message = await queue.get()
# You could send any messages, and do anything you want with them
print(message)
asyncio.run(main())
How to send asyncio tasks to loop running in other thread may also help you.
If you need a more "powerful" example, check out my Wrapper to launch async tasks from threaded code. It will handle the thread safety part for you (for the most part) and let you do things like this:
# See https://gist.github.com/Lonami/3f79ed774d2e0100ded5b171a47f2caf for the full example
async def async_main(queue):
# your async code can go here
while True:
command = await queue.get()
if command.id == 'print':
print('Hello from async!')
elif command.id == 'double':
await queue.put(command.data * 2)
with LaunchAsync(async_main) as queue:
# your threaded code can go here
queue.put(Command('print'))
queue.put(Command('double', 7))
response = queue.get(timeout=1)
print('The result of doubling 7 is', response)