Running Image Manipulation in run_in_executor. Adapting to multiprocessing - python

I run a lot of image manipulation in an API built with async FastAPI, and I would like the image manipulation to run asynchronously. To do that I used run_in_executor, which I believe runs it in a separate thread. However, I was told that using Python multiprocessing would be better. Does switching have any advantages?
import asyncio
import functools

from app.exceptions.errors import ManipulationError


def executor(function):
    @functools.wraps(function)
    def decorator(*args, **kwargs):
        try:
            partial = functools.partial(function, *args, **kwargs)
            loop = asyncio.get_event_loop()
            # None selects the loop's default executor.
            # Errors raised inside `function` surface when the returned
            # future is awaited, not inside this try block.
            return loop.run_in_executor(None, partial)
        except Exception:
            raise ManipulationError("Unable To Manipulate Image")
    return decorator
I made this decorator to wrap my blocking functions with run_in_executor.
Two questions:
a) Does moving to multiprocessing have any advantages?
b) How would I do so?

a) Does moving to multiprocessing have any advantages?
Yes, for CPU-bound processing it can make use of multiple cores.
b) How would I do so?
By passing an instance of ProcessPoolExecutor to run_in_executor. (The None value you're passing now means use the default executor provided by asyncio, which is a ThreadPoolExecutor.) For example (untested):
import asyncio
import concurrent.futures
import functools

_pool = concurrent.futures.ProcessPoolExecutor()


def executor(function):
    @functools.wraps(function)
    def decorator(*args):
        loop = asyncio.get_event_loop()
        return loop.run_in_executor(_pool, function, *args)
    return decorator
This will also require that all arguments to the function be serializable, so that they can be transferred to the subprocess.
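As a rough sketch of the same idea without a decorator: the helper below forwards both positional and keyword arguments through functools.partial and awaits the result. The names run_blocking and scale are made up for illustration and are not part of asyncio or FastAPI. It is shown with a ThreadPoolExecutor so that it runs anywhere; swapping in a ProcessPoolExecutor should work the same way, assuming the function is defined at module level and its arguments and return value can be pickled.

```python
import asyncio
import functools
from concurrent.futures import ThreadPoolExecutor


async def run_blocking(pool, fn, *args, **kwargs):
    """Run fn(*args, **kwargs) on the given executor and await its result."""
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so bind
    # everything up front with functools.partial.
    call = functools.partial(fn, *args, **kwargs)
    return await loop.run_in_executor(pool, call)


def scale(width, height, factor=2):
    """Hypothetical stand-in for a blocking image operation."""
    return (width * factor, height * factor)


async def main():
    # Assumption: a thread pool here; a ProcessPoolExecutor could be
    # passed instead for CPU-bound work.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return await run_blocking(pool, scale, 10, 20, factor=3)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A plain helper is used instead of a decorator because a process pool pickles the function by its qualified name, and a decorator that rebinds that name to the wrapper can make the original function unpicklable.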

Related

partial asynchronous functions are not detected as asynchronous

I have a function which accepts both regular and asynchronous functions (not coroutines, but functions returning coroutines).
Internally it uses asyncio.iscoroutinefunction() test to see which type of function it got.
Recently it broke down when I attempted to create a partial async function.
In this demonstration, ptest is not recognized as a coroutine function, even though it returns a coroutine, i.e. ptest() is a coroutine.
import asyncio
import functools
async def test(arg): pass
print(asyncio.iscoroutinefunction(test)) # True
ptest = functools.partial(test, None)
print(asyncio.iscoroutinefunction(ptest)) # False!!
print(asyncio.iscoroutine(ptest())) # True
The cause of the problem is clear, but the solution is not.
How can I dynamically create a partial async function which passes the test?
OR
How can I test the function wrapped inside a partial object?
Either answer would solve the problem.
Using Python versions < 3.8 you can't make a partial() object pass that test, because the test requires there to be a __code__ object attached directly to the object you pass to inspect.iscoroutinefunction().
You should instead test the function object that partial wraps, accessible via the partial.func attribute:
>>> asyncio.iscoroutinefunction(ptest.func)
True
If you also need to test for partial() objects, then test against functools.partial:
import functools
import inspect

def iscoroutinefunction_or_partial(obj):
    # Unwrap (possibly nested) partial() objects to reach the real function.
    while isinstance(obj, functools.partial):
        obj = obj.func
    return inspect.iscoroutinefunction(obj)
In Python 3.8 (and newer), the relevant code in the inspect module (that asyncio.iscoroutinefunction() delegates to) was updated to handle partial() objects, and you no longer have to unwrap partial() objects yourself. The implementation uses the same while isinstance(..., functools.partial) loop.
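For code that has to behave the same on older and newer Python versions, one possible approach is to unwrap partial() objects before deciding whether to await. This is only a sketch under that assumption; unwrap_partial, call_either, async_add and sync_add are illustrative names, not library functions.

```python
import asyncio
import functools
import inspect


def unwrap_partial(fn):
    """Follow .func through nested functools.partial objects."""
    while isinstance(fn, functools.partial):
        fn = fn.func
    return fn


async def call_either(fn, *args):
    """Call fn, awaiting the result only if fn wraps a coroutine function."""
    result = fn(*args)
    if inspect.iscoroutinefunction(unwrap_partial(fn)):
        result = await result
    return result


async def async_add(a, b):
    return a + b


def sync_add(a, b):
    return a + b


async def main():
    from_async = await call_either(functools.partial(async_add, 1), 2)
    from_sync = await call_either(functools.partial(sync_add, 1), 2)
    return from_async, from_sync
```

Checking the returned value with inspect.isawaitable() instead of inspecting the function is another option when any awaitable result should be awaited.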
I solved this by replacing all instances of partial with async_partial:
def async_partial(f, *args):
    async def f2(*args2):
        result = f(*args, *args2)
        if asyncio.iscoroutinefunction(f):
            result = await result
        return result
    return f2

How to wrap custom future to use with asyncio in Python?

There are a lot of libraries that use their own custom version of Future. kafka and s3transfer are just two examples: all their custom future-like classes have object as the superclass.
Not surprisingly, you cannot directly call asyncio.wrap_future() on such objects and can't use await with them.
What is the proper way of wrapping such futures for use with asyncio?
If the future class supports standard future features such as done callbacks and the result method, just use something like this:
def wrap_future(f):
    loop = asyncio.get_event_loop()
    aio_future = loop.create_future()

    def on_done(*_):
        try:
            result = f.result()
        except Exception as e:
            loop.call_soon_threadsafe(aio_future.set_exception, e)
        else:
            loop.call_soon_threadsafe(aio_future.set_result, result)

    f.add_done_callback(on_done)
    return aio_future
Consider that code a template which you can customize to match the specifics of the future you are dealing with.
Intended usage is to call it from the thread that runs the asyncio event loop:
value = await wrap_future(some_foreign_future)
If you are calling it from a different thread, be sure to pass loop explicitly because asyncio.get_event_loop will fail when invoked from a thread not registered with asyncio.
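To illustrate the explicit-loop variant, here is a small sketch in which the loop is passed as a parameter and the foreign future is completed from another thread. ForeignFuture is a hypothetical stand-in written for this example; it does not mirror the actual classes in kafka or s3transfer, whose callback signatures may differ.

```python
import asyncio
import threading


class ForeignFuture:
    """Hypothetical library future: done callbacks plus result()."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False
        self._result = None
        self._callbacks = []

    def add_done_callback(self, cb):
        with self._lock:
            if not self._done:
                self._callbacks.append(cb)
                return
        cb(self)  # already finished: fire immediately

    def set_result(self, value):
        with self._lock:
            self._result = value
            self._done = True
            callbacks, self._callbacks = self._callbacks, []
        for cb in callbacks:
            cb(self)

    def result(self):
        return self._result


def wrap_future(f, loop):
    """Same template as above, but with the loop passed explicitly."""
    aio_future = loop.create_future()

    def on_done(*_):
        try:
            result = f.result()
        except Exception as e:
            loop.call_soon_threadsafe(aio_future.set_exception, e)
        else:
            loop.call_soon_threadsafe(aio_future.set_result, result)

    f.add_done_callback(on_done)
    return aio_future


async def main():
    foreign = ForeignFuture()
    wrapped = wrap_future(foreign, asyncio.get_running_loop())
    # Complete the foreign future from a different thread.
    threading.Timer(0.05, foreign.set_result, args=("done",)).start()
    return await wrapped
```

Because the callback only touches the asyncio future through call_soon_threadsafe, it does not matter which thread the foreign library uses to fire it.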

Optional Synchronous Interface to Asynchronous Functions

I'm writing a library which uses Tornado Web's tornado.httpclient.AsyncHTTPClient to make requests, which gives my code an async interface of:
async def my_library_function():
    return await ...
I want to make this interface optionally serial if the user provides a kwarg, something like serial=True. Obviously, though, you can't call a function defined with the async keyword from a normal function without await. This would be ideal, though almost certainly impossible in the language at the moment:
async def here_we_go():
    result = await my_library_function()

result = my_library_function(serial=True)
I haven't been able to find anything online where someone has come up with a nice solution to this. I don't want to have to reimplement basically the same code without the awaits splattered throughout.
Is this something that can be solved or would it need support from the language?
Solution (though use Jesse's instead - explained below)
Jesse's solution below is pretty much what I'm going to go with. I did end up getting the interface I originally wanted by using a decorator. Something like this:
import asyncio
from functools import wraps


def serializable(f):
    @wraps(f)
    def wrapper(*args, asynchronous=False, **kwargs):
        if asynchronous:
            return f(*args, **kwargs)
        else:
            # Get the current thread's event loop and block until the coroutine finishes
            loop = asyncio.get_event_loop()
            return loop.run_until_complete(f(*args, **kwargs))
    return wrapper
This gives you this interface:
result = await my_library_function(asynchronous=True)
result = my_library_function(asynchronous=False)
I sanity-checked this on Python's async mailing list, and I was lucky enough to have Guido respond. He politely shot it down for this reason:
Code smell -- being able to call the same function both asynchronously
and synchronously is highly surprising. Also it violates the rule of
thumb that the value of an argument shouldn't affect the return type.
Nice to know it's possible though if not considered a great interface. Guido essentially suggested Jesse's answer and introducing the wrapping function as a helper util in the library instead of hiding it in a decorator.
When you want to call such a function synchronously, use run_until_complete:
asyncio.get_event_loop().run_until_complete(here_we_go())
Of course, if you do this often in your code, you should come up with an abbreviation for this statement, perhaps just:
def sync(fn, *args, **kwargs):
    return asyncio.get_event_loop().run_until_complete(fn(*args, **kwargs))
Then you could do:
result = sync(here_we_go)
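On Python 3.7 and newer, asyncio.run is a common alternative to get_event_loop().run_until_complete(): it creates a fresh event loop, runs the coroutine, and closes the loop afterwards. Below is a minimal sketch of the same helper written that way, assuming it is never called from code that is already running inside an event loop (asyncio.run raises RuntimeError in that case). fetch_value is a placeholder coroutine, not part of any library.

```python
import asyncio


async def fetch_value(x):
    """Placeholder for an async library function."""
    await asyncio.sleep(0)
    return x * 2


def sync(fn, *args, **kwargs):
    """Run a coroutine function to completion from synchronous code."""
    return asyncio.run(fn(*args, **kwargs))


if __name__ == "__main__":
    print(sync(fetch_value, 21))
```

Keeping this as a separate helper, rather than a keyword argument on every function, follows the suggestion quoted above: the async functions always return coroutines, and only the helper blocks.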

Is this style of using Thread pool with tornado ok?

So I create a class variable called executor in my class:
executor = ThreadPoolExecutor(100)
and instead of having functions and methods and using decorators, I simply use the following line to handle my blocking tasks (like IO, hash creation, and so on) in my async methods:
result = await to_tornado_future(self.executor.submit(blocking_method, param1, param2))
I decided to use this style because:
1- decorators are slower by nature
2- there is no need for extra methods and functions
3- it works as expected and creates no threads before they are needed
Am I right? Please give reasons (I want to know whether the way I use is slower, uses more resources, etc.)
Update
Based on Ben's answer, my approach above was not correct,
so I ended up using the following function as needed. I think it's the best way to go:
def pool(pool_executor, fn, *args, **kwargs):
    new_future = Future()
    result_future = pool_executor.submit(fn, *args, **kwargs)
    result_future.add_done_callback(lambda f: new_future.set_result(f.result()))
    return new_future
usage:
result = await pool(self.executor, time.sleep, 3)
This is safe as long as all your blocking methods are thread-safe. Since you mentioned doing IO in these threads, I'll point out that doing file IO here is fine but all network IO in Tornado must occur on the IOLoop's thread.
Why do you say "decorators are slower by nature"? Which decorators are slower than what? Some decorators have no performance overhead at all (although most do have some runtime cost). to_tornado_future(executor.submit()) isn't free either. (BTW, I think you want tornado.gen.convert_yielded instead of tornado.platform.asyncio.to_tornado_future. executor.submit doesn't return an asyncio.Future).
As a general rule, running blocking_method on a thread pool is going to be slower than just calling it directly. You should do this only when blocking_method is likely to block for long enough that you want the main thread free to do other things in the meantime.
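One thing the pool helper in the update does not appear to handle is a blocking function that raises: f.result() then raises inside the done callback, so new_future is never resolved. As a sketch of an alternative, assuming Tornado 5 or newer running on the asyncio event loop (so that asyncio futures can be awaited directly), asyncio.wrap_future can convert the executor's future and carry exceptions across. run_on_pool, blocking_hash and failing_task are illustrative names.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)


def blocking_hash(data):
    """Stand-in for a blocking, thread-safe task."""
    return hashlib.sha256(data).hexdigest()


def failing_task():
    raise ValueError("boom")


async def run_on_pool(fn, *args):
    # wrap_future turns the concurrent.futures.Future into an awaitable
    # and forwards both results and exceptions to the event loop thread.
    return await asyncio.wrap_future(executor.submit(fn, *args))


async def main():
    digest = await run_on_pool(blocking_hash, b"abc")
    try:
        await run_on_pool(failing_task)
    except ValueError as exc:
        error = str(exc)
    else:
        error = None
    return digest, error
```

The exception raised in the worker thread is re-raised at the await, so the caller can handle it like any other error.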

Using Twisted's @inlineCallbacks with Tornado's @gen.engine

Tornado/Twisted newbie here. First I just want to confirm what I know (please correct and elaborate if I am wrong):
In order to use @gen.engine and gen.Task in Tornado, I need to feed gen.Task() functions that:
are asynchronous to begin with
have the keyword argument "callback"
call the callback function at the very end
In other words the function should look something like this:
def function(arg1, arg2, ... , callback=None):
    # asynchronous stuff here ...
    callback()
And I would call it like this (trivial example):
@gen.engine
def coroutine_call():
    yield gen.Task(function, arg1, arg2)
Now I am in a weird situation where I have to use Twisted in a Tornado system for asynchronous client calls to a server (since Tornado apparently does not support it).
So I wrote a function in Twisted (e.g. connects to the server):
import tornado.platform.twisted
tornado.platform.twisted.install()
from twisted.internet import defer
from twisted.web.xmlrpc import Proxy


class AsyncConnection():
    def __init__(self, hostname):
        self.proxy = Proxy(hostname)
        self.token = False

    @defer.inlineCallbacks
    def login(self, user, passwd, callback=None):
        """Login to server using given username and password"""
        self.token = yield self.proxy.callRemote('login', user, passwd)  # twisted function
        callback()
And if I run it like so:
@gen.engine
def test():
    conn = AsyncConnection("192.168.11.11")
    yield gen.Task(conn.login, "user", "pwd")
    print(conn.token)

if __name__ == '__main__':
    test()
    tornado.ioloop.IOLoop.instance().start()
And I DO get the token as I want. But my question is:
I know that Twisted and Tornado can share the same IOLoop. But am I allowed to do this (i.e. use a @defer.inlineCallbacks function in gen.Task simply by giving it the callback keyword argument)? I seem to get the right result, but is my way really running things asynchronously? Are there any complications or problems with the IOLoop this way?
I actually posted somewhat related questions on other threads
Is it possible to use tornado's gen.engine and gen.Task with twisted?
Using Tornado and Twisted at the same time
and the answers told me that I should "wrap" the inlineCallback function. I was wondering if adding the callback keyword is enough to "wrap" the twisted function to be suitable for Tornado.
Thanks in advance
What you're doing is mostly fine: adding a callback argument is enough to make a function usable with gen.Task. The only tricky part is exception handling: you'll need to run the callback from an except or finally block to ensure it always happens, and should probably return some sort of value to indicate whether the operation succeeded or not (exceptions do not reliably pass through a gen.Task when you're working with non-Tornado code).
The wrapper approach (which I posted in Is it possible to use tornado's gen.engine and gen.Task with twisted?) has two advantages: it can be used with most Twisted code directly (since Twisted functions usually don't have a callback argument), and exceptions work more like you'd expect (an exception raised in the inner function will be propagated to the outer function).
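The advice about always running the callback can be sketched in plain Python, without Twisted or Tornado. This only shows the control flow: the callback runs from a finally block and receives a success flag along with the value. fake_remote_login is a hypothetical stand-in for the remote call, and the (ok, token) callback signature is an assumption made for this example.

```python
def fake_remote_login(user, passwd):
    """Hypothetical remote call; raises on bad credentials."""
    if passwd != "secret":
        raise ValueError("bad credentials")
    return "token-for-" + user


def login(user, passwd, callback):
    token = None
    try:
        token = fake_remote_login(user, passwd)
    except ValueError:
        pass  # the failure is reported through the callback below
    finally:
        # Runs on success and on failure, so the caller is never left waiting.
        callback(token is not None, token)


results = []
login("ann", "secret", lambda ok, token: results.append((ok, token)))
login("ann", "wrong", lambda ok, token: results.append((ok, token)))
```

In the Twisted version the same shape would apply inside the inlineCallbacks generator, with the yield placed inside the try block.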
