asyncio set_exception fails for a pending task

import asyncio

async def task_successful():
    pass

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    task = loop.create_task(task_successful())
    task.set_exception(ValueError)
    loop.run_until_complete(task)
What I expect is an exception raised from loop.run_until_complete(task), or at least task.exception() being a ValueError.
Instead I'm getting the following error:
Traceback (most recent call last):
  File "set_exception.py", line 11, in <module>
    task.set_exception(ValueError)
RuntimeError: Task does not support set_exception operation
This is very weird since this task is not done when the call is made:
<Task pending coro=<task_successful() running at set_exception.py:4>>
Nor is it the InvalidStateError mentioned in the docs.
Tried Python 3.7.5 and 3.9.1.
What is this? A bug, or am I doing something wrong?

This isn't a valid operation on a Task instance. See the Task docs:
"A Future-like object that runs a Python coroutine"
"asyncio.Task inherits from Future all of its APIs except Future.set_result() and Future.set_exception()."
For further proof, look at the source:
class Task(futures._PyFuture):  # Inherit Python Task implementation
                                # from a Python Future implementation.
    ...
    def set_exception(self, exception):
        raise RuntimeError('Task does not support set_exception operation')
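If the goal is simply a task that finishes with an exception, a minimal sketch of two alternatives (my own example, not from the thread): raise inside the coroutine, or use a bare asyncio.Future, which does support set_exception:
import asyncio

async def task_failing():
    raise ValueError("boom")  # the idiomatic way to make a Task end with an exception

async def main():
    task = asyncio.get_running_loop().create_task(task_failing())
    try:
        await task
    except ValueError as exc:
        print("task failed with:", exc)

    # A bare Future (not a Task) still supports set_exception:
    fut = asyncio.get_running_loop().create_future()
    fut.set_exception(ValueError("boom"))
    print("future exception:", fut.exception())

asyncio.run(main())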

How to properly transform a sync function to an async one?

I'm writing a Telegram bot and I need the bot to be available to users even while it is processing some previous request. My bot downloads videos and compresses them if they exceed the size limit, so it takes some time to process a request. I want to turn my sync functions into async ones and handle them in another process to make this happen.
I found a way to do this using this article, but it doesn't work for me. Here's my code to test the solution:
import asyncio
from concurrent.futures import ProcessPoolExecutor
from functools import wraps, partial

executor = ProcessPoolExecutor()

def async_wrap(func):
    @wraps(func)
    async def run(*args, **kwargs):
        loop = asyncio.get_running_loop()
        pfunc = partial(func, *args, **kwargs)
        return await loop.run_in_executor(executor, pfunc)
    return run

@async_wrap
def sync_func(a):
    import time
    time.sleep(10)

if __name__ == "__main__":
    asyncio.run(sync_func(4))
As a result, I've got the following error message:
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/mikhail/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/queues.py", line 245, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/mikhail/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function sync_func at 0x7f2e333625f0>: it's not the same object as __main__.sync_func
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mikhail/Projects/social_network_parsing_bot/processes.py", line 34, in <module>
    asyncio.run(sync_func(4))
  File "/home/mikhail/.pyenv/versions/3.10.4/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/mikhail/.pyenv/versions/3.10.4/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/mikhail/Projects/social_network_parsing_bot/processes.py", line 18, in run
    return await loop.run_in_executor(executor, pfunc)
  File "/home/mikhail/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/queues.py", line 245, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/mikhail/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function sync_func at 0x7f2e333625f0>: it's not the same object as __main__.sync_func
As I understand it, the error arises because the decorator changes the function and as a result returns a new object. What do I need to change in my code to make it work? Maybe I don't understand some crucial concept and there is a simple method to achieve this. Thanks for the help.
The article runs a nice experiment, but it really is just meant to work with a thread-pool executor, not a multiprocessing one.
If you look at its code, at some point it passes executor=None to the .run_in_executor call, and asyncio creates a default executor, which is a ThreadPoolExecutor.
The main difference from a ProcessPoolExecutor is that all data moved cross-process (and therefore, all data sent to the workers, including the target functions) has to be serialized, and that is done via Python's pickle.
Now, pickle serialization of a function does not really send the function object, along with its bytecode, down the wire: rather, it just sends the function's qualified name, and it is expected that the function with the same qualname on the other end is the same as the original function.
In the case of your code, the func that is the target for the executor pool is the function as declared, prior to being wrapped by the decorator (__main__.sync_func). But what exists under that name in the target process is the post-decoration function. So, if Python did not block it due to the functions not being the same, you'd get into an infinite loop, creating hundreds of nested subprocesses and never actually calling your function, as the entry point in the target would be the wrapped function. That is just an error in the article you viewed.
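You can see this pickle-by-reference behavior directly; a minimal sketch (my own, not from the article):
import pickle

def sync_func(a):
    return a

payload = pickle.dumps(sync_func)
# the payload contains the module and qualified name (b'__main__', b'sync_func'),
# not the function's bytecode
print(payload)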
All this said, the simpler way to make this work is, instead of using the decorator in the usual fashion, to keep the original, undecorated function in the module namespace and create a new name for the wrapped function; this way, the "raw" code can be the target for the executor:
(...)

def sync_func(a):
    import time
    time.sleep(2)
    print(f"finished {a}")

# this creates the decorated function with a new name,
# instead of replacing the original:
wrapped_sync = async_wrap(sync_func)

if __name__ == "__main__":
    asyncio.run(wrapped_sync("go go go"))

how to use Queue in python?

My code below encounters an error.
It's an example of using Queue in Python.
May I ask how to fix it?
Thanks for your help in advance.
# coding=utf-8
from multiprocessing import Queue

if __name__ == '__main__':
    q = Queue(3)
    q.put('message1')
    q.put('message2')
    print(q.full())
    q.put('message3')
    print(q.full())

    try:
        q.put("message4", True, 1)
    except:
        print("the queue is full. current amount is %s" % q.qsize())
    try:
        q.put('message4')
    except:
        print("the queue is full. current amount is %s" % q.qsize())

    if not q.empty():
        print('get message from the queue.')
        for i in range(q.qsize()):
            print(q.get_nowait())
    if not q.full():
        q.put_nowait("message4")
The output error is:
False
True
Traceback (most recent call last):
  File "process.py", line 13, in <module>
    q.put("message4", True, 1)
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/queues.py", line 84, in put
    raise Full
queue.Full

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "process.py", line 15, in <module>
    print("the queue is full. current amount is %s" % q.qsize())
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/queues.py", line 120, in qsize
    return self._maxsize - self._sem._semlock._get_value()
NotImplementedError
As Víctor Terrón has suggested in a GitHub discussion, you can use his implementation:
https://github.com/vterron/lemon/blob/d60576bec2ad5d1d5043bcb3111dff1fcb58a8d6/methods.py#L536-L573
According to the doc:
A portable implementation of multiprocessing.Queue. Because of
multithreading / multiprocessing semantics, Queue.qsize() may raise
the NotImplementedError exception on Unix platforms like Mac OS X
where sem_getvalue() is not implemented. This subclass addresses this
problem by using a synchronized shared counter (initialized to zero)
and increasing / decreasing its value every time the put() and get()
methods are called, respectively. This not only prevents
NotImplementedError from being raised, but also allows us to implement
a reliable version of both qsize() and empty().
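If pulling in an external implementation is not an option, one portable workaround (a sketch of my own, not from the linked code) is to avoid qsize() and empty() altogether and rely on the queue.Empty / queue.Full exceptions, which multiprocessing.Queue raises on all platforms:
import queue
import time
from multiprocessing import Queue

if __name__ == '__main__':
    q = Queue(3)
    for msg in ('message1', 'message2', 'message3'):
        q.put(msg)

    try:
        q.put('message4', True, 1)
    except queue.Full:
        print('the queue is full')  # no qsize() call, so no NotImplementedError

    # multiprocessing.Queue feeds items through a background thread, so give
    # it a moment before draining; then loop until Empty is raised.
    time.sleep(0.1)
    while True:
        try:
            print(q.get_nowait())
        except queue.Empty:
            break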

loop.run_until_complete(g) got the error "An asyncio.Future, a coroutine or an awaitable is required"

The following script:
import asyncio

async def f(x):
    print(f'test {x}')
    raise Exception('error...')

g = lambda: f('x')
print(type(g))

loop = asyncio.get_event_loop()
loop.run_until_complete(g)  # error; changing g to f also gets the error
got the following error:
TypeError: An asyncio.Future, a coroutine or an awaitable is required
I tried calling loop.run_until_complete() with both g and f and got the same error.
>>> async def a():
... pass
...
>>> a
<function a at 0x0000022F70FEA3B0>
>>> a()
<coroutine object a at 0x0000022F7100A2D0>
Just like generators, as Iain Shelvington said, coroutine functions do not produce coroutines until they are called.
But in case your company uses other concurrency libraries it may differ. Which library does your company use?
import trio
import anyio
import curio
import asyncio

async def f():
    print("run")

trio.run(f)
anyio.run(f)
curio.run(f)
asyncio.run(f)
run
run
run
Traceback (most recent call last):
  File "C:\Users\jupiterbjy\AppData\Roaming\JetBrains\PyCharm2021.3\scratches\scratch.py", line 13, in <module>
    asyncio.run(f)
  File "C:\Program Files\Python310\lib\asyncio\runners.py", line 37, in run
    raise ValueError("a coroutine was expected, got {!r}".format(main))
ValueError: a coroutine was expected, got <function f at 0x000001816CE9E200>
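For asyncio specifically, the fix is to pass the coroutine object produced by calling the coroutine function; a minimal sketch adapting the code above:
import asyncio

async def f(x):
    print(f'test {x}')

g = lambda: f('x')

asyncio.run(f('x'))  # ok: f('x') is a coroutine object
asyncio.run(g())     # ok: calling the lambda produces the coroutine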

Django API ignored exception in sessions module

I'm running a django rest framework server hosted on google cloud. Every hour or so I get a couple of these errors that I can't figure out:
Exception ignored in: <function _get_module_lock.<locals>.cb at 0x7f50f275ebf8>
Traceback (most recent call last): (no traceback provided)
  File "<frozen importlib._bootstrap>", line 191, in cb
KeyError: ('django.contrib.sessions.serializers',)
There's no traceback since the error is caught and ignored. I've followed the code down to this method in the CPython library:
def _get_module_lock(name):
    """Get or create the module lock for a given module name.

    Should only be called with the import lock taken."""
    lock = None
    try:
        lock = _module_locks[name]()
    except KeyError:
        pass

    if lock is None:
        if _thread is None:
            lock = _DummyModuleLock(name)
        else:
            lock = _ModuleLock(name)

        def cb(_):
            del _module_locks[name]

        _module_locks[name] = _weakref.ref(lock, cb)

    return lock
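The "Exception ignored in" wording is how CPython reports an exception raised inside a weakref finalizer callback: it is printed and then swallowed. A minimal sketch (hypothetical names, my own example) that reproduces the pattern:
import weakref

class Obj:
    pass

registry = {}  # hypothetical stand-in for _module_locks

def cb(_):
    # if the key was already removed, this raises KeyError,
    # which is printed as "Exception ignored in: ... cb ..." and swallowed
    del registry['name']

o = Obj()
r = weakref.ref(o, cb)  # keep 'r' alive so the callback can fire
registry['name'] = r
del registry['name']    # someone else cleans the entry up first
del o                   # finalizer runs cb -> KeyError -> ignored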
Has anyone seen this error before? I can't find any pattern of when this error comes in and can't manually reproduce it with any certainty.

Learning the Python Thread Module

I am trying to learn more about the thread module. I've come up with a quick script but am getting an error when I run it. The docs show the format as:
thread.start_new_thread ( function, args[, kwargs] )
My method only has one argument.
#!/usr/bin/python
import ftplib
import thread

sites = ["ftp.openbsd.org", "ftp.ucsb.edu", "ubuntu.osuosl.org"]

def ftpconnect(target):
    ftp = ftplib.FTP(target)
    ftp.login()
    print "File list from: %s" % target
    files = ftp.dir()
    print files

for i in sites:
    thread.start_new_thread(ftpconnect(i))
The error I am seeing occurs after one iteration of the for loop:
Traceback (most recent call last):
  File "./ftpthread.py", line 16, in <module>
    thread.start_new_thread(ftpconnect(i))
TypeError: start_new_thread expected at least 2 arguments, got 1
Any suggestions for this learning process would be appreciated. I also looked into using threading, but I am unable to import threading since it's apparently not installed, and I haven't found any documentation for installing that module yet.
Thank You!
The error I get when trying to import threading on my Mac is:
>>> import threading
# threading.pyc matches threading.py
import threading # precompiled from threading.pyc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "threading.py", line 7, in <module>
    class WorkerThread(threading.Thread) :
AttributeError: 'module' object has no attribute 'Thread'
The thread.start_new_thread function is really low-level and doesn't give you a lot of control. Take a look at the threading module, more specifically the Thread class: http://docs.python.org/2/library/threading.html#thread-objects
You then want to replace the last 2 lines of your script with:
# This line should be at the top of your file, obviously :p
from threading import Thread

threads = []
for i in sites:
    t = Thread(target=ftpconnect, args=[i])
    threads.append(t)
    t.start()

# Wait for all the threads to complete before exiting the program.
for t in threads:
    t.join()
Your code was failing, by the way, because in your for loop you were calling ftpconnect(i), waiting for it to complete, and then trying to use its return value (that is, None) to start a new thread, which obviously doesn't work.
In general, starting a thread is done by giving it a callable object (a function or method; you want the callable object itself, not the result of a call: my_function, not my_function()) plus optional arguments for that callable (in our case, [i], because ftpconnect takes one positional argument and you want it to be i), and then calling the Thread object's start method.
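To make the distinction concrete, a minimal sketch (with a hypothetical work function) showing the wrong and the right way to pass the target:
from threading import Thread

def work(n):
    print("working on %s" % n)

# Wrong: work(3) runs immediately, and Thread gets its return value (None)
# t = Thread(target=work(3))

# Right: pass the callable and its arguments separately
t = Thread(target=work, args=(3,))
t.start()
t.join()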
Now that you can import threading, start with best practices at once ;-)
import threading

threads = [threading.Thread(target=ftpconnect, args=(s,))
           for s in sites]
for t in threads:
    t.start()
for t in threads:  # shut down cleanly
    t.join()
What you want is to pass the function object and its arguments to thread.start_new_thread, not execute the function.
Like this:
for i in sites:
    thread.start_new_thread(ftpconnect, (i,))
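For reference, in Python 3 this low-level module was renamed _thread; the same call looks like this (a sketch, assuming Python 3, with a hypothetical stand-in function):
import _thread
import time

def ftpconnect(target):
    # hypothetical stand-in for the question's function
    print("would connect to", target)

for site in ["ftp.openbsd.org", "ftp.ucsb.edu"]:
    _thread.start_new_thread(ftpconnect, (site,))

time.sleep(1)  # crude: give the threads a chance to run before the program exits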
