I have an app which adds coroutines to an already-running event loop.
The arguments for these coroutines depend on I/O and are not available when I initially start the event loop - with loop.run_forever(), so I add the tasks later. To demonstrate the phenomenon, here is some example code:
import asyncio
from threading import Thread
from time import sleep

loop = asyncio.new_event_loop()

def foo():
    loop.run_forever()

async def bar(s):
    while True:
        await asyncio.sleep(1)
        print(s)

# loop.create_task(bar("A task created before thread created & before loop started"))

t = Thread(target=foo)
t.start()
sleep(1)

loop.create_task(bar("secondary task"))
The strange behaviour is that everything works as expected when there is at least one task in the loop at the moment loop.run_forever() is invoked, i.e. when the commented-out line above is uncommented.
But when it is commented out, as shown above, nothing is printed and it appears I am unable to add a task to the event_loop. Should I avoid invoking run_forever() without adding a single task? I don't see why this should be a problem. Adding tasks to an event_loop after it is running is standard, why should the empty case be an issue?
Adding tasks to an event_loop after it is running is standard, why should the empty case be an issue?
Because you're supposed to add tasks from the thread running the event loop. In general one should not mix threads and asyncio, except through APIs designed for that purpose, such as loop.run_in_executor.
If you understand this and still have good reason to add tasks from a separate thread, use asyncio.run_coroutine_threadsafe. Change loop.create_task(bar(...)) to:
asyncio.run_coroutine_threadsafe(bar("in loop"), loop=loop)
run_coroutine_threadsafe accesses the event loop in a thread-safe manner, and also ensures that the event loop wakes up to notice the new task, even if it otherwise has nothing to do and is just waiting for IO/timeouts.
Adding another task beforehand only appeared to work because bar happens to be an infinite coroutine that makes the event loop wake up every second. Once the event loop wakes up for any reason, it executes all runnable tasks regardless of which thread added them. It would be a really bad idea to rely on this, though, because loop.create_task is not thread-safe, so there could be any number of race conditions if it executed in parallel with a running event loop.
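For completeness, here is the question's example reworked to use run_coroutine_threadsafe. It is a sketch, with a demo-only shutdown tacked on at the end so the script exits cleanly:
import asyncio
from threading import Thread
from time import sleep

loop = asyncio.new_event_loop()

async def bar(s):
    while True:
        await asyncio.sleep(1)
        print(s)

t = Thread(target=loop.run_forever)
t.start()
sleep(1)

# Thread-safe: wakes the idle loop up and schedules the coroutine on it.
future = asyncio.run_coroutine_threadsafe(bar("secondary task"), loop=loop)

sleep(3)                           # "secondary task" is printed roughly 3 times

# Demo-only shutdown: cancel the task, then stop the loop, again thread-safely.
future.cancel()
sleep(0.1)                         # give the loop a moment to process the cancellation
loop.call_soon_threadsafe(loop.stop)
t.join()
loop.close()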
Because loop.create_task is not thread-safe, and if you enable the loop's debug mode (e.g. by setting loop._debug = True), you should see the error as:
Traceback (most recent call last):
  File "test.py", line 23, in <module>
    loop.create_task(bar("secondary task"))
  File "/Users/soulomoon/.pyenv/versions/3.6.3/lib/python3.6/asyncio/base_events.py", line 284, in create_task
    task = tasks.Task(coro, loop=self)
  File "/Users/soulomoon/.pyenv/versions/3.6.3/lib/python3.6/asyncio/base_events.py", line 576, in call_soon
    self._check_thread()
  File "/Users/soulomoon/.pyenv/versions/3.6.3/lib/python3.6/asyncio/base_events.py", line 615, in _check_thread
    "Non-thread-safe operation invoked on an event loop other "
RuntimeError: Non-thread-safe operation invoked on an event loop other than the current one
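For reference, loop.set_debug(True) is the documented way to turn this check on. A minimal sketch of the question's code with debug mode enabled (it exits after showing the error):
import asyncio
from threading import Thread
from time import sleep

loop = asyncio.new_event_loop()
loop.set_debug(True)        # makes call_soon() verify which thread is calling it

async def bar(s):
    while True:
        await asyncio.sleep(1)
        print(s)

t = Thread(target=loop.run_forever)
t.start()
sleep(1)
try:
    loop.create_task(bar("secondary task"))   # non-thread-safe call from the wrong thread
except RuntimeError as exc:
    print(exc)   # "Non-thread-safe operation invoked on an event loop other ..."
    # (you may also see a "coroutine 'bar' was never awaited" warning)
finally:
    loop.call_soon_threadsafe(loop.stop)      # shut the demo down
    t.join()
    loop.close()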
I'm reading the following code with Visual Studio Code (VSCode) Debugger.
import asyncio

async def main():
    print("OK Google. Wake me up in 1 seconds.")
    await asyncio.sleep(1)
    print("Wake up!")

if __name__ == "__main__":
    asyncio.run(main(), debug=False)
I can follow the main flow of the program, which schedules a callback to sleep for a second, but I found it difficult to work out when or where _run_until_complete_cb is called.
After the main coroutine is executed, this function/callback is called to stop the event loop by setting its _stopping flag to True. It is originally either appended to the internal _callbacks list of the Future class through add_done_callback, or scheduled with call_soon if the future is already done.
def add_done_callback(self, fn, *, context=None):
    """Add a callback to be run when the future becomes done.

    The callback is called with a single argument - the future object. If
    the future is already done when this is called, the callback is
    scheduled with call_soon.
    """
    if self._state != _PENDING:
        self._loop.call_soon(fn, self, context=context)
    else:
        if context is None:
            context = contextvars.copy_context()
        self._callbacks.append((fn, context))
In either case, it is registered with the event loop via call_soon and invoked at the end of a later iteration. But the future has not completed yet at the moment the callback is appended in the else clause above.
My question is: where or when does the future become done, so that _run_until_complete_cb ends up being scheduled with call_soon? The VSCode debugger seems to skip or ignore the lines that call methods on Future and Task instances, so the flow jumps right into the call_soon inside _run_until_complete_cb.
What exactly happens after the main coroutine finishes? Does anyone have ideas or hints about the clean-up process the asyncio module uses to stop the event loop, or about a way to step into the methods of Future or Task with the VSCode debugger?
Thanks a lot in advance!
If you're trying to debug into asyncio then you want to set "justMyCode": false in your launch.json. That will have the debugger trace into 3rd-party code, including the stdlib.
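As for where the future becomes done: roughly speaking, a Task sets its own result internally when the wrapped coroutine returns (inside Task.__step), and it is Future.set_result() that schedules the registered done callbacks with call_soon. Here is a small sketch of that same pattern using only public APIs; it is not the asyncio internals themselves, just the mechanism that _run_until_complete_cb rides on:
import asyncio

def on_done(fut):
    # This runs via loop.call_soon once fut has a result, which mirrors how
    # _run_until_complete_cb is invoked after the main task finishes.
    print("done callback fired with result:", fut.result())

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    fut.add_done_callback(on_done)   # future still pending: the callback is only
                                     # appended to its internal callbacks list
    fut.set_result(42)               # set_result() schedules the callback with call_soon
    await asyncio.sleep(0)           # give the loop one iteration to run it
    print("main() returning")

asyncio.run(main())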
I've been playing around with Python's asyncio. I think I have a reasonable understanding by now. But the following behavior puzzles me.
test.py:
from threading import Thread
import asyncio

async def wait(t):
    await asyncio.sleep(t)
    print(f'waited {t} sec')

def run(loop):
    loop.run_until_complete(wait(2))

loop = asyncio.get_event_loop()
t = Thread(target=run, args=(loop,))
t.start()
loop.run_until_complete(wait(1))
t.join()
This code is wrong. I know that. The event loop can't be run while it's running, and it's generally not thread safe.
My question: why can wait(1) sometimes still finish its job?
Here's the output from two consecutive runs:
>>> py test.py
... Traceback (most recent call last):
...   File "test.py", line 14, in <module>
...     loop.run_until_complete(wait(1))
...   File "C:\Python\Python37\lib\asyncio\base_events.py", line 555, in run_until_complete
...     self.run_forever()
...   File "C:\Python\Python37\lib\asyncio\base_events.py", line 510, in run_forever
...     raise RuntimeError('This event loop is already running')
... RuntimeError: This event loop is already running
... waited 2 sec
>>> py test.py
... Traceback (most recent call last):
...   File "test.py", line 14, in <module>
...     loop.run_until_complete(wait(1))
...   File "C:\Python\Python37\lib\asyncio\base_events.py", line 555, in run_until_complete
...     self.run_forever()
...   File "C:\Python\Python37\lib\asyncio\base_events.py", line 510, in run_forever
...     raise RuntimeError('This event loop is already running')
... RuntimeError: This event loop is already running
... waited 1 sec
... waited 2 sec
The first run's behavior is what I expected - the main thread fails, but the event loop still runs wait(2) to finish in the thread t.
The second run is puzzling: how can wait(1) do its job when the RuntimeError has already been thrown? I guess it has to do with thread synchronization and the non-thread-safe nature of the event loop, but I don't know exactly how this works.
Ohhh... never mind. I read the code of asyncio and figured it out. It's actually quite simple.
run_until_complete calls ensure_future(future, loop=self) before it checks self.is_running() (which is done in run_forever). Since the loop is already running, it can pick up the task before the RuntimeError is thrown. Of course it doesn't always happen because of the race condition.
Exceptions are thrown per thread. The runtime error is raised in a different thread from the event loop. The event loop continues to execute, regardless.
And wait(1) can sometimes finish its job because you can get lucky. The asyncio loop's internal data structures are not guarded against race conditions caused by using threads (which is why there are specific thread-support methods you should use instead). But the nature of race conditions is such that it depends on the exact order of events, and that order can change each time you run your program, depending on what else your OS is doing at the time.
The run_until_complete() method first calls asyncio.ensure_future() to add the coroutine to the task queue with a 'done' callback attached that will stop the event loop again, then calls loop.run_forever(). When the coroutine returns, the callback stops the loop. The loop.run_forever() call is what throws the RuntimeError here.
When you do this from a thread, the task gets added to a deque object attached to the loop, and if that happens at the right moment (e.g. when the running loop is not busy emptying the queue), the running loop in the main thread will find it, and execute it, even if the loop.run_forever() call raised an exception.
All this relies on implementation details. Different versions of Python will probably exhibit different behaviour here, and if you install an alternative loop (e.g. uvloop), there will almost certainly be different behaviour again.
If you want to schedule coroutines from a different thread, use asyncio.run_coroutine_threadsafe(); it would look like this:
from threading import Thread
import asyncio

async def wait(t):
    print(f'going to wait {t} seconds')
    await asyncio.sleep(t)
    print(f'waited {t} sec')

def run(loop):
    asyncio.run_coroutine_threadsafe(wait(2), loop)

loop = asyncio.get_event_loop()
t = Thread(target=run, args=(loop,))
t.start()
loop.run_until_complete(wait(1))
t.join()
The above doesn't actually complete the wait(2) coroutine because the wait(1) coroutine is being run with loop.run_until_complete() so its callback stops the loop again before the 2 second wait is over. But the coroutine is actually started:
going to wait 1 seconds
going to wait 2 seconds
waited 1 sec
but if you made the main-thread coroutine take longer (with, say, wait(3)) then the one scheduled from the thread would also complete. You'd have to do additional work to ensure that there are no more pending tasks scheduled to run with the loop before you shut it down.
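One possible way to do that additional work (a sketch, not the only approach): keep the concurrent.futures.Future that run_coroutine_threadsafe returns, and keep the loop running until it is done, for example via asyncio.wrap_future():
from threading import Thread
import asyncio

async def wait(t):
    print(f'going to wait {t} seconds')
    await asyncio.sleep(t)
    print(f'waited {t} sec')

def run(loop, handles):
    # Keep the concurrent.futures.Future so the main thread can wait on it.
    handles.append(asyncio.run_coroutine_threadsafe(wait(2), loop))

loop = asyncio.new_event_loop()
handles = []
t = Thread(target=run, args=(loop, handles))
t.start()
t.join()                                  # wait(2) has been scheduled by now
loop.run_until_complete(wait(1))          # stops as soon as wait(1) is done
# Keep the loop alive until the thread-scheduled coroutine finishes as well.
loop.run_until_complete(asyncio.wrap_future(handles[0], loop=loop))
loop.close()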
I have been experimenting with asyncio for a little while and read the PEPs; a few tutorials; and even the O'Reilly book.
I think I've got the hang of it, but I'm still puzzled by the behavior of loop.close(): I can't quite figure out when it is "safe" to invoke it.
Distilled to its simplest, my use case is a bunch of blocking "old school" calls, which I wrap in the run_in_executor() and an outer coroutine; if any of those calls goes wrong, I want to stop progress, cancel the ones still outstanding, print a sensible log and then (hopefully, cleanly) get out of the way.
Say, something like this:
import asyncio
import time

def blocking(num):
    time.sleep(num)
    if num == 2:
        raise ValueError("don't like 2")
    return num

async def my_coro(loop, num):
    try:
        result = await loop.run_in_executor(None, blocking, num)
        print(f"Coro {num} done")
        return result
    except asyncio.CancelledError:
        # Do some cleanup here.
        print(f"man, I was canceled: {num}")

def main():
    loop = asyncio.get_event_loop()
    tasks = []
    for num in range(5):
        tasks.append(loop.create_task(my_coro(loop, num)))
    try:
        # No point in waiting; if any of the tasks go wrong, I
        # just want to abandon everything. The ALL_DONE is not
        # a good solution here.
        future = asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
        done, pending = loop.run_until_complete(future)
        if pending:
            print(f"Still {len(pending)} tasks pending")
            # I tried putting a stop() - with/without a run_forever()
            # after the for - same exception raised.
            # loop.stop()
            for future in pending:
                future.cancel()
        for task in done:
            res = task.result()
            print("Task returned", res)
    except ValueError as error:
        print("Outer except --", error)
    finally:
        # I also tried placing the run_forever() here,
        # before the stop() - no dice.
        loop.stop()
        if pending:
            print("Waiting for pending futures to finish...")
            loop.run_forever()
        loop.close()
I tried several variants of the stop() and run_forever() calls; "run_forever first, then stop" seems to be the one to use according to the pydoc, and without the call to close() it yields a satisfying:
Coro 0 done
Coro 1 done
Still 2 tasks pending
Task returned 1
Task returned 0
Outer except -- don't like 2
Waiting for pending futures to finish...
man, I was canceled: 4
man, I was canceled: 3
Process finished with exit code 0
However, when the call to close() is added (as shown above) I get two exceptions:
exception calling callback for <Future at 0x104f21438 state=finished returned int>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
    callback(self)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/futures.py", line 414, in _call_set_state
    dest_loop.call_soon_threadsafe(_set_state, destination, source)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 620, in call_soon_threadsafe
    self._check_closed()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 357, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
which is at best annoying but, to me, totally puzzling: and, to make matters worse, I've been unable to figure out what The Right Way of handling such a situation would be.
Thus, two questions:
What am I missing? How should I modify the code above so that, with the call to close() included, it does not raise?
What actually happens if I don't call close()? In this trivial case I presume it's largely redundant, but what might the consequences be in "real" production code?
For my own personal satisfaction, also:
Why does it raise at all? What more does the loop want from the coros/tasks? They either exited, raised, or were cancelled: isn't that enough to keep it happy?
Many thanks in advance for any suggestions you may have!
Distilled to its simplest, my use case is a bunch of blocking "old school" calls, which I wrap in the run_in_executor() and an outer coroutine; if any of those calls goes wrong, I want to stop progress, cancel the ones still outstanding
This can't work as envisioned because run_in_executor submits the function to a thread pool, and OS threads can't be cancelled in Python (or in other languages that expose them). Canceling the future returned by run_in_executor will attempt to cancel the underlying concurrent.futures.Future, but that will only have effect if the blocking function is not yet running, e.g. because the thread pool is busy. Once it starts to execute, it cannot be safely cancelled. Support for safe and reliable cancellation is one of the benefits of using asyncio compared to threads.
If you are dealing with synchronous code, be it a legacy blocking call or longer-running CPU-bound code, you should run it with run_in_executor and incorporate a way to interrupt it. For example, the code could occasionally check a stop_requested flag and exit if that is true, perhaps by raising an exception. Then you can "cancel" those tasks by setting the appropriate flag or flags.
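A minimal sketch of that flag-based approach (the name stop_requested and the timings are illustrative, not from the original code):
import asyncio
import threading
import time

stop_requested = threading.Event()   # the "flag"; name is illustrative

def blocking(num):
    # Interruptible variant of a blocking call: sleep in small slices
    # and bail out if a stop was requested.
    for _ in range(num * 10):
        if stop_requested.is_set():
            raise RuntimeError(f"blocking({num}) aborted")
        time.sleep(0.1)
    return num

async def my_coro(loop, num):
    try:
        return await loop.run_in_executor(None, blocking, num)
    except RuntimeError as exc:
        print(exc)

async def main():
    loop = asyncio.get_running_loop()
    tasks = [asyncio.ensure_future(my_coro(loop, n)) for n in (1, 3)]
    await asyncio.sleep(0.5)
    stop_requested.set()          # "cancel" the blocking calls cooperatively
    await asyncio.gather(*tasks)  # now finishes quickly and cleanly

asyncio.run(main())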
how should I modify the code above in a way that with the call to close() included does not raise?
As far as I can tell, there is currently no way to do so without modifications to blocking and the top-level code. run_in_executor will insist on informing the event loop of the result, and this fails when the event loop is closed. It doesn't help that the asyncio future is cancelled, because the cancellation check is performed in the event loop thread, and the error occurs before that, when call_soon_threadsafe is called by the worker thread. (It might be possible to move the check to the worker thread, but it should be carefully analyzed whether that leads to a race condition between the call to cancel() and the actual check.)
why does it raise at all? what more does the loop want from the coros/tasks: they either exited; raised; or were canceled: isn't this enough to keep it happy?
It wants the blocking functions passed to run_in_executor (literally called blocking in the question) that have already been started to finish running before the event loop is closed. You cancelled the asyncio future, but the underlying concurrent future still wants to "phone home", finding the loop closed.
It is not obvious whether this is a bug in asyncio, or if you are simply not supposed to close an event loop until you somehow ensure that all work submitted to run_in_executor is done. Doing so requires the following changes:
Don't attempt to cancel the pending futures. Canceling them looks correct superficially, but it prevents you from being able to wait() for those futures, as asyncio will consider them complete.
Instead, send an application-specific event to your background tasks informing them that they need to abort.
Call loop.run_until_complete(asyncio.wait(pending)) before loop.close().
With these modifications (except for the application-specific event - I simply let the sleep()s finish their course), the exception did not appear.
what actually happens if I don't call close() - in this trivial case, I presume it's largely redundant; but what might the consequences be in a "real" production code?
Since a typical event loop runs for as long as the application, there should be no issue in not calling close() at the very end of the program. The operating system will clean up the resources on program exit anyway.
Calling loop.close() is important for event loops that have a clear lifetime. For example, a library might create a fresh event loop for a specific task, run it in a dedicated thread, and dispose of it. Failing to close such a loop could leak its internal resources (such as the pipe it uses for inter-thread wakeup) and cause the program to fail. Another example is test suites, which often start a new event loop for each unit test to ensure separation of test environments.
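For illustration, a hedged sketch of such a short-lived loop (run_in_private_loop is a made-up helper, not an asyncio API):
import asyncio
import threading

def run_in_private_loop(coro):
    """Hypothetical helper: run CORO on a fresh loop in a dedicated thread."""
    result = {}

    def worker():
        # The loop has a clear lifetime: created, used and always closed here,
        # so its internal resources (e.g. the self-pipe used for wake-ups)
        # are released deterministically.
        loop = asyncio.new_event_loop()
        try:
            result['value'] = loop.run_until_complete(coro)
        finally:
            loop.close()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result['value']

print(run_in_private_loop(asyncio.sleep(0.1, result='done')))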
EDIT: I filed a bug for this issue.
EDIT 2: The bug was fixed by devs.
Until the upstream issue is fixed, another way to work around the problem is to replace the use of run_in_executor with a custom version without the flaw. While rolling one's own run_in_executor sounds like a bad idea at first, it is in fact only a small piece of glue between a concurrent.futures future and an asyncio future.
A simple version of run_in_executor can be cleanly implemented using the public API of those two classes:
import asyncio
import functools

def run_in_executor(executor, fn, *args):
    """Submit FN to EXECUTOR and return an asyncio future."""
    loop = asyncio.get_event_loop()
    if args:
        fn = functools.partial(fn, *args)
    work_future = executor.submit(fn)
    aio_future = loop.create_future()
    aio_cancelled = False

    def work_done(_f):
        if not aio_cancelled:
            loop.call_soon_threadsafe(set_result)

    def check_cancel(_f):
        nonlocal aio_cancelled
        if aio_future.cancelled():
            work_future.cancel()
            aio_cancelled = True

    def set_result():
        if work_future.cancelled():
            aio_future.cancel()
        elif work_future.exception() is not None:
            aio_future.set_exception(work_future.exception())
        else:
            aio_future.set_result(work_future.result())

    work_future.add_done_callback(work_done)
    aio_future.add_done_callback(check_cancel)
    return aio_future
When loop.run_in_executor(None, blocking, num) is replaced with run_in_executor(executor, blocking, num), executor being a ThreadPoolExecutor created in main(), the code works without other modifications.
Of course, in this variant the synchronous functions will continue running in the other thread to completion despite being canceled -- but that is unavoidable without modifying them to support explicit interruption.
I'm trying to create a function performing some asynchronous operations using asyncio, users of this function should not need to know that asyncio is involved under the hood.
I'm having a very hard time understanding how this should be done with the asyncio API, as most functions seem to operate on some global loop variable accessed with get_event_loop, and calls to them are affected by the global state inside this loop.
I have four examples here where two (foo1 and foo3) seem to be reasonable use cases but they all show very strange behaviors:
import asyncio

async def bar(loop):
    # Disregard how simple this is, it's just for example
    s = await asyncio.create_subprocess_exec("ls", loop=loop)

def foo1():
    # Example 1: Just use get_event_loop
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait_for(bar(loop), 1000))
    # On exit this is written to stderr:
    # Exception ignored in: <bound method BaseEventLoop.__del__ of <_UnixSelectorEventLoop running=False closed=True debug=False>>
    # Traceback (most recent call last):
    #   File "/usr/lib/python3.5/asyncio/base_events.py", line 510, in __del__
    #   File "/usr/lib/python3.5/asyncio/unix_events.py", line 65, in close
    #   File "/usr/lib/python3.5/asyncio/unix_events.py", line 146, in remove_signal_handler
    #   File "/usr/lib/python3.5/signal.py", line 47, in signal
    # TypeError: signal handler must be signal.SIG_IGN, signal.SIG_DFL, or a callable object

def foo2():
    # Example 2: Use get_event_loop and close it when done
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait_for(bar(loop), 1000))  # RuntimeError: Event loop is closed --- if foo2() is called twice
    loop.close()

def foo3():
    # Example 3: Always use new_event_loop
    loop = asyncio.new_event_loop()
    loop.run_until_complete(asyncio.wait_for(bar(loop), 1000))  # RuntimeError: Cannot add child handler, the child watcher does not have a loop attached
    loop.close()

def foo4():
    # Example 4: Same as foo3 but also set_event_loop to the newly created one
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)  # Pollutes the global event loop; callers of foo4 do not expect this.
    loop.run_until_complete(asyncio.wait_for(bar(loop), 1000))  # OK
    loop.close()
None of these functions work, and I don't see any other obvious way to do it; how is asyncio supposed to be used? It seems like it's only designed to be used under the assumption that the entry point of the application is the only place where you can create and close the global loop. Do I have to fiddle around with event loop policies?
foo3 seems like the correct solution, but I get an error even though I explicitly pass along loop, because deep down inside create_subprocess_exec it is using the current policy to get a new loop, which is None. Is this a bug in asyncio subprocess?
I'm using Python 3.5.3 on Ubuntu.
The foo1 error happens because you didn't close the event loop; see this issue.
foo2 fails because you can't reuse a closed event loop.
foo3 fails because you didn't set the new event loop as the global one.
foo4 is almost what you want; all that's left to do is to store the old event loop and set it back as the global one after bar has executed:
import asyncio

async def bar():
    # After you set new event loop global,
    # there's no need to pass loop as param to bar or anywhere else.
    process = await asyncio.create_subprocess_exec("ls")
    await process.communicate()

def sync_exec(coro):  # foo5
    old_loop = asyncio.get_event_loop()
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(coro)
    finally:
        loop.close()
        asyncio.set_event_loop(old_loop)

sync_exec(asyncio.wait_for(bar(), 1000))
One more important thing: it's not clear why you want to hide the use of asyncio behind some synchronous functions, but usually that's a bad idea. The whole point of having one global event loop is to allow the user to run different concurrent jobs in that single event loop. You're trying to take away this possibility. I think you should reconsider this decision.
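As a side note, on Python 3.7+ the standard asyncio.run() does most of what sync_exec does by hand: it creates a fresh event loop, runs the coroutine and closes the loop again. It does not restore a previously-set global loop, though, and it refuses to run inside an already-running loop. A sketch:
import asyncio

async def bar():
    process = await asyncio.create_subprocess_exec("ls")
    await process.communicate()

def sync_exec_py37(coro, timeout=1000):
    # Rough equivalent of sync_exec above on Python 3.7+, minus restoring
    # the old global event loop afterwards.
    asyncio.run(asyncio.wait_for(coro, timeout))

sync_exec_py37(bar())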
Upgrade to Python 3.6, then foo1() will work, without the need to explicitly close the default event loop.
Not the answer I was hoping for, as we only use 3.5 :(
I have this piece of code in my program, where the OnDone function is an event handler in a wxPython GUI. When I click the DONE button, the OnDone event fires, does some work, and starts the thread self.tStart with target function StartEnable. I then want to join this thread back using self.tStart.join(). However, I am getting the following error:
Exception in thread StartEnablingThread:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 801, in __bootstrap_inner
    self.run()
  File "C:\Python27\lib\threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "//wagnernt.wagnerspraytech.com/users$/kundemj/windows/my documents/Production GUI/Trial python Codes/GUI_withClass.py", line 638, in StartEnable
    self.tStart.join()
  File "C:\Python27\lib\threading.py", line 931, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread
I have not gotten this type of error before. Could anyone tell me what I am missing here?
def OnDone(self, event):
    self.WriteToController([0x04],'GuiMsgIn')
    self.status_text.SetLabel('PRESSURE CALIBRATION DONE \n DUMP PRESSURE')
    self.led1.SetBackgroundColour('GREY')
    self.add_pressure.Disable()
    self.tStart = threading.Thread(target=self.StartEnable, name = "StartEnablingThread", args=())
    self.tStart.start()

def StartEnable(self):
    while True:
        time.sleep(0.5)
        if int(self.pressure_text_control.GetValue()) < 50:
            print "HELLO"
            self.start.Enable()
            self.tStart.join()
            print "hello2"
            break
I want to join the thread after the "if" condition has executed. Until then I want the thread to keep running.
Joining Is Waiting
Joining a thread actually means waiting for another thread to finish.
So, in thread1, there can be code which says:
thread2.join()
That means "stop here and do not execute the next line of code until thread2 is finished".
If you did (in thread1) the following, that would fail with the error from the question:
thread1.join() # RuntimeError: cannot join current thread
Joining Is Not Stopping
Calling thread2.join() does not cause thread2 to stop, nor even signal to it in any way that it should stop.
A thread stops when its target function exits. Often, a thread is implemented as a loop which checks for a signal (a variable) which tells it to stop, e.g.
def run(self):
    while whatever:
        # ...
        if self.should_abort_immediately:
            print 'aborting'
            return
Then, the way to stop the thread is to do:
thread2.should_abort_immediately = True # tell the thread to stop
thread2.join() # entirely optional: wait until it stops
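Putting the pattern together in one runnable sketch (Worker is a hypothetical class, not the code from the question):
import threading
import time

class Worker(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.should_abort_immediately = False

    def run(self):
        while True:
            time.sleep(0.1)          # placeholder for real work
            if self.should_abort_immediately:
                print('aborting')
                return

worker = Worker()
worker.start()
time.sleep(0.5)
worker.should_abort_immediately = True   # tell the thread to stop
worker.join()                            # wait until it actually stops
print('worker stopped')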
The Code from the Question
That code already implements the stopping correctly with the break. The join should just be deleted.
if int(self.pressure_text_control.GetValue()) < 50:
    print "HELLO"
    self.start.Enable()
    print "hello2"
    break
When the StartEnable method is executing, it is running on the StartEnablingThread you created in the OnDone method. You cannot join the current thread. This is clearly stated in the documentation for the join call.
join() raises a RuntimeError if an attempt is made to join the current thread as that would cause a deadlock. It is also an error to join() a thread before it has been started and attempts to do so raises the same exception.
I have some bad news. Threading in Python is of limited use, and your best bet is to look at using only one thread or at using multiple processes. If you really need threads, you will need to look at a different language like C# or C. Have a look at https://docs.python.org/2/library/multiprocessing.html
The reason threading is of limited use in Python is the global interpreter lock (GIL). It means only one thread can execute Python bytecode at a time, so you get no true multi-threaded parallelism in Python, although there are people working on it. http://pypy.org/
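If you do go the multiprocessing route, a minimal sketch (the square worker is just an illustration):
# Works on Python 2.7 and 3.x; each worker runs in its own interpreter,
# so the GIL does not serialize CPU-bound work across them.
from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    pool = Pool(processes=4)
    print(pool.map(square, [0, 1, 2, 3, 4]))   # [0, 1, 4, 9, 16]
    pool.close()
    pool.join()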