I'm replacing part of an existing program. The original program uses threads. There is a particular class, inheriting from threading.Thread, whose functionality I need to replace while keeping the interface the same.
The functionality I'm integrating is packaged in a library which uses asyncio a lot.
The original calls to the class I'm replacing go something like this:
network = Network()
network.start()
network.fetch_something() # crashes!
network.stop()
I've gotten to a point where my replacement class also inherits from threading.Thread, and from within the run method I can connect to my backends via the client library:
class Network(threading.Thread):
    def __init__(self):
        super().__init__()  # required so the Thread machinery is initialised
        self._loop = asyncio.new_event_loop()
        self._client = Client()  # this is the library

    def run(self):
        self._loop.run_until_complete(self.__connect())  # works dandy, implementation not shown
        self._loop.run_forever()

    def fetch_something(self):
        return self._loop.run_until_complete(self._client.fetch_something())
Running this code throws an exception:
RuntimeError: Non-thread-safe operation invoked on an event loop other than the current one
I sort of get what's going on here. In the run method things worked out because the thread running the event loop was also the caller. In the other case another thread was the caller, hence the problem.
As you might have noticed I was hoping the problem would have been solved by using the same event loop. Alas, that didn't work out.
I really want to keep the interface exactly as it is otherwise I'm refactoring for the remainder of the year. I could relatively easily pass arguments to the constructor of the Network class. I've tried passing in an event loop created on the main thread but the result was the same.
(Note that this is the opposite of the problem this author has: Call coroutine within Thread)
When scheduling a coroutine from a different thread, you must use asyncio.run_coroutine_threadsafe. For example:
def fetch_something(self):
    future = asyncio.run_coroutine_threadsafe(
        self._client.fetch_something(), self._loop)
    return future.result()
run_coroutine_threadsafe schedules the coroutine with the event loop in a thread-safe way and returns a concurrent.futures.Future. You can use the returned future to simply wait for the result as shown above, but you can also pass it to other functions, poll whether the result has arrived, or implement timeouts.
When combining threads and asyncio, remember to make sure that all interfacing with the event loop from other threads (even to call something as simple as loop.stop to implement Network.stop) is done using loop.call_soon_threadsafe and asyncio.run_coroutine_threadsafe.
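Putting the pieces together, a minimal sketch of the whole class might look like the following. FakeClient is a hypothetical stand-in for the real client library, and the _ready event and the timeout parameter are assumptions added for illustration; they are not part of the original interface.

```python
import asyncio
import threading


class FakeClient:
    """Hypothetical stand-in for the asyncio-based client library."""

    async def connect(self):
        await asyncio.sleep(0)

    async def fetch_something(self):
        await asyncio.sleep(0.01)
        return "something"


class Network(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self._loop = asyncio.new_event_loop()
        self._client = FakeClient()
        self._ready = threading.Event()  # set once the connection is up

    def run(self):
        # Only this thread ever drives the loop directly.
        self._loop.run_until_complete(self._client.connect())
        self._ready.set()
        try:
            self._loop.run_forever()
        finally:
            self._loop.close()

    def fetch_something(self, timeout=5):
        # Called from other threads: hand the coroutine to the loop thread
        # and block on the concurrent.futures.Future until it completes.
        self._ready.wait()
        future = asyncio.run_coroutine_threadsafe(
            self._client.fetch_something(), self._loop)
        return future.result(timeout)

    def stop(self):
        # loop.stop is not thread-safe, so schedule it on the loop thread.
        self._ready.wait()
        self._loop.call_soon_threadsafe(self._loop.stop)
        self.join()
```

The calling code from the question (start, fetch_something, stop) should then work unchanged, since every cross-thread interaction goes through one of the two thread-safe entry points.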
I'm working on a project with RabbitMQ using the RPC pattern: I consume messages from a queue, do some processing, and then send a response back. I'm using Pika, and my goal is to use one thread per task, so for every task I create a thread dedicated to that task. I also read that the best practice is to open only one connection and create as many channels under it as needed, but I always get this error:
'start_consuming may not be called from the scope of '
pika.exceptions.RecursionError: start_consuming may not be called from the scope of another BlockingConnection or BlockingChannel callback.
I did some research and found out that Pika is not thread-safe and that every thread should use its own connection and channel. But I don't want to do that, since it is considered bad practice. So I wanted to ask here whether someone has already managed to make this work. I also read that it is possible if I don't use BlockingConnection to create my connection, and that there is a function called add_callback_threadsafe which can make this possible. Unfortunately there are no examples for it; I read the documentation, but it's complex, and without examples it was hard for me to grasp what it describes.
My attempt was to declare two classes. Each class represents a task executor which consumes a message from a queue, does some processing based on it, and delivers a response back. My idea was to share one RabbitMQ connection between the two tasks while giving every task an independent channel. In the code below, the rabbit parameter passed to the function is a class that holds some variables, such as the connection, and other functions such as EventSubscriber, which when called assigns a new channel and starts consuming messages from that particular exchange and routing key. Next I declare a thread and give the subscribe (consume) function to it as its target. The other task class looks the same as this one, which is why I'm only uploading this code. In the main class I make a connection to RabbitMQ and pass it as a parameter to the constructors of the two task classes.
class On_Deregistration:
    def __init__(self, rabbit):
        # Calls event() with the connection shared between all tasks;
        # the rabbit parameter holds a connection to RabbitMQ.
        self.event(rabbit)

    def event(self, rabbit):
        # onDeregistrationFromHRS is the task listener
        self.Subscriber = rabbit.EventSubscriber(
            rabbit, 'testing.test', 'test', False, onDeregistrationFromHRS)

    def subscribeAsync(self):
        self.Subscriber.subscribe()  # here I call start_consuming

    def start(self):
        """Start the subscription in an independent thread."""
        thread = threading.Thread(target=self.subscribeAsync)
        thread.start()
        if thread.is_alive():
            print("asynchronous subscription started")
Main class:
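class App:
    def __init__(self):
        self.rabbitMq = RabbitMqCommunicationInterface(host='localhost', port=5672)
        firstTask = On_Deregistration(self.rabbitMq)
        # secondTask is the other task class (not shown); a distinct variable
        # name avoids shadowing it inside this method
        second_task = secondTask(self.rabbitMq)

app = App()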
error : 'start_consuming may not be called from the scope of '
pika.exceptions.RecursionError: start_consuming may not be called from the scope of another BlockingConnection or BlockingChannel callback
I searched for the cause of this error, and apparently Pika is not thread-safe, but there must be a solution for this. Maybe not using a BlockingConnection? Maybe someone can give me an example of how to do that, because I tried it and it didn't work. Maybe I'm missing something about how to implement multithreading with RabbitMQ.
So, after long research, I figured out that Pika is not thread-safe, at least for the moment; maybe newer versions will be. For my project I stopped using Pika and am now using b-rabbit, which is a thread-safe wrapper around rabbitpy. I must say that Pika is a great library, and I find its API better described and structured than rabbitpy's, but multithreading was mandatory for my project, which is why Pika was a bad choice for the moment. I hope this helps someone in the future.
Why does the get_event_loop method in asyncio (source) check whether the current thread is the main thread (see my comment in the snippet below)?
def get_event_loop(self):
    """Get the event loop.

    This may be None or an instance of EventLoop.
    """
    if (self._local._loop is None and
            not self._local._set_called and
            isinstance(threading.current_thread(), threading._MainThread)):  # <- I mean this thing here
        self.set_event_loop(self.new_event_loop())
    if self._local._loop is None:
        raise RuntimeError('There is no current event loop in thread %r.'
                           % threading.current_thread().name)
    return self._local._loop
For convenience, asyncio supports automatically creating an event loop without having to go through calls to new_event_loop() and set_event_loop(). As the event loop is moderately expensive to create, and consumes some OS resources, it's not created automatically on import, but on-demand, specifically on the first call to get_event_loop(). (This feature is mostly obsoleted by asyncio.run which always creates a new event loop, and then the auto-created one can cause problems.)
This convenience, however, is reserved for the main thread - any other thread must set the event loop explicitly. There are several possible reasons for this:
preventing confusion - you don't want an accidental call to get_event_loop() from an arbitrary thread to appropriate the "main" (auto-created) event loop for that thread;
some asyncio features work best when or require that the event loop is run in the main thread - for example, subprocesses and signal handling.
These problems could also be avoided by automatically creating a new event loop in each thread that invokes get_event_loop(), but that would make it easy to accidentally create multiple event loops whose coroutines would be unable to communicate with each other, which would go against the design of asyncio. So the remaining option is for the code to special-case the main thread, encouraging developers to use that thread for executing asyncio code.
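The effect on a non-main thread can be seen with a small experiment. This is only a sketch: the thread name and the results dictionary are made up for illustration, and the main-thread behaviour of get_event_loop() has changed across Python versions, so only the worker-thread case is shown.

```python
import asyncio
import threading

results = {}


def worker():
    # No loop has been set for this thread, so get_event_loop()
    # refuses to create one implicitly and raises RuntimeError.
    try:
        asyncio.get_event_loop()
    except RuntimeError as exc:
        results["error"] = str(exc)

    # Setting a loop explicitly makes it this thread's current loop.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        results["same_loop"] = asyncio.get_event_loop() is loop
        results["value"] = loop.run_until_complete(asyncio.sleep(0, result=42))
    finally:
        asyncio.set_event_loop(None)
        loop.close()


t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()
print(results)
```

The first call fails because the thread has no loop, while the later calls succeed because the loop was created and installed explicitly, which is the behaviour the special case in get_event_loop is meant to enforce.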
Suppose there's a synchronous function in a twisted-powered Python program that takes a long time to execute, doing that in a lot of reasonable-sized pieces of work. If the function could return deferreds, this would be a no-brainer, however the function happens to be deep inside some synchronous code, so that yielding deferreds to continue is impossible.
Is there a way to let twisted handle outstanding events without leaving that function? I.e. what I want to do is something along the lines of
def my_func():
    results = []
    for item in a_lot_of_items():
        results.append(do_computation(item))
        reactor.process_outstanding_events()
    return results
Of course, this imposes reentrancy requirements on the code, but still, there's QCoreApplication.processEvents for that in Qt, is there anything in twisted?
The solution taken by some event-loop-based systems (essentially the solution you're referencing via Qt's QCoreApplication.processEvents API) is to make the main loop re-entrant. In Twisted terms, this would mean something like (not working code):
def my_expensive_task_that_cannot_be_asynchronous():
    @inlineCallbacks
    def do_work(units):
        for unit in units:
            yield do_one_work_asynchronously(unit)

    work = do_work(some_work_units())
    work.addBoth(lambda ignored: reactor.stop())
    reactor.run()

def main():
    # Whatever your setup is...

    # Then, hypothetical event source triggering your
    # expensive function:
    reactor.callLater(
        30,
        my_expensive_task_that_cannot_be_asynchronous,
    )
    reactor.run()
Notice how there are two reactor.run calls in this program. If Twisted had a re-entrant event loop, this second call would start spinning the reactor again and not return until a matching reactor.stop call is encountered. The reactor would process all events it knows about, not just the ones generated by do_work, and so you would have the behavior you desire.
This requires a re-entrant event loop because my_expensive_task_... is already being called by the reactor loop. The reactor loop is on the call stack. Then, reactor.run is called and the reactor loop is now on the call stack again. So the usual issues apply: the event loop cannot have left over state in its frame (otherwise it may be invalid by the time the nested call is complete), it cannot leave its instance state inconsistent during any calls out to other code, etc.
Twisted does not have a re-entrant event loop. This is a feature that has been considered and, at least in the past, explicitly rejected. Supporting this feature brings a huge amount of additional complexity (described above) to the implementation and the application. If the event loop is re-entrant then it becomes very difficult to avoid requiring all application code to be re-entrant safe as well. This negates one of the major benefits of the cooperative multitasking approach Twisted takes to concurrency (that you are guaranteed your functions will not be re-entered).
So, when using Twisted, this solution is out.
I'm not aware of another solution which would allow you to continue to run this code in the reactor thread. You mentioned that the code in question is nested deeply within some other synchronous code. The other options that come to mind are:
make the synchronous code capable of dealing with asynchronous things
factor the expensive parts out and compute them first, then pass the result in to the rest of the code
run all of that code, not just the computationally expensive part, in another thread
You could use deferToThread.
http://twistedmatrix.com/documents/13.2.0/core/howto/threading.html
That method runs your calculation in a separate thread and returns a deferred that is called back when the calculation is actually finished.
The issue is that if do_heavy_computation() blocks, execution won't proceed to the next function. In that case, use deferToThread or blockingCallFromThread for heavy calculations. If you don't care about the results of the calculation, you can use callInThread. Take a look at the documentation on threads.
This should do:
for item in items:
    reactor.callLater(0, heavy_func, item)
reactor.callLater should bring you back into the event loop.
I use a library, and within one of my functions I call a function from that library which happens to take a really long time. At the same time, I have another thread running in which I check for different conditions. If a condition is met, I want to cancel the execution of the library function.
Right now I'm checking the conditions at the start of the function, but if the conditions happen to change while the library function is running, I don't need its results, and want to return from it.
Basically this is what I have now.
def my_function():
    if condition_checker.condition_met():
        return
    library.long_running_function()
Is there a way to run the condition check every second or so and return from my_function when the condition is met?
I've thought about decorators and coroutines. I'm using 2.7, but if this can only be done in 3.x I'd consider switching; I just can't figure out how.
You cannot terminate a thread. Either the library supports cancellation by design, where it internally would have to check for a condition every once in a while to abort if requested, or you have to wait for it to finish.
What you can do is call the library in a subprocess rather than a thread, since processes can be terminated through signals. Python's multiprocessing module provides a threading-like API for spawning forks and handling IPC, including synchronization.
Or spawn a separate subprocess via subprocess.Popen if forking is too heavy on your resources (e.g. memory footprint through copying of the parent process).
I can't think of any other way, unfortunately.
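As a rough sketch of the subprocess route, assuming the long-running work can be wrapped in a standalone command: the helper name run_cancellable, its parameters, and the sleeping child process are all made up for illustration.

```python
import subprocess
import sys
import time


def run_cancellable(cmd, should_cancel, poll_interval=0.1):
    """Run cmd in a child process; terminate it if should_cancel() turns true.

    Returns the child's exit code, or None if it was cancelled.
    """
    proc = subprocess.Popen(cmd)
    try:
        while proc.poll() is None:      # child is still running
            if should_cancel():
                proc.terminate()        # SIGTERM on POSIX, TerminateProcess on Windows
                proc.wait()
                return None
            time.sleep(poll_interval)
        return proc.returncode
    finally:
        if proc.poll() is None:         # e.g. should_cancel() raised
            proc.kill()
            proc.wait()


# Hypothetical stand-in for the long-running library call.
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
deadline = time.monotonic() + 0.5
print(run_cancellable(slow, lambda: time.monotonic() > deadline))
```

The condition is polled in the parent, so the child needs no cooperation from the library. The price is that results have to come back through some form of IPC (stdout, a file, a pipe) rather than a return value.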
Generally, I think you want to run your long_running_function in a separate thread, and have it occasionally report its information to the main thread.
This post gives a similar example within a wxpython program.
Presuming you are doing this outside of wxpython, you should be able to replace the wx.CallAfter and wx.Publisher with threading.Thread and PubSub.
It would look something like this:
import threading
import time

def myfunction():
    # subscribe to the long_running_function
    while True:
        # get the data published by the long_running_function
        if condition_met:
            # publish a stop command
            break
        time.sleep(1)

def long_running_function():
    for loop in loops:
        # subscribe to main thread and check for stop command, if so, break
        # do an iteration
        # publish some data
        pass

# start() launches long_running_function without blocking the main flow
threading.Thread(target=long_running_function).start()
myfunction()
I haven't used pubsub a ton so I can't quickly whip up the code but it should get you there.
As an alternative, do you know the stop criteria before you launch the long_running_function? If so, you can just pass it as an argument and check whether it is met internally.
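If the long-running function can be written (or patched) to check for a stop request between iterations, the same idea can be sketched with a plain threading.Event instead of pubsub. The chunked long_running_function below is a hypothetical stand-in; this does not help when the library call is a single opaque blocking call.

```python
import threading
import time


def long_running_function(stop_event, steps=1000):
    """Hypothetical chunked job that checks for a stop request between chunks."""
    done = 0
    for _ in range(steps):
        if stop_event.is_set():     # stop command from the main thread
            break
        time.sleep(0.01)            # stand-in for one iteration of real work
        done += 1
    return done


def my_function(condition_met, poll_interval=0.05):
    stop_event = threading.Event()
    result = {}

    def target():
        result["done"] = long_running_function(stop_event)

    worker = threading.Thread(target=target)
    worker.start()
    while worker.is_alive():
        if condition_met():         # re-check the condition periodically
            stop_event.set()
            break
        time.sleep(poll_interval)
    worker.join()                   # the worker exits at its next check
    return result["done"]
```

Calling my_function with a callable that becomes true partway through returns the number of iterations completed before the stop request was noticed.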
I'm writing a multi-threaded application that utilizes QThreads. I know that, in order to start a thread, I need to override the run() method and call that method using the thread.start() somewhere (in my case in my GUI thread).
I was wondering, however, is it required to call the .wait() method anywhere and also am I supposed to call the .quit() once the thread finishes, or is this done automatically?
I am using PySide.
Thanks
Both answers depend on what your code is doing and what you expect from the thread.
If the logic that uses the thread needs to wait synchronously for the moment the QThread finishes, then yes, you need to call wait(). However, such a requirement is a sign of a sloppy threading model, except in very specific situations like application startup and shutdown. Use of QThread::wait() suggests creeping sequential operation, which means that you are effectively not using threads concurrently.
quit() exits the QThread's internal event loop, which you are not required to use. A long-running thread (as opposed to a one-task worker) must have an event loop of some sort; this is a general statement, not specific to QThread. You either write it yourself (as some kind of while(keepRunning) { } loop) or use the Qt-provided event loop, which you start by calling exec() in your run() method. You can finish the former yourself, because you supplied the keepRunning condition. The Qt-provided implementation is hidden from you, and that is where the quit() call comes in: internally it does nothing more than set a similar flag inside Qt.