Infinite Threads? - python

The following code is the init function from a Kivy app I'm coding. The app utilises Kivy's built-in Clock method to call an update function every 10 seconds. The update function runs CPU-intensive code, so I use a function within the init function to run the update function in its own thread. This code does what I want it to do, but it occurred to me that each time the update function gets called, a new unique thread is created.
My Questions:
Are there any problems or issues associated with a potentially infinite number of threads being created?
Is there a method that stops or destroys a thread before the new one is created? If so is that approach advisable or does it matter if infinite threads get created?
Is there a better way to code this?
def __init__(self, **kwargs):
    super().__init__(**kwargs)

    def start_thread(dt):
        t = threading.Thread(target=self.update)
        t.start()

    Clock.schedule_interval(start_thread, 10)

def update(self):
    "Does some stuff in new thread every time it's called"

Is there a better way to code this?
Use a thread pool. Pooling is when we re-use objects instead of continually creating and destroying new ones. A thread pool uses a small collection of "worker" threads to perform tasks (callable objects) that your program submits to it.
https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor
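As a rough illustration of the thread pool suggestion, here is a minimal sketch that does not depend on Kivy. The Updater class and its start_update and shutdown methods are hypothetical stand-ins for the app class in the question; in the real app, start_update would be the callback passed to Clock.schedule_interval. With max_workers=1, every submitted task runs on the same worker thread instead of a fresh one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Updater:
    """Hypothetical stand-in for the Kivy app class (Clock is not used here)."""

    def __init__(self):
        # A single worker thread is created on first use and then reused.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self.thread_names = []

    def start_update(self, dt):
        # In the app, Clock.schedule_interval would call this every 10 seconds.
        return self._executor.submit(self.update)

    def update(self):
        # The CPU-intensive work would go here; this only records which
        # thread ran the task.
        self.thread_names.append(threading.current_thread().name)

    def shutdown(self):
        # Wait for queued tasks to finish and release the worker thread.
        self._executor.shutdown(wait=True)


updater = Updater()
futures = [updater.start_update(10) for _ in range(5)]
for future in futures:
    future.result()  # re-raises any exception from update()
updater.shutdown()
print(set(updater.thread_names))
```

One consequence of this design is that if an update takes longer than the scheduling interval, later calls queue up behind it rather than running in parallel.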

Related

Terminate application if subprocess ends

I have an application that does some data processing in its main thread. So far it has been a pure console application. Now I have had to add a Qt app for visualization purposes, and I did this as a separate thread.
If the QT Window is closed, the main thread of course still runs. How could I terminate the main thread once the window is closed?
import threading
import time

class Window(threading.Thread):
    def __init__(self, data_getter):
        super(Window, self).__init__()
        self.getter = data_getter

    def update(self):
        data = self.getter()
        # update all UI widgets

    def run(self):
        app: QApplication = QApplication([])
        app.setStyleSheet(style.load_stylesheet())
        window = QWidget()
        window.setWindowTitle("Test Widget")
        window.setGeometry(100, 100, 600, 300)
        layout = QGridLayout()
        self.LABEL_state: QLabel = QLabel("SM State: N/A")
        layout.addWidget(self.LABEL_state)
        window.setLayout(layout)
        window.show()
        timer = QTimer()
        timer.timeout.connect(self.update)
        timer.start(1000)
        app.exec_()

class Runner:
    def __init__(self):
        pass

    def data_container(self):
        return data

    def process_data(self):
        # do the data processing
        pass

def main():
    runner: Runner = Runner()
    time.sleep(1)
    w = Window(runner.data_container)
    w.start()
    while True:
        runner.process_data()
        time.sleep(2)

if __name__ == "__main__":
    main()
The best idea I had is to give Window another function reference of Runner that is then registered inside Window to atexit and would set a termination flag that is frequently checked inside the main process (Runner). Is there a better approach? I know it might be better to have the QApp run as the main process, but I'd like to not have to do that in this case.
There are basically two questions here: synchronising an event across two threads, and stopping a running thread from outside. The way you solve the latter problem will probably affect your solution to the former. Very broadly, you can either:
poll some flag inside a main loop (in your case the while True loop in main would be an obvious target, possibly moving the logic into process_data and having it run to completion), or
use some mechanism to stop the containing process (like a signal), optionally registering cleanup code to get things into a known state.
In either case you can then design your api however you like, but a .stop() or .cancel() method is a very normal solution.
The trouble with relying on polling is that the worst-case response time is an entire cycle of your main loop. If that's not acceptable, you probably want to trigger the containing process or look for ways to check more frequently (if your process_data() takes << 2s to run, replace the sleep(2) with a looped smaller delay and poll the flag there).
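The polling approach can be sketched with a threading.Event as the flag. This is only an illustration under assumptions: stop_requested, request_stop and run_main_loop are hypothetical names, and the Timer below merely simulates the window being closed. Using Event.wait with a timeout in place of sleep means the loop wakes up as soon as the flag is set, so no looped smaller delay is needed.

```python
import threading

# Shared flag: the GUI thread sets it, the main loop polls it.
stop_requested = threading.Event()


def request_stop():
    # Hypothetical hook the window would call when it is closed.
    stop_requested.set()


def run_main_loop(process_data, interval=2.0):
    """Call process_data repeatedly until a stop is requested."""
    cycles = 0
    while not stop_requested.is_set():
        process_data()
        cycles += 1
        # Unlike time.sleep(interval), this returns early once the flag is set.
        stop_requested.wait(timeout=interval)
    return cycles


# Simulate the window closing shortly after the loop starts.
threading.Timer(0.05, request_stop).start()
cycles_run = run_main_loop(lambda: None, interval=5.0)
```

The response time is then bounded by how long a single process_data() call takes rather than by the sleep interval.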
If stopping by setting a flag isn't workable, you can trigger the containing process. This normally implies that the triggering code is running in a different thread/process. Python's threads don't have a .terminate(), but multiprocessing.Processes do, so you could delegate your processing over to a process and then have the main code call .terminate() (or get the pid yourself and send the signal manually). In this case the main code would be doing nothing until signalled, or perhaps nothing at all.
Lastly, communication between the graphical thread and the processing thread depends on how you implement the rest. For simply setting a flag, exposing a method is fine. If you move the processing code to a Process and have the main thread idle, use a blocking event to avoid busy-looping.
And yes, it would be easier if the graphical thread were the main thread and started and stopped the processing code itself. Unless you know this will greatly complicate things, have a look at it to see how much you would need to change to do this: well-designed data processing code should just take data, process it, and push it out. If putting it in a thread is hard work, the design probably needs revisiting. Finally, there's the 'nuclear option' of just getting the pid of the main thread inside your window loop and killing it. That's horribly hacky, but might be good enough for a demonstration job.

How to pause a thread for indefinite time until user resumes it manually

Let's say I have a class named Worker that has many methods and some of its methods have for and while loops. Something along these lines:
class Worker:
    def operation1(self):
        for i in range(5000):
            # performs some resource intensive operation 5000 times.
            pass

    def operation2(self):
        while some_condition:
            # another resource intensive operation in the while loop
            pass
I want to be able to pause the execution of an instance of the class at any given time and then be able to resume it. So this class should be running on one thread and the GUI on another thread. The GUI will have two buttons, Pause and Resume, that will pause and resume execution accordingly. How can I make this happen? The only way I know to pause execution of a thread is to use the sleep() method, but how do I pause it until the user resumes it?
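No answer is recorded here for this question. One common approach, shown only as a sketch under assumptions, is to have the worker wait on a threading.Event at the top of each loop iteration. PausableWorker, pause, resume and the completed counter are hypothetical names, and the pause only takes effect between iterations, not in the middle of one.

```python
import threading


class PausableWorker:
    """Hypothetical variant of Worker that can be paused between iterations."""

    def __init__(self):
        self._running = threading.Event()
        self._running.set()  # start in the "running" state
        self.completed = 0

    def pause(self):
        # The Pause button handler would call this.
        self._running.clear()

    def resume(self):
        # The Resume button handler would call this.
        self._running.set()

    def operation1(self, iterations=5000):
        for _ in range(iterations):
            # Blocks here, without busy-waiting, for as long as we are paused.
            self._running.wait()
            self.completed += 1  # stand-in for the resource intensive work


worker = PausableWorker()
worker.pause()
thread = threading.Thread(target=worker.operation1, kwargs={"iterations": 100})
thread.start()   # the worker blocks immediately because it is paused
worker.resume()  # lets the loop run to completion
thread.join()
```

The same wait() call would go at the top of the while loop in operation2.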

From running thread call coroutine on other thread

I'm replacing part of an existing program. That original program uses threads. There's this particular class, which inherits from threading.Thread, whose functionality I need to replace, but I need to keep the interface the same.
The functionality I'm integrating is packaged in a library which uses asyncio a lot.
The original calls to the class I'm replacing go something like this:
network = Network()
network.start()
network.fetch_something() # crashes!
network.stop()
I've gotten to a point where my replacing class inherits from threading.Thread too and I can connect, from within the run method to my backends via the client library:
class Network(threading.Thread):
    def __init__(self):
        super().__init__()  # required before start() can be called
        self._loop = asyncio.new_event_loop()
        self._client = Client()  # this is the library

    def run(self):
        self._loop.run_until_complete(self.__connect())  # works dandy, implementation not shown
        self._loop.run_forever()

    def fetch_something(self):
        return self._loop.run_until_complete(self._client.fetch_something())
Running this code throws an exception:
RuntimeError: Non-thread-safe operation invoked on an event loop other than the current one
I sort of get what's going on here. In the run method things worked out because the same thread running the event loop was the caller. In the other case another thread was the caller, hence the problem.
As you might have noticed I was hoping the problem would have been solved by using the same event loop. Alas, that didn't work out.
I really want to keep the interface exactly as it is otherwise I'm refactoring for the remainder of the year. I could relatively easily pass arguments to the constructor of the Network class. I've tried passing in an event loop created on the main thread but the result was the same.
(Note that this is the opposite of the problem this author has: Call coroutine within Thread)
When scheduling a coroutine from a different thread, you must use asyncio.run_coroutine_threadsafe. For example:
def fetch_something(self):
    future = asyncio.run_coroutine_threadsafe(
        self._client.fetch_something(), self._loop)
    return future.result()
run_coroutine_threadsafe schedules the coroutine with the event loop in a thread-safe way and returns a concurrent.futures.Future. You can use the returned future to simply wait for the result as shown above, but you can also pass it to other functions, poll whether the result has arrived, or implement timeouts.
When combining threads and asyncio, remember to make sure that all interfacing with the event loop from other threads (even to call something as simple as loop.stop to implement Network.stop) is done using loop.call_soon_threadsafe and asyncio.run_coroutine_threadsafe.
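Putting both calls together, the class from the question might look roughly like the sketch below. FakeClient is a hypothetical stand-in for the real library client, the connect step from the question is left out, and the 5-second timeout is an arbitrary choice for the example.

```python
import asyncio
import threading


class FakeClient:
    """Hypothetical stand-in for the asyncio-based library client."""

    async def fetch_something(self):
        await asyncio.sleep(0.01)
        return "payload"


class Network(threading.Thread):
    def __init__(self):
        super().__init__()
        self._loop = asyncio.new_event_loop()
        self._client = FakeClient()

    def run(self):
        # The event loop lives entirely on this thread.
        self._loop.run_forever()
        self._loop.close()

    def fetch_something(self):
        # Called from another thread: hand the coroutine to the loop's thread
        # and block until its result arrives.
        future = asyncio.run_coroutine_threadsafe(
            self._client.fetch_something(), self._loop)
        return future.result(timeout=5)

    def stop(self):
        # loop.stop() is not thread-safe, so schedule it on the loop's thread.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self.join()


network = Network()
network.start()
result = network.fetch_something()
network.stop()
```

The calling code keeps the same start, fetch_something, stop interface as in the question.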

Resource usage of "time.sleep" in loop vs. "threading.Timer"

First method:
import threading
import time

def keepalive():
    while True:
        print('Alive.')
        time.sleep(200)

threading.Thread(target=keepalive).start()
Second method:
import threading

def keepalive():
    print('Alive.')
    threading.Timer(200, keepalive).start()

threading.Timer(200, keepalive).start()
Which method takes up more RAM? And in the second method, does the thread end after being activated? or does it remain in the memory and start a new thread? (multiple threads)
Timer creates a new thread object for each started timer, so it certainly needs more resources when creating and garbage collecting these objects.
As each thread exits immediately after it has spawned another, active_count stays constant, but new threads are constantly being created and destroyed, which causes overhead. I'd say the first method is definitely better.
That said, you won't really see much difference unless the interval is very small.
Here's an example of how to test this yourself:
And in the second method, does the thread end after being activated? or does it remain in the memory and start a new thread? (multiple threads)
import threading

def keepalive():
    print('Alive.')
    threading.Timer(200, keepalive).start()
    print(threading.active_count())

threading.Timer(200, keepalive).start()
I also changed the 200 to .2 so it wouldn't take as long.
The thread count was 3 forever.
Then I did this:
top -pid 24767
The #TH column never changed.
So, there's your answer: We don't have enough info to know whether Python maintains a single timer thread for all of the timers, or ends and cleans up the thread as soon as the timer runs, but we can be sure the threads don't stick around and pile up. (If you do want to know which of the former is happening, you can, e.g., print the thread ids.)
An alternative way to find out is to look at the source. As the documentation says, "Timer is a subclass of Thread and as such also functions as an example of creating custom threads". The fact that it's a subclass of Thread already tells you that each Timer is a Thread. And the fact that it "functions as an example" implies that it ought to be easy to read. If you click the link from the documentation to the source, you can see how trivial it is. Most of the work is done by Event, but that's in the same source file, and it's almost as simple. Effectively, it just creates a condition variable, waits on it (so it blocks until it times out, or you notify the condition by calling cancel), then quits.
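To make that description concrete, here is a simplified re-creation of the idea. SimpleTimer is a hypothetical name, and this is a sketch of the pattern rather than the standard library's exact source: one thread per timer, an Event wait with a timeout, then the thread ends.

```python
import threading


class SimpleTimer(threading.Thread):
    """Simplified illustration of a one-shot timer thread."""

    def __init__(self, interval, function):
        super().__init__()
        self.interval = interval
        self.function = function
        self.finished = threading.Event()

    def cancel(self):
        # Wakes the waiting thread early so it exits without calling function.
        self.finished.set()

    def run(self):
        # Blocks until the interval elapses or cancel() sets the event.
        self.finished.wait(self.interval)
        if not self.finished.is_set():
            self.function()
        self.finished.set()
        # run() returns here, so the thread ends after a single firing.


calls = []
fired = SimpleTimer(0.05, lambda: calls.append("fired"))
fired.start()
fired.join()

cancelled = SimpleTimer(5, lambda: calls.append("never"))
cancelled.start()
cancelled.cancel()
cancelled.join(timeout=2)
```

Because run() returns after one firing, a self-rescheduling keepalive like the second method creates a new thread object every interval.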
The reason I'm answering one sub-question and explaining how I did it, rather than answering each sub-question, is because I think it would be more useful for you to walk through the same steps.
On further reflection, this probably isn't a question to be decided by optimization in the first place:
If you have a simple, synchronous program that needs to do nothing for 200 seconds, make a blocking call to sleep. Or, even simpler, just do the job and quit, and pick an external tool to schedule your script to run every 200s.
On the other hand, if your program is inherently asynchronous, especially if you've already got threads, signal handlers, and/or an event loop, there's just no way you're going to get sleep to work. If Timer is too inefficient, go to PyPI or ActiveState and find a better timer that lets you schedule repeatable timers (or even multiple timers) with a single instance and thread. (Or, if you're using signals, use signal.alarm or setitimer, and if you're using an event loop, build the timer into your main loop.)
I can't think of any use case where sleep and Timer would both be serious contenders.

How to implement a master/watchdog script in python?

I need it to open 10 processes, and each time one of them finishes I want to wait a few seconds and start another one.
It seems pretty simple, but somehow I can't get it to work.
I'm not 100% clear on what you're trying to accomplish, but have you looked at the multiprocessing module, specifically using a pool of workers?
I've done this same thing to process web statistics using a semaphore. Essentially, each worker acquires the semaphore as it is created and releases it when it exits. Creating a new worker blocks whenever all of the semaphore's slots are taken.
This actually fires off threads, which run external processes a bit further down the execution path.
Here's an example.
thread_sem = threading.Semaphore(int(cfg.maxthreads))
for k, v in log_data.items():
    thread_list.append(ProcessorThread(int(k), v, thread_sem))
    thread_list[-1].start()
And then in the constructor for ProcessorThread, I do this:
def __init__(self, siteid, data, lock_object):
    threading.Thread.__init__(self)
    self.daemon = False
    self.lock_object = lock_object
    self.data = data
    self.siteid = siteid
    # Blocks the creating thread until a slot is free.
    self.lock_object.acquire()
When the thread finishes its task (whether successfully or not), the lock_object is released, which allows another process to begin.
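A self-contained version of this pattern might look like the sketch below. Everything specific in it is an assumption for illustration: MAX_WORKERS, the active and peak counters, and the short time.sleep that stands in for launching the external process.

```python
import threading
import time

MAX_WORKERS = 3  # hypothetical limit; the question asks for 10
slots = threading.Semaphore(MAX_WORKERS)
counter_lock = threading.Lock()
active = 0
peak = 0


class ProcessorThread(threading.Thread):
    def __init__(self, job_id):
        super().__init__()
        self.job_id = job_id
        # Blocks the creating thread while all slots are in use.
        slots.acquire()

    def run(self):
        global active, peak
        try:
            with counter_lock:
                active += 1
                peak = max(peak, active)
            # Stand-in for running the external process.
            time.sleep(0.02)
        finally:
            with counter_lock:
                active -= 1
            # Release on success or failure so the next job can start.
            slots.release()


threads = []
for job_id in range(8):
    worker = ProcessorThread(job_id)
    worker.start()
    threads.append(worker)
for worker in threads:
    worker.join()
```

Adding a time.sleep just before slots.release() would give the pause of a few seconds between one process finishing and the next one starting.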
HTH
