I create a thread to run a script, and the script may take a long time. I want to be able to pause and resume it from another thread. If I use a flag and check it, the thread cannot pause immediately. I have searched a lot, but it seems that self.__flag and self.pause cannot achieve this.
import threading

class MT(threading.Thread):
    def __init__(self):
        super(MT, self).__init__()
        self.__running = threading.Event()
        self.__running.set()
        self.__flag = threading.Event()
        self.__flag.set()

    def run(self):
        '''
        run the script
        '''
        while self.__running.is_set():
            self.__flag.wait()   # blocks here while paused
            moduleTest()         # the long-running script

    def pause(self):
        '''
        pause the thread
        '''
        self.__flag.clear()

    def resume(self):
        '''
        resume the thread
        '''
        self.__flag.set()
What you want is not possible without diving below the Python layer, using C extensions and OS-specific techniques (e.g. SuspendThread on Windows). You cannot immediately and completely suspend another thread via Python-level APIs, because doing so is considered absurdly dangerous.
Even when such a thing is possible, it's a terrible idea, prone to deadlocks and other terrible things. Just for example, pre-CPython 3.3, there was a single global import lock for the whole interpreter. If the other thread was in the middle of importing a module when it was suspended, no other thread could import at all until it was resumed and finished the import (causing a deadlock if that thread was the one responsible for resuming the suspended thread); in CPython 3.3+, it's better, but if another thread tried to import that specific module, it would deadlock just as badly.
In summary: use Locks, Events and/or Conditions appropriately, and if you need faster pauses, make the wait checks more frequent (interspersed with the thread's "work" more regularly). If your code can't tolerate even a tiny delay before the pause, you have a design problem that you need to fix. For example, you may be using an Event to simulate locking, possibly for performance, which is hilariously misguided: Events are built on Conditions, which are in turn built on Locks, and all but Lock are implemented at the Python layer rather than the C layer, so they are quite slow.
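For illustration, here is a minimal sketch of that cooperative pattern (process and items are placeholder names): the worker checks a pause Event between small units of work, so the pause latency is bounded by the size of one unit.

import threading

pause_flag = threading.Event()  # set = running, cleared = paused
pause_flag.set()

def long_task(items):
    # Split the work into small units and check the flag between them,
    # so a pause takes effect within at most one unit of work.
    for item in items:
        pause_flag.wait()   # blocks here while paused
        process(item)       # placeholder for one small unit of work

Calling pause_flag.clear() from another thread pauses the worker at the next check; pause_flag.set() resumes it.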
I've got some Python code that farms out expensive jobs using ThreadPoolExecutor, and I'd like to keep track of which of them have completed so that if I have to restart this system, I don't have to redo the stuff that already finished. In a single-threaded context, I could just mark what I've done in a shelf. Here's a naive port of that idea to a multithreaded environment:
from concurrent.futures import ThreadPoolExecutor
import subprocess
import shelve

def do_thing(done, x):
    # Don't let the command run in the background; we want to be able to tell when it's done
    _ = subprocess.check_output(["some_expensive_command", x])
    done[x] = True

futs = []
with shelve.open("done") as done:
    with ThreadPoolExecutor(max_workers=18) as executor:
        for x in things_to_do:
            if done.get(x, False):
                continue
            futs.append(executor.submit(do_thing, done, x))
            # Can't run `done[x] = True` here--have to wait until do_thing finishes
        for future in futs:
            future.result()
            # Don't want to wait until here to mark stuff done, as the whole system might be killed at some point
            # before we get through all of things_to_do
Can I get away with this? The documentation for shelve doesn't contain any guarantees about thread safety, so I'm thinking no.
So what is the simple way to handle this? I thought that perhaps sticking done[x] = True in future.add_done_callback would do it, but that will often run in the same thread as the future itself. Perhaps there is a locking mechanism that plays nicely with ThreadPoolExecutor? That seems cleaner to me than writing a loop that sleeps and then checks for completed futures.
While you're still in the outer-most with context manager, the done shelve is just a normal Python object: it is only written to disk when the context manager closes and runs its __exit__ method. It is therefore just as thread-safe as any other Python object, thanks to the GIL (as long as you're using CPython).
Specifically, the assignment done[x] = True is thread-safe / will be performed atomically.
It's important to note that while the shelve's __exit__ method will run after a Ctrl-C, it won't run if the Python process ends abruptly, and the shelve won't be saved to disk.
To protect against this kind of failure, I would suggest using a lightweight, file-based, thread-safe database like sqlite3.
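As a minimal sketch of that suggestion (the table layout and file name are just illustrative): sqlite3 commits each completed item to disk immediately, so finished work survives even a hard kill. A single connection is shared across worker threads via check_same_thread=False and serialized with a lock.

import sqlite3
import threading

db = sqlite3.connect("done.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS done (x TEXT PRIMARY KEY)")
db.commit()
db_lock = threading.Lock()

def mark_done(x):
    with db_lock:  # serialize access to the shared connection
        db.execute("INSERT OR IGNORE INTO done (x) VALUES (?)", (x,))
        db.commit()  # flushed to disk right away

def is_done(x):
    with db_lock:
        row = db.execute("SELECT 1 FROM done WHERE x = ?", (x,)).fetchone()
    return row is not None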
Hi all, I'm really new to Python and I'm facing a task which I can't completely grasp.
I've created an interface with Tkinter which should accomplish a couple of apparently easy feats.
By clicking a "Start" button, two threads/processes will be started (each calling multiple subfunctions) which mainly read data from a serial port (one port per process, of course) and write it to file.
The I/O actions run inside a while loop with a very high counter, so that they can go on almost indefinitely.
The "Stop" button should stop the acquisition and essentially it should:
Kill the read/write Thread
Close the file
Close the serial port
Unfortunately I still do not understand how to accomplish point 1, i.e.: how to create killable threads without killing the whole GUI. Is there any way of doing this?
Thank you all!
First, you have to choose whether you are going to use threads or processes.
I will not go too much into the differences; google them ;) Anyway, here are some things to consider: it is much easier to establish communication between threads than between processes; in CPython, only one thread executes Python bytecode at a time (see the Python GIL), whereas subprocesses can use multiple cores.
Processes
If you are using subprocesses, there are two ways: subprocess.Popen and multiprocessing.Process. With Popen you can run anything, whereas Process gives a simpler thread-like interface to running python code which is part of your project in a subprocess.
Both can be killed using the terminate() method.
See the documentation for multiprocessing and subprocess.
Of course, if you want a more graceful exit, you will want to send an "exit" message to the subprocess, rather than just terminate it, so that it gets a chance to do the clean-up. You could do that e.g. by writing to its stdin. The process should read from stdin and when it gets message "exit", it should do whatever you need before exiting.
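A rough sketch of that hand-shake (worker.py is a hypothetical child script): the parent writes "exit" to the child's stdin, and the child watches stdin from a helper thread so its main loop can keep reading the serial port.

# Parent side: start the worker, later ask it to exit cleanly.
import subprocess
import sys

proc = subprocess.Popen([sys.executable, "worker.py"], stdin=subprocess.PIPE)
# ... when the user clicks Stop:
proc.stdin.write(b"exit\n")
proc.stdin.flush()
try:
    proc.wait(timeout=10)  # give the child a chance to clean up
except subprocess.TimeoutExpired:
    proc.terminate()       # fall back to the violent option

# Child side (worker.py): a helper thread blocks on stdin and sets a flag.
import sys
import threading

stop = threading.Event()

def watch_stdin():
    for line in sys.stdin:
        if line.strip() == "exit":
            stop.set()
            break

watcher = threading.Thread(target=watch_stdin)
watcher.daemon = True
watcher.start()

while not stop.is_set():
    pass  # read the serial port, write to file, etc.
# close the file and the serial port here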
Threads
For threads, you have to implement your own mechanism for stopping, rather than using something as violent as process.terminate().
Usually, a thread runs in a loop and in that loop you check for a flag which says stop. Then you break from the loop.
I usually have something like this:
import threading
from threading import Thread

SLEEP_TIME = 0.1  # how often to re-check the stop flag

class MyThread(Thread):
    def __init__(self):
        super(MyThread, self).__init__()
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            # do something
            self._stop_event.wait(SLEEP_TIME)
        # clean-up before exit

    def stop(self, timeout):
        self._stop_event.set()
        self.join(timeout)
Of course, you need some exception handling etc, but this is the basic idea.
EDIT: Answers to questions in the comments
thread.start_new_thread(your_function) starts a new thread, that is correct. On the other hand, the threading module gives you a higher-level API which is much nicer.
With the threading module, you can do the same with:
t = threading.Thread(target=your_function)
t.start()
or you can make your own class which inherits from Thread and put your functionality in the run method, as in the example above. Then, when the user clicks the start button, you do:
t = MyThread()
t.start()
You should store the t variable somewhere; exactly where depends on how you designed the rest of your application. I would probably have some object which holds all active threads in a list.
When the user clicks stop, you should call:
t.stop(some_reasonable_time_in_which_the_thread_should_stop)
After that, you can remove t from your list; it is not usable any more.
First, you can use subprocess.Popen() to spawn child processes; later, you can use Popen.terminate() to terminate them.
Note that you could also do everything in a single Python thread, without subprocesses, if you want to. It's perfectly possible to "multiplex" reading from multiple ports in a single event loop.
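As a rough sketch of that single-threaded approach (assuming the pyserial package, whose in_waiting property reports buffered bytes; port names and baud rate are illustrative):

import serial  # pyserial

ports = [serial.Serial("COM3", 9600, timeout=0),
         serial.Serial("COM4", 9600, timeout=0)]
logs = [open("port1.log", "wb"), open("port2.log", "wb")]

running = True  # cleared by the Stop button
while running:
    for port, log in zip(ports, logs):
        if port.in_waiting:  # bytes available, read() won't block
            log.write(port.read(port.in_waiting))

In a Tkinter program you would drive one pass of this loop from root.after(...) instead of a while loop, so the GUI stays responsive.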
So I have this library that I use, and within one of my functions I call a function from that library which happens to take a really long time. At the same time, I have another thread running where I check for different conditions. What I want is: if a condition is met while the library function is running, I want to cancel its execution.
Right now I'm checking the conditions at the start of the function, but if the conditions happen to change while the library function is running, I don't need its results and want to return from it early.
Basically this is what I have now.
def my_function():
    if condition_checker.condition_met():
        return
    library.long_running_function()
Is there a way to run the condition check every second or so and return from my_function when the condition is met?
I've thought about decorators and coroutines. I'm using 2.7, but if this can only be done in 3.x I'd consider switching; it's just that I can't figure out how.
You cannot terminate a thread. Either the library supports cancellation by design, where it internally would have to check for a condition every once in a while to abort if requested, or you have to wait for it to finish.
What you can do is call the library in a subprocess rather than a thread, since processes can be terminated through signals. Python's multiprocessing module provides a threading-like API for spawning forks and handling IPC, including synchronization.
Or spawn a separate subprocess via subprocess.Popen if forking is too heavy on your resources (e.g. memory footprint through copying of the parent process).
I can't think of any other way, unfortunately.
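A minimal sketch of that approach, reusing the names from the question: run the call in a multiprocessing.Process, poll the condition in the parent, and terminate the child when the condition is met.

import multiprocessing
import time

def my_function():
    worker = multiprocessing.Process(target=library.long_running_function)
    worker.start()
    while worker.is_alive():
        if condition_checker.condition_met():
            worker.terminate()  # processes, unlike threads, can be killed
            worker.join()
            return
        time.sleep(1)           # re-check the condition every second

If you need the result in the normal (uncancelled) case, have the child send it back through a multiprocessing.Queue or Pipe.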
Generally, I think you want to run your long_running_function in a separate thread, and have it occasionally report its information to the main thread.
This post gives a similar example within a wxpython program.
Presuming you are doing this outside of wxpython, you should be able to replace the wx.CallAfter and wx.Publisher with threading.Thread and PubSub.
It would look something like this:
import threading
import time

def myfunction():
    # subscribe to the long_running_function
    while True:
        # subscribe to the long_running_function and get the published data
        if condition_met:
            # publish a stop command
            break
        time.sleep(1)

def long_running_function():
    for loop in loops:
        # subscribe to the main thread and check for a stop command; if so, break
        # do an iteration
        # publish some data
        pass

# launches your long_running_function but doesn't block flow
threading.Thread(group=None, target=long_running_function, args=()).start()
myfunction()
I haven't used pubsub a ton so I can't quickly whip up the code but it should get you there.
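For what it's worth, here is a tiny sketch of the wiring with the PyPubSub package (topic and argument names are made up). Note that pub.sendMessage delivers messages synchronously in the calling thread, so the listener below runs inside whichever thread publishes:

from pubsub import pub  # pip install pypubsub

def handle_data(value):
    print("got", value)

pub.subscribe(handle_data, "data")  # listen on the "data" topic

# inside long_running_function, after each iteration:
pub.sendMessage("data", value=42)   # keyword name must match the listener's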
As an alternative, do you know the stop criteria before you launch the long_running_function? If so, you can just pass it as an argument and check whether it is met internally.
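A sketch of that alternative (a slight variant: the criterion is a callable checked internally on every iteration, here backed by a threading.Event):

import threading

def long_running_function(should_stop):
    for loop in range(1000000):  # stand-in for the real work loop
        if should_stop():        # the stop criterion, checked each iteration
            break
        # ... do one iteration of work ...

stop = threading.Event()
t = threading.Thread(target=long_running_function, args=(stop.is_set,))
t.start()
# later, from the monitoring thread, once the condition is met:
stop.set()
t.join()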
I have a Python program which operates an external program and starts a timeout thread. The timeout thread should count down for 10 minutes and, if the script operating the external program isn't finished in that time, kill the external program.
My thread seems to work fine at first glance: my main script and the thread run simultaneously with no issues. But if a pop-up window appears in the external program, it stops my scripts, so that even the countdown thread stops counting, therefore totally failing at its job.
I assume the issue is that the script calls a blocking function in API for the external program, which is blocked by the pop up window. I understand why it blocks my main program, but don't understand why it blocks my countdown thread. So, one possible solution might be to run a separate script for the countdown, but I would like to keep it as clean as possible and it seems really messy to start a script for this.
I have searched everywhere for a clue, but I didn't find much. There was a reference to the gevent library here:
background function in Python
but it seems like such a basic task that I don't want to include an external library for it.
I also found a solution which uses a Windows multimedia timer here, but I've never worked with it before and am afraid the code won't be flexible. The script is Windows-only, but it should work on all Windows versions from XP on.
For Unix I found signal.alarm, which seems to do exactly what I want, but it's not available on Windows. Are there any alternatives for this?
Any ideas on how to work with this in the most simplified manner?
This is the simplified thread I'm creating (run in IDLE to reproduce the issue):
import threading
import time
class timeToKill():
def __init__(self, minutesBeforeTimeout):
self.stop = threading.Event()
self.countdownFrom = minutesBeforeTimeout * 60
def startCountdown(self):
self.countdownThread= threading.Thread(target=self.countdown, args=(self.countdownFrom,))
self.countdownThread.start()
def stopCountdown(self):
self.stop.set()
self.countdownThread.join()
def countdown(self,seconds):
for second in range(seconds):
if(self.stop.is_set()):
break
else:
print (second)
time.sleep(1)
timeout = timeToKill(1)
timeout.startCountdown()
raw_input("Blocking call, waiting for input:\n")
One possible explanation for a function call blocking another Python thread is that CPython uses a global interpreter lock (GIL) and the blocking API call doesn't release it. (Note: CPython releases the GIL on blocking I/O calls, so your raw_input() example should work as is.)
If you can't make the buggy API call to release GIL then you could use a process instead of a thread e.g., multiprocessing.Process instead of threading.Thread (the API is the same). Different processes are not limited by GIL.
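Applied to the countdown example above, a minimal sketch (same logic, with the thread swapped for a process; the Event is shared between parent and child):

import multiprocessing
import time

def countdown(seconds, stop):
    for second in range(seconds):
        if stop.is_set():
            break
        print(second)
        time.sleep(1)
    # on timeout: kill the external program here

if __name__ == "__main__":
    stop = multiprocessing.Event()
    p = multiprocessing.Process(target=countdown, args=(60, stop))
    p.start()
    # a blocking API call can no longer stall the countdown:
    raw_input("Blocking call, waiting for input:\n")
    stop.set()
    p.join()

(raw_input is kept from the question's Python 2 example; on Python 3 it would be input.)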
For quick and dirty threading, I usually resort to subprocess commands. It is quite robust and OS-independent. It does not give as fine-grained control as the thread and queue modules, but for external calls to programs it generally does nicely. Note that shell=True must be used with caution.
import subprocess
import time

# this can be any command
p1 = subprocess.Popen(["python", "SUBSCRIPTS/TEST.py", "0"], shell=True)
# the process p1 will run in the background, asynchronously;
# if you want to kill it after some time, you need to poll it

# here do some other tasks/computations
time.sleep(10)

currentStatus = p1.poll()
if currentStatus is None:  # then it is still running
    try:
        p1.kill()  # maybe try os.kill(p1.pid, 2) if p1.kill does not work
    except OSError:
        # do something else if the process finished in the meantime - maybe do nothing?
        pass
Let's say I have this blob of code that's made to be one long-running thread of execution, to poll for events and fire off other events (in my case, using XMLRPC calls). It needs to be refactored into clean objects so it can be unit tested, but in the meantime I want to capture some of its current behavior in some integration tests, treating it like a black box. For example:
# long-lived code
import xmlrpclib

s = xmlrpclib.ServerProxy('http://XXX:yyyy')

def do_stuff():
    while True:
        ...
        if s.xyz():
            s.do_thing(...)
# test code
import threading, time

# stub out xmlrpclib

t = None

def run_do_stuff():
    other_code.do_stuff()

def setUp():
    global t
    t = threading.Thread(target=run_do_stuff)
    t.setDaemon(True)

def tearDown():
    # somehow kill t
    t.join()

def test1():
    t.start()
    time.sleep(5)
    assert some_XMLRPC_side_effects
The last big issue is that the code under test is designed to run forever, until a Ctrl-C, and I don't see any way to force it to raise an exception or otherwise kill the thread so I can start it up from scratch without changing the code I'm testing. I lose the ability to poll any flags from my thread as soon as I call the function under test.
I know this is really not how tests are designed to work, integration tests are of limited value, etc, etc, but I was hoping to show off the value of testing and good design to a friend by gently working up to it rather than totally redesigning his software in one go.
The last big issue is that the code under test is designed to run forever, until a Ctrl-C, and I don't see any way to force it to raise an exception or otherwise kill the thread
The point of Test-Driven Development is to rethink your design so that it is testable.
Looping forever -- while seemingly fine for production use -- is untestable.
So make the loop terminate. It won't hurt production. It will improve testability.
The "designed to run forever" is not designed for testability. So fix the design to be testable.
I think I found a solution that does what I was looking for: Instead of using a thread, use a separate process.
I can write a small python stub to do mocking and run the code in a controlled way. Then I can write the actual tests to run my stub in a subprocess for each test and kill it when each test is finished. The test process could interact with the stub over stdio or a socket.
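A sketch of that setup (stub_server.py is a hypothetical name for the stub script): each test launches the stub in a subprocess and tears it down by terminating the process, which works even though the code under test loops forever.

import subprocess
import sys
import time

proc = None

def setUp():
    global proc
    proc = subprocess.Popen([sys.executable, "stub_server.py"])

def tearDown():
    proc.terminate()  # no cooperation needed from the looping code
    proc.wait()

def test1():
    time.sleep(5)
    # assert on side effects observed via stdio, a socket, or files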