I have a python script which starts multiple sub processes using these lines :
for elm in elements:
    t = multiprocessing.Process(target=sub_process, args=[elm])
    threads.append(t)
    t.start()
for t in threads:
    t.join()
Sometimes, for some reason, the thread halts and the script never finishes.
I'm trying to use the VSCode debugger to find the problem and check where in the thread itself it is stuck, but I'm having trouble pausing these sub processes. When I click pause in the debugger window:
It pauses the main thread and some other threads that are running properly, but it doesn't pause the stuck sub process.
Even when I try to pause the threads manually one by one using the Call Stack window, I can still pause only the working threads and not the stuck one.
Please help me figure this out. It's hard to debug because whatever makes the process get stuck doesn't always happen.
First, those are subprocesses, not threads. It's important to understand
the difference, although it doesn't answer your question.
Second, a pause (manual break) in the Python debugger will break in Python code.
It won't break in the machine code below that executes the Python, or in the machine
code below that performs the OS services the Python code is asking for.
If you execute a pause, the pause will occur in the Python code above
the machine code when (and if) the machine code returns to the Python interpreter loop.
Given a complete example:
import multiprocessing
import time

elements = ["one", "two", "three"]

def sub_process(gs, elm):
    gs.acquire()
    print("sleep", elm)
    time.sleep(60)
    print("awake", elm)
    gs.release()

def test():
    gs = multiprocessing.Semaphore()
    subprocs = []
    for elm in elements:
        p = multiprocessing.Process(target=sub_process, args=[gs, elm])
        subprocs.append(p)
        p.start()
    for p in subprocs:
        p.join()

if __name__ == '__main__':
    test()
The first subprocess will grab the semaphore and sleep for a minute,
and the second and third subprocesses will wait inside gs.acquire() until they
can move forward. A pause will not break into the debugger until the
subprocess returns from the acquire, because acquire is below the Python code.
It sounds like you have an idea where the process is getting stuck,
but you don't know why. You need to determine what questions
you are trying to answer. For example:
(Assuming) one of the processes is stuck in acquire. That means one of the other
processes didn't release the semaphore. What code in which process is
acquiring a semaphore and not releasing it?
Looking at the semaphore object itself might tell you which subprocess is holding it,
but this is a tangent: can you use the debugger to inspect the semaphore
and determine who is holding it? For example, using a machine level debugger in Windows,
if these were threads and a critical section, it's possible to look at the critical section
and see which thread is still holding it. I don't know if this could be
done using processes and semaphores on your chosen platform.
Which debuggers you have access to depends on the platform you're running on.
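One way to answer "who is waiting, and for how long?" without any debugger is to make the wait itself report. This is only a minimal sketch: acquire_or_report and its parameters are hypothetical names, not part of multiprocessing. It relies only on acquire(timeout=...), which both multiprocessing.Semaphore and threading.Semaphore accept; the demonstration uses a threading.Semaphore as a stand-in so it runs in a single process.

```python
import os
import threading


def acquire_or_report(sem, name, wait_seconds=5.0, max_tries=None, log=print):
    """Acquire sem, logging every wait_seconds while still blocked.

    Returns True once acquired, or False after max_tries failed waits
    (max_tries=None means keep waiting forever).
    """
    tries = 0
    while not sem.acquire(timeout=wait_seconds):
        tries += 1
        log("%s (pid %d) still waiting after %.1fs"
            % (name, os.getpid(), tries * wait_seconds))
        if max_tries is not None and tries >= max_tries:
            return False
    return True


# Demonstration: a threading.Semaphore stands in for the shared semaphore.
demo_sem = threading.Semaphore(1)
demo_sem.acquire()                      # someone else "holds" it
messages = []
got_it = acquire_or_report(demo_sem, "worker two", wait_seconds=0.05,
                           max_tries=2, log=messages.append)
demo_sem.release()
```

In the example above, sub_process would call acquire_or_report(gs, elm) instead of gs.acquire(). The log then shows which subprocesses are waiting; the one that printed "sleep" but never "awake" is the likely holder.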
In summary:
You can't break the Python debugger in machine code
You can run the Python interpreter in a machine code debugger, but this
won't show you the Python code at all, which makes life interesting.
This can be helpful if you have an idea what you're looking for -
for example, you might be able to tell that you're stuck waiting for a semaphore.
Running a machine code debugger becomes more difficult when you're running
sub-processes, because you need to know which sub-process you're interested
in, and attach to that one. This becomes simpler if you're using a single
process and multiple threads instead, since there's only one process to deal with.
"You can't get there from here, you have to go someplace else first."
You'll need to take a closer look at your code and figure out how
to answer the questions you need to answer using other means.
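One of those "other means", as a sketch rather than a recommendation from the answer above: the standard faulthandler module can dump the Python stack of every thread in a process after a timeout, even while the process is blocked in a call that a debugger pause cannot interrupt. The helper names install_watchdog and cancel_watchdog and the log file are assumptions made up for this illustration; time.sleep stands in for the call that never returns.

```python
import faulthandler
import tempfile
import time


def install_watchdog(seconds, log_file):
    """Dump every thread's Python stack to log_file if still running after `seconds`."""
    faulthandler.dump_traceback_later(seconds, exit=False, file=log_file)


def cancel_watchdog():
    """Call this when the work finished in time, so no dump is written."""
    faulthandler.cancel_dump_traceback_later()


# Demonstration: time.sleep stands in for a call that never returns.
log_file = tempfile.TemporaryFile(mode="w+")
install_watchdog(0.2, log_file)
time.sleep(0.6)              # the dump fires while we are "stuck" here
cancel_watchdog()
log_file.seek(0)
report = log_file.read()
```

In the real script, the idea would be to call install_watchdog at the top of sub_process with a per-process log file and a timeout longer than any healthy run. The dump names the Python line each thread was executing, for example the line containing gs.acquire().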
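It's just an idea: why not set a timeout on your sub processes and terminate any that exceed it?
TIMEOUT = 60
for elm in elements:
    t = multiprocessing.Process(target=sub_process, args=[elm])
    t.daemon = True
    threads.append(t)
    t.start()
for t in threads:
    # wait at most TIMEOUT seconds for each process, one after another
    t.join(TIMEOUT)
    if t.is_alive():
        # still running after the timeout: assume it is stuck and kill it
        t.terminate()
        t.join()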
Related
Good evening. After several hours looking for a solution, I still can't find a way to solve this:
I'm currently working on a Selenium script that creates X threads; each thread runs a Firefox instance that runs tests on a website. The thing is, when I press Ctrl C or close the executable with the cross at the top right, every Mozilla instance that was created keeps living.
I assumed this is caused by the fact that the sub-threads created in the main thread are not stopped because these processes are still running, so I decided to write a function that takes a list of drivers and closes EVERY driver in the list. The drivers are added to the list when they are created.
The issue happens when I'm running it as an executable; the "Stop" function of my IDE (PyCharm) has no issue with it.
What I've tried:
Using the atexit module to shut down every driver (Firefox instance) on exit with a clean_threads function -> It doesn't work because it looks like atexit only runs once every thread has shut down, so in my case the function was never called.
Running my main function in a "try - finally" structure with the clean_threads function called in the finally -> It doesn't work either; I might have used it the wrong way.
Running my main function in a "try - except (KeyboardInterrupt, SystemExit)" -> I didn't manage to make it work either; for some unknown reason it just made Ctrl C being not able to
I'd love some advice on the procedure to follow. I admit I'm going in circles and not finding a solution to the problem.
Any help will be appreciated, thanks in advance :) And if there is a need for more clarification, snippets or whatever, please do not hesitate.
Code of my main function :
def main():
    global THREADS
    load_settings()
    try:
        # TODO clean firefox instances
        # TODO proper switch
        THREADS = [Thread(target=automation, args=(i, HEADLESS)) for i in
                   range(FIRST_ACC, ACCOUNT_NUMBER + FIRST_ACC, 1)]
        for thread in THREADS:
            thread.start()
            time.sleep(30)
        for thread in THREADS:
            thread.join()
    except (KeyboardInterrupt, SystemExit):
        print("Exception caught")
        clean_threads(DRIVER_LIST)
The clean_threads function :
def clean_threads(driver_list):
    discard_list = []
    print("Test")
    for each in driver_list:
        each.exit()
        discard_list.append(each)
    print(len(discard_list))
I'm working on a computercheck program, with several checks.
Once the checks are completed, the results will go into a database.
So far so good.
Since separate functions were freezing the application (wx based), I introduced threading in the code, which works fine and fast.
The threading looked like this:
check1 = thread1()
check2 = thread2()
check3 = thread3()
check4 = thread4()
check5 = thread5()
check1.start()
check2.start()
check3.start()
check4.start()
check5.start()
The above is a def and is initiated by a button press event.
This all works well. Now I have to upload the results into a database. When I add the function e.g. uploadDB(arg[]) after the code, the function will start although the threads are still busy.
Which means I have to wait with that until the threads are finished. Hence I'm now using slightly different code, like:
threads = []
c1 = check1()
threads.append(c1)
c2 = check2()
threads.append(c2)
...
for x in threads:
    x.start()
# wait for all threads to finish
for x in threads:
    x.join()
uploadDB(arg[])
This works as well, but during the join, the interface freezes again, because everything is waiting until the threads are finished...and the freezing is actually what I don't want...but If I don't use the join...I don't know when the threads are finished before uploading..
There should be a more easy way to do this I suppose?
Thanks again for the help!~
/Jasper
An interim solution I have now is a thread listener. Every thread posts a "run" to the listener, which keeps an integer count of the running threads. At the end of a thread (wx.CallAfter) a "done" is posted and the integer is decremented. So when the integer is 0, all the threads are done and I can continue with e.g. the database stuff.
However, this seems not to be a real efficient way of checking if the threads are done.
But the problem remains. "join()" is freezing...and if I set a while loop, I basically create the same situation as with the join() statement...it freezes the application..
So any suggestion of how to do this more efficient is welcome!
Thanks!
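The counting idea described above can be kept quite small, and it avoids join() entirely. The following is only a sketch in plain Python with no GUI: run_all, make_check and upload are hypothetical names, and upload stands in for uploadDB. In a wx application the callback would presumably be wrapped in wx.CallAfter so that it runs on the GUI thread; that part is an assumption and is not shown.

```python
import threading


def run_all(checks, on_all_done):
    """Start every check in its own thread and return immediately.

    on_all_done is called (from the last worker thread to finish)
    once every check has completed.
    """
    if not checks:
        on_all_done()
        return []
    lock = threading.Lock()
    remaining = [len(checks)]   # set before any thread starts, so it cannot hit 0 early

    def wrapper(check):
        try:
            check()
        finally:
            with lock:
                remaining[0] -= 1
                last = remaining[0] == 0
            if last:
                on_all_done()

    threads = [threading.Thread(target=wrapper, args=(c,)) for c in checks]
    for t in threads:
        t.start()
    return threads


# Demonstration with dummy checks that just record a number.
results = []
upload_calls = []
done = threading.Event()


def make_check(n):
    def check():
        results.append(n)
    return check


def upload():
    upload_calls.append(sorted(results))
    done.set()


run_all([make_check(i) for i in range(4)], upload)
done.wait(timeout=5)   # only the demo waits; a GUI would simply return to its event loop
```

Setting the counter to the full number of checks before starting any thread matters: if each thread incremented it on start, a fast first check could bring it back to 0 before the second one had started.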
Kind all, I'm really new to python and I'm facing a task which I can't completely grasp.
I've created an interface with Tkinter which should accomplish a couple of apparently easy feats.
By clicking a "Start" button two threads/processes will be started (each calling multiple subfunctions) which mainly read data from a serial port (one port per process, of course) and write them to file.
The I/O actions are looped within a while loop with a very high counter to allow them to go onward almost indefinitely.
The "Stop" button should stop the acquisition and essentially it should:
Kill the read/write Thread
Close the file
Close the serial port
Unfortunately I still do not understand how to accomplish point 1, i.e.: how to create killable threads without killing the whole GUI. Is there any way of doing this?
Thank you all!
First, you have to choose whether you are going to use threads or processes.
I will not go too much into the differences, google it ;) Anyway, here are some things to consider: it is much easier to establish communication between threads than between processes; in CPython, only one thread executes Python bytecode at a time (see the Python GIL), so threads do not spread CPU-bound work over multiple cores, but subprocesses may use multiple cores.
Processes
If you are using subprocesses, there are two ways: subprocess.Popen and multiprocessing.Process. With Popen you can run anything, whereas Process gives a simpler thread-like interface to running python code which is part of your project in a subprocess.
Both can be killed using terminate method.
See documentation for multiprocessing and subprocess
Of course, if you want a more graceful exit, you will want to send an "exit" message to the subprocess, rather than just terminate it, so that it gets a chance to do the clean-up. You could do that e.g. by writing to its stdin. The process should read from stdin and when it gets message "exit", it should do whatever you need before exiting.
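A minimal sketch of that "exit" message over stdin, assuming the child is a Python program you control. CHILD_CODE is a made-up stand-in for the real subprocess and is inlined only to keep the example self-contained; the message texts are arbitrary.

```python
import subprocess
import sys

# Child program, inlined here so the sketch is self-contained.
CHILD_CODE = r"""
import sys
for line in sys.stdin:
    if line.strip() == "exit":
        print("cleaning up", flush=True)   # close files, ports, etc.
        break
print("child done", flush=True)
"""

child = subprocess.Popen(
    [sys.executable, "-c", CHILD_CODE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
# communicate() sends the message, closes stdin and waits for the child to exit.
output, _ = child.communicate("exit\n", timeout=10)
```

A real child would be busy with its own work rather than blocked reading stdin, so it would probably read stdin in a small helper thread and set a flag that the main loop checks.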
Threads
For threads, you have to implement your own mechanism for stopping, rather than using something as violent as process.terminate().
Usually, a thread runs in a loop and in that loop you check for a flag which says stop. Then you break from the loop.
I usually have something like this:
import threading
from threading import Thread

SLEEP_TIME = 1.0  # seconds between iterations (placeholder value)

class MyThread(Thread):
    def __init__(self):
        super(MyThread, self).__init__()
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            # do something
            self._stop_event.wait(SLEEP_TIME)
        # clean-up before exit

    def stop(self, timeout):
        self._stop_event.set()
        self.join(timeout)
Of course, you need some exception handling etc, but this is the basic idea.
EDIT: Answers to questions in comment
thread.start_new_thread(your_function) starts a new thread, that is correct. On the other hand, module threading gives you a higher-level API which is much nicer.
With threading module, you can do the same with:
t = threading.Thread(target=your_function)
t.start()
or you can make your own class which inherits from Thread and put your functionality in the run method, as in the example above. Then, when user clicks the start button, you do:
t = MyThread()
t.start()
You should store the t variable somewhere. Exactly where depends on how you designed the rest of your application. I would probably have some object which hold all active threads in a list.
When user clicks stop, you should:
t.stop(some_reasonable_time_in_which_the_thread_should_stop)
After that, you can remove the t from your list, it is not usable any more.
First you can use subprocess.Popen() to spawn child processes, then later you can use Popen.terminate() to terminate them.
Note that you could also do everything in a single Python thread, without subprocesses, if you want to. It's perfectly possible to "multiplex" reading from multiple ports in a single event loop.
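To illustrate what "multiplexing" in a single event loop can look like, here is a sketch using the standard selectors module. Socket pairs stand in for the two serial ports; whether a serial port object can be registered with a selector depends on the platform and the serial library, so treat that part as an assumption to verify. read_all and the labels are hypothetical names.

```python
import selectors
import socket


def read_all(sources, expected_messages):
    """Read from several sources in one thread.

    sources maps a label to a readable socket-like object. Returns a list
    of (label, chunk) pairs; stops after expected_messages chunks or when
    nothing arrives for one second.
    """
    selector = selectors.DefaultSelector()
    for label, conn in sources.items():
        selector.register(conn, selectors.EVENT_READ, data=label)
    received = []
    while len(received) < expected_messages:
        events = selector.select(timeout=1.0)
        if not events:
            break                                 # nothing arrived in time; give up
        for key, _mask in events:
            chunk = key.fileobj.recv(1024)
            if chunk:
                received.append((key.data, chunk))
            else:
                selector.unregister(key.fileobj)  # the other end was closed
    selector.close()
    return received


# Demonstration: two socket pairs play the role of two ports.
a_writer, a_reader = socket.socketpair()
b_writer, b_reader = socket.socketpair()
a_writer.sendall(b"port A data")
b_writer.sendall(b"port B data")
received = read_all({"A": a_reader, "B": b_reader}, expected_messages=2)
for sock in (a_writer, a_reader, b_writer, b_reader):
    sock.close()
```

In a Tkinter program the loop body would run in short slices (for example from a periodic callback) rather than as a blocking while loop, so the GUI stays responsive.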
I have a python program which operates an external program and starts a timeout thread. Timeout thread should countdown for 10 minutes and if the script, which operates the external program isn't finished in that time, it should kill the external program.
My thread seems to work fine at first glance: my main script and the thread run simultaneously with no issues. But if a pop-up window appears in the external program, it stops my scripts, so that even the countdown thread stops counting, and therefore totally fails at its job.
I assume the issue is that the script calls a blocking function in API for the external program, which is blocked by the pop up window. I understand why it blocks my main program, but don't understand why it blocks my countdown thread. So, one possible solution might be to run a separate script for the countdown, but I would like to keep it as clean as possible and it seems really messy to start a script for this.
I have searched everywhere for a clue, but I didn't find much. There was a reference to the gevent library here:
background function in Python
, but it seems like such a basic task, that I don't want to include external library for this.
I also found a solution which uses a windows multimedia timer here, but I've never worked with this before and am afraid the code won't be flexible with this. Script is Windows-only, but it should work on all Windows from XP on.
For Unix I found signal.alarm which seems to do exactly what I want, but it's not available for Windows. Any alternatives for this?
Any ideas on how to work with this in the most simplified manner?
This is the simplified thread I'm creating (run in IDLE to reproduce the issue):
import threading
import time

class timeToKill():
    def __init__(self, minutesBeforeTimeout):
        self.stop = threading.Event()
        self.countdownFrom = minutesBeforeTimeout * 60

    def startCountdown(self):
        self.countdownThread = threading.Thread(target=self.countdown, args=(self.countdownFrom,))
        self.countdownThread.start()

    def stopCountdown(self):
        self.stop.set()
        self.countdownThread.join()

    def countdown(self, seconds):
        for second in range(seconds):
            if(self.stop.is_set()):
                break
            else:
                print (second)
            time.sleep(1)

timeout = timeToKill(1)
timeout.startCountdown()
raw_input("Blocking call, waiting for input:\n")
One possible explanation for a function call blocking another Python thread is that CPython uses a global interpreter lock (GIL) and the blocking API call doesn't release it (NOTE: CPython releases the GIL on blocking I/O calls, therefore your raw_input() example should work as is).
If you can't make the buggy API call release the GIL, then you could use a process instead of a thread, e.g. multiprocessing.Process instead of threading.Thread (the API is the same). Different processes are not limited by the GIL.
For quick and dirty threading, I usually resort to subprocess commands. It is quite robust and OS independent. It does not give as fine-grained control as the thread and queue modules, but for external calls to programs it generally does nicely. Note that shell=True must be used with caution; with an argument list, as here, it is left off, because on POSIX a list combined with shell=True passes only the first item to the shell as the command.
import subprocess
import time
# this can be any command
p1 = subprocess.Popen(["python", "SUBSCRIPTS/TEST.py", "0"])
# the process p1 runs in the background, asynchronously; to kill it after some time, poll it as below
# here do some other tasks/computations
time.sleep(10)
currentStatus = p1.poll()
if currentStatus is None:  # then it is still running
    try:
        p1.kill()  # maybe try os.kill(p1.pid, 2) if p1.kill does not work
    except OSError:
        # the process finished between poll() and kill(); nothing to do
        pass
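On Python 3 the sleep-then-poll pattern above can also be written with Popen.wait(timeout=...), which returns as soon as the process finishes instead of always sleeping the full time. A sketch, with run_with_timeout as a hypothetical helper name and two throwaway Python one-liners standing in for the external program:

```python
import subprocess
import sys


def run_with_timeout(command, seconds):
    """Run command; kill it if it has not finished after `seconds`.

    Returns (return_code, timed_out).
    """
    process = subprocess.Popen(command)
    try:
        process.wait(timeout=seconds)
        timed_out = False
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()          # reap the killed process
        timed_out = True
    return process.returncode, timed_out


# Demonstration: one command that overruns its budget, one that does not.
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
fast = [sys.executable, "-c", "pass"]
slow_result = run_with_timeout(slow, 0.5)
fast_result = run_with_timeout(fast, 30)
```

Because the waiting happens on a separate process, a blocked API call in the main script does not stop the kill from happening, as long as run_with_timeout itself runs somewhere that is not blocked.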
This question already has answers here:
How do you create a daemon in Python?
(16 answers)
Closed 9 years ago.
I am new with Daemons and I was wondering how can I make my main script a daemon?
I have my main script which I wish to make a Daemon and run in the background:
main.py
def requestData(information):
    return currently_crunched_data()

while True:
    crunchData()
I would like to be able to use the requestData function to this daemon while the loop is running. I am not too familiar with Daemons or how to convert my script into one.
However, I am guessing I would have to make two threads, one for my crunchData loop and one for the Daemon request receiver, since the Daemon has its own loop (daemon.requestLoop()).
I am currently looking into Pyro to do this. Does anyone know how I can ultimately make a background running while loop have the ability to receive requests from other processes (like a Daemon I suppose) ?
There are already a number of questions on creating a daemon in Python, like this one, which answer that part nicely.
So, how do you have your daemon do background work?
As you suspected, threads are an obvious answer. But there are three possible complexities.
First, there's shutdown. If you're lucky, your crunchData function can be summarily killed at any time with no corrupted data or (too-significant) lost work. In that case:
def worker():
    while True:
        crunchData()

# ... somewhere in the daemon startup code ...
t = threading.Thread(target=worker)
t.daemon = True
t.start()
Notice that t.daemon. A "daemon thread" has nothing to do with your program being a daemon; it means that you can just quit the main process, and it will be summarily killed.
But what if crunchData can't be killed? Then you'll need to do something like this:
quitflag = False
quitlock = threading.Lock()

def worker():
    while True:
        with quitlock:
            if quitflag:
                return
        crunchData()

# ... somewhere in the daemon startup code ...
t = threading.Thread(target=worker)
t.start()

# ... somewhere in the daemon shutdown code ...
# (if this runs inside a function, declare "global quitflag" there first)
with quitlock:
    quitflag = True
t.join()
I'm assuming each iteration of crunchData doesn't take that long. If it does, you may need to check quitflag periodically within the function itself.
Meanwhile, you want your request handler to access some data that the background thread is producing. You'll need some kind of synchronization there as well.
The obvious thing is to just use another Lock. But there's a good chance that crunchData is writing to its data frequently. If it holds the lock for 10 seconds at a time, the request handler may block for 10 seconds. But if it grabs and releases the lock a million times, that could take longer than the actual work.
One alternative is to double-buffer your data: Have crunchData write into a new copy, then, when it's done, briefly grab the lock and set currentData = newData.
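A sketch of that double-buffering idea, with made-up names (DoubleBuffer, crunch_once) and a dummy computation in place of crunchData. The lock is held only for the pointer swap and the pointer read, never while the data is being built.

```python
import threading


class DoubleBuffer:
    """Holds the latest complete result; writers swap in whole new objects."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._current = initial

    def publish(self, new_data):
        with self._lock:            # held only for the swap
            self._current = new_data

    def snapshot(self):
        with self._lock:            # held only for the read
            return self._current


def crunch_once(buffer, step):
    # Build the next result off to the side; readers never see it half-built.
    new_data = {"step": step, "squares": [n * n for n in range(step)]}
    buffer.publish(new_data)


# Demonstration: a worker thread publishes three results in turn.
current_data = DoubleBuffer({"step": 0, "squares": []})
worker = threading.Thread(
    target=lambda: [crunch_once(current_data, s) for s in range(1, 4)])
worker.start()
worker.join()
latest = current_data.snapshot()
```

The request handler would call snapshot() and must treat the returned object as read-only, since the worker never modifies a published object but only replaces it.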
Depending on your use case, a Queue, a file, or something else might be even simpler.
Finally, crunchData is presumably doing a lot of CPU work. You need to make sure that the request handler does very little CPU work, or each request will slow things down quite a bit as the two threads fight over the GIL. Usually this is no problem. If it is, use a multiprocessing.Process instead of a Thread (which makes sharing or passing the data between the two processes slightly more complicated, but still not too bad).