multiprocessing pool stopping after exit of IDE - python

I have been having trouble exiting a multiprocessing pool after a keyboard interrupt, and after a long time of trying I gave up. But if I just exit CMD after I'm done / have done what I needed with my script, what would be the downsides? I know this isn't good practice, but it doesn't really need to be good. I'm assuming everything gets killed after the command line is exited, but I'm not sure.

I think it depends. If it is a Python console, the resources might get cleaned up. But I know from software crashes in Python that sometimes the child processes stay alive even when the main application closes, usually when it was started directly via the file explorer.
To make sure that your sub-threads/processes get closed when the main application closes, you should set them as daemons. Then killing the main script kills the children as well.
See this link.
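A minimal sketch of the daemon approach with multiprocessing (the worker body is just illustrative):

import multiprocessing
import time

def worker():
    # stand-in for real work; loops forever on purpose
    while True:
        time.sleep(1)

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker, daemon=True)
    p.start()
    time.sleep(3)
    # when the main process exits here, the daemon child is
    # terminated automatically instead of being left behind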

Related

How to make the Python subprocess wait for some input when running through SLURM script?

I am running some Python code using a SLURM script on a remote server accessed through SSH. At some point, license-related issues on the SLURM platform may occur, generating errors in Python and ending the subprocess. I want to use try-except to let the Python subprocess wait until the issue is fixed, after which it can keep running from where it stopped.
What are some smart implementations for that?
My most obvious solution is just keeping Python inside a loop if the error occurs and letting it read a file every X seconds. When I finally fix the error and want it to keep running from where it stopped, I would write something to the file and break the loop. I wonder if there is a smarter way to provide input to the Python subprocess while it is running through the SLURM script.
One idea might be to add a signal handler for the USR1 signal to your Python script, as in the sketch further below.
In the signal handler function, you can set a global variable, send a message, or set a threading.Event that the main process is waiting on.
Then you can signal the process with:
kill -USR1 <PID>
or with the Python os.kill() equivalent.
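A minimal sketch of that approach (assuming a Unix host, which SLURM implies; the name resume is illustrative):

import os
import signal
import threading

resume = threading.Event()

def handle_usr1(signum, frame):
    # runs when SIGUSR1 arrives; unblocks the wait below
    resume.set()

signal.signal(signal.SIGUSR1, handle_usr1)

print(f"PID {os.getpid()} waiting; resume with: kill -USR1 {os.getpid()}")
while not resume.wait(timeout=1.0):   # short timeout keeps the wait responsive to signals
    pass
print("resumed")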
Though I do have to agree there is something to be said for the simplicity of your process doing:
touch /tmp/blocked.$$
and your program waiting in a loop with a 1-second sleep for that file to be removed. This way you can tell which process ID is blocked.
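That file-based wait could look something like this (the path convention mirrors the touch command above):

import os
import time

flag = f"/tmp/blocked.{os.getpid()}"   # mirrors the shell's /tmp/blocked.$$
open(flag, "w").close()                # mark this PID as blocked
while os.path.exists(flag):            # resume by deleting the file
    time.sleep(1)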

What is the best way to debug a python multiprocess script which fails to terminate?

I am writing a Python script which uses multiprocessing, multithreading and zeromq for interprocess communication. It all works fine until the program finishes: at that time the child processes terminate properly (sigwait is intercepted and the child procs terminate, which I have confirmed with the ps command), but the main process often does not shut down. Occasionally it does, but most of the time it does not. I have confirmed that all remaining threads of the main process are daemonic and that the last row of the script executes properly (it is a logging.info call). I am using fork for forking processes and can see that a ForkProcess still runs in addition to the main process.
What is the best way to debug this, considering that the script has actually finished? Maybe add a pdb or breakpoint() call right at the end?
Thanks in advance.
Here is the output; after the last row the script usually does not terminate:
INFO root::remaining active child processes: [<ForkProcess name='SyncManager-1' pid=6362 parent=6361 started>]
INFO root::non-daemonic threads which are still running, preventing orderly shutdown: [].
INFO root::======== PID: 6361 main() end: shut down completed.=========
EDIT:
I refactored the code and noticed that it now misbehaves very rarely. I am 99.9% certain that it is due to an open zeromq REQ/REP 'socket' at the time of shutdown. The refactoring made sure that these sockets are held open only for a very short time, but it is not predictable which sockets are open at shutdown, so occasionally it still hangs.
I will write a simple test harness with two processes communicating via REQ/REP sockets, then shut down the child process followed by the main process. I expect the same result, i.e., the interpreter not shutting down. Let's see; I'll keep you posted.
I think you could try viztracer. The good thing about viztracer is that it can display all the processes on the same timeline. Maybe you can catch what's stopping your main process/forked process from shutting down. If it's a deadlock, it should be noticeable. However, without the code, I really can't tell whether it would help for sure.
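As a complementary standard-library option (not part of the answer above, just a common trick for exactly this symptom), faulthandler can dump every thread's stack if shutdown stalls, which often points straight at the blocking call:

import faulthandler
import sys

# arm a watchdog at the end of main(): if the interpreter is still
# alive 30 seconds later, dump all thread stacks to stderr and exit
faulthandler.dump_traceback_later(30, exit=True, file=sys.stderr)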

Wait and complete processes when Python script is stopped from PyCharm console?

Basically I am writing a script that can be stopped and resumed at any time. So if the user uses, say, the PyCharm console to execute the program, he can just click on the stop button whenever he wants.
Now, I need to save some variables and let an ongoing function finish before terminating. What functions do I use for this?
I have already tried atexit.register() to no avail.
Also, how do I make sure that an ongoing function is completed before the program can exit?
Solved it using a really bad workaround. I tried all the exit-related functions in Python, including the SIG* handlers, but notably I did not find a way to catch the exit signal when a Python program is stopped by pressing the "Stop" button in the PyCharm application. I finally got a workaround by using tkinter to open an empty window, with my program running in a background thread, and used that window to close/stop program execution. It works wonderfully, and catches the SIG* signals as well as executing atexit handlers. Anyway, massive thanks to #scrineym, as the link really gave a lot of useful information that helped in the development of the final version.
It looks like you might want to catch a signal.
When a program is told to stop, a signal is sent to the process from the OS; you can catch it and do cleanup before exit. There are many different signals: for example, when you press CTRL+C, a SIGINT signal is sent by the OS to stop your process, but there are many others.
See here: How do I capture SIGINT in Python?
and here for the signal library: https://docs.python.org/2/library/signal.html
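A minimal sketch of that pattern (save_state is a hypothetical placeholder for your own cleanup):

import signal
import sys

def save_state():
    # placeholder: save variables, let the ongoing function finish, etc.
    pass

def graceful_exit(signum, frame):
    save_state()
    sys.exit(0)

signal.signal(signal.SIGINT, graceful_exit)    # CTRL+C
signal.signal(signal.SIGTERM, graceful_exit)   # polite kill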

How to use threading/multiprocessing to prevent program hanging?

I am a bit confused about multiprocessing. I have a video-processing script which can be run from the command line or launched from a PySide application using a subprocess call. The script seems to run fine from the command line and basically initializes a pool of workers which each process a separate video file.
When I run the program from the GUI, however, the OS tells me my program is not responding. I would like to make use of all the cores on my system for multiprocessing, but I would also like to prevent this annoyance. What should I do to get around this? Do I start the initial script in a thread or something?
As you are speaking of PySide, I assume your program is a GUI one. In a GUI program, all processing must occur in a worker thread if you want to keep the UI responsive. So yes, the initial script must be started in a thread distinct from the main thread (the main one is reserved for the UI).
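A minimal sketch of that arrangement (process_one and on_start_clicked are hypothetical names, not PySide API):

import threading
from multiprocessing import Pool

def process_one(path):
    ...  # placeholder for the per-video work

def process_videos(paths):
    # the heavy work runs in a pool of worker processes,
    # driven from a background thread so the GUI thread never blocks
    with Pool() as pool:
        pool.map(process_one, paths)

def on_start_clicked(paths):
    # called from the GUI (main) thread; returns immediately
    threading.Thread(target=process_videos, args=(paths,), daemon=True).start()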

How to identify the cause in Python of code that is not interruptible with a CTRL+C

I am using requests to pull some files. I have noticed that the program seems to hang after some large number of iterations, varying from 5K to 20K. I can tell it is hanging because the folder where the results are stored has not changed in several hours. I have been trying to interrupt the process (I am using IDLE) by hitting CTRL+C, to no avail. I would like to interrupt instead of killing the process because a restart is easier. I have finally had to kill the process. I restart and it runs fine again until I have the same symptoms. I would like to figure out how to diagnose the problem, but since I am having to kill everything I have no idea where to start.
Is there an alternate way to view what is going on, or to more robustly interrupt the process?
I have been assuming that if I can interrupt without killing, I can look at globals and/or do some other mucking around to figure out where my code is hanging.
In case it's not too late: I've just faced the same problems and have some tips.
First thing: in Python, most waiting APIs are not interruptible (i.e., Thread.join(), Lock.acquire()...).
Have a look at these pages for more information:
http://snakesthatbite.blogspot.fr/2010/09/cpython-threading-interrupting.html
http://docs.python.org/2/library/thread.html
So if a thread is waiting on such a call, it cannot be stopped.
There is another thing to know: if a normal (non-daemon) thread is running (or hung), the main program will stay alive indefinitely until all threads are stopped or the process is killed.
To avoid that, you can make the thread a daemon thread: set Thread.daemon = True before calling Thread.start().
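For example (the worker body just simulates a long blocking wait):

import threading
import time

def worker():
    time.sleep(3600)   # stand-in for a long, uninterruptible wait

t = threading.Thread(target=worker)
t.daemon = True        # must be set before start()
t.start()
# the process can now exit (or be interrupted) without joining worker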
Second thing: to find where your program is hung, you can launch it with a debugger, but I prefer logging, because logs are always there in case it's too late to debug.
Try logging before and after each waiting call to see how long your threads have been hung. For high-quality logs, use Python logging configured with a file handler, an HTML handler, or, even better, a syslog handler.
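A minimal sketch of the before/after logging idea (the lock here stands in for any blocking call):

import logging
import threading

logging.basicConfig(
    filename="app.log",
    level=logging.INFO,
    format="%(asctime)s %(threadName)s %(message)s",
)

lock = threading.Lock()

logging.info("about to acquire lock")
with lock:   # if the process hangs here, the last log line pinpoints it
    logging.info("lock acquired; doing work")
logging.info("lock released")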
