I'm trying to use multiprocessing.Pool to speed up the execution of a function across a range of inputs. The worker processes do seem to start, since my task manager shows a substantial increase in CPU utilization, but the task never terminates. No exceptions are ever raised, at runtime or otherwise.
from multiprocessing import Pool

def f(x):
    print(x)
    return x**2

class Klass:
    def __init__(self):
        pass

    def foo(self):
        X = list(range(1, 1000))
        with Pool(15) as p:
            result = p.map(f, X)

if __name__ == "__main__":
    obj = Klass()
    obj.foo()
    print("All Done!")
Interestingly, despite the uptick in CPU utilization, print(x) never prints anything to the console.
I have moved the function f outside of the class, as was suggested here, to no avail. I have tried adding p.close() and p.join() as well, with no success. Using other Pool methods like imap leads to TypeError: can't pickle _thread.lock objects errors, and seems a step away from the example usage in the introduction to the Python multiprocessing documentation.
Adding to the confusion, if I try running the code above enough times (killing the hung kernel after each attempt), the code begins consistently working as expected. It usually takes about twenty attempts before this "clicks" into place. Restarting my IDE reverts the now-functional code back to the former broken state. For reference, I am using the Anaconda Python distribution (Python 3.7) with the Spyder IDE on Windows 10. My CPU has 16 cores, so Pool(15) is not calling for more processes than I have CPU cores. However, running the code with a different IDE, like Jupyter Lab, yields the same broken results.
Others have suggested that this may be a flaw with Spyder itself, but the suggestion to use multiprocessing.Pool instead of multiprocessing.Process doesn't seem to work either.
This could be related to this note from the Python docs:

Note: Functionality within this package requires that the main module be importable by the children. This is covered in Programming guidelines, but it is worth pointing out here. This means that some examples, such as the multiprocessing.pool.Pool examples, will not work in the interactive interpreter.
and then this comment on their example:
If you try this it will actually output three full tracebacks interleaved in a semi-random fashion, and then you may have to stop the master process somehow.
UPDATE:
The info found here seems to confirm that using the pool from an interactive interpreter will have varying success. This guidance is also shared there:

...guidance [is] to always use functions/classes whose definitions are importable.

This is the solution outlined here, and it is what works for me (every time) with your code.
This seems like it might be a problem with both Spyder and Jupyter. If you run the above code in the console directly, everything works as intended.
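For illustration, here is a minimal sketch of that "importable definitions" approach; the file names worker.py and main.py are my own, not from the original post:

# worker.py -- hypothetical helper module so the pool target is importable
def f(x):
    print(x)
    return x ** 2

# main.py -- run this file from a regular terminal rather than an interactive console
from multiprocessing import Pool
from worker import f  # child processes can re-import f after being spawned

if __name__ == "__main__":
    X = list(range(1, 1000))
    with Pool(15) as p:
        result = p.map(f, X)
    print("All Done!")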
Related
I've been using Google Colab for a few weeks now and I've been wondering what the difference is between the two following commands (for example):
!ffmpeg ...
subprocess.Popen(['ffmpeg', ...
I was wondering because I ran into some issues when I started either of the commands above and then tried to stop execution midway. Both of them cancel on KeyboardInterrupt, but I noticed that afterwards the runtime needs a factory reset because it somehow got stuck. Checking ps aux in the Linux console listed a process [ffmpeg] <defunct>, which apparently was still running, or at least blocking some resources.
I then did some research and came across some similar posts asking how to terminate a subprocess correctly (1, 2, 3). Based on those posts, I came to the conclusion that the subprocess.Popen(..) variant obviously provides more flexibility when it comes to handling the subprocess: defining different stdout handling, reacting to different return codes, etc. But I'm still unsure about what exactly the first command above, with the ! prefix, does under the hood.
Using the first command is much easier and requires way less code to start this process. And assuming I don't need a lot of logic handling the process flow, it would be a nice way to execute something like ffmpeg, if only I were able to terminate it as expected. Even following the answers from the other posts, using the second command never got me to a point where I could fully terminate the process once started (even when using shell=False, process.kill() or process.wait(), etc.). This got me frustrated, because restarting and re-initializing the Colab instance can take several minutes every time.
So, finally, I'd like to understand in more general terms what the difference is and was hoping that someone could enlighten me. Thanks!
! commands are executed by the notebook (or more specifically by the IPython interpreter), and are not valid Python commands. If the code you are writing needs to work outside of the notebook environment, you cannot use ! commands.
As you correctly note, you are unable to interact with the subprocess you launch via !, so it's also less flexible than an explicit subprocess call, though similar in this regard to subprocess.call.
Like the documentation mentions, you should generally avoid the bare subprocess.Popen unless you specifically need the detailed flexibility it offers, at the price of having to duplicate the higher-level functionality which subprocess.run et al. already implement. The code to run a command and wait for it to finish is simply
subprocess.check_call(['ffmpeg', ... ])
with variations for capturing its output (check_output) and the more modern run, which can easily replace all three of the legacy high-level calls, albeit with some added verbosity.
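As a hedged sketch of the run variant (the ffmpeg arguments here are placeholders, not taken from the question):

import subprocess

# check=True raises CalledProcessError on a non-zero exit status, like check_call does;
# capture_output=True collects stdout/stderr the way check_output would
# (note that ffmpeg writes its log output to stderr).
result = subprocess.run(
    ['ffmpeg', '-i', 'input.mp4', 'output.mp4'],  # placeholder arguments
    capture_output=True,
    text=True,
    check=True,
)
print(result.returncode)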
I have attempted in a few different ways to perform Pool.starmap. I have tried various different suggestions and answers, all to no avail. Below is a sample of the code I am trying to run; however, it hangs and never terminates. What am I doing wrong here?
Side note: I am on python version 3.9.8
if __name__ == '__main__':
    with get_context("spawn").Pool() as p:
        tasks = [(1,1),(2,2),(3,3)]
        print(p.starmap(add, tasks))
        p.close()
        p.join()
Multiprocessing in Python has some complexities you should be aware of that make its behavior depend on how you run your script, in addition to what OS and Python version you're using.
One of the big issues I see very often is the fact that Jupyter and other "notebook"-style Python environments don't always play nice with multiprocessing. There are technically some ways around this, but I typically just suggest executing your code from a more normal system terminal. The common thread is that "interactive" interpreters don't work very well because there needs to be a "main" file, and in interactive mode there's no file; it just waits for user input.
I can't know exactly what your issue is here, as you haven't provided all your code, what OS you're using, or what IDE you're using, but I can at least leave you with a working (on my setup) example. (Windows 10; Python 3.9; Spyder IDE with run settings -> execute in an external system terminal)
import multiprocessing as mp

def add(a, b): #I'm assuming your "add" function looks a bit like this...
    return a+b

if __name__ == "__main__":
    #this is critical when using "spawn" so code doesn't run when the file is imported
    #you should only define functions, classes, and static data outside this (constants)
    #most critically, it shouldn't be possible for a new child process to start outside this
    ctx = mp.get_context("spawn")
    #This is the only context available on windows, and the default for MacOS since python 3.8.
    # Contexts are an important topic somewhat unique to python multiprocessing, and you should
    # absolutely do some additional reading about "spawn" vs "fork". tldr; "spawn" starts a new
    # process with no knowledge of the old one, and must `import` everything from __main__.
    # "fork" on the other hand copies the existing process and all its memory before branching. This is
    # faster than re-starting the interpreter, and re-importing everything, but sometimes things
    # get copied that shouldn't, and other things that should get copied don't.
    with ctx.Pool() as p:
        #using `with` automatically shuts down the pool (forcibly) at the end of the block so you don't have to call `close` or `join`.
        # It was also pointed out that due to the forcible shutdown, async calls like `map_async` may not finish unless you wait for the results
        # before the end of the `with` block. `starmap` already waits for the results in this case however, so extra waiting is not needed.
        tasks = [(1,1),(2,2),(3,3)]
        print(p.starmap(add, tasks))
I am sorry for this rather long question but, since it is my first question on Stackoverflow, I wanted to be thorough in describing my problem and what I already tried.
I am doing simulations of stochastic processes and thought it would be a good idea to use multiprocessing in order to increase the speed of my simulations. Since the individual processes have no need to share information with each other, this is really a trivial application of multiprocessing; unfortunately, I struggle with calling my script from the console.
My code for a test function looks like this:
#myscript.py
from multiprocessing import Pool

def testFunc(inputs):
    print(inputs)

def multi():
    print('Test2')
    pool = Pool()
    pool.map_async(testFunc, range(10))

if __name__ == '__main__':
    print('Test1')
    multi()
This works absolutely fine as long as I run the code from within my Spyder IDE. As the next step, I want to execute my script on my university's cluster, which I access via a Slurm script; therefore, I need to be able to execute my Python script via a bash script. Here I got some unexpected results.
What I tried, on my MacBook Pro with macOS 10.15.7 and a workstation with Ubuntu 18.04.5, are the following console inputs: python myscript.py and python -c "from myscript import multi; multi()".
In each case my only output is Test1 and Test2, and testFunc never seems to be called. Following this answer, Using python multiprocessing Pool in the terminal and in code modules for Django or Flask, I also tried various versions of omitting the if __name__ == '__main__' guard and importing the relevant functions from another module. For example, I tried
#myscript.py
from multiprocessing import Pool

def testFunc(inputs):
    print(inputs)

pool = Pool()
pool.map_async(testFunc, range(10))
But all to no avail. To confuse me even further, I have now found out that first opening the Python interpreter in the console by simply typing python, pressing enter, and then executing
from myscript import multi
multi()
inside the python interpreter does work.
As I said, I am very confused by this, since I thought it to be equivalent to python -c "from myscript import multi; multi()", and I really don't understand why one works and the other doesn't. Trying to reproduce this success, I also tried executing the following bash script:
python - <<'END_SCRIPT'
from multiTest import multi
multi()
END_SCRIPT
but, alas, this doesn't work either.
As a last "discovery", I found out that all those problems only arise when using map_async instead of just map; however, I think that for my application asynchronous processing is preferable.
I would be really grateful if someone could shed light on this mystery (at least for me it is a mystery).
Also, as I said, this is my first question on Stack Overflow, so I apologize if I forgot relevant information or accidentally did not follow the formatting guidelines. All comments or edits helping me to improve my questions (and answers) in the future are also much appreciated!
You aren't waiting for the pool to finish what it's doing before your program exits.
def multi():
    print('Test2')
    with Pool() as pool:
        result = pool.map_async(testFunc, range(10))
        result.wait()
If the order in which the subprocesses process things isn't relevant, I'd suggest
with Pool() as pool:
    for result in pool.imap_unordered(testFunc, range(10), 5):
        pass
(change 5, the chunk size parameter, to taste.)
I use coveralls in combination with coverage.py to track python code coverage of my testing scripts. I use the following commands:
coverage run --parallel-mode --source=mysource --omit=*/stuff/idont/need.py ./mysource/tests/run_all_tests.py
coverage combine
coveralls --verbose
This works quite nicely with the exception of multiprocessing. Code executed by worker pools or child processes is not tracked.
Is there a possibility to also track multiprocessing code? Any particular option I am missing? Maybe adding wrappers to the multiprocessing library to start coverage every time a new process is spawned?
EDIT:
I (and jonrsharpe, also :-) found a monkey-patch for multiprocessing.
However, this does not work for me; my Travis CI build is killed almost right after the start. I checked the problem on my local machine, and apparently adding the patch to multiprocessing blows up my memory usage. Tests that take much less than 1 GB of memory need more than 16 GB with this fix.
EDIT2:
The monkey-patch does work after a small modification: removing the config_file parsing (config_file=os.environ['COVERAGE_PROCESS_START']) did the trick. This solved the issue of the bloated memory. Accordingly, the corresponding line simply becomes:
cov = coverage(data_suffix=True)
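For reference, here is a sketch of what such a patch can look like with that modification applied; this is my own reconstruction, not the exact gist, and on newer Python versions Pool takes its worker class from a context, so the replacement may need to happen there instead:

import multiprocessing
from coverage import Coverage  # the original gist used the older lowercase coverage class

class ProcessWithCoverage(multiprocessing.Process):
    def _bootstrap(self, *args, **kwargs):
        cov = Coverage(data_suffix=True)  # per EDIT2: no COVERAGE_PROCESS_START parsing
        cov.start()
        try:
            return super()._bootstrap(*args, **kwargs)
        finally:
            cov.stop()
            cov.save()  # write a suffixed data file for `coverage combine`

multiprocessing.Process = ProcessWithCoverage  # apply before any worker processes start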
Coverage 4.0 includes a command-line option --concurrency=multiprocessing to deal with this. You must use coverage combine afterward. For instance, if your tests are in regression_tests.py, then you would simply do this at the command line:
coverage run --concurrency=multiprocessing regression_tests.py
coverage combine
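If you prefer a config file over command-line flags, the same settings can, as far as I know, also go into a .coveragerc:

# .coveragerc
[run]
concurrency = multiprocessing
parallel = True
source = mysource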
I had spent some time trying to make sure coverage works with multiprocessing.Pool, but it never worked.
I have finally made a fix that makes it work; I would be happy if someone pointed out whether I am doing something wrong.
https://gist.github.com/andreycizov/ee59806a3ac6955c127e511c5e84d2b6
One of the possible causes of missing coverage data from forked processes, even with concurrency=multiprocessing, is the way multiprocessing.Pool is shut down. For example, the with statement leads to a terminate() call (see __exit__ here). As a consequence, the pool workers have no time to save their coverage data. I had to use a close(), timed join() (in a thread), terminate sequence instead of with to get the coverage results saved.
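A minimal sketch of that shutdown sequence, under my own assumptions about the surrounding code (the task function and inputs are placeholders):

import multiprocessing
import threading

def square(x):  # placeholder task
    return x * x

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    results = pool.map_async(square, range(100))
    results.wait()

    pool.close()  # stop accepting work; lets workers exit cleanly and save coverage data
    joiner = threading.Thread(target=pool.join)
    joiner.start()
    joiner.join(timeout=30)  # timed join so a hung worker cannot block forever
    if joiner.is_alive():
        pool.terminate()  # only fall back to terminate() after the timeout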
I'm using the PyDev for Eclipse plugin, and I'm trying to set a break point in some code that gets run in a background thread. The break point never gets hit even though the code is executing. Here's a small example:
import thread

def go(count):
    print 'count is %d.' % count # set break point here

print 'calling from main thread:'
go(13)
print 'calling from bg thread:'
thread.start_new_thread(go, (23,))
raw_input('press enter to quit.')
The break point in that example gets hit when it's called on the main thread, but not when it's called from a background thread. Is there anything I can do, or is that a limitation of the PyDev debugger?
Update
Thanks for the work arounds. I submitted a PyDev feature request, and it has been completed. It should be released with version 1.6.0. Thanks, PyDev team!
The problem is that there's no API in the thread module to know when a thread starts.
What you can do in your example is set the debugger trace function yourself (as Alex pointed out), as in the code below (if you're not in the remote debugger, the pydevd.connected = True is currently required -- I'll change pydev so that this is not needed anymore). You may want to add a try..except ImportError around the pydevd import (which will fail if you're not running in the debugger).
def go(count):
    import pydevd
    pydevd.connected = True
    pydevd.settrace(suspend=False)
    print 'count is %d.' % count # set break point here
Now, on second thought, I think that PyDev can replace the start_new_thread method in the thread module with its own function, which will set up the debugger and later call the original function (I just did that and it seems to be working, so if you use the nightly build that will be available in a few hours, which will become the future 1.6.0, it should work without doing anything special).
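For illustration only, a rough sketch of that wrapping idea (not PyDev's actual implementation; shown with Python 3's _thread module, whereas the question uses the old Python 2 thread module):

import _thread

_original_start_new_thread = _thread.start_new_thread

def start_new_thread_with_trace(function, args, kwargs=None):
    def wrapper(*a, **kw):
        try:
            import pydevd
            pydevd.settrace(suspend=False)  # register the debugger in this new thread
        except ImportError:
            pass  # not running under the debugger
        return function(*a, **kw)
    return _original_start_new_thread(wrapper, args, kwargs or {})

_thread.start_new_thread = start_new_thread_with_trace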
The underlying issue is with sys.settrace, the low-level Python function used to perform all tracing and debugging -- as the docs say,
The function is thread-specific; for a debugger to support multiple threads, it must be registered using settrace() for each thread being debugged.
I believe that when you set a breakpoint in PyDev, the resulting settrace call is always happening on the main thread (I have not looked at PyDev recently so they may have added some way to work around that, but I don't recall any from the time when I did look).
A workaround you might implement yourself is, in your main thread after the breakpoint has been set, to use sys.gettrace to get PyDev's trace function, save it in a global variable, and make sure in all threads of interest to call sys.settrace with that global variable as the argument -- a tad cumbersome (more so for threads that already exist at the time the breakpoint is set!), but I can't think of any simpler alternative.
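A minimal sketch of that workaround; the names PYDEV_TRACE and do_work are mine, purely for illustration:

import sys
import threading

def do_work():
    print('working in a background thread')  # breakpoints here should now be honored

def worker():
    if PYDEV_TRACE is not None:
        sys.settrace(PYDEV_TRACE)  # install the debugger's trace function in this thread
    do_work()

# in the main thread, after the breakpoint has been set:
PYDEV_TRACE = sys.gettrace()  # None when not running under a debugger
threading.Thread(target=worker).start()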
On this question, I found a way to start the command-line debugger:
import pdb; pdb.set_trace()
It's not as easy to use as the Eclipse debugger, but it's better than nothing.
For me this worked, following one of Fabio's posts: after setting the trace with setTrace("000.000.000.000") (where the 0's are the IP of your computer running Eclipse/PyDev), call
threading.settrace(pydevd.GetGlobalDebugger().trace_dispatch)