I have attempted to perform Pool.starmap in a few different ways. I have tried various suggestions and answers, all to no avail. Below is a sample of the code I am trying to run; however, it hangs and never terminates. What am I doing wrong here?
Side note: I am on python version 3.9.8
if __name__ == '__main__':
    with get_context("spawn").Pool() as p:
        tasks = [(1,1),(2,2),(3,3)]
        print(p.starmap(add,tasks))
        p.close()
        p.join()
Multiprocessing in Python has some complexity you should be aware of: its behavior depends on how you run your script, in addition to which OS and Python version you're using.
One of the big issues I see very often is that Jupyter and other "notebook" style Python environments don't always play nice with multiprocessing. There are technically some ways around this, but I typically just suggest executing your code from a normal system terminal. The common thread is that "interactive" interpreters don't work very well because there needs to be a "main" file, and in interactive mode there is no file; the interpreter just waits for user input.
I can't know exactly what your issue is here, as you haven't provided all your code, what OS you're using, or what IDE you're using, but I can at least leave you with an example that works on my setup (Windows 10; Python 3.9; Spyder IDE with run settings -> execute in an external system terminal).
import multiprocessing as mp

def add(a, b):  # I'm assuming your "add" function looks a bit like this...
    return a + b

if __name__ == "__main__":
    # This guard is critical when using "spawn" so code doesn't run when the file is imported.
    # You should only define functions, classes, and static data (constants) outside it.
    # Most critically, it shouldn't be possible for a new child process to start outside it.

    ctx = mp.get_context("spawn")
    # "spawn" is the only context available on Windows, and the default on macOS since Python 3.8.
    # Contexts are an important topic somewhat unique to Python multiprocessing, and you should
    # absolutely do some additional reading about "spawn" vs "fork". tl;dr: "spawn" starts a new
    # process with no knowledge of the old one, and must `import` everything from __main__.
    # "fork" on the other hand copies the existing process and all its memory before branching.
    # This is faster than re-starting the interpreter and re-importing everything, but sometimes
    # things get copied that shouldn't be, and other things that should get copied don't.

    with ctx.Pool() as p:
        # Using `with` automatically shuts down the pool (forcibly) at the end of the block,
        # so you don't have to call `close` or `join`. It was also pointed out that, due to the
        # forcible shutdown, async calls like `map_async` may not finish unless you wait for the
        # results before the end of the `with` block. `starmap` already waits for the results,
        # so no extra waiting is needed here.
        tasks = [(1,1),(2,2),(3,3)]
        print(p.starmap(add, tasks))
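If you do want to use an async call such as map_async (or starmap_async) inside a with block, a minimal sketch of the waiting pattern mentioned in the comments above, reusing the same hypothetical add function, would be:

import multiprocessing as mp

def add(a, b):  # same hypothetical worker as above
    return a + b

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ctx.Pool() as p:
        tasks = [(1, 1), (2, 2), (3, 3)]
        async_result = p.starmap_async(add, tasks)  # returns an AsyncResult immediately
        print(async_result.get())  # .get() blocks until every task has finished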
I'm pretty new to Python, and this question probably shows that. I'm working on the multiprocessing part of my script and couldn't find a definitive answer to my problem.
I'm struggling with one thing. When using multiprocessing, part of the code has to be guarded with if __name__ == "__main__". I get that, and my pool is working great. But I would love to import that whole script (making it one big function that returns a result would be best). And here is the problem. First, how can I import something if part of it will only run when launched from the main/source file because of that guard? Secondly, if I manage to work it out and the whole script ends up in one big function, pickle can't handle that; will using "multiprocessing on dill" or "pathos" fix it?
Thanks!
You are probably confused about the concept. The if __name__ == "__main__" guard in Python exists exactly so that all Python files can be importable.
Without the guard, a file, once imported, would have the same behavior as if it were the "root" program, and it would require a lot of boilerplate and inter-process communication (like writing a "PID" file at a fixed filesystem location) to coordinate imports of the same code, including for multiprocessing.
Just leave under the guard whatever code needs to run for the root process. Everything else you move into functions that you can call from the importing code.
If you were to run "all" the script, even the part setting up the multiprocessing workers would run, and any simple job would create more workers exponentially until all machine resources were taken (i.e. it would crash hard and fast, potentially leaving the machine in an unresponsive state).
So, this is a good pattern: the "dothejob" function can call all the other functions you need, so you just need to import and call it, either from a master process, or from any other project importing your file as a Python module.
import multiprocessing

...

def dothejob():
    ...

def start():
    # code to set up and start the multiprocessing workers, e.g.:
    worker1 = multiprocessing.Process(target=dothejob)
    ...
    worker1.start()
    ...
    worker1.join()

if __name__ == "__main__":
    start()
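For illustration, importing and using that module from another file might look like this (the file names here are hypothetical; the pattern above is assumed to live in myscript.py):

# other_project.py -- hypothetical importer of the module sketched above (saved as myscript.py)
import myscript

if __name__ == "__main__":
    # Run the work directly in this process, without spawning any workers:
    myscript.dothejob()

    # Or explicitly launch the multiprocessing workers defined in myscript:
    myscript.start()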
I am sorry for this rather long question but, since it is my first question on Stackoverflow, I wanted to be thorough in describing my problem and what I already tried.
I am doing simulations of stochastic processes and thought it would be a good idea to use multiprocessing to increase their speed. Since the individual processes have no need to share information with each other, this is really a trivial application of multiprocessing; unfortunately, I struggle with calling my script from the console.
My code for a test function looks like this:
#myscript.py
from multiprocessing import Pool

def testFunc(inputs):
    print(inputs)

def multi():
    print('Test2')
    pool = Pool()
    pool.map_async(testFunc, range(10))

if __name__ == '__main__':
    print('Test1')
    multi()
This works absolutely fine as long as I run the code from within my Spyder IDE. As the next step I want to execute my script on my university's cluster which I access via a slurm script; therefore, I need to be able to execute my python script via a bash script. Here I got some unexpected results.
What I tried, on my MacBook Pro with macOS 10.15.7 and a workstation with Ubuntu 18.04.5, are the following console inputs: python myscript.py and python -c "from myscript import multi; multi()".
In each case my only output is Test1 and Test2, and testFunc never seems to be called. Following this answer, Using python multiprocessing Pool in the terminal and in code modules for Django or Flask, I also tried various versions of omitting the if __name__ == '__main__' guard and importing the relevant functions into another module. For example, I tried:
#myscript.py
from multiprocessing import Pool

def testFunc(inputs):
    print(inputs)

pool = Pool()
pool.map_async(testFunc, range(10))
But all to no avail. To confuse me even further, I then found out that first opening the Python interpreter in the console by simply typing python, pressing enter, and then executing
from myscript import multi
multi()
inside the python interpreter does work.
As I said, I am very confused by this, since I thought this to be equivalent to python -c "from myscript import multi; multi()" and I really don't understand why one works and the other doesn't. Trying to reproduce this success I also tried executing the following bash script
python - <<'END_SCRIPT'
from multiTest import multi
multi()
END_SCRIPT
but, alas, also this doesn't work.
As a last "dicovery", I found out that all those problems only arise when using map_async instead of just map – however, I think that for my application asynchron processes are preferable.
I would be really grateful if someone could shed light on this mystery (at least for me it is a mystery).
Also, as I said this is my first question on Stackoverflow, so I apologize if I forgot relevant information or did accidentally not follow the formatting guidelines. All comments or edits helping me to improve my questions (and answers) in the future are also much appreciated!
You aren't waiting for the pool to finish what it's doing before your program exits.
def multi():
    print('Test2')
    with Pool() as pool:
        result = pool.map_async(testFunc, range(10))
        result.wait()
If the order in which the subprocesses process things isn't relevant, I'd suggest
with Pool() as pool:
    for result in pool.imap_unordered(testFunc, range(10), 5):
        pass
(change 5, the chunk size parameter, to taste.)
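Putting that together, a minimal self-contained sketch of a corrected myscript.py (same function names as in the question; the return value is added only for illustration) could look like:

# myscript.py -- sketch of the corrected script
from multiprocessing import Pool

def testFunc(inputs):
    print(inputs)
    return inputs * 2  # hypothetical result, just to show values being collected

def multi():
    print('Test2')
    with Pool() as pool:
        # imap_unordered yields results as workers finish, in arbitrary order;
        # consuming the iterator keeps the parent alive until every task is done.
        results = list(pool.imap_unordered(testFunc, range(10), 5))
    print(results)

if __name__ == '__main__':
    print('Test1')
    multi()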
I'm trying to use multiprocessing.Pool to speed up the execution of a function across a range of inputs. The processes seem to have been called, since my task manager indicates a substantial increase in my CPU's utilization, but the task never terminates. No exceptions are ever raised, runtime or otherwise.
from multiprocessing import Pool

def f(x):
    print(x)
    return x**2

class Klass:
    def __init__(self):
        pass

    def foo(self):
        X = list(range(1, 1000))
        with Pool(15) as p:
            result = p.map(f, X)

if __name__ == "__main__":
    obj = Klass()
    obj.foo()
    print("All Done!")
Interestingly, despite the uptick in CPU utilization, print(x) never prints anything to the console.
I have moved the function f outside of the class, as was suggested here, to no avail. I have tried adding p.close() and p.join() as well, with no success. Using other Pool class methods like imap leads to TypeError: can't pickle _thread.lock objects errors and seems to take a step away from the example usage in the introduction of the Python multiprocessing documentation.
Adding to the confusion, if I try running the code above enough times (killing the hung kernel after each attempt), the code begins consistently working as expected. It usually takes about twenty attempts before this "clicks" into place. Restarting my IDE reverts the now-functional code back to the former broken state. For reference, I am running the Anaconda Python distribution (Python 3.7) with the Spyder IDE on Windows 10. My CPU has 16 cores, so Pool(15) is not calling for more processes than I have CPU cores. However, running the code with a different IDE, like Jupyter Lab, yields the same broken results.
Others have suggested that this may be a flaw with Spyder itself, but the suggestion to use multiprocessing.Pool instead of multiprocessing.Process doesn't seem to work either.
Could be related to this note from the Python documentation:
Note Functionality within this package requires that the main
module be importable by the children. This is covered in Programming
guidelines however it is worth pointing out here. This means that some
examples, such as the multiprocessing.pool.Pool examples will not work
in the interactive interpreter.
and then this comment on their example:
If you try this it will actually output three full tracebacks
interleaved in a semi-random fashion, and then you may have to stop
the master process somehow.
UPDATE:
The info found here seems to confirm that using the pool from an interactive interpreter will have varying success. This guidance is also shared...
...guidance [is] to always use functions/classes whose definitions are
importable.
This is the solution outlined here, and it works for me (every time) with your code.
This seems like it might be a problem with both Spyder and Jupyter. If you run the above code in the console directly, everything works as intended.
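To make the "importable definitions" guidance above concrete, here is a minimal sketch, assuming the worker is moved into a separate module (the file name workers.py is hypothetical) and the script is run from a system terminal:

# workers.py -- hypothetical module holding the importable worker definition
def f(x):
    print(x)
    return x**2

# main.py -- run from a system terminal rather than an interactive console
from multiprocessing import Pool
from workers import f  # the child processes can re-import this definition

class Klass:
    def foo(self):
        X = list(range(1, 1000))
        with Pool(15) as p:
            return p.map(f, X)

if __name__ == "__main__":
    obj = Klass()
    result = obj.foo()
    print("All Done!")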
I have a wxPython GUI, and would like to use multiprocessing to create a separate process which uses PyAudio. That is, I want to use PyAudio, wxPython, and the multiprocessing module, but although I can use any two of these, I can't use all three together. Specifically, if from one file I import wx, and create a multiprocessing.Process which opens PyAudio, PyAudio won't open. Here's an example:
file: A.py
import wx
import time

use_multiprocessing = True
if use_multiprocessing:
    from multiprocessing import Process as X
else:
    from threading import Thread as X

import B

if __name__ == "__main__":
    p = X(target=B.worker)
    p.start()
    time.sleep(5.)
    p.join()
file: B.py
import pyaudio

def worker():
    print "11"
    feed = pyaudio.PyAudio()
    print "22"
    feed.terminate()
In all my tests I see 11 printed, but the problem is that I don't see 22 for the program as shown.
If I only comment out import wx I see 22 and pyaudio loads
If I only set use_multiprocessing=False so I use threading instead, I see 22 and pyaudio loads.
If I do something else in worker, it will run (only pyaudio doesn't run)
I've tried this with Python 2.6 and 2.7; PyAudio 0.2.4, 0.2.7, and 0.2.8; and wx 3.0.0.0 and 2.8.12.1; and I'm using OSX 10.9.4
There are two reasons this can happen, but they look pretty much the same.
Either way, the root problem is that multiprocessing is just forking a child. This could be either causing CoreFoundation to get confused about its runloop*, or causing some internal objects inside wx to get confused about its threads.**
But you don't care why your child process is deadlocking; you want to know how to fix it.
The simple solution is to, instead of trying to fork and then clean up all the stuff that shouldn't be copied, spawn a brand-new Python process and then copy over all the stuff that should.
As of Python 3.4, there are actually two variations on this. See Contexts and start methods for details, and issue #8713 for the background.
But you're on 2.6, so that doesn't help you. So, what can you do?
The easiest answer is to switch from multiprocessing to the third-party library billiard. billiard is a fork of Python 2.7's multiprocessing, which adds many of the features and bug fixes from both Python 3.x and Celery.
I believe new versions have the exact same fix as Python 3.4, but I'm not positive (sorry, I don't have it installed, and can't find the docs online…).
But I'm sure that it has a similar but different solution, inherited from Celery: call billiard.forking_enable(False) before calling anything else on the library. (Or, from outside the program, set the environment variable MULTIPROCESSING_FORKING_DISABLE=1.)
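As a sketch only, based on the description above rather than billiard's documentation (so treat forking_enable and its behavior as an assumption to verify against your installed version), this might look like:

# Sketch based on the description above -- verify forking_enable against your billiard version.
import billiard
billiard.forking_enable(False)  # ask billiard to spawn a fresh process instead of forking

import B  # the module with the pyaudio worker, as in the question

if __name__ == "__main__":
    p = billiard.Process(target=B.worker)
    p.start()
    p.join()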
* Usually, CF can detect the problem and call __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__, which logs an error message and fails. But sometimes it can't, and will end up waiting forever for an event that nobody can send. Google that string for more information.
** See #5527 for details on the equivalent issue with threaded Tkinter, and the underlying problem. This one affects all BSD-like *nixes, not just OS X.
If you can't solve the problem by fixing or working around multiprocessing, there's another option. If you can spin off the child process before you create your main runloop or create any threads, you can prevent the child process from getting confused. This doesn't always work, but it often does, so it may be worth trying.
That's easy to do with Tkinter or PySide or another library that doesn't actually do anything until you call a function like mainloop or construct an App instance.
But with wx, I think it does some of the setup before you even touch anything beyond the import. So, you may have to do something a little hacky and move the import wx after the p.start().
In your real app, you probably aren't going to want to start doing audio until some trigger from the GUI. This means you'll need to create some kind of sync object, like an Event. So, you create the Event, then start the child process. The child initializes the audio, and then waits on the Event. And then, where you'd like to launch the child from the GUI, you instead just signal the Event.
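A rough sketch of that Event-based approach, in the Python 2 style of the question (the exact GUI wiring is hypothetical):

# A.py -- sketch: create the Event and start the child before touching wx
import time
from multiprocessing import Process, Event

import B  # B.worker is assumed here to accept the Event (see the sketch of B.py below)

if __name__ == "__main__":
    start_audio = Event()
    p = Process(target=B.worker, args=(start_audio,))
    p.start()              # the child is forked before wx does any of its setup

    import wx              # only now bring in the GUI
    # ... build the wx App; in the GUI handler that should start audio, call start_audio.set()
    time.sleep(5.)         # placeholder for the GUI main loop in this sketch
    start_audio.set()
    p.join()

# B.py -- sketch: initialize audio first, then wait for the GUI's signal
import pyaudio

def worker(start_audio):
    feed = pyaudio.PyAudio()   # initialize audio before waiting
    start_audio.wait()         # block until the GUI signals that it's time to start
    print "audio triggered"
    feed.terminate()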
I'm using the PyDev for Eclipse plugin, and I'm trying to set a break point in some code that gets run in a background thread. The break point never gets hit even though the code is executing. Here's a small example:
import thread

def go(count):
    print 'count is %d.' % count # set break point here

print 'calling from main thread:'
go(13)

print 'calling from bg thread:'
thread.start_new_thread(go, (23,))

raw_input('press enter to quit.')
The break point in that example gets hit when it's called on the main thread, but not when it's called from a background thread. Is there anything I can do, or is that a limitation of the PyDev debugger?
Update
Thanks for the work arounds. I submitted a PyDev feature request, and it has been completed. It should be released with version 1.6.0. Thanks, PyDev team!
The problem is that there's no API in the thread module to know when a thread starts.
What you can do in your example is set the debugger trace function yourself (as Alex pointed out), as in the code below (if you're not in the remote debugger, the pydevd.connected = True is currently required -- I'll change PyDev so that this is not needed anymore). You may want to add a try..except ImportError around the pydevd import (which will fail if you're not running in the debugger).
def go(count):
    import pydevd
    pydevd.connected = True
    pydevd.settrace(suspend=False)
    print 'count is %d.' % count # set break point here
Now, on second thought, I think that PyDev can replace the start_new_thread method in the thread module with its own function, which will set up the debugger and then call the original function (I just did that and it seems to be working, so if you use the nightly build that will be available in a few hours, which will become the future 1.6.0, it should work without doing anything special).
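As an illustration of that wrapping idea only (this is not PyDev's actual implementation), replacing thread.start_new_thread might look roughly like this:

# Rough illustration of wrapping thread.start_new_thread -- not PyDev's actual code.
import thread

_original_start_new_thread = thread.start_new_thread

def _traced_start_new_thread(func, args, kwargs=None):
    def wrapper(*a, **kw):
        try:
            import pydevd
            pydevd.connected = True          # needed when not using the remote debugger (see above)
            pydevd.settrace(suspend=False)   # install the trace function in this new thread
        except ImportError:
            pass                             # not running under the debugger
        return func(*a, **kw)
    return _original_start_new_thread(wrapper, args, kwargs or {})

thread.start_new_thread = _traced_start_new_thread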
The underlying issue is with sys.settrace, the low-level Python function used to perform all tracing and debugging -- as the docs say,
The function is thread-specific; for a
debugger to support multiple threads,
it must be registered using settrace()
for each thread being debugged.
I believe that when you set a breakpoint in PyDev, the resulting settrace call is always happening on the main thread (I have not looked at PyDev recently so they may have added some way to work around that, but I don't recall any from the time when I did look).
A workaround you might implement yourself is, in your main thread after the breakpoint has been set, to use sys.gettrace to get PyDev's trace function, save it in a global variable, and make sure in all threads of interest to call sys.settrace with that global variable as the argument -- a tad cumbersome (more so for threads that already exist at the time the breakpoint is set!), but I can't think of any simpler alternative.
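A minimal sketch of that workaround, in the Python 2 style of the question (the helper names here are mine, not PyDev's):

# Sketch of the gettrace/settrace workaround described above; helper names are hypothetical.
import sys
import thread

_pydev_trace_func = None

def capture_trace_func():
    # Call this on the main thread, after the breakpoint has been set,
    # so that PyDev's trace function is already installed there.
    global _pydev_trace_func
    _pydev_trace_func = sys.gettrace()

def install_trace_func():
    # Call this at the top of every thread you want the debugger to see.
    if _pydev_trace_func is not None:
        sys.settrace(_pydev_trace_func)

def go(count):
    install_trace_func()
    print 'count is %d.' % count  # the break point here should now be hit

capture_trace_func()
thread.start_new_thread(go, (23,))
raw_input('press enter to quit.')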
On this question, I found a way to start the command-line debugger:
import pdb; pdb.set_trace()
It's not as easy to use as the Eclipse debugger, but it's better than nothing.
For me this worked, following one of Fabio's posts: after setting the trace with setTrace("000.000.000.000") (where the 0's are the IP of the computer running Eclipse/PyDev), call:
threading.settrace(pydevd.GetGlobalDebugger().trace_dispatch)