If a module is imported from a script without a main guard (if __name__ == '__main__':), doing any kind of parallelism in some function in the module will result in an infinite loop on Windows. Each new process loads all of the sources, now with __name__ not equal to '__main__', and then continues execution in parallel. If there's no main guard, we're going to do another call to the same function in each of our new processes, spawning even more processes, until we crash. It's only a problem on Windows, but the scripts are also executed on macOS and Linux.
I could check this by writing to a special file on disk and reading it back to see if we've already started, but that limits us to a single Python script running at once. The simple solution of modifying all the calling code to add main guards is not feasible, because the callers are spread across many repositories that I do not have access to. Thus, I would like to parallelize when main guards are used, but fall back to single-threaded execution when they're not.
How do I figure out if I'm being called in an import loop due to a missing main guard, so that I can fall back to single-threaded execution?
Here's some demo code:
lib with parallel code:
from multiprocessing import Pool

def _noop(x):
    return x

def foo():
    p = Pool(2)
    print(p.map(_noop, [1, 2, 3]))
Good importer (with guard):
from lib import foo
if __name__ == "__main__":
    foo()
Bad importer (without guard):
from lib import foo
foo()
where the bad importer fails with this RuntimeError, over and over again:
p = Pool(2)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 118, in Pool
context=self.get_context())
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\pool.py", line 168, in __init__
self._repopulate_pool()
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\pool.py", line 233, in _repopulate_pool
w.start()
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 313, in _Popen
return Popen(process_obj)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\popen_spawn_win32.py", line 34, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 144, in get_preparation_data
_check_not_importing_main()
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 137, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Since you're using multiprocessing, you can also use it to detect whether you're the main process or a child process. However, these features are not documented and are therefore just implementation details that could change without warning between Python versions.
Each process has a name, an _identity and a _parent_pid. You can check any of them to see whether you're in the main process or not. In the main process, name will be 'MainProcess', _identity will be (), and _parent_pid will be None.
My solution lets you continue using multiprocessing, but modifies child processes so they can't keep creating child processes forever. It uses a decorator that turns foo into a no-op in child processes, but returns foo unchanged in the main process. This means that when a spawned child process tries to execute foo, nothing will happen (as if it had been called from inside a __main__ guard).
from multiprocessing import Pool
from multiprocessing.process import current_process

def run_in_main_only(func):
    if current_process().name == "MainProcess":
        return func
    else:
        def noop(*args, **kwargs):
            pass
        return noop

def _noop(_ignored):
    p = current_process()
    return p.name, p._identity, p._parent_pid

@run_in_main_only
def foo():
    with Pool(2) as p:
        for result in p.map(_noop, [1, 2, 3]):
            print(result)  # prints something like ('SpawnPoolWorker-2', (2,), 10720)

if __name__ == "__main__":
    print(_noop(1))  # prints ('MainProcess', (), None)
    foo()
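If the goal from the question is a genuine single-threaded fallback rather than a no-op, the same current_process() check can select a serial code path in child processes. This is a hypothetical variant of the idea above, not part of the original answer:

from multiprocessing import Pool
from multiprocessing.process import current_process

def _noop(x):
    return x

def foo():
    if current_process().name == "MainProcess":
        # Guarded callers in the main process still get real parallelism.
        with Pool(2) as p:
            print(p.map(_noop, [1, 2, 3]))
    else:
        # An unguarded caller re-imports its script in a spawned child; fall
        # back to plain serial execution instead of spawning more processes.
        print(list(map(_noop, [1, 2, 3])))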
I am trying to create a shared memory for my Python application, which should be used in the parent process and in another process that is spawned from that parent process. In most cases that works fine, however, sometimes I get the following stacktrace:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory: '/psm_47f7f5d7'
I want to emphasize that our code/application works fine in 99% of the time. We are spawning these new processes with new shared memory for each such process on a regular basis in our application (which is a server process, so it's running 24/7). Nearly all the time this works fine, only from time to time this error above is thrown, which then kills the whole application.
Update: I noticed that this problem occurs mainly when the application was running for a while already. When I start it up the creation of shared memory and spawning new processes works fine without this error.
The shared memory is created like this:
# Spawn context for multiprocessing
_mp_spawn_ctxt = multiprocessing.get_context("spawn")
_mp_spawn_ctxt_pipe = _mp_spawn_ctxt.Pipe
# Create shared memory
mem_size = width * height * bpp
shared_mem = shared_memory.SharedMemory(create=True, size=mem_size)
image = np.ndarray((height, width, bpp), dtype=np.uint8, buffer=shared_mem.buf)
parent_pipe, child_pipe = _mp_spawn_ctxt_pipe()
time.sleep(0.1)
# Spawn new process
# _CameraProcess is a custom class derived from _mp_spawn_ctxt.Process
proc = _CameraProcess(shared_mem, child_pipe)
proc.start()
Any ideas what could be the issue here?
I had a similar issue in a case where multiple processes had access to the shared memory/object and one of them updated it.
I solved these issues with the following steps:
I synchronized all operations on the shared memory/object via mutexes (see the multiprocessing samples at superfastpython, or "protect shared resources"). The critical parts of the code are create, update and delete, but also reading the content of the shared object/memory, because another process can be updating it at the same time.
I avoided libraries that only support single-threaded execution.
See sample code with synchronization:
import multiprocessing
import time

def increase(sharedObj, lock):
    for i in range(100):
        time.sleep(0.01)
        lock.acquire()
        sharedObj.value = sharedObj.value + 1
        lock.release()

def decrease(sharedObj, lock):
    for i in range(100):
        time.sleep(0.001)
        lock.acquire()
        sharedObj.value = sharedObj.value - 1
        lock.release()

if __name__ == '__main__':
    sharedObj = multiprocessing.Value('i', 1000)
    lock = multiprocessing.Lock()
    p1 = multiprocessing.Process(target=increase, args=(sharedObj, lock))
    p2 = multiprocessing.Process(target=decrease, args=(sharedObj, lock))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
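The same critical section reads more safely with the lock used as a context manager, so it is released even if the body raises. A minimal equivalent sketch of the increase() function:

import time

def increase(sharedObj, lock):
    for i in range(100):
        time.sleep(0.01)
        with lock:  # acquired here, released automatically even on error
            sharedObj.value = sharedObj.value + 1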
From the multiprocessing documentation, on the spawn start method:
The parent process starts a fresh python interpreter process. The child process will only inherit those resources necessary to run the process objects run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver. [Available on Unix and Windows. The default on Windows and macOS.]
import multiprocessing as mp
import signal

FOO = 10

def foo():
    assert FOO == 0

def test1():
    global FOO
    FOO = 0
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=foo, args=())
    p.start()
    p.join()

def bar():
    assert signal.getsignal(signal.SIGTERM) == signal.SIG_DFL

def test2():
    original = signal.getsignal(signal.SIGTERM)
    assert original == signal.SIG_DFL
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=bar, args=())
    p.start()
    p.join()

if __name__ == "__main__":
    test1()
    test2()
output:
Process SpawnProcess-1:
Traceback (most recent call last):
File "/Users/foo/opt/miniconda3/envs/iris-dev/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/Users/foo/opt/miniconda3/envs/iris-dev/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/Users/foo/Work/iris/iris2/example.py", line 7, in foo
assert FOO == 0
AssertionError
Process SpawnProcess-2:
Traceback (most recent call last):
File "/Users/foo/opt/miniconda3/envs/iris-dev/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/Users/foo/opt/miniconda3/envs/iris-dev/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/Users/foo/Work/iris/iris2/example.py", line 18, in bar
assert signal.getsignal(signal.SIGTERM) == signal.SIG_DFL
AssertionError
In test1, FOO is mutated in the parent process, but the child doesn't see this modification.
But in test2, the mutated signal-handler state is reflected in both the child and the parent (maybe because it's C-level state?).
I know that with fork, children should have the same memory as the parent at the fork point, but it seems that's not always true for spawn.
So my questions are:
What's happening during spawn?
Why does the memory state of child and parent sometimes differ and sometimes not?
Update: I have a test3, where I register a signal handler (Python code) in the parent process, and it is not replicated to the child process by spawn:
def bar():
    # assert this is not the handler installed in the parent
    assert signal.getsignal(signal.SIGTERM) == signal.SIG_DFL

def dummy(signum, frame):
    print("dummy", signum, frame)

def test3():
    original = signal.getsignal(signal.SIGTERM)
    assert original == signal.SIG_DFL
    signal.signal(signal.SIGTERM, dummy)
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=bar, args=())
    p.start()
    p.join()
Python relies on the exec primitive to implement the spawn start method on UNIX platforms.
When a new process is forked, exec loads a new Python interpreter and points it to the module and function you are giving as a target to your Process object. When that module is loaded, the if __name__ == "__main__": check evaluates to False. This prevents your logic from entering an endless loop that would end up spawning infinite processes.
Assuming you are executing this code on a UNIX machine, this is the correct behaviour based on POSIX specifications.
This volume of POSIX.1-2017 specifies that signals set to SIG_IGN remain set to SIG_IGN, and that the new process image inherits the signal mask of the thread that called exec in the old process image.
This applies only to SIG_IGN. In fact, in test3 you can observe how your handler is reset.
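As a practical consequence, a child started with spawn that needs a Python-level handler has to install it again itself. A minimal sketch along the lines of test3 (my own assumption, not part of the answer above):

import multiprocessing as mp
import signal

def dummy(signum, frame):
    print("dummy", signum, frame)

def bar():
    # Re-install the handler in the child: exec reset it to SIG_DFL.
    signal.signal(signal.SIGTERM, dummy)
    assert signal.getsignal(signal.SIGTERM) is dummy

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=bar, args=())
    p.start()
    p.join()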
I'm not sure why, but yesterday I was testing some multiprocessing code that I wrote and it was working fine. Then today when I checked the code again, it would give me this error:
Exception in thread Thread-5:
Traceback (most recent call last):
File "C:\Python32\lib hreading.py", line 740, in _bootstrap_inner
self.run()
File "C:\Python32\lib hreading.py", line 693, in run
self._target(*self._args, **self._kwargs)
File "C:\Python32\lib\multiprocessing\pool.py", line 342, in _handle_tasks
put(task)
File "C:\Python32\lib\multiprocessing\pool.py", line 439, in __reduce__
'pool objects cannot be passed between processes or pickled'
NotImplementedError: pool objects cannot be passed between processes or pickled
The structure of my code goes as follows:
* I have 2 modules, say A.py, and B.py.
* A.py has a class defined in it called A.
* B.py similarly has class B.
* In class A I have a multiprocessing pool as one of the attributes.
* The pool is defined in A.__init__(), but used in another method - run()
* In A.run() I set some attributes of some objects of class B (which are collected in a list called objBList), and then I use pool.map(processB, objBList)
* processB() is a module function (in A.py) that receives an instance of B as its only parameter and calls B.runInputs()
* the error happens at the pool.map() line.
basically in A.py:
class A:
    def __init__(self):
        self.pool = multiprocessing.Pool(7)

    def run(self):
        for b in objBList:
            b.inputs = something
        result = self.pool.map(processB, objBList)
        return list(result)

def processB(objB):
    objB.runInputs()
and in B.py:
class B:
    def runInputs(self):
        do_something()
BTW, I'm forced to use the processB() module function because of the way multiprocessing works on Windows.
Also I would like to point out that the error I am getting - that pool can't be pickled - shouldn't be referring to any part of my code, as I'm not trying to send the child processes any Pool objects.
Any ideas?
(PS: I should also mention that in between the two days that I was testing this function the computer restarted unexpectedly - possibly after installing windows updates.)
Perhaps your class B objects contain a reference to your A instance.
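A minimal, hypothetical reconstruction of how such a back-reference drags the Pool into the pickle, together with one way to break the cycle by dropping the reference in __getstate__ (names are only illustrative):

import multiprocessing

class B:
    def __init__(self, parent):
        self.parent = parent  # back-reference to the A instance

    def __getstate__(self):
        # One possible fix: drop the back-reference when B is pickled for the
        # workers, so A (and therefore A.pool) is never pickled.
        state = self.__dict__.copy()
        state.pop('parent', None)
        return state

    def runInputs(self):
        pass

def processB(objB):
    objB.runInputs()

class A:
    def __init__(self):
        self.pool = multiprocessing.Pool(2)

    def run(self):
        objBList = [B(self) for _ in range(3)]
        # Without B.__getstate__, pickling each B would also pickle self.parent
        # (this A instance) and therefore self.pool, which raises the
        # NotImplementedError shown in the question.
        return self.pool.map(processB, objBList)

if __name__ == "__main__":
    print(A().run())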
I have some Python code (on Windows) that uses the multiprocessing module to run a pool of worker processes. Each worker process needs to do some cleanup at the end of the map_async method.
Does anyone know how to do that?
Do you really want to run a cleanup function once for each worker process rather than once for every task created by the map_async call?
multiprocessing.pool.Pool creates a pool of, say, 8 worker processes. map_async might submit 40 tasks to be distributed among the 8 workers.
I can imagine why you might want to run cleanup code at the end of each task, but I'm having trouble imagining why you would want to run cleanup code just before each of the 8 worker processes is finalized.
Nevertheless, if that is what you want to do, you could do it by monkeypatching multiprocessing.pool.worker:
import multiprocessing as mp
import multiprocessing.pool as mpool
from multiprocessing.util import debug

def cleanup():
    print('{n} CLEANUP'.format(n=mp.current_process().name))

# This code comes from /usr/lib/python2.6/multiprocessing/pool.py,
# except for the single line at the end which calls cleanup().
def myworker(inqueue, outqueue, initializer=None, initargs=()):
    put = outqueue.put
    get = inqueue.get
    if hasattr(inqueue, '_writer'):
        inqueue._writer.close()
        outqueue._reader.close()
    if initializer is not None:
        initializer(*initargs)
    while 1:
        try:
            task = get()
        except (EOFError, IOError):
            debug('worker got EOFError or IOError -- exiting')
            break
        if task is None:
            debug('worker got sentinel -- exiting')
            break
        job, i, func, args, kwds = task
        try:
            result = (True, func(*args, **kwds))
        except Exception, e:
            result = (False, e)
        put((job, i, result))
    cleanup()

# Here we monkeypatch mpool.worker
mpool.worker = myworker

def foo(i):
    return i*i

def main():
    pool = mp.Pool(8)
    results = pool.map_async(foo, range(40)).get()
    print(results)

if __name__ == '__main__':
    main()
yields:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521]
PoolWorker-8 CLEANUP
PoolWorker-3 CLEANUP
PoolWorker-7 CLEANUP
PoolWorker-1 CLEANUP
PoolWorker-6 CLEANUP
PoolWorker-2 CLEANUP
PoolWorker-4 CLEANUP
PoolWorker-5 CLEANUP
Your only real option here is to run cleanup at the end of the function you pass to map_async.
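A minimal sketch of that idea (names are hypothetical), using try/finally so the per-task cleanup runs in the worker whether or not the task raises:

import multiprocessing

def task(i):
    try:
        return i * i
    finally:
        # Runs in the worker process after every task, success or failure.
        print('cleaning up after task', i)

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    print(pool.map_async(task, range(8)).get())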
If this cleanup is genuinely intended to run at process death, you cannot use the concept of a pool. They are orthogonal. A pool does not dictate the process lifetime unless you use maxtasksperchild, which is new in Python 2.7. Even then, you do not gain the ability to run code at process death. However, maxtasksperchild might suit you, because any resources that the process opens will definitely go away when the process is terminated.
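For illustration, a minimal sketch of maxtasksperchild (Python 2.7+); each worker is retired after a fixed number of tasks, so whatever OS-level resources it held are released when that worker process exits, even though no user cleanup code runs at that moment:

import multiprocessing as mp

def work(i):
    return mp.current_process().name, i * i

if __name__ == '__main__':
    # Each worker process handles exactly one task and is then replaced.
    pool = mp.Pool(processes=2, maxtasksperchild=1)
    for name, value in pool.map(work, range(6)):
        print(name, value)  # worker names change as processes are recycled
    pool.close()
    pool.join()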
That being said, if you have a bunch of functions that you need to run cleanup on, you can save duplication of effort by designing a decorator. Here's an example of what I mean:
import functools
import multiprocessing

def cleanup(f):
    """Decorator for shared cleanup mechanism"""
    @functools.wraps(f)
    def wrapped(arg):
        result = f(arg)
        print("Cleaning up after f({0})".format(arg))
        return result
    return wrapped

@cleanup
def task1(arg):
    print("Hello from task1({0})".format(arg))
    return arg * 2

@cleanup
def task2(arg):
    print("Bonjour from task2({0})".format(arg))
    return arg ** 2

def main():
    p = multiprocessing.Pool(processes=3)
    print(p.map(task1, [1, 2, 3]))
    print(p.map(task2, [1, 2, 3]))

if __name__ == "__main__":
    main()
When you execute this (barring stdout being jumbled because I'm not locking it here for brevity), the order you get things out should indicate that your cleanup task is running at the end of each task:
Hello from task1(1)
Cleaning up after f(1)
Hello from task1(2)
Cleaning up after f(2)
Hello from task1(3)
Cleaning up after f(3)
[2, 4, 6]
Bonjour from task2(1)
Cleaning up after f(1)
Bonjour from task2(2)
Cleaning up after f(2)
Bonjour from task2(3)
Cleaning up after f(3)
[1, 4, 9]
I'm getting the following error when using the multiprocessing module within a python daemon process (using python-daemon):
Traceback (most recent call last):
File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/local/lib/python2.6/multiprocessing/util.py", line 262, in _exit_function
for p in active_children():
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 43, in active_children
_cleanup()
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
if p._popen.poll() is not None:
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
The daemon process (parent) spawns a number of processes (children) and then periodically polls the processes to see if they have completed. If the parent detects that one of the processes has completed, it then attempts to restart that process. It is at this point that the above exception is raised. It seems that once one of the processes completes, any operation involving the multiprocessing module will generate this exception. If I run the identical code in a non-daemon python script, it executes with no errors whatsoever.
EDIT:
Sample script
from daemon import runner

class DaemonApp(object):
    def __init__(self, pidfile_path, run):
        self.pidfile_path = pidfile_path
        self.run = run
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'

def run():
    import multiprocessing as processing
    import time
    import os
    import sys
    import signal

    def func():
        print 'pid: ', os.getpid()
        for i in range(5):
            print i
            time.sleep(1)

    process = processing.Process(target=func)
    process.start()

    while True:
        print 'checking process'
        if not process.is_alive():
            print 'process dead'
            process = processing.Process(target=func)
            process.start()
        time.sleep(1)

# uncomment to run as daemon
app = DaemonApp('/root/bugtest.pid', run)
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()

#uncomment to run as regular script
#run()
Your problem is a conflict between the daemon and multiprocessing modules, in particular in the handling of the SIGCLD (child process terminated) signal. daemon sets SIGCLD to SIG_IGN when launching, which, at least on Linux, causes terminated children to be reaped immediately (rather than becoming zombies until the parent invokes wait()). But multiprocessing's is_alive test invokes wait() to see if the process is alive, which fails if the process has already been reaped.
Simplest solution is just to set SIGCLD back to SIG_DFL (default behaviour -- ignore the signal and let the parent wait() for the terminated child process):
def run():
    # ...

    signal.signal(signal.SIGCLD, signal.SIG_DFL)

    process = processing.Process(target=func)
    process.start()

    while True:
        # ...
Ignoring SIGCLD also causes problems with the subprocess module, because of a bug in that module (issue 1731717, still open as of 2011-09-21).
This behaviour is addressed in version 1.4.8 of the python-daemon library; it now omits the default fiddling with SIGCLD, so no longer has this unpleasant interaction with other standard library modules.
I think there was a fix put into trunk and 2.6-maint a little while ago that should help with this. Can you try running your script against python-trunk or the latest 2.6-maint svn? I'm failing to pull up the bug information.
Looks like your error is coming at the very end of your process -- your clue's at the very start of your traceback, and I quote...:
File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
If atexit._run_exitfuncs is running, this clearly shows that your own process is terminating. So the error itself is, in a sense, a minor issue -- it comes from some function that the multiprocessing module registered to run "at exit" of your process. The really interesting issue is: WHY is your main process exiting? I think this may be due to some uncaught exception. Try setting the exception hook and showing rich diagnostic info before it gets lost behind the OTHER exception caused by whatever multiprocessing has registered to run at exit...
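For example, a minimal sketch of such an exception hook (hypothetical code, compatible with Python 2 and 3) that could be installed near the top of the daemon's run() so the original failure is printed before the at-exit machinery runs:

import sys
import traceback

def verbose_excepthook(exc_type, exc_value, exc_tb):
    # Dump full diagnostics before atexit handlers (including the one
    # multiprocessing registers) obscure the original failure.
    sys.stderr.write('Uncaught exception in main process:\n')
    traceback.print_exception(exc_type, exc_value, exc_tb)

sys.excepthook = verbose_excepthook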
I'm also running into this, using the celery distributed task manager under RHEL 5.3 with Python 2.6. My traceback looks a little different, but the error is the same:
File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 334, in terminate
self._terminate()
File "/usr/local/lib/python2.6/multiprocessing/util.py", line 174, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 373, in _terminate_pool
p.terminate()
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 111, in terminate
self._popen.terminate()
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 136, in terminate
if self.wait(timeout=0.1) is None:
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 121, in wait
res = self.poll()
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
Quite frustrating.. I'm running the code through pdb now, but haven't spotted anything yet.
The original sample script has "import signal" but no use of signals. However, I had a script causing this error message, and it was due to my signal handling, so I'll explain here in case it's what is happening for others. Within a signal handler, I was doing stuff with processes (e.g. creating a new process). Apparently this doesn't work, so I stopped doing that within the handler and fixed the error. (Note: sleep() functions wake up after signal handling, so that can be an alternative approach to acting upon signals if you need to do things with processes.)
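A minimal sketch of that approach (hypothetical code, Unix-only signal): the handler only records that the signal arrived, and the main loop, whose sleep() wakes early when a signal is delivered, does the actual process creation:

import multiprocessing
import signal
import time

_got_signal = False

def handler(signum, frame):
    # Do no process work here; just record that the signal arrived.
    global _got_signal
    _got_signal = True

def func():
    time.sleep(1)

if __name__ == '__main__':
    signal.signal(signal.SIGUSR1, handler)
    while True:
        time.sleep(5)  # wakes up early if a signal is delivered
        if _got_signal:
            _got_signal = False
            multiprocessing.Process(target=func).start()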