My coworker asked for my help with a daemon script he is working on. He was hitting a strange error involving a multiprocessing.Manager, which I managed to reproduce with the following five lines:
import multiprocessing, os, sys
mgr = multiprocessing.Manager()
pid = os.fork()
if pid > 0:
    sys.exit(0)
When run on CentOS 6 Linux and Python 2.6, I get the following error:
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/util.py", line 235, in _run_finalizers
finalizer()
File "/usr/lib64/python2.6/multiprocessing/util.py", line 174, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 576, in _finalize_manager
if process.is_alive():
File "/usr/lib64/python2.6/multiprocessing/process.py", line 129, in is_alive
assert self._parent_pid == os.getpid(), 'can only test a child process'
AssertionError: can only test a child process
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib64/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib64/python2.6/multiprocessing/util.py", line 269, in _exit_function
p.join()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 117, in join
assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib64/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib64/python2.6/multiprocessing/util.py", line 269, in _exit_function
p.join()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 117, in join
assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
I suspect the error is due to some interaction between os.fork and the multiprocessing.Manager, and that he should use the multiprocessing module to create new processes instead of os.fork. Can anyone confirm this and/or explain what is going on? If my hunch is correct, why is this the wrong place to use os.fork?
The issue is that the Manager creates a helper process and registers a finalizer to stop it at interpreter exit. Since the process's memory is copied (lazily) during the fork, both the parent and the child try to stop that helper process and wait for it to terminate. However, as the exception says, only the parent process is allowed to do that. If, instead of os.fork, you use multiprocessing.Process to spawn the new process, the child won't try to shut down the Manager at exit.
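For illustration, here is a minimal sketch of that alternative; child_work is a hypothetical stand-in for the real daemon logic. A child started via multiprocessing.Process resets multiprocessing's bookkeeping when it bootstraps, so it does not inherit the parent's finalizer for the Manager:

import multiprocessing

def child_work(shared):
    # hypothetical worker: Manager proxies can be passed to the child
    shared['status'] = 'done'

if __name__ == '__main__':
    mgr = multiprocessing.Manager()
    shared = mgr.dict()
    p = multiprocessing.Process(target=child_work, args=(shared,))
    p.start()
    p.join()  # only the parent joins; the Manager shuts down cleanly at exit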
Related
I am trying to run the code available at https://github.com/GDPlumb/MAPLE/blob/master/1-Accuracy/run.py, but the following error occurs. Any idea how I can solve it?
Traceback (most recent call last):
  ...
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "C:\Python39\lib\multiprocessing\pool.py", line 212, in __init__
    self._repopulate_pool()
  File "C:\Python39\lib\multiprocessing\pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "C:\Python39\lib\multiprocessing\pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "C:\Python39\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Python39\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Python39\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Python39\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Python39\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
P.S.: I tried adding a main guard:

if __name__ == '__main__':
    pool = Pool(12)
    pool.map(run, args)

but each spawned child still prints a traceback like the following (several copies arrive interleaved):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python39\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Python39\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Python39\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Python39\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Python39\lib\runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Python39\lib\runpy.py", line 97, in _run_module_code
    ...
You must set the multiprocessing context and start method explicitly. For my case, I had to use the 'fork' context:
ctx = multiprocessing.get_context('fork')
work_queue = ctx.Queue()
results_queue = ctx.Queue()
...
workers = get_worker_processes(
_process_data,
(task_function, work_queue, results_queue),
nproc=nproc,
)
...
workers = [
ctx.Process(target=f, args=args) for _ in range(num_procs)
]
Follow the guidance given in the Python multiprocessing documentation on contexts and start methods, quoted below. Notice the change in defaults.
Contexts and start methods: depending on the platform, multiprocessing
supports three ways to start a process. These start methods are:
spawn The parent process starts a fresh Python interpreter process.
The child process will only inherit those resources necessary to run
the process object’s run() method. In particular, unnecessary file
descriptors and handles from the parent process will not be inherited.
Starting a process using this method is rather slow compared to using
fork or forkserver.
Available on Unix and Windows. The default on Windows and macOS.
fork The parent process uses os.fork() to fork the Python interpreter.
The child process, when it begins, is effectively identical to the
parent process. All resources of the parent are inherited by the child
process. Note that safely forking a multithreaded process is
problematic.
Available on Unix only. The default on Unix.
forkserver When the program starts and selects the forkserver start
method, a server process is started. From then on, whenever a new
process is needed, the parent process connects to the server and
requests that it fork a new process. The fork server process is single
threaded so it is safe for it to use os.fork(). No unnecessary
resources are inherited.
Available on Unix platforms which support passing file descriptors
over Unix pipes.
Changed in version 3.8: On macOS, the spawn start method is now the
default. The fork start method should be considered unsafe as it can
lead to crashes of the subprocess. See bpo-33725.
Changed in version 3.4: spawn added on all unix platforms, and
forkserver added for some unix platforms. Child processes no longer
inherit all of the parent's inheritable handles on Windows.
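For the Windows traceback above, the practical fix is to create the pool only under the main guard (with spawn, each child re-imports the main module, so unguarded pool creation recurses). A minimal sketch, where run and its inputs stand in for the script's own function and data:

import multiprocessing
from multiprocessing import Pool

def run(arg):
    # placeholder for the real per-task work
    return arg * 2

if __name__ == '__main__':
    multiprocessing.freeze_support()  # only needed for frozen executables
    pool = Pool(processes=12)
    results = pool.map(run, range(24))
    pool.close()
    pool.join()
    print(results)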
I've never used the multiprocessing library before, so all advice is welcome.
I've got a Python program that uses the multiprocessing library to do some memory-intensive work across multiple processes. It occasionally runs out of memory (I'm working on optimizations, but that's not what this question is about). Sometimes an out-of-memory error is thrown in a way that I can't seem to catch (output below), and then the program hangs on pool.join() (I'm using multiprocessing.Pool). How can I make the program do something other than wait indefinitely when this happens?
Ideally, the memory error would be propagated back to the main process, which would then die.
Here's the memory error:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 325, in _handle_workers
pool._maintain_pool()
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 229, in _maintain_pool
self._repopulate_pool()
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
w.start()
File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
self._popen = Popen(self)
File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
And here's where I manage the multiprocessing:
mp_pool = mp.Pool(processes=num_processes)
mp_results = list()
for datum in input_data:
    data_args = {
        'value': 0  # actually some other simple dict key/values
    }
    mp_results.append(mp_pool.apply_async(_process_data, args=(common_args, data_args)))
mp_pool.close()
mp_pool.join()  # hangs here when that thread dies
for result_async in mp_results:
    result = result_async.get()
    # do stuff to collect results
# rest of the code
When I interrupt the hanging program, I get:
Process process_003:
Traceback (most recent call last):
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/queues.py", line 374, in get
return recv()
racquire()
KeyboardInterrupt
This is actually a known bug in Python's multiprocessing module, fixed in Python 3 (here's a summarizing blog post I found). There's a patch attached to Python issue 22393, but it hasn't been officially applied.
Basically, if one of a pool's sub-processes dies unexpectedly (out of memory, killed externally, etc.), the pool waits for it indefinitely.
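One common workaround, if you can put an upper bound on how long a task should take, is to fetch results with AsyncResult.get(timeout=...) so the parent raises instead of blocking forever. A minimal sketch, with _process_data standing in for the real worker:

import multiprocessing as mp

def _process_data(i):
    return i  # placeholder for the real memory-intensive work

if __name__ == '__main__':
    pool = mp.Pool(processes=4)
    results = [pool.apply_async(_process_data, args=(i,)) for i in range(8)]
    pool.close()
    try:
        # get(timeout=...) raises multiprocessing.TimeoutError rather than
        # waiting forever if a worker died and its task will never finish
        collected = [r.get(timeout=60) for r in results]
    except mp.TimeoutError:
        pool.terminate()  # kill the remaining workers and give up
        raise
    pool.join()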
I am getting the following error
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "python/file_download.py", line 30, in run
    self._downloadFile(host[0], host[1])
  File "python/file_download.py", line 57, in _downloadFile
    th.exit()
AttributeError: 'Thread' object has no attribute 'exit'
from the code below:
th = threading.Thread(
    target=self._fileWriteToDisk,
    args=(saveTo, u, file_name),
    name="fileWrite_Child_of_%s" % self.getName(),
)
th.setDaemon(False)
th.start()
print "Writing to disk using child: %s " % th.name
th.exit()
There is no need to kill the thread; Python threads end on their own when their target function returns.
Abruptly killing a thread is bad practice anyway: the thread could be holding resources that wouldn't be closed properly. A stop flag is a better way to end a thread; see this for an example.
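For instance, here is a minimal sketch of the stop-flag pattern using threading.Event; the worker body is hypothetical:

import threading
import time

def worker(stop_event):
    while not stop_event.is_set():
        # do one unit of work, then check the flag again
        time.sleep(0.1)
    # clean up any resources here before returning

stop_event = threading.Event()
t = threading.Thread(target=worker, args=(stop_event,))
t.start()
time.sleep(1)
stop_event.set()  # politely ask the thread to stop
t.join()          # the thread exits on its own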
You should be using the following instead of what you have:
th.daemon = False
as th.setDaemon(False) is deprecated.
I wrote the following sample program. It creates 8 threads and spawns a process in each one:
import threading
from multiprocessing import Process

def fast_function():
    pass

def thread_function():
    process_number = 1
    print 'start %s processes' % process_number
    for i in range(process_number):
        p = Process(target=fast_function, args=())
        p.start()
        p.join()

def main():
    threads_number = 8
    print 'start %s threads' % threads_number
    threads = [threading.Thread(target=thread_function, args=())
               for i in range(threads_number)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

main()
It crashes with several exceptions like this one:
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/usr/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "./repeat_multiprocessing_bug.py", line 15, in thread_function
p.start()
File "/usr/lib/python2.6/multiprocessing/process.py", line 99, in start
_cleanup()
File "/usr/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
if p._popen.poll() is not None:
File "/usr/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
Python version is 2.6.5. Can somebody explain what I'm doing wrong?
You're probably trying to run it from the interactive interpreter. Try writing your code to a file and running it as a Python script; it works on my machine.
See the explanation and examples at the Python multiprocessing docs.
The multiprocessing module has a thread-safety issue in 2.6.5. Your best bet is updating to a newer Python, or applying this patch to 2.6.5: http://hg.python.org/cpython/rev/41aef062d529/
The bug is described in more detail in the following links:
http://bugs.python.org/issue11891
http://bugs.python.org/issue1731717
I have a multiprocessing demo here, and I've run into some problems with it. I researched it for a night but couldn't work out the cause.
Can anyone help me?
I want one parent process to act as a producer: when tasks come in, the parent forks some children to consume them. The parent monitors the children, and if any of them exits with an exception, it is restarted by the parent.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from multiprocessing import Process, Queue
from Queue import Empty
import sys, signal, os, random, time
import traceback

child_process = []
child_process_num = 4
queue = Queue(0)

def work(queue):
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    time.sleep(10)  # demo sleep

def kill_child_processes(signum, frame):
    # terminate all children
    pass

def restart_child_process(signum, frame):
    global child_process
    for i in xrange(len(child_process)):
        child = child_process[i]
        try:
            if child.is_alive():
                continue
        except OSError, e:
            pass
        child.join()  # join this process to make sure there is no zombie process
        new_child = Process(target=work, args=(queue,))
        new_child.start()
        child_process[i] = new_child  # restart one new process
        child = None
    return

if __name__ == '__main__':
    reload(sys)
    sys.setdefaultencoding("utf-8")
    for i in xrange(child_process_num):
        child = Process(target=work, args=(queue,))
        child.start()
        child_process.append(child)
    signal.signal(signal.SIGINT, kill_child_processes)
    signal.signal(signal.SIGTERM, kill_child_processes)  # hook SIGTERM
    signal.signal(signal.SIGCHLD, restart_child_process)
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
When this program runs, it produces errors like the ones below:
Error in atexit._run_exitfuncs:
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/local/python/lib/python2.6/atexit.py", line 30, in _run_exitfuncs
traceback.print_exc()
File "/usr/local/python/lib/python2.6/traceback.py", line 227, in print_exc
print_exception(etype, value, tb, limit, file)
File "/usr/local/python/lib/python2.6/traceback.py", line 124, in print_exception
_print(file, 'Traceback (most recent call last):')
File "/usr/local/python/lib/python2.6/traceback.py", line 12, in _print
def _print(file, str='', terminator='\n'):
File "test.py", line 42, in restart_child_process
new_child.start()
File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 99, in start
_cleanup()
File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
if p._popen.poll() is not None:
File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
If I send a signal to one child with kill -SIGINT {child_pid}, I get:
[root@mail1 mail]# kill -SIGINT 32545
[root@mail1 mail]# Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/local/python/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/local/python/lib/python2.6/multiprocessing/util.py", line 269, in _exit_function
p.join()
File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 119, in join
res = self._popen.wait(timeout)
File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 117, in wait
return self.poll(0)
File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 4] Interrupted system call
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/local/python/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/local/python/lib/python2.6/multiprocessing/util.py", line 269, in _exit_function
p.join()
File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 119, in join
res = self._popen.wait(timeout)
File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 117, in wait
return self.poll(0)
File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 4] Interrupted system call
The main process waits for all child processes to terminate before it exits, so there is a blocking call (i.e. wait4) registered as an atexit handler. The signal you sent interrupts that blocking call, hence the stack trace.
The thing I'm not clear about is whether the signal sent to the child gets redirected to the parent process, which would then interrupt that wait4 call; that depends on Unix process-group behavior.
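If you control the waiting code, the usual defensive pattern is to retry the wait when a signal interrupts it (EINTR). A minimal sketch with a hypothetical wait_retry helper (since Python 3.5, PEP 475 makes os.waitpid retry automatically after EINTR, so this mainly matters on Python 2):

import errno
import os

def wait_retry(pid):
    # Retry os.waitpid when a signal interrupts the underlying wait4 call.
    while True:
        try:
            return os.waitpid(pid, 0)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
            # EINTR: interrupted by a signal; just retry the wait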