I have some Python code (on Windows) that uses the multiprocessing module to run a pool of worker processes. Each worker process needs to do some cleanup at the end of the map_async method.
Does anyone know how to do that?
Do you really want to run a cleanup function once for each worker process rather than once for every task created by the map_async call?
multiprocessing.pool.Pool creates a pool of, say, 8 worker processes. map_async might submit 40 tasks to be distributed among the 8 workers.
I can imagine why you might want to run cleanup code at the end of each task, but I'm having trouble imagining why you would want to run cleanup code just before each of the 8 worker processes is finalized.
Nevertheless, if that is what you want to do, you could do it by monkeypatching multiprocessing.pool.worker:
import multiprocessing as mp
import multiprocessing.pool as mpool
from multiprocessing.util import debug
def cleanup():
    print('{n} CLEANUP'.format(n=mp.current_process().name))

# This code comes from /usr/lib/python2.6/multiprocessing/pool.py,
# except for the single line at the end which calls cleanup().
def myworker(inqueue, outqueue, initializer=None, initargs=()):
    put = outqueue.put
    get = inqueue.get
    if hasattr(inqueue, '_writer'):
        inqueue._writer.close()
        outqueue._reader.close()
    if initializer is not None:
        initializer(*initargs)
    while 1:
        try:
            task = get()
        except (EOFError, IOError):
            debug('worker got EOFError or IOError -- exiting')
            break
        if task is None:
            debug('worker got sentinel -- exiting')
            break
        job, i, func, args, kwds = task
        try:
            result = (True, func(*args, **kwds))
        except Exception, e:
            result = (False, e)
        put((job, i, result))
    cleanup()

# Here we monkeypatch mpool.worker
mpool.worker = myworker

def foo(i):
    return i*i

def main():
    pool = mp.Pool(8)
    results = pool.map_async(foo, range(40)).get()
    print(results)

if __name__ == '__main__':
    main()
yields:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521]
PoolWorker-8 CLEANUP
PoolWorker-3 CLEANUP
PoolWorker-7 CLEANUP
PoolWorker-1 CLEANUP
PoolWorker-6 CLEANUP
PoolWorker-2 CLEANUP
PoolWorker-4 CLEANUP
PoolWorker-5 CLEANUP
Your only real option here is to run cleanup at the end of the function you pass to map_async.
If this cleanup is honestly intended to run at process death, you cannot rely on the concept of a pool. They are orthogonal: a pool does not dictate the process lifetime unless you use maxtasksperchild, which is new in Python 2.7. Even then, you do not gain the ability to run code at process death. However, maxtasksperchild might suit you, because any resources that the process opens will definitely go away when the process is terminated.
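For instance, here is a minimal sketch of how maxtasksperchild behaves (the function and pool size are illustrative, not from your code): each worker process is torn down and replaced after handling a single task, so anything the worker held dies with the process.
import multiprocessing as mp

def work(i):
    # stands in for a task that opens per-process resources
    return i * i

if __name__ == '__main__':
    # each worker exits after one task and a fresh worker replaces it
    pool = mp.Pool(processes=4, maxtasksperchild=1)
    print(pool.map(work, range(8)))
    pool.close()
    pool.join()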
That being said, if you have a bunch of functions that you need to run cleanup on, you can save duplication of effort by designing a decorator. Here's an example of what I mean:
import functools
import multiprocessing

def cleanup(f):
    """Decorator for shared cleanup mechanism"""
    @functools.wraps(f)
    def wrapped(arg):
        result = f(arg)
        print("Cleaning up after f({0})".format(arg))
        return result
    return wrapped

@cleanup
def task1(arg):
    print("Hello from task1({0})".format(arg))
    return arg * 2

@cleanup
def task2(arg):
    print("Bonjour from task2({0})".format(arg))
    return arg ** 2

def main():
    p = multiprocessing.Pool(processes=3)
    print(p.map(task1, [1, 2, 3]))
    print(p.map(task2, [1, 2, 3]))

if __name__ == "__main__":
    main()
When you execute this (barring stdout being jumbled because I'm not locking it here for brevity), the order you get things out should indicate that your cleanup task is running at the end of each task:
Hello from task1(1)
Cleaning up after f(1)
Hello from task1(2)
Cleaning up after f(2)
Hello from task1(3)
Cleaning up after f(3)
[2, 4, 6]
Bonjour from task2(1)
Cleaning up after f(1)
Bonjour from task2(2)
Cleaning up after f(2)
Bonjour from task2(3)
Cleaning up after f(3)
[1, 4, 9]
I have this simple async code that spawns sleep 3 and then waits for it to complete:
from asyncio import SelectorEventLoop, create_subprocess_exec, \
    wait_for, get_event_loop, set_event_loop

def run_timeout(loop, awaitable, timeout):
    timed_awaitable = wait_for(awaitable, timeout=timeout, loop=loop)
    return loop.run_until_complete(timed_awaitable)

async def foo(loop):
    process = await create_subprocess_exec('sleep', '3', loop=loop)
    await process.wait()
    print(process.returncode)
Notice how it takes a custom loop. If I run it with the following:
loop = get_event_loop()
run_timeout(loop, foo(loop), 5)
loop.close()
It works as expected (after 3 seconds sleep 3 completes successfully and 0 is printed). However, if I run it with my own event loop:
loop = SelectorEventLoop()
run_timeout(loop, foo(loop), 5)
loop.close()
I get a TimeoutError (from the wait_for in run_timeout):
Traceback (most recent call last):
File "test.py", line 15, in <module>
_run_async(loop, foo(loop), 5)
File "test.py", line 7, in _run_async
return loop.run_until_complete(timed_coroutine)
File "/usr/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
return future.result()
File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
raise self._exception
File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
result = coro.send(None)
File "/usr/lib/python3.5/asyncio/tasks.py", line 396, in wait_for
raise futures.TimeoutError()
concurrent.futures._base.TimeoutError
The only way I can get my custom event loop to work is if I set_event_loop() after creating my own SelectorEventLoop:
loop = SelectorEventLoop()
set_event_loop(loop)
run_timeout(loop, foo(loop), 3)
loop.close()
What gives here? Am I misunderstanding the docs? Must all event loops (that you use) be made the default one? If so, it seems useless to allow custom loops to be passed into many of the async methods (eg. create_subprocess_exec and wait_for), because the only value you could pass in is get_event_loop(), which is the default.
It's really strange. I debugged the program and found it is hard to say whether this is a bug.
To make a long story short: when executing create_subprocess_exec, you need not only an event loop but also a child watcher (which is used to monitor child processes). But create_subprocess_exec doesn't provide a way to set a custom child watcher; it just uses the default watcher, which is attached to the default event loop, not to the currently running one.
If you use the following code, it will work:
from asyncio import SelectorEventLoop, create_subprocess_exec, \
    wait_for, get_event_loop, set_event_loop, get_child_watcher

def run_timeout(loop, awaitable, timeout):
    timed_awaitable = wait_for(awaitable, timeout=timeout)
    return loop.run_until_complete(timed_awaitable)

async def foo():
    process = await create_subprocess_exec('sleep', '3')
    await process.wait()
    print(process.returncode)

loop = SelectorEventLoop()
# core line: get the default child watcher and attach it to your custom loop.
get_child_watcher().attach_loop(loop)
run_timeout(loop, foo(), 5)
loop.close()
And if you use set_event_loop to set the default loop, it also reattaches the default child watcher to the new default loop. That's why your workaround works.
It's really hard to say whether this is a bug or an API design problem. Should create_subprocess_exec let you pass a custom watcher? If it did, that could be confusing, since you only deal with a child watcher when you are working with child processes.
Let me start by saying that I'm not using a Queue, so this question is not a duplicate of this one and I'm not using a process pool, so it's not a duplicate of this one.
I have a Process object that uses a pool of thread workers to accomplish some task. For the sake of an MCVE, this task is just constructing a list of the integers from 0 to 9. Here's my source:
#!/usr/bin/env python3

from multiprocessing.pool import ThreadPool as Pool
from multiprocessing import Process
from sys import stdout

class Quest():
    def __init__(self):
        pass

    def doIt(self, i):
        return i

class Test(Process):
    def __init__(self, arg):
        super(Test, self).__init__()
        self.arg = arg
        self.pool = Pool()

    def run(self):
        quest = Quest()
        done = self.pool.map_async(quest.doIt, range(10), error_callback=print)
        stdout.flush()
        self.arg = [item for item in done.get()]

    def __str__(self):
        return str(self.arg)

    # I tried both with and without this method
    def join(self, timeout=None):
        self.pool.close()
        self.pool.join()
        super(Test, self).join(timeout)

test = Test("test")
print(test)  # should print 'test' (and does)

test.start()

# this line hangs forever
_ = test.join()

print(test)  # should print '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'
This is a pretty rough model of what I want my actual program to do. The problem, as indicated in the comments, is that Test.join always hangs forever. That's totally independent of whether or not that method is overridden in the Test class. It also never prints anything, but the output when I send a KeyboardInterrupt signal indicates that the problem lies in getting the results from the workers:
test
^CTraceback (most recent call last):
File "./test.py", line 44, in <module>
Process Test-1:
_ = test.join()
File "./test.py", line 34, in join
super(Test, self).join(timeout)
File "/path/to/multiprocessing/process.py", line 124, in join
res = self._popen.wait(timeout)
File "/path/to/multiprocessing/popen_fork.py", line 51, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "/path/to/multiprocessing/popen_fork.py", line 29, in poll
pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
Traceback (most recent call last):
File "/path/to/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "./test.py", line 25, in run
self.arg = [item for item in done.get()]
File "/path/to/multiprocessing/pool.py", line 638, in get
self.wait(timeout)
File "/path/to/multiprocessing/pool.py", line 635, in wait
self._event.wait(timeout)
File "/path/to/threading.py", line 551, in wait
signaled = self._cond.wait(timeout)
File "/path/to/threading.py", line 295, in wait
waiter.acquire()
KeyboardInterrupt
Why doesn't the stupid process just exit? The only thing a worker does is a single dereference and a function call that executes one operation; it should be really simple.
I forgot to mention: This works fine if I make Test a subclass of threading.Thread instead of multiprocessing.Process. I'm really not sure why this breaks it in half.
Your goal is to do this work asynchronously. Why not spawn the asynchronous subprocess workers from your main process WITHOUT spawning a child process (class Test)? The results will be available in your main process and no fancy stuff needs to be done. You can stop reading here if you choose to do this. Otherwise, read on.
Your join is running forever because there are two separate pools: one created when you instantiate the process object (local to your main process), and another created when you fork the process by calling process.start() (local to the spawned process).
For example, this doesn't work:
def __init__(self, arg, shared):
    super(Test, self).__init__()
    self.arg = arg
    self.quest = Quest()
    self.shared = shared
    self.pool = Pool()

def run(self):
    iterable = list(range(10))
    self.shared.extend(self.pool.map_async(self.quest.doIt, iterable, error_callback=print).get())
    print("1" + str(self.shared))
    self.pool.close()
However, this works:
def __init__(self, arg, shared):
    super(Test, self).__init__()
    self.arg = arg
    self.quest = Quest()
    self.shared = shared

def run(self):
    pool = Pool()
    iterable = list(range(10))
    self.shared.extend(pool.map_async(self.quest.doIt, iterable, error_callback=print).get())
    print("1" + str(self.shared))
    pool.close()
This has to do with the fact that when you spawn a process, the entire code, stack, and heap segments of your process are cloned into the new process, so that your main process and subprocess have separate contexts.
So, you are calling join() on the pool object created local to your main process, and that calls close() on the pool. Then, in run() there's another pool object that was cloned into the subprocess when start() was called, and that pool was never closed and cannot be joined in the way you're doing it. Simply put, your main process has no reference to the cloned pool object in the subprocess.
This works fine if I make Test a subclass of threading.Thread instead of multiprocessing.Process. I'm really not sure why this breaks it in half.
Makes sense, because threads differ from processes in that they have independent call stacks but share the other segments of memory, so any update you make to an object created in another thread is visible in your main process (which is the parent of these threads) and vice versa.
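As a small illustration of that difference (hypothetical classes, not your code): the same run() assigns to self.data, but only the Thread version is visible to the parent afterwards, because the Process version mutates a copy living in the child's cloned address space.
import threading
import multiprocessing

class ThreadWorker(threading.Thread):
    def __init__(self):
        super(ThreadWorker, self).__init__()
        self.data = None

    def run(self):
        self.data = list(range(10))   # same memory as the parent

class ProcessWorker(multiprocessing.Process):
    def __init__(self):
        super(ProcessWorker, self).__init__()
        self.data = None

    def run(self):
        self.data = list(range(10))   # only the child's copy is updated

if __name__ == '__main__':
    t = ThreadWorker()
    t.start()
    t.join()
    print(t.data)   # [0, 1, ..., 9] -- shared memory

    p = ProcessWorker()
    p.start()
    p.join()
    print(p.data)   # None -- the update stayed in the child process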
Resolution is to create the pool object local to the run() function. Close the pool object in the subprocess context, and join the subprocess in the main process. Which brings us to #2...
Shared state: There are these multiprocessing.Manager() objects that allow for some sort of magical process-safe shared state between processes. Doesn't seem like the manager allows for reassignment of object references, which makes sense, because if you reassign the managed value in a subprocess, when the subprocess is terminated, that process context (code, stack, heap) disappears and your main process never sees this assignment (since it was done referencing an object local to the context of the subprocess). It may work for ctype primitive values, though.
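For the "ctype primitive values" case, a quick hedged sketch (not part of the question's code) using multiprocessing.Value: the child writes through .value into shared memory rather than rebinding a Python attribute, so the parent sees the change after join().
import multiprocessing

def child(counter):
    counter.value = 42   # writes into shared memory, so the parent sees it

if __name__ == '__main__':
    counter = multiprocessing.Value('i', 0)   # shared C int, initial value 0
    p = multiprocessing.Process(target=child, args=(counter,))
    p.start()
    p.join()
    print(counter.value)   # 42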
If someone more experienced with Manager() wants to chime in on its innards, that'd be cool. But, the following code gives you your expected behavior:
#!/usr/bin/env python3

from multiprocessing.pool import ThreadPool as Pool
from multiprocessing import Process, Manager
from sys import stdout

class Quest():
    def __init__(self):
        pass

    def doIt(self, i):
        return i

class Test(Process):
    def __init__(self, arg, shared):
        super(Test, self).__init__()
        self.arg = arg
        self.quest = Quest()
        self.shared = shared

    def run(self):
        with Pool() as pool:
            iterable = list(range(10))
            self.shared.extend(pool.map_async(self.quest.doIt, iterable, error_callback=print).get())
        print("1" + str(self.shared))  # can remove, just to make sure we've updated state

    def __str__(self):
        return str(self.arg)

with Manager() as manager:
    res = manager.list()
    test = Test("test", res)
    print(test)  # should print 'test' (and does)
    test.start()
    test.join()
    print("2" + str(res))  # should print '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'
Outputs:
rpg711$ python multiprocess_async_join.py
test
1[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
When running using multiprocessing pool, I find that the worker process keeps running past a point where an exception is thrown.
Consider the following code:
import multiprocessing

def worker(x):
    print("input: " + x)
    y = x + "_output"
    raise Exception("foobar")
    print("output: " + y)
    return(y)

def main():
    data = [str(x) for x in range(4)]

    pool = multiprocessing.Pool(1)
    chunksize = 1
    results = pool.map(worker, data, chunksize)
    pool.close()
    pool.join()

    print("Printing results:")
    print(results)

if __name__ == "__main__":
    main()
The output is:
$ python multiprocessing_fail.py
input: 0
input: 1
input: 2
Traceback (most recent call last):
input: 3
File "multiprocessing_fail.py", line 25, in <module>
main()
File "multiprocessing_fail.py", line 16, in main
results = pool.map(worker, data, 1)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
return self.map_async(func, iterable, chunksize).get()
File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
raise self._value
Exception: foobar
As you can see, the worker process never proceeds beyond raise Exception("foobar") to the second print statement. However, it resumes work at the beginning of function worker() again and again.
I looked for an explanation in the documentation, but couldn't find any. Here is a potentially related SO question:
Keyboard Interrupts with python's multiprocessing Pool
But that is different (about keyboard interrupts not being picked by the master process).
Another SO question:
How to catch exceptions in workers in Multiprocessing
This question is also different, since in it the master process doesn't catch any exception, whereas here the master did catch the exception (line 16). More importantly, in that question the worker did not run past an exception (there is only one executable line for the worker).
I am running Python 2.7.
Comment: Pool should start one worker since the code has pool = multiprocessing.Pool(1).
From the Documentation:
A process pool object which controls a pool of worker processes to which jobs can be submitted
Comment: That one worker is running the worker() function multiple times
From the Documentation:
map(func, iterable[, chunksize])
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks.
Your worker() is the separate task. Renaming your worker() to task() could help to clarify what is what.
Comment: What I expect is that the worker process crashes at the Exception
It does: the separate task, your worker(), dies and Pool starts the next task.
What you want is Pool.terminate()
From the Documentation:
terminate()
Stops the worker processes immediately without completing outstanding work.
Question: ... I find that the worker process keeps running past a point where an exception is thrown.
You give iterable data to Pool, therefore Pool does what it has to do: it starts len(data) tasks.
data = [str(x) for x in range(4)]
The main question is: what do you expect to happen when the task hits
raise Exception("foobar")
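If what you want is for the remaining items to keep being processed while still recording which ones failed, one common pattern (a sketch, not from the original answer, reusing the question's inputs) is to catch the exception inside the task and return the failure as data, so map() completes and the caller can inspect each result:
import multiprocessing

def task(x):
    try:
        if x == "2":            # stand-in for an input that fails
            raise Exception("foobar")
        return (True, x + "_output")
    except Exception as e:
        return (False, e)       # report the failure instead of aborting the map

def main():
    pool = multiprocessing.Pool(1)
    results = pool.map(task, [str(x) for x in range(4)])
    pool.close()
    pool.join()
    print(results)

if __name__ == "__main__":
    main()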
If a module is imported from a script without a main guard (if __name__ == '__main__':), doing any kind of parallelism in some function in the module will result in an infinite loop on Windows. Each new process loads all of the sources, now with __name__ not equal to '__main__', and then continues execution in parallel. If there's no main guard, we're going to make another call to the same function in each of our new processes, spawning even more processes, until we crash. It's only a problem on Windows, but the scripts are also executed on macOS and Linux.
I could check this by writing to a special file on disk and reading from it to see if we've already started, but that limits us to a single Python script running at once. The simple solution of modifying all the calling code to add main guards is not feasible because the callers are spread out across many repositories, which I do not have access to. Thus, I would like to parallelize when main guards are used, but fall back to single-threaded execution when they're not.
How do I figure out if I'm being called in an import loop due to a missing main guard, so that I can fall back to single-threaded execution?
Here's some demo code:
lib with parallel code:
from multiprocessing import Pool

def _noop(x):
    return x

def foo():
    p = Pool(2)
    print(p.map(_noop, [1, 2, 3]))
Good importer (with guard):
from lib import foo
if __name__ == "__main__":
    foo()
Bad importer (without guard):
from lib import foo
foo()
where the bad importer fails with this RuntimeError, over and over again:
p = Pool(2)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 118, in Pool
context=self.get_context())
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\pool.py", line 168, in __init__
self._repopulate_pool()
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\pool.py", line 233, in _repopulate_pool
w.start()
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 313, in _Popen
return Popen(process_obj)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\popen_spawn_win32.py", line 34, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 144, in get_preparation_data
_check_not_importing_main()
File "C:\Users\filip.haglund\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 137, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Since you're using multiprocessing, you can also use it to detect if you're the main process or a child process. However, these features are not documented and are therefore just implementation details that could change without warning between python versions.
Each process has a name, _identity and _parent_pid. You can check any of them to see if you're in the main process or not. In the main process, name will be 'MainProcess', _identity will be (), and _parent_pid will be None.
My solution allows you to continue using multiprocessing, but just modifies child processes so they can't keep creating child processes forever. It uses a decorator to change foo to a no-op in child processes, but returns foo unchanged in the main process. This means that when a spawned child process tries to execute foo, nothing happens (as if it had been executed inside a __main__ guard).
from multiprocessing import Pool
from multiprocessing.process import current_process

def run_in_main_only(func):
    if current_process().name == "MainProcess":
        return func
    else:
        def noop(*args, **kwargs):
            pass
        return noop

def _noop(_ignored):
    p = current_process()
    return p.name, p._identity, p._parent_pid

@run_in_main_only
def foo():
    with Pool(2) as p:
        for result in p.map(_noop, [1, 2, 3]):
            print(result)  # prints something like ('SpawnPoolWorker-2', (2,), 10720)

if __name__ == "__main__":
    print(_noop(1))  # prints ('MainProcess', (), None)
I'm getting the following error when using the multiprocessing module within a python daemon process (using python-daemon):
Traceback (most recent call last):
File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/local/lib/python2.6/multiprocessing/util.py", line 262, in _exit_function
for p in active_children():
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 43, in active_children
_cleanup()
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
if p._popen.poll() is not None:
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
The daemon process (parent) spawns a number of processes (children) and then periodically polls the processes to see if they have completed. If the parent detects that one of the processes has completed, it then attempts to restart that process. It is at this point that the above exception is raised. It seems that once one of the processes completes, any operation involving the multiprocessing module will generate this exception. If I run the identical code in a non-daemon python script, it executes with no errors whatsoever.
EDIT:
Sample script
from daemon import runner

class DaemonApp(object):
    def __init__(self, pidfile_path, run):
        self.pidfile_path = pidfile_path
        self.run = run

        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'

def run():
    import multiprocessing as processing
    import time
    import os
    import sys
    import signal

    def func():
        print 'pid: ', os.getpid()
        for i in range(5):
            print i
            time.sleep(1)

    process = processing.Process(target=func)
    process.start()

    while True:
        print 'checking process'
        if not process.is_alive():
            print 'process dead'
            process = processing.Process(target=func)
            process.start()
        time.sleep(1)

# uncomment to run as daemon
app = DaemonApp('/root/bugtest.pid', run)
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()

# uncomment to run as regular script
#run()
Your problem is a conflict between the daemon and multiprocessing modules, in particular in the handling of the SIGCLD (child process terminated) signal. daemon sets SIGCLD to SIG_IGN when launching, which, at least on Linux, causes terminated children to be reaped immediately (rather than becoming zombies until the parent invokes wait()). But multiprocessing's is_alive test invokes wait() to see if the process is alive, and that fails if the process has already been reaped.
Simplest solution is just to set SIGCLD back to SIG_DFL (default behaviour -- ignore the signal and let the parent wait() for the terminated child process):
def run():
    # ...

    signal.signal(signal.SIGCLD, signal.SIG_DFL)

    process = processing.Process(target=func)
    process.start()

    while True:
        # ...
Ignoring SIGCLD also causes problems with the subprocess module, because of a bug in that module (issue 1731717, still open as of 2011-09-21).
This behaviour is addressed in version 1.4.8 of the python-daemon library; it now omits the default fiddling with SIGCLD, so no longer has this unpleasant interaction with other standard library modules.
I think there was a fix put into trunk and 2.6-maint a little while ago which should help with this. Can you try running your script against python-trunk or the latest 2.6-maint svn? I'm failing to pull up the bug information.
Looks like your error is coming at the very end of your process -- your clue's at the very start of your traceback, and I quote...:
File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
If atexit._run_exitfuncs is running, this clearly shows that your own process is terminating. So the error itself is a minor issue in a sense -- it just comes from some function that the multiprocessing module registered to run at exit from your process. The really interesting issue is: WHY is your main process exiting? I think this may be due to some uncaught exception: try setting the exception hook and showing rich diagnostic info before it gets lost in the OTHER exception caused by whatever multiprocessing has registered to run at exit...
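For example, a minimal sketch of "setting the exception hook" (the log path is just an illustration): install a sys.excepthook that writes the full traceback somewhere that survives the daemon's stdio redirection, so you can see what actually killed the main process.
import sys
import traceback

def dump_uncaught(exc_type, exc_value, exc_tb):
    # append the full traceback to a file that outlives the daemon's stdio
    with open('/tmp/daemon_crash.log', 'a') as f:
        traceback.print_exception(exc_type, exc_value, exc_tb, file=f)

sys.excepthook = dump_uncaught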
I'm running into this also using the celery distributed task manager under RHEL 5.3 with Python 2.6. My traceback looks a little different but the error is the same:
File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 334, in terminate
self._terminate()
File "/usr/local/lib/python2.6/multiprocessing/util.py", line 174, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 373, in _terminate_pool
p.terminate()
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 111, in terminate
self._popen.terminate()
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 136, in terminate
if self.wait(timeout=0.1) is None:
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 121, in wait
res = self.poll()
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
Quite frustrating.. I'm running the code through pdb now, but haven't spotted anything yet.
The original sample script has "import signal" but no use of signals. However, I had a script causing this error message, and it was due to my signal handling, so I'll explain here in case it's what is happening for others. Within a signal handler, I was doing stuff with processes (e.g. creating a new process). Apparently this doesn't work, so I stopped doing that within the handler and fixed the error. (Note: sleep() functions wake up after signal handling, so that can be an alternative approach to acting upon signals if you need to do things with processes.)
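A sketch of that workaround (SIGUSR1 is just an example signal and the worker function is hypothetical): the handler only records that the signal arrived, and the main loop, which wakes up when sleep() is interrupted, does the actual process management outside the handler.
import multiprocessing
import signal
import time

restart_requested = False

def handler(signum, frame):
    # keep the handler trivial: just record that a restart was requested
    global restart_requested
    restart_requested = True

def work():
    time.sleep(2)

if __name__ == '__main__':
    signal.signal(signal.SIGUSR1, handler)
    proc = multiprocessing.Process(target=work)
    proc.start()
    while True:
        time.sleep(5)   # returns early if a signal interrupts it
        if restart_requested and not proc.is_alive():
            proc = multiprocessing.Process(target=work)   # safe: not inside the handler
            proc.start()
            restart_requested = False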