Python 3.7.7 Subinterpreter fails at multiprocessing.Process

Assume the following Python program from the official docs (The Process class):
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
Running this code against my 3.7.7 interpreter on my Windows machine works as expected, without any problems. However, running the same code against a Subinterpreter created in C++ fails with the following error (no exception is raised; the error is simply printed to the console):
unrecognised option '-c'
I assume that the reason for this error is to be found in spawn.py (within the multiprocessing module, line 89):
...
return [_python_exe] + opts + ['-c', prog, '--multiprocessing-fork']
...
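For what it's worth, the exact command line that the spawn start method would use can be inspected with the internal helper multiprocessing.spawn.get_command_line(). It is undocumented, so treat the following purely as a debugging sketch:

# Debugging sketch: print the command line multiprocessing would build to
# spawn a child; get_command_line() is internal and may change between versions.
from multiprocessing import spawn

print(spawn.get_command_line(pipe_handle=0))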
I could create my new process via Popen. This works, but the spawned process should be a child process, not a completely independent one.
My question:
Why does this error occur? Is there any way to spawn a child process within a Subinterpreter via multiprocessing.Process?
Thank you!
UPDATE 1
As suggested, adding freeze_support() fixes the error, but a new one occurs:
unrecognised option '--multiprocessing-fork'
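For reference, freeze_support() was added right after the __main__ guard; a minimal sketch of the placement (this clears the '-c' error in the embedded setup, but the '--multiprocessing-fork' error above then appears):

from multiprocessing import Process, freeze_support

def f(name):
    print('hello', name)

if __name__ == '__main__':
    freeze_support()  # must come directly after the __main__ guard
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()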

Related

Python `multiprocessing` spawned process using base Python, not virtualenv Python

On a standard installation of Python (e.g. via Miniconda), I run this script (also pasted below) and get the following output:
python test_python_multiprocessing.py
arg1: called directly
sys.executable: C:\ProgramData\Miniconda3\envs\python3_7_4\python.exe
-----
arg1: called via multiprocessing
sys.executable: C:\ProgramData\Miniconda3\envs\python3_7_4\python.exe
-----
The two exes:
C:\ProgramData\Miniconda3\envs\python3_7_4\python.exe
C:\ProgramData\Miniconda3\envs\python3_7_4\python.exe
This is what I expect.
When I run the same script from a Python in a virtual environment whose base is a Python embedded in a scriptable application, I get the following result:
arg1: called directly
sys.executable: C:\[virtual_environment_path]\Scripts\python.exe
-----
arg1: called via multiprocessing
sys.executable: C:\[application_with_python]\contrib\Python37\python.exe
-----
The two exes:
C:\[virtual_environment_path]\Scripts\python.exe
C:\[application_with_python]\contrib\Python37\python.exe
Traceback (most recent call last):
File ".\test_python_multiprocessing.py", line 67, in <module>
test_exes()
File ".\test_python_multiprocessing.py", line 64, in test_exes
assert exe1 == exe2
AssertionError
Crucially, the child sys.executable does not match the parent sys.executable, but instead matches the base of the parent.
I suspected that the Python that ships with the application had been altered, perhaps to have the spawned process point to a hard-coded Python path.
I have taken a look at the python standard libraries that ship with the application, and I do not find any discrepancy that explains this difference in behavior.
I tried manually setting the executable back to the expected default before creating the multiprocessing.Process, using multiprocessing.set_executable(sys.executable) or multiprocessing.get_context("spawn").set_executable(sys.executable). Neither has any effect.
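(For reference, the attempted workaround looked roughly like the following sketch; it runs, but the child still reports the base interpreter's path:)

import sys
import multiprocessing

if __name__ == "__main__":
    # Attempted workaround: force the spawn machinery to use the venv's python.exe.
    multiprocessing.set_executable(sys.executable)
    # or, on a specific context:
    # multiprocessing.get_context("spawn").set_executable(sys.executable)
    p = multiprocessing.Process(target=print, args=("hello from child",))
    p.start()
    p.join()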
What are possible explanations for the difference in behavior between a standard python installation, and this python that is embedded within a scriptable application? How can I investigate the cause, and force the correct python executable to be used when spawning child processes?
test_python_multiprocessing.py:
import multiprocessing

def functionality(arg1):
    import sys
    print("arg1: " + str(arg1))
    print("sys.executable: " + str(sys.executable))
    print("-----\n")
    return sys.executable

def worker(queue, arg1):
    import traceback
    try:
        retval = functionality(arg1)
        queue.put({"retval": retval})
    except Exception as e:
        queue.put({"exception": e, "traceback_str": traceback.format_exc()})
        raise
    finally:
        pass

def spawn_worker(arg1):
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(queue, arg1,))
    p.start()
    p.join()
    err_or_ret = queue.get()
    handle_worker_err(err_or_ret)
    if p.exitcode != 0:
        raise RuntimeError("Subprocess failed with code " + str(p.exitcode) + ", but no exception was thrown.")
    return err_or_ret["retval"]

def handle_worker_err(err_or_ret):
    if "retval" in err_or_ret:
        return None
    err = err_or_ret
    #import traceback
    if (err is not None):
        #traceback.print_tb(err["traceback"]) # TODO use e.g. tblib to get traceback
        print("The exception was thrown in the child process, reraised in parent process:")
        print(err["traceback_str"])
        raise err["exception"]

def test_exes():
    exe1 = functionality("called directly")
    exe2 = spawn_worker("called via multiprocessing")
    print("The two exes:")
    print(exe1)
    print(exe2)
    assert exe1 == exe2

if __name__ == "__main__":
    test_exes()
[EDIT] The fact that I detected the issue on a Python embedded in the scriptable application is a red herring. Making a virtual environment whose base is a "standard install" Python 3.7.4 has the same issue.
Long story short: using the venv's redirecting interpreter caused bugs in multiprocessing, so the developers decided to have children spawned from virtual environments launched with the base interpreter instead.
Link to issue 35797.
And this comment is pulled from popen_spawn_win32.py:
# bpo-35797: When running in a venv, we bypass the redirect
# executor and launch our base Python.
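(A quick way to observe this redirection from inside the venv is to compare sys.executable with sys._base_executable, the attribute the bypass falls back to; note that _base_executable is internal and undocumented:)

import sys

# In a Windows venv on recent 3.7.x these typically differ, and multiprocessing's
# spawn code launches the child with the base executable when running in a venv.
print("sys.executable:       ", sys.executable)
print("sys._base_executable: ", getattr(sys, "_base_executable", None))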
One solution is to launch the worker with subprocess instead, and exchange data through a socket via a manager rather than relying on multiprocessing's inherited pipes. The BaseManager documentation shows how to connect to a manager over a socket; Python makes it as simple as supplying the address and an authkey.
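A minimal sketch of that idea follows. Everything here (the address, the authkey, the file names) is a placeholder, and the worker is launched with subprocess using whichever python.exe you actually want:

# parent.py - serve a plain queue.Queue over a socket via BaseManager, then
# launch the worker as an ordinary subprocess (no multiprocessing spawn at all).
import subprocess
import sys
import threading
from multiprocessing.managers import BaseManager
from queue import Queue

results = Queue()

class QueueManager(BaseManager):
    pass

QueueManager.register("get_results", callable=lambda: results)

if __name__ == "__main__":
    manager = QueueManager(address=("127.0.0.1", 50000), authkey=b"secret")
    server = manager.get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Launch the worker with exactly the interpreter we want (here: this venv's).
    subprocess.check_call([sys.executable, "worker.py"])

    print(results.get())  # whatever the worker put on the queue

# worker.py - connect back to the parent's manager over the socket.
import sys
from multiprocessing.managers import BaseManager

class QueueManager(BaseManager):
    pass

QueueManager.register("get_results")
manager = QueueManager(address=("127.0.0.1", 50000), authkey=b"secret")
manager.connect()
manager.get_results().put("worker ran under: " + sys.executable)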
You can also try pathos, as its multiprocessing implementation is "different" (I think its pools use sockets, but I haven't dug into it; it spawns new workers differently and has its own quirks, but it can work in a few odd environments where multiprocessing fails).
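For illustration, a minimal pool with pathos might look like the sketch below (it assumes the third-party pathos package is installed; the function and arguments are made up):

from pathos.multiprocessing import ProcessingPool

def greet(name):
    return "hello " + name

if __name__ == "__main__":
    pool = ProcessingPool(nodes=2)            # two worker processes
    print(pool.map(greet, ["bob", "alice"]))  # ['hello bob', 'hello alice']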
Edit: another nice parallelizing alternative that actually uses sockets is Dask, but you have to start the workers separately, not through its built-in pool.
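A sketch of that setup with Dask (assuming the dask.distributed package, plus a scheduler and workers already started separately, e.g. via the dask-scheduler and dask-worker command-line tools; the address is a placeholder):

from dask.distributed import Client

def greet(name):
    return "hello " + name

if __name__ == "__main__":
    # Connect to an externally started scheduler instead of spawning local workers.
    client = Client("tcp://127.0.0.1:8786")
    print(client.submit(greet, "bob").result())  # 'hello bob'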

Why don't I see output from this function when calling it with multiprocessing? [duplicate]

A basic example of the multiprocessing Process class runs when executed from a file, but not from IDLE. Why is that, and can it be done?
from multiprocessing import Process

def f(name):
    print('hello', name)

p = Process(target=f, args=('bob',))
p.start()
p.join()
Yes. The following works, in that function f is run in a separate (third) process.
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
However, to see the print output, at least on Windows, one must start IDLE from a console like so.
C:\Users\Terry>python -m idlelib
hello bob
(Use idlelib.idle on 2.x.) The reason is that IDLE runs user code in a separate process. Currently the connection between the IDLE process and the user code process is via a socket. The fork done by multiprocessing does not duplicate or inherit the socket connection. When IDLE is started via an icon or Explorer (in Windows), there is nowhere for the print output to go. When started from a console with python (rather than pythonw), output goes to the console, as above.

Python Process which is joined will not call atexit

I thought Python Processes call their atexit functions when they terminate. Note that I'm using Python 2.7. Here is a simple example:
from __future__ import print_function
import atexit
from multiprocessing import Process

def test():
    atexit.register(lambda: print("atexit function ran"))

process = Process(target=test)
process.start()
process.join()
I'd expect this to print "atexit function ran" but it does not.
Note that this question:
Python process won't call atexit
is similar, but it involves Processes that are terminated with a signal, and the answer involves intercepting that signal. The Processes in this question are exiting gracefully, so (as far as I can tell anyway) that question & answer do not apply (unless these Processes are exiting due to a signal somehow?).
I did some research by looking at how this is implemented in CPython. This assumes you are running on Unix; if you are running on Windows, the following might not be valid, as the implementation of processes in multiprocessing differs there.
It turns out that os._exit() is always called at the end of the process. That, together with the following note from the documentation for atexit, should explain why your lambda isn't running.
Note: The functions registered via this module are not called when the
program is killed by a signal not handled by Python, when a Python
fatal internal error is detected, or when os._exit() is called.
Here's an excerpt from the Popen class for CPython 2.7, used for forking processes. Note that the last statement of the forked process is a call to os._exit().
# Lib/multiprocessing/forking.py

class Popen(object):

    def __init__(self, process_obj):
        sys.stdout.flush()
        sys.stderr.flush()
        self.returncode = None

        self.pid = os.fork()
        if self.pid == 0:
            if 'random' in sys.modules:
                import random
                random.seed()
            code = process_obj._bootstrap()
            sys.stdout.flush()
            sys.stderr.flush()
            os._exit(code)
In Python 3.4, the os._exit() is still there if you are using a fork-based start method, which is the default. But it seems you can change that; see Contexts and start methods for more information. I haven't tried it, but perhaps using the spawn start method would work? It's not available for Python 2.7, though.
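If you are on Python 3, a quick way to test that speculation is a sketch like the following (untested; it only switches the start method to spawn via a context):

from __future__ import print_function
import atexit
import multiprocessing

def test():
    atexit.register(lambda: print("atexit function ran"))

if __name__ == "__main__":
    # "spawn" is only available on Python 3.4+; the fork-based default on Unix
    # ends the child with os._exit(), which skips atexit handlers.
    ctx = multiprocessing.get_context("spawn")
    process = ctx.Process(target=test)
    process.start()
    process.join()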

Python Eclipse threaded subprocess.Popen() <terminated, exit value: 137>

I am running Python 2.7 on Ubuntu in Eclipse.
I am trying to call subprocess.Popen from a thread other than the main thread.
When I run this code from Eclipse:
# lsbt.py
import subprocess
import threading
import time

class someThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        p = subprocess.Popen(["ls", "/usr"], stdout=subprocess.PIPE)
        out = p.communicate()
        print "Done" + out[0]

def main():
    test = someThread()
    test.daemon = True
    test.start()

    while True:
        time.sleep(3600)

if __name__ == '__main__':
    main()
The whole Python program seems to exit at the subprocess.Popen() line.
Here is what eclipse says the call stack looks like:
<terminated>lsbt_1 lsbt.py [Python Run]
<terminated>lsbt.py
lsbt.py
<terminated, exit value: 137>lsbt.py
All debugging seems to stop as well and nothing is printed to the console.
When I run the subprocess code from the main thread in Eclipse, it seems to work well.
It does not seem to matter what command the subprocess.Popen runs, then only thing that seems to matter is that it is not being run from the main thread.
When I run the python code from the terminal, it works.
Could it be a problem with Eclipse?
#aabarnert commented: IIRC, errno 137 on linux is ENOTTY
One way to do it is to set:
daemon = False
I'm not sure why this works for Eclipse, but it does.
From Python Documentation:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left. The initial value is inherited from the creating thread. The flag can be set through the daemon property
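Applied to the question's main(), the workaround is simply the following sketch (the join() is my own tidier substitute for sleeping forever):

def main():
    test = someThread()
    test.daemon = False  # non-daemon: the program won't exit while the thread is running
    test.start()
    test.join()          # wait for the worker thread instead of sleeping in a loop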
