IDLE crash when using multiprocessing on Mac OSX

If I run this simple code in IDLE in Python 2.7.8, it will pop a window saying "The program is still running! Do you want to kill it?".
from multiprocessing import Pool

def foo(x):
    return x**2

if __name__ == '__main__':
    pool = Pool(2)
    pows = pool.map(foo, range(10))
    print pows
Whether I tell it to kill the program or not (it asks twice), nothing happens. I used to use Windows and I've only recently started using Mac OSX (10.9.4), so I don't know if I'm missing something here.
If I run the same code directly in the Python shell in Terminal, it runs fine. Same in an IPython notebook. It just won't run in IDLE, which keeps popping up that message box.
Any ideas? I'd like to keep using IDLE...
here's the log:
INFO:root:10221: Started process
INFO:root:10221: Defined foo
INFO:root:10221: __name__ == '__main__'
INFO:root:10221: pool created

Ref this:
https://docs.python.org/2/library/multiprocessing.html#introduction
Specifically, in the note:
"Functionality within this package requires that the __main__ module be
importable by the children. This is covered in Programming guidelines
however it is worth pointing out here. This means that some examples,
such as the multiprocessing.Pool examples will not work in the
interactive interpreter."
Here's a similar question: Child processes created with python multiprocessing module won't print
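The practical upshot of that note is that the function you hand to Pool should live in a module the worker processes can import. A minimal sketch of that layout (workers.py is just an illustrative name, and running the script from Terminal rather than inside IDLE is still the more reliable route):

# workers.py -- hypothetical helper module saved next to your script
def foo(x):
    return x**2

# main script (or an interactive session that can import workers)
from multiprocessing import Pool
from workers import foo

if __name__ == '__main__':
    pool = Pool(2)
    print pool.map(foo, range(10))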
Example of logging activity to a file:
#!/usr/bin/env python
import logging
from multiprocessing import Pool
import os

logging.basicConfig(filename='example.log', level=logging.DEBUG)

def log_msg(msg):
    logging.info("{}: {}".format(os.getpid(), msg))

log_msg("Started process")

def foo(x):
    log_msg("running foo")
    return x**2

log_msg("Defined foo")

if __name__ == '__main__':
    log_msg("__name__ == '__main__'")
    pool = Pool(2)
    log_msg("pool created")
    pows = pool.map(foo, range(10))
    log_msg("map completed")
    print pows
    log_msg("output printed")

log_msg("Finished running")
Example output for me:
tom@fannybawz:~$ ./multiproc.py
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
tom@fannybawz:~$ cat example.log
INFO:root:22238: Started process
INFO:root:22238: Defined foo
INFO:root:22238: __name__ == '__main__'
INFO:root:22238: pool created
INFO:root:22240: {}: running foo
INFO:root:22239: {}: running foo
INFO:root:22240: {}: running foo
INFO:root:22239: {}: running foo
INFO:root:22240: {}: running foo
INFO:root:22239: {}: running foo
INFO:root:22240: {}: running foo
INFO:root:22239: {}: running foo
INFO:root:22240: {}: running foo
INFO:root:22240: {}: running foo
INFO:root:22238: map completed
INFO:root:22238: output printed
INFO:root:22238: Finished running
tom@fannybawz:~$
Try the same thing yourself with the Process version.
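For comparison, a Process-based version of the same logging experiment might look roughly like this (a sketch; example_process.log is just an illustrative filename):

#!/usr/bin/env python
import logging
import os
from multiprocessing import Process

logging.basicConfig(filename='example_process.log', level=logging.DEBUG)

def log_msg(msg):
    logging.info("{}: {}".format(os.getpid(), msg))

def foo(x):
    log_msg("running foo({})".format(x))

if __name__ == '__main__':
    log_msg("starting Process workers")
    procs = [Process(target=foo, args=(i,)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    log_msg("all Process workers joined")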

This was a known issue with previous versions of PyCharm. If you upgrade to the latest version, you can now safely use multiprocessing within the console of the IDE without running into this issue any longer.
See here for further information: https://youtrack.jetbrains.com/issue/PY-14969

Related

How to run 10 python programs simultaneously?

I have a_1.py~a_10.py
I want to run 10 python programs in parallel.
I tried:
from multiprocessing import Process
import os

def info(title):
    ...  # I want to execute python program

def f(name):
    for i in range(1, 11):
        subprocess.Popen(['python3', f'a_{i}.py'])

if __name__ == '__main__':
    info('main line')
    p = Process(target=f)
    p.start()
    p.join()
but it doesn't work
How do I solve this?
I would suggest using the subprocess module instead of multiprocessing:
import os
import subprocess
import sys

MAX_SUB_PROCESSES = 10

def info(title):
    print(title, flush=True)

if __name__ == '__main__':
    info('main line')
    # Create a list of subprocesses.
    processes = []
    for i in range(1, MAX_SUB_PROCESSES+1):
        pgm_path = f'a_{i}.py'  # Path to Python program.
        command = f'"{sys.executable}" "{pgm_path}" "{os.path.basename(pgm_path)}"'
        process = subprocess.Popen(command, bufsize=0)
        processes.append(process)
    # Wait for all of them to finish.
    for process in processes:
        process.wait()
    print('Done')
If you just need to call 10 external py scripts (a_1.py ~ a_10.py) as separate processes, use the subprocess.Popen class:
import subprocess, sys

for i in range(1, 11):
    subprocess.Popen(['python3', f'a_{i}.py'])
# sys.exit()  # optional
It's worth looking at the rich subprocess.Popen signature (you may find some useful params/options).
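If you also want the launching script to wait until all ten programs have exited, a small extension of that idea could look like this (assuming the a_1.py ~ a_10.py files sit in the current directory):

import subprocess

# launch all ten scripts, keeping the Popen handles
procs = [subprocess.Popen(['python3', f'a_{i}.py']) for i in range(1, 11)]

# block until each one finishes
for p in procs:
    p.wait()

print('all scripts finished')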
You can use a multiprocessing pool to run them concurrently.
import multiprocessing as mp

def worker(module_name):
    """ Executes a module externally with python """
    __import__(module_name)
    return

if __name__ == "__main__":
    max_processes = 5
    module_names = [f"a_{i}" for i in range(1, 11)]
    print(module_names)
    with mp.Pool(max_processes) as pool:
        pool.map(worker, module_names)
The max_processes variable is the maximum number of workers to have working at any given time. In other words, it's the number of processes spawned by your program. The pool.map(worker, module_names) call uses the available processes and calls worker on each item in your module_names list. We don't include the .py because we're running the module by importing it.
Note: This might not work if the code you want to run in your modules is contained inside if __name__ == "__main__" blocks. If that is the case, then my recommendation would be to move all the code in the if __name__ == "__main__" blocks of the a_{} modules into a main function. Additionally, you would have to change the worker to something like:
def worker(module_name):
    module = __import__(module_name)  # Kind of like 'import module_name as module'
    module.main()
    return
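For example, each a_{i}.py could then be structured roughly like this (a sketch; the body of main() stands in for whatever the script currently does under its __main__ guard):

# a_1.py -- sketched restructuring
def main():
    # ... the work a_1.py used to do directly under the __main__ guard ...
    print("a_1 running")

if __name__ == "__main__":
    main()  # still works when launched directly: python3 a_1.py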

Python multiprocessing.Process() got different behavior in cmd console and spyder

So this is a code snippet from the Python docs to demonstrate the multiprocessing module.
I opened an empty file 'multi_process_test.py' and typed in this code.
from multiprocessing import Process
import sys

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
And the weird part is: if I run it under Spyder, the program prints nothing, but if I run the source code in a PowerShell environment with 'python multi_process_test.py', the console prints 'hello bob' as expected. Can somebody help explain why this happens, or give any hint on how to resolve this difference?
Thanks for your help.

Python multiprocessing pool never finishes

I am running the following (example) code:
from multiprocessing import Pool

def f(x):
    return x*x

pool = Pool(processes=4)
print pool.map(f, range(10))
However, the code never finishes. What am I doing wrong?
The line
pool = Pool(processes=4)
completes successfully; the code appears to hang at the last line. Not even pressing Ctrl+C interrupts the execution. I am running the code inside an IPython console in Spyder.
from multiprocessing import Pool

def f(x):
    return x * x

def main():
    pool = Pool(processes=3)  # set the max number of processes to 3
    result = pool.map(f, range(10))
    pool.close()
    pool.join()
    print(result)
    print('end')

if __name__ == "__main__":
    main()
The key step is to call pool.close() and pool.join() after the processes have finished; otherwise the pool is not released.
Besides, you should create the pool in the main process by putting the code within an if __name__ == "__main__": block.
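On Python 3.3+ you can also let a with block manage the pool for a one-shot map(); a sketch of the same example in that style (the context manager terminates the pool on exit, which is fine here because map() has already returned):

from multiprocessing import Pool

def f(x):
    return x * x

def main():
    with Pool(processes=3) as pool:  # terminated automatically when the block exits
        result = pool.map(f, range(10))
    print(result)
    print('end')

if __name__ == "__main__":
    main()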
Your Pool constructor is throwing the interpreter off into a process-spawning loop for some reason.
You first need to stop all the processes that are now running, and there will be tons. If you bring up the Task Manager you will see tons of rogue python.exe tasks. To kill them in bulk, try:
taskkill /F /IM python.exe
You may need to do the above a couple of times; make sure the Task Manager does not show any more python.exe tasks. This will also kill your Spyder instance, so make sure you save first.
Now change your code to the following:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(4)
    print pool.map(f, range(10))
Note that I have removed the processes named argument.

multiprocessing ignores "__setstate__"

I assumed that the multiprocessing package used pickle to send things between processes. However, pickle pays attention to the __getstate__ and __setstate__ methods of an object. Multiprocessing seems to ignore them. Is this correct? Am I confused?
To replicate, install Docker and type the following into the command line:
$ docker run python:3.4 python -c "import pickle
import multiprocessing
import os

class Tricky:
    def __init__(self, x):
        self.data = x

    def __setstate__(self, d):
        self.data = 10

    def __getstate__(self):
        return {}

def report(ar, q):
    print('running report in pid %d, hailing from %d' % (os.getpid(), os.getppid()))
    q.put(ar.data)

print('module loaded in pid %d, hailing from pid %d' % (os.getpid(), os.getppid()))

if __name__ == '__main__':
    print('hello from pid %d' % os.getpid())
    ar = Tricky(5)
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=report, args=(ar, q))
    p.start()
    p.join()
    print(q.get())
    print(pickle.loads(pickle.dumps(ar)).data)"
You should get something like
module loaded in pid 1, hailing from pid 0
hello from pid 1
running report in pid 5, hailing from 1
5
10
I would have thought it would have been "10" "10" but instead it is "5" "10". What could it mean?
(note: code edited to comply with programming guidelines, as suggested by user3667217)
The multiprocessing module can start processes in one of three ways: spawn, fork, or forkserver. By default on Unix, it forks. That means there's no need to pickle anything that's already loaded into RAM at the moment the new process is born.
If you need more direct control over how you want the fork to take place, you need to change the startup setting to spawn. To do this, create a context
ctx=multiprocessing.get_context('spawn')
and replace all calls to multiprocessing.foo() with calls to ctx.foo(). When you do this, every new process is born as a fresh python instance; everything that gets sent into it will be sent via pickle, instead of direct memcopy.
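Applied to the question's example, that might look roughly like this (a sketch; spawn requires Python 3.4+, and Tricky/report mirror the definitions above):

import multiprocessing

class Tricky:  # same class as in the question
    def __init__(self, x):
        self.data = x
    def __setstate__(self, d):
        self.data = 10
    def __getstate__(self):
        return {}

def report(ar, q):
    q.put(ar.data)

if __name__ == '__main__':
    ctx = multiprocessing.get_context('spawn')  # force pickling, even on Unix
    ar = Tricky(5)
    q = ctx.Queue()
    p = ctx.Process(target=report, args=(ar, q))
    p.start()
    p.join()
    print(q.get())  # should print 10: the child rebuilt ar via __setstate__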
Reminder: when you're using multiprocessing, you need to start a process inside an if __name__ == '__main__': clause (see the programming guidelines):
import pickle
import multiprocessing

class Tricky:
    def __init__(self, x):
        self.data = x

    def __setstate__(self, d):
        print('setstate happening')
        self.data = 10

    def __getstate__(self):
        print('getstate happening')
        return self.data

def report(ar, q):
    q.put(ar.data)

if __name__ == '__main__':
    ar = Tricky(5)
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=report, args=(ar, q))
    print('now starting process')
    p.start()
    print('now joining process')
    p.join()
    print('now getting results from queue')
    print(q.get())
    print('now getting pickle dumps')
    print(pickle.loads(pickle.dumps(ar)).data)
On windows, I see
now starting process
now joining process
setstate happening
now getting results from queue
10
now getting pickle dumps
setstate happening
10
On Ubuntu, I see:
now starting process
now joining process
now getting results from queue
5
now getting pickle dumps
getstate happening
setstate happening
10
I suppose this should answer your question: multiprocessing invokes the __setstate__ method on Windows but not on Linux. And on Linux, when you do the pickle.dumps / pickle.loads round trip at the end, __getstate__ is called first, then __setstate__. It's interesting to see how the multiprocessing module behaves differently on different platforms.

eventlet hangs when using futures.ProcessPoolExecutor

I'm using Ubuntu 12.04 Server x64, Python 2.7.3, futures==2.1.5, eventlet==0.14.0
Did anybody hit the same problem?
import eventlet
import futures
import random

# eventlet.monkey_patch()  # Uncomment this line and it will hang!

def test():
    if random.choice((True, False)):
        raise Exception()
    print "OK this time"

def done(f):
    print "done callback!"

def main():
    with futures.ProcessPoolExecutor() as executor:
        fu = []
        for i in xrange(6):
            f = executor.submit(test)
            f.add_done_callback(done)
            fu.append(f)
        futures.wait(fu, return_when=futures.ALL_COMPLETED)

if __name__ == '__main__':
    main()
This code, if you uncomment the line, will hang. I can only stop it by pressing Ctrl+C. In this case the following KeyboardInterrupt traceback is printed: https://gist.github.com/max-lobur/8526120#file-traceback
This works fine with ThreadPoolExecutor.
Any feedback is appreciated
I think this KeyboardInterrupt reaches the started process because under Ubuntu and most Linux systems the new process is created with fork, i.e. a new process is started from the current one with the same stdin, stdout and stderr. So the KeyboardInterrupt can reach the child process. Maybe someone who knows more can add thoughts.
