I am trying to run the following code in a Jupyter notebook using multiprocessing, but it never finishes and appears to run forever.
I need help resolving this issue.
import multiprocessing as mp
import numpy as np
def square(x):
    return np.square(x)
x = np.arange(64)
pool = mp.Pool(4)
squared = pool.map(square, [x[16*i:16*i+16] for i in range(4)])
The output for mp.cpu_count() was 4.
You need to rewrite your code to be something like:
def main():
    x = np.arange(64)
    pool = mp.Pool(4)
    squared = .....

if __name__ == '__main__':
    main()
This code is currently being run in every process. You need it to only run in the one process that is doing the setup.
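A minimal, standard-library-only sketch of that pattern, assuming the code is saved and run as a script rather than typed into a notebook cell. The helper names square_chunk and split_into_chunks are made up for illustration, and plain lists stand in for the NumPy array:

```python
import multiprocessing as mp


def square_chunk(chunk):
    """Square every value in one chunk (runs in a worker process)."""
    return [value * value for value in chunk]


def split_into_chunks(values, n_chunks):
    """Split a list into at most n_chunks contiguous slices."""
    size = -(-len(values) // n_chunks)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def main():
    values = list(range(64))
    chunks = split_into_chunks(values, 4)
    # The context manager shuts the pool down when the block exits.
    with mp.Pool(4) as pool:
        squared = pool.map(square_chunk, chunks)
    print(squared[0][:4])


# Only the process that was started directly runs main(); workers that
# re-import this module skip it.
if __name__ == '__main__':
    main()
```

Everything the workers need (the functions) stays at module level, while everything that creates processes sits behind the guard.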
You forgot:
pool.close()
pool.join()
I am working on multiprocessing and trying to replicate the code given in the below link:
Python Multiprocessing imap
My system hangs in both Spyder and Jupyter. What could be the reason?
The following code is copied exactly and run as-is, but it just hangs.
from multiprocessing import Pool
def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(3) as p:
        print(p.map(f, [1, 2, 3]))
If you read the docs on multiprocessing, in particular the following section:
... you will see this will not work. The solution is to put function f in another .py file and import it in order to get it to work. For example:
File worker.py:
def f(x):
    return x*x
Your revised code:
from multiprocessing import Pool
from worker import f

if __name__ == '__main__':
    with Pool(3) as p:
        print(p.map(f, [1, 2, 3]))
I have ten scripts, a_1.py through a_10.py.
I want to run all ten Python programs in parallel.
I tried:
from multiprocessing import Process
import os
def info(title):
    print(title)  # I want to execute python program
def f(name):
    for i in range(1, 11):
        subprocess.Popen(['python3', f'a_{i}.py'])

if __name__ == '__main__':
    info('main line')
    p = Process(target=f)
    p.start()
    p.join()
But it doesn't work.
How do I solve this?
I would suggest using the subprocess module instead of multiprocessing:
import os
import subprocess
import sys

MAX_SUB_PROCESSES = 10

def info(title):
    print(title, flush=True)

if __name__ == '__main__':
    info('main line')

    # Create a list of subprocesses.
    processes = []
    for i in range(1, MAX_SUB_PROCESSES+1):
        pgm_path = f'a_{i}.py' # Path to Python program.
        # Pass the command as a list so it also works without a shell on POSIX.
        command = [sys.executable, pgm_path, os.path.basename(pgm_path)]
        process = subprocess.Popen(command, bufsize=0)
        processes.append(process)

    # Wait for all of them to finish.
    for process in processes:
        process.wait()

    print('Done')
If you just need to call 10 external Python scripts (a_1.py through a_10.py) as separate processes, use the subprocess.Popen class:
import subprocess, sys
for i in range(1, 11):
    subprocess.Popen(['python3', f'a_{i}.py'])

# sys.exit() # optional
It's worth looking at the full subprocess.Popen signature; you may find some useful parameters and options there.
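As one illustration of those extra parameters, here is a rough sketch that captures each child's output through stdout=subprocess.PIPE and text=True (the latter needs Python 3.7+). run_all is a hypothetical helper, and the -c one-liners merely stand in for a_1.py through a_10.py:

```python
import subprocess
import sys


def run_all(commands):
    """Start every command first, then collect (returncode, stdout) for each."""
    processes = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for cmd in commands
    ]
    results = []
    for process in processes:
        # communicate() reads the pipe to EOF and waits for the child to exit.
        stdout, _ = process.communicate()
        results.append((process.returncode, stdout))
    return results


if __name__ == '__main__':
    # Stand-ins for [sys.executable, 'a_1.py'], [sys.executable, 'a_2.py'], ...
    commands = [[sys.executable, '-c', f'print({i} * {i})'] for i in range(1, 4)]
    for returncode, stdout in run_all(commands):
        print(returncode, stdout.strip())
```

Using sys.executable instead of a hard-coded 'python3' runs the children with the same interpreter as the parent, which avoids surprises inside virtual environments.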
You can use a multiprocessing pool to run them concurrently.
import multiprocessing as mp
def worker(module_name):
    """ Executes a module externally with python """
    __import__(module_name)
    return

if __name__ == "__main__":
    max_processes = 5
    module_names = [f"a_{i}" for i in range(1, 11)]
    print(module_names)
    with mp.Pool(max_processes) as pool:
        pool.map(worker, module_names)
The max_processes variable is the maximum number of workers running at any given time. In other words, it's the number of processes spawned by your program. The call pool.map(worker, module_names) uses the available processes and calls worker on each item in your module_names list. We don't include the .py extension because we're running each module by importing it.
Note: This might not work if the code you want to run in your modules is contained inside if __name__ == "__main__" blocks. If that is the case, then my recommendation would be to move all the code in the if __name__ == "__main__" blocks of the a_{} modules into a main function. Additionally, you would have to change the worker to something like:
def worker(module_name):
    module = __import__(module_name) # Kind of like 'import module_name as module'
    module.main()
    return
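A variation on the worker above, sketched with importlib.import_module, which the Python documentation recommends over calling __import__ directly. It assumes each module exposes an optional main() function; the standard-library names used in the demo are only placeholders for a_1 through a_10:

```python
import importlib
import multiprocessing as mp


def worker(module_name):
    """Import a module by name and call its main() if it defines one."""
    module = importlib.import_module(module_name)
    main = getattr(module, 'main', None)
    if callable(main):
        main()
    return module_name


if __name__ == '__main__':
    # Placeholder module names; replace with [f'a_{i}' for i in range(1, 11)].
    module_names = ['json', 'csv', 'string']
    with mp.Pool(2) as pool:
        print(pool.map(worker, module_names))
```

Returning the module name gives pool.map something to report back, which makes it easier to see which modules actually ran.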
I have problems with Python multiprocessing.
Python version 3.6.6
Using the Spyder IDE on Windows 7
1.
The queue is not being populated: every time I try to read it, it's empty. I read somewhere that I have to get() from it before calling join() on the processes, but that did not solve it.
from multiprocessing import Process,Queue

# define a example function
def fnc(i, output):
    output.put(i)

if __name__ == '__main__':
    # Define an output queue
    output = Queue()
    # Setup a list of processes that we want to run
    processes = [Process(target=fnc, args=(i, output)) for i in range(4)]
    print('created')
    # Run processes
    for p in processes:
        p.start()
    print('started')
    # Exit the completed processes
    for p in processes:
        p.join()
    print(output.empty())
    print('finished')
>>>created
>>>started
>>>True
>>>finished
I would expect output to not be empty.
If I replace the .join() loop with
for p in processes:
    print(output.get())
    #p.join()
it freezes.
2.
The next problem is with pool.map(): it freezes, even though there is no way it could be hitting a memory limit. I don't even know how to debug such a simple piece of code.
from multiprocessing import Pool
def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    print('Pool created')
    # print "[0, 1, 4,..., 81]"
    print(pool.map(f, range(10))) # it freezes here
Hope it's not a big deal to have two questions in one topic.
Apparently the problem is Spyder's IPython console. When I run both from cmd, they execute properly.
Solution
For debugging in Spyder, add .dummy to the multiprocessing import:
from multiprocessing.dummy import Process,Queue
The code will not run on multiple processors, but you will get results and can actually see the output. When debugging is done, simply delete .dummy, place the code in another file, import it, and call it, for example as a function.
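A small sketch of that swap, assuming a pool-based workload: multiprocessing.dummy exposes the same API backed by threads, so nothing has to be pickled or re-imported and output appears directly in the console. run_squares is a hypothetical helper name:

```python
# Thread-backed drop-in; change to `from multiprocessing import Pool` once
# debugging is done and the code lives in an importable module.
from multiprocessing.dummy import Pool


def f(x):
    return x * x


def run_squares(values, workers=4):
    """Map f over values using a pool of worker threads."""
    with Pool(workers) as pool:
        return pool.map(f, values)


if __name__ == '__main__':
    print(run_squares(range(5)))
```

Because the workers are threads, this is only a debugging aid for CPU-bound code; the guard is kept so the later switch back to real processes needs no other change.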
multiprocessing_my.py
from multiprocessing import Process,Queue

# define a example function
def fnc(i, output):
    output.put(i)
    print(i)

def test():
    # Define an output queue
    output = Queue()
    # Setup a list of processes that we want to run
    processes = [Process(target=fnc, args=(i, output)) for i in range(4)]
    print('created')
    # Run processes
    for p in processes:
        p.start()
    print('started')
    # Exit the completed processes
    for p in processes:
        p.join()
    print(output.empty())
    print('finished')
    # Get process results from the output queue
    results = [output.get() for p in processes]
    print('get results')
    print(results)
test_mp.py
(executed by selecting the code and pressing Ctrl+Enter)
import multiprocessing_my
multiprocessing_my.test()
...
In[9]: test()
created
0
1
2
3
started
False
finished
get results
[0, 1, 2, 3]
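On the first problem, the multiprocessing documentation advises removing all items from a queue before joining the processes that put them there. A hedged sketch of that ordering, meant to be run as a script from a terminal; put_square and collect are illustrative names:

```python
from multiprocessing import Process, Queue


def put_square(i, output):
    """Worker: put an (input, result) pair on the shared queue."""
    output.put((i, i * i))


def collect(n_workers):
    """Start n_workers processes and return their results, sorted by input."""
    output = Queue()
    processes = [Process(target=put_square, args=(i, output))
                 for i in range(n_workers)]
    for p in processes:
        p.start()
    # Drain the queue first: one blocking get() per worker ...
    results = [output.get() for _ in processes]
    # ... and only then join, so no child is left flushing buffered data.
    for p in processes:
        p.join()
    return sorted(results)


if __name__ == '__main__':
    print(collect(4))
```

The results are sorted because workers may finish in any order.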
I'm trying to run a simple function with arguments from a list in a multiprocessing pool in Python 2.7.5 (Windows 7).
from multiprocessing import Pool
index_lst = []
for idx, item in enumerate(range(10)):
    index_lst.append(idx)

def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(4)
    print(p.map(f, index_lst))
Unfortunately, the entire script gets executed multiple times. How to prevent the list (index_lst) from being created over and over again?
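On Windows, multiprocessing starts each worker by launching a fresh interpreter that imports the main script again, so every statement at module level is repeated in each worker. One way to build the list only once is to move it under the guard, roughly as follows (build_index_list is a hypothetical helper):

```python
from multiprocessing import Pool


def f(x):
    return x * x


def build_index_list(n):
    """Build the list of inputs; called from the parent process only."""
    print('building index list')
    return list(range(n))


if __name__ == '__main__':
    # Workers re-import this file but never enter this block,
    # so the list is created exactly once.
    index_lst = build_index_list(10)
    p = Pool(4)
    print(p.map(f, index_lst))
    p.close()
    p.join()
```

Function definitions still have to stay at module level, because the workers look f up by name after importing the script.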
I am running the following (example) code:
from multiprocessing import Pool
def f(x):
    return x*x

pool = Pool(processes=4)
print pool.map(f, range(10))
However, the code never finishes. What am I doing wrong?
The line
pool = Pool(processes=4)
completes successfully, but execution appears to stop at the last line. Not even pressing Ctrl+C interrupts it. I am running the code inside an IPython console in Spyder.
from multiprocessing import Pool
def f(x):
    return x * x

def main():
    pool = Pool(processes=3) # set the processes max number 3
    result = pool.map(f, range(10))
    pool.close()
    pool.join()
    print(result)
    print('end')

if __name__ == "__main__":
    main()
The key step is to call pool.close() and pool.join() once the work has finished; otherwise the pool is not released.
Besides, you should create the pool only in the main process by putting that code inside the if __name__ == "__main__": block.
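If pool.map raises, explicit close() and join() calls placed after it are skipped. A possible refinement, offered as a sketch with the hypothetical helper map_and_release, wraps them in try/finally:

```python
from multiprocessing import Pool


def f(x):
    return x * x


def map_and_release(func, values, processes=3):
    """Run pool.map and always release the pool, even if map raises."""
    pool = Pool(processes=processes)
    try:
        return pool.map(func, values)
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit


if __name__ == "__main__":
    print(map_and_release(f, range(10)))
```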
For some reason, your Pool constructor is sending the interpreter into a loop that keeps spawning new processes.
You first need to stop everything that is now running, and there will be a lot. If you bring up the Task Manager you will see many rogue python.exe tasks. To kill them in bulk, try:
taskkill /F /IM python.exe
You may need to run this a couple of times; make sure the Task Manager no longer shows any python.exe tasks. This will also kill your Spyder instance, so make sure you save your work first.
Now change your code to the following:
from multiprocessing import Pool
def f(x):
    return x*x

if (__name__ == '__main__'):
    pool = Pool(4)
    print pool.map(f, range(10))
Note that I have removed the processes named argument.