I am learning about parallel processing in Python and I have some very specific doubts about the execution flow of the following program. In this program, I split my list into two parts depending on the process. My aim is to run the add function twice in parallel, where one process takes one part of the list and the other process takes the other part.
import multiprocessing as mp

x = [1,2,3,4]
print('hello')

def add(flag, q_f):
    global x
    if flag == 1:
        dl = x[0:2]
    elif flag == 2:
        dl = x[2:4]
    else:
        dl = x
    x = [i+2 for i in dl]
    print('flag = %d'%flag)
    print('1')
    print('2')
    print(x)
    q_f.put(x)

print('Above main')

if __name__ == '__main__':
    ctx = mp.get_context('spawn')
    print('inside main')
    q = ctx.Queue()
    jobs = []
    for i in range(2):
        p = mp.Process(target = add, args = (i+1, q))
        jobs.append(p)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print('completed')
    print(q.get())
    print(q.get())

print('outside main')
The output I got is:
hello
Above main
outside main
flag = 1
1
2
[3, 4]
hello
Above main
outside main
flag = 2
1
2
[5, 6]
hello
Above main
inside main
completed
[3, 4]
[5, 6]
outside main
My questions are:
1) From the output, we can see that one process is executed first and then the other. Is the program actually utilizing multiple processors for parallel processing? If not, how can I make it process in parallel? If it were processing in parallel, the print statements print('1') and print('2') from the two processes should appear interleaved at random, right?
2) Can I check programmatically which processor the program is running on?
3) Why are the print statements outside main (hello, Above main, outside main) executed three times?
4) What is the flow of the program execution?
1) The execution of add() probably completes so quickly that the first process has already finished by the time the second one starts.
2) A process is usually not pinned to a particular CPU; the operating system may move it between cores.
3) If you are using Windows, the module is executed again for each started process. In those executions __name__ is not '__main__', but all unconditional statements (those outside if blocks and the like), such as these prints, are executed.
4) When start() is called on a Process on Windows, a new Python interpreter is started, which means that the necessary modules are imported (and therefore executed) and the resources needed to run the subprocess are handed to the new interpreter (the "spawn" method described at https://docs.python.org/3.6/library/multiprocessing.html#contexts-and-start-methods). All processes then run independently (if the program does no synchronization).
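To actually see the interleaving asked about in question 1, each process needs enough work that one cannot finish before the other has even started; with the tiny list in your program the first child is usually done before the second is spawned. Below is a minimal sketch (not your code) that should show the progress lines of both workers mixed together on a machine with at least two free cores. Regarding question 2, the third-party psutil package can report which core a process is currently on via psutil.Process().cpu_num() on platforms where that call is available (e.g. Linux), but as said above the scheduler may move the process at any moment.

import multiprocessing as mp
import os

def work(tag):
    total = 0
    for i in range(5_000_000):          # enough work for the two processes to overlap
        total += i * i
        if i % 1_000_000 == 0:
            print('worker %d (pid %d) at step %d' % (tag, os.getpid(), i))
    print('worker %d finished' % tag)

if __name__ == '__main__':
    ctx = mp.get_context('spawn')
    jobs = [ctx.Process(target=work, args=(tag,)) for tag in (1, 2)]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()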
Related
I'm trying some multiprocessing with the example code below. It should print the messages in the order 01, 02, 03, hello world, 04, 05, but instead it prints 01, 02, 01, 05, 03, hello world, 04, 05. Why does it go back to 01 after 02 and then jump to 05 before 03? What am I missing here, and how can I make it run in order? Thank you!
from multiprocessing import *

# large data/complex use multiprocessing, else use ordinary function
q = Queue()  # comm between parent and child process

def f1(x, q):
    print('03')
    x = x + " world"
    q.put(x)

def main_f():
    print('01')
    mp = Process(target=f1, args=("hello", q,))
    if __name__ == '__main__':  # only happen once, else e.g. 4 processes become 16, then 64, endlessly
        print('02')
        mp.start()
        print(q.get())
        mp.join()
        print('04')

main_f()
print('05')
I expect the messages to print in the order 01, 02, 03, hello world, 04, 05.
When you do multiprocessing on a platform that uses the spawn method to create new processes, any code at global scope that is not within an if __name__ == '__main__': block will first be executed by the child process in order to initialize its storage, prior to invoking the worker function f1.
In your posted code, when the child process is created it will therefore execute the following statements in order:
1. from multiprocessing import *
2. q = Queue()          # create global queue
3. def f1(x, q):        # create function definition
4. def main_f():        # create function definition
5. main_f()             # call main_f
6. print('05')
In reality the only statement that needs to be executed by the child process before the worker method f1 is invoked is statement #3 above, which defines the worker function for the child process.
Statement #1 imports a package not used by your child process. This does not prevent the program from running correctly, but Python spends time performing an import that is not needed.
Statement #2 needlessly creates a new queue instance in the child process, distinct from the one created in the main process. It would be disastrous if your child process used this queue, since it would be putting elements on a different queue than the one the main process is getting from. Fortunately, function f1 references and uses the queue that is passed to it as an argument, not this global one.
Statement #4 defines a function not used by the child process. It doesn't prevent the program from running, but it is wasteful.
Statement #5 invokes main_f. This is where your trouble begins. All the code within main_f that is not within an if __name__ == '__main__': block gets executed in the child immediately before your worker function is invoked. This is what causes the extra '01' to be printed.
Statement #6 is likewise what causes the extra '05' to be printed.
At the minimum to get your program working correctly, your code should therefore be:
from multiprocessing import *

def f1(x, q):
    print('03')
    x = x + " world"
    q.put(x)

def main_f():
    # large data/complex use multiprocessing, else use ordinary function
    q = Queue()  # comm between parent and child process
    print('01')
    mp = Process(target=f1, args=("hello", q,))
    print('02')
    mp.start()
    print(q.get())
    mp.join()
    print('04')

if __name__ == '__main__':
    main_f()
If we want to eliminate all possible inefficiencies, i.e. prevent unnecessary statements from being executed when the child process is initialized, then:
def f1(x, q):
    print('03')
    x = x + " world"
    q.put(x)

if __name__ == '__main__':
    from multiprocessing import *

    def main_f():
        # large data/complex use multiprocessing, else use ordinary function
        q = Queue()  # comm between parent and child process
        print('01')
        mp = Process(target=f1, args=("hello", q,))
        print('02')
        mp.start()
        print(q.get())
        mp.join()
        print('04')

    main_f()
I am running some multiprocessing code. The framework of the code is something like this:
import multiprocessing
import numpy as np

def func_a(x):
    # main function here
    return result

def func_b(y):
    cores = multiprocessing.cpu_count() - 1
    pool = multiprocessing.Pool(processes=cores)
    results = pool.map(func_a, np.arange(1000))
    return results

if __name__ == '__main__':
    final_resu = []
    for i in range(0, 200):
        final_resu.append(func_b(i))
I found two problems with this code. First, the memory usage keeps going up during the loop. Second, in the Task Manager (Windows 10), the number of Python processes increases step-wise, i.e. 14 to 25, to 36, to 47..., with every iteration of the main loop.
I believe something is wrong with the multiprocessing, but I'm not sure how to deal with it. It looks like the pool created in func_b is not cleaned up when each iteration of the main loop finishes?
As the examples in the docs show, when you're done with a Pool you should shut it down explicitly, via pool.close() followed by pool.join(). That said, it would be better still if, in addition, you created your Pool only once - e.g., pass a Pool as an argument to func_b() - and create it, and close it down, only once, in the __name__ == '__main__' block.
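A rough sketch of that suggestion, keeping the names from the question (the body of func_a is a placeholder): the Pool is created once in the __main__ block, passed into func_b, and shut down exactly once after the loop.

import multiprocessing
import numpy as np

def func_a(x):
    # placeholder for the real computation
    return x * x

def func_b(y, pool):
    # reuse the pool created by the caller instead of building a new one per call
    return pool.map(func_a, np.arange(1000))

if __name__ == '__main__':
    cores = multiprocessing.cpu_count() - 1
    pool = multiprocessing.Pool(processes=cores)
    final_resu = []
    for i in range(0, 200):
        final_resu.append(func_b(i, pool))
    # shut the pool down explicitly, once, when all the work is done
    pool.close()
    pool.join()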
I'm trying to learn how to use multiple processes in Python and I encountered a problem similar to the example below.
I start a process called p1 using .start() and after that call a function do_something(). The problem is that the function is called before the process actually starts running.
The code I used:
import time
from multiprocessing import Process

def complex_calculation():
    start_timer = time.time()
    print("Started calculating...")
    [x ** 2 for x in range(20000000)]  # calculation
    print(f"complex_calculation: {time.time() - start_timer}")

def do_something():
    print(input("Enter a letter: "))

if __name__ == "__main__":
    p1 = Process(target=complex_calculation)
    p1.start()
    do_something()
    p1.join()
It seems to work if I use time.sleep():
if __name__ == "__main__":
p1 = Process(target=complex_calculation)
p1.start()
time.sleep(1)
do_something()
p1.join()
My questions are:
Why does this happen?
What can I do so that I don't have to use time.sleep()?
As pointed out in the comments, multiple processes run concurrently. Without doing some extra work, there are never guarantees about the order in which the processes are scheduled to run by the operating system. So while you call p1.start() before do_something(), all that means is that the Python code related to starting the process has completed before do_something is run. But the actual process represented by p1 may run in any way relative to the remainder of the Python code. It can run entirely before, entirely after, or interleaved in any way with the remainder of the Python code. Relying on it being scheduled in any particular way is one definition of a race condition.
To control the way in which these processes run relative to one another, you need a synchronization primitive. There are many ways to synchronize processes, it just depends on what you want to accomplish. If you want to make sure that the complex_calculation function has started before do_something is called, an event is probably the simplest approach. For example:
import time
from multiprocessing import Process, Event

def complex_calculation(event):
    event.set()  # Set the event, notifying any process waiting on it
    start_timer = time.time()
    print("Started calculating...")
    [x ** 2 for x in range(20000000)]  # calculation
    print(f"complex_calculation: {time.time() - start_timer}")

def do_something(event):
    event.wait()  # Wait for `complex_calculation` to set the event
    print(input("Enter a letter: "))

if __name__ == "__main__":
    event = Event()
    p1 = Process(target=complex_calculation, args=(event,))
    p1.start()
    do_something(event)
    p1.join()
You should see something like:
$ python3 test.py
Started calculating...
Enter a letter: a
a
complex_calculation: 6.86732816696167
How can I add a new task to a multiprocessing pool that I initialized in a parent process? The following does not work:
from multiprocessing import Pool

def child_task(x):
    # the child task spawns new tasks
    results = p.map(grandchild_task, [x])
    return results[0]

def grandchild_task(x):
    return x

if __name__ == '__main__':
    p = Pool(2)
    print(p.map(child_task, [0]))
    # Result: NameError: name 'p' is not defined
Motivation: I need to parallelize a program which consists of various child tasks, which themselves also have child tasks (i.e., grandchild tasks). Only parallelizing the child tasks OR the grandchild tasks does not utilize all my CPU cores.
In my use-case, I have various child tasks (maybe 1-50) and many grandchild tasks per child task (maybe 100-1000).
Alternatives: If this is not possible using Python's multiprocessing package, I am happy to switch to another library that supports this.
There is such a thing as a minimal reproducible example, and then there is going beyond that and removing so much code that you end up with something that (1) is perhaps so oversimplified that an answer risks missing the mark and (2) couldn't possibly run as shown (you need to enclose the code that creates the Pool and submits the task in a block controlled by an if __name__ == '__main__': statement).
But based on what you have shown, I don't believe a Pool is the solution for you; you should be creating Process instances as they are required. One way to get the results from the Processes is to store them in a shareable, managed dictionary whose key is, for example, the process id of the Process that has created the result.
To expand on your example, the child task is passed two arguments, x and y, and needs to return x**2 + y**2 as its result. The child task spawns two instances of the grandchild task, each one computing the square of its argument. The child task then combines the return values of these processes using addition:
from multiprocessing import Process, Manager
import os

def child_task(results_dict, x, y):
    # the child task spawns new tasks
    p1 = Process(target=grandchild_task, args=(results_dict, x))
    p1.start()
    pid1 = p1.pid
    p2 = Process(target=grandchild_task, args=(results_dict, y))
    p2.start()
    pid2 = p2.pid
    p1.join()
    p2.join()
    pid = os.getpid()
    results_dict[pid] = results_dict[pid1] + results_dict[pid2]

def grandchild_task(results_dict, n):
    pid = os.getpid()
    results_dict[pid] = n * n

def main():
    manager = Manager()
    results_dict = manager.dict()
    p = Process(target=child_task, args=(results_dict, 2, 3))
    p.start()
    pid = p.pid
    p.join()
    # results will be stored with key p.pid:
    print(results_dict[pid])

if __name__ == '__main__':
    main()
Prints:
13
Update
If you really had a situation where, for example, child_task needed to process N identical calls varying only in their arguments, but it had to spawn a sub-process or two, then use a Pool as before but additionally pass a managed dictionary to child_task to be used for spawning additional Processes (not attempting to use a Pool for this) and retrieving their results.
Update 2
The only way I could figure out for the sub-processes themselves to use pooling is to use the ProcessPoolExecutor class from the concurrent.futures module. When I attempted to do the same thing with multiprocessing.Pool, I got an error because we had daemon processes trying to create their own processes. But even here the only option is for each process in the pool to have its own pool of processes. You only have a finite number of processors/cores on your computer, so unless there is a bit of I/O mixed into the processing, you can create all these pools but the processes will just be waiting for a chance to run. So it's not clear what performance gains will be realized.

There is also the problem of shutting down all the pools created for the child_task sub-processes. Normally a ProcessPoolExecutor instance is created using a with block, and when that block terminates the pool that was created is cleaned up. But child_task is invoked repeatedly and clearly cannot use a with block, because we don't want to be constantly creating and destroying pools. What I have come up with here is a bit of a kludge: a third parameter is passed, either True or False, indicating whether child_task should initiate a shutdown of its pool. The default value for this parameter is False, so normally we don't even bother passing it. After all the actual results have been retrieved and the child_task processes are idle, we submit N new tasks with dummy values but with shutdown set to True. Note that the ProcessPoolExecutor function map works quite a bit differently than the same function in the Pool class (read the docs):
from concurrent.futures import ProcessPoolExecutor
import time

child_executor = None

def child_task(x, y, shutdown=False):
    global child_executor
    if child_executor is None:
        child_executor = ProcessPoolExecutor(max_workers=1)
    if shutdown:
        if child_executor:
            child_executor.shutdown(False)
            child_executor = None
        time.sleep(.2)  # make sure another process in the pool gets the next task
        return None
    # the child task spawns new task(s)
    future = child_executor.submit(grandchild_task, y)
    # we can compute one of the results using the current process:
    return grandchild_task(x) + future.result()

def grandchild_task(n):
    return n * n

def main():
    N_WORKERS = 2
    with ProcessPoolExecutor(max_workers=N_WORKERS) as executor:
        # first call is (1, 2), second call is (3, 4):
        results = [result for result in executor.map(child_task, (1, 3), (2, 4))]
        print(results)
        # force a shutdown
        # need N_WORKERS invocations:
        [result for result in executor.map(child_task, (0,) * N_WORKERS, (0,) * N_WORKERS, (True,) * N_WORKERS)]

if __name__ == '__main__':
    main()
Prints:
[5, 25]
Check this solution:
#!/usr/bin/python
# requires Python version 3.8 or higher
from multiprocessing import Queue, Process
import time
from random import randrange
import os
import psutil

# function to be run by each child process
def square(number):
    sleep = randrange(5)
    time.sleep(sleep)
    print(f'Result is {number * number}, computed by pid {os.getpid()}...sleeping {sleep} secs')

# create a queue where all tasks will be placed
queue = Queue()

# indicate how many children you want the system to create to run the tasks
number_of_child_proceses = 5

# put all tasks in the queue above
for task in range(19):
    queue.put(task)

# this is the main entry/start of the program when you run it
def main():
    number_of_task = queue.qsize()
    print(f'{"_" * 60}\nBatch: {number_of_task // number_of_child_proceses + 1} \n{"_" * 60}')
    # don't create more children than the number of tasks. Also, in the last round,
    # wait for all child processes to complete so as to wrap up everything
    if number_of_task <= number_of_child_proceses:
        processes = [Process(target=square, args=(queue.get(),))
                     for _ in range(number_of_task)]
        for p in processes:
            p.start()
            p.join()
    else:
        processes = [Process(target=square, args=(queue.get(),))
                     for _ in range(number_of_child_proceses)]
        for p in processes:
            p.start()
    # update count of remaining tasks
    number_of_task = queue.qsize()
    # run the program in a loop until no more tasks remain in the queue
    while number_of_task:
        current_process = psutil.Process()
        children = current_process.children()
        # if the child processes have completed their assigned tasks but there are still
        # tasks remaining in the queue, assign them more tasks
        if not len(children) and number_of_task:
            print(f'\nAssigned tasks completed... reassigning the remaining {number_of_task} task(s) in the queue\n')
            main()
    # exit the loop if no more tasks in the queue to work on
    print('\nAll tasks completed!!')
    exit()

if __name__ == "__main__":
    main()
I have looked around more, and found Ray, which addresses this exact use case using nested remote functions.
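For completeness, here is a minimal sketch of what that can look like with Ray, assuming Ray is installed (pip install ray); the function names mirror the example in the question and the numbers are only illustrative. The key feature is that a remote function may itself launch and wait on other remote functions:

import ray

@ray.remote
def grandchild_task(n):
    return n * n

@ray.remote
def child_task(x, y):
    # a remote task may launch and wait on other remote tasks (nested remote functions)
    squares = ray.get([grandchild_task.remote(x), grandchild_task.remote(y)])
    return sum(squares)

if __name__ == '__main__':
    ray.init()
    print(ray.get(child_task.remote(2, 3)))  # 13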
I am currently trying to run parallelized code from a Spyder console in Anaconda. I believe the issue may be that my computer is not allowing Anaconda to control CPU cores, but I am not sure how to correct this.
Another interesting point is that when I run an async example, I hit the same issue as soon as I try to retrieve the results.
I have tried multiple simple examples that should be working. There are no package loading errors.
from multiprocessing.pool import ThreadPool, Pool

def square_it(x):
    return x*x

# On Windows, make sure that multiprocessing doesn't start
# until after "if __name__ == '__main__'"

with Pool(processes=5) as pool:
    results = pool.map(square_it, [5, 4, 3, 2, 1])
    print(results)
I expect my code to run to completion.
This code is meant to run square_it in parallel, in 5 different processes
def square_it(x):
    return x*x

with Pool(processes=5) as pool:
    results = pool.map(square_it, [5, 4, 3, 2, 1])
    print(results)
The way it does that is by creating 5 new processes; then, in each of them, the same Python module is loaded and the function square_it is called.
What happens when the module is imported in one of the 5 subprocesses is the same thing that happens when it is initially loaded in the main process: it creates another Pool of 5 subprocesses, and each of those does the same, indefinitely.
To avoid that, you have to make sure that the subprocesses do not recursively create more and more subprocesses. You do that by creating the subprocesses only in the "main" module, aka "__main__":
def square_it(x):
    return x*x

if __name__ == "__main__":
    with Pool(processes=5) as pool:
        results = pool.map(square_it, [5, 4, 3, 2, 1])
        print(results)