How to fetch a process from a Python process pool

I want to create many processes, with each process starting 5 seconds after the previous one; that is, the interval between consecutive process starts is 5 seconds, so that:
run process 1
wait 5 seconds
run process 2
wait 5 seconds
run process 3
wait 5 seconds
.....
like:
import multiprocessing
from time import sleep

for i in range(10):
    p = multiprocessing.Process(target=func)
    p.start()
    sleep(5)

# after all child processes exit
do_something()
but I want to call do_something() only after all the processes have exited, and I don't know how to do that synchronization here.
With the Python pool library, I can have:
pool = multiprocessing.Pool(processes=4)
for i in xrange(500):
    pool.apply_async(func, (i,))
pool.close()
pool.join()
do_something()
but in this way, 4 processes run simultaneously and I can't control the time interval between process starts.
Is it possible to create a process pool and then fetch each process from it, something like:
pool = multiprocessing.Pool(processes=4)
for i in xrange(500):
    process = pool.fetch_one()
    process(func, i)
    time.sleep(5)
pool.close()
pool.join()
do_something()
Is there a library or a code snippet that satisfies this need?
Thanks!

Just to put suggestions together, you could do something like:
plist = []
for i in range(10):
    p = multiprocessing.Process(target=func)
    p.start()
    plist.append(p)
    sleep(5)

for p in plist:
    p.join()

do_something()
You could give a timeout argument to join() in order to handle stuck processes; in that case you'd have to keep iterating through the list, removing terminated processes until the list is empty.
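For example, a minimal sketch of that timeout loop (func and do_something are placeholder stand-ins here, and the 2-second timeout is an arbitrary choice, not part of the original suggestion):

import multiprocessing
from time import sleep

def func():
    sleep(3)  # placeholder workload

def do_something():
    print("all child processes have exited")

if __name__ == '__main__':
    plist = []
    for i in range(10):
        p = multiprocessing.Process(target=func)
        p.start()
        plist.append(p)
        sleep(5)

    # Keep polling the remaining processes until the list is empty.
    while plist:
        for p in plist[:]:          # iterate over a copy so removal is safe
            p.join(timeout=2)       # wait up to 2 seconds for this process
            if not p.is_alive():
                plist.remove(p)     # it has terminated, stop tracking it
    do_something()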

Related

Python multi-process with time-out kill for individual processes

I'm having trouble coming up with a piece of code that:
1) spawns multiple processes, and
2) for any individual process, kills the process if it is still alive after 5 seconds.
I know how to handle 1) and 2) individually, but I don't know how to combine them. Any suggestions would be helpful. Thanks!
For 1), I know how to write a simple multi-process program that collects results in a return dictionary:
import multiprocessing

def worker(procnum, return_dict):
    '''worker function'''
    print str(procnum) + ' represent!'
    return_dict[procnum] = procnum

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i, return_dict))
        jobs.append(p)
        p.start()
    for proc in jobs:
        proc.join()
    print return_dict.values()
For 2), my program hangs on some data because one function, an external C++ extension for Python, never returns. Since there are one million data points to handle, I need a time-out killer that kills this function when it runs too long and moves on to the next iteration. Currently I allow a wait of 5 seconds before killing the process. I know how to write that part:
import multiprocessing
import time

# bar
def bar():
    for i in range(100):
        print "Tick"
        time.sleep(1)

if __name__ == '__main__':
    # Start bar as a process
    p = multiprocessing.Process(target=bar)
    p.start()
    # Wait for 10 seconds or until process finishes
    p.join(10)
    # If thread is still active
    if p.is_alive():
        print "running... let's kill it..."
        # Terminate
        p.terminate()
        p.join()
But, as I mentioned, I am not sure how to combine these two pieces of code, mainly because I don't know where to put the p.join() and the p.is_alive() check. Also, what is the use of p.join() when we already have p.start()?
Thanks!
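A minimal sketch of one way to combine the two snippets above (this is an assumption about the intended behavior, not code from the post; in particular the per-process 5-second join is illustrative):

import multiprocessing

def worker(procnum, return_dict):
    """Worker function: may hang, so the parent enforces a 5-second limit."""
    return_dict[procnum] = procnum

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # 1) spawn all processes
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i, return_dict))
        jobs.append(p)
        p.start()

    # 2) give each process up to 5 seconds, then kill it if still alive
    for p in jobs:
        p.join(5)              # blocks at most 5 seconds for this process
        if p.is_alive():
            p.terminate()      # force-kill the stuck process
            p.join()           # reap it so it does not stay a zombie

    print(return_dict.values())

Note that because the joins happen one after another, a process later in the list has usually had more than 5 seconds of wall-clock time by the time it is checked; if a strict per-process limit matters, track each process's start time and pass the remaining time to join().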

Where to call join() when multiprocessing

When using multiprocessing in Python, I usually see examples where the join() function is called in a loop separate from the one where each process was actually created.
For example, this:
processes = []
for i in range(10):
    p = Process(target=my_func)
    processes.append(p)
    p.start()

for p in processes:
    p.join()
is more common than this:
processes = []
for i in range(10):
    p = Process(target=my_func)
    processes.append(p)
    p.start()
    p.join()
But from my understanding of join(), it just tells the script not to exit until that process has finished. Therefore, it shouldn't matter when join() is called. So why is it usually called in a separate loop?
join() is a blocking operation.
In the first example you start 10 processes and then wait for all of them to finish; all the processes run at the same time.
In the second example you start one process at a time and wait for it to finish before starting the next one, so there is only ever one process running at a time.
First example:
import time
from multiprocessing import Process

def wait():
    time.sleep(1)

processes = []

# You start 10 processes
for i in range(10):
    p = Process(target=wait)
    processes.append(p)
    p.start()

# About one second later all processes are finished; you check them all and exit
for p in processes:
    p.join()

The execution time of the whole script is close to one second.
Second example:
for i in range(10):
    p = Process(target=wait)  # Here you create one process
    processes.append(p)
    p.start()
    p.join()  # Here you wait about one second for the process to finish before starting the next one.

The execution time of the whole script is close to 10 seconds!

Python: How to keep starting new parallel processes based on condition [duplicate]

python multiprocessing pool: how can I know when all the workers in the pool have finished?

I am running a multiprocessing pool in Python, where I have ~2000 tasks being mapped to 24 workers.
Each task creates a file based on some data analysis and web services.
I want to run a new task when all the tasks in the pool have finished. How can I tell when all the processes in the pool have finished?
You want to use the join method, which blocks the main process from moving forward until all sub-processes have ended:
Block the calling thread until the process whose join() method is called terminates or until the optional timeout occurs.
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    processes = []
    for i in range(10):
        p = Process(target=f, args=('bob',))
        processes.append(p)

    for p in processes:
        p.start()
        p.join()

    # only get here once all processes have finished.
    print('finished!')
EDIT:
To use join with pools
pool = Pool(processes=4) # start 4 worker processes
result = pool.apply_async(f, (10,)) # do some work
pool.close()
pool.join() # block at this line until all processes are done
print("completed")
You can use the wait() method of the ApplyResult object (which is what pool.apply_async returns).
import multiprocessing

def create_file(i):
    open(f'{i}.txt', 'a').close()

if __name__ == '__main__':
    # The default for n_processes is the detected number of CPUs
    with multiprocessing.Pool() as pool:
        # Launch the first round of tasks, building a list of ApplyResult objects
        results = [pool.apply_async(create_file, (i,)) for i in range(50)]
        # Wait for every task to finish
        [result.wait() for result in results]
        # {start your next task... the pool is still available}
    # {when you reach here, the pool is closed}
This method works even if you're planning on using your pool again and don't want to close it--as an example, you might want to keep it around for the next iteration of your algorithm. Use a with statement or call pool.close() manually when you're done using it, or bad things will happen.
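As a rough sketch of that reuse (the second round of tasks here is purely illustrative):

import multiprocessing

def create_file(i):
    open(f'{i}.txt', 'a').close()

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        # First round of tasks
        results = [pool.apply_async(create_file, (i,)) for i in range(50)]
        [result.wait() for result in results]

        # The pool is still open, so a second round can reuse the same workers
        more_results = [pool.apply_async(create_file, (i,)) for i in range(50, 100)]
        [result.wait() for result in more_results]
    # Leaving the with-block closes the pool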

When to call .join() on a process?

I am reading various tutorials on the multiprocessing module in Python, and am having trouble understanding why/when to call process.join(). For example, I stumbled across this example:
nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put his out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()

print resultdict
From what I understand, process.join() will block the calling process until the process whose join method was called has completed execution. I also believe that the child processes which have been started in the above code example complete execution upon completing the target function, that is, after they have pushed their results to the out_q. Lastly, I believe that out_q.get() blocks the calling process until there are results to be pulled. Thus, if you consider the code:
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()
the main process is blocked by the out_q.get() calls until every single worker process has finished pushing its results to the queue. Thus, by the time the main process exits the for loop, each child process should have completed execution, correct?
If that is the case, is there any reason for calling the p.join() methods at this point? Haven't all worker processes already finished, so how does that cause the main process to "wait for all worker processes to finish?" I ask mainly because I have seen this in multiple different examples, and I am curious if I have failed to understand something.
Try to run this:
import math
import time
from multiprocessing import Queue
import multiprocessing

def factorize_naive(n):
    factors = []
    for div in range(2, int(n**.5)+1):
        while not n % div:
            factors.append(div)
            n //= div
    if n != 1:
        factors.append(n)
    return factors

nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put his out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

time.sleep(5)

# Wait for all worker processes to finish
for p in procs:
    p.join()

print resultdict

time.sleep(15)
And open the task manager. You should be able to see the 4 subprocesses sit in a zombie state for some seconds before they disappear (when the join calls reap them).
In more complex situations the child processes could stay in a zombie state forever (like the situation you were asking about in another question), and if you create enough child processes you could fill the process table, causing trouble for the OS (which may kill your main process to avoid failures).
At the point just before you call join, all workers have put their results into their queues, but they did not necessarily return, and their processes may not yet have terminated. They may or may not have done so, depending on timing.
Calling join makes sure that all processes are given the time to properly terminate.
I am not exactly sure of the implementation details, but join also seems to be necessary to reflect that a process has indeed terminated (after calling terminate on it for example). In the example here, if you don't call join after terminating a process, process.is_alive() returns True, even though the process was terminated with a process.terminate() call.
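A minimal sketch of that last point (the sleeping worker is an illustrative stand-in for a process you want to kill):

import multiprocessing
import time

def hang():
    time.sleep(60)  # stand-in for a worker that never finishes on its own

if __name__ == '__main__':
    p = multiprocessing.Process(target=hang)
    p.start()
    p.terminate()
    print(p.is_alive())   # often still True: the child has not been reaped yet
    p.join()              # join() collects the terminated child
    print(p.is_alive())   # False once the process has been reaped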
