I'm using Python multiprocessing and waiting for all processes with this code:
...
results = []
for i in range(num_extract):
    url = queue.get(timeout=5)
    try:
        print "START PROCESS!"
        result = pool.apply_async(process, [host, url], callback=callback)
        results.append(result)
    except Exception, e:
        continue

for r in results:
    r.get(timeout=7)
...
I tried to use pool.join() but got this error:
Traceback (most recent call last):
  File "C:\workspace\sdl\lxchg\walker4.py", line 163, in <module>
    pool.join()
  File "C:\Python25\Lib\site-packages\multiprocessing\pool.py", line 338, in join
    assert self._state in (CLOSE, TERMINATE)
AssertionError
Why doesn't join() work? And what is a good way to wait for all processes?
My second question is: how can I restart a particular process in the pool? I need this because of a memory leak. Right now I rebuild the whole pool after all processes have finished their tasks (I create a new Pool object to get the processes restarted).
What I need: say I have 4 processes in the pool. A process gets its task, and once the task is done I want to kill that process and start a new one (to clear the leaked memory).
You are getting the error because you need to call pool.close() before calling pool.join().
I don't know of a good way to shut down a process started with apply_async but see if properly shutting down the pool doesn't make your memory leak go away.
The reason I think this is that the Pool class has a bunch of attributes that are threads running in daemon mode. All of these threads get cleaned up by the join method. The code you have now won't clean them up so if you create a new Pool, you'll still have all those threads running from the last one.
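Not mentioned in the answer above, but newer versions of multiprocessing also accept a maxtasksperchild argument to Pool, which replaces each worker process after it has completed a given number of tasks; that addresses the restart-to-clear-a-leak requirement directly. A rough sketch combining it with the close()/join() order, using a placeholder process() function:
import multiprocessing

def process(host, url):
    return len(url)  # placeholder for the real work

if __name__ == '__main__':
    # maxtasksperchild=1 makes the pool replace each worker after one task
    # (Python 2.7+/3.2+), so leaked memory is reclaimed without rebuilding
    # the whole pool by hand.
    pool = multiprocessing.Pool(processes=4, maxtasksperchild=1)
    results = [pool.apply_async(process, ['example.com', u]) for u in ['/a', '/b']]
    for r in results:
        r.get(timeout=7)
    pool.close()   # no more tasks may be submitted
    pool.join()    # now join() works: the pool state is CLOSE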
import os
import traceback
import multiprocessing as mp
import psutil

def kill_proc_tree(pid):
    parent = psutil.Process(pid)
    children = parent.children(recursive=True)
    for child in children:
        child.kill()

while True:
    pid = os.getpid()
    try:
        pool = mp.Pool(processes=1, maxtasksperchild=1)
        # my_func is defined elsewhere in my script
        result = pool.apply_async(my_func, args=())
        result.get(timeout=60)
        pool.close()
    except mp.context.TimeoutError:
        traceback.print_exc()
        kill_proc_tree(pid)
I am using the multiprocessing library and am trying to spawn a new process every time my_func finishes running, throws an exception, or has run longer than 60 seconds (result.get(timeout=60) should throw an exception). Since I want to keep the while loop running but also avoid having zombie processes, I need to keep the parent process running while killing all child processes whenever an exception is thrown in the parent or child process, or the child process finishes, before spawning a new process.
The kill_proc_tree function that I found online was supposed to tackle this, and it seemed to at first (my_func opens a new window when a process begins and closes the window when the process supposedly ends), but then I noticed in my Task Manager that the Python script is still taking up memory, and after enough multiprocessing.context.TimeoutError errors (they are thrown by the parent process) my memory becomes full.
So what should I do to solve this problem? Any help would be greatly appreciated!
The solution should be as simple as calling the terminate method on the pool for all exceptions, not just for a TimeoutError, since result.get(timeout=60) can raise an arbitrary exception if your my_func completes with an exception before the 60 seconds are up.
Note that according to the documentation the terminate method "stops the worker processes immediately without completing outstanding work" and is called implicitly when the pool's context manager exits, as in the following example:
import multiprocessing

while True:
    try:
        with multiprocessing.Pool(processes=1, maxtasksperchild=1) as pool:
            result = pool.apply_async(my_func, args=())
            result.get(timeout=60)
    except Exception:
        pass
Specifying the maxtasksperchild=1 parameter to the Pool constructor seems somewhat superfluous since you are never submitting more than one task to the pool anyway.
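For comparison, a rough sketch of the same idea without the context manager (still assuming the asker's my_func); here terminate() is called explicitly for every exception rather than relying on the implicit call when the with block exits:
import multiprocessing

while True:
    pool = multiprocessing.Pool(processes=1)
    try:
        result = pool.apply_async(my_func, args=())  # my_func from the question
        result.get(timeout=60)
        pool.close()
        pool.join()
    except Exception:
        pool.terminate()  # stops workers immediately, outstanding work is dropped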
I'd like to know when workers finish so that I can free up resources as the last action of each worker. Alternatively, I can free up these resources in the main process, but I need to free them after each worker finishes, one by one (in contrast to freeing them up once after all of the workers finish).
I'm running my workers as below, tracking progress and PIDs used:
from pathos.multiprocessing import ProcessingPool

pool = ProcessingPool(num_workers)
pool.restart(force=True)

# Loading PIDs of workers with my get_pid() function:
pids = pool.map(get_pid, xrange(num_workers))

try:
    results = pool.amap(
        exec_func,
        exec_args,
    )
    counter = 0
    while not results.ready():
        sleep(2)
        if counter % 60 == 0:
            log.info('Waiting for children running in pool.amap() with PIDs: {}'.format(pids))
        counter += 1
    results = results.get()

    # Attempting to close pool...
    pool.close()
    # The purpose of join() is to ensure that a child process has completed
    # before the main process does anything.
    # Attempting to join pool...
    pool.join()
except:
    # Try to terminate the pool in case some worker PIDs still run:
    cls.hard_kill_pool(pids, pool)
    raise
Because of load balancing, it is hard to know which job will be the last on a worker. Is there any way to know that some workers are already inactive?
I'm using pathos version 0.2.0.
I'm the pathos author. If you need to free up resources after each worker in a Pool is done running, I'd suggest you not use a Pool. A Pool is meant to allocate resources and keep using them until all jobs are done. What I'd suggest is to use a for loop that spawns a Process and then ensures that the spawned Process is joined when you are done with it. If you need to do this within pathos, the Process class is at the horribly named pathos.helpers.mp.Process (or much more directly at multiprocess.Process from the multiprocess package).
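A minimal sketch of that suggestion, reusing the exec_func/exec_args names from the question; free_resources() is a hypothetical stand-in for whatever per-worker cleanup you need:
from pathos.helpers import mp  # multiprocess.Process under the hood

for args in exec_args:
    p = mp.Process(target=exec_func, args=(args,))
    p.start()
    p.join()               # wait for this worker to finish
    free_resources(p.pid)  # hypothetical per-worker cleanup in the main process
This runs the jobs one at a time; you can keep several Process objects alive at once and join them as they finish if you still want parallelism.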
When is it necessary to call .join() and .close() on a Pool in the case below? Reading the docs, it looks like it is for waiting for the processes to finish. For instance, if I do something like this:
while True:
    pool = Pool(processes=4)
    results = []
    for x in range(1000):
        result = pool.apply_async(f, (x,))
        results.append(result)
    for result in results:
        result.get(timeout=1)
    print "finished"
Do I still need to wait for the processes to finish with join() and close()? I assume that, since I am iterating over all the async results and waiting (blocking) for them to finish, by the time I get to print "finished" all processes will have exited already?
Is this correct?
Also when do the processes start working on a function? I noticed that there are 4 processes running in parallel with ps -elf. Do the processes only start to work on the function after result.get() is called in this case?
close()
Prevents any more tasks from being submitted to the pool. Once all the tasks have been completed, the worker processes will exit.
join()
Waits for all the worker processes to properly terminate.
A good link to start with: Proper way to use multiprocessor.Pool in a nested loop
As soon as you call pool.apply_async, a worker process will start working on the function; apply_async returns a result object immediately.
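A short sketch of the explicit shutdown, with a placeholder f() standing in for the question's function; close() and join() make the pool teardown explicit instead of leaving the worker processes alive until the Pool object is garbage-collected:
from multiprocessing import Pool

def f(x):
    return x * x  # placeholder for the question's f

if __name__ == '__main__':
    pool = Pool(processes=4)
    results = [pool.apply_async(f, (x,)) for x in range(1000)]
    for result in results:
        result.get(timeout=1)   # blocks until each task is done
    pool.close()   # no more tasks can be submitted
    pool.join()    # wait for the worker processes to exit
    print("finished")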
Why does this code work with threads but not processes?
import concurrent.futures as f
import time

def wait_on_b():
    time.sleep(2)
    print(b.result())
    return 5

def wait_5():
    time.sleep(2)
    return 6

THREADS = False

if THREADS:
    executor = f.ThreadPoolExecutor()
else:
    executor = f.ProcessPoolExecutor()

a = executor.submit(wait_on_b)
b = executor.submit(wait_5)

print(a.result())  # works fine if THREADS, BrokenProcessPool otherwise
The docs do warn:
Calling Executor or Future methods from a callable submitted to a ProcessPoolExecutor will result in deadlock.
The docs don't seem to mention raising an exception, so does it mean ProcessPoolExecutor somehow discovered the deadlock and resolved it by killing both processes?
More importantly, why is this deadlock unavoidable with processes (and avoidable with threads), and what is the workaround if I want to use multiple processes with futures, without being so restricted?
When using threads, memory is shared between all threads, and that's why wait_on_b can access b.
When using processes, a new memory space is created for each process (a copy of the old one in fork mode), so you get a copy of b with a broken pipe, since it is not the real b (just a copy).
By the way: on Windows there is no fork, so b does not exist at all in the child's (completely new) memory, and you'll get a
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Anaconda3\lib\concurrent\futures\process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "C:\Users\yglazner\My Documents\LiClipse Workspace\anaconda_stuff\mproc.py", line 5, in wait_on_b
    print(b.result())
NameError: name 'b' is not defined
"""
I'm new to Python and trying to understand multithreading. Here's an example from the Python documentation on Queue.
For the life of me, I don't understand how this example works. In the worker() function there's an infinite loop. How does the worker know when to get out of the loop? There seems to be no breaking condition.
And what exactly is the join doing at the end? Shouldn't I be joining the threads instead?
def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

for item in source():
    q.put(item)

q.join()  # block until all tasks are done
Another question: when should multithreading be used, and when should multiprocessing be used?
Yup, you're right: worker will run forever. However, since the queue only ever holds a finite number of items, worker will eventually block permanently at q.get() (since there will be no more items in the queue). At this point, it's inconsequential that worker is still running. q.join() blocks until the queue's unfinished-task count drops to 0 (whenever a worker thread calls q.task_done(), the count drops by 1). After that, the program ends, and the infinitely blocking thread dies with its creator.
Regarding your second question, the biggest difference between threads and processes in Python is that the mainstream implementations use a global interpreter lock (GIL) to ensure that multiple threads can't mess up Python's internal data structures. This means that for programs that spend most of their time doing computation in pure Python, even with multiple CPUs you're not going to speed the program up much because only one thread at a time can hold the GIL. On the other hand, multiple threads can trivially share data in a Python program, and in some (but by no means all) cases, you don't have to worry too much about thread safety.
Where multithreading can speed up a Python program is when the program spends most of its time waiting on I/O -- disk access or, particularly these days, network operations. The GIL is not held while doing I/O, so many Python threads can run concurrently in I/O bound applications.
On the other hand, with multiprocessing, each process has its own GIL, so your performance can scale to the number of CPU cores you have available. The down side is that all communication between the processes will have to be done through a multiprocessing.Queue (which acts on the surface very like a Queue.Queue, but has very different underlying mechanics, since it has to communicate across process boundaries).
Since working through a thread safe or interprocess queue avoids a lot of potential threading problems, and since Python makes it so easy, the multiprocessing module is very attractive.
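A small sketch of the same worker pattern moved onto processes (my example, not from the answer above); multiprocessing.JoinableQueue provides the task_done()/join() pair that Queue.Queue offers for threads:
from multiprocessing import Process, JoinableQueue

def worker(q):
    while True:
        item = q.get()
        print(item)      # stand-in for do_work(item)
        q.task_done()

if __name__ == '__main__':
    # The queue must be passed explicitly: processes do not share module globals.
    q = JoinableQueue()
    for i in range(4):
        p = Process(target=worker, args=(q,))
        p.daemon = True
        p.start()
    for item in range(10):
        q.put(item)
    q.join()   # block until every item has been marked task_done()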
I mostly agree with joel-cornett. I tried to run the following snippet in Python 2.7:
from threading import Thread
from Queue import Queue

def worker():
    def do_work(item):
        print(item)
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()
for i in range(4):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

for item in range(10):
    q.put(item)

q.join()
The output is:
0
1
2
3
4
5
6
7
8
9
Exception in thread Thread-3 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 504, in run
  File "abc.py", line 9, in worker
  File "/usr/lib/python2.7/Queue.py", line 168, in get
  File "/usr/lib/python2.7/threading.py", line 236, in wait
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
The most probable explanation, I think:
As the queue becomes empty after the tasks are exhausted, the parent thread returns from q.join(), quits, and destroys the queue. The child threads are then terminated upon the first TypeError exception raised inside item = q.get(), since the queue no longer exists.
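If you'd rather shut the workers down cleanly than rely on daemon threads dying at interpreter exit, a common pattern (my sketch, not from the answers above) is to push one sentinel value per worker and break out of the loop when it arrives; then joining the threads themselves works as the asker expected:
from threading import Thread
from Queue import Queue

def worker(q):
    while True:
        item = q.get()
        if item is None:      # sentinel: time to leave the loop
            q.task_done()
            break
        print(item)           # stand-in for do_work(item)
        q.task_done()

q = Queue()
threads = [Thread(target=worker, args=(q,)) for _ in range(4)]
for t in threads:
    t.start()
for item in range(10):
    q.put(item)
for _ in threads:
    q.put(None)               # one sentinel per worker
for t in threads:
    t.join()                  # joining the threads is now meaningful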