Fork() from inside a spawned Thread in Python issue - python

I'm having problems killing the child processes created by fork() inside a thread that was itself spawned:
import os
import sys
import threading

def updateProxies():
    quota = 25
    children = []
    sons = 0
    for i in range(50):
        pid = os.fork()
        if pid:
            children.append(pid)
            sons += 1
            if sons >= quota:
                os.wait()
                sons -= 1
        else:
            {CHILD CODE EXECUTION}  # database calls, and network requests
            sys.exit()
    for x in children:
        os.waitpid(x, 0)

_td = threading.Thread(target=updateProxies, args=())
_td.start()
When I run the code above, the parent of the children stops at the os.waitpid(x, 0) line and never resumes from there. And yes, I tracked all the children until they die at their respective sys.exit(), but waitpid never gets informed of their death and my parent process never resumes!
When doing ps -ef, the child processes show up as (defunct). Aren't they dying?
IMPORTANT: when I execute the function from the main thread, everything works fine. How do I deal with this?

FOUND THE ANSWER:
I had to exit the forked processes with:
os._exit(0)
not with
sys.exit()
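For context: the os docs recommend os._exit() as the normal way to exit a child process after fork(), because it ends the process immediately without running interpreter cleanup, while sys.exit() merely raises SystemExit, which can be intercepted on the way out (here, by the threading machinery the child inherited). A minimal sketch of the corrected child branch, with do_child_work() as a hypothetical stand-in for the asker's database and network calls:

import os

pid = os.fork()
if pid == 0:
    do_child_work()   # hypothetical stand-in for the database/network calls
    os._exit(0)       # ends the child immediately; sys.exit() only raises SystemExit
else:
    os.waitpid(pid, 0)  # now sees the child's exit status as expected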

Related

Kill All Child Processes But Not The Parent Processes Upon Timeout Error

import os
import traceback
import multiprocessing
import multiprocessing as mp
import psutil

def kill_proc_tree(pid):
    parent = psutil.Process(pid)
    children = parent.children(recursive=True)
    for child in children:
        child.kill()

while True:
    pid = os.getpid()
    try:
        pool = mp.Pool(processes=1, maxtasksperchild=1)
        result = pool.apply_async(my_func, args=())
        result.get(timeout=60)
        pool.close()
    except multiprocessing.context.TimeoutError:
        traceback.print_exc()
        kill_proc_tree(pid)
I am using the multiprocessing library and am trying to spawn a new process every time my_func finishes running, throws an exception, or has run longer than 60 seconds (result.get(timeout=60) should throw an exception). Since I want to keep the while loop running but also avoid having zombie processes, I need to keep the parent process running while killing all child processes whenever an exception is thrown in the parent or the child, or the child finishes, before spawning a new process. The kill_proc_tree function that I found online was supposed to tackle the issue, which it seemed to do at first (my_func opens a new window when a process begins and closes the window when the process supposedly ends), but then I realized that in my Task Manager the Python script is still taking up memory, and after enough multiprocessing.context.TimeoutError errors (they are thrown by the parent process), my memory becomes full.
So what should I do to solve this problem? Any help would be greatly appreciated!
The solution should be as simple as calling the terminate method on the pool for all exceptions, not just for a TimeoutError, since result.get(timeout=60) can throw an arbitrary exception if your my_func finishes with an exception before the 60 seconds are up.
Note that according to the documentation the terminate method "stops the worker processes immediately without completing outstanding work" and is implicitly called when the pool's context manager is exited, as in the following example:
import multiprocessing

while True:
    try:
        with multiprocessing.Pool(processes=1, maxtasksperchild=1) as pool:
            result = pool.apply_async(my_func, args=())
            result.get(timeout=60)
    except Exception:
        pass
Specifying the maxtasksperchild=1 parameter to the Pool constructor seems somewhat superfluous since you are never submitting more than one task to the pool anyway.

what will process created by fork() do in python?

I wrote the following code, but I don't understand how it works very well:
import os
import time

NUM = 8

def timec():
    x = 1000000
    while x > 0:
        x -= 1

pid_children = []
start_time = time.time()

for i in range(NUM):
    pid = os.fork()

if pid == 0:
    timec()
    os._exit(0)
else:
    pid_children.append(pid)

for j in pid_children:
    os.waitpid(j, 0)

print(time.time() - start_time)
I cannot understand where the child process starts or where it will finish.
Another question: will the waitpid() method wait for the child process to finish its work, or will it just return as soon as it is called?
When os.fork() is called, the program splits into two completely separate programs. In the child, os.fork() returns 0. In the parent, os.fork() returns the process id of the child.
The key distinction about os.fork() is that it does not create a new thread that shares the memory of the original thread, but instead creates an entirely new process. The new process has a copy of the memory of its parent. Updates in the parent are not reflected in the child and updates in the child are not reflected in the parent! They each have their own state.
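A minimal sketch (not from the original post; it assumes a Unix system, since os.fork() is not available on Windows) illustrating both the return values and the separate memory:

import os

value = 1
pid = os.fork()

if pid == 0:
    # child: gets 0 back from fork() and has its own copy of `value`
    value = 100
    print(f"child  pid={os.getpid()} value={value}")   # value=100
    os._exit(0)
else:
    # parent: gets the child's pid back; the child's assignment is invisible here
    os.waitpid(pid, 0)
    print(f"parent pid={os.getpid()} value={value}")   # value=1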
Given that context, here are the answers to your specific questions:
Where do the child processes start?
pid = os.fork()
Each child begins executing right where that call returns (seeing pid == 0). Because the fork happens inside the for loop and no process leaves the loop early, this will generate far more than NUM processes: after the first iteration there are 2 processes inside the loop, each of which forks into 2 more, yielding 4 total processes after the second iteration, and so on. In total 256 (2^8) processes will be created!
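A tiny demonstration of that doubling (my own sketch, Unix only): every process, parent and child alike, keeps iterating and forking, so the process count doubles on each pass:

import os

for i in range(3):
    os.fork()

# all 2**3 == 8 resulting processes reach this line and print it
print(f"pid={os.getpid()} ppid={os.getppid()}")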
Where do the child processes end?
Some will exit at:
os._exit(0)
Others will exit at the end of the file. That's because you overwrote pid in the subsequent iterations of the loop, so some children became orphaned (and never ran timec()).
pid_children will always have only a single pid in it. That's because the entire state of the program is forked, and each process (which has its own copy of the list) only adds one element to it.
What does waitpid do?
os.waitpid(pid, 0) will block until the process with the given pid has completed.
os.fork() documentation
os.waitpid() documentation

how to terminate a process using python's multiprocessing

I have some code that needs to run against several other systems that may hang or have problems not under my control. I would like to use python's multiprocessing to spawn child processes to run independent of the main program and then when they hang or have problems terminate them, but I am not sure of the best way to go about this.
When terminate is called it does kill the child process, but then it becomes a defunct zombie that is not released until the process object is gone. The example code below, where the loop never ends, works to kill it and allow a respawn when called again, but does not seem like a good way of going about this (i.e. multiprocessing.Process() would be better in the __init__()).
Anyone have a suggestion?
import multiprocessing
import time

class Process(object):
    def __init__(self):
        self.thing = Thing()
        self.running_flag = multiprocessing.Value("i", 1)

    def run(self):
        self.process = multiprocessing.Process(target=self.thing.worker, args=(self.running_flag,))
        self.process.start()
        print self.process.pid

    def pause_resume(self):
        self.running_flag.value = not self.running_flag.value

    def terminate(self):
        self.process.terminate()

class Thing(object):
    def __init__(self):
        self.count = 1

    def worker(self, running_flag):
        while True:
            if running_flag.value:
                self.do_work()

    def do_work(self):
        print "working {0} ...".format(self.count)
        self.count += 1
        time.sleep(1)
You might run the child processes as daemons in the background.
process.daemon = True
Any errors and hangs (or an infinite loop) in a daemon process will not affect the main process, and it will only be terminated once the main process exits.
This works for simple problems, until you run into a lot of child daemon processes which keep consuming memory from the parent process without any explicit control.
The best way is to set up a Queue so that all the child processes can communicate with the parent process, allowing us to join them and clean up nicely. Here is some simple code that checks whether a child process is hanging (aka time.sleep(1000)) and sends a message to the queue for the main process to take action on:
import multiprocessing as mp
import time
import queue

running_flag = mp.Value("i", 1)

def worker(running_flag, q):
    count = 1
    while True:
        if running_flag.value:
            print(f"working {count} ...")
            count += 1
            q.put(count)
            time.sleep(1)
            if count > 3:
                # Simulate hanging with sleep
                print("hanging...")
                time.sleep(1000)

def watchdog(q):
    """
    This checks the queue for updates and sends a signal to it
    when the child process isn't sending anything for too long
    """
    while True:
        try:
            msg = q.get(timeout=10.0)
        except queue.Empty as e:
            print("[WATCHDOG]: Maybe WORKER is slacking")
            q.put("KILL WORKER")

def main():
    """The main process"""
    q = mp.Queue()

    workr = mp.Process(target=worker, args=(running_flag, q))
    wdog = mp.Process(target=watchdog, args=(q,))

    # run the watchdog as daemon so it terminates with the main process
    wdog.daemon = True

    workr.start()
    print("[MAIN]: starting process P1")
    wdog.start()

    # Poll the queue
    while True:
        msg = q.get()
        if msg == "KILL WORKER":
            print("[MAIN]: Terminating slacking WORKER")
            workr.terminate()
            time.sleep(0.1)
            if not workr.is_alive():
                print("[MAIN]: WORKER is a goner")
                workr.join(timeout=1.0)
                print("[MAIN]: Joined WORKER successfully!")
                q.close()
                break  # watchdog process daemon gets terminated

if __name__ == '__main__':
    main()
Without terminating worker, an attempt to join() it in the main process would have blocked forever, since worker never finishes.
The way Python multiprocessing handles processes is a bit confusing.
From the multiprocessing guidelines:
Joining zombie processes
On Unix when a process finishes but has not been joined it becomes a zombie. There should never be very many because each time a new process starts (or active_children() is called) all completed processes which have not yet been joined will be joined. Also calling a finished process’s Process.is_alive will join the process. Even so it is probably good practice to explicitly join all the processes that you start.
To avoid a process becoming a zombie, you need to call its join() method once you kill it.
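A minimal standalone sketch of that pattern (my own example, not the question's code): terminate the child and then join it, so it is reaped instead of lingering as a zombie:

import multiprocessing
import time

def worker():
    while True:
        time.sleep(1)

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    time.sleep(2)

    p.terminate()       # kill the child...
    p.join()            # ...then join it so it does not stay defunct
    print(p.exitcode)   # negative signal number, e.g. -15 for SIGTERM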
If you want a simpler way to deal with the hanging calls in your system you can take a look at pebble.

setDaemon() method of threading.Thread

I am a newbie in Python programming. I understand that a process can be a daemon, but I couldn't understand the use case of a thread in daemon mode. I would request the Python gurus to help me understand this.
Here is some basic code using threading:
import Queue
import threading

def basic_worker(queue):
    while True:
        item = queue.get()
        # do_work(item)
        print(item)
        queue.task_done()

def basic():
    # http://docs.python.org/library/queue.html
    queue = Queue.Queue()
    for i in range(3):
        t = threading.Thread(target=basic_worker, args=(queue,))
        t.daemon = True
        t.start()
    for item in range(4):
        queue.put(item)
    queue.join()  # block until all tasks are done
    print('got here')

basic()
When you run it, you get
% test.py
0
1
2
3
got here
Now comment out the line:
t.daemon = True
Run it again, and you'll see that the script prints the same result, but hangs.
The main thread ends (note that got here was printed), but the second thread never finishes.
In contrast, when t.daemon is set to True, the thread t is terminated when the main thread ends.
Note that "daemon threads" have little to do with daemon processes.
It looks like people tend to use Queue to explain threading, but I think there is a much simpler way to demo a daemon thread, using time.sleep().
Create daemon thread by setting the daemon parameter (default as None):
from threading import Thread
import time

def worker():
    time.sleep(3)
    print('daemon done')

thread = Thread(target=worker, daemon=True)
thread.start()
print('main done')
Output:
main done
Process finished with exit code 0
Remove the daemon argument, like:
thread = Thread(target=worker)
Re-run and see the output:
main done
daemon done
Process finished with exit code 0
Here we already see the difference of a daemon thread:
The entire Python program can exit if only daemon threads are left.
isDaemon() and setDaemon() are the old getter/setter API. Using the constructor argument, as above, or the daemon property is recommended.
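For illustration, the three spellings side by side (my own snippet; the constructor argument and the property are the recommended forms, while setDaemon()/isDaemon() are deprecated since Python 3.10):

from threading import Thread

t1 = Thread(target=print, args=("hi",), daemon=True)   # constructor argument (preferred)

t2 = Thread(target=print, args=("hi",))
t2.daemon = True                                       # property (preferred)

t3 = Thread(target=print, args=("hi",))
t3.setDaemon(True)                                     # old setter, emits DeprecationWarning on 3.10+
assert t3.isDaemon()                                   # old getter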
The Queue module has been renamed to queue starting with Python 3, to better reflect the fact that there are several queue classes (LIFO, FIFO, priority) in the module, so please adjust the import when using this example.
In simple words...
What is a Daemon thread?
Daemon threads can be shut down at any point in their flow, whereas non-daemon (i.e. user) threads run to completion.
Daemon threads run intermittently in the background as long as other non-daemon threads are running.
When all of the non-daemon threads are complete, daemon threads terminate automatically (no matter whether they have fully executed or not).
Daemon threads are service providers for the user threads running in the same process.
Python does not wait for running daemon threads to complete, NOT EVEN their finally blocks, but it does wait for the non-daemon threads that we create.
Daemon threads act like services in operating systems.
Python stops the daemon threads once all user threads (in contrast to the daemon threads) have terminated. Hence daemon threads can be used to implement, for example, monitoring functionality, as the thread is stopped by Python as soon as all user threads have stopped.
In a nutshell
If you do something like this
thread = Thread(target=worker_method, daemon=True)
there is NO guarantee that worker_method will get executed completely.
Where is this behaviour useful?
Consider two threads t1 (parent thread) and t2 (child thread). Let t2 be daemon. Now, you want to analyze the working of t1 while it is in running state; you can write the code to do this in t2.
Reference:
StackOverflow - What is a daemon thread in Java?
GeeksForGeeks - Python daemon threads
TutorialsPoint - Concurrency in Python - Threads
Official Python Documentation
I've adapted unutbu's answer for Python 3. Make sure that you run this script from the command line and not in some interactive environment like a Jupyter notebook.
import queue
import threading

def basic_worker(q):
    while True:
        item = q.get()
        # do_work(item)
        print(item)
        q.task_done()

def basic():
    q = queue.Queue()
    for item in range(4):
        q.put(item)
    for i in range(3):
        t = threading.Thread(target=basic_worker, args=(q,))
        t.daemon = True
        t.start()
    q.join()  # block until all tasks are done
    print('got here')

basic()
So when you comment out the daemon line, you'll notice that the program does not finish, you'll have to interrupt it manually.
Setting the threads to daemon threads makes sure that they are killed once the main program has finished.
Note: you could achieve the same thing here without daemon threads, if you would replace the infinite while loop with another condition:
def basic_worker(q):
    while not q.empty():
        item = q.get()
        # do_work(item)
        print(item)
        q.task_done()

Python-daemon doesn't kill its kids

When using python-daemon, I'm creating subprocesses like so:
import daemon
import multiprocessing

class Worker(multiprocessing.Process):
    def __init__(self, queue):
        self.queue = queue  # we wait for things from this in Worker.run()
        ...

q = multiprocessing.Queue()

with daemon.DaemonContext():
    for i in xrange(3):
        Worker(q)
    while True:  # let the Workers do their thing
        q.put(_something_we_wait_for())
When I kill the parent daemonic process (i.e. not a Worker) with a Ctrl-C or SIGTERM, etc., the children don't die. How does one kill the kids?
My first thought is to use atexit to kill all the workers, like so:
with daemon.DaemonContext():
    workers = list()
    for i in xrange(3):
        workers.append(Worker(q))

    @atexit.register
    def kill_the_children():
        for w in workers:
            w.terminate()

    while True:  # let the Workers do their thing
        q.put(_something_we_wait_for())
However, the children of daemons are tricky things to handle, and I'd be obliged for thoughts and input on how this ought to be done.
Thank you.
Your options are a bit limited. If doing self.daemon = True in the constructor for the Worker class does not solve your problem and trying to catch signals in the Parent (ie, SIGTERM, SIGINT) doesn't work, you may have to try the opposite solution - instead of having the parent kill the children, you can have the children commit suicide when the parent dies.
The first step is to give the constructor to Worker the PID of the parent process (you can do this with os.getpid()). Then, instead of just doing self.queue.get() in the worker loop, do something like this:
waiting = True
while waiting:
    # see if Parent is at home
    if os.getppid() != self.parentPID:
        # woe is me! My Parent has died!
        sys.exit()  # or whatever you want to do to quit the Worker process
    try:
        # I picked the timeout randomly; use what works
        data = self.queue.get(block=False, timeout=0.1)
        waiting = False
    except Queue.Empty:
        continue  # try again
# now do stuff with data
The solution above checks to see if the parent PID is different than what it originally was (that is, if the child process was adopted by init or launchd because the parent died) - see reference. However, if that doesn't work for some reason you can replace it with the following function (adapted from here):
def parentIsAlive(self):
    try:
        # try to call Parent
        os.kill(self.parentPID, 0)
    except OSError:
        # *beeep* oh no! The phone's disconnected!
        return False
    else:
        # *ring* Hi mom!
        return True
Now, when the Parent dies (for whatever reason), the child Workers will spontaneously drop like flies - just as you wanted, you daemon! :-D
You should store the parent pid when the child is first created (let's say in self.myppid); when self.myppid is different from os.getppid(), it means that the parent has died.
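A minimal sketch of that idea (my own code; the Worker class name comes from the question and myppid is just the attribute name suggested above):

import os
import multiprocessing

class Worker(multiprocessing.Process):
    def __init__(self, queue):
        super(Worker, self).__init__()
        self.queue = queue
        self.myppid = os.getpid()   # runs in the parent, so this is the parent's pid

    def run(self):
        # runs in the child after start()
        while True:
            if os.getppid() != self.myppid:
                os._exit(1)         # we were re-parented, so the original parent is gone
            # ... pull work from self.queue and process it here ...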
To avoid checking if the parent has changed over and over again, you can use PR_SET_PDEATHSIG that is described in the signals documentation.
5.8 The Linux "parent death" signal
For each process there is a variable pdeath_signal, that is
initialized to 0 after fork() or clone(). It gives the signal that the
process should get when its parent dies.
In this case, since you want your process to die, you can just set it to SIGHUP, like this:
prctl(PR_SET_PDEATHSIG, SIGHUP);
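The standard library has no wrapper for prctl, so from Python you can either use the third-party python-prctl package or call libc through ctypes; a hedged sketch of the ctypes route (Linux only; PR_SET_PDEATHSIG is the constant 1 from <linux/prctl.h>):

import ctypes
import signal

def set_parent_death_signal(sig=signal.SIGHUP):
    """Ask Linux to deliver `sig` to this process when its parent dies."""
    PR_SET_PDEATHSIG = 1
    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.prctl(PR_SET_PDEATHSIG, sig) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")

# call it first thing inside the child process, e.g. at the top of Worker.run()
set_parent_death_signal()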
Atexit won't do the trick -- it only gets run on successful non-signal termination -- see the note near the top of the docs. You need to set up signal handling via one of two means.
The easier-sounding option: set the daemon flag on your worker processes, per http://docs.python.org/library/multiprocessing.html#process-and-exceptions
Somewhat harder-sounding option: PEP-3143 seems to imply there is a built-in way to hook program cleanup needs in python-daemon.
