I am working with the multiprocessing module on a Unix system and have noticed memory leaks when I terminate one of my programs. I suspect this is because the processes started from the main process keep running. Is this correct?
I think I'd refer you to this post, where the author does a great job of explaining how the other threads behave.
You can just run your program and check whether any Python processes are still alive after the main process has terminated.
The correct way to shut down your program is to make sure all subprocesses have terminated before the main process ends. (Call the Process.terminate() and Process.join() methods on every subprocess before the main process exits.)
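For example, a minimal sketch of such a cleanup step (using multiprocessing.active_children() to find whatever is still running; the helper name is just illustrative):

import multiprocessing

def shut_down_children():
    # Ask every live child to stop, then join it so it is fully reaped.
    for proc in multiprocessing.active_children():
        proc.terminate()
        proc.join()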
I am writing a Python script which uses multiprocessing, multithreading and ZeroMQ for interprocess communication. It all works fine until the program finishes: at that point the child processes terminate properly (the shutdown signal is caught via sigwait and the child procs terminate, which I have confirmed with the ps command), but the main process often does not shut down; occasionally it does, but most of the time it does not. I have confirmed that all remaining threads of the main process are daemonic and that the last line of the script executes properly (it is a logging.info call). I am using the fork start method and can see that a ForkProcess is still running in addition to the main process.
What is the best way to debug this, considering that the script has actually finished? Maybe add a pdb or breakpoint() call right at the end?
Thanks in advance.
Here is the output; after the last line the script usually does not terminate:
INFO root::remaining active child processes: [<ForkProcess name='SyncManager-1' pid=6362 parent=6361 started>]
INFO root::non-daemonic threads which are still running, preventing orderly shutdown: [].
INFO root::======== PID: 6361 main() end: shut down completed.=========
EDIT:
I refactored the code and noticed that it now misbehaves only very rarely. I am 99.9% certain that it is due to a ZeroMQ REQ/REP 'socket' being open at the time of shutdown. The refactoring ensures that these sockets are held open only for a very short time, but it is not predictable which sockets are open at shutdown, so occasionally it still hangs.
I will write a simple test harness with two processes communicating via REQ/REP sockets, then shut down the child process followed by the main process. I expect the same result, i.e., the interpreter not shutting down. Let's see; I'll keep you posted.
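As a starting point, here is a rough sketch of the harness I have in mind (assuming pyzmq; the LINGER setting is my main suspect, since a socket with queued messages otherwise blocks context termination at exit):

import multiprocessing
import zmq

ADDR = "tcp://127.0.0.1:5555"

def child():
    ctx = zmq.Context()
    rep = ctx.socket(zmq.REP)
    rep.bind(ADDR)
    rep.recv()             # wait for one request
    rep.send(b"pong")
    rep.close(linger=0)    # do not wait for pending messages at shutdown
    ctx.term()

if __name__ == "__main__":
    p = multiprocessing.Process(target=child)
    p.start()
    ctx = zmq.Context()
    req = ctx.socket(zmq.REQ)
    req.connect(ADDR)
    req.send(b"ping")
    print(req.recv())
    req.close(linger=0)
    ctx.term()             # this is where it would hang if a socket were left open
    p.join()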
I think you could try viztracer. The nice thing about viztracer is that it can display all the processes on the same timeline, so you may be able to catch what is stopping your main process or forked process from shutting down. If it's a deadlock it should be noticeable. However, without the code I really can't tell for sure whether it would help.
I'm new to multiprocessing in Python, so I'm unsure. My first idea was to use threads, but then I read about the GIL and moved to multiprocessing.
My question is, when I start a process like this:
from multiprocessing import Process

t1 = Process(target=run, args=lot)  # args: an iterable of positional arguments for run()
t1.start()
do I need to stop it somehow from the main process, or do they shut down on their own when the run() method finishes?
I know that things like join() exist, but I'm scheduling a job every n minutes and starting a couple of processes in parallel, and this procedure runs until it is stopped, so I don't really need to wait for the processes to finish.
Yes. When t1.start() is called, it executes the function specified in target (i.e. run). Once that function completes, the process exits automatically.
You can check this by listing the running processes. On Linux, for example:
ps aux | grep python
ps aux | grep "program_name.py"
While your target is running, the count will be higher.
To wait until a process has completed its work and exited, use the join() method. But in your case it's not required.
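A minimal sketch of that lifecycle (run here is just a stand-in for your actual target):

from multiprocessing import Process
import time

def run():
    time.sleep(1)  # stand-in for the real work

if __name__ == "__main__":
    p = Process(target=run)
    p.start()
    print(p.is_alive())  # True while run() is executing
    time.sleep(2)
    print(p.is_alive())  # False: the process exited on its own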
More examples are here: https://pymotw.com/2/multiprocessing/basics.html
Well, the GIL is not a big problem when you are not doing much computation but rather things like networking or reading files: execution blocks, control is handed to the kernel until the input/output operation completes, and in the meantime another Python thread can run.
If, however, you are doing more CPU-intensive work, you should indeed go for multiprocessing.
The join() method is used for synchronization: it is important when the main thread relies on data processed by another thread, and otherwise it is not. Your operating system will handle things like closing child processes in a safe manner.
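A small sketch of what that looks like in practice (worker and results are just illustrative names):

import threading

results = []

def worker():
    results.append(42)  # pretend this is expensive work

t = threading.Thread(target=worker)
t.start()
t.join()        # the main thread waits here, so reading results below is safe
print(results)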
EDIT: check this discussion for more details.
I am using a Python process to run one of my functions like so:
from multiprocessing import Process

Process1 = Process(target=someFunction)
Process1.start()
Now, that function has no loop or anything; it just does its thing and then ends. Does the Process die with it, or do I always need to add a:
Process1.terminate()
Afterwards?
The child process will exit by itself; the Process1.terminate() is unnecessary in that regard. Avoiding terminate() is especially important if the child and parent process share any resources. From the Python documentation:
Avoid terminating processes
Using the Process.terminate method to stop a process is liable to cause any shared resources (such as locks, semaphores, pipes and queues) currently being used by the process to become broken or unavailable to other processes.
Therefore it is probably best to only consider using Process.terminate on processes which never use any shared resources.
However, if you want the parent process to wait for the child process to finish (perhaps the child process is modifying something that the parent will access afterwards), then you'll want to use Process1.join() to block the parent process from continuing until the child process completes. This is generally good practice when using child processes, as it avoids zombie processes and orphaned children.
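For instance, a minimal sketch of that pattern (reusing the question's someFunction as a placeholder):

from multiprocessing import Process

def someFunction():
    pass  # whatever the child is supposed to do

if __name__ == "__main__":
    p = Process(target=someFunction)
    p.start()
    p.join()  # wait for the child and reap it, avoiding a zombie entry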
No: as per the documentation, it only sends a SIGTERM (or, on Windows, calls TerminateProcess()) to the process in question. If the process has already exited, there is nothing to terminate.
However, it is always good practice to use exit codes in your subprocesses:
import sys
sys.exit(1)
And then check the exit code once you know the process has terminated:
if Process1.exitcode:  # note: exitcode is an attribute, not a method
    errorHandle()
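Putting the two pieces together, a rough sketch (work and errorHandle are placeholder names):

from multiprocessing import Process
import sys

def errorHandle():
    print("child process failed")  # placeholder handler

def work():
    try:
        pass  # the actual job goes here
    except Exception:
        sys.exit(1)  # a non-zero exit code signals failure to the parent

if __name__ == "__main__":
    p = Process(target=work)
    p.start()
    p.join()          # make sure the child has terminated
    if p.exitcode:    # exitcode is None until then, 0 on success
        errorHandle()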
I'm trying to create a background process for some heavy calculations from the main Pylons process. Here's the code:
p = Process(target=instance_process,
            args=(instance_tuple.instance, parent_pipe, child_pipe))
p.start()
The process is created and started, but it seems to be a fork of the main process: it is listening on the same port and the whole application hangs. What am I doing wrong?
Thanks in advance.
Process IS a fork. If you look through its implementation you'll find that Process.start() calls fork(). It does NOT, however, call any of the exec variants to change the execution context.
Still, this may have nothing to do with listening on the same port (unless the parent process is multi-threaded). At which point does the program hang?
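If the fork semantics themselves are the problem (the child inheriting the parent's sockets and listeners), one option on Python 3.4+ is the spawn start method, which launches a fresh interpreter instead of duplicating the parent. A rough sketch:

import multiprocessing

def instance_process():
    pass  # the heavy calculations go here

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")  # child starts clean rather than as a fork
    p = multiprocessing.Process(target=instance_process)
    p.start()
    p.join()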
I know that when you try to shut down a Python program without terminating a child process created through multiprocessing, it will hang until the child process terminates.
This can happen if, for instance, you do not close the pipe between the processes.
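For example, a minimal sketch of the pipe hygiene in question (each side closes the ends it does not use, and its own end when done):

from multiprocessing import Process, Pipe

def child(conn):
    conn.send("done")
    conn.close()  # the child closes its end when finished

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=child, args=(child_conn,))
    p.start()
    child_conn.close()        # the parent drops its copy of the child's end
    print(parent_conn.recv())
    parent_conn.close()
    p.join()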
I am creating a thread in my Python app with thread.start_new_thread.
How do I stop it if it hasn't finished within three seconds?
You can't do that directly. In any case, aborting a thread is not good practice; rather, think about using synchronization mechanisms that let you stop the thread in a "soft" way.
However, daemonic threads are aborted automatically once no non-daemonic threads remain (e.g. once the main thread, if it is the only non-daemonic thread, ends). Maybe that's what you want.
If you really need to do this (e.g. the thread calls code that may hang forever), then consider rewriting your code to spawn a process with the multiprocessing module. You can then kill the process with the Process.terminate() method. You will need Python 2.6 or later for this, of course.
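Applied to the three-second requirement in the question, a minimal sketch of that approach (may_hang is illustrative):

from multiprocessing import Process
import time

def may_hang():
    time.sleep(60)  # stands in for code that might block forever

if __name__ == "__main__":
    p = Process(target=may_hang)
    p.start()
    p.join(3.0)        # wait at most three seconds
    if p.is_alive():
        p.terminate()  # unlike a thread, a process can be killed
        p.join()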
You cannot; threads can't be killed from the outside. The only thing you can do is add a way to ask the thread to exit. Obviously, you won't be able to do this if the thread is blocked in some system call.
As noted in a related question, you might be able to raise an exception through ctypes.pythonapi, but not while it's waiting on a system call.
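If the thread's work can be broken into small steps, the cooperative approach described above can look like this sketch (using the higher-level threading module and an Event as the stop flag):

import threading

stop = threading.Event()

def worker():
    while not stop.is_set():
        # do one small unit of work here, then re-check the flag
        stop.wait(0.1)  # a short pause that returns immediately once stop is set

t = threading.Thread(target=worker)
t.start()
t.join(3.0)   # give the thread three seconds to finish on its own
stop.set()    # then ask it to exit
t.join()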