I'm new to multiprocessing in Python so I'm in doubt. My first idea was to use threads, but then I read about GIL and moved to multiprocessing.
My question is, when I start a process like this:
t1 = Process(target=run, args=lot)
t1.start()
do I need to stop it somehow from the main process, or they shutdown when the run() method is finished?
I know that things like join() exist, but I'm scheduling a job every n minutes and start a couple of processes in parallel, and this procedure goes until stopped, so I don't really need to wait for processes to finish.
Yes when t1.start() happens it executes the method which is specified in target(i.e run). Once its completed it exits automatically.
You can check this by checking the running process eg in linux use below command,
"ps -aux |grep python" or "ps -aux |grep "program_name.py"
when your target is running count will be more.
To wait until a process has completed its work and exited, use the join() method. But in your case its not required
more example are here : https://pymotw.com/2/multiprocessing/basics.html
Well, GIL is not a big problem when you are not doing much computation, but something like networking stuff or reading files when execution of a program is hanged and control flow is given to the krnel untill input/output operation is performed. Then another thread can run in python.
If you, owever, are bothering with more CPU-consuming stuff you actually should go for multiprocessing.
join() method is used for thread synchronization, so when main thread relies on data processed by another thread it is important to use it. Otherwise it is not. You operating system will handle things like closing child processes in a safe manner.
EDIT: check this discussion for more details.
Related
I am in a situation where I have two endpoints I can ask for a value, and one may be faster than the other. The calls to the endpoints are blocking. I want to wait for one to complete and take that result without waiting for the other to complete.
My solution was to issue the requests in separate threads and have those threads set a flag to true when they complete. In the main thread, I continuously check the flags (I know it is a busy wait, but that is not my primary concern right now) and when one completes it takes that value and returns it as the result.
The issue I have is that I never clean up the other thread. I can't find any way to do it without using .join(), which would just block and defeat the purpose of this whole thing. So, how can I clean up that other, slower thread that is blocking without joining it from the main thread?
What you want is to make your threads daemons, so when you get the result and finish your main, the other running thread will be forced to finish. You do that by changing the daemon keyword to True:
tr = threading.Thread(daemon=True)
From the threading docs:
The significance of this flag is that the entire Python program exits
when only daemon threads are left.
Although:
Daemon threads are abruptly stopped at shutdown. Their resources (such
as open files, database transactions, etc.) may not be released
properly. If you want your threads to stop gracefully, make them
non-daemonic and use a suitable signalling mechanism such as an Event.
I don't have any particular experience with Events so can't elaborate on that. Feel free to click the link and read on.
One bad and dirty solution is to implement a methode for the threads which close the socket which is blocking. Now you have to catch the exception in the main thread.
This is a two part question,
After I cancel my script it still continues run, what I'm doing is queering an exchange api and saving the data for various assets.
My parent script can be seen here you can see i'm testing it out with just 3 assets, a sample of one of the child scripts can be seen here.
After I cancel the script the script for BTC seems to still be running and new .json files are still being generated in it's respective folder. The only way to stop it is to delete the folder and create it again.
This is really a bonus, my code was working with two assets but now with the addition of another it seems to only take in data for BTC and not the other 2.
Your first problem is that you are not really creating worker threads.
t1 = Thread(target=BTC.main()) executes BTC.main() and uses its return code to try to start a thread. Since main loops forever, you don't start any other threads.
Once you fix that, you'll still have a problem.
In python, only the root thread sees signals such as ctrl-c. Other threads will continue executing no matter how hard you press the key. When python exits, it tries to join non-daemon threads and that can cause the program to hang. The main thread is waiting for a thread to terminate, but the thread is happily continuing with its execution.
You seem to be depending on this in your code. Your parent starts a bunch of threads (or will, when you fix the first bug) and then exits. Really, its waiting for the threads to exit. If you solve the problem with daemon threads (below), you'll also need to add code for your thread to wait and not exit.
Back to the thread problem...
One solution is to mark threads as "daemon" (do mythread.daemon = True before starting the thread). Python won't wait for those threads and the threads will be killed when the main thread exits. This is great if you don't care about what state the thread is in while terminating. But it can do bad things like leave partially written files laying around.
Another solution is to figure out some way for the main thread to interrupt the thread. Suppose the threads waits of socket traffic. You could close the socket and the thread would be woken by that event.
Another solution is to only run threads for short-lived tasks that you want to complete. Your ctrl-c gets delayed a bit but you eventually exit. You could even set them up to run off of a queue and send a special "kill" message to them when done. In fact, python thread pools are a good way to go.
Another solution is to have the thread check a Event to see if its time to exit.
This question already has answers here:
Is there any way to kill a Thread?
(31 answers)
Closed 6 years ago.
I start a thread using the following code.
t = thread.start_new_thread(myfunction)
How can I kill the thread t from another thread. So basically speaking in terms of code, I want to be able to do something like this.
t.kill()
Note that I'm using Python 2.4.
In Python, you simply cannot kill a Thread.
If you do NOT really need to have a Thread (!), what you can do, instead of using the threading package (http://docs.python.org/2/library/threading.html), is to use the multiprocessing package (http://docs.python.org/2/library/multiprocessing.html). Here, to kill a process, you can simply call the method:
yourProcess.terminate() # kill the process!
Python will kill your process (on Unix through the SIGTERM signal, while on Windows through the TerminateProcess() call). Pay attention to use it while using a Queue or a Pipe! (it may corrupt the data in the Queue/Pipe)
Note that the multiprocessing.Event and the multiprocessing.Semaphore work exactly in the same way of the threading.Event and the threading.Semaphore respectively. In fact, the first ones are clones of the latters.
If you REALLY need to use a Thread, there is no way to kill your threads directly. What you can do, however, is to use a "daemon thread". In fact, in Python, a Thread can be flagged as daemon:
yourThread.daemon = True # set the Thread as a "daemon thread"
The main program will exit when no alive non-daemon threads are left. In other words, when your main thread (which is, of course, a non-daemon thread) will finish its operations, the program will exit even if there are still some daemon threads working.
Note that it is necessary to set a Thread as daemon before the start() method is called!
Of course you can, and should, use daemon even with multiprocessing. Here, when the main process exits, it attempts to terminate all of its daemonic child processes.
Finally, please, note that sys.exit() and os.kill() are not choices.
If your thread is busy executing Python code, you have a bigger problem than the inability to kill it. The GIL will prevent any other thread from even running whatever instructions you would use to do the killing. (After a bit of research, I've learned that the interpreter periodically releases the GIL, so the preceding statement is bogus. The remaining comment stands, however.)
Your thread must be written in a cooperative manner. That is, it must periodically check in with a signalling object such as a semaphore, which the main thread can use to instruct the worker thread to voluntarily exit.
while not sema.acquire(False):
# Do a small portion of work…
or:
for item in work:
# Keep working…
# Somewhere deep in the bowels…
if sema.acquire(False):
thread.exit()
You can't kill a thread from another thread. You need to signal to the other thread that it should end. And by "signal" I don't mean use the signal function, I mean that you have to arrange for some communication between the threads.
I want to create some worker processes and if they crash due to an exception, I would like them to respawn. Aside from the is_alive method in the multiprocessing module, I can't seem to find a way to do this.
This would require me to iterate over all the running processes (after a sleep) and check if they are alive. This is essentially a busy loop, I was wondering if there was a better solution that will wake up my program in the event that any one of my worker processes has crashed. Once it wakes up, I would like to log th exception that crashed my program and launch another process.
Polling to see if the child processes are alive should work fine, since it's a low-overhead check and you don't need to check that often.
The first answer to this (similar) question has a Python code example: Multi-server monitor/auto restarter in python
You can wrap your worker processes in try/except blocks where the except pushes a message onto a pipe before raising. Of course, polling isn't really worse than this and it's simpler.
If you're on a unix-like system, your main program can be notified of dead children by installing a signal handler. Look up your operating system's documentation on signal(), especially SIGCHLD. I'm afraid I don't remember whether Windows covers SIGCHLD with its very limited POSIX signal support.
I am creating a thread in my Python app with thread.start_new_thread.
How do I stop it if it hasn't finished in three seconds time?
You can't do that directly. Anyway aborting a thread is not good practice - rather think about using synchronization mechanisms that let you abort the thread in a "soft" way.
But daemonic threads will automatically be aborted if no non-daemonic threads remain (e.g. if the only main thread ends). Maybe that's what you want.
If you really need to do this (e.g. the thread calls code that may hang forever) then consider rewriting your code to spawn a process with the multiprocessing module. You can then kill the process with the Process.terminate() method. You will need 2.6 or later for this, of course.
You cannot. Threads can't be killed from outside. The only thing you can do is add a way to ask the thread to exit. Obviously you won't be able to do this if the thread is blocked in some systemcall.
As noted in a related question, you might be able to raise an exception through ctypes.pythonapi, but not while it's waiting on a system call.