Runaway multi-threaded script that continues to run after being cancelled (Python)

This is a two-part question.
After I cancel my script it still continues to run. What I'm doing is querying an exchange API and saving the data for various assets.
My parent script can be seen here; you can see I'm testing it out with just 3 assets. A sample of one of the child scripts can be seen here.
After I cancel the script, the script for BTC still seems to be running, and new .json files are still being generated in its respective folder. The only way to stop it is to delete the folder and create it again.
This is really a bonus: my code was working with two assets, but now with the addition of another it seems to only take in data for BTC and not the other 2.

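Your first problem is that you are not really creating worker threads.
t1 = Thread(target=BTC.main()) calls BTC.main() immediately, in the main thread, and would use its return value as the thread's target. Since main loops forever, that call never returns, so you never start any other threads. Pass the function itself instead: Thread(target=BTC.main).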
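The difference can be seen in a small sketch. Here poll is a hypothetical stand-in for BTC.main, bounded so that the example terminates, and the asset names other than BTC are made up for illustration.

```python
import threading
import time

def poll(name, results, rounds=3):
    """Hypothetical stand-in for BTC.main(); bounded so the demo ends."""
    for i in range(rounds):
        results.append((name, i))
        time.sleep(0.01)

results = []

# Wrong: Thread(target=poll("BTC", results)) would call poll() right here,
# in the main thread, and hand its return value (None) to Thread.

# Right: pass the callable itself and supply its arguments separately.
threads = [
    threading.Thread(target=poll, args=(name, results))
    for name in ("BTC", "ETH", "LTC")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted({name for name, _ in results}))
```

With the callable passed correctly, all three workers contribute entries to results instead of only the first one.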
Once you fix that, you'll still have a problem.
In Python, only the main thread sees signals such as Ctrl-C. Other threads will continue executing no matter how hard you press the key. When Python exits, it tries to join non-daemon threads, and that can cause the program to hang: the main thread is waiting for a thread to terminate, but the thread is happily continuing with its execution.
You seem to be depending on this in your code. Your parent starts a bunch of threads (or will, when you fix the first bug) and then exits. Really, it's waiting for the threads to exit. If you solve the problem with daemon threads (below), you'll also need to add code for your main thread to wait and not exit.
Back to the thread problem...
One solution is to mark threads as "daemon" (set mythread.daemon = True before starting the thread). Python won't wait for those threads, and they will be killed when the main thread exits. This is great if you don't care about what state the thread is in when it terminates, but it can do bad things like leave partially written files lying around.
Another solution is to figure out some way for the main thread to interrupt the thread. Suppose the thread waits on socket traffic. You could close the socket and the thread would be woken by that event.
Another solution is to only run threads for short-lived tasks that you want to complete. Your Ctrl-C gets delayed a bit but you eventually exit. You could even set them up to run off of a queue and send a special "kill" message to them when done. In fact, Python thread pools are a good way to go.
Another solution is to have the thread check an Event to see if it's time to exit.
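As a rough sketch of the Event approach (not the asker's actual code): the worker function, the asset names and the timings below are assumptions chosen only so the example finishes quickly. The main thread stays alive so that it can receive Ctrl-C, then sets the Event and joins the workers.

```python
import threading
import time

stop = threading.Event()

def worker(name, log):
    """Hypothetical polling loop: one unit of work, then a short wait."""
    while not stop.is_set():
        log.append(name)      # stand-in for "query the API, write a .json file"
        stop.wait(0.05)       # sleeps, but wakes at once when stop is set

def run(names, run_for):
    log = []
    threads = [threading.Thread(target=worker, args=(n, log)) for n in names]
    for t in threads:
        t.start()
    try:
        # The main thread keeps running, so it is the one that sees Ctrl-C.
        time.sleep(run_for)
    except KeyboardInterrupt:
        pass
    finally:
        stop.set()            # ask every worker to finish its current round
        for t in threads:
            t.join()          # returns quickly because workers check the flag
    return log, threads

log, threads = run(["BTC", "ETH"], run_for=0.2)
print(all(not t.is_alive() for t in threads))
```

Because each worker finishes its current iteration before exiting, no file is left half-written, which is the main advantage over daemon threads.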

Related

What is the best way to debug a python multiprocess script which fails to terminate?

I am writing a python script which uses multiprocessing, multithreading and zeromq for interprocess communication. It all works fine until the program finishes: at that time the child processes terminate properly (sigwait is intercepted and the child procs terminate which I have confirmed with the ps command) but the main process often does not shut down - occasionally it does, but most of the time it does not. I have confirmed that all remaining threads of the main process are daemonic and that the last row of the script is executed properly (it is a logging.info call). I am using fork for forking processes and can see that a Forkprocess still runs in addition to the main process.
What is the best way to debug this, considering that the script has actually finished? Maybe add a pdb or breakpoint() call right at the end?
Thanks in advance.
Here is the output, after the last row the script usually does not terminate:
INFO root::remaining active child processes: [<ForkProcess name='SyncManager-1' pid=6362 parent=6361 started>]
INFO root::non-daemonic threads which are still running, preventing orderly shutdown: [].
INFO root::======== PID: 6361 main() end: shut down completed.=========
EDIT:
I refactored the code and noticed that it now misbehaves very rarely. I am 99.9% certain that it is due to an open zeromq REQ/REP 'socket' at the time of shutdown. The refactoring made sure that these sockets are held open only for a very short time, but it is not predictable which sockets are open at shutdown, so occasionally it still hangs.
I will write a simple test harness with two processes communicating via REQ/REP sockets, then shut down the child process followed by the main process. I expect the same result, i.e., the interpreter not shutting down. Let's see; I'll keep you posted.

I think you could try viztracer. The good thing about viztracer is that it can display all the processes on the same timeline. Maybe you can catch what's stopping your main process/forked process from shutting down. If it's a deadlock it should be noticeable. However, without the code, I really can't tell if it would help for sure.

Clean up a thread without .join() and without blocking the main thread

I am in a situation where I have two endpoints I can ask for a value, and one may be faster than the other. The calls to the endpoints are blocking. I want to wait for one to complete and take that result without waiting for the other to complete.
My solution was to issue the requests in separate threads and have those threads set a flag to true when they complete. In the main thread, I continuously check the flags (I know it is a busy wait, but that is not my primary concern right now) and when one completes it takes that value and returns it as the result.
The issue I have is that I never clean up the other thread. I can't find any way to do it without using .join(), which would just block and defeat the purpose of this whole thing. So, how can I clean up that other, slower thread that is blocking without joining it from the main thread?

What you want is to make your threads daemons, so that when you get the result and your main thread finishes, the other, still-running thread is forced to finish too. You do that by setting the daemon keyword argument to True:
tr = threading.Thread(daemon=True)
From the threading docs:
The significance of this flag is that the entire Python program exits
when only daemon threads are left.
Although:
Daemon threads are abruptly stopped at shutdown. Their resources (such
as open files, database transactions, etc.) may not be released
properly. If you want your threads to stop gracefully, make them
non-daemonic and use a suitable signalling mechanism such as an Event.
I don't have any particular experience with Events, so I can't elaborate on that. The threading docs cover them if you want to read on.
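One way to combine daemon threads with "take whichever result arrives first" is sketched below. The fetch function and its delays are hypothetical stand-ins for the two blocking endpoint calls, and the queue.Queue replaces the flag polling from the question, so the main thread blocks instead of busy-waiting.

```python
import queue
import threading
import time

def fetch(name, delay, out):
    """Hypothetical blocking endpoint call, simulated with sleep()."""
    time.sleep(delay)
    out.put((name, "value from " + name))

def first_result(endpoints):
    out = queue.Queue()
    for name, delay in endpoints:
        # daemon=True: the slower thread will not keep the program alive.
        threading.Thread(
            target=fetch, args=(name, delay, out), daemon=True
        ).start()
    # Blocks without busy-waiting until the first thread delivers a result.
    return out.get()

winner, value = first_result([("fast", 0.05), ("slow", 0.5)])
print(winner)
```

The slower thread is never joined. It either finishes on its own a little later or is dropped when the program exits, which is acceptable only if it holds no resources that need a clean release.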

One quick and dirty solution is to implement a method for the threads which closes the socket that is blocking. You then have to catch the resulting exception in the main thread.

python function not running as thread

This is done in Python 2.7.12.
serialHelper is a class module around Python serial, and this code works nicely:
#!/usr/bin/env python
import threading
from time import sleep
import serialHelper
sh = serialHelper.SerialHelper()
def serialGetter():
    h = 0
    while True:
        h = h + 1
        s_resp = sh.getResponse()
        print ('response ' + s_resp)
        sleep(3)
if __name__ == '__main__':
    try:
        t = threading.Thread(target=sh.serialReader)
        t.setDaemon(True)
        t.start()
        serialGetter()
        #tSR = threading.Thread(target=serialGetter)
        #tSR.setDaemon(True)
        #tSR.start()
    except Exception as e:
        print (e)
However, when I try to run serialGetter as a thread instead, as in the commented-out lines, it just dies.
Is there any reason why that function cannot run as a thread?

Quoting from the Python documentation:
The entire Python program exits when no alive non-daemon threads are left.
So if you setDaemon(True) every new thread and then exit the main thread (by falling off the end of the script), the whole program will exit immediately. This kills all of the threads. Either don't use setDaemon(True), or don't exit the main thread without first calling join() on all of the threads you want to wait for.
Stepping back for a moment, it may help to think about the intended use case of a daemon thread. In Unix, a daemon is a process that runs in the background and (typically) serves requests or performs operations, either on behalf of remote clients over the network or local processes. The same basic idea applies to daemon threads:
1. You launch the daemon thread with some kind of work queue.
2. When you need some work done on the thread, you hand it a work object.
3. When you want the result of that work, you use an event or a future to wait for it to complete.
4. After requesting some work, you always eventually wait for it to complete, or perhaps cancel it (if your worker protocol supports cancellation).
5. You don't have to clean up the daemon thread at program termination. It just quietly goes away when there are no other threads left.
The problem is step (4). If you forget about some work object, and exit the app without waiting for it to complete, the work may get interrupted. Daemon threads don't gracefully shut down, so you could leave the outside world in an inconsistent state (e.g. an incomplete database transaction, a file that never got closed, etc.). It's often better to use a regular thread, and replace step (5) with an explicit "Finish up your work and shut down" work object that the main thread hands to the worker thread before exiting. The worker thread then recognizes this object, stops waiting on the work queue, and terminates itself once it's no longer doing anything else. This is slightly more up-front work, but is much safer in the event that a work object is inadvertently abandoned.
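The "finish up and shut down" work object could look roughly like the following. This is a minimal sketch: the SHUTDOWN sentinel, the worker function and the doubling "work" are illustrative names and placeholders, not part of any library.

```python
import queue
import threading

SHUTDOWN = object()   # the "finish up your work and shut down" work object

def worker(tasks, results):
    while True:
        item = tasks.get()        # blocks while idle, using no CPU
        if item is SHUTDOWN:
            break                 # everything queued before it is already done
        results.append(item * 2)  # stand-in for real work

tasks = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(tasks, results))  # regular thread
t.start()

for n in (1, 2, 3):
    tasks.put(n)

tasks.put(SHUTDOWN)   # the main thread asks the worker to stop before exiting
t.join()
print(results)
```

Because the queue is first-in, first-out, the sentinel is only seen after all earlier work objects have been processed, so nothing is abandoned half-done.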
Because of all of the above, I recommend not using daemon threads unless you have a strong reason for them.

Python - Stopping a long running taskq's thread

I have a fairly simple program in which each task added to the taskq executes and computes something for, say, 30 seconds. The task itself is not running in any kind of while or for loop.
def run(self):
    while not self.stopper.is_set():
        DO_MY_30_SECONDS_WORK(self)
        self.task_done()
Now, assuming I have a threading.Event, it can be checked before or after the task is done. But is there a way to tell the already running thread to stop or exit its execution?

There's no way to stop your running thread if DO_MY_30_SECONDS_WORK(self) is blocking. Arguably you could make it a daemon thread, and it would be killed abruptly when your main program finishes. But that causes problems if the thread is actually holding resources (e.g. writing to a file), and it is generally not a good way to end a thread.
What you could do is redesign DO_MY_30_SECONDS_WORK(self) to be non-blocking: cut the work into small pieces and have it check for the stop signal at a reasonable interval, so that your thread is responsive enough to finish itself when you tell it to.
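A sketch of that redesign, under the assumption that the work can be split into equal chunks: do_work is a hypothetical replacement for DO_MY_30_SECONDS_WORK, and the chunk count and duration are scaled down so the example runs quickly.

```python
import threading
import time

def do_work(stopper, chunks=30, chunk_seconds=0.01):
    """Hypothetical chunked version of the long task.

    Returns the number of chunks completed before stopping."""
    done = 0
    for _ in range(chunks):
        if stopper.is_set():      # check the stop signal between chunks
            break
        time.sleep(chunk_seconds) # stand-in for one small piece of work
        done += 1
    return done

stopper = threading.Event()
progress = []
t = threading.Thread(
    target=lambda: progress.append(do_work(stopper, chunks=1000))
)
t.start()
time.sleep(0.05)   # let the task run for a moment
stopper.set()      # ask it to stop; it exits after the current chunk
t.join(timeout=2)
print(progress[0] < 1000)
```

The chunk size sets the worst-case delay between setting the Event and the thread actually stopping, so it is a trade-off between responsiveness and checking overhead.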

Setting up idle thread/signalling thread

I'm using Python with wxPython for writing an app.
The method I'm considering to accomplish this may not be the best - if that's the case, let me know because I'm open to refactoring.
Right now, I have one GUI form. The main program start point instantiates an instance of the GUI form then runs wx.mainLoop(), which causes the app's main initial thread to block for the lifetime of the app.
We of course know that when events happen in the UI, the UI thread runs the code for them.
Now, I have another thread - a worker thread. This thread needs to sit idle, and then when something happens in the UI thread, e.g. a button is clicked, I want the worker thread to stop idling and do something else - run a function, say.
I can't envision this right now but I could see as the app gets more complex also having to signal the worker thread while it's actually busy doing something.
I have two questions about this setup:
1. How can I make my worker thread idle without using up CPU time? Doing something like while True: pass will suck CPU time, while something like while True: time.sleep(0.1) will not allow instantaneous reaction to events.
2. What's the best way to signal into the worker thread to do something? I don't want the UI thread to execute something, I want the worker thread to be signaled, by the UI thread, that it should change what it's doing. Ideally, I'd have some way for the worker thread to register a callback with the UI itself, so that when a button is clicked or any other UI Event happens, the worker thread is signalled to change what it's doing.
So, is this the best way to accomplish this? And what's the best way to do it?
Thanks!

First: Do you actually need a background thread to sit around idle in the first place?
On most platforms, starting a new thread is cheap. (Except on Windows and Linux, where it's supercheap.) So, why not just kick off a thread whenever you need it? (It's just as easy to keep around a list of threads as a single thread, right?)
Alternatively, why not just create a ThreadPoolExecutor, and just submit jobs to it, and let the executor worry about when they get run and on which thread. Any time you can just think in terms of "tasks that need to get run without blocking the main thread" instead of "worker threads that need to wait on work", you're making your life easier. Under the covers, there's still one or more worker threads waiting on a queue, or something equivalent, but that part's all been written (and debugged and optimized) for you. All you have to write are the tasks, which are just regular functions.
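For illustration, submitting tasks to an executor might look like this. The task function is a hypothetical job; it also returns the name of the thread that ran it, to show that the work happens off the submitting thread.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def task(n):
    """Hypothetical background job; also reports which thread ran it."""
    return n * n, threading.current_thread().name

main_name = threading.current_thread().name

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a Future immediately; the caller is not blocked.
    futures = [pool.submit(task, n) for n in (2, 3, 4)]
    # result() is where the caller waits, if it needs the value at all.
    outcomes = [f.result() for f in futures]

squares = [value for value, _ in outcomes]
print(squares)
```

In a GUI you would normally avoid calling result() from the UI thread, since it blocks. Attaching a callback with Future.add_done_callback is one alternative, with the caveat that the callback generally runs on a worker thread rather than the UI thread.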
But, if you want to write explicit background threads, you can, so I'll explain that.
How can I make my worker thread idle without using up CPU time? … What's the best way to signal into the worker thread to do something?
The way to idle a thread until a value is ready is to wait on a synchronization object. On any modern OS, waiting on a synchronization object means the operating system stops giving you any CPU time until the object is ready for you.*
There are a variety of different options you can see in the Threading module docs, but the obvious one to use in most cases like this is a Condition. The way to signal the worker thread is then to notify the Condition.
However, often a Queue is a lot simpler. To wait on a Queue, just call its get method with block=True. To signal another thread to wake up, just put something on the Queue. (Under the covers, a Queue wraps up a list or deque or other collection, a Lock, and a Condition, so you just tell it what you want to do—check for a value, block until there's a value, add a value—instead of dealing with waiting and signaling and protecting the collection.)
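A minimal sketch of the Queue approach follows. The on_button_click handler is a hypothetical stand-in for a wxPython event handler, and the None sentinel is an assumption used only to end the demo.

```python
import queue
import threading

jobs = queue.Queue()
handled = []

def worker():
    while True:
        job = jobs.get()      # block=True is the default: idle without CPU use
        if job is None:       # hypothetical sentinel that ends the demo
            break
        handled.append(job()) # run the job on the worker thread

def on_button_click(label):
    """Hypothetical UI event handler: it only hands work to the worker."""
    jobs.put(lambda: "handled " + label)

t = threading.Thread(target=worker)
t.start()

on_button_click("save")   # in a real app the UI thread would call this
on_button_click("load")

jobs.put(None)
t.join()
print(handled)
```

The handler returns as soon as the job is queued, so the UI thread never waits on the work itself.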
See the answer to controlling UI elements in wxPython using threading for how to signal in both directions, from a worker thread to a UI thread and vice-versa.
I'd have some way for the worker thread to register a callback with the UI itself, so that when a button is clicked or any other UI Event happens, the worker thread is signalled to change what it's doing.
You can do it this way if you want. Just pass self.queue.put or def callback(value): self.value = value; self.condition.notify() or whatever as a callback, and the GUI thread doesn't even have to know that the callback is triggering another thread.
In fact, that's a pretty nice design that may make you very happy later, when you decide to move some code back and forth between inline and background-threaded, or move it off to a child process instead of a background thread, or whatever.
I can't envision this right now but I could see as the app gets more complex also having to signal the worker thread while it's actually busy doing something.
But what do you want to happen if it's busy?
If you just want to say "If you're idle, wake up and do this task; otherwise, hold onto it and do it whenever you're ready", that's exactly what a Queue, or an Executor, will do for you automatically.
If you want to say, "If you're idle, wake up, otherwise, don't worry about it", that's what a Condition or Event will do.
If you want to say, "If you're idle, wake up and do this, otherwise, cancel what you're doing and do this instead", that's a bit more complicated. You pretty much need to have the background thread periodically check an "interrupt_me" variable while it's busy (and put a Lock around it), and then you'll set that flag as well as notifying the Condition… although in some cases, you can merge the idle and busy cases into a single Condition or Event (by calling an infinite wait() when idle, and a quick-check wait(timeout=0) when busy).
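The merged idle/busy case with a single Event could be sketched as follows. The extra started Event exists only to make this demo deterministic, and the loop body is a placeholder for real work; both are assumptions, not part of the design described above.

```python
import threading
import time

wake = threading.Event()
started = threading.Event()   # demo-only: lets the main thread sequence events
log = []

def worker():
    wake.wait()               # idle: block indefinitely, using no CPU
    wake.clear()
    log.append("started")
    started.set()
    for _ in range(1000):
        if wake.wait(timeout=0):   # busy: quick, non-blocking check
            log.append("interrupted")
            return
        time.sleep(0.01)           # stand-in for one slice of real work
    log.append("finished")

t = threading.Thread(target=worker)
t.start()

wake.set()                # first signal: wake the idle worker
started.wait(timeout=2)   # wait until it has begun its task
wake.set()                # second signal: interrupt the busy worker
t.join(timeout=2)
print(log)
```

A real worker would loop back to the idle wait() after being interrupted and pick up the new task from some shared, lock-protected location.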
* In some cases (e.g., a Linux futex or a Windows CriticalSection) it may actually spend a little bit of CPU time spinning, because that happens to be a good optimization. But the point is, you're not asking for any CPU time until you're ready to use it.
