Stop a long-running action in web2py with multiprocessing - python

I have a web2py application that basically serves as a browser interface for a Python script. This script usually returns pretty quickly, but can occasionally take a long time. I want to provide a way for the user to stop the script's execution if it takes too long.
I am currently calling the function like this:
def myView():  # this function is called from Ajax
    session.model = myFunc()  # myFunc is from a module I have complete control over
    return dict(model=session.model)
myFunc, when called with certain options, uses multiprocessing but still ends up taking a long time. I need some way to terminate the function, or at the very least the thread's children.
The first thing I tried was to run myFunc in a new process, and roll my own simple event system to kill it:
# in the controller
def myView():
    p_conn, c_conn = multiprocessing.Pipe()
    events = multiprocessing.Manager().dict()
    proc = multiprocessing.Process(target=_fit, args=(options, events, c_conn))
    proc.start()
    sleep(0.01)
    session.events = events
    proc.join()
    session.model = p_conn.recv()
    return dict(model=session.model)

def _fit(options, events, pipe):
    pipe.send(fitting.logistic_fit(options=options, events=events))
    pipe.close()

def stop():
    try:
        session.events['kill']()
    except SystemExit:
        pass  # because it raises that error intentionally
    return dict()

# in the module
def kill():
    print multiprocessing.active_children()
    for p in multiprocessing.active_children():
        p.terminate()
    raise SystemExit

def myFunc(options, events):
    events['kill'] = kill
I ran into a few major problems with this.
The session in stop() wasn't always the same as the session in myView(), so session.events was None.
Even when the session was the same, kill() wasn't properly killing the children.
The long-running function would hang the web2py thread, so stop() wasn't even processed until the function finished.
I considered not calling join() and using Ajax to pick up the result of the function at a later time, but I wasn't able to save the process object in the session for later use. The pipe seemed to be picklable, but then I was back to the problem of not being able to access the same session from another view.
How can I implement this feature?

For long-running tasks, you are better off queuing them via the built-in scheduler. If you want to allow the user to manually stop a task that is taking too long, you can use the scheduler.stop_task(ref) method (where ref is the task id or uuid). Alternatively, when you queue a task, you can specify a timeout, so it will automatically stop if not completed within the timeout period.
You can do simple Ajax polling to notify the client when the task has completed (or implement something more sophisticated with websockets or SSE).
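A minimal sketch of what that can look like (the scheduler setup lives in a model file; the timeout value, the use of session to store the task id, and passing the question's options via pvars are assumptions, and exact keyword arguments may vary by web2py version):
# In a model file -- assumes db is your DAL instance and myFunc is
# importable by the scheduler worker processes.
from gluon.scheduler import Scheduler
scheduler = Scheduler(db)

# In the controller: queue the task with a timeout so it is stopped
# automatically if it runs too long.
def myView():
    task = scheduler.queue_task(myFunc, pvars=dict(options=options), timeout=300)
    session.task_id = task.id
    return dict(task_id=task.id)

# Called from Ajax if the user decides to stop the task manually.
def stop():
    scheduler.stop_task(session.task_id)
    return dict()
The polling action can then check something like scheduler.task_status(session.task_id) to pick up the result once the task completes.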

Related

Python 3 - How to terminate a thread instantly?

In my code (a complex GUI application with Tkinter) I have a thread defined in a custom object (a progress bar). It runs a function with a while loop like this:
def Start(self):
    while self.is_active == True:
        # do it..
        time.sleep(1)
        # do it..
        time.sleep(1)

def Stop(self):
    self.is_active = False
It can terminate only when another piece of code, running in another thread, changes the attribute self.is_active via the method self.Stop(). I have the same situation in another custom object (a counter), and both of them have to work while the other thread (the main one) runs.
The code works, but I realized that the two threads associated with the progress bar and the counter don't terminate instantly as I wanted, because before terminating they have to wait for their functions to finish, and those are slow because of the time.sleep(1) instructions. From the user's point of view, it means seeing the main thread end while the progress bar and the counter terminate LATE, and I don't like it.
To be honest, I don't know how to solve this issue. Is there a way to force a thread to terminate instantly, without waiting for the function to end?
First off, to be clear, hard-killing a thread is a terrible idea in any language, and Python doesn't support it; if nothing else, the risk of that thread holding a lock which is never unlocked, causing any thread that tries to acquire it to deadlock, is a fatal flaw.
If you don't care about the thread at all, you can create it with the daemon=True argument, and it will die when all non-daemon threads in the process have exited. But if the thread really should die with proper cleanup (e.g. it might have with statements or the like that manage cleanup of resources outside the process, which won't be cleaned up on process termination), that's not a real solution.
That said, you can avoid waiting a second or more by switching from using a plain bool and time.sleep to using an Event and using the .wait method on it. This will allow the "sleeps" to be interrupted immediately, at the small expense of requiring you to reverse your condition (because Event.wait only blocks while it's false/unset, so you need the flag to be based on when you should stop, not when you are currently active):
import threading

class Spam:
    def __init__(self):
        self.should_stop = threading.Event()  # Create an unset event on init

    def Start(self):
        while not self.should_stop.is_set():
            # do it..
            if self.should_stop.wait(1):
                break
            # do it..
            if self.should_stop.wait(1):
                break

    def Stop(self):
        self.should_stop.set()
On modern Python (3.1 and higher) the wait method returns True if the event was set (on beginning the wait or because it got set while waiting), and False otherwise, so whenever wait returns True, that means you were told to stop and you can immediately break out of the loop. You also get notified almost immediately, instead of waiting up to one second before you can check the flag.
This won't cause the real "do it.." code to exit immediately, but from what you said, it sounds like that part of the code isn't all that long, so waiting for it to complete isn't a big hassle.
If you really want to preserve the is_active attribute for testing whether it's still active, you can define it as a property that reverses the meaning of the Event, e.g.:
@property
def is_active(self):
    return not self.should_stop.is_set()
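For completeness, here is a minimal usage sketch (the Spam class is the one defined above; the timings are just for illustration):
import threading
import time

spam = Spam()
worker = threading.Thread(target=spam.Start)
worker.start()

time.sleep(2.5)  # let the loop run for a bit
spam.Stop()      # any pending wait(1) returns True immediately
worker.join()    # completes almost at once instead of after a full sleep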
The safest way to do it without risking a segmentation fault is to return:
def Start(self):
    while self.is_active == True:
        # do it..
        if not self.is_active: return
        time.sleep(1)
        if not self.is_active: return
        # do it..
        if not self.is_active: return
        time.sleep(1)

def Stop(self):
    self.is_active = False
Python threads need to free their associated resources, and while "killing" the thread is possible using some C tricks, you would be risking a segmentation fault or a memory leak.
Here is a cleaner way to do it:
class MyError(Exception):
    pass

def Start(self):
    try:
        while self.is_active == True:
            # do it..
            self.check_termination()
            time.sleep(1)
            self.check_termination()
            # do it..
            self.check_termination()
            time.sleep(1)
    except MyError:
        return

def check_termination(self):
    if not self.is_active:
        raise MyError
You can call self.check_termination() from inside any function to terminate this loop, not necessarily from inside Start directly.
Edit: ShadowRanger's solution handles the "interruptible wait" better; I am just keeping this as a way of implementing a kill switch for the thread that can be checked from anywhere inside it.

JoinableQueue's join() function causes my program to hang

I have a program which uses multiple processes to execute functions from an external hardware library. The communication between the worker process and my program happens over a JoinableQueue().
A part of the code looks like this:
# Main Code
queue_cmd.put("do_something")
queue_cmd.join()  # here is my problem

# multiprocess
task = queue_cmd.get()
if task == "do_something":
    external_class.do_something()
    queue_cmd.task_done()
Note: external_class is the external hardware library.
This library sometimes crashes and the line queue_cmd.task_done() never gets executed. As a result, my main program hangs indefinitely in the queue_cmd.join() part, waiting for the queue_cmd.task_done() to be called. Unfortunately, there is no timeout parameter for the join() function.
How can I wait for the element in the JoinableQueue to be processed, but also deal with the event of my multiprocess terminating (due to the crash in the do_something() function)?
Ideally, the join function would have a timeout parameter (.join(timeout=30)), which I could use to restart the multiprocess - but it does not.
You can always wrap a blocking call in another thread:
from datetime import datetime
from threading import Thread

queue_cmd.put("do_something")
t = Thread(target=queue_cmd.join)
t.start()

# implement a timeout
start = datetime.now()
timeout = 10  # seconds
while t.is_alive() and (datetime.now() - start).seconds < timeout:
    pass  # do something else while waiting for the join or the timeout

if t.is_alive():
    # kill the subprocess that failed
    pass
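Note that Thread.join itself accepts a timeout, so if you don't need to do other work while waiting, the polling loop above can be replaced with a plain timed join (a sketch; marking the wrapper thread as a daemon keeps it from blocking interpreter exit if the queue join never returns):
from threading import Thread

t = Thread(target=queue_cmd.join, daemon=True)
t.start()

t.join(timeout=10)  # wait up to 10 seconds for task_done() to be called
if t.is_alive():
    # the join timed out: the worker presumably crashed before calling
    # task_done(), so restart it here
    pass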
I think the best approach here is to start the "crashable" module in (yet) another process:
Main code
queue_cmd.put("do_something")
queue_cmd.join()
Multiprocess (You can now move this to a thread)
task = queue_cmd.get()
if task == "do_something":
    subprocess.run(["python", "pleasedontcrash.py"])
    queue_cmd.task_done()
pleasedontcrash.py
external_class.do_something()
As shown, I'd do it using subprocess. If you need to pass parameters (which you could with subprocess using pipes or arguments), it's easier to use multiprocessing.

Equivalent of thread.interrupt_main() in Python 3

In Python 2 there is a function thread.interrupt_main(), which raises a KeyboardInterrupt exception in the main thread when called from a subthread.
This is also available through _thread.interrupt_main() in Python 3, but it's a low-level "support module", mostly for use within other standard modules.
What is the modern way of doing this in Python 3, presumably through the threading module, if there is one?
Well, raising an exception manually is kind of low-level, so if you think you have to do that, just use _thread.interrupt_main(), since that's the equivalent you asked for (the threading module itself doesn't provide this).
It could be that there is a more elegant way to achieve your ultimate goal, though. Maybe setting and checking a flag would already be enough, or using a threading.Event as @RFmyD already suggested, or message passing over a queue.Queue. It depends on your specific setup.
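For reference, a minimal sketch of the _thread.interrupt_main() approach itself (the two-second delay just simulates a trigger condition):
import _thread
import threading
import time

def watchdog():
    time.sleep(2)              # pretend we detect a problem after two seconds
    _thread.interrupt_main()   # raises KeyboardInterrupt in the main thread

threading.Thread(target=watchdog, daemon=True).start()

try:
    while True:
        time.sleep(0.1)        # the main thread's normal work
except KeyboardInterrupt:
    print("Interrupted by the watchdog thread")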
If you need a way for a thread to stop execution of the whole program, this is how I did it with a threading.Event:
def start():
    """
    This runs in the main thread and starts a sub thread.
    """
    stop_event = threading.Event()
    check_stop_thread = threading.Thread(
        target=check_stop_signal, args=(stop_event,), daemon=True
    )
    check_stop_thread.start()

    # If check_stop_thread sets the stop event, sys.exit() is executed here
    # in the main thread. Since the sub thread is a daemon, it will be
    # terminated as well.
    stop_event.wait()
    logging.debug("Threading stop event set, calling sys.exit()...")
    sys.exit()

def check_stop_signal(stop_event):
    """
    Checks continuously (every 0.1 s) if a "stop" flag has been set in the database.
    Needs to run in its own thread.
    """
    while True:
        if io.check_stop():
            logger.info("Program was aborted by user.")
            logging.debug("Setting threading stop event...")
            stop_event.set()
            break
        sleep(0.1)
You might want to look into the threading.Event class.

Python, is it proper for one thread to spawn another

I am writing an update application in Python 2.x. I have one thread (ticket_server) sitting on a database (CouchDB) url in longpoll mode. Update requests are dumped into this database from an outside application. When a change comes, ticket_server triggers a worker thread (update_manager). The heavy lifting is done in this update_manager thread. There will be telnet connections and ftp uploads performed. So it is of highest importance that this process not be interrupted.
My question is, is it safe to spawn update_manager threads from the ticket_server threads?
The other option might be to put requests into a queue and have another function wait for a ticket to enter the queue, then pass the request off to an update_manager thread. But I'd rather keep things simple (I'm assuming the ticket_server spawning update_manager is simple) until I have a reason to expand.
# Here is the heavy lifter
class Update_Manager(threading.Thread):
    def __init__(self, ticket, telnet_ip, ftp_ip):
        threading.Thread.__init__(self)
        self.ticket = ticket
        self.telnet_ip = telnet_ip
        self.ftp_ip = ftp_ip
    def run(self):
        # This will be a very lengthy process.
        self.do_some_telnet()
        self.do_some_ftp()
    def do_some_telnet(self):
        ...
    def do_some_ftp(self):
        ...

# This guy just passes work orders off to Update_Manager
class Ticket_Server(threading.Thread):
    def __init__(self, database_ip):
        threading.Thread.__init__(self)
        self.database_ip = database_ip
    def run(self):
        # This function call will block this thread only.
        ticket = self.get_ticket(self.database_ip)
        # Here is where I question what to do.
        # Should I 1) call the Update thread right from here...
        up_man = Update_Manager(ticket)
        up_man.start()
        # Or should I 2) put the ticket into a queue and let some other function
        # not in this thread fire the Update_Manager.
    def get_ticket(self, database_ip):
        # This function will 'wait' for a ticket to get posted.
        # For those familiar with couchdb:
        url = 'http://' + database_ip + ':' + port + '/_changes?feed=longpoll&since=' + update_seq
        response = urllib2.urlopen(url)
That is just a lot of code to ask which approach is safer/more efficient/more Pythonic.
I'm only a few months in with Python, so these questions get my brain stuck in a while loop.
The main thread of a program is a thread; the only way to spawn a thread is from another thread.
Of course, you need to make sure your blocking thread releases the GIL while it waits, or other Python threads won't run. All mature Python database bindings will do this, but I've never heard of couchdb.
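As for the queue-based option mentioned in the question, a minimal sketch of that pattern might look like this (Python 2 naming to match the question's code; the dispatcher name and the single-argument Update_Manager call are illustrative):
import Queue  # 'queue' in Python 3
import threading

tickets = Queue.Queue()

def dispatcher():
    # Waits for tickets and spawns an Update_Manager for each one.
    while True:
        ticket = tickets.get()
        Update_Manager(ticket).start()
        tickets.task_done()

threading.Thread(target=dispatcher).start()
# The ticket server then only has to call tickets.put(ticket).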

Waiting on event with Twisted and PB

I have a Python app that uses multiple threads, and I am curious about the best way to wait for something in Python without burning CPU or locking the GIL.
My app uses Twisted, and I spawn a thread to run a long operation so I do not stomp on the reactor thread. This long operation also spawns some threads using Twisted's deferToThread to do something else, and the original thread wants to wait for the results from those deferreds.
What I have been doing is this
while self._waiting:
    time.sleep(0.01)
which seemed to prevent Twisted PB's objects from receiving messages, so I thought sleep was locking the GIL. Further investigation by the posters below revealed, however, that it does not.
There are better ways to wait on threads without blocking the reactor thread or Python, as posted below.
If you're already using Twisted, you should never need to "wait" like this.
As you've described it:
I spawn a thread to run a long operation ... This long operation also spawns some threads using twisted's deferToThread ...
That implies that you're calling deferToThread from your "long operation" thread, not from your main thread (the one where reactor.run() is running). As Jean-Paul Calderone already noted in a comment, you can only call Twisted APIs (such as deferToThread) from the main reactor thread.
The lock-up that you're seeing is a common symptom of not following this rule. It has nothing to do with the GIL, and everything to do with the fact that you have put Twisted's reactor into a broken state.
Based on your loose description of your program, I've tried to write a sample program that does what you're talking about based entirely on Twisted APIs, spawning all threads via Twisted and controlling them all from the main reactor thread.
import time

from twisted.internet import reactor
from twisted.internet.defer import gatherResults
from twisted.internet.threads import deferToThread, blockingCallFromThread

def workReallyHard():
    "'Work' function, invoked in a thread."
    time.sleep(0.2)

def longOperation():
    for x in range(10):
        workReallyHard()
        blockingCallFromThread(reactor, startShortOperation, x)
    result = blockingCallFromThread(reactor, gatherResults, shortOperations)
    return 'hooray', result

def shortOperation(value):
    workReallyHard()
    return value * 100

shortOperations = []

def startShortOperation(value):
    def done(result):
        print 'Short operation complete!', result
        return result
    shortOperations.append(
        deferToThread(shortOperation, value).addCallback(done))

d = deferToThread(longOperation)

def allDone(result):
    print 'Long operation complete!', result
    reactor.stop()

d.addCallback(allDone)

reactor.run()
Note that at the point in allDone where the reactor is stopped, you could fire off another "long operation" and have it start the process all over again.
Have you tried condition variables? They are used like this:
from threading import Condition

condition = Condition()

def consumer_in_thread_A():
    condition.acquire()
    try:
        while resource_not_yet_available:
            condition.wait()
        # Here, the resource is available and may be
        # consumed
    finally:
        condition.release()

def produce_in_thread_B():
    # ... create resource, whatsoever
    condition.acquire()
    try:
        condition.notify_all()
    finally:
        condition.release()
Condition variables act as locks (acquire and release), but their main purpose is to provide a control mechanism which allows you to wait for them to be notify-d or notify_all-d.
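Since Condition supports the context-manager protocol, the acquire/try/finally bookkeeping above can also be written with with blocks, which is less error-prone (same placeholder names as the example above):
from threading import Condition

condition = Condition()

def consumer_in_thread_A():
    with condition:  # acquires the lock, releases it on exit
        while resource_not_yet_available:
            condition.wait()
        # Here, the resource is available and may be consumed

def produce_in_thread_B():
    # ... create resource, whatsoever
    with condition:
        condition.notify_all()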
"I recently found out that calling time.sleep(X) will lock the GIL for the entire time X and therefore freeze ALL python threads for that time period."
You found wrongly -- this is definitely not how it works. What's the source where you found this misinformation?
Anyway, then you clarify (in comments -- better edit your Q!) that you're using deferToThread and your problem with this is that...:
"Well yes I defer the action to a thread and give twisted a callback. But the parent thread needs to wait for the whole series of sub threads to complete before it can move onto a new set of sub threads to spawn"
So use as the callback a method of an object with a counter -- start it at 0, increment it by one every time you're deferring-to-thread and decrement it by one in the callback method.
When the callback method sees that the decremented counter has gone back to 0, it knows that we're done waiting "for the whole series of sub threads to complete" and then the time has come to "move on to a new set of sub threads to spawn", and thus, in that case only, calls the "spawn a new set of sub threads" function or method -- it's that easy!
E.g. (net of typos &c as this is untested code, just to give you the idea)...:
class Waiter(object):
    def __init__(self, what_next, *a, **k):
        self.counter = 0
        self.what_next = what_next
        self.a = a
        self.k = k
    def one_more(self):
        self.counter += 1
    def do_wait(self, *dont_care):
        self.counter -= 1
        if self.counter == 0:
            self.what_next(*self.a, **self.k)

def spawn_one_thread(waiter, long_calculation, *a, **k):
    waiter.one_more()
    d = threads.deferToThread(long_calculation, *a, **k)
    d.addCallback(waiter.do_wait)

def spawn_all(waiter, list_of_lists_of_functions_args_and_kwds):
    if not list_of_lists_of_functions_args_and_kwds:
        return
    if waiter is None:
        # pass None as the waiter arg so the callback re-enters spawn_all
        waiter = Waiter(spawn_all, None, list_of_lists_of_functions_args_and_kwds)
    this_time = list_of_lists_of_functions_args_and_kwds.pop(0)
    for f, a, k in this_time:
        spawn_one_thread(waiter, f, *a, **k)

def start_it_all(list_of_lists_of_functions_args_and_kwds):
    spawn_all(None, list_of_lists_of_functions_args_and_kwds)
According to the Python source, time.sleep() does not hold the GIL.
http://code.python.org/hg/trunk/file/98e56689c59c/Modules/timemodule.c#l920
Note the use of Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, as documented here:
http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock
The threading module allows you to spawn a thread, which is then represented by a Thread object. That object has a join method that you can use to wait for the subthread to complete.
See http://docs.python.org/library/threading.html#module-threading
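A minimal sketch:
import threading

def work():
    pass  # the subthread's task goes here

t = threading.Thread(target=work)
t.start()
t.join()  # blocks here until work() has returned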
