Python - threading assert group is None when creating a custom Thread Class

I wanted to create a custom Thread class that is able to propagate an exception it comes across to the main thread. My implementation is as follows:
class VerseThread(threading.Thread):
    def __init__(self, args):
        super().__init__(self, args=args)
        # self.scraper = scraper

    def run(self):
        self.exc = None
        try:
            book, abbrev, template, chapter = self.args
            self.parser.parse(book, abbrev, template, chapter)
        except ChapterNotFoundError as e:
            self.exc = e

    def join(self):
        threading.Thread.join(self)
        if self.exc:
            raise self.exc
This is supposed to run in the following method, inside a Scraper class (it's all inside a while True):
for book, abbrev, testament in self.books[init:end]:
    base_chapter = 1
    while True:
        threads = []
        if testament == 'ot':
            for i in range(3):
                threads.append(VerseThread(args=(book, abbrev, OT_TEMPLATE, base_chapter+i)))
        else:
            for i in range(3):
                threads.append(VerseThread(args=(book, abbrev, NT_TEMPLATE, base_chapter+i)))
        try:
            for thread in threads:
                if not thread.is_alive():
                    thread.start()
            for thread in threads:
                thread.join()
            base_chapter += 3
        except ChapterNotFoundError as e:
            LOGGER.info(f"{{PROCESS {multiprocessing.current_process().pid}}} - Chapter {e.chapter} not found in {book}, exiting book...")
            break
The issue is, if I run it like presented here, I get the error assert group is None, "group argument must be None for now". However, when I run it using Thread(target=self.parse, args=(book, abbrev, OT_TEMPLATE, base_chapter+1)) instead of VerseThread(args=(book, abbrev, OT_TEMPLATE, base_chapter+i)), it works just fine, but the exception is of course still there. What's wrong with my code? How can I get rid of this error?
EDIT: Upon further testing, it seems that what I'm trying to do works fine when I use thread.run() instead of thread.start(), but then only one thread is being used, which is a problem. This, however, means that the error must be in the start() method, but I've no idea what to do.

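You have several errors. First, when you call super().__init__(...), you do not pass self explicitly as an argument. In super().__init__(self, args=args), the extra self is bound to the first positional parameter of Thread.__init__, which is group, and that is what triggers the assertion "group argument must be None for now". Second, to handle any possible thread-initializer arguments, your signature for this method should just be as follows: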
class VerseThread(threading.Thread):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
    ... # rest of the code omitted
But since your __init__ method does not do anything but call the parent's __init__ method with any passed arguments, there is now no need to even override this method.
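To see where the stray self ends up, here is a minimal sketch that binds the same argument pattern against the signature of Thread.__init__ without starting any thread. It assumes CPython's pure-Python threading module, and placeholder is a hypothetical stand-in for the thread instance:

```python
import inspect
import threading

# Signature of the unbound initializer: (self, group=None, target=None, ...)
sig = inspect.signature(threading.Thread.__init__)

placeholder = object()  # hypothetical stand-in for the thread instance

# Mimics super().__init__(self, args=...): the instance is passed twice,
# once implicitly (bound method) and once explicitly.
bound = sig.bind(placeholder, placeholder, args=(1, 2))

group_value = bound.arguments["group"]
print(group_value is placeholder)  # the explicit self landed in `group`
```

Because group receives a non-None value, the initializer's assertion fails before the thread is ever started.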
Finally, the attributes that you are interested in are not args but rather _args and _kwargs (in case keyword arguments are specified). Also, you have specified self.parser, but I do not see where that attribute has been set.
import threading

class ChapterNotFoundError(Exception):
    pass

class VerseThread(threading.Thread):
    def run(self):
        self.exc = None
        try:
            book, abbrev, template, chapter = self._args
            self.parser.parse(book, abbrev, template, chapter)
        except ChapterNotFoundError as e:
            self.exc = e

    def join(self):
        threading.Thread.join(self) # Or: super().join()
        if self.exc:
            raise self.exc

for book, abbrev, testament in self.books[init:end]:
    base_chapter = 1
    while True:
        threads = []
        if testament == 'ot':
            for i in range(3):
                threads.append(VerseThread(args=(book, abbrev, OT_TEMPLATE, base_chapter+i)))
        else:
            for i in range(3):
                threads.append(VerseThread(args=(book, abbrev, NT_TEMPLATE, base_chapter+i)))
        try:
            for thread in threads:
                if not thread.is_alive():
                    thread.start()
            for thread in threads:
                thread.join()
            base_chapter += 3
        except ChapterNotFoundError as e:
            LOGGER.info(f"{{PROCESS {multiprocessing.current_process().pid}}} - Chapter {e.chapter} not found in {book}, exiting book...")
            break
Improvement
Accessing quasi-private attributes such as self._args is potentially dangerous and should be avoided.
I can see the value of creating a subclass of Thread that catches exceptions in the "worker" function it executes and then propagates them back to the main thread when it joins the thread. But I believe such a class should be general purpose and work with any type of worker function. In general, I don't like to have application-specific code (business logic) in a threading.Thread or multiprocessing.Pool subclass. I instead prefer having my business logic coded within a function or class method(s) that can then be used in multithreading, multiprocessing or serial processing as you see fit. The following is how I would code the Thread subclass (I have named it PropogateExceptionThread, but choose whatever name you wish) and how I might use it:
import threading

class PropogateExceptionThread(threading.Thread):
    def run(self):
        self.exc = None
        try:
            super().run()
        except Exception as e:
            self.exc = e

    def join(self):
        super().join()
        if self.exc:
            raise self.exc

def worker(x):
    if x < 10 or x > 20:
        raise ValueError(f'Bad value for argument x = {x}')

t = PropogateExceptionThread(target=worker, args=(1,))
t.start()
try:
    t.join()
except Exception as e:
    print('The thread raised an exception:', e)
Prints:
The thread raised an exception: Bad value for argument x = 1
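The join() override above drops the optional timeout argument that Thread.join() accepts. A possible variant that forwards it is sketched below. The class name PropagatingThread is hypothetical, and initializing exc in __init__ is my own choice, made so that the attribute exists even if join() times out early:

```python
import threading

class PropagatingThread(threading.Thread):  # hypothetical name
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.exc = None  # set up front so join() can always read it

    def run(self):
        try:
            super().run()  # runs the target passed to the constructor
        except Exception as e:
            self.exc = e

    def join(self, timeout=None):
        super().join(timeout)  # forward the timeout to Thread.join()
        if self.exc is not None:
            raise self.exc

def worker(x):
    if x < 10 or x > 20:
        raise ValueError(f'Bad value for argument x = {x}')

t = PropagatingThread(target=worker, args=(1,))
t.start()
caught = None
try:
    t.join(timeout=5)
except ValueError as e:
    caught = e
print('The thread raised an exception:', caught)
```

If join() times out while the worker is still running, no exception is raised yet; check t.is_alive() in that case and join again later.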

Related

Threading with Decorator in Python [duplicate]

The function foo below returns a string 'foo'. How can I get the value 'foo' which is returned from the thread's target?
from threading import Thread

def foo(bar):
    print('hello {}'.format(bar))
    return 'foo'

thread = Thread(target=foo, args=('world!',))
thread.start()
return_value = thread.join()
The "one obvious way to do it", shown above, doesn't work: thread.join() returned None.
One way I've seen is to pass a mutable object, such as a list or a dictionary, to the thread's constructor, along with an index or other identifier of some sort. The thread can then store its results in its dedicated slot in that object. For example:
def foo(bar, result, index):
    print('hello {0}'.format(bar))
    result[index] = "foo"

from threading import Thread

threads = [None] * 10
results = [None] * 10

for i in range(len(threads)):
    threads[i] = Thread(target=foo, args=('world!', results, i))
    threads[i].start()

# do some other stuff

for i in range(len(threads)):
    threads[i].join()

print(" ".join(results)) # what sound does a metasyntactic locomotive make?
If you really want join() to return the return value of the called function, you can do this with a Thread subclass like the following:
from threading import Thread

def foo(bar):
    print 'hello {0}'.format(bar)
    return "foo"

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs={}, Verbose=None):
        Thread.__init__(self, group, target, name, args, kwargs, Verbose)
        self._return = None

    def run(self):
        if self._Thread__target is not None:
            self._return = self._Thread__target(*self._Thread__args,
                                                **self._Thread__kwargs)

    def join(self):
        Thread.join(self)
        return self._return

twrv = ThreadWithReturnValue(target=foo, args=('world!',))
twrv.start()
print twrv.join() # prints foo
That gets a little hairy because of some name mangling, and it accesses "private" data structures that are specific to Thread implementation... but it works.
For Python 3:
class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs={}, Verbose=None):
        Thread.__init__(self, group, target, name, args, kwargs)
        self._return = None

    def run(self):
        if self._target is not None:
            self._return = self._target(*self._args,
                                        **self._kwargs)

    def join(self, *args):
        Thread.join(self, *args)
        return self._return
FWIW, the multiprocessing module has a nice interface for this using the Pool class. And if you want to stick with threads rather than processes, you can just use the multiprocessing.pool.ThreadPool class as a drop-in replacement.
def foo(bar, baz):
    print('hello {0}'.format(bar))
    return 'foo' + baz

from multiprocessing.pool import ThreadPool
pool = ThreadPool(processes=1)
async_result = pool.apply_async(foo, ('world', 'foo')) # tuple of args for foo
# do some other stuff in the main process
return_val = async_result.get() # get the return value from your function.
In Python 3.2+, stdlib concurrent.futures module provides a higher level API to threading, including passing return values or exceptions from a worker thread back to the main thread:
import concurrent.futures

def foo(bar):
    print('hello {}'.format(bar))
    return 'foo'

with concurrent.futures.ThreadPoolExecutor() as executor:
    future = executor.submit(foo, 'world!')
    return_value = future.result()
    print(return_value)
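The same Future object also carries exceptions back to the caller: if the submitted callable raises, future.result() re-raises that exception in the thread that calls it. A minimal sketch, where worker and its input check are hypothetical:

```python
import concurrent.futures

def worker(x):  # hypothetical worker function
    if x < 0:
        raise ValueError(f'negative input: {x}')
    return x * 2

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    ok = executor.submit(worker, 21)
    bad = executor.submit(worker, -1)

    value = ok.result()  # return value of worker(21)
    try:
        bad.result()  # re-raises the ValueError from the worker thread
    except ValueError as e:
        error_message = str(e)

print(value, error_message)
```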
Jake's answer is good, but if you don't want to use a threadpool (you don't know how many threads you'll need, but create them as needed) then a good way to transmit information between threads is the built-in Queue.Queue class (queue.Queue in Python 3), as it offers thread safety.
I created the following decorator to make it act in a similar fashion to the threadpool:
def threaded(f, daemon=False):
    import Queue  # Python 2; in Python 3 use: import queue as Queue
    import threading

    def wrapped_f(q, *args, **kwargs):
        '''this function calls the decorated function and puts the
        result in a queue'''
        ret = f(*args, **kwargs)
        q.put(ret)

    def wrap(*args, **kwargs):
        '''this is the function returned from the decorator. It fires off
        wrapped_f in a new thread and returns the thread object with
        the result queue attached'''
        q = Queue.Queue()
        t = threading.Thread(target=wrapped_f, args=(q,)+args, kwargs=kwargs)
        t.daemon = daemon
        t.start()
        t.result_queue = q
        return t

    return wrap
Then you just use it as:
@threaded
def long_task(x):
    import time
    x = x + 5
    time.sleep(5)
    return x

# does not block, returns Thread object
y = long_task(10)
print(y)
# this blocks, waiting for the result
result = y.result_queue.get()
print(result)
The decorated function creates a new thread each time it's called and returns a Thread object that contains the queue that will receive the result.
UPDATE
It's been quite a while since I posted this answer, but it still gets views so I thought I would update it to reflect the way I do this in newer versions of Python:
Python 3.2 added in the concurrent.futures module which provides a high-level interface for parallel tasks. It provides ThreadPoolExecutor and ProcessPoolExecutor, so you can use a thread or process pool with the same api.
One benefit of this api is that submitting a task to an Executor returns a Future object, which will complete with the return value of the callable you submit.
This makes attaching a queue object unnecessary, which simplifies the decorator quite a bit:
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

_DEFAULT_POOL = ThreadPoolExecutor()

def threadpool(f, executor=None):
    @wraps(f)
    def wrap(*args, **kwargs):
        return (executor or _DEFAULT_POOL).submit(f, *args, **kwargs)
    return wrap
This will use a default module threadpool executor if one is not passed in.
The usage is very similar to before:
@threadpool
def long_task(x):
    import time
    x = x + 5
    time.sleep(5)
    return x

# does not block, returns Future object
y = long_task(10)
print(y)
# this blocks, waiting for the result
result = y.result()
print(result)
If you're using Python 3.4+, one really nice feature of using this method (and Future objects in general) is that the returned future can be wrapped to turn it into an asyncio.Future with asyncio.wrap_future. This makes it work easily with coroutines:
result = await asyncio.wrap_future(long_task(10))
If you don't need access to the underlying concurrent.futures.Future object, you can include the wrap in the decorator:
_DEFAULT_POOL = ThreadPoolExecutor()

def threadpool(f, executor=None):
    @wraps(f)
    def wrap(*args, **kwargs):
        return asyncio.wrap_future((executor or _DEFAULT_POOL).submit(f, *args, **kwargs))
    return wrap
Then, whenever you need to push cpu intensive or blocking code off the event loop thread, you can put it in a decorated function:
@threadpool
def some_long_calculation():
    ...

# this will suspend while the function is executed on a threadpool
result = await some_long_calculation()
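Since await is only valid inside a coroutine, a complete program needs an event loop around that last line. The following is a minimal end-to-end sketch under my own assumptions: slow_add, main and the pool size are hypothetical, and asyncio.run requires Python 3.7+.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

_DEFAULT_POOL = ThreadPoolExecutor(max_workers=2)

def threadpool(f):
    @wraps(f)
    def wrap(*args, **kwargs):
        # Must be called while an event loop is running, so that
        # wrap_future can attach the result to that loop.
        return asyncio.wrap_future(_DEFAULT_POOL.submit(f, *args, **kwargs))
    return wrap

@threadpool
def slow_add(x, y):  # hypothetical blocking function
    time.sleep(0.1)
    return x + y

async def main():
    # Suspends this coroutine while slow_add runs on a pool thread.
    return await slow_add(2, 3)

result = asyncio.run(main())
_DEFAULT_POOL.shutdown()
print(result)
```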
Another solution that doesn't require changing your existing code:
import Queue # Python 2.x
#from queue import Queue # Python 3.x
from threading import Thread

def foo(bar):
    print 'hello {0}'.format(bar) # Python 2.x
    #print('hello {0}'.format(bar)) # Python 3.x
    return 'foo'

que = Queue.Queue() # Python 2.x
#que = Queue() # Python 3.x
t = Thread(target=lambda q, arg1: q.put(foo(arg1)), args=(que, 'world!'))
t.start()
t.join()
result = que.get()
print result # Python 2.x
#print(result) # Python 3.x
It can be also easily adjusted to a multi-threaded environment:
import Queue # Python 2.x
#from queue import Queue # Python 3.x
from threading import Thread

def foo(bar):
    print 'hello {0}'.format(bar) # Python 2.x
    #print('hello {0}'.format(bar)) # Python 3.x
    return 'foo'

que = Queue.Queue() # Python 2.x
#que = Queue() # Python 3.x
threads_list = list()
t = Thread(target=lambda q, arg1: q.put(foo(arg1)), args=(que, 'world!'))
t.start()
threads_list.append(t)
# Add more threads here
...
threads_list.append(t2)
...
threads_list.append(t3)
...
# Join all the threads
for t in threads_list:
    t.join()
# Check thread's return value
while not que.empty():
    result = que.get()
    print result # Python 2.x
    #print(result) # Python 3.x
UPDATE:
I think there's a significantly simpler and more concise way to save the result of the thread, and in a way that keeps the interface virtually identical to the threading.Thread class (please let me know if there are edge cases - I haven't tested as much as my original post below):
import threading

class ConciseResult(threading.Thread):
    def run(self):
        self.result = self._target(*self._args, **self._kwargs)
To be robust and avoid potential errors:
import threading

class ConciseRobustResult(threading.Thread):
    def run(self):
        try:
            if self._target is not None:
                self.result = self._target(*self._args, **self._kwargs)
        finally:
            # Avoid a refcycle if the thread is running a function with
            # an argument that has a member that points to the thread.
            del self._target, self._args, self._kwargs
Short explanation: we override only the run method of threading.Thread, and modify nothing else. This allows us to use everything else the threading.Thread class does for us, without needing to worry about missing potential edge cases such as _private attribute assignments or custom attribute modifications in the way that my original post does.
We can verify that we only modify the run method by looking at the output of help(ConciseResult) and help(ConciseRobustResult). The only method/attribute/descriptor included under Methods defined here: is run, and everything else comes from the inherited threading.Thread base class (see the Methods inherited from threading.Thread: section).
To test either of these implementations using the example code below, substitute ConciseResult or ConciseRobustResult for ThreadWithResult in the main function below.
Original post using a closure function in the init method:
Most answers I've found are long and require being familiar with other modules or advanced python features, and will be rather confusing to someone unless they're already familiar with everything the answer talks about.
Working code for a simplified approach:
import threading

class ThreadWithResult(threading.Thread):
    def __init__(self, group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None):
        def function():
            self.result = target(*args, **kwargs)
        super().__init__(group=group, target=function, name=name, daemon=daemon)
Example code:
import time, random

def function_to_thread(n):
    count = 0
    while count < 3:
        print(f'still running thread {n}')
        count +=1
        time.sleep(3)
    result = random.random()
    print(f'Return value of thread {n} should be: {result}')
    return result

def main():
    thread1 = ThreadWithResult(target=function_to_thread, args=(1,))
    thread2 = ThreadWithResult(target=function_to_thread, args=(2,))
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    print(thread1.result)
    print(thread2.result)

main()
Explanation:
I wanted to simplify things significantly, so I created a ThreadWithResult class and had it inherit from threading.Thread. The nested function function in __init__ calls the threaded function we want to save the value of, and saves the result of that nested function as the instance attribute self.result after the thread finishes executing.
Creating an instance of this is identical to creating an instance of threading.Thread. Pass in the function you want to run on a new thread to the target argument and any arguments that your function might need to the args argument and any keyword arguments to the kwargs argument.
e.g.
my_thread = ThreadWithResult(target=my_function, args=(arg1, arg2, arg3))
I think this is significantly easier to understand than the vast majority of answers, and this approach requires no extra imports! I included the time and random module to simulate the behavior of a thread, but they're not required to achieve the functionality asked in the original question.
I know I'm answering this looong after the question was asked, but I hope this can help more people in the future!
EDIT: I created the save-thread-result PyPI package to allow you to access the same code above and reuse it across projects (GitHub code is here). The PyPI package fully extends the threading.Thread class, so you can set any attributes you would set on threading.Thread on the ThreadWithResult class as well!
The original answer above goes over the main idea behind this subclass, but for more information, see the more detailed explanation (from the module docstring) here.
Quick usage example:
pip3 install -U save-thread-result # MacOS/Linux
pip install -U save-thread-result # Windows
python3 # MacOS/Linux
python # Windows
from save_thread_result import ThreadWithResult
# As of Release 0.0.3, you can also specify values for
#`group`, `name`, and `daemon` if you want to set those
# values manually.
thread = ThreadWithResult(
    target = my_function,
    args = (my_function_arg1, my_function_arg2, ...),
    kwargs = {my_function_kwarg1: kwarg1_value, my_function_kwarg2: kwarg2_value, ...}
)
thread.start()
thread.join()
if getattr(thread, 'result', None):
    print(thread.result)
else:
    # thread.result attribute not set - something caused
    # the thread to terminate BEFORE the thread finished
    # executing the function passed in through the
    # `target` argument
    print('ERROR! Something went wrong while executing this thread, and the function you passed in did NOT complete!!')
# seeing help about the class and information about the threading.Thread super class methods and attributes available:
help(ThreadWithResult)
Parris / kindall's join/return answer ported to Python 3:
from threading import Thread

def foo(bar):
    print('hello {0}'.format(bar))
    return "foo"

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None):
        Thread.__init__(self, group, target, name, args, kwargs, daemon=daemon)
        self._return = None

    def run(self):
        if self._target is not None:
            self._return = self._target(*self._args, **self._kwargs)

    def join(self):
        Thread.join(self)
        return self._return

twrv = ThreadWithReturnValue(target=foo, args=('world!',))
twrv.start()
print(twrv.join()) # prints foo
Note, the Thread class is implemented differently in Python 3.
I stole kindall's answer and cleaned it up just a little bit.
The key part is adding *args and **kwargs to join() in order to handle the timeout
class threadWithReturn(Thread):
    def __init__(self, *args, **kwargs):
        super(threadWithReturn, self).__init__(*args, **kwargs)
        self._return = None

    def run(self):
        if self._Thread__target is not None:
            self._return = self._Thread__target(*self._Thread__args, **self._Thread__kwargs)

    def join(self, *args, **kwargs):
        super(threadWithReturn, self).join(*args, **kwargs)
        return self._return
UPDATED ANSWER BELOW
This is my most popularly upvoted answer, so I decided to update with code that will run on both py2 and py3.
Additionally, I see many answers to this question that show a lack of comprehension regarding Thread.join(). Some completely fail to handle the timeout arg. But there is also a corner-case that you should be aware of regarding instances when you have (1) a target function that can return None and (2) you also pass the timeout arg to join(). Please see "TEST 4" to understand this corner case.
ThreadWithReturn class that works with py2 and py3:
import sys
from threading import Thread
from builtins import super # https://stackoverflow.com/a/30159479

_thread_target_key, _thread_args_key, _thread_kwargs_key = (
    ('_target', '_args', '_kwargs')
    if sys.version_info >= (3, 0) else
    ('_Thread__target', '_Thread__args', '_Thread__kwargs')
)

class ThreadWithReturn(Thread):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._return = None

    def run(self):
        target = getattr(self, _thread_target_key)
        if target is not None:
            self._return = target(
                *getattr(self, _thread_args_key),
                **getattr(self, _thread_kwargs_key)
            )

    def join(self, *args, **kwargs):
        super().join(*args, **kwargs)
        return self._return
Some sample tests are shown below:
import time, random

# TEST TARGET FUNCTION
def giveMe(arg, seconds=None):
    if seconds is not None:
        time.sleep(seconds)
    return arg

# TEST 1
my_thread = ThreadWithReturn(target=giveMe, args=('stringy',))
my_thread.start()
returned = my_thread.join()
# (returned == 'stringy')

# TEST 2
my_thread = ThreadWithReturn(target=giveMe, args=(None,))
my_thread.start()
returned = my_thread.join()
# (returned is None)

# TEST 3
my_thread = ThreadWithReturn(target=giveMe, args=('stringy',), kwargs={'seconds': 5})
my_thread.start()
returned = my_thread.join(timeout=2)
# (returned is None) # because join() timed out before giveMe() finished

# TEST 4
my_thread = ThreadWithReturn(target=giveMe, args=(None,), kwargs={'seconds': 5})
my_thread.start()
returned = my_thread.join(timeout=random.randint(1, 10))
Can you identify the corner-case that we may possibly encounter with TEST 4?
The problem is that we expect giveMe() to return None (see TEST 2), but we also expect join() to return None if it times out.
returned is None means either:
(1) that's what giveMe() returned, or
(2) join() timed out
This example is trivial since we know that giveMe() will always return None. But in a real-world case (where the target may legitimately return None or something else) we'd want to explicitly check for what happened.
Below is how to address this corner-case:
# TEST 4
my_thread = ThreadWithReturn(target=giveMe, args=(None,), kwargs={'seconds': 5})
my_thread.start()
returned = my_thread.join(timeout=random.randint(1, 10))
if my_thread.is_alive():  # isAlive() was removed in Python 3.9
    # returned is None because join() timed out
    # this also means that giveMe() is still running in the background
    pass
    # handle this based on your app's logic
else:
    # join() is finished, and so is giveMe()
    # BUT we could also be in a race condition, so we need to update returned, just in case
    returned = my_thread.join()
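Another way to tell the two cases apart, not taken from the answer above, is to initialize the stored value to a unique sentinel instead of None. The sketch below is Python 3 only and still relies on the private _target, _args and _kwargs attributes; NOT_FINISHED and give_none are hypothetical names:

```python
import threading
import time

NOT_FINISHED = object()  # hypothetical sentinel, not part of the threading API

class ThreadWithReturn(threading.Thread):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._return = NOT_FINISHED

    def run(self):
        if self._target is not None:
            self._return = self._target(*self._args, **self._kwargs)

    def join(self, timeout=None):
        super().join(timeout)
        return self._return

def give_none(seconds):
    time.sleep(seconds)
    return None

t = ThreadWithReturn(target=give_none, args=(0.5,))
t.start()
early = t.join(timeout=0.01)  # times out: the sentinel comes back
late = t.join()               # waits for completion: the real None comes back
print(early is NOT_FINISHED, late is None)
```

The sentinel also stays in place if the target raises, so checking is_alive() remains useful to distinguish "still running" from "finished without a result".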
Using Queue :
import threading, queue

def calc_square(num, out_queue1):
    l = []
    for x in num:
        l.append(x*x)
    out_queue1.put(l)

arr = [1,2,3,4,5,6,7,8,9,10]
out_queue1=queue.Queue()
t1=threading.Thread(target=calc_square, args=(arr,out_queue1))
t1.start()
t1.join()
print (out_queue1.get())
My solution to the problem is to wrap the function and thread in a class. It does not require using pools, queues, or C-type variable passing. It is also non-blocking: you check the status instead. See the example of how to use it at the end of the code.
import threading

class ThreadWorker():
    '''
    The basic idea is given a function create an object.
    The object can then run the function in a thread.
    It provides a wrapper to start it,check its status,and get data out the function.
    '''
    def __init__(self,func):
        self.thread = None
        self.data = None
        self.func = self.save_data(func)

    def save_data(self,func):
        '''modify function to save its returned data'''
        def new_func(*args, **kwargs):
            self.data=func(*args, **kwargs)
        return new_func

    def start(self,params):
        self.data = None
        if self.thread is not None:
            if self.thread.is_alive():
                return 'running' #could raise exception here
        #unless thread exists and is alive start or restart it
        self.thread = threading.Thread(target=self.func,args=params)
        self.thread.start()
        return 'started'

    def status(self):
        if self.thread is None:
            return 'not_started'
        else:
            if self.thread.is_alive():
                return 'running'
            else:
                return 'finished'

    def get_results(self):
        if self.thread is None:
            return 'not_started' #could return exception
        else:
            if self.thread.is_alive():
                return 'running'
            else:
                return self.data

def add(x,y):
    return x +y

add_worker = ThreadWorker(add)
print(add_worker.start((1,2,)))
print(add_worker.status())
print(add_worker.get_results())
Taking into consideration @iman's comment on @JakeBiesinger's answer, I have recomposed it to support a variable number of threads:
from multiprocessing.pool import ThreadPool

def foo(bar, baz):
    print('hello {0}'.format(bar))
    return 'foo' + baz

numOfThreads = 3
results = []
pool = ThreadPool(numOfThreads)
for i in range(0, numOfThreads):
    results.append(pool.apply_async(foo, ('world', 'foo'))) # tuple of args for foo
# do some other stuff in the main process
# ...
# ...
results = [r.get() for r in results]
print(results)
pool.close()
pool.join()
I'm using this wrapper, which comfortably turns any function for running in a Thread - taking care of its return value or exception. It doesn't add Queue overhead.
import sys
import threading

def threading_func(f):
    """Decorator for running a function in a thread and handling its return
    value or exception"""
    def start(*args, **kw):
        def run():
            try:
                th.ret = f(*args, **kw)
            except:
                th.exc = sys.exc_info()
        def get(timeout=None):
            th.join(timeout)
            if th.exc:
                ## py2: raise th.exc[0], th.exc[1], th.exc[2]
                raise th.exc[1] # py3
            return th.ret
        th = threading.Thread(None, run)
        th.exc = None
        th.get = get
        th.start()
        return th
    return start
Usage Examples
def f(x):
    return 2.5 * x

th = threading_func(f)(4)
print("still running?:", th.is_alive())
print("result:", th.get(timeout=1.0))

@threading_func
def th_mul(a, b):
    return a * b

th = th_mul("text", 2.5)
try:
    print(th.get())
except TypeError:
    print("exception thrown ok.")
Notes on threading module
Comfortable return value & exception handling of a threaded function is a frequent "Pythonic" need and should indeed already be offered by the threading module - possibly directly in the standard Thread class. ThreadPool has way too much overhead for simple tasks - 3 managing threads, lots of bureaucracy. Unfortunately Thread's layout was copied from Java originally - which you see e.g. from the still useless 1st (!) constructor parameter group.
Based on what kindall mentioned, here's a more generic solution that works with Python 3.
import threading

class ThreadWithReturnValue(threading.Thread):
    def __init__(self, *init_args, **init_kwargs):
        threading.Thread.__init__(self, *init_args, **init_kwargs)
        self._return = None

    def run(self):
        self._return = self._target(*self._args, **self._kwargs)

    def join(self):
        threading.Thread.join(self)
        return self._return
Usage
th = ThreadWithReturnValue(target=requests.get, args=('http://www.google.com',))
th.start()
response = th.join()
response.status_code # => 200
join() always returns None; I think you should subclass Thread to handle return values and so on.
You can define a mutable above the scope of the threaded function, and add the result to that. (I also modified the code to be python3 compatible)
returns = {}

def foo(bar):
    print('hello {0}'.format(bar))
    returns[bar] = 'foo'

from threading import Thread
t = Thread(target=foo, args=('world!',))
t.start()
t.join()
print(returns)
This returns {'world!': 'foo'}
If you use the function input as the key to your results dict, every unique input is guaranteed to give an entry in the results
Define your target to
1) take an argument q
2) replace any statements return foo with q.put(foo); return
so a function
def func(a):
    ans = a * a
    return ans
would become
def func(a, q):
    ans = a * a
    q.put(ans)
    return
and then you would proceed as such
from queue import Queue  # Python 2: from Queue import Queue
from threading import Thread
ans_q = Queue()
arg_tups = [(i, ans_q) for i in range(10)]  # Python 2: xrange
threads = [Thread(target=func, args=arg_tup) for arg_tup in arg_tups]
_ = [t.start() for t in threads]
_ = [t.join() for t in threads]
results = [ans_q.get() for _ in range(len(threads))]
And you can use function decorators/wrappers to make it so you can use your existing functions as target without modifying them, but follow this basic scheme.
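One possible shape for such a wrapper is sketched here. The helper name results_to_queue is hypothetical; it leaves the original function untouched and routes its return value into the queue:

```python
import queue
import threading
from functools import wraps

def results_to_queue(func, q):
    """Hypothetical helper: wrap func so its return value is put on q."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        q.put(func(*args, **kwargs))
    return wrapper

def square(a):  # existing function, not modified
    return a * a

ans_q = queue.Queue()
threads = [
    threading.Thread(target=results_to_queue(square, ans_q), args=(i,))
    for i in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Results arrive in completion order, so sort them for a stable view.
results = sorted(ans_q.get() for _ in threads)
print(results)
```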
GuySoft's idea is great, but I think the object does not necessarily have to inherit from Thread and start() could be removed from interface:
from threading import Thread
import queue

class ThreadWithReturnValue(object):
    def __init__(self, target=None, args=(), **kwargs):
        self._que = queue.Queue()
        self._t = Thread(target=lambda q,arg1,kwargs1: q.put(target(*arg1, **kwargs1)) ,
                         args=(self._que, args, kwargs), )
        self._t.start()

    def join(self):
        self._t.join()
        return self._que.get()

def foo(bar):
    print('hello {0}'.format(bar))
    return "foo"

twrv = ThreadWithReturnValue(target=foo, args=('world!',))
print(twrv.join()) # prints foo
This is a pretty old question, but I wanted to share a simple solution that has worked for me and helped my dev process.
The methodology behind this answer is the fact that the "new" target function, inner is assigning the result of the original function (passed through the __init__ function) to the result instance attribute of the wrapper through something called closure.
This allows the wrapper class to hold onto the return value for callers to access at anytime.
NOTE: This method doesn't need to use any mangled methods or private methods of the threading.Thread class, although yield functions have not been considered (OP did not mention yield functions).
Enjoy!
from threading import Thread as _Thread

class ThreadWrapper:
    def __init__(self, target, *args, **kwargs):
        self.result = None
        self._target = self._build_threaded_fn(target)
        self.thread = _Thread(
            target=self._target,
            *args,
            **kwargs
        )

    def _build_threaded_fn(self, func):
        def inner(*args, **kwargs):
            self.result = func(*args, **kwargs)
        return inner
Additionally, you can run pytest (assuming you have it installed) with the following code to demonstrate the results:
import time
from commons import ThreadWrapper

def test():
    def target():
        time.sleep(1)
        return 'Hello'

    wrapper = ThreadWrapper(target=target)
    wrapper.thread.start()
    r = wrapper.result
    assert r is None
    time.sleep(2)
    r = wrapper.result
    assert r == 'Hello'
As mentioned, the multiprocessing pool is much slower than basic threading. Using queues, as proposed in some answers here, is a very effective alternative. I have used it with dictionaries in order to be able to run a lot of small threads and collect multiple answers by combining them with dictionaries:
#!/usr/bin/env python3
import threading
# use Queue for python2
import queue
import random

LETTERS = 'abcdefghijklmnopqrstuvwxyz'
LETTERS = [ x for x in LETTERS ]
NUMBERS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def randoms(k, q):
    result = dict()
    result['letter'] = random.choice(LETTERS)
    result['number'] = random.choice(NUMBERS)
    q.put({k: result})

threads = list()
q = queue.Queue()
results = dict()
for name in ('alpha', 'oscar', 'yankee',):
    threads.append( threading.Thread(target=randoms, args=(name, q)) )
    threads[-1].start()
_ = [ t.join() for t in threads ]
while not q.empty():
    results.update(q.get())
print(results)
Here is the version that I created of @Kindall's answer.
This version makes it so that all you have to do is input your command with arguments to create the new thread.
This was made with Python 3.8:
from threading import Thread
from typing import Any

def test(plug, plug2, plug3):
    print(f"hello {plug}")
    print(f'I am the second plug : {plug2}')
    print(plug3)
    return 'I am the return Value!'

def test2(msg):
    return f'I am from the second test: {msg}'

def test3():
    print('hello world')

def NewThread(com, Returning: bool, *arguments) -> Any:
    """
    Will create a new thread for a function/command.
    :param com: Command to be Executed
    :param arguments: Arguments to be sent to Command
    :param Returning: True/False Will this command need to return anything
    """
    class NewThreadWorker(Thread):
        def __init__(self, group = None, target = None, name = None, args = (), kwargs = None, *,
                     daemon = None):
            Thread.__init__(self, group, target, name, args, kwargs, daemon = daemon)
            self._return = None

        def run(self):
            if self._target is not None:
                self._return = self._target(*self._args, **self._kwargs)

        def join(self):
            Thread.join(self)
            return self._return

    ntw = NewThreadWorker(target = com, args = (*arguments,))
    ntw.start()
    if Returning:
        return ntw.join()

if __name__ == "__main__":
    print(NewThread(test, True, 'hi', 'test', test2('hi')))
    NewThread(test3, True)
You can use pool.apply_async() of ThreadPool() to return the value from test() as shown below:
from multiprocessing.pool import ThreadPool
def test(num1, num2):
return num1 + num2
pool = ThreadPool(processes=1) # Here
result = pool.apply_async(test, (2, 3)) # Here
print(result.get()) # 5
And, you can also use submit() of concurrent.futures.ThreadPoolExecutor() to return the value from test() as shown below:
from concurrent.futures import ThreadPoolExecutor
def test(num1, num2):
return num1 + num2
with ThreadPoolExecutor(max_workers=1) as executor:
future = executor.submit(test, 2, 3) # Here
print(future.result()) # 5
And, instead of return, you can use the array result as shown below:
from threading import Thread
def test(num1, num2, r):
r[0] = num1 + num2 # Instead of "return"
result = [None] # Here
thread = Thread(target=test, args=(2, 3, result))
thread.start()
thread.join()
print(result[0]) # 5
And instead of return, you can also use the queue result as shown below:
from threading import Thread
import queue
def test(num1, num2, q):
q.put(num1 + num2) # Instead of "return"
queue = queue.Queue() # Here
thread = Thread(target=test, args=(2, 3, queue))
thread.start()
thread.join()
print(queue.get()) # 5
The shortest and simplest way I've found to do this is to take advantage of Python classes and their dynamic properties. You can retrieve the current thread from within the context of your spawned thread using threading.current_thread(), and assign the return value to a property.
import threading
def some_target_function():
# Your code here.
threading.current_thread().return_value = "Some return value."
your_thread = threading.Thread(target=some_target_function)
your_thread.start()
your_thread.join()
return_value = your_thread.return_value
print(return_value)
One usual solution is to wrap your function foo in a wrapper that puts its return value on a queue, like
result = queue.Queue()
def task_wrapper(*args):
    result.put(target(*args))
Then the whole code may look like this (run_all is just an illustrative name; each entry of args_list is a tuple of arguments for target):
import queue
import threading

def run_all(target, args_list):
    result = queue.Queue()
    def task_wrapper(*args):
        result.put(target(*args))
    threads = [threading.Thread(target=task_wrapper, args=args) for args in args_list]
    for t in threads:
        t.start()
    # join() already blocks until every thread has finished
    for t in threads:
        t.join()
    return result
Note
One important issue is that the return values may be unordered.
(In fact, the return value does not have to be saved to a queue; you can choose any thread-safe data structure.)
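If the order of results matters, one option is to tag each result with the index of its input before putting it on the queue, and to sort afterwards. The following is a minimal sketch under that assumption; run_ordered is a hypothetical helper name, and a target that raises simply leaves no entry in the output.

```python
import queue
import threading


def run_ordered(target, args_list):
    """Run target(*args) in one thread per entry; return results in input order."""
    results = queue.Queue()

    def task_wrapper(index, args):
        # Tag each result with its input position so ordering can be restored.
        results.put((index, target(*args)))

    threads = [
        threading.Thread(target=task_wrapper, args=(i, args))
        for i, args in enumerate(args_list)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All threads are done, so draining with empty() is safe here.
    tagged = []
    while not results.empty():
        tagged.append(results.get())
    return [value for _, value in sorted(tagged, key=lambda pair: pair[0])]


if __name__ == "__main__":
    print(run_ordered(pow, [(2, 3), (3, 2), (10, 0)]))
```

Sorting on the index alone avoids comparing the result values themselves, which may not be orderable.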
Kindall's answer in Python3
from threading import Thread

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, *, daemon=None):
        # daemon is keyword-only in Thread.__init__, so it must be passed by name
        Thread.__init__(self, group, target, name, args, kwargs, daemon=daemon)
        self._return = None

    def run(self):
        try:
            if self._target:
                self._return = self._target(*self._args, **self._kwargs)
        finally:
            del self._target, self._args, self._kwargs

    def join(self, timeout=None):
        Thread.join(self, timeout)
        return self._return
I know this thread is old.... but I faced the same problem... If you are willing to use thread.join()
import threading
class test:
def __init__(self):
self.msg=""
def hello(self,bar):
print('hello {}'.format(bar))
self.msg="foo"
def main(self):
thread = threading.Thread(target=self.hello, args=('world!',))
thread.start()
thread.join()
print(self.msg)
g=test()
g.main()
Best way... Define a global variable, then change the variable in the threaded function. Nothing to pass in or retrieve back
from threading import Thread

# global var
random_global_var = 5

def function():
    global random_global_var
    random_global_var += 1

domath = Thread(target=function)
domath.start()
domath.join()
print(random_global_var)
# result: 6
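With a single worker thread the global works as shown. If several threads update the same global, the read-modify-write in += is not atomic, so a common precaution is to guard it with a threading.Lock. A minimal sketch, where counter, counter_lock, add_one and run are illustrative names and not part of the answer above:

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_one(times):
    global counter
    for _ in range(times):
        # The lock makes the read-modify-write of the shared global atomic.
        with counter_lock:
            counter += 1


def run(n_threads=4, times=1000):
    """Reset the counter, increment it from several threads, return the total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_one, args=(times,)) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```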

How to track progress of job worker threads when threads are initiated from a Job Processor?

I have a scenario where I get a list of jobs to be processed (e.g. a list of web pages to be crawled from the internet). Each job is independent, and the jobs can be processed in any order. Individual jobs may fail or succeed and may have to be handled accordingly (e.g. temporary data for a failed crawl task may have to be deleted and recrawled in the next round).
I am trying to implement it using thread based processing in python. To mimic the actual task lets say I have a huge list of integer arrays and the individual job is to compute the Sum or Product of each array. What I am trying to do is to use a JobsProcessor class object to instantiate threads of JobWorker class objects which perform the actual processing by creating objects for other classes (Sum and Product here). The code for the same is mentioned below. A snippet is shown
from queue import Queue, Empty
from threading import Thread
import time
class Product:
def __init__(self,data):
self.data = data
def doOperation(self):
try:
product =self.data[0]
for d in self.data[1:]:
if d>100000:
raise Exception( "Forcefully throwing exception")
product*=d
time.sleep(1)
return product
except:
return "product computation failed"
class Sum:
def __init__(self,data):
self.data = data
def doOperation(self):
try:
sum =0
for d in self.data:
sum+=d
time.sleep(1)
return sum
except:
return "sum computation failed"
class JobWorker(Thread):
def __init__(self, queue):
Thread.__init__(self)
self.queue = queue
def run(self):
while True:
try:
jobitem = self.queue.get_nowait()
if jobitem is None:
break
jobdata, optype = jobitem
if optype =='sum':
opobj = Sum(jobdata)
jobresult = opobj.doOperation()
elif optype =='product':
opobj = Product(jobdata)
jobresult = opobj.doOperation()
else:
print ("Invalid op type")
jobresult = 'Failed'
print(" job result", jobresult)
self.queue.task_done()
except Empty:
break
except:
print ("Some exception occured")
#How to pass it to up to the main jobs processor#
class JobsProcessor(object):
def __init__(self, joblist):
self.joblist = joblist
self.job_queue = Queue()
def process_resources(self):
try:
for job in self.joblist:
self.job_queue.put(job)
for i in range(2):
jobthread = JobWorker(self.job_queue)
jobthread.start()
'''
Write code here to monitor current status for all running jobs
'''
self.job_queue.join()
'''I want to write code here to track progress status for all jobs
Some jobs may have failed, not completed and based on that I may
want to take further action such as retry or flag them'''
print("Finished Jobs")
except:
pass
orgjobList = [ ([1,5,9,4],'sum'),
([5,4,5,8],'product'),
([100,45,678,999],'product'),
([3743,34,44324,543],'sum'),
([100001, 100002, 9876, 83989], 'product')]
mainprocessor = JobsProcessor(orgjobList)
mainprocessor.process_resources()
I want to add 2 functionalities to this process.
Consolidation : when all the job threads complete I want to know the status of all the JobWorker objects (e.g if they are completed successfully/ complete with failure). Failure/Exception may occur in the JobWorker object or may be even the Sum or Product object. The failure/success status should be propagate back to JobsProcessor, where I want to perform other actions such as reprocess/delete/send_elsewhere etc based on the returned status
Monitoring - also I want to have a Monitor functionality which can continuously check on the status of current running/completed jobs and perform the requisite actions such as delete immediately rather than waiting till the end for Consolidation
Please advise how I can add the above functionalities, and if only one of them would suffice for cases such as crawling pages. Any other suggestions are also welcome.
You can add both functionalities to your code in either of two ways:
Using global variables (simplest approach)
Using getProgress and getStatus methods in your class (more elegant approach)
You can create two threads: one thread does the actual work and updates the progress variable, while the other reads it.
For the second approach, you can set a few attributes in the class's __init__, like the following.
def __init__(self):
self.progress = 0
self.success = True
self.isDone = False
self.error = "No Error Occurred"
Then you can include the logic in your code like the following -
def actualWork(self):
self.isDone = False
try:
for i in range(1000):
self.progress = i
time.sleep(0.01)
self.isDone = True
except Exception as e:
self.success = False
self.error = str(e)
def getProgress(self):
return self.progress
def getError(self):
return self.error
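For the consolidation part of the question, another option is to have each worker put a (job, status, payload) record on a second queue that the processor drains once the job queue has been joined. The sketch below is only an illustration under stated assumptions, not the asker's code: worker, process_jobs, checked_sum and the status strings "ok" and "failed" are hypothetical names, and the operation is passed in as a plain function.

```python
import queue
import threading


def worker(jobs, results, operation):
    """Take jobs until the queue is empty and report the outcome of each one."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        try:
            results.put((job, "ok", operation(job)))
        except Exception as exc:
            # Report the failure instead of swallowing it inside the thread.
            results.put((job, "failed", exc))
        finally:
            jobs.task_done()


def process_jobs(joblist, operation, n_workers=2):
    jobs, results = queue.Queue(), queue.Queue()
    for job in joblist:
        jobs.put(job)
    threads = [
        threading.Thread(target=worker, args=(jobs, results, operation))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    jobs.join()  # returns once task_done() has been called for every job
    for t in threads:
        t.join()
    outcome = []
    while not results.empty():
        outcome.append(results.get())
    return outcome


def checked_sum(data):
    # Hypothetical operation that fails on large values, to produce a failure record.
    if any(d > 100000 for d in data):
        raise ValueError("value too large")
    return sum(data)


if __name__ == "__main__":
    for job, status, payload in process_jobs([(1, 5, 9, 4), (100001, 2)], checked_sum):
        print(job, status, payload)
```

For monitoring while the workers are still running, the processor could instead poll the results queue with results.get(timeout=...) in a loop and act on each record as it arrives.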

Python multiprocessing: Parallel Pipeline implementation

I am trying to create a pipeline but I have bad exit issues(zombies) and performance ones. I have created this generic class:
class Generator(Process):
'''
<function>: function to call. None value means that the current class will
be used as a template for another class, with <function> being defined
there
<input_queues> : Queue or list of Queue objects , which refer to the input
to <function>.
<output_queues> : Queue or list of Queue objects , which are used to pass
output
<sema_to_acquire> : Condition or list of Condition objects, which are
blocking generation while not notified
<sema_to_release> : Condition or list of Condition objects, which will be
notified after <function> is called
'''
def __init__(self, function=None, input_queues=None, output_queues=None, sema_to_acquire=None,
sema_to_release=None):
Process.__init__(self)
self.input_queues = input_queues
self.output_queues = output_queues
self.sema_to_acquire = sema_to_acquire
self.sema_to_release = sema_to_release
if function is not None:
self.function = function
def run(self):
if self.sema_to_release is not None:
try:
self.sema_to_release.release()
except AttributeError:
[sema.release() for sema in self.sema_to_release]
while True:
if self.sema_to_acquire is not None:
try:
self.sema_to_acquire.acquire()
except AttributeError:
[sema.acquire() for sema in self.sema_to_acquire]
if self.input_queues is not None:
try:
data = self.input_queues.get()
except AttributeError:
data = [queue.get() for queue in self.input_queues]
isiterable = True
try:
iter(data)
res = self.function(*tuple(data))
except TypeError:
res = self.function(data)
else:
res = self.function()
if self.output_queues is not None:
try:
if self.output_queues.full():
self.output_queues.get(res)
self.output_queues.put(res)
except AttributeError:
[queue.put(res) for queue in self.output_queues]
if self.sema_to_release is not None:
if self.sema_to_release is not None:
try:
self.sema_to_release.release()
except AttributeError:
[sema.release() for sema in self.sema_to_release]
to simulate a worker inside a pipeline. The Generator is wanted to run an infinite while loop, in which a function is executed using input from n queues and the result is written to m queues. There are some semaphores which need to be acquired by a process, before one iteration happens, and when the iteration finishes some other semaphores are released. So, for processes needed to run on parallel and produce an input for another I send 'crossed' semaphores as arguments, in order to force them to perform together single iterations. For processes which do not need to run on parallel I do not use any conditions. An example (which I actually use, if anyone ignores the input functions) is the following:
import time
from multiprocessing import Lock, Process, Queue, Semaphore
print_lock = Lock()
_t_=0.5
def func0(data):
time.sleep(_t_)
print_lock.acquire()
print 'func0 sends',data
print_lock.release()
return data
def func1(data):
time.sleep(_t_)
print_lock.acquire()
print 'func1 receives and sends',data
print_lock.release()
return data
def func2(data):
time.sleep(_t_)
print_lock.acquire()
print 'func2 receives and sends',data
print_lock.release()
return data
def func3(*data):
print_lock.acquire()
print 'func3 receives',data
print_lock.release()
run_svm = Semaphore()
run_rf = Semaphore()
inp_rf = Queue()
inp_svm = Queue()
out_rf = Queue()
out_svm = Queue()
kin_stream = Queue()
res_mixed = Queue()
streamproc = Generator(func0,
input_queues=kin_stream,
output_queues=[inp_rf,
inp_svm])
streamproc.daemon = True
streamproc.start()
svm_class = Generator(func1,
input_queues=inp_svm,
output_queues=out_svm,
sema_to_acquire=run_svm,
sema_to_release=run_rf)
svm_class.daemon=True
svm_class.start()
rf_class = Generator(func2,
input_queues=inp_rf,
output_queues=out_rf,
sema_to_acquire=run_rf,
sema_to_release=run_svm)
rf_class.daemon=True
rf_class.start()
mixed_class = Generator(func3,
input_queues=[out_rf, out_svm])
mixed_class.daemon = True
mixed_class.start()
count = 1
while True:
kin_stream.put([count])
count+=1
time.sleep(1)
streamproc.join()
svm_class.join()
rf_class.join()
mixed_class.join()
This example gives:
func0 sends 1
func2 receives and sends 1
func1 receives and sends 1
func3 receives (1, 1)
func0 sends 2
func2 receives and sends 2
func1 receives and sends 2
func3 receives (2, 2)
func0 sends 3
func2 receives and sends 3
func1 receives and sends 3
func3 receives (3, 3)
...
All good. However, if I try to kill main, the other subprocesses are not guaranteed to terminate: the terminal might freeze, or the Python interpreter might keep running in the background (probably as zombies). I have no clue why this is happening, as I have set the corresponding daemon flags to True.
Does anyone have a better idea of implementing this type of pipeline or can suggest a solution to this evil problem? Thank you all.
EDIT
Fixed testing. The zombies still do exist however.
I was able to overcome this problem by introducing a termination queue as an additional argument to the given class and setting up a signal handler for SIGINT in order to stop the pipeline execution. I do not know if this is the most elegant way to get it working, but it works. The way the signal handler is set is important: for some reason it must be set before process.start() (if anyone knows why, please comment). Furthermore, the signal handler is inherited by the subprocesses, so I have to put the join inside a try: ... except AssertionError: pass pattern, otherwise it throws an error (again, if someone knows how to avoid this, please elaborate).
SOURCE CODE
class Generator(Process):
'''
<term_queue>: Queue to write termination events, must be same for all
processes spawned
<function>: function to call. None value means that the current class will
be used as a template for another class, with <function> being defined
there
<input_queues> : Queue or list of Queue objects , which refer to the input
to <function>.
<output_queues> : Queue or list of Queue objects , which are used to pass
output
<sema_to_acquire> : Semaphore or list of Semaphore objects, which are
blocking function execution
<sema_to_release> : Semaphore or list of Semaphore objects, which will be
released after <function> is called
'''
def __init__(self, term_queue,
function=None, input_queues=None, output_queues=None, sema_to_acquire=None,
sema_to_release=None):
Process.__init__(self)
self.term_queue = term_queue
self.input_queues = input_queues
self.output_queues = output_queues
self.sema_to_acquire = sema_to_acquire
self.sema_to_release = sema_to_release
if function is not None:
self.function = function
def run(self):
if self.sema_to_release is not None:
try:
self.sema_to_release.release()
except AttributeError:
deb = [sema.release() for sema in self.sema_to_release]
while True:
if not self.term_queue.empty():
self.term_queue.put((self.name, 0))
break
try:
if self.sema_to_acquire is not None:
try:
self.sema_to_acquire.acquire()
except AttributeError:
deb = [sema.acquire() for sema in self.sema_to_acquire]
if self.input_queues is not None:
try:
data = self.input_queues.get()
except AttributeError:
data = tuple([queue.get()
for queue in self.input_queues])
res = self.function(data)
else:
res = self.function()
if self.output_queues is not None:
try:
if self.output_queues.full():
self.output_queues.get(res)
self.output_queues.put(res)
except AttributeError:
deb = [queue.put(res) for queue in self.output_queues]
if self.sema_to_release is not None:
if self.sema_to_release is not None:
try:
self.sema_to_release.release()
except AttributeError:
deb = [sema.release() for sema in self.sema_to_release]
except Exception as exc:
self.term_queue.put((self.name, exc))
break
def signal_handler(sig, frame, term_queue, processes):
'''
<term_queue> is the queue to write termination of the __main__
<processes> is a dictionary holding all running processes
'''
term_queue.put((__name__, 'SIGINT'))
try:
[processes[key].join() for key in processes]
except AssertionError:
pass
sys.exit(0)
term_queue = Queue()
'''
initialize some Generators and add them to <processes> dictionary
'''
signal.signal(signal.SIGINT, lambda sig,frame: signal_handler(sig,frame,
term_queue,processes))
[processes[key].start() for key in processes]
while True:
if not term_queue.empty():
[processes[key].join() for key in processes]
break
and the example is changed accordingly (comment if you want me to add it)
I have had to work on this issue as well, and indeed, passing some communication pipe or queue to the processes seems to be the easiest way to tell them to terminate.
However, the termination code can take advantage of a finally: block in the main process, which will take care of any event, including signals.
If your processes are supposed to terminate at the same time as an object, you might also want to play with weakref.finalize, but it can be tricky.

How to put a "session" time out for a function in Python? [duplicate]

I'm calling a function in Python which I know may stall and force me to restart the script.
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?
You may use the signal package if you are running on UNIX:
In [1]: import signal
# Register an handler for the timeout
In [2]: def handler(signum, frame):
...: print("Forever is over!")
...: raise Exception("end of time")
...:
# This function *may* run for an indetermined time...
In [3]: def loop_forever():
...: import time
...: while 1:
...: print("sec")
...: time.sleep(1)
...:
...:
# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0
# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0
In [6]: try:
...: loop_forever()
...: except Exception as exc:
...: print(exc)
....:
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time
# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0
10 seconds after the call signal.alarm(10), the handler is called. This raises an exception that you can intercept from the regular Python code.
This module doesn't play well with threads (but then, who does?)
Note that since we raise an exception when the timeout happens, it may end up caught and ignored inside the function. Here is an example of one such function:
def loop_forever():
while 1:
print('sec')
try:
time.sleep(10)
except:
continue
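One way to make the timeout harder to swallow by accident is to raise an exception that does not derive from Exception, so that an except Exception: clause inside the function does not catch it (a bare except: still would). The following is a minimal Unix-only sketch under that assumption; HardTimeout, call_with_alarm and stubborn are hypothetical names, signal.setitimer is used so that fractional seconds work, and the call has to be made from the main thread.

```python
import signal
import time


class HardTimeout(BaseException):
    """Deliberately not an Exception subclass, so `except Exception` skips it."""


def _raise_timeout(signum, frame):
    raise HardTimeout()


def call_with_alarm(func, seconds, default=None):
    """Run func(); return default if it runs longer than `seconds` (Unix, main thread)."""
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    except HardTimeout:
        return default
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)  # restore the old handler


def stubborn():
    # Swallows ordinary exceptions, but HardTimeout passes straight through.
    while True:
        try:
            time.sleep(0.05)
        except Exception:
            continue
```

A call such as call_with_alarm(stubborn, 0.2, default="timed out") then returns the default instead of looping forever.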
You can use multiprocessing.Process to do exactly that.
Code
import multiprocessing
import time

# bar
def bar():
    for i in range(100):
        print("Tick")
        time.sleep(1)

if __name__ == '__main__':
    # Start bar as a process
    p = multiprocessing.Process(target=bar)
    p.start()
    # Wait for 10 seconds or until process finishes
    p.join(10)
    # If process is still active
    if p.is_alive():
        print("running... let's kill it...")
        # Terminate - may not work if process is stuck for good
        p.terminate()
        # OR Kill - will work for sure, no chance for process to finish nicely however
        # p.kill()
        p.join()
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it?
I posted a gist that solves this question/problem with a decorator and a threading.Timer. Here it is with a breakdown.
Imports and setups for compatibility
It was tested with Python 2 and 3. It should also work under Unix/Linux and Windows.
First the imports. These attempt to keep the code consistent regardless of the Python version:
from __future__ import print_function
import sys
import threading
from time import sleep
try:
import thread
except ImportError:
import _thread as thread
Use version independent code:
try:
range, _print = xrange, print
def print(*args, **kwargs):
flush = kwargs.pop('flush', False)
_print(*args, **kwargs)
if flush:
kwargs.get('file', sys.stdout).flush()
except NameError:
pass
Now we have imported our functionality from the standard library.
exit_after decorator
Next we need a function to terminate the main() from the child thread:
def quit_function(fn_name):
# print to stderr, unbuffered in Python 2.
print('{0} took too long'.format(fn_name), file=sys.stderr)
sys.stderr.flush() # Python 3 stderr is likely buffered.
thread.interrupt_main() # raises KeyboardInterrupt
And here is the decorator itself:
def exit_after(s):
'''
use as decorator to exit process if
function takes longer than s seconds
'''
def outer(fn):
def inner(*args, **kwargs):
timer = threading.Timer(s, quit_function, args=[fn.__name__])
timer.start()
try:
result = fn(*args, **kwargs)
finally:
timer.cancel()
return result
return inner
return outer
Usage
And here's the usage that directly answers your question about exiting after 5 seconds!:
@exit_after(5)
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
sleep(1)
print('countdown finished')
Demo:
>>> countdown(3)
countdown started
3, 2, 1, 0, countdown finished
>>> countdown(10)
countdown started
10, 9, 8, 7, 6, countdown took too long
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 6, in countdown
KeyboardInterrupt
The second function call will not finish, instead the process should exit with a traceback!
KeyboardInterrupt does not always stop a sleeping thread
Note that sleep will not always be interrupted by a keyboard interrupt, on Python 2 on Windows, e.g.:
@exit_after(1)
def sleep10():
sleep(10)
print('slept 10 seconds')
>>> sleep10()
sleep10 took too long # Note that it hangs here about 9 more seconds
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 3, in sleep10
KeyboardInterrupt
nor is it likely to interrupt code running in extensions unless it explicitly checks for PyErr_CheckSignals(), see Cython, Python and KeyboardInterrupt ignored
I would avoid sleeping a thread more than a second, in any case - that's an eon in processor time.
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?
To catch it and do something else, you can catch the KeyboardInterrupt.
>>> try:
... countdown(10)
... except KeyboardInterrupt:
... print('do something else')
...
countdown started
10, 9, 8, 7, 6, countdown took too long
do something else
I have a different proposal which is a pure function (with the same API as the threading suggestion) and seems to work fine (based on suggestions on this thread)
def timeout(func, args=(), kwargs={}, timeout_duration=1, default=None):
import signal
class TimeoutError(Exception):
pass
def handler(signum, frame):
raise TimeoutError()
# set the timeout handler
signal.signal(signal.SIGALRM, handler)
signal.alarm(timeout_duration)
try:
result = func(*args, **kwargs)
except TimeoutError as exc:
result = default
finally:
signal.alarm(0)
return result
I ran across this thread when searching for a timeout call on unit tests. I didn't find anything simple in the answers or 3rd party packages so I wrote the decorator below you can drop right into code:
import multiprocessing.pool
import functools
def timeout(max_timeout):
"""Timeout decorator, parameter in seconds."""
def timeout_decorator(item):
"""Wrap the original function."""
@functools.wraps(item)
def func_wrapper(*args, **kwargs):
"""Closure for function."""
pool = multiprocessing.pool.ThreadPool(processes=1)
async_result = pool.apply_async(item, args, kwargs)
# raises a TimeoutError if execution exceeds max_timeout
return async_result.get(max_timeout)
return func_wrapper
return timeout_decorator
Then it's as simple as this to timeout a test or any function you like:
@timeout(5.0) # if execution takes longer than 5 seconds, raise a TimeoutError
def test_base_regression(self):
...
The stopit package, found on pypi, seems to handle timeouts well.
I like the @stopit.threading_timeoutable decorator, which adds a timeout parameter to the decorated function and does what you expect: it stops the function.
Check it out on pypi: https://pypi.python.org/pypi/stopit
I am the author of wrapt_timeout_decorator.
Most of the solutions presented here work wonderfully under Linux at first glance, because we have fork() and signals, but on Windows things look a bit different.
And when it comes to subthreads on Linux, you can't use signals anymore.
In order to spawn a process under Windows, the function needs to be picklable, and many decorated functions or class methods are not.
So you need to use a better pickler like dill and multiprocess (not pickle and multiprocessing); that's why you can't use ProcessPoolExecutor (or only with limited functionality).
For the timeout itself, you need to define what timeout means, because on Windows it will take considerable (and not determinable) time to spawn the process. This can be tricky with short timeouts. Let's assume spawning the process takes about 0.5 seconds (easily!). If you give a timeout of 0.2 seconds, what should happen?
Should the function time out after 0.5 + 0.2 seconds (so let the method run for 0.2 seconds)?
Or should the called process time out after 0.2 seconds (in that case, the decorated function will ALWAYS timeout, because in that time it is not even spawned)?
Also, nested decorators can be nasty, and you can't use signals in a subthread. If you want to create a truly universal, cross-platform decorator, all this needs to be taken into consideration (and tested).
Other issues are passing exceptions back to the caller, as well as logging issues (if used in the decorated function - logging to files in another process is NOT supported)
I tried to cover all edge cases, You might look into the package wrapt_timeout_decorator, or at least test Your own solutions inspired by the unittests used there.
@Alexis Eggermont - unfortunately I don't have enough points to comment - maybe someone else can notify you - I think I solved your import issue.
There are a lot of suggestions, but none using concurrent.futures, which I think is the most legible way to handle this.
from concurrent.futures import ProcessPoolExecutor
# Warning: this does not terminate function if timeout
def timeout_five(fnc, *args, **kwargs):
with ProcessPoolExecutor() as p:
f = p.submit(fnc, *args, **kwargs)
return f.result(timeout=5)
Super simple to read and maintain.
We make a pool, submit a single process and then wait up to 5 seconds before raising a TimeoutError that you could catch and handle however you needed.
Native to python 3.2+ and backported to 2.7 (pip install futures).
Switching between threads and processes is as simple as replacing ProcessPoolExecutor with ThreadPoolExecutor.
If you want to terminate the Process on timeout I would suggest looking into Pebble.
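As a sketch of the "catch and handle" part, the version below uses ThreadPoolExecutor and returns a fallback value on timeout. It avoids the with block on purpose: leaving a with block calls shutdown(wait=True), which waits for the running function to finish, so the caller would still block. The names call_with_timeout, slow_echo and the default parameter are hypothetical, and the worker thread is not killed; it keeps running in the background until the function returns.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(fnc, *args, timeout=5, default=None, **kwargs):
    """Return fnc(*args, **kwargs), or `default` if no result arrives in time."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fnc, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return default
    finally:
        # Do not wait for the worker: the function cannot be interrupted here.
        executor.shutdown(wait=False)


def slow_echo(value, delay):
    time.sleep(delay)
    return value
```

Importing the exception under an alias keeps the sketch working on Python versions where concurrent.futures.TimeoutError is not the same class as the built-in TimeoutError.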
Building on and enhancing the answer by @piro, you can build a context manager. This allows for very readable code, and it disables the alarm signal after a successful run (sets signal.alarm(0)).
from contextlib import contextmanager
import signal
import time
@contextmanager
def timeout(duration):
def timeout_handler(signum, frame):
raise TimeoutError(f'block timedout after {duration} seconds')
signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(duration)
try:
yield
finally:
signal.alarm(0)
def sleeper(duration):
time.sleep(duration)
print('finished')
Example usage:
In [19]: with timeout(2):
...: sleeper(1)
...:
finished
In [20]: with timeout(2):
...: sleeper(3)
...:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-20-66c78858116f> in <module>()
1 with timeout(2):
----> 2 sleeper(3)
3
<ipython-input-7-a75b966bf7ac> in sleeper(t)
1 def sleeper(t):
----> 2 time.sleep(t)
3 print('finished')
4
<ipython-input-18-533b9e684466> in timeout_handler(signum, frame)
2 def timeout(duration):
3 def timeout_handler(signum, frame):
----> 4 raise Exception(f'block timedout after {duration} seconds')
5 signal.signal(signal.SIGALRM, timeout_handler)
6 signal.alarm(duration)
Exception: block timedout after 2 seconds
Great, easy to use and reliable PyPi project timeout-decorator (https://pypi.org/project/timeout-decorator/)
installation:
pip install timeout-decorator
Usage:
import time
import timeout_decorator

@timeout_decorator.timeout(5)
def mytest():
    print("Start")
    for i in range(1, 10):
        time.sleep(1)
        print("%d seconds have passed" % i)

if __name__ == '__main__':
    mytest()
timeout-decorator doesn't work on Windows, as Windows doesn't support signals well.
If you use timeout-decorator on Windows you will get the following:
AttributeError: module 'signal' has no attribute 'SIGALRM'
Some suggested using use_signals=False, but that didn't work for me.
Author @bitranox created the following package:
pip install https://github.com/bitranox/wrapt-timeout-decorator/archive/master.zip
Code Sample:
import time
from wrapt_timeout_decorator import *
@timeout(5)
def mytest(message):
print(message)
for i in range(1,10):
time.sleep(1)
print('{} seconds have passed'.format(i))
def main():
mytest('starting')
if __name__ == '__main__':
main()
Gives the following exception:
TimeoutError: Function mytest timed out after 5 seconds
Highlights
Raises TimeoutError to alert on timeout - can easily be modified
Cross Platform: Windows & Mac OS X
Compatibility: Python 3.6+ (I also tested on python 2.7 and it works with small syntax adjustments)
For full explanation and extension to parallel maps, see here https://flipdazed.github.io/blog/quant%20dev/parallel-functions-with-timeouts
Minimal Example
>>> @killer_call(timeout=4)
... def bar(x):
... import time
... time.sleep(x)
... return x
>>> bar(10)
Traceback (most recent call last):
...
__main__.TimeoutError: function 'bar' timed out after 4s
and as expected
>>> bar(2)
2
Full code
import multiprocessing as mp
import multiprocessing.queues as mpq
import functools
import dill
from typing import Tuple, Callable, Dict, Optional, Iterable, List, Any
class TimeoutError(Exception):
def __init__(self, func: Callable, timeout: int):
self.t = timeout
self.fname = func.__name__
def __str__(self):
return f"function '{self.fname}' timed out after {self.t}s"
def _lemmiwinks(func: Callable, args: Tuple, kwargs: Dict[str, Any], q: mp.Queue):
"""lemmiwinks crawls into the unknown"""
q.put(dill.loads(func)(*args, **kwargs))
def killer_call(func: Callable = None, timeout: int = 10) -> Callable:
"""
Single function call with a timeout
Args:
func: the function
timeout: The timeout in seconds
"""
if not isinstance(timeout, int):
raise ValueError(f'timeout needs to be an int. Got: {timeout}')
if func is None:
return functools.partial(killer_call, timeout=timeout)
@functools.wraps(func)
def _inners(*args, **kwargs) -> Any:
q_worker = mp.Queue()
proc = mp.Process(target=_lemmiwinks, args=(dill.dumps(func), args, kwargs, q_worker))
proc.start()
try:
return q_worker.get(timeout=timeout)
except mpq.Empty:
raise TimeoutError(func, timeout)
finally:
try:
proc.terminate()
except:
pass
return _inners
if __name__ == '__main__':
@killer_call(timeout=4)
def bar(x):
import time
time.sleep(x)
return x
print(bar(2))
bar(10)
Notes
You will need to import inside the function because of the way dill works.
This also means these functions may not be compatible with doctest if there are imports inside your target functions. You will get an issue with __import__ not found.
Just in case it is helpful for anyone, building on the answer by @piro, I've made a function decorator:
import time
import signal
from functools import wraps
def timeout(timeout_secs: int):
def wrapper(func):
@wraps(func)
def time_limited(*args, **kwargs):
# Register a handler for the timeout
def handler(signum, frame):
raise Exception(f"Timeout for function '{func.__name__}'")
# Register the signal function handler
signal.signal(signal.SIGALRM, handler)
# Define a timeout for your function
signal.alarm(timeout_secs)
result = None
try:
result = func(*args, **kwargs)
except Exception as exc:
raise exc
finally:
# disable the signal alarm
signal.alarm(0)
return result
return time_limited
return wrapper
Using the wrapper on a function with a 20-second timeout would look something like:
@timeout(20)
def my_slow_or_never_ending_function(name):
while True:
time.sleep(1)
print(f"Yet another second passed {name}...")
try:
results = my_slow_or_never_ending_function("Yooo!")
except Exception as e:
print(f"ERROR: {e}")
We can use signals for the same purpose. I think the example below will be useful; it is very simple compared to threads.
import signal
class MyException(Exception):
    """Raised by the alarm handler when the time limit expires."""
def timeout(signum, frame):
    raise MyException
# this is an infinite loop, never ending under normal circumstances
def main():
    print('Starting Main ', end='')
    while True:
        print('in main ', end='')
# SIGALRM is only usable on a Unix platform
signal.signal(signal.SIGALRM, timeout)
# change 5 to however many seconds you need
signal.alarm(5)
try:
    main()
except MyException:
    print("whoops")
Another solution, with asyncio:
If you want to cancel the background task, and not just time out in the main code that waits on it, the main thread has to tell the task explicitly to stop, for example through a threading.Event().
import asyncio
import functools
import multiprocessing
from concurrent.futures.thread import ThreadPoolExecutor
class SingletonTimeOut:
pool = None
@classmethod
def run(cls, to_run: functools.partial, timeout: float):
pool = cls.get_pool()
loop = cls.get_loop()
try:
task = loop.run_in_executor(pool, to_run)
return loop.run_until_complete(asyncio.wait_for(task, timeout=timeout))
except asyncio.TimeoutError as e:
error_type = type(e).__name__ #TODO
raise e
@classmethod
def get_pool(cls):
if cls.pool is None:
cls.pool = ThreadPoolExecutor(multiprocessing.cpu_count())
return cls.pool
@classmethod
def get_loop(cls):
try:
return asyncio.get_event_loop()
except RuntimeError:
asyncio.set_event_loop(asyncio.new_event_loop())
# print("NEW LOOP" + str(threading.current_thread().ident))
return asyncio.get_event_loop()
# ---------------
TIME_OUT = float('0.2') # seconds
def toto(input_items,nb_predictions):
return 1
to_run = functools.partial(toto,
input_items=1,
nb_predictions="a")
results = SingletonTimeOut.run(to_run, TIME_OUT)
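As a rough sketch of the explicit cancellation mentioned above, the worker below checks a threading.Event between small units of work. The names cancellable_task and run_with_cancel are hypothetical, and the sketch uses concurrent.futures directly rather than the asyncio wrapper:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def cancellable_task(stop_event: threading.Event, steps: int) -> int:
    """Hypothetical worker: checks stop_event between units of work."""
    done = 0
    for _ in range(steps):
        if stop_event.is_set():
            break  # the caller asked us to stop
        time.sleep(0.05)  # stand-in for one unit of real work
        done += 1
    return done


def run_with_cancel(timeout: float, steps: int = 50) -> str:
    """Run the worker; on timeout, ask it to stop and wait for it to exit."""
    stop_event = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(cancellable_task, stop_event, steps)
        try:
            future.result(timeout=timeout)
            return "finished"
        except FutureTimeout:
            stop_event.set()  # explicit signal from the main thread
            future.result()   # returns promptly once the worker sees the flag
            return "cancelled"


if __name__ == "__main__":
    print(run_with_cancel(0.2))
```

The timeout alone only stops the waiting; the thread ends only because the task itself checks the event.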
#!/usr/bin/python2
import sys, subprocess, threading
proc = subprocess.Popen(sys.argv[2:])
timer = threading.Timer(float(sys.argv[1]), proc.terminate)
timer.start()
proc.wait()
timer.cancel()
exit(proc.returncode)
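The script above runs the command given on the command line and terminates it after the number of seconds passed as the first argument. On Python 3, a similar effect can be sketched with the timeout parameter of subprocess.run. The helper name run_with_timeout is hypothetical, and note that subprocess.run kills the child on timeout instead of calling terminate():

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its exit code, or None if it exceeded `seconds`."""
    try:
        completed = subprocess.run(cmd, timeout=seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here.
        return None


if __name__ == "__main__":
    # A child that sleeps longer than the limit allows.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, 0.5))
```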
The func_timeout package by Tim Savannah has worked well for me.
Installation:
pip install func_timeout
Usage:
import time
from func_timeout import func_timeout, FunctionTimedOut
def my_func(n):
time.sleep(n)
time_to_sleep = 10
# time out after 2 seconds using kwargs
func_timeout(2, my_func, kwargs={'n' : time_to_sleep})
# time out after 2 seconds using args
func_timeout(2, my_func, args=(time_to_sleep,))
I had a need for nestable timed interrupts (which SIGALRM can't do) that won't get blocked by time.sleep (which the thread-based approach can't do). I ended up copying and lightly modifying code from here: http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
The code itself:
#!/usr/bin/python
# lightly modified version of http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
"""alarm.py: Permits multiple SIGALRM events to be queued.
Uses a `heapq` to store the objects to be called when an alarm signal is
raised, so that the next alarm is always at the top of the heap.
"""
import heapq
import signal
from time import time
__version__ = '$Revision: 2539 $'.split()[1]
alarmlist = []
__new_alarm = lambda t, f, a, k: (t + time(), f, a, k)
__next_alarm = lambda: int(round(alarmlist[0][0] - time())) if alarmlist else None
__set_alarm = lambda: signal.alarm(max(__next_alarm(), 1))
class TimeoutError(Exception):
def __init__(self, message, id_=None):
self.message = message
self.id_ = id_
class Timeout:
''' id_ allows for nested timeouts. '''
def __init__(self, id_=None, seconds=1, error_message='Timeout'):
self.seconds = seconds
self.error_message = error_message
self.id_ = id_
def handle_timeout(self):
raise TimeoutError(self.error_message, self.id_)
def __enter__(self):
self.this_alarm = alarm(self.seconds, self.handle_timeout)
def __exit__(self, type, value, traceback):
try:
cancel(self.this_alarm)
except ValueError:
pass
def __clear_alarm():
"""Clear an existing alarm.
If the alarm signal was set to a callable other than our own, queue the
previous alarm settings.
"""
oldsec = signal.alarm(0)
oldfunc = signal.signal(signal.SIGALRM, __alarm_handler)
if oldsec > 0 and oldfunc != __alarm_handler:
heapq.heappush(alarmlist, (__new_alarm(oldsec, oldfunc, [], {})))
def __alarm_handler(*zargs):
"""Handle an alarm by calling any due heap entries and resetting the alarm.
Note that multiple heap entries might get called, especially if calling an
entry takes a lot of time.
"""
try:
nextt = __next_alarm()
while nextt is not None and nextt <= 0:
(tm, func, args, keys) = heapq.heappop(alarmlist)
func(*args, **keys)
nextt = __next_alarm()
finally:
if alarmlist: __set_alarm()
def alarm(sec, func, *args, **keys):
"""Set an alarm.
When the alarm is raised in `sec` seconds, the handler will call `func`,
passing `args` and `keys`. Return the heap entry (which is just a big
tuple), so that it can be cancelled by calling `cancel()`.
"""
__clear_alarm()
try:
newalarm = __new_alarm(sec, func, args, keys)
heapq.heappush(alarmlist, newalarm)
return newalarm
finally:
__set_alarm()
def cancel(alarm):
"""Cancel an alarm by passing the heap entry returned by `alarm()`.
It is an error to try to cancel an alarm which has already occurred.
"""
__clear_alarm()
try:
alarmlist.remove(alarm)
heapq.heapify(alarmlist)
finally:
if alarmlist: __set_alarm()
and a usage example:
import alarm
from time import sleep
try:
with alarm.Timeout(id_='a', seconds=5):
try:
with alarm.Timeout(id_='b', seconds=2):
sleep(3)
except alarm.TimeoutError as e:
print 'raised', e.id_
sleep(30)
except alarm.TimeoutError as e:
print 'raised', e.id_
else:
print 'nope.'
I faced the same problem, but I needed this to work in a sub-thread, where signal didn't work for me. So I wrote a Python package, timeout-timer, to solve it. It can be used as a context manager or a decorator, and it uses either the signal or the sub-thread module to trigger a timeout interrupt:
import time
from time import sleep
from timeout_timer import timeout, TimeoutInterrupt
class TimeoutInterruptNested(TimeoutInterrupt):
pass
def test_timeout_nested_loop_both_timeout(timer="thread"):
cnt = 0
try:
with timeout(5, timer=timer):
try:
with timeout(2, timer=timer, exception=TimeoutInterruptNested):
sleep(2)
except TimeoutInterruptNested:
cnt += 1
time.sleep(10)
except TimeoutInterrupt:
cnt += 1
assert cnt == 2
see more: https://github.com/dozysun/timeout-timer
Here is a simple example that runs one function with a timeout and also retrieves its return value if it succeeds.
import multiprocessing
import time
ret = {"foo": False}
def worker(queue):
"""worker function"""
ret = queue.get()
time.sleep(1)
ret["foo"] = True
queue.put(ret)
if __name__ == "__main__":
queue = multiprocessing.Queue()
queue.put(ret)
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
p.join(timeout=10)
if p.exitcode is None:
print("The worker timed out.")
else:
print(f"The worker completed and returned: {queue.get()}")
Here is a slight improvement to the given thread-based solution.
The code below supports exceptions:
def runFunctionCatchExceptions(func, *args, **kwargs):
    # Return a tagged result so the caller can re-raise in its own thread
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        return ["exception", exc]
    return ["RESULT", result]
def runFunctionWithTimeout(func, args=(), kwargs={}, timeout_duration=10, default=None):
    import threading
    class InterruptableThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self.result = default
        def run(self):
            self.result = runFunctionCatchExceptions(func, *args, **kwargs)
    it = InterruptableThread()
    it.start()
    it.join(timeout_duration)
    # Still running after the timeout: give up and return the default
    if it.is_alive():
        return default
    if it.result[0] == "exception":
        raise it.result[1]
    return it.result[1]
Invoking it with a 5-second timeout:
result = runFunctionWithTimeout(remote_calculate, (myarg,), timeout_duration=5)
Here is a POSIX version that combines many of the previous answers to handle the following cases:
Subprocesses blocking the execution.
Usage of the timeout function on class member functions.
Strict requirement on time-to-terminate.
Here is the code and some test cases:
import threading
import signal
import os
import time
class TerminateExecution(Exception):
"""
Exception to indicate that execution has exceeded the preset running time.
"""
def quit_function(pid):
# Killing all subprocesses
os.setpgrp()
os.killpg(0, signal.SIGTERM)
# Killing the main thread
os.kill(pid, signal.SIGTERM)
def handle_term(signum, frame):
raise TerminateExecution()
def invoke_with_timeout(timeout, fn, *args, **kwargs):
# Setting a sigterm handler and initiating a timer
old_handler = signal.signal(signal.SIGTERM, handle_term)
timer = threading.Timer(timeout, quit_function, args=[os.getpid()])
terminate = False
# Executing the function
timer.start()
try:
result = fn(*args, **kwargs)
except TerminateExecution:
terminate = True
finally:
# Restoring original handler and cancel timer
signal.signal(signal.SIGTERM, old_handler)
timer.cancel()
if terminate:
raise BaseException("xxx")
return result
### Test cases
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
time.sleep(1)
print('countdown finished')
return 1337
def really_long_function():
time.sleep(10)
def really_long_function2():
os.system("sleep 787")
# Checking that we can run a function as expected.
assert invoke_with_timeout(3, countdown, 1) == 1337
# Testing various scenarios
t1 = time.time()
try:
print(invoke_with_timeout(1, countdown, 3))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
t1 = time.time()
try:
print(invoke_with_timeout(1, really_long_function2))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
t1 = time.time()
try:
print(invoke_with_timeout(1, really_long_function))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
# Checking that classes are referenced and not
# copied (as would be the case with multiprocessing)
class X:
def __init__(self):
self.value = 0
def set(self, v):
self.value = v
x = X()
invoke_with_timeout(2, x.set, 9)
assert x.value == 9
I want to kill the process if the job is not done in time, using both a thread and a process to achieve this.
from concurrent.futures import ThreadPoolExecutor
from time import sleep
import multiprocessing
# test case 1
def worker_1(a,b,c):
for _ in range(2):
print('very time consuming sleep')
sleep(1)
return a+b+c
# test case 2
def worker_2(in_name):
for _ in range(10):
print('very time consuming sleep')
sleep(1)
return 'hello '+in_name
The actual class, usable as a context manager:
class FuncTimer():
def __init__(self,fn,args,runtime):
self.fn = fn
self.args = args
self.queue = multiprocessing.Queue()
self.runtime = runtime
self.process = multiprocessing.Process(target=self.thread_caller)
def thread_caller(self):
with ThreadPoolExecutor() as executor:
future = executor.submit(self.fn, *self.args)
self.queue.put(future.result())
def __enter__(self):
return self
def start_run(self):
self.process.start()
self.process.join(timeout=self.runtime)
if self.process.exitcode is None:
self.process.kill()
if self.process.exitcode is None:
out_res = None
print('killed premature')
else:
out_res = self.queue.get()
return out_res
def __exit__(self, exc_type, exc_value, exc_traceback):
self.process.kill()
How to use it
print('testing case 1')
with FuncTimer(fn=worker_1,args=(1,2,3),runtime = 5) as fp:
res = fp.start_run()
print(res)
print('testing case 2')
with FuncTimer(fn=worker_2,args=('ram',),runtime = 5) as fp:
res = fp.start_run()
print(res)

Better keyboard interrupt detection for this threaded Spinner class

OK, I've written this class based on a bunch of other Spinner classes that I found through Google Code Search.
It's working as intended, but I'm looking for a better way to handle KeyboardInterrupt and SystemExit exceptions. Are there better approaches?
Here's my code:
#!/usr/bin/env python
import itertools
import sys
import threading
class Spinner(threading.Thread):
'''Represent a random work indicator, handled in a separate thread'''
# Spinner glyphs
glyphs = ('|', '/', '-', '\\', '|', '/', '-')
# Output string format
output_format = '%-78s%-2s'
# Message to output while spin
spin_message = ''
# Message to output when done
done_message = ''
# Time between spins
spin_delay = 0.1
def __init__(self, *args, **kwargs):
'''Spinner constructor'''
threading.Thread.__init__(self, *args, **kwargs)
self.daemon = True
self.__started = False
self.__stopped = False
self.__glyphs = itertools.cycle(iter(self.glyphs))
def __call__(self, func, *args, **kwargs):
'''Convenient way to run a routine with a spinner'''
self.init()
skipped = False
try:
return func(*args, **kwargs)
except (KeyboardInterrupt, SystemExit):
skipped = True
finally:
self.stop(skipped)
def init(self):
'''Shows a spinner'''
self.__started = True
self.start()
def run(self):
'''Spins the spinner while do some task'''
while not self.__stopped:
self.spin()
def spin(self):
'''Spins the spinner'''
if not self.__started:
raise NotStarted('You must call init() first before using spin()')
if sys.stdin.isatty():
sys.stdout.write('\r')
sys.stdout.write(self.output_format % (self.spin_message,
self.__glyphs.next()))
sys.stdout.flush()
time.sleep(self.spin_delay)
def stop(self, skipped=None):
'''Stops the spinner'''
if not self.__started:
raise NotStarted('You must call init() first before using stop()')
self.__stopped = True
self.__started = False
if sys.stdin.isatty() and not skipped:
sys.stdout.write('\b%s%s\n' % ('\b' * len(self.done_message),
self.done_message))
sys.stdout.flush()
class NotStarted(Exception):
'''Spinner not started exception'''
pass
if __name__ == '__main__':
import time
# Normal example
spinner1 = Spinner()
spinner1.spin_message = 'Scanning...'
spinner1.done_message = 'DONE'
spinner1.init()
skipped = False
try:
time.sleep(5)
except (KeyboardInterrupt, SystemExit):
skipped = True
finally:
spinner1.stop(skipped)
# Callable example
spinner2 = Spinner()
spinner2.spin_message = 'Scanning...'
spinner2.done_message = 'DONE'
spinner2(time.sleep, 5)
Thank you in advance.
You probably don't need to worry about catching SystemExit as it is raised by sys.exit(). You might want to catch it to clean up some resources just before your program exits.
The other way to catch KeyboardInterrupt is to register a signal handler to catch SIGINT. However for your example using try..except makes more sense, so you're on the right track.
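A minimal sketch of the signal-handler alternative, assuming a hypothetical helper named install_sigint_flag that is not part of the original code. The handler only records that Ctrl-C arrived, so the job can check the flag instead of catching KeyboardInterrupt. Note that signal.signal can only be called from the main thread:

```python
import signal
import threading


def install_sigint_flag():
    """Install a SIGINT handler that sets a flag instead of raising.

    Returns the flag (a threading.Event) and a function that restores
    the previous handler. Must be called from the main thread.
    """
    interrupted = threading.Event()

    def on_sigint(signum, frame):
        interrupted.set()

    previous = signal.signal(signal.SIGINT, on_sigint)

    def restore():
        signal.signal(signal.SIGINT, previous)

    return interrupted, restore


if __name__ == "__main__":
    interrupted, restore = install_sigint_flag()
    try:
        # Stand-in for the real job: poll the flag for about 2 seconds.
        for _ in range(20):
            if interrupted.wait(timeout=0.1):
                print("Ctrl-C received, skipping the done message")
                break
    finally:
        restore()
```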
A few minor suggestions:
Perhaps rename the __call__ method to start, to make it more clear you're starting a job.
You might also want to make the Spinner class reusable by attaching a new thread within the start method, rather than in the constructor.
Also consider what happens when the user hits CTRL-C for the current spinner job -- can the next job be started, or should the app just exit?
You could also make the spin_message the first argument to start to associate it with the task about to be run.
For example, here is how someone might use Spinner:
dbproc = MyDatabaseProc()
spinner = Spinner()
spinner.done_message = 'OK'
try:
spinner.start("Dropping the database", dbproc.drop, "mydb")
spinner.start("Re-creating the database", dbproc.create, "mydb")
spinner.start("Inserting data into tables", dbproc.populate)
...
except (KeyboardInterrupt, SystemExit):
# stop the currently executing job
spinner.stop()
# do some cleanup if needed..
dbproc.cleanup()
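One way the suggestions above could fit together is sketched below: a fresh thread is created inside start(), so a single object can run several jobs in a row. The class name ReusableSpinner and its attributes are hypothetical and simplified compared to the Spinner in the question:

```python
import itertools
import sys
import threading
import time


class ReusableSpinner:
    """Hypothetical variant: one object, a new thread for every job."""
    glyphs = '|/-\\'
    spin_delay = 0.1

    def __init__(self, out=None):
        self.out = out if out is not None else sys.stdout
        self.done_message = 'DONE'
        self._stop_event = threading.Event()
        self._thread = None
        self._message = ''

    def _spin(self):
        for glyph in itertools.cycle(self.glyphs):
            if self._stop_event.is_set():
                break
            self.out.write('\r%s %s' % (self._message, glyph))
            self.out.flush()
            self._stop_event.wait(self.spin_delay)

    def start(self, message, func, *args, **kwargs):
        """Run func(*args, **kwargs) while spinning; return its result."""
        self._message = message
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._spin, daemon=True)
        self._thread.start()
        try:
            result = func(*args, **kwargs)
        except BaseException:
            self.stop()  # interrupted or failed: no done message
            raise
        self.stop(self.done_message)
        return result

    def stop(self, final=''):
        """Stop the current spinner thread; safe to call more than once."""
        if self._thread is None:
            return
        self._stop_event.set()
        self._thread.join()
        self._thread = None
        self.out.write('\r%s %s\n' % (self._message, final))
        self.out.flush()


if __name__ == "__main__":
    spinner = ReusableSpinner()
    spinner.start("Sleeping", time.sleep, 0.5)
    spinner.start("Sleeping again", time.sleep, 0.5)
```

Because start() re-raises KeyboardInterrupt after stopping the spinner, the caller decides whether the next job still runs or the application exits.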
