In Python, for a toy example:
for x in range(0, 3):
    # Call function A(x)
I want to continue the for loop if function A takes more than five seconds by skipping it so I won't get stuck or waste time.
By doing some search, I realized a subprocess or thread may help, but I have no idea how to implement it here.
I think creating a new process may be overkill. If you're on a Mac or another Unix-based system, you should be able to use signal.SIGALRM to forcibly time out functions that take too long. This works even on functions that are blocked on the network or on other issues that you absolutely can't handle by modifying your function. I have an example of using it in this answer:
Option for SSH to timeout after a short time? ClientAlive & ConnectTimeout don't seem to do what I need them to do
Editing my answer in here, though I'm not sure I'm supposed to do that:
import signal

class TimeoutException(Exception):   # Custom exception class
    pass

def timeout_handler(signum, frame):  # Custom signal handler
    raise TimeoutException

# Change the behavior of SIGALRM
signal.signal(signal.SIGALRM, timeout_handler)

for i in range(3):
    # Start the timer. Once 5 seconds are over, a SIGALRM signal is sent.
    signal.alarm(5)
    # This try/except block ensures that
    # you'll catch TimeoutException when it's sent.
    try:
        A(i)  # Whatever your function that might hang
    except TimeoutException:
        continue  # continue the for loop if function A takes more than 5 seconds
    else:
        # Reset the alarm
        signal.alarm(0)
This basically sets a timer for 5 seconds, then tries to execute your code. If it fails to complete before time runs out, a SIGALRM is sent, which we catch and turn into a TimeoutException. That forces you to the except block, where your program can continue.
Maybe someone will find this decorator useful, based on TheSoundDefense's answer:
import time
import signal

class TimeoutException(Exception):   # Custom exception class
    pass

def break_after(seconds=2):
    def timeout_handler(signum, frame):   # Custom signal handler
        raise TimeoutException
    def decorator(function):
        def wrapper(*args, **kwargs):
            signal.signal(signal.SIGALRM, timeout_handler)
            signal.alarm(seconds)
            try:
                res = function(*args, **kwargs)
                signal.alarm(0)      # Clear alarm
                return res
            except TimeoutException:
                print('Oops, timeout: %s sec reached.' % seconds, function.__name__, args, kwargs)
                return
        return wrapper
    return decorator
Test:
@break_after(3)
def test(a, b, c):
    return time.sleep(10)
>>> test(1,2,3)
Oops, timeout: 3 sec reached. test (1, 2, 3) {}
If you can break your work up and check every so often, that's almost always the best solution. But sometimes that's not possible—e.g., maybe you're reading a file off a slow file share that every once in a while just hangs for 30 seconds. To deal with that internally, you'd have to restructure your whole program around an async I/O loop.
If you don't need to be cross-platform, you can use signals on *nix (including Mac and Linux), APCs on Windows, etc. But if you need to be cross-platform, that doesn't work.
So, if you really need to do it concurrently, you can, and sometimes you have to. In that case, you probably want to use a process for this, not a thread. You can't really kill a thread safely, but you can kill a process, and it can be as safe as you want it to be. Also, if the thread is taking 5+ seconds because it's CPU-bound, you don't want to fight with it over the GIL.
There are two basic options here.
First, you can put the code in another script and run it with subprocess:
subprocess.check_call([sys.executable, 'other_script.py', arg, other_arg],
                      timeout=5)
Since this is going through normal child-process channels, the only communication you can use is some argv strings, a success/failure return value (actually a small integer, but that's not much better), and optionally a hunk of text going in and a chunk of text coming out.
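For instance, here's a rough sketch of that text-in/text-out pattern using subprocess.run with a timeout (other_script.py is the hypothetical child script from above; the argument and input text are made up for illustration):
import subprocess
import sys

try:
    # Send a chunk of text to the child on stdin, collect its stdout,
    # and give up if it takes more than 5 seconds.
    proc = subprocess.run(
        [sys.executable, 'other_script.py', 'some_arg'],
        input='text for the child to process\n',
        capture_output=True,
        text=True,
        timeout=5,
    )
    print('child returned:', proc.returncode)
    print('child said:', proc.stdout)
except subprocess.TimeoutExpired:
    print('child took too long; skipping it')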
Alternatively, you can use multiprocessing to spawn a thread-like child process:
p = multiprocessing.Process(target=func, args=args)
p.start()
p.join(5)
if p.is_alive():
    p.terminate()
As you can see, this is a little more complicated, but it's better in a few ways:
You can pass arbitrary Python objects (at least anything that can be pickled) rather than just strings.
Instead of having to put the target code in a completely independent script, you can leave it as a function in the same script.
It's more flexible—e.g., if you later need to, say, pass progress updates, it's very easy to add a queue in either or both directions.
The big problem with any kind of parallelism is sharing mutable data—e.g., having a background task update a global dictionary as part of its work (which your comments say you're trying to do). With threads, you can sort of get away with it, but race conditions can lead to corrupted data, so you have to be very careful with locking. With child processes, you can't get away with it at all. (Yes, you can use shared memory, as Sharing state between processes explains, but this is limited to simple types like numbers, fixed arrays, and types you know how to define as C structures, and it just gets you back to the same problems as threads.)
Ideally, you arrange things so you don't need to share any data while the process is running—you pass in a dict as a parameter and get a dict back as a result. This is usually pretty easy to arrange when you have a previously-synchronous function that you want to put in the background.
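A minimal sketch of that dict-in/dict-out arrangement, assuming a made-up worker called slow_work (not from the original question):
import multiprocessing

def slow_work(input_dict):
    # Hypothetical worker: receives a (pickled) dict, returns a new dict.
    return {key: value * 2 for key, value in input_dict.items()}

if __name__ == '__main__':
    with multiprocessing.Pool(1) as pool:
        async_result = pool.apply_async(slow_work, ({'a': 1, 'b': 2},))
        try:
            result_dict = async_result.get(timeout=5)  # the dict comes back as the return value
            print(result_dict)
        except multiprocessing.TimeoutError:
            print('gave up after 5 seconds; no shared state to clean up')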
But what if, say, a partial result is better than no result? In that case, the simplest solution is to pass the results over a queue. You can do this with an explicit queue, as explained in Exchanging objects between processes, but there's an easier way.
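For reference, a bare-bones sketch of the explicit-queue version (the worker produce_results and its fake items are invented for illustration; the Executor version below is usually nicer):
import multiprocessing
import queue
import time

def produce_results(q):
    # Invented worker: pushes (key, count) pairs onto the queue as it goes.
    for key in ('spam', 'eggs', 'bacon'):
        time.sleep(2)       # pretend each item is slow
        q.put((key, 1))
    q.put(None)             # sentinel: no more results

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=produce_results, args=(q,))
    p.start()
    d = {}
    deadline = time.monotonic() + 5
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                           # out of time: keep what we have
        try:
            item = q.get(timeout=remaining)
        except queue.Empty:
            break                           # deadline hit while waiting
        if item is None:
            break                           # worker finished early
        key, count = item
        d[key] = d.get(key, 0) + count
    p.terminate()                           # abandon whatever is left
    p.join()
    print(d)                                # partial results only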
If you can break the monolithic process into separate tasks, one for each value (or group of values) you wanted to stick in the dictionary, you can schedule them on a Pool—or, even better, a concurrent.futures.Executor. (If you're on Python 2.x or 3.1, see the backport futures on PyPI.)
Let's say your slow function looked like this:
def spam():
    global d
    for meat in get_all_meats():
        count = get_meat_count(meat)
        d[meat] = d.get(meat, 0) + count
Instead, you'd do this:
def spam_one(meat):
    count = get_meat_count(meat)
    return meat, count

with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    results = executor.map(spam_one, get_all_meats(), timeout=5)
    for (meat, count) in results:
        d[meat] = d.get(meat, 0) + count
As many results as you get within 5 seconds get added to the dict; if that isn't all of them, the rest are abandoned, and a TimeoutError is raised (which you can handle however you want—log it, do some quick fallback code, whatever).
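As a sketch of what handling that TimeoutError might look like (reusing the toy spam_one and get_all_meats names from above):
import concurrent.futures

d = {}  # or your existing global dict
with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    results = executor.map(spam_one, get_all_meats(), timeout=5)
    try:
        for meat, count in results:
            d[meat] = d.get(meat, 0) + count
    except concurrent.futures.TimeoutError:
        # Whatever arrived before the deadline is already in d.
        # Note: leaving the with block still waits for tasks that have
        # already started, unless you shut the executor down without waiting.
        print('timed out; keeping the partial results in d')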
And if the tasks really are independent (as they are in my stupid little example, but of course they may not be in your real code, at least not without a major redesign), you can parallelize the work for free just by removing that max_workers=1. Then, if you run it on an 8-core machine, it'll kick off 8 workers and give them each 1/8th of the work to do, and things will get done faster. (Usually not 8x as fast, but often 3-6x as fast, which is still pretty nice.)
This seems like a better idea (sorry, I am not sure of the Python names of things yet):
import signal

def signal_handler(signum, frame):
    raise Exception("Timeout!")

signal.signal(signal.SIGALRM, signal_handler)
signal.alarm(3)   # Three seconds
try:
    for x in range(0, 3):
        A(x)  # Call function A(x)
except Exception as msg:
    print("Timeout!")
signal.alarm(0)   # Reset
The comments are correct in that you should check inside the function itself. Here is a potential solution. Note that an asynchronous function (using a thread, for example) is different from this solution. This is synchronous, which means it will still run in series.
import time

def someFunction():
    start = time.time()
    while time.time() - start < 5:
        # do a piece of your normal work here, then loop back and
        # re-check how much time has passed
        pass
    return  # once 5 seconds have elapsed, give up and return

for x in range(0, 3):
    someFunction()
Related
I'm developing a standard Python script file (no servers, no async, no multiprocessing, ...), i.e. a classic data science program where I load data, process it as dataframes, and so on. Everything is synchronous.
At some point, I need to call a function from an external library which is completely opaque to me (I have no control over it, and I don't know how it does what it does), like
def inside_my_function(...):
    # My code
    result = the_function(params)
    # Other code
Now, this the_function sometimes never terminates (I don't know why; probably there are bugs or some condition that makes it get stuck, but it's completely random), and when that happens my program gets stuck as well.
Since I have to use it and it cannot be modified, I would like to know if there is a way, for example, to wrap it in another function which calls the_function and waits for some timeout; if the_function returns before the timeout, the result is returned, otherwise the_function is somehow killed, aborted, skipped, whatever, and retried up to n times.
I realise that in order to execute the_function and check for the timeout at the same time, something like multithreading will be needed, but I'm not sure it makes sense or how to implement it correctly without resorting to bad practices.
How would you proceed?
EDIT: I would avoid multiprocessing because of the large overhead and because I don't want to overcomplicate things with serialization and so on.
Thank you
import time
import random
import threading

def func_that_waits():
    t1 = time.time()
    while (time.time() - t1) <= 3:
        time.sleep(1)
        if check_unreliable_func.worked:
            break
    if not check_unreliable_func.worked:
        print("unreliable function has been working for too long, it's killed.")

def check_unreliable_func(func):
    check_unreliable_func.worked = False
    def inner(*args, **kwargs):
        func(*args, **kwargs)
        check_unreliable_func.worked = True
    return inner

def unreliable_func():
    working_time = random.randint(1, 6)
    time.sleep(working_time)
    print(f"unreliable_func has been working for {working_time} seconds")

to_wait = threading.Thread(target=func_that_waits)
main_func = threading.Thread(target=check_unreliable_func(unreliable_func), daemon=True)

main_func.start()
to_wait.start()
unreliable_func - the function we don't know whether it will finish in time.
check_unreliable_func(func) - a decorator whose only purpose is to let the to_wait thread know that unreliable_func has returned something, so there is no point in to_wait running any further.
The main thing to understand is that main_func is a daemon thread, so when the to_wait thread (the last non-daemon thread) finishes, the program exits and all daemon threads are terminated automatically, no matter what they were doing at that moment.
Of course it's really far from best practice; I'm just showing how it can be done. As for how it should be done - I would be glad to see that myself.
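For what it's worth, here is one hedged sketch of the kind of wrapper the question asks for, built on concurrent.futures instead of raw threads (call_with_timeout is an invented name). Note it still can't actually kill the stuck call; each timed-out attempt merely abandons its worker thread, which may keep running in the background:
import concurrent.futures

def call_with_timeout(func, *args, timeout=5, retries=3, **kwargs):
    """Try func(*args, **kwargs) up to `retries` times, giving up on each
    attempt after `timeout` seconds. The hung call is NOT killed, only
    abandoned, so this is only suitable if a leaked thread is acceptable."""
    for attempt in range(1, retries + 1):
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = executor.submit(func, *args, **kwargs)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            print(f'attempt {attempt} timed out after {timeout} s, retrying...')
        finally:
            executor.shutdown(wait=False)   # don't block on a stuck worker
    raise TimeoutError(f'{func!r} did not finish within {timeout} s in {retries} attempts')

# Usage against the question's external call:
# result = call_with_timeout(the_function, params, timeout=5, retries=3)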
import threading

threads = []
for n in range(0, 60000):
    t = threading.Thread(target=function, args=(x, n))
    t.start()
    threads.append(t)

for t in threads:
    t.join()
It works well for a range of up to 800 on my laptop, but if I increase the range to more than 800 I get the error that it can't create a new thread.
How can I control the number of threads that get created, or make this work some other way, like with a timeout? I tried using threading.BoundedSemaphore, but that doesn't seem to work properly.
The problem is that no major platform (as of mid-2013) will let you create anywhere near this number of threads. There are a wide variety of different limitations you could run into, and without knowing your platform, its configuration, and the exact error you got, it's impossible to know which one you ran into. But here are two examples:
On 32-bit Windows, the default thread stack is 1MB, and all of your thread stacks have to fit into the same 2GB of virtual memory space as everything else in your program, so you will run out long before 60000.
On 64-bit Linux, you will likely exhaust one of your session's soft ulimit values before you get anywhere near running out of page space. (Linux has a variety of different limits beyond the ones required by POSIX.)
So, how can I control the number of threads that get created, or make it work some other way, like with a timeout or whatever?
Using as many threads as possible is very unlikely to be what you actually want to do. Running 800 threads on an 8-core machine means that you're spending a whole lot of time context-switching between the threads, and the cache keeps getting flushed before it ever gets primed, and so on.
Most likely, what you really want is one of the following:
One thread per CPU, serving a pool of 60000 tasks.
Maybe processes instead of threads (if the primary work is in Python, or in C code that doesn't explicitly release the GIL).
Maybe a fixed number of threads (e.g., a web browser may do, say, 12 concurrent requests at a time, whether you have 1 core or 64).
Maybe a pool of, say, 600 batches of 100 tasks apiece, instead of 60000 single tasks.
60000 cooperatively-scheduled fibers/greenlets/microthreads all sharing one real thread.
Maybe explicit coroutines instead of a scheduler.
Or "magic" cooperative greenlets via, e.g. gevent.
Maybe one thread per CPU, each running 1/Nth of the fibers.
But it's certainly possible.
Once you've hit whichever limit you're hitting, it's very likely that trying again will fail until a thread has finished its job and been joined, and it's pretty likely that trying again will succeed after that happens. So, given that you're apparently getting an exception, you could handle this the same way as anything else in Python: with a try/except block. For example, something like this:
threads = []
for n in range(0, 60000):
    while True:
        t = threading.Thread(target=function, args=(x, n))
        try:
            t.start()
            threads.append(t)
        except WhateverTheExceptionIs as e:
            if threads:
                threads[0].join()
                del threads[0]
            else:
                raise
        else:
            break
for t in threads:
    t.join()
Of course this assumes that the first task launched is likely to be the one of the first tasks finished. If this is not true, you'll need some way to explicitly signal doneness (condition, semaphore, queue, etc.), or you'll need to use some lower-level (platform-specific) library that gives you a way to wait on a whole list until at least one thread is finished.
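Since the question mentions threading.BoundedSemaphore: a rough sketch of using a semaphore to cap how many threads are alive at once (function and x are the names from the question; the cap of 100 is arbitrary) would look something like this:
import threading

limit = threading.BoundedSemaphore(100)   # allow at most 100 live threads

def limited(x, n):
    try:
        function(x, n)                    # the real work from the question
    finally:
        limit.release()                   # free the slot even if it raises

threads = []
for n in range(60000):
    limit.acquire()                       # blocks until a slot is free
    t = threading.Thread(target=limited, args=(x, n))
    t.start()
    threads.append(t)

for t in threads:
    t.join()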
Also, note that on some platforms (e.g., Windows XP), you can get bizarre behavior just getting near the limits.
On top of being a lot better, doing the right thing will probably be a lot simpler as well. For example, here's a process-per-CPU pool:
with concurrent.futures.ProcessPoolExecutor() as executor:
    fs = [executor.submit(function, x, n) for n in range(60000)]
    concurrent.futures.wait(fs)
… and a fixed-thread-count pool:
with concurrent.futures.ThreadPoolExecutor(12) as executor:
    fs = [executor.submit(function, x, n) for n in range(60000)]
    concurrent.futures.wait(fs)
… and a balancing-CPU-parallelism-with-numpy-vectorization batching pool:
with concurrent.futures.ThreadPoolExecutor() as executor:
    batchsize = 60000 // os.cpu_count()
    fs = [executor.submit(np.vector_function, x,
                          np.arange(n, min(n+batchsize, 60000)))
          for n in range(0, 60000, batchsize)]
    concurrent.futures.wait(fs)
In the examples above, I used a list comprehension to submit all of the jobs and gather their futures, because we're not doing anything else inside the loop. But from your comments, it sounds like you do have other stuff you want to do inside the loop. So, let's convert it back into an explicit for statement:
with concurrent.futures.ProcessPoolExecutor() as executor:
    fs = []
    for n in range(60000):
        fs.append(executor.submit(function, x, n))
    concurrent.futures.wait(fs)
And now, whatever you want to add inside that loop, you can.
However, I don't think you actually want to add anything inside that loop. The loop just submits all the jobs as fast as possible; it's the wait function that sits around waiting for them all to finish, and it's probably there that you want to exit early.
To do this, you can use wait with the FIRST_COMPLETED flag, but it's much simpler to use as_completed.
Also, I'm assuming error is some kind of value that gets set by the tasks. In that case, you will need to put a Lock around it, as with any other mutable value shared between threads. (This is one place where there's slightly more than a one-line difference between a ProcessPoolExecutor and a ThreadPoolExecutor—if you use processes, you need multiprocessing.Lock instead of threading.Lock.)
So:
error_lock = threading.Lock()
error = []

def function(x, n):
    # blah blah
    try:
        # blah blah
        pass
    except Exception as e:
        with error_lock:
            error.append(e)
    # blah blah

with concurrent.futures.ProcessPoolExecutor() as executor:
    fs = [executor.submit(function, x, n) for n in range(60000)]
    for f in concurrent.futures.as_completed(fs):
        do_something_with(f.result())
        with error_lock:
            if len(error) > 1: exit()
However, you might want to consider a different design. In general, if you can avoid sharing between threads, your life gets a lot easier. And futures are designed to make that easy, by letting you return a value or raise an exception, just like a regular function call. That f.result() will give you the returned value or raise the raised exception. So, you can rewrite that code as:
def function(x, n):
    # blah blah
    # don't bother to catch exceptions here, let them propagate out
    pass

with concurrent.futures.ProcessPoolExecutor() as executor:
    fs = [executor.submit(function, x, n) for n in range(60000)]
    error = []
    for f in concurrent.futures.as_completed(fs):
        try:
            result = f.result()
        except Exception as e:
            error.append(e)
            if len(error) > 1: exit()
        else:
            do_something_with(result)
Notice how similar this looks to the ThreadPoolExecutor Example in the docs. This simple pattern is enough to handle almost anything without locks, as long as the tasks don't need to interact with each other.
What's the best way to kill a function (that is still running) after a given amount of time in Python? These are two approaches I have found so far:
Say this is our base function:
import time

def foo():
    a_long_time = 10000000
    time.sleep(a_long_time)

TIMEOUT = 5  # seconds
1. Multiprocessing Approach
import multiprocessing

if __name__ == '__main__':
    p = multiprocessing.Process(target=foo, name="Foo")
    p.start()
    p.join(TIMEOUT)
    if p.is_alive():
        print('function terminated')
        p.terminate()
        p.join()
2. Signal Approach
import signal

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException

signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(TIMEOUT)

try:
    foo()
except TimeoutException:
    print('function terminated')
What are the advantages and disadvantages in terms of scope, safety and usability of these two methods? Are there any better approaches?
Well, as always, it depends.
As you have probably already verified, both of these methods work. I would say it depends on your application and on a correct implementation (your signalling method is a bit wrong...).
Both methods can be considered "safe" if implemented correctly. It depends on whether your main program outside the foo function needs to do something, or whether it can just sit and wait for foo to either complete or time out. The signalling method does not allow any parallel processing, as your main program will be stuck in foo() until it either completes or times out. BUT you then need to defuse the signal. If your foo completes in one second, your main program leaves the try/except structure, and four seconds later ... kaboom ... an exception is raised and probably uncaught. Not good.
try:
    foo()
    signal.alarm(0)
except TimeoutException:
    print("function terminated")
solves the problem.
I would personally prefer the multiprocessing approach. It is simpler and does not require signals and exception handling that in theory can go wrong if your program execution is not where you expect it to be when a signal is raised. If it is ok for your program to wait in join(), then you are done. However, if you want to do something in the main process while you wait, you can enter a loop, track time in a variable, check if over timeout and if so, terminate the process. You would just use join with a tiny timeout to "peek" if the process is still running.
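A quick sketch of that peek-while-doing-other-work loop, reusing foo and TIMEOUT from the question (do_other_main_work is a made-up placeholder for whatever the main process needs to do):
import time
import multiprocessing

if __name__ == '__main__':
    p = multiprocessing.Process(target=foo, name="Foo")
    p.start()
    deadline = time.monotonic() + TIMEOUT
    while p.is_alive() and time.monotonic() < deadline:
        do_other_main_work()   # made-up: other work for the main process
        p.join(0.1)            # "peek": wait at most 0.1 s for the child
    if p.is_alive():
        p.terminate()          # over the timeout and still running: kill it
        p.join()
        print('function terminated')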
Another method, depending on your foo(), is to use threads with a class or a global variable. If your foo keeps processing commands instead of possibly waiting for a long time for a command to finish, you can add an if clause there:
def foo():
    global please_exit_now
    while True:
        do_stuff
        do_more_stuff
        if foo_is_ready:
            break
        if please_exit_now is True:
            please_exit_now = False
            return
    finalise_foo
    return
If do_stuff and do_more_stuff complete in a reasonable amount of time, you could then process things in your main program and just set the global please_exit_now to True, and your thread would eventually notice that and exit.
I would probably just go for your multiprocessing and join, though.
Hannu
I looked online and found some SO discussions and ActiveState recipes for running some code with a timeout. It looks like there are some common approaches:
Use a thread that runs the code, and join it with a timeout. If the timeout elapses, kill the thread. This is not directly supported in Python (it used the private _Thread__stop function), so it's bad practice.
Use signal.SIGALRM - but this approach does not work on Windows!
Use a subprocess with a timeout - but this is too heavy; what if I want to start an interruptible task often? I don't want to fire up a process for each one!
So, what is the right way? I'm not asking about workarounds (e.g. use Twisted and async IO), but the actual way to solve the actual problem: I have some function and I want to run it only with some timeout. If the timeout elapses, I want control back. And I want it to work on Linux and Windows.
A completely general solution to this really, honestly does not exist. You have to use the right solution for a given domain.
If you want timeouts for code you fully control, you have to write it to cooperate. Such code has to be able to break up into little chunks in some way, as in an event-driven system. You can also do this by threading if you can ensure nothing will hold a lock too long, but handling locks right is actually pretty hard.
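A tiny sketch of what that cooperation looks like in practice: break the work into chunks and check a deadline between them (process_chunk is a made-up per-chunk worker):
import time

def run_cooperatively(chunks, timeout):
    """Process `chunks` one at a time and stop once `timeout` seconds
    have passed. Only works for code you control that can be split up."""
    deadline = time.monotonic() + timeout
    results = []
    for chunk in chunks:
        if time.monotonic() > deadline:
            break                              # hand control back to the caller
        results.append(process_chunk(chunk))   # made-up per-chunk worker
    return results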
If you want timeouts because you're afraid code is out of control (for example, if you're afraid the user will ask your calculator to compute 9**(9**9)), you need to run it in another process. This is the only easy way to sufficiently isolate it. Running it in your event system or even a different thread will not be enough. It is also possible to break things up into little chunks similar to the other solution, but requires very careful handling and usually isn't worth it; in any event, that doesn't allow you to do the same exact thing as just running the Python code.
What you might be looking for is the multiprocessing module. If subprocess is too heavy, then this may not suit your needs either.
import time
import multiprocessing

def do_this_other_thing_that_may_take_too_long(duration):
    time.sleep(duration)
    return 'done after sleeping {0} seconds.'.format(duration)

pool = multiprocessing.Pool(1)
print('starting....')
res = pool.apply_async(do_this_other_thing_that_may_take_too_long, [8])
for timeout in range(1, 10):
    try:
        print('{0}: {1}'.format(timeout, res.get(timeout)))
    except multiprocessing.TimeoutError:
        print('{0}: timed out'.format(timeout))
print('end')
If it's network related you could try:
import socket
socket.setdefaulttimeout(number)
I found this with eventlet library:
http://eventlet.net/doc/modules/timeout.html
from eventlet.timeout import Timeout

timeout = Timeout(seconds, exception)
try:
    ...  # execution here is limited by timeout
finally:
    timeout.cancel()
For "normal" Python code, that doesn't linger prolongued times in C extensions or I/O waits, you can achieve your goal by setting a trace function with sys.settrace() that aborts the running code when the timeout is reached.
Whether that is sufficient or not depends on how co-operating or malicious the code you run is. If it's well-behaved, a tracing function is sufficient.
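For example, here is a rough sketch of that trace-function approach (my own illustration, not from the answer): the trace function raises once the deadline has passed, which aborts the traced Python code, but it cannot interrupt a call stuck inside a C extension or a blocking system call.
import sys
import time

def run_with_trace_timeout(func, timeout):
    deadline = time.monotonic() + timeout

    def tracer(frame, event, arg):
        if time.monotonic() > deadline:
            raise TimeoutError('timed out')   # propagates into the traced code
        return tracer                         # keep tracing line events too

    sys.settrace(tracer)                      # note: tracing slows the code down
    try:
        return func()
    finally:
        sys.settrace(None)                    # always remove the trace function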
Another way is to use faulthandler:
import time
import faulthandler

faulthandler.enable()
try:
    faulthandler.dump_tracebacks_later(3)
    time.sleep(10)
finally:
    faulthandler.cancel_dump_tracebacks_later()
N.B.: The faulthandler module is part of the stdlib in Python 3.3 (where these functions are named dump_traceback_later and cancel_dump_traceback_later).
If you're running code that you expect to die after a set time, then you should write it properly so that there aren't any negative effects on shutdown, no matter if it's a thread or a subprocess. A command pattern with undo would be useful here.
So, it really depends on what the thread is doing when you kill it. If it's just crunching numbers, who cares if you kill it? If it's interacting with the filesystem and you kill it, then maybe you should really rethink your strategy.
What is supported in Python when it comes to threads? Daemon threads and joins. Why does Python let the main thread exit if you've joined a daemon while it's still active? Because it's understood that someone using daemon threads will (hopefully) write the code in a way that it won't matter when that thread dies. Giving a timeout to a join and then letting main die, and thus taking any daemon threads with it, is perfectly acceptable in this context.
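A minimal sketch of that daemon-thread-plus-join-timeout idea (background_work is just a stand-in for a task you're willing to abandon):
import threading
import time

def background_work():       # stand-in for work we're prepared to abandon
    time.sleep(100)

t = threading.Thread(target=background_work, daemon=True)
t.start()
t.join(5)                    # wait at most 5 seconds for it
if t.is_alive():
    print('still running; letting main exit will take the daemon down with it')
# When the last non-daemon thread ends, daemon threads are killed abruptly,
# with no chance to clean up - so only do this for work where that's acceptable.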
I've solved it this way:
For me it worked great (on Windows, and not heavy at all). I hope it will be useful for someone.
import threading
import time

class LongFunctionInside(object):
    lock_state = threading.Lock()
    working = False

    def long_function(self, timeout):
        self.working = True
        timeout_work = threading.Thread(name="thread_name", target=self.work_time, args=(timeout,))
        timeout_work.setDaemon(True)
        timeout_work.start()
        while True:  # endless/long work
            time.sleep(0.1)  # at this rate the CPU is almost not used
            if not self.working:  # keep working until the timer thread flips this to False
                break
        self.set_state(True)

    def work_time(self, sleep_time):
        # thread function that just sleeps for the specified time;
        # on waking up, if the long function is still working, it sets the
        # lock-protected variable working to False
        time.sleep(sleep_time)
        if self.working:
            self.set_state(False)

    def set_state(self, state):  # lock-protected state change
        while True:
            self.lock_state.acquire()
            try:
                self.working = state
                break
            finally:
                self.lock_state.release()

lw = LongFunctionInside()
lw.long_function(10)
The main idea is to create a thread that just sleeps in parallel with the "long work" and, on waking up (after the timeout), changes the lock-protected state variable; the long function checks that variable during its work.
I'm pretty new to Python programming, so if that solution has fundamental errors (resource, timing, or deadlock problems), please respond.
Solving it with the 'with' construct and merging the solution from
Timeout function if it takes too long to finish
and this thread, which works better:
import threading, time

class Exception_TIMEOUT(Exception):
    pass

class linwintimeout:

    def __init__(self, f, seconds=1.0, error_message='Timeout'):
        self.seconds = seconds
        self.thread = threading.Thread(target=f)
        self.thread.daemon = True
        self.error_message = error_message

    def handle_timeout(self):
        raise Exception_TIMEOUT(self.error_message)

    def __enter__(self):
        try:
            self.thread.start()
            self.thread.join(self.seconds)
        except Exception as te:
            raise te

    def __exit__(self, type, value, traceback):
        if self.thread.is_alive():
            return self.handle_timeout()

def function():
    while True:
        print("keep printing ...")
        time.sleep(1)

try:
    with linwintimeout(function, seconds=5.0, error_message='exceeded timeout of %s seconds' % 5.0):
        pass
except Exception_TIMEOUT as e:
    print(" attention !! exceeded timeout, giving up ... %s " % e)