I am trying to get timeouts to work in Python 3.2 using the concurrent.futures module. However, when a timeout occurs, it doesn't really stop the execution. I tried both the thread and process pool executors; neither of them stops the task, and the timeout is only raised once the task has finished. So does anyone know if it's possible to get this working?
import concurrent.futures
import time
import datetime

max_numbers = [10000000, 10000000, 10000000, 10000000, 10000000]

def run_loop(max_number):
    print("Started:", datetime.datetime.now(), max_number)
    last_number = 0
    for i in range(1, max_number + 1):
        last_number = i * i
    return last_number

def main():
    with concurrent.futures.ProcessPoolExecutor(max_workers=len(max_numbers)) as executor:
        try:
            for future in concurrent.futures.as_completed(executor.map(run_loop, max_numbers, timeout=1), timeout=1):
                print(future.result(timeout=1))
        except concurrent.futures.TimeoutError:
            print("This took too long...")

if __name__ == '__main__':
    main()
As far as I can tell, TimeoutError is actually raised when you would expect it, and not after the task is finished.
However, your program itself will keep on running until all running tasks have completed. This is because currently executing tasks (in your case probably all of your submitted tasks, as your pool size equals the number of tasks) are not actually "killed".
The TimeoutError is raised so that you can choose not to wait until the task is finished (and do something else instead), but the task will keep on running until completed. And Python will not exit as long as there are unfinished tasks in the threads/subprocesses of your Executor.
As far as I know, it is not possible to just "stop" currently executing Futures; you can only "cancel" scheduled tasks that have yet to be started. In your case there won't be any, but imagine that you have a pool of 5 threads/processes and you want to process 100 items. At some point, there might be 20 completed tasks, 5 running tasks, and 75 tasks scheduled. In this case, you would be able to cancel those 75 scheduled tasks, but the 5 that are running will continue until completed, whether you wait for the result or not.
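To see this for yourself, here is a minimal sketch (slow_task is a made-up stand-in for any long-running function) showing that cancel() succeeds only for futures that have not started yet:

import concurrent.futures
import time

def slow_task(n):
    time.sleep(2)
    return n

with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    running = executor.submit(slow_task, 1)  # picked up immediately by the only worker
    queued = executor.submit(slow_task, 2)   # has to wait for a free worker
    time.sleep(0.1)                          # give the first task time to start
    print(running.cancel())  # False: already running, cannot be stopped
    print(queued.cancel())   # True: still scheduled, cancelled successfully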
Even though it cannot be done that way, I guess there should be ways to achieve your desired end result. Maybe this version can help you on the way (not sure if it does exactly what you wanted, but it might be of some use):
import concurrent.futures
import time
import datetime

max_numbers = [10000000, 10000000, 10000000, 10000000, 10000000]

class Task:
    def __init__(self, max_number):
        self.max_number = max_number
        self.interrupt_requested = False

    def __call__(self):
        print("Started:", datetime.datetime.now(), self.max_number)
        last_number = 0
        for i in range(1, self.max_number + 1):
            if self.interrupt_requested:
                print("Interrupted at", i)
                break
            last_number = i * i
        print("Reached the end")
        return last_number

    def interrupt(self):
        self.interrupt_requested = True

def main():
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(max_numbers)) as executor:
        tasks = [Task(num) for num in max_numbers]
        for task, future in [(i, executor.submit(i)) for i in tasks]:
            try:
                print(future.result(timeout=1))
            except concurrent.futures.TimeoutError:
                print("this took too long...")
                task.interrupt()

if __name__ == '__main__':
    main()
By creating a callable object for each "task" and giving those to the executor instead of just a plain function, you can provide a way to "interrupt" the task. (Note that this only works with a ThreadPoolExecutor, where the tasks share memory with the main thread; a separate process would only see a copy of the flag.)
Tip: remove the task.interrupt() line and see what happens; it may make it easier to understand my long explanation above ;-)
Recently I also hit this issue and finally came up with the following solution using ProcessPoolExecutor:
def main():
    with concurrent.futures.ProcessPoolExecutor(max_workers=len(max_numbers)) as executor:
        try:
            for future in concurrent.futures.as_completed(executor.map(run_loop, max_numbers, timeout=1), timeout=1):
                print(future.result(timeout=1))
        except concurrent.futures.TimeoutError:
            print("This took too long...")
            stop_process_pool(executor)

def stop_process_pool(executor):
    # _processes is a private attribute mapping pid -> process object
    for pid, process in executor._processes.items():
        process.terminate()
    executor.shutdown()
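Keep in mind that _processes is a private attribute of ProcessPoolExecutor (hence the leading underscore), so this may break between Python versions, and terminate() kills the workers without any cleanup, so partially completed work is lost.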
I have a Python script which calls a series of sub-processes. They need to run "forever", but they occasionally die or get killed. When this happens, I need to restart the process using the same arguments as the one which died.
This is a very simplified version:
[edit: this is the less simplified version, which includes "restart" code]
import multiprocessing
import time
import random

def printNumber(number):
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

if __name__ == '__main__':
    children = []  # list
    args = {}      # dictionary
    for processNumber in range(10, 15):
        p = multiprocessing.Process(
            target=printNumber,
            args=(processNumber,)
        )
        children.append(p)
        p.start()
        args[p.pid] = processNumber

    while True:
        time.sleep(1)
        for n, p in enumerate(children):
            if not p.is_alive():
                # get the parameters the dead child was started with
                pidArgs = args[p.pid]
                del args[p.pid]
                print("n,args,p: ", n, pidArgs, p)
                # start a new process with the same args and replace the dead
                # child in place (avoids mutating the list mid-iteration)
                p = multiprocessing.Process(
                    target=printNumber,
                    args=(pidArgs,)
                )
                children[n] = p
                p.start()
                args[p.pid] = pidArgs
I have updated the example to illustrate how I want the processes to be restarted if one crashes/is killed/etc., keeping track of which pid was started with which args.
Is this the "best" way to do this, or is there a more Pythonic way of doing this?
I think I would create a separate thread for each Process and use a ProcessPoolExecutor. Executors have a useful method, submit, which returns a Future. You can wait on each Future and re-launch the Executor when the Future is done. Arguments to the function are tracked as instance attributes, so restarting is just a simple loop.
import threading
from concurrent.futures import ProcessPoolExecutor
import time
import random
import traceback

def printNumber(number):
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

class KeepRunning(threading.Thread):
    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds
        super().__init__()

    def run(self):
        while True:
            with ProcessPoolExecutor(max_workers=1) as pool:
                future = pool.submit(self.func, *self.args, **self.kwds)
                try:
                    future.result()
                except Exception:
                    traceback.print_exc()

if __name__ == '__main__':
    for process_number in range(10, 15):
        keep = KeepRunning(printNumber, process_number)
        keep.start()

    while True:
        time.sleep(1)
At the end of the program is a loop to keep the main thread running. Without that, the program will attempt to exit while your Processes are still running.
For the example you provided, I would just remove the exit condition from the while loop and change it to True.
As you said though, the actual code is more complicated (why didn't you post that?). So if the process gets terminated by, let's say, an exception, just put the code inside a try/except block. You can then put said block in an infinite loop, as in the sketch below.
I hope this is what you are looking for, but that seems to be the right way to do it given the goal and information you provided.
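A minimal sketch of that idea, reusing printNumber from the question (run_forever is a hypothetical wrapper name):

import random
import time

def printNumber(number):  # the worker from the question
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

def run_forever(number):
    # restart the body every time it returns or raises
    while True:
        try:
            printNumber(number)
        except Exception as exc:
            print("worker failed, restarting:", exc)

You would then pass run_forever instead of printNumber as the target of each Process.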
Instead of just starting the process immediately, you can save the list of processes and their arguments, and create another process that checks they are alive.
For example:
if __name__ == '__main__':
    process_list = []
    for processNumber in range(5):
        process = multiprocessing.Process(
            target=printNumber,
            args=(processNumber,)
        )
        process_list.append((process, processNumber))
        process.start()

    while True:
        for running_process, process_args in process_list:
            if not running_process.is_alive():
                # start a new process with the same args
                new_process = multiprocessing.Process(target=printNumber, args=(process_args,))
                process_list.remove((running_process, process_args))  # remove the terminated process
                process_list.append((new_process, process_args))
                new_process.start()
                break  # the list was modified, so restart the scan
I must say that I'm not sure this is the best way to do it in Python; you may want to look at scheduler services like Jenkins or something like that.
I am running a piece of Python code in which multiple threads are run through a thread pool executor. Each thread is supposed to perform a task (fetch a webpage, for example). What I want to be able to do is to terminate all threads, even if one of the threads fails. For instance:
from concurrent import futures
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(self._num_threads) as executor:
    jobs = []
    for path in paths:
        kw = {"path": path}
        jobs.append(executor.submit(start, **kw))

    for job in futures.as_completed(jobs):
        result = job.result()
        print(result)

def start(*args, **kwargs):
    # fetch the page
    if success:
        return True
    else:
        ...  # signal all threads to stop
Is it possible to do so? The results returned by the threads are useless to me unless all of them are successful, so if even one of them fails, I would like to save some execution time on the rest of the threads and terminate them immediately. The actual code is obviously doing relatively lengthy tasks with a couple of failure points.
If you are done with threads and want to look into processes, then this piece of code looks very promising and simple, with almost the same syntax as threads, but using the multiprocessing module.
When the timeout expires, the process is terminated, which is very convenient.
import multiprocessing

def get_page(*args, **kwargs):
    # your web page downloading code goes here
    pass

def start_get_page(timeout, *args, **kwargs):
    p = multiprocessing.Process(target=get_page, args=args, kwargs=kwargs)
    p.start()
    p.join(timeout)
    if p.is_alive():
        # stop the downloading 'thread'
        p.terminate()
        # and then do any post-error processing here

if __name__ == "__main__":
    # placeholders: supply your own timeout and get_page arguments
    start_get_page(timeout, *args, **kwargs)
I have created an answer for a similar question I had, which I think will work for this question.
Terminate executor using ThreadPoolExecutor from concurrent.futures module
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep

NUM_REQUESTS = 100

def long_request(id):
    sleep(1)
    # Simulate bad response
    if id == 10:
        return {"data": {"valid": False}}
    else:
        return {"data": {"valid": True}}

def check_results(results):
    for result in results:
        if not result["data"]["valid"]:
            return False
    return True

def main():
    futures = []
    responses = []
    num_requests = 0
    with ThreadPoolExecutor(max_workers=10) as executor:
        for request_index in range(NUM_REQUESTS):
            future = executor.submit(long_request, request_index)
            # Future list
            futures.append(future)
        for future in as_completed(futures):
            is_responses_valid = check_results(responses)
            # Cancel all future requests if one is invalid
            if not is_responses_valid:
                executor.shutdown(wait=False)
                break
            else:
                # Append valid responses
                num_requests += 1
                responses.append(future.result())
    return num_requests

if __name__ == "__main__":
    requests = main()
    print("Num Requests: ", requests)
In my code I used multiprocessing:
import multiprocessing as mp

pool = mp.Pool()
for i in range(threadNumber):
    pool.apply_async(publishMessage, args=(map_metrics, connection_parameters...,))
pool.close()
pool.terminate()
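Be aware that terminate() stops the worker processes immediately, without waiting for outstanding work to finish, so tasks submitted with apply_async may never run; if you want them to complete, call pool.join() after close() instead of terminating.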
This is how I would do it:
import concurrent.futures

def start(*args, **kwargs):
    # fetch the page
    if success:
        return True
    else:
        return False

with concurrent.futures.ProcessPoolExecutor() as executor:
    results = [executor.submit(start, {"path": path}) for path in paths]
    concurrent.futures.wait(results, timeout=10, return_when=concurrent.futures.FIRST_COMPLETED)
    for f in concurrent.futures.as_completed(results):
        f_success = f.result()
        if not f_success:
            # shutdown if one fails (cancel_futures requires Python 3.9+)
            executor.shutdown(wait=False, cancel_futures=True)
            break
        else:
            pass  # do stuff here
If any result is not True, everything will be shut down immediately.
You can try to use StoppableThread from func_timeout.
But terminating threads is strongly discouraged, and if you need to kill a thread, you probably have a design problem. Look at the alternatives: asyncio coroutines and multiprocessing, which offer legitimate cancel/terminate functionality.
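For example, here is a minimal asyncio sketch (fetch_page is an illustrative stand-in for real I/O work) showing that a running coroutine can be cancelled cleanly:

import asyncio

async def fetch_page(path):
    await asyncio.sleep(10)  # stand-in for a real network request
    return path

async def main():
    task = asyncio.create_task(fetch_page("/index"))
    await asyncio.sleep(0.1)  # let the task start
    task.cancel()             # raises CancelledError inside the coroutine
    try:
        await task
    except asyncio.CancelledError:
        print("task was cancelled cleanly")

asyncio.run(main())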
I'm using Python's multiprocessing.Pool and apply_async to call a bunch of functions.
How can I tell whether a function has started processing by a member of the pool or whether it is sitting in a queue?
For example:
import multiprocessing
import time

def func(t):
    # take some time processing
    print('func({}) started'.format(t))
    time.sleep(t)

pool = multiprocessing.Pool()
results = [pool.apply_async(func, [t]) for t in [100]*50]  # adds 50 func calls to the queue
For each AsyncResult in results you can call ready() or get(0) to see if the func has finished running. But how do you find out whether the func has started but hasn't finished yet?
That is, for a given AsyncResult object (a given element of results), is there a way to see whether the function has been called or whether it's still sitting in the pool's queue?
First, remove completed jobs from the results list:
results = [r for r in results if not r.ready()]
The number of jobs pending is the length of the results list:
pending = len(results)
And the number pending but not started is total pending minus the pool size:
not_started = pending - pool_size
pool_size will be multiprocessing.cpu_count() if the Pool is created with the default arguments, as you did.
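Putting the pieces together, a small sketch (pool_status is a hypothetical helper; it assumes every non-ready job is either running or queued):

import multiprocessing

def pool_status(results, pool_size=None):
    if pool_size is None:
        pool_size = multiprocessing.cpu_count()  # size of a default Pool()
    pending = [r for r in results if not r.ready()]
    running = min(len(pending), pool_size)  # at most pool_size jobs run at once
    queued = len(pending) - running
    return running, queued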
UPDATE:
After initially misunderstanding the question, here's a way to do what OP was asking about.
I suspect this functionality could be added to the Pool class without too much trouble, because AsyncResult is implemented by Pool with a Queue. That queue could also be used internally to indicate whether a job has started or not.
But here's a way to implement it using Pool and Pipe. NOTE: this doesn't work in Python 2.x -- not sure why. Tested in Python 3.8.
import multiprocessing
import time
import os

def worker_function(pipe):
    pipe.send('started')
    print('[{}] started pipe={}'.format(os.getpid(), pipe))
    time.sleep(3)
    pipe.close()

def test():
    pool = multiprocessing.Pool(processes=2)
    print('[{}] pool={}'.format(os.getpid(), pool))
    workers = []
    for x in range(1, 4):
        parent, child = multiprocessing.Pipe()
        pool.apply_async(worker_function, (child,))
        worker = {'name': 'worker{}'.format(x), 'pipe': parent, 'started': False}
        workers.append(worker)
    pool.close()

    while True:
        for worker in workers:
            if worker.get('started'):
                continue
            pipe = worker.get('pipe')
            if pipe.poll(0.1):
                message = pipe.recv()
                print('[{}] {} says {}'.format(os.getpid(), worker.get('name'), message))
                worker['started'] = True
                pipe.close()

        count_in_queue = len(workers)
        for worker in workers:
            if worker.get('started'):
                count_in_queue -= 1
        print('[{}] count_in_queue = {}'.format(os.getpid(), count_in_queue))
        if not count_in_queue:
            break
        time.sleep(0.5)

    pool.join()

if __name__ == '__main__':
    test()
The following code:
import threading
import time
from functools import partial
from itertools import count

def daemon_loop(sleep_interval, stop_event):
    for j in count():
        print(j)
        if stop_event.is_set():
            break
        time.sleep(sleep_interval)
        print('Slept %s' % sleep_interval)
    print('Prod terminating')

if __name__ == '__main__':
    stop_event = threading.Event()  # https://stackoverflow.com/a/41139707/281545
    target = partial(daemon_loop, sleep_interval=2, stop_event=stop_event)
    prod_thread = threading.Thread(target=target,
                                   # daemon=True
                                   )
    try:
        prod_thread.start()
        while True:
            time.sleep(10)
    except KeyboardInterrupt:
        print('Terminating...')
        stop_event.set()
prints on a keyboard interrupt:
C:\Users\MrD\.PyCharm2018.2\config\scratches>c:\_\Python363-64\python.exe thread_daemon.py
0
Slept 2
1
Terminating...
Slept 2
2
Prod terminating
Uncommenting the # daemon=True line results in the prod_thread being ended immediately:
C:\Users\MrD\.PyCharm2018.2\config\scratches>c:\_\Python363-64\python.exe thread_daemon.py
0
Slept 2
1
Terminating...
My question is: what is the preferred/more Pythonic way to deal with thread termination? Should I drop the Event machinery and just mark the thread as daemon, or is there some edge case I'm missing?
See:
Daemon Threads Explanation
How to stop daemon thread?
I haven't done enough Python to give you a "Pythonic" answer, but I can answer in more general programming terms.
Firstly, I'm not a fan of terminating threads. There are cases where it is safe and OK, such as your example here - but terminating in the middle of print writing its output would feel a little dirty.
Secondly, if you want to continue using sleep (which I'm also not a fan of) you could repeat your if stop_event.is_set(): and break after the sleep. (Don't move the code, copy it.) The main problem with sleep in this case is that it will wait the full sleep_interval even if the event is set during that time.
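That second option would look like this (a sketch based on the question's daemon_loop):

from itertools import count
import time

def daemon_loop(sleep_interval, stop_event):
    for j in count():
        print(j)
        if stop_event.is_set():
            break
        time.sleep(sleep_interval)
        if stop_event.is_set():  # copied, not moved, so both points are covered
            break
        print('Slept %s' % sleep_interval)
    print('Prod terminating')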
Thirdly - and my preference - instead of using sleep, do a wait on the event with a timeout. If the event is not set during the wait, wait returns false after waiting the timeout period. If the event is set before or during the wait, wait returns true immediately (that is, it aborts the timeout, giving you fast, clean shutdown of the thread.)
So your code would look something like this:
def daemon_loop(sleep_interval, stop_event):
for j in count():
print(j)
if stop_event.wait(sleep_interval):
break
print('Slept %s' % sleep_interval)
print('Prod terminating')
Python concurrent.futures and ProcessPoolExecutor provide a neat interface to schedule and monitor tasks. Futures even provide a .cancel() method:
cancel(): Attempt to cancel the call. If the call is currently being executed and cannot be cancelled then the method will return False, otherwise the call will be cancelled and the method will return True.
Unfortunately, in a similar question (concerning asyncio) the answer claims that running tasks are uncancellable, citing this snippet of the documentation, but the docs don't say that; a call is only uncancellable if it is running AND cannot be cancelled.
Submitting multiprocessing.Event objects to the processes is also not trivially possible (passing them as parameters, as you would with multiprocessing.Process, raises a RuntimeError).
What am I trying to do? I would like to partition a search space and run a task for every partition. It is enough to have ONE solution, and the process is CPU-intensive. So is there a comfortable way to accomplish this that does not offset the gains of using a ProcessPool to begin with?
Example:
from concurrent.futures import ProcessPoolExecutor, FIRST_COMPLETED, wait
# function that profits from partitioned search space
def m_run(partition):
for elem in partition:
if elem == 135135515:
return elem
return False
futures = []
# used to create the partitions
steps = 100000000
with ProcessPoolExecutor(max_workers=4) as pool:
for i in range(4):
# run 4 tasks with a partition, but only *one* solution is needed
partition = range(i*steps,(i+1)*steps)
futures.append(pool.submit(m_run, partition))
done, not_done = wait(futures, return_when=FIRST_COMPLETED)
for d in done:
print(d.result())
print("---")
for d in not_done:
# will return false for Cancel and Result for all futures
print("Cancel: "+str(d.cancel()))
print("Result: "+str(d.result()))
I don't know why concurrent.futures.Future does not have a .kill() method, but you can accomplish what you want by shutting down the process pool with pool.shutdown(wait=False) and killing the remaining child processes by hand.
Create a function for killing child processes (note that psutil is a third-party package, installable with pip):
import signal, psutil

def kill_child_processes(parent_pid, sig=signal.SIGTERM):
    try:
        parent = psutil.Process(parent_pid)
    except psutil.NoSuchProcess:
        return
    children = parent.children(recursive=True)
    for process in children:
        process.send_signal(sig)
Run your code until you get the first result, then kill all remaining child processes:
import os
from concurrent.futures import ProcessPoolExecutor, FIRST_COMPLETED, wait

# function that profits from a partitioned search space
def m_run(partition):
    for elem in partition:
        if elem == 135135515:
            return elem
    return False

futures = []
# used to create the partitions
steps = 100000000
pool = ProcessPoolExecutor(max_workers=4)
for i in range(4):
    # run 4 tasks with a partition, but only *one* solution is needed
    partition = range(i*steps, (i+1)*steps)
    futures.append(pool.submit(m_run, partition))

done, not_done = wait(futures, timeout=3600, return_when=FIRST_COMPLETED)

# Shut down the pool
pool.shutdown(wait=False)
# Kill remaining child processes
kill_child_processes(os.getpid())
Unfortunately, running Futures cannot be cancelled. I believe the core reason is to ensure the same API over different implementations (it's not possible to interrupt running threads or coroutines).
The Pebble library was designed to overcome this and other limitations.
from pebble import ProcessPool

def function(foo, bar=0):
    return foo + bar

with ProcessPool() as pool:
    future = pool.schedule(function, args=[1])

    # if running, the container process will be terminated;
    # a new process will be started to consume the next task
    future.cancel()
I found your question interesting, so here's my finding.
I found that the behaviour of the .cancel() method is as stated in the Python documentation. As for your running concurrent functions, unfortunately they could not be cancelled even after they were told to do so. If my finding is correct, then I reason that Python does require a more effective .cancel() method.
Run the code below to check my finding.
from concurrent.futures import ProcessPoolExecutor, as_completed
from time import time

# function that profits from a partitioned search space
def m_run(partition):
    for elem in partition:
        if elem == 3351355150:
            return elem  # return also terminates the loop once found
    return False

start = time()
futures = []
# used to create the partitions
steps = 1000000000
with ProcessPoolExecutor(max_workers=4) as pool:
    for i in range(4):
        # run 4 tasks with a partition, but only *one* solution is needed
        partition = range(i*steps, (i+1)*steps)
        futures.append(pool.submit(m_run, partition))

    ### New Code: Start ###
    for f in as_completed(futures):
        print(f.result())
        if f.result():
            print('break')
            break

    for f in futures:
        print(f, 'running?', f.running())
        if f.running():
            f.cancel()
            print('Cancelled? ', f.cancelled())

    print('New Instruction Ended at = ', time() - start)
print('Total Compute Time = ', time() - start)
Update:
It is possible to forcefully terminate the concurrent processes via the OS, but the consequence is that the main Python program will terminate too. If this isn't an issue for you, then try the code below.
You have to add this code between the last two print statements to see it for yourself. Note: this code works only if you aren't running any other python3 program.
import subprocess, os, signal

result = subprocess.run(['ps', '-C', 'python3', '-o', 'pid='],
                        stdout=subprocess.PIPE).stdout.decode('utf-8').split()
print('result =', result)
for i in result:
    print('PID = ', i)
    if i != result[0]:
        os.kill(int(i), signal.SIGKILL)
        try:
            os.kill(int(i), 0)
            raise Exception("""wasn't able to kill the process
                            HINT: use signal.SIGKILL or signal.SIGABRT""")
        except OSError:
            continue