Immediately raise exceptions in concurrent.futures - python

I run several threads concurrently using concurrent.futures. All of them must complete successfully for the next steps in the code to succeed.
While I can surface any exceptions at the end by calling .result() on each future, ideally an exception raised in a single thread would immediately stop all threads. This would help identify bugs in any task sooner, rather than waiting until all of the long-running tasks complete.
Is this possible?

It's possible to exit after the first exception and not submit any new jobs to the Executor. However, once a job has started running, it can't be cancelled; you have to wait for it to finish (or time out). See this question for details. Here's a short example that cancels any not-yet-started jobs once the first exception occurs. However, it still waits for the already running jobs to finish. This uses the FIRST_EXCEPTION return condition listed in the concurrent.futures docs.
import time
import concurrent.futures

def example(i):
    print(i)
    assert i != 1
    time.sleep(i)
    return i

if __name__ == "__main__":
    futures = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        for number in range(5):
            futures.append(executor.submit(example, number))
        # wait() returns two sets of futures: the completed ones and the rest
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_EXCEPTION
        )
        for future in done:
            try:
                future.result()
            except Exception as e:
                for f in futures:
                    print(f.cancel())  # cancel all not-yet-started futures
                raise e
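As a related sketch (not part of the example above), assuming Python 3.9 or newer: Executor.shutdown() accepts a cancel_futures argument that cancels all pending, not-yet-started futures in a single call, so the manual cancel loop can be avoided.

import concurrent.futures

# assumes the example() function defined above
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(example, n) for n in range(5)]
    done, not_done = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_EXCEPTION
    )
    # cancel everything that has not started yet; already running jobs still finish
    executor.shutdown(wait=True, cancel_futures=True)
    for future in done:
        future.result()  # re-raises the first exception, if any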

Related

Unable to cancel future - asyncio.sleep()

I have a signal handler defined that cancels all the tasks in the currently running asyncio event loop when the SIGINT signal is raised. In main, I have defined a new loop and the loop runs until the sleep function completes. I have used print statements inside signal_handler to better understand what happens when an asyncio task is cancelled.
Below is my implementation,
import asyncio
import signal

class temp:
    def signal_handler(self, sig, frame):
        loop = asyncio.get_event_loop()
        tasks = asyncio.all_tasks(loop=loop)
        for task in tasks:
            print(task.get_name())  # returns the name of the task
            ret_val = asyncio.Future.cancel(task)  # returns True if task was just cancelled
            print(f"Return value : {ret_val}")
            print(f"Task Cancelled : {task.cancelled()}")  # returns True if task is cancelled
        return

    def main(self):
        try:
            signal.signal(signal.SIGINT, self.signal_handler)
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop=loop)
            loop.run_until_complete(asyncio.sleep(20))
        except asyncio.CancelledError as err:
            print("Cancellation error raised")
        finally:
            if not loop.is_closed():
                loop.close()

if __name__ == "__main__":
    test = temp()
    test.main()
Expected Behaviour:
When I raise a SIGINT at any time using Ctrl+C, the task (asyncio.sleep()) gets cancelled instantly, a CancelledError is raised, and there is a graceful exit.
Actual Behaviour:
The CancelledError is raised after the time t (in seconds) specified as a parameter to asyncio.sleep(t). For example, the CancelledError is raised after 20 secs for the above code.
Unusual Observation:
The behaviour of the code is in line with the Expected Behaviour when executed on Windows.
The issue described above is only happening on Linux.
What could be the reason for this ambiguous behaviour?

Pool only executes a single thread instead of 4, and how do I make it infinite?

So I am working on a little Python tool to stress test an application's API.
I've got a pretty nice script using Threading, but then I read that it will require manual coding to maintain n concurrent threads (meaning, starting new ones as soon as old ones finish), and the suggestion here: How to start a new thread when old one finishes? is to use ThreadPool. I tried as follows:
def test_post():
    print "Executing in " + threading.currentThread().getName() + "\n"
    time.sleep(randint(1, 3))
    return randint(1, 5), "Message"

if args.send:
    code, content = post()
    print (code, "\n")
    print (content)
elif args.test:
    # Create new threads
    print threads
    results_list = []
    pool = ThreadPool(processes=threads)
    results = pool.apply_async(test_post())
    pool.close()  # Done adding tasks.
    pool.join()  # Wait for all tasks to complete.
    # results = list(pool.imap_unordered(
    #     test_post(), ()
    # ))
    # thread_list = []
    # while threading.activeCount() <= threads:
    #     thread = LoadTesting(threadID=free_threads, name="Thread-" + str(threading.activeCount()), counter=1)
    #     thread.start()
    #     thread_list.append(thread)
    print "Exiting Main Thread" + "\n"
else:
    print ("cant get here!")
When I invoke the script, I get consistent output such as:
4
Executing in MainThread
Exiting Main Thread
I am not sure why. As you can see in the commented-out block, I tried different approaches and it still only executes once.
My goal is to make the script run in a loop, always keeping n threads running at any time.
The test_post (and, respectively, post) functions return the HTTP response code and the content; I would like to use this later to print/stop when the response code is NOT 200 OK.
Your first problem is that you already call your function in the MainThread by writing:
pool.apply_async(test_post())
...instead of passing test_post as an argument for a call to be executed in a worker-thread with:
pool.apply_async(test_post)
OP: I've got a pretty nice script using Threading, but then I read that it will require manual coding to maintain n number of concurrent threads (meaning, starting new ones as soon as old ones finish) ...
You need to distinguish between the unit of work (job, task) and a thread. The whole point of using a pool in the first place is re-using the executors, be they threads or processes. The workers are already created when a Pool is instantiated, and as long as you don't close the Pool, all initial threads stay alive. So you don't care about recreating threads; you just call pool methods of an existing pool as often as you have work you want to distribute. Pool takes these jobs (a pool-method call) and creates tasks out of them. These tasks are put on an unbounded queue. Whenever a worker is finished with a task, it will blockingly try to get() a new task from that inqueue (see the sketch below).
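A minimal sketch of that idea (the job function and its timing are made up for illustration): one ThreadPool is created once and then fed many jobs; the same four worker-thread names keep showing up in the output because the threads are re-used, not recreated.

import time
from threading import current_thread
from multiprocessing.pool import ThreadPool

def job(n):
    time.sleep(0.2)
    return "job {} ran in {}".format(n, current_thread().name)

pool = ThreadPool(4)  # the 4 worker threads are created here, once
results = [pool.apply_async(job, (n,)) for n in range(12)]  # 12 jobs for 4 threads
for r in results:
    print(r.get())  # the same 4 thread names appear over and over
pool.close()
pool.join()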
OP: Pool only executes a single thread instead of 4...I tried different ways and it still does it only once.
pool.apply_async(func, args=(), kwds={}, callback=None, error_callback=None)
...is a single-call, single-task-producing job. If you want more than one execution of func, you either have to call pool.apply_async() multiple times, or you use a mapping pool method like
pool.map(func, iterable, chunksize=None)
..., which maps one function over an iterable. pool.apply_async is non-blocking; that is why it is "async". It immediately returns an AsyncResult object you can (blockingly) call .wait() or .get() upon (see the sketch below).
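A hedged sketch of the two variants side by side (work() is a made-up stand-in for test_post):

from multiprocessing.pool import ThreadPool

def work(x):
    return x + 1

pool = ThreadPool(4)

res = pool.apply_async(work, (41,))  # returns an AsyncResult immediately
print(res.get())                     # 42 -- .get() blocks until this one result is ready

print(pool.map(work, range(8)))      # [1, 2, ..., 8] -- blocks until all results are in

pool.close()
pool.join()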
Through the comments it became clear that you want endless and immediate replacement of finished tasks (a self-produced input stream), and that the program should stop on KeyboardInterrupt or when a result does not have a certain value.
You can use the callback parameter of apply_async to schedule new tasks as soon as any of the old ones is finished. The difficulty lies in what to do meanwhile with the MainThread to prevent the whole script from ending prematurely while keeping it responsive to KeyboardInterrupt. Letting the MainThread sleep in a loop lets it still react immediately to KeyboardInterrupt while preventing an early exit. In case a result should stop the program, you can let the callback terminate the pool. The MainThread then just has to include a check of the pool status in its sleep loop.
import time
from random import randint, choice
from itertools import count
from datetime import datetime
from threading import current_thread
from multiprocessing.pool import ThreadPool

def test_post(post_id):
    time.sleep(randint(1, 3))
    status_code = choice([200] * 9 + [404])
    return "{} {} Message no.{}: {}".format(
        datetime.now(), current_thread().name, post_id, status_code
    ), status_code

def handle_result(result):
    msg, code = result
    print(msg)
    if code != 200:
        print("terminating")
        pool.terminate()
    else:
        pool.apply_async(
            test_post, args=(next(post_cnt),), callback=handle_result
        )

if __name__ == '__main__':
    N_WORKERS = 4
    post_cnt = count()
    pool = ThreadPool(N_WORKERS)
    # initial distribution
    for _ in range(N_WORKERS):
        pool.apply_async(
            test_post, args=(next(post_cnt),), callback=handle_result
        )
    try:
        while pool._state == 0:  # check if pool is still alive
            time.sleep(1)
    except KeyboardInterrupt:
        print(" got interrupt")
Example Output with KeyboardInterrupt:
$> python2 scratch.py
2019-02-15 18:46:11.724203 Thread-4 Message no.3: 200
2019-02-15 18:46:12.724713 Thread-2 Message no.1: 200
2019-02-15 18:46:13.726107 Thread-1 Message no.0: 200
2019-02-15 18:46:13.726292 Thread-3 Message no.2: 200
2019-02-15 18:46:14.724537 Thread-4 Message no.4: 200
2019-02-15 18:46:14.726881 Thread-2 Message no.5: 200
2019-02-15 18:46:14.727071 Thread-1 Message no.6: 200
^C got interrupt
Example Output with termination due to unwanted return value:
$> python2 scratch.py
2019-02-15 18:44:19.966387 Thread-3 Message no.0: 200
2019-02-15 18:44:19.966491 Thread-4 Message no.1: 200
2019-02-15 18:44:19.966582 Thread-1 Message no.3: 200
2019-02-15 18:44:20.967555 Thread-2 Message no.2: 200
2019-02-15 18:44:20.968562 Thread-3 Message no.4: 404
terminating
Note that in your scenario you can also call apply_async more than N_WORKERS times for your initial distribution, to have some buffer for reduced latency (see the sketch below).
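A sketch of that buffering idea, as a drop-in replacement for the "initial distribution" loop in the code above (the factor of 2 is arbitrary):

    # initial distribution with a buffer: queue up twice as many tasks as workers,
    # so a finished worker never has to wait for a callback to schedule the next task
    for _ in range(N_WORKERS * 2):
        pool.apply_async(
            test_post, args=(next(post_cnt),), callback=handle_result
        )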

How to graceful shut down coroutines with Ctrl+C?

I'm writing a spider to crawl web pages. I know asyncio may be my best choice, so I use coroutines to process the work asynchronously. Now I'm scratching my head about how to quit the program on a keyboard interrupt. The program shuts down fine after all the work has been done. The source code runs on Python 3.5 and is attached below.
import asyncio
import aiohttp
from contextlib import suppress

class Spider(object):
    def __init__(self):
        self.max_tasks = 2
        self.task_queue = asyncio.Queue(self.max_tasks)
        self.loop = asyncio.get_event_loop()
        self.counter = 1

    def close(self):
        for w in self.workers:
            w.cancel()

    async def fetch(self, url):
        try:
            async with aiohttp.ClientSession(loop=self.loop) as self.session:
                with aiohttp.Timeout(30, loop=self.session.loop):
                    async with self.session.get(url) as resp:
                        print('get response from url: %s' % url)
        except:
            pass
        finally:
            pass

    async def work(self):
        while True:
            url = await self.task_queue.get()
            await self.fetch(url)
            self.task_queue.task_done()

    def assign_work(self):
        print('[*]assigning work...')
        url = 'https://www.python.org/'
        if self.counter > 10:
            return 'done'
        for _ in range(self.max_tasks):
            self.counter += 1
            self.task_queue.put_nowait(url)

    async def crawl(self):
        self.workers = [self.loop.create_task(self.work()) for _ in range(self.max_tasks)]
        while True:
            if self.assign_work() == 'done':
                break
            await self.task_queue.join()
        self.close()

def main():
    loop = asyncio.get_event_loop()
    spider = Spider()
    try:
        loop.run_until_complete(spider.crawl())
    except KeyboardInterrupt:
        print('Interrupt from keyboard')
        spider.close()
        pending = asyncio.Task.all_tasks()
        for w in pending:
            w.cancel()
            with suppress(asyncio.CancelledError):
                loop.run_until_complete(w)
    finally:
        loop.stop()
        loop.run_forever()
        loop.close()

if __name__ == '__main__':
    main()
But if I press Ctrl+C while it's running, some strange errors may occur. Sometimes the program shuts down gracefully on Ctrl+C with no error message. However, in some cases the program is still running after pressing Ctrl+C and won't stop until all the work has been done. If I press Ctrl+C at that moment, 'Task was destroyed but it is pending!' shows up.
I have read some topics about asyncio and added some code in main() to close the coroutines gracefully, but it does not work. Has someone else had similar problems?
I bet the problem happens here:
except:
    pass
You should never do such a thing, and your situation is one more example of what can happen otherwise.
When you cancel a task and await its cancellation, asyncio.CancelledError is raised inside the task and shouldn't be suppressed anywhere inside it. The line where you await your task's cancellation should raise this exception; otherwise the task will continue execution.
That's why you do
task.cancel()
with suppress(asyncio.CancelledError):
    loop.run_until_complete(task)  # this line should raise CancelledError,
                                   # otherwise task will continue
to actually cancel the task.
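A minimal, self-contained sketch of that pattern (long_job() and the timings are made up for illustration): CancelledError is suppressed only at the awaiting site, never inside the task itself.

import asyncio
from contextlib import suppress

async def long_job():
    # no try/except around the await: CancelledError must propagate out of the task
    await asyncio.sleep(60)

loop = asyncio.get_event_loop()
task = loop.create_task(long_job())
loop.call_later(1, task.cancel)  # request cancellation after one second

with suppress(asyncio.CancelledError):  # suppress only here, outside the task
    loop.run_until_complete(task)

print("cancelled:", task.cancelled())  # True
loop.close()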
Upd:
But I still hardly understand why the original code could quit well on 'Ctrl+C' with an uncertain probability?
It depends on the state of your tasks:
- If at the moment you press 'Ctrl+C' all tasks are done, none of them will raise CancelledError on awaiting and your code will finish normally.
- If at the moment you press 'Ctrl+C' some tasks are pending but close to finishing their execution, your code will block a bit on task cancellation and finish shortly after the tasks do.
- If at the moment you press 'Ctrl+C' some tasks are pending and far from being finished, your code will get stuck trying to cancel these tasks (which can't be done). Another 'Ctrl+C' will interrupt the process of cancelling, but the tasks won't be cancelled or finished then, and you'll get the warning 'Task was destroyed but it is pending!'.
I assume you are using some flavor of Unix; if this is not the case, my comments might not apply to your situation.
Pressing Ctrl-C in a terminal sends all processes associated with this tty the signal SIGINT. A Python process catches this Unix signal and translates this into throwing a KeyboardInterrupt exception. In a threaded application (I'm not sure if the async stuff internally is using threads, but it very much sounds like it does) typically only one thread (the main thread) receives this signal and thus reacts in this fashion. If it is not prepared especially for this situation, it will terminate due to the exception.
Then the threading administration will wait for the still running fellow threads to terminate before the Unix process as a whole terminates with an exit code. This can take quite a long time. See this question about killing fellow threads and why this isn't possible in general.
What you want to do, I assume, is kill your process immediately, killing all threads in one step.
The easiest way to achieve this is to press Ctrl-\. This will send a SIGQUIT instead of a SIGINT, which typically also affects the fellow threads and causes them to terminate.
If this is not enough (because for whatever reason you need to react properly to Ctrl-C), you can send yourself a signal:
import os, signal
os.kill(os.getpid(), signal.SIGQUIT)
This should terminate all running threads unless they specifically catch SIGQUIT, in which case you can still use SIGKILL to perform a hard kill on them. This doesn't give them any option of reacting, though, and might lead to problems.
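For completeness, a sketch of that hard kill (Unix only; SIGKILL cannot be caught or ignored, so nothing gets a chance to clean up):

import os
import signal

# last resort: the process terminates immediately, with no cleanup of any kind
os.kill(os.getpid(), signal.SIGKILL)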

Django + multiprocessing.dummy.Pool + sleep = weird result

In my Django application, I want to do some work in the background when a certain view is requested. To that end, I created a multiprocessing.dummy.Pool of workers, and whenever that URL is called, I start a new task on it. The task to be executed in the background may have to do some retries with a certain timeout between them.
Since this whole thing is executed, so to speak, not on a UI thread, I thought I'd use sleep for timeouts. When I unit test this arrangement, everything works fine, but when it runs in Django, the thread gets to the sleep statement and then never wakes up; when I restart the Django app, the thread gets past the sleep statement and is then immediately killed by the restart. I know I could schedule retries using Timers, but I wanted a simpler solution.
Here's a simplified version of my code:
import logging
from time import sleep
from multiprocessing.dummy import Pool

from django.conf import settings

POOL = Pool(settings.POOL_WORKERS)

def background_task(arg):
    refresh = True
    try:
        for i in range(settings.GET_RETRY_LIMIT):
            # placeholder call: the actual function name is missing from the question
            status, result = some_call(arg, refresh=refresh)
            refresh = False
            if status is Statuses.OK:
                return result
            if i < settings.GET_RETRY_LIMIT - 1:
                sleep(settings.GET_SLEEP_TIME)
    except Exception as e:
        logging.error(e)
    return []

def do_background_work(arg):
    POOL.apply_async(
        background_task,
        (arg,)  # args must be a tuple
    )

def my_view(request):
    arg = get_arg_from_request(request)
    do_background_work(arg)
    return Response("Ok")
UPD: By the way, it turns out that the workers are most probably being killed by harakiri (uWSGI's worker timeout).

Python: wait for sigchild outside main thread

I'm working on a task scheduler which should run tasks described in an XML file.
The scheduler should be multithreaded and asynchronous to minimize the overhead spent waiting.
The problem is that os.wait* throws OSError when no child is running.
So, here is my code:
print("Waiting")
try:
(pid, exit_status) = os.waitpid(-1, 0)
exit_status >>= 8
except OSError:
print("Got error")
#when wait is called, and there is no child processes
#OSError is being raised
import time
time.sleep(1)
continue
I wonder if it is possible to eliminate the sleep() call.
