Handling per worker timeout with Python multiprocessing

I've been having some trouble with Python's multiprocessing module. I need to test a function with different parameters, and since that function does a lot of calculation, I want to use all the cores. I ended up using pool.map(), which suits my needs. The problem is that sometimes my function never returns, so pool.map blocks forever waiting for a result. I don't know why this happens. I've run a lot of tests without multiprocessing, just passing one argument after another in a for loop, and it always finishes.
Anyway, what I want to do now is specify a timeout for each worker / execution of the function, but I need a variable inside the function to be returned if that timeout is reached. That would be the state of the function just before the timeout. My code is too big to post here, but here's a simple, equivalent example:
import time
from multiprocessing import Pool

def func(x):
    secsPassed = 0
    for _ in xrange(x):
        time.sleep(1)
        secsPassed += 1
    return secsPassed

pool = Pool(4)
results = pool.map(func, [3, 10, 50, 20, 300])
So I'd like each execution to take at most 30 seconds, and I'd also like to know the value of secsPassed just before func gets interrupted. I'm using Python 2.7, and I can make changes to func or use a tool other than Pool.map if necessary.
Thanks in advance.

This question has been asked several times in the past.
multiprocessing.Pool was not designed for this use case.
Forcing one of the workers to kill itself leads to undefined behaviour, ranging from the pool hanging forever to your program crashing.
There are libraries that can solve your problem. pebble lets you set a timeout on your workers and stops them if the time limit is exceeded.
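For the "return the state at timeout" part, here is a minimal stdlib-only sketch. It is not Pebble's API, just one way to get the same effect with plain multiprocessing. It is written for Python 3 and assumes a POSIX system where the "fork" start method is available. The names run_all, progress and step are made up for illustration, and it starts one process per input instead of a fixed pool of four.

```python
import multiprocessing as mp
import time

# Assumption: POSIX "fork" start method. With "spawn" (Windows, default on
# macOS) the calls below would need an `if __name__ == "__main__":` guard.
_ctx = mp.get_context("fork")


def func(x, progress, step=1.0):
    """Sleep for x steps, publishing the number of finished steps in `progress`."""
    for _ in range(x):
        time.sleep(step)
        progress.value += 1


def run_all(inputs, timeout, step=1.0):
    """Run func once per input, each in its own process.

    Returns one (steps_done, timed_out) tuple per input, in input order.
    """
    jobs = []
    for x in inputs:
        # lock=False: there is a single writer, and a killed child must not
        # be able to die while holding a lock the parent needs for reading.
        progress = _ctx.Value("i", 0, lock=False)
        proc = _ctx.Process(target=func, args=(x, progress, step))
        proc.start()
        jobs.append((proc, progress, time.monotonic() + timeout))

    results = []
    for proc, progress, deadline in jobs:
        # Wait only for whatever is left of this job's own time budget.
        proc.join(max(0.0, deadline - time.monotonic()))
        timed_out = proc.is_alive()
        if timed_out:
            proc.terminate()  # hard kill; func gets no chance to clean up
            proc.join()
        results.append((progress.value, timed_out))
    return results
```

With the numbers from the question this would be called as run_all([3, 10, 50, 20, 300], timeout=30). Each tuple then holds the last published secsPassed and whether that job had to be killed.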

Related

Python multiprocessing library Pool.map()

import time
from multiprocessing import Pool

def myfun(a):
    return a*2

p=Pool(5)
k0=time.time()
p.map(myfun,[1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10])
k1=time.time()
print(k1-k0)

k0=time.time()
for i in [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10]:
    myfun(i)
k1=time.time()
print(k1-k0)
I am using the multiprocessing package in Python. As you can see, I have executed two different snippets of code separately. The first one, which uses Pool.map, takes more time than the second one, which runs serially. Can anyone explain why? I thought p.map() would be much faster. Is it not executed in parallel?

Indeed as noted in the comments, it takes longer to run in parallel for some tasks with multiprocessing. This is expected for very small tasks. The reason is that you have to spin up a python instance on each process for each worker used, and you also have to serialize and ship both the function and the data you are sending with map. This takes some time, so there's an overhead associated with using a multiprocessing.Pool. For very quick tasks, I suggest multiprocessing.dummy.Pool, which uses threads -- and thus minimizes setup overhead.
Try putting a time.sleep(x) in your function call, and varying x. You'll see that as x increases, the function becomes more suitable to run in a thread pool, and then in a process pool for even more expensive x.
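A minimal sketch of that experiment, written for Python 3. The delay parameter and the timed helper are made up for illustration, and the sleep only stands in for real work:

```python
import time
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed Pool


def myfun(a, delay=0.05):
    """Same doubling as in the question, plus a sleep standing in for work."""
    time.sleep(delay)
    return a * 2


def timed(label, fn):
    """Run fn(), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print("{}: {:.3f}s".format(label, elapsed))
    return result, elapsed


data = list(range(1, 9))

# Serial baseline: eight sleeps back to back.
serial, serial_time = timed("serial", lambda: [myfun(i) for i in data])

# Four threads: the sleeps overlap, so the wall time drops.
with ThreadPool(4) as pool:
    pooled, pooled_time = timed("4 threads", lambda: pool.map(myfun, data))
```

Setting delay to 0 should bring back the situation from the question, where the pool's bookkeeping costs more than the work itself.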

Graceful Termination of Worker Pool

I want to spawn X Pool workers and give each of them X% of the work. My issue is that the work takes about 20 minutes to exhaust, and longer for each extra process running. Because of the type of calculation, my answer may be found within minutes or only after hours. What I would like is some way for a single worker to go "HEY I FOUND IT", and to use that signal to kill the rest of the pool and move on with my calculations.
Key points:
I have tried callbacks, they don't seem to run on a starmap_async until the entire pool finishes.
I only care about the first suitable answer found.
I am not sharing resources and surprise process death, albeit rude, is perfectly acceptable.
I've also considered using a Queue, but it wouldn't make sense because the scope of work I'm passing to each worker is already built into the parameters of the function.
Below is a very simplified version of what I'm working with (the real calculations can take hours to finish over a 4.2 billion item iterable).
def doWork():
    workers = Pool(2)
    results = workers.starmap_async( func = distSearch , iterable = Sections1_5, callback = killPool )
    workers.close()
    print("Found answer : {}".format(results.get()))
    workers.join()

def killPool():
    workers.terminate()
    print("Worker Pool Terminated")
I should probably specify that my process only returns a value if it finds an answer; otherwise it just exits once done. I have looked at this thread, but it has me completely lost, and it seems like a lot of overhead to constantly check for the win condition when that should come from the return/callback of the worker pool.
All the answers I've found add significant overhead by supervising the worker pool. I'm looking for a solution that sources the kill signal at the worker level, autonomously.

I'm looking for a solution that sources the kill signal at the worker level, autonomously.
AFAIK, that doesn't exist. The methods of the Pool object (like Pool.terminate) should only be used in the process that created the pool.
What you could do is use Pool.imap_unordered. This returns an iterator in the parent process over the results which yields results as soon as they become available. As soon as the desired result pops up, you could then use Pool.terminate().
Edit:
From looking at the 3.5 implementation, starmap_async returns a MapResult instance, which is not an iterator.
You can wrap multiple inputs in a tuple and use imap_unordered over a list of those.
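A minimal sketch of the imap_unordered approach, written for Python 3 and assuming the "fork" start method (POSIX). search_chunk is a made-up stand-in for distSearch, and each tuple bundles the inputs for one call:

```python
import multiprocessing as mp


def search_chunk(bounds):
    """Hypothetical stand-in for distSearch: return the hit in [lo, hi) or None."""
    lo, hi, target = bounds
    for n in range(lo, hi):
        if n == target:
            return n
    return None


def first_hit(chunks, processes=2):
    """Return the first non-None result and stop the remaining workers."""
    # Assumption: "fork" is available; with "spawn" this call would need
    # to run under an `if __name__ == "__main__":` guard.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes) as pool:
        # Results arrive in completion order, not submission order.
        for result in pool.imap_unordered(search_chunk, chunks):
            if result is not None:
                # Leaving the with block calls pool.terminate(), which
                # kills the workers that are still searching.
                return result
    return None
```

The kill still happens in the parent, but the parent does nothing except block on the iterator, so there is no polling loop to pay for.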

Asynchronous Timer Implementation in Python

I need to implement an asynchronous timer to 'watch' the execution of a list of functions until the timer expires. The problem is that each function call is blocking, so how can I keep track of the timer if a function takes too long to come back?
functions = [func_1, func_2, func_3, func_n]
timer = Timer(30) # timer of 30 sec, just for example.
while timer.expires():
    for func in functions:
        func() # what if this function runs for a min
I would like to avoid multithreading and multiprocessing as far as possible, but if multiprocessing/threading is the only way out, then please provide those solutions as well.
What are the different ways to achieve asynchronous behaviour in Python?

If the functions you call block on I/O, you can use the asyncio module to make them non-blocking. You then wrap them in a future and set a timeout for their completion. Keep in mind that the timeout only covers time spent waiting on I/O.
If the functions block because of CPU-bound work (while loops, long calculations), there is no way to achieve that without using processes.
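For the I/O case, a minimal sketch with asyncio.wait_for, written for Python 3.7+. slow_io is a made-up coroutine that stands in for a real non-blocking call, and run_with_budget is a hypothetical helper name:

```python
import asyncio


async def slow_io(delay):
    """Hypothetical I/O-bound call; the sleep stands in for waiting on a socket."""
    await asyncio.sleep(delay)
    return delay


async def run_with_budget(delays, budget):
    """Run the calls one after another until the shared time budget runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + budget
    finished = []
    for delay in delays:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            # wait_for cancels the call if it is still pending after `remaining`.
            finished.append(await asyncio.wait_for(slow_io(delay), timeout=remaining))
        except asyncio.TimeoutError:
            break
    return finished
```

The cancellation can only land at an await, which is why a CPU-bound loop inside the coroutine would never be interrupted this way.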

Advanced Python Scheduler (apscheduler) Stagger events that fire within the same second?

I've been working the Python "apscheduler" package (Advanced Python Scheduler) into my app. So far it's going well, and I'm able to do almost everything I had envisioned doing with it.
Only one kink left to iron out...
The function my events are calling will only accept around 3 calls a second or fail as it is triggering very slow hardware I/O :(
I've tried limiting the max number of threads in the threadpool from 20 to just 1 to try to slow down execution, but since I'm not really putting a big load on apscheduler, my events are still firing pretty much concurrently (well... very, very close together at least).
Is there a way to 'stagger' different events that fire within the same second?

I have recently found this question because I, like yourself, was trying to stagger scheduled jobs slightly to compensate for slow hardware.
Including an argument like this in the scheduler add_job call staggers the start time for each job by 200ms (while incrementing idx for each job):
next_run_time=datetime.datetime.now() + datetime.timedelta(seconds=idx * 0.2)
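The stagger computation on its own can be sketched as below. staggered_start_times is a made-up helper, not part of apscheduler; it only produces the datetimes that would be passed as next_run_time, one per job:

```python
import datetime


def staggered_start_times(n_jobs, gap_seconds=0.2, now=None):
    """Return n_jobs datetimes, each gap_seconds later than the one before."""
    if now is None:
        now = datetime.datetime.now()
    return [now + datetime.timedelta(seconds=idx * gap_seconds)
            for idx in range(n_jobs)]
```

Taking now once, outside the loop, keeps the gaps exact instead of drifting with however long each add_job call takes.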

What you want to use is the 'jitter' option.
From the docs:
The jitter option enables you to add a random component to the
execution time. This might be useful if you have multiple servers and
don’t want them to run a job at the exact same moment or if you want
to prevent multiple jobs with similar options from always running
concurrently
Example:
# Run the `job_function` every hour with an extra-delay picked randomly
# in a [-120,+120] seconds window.
sched.add_job(job_function, 'interval', hours=1, jitter=120)

I don't know about apscheduler, but have you considered using a Redis LIST (queue) and simply serializing the event feed into that one critically bounded function so that it fires no more than three times per second? For example, you could have it do a blocking POP with a one second max delay, increment your trigger count for every event, sleep when it hits three, and zero the trigger count any time the blocking POP times out. Or you could just use 333 millisecond sleeps after each event.
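The same idea can be sketched without Redis, swapping the Redis list for an in-process queue.Queue and using the simpler "sleep between events" variant. This is written for Python 3; rate_limited_worker, STOP and demo are made-up names for illustration only:

```python
import queue
import threading
import time

STOP = object()  # sentinel telling the worker to exit


def rate_limited_worker(events, handler, min_interval):
    """Call handler(item) for each queued item, at least min_interval apart."""
    next_allowed = time.monotonic()
    while True:
        item = events.get()  # blocks until an event arrives
        if item is STOP:
            break
        wait = next_allowed - time.monotonic()
        if wait > 0:
            time.sleep(wait)
        handler(item)
        next_allowed = time.monotonic() + min_interval


def demo(n_events=4, min_interval=0.05):
    """Queue n_events at once and return the (item, timestamp) pairs handled."""
    events = queue.Queue()
    handled = []

    def record(item):
        handled.append((item, time.monotonic()))

    worker = threading.Thread(
        target=rate_limited_worker, args=(events, record, min_interval))
    worker.start()
    for i in range(n_events):
        events.put(i)
    events.put(STOP)
    worker.join()
    return handled
```

For the hardware in the question, min_interval would be about 1.0 / 3, and the scheduled jobs would only put an event on the queue instead of touching the hardware themselves.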

My solution for future reference:
I added a basic bool lock in the function being called and a wait which seems to do the trick nicely - since it's not the calling of the function itself that raises the error, but rather a deadlock situation with what the function carries out :D

threadable delay in python 2.7

I'm currently using Python (2.7) to write a GUI that has some threads going on. I've reached a point where I need a delay of roughly one second before getting a piece of information, but I can't afford to have the function take more than a few milliseconds to run. With that in mind, I'm trying to create a threaded timer that sets a flag, timer.doneFlag, and have the main function keep polling to see whether it's done or not.
It works, but not all the time. The problem that I run into is that sometimes I feel like the time.sleep function in run , doesn't wait fully for a second (sometimes it may not even wait). All I need is a flag that lets me control the start time and that is raised once 1 second has passed.
I may be doing too much just to get a delay that is threadable. If you can suggest something, or help me find a bug in the following code, I'd be very grateful!
I've attached a portion of the code I used:
from main program:
class dataCollection:
    def __init__(self):
        self.timer=Timer(5)
        self.isTimerStarted=0
        return
    def StateFunction(self): #Try to finish the function within a few milliseconds
        if self.isTimerStarted==0:
            self.timer=Timer(1.0)
            self.timer.start()
            self.isTimerStarted=1
        if self.timer.doneFlag:
            self.timer.doneFlag=0
            self.isTimerStarted=0
        #and all the other code

import time
import threading
class Timer(threading.Thread):
    def __init__(self, seconds):
        self.runTime = seconds
        self.doneFlag=0
        threading.Thread.__init__(self)
    def run(self):
        time.sleep(self.runTime)
        self.doneFlag=1
        print "Buzzzz"

x=dataCollection()
while 1:
    x.StateFunction()
    time.sleep(0.1)

First, you've effectively rebuilt threading.Timer with less flexibility. So I think you're better off using the existing class. (There are some obvious downsides with creating a thread for each timer instance. But if you just want a single one-shot timer, it's fine.)
More importantly, having your main thread repeatedly poll doneFlag is probably a bad idea. This means you have to call your state function as often as possible, burning CPU for no good reason.
Presumably the reason you have to return within a few milliseconds is that you're returning to some kind of event loop, presumably for your GUI (but, e.g., a network reactor has the same issue, with the same solutions, so I'll keep things general).
If so, almost all such event loops have a way to schedule a timed callback within the event loop—Timer in wx, callLater in twisted, etc. So, use that.
If you're using a framework that doesn't have anything like that, it hopefully at least has some way to send an event/fire a signal/post a message/whatever it's called from outside. (If it's a simple file-descriptor-based reactor, it may not have that, but you can add it yourself just by tossing a pipe into the reactor.) So, change your Timer callback to signal the event loop, instead of writing code that polls the Timer.
If for some reason you really do need to poll a variable shared across threads, you really, really, should be protecting it with a Condition or RLock. There is no guarantee in the language that, when thread 0 updates the value, thread 1 will see the new value immediately, or even ever. If you understand enough of the internals of (a specific version of) CPython, you can often prove that the GIL makes a lock unnecessary in specific cases. But otherwise, this is a race.
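If polling really is unavoidable, one way to get a properly synchronized flag is to combine the existing threading.Timer with a threading.Event. A minimal sketch, where OneShotDelay is a made-up class name and not anything from the standard library:

```python
import threading


class OneShotDelay(object):
    """Non-blocking delay: start() returns at once, done() reports expiry."""

    def __init__(self, seconds):
        self._seconds = seconds
        self._event = threading.Event()
        self._timer = None

    def start(self):
        """Clear the flag and arm a fresh timer thread."""
        self._event.clear()
        # threading.Timer calls self._event.set() after the delay.
        self._timer = threading.Timer(self._seconds, self._event.set)
        self._timer.daemon = True  # do not keep the program alive
        self._timer.start()

    def done(self):
        """Cheap, thread-safe check that is safe to call from a state function."""
        return self._event.is_set()

    def wait(self, timeout=None):
        """Block until the delay has expired, or until timeout; returns the flag."""
        return self._event.wait(timeout)
```

The state function would call start() once and then check done() on each pass. The Event handles the cross-thread visibility that a bare attribute like doneFlag leaves to chance.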
Finally:
The problem that I run into is that sometimes I feel like the time.sleep function in run , doesn't wait fully for a second (sometimes it may not even wait).
Well, the documentation clearly says this can happen:
The actual suspension time may be less than that requested because any caught signal will terminate the sleep() following execution of that signal’s catching routine.
So, if you need a guarantee that it actually sleeps for at least 1 second, the only way to do this is something like this:
import time

t0 = time.time()
dur = 1.0
while True:
    time.sleep(dur)
    t1 = time.time()
    dur = 1.0 - (t1 - t0)
    if dur <= 0:
        break
