Schedule Tasks at Fixed Rate with Python Multiprocessing

I would like to run a function asynchronously in Python, calling it repeatedly at a fixed time interval. This Java class has functionality similar to what I want. I was hoping for something in Python like:
pool = multiprocessing.Pool()
pool.schedule(func, args, period)
# other code to do while that runs in the background
pool.close()
pool.join()
Are there any packages which provide similar functionality? I would prefer something simple and lightweight.
How could I implement this functionality in Python?
This post is similar, but asks for an in-process solution. I want a multiprocess, asynchronous solution.

Here is one possible solution. One caveat is that func needs to return faster than the rate; otherwise it won't be called as frequently as the rate, and if it later speeds up it will be scheduled faster than the rate while it catches up. This approach seems like a lot of work, but then again parallel programming is often tough. I would appreciate a second look at the code to make sure I don't have a deadlock waiting somewhere.
import multiprocessing, time, math

def func():
    print('hello its now {}'.format(time.time()))

def wrapper(f, period, event):
    last = time.time() - period
    while True:
        now = time.time()
        # wait() returns True if the event is set, otherwise False after the timeout
        if event.wait(timeout=(last + period - now)):
            break
        else:
            f()
            last += period

def main():
    period = 2
    # the event is the poison pill; setting it breaks the infinite loop in wrapper
    event = multiprocessing.Event()
    process = multiprocessing.Process(target=wrapper, args=(func, period, event))
    process.start()
    # burn some cpu cycles, takes about 20 seconds on my machine
    x = 7
    for i in range(50000000):
        x = math.sqrt(x**2)
    event.set()
    process.join()
    print('x is {} by the way'.format(x))

if __name__ == '__main__':
    main()

Related

Python multiprocessing - Is it possible to introduce a fixed time delay between individual processes?

I have searched and cannot find an answer to this question elsewhere. Hopefully I haven't missed something.
I am trying to use Python multiprocessing to essentially batch run some proprietary models in parallel. I have, say, 200 simulations, and I want to batch run them ~10-20 at a time. My problem is that the proprietary software crashes if two models happen to start at the same / similar time. I need to introduce a delay between processes spawned by multiprocessing so that each new model run waits a little bit before starting.
So far, my solution has been to introduce a random time delay at the start of the child process before it fires off the model run. However, this only reduces the probability of any two runs starting at the same time, so I still run into problems when processing a large number of models. I therefore think the time delay needs to be built into the multiprocessing part of the code, but I haven't been able to find any documentation or examples of this.
Edit: I am using Python 2.7
This is my code so far:
from time import sleep
import numpy as np
import subprocess
import multiprocessing

def runmodels(arg):
    # interim solution to reduce the probability that any two runs start at the same time,
    # but it isn't a guaranteed solution
    sleep(np.random.rand(1,1)*120)
    subprocess.call(arg)  # this line actually fires off the model run

if __name__ == '__main__':
    arguments = [big list of runs in here
                 ]
    count = 12
    pool = multiprocessing.Pool(processes=count)
    r = pool.imap_unordered(runmodels, arguments)
    pool.close()
    pool.join()
multiprocessing.Pool() already limits the number of processes running concurrently.
You could use a lock to separate the starting times of the processes (not tested):
import threading
import multiprocessing

def init(lock):
    global starting
    starting = lock

def run_model(arg):
    starting.acquire()  # no other process can get it until it is released
    threading.Timer(1, starting.release).start()  # release in a second
    # ... start your simulation here

if __name__ == "__main__":
    arguments = ...
    pool = multiprocessing.Pool(processes=12,
                                initializer=init, initargs=[multiprocessing.Lock()])
    for _ in pool.imap_unordered(run_model, arguments):
        pass
One way to do this is with a thread and a semaphore:
from time import sleep
import subprocess
import threading

def runmodels(arg):
    subprocess.call(arg)
    sGlobal.release()  # release for the next launch

if __name__ == '__main__':
    threads = []
    global sGlobal
    sGlobal = threading.Semaphore(12)  # semaphore for at most 12 threads
    arguments = [big list of runs in here
                 ]
    for arg in arguments:
        sGlobal.acquire()  # block if more than 12 threads are running
        t = threading.Thread(target=runmodels, args=(arg,))
        threads.append(t)
        t.start()
        sleep(1)
    for t in threads:
        t.join()
The answer suggested by jfs caused problems for me because it starts a new thread with threading.Timer. If the worker happens to finish before the timer does, the timer is killed and the lock is never released.
I propose an alternative route, in which each successive worker will wait until enough time has passed since the start of the previous one. This seems to have the same desired effect, but without having to rely on another child process.
import multiprocessing as mp
import time

def init(shared_val):
    global start_time
    start_time = shared_val

def run_model(arg):
    with start_time.get_lock():
        wait_time = max(0, start_time.value - time.time())
        time.sleep(wait_time)
        start_time.value = time.time() + 1.0  # specify the interval here
    # ... start your simulation here

if __name__ == "__main__":
    arguments = ...
    pool = mp.Pool(processes=12,
                   initializer=init, initargs=[mp.Value('d')])
    for _ in pool.imap_unordered(run_model, arguments):
        pass

How to jump out of a dead loop automatically in Python?

I have a "do..., until..." structure in Python as follows:
while True:
    if foo() == bar():
        break
It works fine (jumps out in the end) in most cases. However, in some cases where the condition is never met, it gets stuck.
Figuring out what those cases are is difficult, since there is essentially a random process behind it. So I wish to set a "timeout" for the while loop.
Say, if the loop has been running for 1 s but still has not stopped, I want it to terminate itself.
How may I do this?
Update: Here is the actual code:
while True:
    possibleJunctions = junctionReachability[junctions.index(currentJunction)]
    nextJunction = random.choice(filter(lambda (jx, jy): (jx - currentJunction[0]) * (endJunction[0] - currentJunction[0]) > 0 or (jy - currentJunction[1]) * (endJunction[1] - currentJunction[1]) > 0, possibleJunctions) or possibleJunctions)
    if previousJunction != nextJunction:  # never go back
        junctionSequence.append(nextJunction)
        previousJunction = currentJunction
        currentJunction = nextJunction
    if currentJunction == endJunction:
        break
import time

loop_start = time.time()
while time.time() - loop_start <= 1:
    if foo() == bar():
        break
EDIT
Dan Doe's solution is simplest and best if your code is synchronous (just runs in a single thread) and you know that the foo and bar functions always terminate within some period of time.
If you have asynchronous code (like a GUI), or if the foo and bar functions you use to test for termination conditions can themselves take too long to complete, then read on.
Run the loop inside a separate thread/process. Run a timer in another process. Once the timer expires, set a flag that would cause the loop to terminate.
Something like this (warning: untested code):
import multiprocessing
import time

SECONDS = 10
event = multiprocessing.Event()

def worker():
    """Does stuff until work is complete, or until signaled to terminate by the timer."""
    while not event.is_set():
        if foo() == bar():
            break

def timer():
    """Signals the worker to terminate once the time limit is reached."""
    time.sleep(SECONDS)
    event.set()

def main():
    """Kicks off the subprocesses and waits for both of them to terminate."""
    worker_process = multiprocessing.Process(target=worker)
    timer_process = multiprocessing.Process(target=timer)
    timer_process.start()
    worker_process.start()
    timer_process.join()
    worker_process.join()

if __name__ == "__main__":
    main()
If you were worried about the foo and bar functions taking too long to complete, you could explicitly terminate the worker process from within the timer process.
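For reference, a minimal sketch of that idea, here done from the parent process with a join timeout rather than from a separate timer process, so the Process handle stays where it was created (foo and bar are assumed to be defined elsewhere, as in the question):
import multiprocessing

SECONDS = 10

def worker():
    # stand-in for the real loop; foo() and bar() are assumed to exist
    while True:
        if foo() == bar():
            break

def main():
    worker_process = multiprocessing.Process(target=worker)
    worker_process.start()
    worker_process.join(SECONDS)    # wait at most SECONDS for the loop to finish
    if worker_process.is_alive():   # still stuck: kill it explicitly
        worker_process.terminate()
        worker_process.join()

if __name__ == "__main__":
    main()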
I recommend using a counter. This is a common trick to detect non-convergence.
import sys

maxiter = 10000
while True:
    if stopCondition():
        break
    maxiter = maxiter - 1
    if maxiter <= 0:
        print >>sys.stderr, "Did not converge."
        break
This requires the least overhead and usually adapts best to different CPUs: even on a faster CPU you want the same termination behavior, rather than a time-based timeout.
However, it would be even better to detect being stuck, e.g. with some criterion function that no longer improves.
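As a rough sketch of that idea (stopCondition and criterion are placeholders for whatever the actual loop computes; lower is assumed to be better):
best = float('inf')
stale = 0
patience = 1000  # give up after this many iterations without improvement

while True:
    if stopCondition():
        break
    score = criterion()
    if score < best:
        best, stale = score, 0
    else:
        stale += 1
        if stale >= patience:
            print("No improvement in {} iterations, giving up.".format(patience))
            break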

Fast and Precise Python Repeating Timer

I need to send repeating messages from a list quickly and precisely. One list needs to send the messages every 100ms, with a +/- 10ms window. I tried using the code below, but the problem is that the timer waits the 100ms, and then all the computation needs to be done, making the timer fall out of the acceptable window.
Simply decreasing the wait is a messy and unreliable hack. There is a Lock around the message loop in case the list gets edited during the loop.
Any thoughts on how to get Python to send messages consistently at around 100 ms? Thanks
from threading import Timer
from threading import Lock

class RepeatingTimer(object):
    def __init__(self, interval, function, *args, **kwargs):
        super(RepeatingTimer, self).__init__()
        self.args = args
        self.kwargs = kwargs
        self.function = function
        self.interval = interval
        self.start()

    def start(self):
        self.callback()

    def stop(self):
        self.interval = False

    def callback(self):
        if self.interval:
            self.function(*self.args, **self.kwargs)
            Timer(self.interval, self.callback).start()

def loop(messageList):
    listLock.acquire()
    for m in messageList:
        writeFunction(m)
    listLock.release()

MESSAGE_LIST = []  # imagine this is populated with the messages
listLock = Lock()

rt = RepeatingTimer(0.1, loop, MESSAGE_LIST)
# Do other stuff after this
I do understand that writeFunction will cause some delay, but not more than the 10 ms allowed. I essentially need to call the function every 100 ms for each message. The message list is small, usually less than elements.
The next challenge is to have this work with every 10ms, +/-1ms :P
Yes, the simple waiting is messy and there are better alternatives.
First off, you need a high-precision timer in Python. There are a few alternatives and depending on your OS, you might want to choose the most accurate one.
Second, you must be aware of the basics of preemptive multitasking and understand that there is no high-precision sleep function, and that its actual resolution differs from OS to OS too. For example, if we're talking about Windows, the minimal sleep interval might be around 10-13 ms.
And third, remember that it's always possible to wait for a very accurate interval of time (assuming you have a high-resolution timer), but with a trade-off of high CPU load. The technique is called busy waiting:
while True:
    if time.clock() == something:
        break
So, the actual solution is to create a hybrid timer. It will use the regular sleep function to wait the main bulk of the interval, and then it'll start probing the high-precision timer in the loop, while doing the sleep(0) trick. Sleep(0) will (depending on the platform) wait the least possible amount of time, releasing the rest of the remaining time slice to other processes and switching the CPU context. Here is a relevant discussion.
The idea is thoroughly described in Ryan Geiss's Timing in Win32 article. It's in C and for the Windows API, but the basic principles apply here as well.
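A minimal sketch of such a hybrid wait, assuming Python 3's time.perf_counter (on older versions time.clock would play the same role), and reusing loop and MESSAGE_LIST from the question's code:
import time

def wait_until(deadline, coarse_margin=0.02):
    # sleep away most of the interval cheaply, then busy-wait with sleep(0) for precision;
    # the 20 ms margin is a guess and should be tuned per OS
    remaining = deadline - time.perf_counter()
    if remaining > coarse_margin:
        time.sleep(remaining - coarse_margin)
    while time.perf_counter() < deadline:
        time.sleep(0)  # yield the rest of the time slice, then re-check

period = 0.1
deadline = time.perf_counter() + period
while True:
    wait_until(deadline)
    loop(MESSAGE_LIST)      # send the messages
    deadline += period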
Store the start time. Send the message. Get the end time. Calculate timeTaken=end-start. Convert to FP seconds. Sleep(0.1-timeTaken). Loop back.
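A sketch of that loop, again borrowing the loop/MESSAGE_LIST names from the question; note it compensates for the time the send takes but not for sleep overshoot:
import time

period = 0.1  # 100 ms target
while True:
    start = time.time()
    loop(MESSAGE_LIST)                 # send the messages
    time_taken = time.time() - start
    time.sleep(max(0.0, period - time_taken))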
try this:
#!/usr/bin/python
import time  # required for the time module
from threading import Timer

def hello(start, interval, count):
    ticks = time.time()
    t = Timer(interval - (ticks - start - count * interval), hello, [start, interval, count + 1])
    t.start()
    print "Number of ticks since 12:00am, January 1, 1970:", ticks, " #", count

dt = 1.25  # interval in seconds
t = Timer(dt, hello, [round(time.time()), dt, 0])  # start over at a full second, round only for testing here
t.start()

Return whichever expression returns first

I have two different functions, f and g, that compute the same result with different algorithms. Sometimes one or the other takes a long time while the other terminates quickly. I want to create a new function that runs each simultaneously and then returns the result from whichever finishes first.
I want to create that function with a higher order function
h = firstresult(f, g)
What is the best way to accomplish this in Python?
I suspect that the solution involves threading. I'd like to avoid discussion of the GIL.
I would simply use a Queue for this. Start the threads and the first one which has a result ready writes to the queue.
Code
from threading import Thread
from time import sleep
from Queue import Queue

def firstresult(*functions):
    queue = Queue()
    threads = []
    for f in functions:
        def thread_main():
            queue.put(f())
        thread = Thread(target=thread_main)
        threads.append(thread)
        thread.start()
    result = queue.get()
    return result

def slow():
    sleep(1)
    return 42

def fast():
    return 0

if __name__ == '__main__':
    print firstresult(slow, fast)
Live demo
http://ideone.com/jzzZX2
Notes
Stopping the threads is an entirely different topic. For this you need to add some state variable to the threads which is checked at regular intervals. To keep this example short, I simply skipped that part and assumed that all workers get the time to finish their work even though the result is never read.
Skipping the discussion about the GIL as requested by the questioner. ;-)
Now - unlike my suggestion on the other answer, this piece of code does exactly what you are requesting:
from multiprocessing import Process, Queue
import random
import time

def firstresult(func1, func2):
    queue = Queue()
    proc1 = Process(target=func1, args=(queue,))
    proc2 = Process(target=func2, args=(queue,))
    proc1.start(); proc2.start()
    result = queue.get()
    proc1.terminate(); proc2.terminate()
    return result

def algo1(queue):
    time.sleep(random.uniform(0, 1))
    queue.put("algo 1")

def algo2(queue):
    time.sleep(random.uniform(0, 1))
    queue.put("algo 2")

print firstresult(algo1, algo2)
Run each function in a new worker thread; the two worker threads send the result back to the main thread in a one-item queue or something similar. When the main thread receives the result from the winner, it kills (do Python threads support kill yet? lol) both worker threads to avoid wasting time (one function may take hours while the other only takes a second).
Replace the word thread with process if you want.
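Python threads cannot actually be killed, so in practice the "kill" has to be cooperative. A minimal Python 3 sketch under that assumption, where f and g are expected to accept a stop event and check it periodically:
import queue
import threading

def firstresult(f, g):
    results = queue.Queue(maxsize=1)
    stop = threading.Event()            # cooperative kill switch for the losing worker

    def worker(func):
        value = func(stop)              # assumption: the function polls stop.is_set()
        try:
            results.put_nowait(value)
        except queue.Full:
            pass                        # the other worker already delivered a result

    for func in (f, g):
        t = threading.Thread(target=worker, args=(func,))
        t.daemon = True                 # don't keep the program alive for the loser
        t.start()

    winner = results.get()              # block until the first finisher reports
    stop.set()                          # tell the slower worker to give up
    return winner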
You will need to run each function in another process (with multiprocessing) or in a different thread.
If both are CPU bound, multithreading won't help much, exactly due to the GIL,
so multiprocessing is the way.
If the return value is a pickleable (serializable) object, I have this decorator I created that simply runs the function in the background, in another process:
https://bitbucket.org/jsbueno/lelo/src
It is not exactly what you want, as both calls are non-blocking and start executing right away. The trick with this decorator is that it blocks (and waits for the function to complete) when you try to use the return value.
But on the other hand, it is just a decorator that does all the work.
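The lelo decorator makes the returned object block transparently on use; a much-simplified sketch of the same idea, with an explicit .get() instead of a transparent proxy, and assuming a fork-based start method so the closure can be used as the child's target:
import multiprocessing
from functools import wraps

class Lazy(object):
    """Handle for a result computed in another process; get() blocks until it is ready."""
    def __init__(self, conn, proc):
        self._conn, self._proc = conn, proc
    def get(self):
        value = self._conn.recv()   # blocks until the child sends the pickled result
        self._proc.join()
        return value

def background(func):
    """Run the decorated function in a separate process and return a Lazy handle at once."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        recv_end, send_end = multiprocessing.Pipe(duplex=False)
        proc = multiprocessing.Process(
            target=lambda: send_end.send(func(*args, **kwargs)))
        proc.start()
        return Lazy(recv_end, proc)
    return wrapper
Decorating a function with @background then calling it returns immediately; calling .get() on the returned handle blocks until the child process finishes.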

python process takes 100% CPU

I am trying to run a Python application and execute actions at a specified interval. The code below constantly consumes 100% of the CPU.
import time

def action_print():
    print "hello there"

interval = 5
next_run = 0

while True:
    while next_run > time.time():
        pass
    next_run = time.time() + interval
    action_print()
I would like to avoid putting the process to sleep, as there will be more actions to execute at various intervals.
Please advise.
If you know when the next run will be, you can simply use time.sleep:
import time

interval = 5
next_run = 0
while True:
    time.sleep(max(0, next_run - time.time()))
    next_run = time.time() + interval
    action_print()
If you want other threads to be able to interrupt you, use an event like this:
import time, threading

interval = 5
next_run = 0
interruptEvent = threading.Event()
while True:
    interruptEvent.wait(max(0, next_run - time.time()))
    interruptEvent.clear()
    next_run = time.time() + interval
    action_print()
Another thread can now call interruptEvent.set() to wake up yours.
In many cases, you will also want to use a Lock to avoid race conditions on shared data. Make sure to clear the event while you hold the lock.
You should also be aware that under CPython, only one thread can execute Python code at a time. Therefore, if your program is CPU-bound across multiple threads and you're using CPython or PyPy, you should substitute multiprocessing for threading.
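A minimal sketch of the Lock-plus-Event combination described above, using a hypothetical add_job/pending_jobs pair as the shared data another thread might touch:
import time
import threading

interval = 5
interruptEvent = threading.Event()
dataLock = threading.Lock()     # protects pending_jobs
pending_jobs = []               # shared data filled in by other threads

def add_job(job):
    # called from another thread: publish data and wake the main loop
    with dataLock:
        pending_jobs.append(job)
        interruptEvent.set()

next_run = 0
while True:
    interruptEvent.wait(max(0, next_run - time.time()))
    with dataLock:              # clear the event while holding the lock
        interruptEvent.clear()
        jobs = list(pending_jobs)
        del pending_jobs[:]
    next_run = time.time() + interval
    for job in jobs:
        print("handling", job)  # stand-in for the real per-job action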
Presumably you do not want to write time.sleep(interval), but replacing 'pass' with time.sleep(0.1) will almost completely free up your CPU, and still allow you flexibility in the while predicate.
Alternatively you could use a thread for each event you are scheduling and use time.sleep(interval) but this will still tie up your CPU.
Bottom line: your while: pass loop is going round and round very fast, consuming all your CPU.
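A sketch of that substitution, with action_print and interval as in the question:
import time

interval = 5
next_run = 0
while True:
    while next_run > time.time():
        time.sleep(0.1)   # yield the CPU instead of spinning on pass
    next_run = time.time() + interval
    action_print()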
