I have a bunch of long-running tasks that I would like to split up across multiple processes. That part I can do no problem. The issue I run into is that sometimes these processes go into a hung state. To address this I would like to be able to set a time threshold for each task that a process is working on. When that time threshold is exceeded I would like to restart or terminate the task.
Originally my code was very simple and used a process pool; however, with the pool I could not figure out how to retrieve the processes inside it, never mind how to restart or terminate a process in the pool.
I have resorted to using a queue and process objects, as illustrated in this example (https://pymotw.com/2/multiprocessing/communication.html#passing-messages-to-processes), with some changes.
My attempts to figure this out are in the code below. In its current state the process does not actually get terminated. Beyond that, I cannot figure out how to get the process to move on to the next task after the current task is terminated. Any suggestions / help appreciated; perhaps I'm going about this the wrong way.
Thanks
import multiprocess
import time
class Consumer(multiprocess.Process):
def __init__(self, task_queue, result_queue, startTimes, name=None):
multiprocess.Process.__init__(self)
if name:
self.name = name
print 'created process: {0}'.format(self.name)
self.task_queue = task_queue
self.result_queue = result_queue
self.startTimes = startTimes
def stopProcess(self):
elapseTime = time.time() - self.startTimes[self.name]
print 'killing process {0} {1}'.format(self.name, elapseTime)
self.task_queue.cancel_join_thread()
self.terminate()
# now want to get the process to start processing another job
def run(self):
'''
The process subclass calls this on a separate process.
'''
proc_name = self.name
print proc_name
while True:
# pulling the next task off the queue and starting it
# on the current process.
task = self.task_queue.get()
self.task_queue.cancel_join_thread()
if task is None:
# Poison pill means shutdown
#print '%s: Exiting' % proc_name
self.task_queue.task_done()
break
self.startTimes[proc_name] = time.time()
answer = task()
self.task_queue.task_done()
self.result_queue.put(answer)
return
class Task(object):
def __init__(self, a, b, startTimes):
self.a = a
self.b = b
self.startTimes = startTimes
self.taskName = 'taskName_{0}_{1}'.format(self.a, self.b)
def __call__(self):
import time
import os
print 'new job in process pid:', os.getpid(), self.taskName
if self.a == 2:
time.sleep(20000) # simulate a hung process
else:
time.sleep(3) # pretend to take some time to do the work
return '%s * %s = %s' % (self.a, self.b, self.a * self.b)
def __str__(self):
return '%s * %s' % (self.a, self.b)
if __name__ == '__main__':
# Establish communication queues
# tasks = this is the work queue and results is for results or completed work
tasks = multiprocess.JoinableQueue()
results = multiprocess.Queue()
#parentPipe, childPipe = multiprocess.Pipe(duplex=True)
mgr = multiprocess.Manager()
startTimes = mgr.dict()
# Start consumers
numberOfProcesses = 4
processObjs = []
for processNumber in range(numberOfProcesses):
processObj = Consumer(tasks, results, startTimes)
processObjs.append(processObj)
for process in processObjs:
process.start()
# Enqueue jobs
num_jobs = 30
for i in range(num_jobs):
tasks.put(Task(i, i + 1, startTimes))
# Add a poison pill for each process object
for i in range(numberOfProcesses):
tasks.put(None)
# process monitor loop,
killProcesses = {}
executing = True
while executing:
allDead = True
for process in processObjs:
name = process.name
#status = consumer.status.getStatusString()
status = process.is_alive()
pid = process.ident
elapsedTime = 0
if name in startTimes:
elapsedTime = time.time() - startTimes[name]
if elapsedTime > 10:
process.stopProcess()
print "{0} - {1} - {2} - {3}".format(name, status, pid, elapsedTime)
if allDead and status:
allDead = False
if allDead:
executing = False
time.sleep(3)
# Wait for all of the tasks to finish
#tasks.join()
# Start printing results
while num_jobs:
result = results.get()
print 'Result:', result
num_jobs -= 1
I generally recommend against subclassing multiprocessing.Process, as it leads to code that is hard to read.
I'd rather encapsulate your logic in a function and run it in a separate process. This keeps the code much cleaner and more intuitive.
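For instance, here is a minimal sketch of that approach using only the standard library; the work function, its arguments, and the 10 second limit are placeholders rather than anything from your code:

import multiprocessing

def work(queue, a, b):
    # stand-in for the body of your Task.__call__; puts its result on the queue
    queue.put('%s * %s = %s' % (a, b, a * b))

if __name__ == '__main__':
    result_queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=work, args=(result_queue, 2, 3))
    proc.start()
    proc.join(timeout=10)   # wait at most 10 seconds for this task
    if proc.is_alive():     # still running after the timeout: treat it as hung
        proc.terminate()
        proc.join()
        print('task timed out and was terminated')
    else:
        print(result_queue.get())

Each task gets its own process, so a hung task can be terminated without affecting the others.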
Nevertheless, rather than reinventing the wheel, I'd recommend using a library which already solves the issue for you, such as Pebble or billiard.
For example, the Pebble library makes it easy to set timeouts on processes running independently or within a Pool.
Running your function within a separate process with a timeout:
from pebble import concurrent
from concurrent.futures import TimeoutError
@concurrent.process(timeout=10)
def function(foo, bar=0):
return foo + bar
future = function(1, bar=2)
try:
result = future.result() # blocks until results are ready
except TimeoutError as error:
print("Function took longer than %d seconds" % error.args[1])
Same example but with a process Pool.
from pebble import ProcessPool

with ProcessPool(max_workers=5, max_tasks=10) as pool:
future = pool.schedule(function, args=[1], timeout=10)
try:
result = future.result() # blocks until results are ready
except TimeoutError as error:
print("Function took longer than %d seconds" % error.args[1])
In both cases, the timed-out process will be terminated for you automatically.
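Applied to your 30 tasks, a sketch might look like this; the multiply function stands in for your Task.__call__, and the 4 workers and 10 second limit are just illustrative choices:

from pebble import ProcessPool
from concurrent.futures import TimeoutError

def multiply(a, b):
    # stand-in for the original Task.__call__
    return '%s * %s = %s' % (a, b, a * b)

if __name__ == '__main__':
    with ProcessPool(max_workers=4) as pool:
        futures = [pool.schedule(multiply, args=[i, i + 1], timeout=10)
                   for i in range(30)]
        for future in futures:
            try:
                print(future.result())
            except TimeoutError:
                print('a task took longer than 10 seconds and was terminated')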
A simpler solution than reimplementing the Pool would be to keep using it and design a mechanism that times out the function you are running.
For instance:
from time import sleep
import signal
class TimeoutError(Exception):
pass
def handler(signum, frame):
raise TimeoutError()
def run_with_timeout(func, *args, timeout=10, **kwargs):
signal.signal(signal.SIGALRM, handler)
signal.alarm(timeout)
try:
res = func(*args, **kwargs)
except TimeoutError as exc:
print("Timeout")
res = exc
finally:
signal.alarm(0)
return res
def test():
sleep(4)
print("ok")
if __name__ == "__main__":
import multiprocessing as mp
p = mp.Pool()
print(p.apply_async(run_with_timeout, args=(test,),
kwds={"timeout":1}).get())
signal.alarm sets a timeout, and when this timeout expires it runs the handler, which stops the execution of your function.
EDIT: If you are using a Windows system, it seems to be a bit more complicated, as signal does not implement SIGALRM. Another solution is to use the C-level Python API. This code has been adapted from this SO answer, with a bit of adaptation to work on 64-bit systems. I have only tested it on Linux, but it should work the same on Windows.
import threading
import ctypes
from time import sleep
class TimeoutError(Exception):
pass
def run_with_timeout(func, *args, timeout=10, **kwargs):
interupt_tid = int(threading.get_ident())
def interupt_thread():
# Call the low level C python api using ctypes. tid must be converted
# to c_long to be valid.
res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
ctypes.c_long(interupt_tid), ctypes.py_object(TimeoutError))
if res == 0:
print(threading.enumerate())
print(interupt_tid)
raise ValueError("invalid thread id")
elif res != 1:
# "if it returns a number greater than one, you're in trouble,
# and you should call it again with exc=NULL to revert the effect"
ctypes.pythonapi.PyThreadState_SetAsyncExc(
ctypes.c_long(interupt_tid), 0)
raise SystemError("PyThreadState_SetAsyncExc failed")
timer = threading.Timer(timeout, interupt_thread)
try:
timer.start()
res = func(*args, **kwargs)
except TimeoutError as exc:
print("Timeout")
res = exc
else:
timer.cancel()
return res
def test():
sleep(4)
print("ok")
if __name__ == "__main__":
import multiprocessing as mp
p = mp.Pool()
print(p.apply_async(run_with_timeout, args=(test,),
kwds={"timeout": 1}).get())
print(p.apply_async(run_with_timeout, args=(test,),
kwds={"timeout": 5}).get())
For long running processes and/or long iterators, spawned workers might hang after some time. To prevent this, there are two built-in techniques:
Restart workers after they have delivered maxtasksperchild tasks from the queue.
Pass timeout to pool.imap.next(), catch the TimeoutError, and finish the rest of the work in another pool (see the sketch after the example below).
The following wrapper implements both, as a generator. This also works when replacing stdlib multiprocessing with multiprocess.
import multiprocessing as mp
def imap(
func,
iterable,
*,
processes=None,
maxtasksperchild=42,
timeout=42,
initializer=None,
initargs=(),
context=mp.get_context("spawn")
):
"""Multiprocessing imap, restarting workers after maxtasksperchild tasks to avoid zombies.
Example:
>>> list(imap(str, range(5)))
['0', '1', '2', '3', '4']
Raises:
mp.TimeoutError: if the next result cannot be returned within timeout seconds.
Yields:
Ordered results as they come in.
"""
with context.Pool(
processes=processes,
maxtasksperchild=maxtasksperchild,
initializer=initializer,
initargs=initargs,
) as pool:
it = pool.imap(func, iterable)
while True:
try:
yield it.next(timeout)
except StopIteration:
return
To catch the TimeoutError:
>>> import time
>>> iterable = list(range(10))
>>> results = []
>>> try:
... for i, result in enumerate(imap(time.sleep, iterable, processes=2, timeout=2)):
... results.append(result)
... except mp.TimeoutError:
... print("Failed to process the following subset of iterable:", iterable[i:])
Failed to process the following subset of iterable: [2, 3, 4, 5, 6, 7, 8, 9]
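To also finish the rest of the work in another pool (the second technique above), a sketch building on the imap wrapper could look like this; the work function and the choice to skip the item that timed out are assumptions, not part of the original answer:

import time
import multiprocessing as mp

def work(seconds):
    # stand-in for a task that may hang; here it just sleeps
    time.sleep(seconds)
    return seconds

if __name__ == '__main__':
    pending = [0, 1, 5, 0, 1]   # the 5-second item will exceed the timeout
    results = []
    while pending:
        completed_this_round = 0
        try:
            for result in imap(work, pending, processes=2, timeout=2):
                results.append(result)
                completed_this_round += 1
            pending = []
        except mp.TimeoutError:
            # skip the item that timed out and retry the rest in a fresh pool
            pending = pending[completed_this_round + 1:]
    print(results)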
I'm trying to find the way to start a new Process and get its output if it takes less than X seconds. If the process takes more time I would like to ignore the Process result, kill the Process and carry on.
I need to basically add the timer to the code below. Not sure if there's a better way to do it; I'm open to a different and better solution.
from multiprocessing import Process, Queue
def f(q):
# Ugly work
q.put(['hello', 'world'])
if __name__ == '__main__':
q = Queue()
p = Process(target=f, args=(q,))
p.start()
print q.get()
p.join()
Thanks!
You may find the following module useful in your case:
Module
#! /usr/bin/env python3
"""Allow functions to be wrapped in a timeout API.
Since code can take a long time to run and may need to terminate before
finishing, this module provides a set_timeout decorator to wrap functions."""
__author__ = 'Stephen "Zero" Chappell ' \
'<stephen.paul.chappell@atlantis-zero.net>'
__date__ = '18 December 2017'
__version__ = 1, 0, 1
__all__ = [
'set_timeout',
'run_with_timeout'
]
import multiprocessing
import sys
import time
DEFAULT_TIMEOUT = 60
def set_timeout(limit=None):
"""Return a wrapper that provides a timeout API for callers."""
if limit is None:
limit = DEFAULT_TIMEOUT
_Timeout.validate_limit(limit)
def wrapper(entry_point):
return _Timeout(entry_point, limit)
return wrapper
def run_with_timeout(limit, polling_interval, entry_point, *args, **kwargs):
"""Execute a callable object and automatically poll for results."""
engine = set_timeout(limit)(entry_point)
engine(*args, **kwargs)
while engine.ready is False:
time.sleep(polling_interval)
return engine.value
def _target(queue, entry_point, *args, **kwargs):
"""Help with multiprocessing calls by being a top-level module function."""
# noinspection PyPep8,PyBroadException
try:
queue.put((True, entry_point(*args, **kwargs)))
except:
queue.put((False, sys.exc_info()[1]))
class _Timeout:
"""_Timeout(entry_point, limit) -> _Timeout instance"""
def __init__(self, entry_point, limit):
"""Initialize the _Timeout instance will all needed attributes."""
self.__entry_point = entry_point
self.__limit = limit
self.__queue = multiprocessing.Queue()
self.__process = multiprocessing.Process()
self.__timeout = time.monotonic()
def __call__(self, *args, **kwargs):
"""Begin execution of the entry point in a separate process."""
self.cancel()
self.__queue = multiprocessing.Queue(1)
self.__process = multiprocessing.Process(
target=_target,
args=(self.__queue, self.__entry_point) + args,
kwargs=kwargs
)
self.__process.daemon = True
self.__process.start()
self.__timeout = time.monotonic() + self.__limit
def cancel(self):
"""Terminate execution if possible."""
if self.__process.is_alive():
self.__process.terminate()
@property
def ready(self):
"""Property letting callers know if a returned value is available."""
if self.__queue.full():
return True
elif not self.__queue.empty():
return True
elif self.__timeout < time.monotonic():
self.cancel()
else:
return False
@property
def value(self):
"""Property that retrieves a returned value if available."""
if self.ready is True:
valid, value = self.__queue.get()
if valid:
return value
raise value
raise TimeoutError('execution timed out before terminating')
@property
def limit(self):
"""Property controlling what the timeout period is in seconds."""
return self.__limit
@limit.setter
def limit(self, value):
self.validate_limit(value)
self.__limit = value
@staticmethod
def validate_limit(value):
"""Verify that the limit's value is not too low."""
if value <= 0:
raise ValueError('limit must be greater than zero')
To use, see the following example that demonstrates its usage:
Example
from time import sleep
def main():
timeout_after_four_seconds = set_timeout(4)
# create copies of a function that have a timeout
a = timeout_after_four_seconds(do_something)
b = timeout_after_four_seconds(do_something)
c = timeout_after_four_seconds(do_something)
# execute the functions in separate processes
a('Hello', 1)
b('World', 5)
c('Jacob', 3)
# poll the functions to find out what they returned
results = [a, b, c]
polling = set(results)
while polling:
for process, name in zip(results, 'abc'):
if process in polling:
ready = process.ready
if ready is True: # if the function returned
print(name, 'returned', process.value)
polling.remove(process)
elif ready is None: # if the function took too long
print(name, 'reached timeout')
polling.remove(process)
else: # if the function is running
assert ready is False, 'ready must be True, False, or None'
sleep(0.1)
print('Done.')
def do_something(data, work):
sleep(work)
print(data)
return work
if __name__ == '__main__':
main()
Does the process you are running involve a loop?
If so, you can record a timestamp prior to starting the loop and include an if statement within the loop that calls sys.exit() to terminate the script if the current timestamp differs from the recorded start timestamp by more than x seconds.
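A minimal sketch of that idea, with a placeholder 60 second limit and loop body:

import sys
import time

TIME_LIMIT = 60          # seconds; placeholder value
start = time.time()

while True:
    # ... do one unit of work here ...
    if time.time() - start > TIME_LIMIT:
        sys.exit('time limit exceeded, terminating the script')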
All you need to adapt the queue example from the docs to your case is to pass the timeout to the q.get() call and terminate the process on timeout:
from Queue import Empty
...
try:
print q.get(timeout=timeout)
except Empty: # no value, timeout occurred
p.terminate()
q = None # the queue might be corrupted after the `terminate()` call
p.join()
Using a Pipe might be more lightweight; otherwise the code is the same (you could use .poll(timeout) to find out whether there is data to receive).
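For instance, a sketch of the Pipe variant under the same assumptions (the worker just sends one value; the 10 second timeout is a placeholder):

from multiprocessing import Process, Pipe

def f(conn):
    # ugly work
    conn.send(['hello', 'world'])
    conn.close()

if __name__ == '__main__':
    timeout = 10                        # seconds; placeholder value
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    if parent_conn.poll(timeout):       # wait up to `timeout` seconds for data
        print(parent_conn.recv())
    else:                               # nothing arrived in time: kill the worker
        p.terminate()
    p.join()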
I am running pool.map on a big data array and I want to print a report in the console every minute.
Is it possible? As I understand it, Python is a synchronous language; it can't do this the way Node.js can.
Perhaps it can be done by threading... or how?
finished = 0
def make_job():
sleep(1)
global finished
finished += 1
# I want to call this function every minute
def display_status():
print 'finished: ' + str(finished)
def main():
data = [...]
pool = ThreadPool(45)
results = pool.map(make_job, data)
pool.close()
pool.join()
You can use a permanent threaded timer, like those from this question: Python threading.timer - repeat function every 'n' seconds
from threading import Timer,Event
class perpetualTimer(object):
# give it a cycle time (t) and a callback (hFunction)
def __init__(self,t,hFunction):
self.t=t
self.stop = Event()
self.hFunction = hFunction
self.thread = Timer(self.t,self.handle_function)
def handle_function(self):
self.hFunction()
self.thread = Timer(self.t,self.handle_function)
if not self.stop.is_set():
self.thread.start()
def start(self):
self.stop.clear()
self.thread.start()
def cancel(self):
self.stop.set()
self.thread.cancel()
Basically this is just a wrapper for a Timer object that creates a new Timer object every time your desired function is called. Don't expect millisecond accuracy (or even close) from this, but for your purposes it should be ideal.
Using this your example would become:
finished = 0
def make_job():
sleep(1)
global finished
finished += 1
def display_status():
print 'finished: ' + str(finished)
def main():
data = [...]
pool = ThreadPool(45)
# set up the monitor to run the function every minute
monitor = perpetualTimer(60, display_status)
monitor.start()
results = pool.map(make_job, data)
pool.close()
pool.join()
monitor.cancel()
EDIT:
A cleaner solution may be (thanks to comments below):
from threading import Event,Thread
class RepeatTimer(Thread):
def __init__(self, t, callback, event):
Thread.__init__(self)
self.stop = event
self.wait_time = t
self.callback = callback
self.daemon = True
def run(self):
while not self.stop.wait(self.wait_time):
self.callback()
Then in your code:
def main():
data = [...]
pool = ThreadPool(45)
stop_flag = Event()
RepeatTimer(60,display_status,stop_flag).start()
results = pool.map(make_job, data)
pool.close()
pool.join()
stop_flag.set()
One way to do this is to use the main thread as the monitoring one. Something like the code below should work:
def main():
data = [...]
results = []
step = 0
pool = ThreadPool(16)
pool.map_async(make_job, data, callback=results.extend)
pool.close()
while True:
if results:
break
step += 1
sleep(1)
if step % 60 == 0:
print "status update" + ...
I've used .map_async() instead of .map(), as the latter is synchronous. Also, you will probably need to replace results.extend with something more efficient. And finally, due to the GIL, the speed improvement may be much smaller than expected.
BTW, it is a little bit funny that you wrote that Python is synchronous in a question that asks about ThreadPool ;).
Consider using the time module. The time.time() function returns the current UNIX time.
For example, calling time.time() right now returns 1410384038.967499. One second later, it will return 1410384039.967499.
The way I would do this would be to use a while loop in place of results = pool.map(...), and on every iteration run a check like this:
last_time = time.time()
while (...):
new_time = time.time()
if new_time > last_time+60:
print "status update" + ...
last_time = new_time
(your computation here)
So that will check if (at least) a minute has elapsed since your last status update. It should print a status update approximately every sixty seconds.
Sorry that this is an incomplete answer, but I hope this helps or gives you some useful ideas.
I've been looking into a way to directly change variables in a running module.
What I want to achieve is that while a load test is running I can manually adjust the call pace or other settings.
Below is some code that I just created (not tested), just to give you an idea.
class A():
def __init__(self):
self.value = 1
def runForever(self):
while(1):
print self.value
def setValue(self, value):
self.value = value
if __name__ == '__main__':
#Some code to create the A object and directly apply the value from an human's input
a = A()
#Some parallelism or something has to be applied.
a.runForever()
a.setValue(raw_input("New value: "))
Edit #1: Yes, I know that as the code stands I will never hit a.setValue() :-)
Here is a multi-threaded example. This code will work with the python interpreter but not with the Python Shell of IDLE, because the raw_input function is not handled the same way.
from threading import Thread
from time import sleep
class A(Thread):
def __init__(self):
Thread.__init__(self)
self.value = 1
self.stop_flag = False
def run(self):
while not self.stop_flag:
sleep(1)
print(self.value)
def set_value(self, value):
self.value = value
def stop(self):
self.stop_flag = True
if __name__ == '__main__':
a = A()
a.start()
try:
while 1:
r = raw_input()
a.set_value(int(r))
except:
a.stop()
The pseudo code you wrote is quite similar to the way threading / multiprocessing works in Python. You will want to start (for example) a thread that "runs forever", and then, instead of modifying the internal rate value directly, you will probably just send a message through a Queue that gives the new value.
Check out this question.
Here is a demonstration of doing what you asked about. I prefer using Queues over directly making calls on threads / processes.
import Queue # !!warning. if you use multiprocessing, use multiprocessing.Queue
import threading
import time
def main():
q = Queue.Queue()
tester = Tester(q)
tester.start()
while True:
user_input = raw_input("New period in seconds or (q)uit: ")
if user_input.lower() == 'q':
break
try:
new_speed = float(user_input)
except ValueError:
new_speed = None # ignore junk
if new_speed is not None:
q.put(new_speed)
q.put(Tester.STOP_TOKEN)
class Tester(threading.Thread):
STOP_TOKEN = '<<stop>>'
def __init__(self, q):
threading.Thread.__init__(self)
self.q = q
self.speed = 1
def run(self):
while True:
# get from the queue
try:
item = self.q.get(block=False) # don't hang
except Queue.Empty:
item = None # do nothing
if item:
# stop when requested
if item == self.STOP_TOKEN:
break # stop this thread loop
# otherwise check for a new speed
try:
self.speed = float(item)
except ValueError:
pass # whatever you like with unknown input
# do your thing
self.main_code()
def main_code(self):
time.sleep(self.speed) # or whatever you want to do
if __name__ == '__main__':
main()
I have a simple example script constructed that defines three separate processes using multiprocessing in Python. My objective is to have one parent thread that spawns two smaller threads that will collect and process data.
Currently, my implementation looks like this:
from Queue import Queue,Empty
from multiprocessing import Process
import time
import hashlib
class FillQueue(Process):
def __init__(self,q):
Process.__init__(self)
self.q = q
def run(self):
i = 0
while i is not 5:
print 'putting'
self.q.put('foo')
i+=1
self.q.put('|STOP|')
class ConsumeQueue(Process):
def __init__(self,q):
Process.__init__(self)
self.q = q
def run(self):
print 'Consume'
while True:
try:
value = self.q.get(False)
print value
if value == '|STOP|':
print 'done'
break;
except Empty:
print 'Nothing to process atm'
class Ripper(Process):
q = Queue()
def __init__(self):
self.fq = FillQueue(self.q)
self.cq = ConsumeQueue(self.q)
self.fq.daemon = True
self.cq.daemon = True
def run(self):
try:
self.fq.start()
self.cq.start()
except KeyboardInterrupt:
print 'exit'
if __name__ == '__main__':
r = Ripper()
r.start()
As it runs presently, the output from the script on CLI looks like this:
putting
putting
putting
putting
putting
Consume
foo
foo
foo
foo
foo
|STOP|
done
Obviously, the way I am starting my two threads is blocking, since the consumer doesn't even begin to process the items in the queue until the filler finishes adding items.
How should I rewrite this to make both threads begin immediately and not block, so the consumer will simply pass to the Empty except block while there is no work to process, but will exit completely when it receives the stop message?
EDIT: typo, had the start and run methods mixed up
You seem to be starting multiple processes using multiprocessing.Process.
However, you are using Queue.Queue, which is only thread-safe and not designed to be used by multiple processes.
shevek's answer is valid as well, but as a start, you should replace Queue.Queue with multiprocessing.Queue.
try this:
from Queue import Empty
from multiprocessing import Process, Queue
import time
import hashlib
class FillQueue(object):
def __init__(self, q):
self.q = q
def run(self):
i = 0
while i < 5:
print 'putting'
self.q.put('foo %d' % i )
i+=1
time.sleep(.5)
self.q.put('|STOP|')
class ConsumeQueue(object):
def __init__(self, q):
self.q = q
def run(self):
while True:
try:
value = self.q.get(False)
print value
if value == '|STOP|':
print 'done'
break;
except Empty:
print 'Nothing to process atm'
time.sleep(.2)
if __name__ == '__main__':
q = Queue()
f = FillQueue(q)
c = ConsumeQueue(q)
p1 = Process(target=f.run)
p1.start()
p2 = Process(target=c.run)
p2.start()
p1.join()
p2.join()
I think your program works fine. The CPU processes only one thing at a time, for a short time slice. However, the time required to put all your stuff in the queue is very short, so there is no reason the filler cannot do all of it in one time slice.
If you add some delays in the filler, I think you should see that it actually works as you expect.