I created a subclass of multiprocessing.Process.
Calling p.run() directly updates instance.ret_value from long_runtime_proc, but with p.start() the ret_value is never updated even though long_runtime_proc is called and runs.
How can I get ret_value when using p.start()?
class myProcess(multiprocessing.Process):
    def __init__(self, pid, name, ret_value=0):
        multiprocessing.Process.__init__(self)
        self.id = pid
        self.ret_value = ret_value

    def run(self):
        self.ret_value = long_runtime_proc(self.id)
Calling Process.run() directly does not start a new process, i.e. the code in Process.run() is executed in the same process that invoked it. That's why changes to self.ret_value are effective. However, you are not supposed to call Process.run() directly.
When you start the subprocess with Process.start(), a new child process is created and the code in Process.run() is executed in this new process. When you assign the return value of long_runtime_proc to self.ret_value, this occurs in the child process, not the parent, and thus the parent's ret_value is not updated.
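To make the separation visible, here is a minimal sketch (hypothetical, not the original code): the assignment in run() only changes the child's copy of the object.
import multiprocessing

class Demo(multiprocessing.Process):
    def __init__(self):
        super(Demo, self).__init__()
        self.ret_value = 0

    def run(self):
        self.ret_value = 42  # modifies only the child's copy

if __name__ == '__main__':
    p = Demo()
    p.start()
    p.join()
    print(p.ret_value)  # prints 0: the parent's copy is unchanged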
What you probably need to do is to use a pipe or a queue to send the return value to the parent process. See the documentation for details. Here is an example using a queue:
import time
import multiprocessing

def long_runtime_proc():
    '''Simulate a long running process'''
    time.sleep(10)
    return 1234

class myProcess(multiprocessing.Process):
    def __init__(self, result_queue):
        self.result_queue = result_queue
        super(myProcess, self).__init__()

    def run(self):
        self.result_queue.put(long_runtime_proc())

q = multiprocessing.Queue()
p = myProcess(q)
p.start()
ret_value = q.get()
p.join()
With this code, ret_value ends up being assigned the value taken off the queue, which will be 1234.
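For completeness, a minimal sketch of the same round trip using a Pipe instead of a Queue (a stand-in value replaces long_runtime_proc):
import multiprocessing

def run_and_send(conn):
    conn.send(1234)  # stand-in for long_runtime_proc()
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=run_and_send, args=(child_conn,))
    p.start()
    ret_value = parent_conn.recv()  # blocks until the child sends
    p.join()
    print(ret_value)  # 1234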
I am trying to achieve a multiprocessing scenario with Kubernetes.
I have Python-based code that regularly checks whether a given process exists, using psutil:
For example:
import psutil
import time
from multiprocessing import Process

class MyProcessChecker:
    def __init__(self):
        pass

    def task(self):
        self.job()

    def job(self):
        # GET pids in the "some" database
        pids = [26, 27, 30]
        for pid in pids:
            if not psutil.pid_exists(pid):
                # Then update the database and remove the pid from the list of PIDs.
                pass

    def run(self):
        p = Process(target=self.task, name="MyProcessChecker")
        p.start()
        return p

class MyProcess:
    def __init__(self):
        pass

    def task(self):
        self.job()

    def job(self):
        # Long process
        pass

    def run(self):
        p = Process(target=self.task, name="MyProcess")
        p.start()
        return p
if __name__ == "__main__":
    # Running
    proc_1 = MyProcessChecker().run()
    for x in range(0, 10):
        proc = MyProcess().run()
        pid = proc.pid
        # Save the pid in some database...
        time.sleep(60)
This code works on a single instance. But with multiple replicas (pods), e.g. 2 replicas, we will have the same for loop twice, and so twice the number of processes.
The checker will also be launched twice.
Neither checker will have access to all of the Linux PIDs, because they are not in the same process namespace.
How can I let the checker also access the PIDs in the other replica pods in order to check them?
I have 4 different Python custom objects and an events queue. Each object has a method that allows it to retrieve an event from the shared events queue, process it if the type is the desired one, and then put a new event on the same events queue, allowing other processes to process it.
Here's an example.
import multiprocessing as mp

class CustomObject:
    def __init__(self, events_queue: mp.Queue) -> None:
        self.events_queue = events_queue

    def process_events_queue(self) -> None:
        event = self.events_queue.get()
        if type(event) == SpecificEventDataTypeForThisClass:
            # do something and create a new_event
            self.events_queue.put(new_event)
        else:
            self.events_queue.put(event)

    # there are other methods specific to each object
These 4 objects have specific tasks to do, but they all share this same structure. Since I need to "simulate" the production environment, I want them to run all at the same time, independently from each other.
Here's just an example of what I want to do, if possible.
import multiprocessing as mp
import CustomObject

if __name__ == '__main__':
    events_queue = mp.Queue()
    data_provider = mp.Process(target=CustomObject, args=(events_queue,))
    portfolio = mp.Process(target=CustomObject, args=(events_queue,))
    engine = mp.Process(target=CustomObject, args=(events_queue,))
    broker = mp.Process(target=CustomObject, args=(events_queue,))
    while True:
        data_provider.process_events_queue()
        portfolio.process_events_queue()
        engine.process_events_queue()
        broker.process_events_queue()
My idea is to run each object in a separate process, allowing them to communicate with events shared through the events_queue. So my question is, how can I do that?
The problem is that obj = mp.Process(target=CustomObject, args=(events_queue,)) returns a Process instance and I can't access the CustomObject methods from it. Also, is there a smarter way to achieve what I want?
Processes require a function to run, which defines what the process is actually doing. Once this function exits (and there are no non-daemon threads) the process is done. This is similar to how Python itself always executes a __main__ script.
If you do mp.Process(target=CustomObject, args=(events_queue,)) that just tells the process to call CustomObject - which instantiates it once and then is done. This is not what you want, unless the class actually performs work when instantiated - which is a bad idea for other reasons.
Instead, you must define a main function or method that handles what you need: "communicate with events shared through the events_queue". This function should listen to the queue and take action depending on the events received.
A simple implementation looks like this:
import os, time
from multiprocessing import Queue, Process

class Worker:
    # separate input and output for simplicity
    def __init__(self, commands: Queue, results: Queue):
        self.commands = commands
        self.results = results

    # our main function to be run by a process
    def main(self):
        # each process should handle more than one command
        while True:
            value = self.commands.get()
            # pick a well-defined signal to detect "no more work"
            if value is None:
                self.results.put(None)
                break
            # do whatever needs doing
            result = self.do_stuff(value)
            print(os.getpid(), ':', self, 'got', value, 'put', result)
            time.sleep(0.2)  # pretend we do something
            # pass on more work if required
            self.results.put(result)

    # placeholder for what needs doing
    def do_stuff(self, value):
        raise NotImplementedError
This is a template for a class that just keeps on processing events. The do_stuff method must be overridden to define what actually happens.
class AddTwo(Worker):
    def do_stuff(self, value):
        return value + 2

class TimesThree(Worker):
    def do_stuff(self, value):
        return value * 3

class Printer(Worker):
    def do_stuff(self, value):
        print(value)
This already defines fully working process payloads: Process(target=TimesThree(in_queue, out_queue).main) schedules the main method in a process, listening for and responding to commands.
Running this mainly requires connecting the individual components:
if __name__ == '__main__':
    # bookkeeping of resources we create
    processes = []
    start_queue = Queue()
    # connect our workers via queues
    queue = start_queue
    for element in (AddTwo, TimesThree, Printer):
        instance = element(queue, Queue())
        # we run the main method in processes
        processes.append(Process(target=instance.main))
        queue = instance.results
    # start all processes
    for process in processes:
        process.start()
    # send input, but do not wait for output
    start_queue.put(1)
    start_queue.put(248124)
    start_queue.put(-256)
    # send shutdown signal
    start_queue.put(None)
    # wait for processes to shutdown
    for process in processes:
        process.join()
Note that you do not need classes for this. You can also compose functions for a similar effect, as long as everything is pickle-able:
import os, time
from multiprocessing import Queue, Process

def main(commands, results, do_stuff):
    while True:
        value = commands.get()
        if value is None:
            results.put(None)
            break
        result = do_stuff(value)
        print(os.getpid(), ':', do_stuff, 'got', value, 'put', result)
        time.sleep(0.2)
        results.put(result)

def times_two(value):
    return value * 2

if __name__ == '__main__':
    in_queue, out_queue = Queue(), Queue()
    worker = Process(target=main, args=(in_queue, out_queue, times_two))
    worker.start()
    for message in (1, 3, 5, None):
        in_queue.put(message)
    while True:
        reply = out_queue.get()
        if reply is None:
            break
        print('result:', reply)
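For reference, with the inputs above the parent loop should print the doubled values (the worker's own diagnostic lines are interleaved):
result: 2
result: 6
result: 10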
What's the proper way to tell a looping thread to stop looping?
I have a fairly simple program that pings a specified host in a separate threading.Thread class. In this class it sleeps 60 seconds, then runs again until the application quits.
I'd like to implement a 'Stop' button in my wx.Frame to ask the looping thread to stop. It doesn't need to end the thread right away; it can just stop looping once it wakes up.
Here is my threading class (note: I haven't implemented looping yet, but it would likely fall under the run method in PingAssets)
class PingAssets(threading.Thread):
    def __init__(self, threadNum, asset, window):
        threading.Thread.__init__(self)
        self.threadNum = threadNum
        self.window = window
        self.asset = asset

    def run(self):
        config = controller.getConfig()
        fmt = config['timefmt']
        start_time = datetime.now().strftime(fmt)
        try:
            if onlinecheck.check_status(self.asset):
                status = "online"
            else:
                status = "offline"
        except socket.gaierror:
            status = "an invalid asset tag."
        msg = ("{}: {} is {}. \n".format(start_time, self.asset, status))
        wx.CallAfter(self.window.Logger, msg)
And in my wxPython Frame I have this function called from a Start button:
def CheckAsset(self, asset):
    self.count += 1
    thread = PingAssets(self.count, asset, self)
    self.threads.append(thread)
    thread.start()
Threaded stoppable function
Instead of subclassing threading.Thread, one can modify the function to allow stopping by a flag.
We need an object, accessible to the running function, on which we set the flag that stops it.
We can use the threading.currentThread() object.
import threading
import time

def doit(arg):
    t = threading.currentThread()
    while getattr(t, "do_run", True):
        print("working on %s" % arg)
        time.sleep(1)
    print("Stopping as you wish.")

def main():
    t = threading.Thread(target=doit, args=("task",))
    t.start()
    time.sleep(5)
    t.do_run = False

if __name__ == "__main__":
    main()
The trick is that additional properties can be attached to a running thread. The solution builds on these assumptions:
the thread has a property "do_run" with the default value True
the driving parent process can set the started thread's "do_run" property to False
Running the code, we get following output:
$ python stopthread.py
working on task
working on task
working on task
working on task
working on task
Stopping as you wish.
Pill to kill - using Event
Another alternative is to use a threading.Event as a function argument. It is False by default, but an external process can "set it" (to True), and the function can learn about this using the wait(timeout) function.
We can wait with a zero timeout, but we can also use it as the sleeping timer (as below).
def doit(stop_event, arg):
    while not stop_event.wait(1):
        print("working on %s" % arg)
    print("Stopping as you wish.")

def main():
    pill2kill = threading.Event()
    t = threading.Thread(target=doit, args=(pill2kill, "task"))
    t.start()
    time.sleep(5)
    pill2kill.set()
    t.join()
Edit: a note on Python 3 behavior: stop_event.wait() with no timeout blocks until the event is set (stalling the while loop), whereas stop_event.wait(timeout) returns the event's state (True once set), which is what the loop above relies on. A non-blocking alternative is to test stop_event.is_set() and sleep separately.
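For reference, a minimal sketch of the non-blocking variant with is_set(), assuming the same pill2kill event is passed in:
def doit(stop_event, arg):
    while not stop_event.is_set():
        print("working on %s" % arg)
        time.sleep(1)  # plain sleep instead of waiting on the event
    print("Stopping as you wish.")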
Stopping multiple threads with one pill
The advantage of the pill to kill is easier to see if we have to stop multiple threads at once, as one pill works for all.
The doit function will not change at all; only main handles the threads a bit differently.
def main():
    pill2kill = threading.Event()
    tasks = ["task ONE", "task TWO", "task THREE"]

    def thread_gen(pill2kill, tasks):
        for task in tasks:
            t = threading.Thread(target=doit, args=(pill2kill, task))
            yield t

    threads = list(thread_gen(pill2kill, tasks))
    for thread in threads:
        thread.start()
    time.sleep(5)
    pill2kill.set()
    for thread in threads:
        thread.join()
This has been asked before on Stack Overflow. See the following links:
Is there any way to kill a Thread in Python?
Stopping a thread after a certain amount of time
Basically you just need to set up the thread with a stop function that sets a sentinel value that the thread will check. In your case, you'll have something in your loop check the sentinel value to see if it's changed; if it has, the loop can break and the thread can die.
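A minimal sketch of that sentinel pattern (the class and attribute names here are illustrative, not from the question):
import threading
import time

class StoppableWorker(threading.Thread):
    def __init__(self):
        super(StoppableWorker, self).__init__()
        self._stop_requested = False  # the sentinel value

    def stop(self):
        self._stop_requested = True

    def run(self):
        while not self._stop_requested:
            # do one unit of work, then re-check the sentinel
            time.sleep(1)

w = StoppableWorker()
w.start()
time.sleep(3)
w.stop()   # the loop breaks on its next check
w.join()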
I read the other questions on Stack Overflow, but I was still a little confused about communicating across classes. Here is how I approached it:
I use a list to hold all my threads in the __init__ method of my wxFrame class: self.threads = []
As recommended in How to stop a looping thread in Python?, I use a signal in my thread class which is set to True when the thread class is initialized.
class PingAssets(threading.Thread):
    def __init__(self, threadNum, asset, window):
        threading.Thread.__init__(self)
        self.threadNum = threadNum
        self.window = window
        self.asset = asset
        self.signal = True

    def run(self):
        while self.signal:
            do_stuff()
            sleep()
and I can stop these threads by iterating over my threads:
def OnStop(self, e):
    for t in self.threads:
        t.signal = False
I had a different approach. I sub-classed the Thread class, and in the constructor I created an Event object. Then I wrote a custom join() method, which first sets this event and then calls the parent's version of itself.
Here is my class, which I'm using for serial port communication in a wxPython app:
import wx, threading, serial, Events, Queue

class PumpThread(threading.Thread):
    def __init__(self, port, queue, parent):
        super(PumpThread, self).__init__()
        self.port = port
        self.queue = queue
        self.parent = parent
        self.serial = serial.Serial()
        self.serial.port = self.port
        self.serial.timeout = 0.5
        self.serial.baudrate = 9600
        self.serial.parity = 'N'
        self.stopRequest = threading.Event()

    def run(self):
        try:
            self.serial.open()
        except Exception, ex:
            print("[ERROR]\tUnable to open port {}".format(self.port))
            print("[ERROR]\t{}\n\n{}".format(ex.message, ex.traceback))
            self.stopRequest.set()
        else:
            print("[INFO]\tListening port {}".format(self.port))
            self.serial.write("FLOW?\r")
        while not self.stopRequest.isSet():
            msg = ''
            if not self.queue.empty():
                try:
                    command = self.queue.get()
                    self.serial.write(command)
                except Queue.Empty:
                    continue
            while self.serial.inWaiting():
                char = self.serial.read(1)
                if '\r' in char and len(msg) > 1:
                    char = ''
                    #~ print('[DATA]\t{}'.format(msg))
                    event = Events.PumpDataEvent(Events.SERIALRX, wx.ID_ANY, msg)
                    wx.PostEvent(self.parent, event)
                    msg = ''
                    break
                msg += char
        self.serial.close()

    def join(self, timeout=None):
        self.stopRequest.set()
        super(PumpThread, self).join(timeout)

    def SetPort(self, serial):
        self.serial = serial

    def Write(self, msg):
        if self.serial.is_open:
            self.queue.put(msg)
        else:
            print("[ERROR]\tPort {} is not open!".format(self.port))

    def Stop(self):
        if self.isAlive():
            self.join()
The Queue is used for sending messages to the port, and the main loop takes responses back. I didn't use the serial.readline() method because of the different end-of-line characters, and I found the usage of the io classes to be too much fuss.
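For context, a minimal usage sketch (hypothetical: it assumes a running wxPython app, pyserial, and a made-up port name; Events is the author's own module):
import Queue, wx

app = wx.App(False)
frame = wx.Frame(None)                             # receives the posted data events
commands = Queue.Queue()
pump = PumpThread('/dev/ttyUSB0', commands, frame)  # hypothetical port
pump.start()
pump.Write("FLOW?\r")                              # queued; the thread's loop sends it
# ... later, e.g. from a Stop button handler:
pump.Stop()                                        # sets stopRequest and joins the thread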
It depends on what you run in that thread.
If it's your own code, you can implement a stop condition (see the other answers).
However, if you want to run someone else's code, you should fork and start a process, like this:
import multiprocessing
proc = multiprocessing.Process(target=your_proc_function, args=())
proc.start()
Now, whenever you want to stop that process, send it a SIGTERM like this:
proc.terminate()
proc.join()
And it's not slow: fractions of a second.
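For illustration, a self-contained sketch of the full round trip (your_proc_function here is a stand-in that just loops; on POSIX the child's exit code reflects the SIGTERM):
import multiprocessing, time

def your_proc_function():
    while True:          # stand-in for someone else's long-running code
        time.sleep(0.1)

if __name__ == '__main__':
    proc = multiprocessing.Process(target=your_proc_function)
    proc.start()
    time.sleep(1)
    proc.terminate()     # sends SIGTERM; returns almost immediately
    proc.join()
    print(proc.exitcode) # -15 on POSIX, i.e. the negative of SIGTERM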
Enjoy :)
My solution is:
import threading, time

def a():
    t = threading.currentThread()
    while getattr(t, "do_run", True):
        print('Do something')
        time.sleep(1)

def getThreadByName(name):
    threads = threading.enumerate()  # list of all alive threads
    for thread in threads:
        if thread.name == name:
            return thread

threading.Thread(target=a, name='228').start()  # init thread
t = getThreadByName('228')  # get thread by name
time.sleep(5)
t.do_run = False  # signal the thread to stop
t.join()
I find it useful to have a class, derived from threading.Thread, to encapsulate my thread functionality. You simply provide your own main loop in an overridden version of run() in this class. Calling start() arranges for the object’s run() method to be invoked in a separate thread.
Inside the main loop, periodically check whether a threading.Event has been set. Such an event is thread-safe.
Inside this class, you have your own join() method that sets the stop event object before calling the join() method of the base class. It can optionally take a time value to pass to the base class's join() method to ensure your thread is terminated in a short amount of time.
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, sleep_time=0.1):
        self._stop_event = threading.Event()
        self._sleep_time = sleep_time
        # call base class constructor
        super().__init__()

    def run(self):
        """Main control loop."""
        while not self._stop_event.is_set():
            # do work
            print("hi")
            self._stop_event.wait(self._sleep_time)

    def join(self, timeout=None):
        """Set stop event and join within a given time period."""
        self._stop_event.set()
        super().join(timeout)

if __name__ == "__main__":
    t = MyThread()
    t.start()
    time.sleep(5)
    t.join(1)  # wait 1 s max
Having a small sleep inside the main loop before checking the threading.Event is less CPU intensive than looping continuously. You can have a default sleep time (e.g. 0.1s), but you can also pass the value in the constructor.
Sometimes you don't have control over the running target. In those cases you can use signal.pthread_kill to send a stop signal.
from signal import pthread_kill, SIGTSTP
from threading import Thread
from itertools import count
from time import sleep

def target():
    for num in count():
        print(num)
        sleep(1)

thread = Thread(target=target)
thread.start()
sleep(5)
pthread_kill(thread.ident, SIGTSTP)
result
0
1
2
3
4
[14]+ Stopped
I came across a problem with multiprocessing:
from multiprocessing import Process

class PythonHelper(object):
    @staticmethod
    def run_in_parallel(*functions):
        processes = list()
        for function in functions:
            process = Process(target=function)
            process.start()
            processes.append(process)
        for process in processes:
            process.join()
The above static method is what I use to run several functions simultaneously (each in its own process). Everything was okay until I needed to force the remaining processes to terminate when one of the 'subprocesses' finished.
For example:
from PythonHelper import PythonHelper as ph
from Recorder import Recorder

class Logger(object):
    def run_recorder_proc(self):
        rec = Recorder()
        rec.record_video()

    def run_printer_proc(self):
        # hypothetical function: execution takes a long time
        for i in range(9000000):
            print("number: {}".format(i))

    def run_logger(self):
        ph.run_in_parallel(self.run_printer_proc, self.run_recorder_proc)
self.run_printer_proc and self.run_recorder_proc are my subprocesses. How do I 'kill' the remaining subprocess when one of them has completed?
Edit:
Full source code:
import os
from multiprocessing import Process

class PythonHelper(object):
    # with your fix
    @staticmethod
    def run_in_parallel(*functions):
        processes = {}
        for function in functions:
            process = Process(target=function)
            process.start()
            processes[process.pid] = process
        # wait for any process to complete
        pid, status = os.waitpid(-1, 0)
        # one process terminated
        # join it
        processes[pid].join()
        del processes[pid]
        # terminate the rest
        for process in processes.values():
            process.terminate()
        for process in processes.values():
            process.join()

class Logger(object):
    def run_numbers_1(self):
        for i in range(900000):
            print("number: {}".format(i))

    def run_numbers_2(self):
        for i in range(100000):
            print("number: {}".format(i))

    def run_logger(self):
        PythonHelper.run_in_parallel(self.run_numbers_1, self.run_numbers_2)

if __name__ == "__main__":
    logger = Logger()
    logger.run_logger()
Based on the above example, I would like to force termination of run_numbers_1 when run_numbers_2 completes.
You may achieve that by changing run_in_parallel() slightly:
def run_in_parallel(*functions):
    processes = {}
    for function in functions:
        process = Process(target=function)
        process.start()
        processes[process.pid] = process
    # wait for any process to complete
    pid, status = os.waitpid(-1, 0)
    # one process terminated
    # join it
    processes[pid].join()
    del processes[pid]
    # terminate the rest
    for process in processes.values():
        process.terminate()
    for process in processes.values():
        process.join()
[Update]
Based on your complete code, here is a working example. Instead of the race-prone os.waitpid() it uses an Event object, which the child processes set when completed:
from multiprocessing import Process, Event

class MyProcess(Process):
    def __init__(self, event, *args, **kwargs):
        self.event = event
        Process.__init__(self, *args, **kwargs)

    def run(self):
        Process.run(self)
        self.event.set()

class PythonHelper(object):
    # with your fix
    @staticmethod
    def run_in_parallel(*functions):
        event = Event()
        processes = []
        for function in functions:
            process = MyProcess(event, target=function)
            process.start()
            processes.append(process)
        # wait for any process to complete
        event.wait()
        # one process completed
        # terminate all child processes
        for process in processes:
            process.terminate()
        for process in processes:
            process.join()

class Logger(object):
    def run_numbers_1(self):
        for i in range(90000):
            print("1 number: {}".format(i))

    def run_numbers_2(self):
        for i in range(10000):
            print("2 number: {}".format(i))

    def run_logger(self):
        PythonHelper.run_in_parallel(self.run_numbers_1, self.run_numbers_2)

if __name__ == "__main__":
    logger = Logger()
    logger.run_logger()
I have a Manager (main thread) that creates other Threads to handle various operations.
I would like my Manager to be notified when a Thread it created ends (when the run() method finishes executing).
I know I could do it by checking the status of all my threads with Thread.is_alive(), but polling sucks, so I wanted to have notifications.
I was thinking of giving a callback method to the Threads, and calling this function at the end of the run() method:
class Manager():
    ...
    MyThread(self.on_thread_finished).start()  # How do I pass the callback?

    def on_thread_finished(self, data):
        pass
    ...

class MyThread(Thread):
    ...
    def run(self):
        ...
        self.callback(data)  # How do I call the callback?
    ...
Thanks!
The thread can't call the manager unless it has a reference to the manager. The easiest way for that to happen is for the manager to give it to the thread at instantiation.
class Manager(object):
    def new_thread(self):
        return MyThread(parent=self)

    def on_thread_finished(self, thread, data):
        print(thread, data)

class MyThread(Thread):
    def __init__(self, parent=None):
        self.parent = parent
        super(MyThread, self).__init__()

    def run(self):
        # ...
        self.parent and self.parent.on_thread_finished(self, 42)

mgr = Manager()
thread = mgr.new_thread()
thread.start()
If you want to be able to assign an arbitrary function or method as a callback, rather than storing a reference to the manager object, this becomes a bit problematic because of method wrappers and such. It's hard to design the callback so it gets a reference to both the manager and the thread, which is what you will want. I worked on that for a while and did not come up with anything I'd consider useful or elegant.
Anything wrong with doing it this way?
from threading import Thread

class Manager():
    def Test(self):
        MyThread(self.on_thread_finished).start()

    def on_thread_finished(self, data):
        print("on_thread_finished:", data)

class MyThread(Thread):
    def __init__(self, callback):
        Thread.__init__(self)
        self.callback = callback

    def run(self):
        data = "hello"
        self.callback(data)

m = Manager()
m.Test()  # prints "on_thread_finished: hello"
If you want the main thread to wait for child threads to finish execution, you are probably better off using some kind of synchronization mechanism. If you simply want to be notified when one or more threads have finished executing, a Condition is enough:
import threading

class MyThread(threading.Thread):
    def __init__(self, condition):
        threading.Thread.__init__(self)
        self.condition = condition

    def run(self):
        print("%s done" % threading.current_thread())
        with self.condition:
            self.condition.notify()

condition = threading.Condition()
condition.acquire()
thread = MyThread(condition)
thread.start()
condition.wait()
However, using a Queue is probably better, as it makes handling multiple worker threads a bit easier.
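For illustration, a minimal sketch of the Queue-based variant (names are illustrative): each worker reports completion on a shared queue, and the main thread blocks on get() for each one:
import threading
import queue

def worker(done_queue, name):
    # ... do the actual work here ...
    done_queue.put(name)           # notify the main thread

done_queue = queue.Queue()
names = ["worker-1", "worker-2", "worker-3"]
for name in names:
    threading.Thread(target=worker, args=(done_queue, name)).start()

for _ in names:
    finished = done_queue.get()    # blocks until some thread finishes
    print(finished, "has finished")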