I want to make a synchronized simulation of graph coloring. To create the graph (a tree) I am using the igraph package, and for synchronization I am using the multiprocessing package for the first time. I built a graph where each node has the attributes: label, color and parentColor. To color the tree I execute the following function (I am not giving the full code because it is very long, and I think it is not necessary to solve my problem):
def sixColor(self):
    root = self.graph.vs.find("root")
    root["color"] = self.takeColorFromList(root["label"])
    self.sendToChildren(root)
    lista = []
    for e in self.graph.vs():
        lista.append(e.index)
    p = multiprocessing.Pool(len(lista))
    p.map(fun, zip([self]*len(lista), lista), chunksize=300)
def process_sixColor(self, id):
    v = self.graph.vs.find(id)
    if not v["name"] == "root":
        while True:
            if v["received"] == True:
                v["received"] = False
                #------------Part 1-----------
                self.sendToChildren(v)
                self.printInfo()
                #-----------Part 2-------------
                diffIdx = self.compareLabelWithParent(v)
                if not diffIdx == -1:
                    diffIdxStr = str(bin(diffIdx))[2:]
                    charAtPos = (v["label"][::-1])[diffIdx]
                    newLabel = diffIdxStr + charAtPos
                    v["label"] = newLabel
                    self.sendToChildren(v)
                    colorNum = int(newLabel, 2)
                    if colorNum in sixColorList:
                        v["color"] = self.takeColorFromList(newLabel)
                        self.printGraph()
                        break
I want each node (except the root) to call the function process_sixColor synchronously in parallel, and not evaluate Part 2 before Part 1 has been completed by all nodes. But I notice that this is not working properly and some nodes evaluate Part 2 before every other node has executed Part 1. How can I solve that problem?
You can use a combination of a multiprocessing.Queue and a multiprocessing.Event object to synchronize the workers. Make the main process create a Queue and an Event and pass both to all the workers. The Queue will be used by the workers to let the main process know that they are finished with part 1. The Event will be used by the main process to let all the workers know that all the workers are finished with part 1. Basically:
The workers will call queue.put() to let the main process know that they have reached part 2 and then call event.wait() to wait for the main process to give the green light.
The main process will repeatedly call queue.get() until it receives as many messages as there are workers in the worker pool and then call event.set() to give the green light for the workers to start with part 2.
This is a simple example:
from __future__ import print_function
from multiprocessing import Event, Process, Queue

def worker(identifier, queue, event):
    # Part 1
    print("Worker {0} reached part 1".format(identifier))
    # Let the main process know that we have finished part 1
    queue.put(identifier)
    # Wait for all the other processes
    event.wait()
    # Start part 2
    print("Worker {0} reached part 2".format(identifier))

def main():
    queue = Queue()
    event = Event()
    processes = []
    num_processes = 5
    # Create the worker processes
    for identifier in range(num_processes):
        process = Process(target=worker, args=(identifier, queue, event))
        processes.append(process)
        process.start()
    # Wait for "part 1 completed" messages from the processes
    while num_processes > 0:
        queue.get()
        num_processes -= 1
    # Set the event now that all the processes have reached part 2
    event.set()
    # Wait for the processes to terminate
    for process in processes:
        process.join()

if __name__ == "__main__":
    main()
If you want to use this in a production environment, you should think about how to handle errors that occur in part 1. Right now if an exception happens in part 1, the worker will never call queue.put() and the main process will block indefinitely waiting for the message from the failed worker. A production-ready solution should probably wrap the entire part 1 in a try..except block and then send a special error signal in the queue. The main process can then exit immediately if the error signal is received in the queue.
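For example, here is a minimal sketch of that idea, reusing the worker signature from the example above; the ("OK"/"ERROR", identifier) message format and the placement of the try/except are my own choices for illustration, not something from the code above:

# Sketch: report failures from part 1 so the main process never blocks forever.
def worker(identifier, queue, event):
    try:
        pass  # ... part 1 work goes here ...
    except Exception:
        queue.put(("ERROR", identifier))  # tell the main process part 1 failed
        return
    queue.put(("OK", identifier))         # part 1 finished normally
    event.wait()                          # wait for the green light
    # ... part 2 work goes here ...

The main process would then inspect the first element of each message it takes from the queue and, on "ERROR", terminate the remaining workers and exit instead of waiting for messages that will never arrive.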
Related
I'm trying to launch a function (my_function) and stop its execution after a certain time is reached.
So I turned to the multiprocessing library, and everything works well. Here is the code, where my_function() has been changed to only create a dummy message.
from multiprocessing import Queue, Process
from multiprocessing.queues import Empty
import time

timeout = 1
# timeout = 3

def my_function(something):
    time.sleep(2)
    return f'my message: {something}'

def wrapper(something, queue):
    message = "too late..."
    try:
        message = my_function(something)
        return message
    finally:
        queue.put(message)

try:
    queue = Queue()
    params = ("hello", queue)
    child_process = Process(target=wrapper, args=params)
    child_process.start()
    output = queue.get(timeout=timeout)
    print(f"ok: {output}")
except Empty:
    timeout_message = f"Timeout {timeout}s reached"
    print(timeout_message)
finally:
    if 'child_process' in locals():
        child_process.kill()
You can test and verify that, depending on timeout=1 or timeout=3, I can trigger an error or not.
My main problem is that the real my_function() is a torch model inference for which I would like to limit the number of threads (to 4, let's say).
One can easily do so if my_function were in the main process, but in my example I tried a lot of tricks to limit it in the child process without any success (using threadpoolctl.threadpool_limits(4), torch.set_num_threads(4), os.environ["OMP_NUM_THREADS"]=4, os.environ["MKL_NUM_THREADS"]=4).
I'm completely open to other solutions that can monitor the execution time of a function while limiting the number of threads used by that function.
Thanks and regards.
You can limit the number of simultaneous processes with Pool (https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool).
You can also set the maximum number of tasks done per child (maxtasksperchild). Check it out.
Here you have a sample from superfastpython by Jason Brownlee:
# SuperFastPython.com
# example of limiting the number of tasks per child in the process pool
from time import sleep
from multiprocessing.pool import Pool
from multiprocessing import current_process

# task executed in a worker process
def task(value):
    # get the current process
    process = current_process()
    # report a message
    print(f'Worker is {process.name} with {value}', flush=True)
    # block for a moment
    sleep(1)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool(2, maxtasksperchild=3) as pool:
        # issue tasks to the process pool
        for i in range(10):
            pool.apply_async(task, args=(i,))
        # close the process pool
        pool.close()
        # wait for all tasks to complete
        pool.join()
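For the timeout part of the question, one option (not shown in the sample above) is that apply_async() returns an AsyncResult whose get() accepts a timeout and raises multiprocessing.TimeoutError if the task takes too long. A rough sketch, with dummy_inference standing in for the real model call:

# Sketch: enforce a time limit on a pooled task via AsyncResult.get(timeout=...)
from multiprocessing.pool import Pool
from multiprocessing import TimeoutError
from time import sleep

# placeholder for the real (slow) inference function
def dummy_inference(something):
    sleep(2)
    return f'my message: {something}'

if __name__ == '__main__':
    with Pool(1) as pool:
        result = pool.apply_async(dummy_inference, args=('hello',))
        try:
            print(f"ok: {result.get(timeout=1)}")
        except TimeoutError:
            print("Timeout 1s reached")
    # leaving the `with` block terminates the pool, so the overdue worker is killed

Note that the timeout only bounds how long the caller waits; whether the inference itself respects a thread limit is a separate question.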
My class gets work from a different thread/interface, and it has to process that work with a configured delay time.
def getJob(job):
    work = self._getNextWorkToRun(job)
    if work is None:
        return {}
    # proceed to do work
Jobs are sent to this class by a different package. I want to call the _getNextWorkToRun() method only once every five minutes, but jobs come in every second or even more often, so I have to wait five minutes before calling _getNextWorkToRun() again with a new job. Every job has a reference (JOB1, JOB2, etc.), and all the jobs have to be completed with a delay of 5 minutes between them.
What is the best way to achieve this?
Below is an example using threads: jobs can be added to the job queue at any time from any other function, and a get_job() function runs continuously to monitor the queue and process jobs at a fixed interval until it receives a stop flag.
from threading import Thread
from queue import Queue
import time
from random import random

jobs = Queue()  # queue safely used between threads to pass jobs
run_flag = True

def job_feeder():
    for i in range(10):
        # adding a job to jobs queue, job could be anything, here we just add a string for simplicity
        jobs.put(f'job-{i}')
        print(f'adding job-{i}')
        time.sleep(random())  # simulate adding jobs randomly
    print('job_feeder() finished')

def get_job():
    while run_flag:
        if jobs.qsize():  # check if there is any jobs in queue first
            job = jobs.get()  # getting the job
            print(f'executing {job}')
            time.sleep(3)
    print('get_job finished')

t1 = Thread(target=job_feeder)
t2 = Thread(target=get_job)
t1.start()
t2.start()

# we can make get_job() thread quit anytime by setting run_flag
time.sleep(20)
run_flag = False

# waiting for threads to quit
t1.join()
t2.join()
print('all clear')
output:
adding job-0
executing job-0
adding job-1
adding job-2
adding job-3
adding job-4
adding job-5
adding job-6
adding job-7
executing job-1
adding job-8
adding job-9
job_feeder() finished
executing job-2
executing job-3
executing job-4
executing job-5
executing job-6
get_job finished
all clear
Note: get_job() processed only 6 jobs because we sent the quit signal after 20 seconds.
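If you wanted every queued job to be processed before the worker quits, a sentinel value could be used instead of the run_flag. This is a small sketch of that variant, not part of the example above:

from threading import Thread
from queue import Queue
import time

jobs = Queue()

def feeder():
    for i in range(10):
        jobs.put(f'job-{i}')
    jobs.put(None)          # sentinel: no more jobs will follow

def worker():
    while True:
        job = jobs.get()    # blocks until a job (or the sentinel) arrives
        if job is None:     # sentinel seen: every queued job was handled
            break
        print(f'executing {job}')
        time.sleep(3)       # fixed interval between jobs

t1 = Thread(target=feeder)
t2 = Thread(target=worker)
t1.start()
t2.start()
t1.join()
t2.join()
print('all jobs processed')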
I need to react in a main process to random events happening in a child process. I have implemented this with a queue between the main and the child process, and a 'queue poller' running in a secondary thread of the main process and calling a callback function each time it finds an item in the queue. The code is below and seems to work.
Question 1: Could you please tell me if the strategy is correct or if something simpler exists?
Question 2: I tried to have both the child process and the secondary thread terminated when stopping the main loop, but it fails, at least in Spyder. What should I do to terminate everything properly?
Thanks for your help :-)
from threading import Thread
from multiprocessing import Process, Queue
from time import sleep
from random import random

class MyChildProcess(Process):
    """
    This process runs as a child process of the main process.
    It fills a queue (instantiated in the main process - main thread) at random times.
    """
    def __init__(self, queue):
        super(MyChildProcess, self).__init__()
        self._q = queue  # memorizes the queue
        self._i = 0      # attribute to be incremented and put in the queue

    def run(self):
        while True:
            self._q.put(self._i)  # puts in the queue
            self._i += 1          # increment for next time
            sleep(random())       # wait between 0 and 1s

class myListenerInSeparateThreadOfMainProcess():
    """
    This listener runs in a secondary thread of the main process.
    It polls a queue and calls back a function for each item found.
    """
    def __init__(self, queue, callbackFunction):
        self._q = queue               # memorizes the queue
        self._cbf = callbackFunction  # memorizes the callback function
        self.pollQueue()

    def pollQueue(self):
        while True:
            sleep(0.2)  # polls 5 times a second max
            self.readQueue()

    def readQueue(self):
        while not self._q.empty():    # empties the queue each time
            self._cbf(self._q.get())  # calls the callback function for each item

def runListener(q, cbf):
    """Target function for the secondary thread"""
    myListenerInSeparateThreadOfMainProcess(q, cbf)

def callBackFunc(*args):
    """This is my reacting function"""
    print 'Main process gets data from queue: ', args

if __name__ == '__main__':
    q = Queue()
    t = Thread(target=runListener, args=(q, callBackFunc))
    t.daemon = True  # try to have the secondary thread terminated if main thread terminates
    t.start()
    p = MyChildProcess(q)
    p.daemon = True  # try to have the child process terminated if parent process terminates
    p.start()        # no target scheme and no parent blocking by join
    while True:      # this is the main application loop
        sleep(2)
        print 'In main loop doing something independant from the rest'
Here is what I get:
Main process gets data from queue: (0,)
Main process gets data from queue: (1,)
Main process gets data from queue: (2,)
Main process gets data from queue: (3,)
In main loop doing something independant from queue management
Main process gets data from queue: (4,)
Main process gets data from queue: (5,)
Main process gets data from queue: (6,)
Main process gets data from queue: (7,)
In main loop doing something independant from queue management
Main process gets data from queue: (8,)
Main process gets data from queue: (9,)
In main loop doing something independant from queue management
...
General observations:
class MyChildProcess
You don't need to create separate classes for the child process and listener thread. Simple functions can work.
pollQueue
You can use a blocking get() call in the listener thread. This will make that thread more efficient.
Shutting Down
You can kill a Process with a signal, but it's harder (really impossible) to kill a thread. Your shutdown
routine will depend on how you want to handle items which are still in the queue.
If you don't care about processing items remaining in the queue when shutting down, you can
simply send a TERM signal to the child process and exit the main thread. Since the listener
thread has its .daemon attribute set to True it will also exit.
If you do care about processing items in the queue at shutdown time, you should
inform the listener thread to exit its processing loop by sending a special sentinel value
and then joining on that thread to wait for it to exit.
Here is an example which incorporates the above ideas. I have chosen None for the sentinel value.
#!/usr/bin/env python
from threading import Thread
from multiprocessing import Process, Queue
from time import sleep
from random import random
import os
import signal

def child_process(q):
    i = 1
    while True:
        q.put(i)
        i += 1
        sleep( random() )

def listener_thread(q, callback):
    while True:
        item = q.get()  # this will block until an item is ready
        if item is None:
            break
        callback(item)

def doit(item):
    print "got:", item

def main():
    q = Queue()

    # start up the child process:
    child = Process(target=child_process, args=(q,))
    child.start()

    # start up the listener
    listener = Thread(target=listener_thread, args=(q, doit))
    listener.daemon = True
    listener.start()

    sleep(5)
    print "Exiting"

    os.kill( child.pid, signal.SIGTERM )
    q.put(None)
    listener.join()

main()
Update: with the help of dano, I solved this problem.
I didn't invoke join() on the producers, which made my script hang.
Only one line needs to be added, as dano said:
...
producer = multiprocessing.Process(target=produce,args=(file_queue,row_queue))
producer.daemon = True
producer.start()
...
Old script:
import multiprocessing
import Queue

QUEUE_SIZE = 2000

def produce(file_queue, row_queue,):
    while not file_queue.empty():
        src_file = file_queue.get()
        zip_reader = gzip.open(src_file, 'rb')
        try:
            csv_reader = csv.reader(zip_reader, delimiter=SDP_DELIMITER)
            for row in csv_reader:
                new_row = process_sdp_row(row)
                if new_row:
                    row_queue.put(new_row)
        finally:
            zip_reader.close()

def consume(row_queue):
    '''processes all rows, once queue is empty, break the infinit loop'''
    while True:
        try:
            # takes a row from queue and process it
            pass
        except multiprocessing.TimeoutError as toe:
            print "timeout, all rows have been processed, quit."
            break
        except Queue.Empty:
            print "all rows have been processed, quit."
            break
        except Exception as e:
            print "critical error"
            print e
            break

def main(args):
    file_queue = multiprocessing.Queue()
    row_queue = multiprocessing.Queue(QUEUE_SIZE)
    file_queue.put(file1)
    file_queue.put(file2)
    file_queue.put(file3)

    # starts 3 producers
    for i in xrange(4):
        producer = multiprocessing.Process(target=produce, args=(file_queue, row_queue))
        producer.start()

    # starts 1 consumer
    consumer = multiprocessing.Process(target=consume, args=(row_queue,))
    consumer.start()

    # blocks main thread until consumer process finished
    consumer.join()

    # prints statistics results after consumer is done
    sys.exit(0)

if __name__ == "__main__":
    main(sys.argv[1:])
Purpose:
I am using Python 2.7 multiprocessing to start 3 producers that read 3 files at the same time and put the file lines into a row_queue, plus 1 consumer that does further processing on all the rows. Statistics are printed in the main thread after the consumer is done, so I use the join() method, and finally I invoke sys.exit(0) to quit the script.
Problem:
The script does not quit.
I tried to replace sys.exit(0) with print "the end", and "the end" was displayed on the console. Am I doing something wrong? Why doesn't the script quit, and how can I make it quit? Thanks.
Your producers do not have the multiprocessing.Process.daemon property set:
daemon
The process’s daemon flag, a Boolean value. This must be set before start() is called.
The initial value is inherited from the creating process.
When a process exits, it attempts to terminate all of its daemonic child processes.
Note that a daemonic process is not allowed to create child processes. Otherwise a daemonic process would leave its children orphaned if it gets terminated when its parent process exits. Additionally, these are not Unix daemons or services, they are normal processes that will be terminated (and not joined) if non-daemonic processes have exited.
https://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process.daemon
Just add producer.daemon = True:
...
producer = multiprocessing.Process(target=produce,args=(file_queue,row_queue))
producer.daemon = True
producer.start()
...
That should make it possible for the whole program to end when the consumer is joined.
By the way, you should probably join the producers too.
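A minimal sketch of what that could look like, reusing the names from the script above (the None sentinel for the consumer is an extra suggestion, not something your current consume() handles):

# keep references to the producers so they can be joined later
producers = []
for i in xrange(4):
    producer = multiprocessing.Process(target=produce, args=(file_queue, row_queue))
    producer.daemon = True
    producer.start()
    producers.append(producer)

consumer = multiprocessing.Process(target=consume, args=(row_queue,))
consumer.start()

for producer in producers:
    producer.join()     # wait until every producer has pushed all of its rows

row_queue.put(None)     # optional sentinel so the consumer knows no more rows are coming
consumer.join()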
This is the problem I have: I'm using Python 2.7, and I have code which runs in a thread and has a critical region that only one thread should execute at a time. That code currently has no mutex mechanism, so I wanted to ask what I could use for my specific use case, which involves "dropping" "queued" functions. I've tried to simulate that behavior with the following minimal working example:
useThreading = False  # True

if useThreading: from threading import Thread, Lock
else: from multiprocessing import Process, Lock

mymutex = Lock()

import time
tstart = None

def processData(data):
    #~ mymutex.acquire()
    try:
        print('thread {0} [{1:.5f}] Do some stuff'.format(data, time.time()-tstart))
        time.sleep(0.5)
        print('thread {0} [{1:.5f}] 1000'.format(data, time.time()-tstart))
        time.sleep(0.5)
        print('thread {0} [{1:.5f}] done'.format(data, time.time()-tstart))
    finally:
        #~ mymutex.release()
        pass

# main:
tstart = time.time()
for ix in xrange(0,3):
    if useThreading: t = Thread(target = processData, args = (ix,))
    else: t = Process(target = processData, args = (ix,))
    t.start()
    time.sleep(0.001)
Now, if you run this code, you get a printout like this:
thread 0 [0.00173] Do some stuff
thread 1 [0.00403] Do some stuff
thread 2 [0.00642] Do some stuff
thread 0 [0.50261] 1000
thread 1 [0.50487] 1000
thread 2 [0.50728] 1000
thread 0 [1.00330] done
thread 1 [1.00556] done
thread 2 [1.00793] done
That is to say, the three threads quickly get "queued" one after another (something like 2-3 ms apart). Actually, they don't get queued; they simply start executing in parallel, each about 2-3 ms after the previous one.
Now, if I enable the mymutex.acquire()/.release() commands, I get what would be expected:
thread 0 [0.00174] Do some stuff
thread 0 [0.50263] 1000
thread 0 [1.00327] done
thread 1 [1.00350] Do some stuff
thread 1 [1.50462] 1000
thread 1 [2.00531] done
thread 2 [2.00547] Do some stuff
thread 2 [2.50638] 1000
thread 2 [3.00706] done
Basically, now with locking, the threads don't run in parallel, but they run one after another thanks to the lock - as long as one thread is working, the others will block at the .acquire(). But this is not exactly what I want to achieve, either.
What I want to achieve is this: let's assume that when .acquire() is first triggered by a thread function, it registers an id of a function (say a pointer to it) in a queue. After that, the behavior is basically the same as with the Lock - while the one thread works, the others block at .acquire(). When the first thread is done, it goes in the finally: block - and here, I'd like to check to see how many threads are waiting in the queue; then I'd like to delete/drop all waiting threads except for the very last one - and finally, I'd .release() the lock; meaning that after this, what was the last thread in the queue would execute next. I'd imagine, I would want to write something like the following pseudocode:
...
finally:
    if (len(mymutex.queue) > 2): # more than this instance plus one other waiting:
        while (len(mymutex.queue) > 2):
            mymutex.queue.pop(1) # leave alone [0]=this instance, remove next element
    # at this point, there should be only queue[0]=this instance, and queue[1]=what was the last thread queued previously
    mymutex.release() # once we release, queue[0] should be gone, and the next in the queue should acquire the mutex/lock..
    pass
...
With that, I'd expect a printout like this:
thread 0 [0.00174] Do some stuff
thread 0 [0.50263] 1000
thread 0 [1.00327] done
# here upon lock release, thread 1 would be deleted - and the last one in the queue, thread 2, would acquire the lock next:
thread 2 [1.00350] Do some stuff
thread 2 [1.50462] 1000
thread 2 [2.00531] done
What would be the most straightforward way to achieve this in Python?
Seems like you want a queue-like behaviour, so why not use Queue?
import threading
from Queue import Queue
import time

# threads advertise to this queue when they're waiting
wait_queue = Queue()
# threads get their task from this queue
task_queue = Queue()

def do_stuff():
    print "%s doing stuff" % str(threading.current_thread())
    time.sleep(5)

def queue_thread(sleep_time):
    # advertise current thread waiting
    time.sleep(sleep_time)
    wait_queue.put("waiting")

    # wait for permission to pass
    message = task_queue.get()
    print "%s got task: %s" % (threading.current_thread(), message)

    # unregister current thread waiting
    wait_queue.get()

    if message == "proceed":
        do_stuff()
        # kill size-1 threads waiting
        for _ in range(wait_queue.qsize() - 1):
            task_queue.put("die")
        # release last
        task_queue.put("proceed")

    if message == "die":
        print "%s died without doing stuff" % threading.current_thread()
        pass

t1 = threading.Thread(target=queue_thread, args=(1, ))
t2 = threading.Thread(target=queue_thread, args=(2, ))
t3 = threading.Thread(target=queue_thread, args=(3, ))
t4 = threading.Thread(target=queue_thread, args=(4, ))

# allow first thread to pass
task_queue.put("proceed")

t1.start()
t2.start()
t3.start()
t4.start()
Thread 1 arrives first and "acquires" the section; the other threads come later to wait at the queue (and advertise that they're waiting). Then, when thread 1 leaves, it gives permission to the last thread in the queue by telling all the other threads to die and the last one to proceed.
You can have finer control using different messages; a typical one would be a thread id in the wait_queue (so you know who is waiting, and the order in which they arrived).
You can probably use non-blocking operations (queue.put(block=False) and queue.get(block=False)) in your favour once you have settled on exactly what you need.
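For example, here is a small sketch of the thread-id variant; the helper names and the ("die"/"proceed", name) message format are made up for illustration and would still need to be wired into the worker logic above:

# Sketch: waiters advertise their names so the finishing thread knows who is
# waiting and in which order they arrived, and can release only the newest one.
import threading

def advertise_waiting(wait_queue):
    wait_queue.put(threading.current_thread().name)  # e.g. "Thread-3"

def release_last_waiter(wait_queue, task_queue):
    waiters = []
    while not wait_queue.empty():
        waiters.append(wait_queue.get(block=False))  # non-blocking drain (racy in general, fine for a sketch)
    for name in waiters[:-1]:
        task_queue.put(("die", name))                # drop everyone but the newest waiter
    if waiters:
        task_queue.put(("proceed", waiters[-1]))     # newest waiter runs the critical section next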