I have the following snippet which attempts to split processing across multiple sub-processes.
def search(self):
    print("Checking queue for jobs to process")
    if self._job_queue.has_jobs_to_process():
        print("Queue threshold met, processing jobs.")
        job_sub_lists = partition_jobs(self._job_queue.get_jobs_to_process(), self._process_pool_size)
        populated_sub_lists = [sub_list for sub_list in job_sub_lists if len(sub_list) > 0]
        self._process_pool.map(process, populated_sub_lists)
        print("Job processing pool mapped")
The search function is called by the main process in a while loop; if the queue reaches a threshold count, the processing pool is mapped to the process function with the jobs sourced from the queue. My question is: does the Python multiprocessing pool block the main process during execution, or does it immediately continue execution? I don't want the scenario where has_jobs_to_process() evaluates to true and, while those jobs are still being processed, it evaluates to true again for another set of jobs, so that self._process_pool.map(process, populated_sub_lists) is called a second time, as I do not know the consequences of calling map again while processes are running.
multiprocessing.Pool.map blocks the calling thread (not necessarily the MainThread!), not the whole process.
Other threads of the parent process will not be blocked. You could call pool.map from multiple threads in the parent process without breaking things (it doesn't make much sense, though). That's because Pool uses a thread-safe queue.Queue internally for its _taskqueue.
Per the multiprocessing docs, Pool.map blocks the calling process during execution until the result is ready, whereas Pool.map_async does not.
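As a rough illustration of that difference, here is a minimal sketch (the slow_square function and the pool size are just assumptions for the example):

import multiprocessing
import time

def slow_square(x):
    time.sleep(1)
    return x * x

if __name__ == '__main__':
    with multiprocessing.Pool(processes=2) as pool:
        # map blocks the calling thread until every result is ready.
        print(pool.map(slow_square, [1, 2, 3, 4]))        # [1, 4, 9, 16]

        # map_async returns an AsyncResult immediately; the caller keeps running.
        async_result = pool.map_async(slow_square, [1, 2, 3, 4])
        print("map_async returned, doing other work...")
        print(async_result.get())                         # blocks only when we ask for the results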
Related
In this example is there any real difference or is it just syntactic sugar?
threads = []
for job in jobs:
    t = threading.Thread(target=job, args=[exchange])
    t.start()
    threads.append(t)
for thread in threads:
    thread.join()
And
with concurrent.futures.ThreadPoolExecutor(max_workers=len(jobs)) as executor:
    for job in jobs:
        executor.submit(job, exchange)
The main point of a thread pool should be to reuse threads, but in this example all the threads have exited after the with statement, am I right?
How can I achieve reuse? By keeping an instance of the ThreadPoolExecutor alive somewhere, without the with statement?
You can keep the ThreadPool alive somewhere else for as long as you need. But in this particular case you probably want to utilize the result of .submit like this:
with concurrent.futures.ThreadPoolExecutor(max_workers=len(jobs)) as executor:
    futures = []
    for job in jobs:
        future = executor.submit(job, exchange)
        futures.append(future)
    for future in futures:
        future.result()
which is very similar to raw threads, except that the threads are reused and with future.result() we can retrieve the return value (if any) and catch exceptions (you may want to wrap the future.result() call in try/except).
By the way, I wouldn't use max_workers=len(jobs); that seems to defeat the point of a thread pool. I also encourage you to have a look at the async APIs instead. Threads are of limited use in Python anyway.
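To make the reuse point concrete, here is a minimal sketch (the module-level executor, the worker count of 4, and the run_jobs helper are assumptions for illustration) of keeping a single executor alive instead of recreating it in a with block:

import atexit
import concurrent.futures

# Created once and reused for every batch of jobs; worker threads stay alive between batches.
executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
atexit.register(executor.shutdown)    # clean up the worker threads at interpreter exit

def run_jobs(jobs, exchange):
    futures = [executor.submit(job, exchange) for job in jobs]
    return [future.result() for future in futures]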
What you're asking is like asking whether there is any real difference between owning a truck, and renting a truck just on the days when you need it.
A thread is like the truck. A thread pool is like the truck rental company. Any time you create a thread pool, you are indirectly creating threads—probably more than one.
Creating and destroying threads is a costly operation. Thread pools are useful in programs that continually create many small tasks that need to be performed in different threads. Instead of creating and destroying a new thread for each task, the program submits the task to a thread pool, and the thread pool assigns the tasks to one of its worker threads. The worker threads can live a long time. They don't need to be continually created and destroyed because each one can perform any number of tasks.
If the "tasks" that your program creates need to run for almost as long as the whole program itself, Then it might make more sense just to create raw threads for that. But, if your program creates many short-lived tasks, then the thread pool probably is the better choice.
I'd like to get some feedback on an approach for receiving data from multiple threads in a concurrent.futures.ThreadPoolExecutor and iterating over the results. The scenario: a ThreadPoolExecutor appends the results of its futures to a buffer container, and a secondary, decoupled operation reads from and withdraws from that same buffer container.
Thread Manager Workflow
                     /|-> Thread 1 > results \
ThreadPoolExecutor --|-> Thread 2 > results --> Queue [1,2,3] (end)
                     \|-> Thread 3 > results /
Now we have the results from the threads in a first-in-first-out queue container, which needs to be thread-safe. At this point the above process is done, and the results (str | int | bool | list | dict | any) are in the container awaiting processing by the next step: communicating the gathered results.
Communication Workflow
                                          /|-> Terminal Print
Queue [1,2,3] < Listener > Communicate --|-> Speech Engine Say
                                          \|-> Write to Log / File
The Communicate class needs to be "listening" on the Queue for new entries and process each one as it comes in at its own speed (the rate of speech of a text-to-speech module, i.e. the classic producer-consumer problem), plus potentially any number of other outputs, so this really can't be invoked from the top down. If the Thread Manager calls the Communicate class directly, or lets each thread call it directly to invoke the Speech Engine, we will hear stuttered speech because the speech engine overrides itself with each invocation. Thus we need to decouple the Thread Manager workflow from the Communicate workflow, but have them write to and read from a shared In/Out buffer or Queue, hence the need for a "listener" concept.
I've found references to a structure like the following running as a daemon thread, but the while loop makes me cringe and consumes too much CPU, so I still need a non-blocking approach, where self.pipeline is a queue.Queue object:
while True:
    try:
        if not self.pipeline.empty():
            task = self.pipeline.get(timeout=1)
            if task:
                self.serve(task)
    except queue.Empty:
        continue
Again, in need of something other than a while loop for this...
As you write in the comments, it's a standard producer-consumer problem. One solution in Python is multithreading plus the Queue class.
The queue is thread-safe. It uses a mutex internally, so you don't have to busy-wait yourself.
Queue.get will eventually call wait internally. This blocks the calling thread, but instead of busy-waiting and using CPU, the thread is put into a sleep state. The OS thread scheduler takes over from there and wakes the thread up when items become available (simplified).
So you can still have while True loops within multiple consumer threads that call queue.get on the shared queue. If items are available, the threads process them directly; if not, they go into sleep mode and free the CPU. The same goes for producer threads: they simply call Queue.put.
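As a minimal sketch of that pattern applied to the listener from the question (the None shutdown sentinel and the use of print as the serve step are assumptions for illustration), get simply blocks instead of busy-waiting:

import queue
import threading

def listen(pipeline, serve):
    while True:
        task = pipeline.get()    # blocks (sleeping, not spinning) until an item arrives
        if task is None:         # sentinel value: shut the listener down
            break
        serve(task)

pipeline = queue.Queue()
listener = threading.Thread(target=listen, args=(pipeline, print), daemon=True)
listener.start()
pipeline.put("result from some thread")   # producer side: just put items on the queue
pipeline.put(None)                        # tell the listener to stop
listener.join()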
However, there is one caveat in Python: the global interpreter lock (GIL). It exists because CPython relies heavily on C extensions and allows modules to bring in C extensions of their own, and those are not always thread-safe. The GIL means that only one thread runs on only one CPU at a time.
So, once an item is in the queue, only one consumer at a time will wake up and process the result. Also, normally only one producer can run at a time.
The exception is when threads start waiting for some I/O, like reading from a socket. Because I/O notification is handled elsewhere in the hardware and OS, there is always some waiting time for I/O. During that time the threads release the GIL and other threads can do work.
Summed up, it only makes sense to have multiple consumer and producer threads if they also do some I/O work, such as reading/writing on a network socket or disk. This is called concurrency. If you want to use multiple CPU cores at the same time, you need to use multiprocessing in Python instead of threads.
And it only makes sense to have more processes than cores if there is also some I/O work.
Example
I would suggest that you use multiprocessing rather than threading to ensure maximum parallelism. I am not sure whether you really need a process pool for what you are trying to do rather than 4 dedicated processes; it's a question of how "threads" 1 through 3 get the data they feed to the queue for processing by the 4th process. Are they implemented by a single, identical worker function to which "jobs" are submitted? If so, then a process pool of 3 identical workers is what you want. But if these are 3 separate functions with their own processing logic, then you just want to create 3 Process instances. I am working on the second assumption.
Since we are now in the realm of multiprocessing, I would suggest using a "managed" Queue instance created with the following code:
with multiprocessing.Manager() as manager:
    q = manager.Queue()
Access to such a queue is synchronized across processes. The following code is a rough idea of creating the processes and accessing the queue:
import multiprocessing
import time

class Communicate:
    def listen(self, q):
        while True:
            obj = q.get()
            if obj is None:  # our signal to terminate
                return
            # do something with the object
            print(obj)

def process1(q):
    while True:
        time.sleep(1)
        q.put(1)

def process2(q):
    while True:
        time.sleep(.5)
        q.put(2)

def process3(q):
    while True:
        time.sleep(1.5)
        q.put(3)

if __name__ == '__main__':
    communicator = Communicate()
    with multiprocessing.Manager() as manager:
        q = manager.Queue()
        # start the communicator process:
        p = multiprocessing.Process(target=communicator.listen, args=(q,))
        p.start()
        # start the first of the 3 producer processes:
        p1 = multiprocessing.Process(target=process1, args=(q,))
        p1.daemon = True
        p1.start()
        # start the second producer process:
        p2 = multiprocessing.Process(target=process2, args=(q,))
        p2.daemon = True
        p2.start()
        # start the third producer process:
        p3 = multiprocessing.Process(target=process3, args=(q,))
        p3.daemon = True
        p3.start()
        input('Hit Enter to terminate\n')
        q.put(None)  # signal for termination
        p.join()     # wait for the communicator process to complete
Let's assume I'm working with Python, although it's not really relevant.
I have a big array and I want to find out whether element x is in it.
However, when one of the threads finds the element, I want all the other threads to stop; there is no point in them continuing to run. I then want to continue the main program with the result.
What would be the right way of doing this?
I want to minimize the CPU time spent by the other threads once the element has already been found.
In Python, you can create a thread-safe queue in the main thread and pass it to each worker thread. Each worker should search while the queue is empty() and then terminate. If the result is found, the lucky worker should put() it into the queue, causing all other workers to stop after their current iteration.
Example code (untested):
from queue import Queue
from threading import Thread

class Worker(Thread):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue

    def run(self):
        while self.queue.empty():
            result = search( ... )
            if result:
                self.queue.put(result)

def main():
    queue = Queue()
    workers = []
    for i in range(5):
        worker = Worker(queue)
        worker.start()
        workers.append(worker)
    result = queue.get()
    print(result)
There are multiple ways; one of them is polling a queue in the caller's thread, where the spawned threads store their results. As soon as the first result appears, terminate all running threads.
Just note that in CPython only one thread can run at a time due to the Global Interpreter Lock (unless it is inside a C extension that releases the lock). Also note that for searching in large data a more appropriate data structure than an array should be used, such as a binary tree.
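Python threads cannot be forcibly killed from the outside, so "terminate" in practice means signalling the workers to stop. Here is a minimal sketch of that idea; the shared threading.Event stop flag, the pre-split chunks and the worker function are assumptions for illustration:

import queue
import threading

def worker(chunk, x, results, stop):
    # Scan one slice of the big array, checking the stop flag as we go.
    for item in chunk:
        if stop.is_set():        # another worker already found x
            return
        if item == x:
            results.put(item)
            stop.set()           # tell the remaining workers to stop
            return

results = queue.Queue()
stop = threading.Event()
chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # pre-split slices of the array
threads = [threading.Thread(target=worker, args=(chunk, 5, results, stop)) for chunk in chunks]
for t in threads:
    t.start()
print(results.get())             # the caller blocks here until the first hit arrives
for t in threads:
    t.join()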
I am not sure when to use pool of workers vs multiple processes.
processes = []
for m in range(1, 5):
    p = Process(target=some_function)
    p.start()
    processes.append(p)
for p in processes:
    p.join()
vs
if __name__ == '__main__':
    # start 4 worker processes
    with Pool(processes=4) as pool:
        pool_outputs = pool.map(another_function, inputs)
As it says on PyMOTW:
The Pool class can be used to manage a fixed number of workers for
simple cases where the work to be done can be broken up and
distributed between workers independently.
The return values from the jobs are collected and returned as a list.
The pool arguments include the number of processes and a function to
run when starting the task process (invoked once per child).
Please have a look at the examples given there to better understand its application, functionalities and parameters.
Basically, the Pool is a helper that eases the management of the processes (workers) in those cases where all they need to do is consume common input data, process it in parallel, and produce a joint output.
The Pool does quite a few things that you would otherwise have to code yourself (not too hard, but still, it's convenient to find a pre-cooked solution), namely the following (a short sketch follows this list):
the splitting of the input data
the target process function is simplified: it can be designed to expect one input element only. The Pool is going to call it providing each element from the subset allocated to that worker
waiting for the workers to finish their job (i.e. joining the processes)
...
merging the output of each worker to produce the final output
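For example, the manual splitting, dispatching, waiting and merging described above collapses into a single pool.map call. A minimal sketch reusing the names from the question's snippet (the trivial body of another_function and the example inputs are assumptions):

from multiprocessing import Pool

def another_function(item):
    return item * 2                  # works on one input element at a time

if __name__ == '__main__':
    inputs = range(10)
    with Pool(processes=4) as pool:
        # Pool splits `inputs` into chunks, feeds them to the workers,
        # waits for them to finish and merges the results back into one list.
        pool_outputs = pool.map(another_function, inputs)
    print(pool_outputs)              # [0, 2, 4, ..., 18]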
The information below might help you understand the difference between Pool and Process in Python's multiprocessing module:
Pool:
Use the Pool class when you have a chunk of data.
Only the processes currently executing are kept in memory.
I/O operations: it waits until an I/O operation is completed and does not schedule another process in the meantime. This might increase the execution time.
Uses a FIFO scheduler.
Process:
Use the Process class when you have a small amount of data or functions, and fewer repetitive tasks to do.
It puts every process in memory, so for larger tasks it might exhaust memory.
I/O operations: the Process class suspends a process that is executing I/O operations and schedules another process in parallel.
Uses a FIFO scheduler.
I have a set of long-running processes in a typical "pub/sub" setup with queues for communication.
I would like to do two things, and I can't figure out how to accomplish both simultaneously:
Addition/removal of workers. For example, I want to be able to add extra consumers if I see that my pending queue size has grown too large.
Watchdog for my processes - I want to be notified if any of my producers or consumers crashes.
I can do (2) in isolation:
try:
    while True:
        for process in workers + consumers:
            if not process.is_alive():
                logger.critical("%-8s%s died!", process.pid, process.name)
        sleep(3)
except KeyboardInterrupt:
    # Python propagates CTRL+C to all workers, no need to terminate them
    logger.warn('Received CTRL+C, shutting down')
The above blocks, which prevents me from doing (1).
So I decided to move the code into its own process.
This doesn't work, because process.is_alive() only works for a parent checking the status of its children. In this case, the processes I want to check would be siblings instead of children.
I'm a bit stumped on how to proceed. How can my main process support changes to subprocesses while also monitoring subprocesses?
multiprocessing.Pool actually has a watchdog built-in already. It runs a thread that checks every 0.1 seconds to see if a worker has died. If it has, it starts a new one to take its place:
def _handle_workers(pool):
    thread = threading.current_thread()
    # Keep maintaining workers until the cache gets drained, unless the pool
    # is terminated.
    while thread._state == RUN or (pool._cache and thread._state != TERMINATE):
        pool._maintain_pool()
        time.sleep(0.1)
    # send sentinel to stop workers
    pool._taskqueue.put(None)
    debug('worker handler exiting')
def _maintain_pool(self):
    """Clean up any exited workers and start replacements for them.
    """
    if self._join_exited_workers():
        self._repopulate_pool()
This is primarily used to implement the maxtasksperchild keyword argument, and is actually problematic in some cases. If a process dies while a map or apply command is running, and that process is in the middle of handling a task associated with that call, it will never finish. See this question for more information about that behavior.
That said, if you just want to know that a process has died, you can just create a thread (not a process) that monitors the pids of all the processes in the pool, and if the pids in the list ever change, you know a process has crashed:
import time

def monitor_pids(pool):
    pids = [p.pid for p in pool._pool]
    while True:
        new_pids = [p.pid for p in pool._pool]
        if new_pids != pids:
            print("A worker died")
            pids = new_pids
        time.sleep(3)
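A minimal way to wire that up, assuming the monitor_pids function above and a pool created in the parent process (daemon=True is just so the monitor thread dies together with the main process):

import multiprocessing
import threading

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    watchdog = threading.Thread(target=monitor_pids, args=(pool,), daemon=True)
    watchdog.start()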
Edit:
If you're rolling your own Pool implementation, you can just take a cue from multiprocessing.Pool and run your monitoring code in a background thread in the parent process. The checks to see whether the processes are still running are quick, so the time lost to the background thread taking the GIL should be negligible. Consider that the multiprocessing.Pool watchdog runs every 0.1 seconds! Running yours every 3 seconds shouldn't cause any problems.