Python multiprocessing: Parallel Pipeline implementation

I am trying to create a pipeline, but I have bad exit issues (zombies) and performance problems. I have created this generic class:
from multiprocessing import Process


class Generator(Process):
    '''
    <function>: function to call. None value means that the current class will
        be used as a template for another class, with <function> being defined
        there
    <input_queues>: Queue or list of Queue objects, which refer to the input
        to <function>.
    <output_queues>: Queue or list of Queue objects, which are used to pass
        output
    <sema_to_acquire>: Semaphore or list of Semaphore objects, which block
        generation while not released
    <sema_to_release>: Semaphore or list of Semaphore objects, which will be
        released after <function> is called
    '''

    def __init__(self, function=None, input_queues=None, output_queues=None,
                 sema_to_acquire=None, sema_to_release=None):
        Process.__init__(self)
        self.input_queues = input_queues
        self.output_queues = output_queues
        self.sema_to_acquire = sema_to_acquire
        self.sema_to_release = sema_to_release
        if function is not None:
            self.function = function

    def run(self):
        # release the downstream semaphores once, so the first iteration can start
        if self.sema_to_release is not None:
            try:
                self.sema_to_release.release()
            except AttributeError:
                [sema.release() for sema in self.sema_to_release]
        while True:
            if self.sema_to_acquire is not None:
                try:
                    self.sema_to_acquire.acquire()
                except AttributeError:
                    [sema.acquire() for sema in self.sema_to_acquire]
            if self.input_queues is not None:
                try:
                    data = self.input_queues.get()
                except AttributeError:
                    data = [queue.get() for queue in self.input_queues]
                try:
                    iter(data)
                    res = self.function(*tuple(data))
                except TypeError:
                    res = self.function(data)
            else:
                res = self.function()
            if self.output_queues is not None:
                try:
                    if self.output_queues.full():
                        self.output_queues.get()
                    self.output_queues.put(res)
                except AttributeError:
                    [queue.put(res) for queue in self.output_queues]
            if self.sema_to_release is not None:
                try:
                    self.sema_to_release.release()
                except AttributeError:
                    [sema.release() for sema in self.sema_to_release]
to simulate a worker inside a pipeline. The Generator is meant to run an infinite while loop, in which a function is executed using input from n queues and the result is written to m queues. There are some semaphores which a process needs to acquire before an iteration happens, and when the iteration finishes some other semaphores are released. So, for processes that need to run in parallel and produce input for one another, I pass 'crossed' semaphores as arguments, in order to force them to perform single iterations together. For processes which do not need to run in parallel I do not use any semaphores. An example (which is what I actually use, if one ignores the actual input functions) is the following:
import time
from multiprocessing import Queue, Semaphore, Lock

print_lock = Lock()
_t_ = 0.5


def func0(data):
    time.sleep(_t_)
    print_lock.acquire()
    print 'func0 sends', data
    print_lock.release()
    return data


def func1(data):
    time.sleep(_t_)
    print_lock.acquire()
    print 'func1 receives and sends', data
    print_lock.release()
    return data


def func2(data):
    time.sleep(_t_)
    print_lock.acquire()
    print 'func2 receives and sends', data
    print_lock.release()
    return data


def func3(*data):
    print_lock.acquire()
    print 'func3 receives', data
    print_lock.release()


run_svm = Semaphore()
run_rf = Semaphore()
inp_rf = Queue()
inp_svm = Queue()
out_rf = Queue()
out_svm = Queue()
kin_stream = Queue()
res_mixed = Queue()

streamproc = Generator(func0,
                       input_queues=kin_stream,
                       output_queues=[inp_rf, inp_svm])
streamproc.daemon = True
streamproc.start()

svm_class = Generator(func1,
                      input_queues=inp_svm,
                      output_queues=out_svm,
                      sema_to_acquire=run_svm,
                      sema_to_release=run_rf)
svm_class.daemon = True
svm_class.start()

rf_class = Generator(func2,
                     input_queues=inp_rf,
                     output_queues=out_rf,
                     sema_to_acquire=run_rf,
                     sema_to_release=run_svm)
rf_class.daemon = True
rf_class.start()

mixed_class = Generator(func3,
                        input_queues=[out_rf, out_svm])
mixed_class.daemon = True
mixed_class.start()

count = 1
while True:
    kin_stream.put([count])
    count += 1
    time.sleep(1)

streamproc.join()
svm_class.join()
rf_class.join()
mixed_class.join()
This example gives:
func0 sends 1
func2 receives and sends 1
func1 receives and sends 1
func3 receives (1, 1)
func0 sends 2
func2 receives and sends 2
func1 receives and sends 2
func3 receives (2, 2)
func0 sends 3
func2 receives and sends 3
func1 receives and sends 3
func3 receives (3, 3)
...
All good. However, if I try to kill the main process, the other subprocesses are not guaranteed to terminate: the terminal might freeze, or the Python interpreter might remain running in the background (probably as zombies), and I have no clue why this is happening, as I have set the corresponding daemon flags to True.
Does anyone have a better idea for implementing this type of pipeline, or can anyone suggest a solution to this evil problem? Thank you all.
EDIT
Fixed the test code. The zombies still exist, however.

I was able to overcome this problem by introducing a termination queue as an additional argument to the given class and setting up a signal handler for the SIGINT interrupt, in order to stop the pipeline execution. I do not know if this is the most elegant way to get it working, but it works. Also, the way the signal handler is set up matters: it must be set before process.start() for some reason; if anyone knows why, please comment. Furthermore, the signal handler is inherited by the subprocesses, so I have to put the join inside a try: ... except AssertionError: pass pattern, otherwise it will throw an error (again, if someone knows how to bypass this, please elaborate). Anyway, it works.
SOURCE CODE
import signal
import sys
from multiprocessing import Process, Queue


class Generator(Process):
    '''
    <term_queue>: Queue to write termination events to; must be the same for
        all processes spawned
    <function>: function to call. None value means that the current class will
        be used as a template for another class, with <function> being defined
        there
    <input_queues>: Queue or list of Queue objects, which refer to the input
        to <function>.
    <output_queues>: Queue or list of Queue objects, which are used to pass
        output
    <sema_to_acquire>: Semaphore or list of Semaphore objects, which block
        function execution
    <sema_to_release>: Semaphore or list of Semaphore objects, which will be
        released after <function> is called
    '''

    def __init__(self, term_queue,
                 function=None, input_queues=None, output_queues=None,
                 sema_to_acquire=None, sema_to_release=None):
        Process.__init__(self)
        self.term_queue = term_queue
        self.input_queues = input_queues
        self.output_queues = output_queues
        self.sema_to_acquire = sema_to_acquire
        self.sema_to_release = sema_to_release
        if function is not None:
            self.function = function

    def run(self):
        if self.sema_to_release is not None:
            try:
                self.sema_to_release.release()
            except AttributeError:
                [sema.release() for sema in self.sema_to_release]
        while True:
            # any entry in the termination queue means "shut down"
            if not self.term_queue.empty():
                self.term_queue.put((self.name, 0))
                break
            try:
                if self.sema_to_acquire is not None:
                    try:
                        self.sema_to_acquire.acquire()
                    except AttributeError:
                        [sema.acquire() for sema in self.sema_to_acquire]
                if self.input_queues is not None:
                    try:
                        data = self.input_queues.get()
                    except AttributeError:
                        data = tuple([queue.get()
                                      for queue in self.input_queues])
                    res = self.function(data)
                else:
                    res = self.function()
                if self.output_queues is not None:
                    try:
                        if self.output_queues.full():
                            self.output_queues.get()
                        self.output_queues.put(res)
                    except AttributeError:
                        [queue.put(res) for queue in self.output_queues]
                if self.sema_to_release is not None:
                    try:
                        self.sema_to_release.release()
                    except AttributeError:
                        [sema.release() for sema in self.sema_to_release]
            except Exception as exc:
                # propagate the failure so the other processes can stop too
                self.term_queue.put((self.name, exc))
                break


def signal_handler(sig, frame, term_queue, processes):
    '''
    <term_queue> is the queue to write termination of __main__ to
    <processes> is a dictionary holding all running processes
    '''
    term_queue.put((__name__, 'SIGINT'))
    try:
        [processes[key].join() for key in processes]
    except AssertionError:
        pass
    sys.exit(0)


term_queue = Queue()
'''
initialize some Generators and add them to the <processes> dictionary
'''
signal.signal(signal.SIGINT, lambda sig, frame: signal_handler(
    sig, frame, term_queue, processes))
[processes[key].start() for key in processes]
while True:
    if not term_queue.empty():
        [processes[key].join() for key in processes]
        break
and the example is changed accordingly (comment if you want me to add it)
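For illustration only, here is a rough sketch of how the driver loop from the first example might look with the termination queue in place; the Generator instances (stored in a processes dictionary), kin_stream and signal_handler are assumed to be defined as above, so this is not the exact adapted example:
import signal
import time
from multiprocessing import Queue

term_queue = Queue()
processes = {}  # assumed to hold the Generator instances, keyed by name

# the handler must be installed before start(), as noted above
signal.signal(signal.SIGINT, lambda sig, frame: signal_handler(
    sig, frame, term_queue, processes))
for key in processes:
    processes[key].start()

count = 1
while term_queue.empty():
    kin_stream.put([count])  # kin_stream: the input Queue from the first example
    count += 1
    time.sleep(1)
for key in processes:
    try:
        processes[key].join()
    except AssertionError:
        pass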

I have had to work on this issue as well, and indeed, passing some communication pipe or queue to the processes seems to be the easiest way to tell them to terminate.
However, the termination code can take advantage of a finally: block in the main process; it will take care of any event, including signals.
If your processes are supposed to terminate at the same time as an object, you might also want to play with weakref.finalize, but it can be tricky.
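As a minimal sketch of the finally: idea, assuming the workers poll a shared queue for a stop message (the names here are illustrative, not taken from the answer above):
import time
from multiprocessing import Process, Queue

def worker(stop_queue):
    # poll the shared queue; any item means "shut down"
    while stop_queue.empty():
        time.sleep(0.1)  # stand-in for real work

if __name__ == '__main__':
    stop_queue = Queue()
    procs = [Process(target=worker, args=(stop_queue,)) for _ in range(3)]
    for p in procs:
        p.start()
    try:
        time.sleep(5)  # stand-in for the main loop
    finally:
        # runs on normal exit, on exceptions and on Ctrl+C, so the
        # children always get the termination message
        stop_queue.put('STOP')
        for p in procs:
            p.join()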

Related

Main thread hanging at queue join because q.get() on empty queue does not return

I have a request manager that builds a queue and starts x worker threads (x currently == 1).
Each thread loops, getting elements from the queue and appending the results to a shared list.
If the queue is exhausted, the queue.Empty exception is caught, the current job is marked as done and the thread should exit. This does work.
However, this block at the end of run() seems to break things. The queue has an arbitrary length, and it might occur that the queue is longer than the number of results actually fetchable. In order to exit all threads early, a thread checks if the result it got has len == 0. If this is the case, the thread clears the queue of all items left, marks itself as done and exits.
if len(request_result) == 0:
    with self.q.mutex:
        self.q.queue.clear()
    self.q.task_done()
    return
My assumption was that every thread would then finish its current job and exit.
However, the execution of the main thread hangs at q.join() and I can't work out why. From the debugger it looks like the worker thread is not terminating, but that's just a guess.
I've read: Threading queue hangs in Python
but that does not solve the problem. I did try setting q.unfinished_tasks to 0 manually, but that is not thread safe and will cause the program to crash when threads call task_done() after another thread has just set q.unfinished_tasks to 0.
class RequestManager:
    def __init__(self, config=None):
        self.config = config

    def request_all_heroes(self):
        q = queue.Queue()
        result_list = []
        # todo: get range max from highest hero ID.
        for skip in [x * 100 for x in range(1, 3)]:
            q.put_nowait(skip)
        for _ in range(int(self.config["meta"]["number_of_threads"])):
            RequestWorker(q=q,
                          config=self.config,
                          query_name='all_heroes',
                          shared_result_list=result_list).start()
        q.join()
        return [Hero(item) for sublist in result_list for item in sublist]


class RequestWorker(threading.Thread):
    def __init__(self,
                 q=None,
                 config=None,
                 query_name="",
                 shared_result_list=None, *args, **kwargs):
        self.q = q
        self.config = config
        self.query_file_path = self.config["files"][query_name]
        self.shared_result_list = shared_result_list
        super().__init__(*args, **kwargs)

    def run(self):
        keep_running = True
        while keep_running:
            try:
                skip_number = self.q.get()
            except queue.Empty:
                self.q.task_done()
                return
            sr = SpecificRequest(config=self.config, skip=skip_number,
                                 query_file_path=self.query_file_path)
            request_result = sr.do_specific_request()
            if len(request_result) == 0:
                with self.q.mutex:
                    self.q.queue.clear()
                self.q.task_done()
                return
            self.shared_result_list.append(request_result)
            self.q.task_done()
EDIT 1
if not self.q.empty():
    skip_number = self.q.get()
else:
    return
This works; unfortunately, it is plain wrong, because get is called after the check whether the queue is empty. This will cause problems at some point, because a thread can check, see an element in the queue, and another thread can snatch that last element in the meantime. Unlikely, but possible.
This question is now about why self.q.get() does not return.
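To illustrate why the except queue.Empty branch is never reached: queue.Queue.get() with no timeout blocks forever on an empty queue, so queue.Empty is never raised. A small standalone sketch (not the poster's code), using a timeout together with a sentinel, shows an exit path that does get taken:
import queue
import threading

q = queue.Queue()

def worker():
    while True:
        try:
            # without a timeout, get() would block forever on an empty
            # queue; with one, queue.Empty becomes reachable
            item = q.get(timeout=1)
        except queue.Empty:
            return
        if item is None:  # sentinel: producer is done
            q.task_done()
            return
        # ... process item ...
        q.task_done()

t = threading.Thread(target=worker)
t.start()
for i in range(5):
    q.put(i)
q.put(None)  # tell the worker to stop
q.join()
t.join()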

How to track progress of job worker threads when threads are initiated from a Job Processor?

I have a scenario where I get a list of jobs to be processed (e.g. a list of web pages to be crawled from the internet). Each job is independent and the jobs can be processed in any order. Individual jobs may fail or succeed and may have to be handled accordingly (e.g. temporary data for a failed crawl task may have to be deleted and the page recrawled in the next round).
I am trying to implement it using thread-based processing in Python. To mimic the actual task, let's say I have a huge list of integer arrays and the individual job is to compute the Sum or Product of each array. What I am trying to do is to use a JobsProcessor class object to instantiate threads of JobWorker class objects, which perform the actual processing by creating objects of other classes (Sum and Product here). A snippet is shown below:
from queue import Queue, Empty
from threading import Thread
import time


class Product:
    def __init__(self, data):
        self.data = data

    def doOperation(self):
        try:
            product = self.data[0]
            for d in self.data[1:]:
                if d > 100000:
                    raise Exception("Forcefully throwing exception")
                product *= d
            time.sleep(1)
            return product
        except:
            return "product computation failed"


class Sum:
    def __init__(self, data):
        self.data = data

    def doOperation(self):
        try:
            sum = 0
            for d in self.data:
                sum += d
            time.sleep(1)
            return sum
        except:
            return "sum computation failed"


class JobWorker(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            try:
                jobitem = self.queue.get_nowait()
                if jobitem is None:
                    break
                jobdata, optype = jobitem
                if optype == 'sum':
                    opobj = Sum(jobdata)
                    jobresult = opobj.doOperation()
                elif optype == 'product':
                    opobj = Product(jobdata)
                    jobresult = opobj.doOperation()
                else:
                    print("Invalid op type")
                    jobresult = 'Failed'
                print(" job result", jobresult)
                self.queue.task_done()
            except Empty:
                break
            except:
                print("Some exception occured")
                # How to pass it up to the main jobs processor?


class JobsProcessor(object):
    def __init__(self, joblist):
        self.joblist = joblist
        self.job_queue = Queue()

    def process_resources(self):
        try:
            for job in self.joblist:
                self.job_queue.put(job)
            for i in range(2):
                jobthread = JobWorker(self.job_queue)
                jobthread.start()
            '''
            Write code here to monitor current status for all running jobs
            '''
            self.job_queue.join()
            '''I want to write code here to track progress status for all jobs.
            Some jobs may have failed, not completed, and based on that I may
            want to take further action such as retry or flag them'''
            print("Finished Jobs")
        except:
            pass


orgjobList = [([1, 5, 9, 4], 'sum'),
              ([5, 4, 5, 8], 'product'),
              ([100, 45, 678, 999], 'product'),
              ([3743, 34, 44324, 543], 'sum'),
              ([100001, 100002, 9876, 83989], 'product')]

mainprocessor = JobsProcessor(orgjobList)
mainprocessor.process_resources()
I want to add 2 functionalities to this process.
Consolidation: when all the job threads complete, I want to know the status of all the JobWorker objects (e.g. whether they completed successfully or with failure). A failure/exception may occur in the JobWorker object, or maybe even in the Sum or Product object. The failure/success status should be propagated back to JobsProcessor, where I want to perform other actions such as reprocess/delete/send_elsewhere etc. based on the returned status.
Monitoring: I also want to have a Monitor functionality which can continuously check the status of currently running/completed jobs and perform the requisite actions, such as deleting immediately rather than waiting until the end for Consolidation.
Please advise how I can add the above functionalities, and if only one of them would suffice for cases such as crawling pages. Any other suggestions are also welcome.
You can add both functionalities to your code in either of two ways:
Using global variables (simplest approach)
Using getProgress and getStatus methods in your class (elegant approach)
You can create two threads: one thread does the actual work and updates the progress variable, while the other monitors it.
For the second approach, you can set a few instance variables in __init__, like the following.
def __init__(self):
    self.progress = 0
    self.success = True
    self.isDone = False
    self.error = "No Error Occurred"
Then you can include the logic in your code like the following -
def actualWork(self):
    self.isDone = False
    try:
        for i in range(1000):
            self.progress = i
            time.sleep(0.01)
        self.isDone = True
    except Exception as e:
        self.success = False
        self.error = str(e)

def getProgress(self):
    return self.progress

def getError(self):
    return self.error
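Putting the pieces together, here is a self-contained sketch of how the worker thread and a monitoring loop might interact (the class wrapper and the monitor loop are illustrative additions, not part of the answer above):
import threading
import time

class Job:
    def __init__(self):
        self.progress = 0
        self.success = True
        self.isDone = False
        self.error = "No Error Occurred"

    def actualWork(self):
        try:
            for i in range(100):
                self.progress = i
                time.sleep(0.01)
            self.isDone = True
        except Exception as e:
            self.success = False
            self.error = str(e)

    def getProgress(self):
        return self.progress

job = Job()
worker = threading.Thread(target=job.actualWork)
worker.start()
# monitor from the main (or a second) thread
while not job.isDone and job.success:
    print("progress:", job.getProgress())
    time.sleep(0.1)
worker.join()
print("success:", job.success, "error:", job.error)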

Is it possible to inherit multiprocessing.Process to communicate with the main process

I'm trying to derive a subclass from multiprocessing.Process, which will have a queue for each instance, so that the queue can be used to catch the return value of the target.
But the problem is that multiprocessing.Process.start() uses a Popen object (https://github.com/python/cpython/blob/master/Lib/multiprocessing/process.py) to create a process and run the target inside it. Is there a way to overload this without redefining the entire Process module?
This is what I'm trying to do:
import multiprocessing
import Queue


class Mprocessor(multiprocessing.Process):
    def __init__(self, **kwargs):
        multiprocessing.Process.__init__(self, **kwargs)
        self._ret = Queue.Queue()

    def run(self):
        self._ret.put(multiprocessing.Process.run(self))

    def getReturn(self):
        if self._ret.empty():
            return None
        return self._ret.get()
Here I try to create a multiprocessing.Queue inside the class.
I override the run method so that when it is executed, the return value(s) of the target are put inside the queue.
I have a getReturn method which is called in the main function using the Mprocessor class. This method should only be called when the Mprocessor.is_alive() method (which is defined for multiprocessing.Process) returns False.
But this mechanism is not working, because when I call Mprocessor.start() it creates a subprocess which runs the target in its own environment.
I want to know if there's a way to use the queue in the start method to get the return value, and to avoid the target having to take a queue argument to communicate.
I wanted to generalize this module.
I don't want my methods to be defined to have a queue to get return value.
I want the module to be applicable to any function, because I am planning to have a manager method which takes a dict {"process_name/ID": method/target} and a dict {"process_name/ID": [argument_list]}, creates a process for each of these targets and returns a dict {"process_name/ID": (return tuple,)}.
Any ideas will be welcomed.
EDIT
Manager function:
def Processor_call(func=None, func_args=None):
    if sorted(func.keys()) != sorted(func_args.keys()):
        print "Names in func dict and args dict don't match"
        return None
    process_list = multiprocessing.Queue()
    for i in func.keys():
        p = Mprocessor(name=i, target=func[i], args=tuple(func_args[i]))
        process_list.put(p)
        p.start()
    return_dict = {}
    while not process_list.empty():
        process_wait = process_list.get()
        if not process_wait.is_alive():
            process_wait.join()
            if process_wait.exitcode == 0:
                return_dict[process_wait.name] = process_wait.getReturn()
            else:
                print "Error in process %s, status not available" % process_wait.name
        else:
            join_process.put(process_wait)
    return return_dict
EDIT: The target function should look like this.
def sum(a, b):
    return a + b
I don't want to pass a queue into the function and return values through the queue.
I want to make a common module so that any existing method can use multiprocessing without any change to its definition, so the interface with other modules is maintained.
I don't want a function to be designed only to be run as a process; I want a common interface so that other modules can also use the function as a normal method, without bothering to read from a queue to get the return value.
Comment: ... so that I'll get the return value from the process started from start method
This will work for me, for instance:
class Mprocessor
import multiprocessing
import time


class Mprocessor(multiprocessing.Process):
    def __init__(self, queue, **kwargs):
        multiprocessing.Process.__init__(self, **kwargs)
        self._ret = queue

    def run(self):
        return_value = self._target(*self._args)
        self._ret.put((self.name, return_value))
        time.sleep(0.25)
        exit(0)
Start processes and wait for return values
def Processor_call(func=None, func_args=None):
    print('func=%s, func_args=%s' % (func, func_args))
    ret_q = multiprocessing.Manager().Queue()
    process_list = []
    for i in func.keys():
        p = Mprocessor(name=i, target=func[i], args=(func_args[i],), queue=ret_q)
        p.start()
        process_list.append(p)
        time.sleep(0.1)
    print('Block __main__ until all processes terminated')
    for p in process_list:
        p.join()
    print('Aggregate all return values')
    return_dict = {}
    while not ret_q.empty():
        p_name, value = ret_q.get()
        return_dict[p_name] = value
    return return_dict
__main__
if __name__ == '__main__':
    rd = Processor_call({'f1': f1, 'f2': f1}, {'f1': 1, 'f2': 2})
    print('rd=%s' % rd)
Output:
func={'f1': , 'f2': }, func_args={'f1': 1, 'f2': 2}
pid:4501 start 2
pid:4501 running
pid:4500 start 1
pid:4500 running
Block __main__ until all processes terminated
pid:4501 running
pid:4500 running
pid:4501 running
pid:4500 running
pid:4501 Terminate
pid:4500 Terminate
Aggregate all return values
rd={'f1': 1, 'f2': 2}
Tested with Python 3.4.2 and 2.7.9
Question: Is it possible to inherit multiprocessing.Process to communicate with the main process
Yes, it's possible, but not by using a class object, as your process uses its own copy of the class object.
You have to use a global Queue object and pass it to your process.
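As a side note, if the targets are plain functions and only their return values are needed, the whole name-to-result mapping can also be built without subclassing Process at all, for example with concurrent.futures (Python 3, or the futures backport). This is an alternative sketch, not part of the answer above:
from concurrent.futures import ProcessPoolExecutor

def run_all(targets, target_args):
    # targets:     {"name": callable}
    # target_args: {"name": [argument_list]}
    results = {}
    with ProcessPoolExecutor() as pool:
        futures = {name: pool.submit(targets[name], *target_args[name])
                   for name in targets}
        for name, fut in futures.items():
            results[name] = fut.result()  # re-raises if the target failed
    return results

def add(a, b):
    return a + b

if __name__ == '__main__':
    print(run_all({'p1': add, 'p2': add}, {'p1': [1, 2], 'p2': [3, 4]}))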

Thread Getting Stuck At Join

I'm running a thread pool that is giving a random bug. Sometimes it works, sometimes it gets stuck at the pool.join part of this code. I've been at this for several days, yet cannot find any difference between when it works and when it gets stuck. Please help...
Here's the code...
def run_thread_pool(functions_list):
    # Make the Pool of workers
    pool = ThreadPool()  # left blank to default to machine number of cores
    pool.map(run_function, functions_list)
    # close the pool and wait for the work to finish
    pool.close()
    pool.join()
    return
Similarly, this code also randomly gets stuck at q.join():
def run_queue_block(methods_list, max_num_of_workers=20):
    from views.console_output_handler import add_to_console_queue
    '''
    Runs methods on threads. Stores method returns in a list. Then outputs that list
    after all methods in the list have been completed.

    :param methods_list: example ((method name, args), (method_2, args), (method_3, args)
    :param max_num_of_workers: The number of threads to use in the block.
    :return: The full list of returns from each method.
    '''
    method_returns = []
    log = StandardLogger(logger_name='run_queue_block')

    # lock to serialize console output
    lock = threading.Lock()

    def _output(item):
        # Make sure the whole print completes or threads can mix up output in one line.
        with lock:
            if item:
                add_to_console_queue(item)
            msg = threading.current_thread().name, item
            log.log_debug(msg)
        return

    # The worker thread pulls an item from the queue and processes it
    def _worker():
        log = StandardLogger(logger_name='_worker')
        while True:
            try:
                method, args = q.get()  # Extract and unpack callable and arguments
            except:
                # we've hit a nonetype object.
                break
            if method is None:
                break
            item = method(*args)  # Call callable with provided args and store result
            method_returns.append(item)
            _output(item)
            q.task_done()

    num_of_jobs = len(methods_list)
    if num_of_jobs < max_num_of_workers:
        max_num_of_workers = num_of_jobs

    # Create the queue and thread pool.
    q = Queue()
    threads = []
    # starts worker threads.
    for i in range(max_num_of_workers):
        t = threading.Thread(target=_worker)
        t.daemon = True  # thread dies when main thread (only non-daemon thread) exits.
        t.start()
        threads.append(t)

    for method in methods_list:
        q.put(method)

    # block until all tasks are done
    q.join()

    # stop workers
    for i in range(max_num_of_workers):
        q.put(None)
    for t in threads:
        t.join()

    return method_returns
I never know when it's going to work. It works most of the time, but most of the time is not good enough. What might possibly cause a bug like this?
You have to call shutdown on the concurrent.futures.ThreadPoolExecutor object. Then return the result of pool.map.
def run_thread_pool(functions_list):
    # Make the Pool of workers
    pool = ThreadPool()  # left blank to default to machine number of cores
    result = pool.map(run_function, functions_list)
    # close the pool and wait for the work to finish
    pool.shutdown()
    return result
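For comparison, this is roughly what the same function looks like written directly against concurrent.futures; the ThreadPoolExecutor import is an assumption, while run_function and functions_list are taken from the question:
from concurrent.futures import ThreadPoolExecutor

def run_thread_pool(functions_list):
    # the context manager calls shutdown(wait=True) on exit, so all work
    # is finished before the results are returned
    with ThreadPoolExecutor() as pool:  # default worker count (Python 3.5+)
        result = list(pool.map(run_function, functions_list))
    return result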
I've simplified your code, without a Queue object or daemon threads. Check if it fits your requirements.
def run_queue_block(methods_list):
    from views.console_output_handler import add_to_console_queue
    '''
    Runs methods on threads. Stores method returns in a list. Then outputs that list
    after all methods in the list have been completed.

    :param methods_list: example ((method name, args), (method_2, args), (method_3, args)
    :param max_num_of_workers: The number of threads to use in the block.
    :return: The full list of returns from each method.
    '''
    method_returns = []
    log = StandardLogger(logger_name='run_queue_block')

    # lock to serialize console output
    lock = threading.Lock()

    def _output(item):
        # Make sure the whole print completes or threads can mix up output in one line.
        with lock:
            if item:
                add_to_console_queue(item)
            msg = threading.current_thread().name, item
            log.log_debug(msg)
        return

    # The worker thread pulls an item from the queue and processes it
    def _worker(method, *args, **kwargs):
        log = StandardLogger(logger_name='_worker')
        item = method(*args, **kwargs)  # Call callable with provided args and store result
        with lock:
            method_returns.append(item)
        _output(item)

    threads = []
    # starts worker threads.
    for method, args in methods_list:
        t = threading.Thread(target=_worker, args=(method, args))
        t.start()
        threads.append(t)

    # stop workers
    for t in threads:
        t.join()

    return method_returns
To allow your queue to join in your second example, you need to ensure that all tasks are removed from the queue.
So in your _worker function, mark tasks as done even if they could not be processed, otherwise the queue will never be emptied, and your program will hang.
def _worker():
    log = StandardLogger(logger_name='_worker')
    while True:
        try:
            method, args = q.get()  # Extract and unpack callable and arguments
        except:
            # we've hit a nonetype object.
            q.task_done()
            break
        if method is None:
            q.task_done()
            break
        item = method(*args)  # Call callable with provided args and store result
        method_returns.append(item)
        _output(item)
        q.task_done()

Is there a way to add to a variable across threads in python

Is there a way that I can have a single variable shared across active threads, like below,
count = 0
threadA(count)
threadB(count)

def threadA(count):
    # do stuff
    count += 1

def threadB(count):
    # do stuff
    print count
so that count will print out 1? That is, I change the variable in thread A and the change is reflected in the other thread?
Your variable count is already available to all your threads. But you need to synchronize access to it, or you will lose updates. Look into using a lock to protect access to the count.
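For example, a lock-protected counter might look like this (a minimal sketch; the names are illustrative):
import threading

count = 0
count_lock = threading.Lock()

def worker(n):
    global count
    for _ in range(n):
        with count_lock:  # serialize the read-modify-write on the counter
            count += 1

threads = [threading.Thread(target=worker, args=(100000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(count)  # reliably 200000 with the lock in place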
If you want to use processes instead of threads, use multiprocessing. It has more features, including Manager objects which handle shared objects for you. As a perk, you can share objects across machines!
source
import multiprocessing, signal, time


def producer(objlist):
    '''
    add an item to list every sec
    '''
    while True:
        try:
            time.sleep(1)
        except KeyboardInterrupt:
            return
        msg = 'ding: {:04d}'.format(int(time.time()) % 10000)
        objlist.append(msg)
        print msg


def scanner(objlist):
    '''
    every now and then, consume objlist & run calculation
    '''
    while True:
        try:
            time.sleep(3)
        except KeyboardInterrupt:
            return
        print 'items: {}'.format(list(objlist))
        objlist[:] = []


def main():
    # create obj sharable between all processes
    manager = multiprocessing.Manager()
    my_objlist = manager.list()  # pylint: disable=E1101

    multiprocessing.Process(
        target=producer, args=(my_objlist,),
    ).start()

    multiprocessing.Process(
        target=scanner, args=(my_objlist,),
    ).start()

    # kill everything after a few seconds
    signal.signal(
        signal.SIGALRM,
        lambda _sig, _frame: manager.shutdown(),
    )
    signal.alarm(12)

    try:
        manager.join()  # wait until both workers die
    except KeyboardInterrupt:
        pass


if __name__ == '__main__':
    main()
