Python multiprocessing and Manager

I am using Python's multiprocessing to create a parallel application. Processes need to share some data, for which I use a Manager. However, I have some common functions which processes need to call and which need to access the data stored by the Manager object. My question is whether I can avoid needing to pass the Manager instance to these common functions as an argument and rather use it like a global. In other words, consider the following code:
import multiprocessing as mp

manager = mp.Manager()
global_dict = manager.dict(a=[0])

def add():
    global_dict['a'] += [global_dict['a'][-1] + 1]

def foo_parallel(var):
    add()
    print var

num_processes = 5
p = []
for i in range(num_processes):
    p.append(mp.Process(target=foo_parallel, args=(global_dict,)))

[pi.start() for pi in p]
[pi.join() for pi in p]
This runs fine and returns p=[0,1,2,3,4,5] on my machine. However, is this "good form"? Is this a good way of doing it, just as good as defining add(var) and calling add(var) instead?

Your code example seems to have bigger problems than form. You get your desired output only by luck. Repeated execution will yield different results. That's because += is not an atomic operation. Multiple processes can read the same old value one after another before any of them has updated it, and they will all write back the same values. To prevent this behaviour, you'll additionally have to use a Manager.Lock.
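As a rough illustration (a minimal sketch, not the code from this answer), an unguarded read-modify-write on a managed value usually loses updates; the script below typically prints less than the expected 4000:

import multiprocessing as mp

def bump(shared, n):
    for _ in range(n):
        shared['count'] = shared['count'] + 1  # read, add, write back: not atomic

if __name__ == '__main__':
    with mp.Manager() as manager:
        shared = manager.dict(count=0)
        workers = [mp.Process(target=bump, args=(shared, 1000)) for _ in range(4)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        print(shared['count'])  # expected 4000, but updates get lost without a lock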
To your original question about "good form": IMO it would be cleaner to let the main function of the child process, foo_parallel, pass global_dict explicitly into a generic function add(var). That would be a form of dependency injection, which has some advantages. In your example, non-exhaustively, it:
allows isolated testing
increases code reusability
makes debugging easier (detecting non-accessibility of the managed object isn't delayed until add is called: fail fast)
requires less boilerplate code (for example, try-except blocks on resources multiple functions need)
As a side note: using list comprehensions only for their side effects is considered a 'code smell'. If you don't need a list as a result, just use a for-loop.
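For example, the start/join lines from the question could be written as plain loops:

for pi in p:
    pi.start()
for pi in p:
    pi.join()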
Code:
import os
from multiprocessing import Process, Manager

def add(l):
    l += [l[-1] + 1]
    return l

def foo_parallel(global_dict, lock):
    with lock:
        l = global_dict['a']
        global_dict['a'] = add(l)
        print(os.getpid(), global_dict)

if __name__ == '__main__':

    N_WORKERS = 5

    with Manager() as manager:
        lock = manager.Lock()
        global_dict = manager.dict(a=[0])

        pool = [Process(target=foo_parallel, args=(global_dict, lock))
                for _ in range(N_WORKERS)]

        for p in pool:
            p.start()

        for p in pool:
            p.join()

        print('result', global_dict)

Related

How can I use a shared/managed dictionary with a process pool (Python 3.x)

I am working on a project that requires me to extract a ton of information from some files. The format and most of the information about the project do not matter for what I am about to ask. I mostly do not understand how I would share this dictionary with all the processes in the process pool.
Here is my code (I changed up variable names and trimmed most of the code down to the need-to-know parts):
import json
import multiprocessing
from multiprocessing import Pool, Lock, Manager
import glob
import os

def record(thing, map):
    with mutex:
        if(thing in map):
            map[thing] += 1
        else:
            map[thing] = 1

def getThing(file, n, map):
    #do stuff
    thing = file.read()
    record(thing, map)

def init(l):
    global mutex
    mutex = l

def main():
    #create a manager to manage shared dictionaries
    manager = Manager()

    #get the list of filenames to be analyzed
    fileSet1 = glob.glob("filesSet1/*")
    fileSet2 = glob.glob("fileSet2/*")

    #create a global mutex for the processes to share
    l = Lock()

    map = manager.dict()

    #create a process pool, give it the global mutex, and max cpu count-1 (manager is its own process)
    with Pool(processes=multiprocessing.cpu_count()-1, initializer=init, initargs=(l,)) as pool:
        pool.map(lambda file: getThing(file, 2, map), fileSet1) #This line is what i need help with

main()
From what I understand, that lambda function should work. The line that I need help with is: pool.map(lambda file: getThing(file, 2, map), fileSet1). It gives me an error there. The error given is "AttributeError: Cant pickle local object 'main..'".
Any help would be appreciated!
In order to execute the tasks in parallel, the multiprocessing module "pickles" the task function. In your case, this "task function" is lambda file: getThing(file, 2, map).
Unfortunately for you, by default, lambda functions cannot be pickled in Python (see also this stackoverflow post). Let me illustrate the problem with a minimal bit of code:
import multiprocessing

l = range(12)

def not_a_lambda(e):
    print(e)

def main():
    with multiprocessing.Pool() as pool:
        pool.map(not_a_lambda, l)        # Case (A)
        pool.map(lambda e: print(e), l)  # Case (B)

main()
In Case A we have a proper, free function which can be pickled and thus the pool.map operation will work. In Case B we have a lambda function and a crash will occur.
One possible solution is to use a proper module-scope function (like my not_a_lambda). Another solution is to rely on a third-party module, like dill, to extend the pickling functionality. In the latter case you'd use, for example, pathos as a replacement for the regular multiprocessing module. Finally, you could create a Worker class which collects your shared state as members. This could look something like this:
import multiprocessing

class Worker:
    def __init__(self, mutex, map):
        self.mutex = mutex
        self.map = map

    def __call__(self, e):
        print("Hello from Worker e=%r" % (e, ))
        with self.mutex:
            k, v = e
            self.map[k] = v
        print("Goodbye from Worker e=%r" % (e, ))

def main():
    manager = multiprocessing.Manager()
    mutex = manager.Lock()
    map = manager.dict()
    # there is only ONE Worker instance which is shared across all processes
    # thus, you need to make sure you don't access / modify internal state of
    # the worker instance without locking the mutex.
    worker = Worker(mutex, map)
    items = list(enumerate(range(12)))  # (key, value) pairs to distribute
    with multiprocessing.Pool() as pool:
        pool.map(worker, items)

main()
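A further option, sketched here with illustrative stand-ins (get_thing, shared_map and lock are hypothetical counterparts of the question's getThing, managed dict and lock): bind the extra arguments to a module-level function with functools.partial, which pickles fine where a lambda does not.

import functools
import multiprocessing

def get_thing(file_path, n, shared_map, lock):
    # hypothetical stand-in for the question's getThing(); counts each path
    with lock:
        shared_map[file_path] = shared_map.get(file_path, 0) + 1

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    shared_map = manager.dict()
    lock = manager.Lock()
    task = functools.partial(get_thing, n=2, shared_map=shared_map, lock=lock)
    with multiprocessing.Pool() as pool:
        pool.map(task, ['a.txt', 'b.txt', 'a.txt'])
    print(dict(shared_map))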

Working with deque object across multiple processes

I'm trying to reduce the processing time of reading a database with roughly 100,000 entries, but I need them to be formatted a specific way. To do this, I tried to use Python's multiprocessing Pool.map function, which works perfectly except that I can't seem to get any form of queue reference to work across the processes.
I've been using information from Filling a queue and managing multiprocessing in python to guide me for using queues across multiple processes, and Using a global variable with a thread to guide me for using global variables across threads. I've gotten the software to work, but when I check the list/queue/dict/map length after running the process, it always returns zero.
I've written a simple example to show what I mean:
You have to run the script as a file; the pool's initializer function does not work from the interpreter.
from multiprocessing import Pool
from collections import deque

global_q = deque()

def my_init(q):
    global global_q
    global_q = q
    q.append("Hello world")

def map_fn(i):
    global global_q
    global_q.append(i)

if __name__ == "__main__":
    with Pool(3, my_init, (global_q,)) as pool:
        pool.map(map_fn, range(3))
    for p in range(len(global_q)):
        print(global_q.pop())
Theoretically, when I pass the queue object reference from the main thread to the worker threads using the pool function, and then initialize each thread's global variable using the given function, then when I insert elements into the queue from the map function later, that object reference should still be pointing to the original queue object reference (long story short, everything should end up in the same queue, because they all point to the same location in memory).
So, I expect:
Hello World
Hello World
Hello World
1
2
3
Of course, the 1, 2, 3 can be in arbitrary order, but what you'll actually see in the output is ''.
How come when I pass object references to the pool function, nothing happens?
Here's an example of how to share something between processes by extending the multiprocessing.managers.BaseManager class to support deques.
There's a Customized managers section in the documentation about creating them.
import collections
from multiprocessing import Pool
from multiprocessing.managers import BaseManager

class DequeManager(BaseManager):
    pass

class DequeProxy(object):
    def __init__(self, *args):
        self.deque = collections.deque(*args)

    def __len__(self):
        return self.deque.__len__()

    def appendleft(self, x):
        self.deque.appendleft(x)

    def append(self, x):
        self.deque.append(x)

    def pop(self):
        return self.deque.pop()

    def popleft(self):
        return self.deque.popleft()

# Currently only exposes a subset of deque's methods.
DequeManager.register('DequeProxy', DequeProxy,
                      exposed=['__len__', 'append', 'appendleft',
                               'pop', 'popleft'])

process_shared_deque = None  # Global only within each process.

def my_init(q):
    """ Initialize module-level global. """
    global process_shared_deque
    process_shared_deque = q
    q.append("Hello world")

def map_fn(i):
    process_shared_deque.append(i)  # deque's don't have a "put()" method.

if __name__ == "__main__":
    manager = DequeManager()
    manager.start()
    shared_deque = manager.DequeProxy()

    with Pool(3, my_init, (shared_deque,)) as pool:
        pool.map(map_fn, range(3))

    for p in range(len(shared_deque)):  # Show left-to-right contents.
        print(shared_deque.popleft())
Output:
Hello world
0
1
2
Hello world
Hello world
You can't use a global variable for multiprocessing. Pass a multiprocessing queue to the function.
from multiprocessing import Queue

queue = Queue()

def worker(q):
    q.put(something)
Also, you are probably experiencing that the code itself is all right, but since the pool creates separate processes, even the errors are separated, and therefore you don't see that the code not only isn't working, but that it throws an error.
The reason why your output is '' is that nothing was appended to your q/global_q. And if something was appended, it was appended to a variable that may be called global_q, but is a totally different one from the global_q in your main thread.
Try to print('Hello world') inside the function you want to multiprocess and you will see for yourself that nothing is actually printed at all. That process is simply outside of your main thread, and the only way to access it is through multiprocessing queues. You put data into the queue with queue.put('something') and take it out with something = queue.get().
Try to understand this code and you will do well:
import multiprocessing as mp

# This queue will be shared among all processes, but you need to pass it to each
# process as an argument. You CANNOT use it as a global variable. Understand that
# the functions run in totally different processes and nothing can really access
# them... except a multiprocessing.Queue, which can be shared across all processes.
shared_queue = mp.Queue()

def channel(que, channel_num):
    que.put(channel_num)

if __name__ == '__main__':
    processes = [mp.Process(target=channel, args=(shared_queue, channel_num))
                 for channel_num in range(8)]

    for p in processes:
        p.start()

    for p in processes:  # wait for all results to close the pool
        p.join()

    for i in range(8):  # Get data from the Queue (you can get data out of it at any time actually).
        print(shared_queue.get())

Creating a Queue delay in a Python pool without blocking

I have a large program (specifically, a function) that I'm attempting to parallelize using a JoinableQueue and the multiprocessing map_async method. The function that I'm working with does several operations on multidimensional arrays, so I break up each array into sections, and each section evaluates independently; however I need to stitch together one of the arrays early on, but the "stitch" happens before the "evaluate" and I need to introduce some kind of delay in the JoinableQueue. I've searched all over for a workable solution but I'm very new to multiprocessing and most of it goes over my head.
This phrasing may be confusing- apologies. Here's an outline of my code (I can't put all of it because it's very long, but I can provide additional detail if needed)
import numpy as np
import multiprocessing as mp
from multiprocessing import Pool, Pipe, JoinableQueue

def main_function(section_number):

    #define section sizes
    array_this_section = array[:, start:end+1, :]
    histogram_this_section = np.zeros((3, dataset_size, dataset_size))
    #start and end are defined according to the size of the array
    #dataset_size is to show that the histogram is a different size than the array

    for m in range(1, num_iterations+1):
        #do several operations- each section of the array
        #corresponds to a section on the histogram

        hist_queue.put(histogram_this_section)
        #each process sends their own part of the histogram outside of the pool
        #to be combined with every other part- later operations
        #in this function must use the full histogram
        hist_queue.join()
        full_histogram = full_hist_queue.get()
        full_hist_queue.task_done()

        #do many more operations

hist_queue = JoinableQueue()
full_hist_queue = JoinableQueue()

if __name__ == '__main__':
    pool = mp.Pool(num_sections)
    args = np.arange(num_sections)
    pool.map_async(main_function, args, chunksize=1)
    #I need the map_async because the program is designed to display an output at the
    #end of each iteration, and each output must be a compilation of all processes

    #a few variable definitions go here

    for m in range(1, num_iterations+1):
        for i in range(num_sections):
            temp_hist = hist_queue.get()  #the code hangs here because the queue
                                          #is attempting to get before anything
                                          #has been put
            hist_full += temp_hist
        for i in range(num_sections):
            hist_queue.task_done()
        for i in range(num_sections):
            full_hist_queue.put(hist_full)  #the full histogram is sent back into
                                            #the pool
        full_hist_queue.join()

        #etc etc

    pool.close()
    pool.join()
I'm pretty sure that your issue is how you're creating the Queues and trying to share them with the child processes. If you just have them as global variables, they may be recreated in the child processes instead of inherited (the exact details depend on what OS and/or context you're using for multiprocessing).
A better way to go about solving this issue is to avoid using multiprocessing.Pool to spawn your processes and instead explicitly create Process instances for your workers yourself. This way you can pass the Queue instances to the processes that need them without any difficulty (it's technically possible to pass the queues to the Pool workers, but it's awkward).
I'd try something like this:
def worker_function(section_number, hist_queue, full_hist_queue):  # take queues as arguments
    # ... the rest of the function can work as before
    # note, I renamed this from "main_function" since it's not running in the main process
    ...

if __name__ == '__main__':
    hist_queue = JoinableQueue()       # create the queues only in the main process
    full_hist_queue = JoinableQueue()  # the workers don't need to access them as globals

    processes = [Process(target=worker_function, args=(i, hist_queue, full_hist_queue))
                 for i in range(num_sections)]
    for p in processes:
        p.start()
    # ...
If the different stages of your worker function are more or less independent of one another (that is, the "do many more operations" step doesn't depend directly on the "do several operations" step above it, just on full_histogram), you might be able to keep the Pool and instead split up the different steps into separate functions, which the main process could call via several calls to map on the pool. You don't need to use your own Queues in this approach, just the ones built in to the Pool. This might be best especially if the number of "sections" you're splitting the work up into doesn't correspond closely with the number of processor cores on your computer. You can let the Pool match the number of cores, and have each one work on several sections of the data in turn.
A rough sketch of this would be something like:
def worker_make_hist(section_number):
    # do several operations, get a partial histogram
    return histogram_this_section

def worker_do_more_ops(section_number, full_histogram):
    # whatever...
    return some_result

if __name__ == "__main__":
    pool = multiprocessing.Pool()  # by default the size will be equal to the number of cores

    for temp_hist in pool.imap_unordered(worker_make_hist, range(number_of_sections)):
        hist_full += temp_hist

    some_results = pool.starmap(worker_do_more_ops, zip(range(number_of_sections),
                                                        itertools.repeat(hist_full)))

Multiprocessing with python3 only runs once

I have a problem running multiple processes in Python 3.
My program does the following:
1. Takes entries from an sqllite database and passes them to an input_queue
2. Create multiple processes that take items off the input_queue, run it through a function and output the result to the output queue.
3. Create a thread that takes items off the output_queue and prints them (This thread is obviously started before the first 2 steps)
My problem is that currently the 'function' in step 2 is only run as many times as the number of processes set, so for example if you set the number of processes to 8, it only runs 8 times then stops. I assumed it would keep running until it took all items off the input_queue.
Do I need to rewrite the function that takes the entries out of the database (step 1) into another process and then pass its output queue as an input queue for step 2?
Edit:
Here is an example of the code. I used a list of numbers as a substitute for the database entries, as it still performs the same way. I have 300 items in the list and I would like it to process all 300 items, but at the moment it just processes 10 (the number of processes I have assigned).
#!/usr/bin/python3
from multiprocessing import Process, Queue
import multiprocessing
from threading import Thread

## This is the class that would be passed to the multi_processing function
class Processor:
    def __init__(self, out_queue):
        self.out_queue = out_queue

    def __call__(self, in_queue):
        data_entry = in_queue.get()
        result = data_entry*2
        self.out_queue.put(result)

#Performs the multiprocessing
def perform_distributed_processing(dbList, threads, processor_factory, output_queue):
    input_queue = Queue()

    # Create the Data processors.
    for i in range(threads):
        processor = processor_factory(output_queue)
        data_proc = Process(target=processor,
                            args=(input_queue,))
        data_proc.start()

    # Push entries to the queue.
    for entry in dbList:
        input_queue.put(entry)

    # Push stop markers to the queue, one for each thread.
    for i in range(threads):
        input_queue.put(None)

    data_proc.join()
    output_queue.put(None)

if __name__ == '__main__':
    output_results = Queue()

    def output_results_reader(queue):
        while True:
            item = queue.get()
            if item is None:
                break
            print(item)

    # Establish results collecting thread.
    results_process = Thread(target=output_results_reader, args=(output_results,))
    results_process.start()

    # Use this as a substitute for the database in the example
    dbList = [i for i in range(300)]

    # Perform multi processing
    perform_distributed_processing(dbList, 10, Processor, output_results)

    # Wait for it all to finish.
    results_process.join()
A collection of processes that service an input queue and write to an output queue is pretty much the definition of a process pool.
If you want to know how to build one from scratch, the best way to learn is to look at the source code for multiprocessing.Pool, which is pretty simple Python, and very nicely written. But, as you might expect, you can just use multiprocessing.Pool instead of re-implementing it. The examples in the docs are very nice.
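For the code in the question, a rough sketch along those lines (the doubling stands in for the real per-entry work) could be:

from multiprocessing import Pool

def process_entry(data_entry):
    # stand-in for the real per-entry work
    return data_entry * 2

if __name__ == '__main__':
    db_list = list(range(300))  # substitute for the database entries
    with Pool(processes=8) as pool:
        results = pool.map(process_entry, db_list)
    print(len(results), 'results')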
But really, you could make this even simpler by using an executor instead of a pool. It's hard to explain the difference (again, read the docs for both modules), but basically, a future is a "smart" result object, which means instead of a pool with a variety of different ways to run jobs and get results, you just need a dumb thing that doesn't know how to do anything but return futures. (Of course in the most trivial cases, the code looks almost identical either way…)
from concurrent.futures import ProcessPoolExecutor

def Processor(data_entry):
    return data_entry*2

def perform_distributed_processing(dbList, threads, processor_factory):
    with ProcessPoolExecutor(max_workers=threads) as executor:
        yield from executor.map(processor_factory, dbList)

if __name__ == '__main__':
    # Use this as a substitute for the database in the example
    dbList = [i for i in range(300)]
    for result in perform_distributed_processing(dbList, 8, Processor):
        print(result)
Or, if you want to handle them as they come instead of in order:
from concurrent.futures import Future, ProcessPoolExecutor, as_completed

def perform_distributed_processing(dbList, threads, processor_factory):
    with ProcessPoolExecutor(max_workers=threads) as executor:
        fs = (executor.submit(processor_factory, db) for db in dbList)
        yield from map(Future.result, as_completed(fs))
Notice that I also replaced your in-process queue and thread, because it wasn't doing anything but providing a way to interleave "wait for the next result" and "process the most recent result", and yield (or yield from, in this case) does that without all the complexity, overhead, and potential for getting things wrong.
Don't try to rewrite the whole multiprocessing library again. I think you can use any of the multiprocessing.Pool methods depending on your needs - if this is a batch job you can even use the synchronous multiprocessing.Pool.map() - only instead of pushing to an input queue, you write a generator that yields input to the pool.
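A minimal sketch of that suggestion, with a hypothetical read_rows() generator standing in for the sqlite query:

from multiprocessing import Pool

def process_entry(row):
    return row * 2

def read_rows():
    # hypothetical generator standing in for iterating over the sqlite cursor
    for row in range(300):
        yield row

if __name__ == '__main__':
    with Pool(processes=8) as pool:
        for result in pool.imap(process_entry, read_rows()):
            print(result)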

Python permutations threads

I have generated permutations with the itertools.permutations function in Python. The problem is that the result is very big and I would like to go through it with multiple threads, but I don't really know how to accomplish that. Here is what I have so far:
perms = itertools.permutations('1234', r=4)

#I would like to iterate through 'perms' with multiple threads
for perm in perms:
    print perm
If the work you want to do with the items from the permutation generator is CPU intensive, you probably want to use processes rather than threads. CPython's Global Interpreter Lock (GIL) makes multithreading of limited value when doing CPU bound work.
Instead, use the multiprocessing module's Pool class, like so:
import multiprocessing
import itertools

def do_stuff(perm):
    # whatever
    return list(reversed(perm))

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:  # default is optimal number of processes
        results = pool.map(do_stuff, itertools.permutations('1234', r=4))
        # do stuff with results
Note that if you will be iterating over results (rather than doing something with it as a list), you can use imap instead of map to get an iterator that you can use to work on the results as they are produced from the worker processes. If it doesn't matter what order the items are returned, you can use imap_unordered to (I think) save a bit of memory.
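For example, the streaming variant could look like this (results arrive in completion order rather than input order):

import itertools
import multiprocessing

def do_stuff(perm):
    # whatever
    return list(reversed(perm))

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        for result in pool.imap_unordered(do_stuff, itertools.permutations('1234', r=4)):
            print(result)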
The if __name__ == "__main__" boilerplate is required on Windows, where the multiprocessing module has to work around the OS's limitations (no fork).
Split the range of permutation indices between the workers, then use a function that generates the permutation directly from its index in each worker, rather than generating all the perms and splitting them between workers.
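The helper itself isn't shown in this answer; a hedged sketch of the idea (computing the k-th permutation directly from its index via the factorial number system, then handing each worker its own index range, shown here with processes rather than threads) could look like this:

import math
from multiprocessing import Pool

def nth_permutation(seq, k):
    # build the k-th permutation (in lexicographic order) of seq directly from k
    seq = list(seq)
    out = []
    k %= math.factorial(len(seq))
    for i in range(len(seq), 0, -1):
        idx, k = divmod(k, math.factorial(i - 1))
        out.append(seq.pop(idx))
    return tuple(out)

def work_on_range(args):
    seq, start, stop = args
    # each worker regenerates only its own block of permutations
    return [nth_permutation(seq, k) for k in range(start, stop)]

if __name__ == "__main__":
    seq = '1234'
    total = math.factorial(len(seq))
    n_workers = 4
    step = total // n_workers
    chunks = [(seq, i, min(i + step, total)) for i in range(0, total, step)]
    with Pool(n_workers) as pool:
        for block in pool.map(work_on_range, chunks):
            for perm in block:
                print(perm)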
Assuming your processing function is f(x), you want to do:
import itertools
from multiprocessing import Pool

def f(x):
    return ''.join(x)  # placeholder: replace with your real per-permutation processing

if __name__ == '__main__':
    pool = Pool(processes=4)  # start 4 worker processes
    perms = itertools.permutations('1234', r=4)
    for r in pool.map(f, perms):
        print(r)
In fact, using threads would not execute the work in parallel unless it is IO bound. If it is CPU bound and you have a quad core, then multiprocessing is the way to go. If you don't have multiple cores and it is CPU bound, then I'm afraid that making it parallel will not improve your current situation.
Python's concurrent.futures module makes it easy to split work between threads. In this example, 4 threads will be used, but you can modify that to suit your needs.
from concurrent import futures
import itertools

perms = itertools.permutations('1234', r=4)

def thread_process(perm):
    # do something
    print(perm)

with futures.ThreadPoolExecutor(max_workers=4) as executor:
    for perm in perms:
        executor.submit(thread_process, perm)
