Updating object attributes from within module run in multiprocessor process - python

I am relatively new to python and definitely new to multiprocessing. I'm following this question/answer for the structure of my multiprocessing, but in def func_A, I'm calling a module that passes a class object as one of the arguments. In the module, I change an object attribute that I would like the main program to see and update the user with the object attribute value. The child processes run for very long times, so I need the main program to provide updates as they run.
My suspicion is that I'm not understanding namespace/object scoping or something similar, but from what I've read, passing an object (an instance of a class?) to a module as an argument passes a reference to the object and not a copy. I would have thought this meant that changing the attributes of the object in the child process/module would have changed the attributes in the main program object (since they're the same object). Or am I confusing things?
The code for my main program:
# MainProgram.py
import multiprocessing as mp
import time
from time import sleep
import sys
from datetime import datetime
import myModule
MYOBJECTNAMES = ['name1','name2']
class myClass:
    def __init__(self, name):
        self.name = name
        self.value = 0
myObjects = []
for n in MYOBJECTNAMES:
    myObjects.append(myClass(n))
def func_A(process_number, queue):
    start = datetime.now()
    print("Process {} (object: {}) started at {}".format(process_number, myObjects[process_number].name, start))
    myModule.Eval(myObjects[process_number])
    sys.stdout.flush()
def multiproc_master():
    queue = mp.Queue()
    proceed = mp.Event()
    processes = [mp.Process(target=func_A, args=(x, queue)) for x in range(len(myObjects))]
    for p in processes:
        p.start()
    for i in range(100):
        for o in myObjects:
            print("In main: Value of {} is {}".format(o.name, o.value))
        sleep(10)
    for p in processes:
        p.join()
if __name__ == '__main__':
    split_jobs = multiproc_master()
    print(split_jobs)
The code for my module program:
# myModule.py
from time import sleep
def Eval(myObject):
    for i in range(100):
        myObject.value += 1
        print("In module: Value of {} is {}".format(myObject.name, myObject.value))
        sleep(5)

The question/answer you linked to was probably a poor choice to use as a template, as it does many things that your code doesn't require (much less use).
I think your biggest misconception about how multiprocessing works is thinking that all the code runs in the same address space. The main process runs in its own address space, and each child process gets a separate one. The way your code is written, each of them ends up with its own separate myObjects list. That's why the main process doesn't see any of the changes made in the child processes.
While there are ways to share objects using the multiprocessing module, doing so often introduces significant overhead: keeping the objects in sync between all the processes requires a lot of work "under the covers" to make them seem shared (they can't actually be shared, because each process has its own address space). This overhead frequently cancels out any speed gained by parallel processing.
As stated in the documentation: "when doing concurrent programming it is usually best to avoid using shared state as far as possible".
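One way to get progress updates without shared state is to have each child send its values back through the queue that func_A already receives. The following is only a minimal sketch under assumptions: evaluate is a hypothetical stand-in for myModule.Eval, collect_updates is a hypothetical helper, and the (name, None) sentinel is just one possible convention for signalling "finished".

```python
import multiprocessing as mp
import queue as queue_module  # only needed for the queue.Empty exception


def evaluate(name, steps, progress_queue):
    """Hypothetical stand-in for myModule.Eval: report values instead of mutating shared state."""
    value = 0
    for _ in range(steps):
        value += 1
        progress_queue.put((name, value))
    progress_queue.put((name, None))  # sentinel: this worker is finished


def collect_updates(progress_queue, worker_count, timeout=5.0):
    """Drain the queue in the main process until every worker has sent its sentinel."""
    latest = {}
    finished = 0
    while finished < worker_count:
        try:
            name, value = progress_queue.get(timeout=timeout)
        except queue_module.Empty:
            break  # no update arrived in time; stop rather than block forever
        if value is None:
            finished += 1
        else:
            latest[name] = value
            print("In main: value of {} is {}".format(name, value))
    return latest


if __name__ == "__main__":
    names = ["name1", "name2"]
    progress_queue = mp.Queue()
    processes = [mp.Process(target=evaluate, args=(n, 3, progress_queue)) for n in names]
    for p in processes:
        p.start()
    print(collect_updates(progress_queue, len(processes)))
    for p in processes:
        p.join()
```

The main process reads from the queue before joining the children, so it can print updates while they are still running.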

Related

Working with deque object across multiple processes

I'm trying to reduce the processing time of reading a database with roughly 100,000 entries, which I need formatted in a specific way. To do this, I tried Python's multiprocessing Pool.map function, which works perfectly except that I can't get any form of queue reference to work across the workers.
I've been using information from Filling a queue and managing multiprocessing in python to guide me for using queues across multiple processes, and Using a global variable with a thread to guide me for using global variables across threads. I've gotten the software to work, but when I check the list/queue/dict/map length after running the process, it always returns zero.
I've written a simple example to show what I mean:
You have to run the script as a file, the map's initialize function does not work from the interpreter.
from multiprocessing import Pool
from collections import deque
global_q = deque()
def my_init(q):
    global global_q
    global_q = q
    q.append("Hello world")
def map_fn(i):
    global global_q
    global_q.append(i)
if __name__ == "__main__":
    with Pool(3, my_init, (global_q,)) as pool:
        pool.map(map_fn, range(3))
    for p in range(len(global_q)):
        print(global_q.pop())
Theoretically, when I pass the queue object reference from the main thread to the worker threads using the pool function, and then initialize each thread's global variables with the given function, then when I later insert elements into the queue from the map function, that object reference should still point to the original queue object (long story short, everything should end up in the same queue, because they all point to the same location in memory).
So, I expect:
Hello World
Hello World
Hello World
1
2
3
of course, the 1, 2, 3's are in arbitrary order, but what you'll see on the output is ''.
How come when I pass object references to the pool function, nothing happens?
Here's an example of how to share something between processes by extending the multiprocessing.managers.BaseManager class to support deques.
There's a Customized managers section in the documentation about creating them.
import collections
from multiprocessing import Pool
from multiprocessing.managers import BaseManager
class DequeManager(BaseManager):
    pass
class DequeProxy(object):
    def __init__(self, *args):
        self.deque = collections.deque(*args)
    def __len__(self):
        return self.deque.__len__()
    def appendleft(self, x):
        self.deque.appendleft(x)
    def append(self, x):
        self.deque.append(x)
    def pop(self):
        return self.deque.pop()
    def popleft(self):
        return self.deque.popleft()
# Currently only exposes a subset of deque's methods.
DequeManager.register('DequeProxy', DequeProxy,
                      exposed=['__len__', 'append', 'appendleft',
                               'pop', 'popleft'])
process_shared_deque = None  # Global only within each process.
def my_init(q):
    """ Initialize module-level global. """
    global process_shared_deque
    process_shared_deque = q
    q.append("Hello world")
def map_fn(i):
    process_shared_deque.append(i)  # deque's don't have a "put()" method.
if __name__ == "__main__":
    manager = DequeManager()
    manager.start()
    shared_deque = manager.DequeProxy()
    with Pool(3, my_init, (shared_deque,)) as pool:
        pool.map(map_fn, range(3))
    for p in range(len(shared_deque)):  # Show left-to-right contents.
        print(shared_deque.popleft())
Output:
Hello world
0
1
2
Hello world
Hello world
You can't use a global variable for multiprocessing.
Pass a multiprocessing Queue to the function instead.
from multiprocessing import Queue
queue = Queue()
def worker(q):
    q.put(something)
You are also probably in a situation where the code looks all right, but because the pool creates separate processes, the errors are separated too, so you don't see that the code is not only failing to work but also raising an error.
The reason your output is '' is that nothing was appended to your q/global_q. If anything was appended, it went to some other variable that may also be called global_q but is a totally different object from the global_q in your main process.
Try to print('Hello world') inside the function you want to multiprocess and you will see for yourself that nothing is actually printed at all. Those processes are simply outside of your main process, and the only way to access them is through multiprocessing Queues. You access the Queue with queue.put('something') and something = queue.get().
Try to understand this code and you will do well:
import multiprocessing as mp
# This queue will be shared among all processes, but you need to pass it
# as an argument to each process. You CANNOT use it as a global variable.
# The functions run in completely separate processes and nothing can
# really access them, except multiprocessing.Queue, which can be shared
# across all processes.
shared_queue = mp.Queue()
def channel(que, channel_num):
    que.put(channel_num)
if __name__ == '__main__':
    processes = [mp.Process(target=channel, args=(shared_queue, channel_num)) for channel_num in range(8)]
    for p in processes:
        p.start()
    for p in processes:  # wait for all processes to finish
        p.join()
    for i in range(8):  # Get data from Queue. (you can get data out of it at any time actually)
        print(shared_queue.get())

ThreadPoolExecutor, ProcessPoolExecutor and global variables

I am new to parallelization in general and concurrent.futures in particular. I want to benchmark my script and compare the differences between using threads and processes, but I found that I couldn't even get that running because when using ProcessPoolExecutor I cannot use my global variables.
The following code will output Hello as I expect, but when you change ThreadPoolExecutor to ProcessPoolExecutor, it will output None.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
greeting = None
def process():
    print(greeting)
    return None
def main():
    with ThreadPoolExecutor(max_workers=1) as executor:
        executor.submit(process)
    return None
def init():
    global greeting
    greeting = 'Hello'
    return None
if __name__ == '__main__':
    init()
    main()
I don't understand why this is the case. In my real program, init is used to set the global variables to CLI arguments, and there are a lot of them. Hence, passing them as arguments does not seem recommended. So how do I pass those global variables to each process/thread correctly?
I know that I can change things around, which will work, but I don't understand why. E.g. the following works for both Executors, but it also means that the globals initialisation has to happen for every instance.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
greeting = None
def init():
    global greeting
    greeting = 'Hello'
    return None
def main():
    with ThreadPoolExecutor(max_workers=1) as executor:
        executor.submit(process)
    return None
def process():
    init()
    print(greeting)
    return None
if __name__ == '__main__':
    main()
So my main question is, what is actually happening. Why does this code work with threads and not with processes? And, how do I correctly pass set globals to each process/thread without having to re-initialise them for every instance?
(Side note: because I have read that concurrent.futures might behave differently on Windows, I have to note that I am running Python 3.6 on Windows 10 64 bit.)
I'm not sure of the limitations of this approach, but you can pass (serializable?) objects between your main process/thread. This would also help you get rid of the reliance on global vars:
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
def process(opts):
    opts["process"] = "got here"
    print("In process():", opts)
    return None
def main(opts):
    opts["main"] = "got here"
    executor = [ProcessPoolExecutor, ThreadPoolExecutor][1]
    with executor(max_workers=1) as executor:
        executor.submit(process, opts)
    return None
def init(opts):  # Gather CLI opts and populate dict
    opts["init"] = "got here"
    return None
if __name__ == '__main__':
    cli_opts = {"__main__": "got here"}  # Initialize dict
    init(cli_opts)  # Populate dict
    main(cli_opts)  # Use dict
Works with both executor types.
Edit: Even though it sounds like it won't be a problem for your use case, I'll point out that with ProcessPoolExecutor, the opts dict you get inside process will be a separate, pickled copy, so mutations to it will not be visible across processes, nor will they be visible once you return to the __main__ block. ThreadPoolExecutor, on the other hand, will share the dict object between threads.
Actually, the first code of the OP will work as intended on Linux (tested in Python 3.6-3.8) because
On Unix a child process can make use of a shared resource created in a
parent process using a global resource.
as explained in the multiprocessing docs. However, it won't work on my Mac running Mojave (tested only with Python 3.8), even though macOS is supposed to be a UNIX-compliant OS. The likely reason is that Python 3.8 changed the default start method on macOS from fork to spawn, so child processes no longer inherit the parent's globals. It certainly won't work on Windows, and in general it is not a recommended practice with multiple processes.
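Whether a child inherits globals set at run time depends on the start method in use, which can be checked directly. This is a small sketch, not part of the original answer; describe_start_method is a hypothetical helper name, while get_start_method is the standard multiprocessing call.

```python
import multiprocessing as mp


def describe_start_method():
    """Return the active start method and whether children inherit run-time globals."""
    method = mp.get_start_method()
    # Only "fork" copies the parent's memory, including globals assigned after import.
    # "spawn" and "forkserver" children re-import the main module instead.
    inherits_runtime_globals = method == "fork"
    return method, inherits_runtime_globals


if __name__ == "__main__":
    method, inherits = describe_start_method()
    print("start method:", method)
    print("children inherit run-time globals:", inherits)
```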
Let's imagine that a process is a box and a thread is a worker inside a box. A worker can only access the resources in its own box and cannot touch the resources in other boxes.
So when you use threads, you are creating multiple workers for your current box (the main process). But when you use processes, you are creating another box. In that case, the global variables initialised in this box are completely different from the ones in the other box. That's why it doesn't work as you expect.
The solution given by jedwards is good enough for most situations. You can explicitly package the resources in the current box (serialize the variables) and deliver them to another box (transport them to another process) so that the workers in that box have access to the resources.
A process represents activity that is run in a separate process in the OS meaning of the term, while threads all run in your main process. Every process has its own unique namespace.
Your main process sets the value of greeting by calling init() inside your __name__ == '__main__' condition, for its own namespace only. In your new process this does not happen (__name__ is '__mp_main__' there), hence greeting remains None and init() is never actually called unless you do so explicitly in the function your process executes.
While sharing state between processes is generally not recommended, there are ways to do so, as outlined in @jedwards' answer.
You might also want to check Sharing State Between Processes from the docs.
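To set globals once per worker rather than once per task, an initializer can be used. This is a minimal sketch assuming Python 3.7 or later, where the executors accept initializer and initargs (the question mentions 3.6, which lacks them); init_worker is a hypothetical name.

```python
from concurrent.futures import ProcessPoolExecutor

greeting = None  # module-level global, set separately inside every worker process


def init_worker(value):
    """Runs once in each worker process when that worker starts."""
    global greeting
    greeting = value


def process():
    # Reads the global that init_worker set in this worker process.
    return greeting


def main():
    with ProcessPoolExecutor(max_workers=1,
                             initializer=init_worker,
                             initargs=("Hello",)) as executor:
        return executor.submit(process).result()


if __name__ == "__main__":
    print(main())
```

The values passed through initargs are pickled and sent to each worker once, so this suits a large set of CLI options that every task needs to read.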

Python multiprocessing: can I reuse processes (already parallelized functions) with updated global variable?

At first let me show you the current setup I have:
import multiprocessing.pool
from contextlib import closing
import os
def big_function(param):
    process(another_module.global_variable[param])
def dispatcher():
    # sharing read-only global variable taking benefit from Unix
    # which follows policy copy-on-update
    # https://stackoverflow.com/questions/19366259/
    another_module.global_variable = huge_list
    # send indices
    params = range(len(another_module.global_variable))
    with closing(multiprocessing.pool.Pool(processes=os.cpu_count())) as p:
        multiprocessing_result = list(p.imap_unordered(big_function, params))
    return multiprocessing_result
Here I use a shared variable, updated before creating the process pool, which contains huge data. That indeed gained me a speedup, so it seems not to be pickled now. This variable also belongs to the scope of an imported module (if that's important).
When I tried to create setup like this:
another_module.global_variable = []
p = multiprocessing.pool.Pool(processes=os.cpu_count())
def dispatcher():
    # sharing read-only global variable taking benefit from Unix
    # which follows policy copy-on-update
    # https://stackoverflow.com/questions/19366259/
    another_module_global_variable = huge_list
    # send indices
    params = range(len(another_module.global_variable))
    multiprocessing_result = list(p.imap_unordered(big_function, params))
    return multiprocessing_result
p "remembered" that the global shared list was empty and refused to use the new data when called from inside the dispatcher.
Now here is the problem: processing ~600 data objects on 8 cores with the first setup above, my parallel computation runs in 8 sec, while single-threaded it takes 12 sec.
This is what I think: as long as multiprocessing pickles data and I need to re-create the processes each time, I need to pickle the function big_function(), so I lose time on that. The situation with the data was partially solved using the global variable (but I still need to recreate the pool on each update of it).
What can I do with instances of big_function() (which depends on many other functions from other modules, numpy, etc.)? Can I create os.cpu_count() copies of it once and for all, and somehow feed new data into them and receive results, reusing the workers?
Just to go over the 'remembering' issue:
another_module.global_variable = []
p = multiprocessing.pool.Pool(processes=os.cpu_count())
def dispatcher():
    another_module_global_variable = huge_list
    params = range(len(another_module.global_variable))
    multiprocessing_result = list(p.imap_unordered(big_function, params))
    return multiprocessing_result
The problem seems to be the point at which you create the Pool instance.
Why is that?
It's because when you create an instance of Pool, it sets up a number of workers (by default equal to the number of CPU cores), and they are all started (forked) at that time. That means the workers have a copy of the parent's global state (another_module.global_variable among everything else). With the copy-on-write policy, when you update the value of another_module.global_variable, you change it only in the parent process; the workers still reference the old value. That is why you have a problem with it.
Here are a couple of links that can give you more explanation on this: this and this.
Here is a small snippet where you can swap the line that changes the global variable's value with the line that starts the process, and check what is printed in the child process.
from __future__ import print_function
import multiprocessing as mp
glob = dict()
glob[0] = [1, 2, 3]
def printer(a):
    print(globals())
    print(a, glob[0])
if __name__ == '__main__':
    p = mp.Process(target=printer, args=(1,))
    p.start()
    glob[0] = 'test'
    p.join()
This is the Python2.7 code, but it works on Python3.6 too.
What would be the solution for this issue?
Well, go back to first solution. You update value of imported module's variable and then create pool of processes.
Now the real issue with the lack of speedup.
Here is the interesting part from documentation on how functions are pickled:
Note that functions (built-in and user-defined) are pickled by “fully
qualified” name reference, not by value. This means that only the
function name is pickled, along with the name of the module the
function is defined in. Neither the function’s code, nor any of its
function attributes are pickled. Thus the defining module must be
importable in the unpickling environment, and the module must contain
the named object, otherwise an exception will be raised.
This means that pickling your function should not be a time-wasting process, or at least not by itself. What causes the lack of speedup is that, for the ~600 data objects in the list you pass to the imap_unordered call, each one is passed to a worker process individually. Once again, the underlying implementation of multiprocessing.Pool may be the cause of this issue.
If you go deeper into the multiprocessing.Pool implementation, you will see that two threads using a Queue handle the communication between the parent and all child (worker) processes. Because of this, and because all processes constantly require arguments for the function and constantly return responses, you end up with a very busy parent process. That is why 'a lot' of time is spent on 'dispatching' work, passing data to and from the worker processes.
What to do about this?
Try to increase the number of data objects that are handed to a worker process at one time. In your example, you pass one data object after another, so each worker process is processing exactly one data object at any time. Why not increase the number of data objects you pass to a worker process? That way each process can stay busy processing 10, 20 or even more data objects per round trip. From what I can see, imap_unordered has a chunksize argument. It's set to 1 by default. Try increasing it. Something like this:
import multiprocessing.pool
from contextlib import closing
import os
def big_function(param):
    # chunksize only batches the items sent to a worker; the function
    # is still called with one index at a time
    return process(another_module.global_variable[param])
def dispatcher():
    # sharing read-only global variable taking benefit from Unix
    # which follows policy copy-on-update
    # https://stackoverflow.com/questions/19366259/
    another_module.global_variable = huge_list
    # send indices
    params = range(len(another_module.global_variable))
    with closing(multiprocessing.pool.Pool(processes=os.cpu_count())) as p:
        multiprocessing_result = list(p.imap_unordered(big_function, params, chunksize=10))
    return multiprocessing_result
A couple of pieces of advice:
I see that you create params as a list of indexes that you use to pick a particular data object in big_function. You can instead create tuples that represent the first and last index and pass them to big_function. This is another way of increasing the size of each piece of work, and an alternative to the approach I proposed above.
Unless you explicitly want Pool(processes=os.cpu_count()), you can omit the argument. By default it uses the number of CPU cores.
Sorry for the length of answer or any typo that might have sneaked in.
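The index-range alternative mentioned in the advice could look roughly like the sketch below. It is illustrative only: DATA stands in for another_module.global_variable, process is a hypothetical per-item function, and make_ranges is a hypothetical helper.

```python
import multiprocessing as mp

DATA = list(range(100))  # hypothetical stand-in for another_module.global_variable


def process(item):
    """Hypothetical per-item work."""
    return item * item


def make_ranges(total, size):
    """Split range(total) into (start, stop) tuples covering at most `size` indices each."""
    return [(start, min(start + size, total)) for start in range(0, total, size)]


def big_function(bounds):
    """Process one whole index range inside a worker and return its results."""
    start, stop = bounds
    return [process(DATA[i]) for i in range(start, stop)]


def dispatcher(chunk=10):
    with mp.Pool() as pool:
        nested = pool.imap_unordered(big_function, make_ranges(len(DATA), chunk))
        # Flatten inside the with-block, before the pool is shut down.
        return sorted(result for block in nested for result in block)


if __name__ == "__main__":
    print(dispatcher()[:5])
```

Each task now carries two integers instead of one index per item, so the parent sends far fewer messages for the same amount of work.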

multiprocessing's Queue inside Manager.Namespace()

I am currently creating a class which is supposed to execute some methods in a multi-threaded way, using the multiprocessing module. I execute the real computation using a Pool of n workers. Now I wanted to assign each of the currently n active workers an index between 0 and n for some other calculation. To do this, I wanted to use a shared Queue to assign an index in a way, that at every time no two workers have the same id. To share the same Queue inside the class between the different threads, I wanted to store it inside a Manager.Namespace(). But doing this, I got some problems with the Queue. Therefore, I created a minimal version of my problem and ended up with something like this:
from multiprocess import Process, Queue, Manager, Pool, cpu_count
class A(object):
    def __init__(self):
        manager = Manager()
        self.ns = manager.Namespace()
        self.ns.q = manager.Queue()
    def foo(self):
        for i in range(10):
            print(i)
            self.ns.q.put(i)
            print(self.ns.q.get())
            print(self.ns.q.qsize())
a = A()
a.foo()
In this code, the execution stops before the second print statement, so I think that no data is actually written to the Queue. When I remove the namespace-related code, it works flawlessly. Is this the intended behaviour of the multiprocessing objects and am I doing something wrong? Or is this some kind of bug?
Yes, you should not use Namespace here. When you put a Queue object into manager.Namespace(), each process will get a new Queue instance. The writers/readers of those newly created queue objects have no connection with the parent process, so no message will be received by the worker processes. Share the Queue on its own instead.
By the way, you mentioned "thread" many times, but in the context of the multiprocess module, a worker is a process, not a thread.
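Sharing the queue on its own, for the index-assignment use case in the question, might look like the sketch below. It is written against the standard multiprocessing module rather than multiprocess, and run_task and main are hypothetical names; the real computation is reduced to a placeholder.

```python
import multiprocessing as mp


def run_task(task, index_queue):
    """Borrow a free index for the duration of one task, then hand it back."""
    index = index_queue.get()        # blocks until an index is free
    try:
        return task, index           # placeholder for the real computation
    finally:
        index_queue.put(index)       # make the index available again


def main(n_workers=3, n_tasks=6):
    with mp.Manager() as manager:
        index_queue = manager.Queue()  # proxy object that can be passed to pool workers
        for i in range(n_workers):
            index_queue.put(i)
        with mp.Pool(n_workers) as pool:
            return pool.starmap(run_task, [(t, index_queue) for t in range(n_tasks)])


if __name__ == "__main__":
    print(main())
```

Because the queue holds exactly n_workers indices and each task returns its index when done, no two tasks running at the same time hold the same index.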

multiprocessing launch from within module or class, not from main()

I want to use Python's multiprocessing module to make effective use of multiple CPUs to speed up my processing.
All seems to work, however I want to run Pool.map(f, [item, item]) from within a class, in a submodule somewhere deep in my program. The reason is that the program has to prepare the data first and wait for certain events to happen before there is anything to be processed.
The multiprocessing docs say you can only run it from within an if __name__ == '__main__': statement. I don't understand the significance of that and tried it anyway, like so:
from multiprocessing import Pool
class Foo(object):
    n = 1000000
    def __init__(self, x):
        self.x = x + 1
        pass
    def run(self):
        for i in range(1,self.n):
            self.x *= 1.0*i/self.x
        return self
class Bar(object):
    def __init__(self):
        pass
    def go_all(self):
        work = [Foo(i) for i in range(960)]
        def do(obj):
            return obj.run()
        p = Pool(16)
        finished_work = p.map(do, work)
        return
bar = Bar()
bar.go_all()
It indeed doesn't work! I get the following error:
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
I don't quite understand why, as everything seems to be perfectly picklable. I have the following questions:
Can this be made to work without putting the p.map line in my main program?
If not, can "main" programs be called as sub-routines/modules, such to make it still work?
Is there some handy trick to loop back from a submodule to the main program and run it from there?
I'm on Linux and Python 2.7
I believe you misunderstood the documentation. What the documentation says is to do this:
if __name__ == '__main__':
    bar = Bar()
    bar.go_all()
So your p.map line does not need to be inside your "main function", or whatever. Only the code that actually spawns the subprocesses has to be "guarded". This is unavoidable due to limitations of the Windows OS.
Moreover, the function that you pass to Pool.map has to be importable (functions are pickled simply by their names; the interpreter then has to be able to import them to rebuild the function object when they are passed to the subprocess). So you should probably move your do function to the module level to avoid pickling errors.
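Put together, the two points above could look like the following sketch. It assumes Python 3 (Pool used as a context manager needs 3.3+, whereas the question is on 2.7), and the body of run is a cheap placeholder rather than the original loop.

```python
from multiprocessing import Pool


class Foo(object):
    def __init__(self, x):
        self.x = x + 1

    def run(self):
        self.x *= 2  # placeholder for the real, expensive computation
        return self


def do(obj):
    """Module-level function: pickled by name, so Pool.map can send it to workers."""
    return obj.run()


class Bar(object):
    def go_all(self, count=8, workers=2):
        work = [Foo(i) for i in range(count)]
        with Pool(workers) as pool:  # the pool can still be created deep inside a class
            return [foo.x for foo in pool.map(do, work)]


if __name__ == "__main__":
    # Only the entry point that ends up spawning processes needs the guard.
    print(Bar().go_all())
```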
The extra restrictions on the multiprocessing module on ms-windows stem from the fact that it doesn't have the fork system call. On UNIX-like operating systems, fork makes a perfect copy of a process and continues to run that next to the parent process. The only difference between them is that fork returns different value in the parent and child processes.
On ms-windows, multiprocessing needs to start a new Python instance using a native method to start processes. Then it needs to bring that Python instance into the same state as the "parent" process.
This means (among other things) that the Python code must be importable without side effects like trying to start yet another process. Hence the use of the if __name__ == '__main__' guard.
