I would like to use a numpy array in shared memory for use with the multiprocessing module. The difficulty is using it like a numpy array, and not just as a ctypes array.
from multiprocessing import Process, Array
import scipy

def f(a):
    a[0] = -a[0]

if __name__ == '__main__':
    # Create the array
    N = int(10)
    unshared_arr = scipy.rand(N)
    arr = Array('d', unshared_arr)
    print "Originally, the first two elements of arr = %s"%(arr[:2])

    # Create, start, and finish the child processes
    p = Process(target=f, args=(arr,))
    p.start()
    p.join()

    # Printing out the changed values
    print "Now, the first two elements of arr = %s"%arr[:2]
This produces output such as:
Originally, the first two elements of arr = [0.3518653236697369, 0.517794725524976]
Now, the first two elements of arr = [-0.3518653236697369, 0.517794725524976]
The array can be accessed in a ctypes manner, e.g. arr[i] makes sense. However, it is not a numpy array, and I cannot perform operations such as -1*arr, or arr.sum(). I suppose a solution would be to convert the ctypes array into a numpy array. However (besides not being able to make this work), I don't believe it would be shared anymore.
It seems there would be a standard solution to what has to be a common problem.
To add to @unutbu's (not available anymore) and @Henry Gomersall's answers. You could use shared_arr.get_lock() to synchronize access when needed:
shared_arr = mp.Array(ctypes.c_double, N)
# ...
def f(i): # could be anything numpy accepts as an index, such as another numpy array
    with shared_arr.get_lock(): # synchronize access
        arr = np.frombuffer(shared_arr.get_obj()) # no data copying
        arr[i] = -arr[i]
Example
import ctypes
import logging
import multiprocessing as mp

from contextlib import closing

import numpy as np

info = mp.get_logger().info

def main():
    logger = mp.log_to_stderr()
    logger.setLevel(logging.INFO)

    # create shared array
    N, M = 100, 11
    shared_arr = mp.Array(ctypes.c_double, N)
    arr = tonumpyarray(shared_arr)

    # fill with random values
    arr[:] = np.random.uniform(size=N)
    arr_orig = arr.copy()

    # write to arr from different processes
    with closing(mp.Pool(initializer=init, initargs=(shared_arr,))) as p:
        # many processes access the same slice
        stop_f = N // 10
        p.map_async(f, [slice(stop_f)]*M)

        # many processes access different slices of the same array
        assert M % 2 # odd
        step = N // 10
        p.map_async(g, [slice(i, i + step) for i in range(stop_f, N, step)])
    p.join()
    assert np.allclose(((-1)**M)*tonumpyarray(shared_arr), arr_orig)

def init(shared_arr_):
    global shared_arr
    shared_arr = shared_arr_ # must be inherited, not passed as an argument

def tonumpyarray(mp_arr):
    return np.frombuffer(mp_arr.get_obj())

def f(i):
    """synchronized."""
    with shared_arr.get_lock(): # synchronize access
        g(i)

def g(i):
    """no synchronization."""
    info("start %s" % (i,))
    arr = tonumpyarray(shared_arr)
    arr[i] = -1 * arr[i]
    info("end %s" % (i,))

if __name__ == '__main__':
    mp.freeze_support()
    main()
If you don't need synchronized access or you create your own locks then mp.Array() is unnecessary. You could use mp.sharedctypes.RawArray in this case.
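For illustration, here is a minimal, unsynchronized sketch of that variant (mp.RawArray('d', N) creates the same object as mp.sharedctypes.RawArray; the helper names init and negate are just illustrative, not from the answer above):
import multiprocessing as mp
import numpy as np

def init(raw_arr_):
    # make the RawArray visible to the pool workers (inherited or sent at startup)
    global raw_arr
    raw_arr = raw_arr_

def negate(i):
    # view the shared buffer as a numpy array without copying; no lock is taken
    arr = np.frombuffer(raw_arr, dtype=np.float64)
    arr[i] = -arr[i]

if __name__ == '__main__':
    N = 10
    raw_arr = mp.RawArray('d', N)  # same object as mp.sharedctypes.RawArray
    np.frombuffer(raw_arr, dtype=np.float64)[:] = np.random.uniform(size=N)
    with mp.Pool(initializer=init, initargs=(raw_arr,)) as pool:
        pool.map(negate, range(N))
    print(np.frombuffer(raw_arr, dtype=np.float64))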
The Array object has a get_obj() method associated with it, which returns the ctypes array which presents a buffer interface. I think the following should work...
from multiprocessing import Process, Array
import scipy
import numpy

def f(a):
    a[0] = -a[0]

if __name__ == '__main__':
    # Create the array
    N = int(10)
    unshared_arr = scipy.rand(N)
    a = Array('d', unshared_arr)
    print "Originally, the first two elements of arr = %s"%(a[:2])

    # Create, start, and finish the child process
    p = Process(target=f, args=(a,))
    p.start()
    p.join()

    # Print out the changed values
    print "Now, the first two elements of arr = %s"%a[:2]

    b = numpy.frombuffer(a.get_obj())

    b[0] = 10.0
    print a[0]
When run, this prints out the first element of a now being 10.0, showing a and b are just two views into the same memory.
To make sure it is still multiprocess-safe, I believe you will have to use the acquire and release methods that exist on the Array object, a, and its built-in lock to make sure it is all safely accessed (though I'm not an expert on the multiprocessing module).
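For reference, here is a minimal sketch of that locking pattern; using the lock returned by get_lock() as a context manager is equivalent to calling acquire and release around the access (the function name safe_negate_first is illustrative, not from the answer):
import numpy
from multiprocessing import Process, Array

def safe_negate_first(a):
    # hold the lock bundled with the Array while touching the shared buffer
    with a.get_lock():
        b = numpy.frombuffer(a.get_obj())  # still no copy, just a view
        b[0] = -b[0]

if __name__ == '__main__':
    a = Array('d', range(10))
    p = Process(target=safe_negate_first, args=(a,))
    p.start()
    p.join()
    print(a[0])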
While the answers already given are good, there is a much easier solution to this problem provided two conditions are met:
You are on a POSIX-compliant operating system (e.g. Linux, Mac OSX); and
Your child processes need read-only access to the shared array.
In this case you do not need to fiddle with explicitly making variables shared, as the child processes will be created using a fork. A forked child automatically shares the parent's memory space. In the context of Python multiprocessing, this means it shares all module-level variables; note that this does not hold for arguments that you explicitly pass to your child processes or to the functions you call on a multiprocessing.Pool.
A simple example:
import multiprocessing
import numpy as np

# will hold the (implicitly mem-shared) data
data_array = None

# child worker function
def job_handler(num):
    # built-in id() returns unique memory ID of a variable
    return id(data_array), np.sum(data_array)

def launch_jobs(data, num_jobs=5, num_worker=4):
    global data_array
    data_array = data

    pool = multiprocessing.Pool(num_worker)
    return pool.map(job_handler, range(num_jobs))

# create some random data and execute the child jobs
mem_ids, sumvals = zip(*launch_jobs(np.random.rand(10)))

# this will print 'True' on POSIX OS, since the data was shared
print(np.all(np.asarray(mem_ids) == id(data_array)))
I've written a small python module that uses POSIX shared memory to share numpy arrays between python interpreters. Maybe you will find it handy.
https://pypi.python.org/pypi/SharedArray
Here's how it works:
import numpy as np
import SharedArray as sa
# Create an array in shared memory
a = sa.create("test1", 10)
# Attach it as a different array. This can be done from another
# python interpreter as long as it runs on the same computer.
b = sa.attach("test1")
# See how they are actually sharing the same memory block
a[0] = 42
print(b[0])
# Destroying a does not affect b.
del a
print(b[0])
# See how "test1" is still present in shared memory even though we
# destroyed the array a.
sa.list()
# Now destroy the array "test1" from memory.
sa.delete("test1")
# The array b is not affected, but once you destroy it then the
# data are lost.
print(b[0])
You can use the sharedmem module: https://bitbucket.org/cleemesser/numpy-sharedmem
Here's your original code then, this time using shared memory that behaves like a NumPy array (note the additional last statement calling a NumPy sum() function):
from multiprocessing import Process
import sharedmem
import scipy

def f(a):
    a[0] = -a[0]

if __name__ == '__main__':
    # Create the array
    N = int(10)
    unshared_arr = scipy.rand(N)
    arr = sharedmem.empty(N)
    arr[:] = unshared_arr.copy()
    print "Originally, the first two elements of arr = %s"%(arr[:2])

    # Create, start, and finish the child process
    p = Process(target=f, args=(arr,))
    p.start()
    p.join()

    # Print out the changed values
    print "Now, the first two elements of arr = %s"%arr[:2]

    # Perform some NumPy operation
    print arr.sum()
With Python 3.8+ there is the multiprocessing.shared_memory module in the standard library:
# np_sharing.py
from multiprocessing import Process
from multiprocessing.managers import SharedMemoryManager
from multiprocessing.shared_memory import SharedMemory
from typing import Tuple

import numpy as np


def create_np_array_from_shared_mem(
    shared_mem: SharedMemory, shared_data_dtype: np.dtype, shared_data_shape: Tuple[int, ...]
) -> np.ndarray:
    arr = np.frombuffer(shared_mem.buf, dtype=shared_data_dtype)
    arr = arr.reshape(shared_data_shape)
    return arr


def child_process(
    shared_mem: SharedMemory, shared_data_dtype: np.dtype, shared_data_shape: Tuple[int, ...]
):
    """Logic to be executed by the child process"""
    arr = create_np_array_from_shared_mem(shared_mem, shared_data_dtype, shared_data_shape)
    arr[0, 0] = -arr[0, 0]  # modify the array backed by shared memory


def main():
    """Logic to be executed by the parent process"""

    # Data to be shared:
    data_to_share = np.random.rand(10, 10)

    SHARED_DATA_DTYPE = data_to_share.dtype
    SHARED_DATA_SHAPE = data_to_share.shape
    SHARED_DATA_NBYTES = data_to_share.nbytes

    with SharedMemoryManager() as smm:
        shared_mem = smm.SharedMemory(size=SHARED_DATA_NBYTES)

        arr = create_np_array_from_shared_mem(shared_mem, SHARED_DATA_DTYPE, SHARED_DATA_SHAPE)
        arr[:] = data_to_share  # load the data into shared memory

        print(f"The [0,0] element of arr is {arr[0,0]}")  # before

        # Run child process:
        p = Process(target=child_process, args=(shared_mem, SHARED_DATA_DTYPE, SHARED_DATA_SHAPE))
        p.start()
        p.join()

        print(f"The [0,0] element of arr is {arr[0,0]}")  # after

        del arr  # delete np array so the shared memory can be deallocated


if __name__ == "__main__":
    main()
Running the script:
$ python3.10 np_sharing.py
The [0,0] element of arr is 0.262091705529628
The [0,0] element of arr is -0.262091705529628
Since the arrays in different processes share the same underlying memory buffer, the standard caveats regarding race conditions apply.
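If the writes do need coordination, one option (an assumption on my part, not part of the snippet above) is to pass an ordinary multiprocessing.Lock along with the shared memory name; a minimal sketch:
from multiprocessing import Lock, Process
from multiprocessing.shared_memory import SharedMemory

import numpy as np

def add_one(shm_name, lock):
    shm = SharedMemory(name=shm_name)
    arr = np.frombuffer(shm.buf, dtype=np.float64)
    with lock:  # guard the read-modify-write against the other processes
        arr[0] += 1.0
    del arr  # drop the view before closing the segment
    shm.close()

if __name__ == "__main__":
    lock = Lock()
    shm = SharedMemory(create=True, size=8)
    arr = np.frombuffer(shm.buf, dtype=np.float64)
    arr[0] = 0.0
    procs = [Process(target=add_one, args=(shm.name, lock)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(arr[0])  # 4.0: no lost updates
    del arr
    shm.close()
    shm.unlink()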
I thought that SharedMemory would keep values of target arrays, but when I actually tried it, it seems it doesn't.
from multiprocessing import Process, Semaphore, shared_memory
import numpy as np
import time

dtype_eV = np.dtype({ 'names':['idx', 'value', 'size'], \
                      'formats':['int32', 'float64', 'float64'] })

def worker_writer(id, number, a, shm):
    exst_shm = shared_memory.SharedMemory(name=shm)
    b = np.ndarray(a.shape, dtype=a.dtype, buffer=exst_shm.buf)
    for i in range(5):
        time.sleep(0.5)
        b['idx'][i] = i

def worker_reader(id, number, a, shm):
    exst_shm = shared_memory.SharedMemory(name=shm)
    b = np.ndarray(a.shape, dtype=a.dtype, buffer=exst_shm.buf)
    for i in range(5):
        time.sleep(1)
        print(b['idx'][i], b['value'][i])

if __name__ == "__main__":
    a = np.zeros(5, dtype=dtype_eV)
    a['value'] = 100
    shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
    c = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)

    th1 = Process(target=worker_writer, args=(1, 50000000, a, shm.name))
    th2 = Process(target=worker_reader, args=(2, 50000000, a, shm.name))
    th1.start()
    th2.start()
    th1.join()
    th2.join()

'''
result:
0 0.0
1 0.0
2 0.0
3 0.0
4 0.0
'''
In the code above, the two processes can share one array (a) and access it. But the value that was set before sharing (a['value'] = 100) is missing. Is that expected, or is there a way to keep the value after sharing?
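As far as I can tell, the values are missing because the shared block is brand-new memory that is never initialized from a: each worker builds its view on shm.buf, while a itself is only pickled to the children for its shape and dtype. A minimal sketch of copying the data into the shared buffer once, before starting the workers (reusing the question's dtype):
import numpy as np
from multiprocessing import shared_memory

dtype_eV = np.dtype({'names': ['idx', 'value', 'size'],
                     'formats': ['int32', 'float64', 'float64']})

a = np.zeros(5, dtype=dtype_eV)
a['value'] = 100

shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
c = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
c[:] = a  # copy the initial values into the shared block once
print(c['value'])  # [100. 100. 100. 100. 100.] - visible to every view attached to shm

shm.close()
shm.unlink()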
Here's an example of how to use shared_memory using numpy. It was pasted together from several of my other answers, but there are a couple pitfalls to keep in mind with shared_memory:
When you create a numpy ndarray from a shm object, it doesn't prevent the shm from being garbage collected. The unfortunate side effect of this is that the next time you try to access the array, you get a segfault. From another question I created a quick ndarray subclass to just attach the shm as an attribute, so a reference sticks around, and it doesn't get GC'd.
Another pitfall is that on Windows, the OS does the tracking of when to delete the memory rather than giving you the access to do so. That means that even if you don't call unlink, the memory will get deleted if there are no active references to that particular segment of memory (given by the name). The way to solve this is to make sure you keep an shm open on the main process that outlives all child processes. Calling close and unlink at the end keeps that reference to the end, and makes sure on other platforms you don't leak memory.
import numpy as np
import multiprocessing as mp
from multiprocessing.shared_memory import SharedMemory

class SHMArray(np.ndarray): #copied from https://numpy.org/doc/stable/user/basics.subclassing.html#slightly-more-realistic-example-attribute-added-to-existing-array
    '''an ndarray subclass that holds on to a ref of shm so it doesn't get garbage collected too early.'''
    def __new__(cls, input_array, shm=None):
        obj = np.asarray(input_array).view(cls)
        obj.shm = shm
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.shm = getattr(obj, 'shm', None)

def child_func(name, shape, dtype):
    shm = SharedMemory(name=name)
    arr = SHMArray(np.ndarray(shape, buffer=shm.buf, dtype=dtype), shm)
    arr[:] += 5
    shm.close() #be sure to cleanup your shm's locally when they're not needed (referring to arr after this will segfault)

if __name__ == "__main__":
    shape = (10,) # 1d array 10 elements long
    dtype = 'f4' # 32 bit floats
    dummy_array = np.ndarray(shape, dtype=dtype) #dummy array to calculate nbytes
    shm = SharedMemory(create=True, size=dummy_array.nbytes)
    arr = np.ndarray(shape, buffer=shm.buf, dtype=dtype) #create the real arr backed by the shm
    arr[:] = 0
    print(arr) #should print arr full of 0's
    p1 = mp.Process(target=child_func, args=(shm.name, shape, dtype))
    p1.start()
    p1.join()
    print(arr) #should print arr full of 5's
    shm.close() #be sure to cleanup your shm's
    shm.unlink() #call unlink when the actual memory can be deleted
Alternative, without the dummy array:
import math
s = np.dtype(dtype).itemsize * math.prod(shape)
# see https://stackoverflow.com/questions/16972501/size-of-data-type-using-numpy/16972612#16972612
shm = SharedMemory(create=True, size=s)
Instead of:
dummy_array = np.ndarray(shape, dtype=dtype) #dummy array to calculate nbytes
shm = SharedMemory(create=True, size=dummy_array.nbytes)
I have a 2-dimensional matrix (say 5000 rows x 8000 columns) containing integers. I want to multiply each element of the matrix by 2 using multiprocessing in Python, so that each process gets a set of rows to work on and a target function "array_mult" which does the job on the partition of the matrix it has been sent.
Array has been partitioned by rows and each partition sent to a (sub)process
import time,os
import multiprocessing as mp

A=[[1,2,3],[4,5,6],[7,8,9]]
global arr
''' I am trying to use a global variable to write the output of the function so that
the storage is persistent and the output doesn't vanish when the process ends'''

def array_mult(a):
    '''This is the function which is supposed to
    multiply each element of input matrix a'''
    print("array is =",a)
    for i in range(len(a)):
        print("counter is",i)
        a[i]=a[i]*2
    print(a,os.getpid())
    arr.append(a)

if __name__ == '__main__':
    starttime = time.time()
    array_proc=list()
    for i in range(3):
        p=mp.Process(target=array_mult, args=(A[i], )) ### I am trying to send partitions of the list as the arg to the function array_mult
        array_proc.append(p)
        p.start()
    for process in array_proc:
        process.join()
    print(time.time() - starttime)
    print(A)
    print(arr)
**CONSTRAINT: Cannot use anything outside of Python core modules or any functionality below Python 3.6**
Is using the ctypes library and RawArray useful here? If so, how can I use it?
Any other idea for holding the 2-dimensional matrix? (I don't want to use numpy as it's not a core package.)
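One compact way to use RawArray here is to keep a single flat shared array and index element (r, c) as r*COLS + c, letting each process double its own band of rows in place. This is only a minimal sketch under the stated constraints (sizes and names are illustrative); the fuller Structure-based solution follows below:
import multiprocessing as mp
from multiprocessing import sharedctypes

ROWS, COLS = 4, 6  # small sizes for illustration

def double_rows(arr, row_start, row_stop):
    # each worker doubles its own band of rows in place; results persist in shared memory
    for r in range(row_start, row_stop):
        for c in range(COLS):
            arr[r * COLS + c] *= 2

if __name__ == '__main__':
    # one flat shared block of C ints; element (r, c) lives at index r*COLS + c
    shared = sharedctypes.RawArray('i', ROWS * COLS)
    for r in range(ROWS):
        for c in range(COLS):
            shared[r * COLS + c] = r * 10 + c

    mid = ROWS // 2
    procs = [mp.Process(target=double_rows, args=(shared, 0, mid)),
             mp.Process(target=double_rows, args=(shared, mid, ROWS))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print([[shared[r * COLS + c] for c in range(COLS)] for r in range(ROWS)])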
'''Created on 12-Jun-2020

@author: Shouvik
'''
import time,os
import multiprocessing as mp
from multiprocessing import sharedctypes
from ctypes import Structure,c_int

num_of_columns=10000
num_of_rows=10000
num_of_cpu=len(os.sched_getaffinity(0))

class Row_Vector(Structure):
    _fields_ = [("column", c_int * num_of_columns)]

class array_2d(Structure):
    _fields_ = [("Rectangular_Matrix", Row_Vector * num_of_rows)]

'''create_row_boundaries returns a list with the row-numbers which partition the matrix row-wise '''
def create_row_boundaries(num_of_rows):
    num_of_cpu=len(os.sched_getaffinity(0))
    row_boundary=[0,(int(num_of_rows/num_of_cpu) if num_of_rows>1 else 1)]
    'we add num_of_rows/num_of_cpu to each partition'
    index=2
    while row_boundary[index-1]< (num_of_rows-1):
        row_boundary.append(int(row_boundary[index-1])+1)
        '''After adding integer number of 'num_of_rows/num_of_cpu' we can be left with any of 0,1,2... (num_of_cpu-1) rows
        which should be added to the last element'''
        if num_of_rows-num_of_cpu<row_boundary[index-1]+int(num_of_rows/num_of_cpu)<=num_of_rows :
            #print("num of rows- num of cpu is",num_of_rows-num_of_cpu,"row_boundary[index]+int(num_of_rows/num_of_cpu)", row_boundary[index]+int(num_of_rows/num_of_cpu))
            row_boundary.append(num_of_rows-1)
        else:
            row_boundary.append(int(row_boundary[index-1])+ int(num_of_rows/num_of_cpu))
        index =index+2
    return row_boundary

'''matrix_operation operates on each element of the matrix (type RawArray) that is passed to it'''
def matrix_operation(a,row_initial, row_final,column_initial, column_final):
    print("this instance of 'matrix_operation' is going to work from row number {} to row number{}".format(row_initial,row_final))
    for i in range(row_initial,row_final+1,1):
        for j in range(column_initial,column_final):
            a[i].column[j]=2 * a[i].column[j]
    # print("Row_initial{} to Row_final{} done by process id {}".format(row_initial,row_final,os.getpid()))
    # print("The process which operates on below matrix is",os.getpid())
    # for i in range(row_initial,row_final+1,1):
    #     print([a[i].column[j] for j in range(column_initial,column_final)],"pid is {}".format(os.getpid()))
    return

if __name__ == '__main__':
    '''We create a matrix of type Raw_Array having num_of_rows rows
    and each row is a column vector having num_of_columns columns and operate using multiprocessing'''
    m1=sharedctypes.RawArray(Row_Vector, num_of_rows)
    '''We create another matrix;operate on it sequentially on a single core of the cpu;time the operation'''
    m2=sharedctypes.RawArray(Row_Vector, num_of_rows)

    '''the two nested for loops below simply populate the matrices m1 and m2 with some values'''
    for i in range(num_of_rows):
        for j in range(num_of_columns):
            m1[i].column[j]=i*j
            m2[i].column[j]=i*j

    '''The for loop and the print statement below print the matrix row-wise'''
    # for i in range(num_of_rows):
    #     print([m1[i].column[j] for j in range(num_of_columns)] )

    matrix_partition_by_row_num=create_row_boundaries(num_of_rows)
    index=0
    array_proc=list()
    starttime = time.time()
    for i in range(int(len(matrix_partition_by_row_num)/2)): #originally range had num_of_cpu as the argument
        p=mp.Process(target=matrix_operation, args=(m1, matrix_partition_by_row_num[index],matrix_partition_by_row_num[index+1],0,num_of_columns))
        '''We pass the matrix m1 and it's various partitioning rows to the 'matrix_operation' function'''
        array_proc.append(p)
        p.start()
        index=index+2
    for process in array_proc:
        process.join()
    print("Time taken for concurrent operation is {:e}".format(time.time() - starttime))

    # for i in range(num_of_rows):
    #     print([m[i].column[j] for j in range(num_of_columns)] )
    print("no of cpu",num_of_cpu,"matrix partition values",matrix_partition_by_row_num)
    # for i in range(num_of_rows):
    #     print([m[i].column[j] for j in range(num_of_columns)])

    '''We will simply input the entire matrix to the matrix_operation function and time the process'''
    sequential_process_starttime=time.time()
    matrix_operation(m2, 0, num_of_rows-1, 0, num_of_columns)
    print("time taken for sequential operation is {:e}".format(time.time()-sequential_process_starttime))

    Is_matrix_operation_correct= True
    for i in range(num_of_rows):
        for j in range(num_of_columns):
            if (m1[i].column[j] != m2[i].column[j]) :
                Is_matrix_operation_correct=False
                break
    print("Is matrix operation correct: {}".format(Is_matrix_operation_correct) )
The Setup:
I have two arrays in shared memory, reals and imags:
#!/usr/bin/env python2
reals = multiprocessing.RawArray('d', 10000000)
imags = multiprocessing.RawArray('d', 10000000)
Then I make them numpy arrays, named reals2 and imags2, without any copy:
import numpy as np
reals2 = np.frombuffer(reals)
imags2 = np.frombuffer(imags)
# check if the objects did a copy
assert reals2.flags['OWNDATA'] is False
assert imags2.flags['OWNDATA'] is False
I would like to then make a np.complex128 1D-array data, again without copying the data, but I don't know how to.
The Questions:
Can you make a np.complex128 1D-array data from a pair of float arrays, without copying, yes/no?
If yes, how?
Short answer: no. But if you control the sender then there is a solution that does not require copying.
Longer answer:
From my research, I do not think there is a way to create a numpy complex array from two separate arrays without copying the data.
I believe you cannot do this because all of the compiled NumPy C code assumes interleaved real/imaginary data.
If you control the sender, you can get your data without any copy operations. Here's how!
#!/usr/bin/env python2
import multiprocessing
import numpy as np
# parent process creates some data that needs to be shared with the child processes
data = np.random.randn(10) + 1.0j * np.random.randn(10)
assert data.dtype == np.complex128
# copy the data from the parent process to shared memory
shared_data = multiprocessing.RawArray('d', 2 * data.size)
shared_data[0::2] = data.real
shared_data[1::2] = data.imag
# simulate the child process getting only the shared_data
data2 = np.frombuffer(shared_data)
assert data2.flags['OWNDATA'] is False
assert data2.dtype == np.float64
assert data2.size == 2 * data.size
# convert reals to complex
data3 = data2.view(np.complex128)
assert data3.flags['OWNDATA'] is False
assert data3.dtype == np.complex128
assert data3.size == data.size
assert np.all(data3 == data)
# done - if no AssertionError then success
print 'success'
hat tip to: https://stackoverflow.com/a/32877245/52074 as a great starting point.
Here's how to do the same processing, but with multiple processes being started, the data retrieved back from each process, and the returned data verified:
#!/usr/bin/env python2
import multiprocessing
import os

# third-party
import numpy as np

# constants
# =========
N_POINTS = 3
N_THREADS = 4

# functions
# =========
def func(index, shared_data, results_dict):
    # simulate the child process getting only the shared_data
    data2 = np.frombuffer(shared_data)
    assert data2.flags['OWNDATA'] is False
    assert data2.dtype == np.float64

    # convert reals to complex
    data3 = data2.view(np.complex128)
    assert data3.flags['OWNDATA'] is False
    assert data3.dtype == np.complex128

    print '[child.pid=%s,type=%s]: %s'%(os.getpid(), type(shared_data), data3)

    # return the results in a SLOW but relatively easy way
    results_dict[os.getpid()] = np.copy(data3) * index

# the script
# ==========
if __name__ == '__main__':
    # parent process creates some data that needs to be shared with the child processes
    data = np.random.randn(N_POINTS) + 1.0j * np.random.randn(N_POINTS)
    assert data.dtype == np.complex128

    # copy the data from the parent process to shared memory
    shared_data = multiprocessing.RawArray('d', 2 * data.size)
    shared_data[0::2] = data.real
    shared_data[1::2] = data.imag
    print '[parent]: ', type(shared_data), data

    # do multiprocessing
    manager = multiprocessing.Manager()
    results_dict = manager.dict()
    processes = []
    for index in xrange(N_THREADS):
        process = multiprocessing.Process(target=func, args=(index, shared_data, results_dict))
        processes.append(process)
    for process in processes:
        process.start()
    for process in processes:
        process.join()

    # get the results back from the processes
    results = [results_dict[process.pid] for process in processes]

    # verify the values from the processes
    for index in xrange(N_THREADS):
        result = results[index]
        assert np.all(result == data * index)
    del processes

    # done
    print 'success'
I am trying to understand how multiprocessing works with Python. Here's my test code:
import numpy as np
import multiprocessing
import time

def worker(a):
    for i in range(len(a)):
        for j in arr2:
            a[i] = a[i]*j
    return len(a)

arr2 = np.random.rand(10000).tolist()

if __name__ == '__main__':
    multiprocessing.freeze_support()
    cores = multiprocessing.cpu_count()
    arr1 = np.random.rand(1000000).tolist()
    tmp = time.time()
    pool = multiprocessing.Pool(processes=cores)
    result = pool.map(worker, [arr1], chunksize=1000000/(cores-1))
    print "mp time", time.time()-tmp
I have 8 cores. It usually ends up with 7 processes using only ~3% of the CPU for about a second, and the last process uses ~1/8 of the CPU for forever...(it has been running for about 15 minutes)
I understand that the interprocess communication usually bounds the complexity of parallel programming, but does it usually take this long? What else could cause the last process to take forever?
This thread: Python multiprocessing never joins seems to address a similar issue but it doesn't solve the problem with Pool.
It looks like you want to divide the work into chunks. You can use the range function to partition the data. On Linux, forked processes get a copy-on-write view of the parent's memory, so you can just pass down the indexes you want to work on. On Windows, no such luck: you need to pass in each sublist. This program should do it:
import numpy as np
import multiprocessing
import time
import platform

def worker(a):
    if platform.system() == "Linux":
        # on linux we passed in start:len
        start, length = a
        a = arr1[start:length]
    for i in range(len(a)):
        for j in arr2:
            a[i] = a[i]*j
    return len(a)

arr2 = np.random.rand(10000).tolist()

if __name__ == '__main__':
    multiprocessing.freeze_support()
    cores = multiprocessing.cpu_count()
    arr1 = np.random.rand(1000000).tolist()
    tmp = time.time()
    pool = multiprocessing.Pool(processes=cores)
    chunk = (len(arr1)+cores-1)//cores
    # on Windows, pass the sublist, on linux just the indexes and let the
    # worker split from the view of parent memory space
    if platform.system() == "Linux":
        seq = [(i, i+chunk) for i in range(0, len(arr1), chunk)]
    else:
        seq = [arr1[i:i+chunk] for i in range(0, len(arr1), chunk)]
    result = pool.map(worker, seq, chunksize=1)
    print "mp time", time.time()-tmp
Your problem is here:
pool.map automatically iterates over the object you pass it, which is [arr1] in your program. Notice that the object is [arr1], not arr1, so the iterable you hand to pool.map has a length of one and only a single task is ever created (and that one task processes the entire list).
I think the simplest solution is to replace [arr1] with arr1.
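A minimal sketch of that change: since the original worker expects a whole list, mapping over arr1 directly means each task receives a single element, so the worker below (worker_elem, an illustrative name) is adapted accordingly, and the array sizes are reduced so the demo finishes quickly:
import multiprocessing
import time
import numpy as np

def worker_elem(x):
    # one element per task: pool.map iterates over arr1 itself
    for j in arr2:
        x = x * j
    return x

arr2 = np.random.rand(1000).tolist()

if __name__ == '__main__':
    multiprocessing.freeze_support()
    cores = multiprocessing.cpu_count()
    arr1 = np.random.rand(100000).tolist()
    tmp = time.time()
    with multiprocessing.Pool(processes=cores) as pool:
        # chunksize groups many small tasks per IPC message
        result = pool.map(worker_elem, arr1, chunksize=len(arr1) // cores)
    print("mp time", time.time() - tmp)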
I'm trying to update a shared variable (a numpy array in a namespace) when using the multiprocessing module. However, the variable is not updated and I don't understand why.
Here is a sample code to illustrate this:
from multiprocessing import Process, Manager
import numpy as np

chunk_size = 15
arr_length = 1000
jobs = []
namespace = Manager().Namespace()
namespace.arr = np.zeros(arr_length)
nb_chunk = arr_length/chunk_size + 1

def foo(i, ns):
    from_idx = chunk_size*i
    to_idx = min(arr_length, chunk_size*(i+1))
    ns.arr[from_idx:to_idx] = np.random.randint(0, 100, to_idx-from_idx)

for i in np.arange(nb_chunk):
    p = Process(target=foo, args=(i, namespace))
    p.start()
    jobs.append(p)

for i in np.arange(nb_chunk):
    jobs[i].join()

print namespace.arr[:10]
You cannot share built-in objects like list or dict across processes in Python. In order to share data between processes, Python's multiprocessing provides two data structures:
Queue()
Pipe()
Also read: Exchanging objects between processes
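For instance, a minimal sketch of handing a numpy array back to the parent through a Queue; note that this copies the data through a pipe rather than sharing memory:
from multiprocessing import Process, Queue
import numpy as np

def worker(q, n):
    # the array is pickled and pushed through the queue; the parent receives a copy
    q.put(np.random.randint(0, 100, n))

if __name__ == '__main__':
    q = Queue()
    p = Process(target=worker, args=(q, 10))
    p.start()
    result = q.get()  # fetch before join() so a large payload cannot block the child
    p.join()
    print(result)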
The issue is that the Manager().Namespace() object doesn't notice that you're changing anything via ns.arr[from_idx:to_idx] = ... (as you're working on an inner data structure), and thus the change doesn't propagate to the other processes.
This answer explains very well what's going on here.
To fix it, create the list as a Manager().list() and pass this list to the processes, so that ns[from_idx:to_idx] = ... is recognized as a change and is propagated to the processes:
from multiprocessing import Process, Manager
import numpy as np

chunk_size = 15
arr_length = 1000
jobs = []
arr = Manager().list([0] * arr_length)
nb_chunk = arr_length/chunk_size + 1

def foo(i, ns):
    from_idx = chunk_size*i
    to_idx = min(arr_length, chunk_size*(i+1))
    ns[from_idx:to_idx] = np.random.randint(0, 100, to_idx-from_idx)

for i in np.arange(nb_chunk):
    p = Process(target=foo, args=(i, arr))
    p.start()
    jobs.append(p)

for i in np.arange(nb_chunk):
    jobs[i].join()

print arr[:10]