I use this code:
import multiprocessing

def calc_square(numbers, result):
    for idx, n in enumerate(numbers):
        result[idx] = n * n

if __name__ == "__main__":
    numbers = [2, 3, 5]
    result = multiprocessing.Array('i', 3)
    p = multiprocessing.Process(target=calc_square, args=(numbers, result))
    p.start()
    p.join()
    print(result[:])
How do I make this loop run in parallel, for example on 3 different processes?
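A minimal sketch of one possible approach (not from the original post): give each process its own slice of the index range and let all of them write into the shared Array. The chunking scheme here is only illustrative.

import multiprocessing

def calc_square(numbers, result, start, end):
    # each process squares only the slice [start, end) of the input
    for idx in range(start, end):
        result[idx] = numbers[idx] * numbers[idx]

if __name__ == "__main__":
    numbers = [2, 3, 5]
    result = multiprocessing.Array('i', len(numbers))
    processes = []
    # one element per process here; for longer inputs, hand each process
    # a larger slice of indices instead
    for start in range(len(numbers)):
        p = multiprocessing.Process(target=calc_square,
                                    args=(numbers, result, start, start + 1))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    print(result[:])  # [4, 9, 25]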
In my GUI application, I want to use multiprocessing to speed up the calculation. I can use multiprocessing and collect the calculated results, but I want the subprocess to inform the main process that the calculation is finished, and I cannot find any solution.
My multiprocessing code looks like this:
import multiprocessing
from multiprocessing import Process

import numpy as np

class MyProcess(Process):
    def __init__(self, name, array):
        super(MyProcess, self).__init__()
        self.name = name
        self.array = array
        recv_end, send_end = multiprocessing.Pipe(False)
        self.recv = recv_end
        self.send = send_end

    def run(self):
        s = 0
        for a in self.array:
            s += a
        self.send.send(s)

    def getResult(self):
        return self.recv.recv()

if __name__ == '__main__':
    process_list = []
    for i in range(5):
        a = np.random.random(10)
        print(i, ' correct result: ', a.sum())
        p = MyProcess(str(i), a)
        p.start()
        process_list.append(p)

    for p in process_list:
        p.join()

    for p in process_list:
        print(p.name, ' subprocess result: ', p.getResult())
I want the sub-process to inform the main process that the calculation is finished so that I can show the result in my GUI.
Any suggestion is appreciated.
Assuming you would like to do something with a result (the sum of a numpy array, in your case) as soon as it has been generated, I would use a multiprocessing.pool.Pool with method imap_unordered, which returns results in the order they are generated. In this case you need to pass your worker function the index of the array in the list of arrays to be processed along with the array itself, and have it return that index along with the array's sum, since this is the only way for the main process to know which array a given sum belongs to:
from multiprocessing import Pool, cpu_count

import numpy as np

def compute_sum(tpl):
    # unpack tuple:
    i, array = tpl
    s = 0
    for a in array:
        s += a
    return i, s

if __name__ == '__main__':
    array_list = [np.random.random(10) for _ in range(5)]
    n = len(array_list)
    pool_size = min(cpu_count(), n)
    pool = Pool(pool_size)
    # get each result as soon as it has been returned:
    for i, s in pool.imap_unordered(compute_sum, zip(range(n), array_list)):
        print(f'correct result {i}: {array_list[i].sum()}, actual result: {s}')
    pool.close()
    pool.join()
Prints:
correct result 0: 4.760033809335711, actual result: 4.76003380933571
correct result 1: 5.486818812843256, actual result: 5.486818812843257
correct result 2: 5.400374562564179, actual result: 5.400374562564179
correct result 3: 4.079376706247242, actual result: 4.079376706247242
correct result 4: 4.20860716467263, actual result: 4.20860716467263
In the above run, the results happened to be generated in the same order in which the tasks were submitted. To demonstrate that, in general, results can be generated in arbitrary order depending on how long the worker function takes to compute each result, we introduce some randomness into the processing time:
import time
from multiprocessing import Pool, cpu_count

import numpy as np

def compute_sum(tpl):
    # unpack tuple:
    i, array = tpl
    # results will be generated in random order:
    time.sleep(np.random.sample())
    s = 0
    for a in array:
        s += a
    return i, s

if __name__ == '__main__':
    array_list = [np.random.random(10) for _ in range(5)]
    n = len(array_list)
    pool_size = min(cpu_count(), n)
    pool = Pool(pool_size)
    # get each result as soon as it has been returned:
    for i, s in pool.imap_unordered(compute_sum, zip(range(n), array_list)):
        print(f'correct result {i}: {array_list[i].sum()}, actual result: {s}')
    pool.close()
    pool.join()
Prints:
correct result 4: 6.662288433360379, actual result: 6.66228843336038
correct result 0: 3.352901187256162, actual result: 3.3529011872561614
correct result 3: 5.836344458981557, actual result: 5.836344458981557
correct result 2: 2.9950208717729656, actual result: 2.9950208717729656
correct result 1: 5.144743159869513, actual result: 5.144743159869513
If you are satisfied with getting back results in task-submission rather than task-completion order, then use method imap; there is no need to pass the array indices back and forth:
from multiprocessing import Pool, cpu_count

import numpy as np

def compute_sum(array):
    s = 0
    for a in array:
        s += a
    return s

if __name__ == '__main__':
    array_list = [np.random.random(10) for _ in range(5)]
    n = len(array_list)
    pool_size = min(cpu_count(), n)
    pool = Pool(pool_size)
    for i, s in enumerate(pool.imap(compute_sum, array_list)):
        print(f'correct result {i}: {array_list[i].sum()}, actual result: {s}')
    pool.close()
    pool.join()
Prints:
correct result 0: 4.841913985702773, actual result: 4.841913985702773
correct result 1: 4.836923014762733, actual result: 4.836923014762733
correct result 2: 4.91242274200897, actual result: 4.91242274200897
correct result 3: 4.701913574838348, actual result: 4.701913574838349
correct result 4: 5.813666896917504, actual result: 5.813666896917503
Update
You can also use method apply_async, specifying a callback function to be invoked when a result is returned from your worker function, compute_sum. apply_async returns a multiprocessing.pool.AsyncResult whose get method blocks until the task has completed and then returns the task's return value. But here, since we are using a callback function that is automatically called with the result when the task completes, there is no need to save the AsyncResult instances or call their get methods. We instead rely on calling multiprocessing.pool.Pool.close() followed by multiprocessing.pool.Pool.join() to block until all submitted tasks have completed and all results have been returned:
from functools import partial
from multiprocessing import Pool, cpu_count

import numpy as np

def compute_sum(i, array):
    s = 0
    for a in array:
        s += a
    return i, s

def calculation_display(result, t):
    # Unpack returned tuple:
    i, s = t
    print(f'correct result {i}: {array_list[i].sum()}, actual result: {s}')
    result[i] = s

if __name__ == '__main__':
    array_list = [np.random.random(10) for _ in range(5)]
    n = len(array_list)
    result = [0] * n
    pool_size = min(cpu_count(), n)
    pool = Pool(pool_size)
    # Get result as soon as it has been returned.
    # Pass to our callback as the first argument the results list.
    # The return value will now be the second argument:
    my_callback = partial(calculation_display, result)
    for i, array in enumerate(array_list):
        pool.apply_async(compute_sum, args=(i, array), callback=my_callback)
    # Wait for all submitted tasks to complete:
    pool.close()
    pool.join()
    print('results:', result)
Prints:
correct result 0: 5.381579338696546, actual result: 5.381579338696546
correct result 1: 3.8780497856741274, actual result: 3.8780497856741274
correct result 2: 4.548733927791488, actual result: 4.548733927791488
correct result 3: 5.048921365623381, actual result: 5.048921365623381
correct result 4: 4.852415747983676, actual result: 4.852415747983676
results: [5.381579338696546, 3.8780497856741274, 4.548733927791488, 5.048921365623381, 4.852415747983676]
I'm using a construct similar to this example to run my processing in parallel with a progress bar courtesy of tqdm...
from multiprocessing import Pool
import time
from tqdm import *

def _foo(my_number):
    square = my_number * my_number
    return square

if __name__ == '__main__':
    with Pool(processes=2) as p:
        max_ = 30
        with tqdm(total=max_) as pbar:
            for _ in p.imap_unordered(_foo, range(0, max_)):
                pbar.update()
        results = p.join()  ## My attempt to combine results
results is always NoneType, though, and I cannot work out how to combine my results. I understand that with ...: automatically closes what it is working with on completion.
I've tried doing away with the outer with:
if __name__ == '__main__':
    max_ = 10
    p = Pool(processes=8)
    with tqdm(total=max_) as pbar:
        for _ in p.imap_unordered(_foo, range(0, max_)):
            pbar.update()
    p.close()
    results = p.join()
    print(f"Results : {results}")
Stumped as to how to join() my results?
Your call to p.join() just waits for all the pool processes to end and returns None. This call is actually unnecessary, since you are using the pool as a context manager, that is, you have specified with Pool(processes=2) as p:. When that block terminates, an implicit call is made to p.terminate(), which immediately terminates the pool processes and any tasks that may be running or queued up to run (there are none in your case).
It is, in fact, iterating the iterator returned by p.imap_unordered that yields each return value from your worker function, _foo. But since you are using method imap_unordered, the results may not be returned in submission order; in other words, you cannot assume that the return values will arrive in the sequence 0, 1, 4, 9, etc. There are many ways to handle this, such as having your worker function return the original argument along with the squared value:
from multiprocessing import Pool
import time
from tqdm import *

def _foo(my_number):
    square = my_number * my_number
    return my_number, square  # return the argument along with the result

if __name__ == '__main__':
    with Pool(processes=2) as p:
        max_ = 30
        results = [None] * max_  # preallocate the results list
        with tqdm(total=max_) as pbar:
            for x, result in p.imap_unordered(_foo, range(0, max_)):
                results[x] = result
                pbar.update()
        print(results)
The second way is to not use imap_unordered, but rather apply_async with a callback function. The disadvantage of this is that for large iterables you do not have the option of specifying a chunksize argument as you do with imap_unordered:
from multiprocessing import Pool
import time
from tqdm import *

def _foo(my_number):
    square = my_number * my_number
    return square

if __name__ == '__main__':
    def my_callback(_):  # ignore result
        pbar.update()  # update progress bar when a result is produced

    with Pool(processes=2) as p:
        max_ = 30
        with tqdm(total=max_) as pbar:
            async_results = [p.apply_async(_foo, (x,), callback=my_callback) for x in range(0, max_)]
            # wait for all tasks to complete:
            p.close()
            p.join()
        results = [async_result.get() for async_result in async_results]
        print(results)
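To illustrate the chunksize remark above, here is a hedged sketch of the first approach applied to a larger iterable, where batching tasks reduces inter-process overhead; the sizes chosen are only illustrative:

from multiprocessing import Pool
from tqdm import tqdm

def _foo(my_number):
    return my_number, my_number * my_number

if __name__ == '__main__':
    max_ = 100_000
    with Pool(processes=2) as p:
        results = [None] * max_
        with tqdm(total=max_) as pbar:
            # chunksize=1000 hands each worker 1000 tasks at a time
            for x, result in p.imap_unordered(_foo, range(max_), chunksize=1000):
                results[x] = result
                pbar.update()
        print(results[:5])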
I'm trying to parallelize some nested loops using a pool; moreover, the function has to return an array, but the external array stays empty.
from multiprocessing import Pool
import sympy as sym

def calcul_T(m):
    temp = []
    for n in range(0, N):
        x = sym.Symbol('x')
        y = sym.sin(x)
        # .....some stuff.....
        temp.append(y)
    return temp

rt = []
if __name__ == '__main__':
    pool = Pool()
    rt.append(pool.map(calcul_T, range(0, M)))
    pool.close()
    pool.join()
I expect to get at least an array of arrays so that I can make it a 2-D array and then use it further, after the if __name__ block.
What am I doing wrong?
Use a context manager:
from multiprocessing import Pool

def calcul_T(m):
    temp = []
    for n in range(0, N):
        x = sym.Symbol('x')
        y = sym.sin(x)
        # .....some stuff.....
        temp.append(y)
    return temp

rt = []
if __name__ == '__main__':
    with Pool(N_PROCESSES) as p:
        rt = p.map(calcul_T, range(0, M))
EDIT
According to the comment, accessing rt like a normal 2-D array works just fine (run in a console; I changed the calcul_T function to make this runnable):
from multiprocessing import Pool

N = 10
M = 10

def calcul_T(m):
    temp = []
    for n in range(0, N):
        temp.append(n * m)
    return temp

rt = []
if __name__ == '__main__':
    with Pool(5) as p:
        rt = p.map(calcul_T, range(0, M))
    print(rt[8][8])
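Since the question mentions building a 2-D array from the result: the list of lists that p.map returns converts directly. A short sketch, assuming numeric inner values (unlike the sympy expressions in the original calcul_T, which would give an object array):

import numpy as np

# same shape of data that p.map(calcul_T, range(0, M)) returns above
rt = [[n * m for n in range(10)] for m in range(10)]
rt_2d = np.array(rt)  # shape (M, N) == (10, 10)
print(rt_2d.shape, rt_2d[8, 8])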
I am trying to write a process which does some computation on an Array filled with strings using the multiprocessing module. However, I am not able to get back the results. This is just a minimalist code example:
from multiprocessing import Process, Value, Array
from ctypes import c_char_p

# Process
def f(n, a):
    for i in range(0, 10):
        a[i] = "test2".encode('latin-1')

if __name__ == '__main__':
    # Set up array
    arr = Array(c_char_p, range(10))
    # Fill it with values
    for i in range(0, 10):
        arr[i] = "test".encode('latin-1')

    x = []
    for i in range(0, 10):
        num = Value('d', float(i)*F)
        p = Process(target=f, args=(num, arr,))
        x.append(p)
        p.start()
    for p in x:
        p.join()

    # This works
    print(num.value)
    # This will not give out anything
    print(arr[0])
The last line won't print out anything, despite the array being filled and altered.
The main thing that concerns me is that when I change the code to simply use integers, it works:
from multiprocessing import Process, Value, Array
from ctypes import c_char_p

def f(n, a):
    for i in range(0, 10):
        a[i] = 5

if __name__ == '__main__':
    arr = Array('i', range(10))
    for i in tqdm(range(0, 10)):
        arr[i] = 10

    x = []
    for i in range(0, 10):
        num = Value('d', float(i)*F)
        p = Process(target=f, args=(num, arr,))
        x.append(p)
        p.start()
    for p in x:
        p.join()

    print(num.value)
    print(arr[0])
My best guess is that this has something to do with the fact that the string array is actually filled with char arrays while an integer is just one value, but I do not know how to fix this.
This might answer your question. Basically, the string array arr is an array of character pointers (c_char_p). When a process invokes the function f, the character pointers it stores are created in that process's own context but not in the other processes' contexts, so when the other processes (or the parent) try to access arr, they see invalid addresses.
In my case the following seems to work fine:
from multiprocessing import Process, Value, Array
from ctypes import c_char_p

values = [b'test2438'] * 10  # bytes, since c_char_p expects bytes in Python 3

# Process
def f(n, a):
    for i, s in enumerate(values):
        a[i] = s

if __name__ == '__main__':
    # Set up array
    arr = Array(c_char_p, 10)
    # Fill it with values
    for i in range(0, 10):
        arr[i] = b'test'

    x = []
    for i in range(0, 10):
        num = Value('d', float(i))
        p = Process(target=f, args=(num, arr,))
        x.append(p)
        p.start()
    for p in x:
        p.join()

    # prints the values written by the subprocesses
    print(arr[:])
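If the strings do not have to live in raw shared memory, another option (a sketch, not from the original answer) is a Manager list, which sidesteps the pointer problem entirely by proxying ordinary Python strings between processes:

from multiprocessing import Process, Manager

def f(strings):
    for i in range(len(strings)):
        strings[i] = "test2"  # plain str objects; the manager handles transport

if __name__ == '__main__':
    with Manager() as manager:
        strings = manager.list(["test"] * 10)
        p = Process(target=f, args=(strings,))
        p.start()
        p.join()
        print(list(strings))  # copy out before the manager shuts down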
When I run this script on Linux, it prints 8 duplicates. How do I force Python to use all cores on different parts of the work, rather than computing duplicates?
from multiprocessing import Pool, Process

def f():
    f = open("/path/to/10.txt", 'r')
    l = [s.strip('\n') for s in f]
    f.close()
    for a in range(0, len(l)):
        for b in range(0, len(l)):
            result = 0
            if a == b:
                result = 1
            else:
                counter = 0
                for i in range(len(l[a])):
                    if int(l[a][i]) == int(l[b][i]) == 1:
                        counter += 1
                result = counter / 10000
            print((a + 1), (b + 1), result)

if __name__ == '__main__':
    p = Process(target=f)
    p.start()
    p.join()
If you want to run on more than one core, you will have to use multiple processes; here you are just using one.
You also need to break your routine f into independent units so that the work can run in parallel and the whole task can be shared among multiple worker processes.
Here is a sample 2-process code, which can use multiple cores on your machine:
from multiprocessing import Process

def task(arg):
    pass

if __name__ == '__main__':
    value = 'something'
    prc1 = Process(target=task, args=(value,))
    prc2 = Process(target=task, args=(value,))
    prc1.start()
    prc2.start()
    prc1.join()
    prc2.join()
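As a concrete, hedged sketch of what "breaking f into independent units" might look like for the question above: give each task one row index and let a Pool spread the rows across all cores. The helper names here are illustrative; the file path and the /10000 scaling are copied from the question unchanged, and the initializer loads the file once per worker process rather than once per task:

from multiprocessing import Pool

rows = []  # filled in once per worker process by _init

def _init(path):
    global rows
    with open(path) as fh:
        rows = [s.strip('\n') for s in fh]

def compare_row(a):
    # one independent unit of work: compare row a against every row b
    out = []
    for b in range(len(rows)):
        if a == b:
            result = 1
        else:
            counter = sum(1 for i in range(len(rows[a]))
                          if int(rows[a][i]) == int(rows[b][i]) == 1)
            result = counter / 10000
        out.append((a + 1, b + 1, result))
    return out

if __name__ == '__main__':
    path = "/path/to/10.txt"
    with open(path) as fh:
        n = sum(1 for _ in fh)  # number of rows to distribute
    with Pool(initializer=_init, initargs=(path,)) as pool:  # one worker per core by default
        for row_results in pool.imap(compare_row, range(n)):
            for a, b, result in row_results:
                print(a, b, result)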