Replacing multiprocessing pool.map with mpi4py - python

I'm a beginner with MPI, and I'm still going through the documentation; however, there's very little material to work with when it comes to mpi4py. I have written code that currently uses the multiprocessing module to run on many cores, but I need to replace this with mpi4py so that I can use more than one node to run my code. My code is below, both with the multiprocessing module and without it.
With multiprocessing,
import random
import time

import numpy as np
import multiprocessing

start_time = time.time()

E = 0.1
M = 5
n = 1000
G = 1
c = 1
stretch = [10, 1]

# Point-distribution generator function
def CDF_inv(x, e, m):
    A = 1 / (1 + np.log(m / e))
    if x == 1:
        return m
    elif 0 <= x <= A:
        return e * x / A
    elif A < x < 1:
        return e * np.exp((x / A) - 1)

# Elliptical point-distribution generator function
def get_coor_ellip(dist=CDF_inv, params=[E, M], stretch=stretch):
    R = dist(random.random(), *params)
    theta = random.random() * 2 * np.pi
    return (R * np.cos(theta) * stretch[0], R * np.sin(theta) * stretch[1])

def get_dist_sq(x_array, y_array):
    return x_array**2 + y_array**2

# Function to obtain alpha
def get_alpha(args):
    zeta_list_part, M_list_part, X, Y = args
    alpha_x = 0
    alpha_y = 0
    for key in range(len(M_list_part)):
        z_m_z_x = X - zeta_list_part[key][0]
        z_m_z_y = Y - zeta_list_part[key][1]
        dist_z_m_z = get_dist_sq(z_m_z_x, z_m_z_y)
        alpha_x += M_list_part[key] * z_m_z_x / dist_z_m_z
        alpha_y += M_list_part[key] * z_m_z_y / dist_z_m_z
    return (alpha_x, alpha_y)

# The part of the process containing the loop that needs to be parallelised,
# where I use pool.map()
if __name__ == '__main__':
    # n processes, scale accordingly
    num_processes = 10
    pool = multiprocessing.Pool(processes=num_processes)
    random_sample = [CDF_inv(x, E, M)
                     for x in [random.random() for e in range(n)]]
    zeta_list = [get_coor_ellip() for e in range(n)]
    x1, y1 = zip(*zeta_list)
    zeta_list = np.column_stack((np.array(x1), np.array(y1)))
    x = np.linspace(-3, 3, 100)
    y = np.linspace(-3, 3, 100)
    X, Y = np.meshgrid(x, y)
    print(len(x) * len(y) * n, 'calculations to be carried out.')
    M_list = np.array([.001 for i in range(n)])
    # split zeta_list, M_list, X, and Y
    zeta_list_split = np.array_split(zeta_list, num_processes, axis=0)
    M_list_split = np.array_split(M_list, num_processes)
    X_list = [X for e in range(num_processes)]
    Y_list = [Y for e in range(num_processes)]
    alpha_list = pool.map(
        get_alpha, zip(zeta_list_split, M_list_split, X_list, Y_list))
    alpha_x = 0
    alpha_y = 0
    for e in alpha_list:
        alpha_x += e[0] * 4 * G / (c**2)
        alpha_y += e[1] * 4 * G / (c**2)
    print("%f seconds" % (time.time() - start_time))
Without multiprocessing,
import random

import numpy as np

E = 0.1
M = 5
G = 1
c = 1
n = 1000
M_list = [.1 for i in range(n)]

# Point-distribution generator function
def CDF_inv(x, e, m):
    A = 1 / (1 + np.log(m / e))
    if x == 1:
        return m
    elif 0 <= x <= A:
        return e * x / A
    elif A < x < 1:
        return e * np.exp((x / A) - 1)

random_sample = [CDF_inv(x, E, M)
                 for x in [random.random() for e in range(n)]]
stretch = [5, 2]

# Elliptical point-distribution generator function
def get_coor_ellip(dist=CDF_inv, params=[E, M], stretch=stretch):
    R = dist(random.random(), *params)
    theta = random.random() * 2 * np.pi
    return (R * np.cos(theta) * stretch[0], R * np.sin(theta) * stretch[1])

# zeta_list is the list of coordinates of a distribution of points
zeta_list = [get_coor_ellip() for e in range(n)]
x1, y1 = zip(*zeta_list)
zeta_list = np.column_stack((np.array(x1), np.array(y1)))

# Creation of an X-Y grid
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)

def get_dist_sq(x_array, y_array):
    return x_array**2 + y_array**2

# Calculation of alpha, containing the loop that needs to be parallelised
alpha_x = 0
alpha_y = 0
for key in range(len(M_list)):
    z_m_z_x = X - zeta_list[key][0]
    z_m_z_y = Y - zeta_list[key][1]
    dist_z_m_z = get_dist_sq(z_m_z_x, z_m_z_y)
    alpha_x += M_list[key] * z_m_z_x / dist_z_m_z
    alpha_y += M_list[key] * z_m_z_y / dist_z_m_z
alpha_x *= 4 * G / (c**2)
alpha_y *= 4 * G / (c**2)
Basically, my code first generates a list of points that follow a certain distribution. Then I apply an equation to obtain the quantity 'alpha' from relations between the distances of the points. The part that requires parallelisation is the single for loop involved in the calculation of alpha. I want to use mpi4py instead of multiprocessing to do this, and I am not sure how to get it going.

Transforming the multiprocessing.map version to MPI can be done using scatter / gather. In your case it is useful that you already prepare the input list as one chunk per rank. The main difference is that all the code gets executed by every rank, so you must make everything that should be done only by the master rank 0 conditional.
from mpi4py import MPI

if __name__ == '__main__':
    comm = MPI.COMM_WORLD
    if comm.rank == 0:
        random_sample = [CDF_inv(x, E, M)
                         for x in [random.random() for e in range(n)]]
        zeta_list = [get_coor_ellip() for e in range(n)]
        x1, y1 = zip(*zeta_list)
        zeta_list = np.column_stack((np.array(x1), np.array(y1)))
        x = np.linspace(-3, 3, 100)
        y = np.linspace(-3, 3, 100)
        X, Y = np.meshgrid(x, y)
        print(len(x) * len(y) * n, 'calculations to be carried out.')
        M_list = np.array([.001 for i in range(n)])
        # split zeta_list, M_list, X, and Y into one chunk per rank
        zeta_list_split = np.array_split(zeta_list, comm.size, axis=0)
        M_list_split = np.array_split(M_list, comm.size)
        X_list = [X for e in range(comm.size)]
        Y_list = [Y for e in range(comm.size)]
        work_list = list(zip(zeta_list_split, M_list_split, X_list, Y_list))
    else:
        work_list = None
    # each rank receives its own chunk and computes a partial alpha
    my_work = comm.scatter(work_list)
    my_alpha = get_alpha(my_work)
    # rank 0 collects all partial results
    alpha_list = comm.gather(my_alpha)
    if comm.rank == 0:
        alpha_x = 0
        alpha_y = 0
        for e in alpha_list:
            alpha_x += e[0] * 4 * G / (c**2)
            alpha_y += e[1] * 4 * G / (c**2)
This works fine as long as each processor gets a similar amount of work. If communication becomes an issue, you might want to split up the data generation among processors instead of doing it all on the master rank 0.
Note: Some things about the code are bogus, e.g. alpha_[xy] ends up as np.ndarray. The serial version runs into an error.
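If you do want to go that route, here is a minimal sketch of the alternative (assuming CDF_inv, get_coor_ellip, get_alpha and the constants from the question are defined on every rank; note each rank uses its own random state, so seed per rank if you need reproducibility). Each rank generates only its own share of the points, and only the partial sums travel:

from mpi4py import MPI

if __name__ == '__main__':
    comm = MPI.COMM_WORLD
    # every rank generates only its share of the n points
    my_n = n // comm.size + (comm.rank < n % comm.size)
    my_zeta = np.array([get_coor_ellip() for e in range(my_n)])
    my_M = np.array([.001 for i in range(my_n)])
    # the grid is cheap, so every rank simply builds its own copy
    x = np.linspace(-3, 3, 100)
    y = np.linspace(-3, 3, 100)
    X, Y = np.meshgrid(x, y)
    my_alpha = get_alpha((my_zeta, my_M, X, Y))
    alpha_list = comm.gather(my_alpha)  # only the partial sums are communicated
    if comm.rank == 0:
        alpha_x = sum(e[0] for e in alpha_list) * 4 * G / (c**2)
        alpha_y = sum(e[1] for e in alpha_list) * 4 * G / (c**2)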

For people who are still interested in similar subjects, I highly recommend having a look at the MPIPoolExecutor() class from mpi4py.futures and its documentation.
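As a rough sketch of how that maps onto the pool.map pattern from the question (assuming get_alpha and the split work lists are prepared as above):

from mpi4py.futures import MPIPoolExecutor

if __name__ == '__main__':
    # typically launched as: mpiexec -n <N> python -m mpi4py.futures script.py
    with MPIPoolExecutor() as executor:
        alpha_list = list(executor.map(
            get_alpha, zip(zeta_list_split, M_list_split, X_list, Y_list)))

executor.map returns results lazily, so the list() call collects everything before the pool shuts down.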

Related

How to use multiprocessing pool for a multivariable function python

I'm trying to use the multiprocessing library to run the following code:
import matplotlib.pyplot as plt
import numpy as np
from multiprocessing import Pool

def newtonalt(fun, x0, err, mit):
    xnew = x0.copy()
    F, dF = fun(x0)
    r = F.copy()
    M = dF.copy()
    sigma = np.linalg.norm(r)
    for k in range(mit):
        if sigma < err:
            break
        d = np.linalg.solve(M, -r)
        xnew = xnew + d
        r, M = fun(xnew)
        sigma = np.linalg.norm(r)
    return xnew, k

def fun(x):
    f_r = x[0] ** 3 - 3 * x[0] * (x[1] ** 2) - 1
    f_i = 3 * (x[0] ** 2) * x[1] - (x[1] ** 3)
    f = np.array([f_r, f_i])
    df_rx = 3 * (x[0] ** 2) - 3 * (x[1] ** 2)
    df_ry = -6 * x[0] * x[1]
    df_ix = 6 * x[0] * x[1]
    df_iy = 3 * (x[0] ** 2) - 3 * (x[1] ** 2)
    df = np.array([[df_rx, df_ry], [df_ix, df_iy]])
    return f, df

if __name__ == '__main__':
    pool = Pool(processes=10)
    err = 1e-5
    mit = 300
    N = 50
    x = np.linspace(-2.5, 2.5, N)
    y = np.linspace(-2.5, 2.5, N)
    A = np.zeros((N, N))
    B = np.zeros([N, N, 4], dtype=int)
    for i in range(N):
        for j in range(N):
            z = np.array([x[i], y[j]])
            xsol, it = pool.map(newtonalt, (fun, z, err, mit))
            pool.close()
            A[i, j] = it
    plt.imshow(A, cmap='Set1')
    plt.show()
Clearly it doesn't work, because I don't really know how to use multiprocessing properly, and it's even more difficult when the function you use has multiple arguments. I read that it isn't good to use pool inside a for loop, but anyway, as I was saying, I'm really lost with this library.
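Not a definitive fix, but a minimal sketch of the usual pattern (reusing fun and newtonalt from above): build the whole argument list first, then let pool.starmap unpack the multiple arguments, instead of calling pool.map inside the loop:

if __name__ == '__main__':
    err, mit, N = 1e-5, 300, 50
    x = np.linspace(-2.5, 2.5, N)
    y = np.linspace(-2.5, 2.5, N)
    # one (fun, z, err, mit) tuple per grid point
    args = [(fun, np.array([x[i], y[j]]), err, mit)
            for i in range(N) for j in range(N)]
    with Pool(processes=10) as pool:
        results = pool.starmap(newtonalt, args)  # unpacks each argument tuple
    # keep only the iteration counts and restore the grid shape
    A = np.array([it for _, it in results]).reshape(N, N)
    plt.imshow(A, cmap='Set1')
    plt.show()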

IndexError: index 2 is out of bounds for axis 0 with size 2

I'm trying to solve some ODEs using different methods and then print and plot my results. When I try to run it I get the error IndexError: index 2 is out of bounds for axis 0 with size 2.
I know it has to do with the dimensions, but I thought that all of my dimensions were correct. Here is an example of each way I'm trying to solve the ODEs:
def f(t, x, y):
    xprime = x - y + (2*t) - (t**2) - (t**3)
    return xprime

def g(t, x, y):
    yprime = x + y - (4*(t**2)) + (t**3)
    return yprime

# Exact solution
def exact(t):
    y = np.zeros(len(t))
    x = np.zeros(len(t))
    for i in range(n):
        cos_arr = np.cos(t)
        sin_arr = np.sin(t)
        y = np.exp(t) * cos_arr + t**2
        x = np.exp(t) * sin_arr - t**3
    return x, y

# Explicit Euler
def Eulerx(t0, tmax, x0, n):
    t, dt = np.linspace(t0, tmax, n, retstep=True)
    x = np.zeros(n)
    y = np.zeros(n)
    x[0] = x0
    y[0] = y0
    for i in range(n-1):
        x[i+1] = x[i] + (dt/2) * f(t[i], x[i], y[i])
    return t, x

# RK2
def RK2x(t0, tmax, x0, n):
    t, dt = np.linspace(t0, tmax, n, retstep=True)
    x = np.zeros(n)
    y = np.zeros(n)
    x[0] = x0
    y[0] = y0
    for i in range(n-1):
        xK1 = f(t[i], x[i], y[i])
        xK2 = f(t[i] + dt, x[i] + dt * xK1, y[i])
        x[i+1] = x[i] + (dt * (1/2) * (xK1 + xK2))
    return t, x

# Classical RK4
def RK4x(t0, tmax, x0, n):
    t, dt = np.linspace(t0, tmax, n, retstep=True)
    x = np.zeros(n)
    y = np.zeros(n)
    x[0] = x0
    y[0] = y0
    for i in range(n-1):
        x4K1 = f(t[i], x[i], y[i])
        x4K2 = f(t[i] + ((1/2)*dt), x[i] + ((1/2)*dt*x4K1), y[i])
        x4K3 = f(t[i] + ((1/2)*dt), x[i] + ((1/2)*dt*x4K2), y[i])
        x4K4 = f(t[i] + dt, x[i] + dt*x4K3, y[i])
        x[i+1] = x[i] + (dt * (1/6) * (x4K1 + (2*x4K2) + (2*x4K3) + x4K4))
    return t, x

if __name__ == '__main__':
    t0 = 0
    tmax = 1
    x0 = 1
    y0 = 0
    n = 50
    [t, X1] = Eulerx(t0, tmax, x0, n)
    [t, Y1] = Eulery(t0, tmax, y0, n)
    [t, X2] = RK2x(t0, tmax, x0, n)
    [t, Y2] = RK2y(t0, tmax, y0, n)
    [t, X3] = RK4x(t0, tmax, x0, n)
    [t, Y3] = RK4y(t0, tmax, y0, n)
    x = exact(t)
    y = exact(t)
    abs_errx1 = abs(x - X1)
    abs_errx2 = abs(x - X2)
    abs_errx3 = abs(x - X3)
    print("=========================================================================")
    print(" n    Eulerx    Eulery    RK2x    RK2y    RK4x    RK4y", end='\n')
    for i in range(n):
        print(abs_errx1[i], abs_erry1[i], abs_errx2[i], abs_erry2[i], abs_errx3[i], abs_erry3[i])
    print("=========================================================================")
Your arrays abs_errx1, etc., are all of shape (2, 50). You are looking at abs_errx1[i], etc., where i runs from 0 to 50, so i is being used as the first dimension when you need it to be the second. The first dimension comes from subtracting the whole (x, y) tuple that exact(t) returns.
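For instance, if the intent was to compare against the exact x-solution, unpacking the tuple keeps every error array one-dimensional (a guess at the intended fix):

x, y = exact(t)          # exact() returns the pair (x, y)
abs_errx1 = abs(x - X1)  # now shape (50,), so abs_errx1[i] is valid
abs_errx2 = abs(x - X2)
abs_errx3 = abs(x - X3)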

Fitting a line with gradient descent

I am trying to fit a line to a couple of points using gradient descent. I am no expert on this and tried to write down the mathematical algorithm for it in Python. It runs for a couple of iterations, but my predictions seem to explode at some point. Here is the code:
import numpy as np
import matplotlib.pyplot as plt

def mean_squared_error(n, A, b, m, c):
    e = 0
    for i in range(n):
        e += (b[i] - (m*A[i] + c)) ** 2
    return e/n

def der_wrt_m(n, A, b, m, c):
    d = 0
    for i in range(n):
        d += (2 * (b[i] - (m*A[i] + c)) * (-A[i]))
    return d/n

def der_wrt_c(n, A, b, m, c):
    d = 0
    for i in range(n):
        d += (2 * (b[i] - (m*A[i] + c)))
    return d/n

def update(n, A, b, m, c, descent_rate):
    return (descent_rate * der_wrt_m(n, A, b, m, c)), (descent_rate * der_wrt_c(n, A, b, m, c))

A = np.array(((0, 1),
              (1, 1),
              (2, 1),
              (3, 1)))
x = A.T[0]
b = np.array((1, 2, 0, 3), ndmin=2).T
y = b.reshape(4)

def descent(x, y):
    m = 0
    c = 0
    descent_rate = 0.00001
    iterations = 100
    n = len(x)
    plt.scatter(x, y)
    u = np.linspace(0, 3, 100)
    prediction = 0
    for itr in range(iterations):
        print(m, c)
        prediction = prediction + m * x + c
        m, c = update(n, x, y, m, c, descent_rate)
    plt.plot(u, u * m + c, '-')

descent(x, y)
And that's my output:
0 0
19.25 -10.5
-71335.1953125 24625.9453125
5593771382944640.0 -2166081169939480.2
-2.542705027685638e+48 9.692684648057364e+47
2.40856742196228e+146 -9.202614421953049e+145
-inf inf
nan nan
nan nan
nan nan
nan nan
nan nan
nan nan
etc...
Update: The values aren't exploding anymore, but it's still not converging in a nice manner:
# We could also solve it using gradient descent
import numpy as np
import matplotlib.pyplot as plt

def mean_squared_error(n, A, b, m, c):
    e = 0
    for i in range(n):
        e += ((b[i] - (m * A[i] + c)) ** 2)
    #print("mse:", e/n)
    return e/n

def der_wrt_m(n, A, b, m, c):
    d = 0
    for i in range(n):
        # d += (2 * (b[i] - (m*A[i] + c)) * (-A[i]))
        d += (A[i] * (b[i] - (m*A[i] + c)))
    #print("Dm", -2 * d/n)
    return (-2 * d/n)

def der_wrt_c(n, A, b, m, c):
    d = 0
    for i in range(n):
        d += (2 * (b[i] - (m*A[i] + c)))
    #print("Dc", d/n)
    return d/n

def update(n, A, b, m, c, descent_rate):
    return (m - descent_rate * der_wrt_m(n, A, b, m, c)), (c - descent_rate * der_wrt_c(n, A, b, m, c))

A = np.array(((0, 1),
              (1, 1),
              (2, 1),
              (3, 1)))
x = A.T[0]
b = np.array((1, 2, 0, 3), ndmin=2).T
y = b.reshape(4)

def descent(x, y):
    m = 0
    c = 0
    descent_rate = 0.0001
    iterations = 10000
    n = len(x)
    plt.scatter(x, y)
    u = np.linspace(0, 3, 100)
    prediction = 0
    for itr in range(iterations):
        prediction = prediction + m * x + c
        m, c = update(n, x, y, m, c, descent_rate)
    loss = mean_squared_error(n, A, b, m, c)
    print(loss)
    print(m, c)
    plt.plot(u, u * m + c, '-')

descent(x, y)
And now the graph of the fitted line still looks off after about 10000 iterations with a learning rate of 0.0001 [plot omitted]. The final printed loss and parameters are:
[4.10833186 5.21468937]
1.503547594304175 -1.9947003678083184
Whereas the least-squares fit shows something visibly better [plot omitted].
In your update function, you should subtract the calculated gradients from the current m and c:
def update(n, A, b, m, c, descent_rate):
    return m - (descent_rate * der_wrt_m(n, A, b, m, c)), c - (descent_rate * der_wrt_c(n, A, b, m, c))
Update: Here is the working version. I got rid of the A matrix after obtaining x and y, since it confused me =). For example, in your gradient calculations you have the expression d += (A[i] * (b[i] - (m*A[i] + c))), but it should be d += (x[i] * (b[i] - (m*x[i] + c))), since x[i] gives you a single element whereas A[i] gives you a row.
Also, you forgot a minus sign while calculating the derivative with respect to c. If your expression is (y - (m*x + c))^2, then the derivative with respect to c should be 2 * (-1) * (y - (m*x + c)), since there is a minus in front of c.
# We could also solve it using gradient descent
import numpy as np
import matplotlib.pyplot as plt

def mean_squared_error(n, x, y, m, c):
    e = 0
    for i in range(n):
        e += (m*x[i] + c - y[i])**2
    return e/n  # average once over the n samples

def der_wrt_m(n, x, y, m, c):
    d = 0
    for i in range(n):
        d += x[i] * (y[i] - (m*x[i] + c))
    d = -2 * d/n
    return d

def der_wrt_c(n, x, y, m, c):
    d = 0
    for i in range(n):
        d += (y[i] - (m*x[i] + c))
    d = -2 * d/n
    return d

def update(n, x, y, m, c, descent_rate):
    return (m - descent_rate * der_wrt_m(n, x, y, m, c)), (c - descent_rate * der_wrt_c(n, x, y, m, c))

A = np.array(((0, 1),
              (1, 1),
              (2, 1),
              (3, 1)))
x = A.T[0]
b = np.array((1, 2, 0, 3), ndmin=2).T
y = b.reshape(4)
print(x)
print(y)

def descent(x, y):
    m = 0.0
    c = 0.0
    descent_rate = 0.01
    iterations = 10000
    n = len(x)
    plt.scatter(x, y)
    u = np.linspace(0, 3, 100)
    prediction = 0
    for itr in range(iterations):
        prediction = prediction + m * x + c
        m, c = update(n, x, y, m, c, descent_rate)
        loss = mean_squared_error(n, x, y, m, c)
        print(loss)
    print(loss)
    print(m, c)
    plt.plot(u, u * m + c, '-')
    plt.show()

descent(x, y)

How to generate a multidimensional cube in Python

This program creates a cube of Gridsize**3 points, with a user-chosen starting point and spacing between points (even though these are not function parameters, they wouldn't be difficult to add).
import numpy as np

def CreateMap(Gridsize):
    X = Y = Z = Gridsize
    M = np.zeros(shape=(X*Y*Z, 3))
    d_x = 5 / Gridsize  # increment of the cube x dimension
    d_y = 5 / Gridsize
    d_z = 5 / Gridsize
    x0 = -1.0
    y0 = 1.0
    z0 = 0
    x = np.arange(x0, X * d_x, d_x, dtype=float)
    y = np.arange(y0, Y * d_y, d_y, dtype=float)
    z = np.arange(z0, Z * d_z, d_z, dtype=float)
    g = 0
    for i in range(X):
        for j in range(Y):
            for k in range(Z):
                M[g, 0] = x[i]
                M[g, 1] = y[j]
                M[g, 2] = z[k]
                g = g + 1
    print(M)
    return 0
I was wondering what the best method would be to create a hypercube of size Gridsize**n, where n is also user defined?
Check out np.meshgrid. Instead of your for loops, you can just do
M = np.stack(np.meshgrid(x, y, z))
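Since np.meshgrid accepts any number of coordinate arrays, the same idea extends to a user-defined dimension n. A minimal sketch (create_hypercube is a hypothetical helper, not part of the question's code):

import numpy as np

def create_hypercube(gridsize, x0, xf):
    # one coordinate array per dimension; len(x0) == len(xf) == n
    axes = [np.linspace(x0[i], xf[i], gridsize) for i in range(len(x0))]
    mesh = np.meshgrid(*axes, indexing='ij')
    # stack into shape (gridsize**n, n): one row per grid point
    return np.stack(mesh, axis=-1).reshape(-1, len(axes))

M = create_hypercube(4, [-1, 0, 1], [10, 2, 5])
print(M.shape)  # (64, 3)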
If you guys have optimization advice, here is the version I came up with:
import numpy as np

def CreateMap(Gridsize, x0, xf):
    k = np.shape(x0)[0]
    M = np.zeros(shape=(Gridsize**k, k))
    d_x = np.zeros(k)
    for i in range(k):
        d = 0
        j = 0
        d_x[i] = (xf[i] - x0[i]) / (Gridsize - 1)  # increment of the cube x dimension
        x = np.arange(x0[i], xf[i] + d_x[i], d_x[i], dtype=float)
        for v in range(Gridsize ** (k - i - 1)):
            for j in range(Gridsize):
                temp = x[j]
                for z in range(Gridsize ** i):
                    M[d, i] = temp
                    d = d + 1
    print(M)
    return 0

x0 = np.array([-1, 0, 1])
xf = np.array([10, 2, 5])
CreateMap(4, x0, xf)

multithreaded mandelbrot set

Is it possible to change the formula of the Mandelbrot set (which is f(z) = z^2 + c by default) to a different one (f(z) = z^2 + c * e^(-z) is what I need) when using the escape-time algorithm, and if so, how?
I'm currently using this code by FB36
# Multi-threaded Mandelbrot Fractal (Do not run using IDLE!)
# FB - 201104306
import threading
from PIL import Image

w = 512  # image width
h = 512  # image height
image = Image.new("RGB", (w, h))
wh = w * h
maxIt = 256  # max number of iterations allowed
# drawing region (xa < xb & ya < yb)
xa = -2.0
xb = 1.0
ya = -1.5
yb = 1.5
xd = xb - xa
yd = yb - ya
numThr = 5  # number of threads to run
# lock = threading.Lock()

class ManFrThread(threading.Thread):
    def __init__(self, k):
        self.k = k
        threading.Thread.__init__(self)

    def run(self):
        # each thread only calculates its own share of pixels
        for i in range(self.k, wh, numThr):
            kx = i % w
            ky = int(i / w)
            a = xa + xd * kx / (w - 1.0)
            b = ya + yd * ky / (h - 1.0)
            x = a
            y = b
            for kc in range(maxIt):
                x0 = x * x - y * y + a
                y = 2.0 * x * y + b
                x = x0
                if x * x + y * y > 4:
                    # various color palettes can be created here
                    red = (kc % 8) * 32
                    green = (16 - kc % 16) * 16
                    blue = (kc % 16) * 16
                    # lock.acquire()
                    global image
                    image.putpixel((kx, ky), (red, green, blue))
                    # lock.release()
                    break

if __name__ == "__main__":
    tArr = []
    for k in range(numThr):  # create all threads
        tArr.append(ManFrThread(k))
    for k in range(numThr):  # start all threads
        tArr[k].start()
    for k in range(numThr):  # wait until all threads finished
        tArr[k].join()
    image.save("MandelbrotFractal.png", "PNG")
From the code I infer that z = x + y*i and c = a + b*i, which corresponds to f(z) = z^2 + c. You want f(z) = z^2 + c * e^(-z).
Recall that e^(-z) = e^(-(x + yi)) = e^(-x) * e^(-iy) = e^(-x)(cos(y) - i*sin(y)). Multiplying by c = a + bi gives c * e^(-z) = e^(-x)(a*cos(y) + b*sin(y)) + i * e^(-x)(b*cos(y) - a*sin(y)). Thus you should update your lines to the following:
from math import exp, cos, sin  # needed at the top of the script

x0 = x * x - y * y + a * exp(-x) * cos(y) + b * exp(-x) * sin(y)
y = 2.0 * x * y + b * exp(-x) * cos(y) - a * exp(-x) * sin(y)
x = x0
You might need to adjust maxIt if you don't get the level of feature differentiation you're after (it might take more or fewer iterations to escape now, on average), but this should be the mathematical expression you want.
As pointed out in the comments, you might need to adjust the escape criterion itself, not just the maximum number of iterations, to get the desired level of differentiation: raising the maximum doesn't help for points that never escape.
You can try deriving a good escape condition, or just try out some values and see what you get.
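For quick experiments with the modified map, here is a small sketch (independent of the threaded code above, using Python's complex type) that exposes both the bailout radius and the iteration cap as parameters:

import cmath

def escape_time(c, max_it=256, bailout=2.0):
    """Iterate f(z) = z**2 + c * e**(-z); return the escape iteration or None."""
    z = 0j
    for k in range(max_it):
        z = z * z + c * cmath.exp(-z)
        # |z| > 2 is the classic escape radius; tune it for the new map
        if abs(z) > bailout:
            return k
    return None  # never escaped within max_it iterations

# example: probe a single point of the drawing region
print(escape_time(complex(-0.5, 0.5)))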
