I'm having trouble finding a way to optimize a triple loop in Python. Here is the code, since it shows most simply what I need to compute:
Given two 2-D arrays named samples (M x N) and D(N x N) along with the output results (NxN):
for sigma in range(M):
    for i in range(N):
        for j in range(N):
            results[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                      - samples[sigma, i]*D[j, i]
                                      - samples[sigma, j]*D[i, j])
return results
It does the job but is not efficient at all in Python. I tried to vectorize the for i / for j loops, but I cannot compute the result correctly with sigma in the way.
Does anyone have an idea how to optimize these few lines? Any suggestions are welcome, such as numpy, numexpr, etc.
One way I found to improve your code (i.e. reduce the number of loops) is to use np.meshgrid.
Here is the improvement I found. It took some fiddling, but it gives the same output as your triple-loop code. I kept the same code structure so you can see which parts correspond to which. I hope this is of use to you!
for sigma in range(M):
    xx, yy = np.meshgrid(samples[sigma], samples[sigma])
    results += (1/N) * (xx * yy
                        - yy * D.T
                        - xx * D)
print(results)  # or return results
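To see why the meshgrid version reproduces the double loop, it may help to look at what np.meshgrid returns for a single row. The sketch below uses a made-up three-element vector v as a stand-in for samples[sigma]; the name and values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical stand-in for one row samples[sigma].
v = np.array([1.0, 2.0, 3.0])

# With the default indexing='xy', xx varies along columns and yy along rows:
# xx[i, j] == v[j] and yy[i, j] == v[i].
xx, yy = np.meshgrid(v, v)

# The elementwise product is therefore v[i] * v[j], which is the
# samples[sigma, i] * samples[sigma, j] term of the original loop.
outer = xx * yy

print(xx)
print(yy)
print(outer)
```

By the same indexing, yy * D.T gives v[i] * D[j, i] and xx * D gives v[j] * D[i, j], which are the two remaining terms.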
Edit: Here's a small script to verify that the results are as expected:
import numpy as np

M, N = 3, 4
rng = np.random.default_rng(seed=42)
samples = rng.random((M, N))
D = rng.random((N, N))
results = rng.random((N, N))
results_old = results.copy()
results_new = results.copy()

for sigma in range(M):
    for i in range(N):
        for j in range(N):
            results_old[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                          - samples[sigma, i]*D[j, i]
                                          - samples[sigma, j]*D[i, j])
print('\n\nresults_old', results_old, sep='\n')

for sigma in range(M):
    xx, yy = np.meshgrid(samples[sigma], samples[sigma])
    results_new += (1/N) * (xx * yy
                            - yy * D.T
                            - xx * D)
print('\n\nresults_new', results_new, sep='\n')
Edit 2: Entirely getting rid of loops: it is a bit convoluted but it essentially does the same thing.
M, N = samples.shape
xxx, yyy = np.meshgrid(samples, samples)
split_x = np.array(np.hsplit(np.vsplit(xxx, M)[0], M))
split_y = np.array(np.vsplit(np.hsplit(yyy, M)[0], M))
results += np.sum(
    (1/N) * (split_x*split_y
             - split_y*D.T
             - split_x*D), axis=0)
print(results)  # or return results
In order to vectorize for loops, we can make use of broadcasting and then reducing along any axes that are not reflected by the output array. To do so, we can "assign" one axis to each of the for loop indices (as a convention). For your example this means that all input arrays can be reshaped to have dimension 3 (i.e. len(a.shape) == 3); the axes correspond then to sigma, i, j respectively. Then we can perform all operations with the broadcasted arrays and finally reduce (sum) the result along the sigma axis (since only i, j are reflected in the result):
# Ordering of axes: (sigma, i, j)
samples_i = samples[:, :, np.newaxis]
samples_j = samples[:, np.newaxis, :]
D_ij = D[np.newaxis, :, :]
D_ji = D.T[np.newaxis, :, :]
return (samples_i*samples_j - samples_i*D_ji - samples_j*D_ij).sum(axis=0) / N
The following is a complete example that compares the reference code (using for loops) with the above version; note that I've removed the 1/N part in order to keep computations in the domain of integers and thus make the array equality test exact.
import time
import numpy as np

def timeit(func):
    # Decorator that prints the CPU time taken by each call.
    def wrapper(*args):
        t_start = time.process_time()
        res = func(*args)
        t_total = time.process_time() - t_start
        print(f'{func.__name__}: {t_total:.3f} seconds')
        return res
    return wrapper

rng = np.random.default_rng()
M, N = 100, 200
samples = rng.integers(0, 100, size=(M, N))
D = rng.integers(0, 100, size=(N, N))

@timeit
def reference(samples, D):
    results = np.zeros(shape=(N, N))
    for sigma in range(M):
        for i in range(N):
            for j in range(N):
                results[i, j] += (samples[sigma, i]*samples[sigma, j]
                                  - samples[sigma, i]*D[j, i]
                                  - samples[sigma, j]*D[i, j])
    return results

@timeit
def new(samples, D):
    # Ordering of axes: (sigma, i, j)
    samples_i = samples[:, :, np.newaxis]
    samples_j = samples[:, np.newaxis, :]
    D_ij = D[np.newaxis, :, :]
    D_ji = D.T[np.newaxis, :, :]
    return (samples_i*samples_j - samples_i*D_ji - samples_j*D_ij).sum(axis=0)

assert np.array_equal(reference(samples, D), new(samples, D))
This gives me the following benchmark results:
reference: 6.465 seconds
new: 0.133 seconds
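As a side note, the same "one axis per loop index, then reduce over sigma" idea can also be spelled with np.einsum, which performs the sum over sigma without first building the full (M, N, N) intermediate. This is only a sketch of an alternative, not part of the benchmark above; the function names are made up for illustration, and like the benchmark it leaves out the 1/N factor so the integer comparison stays exact.

```python
import numpy as np

def with_broadcasting(samples, D):
    # Same computation as the broadcasting version above (without 1/N).
    samples_i = samples[:, :, np.newaxis]
    samples_j = samples[:, np.newaxis, :]
    return (samples_i*samples_j - samples_i*D.T[np.newaxis] - samples_j*D[np.newaxis]).sum(axis=0)

def with_einsum(samples, D):
    # Index letters follow the loop variables: s = sigma, i, j.
    term1 = np.einsum('si,sj->ij', samples, samples)  # sum_s samples[s, i] * samples[s, j]
    term2 = np.einsum('si,ji->ij', samples, D)        # sum_s samples[s, i] * D[j, i]
    term3 = np.einsum('sj,ij->ij', samples, D)        # sum_s samples[s, j] * D[i, j]
    return term1 - term2 - term3

rng = np.random.default_rng(0)
samples = rng.integers(0, 100, size=(5, 4))
D = rng.integers(0, 100, size=(4, 4))
print(np.array_equal(with_einsum(samples, D), with_broadcasting(samples, D)))
```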
I found it easier to break the problem into smaller steps and work on each one until we are left with a single expression.
Going from your original formulation:
for sigma in range(M):
    for i in range(N):
        for j in range(N):
            results[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                      - samples[sigma, i]*D[j, i]
                                      - samples[sigma, j]*D[i, j])
The first thing is to eliminate the j index in the innermost loop. For this we start working with vectors instead of single elements:
for sigma in range(M):
    for i in range(N):
        results[i, :] += (1/N) * (samples[sigma, i]*samples[sigma, :] - samples[sigma, i]*D[:, i] - samples[sigma, :]*D[i, :])
Then, we eliminate the second loop, the one with i index. In this step we start to think in matrices. Therefore, each loop is the direct summation of "sigma matrices".
for sigma in range(M):
    results += (1/N) * (samples[sigma, :, np.newaxis] * samples[sigma] - samples[sigma, :, np.newaxis] * D.T - samples[sigma, :] * D)
I strongly recommend using this step as the solution, since vectorizing further builds an intermediate array of shape (M, N, N), which requires too much memory for large M. But, just for the sake of knowledge, think of the matrices as 3-dimensional objects. We do the calculations and sum over axis zero at the end:
results = (1/N) * (samples[:, :, np.newaxis] * samples[:,np.newaxis] - samples[:, :, np.newaxis] * D.T - samples[:, np.newaxis, :] * D).sum(axis=0)
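The memory concern above can also be sidestepped algebraically. Since D does not depend on sigma, the sum over sigma can be pulled inside each term: the first term becomes the matrix product samples.T @ samples, and the other two only need the column sums of samples. The sketch below is my own rearrangement rather than something stated in the answers; the function names are hypothetical, and it returns the increment that would be added to results.

```python
import numpy as np

def increment_closed_form(samples, D):
    N = samples.shape[1]
    col_sums = samples.sum(axis=0)   # col_sums[i] = sum over sigma of samples[sigma, i]
    gram = samples.T @ samples       # gram[i, j] = sum over sigma of samples[sigma, i] * samples[sigma, j]
    return (gram
            - col_sums[:, np.newaxis] * D.T    # col_sums[i] * D[j, i]
            - col_sums[np.newaxis, :] * D) / N  # col_sums[j] * D[i, j]

def increment_loop(samples, D):
    # Reference: the original triple loop, starting from zeros.
    M, N = samples.shape
    out = np.zeros((N, N))
    for sigma in range(M):
        for i in range(N):
            for j in range(N):
                out[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                      - samples[sigma, i]*D[j, i]
                                      - samples[sigma, j]*D[i, j])
    return out

rng = np.random.default_rng(0)
samples = rng.random((5, 4))
D = rng.random((4, 4))
print(np.allclose(increment_closed_form(samples, D), increment_loop(samples, D)))
```

This form never allocates anything larger than N x N, so its memory use does not grow with M.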
I have the following matrices: Q, P, q and y with shapes (100,100), (100,100), (100,100) and (100,2) respectively.
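For every i, I want to compute the following (written out here from the loop below):
grad[i] = 4 * sum over j of (P[i, j] - Q[i, j]) * q[i, j] * (y[i] - y[j])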
This is what I've tried so far. It appears to work, but I know this is bad practice and painfully slow.
grad = np.zeros((100, 2))  # the shape must be passed as a tuple
for i in range(100):
    tmp = 0
    for j in range(100):
        tmp += ((P[i, j] - Q[i, j]) * q[i, j] * (y[i, :] - y[j, :]))
    grad[i, :] = tmp * 4
My question is how can I compute this using matrix operations instead of nested loops?
From your notation, try broadcasting:
grad = 4 * (((P-Q)*q)[...,None]*(y[:,None,:]-y[None])).sum(axis=1)
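A quick way to gain confidence in the one-liner is to compare it against the nested loops on small random inputs. The sketch below does that with a reduced size n = 6 (an assumption; the question uses 100) and also shows an equivalent matrix-product form that I derived by splitting the sum, which avoids the (n, n, 2) intermediate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # small hypothetical size for the check
P = rng.random((n, n))
Q = rng.random((n, n))
q = rng.random((n, n))
y = rng.random((n, 2))

# Reference: the nested loops from the question.
grad_loop = np.zeros((n, 2))
for i in range(n):
    for j in range(n):
        grad_loop[i] += 4 * (P[i, j] - Q[i, j]) * q[i, j] * (y[i] - y[j])

# Broadcasting version: axes are (i, j, coordinate), summed over j.
grad_broadcast = 4 * (((P - Q) * q)[..., None] * (y[:, None, :] - y[None])).sum(axis=1)

# Equivalent form: sum_j W[i, j] * (y[i] - y[j]) = rowsum(W)[i] * y[i] - (W @ y)[i].
W = (P - Q) * q
grad_matmul = 4 * (W.sum(axis=1)[:, None] * y - W @ y)

print(np.allclose(grad_loop, grad_broadcast), np.allclose(grad_loop, grad_matmul))
```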
code:
def expected_profit(n):
    total = 0
    X = np.arange(0,n+1)
    p = np.arange(0,n+1)
    profit = np.arange(0,n+1)
    for i in list(range(1,n+1)):
        print("X_i:", X[i])
        p[i] = binom.pmf(X[i],n,19/20)
        print(p[i])
        if X[i] > 100:
            profit[i] = 50*n-60*(X[i]-100)
        else:
            profit[i] = 50*n
        total += profit[i]*p[i]
    return total
expected_profit(10)
>>>0
For some reason, after each iteration, p[i] is equal to zero. Yet when I manually type out (for example) binom.pmf(10,10,19/20) I get a non zero answer. What is the problem here?
This seems to happen with any call to binom.pmf within the function call.
With p = np.arange(0,n+1) you initialize p as an integer array 0,...,n. As a result, the value returned by binom.pmf(...) is truncated to an integer when it is assigned to p[i], and since a probability below 1 truncates to 0, every p[i] ends up as zero. The solution is to make p an array of floats; np.zeros() creates an array of floats by default. The same problem applies to profit.
Fitting this into the code would look like:
from scipy.stats import binom
import numpy as np

def expected_profit(n):
    total = 0
    X = np.arange(0, n + 1)
    # Float arrays, so assigned probabilities are not truncated to integers.
    p = np.zeros(n + 1, dtype=float)
    profit = np.zeros(n + 1, dtype=float)
    for i in range(1, n + 1):
        p[i] = binom.pmf(X[i], n, 19/20)
        if X[i] > 100:
            profit[i] = 50 * n - 60 * (X[i] - 100)
        else:
            profit[i] = 50 * n
        total += profit[i] * p[i]
    return total

print(expected_profit(10))
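The truncation described above can be reproduced in isolation, without scipy. In this minimal sketch the value 0.95 and the array names are made up for illustration; the point is only that assigning a float into an integer array silently drops the fractional part.

```python
import numpy as np

p_int = np.arange(0, 3)   # integer dtype, like p in the question
p_int[1] = 0.95           # stored as 0: the fractional part is dropped

p_float = np.zeros(3)     # float64 by default
p_float[1] = 0.95         # stored as 0.95

print(p_int.dtype, p_int[1])
print(p_float.dtype, p_float[1])
```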
How to append local variable from inside function to array/list in python?
Below is my code.
I want to append the corr variable to an empty list, say T=[].
It is not appending, and the program goes into an infinite loop.
How can I do this?
# Python Program to find correlation coefficient.
import math
# function that returns correlation coefficient.
def correlationCoefficient(X, Y, n) :
    sum_X = 0
    sum_Y = 0
    sum_XY = 0
    squareSum_X = 0
    squareSum_Y = 0
    i = 0
    while i < n :
        # sum of elements of array X.
        sum_X = sum_X + X[i]
        # sum of elements of array Y.
        sum_Y = sum_Y + Y[i]
        # sum of X[i] * Y[i].
        sum_XY = sum_XY + X[i] * Y[i]
        # sum of square of array elements.
        squareSum_X = squareSum_X + X[i] * X[i]
        squareSum_Y = squareSum_Y + Y[i] * Y[i]
        z = ((float)(math.sqrt((n * squareSum_X -sum_X * sum_X)* (n * squareSum_Y -sum_Y * sum_Y))))
        y = ((float)(n * sum_XY - sum_X * sum_Y))
        i = i + 1
        if z == 0:
            corr = 0
        else:
            # use formula for calculating correlation coefficient.
            corr=abs(y/z)
        while corr<1:
            T=[]
            T.append(corr)
            print("T",T)
    return corr
# Driver function
A = [0,7.6,7.7,6.4,6.25,6.4,6.4,5.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8.3,6.4,3.2,3.2,3.25,3.25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5.35,5,4.85,5.65,5.4,5.35,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
B = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86]
X = [0]*5
Y = [0]*5
# the size of array.
n=5
# Function call to correlationCoefficient.
k=0
while k <= len(A):
    i = k
    m = 0
    while i <= k+4:
        X[m] = A[i]
        #print("A[i]",A[i])
        Y[m] = B[i]
        #print("B[i]",B[i])
        i = i + 1
        m = m + 1
    #correlationCoefficient(X, Y, 5)
    print ((correlationCoefficient(X, Y, 5)))
    k = k + 1
The relevant bit seems to be here:
corr=abs(y/z)
while corr<1:
    T=[]
    T.append(corr)
    print("T",T)
return corr
You're blanking out the T array each time that while loop runs, and it will run forever if corr<1, since you never change the value of corr.
Move T=[] outside of the while i<n loop if you'd like it to stick around, and modify corr (or use an if instead) to avoid the infinite loop.
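One possible shape for the fix described above is sketched here: the list T is created once, outside both the function and the loop, and an if replaces the while so each window is tested exactly once. The snake_case function name, the window slicing, and the shortened integer data are my own assumptions for illustration; they are not the asker's original values.

```python
import math

def correlation_coefficient(X, Y):
    """Absolute correlation coefficient of two equal-length sequences."""
    n = len(X)
    sum_x = sum(X)
    sum_y = sum(Y)
    sum_xy = sum(x * y for x, y in zip(X, Y))
    square_sum_x = sum(x * x for x in X)
    square_sum_y = sum(y * y for y in Y)
    y = n * sum_xy - sum_x * sum_y
    z = math.sqrt((n * square_sum_x - sum_x * sum_x) * (n * square_sum_y - sum_y * sum_y))
    return 0.0 if z == 0 else abs(y / z)

# Shortened, hypothetical data (integers keep the arithmetic exact).
A = [0, 7, 8, 6, 6, 5, 0, 0, 0, 0, 0]
B = list(range(1, len(A) + 1))
window = 5

T = []  # created once, so appended values accumulate
for k in range(len(A) - window + 1):  # stop before the window runs past the end
    corr = correlation_coefficient(A[k:k + window], B[k:k + window])
    if corr < 1:  # 'if', not 'while': corr never changes inside the test
        T.append(corr)

print("T", T)
```

Bounding the outer loop by len(A) - window + 1 also avoids indexing past the end of A, which a condition like k <= len(A) would eventually do.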
How would I write the following using list comprehension?
def mv(A,X,n):
    Y = [0]*n
    for i in range(n):
        for j in range(n):
            Y[i] += A[i][j] * X[j]
    return Y
I believe that A is a matrix and that X is a vector. This is what I have tried so far, but it does not output the same thing:
def mv2(A,X,n):
    res = [sum((A[i][j] * X[i]) for i in range(n) for j in range(n))]
    return res
You are very close to the right answer. You need to apply sum to the right target: one sum per row i, taken over j, and with X[j] rather than X[i] in the product:
return [sum([A[i][j] * X[j] for j in range(n)]) for i in range(n)]
Notes: if you want to do the math with a library, numpy is a good option
import numpy as np

def mv2(A, X):
    A = np.array(A)
    X = np.array(X)
    return np.dot(A, X)
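To check that the comprehension really matches the loop, the two can be run side by side on a small input. The sketch below does that with a made-up 2 x 2 matrix; mv_zip is an additional variant of my own that iterates over rows directly and so does not need n.

```python
def mv_loop(A, X, n):
    # The original nested-loop version.
    Y = [0] * n
    for i in range(n):
        for j in range(n):
            Y[i] += A[i][j] * X[j]
    return Y

def mv_comprehension(A, X, n):
    # One inner sum over j for each row i.
    return [sum(A[i][j] * X[j] for j in range(n)) for i in range(n)]

def mv_zip(A, X):
    # Same result, pairing each row's entries with X via zip.
    return [sum(a * x for a, x in zip(row, X)) for row in A]

A = [[1, 2], [3, 4]]  # hypothetical example matrix
X = [5, 6]            # hypothetical example vector

print(mv_loop(A, X, 2), mv_comprehension(A, X, 2), mv_zip(A, X))
```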