Summation within iteration over two variables with matrix operations - python

I have the following matrices: Q, P, q and y with shapes (100,100), (100,100), (100,100) and (100,2) respectively.
For every i, I want to compute the following:
$$\mathrm{grad}_i = 4 \sum_{j} (P_{ij} - Q_{ij})\, q_{ij}\, (y_i - y_j)$$
This is what I've tried so far, it appears to work but I know this is bad practice
and painfully slow.
grad = np.zeros((100, 2))  # the shape must be passed as a tuple
for i in range(100):
    tmp = 0
    for j in range(100):
        tmp += (P[i, j] - Q[i, j]) * q[i, j] * (y[i, :] - y[j, :])
    grad[i, :] = tmp * 4
My question is how can I compute this using matrix operations instead of nested loops?

From your notation, try broadcasting:
grad = 4 * (((P-Q)*q)[...,None]*(y[:,None,:]-y[None])).sum(axis=1)
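As a sanity check, the broadcast expression can be compared with the loop version on random data. The sketch below uses a small illustrative size (n = 6) and random inputs, neither of which comes from the question. It also shows an equivalent form that avoids the (n, n, 2) temporary, assuming W = (P - Q) * q and using sum_j W_ij (y_i - y_j) = y_i * sum_j W_ij - (W y)_i.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # small illustrative size; the question uses 100
P, Q, q = rng.random((3, n, n))
y = rng.random((n, 2))

# Reference: the nested loops from the question
grad_loop = np.zeros((n, 2))
for i in range(n):
    for j in range(n):
        grad_loop[i] += 4 * (P[i, j] - Q[i, j]) * q[i, j] * (y[i] - y[j])

# Broadcasting: (n, n, 1) weights times (n, n, 2) pairwise differences
W = (P - Q) * q
grad_bcast = 4 * (W[..., None] * (y[:, None, :] - y[None])).sum(axis=1)

# Same sum without the (n, n, 2) temporary
grad_matmul = 4 * (W.sum(axis=1)[:, None] * y - W @ y)
```

The matrix-product form is likely to use less memory for large n, since it never materialises all pairwise differences.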

Related

Is there any way to optimize a triple loop in Python by using numpy or other resources?

I'm having trouble finding out a way to optimize a triple loop in Python. I will directly give the code for a better and simpler representation of what I have to compute :
Given two 2-D arrays named samples (M x N) and D(N x N) along with the output results (NxN):
for sigma in range(M):
    for i in range(N):
        for j in range(N):
            results[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                      - samples[sigma, i]*D[j, i]
                                      - samples[sigma, j]*D[i, j])
return results
It does the job but is not efficient at all in Python. I tried to unroll the for i.. for j.. loops, but I cannot compute it correctly with sigma in the way.
Does anyone have an idea how to optimize these few lines? Any suggestions are welcome, such as numpy, numexpr, etc.
One way I found to improve your code (i.e. reduce the number of loops) is by using np.meshgrid.
Here is the improvement I found. It took some fiddling but it gives the same output as your triple loop code. I kept the same code structure so you can see which parts correspond to which. I hope this is of use to you!
for sigma in range(M):
    xx, yy = np.meshgrid(samples[sigma], samples[sigma])
    results += (1/N) * (xx * yy
                        - yy * D.T
                        - xx * D)
print(results) # or return results
Edit: Here's a small script to verify that the results are as expected:
import numpy as np
M, N = 3, 4
rng = np.random.default_rng(seed=42)
samples = rng.random((M, N))
D = rng.random((N, N))
results = rng.random((N, N))
results_old = results.copy()
results_new = results.copy()
for sigma in range(M):
    for i in range(N):
        for j in range(N):
            results_old[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                          - samples[sigma, i]*D[j, i]
                                          - samples[sigma, j]*D[i, j])
print('\n\nresults_old', results_old, sep='\n')
for sigma in range(M):
    xx, yy = np.meshgrid(samples[sigma], samples[sigma])
    results_new += (1/N) * (xx * yy
                            - yy * D.T
                            - xx * D)
print('\n\nresults_new', results_new, sep='\n')
Edit 2: Entirely getting rid of loops: it is a bit convoluted but it essentially does the same thing.
M, N = samples.shape
xxx, yyy = np.meshgrid(samples, samples)
split_x = np.array(np.hsplit(np.vsplit(xxx, M)[0], M))
split_y = np.array(np.vsplit(np.hsplit(yyy, M)[0], M))
results += np.sum(
    (1/N) * (split_x*split_y
             - split_y*D.T
             - split_x*D), axis=0)
print(results) # or return results
In order to vectorize for loops, we can make use of broadcasting and then reducing along any axes that are not reflected by the output array. To do so, we can "assign" one axis to each of the for loop indices (as a convention). For your example this means that all input arrays can be reshaped to have dimension 3 (i.e. len(a.shape) == 3); the axes correspond then to sigma, i, j respectively. Then we can perform all operations with the broadcasted arrays and finally reduce (sum) the result along the sigma axis (since only i, j are reflected in the result):
# Ordering of axes: (sigma, i, j)
samples_i = samples[:, :, np.newaxis]
samples_j = samples[:, np.newaxis, :]
D_ij = D[np.newaxis, :, :]
D_ji = D.T[np.newaxis, :, :]
return (samples_i*samples_j - samples_i*D_ji - samples_j*D_ij).sum(axis=0) / N
The following is a complete example that compares the reference code (using for loops) with the above version; note that I've removed the 1/N part in order to keep computations in the domain of integers and thus make the array equality test exact.
import time
import numpy as np

def timeit(func):
    # Decorator that prints the CPU time taken by each call
    def wrapper(*args):
        t_start = time.process_time()
        res = func(*args)
        t_total = time.process_time() - t_start
        print(f'{func.__name__}: {t_total:.3f} seconds')
        return res
    return wrapper

rng = np.random.default_rng()
M, N = 100, 200
samples = rng.integers(0, 100, size=(M, N))
D = rng.integers(0, 100, size=(N, N))

@timeit
def reference(samples, D):
    results = np.zeros(shape=(N, N))
    for sigma in range(M):
        for i in range(N):
            for j in range(N):
                results[i, j] += (samples[sigma, i]*samples[sigma, j]
                                  - samples[sigma, i]*D[j, i]
                                  - samples[sigma, j]*D[i, j])
    return results

@timeit
def new(samples, D):
    # Ordering of axes: (sigma, i, j)
    samples_i = samples[:, :, np.newaxis]
    samples_j = samples[:, np.newaxis, :]
    D_ij = D[np.newaxis, :, :]
    D_ji = D.T[np.newaxis, :, :]
    return (samples_i*samples_j - samples_i*D_ji - samples_j*D_ij).sum(axis=0)

assert np.array_equal(reference(samples, D), new(samples, D))
This gives me the following benchmark results:
reference: 6.465 seconds
new: 0.133 seconds
I found it easier to break the problem into smaller steps and work through them until we have a single expression.
Going from your original formulation:
for sigma in range(M):
    for i in range(N):
        for j in range(N):
            results[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                      - samples[sigma, i]*D[j, i]
                                      - samples[sigma, j]*D[i, j])
The first thing is to eliminate the j index in the innermost loop. For this we start working with vectors instead of single elements:
for sigma in range(M):
    for i in range(N):
        results[i, :] += (1/N) * (samples[sigma, i]*samples[sigma, :] - samples[sigma, i]*D[:, i] - samples[sigma, :]*D[i, :])
Then we eliminate the second loop, the one over the index i. In this step we start to think in matrices: each iteration adds one "sigma matrix" to the result.
for sigma in range(M):
    results += (1/N) * (samples[sigma, :, np.newaxis] * samples[sigma] - samples[sigma, :, np.newaxis] * D.T - samples[sigma, :] * D)
I strongly recommend using this step as the solution, since vectorizing further would require too much memory for a large value of M. But, just for knowledge...
think of the matrices as 3-dimensional objects. We do the calculations and sum over index zero at the end:
results = (1/N) * (samples[:, :, np.newaxis] * samples[:,np.newaxis] - samples[:, :, np.newaxis] * D.T - samples[:, np.newaxis, :] * D).sum(axis=0)
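If memory is the concern, the sum over sigma can also be taken before any 3-D array is formed, because D does not depend on sigma. The following is a minimal sketch of that idea, not part of the answers above; the sizes and the name col_sum are illustrative assumptions, and results starts from zeros.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 5, 4  # illustrative sizes
samples = rng.random((M, N))
D = rng.random((N, N))

# Reference: triple loop from the question, starting from zeros
expected = np.zeros((N, N))
for sigma in range(M):
    for i in range(N):
        for j in range(N):
            expected[i, j] += (1/N) * (samples[sigma, i]*samples[sigma, j]
                                       - samples[sigma, i]*D[j, i]
                                       - samples[sigma, j]*D[i, j])

# sum_sigma s[sigma, i] * s[sigma, j] is the matrix product samples.T @ samples;
# the two D terms only need the column sums of samples.
col_sum = samples.sum(axis=0)
results = (samples.T @ samples
           - col_sum[:, np.newaxis] * D.T
           - col_sum[np.newaxis, :] * D) / N
```

This keeps every intermediate at size (N, N) regardless of M.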

Optimise matrix factorisation algorithm using numpy matrix operations

O = self.feedback_df_normalised.to_numpy()  # original matrix
K = self.latent_feature_count
P = np.random.rand(len(O), K)  # user embeddings
Q = np.random.rand(len(O[0]), K)  # show embeddings
Q_T = np.transpose(Q)
for i in range(len(O)):
    print("i:", i)
    for j in range(len(O[0])):
        print("j:", j)
        A_ij = np.dot(P[i, :], Q_T[:, j])
        dif_ij = O[i, j] - A_ij
        dif_sqd += dif_ij ** 2
        for k in range(K):
            P[i, k] = P[i, k] + alpha * (2 * dif_ij * Q_T[k, j] - beta * P[i, k])
            Q_T[k, j] = Q_T[k, j] + alpha * (2 * dif_ij * P[i, k] - beta * Q_T[k, j])
    print("dif_sqd:", dif_sqd)
    if dif_sqd < accepted_deviation:
        A = P @ Q_T
        break
I have this algorithm implementing matrix factorisation via gradient descent, based on this one:
https://towardsdatascience.com/recommendation-system-matrix-factorization-d61978660b4b#:~:text=Collaborative%20filtering%20is%20the%20application,items'%20and%20users'%20entities.&text=Hence%2C%20from%20the%20matrix%20factorization,in%20user's%20preferences%20and%20interactions.
[Image: iterative relationship the algorithm aims to implement]
The general format of O is something like this:
O = [
[5,3,0,1],
[4,0,0,1],
[1,1,0,5],
[1,0,0,4],
[0,1,5,4],
[2,1,3,0],
]
When O becomes large this becomes veerrrrryyyy slow to execute though. I've been banging my head trying to think about how to do this sans looping, but I'm not good enough with matrices to figure it out. Any help would be appreciated.
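One possible direction, sketched under stated assumptions: the loop above updates P and Q after every single entry, which is inherently sequential. If a full-batch gradient step is acceptable instead (a different, swapped-in technique that will not reproduce the loop's numbers exactly), every residual and both updates can be written as matrix products. The values of K, alpha, beta and the step count below are assumptions, and zeros in O are treated as observed values, as in the loop.

```python
import numpy as np

rng = np.random.default_rng(0)
O = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4],
              [2, 1, 3, 0]], dtype=float)
K = 2                       # latent feature count (assumed)
alpha, beta = 0.002, 0.02   # step size and regularisation (assumed)
P = rng.random((O.shape[0], K))  # user embeddings
Q = rng.random((O.shape[1], K))  # show embeddings

losses = []
for step in range(500):
    E = O - P @ Q.T                # all residuals at once
    losses.append(np.sum(E ** 2))  # plays the role of dif_sqd
    # Both updates use the same E, i.e. the values from the start of the step
    P_new = P + alpha * (2 * E @ Q - beta * P)
    Q_new = Q + alpha * (2 * E.T @ P - beta * Q)
    P, Q = P_new, Q_new

A = P @ Q.T  # approximation of O
```

The only Python-level loop left is the one over steps, so the cost per step is a handful of matrix products rather than len(O) * len(O[0]) * K scalar updates.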

Diagonal of a numpy matrix without compute the entire matrix

I have a simple algebraic problem and I would like to solve it with numpy (of course I could solve it easily with numba, but that is not the point).
Let us consider a first random matrix A with size (m x n), with m a big value, and a second random matrix B with size (n x n).
A = np.random.random((int(1E6), int(1E2)))  # shapes must be integers
B = np.random.random((int(1E2), int(1E2)))
We want to compute the following expression:
np.diag(np.dot(np.dot(A,B),B.T))
The problem is that the entire matrix is loaded into memory and only then is the diagonal extracted. Is it possible to do this operation in a more efficient way?
This is how I would approach it from your starting expression
np.diag(np.dot(np.dot(A,B),B.T))
You can start by grouping terms:
np.diag(np.dot(A, np.dot(B,B.T)))
then only use the first relevant (square) part of A:
np.diag(np.dot(A[:B.shape[0], :], np.dot(B,B.T)))
and then avoid the extra multiplications (that will fall out of the diagonal), by doing the element-wise multiplications yourself:
np.sum( np.multiply(A[:B.shape[0], :].T, np.dot(B,B.T)), 0)
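The identity behind that last step, diag(X @ Y)[i] = sum_k X[i, k] * Y[k, i], can be checked on a small case. This is a sketch with illustrative sizes (m = 1000, n = 50, not the sizes from the question), and it groups the product as (A @ B) @ B.T rather than A @ (B @ B.T), which gives the same result.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 1000, 50  # illustrative sizes; the question uses 1E6 and 1E2
A = rng.random((m, n))
B = rng.random((n, n))

# Baseline: build the full (m, n) product, then take its diagonal
expected = np.diag(A @ B @ B.T)

# Only the first n rows of A reach the diagonal of an (m, n) matrix
X = A[:n] @ B  # shape (n, n)

# diag(X @ B.T)[i] = sum_k X[i, k] * B[i, k], so no second matrix product is needed
diag_sum = (X * B).sum(axis=1)
diag_einsum = np.einsum('ik,ik->i', X, B)
```

Both forms return the n diagonal entries without forming X @ B.T.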
Changed (A*B)*B.T to A*(B*B.T).
Multiplied only the part of A (A[:B.shape[0]]) that contributes to the diagonal of the result.
import numpy as np
import time
A = np.random.random((1_000_000, 100))
B = np.random.random((100, 100))
start_time = time.time()
result = np.diag(np.dot(np.dot(A, B), B.T))
print('Baseline: ', time.time() - start_time)
start_time = time.time()
for i in range(100):
    result2 = np.diag(np.dot(A[:B.shape[0]], np.dot(B, B.T)))
print('Optimized: ', (time.time() - start_time) / 100)
assert np.allclose(result, result2)
Baseline: 1.7957241535186768
Optimized: 0.00016015291213989258
Yes.
m, n = 1_000_000, 100
A = np.random.random((m, n))
B = np.random.random((n, n))
result = np.empty(n)  # the diagonal of an (m, n) matrix has n entries
for i in range(n):
    result[i] = np.dot(np.dot(A[i, :], B), B.T[:, i])
    # B.T[:, i] is the same vector as B[i, :]
Explanation:
Say we have: K = np.dot(np.dot(A, B), B.T).
Then K[0, 0] = np.dot(np.dot(A[0, :], B), B.T[:, 0]).
Let X = np.dot(A[0, :], B), which is row 0 of np.dot(A, B).
Then np.dot(X, B.T[:, 0]) is the [0, 0] element of np.dot(np.dot(A, B), B.T).
We can generalize this result to: K[i, i] = np.dot(np.dot(A[i, :], B), B.T[:, i]).

Error while running an array within function in python

RuntimeWarning: divide by zero encountered in double_scalars
I get this warning when passing arrays to the function below:
import numpy as np
import random
def lagrange(x, y, x_int):
    n = x.size
    y_int = 0
    for i in range(0, n):
        p = y[i]
        for j in range(0, n):
            if i != j:
                p = p * (x_int - x[j]) / (x[i] - x[j])
        y_int = y_int + p
    return [y_int]

x = []
y = []
for i in range(1000):
    x.append(random.randint(0,100))
    y.append(random.randint(0,100))
fx = 3.5
print(lagrange(np.array(x),np.array(y),fx))
I expected to get 1000 iterations of output. Is there any solution to this problem?
Your error message refers to a function not mentioned in your code. But I assume the issue is that x[i] and x[j] can be the same number, so you are dividing by zero on your p = p * (x_int - x[j]) / (x[i] - x[j]) line. You will need to add a special case that does something different when x[i] equals x[j].
Since you're generating your x array randomly from the range (0, 100) and the array size is 1000, it's guaranteed that x[i] == x[j] for some i != j. You need to ensure the elements in x are unique.
See: How do I create a list of random numbers without duplicates?
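A minimal sketch of that fix, assuming unique nodes are drawn with random.sample. It uses 10 nodes rather than 1000, because only 101 distinct integers exist in the range 0 to 100; the seed and the float conversion are also assumptions made for the example.

```python
import random
import numpy as np

def lagrange(x, y, x_int):
    # Evaluate the Lagrange interpolating polynomial through (x, y) at x_int
    n = x.size
    y_int = 0.0
    for i in range(n):
        p = y[i]
        for j in range(n):
            if i != j:
                p = p * (x_int - x[j]) / (x[i] - x[j])
        y_int += p
    return y_int

random.seed(0)
# random.sample draws without replacement, so no two x values coincide
x = np.array(random.sample(range(101), 10), dtype=float)
y = np.array([random.randint(0, 100) for _ in range(10)], dtype=float)

value = lagrange(x, y, 3.5)
print(value)
```

With distinct nodes the denominator x[i] - x[j] is never zero, so the warning does not occur.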
In your nested loop could it be that you meant to do if x[i] != x[j]:
Those would be the values you wouldn't want to be the same in your division.

Fast math operations on an array in python

I have a fairly simple math operation I'd like to perform on an array. Let me write out the example:
A = numpy.ndarray((255, 255, 3), dtype=numpy.single)
# ..
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        x = simple_func1(i)
        y = simple_func2(j)
        A[i, j] = (alpha * x * y + beta * x**2 + gamma * y**2, 1, 0)
So basically, there's a mapping between (i, j) and the 3 values stored at that position (this is for visualization).
I'd like to roll this up and somehow vectorize this, but I'm not sure how to or if I can. Thanks.
Here is the vectorized version:
i = numpy.arange(255)
j = numpy.arange(255)
x = simple_func1(i)
y = simple_func2(j)
x = x.reshape(-1, 1)  # column vector, so that A[i, j] pairs x[i] with y[j]
A = alpha * x * y + beta * x**2 + gamma * y**2 # broadcasting is your friend here
If you want to fill the last coordinates with 1 and 0:
B = numpy.empty(A.shape + (3,))
B[:,:,0] = A
B[:,:,1] = 1 # broadcasting again
B[:,:,2] = 0
You have to change simple_funcN so that they take arrays as input, and create arrays as output. After that, you could look into the numpy.meshgrid() or the cartesian() function here to build coordinate arrays. After that, you should be able to use the coordinate array(s) to fill A with a one-liner.
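The coordinate-array approach could look roughly like this. The bodies of simple_func1 and simple_func2 and the values of alpha, beta and gamma are hypothetical stand-ins, since the question does not show them; the only requirement is that the functions accept arrays.

```python
import numpy as np

alpha, beta, gamma = 0.5, 0.25, 0.125  # hypothetical coefficients

def simple_func1(i):
    # Hypothetical stand-in; works on scalars and arrays
    return i / 255.0

def simple_func2(j):
    # Hypothetical stand-in; works on scalars and arrays
    return np.sqrt(j)

n = 255
# indexing='ij' makes I[i, j] == i and J[i, j] == j
I, J = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
X, Y = simple_func1(I), simple_func2(J)

A = np.empty((n, n, 3), dtype=np.single)
A[..., 0] = alpha * X * Y + beta * X**2 + gamma * Y**2
A[..., 1] = 1
A[..., 2] = 0
```

Passing indexing='ij' matters here: the default 'xy' ordering would swap the roles of the two axes relative to the loop version.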
