I am assigning values to a numpy array by looking up values in other numpy arrays. These arrays have potentially different indices. Here is an example:
import numpy as np
A=1; B=2; C=3; D=4; E=5
X = np.random.normal(0,1,(A,B,C,E))
Y = np.random.normal(0,1,(A,B,D))
Z = np.random.normal(0,1,(A,C))
Result = np.zeros((A,B,C,D,E))
for a in range(A):
    for b in range(B):
        for c in range(C):
            for d in range(D):
                for e in range(E):
                    Result[a,b,c,d,e] = Z[a,c] + Y[a,b,d] + X[a,b,c,e]
What is the best way to optimize this code? I can remove the E for loop using Result[a,b,c,d,:] = Z[a,c] + Y[a,b,d] + X[a,b,c,:]. But then how to remove the rest of the loops? I was also thinking that I could manipulate X,Y,Z before assignment so it merges easily with the dimensions of Result. There must be more elegant ways. Thanks for tips.
Here's one way:
Result = Z[:,None,:,None,None] + Y[:,:,None,:,None] + X[:,:,:,None,:]
To produce this vectorized version, all I did was replace the various indices into X, Y, and Z with full a,b,c,d,e-style indexing, inserting None where missing indices were found. For example, Y[a,b,d] becomes Y[a,b,None,d,None], which vectorizes into Y[:,:,None,:,None].
In numpy, indexing by None tells the array to pretend like it has an additional axis. This doesn't change the size of the array, but it does change how operations get broadcasted, which is what we need here. Check out the numpy broadcasting docs for more info.
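As a quick sanity check, the broadcast expression can be compared against the original nested loops. This is only a sketch: the seeded generator `rng` and the names `looped` and `vectorized` are introduced here for illustration and are not part of the question.

```python
import numpy as np

A, B, C, D, E = 1, 2, 3, 4, 5
rng = np.random.default_rng(0)  # seeded only so the check is reproducible
X = rng.normal(0, 1, (A, B, C, E))
Y = rng.normal(0, 1, (A, B, D))
Z = rng.normal(0, 1, (A, C))

# Reference result from the explicit loops in the question
looped = np.zeros((A, B, C, D, E))
for a in range(A):
    for b in range(B):
        for c in range(C):
            for d in range(D):
                looped[a, b, c, d, :] = Z[a, c] + Y[a, b, d] + X[a, b, c, :]

# Each None inserts a length-1 axis that broadcasting then stretches
vectorized = Z[:, None, :, None, None] + Y[:, :, None, :, None] + X[:, :, :, None, :]

print(Y[:, :, None, :, None].shape)  # (1, 2, 1, 4, 1)
print(vectorized.shape)              # (1, 2, 3, 4, 5)
assert np.allclose(looped, vectorized)
```

The printed shape of the indexed `Y` shows the length-1 axes standing in for the missing `c` and `e` indices.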
I need to make calculations using two lists each with 36 elements in them. The calculation must use one value in each list using all combinations. Example:
listx = [x1 , x2 , x3 , ... , x36]
listy = [y1 , y2 , y3 , ... , y36]
F(x,y) = ((y-x)*(a/b))+x
x and y in F(x,y) must assume all combinations inside listx and listy. Results should be a matrix of (36 x 36)
This is what I've tried so far:
listx = np.arange(-0.05,0.301,0.01)
listy = np.arange(-0.05,0.301,0.01)
for x in listx:
    for y in listy:
        F = ((y-x)*(a/b))+x
        print(F)
So I think the issue is that you are having trouble conceptualizing the grid that these solutions are supposed to be stored in. This calculation is good because it is an introduction to certain optimizations and additionally there are a few ways to do it. I'll show you the three I threw together.
First, you could do it with lists and loops, which is very inefficient (numpy is just to show the shape):
import numpy as np
x, y = [], []
length = 35
for i in range(length+1):
    x.append(i/length) # Normalizing over the range of the grid
    y.append(i/length) # to compare to later example
def func(x, y, a, b):
    return ((y-x)*(a/b))+x
a=b=1 # Set a value for a and b
row = []
for i in x:
    column = []
    for j in y:
        column.append(func(i,j,a,b))
    row.append(column)
print(row)
print(np.shape(row))
This will output a solution assuming a and b are known, and it is a 36x36 matrix. To make the matrix, we have to create a large memory space which I called row and smaller memory spaces that are recreated each iteration of the loop I called column. The inner-most loop appends the values to the column list, while the evaluated column lists are appended to the top level row list. It will then have a matrix-esque appearance even if it is just a list of lists.
A more efficient way to do this is to use numpy. First, we can keep the loops if you wish and do the calculation with numpy arrays:
import numpy as np
x = y = np.linspace(0,1,36)
result = np.zeros((len(x), len(y)))
F = lambda x,y,a,b: ((y-x)*(a/b))+x
a=b=1
for idx, i in enumerate(x):
    for jdx, j in enumerate(y):
        result[idx, jdx] = F(i,j,a,b) # plug in value at idx, jdx grid point
print(result)
print(result.shape)
So here we create the grid using linspace, and I just chose values from 0 to 1 in 36 steps. After this, I create the grid we will store the solutions in by making a numpy array with dimensions given by the lengths of the x and y arrays. Finally, the function is created with a lambda, which serves the same purpose as the def previously, just in one line. The loop is kept for now; it iterates over the values i, j and their indices idx, jdx. The results are added into the allocated storage at each index with result[idx, jdx] = F(i,j,a,b).
We can do better, because numpy exists to help remove loops in calculations. Instead, we can utilize the meshgrid function to create a matrix and evaluate the function with it, as so:
import numpy as np
x = y = np.linspace(0,1,36)
X, Y = np.meshgrid(x,y)
F = lambda x,y,a,b: ((y-x)*(a/b))+x
a=b=1
result = F(X,Y,a,b) # Plug in grid directly
print(result.T)
print(result.shape)
Here we use the numpy arrays and tell meshgrid that we want a 36x36 array with these values at each grid point. Then we define the lambda function as before and pass the new X and Y to the function. The output does not require additional storage or loops, so then we get the result.
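The transpose in the meshgrid example is needed because meshgrid defaults to indexing="xy", which puts x along the columns. A small sketch of two alternatives that produce the loop ordering directly: meshgrid with indexing="ij", and plain broadcasting. The names `grid`, `broadcast` and `looped` are introduced here only for the comparison.

```python
import numpy as np

def F(x, y, a, b):
    return ((y - x) * (a / b)) + x

x = y = np.linspace(0, 1, 36)
a = b = 1

# indexing="ij" makes grid[i, j] correspond to (x[i], y[j]), so no transpose is needed
X, Y = np.meshgrid(x, y, indexing="ij")
grid = F(X, Y, a, b)

# Broadcasting a column of x against a row of y gives the same 36x36 result
broadcast = F(x[:, None], y[None, :], a, b)

# Reference: the nested-loop ordering used in the earlier examples
looped = np.array([[F(i, j, a, b) for j in y] for i in x])

assert np.allclose(grid, looped)
assert np.allclose(broadcast, looped)
print(grid.shape)  # (36, 36)
```

The broadcasting form avoids building the two intermediate 36x36 coordinate arrays, which matters more as the grids grow.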
It is good to practice using numpy for any calculation you want to do, because they can usually be done without loops.
So suppose I have two numpy ndarrays whose elements are matrices. I need element-wise multiplication for these two arrays; however, there should be matrix multiplication between the two matrix elements. Of course I would be able to implement this with for loops, but I was looking to solve this problem without using an explicit for loop. How do I implement this?
EDIT: This for-loop does what I want to do. I'm on python 2.7
n = np.arange(8).reshape(2,2,1,2)
l = np.arange(1,9).reshape(2,2,2,1)
k = np.zeros((2,2))
for i in range(len(n)):
    for j in range(len(n[i])):
        k[i][j] = np.asscalar(n[i][j].dot(l[i][j]))
print(k)
Assuming your arrays of matrices are given as (n+2)-dimensional arrays A and B, what you want to achieve is as simple as C = A@B
Example
outer_dims = 2,3,4
inner_dims = 4,5,6
A = np.random.randint(0,10,(*outer_dims, *inner_dims[:2]))
B = np.random.randint(0,10,(*outer_dims, *inner_dims[1:]))
C = A@B
# check
for I in np.ndindex(outer_dims):
    assert (C[I] == A[I]@B[I]).all()
UPDATE: Py2 version; thanks @hpaulj, Divakar
A = np.random.randint(0,10, outer_dims + inner_dims[:2])
B = np.random.randint(0,10, outer_dims + inner_dims[1:])
C = np.matmul(A,B)
# check
for I in np.ndindex(outer_dims):
    assert (C[I] == np.matmul(A[I],B[I])).all()
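Applied to the arrays from the question, the same idea might look like the sketch below. It assumes the goal is the 2x2 array of scalars that the loop produces. The names `k_loop`, `k_matmul` and `k_einsum` are introduced here for comparison, and `.item()` is used in place of `np.asscalar`, which newer NumPy releases no longer provide.

```python
import numpy as np

n = np.arange(8).reshape(2, 2, 1, 2)     # 2x2 grid of 1x2 matrices
l = np.arange(1, 9).reshape(2, 2, 2, 1)  # 2x2 grid of 2x1 matrices

# Reference: the explicit loop from the question
k_loop = np.zeros((2, 2))
for i in range(n.shape[0]):
    for j in range(n.shape[1]):
        k_loop[i, j] = n[i, j].dot(l[i, j]).item()

# matmul multiplies the trailing two axes and broadcasts over the leading ones;
# each product is a 1x1 matrix, so [..., 0, 0] extracts the scalar
k_matmul = np.matmul(n, l)[..., 0, 0]

# The same contraction spelled out with einsum
k_einsum = np.einsum("ijab,ijbc->ijac", n, l)[..., 0, 0]

assert (k_matmul == k_loop).all()
assert (k_einsum == k_loop).all()
print(k_matmul)
```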
If I understand correctly, this might work:
import numpy as np
a = np.array([[1,1],[1,0]])
b = np.array([[3,4],[5,4]])
x = np.array([[a,b],[b,a]])
y = np.array([[a,a],[b,b]])
result = np.array([_x @ _y for _x, _y in zip(x,y)])
import numpy as np
I have two arrays of size n (to simplify, I use in this example n = 2):
A = array([[1,2,3],[1,2,3]])
B is two-dimensional and holds n random integers, one per row, each equal to 1, 2 or 3.
Let's pretend:
B = array([[1],[3]])
What is the most pythonic way to subtract B from A in order to obtain C, C = array([[2,3],[1,2]]) ?
I tried to use np.subtract but due to the broadcasting rules I do not obtain C. I do not want to use mask or indices but element's values. I also tried to use np.delete, np.where without success.
Thank you.
This might work and should be quite Pythonic:
dd=[[val for val in A[i] if val not in B[i]] for i in xrange(len(A))]
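For reference, here is a sketch of the same comprehension written for Python 3 (where `xrange` no longer exists), followed by a possible NumPy alternative. The NumPy version compares values rather than positions, but it does build a boolean array internally, and it assumes each row of A contains exactly one element equal to the value in the matching row of B. Otherwise the reshape would fail or give misaligned rows.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3]])
B = np.array([[1], [3]])

# Python 3 form of the list comprehension: pair each row of A with its row of B
dd = [[val for val in row if val not in drop] for row, drop in zip(A, B)]

# NumPy alternative: compare by value (B broadcasts across each row of A),
# keep the entries that differ, then restore the row structure.
# Assumption: exactly one element is dropped per row.
keep = A != B
C = A[keep].reshape(len(A), -1)

print(C)
```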
I have to evaluate the following expression, given two quite large matrices A,B and a very complicated function F:
The mathematical expression
I was thinking if there is an efficient way in order to first find those indices i,j that will give a non-zero element after the multiplication of the matrices, so that I avoid the quite slow 'for loops'.
Current working code
# Starting with 4 random matrices
A = np.random.randint(0,2,size=(50,50))
B = np.random.randint(0,2,size=(50,50))
C = np.random.randint(0,2,size=(50,50))
D = np.random.randint(0,2,size=(50,50))
indices = []
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        if A[i,j] != 0:
            for k in range(B.shape[1]):
                if B[j,k] != 0:
                    for l in range(C.shape[1]):
                        if A[i,j]*B[j,k]*C[k,l]*D[l,i]!=0:
                            indices.append((i,j,k,l))
print(indices)
As you can see, in order to get the indices I need I have to use nested loops (= huge computational time).
My guess would be NO: you cannot avoid the for-loops. In order to find all the indices i,j you need to loop through all the elements, which defeats the purpose of this check. Therefore, you should go ahead and use simple elementwise array multiplication and the dot product in numpy - it should be quite fast, with the for loops taken care of by numpy.
However, if you plan on using a Python loop then the answer is YES, you can avoid them by using numpy, using the following pseudo-code (=hand-waving):
i, j = np.indices((N, M)) # CAREFUL: you may need to swap i<->j or N<->M
fs = F(i, j, z) # array of values of function F
# for a given z over the index grid
R = np.dot(A*fs, B) # summation over j
# return R # if necessary do a summation over i: np.sum(R, axis=...)
If the issue is that computing fs = F(i, j, z) is a very slow operation, then you will have to identify the elements of A that are zero using two loops built into numpy (so they are quite fast):
good = np.nonzero(A) # hidden double loop (for 2D data)
fs = np.zeros_like(A, dtype=float) # float storage so values of F are not truncated if A is an integer array
fs[good] = F(i[good], j[good], z) # compute F only where A != 0
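For the index search in the question's working code, one possible loop-free sketch is to build the full four-index product by broadcasting and ask NumPy where it is non-zero. This is not taken from the answer above. It trades memory for speed, since the intermediate array has one element per (i, j, k, l) combination. The size `n`, the generator `rng` and the names `product` and `looped` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # kept small here; the 4-D intermediate holds n**4 elements
A, B, C, D = (rng.integers(0, 2, size=(n, n)) for _ in range(4))

# Reference: explicit loops over every (i, j, k, l)
looped = [(i, j, k, l)
          for i in range(n) for j in range(n)
          for k in range(n) for l in range(n)
          if A[i, j] * B[j, k] * C[k, l] * D[l, i] != 0]

# Broadcast version: the axes of `product` are (i, j, k, l).
# D is indexed as D[l, i], so its transpose is placed on axes (i, l).
product = (A[:, :, None, None] * B[None, :, :, None]
           * C[None, None, :, :] * D.T[:, None, None, :])
indices = np.argwhere(product != 0)  # one row (i, j, k, l) per non-zero term

assert [tuple(row) for row in indices.tolist()] == looped
print(len(indices))
```

For the 50x50 matrices in the question the intermediate would hold 50**4 = 6,250,000 elements, which is usually manageable, but the approach stops being practical for much larger matrices.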
I have the following code
l = len(time) #time is a 300 element list
ll = len(sample) #sample has 3 sublists each with 300 elements
w, h = ll, l
Matrix = [[0 for x in range(w)] for y in range(h)]
for n in range(0,l):
    for m in range(0,ll):
        x=sample[m]
        Matrix[m][n]= x
When I run the code to fill the matrix I get an error message saying "list index out of range". I put in a print statement to see where the error happens, and when m=0 and n=3 the matrix goes out of index.
From what I understand, on the fourth line of the code I initialize a 3x300 matrix, so why does it go out of index at 0x3?
You need to change Matrix[m][n]= x to Matrix[n][m]= x
The indexing of nested lists happens from the outside in. So for your code, you'll probably want:
Matrix[n][m] = x
If you prefer the other order, you can build the matrix differently (swap w and h in the list comprehensions).
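A minimal sketch of the corrected fill, using made-up stand-in data for `time` and `sample`. It also assumes each cell should hold the single value `sample[m][n]` rather than the whole sublist `sample[m]` that the question's code assigns. The NumPy line at the end shows one way to get the same 300x3 layout without loops.

```python
import numpy as np

# Hypothetical stand-in data with the shapes described in the question
time = list(range(300))
sample = [[m * 1000 + n for n in range(300)] for m in range(3)]

h, w = len(time), len(sample)  # 300 rows, 3 columns
Matrix = [[0 for _ in range(w)] for _ in range(h)]

for n in range(h):
    for m in range(w):
        # The outer list (row, 0..299) is indexed first, then the inner list (column, 0..2)
        Matrix[n][m] = sample[m][n]

# Same layout with NumPy: stack the three sublists and transpose
as_array = np.array(sample).T

print(len(Matrix), len(Matrix[0]))  # 300 3
print(as_array.shape)               # (300, 3)
```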
Note that if you're going to be doing mathematical operations with this matrix, you may want to be using numpy arrays instead of Python lists. They're almost certainly going to be much more efficient at doing math operations than anything you can write yourself in pure Python.
Note that indexing in nested lists in Python happens from outside in, and so you'll have to change the order in which you index into your array, as follows:
Matrix[n][m] = x
For mathematical operations and matrix manipulations, using numpy two-dimensional arrays is almost always a better choice. You can read more about them here.