Update 2D numpy array values - python

Is there a more efficient way to update the values of a multidimensional numpy array?
For example, I have a loop
for i in np.arange(5):
    for j in np.arange(5):
        if (i + j) % 2 == 0:
            v[i,j] = v[i,j] + v[i, j + 1]
I was thinking of parallelizing this process later (with multiprocessing and Pool), but I can't imagine how. Maybe by defining a function and using map, but this is a 2D array and the operations depend on the element indices.

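Basically, you are updating only the cells where i + j is even, i.e. a checkerboard pattern: the even columns of the even rows and the odd columns of the odd rows.
You can do this in two lines using slice indexing: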
v[0:5:2,0:5:2] += v[0:5:2,1:6:2] # even rows
v[1:5:2,1:5:2] += v[1:5:2,2:6:2] # odd rows
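To check that the two slice assignments really match the loop, a small comparison can help. This is only a sketch: the helper names `update_loop` and `update_slices` and the 5x6 test array are assumptions made for illustration (the loop reads `v[i, j + 1]`, so `v` needs at least six columns).

```python
import numpy as np

def update_loop(v):
    # Reference version: the double loop from the question, on a copy.
    out = v.copy()
    for i in range(5):
        for j in range(5):
            if (i + j) % 2 == 0:
                out[i, j] = out[i, j] + out[i, j + 1]
    return out

def update_slices(v):
    # Vectorized version: the two strided slice assignments, on a copy.
    out = v.copy()
    out[0:5:2, 0:5:2] += out[0:5:2, 1:6:2]  # even rows, even columns
    out[1:5:2, 1:5:2] += out[1:5:2, 2:6:2]  # odd rows, odd columns
    return out

v = np.arange(30).reshape(5, 6)  # illustrative data only
assert np.array_equal(update_loop(v), update_slices(v))
```

The two versions agree because every cell that is read (`v[i, j + 1]`) has odd `i + j` and is therefore never written, so the order of updates does not matter.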

Related

How to make efficient moving index in numpy array

I want to know how to code an efficient index over a numpy array. For the moment, I go over the array elements using repeated additions. For example, I have to make some loops over an array A like this:
import numpy as np
A = np.arange(0,100)
M = 10
for i in range(A.shape[0]-M):
    B = []
    for j in range(M):
        value = A[i+j]
        B.append(value)
Is there a way for me to get the values without repeatedly doing the i+j addition?
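No answer to this question is included here. One possible approach, sketched under the assumption that each `B` is simply the length-`M` window of `A` starting at `i`, is to take the window as a slice, or to build all windows at once with broadcast index arithmetic. The names `n_windows`, `starts`, `offsets` and `windows` are illustrative, not from the question.

```python
import numpy as np

A = np.arange(0, 100)
M = 10
n_windows = A.shape[0] - M  # same number of iterations as the original loop

# Option 1: keep the outer loop, but take each window as a single slice.
windows_sliced = [A[i:i + M] for i in range(n_windows)]

# Option 2: build every index in one go. A column of start positions is
# broadcast against a row of offsets, giving an (n_windows, M) index array.
starts = np.arange(n_windows)[:, None]
offsets = np.arange(M)[None, :]
windows = A[starts + offsets]  # windows[i, j] == A[i + j]
```

Option 2 still performs the `i + j` additions, but they run inside numpy rather than in the Python interpreter.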

Speed up numpy array assignment via list comprehension or mapping?

I'm trying to add the values of a small 2D numpy array ("source") into a larger 2D numpy array ("frame"), starting at a specific position in the frame-array ("pos_x" , "pos_y"). Right now, I have two for-loops adding the source-value to the frame-value at each position:
for i in range(x):
    for j in range(y):
        frame[pos_x+i][pos_y+j] += source[i][j]
("x" and "y" being the source-arrays' shape)
However, the arrays are quite large (the frame array shape: 5000x8000, and the source array shape: 1000x5000). So this process takes quite long (ca. 15 seconds).
Is there any way to speed up this process, either through list comprehension, or mapping, or anything else?
I've tried list comprehension like this with multiple statements and assignments:
frame = [[frame[pos_x+i][pos_y+j] + source[i][j] for j in range(y)] for i in range(x)]
(adapted from the Threads: How can I do assignments in a list comprehension? and Multiple statements in list compherensions in Python?)
but it takes just as long as the original for-loops.
Another idea was to only allow the loop for non-zero values with if source[i][j] != 0. But when I tried that, it took over three times as long (potential sub-question: any idea why?).
You can take advantage of numpy vectorization here instead of looping for a huge speedup. It's just a matter of calculating the indices and checking if you're out of bounds.
# NOTE: this treats pos_x as a column offset and pos_y as a row offset;
# swap them if, as in the question's loop, pos_x indexes rows.
height, width = source.shape
pos_x2 = pos_x + width
pos_y2 = pos_y + height
# check for possible index out of range
fheight, fwidth = frame.shape
if pos_x2 > fwidth or pos_y2 > fheight:
    print('source out of frame bounds')
else:
    # add `source` to our slice of `frame`
    frame[pos_y:pos_y2, pos_x:pos_x2] += source
While this technically has the same time complexity, numpy runs efficient compiled C code that can take advantage of things like hardware vectorization and avoids the overhead of the Python interpreter.
Slices are writable views in numpy, so you can do things like:
import numpy as np
A = np.zeros((10,10), int)
B = np.zeros((5,5), int) + 5
A[2:7,2:7] += B
print(A)
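Putting the slice idea together with the question's own index convention (where `pos_x` offsets the first axis and `pos_y` the second), a minimal sketch might look like the following. The function name `add_source` and the small test arrays are assumptions for illustration; raising `ValueError` on an out-of-bounds position is one possible choice, not something the answers prescribe.

```python
import numpy as np

def add_source(frame, source, pos_x, pos_y):
    """Add `source` into `frame` in place, with (pos_x, pos_y) as (row, column) offsets."""
    x, y = source.shape
    rows, cols = frame.shape
    if pos_x < 0 or pos_y < 0 or pos_x + x > rows or pos_y + y > cols:
        raise ValueError("source does not fit inside frame at this position")
    # One vectorized update replaces the two nested Python loops.
    frame[pos_x:pos_x + x, pos_y:pos_y + y] += source

# Small illustrative arrays; compare against the question's double loop.
frame = np.zeros((6, 8), dtype=int)
source = np.arange(6).reshape(2, 3)

expected = frame.copy()
for i in range(source.shape[0]):
    for j in range(source.shape[1]):
        expected[1 + i][4 + j] += source[i][j]

add_source(frame, source, pos_x=1, pos_y=4)
assert np.array_equal(frame, expected)
```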

Combination of matrix elements giving non-zero value (PYTHON)

I have to evaluate the following expression, given two quite large matrices A,B and a very complicated function F:
The mathematical expression
I was wondering whether there is an efficient way to first find those indices i,j that will give a non-zero element after the multiplication of the matrices, so that I avoid the quite slow 'for loops'.
Current working code
import numpy as np

# Starting with 4 random matrices
A = np.random.randint(0,2,size=(50,50))
B = np.random.randint(0,2,size=(50,50))
C = np.random.randint(0,2,size=(50,50))
D = np.random.randint(0,2,size=(50,50))
indices = []
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        if A[i,j] != 0:
            for k in range(B.shape[1]):
                if B[j,k] != 0:
                    for l in range(C.shape[1]):
                        if A[i,j]*B[j,k]*C[k,l]*D[l,i] != 0:
                            indices.append((i,j,k,l))
print(indices)
As you can see, in order to get the indices I need I have to use nested loops (= huge computational time).
My guess would be NO: you cannot avoid the for-loops. To find all the indices i, j you need to loop through all the elements, which defeats the purpose of this check. Therefore, you should go ahead and use simple elementwise array multiplication and the dot product in numpy; it should be quite fast, with the loops taken care of by numpy.
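As one illustration of letting numpy run the loops, the index search from the question can be written with broadcasting. This is a sketch and not part of the original answer: it builds a boolean array with n**4 entries, so it assumes that array fits in memory, and the small `n` and the seeded `RandomState` are choices made only for the example.

```python
import numpy as np

rs = np.random.RandomState(0)  # seeded only to make the example repeatable
n = 6                          # small on purpose: the mask has n**4 entries
A = rs.randint(0, 2, size=(n, n))
B = rs.randint(0, 2, size=(n, n))
C = rs.randint(0, 2, size=(n, n))
D = rs.randint(0, 2, size=(n, n))

# mask[i, j, k, l] is True exactly where A[i,j]*B[j,k]*C[k,l]*D[l,i] != 0.
# Each factor is given length-1 axes for the indices it does not depend on.
mask = (
    (A != 0)[:, :, None, None]      # depends on i, j
    & (B != 0)[None, :, :, None]    # depends on j, k
    & (C != 0)[None, None, :, :]    # depends on k, l
    & (D.T != 0)[:, None, None, :]  # D[l, i] == D.T[i, l], depends on i, l
)

# np.argwhere returns the True positions in row-major order, which is the
# same order in which the nested loops visit (i, j, k, l).
indices = [tuple(int(x) for x in row) for row in np.argwhere(mask)]
```

For the 50x50 matrices in the question the mask would hold 6.25 million booleans, which is usually manageable; for much larger matrices the memory cost grows with the fourth power of n.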
However, if you plan on using a Python loop then the answer is YES, you can avoid them by using numpy, using the following pseudo-code (=hand-waving):
i, j = np.indices((N, M))  # CAREFUL: you may need to swap i<->j or N<->M
fs = F(i, j, z)            # array of values of function F for a given z over the index grid
R = np.dot(A*fs, B)        # summation over j
# return R  # if necessary, do a summation over i: np.sum(R, axis=...)
If the issue is that computing fs = F(i, j, z) is a very slow operation, then you will have to identify the elements of A that are zero using the two loops built into numpy (so they are quite fast):
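good = np.nonzero(A)  # hidden double loop (for 2D data)
# float dtype assumed, so values of F are not truncated when A holds integers;
# adjust the dtype to match what F returns
fs = np.zeros(A.shape, dtype=float)
fs[good] = F(i[good], j[good], z)  # compute F only where A != 0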
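A runnable version of this idea, for checking that the masked evaluation gives the same result as evaluating F everywhere, could look like the sketch below. The function `F` here is a hypothetical stand-in (the real function is not given in the question), and the matrix sizes and the value of `z` are arbitrary.

```python
import numpy as np

def F(i, j, z):
    # Hypothetical stand-in for the expensive function in the question.
    return np.exp(-z * (i + j))

rs = np.random.RandomState(0)  # seeded only to make the example repeatable
A = rs.randint(0, 2, size=(5, 4))
B = rs.randint(0, 2, size=(4, 3))
z = 0.5

i, j = np.indices(A.shape)  # i[r, c] == r and j[r, c] == c

# Dense version: evaluate F on the whole index grid.
fs_dense = F(i, j, z)
R_dense = np.dot(A * fs_dense, B)

# Masked version: evaluate F only where A is non-zero.
good = np.nonzero(A)
fs = np.zeros(A.shape, dtype=float)
fs[good] = F(i[good], j[good], z)
R = np.dot(A * fs, B)

# Entries of fs where A == 0 never contribute, because A * fs is zero there.
assert np.allclose(R, R_dense)
```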

numpy iterating over multidimensional array

I am very new to numpy and I am trying to achieve the following in the most pythonic way. So, I have two arrays:
import numpy as np
a = np.array([[0, 1, 2], [3, 4, 5]])
b = np.zeros(a.shape)
Now, what I would like is for each element in b to be one greater than the value of the corresponding element in a, i.e. b = a + 1.
I was wondering how this can be achieved in numpy.
The easiest way is the following:
b = a + 1
But if you want to iterate over the array yourself (although not recommended):
for i in range(len(a)):
    for j in range(len(a[i])):
        b[i][j] = a[i][j] + 1
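Two further variants may be worth knowing, shown here as a sketch rather than as part of the original answer: writing the result into the already allocated `b` with the `out` argument of `np.add`, and, if explicit iteration is really required, walking all index tuples with `np.ndindex` so the loop works for any number of dimensions. The array `c` is introduced only for the comparison.

```python
import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5]])
b = np.zeros(a.shape)

# Write a + 1 directly into the existing array b instead of rebinding the name.
np.add(a, 1, out=b)

# Explicit iteration over every index tuple, independent of the number of axes.
c = np.zeros(a.shape)
for idx in np.ndindex(a.shape):
    c[idx] = a[idx] + 1

assert np.array_equal(b, c)
```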

Increment Numpy multi-d array with repeated indices

I'm interested in the multi-dimensional case of Increment Numpy array with repeated indices.
I have an N-dimensional array and a set of N index arrays whose values I want to increment. The index arrays might have repeated entries.
Without repeats, the solution is
a = arange(24).reshape(2,3,4)
i = array([0,0,1])
j = array([0,1,1])
k = array([0,0,3])
a[i,j,k] += 1
With repeats (e.g. j=array([0,0,2])), I'm unable to make numpy increment the replicates.
How about this:
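import numpy as np
a = np.zeros((2,3,4))
i = np.array([0,0,1])
j = np.array([0,0,1])
k = np.array([0,0,3])
ijk = np.vstack((i,j,k)).T
# One unit-wide bin per index along each axis. Without `range`, the bin edges
# are derived from the data's min/max and counts can land in the wrong cell.
H, edges = np.histogramdd(ijk, bins=a.shape, range=[(0, n) for n in a.shape])
a += H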
I don't know if there is an easier solution with direct array indexing, but this works:
for x,y,z in zip(i,j,k):
    a[x,y,z] += 1
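Another option, not mentioned in the answers above, is the unbuffered ufunc method `np.add.at`, which applies the increment once per occurrence of an index tuple. A minimal sketch using the same index arrays (the array `buffered` is added only to show the contrast with plain fancy-index assignment):

```python
import numpy as np

i = np.array([0, 0, 1])
j = np.array([0, 0, 1])
k = np.array([0, 0, 3])

# Plain fancy indexing is buffered: the repeated index (0, 0, 0) is
# incremented only once.
buffered = np.zeros((2, 3, 4))
buffered[i, j, k] += 1

# np.add.at is unbuffered: every occurrence of an index tuple counts.
a = np.zeros((2, 3, 4))
np.add.at(a, (i, j, k), 1)

assert buffered[0, 0, 0] == 1
assert a[0, 0, 0] == 2
assert a[1, 1, 3] == 1
```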
