How to get the cumulative sum of numpy array in-place - python

I want to compute the integral image. For example:
a = np.array([(1, 2, 3), (4, 5, 6)])
b = a.cumsum(axis=0)
This generates another array, b. Can I execute the cumsum in place? If not, are there any other methods to do that?

You have to pass the argument out:
np.cumsum(a, axis=1, out=a)
Note: your array is actually a 2-D array, so you can use axis=0 to accumulate down the rows (within each column) and axis=1 to accumulate across the columns (within each row).
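For the integral image from the question, the same out= argument can be applied once per axis. This is a minimal sketch, assuming the array may be overwritten; the variable name a follows the question.

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6)])

# Accumulate down the rows, writing the result back into a.
np.cumsum(a, axis=0, out=a)
# Accumulate across the columns of the partial result, again in place.
np.cumsum(a, axis=1, out=a)

print(a)
# [[ 1  3  6]
#  [ 5 12 21]]
```

Each entry now holds the sum of everything above and to the left of it, which is the usual definition of an integral image.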

Try this using NumPy directly, with numpy.cumsum(a):
import numpy as np
a = np.array([(1, 2, 3)])
b = np.cumsum(a)
print(b)
# [1 3 6]
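One thing worth checking with this form: without an axis argument, np.cumsum works on the flattened array and returns a new array, so it is not in place. A small sketch using the 2-D array from the question (the names flat and by_rows are only for illustration):

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6)])

flat = np.cumsum(a)             # no axis: running sum over the flattened array
by_rows = np.cumsum(a, axis=1)  # one running sum per row

print(flat)
# [ 1  3  6 10 15 21]
print(by_rows)
# [[ 1  3  6]
#  [ 4  9 15]]
```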

Related

numba np.diff with axis=0

When using numba, axis=0 is an acceptable parameter for np.sum(), but not for np.diff(). Why is this happening? I'm working with 2D arrays, so I need to specify the axis.
import numpy as np
from numba import jit
@jit(nopython=True)
def jitsum(y):
    return np.sum(y, axis=0)
@jit(nopython=True)
def jitdiff(y):  # this one will cause an error
    return np.diff(y, axis=0)
Error: np_diff_impl() got an unexpected keyword argument 'axis'
A workaround in 2D will be:
@jit(nopython=True)
def jitdiff(y):
    return np.diff(y.T).T
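The transpose workaround can be checked in plain NumPy, without numba. A short sketch with a made-up 3 x 3 array:

```python
import numpy as np

y = np.array([[1, 4, 9], [2, 8, 18], [4, 16, 36]])

# np.diff defaults to the last axis, so differencing the transpose and
# transposing back is the same as differencing along axis 0.
via_transpose = np.diff(y.T).T
direct = np.diff(y, axis=0)

print(np.array_equal(via_transpose, direct))
# True
```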
np.diff on a 2D array with n=1, axis=1 is just
a[:, 1:] - a[:, :-1]
For axis=0:
a[1:, :] - a[:-1, :]
I suspect that the lines above will compile just fine with numba.
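The two slicing expressions can be wrapped in small helpers and compared against np.diff. This sketch runs in plain NumPy only; the helper names diff_axis0 and diff_axis1 are hypothetical, and whether they compile under @jit(nopython=True) is assumed here, not tested.

```python
import numpy as np

def diff_axis0(a):
    # First-order difference between consecutive rows.
    return a[1:, :] - a[:-1, :]

def diff_axis1(a):
    # First-order difference between consecutive columns.
    return a[:, 1:] - a[:, :-1]

y = np.array([[1.0, 4.0, 9.0], [2.0, 8.0, 18.0], [4.0, 16.0, 36.0]])

print(np.array_equal(diff_axis0(y), np.diff(y, axis=0)))
# True
print(np.array_equal(diff_axis1(y), np.diff(y, axis=1)))
# True
```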
import numpy as np
def axis_sums(y):  # named to avoid shadowing the built-in sum
    a = np.sum(y, axis=0)
    b = np.sum(y, axis=1)
    print("Sum along the rows (axis=0):", a)
    print("Sum along the columns (axis=1):", b)
def diff_order1(y):
    a = np.diff(y, axis=0, n=1)
    b = np.diff(y, axis=1, n=1)  # n=1 indicates 1st order difference
    print("1st order difference along the rows (axis=0):", a)
    print("1st order difference along the columns (axis=1):", b)
def diff_order2(y):
    a = np.diff(y, axis=0, n=2)
    b = np.diff(y, axis=1, n=2)  # n=2 indicates 2nd order difference
    print("2nd order difference along the rows (axis=0):", a)
    print("2nd order difference along the columns (axis=1):", b)
The next function is another way to get the 2nd order difference: it calls np.diff twice.
def diff_order2_v2(y):
    a = np.diff(np.diff(y, axis=0), axis=0)
    b = np.diff(np.diff(y, axis=1), axis=1)
    print("2nd order difference along the rows (axis=0):", a)
    print("2nd order difference along the columns (axis=1):", b)
Try running this code. I wrote functions for the sum and for the 1st and 2nd order differences.
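To confirm that n=2 and two chained calls agree, a quick check in plain NumPy could look like this (the sample array is made up for illustration):

```python
import numpy as np

y = np.array([[1, 4, 9, 16], [1, 8, 27, 64], [2, 4, 8, 16], [0, 5, 7, 3]])

for axis in (0, 1):
    once = np.diff(y, n=2, axis=axis)
    twice = np.diff(np.diff(y, axis=axis), axis=axis)
    # Both forms give the same 2nd order difference along this axis.
    print(axis, np.array_equal(once, twice))
# 0 True
# 1 True
```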

Array mean iteration

My question pertains to array iteration but is a bit more complicated. I have an array with a shape of (4, 50), and I want to find the mean of each of its rows. Here is a simple example of what I mean:
A = np.array([[10,5,3],[12,6,6],[9,8,7],[20,3,4]])
When this code is run, you get an array with a shape of (4, 3). What I want is for the mean of each row to be found and returned.
The result should be the array [[6],[8],[8],[9]], with the same number of rows and a single column.
Please explain the code and thought process behind it. Thank you very much.
Use the numpy.mean function. The parameter axis=1 means that the mean is taken across the columns, giving one value per row. The parameter keepdims=True keeps the reduced axis as an axis of length 1, so the result has shape (4, 1) instead of (4,).
import numpy as np
A = np.array([[10,5,3],[12,6,6],[9,8,7],[20,3,4]])
B = np.mean(A, axis=1, keepdims=True)
print(B)
# Output:
# [[6.]
#  [8.]
#  [8.]
#  [9.]]
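To see what keepdims changes, it can help to compare the two result shapes. The sketch below also shows one reason the (4, 1) shape is handy: it broadcasts against the original array. The variable names are only for illustration.

```python
import numpy as np

A = np.array([[10, 5, 3], [12, 6, 6], [9, 8, 7], [20, 3, 4]])

flat_means = np.mean(A, axis=1)                    # shape (4,)
row_means_col = np.mean(A, axis=1, keepdims=True)  # shape (4, 1)

print(flat_means.shape, row_means_col.shape)
# (4,) (4, 1)

# Shape (4, 1) broadcasts against (4, 3), so each row is centered on its own mean.
centered = A - row_means_col
```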
Use np.mean and list comprehension into a new array:
A = np.array([[10,5,3],[12,6,6], [9,8,7],[20,3,4]])
# Use .reshape() to get 4 rows by 1 column.
new_A = np.array([np.mean(row) for row in A]).reshape(-1, 1)
Output:
array([[6.], [8.], [8.], [9.]])

why does numpy array return wrong shape of sub arrays when indexing

An example is shown as follows:
>>> import numpy as np
>>> a=np.zeros((288,512))
>>> x1,x2,y1,y2=0,16,0,16
>>> p=a[x1:x2][y1:y2]
>>> p.shape
(16, 512)
>>> p=a[x1:x2,y1:y2]
>>> p.shape
(16, 16)
I am trying to query a patch from an array, covering rows 0 to 16 and columns 0 to 16. I index the array in two ways and get very different results. a[x1:x2][y1:y2] gives me the wrong result.
Why?
Thx for helping me!!!
When you do a[x1:x2][y1:y2], you are slicing by rows twice. That is, a[x1:x2] will give you a shape (16,512). The second slice operation in a[x1:x2][y1:y2] slices the rows of that result, and because y1:y2 is also 0:16 here, it gives you the same result.
In the second case, when you do a[x1:x2,y1:y2], you are slicing by the two dimensions of your 2-dimensional array.
Important note: If you have a 2-dimensional array and you slice like this:
a = np.zeros((10,15))
a[1:3].shape
Output:
(2, 15)
you will slice only by rows. Your resulting array will have 2 rows and the total number of columns (15 columns). If you want to slice by rows and columns, you will have to use a[1:3, 1:3].
The two methods of indexing you tried are not equivalent. In the first one (a[x1:x2][y1:y2]), you are essentially indexing the first axis twice. In the second, you are indexing the first and second axes.
a[x1:x2][y1:y2] can be rewritten as
p = a[x1:x2] # result still has two dimensions
p = p[y1:y2]
You are first indexing 0:16 in the first dimension. Then you index 0:16 in the first dimension of the result of the previous operation (which will simply return the same as a[x1:x2] because x1==y1 and x2==y2).
In the second method, you index the first and second dimensions directly. I would not write it this way, but one could write it like this to contrast it with the first method:
a[x1:x2][:, y1:y2]
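The difference is easier to see when the two ranges are not identical. A small sketch, with y1 and y2 changed from the question's values for illustration:

```python
import numpy as np

a = np.zeros((288, 512))
x1, x2, y1, y2 = 0, 16, 4, 12

chained = a[x1:x2][y1:y2]        # rows 0..15, then rows 4..11 of that result
both_axes = a[x1:x2, y1:y2]      # rows 0..15 and columns 4..11
contrast = a[x1:x2][:, y1:y2]    # same selection as both_axes, written in two steps

print(chained.shape)    # (8, 512)
print(both_axes.shape)  # (16, 8)
print(contrast.shape)   # (16, 8)
```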

Numpy array apply a function only to some elements

I have a numpy array that has, for example, a shape of (10,10).
Now I want to apply np.exp() to this array, but only to the elements that satisfy a condition. For example, I want to apply np.exp to all the elements that are not 0 or 1. Is there a way to do that without using a for loop that iterates over each element of the array?
This is achievable with basic numpy operations. Here is a way to do that :
A = np.random.randint(0,5,size=(10,10)).astype(float) # data
goods = (A!=0) & (A!=1) # 10 x 10 boolean array
A[goods] = np.exp(A[goods]) # boolean indexing
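As an alternative sketch, NumPy ufuncs such as np.exp accept where= and out= arguments, so the masked update can also be written as a single call. The seeded generator and the names mask, original and expected are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 5, size=(10, 10)).astype(float)
original = A.copy()
mask = (A != 0) & (A != 1)

# Reference result using boolean indexing, as in the answer above.
expected = original.copy()
expected[mask] = np.exp(expected[mask])

# Same update in one call: exp is written into A only where mask is True;
# the other entries of A are left untouched.
np.exp(A, out=A, where=mask)

print(np.allclose(A, expected))
# True
```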

How to index numpy array on subset of array of bools that is smaller than numpy array's dimensions?

My question is inspired by another one: "Intersection of 2d and 1d Numpy array". I am looking for a succinct solution that does not use in1d.
The setup is this. I have a numpy array of bools telling me which values of numpy array A I should set equal to 0, called listed_array. However, I want to ignore the information in the first 3 columns of listed_array and only set A to zero as indicated in the other columns of listed_array.
I know the following is incorrect:
A[listed_array[:, 3:]] = 0
I also know I can pad this subset of listed_array with a call to hstack, and this will yield correct output, but is there something more succinct?
If I understand the question, this should do it:
A[:, 3:][listed_array[:, 3:]] = 0
which is a concise version of
mask3 = listed_array[:, 3:]
A3 = A[:, 3:] # This slice is a *view* of A, so changing A3 changes A.
A3[mask3] = 0
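A small worked example of the one-liner, with made-up data, to show that the assignment reaches A through the view and that flags in the first three columns are ignored:

```python
import numpy as np

A = np.arange(1, 21).reshape(4, 5)
listed_array = np.zeros((4, 5), dtype=bool)
listed_array[0, 0] = True  # in the first three columns, so it should be ignored
listed_array[1, 3] = True
listed_array[2, 4] = True

# A[:, 3:] is a view of A, so assigning through it modifies A itself.
A[:, 3:][listed_array[:, 3:]] = 0

print(A)
# [[ 1  2  3  4  5]
#  [ 6  7  8  0 10]
#  [11 12 13 14  0]
#  [16 17 18 19 20]]
```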
