If I have an ndarray like this:
>>> a = np.arange(27).reshape(3,3,3)
>>> a
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
I know I can get the maximum along a certain axis using a.max(axis=...):
>>> a.max(axis=2)
array([[ 2,  5,  8],
       [11, 14, 17],
       [20, 23, 26]])
Alternatively, I could get the indices along that axis which correspond to the maximum values with:
>>> indices = a.argmax(axis=2)
>>> indices
array([[2, 2, 2],
       [2, 2, 2],
       [2, 2, 2]])
My question -- Given the array indices and the array a, is there an elegant way to reproduce the array returned by a.max(axis=2)?
This would probably work:
import itertools as it
import numpy as np

def apply_mask(field, indices):
    data = np.empty(indices.shape)
    # It seems highly likely that there is a more numpy-approved way to do this.
    idx = [range(i) for i in indices.shape]
    for idx_tup, zidx in zip(it.product(*idx), indices.flat):
        data[idx_tup] = field[idx_tup + (zidx,)]
    return data
But, it seems pretty hacky/inefficient. It also doesn't allow me to use this with any axis other than the "last" axis. Is there a numpy function (or some use of magical numpy indexing) to make this work? The naive a[:,:,a.argmax(axis=2)] doesn't work.
UPDATE:
It seems the following also works (and is a little nicer):
import numpy as np

def apply_mask(field, indices):
    data = np.empty(indices.shape)
    for idx_tup, zidx in np.ndenumerate(indices):
        data[idx_tup] = field[idx_tup + (zidx,)]
    return data
I want to do this because I want to extract the indices based on the data in one array (typically using argmax(axis=...)) and then use those indices to pull data out of a bunch of other (equivalently shaped) arrays. I'm open to alternative ways to accomplish this (e.g. using boolean masked arrays), but I like the "safety" that these "index" arrays give me: I am guaranteed to have the right number of elements to create a new array which looks like a 2d "slice" through the 3d field.
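For reference, NumPy 1.15+ ships np.take_along_axis, which gathers along an axis using exactly this kind of index array; a minimal sketch against the example above:

import numpy as np

a = np.arange(27).reshape(3, 3, 3)
indices = a.argmax(axis=2)

# take_along_axis expects the index array to have the same ndim as a,
# so add back the reduced axis, gather, then squeeze it away.
result = np.take_along_axis(a, indices[..., np.newaxis], axis=2).squeeze(axis=2)
assert (result == a.max(axis=2)).all()

# The same indices can be reused on any other array of the same shape:
b = -a
b_at_max = np.take_along_axis(b, indices[..., np.newaxis], axis=2).squeeze(axis=2)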
Here is some magic numpy indexing that will do what you want, but unfortunately it's pretty unreadable.
def apply_mask(a, indices, axis):
    magic_index = [np.arange(i) for i in indices.shape]
    magic_index = np.ix_(*magic_index)
    magic_index = magic_index[:axis] + (indices,) + magic_index[axis:]
    return a[magic_index]
or equally unreadable:
def apply_mask(a, indices, axis):
    magic_index = np.ogrid[tuple(slice(i) for i in indices.shape)]
    magic_index.insert(axis, indices)
    return a[tuple(magic_index)]
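A quick check, assuming the 3x3x3 a from the question, that either version reproduces a.max(axis=2):

a = np.arange(27).reshape(3, 3, 3)
indices = a.argmax(axis=2)
print(apply_mask(a, indices, 2))
# [[ 2  5  8]
#  [11 14 17]
#  [20 23 26]]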
I use index_at() to create the full index:
import numpy as np

def index_at(idx, shape, axis=-1):
    if axis < 0:
        axis += len(shape)
    shape = shape[:axis] + shape[axis+1:]
    index = list(np.ix_(*[np.arange(n) for n in shape]))
    index.insert(axis, idx)
    return tuple(index)

a = np.random.randint(0, 10, (3, 4, 5))
axis = 1
idx = np.argmax(a, axis=axis)
print(a[index_at(idx, a.shape, axis=axis)])
print(np.max(a, axis=axis))
I am studying image processing with NumPy and have hit a problem with filtering by convolution.
I would like to convolve a gray-scale image (i.e. convolve a 2d array with a smaller 2d array).
Does anyone have an idea how to refine my method?
I know that SciPy supports convolve2d, but I want to make a convolve2d using only NumPy.
What I have done
First, I made a 2d array of the submatrices.
a = np.arange(25).reshape(5,5)  # original matrix

submatrices = np.array([
    [a[:-2,:-2], a[:-2,1:-1], a[:-2,2:]],
    [a[1:-1,:-2], a[1:-1,1:-1], a[1:-1,2:]],
    [a[2:,:-2], a[2:,1:-1], a[2:,2:]]])
The submatrices seem complicated, but what I am doing is shown in the following drawing.
Next, I multiplied each submatrix by the filter.
conv_filter = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
multiplied_subs = np.einsum('ij,ijkl->ijkl', conv_filter, submatrices)
and summed them.
np.sum(np.sum(multiplied_subs, axis=-3), axis=-3)
# array([[ 6,  7,  8],
#        [11, 12, 13],
#        [16, 17, 18]])
Thus this procedure can be called my convolve2d.
def my_convolve2d(a, conv_filter):
    submatrices = np.array([
        [a[:-2,:-2], a[:-2,1:-1], a[:-2,2:]],
        [a[1:-1,:-2], a[1:-1,1:-1], a[1:-1,2:]],
        [a[2:,:-2], a[2:,1:-1], a[2:,2:]]])
    multiplied_subs = np.einsum('ij,ijkl->ijkl', conv_filter, submatrices)
    return np.sum(np.sum(multiplied_subs, axis=-3), axis=-3)
However, I find this my_convolve2d troublesome for three reasons:
1. Generation of the submatrices is too awkward: it is hard to read, and it only works when the filter is 3*3.
2. The size of the submatrices variable seems too big, since it is approximately 9 times bigger than the original matrix.
3. The summing seems a little unintuitive. Simply said, ugly.
Thank you for reading this far.
Kind of an update: I wrote a conv3d for myself. I will leave it here as public domain.
def convolve3d(img, kernel):
    # calc the size of the array of submatrices
    sub_shape = tuple(np.subtract(img.shape, kernel.shape) + 1)

    # alias for the function
    strd = np.lib.stride_tricks.as_strided

    # make an array of submatrices
    submatrices = strd(img, kernel.shape + sub_shape, img.strides * 2)

    # sum the submatrices and kernel
    convolved_matrix = np.einsum('hij,hijklm->klm', kernel, submatrices)

    return convolved_matrix
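As a side note (not part of the original answer): NumPy 1.20+ ships np.lib.stride_tricks.sliding_window_view, which builds the same array of submatrices without computing strides by hand; a sketch of the 2d case:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_swv(a, f):
    # windows has shape (out_h, out_w, fh, fw); the window axes come
    # last, unlike the hand-rolled as_strided layout above.
    windows = sliding_window_view(a, f.shape)
    return np.einsum('klij,ij->kl', windows, f)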
You could generate the subarrays using as_strided:
import numpy as np

a = np.array([[ 0,  1,  2,  3,  4],
              [ 5,  6,  7,  8,  9],
              [10, 11, 12, 13, 14],
              [15, 16, 17, 18, 19],
              [20, 21, 22, 23, 24]])

sub_shape = (3,3)
view_shape = tuple(np.subtract(a.shape, sub_shape) + 1) + sub_shape
strides = a.strides + a.strides

sub_matrices = np.lib.stride_tricks.as_strided(a, view_shape, strides)
To get rid of your second "ugly" sum, alter your einsum so that the output array only has k and l; this implies the second summation.
conv_filter = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
m = np.einsum('ij,ijkl->kl', conv_filter, sub_matrices)
# [[ 6  7  8]
#  [11 12 13]
#  [16 17 18]]
Cleaned up using as_strided and @Crispin's einsum trick from above. This enforces the filter size into the expanded shape, and should even allow non-square inputs if the indices are compatible.
def conv2d(a, f):
    s = f.shape + tuple(np.subtract(a.shape, f.shape) + 1)
    strd = np.lib.stride_tricks.as_strided
    subM = strd(a, shape=s, strides=a.strides * 2)
    return np.einsum('ij,ijkl->kl', f, subM)
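For example, applying it to the 5x5 a and the sharpening filter from the question (assuming np is imported):

a = np.arange(25).reshape(5, 5)
f = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
print(conv2d(a, f))
# [[ 6  7  8]
#  [11 12 13]
#  [16 17 18]]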
You can also use the FFT, one of the faster methods to perform convolutions:
from numpy.fft import fft2, ifft2
import numpy as np

def fft_convolve2d(x, y):
    """2D convolution, using FFT."""
    fr = fft2(x)
    fr2 = fft2(np.flipud(np.fliplr(y)))
    m, n = fr.shape
    cc = np.real(ifft2(fr * fr2))
    cc = np.roll(cc, -m//2 + 1, axis=0)
    cc = np.roll(cc, -n//2 + 1, axis=1)
    return cc
https://gist.github.com/thearn/5424195
Note that you must pad the filter to be the same size as the image (place it in the middle of a zeros_like matrix).
Check out all the convolution methods and their respective performances here: https://laurentperrinet.github.io/sciblog/posts/2017-09-20-the-fastest-2d-convolution-in-the-world.html
Also, I found the below code snippet to be simpler.
import numpy as np
from numpy.fft import fft2, ifft2

def np_fftconvolve(A, B):
    return np.real(ifft2(fft2(A) * fft2(B, s=A.shape)))
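A quick sanity check (my addition, not from the linked post): convolving a unit impulse with a kernel should reproduce the kernel, since fft2(B, s=A.shape) zero-pads the kernel to the image size:

A = np.zeros((5, 5))
A[0, 0] = 1.0                       # unit impulse at the origin
B = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
out = np_fftconvolve(A, B)
print(np.allclose(out[:3, :3], B))  # True: the kernel reappears at the origin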
I have x, y, v arrays of data points and I am binning v on the x-y plane. I am trying to get the x, y, v values back after binning, but I want them as arrays corresponding to each bin. My code can get them individually, but that will not work for large data sets with many bins. Maybe I need to use loops of some kind, but my understanding of loops is weak. Code:
from scipy import stats
import numpy as np

x = np.array([-10,-2,4,12,3,6,8,14,3])
y = np.array([5,5,-6,8,-20,10,2,2,8])
v = np.array([4,-6,-10,40,22,-14,20,8,-10])

ret = stats.binned_statistic_2d(x,
                                y,
                                v,
                                'count',
                                bins=2,
                                expand_binnumbers=True)
print('counts=',ret.statistic)
print('binnumber=', ret.binnumber)
binnumber = ret.binnumber
statistic = ret.statistic
# get the bin numbers according to some condition
idx_bin_x, idx_bin_y = np.where(statistic==statistic[1][1])#[0]
print('idx_binx=',idx_bin_x)
print('idx_bin_y=',idx_bin_y)
# A binnumber of i means the corresponding value is
# between (bin_edges[i-1], bin_edges[i]).
# -> increment the bin indices by one
idx_bin_x += 1
idx_bin_y += 1
print('idx_binx+1=',idx_bin_x)
print('idx_bin_y+1=',idx_bin_y)
# get the boolean mask and apply it
is_event_x = np.in1d(binnumber[0], idx_bin_x)
print('eventx=',is_event_x)
is_event_y = np.in1d(binnumber[1], idx_bin_y)
print('eventy=',is_event_y)
is_event_xy = np.logical_and(is_event_x, is_event_y)
print('event_xy=', is_event_xy)
events_x = x[is_event_xy]
events_y = y[is_event_xy]
event_v=v[is_event_xy]
print('x=', events_x)
print('y=', events_y)
print('v=',event_v)
This outputs x, y, v for the bin with count=5, but I want all 4 bins returned as separate arrays for each of x, y, v, e.g. for bin 1: x_bin1=[...], y_bin1=[...], v_bin1=[...], and so on for all 4 bins.
Also, feel free to suggest if you think there are easier ways to bin 2d planes (x,y) with values (v) like mine and getting binned values. Thank you!
Using np.array facilitates a compact way to recover the arrays you are after:
from scipy import stats
import numpy as np

# coordinates
x = np.array([-10,-2,4,12,3,6,8,14,3])
y = np.array([5,5,-6,8,-20,10,2,2,8])
v = np.array([4,-6,-10,40,22,-14,20,8,-10])

ret = stats.binned_statistic_2d(x, y, None, 'count', bins=2, expand_binnumbers=True)
b = ret.binnumber
for i in [1,2]:
    for j in [1,2]:
        m = (b[0] == i) & (b[1] == j)  # mask
        print((list(x[m]), list(y[m]), list(v[m])))
which gives for each of the four bins a tuple of 3 lists corresponding to x, y and v values:
([], [], [])
([-10, -2], [5, 5], [4, -6])
([4, 3], [-6, -20], [-10, 22])
([12, 6, 8, 14, 3], [8, 10, 2, 2, 8], [40, -14, 20, 8, -10])
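If you prefer named per-bin arrays rather than printed tuples, a small variation (my own sketch) collects them into a dictionary keyed by bin indices:

bins = {}
for i in [1, 2]:
    for j in [1, 2]:
        m = (b[0] == i) & (b[1] == j)
        bins[(i, j)] = (x[m], y[m], v[m])

x_bin, y_bin, v_bin = bins[(2, 2)]  # the bin that held 5 points above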
I have a three dimensional numpy source array and a two-dimensional numpy array of indexes.
For example:
src = np.array([[[1,2,3],[4,5,6]],
                [[7,8,9],[10,11,12]]])
idx = np.array([[0,1],
                [1,2]])
I'd like to get a 2d array, where each element represents the indexed value in the innermost dimension in that position:
array([[ 1,  5],
       [ 8, 12]])
How do I do this with numpy?
You can try np.take, here is the documentation. However, the indices then refer to positions in the flattened array. For example, you would use:
src = np.array([[[1,2,3],[4,5,6]],
                [[7,8,9],[10,11,12]]])
idx = np.array([[0,4],
                [7,11]])

# Wanted result
res = np.take(src, idx)
where src was regarded as [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
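Rather than hand-computing the flat indices, they can be derived from the question's original per-axis indices (here called idx_orig, i.e. [[0,1],[1,2]]) using np.ravel_multi_index; a sketch:

idx_orig = np.array([[0,1],[1,2]])
i, j = np.indices(idx_orig.shape)   # row/column grids, each 2x2
flat_idx = np.ravel_multi_index((i, j, idx_orig), src.shape)
res = np.take(src, flat_idx)
# array([[ 1,  5],
#        [ 8, 12]])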
You can also try np.take_along_axis, here is the documentation.
This method needs src and idx to have the same number of dimensions, so you should first expand idx with a trailing axis and then squeeze that axis out of the result.
# Add a trailing dim to idx so it matches src's ndim
idx = np.expand_dims(idx, axis=-1)
# Gather along the last axis, then squeeze the trailing dim away
res = np.take_along_axis(src, idx, axis=2).squeeze(-1)
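With the src and idx from the question this yields:

print(res)
# [[ 1  5]
#  [ 8 12]]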
You can use the np.choose method with a little transposing:
>>> np.choose(idx, src.transpose(2, 0, 1))
array([[ 1,  5],
       [ 8, 12]])
Direct indexing:
src[np.arange(2)[:, None], np.arange(2), idx]
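The hard-coded 2s generalize with np.indices for arbitrary leading shapes (a sketch):

i, j = np.indices(idx.shape)
result = src[i, j, idx]   # result[i, j] == src[i, j, idx[i, j]]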
Let's say I create a 3x3 NumPy matrix. What is the best way to apply a function to all elements in the matrix, without looping through each element if possible?
import numpy as np
import numpy.matlib

def myFunction(x):
    return (x * 2) + 3

myMatrix = np.matlib.zeros((4, 4))

# What is the best way to apply myFunction to each element in myMatrix?
EDIT: The current solutions proposed work great if the function is matrix-friendly, but what if it's a function like this that deals with scalars only?
import random

def randomize():
    x = random.randrange(0, 10)
    if x < 5:
        x = -1
    return x
Would the only way be to loop through the matrix and apply the function to each scalar inside it? I'm not looking for a specific solution (like how to randomize the matrix), but rather a general way to apply a function over the matrix. Hope this clarifies the question!
This shows two possible ways of doing maths on a whole Numpy array without using an explicit loop:
import numpy as np
# Make a simple array with unique elements
m = np.arange(12).reshape((4,3))
# Looks like:
# array([[ 0, 1, 2],
# [ 3, 4, 5],
# [ 6, 7, 8],
# [ 9, 10, 11]])
# Apply formula to all elements without loop
m = m*2 + 3
# Looks like:
# array([[ 3, 5, 7],
# [ 9, 11, 13],
# [15, 17, 19],
# [21, 23, 25]])
# Define a function
def f(x):
    return (x*2) + 3
# Apply function to all elements
f(m)
# Looks like:
# array([[ 9, 13, 17],
# [21, 25, 29],
# [33, 37, 41],
# [45, 49, 53]])
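For scalar-only functions like randomize above, a common fallback (a sketch; note it is still a Python-level loop internally, so not faster than an explicit loop) is np.vectorize:

import numpy as np
import random

def randomize(_):
    # ignores its input; returns one random draw per element
    x = random.randrange(0, 10)
    return -1 if x < 5 else x

m = np.zeros((4, 4))
print(np.vectorize(randomize)(m))   # same shape as m, one draw per element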
I'm currently converting some old Fortran code into Python and looking to use numpy-style operations as much as I can, for speed.
The code calls for finding the products of all pairs of elements from two arrays, like so:
do i=1, nx
    do j=1, ny
        si(i,j) = xarray(i) * yarray(j)
    enddo
enddo
So instead I have partially vectorized it like so:
for i, x in enumerate(xarray):
    si[i] = x * yarray
but is there a way to remove that loop over x and generate the whole "nx x ny" array in one line, which would presumably be faster?
I think you are looking for np.outer
>>> nx = np.array([1,2,3,4])
>>> ny = np.array([2,3,4,5])
>>> np.outer(nx, ny)
array([[ 2,  3,  4,  5],
       [ 4,  6,  8, 10],
       [ 6,  9, 12, 15],
       [ 8, 12, 16, 20]])
Try:
si = xarray.reshape(-1,1) * yarray
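Both forms rely on broadcasting: an (nx, 1) column times an (ny,) row yields the full (nx, ny) array of products. A quick check that this agrees with np.outer, using the sample arrays above:

xarray = np.array([1, 2, 3, 4])
yarray = np.array([2, 3, 4, 5])
si = xarray.reshape(-1, 1) * yarray
assert (si == np.outer(xarray, yarray)).all()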