numpy, different slices for different layers of array - python

I have a 3D array of shape m*n*k, and for every 2D layer I want to take a subarray of size i*j. I have an array c of shape 2*k holding the start coordinates of the slice for each layer. Is there a nice and easy way to get what I need without any loops?
Example:
test = np.arange(18).reshape((3,3,2))
c = np.array([[0,1], [0, 1]])
test[:,:,0] = array([[ 0, 2, 4],
[ 6, 8, 10],
[12, 14, 16]])
test[:,:,1] = array([[ 1, 3, 5],
[ 7, 9, 11],
[13, 15, 17]])
I want to get an array
[[[ 0, 9],
[ 2, 11]],
[[ 6, 15],
[ 8, 17]]]
Solution with loop:
h=2
w=2
layers = 2
F = np.zeros((h,w,layers))
for k in range(layers):
    F[:,:,k] = test[c[0,k]:c[0,k]+h, c[1,k]:c[1,k]+w, k]

Here's a vectorized approach making use of broadcasting and advanced-indexing -
d0,d1,d2 = np.ogrid[:h,:w,:layers]
out = test[d0+c[0],d1+c[1],d2]
Sample run -
In [112]: test = np.arange(200).reshape((10,10,2))
     ...: c = np.array([[0,1], [0, 1]])
     ...:
In [113]: h=4
     ...: w=5
     ...: layers = 2
     ...: F = np.zeros((h,w,layers))
     ...: for k in range(layers):
     ...:     F[:,:,k] = test[c[0,k]:c[0,k]+h, c[1,k]:c[1,k]+w, k]
     ...:
In [114]: d0,d1,d2 = np.ogrid[:h,:w,:layers]
     ...: out = test[d0+c[0],d1+c[1],d2]
     ...:
In [115]: np.allclose(F, out)
Out[115]: True
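To see why the indexing above works, it can help to look at the shapes of the index arrays. Here is a minimal sketch on the question's 3x3x2 example; the names rows and cols are only for illustration and are not part of the answer.

```python
import numpy as np

test = np.arange(18).reshape((3, 3, 2))
c = np.array([[0, 1], [0, 1]])   # row starts in c[0], column starts in c[1]
h, w, layers = 2, 2, 2

# Open grids of shape (h, 1, 1), (1, w, 1) and (1, 1, layers)
d0, d1, d2 = np.ogrid[:h, :w, :layers]

rows = d0 + c[0]   # shape (h, 1, layers): row index per window row and layer
cols = d1 + c[1]   # shape (1, w, layers): column index per window column and layer

# The three index arrays broadcast together to (h, w, layers)
out = test[rows, cols, d2]
print(rows.shape, cols.shape, out.shape)
print(out)
```

Each element out[i, j, k] is read from test[i + c[0, k], j + c[1, k], k], which is what the loop does one layer at a time.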

Related

Function over each value in Python Array (without using def)

The input array x has shape (1 x 3) and the output array is 3 x 3 (number of input columns x number of input columns). The diagonal of the output holds the squared values. Where row != column, the value is x(row)+x(col). The input is currently 1 x 3, but the code should handle other input sizes. I cannot use 'def'. The current code does not work; what would you recommend?
x = np.array([[0, 5, 10]])
output array formulas =
[[i^2, x(row)+x(col), x(row)+x(col)]
[x(row)+x(col), i^2, x(row)+x(col)]
[x(row)+x(col), x(row)+x(col), i^2]]
# where row and column refer to the output matrix row, column. For example, the value in (1,2) is x(1)+x(2)= 5
ideal output =
[[0 5 10]
[5 25 15]
[10 15 100]]
Code Attempted:
x = np.array([[0, 5, 10]])
r, c = np.shape(x)
results = np.zeros((c, c))
g[range(c), range(c)] = x**2
for i in x:
    for j in i:
        results[i,j] = x[i]+x[j]
Learn to use numpy methods and broadcasting:
>>> x
array([[ 0, 5, 10]])
>>> x.T
array([[ 0],
[ 5],
[10]])
>>> x.T + x
array([[ 0, 5, 10],
[ 5, 10, 15],
[10, 15, 20]])
>>> result = x.T + x
>>> result
array([[ 0, 5, 10],
[ 5, 10, 15],
[10, 15, 20]])
Then this handy built-in:
>>> np.fill_diagonal(result, x**2)
>>> result
array([[ 0, 5, 10],
[ 5, 25, 15],
[ 10, 15, 100]])
This can replace the results[range(c), range(c)] = x**2 assignment.
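If you would rather not modify the array in place, the same result can probably be built in one expression. This is only a sketch, assuming x always has shape (1, n); np.where and np.eye are swapped in here instead of fill_diagonal, and the names sums and on_diag are made up for the example.

```python
import numpy as np

x = np.array([[0, 5, 10]])
n = x.shape[1]

sums = x.T + x                    # (n, n): sums[i, j] = x[i] + x[j]
on_diag = np.eye(n, dtype=bool)   # True only where row == column

# x**2 has shape (1, n) and broadcasts so that entry [i, j] is x[j]**2,
# which equals x[i]**2 on the diagonal
result = np.where(on_diag, x**2, sums)
print(result)
```

Nothing here depends on n being 3, so it should carry over to other 1 x n inputs without a def.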
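Try this:
x = x + x.T  # broadcasting (1, n) + (n, 1) gives the (n, n) matrix of pairwise sums
x[np.arange(len(x)), np.arange(len(x))] = (np.diag(x)/2)**2  # the diagonal holds 2*x[i], so halve it before squaring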

Apply np.vectorize along one axis

Say I have two arrays arr1 and arr2:
arr1 = [0, 1, 2]
arr2 = [
[0, 1, 2],
[3, 4, 5],
[6, 7, 8],
]
And say I have a function that does something to the elements of this array:
def func(arr):
    new_arr = arr.copy()
    new_arr[0] = new_arr[0] * 2
    new_arr[1] = new_arr[1] * 10
    new_arr[2] = new_arr[2] * 100
    return new_arr
Now I want to vectorize this, so that it works for both arr1 and arr2:
func(arr1)
# returns [0, 10, 200]
func(arr2)
# returns
# [0, 10, 200],
# [6, 40, 500],
# [12, 70, 800],
np.vectorize doesn't work because it breaks down each and every element in my array parameter. I want it to apply the function only along the first axis.
np.apply_along_axis almost works, except it won't consider 1-D array parameter to be a single parameter.
What's the best way to do this?
You can just directly multiply the arrays. It works thanks to numpy broadcasting:
factor = np.array([2, 10, 100])
arr1 * factor
array([ 0, 10, 200])
arr2 * factor
array([[ 0, 10, 200],
[ 6, 40, 500],
[ 12, 70, 800]])
If you take time to read the np.vectorize docs, you'll eventually encounter the signature option:
In [27]: f= np.vectorize(func, signature='(n)->(n)')
In [28]: f(arr1)
Out[28]: array([ 0, 10, 200])
In [29]: f(arr2)
Out[29]:
array([[ 0, 10, 200],
[ 6, 40, 500],
[ 12, 70, 800]])
And reading a bit further you'll encounter the caveats about performance.
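For reference, here is a self-contained version of the signature approach that can be pasted into a script. It assumes the inputs are NumPy arrays rather than plain lists, and it compares the result against plain broadcasting as a sanity check.

```python
import numpy as np

def func(arr):
    # Operates on one 1-D array of length 3
    new_arr = arr.copy()
    new_arr[0] = new_arr[0] * 2
    new_arr[1] = new_arr[1] * 10
    new_arr[2] = new_arr[2] * 100
    return new_arr

# '(n)->(n)' tells np.vectorize that func maps a 1-D array to a 1-D array,
# so it loops over the leading axes and passes whole rows to func
f = np.vectorize(func, signature='(n)->(n)')

arr1 = np.array([0, 1, 2])
arr2 = np.arange(9).reshape(3, 3)

out1 = f(arr1)
out2 = f(arr2)
print(out1)
print(out2)

# Same numbers as the broadcasting answer
factor = np.array([2, 10, 100])
print(np.array_equal(out2, arr2 * factor))
```

np.vectorize is still a Python-level loop under the hood, so for a function this simple the direct multiplication is likely the better choice.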
Just do this:
import numpy as np
a = np.array([0, 1, 2])
b = np.array([
[0, 1, 2],
[3, 4, 5],
[6, 7, 8],
])
c = np.array([2, 10, 100])
print(a*c)
print(b*c)
Output:
[ 0 10 200]
[[ 0 10 200]
[ 6 40 500]
[ 12 70 800]]

Indexing numpy 2D array that wraps around

How do you index a numpy array so that the indices wrap around when they go out of bounds?
For example, I have a 3x5 array:
import numpy as np
matrix = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])
##
[[ 1 2 3 4 5]
[ 6 7 8 9 10]
[11 12 13 14 15]]
Say I would like to index the values around index (2,4) where value 15 is located. I would like to get back the array with values:
[[9, 10, 6]
[14, 15, 11]
[4, 5, 1]]
Basically, all the values around 15 are returned, assuming the array wraps around.
A fairly standard idiom to find the neighboring elements in a numpy array is arr[x-1:x+2, y-1:y+2]. However, since you want to wrap, you can pad your array using wrap mode, and offset your x and y coordinates to account for this padding.
This answer assumes that you want the neighbors of the first occurrence of your desired element.
First, find the indices of your element, and offset to account for padding:
x, y = np.unravel_index((matrix==15).argmax(), matrix.shape)
x += 1; y += 1
Now pad, and index your array to get your neighbors:
t = np.pad(matrix, 1, mode='wrap')
out = t[x-1:x+2, y-1:y+2]
array([[ 9, 10, 6],
[14, 15, 11],
[ 4, 5, 1]])
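Put together as one runnable snippet, the padding approach might look like this. It is a sketch that assumes you already know the position (i, j) of the element instead of searching for it; the names padded, i and j are illustrative.

```python
import numpy as np

matrix = np.arange(1, 16).reshape(3, 5)
i, j = 2, 4                      # position of the value 15

# One ring of wrapped values on every side: shape goes from (3, 5) to (5, 7)
padded = np.pad(matrix, 1, mode='wrap')

# Element (i, j) now sits at (i + 1, j + 1), so its 3x3 neighborhood
# spans rows i..i+2 and columns j..j+2 of the padded array
out = padded[i:i + 3, j:j + 3]
print(out)
```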
Here's how you can do it without padding. This generalizes easily to more than one neighbor in each direction and avoids the overhead of padding the array.
def get_wrapped(matrix, i, j):
    m, n = matrix.shape
    rows = [(i-1) % m, i, (i+1) % m]
    cols = [(j-1) % n, j, (j+1) % n]
    return matrix[rows][:, cols]
res = get_wrapped(matrix, 2, 4)
Let me explain what's happening in return matrix[rows][:, cols]. This is really two operations.
The first is matrix[rows] which is short hand for matrix[rows, :] which means give me the selected rows, and all columns for those rows.
Then next we do [:, cols] which means give me all the rows and the selected cols.
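The two indexing steps can also be combined into a single one with np.ix_, and the window size can be made a parameter. The following is a sketch rather than part of the answer above: wrapped_window and its radius argument r are hypothetical names.

```python
import numpy as np

def wrapped_window(matrix, i, j, r=1):
    """Return the (2r+1) x (2r+1) block centered on (i, j), wrapping at the edges."""
    m, n = matrix.shape
    rows = np.arange(i - r, i + r + 1) % m   # row indices, wrapped into 0..m-1
    cols = np.arange(j - r, j + r + 1) % n   # column indices, wrapped into 0..n-1
    # np.ix_ builds an open mesh so every selected row is paired with every selected column
    return matrix[np.ix_(rows, cols)]

matrix = np.arange(1, 16).reshape(3, 5)
res = wrapped_window(matrix, 2, 4)
print(res)
```

With r=1 this should return the same block as get_wrapped(matrix, 2, 4).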
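The take function with mode='wrap' handles the wrap-around for you. Note that it returns a new array rather than modifying its input.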
>>> a = np.arange(1, 16).reshape(3,5)
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15]])
>>> b = np.take(a, [3,4,5], axis=1, mode='wrap')
array([[ 4, 5, 1],
[ 9, 10, 6],
[14, 15, 11]])
>>> np.take(b, [1,2,3], mode='wrap', axis=0)
array([[ 9, 10, 6],
[14, 15, 11],
[ 4, 5, 1]])

Cycling Slicing in Python

I came across this question while trying to apply a Caesar cipher to a matrix with different shift values for each row, i.e. given a matrix X
array([[1, 0, 8],
[5, 1, 4],
[2, 1, 1]])
with shift values of S = array([0, 1, 1]), the output needs to be
array([[1, 0, 8],
[1, 4, 5],
[1, 1, 2]])
This is easy to implement by the following code:
Y = []
for i in range(X.shape[0]):
    if (S[i] > 0):
        Y.append( X[i,S[i]::].tolist() + X[i,:S[i]:].tolist() )
    else:
        Y.append(X[i,:].tolist())
Y = np.array(Y)
This is a left-cycle-shift. I wonder how to do this in a more efficient way using numpy arrays?
Update: This example applies the shift to the columns of a matrix. Suppose that we have a 3D array
array([[[8, 1, 8],
[8, 6, 2],
[5, 3, 7]],
[[4, 1, 0],
[5, 9, 5],
[5, 1, 7]],
[[9, 8, 6],
[5, 1, 0],
[5, 5, 4]]])
Then, the cyclic right shift of S = array([0, 0, 1]) over the columns leads to
array([[[8, 1, 7],
[8, 6, 8],
[5, 3, 2]],
[[4, 1, 7],
[5, 9, 0],
[5, 1, 5]],
[[9, 8, 4],
[5, 1, 6],
[5, 5, 0]]])
Approach #1 : Use modulus to implement the cyclic pattern and get the new column indices and then simply use advanced-indexing to extract the elements, giving us a vectorized solution, like so -
def cyclic_slice(X, S):
    m,n = X.shape
    idx = np.mod(np.arange(n) + S[:,None],n)
    return X[np.arange(m)[:,None], idx]
Approach #2 : We can also leverage the power of strides for further speedup. The idea would be to concatenate the sliced off portion from the start and append it at the end, then create sliding windows of lengths same as the number of cols and finally index into the appropriate window numbers to get the same rolled over effect. The implementation would be like so -
def cyclic_slice_strided(X, S):
    X2 = np.column_stack((X,X[:,:-1]))
    s0,s1 = X2.strides
    strided = np.lib.stride_tricks.as_strided
    m,n1 = X.shape
    n2 = X2.shape[1]
    X2_3D = strided(X2, shape=(m,n2-n1+1,n1), strides=(s0,s1,s1))
    return X2_3D[np.arange(len(S)),S]
Sample run -
In [34]: X
Out[34]:
array([[1, 0, 8],
[5, 1, 4],
[2, 1, 1]])
In [35]: S
Out[35]: array([0, 1, 1])
In [36]: cyclic_slice(X, S)
Out[36]:
array([[1, 0, 8],
[1, 4, 5],
[1, 1, 2]])
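The sliding-window idea behind approach #2 can also be sketched with sliding_window_view, which builds the same kind of windows without setting strides by hand. This assumes NumPy 1.20 or newer; it is a swapped-in helper and not what the function above uses.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

X = np.array([[1, 0, 8],
              [5, 1, 4],
              [2, 1, 1]])
S = np.array([0, 1, 1])
m, n = X.shape

# Append the first n-1 columns so every cyclic rotation appears as a contiguous run
X2 = np.column_stack((X, X[:, :-1]))          # shape (m, 2n-1)

# windows[i, k] is X2[i, k:k+n], i.e. row i rotated left by k
windows = sliding_window_view(X2, n, axis=1)  # shape (m, n, n)

# Pick window S[i] for row i
out = windows[np.arange(m), S]
print(out)
```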
Runtime test -
In [75]: X = np.random.rand(10000,100)
    ...: S = np.random.randint(0,100,(10000))
# @Moses Koledoye's soln
In [76]: %%timeit
    ...: Y = []
    ...: for i, x in zip(S, X):
    ...:     Y.append(np.roll(x, -i))
10 loops, best of 3: 108 ms per loop
In [77]: %timeit cyclic_slice(X, S)
100 loops, best of 3: 14.1 ms per loop
In [78]: %timeit cyclic_slice_strided(X, S)
100 loops, best of 3: 4.3 ms per loop
Adaptation for 3D case
Adapting approach #1 for the 3D case, we would have -
shift = 'left'
axis = 1 # axis along which S is to be used (axis=1 for rows)
n = X.shape[axis]
if shift == 'left':
    Sa = S
else:
    Sa = -S
# For rows
idx = np.mod(np.arange(n)[:,None] + Sa,n)
out = X[:,idx, np.arange(len(S))]
# For columns
idx = np.mod(Sa[:,None] + np.arange(n),n)
out = X[:,np.arange(len(S))[:,None], idx]
# For axis=0
idx = np.mod(np.arange(n)[:,None] + Sa,n)
out = X[idx, np.arange(len(S))]
There could be a way to have a generic solution for a generic axis, but I will keep it to this point.
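As a check, the "For rows" variant above with a right shift appears to reproduce the 3D example from the question. A minimal sketch, using the question's array and S:

```python
import numpy as np

X = np.array([[[8, 1, 8], [8, 6, 2], [5, 3, 7]],
              [[4, 1, 0], [5, 9, 5], [5, 1, 7]],
              [[9, 8, 6], [5, 1, 0], [5, 5, 4]]])
S = np.array([0, 0, 1])

n = X.shape[1]
Sa = -S                                       # right shift, as in the else branch

# idx[i, j] is the source position along axis 1 for output position i in column j
idx = np.mod(np.arange(n)[:, None] + Sa, n)   # shape (n, len(S))

# out[a, i, j] = X[a, idx[i, j], j]
out = X[:, idx, np.arange(len(S))]
print(out)
```

Only the last column of each 2D block moves, because S is zero for the first two columns.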
You could shift each row using np.roll and use the new rows to build the output array:
Y = []
for i, x in zip(S, X):
    Y.append(np.roll(x, -i))
print(np.array(Y))
array([[1, 0, 8],
[1, 4, 5],
[1, 1, 2]])

Numpy replacing specific column index per row by using a list of indexes with nan

I am trying the following:
a = np.array([[1,2,3], [4,5,6], [7,8,9]])
print(a)
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
a[np.arange(len(a)), [1,0,2]] = 20 #--Code1
print(a)
array([[ 1, 20, 3],
[20, 5, 6],
[ 7, 8, 20]])
However, if my index list contains nan, as in:
a[np.arange(len(a)), [1,np.nan,2]] = 20 #--Code2
It errors out.
What I was trying to do is, if there is nan present in the index, don't change anything.
i.e. I wanted to implement Code2 above so that I can obtain the following:
array([[ 1, 20, 3],
[4, 5, 6],
[ 7, 8, 20]])
Use masking -
m = ~np.isnan(idx) # Mask of non-NaNs
row = np.arange(a.shape[0])[m]
col = idx[m].astype(int)
a[row, col] = 20
where, idx is the indexing array.
Sample run -
In [161]: a = np.array([[1,2,3], [4,5,6], [7,8,9]])
In [162]: idx = np.array([1,np.nan,2])
In [163]: m = ~np.isnan(idx) # Mask of non-NaNs
...: row = np.arange(a.shape[0])[m]
...: col = idx[m].astype(int)
...: a[row, col] = 20
...:
In [164]: a
Out[164]:
array([[ 1, 20, 3],
[ 4, 5, 6],
[ 7, 8, 20]])
