Using numpy as_strided to retrieve subarrays centered on main diagonal

I have a square array x, shape (N, N), and I would like to retrieve square sub-arrays of shape (n, n) which are centered on the main diagonal of x. For example, with N = 3 & n = 2, and operating on
x = np.arange(9).reshape((3, 3))
should yield
array([[[0, 1],
[3, 4]],
[[4, 5],
[7, 8]]])
One way is to use make_windows
def make_windows(a, sub_w, sub_h):
    w, h = a.shape
    a_strided = np.lib.stride_tricks.as_strided(
        a, shape=[w - sub_w + 1, h - sub_h + 1,
                  sub_w, sub_h],
        strides=a.strides + a.strides)
    return a_strided
and do something like np.einsum('ii...->i...', make_windows(x, 2, 2)), but it would be neat to do it in one step. Is it doable with as_strided alone?

Sure:
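from numpy.lib.stride_tricks import as_strided

def diag_windows(x, n):
    if x.ndim != 2 or x.shape[0] != x.shape[1] or x.shape[0] < n:
        raise ValueError("Invalid input")
    # Stepping one row down and one column right moves along the main diagonal.
    w = as_strided(x, shape=(x.shape[0] - n + 1, n, n),
                   strides=(x.strides[0] + x.strides[1], x.strides[0], x.strides[1]))
    return w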
For example:
In [14]: x
Out[14]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [15]: diag_windows(x, 2)
Out[15]:
array([[[ 0, 1],
[ 4, 5]],
[[ 5, 6],
[ 9, 10]],
[[10, 11],
[14, 15]]])
In [16]: diag_windows(x, 3)
Out[16]:
array([[[ 0, 1, 2],
[ 4, 5, 6],
[ 8, 9, 10]],
[[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]]])
In [17]: diag_windows(x, 4)
Out[17]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]]])
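As a sanity check, the one-step view can be compared with the two-step approach from the question. The sketch below restates both functions (input validation left out for brevity) and is only an illustration; the variable names `two_step` and `one_step` are made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def make_windows(a, sub_w, sub_h):
    # All (sub_w, sub_h) windows of a, as in the question.
    w, h = a.shape
    return as_strided(
        a,
        shape=(w - sub_w + 1, h - sub_h + 1, sub_w, sub_h),
        strides=a.strides + a.strides,
    )


def diag_windows(x, n):
    # Only the windows whose top-left corner lies on the main diagonal.
    step = x.strides[0] + x.strides[1]  # one row down plus one column right
    return as_strided(
        x,
        shape=(x.shape[0] - n + 1, n, n),
        strides=(step, x.strides[0], x.strides[1]),
    )


x = np.arange(16).reshape(4, 4)
two_step = np.einsum('ii...->i...', make_windows(x, 2, 2))
one_step = diag_windows(x, 2)
print(np.array_equal(one_step, two_step))
```

Both results are views onto overlapping memory of x, so it is safest to treat them as read-only or to copy them before writing.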

Related

Get multiplication table generalized for n dimensions with numpy

Let a = np.arange(1, 4).
To get the 2 dimensional multiplication table for a, I do:
>>> a * a[:, None]
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
For 3 dimensions, I can do the following:
>>> a * a[:, None] * a[:, None, None]
array([[[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9]],
[[ 2, 4, 6],
[ 4, 8, 12],
[ 6, 12, 18]],
[[ 3, 6, 9],
[ 6, 12, 18],
[ 9, 18, 27]]])
How could I write a function that takes a numpy array a and a number of dimensions n as input and outputs the n-dimensional multiplication table for a?
This should do what you need:
import itertools
import numpy as np

a = np.arange(1, 4)
n = 3

def f(x, y):
    # Append a trailing axis to x so that it broadcasts against the 1-D y.
    return np.expand_dims(x, len(x.shape)) * y

l = list(itertools.accumulate(np.repeat(np.atleast_2d(a), n, axis=0), f))[-1]
Just change n to whatever number of dimensions you need.
First, use numpy.expand_dims() inside a generator expression to promote each copy of the array to the number of dimensions it needs, then multiply the copies together with a product function such as math.prod (Python 3.8+). The implementation looks like this:
from math import prod
import numpy as np

def n_dim_multiplication(arr, num_dims):
    # The idx-th copy gets idx extra axes, so all copies broadcast to num_dims dimensions.
    gen_arr = (np.expand_dims(arr, axis=tuple(range(1, idx + 1))) for idx in range(num_dims))
    return prod(gen_arr)
Sample run for the 3 dimensional case:
# input array
In [83]: a = np.arange(1, 4)
# desired number of dimensions
In [84]: num_dims = 3
In [85]: n_dim_multiplication(a, num_dims)
Out[85]:
array([[[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9]],
[[ 2, 4, 6],
[ 4, 8, 12],
[ 6, 12, 18]],
[[ 3, 6, 9],
[ 6, 12, 18],
[ 9, 18, 27]]])
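Another way to express the same idea, shown here only as a sketch and not taken from either answer, is to fold an outer product over n copies of the array. `np.multiply.outer` and `functools.reduce` are standard; the function name `multiplication_table` is made up for this example, and n is assumed to be at least 1.

```python
from functools import reduce

import numpy as np


def multiplication_table(a, n):
    # Each outer product appends one axis, so n copies give an n-dimensional table.
    return reduce(np.multiply.outer, [a] * n)


a = np.arange(1, 4)
table = multiplication_table(a, 3)
print(table.shape)
print(np.array_equal(table, a * a[:, None] * a[:, None, None]))
```

Because every factor is the same array, the table is symmetric in its axes, so the order in which the axes are appended does not matter.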

Numpy operation to expand array into sequential slices of given length?

my_function must expand a 1D numpy array into a 2D numpy array whose rows are the consecutive slices of the given length, starting at each index in turn until the end of the array is reached. Example:
import numpy as np
a = np.arange(10)
print (my_function(a, length=3))
Expected output
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7],
[6, 7, 8],
[7, 8, 9]])
I can achieve this using a for loop, but I was wondering if there is a numpy vectorization technique for this.
def my_function(a, length):
    b = np.zeros((len(a) - (length - 1), length))
    for i in range(len(b)):
        b[i] = a[i:i + length]
    return b
If you're careful with the math and heed the warning in the docs, you can use np.lib.stride_tricks.as_strided(). You need to calculate the correct shape and strides for your array so that the view never reads past the end of the buffer. Also note that as_strided() shares memory, so you will have multiple references to the same memory in the final output. (You can, of course, copy this to a new array.)
>>> import numpy as np
>>> def my_function(a, length):
...     stride = a.strides[0]
...     l = len(a) - length + 1
...     return np.lib.stride_tricks.as_strided(a, (l, length), (stride, stride))
...
>>> np.array(my_function(np.arange(10), 3))
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7],
[6, 7, 8],
[7, 8, 9]])
>>> np.array(my_function(np.arange(15), 7))
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 1, 2, 3, 4, 5, 6, 7],
[ 2, 3, 4, 5, 6, 7, 8],
[ 3, 4, 5, 6, 7, 8, 9],
[ 4, 5, 6, 7, 8, 9, 10],
[ 5, 6, 7, 8, 9, 10, 11],
[ 6, 7, 8, 9, 10, 11, 12],
[ 7, 8, 9, 10, 11, 12, 13],
[ 8, 9, 10, 11, 12, 13, 14]])
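On NumPy 1.20 or later, `sliding_window_view` builds the same kind of view without any manual stride arithmetic. The sketch below is an alternative to the as_strided answer, not part of it; the names `windows` and `independent` are made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(10)

# One row per window position; the result is a read-only view onto a.
windows = sliding_window_view(a, 3)
print(windows.shape)
print(np.shares_memory(a, windows))

# Copy when an independent, writable array is needed.
independent = windows.copy()
print(np.shares_memory(a, independent))
```

The view is read-only by default, which guards against accidentally writing to several overlapping windows at once.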
How about this function?
import numpy as np
def my_function(a, length):
    result = []
    for i in range(length):
        # Row i is a shifted left by i positions; the wrapped-around tail is cut off below.
        result.append(np.roll(a, -i))
    return np.vstack(result).T[:len(a) - length + 1]
a = np.arange(10)
length = 3
my_function(a, length)

Numpy: replace each element in a row by the maximum of other elements in the same row

Let's say we have a 2-D array like this:
>>> a
array([[1, 1, 2],
[0, 2, 2],
[2, 2, 0],
[0, 2, 0]])
For each row, I want to replace each element with the maximum of the two other elements in that row.
I've found how to do it for each column separately, using numpy.amax and an identity array, like this:
>>> np.amax(a*(1-np.eye(3)[0]), axis=1)
array([ 2., 2., 2., 2.])
>>> np.amax(a*(1-np.eye(3)[1]), axis=1)
array([ 2., 2., 2., 0.])
>>> np.amax(a*(1-np.eye(3)[2]), axis=1)
array([ 1., 2., 2., 2.])
But I would like to know if there is a way to avoid a for loop and get directly the result which in this case should look like this:
>>> numpy_magic(a)
array([[2, 2, 1],
[2, 2, 2],
[2, 2, 2],
[2, 0, 2]])
Edit: after a few hours of playing in the console, I finally came up with the solution I was looking for. Be ready for a mind-blowing one-liner (it relies on Python 2, where range() returns a list that can be repeated with *):
np.amax(a[[range(a.shape[0])]*a.shape[1],:][(np.eye(a.shape[1]) == 0)[:,[range(a.shape[1])*a.shape[0]]].reshape(a.shape[1],a.shape[0],a.shape[1])].reshape((a.shape[1],a.shape[0],a.shape[1]-1)),axis=2).transpose()
array([[2, 2, 1],
[2, 2, 2],
[2, 2, 2],
[2, 0, 2]])
Edit2: Paul has suggested a much more readable and faster alternative which is:
np.max(a[:, np.where(~np.identity(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)
After timing these three alternatives, both of Paul's solutions turned out to be about 4 times faster in every case I tried (2, 3 and 4 columns with 200 rows). Congratulations on these amazing pieces of code!
Last edit (sorry): after replacing np.identity with np.eye, which is faster, we now have the fastest and most concise solution:
np.max(a[:, np.where(~np.eye(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)
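To see why this one-liner works, it helps to look at the index array it builds. The sketch below splits the expression into two steps on the array from the question; the names `n_cols` and `others` are made up for this example.

```python
import numpy as np

a = np.array([[1, 1, 2],
              [0, 2, 2],
              [2, 2, 0],
              [0, 2, 0]])
n_cols = a.shape[1]

# Row j of `others` lists every column index except j.
others = np.where(~np.eye(n_cols, dtype=bool))[1].reshape(n_cols, -1)
print(others)

# a[:, others] has shape (rows, n_cols, n_cols - 1): for every row and every
# column j, the values of all the other columns. Reduce over the last axis.
result = np.max(a[:, others], axis=-1)
print(result)
```

Any other reduction that accepts an axis argument (np.sum, np.min, ...) can be swapped in for np.max, at the cost of materialising the intermediate (rows, n_cols, n_cols - 1) array.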
Here are two solutions: one designed specifically for max, and a more general one that also works for other operations.
In each row, the "maximum of the others" equals the row maximum everywhere except at the position of the row maximum itself, where it equals the second-largest value. We can therefore use argpartition to cheaply find the indices of the two largest elements, put the second-largest value at the position of the largest, and put the largest value everywhere else. This also works for more than 3 columns.
>>> a
array([[6, 0, 8, 8, 0, 4, 4, 5],
[3, 1, 5, 0, 9, 0, 3, 6],
[1, 6, 8, 3, 4, 7, 3, 7],
[2, 1, 6, 2, 9, 1, 8, 9],
[7, 3, 9, 5, 3, 7, 4, 3],
[3, 4, 3, 5, 8, 2, 2, 4],
[4, 1, 7, 9, 2, 5, 9, 6],
[5, 6, 8, 5, 5, 3, 3, 3]])
>>>
>>> M, N = a.shape
>>> result = np.empty_like(a)
>>> largest_two = np.argpartition(a, N-2, axis=-1)
>>> rng = np.arange(M)
>>> result[...] = a[rng, largest_two[:, -1], None]
>>> result[rng, largest_two[:, -1]] = a[rng, largest_two[:, -2]]
>>>
>>> result
array([[8, 8, 8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 6, 9, 9, 9],
[8, 8, 7, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9, 9, 9],
[9, 9, 7, 9, 9, 9, 9, 9],
[8, 8, 8, 8, 5, 8, 8, 8],
[9, 9, 9, 9, 9, 9, 9, 9],
[8, 8, 6, 8, 8, 8, 8, 8]])
This solution depends on specific properties of max.
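The argpartition steps above can be wrapped in a function and checked against a brute-force version. This is a minimal sketch that assumes a 2-D array with at least two columns; the names `max_of_others` and `brute` are made up for this example.

```python
import numpy as np


def max_of_others(a):
    # For each element, the maximum of the other elements in its row.
    m, n = a.shape
    order = np.argpartition(a, n - 2, axis=-1)
    rows = np.arange(m)
    top, second = order[:, -1], order[:, -2]
    result = np.empty_like(a)
    result[...] = a[rows, top, None]       # row maximum everywhere...
    result[rows, top] = a[rows, second]    # ...except at the maximum itself
    return result


rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(8, 8))

# Brute force: drop column j, take the row-wise maximum of what is left.
brute = np.column_stack(
    [np.delete(a, j, axis=1).max(axis=1) for j in range(a.shape[1])]
)
print(np.array_equal(max_of_others(a), brute))
```

Ties are handled correctly: if the maximum occurs twice in a row, the second-largest value equals the maximum, so every entry in that row becomes the maximum.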
A more general solution, which for example also works for sum instead of max, is the following. Glue two copies of a together (side by side, not on top of each other), so that each row looks like a0 a1 a2 a3 a0 a1 a2 a3. For an index x we can then get everything but ax by slicing [x+1:x+4] (in general, [x+1:x+N] for N columns). To vectorize this we use stride_tricks:
>>> a
array([[2, 6, 0],
[5, 0, 0],
[5, 0, 9],
[6, 4, 4],
[5, 0, 8],
[1, 7, 5],
[9, 7, 7],
[4, 4, 3]])
>>> M, N = a.shape
>>> aa = np.c_[a, a]
>>> ast = np.lib.stride_tricks.as_strided(aa, (M, N+1, N-1), aa.strides + aa.strides[1:])
>>> result = np.max(ast[:, 1:, :], axis=-1)
>>> result
array([[6, 2, 6],
[0, 5, 5],
[9, 9, 5],
[4, 6, 6],
[8, 8, 5],
[7, 5, 7],
[7, 9, 9],
[4, 4, 4]])
# use sum instead of max
>>> result = np.sum(ast[:, 1:, :], axis=-1)
>>> result
array([[ 6, 2, 8],
[ 0, 5, 5],
[ 9, 14, 5],
[ 8, 10, 10],
[ 8, 13, 5],
[12, 6, 8],
[14, 16, 16],
[ 7, 7, 8]])
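The same glued-copy idea can be sketched with `sliding_window_view` (NumPy 1.20+) in place of a hand-written as_strided call. This is an illustrative variant rather than the answer's own code; the function name `reduce_others` is made up, and at least two columns are assumed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def reduce_others(a, func):
    # Apply func (e.g. np.max or np.sum) to each row with one element left out.
    m, n = a.shape
    doubled = np.concatenate([a, a], axis=1)   # a0 .. a(n-1) a0 .. a(n-1)
    # windows[:, j, :] is doubled[:, j:j + n - 1]; shape (m, n + 2, n - 1).
    windows = sliding_window_view(doubled, n - 1, axis=1)
    # Window j + 1 starts right after column j and wraps around, skipping a[:, j].
    return func(windows[:, 1:n + 1, :], axis=-1)


a = np.array([[2, 6, 0],
              [5, 0, 0],
              [5, 0, 9]])
print(reduce_others(a, np.max))
print(reduce_others(a, np.sum))
```

Only the doubled array is copied; the windows themselves are a view, so the extra memory is about twice the size of a.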
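A list-comprehension solution (like the masking approach in the question, it assumes non-negative values, because the excluded column is replaced by 0):
np.array([np.amax(a * (1 - np.eye(a.shape[1])[j]), axis=1) for j in range(a.shape[1])]).T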
Similar to @Ethan's answer, but with np.delete(), np.max(), and np.dstack():
np.dstack([np.max(np.delete(a, i, 1), axis=1) for i in range(a.shape[1])])
array([[[2, 2, 1],
[2, 2, 2],
[2, 2, 2],
[2, 0, 2]]])
delete() "filters" out each column in turn;
max() finds the row-wise maximum of the remaining two columns;
dstack() stacks the resulting 1-D arrays along a third axis, so the result has shape (1, rows, columns).
If you have more than 3 columns, note that this will find the maximum of "all other" columns rather than the "2-greatest" columns per row. For example:
a2 = np.arange(25).reshape(5,5)
np.dstack([np.max(np.delete(a2, i, 1), axis=1) for i in range(a2.shape[1])])
array([[[ 4, 4, 4, 4, 3],
[ 9, 9, 9, 9, 8],
[14, 14, 14, 14, 13],
[19, 19, 19, 19, 18],
[24, 24, 24, 24, 23]]])

Viewing cells of a grid in sliding windows with periodic boundaries

Consider a 2D array
>>> A = np.array(range(16)).reshape(4, 4)
>>> A
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
I would like to construct a function f(i,j) which pulls a 3x3 block from elements surrounding A[i,j] with periodic boundary conditions.
For example a non-boundary element would be
>>> f(1,1)
array([[ 0, 1, 2],
[ 4, 5, 6],
[ 8, 9, 10]])
and a boundary element would be
>>> f(0,0)
array([[15, 12, 13],
[ 3, 0, 1],
[ 7, 4, 5]])
view_as_windows comes close but does not wrap around periodic boundaries.
>>> from skimage.util.shape import view_as_windows
>>> view_as_windows(A,(3,3))
array([[[[ 0, 1, 2],
[ 4, 5, 6],
[ 8, 9, 10]],
[[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]]],
[[[ 4, 5, 6],
[ 8, 9, 10],
[12, 13, 14]],
[[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]]]])
In this case view_as_windows(A, (3,3))[0,0] == f(1,1), but f(0,0) is not in view_as_windows(A, (3,3)). I need a view_as_windows-style array that has one window per element of A, where each window has shape (3,3).
Simply pad the array with wrap-around using np.pad, then use scikit-image's view_as_windows -
from skimage.util.shape import view_as_windows
Apad = np.pad(A,1,'wrap')
out = view_as_windows(Apad,(3,3))
Sample run -
In [65]: A
Out[65]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [66]: Apad = np.pad(A,1,'wrap')
In [67]: out = view_as_windows(Apad,(3,3))
In [68]: out[0,0]
Out[68]:
array([[15, 12, 13],
[ 3, 0, 1],
[ 7, 4, 5]])
In [69]: out[1,1]
Out[69]:
array([[ 0, 1, 2],
[ 4, 5, 6],
[ 8, 9, 10]])
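If only a single block is needed at a time, f(i, j) can also be sketched directly with modulo indexing, without padding the whole array. This is an alternative illustration, not the answer's method; the function name `periodic_block` and its `radius` parameter are made up for this example.

```python
import numpy as np


def periodic_block(A, i, j, radius=1):
    # Square block of side 2 * radius + 1 centred on A[i, j], wrapping at the edges.
    offsets = np.arange(-radius, radius + 1)
    rows = (i + offsets) % A.shape[0]
    cols = (j + offsets) % A.shape[1]
    # np.ix_ builds an open mesh so every (row, col) combination is selected.
    return A[np.ix_(rows, cols)]


A = np.arange(16).reshape(4, 4)
print(periodic_block(A, 0, 0))
print(periodic_block(A, 1, 1))
```

Unlike the padded view_as_windows result, this returns a copy for each call, so it suits occasional lookups better than sweeping over every cell.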

Merging non-overlapping array blocks

I divided a (512x512) 2-dimensional array into 2x2 blocks using this function:
skimage.util.view_as_blocks(arr_in, block_shape)
>>> A
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> B = view_as_blocks(A, block_shape=(2, 2))
>>> B[0, 0]
array([[0, 1],
[4, 5]])
>>> B[0, 1]
array([[2, 3],
[6, 7]])
Now I need to put the blocks back in their original places after manipulating them, but I couldn't find any function in skimage for that.
What's the best way to merge the non-overlapping blocks back into an array laid out as before?
Thank you!
Use transpose/swapaxes to swap the second and third axes and then reshape to have the last two axes merged -
B.transpose(0,2,1,3).reshape(-1,B.shape[1]*B.shape[3])
B.swapaxes(1,2).reshape(-1,B.shape[1]*B.shape[3])
Sample run -
In [41]: A
Out[41]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [42]: B = view_as_blocks(A, block_shape=(2, 2))
In [43]: B
Out[43]:
array([[[[ 0, 1],
[ 4, 5]],
[[ 2, 3],
[ 6, 7]]],
[[[ 8, 9],
[12, 13]],
[[10, 11],
[14, 15]]]])
In [44]: B.transpose(0,2,1,3).reshape(-1,B.shape[1]*B.shape[3])
Out[44]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
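A round trip makes the recipe easy to verify. The sketch below uses a NumPy-only stand-in for view_as_blocks so that it runs without scikit-image; the helper names `split_blocks` and `merge_blocks` are made up for this example, and the array shape is assumed to be divisible by the block shape.

```python
import numpy as np


def split_blocks(A, bh, bw):
    # Stand-in for view_as_blocks: result has shape (H // bh, W // bw, bh, bw).
    H, W = A.shape
    return A.reshape(H // bh, bh, W // bw, bw).swapaxes(1, 2)


def merge_blocks(B):
    # Inverse of split_blocks: swap the middle axes back, then merge axis pairs.
    nby, nbx, bh, bw = B.shape
    return B.swapaxes(1, 2).reshape(nby * bh, nbx * bw)


A = np.arange(16).reshape(4, 4)
B = split_blocks(A, 2, 2)
B_scaled = B * 10                      # some per-block manipulation
print(np.array_equal(merge_blocks(B), A))
print(np.array_equal(merge_blocks(B_scaled), A * 10))
```

Here `reshape(nby * bh, nbx * bw)` spells out the same shape that `reshape(-1, B.shape[1]*B.shape[3])` infers.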
This is a good use case for einops:
from einops import rearrange
# that's how you could rewrite view_as_blocks
B = rearrange(A, '(x dx) (y dy) -> x y dx dy', dx=2, dy=2)
# that's an answer to your question
A = rearrange(B, 'x y dx dy -> (x dx) (y dy)')
See documentation for more operations on images
