Given a matrix A, a list of row indices, and a list of column indices, how can I efficiently extract the square submatrices of size k centered on those row and column indices?
For example:
A = array([[12, 6, 14, 8, 4, 1],
[18, 13, 8, 10, 9, 19],
[ 8, 15, 6, 5, 6, 18],
[ 3, 0, 2, 14, 13, 12],
[ 4, 4, 5, 19, 0, 14],
[16, 8, 7, 7, 11, 0],
[ 3, 11, 2, 19, 11, 5],
[ 4, 2, 1, 9, 12, 12]])
r = np.array([2, 5])
c = np.array([3, 2])
k = 3
The output should be A[1:4, 2:5] and A[4:7, 1:4]. In other words, the outputs are square submatrices of size k x k centered on the [r, c] elements (A[2, 3] and A[5, 2] in this case).
How to do this efficiently and elegantly? Thanks
When all the submatrices have the same shape, we can build sliding windows and then index them with the start indices along the rows and columns to get the desired output. To get those windows, we can use scikit-image's view_as_windows, which is based on np.lib.stride_tricks.as_strided.
from skimage.util.shape import view_as_windows
# Get all sliding windows
w = view_as_windows(A,(k,k))
# Select relevant ones for final o/p
out = w[r-k//2,c-k//2]
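If scikit-image is not available, the same windows can probably be obtained with NumPy alone. The following is a minimal sketch, assuming NumPy 1.20 or later (where np.lib.stride_tricks.sliding_window_view was added) and assuming every requested window lies fully inside A:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

A = np.array([[12,  6, 14,  8,  4,  1],
              [18, 13,  8, 10,  9, 19],
              [ 8, 15,  6,  5,  6, 18],
              [ 3,  0,  2, 14, 13, 12],
              [ 4,  4,  5, 19,  0, 14],
              [16,  8,  7,  7, 11,  0],
              [ 3, 11,  2, 19, 11,  5],
              [ 4,  2,  1,  9, 12, 12]])
r = np.array([2, 5])
c = np.array([3, 2])
k = 3

# All k x k windows as a read-only view; w[i, j] is A[i:i+k, j:j+k]
w = sliding_window_view(A, (k, k))

# Convert the center indices to top-left corners and pick those windows
out = w[r - k // 2, c - k // 2]
```

Centers closer than k // 2 to the top or left edge would give negative start indices, which NumPy interprets as counting from the end, so such inputs need checking or padding first.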
You mean something like this?
for x, y in zip(r, c):
    s = k // 2
    print("position:", [x - s, x + s + 1], [y - s, y + s + 1])
    print(A[x - s:x + s + 1, y - s:y + s + 1])
    print()
Output:
position: [1, 4] [2, 5]
[[ 8 10 9]
[ 6 5 6]
[ 2 14 13]]
position: [4, 7] [1, 4]
[[ 4 5 19]
[ 8 7 7]
[11 2 19]]
Note that k should be odd here
I have N groups of C-dimensional points, with M points in each group, so there is a tensor of shape (N, M, C). Let's call it features.
I computed the maximum element and its index along the M dimension to find the maximum point for each of the C dimensions (a max pooling operation), resulting in a max tensor of shape (N, 1, C) and an index tensor of shape (N, 1, C).
I have another tensor of shape (N, M, 3) storing the geometric coordinates of those N*M high-dimensional points. Now I want to use the indices of the maximum points in each C dimension to get the coordinates of all those maximum points.
For example, N=2, M=4, C=6.
The coordinate tensor, whose shape is (2, 4, 3):
[[[1, 2, 3]
  [4, 5, 6]
  [7, 8, 9]
  [8, 7, 6]]
 [[11, 12, 13]
  [14, 15, 16]
  [17, 18, 19]
  [18, 17, 16]]]
The indices tensor, whose shape is (2, 1, 6):
[[[0, 1, 2, 1, 2, 3]]
[[1, 2, 3, 2, 1, 0]]]
For example, the first element in indices is 0, so I want to grab [1, 2, 3] from the coordinate tensor. For the second element (1), I want to grab [4, 5, 6]. For the third element of the second group (3), I want to grab [18, 17, 16].
The result tensor will look like:
[[[1, 2, 3] # 0
[4, 5, 6] # 1
[7, 8, 9] # 2
[4, 5, 6] # 1
[7, 8, 9] # 2
[8, 7, 6]] # 3
[[14, 15, 16] # 1
[17, 18, 19] # 2
[18, 17, 16] # 3
[17, 18, 19] # 2
[14, 15, 16] # 1
[11, 12, 13]]]# 0
whose shape is (2, 6, 3).
I tried to use torch.gather but could not get it to work. I wrote a naive algorithm that enumerates all N groups, but it is slow, even with TorchScript's JIT. So, how can I write this efficiently in PyTorch?
You can use integer array indexing combined with broadcasting semantics to get your result.
import torch
x = torch.tensor([
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[8, 7, 6]],
[[11, 12, 13],
[14, 15, 16],
[17, 18, 19],
[18, 17, 16]],
])
i = torch.tensor([[[0, 1, 2, 1, 2, 3]],
[[1, 2, 3, 2, 1, 0]]])
# rows is shape [2, 1], cols is shape [2, 6]
rows = torch.arange(x.shape[0]).type_as(i).unsqueeze(1)
cols = i.squeeze(1)
# y is [2, 6, ...]
y = x[rows, cols]
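Since the question mentions torch.gather, here is a sketch of the gather-style formulation, written in NumPy with np.take_along_axis so that it runs without PyTorch. The idea is to expand the index so that it has one entry per output element. torch.gather with an index expanded the same way should behave analogously, though that is an assumption not checked here. The names coords, idx, gather_idx and picked are illustrative only.

```python
import numpy as np

coords = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 7, 6]],
                   [[11, 12, 13], [14, 15, 16], [17, 18, 19], [18, 17, 16]]])  # (N, M, 3)
idx = np.array([[[0, 1, 2, 1, 2, 3]],
                [[1, 2, 3, 2, 1, 0]]])                                          # (N, 1, C)

N, M, D = coords.shape
C = idx.shape[-1]

# Move the C indices onto axis 1, then repeat each one across the D coordinates,
# giving an index array of shape (N, C, D)
gather_idx = np.broadcast_to(idx.transpose(0, 2, 1), (N, C, D))

# For every (n, c, d): picked[n, c, d] = coords[n, gather_idx[n, c, d], d]
picked = np.take_along_axis(coords, gather_idx, axis=1)
```

Compared with the integer array indexing above, this form needs the larger expanded index, so the indexing version is likely the simpler choice when whole rows are selected.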
Here is a problem I'm trying to solve. Let's say we've a square array:
In [10]: arr
Out[10]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
What I'd like to have is to flatten this array in a specific order: first I want to flatten the lower triangle along axis-0 and then pick the diagonal, and finally flatten the upper triangle again along axis-0, which would finally give the flattened array as:
# | lower triangle |diag.elements| upper triangle |
res = np.array([5, 9, 13, 10, 14, 15, 1, 6, 11, 16, 2, 3, 7, 4, 8, 12])
Here is my partial solution so far, which doesn't give the desired result yet.
In [16]: arr[np.tril(arr, k=-1) != 0]
Out[16]: array([ 5, 9, 10, 13, 14, 15]) # not correct!
In [17]: np.diag(arr)
Out[17]: array([ 1, 6, 11, 16])
In [18]: arr[np.triu(arr, k=1) != 0]
Out[18]: array([ 2, 3, 4, 7, 8, 12]) # not correct!
Finally, I would concatenate these three intermediate results. How do I index correctly to obtain the desired result? Alternatively, are there other ways of solving this problem?
Here's one based on masking and concatenating/stacking -
In [50]: r = np.arange(len(arr))
In [51]: mask = r[:,None]<r
In [54]: np.concatenate((arr.T[mask],np.diag(arr),arr.T[mask.T]))
Out[54]: array([ 5, 9, 13, 10, 14, 15, 1, 6, 11, 16, 2, 3, 7, 4, 8, 12])
Another based solely on masking -
n = len(arr)
r = np.arange(n)
mask = r[:,None]<r
diag_mask = r[:,None]==r
comp_mask = np.vstack((mask[None],diag_mask[None],mask.T[None]))
out = np.broadcast_to(arr.T,(3,n,n))[comp_mask]
Use the transpose (note that filtering with != 0 assumes arr has no zero entries off the diagonal):
lower = np.tril(arr, -1).T.ravel()
diag = np.diag(arr)
upper = np.triu(arr, 1).T.ravel()
result = np.concatenate([lower[lower != 0], diag, upper[upper != 0]])
print(result)
Output:
[ 5 9 13 10 14 15 1 6 11 16 2 3 7 4 8 12]
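A variant of the transpose idea that does not depend on the values in the array is to index with triangle index arrays instead of filtering out zeros. This is a sketch using np.triu_indices and np.tril_indices. Both return indices in row-major order, so applying them to the transpose walks the original array column by column.

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
n = len(arr)

# Strict upper triangle of arr.T == strict lower triangle of arr, read along axis 0
lower = arr.T[np.triu_indices(n, k=1)]

# Strict lower triangle of arr.T == strict upper triangle of arr, read along axis 0
upper = arr.T[np.tril_indices(n, k=-1)]

res = np.concatenate((lower, np.diag(arr), upper))
```

Because the selection is purely positional, zeros inside either triangle are kept.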
I use boolean index masks built with NumPy broadcasting:
ary = arr.T
i,c=ary.shape
x=np.arange(i)
y=np.arange(c)
np.concatenate([ary[x[:,None]<y],ary[x[:,None]==y],ary[x[:,None]>y]])
Out[1065]: array([ 5, 9, 13, 10, 14, 15, 1, 6, 11, 16, 2, 3, 7, 4, 8, 12])
Suppose I have a 2-dimensional numpy array of shape n x m (where n is a large number and m >= 1). Each column represents one attribute. An example for n=5, m=3 is provided below:
[[1,2,3],
[4,5,6],
[7,8,9],
[10,11,12],
[13,14,15]]
I want to train my model on the history of attributes with history_steps = p (1 < p <= n). For p=2, the output I expect (of shape (n-p+1) x (m*p)) is
[[1,4,2,5,3,6],
[4,7,5,8,6,9],
[7,10,8,11,9,12],
[10,13,11,14,12,15]]
I tried to implement this in pandas by separating columns and then concatenating outputs.
def buff(s, n):
    return pd.concat([s.shift(-i) for i in range(n)], axis=1).dropna().astype(float)
But, for my purposes a numpy based approach will be better. Also, I would like to avoid splitting and concatenating.
How do I go about doing this?
Here's a NumPy based approach with focus on performance using np.lib.stride_tricks.as_strided -
def strided_axis0(a, L=2):
    # INPUTS :
    # a : Input array
    # L : Length along rows to be cut to create per subarray

    # Store shape and strides info
    m, n = a.shape
    s0, s1 = a.strides
    nrows = m - L + 1
    strided = np.lib.stride_tricks.as_strided

    # Finally use strides to get the 3D array view and then reshape
    return strided(a, shape=(nrows, n, L), strides=(s0, s1, s0)).reshape(nrows, -1)
Sample run -
In [27]: a
Out[27]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]])
In [28]: strided_axis0(a, L=2)
Out[28]:
array([[ 1, 4, 2, 5, 3, 6],
[ 4, 7, 5, 8, 6, 9],
[ 7, 10, 8, 11, 9, 12],
[10, 13, 11, 14, 12, 15]])
You can use dstack + reshape:
a = np.array([[1,2,3],
[4,5,6],
[7,8,9],
[10,11,12],
[13,14,15]])
# use `dstack` to stack the two arrays(one with last row removed, the other with first
# row removed), along the third axis, and then use reshape to flatten the second and third
# dimensions
np.dstack([a[:-1], a[1:]]).reshape(a.shape[0]-1, -1)
#array([[ 1, 4, 2, 5, 3, 6],
# [ 4, 7, 5, 8, 6, 9],
# [ 7, 10, 8, 11, 9, 12],
# [10, 13, 11, 14, 12, 15]])
To generalize to arbitrary p, use a list comprehension to generate a list of shifted arrays and then do stack+reshape:
n, m = a.shape
p = 3
np.dstack([a[i:(n-p+i+1)] for i in range(p)]).reshape(n-p+1, -1)
#array([[ 1, 4, 7, 2, 5, 8, 3, 6, 9],
# [ 4, 7, 10, 5, 8, 11, 6, 9, 12],
# [ 7, 10, 13, 8, 11, 14, 9, 12, 15]])
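The strided approach can also be written without computing strides by hand. The following is a sketch, assuming NumPy 1.20 or later, that uses np.lib.stride_tricks.sliding_window_view. The function name history_windows is made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def history_windows(a, p):
    """Return an array of shape (n - p + 1, m * p) holding p-step histories."""
    n = a.shape[0]
    # Windows along axis 0; the window axis is appended last,
    # so the result has shape (n - p + 1, m, p)
    windows = sliding_window_view(a, p, axis=0)
    # Flatten the last two axes: each attribute contributes p consecutive values
    return windows.reshape(n - p + 1, -1)

a = np.arange(1, 16).reshape(5, 3)
out2 = history_windows(a, 2)
out3 = history_windows(a, 3)
```

The window view itself shares memory with a, but the final reshape has to copy, so memory use is still proportional to the output size.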
I have a dataset with the following information. The timebin variable is an identifier for the time period of the data. It can be assumed that timebin is unique and without any gaps (i.e. the data will always contain 2 if it contains 1 and 3).
timebin,lat,lon
0,9.0,2.0
1,12.0,4.0
2,15.0,6.0
3,18.0,8.0
4,21.0,10.0
5,24.0,12.0
6,27.0,14.0
7,30.0,16.0
I want to generate all the sequences of a fixed-length l with an amount of overlap o. For instance, for l=4 and o=2 the following groups of indices would be output:
[[0,1,2,3], [2,3,4,5], [4,5,6,7]]
This could be done using a loop, but I wonder if there is a more elegant and efficient way of doing it in python?
Use a list comprehension:
l = 4
o = 2
e = 7  # last timebin
# consecutive windows start l - o apart, so that they share o elements
print([[x for x in range(s, s + l)] for s in range(0, e, l - o) if s + l <= e + 1])
Result:
[[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]
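overlap = 2
length = overlap * 2  # window length l = 4
data = [0, 1, 2, 3, 4, 5, 6, 7]
# step by length - overlap so that neighbouring groups share `overlap` items
groups = [data[i: i + length] for i in range(0, len(data) - length + 1, length - overlap)]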
Is the rest of the provided data involved in any way?
Just from your question you could generate those sequences with list comprehensions:
>>> l = 4
>>> o = 2
>>> [[x for x in range(s, s+l)] for s in range(20)[::(l-o)]]
[[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9], [8, 9, 10, 11],
[10, 11, 12, 13], [12, 13, 14, 15], [14, 15, 16, 17], [16, 17, 18, 19],
[18, 19, 20, 21]]
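The list-comprehension answers can be wrapped into one small helper that takes the number of timebins explicitly. This is a sketch. The name window_indices and its parameter n (the number of timebins) are not from the question, and windows that would run past the end are dropped.

```python
def window_indices(n, l, o):
    """Index groups of length l over range(n), neighbours sharing o indices."""
    if not 0 <= o < l:
        raise ValueError("overlap must satisfy 0 <= o < l")
    step = l - o  # distance between the starts of consecutive windows
    # The last valid start is n - l, hence the stop value n - l + 1
    return [list(range(s, s + l)) for s in range(0, n - l + 1, step)]

groups = window_indices(8, 4, 2)
print(groups)
```

Each group can then be used to select rows of the dataset, for example with DataFrame.iloc if the data is loaded in pandas.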
Let's say I have this NumPy array:
a =
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
For the element at [1,1], which is 5, the offset of its diagonal is zero; according to numpy, a.diagonal(0) = [0, 5, 10]. How do I get the reverse (right-to-left) diagonal [2, 5, 8] through [1,1]? Is this possible?
My original problem is an 8 by 8 array (indices 0 to 7). I hope that helps.
Reverse each row to get a new array, then take its diagonal:
>>> import numpy as np
>>> a = np.array([
... [0, 1, 2, 3],
... [4, 5, 6, 7],
... [8, 9, 10, 11]
... ])
>>> a[:, ::-1]
array([[ 3, 2, 1, 0],
[ 7, 6, 5, 4],
[11, 10, 9, 8]])
>>> a[:, ::-1].diagonal(1)
array([2, 5, 8])
or using numpy.fliplr:
>>> np.fliplr(a).diagonal(1)
array([2, 5, 8])
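The offset 1 above is specific to element [1,1] of a 3 x 4 array. For an arbitrary element, the offset can be derived from where that element lands after the flip. This is a sketch; the helper name anti_diagonal is made up here.

```python
import numpy as np

def anti_diagonal(a, r, c):
    """Right-to-left diagonal of 2-D array a passing through a[r, c]."""
    # After fliplr, a[r, c] sits at column (ncols - 1 - c) of row r,
    # so it lies on the diagonal with offset (column - row)
    offset = (a.shape[1] - 1 - c) - r
    return np.fliplr(a).diagonal(offset)

a = np.arange(12).reshape(3, 4)
print(anti_diagonal(a, 1, 1))
```

The elements come out ordered from the top-right end of the diagonal to the bottom-left end, and the same function works for rectangular and square arrays.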
Flip the array upside-down and use the same:
np.flipud(a).diagonal(0)[::-1]
Another way to achieve this is to use np.rot90
import numpy as np
a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11]])
my_diag = np.rot90(a).diagonal(-1)
Result:
>>> my_diag
array([2, 5, 8])
A number of answers so far. @Akavall is closest, as you need to rotate, or flip and transpose (equivalent operations). I haven't seen a response from the OP regarding the expected behavior on the "long" part of the rectangle.
Generalized solution for a square matrix:
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> [(i, np.rot90(a).diagonal(2*i-a.shape[0]+1)) for i in range(a.shape[0])]
[(0, array([0])),
(1, array([ 2, 6, 10])),
(2, array([ 4, 8, 12, 16, 20])),
(3, array([14, 18, 22])),
(4, array([24]))]
As a function:
def reverse_diag(arr, n):
    idx = 2*n - arr.shape[0] + 1
    return np.rot90(arr).diagonal(idx)
The original matrix can be made square with a[:np.min(a.shape),:np.min(a.shape)]
EDIT: OP indicated the array is square, so the final answer is the above.