I don't understand the syntax for retrieving values - python

Here is the shape of my training set:
input [8] : train_x.shape
Out [8] : (4500, 3, 2)
I don't understand the following syntax:
input [9] : train_x_retrive = train_x[:, -1, :]
Thank you for your help

See,
(4500, 3, 2) means the data has 3 dimensions:
the 1st dimension has length 4500, the 2nd dimension has length 3, and the 3rd dimension has length 2.
train_x[:, -1, :] means: take everything along the 1st dimension, only the last element along the 2nd dimension, and everything along the 3rd dimension.
The resulting shape will be (4500, 2).
--EDIT--
It turns out that indexing an axis with a single integer (here -1) removes that axis from the result, so instead of an array of shape (4500, 1, 2) NumPy returns one of shape (4500, 2).
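To make the shape behaviour concrete, here is a minimal sketch. The array of zeros is only a stand-in with the same shape as train_x from the question, and the names by_integer and by_slice are made up for illustration.

```python
import numpy as np

# Stand-in data with the same shape as train_x in the question (values are arbitrary).
train_x = np.zeros((4500, 3, 2))

by_integer = train_x[:, -1, :]   # integer index: the second axis is removed
by_slice = train_x[:, -1:, :]    # length-1 slice: the second axis is kept

print(by_integer.shape)  # (4500, 2)
print(by_slice.shape)    # (4500, 1, 2)
```

If you need to keep the middle axis, the slice form -1: is one way to do it.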

While @thisisjaymehta's answer is correct and well explained, I find it much easier to understand what's happening in a 2D array.
Consider a 3 row, 2 col random array:
import numpy as np
X = np.random.random((3,2))
print(X)
Yields:
array([[0.05809464, 0.49751321],
[0.25815324, 0.23862334],
[0.56815427, 0.91610693]])
We can access individual elements with subscripting. For instance to reach row (horizontal) 0, col (vertical) 1 we can use X[0,1]:
print(X[0,1])
Which yields:
0.49751320772009267
Similarly we can reach the last row, and the first (0th) column by using X[-1,0]:
0.568154265734957
The notation : is used to address the whole of that axis, so to get the last row and all of the columns in that last row, we can use X[-1,:] to yield:
array([0.56815427, 0.91610693])
This principle extends in 3 or more dimensions as well. So train_x[:, -1, :] means "All rows (first dimension), the last column (second dimension), and all of the third dimension". This results in an array of shape (4500,2) in your example, where you started with (4500, 3, 2).
The way I like to think of this is that you have 4500 images of 3x2 pixels and you are requesting the last row of each image. Each result is a 1x2 strip, and because the integer index drops that axis, the strips are stacked into a (4500, 2) array.
You could also do -2 in place of -1 to reach the penultimate index.
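The "images" picture can be checked on a small array. This is only a sketch: the array named images is a hypothetical stand-in with 4 blocks instead of 4500, filled with consecutive integers so the selected rows are easy to spot.

```python
import numpy as np

# Hypothetical small stand-in: 4 "images" of 3 rows x 2 columns.
images = np.arange(4 * 3 * 2).reshape(4, 3, 2)

last_rows = images[:, -1, :]         # last row of every image
penultimate_rows = images[:, -2, :]  # second-to-last row of every image

print(images[0])            # [[0 1] [2 3] [4 5]], printed over three lines
print(last_rows.shape)      # (4, 2)
print(last_rows[0])         # [4 5]
print(penultimate_rows[0])  # [2 3]
```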

Related

Select multiple indices in an axis of pytorch tensor

My actual problem is in a higher dimension, but I am posting it in a smaller dimension to make it easy to visualize.
I have a tensor of shape (2,3,4):
x = torch.randn(2, 3, 4)
tensor([[[-0.9118, 1.4676, -0.4684, -0.6343],
[ 1.5649, 1.0218, -1.3703, 1.8961],
[ 0.8652, 0.2491, -0.2556, 0.1311]],
[[ 0.5289, -1.2723, 2.3865, 0.0222],
[-1.5528, -0.4638, -0.6954, 0.1661],
[-1.8151, -0.4634, 1.6490, 0.6957]]])
From this tensor, I need to select rows given by a list of indices along axis-1.
Example,
indices = torch.tensor([0, 2])
Expected Output:
tensor([[[-0.9118, 1.4676, -0.4684, -0.6343]],
[[-1.8151, -0.4634, 1.6490, 0.6957]]])
Output Shape: (2,1,4)
Explanation: Select 0th row from x[0], select 2nd row from x[1]. (Came from indices)
I tried using index_select like this:
torch.index_select(x, 1, indices)
But the problem is that it selects the 0th and 2nd rows for every item in x. It looks like it needs some modification, but I could not figure it out.
In your case, this is quite straightforward. An easy way to navigate through two dimensions in parallel is to use a range on the first axis and your indexing tensor on the second:
>>> x[range(len(indices)), indices]
tensor([[-0.9118, 1.4676, -0.4684, -0.6343],
[-1.8151, -0.4634, 1.6490, 0.6957]])
In more general cases though, this would require the use of torch.gather:
First expand indices such that it has enough dimensions:
index = indices[:,None,None].expand(x.size(0), -1, x.size(-1))
Then you can apply the function on x and index and squeeze dim=1:
>>> x.gather(dim=-2, index=index)[:,0]
tensor([[-0.9118, 1.4676, -0.4684, -0.6343],
[-1.8151, -0.4634, 1.6490, 0.6957]])
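As a rough check that the two approaches agree, the sketch below runs both on a small tensor. The tensor values (consecutive numbers from torch.arange) and the names picked and gathered are assumptions for illustration; only the shapes match the question.

```python
import torch

# Hypothetical example tensor with the same shape as in the question.
x = torch.arange(24.0).reshape(2, 3, 4)
indices = torch.tensor([0, 2])

# Approach 1: pair each position on axis 0 with its own row index.
picked = x[torch.arange(x.size(0)), indices]   # shape (2, 4)

# Approach 2: gather along axis 1 with an index expanded to shape (2, 1, 4).
index = indices[:, None, None].expand(x.size(0), -1, x.size(-1))
gathered = x.gather(dim=1, index=index)        # shape (2, 1, 4)

print(gathered.shape)                       # torch.Size([2, 1, 4])
print(torch.equal(picked, gathered[:, 0]))  # True
```

Leaving off the final [:, 0] keeps the (2, 1, 4) shape that the question lists as the expected output.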

Printing a Python Array

I have the array below that represents a matrix of 20 cols x 10 rows.
What I am trying to do is to get the value located on the third position after I provide the Column and Row Values. For example if I type in the values 3 and 0, I expect to get 183 as answer. I used the print command as follows print(matrix[3][0][I don't know]) either I get out of range or the undesirable results.
I also organized the data as matrix[[[0],[0],[180]], [[1],[0],[181]], [[2],[0],[182]],... without too much success.
I have the matrix data in a csv file, so I can format it accordingly if the problem is the way I am presenting the data.
Can someone please take a look at this code and point me in the right direction? Thanks
matrix =[]
matrix =[
['0','0','180'],
['1','0','181'],
['2','0','182'],
['3','0','183'],
['4','0','184'],
['5','0','185'],
['6','0','186'],
['7','0','187'],
['18','0','198']]
print(matrix[?][?][value])
Your matrix here is 9 x 3 (9 rows, 3 columns).
If you want the 185, it is in the 6th row and 3rd column, so the indexes are 5 and 2 respectively.
matrix[5][2] will give you that value (as the string '185'); a third bracket is not needed.
Basically, to access an element you write matrix[rowNumber][colNumber]. The first pair of brackets gives you whatever is at that position of the outer list (a 2D array is just an array of arrays), so you get a 1D array with 3 elements; the second pair of brackets is the index of the element within that 1D array.
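The question also asks for the value that belongs to a given column and row pair (for example 3 and 0 giving 183). One possible way to do that, sketched here on a shortened copy of the data, is to build a dictionary keyed on the first two fields. The name lookup and the conversion to int are assumptions for illustration, not part of the original code.

```python
# A shortened copy of the data from the question: [column, row, value] as strings.
matrix = [
    ['0', '0', '180'],
    ['1', '0', '181'],
    ['2', '0', '182'],
    ['3', '0', '183'],
]

# Positional access: fourth inner list, third element.
print(matrix[3][2])  # 183

# Lookup by (column, row): build a dictionary keyed on the first two fields.
lookup = {(int(col), int(row)): int(value) for col, row, value in matrix}
print(lookup[(3, 0)])  # 183
```

Positional access only works while the rows stay in a known order; the dictionary does not depend on the order of the rows.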

Find index of the maximum value in a numpy array

I have a numpy array called prediction as follows
array([[3.7839172e-06, 8.0308418e-09, 2.2542761e-06, 5.9392878e-08,
5.3137046e-07, 1.7033290e-05, 1.7738441e-07, 1.0742254e-03,
1.8656212e-06, 9.9890006e-01]], dtype=float32)
In order to get the index of the maximum value in this array, I used the following
np.where(prediction==prediction.max())
But the result I am getting also shows index 0.
(array([0], dtype=int64), array([9], dtype=int64))
Does anyone know why it is also showing index 0?
Also, how can I get just the index number instead of array([9], dtype=int64)?
Use the built-in function for it:
prediction.argmax()
output:
9
Also, that index 0 is the row number, so the max is at row 0 and column 9.
The prediction array here is two-dimensional. When you call np.where with only a condition, it is the same as calling np.asarray(condition).nonzero(), which returns the indices of the non-zero elements of prediction==prediction.max(). That is a boolean array whose only non-zero (True) element is at (0, 9), so you get one index array per axis.
What you are looking for is the argmax function which will give you the index of the maximum value along an axis. You effectively only have one axis (2d but only one row) here so this should be fine.
As the other answers mentioned, you have a 2D array, so you end up with two indices. Since the array is just a row, the first index is always zero. You can bypass this in a number of ways:
Use prediction.argmax(). The default axis argument is None, which means operate on a flattened array. Other options that will get you the same result are prediction.argmax(-1) (last axis) and prediction.argmax(1) (second axis). Keep in mind that you will only ever get the index of the first maximum this way. That's fine if you only ever expect to have one, or only need one.
Use np.flatnonzero to get the linear indices similarly to the way you were doing:
np.flatnonzero(prediction == prediction.max())
Use np.nonzero or np.where, but extract the axis you care about:
np.nonzero(prediction == prediction.max())[1]
Ravel the array on input:
np.where(prediction.ravel() == prediction.max())
Do the same thing, but with np.squeeze:
np.nonzero(prediction.squeeze() == prediction.max())
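A few of the options above can be compared side by side. This is a minimal sketch on a made-up one-row array (the values are not the ones from the question); the names flat_index, row and col are chosen for illustration.

```python
import numpy as np

# Hypothetical 2-D array with a single row, like the one in the question.
prediction = np.array([[0.1, 0.05, 0.7, 0.15]], dtype=np.float32)

flat_index = prediction.argmax()                           # index into the flattened array
row, col = np.unravel_index(flat_index, prediction.shape)  # back to (row, column)

print(int(flat_index))                                  # 2
print(int(row), int(col))                               # 0 2
print(np.flatnonzero(prediction == prediction.max()))   # [2]
print(np.nonzero(prediction == prediction.max())[1])    # [2]
```

np.unravel_index is only needed when the array has more than one row and you want the row as well as the column.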

why does numpy array return wrong shape of sub arrays when indexing

An example is shown as follows:
>>> import numpy as np
>>> a=np.zeros((288,512))
>>> x1,x2,y1,y2=0,16,0,16
>>> p=a[x1:x2][y1:y2]
>>> p.shape
(16, 512)
>>> p=a[x1:x2,y1:y2]
>>> p.shape
(16, 16)
I am trying to extract a patch from an array, ranging from columns 0 to 16 and rows 0 to 16. I index the array in two ways and get very different results. a[x1:x2][y1:y2] gives me the wrong result.
Why?
Thanks for helping me!
When you do a[x1:x2][y1:y2], you are slicing by rows twice. That is, a[x1:x2] will give you a shape (16,512). The second slice operation in a[x1:x2][y1:y2] slices the rows of that result again, and because y1:y2 is the same range as x1:x2 here, it gives you the same (16,512) result.
In the second case, when you do a[x1:x2,y1:y2], you are slicing by the two dimensions of your 2-dimensional array.
Important note: If you have a 2-dimensional array and you slice like this:
a = np.zeros((10,15))
a[1:3].shape
Output:
(2, 15)
you will slice only by rows. Your resulting array will have 2 rows and the total number of columns (15 columns). If you want to slice by rows and columns, you will have to use a[1:3, 1:3].
The two methods of indexing you tried are not equivalent. In the first one (a[x1:x2][y1:y2]), you are essentially indexing the first axis twice. In the second, you are indexing the first and second axes.
a[x1:x2][y1:y2] can be rewritten as
p = a[x1:x2] # result still has two dimensions
p = p[y1:y2]
You are first indexing 0:16 in the first dimension. Then you index 0:16 in the first dimension of the result of the previous operation (which will simply return the same as a[x1:x2] because x1==y1 and x2==y2).
In the second method, you index the first and second dimensions directly. I would not write it this way, but one could write it like this to contrast it with the first method:
a[x1:x2][:, y1:y2]
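The difference is easier to see when the two ranges are not identical. In the sketch below the values of y1 and y2 are changed from the question's (a deliberate assumption, so that the row and column ranges differ), and the names chained, combined and two_step are made up for illustration.

```python
import numpy as np

a = np.zeros((288, 512))
x1, x2, y1, y2 = 0, 16, 4, 12   # hypothetical ranges, chosen to differ

chained = a[x1:x2][y1:y2]       # rows 0..15, then rows 4..11 of that result
combined = a[x1:x2, y1:y2]      # rows 0..15 and columns 4..11
two_step = a[x1:x2][:, y1:y2]   # same selection as `combined`

print(chained.shape)    # (8, 512)
print(combined.shape)   # (16, 8)
print(two_step.shape)   # (16, 8)
```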

Indexing over the last axis when you don't know the rank in advance

How can I index the last axis of a Numpy array if I don't know its rank in advance?
Here is what I want to do: Let a be a Numpy array of unknown rank. I want the slice of the last k elements of the last axis.
If a is 1D, I want
b = a[-k:]
If a is 2D, I want
b = a[:, -k:]
If a is 3D, I want
b = a[:, :, -k:]
and so on.
I want this to work regardless of the rank of a (as long as the rank is at least 1).
The fact that I want the last k elements in the example is irrelevant of course, the point is that I want to specify indices for whatever the last axis is when I don't know the rank of an array in advance.
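Use an Ellipsis (...), which expands to as many full slices (:) as are needed to cover the remaining axes:
b = a[..., -k:]
This is mentioned in the docs.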
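A quick way to convince yourself is to apply the same expression to arrays of different rank. The helper last_k below is a hypothetical wrapper written for this sketch, not a NumPy function.

```python
import numpy as np

def last_k(a, k):
    """Return the last k elements along the final axis, for any rank >= 1."""
    return a[..., -k:]

k = 2
for shape in [(5,), (4, 5), (3, 4, 5)]:
    a = np.zeros(shape)
    print(shape, "->", last_k(a, k).shape)
# (5,) -> (2,)
# (4, 5) -> (4, 2)
# (3, 4, 5) -> (3, 4, 2)
```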
