Numpy [...,None] - python

I have found myself needing to add features to existing numpy arrays which has led to a question around what the last portion of the following code is actually doing:
np.ones(shape=feature_set.shape)[...,None]
Set-up
As an example, let's say I wish to solve for linear regression parameter estimates by using numpy and solving:
Assume I have a feature set of shape (50,1), a target variable of shape (50,), and I wish to use the shape of my target variable to add a column for intercept values.
It would look something like this:
# Create random target & feature set
y_train = np.random.randint(0,100, size = (50,))
feature_set = np.random.randint(0,100,size=(50,1))
# Build a set of 1s after shape of target variable
int_train = np.ones(shape=y_train.shape)[...,None]
# Able to then add int_train to feature set
X = np.concatenate((int_train, feature_set),1)
What I Think I Know
I see the difference in output when I include [...,None] vs when I leave it off. Here it is:
The second version returns an error around input arrays needing the same number of dimensions, and eventually I stumbled on the solution to use [...,None].
Main Question
While I see the output of [...,None] gives me what I want, I am struggling to find any information on what it is actually supposed to do. Can anybody walk me through what this code actually means, what the None argument is doing, etc?
Thank you!

The slice of [..., None] consists of two "shortcuts":
The ellipsis literal component:
The dots (...) represent as many colons as needed to produce a complete indexing tuple. For example, if x is a rank 5 array (i.e., it has 5 axes), then
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].
(Source)
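The equivalences quoted above can be checked directly. This is a minimal sketch; the array shape (2, 3, 4, 5, 6) and the index values are arbitrary choices for illustration, not taken from the quoted documentation.

```python
import numpy as np

# A rank-5 array; the shape is an arbitrary choice for illustration.
x = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)

# In each pair, the ellipsis stands in for the colons written out on the right.
same_leading = np.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
same_trailing = np.array_equal(x[..., 3], x[:, :, :, :, 3])
same_middle = np.array_equal(x[1, ..., 4, :], x[1, :, :, 4, :])

print(same_leading, same_trailing, same_middle)  # True True True
print(x[1, ..., 4, :].shape)                     # (3, 4, 6)
```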
The None component:
numpy.newaxis
The newaxis object can be used in all slicing operations to create an axis of length one. newaxis is an alias for ‘None’, and ‘None’ can be used in place of this with the same result.
(Source)
So, arr[..., None] takes an array of dimension N and "adds" a dimension "at the end" for a resulting array of dimension N+1.
Example:
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
print(x.shape) # (2, 3)
y = x[...,None]
print(y.shape) # (2, 3, 1)
z = x[:,:,np.newaxis]
print(z.shape) # (2, 3, 1)
a = np.expand_dims(x, axis=-1)
print(a.shape) # (2, 3, 1)
print((y == z).all()) # True
print((y == a).all()) # True
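Applied to the set-up from the question, the same indexing turns the (50,) array of ones into a (50, 1) column that can be concatenated with the feature set. The sketch below reuses the question's variable names; the seeded random generator is an assumption added only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
y_train = rng.integers(0, 100, size=(50,))
feature_set = rng.integers(0, 100, size=(50, 1))

ones_1d = np.ones(shape=y_train.shape)  # shape (50,)
int_train = ones_1d[..., None]          # shape (50, 1)

# Both inputs are now 2-D, so they can be joined along axis 1.
X = np.concatenate((int_train, feature_set), axis=1)
print(X.shape)  # (50, 2)

# Without the extra axis, the 1-D and 2-D inputs cannot be concatenated.
failed = False
try:
    np.concatenate((ones_1d, feature_set), axis=1)
except ValueError as err:
    failed = True
    print("ValueError:", err)
```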

Consider this code:
np.ones(shape=(2,3))[...,None].shape
As you can see, None changes the (2,3) matrix into a (2,3,1) tensor: the new axis of length one is added in the LAST position.
If you use
np.ones(shape=(2,3))[None, ...].shape
the new axis is added in the FIRST position instead, giving (1,2,3).
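A short sketch comparing where the new axis lands for different placements of None; the middle placement is an extra case added here for illustration.

```python
import numpy as np

m = np.ones(shape=(2, 3))

last = m[..., None]     # new axis at the end
first = m[None, ...]    # new axis at the front
middle = m[:, None, :]  # new axis between the two existing axes

print(last.shape)    # (2, 3, 1)
print(first.shape)   # (1, 2, 3)
print(middle.shape)  # (2, 1, 3)

# np.newaxis is simply another name for None.
print(np.newaxis is None)  # True
```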

Related

pytorch view tensor and reduce one dimension

So I have a 4d tensor with shape [4,1,128,678] and I would like to view/reshape it as [4,678,128].
I have to do this for multiple tensors where the last shape value 678 is not always known and could be different, so [4,1,128,575] should also go to [4,575,128]
Any idea on what is the optimal operation to transform the tensor? view/reshape? and how?
Thanks
You could also use (less to write and IMO cleaner):
# x.shape == (4, 1, 128, 678)
x.squeeze().permute(0, 2, 1)
If you were to use view you would lose dimension information (but maybe that is what you want), in this case it would be:
x.squeeze().view(4, -1, 128)
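permute reorders the axes of a tensor, while view only gives a different view of the same data without restructuring the underlying memory, so the two results have the same shape but a different element order. You can see the difference between those two operations in this StackOverflow answer.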
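The difference is easy to see on a small array. The sketch below uses NumPy as a stand-in for PyTorch (an assumption, so that it runs without torch installed): transpose and reshape play the roles of permute and view, and the shape (2, 1, 3, 4) is an arbitrary small substitute for (4, 1, 128, 678).

```python
import numpy as np

# Small stand-in for the (4, 1, 128, 678) tensor.
x = np.arange(2 * 1 * 3 * 4).reshape(2, 1, 3, 4)

squeezed = x.squeeze(axis=1)            # (2, 3, 4); comparable to x.squeeze(1) in torch
permuted = squeezed.transpose(0, 2, 1)  # comparable to permute(0, 2, 1)
reshaped = squeezed.reshape(2, -1, 3)   # comparable to view(2, -1, 3)

print(permuted.shape, reshaped.shape)      # (2, 4, 3) (2, 4, 3)
print(np.array_equal(permuted, reshaped))  # False

# Only the permuted result keeps each element at its swapped position.
print(permuted[0, 1, 2] == x[0, 0, 2, 1])  # True
```

Squeezing only axis 1 rather than every length-one axis keeps the batch axis intact even when the batch size happens to be 1.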
Use einops instead, it can do all operations in one turn and verify known dimensions:
from einops import rearrange
y = rearrange(x, 'x 1 y z -> x z y', x=4, y=128)

Check if numpy array has a normal shape

How do I check if a numpy array has a regular shape?
In the example below x is a 2 by 3 matrix. However, y is not regular in the sense that it can't be represented as a proper matrix.
Given that I have a numpy array, is there a method (preferably built-in) that I can use to check that the numpy array is an actual matrix?
In [9]: import numpy as np
In [10]: x = np.array([[1,2,3],[4,5,6]])
In [11]: x.shape
Out[11]: (2, 3)
In [12]: y = np.array([[1,2,3],[4,5]])
In [13]: y.shape
Out[13]: (2,)
Both are arrays and those are valid shapes. But by normal, I think you mean that every element has the same shape and length. For that, a better way would be to check the datatype: in the variable-length case it is object. So we can check for that condition and report accordingly. Hence, simply do -
def is_normal_arr(a):  # a is the input array to be tested
    # Variable-length (ragged) input ends up with dtype object
    return a.dtype != np.dtype('object')
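A minimal usage sketch of the dtype check, redefining the function so the block is self-contained. One assumption worth flagging: on NumPy 1.24 and later, building an array from ragged lists raises a ValueError unless dtype=object is passed explicitly, so the ragged example below requests it.

```python
import numpy as np

def is_normal_arr(a):
    """Return True unless `a` has dtype object (e.g. built from ragged lists)."""
    return a.dtype != np.dtype(object)

x = np.array([[1, 2, 3], [4, 5, 6]])

# dtype=object is passed explicitly; recent NumPy refuses ragged input otherwise.
y = np.array([[1, 2, 3], [4, 5]], dtype=object)

print(x.shape, is_normal_arr(x))  # (2, 3) True
print(y.shape, is_normal_arr(y))  # (2,) False
```

The check also reports False for any array deliberately created with dtype object, so it is a heuristic rather than a strict test of rectangular shape.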
I think the .shape attribute is enough to check this.
If the input can form a matrix, .shape returns its actual shape, (2, 3) in your case. If the rows have different lengths, it returns something like (2,), which tells you the second dimension could not be formed, so the input is not a matrix.
Here y is a one-dimensional array of size 2 whose elements are two lists, while x is an actual matrix in the proper format.
You can also compare the number of dimensions with y.ndim and x.ndim.

Numpy slicing of a 3D matrix using a sequence `:n` is different than specifying columns `[0,1]` [duplicate]

I have a 4-D NumPy array, with axes say x,y,z,t. I want to take the slice corresponding to t=0 and to permute the order along the y axis.
I have the following
import numpy as np
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
b.shape
I get (5, 4, 3) instead of (4,5,3).
When, instead, I enter
aa = a[:,:,:,0]
bb = aa[:,[1,2,3,4,0],:]
bb.shape
I get the expected (4,5,3). Can someone explain why does the first version swap the first two dimensions?
As @hpaulj mentioned in the comments, this behaviour is because of mixing basic slicing and advanced indexing:
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
In the above code snippet, what happens is the following:
The integer 0 along the last dimension selects a single position, so that dimension is gone from the result (i.e. no singleton dimension). When it is combined with an index list, this integer also counts as an advanced index and is broadcast together with the list.
[1,2,3,4,0] returns 5 slices from the second dimension. Because the two advanced indices (the list and the 0) are separated by a slice, there is no unambiguous place to put the resulting dimension of length 5 in the returned array, so NumPy puts it first. This is why you get 5 (5, ...) in the first position of the returned shape tuple. Jaime explained this in one of the PyCon talks, if I recall correctly.
Along the first and third dimensions, since you slice everything using :, the original length along those dimensions is retained.
Putting all these together, NumPy returns the shape tuple as: (5, 4, 3)
You can read more about it at numpy-indexing-ambiguity-in-3d-arrays and arrays.indexing#combining-advanced-and-basic-indexing
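The behaviour can be reproduced and worked around as follows. This is a sketch using the array from the question; the variable names order, c and d are introduced here for illustration.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
order = [1, 2, 3, 4, 0]

# List and integer are both advanced indices, separated by a slice:
# the broadcast dimension of length 5 is placed first.
b = a[:, order, :, 0]
print(b.shape)  # (5, 4, 3)

# Indexing in two steps keeps the axes in their original order.
c = a[..., 0][:, order, :]
print(c.shape)  # (4, 5, 3)

# Using a length-one slice for t leaves the list as the only advanced index.
d = a[:, order, :, 0:1]
print(d.shape)  # (4, 5, 3, 1)

# Same data in all cases, only the axis order differs.
print(np.array_equal(b.transpose(1, 0, 2), c))  # True
```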

Reshaping array of matrices in Python

I have a Numpy array X of n 2x2 matrices, arranged so that X.shape = (2,2,n), that is, to get the first matrix I call X[:,:,0]. I would like to reshape X into an array Y such that I can get the first matrix by calling Y[0] etc., but performing X.reshape(n,2,2) messes up the matrices. How can I get it to preserve the matrices while reshaping the array?
I am essentially trying to do this:
import numpy as np
Y = np.zeros([n,2,2])
for i in range(n):
    Y[i] = X[:,:,i]
but without using the for loop. How can I do this with reshape or a similar function?
(To get an example array X, try X = np.concatenate([np.identity(2)[:,:,None]] * n, axis=2) for some n.)
numpy.moveaxis can be used to take a view of an array with one axis moved to a different position in the shape:
numpy.moveaxis(X, 2, 0)
numpy.moveaxis(a, source, destination) takes a view of array a where the axis originally at position source ends up at position destination, so numpy.moveaxis(X, 2, 0) makes the original axis 2 the new axis 0 in the view.
There's also numpy.transpose, which can be used to perform arbitrary rearrangements of an array's axes in one go if you pass it the optional second argument, and numpy.rollaxis, an older version of moveaxis with a more confusing calling convention.
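Use swapaxes:
Y = X.swapaxes(0,2)
Note that this also transposes each 2x2 matrix (Y[i] equals X[:,:,i].T), so it matches the loop above only when the matrices are symmetric.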
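The suggestions can be compared against the loop from the question. The sketch below uses non-symmetric matrices (an assumption that differs from the identity-matrix example in the question) so that any accidental transpose shows up.

```python
import numpy as np

n = 3
# Non-symmetric 2x2 matrices stacked along the last axis.
X = np.arange(4 * n).reshape(2, 2, n)

# Reference result built with the explicit loop.
Y_loop = np.zeros((n, 2, 2), dtype=X.dtype)
for i in range(n):
    Y_loop[i] = X[:, :, i]

Y_move = np.moveaxis(X, 2, 0)       # axis 2 becomes axis 0
Y_transpose = X.transpose(2, 0, 1)  # same rearrangement, written explicitly
Y_swap = X.swapaxes(0, 2)           # swaps axes 0 and 2, transposing each matrix
Y_reshape = X.reshape(n, 2, 2)      # reinterprets memory order, mixing matrices

print(np.array_equal(Y_move, Y_loop))       # True
print(np.array_equal(Y_transpose, Y_loop))  # True
print(np.array_equal(Y_swap, Y_loop))       # False
print(np.array_equal(Y_reshape, Y_loop))    # False
```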

numpy meshgrid of dynamic shape

I am trying to use numpy meshgrid to generate some arrays. I have an nd array, let us call it data, which can have an arbitrary shape, and I am trying to generate some index arrays as follows:
shape = data.shape
x = np.meshgrid[1,x-1 for x in shape]
I know the syntax looks crazy but sometimes I try things like these in python and it works! Anyway, is there a way to do this dynamic meshgrid in python? This comes back with invalid syntax error:
x = np.meshgrid[1,x-1 for x in shape]
^
SyntaxError: invalid syntax
EDIT:
I would like basically to create an array of indices. For example, I can do the following when the index always begins with 0
import numpy as np
array = np.random.rand(5, 5, 5)
shape = array.shape
indices = np.indices(x-1 for x in shape)
This creates an ndarray with indices starting from 0 to (n-1) along each of the axes of my input array. Now, I wanted to have the indexing begin from 1 and could not find a good way to do this.
EDIT:
For example, a call for an array with shape (4, 5, 6) could be something like:
x = np.meshgrid(np.arange(1,4), np.arange(1,5), np.arange(1, 6))
Going off your last example, you can do something like this:
x = np.meshgrid(*[np.arange(1, x) for x in shape])
You need to explicitly create a list of the values you want to pass to meshgrid. If you want each one to start at 1, you need to put the 1 in each call to arange. You can't do something like [1, arange(x)] and have it "distribute" the 1 through all the calls.
Then the * there expands the list into separate arguments. (See here for info.)
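A runnable sketch of the unpacking approach for an arbitrary shape. Passing indexing="ij" is an addition not present in the answer above: with the default indexing="xy", meshgrid swaps the first two axes of the output, which is usually not what is wanted for index arrays. The np.indices variant is shown as an alternative, offset by 1.

```python
import numpy as np

data = np.random.rand(4, 5, 6)  # arbitrary example shape
shape = data.shape

# One arange per axis, unpacked into separate positional arguments.
grids = np.meshgrid(*[np.arange(1, n) for n in shape], indexing="ij")
print(len(grids))      # 3
print(grids[0].shape)  # (3, 4, 5)

# Alternative: zero-based index grids from np.indices, shifted to start at 1.
alt = np.indices([n - 1 for n in shape]) + 1
print(all(np.array_equal(g, a) for g, a in zip(grids, alt)))  # True
```

If the indices should run from 1 up to and including n along each axis, np.arange(1, n + 1) would be used instead.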
