What is view method doing on numpy matrix that changes the shape? - python

So I have a function:
def create_vecs(colnames):
    return np.matrix(data[colnames]).view(dtype=np.float64).reshape(-1, 3)
When I apply this function to my data, the first part selects the columns of interest and returns a NumPy matrix of size 1340*3. But I'm not sure what view does to my data that prevents it from being reshaped into three columns. I'm confused about how the view method works and how to change the code so that I can reshape my data back to three columns.

When you write
.reshape(-1, 3)
NumPy fixes the second dimension at 3 and infers the first dimension from the total number of elements.
For example, a 1340*3 matrix has 4020 elements, so if you use
.reshape(-1, 5, 4)
the shape of the result becomes
(201, 5, 4)
because 4020 / (5*4) = 201.
I hope that was clear.
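The answer above covers reshape but not view. As a minimal sketch of one possible cause (the array below is a made-up stand-in, not the asker's data): view(dtype=...) reinterprets the same bytes without converting values, so when the new dtype has a different item size, the length of the last axis changes. astype converts the values instead and keeps the shape.

```python
import numpy as np

# Hypothetical stand-in for the data: 4 rows, 6 columns of float32.
x = np.arange(24, dtype=np.float32).reshape(4, 6)

# view() reinterprets the underlying bytes: two float32 values (4 bytes each)
# are read as one float64 (8 bytes), so the last axis shrinks from 6 to 3.
viewed = x.view(np.float64)
print(viewed.shape)     # (4, 3)

# astype() converts each value, so the shape (and the values) are preserved.
converted = x.astype(np.float64)
print(converted.shape)  # (4, 6)
```

If the stored dtype is not already float64, replacing view with astype(np.float64) may be what the function needs; that depends on how data is actually stored.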

Related

Reshape three matrices into one

Suppose test = np.array([np.eye(5), 10*np.eye(5), 15*np.eye(5)]). I have three matrices inside an array with shape (3, 5, 5). In general, how can I reshape test to merge the three matrices into one? In this specific example, I would like the shape to be (15, 5). I want a general way of doing it without something as specific as np.reshape(test, (15,5)).
You can use -1 in reshape, which makes NumPy infer that dimension from the array's total size:
test = test.reshape(-1, test.shape[-1])
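A short sketch of that answer on an array of three 5x5 matrices. The names merged and stacked are only illustrative.

```python
import numpy as np

# Three 5x5 matrices stacked along the first axis: shape (3, 5, 5).
test = np.array([np.eye(5), 10 * np.eye(5), 15 * np.eye(5)])

# -1 lets NumPy infer the row count (3 * 5 = 15); the last axis is kept.
merged = test.reshape(-1, test.shape[-1])
print(merged.shape)  # (15, 5)

# Same result as joining the three matrices one below the other.
stacked = np.concatenate(list(test), axis=0)
print(np.array_equal(merged, stacked))  # True
```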

Trace Operation in Python not Forming Correct Array Shape

I'm looking to find the trace of matrices (using NumPy) in a function I have defined in Python. The input parameters tensor and tensor_transpose are both arrays of shape (N,2,2) and are extracted from a VTK file (N is a rather large number and varies depending on the file). So both A and B are arrays of shape (N,2,2). Taking the trace of each 2x2 matrix (the sum of its diagonal terms) should return a single value per matrix. So (np.trace(A)**3)-(np.trace(B)**3) should give a single numerical value for each matrix, with the resulting array being of shape (N,1). My output does not show this, though: the returned shape is (2,).
Can anyone explain why? Is it an issue with the trace function and is there a solution?
import numpy as np
A=np.array(0.5*(tensor-tensor_transpose))
B=np.array(0.5*(tensor+tensor_transpose))
C=np.array(0.5*((np.trace(A)**3)-(np.trace(B)**3)))
print(A.shape)
print(B.shape)
print(C.shape)
#Output
#(60600, 2, 2)
#(60600, 2, 2)
#(2,)
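By default, np.trace sums along the diagonal of the first two axes (axis1=0, axis2=1), which for an (N,2,2) array leaves a result of shape (2,). You probably need to specify the last two axes instead: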
np.trace(A, axis1=1, axis2=2)
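A small sketch of the difference, using a made-up array in place of the VTK data (N and the values are assumptions for illustration only). Note that the result has shape (N,), not (N,1); a trailing axis can be added if a column is required.

```python
import numpy as np

# Hypothetical stand-in for the VTK data: N small 2x2 matrices.
N = 6
A = np.arange(N * 4, dtype=float).reshape(N, 2, 2)

# Default axes (0 and 1): the diagonal runs over A[i, i, :], leaving shape (2,).
print(np.trace(A).shape)  # (2,)

# Diagonal over the last two axes: one trace per matrix, shape (N,).
per_matrix = np.trace(A, axis1=1, axis2=2)
print(per_matrix.shape)   # (6,)

# Add a trailing axis if an (N, 1) column is required.
column = per_matrix[:, None]
print(column.shape)       # (6, 1)
```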

Numpy slicing of a 3D matrix using a sequence `:n` is different than specifying columns `[0,1]` [duplicate]

I have a 4-D NumPy array with axes, say, x, y, z, t. I want to take the slice corresponding to t=0 and permute the order along the y axis.
I have the following
import numpy as np
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
b.shape
I get (5, 4, 3) instead of (4,5,3).
When, instead, I enter
aa = a[:,:,:,0]
bb = aa[:,[1,2,3,4,0],:]
bb.shape
I get the expected (4,5,3). Can someone explain why the first version swaps the first two dimensions?
As @hpaulj mentioned in the comments, this behaviour is caused by mixing basic slicing and advanced indexing:
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
In the above code snippet, what happens is the following:
The integer 0 along the last dimension removes that dimension (i.e. no singleton dimension is left). Because it appears together with a list index, NumPy treats it as an advanced index as well.
[1,2,3,4,0] selects 5 entries from the second dimension. Since the two advanced indices are separated by a slice, there are two possibilities for where to put this length-5 dimension in the returned array: either at the first or at the last position. NumPy decided to put it at the first position. This is why you get 5 in the first position of the returned shape tuple, (5, ...). Jaime explained this in one of the PyCon talks, if I recall correctly.
Along the first and third dimensions, since you slice everything using :, the original length along those dimensions is retained.
Putting all these together, NumPy returns the shape tuple (5, 4, 3).
You can read more about it at numpy-indexing-ambiguity-in-3d-arrays and arrays.indexing#combining-advanced-and-basic-indexing
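The behaviour described above can be checked with a short sketch. The name perm is introduced here only for readability.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
perm = [1, 2, 3, 4, 0]

# The list and the integer 0 are separated by a slice, so the
# length-5 index dimension is moved to the front of the result.
b = a[:, perm, :, 0]
print(b.shape)   # (5, 4, 3)

# Indexing in two steps keeps the permuted axis in its original place.
bb = a[..., 0][:, perm, :]
print(bb.shape)  # (4, 5, 3)

# Both results hold the same data, with the first two axes swapped.
print(np.array_equal(b.transpose(1, 0, 2), bb))  # True
```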

Numpy [...,None]

I have found myself needing to add features to existing numpy arrays which has led to a question around what the last portion of the following code is actually doing:
np.ones(shape=feature_set.shape)[...,None]
Set-up
As an example, let's say I wish to solve for linear regression parameter estimates by using numpy and solving:
Assume I have a feature set of shape (50,1), a target variable of shape (50,), and I wish to use the shape of my target variable to add a column of intercept values.
It would look something like this:
# Create random target & feature set
y_train = np.random.randint(0,100, size = (50,))
feature_set = np.random.randint(0,100,size=(50,1))
# Build a set of 1s after shape of target variable
int_train = np.ones(shape=y_train.shape)[...,None]
# Able to then add int_train to feature set
X = np.concatenate((int_train, feature_set),1)
What I Think I Know
I see the difference in output when I include [...,None] vs when I leave it off. Here it is:
The second version returns an error around input arrays needing the same number of dimensions, and eventually I stumbled on the solution to use [...,None].
Main Question
While I see the output of [...,None] gives me what I want, I am struggling to find any information on what it is actually supposed to do. Can anybody walk me through what this code actually means, what the None argument is doing, etc?
Thank you!
The slice of [..., None] consists of two "shortcuts":
The ellipsis literal component:
The dots (...) represent as many colons as needed to produce a complete indexing tuple. For example, if x is a rank 5 array (i.e., it has 5 axes), then
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].
(Source)
The None component:
numpy.newaxis
The newaxis object can be used in all slicing operations to create an axis of length one. newaxis is an alias for ‘None’, and ‘None’ can be used in place of this with the same result.
(Source)
So, arr[..., None] takes an array of dimension N and "adds" a dimension "at the end" for a resulting array of dimension N+1.
Example:
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
print(x.shape) # (2, 3)
y = x[...,None]
print(y.shape) # (2, 3, 1)
z = x[:,:,np.newaxis]
print(z.shape) # (2, 3, 1)
a = np.expand_dims(x, axis=-1)
print(a.shape) # (2, 3, 1)
print((y == z).all()) # True
print((y == a).all()) # True
Consider this code:
np.ones(shape=(2,3))[...,None].shape
As you can see, [..., None] changes the (2,3) matrix into a (2,3,1) tensor: the new axis of length one is added in the LAST position.
If you use
np.ones(shape=(2,3))[None, ...].shape
the new axis is added in the FIRST position, giving (1,2,3).
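Applied to the set-up from the question, a minimal sketch of why the extra axis matters for np.concatenate. The names ones_1d, ones_2d and failed are illustrative only.

```python
import numpy as np

y_train = np.random.randint(0, 100, size=(50,))
feature_set = np.random.randint(0, 100, size=(50, 1))

ones_1d = np.ones(shape=y_train.shape)  # shape (50,)
ones_2d = ones_1d[..., None]            # shape (50, 1)

# Concatenating a 1-D array with a 2-D array fails: the inputs
# must have the same number of dimensions.
try:
    np.concatenate((ones_1d, feature_set), 1)
    failed = False
except ValueError:
    failed = True

# With the trailing axis added, both inputs are 2-D and can be joined.
X = np.concatenate((ones_2d, feature_set), 1)
print(failed, X.shape)  # True (50, 2)
```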

How to merge and split numpy array along the axis?

I have data in an array of shape
(10,4,4,3)
First, I want to create an array (by merging, or flattening) with shape
(10,48)
such that each (4,4,3) block is converted to one row.
Secondly, I want to go back to the original shape of the data (splitting), such that each element is again placed at the same location.
Thanks
b = a.reshape(10,48)
a = b.reshape(10,4,4,3)
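A quick sketch confirming the round trip, written with a.shape instead of hard-coded sizes. The sample data and the name restored are assumptions for illustration.

```python
import numpy as np

a = np.arange(10 * 4 * 4 * 3).reshape(10, 4, 4, 3)

# Merge: keep the first axis and flatten the rest (4 * 4 * 3 = 48).
b = a.reshape(a.shape[0], -1)
print(b.shape)  # (10, 48)

# Split: restore the original shape; the element order is unchanged.
restored = b.reshape(a.shape)
print(np.array_equal(restored, a))  # True
```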
