Suppose test = np.array([np.eye(5), 10*np.eye(5), 15*np.eye(5)]). I have three matrices inside an array with shape (3, 5, 5). In general, how can I reshape test to stack the three matrices into one? In this specific example, I would like the shape to be (15, 5). I want a general way of doing it, without hard-coding something as specific as np.reshape(test, (15, 5)).
You can pass -1 to reshape; NumPy then infers that dimension from the array's total size and the other dimensions:
test = test.reshape(-1, test.shape[-1])
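As a quick check, here is a minimal sketch of that call applied to three stacked 5x5 matrices like the ones in the question. It only assumes NumPy is imported as np; the name stacked is introduced here for illustration.

```python
import numpy as np

# Three 5x5 matrices stacked along the first axis: shape (3, 5, 5).
test = np.array([np.eye(5), 10 * np.eye(5), 15 * np.eye(5)])

# -1 lets NumPy infer the first dimension (3 * 5 = 15);
# the last dimension is kept as it is.
stacked = test.reshape(-1, test.shape[-1])
print(stacked.shape)  # (15, 5)
```

Because only the last dimension is named explicitly, the same line works for any number of matrices of any size.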
So I have a function:
def create_vecs(colnames):
    return np.matrix(data[colnames]).view(dtype=np.float64).reshape(-1, 3)
When I apply this function to my data, the first part selects the columns of interest and returns a NumPy matrix of shape (1340, 3). But I'm not sure what view does to my data such that it can no longer be reshaped to three columns. I'm confused about how the view method works and how to change the code so that I can reshape my data back to three columns.
When you write:
.reshape(-1, 3)
NumPy reshapes the array so that the second axis has length 3, and the first axis takes whatever length is needed to hold the remaining elements.
For example, a 1340*3 array holds 4020 elements,
so if you use
.reshape(-1, 5, 4)
the shape of the result becomes
(201, 5, 4)
because 4020 / (5 * 4) = 201. I hope that was clear.
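To make the arithmetic concrete, the sketch below reshapes a stand-in array of 4020 elements (the real data from the question is not shown, so np.arange is an assumption). It also illustrates one possible cause of the original problem: view reinterprets the raw bytes under a new dtype, which changes the length of the last axis when the item sizes differ, whereas astype converts the values and keeps the shape. The float32 input is likewise an assumption, not something the question states.

```python
import numpy as np

# Stand-in for the 1340*3 = 4020 values in the question.
flat = np.arange(4020)
print(flat.reshape(-1, 3).shape)     # (1340, 3)
print(flat.reshape(-1, 5, 4).shape)  # (201, 5, 4)

# Assumed float32 input: each element is 4 bytes.
small = np.ones((2, 4), dtype=np.float32)

# view reads every pair of 4-byte values as one 8-byte value,
# so the last axis shrinks from 4 to 2.
reinterpreted = small.view(np.float64)

# astype converts each value, so the shape is unchanged.
converted = small.astype(np.float64)

print(reinterpreted.shape)  # (2, 2)
print(converted.shape)      # (2, 4)
```

If the column count changes after view, replacing it with astype(np.float64) may be what was intended, depending on the dtype of the underlying data.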
So I have a 4d tensor with shape [4,1,128,678] and I would like to view/reshape it as [4,678,128].
I have to do this for multiple tensors where the last shape value (678) is not always known and could be different, so [4,1,128,575] should also go to [4,575,128].
Any idea what the optimal operation is to transform the tensor? view or reshape? And how?
Thanks
You could also use (less to write and IMO cleaner):
# x.shape == (4, 1, 128, 678)
x.squeeze().permute(0, 2, 1)
If you were to use view you would lose dimension information (but maybe that is what you want), in this case it would be:
x.squeeze().view(4, -1, 128)
permute reorders the tensor's dimensions, while view only gives a different view of the same data without restructuring the underlying memory. You can see the difference between those two operations in this StackOverflow answer.
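The difference is easy to see on a small array. The sketch below uses NumPy rather than PyTorch as a stand-in, with transpose playing the role of permute and reshape playing the role of view; the small shape (2, 1, 3, 4) is chosen only for illustration.

```python
import numpy as np

# Small analogue of the (4, 1, 128, 678) tensor.
x = np.arange(2 * 1 * 3 * 4).reshape(2, 1, 3, 4)

squeezed = x.squeeze(axis=1)             # shape (2, 3, 4)
permuted = squeezed.transpose(0, 2, 1)   # swaps the last two axes
viewed = squeezed.reshape(2, -1, 3)      # same memory order, new shape

print(permuted.shape, viewed.shape)      # (2, 4, 3) (2, 4, 3)
print(permuted[0, 0])                    # [0 4 8]
print(viewed[0, 0])                      # [0 1 2]
print(np.array_equal(permuted, viewed))  # False
```

Both results have the same shape, but only the transposed one keeps each value attached to its original pair of indices, which is usually what is wanted when swapping two dimensions.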
Use einops instead; it can do all of these operations in one step and verify known dimensions:
from einops import rearrange
y = rearrange(x, 'x 1 y z -> x z y', x=4, y=128)
I am looking for an elegant way to flatten an array of arbitrary shape to a matrix based on a single parameter that specifies the dimension to retain. For illustration, I would like
def my_func(input, dim):
    # code to compute output
    return output
Given, for example, an input array of shape 2x3x4, output should be for dim=0 an array of shape 12x2; for dim=1 an array of shape 8x3; for dim=2 an array of shape 6x4. If I want to retain only the last dimension, then this is easily accomplished by
input.reshape(-1, input.shape[-1])
But I would like to support an arbitrary dim (elegantly, without going through all possible cases, checking with if conditions, etc.). It might be possible by first swapping dimensions so that the dimension of interest is trailing, and then applying the operation above.
Any help?
We can permute axes and reshape -
# a is input array; axis is input axis/dim
np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis])
Functionally, it's basically pushing the specified axis to the back and then reshaping keeping that axis length to form the second axis and merging rest of the axes to form the first axis.
Sample runs -
In [32]: a = np.random.rand(2,3,4)
In [33]: axis = 0
In [34]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[34]: (12, 2)
In [35]: axis = 1
In [36]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[36]: (8, 3)
In [37]: axis = 2
In [38]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[38]: (6, 4)
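Wrapped up as a function in the shape the question asks for, the same idea might look like the sketch below. The name flatten_keep_axis is hypothetical, and the small integer array is used only so that the rows are easy to inspect.

```python
import numpy as np

def flatten_keep_axis(a, axis):
    """Return a 2-D array whose rows are the 1-D slices of `a` taken along `axis`."""
    # Move the chosen axis to the end, then merge all other axes into one.
    moved = np.moveaxis(a, axis, -1)
    return moved.reshape(-1, a.shape[axis])

a = np.arange(24).reshape(2, 3, 4)
for axis in range(a.ndim):
    print(axis, flatten_keep_axis(a, axis).shape)

out = flatten_keep_axis(a, 1)
print(out[0])  # [0 4 8], the same values as a[0, :, 0]
```

Negative axis values work as well, since np.moveaxis and shape indexing both accept them.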
I have found myself needing to add features to existing numpy arrays which has led to a question around what the last portion of the following code is actually doing:
np.ones(shape=feature_set.shape)[...,None]
Set-up
As an example, let's say I wish to solve for linear regression parameter estimates by using numpy and solving:
Assume I have a feature set shape (50,1), a target variable of shape (50,), and I wish to use the shape of my target variable to add a column for intercept values.
It would look something like this:
# Create random target & feature set
y_train = np.random.randint(0,100, size = (50,))
feature_set = np.random.randint(0,100,size=(50,1))
# Build a set of 1s after shape of target variable
int_train = np.ones(shape=y_train.shape)[...,None]
# Able to then add int_train to feature set
X = np.concatenate((int_train, feature_set),1)
What I Think I Know
I see the difference in output when I include [...,None] vs when I leave it off. Here it is:
The second version returns an error around input arrays needing the same number of dimensions, and eventually I stumbled on the solution to use [...,None].
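The comparison can be reproduced with a short sketch along the following lines. The names ones_1d, ones_2d and failed are introduced here for illustration, and the exact wording of the error message may vary between NumPy versions.

```python
import numpy as np

y_train = np.random.randint(0, 100, size=(50,))
feature_set = np.random.randint(0, 100, size=(50, 1))

ones_1d = np.ones(shape=y_train.shape)             # without [..., None]
ones_2d = np.ones(shape=y_train.shape)[..., None]  # with [..., None]
print(ones_1d.shape, ones_2d.shape)                # (50,) (50, 1)

# Joining a 1-D array with a 2-D array along axis 1 raises an error...
try:
    np.concatenate((ones_1d, feature_set), 1)
    failed = False
except ValueError as err:
    failed = True
    print("concatenate failed:", err)

# ...while two 2-D arrays with the same number of rows can be joined.
X = np.concatenate((ones_2d, feature_set), 1)
print(X.shape)  # (50, 2)
```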
Main Question
While I see the output of [...,None] gives me what I want, I am struggling to find any information on what it is actually supposed to do. Can anybody walk me through what this code actually means, what the None argument is doing, etc?
Thank you!
The slice of [..., None] consists of two "shortcuts":
The ellipsis literal component:
The dots (...) represent as many colons as needed to produce a complete indexing tuple. For example, if x is a rank 5 array (i.e., it has 5 axes), then
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].
(Source)
The None component:
numpy.newaxis
The newaxis object can be used in all slicing operations to create an axis of length one. newaxis is an alias for ‘None’, and ‘None’ can be used in place of this with the same result.
(Source)
So, arr[..., None] takes an array of dimension N and "adds" a dimension "at the end" for a resulting array of dimension N+1.
Example:
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
print(x.shape) # (2, 3)
y = x[...,None]
print(y.shape) # (2, 3, 1)
z = x[:,:,np.newaxis]
print(z.shape) # (2, 3, 1)
a = np.expand_dims(x, axis=-1)
print(a.shape) # (2, 3, 1)
print((y == z).all()) # True
print((y == a).all()) # True
Consider this code:
np.ones(shape=(2,3))[...,None].shape
As you can see, None turns the (2, 3) matrix into a (2, 3, 1) array: the new length-1 axis is added in the LAST position.
If you use
np.ones(shape=(2,3))[None, ...].shape
the new axis is added in the FIRST position, giving (1, 2, 3).
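A short sketch of where the new axis ends up for different positions of None; the middle-axis case is an extra illustration that is not part of the answer above.

```python
import numpy as np

m = np.ones(shape=(2, 3))

print(m[..., None].shape)   # (2, 3, 1): new axis last
print(m[None, ...].shape)   # (1, 2, 3): new axis first
print(m[:, None, :].shape)  # (2, 1, 3): new axis in the middle
```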
So I am a little new to using matrices in Python, and I am looking for the best way to perform the following operation.
Say I have a vector of an arbitrary length, like this:
data = np.array(range(255))
And I want to fit this data inside a matrix with a shape like so:
concept = np.zeros((3, 9, 6))
Now, obviously this will not fit, and results in an error:
ValueError: cannot reshape array of size 255 into shape (3,9,6)
What would be the best way to go about fitting as much of the data vector inside the first matrix with the shape (3, 9, 6) while making sure any "overflow" is stored in a second (or third, fourth, etc.) matrix?
Does this make sense?
Basically, I want to be able to take a vector of any size and produce an arbitrary amount of matrices that have the data shaped according to the 3, 9, 6 dimensions.
Thank you for your help.
import numpy as np

def each_matrix(a, dims):
    size = dims.prod()
    # Pad with size-1 zeros so the final, partial chunk can be completed.
    padded = np.concatenate([a, np.zeros(size - 1)])
    # Integer division: range() requires an int in Python 3.
    for i in range(len(padded) // size):
        yield padded[i*size : (i+1)*size].reshape(dims)

for matrix in each_matrix(np.array(range(255)),
                          dims=np.array([3, 9, 6])):
    print(str(matrix) + '\n\n-------\n')
This pads the last matrix with zeros.
Here is a rough solution to your problem.
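import numpy as np

def split_padded(a, n):
    # Zeros needed to reach the next multiple of n (0 if already a multiple).
    padding = -len(a) % n
    # Number of length-n chunks after padding (assumes a is non-empty).
    numOfsplit = (len(a) + padding) // n
    print(padding, numOfsplit)
    return np.split(np.concatenate((a, np.zeros(padding))), numOfsplit)

data = np.array(range(255))
splitnum = 3*9*6
splitdata = split_padded(data, splitnum)
for mat in splitdata:
    print(mat.reshape(3, 9, 6))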
It is very rough and only works for 1D input arrays.
First, we calculate the number of zeros needed for padding (padding), then the number of matrices we can get out of the input data (numOfsplit), and finally do the splitting in the last line.
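The same pad-then-split idea can also be written without an explicit loop, by padding to a multiple of the block size and letting reshape infer the number of blocks. This is a sketch under the same assumption of 1D input; the function name chunk_into_matrices is hypothetical.

```python
import numpy as np

def chunk_into_matrices(a, dims):
    """Zero-pad 1-D `a` and reshape it into a stack of arrays of shape `dims`."""
    size = int(np.prod(dims))
    # Zeros needed to reach the next multiple of size (0 if already a multiple).
    padding = -len(a) % size
    padded = np.pad(a, (0, padding), mode="constant")
    # -1 lets NumPy infer how many blocks of shape dims are needed.
    return padded.reshape(-1, *dims)

data = np.arange(255)
stack = chunk_into_matrices(data, (3, 9, 6))
print(stack.shape)  # (2, 3, 9, 6)
```

Here stack[0] holds the first 162 values and stack[1] holds the remaining 93 followed by zeros.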