numpy meshgrid of dynamic shape - python

I am trying to use numpy meshgrid to generate some arrays. I have an n-dimensional array, call it data, of arbitrary shape, and I am trying to generate index arrays for it as follows:
shape = data.shape
x = np.meshgrid[1,x-1 for x in shape]
I know the syntax looks crazy but sometimes I try things like these in python and it works! Anyway, is there a way to do this dynamic meshgrid in python? This comes back with invalid syntax error:
x = np.meshgrid[1,x-1 for x in shape]
^
SyntaxError: invalid syntax
EDIT:
I would like basically to create an array of indices. For example, I can do the following when the index always begins with 0
import numpy as np
array = np.random.rand(5, 5, 5)
shape = array.shape
indices = np.indices(x-1 for x in shape)
This creates an ndarray with indices starting from 0 to (n-1) along each of the axes of my input array. Now, I wanted to have the indexing begin from 1 and could not find a good way to do this.
EDIT:
For example, a call for an array with shape (4, 5, 6) could be something like:
x = np.meshgrid(np.arange(1,4), np.arange(1,5), np.arange(1, 6))

Going off your last example, you can do something like this:
x = np.meshgrid(*[np.arange(1, x) for x in shape])
You need to explicitly create a list of the values you want to pass to meshgrid. If you want each one to start at 1, you need to put the 1 in each call to arange. You can't do something like [1, arange(x)] and have it "distribute" the 1 through all the calls.
Then the * expands the list into separate positional arguments (argument unpacking), so meshgrid receives one array per axis.
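A minimal sketch of the approach above for an array of arbitrary shape. The stand-in array `data` and the use of `indexing="ij"` are assumptions added for illustration; the default `"xy"` indexing swaps the first two output axes, which is usually not what you want when the grids should line up with the array's own axes.

```python
import numpy as np

data = np.zeros((4, 5, 6))   # stand-in for an array of arbitrary shape
shape = data.shape

# One 1-based coordinate vector per axis: 1..n-1 for an axis of length n,
# matching the np.arange(1, 4), np.arange(1, 5), ... call in the question.
axes = [np.arange(1, n) for n in shape]

# indexing="ij" keeps the output axes in the same order as the inputs.
grids = np.meshgrid(*axes, indexing="ij")

print(len(grids))        # 3, one grid per axis of data
print(grids[0].shape)    # (3, 4, 5)

# The same values from np.indices, shifted so that counting starts at 1.
shifted = np.indices([n - 1 for n in shape]) + 1
print(np.array_equal(np.array(grids), shifted))   # True
```

If the indices should run from 1 to n inclusive instead, `np.arange(1, n + 1)` (or `np.indices(shape) + 1`) would be the corresponding change.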

Related

Append numpy one dimensional arrays does not lead to a matrix

I am trying to get a 2d array, by randomly generating its rows and appending
import numpy as np
my_nums = np.array([])
for i in range(100):
    x = np.random.rand(2, 1)
    my_nums = np.append(my_nums, np.array(x))
But I do not get what I want but instead get a 1d array.
What is wrong?
Transposing x did not help either.
You could do this by using np.append(axis=0) or np.vstack. This however requires the rows appended to have the same length as the rows already in the array.
You cannot use the same code to append a row with two values to an empty array, and to append a row to an already existing 2D array: numpy will throw a
ValueError: all the input arrays must have same number of dimensions.
You could initialize my_nums to work around this:
my_nums = np.random.rand(1, 2)
for i in range(99):
    x = np.random.rand(1, 2)
    my_nums = np.append(my_nums, x, axis=0)
Note the decrease in the range by one due to the initialization row. Also note that I changed the dimensions to (1, 2) to get actual row vectors.
Of course, it is much easier (and faster) to create the array in its final shape directly instead of appending row by row:
my_nums = np.random.rand(100, 2)
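When the rows really do have to be produced one at a time, a common pattern is to collect them in a Python list and stack once at the end. The sketch below first shows why the question's loop produced a 1-D array, then the list-based alternative; the variable names `flat` and `rows` are illustrative only.

```python
import numpy as np

# Without axis=..., np.append flattens both inputs, which is why the
# question's loop ends up with a 1-D array.
flat = np.append(np.array([]), np.random.rand(2, 1))
print(flat.shape)        # (2,)

# Collect the rows in a plain list and stack them once at the end.
rows = [np.random.rand(2) for _ in range(100)]
my_nums = np.vstack(rows)
print(my_nums.shape)     # (100, 2)
```

Stacking once avoids copying the whole array on every iteration, which is what repeated `np.append` calls do.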

Numpy [...,None]

I have found myself needing to add features to existing numpy arrays which has led to a question around what the last portion of the following code is actually doing:
np.ones(shape=feature_set.shape)[...,None]
Set-up
As an example, let's say I wish to solve for linear regression parameter estimates by using numpy and solving:
Assume I have a feature set shape (50,1), a target variable of shape (50,), and I wish to use the shape of my target variable to add a column for intercept values.
It would look something like this:
import numpy as np
# Create random target & feature set
y_train = np.random.randint(0,100, size = (50,))
feature_set = np.random.randint(0,100,size=(50,1))
# Build a set of 1s after shape of target variable
int_train = np.ones(shape=y_train.shape)[...,None]
# Able to then add int_train to feature set
X = np.concatenate((int_train, feature_set),1)
What I Think I Know
I see the difference in output when I include [...,None] vs when I leave it off. Here it is:
The second version returns an error around input arrays needing the same number of dimensions, and eventually I stumbled on the solution to use [...,None].
Main Question
While I see the output of [...,None] gives me what I want, I am struggling to find any information on what it is actually supposed to do. Can anybody walk me through what this code actually means, what the None argument is doing, etc?
Thank you!
The slice of [..., None] consists of two "shortcuts":
The ellipsis literal component:
The dots (...) represent as many colons as needed to produce a complete indexing tuple. For example, if x is a rank 5 array (i.e., it has 5 axes), then
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].
(Source)
The None component:
numpy.newaxis
The newaxis object can be used in all slicing operations to create an axis of length one. newaxis is an alias for ‘None’, and ‘None’ can be used in place of this with the same result.
(Source)
So, arr[..., None] takes an array of dimension N and "adds" a dimension "at the end" for a resulting array of dimension N+1.
Example:
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
print(x.shape) # (2, 3)
y = x[...,None]
print(y.shape) # (2, 3, 1)
z = x[:,:,np.newaxis]
print(z.shape) # (2, 3, 1)
a = np.expand_dims(x, axis=-1)
print(a.shape) # (2, 3, 1)
print((y == z).all()) # True
print((y == a).all()) # True
Consider this code:
np.ones(shape=(2,3))[...,None].shape
As you can see, None turns the (2, 3) matrix into a (2, 3, 1) tensor: the new length-1 axis is added in the LAST position.
If you use
np.ones(shape=(2,3))[None, ...].shape
the new length-1 axis is added in the FIRST position instead, giving shape (1, 2, 3).
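Applied to the intercept-column case from the question, the effect can be sketched as follows. The intermediate name `ones_1d` is introduced here only to make the shapes visible.

```python
import numpy as np

y_train = np.random.randint(0, 100, size=(50,))
feature_set = np.random.randint(0, 100, size=(50, 1))

ones_1d = np.ones(shape=y_train.shape)   # shape (50,)
int_train = ones_1d[..., None]           # shape (50, 1)

# Both inputs are now 2-D, so they can be joined column-wise.
X = np.concatenate((int_train, feature_set), axis=1)
print(ones_1d.shape, int_train.shape, X.shape)   # (50,) (50, 1) (50, 2)

# None can go at any position; each one inserts a length-1 axis there.
a = np.ones((2, 3))
print(a[None, ...].shape)     # (1, 2, 3)
print(a[:, None, :].shape)    # (2, 1, 3)
print(a[..., None].shape)     # (2, 3, 1)
```

Without the `[..., None]`, `np.concatenate` would be asked to join a 1-D array with a 2-D one, which is the dimension mismatch error described in the question.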

Reshaping array of matrices in Python

I have a Numpy array X of n 2x2 matrices, arranged so that X.shape = (2,2,n), that is, to get the first matrix I call X[:,:,0]. I would like to reshape X into an array Y such that I can get the first matrix by calling Y[0] etc., but performing X.reshape(n,2,2) messes up the matrices. How can I get it to preserve the matrices while reshaping the array?
I am essentially trying to do this:
import numpy as np
Y = np.zeros([n,2,2])
for i in range(n):
    Y[i] = X[:,:,i]
but without using the for loop. How can I do this with reshape or a similar function?
(To get an example array X, try X = np.concatenate([np.identity(2)[:,:,None]] * n, axis=2) for some n.)
numpy.moveaxis can be used to take a view of an array with one axis moved to a different position in the shape:
numpy.moveaxis(X, 2, 0)
numpy.moveaxis(a, source, destination) takes a view of array a where the axis originally at position source ends up at position destination, so numpy.moveaxis(X, 2, 0) makes the original axis 2 the new axis 0 in the view.
There's also numpy.transpose, which can be used to perform arbitrary rearrangements of an array's axes in one go if you pass it the optional second argument, and numpy.rollaxis, an older version of moveaxis with a more confusing calling convention.
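Use swapaxes (note that this also transposes each 2x2 matrix, since Y[i] equals X[:,:,i].T, so it matches the loop only when the matrices are symmetric):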
Y = X.swapaxes(0,2)
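To compare the options discussed here, the sketch below builds a small test array of deliberately non-symmetric matrices (hypothetical data, chosen so that any accidental transposition shows up) and checks each candidate against the loop from the question.

```python
import numpy as np

n = 4
# Hypothetical test data: matrix i is [[i, i+10], [i+20, i+30]].
X = np.stack([np.array([[i, i + 10], [i + 20, i + 30]]) for i in range(n)], axis=2)

# Reference result from the loop in the question.
Y_loop = np.zeros((n, 2, 2))
for i in range(n):
    Y_loop[i] = X[:, :, i]

Y_move = np.moveaxis(X, 2, 0)     # axis 2 becomes axis 0
Y_trans = X.transpose(2, 0, 1)    # same rearrangement via transpose
Y_swap = X.swapaxes(0, 2)         # also swaps rows and columns of each matrix
Y_bad = X.reshape(n, 2, 2)        # reinterprets the flat memory order

print(np.array_equal(Y_move, Y_loop))            # True
print(np.array_equal(Y_trans, Y_loop))           # True
print(np.array_equal(Y_swap, Y_loop))            # False
print(np.array_equal(Y_swap[1], X[:, :, 1].T))   # True
print(np.array_equal(Y_bad, Y_loop))             # False
```

`moveaxis` and `transpose` only rearrange axes, so each matrix stays intact, whereas `reshape` keeps the flat element order and cuts it into new 2x2 blocks.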

My numpy array always ends in zero?

I think I missed something somewhere. I filled a numpy array using two for loops (x and y) and a function based on the x,y position. The only problem is that the last row and column of the array are always zero, regardless of the size of the array.
thetamap = numpy.zeros(36, dtype=float)
thetamap.shape = (6, 6)
for y in range(0,5):
    for x in range(0,5):
        thetamap[x][y] = x+y
print(thetamap)
range(0, 5) produces 0, 1, 2, 3, 4. The endpoint is always omitted. You want simply range(6).
Better yet, use the awesome power of NumPy to make the array in one line:
thetamap = np.arange(6) + np.arange(6)[:,None]
This makes a row vector and a column vector, then adds them together using NumPy broadcasting to make a matrix.
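A short sketch of the broadcasting step, checked against the explicit loops with the corrected `range(6)`. The names `row`, `col`, and `expected` are illustrative.

```python
import numpy as np

row = np.arange(6)             # shape (6,)
col = np.arange(6)[:, None]    # shape (6, 1)

# Broadcasting stretches both operands to (6, 6): entry [y, x] is y + x.
thetamap = row + col

# Same result with explicit loops over the full range(6).
expected = np.zeros((6, 6))
for y in range(6):
    for x in range(6):
        expected[x][y] = x + y

print(thetamap.shape)                       # (6, 6)
print(np.array_equal(thetamap, expected))   # True
print(thetamap[5, 5])                       # 10
```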

Convert a list of 2D numpy arrays to one 3D numpy array?

I have a list of several hundred 10x10 arrays that I want to stack together into a single Nx10x10 array. At first I tried a simple
newarray = np.array(mylist)
But that returned with "ValueError: setting an array element with a sequence."
Then I found the online documentation for dstack(), which looked perfect: "...This is a simple way to stack 2D arrays (images) into a single 3D array for processing." Which is exactly what I'm trying to do. However,
newarray = np.dstack(mylist)
tells me "ValueError: array dimensions must agree except for d_0", which is odd because all my arrays are 10x10. I thought maybe the problem was that dstack() expects a tuple instead of a list, but
newarray = np.dstack(tuple(mylist))
produced the same result.
At this point I've spent about two hours searching here and elsewhere to find out what I'm doing wrong and/or how to go about this correctly. I've even tried converting my list of arrays into a list of lists of lists and then back into a 3D array, but that didn't work either (I ended up with lists of lists of arrays, followed by the "setting array element as sequence" error again).
Any help would be appreciated.
newarray = np.dstack(mylist)
should work. For example:
import numpy as np
# Here is a list of five 10x10 arrays:
x = [np.random.random((10,10)) for _ in range(5)]
y = np.dstack(x)
print(y.shape)
# (10, 10, 5)
# To get the shape to be Nx10x10, you could use rollaxis:
y = np.rollaxis(y,-1)
print(y.shape)
# (5, 10, 10)
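As an alternative sketch, `np.stack` (available in NumPy 1.10 and later) joins the arrays along a new leading axis, which gives the Nx10x10 layout directly. The shape check at the top is an added suggestion: errors like the ones in the question typically appear when at least one list element does not have the expected shape.

```python
import numpy as np

mylist = [np.random.random((10, 10)) for _ in range(5)]

# Every element should report the same shape before stacking.
print({a.shape for a in mylist})      # {(10, 10)}

stacked = np.stack(mylist)            # new leading axis: (5, 10, 10)
via_dstack = np.moveaxis(np.dstack(mylist), -1, 0)

print(stacked.shape)                          # (5, 10, 10)
print(np.array_equal(stacked, via_dstack))    # True
print(np.array_equal(stacked[2], mylist[2]))  # True
```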
np.dstack returns a new array. Thus, using np.dstack requires as much additional memory as the input arrays. If you are tight on memory, an alternative to np.dstack which requires less memory is to allocate space for the final array first, and then pour the input arrays into it one at a time.
For example, if you had 58 arrays of shape (159459, 2380), then you could use
y = np.empty((159459, 2380, 58))
for i in range(58):
    # instantiate the input arrays one at a time
    x = np.random.random((159459, 2380))
    # copy x into y
    y[..., i] = x
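The same preallocation idea can be sketched at a small scale, filling an Nx10x10 array along its first axis. The function `load_array` is hypothetical and stands in for however each input array is actually produced or read.

```python
import numpy as np

def load_array(i):
    """Hypothetical loader; stands in for reading array i from disk."""
    return np.full((10, 10), float(i))

n_arrays = 5
out = np.empty((n_arrays, 10, 10))
for i in range(n_arrays):
    out[i] = load_array(i)   # only one input array is held at a time

print(out.shape)       # (5, 10, 10)
print(out[3, 0, 0])    # 3.0
```

Filling along the first axis writes each input into one contiguous block of the output, and no axis rearrangement is needed afterwards.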
