np.array ndmin argument: specify placement of added dimensions - python

I have an M-dimensional np.ndarray, where M <= N. Beyond this condition, the array may have any shape. I want to convert this array to N-dimensional, with dimensions 0 through M kept the same and dimensions M through N set to 1.
I can almost accomplish this behavior by copying the array using np.array and supplying the ndmin argument. However, this places the extra axes in the 'first' rather than the 'last' positions:
>>> a3d = np.zeros((2,3,4))
>>> a5d = np.array(a3d, ndmin = 5)
>>> a5d.shape
(1, 1, 2, 3, 4) #actual shape
(2, 3, 4, 1, 1) #desired shape
Is there a way to specify where the added dimensions should go? Is there an alternate function I can use here which can result in my desired output?
Obviously in the example above I could manipulate the array after the fact to put the axes in the order I want them, but since the original array could have had anywhere from 0 to 5 dimensions (and I want to keep the original dimensions in the original order), I can't think of a way to do that without a tedious series of checks on the original shape.

I'd use .reshape ...
>>> a3d = a3d.reshape(a3d.shape + (1, 1))
>>> a3d.shape
(2, 3, 4, 1, 1)
If you want to pad up to a certain dimensionality:
>>> a3d = np.zeros((2,3,4))
>>> ndim = 5
>>> padded_shape = (a3d.shape + (1,)*ndim)[:ndim]
>>> new_a3d = a3d.reshape(padded_shape)
>>> new_a3d.shape
(2, 3, 4, 1, 1)
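The padding step can be wrapped in a small helper. This is only a sketch: the name pad_trailing_dims is made up for illustration, and it assumes the input never has more than ndim dimensions (it raises otherwise).

```python
import numpy as np

def pad_trailing_dims(arr, ndim):
    """Reshape arr so it has ndim dimensions, adding length-1 axes at the end."""
    arr = np.asarray(arr)
    if arr.ndim > ndim:
        raise ValueError("array already has more than ndim dimensions")
    # Append as many 1s to the shape as there are missing dimensions.
    return arr.reshape(arr.shape + (1,) * (ndim - arr.ndim))

print(pad_trailing_dims(np.zeros((2, 3, 4)), 5).shape)  # (2, 3, 4, 1, 1)
print(pad_trailing_dims(np.float64(7.0), 3).shape)      # (1, 1, 1)
```

Because the number of appended axes is computed from arr.ndim, no checks on the original shape are needed, and 0-d inputs work as well.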

Just set
a5d = np.array(a3d)
a5d.shape = a3d.shape + (1, 1)
print(a5d.shape)
(2, 3, 4, 1, 1)
since the arrays are of the same physical size


Python equivalent of Matlab shiftdim()

I am currently converting some Matlab code to Python and I am wondering if there is a similar function to Matlab's shiftdim(A, n)
B = shiftdim(A,n) shifts the dimensions of an array A by n positions. shiftdim shifts the dimensions to the left when n is a positive integer and to the right when n is a negative integer. For example, if A is a 2-by-3-by-4 array, then shiftdim(A,2) returns a 4-by-2-by-3 array.
If you use numpy you can use np.moveaxis.
From the docs:
>>> x = np.zeros((3, 4, 5))
>>> np.moveaxis(x, 0, -1).shape
(4, 5, 3)
>>> np.moveaxis(x, -1, 0).shape
(5, 3, 4)
numpy.moveaxis(a, source, destination)
a: np.ndarray
The array whose axes should be reordered.
source: int or sequence of int
Original positions of the axes to move. These must be unique.
destination: int or sequence of int
Destination positions for each of the original axes.
These must also be unique.
shiftdim's function is a bit more complex than shifting axes around.
For input shiftdim(A, n), if n is positive, shift the axes to the left by n (i.e., rotate), but if n is negative, shift the axes to the right by padding with -n leading dimensions of size 1.
For input shiftdim(A), remove any leading dimensions of size 1.
from collections import deque
import numpy as np
def shiftdim(array, n=None):
    if n is not None:
        if n >= 0:
            # Rotate the axes to the left by n positions.
            axes = tuple(range(len(array.shape)))
            new_axes = deque(axes)
            new_axes.rotate(n)
            return np.moveaxis(array, axes, tuple(new_axes))
        # Negative n: prepend -n leading axes of size 1.
        return np.expand_dims(array, axis=tuple(range(-n)))
    # No n given: count and remove the leading axes of size 1.
    idx = 0
    for dim in array.shape:
        if dim != 1:
            break
        idx += 1
    axes = tuple(range(idx))
    # Note that this returns a tuple of 2 results
    return np.squeeze(array, axis=axes), len(axes)
Same examples as the Matlab docs
a = np.random.uniform(size=(4, 2, 3, 5))
print(shiftdim(a, 2).shape) # prints (3, 5, 4, 2)
print(shiftdim(a, -2).shape) # prints (1, 1, 4, 2, 3, 5)
a = np.random.uniform(size=(1, 1, 3, 2, 4))
b, nshifts = shiftdim(a)
print(nshifts) # prints 2
print(b.shape) # prints (3, 2, 4)
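If only the positive-n (rotation) case is needed, the same axis order can also be built with np.roll and passed to np.transpose. This is a sketch of an alternative, not part of the answer above; shift_left is a made-up name and it assumes n >= 0.

```python
import numpy as np

def shift_left(array, n):
    """Rotate the axes of array to the left by n positions (assumes n >= 0)."""
    # For 4 axes and n=2 the order becomes (2, 3, 0, 1).
    order = tuple(np.roll(np.arange(array.ndim), -n).tolist())
    return np.transpose(array, order)

a = np.zeros((4, 2, 3, 5))
print(shift_left(a, 2).shape)  # (3, 5, 4, 2)
```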

NumPy: Concatenating 1D array to 3D array

Suppose I have a 5x10x3 array, which I interpret as 5 'sub-arrays', each consisting of 10 rows and 3 columns. I also have a separate 1D array of length 5, which I call b.
I am trying to insert a new column into each sub-array, where the column inserted into the ith (i=0,1,2,3,4) sub-array is a 10x1 vector where each element is equal to b[i].
For example:
import numpy as np
A = np.random.rand(5,10,3)
b = np.array([2,4,6,8,10])
A[0] should look like:
A[1] should look like:
And similarly for the other 'sub-arrays'.
(Notice b[0]=2 and b[1]=4)
What about this?
# Make an array B with the same number of dimensions as A
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0) # shape: (5, 10, 1)
# Concatenate both
np.concatenate([A, B], axis=-1) # shape: (5, 10, 4)
One method would be np.pad:
np.pad(A, ((0,0),(0,0),(0,1)), 'constant', constant_values=[[[],[]],[[],[]],[[],b[:, None,None]]])
# array([[[9.36513084e-01, 5.33199169e-01, 1.66763960e-02, 2.00000000e+00],
# [9.79060284e-02, 2.17614285e-02, 4.72452812e-01, 2.00000000e+00],
# etc.
Or (more typing but probably faster):
i,j,k = A.shape
res = np.empty((i,j,k+1), np.result_type(A, b))
res[...,:-1] = A
res[...,-1] = b[:, None]
Or dstack after broadcast_to:
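The code for this variant did not survive in the post. One way it might look, assuming the A and b from the question (the exact expression the answerer used is unknown):

```python
import numpy as np

A = np.random.rand(5, 10, 3)
b = np.array([2, 4, 6, 8, 10])

# View b as shape (5, 1, 1), then broadcast it (without copying) to (5, 10, 1).
B = np.broadcast_to(b[:, None, None], A.shape[:2] + (1,))

# dstack joins 3-D arrays along their third axis.
res = np.dstack([A, B])
print(res.shape)  # (5, 10, 4)
```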

Shape of numpy array comparison with empty list

I have some problems understanding how python/numpy is casting array shapes when comparing to an empty list - which as far as I understand - is an implicit (element wise) comparison with False.
In the following example the shape decreases by one in the last dimension, if it is not greater than 1.
z = N.zeros((2,2,1))
z == []
>> array([], shape=(2, 2, 0), dtype=bool)
z2 = N.zeros((2,2,2))
z2 ==[]
>> False
If, however, I compare with False directly, I get the expected output.
z = N.zeros((2,2,1))
(z == False).shape
>> (2, 2, 1)
z2 = N.zeros((2,2,2))
(z2 == False).shape
>> (2, 2, 2)
This is ordinary broadcasting at work. When you do
z = N.zeros((2,2,1))
z == []
[] is interpreted as an array of shape (0,), and then the shapes are broadcast against each other:
(2, 2, 1)
vs (0,)
Since (0,) is shorter than (2, 2, 1), it gets expanded, as if the array were copied repeatedly:
(2, 2, 1)
vs (2, 2, 0)
and since there's a 1 in the first shape and the other shape doesn't have a 1 there, the first shape gets "expanded" as if it were copied zero times:
(2, 2, 0)
vs (2, 2, 0)
The comparison thus results in an array of booleans with shape (2, 2, 0).
When z has shape (2, 2, 2):
z2 = N.zeros((2,2,2))
z2 ==[]
broadcasting fails, since a length-2 axis and a length-0 axis can't be broadcast against each other. NumPy reports that it doesn't know how to perform the comparison:
>>> numpy.zeros([2, 2, 2]).__eq__([])
The list doesn't know how either, so Python falls back on the default comparison by identity, and gets a result of False.
When you compare against False:
z = N.zeros((2,2,1))
(z == False).shape
False gets interpreted as an array of shape () - an empty shape! That gets broadcast out to shape (2, 2, 1), as if copied out to an array full of Falses, so the result has the same shape as z.
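The shape arithmetic described above can be checked without building any comparison arrays. A small sketch, assuming a NumPy version that provides np.broadcast_shapes (1.20 or later, as far as I know):

```python
import numpy as np

# A length-1 axis stretches to match a length-0 axis.
print(np.broadcast_shapes((2, 2, 1), (0,)))  # (2, 2, 0)
# A 0-d operand such as False broadcasts to any shape.
print(np.broadcast_shapes((2, 2, 1), ()))    # (2, 2, 1)

# A length-2 axis cannot be broadcast against a length-0 axis.
try:
    np.broadcast_shapes((2, 2, 2), (0,))
except ValueError as exc:
    print("cannot broadcast:", exc)

z = np.zeros((2, 2, 1))
print((z == np.array([])).shape)  # (2, 2, 0)
print((z == False).shape)         # (2, 2, 1)
```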

Inserting newaxis at variable position in NumPy arrays

Normally, when we know where to insert the newaxis, we can write a[:, np.newaxis,...]. Is there a good way to insert the newaxis at a given axis?
Here is how I do it now. I think there must be some much better ways than this:
def addNewAxisAt(x, axis):
    _s = list(x.shape)
    _s.insert(axis, 1)
    return x.reshape(tuple(_s))

def addNewAxisAt2(x, axis):
    ind = [slice(None)]*x.ndim
    ind.insert(axis, np.newaxis)
    # recent NumPy versions require a tuple, not a list, for this kind of index
    return x[tuple(ind)]
That singleton dimension (length 1) can be inserted into the original array's shape with np.insert, changing the shape in place, like so -
x.shape = np.insert(x.shape,axis,1)
We might as well extend this to accept more than one new axis with a bit of a np.diff and np.cumsum trick, like so -
insert_idx = (np.diff(np.append(0,axis))-1).cumsum()+1
x.shape = np.insert(x.shape,insert_idx,1)
Sample runs -
In [151]: def addNewAxisAt(x, axis):
     ...:     insert_idx = (np.diff(np.append(0,axis))-1).cumsum()+1
     ...:     x.shape = np.insert(x.shape,insert_idx,1)
In [152]: A = np.random.rand(4,5)
In [153]: addNewAxisAt(A, axis=1)
In [154]: A.shape
Out[154]: (4, 1, 5)
In [155]: A = np.random.rand(5,6,8,9,4,2)
In [156]: addNewAxisAt(A, axis=5)
In [157]: A.shape
Out[157]: (5, 6, 8, 9, 4, 1, 2)
In [158]: A = np.random.rand(5,6,8,9,4,2,6,7)
In [159]: addNewAxisAt(A, axis=(1,3,4,6))
In [160]: A.shape
Out[160]: (5, 1, 6, 1, 1, 8, 1, 9, 4, 2, 6, 7)
np.insert does
slobj = [slice(None)]*ndim
slobj[axis] = slice(None, index)
new[slobj] = arr[slobj2]
Like you, it constructs a list of slices and modifies one or more elements.
apply_along_axis constructs an array and converts it to an indexing tuple
outarr[tuple(i.tolist())] = res
Other numpy functions work this way as well.
My suggestion is to make the initial list large enough to hold the None. Then I don't need to use insert:
In [1076]: x=np.ones((3,2,4),int)
In [1077]: ind=[slice(None)]*(x.ndim+1)
In [1078]: ind[2]=None
In [1080]: x[ind].shape
Out[1080]: (3, 2, 1, 4)
In [1081]: x[tuple(ind)].shape # sometimes converting a list to tuple is wise
Out[1081]: (3, 2, 1, 4)
Turns out there is a np.expand_dims
In [1090]: np.expand_dims(x,2).shape
Out[1090]: (3, 2, 1, 4)
It uses reshape like you do, but creates the new shape with tuple concatenation.
def expand_dims(a, axis):
    a = asarray(a)
    shape = a.shape
    if axis < 0:
        axis = axis + len(shape) + 1
    return a.reshape(shape[:axis] + (1,) + shape[axis:])
Timings don't tell me much about which is better. They are in the 2 µs range, where simply wrapping the code in a function makes a difference.
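For reference, a short usage sketch of np.expand_dims with the axis given as a variable. The tuple form in the last line assumes a NumPy version that accepts several axes at once (1.18 or later, to my knowledge):

```python
import numpy as np

x = np.ones((3, 2, 4), int)

print(np.expand_dims(x, 0).shape)   # (1, 3, 2, 4)
print(np.expand_dims(x, -1).shape)  # (3, 2, 4, 1)

# The axes refer to positions in the expanded result.
print(np.expand_dims(x, axis=(1, 3)).shape)  # (3, 1, 2, 1, 4)
```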

A 3-D grid of regularly spaced points

I want to create a list containing the 3-D coords of a grid of regularly spaced points, each as a 3-element tuple. I'm looking for advice on the most efficient way to do this.
In C++ for instance, I simply loop over three nested loops, one for each coordinate. In Matlab, I would probably use the meshgrid function (which would do it in one command). I've read about meshgrid and mgrid in Python, and I've also read that using numpy's broadcasting rules is more efficient. It seems to me that using the zip function in combination with the numpy broadcast rules might be the most efficient way, but zip doesn't seem to be overloaded in numpy.
Use ndindex:
import numpy as np
ind = np.ndindex(3, 3, 2)
for i in ind:
    print(i)
# (0, 0, 0)
# (0, 0, 1)
# (0, 1, 0)
# (0, 1, 1)
# (0, 2, 0)
# (0, 2, 1)
# (1, 0, 0)
# (1, 0, 1)
# (1, 1, 0)
# (1, 1, 1)
# (1, 2, 0)
# (1, 2, 1)
# (2, 0, 0)
# (2, 0, 1)
# (2, 1, 0)
# (2, 1, 1)
# (2, 2, 0)
# (2, 2, 1)
Instead of meshgrid and mgrid, you can use ogrid, which is a "sparse" version of mgrid. That is, only the dimensions along which the values change are filled in. The others are simply broadcast. This uses much less memory for large grids than the non-sparse alternatives.
For example:
>>> import numpy as np
>>> x, y = np.ogrid[-1:2, -2:3]
>>> x
array([[-1],
       [ 0],
       [ 1]])
>>> y
array([[-2, -1, 0, 1, 2]])
>>> x**2 + y**2
array([[5, 2, 1, 2, 5],
[4, 1, 0, 1, 4],
[5, 2, 1, 2, 5]])
I would say go with meshgrid or mgrid, in particular if you need non-integer coordinates. I'm surprised that Numpy's broadcasting rules would be more efficient, as meshgrid was designed especially for the problem that you want to solve.
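A sketch of the meshgrid route for the list of 3-element tuples asked for in the question. The axis ranges are made-up values for illustration, and indexing="ij" is chosen so that the x coordinate varies slowest, like the outermost loop in the C++ version:

```python
import numpy as np

# Hypothetical coordinate ranges along each axis.
xs = np.linspace(0.0, 1.0, 3)
ys = np.linspace(0.0, 1.0, 2)
zs = np.linspace(0.0, 2.0, 2)

# Three arrays of shape (3, 2, 2), one per coordinate.
X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")

# Stack into shape (3, 2, 2, 3), then flatten to one row per grid point.
points = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)

# Convert to a plain list of 3-element tuples.
coords = [tuple(p) for p in points.tolist()]
print(len(coords))  # 12
print(coords[0])    # (0.0, 0.0, 0.0)
```

If the points are only consumed by further NumPy code, keeping the (N, 3) array and skipping the conversion to tuples is likely cheaper.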
For multi-d (greater than 2) meshgrids, use numpy.lib.index_tricks.nd_grid like so:
import numpy
grid = numpy.lib.index_tricks.nd_grid()
g1 = grid[:3,:3,:3]
g2 = grid[0:1:0.5, 0:1, 0:2]
g3 = grid[0:1:3j, 0:1:2j, 0:2:2j]
where g1 has x values of [0, 1, 2],
g2 has x values of [0, 0.5],
and g3 has x values of [0.0, 0.5, 1.0] (the 3j gives the number of points instead of the step increment; see the documentation for more details).
Here's an efficient option similar to your C++ solution, which I've used for exactly the same purpose:
import numpy, itertools, collections
def grid(xmin, xmax, xstep, ymin, ymax, ystep, zmin, zmax, zstep):
    "return nested tuples of grid-sampled coordinates that include maxima"
    return collections.deque( itertools.product(
        numpy.arange(xmin, xmax+xstep, xstep).tolist(),
        numpy.arange(ymin, ymax+ystep, ystep).tolist(),
        numpy.arange(zmin, zmax+zstep, zstep).tolist() ) )
Performance is best (in my tests) when using a.tolist(), as shown above, but you can use a.flat instead and drop the deque() to get an iterator that will sip memory. Of course, you can also use a plain old tuple() or list() instead of deque() for a slight performance penalty (again, in my tests).

