Remove row from arbitrary dimension in numpy - python

I have a function, remrow, which takes as input an arbitrary numpy nd array, arr, and an integer, n. My function should remove the last row from arr in the nth dimension. For example, if I call my function like so:
remrow(arr,2)
with arr as a 3d array, then my function should return:
arr[:,:,:-1]
Similarly, if I call:
remrow(arr,1)
and arr is a 5d array, then my function should return:
arr[:,:-1,:,:,:]
My problem is this: my function must work for all shapes and sizes of arr and all compatible n. How can I do this with numpy array indexing?

Construct an indexing tuple, consisting of the desired combination of slice(None) and slice(None,-1) objects.
In [75]: arr = np.arange(24).reshape(2,3,4)
In [76]: idx = [slice(None) for _ in arr.shape]
In [77]: idx
Out[77]: [slice(None, None, None), slice(None, None, None), slice(None, None, None)]
In [78]: idx[1]=slice(None,-1)
In [79]: arr[tuple(idx)].shape
Out[79]: (2, 2, 4)
In [80]: idx = [slice(None) for _ in arr.shape]
In [81]: idx[2]=slice(None,-1)
In [82]: arr[tuple(idx)].shape
Out[82]: (2, 3, 3)
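The indexing-tuple idea above can be wrapped into the function the question asks for. This is a minimal sketch; the name remrow and its argument order come from the question, while the docstring and the sample arrays a3 and a5 are illustrative assumptions.

```python
import numpy as np

def remrow(arr, n):
    """Return arr without its last entry along axis n (a view, not a copy)."""
    idx = [slice(None)] * arr.ndim   # start with arr[:, :, ..., :]
    idx[n] = slice(None, -1)         # replace axis n with :-1
    return arr[tuple(idx)]           # index with a tuple, not a list

a3 = np.arange(24).reshape(2, 3, 4)
a5 = np.zeros((2, 3, 4, 5, 6))
print(remrow(a3, 2).shape)  # (2, 3, 3)
print(remrow(a5, 1).shape)  # (2, 2, 4, 5, 6)
```

Because only basic slicing is involved, the result shares memory with the input, so writing to it also modifies arr.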

Related

Efficient way to cast scalars to numpy arrays

When I write a function that accepts ndarray or scalar inputs
def foo(a):
    # does something to `a`
    #
    # a: `x` dimensional array or scalar
    # . . .
    cast(a, x)
    # deal with `a` as if it is an `x`-d array after this
Is there an efficient way to write that cast function? Basically what I'd want is a function that would cast:
a, a scalar to ndarray with shape ((1,)*x)
b, an ndarray with y<x dims explicitly to shape ((1,) * (x-y) + b.shape) (same as broadcasting)
c, an ndarray with x dims is unaffected
d, an ndarray with y>x dims throws an error
do it all in-place (at least when starting with an array), to prevent double memory
It seems like this functionality is repeated so often in built-in functions that there should be some shortcut for it, but I'm not finding it.
I can do a_ = np.array(a, ndmin=x, copy=False) and then assert len(a_.shape) == x, but that still makes a copy of arrays (i.e. a_.base is a is False). Is there any way around this?
asarray returns the array itself (if starting with an array):
In [271]: x=np.arange(10)
In [272]: y = np.asarray(x)
In [273]: id(x)
Out[273]: 2812424128
In [274]: id(y)
Out[274]: 2812424128 # same id
ndmin produces a view:
In [276]: y = np.array(x, ndmin=2, copy=False)
In [277]: y
Out[277]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
In [278]: id(x)
Out[278]: 2812424128
In [279]: id(y)
Out[279]: 2811135704 # different id
In [281]: x.__array_interface__['data']
Out[281]: (188551320, False)
In [282]: y.__array_interface__['data'] # same databuffer
Out[282]: (188551320, False)
ndmin on an array of the right dim already:
In [286]: x = np.arange(9).reshape(3,3)
In [287]: y = np.array(x, ndmin=2, copy=False)
In [288]: id(x)
Out[288]: 2810813120
In [289]: id(y)
Out[289]: 2810813120 # same id
For a similar discussion about astype, see "confused about the `copy` attribution of `numpy.astype`".
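One possible way to write the cast function from the question is sketched below. The name cast_to_ndim is hypothetical, and the sketch deliberately avoids the copy keyword of np.array, relying instead on np.asarray plus basic indexing, which returns a view. It assumes that raising ValueError is an acceptable way to reject inputs with too many dimensions.

```python
import numpy as np

def cast_to_ndim(a, x):
    """Hypothetical helper: view `a` as an array with exactly x dimensions."""
    a = np.asarray(a)                      # no copy if `a` is already an ndarray
    if a.ndim > x:
        raise ValueError(f"expected at most {x} dimensions, got {a.ndim}")
    # Prepend (x - a.ndim) length-1 axes; basic indexing returns a view.
    return a[(None,) * (x - a.ndim) + (Ellipsis,)]

b = np.arange(6).reshape(2, 3)
print(cast_to_ndim(5.0, 3).shape)               # (1, 1, 1)
print(cast_to_ndim(b, 3).shape)                 # (1, 2, 3)
print(np.shares_memory(cast_to_ndim(b, 3), b))  # True
```

Checking np.shares_memory (or the data pointer in __array_interface__, as in the answer above) is a more reliable test for "no copy" than comparing id values, since a view is always a new Python object.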

How to multiply a numpy array by a list to get a multidimensional array?

In Python, I have a list and a numpy array.
I would like to multiply the array by the list in such a way that I get a 3-D array whose first axis indexes the input array multiplied by each element of the list. Therefore:
in_list = [2,4,6]
in_array = np.random.rand(5,5)
result = ...
np.shape(result) ---> (3,5,5)
where (0,:,:) is the input array multiplied by the first element of the list (2);
(1,:,:) is the input array multiplied by the second element of the list (4), etc.
I have a feeling this question will be answered by broadcasting, but I'm not sure how to go about doing this.
You want np.multiply.outer. The outer method is defined for any NumPy "ufunc", including multiplication. Here's a demonstration:
In [1]: import numpy as np
In [2]: in_list = [2, 4, 6]
In [3]: in_array = np.random.rand(5, 5)
In [4]: result = np.multiply.outer(in_list, in_array)
In [5]: result.shape
Out[5]: (3, 5, 5)
In [6]: (result[1, :, :] == in_list[1] * in_array).all()
Out[6]: True
As you suggest, broadcasting gives an alternative solution: if you convert in_list to a 1d NumPy array of length 3, you can then reshape to an array of shape (3, 1, 1), and then a multiplication with in_array will broadcast appropriately:
In [9]: result2 = np.array(in_list)[:, None, None] * in_array
In [10]: result2.shape
Out[10]: (3, 5, 5)
In [11]: (result2[1, :, :] == in_list[1] * in_array).all()
Out[11]: True
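The broadcasting version can be generalized so it does not hard-code the number of trailing axes. The sketch below uses a hypothetical helper name, scale_stack, and checks the result against np.multiply.outer from the answer above.

```python
import numpy as np

def scale_stack(factors, arr):
    """Hypothetical helper: stack factors[i] * arr along a new leading axis."""
    arr = np.asarray(arr)
    factors = np.asarray(factors)
    # Reshape factors to (len, 1, ..., 1) so it broadcasts against arr.
    return factors.reshape(factors.shape + (1,) * arr.ndim) * arr

in_list = [2, 4, 6]
in_array = np.random.rand(5, 5)
result = scale_stack(in_list, in_array)
print(result.shape)  # (3, 5, 5)
print(np.allclose(result, np.multiply.outer(in_list, in_array)))  # True
```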

Inserting newaxis at variable position in NumPy arrays

Normally, when we know where we should insert the newaxis, we can do a[:, np.newaxis,...]. Is there any good way to insert the newaxis at a given axis?
Here is how I do it now. I think there must be some much better ways than this:
import numpy as np

def addNewAxisAt(x, axis):
    _s = list(x.shape)
    _s.insert(axis, 1)
    return x.reshape(tuple(_s))

def addNewAxisAt2(x, axis):
    ind = [slice(None)]*x.ndim
    ind.insert(axis, np.newaxis)
    # index with a tuple; recent NumPy rejects a list of slices here
    return x[tuple(ind)]
That singleton dimension (length 1) can be inserted into the array's shape with np.insert, and assigning the result back to x.shape changes the shape in place, like so -
x.shape = np.insert(x.shape,axis,1)
We might as well extend this to insert more than one new axis with a bit of an np.diff and np.cumsum trick, like so -
insert_idx = (np.diff(np.append(0,axis))-1).cumsum()+1
x.shape = np.insert(x.shape,insert_idx,1)
Sample runs -
In [151]: def addNewAxisAt(x, axis):
     ...:     insert_idx = (np.diff(np.append(0,axis))-1).cumsum()+1
     ...:     x.shape = np.insert(x.shape,insert_idx,1)
     ...:
In [152]: A = np.random.rand(4,5)
In [153]: addNewAxisAt(A, axis=1)
In [154]: A.shape
Out[154]: (4, 1, 5)
In [155]: A = np.random.rand(5,6,8,9,4,2)
In [156]: addNewAxisAt(A, axis=5)
In [157]: A.shape
Out[157]: (5, 6, 8, 9, 4, 1, 2)
In [158]: A = np.random.rand(5,6,8,9,4,2,6,7)
In [159]: addNewAxisAt(A, axis=(1,3,4,6))
In [160]: A.shape
Out[160]: (5, 1, 6, 1, 1, 8, 1, 9, 4, 2, 6, 7)
np.insert does
slobj = [slice(None)]*ndim
...
slobj[axis] = slice(None, index)
...
new[slobj] = arr[slobj2]
Like you, it constructs a list of slices and modifies one or more elements.
apply_along_axis constructs an array and converts it to an indexing tuple:
outarr[tuple(i.tolist())] = res
Other numpy functions work this way as well.
My suggestion is to make the initial list large enough to hold the None. Then I don't need to use insert:
In [1076]: x=np.ones((3,2,4),int)
In [1077]: ind=[slice(None)]*(x.ndim+1)
In [1078]: ind[2]=None
In [1080]: x[ind].shape
Out[1080]: (3, 2, 1, 4)
In [1081]: x[tuple(ind)].shape # sometimes converting a list to tuple is wise
Out[1081]: (3, 2, 1, 4)
Turns out there is a np.expand_dims
In [1090]: np.expand_dims(x,2).shape
Out[1090]: (3, 2, 1, 4)
It uses reshape like you do, but creates the new shape with tuple concatenation.
def expand_dims(a, axis):
    a = asarray(a)
    shape = a.shape
    if axis < 0:
        axis = axis + len(shape) + 1
    return a.reshape(shape[:axis] + (1,) + shape[axis:])
Timings don't tell me much about which is better. They are in the 2 µs range, where simply wrapping the code in a function makes a difference.
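The "list large enough to hold the None" suggestion can be written as a small function. This is a sketch only: the name add_new_axis_at is a hypothetical variant of the asker's function, and unlike the x.shape assignment approach it returns a new view and leaves the input untouched.

```python
import numpy as np

def add_new_axis_at(x, axis):
    """Sketch: return a view of x with a length-1 axis inserted at `axis`."""
    ind = [slice(None)] * (x.ndim + 1)  # one slot per axis of the result
    ind[axis] = None                    # None is np.newaxis
    return x[tuple(ind)]

x = np.ones((3, 2, 4), int)
print(add_new_axis_at(x, 2).shape)  # (3, 2, 1, 4)
print(np.expand_dims(x, 2).shape)   # (3, 2, 1, 4)
print(x.shape)                      # (3, 2, 4), the input is unchanged
```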

Python: numpy shape confusion

I have a numpy array:
>>> type(myArray1)
Out[14]: numpy.ndarray
>>> myArray1.shape
Out[13]: (500,)
I have another array:
>>> type(myArray2)
Out[14]: numpy.ndarray
>>> myArray2.shape
Out[13]: (500,1)
( 1 ) What is the difference between (500,) and (500,1) ?
( 2 ) How do I change (500,) to (500,1)
(1) The difference between (500,) and (500,1) is that the first is the shape of a one-dimensional array, while the second is the shape of a 2-dimensional array whose 2nd dimension has length 1. This may be confusing at first since other languages don't make that distinction.
(2) You can use np.reshape to do that:
myArray1.reshape(-1,1).
You can also add a dimension to your array using np.expand_dims: np.expand_dims(myArray1, axis = 1).
The difference between (500,) and (500,1) is the number of dimensions (the first one is "totally flat").
You can try it by yourself:
import numpy as np
arr = np.array([i for i in range(250)])
arr.shape
# (250,)
new_arr = np.array([i for i in range(250)], ndmin=2).T
new_arr.shape
# (250, 1)
# You can also reshape it directly:
arr.shape = (250, 1)
# And look at the result:
arr
# array([[ 0],
# [ 1],
# [ 2],
# [ 3],
# [ 4],
# (...)
Also try reversing the shape, e.g. (1, 500) instead of (500, 1).
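One practical consequence of the distinction shows up in broadcasting. The short sketch below uses illustrative names (a, col, row) that are not from the question and compares a 1-D array with its column and row forms.

```python
import numpy as np

a = np.arange(500)           # shape (500,)
col = a.reshape(-1, 1)       # shape (500, 1), same as a[:, None]
row = a.reshape(1, -1)       # shape (1, 500)

print(a.shape, col.shape, row.shape)  # (500,) (500, 1) (1, 500)
print((a + a).shape)                  # (500,)
print((a + col).shape)                # (500, 500)
print(np.shares_memory(a, col))       # True
```

Adding a (500,) array to a (500, 1) array broadcasts to (500, 500) rather than adding element by element, which is a common source of surprises when the two shapes are mixed.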

Index Numpy tensor without having to reshape

I have a tensor with the shape (5,48,15). How can I access an element along the 0th axis and still maintain 3 dimensions without needing to reshape? For example:
x.shape # this is (5,48,15)
m = x[0,:,:]
m.shape # This is (48,15)
m_new = m.reshape(1,48,15)
m_new.shape # This is now (1,48,15)
Is this possible without needing to reshape?
When you index an axis with a single integer, as with x[0, :, :], the dimensionality of the returned array drops by one.
To keep three dimensions, you can either...
insert a new axis at the same time as indexing:
>>> x[None, 0, :, :].shape
(1, 48, 15)
or use slicing:
>>> x[:1, :, :].shape
(1, 48, 15)
or use fancy indexing:
>>> x[[0], :, :].shape
(1, 48, 15)
The selection index needs to be a slice or list (or array):
m = x[[0],:,:]
m = x[:1,:,:]
m = x[0:1,:,:]
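The slicing variant extends to any axis by building the index tuple programmatically. The helper name take_keepdims below is hypothetical, and the sketch assumes a single integer index i.

```python
import numpy as np

def take_keepdims(x, i, axis=0):
    """Hypothetical helper: select index i along `axis` without dropping it."""
    idx = [slice(None)] * x.ndim
    # i + 1 is 0 when i == -1, so fall back to None to slice to the end.
    idx[axis] = slice(i, i + 1 or None)
    return x[tuple(idx)]

x = np.zeros((5, 48, 15))
print(take_keepdims(x, 0).shape)           # (1, 48, 15)
print(take_keepdims(x, -1, axis=2).shape)  # (5, 48, 1)
```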
