So I've created a numpy array:
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
I'm trying to delete the end element of this array's subarray:
a[0] = (a[0])[:-1]
And encounter this issue:
a[0] = (a[0])[:-1]
ValueError: could not broadcast input array from shape (2) into shape (3)
Why can't I change it?
How do I do it?
Given:
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
You can do:
>>> a[:,0:2]
array([[1, 2],
[4, 5],
[7, 8]])
Or:
>>> np.delete(a,2,1)
array([[1, 2],
[4, 5],
[7, 8]])
Then in either case, assign that back to a since the result is a new array.
So:
>>> a=a[:,0:2]
>>> a
array([[1, 2],
[4, 5],
[7, 8]])
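One difference between the two options may matter before you assign back: basic slicing returns a view onto the original data, while np.delete builds an independent copy. A minimal sketch that checks this with np.shares_memory (the names sliced and deleted are only for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

sliced = a[:, 0:2]            # basic slicing: a view onto a's buffer
deleted = np.delete(a, 2, 1)  # np.delete: an independent copy

print(np.shares_memory(a, sliced))   # True
print(np.shares_memory(a, deleted))  # False

# If an independent array is wanted from the slice, copy it explicitly.
a = a[:, 0:2].copy()
print(a.shape)  # (3, 2)
```

If the original 3x3 data is no longer needed, the explicit copy also lets its memory be released once nothing else refers to it.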
If you wanted to delete only the 3 in the first row, that is a different problem. It is only possible with an array of Python lists (an object array), because the rows would no longer all have the same length.
Example:
>>> a = np.array([[1,2],[4,5,6],[7,8,9]], dtype=object)  # recent NumPy versions require dtype=object for ragged rows
>>> a
array([list([1, 2]), list([4, 5, 6]), list([7, 8, 9])], dtype=object)
If you do that, you may as well stick to plain Python lists. You will have lost the speed and other advantages of NumPy.
If by 'universal' you mean the last element of each row of an N x M array, just use .shape to find the dimensions:
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> a.shape
(3, 4)
>>> np.delete(a,a.shape[1]-1,1)
array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]])
Or,
>>> a[:,0:a.shape[1]-1]
array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]])
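A negative index gives the same result without looking up the shape at all. A short sketch, assuming the goal is simply to drop the last column of a 2D array:

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)

# a[:, :-1] keeps every column except the last, whatever the column count.
trimmed = a[:, :-1]
print(trimmed)

# np.delete accepts a negative index as well.
same = np.delete(a, -1, axis=1)
print(np.array_equal(trimmed, same))  # True
```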
>>> a = np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> type(a)
<class 'numpy.ndarray'>
>>> a.shape
(3, 3)
The variable a is a matrix (2D array) with a fixed number of rows and columns, and all rows of a matrix must have the same length. The matrix above could not exist if its first row had length 2 and the others length 3, so deleting the last element of only the first sub-array (or of any other subset of rows) is not possible.
Instead, you have to delete the last element of all the sub-arrays at the same time.
That can be done with:
>>> a[:,0:2]
array([[1, 2],
[4, 5],
[7, 8]])
Or,
>>> np.delete(a,2,1)
array([[1, 2],
[4, 5],
[7, 8]])
This also applies to elements at other positions: any element can be deleted from the sub-arrays, as long as all the sub-arrays end up with the same length.
However, you can modify the last element (or any other) of any sub-array, since that leaves the shape unchanged.
>>> a[0][-1] = 19
>>> a
array([[ 1, 2, 19],
[ 4, 5, 6],
[ 7, 8, 9]])
If you try to form a matrix with rows of unequal length, you get a 1D array of lists instead (recent NumPy versions require an explicit dtype=object for this). NumPy operations such as vectorized arithmetic and multi-dimensional slicing do not work on it; only the list operations do.
>>> b = np.array([[1,2,3],[1,2,3]])
>>> c = np.array([[1,2],[1,2,3]], dtype=object)
>>> b
array([[1, 2, 3],
[1, 2, 3]])
>>> b.shape
(2, 3)
>>> c
array([list([1, 2]), list([1, 2, 3])], dtype=object)
>>> c.shape
(2,)
>>> print(type(b),type(c))
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
Both are ndarrays, but the second variable, c, is a 1D array of lists.
>>> b+b
array([[2, 4, 6],
[2, 4, 6]])
>>> c+c
array([list([1, 2, 1, 2]), list([1, 2, 3, 1, 2, 3])], dtype=object)
As a result, b+b performs element-wise addition of b with b, but c+c concatenates the corresponding lists.
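If rows of different lengths are really needed, one option (an alternative to the object array above, not something the question asked for) is a plain Python list of 1D arrays. Each row stays a numeric array, and rows can be shortened independently. A minimal sketch:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Split the matrix into independent 1D arrays, one per row.
rows = [row.copy() for row in a]

# Now only the first row loses its last element.
rows[0] = rows[0][:-1]

# Per-row NumPy operations still work as usual.
for row in rows:
    print(row, row.sum())
```

The trade-off is that operations across rows need an explicit Python loop.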
For further reference, see:
How to make a multidimension numpy array with a varying row size?
Here is how to drop the last sub-array (the last row) of a:
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
a = a[:-1]
print(a)
Output:
[[1 2 3]
[4 5 6]]
I want to find a concise way to sample n consecutive elements with stride m from a numpy array. The simplest case is with sampling 1 element with stride 2, which means getting every other element in a list, which can be done like this:
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[::2]
array([0, 2, 4, 6, 8])
However, what if I wanted to slice n consecutive elements with a stride of m where n and m can be any integers? For example, if I wanted to slice 2 consecutive elements with a stride of 3 I would get something like this:
array([0, 1, 3, 4, 6, 7, 9])
Is there a pythonic and concise way of doing this? Thank you!
If len(a) is a multiple of 3, you can reshape, slice, and ravel:
a.reshape(-1,3)[:,:2].ravel()
But for that the shape of a has to be (9,) or (12,), for example, not (10,). And the result will still be a copy.
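A boolean mask built from each position modulo m is another option that works for any length. This is a different technique from the reshape above, sketched under the assumption that n <= m; the function name take_n_every_m is made up for illustration:

```python
import numpy as np

def take_n_every_m(arr, n, m):
    """Keep the first n elements of every block of m (hypothetical helper)."""
    positions = np.arange(len(arr))
    # Boolean indexing returns a copy, not a view.
    return arr[positions % m < n]

a = np.arange(10)
print(take_n_every_m(a, n=2, m=3))  # [0 1 3 4 6 7 9]
print(take_n_every_m(a, n=1, m=2))  # [0 2 4 6 8]
```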
The suggested:
np.lib.stride_tricks.as_strided(a, (4,2), (8*3, 8)).ravel()[:-1]
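is also a copy. The as_strided part is a view, but ravel will make a copy. And there is the ugliness of that extra element: the (4,2) view reaches one item past the end of the 10-element a, which is why the trailing [:-1] is needed. The strides also assume 8-byte items.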
sliding_window_view was added as a safer version:
In [81]: np.lib.stride_tricks.sliding_window_view(a,(3))
Out[81]:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7],
[6, 7, 8],
[7, 8, 9]])
In [82]: np.lib.stride_tricks.sliding_window_view(a,(3))[::3,:2]
Out[82]:
array([[0, 1],
[3, 4],
[6, 7]])
Again ravel will make a copy. This omits the "extra" 9.
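The same indexing pattern generalizes to other n and m. A rough sketch, assuming n <= m and NumPy 1.20 or later (where sliding_window_view is available); the helper name windowed_take is hypothetical:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windowed_take(arr, n, m):
    """First n items of each full window of length m, stepping by m."""
    # Windows of length m start at every position; keep every m-th window,
    # then the first n columns of each. The result is still a view.
    return sliding_window_view(arr, m)[::m, :n]

a = np.arange(10)
out = windowed_take(a, n=2, m=3)
print(out)
print(np.shares_memory(a, out))  # True
```

As with the example above, a trailing partial window (here the 9) is dropped.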
np.resize does a reshape with padding (repeating a as needed):
In [83]: np.resize(a, (4,3))
Out[83]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8],
[9, 0, 1]])
In [84]: np.resize(a, (4,3))[:,:2]
Out[84]:
array([[0, 1],
[3, 4],
[6, 7],
[9, 0]])
This code might be useful; I tested it on the example in the question (n=2, m=3):
import numpy as np
def get_slice(arr, n, m):
    b = np.array([])
    for i in range(0, len(arr), m):
        b = np.concatenate((b, arr[i:i + n]))
    return b
sliced_arr = get_slice(np.arange(10), n=2, m=3)
print(sliced_arr)
Output
[0. 1. 3. 4. 6. 7. 9.]
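The floats in that output come from the np.array([]) seed, which defaults to float64. A possible variant, assuming the input dtype should be preserved and the input is non-empty, collects the slices in a list and concatenates once (get_slice_keep_dtype is a hypothetical name):

```python
import numpy as np

def get_slice_keep_dtype(arr, n, m):
    """Take n consecutive elements every m positions, keeping arr's dtype."""
    # Assumes arr is non-empty: np.concatenate raises on an empty list.
    pieces = [arr[i:i + n] for i in range(0, len(arr), m)]
    return np.concatenate(pieces)

sliced_arr = get_slice_keep_dtype(np.arange(10), n=2, m=3)
print(sliced_arr)  # [0 1 3 4 6 7 9]
```

Concatenating once also avoids reallocating the result on every loop iteration.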
a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=int)
b = np.array([[8], [9]], dtype=int)
result wanted:
alist = [[0, 1, 2, 3, 8], [4, 5, 6, 7, 9]] # as np.array
I tried:
np.concatenate(alist,blist)
np.concatenate((alist,blist))
np.concatenate(alist, blist[0])
for a,b in zip(alist,blist): np.concatenate(a,b)
alist = [*map(np.concatenate, alist, blist)])
Each attempt gave me an error message that I tried to fix in the next attempt. Nothing has worked so far.
You are just missing the axis=1 keyword argument.
np.concatenate((a, b), axis=1)
Normally np.concatenate works on axis 0 (going down the array). But in this case you want to concatenate along axis 1 (going across the array). See the glossary for more information.
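A small sketch of the difference between the two axes, using the arrays from the question. With the default axis=0 the column counts (4 and 1) would have to match, so the call fails; with axis=1 only the row counts have to match:

```python
import numpy as np

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=int)
b = np.array([[8], [9]], dtype=int)

joined = np.concatenate((a, b), axis=1)
print(joined)

try:
    np.concatenate((a, b))  # axis=0: column counts 4 and 1 differ
except ValueError as err:
    print("axis=0 failed:", err)
```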
You can achieve this with np.hstack, which concatenates the two arrays along the second axis:
a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=int)
b = np.array([[8], [9]], dtype=int)
>>> np.hstack((a,b))
array([[0, 1, 2, 3, 8],
[4, 5, 6, 7, 9]])
I have two arrays, values and indexes
>>> values
array([[5, 4, 2, 4, 6],
[7, 9, 7, 3, 6]])
>>> indexes
array([[2, 4],
[0, 3],
[0, 1],
[1, 3]])
What I would like is a fast way (as my arrays are very large) to get, for each row of values, the sums of the elements selected by each index collection in indexes.
I.e., for the first row [5, 4, 2, 4, 6] I want to get
>>> values[0][indexes.flatten()].reshape(indexes.shape)
array([[2, 6],
[5, 4],
[5, 4],
[4, 4]])
>>> values[0][indexes.flatten()].reshape(indexes.shape).sum(axis=1)
array([8, 9, 9, 8])
Using this technique and looping over all rows of values is the fastest I could come up with. Is there a better way? Thank you in advance for your time.
Approach #1
Simply index into columns and sum along the last axis -
values[:,indexes].sum(axis=-1)
Sample run -
In [39]: values
Out[39]:
array([[5, 4, 2, 4, 6],
[7, 9, 7, 3, 6]])
In [40]: indexes
Out[40]:
array([[2, 4],
[0, 3],
[0, 1],
[1, 3]])
In [41]: values[:,indexes].sum(axis=-1)
Out[41]:
array([[ 8, 9, 9, 8],
[13, 10, 16, 12]])
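To see why this works, it may help to look at the intermediate array. A short sketch with the sample data: indexing the columns with the 2D indexes array produces one (4, 2) block per row of values, and the sum over the last axis collapses each index pair.

```python
import numpy as np

values = np.array([[5, 4, 2, 4, 6],
                   [7, 9, 7, 3, 6]])
indexes = np.array([[2, 4], [0, 3], [0, 1], [1, 3]])

gathered = values[:, indexes]
print(gathered.shape)  # (2, 4, 2)
print(gathered[0])     # the same block the question builds for values[0]

sums = gathered.sum(axis=-1)
print(sums)
```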
Approach #2
If there are no duplicates in each row of indexes, we can simply use matrix-multiplication to get the sum-reductions and this would be much faster -
m,n = indexes.shape[0], values.shape[1]
mask = np.zeros((n,m),dtype=bool) # faster with float dtype
mask[indexes, np.arange(m)[:,None]] = 1
out = values.dot(mask)
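A runnable sketch of the mask approach on the sample data, checked against the direct indexing sum. The wrapper name sum_by_mask is hypothetical, and the mask here uses the dtype of values rather than bool, which is an assumption made to keep the result an integer array. The last lines illustrate the no-duplicates caveat:

```python
import numpy as np

def sum_by_mask(values, indexes):
    """Sum selected columns via a 0/1 mask and a matrix product."""
    m, n = indexes.shape[0], values.shape[1]
    mask = np.zeros((n, m), dtype=values.dtype)
    # Column k of mask gets a 1 at every column index listed in indexes[k].
    mask[indexes, np.arange(m)[:, None]] = 1
    return values.dot(mask)

values = np.array([[5, 4, 2, 4, 6],
                   [7, 9, 7, 3, 6]])
indexes = np.array([[2, 4], [0, 3], [0, 1], [1, 3]])

out = sum_by_mask(values, indexes)
print(np.array_equal(out, values[:, indexes].sum(axis=-1)))  # True

# With a duplicate inside a row, the mask counts that column only once.
dup = np.array([[1, 1]])
print(sum_by_mask(values, dup).ravel())        # [4 9]
print(values[:, dup].sum(axis=-1).ravel())     # [ 8 18]
```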
I am trying to understand NumPy's combined slicing and indexing. I am not sure how to derive the results below by hand, so that I can understand how NumPy processes a slice combined with an index array, and which of the two is processed first:
>>> import numpy as np
>>> a=np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> i=np.array([[0,1],[2,2]])
>>> a[i,:]
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 8, 9, 10, 11]]])
>>> j=np.array([[2,1],[3,3]])
>>> a[:,j]
array([[[ 2, 1],
[ 3, 3]],
[[ 6, 5],
[ 7, 7]],
[[10, 9],
[11, 11]]])
>>> aj=a[:,j]
>>> aj.shape
(3L, 2L, 2L)
I am a bit confused about how aj's shape becomes (3,2,2) in the above output. Any detailed explanation would be much appreciated, thanks!
Whenever you use an array of indices, the result has the same shape as the indices; for example:
>>> x = np.ones(5)
>>> i = np.array([[0, 1], [1, 0]])
>>> x[i]
array([[ 1., 1.],
[ 1., 1.]])
We've indexed with a 2x2 array, and the result is a 2x2 array.
When combined with a slice, the size of the slice is preserved. For example:
>>> x = np.ones((5, 3))
>>> x[i, :].shape
(2, 2, 3)
Where the first example was a 2x2 array of items, this example is a 2x2 array of (length-3) rows.
The same is true when you switch the order of the slice:
>>> x = np.ones((5, 3))
>>> x[:, i].shape
(5, 2, 2)
This can be thought of as a list of five 2x2 arrays.
Just remember: when any dimension is indexed with a list or array, the result has the shape of the indices, not the shape of the input.
a[:,j][0] is equivalent to a[0,j], i.e. np.array([0, 1, 2, 3])[j], which gives you [[2, 1], [3, 3]]
a[:,j][1] is equivalent to a[1,j], i.e. np.array([4, 5, 6, 7])[j], which gives you [[6, 5], [7, 7]]
a[:,j][2] is equivalent to a[2,j], i.e. np.array([8, 9, 10, 11])[j], which gives you [[10, 9], [11, 11]]
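Both explanations can be checked directly on the arrays from the question. A minimal sketch: the shape of the index array replaces the axis it indexes, and each block of a[:, j] is one row of a indexed by j.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i = np.array([[0, 1], [2, 2]])
j = np.array([[2, 1], [3, 3]])

# The index array's shape (2, 2) replaces the axis it is applied to.
print(a[i, :].shape)  # (2, 2, 4)
print(a[:, j].shape)  # (3, 2, 2)

# Block k of a[:, j] is row k of a, indexed by j.
for k in range(a.shape[0]):
    print(np.array_equal(a[:, j][k], a[k][j]))  # True each time
```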
Say I have this object array containing lists of the same length:
>>> a = np.empty(2, dtype=object)
>>> a[0] = [1, 2, 3, 4]
>>> a[1] = [5, 6, 7, 8]
>>> a
array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=object)
How can I convert this to a numeric 2D array?
>>> a.shape
(2,)
>>> b = WHAT_GOES_HERE(a)
>>> b
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
>>> b.shape
(2, 4)
How can I do the reverse?
Does it get easier if my a array is an np.array of np.arrays, rather than an np.array of lists?
>>> na = np.empty(2, dtype=object)
>>> na[0] = np.array([1, 2, 3, 4])
>>> na[1] = np.array([5, 6, 7, 8])
>>> na
array([array([1, 2, 3, 4]), array([5, 6, 7, 8])], dtype=object)
One approach using np.concatenate -
b = np.concatenate(a).reshape(len(a),*np.shape(a[0]))
The improvement suggested by @Eric, using *np.shape(a[0]), should make it work for generic ND shapes.
Sample run -
In [183]: a
Out[183]: array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=object)
In [184]: a.shape
Out[184]: (2,)
In [185]: b = np.concatenate(a).reshape(len(a),*np.shape(a[0]))
In [186]: b
Out[186]:
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
In [187]: b.shape
Out[187]: (2, 4)
To get back a, it seems we can use a two-step process, like so -
a_back = np.empty(b.shape[0], dtype=object)
a_back[:] = b.tolist()
Sample run -
In [190]: a_back = np.empty(b.shape[0], dtype=object)
...: a_back[:] = b.tolist()
...:
In [191]: a_back
Out[191]: array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=object)
In [192]: a_back.shape
Out[192]: (2,)
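The same two-step idea can also be written with per-element assignment, which leaves no doubt about how the nested lists are interpreted. A sketch, with to_object_rows as a hypothetical helper name:

```python
import numpy as np

def to_object_rows(b):
    """Turn a 2D array into a 1D object array holding one list per row."""
    out = np.empty(len(b), dtype=object)
    for k, row in enumerate(b):
        out[k] = row.tolist()  # each element is a plain Python list
    return out

b = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
a_back = to_object_rows(b)
print(a_back.shape)     # (2,)
print(type(a_back[0]))  # <class 'list'>
```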
You can use np.vstack():
>>> a = np.vstack(a).astype(int)
Here's an approach that converts the source NumPy array to lists and then into the desired NumPy array:
b = np.array([k for k in a])
b
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
c = np.array([k for k in b], dtype=object)
c
array([[1, 2, 3, 4],
[5, 6, 7, 8]], dtype=object)
I found that round-tripping through list with np.array(list(a)) was sufficient.
This seems to be equivalent to using np.stack(a).
Both of these have the benefit of working in the more general case of converting a 1D array of ND arrays into an (N+1)D array.
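A small sketch of that more general case, assuming every element of the object array is an ndarray of the same shape (here 2x3, so the result is 3D):

```python
import numpy as np

na = np.empty(2, dtype=object)
na[0] = np.arange(6).reshape(2, 3)
na[1] = np.arange(6, 12).reshape(2, 3)

stacked = np.stack(na)           # 1D array of 2D arrays -> one 3D array
via_list = np.array(list(na))    # round trip through a Python list

print(stacked.shape)                       # (2, 2, 3)
print(np.array_equal(stacked, via_list))   # True
```

Both results are ordinary numeric arrays rather than object arrays.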