How can I delete multiple rows of a NumPy array? For example, I want to delete the first five rows of x. I'm trying the following code:
import numpy as np
x = np.random.rand(10, 5)
np.delete(x, (0:5), axis=0)
but it doesn't work:
np.delete(x, (0:5), axis=0)
^
SyntaxError: invalid syntax
There are several ways to delete rows from a NumPy array.
The easiest one is to use basic indexing as with standard Python lists:
>>> import numpy as np
>>> x = np.arange(35).reshape(7, 5)
>>> x
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
>>> result = x[5:]
>>> result
array([[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
You can select not only rows but columns as well:
>>> x[:2, 1:4]
array([[1, 2, 3],
[6, 7, 8]])
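One detail the slicing examples above do not show: basic slicing returns a view onto the original data, whereas np.delete returns a new array. A small sketch of the difference (the variable names are only illustrative):

```python
import numpy as np

x = np.arange(35).reshape(7, 5)

# Basic slicing returns a view: no data is copied.
view = x[5:]
print(np.shares_memory(x, view))         # True

# Writing through the view also changes x.
view[0, 0] = -1
print(x[5, 0])                           # -1

# Take an explicit copy when the result must be independent of x.
independent = x[5:].copy()
print(np.shares_memory(x, independent))  # False

# np.delete always builds a new array.
deleted = np.delete(x, slice(0, 5), axis=0)
print(np.shares_memory(x, deleted))      # False
```

If the original array must stay untouched, either copy the slice or use one of the copying approaches.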
Another way is to use "fancy indexing" (indexing arrays using arrays):
>>> x[[0, 2, 6]]
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[30, 31, 32, 33, 34]])
You can achieve the same using np.take:
>>> np.take(x, [0, 2, 6], axis=0)
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[30, 31, 32, 33, 34]])
Yet another option is to use np.delete as in the question. For selecting the rows/columns for deletion it can accept slice objects, int, or array of ints:
>>> np.delete(x, slice(0, 5), axis=0)
array([[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
>>> np.delete(x, [0, 2, 3], axis=0)
array([[ 5, 6, 7, 8, 9],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
In all the time I've been using NumPy, though, I have never needed np.delete, because boolean indexing is usually more convenient.
As an example, to remove (or select) the rows whose first value is 12 or greater, I would do:
>>> mask_array = x[:, 0] < 12 # comparing values of the first column
>>> mask_array
array([ True, True, True, False, False, False, False])
>>> x[mask_array]
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
>>> x[~mask_array] # ~ is an element-wise inversion
array([[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
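Boolean indexing can also reproduce np.delete for an explicit list of row indices. A minimal sketch, assuming the same x as above; rows_to_drop and keep are illustrative names:

```python
import numpy as np

x = np.arange(35).reshape(7, 5)
rows_to_drop = [0, 2, 3]

# Start with "keep every row", then switch off the rows to drop.
keep = np.ones(x.shape[0], dtype=bool)
keep[rows_to_drop] = False

result = x[keep]
print(result[:, 0])  # first column of the remaining rows: [ 5 20 25 30]

# Same rows as np.delete with the same index list.
print(np.array_equal(result, np.delete(x, rows_to_drop, axis=0)))  # True
```

The mask can be combined with other conditions using & and |, which is harder to express with np.delete.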
For more information refer to the documentation on indexing: https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
To delete selected rows, pass their indices as a tuple or list:
np.delete(x, (1,2,5), axis = 0)
This deletes the rows at indices 1, 2 and 5. To delete a range of rows such as 0:5, use np.s_:
np.delete(x, np.s_[0:5], axis = 0)
This deletes rows 0 to 4 of the array.
np.s_[0:5] evaluates to slice(0, 5, None), so the two forms are equivalent.
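As a quick check of the equivalence stated above, assuming the 10x5 array from the question, the following sketch also shows that np.delete leaves x untouched, so the result has to be assigned to a name:

```python
import numpy as np

x = np.random.rand(10, 5)

# np.s_ builds the slice object that the [0:5] syntax stands for.
print(np.s_[0:5] == slice(0, 5, None))  # True

# np.delete returns a new array; x itself keeps all ten rows.
trimmed = np.delete(x, np.s_[0:5], axis=0)
print(x.shape, trimmed.shape)  # (10, 5) (5, 5)

# Deleting the first five rows matches plain slicing.
print(np.array_equal(trimmed, x[5:]))  # True
```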
Pass the row numbers to delete as a list.
General syntax:
np.delete(array_name, [rownumber1, rownumber2, .., rownumber n], axis=0)
Example: delete the first three rows of an array:
np.delete(array_name, [0, 1, 2], axis=0)
If I have a 3D array of shape (4,3,3) like this:
[[0,1,2] [[9,10,11 ] [[18,19,20] [[27,28,29]
[3,4,5] [12,13,14] [21,22,23] [30,31,32]
[6,7,8]] , [15,16,17]] , [24,25,26]] , [33,34,35]]
How would I convert it to a 2D array of shape (6,6) like this, so that the first half of the blocks ends up in the top half of the result (160x160 for my real data) and the second half in the bottom half?
[[0,1,2,9,10,11]
[3,4,5,12,13,14]
[6,7,8,15,16,17]
[18,19,20,27,28,29]
[21,22,23,30,31,32]
[24,25,26,33,34,35]]
My array creation:
qDCTReversed = np.zeros((400,8,8), dtype=int)
And I need a (160,160) array.
A very fast one-line solution without for-loops is this:
# initialization
qDCTReversed = np.arange(4*3*3).reshape((4,3,3))
# calculation
qDCTReversed = qDCTReversed.reshape((2,2,3,3)).transpose((0,2,1,3)).reshape((6,6))
or for the (400,8,8) array:
qDCTReversed.reshape((20,20,8,8)).transpose((0,2,1,3)).reshape((160,160))
Speed comparison:
Mstaino's answer: 0.393 ms
yatu's answer: 0.138 ms
This answer: 0.016 ms
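To see why the reshape/transpose chain above works, it can help to split it into its three steps. A sketch on the small (4,3,3) example; the names blocks, grid and swapped are only for illustration:

```python
import numpy as np

blocks = np.arange(4 * 3 * 3).reshape((4, 3, 3))  # 4 blocks of 3x3

# Step 1: arrange the 4 blocks on a 2x2 grid.
# Axes: (grid_row, grid_col, row_in_block, col_in_block)
grid = blocks.reshape((2, 2, 3, 3))

# Step 2: swap the two middle axes so that the rows inside a block
# sit next to the grid row they belong to.
# Axes: (grid_row, row_in_block, grid_col, col_in_block)
swapped = grid.transpose((0, 2, 1, 3))

# Step 3: merge (grid_row, row_in_block) into 6 rows and
# (grid_col, col_in_block) into 6 columns.
result = swapped.reshape((6, 6))

print(result[0])  # [ 0  1  2  9 10 11]
print(result[3])  # [18 19 20 27 28 29]
```

The (400,8,8) case follows the same pattern with a 20x20 grid of 8x8 blocks.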
You can do this by looping over the list as such:
a = [[[ 0, 1, 2], [ 9,10,11]],
[[ 3, 4, 5], [12,13,14]],
[[ 6, 7, 8], [15,16,17]],
[[18,19,20], [27,28,29]],
[[21,22,23], [30,31,32]],
[[24,25,26], [33,34,35]]]
b = [[i for j in k for i in j ] for k in a]
for row in b:
    print(row)
outputs:
[0, 1, 2, 9, 10, 11]
[3, 4, 5, 12, 13, 14]
[6, 7, 8, 15, 16, 17]
[18, 19, 20, 27, 28, 29]
[21, 22, 23, 30, 31, 32]
[24, 25, 26, 33, 34, 35]
The reshape you ask for can be done with:
x = np.arange(36).reshape((4,3,3))
# pass a list to np.vstack: newer NumPy versions reject a bare generator
np.vstack([np.hstack(x[2*i:2+2*i]) for i in range(x.shape[0]//2)])
>>array([[ 0, 1, 2, 9, 10, 11],
[ 3, 4, 5, 12, 13, 14],
[ 6, 7, 8, 15, 16, 17],
[18, 19, 20, 27, 28, 29],
[21, 22, 23, 30, 31, 32],
[24, 25, 26, 33, 34, 35]])
I have a 2D array z and a 1D array starts that gives the "start column position" for each row. In addition, I have a fixed row_length = 2.
z = np.arange(35).reshape(5, -1)
# --> array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
starts = np.array([1,5,3,3,2])
What I want is the outcome of this slow for-loop, just quicker if possible.
result = np.zeros(
    (z.shape[0], row_length),
    dtype=z.dtype
)
for i in range(z.shape[0]):
    s = starts[i]
    result[i] = z[i, s:s+row_length]
So result in this example should look like this in the end:
array([[ 1, 2],
[12, 13],
[17, 18],
[24, 25],
[30, 31]])
I can't seem to find a way using either fancy indexing or np.take to deliver this result.
One approach is to build the column indices by broadcasted addition of starts and np.arange(row_length), and then use NumPy's advanced indexing to extract all of those elements from the data array, like so -
idx = starts[:,None] + np.arange(row_length)
out = z[np.arange(idx.shape[0])[:,None], idx]
Sample run -
In [197]: z
Out[197]:
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
In [198]: starts = np.array([1,5,3,3,2])
In [199]: row_length = 2
In [200]: idx = starts[:,None] + np.arange(row_length)
In [202]: z[np.arange(idx.shape[0])[:,None], idx]
Out[202]:
array([[ 1, 2],
[12, 13],
[17, 18],
[24, 25],
[30, 31]])
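On NumPy 1.15 or later, np.take_along_axis is another way to apply the same idx array without building the row index by hand. A sketch, checked against the loop from the question:

```python
import numpy as np

z = np.arange(35).reshape(5, -1)
starts = np.array([1, 5, 3, 3, 2])
row_length = 2

# Column indices per row: shape (5, row_length)
idx = starts[:, None] + np.arange(row_length)

# Pick column idx[i, j] from row i of z for every i, j.
out = np.take_along_axis(z, idx, axis=1)
print(out[:, 0])  # [ 1 12 17 24 30]

# Reference: the explicit loop from the question.
expected = np.stack([z[i, s:s + row_length] for i, s in enumerate(starts)])
print(np.array_equal(out, expected))  # True
```

Both variants assume that starts + row_length never runs past the last column; otherwise the indexing raises an IndexError.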
I have the following slicing problem in numpy.
a = np.arange(36).reshape(-1,4)
a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]])
In my problem, every three rows represent one sample (in my case, coordinates).
I want to access this matrix in such a way that a[0:2] gives the following:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
These are the first two coordinate samples.
I have to extract a large amount of these coordinate sets from an array.
Thanks
Based on How do you split a list into evenly sized chunks?, I found the following solution, which gives me the desired result.
def chunks(l, n, indices):
    return np.vstack([l[idx*n:idx*n+n] for idx in indices])
chunks(a,3,[0,2])
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]])
This solution can probably be improved so that the stacking is not needed.
If three rows are a sample, you can reshape your array to reflect that, use fancy indexing to retrieve your samples, then undo the shape change:
>>> a = a.reshape(-1, 3, 4)
>>> a[[0, 2]].reshape(-1, 4)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]])
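The same idea can be wrapped in a small helper. This is only a sketch: take_samples and rows_per_sample are hypothetical names, not part of NumPy, and it assumes the number of rows is a multiple of rows_per_sample:

```python
import numpy as np

def take_samples(a, samples, rows_per_sample=3):
    """Return the rows of the selected samples, stacked as a 2D array."""
    n_cols = a.shape[1]
    # Group the rows: shape (n_samples, rows_per_sample, n_cols)
    grouped = a.reshape(-1, rows_per_sample, n_cols)
    # samples may be a slice (basic indexing) or a list (fancy indexing)
    return grouped[samples].reshape(-1, n_cols)

a = np.arange(36).reshape(-1, 4)

first_two = take_samples(a, slice(0, 2))  # rows 0..5, as asked in the question
picked = take_samples(a, [0, 2])          # rows 0..2 and 6..8
print(first_two.shape, picked.shape)      # (6, 4) (6, 4)
```

Passing a slice covers the a[0:2] access pattern from the question, while a list of sample numbers matches the chunks(a, 3, [0, 2]) call.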