I have an ndarray with the following content:
[0, 1]
[0, 5]
[1, 7]
[2, 9]
[2, 4]
[2, 4]
[3, 8]
[4, 2]
[4, 7]
Now I'd like to keep only the first row whenever several rows share the same first element. The result would be:
[0, 1]
[1, 7]
[2, 9]
[3, 8]
[4, 2]
How can I achieve this with numpy?
Given input data such as:
x = np.array([
[0, 1],
[0, 5],
[1, 7],
[2, 9],
[2, 4],
[2, 4],
[3, 8],
[4, 2],
[4, 7],
])
Then you can use numpy.unique with return_index set to True (as @Divakar mentioned in the comments) to find the index of the first occurrence of each unique value in the first column.
idx = np.unique(x[:,0], return_index=True)[1]
Then you can just access them as:
x[idx]
Hope this helps.
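Putting the steps above together, here is a minimal self-contained sketch using the array from the question. The variable names `first_idx` and `result` are only illustrative. Note that np.unique returns its values sorted, so the rows come back ordered by the first column; if the input were not already sorted and the original row order mattered, indexing with `np.sort(first_idx)` instead should preserve it.

```python
import numpy as np

x = np.array([
    [0, 1], [0, 5], [1, 7], [2, 9], [2, 4],
    [2, 4], [3, 8], [4, 2], [4, 7],
])

# return_index=True also returns, for each unique value in the first
# column, the index of its first occurrence in the input.
_, first_idx = np.unique(x[:, 0], return_index=True)

# Fancy-index the original array with those row indices.
result = x[first_idx]
print(result)
```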
Given the following numpy array:
>>> a = np.arange(9).reshape((3, 3))
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
How can I get the list of all possible single-column deletions? So in this case:
array([[[1, 2],
[4, 5],
[7, 8]],
[[0, 2],
[3, 5],
[6, 8]],
[[0, 1],
[3, 4],
[6, 7]]])
You can use itertools.combinations (note that the sub-arrays come out in the reverse of the order shown in the question):
>>> from itertools import combinations
>>> np.array([a[:, list(comb)] for comb in combinations(range(a.shape[1]), r=2)])
array([[[0, 1],
[3, 4],
[6, 7]],
[[0, 2],
[3, 5],
[6, 8]],
[[1, 2],
[4, 5],
[7, 8]]])
Alternatively you can create a list of needed column indices first and then use integer array indexing to pick up the required columns from the original array:
r = range(a.shape[1])
cols = [[j for j in r if i != j] for i in r]
cols
# [[1, 2], [0, 2], [0, 1]]
a[:, cols].swapaxes(0, 1)
#[[[1 2]
# [4 5]
# [7 8]]
#
# [[0 2]
# [3 5]
# [6 8]]
#
# [[0 1]
# [3 4]
# [6 7]]]
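A third option, not taken from the answers above, is to call np.delete once per column. This is only a sketch; the name `deletions` is illustrative. It is probably slower than the indexing approach for wide arrays, since each call builds a new array, but it reads closely to the problem statement and produces the sub-arrays in the order the question asks for.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

# np.delete(a, i, axis=1) returns a copy of `a` without column i.
# Stacking the copies gives one sub-array per deleted column.
deletions = np.array([np.delete(a, i, axis=1) for i in range(a.shape[1])])

print(deletions.shape)
print(deletions)
```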
I have a 2D numpy array that I need to extract a subset of data from where the value of the 2nd column is higher than a certain value. What's the best way to do this?
E.g. given the array:
array1 = [[1, 5], [2, 6], [3, 7], [4, 8]]
I would want to extract all rows where the 2nd column was higher than 6, so I'd get:
[3, 7], [4, 8]
The simplest way is to index with a boolean mask:
a[a[:,1] > 6]
Output:
array([[3, 7], [4, 8]])
Where a is the array.
Use numpy.where:
import numpy as np
a = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])
# all rows where the second item is greater than 6:
print(a[np.where(a[:, 1] > 6)])
# output: [[3 7], [4 8]]
Use list comprehension:
array1 = [[1, 5], [2, 6], [3, 7], [4, 8]]
threshold = 6
print([elem for elem in array1 if elem[1] > threshold])
# [[3, 7], [4, 8]]
Or using numpy:
import numpy as np
array1 = np.array(array1)
print(array1[array1[:,1] > 6])
# [[3 7]
#  [4 8]]
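The boolean-mask and np.where answers select the same rows. As a quick check, here is a small sketch that wraps the mask in a helper and compares both forms; the function name `rows_above` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np

def rows_above(a, col, threshold):
    """Return the rows of `a` whose value in column `col` exceeds `threshold`."""
    a = np.asarray(a)                 # accept nested lists as well as arrays
    mask = a[:, col] > threshold      # one boolean per row
    return a[mask]

array1 = [[1, 5], [2, 6], [3, 7], [4, 8]]
filtered = rows_above(array1, col=1, threshold=6)

# The np.where form converts the same mask to row indices first.
arr = np.asarray(array1)
same_via_where = arr[np.where(arr[:, 1] > 6)]

print(filtered)
```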
I have this matrix:
matrix = [ [1, 2, 3],
[4, 5, 6],
[7, 8, 9]
]
I want to remove the last row, so that I get something like this:
matrix = [ [1, 2, 3],
[4, 5, 6]
]
P.S. I can't use NumPy.
matrix = [ [1, 2, 3],
[4, 5, 6],
[7, 8, 9]
]
matrix[:-1]
This returns [[1, 2, 3], [4, 5, 6]] without changing matrix itself, so to keep the result you can reassign it:
matrix = matrix[:-1]
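Slicing builds a new list, whereas list.pop() (or `del matrix[-1]`) removes the row in place. The difference matters when another name refers to the same list. A short sketch, with illustrative variable names:

```python
# Slicing: the original list is left untouched.
original = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
trimmed = original[:-1]
print(trimmed)        # the first two rows
print(len(original))  # still 3 rows

# pop(): the list is modified in place, so every reference sees the change.
in_place = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
alias = in_place
removed = in_place.pop()  # removes and returns the last row
print(removed)
print(alias)
```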
I have a 3D integer tensor X with X.shape=(m, n, k)
I'd like to treat X as a (m, n) matrix with entries that are k sized integer vectors and count how many such unique entries are in each row.
So for example
>>> X
array([[[0, 1, 2],
[0, 1, 2],
[1, 2, 3],
[1, 2, 3]],
[[3, 4, 5],
[4, 5, 6],
[5, 6, 7],
[6, 7, 8]]])
>>> X.shape
(2, 4, 3)
>>> count_unique(X)
[2, 4]
Since in the first row of the tensor there are 2 unique vectors and in the second row there are 4
Bonus points for returning the actual unique vectors, e.g.
>>> get_unique(X)
[[[0, 1, 2], [1, 2, 3]],
[[3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8]]]
My solution (partially vectorized) for the first question
count_unique = lambda X: [len(np.unique(row, axis=0)) for row in X]
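# Work on plain nested lists: with NumPy rows, `element not in ...` raises
# "truth value of an array ... is ambiguous".
unique_list = []
for sublist in X.tolist():
    tmp_unique_list = []
    for element in sublist:
        if element not in tmp_unique_list:
            tmp_unique_list.append(element)
    unique_list.append(tmp_unique_list)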
Output:
> unique_list
[[[0, 1, 2], [1, 2, 3]], [[3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8]]]
And the count:
> [len(elem) for elem in unique_list]
[2, 4]
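Both parts of the question can also be answered with np.unique(..., axis=0) applied per row of the tensor. This is a sketch under the question's own function names, `count_unique` and `get_unique`. The result is a Python list of arrays because each row may have a different number of unique vectors, and np.unique returns them in sorted order rather than in order of first appearance.

```python
import numpy as np

X = np.array([
    [[0, 1, 2], [0, 1, 2], [1, 2, 3], [1, 2, 3]],
    [[3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8]],
])

def get_unique(X):
    # For each (n, k) slice, axis=0 treats every length-k vector as one item.
    return [np.unique(row, axis=0) for row in X]

def count_unique(X):
    return [len(u) for u in get_unique(X)]

print(count_unique(X))
for vectors in get_unique(X):
    print(vectors.tolist())
```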
I have a large n x 2 numpy array that is formatted as (x, y) coordinates. I would like to filter this array so as to:
Identify coordinate pairs with duplicated x-values.
Keep only the coordinate pair of those duplicates with the highest y-value.
For example, in the following array:
arr = [[1, 4]
[1, 8]
[2, 3]
[4, 6]
[4, 2]
[5, 1]
[5, 2]
[5, 6]]
I would like the result to be:
arr = [[1, 8]
[2, 3]
[4, 6]
[5, 6]]
I've explored np.unique and np.where but cannot figure out how to leverage them to solve this problem. Thanks so much!
Here's one way based on np.maximum.reduceat -
def grouby_maxY(a):
    b = a[a[:,0].argsort()] # if first col is already sorted, skip this
    grp_idx = np.flatnonzero(np.r_[True,(b[:-1,0] != b[1:,0])])
    grp_maxY = np.maximum.reduceat(b[:,1], grp_idx)
    return np.c_[b[grp_idx,0], grp_maxY]
Alternatively, if you want to bring in np.unique, it can find grp_idx from the sorted array with np.unique(b[:,0], return_index=True)[1].
Sample run -
In [453]: np.random.seed(0)
In [454]: arr = np.random.randint(0,5,(10,2))
In [455]: arr
Out[455]:
array([[4, 0],
[3, 3],
[3, 1],
[3, 2],
[4, 0],
[0, 4],
[2, 1],
[0, 1],
[1, 0],
[1, 4]])
In [456]: grouby_maxY(arr)
Out[456]:
array([[0, 4],
[1, 4],
[2, 1],
[3, 3],
[4, 0]])
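For reference, here is a sketch of the np.unique variant mentioned above, run on the array from the question rather than on random data. The function name `groupby_max_y` is made up for this example; everything else follows the same sort, find-group-starts, reduceat sequence.

```python
import numpy as np

def groupby_max_y(a):
    a = np.asarray(a)
    # Sort rows by x so that equal x-values form contiguous groups.
    b = a[a[:, 0].argsort()]
    # Unique x-values and the index where each group starts in `b`.
    keys, grp_idx = np.unique(b[:, 0], return_index=True)
    # Maximum y within each group: reduceat reduces b[start:next_start].
    grp_max = np.maximum.reduceat(b[:, 1], grp_idx)
    return np.column_stack((keys, grp_max))

arr = np.array([[1, 4], [1, 8], [2, 3], [4, 6],
                [4, 2], [5, 1], [5, 2], [5, 6]])
result = groupby_max_y(arr)
print(result)
```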