In a Numpy ndarray, how do I remove elements in a dimension based on condition in a different dimension?
I have:
[[[1 3]
[1 4]]
[[2 6]
[2 8]]
[[3 5]
[3 5]]]
I want to remove based on condition x[:,:,1] < 7
Desired output (x[1, :, :] removed):
[[[1 3]
[1 4]]
[[3 5]
[3 5]]]
EDIT: fixed typo
This may work:
x[np.where(np.all(x[..., 1] < 7, axis=1)), ...]
yields
array([[[[1, 3],
[1, 4]],
[[3, 5],
[3, 5]]]])
You do get an extra dimension, but that's easy to remove:
np.squeeze(x[np.where(np.all(x[..., 1] < 7, axis=1)), ...])
Briefly how it works:
First the condition: x[..., 1] < 7.
Then test if the condition is valid for all elements along the specific axis: np.all(x[..., 1] < 7, axis=1).
Then, use where to grab the indices instead of an array of booleans: np.where(np.all(x[..., 1] < 7, axis=1)).
And insert those indices into the relevant dimension: x[np.where(np.all(x[..., 1] < 7, axis=1)), ...].
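The steps above can be sketched one at a time as follows. This is only an illustration; the names cond, keep, idx and x_out are introduced here and are not from the original answer. Taking element [0] of the tuple returned by np.where avoids the extra dimension, so no squeeze is needed.

```python
import numpy as np

x = np.array([[[1, 3], [1, 4]],
              [[2, 6], [2, 8]],
              [[3, 5], [3, 5]]])

# Step 1: the condition on the last axis, shape (3, 2)
cond = x[..., 1] < 7

# Step 2: a block is kept only if the condition holds for all of its rows, shape (3,)
keep = np.all(cond, axis=1)

# Step 3: np.where returns a tuple of index arrays; take the first (and only) one
idx = np.where(keep)[0]

# Step 4: index the first axis with those positions
x_out = x[idx]
print(x_out.shape)  # (2, 2, 2)
```

One thing to keep in mind: np.squeeze removes every length-1 axis, so if only a single block survived the filter, the squeezed result would also lose the first axis. Indexing with idx (or directly with the boolean array keep) keeps the result 3D.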
Your desired output filters x along axis 0, so you can build a boolean mask over that axis and index with it:
m = (x[:,:,1] < 7).all(1)
x_out = x[m,:,:]
Or simply
x_out = x[m]
Out[70]:
array([[[1, 3],
[1, 4]],
[[3, 5],
[3, 5]]])
I have a numpy array of arrays x = [[1, 3, 4, 5], [6, 2, 5, 7]]. I want to get N maximum values from each array of the numpy array: [[5, 4], [7, 6]]. I have tried using np.argpartition(x, -N, axis=0)[-N:] but it gives ValueError: kth(=-3) out of bounds (1). What is the efficient way for doing this?
You can do this by sorting each row and slicing as you want:
np.sort(x, axis=1)[:, :2] # --> [[1 3] [2 5]] 2 minimum in each row
np.sort(x, axis=1)[:, 2:] # --> [[4 5] [6 7]] 2 maximum in each row
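If you want the N largest values per row in descending order, as in the question's expected [[5, 4], [7, 6]], a sketch along these lines may help. N and the variable names are chosen here for illustration. The ValueError in the question likely arises because the partition was taken along axis=0, which has only two entries; partitioning along axis=1 avoids that.

```python
import numpy as np

x = np.array([[1, 3, 4, 5],
              [6, 2, 5, 7]])
N = 2  # number of largest values wanted per row (assumed for this example)

# Option 1: full sort of each row, keep the last N, reverse to descending order
top_sorted = np.sort(x, axis=1)[:, -N:][:, ::-1]

# Option 2: partial sort along axis=1; the last N columns hold the N largest
# values in arbitrary order, so sort just those N afterwards
part = np.partition(x, -N, axis=1)[:, -N:]
top_part = np.sort(part, axis=1)[:, ::-1]

print(top_sorted)
print(top_part)
```

For long rows and small N, the partition-based variant may be cheaper because it avoids fully sorting each row.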
Is it possible to extract the upper values from the whole 3D array?
A simple example of a 3D array is below:
import numpy as np
a = np.array([[[7, 4, 2], [5, 0, 4], [0, 0, 5]],
[[7, 6, 1], [3, 9, 5], [0, 8, 7]],
[[8, 10, 3], [1, 2, 15], [9, 0, 1]]])
You can use NumPy's matrix-building functions numpy.triu (upper triangle) or numpy.tril (lower triangle) to return a copy of a matrix with the elements below or above the k-th diagonal zeroed, respectively.
If, on the other hand, you are only interested in the values above or below the diagonal (without making a copy of the matrix), you can use numpy.triu_indices and numpy.tril_indices, as follows:
upper_index = np.triu_indices(n=3, k=1)
Here n is the size of the arrays for which the returned indices will be valid, and k is the diagonal offset.
The call returns the indices of the triangle as a tuple of two arrays, each holding the indices along one dimension of the array:
(array([0, 0, 1], dtype=int64), array([1, 2, 2], dtype=int64))
Now you can use these indices to index your array:
a[upper_index]
Because a is 3D, the two index arrays apply to its first two axes, so each selected element is a length-3 row. This gives:
array([[5, 0, 4],
[0, 0, 5],
[0, 8, 7]])
Similarly you can find the part under the diagonal using numpy.tril_indices.
IIUC, you could use triu_indices:
result = a[np.triu_indices(3)]
print(result)
Output
[[7 4 2]
[5 0 4]
[0 0 5]
[3 9 5]
[0 8 7]
[9 0 1]]
If you want those strictly above the diagonal, you can pass an offset value:
result = a[np.triu_indices(3, 1)]
print(result)
Output
[[5 0 4]
[0 0 5]
[0 8 7]]
Suppose that I have a 1*3 vector [[1,3,5]] (or a list like [1,3,5] if you wish). How do I generate a 9*2 matrix: [[1,1],[1,3],[1,5],[3,1],[3,3],[3,5],[5,1],[5,3],[5,5]]?
The elements of the new matrix are the pairwise combinations of the elements of the original matrix.
Also, the original matrix could contain zeros, like this: [[0,1],[0,3],[0,5]].
The implementation should generalise to vectors of any length.
Many thanks!
You can use tf.meshgrid() and tf.transpose() to generate two matrices. Then reshape and concat them.
import tensorflow as tf
a = tf.constant([[1,3,5]])
A,B=tf.meshgrid(a,tf.transpose(a))
result = tf.concat([tf.reshape(B,(-1,1)),tf.reshape(A,(-1,1))],axis=-1)
with tf.Session() as sess:
    print(sess.run(result))
[[1 1]
[1 3]
[1 5]
[3 1]
[3 3]
[3 5]
[5 1]
[5 3]
[5 5]]
You can use product from itertools
import numpy as np
from itertools import product
# product(..., repeat=2) yields every ordered pair as a tuple
np.array(list(product([1, 3, 5], repeat=2)))
array([[1, 1],
[1, 3],
[1, 5],
[3, 1],
[3, 3],
[3, 5],
[5, 1],
[5, 3],
[5, 5]])
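For comparison, the same pairs can be built with NumPy alone, without itertools or TensorFlow. This is a sketch rather than anything from the answers above; v and pairs are names chosen here.

```python
import numpy as np

v = np.array([1, 3, 5])
n = len(v)

# First column: each element repeated n times  -> 1 1 1 3 3 3 5 5 5
# Second column: the whole vector tiled n times -> 1 3 5 1 3 5 1 3 5
pairs = np.stack([np.repeat(v, n), np.tile(v, n)], axis=1)
print(pairs)
```

The result has shape (n*n, 2), so it works for a vector of any length.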
I also came up with an answer, similar to @giser_yugang's, but without using tf.meshgrid and tf.concat.
import tensorflow as tf
inds = tf.constant([1,3,5])
num = tf.shape(inds)[0]
ind_flat_lower = tf.tile(inds,[num])
ind_mat = tf.reshape(ind_flat_lower,[num,num])
ind_flat_upper = tf.reshape(tf.transpose(ind_mat),[-1])
result = tf.transpose(tf.stack([ind_flat_upper,ind_flat_lower]))
with tf.Session() as sess:
    print(sess.run(result))
[[1 1]
[1 3]
[1 5]
[3 1]
[3 3]
[3 5]
[5 1]
[5 3]
[5 5]]
I wrote the following:
arr3=np.array([[[1,2,3],[1,2,3],[1,2,3],[1,2,3]],[[2,2,3],[4,2,3],[4,2,2],[2,2,2]],[[1,1,1],[1,1,1],[1,1,1],[1,1,1]]])
I expected arr3[0:3,1] to return the same result as arr3[0:3][1], which is:
array([[2, 2, 3],[4, 2, 3],[4, 2, 2],[2, 2, 2]])
But it returns:
array([[1, 2, 3],[4, 2, 3],[1, 1, 1]])
BTW, I am using python3 in Jupyter notebook
When you write arr3[0:3,1], you take elements 0:3 along the first axis and then, for each of those, take the element at index 1 (the second one) along the second axis.
This gives a different result from first taking 0:3 along the first axis with arr3[0:3] and then taking the element at index 1 along the first axis of that result.
The array's shape is (3, 4, 3), so the 0:3 slice just gives you back the whole array in both cases. In the second case it does nothing at all. In the first case it acts as a placeholder so that you can index the second axis, but for that you should just use a colon: [:, some_index].
See how it's the same array?
>>> arr3[0:3]
array([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]],
[[2, 2, 3],
[4, 2, 3],
[4, 2, 2],
[2, 2, 2]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]]])
But then when you do arr3[:, 1] you are taking the second element from the second axis of the array so that will give you:
array([[1, 2, 3],
[4, 2, 3],
[1, 1, 1]])
whereas in the other case, you are taking the second element from the first axis of the array, so:
array([[2, 2, 3],
[4, 2, 3],
[4, 2, 2],
[2, 2, 2]])
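The difference can be checked directly with a short sketch. The names joint and chained are introduced here for illustration.

```python
import numpy as np

arr3 = np.array([[[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]],
                 [[2, 2, 3], [4, 2, 3], [4, 2, 2], [2, 2, 2]],
                 [[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]])

# One indexing operation: slice axis 0, take index 1 along axis 1
joint = arr3[0:3, 1]

# Two chained operations: slice axis 0, then take index 1 along axis 0 of the result
chained = arr3[0:3][1]

print(joint.shape)    # (3, 3): one row from each of the three blocks
print(chained.shape)  # (4, 3): the whole second block

# The 0:3 slice covers the full first axis, so these equivalences hold
print(np.array_equal(joint, arr3[:, 1]))  # True
print(np.array_equal(chained, arr3[1]))   # True
```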
To read further about numpy indexing, take a look at this page on scipy.
Take note of this specific description which applies directly to your problem:
When there is at least one slice (:), ellipsis (...) or np.newaxis in the index (or the array has more dimensions than there are advanced indexes), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element
Let's look at our multidimensional numpy array:
import numpy as np
arr3=np.array([
[
[1,2,3],[1,2,3],[1,2,3],[1,2,3]
],[
[2,2,3],[4,2,3],[4,2,2],[2,2,2]
],[
[1,1,1],[1,1,1],[1,1,1],[1,1,1]
]
])
print(arr3[0:3,1])
That returns:
[[1 2 3]
[4 2 3]
[1 1 1]]
This makes sense: the slice 0:3 keeps all three blocks along the first axis (indices 0 through 2), and the index 1 picks the second row of each block.
However, arr3[0:3][1] first takes the blocks at indices 0 through 2 and then selects the second block (index 1) of that result.
Observe:
print(arr3[0:3])
Returns:
[[[1 2 3]
[1 2 3]
[1 2 3]
[1 2 3]]
[[2 2 3]
[4 2 3]
[4 2 2]
[2 2 2]]
[[1 1 1]
[1 1 1]
[1 1 1]
[1 1 1]]]
It returns an array with the same contents as our current array, because we asked for every block along the first axis. Then we ask for the second block:
print(arr3[0:3][1])
Returns:
[[2 2 3]
[4 2 3]
[4 2 2]
[2 2 2]]
I have a list of x, y coordinates:
import numpy as np
a=np.array([[2,1],[1,3],[1,5],[2,3],[3,5]])
that I've sorted with
a=np.sort(a,axis=0)
print a
>[[1 3] [1 5] [2 1] [2 3] [3 5]]
I'd like to perform a search:
a.searchsorted([2,1])
> ValueError: object too deep for desired array
Any ideas how to do that?
This may work, if I understood what you are asking:
>>> a = [[1, 3], [1, 5], [2, 1], [2, 3], [3, 5]]
>>> [2, 1] in a
True
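The membership test above works on a plain list of lists and only tells you whether the pair is present. If the position in a NumPy array is also needed, one possible sketch is shown below. It assumes the rows should be ordered lexicographically (by x, then by y), as in the output printed in the question; note that np.sort(a, axis=0) sorts each column independently, so np.lexsort is used here instead. The names order, a_sorted, target, matches and pos are introduced for illustration.

```python
import bisect
import numpy as np

a = np.array([[2, 1], [1, 3], [1, 5], [2, 3], [3, 5]])

# Lexicographic row sort: np.lexsort takes the primary key last
order = np.lexsort((a[:, 1], a[:, 0]))
a_sorted = a[order]  # [[1 3] [1 5] [2 1] [2 3] [3 5]]

target = np.array([2, 1])

# Linear scan: indices of rows equal to the target in both columns
matches = np.flatnonzero((a_sorted == target).all(axis=1))

# Binary search: Python lists compare lexicographically, so bisect works on
# the sorted rows once they are converted to nested lists
pos = bisect.bisect_left(a_sorted.tolist(), target.tolist())

print(matches)  # [2]
print(pos)      # 2
```

bisect_left returns an insertion point, so when the target might be absent, check that pos is in range and that a_sorted[pos] equals the target before using it.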