Reduce boolean values in python ndarray using AND - python

I have a NumPy array of shape [3, 1000, 3] with boolean values inside. The first 3 is the batch size, and the values of one batch look like this:
[[False, False, False]
[False, True, True]
[False, False, True]
[True, True, True]
...
]
size (1000, 3)
I want to apply the AND function to each triplet to end up with this new array
[[False]
[False]
[False]
[True]
...
]
size (3, 1000)
Looking through NumPy I didn't find anything useful. I've also tried importing operator and applying reduce(operator.and_, array), but it doesn't work.
Any idea how to solve this?

You can easily do this using np.all.
This will check if all values along the last dimension are True:
y = np.all(arr, axis=-1)
y.shape # (3, 1000)
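A minimal sketch of the reduction, assuming a randomly generated boolean array as stand-in data (the seed and the name rng are illustrative only). It also shows np.logical_and.reduce as an equivalent spelling, and why reduce(operator.and_, array) gives a different result: it combines along the first axis, not the last.

```python
import functools
import operator

import numpy as np

# Stand-in data with the shape from the question (seed chosen arbitrarily).
rng = np.random.default_rng(0)
arr = rng.random((3, 1000, 3)) > 0.5

# AND over each triplet, i.e. along the last axis.
y_all = np.all(arr, axis=-1)

# Same reduction, spelled as an explicit logical AND along the last axis.
y_reduce = np.logical_and.reduce(arr, axis=-1)

# functools.reduce iterates over the FIRST axis, so it ANDs the three
# batches together and returns shape (1000, 3) instead of (3, 1000).
y_wrong_axis = functools.reduce(operator.and_, arr)
```

Both y_all and y_reduce have shape (3, 1000); y_wrong_axis has shape (1000, 3).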

Related

Optimal way to modify value of a numpy array based on condition

I have a numpy.ndarray of the form
import numpy as np
my_array = np.array([[True, True, False], [True, False, True]])
In this example it is a matrix of 3 columns and two rows, but my_array should be thought of as having an arbitrary 2-D shape. On the other hand, I have a numpy.ndarray that represents a vector W with length equal to the number of rows of my_array. This vector has float values, for example W = np.array([10., 1.5]). Additionally, I have a list WT of two-tuples with length equal to that of W, for example WT = [(0,20.), (0,1.)]. These tuples represent mathematical intervals (a,b).
I want to modify the column values of my_array based on the following condition: given a column, we change the i-th element of the column to False (or keep it False if it already was) if the i-th element of W does not belong to the interval given by the i-th two-tuple of WT. For example, the first column of my_array is [True, True], so we have to check whether 10. belongs to (0,20) and whether 1.5 belongs to (0,1); the resulting column should be [True, False].
I have a for loop, but I think there is a smarter way to do this.
Note: I don't need to change values from False to True.
I made this implementation :
import numpy as np
my_array = np.array([[True, True, False], [True, False, True]])
W = np.array([10.0, 1.5])
WT = np.array([[0, 20], [0, 1]])
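# Boolean mask: True where W[i] lies strictly inside the interval WT[i]
i = (W > WT[:, 0]) & (W < WT[:, 1])
print("my_array before", my_array)
# AND the mask into the first column, so a False entry is never flipped to True
my_array[:, 0] &= i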
print("my_array after", my_array)
It will update the column values given your conditions.
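If the same interval test should apply to every column rather than only the first, one possible approach is to broadcast the row mask across the columns. This is a sketch using the example data from the question; the names inside and result are illustrative.

```python
import numpy as np

my_array = np.array([[True, True, False], [True, False, True]])
W = np.array([10.0, 1.5])
WT = np.array([(0.0, 20.0), (0.0, 1.0)])  # one open interval (a, b) per row

# inside[i] is True when a_i < W[i] < b_i
inside = (W > WT[:, 0]) & (W < WT[:, 1])

# inside[:, None] has shape (2, 1) and broadcasts over all columns.
# AND can only turn True into False, never False into True.
result = my_array & inside[:, None]
print(result)
```

Here the second row becomes all False because 1.5 is not in (0, 1), while the first row is unchanged.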

np.meshgrid throws DeprecationWarning or MemoryError for large inputs

For a clustering problem I am trying to create the ideal similarity matrix. That is, I have a one-dimensional array of cluster labels and need to create a two-dimensional binary or boolean matrix with an entry of 1 iff two data points belong to the same cluster.
To do so I use np.meshgrid but it only works for smaller examples. Here's an MWE:
With an array of size 5 it works as desired:
arr = np.random.randint(0, 10, size=5)
print(arr)
mesh_grid = np.meshgrid(arr, arr, sparse=True)
mesh_grid[0] == mesh_grid[1]
gives
[9 0 9 0 7]
array([[ True, False, True, False, False],
[False, True, False, True, False],
[ True, False, True, False, False],
[False, True, False, True, False],
[False, False, False, False, True]])
However, with an array of size 60000 it does not work:
arr = np.random.randint(0, 10, size=60000)
mesh_grid = np.meshgrid(arr, arr, sparse=True)
mesh_grid[0] == mesh_grid[1]
gives
DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
mesh_grid[0] == mesh_grid[1]
Setting sparse=False throws a memory error. Based on this answer, I assume the DeprecationWarning must be due to memory too.
Question: How can I solve this, or is there another, more efficient way to obtain the desired matrix?
If, for example, your array is composed of only 10 different elements (0, 1, 2, 3, ...), then you only need to compare your array with those 10 elements and not with the whole matrix.
So you can do the following operations:
import numpy as np
# Number of different elements
n = 3
# Generate the random label vector as a row, shape (1, 10)
arr = np.random.randint(0, n, size=10)[None, :]
# Create the column vector containing all the different elements, shape (n, 1)
num = np.r_[0:n][:, None]
# Broadcasting the 2 vectors gives an n * 10 matrix: uni[k, j] is True where arr[0, j] == k
uni = arr == num
# Based on the previous result, pick for each data point the row of its own label.
# Index with the 1-D labels arr[0]; indexing with the 2-D arr would give shape (1, 10, 10).
res = uni[arr[0]]  # 10 * 10 matrix
You can use np.unique() to extract the unique values of arr when the labels are not consecutive integers starting at 0.
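A sketch of the np.unique() variant for arbitrary label values. The helper name same_cluster_matrix is hypothetical; return_inverse=True is a documented np.unique option that maps each label to the index of its unique value.

```python
import numpy as np

def same_cluster_matrix(labels):
    """Boolean (n, n) matrix: True where two points share a cluster label."""
    labels = np.asarray(labels).ravel()
    # uniq: sorted distinct labels; inv[j]: index of labels[j] within uniq
    uniq, inv = np.unique(labels, return_inverse=True)
    # One row per distinct label, shape (n_unique, n)
    one_hot = uniq[:, None] == labels[None, :]
    # Pick, for each point, the row belonging to its own label, shape (n, n)
    return one_hot[inv]

labels = np.array([9, 0, 9, 0, 7])
sim = same_cluster_matrix(labels)
print(sim)
```

Note that the result still holds n * n booleans, so for n = 60000 it needs roughly 3.6 GB of memory regardless of how it is built.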

What does `(pred[:, 2:4] > min_wh).all(1)` do in YOLO (Deep Learning)?

I am using YOLOv3 under PyTorch. I encountered the code (pred[:, 2:4] > min_wh).all(1) and do not know what it does. Can anyone help? Thanks!
My question is about the use of ().all(1). I know .all() and .any(), but not .all(1). Please explain .all(1), thanks.
According to the documentation https://pytorch.org/docs/stable/tensors.html#torch.BoolTensor.all
there is all(dim), whose first parameter is dim. It does the same as all(), but only along the selected dimension. Here it is used to select the predictions (rows) whose width and height are both bigger than min_wh.
In your case pred has shape (number_of_predictions, 7) or
[
[x, y, w, h, object_conf, class_conf, class],
[x, y, w, h, object_conf, class_conf, class],
...
]
After pred[:, 2:4] > min_wh the result will be something like
[
[True, False],
[True, True],
[False, False],
...
]
We want to select the rows in which both width and height are bigger than min_wh, so we need to use all(1), because:
all() gives you True if all the elements are True, and False otherwise.
all(0) gives you a tensor with shape (2,), for example [True, False]. The first element is True if all elements in the first column are True, and False otherwise. The second element is True if all elements in the second column are True, and False otherwise.
all(1) gives you a tensor with shape (number_of_predictions,),
where each element is True only if all the elements in that row are True.
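The three cases can be sketched as follows. This uses NumPy arrays instead of PyTorch tensors as a stand-in (ndarray.all(axis) reduces the same way as tensor.all(dim) here); the prediction values and min_wh = 2.0 are made up for illustration.

```python
import numpy as np

min_wh = 2.0  # hypothetical threshold

# Rows: [x, y, w, h, object_conf, class_conf, class] (made-up values)
pred = np.array([
    [10.0, 10.0, 5.0, 1.0, 0.9, 0.8, 0.0],
    [20.0, 20.0, 4.0, 6.0, 0.7, 0.6, 1.0],
    [30.0, 30.0, 1.0, 1.5, 0.5, 0.4, 2.0],
])

mask = pred[:, 2:4] > min_wh   # shape (3, 2): one (w_ok, h_ok) pair per row

everything = mask.all()        # single value: are all entries True?
per_column = mask.all(0)       # shape (2,): one result per column
keep = mask.all(1)             # shape (3,): one result per prediction

filtered = pred[keep]          # only rows with w > min_wh and h > min_wh
print(keep)
```

Only the second prediction has both width and height above the threshold, so filtered contains that single row.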

Check equality of two axes in multidimensional numpy array

I am given a 3-dimensional numpy array of shape (n,m,k). I'd like to view this as a 2-dimensional matrix containing vectors, i.e. an n x m matrix whose entries are vectors of size k. I'd now like to check, for two such arrays of shape (n,m,k), whether entry (x,y,:) in the first array is equal to entry (x,y,:) in the second array. Is there a method to do this in numpy without using loops?
I'd thought about something like A == B conditioned on the first and second axis.
You can use a condition, and ndarray.all together with axis:
import numpy as np
a = np.arange(27).reshape(3,3,3)
b = np.zeros_like(a)
b[0,1,2] = a[0,1,2]
b[1,2,0] = a[1,2,0]
b[2,1,:] = a[2,1,:] # set to the same 3-vector at n=2, m=1
(a == b).all(axis=2) # check whether all elements of last axis are equal
# array([[False, False, False],
# [False, False, False],
# [False, True, False]])
As you can see, for n=2 and m=1 we get the same 3-vector in a and b.
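For floating-point data, exact == can be too strict. A possible variant, assuming a tolerance-based comparison is acceptable, replaces a == b with np.isclose (default tolerances) and keeps the same reduction over the last axis.

```python
import numpy as np

a = np.arange(27, dtype=float).reshape(3, 3, 3)
b = np.zeros_like(a)
# Nearly the same 3-vector at n=2, m=1, off by a tiny rounding-sized amount
b[2, 1, :] = a[2, 1, :] + 1e-12

exact = (a == b).all(axis=-1)          # strict element-wise equality
close = np.isclose(a, b).all(axis=-1)  # equality within default tolerances

print(exact[2, 1], close[2, 1])
```

The strict comparison rejects the vector at (2, 1) because of the tiny offset, while the tolerance-based one accepts it.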

Python list notation, Numpy array notation: predictions[predictions < 1e-10] = 1e-10

I am trying to understand an operation applied to a list. I have a list/array named predictions and am executing the following instruction.
predictions[predictions < 1e-10] = 1e-10
This code snippet is from a Udacity Machine Learning assignment that uses Numpy.
It was used in the following manner:
def logprob(predictions, labels):
    """Log-probability of the true labels in a predicted batch."""
    predictions[predictions < 1e-10] = 1e-10
    return np.sum(np.multiply(labels, -np.log(predictions))) / labels.shape[0]
As pointed out by @MosesKoledoye and various others, it is actually a Numpy array. (Numpy is a Python library)
What does this line do?
As pointed out by @MosesKoledoye, predictions is most likely a numpy array.
The expression predictions < 1e-10 generates a boolean array. At every index where that boolean array is True, the value is changed to 1e-10, i.e. 10^-10.
Example:
>>> a = np.array([1,2,3,4,5]) #define array
>>> a < 3 #define boolean array through condition
array([ True, True, False, False, False], dtype=bool)
>>> a[a<3] #select elements using boolean array
array([1, 2])
>>> a[a<3] = -1 #change value of elements which fit condition
>>> a
array([-1, -1, 3, 4, 5])
The likely reason for doing this in the code is that np.log is applied to predictions afterwards: the logarithm of zero is -inf and the logarithm of a negative number is nan, so zero or negative values are replaced with a very small positive number first.
All elements of the array for which the condition (element < 1e-10) is true are set to 1e-10.
In practice, you are setting a minimum value.
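As a sketch of the "minimum value" reading: the masked assignment gives the same values as np.maximum(predictions, 1e-10), with the difference that the masked assignment modifies the array in place. The sample values below are made up.

```python
import numpy as np

predictions = np.array([0.0, 1e-12, 0.3, 0.7])  # made-up probabilities

# Out-of-place: returns a new array, predictions itself is untouched
floored = np.maximum(predictions, 1e-10)

# In-place, as in the original snippet (done on a copy here)
in_place = predictions.copy()
in_place[in_place < 1e-10] = 1e-10

# With the floor applied, the logarithm is finite everywhere
log_values = np.log(floored)
print(floored, log_values)
```

Whether to modify in place or return a new array depends on whether the caller still needs the original predictions.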
