options for applying conditions to numpy arrays - python

I want to apply conditions to a numpy array and I feel like there is a better way out there. As a toy example say I want to know where the elements are equal to 2 or 3.
import numpy as np
a = np.arange(5)
One way would be to build the condition piece by piece with NumPy functions, like so:
result = np.logical_or(a == 2, a == 3)
One can see how this could get unwieldy with more complicated conditions though. Another option would be to use list comprehensions
result = np.array([x for x in a if x == 2 or x==3])
which is nice because now all my conditional logic can live together in one place but feels a little clunky because of the conversion to and from a list. It also doesn't work too well for multidimensional arrays.
Is there a better alternative that I am missing?

It's useful to point out that in the first example you get a boolean array, not the array [2, 3] (which is what you get in the second example). To recover the result of the second example, you'd need
result = a[result]
However, in this case, since you're using boolean masks (True/False approximately equivalent to 1/0), you can actually use bitwise or to do the same thing as logical_or:
result = a[(a==2) | (a==3)]
A word of caution here: make sure you use parentheses. Otherwise, operator precedence can be a bit nasty in these expressions (| binds tighter than ==).
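To make the precedence caveat concrete, here is a small sketch (the names mask and error are illustrative, not from the original answer). Without parentheses, Python parses a == 2 | a == 3 as the chained comparison a == (2 | a) == 3, which ends up asking for the truth value of a whole array and raises ValueError.

```python
import numpy as np

a = np.arange(5)

# Parenthesised comparisons combine into a single boolean mask.
mask = (a == 2) | (a == 3)
result = a[mask]  # the elements equal to 2 or 3

# Without parentheses the expression is parsed as a == (2 | a) == 3,
# a chained comparison that needs the truth value of an array.
error = None
try:
    a == 2 | a == 3
except ValueError as exc:
    error = exc

print(result, type(error).__name__)
```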

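You can remove elements from a NumPy array with np.delete, which takes the indices to remove:
np.delete(a,[0,1,4])
or, if you would rather specify the elements to keep, take the complement (this works here only because the values of a happen to equal their indices):
np.delete(a,np.delete(a,[2,3]))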
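Since np.delete works on positions rather than values, a boolean mask is usually the safer tool when the condition is about values. The following is a minimal sketch on made-up data (the array b and the variable names are assumptions for illustration); np.isin is swapped in here to build the mask and is not part of the original answer.

```python
import numpy as np

# Hypothetical data whose values differ from their indices.
b = np.array([10, 20, 30, 40, 50])

# np.delete removes by position: drop indices 0, 1 and 4.
kept_by_index = np.delete(b, [0, 1, 4])

# A boolean mask selects by value instead.
keep_mask = np.isin(b, [30, 40])      # True where the value should be kept
kept_by_value = b[keep_mask]
dropped_by_value = b[~keep_mask]      # ~ takes the complement of the mask

print(kept_by_index, kept_by_value, dropped_by_value)
```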

You can combine boolean masks with & (and |) to get arbitrarily complex results:
>>> A = np.random.randint(0, 100, 25).reshape(5,5)
>>> A
array([[98, 4, 46, 40, 24],
[93, 75, 36, 19, 63],
[23, 10, 62, 14, 59],
[99, 24, 57, 78, 74],
[ 1, 83, 52, 54, 27]])
>>> A>10
array([[ True, False, True, True, True],
[ True, True, True, True, True],
[ True, False, True, True, True],
[ True, True, True, True, True],
[False, True, True, True, True]], dtype=bool)
>>> (A>10) & (A<20)
array([[False, False, False, False, False],
[False, False, False, True, False],
[False, False, False, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
>>> (A==19) | (A==14) # same output
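You can also write a function and use map to call it on each element. Inside the function you can have as many tests as you wish. The output below is from Python 2, where map returns a list; in Python 3, map returns an iterator, so wrap the call in list(...) to see the values:
>>> def test(ele):
...     return ele==2 or ele==3
...
>>> map(test,np.arange(5))
[False, False, True, True, False]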
You can use numpy.vectorize:
>>> def test(x):
...     return x>10 and x<20
...
>>> v=np.vectorize(test)
>>> v(A)
array([[False, False, False, False, False],
[False, False, False, True, False],
[False, False, False, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)

Related

Finding indices where True values connect vertically and horizontally with numpy

I want to merge certain values that are numerically close.
In this example I'll look to merge anything that can be connected via a neighboring value with a difference of less than 2.
import numpy as np
a = np.arange(10)
a = np.delete(a, (3, 7))
matrix = np.abs(a.reshape(-1,1) - a)
matrix < 2
array([[ True, True, False, False, False, False, False, False],
[ True, True, True, False, False, False, False, False],
[False, True, True, False, False, False, False, False],
[False, False, False, True, True, False, False, False],
[False, False, False, True, True, True, False, False],
[False, False, False, False, True, True, False, False],
[False, False, False, False, False, False, True, True],
[False, False, False, False, False, False, True, True]])
Starting at the top left corner:
First move horizontal to find the last true value.
Then go downwards to find the last true value.
Repeat both until the square is found.
Cannot connect True values diagonally.
This would give you the first square, which goes from [0,0] to [2,2].
The desired output of this example would be:
[[0, 2], [3, 5], [6, 7]]
Where the values indicate the beginning and end of the square. Is there a good way of doing this?
I'd prefer not to loop if possible.
So I've done this using np.diagonal as the basis for the logic.
bm = matrix < 2
endcoords = np.argwhere(bm[1:].diagonal() == False)
zers = np.zeros(endcoords.shape)
zers[1:] += endcoords[:-1] + 1
end = np.hstack((zers, endcoords))
end
array([[0., 2.],
[3., 5.]])
I know this excludes the last coordinate, which for my example doesn't give the 'correct' answer. This however, does give the right answer for my workflow as I'm chunking arrays together - so the last missing coordinate will be the first of my next array. It shouldn't be too hard to finagle the right answer from this anyways.
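For readers who do want the final block included, one possible variation on the diagonal idea is sketched below. It is an assumption-laden sketch rather than the author's workflow: it treats every False on the first sub-diagonal as a break between blocks, then prepends 0 as the first start and appends the last index as the final end. The names breaks, starts, ends and blocks are made up for illustration.

```python
import numpy as np

a = np.delete(np.arange(10), (3, 7))
bm = np.abs(a.reshape(-1, 1) - a) < 2

# bm[1:].diagonal() holds bm[i + 1, i]; False there means element i
# is the last member of its block.
breaks = np.flatnonzero(~bm[1:].diagonal())

starts = np.concatenate(([0], breaks + 1))        # each block starts after a break
ends = np.concatenate((breaks, [len(a) - 1]))     # the last block ends at the last index
blocks = np.column_stack((starts, ends))

print(blocks)
```

This also keeps the coordinates as integers, whereas building them from np.zeros yields floats.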

How to apply conditions for rows in a tensor where there is boolean values

I have the following tensor:
predictions = torch.tensor([[ True, False, False],
[False, False, True],
[False, True, True],
[ True, False, False]])
I applied conditions along the axis like below.
new_pred= []
if predictions == ([True,False,False]):
    new_pred = torch.Tensor(0)
if predictions == ([False,False,True]):
    new_pred = torch.Tensor(2)
if predictions == ([False,True,True]):
    new_pred = torch.Tensor(2)
So I want the final output (new_pred) to be:
tensor([0, 2, 2, 0])
But I am getting a blank [] for the new_pred tensor. I think my logic must be flawed since nothing is getting stored in the new_pred. Can someone help me write this logic accurately?
predictions is a torch.Tensor, while ([True, False, False]) is a list. First, you have to make sure both sides have the same type.
predictions == torch.tensor([True,False,False])
>>> tensor([[ True, True, True],
[False, True, False],
[False, False, False],
[True, True, True]])
Even then, you are still comparing a 2D tensor to a 1D tensor, and the resulting boolean tensor is ambiguous in an if statement. An easy way to fix this is to write a for loop that compares each row of predictions to the conditions and appends the result to a list. Note that each comparison is between two boolean tensors of size three, so you have to make sure the result is True for every cell (hence the call to .all()).
predictions = torch.tensor([[ True, False, False],
[False, False, True],
[False, True, True],
[ True, False, False]])
conditions = torch.tensor([[True,False,False],
[False,False,True],
[False,True,True]])
new_predict = []
for index in range(predictions.size(0)):
    if (predictions[index] == conditions[0]).all():
        new_predict.append(0)
    # ...
Alternatively, you can use slicing to achieve your expected result without any for loop.
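One loop-free way to do this is to compare every row against every condition at once with broadcasting. The sketch below uses NumPy so that it runs without torch installed; the same expression should carry over to torch tensors (with dim in place of axis), but that is an assumption and has not been checked here. The labels array and the requirement that every row matches one of the listed conditions are also assumptions taken from the question's expected output.

```python
import numpy as np

predictions = np.array([[True, False, False],
                        [False, False, True],
                        [False, True, True],
                        [True, False, False]])
conditions = np.array([[True, False, False],
                       [False, False, True],
                       [False, True, True]])
labels = np.array([0, 2, 2])  # output value for each condition row

# Shape (4, 1, 3) against (1, 3, 3) broadcasts to (4, 3, 3);
# .all(axis=2) says whether row i equals condition j in every cell.
matches = (predictions[:, None, :] == conditions[None, :, :]).all(axis=2)

# Assumes each row matches some condition; argmax picks the first match.
new_pred = labels[matches.argmax(axis=1)]
print(new_pred)
```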

Python 2d array boolean reduction

I've got a 2D array comprised of boolean values (True,False). I'd like to consolidate the array to a 1D based on a logical function of the contents.
e.g.
Input:
[[True, True, False],
[False, False, False],
[True, True, True]]
Output (logical AND):
[False,
False,
True]
How would this be done without a loop ?
You can use Python's built-in all function with a list comprehension:
[all(x) for x in my_list]
If that's still too loopy for you, combine it with map:
map(all, my_list)
Note that map doesn't return a list in Python 3. If you want a list as your result, you can call list(map(all, my_list)) instead.
You can do this without NumPy too. Here is one solution using a list comprehension. Explanation: it loops over the sub-lists and outputs False if any item in a sub-list is False, otherwise True.
inp = [[True, True, False],[False, False, False],[True, True, True]]
out = [False if False in i else True for i in inp]
print (out)
# [False, False, True]
Alternative (less verbose) as suggested by Jean below:
out = [False not in i for i in inp]
I'm assuming you want to apply logical ANDs to the rows. You can apply numpy.all.
>>> import numpy as np
>>> a = np.array([[True, True, False], [False, False, False], [True, True, True]])
>>> a
array([[ True, True, False],
[False, False, False],
[ True, True, True]])
>>>
>>> np.all(a, axis=1)
array([False, False, True])
For a solution without numpy, you can use operator.and_ and functools.reduce.
>>> from operator import and_
>>> from functools import reduce
>>>
>>> lst = [[True, True, False], [False, False, False], [True, True, True]]
>>> [reduce(and_, sub) for sub in lst]
[False, False, True]
edit: actually, reduce is a bit redundant in this particular case.
>>> [all(sub) for sub in lst]
[False, False, True]
does the job just as well.
You can do this with numpy with the numpy.all function:
>>> import numpy as np
>>> arr = np.array([[True, True, False],
... [False, False, False],
... [True, True, True]]
... )
>>> np.all(arr, axis=1)
array([False, False, True])
Here the i-th element is True if all elements of the i-th row are True, and False otherwise. Note that the list should be rectangular (all sublists should contain the same number of booleans).
In "pure" Python, you can use the all function as well, like:
>>> data = [[True, True, False], [False, False, False], [True, True, True]]
>>> list(map(all, data))
[False, False, True]
This approach will work as well if the "matrix" is not rectangular. Note that for an empty sublist this will return True, since all() of an empty iterable is vacuously True.
You can also do this with map and reduce:
from functools import reduce
l = [[True, True, False],
[False, False, False],
[True, True, True]]
final = list(map(lambda x: reduce(lambda a, b: a and b, x), l))
print(final)
# [False, False, True]
The benefit here is that you can change the reduce function to something else (say, an OR or something more adventurous).
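The same flexibility is available in NumPy through the reduce method of its logical ufuncs. This is a minimal sketch, with variable names chosen for illustration: swapping the ufunc changes the row-wise logical function without writing a loop.

```python
import numpy as np

a = np.array([[True, True, False],
              [False, False, False],
              [True, True, True]])

row_and = np.logical_and.reduce(a, axis=1)  # equivalent to np.all(a, axis=1)
row_or = np.logical_or.reduce(a, axis=1)    # equivalent to np.any(a, axis=1)
row_xor = np.logical_xor.reduce(a, axis=1)  # True for an odd number of True values

print(row_and, row_or, row_xor)
```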

Numpy: Create a mask array to select rectangle

Is there a way to generate an array, masking a rectangular area, without the need to initialize an empty array first?
mask = np.zeros((10,10), dtype=bool)
mask[10//2:,10//2:] = True
I'm looking for a NumPy expression that would replace these two lines and generate mask.
If you really, really want to have a one-liner, this is a possible way. But I'd say your original code is cleaner, clearer, and overall better...
>>> np.logical_and.outer(np.arange(10) >= 5, np.arange(6) >= 3)
array([[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, True, True, True],
[False, False, False, True, True, True],
[False, False, False, True, True, True],
[False, False, False, True, True, True],
[False, False, False, True, True, True]], dtype=bool)
That can be done in one line:
mask = np.fromfunction(lambda i, j: (i >= 5) * (j >= 5), (10, 10), dtype=int)
but this turns out to be quite a bit slower than the original implementation. (It creates two arrays holding row and column indices).
You can use np.empty if you don't want to initialize the array and set the values manually. In this specific case, however, you might be better off using np.zeros or np.ones and flip the values for whichever is the smallest number of entries that you need to update.
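Another single-expression option, offered here as a sketch rather than a recommendation, is to broadcast a column of row conditions against a row of column conditions. The names n, idx and expected are illustrative; the check at the end compares the result with the two-line version from the question.

```python
import numpy as np

n = 10
idx = np.arange(n)

# (n, 1) boolean column & (n,) boolean row broadcasts to an (n, n) mask.
mask = (idx[:, None] >= n // 2) & (idx >= n // 2)

# Reference: the original two-line construction.
expected = np.zeros((n, n), dtype=bool)
expected[n // 2:, n // 2:] = True

print(np.array_equal(mask, expected))
```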

Small clarification needed on numpy.any for matrices

I am having a slight problem in getting numpy.any() to work fine on my problem.
Suppose I have a 3D array of shape N x M x M, and I need to get rid of any M x M matrix whose elements are all the same (all zeros, say).
Here is an example to illustrate my issue
x = np.arange(250).reshape(10,5,5)
x[0,:,:] = 0
What I need to do is get rid of the first 5x5 matrix, since it contains all zeros.
So I tried with
np.any(x,axis=0)
and expected to have a results of
[FALSE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE]
but what i get is
array([[ True, True, True, True, True],
[ True, True, True, True, True],
[ True, True, True, True, True],
[ True, True, True, True, True],
[ True, True, True, True, True]], dtype=bool)
Applying the following gives what I want, but I hope there is a better way without any loops:
y = []
for i in range(x.shape[0]):
    y.append(np.any(x[i,:,:]))
Did I make a mistake somewhere here?
Thanks!
In a 10x5x5 matrix with x[0,:,:] = 0 I would expect a result of:
[False, True, True, True, True, True, True, True, True, True]
because the all-zero array is the first of ten 5x5 arrays, not the first of five.
You get this result using
x.any(axis=1).any(axis=1)
or
x.any(axis=2).any(axis=1)
which means you first eliminate the second (axis=1) or the third (axis=2) dimension, then the remaining second one (axis=1), leaving only one dimension, which was originally the first (axis=0).
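NumPy's any also accepts a tuple of axes, so both trailing dimensions can be reduced in one call. The sketch below applies that to the example from the question and then uses the resulting mask to drop the all-zero matrix; the names keep and filtered are illustrative.

```python
import numpy as np

x = np.arange(250).reshape(10, 5, 5)
x[0, :, :] = 0

# Reduce over the last two axes at once: one boolean per 5x5 matrix.
keep = x.any(axis=(1, 2))

# Boolean indexing along the first axis drops the all-zero matrix.
filtered = x[keep]

print(keep, filtered.shape)
```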
