How to compare two NumPy arrays with multiple conditions - python

I have 2 NumPy arrays like the below
array_1 = np.array([1.2, 2.3, -1.0, -0.5])
array_2 = np.array([-0.5, 1.3, 2.5, -0.9])
We can do the element-wise simple arithmetic calculation (addition, subtraction, division etc) easily using different np functions
array_sum = np.add(array_1, array_2)
print(array_sum) # [ 0.7 3.6 3.5 -0.4]
array_sign = np.sign(array_1 * array_2)
print(array_sign) # [-1. 1. 1. -1.]
However, I need to check element-wise multiple conditions for 2 arrays and want to save them in 2 new arrays (say X and Y).
For example, if the two elements have different signs (e.g. the 1st and 3rd element pairs of the given example), then X will contain 0 and Y will be the sum of the positive element and abs(negative element)
X = [0]
Y = [1.7]
When both elements are positive (e.g.: 2nd element pair of the given example) then, X will contain the lower value and Y will contain the greater value
X = [1.3]
Y = [2.3]
If both elements are negative, then, X will be 0 and Y will be the sum of the abs(negative element) and abs(negative element)
So, the final X and Y will be something like
X = [0, 1.3, 0, 0]
Y = [1.7, 2.3, 3.5, 1.4]
I have gone through some posts (this, and this) that describe comparison procedures between 2 arrays, but I can't work out how to handle multiple conditions. The 2 arrays here are very small, but my real arrays are very large (e.g. 2097152 elements per array).
Any ideas are highly appreciated.

Try with numpy.select:
conditions = [(array_1>0)&(array_2>0), (array_1<0)&(array_2<0)]
choiceX = [np.minimum(array_1, array_2), np.zeros(len(array_1))]
choiceY = [np.maximum(array_1, array_2), -np.add(array_1,array_2)]
X = np.select(conditions, choiceX)
Y = np.select(conditions, choiceY, np.add(np.abs(array_1), np.abs(array_2)))
>>> X
array([0. , 1.3, 0. , 0. ])
>>> Y
array([1.7, 2.3, 3.5, 1.4])
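At each position, np.select takes the choice belonging to the first condition that is True and falls back to default (0 unless specified) where no condition matches. That is why the mixed-sign case needs no explicit condition above. A small illustration with made-up values (the array `v` and the choices are hypothetical, not from the question):

```python
import numpy as np

v = np.array([-2, 0, 3, 7])

# Conditions are checked in order; the first True one wins.
conditions = [v < 0, v > 5]
choices = [v * 10, v * 100]

# Positions matching no condition receive the default value.
result = np.select(conditions, choices, default=-1)
print(result)  # [-20  -1  -1 700]
```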

This will do it. It does require vertically stacking the two arrays. I'm sure someone will pipe up if there is a more efficient solution.
import numpy as np
array_1 = np.array([1.2, 2.3, -1.0, -0.5])
array_2 = np.array([-0.5, 1.3, 2.5, -0.9])
def pick(t):
    if t[0] < 0 or t[1] < 0:
        return (0, abs(t[0]) + abs(t[1]))
    return (t.min(), t.max())
print( np.apply_along_axis( pick, 0, np.vstack((array_1,array_2))))
Output:
[[0. 1.3 0. 0. ]
[1.7 2.3 3.5 1.4]]
The second line of the function can also be written:
return (0,np.abs(t).sum())
But since these will only be two-element arrays, I doubt that saves anything at all.
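For large arrays, the per-column Python call made by apply_along_axis can be avoided. The following is a minimal sketch of the same rule as pick, written with whole-array operations on the stacked array; the names `stacked` and `any_negative` are chosen here for illustration only.

```python
import numpy as np

array_1 = np.array([1.2, 2.3, -1.0, -0.5])
array_2 = np.array([-0.5, 1.3, 2.5, -0.9])

# Stack the inputs so each column holds one element pair.
stacked = np.vstack((array_1, array_2))

# True for every column where at least one element is negative.
any_negative = (stacked < 0).any(axis=0)

# Same rule as pick(): (0, |a| + |b|) if any element is negative,
# otherwise (min, max) of the pair.
X = np.where(any_negative, 0.0, stacked.min(axis=0))
Y = np.where(any_negative, np.abs(stacked).sum(axis=0), stacked.max(axis=0))

print(X)
print(Y)
```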

Related

Apply logical and/or operations along an axis in numpy python [duplicate]

For machine learning, I'm applying the Parzen window algorithm.
I have an array (m,n). I would like to check on each row if any of the values is > 0.5 and if each of them is, then I would return 0, otherwise 1.
I would like to know if there is a way to do this without a loop thanks to numpy.
You can use np.all with axis=1 on a boolean array.
import numpy as np
arr = np.array([[0.8, 0.9], [0.1, 0.6], [0.2, 0.3]])
print(np.all(arr>0.5, axis=1))
>> [True False False]
import numpy as np
# Value Initialization
a = np.array([0.75, 0.25, 0.50])
y_predict = np.zeros((1, a.shape[0]))
#If the value is greater than 0.5, the value is 1; otherwise 0
y_predict = (a > 0.5).astype(float)
I have an array (m,n). I would like to check on each row if any of the values is > 0.5
That will be stored in b:
import numpy as np
a = np.array([[0.8, 0.9], [0.1, 0.6], [0.2, 0.3]])  # placeholder: any np.array of shape (m,n)
b = np.any(a > 0.5, axis=1)
and if each of them is, then I would return 0, otherwise 1.
I'm assuming you mean 'and if this is the case for all rows'. In this case:
c = 1 - 1 * np.all(b)
c contains your return value, either 0 or 1.
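The question can be read in more than one way, so the sketch below computes a per-row flag under two readings: 0 if every value in the row exceeds 0.5, or 0 if at least one value does. The array `arr` reuses the example from the first answer; the names `flag_all` and `flag_any` are illustrative.

```python
import numpy as np

arr = np.array([[0.8, 0.9], [0.1, 0.6], [0.2, 0.3]])

# Reading 1: 0 where every value in the row is > 0.5, otherwise 1.
flag_all = 1 - np.all(arr > 0.5, axis=1).astype(int)

# Reading 2: 0 where at least one value in the row is > 0.5, otherwise 1.
flag_any = 1 - np.any(arr > 0.5, axis=1).astype(int)

print(flag_all)  # [0 1 1]
print(flag_any)  # [0 0 1]
```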

How to always round up a XX.5 in numpy

I read that numpy is unbiased in rounding and that it works the way it's designed. That "if you always round 0.5 up to the next largest number, then the average of a bunch rounded numbers is likely to be slightly larger than the average of the unrounded numbers: this bias or drift can have very bad effects on some numerical algorithms and make them inaccurate."
Disregarding this information and assuming that I always want to round up, how can I do it in numpy? Assuming my array can be quite large.
For simplicity, let's assume I have the array:
import numpy as np
A = [ [10, 15, 30], [25, 134, 41], [134, 413, 51]]
A = np.array(A, dtype=np.int16)
decimal = A * .1
whole = np.round(decimal)
decimal looks like:
[[ 1. 1.5 3. ]
[ 2.5 13.4 4.1]
[ 13.4 41.3 5.1]]
whole looks like:
[[ 1. 2. 3.]
[ 2. 13. 4.]
[ 13. 41. 5.]]
As you can see, 1.5 rounded to 2 and 2.5 also rounded to 2. How can I force a XX.5 to always round up? I know I can loop through the array and use Python's round(), but that would definitely be much slower. I was wondering if there is a way to do it using numpy functions.
The answer is almost never np.vectorize. You can, and should, do this in a fully vectorized manner. Let's say that for x >= 0, you want r = floor(x + 0.5). If you want negative numbers to round towards zero, the same formula applies for x < 0. So let's say that you always want to round away from zero. In that case, you are looking for ceil(x - 0.5) for x < 0.
To implement that for an entire array without calling np.vectorize, you can use masking:
def round_half_up(x):
    mask = (x >= 0)
    out = np.empty_like(x)
    out[mask] = np.floor(x[mask] + 0.5)
    out[~mask] = np.ceil(x[~mask] - 0.5)
    return out
Notice that you don't need to use a mask if you round all in one direction:
def round_up(x):
    return np.floor(x + 0.5)
Now if you want to make this really efficient, you can get rid of all the temp arrays. This will use the full power of ufuncs:
def round_half_up(x):
    out = x.copy()
    mask = (out >= 0)
    np.add(out, 0.5, where=mask, out=out)
    np.floor(out, where=mask, out=out)
    np.invert(mask, out=mask)
    np.subtract(out, 0.5, where=mask, out=out)
    np.ceil(out, where=mask, out=out)
    return out
And:
def round_up(x):
    out = x + 0.5
    np.floor(out, out=out)
    return out
import numpy as np
A = [ [1.0, 1.5, 3.0], [2.5, 13.4, 4.1], [13.4, 41.3, 5.1]]
A = np.array(A)
print(A)
def rounder(x):
    if (x-int(x) >= 0.5):
        return np.ceil(x)
    else:
        return np.floor(x)
rounder_vec = np.vectorize(rounder)
whole = rounder_vec(A)
print(whole)
Alternatively, you can also look at numpy.ceil, numpy.floor, numpy.trunc for other rounding styles
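To see how the rounding rules differ on exact halves, the following sketch compares np.round with the two formulas discussed in the first answer. The function names `round_half_toward_pos_inf` and `round_half_away_from_zero` are made up for this illustration, and the sign-based formula is just one alternative way to write the masked version.

```python
import numpy as np

def round_half_toward_pos_inf(x):
    # floor(x + 0.5): halves always go to the next larger integer.
    return np.floor(np.asarray(x, dtype=float) + 0.5)

def round_half_away_from_zero(x):
    # Round |x| half up, then restore the sign.
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.floor(np.abs(x) + 0.5)

halves = np.array([0.5, 1.5, 2.5, -0.5, -1.5, -2.5])

print(np.round(halves))                   # halves go to the nearest even integer
print(round_half_toward_pos_inf(halves))  # 0.5 -> 1, 2.5 -> 3, -2.5 -> -2
print(round_half_away_from_zero(halves))  # 0.5 -> 1, 2.5 -> 3, -2.5 -> -3
```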

How to remove negative outputs from a function?

I have used a function to calculate the difference between 2 values. Printing the output of the function below gives answers ranging from -5 to 4. However, I want the function to display only positive answers (i.e. 1 to 4).
Is it possible to disregard the negative values without changing the boundaries of x or the value of a?
import numpy as np
L = 10
a = 5
def position(x,a):
    return x-a
x = np.arange(0.0, L, 1)
print (position(x,a))
[-5. -4. -3. -2. -1. 0. 1. 2. 3. 4.]
import numpy as np
L = 10
a = 5
def position(x,a):
    return x-a
x = np.arange(0.0, L, 1)
tmp = position(x,a)
print (tmp[tmp>=0])
This may help, an example of filtering a numpy array
import numpy
arr = numpy.array([-1.1, 0.0, 1.1])
print(arr)
bools = arr >= 0.0 # define selection
print(bools)
# filter by "bools"
print(arr[bools])
Based on what I can take from your question, this should work:
result = list(filter(lambda x: x >= 0, position(x, a)))
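Note that a `>= 0` comparison keeps the 0 in the output, whereas the question asks for 1 to 4. A minimal sketch with a strict comparison, reusing position, L and a from the question (the names `values` and `positive_only` are illustrative):

```python
import numpy as np

L = 10
a = 5

def position(x, a):
    return x - a

x = np.arange(0.0, L, 1)
values = position(x, a)

# A boolean mask with a strict comparison drops both negatives and zero.
positive_only = values[values > 0]
print(positive_only)  # [1. 2. 3. 4.]
```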

numpy *= not working

I use NumPy to do matrix multiplication.
If I use t = t * x, it works just fine, but if I use t *= x, it doesn't.
Do I need to use t = t * x?
import numpy as np
if __name__ == '__main__':
    x = [
        [0.9, 0.075, 0.025],
        [0.15, 0.8, 0.05],
        [0.25, 0.25, 0.5]
    ]
    t = [1, 0, 0]
    x = np.matrix(x)
    t = np.matrix(t)
    t = t * x  # works: [[ 0.9 0.075 0.025]]
    # t *= x  # doesn't work? always gives [[0 0 0]]
    print(t)
You filled t with ints rather than floats, so NumPy decides you want a matrix of integer dtype. When you do t *= x, this requests that the operation be performed in place, reusing the t object to store the result. This forces the results to be cast to integers, so they can be stored in t.
Initialize t with floats:
t = np.matrix([1.0, 0.0, 0.0])
I would also recommend switching to plain arrays, rather than matrices. The convenience of * over dot isn't worth the inconsistencies matrix causes. If you're on Python 3.5 or later, you can even use @ for matrix multiplication with regular arrays.
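A minimal sketch of the same computation with plain arrays and the @ operator, assuming Python 3.5+ and a NumPy version that supports @. Starting from float literals avoids the integer-dtype issue described above.

```python
import numpy as np

x = np.array([
    [0.9, 0.075, 0.025],
    [0.15, 0.8, 0.05],
    [0.25, 0.25, 0.5],
])

# Float literals give a float64 array, so results are not truncated.
t = np.array([1.0, 0.0, 0.0])

# Matrix product of a length-3 vector with a 3x3 array; rebinds t to a new array.
t = t @ x
print(t)

# For comparison, integer literals produce an integer dtype.
t_int = np.array([1, 0, 0])
print(t_int.dtype.kind)  # i (signed integer)
```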

Numpy: given a vector of 0s and 1s, how to efficiently use Numpy functions to manipulate values in another array, based on the 0/1 value of the first array?

So, let us say that we are given an array of ints like:
x = numpy.array([0, 1, 1, 0])
We are also given another array, with floats, the same length as x:
y = numpy.array([-1.5, 2.2, -1.0, 1.0])
I want to use x and y to make an array z such that z[i] = y[i] if y[i] <= 0 (regardless of what x[i] is), but z[i] = 0 if x[i] = 1 AND y[i] > 0. So, using our example arrays:
z = [-1.5, 0, -1.0, 1.0]
This would be easy to do if I were using Python for loops, but I don't want to use Python for loops. Another idea is to write it using for loops, and then simply use something like Cython or Numba to speed up the for loop.
However, I want to use Numpy functions as much as possible (that's what makes this question a question), but I don't really see how. Maybe using masks? How would you do it?
Method #1: enforce the condition directly.
>>> z = y.copy()
>>> z[(x == 1) & (y > 0)] = 0
>>> z
array([-1.5, 0. , -1. , 1. ])
Method #2: use np.where:
>>> np.where((x == 1) & (y > 0), 0, y)
array([-1.5, 0. , -1. , 1. ])
