I have an array x and I want to apply a function f to every element of the array that meets some condition. Does NumPy offer a mechanism to make this easy?
Here's an example. My matrix x is supposed to contain only elements in the exclusive range (0, 1). However, due to rounding errors, some elements can be equal to 0 or 1. For every element in x that is exactly 0 I want to add epsilon and for every element that is exactly 1 I want to subtract epsilon.
Edit: (This edit was made after I had accepted askewchan's answer.) Another way to do this is to use numpy.clip.
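A minimal sketch of the numpy.clip approach mentioned in the edit above. The sample array and the epsilon value are illustrative assumptions, not taken from the original data:

```python
import numpy as np

x = np.array([0.0, 0.1, 0.5, 1.0])
epsilon = 1e-5  # illustrative value, pick one that suits your data

# Values below epsilon are raised to epsilon,
# values above 1 - epsilon are lowered to 1 - epsilon.
clipped = np.clip(x, epsilon, 1 - epsilon)
print(clipped)
```

Note that, unlike the exact-equality approach, clipping also moves any value that lies strictly between 0 and epsilon (or between 1 - epsilon and 1).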
You can do this:
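import numpy as np
a = np.array([0,.1,.5,1])
epsilon = 1e-5
a[a==0] += epsilon  # add epsilon where the element is exactly 0
a[a==1] -= epsilon  # subtract epsilon where the element is exactly 1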
The reason this works is that a==0 returns a boolean array, just like what Валера Горбунов referred to in their answer:
In : a==0
Out: array([True, False, False, False], dtype=bool)
You then use that array as an index into a, which selects the elements where the mask is True and skips those where it is False. There's a lot you can do with this; see http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
Sorry this isn't more concrete, but you could create a Boolean array that has a True value for every position that meets your condition and False for those that don't.
For something like [0, 1, 0, 0], testing for 1 gives you the array [False, True, False, False]. You can then compute [0, 1, 0, 0] - epsilon * [False, True, False, False], which leaves the 0 values unaffected.
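The idea above can be sketched in NumPy as follows. The array contents and the epsilon value are assumptions chosen for illustration:

```python
import numpy as np

x = np.array([0.0, 1.0, 0.0, 0.0])
epsilon = 1e-5  # illustrative value

is_one = (x == 1)  # Boolean array: True only where x equals 1

# In arithmetic, True behaves as 1 and False as 0,
# so epsilon is subtracted only at the matching position.
result = x - epsilon * is_one
print(result)
```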
You can use map() as documented at http://docs.python.org/2/tutorial/datastructures.html#functional-programming-tools:
def applyEpsilon(value):
    myEpsilon = 0.001
    if value == 0:
        return myEpsilon
    elif value == 1:
        return 1-myEpsilon
    return value
inputList = [0, 0.25, 0.5, 0.75, 0.99, 1]
# In Python 3, map() returns an iterator, so wrap it in list() before printing
print(list(map(applyEpsilon, inputList)))
Yields:
[0.001, 0.25, 0.5, 0.75, 0.99, 0.999]
import numpy as np
x = np.array([1, -1, 2, 5, 7])
print(sum(x%2==0))
This is the code, and I don't understand what sum(x%2==0) means.
Does it sum the even numbers?
I'm studying for a school test, and my professor said the output of the above code is 1.
x % 2 == 0 evaluates to the array [False, False, True, False, False]
Each element is replaced by a boolean that is True if the number is even and False if it is odd
Then the sum is evaluated, where False = 0 and True = 1
0 + 0 + 1 + 0 + 0 = 1
import numpy as np
x = np.array([1, -1, 2, 5, 7])
# step 1: create an intermediate boolean array that is True where the element is even (element % 2 == 0) and False otherwise
y = x % 2 == 0 # [False, False, True, False, False]
# step 2: sum the intermediate array up. In this case the False values count as 0 and the True values as 1. There is one True value so the sum is 1
z = sum(y) # 1
For your purposes, here's an explanation. For Stack Overflow's purposes, I'm recommending to close this question as it's more coding help than a novel coding question.
The operations in this expression are as follows:
# operation 1
intermediate_result_1 = x%2
# operation 2
intermediate_result_2 = (intermediate_result_1 == 0)
# operation 3
sum(intermediate_result_2)
Operation 1: the modulo operator essentially returns the remainder when the first term is divided by the second term. Most basic mathematical operations (e.g. +,-,*,/,%,==,!=, etc) are implemented element-wise in numpy, which means that the operation is performed independently on each element in the array. Thus, the output from operation 1:
intermediate_result_1 = np.array([1,1,0,1,1])
Operation 2: same for the equality operator ==. Each element of the array is compared to the right-hand value, and the resulting array has True (or 1) where the equality expression holds, and False (or 0) otherwise.
intermediate_result_2 = np.array([False, False, True, False, False])
Operation 3: Lastly, the built-in sum() function adds up all values in the array, counting True as 1 and False as 0. Note that numpy provides its own sum function, which also allows summing along individual dimensions. The sum of this array's elements is 1.
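As a rough illustration of the point about numpy's own sum function, here is a small sketch comparing the built-in sum() with np.sum and np.count_nonzero. The 2-D array m is a made-up example, not part of the question:

```python
import numpy as np

x = np.array([1, -1, 2, 5, 7])
mask = (x % 2 == 0)  # True where the element is even

builtin_total = sum(mask)             # built-in sum, iterates over the array
numpy_total = np.sum(mask)            # NumPy's sum
nonzero_total = np.count_nonzero(mask)  # counts the True entries directly

# With a 2-D array, axis selects the dimension to sum along.
m = np.array([[1, 2], [4, 6]])  # hypothetical example data
even_per_row = np.sum(m % 2 == 0, axis=1)
print(builtin_total, numpy_total, nonzero_total, even_per_row)
```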
NumPy makes it easy to operate on the whole array object.
As many answers already point out,
x%2==0 returns [False, False, True, False, False].
If you are still confused, try to think of it like this:
Let's write a function that checks whether a value is even.
def is_even(ele):
    return ele%2==0
then we use the map function
map() function returns a map object(which is an iterator) of the
results after applying the given function to each item of a given
iterable (list, tuple etc.)
NOTE: copied from GeeksforGeeks
then we take a simple list and map it with this function like so:
l=[1, -1, 2, 5, 7] # this is not a np array
print(list(map(is_even, l))) # this prints [False, False, True, False, False]; list() is needed because map() returns an iterator
print(sum(map(is_even, l))) # this prints 1
The title is actually misleading, but I didn't know how to describe my problem in a short sentence. I don't care about inflection point, but I care about the point where the values switch from x > 1 to x < 1.
Consider the following array:
a = np.array([0.683, 0.819, 0.678, 1.189, 1.465, 0.93 , 0.903, 1.321, 1.321, 0.785, 0.875])
# do something... and here's what I want:
np.array([True, False, False, False, False, True, True, False, False, True, True])
Here are the rules:
First point in array is the starting point, and is always marked True
In order for a value to be marked True, it must be smaller than 1 (x < 1).
However, even if a value is smaller than 1, if it's between the first value smaller than 1 and the first value greater than 1, mark it as False.
In case my explanation doesn't make sense, here's the picture of what I want to do:
The decimal values in the array a are just ratios: current point / previous point. How can I do this in Python?
The code below does what you asked. Unfortunately, it doesn't use a list comprehension.
The first thing I did was write a function that finds the indexes of the first value below 1 and the first value above 1.
import numpy as np
a = np.array([0.683, 0.819, 0.678, 1.189, 1.465, 0.93 , 0.903, 1.321, 1.321, 0.785, 0.875])
### if a number is below 1 but in a position between the first value below 1 and the first value above 1
### then it's False
## find the two indexes of the first value below 1 and the first value above 1
def find_indx(a):
    first_min=0
    for i in range(len(a)):
        if(a[i]<1):
            first_min=i
            break
    first_max=0
    for i in range(len(a)):
        if(a[i]>1):
            first_max=i
            break
    return([first_min,first_max])
Using this function you can set to False the values that are below 1 but lie in the interval between the first value below 1 and the first value above 1.
The two indexes are stored in "false_range".
Once you have that it's quite easy. The first point is always True.
If the indexes are inside the "false_range" and the values are below 1, they become False.
If the points are outside the "false_range", their value depends on whether they are above or below 1.
false_range=find_indx(a)
truth_list=[]
for i in range(len(a)):
    ## the first value is always True
    if(i==0):
        truth_list.append(True)
    else:
        ## if the index is inside the false_range and
        ## this value is below 1 assign False
        if(i>false_range[0] and i<false_range[1] and a[i]<1):
            truth_list.append(False)
        ## in all the other cases it depends only on whether the value is below or above 1
        elif(a[i]>1):
            truth_list.append(False)
        elif(a[i]<1):
            truth_list.append(True)
print(truth_list)
[True, False, False, False, False, True, True, False, False, True, True]
The printed list corresponds to the one you gave, but please test this solution before using it.
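For comparison, the same three rules can probably be expressed without explicit loops. This is only a sketch of one possible vectorized version, not the method used in the answer above; the names mask, first_below and first_above are introduced here for illustration:

```python
import numpy as np

a = np.array([0.683, 0.819, 0.678, 1.189, 1.465, 0.93, 0.903, 1.321, 1.321, 0.785, 0.875])

mask = a < 1                    # rule 2: only values below 1 can be True
first_below = np.argmax(a < 1)  # index of the first value below 1 (0 if there is none)
first_above = np.argmax(a > 1)  # index of the first value above 1 (0 if there is none)

# rule 3: values strictly between those two positions are marked False
mask[first_below + 1:first_above] = False
# rule 1: the first point is always True
mask[0] = True
print(mask.tolist())
```

As with the loop version, it is worth testing this on your own data, especially on arrays that contain no value above 1 or values exactly equal to 1.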
I'm studying GANs (and I'm a beginner in Python) and I found this part of the code in the previous exercises that I don't understand. Specifically, I don't understand why the boolean expression on the 9th line (Xk = X[Y == k]) is used, for the reasons I give below.
import numpy as np
class BayesClassifier:
    def fit(self, X, Y):
        # assume classes are numbered 0...K-1
        self.K = len(set(Y))
        self.gaussians = []
        self.p_y = np.zeros(self.K)
        for k in range(self.K):
            Xk = X[Y == k]
            self.p_y[k] = len(Xk)
            mean = Xk.mean(axis=0)
            cov = np.cov(Xk.T)
            g = {'m': mean, 'c': cov}
            self.gaussians.append(g)
        # normalize p(y)
        self.p_y /= self.p_y.sum()
That boolean returns 0 or 1 depending on whether Y == k is true, so Xk will always be the first or the second value of the X list. I don't see the point of that.
In the 10th line, len(Xk) will always be 1, so why does it use that expression instead of a literal 1?
The mean and covariance on the following lines are computed from only one value each time.
I feel that I'm not understanding something very basic.
You should take into account that X and Y are NumPy arrays, not scalars, and some operators are overloaded for them, in particular == and Boolean-based indexing. == is an element-wise comparison, not a whole-array comparison.
See how it works:
In [9]: Y = np.array([0,1,2])
In [10]: k = np.array([0,1,3])
In [11]: Y==k
Out[11]: array([ True, True, False])
So, the result of == is a Boolean array.
In [12]: X=np.array([0,2,4])
In [13]: X[Y==k]
Out[13]: array([0, 2])
The result is an array with elements selected from X when the condition is True
Hence len(Xk) will be the number of elements selected from X, i.e. the number of positions where Y == k is True.
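In the classifier code from the question, k comes from range(self.K) and is therefore a plain integer; NumPy compares it against every element of Y. A small sketch of that case, using made-up toy data (the values of X and Y are assumptions for illustration):

```python
import numpy as np

# Hypothetical toy data: 5 samples with 2 features each, labels 0 or 1.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]])
Y = np.array([0, 1, 0, 1, 1])

k = 1
mask = (Y == k)  # the scalar k is compared with every label
Xk = X[mask]     # keeps only the rows of X whose label equals k

print(mask)
print(len(Xk))          # number of samples in class k
print(Xk.mean(axis=0))  # per-feature mean over those samples
```

So len(Xk) is the class count rather than a constant 1, and the mean and covariance are computed over all samples of that class.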
Thanks, Artem.
You are right. I found another answer through another channel; here it is:
It's a Numpy array - it's a special feature of NumPy arrays called
boolean indexing that lets you filter out only the values in the array
where the filter returns True:
https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html?fbclid=IwAR3sGlgSwhv3i7IETsIxp4ROu9oZvNaaaBxZS01DrM5ShjWWRz22ShP2rIg#boolean-or-mask-index-arrays
import numpy as np
a = np.array([1, 2, 3, 4, 5])
filter = a > 3
print(filter)
[False, False, False, True, True]
print(a[filter])
[4, 5]
Suppose I have a 1D numpy array (A) containing 5 elements:
A = np.array([ -4.0, 5.0, -3.5, 5.4, -5.9])
I need to add 5 to all the elements of A that are less than zero. What is the NumPy way to do this without a for loop?
It can be done using mask:
A[A < 0] += 5
This works because the expression A < 0 returns a boolean array in which each cell holds the result of the predicate applied to the matching cell of A. In the current example:
A < 0 # [ True False True False True]
And then, the action is applied only on the cells that match the predicate. So in this example, it works only on the True cells.
I found another answer:
A = np.where(A<0, A+5, A)
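A short sketch contrasting the two approaches: np.where builds a new array, while the masked += modifies the array in place. The names B and A_in_place are introduced here for illustration only:

```python
import numpy as np

A = np.array([-4.0, 5.0, -3.5, 5.4, -5.9])

# np.where picks A + 5 where the condition holds and A elsewhere,
# returning a new array and leaving A untouched.
B = np.where(A < 0, A + 5, A)

# The masked assignment changes the array it is applied to.
A_in_place = A.copy()
A_in_place[A_in_place < 0] += 5

print(B)
print(A_in_place)
```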
So I am writing a sudoku solver using a 9x9 array for the grid and a 9x9x9 array for its possibilities. Due to the backtracking algorithm I am using, I have to check whether the Sudoku is still valid, aka:
If there is a field that does not contain a number and has no possibilities left, return False.
So far I implemented that as follows:
for j in range(9):
    for i in range(9):
        if puzzle[j,i] == 0 and (solving[j,i] == 0).sum() == 9:
            return False
return True
If the square at (j,i) contained, for example, the options 2, 3 and 7, the according possibility array would be:
solving[j,i] = array([0, 2, 3, 0, 0, 0, 7, 0, 0], dtype = int8)
I am trying to avoid such loops here and elsewhere in my solver. I tried this as follows:
solving[where(puzzle == 0)]
This returned an n x 9 array, where n is the number of empty squares in my puzzle. I want the function to return False if any of the n rows of length 9 is exactly equal to
array([0,0,0,0,0,0,0,0,0], dtype = int8)
However, I did not have any success with various tries including .any() and .all() experiments. This is the latest code I was using:
def is_valid(self, puzzle, solving):
    if (solving[py.where(puzzle == 0)].all() == 0).sum() > 0:
        return False
    return True
... which always returns False. What's the proper way to code this?
I think the following is the check you want to do:
np.all(puzzle | np.any(solving, axis=-1))
This returns True if all cells either have a value assigned (puzzle), or (|) have at least a non-zero entry in solving (np.any(solving, axis=-1)).
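A minimal sketch of how this check might be wrapped in a function and tried out, assuming puzzle is a 9x9 integer array with 0 for empty squares and solving is a 9x9x9 array with 0 for eliminated options. The standalone is_valid function and the test data are assumptions, not the asker's actual solver:

```python
import numpy as np

def is_valid(puzzle, solving):
    """Return False if any empty square has no options left."""
    has_value = puzzle != 0                      # square already filled in
    has_option = np.any(solving != 0, axis=-1)   # at least one option remains
    return bool(np.all(has_value | has_option))

# Empty grid where every square still has all nine options.
puzzle = np.zeros((9, 9), dtype=np.int8)
solving = np.tile(np.arange(1, 10, dtype=np.int8), (9, 9, 1))
valid_before = is_valid(puzzle, solving)

# Remove every option from one empty square: the grid becomes invalid.
solving[4, 4] = 0
valid_after = is_valid(puzzle, solving)

# Once that square holds a number, its empty option list no longer matters.
puzzle[4, 4] = 5
valid_filled = is_valid(puzzle, solving)

print(valid_before, valid_after, valid_filled)
```

Comparing explicitly with 0 before combining the masks keeps both operands Boolean, which makes the intent of the | a little easier to read.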