I have two numpy arrays with floating point values and I am trying to find the indices where the numbers are approximately equal (floating point comparisons).
So something like:
x = np.random.rand(3)
y = np.random.rand(3)
x[2] = y[2]
# Do the comparison and it should return 2 as the index
I tried something like
np.where(np.allclose(x, y))
However, this returns an empty array. If I do:
np.where(x == y) # This is fine.
I tried using a combination of numpy.where and numpy.allclose but could not make it work. Of course, I can do it with a loop but that seems tedious and unpythonic.
What you are looking for is np.isclose:
np.where(np.isclose(x, y))
You can always use something relying on:
np.where( np.abs(x-y) < epsilon )
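The attempt in the question fails because np.allclose collapses the whole comparison into a single True/False, whereas np.isclose returns one boolean per element, which is what np.where needs. A minimal sketch, using fixed values in place of np.random.rand (an assumption made only so the result is reproducible):

```python
import numpy as np

# Fixed values instead of np.random.rand so the outcome is reproducible.
x = np.array([0.10, 0.50, 0.30])
y = np.array([0.20, 0.70, 0.30 + 1e-10])

all_close = np.allclose(x, y)     # one bool for the whole array
mask = np.isclose(x, y)           # one bool per element
indices = np.where(mask)[0]       # positions where the mask is True

print(all_close)   # False, because the first two pairs differ
print(indices)     # [2]
```

np.isclose also accepts rtol and atol if the default tolerances are too loose or too tight for your data.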
Let's say A and B are given vectors and the aim is to check which elements of Y lie between A and B. For example:
A=np.array([1,2,3,4])
B=np.array([10,20,30,40])
Y=np.array([8,15,0,50])
The expected output should look like:
[1,1,0,0] where the elements are not bool type so I can find out the number of true values, using np.sum()
You can do both comparisons and take the elementwise and:
(A < Y) & (Y < B)
A np.sum() will work regardless of them being boolean. When in doubt, just cast to int using
X.astype(int)
It's as simple as
np.logical_and(A <= Y, Y <= B).astype(int)
But you can also sum a logical vector, numpy will handle the conversion under the hood.
In [1]: np.sum(np.logical_and(A <= Y, Y <= B).astype(int)) == np.sum(np.logical_and(A <= Y, Y <= B))
Out[1]: True
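Putting the pieces from these answers together on the arrays from the question, here is a short sketch using strict bounds (switch to <= if the endpoints should count):

```python
import numpy as np

A = np.array([1, 2, 3, 4])
B = np.array([10, 20, 30, 40])
Y = np.array([8, 15, 0, 50])

inside = (A < Y) & (Y < B)    # boolean mask, strict bounds
as_int = inside.astype(int)   # 1/0 instead of True/False
count = int(inside.sum())     # True counts as 1 when summed

print(as_int)   # [1 1 0 0]
print(count)    # 2
```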
I have a function a=x*V where x assumes thousands of values as x = arange(1,1000,0.1) and V is a combination of other constants. These make a always complex (has nonzero real and imaginary parts). However, because a depends on other values, the imag(a) can be negative for some x's.
For what I am doing, however, I need imag(a) to be always positive, so I need to take the negative values and turn them into positive.
I have tried doing
if imag(a)<0:
    imag(a) = -1*imag(a)
That didn't seem to work because it gives me the error: SyntaxError: Can't assign to function call. I thought it was because it's an array so I tried any() and all(), but that didn't work either.
I'm out of options now.
IIUC:
In [35]: a = np.array([1+1j, 2-2j, 3+3j, 4-4j])
In [36]: a.imag *= np.where(a.imag < 0, -1, 1)
In [37]: a
Out[37]: array([ 1.+1.j, 2.+2.j, 3.+3.j, 4.+4.j])
You can't redefine a function that way. It would be like saying
sqrt(x) = 2*sqrt(x)
What you can do is reassign the value of a (not imag(a)).
if imag(a) < 0:
    a = a - 2j*imag(a)
For example, if a = 3 - 5j, then it would give you
3 - 5j - 2(-5)j = 3 + 5j
Taking the complex conjugate of the offending elements with np.conj appears to be faster than doing the subtraction. For a full function:
import numpy as np
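def imag_abs(x):
    # Elements whose imaginary part is negative
    mask = x.imag < 0
    # Conjugating flips the sign of the imaginary part (modifies x in place)
    x[mask] = np.conj(x[mask])
    return x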
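If you would rather not modify the input array, another option (not taken from the answers above, just a sketch) is to rebuild the values from the real part and the absolute value of the imaginary part:

```python
import numpy as np

a = np.array([1 + 1j, 2 - 2j, 3 + 3j, 4 - 4j])

# Keep the real part, replace the imaginary part by its absolute value.
# This builds a new array and leaves `a` untouched.
result = a.real + 1j * np.abs(a.imag)

print(result)
```

This works for the whole array at once, so no if statement or mask is needed.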
I don't know how to describe this well so I'll just show it.
How do I do this...
for iy in random_y:
    print(x[np.where(y == iy)], iy)
X y
[ 0.5] : 0.247403959255
[ 2.] : 0.841470984808
[ 49.5]: -0.373464754784
without for loops and I get a solution as a single array like when you use np.where() or array[cond]. Since you know, this is Python B)
NOTE: The reason why I want to do this is because I have a random subset of the Y values and I want to find the corresponding X values.
If you are looking for exact matches, you can simply use np.in1d as this is a perfect scenario for its usage, like so -
first_output = x[np.in1d(y,random_y)]
second_output = random_y[np.in1d(random_y,y)]
If you are dealing with floating-point numbers, you might want to build some tolerance into the comparisons. For such cases, you can use NumPy broadcasting and then np.where, like so -
tol = 1e-5 # Edit this to change tolerance
R,C = np.where(np.abs(random_y[:,None] - y)<=tol)
first_output = x[C]
second_output = random_y[R]
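To see what the broadcasting step does, here is a small worked sketch. The x, y and random_y values are made up for illustration; only the technique comes from the answer above.

```python
import numpy as np

# Made-up sample data, for illustration only.
x = np.array([0.5, 2.0, 49.5, 7.0])
y = np.array([0.25, 0.84, -0.37, 0.66])
random_y = np.array([-0.37, 0.25 + 1e-7])   # second value is slightly off

tol = 1e-5
# diff[i, j] = |random_y[i] - y[j]|, shape (len(random_y), len(y))
diff = np.abs(random_y[:, None] - y)
R, C = np.where(diff <= tol)   # R indexes random_y, C indexes y (and x)

matched_x = x[C]
matched_y = random_y[R]

print(matched_x)   # [49.5  0.5]
```

The results come out in the order of random_y, and the intermediate array has len(random_y) * len(y) entries, so this is best suited to moderately sized inputs.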
Maybe this could do the trick(not tested):
print("\n".join(str(x[np.where(y == iy)]) + " " + str(iy) for iy in random_y))
Let's say that I know that x is bigger than y and both x and y are bigger than 0.
Can someone please help me write a function that takes two linear formulas (+/- only) and returns which one is bigger?
For example:
foo("x+y","2*x") #should return 2
foo("2*x","x+y") #should return 1
foo("x","2*y") #should return 0 (can't decide)
Thanks a lot!
The best way to do this in SymPy is to use the assumptions system.
First off, don't try to do tokenizing. Just use sympify if you have to input as strings, and if you don't have to, just create the expressions using symbols, like
from sympy import symbols, ask, Q
x, y = symbols('x y')
a = x - y
b = 2*x
Please read the SymPy tutorial for more information.
The assumptions system doesn't support inequalities directly yet, so to represent x > y, you need to state that x - y is positive. To ask if 2*x > x - y, i.e., if 2*x - (x - y) is positive, given that x, y, and x - y are positive, do
In [27]: ask(Q.positive((2*x) - (x - y)), Q.positive(x) & Q.positive(y) & Q.positive(x - y))
Out[27]: True
The first argument to ask is what you are asking and the second argument is what you are assuming. & is logical and, so Q.positive(x) & Q.positive(y) & Q.positive(x - y) means to assume all three of those things.
It will return False if it knows it is false, and None if it can't determine. Note that SymPy works in a complex domain, so not positive doesn't necessarily mean negative. Hence, you should probably call ask on the negated expression as well if you get None, or call it again with negative instead of positive. If you want to include 0 (i.e., use >= instead of >), use nonnegative instead of positive and nonpositive instead of negative.
It isn't as smart as it could be yet, so you'll get a lot of Nones now when the answer could be known. In particular, I don't think it will really use the x > y fact very well at this point.
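Since ask may not make good use of the x > y fact, one workaround is to encode that fact in the symbols themselves. The sketch below is not the ask-based approach described above: it substitutes x = y + d, where d is a hypothetical helper symbol standing for x - y, declares y and d positive, and then checks the sign of the difference with the is_positive / is_negative properties. The name foo and its return codes follow the question.

```python
from sympy import Symbol, expand, sympify

def foo(first, second):
    """Return 1 if first > second, 2 if second > first, 0 if undecided.

    Assumes x > y > 0, encoded as x = y + d with y > 0 and d > 0.
    """
    y = Symbol("y", positive=True)
    d = Symbol("d", positive=True)   # hypothetical helper: d = x - y
    names = {"x": y + d, "y": y}

    # Parse both strings with x replaced by y + d, then simplify the difference.
    diff = expand(sympify(first, locals=names) - sympify(second, locals=names))
    if diff.is_positive:
        return 1
    if diff.is_negative:
        return 2
    return 0   # sign cannot be decided (this also covers equal expressions)

print(foo("x+y", "2*x"))   # 2
print(foo("2*x", "x+y"))   # 1
print(foo("x", "2*y"))     # 0
```

Note that 0 here means "could not decide", which includes the case where both expressions are identical.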
I tried to find a float number in ndarray. Due to the software package I am using (Abaqus), the precision it outputs is a little bit low. For example, 10 is something like 10.00003. Therefore, I was wondering whether there is a "correct" way to do it, that is neater than my code.
Example code:
import numpy as np
array = np.arange(10)
number = 5.00001
if I do this:
idx = np.where(number==array)[0][0]
Then the result is empty because 5.00001 does not equal 5.
Now I am doing:
atol = 1e-3 # Absolute tolerance
idx = np.where(abs(number-array) < atol)[0][0]
which works, and is not too messy... Yet I was wondering whether there would be a neater way to do it. Thanks!
PS: numpy.allclose() is another way to do it, but I need to use number * np.ones([array.shape[0], array.shape[1]]) and it still seems verbose to me...
Edit: Thank you all so much for the fantastic answers! np.isclose() is the exact function that I was looking for, and I missed it since it is not in the docs... I wouldn't have found it until the docs were updated, if it weren't for you guys. Thank you again!
PS: numpy.allclose() is another way to do it, but I need to use number * np.ones([array.shape[0], array.shape[1]]) and it still seems verbose to me...
You almost never need to do anything like number * np.ones([array.shape[0], array.shape[1]]). Just as you can multiply that scalar number by that ones array to multiply all of its 1 values by number, you can pass that scalar number to allclose to compare all of the original array's values to number. For example:
>>> a = np.array([[2.000000000001, 2.0000000002], [2.000000000001, 1.999999999]])
>>> np.allclose(a, 2)
True
As a side note, if you really do need an array of all 2s, there's an easier way to do it than multiplying 2 by ones:
>>> np.tile(2, a.shape)
array([[2, 2], [2, 2]])
For that matter, I don't know why you need to do [array.shape[0], array.shape[1]]. If the array is 2D, that's exactly the same thing as array.shape. If the array might be larger, it's exactly the same as array.shape[:2].
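If you do want the constant array spelled out, np.full is another way to build it. The sketch below reuses the 2x2 array from the example above and shows that comparing against the explicit array gives the same answer as comparing against the scalar:

```python
import numpy as np

a = np.array([[2.000000000001, 2.0000000002],
              [2.000000000001, 1.999999999]])

twos = np.full(a.shape, 2)                 # constant array with the same shape as a
explicit_result = np.allclose(a, twos)     # compare against the explicit array
scalar_result = np.allclose(a, 2)          # compare against the broadcast scalar

print(explicit_result, scalar_result)   # True True
```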
I'm not sure this solves your actual problem, because it seems like you want to know which values are close and which are not, rather than just whether or not they all are. But you said you could use allclose if it weren't so verbose to create the array to compare with, so that is what I addressed first.
So, if you need whereclose rather than allclose… well, there's no such function. But it's pretty easy to build yourself, and you can always wrap it up if you're doing it repeatedly.
If you had an isclose method—like allclose, but returning a bool array instead of a single bool—you could just write:
idx = np.where(isclose(a, b, 0, atol))[0][0]
… or, if you're doing it over and over:
def whereclose(a, b, rtol=1e-05, atol=1e-08):
    return np.where(isclose(a, b, rtol, atol))
idx = whereclose(a, b, 0, atol)[0][0]
As it turns out, version 1.7 of numpy does have exactly that function (see also here), but it doesn't appear to be in the docs. If you don't want to rely on a possibly-undocumented function, or need to work with numpy 1.6, you can write it yourself trivially:
def isclose(a, b, rtol=1e-05, atol=1e-08):
    return np.abs(a-b) <= (atol + rtol * np.abs(b))
If you have an up-to-date numpy (1.7 or later), then the best way is to use np.isclose, which will broadcast the shapes together automatically:
import numpy as np
a = np.arange(10)
n = 5.000001
np.isclose(a, n).nonzero()
#(array([5]),)
or, if you expect only one match:
np.isclose(a, n).nonzero()[0][0]
#5
(np.nonzero is basically the same thing as np.where except that it doesn't have the if condition then/else capability)
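Applied to the example from the question, a sketch might look like the following. Setting rtol=0 is a choice made here so that only the absolute tolerance matters, mirroring abs(number - array) < atol. np.flatnonzero is used instead of indexing [0][0] directly so that the no-match case can be handled without an IndexError.

```python
import numpy as np

array = np.arange(10)
number = 5.00001
atol = 1e-3   # absolute tolerance, as in the question

# rtol=0 so that only the absolute tolerance is applied.
matches = np.flatnonzero(np.isclose(array, number, rtol=0, atol=atol))
idx = matches[0] if matches.size else None   # None when nothing is close enough

print(idx)   # 5
```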
The method you use above, specifically abs(A - B) < atol, is standard for doing floating point comparisons across many languages. Obviously when using numpy A and/or B can be arrays or numbers.
Here is another approach that might be useful to look at. I'm not sure it applies to your case, but it could be very helpful if you're looking for more than one number in the array (which is a common use case). It's inspired by this question which is kind of similar.
import numpy as np
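def find_close(a, b, rtol=1e-05, atol=1e-08):
    # Tolerance window [lo, hi] around each value in b
    tol = atol + abs(b) * rtol
    lo = b - tol
    hi = b + tol
    # Sort a once, then binary-search both window edges
    order = a.argsort()
    a_sorted = a[order]
    left = a_sorted.searchsorted(lo)
    right = a_sorted.searchsorted(hi, 'right')
    # Map positions in the sorted array back to original indices
    return [order[L:R] for L, R in zip(left, right)]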
a = np.array([2., 3., 3., 4., 0., 1.])
b = np.array([1.01, 3.01, 100.01])
print(find_close(a, b, atol=.1))
# [array([5]), array([1, 2]), array([], dtype=int64)]