Python: 'inf is inf', but '-inf is not -inf'?

Python: 'inf is inf', but '-inf is not -inf'? - python

Python 3.7
While writing the search code for the maximum, I encountered a strange behavior of negative infinity.
Can somebody explain why this behavior?
>>> inf = float('inf')
>>> inf is inf
True
>>> (-inf) is (-inf)
False
And I've already realized it's better to use == for comparison, but I'm interested in the answer to the question above.

inf is a variable, bound to a specific object. Any object is itself, so inf is inf.
-inf is an expression. It does math, and produces an object with value floating-point negative infinity. Python makes no promises about whether this will be the same object as any other object with that value. In your case, the two evaluations of -inf happened to produce different objects.
Again, there are no promises about what -inf is -inf will produce. The current CPython implementation happens to consistently produce False. PyPy's primitive handling produces True. A different Python version or implementation might inconsistently produce True or False based on current memory pressure, or whether you ran this as a script or interactively, or any other factor. CPython itself already has cases where object identity is different in a script or interactively; for example, this:
x = 1000
y = 1000
print(x is y)
prints different things in the current CPython implementation depending on whether you run it interactively.

-inf is an operation that produces a new float object, and you use it twice in the same expression. Two distinct objects are produced, because a Python implementation is not required to notice that both subexpressions could return a reference to the same object.

I want to add one point: in Python, when we use id() function, is operator, == operator, etc, the real target we're working with is value, not variable.
a is b equals to id(a) == id(b).
In your code, inf = float('inf') will produce a value named inf.
inf is also an expression, which will be evaluated to a value.
inf is inf will be True, because the evaluated values of two expressions are same one, in fact.
In Python, focusing on values is better and easier.
For example, in CPython implementation:
a = 1
b = 1
a is b # will be true
a = 10000
b = 10000
a is b # will be false
a = '10000'
b = '10000'
a is b # will be true
If we focus on variable, such thing is weird. If we focus on values, and if I tell you that CPython implementation has cache pool, which is used for small int and some literal str value, such thing is simple again:
Because of cache pool, small int 1 is cached, str '10000' is cached too, but big int 10000 is not cached so it will be created two times.
Try to focus on values.

Related

Why 'is' operator is different when adding a value to a variable [duplicate]

This question already has answers here:
"is" operator behaves unexpectedly with integers
(11 answers)
Closed last month.
in shell environment,
x = 250
y = 250
x + 100 is y + 100
x + 100 is y + 100 result is False.
Why is the result value "False" while id(x), id(y) is same?
Also
x = 250
y = 250
x is y
x += 10
y += 10
x is y
last statement result is False. It is same problem.
I thought both questions would be True.

At a high-level, is checks whether the values refer to the same reference, while == calls a method of the objects to compare them
As a general rule, only use is when comparing against singletons
None
True
False
and == everywhere else
#Mark Ransom's comment gets to why this can have unexpected effects (such as numbers sometimes comparing the same, not not every time) .. small numbers have a special priority and are cached, though to my knowledge this is implementation-specific (and so it may vary between different interpreters and versions of Python) and should not be relied upon
>>> a = 100
>>> b = 100
>>> c = 1000
>>> d = 1000
>>> a is b
True
>>> c is d
False

"The is operator does not match the values of the variables, but the instances themselves."
Source : Understanding the "is" operator
To sum up, the values are equal but the instances isn't the same.
I recommend this lecture from Python official documentation : https://docs.python.org/3/tutorial/classes.html

is checks object identity. It is only True if you are comparing the same object on both sides. Just because two integers have the same value doesn't mean they have to be the same object. In fact, they usually aren't.
CPython optimizes a small set of integers (-5 through +256) to always be singletons. That is, whenever an integer is created, python checks whether the integer is in a small range of cached integers and uses those when possible. This is pretty quick - for small integers, which are the most commonly used integers, you don't need to create new objects. But its not scalable. Outside that range, python will generally create a new integer even if the integer already exists. The cost of lookup is too much.
The python compiler may also reuse immutables such as integers and strings that it knows about in a given compilation unit. This lookup is done once at compile and is relatively fast considering everything that goes on in that compile step.
But you can't rely on any of this. This is just how python happens to be implemented and the moment and it may change in the future. There are several singletons you can count on: None, True and False will always work.

Testing a "very close number" in Python with doctest [duplicate]

It's well known that comparing floats for equality is a little fiddly due to rounding and precision issues.
For example: Comparing Floating Point Numbers, 2012 Edition
What is the recommended way to deal with this in Python?
Is a standard library function for this somewhere?

Python 3.5 adds the math.isclose and cmath.isclose functions as described in PEP 485.
If you're using an earlier version of Python, the equivalent function is given in the documentation.
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
rel_tol is a relative tolerance, it is multiplied by the greater of the magnitudes of the two arguments; as the values get larger, so does the allowed difference between them while still considering them equal.
abs_tol is an absolute tolerance that is applied as-is in all cases. If the difference is less than either of those tolerances, the values are considered equal.

Something as simple as the following may be good enough:
return abs(f1 - f2) <= allowed_error

I would agree that Gareth's answer is probably most appropriate as a lightweight function/solution.
But I thought it would be helpful to note that if you are using NumPy or are considering it, there is a packaged function for this.
numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)
A little disclaimer though: installing NumPy can be a non-trivial experience depending on your platform.

Use Python's decimal module, which provides the Decimal class.
From the comments:
It is worth noting that if you're
doing math-heavy work and you don't
absolutely need the precision from
decimal, this can really bog things
down. Floats are way, way faster to
deal with, but imprecise. Decimals are
extremely precise but slow.

The common wisdom that floating-point numbers cannot be compared for equality is inaccurate. Floating-point numbers are no different from integers: If you evaluate "a == b", you will get true if they are identical numbers and false otherwise (with the understanding that two NaNs are of course not identical numbers).
The actual problem is this: If I have done some calculations and am not sure the two numbers I have to compare are exactly correct, then what? This problem is the same for floating-point as it is for integers. If you evaluate the integer expression "7/3*3", it will not compare equal to "7*3/3".
So suppose we asked "How do I compare integers for equality?" in such a situation. There is no single answer; what you should do depends on the specific situation, notably what sort of errors you have and what you want to achieve.
Here are some possible choices.
If you want to get a "true" result if the mathematically exact numbers would be equal, then you might try to use the properties of the calculations you perform to prove that you get the same errors in the two numbers. If that is feasible, and you compare two numbers that result from expressions that would give equal numbers if computed exactly, then you will get "true" from the comparison. Another approach is that you might analyze the properties of the calculations and prove that the error never exceeds a certain amount, perhaps an absolute amount or an amount relative to one of the inputs or one of the outputs. In that case, you can ask whether the two calculated numbers differ by at most that amount, and return "true" if they are within the interval. If you cannot prove an error bound, you might guess and hope for the best. One way of guessing is to evaluate many random samples and see what sort of distribution you get in the results.
Of course, since we only set the requirement that you get "true" if the mathematically exact results are equal, we left open the possibility that you get "true" even if they are unequal. (In fact, we can satisfy the requirement by always returning "true". This makes the calculation simple but is generally undesirable, so I will discuss improving the situation below.)
If you want to get a "false" result if the mathematically exact numbers would be unequal, you need to prove that your evaluation of the numbers yields different numbers if the mathematically exact numbers would be unequal. This may be impossible for practical purposes in many common situations. So let us consider an alternative.
A useful requirement might be that we get a "false" result if the mathematically exact numbers differ by more than a certain amount. For example, perhaps we are going to calculate where a ball thrown in a computer game traveled, and we want to know whether it struck a bat. In this case, we certainly want to get "true" if the ball strikes the bat, and we want to get "false" if the ball is far from the bat, and we can accept an incorrect "true" answer if the ball in a mathematically exact simulation missed the bat but is within a millimeter of hitting the bat. In that case, we need to prove (or guess/estimate) that our calculation of the ball's position and the bat's position have a combined error of at most one millimeter (for all positions of interest). This would allow us to always return "false" if the ball and bat are more than a millimeter apart, to return "true" if they touch, and to return "true" if they are close enough to be acceptable.
So, how you decide what to return when comparing floating-point numbers depends very much on your specific situation.
As to how you go about proving error bounds for calculations, that can be a complicated subject. Any floating-point implementation using the IEEE 754 standard in round-to-nearest mode returns the floating-point number nearest to the exact result for any basic operation (notably multiplication, division, addition, subtraction, square root). (In case of tie, round so the low bit is even.) (Be particularly careful about square root and division; your language implementation might use methods that do not conform to IEEE 754 for those.) Because of this requirement, we know the error in a single result is at most 1/2 of the value of the least significant bit. (If it were more, the rounding would have gone to a different number that is within 1/2 the value.)
Going on from there gets substantially more complicated; the next step is performing an operation where one of the inputs already has some error. For simple expressions, these errors can be followed through the calculations to reach a bound on the final error. In practice, this is only done in a few situations, such as working on a high-quality mathematics library. And, of course, you need precise control over exactly which operations are performed. High-level languages often give the compiler a lot of slack, so you might not know in which order operations are performed.
There is much more that could be (and is) written about this topic, but I have to stop there. In summary, the answer is: There is no library routine for this comparison because there is no single solution that fits most needs that is worth putting into a library routine. (If comparing with a relative or absolute error interval suffices for you, you can do it simply without a library routine.)

math.isclose() has been added to Python 3.5 for that (source code). Here is a port of it to Python 2. It's difference from one-liner of Mark Ransom is that it can handle "inf" and "-inf" properly.
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
'''
Python 2 implementation of Python 3.5 math.isclose()
https://github.com/python/cpython/blob/v3.5.10/Modules/mathmodule.c#L1993
'''
# sanity check on the inputs
if rel_tol < 0 or abs_tol < 0:
raise ValueError("tolerances must be non-negative")
# short circuit exact equality -- needed to catch two infinities of
# the same sign. And perhaps speeds things up a bit sometimes.
if a == b:
return True
# This catches the case of two infinities of opposite sign, or
# one infinity and one finite number. Two infinities of opposite
# sign would otherwise have an infinite relative tolerance.
# Two infinities of the same sign are caught by the equality check
# above.
if math.isinf(a) or math.isinf(b):
return False
# now do the regular computation
# this is essentially the "weak" test from the Boost library
diff = math.fabs(b - a)
result = (((diff <= math.fabs(rel_tol * b)) or
(diff <= math.fabs(rel_tol * a))) or
(diff <= abs_tol))
return result

I'm not aware of anything in the Python standard library (or elsewhere) that implements Dawson's AlmostEqual2sComplement function. If that's the sort of behaviour you want, you'll have to implement it yourself. (In which case, rather than using Dawson's clever bitwise hacks you'd probably do better to use more conventional tests of the form if abs(a-b) <= eps1*(abs(a)+abs(b)) + eps2 or similar. To get Dawson-like behaviour you might say something like if abs(a-b) <= eps*max(EPS,abs(a),abs(b)) for some small fixed EPS; this isn't exactly the same as Dawson, but it's similar in spirit.

If you want to use it in testing/TDD context, I'd say this is a standard way:
from nose.tools import assert_almost_equals
assert_almost_equals(x, y, places=7) # The default is 7

In terms of absolute error, you can just check
if abs(a - b) <= error:
print("Almost equal")
Some information of why float act weird in Python:
Python 3 Tutorial 03 - if-else, logical operators and top beginner mistakes
You can also use math.isclose for relative errors.

This is useful for the case where you want to make sure two numbers are the same 'up to precision', and there isn't any need to specify the tolerance:
Find minimum precision of the two numbers
Round both of them to minimum precision and compare
def isclose(a, b):
astr = str(a)
aprec = len(astr.split('.')[1]) if '.' in astr else 0
bstr = str(b)
bprec = len(bstr.split('.')[1]) if '.' in bstr else 0
prec = min(aprec, bprec)
return round(a, prec) == round(b, prec)
As written, it only works for numbers without the 'e' in their string representation (meaning 0.9999999999995e-4 < number <= 0.9999999999995e11)
Example:
>>> isclose(10.0, 10.049)
True
>>> isclose(10.0, 10.05)
False

For some of the cases where you can affect the source number representation, you can represent them as fractions instead of floats, using integer numerator and denominator. That way you can have exact comparisons.
See Fraction from fractions module for details.

I liked Sesquipedal's suggestion, but with modification (a special use case when both values are 0 returns False). In my case, I was on Python 2.7 and just used a simple function:
if f1 ==0 and f2 == 0:
return True
else:
return abs(f1-f2) < tol*max(abs(f1),abs(f2))

If you want to do it in a testing or TDD context using the pytest package, here's how:
import pytest
PRECISION = 1e-3
def assert_almost_equal():
obtained_value = 99.99
expected_value = 100.00
assert obtained_value == pytest.approx(expected_value, PRECISION)

I found the following comparison helpful:
str(f1) == str(f2)

To compare up to a given decimal without atol/rtol:
def almost_equal(a, b, decimal=6):
return '{0:.{1}f}'.format(a, decimal) == '{0:.{1}f}'.format(b, decimal)
print(almost_equal(0.0, 0.0001, decimal=5)) # False
print(almost_equal(0.0, 0.0001, decimal=4)) # True

This maybe is a bit ugly hack, but it works pretty well when you don't need more than the default float precision (about 11 decimals).
The round_to function uses the format method from the built-in str class to round up the float to a string that represents the float with the number of decimals needed, and then applies the eval built-in function to the rounded float string to get back to the float numeric type.
The is_close function just applies a simple conditional to the rounded up float.
def round_to(float_num, prec):
return eval("'{:." + str(int(prec)) + "f}'.format(" + str(float_num) + ")")
def is_close(float_a, float_b, prec):
if round_to(float_a, prec) == round_to(float_b, prec):
return True
return False
>>>a = 10.0
10.0
>>>b = 10.0001
10.0001
>>>print is_close(a, b, prec=3)
True
>>>print is_close(a, b, prec=4)
False
Update:
As suggested by #stepehjfox, a cleaner way to build a rount_to function avoiding "eval" is using nested formatting:
def round_to(float_num, prec):
return '{:.{precision}f}'.format(float_num, precision=prec)
Following the same idea, the code can be even simpler using the great new f-strings (Python 3.6+):
def round_to(float_num, prec):
return f'{float_num:.{prec}f}'
So, we could even wrap it up all in one simple and clean 'is_close' function:
def is_close(a, b, prec):
return f'{a:.{prec}f}' == f'{b:.{prec}f}'

If you want to compare floats, the options above are great, but in my case, I ended up using Enum's, since I only had few valid floats my use case was accepting.
from enum import Enum
class HolidayMultipliers(Enum):
EMPLOYED_LESS_THAN_YEAR = 2.0
EMPLOYED_MORE_THAN_YEAR = 2.5
Then running:
testable_value = 2.0
HolidayMultipliers(testable_value)
If the float is valid, it's fine, but otherwise it will just throw an ValueError.

Use == is a simple good way, if you don't care about tolerance precisely.
# Python 3.8.5
>>> 1.0000000000001 == 1
False
>>> 1.00000000000001 == 1
True
But watch out for 0:
>>> 0 == 0.00000000000000000000000000000000000000000001
False
The 0 is always the zero.
Use math.isclose if you want to control the tolerance.
The default a == b is equivalent to math.isclose(a, b, rel_tol=1e-16, abs_tol=0).
If you still want to use == with a self-defined tolerance:
>>> class MyFloat(float):
def __eq__(self, another):
return math.isclose(self, another, rel_tol=0, abs_tol=0.001)
>>> a == MyFloat(0)
>>> a
0.0
>>> a == 0.001
True
So far, I didn't find anywhere to config it globally for float. Besides, mock is also not working for float.__eq__.

The tolerance of using `==` to compare floating numbers in Python [duplicate]

It's well known that comparing floats for equality is a little fiddly due to rounding and precision issues.
For example: Comparing Floating Point Numbers, 2012 Edition
What is the recommended way to deal with this in Python?
Is a standard library function for this somewhere?

Python 3.5 adds the math.isclose and cmath.isclose functions as described in PEP 485.
If you're using an earlier version of Python, the equivalent function is given in the documentation.
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
rel_tol is a relative tolerance, it is multiplied by the greater of the magnitudes of the two arguments; as the values get larger, so does the allowed difference between them while still considering them equal.
abs_tol is an absolute tolerance that is applied as-is in all cases. If the difference is less than either of those tolerances, the values are considered equal.

Something as simple as the following may be good enough:
return abs(f1 - f2) <= allowed_error

I would agree that Gareth's answer is probably most appropriate as a lightweight function/solution.
But I thought it would be helpful to note that if you are using NumPy or are considering it, there is a packaged function for this.
numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)
A little disclaimer though: installing NumPy can be a non-trivial experience depending on your platform.

Use Python's decimal module, which provides the Decimal class.
From the comments:
It is worth noting that if you're
doing math-heavy work and you don't
absolutely need the precision from
decimal, this can really bog things
down. Floats are way, way faster to
deal with, but imprecise. Decimals are
extremely precise but slow.

The common wisdom that floating-point numbers cannot be compared for equality is inaccurate. Floating-point numbers are no different from integers: If you evaluate "a == b", you will get true if they are identical numbers and false otherwise (with the understanding that two NaNs are of course not identical numbers).
The actual problem is this: If I have done some calculations and am not sure the two numbers I have to compare are exactly correct, then what? This problem is the same for floating-point as it is for integers. If you evaluate the integer expression "7/3*3", it will not compare equal to "7*3/3".
So suppose we asked "How do I compare integers for equality?" in such a situation. There is no single answer; what you should do depends on the specific situation, notably what sort of errors you have and what you want to achieve.
Here are some possible choices.
If you want to get a "true" result if the mathematically exact numbers would be equal, then you might try to use the properties of the calculations you perform to prove that you get the same errors in the two numbers. If that is feasible, and you compare two numbers that result from expressions that would give equal numbers if computed exactly, then you will get "true" from the comparison. Another approach is that you might analyze the properties of the calculations and prove that the error never exceeds a certain amount, perhaps an absolute amount or an amount relative to one of the inputs or one of the outputs. In that case, you can ask whether the two calculated numbers differ by at most that amount, and return "true" if they are within the interval. If you cannot prove an error bound, you might guess and hope for the best. One way of guessing is to evaluate many random samples and see what sort of distribution you get in the results.
Of course, since we only set the requirement that you get "true" if the mathematically exact results are equal, we left open the possibility that you get "true" even if they are unequal. (In fact, we can satisfy the requirement by always returning "true". This makes the calculation simple but is generally undesirable, so I will discuss improving the situation below.)
If you want to get a "false" result if the mathematically exact numbers would be unequal, you need to prove that your evaluation of the numbers yields different numbers if the mathematically exact numbers would be unequal. This may be impossible for practical purposes in many common situations. So let us consider an alternative.
A useful requirement might be that we get a "false" result if the mathematically exact numbers differ by more than a certain amount. For example, perhaps we are going to calculate where a ball thrown in a computer game traveled, and we want to know whether it struck a bat. In this case, we certainly want to get "true" if the ball strikes the bat, and we want to get "false" if the ball is far from the bat, and we can accept an incorrect "true" answer if the ball in a mathematically exact simulation missed the bat but is within a millimeter of hitting the bat. In that case, we need to prove (or guess/estimate) that our calculation of the ball's position and the bat's position have a combined error of at most one millimeter (for all positions of interest). This would allow us to always return "false" if the ball and bat are more than a millimeter apart, to return "true" if they touch, and to return "true" if they are close enough to be acceptable.
So, how you decide what to return when comparing floating-point numbers depends very much on your specific situation.
As to how you go about proving error bounds for calculations, that can be a complicated subject. Any floating-point implementation using the IEEE 754 standard in round-to-nearest mode returns the floating-point number nearest to the exact result for any basic operation (notably multiplication, division, addition, subtraction, square root). (In case of tie, round so the low bit is even.) (Be particularly careful about square root and division; your language implementation might use methods that do not conform to IEEE 754 for those.) Because of this requirement, we know the error in a single result is at most 1/2 of the value of the least significant bit. (If it were more, the rounding would have gone to a different number that is within 1/2 the value.)
Going on from there gets substantially more complicated; the next step is performing an operation where one of the inputs already has some error. For simple expressions, these errors can be followed through the calculations to reach a bound on the final error. In practice, this is only done in a few situations, such as working on a high-quality mathematics library. And, of course, you need precise control over exactly which operations are performed. High-level languages often give the compiler a lot of slack, so you might not know in which order operations are performed.
There is much more that could be (and is) written about this topic, but I have to stop there. In summary, the answer is: There is no library routine for this comparison because there is no single solution that fits most needs that is worth putting into a library routine. (If comparing with a relative or absolute error interval suffices for you, you can do it simply without a library routine.)

math.isclose() has been added to Python 3.5 for that (source code). Here is a port of it to Python 2. It's difference from one-liner of Mark Ransom is that it can handle "inf" and "-inf" properly.
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
'''
Python 2 implementation of Python 3.5 math.isclose()
https://github.com/python/cpython/blob/v3.5.10/Modules/mathmodule.c#L1993
'''
# sanity check on the inputs
if rel_tol < 0 or abs_tol < 0:
raise ValueError("tolerances must be non-negative")
# short circuit exact equality -- needed to catch two infinities of
# the same sign. And perhaps speeds things up a bit sometimes.
if a == b:
return True
# This catches the case of two infinities of opposite sign, or
# one infinity and one finite number. Two infinities of opposite
# sign would otherwise have an infinite relative tolerance.
# Two infinities of the same sign are caught by the equality check
# above.
if math.isinf(a) or math.isinf(b):
return False
# now do the regular computation
# this is essentially the "weak" test from the Boost library
diff = math.fabs(b - a)
result = (((diff <= math.fabs(rel_tol * b)) or
(diff <= math.fabs(rel_tol * a))) or
(diff <= abs_tol))
return result

I'm not aware of anything in the Python standard library (or elsewhere) that implements Dawson's AlmostEqual2sComplement function. If that's the sort of behaviour you want, you'll have to implement it yourself. (In which case, rather than using Dawson's clever bitwise hacks you'd probably do better to use more conventional tests of the form if abs(a-b) <= eps1*(abs(a)+abs(b)) + eps2 or similar. To get Dawson-like behaviour you might say something like if abs(a-b) <= eps*max(EPS,abs(a),abs(b)) for some small fixed EPS; this isn't exactly the same as Dawson, but it's similar in spirit.

If you want to use it in testing/TDD context, I'd say this is a standard way:
from nose.tools import assert_almost_equals
assert_almost_equals(x, y, places=7) # The default is 7

In terms of absolute error, you can just check
if abs(a - b) <= error:
print("Almost equal")
Some information of why float act weird in Python:
Python 3 Tutorial 03 - if-else, logical operators and top beginner mistakes
You can also use math.isclose for relative errors.

This is useful for the case where you want to make sure two numbers are the same 'up to precision', and there isn't any need to specify the tolerance:
Find minimum precision of the two numbers
Round both of them to minimum precision and compare
def isclose(a, b):
astr = str(a)
aprec = len(astr.split('.')[1]) if '.' in astr else 0
bstr = str(b)
bprec = len(bstr.split('.')[1]) if '.' in bstr else 0
prec = min(aprec, bprec)
return round(a, prec) == round(b, prec)
As written, it only works for numbers without the 'e' in their string representation (meaning 0.9999999999995e-4 < number <= 0.9999999999995e11)
Example:
>>> isclose(10.0, 10.049)
True
>>> isclose(10.0, 10.05)
False

For some of the cases where you can affect the source number representation, you can represent them as fractions instead of floats, using integer numerator and denominator. That way you can have exact comparisons.
See Fraction from fractions module for details.

I liked Sesquipedal's suggestion, but with modification (a special use case when both values are 0 returns False). In my case, I was on Python 2.7 and just used a simple function:
if f1 ==0 and f2 == 0:
return True
else:
return abs(f1-f2) < tol*max(abs(f1),abs(f2))

If you want to do it in a testing or TDD context using the pytest package, here's how:
import pytest
PRECISION = 1e-3
def assert_almost_equal():
obtained_value = 99.99
expected_value = 100.00
assert obtained_value == pytest.approx(expected_value, PRECISION)

I found the following comparison helpful:
str(f1) == str(f2)

To compare up to a given decimal without atol/rtol:
def almost_equal(a, b, decimal=6):
return '{0:.{1}f}'.format(a, decimal) == '{0:.{1}f}'.format(b, decimal)
print(almost_equal(0.0, 0.0001, decimal=5)) # False
print(almost_equal(0.0, 0.0001, decimal=4)) # True

This maybe is a bit ugly hack, but it works pretty well when you don't need more than the default float precision (about 11 decimals).
The round_to function uses the format method from the built-in str class to round up the float to a string that represents the float with the number of decimals needed, and then applies the eval built-in function to the rounded float string to get back to the float numeric type.
The is_close function just applies a simple conditional to the rounded up float.
def round_to(float_num, prec):
return eval("'{:." + str(int(prec)) + "f}'.format(" + str(float_num) + ")")
def is_close(float_a, float_b, prec):
if round_to(float_a, prec) == round_to(float_b, prec):
return True
return False
>>>a = 10.0
10.0
>>>b = 10.0001
10.0001
>>>print is_close(a, b, prec=3)
True
>>>print is_close(a, b, prec=4)
False
Update:
As suggested by #stepehjfox, a cleaner way to build a rount_to function avoiding "eval" is using nested formatting:
def round_to(float_num, prec):
return '{:.{precision}f}'.format(float_num, precision=prec)
Following the same idea, the code can be even simpler using the great new f-strings (Python 3.6+):
def round_to(float_num, prec):
return f'{float_num:.{prec}f}'
So, we could even wrap it up all in one simple and clean 'is_close' function:
def is_close(a, b, prec):
return f'{a:.{prec}f}' == f'{b:.{prec}f}'

If you want to compare floats, the options above are great, but in my case, I ended up using Enum's, since I only had few valid floats my use case was accepting.
from enum import Enum
class HolidayMultipliers(Enum):
EMPLOYED_LESS_THAN_YEAR = 2.0
EMPLOYED_MORE_THAN_YEAR = 2.5
Then running:
testable_value = 2.0
HolidayMultipliers(testable_value)
If the float is valid, it's fine, but otherwise it will just throw an ValueError.

Use == is a simple good way, if you don't care about tolerance precisely.
# Python 3.8.5
>>> 1.0000000000001 == 1
False
>>> 1.00000000000001 == 1
True
But watch out for 0:
>>> 0 == 0.00000000000000000000000000000000000000000001
False
The 0 is always the zero.
Use math.isclose if you want to control the tolerance.
The default a == b is equivalent to math.isclose(a, b, rel_tol=1e-16, abs_tol=0).
If you still want to use == with a self-defined tolerance:
>>> class MyFloat(float):
def __eq__(self, another):
return math.isclose(self, another, rel_tol=0, abs_tol=0.001)
>>> a == MyFloat(0)
>>> a
0.0
>>> a == 0.001
True
So far, I didn't find anywhere to config it globally for float. Besides, mock is also not working for float.__eq__.

'is' operator behaves unexpectedly with floats

I came across a confusing problem when unit testing a module. The module is actually casting values and I want to compare this values.
There is a difference in comparison with == and is (partly, I'm beware of the difference)
>>> 0.0 is 0.0
True # as expected
>>> float(0.0) is 0.0
True # as expected
As expected till now, but here is my "problem":
>>> float(0) is 0.0
False
>>> float(0) is float(0)
False
Why? At least the last one is really confusing to me. The internal representation of float(0) and float(0.0) should be equal. Comparison with == is working as expected.

This has to do with how is works. It checks for references instead of value. It returns True if either argument is assigned to the same object.
In this case, they are different instances; float(0) and float(0) have the same value ==, but are distinct entities as far as Python is concerned. CPython implementation also caches integers as singleton objects in this range -> [x | x ∈ ℤ ∧ -5 ≤ x ≤ 256 ]:
>>> 0.0 is 0.0
True
>>> float(0) is float(0) # Not the same reference, unique instances.
False
In this example we can demonstrate the integer caching principle:
>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False
Now, if floats are passed to float(), the float literal is simply returned (short-circuited), as in the same reference is used, as there's no need to instantiate a new float from an existing float:
>>> 0.0 is 0.0
True
>>> float(0.0) is float(0.0)
True
This can be demonstrated further by using int() also:
>>> int(256.0) is int(256.0) # Same reference, cached.
True
>>> int(257.0) is int(257.0) # Different references are returned, not cached.
False
>>> 257 is 257 # Same reference.
True
>>> 257.0 is 257.0 # Same reference. As #Martijn Pieters pointed out.
True
However, the results of is are also dependant on the scope it is being executed in (beyond the span of this question/explanation), please refer to user: #Jim's fantastic explanation on code objects. Even python's doc includes a section on this behavior:
5.9 Comparisons
[7]
Due to automatic garbage-collection, free lists, and the dynamic nature of descriptors, you may notice seemingly unusual behaviour in certain uses of the is operator, like those involving comparisons between instance methods, or constants. Check their documentation for more info.

If a float object is supplied to float(), CPython* just returns it without making a new object.
This can be seen in PyNumber_Float (which is eventually called from float_new) where the object o passed in is checked with PyFloat_CheckExact; if True, it just increases its reference count and returns it:
if (PyFloat_CheckExact(o)) {
Py_INCREF(o);
return o;
}
As a result, the id of the object stays the same. So the expression
>>> float(0.0) is float(0.0)
reduces to:
>>> 0.0 is 0.0
But why does that equal True? Well, CPython has some small optimizations.
In this case, it uses the same object for the two occurrences of 0.0 in your command because they are part of the same code object (short disclaimer: they're on the same logical line); so the is test will succeed.
This can be further corroborated if you execute float(0.0) in separate lines (or, delimited by ;) and then check for identity:
a = float(0.0); b = float(0.0) # Python compiles these separately
a is b # False
On the other hand, if an int (or a str) is supplied, CPython will create a new float object from it and return that. For this, it uses PyFloat_FromDouble and PyFloat_FromString respectively.
The effect is that the returned objects differ in ids (which used to check identities with is):
# Python uses the same object representing 0 to the calls to float
# but float returns new float objects when supplied with ints
# Thereby, the result will be False
float(0) is float(0)
*Note: All previous mentioned behavior applies for the implementation of python in C i.e CPython. Other implementations might exhibit different behavior. In short, don't depend on it.

Pythonic way of checking for 0 in if statement?

In coding a primality tester, I came across an interesting thought. When you want to do something if the result of an operation turns out to be 0, which is the better ('pythonic') way of tackling it?
# option A - comparison
if a % b == 0:
print('a is divisible by b')
# option B - is operator
if a % b is 0:
print('a is divisible by b')
# option C - boolean not
if not a % b:
print('a is divisible by b')
PEP 8 says that comparisons to singletons like None should be done with the is operator. It also says that checking for empty sequences should use not, and not to compare boolean values with == or is. However, it doesn't mention anything about checking for a 0 as a result.
So which option should I use?

Testing against 0 is (imo) best done by testing against 0. This also indicates that there might be other values than just 0 and 1.
If the called function really only returns 0 on success and 1 on fail to say Yes/No, Success/Failure, True/False, etc., then I think the function is the problem and should (if applicable) be fixed to return True and False instead.

just personal : I prefer the not a % b way because it seems to be highly readable. But now, to lower the confusion level in the code, I will use the == 0, as it express what you expect to test exactly in a more accurate way. It's the "care for debug" approach.

0 isn't guaranteed to be a singleton so don't use is to test against it: currently C Python re-uses small integers so there is probably only one int with the value 0 plus a long if you're still on Python 2.x, and any number of float zeroes not to mention False which all compare equal to 0. Some earlier versions of Python, before it got a separate bool type used a different int zero for the result of comparisons.
Use either == (which would be my preference) or just the not, whichever you prefer.

A and C are both valid and very pythonic.
B is not, because
0 semantically is not a singleton (it is in cPython, but that is an implementation detail).
It will not work with float a or b.
It is actually possible that this will not work in some other implementation of Python.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python: 'inf is inf', but '-inf is not -inf'? - python

-inf is an operation that produces a new float object, and you use it twice in the same expression. Two distinct objects are produced, because a Python implementation is not required to notice that both subexpressions could return a reference to the same object.

Related

Why 'is' operator is different when adding a value to a variable [duplicate]

Testing a "very close number" in Python with doctest [duplicate]

The tolerance of using `==` to compare floating numbers in Python [duplicate]

'is' operator behaves unexpectedly with floats

Pythonic way of checking for 0 in if statement?

Categories

Resources