The documentation about numeric types states that:
Python fully supports mixed arithmetic: when a binary arithmetic operator has operands of different numeric types, the operand with the “narrower” type is widened to that of the other, where integer is narrower than floating point, which is narrower than complex. Comparisons between numbers of mixed type use the same rule.
This is supported by the following behavior:
>>> int.__eq__(1, 1.0)
NotImplemented
>>> float.__eq__(1.0, 1)
True
However for large integer numbers something else seems to happen since they won't compare equal unless explicitly converted to float:
>>> n = 3**64
>>> float(n) == n
False
>>> float(n) == float(n)
True
On the other hand, for powers of 2, this doesn't seem to be a problem:
>>> n = 2**512
>>> float(n) == n
True
Since the documentation implies that int is "widened" (I assume converted / cast?) to float I'd expect float(n) == n and float(n) == float(n) to be similar but the above example with n = 3**64 suggests differently. So what rules does Python use to compare int to float (or mixed numeric types in general)?
Tested with CPython 3.7.3 from Anaconda and PyPy 7.3.0 (Python 3.6.9).
The language specification on value comparisons contains the following paragraph:
Numbers of built-in numeric types (Numeric Types — int, float, complex) and of the standard library types fractions.Fraction and decimal.Decimal can be compared within and across their types, with the restriction that complex numbers do not support order comparison. Within the limits of the types involved, they compare mathematically (algorithmically) correct without loss of precision.
This means when two numeric types are compared, the actual (mathematical) numbers that are represented by these objects are compared. For example the numeral 16677181699666569.0 (which is 3**34) represents the number 16677181699666569 and even though in "float-space" there is no difference between this number and 16677181699666568.0 (3**34 - 1) they do represent different numbers. Due to limited floating point precision, on a 64-bit architecture, the value float(3**34) will be stored as 16677181699666568 and hence it represents a different number than the integer numeral 16677181699666569. For that reason we have float(3**34) != 3**34 which performs a comparison without loss of precision.
This property is important in order to guarantee transitivity of the equivalence relation of numeric types. If int to float comparison would give similar results as if the int object would be converted to a float object then the transitive relation would be invalidated:
>>> class Float(float):
... def __eq__(self, other):
... return super().__eq__(float(other))
...
>>> a = 3**34 - 1
>>> b = Float(3**34)
>>> c = 3**34
>>> a == b
True
>>> b == c
True
>>> a == c # transitivity demands that this holds true
False
The float.__eq__ implementation on the other hand, which considers the represented mathematical numbers, doesn't infringe that requirement:
>>> a = 3**34 - 1
>>> b = float(3**34)
>>> c = 3**34
>>> a == b
True
>>> b == c
False
>>> a == c
False
As a result of missing transitivity the order of the following list won't be changed by sorting (since all consecutive numbers appear to be equal):
>>> class Float(float):
... def __lt__(self, other):
... return super().__lt__(float(other))
... def __eq__(self, other):
... return super().__eq__(float(other))
...
>>> numbers = [3**34, Float(3**34), 3**34 - 1]
>>> sorted(numbers) == numbers
True
Using float on the other hand, the order is reversed:
>>> numbers = [3**34, float(3**34), 3**34 - 1]
>>> sorted(numbers) == numbers[::-1]
True
Related
How come the following expression evaluates to true?
In [0]: 1e18 == (1e18 + 50)
Out[0]: True
When replacing the scientific notation by exponentiation, this evaluates, as one would expected, to False:
In [1]: 10**18 == (10**18 + 50)
Out[1]: False
You do not have two pairs of floats. The top example, being scientific notation, is a pair of floats. Since the added difference is less than float precision, they compare as equal.
The bottom example is a pair of integers. You can easily check this with the type function. Python's long integers have no problem keeping the needed 18 digits of accuracy.
>>> type(1e18)
<class 'float'>
>>> type(10**18)
<class 'int'>
I'm looking to differentiate between a number like
2.0 or 2 and an actual fractional number such as 2.4. What would be the best way to do this? Currently I'm doing:
def is_fractional(num):
if not str(num).replace('.','').isdigit(): return
return float(num) != int(num)
>>> is_fractional(2)
False
>>> is_fractional(2.1)
True
>>> is_fractional(2.0)
False
>>> is_fractional('a')
>>>
That operation is built-in:
>>> 5.0.is_integer()
True
>>> 5.00000001.is_integer()
False
>>> 4.9999999.is_integer()
False
Documentation is here.
ADDENDUM
The initial solution only works for float. Here's a more complete answer, with tests:
from decimal import Decimal
def is_integer(x):
if isinstance(x, int):
return True
elif isinstance(x, float):
return x.is_integer()
elif isinstance(x, Decimal):
return x.as_integer_ratio()[1] == 1
return False
good = [
0,
0.0,
3,
-9999999999999999999999,
-2.0000000000000,
Decimal("3.000000"),
Decimal("-9")
]
bad = [
-9.99999999999999,
"dogs",
Decimal("-4.00000000000000000000000000000000001"),
Decimal("0.99999999999999999999999999999999999")
]
for x in good:
assert is_integer(x)
for x in bad:
assert not is_integer(x)
print("All tests passed")
If some of your numbers are decimal.Decimals, they might have range issues where conversion to float fails, or drops the fractional part that actually exists, depending on their precision:
>>> import decimal
>>> x = decimal.Decimal('1.00000000000000000000000000000000000001')
>>> str(x)
'1.00000000000000000000000000000000000001'
>>> float(x).is_integer()
True
>>> y = decimal.Decimal('1e5000')
>>> str(y)
'1E+5000'
>>> float(y)
inf
The str method will generally work (modulo problem cases like the one illustrated above), so you could stick with that, but it might be better to attempt to use is_integer and use a fallback if that fails:
try:
return x.is_integer()
except AttributeError:
pass
(as others note, you'll need to check for int and long here as well, if those are allowed types, since they are integers by definition but lack an is_integer attribute).
At this point, it's worth considering all of the other answers, but here's a specific decimal.Decimal handler:
# optional: special case decimal.Decimal here
try:
as_tuple = x.as_tuple()
trailing0s = len(list(itertools.takewhile(lambda i: i == 0, reversed(as_tuple[1]))))
return as_tuple[2] + trailing0s < 0
except (AttributeError, IndexError): # no as_tuple, or not 3 elements long, etc
pass
Why do not check if the difference between the truncation to integer and the exact value is not zero?
is_frac = lambda x: int(x)-x != 0
Python includes a fractions module that generates fractions (rational numbers) from strings, floats, integers, and much more. Just create a Fraction and check whether its denominator is other than 1 (the Fraction constructor will automatically reduce the number to lowest terms):
from fractions import Fraction
def is_fractional(num):
return Fraction(num).denominator != 1
Note that the method above may raise an exception if the conversion to a Fraction fails. In this case, it's not known whether the object is fractional.
If you are dealing with decimal module or with a float object, you can do this easily:
def is_factional(num):
return isinstance(num, (float, Decimal))
Here is one way to do it (assuming e.g. 2/2 is not "fractional" in the sense you have in mind):
# could also extend to other numeric types numpy.float32
from decimal import Decimal
def is_frac(n):
numeric_types = (int, float, Decimal)
assert isinstance(n, numeric_types), 'n must be numeric :/'
# (ints are never fractions)
if type(n) is int: return False
return n != float(int(n))
# various sorts of numbers
ns = [-1, -1.0, 0, 0.1, 1, 1.0, 1., 2.3, 1e0, 1e3, 1.1e3,
Decimal(3), Decimal(3.0), Decimal(3.1)]
# confirm that values are as expected
dict(zip(ns, [is_frac(n) for n in ns]))
This will only work if n is an int or a float or decimal.Decimal. But you could extend it to handle other numeric types such as numpy.float64 or numpy.int32 by just including them in numeric_types.
I came across some code with a line similar to
x[x<2]=0
Playing around with variations, I am still stuck on what this syntax does.
Examples:
>>> x = [1,2,3,4,5]
>>> x[x<2]
1
>>> x[x<3]
1
>>> x[x>2]
2
>>> x[x<2]=0
>>> x
[0, 2, 3, 4, 5]
This only makes sense with NumPy arrays. The behavior with lists is useless, and specific to Python 2 (not Python 3). You may want to double-check if the original object was indeed a NumPy array (see further below) and not a list.
But in your code here, x is a simple list.
Since
x < 2
is False
i.e 0, therefore
x[x<2] is x[0]
x[0] gets changed.
Conversely, x[x>2] is x[True] or x[1]
So, x[1] gets changed.
Why does this happen?
The rules for comparison are:
When you order two strings or two numeric types the ordering is done in the expected way (lexicographic ordering for string, numeric ordering for integers).
When you order a numeric and a non-numeric type, the numeric type comes first.
When you order two incompatible types where neither is numeric, they are ordered by the alphabetical order of their typenames:
So, we have the following order
numeric < list < string < tuple
See the accepted answer for How does Python compare string and int?.
If x is a NumPy array, then the syntax makes more sense because of boolean array indexing. In that case, x < 2 isn't a boolean at all; it's an array of booleans representing whether each element of x was less than 2. x[x < 2] = 0 then selects the elements of x that were less than 2 and sets those cells to 0. See Indexing.
>>> x = np.array([1., -1., -2., 3])
>>> x < 0
array([False, True, True, False], dtype=bool)
>>> x[x < 0] += 20 # All elements < 0 get increased by 20
>>> x
array([ 1., 19., 18., 3.]) # Only elements < 0 are affected
>>> x = [1,2,3,4,5]
>>> x<2
False
>>> x[False]
1
>>> x[True]
2
The bool is simply converted to an integer. The index is either 0 or 1.
The original code in your question works only in Python 2. If x is a list in Python 2, the comparison x < y is False if y is an integer. This is because it does not make sense to compare a list with an integer. However in Python 2, if the operands are not comparable, the comparison is based in CPython on the alphabetical ordering of the names of the types; additionally all numbers come first in mixed-type comparisons. This is not even spelled out in the documentation of CPython 2, and different Python 2 implementations could give different results. That is [1, 2, 3, 4, 5] < 2 evaluates to False because 2 is a number and thus "smaller" than a list in CPython. This mixed comparison was eventually deemed to be too obscure a feature, and was removed in Python 3.0.
Now, the result of < is a bool; and bool is a subclass of int:
>>> isinstance(False, int)
True
>>> isinstance(True, int)
True
>>> False == 0
True
>>> True == 1
True
>>> False + 5
5
>>> True + 5
6
So basically you're taking the element 0 or 1 depending on whether the comparison is true or false.
If you try the code above in Python 3, you will get TypeError: unorderable types: list() < int() due to a change in Python 3.0:
Ordering Comparisons
Python 3.0 has simplified the rules for ordering comparisons:
The ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don’t have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises TypeError instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other. Note that this does not apply to the == and != operators: objects of different incomparable types always compare unequal to each other.
There are many datatypes that overload the comparison operators to do something different (dataframes from pandas, numpy's arrays). If the code that you were using did something else, it was because x was not a list, but an instance of some other class with operator < overridden to return a value that is not a bool; and this value was then handled specially by x[] (aka __getitem__/__setitem__)
This has one more use: code golf. Code golf is the art of writing programs that solve some problem in as few source code bytes as possible.
return(a,b)[c<d]
is roughly equivalent to
if c < d:
return b
else:
return a
except that both a and b are evaluated in the first version, but not in the second version.
c<d evaluates to True or False.
(a, b) is a tuple.
Indexing on a tuple works like indexing on a list: (3,5)[1] == 5.
True is equal to 1 and False is equal to 0.
(a,b)[c<d]
(a,b)[True]
(a,b)[1]
b
or for False:
(a,b)[c<d]
(a,b)[False]
(a,b)[0]
a
There's a good list on the stack exchange network of many nasty things you can do to python in order to save a few bytes. https://codegolf.stackexchange.com/questions/54/tips-for-golfing-in-python
Although in normal code this should never be used, and in your case it would mean that x acts both as something that can be compared to an integer and as a container that supports slicing, which is a very unusual combination. It's probably Numpy code, as others have pointed out.
In general it could mean anything. It was already explained what it means if x is a list or numpy.ndarray but in general it only depends on how the comparison operators (<, >, ...) and also how the get/set-item ([...]-syntax) are implemented.
x.__getitem__(x.__lt__(2)) # this is what x[x < 2] means!
x.__setitem__(x.__lt__(2), 0) # this is what x[x < 2] = 0 means!
Because:
x < value is equivalent to x.__lt__(value)
x[value] is (roughly) equivalent to x.__getitem__(value)
x[value] = othervalue is (also roughly) equivalent to x.__setitem__(value, othervalue).
This can be customized to do anything you want. Just as an example (mimics a bit numpys-boolean indexing):
class Test:
def __init__(self, value):
self.value = value
def __lt__(self, other):
# You could do anything in here. For example create a new list indicating if that
# element is less than the other value
res = [item < other for item in self.value]
return self.__class__(res)
def __repr__(self):
return '{0} ({1})'.format(self.__class__.__name__, self.value)
def __getitem__(self, item):
# If you index with an instance of this class use "boolean-indexing"
if isinstance(item, Test):
res = self.__class__([i for i, index in zip(self.value, item) if index])
return res
# Something else was given just try to use it on the value
return self.value[item]
def __setitem__(self, item, value):
if isinstance(item, Test):
self.value = [i if not index else value for i, index in zip(self.value, item)]
else:
self.value[item] = value
So now let's see what happens if you use it:
>>> a = Test([1,2,3])
>>> a
Test ([1, 2, 3])
>>> a < 2 # calls __lt__
Test ([True, False, False])
>>> a[Test([True, False, False])] # calls __getitem__
Test ([1])
>>> a[a < 2] # or short form
Test ([1])
>>> a[a < 2] = 0 # calls __setitem__
>>> a
Test ([0, 2, 3])
Notice this is just one possibility. You are free to implement almost everything you want.
I am dealing with the following problem with unittest2:
assertAlmostEqual(69.88, 69.875, places=2) # returns True
but
assertAlmostEqual(1.28, 1.275, places=2) # returns False
I think problem is in the assertAlmostEqual method:
def assertAlmostEqual(self, first, second, places=None, ...):
if first == second:
# shortcut
return
...
if delta is not None:
...
else:
if places is None:
places = 7
if round(abs(second-first), places) == 0:
return
...
raise self.failureException(msg)
Should it instead be:
if abs(round(second, places) - round(first, places)) == 0
return
Your proposed fix doesn't make any difference, as you can easily demonstrate:
>>> places = 2
>>> first, second = 69.88, 69.875
>>> round(abs(second-first), places)
0.0
>>> abs(round(second, places) - round(first, places))
0.0
This is a problem with floating point precision, see e.g. Is floating point math broken? 69.88 cannot be represented exactly:
>>> "{:.40f}".format(69.88)
'69.8799999999999954525264911353588104248047'
The difference in the second example is
0.005
And even without mentioned biases of floating points result of round will be 0.01, so these numbers really different with 2-places precision
This method compares difference between numbers. It is kinda standard of comparing float numbers actually
So the problem is not with implementation, but with you expectations, that is different from common float comparison
>>> 100**0.5 != 4+6
False
>>> 100**0.5 == 4+6
True
>>> 4+6
10
>>> 100**0.5
10.0
>>> 10.0==10
True
Who can tell me why 10.0==10 is True?
I think 10.0 is a float and 10 is int,I know in java they are not equal.
Quoting from http://docs.python.org/2/library/stdtypes.html#numeric-types-int-float-long-complex
Python fully supports mixed arithmetic: when a binary arithmetic
operator has operands of different numeric types, the operand with the
“narrower” type is widened to that of the other, where plain integer
is narrower than long integer is narrower than floating point is
narrower than complex.
So, 10 is widened to 10.0. Thats why 10 == 10.0
Because that's the way Python defines equality for floats and integers. If the float represents a whole number, it's equal to the integer representing the same number (and even has the same hash code). Note that Java does something to similar effect (even though == cannot be overloaded by classes in Java). 10.0 == 10 is true because == with mixed (numeric) arguments performs binary numeric promotion which turns the int 10 into the floating point number 10.0.
u can use
>>> 10 == 10.0
True
>>> 10 is 10.0
False
is is the identity comparison.
== is the equality comparison.
is means is same instance. It evaluates to true if the variables on either side of the operator point to the same object and false otherwise.
Python automatically type converts during comparisons if possible and sensible to do so.
>>> False == 0
True
>>> True == 1
True
>>> True == 0
False
>>> True == 22
False
Also illustrates this behaviour.