Weird behaviour for Python is_integer from floor - python

When checking if a floor is an int, the recommend method would be is_integer:
However, I get a weird behaviour with the results of the log function:
print(log(9,3)); #2.0
print((log(9,3)).is_integer()); #True
print((log(243,3))); #5.0
print((log(243,3)).is_integer()); #False
Furthermore:
print((int) (log(9,3))); #2
print((int) (log(243,3))); #4
Is this normal?

log(243,3) simply doesn't give you exactly 5:
>>> '%.60f' % log(243,3)
'4.999999999999999111821580299874767661094665527343750000000000'
As the docs say, log(x, base) is "calculated as log(x)/log(base)". And neither log(243) nor log(3) can be represented exactly, and you get rounding errors. Sometimes you're lucky, sometimes you're not. Don't count on it.

When you want to compare float numbers, use math.isclose().
When you want to convert a float number that is close to an integer, use round().
Float numbers are too subject to error for "conventional" methods to be used. Their precision (and the precision of functions like log) is too limited, unfortunately. What looks like a 5 may not be an exact 5.
And yes: it is normal. This is not a problem with Python, but with every language I'm aware of (they all use the same underlying representation). Python offers some ways to work around float problems: decimal and fractions. Both have their own drawbacks, but sometimes they help. For example, with fractions, you can represent 1/3 without loss of precision. Similarly, with decimal, you can represent 0.1 exactly. However, you'll still have problems with log, sqrt, irrational numbers, numbers that require many digits to be represented and so on.

Related

What is the meaning of this following piece of code({:.3f}) used in a Decision Tree tutorial? [duplicate]

I don't understand why, by formatting a string containing a float value, the precision of this last one is not respected. Ie:
'%f' % 38.2551994324
returns:
'38.255199'
(4 signs lost!)
At the moment I solved specifying:
'%.10f' % 38.2551994324
which returns '38.2551994324' as expected… but should I really force manually how many decimal numbers I want? Is there a way to simply tell to python to keep all of them?! (what should I do for example if I don't know how many decimals my number has?)
but should I really force manually how many decimal numbers I want? Yes.
And even with specifying 10 decimal digits, you are still not printing all of them. Floating point numbers don't have that kind of precision anyway, they are mostly approximations of decimal numbers (they are really binary fractions added up). Try this:
>>> format(38.2551994324, '.32f')
'38.25519943239999776096738060005009'
there are many more decimals there that you didn't even specify.
When formatting a floating point number (be it with '%f' % number, '{:f}'.format(number) or format(number, 'f')), a default number of decimal places is displayed. This is no different from when using str() (or '%s' % number, '{}'.format(number) or format(number), which essentially use str() under the hood), only the number of decimals included by default differs; Python versions prior to 3.2 use 12 digits for the whole number when using str().
If you expect your rational number calculations to work with a specific, precise number of digits, then don't use floating point numbers. Use the decimal.Decimal type instead:
Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.
I would use the modern str.format() method:
>>> '{}'.format(38.2551994324)
'38.2551994324'
The modulo method for string formatting is now deprecated as per PEP-3101

Does Python document its behavior for rounding to a specified number of fractional digits?

Is the algorithm used for rounding a float in Python to a specified number of digits specified in any Python documentation? The semantics of round with zero fractional digits (i.e. rounding to an integer) are simple to understand, but it's not clear to me how the case where the number of digits is nonzero is implemented.
The most straightforward implementation of the function that I can think of (given the existence of round to zero fractional digits) would be:
def round_impl(x, ndigits):
return (10 ** -ndigits) * round(x * (10 ** ndigits))
I'm trying to write some C++ code that mimics the behavior of Python's round() function for all values of ndigits, and the above agrees with Python for the most part, when translated to equivalent C++ calls. However, there are some cases where it differs, e.g.:
>>> round(0.493125, 5)
0.49312
>>> round_impl(0.493125, 5)
0.49313
There is clearly a difference that occurs when the value to be rounded is at or very near the exact midpoint between two potential output values. Therefore, it seems important that I try to use the same technique if I want similar results.
Is the specific means for performing the rounding specified by Python? I'm using CPython 2.7.15 in my tests, but I'm specifically targeting v2.7+.
Also refer to What Every Programmer Should Know About Floating-Point Arithmetic, which has more detailed explanations for why this is happening as it is.
This is a mess. First of all, as far as float is concerned, there is no such number as 0.493125, when you write 0.493125 what you actually get is:
0.493124999999999980015985556747182272374629974365234375
So this number is not exactly between two decimals, it's actually closer to 0.49312 than it is to 0.49313, so it should definitely round to 0.49312, that much is clear.
The problem is that when you multiply by 105, you get the exact number 49312.5. So what happened here is the multiplication gave you an inexact result which by coincidence canceled out the rounding error in the original number. Two rounding errors canceled each other out, yay! But the problem is that when you do this, the rounding is actually incorrect... at least if you want to round up at midpoints, but Python 3 and Python 2 behave differently. Python 2 rounds away from 0, and Python 3 rounds towards even least-significant digits.
Python 2
if two multiples are equally close, rounding is done away from 0
Python 3
...if two multiples are equally close, rounding is done toward the even choice...
Summary
In Python 2,
>>> round(49312.5)
49313.0
>>> round(0.493125, 5)
0.49312
In Python 3,
>>> round(49312.5)
49312
>>> round(0.493125, 5)
0.49312
And in both cases, 0.493125 is really just a short way of writing 0.493124999999999980015985556747182272374629974365234375.
So, how does it work?
I see two plausible ways for round() to actually behave.
Choose the closest decimal number with the specified number of digits, and then round that decimal number to float precision. This is hard to implement, because it requires doing calculations with more precision than you can get from a float.
Take the two closest decimal numbers with the specified number of digits, round them both to float precision, and return whichever is closer. This will give incorrect results, because it rounds numbers twice.
And Python chooses... option #1! The exactly correct, but much harder to implement version. Refer to Objects/floatobject.c:927 double_round(). It uses the following process:
Write the floating-point number to a string in decimal format, using the requested precision.
Parse the string back in as a float.
This uses code based on David Gay's dtoa library. If you want C++ code that gets the actual correct result like Python does, this is a good start. Fortunately you can just include dtoa.c in your program and call it, since its licensing is very permissive.
The Python documentation for and 2.7 specifies the behaviour:
Values are rounded to the closest multiple of 10 to the power minus
ndigits; if two multiples are equally close, rounding is done away
from 0.
For 3.7:
For the built-in types supporting round(), values are rounded to the
closest multiple of 10 to the power minus ndigits; if two multiples
are equally close, rounding is done toward the even choice
Update:
The (cpython) implementation can be found floatobjcet.c in the function float___round___impl, which calls round if ndigits is not given, but double_round if it is.
double_round has two implementations.
One converts the double to a string (aka decimal) and back to a double.
The other one does some floating point calculations, calls to pow and at its core calls round. It seems to have more potential problems with overflows, since it actually multiplies the input by 10**-ndigits.
For the precise algorithm, look at the linked source file.

Default rounding mode in python, and how to specify it to another one?

What is the default rounding mode (rounding to nearest, etc) in Python? And how can we specify it?
With IEEE754-based platform (as most modern ones do, including x86, ARM, MIPS...), it's default mode "round to nearest, ties to even" is the only mode available in Python standard library. That is "provided" by standardized defaults and absense of library methods to change it. There are more languages that doesn't allow to change rounding mode - e.g. Java - so this isn't an isolated Python whim.
In real, there are too few reasons to change this. Direct rounding modes of IEEE754 are very special in their use. (I don't apologize the approach to stick on the default rounding, but simply comment on it.) For example, multiply of 1e308 by 1e308 with rounding to zero or to minus infinity results in approximately 1.8e308, so, the result is too far both from the exact answer and from POLA-based one (infinity). If you really need some specific modes for your computations, consider using specific libraries, like MPFR or gmpy2.
If you insist on changing this without external modules specialized on floating-point calculations, try using C-library fesetround via ctypes module or analog, e.g. here. Again, it's your choice to use such hacks and become responsible to all consequences. I'd suggest wrapping all pieces with special rounding to C-level code which restores the default mode on function exit.
The accepted answer is not really correct. While floats are probably what first comes to mind when someone asks about rounding modes, they are not the place you should look.
The reason is simple: rounding is something you use to make your answer have a smaller number of digits. Whenever you mention digits, you must be sure what base you're talking about. I don't know about you specifically, but when people speak of digits, they usually mean decimal digits. For that purpose, floats are obviously inadequate, since they have binary digits. You cannot round a float 0.12 to one decimal digit because it doesn't make sense: despite the appearance, it doesn't have that kind of digits. :-)
Of course, what you can do, is try to compensate for the decimal inexactness of floats by rounding them so the overshoots and undershoots cancel each other in the best possible way, and in that context, it has been proven long ago that there is only one right answer, ROUND_HALF_EVEN---and it is provided by float (and by round function, if you need it on some higher decimal place) out of the box. But please note that it's not the same as 'calculating the mean grade' (ROUND_HALF_UP, usually), or 'estimating the mean error' (ROUND_UP), or 'giving you a tax grade (ROUND_FLOOR), or various other specific tasks which need some fixed number of decimal digits (or in case of some now defunct currencies, some other base, but usually not binary).
And in fact, there is a Python standard library module which gives you all the rounding modes you might find useful, and given the above paragraphs, it is the logical place to look: of course, it's the decimal module. It represents floating point numbers not in base 2, but in base 10, and as such, it offers the meaningful possibility to round a number to a given number of decimals, using a rounding method that makes sense for a particular task.
>>> import statistics, decimal
>>> grades = map(decimal.Decimal, [4, 5])
>>> print(statistics.mean(grades).to_integral_exact(decimal.ROUND_HALF_UP))
5
HTH.
I've used these in the past without any negative effects:
from math import floor, ceil
def round_floor(scale, x):
return floor(x*(10**scale))/(10**scale)
def round_ceiling(scale,x):
return ceil(x*(10**scale))/(10**scale)
but I haven't considered the implications for very large or very precise numbers.
>>> round_floor(1,123.456)
123.4
>>> round_floor(-1,123.456)
120.0
>>> round_floor(2,123.456)
123.45
>>> round_ceiling(1,123.456)
123.5
>>> round_ceiling(-1,123.456)
130.0
>>> round_ceiling(-2,123.456)
200.0
round() is the built in fonction for rounding. it works as follows:
round(number[, ndigits])
it returns the float type number you are rounding with ndigits decimals
Values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice (so, for example, both round(0.5) and round(-0.5) are 0, and round(1.5) is 2). The return value is an integer if called with one argument, otherwise of the same type as number.
for limitations:
https://docs.python.org/3/tutorial/floatingpoint.html#tut-fp-issues
sources:
https://docs.python.org/3/library/functions.html

Python - round a float to 2 digits

I would need to have a float variable rounded to 2 significant digits and store the result into a new variable (or the same of before, it doesn't matter) but this is what happens:
>>> a
981.32000000000005
>>> b= round(a,2)
>>> b
981.32000000000005
I would need this result, but into a variable that cannot be a string since I need to insert it as a float...
>>> print b
981.32
Actually truncate would also work I don't need extreme precision in this case.
What you are trying to do is in fact impossible. That's because 981.32 is not exactly representable as a binary floating point value. The closest double precision binary floating point value is:
981.3200000000000500222085975110530853271484375
I suspect that this may come as something of a shock to you. If so, then I suggest that you read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
You might choose to tackle your problem in one of the following ways:
Accept that binary floating point numbers cannot represent such values exactly, and continue to use them. Don't do any rounding at all, and keep the full value. When you wish to display the value as text, format it so that only two decimal places are emitted.
Use a data type that can represent your number exactly. That means a decimal rather than binary type. In Python you would use decimal.
Try this :
Round = lambda x, n: eval('"%.' + str(int(n)) + 'f" % ' + repr(x))
print Round(0.1, 2)
0.10
print Round(0.1, 4)
0.1000
print Round(981,32000000000005, 2)
981,32
Just indicate the number of digits you want as a second kwarg
I wrote a solution of this problem.
Plz try
from decimal import *
from autorounddecimal.core import adround,decimal_round_digit
decimal_round_digit(Decimal("981.32000000000005")) #=> Decimal("981.32")
adround(981.32000000000005) # just wrap decimal_round_digit
More detail can be found in https://github.com/niitsuma/autorounddecimal
There is a difference between the way Python prints floats and the way it stores floats. For example:
>>> a = 1.0/5.0
>>> a
0.20000000000000001
>>> print a
0.2
It's not actually possible to store an exact representation of many floats, as David Heffernan points out. It can be done if, looking at the float as a fraction, the denominator is a power of 2 (such as 1/4, 3/8, 5/64). Otherwise, due to the inherent limitations of binary, it has to make do with an approximation.
Python recognizes this, and when you use the print function, it will use the nicer representation seen above. This may make you think that Python is storing the float exactly, when in fact it is not, because it's not possible with the IEEE standard float representation. The difference in calculation is pretty insignificant, though, so for most practical purposes it isn't a problem. If you really really need those significant digits, though, use the decimal package.

Unable to see Python's approximations in mathematical calculations

Problem: to see when computer makes approximation in mathematical calculations when I use Python
Example of the problem:
My old teacher once said the following statement
You cannot never calculate 200! with your computer.
I am not completely sure whether it is true or not nowadays.
It seems that it is, since I get a lot zeros for it from a Python script.
How can you see when your Python code makes approximations?
Python use arbitrary-precision arithmetic to calculate with integers, so it can exactly calculate 200!. For real numbers (so-called floating-point), Python does not use an exact representation. It uses a binary representation called IEEE 754, which is essentially scientific notation, except in base 2 instead of base 10.
Thus, any real number that cannot be exactly represented in base 2 with 53 bits of precision, Python cannot produce an exact result. For example, 0.1 (in base 10) is an infinite decimal in base 2, 0.0001100110011..., so it cannot be exactly represented. Hence, if you enter on a Python prompt:
>>> 0.1
0.10000000000000001
The result you get back is different, since has been converted from decimal to binary (with 53 bits of precision), back to decimal. As a consequence, you get things like this:
>>> 0.1 + 0.2 == 0.3
False
For a good (but long) read, see What Every Programmer Should Know About Floating-Point Arithmetic.
Python has unbounded integer sizes in the form of a long type. That is to say, if it is a whole number, the limit on the size of the number is restricted by the memory available to Python.
When you compute a large number such as 200! and you see an L on the end of it, that means Python has automatically cast the int to a long, because an int was not large enough to hold that number.
See section 6.4 of this page for more information.
200! is a very large number indeed.
If the range of an IEEE 64-bit double is 1.7E +/- 308 (15 digits), you can see that the largest factorial you can get is around 170!.
Python can handle arbitrary sized numbers, as can Java with its BigInteger.
Without some sort of clarification to that statement, it's obviously false. Just from personal experience, early lessons in programming (in the late 1980s) included solving very similar, if not exactly the same, problems. In general, to know some device which does calculations isn't making approximations, you have to prove (in the math sense of a proof) that it isn't.
Python's integer types (named int and long in 2.x, both folded into just the int type in 3.x) are very good, and do not overflow like, for example, the int type in C. If you do the obvious of print 200 * 199 * 198 * ... it may be slow, but it will be exact. Similiarly, addition, subtraction, and modulus are exact. Division is a mixed bag, as there's two operators, / and //, and they underwent a change in 2.x—in general you can only treat it as inexact.
If you want more control yet don't want to limit yourself to integers, look at the decimal module.
Python handles large numbers automatically (unlike a language like C where you can overflow its datatypes and the values reset to zero, for example) - over a certain point (sys.maxint or 2147483647) it converts the integer to a "long" (denoted by the L after the number), which can be any length:
>>> def fact(x):
... return reduce(lambda x, y: x * y, range(1, x+1))
...
>>> fact(10)
3628800
>>> fact(200)
788657867364790503552363213932185062295135977687173263294742533244359449963403342920304284011984623904177212138919638830257642790242637105061926624952829931113462857270763317237396988943922445621451664240254033291864131227428294853277524242407573903240321257405579568660226031904170324062351700858796178922222789623703897374720000000000000000000000000000000000000000000000000L
Long numbers are "easy", floating point is more complicated, and almost any computer representation of a floating point number is an approximation, for example:
>>> float(1)/3
0.33333333333333331
Obviously you can't store an infinite number of 3's in memory, so it cheats and rounds it a bit..
You may want to look at the decimal module:
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 do not have an exact representation in binary floating point. End users typically would not expect 1.1 to display as 1.1000000000000001 as it does with binary floating point.
Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places) which can be as large as needed for a given problem
See Handling very large numbers in Python.
Python has a BigNum class for holding 200! and will use it automatically.
Your teacher's statement, though not exactly true here is true in general. Computers have limitations, and it is good to know what they are. Remember that every time you add another integer of data storage, you can store a number that is 2^32 (4 billion +) times larger. It is hard to comprehend how many more numbers that is - but maths gets slower as you add more integers to store the exact value of a very large number.
As an example (what you can store with 1000 bits)
>>> 2 << 1000
2143017214372534641896850098120003621122809623411067214887500776740702102249872244986396
7576313917162551893458351062936503742905713846280871969155149397149607869135549648461970
8421492101247422837559083643060929499671638825347975351183310878921541258291423929553730
84335320859663305248773674411336138752L
I tried to illustrate how big a number you can store with 10000 bits, or even 8,000,000 bits (a megabyte) but that number is many pages long.

Categories

Resources