Python: Which way gives better precision?

Is there any difference in precision between a one-time assignment:
res = n/k
and repeated addition in a for loop:
for i in range(n):
    res += 1/k
?

Floating-point division a/b is not mathematical division a ÷ b, except in very rare* circumstances.
Generally, floating point division a/b is a ÷ b + ε.
This is true for two reasons.
Floats (except in rare cases) are approximations of the decimal numbers you write.
a is a + εa.
b is b + εb.
Floats use a base-2 encoding of the digits to the right of the decimal point. When you write 3.1, this is expanded to a base-2 approximation that differs from the real value by a small amount.
Real decimal numbers have the same problem, by the way. Write down the decimal expansion of 1/3. Oops. You have to stop writing decimal places at some point. Binary floating point numbers have the same problem.
The result of a division has a fixed number of binary places, meaning the answer is cut off. If there's a repeating binary pattern, it gets chopped. In rare cases, this doesn't matter. In general, you've introduced error by doing division.
Therefore, when you do something like repeatedly add 1/k values you're computing
1 ÷ k + ε
And adding those up. Your result (if you had the right range) would be
n × (1 ÷ k + ε) = n ÷ k + n × ε
You've multiplied the small error, ε, by n, making it a much bigger error, and each addition can introduce a further rounding error of its own. (Except in rare cases.)
This is bad. Very bad. All floating point division introduces an error. Your job as a programmer is to do the algebra to avoid or defer division to prevent this. Good software design means good algebra to prevent errors being introduced by the division operator.
[* The rare cases. In rare cases, the small error happens to be zero. The rare cases occur when your floating point values are small whole numbers or fractions that are sums of powers of two (1/2, 1/4, 1/8, etc.). In the rare case that you have a benign number with a benign fractional part, the error will be zero.]

Sure, they are different, because of how floating point division works.
>>> res = 0
>>> for x in xrange(5000): res += 0.1
...
>>> res == 5000 * 0.1
False
There's a good explanation in the official Python tutorial.
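The difference can be measured rather than only observed. The following is a minimal Python 3 sketch, not part of the original answer: the values n = 5000 and k = 10 and the helper name accumulated are assumptions chosen for illustration, and fractions.Fraction is used only as an exact reference.

```python
import math
from fractions import Fraction

def accumulated(n, k):
    """Add 1/k to a running float total n times, like the loop in the question."""
    res = 0.0
    for _ in range(n):
        res += 1 / k
    return res

n, k = 5000, 10                   # illustrative values, not taken from the question
exact = Fraction(n, k)            # exact rational reference
direct = n / k                    # one division, one rounding
looped = accumulated(n, k)        # n additions, each of which is rounded
summed = math.fsum([1 / k] * n)   # fsum avoids the additive rounding; 1/k is still inexact

# Fraction(float) converts the float exactly, so these are the true errors.
err_direct = abs(Fraction(direct) - exact)
err_looped = abs(Fraction(looped) - exact)
print(direct, looped, summed)
print(float(err_direct), float(err_looped))
```

When a running total really is needed, math.fsum is one standard-library way to limit the error from the additions, although it cannot remove the error already present in each 1/k.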

Well if k divides n then definitely the first one is more precise :-) To be serious, if the division is floating point and n > 1 then the first one will be more precise anyway though they will probably give different results, as nosklo said.
BTW, in Python 2.6 the division is integer by default so you'll have very different results. 1/k will always give 0 unless k <= 1.

Floating point arithmetic has representation and roundoff errors. For the types of data floating point numbers are intended to represent, real numbers of reasonable size, these errors are generally acceptable.
If you want to calculate the quotient of two numbers, the right way is simply to say result = n / k (beware if these are both integers and you have not said from __future__ import division, this is not what you may expect). The second way is silly, error-prone, and ugly.
There is some discussion of floating point inexactness in the Python tutorial: http://docs.python.org/tutorial/floatingpoint.html

Even if we charitably assume a floating-point division, there's very definitely a difference in precision; the for loop is executed n - 1 times!
assert (n-1) / k != n / k
Also depends on what res is initialised to in the second case :-)

Certainly there is a difference if you use floating point numbers, unless the Python interpreter/compiler you are using is capable of optimizing away the loop (Maybe Jython or IronPython might be able to? C compilers are pretty good at this).
If you actually want these two approaches to have the same precision, though, and you are using integers for your numerator and denominator, you can use the Python fractions module:
from fractions import Fraction
n, k = 999, 1000
res = Fraction(0, 1)
for i in range(0, n):
    res += Fraction(1, k)
print(float(res))

Related

How to ensure expressions that evaluate to floats, give the expected integer value with int(*)

In this question's most general form, I want to know how I can guarantee that int(x * y) (with x and y both being floats) gives me the arithmetically "correct" answer when I know the result to be a round number. For example: 1.5 * 2.0 = 3, or 16.0 / 2.0 = 8. I worry this could be a problem because int can round down if there is some floating point error. For example: int(16.0 - 5 * sys.float_info.epsilon) gives 15.
And specializing the question a bit, I could also ask about division between two ints where I know the result is a round number. For example 16 / 2 = 8. If this specialization changes the answer to the more general question, I'd like to know how.
By the way, I know that I could do int(round(x * y)). I'm just wondering if there's a more direct built-in, or if there's some proven guarantee that means I don't have to worry about this in the first place.
If both inputs are exact, and the mathematically correct result is representable, then the output is also guaranteed to be exact. This is only true for a limited number of basic floating-point operations, but * and / are such operations.
Note that the "both inputs are exact" condition is only satisfiable for dyadic rationals. Numbers like 1.5 are fine, but numbers like 0.1 cannot be exactly represented in binary floating point. Also, floating point precision limits apply to integers, too, not just fractional values - very large integers may not be exactly representable, due to requiring more precision than a Python float has.
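One way to check the "inputs are exact and the result is representable" condition is to compare against exact rational arithmetic. This is only a sketch: the helper name product_is_exact is made up for this example, and it relies on Fraction(float) converting a float without any rounding.

```python
from fractions import Fraction

def product_is_exact(x, y):
    """True if the float product x * y equals the exact product of the two stored floats."""
    return Fraction(x) * Fraction(y) == Fraction(x * y)

print(int(1.5 * 2.0))               # 3: both inputs are dyadic rationals
print(int(16.0 / 2.0))              # 8
print(product_is_exact(1.5, 2.0))   # True
print(product_is_exact(0.1, 3.0))   # False: the exact product needs more than 53 bits
print((0.1).as_integer_ratio())     # the dyadic rational that is actually stored for 0.1
```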

Float multiplication gives wrong answer (Python)

We know that multiplication and division are inverses of each other. Suppose I have the number 454546456756765675454.00 in Python and I want to divide it by 32, storing the result in a variable, for example:
value = 454546456756765675454.00/32
The output is 1.4204576773648927e+19, or 14204576773648926720.000000. Now I want to multiply the output by 32, but 14204576773648926720.000000 * 32 gives me 454546456756765655040.00, not 454546456756765675454.00. Why does this happen? I am not good at math, but my question is why float multiplication gives me the wrong answer. (I also tried the decimal module, but it did not work for me, or maybe I don't know how to use the decimal module to get an exact answer.)
Floating-point numbers are stored as binary fractions. Some numbers cannot be written precisely in base 2, so an approximated value is stored instead.
Now if this approximation had an error of +0.0001 for some number, and this number is multiplied by 10000, then the result will shift by 0.0001*10000 = 1.
It is the same in pretty much all programming languages.
For operations where precision is very important, the decimal module should be preferred.
I also tried the decimal module, but it did not work for me, or maybe I don't know how to use the decimal module to get an exact answer
Your example, using the decimal module, will look something like:
import decimal
value = decimal.Decimal(454546456756765675454)
vd = value/decimal.Decimal(32)
vm = vd*32
diff = vm - value
assert diff == decimal.Decimal(0)
# assert diff == 0.0
Within wide range limits, multiplication and division by powers of two, including 32, are exact in binary floating point. Conversion of a decimal is inexact. 454546456756765655040 is the closest IEEE 754 64-bit binary number to 454546456756765675454. The division and multiplication by 32 made no difference.
More generally, division and multiplication by the same number can result in rounding error in finite width decimal/binary etc. fractions unless all the prime factors of the divisor are also prime factors of the radix being used to represent fractions. In both binary and decimal fractions, division and multiplication by 3 can cause rounding error, because 3 is a factor of neither 2 nor 10.
Division and multiplication by 32 can be exact, given enough significand width, in both decimal and binary because two, the only prime factor of 32, is a factor of both 10 and 2.
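The claim that the error comes from converting the literal, not from the division or multiplication, can be checked directly. A small sketch, with variable names chosen only for illustration:

```python
x = 454546456756765675454.0     # the decimal literal is rounded to the nearest double here
stored = int(x)                 # exact integer value of the double that was stored
round_trip = x / 32 * 32        # scaling by a power of two only changes the exponent

print(stored)                   # 454546456756765655040
print(round_trip == x)          # True
```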

Python 3.x rounding half up

I know that questions about rounding in python have been asked multiple times already, but the answers did not help me. I'm looking for a method that is rounding a float number half up and returns a float number. The method should also accept a parameter that defines the decimal place to round to. I wrote a method that implements this kind of rounding. However, I think it does not look elegant at all.
import decimal
def round_half_up(number, dec_places):
    s = str(number)
    d = decimal.Decimal(s).quantize(
        decimal.Decimal(10) ** -dec_places,
        rounding=decimal.ROUND_HALF_UP)
    return float(d)
I don't like it, that I have to convert float to a string (to avoid floating point inaccuracy) and then work with the decimal module.
Do you have any better solutions?
Edit: As pointed out in the answers below, the solution to my problem is not that obvious as correct rounding requires correct representation of numbers in the first place and this is not the case with float. So I would expect that the following code
def round_half_up(number, dec_places):
    d = decimal.Decimal(number).quantize(
        decimal.Decimal(10) ** -dec_places,
        rounding=decimal.ROUND_HALF_UP)
    return float(d)
(that differs from the code above just by the fact that the float number is directly converted into a decimal number and not to a string first) to return 2.18 when used like this: round_half_up(2.175, 2) But it doesn't because Decimal(2.175) will return Decimal('2.17499999999999982236431605997495353221893310546875'), the way the float number is represented by the computer.
Surprisingly, the first code returns 2.18 because the float number is converted to a string first. It seems that the str() function performs an implicit rounding to the number that was initially meant to be rounded. So there are two roundings taking place. Even though this is the result that I would expect, it is technically wrong.
Rounding is surprisingly hard to do right, because you have to handle floating-point calculations very carefully. If you are looking for an elegant solution (short, easy to understand), what you have looks like a good starting point. To be correct, you should replace decimal.Decimal(str(number)) with creating the decimal from the number itself, which will give you a decimal version of its exact representation:
d = Decimal(number).quantize(...)...
Decimal(str(number)) effectively rounds twice, as formatting the float into the string representation performs its own rounding. This is because str(float value) won't try to print the full decimal representation of the float, it will only print enough digits to ensure that you get the same float back if you pass those exact digits to the float constructor.
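The two conversions can be compared side by side. This short sketch uses the 2.175 value from the question; the variable names are illustrative only.

```python
from decimal import Decimal

x = 2.175
via_str = Decimal(str(x))   # str() yields the shortest digits that round-trip: '2.175'
exact = Decimal(x)          # the binary value that is actually stored, digit for digit

print(via_str)              # 2.175
print(exact)                # 2.17499999999999982236431605997495353221893310546875
print(float(str(x)) == x)   # True: the short string still identifies the same float
```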
If you want to retain correct rounding, but avoid depending on the big and complex decimal module, you can certainly do it, but you'll still need some way to implement the exact arithmetics needed for correct rounding. For example, you can use fractions:
import fractions, math
def round_half_up(number, dec_places=0):
    sign = math.copysign(1, number)
    number_exact = abs(fractions.Fraction(number))
    shifted = number_exact * 10**dec_places
    shifted_trunc = int(shifted)
    if shifted - shifted_trunc >= fractions.Fraction(1, 2):
        result = (shifted_trunc + 1) / 10**dec_places
    else:
        result = shifted_trunc / 10**dec_places
    return sign * float(result)
assert round_half_up(1.49) == 1
assert round_half_up(1.5) == 2
assert round_half_up(1.51) == 2
assert round_half_up(2.49) == 2
assert round_half_up(2.5) == 3
assert round_half_up(2.51) == 3
Note that the only tricky part in the above code is the precise conversion of a floating-point to a fraction, and that can be off-loaded to the as_integer_ratio() float method, which is what both decimals and fractions do internally. So if you really want to remove the dependency on fractions, you can reduce the fractional arithmetic to pure integer arithmetic; you stay within the same line count at the expense of some legibility:
import math
def round_half_up(number, dec_places=0):
    sign = math.copysign(1, number)
    exact = abs(number).as_integer_ratio()
    shifted = (exact[0] * 10**dec_places), exact[1]
    shifted_trunc = shifted[0] // shifted[1]
    difference = (shifted[0] - shifted_trunc * shifted[1]), shifted[1]
    if difference[0] * 2 >= difference[1]:  # difference >= 1/2
        shifted_trunc += 1
    return sign * (shifted_trunc / 10**dec_places)
Note that testing these functions puts a spotlight on the approximations performed when creating floating-point numbers. For example, print(round_half_up(2.175, 2)) prints 2.17 because the decimal number 2.175 cannot be represented exactly in binary, so it is replaced by an approximation that happens to be slightly smaller than the 2.175 decimal. The function receives that value, finds it smaller than the actual fraction corresponding to the 2.175 decimal, and decides to round it down. This is not a quirk of the implementation; the behavior derives from properties of floating-point numbers and is also present in the round built-in of Python 3 and 2.
I don't like it, that I have to convert float to a string (to avoid floating point inaccuracy) and then work with the decimal module. Do you have any better solutions?
Yes; use Decimal to represent your numbers throughout your whole program, if you need to represent numbers such as 2.675 exactly and have them round to 2.68 instead of 2.67.
There is no other way. The floating point number which is shown on your screen as 2.675 is not the real number 2.675; in fact, it is very slightly less than 2.675, which is why it gets rounded down to 2.67:
>>> 2.675 - 2
0.6749999999999998
It only shows in string form as '2.675' because that happens to be the shortest string such that float(s) == 2.6749999999999998. Note that this longer representation (with lots of 9s) isn't exact either.
However you write your rounding function, it is not possible for my_round(2.675, 2) to round up to 2.68 and also for my_round(2 + 0.6749999999999998, 2) to round down to 2.67; because the inputs are actually the same floating point number.
So if your number 2.675 ever gets converted to a float and back again, you have already lost the information about whether it should round up or down. The solution is not to make it float in the first place.
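A minimal sketch of what "use Decimal throughout" can look like. The helper name round_half_up_decimal is made up for this example, and it assumes the values enter the program as strings, never as floats.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up_decimal(value, dec_places):
    """Round a Decimal half up to dec_places decimal places (hypothetical helper)."""
    quantum = Decimal(10) ** -dec_places          # Decimal('0.01') for 2 places
    return value.quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up_decimal(Decimal('2.675'), 2))   # 2.68: the tie is represented exactly
print(round_half_up_decimal(Decimal('2.175'), 2))   # 2.18
print(round_half_up_decimal(Decimal(2.675), 2))     # 2.67: built from a float, already below the tie
```

The last line shows why the conversion has to be avoided at the input boundary: once the value has been a float, the information needed to round up is gone.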
After trying for a very long time to produce an elegant one-line function, I ended up getting something that is comparable to a dictionary in size.
I would say the simplest way to do this is just:
def round_half_up(inp, dec_places):
    return round(inp + 0.0000001, dec_places)
I acknowledge that this is not accurate in every case, but it should work if you just want a simple, quick workaround.

Sometimes missing a cent when translating euros to euro cents

I have to translate euros (in a string) to euro cents (int):
Examples:
'12,1' => 1210
'14,51' => 1451
I use this python function:
int(round(float(amount.replace(',', '.')), 2) * 100)
But with this amount '1229,84' the result is : 122983
Update
I use the solution from Wim, because I use integers in both Python / Jinja and JavaScript for currency arithmetic. See also the answer from Chepner.
int(round(100 * float(amount.replace(',', '.')), 2))
My question was answered by Mr. Me, who explained the above result.
What the Docs Say, and a simple explanation
I tried it out, and was surprised that this was happening. So I turned to the documentation, and there is a little note in there that says.
Note The behavior of round() for floats can be surprising: for
example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This
is not a bug: it’s a result of the fact that most decimal fractions
can’t be represented exactly as a float.
Now what does that mean, that most decimal fractions can't be represented exactly as a float? Well, the documentation follows up with a great link that explains this, but since you probably didn't come here to read a nerdy technical document, let me summarize what is going on.
Python uses the IEEE-754 floating point standard to represent floats. This standard compromises accuracy for speed. Some numbers cannot be accurately represented. For example .1 is actually represented as 0.1000000000000000055511151231257827021181583404541015625. Interestingly, .1 in binary is actually an infinitely repeating number, just like 1/3 is an infinitely repeating .333333.
An Under the Hood Case Study
Now on to your particular case. This was pretty fun to look into, and this is what I discovered.
First, let's simplify what you were trying to do, from
>>> amount = '1229,84'
>>> int(round(float(amount.replace(',', '.')), 2) * 100)
122983
to
>>> int(1229.84 * 100)
122983
Sometimes Python [1] is unable to display binary floating point numbers with 100% accuracy, for the same reason we are unable to write the fraction 1/3 exactly as a decimal. When this happens, Python hides any extra digits. .1 is not stored as exactly .1 (int(1.1) - 1.1 gives -0.10000000000000009 [2]), but Python will display it as .1 if you type it into the console. We can see such extra digits by computing int(1.1) - 1.1 [3]. We can apply this int(myNum) - myNum formula to most floating point numbers to see the extra hidden digits behind them [4]. In your case we would do the following.
>>> int(1229.84) - 1229.84
-0.8399999999999181
1229.84 is actually 1229.8399999999999181. Continuing on [5].
>>> round(1229.84, 2) * 100
122983.99999999999 #there's all of our hidden digits showing up.
Now on to the last step. This is the part we are concerned about. Changing it back to an integer.
>>> int(122983.99999999999)
122983
It rounds downwards instead of upwards, however, if we never had multiplied it by 100, we would still have 2 more 9s at the end, and Python would round up.
>>> int(122983.9999999999999)
122984
??? Now what is going on? Why is Python rounding 122983.99999999999 down, but rounding 122983.9999999999999 up? Well, whenever Python turns a float into an integer it rounds down. However, you have to remember that to Python 122983.9999999999999, with the two extra 9s at the end, is the same thing as 122984.0. For example:
>>> 122983.9999999999999
122984.0
>>> a = 122983.9999999999999
>>> int(a) - a
0.0
and without the two extra 99s on the end.
>>> 122983.99999999999
122983.99999999999
>>> a=122983.99999999999
>>> int(a) - a
-0.9999999999854481
Python is definitely treating 122983.9999999999999 as 122984.0 but not 122983.99999999999. Now back to casting 122983.99999999999 to an integer. Because we have created a value just below 122984 that Python sees as being a separate number from 122984, and because casting to an integer always causes Python to round down, we get 122983 as a result.
Whew. That was a lot to go through, but I sure learned a lot writing this out, and I hope you did too. The solution to all of this is to use decimal numbers instead of floats, which compromises speed for accuracy.
What about rounding? The original problem had some rounding in it as well -- it's useless. See appendix item 6.
The Solution
a) The easiest solution is to use the decimal module instead of floating point numbers. This is the preferred way of doing things in any finance or accounting program.
The documentation also mentioned the following solutions which I've summarized.
b) The exact value can be expressed and retrieved in a hexadecimal form via myFloat.hex() and float.fromhex(myHex)
c) The exact value can also be retrieved as a fraction through myFloat.as_integer_ratio()
d) The documentation briefly mentions using SciPy for floating point arithmetic; however, this SO question mentions that SciPy's NumPy floats are nothing more than aliases to the built-in float type. The decimal module would be a better solution.
Appendix
1 - Even though I will often refer to Python's behavior, the things I talk about are part of the IEEE-754 floating point standard which is what the major programming languages use for their floating point numbers.
2 - int(1.1) - 1.1 gives me -0.10000000000000009, but according to the documentation .1 is really 0.1000000000000000055511151231257827021181583404541015625
3 - We used int(1.1) - 1.1 instead of int(.1) - .1 because int(.1) - .1 does not give us the hidden digits, but according to the documentation they should still be there for .1, hence I say int(someNum) -someNum works most of the time, but not all of the time.
4 - When we use the formula int(myNum) - myNum what is happening is that casting the number to an integer will round the number down so int(3.9) becomes 3, and when we minus 3 from 3.9 we are left with -.9. However, for some reason that I do not know, when we get rid of all the whole numbers, and we're just left with the decimal portion, Python decides to show us everything -- the whole mantissa.
5 - this does not really affect the outcome of our analysis, but when multiplying by 100, instead of the hidden digits being shifted over by 2 decimal places, they changed a little as well.
>>> a = 1229.84
>>> int(a) - a
-0.8399999999999181
>>> a = round(1229.84, 2) * 100
>>> int(a) - a
-0.9999999999854481 #I expected -0.9999999999918100?
6 - It may seem like we can get rid of all those extra digits by rounding to two decimal places.
>>> round(1229.84, 2) # which is really round(1229.8399999999999181, 2)
1229.84
But when we use our int(someNum) - someNum formula to see the hidden digits, they are still there.
>>> a = round(1229.84, 2)
>>> int(a) - a
-0.8399999999999181
This is because Python cannot store 1229.84 as a binary floating point number. It can't be done. So... rounding 1229.84 does absolutely nothing.
Don't use floating-point arithmetic for currency; rounding error for values that cannot be represented exactly will cause the type of loss you are seeing. Instead, convert the string representation to an integer number of cents, which you can convert to euros-and-cents for display as needed.
euros, cents = '12,1'.split(',') # '12,1' -> ('12', '1')
cents = 100*int(euros) + int(cents) * (10 if len(cents) == 1 else 1) # ('12', '1') -> 1210
(Notice you'll need a check to handle cents without a trailing 0.)
display_str = '%d,%02d' % divmod(cents, 100) # 1210 -> (12, 10) -> '12,10'
You can also use the Decimal class from the decimal module, which essentially encapsulates all the logic for using integers to represent fractional values.
As @wim mentions in a comment, use the Decimal type from the stdlib decimal module instead of the built-in float type. Decimal objects do not have the binary rounding behavior that floats have, and they also have a precision that can be user defined.
Decimal should be used anywhere you are doing financial calculations or anywhere you need floating point calculations that behave like the decimal math people learn in school (as opposed to the binary floating point behavior of the built in float type).
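For the original euro-string problem, a Decimal-based conversion could look like the sketch below. The function name euros_to_cents and the choice to reject inputs with more than two decimal places are assumptions made for this example, not something specified in the question.

```python
from decimal import Decimal

def euros_to_cents(amount):
    """Convert a string such as '1229,84' (comma as decimal separator) to integer cents."""
    euros = Decimal(amount.replace(',', '.'))   # parsed exactly, no binary rounding
    cents = euros * 100
    if cents != cents.to_integral_value():
        # Assumed policy: refuse sub-cent amounts instead of silently rounding them.
        raise ValueError('more than two decimal places: %r' % amount)
    return int(cents)

print(euros_to_cents('12,1'))      # 1210
print(euros_to_cents('14,51'))     # 1451
print(euros_to_cents('1229,84'))   # 122984
```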

Truncation in python

How can we truncate (not round) the cube root of a given number after the 10th decimal place in python?
For Example:
If number is 8 the required output is 2.0000000000 and for 33076161 it is 321.0000000000
Scale - truncate - unscale:
n = 10.0
cube_root = 1e-10 * int(1e10 * n**(1.0/3.0))
You should only do such truncations (unless you have a serious reason otherwise) while printing out results. There is no exact binary representation in floating point format, for a whole host of everyday decimal values:
print 33076161**(1.0/3.0)
A calculator gives you a different answer than Python gives you. Even Windows calculator does a passable job on cuberoot(33076161), whereas the answer given by python will be minutely incorrect unless you use rounding.
So, the question you ask is fundamentally unanswerable since it assumes capabilities that do not exist in floating point math.
Wrong Answer #1: This actually rounds instead of truncating, but for the cases you specified, it provides the correct output, probably due to rounding compensating for the inherent floating point precision problem you will hit in case #2:
print "%3.10f" % 10**(1.0/3.0)
Wrong Answer #2: But you could truncate (as a string) an 11-digit rounded value, which, as has been pointed out to me, would fail for values very near rollover, and in other strange ways, so DON'T do this:
print ("%3.11f" % 10**(1.0/3.0))[:-1]
Reasonably Close Answer #3: I wrote a little function that is for display only:
import math
def str_truncate(f, d):
    s = f*(10.0**(d))
    digits = repr(math.trunc(s)).rstrip('L')  # rstrip drops the Python 2 long suffix
    n = len(digits)-d
    w = digits[0:n]
    if w=='':
        w='0'
    ad = digits[n:d+n]
    return w+'.'+ad
d = 8**(1.0/3.0)
t=str_truncate(d,10)
print 'case 1',t
d = 33076161**(1.0/3.0)
t=str_truncate(d,10)
print 'case 2',t
d = 10000**(1.0/3.0)
t=str_truncate(d,10)
print 'case 3',t
d = 0.1**(1.0/3.0)
t=str_truncate(d,10)
print 'case 4',t
Note that Python fails to perform exactly as per your expectations in case #2 due to your friendly neighborhood floating point precision being non-infinite.
You should maybe know about this document too:
What Every Computer Scientist Should Know About Floating Point
And you might be interested to know that Python has add-ons that provide arbitrary precision features that will allow you to calculate the cube root of something to any number of decimals you might want. Using packages like mpmath, you can free yourself from the accuracy limitations of conventional floating point math, but at a considerable cost in performance (speed).
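For non-negative integer inputs, the standard library alone is enough if floats are avoided entirely. The sketch below uses a different technique from the answers here: it scales n by 10**(3*places) and takes an integer cube root by binary search, so the truncation is exact. The names icbrt and truncated_cube_root are made up for this example.

```python
def icbrt(n):
    """Largest integer r with r**3 <= n, for an integer n >= 0 (binary search)."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    # Invariant from here on: lo**3 <= n < hi**3
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo

def truncated_cube_root(n, places=10):
    """Cube root of a non-negative integer, truncated (not rounded) to `places` decimals."""
    scaled = icbrt(n * 10 ** (3 * places))     # floor(cbrt(n) * 10**places), computed exactly
    whole, frac = divmod(scaled, 10 ** places)
    return '%d.%0*d' % (whole, places, frac)

print(truncated_cube_root(8))          # 2.0000000000
print(truncated_cube_root(33076161))   # 321.0000000000
```

The result is returned as a string on purpose: converting it back to a float would reintroduce the representation problem described above.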
It is interesting to me that the built-in decimal module does not solve this problem: 1/3 is rational, but its decimal expansion repeats and never terminates, so it can be represented exactly neither in decimal notation nor in binary floating point:
import decimal
third = decimal.Decimal(1)/decimal.Decimal(3)
print decimal.Decimal(33076161)**third # cuberoot using decimal
output:
320.9999999999999999999999998
Update: Sven provided this cool use of logs, which works for this particular case: it outputs the desired 321 value instead of 320.99999... Nifty. I love log(). However, this works for 321 cubed but fails in the case of 320 cubed:
exp(log(33076161)/3)
It seems that fractions doesn't solve this problem, but I wish it did:
import fractions
third = fractions.Fraction(1,3)
def cuberoot(n):
    return n ** third
print '%.14f'%cuberoot(33076161)
num = 17**(1.0/3.0)
num = int(num * 100000000000)/100000000000.0
print "%.10f" % num
What about this code? I created it for my personal use. Although it is simple, it works well.
import math
def truncation_machine(original, edge):
    '''
    Function of the function :) :
    it performs a truncation operation on long decimal numbers.
    Input:
    a) the number that needs to undergo truncation.
    b) the number of decimals that we want to KEEP.
    Output:
    A clean truncated number.
    Example: original=1.123456789
    edge=4
    output=1.1234
    '''
    g = original*(10**edge)   # shift the digits to keep to the left of the decimal point
    h = math.trunc(g)         # drop everything after the decimal point
    T = h/(10**edge)          # shift back
    print('The original number ('+str(original)+') underwent a '+str(edge)+'-digit truncation to be in the form: '+str(T))
    return T
