Decimal place issues with floats and decimal.Decimal - python

I seem to be losing a lot of precision with floats.
For example I need to solve a matrix:
4.0x -2.0y 1.0z =11.0
1.0x +5.0y -3.0z =-6.0
2.0x +2.0y +5.0z =7.0
This is the code I use to import the matrix from a text file:
b = []
y = []
f = open('gauss.dat')
lines = f.readlines()
f.close()
for line in lines:
    bits = line.split(',')
    s = []
    for i in range(len(bits)):
        if (i != len(bits)-1):
            s.append(float(bits[i]))
    b.append(s)
    y.append(float(bits[len(bits)-1]))
I need to solve using gauss-seidel so I need to rearrange the equations for x, y, and z:
x=(11+2y-1z)/4
y=(-6-x+3z)/5
z=(7-2x-2y)/5
Here is the code I use to rearrange the equations. b is a matrix of coefficients and y is the answer vector:
def equations(b,y):
    i=0
    eqn=[]
    row=[]
    while(i<len(b)):
        j=0
        row=[]
        while(j<len(b)):
            if(i==j):
                row.append(y[i]/b[i][i])
            else:
                row.append(-b[i][j]/b[i][i])
            j=j+1
        eqn.append(row)
        i=i+1
    return eqn
However the answers I get back aren't precise to the decimal place.
For example, upon rearranging the second equation from above, I should get:
y=-1.2-.2x+.6z
What I get is:
y=-1.2-0.20000000000000001x+0.59999999999999998z
This might not seem like a big issue, but when you raise the number to a very high power the error becomes quite large. Is there a way around this? I tried the Decimal class but it does not work well with powers (e.g., Decimal(x)**2).
Any ideas?

IEEE floating point is binary, not decimal. There is no fixed-length binary fraction that is exactly 0.1, and the same holds for most multiples of it, such as 0.2 and 0.6. In binary, 0.1 is a repeating fraction, like 1/3 in decimal.
Please read What Every Computer Scientist Should Know About Floating-Point Arithmetic
Other options besides a Decimal class are
using Common Lisp or Python 2.6 or another language with exact rationals
converting the doubles to close rationals using, e.g., frap
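To see the binary approximation directly, the standard-library decimal and fractions modules can show the exact value a float stores. This is only a minimal sketch in Python 3 syntax; the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Converting a float (not a string) exposes the exact binary value it stores.
exact_tenth = Decimal(0.1)
print(exact_tenth)  # slightly more than 0.1

# The same value as a ratio: the denominator is a power of two.
ratio = Fraction(0.1)
print(ratio)

# 0.5 is a power-of-two fraction, so it is stored exactly.
print(Decimal(0.5) == Decimal("0.5"))
```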

I'm not familiar enough with the Decimal class to help you out, but your problem is due to the fact that decimal fractions often cannot be accurately represented in binary, so what you're seeing is the closest possible approximation; there's no way to avoid this problem without using a special class (like Decimal, probably).
EDIT: What about the decimal class isn't working properly for you? As long as I start with a string, rather than a float, powers seem to work fine.
>>> import decimal
>>> print(decimal.Decimal("1.2") ** 2)
1.44
The module documentation explains the need for and usage of decimal.Decimal pretty clearly, you should check it out if you haven't yet.

First, your input can be simplified a lot. You don't need to read and parse a file. You can just declare your objects in Python notation. Eval the file.
b = [
[4.0, -2.0, 1.0],
[1.0, +5.0, -3.0],
[2.0, +2.0, +5.0],
]
y = [ 11.0, -6.0, 7.0 ]
Second, y=-1.2-0.20000000000000001x+0.59999999999999998z isn't unusual. There's no exact representation in binary notation for 0.2 or 0.6. Consequently, the values displayed are decimal approximations of the stored binary values, not exact representations of 0.2 and 0.6. This is true for just about every kind of floating-point processor there is.
You can try the Python 2.6 fractions module. There's an older rational package that might help.
Yes, raising floating-point numbers to powers increases the errors. Consequently, you have to be sure to avoid using the right-most positions of the floating-point number, since those bits are mostly noise.
When displaying floating-point numbers, you have to appropriately round them to avoid seeing the noise bits.
>>> a
0.20000000000000001
>>> "%.4f" % (a,)
'0.2000'

I'd caution against the decimal module for tasks like this. Its purpose is really more dealing with real-world decimal numbers (eg. matching human bookkeeping practices), with finite precision, not performing exact precision math. There are numbers not exactly representable in decimal just as there are in binary, and performing arithmetic in decimal is also much slower than alternatives.
Instead, if you want exact results you should use rational arithmetic. This represents each number as a numerator/denominator pair, so it can exactly represent all rational numbers. If you're only using addition, subtraction, multiplication and division (rather than operations like square roots that can result in irrational numbers), you will never lose precision.
As others have mentioned, Python 2.6 will have a built-in rational type, though note that this isn't really a high-performing implementation; for speed you're better off using libraries like gmpy. Just replace your calls to float() with gmpy.mpq() and your code should now give exact results (though you may want to format the results as floats for display purposes).
Here's a slightly tidied version of your code to load a matrix that will use gmpy rationals instead:
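import gmpy

def read_matrix(f):
    b, y = [], []
    for line in f:
        bits = line.split(",")
        # all columns but the last are coefficients; the last is the right-hand side
        b.append([gmpy.mpq(bit) for bit in bits[:-1]])
        y.append(gmpy.mpq(bits[-1]))
    return b, y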
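If gmpy is not available, the same idea can be sketched with the standard-library fractions module. The function name rearrange is hypothetical and the code is written for Python 3; it mirrors the layout of equations() from the question (the constant term sits on the diagonal position) and is only meant as an illustration.

```python
from fractions import Fraction

def rearrange(b, y):
    """Solve row i for unknown i: index i holds the constant term,
    the other indices hold the coefficients of the remaining unknowns."""
    eqn = []
    for i, row in enumerate(b):
        pivot = row[i]
        eqn.append([y[i] / pivot if i == j else -coeff / pivot
                    for j, coeff in enumerate(row)])
    return eqn

# The system from the question, entered as exact rationals.
b = [[Fraction(4), Fraction(-2), Fraction(1)],
     [Fraction(1), Fraction(5), Fraction(-3)],
     [Fraction(2), Fraction(2), Fraction(5)]]
y = [Fraction(11), Fraction(-6), Fraction(7)]

eqn = rearrange(b, y)
print(eqn[1])  # coefficients of y = -6/5 - (1/5)x + (3/5)z
```

Because every value stays a Fraction, the second row comes out as exactly -1/5, -6/5 and 3/5 rather than as nearby binary floats.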

It is not an answer to your question, but related:
#!/usr/bin/env python
from numpy import abs, dot, loadtxt, max
from numpy.linalg import solve
data = loadtxt('gauss.dat', delimiter=',')
a, b = data[:,:-1], data[:,-1:]
x = solve(a, b) # here you may use any method you like instead of `solve`
print(x)
print(max(abs((dot(a, x) - b) / b))) # check solution
Example:
$ cat gauss.dat
4.0, 2.0, 1.0, 11.0
1.0, 5.0, 3.0, 6.0
2.0, 2.0, 5.0, 7.0
$ python loadtxt_example.py
[[ 2.4]
 [ 0.6]
 [ 0.2]]
0.0
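Since the question asks for Gauss-Seidel specifically, here is a minimal pure-Python sketch of that iteration, applied to the system from the question (not to the gauss.dat example above, which uses different signs). The function name, the zero starting vector and the fixed iteration count are assumptions made for illustration; a real implementation would normally stop on a convergence test.

```python
def gauss_seidel(a, rhs, iterations=100):
    """Run Gauss-Seidel sweeps from a zero start (assumes a nonzero diagonal)."""
    n = len(a)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Use the newest available values of the other unknowns.
            others = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (rhs[i] - others) / a[i][i]
    return x

a = [[4.0, -2.0, 1.0],
     [1.0, 5.0, -3.0],
     [2.0, 2.0, 5.0]]
rhs = [11.0, -6.0, 7.0]

solution = gauss_seidel(a, rhs)
print(solution)  # close to x=2, y=-1, z=1
```

The matrix is strictly diagonally dominant, so the iteration converges; the remaining error is ordinary float rounding and is far smaller than the display noise discussed above.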

Also see What is a simple example of floating point error, here on SO, which has some answers. The one I give actually uses python as the example language...

Related

adjusting floats to maintain high precision

I am writing some Python code that requires a very high degree of precision. I started to use Numpy float64, but that didn't work as required, and I then started using the "Decimal" module, which then worked fine.
I would ideally prefer, however, to use floats rather than the decimal module, and I recall someone once telling me that it's possible to manipulate floats in some way so that the required level of precision can be achieved (by multiplying or something?).
Is this true or am I misremembering? Sorry if this is a little vague.
Thank you!
It depends on the kind of number you have. For example, if you are adding values in the interval [1...2) you might be better off using offset values:
>>> a = 1.0000000000000000001
>>> a
1.0
>>> a+a
1.0
>>> a = 0.0000000000000000001
>>> a
1e-19
For simpler storage you can write them as tuple (n, f) with n being a natural number (int) and f the fraction in the interval [0...1).
Computation with such kind of values is tricky however.
>>> (1+1, a+a)
(2, 2e-19)
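One way the carry could be handled when adding such pairs is sketched below. The helper name add_split is hypothetical, and the sketch assumes both fractions are non-negative and lie in [0, 1).

```python
import math

def add_split(p, q):
    """Add two (n, f) pairs, carrying any overflow of the fraction into n."""
    n = p[0] + q[0]
    f = p[1] + q[1]
    carry = math.floor(f)  # 0 or 1 when both fractions lie in [0, 1)
    return (n + int(carry), f - carry)

print(add_split((1, 1e-19), (1, 1e-19)))  # tiny fractions survive
print(add_split((1, 0.75), (2, 0.5)))     # fraction overflows into n
```

Multiplication and division are considerably messier, which is part of why this representation is tricky in practice.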
If in doubt stick with Decimal or use bigfloat as suggested by BenDundee.
Another useful package is mpmath:
>>> from mpmath import mp
>>> mp.dps = 64 # 64 decimal places
>>> mp.sqrt(3)/2
mpf('0.8660254037844386467637231707529361834714026269051903140279034897246')
>>> mp.dps = 30 # 30 decimal places
>>> mp.sqrt(3)/2
mpf('0.866025403784438646763723170752918')

Simple Basic Python compare

I found this interesting question when I was doing homework
We know that 47.36/1.6**2 == 18.5,
but when I run the following code it gives me False (it should be True):
print 47.36/1.6**2 == 18.5
Does anyone know what's going on?
You're probably getting an answer like 18.49999999999, which is not exactly equal to 18.5.
As always, the relevant reference for this is What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Short answer: IEEE 754 floating point can exactly represent only fractions whose denominator is a power of two, like 1/4, 1/16, 1/256, etc. For any other fraction you can get awfully close, given enough digits, but never quite exactly there.
You compare floating point numbers by defining "equals" as "within a certain delta". You could write something like:
def almost_equals(a, b, delta=0.0005):
    return abs(a - b) <= delta
and then test for "probably equal" with:
>>> almost_equals(47.36/1.6**2, 18.5)
True
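In Python 3.5 and later, the standard library offers math.isclose for the same purpose. The sketch below is an addition to the original answer; it uses the default relative tolerance and shows why an absolute tolerance is needed near zero.

```python
import math

value = 47.36 / 1.6 ** 2

# Exact comparison fails because of rounding error in the last bits.
print(value == 18.5)

# The default relative tolerance (rel_tol=1e-09) scales with the operands.
print(math.isclose(value, 18.5))

# Near zero a relative tolerance never matches, so pass abs_tol explicitly.
print(math.isclose(1e-12, 0.0))
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))
```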
I would avoid checking for exact equality when comparing two floats. Instead take the absolute difference and see if it is smaller than a value you consider close to zero.
abs(47.36/1.6**2 - 18.5) < 0.00000000001
will be
True
>>> 47.36/1.6**2
18.499999999999996
See this page on Floating Point Arithmetic: Issues and Limitations.
Here is how you can calculate this to exactly 18.5 without using any rounding or "close enough" behavior by using the decimal module:
>>> from decimal import Decimal
>>> Decimal('47.36') / Decimal('1.6')**2 == Decimal('18.5')
True
>>> float(Decimal('47.36') / Decimal('1.6')**2) == 18.5
True
As others have said:
>>> 47.36/1.6**2
18.499999999999996
But, this is NOT due to a floating-point arithmetic problem as far as I can tell. Even if you use decimal math by wrapping the operands in Decimal() (after from decimal import Decimal) you will still get Decimal('18.49999999999999772404279952') as the answer.
It's possible I'm using Decimal() wrong here and my result also has some sort of floating point error; however, if I'm correct, that expression flat out does not equal 18.5, no matter what kind of math you use.
Edit: As Greg points out in the comments, the problem with my approach here is that Decimal(1.6) will just convert the float representation of 1.6, inaccuracies intact, into a Decimal. This gives the correct answer:
>>> Decimal('47.36') / Decimal('1.6')**2
Decimal('18.5')
Better still would be to use the fractions module as suggested by Kirk.
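The difference between the two constructors, and the fractions alternative, can be put side by side. This is a small sketch in Python 3 syntax with illustrative variable names.

```python
from decimal import Decimal
from fractions import Fraction

# Built from floats: the binary rounding error of 47.36 and 1.6 is carried along.
from_float = Decimal(47.36) / Decimal(1.6) ** 2

# Built from strings: the decimal digits are taken literally.
from_string = Decimal("47.36") / Decimal("1.6") ** 2

# Fraction also parses decimal strings exactly.
as_fraction = Fraction("47.36") / Fraction("1.6") ** 2

print(from_float == Decimal("18.5"))
print(from_string == Decimal("18.5"))
print(as_fraction == Fraction(37, 2))
```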
47.36/1.6*2 return integer. So 47.36/1.6*2 would be 18, which is not equal to 18.5.
Edit
Sorry about that, actually it is being stored as 18.499999.
You should do this
import numpy as np
print np.around((47.36/1.6**2), decimals=1) == 18.5
This would return True.

Truncation in python

How can we truncate (not round) the cube root of a given number after the 10th decimal place in python?
For Example:
If number is 8 the required output is 2.0000000000 and for 33076161 it is 321.0000000000
Scale - truncate - unscale:
n = 10.0
cube_root = 1e-10 * int(1e10 * n**(1.0/3.0))
You should only do such truncations (unless you have a serious reason otherwise) while printing out results. There is no exact binary representation in floating point format, for a whole host of everyday decimal values:
print 33076161**(1.0/3.0)
A calculator gives you a different answer than Python gives you. Even Windows calculator does a passable job on cuberoot(33076161), whereas the answer given by python will be minutely incorrect unless you use rounding.
So, the question you ask is fundamentally unanswerable since it assumes capabilities that do not exist in floating point math.
Wrong Answer #1: This actually rounds instead of truncating, but for the cases you specified, it provides the correct output, probably due to rounding compensating for the inherent floating point precision problem you will hit in case #2:
print "%3.10f" % 10**(1.0/3.0)
Wrong Answer #2: But you could truncate (as a string) an 11-digit rounded value, which, as has been pointed out to me, would fail for values very near rollover, and in other strange ways, so DON'T do this:
print ("%3.11f" % 10**(1.0/3.0))[:-1]
Reasonably Close Answer #3: I wrote a little function that is for display only:
import math
def str_truncate(f,d):
    s = f*(10.0**(d))
    # str() of the truncated integer; the name avoids shadowing the built-in str
    digits = str(math.trunc(s))
    n = len(digits)-d
    w = digits[0:n]
    if w=='':
        w='0'
    ad = digits[n:d+n]
    return w+'.'+ad
d = 8**(1.0/3.0)
t=str_truncate(d,10)
print 'case 1',t
d = 33076161**(1.0/3.0)
t=str_truncate(d,10)
print 'case 2',t
d = 10000**(1.0/3.0)
t=str_truncate(d,10)
print 'case 3',t
d = 0.1**(1.0/3.0)
t=str_truncate(d,10)
print 'case 4',t
Note that Python fails to perform exactly as per your expectations in case #2 due to your friendly neighborhood floating point precision being non-infinite.
You should maybe know about this document too:
What Every Computer Scientist Should Know About Floating Point
And you might be interested to know that Python has add-ons that provide arbitrary-precision features that will allow you to calculate the cube root of something to any number of decimals you might want. Using packages like mpmath, you can free yourself from the accuracy limitations of conventional floating point math, but at a considerable cost in performance (speed).
It is interesting to me that the built-in decimal module does not solve this problem. 1/3 is rational, but its decimal expansion repeats forever, so it can't be represented exactly in decimal notation any more than in binary floating point:
import decimal
third = decimal.Decimal(1)/decimal.Decimal(3)
print decimal.Decimal(33076161)**third # cuberoot using decimal
output:
320.9999999999999999999999998
Update: Sven provided this cool use of Logs which works for this particular case, it outputs the desired 321 value, instead of 320.99999...: Nifty. I love Log(). However this works for 321 cubed, but fails in the case of 320 cubed:
exp(log(33076161)/3)
It seems that fractions doesn't solve this problem, but I wish it did:
import fractions
third = fractions.Fraction(1,3)
def cuberoot(n):
    return n ** third
print '%.14f'%cuberoot(33076161)
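For integer inputs, one way to sidestep floating point entirely is to scale the number by 10**30 and take an integer cube root, which truncates exactly by construction. This is a sketch that goes beyond the answers here: the function names are hypothetical, it assumes a non-negative integer n, and it is written for Python 3.

```python
def integer_cube_root(n):
    """Largest integer r with r**3 <= n, for n >= 0 (binary search)."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    # Invariant: lo**3 <= n < hi**3
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo

def truncated_cube_root(n, places=10):
    """Cube root of a non-negative integer, truncated to `places` decimals."""
    # cbrt(n * 10**(3*places)) == cbrt(n) * 10**places, floored exactly.
    scaled = integer_cube_root(n * 10 ** (3 * places))
    whole, frac = divmod(scaled, 10 ** places)
    return "%d.%0*d" % (whole, places, frac)

print(truncated_cube_root(8))
print(truncated_cube_root(33076161))
```

Python integers have arbitrary size, so the scaled value never overflows and no rounding step is involved.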
num = 17**(1.0/3.0)
num = int(num * 100000000000)/100000000000.0
print "%.10f" % num
What about this code? I created it for my personal use. Although it is very simple, it works well.
def truncation_machine(original,edge):
    '''
    Purpose of the function:
    it performs a truncation operation on long decimal numbers.
    Input:
    a) the number that needs to undergo truncation.
    b) the no. of decimals that we want to KEEP.
    Output:
    A clean truncated number.
    Example: original=1.123456789
    edge=4
    output=1.1234
    '''
    import math
    g=original*(10**edge)
    h=math.trunc(g)
    T=h/(10**edge)
    print('The original number ('+str(original)+') underwent a '+str(edge)+'-digit truncation to be in the form: '+str(T))
    return T

Python: a could be rounded to b in the general case

As a part of some unit testing code that I'm writing, I wrote the following function. The purpose of which is to determine if 'a' could be rounded to 'b', regardless of how accurate 'a' or 'b' are.
def couldRoundTo(a,b):
    """Can you round a to some number of digits, such that it equals b?"""
    roundEnd = len(str(b))
    if a == b:
        return True
    for x in range(0,roundEnd):
        if round(a,x) == b:
            return True
    return False
Here's some output from the function:
>>> couldRoundTo(3.934567892987, 3.9)
True
>>> couldRoundTo(3.934567892987, 3.3)
False
>>> couldRoundTo(3.934567892987, 3.93)
True
>>> couldRoundTo(3.934567892987, 3.94)
False
As far as I can tell, it works. However, I'm scared of relying on it considering I don't have a perfect grasp of issues concerning floating point accuracy. Could someone tell me if this is an appropriate way to implement this function? If not, how could I improve it?
Could someone tell me if this is an appropriate way to implement this function?
It depends. The given function will behave surprisingly if b isn't precisely equal to a value that would normally be obtained directly from decimal-to-binary-float conversion.
For example:
>>> print(0.1, 0.2/2, 0.3/3)
0.1 0.1 0.1
>>> couldRoundTo(0.123, 0.1)
True
>>> couldRoundTo(0.123, 0.2/2)
True
>>> couldRoundTo(0.123, 0.3/3)
False
This fails because the calculation of 0.3 / 3 results in a slightly different representation than 0.1 and 0.2 / 2 (and round(0.123, 1)).
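The difference between those three values can be inspected directly with repr. This is a small sketch in Python 3 syntax; the variable names are illustrative.

```python
tenth_a = 0.2 / 2  # dividing by 2 only changes the exponent, so no rounding occurs
tenth_b = 0.3 / 3  # division by 3 must round, and lands just below 0.1

# repr shows enough digits to tell the two floats apart.
print(repr(tenth_a), repr(tenth_b))
print(tenth_a == 0.1, tenth_b == 0.1)

# round() returns the float nearest to the decimal 0.1, which matches only tenth_a.
print(round(0.123, 1) == tenth_a, round(0.123, 1) == tenth_b)
```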
If not, how could I improve it?
Rule of thumb: if your calculation specifically involves decimal digits in any way, just use Decimal, to avoid all the lossy base-2 round-tripping.
In particular, Decimal includes a helper called quantize that makes this problem trivially easy:
from decimal import Decimal
def roundable(a, b):
    a = Decimal(str(a))
    b = Decimal(str(b))
    return a.quantize(b) == b
One way to do it:
def could_round_to(a, b):
    (x, y) = map(len, str(b).split('.'))
    round_format = "%" + "%d.%df"%(x, y)
    return round_format%a == str(b)
First, we take the number of digits before and after the decimal in x and y. Then, we construct a format such as %x.yf. Then, we supply a to the format string.
>>> "%2.2f"%123.1234
'123.12'
>>> "%2.2f"%123.1264
'123.13'
>>> "%3.2f"%000.001
'0.00'
Now, all that's left is comparing the strings.
The only point that I'm afraid of is the conversion from strings to floating points when interpreting floating-point literals (as in http://docs.python.org/reference/lexical_analysis.html#floating-point-literals). I don't know if there is any guarantee that a floating-point literal will evaluate to the floating-point number that is closest to the given string. This mentioned section is the place in the specification where I would expect such a guarantee.
For example, Java is much more specific about what to expect from a string literal. From the documentation of Double.valueOf(String):
[...] [the argument] is regarded as representing an exact decimal value in the usual "computerized scientific notation" or as an exact hexadecimal value; this exact numerical value is then conceptually converted to an "infinitely precise" binary value that is then rounded to type double by the usual round-to-nearest rule of IEEE 754 floating-point arithmetic [...]
Unless you can find such a guarantee somewhere in the Python documentation, you may just be lucky, because some earlier floating-point libraries (on which Python might rely) convert a string only to a nearby floating-point number, not to the best available one.
Unfortunately, it seems to me that neither round, nor float, nor the specification for floating-point literals gives you any usable guarantee.
If your purpose is to test whether the round function will round to the target, then you are correct. Otherwise (what else is the purpose?), if you are in doubt, you should use the decimal module.

Python:Which way gives better precision

Is there any difference in precision between one time assignment:
res=n/k
and multiple assignment in for cycle:
for i in range(n):
    res+=1/k
?
Floating-point division a/b is not mathematical division a ÷ b, except in very rare* circumstances.
Generally, floating point division a/b is a ÷ b + ε.
This is true for two reasons.
1. Float numbers (except in rare cases) are an approximation of the decimal number: a is stored as a + εa, and b is stored as b + εb.
Float numbers use a base-2 encoding of the digits to the right of the decimal point. When you write 3.1, this is expanded to a base-2 approximation that differs from the real value by a small amount.
Real decimal numbers have the same problem, by the way. Write down the decimal expansion of 1/3. Oops. You have to stop writing decimal places at some point. Binary floating point numbers have the same problem.
2. Division has a fixed number of binary places, meaning the answer is truncated. If there's a repeating binary pattern, it gets chopped. In rare cases, this doesn't matter. In general, you've introduced error by doing division.
Therefore, when you do something like repeatedly add 1/k values you're computing
1 ÷ k + ε
And adding those up. Your result (if you had the right range) would be
n × (1 ÷ k + ε) = n ÷ k + n × ε
You've multiplied the small error, ε, by n. Making it a big error. (Except in rare cases.)
This is bad. Very bad. All floating point division introduces an error. Your job as a programmer is to do the algebra to avoid or defer division to prevent this. Good software design means good algebra to prevent errors being introduced by the division operator.
[* The rare cases. In rare cases, the small error happens to be zero. The rare cases occur when your floating point values are small whole numbers or fractions that are sums of powers of two 1/2, 1/4, 1/8, etc. In the rare case that you have a benign number with a benign fractional part, the error will be zero.]
Sure, they are different, because of how floating point division works.
>>> res = 0
>>> for x in xrange(5000): res += 0.1
...
>>> res == 5000 * 0.1
False
There's a good explanation in the python official tutorial.
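The same effect shows up with n/k versus repeated addition of 1/k, and math.fsum from the standard library is one way to sum without accumulating the error. This is a sketch in Python 3 syntax; the values n = 10 and k = 10.0 are chosen only for illustration.

```python
import math

n, k = 10, 10.0

# Repeated addition: every += rounds, and the errors accumulate.
looped = 0.0
for _ in range(n):
    looped += 1 / k

direct = n / k                        # a single rounding
compensated = math.fsum([1 / k] * n)  # tracks the low-order bits lost by +=

print(looped == direct, compensated == direct)
```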
Well if k divides n then definitely the first one is more precise :-) To be serious, if the division is floating point and n > 1 then the first one will be more precise anyway though they will probably give different results, as nosklo said.
BTW, in Python 2.6 the division is integer by default so you'll have very different results. 1/k will always give 0 unless k <= 1.
Floating point arithmetic has representation and roundoff errors. For the types of data floating point numbers are intended to represent, real numbers of reasonable size, these errors are generally acceptable.
If you want to calculate the quotient of two numbers, the right way is simply to say result = n / k (beware if these are both integers and you have not said from __future__ import division, this is not what you may expect). The second way is silly, error-prone, and ugly.
There is some discussion of floating point inexactness in the Python tutorial: http://docs.python.org/tutorial/floatingpoint.html
Even if we charitably assume a floating-point division, there's very definitely a difference in precision; the for loop is executed n - 1 times!
assert (n-1) / k != n / k
Also depends on what res is initialised to in the second case :-)
Certainly there is a difference if you use floating point numbers, unless the Python interpreter/compiler you are using is capable of optimizing away the loop (Maybe Jython or IronPython might be able to? C compilers are pretty good at this).
If you actually want these two approaches to be the same precision though, and you are using integers for your numerator and denominator, you can use the python fractions package
from fractions import Fraction
n,k = 999,1000
res = Fraction(0,1)
for i in range(0,n):
    res += Fraction(1,k)
print float(res)
