adjusting floats to maintain high precision - python

I am writing some Python code that requires a very high degree of precision. I started to use Numpy float64, but that didn't work as required, and I then started using the "Decimal" module, which then worked fine.
I would ideally prefer, however, to use floats rather than the decimal module - and I recall someone once telling me that it's possible to manipulate floats in some way so that the level of precision can be achieved (by multiplying or something?).
Is this true or am I misremembering? Sorry if this is a little vague.
Thank you!

It depends on the kind of number you have. For example, if you are adding values in the interval [1...2) you might be better off using offset values:
>>> a = 1.0000000000000000001
>>> a
1.0
>>> a+a
2.0
>>> a = 0.0000000000000000001
>>> a
1e-19
For simpler storage you can write them as a tuple (n, f), with n being a natural number (int) and f the fraction in the interval [0...1).
Computation with such values is tricky, however.
>>> (1+1, a+a)
(2, 2e-19)
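Computation with such split values can be sketched as follows; split_add is a hypothetical helper, not a library function, and assumes both fractions stay small enough that their sum keeps full float precision:

```python
# A sketch of arithmetic on (whole, fraction) pairs, where the integer part
# is exact and the fraction in [0...1) keeps its full float precision.
def split_add(a, b):
    """Add two (int, float) pairs, carrying any overflow out of the fraction."""
    n = a[0] + b[0]
    f = a[1] + b[1]
    carry = int(f)          # the fraction may have grown past 1.0
    return (n + carry, f - carry)

x = (1, 1e-19)              # stands for 1.0000000000000000001
print(split_add(x, x))      # (2, 2e-19) -- the tiny part survives
```

A plain float would have collapsed both operands to 1.0 before the addition even ran.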
If in doubt stick with Decimal or use bigfloat as suggested by BenDundee.

Another useful package is mpmath:
>>> import mpmath as mp
>>> mp.mp.dps = 64  # 64 decimal places
>>> mp.sqrt(3)/2
mpf('0.8660254037844386467637231707529361834714026269051903140279034897246')
>>> mp.mp.dps = 30  # 30 decimal places
>>> mp.sqrt(3)/2
mpf('0.866025403784438646763723170752918')

Related

Why Decimal(math.pow(2,60)-1) is NOT equal to Decimal(math.pow(2,60))-Decimal(1)?

The decimal module provides support for fast correctly-rounded decimal floating point arithmetic.
I wrote this to learn this module.
import math
from decimal import *
getcontext().prec = 19
print(Decimal(math.pow(2,60)-1))
print(Decimal(math.pow(2,60))-Decimal(1))
The weird thing is, I got two different results:
1152921504606846976
1152921504606846975
why is that?
Note the number is a long integer rather than a float/double
That is not weird at all. math.pow(2,60) returns a float (1.152921504606847e+18) with all the usual float limitations: subtracting 1 from such a large number does not change it, and that arithmetic happens before Decimal is ever applied.
Using Decimal for the subtraction overcomes this, as does using the integer ** operator instead of math.pow.
>>> 2**60
1152921504606846976
>>> 2**60 - 1
1152921504606846975
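The two code paths can be put side by side (a sketch; the printed values are the ones from the question):

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 19

# math.pow works in 53-bit floats, so the "- 1" is lost before
# Decimal ever sees the number:
print(Decimal(math.pow(2, 60) - 1))   # 1152921504606846976

# Integer arithmetic with ** is exact, so the "- 1" survives:
print(Decimal(2**60 - 1))             # 1152921504606846975
```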

How to correctly deal with floating point arithmetic in Python?

How to correctly add or subtract using floats?
For example how to perform:
2.4e-07 - 1e-8
so that it returns 2.3e-7 instead of 2.2999999999999997e-07.
Converting to int first yields unexpected results, the below returns 2.2e-07:
int(2.4e-07 * 1e8 - 1) * 1e-8
Similarly,
(2.4e-07 * 1e8 - 1) * 1e-8
returns 2.2999999999999997e-07.
How to perform subtraction and addition of numbers with 8 decimal point precision?
2.2999999999999997e-07 is not sufficient as the number is used as a lookup in a dictionary, and the key is 2.3e-7. This means that any value other than 2.3e-7 results in an incorrect lookup.
I suggest using the decimal data type (it is present in the standard installation of Python), because it works in decimal rather than binary and so avoids exactly the differences you are talking about.
>>> from decimal import Decimal
>>> x = Decimal('2.4e-7')
>>> x
Decimal('2.4E-7')
>>> y = Decimal('1e-8')
>>> y
Decimal('1E-8')
>>> x - y
Decimal('2.3E-7')
It's really just a way of skirting the issue of binary floating-point arithmetic, but I suggest using the decimal package from the standard library. It lets you do exact decimal arithmetic.
Using your example,
>>> from decimal import Decimal
>>> x = Decimal('2.4e-7')
>>> y = Decimal('1e-8')
>>> x-y
Decimal('2.3E-7')
It's worth noting that Decimal objects are different from the built-in float type, but they are mostly interchangeable.
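Since the value is used as a dictionary key, one option is to keep the keys as Decimal as well. A sketch with the numbers from the question (table is a hypothetical lookup dict):

```python
from decimal import Decimal

# Hypothetical lookup table keyed on exact decimal values:
table = {Decimal('2.3E-7'): 'found'}

x = Decimal('2.4e-7') - Decimal('1e-8')   # exactly 2.3E-7, no binary noise
print(table[x])                            # found
```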
I do not know if it is what you are looking for, but you can try this kind of thing:
>>> a = 0.555555555
>>> a = float("{0:.2f}".format(a))
>>> a
0.56
I hope it will help you!
Adrien

Floor function is eliminating Integer scientific notation, Python

I will explain my problem by example:
>>> #In this case, I get unwanted result
>>> k = 20685671025767659927959422028 / 2580360422
>>> k
8.016582043889239e+18
>>> math.floor(k)
8016582043889239040
>>> #I dont want this to happen ^^, let it remain 8.016582043889239e+18
>>> #The following case though, is fine
>>> k2 = 5/6
>>> k2
0.8333333333333334
>>> math.floor(k2)
0
How do I stop math.floor from expanding numbers out of scientific notation? Is there a rule for which numbers are displayed in scientific notation (I guess there is a certain boundary)?
EDIT:
I first thought that the math.floor function was causing the accuracy loss, but it turns out that the division itself lost the accuracy, which had me really confused. It can easily be seen here:
>>> math.floor(20685671025767659927959422028 / 2580360422)
8016582043889239040
>>> 8016582043889239040 * 2580360422
20685671025767659370513274880
>>> 20685671025767659927959422028 - 20685671025767659370513274880
557446147148
>>> 557446147148 / 2580360422
216.0342184739958
>>> ##this is >1, meaning I lost quite a bit of information, and it was not due to the flooring
So now my problem is how to get the actual result of the division. I looked at the following thread:
How to print all digits of a large number in python?
But for some reason I didn't get the same result.
EDIT:
I found a simple solution for the division accuracy problem in here:
How to manage division of huge numbers in Python?
Apparently the // operator returns an int rather than a float, and ints have no size limit apart from the machine's memory.
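The difference between the two division paths can be sketched with the numbers from the question:

```python
n = 20685671025767659927959422028
d = 2580360422

exact = n // d        # pure integer arithmetic: every digit is kept
approx = int(n / d)   # / converts to a 53-bit float first, losing the low digits

print(exact)
print(approx)
print(exact - approx)  # the information the float path threw away
```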
In Python 3, math.floor returns an integer. Integers are not displayed using scientific notation. Some floats are represented using scientific notation. If you want scientific notation, try converting back to float.
>>> float(math.floor(20685671025767659927959422028 / 2580360422))
8.016582043889239e+18
As Tadhg McDonald-Jensen indicates, you can also use str.format to get a string representation of your integer in scientific notation:
>>> k = 20685671025767659927959422028 / 2580360422
>>> "{:e}".format(k)
'8.016582e+18'
This may, in fact, be more practical than converting to float. As a general rule of thumb, you should choose a numeric data type based on the precision and range you require, without worrying about what it looks like when printed.

Converting a float to a string without rounding it

I'm making a program that, for reasons not needed to be explained, requires a float to be converted into a string to be counted with len(). However, str(float(x)) results in x being rounded when converted to a string, which throws the entire thing off. Does anyone know of a fix for it?
Here's the code being used if you want to know:
len(str(float(x)/3))
Some form of rounding is often unavoidable when dealing with floating point numbers. This is because numbers that you can express exactly in base 10 cannot always be expressed exactly in base 2 (which your computer uses).
For example:
>>> .1
0.10000000000000001
In this case, you're seeing .1 converted to a string using repr:
>>> repr(.1)
'0.10000000000000001'
I believe python chops off the last few digits when you use str() in order to work around this problem, but it's a partial workaround that doesn't substitute for understanding what's going on.
>>> str(.1)
'0.1'
I'm not sure exactly what problems "rounding" is causing you. Perhaps you would do better with string formatting as a way to more precisely control your output?
e.g.
>>> '%.5f' % .1
'0.10000'
>>> '%.5f' % .12345678
'0.12346'
Documentation here.
len(repr(float(x)/3))
However I must say that this isn't as reliable as you think.
Floats are entered/displayed as decimal numbers, but your computer (in fact, your standard C library) stores them as binary. You get some side effects from this transition:
>>> print len(repr(0.1))
19
>>> print repr(0.1)
0.10000000000000001
The explanation on why this happens is in this chapter of the python tutorial.
A solution would be to use a type that specifically tracks decimal numbers, like python's decimal.Decimal:
>>> print len(str(decimal.Decimal('0.1')))
3
Other answers have already pointed out that the representation of floating-point numbers is a thorny issue, to say the least.
Since you don't give enough context in your question, I cannot know if the decimal module can be useful for your needs:
http://docs.python.org/library/decimal.html
Among other things you can explicitly specify the precision that you wish to obtain (from the docs):
>>> getcontext().prec = 6
>>> Decimal('3.0')
Decimal('3.0')
>>> Decimal('3.1415926535')
Decimal('3.1415926535')
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85987')
>>> getcontext().rounding = ROUND_UP
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85988')
A simple example from my prompt (python 2.6):
>>> import decimal
>>> a = decimal.Decimal('10.000000001')
>>> a
Decimal('10.000000001')
>>> print a
10.000000001
>>> b = decimal.Decimal('10.00000000000000000000000000900000002')
>>> print b
10.00000000000000000000000000900000002
>>> print str(b)
10.00000000000000000000000000900000002
>>> len(str(b/decimal.Decimal('3.0')))
29
Maybe this can help?
decimal is in python stdlib since 2.4, with additions in python 2.6.
Hope this helps,
Francesco
I know this is too late, but for those who are coming here for the first time I'd like to post a solution. I have a float value index and a string imgfile, and I had the same problem as you. This is how I fixed the issue:
index = 1.0
imgfile = 'data/2.jpg'
out = '%.1f,%s' % (index,imgfile)
print out
The output is
1.0,data/2.jpg
You may modify this formatting example as per your convenience.

Decimal place issues with floats and decimal.Decimal

I seem to be losing a lot of precision with floats.
For example I need to solve a matrix:
4.0x - 2.0y + 1.0z = 11.0
1.0x + 5.0y - 3.0z = -6.0
2.0x + 2.0y + 5.0z = 7.0
This is the code I use to import the matrix from a text file:
b = []
y = []
f = open('gauss.dat')
lines = f.readlines()
f.close()
for line in lines:
    bits = line.split(',')
    s = []
    for i in range(len(bits)):
        if i != len(bits) - 1:
            s.append(float(bits[i]))
            #print s[i]
    b.append(s)
    y.append(float(bits[len(bits) - 1]))
I need to solve using gauss-seidel so I need to rearrange the equations for x, y, and z:
x = (11 + 2y - 1z)/4
y = (-6 - x + 3z)/5
z = (7 - 2x - 2y)/5
Here is the code I use to rearrange the equations. b is a matrix of coefficients and y is the answer vector:
def equations(b, y):
    eqn = []
    i = 0
    while i < len(b):
        j = 0
        row = []
        while j < len(b):
            if i == j:
                row.append(y[i] / b[i][i])
            else:
                row.append(-b[i][j] / b[i][i])
            j = j + 1
        eqn.append(row)
        i = i + 1
    return eqn
However the answers I get back aren't precise to the decimal place.
For example, upon rearranging the second equation from above, I should get:
y=-1.2-.2x+.6z
What I get is:
y=-1.2-0.20000000000000001x+0.59999999999999998z
This might not seem like a big issue, but when you raise the number to a very high power the error is quite large. Is there a way around this? I tried the Decimal class, but it does not work well with powers (e.g., Decimal(x)**2).
Any ideas?
IEEE floating point is binary, not decimal. There is no fixed length binary fraction that is exactly 0.1, or any multiple thereof. It is a repeating fraction, like 1/3 in decimal.
Please read What Every Computer Scientist Should Know About Floating-Point Arithmetic
Other options besides a Decimal class are
using Common Lisp or Python 2.6 or another language with exact rationals
converting the doubles to close rationals using, e.g., frap
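The "repeating fraction" point can be made concrete with the standard library's fractions module, which recovers the exact binary value a float stores (a sketch):

```python
from fractions import Fraction

# Fraction(float) is exact: it shows the value the float really holds,
# a ratio with a power-of-two denominator rather than 1/10.
print(Fraction(0.1))                      # 3602879701896397/36028797018963968
print(Fraction(0.1) == Fraction(1, 10))   # False
```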
I'm not familiar enough with the Decimal class to help you out, but your problem is due to the fact that decimal fractions often cannot be accurately represented in binary, so what you're seeing is the closest possible approximation; there's no way to avoid this problem without using a special class (like Decimal, probably).
EDIT: What about the decimal class isn't working properly for you? As long as I start with a string, rather than a float, powers seem to work fine.
>>> import decimal
>>> print(decimal.Decimal("1.2") ** 2)
1.44
The module documentation explains the need for and usage of decimal.Decimal pretty clearly, you should check it out if you haven't yet.
First, your input can be simplified a lot. You don't need to read and parse a file. You can just declare your objects in Python notation. Eval the file.
b = [
[4.0, -2.0, 1.0],
[1.0, +5.0, -3.0],
[2.0, +2.0, +5.0],
]
y = [ 11.0, -6.0, 7.0 ]
Second, y=-1.2-0.20000000000000001x+0.59999999999999998z isn't unusual. There's no exact representation in binary notation for 0.2 or 0.6. Consequently, the values displayed are the decimal approximations of the original not exact representations. Those are true for just about every kind of floating-point processor there is.
You can try the Python 2.6 fractions module. There's an older rational package that might help.
Yes, raising floating-point numbers to powers increases the errors. Consequently, you have to be sure to avoid using the right-most positions of the floating-point number, since those bits are mostly noise.
When displaying floating-point numbers, you have to appropriately round them to avoid seeing the noise bits.
>>> a = 0.2
>>> a
0.20000000000000001
>>> "%.4f" % (a,)
'0.2000'
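The fractions module mentioned above can be sketched on one row of the question's system (4x - 2y + 1z = 11); every coefficient stays an exact ratio:

```python
from fractions import Fraction

row = [Fraction(4), Fraction(-2), Fraction(1)]   # 4x - 2y + 1z
rhs = Fraction(11)                               # = 11

# Rearranged for x, as in the question: x = (11 + 2y - 1z) / 4
coeffs = [-c / row[0] for c in row[1:]]
print(coeffs)         # [Fraction(1, 2), Fraction(-1, 4)] -- exactly 0.5 and -0.25
print(rhs / row[0])   # 11/4 -- no 0.20000000000000001-style noise
```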
I'd caution against the decimal module for tasks like this. Its purpose is really more dealing with real-world decimal numbers (eg. matching human bookkeeping practices), with finite precision, not performing exact precision math. There are numbers not exactly representable in decimal just as there are in binary, and performing arithmetic in decimal is also much slower than alternatives.
Instead, if you want exact results you should use rational arithmetic. This represents numbers as a numerator/denominator pair, so it can represent every rational number exactly. If you stick to addition, subtraction, multiplication, and division (rather than operations like square roots that can produce irrational numbers), you will never lose precision.
As others have mentioned, Python 2.6 will have a built-in rational type, though note that it isn't really a high-performance implementation - for speed you're better off using a library like gmpy. Just replace your calls to float() with gmpy.mpq() and your code should now give exact results (though you may want to format the results as floats for display purposes).
Here's a slightly tidied version of your code to load a matrix that will use gmpy rationals instead:
def read_matrix(f):
    b, y = [], []
    for line in f:
        bits = line.split(",")
        b.append(map(gmpy.mpq, bits[:-1]))
        y.append(gmpy.mpq(bits[-1]))
    return b, y
It is not an answer to your question, but related:
#!/usr/bin/env python
from numpy import abs, dot, loadtxt, max
from numpy.linalg import solve
data = loadtxt('gauss.dat', delimiter=',')
a, b = data[:,:-1], data[:,-1:]
x = solve(a, b) # here you may use any method you like instead of `solve`
print(x)
print(max(abs((dot(a, x) - b) / b))) # check solution
Example:
$ cat gauss.dat
4.0, 2.0, 1.0, 11.0
1.0, 5.0, 3.0, 6.0
2.0, 2.0, 5.0, 7.0
$ python loadtxt_example.py
[[ 2.4]
[ 0.6]
[ 0.2]]
0.0
Also see What is a simple example of floating point error, here on SO, which has some answers. The one I give actually uses python as the example language...
