I am depending on some code that uses the Decimal class because it needs precision to a certain number of decimal places. Some of the functions allow inputs to be floats because of the way that it interfaces with other parts of the codebase. To convert them to decimal objects, it uses things like
mydec = decimal.Decimal(str(x))
where x is the float taken as input. My question is, does anyone know what the standard is for the 'str' method as applied to floats?
For example, take the number 2.1234512. It is stored internally as 2.12345119999999999 because of how floats are represented.
>>> x = 2.12345119999999999
>>> x
2.1234511999999999
>>> str(x)
'2.1234512'
Ok, str(x) in this case is doing something like '%.6f' % x. This is a problem with the way my code converts to decimals. Take the following:
>>> d = decimal.Decimal('2.12345119999999999')
>>> ds = decimal.Decimal(str(2.12345119999999999))
>>> d - ds
Decimal('-1E-17')
So if I have the float, 2.12345119999999999, and I want to pass it to Decimal, converting it to a string using str() gets me the wrong answer. I need to know what are the rules for str(x) that determine what the formatting will be, because I need to determine whether this code needs to be re-written to avoid this error (note that it might be OK, because, for example, the code might round to the 10th decimal place once we have a decimal object)
There must be some set of rules in python's docs that hopefully someone here can point me to. Thanks!
In the Python source, look in "Include/floatobject.h". The precision for the string conversion is set a few lines from the top, after a comment explaining the choice:
/* The str() precision PyFloat_STR_PRECISION is chosen so that in most cases,
the rounding noise created by various operations is suppressed, while
giving plenty of precision for practical use. */
#define PyFloat_STR_PRECISION 12
You have the option of rebuilding, if you need something different. Any changes will change formatting of floats and complex numbers. See ./Objects/complexobject.c and ./Objects/floatobject.c. Also, you can compare the difference between how repr and str convert doubles in these two files.
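As a rough illustration of what that constant means: the old str() behaved approximately like '%.12g' formatting, while 17 significant digits are enough to round-trip a double. This is only a sketch under that assumption. Recent Python 3 releases give the same result for str() and repr() of a float, so %-formatting is used here to imitate the old behavior.

```python
x = 2.12345119999999999

# Twelve significant digits, mimicking PyFloat_STR_PRECISION = 12.
short = "%.12g" % x

# Seventeen significant digits are enough to identify a double uniquely.
full = "%.17g" % x

print(short)              # 2.1234512
print(full)
print(float(full) == x)   # True
```

The 12-digit form hides the trailing noise, which is exactly why it does not round-trip in general.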
There are a couple of issues worth discussing here, but the summary is: you cannot extract information that is not already stored on your system.
If you've taken a decimal number and stored it as a floating point, you'll have lost information, since most decimal (base 10) numbers with a finite number of digits cannot be stored using a finite number of digits in base 2 (binary).
As was mentioned, str(a_float) will really call a_float.__str__(). As the documentation states, the purpose of that method is to
return a string containing a nicely printable representation of an object
There's no particular definition for the float case. My opinion is that, for your purposes, you should consider __str__'s behavior to be undefined, since there's no official documentation on it - the current implementation can change anytime.
If you don't have the original strings, there's no way to extract the missing digits of the decimal representation from the float objects. All you can do is round predictably, using string formatting (which you mention):
Decimal( "{0:.5f}".format(a_float) )
You can also remove 0s on the right with resulting_string.rstrip("0").
Again, this method does not recover the information that has been lost.
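A minimal sketch of that approach, wrapped in a helper so the number of places is explicit. The name float_to_decimal and the default of 5 places are assumptions made for illustration, not part of the code being discussed.

```python
from decimal import Decimal

def float_to_decimal(value, places=5):
    """Round a float to `places` decimal places and return it as a Decimal.

    Hypothetical helper: the rounding is explicit, so the result does not
    depend on how str() happens to format floats.
    """
    return Decimal("{0:.{1}f}".format(value, places))

x = 2.1234512

# Decimal(x) shows the exact binary value stored in the float.
print(Decimal(x))

# Formatting first gives a predictable number of decimal places.
print(float_to_decimal(x, 7))   # 2.1234512
print(float_to_decimal(0.1, 2)) # 0.10
```

Passing the float straight to Decimal is exact but exposes the binary approximation; formatting first trades that for a rounding rule you control.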
Goal
I want to convert all floats to two decimal places regardless of number of decimal places float has without repeating convert code.
For example, I want to convert
50 to 50.00
50.5 to 50.50
without repeating the convert code again and again. What I mean is explained in the following section - research.
Not what this question is about
This question is NOT about:
Only limiting floats to two decimal places - if there are fewer than two decimal places then I want it to have two decimal places, with zeros filling the unused positions.
Flooring the decimal or ceiling it.
Rounding the decimal off.
This question is not a duplicate of this question.
That question only answers the first part of my question - convert floats to two decimal places regardless of number of decimal places float has, not the second part - without repeating convert code.
Nor this question.
That is just how to add units before the decimal place. My question
is: how to convert all floats to two decimal places regardless of
number of decimal places float has without repeating convert code.
Research
I found two ways I can achieve the convert. One is using the decimal module:
from decimal import *
TWOPLACES = Decimal(10) ** -2
print(Decimal('9.9').quantize(TWOPLACES))
Another, without using any other modules:
print(f"{9.9:.2f}")
However, that does not fully answer my question. Notice that the conversion code has to be repeated every time a float is used? Sadly, my whole program is already almost completed and it would be quite a waste of time to add this code here and there so the format is correct. Is there any way to convert all floats to two decimal places, regardless of how many decimal places the float has, without repeating the conversion code?
Clarification
What I mean by convert is, something like what Dmytro Chasovskyi said, that I want all places with floats in my program without extra changes to start to operate like decimals. For example, if I had the operation 1.2345 + 2.7 + 3 + 56.1223183 it should be 1.23 + 2.70 + 3.00 + 56.12.
Also, float is a number, not a function.
The bad news is: there is no "float" with "two decimal places".
Floating point numbers are represented internally with a fixed number of digits in base 2 (see https://floating-point-gui.de/basic/).
And these are both efficient and accurate enough for almost all calculations we perform with any modern program.
What we normally want is for the human-readable text representation of a number, in all outputs of a program, to show only two decimal digits. This is controlled wherever your program writes the value to a text file, prints it to the screen, or renders it into an HTML template (which is, again, "writing it to a text file").
As it happens, the same syntax that converts a number to text embedded in another string also lets you control the exact output format of the number. You give print(f"{9.9:.2f}") as an example. The only thing that looks impractical there is that you hardcode the number along with its conversion. Typically, the number will be in a variable.
Then all you have to do, wherever you output the number, is write:
print(f"The value is: {myvar:.02f}")
instead of
print(f"The value is: {myvar}")
Or do the same in whatever function you call, instead of print, that needs the rendered version of the number. Notice that the use of the word "rendered" here is deliberate: while your program is running, the number is stored in memory in an efficient form, directly usable by the CPU, that is not human readable. At any point you want to "see" the number, you have to convert it into text. It is just that some calls do it implicitly, like print(myvar). So simply resort to converting it explicitly in these places: print(f"{myvar:.02f}").
really having 2 decimal places in memory
If you use decimal.Decimal, then yes, there are ways to keep the internal representation of the number with 2 decimal digits,
but then, instead of just converting the number on output, you must convert it into a "2 decimal place" value on all inputs as well.
That means that whenever ingesting a number into your program, be it typed by the user, read from a binary file or database, or received via wire from a sensor, you have to apply a similar transform to the one used in the output as detailed above. More precisely: you convert your float to a properly formatted string, and then convert that to a decimal.Decimal.
This will prevent your program from accumulating errors due to base conversion, but you will still need to force the format to 2 decimal places on every output, just like above.
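The input-side conversion described above could look roughly like this. It is a sketch, not a definitive implementation: the helper name to_two_places is hypothetical, and it assumes every incoming value can be passed to float().

```python
from decimal import Decimal

def to_two_places(value):
    """Convert an incoming number (or numeric string) to a 2-place Decimal.

    Hypothetical helper: format the value as text with two decimals first,
    then build the Decimal from that text.
    """
    return Decimal(format(float(value), ".2f"))

# Values as they might arrive from user input, a file, or a sensor.
readings = [1.2345, 2.7, 3, "56.1223183"]
total = sum((to_two_places(r) for r in readings), Decimal("0.00"))

print([str(to_two_places(r)) for r in readings])  # ['1.23', '2.70', '3.00', '56.12']
print(total)                                       # 63.05
```

Because every value passes through one function on the way in, the conversion code lives in a single place instead of being repeated at each use.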
Use this function.
def cvt_decimal(value):
    # Format any number (or numeric string) with exactly two decimal places.
    number = float(value)
    return "%.2f" % number
print(cvt_decimal(50))
print(cvt_decimal(50.5))
Output is :
50.00
50.50
You can set the precision of the decimal context, and it then applies to every operation between two Decimal values. Note that prec counts significant digits, not decimal places:
import decimal
from decimal import Decimal
decimal.getcontext().prec = 2
a = Decimal('0.12345')
b = Decimal('0.12345')
print(a + b)
Decimal calculations are precise, but they take more time, so keep that in mind.
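To make the effect of the context precision concrete, here is a small sketch comparing it with quantize. It uses localcontext so the global context is left untouched; the sample values are arbitrary.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 2  # two significant digits for results of arithmetic
    small = Decimal("0.12345") + Decimal("0.12345")
    large = Decimal("123.45") + Decimal("0")

print(small)  # 0.25
print(large)  # 1.2E+2

# quantize fixes the number of decimal places instead.
fixed = Decimal("123.456").quantize(Decimal("0.01"))
print(fixed)  # 123.46
```

If the goal is "two digits after the decimal point", quantize is usually the closer match; prec limits the total number of significant digits.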
Python (and almost anything else) has known limitations while working with floating point numbers (nice overview provided here).
While the problem is described well in the documentation, it avoids providing any approach to fixing it. With this question I am seeking a more or less robust way to avoid situations like the following:
import math
print(math.floor(0.09/0.015)) # >> 6
print(math.floor(0.009/0.0015)) # >> 5
print(99.99-99.973) # >> 0.016999999999825377
print(.99-.973) # >> 0.017000000000000015
var = 0.009
step = 0.0015
print(var < math.floor(var/step)*step+step) # False
print(var < (math.floor(var/step)+1)*step) # True
And unlike what is suggested in this question, their solution does not help fix a problem like the following piece of code failing randomly:
total_bins = math.ceil((data_max - data_min) / width) # round to upper
new_max = data_min + total_bins * width
assert new_max >= data_max
# fails. because for example 1.9459999999999997 < 1.946
If you deal in discrete quantities, use int.
Sometimes people use float in places where they definitely shouldn't. If you're counting something (like number of cars in the world) as opposed to measuring something (like how much gasoline is used per day), floating-point is probably the wrong choice. Currency is another example where floating point numbers are often abused: if you're storing your bank account balance in a database, it's really not 123.45 dollars, it's 12345 cents. (But also see below about Decimal.)
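As a small sketch of the "store cents, not dollars" idea: keep the balance as an int and only build the dollars-and-cents text when displaying it. The helper name format_cents is hypothetical, and the sketch assumes non-negative amounts.

```python
def format_cents(cents):
    """Render a non-negative integer number of cents as dollars.cents text."""
    dollars, remainder = divmod(cents, 100)
    return f"{dollars}.{remainder:02d}"

balance = 12345        # 123.45 dollars, stored exactly as an int
balance += 250         # deposit 2.50 dollars
balance -= 5           # a 5 cent fee

print(balance)                # 12590
print(format_cents(balance))  # 125.90
```

All arithmetic stays in exact integers; the decimal point only appears at display time.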
Most of the rest of the time, use float.
Floating-point numbers are general-purpose. They're extremely accurate; they just can't represent certain fractions, like finite decimal numbers can't represent the number 1/3. Floats are generally suited for any kind of analog quantity where the measurement has error bars: length, mass, frequency, energy -- if there's uncertainty on the order of 2^(-52) or greater, there's probably no good reason not to use float.
If you need human-readable numbers, use float but format it.
"This number looks weird" is a bad reason not to use float. But that doesn't mean you have to display the number to arbitrary precision. If a number with only three significant figures comes out to 19.99909997918947, format it to one decimal place and be done with it.
>>> from math import e, pi
>>> print('{:0.1f}'.format(e**pi - pi))
20.0
If you need precise decimal representation, use Decimal.
Sraw's answer refers to the decimal module, which is part of the standard library. I already mentioned currency as a discrete quantity, but you may need to do calculations on amounts of currency in which not all numbers are discrete, for example calculating interest. If you're writing code for an accounting system, there will be rules that say when rounding is applied and to what accuracy various calculations are done, and those specifications will be written in terms of decimal places. In this situation and others where the decimal representation is inherent to the problem specification, you'll want to use a decimal type.
>>> from decimal import Decimal
>>> rate = Decimal('0.0345')
>>> principal = Decimal('3412.65')
>>> interest = rate*principal
>>> interest
Decimal('117.736425')
>>> interest.quantize(Decimal('0.01'))
Decimal('117.74')
But most importantly, use data types and operations that make sense in context.
Several of your examples use math.floor, which takes a float and rounds it down to the nearest integer. In any situation where you should use math.floor, floating-point error doesn't matter. (If you want to round to the nearest integer, use round instead.) Yes, there are ways to use floating-point operations that have wrong results from a mathematical standpoint. But real-world quantities usually fall into one of these categories:
Exact, and therefore should not be put in a float;
Imprecise to a degree far exceeding the likely accumulation of floating-point error.
As a programmer, it's part of your job to know the quantities you're dealing with and choose appropriate data types. So there's no "fix" for floating point numbers, because there's no "problem" really -- just people using the wrong type for the wrong thing.
Let's talk about decimal. This library stores a number as a sequence of decimal digits, much like a string, and performs arithmetic on those digits.
So it can handle very large numbers with almost perfect precision.
But because it calculates digit by digit, it costs much more.
Further, if you want to use decimal to ensure precision, you need to use it consistently. Mixing Decimal with ordinary types such as float may cause unexpected problems.
Finally, when you construct a Decimal object, it is better to pass a string than a number.
>>> print(Decimal(99.99) - Decimal(99.973))
0.01699999999999590727384202182
>>> print(Decimal("99.99") - Decimal("99.973"))
0.017
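Applied to the binning and floor examples from the question, a sketch with string-constructed Decimal values might look like this. The concrete numbers for data_min, data_max and width are made up for illustration.

```python
import math
from decimal import Decimal

# Hypothetical inputs, built from strings so no binary rounding sneaks in.
data_min = Decimal("0.5")
data_max = Decimal("1.946")
width = Decimal("0.001")

total_bins = math.ceil((data_max - data_min) / width)  # round up to a whole bin count
new_max = data_min + total_bins * width

print(total_bins)            # 1446
print(new_max >= data_max)   # True

# The floor example behaves as expected as well.
print(math.floor(Decimal("0.009") / Decimal("0.0015")))  # 6
```

This works because these particular quotients are exact in base 10; a division such as 1/3 would still be rounded to the context precision.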
It depends what your end goal is - there is no way to "perfectly" store floating point numbers. Only "good enough".
If you are working with money, for example (dollars and cents), it is common practice not to store dollars but only cents (1 dollar = 100 cents) - this is how PayPal stores your account balance on their servers.
There is also the python Decimal class for fixed point arithmetic.
So I have a list of tuples of two floats each. Each tuple represents a range. I am going through another list of floats which represent values to be fit into the ranges. All of these floats are < 1 but positive, so precision matters. One of my tests to determine if a value fits into a range is failing when it should pass. If I print the value and the range that is causing problems I can tell this much:
curValue = 0.00145000000671
range = (0.0014500000067055225, 0.0020968749796738849)
The conditional that is failing is:
if curValue > range[0] and ... blah :
# do some stuff
From the values given by curValue and range, the test should clearly pass (don't worry about what is in the conditional). Now, if I print explicitly what the value of range[0] is I get:
range[0] = 0.00145000000671
Which would explain why the test is failing. So my question then, is why is the float changing when it is accessed. It has decimal values available up to a certain precision when part of a tuple, and a different precision when accessed. Why would this be? What can I do to ensure my data maintains a consistent amount of precision across my calculations?
The float doesn't change. The built-in numeric types are all immutable. The cause of what you're observing is that:
print range[0] uses str on the float, which (up until very recent versions of Python) printed fewer digits of a float.
Printing a tuple (be it with repr or str) uses repr on the individual items, which gives a much more accurate representation (again, this isn't true anymore in recent releases, which use a better algorithm for both).
As for why the condition doesn't work out the way you expect, it's probably the usual culprit, the limited precision of floats. Try print repr(curValue), repr(range[0]) to see what Python decided was the closest possible representation of your float literals.
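The effect can be reproduced on a current Python 3 with a short sketch. It assumes the old str() printed about 12 significant digits, which '%.12g' imitates here; the variable names are illustrative.

```python
low = 0.0014500000067055225

# Twelve significant digits, roughly what the old str() printed.
shown = "%.12g" % low
print(shown)                     # 0.00145000000671

# A float typed in from the printed form is a different float.
retyped = float(shown)
print(retyped == low)            # False
print(retyped > low)             # True

# repr keeps enough digits to get the same float back.
print(float(repr(low)) == low)   # True
```

So a value that was printed with str and compared by eye can look larger than the stored number even though the two floats in the program are identical.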
Floats on modern PCs have limited precision. Even if you enter pi as a constant to 100 decimals, only the leading digits are kept accurately. The same is happening to you. Python floats are 64-bit doubles with 53 bits of mantissa (roughly 15 to 17 significant decimal digits), and a 32-bit float has only 24 bits. Either way the precision is limited, and in unexpected ways because it's in base 2.
Please note, 0.00145000000671 isn't the exact value as stored by Python. Python only displays a few decimals of the complete stored float if you use print. If you want to see more of the stored digits, use repr.
If you want better precision use the decimal module.
It isn't changing per se. Python is doing its best to store the data as a float, but that number is too precise for float, so Python modifies it before it is even accessed (in the very process of storing it). Funny how something so small is such a big pain.
You need to use an arbitrary-precision or fixed-point module like Simple Python Fixed Point or the decimal module.
Not sure it would work in this case, because I don't know if Python's limiting in the output or in the storage itself, but you could try doing:
if curValue - range[0] > 0 and...
I need to have a float variable rounded to 2 decimal places and to store the result in a new variable (or the same one as before, it doesn't matter), but this is what happens:
>>> a
981.32000000000005
>>> b= round(a,2)
>>> b
981.32000000000005
I would need this result, but into a variable that cannot be a string since I need to insert it as a float...
>>> print b
981.32
Actually truncate would also work I don't need extreme precision in this case.
What you are trying to do is in fact impossible. That's because 981.32 is not exactly representable as a binary floating point value. The closest double precision binary floating point value is:
981.3200000000000500222085975110530853271484375
I suspect that this may come as something of a shock to you. If so, then I suggest that you read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
You might choose to tackle your problem in one of the following ways:
Accept that binary floating point numbers cannot represent such values exactly, and continue to use them. Don't do any rounding at all, and keep the full value. When you wish to display the value as text, format it so that only two decimal places are emitted.
Use a data type that can represent your number exactly. That means a decimal rather than binary type. In Python you would use decimal.
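Both options can be seen side by side in a short sketch. The variable name a follows the question; the rest is illustrative.

```python
from decimal import Decimal

a = 981.32

# The float holds the nearest binary value, not 981.32 itself.
print(Decimal(a))              # prints the exact stored value, with many digits
print(round(a, 2) == a)        # True: rounding cannot get any closer

# Option 1: keep the float and format only when displaying.
print(f"{a:.2f}")              # 981.32

# Option 2: use a decimal type that represents 981.32 exactly.
d = Decimal("981.32")
print(d)                       # 981.32
print(Decimal(a) == d)         # False
```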
Try this:
Round = lambda x, n: "%.*f" % (int(n), x)
print Round(0.1, 2)
0.10
print Round(0.1, 4)
0.1000
print Round(981.32000000000005, 2)
981.32
Just pass the number of digits you want as the second argument. Note that the result is a string, not a float.
I wrote a solution of this problem.
Please try:
from decimal import *
from autorounddecimal.core import adround,decimal_round_digit
decimal_round_digit(Decimal("981.32000000000005")) #=> Decimal("981.32")
adround(981.32000000000005) # just wrap decimal_round_digit
More detail can be found in https://github.com/niitsuma/autorounddecimal
There is a difference between the way Python prints floats and the way it stores floats. For example:
>>> a = 1.0/5.0
>>> a
0.20000000000000001
>>> print a
0.2
It's not actually possible to store an exact representation of many decimal values as floats, as David Heffernan points out. It can be done if, looking at the value as a fraction, the denominator is a power of 2 (such as 1/4, 3/8, 5/64). Otherwise, due to the inherent limitations of binary, it has to make do with an approximation.
Python recognizes this, and when you use the print function, it will use the nicer representation seen above. This may make you think that Python is storing the float exactly, when in fact it is not, because it's not possible with the IEEE standard float representation. The difference in calculation is pretty insignificant, though, so for most practical purposes it isn't a problem. If you really really need those significant digits, though, use the decimal package.
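The power-of-2 rule can be checked with the standard fractions module, which converts a float to the exact fraction it stores. A brief sketch:

```python
from fractions import Fraction

# 3/8 has a power-of-2 denominator, so the float holds it exactly.
print(Fraction(0.375))                    # 3/8
print(Fraction(0.375) == Fraction(3, 8))  # True

# 1/5 does not, so the float holds a nearby fraction instead.
stored = Fraction(0.2)
print(stored == Fraction(1, 5))           # False

# The stored denominator is itself a power of 2.
d = stored.denominator
print(d & (d - 1) == 0)                   # True
```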
I am trying to write a method in Python 3.2 that encrypts a phrase and then decrypts it. The problem is that the numbers are so big that when Python does math with them it immediately converts them into scientific notation. Since my code requires all the digits of the numbers to function, scientific notation is not useful.
What I have is:
coded = ((eval(input(':'))+1213633288469888484)/2)+1042
Basically, I just get a number from the user and do some math to it.
I have tried format() and a couple other things but I can't get them to work.
EDIT: I use only even integers.
In Python 3, '/' does true division (i.e. floating point). To get integer division, you need to use //. In other words, 100/2 yields 50.0 (float) whereas 100//2 yields 50 (integer).
Your code probably needs to be changed as:
coded = ((eval(input(':'))+1213633288469888484)//2)+1042
As a precaution, however, you may want to consider using int instead of eval, since eval will execute whatever expression the user types:
coded = ((int(input(':'))+1213633288469888484)//2)+1042
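To see why the difference matters for numbers of this size, here is a short sketch using the constant from the question. With / the result goes through a float and loses its low digits, while // stays an exact int.

```python
n = 1213633288469888484

as_float = n / 2    # true division: result is a float
as_int = n // 2     # floor division: result is an exact int

print(as_float)                 # shown in scientific notation
print(as_int)                   # 606816644234944242
print(int(as_float) == as_int)  # False: the float dropped the low digits
```

A float has only 53 bits of precision, so an 18-digit integer cannot survive the round trip through /.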
If you know that the floating point value is really an integer, or you don't care about dropping the fractional part, you can just convert it to an int before you print it.
>>> print 1.2e16
1.2e+16
>>> print int(1.2e16)
12000000000000000