Python Dictionary Floats - python

I came across a strange behavior in Python (2.6.1) dictionaries:
The code I have is:
new_item = {'val': 1.4}
print new_item['val']
print new_item
And the result is:
1.4
{'val': 1.3999999999999999}
Why is this? It happens with some numbers, but not others. For example:
0.1 becomes 0.1000...001
0.4 becomes 0.4000...002
0.7 becomes 0.6999...996
1.9 becomes 1.8999...999

This is not Python-specific; the issue appears in every language that uses binary floating point (which is pretty much every mainstream language).
From the Floating-Point Guide:
Because internally, computers use a format (binary floating-point)
that cannot accurately represent a number like 0.1, 0.2 or 0.3 at all.
When the code is compiled or interpreted, your “0.1” is already
rounded to the nearest number in that format, which results in a small
rounding error even before the calculation happens.
Some values can be represented exactly as binary fractions, and output formatting routines will often display the shortest number that is closer to the actual value than to any other floating-point number, which masks some of the rounding errors.
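You can check this yourself by asking for more digits than the default display shows (an illustrative sketch, not part of the guide; works in Python 2 and 3):

print("%.20f" % 1.4)   # 1.39999999999999991118... -- the stored double is slightly below 1.4
print("%.20f" % 0.1)   # 0.10000000000000000555... -- slightly above 0.1

In Python 2.6, printing a dict uses repr() of its values, which showed 17 significant digits, while print new_item['val'] uses str(), which rounded to 12 digits; that is why the dict display exposed 1.3999999999999999 while the direct print showed 1.4.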

This problem is related to floating point representations in binary, as others have pointed out.
But I thought you might want something that would help you solve your implied problem in Python.
It's unrelated to dictionaries, so if I were you, I would remove that tag.
If you can use a fixed-precision decimal number for your purposes, I would recommend you check out the Python decimal module. From the page (emphasis mine):
Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.
The exactness carries over into arithmetic. In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. While near to zero, the differences prevent reliable equality testing and differences can accumulate. For this reason, decimal is preferred in accounting applications which have strict equality invariants.
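You can reproduce the quoted comparison with a few lines (a sketch of mine; note the Decimals are built from strings, so no binary rounding error sneaks in at construction time):

from decimal import Decimal

print(0.1 + 0.1 + 0.1 - 0.3)   # about 5.5511151231257827e-17 with binary floats
print(Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3'))   # 0.0 exactly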

Related

How does using decimal for money avoid the floating point problems in python?

So currency/money has lots of known math issues when using floating point. It seems that in Python, decimal is used in money libraries, but according to the Python docs, decimal is based on a floating-point model. So how does it not have the same problems?
Context:
a lot of currency libraries measure their monetary values as integers (so cents of USD, not dollars). We've just had the issue of a Python application representing its money as decimal; it goes into JavaScript, which then needs to convert it to an integer for another service.
10.05 * 100 became 1005.0000...1, which is, of course, not an integer. So of course I was wondering why Python chose this route, as most recommendations I've seen recommend treating money as integers.
You are confusing binary floating point with decimal floating point. From the module documentation:
The decimal module provides support for fast correctly-rounded decimal floating point arithmetic.
[...]
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point
(bold emphasis mine).
The floating point aspect refers to the variability of the exponent; the number 12300000 can be represented as 123 with a decimal exponent of 5 (10 ** 5). Both float and decimal use a floating point representation. But float adds up a number of binary fractions (1/2 + 1/4 + 1/8 + 1/16 + ...), and that makes it unsuitable for representing currencies, because binary fractions cannot precisely model 1/100ths or 1/10ths, which currency values tend to use a lot.
The DZone article on floating point issues for currency that you link also teaches you about the Java java.math.BigDecimal class. Python's decimal is essentially the same thing; where the BigDecimal documentation talks about values consist[ing] of an arbitrary precision integer unscaled value and a 32-bit integer scale, that scale is essentially the position of the floating (decimal) point.
Because decimal can represent 1/100ths (cents) in currency values exactly, it is far more suitable to model currency values.
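To tie that back to the dollars-to-cents conversion from the question, a small sketch (the variable names are mine):

from decimal import Decimal

price = Decimal('10.05')    # built from a string, so it is exactly 10.05
cents = int(price * 100)    # exact: 1005
print(cents)

print(10.05 * 100)          # binary float: typically 1005.0000000000001, not an integer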
Decimal avoids some of the problems of binary floating-point, but not all, possibly not even most.
The actual problem is not floating-point but numerical formats. No numerical format can represent all real numbers, or even all rational numbers, so no numerical format can handle all the operations we want to do with numbers.
Money is commonly represented in decimal fractions of a unit of currency. For example, the US dollar and many other currencies have a "cent", which is 1/100th of the main unit. A decimal format can represent 1/100th exactly. A binary format cannot. So, with a decimal format, you can:
Represent decimal units of currency exactly (within bounds of the format).
Add and subtract decimal amounts of currency exactly (within bounds of the format).
Multiply decimal units of currency by integers exactly (within bounds of the format).
However, problems arise when you try:
To average numbers or divide by numbers other than products of twos and fives (the factors of ten). For example, if a grocery store wants to sell a product at three for a dollar, there is no way to represent ⅓ exactly in a decimal format.
Multiplying numbers with decimal fractions more than a few times. Each multiplication will increase the number of digits after the decimal point. For example, interest compounded monthly for a year cannot be computed exactly with typical decimal formats.
Any complex (in the general sense, not mathematical) operations such as exponentiation that may be involved in considering the time value of money, stock market options evaluation, and so on.
There is no general solution to how to compute numerically. Studying numerical computing and its errors is an entire field of study with textbooks, courses, and research papers. So you cannot solve numerical problems merely by choosing a format. It is important to understand whatever format(s) you use, what errors arise in using them, how to deal with those errors, and what results you need to achieve.
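A short illustration of those limits (my own sketch, using the decimal module's default 28-digit context):

from decimal import Decimal

print(Decimal('0.10') + Decimal('0.20'))   # 0.30 -- cents add up exactly
third = Decimal(1) / Decimal(3)
print(third)                               # 0.3333333333333333333333333333 (rounded)
print(third * 3)                           # 0.9999999999999999999999999999, not 1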
Decimal types allow decimal floating point rather than binary floating point. The class of problems you are referring to relates to the latter.

Mongodb lack of precision incrementing floats

I have a problem because MongoDB doesn't seem to maintain precision when incrementing floats. For example, the following should yield 2.0:
from decimal import Decimal  # for python precision

# db is assumed to be an existing pymongo database handle
for i in range(40):
    db.test.update({}, {'$inc': {'count': float(Decimal(1) / 20)}}, upsert=True)

print db.test.find_one()['count']
2.000000000000001
How can I get around this issue?
Unfortunately, you can't -- at least not directly. Mongo stores floating-point numbers as double-precision IEEE floats (https://en.wikipedia.org/wiki/IEEE_floating_point), and those rounding errors are inherent to the format.
I'm noticing you're using Decimals in your code -- they're converted to Python floats (which are doubles) before being sent to the DB. If you want to keep your true decimal precision, you'll have to store your numbers as stringified Decimals, which means you'll also have to give up Mongo's number-handling facilities such as $inc.
It is, sadly, a tradeoff you'll be confronted with in most databases and programming languages: IEEE floating-point numbers are the format CPUs natively deal with, and any attempt to stray away from them (to use arbitrary-precision decimals like decimal.Decimal) comes with a big performance and usability penalty.
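If you do go the stringified-Decimal route described above, a hypothetical sketch looks like this (db is assumed to be a pymongo database handle, the count field is assumed to hold a decimal string, and you give up $inc in exchange for exactness):

from decimal import Decimal

doc = db.test.find_one() or {}
count = Decimal(doc.get('count', '0'))   # the value is stored as a string, e.g. '1.95'
count += Decimal('0.05')                 # exact decimal arithmetic done client-side
db.test.update({}, {'$set': {'count': str(count)}}, upsert=True)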

Can anyone explain this subtraction and summation in python? [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 9 years ago.
I'm a bit confused about how this subtraction and summation work this way:
A = 5
B = 0.1
C = A+B-A
And I found that the answer is 0.099999999999999645. Why is the answer not 0.1?
This is a floating point rounding error. The Python website has a really good tutorial on floating point numbers that explains what this is and why it happens.
If you want an exact result you can:
try using the decimal module
format your result to display a set number of decimal places (this doesn't fix the rounding error):
print "%.2f" % C
I also recommend reading "What Every Computer Scientist Should Know About Floating-Point Arithmetic" from Brian's answer.
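Both suggestions applied to the numbers from the question (an illustrative sketch of mine):

from decimal import Decimal

A = Decimal('5')
B = Decimal('0.1')
print(A + B - A)    # 0.1 -- decimal represents one tenth exactly

C = 5 + 0.1 - 5
print("%.2f" % C)   # 0.10 -- formatting only hides the binary rounding error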
Why is the answer not 0.1?
Floating point numbers are not precise enough to get that answer. But holy cow is it ever close!
I recommend that you read "What Every Computer Scientist Should Know About Floating-Point Arithmetic"
You're seeing an artefact of floating point arithmetic, which doesn't have infinite precision. See this article for a full description of how FP maths works, and why you see rounding errors.
Computers use "binary numbers" to store information. Integers can be stored exactly, but fractional numbers are usually stored as "floating-point numbers".
There are numbers that are easy to write in base-10 that cannot be exactly represented in binary floating-point format, and 0.1 is one of those numbers.
It is possible to store numbers exactly, and work with the numbers exactly. For example, the number 0.1 can be stored as 1 / 10, in other words stored as a numerator (1) and a denominator (10), with the understanding that the numerator is divided by the denominator. Then a properly-written math library can work with these fractions and do math for you. But it is much, much slower than just using floating-point numbers, so it's not that often used. (And I think in banking, they usually just use integers instead of floating-point to store money; $1.23 can be stored as the number 123, with an implicit two decimal places. When dealing in money, floating point isn't exact enough!)
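Python's standard library actually ships this numerator/denominator idea as the fractions module; a tiny sketch (mine, not part of the answer):

from fractions import Fraction

tenth = Fraction(1, 10)          # stored exactly as numerator 1 and denominator 10
print(tenth + tenth + tenth)     # 3/10, with no rounding error

print(Fraction(0.1))             # 3602879701896397/36028797018963968 -- the float's true value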
This is because of the so-called epsilon value: roughly speaking, every floating point number from x to x + E is considered to be equal to x.
You can read something about this value in this Q&A.
In Python this epsilon value (E) depends on the magnitude of the number; you can always get it from numpy.spacing(x).
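For example (a small sketch that assumes NumPy is installed), the gap to the next representable double grows with the magnitude of the number:

import numpy as np

print(np.spacing(1.0))    # about 2.22e-16
print(np.spacing(1e6))    # about 1.16e-10
print(np.spacing(1e15))   # 0.125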

Why is decimal multiplication slightly inaccurate? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why do simple math operations on floating point return unexpected (inacurate) results in VB.Net and Python?
Why does this happen in Python:
>>> 483.6 * 3
1450.8000000000002
I know this happens in other languages, and I'm not asking how to fix this. I know you can do:
>>> from decimal import Decimal
>>> Decimal('483.6') * 3
Decimal('1450.8')
So what exactly causes this to happen? Why do decimals get slightly inaccurate when doing math like this?
Is there any specific reason the computer doesn't get this right?
See the Python documentation on floating point numbers. Essentially, when you create a floating point number you are using base 2 arithmetic. Just as 1/3 is 0.333... on into infinity in base 10, most decimal fractions cannot be exactly expressed in base 2. Hence your result.
The difference between the Python interpreter and some other languages is that others may not display these extra digits. It's not a bug in Python, just how the hardware computes using floating-point arithmetic.
Computers can't represent every floating point number perfectly.
Basically, floating point numbers are represented in scientific notation, but in base 2. Now, try representing 1/3 (base 10) with scientific notation. You might try 3 × 10^-1 or, better yet, 33333333 × 10^-8. You could keep adding 3's, but you'd never have an exact value of 1/3. Now, try representing 1/10 in binary scientific notation, and you'll find that the same thing happens.
Here is a good link about floating point in python.
As you delve into lower level topics, you'll see how floating point is represented in a computer. In C, for example, floating point numbers are represented as explained in this stackoverflow question. You don't need to read this to understand why decimals can't be represented exactly, but it might give you a better idea of what's going on.
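One way to peek at what is actually stored (a sketch of mine; passing a float straight to Decimal needs Python 2.7 or 3.x) is to convert the float to Decimal, which prints its exact binary value:

from decimal import Decimal

print(Decimal(483.6))   # 483.60000000000002273... -- the exact value of the stored double
print(483.6 * 3)        # that tiny excess surfaces as 1450.8000000000002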
Computers store numbers as bits (in binary). Unfortunately, even with infinite memory, you cannot accurately represent some decimals in binary, for example 0.3. The notion is akin to trying to store 1/3 in decimal notation exactly.

Floating point representation error in Python [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How is floating point stored? When does it matter?
Python rounding error with float numbers
I am trying to understand why we get floating point representation error in Python. I know this is not a new question here, but honestly I am finding it hard to understand. I am going through the official Python documentation at http://docs.python.org/tutorial/floatingpoint.html, in the Representation Error section at the bottom of the page.
But I am not able to see how the expression J/2**N comes into the picture, and why I am getting this value in my interpreter:
0.1 ---> 0.10000000000000001
The closest questions I found are "floating point issue" and "How are floating point numbers stored in memory?", but I was not able to understand them.
Can anyone please explain this in detail and in simple language? I appreciate any help.
Thanks,
Sunil
You can think of 0.1 as being, for a computer, a rational number whose expansion in its base (binary) is not finite.
Take 1/3 for instance. For us humans, we know that it means "one third" (no more, no less). But if we were to write it down without fractions, we would have to write 0.3333... and so on. In fact, there is no way we can represent exactly one third with a decimal notation. So there are numbers we can write using decimal notation, and numbers we can't. For the latter, we have to use fractions - and we can do so because we have been taught maths at school.
On the other hand, the computer works with bits (only 2 digits: 1 and 0), and can only use a binary notation - no fractions. Because of the different base (2 instead of 10), the set of numbers with a finite expansion is different: numbers that we can represent exactly in decimal notation may not be representable exactly in binary notation, and vice versa. What looks like a simple case for us (1/10 = one tenth = 0.1, exactly) is not necessarily an easy case for a CPU.
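To connect this back to the J/2**N expression in the tutorial, you can ask any float for its exact numerator and denominator (float.as_integer_ratio has been available since Python 2.6); a small sketch:

print((0.1).as_integer_ratio())
# (3602879701896397, 36028797018963968), i.e. J = 3602879701896397 and 2**N = 2**55

print(2 ** 55)                           # 36028797018963968
print(3602879701896397 / (2.0 ** 55))    # 0.1 -- the closest double to one tenth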
