When I execute these two lines, I get two different results. Why?
The item variable is of type numpy.float32:
print(item)
print(item * 1)
output:
0.0006
0.0006000000284984708
I suspect this is somehow related to the numpy.float32 type?
If I try to convert the numpy.float32 to float, I get this:
item = float(item)
print(item)
output:
0.0006000000284984708
What you observe is unfortunately not avoidable. It has to do with the internal representation of a float number. In this case it doesn't even have to do with calculation issues, as suggested in the comments here.
(Binary-base) float numbers as used by most languages are represented as (+/- mantissa) * 2^exponent.
The important part here is the mantissa, which does not allow all numbers to be represented exactly. The value ranges of the mantissa and the exponent depend on the bit length of the float you use. The exponent determines the maximum and minimum representable numbers, while the mantissa determines the precision of the representable numbers (loosely speaking, the "granularity" of the numbers).
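You can see both roles by querying the float types, for example with NumPy's finfo:
import numpy as np
print(np.finfo(np.float32).max, np.finfo(np.float32).eps)  # ~3.4028235e+38, ~1.1920929e-07
print(np.finfo(np.float64).max, np.finfo(np.float64).eps)  # ~1.7976931e+308, ~2.220446e-16
The max values come from the exponent range; eps (the step size just above 1.0) comes from the mantissa width.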
So for your question, the mantissa is more important. As said, it is like a bit array. In a byte, a bit has a value depending on its position: 1, 2, 4, ...
In the mantissa it is similar, but instead of 1, 2, 4, ..., the bits have the values 1/2, 1/4, 1/8, ...
So if you want to represent 0.75, the bits with the values 1/2 and 1/4 would be set in your mantissa and the exponent would be 0. That's it, in very short form.
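You can check this in Python: math.frexp splits a float into mantissa and exponent, and float.hex shows the exact binary representation.
import math
m, e = math.frexp(0.75)
print(m, e)          # 0.75 0  ->  0.75 = (1/2 + 1/4) * 2**0
print((0.75).hex())  # 0x1.8000000000000p-1, the exact stored bits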
Now, if you try to represent a value like 0.11 in a float representation, you will notice that it is not possible, no matter whether you use float32 or float64:
import numpy as np
item=np.float64('0.11')
print('{0:2.60f}'.format(item))
output: 0.110000000000000000555111512312578270211815834045410156250000
item=np.float32('0.11')
print('{0:2.60f}'.format(item))
output: 0.109999999403953552246093750000000000000000000000000000000000
By the way, if you want to represent the value 0.25 (1/4), it is not the bit for 1/4 that is set; instead, the bit for 1/2 is set and the exponent is set to -1, so 1/2 * 2^(-1) is again 0.25. This is done in a normalization process.
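math.frexp makes this normalization visible:
import math
print(math.frexp(0.25))  # (0.5, -1): the stored mantissa is 1/2, the exponent is -1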
If you want to increase the precision, you can use float64, as I did in my example. That will reduce this phenomenon a bit.
It seems that some systems also support decimal-based floats. I haven't worked with them, but they would probably avoid this kind of problem (though not the calculation issues mentioned in the post someone else linked as an answer).
The reason you see two different results is that your variable item is a numpy.float32, as you said. Python internally uses 64-bit floating point numbers, so
print(item)
returns the (lower precision) 32-bit result, while
print(item * 1)
first multiplies by 1, which is an integer. An integer cannot be multiplied with a float directly, so both operands are converted to floats - 64-bit floats, since you do not specify anything else. The result is then a 64-bit float.
If you specify another type for the "1",
print(item * numpy.float32(1))
returns the same result as print(item), because there is no type conversion and everything stays in 32 bit.
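You can verify this by checking the types. (Caveat: this reflects the promotion rules of NumPy before 2.0; with NEP 50, adopted in NumPy 2.0, a Python int no longer upcasts a float32 scalar, so item * 1 stays float32 there.)
import numpy as np
item = np.float32(0.0006)
print(type(item))                  # <class 'numpy.float32'>
print(type(item * 1))              # <class 'numpy.float64'> on NumPy < 2.0
print(type(item * np.float32(1)))  # <class 'numpy.float32'> -- no conversion needed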
You haven't specified exactly what the problem is, beyond "the numbers don't match". How you handle floating point depends a little on your application, but in general you can't rely on comparing floating point numbers exactly. With a few obvious exceptions: 0 times anything should be 0, and 1 times anything should be that same thing (there are more, but let's stop there). So why is 1*item different from item?
>>> item = np.float32(0.0006)
>>> item
0.0006
>>> item*1
0.0006000000284984708
Right, this seems to contradict common sense. But no, it's just the wrong way to compare. Do an actual comparison and everything is still alright with the world.
>>> item == item*1
True
The numbers are the same. This should make sense - increasing the precision of a floating point number shouldn't change its value, and multiplying by 1 should not change a number.
So, what's going on? Numpy converts an np.float32 value to a Python float, which prints with nice rounding. However, item*1 is an np.float64, which by default shows more significant figures. If you print both of these with the same number of significant figures, you can see there's no real difference.
>>> "{:0.015f}".format(item*1)
'0.000600000028498'
>>> "{:0.015f}".format(item)
'0.000600000028498'
So that's it. What Python prints isn't meant to be a completely accurate representation of numbers. The other answers get into why 0.0006 can't be represented exactly.
Edit
Rounding doesn't change this; it just converts item to a Python float, which prints with rounding.
>>> "{:0.015f}".format(round(item, 4))
'0.000600000028498'
I cannot seem to find the logic in this, but I have made a workaround: simply converting the numpy.float32 to float and rounding the numbers to a specific number of decimals.
Related
I had some issues with a piece of code and ended up doing the following command line snippet. This was just an experiment, and I didn't store such large values in any variable in the real code (modulo 10**9 + 7).
>>> a=1
>>> for i in range(1,101):
... a=a*i
...
>>> b=1
>>> for i in range(1,51):
... b=b*i
...
>>> c=pow(2,50)
>>> a//(b*c)
2725392139750729502980713245400918633290796330545803413734328823443106201171875
>>> a/(b*c)
2.7253921397507295e+78
>>> (a//(b*c))%(10**9 +7)
196932377
>>> (a/(b*c))%(10**9 +7)
45708938.0
>>>
I don't understand why integer division gives the correct output while floating point division fails.
Basically I calculated: ((100!) / ((50!) * (2^50))) % (10**9 + 7)
Because of precision.
Integers and floats are coded differently. In particular, in Python 3, integers can be arbitrarily large - the one you gave, for example, is more than 250 bits large when you convert it to binary. They're stored in a way that can accommodate however large they are.
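You can check the size directly with int.bit_length:
>>> (2725392139750729502980713245400918633290796330545803413734328823443106201171875).bit_length()
261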
However, floating-point numbers are constrained to a certain size - usually 64 bits. These 64 bits are divided into a sign (1 bit), mantissa, and exponent - the number of bits in the mantissa limit how precise the number can be. Python's documentation contains a section on this limitation.
So, when you do
(a//(b*c))%(10**9 +7)
you're performing that calculation with integers, which, again, are arbitrarily large. However, when you do this:
(a/(b*c))%(10**9 +7)
you're performing that calculation with a number that only has about 15-17 significant decimal digits - it's already imprecise, and doing more calculations with it only further corrupts the answer.
What you can do to avoid this, if you need to use very large floating-point numbers, is use Python's decimal module (part of the standard library), which does not have these problems as long as you give it enough precision.
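As a sketch, here is the same calculation done with decimal, with the context precision raised high enough to hold every digit of the ~79-digit quotient:
from decimal import Decimal, getcontext
from math import factorial

getcontext().prec = 100  # enough significant digits for this calculation
a = Decimal(factorial(100))
b = Decimal(factorial(50))
c = Decimal(2) ** 50
print((a / (b * c)) % (10**9 + 7))  # 196932377, matching the integer result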
The reason is that integers are precise, but floats are limited by the floating point precision: Python2.7 default float precision
This is more of a numerical analysis question than a programming question, but I suppose some of you will be able to answer it.
In the sum of two floats, is there any precision lost? Why?
In the sum of a float and an integer, is there any precision lost? Why?
Thanks.
In the sum of two floats, is there any precision lost?
If both floats have differing magnitudes and both are using the complete precision range (of about 7 decimal digits for a 32-bit float), then yes, you will see some loss in the last places.
Why?
This is because floats are stored in the form (sign) (mantissa) × 2^(exponent). If two values have differing exponents and you add them, then the smaller value gets reduced to fewer digits in the mantissa (because it has to adapt to the larger exponent):
PS> [float]([float]0.0000001 + [float]1)
1
In the sum of a float and an integer, is there any precision lost?
Yes, a normal 32-bit integer can represent values exactly which do not fit exactly into a float. A float can still store approximately the same number, but no longer exactly. Of course, this only applies to numbers that are large enough, i.e. longer than 24 bits.
Why?
Because float has 24 bits of precision and (32-bit) integers have 32. float will still be able to retain the magnitude and most of the significant digits, but the last places are likely to differ:
PS> [float]2100000050 + [float]100
2100000100
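The same effect can be sketched in Python with NumPy's 32-bit floats (the examples above use PowerShell's [float], which is also 32-bit):
import numpy as np
print(np.float32(2**24) + np.float32(1))  # 16777216.0 -- the +1 is lost; float32 has only 24 significant bits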
The precision depends on the magnitude of the original numbers. In floating point, the computer represents the number 312 internally as scientific notation:
3.12000000000 * 10 ^ 2
The decimal places in the left hand side (mantissa) are fixed. The exponent also has an upper and lower bound. This allows it to represent very large or very small numbers.
If you try to add two numbers of the same magnitude, the result keeps the same precision, because the decimal point doesn't have to move:
312.0 + 643.0 <==>
3.12000000000 * 10 ^ 2 +
6.43000000000 * 10 ^ 2
-----------------------
9.55000000000 * 10 ^ 2
If you tried to add a very big and a very small number, you would lose precision, because they must be squeezed into the above format. Consider 1230000000000000 + 31234. First you have to scale the smaller number to line up with the bigger one, then add:
1.23000000000 * 10 ^ 15 +
0.00000000003 * 10 ^ 15    (31234 scaled down; its trailing digits no longer fit)
-----------------------
1.23000000003 * 10 ^ 15 <-- precision lost here!
Floating point can handle very large, or very small numbers. But it can't represent both at the same time.
As for ints and doubles being added, the int gets turned into a double immediately, then the above applies.
When adding two floating point numbers, there is generally some error. D. Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" describes the effect and the reasons in detail, and also how to calculate an upper bound on the error, and how to reason about the precision of more complex calculations.
When adding a float to an integer, the integer is first converted to a float by C++, so two floats are being added and error is introduced for the same reasons as above.
The precision available for a float is limited, so of course there is always the risk that any given operation drops precision.
The answer for both your questions is "yes".
If you try adding a very large float to a very small one, you will for instance have problems.
Or if you try to add an integer to a float, where the integer uses more bits than the float has available for its mantissa.
The short answer: a computer represents a float with a limited number of bits, which is often done with mantissa and exponent, so only a few bytes are used for the significant digits, and the others are used to represent the position of the decimal point.
If you were to try to add (say) 10^23 and 7, then it won't be able to accurately represent that result. A similar argument applies when adding a float and integer -- the integer will be promoted to a float.
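For example:
>>> 1e23 + 7 == 1e23   # the 7 is far below the last representable bit of 1e23
True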
In the sum of two floats, is there any precision lost?
In the sum of a float and an integer, is there any precision lost? Why?
Not always. If the sum is representable with the precision you ask for, then you won't get any precision loss.
Example: 0.5 + 0.75 => no precision loss
x * 0.5 => no precision loss (except if x is so small that the result becomes subnormal)
In the general case, one adds floats in slightly different ranges, so there is a precision loss which actually depends on the rounding mode.
ie: if you're adding numbers with totally different ranges, expect precision problems.
Denormals are here to give extra precision in extreme cases, at the expense of CPU time.
Depending on how your compiler handle floating-point computation, results can vary.
With strict IEEE semantics, adding two 32 bits floats should not give better accuracy than 32 bits.
In practice it may require more instructions to ensure that, so you shouldn't rely on accurate and repeatable results with floating point.
In both cases yes:
assert( 1E+36f + 1.0f == 1E+36f );
assert( 1E+36f + 1 == 1E+36f );
The case float + int is the same as float + float, because a standard conversion is applied to the int. In the case of float + float, this is implementation dependent, because an implementation may choose to do the addition at double precision. There may be some loss when you store the result, of course.
In both cases, the answer is "yes". When adding an int to a float, the integer is converted to floating point representation before the addition takes place anyway.
To understand why, I suggest you read this gem: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Related
If you take a number, take its square root, drop the decimal, and then raise it to the second power, the result should always be less than or equal to the original number.
This seems to hold true in Python until you try it on 99999999999999975425, for some reason.
import math

def check(n):
    assert math.pow(math.floor(math.sqrt(n)), 2) <= n

check(99999999999999975424)  # No exception.
check(99999999999999975425)  # Throws AssertionError.
It looks like math.pow(math.floor(math.sqrt(99999999999999975425)), 2) returns 1e+20.
I assume this has something to do with the way we store values in python... something related to floating point arithmetic, but I can't reason about specifically how that affects this case.
The problem is not really about sqrt or pow; the problem is that you're using numbers larger than floating point can represent precisely. Standard IEEE 64-bit floating point arithmetic can't represent every integer value beyond 53 significant bits (a 52-bit stored mantissa plus one implicit leading bit).
Try just converting your inputs to float and back again:
>>> int(float(99999999999999975424))
99999999999999967232
>>> int(float(99999999999999975425))
99999999999999983616
As you can see, the representable value skipped by 16384. The first step in math.sqrt is converting to float (C double), and at that moment, your value increased by enough to ruin the end result.
Short version: float can't represent large integers precisely. Use decimal if you need greater precision. Or if you don't care about the fractional component, as of 3.8, you can use math.isqrt, which works entirely in integer space (so you never experience precision loss, only the round down loss you expect), giving you the guarantee you're looking for, that the result is "the greatest integer a such that a² ≤ n".
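For instance, a sketch of the corrected check using math.isqrt (Python 3.8+):
import math

def check(n):
    # math.isqrt works entirely in integer space, so nothing is converted to float
    assert math.isqrt(n) ** 2 <= n

check(99999999999999975424)  # passes
check(99999999999999975425)  # now also passes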
Unlike Evan Rose's (now-deleted) answer claims, this is not due to an epsilon value in the sqrt algorithm.
Most math module functions cast their inputs to float, and math.sqrt is one of them.
99999999999999975425 cannot be represented as a float. For this input, the cast produces a float with exact numeric value 99999999999999983616, which repr shows as 9.999999999999998e+19:
>>> float(99999999999999975425)
9.999999999999998e+19
>>> int(_)
99999999999999983616L
The closest float to the square root of this number is 10000000000.0, and that's what math.sqrt returns.
I have to translate euros (in a string) to euro cents (int):
Examples:
'12,1' => 1210
'14,51' => 1451
I use this python function:
int(round(float(amount.replace(',', '.')), 2) * 100)
But with the amount '1229,84' the result is: 122983
Update
I use the solution from Wim, because I use integers for currency arithmetic in both Python/Jinja and JavaScript. See also the answer from Chepner.
int(round(100 * float(amount.replace(',', '.')), 2))
My question was answered by Mr. Me, who explained the above result.
What the Docs Say, and a simple explanation
I tried it out, and was surprised that this was happening. So I turned to the documentation, and there is a little note in there that says:
Note The behavior of round() for floats can be surprising: for
example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This
is not a bug: it’s a result of the fact that most decimal fractions
can’t be represented exactly as a float.
Now what does that mean - most decimal fractions can't be represented exactly as a float? Well, the documentation follows up with a great link that explains this, but since you probably didn't come here to read a nerdy technical document, let me summarize what is going on.
Python uses the IEEE-754 floating point standard to represent floats. This standard compromises accuracy for speed. Some numbers cannot be represented accurately. For example, .1 is actually represented as 0.1000000000000000055511151231257827021181583404541015625. Interestingly, .1 in binary is an infinitely repeating number, just like 1/3 is the infinitely repeating .333333 in decimal.
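You can see this exact stored value yourself by handing the float to Decimal:
>>> from decimal import Decimal
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')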
An Under the Hood Case Study
Now on to your particular case. This was pretty fun to look into, and this is what I discovered.
First, let's simplify what you were trying to do:
>>> amount = '1229,84'
>>> int(round(float(amount.replace(',', '.')), 2) * 100)
122983
to
>>> int(1229.84 * 100)
122983
Sometimes Python[1] is unable to display binary floating point numbers with 100% accuracy, for the same reason we are unable to display the fraction 1/3 as a decimal. When this happens, Python hides any extra digits. .1 is actually stored as -0.10000000000000009[2], but Python will display it as .1 if you type it into the console. We can see those extra digits by doing int(1.1) - 1.1[3]. We can apply this int(myNum) - myNum formula to most floating point numbers to see the extra hidden digits behind them.[4] In your case we would do the following.
>>> int(1229.84) - 1229.84
-0.8399999999999181
1229.84 is actually 1229.8399999999999181. Continuing on[5]:
>>> round(1229.84, 2) * 100
122983.99999999999  # there's all of our hidden digits showing up
Now on to the last step. This is the part we are concerned about. Changing it back to an integer.
>>> int(122983.99999999999)
122983
It rounds downwards instead of upwards. However, if we had never multiplied it by 100, we would still have two more 9s at the end, and Python would round up.
>>> int(122983.9999999999999)
122984
??? Now what is going on? Why does Python round 122983.99999999999 down, but round 122983.9999999999999 up? Well, whenever Python turns a float into an integer it rounds down. However, you have to remember that to Python, 122983.9999999999999 with the extra two 9s at the end is the same thing as 122984.0. For example:
>>> 122983.9999999999999
122984.0
>>> a = 122983.9999999999999
>>> int(a) - a
0.0
and without the two extra 9s on the end:
>>> 122983.99999999999
122983.99999999999
>>> a=122983.99999999999
>>> int(a) - a
-0.9999999999854481
Python is definitely treating 122983.9999999999999 as 122984.0, but not 122983.99999999999. Now back to casting 122983.99999999999 to an integer. Because we have created a decimal portion that is less than 122984, which Python sees as a separate number from 122984, and because casting to an integer always causes Python to round down, we get 122983 as a result.
Whew. That was a lot to go through, but I sure learned a lot writing this out, and I hope you did too. The solution to all of this is to use decimal numbers instead of floats, which trades speed for accuracy.
What about rounding? The original problem had some rounding in it as well -- it's useless. See appendix item 6.
The Solution
a) The easiest solution is to use the decimal module instead of floating point numbers. This is the preferred way of doing things in any finance or accounting program; a short sketch follows this list.
The documentation also mentioned the following solutions which I've summarized.
b) The exact value can be expressed and retrieved in a hexadecimal form via myFloat.hex() and float.fromhex(myHex)
c) The exact value can also be retrieved as a fraction through myFloat.as_integer_ratio()
d) The documentation briefly mentions using SciPy for floating point arithmetic; however, this SO question mentions that SciPy's NumPy floats are nothing more than aliases for the built-in float type. The decimal module would be a better solution.
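As promised under a), a short sketch of the decimal approach applied to the original amount:
from decimal import Decimal

amount = '1229,84'
cents = int(Decimal(amount.replace(',', '.')) * 100)
print(cents)  # 122984 -- exact, no binary rounding involved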
Appendix
1 - Even though I will often refer to Python's behavior, the things I talk about are part of the IEEE-754 floating point standard which is what the major programming languages use for their floating point numbers.
2 - int(1.1) - 1.1 gives me -0.10000000000000009, but according to the documentation .1 is really 0.1000000000000000055511151231257827021181583404541015625
3 - We used int(1.1) - 1.1 instead of int(.1) - .1 because int(.1) - .1 does not give us the hidden digits, but according to the documentation they should still be there for .1, hence I say int(someNum) - someNum works most of the time, but not all of the time.
4 - When we use the formula int(myNum) - myNum what is happening is that casting the number to an integer will round the number down so int(3.9) becomes 3, and when we minus 3 from 3.9 we are left with -.9. However, for some reason that I do not know, when we get rid of all the whole numbers, and we're just left with the decimal portion, Python decides to show us everything -- the whole mantissa.
5 - this does not really affect the outcome of our analysis, but when multiplying by 100, instead of the hidden digits being shifted over by 2 decimal places, they changed a little as well.
>>> a = 1229.84
>>> int(a) - a
-0.8399999999999181
>>> a = round(1229.84, 2) * 100
>>> int(a) - a
-0.9999999999854481 #I expected -0.9999999999918100?
6 - It may seem like we can get rid of all those extra digits by rounding to two decimal places.
>>> round(1229.84, 2) # which is really round(1229.8399999999999181, 2)
1229.84
But when we use our int(someNum) - someNum formula to see the hidden digits, they are still there.
>>> a = round(1229.84, 2)
>>> int(a) - a
-0.8399999999999181
This is because Python cannot store 1229.84 as a binary floating point number. It can't be done. So... rounding 1229.84 does absolutely nothing.
Don't use floating-point arithmetic for currency; rounding error for values that cannot be represented exactly will cause the type of loss you are seeing. Instead, convert the string representation to an integer number of cents, which you can convert to euros-and-cents for display as needed.
euros, cents = '12,1'.split(',') # '12,1' -> ('12', '1')
cents = 100 * int(euros) + int(cents) * (10 if len(cents) == 1 else 1)  # ('12', '1') -> 1210
(Notice the conditional multiplier handles cents given without a trailing 0; inputs with more than two digits after the comma would still need validation.)
display_str = '%d,%02d' % divmod(cents, 100)  # 1210 -> (12, 10) -> '12,10'
You can also use the Decimal class from the decimal module, which essentially encapsulates all the logic for using integers to represent fractional values.
As @wim mentions in a comment, use the Decimal type from the stdlib decimal module instead of the built-in float type. Decimal objects do not have the binary rounding behavior that floats have, and they also have a precision that can be user-defined.
Decimal should be used anywhere you are doing financial calculations or anywhere you need floating point calculations that behave like the decimal math people learn in school (as opposed to the binary floating point behavior of the built in float type).
How to check if a float value is within a range (0.50,150.00) and has 2 decimal digits?
For example, 15.22366 should be false (too many decimal digits). But 15.22 should be true.
I tried something like:
data= input()
if data in range(0.50,150.00):
return True
Is this what you are looking for?
def check(value):
    if 0.50 <= value <= 150 and round(value, 2) == value:
        return True
    return False
Given your comment:
If I input 15.22366 it is going to return True; that is why I specified the range; it should accept 15.22
Simply said, floating point values are imprecise. Many values don't have an exact representation. Take for example 1.40. It might be displayed "as is":
>>> f = 1.40
>>> print f
1.4
But this is an illusion. Python has rounded that value in order to display it nicely. The real value referenced by the variable f is quite different:
>>> from decimal import Decimal
>>> Decimal(f)
Decimal('1.399999999999999911182158029987476766109466552734375')
According to your rule of having only 2 decimals, should f reference a valid value or not?
The easiest way to fix that issue is probably to use round(..., 2), as I suggested in the code above. But this is only a heuristic - it is only able to reject "largely wrong" values. See my point here:
>>> for v in [ 1.40,
... 1.405,
... 1.399999999999999911182158029987476766109466552734375,
... 1.39999999999999991118,
... 1.3999999999999991118]:
... print check(v), v
...
True 1.4
False 1.405
True 1.4
True 1.4
False 1.4
Notice how the last few results may seem surprising at first. I hope my explanations above shed some light on this.
As a final piece of advice: for your needs, as I understand them from your question, you should definitely consider using "decimal arithmetic". Python provides the decimal module for that purpose.
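For example, a sketch of the check built on decimal. Note that the value must be parsed from its string form; converting an existing float would bake in the artifacts shown above before decimal ever sees the number.
from decimal import Decimal

def check(value_str):
    d = Decimal(value_str)  # parse the string directly, never a float
    in_range = Decimal('0.50') <= d <= Decimal('150.00')
    two_decimals = d == d.quantize(Decimal('0.01'))  # unchanged by rounding to 2 places
    return in_range and two_decimals

print(check('15.22'))     # True
print(check('15.22366'))  # False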
float is the wrong data type to use for your case. Use Decimal instead.
Check the Python docs for issues and limitations. To quote from there (I've generalized the text in italics):
Floating-point numbers are represented in computer hardware as base 2 (binary) fractions.
no matter how many base 2 digits you’re willing to use, some decimal value (like 0.1) cannot be represented exactly as a base 2 fraction.
Stop at any finite number of bits, and you get an approximation
On a typical machine running Python, there are 53 bits of precision available for a Python float, so the value stored internally when you enter a decimal number is the binary fraction which is close to, but not exactly equal to it.
The documentation for the built-in round() function says that it rounds to the nearest value; ties are rounded away from zero in Python 2 and to the nearest even result in Python 3.
And finally, it recommends
If you’re in a situation where you care which way your decimal halfway-cases are rounded, you should consider using the decimal module.
And this holds for your case as well, since you are looking for a precision of 2 digits after the decimal point, which float just can't guarantee.
EDIT Note: The answer below corresponds to the original question, which was about random float generation.
Seeing that you need 2 digits of guaranteed precision, I would suggest generating random integers in the range [50, 15000] and dividing them by 100 to convert them to floats yourself.
import random
random.randint(50, 15000)/100.0
Why don't you just use round?
round(random.uniform(0.5, 150.0), 2)
Probably what you want to do is not to change the value itself. As said by Cyber in the comments, even if you round a floating point number, it will always be stored with the same precision. If you only need to change the way it is printed:
n = random.uniform(0.5, 150)
print '%.2f' % n # 58.03
The easiest way is to convert the float to a string, split on '.', and check the length of the fractional part. If it is greater than 2, reject the value; otherwise check whether the number is in the given range:
a = 15.22366
if len(str(a).split('.')[1]) <= 2:
    if 0.50 <= a <= 150:
        ...  # do your stuff