I have to handle very big integers in my program, but I get the following error:
Traceback (most recent call last):
File "[path]", line n, in <module>
number = int(numbers[0]*(10**numbers[1]))
OverflowError: (34, 'Numerical result out of range')
for number = int(n) when I entered 8e10000000 as n.
Is there a way to solve this problem?
Thanks in advance.
The number 8e10000000 is not an integer, it is a floating point number to Python. Any number using the e notation is treated as a float. Python uses (usually) a 64-bit float format, and that cannot hold such a large number.
So the problem is not the integer, it is the float you start with. The error is not at the line number = int(n), it is at the line n = 8e10000000 or whatever equivalent you used.
You can avoid that error by using
n = 8 * 10**10000000
This results in an integer. But be careful--that takes a lot of time and memory to build the integer in RAM. (My system took 19 seconds to execute that one command.) And if you try to print that value, the computer will take a very long time and a large amount of memory to build up the string value to be printed.
Finally, as others have pointed out, the statement you quote does not match the one in the error message. So it is possible that something else is going on. If you want closure from us, post a complete code snippet that reproduces the error.
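To make the distinction between float and integer arithmetic concrete, here is a minimal sketch. It uses a smaller exponent (400) so that it runs instantly, it assumes the input arrives as a string such as "8e400", and the helper name sci_to_int is hypothetical rather than anything from the original post.

```python
import math
from decimal import Decimal


def sci_to_int(text):
    """Convert a string in e-notation to an exact int without going through float."""
    return int(Decimal(text))


# An out-of-range float() conversion silently becomes infinity.
as_float = float("8e400")
print(math.isinf(as_float))          # True

# Out-of-range float exponentiation raises OverflowError instead.
try:
    10.0 ** 400
except OverflowError as exc:
    print("float power failed:", exc)

# Integer arithmetic and Decimal parsing both stay exact.
exact = sci_to_int("8e400")
print(exact == 8 * 10 ** 400)        # True
```

With the exponent from the question the same conversion should still work, but it would build an integer with ten million digits, so expect it to be slow.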
8e10000000 is a very large number, and Python represents it as a float.
CPython usually stores this float in 64 bits, which is too small for such a big number.
For such large numbers it is safe to use the decimal module:
import sys
from decimal import Decimal
print('max size = ', sys.maxsize)
number = Decimal("8e10000000")
print(number)
Outputs:
max size = 9223372036854775807
8E+10000000
The number 9223372036854775807 is exactly 2^63 - 1.
What you're running into is that Python is strongly typed but also dynamically typed. 8e10000000 is a float literal in Python (a C double underneath), and it is far beyond the largest value a double-precision float can hold (about 1.8e308). Separately, in Python 2 the largest value of a plain int is sys.maxint (9,223,372,036,854,775,807 on 64-bit builds); larger integers are promoted to long automatically.
Python has a decimal library with a class, decimal.Decimal, that handles arbitrary-precision numbers. It's not going to be as memory or speed efficient, but it bypasses the size-limit and precision issues that floating point numbers have, and it is particularly useful when dealing with money.
The other option, if you are truly using integer values, is to use long(n) to cast your variable to an arbitrarily large integer in Python 2.5+ (in Python 3.0+, int is a long); PEP 237 talks about it.
Related
I am trying to solve problem 26 from Project Euler and I am wondering how to show the long version of a floating-point number. For example, if we have 1/19, how do we get 64, 128, or more digits of that float in Python? An even more useful builtin function would be one that returns the digits after the decimal point until they repeat. I know that floats technically store decimal digits up to a certain point and then round off to keep things efficient, memory-wise, but is there a way to override that until you get the repeating part? I would guess that such a function would raise an exception for an irrational number, but is there a function that works for at least rational numbers?
See the Decimal datatype.
from decimal import *
getcontext().prec = 64
print(Decimal(1) / Decimal(19))
https://docs.python.org/3/library/decimal.html
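Decimal gives as many digits as the precision you set, but it does not say where the repetition starts. A minimal sketch of schoolbook long division with remainder tracking can find the repeating part for a positive fraction; the function name decimal_expansion is hypothetical, and this is an illustration rather than a builtin.

```python
def decimal_expansion(numerator, denominator):
    """Return (integer_part, non_repeating_digits, repeating_digits)
    for a positive fraction, using long division."""
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position in digits where it first occurred
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        # The expansion terminates, so nothing repeats.
        return integer_part, "".join(digits), ""
    start = seen[remainder]
    return integer_part, "".join(digits[:start]), "".join(digits[start:])


print(decimal_expansion(1, 19))   # (0, '', '052631578947368421')
print(decimal_expansion(1, 6))    # (0, '1', '6')
print(decimal_expansion(1, 8))    # (0, '125', '')
```

Because a remainder can only take values below the denominator, the loop always ends after at most denominator steps.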
When converting a float to a str, I can specify the number of decimal points I want to display
'%.6f' % 0.1
> '0.100000'
'%.6f' % .12345678901234567890
> '0.123457'
But when simply calling str on a float in python 2.7, it seems to default to 12 decimal points max
str(0.1)
>'0.1'
str(.12345678901234567890)
>'0.123456789012'
Where is this max # of decimal points defined/documented? Can I programmatically get this number?
The number of decimals displayed is going to vary greatly, and there won't be a way to predict how many will be displayed in pure Python. Some libraries like numpy allow you to set precision of output.
This is simply because of the limitations of float representation.
The relevant parts of the link talk about how Python chooses to display floats.
Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine
Python keeps the number of digits manageable by displaying a rounded value instead
Now, there is the possibility of overlap here:
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction
The method for choosing which decimal values to display was changed in Python 3.1 (But the last sentence implies this might be an implementation detail).
For example, the numbers 0.1 and 0.10000000000000001 are both
approximated by 3602879701896397 / 2 ** 55. Since all of these decimal
values share the same approximation, any one of them could be
displayed while still preserving the invariant eval(repr(x)) == x
Historically, the Python prompt and built-in repr() function would
choose the one with 17 significant digits, 0.10000000000000001.
Starting with Python 3.1, Python (on most systems) is now able to
choose the shortest of these and simply display 0.1.
I do not believe this exists in the python language spec. However, the cpython implementation does specify it. The float_repr() function, which turns a float into a string, eventually calls a helper function with the 'r' formatter, which eventually calls a utility function that hardcodes the format to what comes down to format(float, '.16g'). That code can be seen here. Note that this is for python3.6.
>>> import math
>>> str(math.pi*4)
12.5663706144
giving the maximum number of significant digits (both before and after the decimal point) as 16. It appears that in the Python 2.7 implementation, this value was hardcoded to .12g. As for why this happened (and why it is somewhat lacking in documentation), that can be found here.
So if you are trying to find out how long a number will be when printed, simply get its length with .12g.
def len_when_displayed(n):
    return len(format(n, '.12g'))
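As a quick check of the .12g behaviour described above, this small sketch formats the same value both ways. It assumes Python 3, where str() and repr() of a float return the same text.

```python
import math

x = math.pi * 4

# Twelve significant digits, which is what str() produced on Python 2.7.
legacy = format(x, ".12g")
print(legacy)                 # 12.5663706144

# On Python 3, str() and repr() agree and the text round-trips exactly.
modern = repr(x)
print(str(x) == modern)       # True
print(float(modern) == x)     # True
```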
Well, if you're looking for a pure python way of accomplishing this, you could always use something like,
len(str(.12345678901234567890).split('.')[1])
>>>> 12
I couldn't find it in the documentation and will add it here if I do, but this is a work around that can at least always return the length of precision if you want to know before hand.
As you said, it always seems to be 12 even when feeding bigger floating-points.
From what I was able to find, this number can be highly variable and in these cases, finding it empirically seems to be the most reliable way of doing it. So, what I would do is define a simple method like this,
def max_floating_point():
    counter = 0
    current_length = 0
    str_rep = '.1'
    while(counter <= current_length):
        str_rep += '1'
        current_length = len(str(float(str_rep)).split('.')[1])
        counter += 1
    return current_length
This will return you the maximum length representation on your current system,
print max_floating_point()
>>>> 12
By looking at the output of random numbers converted, I have been unable to understand how the length of the str() is determined, e.g. under Python 3.6.6:
>>> str(.123456789123456789123456789)
'0.12345678912345678'
>>> str(.111111111111111111111111111)
'0.1111111111111111'
You may opt for this code that actually simulates your real situation:
import random
maxdec=max(map(lambda x:len(str(x)),filter(lambda x:x>.1,[random.random() for i in range(99)])))-2
Here we are testing the length of ~90 random numbers in the (.1,1) open interval after conversion (and deducting the 0. from the left, hence the -2).
Python 2.7.5 on a 64bit linux gives me 12, and Python 3.4.8 and 3.6.6 give me 17.
I ran the same Python code in PyCharm on Linux, where the number was formatted as
-91.35357, and in PyCharm on Windows, where it was formatted as
-91.35356999999999. The problem is that the value is part of a file name which I need to open (and the list of files to open is long).
Does anyone know a possible explanation and how to fix it?
Floats
Always remember that float numbers have a limited precision. If you think about it, there must be a limit to how exactly you represent a number if you limit storage to 32 or 64 bits (or any other number).
in Python
Python provides just one float type. CPython implements it as a C double, which is a 64-bit IEEE 754 value on virtually every platform, although the language itself does not strictly guarantee that size (however, see Mark Dickinson's comment below).
Let's test this. But note that, because Python does not provide float32 and float64 alternatives, we will use a different library, numpy, to provide us with those types and operations:
>>> import numpy
>>> n = 1.23456789012345678901234567890
>>> n
1.2345678901234567
>>> numpy.float64(n)
1.2345678901234567
>>> numpy.float32(n)
1.2345679
Here we can see that Python, on my computer, handles the variable as a float64. This already rounds the number we entered (because a float64 can only hold so much precision).
When we use a float32, precision is further reduced and, because of that rounding, the closest number we can represent is slightly different.
Conclusion
Float resolution is limited. Furthermore, some operations behave differently across different architectures.
Even if you are using a consistent float size, not all numbers can be represented, and operations will accumulate truncation errors.
Comparing a float to another float shall be done considering a possible error margin. Do not use float_a == float_b, instead use abs(float_a - float_b) < error_margin.
Relying on float representations is always a bad idea. Python sometimes uses scientific notation:
>>> a = 0.0000000001
>>> str(a)
'1e-10'
You can get consistent rounding approximation (ie, to use in file names), but remember that storage and representation are different things. This other thread may assist you: Limiting floats to two decimal points
In general, I'd advise against using float numbers in file names or as any other kind of identifier.
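The two recommendations above (compare with a tolerance, and format explicitly instead of relying on the default representation) can be sketched as follows. The two values are the ones from the question; the helper name coordinate_label and the five-decimal default are assumptions made for illustration.

```python
import math

windows_value = -91.35356999999999
linux_value = -91.35357

# Compare within a tolerance instead of relying on exact equality.
print(math.isclose(windows_value, linux_value, abs_tol=1e-9))    # True


def coordinate_label(value, places=5):
    """Format a float with a fixed number of decimals, e.g. for use in a name."""
    return format(value, ".{}f".format(places))


# Both values produce the same text once the precision is fixed explicitly.
print(coordinate_label(windows_value))   # -91.35357
print(coordinate_label(linux_value))     # -91.35357
```

Fixing the number of decimals makes the text stable across platforms, though it does not change the advice against using floats as identifiers.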
Latitude / Longitude
float32 numbers do not have enough precision to represent the 5th and 6th decimal digits in latitude/longitude pairs (depending on whether the integer part has one, two or three digits).
If you want to learn what's really happening, check this page and test some of your numbers: https://www.h-schmidt.net/FloatConverter/IEEE754.html
Representing
Note that Python rounds float values when representing them:
>>> lat = 123.456789
>>> "{0:.6f}".format(lat)
'123.456789'
>>> "{0:.5f}".format(lat)
'123.45679'
And as stated above, latitude/longitude cannot be correctly represented by a float32 down to the 6th decimal, and furthermore, the truncated float values are rounded when presented by Python:
>>> lat = 123.456789
>>> lat
123.456789
>>> "{0:.5f}".format(numpy.float64(lat))
'123.45679'
>>> "{0:.5f}".format(numpy.float32(lat))
'123.45679'
>>> "{0:.6f}".format(numpy.float32(lat))
'123.456787'
As you can see, the float32 value rounded to six decimals (123.456787) no longer matches the original number from the 5th decimal onward. The float64 value rounded to five decimals (123.45679) also differs from the original digits, although there it is only ordinary rounding.
Your PyCharm on Linux is simply rounding off your long floating point number. Rounding it off to 6 or 7 decimal places can resolve your issue, but DON'T USE THESE AS FILE NAMES.
Assuming your code is the same in both cases, there can be several explanations:
1) 32-bit processors handle floats differently than 64-bit processors.
2) PyCharm for Linux and PyCharm for Windows behave differently for floating point numbers in a way we cannot determine exactly; maybe PyCharm for Windows is better optimised.
edit 1
Explanation for Point 1
On 32-bit processors everything is really done in 80-bit precision internally. The precision really just determines how many of those bits are stored in memory. This is part of the reason why different optimisation settings can change results slightly: they change the amount of rounding from 80-bit to 32- or 64-bit.
edit 2
You can use a hash map (a dict in Python) to save your data in files under arbitrary names and then map the coordinates onto those names.
Example:
# variable = {(long, lat): "<random_file_name>"}
coordinates_and_file = {(-92.45453534, -87.2123123): "AxdwaWAsdAwdz"}
So I have a list of tuples of two floats each. Each tuple represents a range. I am going through another list of floats which represent values to be fit into the ranges. All of these floats are < 1 but positive, so precision matters. One of my tests to determine if a value fits into a range is failing when it should pass. If I print the value and the range that is causing problems I can tell this much:
curValue = 0.00145000000671
range = (0.0014500000067055225, 0.0020968749796738849)
The conditional that is failing is:
if curValue > range[0] and ... blah :
# do some stuff
From the values given by curValue and range, the test should clearly pass (don't worry about what is in the conditional). Now, if I print explicitly what the value of range[0] is I get:
range[0] = 0.00145000000671
Which would explain why the test is failing. So my question, then, is why the float is changing when it is accessed. It has decimal values available up to a certain precision when part of a tuple, and a different precision when accessed. Why would this be? What can I do to ensure my data maintains a consistent amount of precision across my calculations?
The float doesn't change. The built-in numeric types are all immutable. The cause of what you're observing is that:
print range[0] uses str on the float, which (up until very recent versions of Python) printed fewer digits of a float.
Printing a tuple (be it with repr or str) uses repr on the individual items, which gives a much more accurate representation (again, this isn't true anymore in recent releases which use a better algorithm for both).
As for why the condition doesn't work out the way you expect, it's probably the usual culprit, the limited precision of floats. Try print repr(curValue), repr(range[0]) to see what Python decided was the closest possible representation of your float literal.
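A small sketch of the effect described above, written for Python 3. It reproduces the old twelve-digit str() output with format(), and it assumes (this is a guess, not something the question confirms) that the value under test is the very same float as the lower bound.

```python
low = 0.0014500000067055225   # lower bound of the range from the question
cur_value = low               # assumption: the tested value is the same float

# Twelve significant digits, as Python 2's str() printed floats.
print(format(low, ".12g"))    # 0.00145000000671

# repr() shows enough digits to identify the stored float exactly.
print(repr(low))

# The strict comparison is False simply because the two floats are equal.
print(cur_value > low)        # False
print(cur_value >= low)       # True
```

If that assumption holds, the shortened printout only made the two values look different, and using >= for an inclusive lower bound would make the test pass.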
Floats aren't that precise, even on modern PCs. So even if you enter pi as a constant to 100 decimals, only the first 15 to 17 significant digits are kept. The same is happening to you. A Python float is a 64-bit double with 53 bits of mantissa (a 32-bit float would have only 24), which limits your precision (and in unexpected ways, because it's in base 2).
Please note, 0.00145000000671 isn't the exact value as stored by Python. Python only displays a few decimals of the complete stored float if you use print. If you want to see exactly how Python stores the float, use repr.
If you want better precision use the decimal module.
It isn't changing per se. Python is doing its best to store the data as a float, but that number is too precise for float, so Python modifies it before it is even accessed (in the very process of storing it). Funny how something so small is such a big pain.
You need to use an arbitrary-precision fixed-point module like Simple Python Fixed Point or the decimal module.
Not sure it would work in this case, because I don't know if Python's limiting in the output or in the storage itself, but you could try doing:
if curValue - range[0] > 0 and...
Problem: how to see when the computer makes approximations in mathematical calculations in Python
Example of the problem:
My old teacher once made the following statement:
You can never calculate 200! with your computer.
I am not completely sure whether it is true or not nowadays.
It seems that it is, since I get a lot zeros for it from a Python script.
How can you see when your Python code makes approximations?
Python uses arbitrary-precision arithmetic to calculate with integers, so it can calculate 200! exactly. For real numbers (so-called floating-point), Python does not use an exact representation. It uses a binary representation called IEEE 754, which is essentially scientific notation, except in base 2 instead of base 10.
Thus, for any real number that cannot be exactly represented in base 2 with 53 bits of precision, Python cannot produce an exact result. For example, 0.1 (in base 10) is an infinitely repeating fraction in base 2, 0.0001100110011..., so it cannot be exactly represented. Hence, if you enter on a Python prompt:
>>> 0.1
0.10000000000000001
The result you get back is different, since it has been converted from decimal to binary (with 53 bits of precision) and back to decimal. As a consequence, you get things like this:
>>> 0.1 + 0.2 == 0.3
False
For a good (but long) read, see What Every Programmer Should Know About Floating-Point Arithmetic.
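One way to see the approximation directly is to ask Python for the exact value it stored. The sketch below uses only the standard library modules fractions, decimal and math.

```python
import math
from decimal import Decimal
from fractions import Fraction

# The exact value stored for the literal 0.1, as a fraction and as a decimal.
print(Fraction(0.1) == Fraction(3602879701896397, 2 ** 55))   # True
print(Decimal(0.1))

# Rounding errors mean the sum is not the float nearest to 0.3.
print(0.1 + 0.2 == 0.3)               # False
print(math.isclose(0.1 + 0.2, 0.3))   # True

# Exact rational arithmetic avoids the problem entirely.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))   # True
```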
Python has unbounded integer sizes in the form of a long type. That is to say, if it is a whole number, the limit on the size of the number is restricted by the memory available to Python.
When you compute a large number such as 200! and you see an L on the end of it, that means Python has automatically cast the int to a long, because an int was not large enough to hold that number.
See section 6.4 of this page for more information.
200! is a very large number indeed.
If the range of an IEEE 64-bit double is 1.7E +/- 308 (15 digits), you can see that the largest factorial you can get is around 170!.
Python can handle arbitrary sized numbers, as can Java with its BigInteger.
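The 170! limit mentioned above applies only to floats. A short sketch using math.factorial shows both sides: the integer result is exact, while the float conversion stops working one step after 170!.

```python
import math

exact = math.factorial(200)
print(len(str(exact)))            # 375 digits, all of them exact

# 170! still fits in a double; 171! does not.
print(float(math.factorial(170)))
try:
    float(math.factorial(171))
except OverflowError as exc:
    print("171! is too large for a float:", exc)
```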
Without some sort of clarification to that statement, it's obviously false. Just from personal experience, early lessons in programming (in the late 1980s) included solving very similar, if not exactly the same, problems. In general, to know some device which does calculations isn't making approximations, you have to prove (in the math sense of a proof) that it isn't.
Python's integer types (named int and long in 2.x, both folded into just the int type in 3.x) are very good, and do not overflow like, for example, the int type in C. If you do the obvious print 200 * 199 * 198 * ... it may be slow, but it will be exact. Similarly, addition, subtraction, and modulus are exact. Division is a mixed bag, as there are two operators, / and //, and they underwent a change in 2.x; in general you can only treat it as inexact.
If you want more control yet don't want to limit yourself to integers, look at the decimal module.
Python handles large numbers automatically (unlike a language like C, where you can overflow its datatypes and the values wrap around, for example) - in Python 2, over a certain point (sys.maxint, which is 2147483647 on 32-bit builds) it converts the integer to a "long" (denoted by the L after the number), which can be any length:
>>> def fact(x):
... return reduce(lambda x, y: x * y, range(1, x+1))
...
>>> fact(10)
3628800
>>> fact(200)
788657867364790503552363213932185062295135977687173263294742533244359449963403342920304284011984623904177212138919638830257642790242637105061926624952829931113462857270763317237396988943922445621451664240254033291864131227428294853277524242407573903240321257405579568660226031904170324062351700858796178922222789623703897374720000000000000000000000000000000000000000000000000L
Long numbers are "easy", floating point is more complicated, and almost any computer representation of a floating point number is an approximation, for example:
>>> float(1)/3
0.33333333333333331
Obviously you can't store an infinite number of 3's in memory, so it cheats and rounds it a bit.
You may want to look at the decimal module:
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 do not have an exact representation in binary floating point. End users typically would not expect 1.1 to display as 1.1000000000000001 as it does with binary floating point.
Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places) which can be as large as needed for a given problem
See Handling very large numbers in Python.
Python has built-in arbitrary-precision integers (a bignum type) for holding 200! and will use them automatically.
Your teacher's statement, though not exactly true here, is true in general. Computers have limitations, and it is good to know what they are. Remember that every time you add another 32-bit word of data storage, you can store a number that is 2^32 (4 billion +) times larger. It is hard to comprehend how many more numbers that is - but maths gets slower as you add more words to store the exact value of a very large number.
As an example (roughly what you can store with 1000 bits; note that 2 << 1000 equals 2 ** 1001)
>>> 2 << 1000
2143017214372534641896850098120003621122809623411067214887500776740702102249872244986396
7576313917162551893458351062936503742905713846280871969155149397149607869135549648461970
8421492101247422837559083643060929499671638825347975351183310878921541258291423929553730
84335320859663305248773674411336138752L
I tried to illustrate how big a number you can store with 10000 bits, or even 8,000,000 bits (a megabyte) but that number is many pages long.
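Rather than printing such numbers, their length can be estimated. The sketch below counts decimal digits from the bit count using log10(2); the helper name decimal_digits is hypothetical.

```python
import math


def decimal_digits(bits):
    """Number of decimal digits in the largest integer that fits in `bits` bits."""
    return math.floor(bits * math.log10(2)) + 1


print((2 << 1000).bit_length())     # 1002, because 2 << 1000 == 2 ** 1001
print(decimal_digits(1000))         # 302
print(decimal_digits(10000))        # 3011
print(decimal_digits(8000000))      # 2408240
```

So a megabyte-sized integer has about 2.4 million decimal digits, which is why it runs to many pages.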