Technical problem in Python with infinite float

I am using Python and I have a problem: I want to write a program that can count from 1 to infinity, to find out how much infinity is.
Here is my code :
a = 0
for i in range(1, 10e+99):
    a += 1
print(a)
but it says "'float' object cannot be interpreted as an integer"
whereas 10e+99 is not a float
help me please

Per the Python 2 documentation and Python 3 documentation, range requires integer arguments.
In IEEE-754 32-bit binary floating-point, the largest representable finite number is about 3.4028e38. When converting numerals, such as 1e99 in source code, to this format, any number greater than or equal to 2^128 − 2^104 (340,282,377,062,143,265,289,209,819,405,393,854,464) will be converted to infinity, assuming the common round-to-nearest-ties-to-even method is used. Because of this, 10e+99 (which stands for 10•10^99 and hence 10^100) would act like infinity. However, Python implementations more typically use IEEE-754 64-bit binary floating-point, in which the largest representable finite number is 2^1024 − 2^971, and 10e99 acts as a finite number.¹ Thus, to get infinity, you would need around 1e309.
It is not humanly possible to test whether a loop incrementing by 1 from 1 to 10e99 will produce infinity, because the total computing power available to humans is only around 10^30 additions per year (for a loose sense of "around": give or take some orders of magnitude). This is insufficient to count to the limit of the 32-bit floating-point finite numbers, let alone that of the 64-bit floating-point numbers.
If the arithmetic were done in a floating-point format, it would never reach infinity even with unlimited computing power because, once the sum reached 2^53 in IEEE-754 64-bit binary, adding 1 would not change the number; 2^53 would be produced in each iteration. This is because IEEE-754 64-bit binary has only 53 bits available for the significand, so 2^53 + 1 is not representable. The nearest representable values are 2^53 and 2^53 + 2. When arithmetic is performed, the exact real-number result is by default rounded to the nearest representable value, with ties rounded to the number with the even low bit in its significand. When 1 is added to 2^53, the real-number result 2^53 + 1 is rounded to 2^53, and the sum thus stays at 2^53 for all future iterations.
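You can observe this directly. A minimal check, assuming IEEE-754 binary64 floats (which CPython uses on most platforms):
x = 2.0 ** 53
print(x + 1 == x)   # True: the real result 2^53 + 1 rounds back to 2^53
print(x + 2 == x)   # False: 2^53 + 2 is representable, so the sum can still move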
Footnote
¹ The representable value nearest 10^100 is 10,000,000,000,000,000,159,028,911,097,599,180,468,360,808,563,945,281,389,781,327,557,747,838,772,170,381,060,813,469,985,856,815,104.

The problem arises because the range() function takes an int, whereas 10e+99 is indeed a float. 10e+99 is of course not infinity, so you shouldn't expect infinity to pop up anywhere during the execution of your program. But if you really wanted to get your for loop to work as written, you could simply do:
a = 0
for i in range(1, int(10e+99)):
    a += 1
print(a)
As other users have pointed out, I would however rethink your strategy entirely: using a range-based for loop to "find out" the value of infinity just doesn't work. Infinity is not a number.
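Note, too, that int(10e+99) is not exactly 10^100; the float 10e+99 has already been rounded to the nearest representable value (the integer quoted in the footnote above). A quick check, assuming 64-bit floats:
print(int(10e+99) == 10 ** 100)   # False: the float was rounded before the conversion
print(int(10e+99))                # the nearest representable value, not 10**100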

Perhaps you meant your program to go on forever:
a = 0
while True:
    a += 1
    print(a)
In my head, when I see while True: I replace it with "forever".

With this code you can check whether your variable is infinity or not.
import math

infinity = float('inf')
a = 99999999999999999999999999999999
if a > infinity:
    print('Your number is an infinity number')
else:
    print('Your number is not an infinity number')
# or you can check with math.isinf
print('Your number is Infinity: ', math.isinf(infinity))
# Also, infinity can be both positive and negative
Note: infinity has no end, so whatever value or number you enter, the comparison a > infinity will always return False.

Here is what is going to happen if you correct and execute your program:
a = 0
for i in range(1, 10**100):
    a += 1
print(a)
Suppose you have a super efficient Python virtual machine (everyone knows how efficient they are...).
Suppose you have a very efficient implementation of (unbounded) large integers.
Suppose each loop iteration takes a few machine cycles to print those numbers in decimal form (say only 1000, which is well below reality).
Suppose each cycle takes approximately 1.0e-10 s (10 GHz), which implies an implementation of print that takes advantage of parallelism.
With those unrealistic hypotheses, that's already 10^93 s necessary for the program to complete.
The age of the universe is estimated to be less than 10^18 s. Whew! It's going to be a long wait.
Now let's compute the energy it is going to take, on the basis of a 400 W computer.
Assuming that all of the Sun's matter (2e30 kg) can be converted into electrical power for your computer (through E = m·c^2), you are going to consume about 2×10^48 Suns' worth to perform this computation.
Before you hit return, I kindly ask you: think twice! Save the universe!
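A rough sketch of the same back-of-envelope arithmetic, using the assumed figures above:
iterations = 10 ** 100            # loop count for range(1, 10**100)
cycles_per_iteration = 1000       # assumed cycles to increment and print a big integer
seconds_per_cycle = 1.0e-10       # a hypothetical 10 GHz machine
runtime_s = iterations * cycles_per_iteration * seconds_per_cycle
print(f"runtime: {runtime_s:.1e} s")   # ~1e93 s

power_w = 400                     # assumed power draw of the computer
sun_energy_j = 2e30 * (3e8) ** 2  # E = m*c^2 for the Sun's mass
print(f"Suns consumed: {runtime_s * power_w / sun_energy_j:.1e}")   # ~2e48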

Related

How precise is python's agglomerative clustering algorithm?

Apologies if a question like this is inappropriate for this platform but I can't find any information on this anywhere. I'm using sklearn to do a cluster analysis on some points; this is the relevant portion of my code:
clustering = AgglomerativeClustering(n_clusters=None, affinity='euclidean',
                                     distance_threshold=d, linkage='single').fit(i)
number = clustering.n_clusters_
I would like to know the precision to which I can define 'd' which in this case is the distance threshold above which clusters won't be merged. For example, if I set d = 0.000002, would it use this value or would it be rounded to zero? How many decimal places can I use basically.
Thanks in advance
Scikit-learn's AgglomerativeClustering class stores the distance_threshold value as a float type, which on most Python systems means double precision, that is 64 bit. The decimal number you enter is converted to a base-2 exponential number under the hood and rounded accordingly if necessary to fit into the 64 bit storage slot. 1 bit is reserved for the sign, 11 bits for the exponent, and 52 bits for the significant digits.
Note that when you have a number such as 0.000002, starting with many zeros and having only one significant digit, the factor determining the smallest possible value is the number of bits for the exponent. So the question is, how small a number can be represented with 11 bits storage for the exponent?
Let's see:
2 ** -(2 ** 11)
Out: 0.0
2 ** -(2 ** 10)
Out: 5.562684646268003e-309
So if you enter your d value as a decimal number, without using exponential notation, you would have to type at least 309 zeros for that limit to kick in. Thus the value will practically never be rounded to zero, but there will be a small rounding error unless your decimal number happens to have a simple base-2 representation.
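As a quick check of what actually gets stored for a value like 0.000002 (assuming CPython's 64-bit floats):
from decimal import Decimal

d = 0.000002
print(d == 0.0)    # False: nowhere near being rounded to zero
print(Decimal(d))  # the exact binary64 value stored, showing the tiny rounding error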

Numpy float mean calculation precision

I happen to have a numpy array of floats:
a.dtype, a.shape
#(dtype('float64'), (32769,))
The values are:
a[0]
#3.699822718929953
all(a == a[0])
True
However:
a.mean()
3.6998227189299517
The mean is off in the 15th and 16th significant figures.
Can anybody show how this difference accumulates over the ~30K-element mean, and whether there is a way to avoid it?
In case it matters my OS is 64 bit.
Here is a rough approximation of a bound on the maximum error. This will not be representative of average error, and it could be improved with more analysis.
Consider calculating a sum using floating-point arithmetic with round-to-nearest ties-to-even:
sum = 0;
for (i = 0; i < n; ++i)
    sum += a[i];
where each a[i] is in [0, m).
Let ULP(x) denote the unit of least precision in the floating-point number x. (For example, in the IEEE-754 binary64 format with 53-bit significands, if the largest power of 2 not greater than |x| is 2^p, then ULP(x) = 2^(p−52).) With round-to-nearest, the maximum error in any operation with result x is ½ULP(x).
If we neglect rounding errors, the maximum value of sum after i iterations is i•m. Therefore, a bound on the error in the addition in iteration i is ½ULP(i•m). (Actually zero for i=1, since that case adds to zero, which has no error, but we neglect that for this approximation.) Then the total of the bounds on all the additions is the sum of ½ULP(i•m) for i from 1 to n. This is approximately ½•n•(n+1)/2•ULP(m) = ¼•n•(n+1)•ULP(m). (This is an approximation because it moves i outside the ULP function, but ULP is a discontinuous function. It is “approximately linear,” but there are jumps. Since the jumps are by factors of two, the approximation can be off by at most a factor of two.)
So, with 32,769 elements, we can say the total rounding error will be at most about ¼•32,769•32,770•ULP(m), about 2.7•10^8 times the ULP of the maximum element value. The ULP is 2^−52 times the greatest power of two not less than m, so that is about 2.7•10^8 • 2^−52 ≈ 6•10^−8 times m.
Of course, the likelihood that 32,768 sums (not 32,769 because the first necessarily has no error) all round in the same direction by chance is vanishingly small but I conjecture one might engineer a sequence of values that gets close to that.
An Experiment
Here is a chart of (in blue) the mean error over 10,000 samples of summing arrays with sizes 100 to 32,800 by 100s and elements drawn randomly from a uniform distribution over [0, 1). The error was calculated by comparing the sum calculated with float (IEEE-754 binary32) to that calculated with double (IEEE-754 binary64). (The samples were all multiples of 2^−24, and double has enough precision so that the sum for up to 2^29 such values is exact.)
The green line is c n √n with c set to match the last point of the blue line. We see it tracks the blue line over the long term. At points where the average sum crosses a power of two, the mean error increases faster for a time. At these points, the sum has entered a new binade, and further additions have higher average errors due to the increased ULP. Over the course of the binade, this fixed ULP decreases relative to n, bringing the blue line back to the green line.
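A minimal sketch of that kind of experiment (variable names are my own; it compares a float32 running sum against a float64 reference for one array size):
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(32768, dtype=np.float32)   # elements drawn uniformly from [0, 1)
err = abs(float(a.sum(dtype=np.float32)) - float(a.sum(dtype=np.float64)))
print(err)   # nonzero: the float32 running sum accumulates rounding error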
This is due to the inability of the float64 type to store the sum of your float numbers with full precision. In order to get around this problem you need to use a larger data type, of course*. Numpy has a longdouble dtype that you can use in such cases:
In [23]: np.mean(a, dtype=np.longdouble)
Out[23]: 3.6998227189299530693
Also, note:
In [25]: print(np.longdouble.__doc__)
Extended-precision floating-point number type, compatible with C
``long double`` but not necessarily with IEEE 754 quadruple-precision.
Character code: ``'g'``.
Canonical name: ``np.longdouble``.
Alias: ``np.longfloat``.
Alias *on this platform*: ``np.float128``: 128-bit extended-precision floating-point number type.
* read the comments for more details.
The mean is (by definition):
a.sum()/a.size
Unfortunately, adding all those values up and dividing accumulates floating point errors. They are usually around the magnitude of:
np.finfo(np.float).eps
Out[]: 2.220446049250313e-16
Yeah, e-16, about where you get them. You can make the error smaller by using higher-accuracy floats like float128 (if your system supports it), but the errors will always accumulate whenever you're summing a large number of floats together. If you truly want the identity, you'll have to hardcode it:
def mean_(arr):
    if np.all(arr == arr[0]):
        return arr[0]
    else:
        return arr.mean()
In practice, you never really want to use == between floats. Generally in numpy we use np.isclose or np.allclose to compare floats for exactly this reason. There are ways around it using other packages and leveraging arcane machine-level methods of calculating numbers to get (closer to) exact equality, but it's rarely worth the performance and clarity hit.
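A small illustration of the tolerance-based comparison suggested above (the array here merely mimics the question's data):
import numpy as np

a = np.full(32769, 3.699822718929953)
print(a.mean())                    # may be slightly off in the last digits
print(a.mean() == a[0])            # can be False because of accumulated error
print(np.isclose(a.mean(), a[0]))  # True: equal within default tolerances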

int(str) of a huge number

If I have a number that is too big to be represented with 64 bits, I receive a string that contains it.
What happens if I use:
num = int(num_str)
I am asking because it looks like it works accurately and I don't understand how; does it allocate more memory for that?
I was required to check whether a huge number is a power of 2. Someone suggested:
def power(self, A):
    A = int(A)
    if A == 1:
        return 0
    x = bin(A)
    if x.count('1') > 1:
        return 0
    else:
        return 1
While I understand why it would work under regular circumstances, the fact that it still works when the numbers are much larger than 2^64 baffles me.
According to the Python manual's description of the representation of integers:
These represent numbers in an unlimited range, subject to available (virtual) memory only. For the purpose of shift and mask operations, a binary representation is assumed, and negative numbers are represented in a variant of 2’s complement which gives the illusion of an infinite string of sign bits extending to the left.
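You can see this for yourself; the 101-digit number below is just an arbitrary example:
num_str = "1" + "0" * 100   # a 101-digit number as a string
num = int(num_str)          # converted exactly; no 64-bit limit applies
print(num == 10 ** 100)     # True
print(num.bit_length())     # 333 bits, far beyond 64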

How does python handle very small float numbers?

This is more a curiosity than a technical problem.
I'm trying to better understand how floating point numbers are handled in Python. In particular, I'm curious about the number returned by sys.float_info.epsilon = 2.220446049250313e-16.
I can see, looking at the documentation on double-precision floating-point, that this number can also be written as 1/pow(2, 52). So far, so good.
I decided to write a small Python script (see below; disclaimer: this code is ugly and can burn your eyes) which starts from eps = 0.1 and makes the comparison 1.0 == 1.0 + eps. If False, it means eps is big enough to make a difference. Then I try to find a smaller number by subtracting 1 from the last digit, appending the digit 1 to the right of it, and looking for False again by incrementing that last digit.
I am pretty confident that the code is OK because at a certain point (32 decimal places) I get eps = 0.00000000000000011102230246251567 = 1.1102230246251567e-16, which is very close to 1/pow(2, 53) = 1.1102230246251565e-16 (the last digit differs by 2).
I thought the code would not produce sensible numbers after that. However, the script kept working, always zeroing in on a more precise decimal number, until it reached 107 decimal places. Beyond that, the code did not find a False in the test. I was very intrigued by that result and could not wrap my head around it.
Does this 107-decimal-place float number have any meaning? If so, what is particular about it?
If not, what is Python doing past the 32-decimal-place eps? Surely there is some algorithm Python is cranking through to get to the 107-digit float.
The script.
total = 520  # hard-coded, after trial and error, max number of iterations.
dig = [1]
n = 1
for t in range(total):
    eps = '0.' + ''.join(str(x) for x in dig)
    if 1.0 == 1.0 + float(eps):
        if dig[-1] == 9:
            print(eps, n)
            n += 1
            dig.append(1)
        else:
            dig[-1] += 1
    else:
        print(eps, n)
        n += 1
        dig[-1] -= 1
        dig.append(1)
The output (part of it). Values are the eps and the number of decimal places
0.1 1
0.01 2
(...)
0.000000000000001 15
0.0000000000000002 16
0.00000000000000012 17
0.000000000000000112 18
0.0000000000000001111 19
0.00000000000000011103 20
(...)
0.0000000000000001110223024625157 31
0.00000000000000011102230246251567 32
0.000000000000000111022302462515667 33
(...)
0.000000000000000111022302462515666368314810887391490808258832543534838643850548578484449535608291625976563 105
0.0000000000000001110223024625156663683148108873914908082588325435348386438505485784844495356082916259765626 106
0.00000000000000011102230246251566636831481088739149080825883254353483864385054857848444953560829162597656251 107
I ran this code in Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)] on win32.
Your test involves a double rounding and is finding the number 2^−53 + 2^−105.
Many Python implementations use the IEEE-754 binary64 format. (This is not required by the Python documentation.) In this format, the significand (fraction portion) of a floating-point number has 53 bits. (52 are encoded in a primary significand field; 1 is encoded via the exponent field.) For numbers in the interval [1, 2), the significand is scaled (by the exponent portion of the floating-point representation) so that its leading bit corresponds to a value of 1 (2^0). This means its trailing bit corresponds to a value of 2^−52.
Thus, the difference between 1 and the next number representable in this format is 2^−52; that is the smallest change that can be made in the number, by increasing the low bit.
Now, suppose x contains 1. If we add 2^−52 to it, we will of course get 1 + 2^−52, since that result is representable. What happens if we add something slightly smaller, say ¾•2^−52? In this case, the real-number result, 1 + ¾•2^−52, is not representable. It must be rounded to a representable number. The common default rounding method is to round to the nearest representable number. In this case, that is 1 + 2^−52.
Thus, adding to 1 some numbers smaller than 2^−52 still produces 1 + 2^−52. What is the smallest number we can add to 1 and get this result?
In case of ties, where the real-number result is exactly halfway between two representable numbers, the common default rounding method uses the one with the even low bit. So, with a choice between 1 (trailing bit 0) and 1 + 2^−52 (trailing bit 1), it chooses 1. That means if we add ½•2^−52 to 1, it will produce 1.
If we add any number greater than ½•2^−52 to 1, there will be no tie; the real-number result will be nearer to 1 + 2^−52, and that will be the result.
The next question is: what is the smallest number greater than ½•2^−52 (that is, 2^−53) that we can add to 1? If the number has to be in the IEEE-754 binary64 format, it is limited by its significand. With the leading bit scaled to represent 2^−53, the trailing bit represents 2^(−53−52) = 2^−105.
Therefore, 2^−53 + 2^−105 is the smallest binary64 value we can add to 1 to get 1 + 2^−52.
As your program tests values, it works with a decimal numeral. That decimal numeral is converted to the floating-point format and then added to 1. So it is finding the smallest number in the floating-point format that produces a sum greater than 1, and that is the number described above, 2^−53 + 2^−105. Its value in decimal is 1.110223024625156663683148108873914908082588325435348386438505485784844495356082916259765625•10^−16.
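A quick check of that value (a minimal sketch, assuming the IEEE-754 binary64 format discussed above):
from decimal import Decimal

eps_small = 2 ** -53 + 2 ** -105
print(1.0 + eps_small > 1.0)   # True: the smallest float that pushes 1.0 upward
print(1.0 + 2 ** -53 > 1.0)    # False: exactly half an ULP ties to even and stays at 1.0
print(Decimal(eps_small))      # its exact decimal expansion, matching the value above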

Egyptian fraction using Fibonacci's Algorithm

I have this problem in which we are trying to find an Egyptian fraction using Fibonacci's algorithm. The numerator must always be equal to one. Then, we have to determine whether the denominator is a practical number.
We have 2 inputs from the user, in which they give us a number (that must be positive).
I have already found a way to determine whether or not the bottom number of the rational number is a practical number (a great similar example: Practical Number), but I am lost on how to convert it to an Egyptian fraction.
In the instructions, it states that we should find the biggest fraction based on our factors list. For example: if the rational number is 5/8, the factors of 8 are [1,2,4]. The largest fraction that could be subtracted from this is 1/2.
I don't even know where to start with this conversion. I just know that if the second number from the user input is a practical number, I must calculate the equivalent Egyptian fraction.
The output should run similarly to this:
Num1 : 7
Num 2: 8
Denominator factors: [1,2,4,8]
Num 2 is a practical number.
Fraction can be represented by:
1/2 + 1/4 + 1/8
Any starting help would be appreciated. I truly understand the concept and what it's asking - I am just stuck on where to start. Example codes would be a great help.
Okay ... I'm going to echo what you've just told us, using the example 7/8.
Start with the two parts of the fraction: numer=7, denom=8
Determine that denom is a practical number; this includes returning its factors, [1, 2, 4, 8].
Sort the factors in order; if you're guaranteed that the fraction is always less than 1, you can discard the 1 factor.
Iterate through the list, one factor at a time, building the terms of your Egyptian fraction.
pseudo-code:
for factor in factor_list:
    weight = denom / factor
    while weight <= numer:
        # Add the fraction 1/factor to the solution;
        # reduce numer by weight (subtracting that fraction).

# When you exit these loops, numer should be 0, and you should have
# accumulated all of the "1/factor" fractions in your solution.
