Given two positive floating point numbers x and y, how would you compute x/y to within a specified tolerance e if the division operator
cannot be used?
You cannot use any library functions, such as log and exp; addition
and multiplication are acceptable.
How can I solve this? I know the usual approach to division is to use bitwise operators, but in that approach the loop stops as soon as x is less than y, so only the integer part of the quotient is produced.
def divide(x, y):
    # Break x/y down into (x - b*y)/y + b, where b is the integer answer.
    # b can be computed by adding powers of 2.
    result = 0
    power = 32
    y_power = y << power
    while x >= y:
        while y_power > x:
            y_power >>= 1
            power -= 1
        x -= y_power
        result += 1 << power
    return result
An option is to use Newton-Raphson iteration, which is known to converge quadratically (so the number of exact bits grows like 1, 2, 4, 8, 16, 32, 64).
First compute the inverse of y with the iterates
z(n+1) = z(n) * (2 - y * z(n)),
and after convergence form the product
x * z(N) ≈ x / y.
But the challenge is to find a good starting approximation z(0), which should be within a factor of 2 of 1/y.
If the context allows it, you can play directly with the exponent of the floating-point representation: from the bracketing 2^(e-1) <= y < 2^e, take z(0) = 2^-e or, better, √2.2^-e.
If this is forbidden, you can set up a table of all the possible powers of 2 in advance and perform a dichotomic (binary) search to locate y in the table. The inverse power is then easily read from a companion table.
For double precision floats, there are 11 exponent bits, so the table of powers should hold about 2048 values, which can be considered a lot. You can trade storage for computation by storing only the powers whose exponents are themselves powers of two, 2^±1, 2^±2, 2^±4, 2^±8... Then during the dichotomic search you recreate the intermediate exponents on demand by means of products (e.g. 2^5 = 2^4.2^1), and at the same time form the product of the corresponding inverses. This can be done efficiently, using only about lg(p) multiplies, where p = |lg(y)| is the desired power.
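As a concrete illustration, here is a small Python sketch of that search (my own code and naming; it assumes y >= 1, and that the precomputed tables pow2[j] == 2.0**(2**j) and inv2[j] == 2.0**-(2**j) are large enough to cover y):

def bracket_power(y, pow2, inv2):
    # Doubling phase: try the exponents 1, 2, 4, 8, ... until 2**(2**j) > y.
    j = 0
    while pow2[j] <= y:
        j += 1
    if j == 0:
        return 0, 1.0  # 1 <= y < 2 already
    # Now 2**(2**(j-1)) <= y < 2**(2**j): the top bit of the exponent is
    # known; decide the lower bits one at a time, building the power and
    # its inverse from the tabulated values as we go.
    p, inv, e = pow2[j - 1], inv2[j - 1], 1 << (j - 1)
    for i in range(j - 2, -1, -1):
        if p * pow2[i] <= y:  # try setting bit i of the exponent
            p, inv, e = p * pow2[i], inv * inv2[i], e | (1 << i)
    return e, inv  # 2**e <= y < 2**(e+1), and inv == 2.0**-e

pow2 = [2.0 ** (2 ** j) for j in range(6)]   # 2, 4, 16, 256, 65536, 2**32
inv2 = [2.0 ** -(2 ** j) for j in range(6)]
print(bracket_power(1000.0, pow2, inv2))     # (9, 0.001953125)

For y = 1000 this performs exactly the lookup worked out in the example below: the doubling phase stops at 65536, the candidates 4096 and 1024 are rejected, 512 is accepted, and only a handful of multiplies are spent.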
Example: lookup of the power for 1000; the exponents are denoted in binary.
1000 > 2^1b = 2
1000 > 2^10b = 4
1000 > 2^100b = 16
1000 > 2^1000b = 256
1000 < 2^10000b = 65536
Then
1000 < 2^1100b = 16.256 = 4096
1000 < 2^1010b = 4.256 = 1024
1000 > 2^1001b = 2.256 = 512
so that
2^9 < 1000 < 2^10.
Now the Newton-Raphson iterations yield
z0 = 0.001381067932
z1 = 0.001381067932 x (2 - 1000 x 0.001381067932) = 0.000854787231197
z2 = 0.000978913251777
z3 = 0.000999555349049
z4 = 0.000999999802286
z5 = 0.001
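Putting the two pieces together, here is a minimal Python sketch of the whole scheme (my own code and naming; it uses a simple linear bracketing loop instead of the dichotomic table, and only comparisons, additions and multiplications):

def reciprocal(y, iterations=6):
    # Bracket y between powers of two: find p = 2**k with p <= y < 2*p,
    # keeping inv = 2**-k up to date alongside (no division needed).
    p, inv = 1.0, 1.0
    if y >= 1.0:
        while p * 2.0 <= y:
            p, inv = p * 2.0, inv * 0.5
    else:
        while p > y:
            p, inv = p * 0.5, inv * 2.0
    # Geometric midpoint of (2**-(k+1), 2**-k]: within a factor sqrt(2) of 1/y.
    z = inv * 0.7071067811865476  # z(0) = sqrt(2).2^-(k+1)
    for _ in range(iterations):
        z = z * (2.0 - y * z)     # Newton-Raphson step, quadratic convergence
    return z

def divide(x, y):
    return x * reciprocal(y)

print(divide(1.0, 1000.0))  # ~0.001, matching the trace above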
The most straightforward solution is probably to use Newton's method to compute the reciprocal, which can then be multiplied by the numerator to yield the final result.
This is an iterative process that gradually refines an initial guess, roughly doubling the precision on every iteration, and it involves only multiplication and addition.
One complication is generating a suitable initial guess, since an improper selection may fail to converge or may take a larger number of iterations to reach the desired precision. For floating-point numbers the easiest solution is to normalize away the power-of-two exponent and use 1 as the initial guess, then invert and reapply the exponent separately for the final result. This yields roughly 2^iteration bits of precision, so 6 iterations should be sufficient for a typical IEEE-754 double with a 53-bit mantissa.
Computing the result to within an absolute error tolerance e is difficult, however, given the limited precision of the intermediate computations. If the tolerance is specified too tightly it may not even be representable and, worse, a minimal half-ULP bound requires exact arithmetic. In that case you will be forced to implement the equivalent of an exact IEEE-754 division function by hand while taking great care with rounding and special cases.
Below is one possible implementation in C:
#include <math.h>

double divide(double numer, double denom, unsigned int precision) {
    int exp;
    denom = frexp(denom, &exp);      /* normalize denom into [0.5, 1) */
    double guess = 1.4142135623731;  /* sqrt(2): geometric mean of the bounds 1 and 2 on 1/denom */
    if (denom < 0)
        guess = -guess;
    while (precision--)
        guess *= 2 - denom * guess;  /* Newton-Raphson step */
    return ldexp(numer * guess, -exp);
}
Handling and analysis of special cases such as zero, other denormals, infinity or NaNs is left as an exercise for the reader.
The frexp and ldexp library functions can easily be replaced with manual bit-extraction of the exponent and mantissa. However this is messy and non-portable, and no specific floating-point representation was specified in the question.
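For illustration, here is a hedged Python sketch of such bit extraction (my own code and naming; it assumes IEEE-754 binary64 and a positive, normal input, and mimics frexp):

import struct

def my_frexp(x):
    # Reinterpret the double as a 64-bit integer (same byte order on both sides).
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    raw_exp = (bits >> 52) & 0x7FF  # biased exponent field
    exponent = raw_exp - 1022       # frexp convention: mantissa in [0.5, 1)
    # Keep sign and fraction bits, force the exponent field to encode 2**-1.
    mantissa_bits = (bits & 0x800FFFFFFFFFFFFF) | (1022 << 52)
    mantissa = struct.unpack('<d', struct.pack('<Q', mantissa_bits))[0]
    return mantissa, exponent

print(my_frexp(10.0))  # (0.625, 4), same as math.frexp(10.0)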
First, you should separate the signs and exponents from both numbers. After that, we divide the pure positive mantissas and adapt the result using the original exponents and signs.
As for dividing the mantissas, it is simple if you remember that division is not only inverted multiplication, but also repeated subtraction: the number of subtractions is the result.
A/B -> C, precision e
C = 0
allowance = e * B
multiplicator = 1
delta = B
while (delta >= allowance && A > 0) {
    if (A < delta) {
        multiplicator *= 0.1  // 1/10
        delta *= 0.1          // 1/10
    } else {
        A -= delta
        C += multiplicator
    }
}
Really, we can use any number > 1 instead of 10. It would be interesting to see which base is the most efficient. Of course, if we use 2, we can use a shift instead of a multiplication inside the loop.
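For reference, here is a direct Python transcription of the pseudocode above (my own naming; base 10 is kept as in the original, and plain floats are used, so the result carries the usual floating-point noise):

def divide_by_subtraction(a, b, e):
    c = 0.0
    allowance = e * b  # stop once one step is worth less than e
    multiplicator = 1.0
    delta = b
    while delta >= allowance and a > 0:
        if a < delta:
            multiplicator *= 0.1  # scale the step down by 1/10
            delta *= 0.1
        else:
            a -= delta            # take one step of size delta...
            c += multiplicator    # ...worth multiplicator units of the quotient
    return c

print(divide_by_subtraction(7.0, 3.0, 1e-6))  # ~2.333333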
Related
I want to do division but with subtraction. I also don't necessarily want the exact answer.
No floating point numbers too (preferably)
How can this be achieved?
Thanks in advance:)
Also, the process should be almost as fast as normal division.
To approximate x divided by y you can repeatedly subtract y from x until what remains is smaller than y; the result of the division is then the number of times you subtracted. However, this doesn't work with negative numbers.
Well, let's say you have your numerator and your denominator. Division basically consists in counting how many times the denominator fits into the numerator.
So a simple loop should do:
def divide_by_sub(numerator, denominator):
    # Init
    result = 0
    remains = numerator
    # Subtract as much as possible
    while remains >= denominator:
        remains -= denominator
        result += 1
    # Here we have the "floor" part of the result
    return result
This will give you the "floor" part of your result. Please consider adding some guardrails to handle "denominator is zero", "numerator is negative", etc.
If you want to go further, my best guess would be to add an argument for the precision you want, like precision, then multiply remains by it (for instance 10 or 100) and loop on it again. It's doable recursively:
def divide_by_sub(numerator, denominator, precision):
    # Init
    result = 0
    remains = numerator
    # Subtract as much as possible
    while remains >= denominator:
        remains -= denominator
        result += 1
    # Here we have the "floor" part of the result. We proceed to more digits
    if precision > 1:
        remains = remains * precision
        float_result = divide_by_sub(remains, denominator, 1)
        result += float_result / precision
    return result
This gives you, for instance for divide_by_sub(7, 3, 1000), the following:
2.333
My aim is to find np.mod(np.array[int], some_number) for a numpy array containing very large integers. some_number is rational, but in general not an exact decimal fraction. I want to make sure that the modulos are as accurate as possible, since I need to bin the results for a histogram in a later step, and any errors due to floating-point precision might mean that values end up in the wrong bin.
I am aware that the modulo function with floats is limited by floating-point precision, so I am hesitating to use np.mod(array[int], float).
I then came across the fractions module of the python library. Can someone give advice as to whether the results obtained via np.mod(np.array[int], Fraction(int1, int2)) would be more accurate than using a float? If not, what is the best approach for such a problem?
So you have a fraction some_number=n/d
Computing the modulo is like performing this division:
a = q*(n/d) + (r/d)
the remainder is a fraction with numerator r.
It can be written like this:
a*d = q * n + r
The problem you have is that a*d could overflow.
But the problem can be written like this:
a = q1 * n + r1
d = q2 * n + r2
a*d = (q1*q2*n+q1*r2+q2*r1) * n + (r1*r2)
Given that n/d is between 10 and 100, we have n > d, hence q2 = 0 and r2 = d, and the algorithm is:
1. compute a modulo n => r1
2. compute (r1*d) modulo n => r
3. divide r by d => a modulo (n/d)
If it's for putting in bins, you don't need step 3.
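A small Python sketch of those three steps (my own code and naming; plain Python integers never overflow, so the point being illustrated is only that the product a*d is never formed):

from fractions import Fraction

def mod_fraction(a, n, d):
    # a modulo (n/d), using integer arithmetic only.
    r1 = a % n             # step 1: a modulo n
    r = (r1 * d) % n       # step 2: (r1*d) mod n == (a*d) mod n
    return Fraction(r, d)  # step 3: divide r by d (skip this when binning)

print(mod_fraction(25, 7, 2))  # 25 mod 3.5 == 1/2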
I'm trying to implement basic arithmetic on Bill Gosper's continued logarithms, which are a 'mutation' of continued fractions allowing the term co-routines to emit and consume very small messages even on very large or very small numbers.
Reversible arithmetic operations, such as {+,-,*,/}, are fairly straightforwardly described by Gosper, at least in a unary representation, but I'm having difficulty implementing the modulo operator, which effectively truncates information from the division operation.
I've realized the modulo operator can be mostly defined with operations I already have:
a mod b == a - b * floor(a / b)
leaving my only problem with floor.
I've also read that the run-length encoded format for continued logarithms effectively describes
'... the integer part of the log base 2 of the number remaining to be
described.'
So yielding the first term right away (pass through) produces the correct output so far, but leaves a significant portion to be determined which I assume requires some sort of carry mechanism.
I've written the following code to test input terms and the expected output terms, but I'm mainly looking for high level algorithm ideas behind implementing floor.
An example input (1234 / 5) to output pair is
Input: [7, 0, 3, 0, 0, 0, 0, 1, 3, 3, 1]
Output: [7, 0, 3, 1, 4, 2, 1, 1]
from fractions import Fraction

def const(frac):
    """ CL bitstream from a fraction >= 1 or 0. """
    while frac:
        if frac >= 2:
            yield 1
            frac = Fraction(frac, 2)
        else:
            yield 0
            frac -= 1
            frac = Fraction(1, frac) if frac else 0

def rle(bit_seq):
    """ Run-length encoded CL bitstream. """
    s = 0
    for bit in bit_seq:
        s += bit
        if not bit:
            yield s
            s = 0

def floor(rle_seq):
    """ RLE CL terms of the greatest integer less than rle_seq. """
    # pass
    yield from output

""" Sample input/output pairs for floor(). """
num = Fraction(1234)
for den in range(1, int(num) + 1):
    input = list(rle(const(num / den)))
    output = list(rle(const(num // den)))  # Integer division!
    print("> ", input)
    print(">> ", output)
    print(">>*", list(floor(input)))
    print()
    assert list(floor(input)) == output
How can I implement the floor operator in the spirit of continued
fraction arithmetic, consuming terms only when necessary and emitting
terms right away, and especially using only the run-length encoded
format (in binary) rather than the unary expansion Gosper tends to
describe?
By assuming that the next coefficient in the run-length encoding is infinite, you can get a lower bound. By assuming that the next term is 1, you can get an upper bound.
You can simply process as many run-length encoded coefficients until you know that both the lower and the upper bound are in the half-open interval [N, N + 1). In this case you know that the floor of the continued logarithm is N. This is similar to what Bill Gosper does at the start of the linked document.
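Here is a Python sketch of that interval test (my own code and naming; the term semantics are inferred from the const/rle generators in the question, and the two bounds come from taking the unknown tail beyond the known terms to its two extremes):

from fractions import Fraction

def bounds(terms):
    # Exact lower and upper bounds for every number whose run-length
    # encoded continued logarithm begins with the given terms.
    def evaluate(tail):  # tail=None encodes an infinitely large tail value
        v = tail
        for a in reversed(terms):
            inner = Fraction(1) if v is None else 1 + Fraction(1, v)
            v = Fraction(2) ** a * inner
        return v
    lo, hi = evaluate(None), evaluate(Fraction(1))
    return min(lo, hi), max(lo, hi)

# 1234/5 = 246.8; its RLE terms begin [7, 0, 3, ...]:
print(bounds([7, 0, 3]))  # (Fraction(2176, 9), Fraction(4224, 17)) ~ (241.8, 248.5)

As soon as both bounds fall inside the same [N, N + 1), the floor is N; appending more of the terms of 1234/5 narrows the interval until that happens.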
Note, however, that this process doesn't necessarily terminate. For example, when you multiply sqrt(2) by sqrt(2), you get, of course, the number 2. However, the continued logarithm for sqrt(2) is infinite. To evaluate the product sqrt(2) * sqrt(2) you will need all the coefficients to know that you will end up with 2. With any finite number of terms, you can't decide if the product is less than 2 or at least equal to it.
Note that this problem is not specific to continued logarithms, but it is a fundamental problem that occurs in any system in which you can have two numbers for which the representation is infinite but the product can be represented with a finite number of coefficients.
To illustrate this, suppose that these coroutines don't spit out run-length encoded values, but decimal digits, and we want to calculate floor(sqrt(2) * sqrt(2)). After how many steps can we be sure that the product will be at least 2? Let's take 11 digits, just to see what happens:
1.41421356237 * 1.41421356237 = 1.9999999999912458800169
As you might guess, we get arbitrarily close to 2, but will never 'reach' 2. Indeed, without knowing that the source of the digits is sqrt(2), it might just happen that the digits terminate after that point and that the product ends up below 2. Similarly, all following digits might be 9's, which would result in a product slightly above 2.
(A simpler example would be to take the floor of a routine that produces 0.9999...)
So in these kinds of arbitrary-precision numerical systems you can end up in situations where you can only calculate some interval (N - epsilon, N + epsilon), where you can make epsilon arbitrarily small, but never equal to zero. It is not possible to take the floor of such an expression, as -- by the numerical methods employed -- it is not possible to decide whether the real value will end up below or above N.
A perfect power is a positive integer that can be expressed as an integer power of another positive integer.
The task is to check whether a given integer is a perfect power.
Here is my code:
def isPP2(x):
    c = []
    for z in range(2, int(x/2) + 1):
        if (x**(1./float(z))) * 10 % 10 == 0:
            c.append(int(x**(1./float(z)))), c.append(z)
    if len(c) >= 2:
        return c[0:2]
    else:
        return None
It works perfectly with most numbers, for example:
isPP2(81)
[9, 2]
isPP2(2187)
[3, 7]
But it doesn't work with 343 (7^3).
Because 343**(1.0/float(3)) is not 7.0, it's 6.99999999999999. You're trying to solve an integer problem with floating point math.
As explained in this link, floating point numbers are not stored perfectly in computers. You are most likely experiencing some error in calculation based off of this very small difference that persists in floating point calculations.
When I run your function, the equation ((x ** (1./float(z))) * 10 % 10) results in 9.99999999999999986, not 10 as is expected. This is due to the slight error involved in floating point arithmetic.
If you must calculate the value as a float (which may or may not be useful in your overall goal), you can define an accuracy range for your result. A simple check would look something like this:
precision = 1.e-6
check = (x ** (1./float(z))) * 10 % 10
if check == 0:
    pass  # no changes to the previous code
elif 10 - check < precision:
    c.append(int(x**(1./float(z))) + 1)
    c.append(z)
precision is defined in scientific notation, being equal to 1 x 10^(-6) or 0.000001, but its magnitude can be decreased if this large tolerance introduces other errors, which is not likely but entirely possible. I added 1 to the result since the truncated root (e.g. int(6.99999...) == 6) is one less than the intended base.
As the other answers have already explained why your algorithm fails, I will concentrate on providing an alternative algorithm that avoids the issue.
import math

def isPP2(x):
    # exp2 = log_2(x), i.e. 2**exp2 == x, is a much better upper bound
    # for the exponents to test: as 2 is the smallest base,
    # exp2 is the biggest exponent we can expect.
    exp2 = math.log(x, 2)
    for exp in range(2, int(exp2) + 1):
        # To avoid floating-point issues we simply round the base we get
        # and then test it against x by calculating base**exp.
        # Side note: according to the docs, ** and the built-in pow()
        # work integer-based as long as all arguments are integers.
        base = round(x**(1./float(exp)))
        if base**exp == x:
            return base, exp
    return None
print( isPP2(81) ) # (9, 2)
print( isPP2(2187) ) # (3, 7)
print( isPP2(343) ) # (7, 3)
print( isPP2(232**34) ) # (53824, 17)
As with your algorithm, this only returns the first solution if there is more than one.
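If every representation is wanted, a hypothetical variant (my own code and naming) can keep scanning instead of returning early:

import math

def all_pp(x):
    solutions = []
    for exp in range(2, int(math.log(x, 2)) + 1):
        base = round(x ** (1. / float(exp)))
        if base ** exp == x:
            solutions.append((base, exp))
    return solutions

print(all_pp(81))  # [(9, 2), (3, 4)]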
When researching for this question and reading the source code in random.py, I started wondering whether randrange and randint really behave as "advertised". I am very much inclined to believe so, but the way I read it, randrange is essentially implemented as
start + int(random.random()*(stop-start))
(assuming integer values for start and stop), so randrange(1, 10) should return a random number between 1 and 9.
randint(start, stop) is calling randrange(start, stop+1), thereby returning a number between 1 and 10.
My question is now:
If random() were ever to return 1.0, then randint(1,10) would return 11, wouldn't it?
From random.py and the docs:
"""Get the next random number in the range [0.0, 1.0)."""
The ) indicates that the interval excludes 1.0; that is, it will never return 1.0.
This is a general convention in mathematics: [ and ] are inclusive, while ( and ) are exclusive, and the two types of brackets can be mixed, as in (a, b] or [a, b). Have a look at Wikipedia: Interval (mathematics) for a formal explanation.
Other answers have pointed out that the result of random() is always strictly less than 1.0; however, that's only half the story.
If you're computing randrange(n) as int(random() * n), you also need to know that for any Python float x satisfying 0.0 <= x < 1.0, and any positive integer n, it's true that 0.0 <= x * n < n, so that int(x * n) is strictly less than n.
There are two things that could go wrong here: first, when we compute x * n, n is implicitly converted to a float. For large enough n, that conversion might alter the value. But if you look at the Python source, you'll see that it only uses the int(random() * n) method for n smaller than 2**53 (here and below I'm assuming that the platform uses IEEE 754 doubles), which is the range where the conversion of n to a float is guaranteed not to lose information (because n can be represented exactly as a float).
The second thing that could go wrong is that the result of the multiplication x * n (which is now being performed as a product of floats, remember) probably won't be exactly representable, so there will be some rounding involved. If x is close enough to 1.0, it's conceivable that the rounding will round the result up to n itself.
To see that this can't happen, we only need to consider the largest possible value for x, which is (on almost all machines that Python runs on) 1 - 2**-53. So we need to show that (1 - 2**-53) * n < n for our positive integer n, since it'll always be true that random() * n <= (1 - 2**-53) * n.
Proof (sketch): Let k be the unique integer such that 2**(k-1) < n <= 2**k. Then the next float down from n is n - 2**(k-53). We need to show that n*(1 - 2**-53) (i.e., the actual, unrounded value of the product) is closer to n - 2**(k-53) than to n, so that it'll always be rounded down. But a little arithmetic shows that the distance from n*(1 - 2**-53) to n is 2**-53 * n, while the distance from n*(1 - 2**-53) to n - 2**(k-53) is (2**k - n) * 2**-53. And 2**k - n < n (because we chose k so that 2**(k-1) < n), so the product is closer to n - 2**(k-53) and will get rounded down (assuming, that is, that the platform is doing some form of round-to-nearest).
So we're safe. Phew!
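For the skeptical, here is a quick empirical check (my own code; it assumes IEEE 754 binary64 with round-to-nearest, so on the x87 setups described in the addendum below the n = 2049 case can fail):

x_max = 1.0 - 2.0 ** -53  # the largest double strictly below 1.0
for n in range(1, 10 ** 6):
    assert x_max * n < n, n  # never rounds up to n, so int(x_max * n) < n
print("x * n stayed below n for every n tested")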
Addendum (2015-07-04): The above assumes IEEE 754 binary64 arithmetic, with round-ties-to-even rounding mode. On many machines, that assumption is fairly safe. However, on x86 machines that use the x87 FPU for floating-point (for example, various flavours of 32-bit Linux), there's a possibility of double rounding in the multiplication, and that makes it possible for random() * n to round up to n in the case where random() returns the largest possible value. The smallest such n for which this can happen is n = 2049. See the discussion at http://bugs.python.org/issue24546 for more.
From Python documentation:
Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0).
As does almost every PRNG that produces floats.