I'm trying to implement basic arithmetic on Bill Gosper's continued logarithms, which are a 'mutation' of continued fractions allowing the term co-routines to emit and consume very small messages even on very large or very small numbers.
Reversible arithmetic operations such as {+,-,*,/} are described fairly straightforwardly by Gosper, at least in a unary representation, but I'm having difficulty implementing the modulo operator, which effectively truncates information from the division operation.
I've realized the modulo operator can be mostly defined with operations I already have:
a mod b == a - b * floor(a / b)
leaving floor as my only remaining problem.
I've also read that the run-length encoded format for continued logarithms effectively describes
'... the integer part of the log base 2 of the number remaining to be
described.'
So yielding the first term right away (pass-through) produces the correct output so far, but leaves a significant portion to be determined, which I assume requires some sort of carry mechanism.
I've written the following code to test input terms and the expected output terms, but I'm mainly looking for high level algorithm ideas behind implementing floor.
An example input (1234 / 5) to output pair is
Input: [7, 0, 3, 0, 0, 0, 0, 1, 3, 3, 1]
Output: [7, 0, 3, 1, 4, 2, 1, 1]
from fractions import Fraction

def const(frac):
    """ CL bitstream from a fraction >= 1 or 0. """
    while frac:
        if frac >= 2:
            yield 1
            frac = Fraction(frac, 2)
        else:
            yield 0
            frac -= 1
            frac = Fraction(1, frac) if frac else 0
def rle(bit_seq):
    """ Run-length encoded CL bitstream. """
    s = 0
    for bit in bit_seq:
        s += bit
        if not bit:
            yield s
            s = 0
def floor(rle_seq):
    """ RLE CL terms of the greatest integer <= the value of rle_seq. """
    # placeholder: just echoes the expected `output` computed in the test loop below
    yield from output
""" Sample input/output pairs for floor(). """
num = Fraction(1234)
for den in range(1, int(num)+1):
input = list(rle(const(num / den)))
output = list(rle(const(num // den))) # Integer division!
print("> ", input)
print(">> ", output)
print(">>*", list(floor(input)))
print()
assert(list(floor(input)) == output)
How can I implement the floor operator in the spirit of continued fraction arithmetic, consuming terms only when necessary and emitting terms as soon as possible, and in particular using only the run-length encoded format (in binary) rather than the unary expansion Gosper tends to describe?
By assuming that the next coefficient in the run-length encoding is infinite, you can get a lower bound. By assuming that the next term is 1, you can get an upper bound.
You can simply process run-length encoded coefficients until you know that both the lower and the upper bound lie in the same half-open interval [N, N + 1). At that point you know that the floor of the continued logarithm is N. This is similar to what Bill Gosper does at the start of the linked document.
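Here is a minimal sketch of that strategy in Python, reusing the const and rle generators from the question. The helper names evaluate, bounds and floor_cl are mine; the sketch assumes the represented value is >= 1 (as const does), and it emits nothing until the integer part is certain, which, for the reasons below, is the best a floor routine can do:

def evaluate(terms, tail):
    """ Value of the RLE CL prefix `terms` when the remainder is worth `tail`. """
    value = Fraction(tail)
    for a in reversed(terms):
        value = 2 ** a * (1 + 1 / value)
    return value

def bounds(terms):
    """ Interval certainly containing every value whose RLE CL expansion starts
        with `terms`: the last term a stands for a remaining factor in [2**a, 2**(a+1)). """
    lo = evaluate(terms[:-1], 2 ** terms[-1])        # the stream stops right here
    hi = evaluate(terms[:-1], 2 ** (terms[-1] + 1))  # the unseen tail contributes its maximum
    return min(lo, hi), max(lo, hi)                  # nesting flips monotonicity, so sort

def floor_cl(rle_terms):
    """ Sketch: RLE CL terms of floor(value described by rle_terms). """
    seen = []
    for term in rle_terms:
        seen.append(term)
        lo, hi = bounds(seen)
        n = lo // 1                    # candidate integer part (Fraction // 1 -> int)
        if hi // 1 == n:               # both bounds inside [n, n + 1): floor is known
            yield from rle(const(Fraction(n)))
            return
    # Input exhausted: the value is exact, so its floor is too.
    exact = evaluate(seen[:-1], 2 ** seen[-1])
    yield from rle(const(Fraction(exact // 1)))

With the finite inputs from the test loop in the question this should pass the assert, because the final branch always knows the exact value; for the 1234/5 example it reproduces [7, 0, 3, 1, 4, 2, 1, 1].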
Note, however, that this process doesn't necessarily terminate. For example, when you multiply sqrt(2) by sqrt(2), you get, of course, the number 2. However, the continued logarithm for sqrt(2) is infinite. To evaluate the product sqrt(2) * sqrt(2) you will need all the coefficients to know that you will end up with 2. With any finite number of terms, you can't decide if the product is less than 2 or at least equal to it.
Note that this problem is not specific to continued logarithms, but it is a fundamental problem that occurs in any system in which you can have two numbers for which the representation is infinite but the product can be represented with a finite number of coefficients.
To illustrate this, suppose that these coroutines don't spit out run-length encoded values, but decimal digits, and we want to calculate floor(sqrt(2) * sqrt(2)). After how many steps can we be sure that the product will be at least 2? Let's take 11 digits, just to see what happens:
1.41421356237 * 1.41421356237 = 1.9999999999912458800169
As you might guess, we get arbitrarily close to 2, but will never 'reach' 2. Indeed, without knowing that the source of the digits is sqrt(2), it might just happen that the digits terminate after that point and that the product ends up below 2. Similarly, all following digits might be 9's, which would result in a product slightly above 2.
(A simpler example would be to take the floor of a routine that produces 0.9999...)
So in these kind of arbitrary-precision numerical systems you can end up in situations where you can only calculate some interval (N - epsilon, N + epsilon), where you can make epsilon arbitrarily small, but never equal to zero. It is not possible to take the floor of this expression, as -- by the numerical methods employed -- it is not possible to decide if the real value will end up below or above N.
Related
In math, you are allowed to take cube roots of negative numbers, because a negative number multiplied by two other negative numbers results in a negative number. Raising something to a fractional power 1/n is the same as taking the nth root of it. Therefore, the cube root of -27, or (-27)**(1.0/3.0), should come out to -3.
But in Python 2, when I type in (-27)**(1.0/3.0), it gives me an error:
Traceback (most recent call last):
File "python", line 1, in <module>
ValueError: negative number cannot be raised to a fractional power
Python 3 doesn't produce an exception, but it gives a complex number that doesn't look anything like -3:
>>> (-27)**(1.0/3.0)
(1.5000000000000004+2.598076211353316j)
Why don't I get the result that makes mathematical sense? And is there a workaround for this?
-27 has a real cube root (and two non-real cube roots), but (-27)**(1.0/3.0) does not mean "take the real cube root of -27".
First, 1.0/3.0 doesn't evaluate to exactly one third, due to the limits of floating-point representation. It evaluates to exactly
0.333333333333333314829616256247390992939472198486328125
though by default, Python won't print the exact value.
Second, ** is not a root-finding operation, whether real roots or principal roots or some other choice. It is the exponentiation operator. General exponentiation of negative numbers to arbitrary real powers is messy, and the usual definitions don't match with real nth roots; for example, the usual definition of (-27)^(1/3) would give you the principal root, a complex number, not -3.
Python 2 decides that it's probably better to raise an error for stuff like this unless you make your intentions explicit, for example by exponentiating the absolute value and then applying the sign:
def real_nth_root(x, n):
    # approximate
    # if n is even, x must be non-negative, and we'll pick the non-negative root.
    if n % 2 == 0 and x < 0:
        raise ValueError("No real root.")
    return (abs(x) ** (1.0/n)) * (-1 if x < 0 else 1)
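A quick usage sketch (hedged: the odd-root results are only approximate, since 1.0/n is not exactly one nth, so expect a few ULPs of error):

print(real_nth_root(-27, 3))   # within rounding error of -3.0
print(real_nth_root(16, 4))    # 2.0 (even n with non-negative x picks the positive root)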
or by using complex exp and log to take the principal root:
import cmath

def principal_nth_root(x, n):
    # still approximate
    return cmath.exp(cmath.log(x)/n)
or by just casting to complex for complex exponentiation (equivalent to the exp-log thing up to rounding error):
>>> complex(-27)**(1.0/3.0)
(1.5000000000000004+2.598076211353316j)
Python 3 uses complex exponentiation for negative-number-to-noninteger, which gives the principal nth root for y == 1.0/n:
>>> (-27)**(1/3) # Python 3
(1.5000000000000004+2.598076211353316j)
The type coercion rules documented by builtin pow apply here, since you're using a float for the exponent.
Just make sure that either the base or the exponent is a complex instance and it works:
>>> (-27+0j)**(1.0/3.0)
(1.5000000000000004+2.598076211353316j)
>>> (-27)**(complex(1.0/3.0))
(1.5000000000000004+2.598076211353316j)
To find all three roots, consider numpy:
>>> import numpy as np
>>> np.roots([1, 0, 0, 27])
array([-3.0+0.j , 1.5+2.59807621j, 1.5-2.59807621j])
The list [1, 0, 0, 27] here refers to the coefficients of the equation 1x³ + 0x² + 0x + 27.
I do not think Python, or your version of it, supports this. I pasted the same expression into my Python interpreter (IDLE) and it evaluated it with no errors. I am using Python 3.2.
Given two positive floating point numbers x and y, how would you compute x/y to within a specified tolerance e if the division operator cannot be used?

You cannot use any library functions, such as log and exp; addition and multiplication are acceptable.

How can I solve this? I know the usual approach to division is to use bitwise operators, but with that approach the loop stops as soon as x is less than y:
def divide(x, y):
    # break down x/y into (x-by)/y + b, where b is the integer answer
    # b can be computed using addition of numbers of power of 2
    result = 0
    power = 32
    y_power = y << power
    while x >= y:
        while y_power > x:
            y_power = y_power >> 1
            power -= 1
        x = x - y_power
        result += 1 << power
    return result
An option is to use the Newton-Raphson iterations, known to converge quadratically (so that the number of exact bits will grow like 1, 2, 4, 8, 16, 32, 64).
First compute the inverse of y with the iterates

z(n+1) = z(n) * (2 - y * z(n)),

and after convergence form the product

x * z(N) ≈ x / y.

But the challenge is to find a good starting approximation z(0), which should be within a factor of 2 of 1/y.
If the context allows it, you can play directly with the exponent of the floating-point representation: writing y as m·2^e, replace it by the initial guess 1·2^-e or √2·2^-e.

If this is forbidden, you can set up a table of all the possible powers of 2 in advance and perform a dichotomic search to locate y in the table. Then the inverse power is easily found in the table.

For double precision floats, there are 11 exponent bits, so the table of powers would hold 2047 values, which can be considered a lot. You can trade storage for computation by storing only the powers with power-of-two exponents, 2^0, 2^±1, 2^±2, 2^±4, 2^±8... Then during the dichotomic search, you recreate the intermediate exponents on demand by means of products (e.g. 2^5 = 2^4 · 2^1), and at the same time form the product of inverses. This can be done efficiently, using lg(p) multiplies only, where p = |lg(y)| is the desired power.
Example: lookup of the power for 1000; the exponents are denoted in binary.
1000 > 2^1b = 2
1000 > 2^10b = 4
1000 > 2^100b = 16
1000 > 2^1000b = 256
1000 < 2^10000b = 65536
Then
1000 < 2^1100b = 16·256 = 4096
1000 < 2^1010b = 4·256 = 1024
1000 > 2^1001b = 2·256 = 512
so that
2^9 < 1000 < 2^10.
Now the Newton-Raphson iterations yield
z0 = 0.001381067932
z1 = 0.001381067932 x (2 - 1000 x 0.001381067932) = 0.000854787231197
z2 = 0.000978913251777
z3 = 0.000999555349049
z4 = 0.000999999802286
z5 = 0.001
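To make the procedure concrete, here is a small Python sketch (the function name divide_nr is mine; it brackets the starting guess with comparisons against powers of two instead of a table, and stops when two successive estimates of x/y agree to within the tolerance e):

def divide_nr(x, y, e=1e-12):
    """ Approximate x/y for x, y > 0 using only comparisons, + and *. """
    # Bracket 1/y with a power of two: afterwards y*z lies in (0.5, 1].
    z = 1.0
    while y * z > 1.0:
        z *= 0.5
    while y * z <= 0.5:
        z *= 2.0
    # Newton-Raphson for the reciprocal: z <- z*(2 - y*z) roughly doubles
    # the number of correct bits per step, so a handful of steps suffices.
    for _ in range(8):
        previous = x * z
        z = z * (2.0 - y * z)
        if abs(x * z - previous) <= e:   # successive estimates of x/y agree
            break
    return x * z

print(divide_nr(1.0, 1000.0))   # ~0.001, as in the iterates above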
Probably the most straightforward solution is to use Newton's method for division to compute the reciprocal, which may then be multiplied by the numerator to yield the final result.
This is an iterative process that gradually refines an initial guess, roughly doubling the precision on every iteration, and it involves only multiplication and addition.
One complication is generating a suitable initial guess, since an improper selection may fail to converge or take a larger number of iterations to reach the desired precision. For floating-point numbers the easiest solution is to normalize for the power-of-two exponent and use 1 as the initial guess, then invert and reapply the exponent separately for the final result. This yields roughly 2^iteration bits of precision, and so 6 iterations should be sufficient for a typical IEEE-754 double with a 53-bit mantissa.
Computing the result to within an absolute error tolerance e is difficult however given the limited precision of the intermediate computations. If specified too tightly it may not be representable and, worse, a minimal half-ULP bound requires exact arithmetic. If so you will be forced to manually implement the equivalent of an exact IEEE-754 division function by hand while taking great care with rounding and special cases.
Below is one possible implementation in C:
#include <math.h>

double divide(double numer, double denom, unsigned int precision) {
    int exp;
    denom = frexp(denom, &exp);
    double guess = 1.4142135623731;
    if (denom < 0)
        guess = -guess;
    while (precision--)
        guess *= 2 - denom * guess;
    return ldexp(numer * guess, -exp);
}
Handling and analysis of special-cases such as zero, other denormals, infinity or NaNs is left as an exercise for the reader.
The frexp and ldexp library functions can easily be replaced by manual bit-extraction of the exponent and mantissa. However, this is messy and non-portable, and no specific floating-point representation was specified in the question.
First, you should separate the signs and exponents from both numbers. After that, we divide the pure positive mantissas and adapt the result using the former exponents and signs.
As for dividing the mantissas, it is simple if you remember that division is not only the inverse of multiplication, but also repeated subtraction: the number of subtractions is the result.
A:B -> C, precision e

C = 0
allowance = e*B
multiplicator = 1
delta = B
while (delta >= allowance && A > 0) {
    if (A < delta) {
        multiplicator *= 0.1   // 1/10
        delta *= 0.1           // 1/10
    } else {
        A -= delta
        C += multiplicator
    }
}
Really, we can use any number > 1 instead of 10; it would be interesting to see which choice is the most effective. Of course, if we use 2, we can use a shift instead of the multiplication inside the loop.
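A runnable Python sketch of the same idea (the function name divide_by_subtraction is mine, and it uses 2 as the scaling factor, as suggested):

def divide_by_subtraction(a, b, e):
    """ Approximate a/b for a, b > 0 by repeated subtraction,
        stopping once further contributions are on the order of e. """
    c = 0.0
    allowance = e * b          # delta/b is the size of the next contribution
    multiplicator = 1.0
    delta = b
    while delta >= allowance and a > 0:
        if a < delta:
            multiplicator *= 0.5   # scale down by 2 instead of 10
            delta *= 0.5
        else:
            a -= delta
            c += multiplicator
    return c

print(divide_by_subtraction(10.0, 4.0, 1e-9))   # ~2.5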
A perfect power is a positive integer that can be expressed as an integer power of another positive integer.
The task is to check whether a given integer is a perfect power.
Here is my code:
def isPP2(x):
    c=[]
    for z in range(2,int(x/2)+1):
        if (x**(1./float(z)))*10%10==0:
            c.append(int(x**(1./float(z)))), c.append(z)
    if len(c)>=2:
        return c[0:2]
    else:
        return None
It works perfectly for most numbers, for example:
isPP2(81)
[9, 2]
isPP2(2187)
[3, 7]
But it doesn't work with 343 (= 7³).
Because 343**(1.0/float(3)) is not 7.0, it's 6.99999999999999. You're trying to solve an integer problem with floating point math.
As explained in this link, floating point numbers are not stored perfectly in computers. You are most likely experiencing some error in calculation based off of this very small difference that persists in floating point calculations.
When I run your function, the expression ((x ** (1./float(z))) * 10 % 10) results in 9.99999999999999986 rather than the expected 0 (it falls just short of 10 instead of wrapping around to 0). This is due to the slight error involved in floating point arithmetic.
If you must calculate the value as a float (which may or may not be useful in your overall goal), you can define an accuracy range for your result. A simple check would look something like this:
precision = 1.e-6
check = (x ** (1./float(z))) * 10 % 10
if check == 0:
    # No changes to previous code
    c.append(int(x**(1./float(z)))), c.append(z)
elif 10 - check < precision:
    c.append(int(x**(1./float(z))) + 1)
    c.append(z)
precision is given in scientific notation, equal to 1 × 10^(-6) or 0.000001; it can be decreased in magnitude if this tolerance introduces other errors, which is not likely but entirely possible. I added 1 to the result since the computed root was just below the target integer.
As the other answers have already explained why your algorithm fails, I will concentrate on providing an alternative algorithm that avoids the issue.
import math

def isPP2(x):
    # exp2 = log_2(x), i.e. 2**exp2 == x, is a much better upper bound
    # for the exponents to test: as 2 is the smallest base,
    # exp2 is the biggest exponent we can expect.
    exp2 = math.log(x, 2)
    for exp in range(2, int(exp2) + 1):   # +1 so that exact powers of 2 are also covered
        # to avoid floating point issues we simply round the base we get
        # and then test it against x by calculating base**exp
        # side note:
        # according to the docs, ** and the built-in pow()
        # work integer-based as long as all arguments are integers.
        base = round( x**(1./float(exp)) )
        if base**exp == x:
            return base, exp
    return None
print( isPP2(81) ) # (9, 2)
print( isPP2(2187) ) # (3, 7)
print( isPP2(343) ) # (7, 3)
print( isPP2(232**34) ) # (53824, 17)
As with your algorithm this only returns the first solution if there is more than one.
I want to implement Karatsuba's 2-split multiplication in Python. However, it requires writing numbers in the form
A=c*x+d
where x is a power of the base (let x=b^m) close to sqrt(A).
How am I supposed to find x, if I can't even use division and multiplication? Should I count the number of digits and shift A to the left by half the number of digits?
Thanks.
Almost. You don't shift A by half the number of digits; you shift 1. Of course, this is only efficient if the base is a power of 2, since "shifting" in base 10 (for example) has to be done with multiplications. (Edit: well, ok, you can multiply with shifts and additions. But it's ever so much simpler with a power of 2.)
If you're using Python 3.1 or greater, counting the bits is easy, because 3.1 introduced the int.bit_length() method. For other versions of Python, you can count the bits by copying A and shifting it right until it's 0. This can be done in O(log N) time (N = # of digits) with a sort of binary search method - shift by many bits, if it's 0 then that was too many, etc.
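For illustration, a small Python 3 sketch of that splitting step (the helper name karatsuba_split is mine; it assumes base 2, so the x below is exactly the shifted 1 described above):

def karatsuba_split(A):
    """ Split A as c*x + d with x = 2**m and m roughly half of A's bit length. """
    m = A.bit_length() >> 1   # half the number of bits, found without division
    x = 1 << m                # x = 2**m, close to sqrt(A)
    c = A >> m                # high half
    d = A - (c << m)          # low half, so A == c*x + d
    return x, c, d

print(karatsuba_split(1234))  # (32, 38, 18): 38*32 + 18 == 1234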
You already accepted an answer since I started writing this, but:
What Tom said: in Python 3.x you can get n = int.bit_length() directly.
In Python 2.x you get n in O(log2(A)) time by binary-search, like below.
Here is (2.x) code that calculates both. Let the base-2 exponent of x be n, i.e. x = 2**n.
First we get n by binary-search by shifting. (Really we only needed n/2, so that's one unnecessary last iteration).
Then when we know n, getting x, c, d is easy (still without using division):
def karatsuba_form(A, n=32):
    """Binary-search for Karatsuba form using binary shifts"""
    # First search for n ~ log2(A)
    step = n >> 1
    while step > 0:
        c = A >> n
        print 'n=%2d step=%2d -> c=%d' % (n, step, c)
        if c:
            n += step
        else:
            n -= step
        # More concisely, could say: n = (n+step) if c else (n-step)
        step >>= 1
    # Then take x = 2^(n/2) ~ sqrt(A)
    ndiv2 = n/2
    # Find Karatsuba form
    c = (A >> ndiv2)
    x = (1 << ndiv2)
    d = A - (c << ndiv2)
    return (x, c, d)
Your question is already answered in the article to which you referred: "Karatsuba's basic step works for any base B and any m, but the recursive algorithm is most efficient when m is equal to n/2, rounded up" ... n being the number of digits, and 0 <= value_of_digit < B.
Some perspective that might help:
You are allowed (and required!) to use elementary operations like number_of_digits // 2 and divmod(digit_x * digit_x, B) ... in school arithmetic, where B is 10, you are required (for example) to know that divmod(9 * 8, 10) produces (7, 2).
When implementing large number arithmetic on a computer, it is usual to make B the largest power of 2 that will support the elementary multiplication operation conveniently. For example in the CPython implementation on a 32-bit machine, B is chosen to be 2 ** 15 (i.e. 32768), because then product = digit_x * digit_y; hi = product >> 15; lo = product & 0x7FFF; works without overflow and without concern about a sign bit.
I'm not sure what you are trying to achieve with an implementation in Python that uses B == 2, with numbers represented by Python ints, whose implementation in C already uses the Karatsuba algorithm for multiplying numbers that are large enough to make it worthwhile. It can't be speed.
As a learning exercise, you might like to try representing a number as a list of digits, with the base B being an input parameter.
I'm looking to compute the nth digit of Pi in a low-memory environment. As I don't have decimals available to me, this integer-only BBP algorithm in Python has been a great starting point. I only need to calculate one digit of Pi at a time. How can I determine the lowest I can set D, the "number of digits of working precision"?
D=4 gives me many correct digits, but a few digits will be off by one. For example, computing digit 393 with precision of 4 gives me 0xafda, from which I extract the digit 0xa. However, the correct digit is 0xb.
No matter how high I set D, it seems that testing a sufficient number of digits finds one where the formula returns an incorrect value.
I've tried upping the precision when the digit is "close" to another, e.g. 0x3fff or 0x1000, but cannot find any good definition of "close"; for instance, calculating digit 9798 gives me 0xcde6, which is not very close to 0xd000, yet the correct digit is 0xd.
Can anyone help me figure out how much working precision is needed to calculate a given digit using this algorithm?
Thank you,
edit
For Reference:
precision (D)    first wrong digit
-------------    -----------------
            3                   27
            4                  161
            5                  733
            6                 4329
            7                21139
           8+                  ???
Note that I am calculating one digit at a time, e.g.:
for i in range(1, n):
    D = 3           # or whatever precision I'm testing
    digit = pi(i)   # extracts most significant digit from integer-only BBP result
    if digit != HARDCODED_PI[i]:
        print("non matching digit #%d, got %x instead of %x" % (i, digit, HARDCODED_PI[i]))
'No matter how high I set D, it seems that testing a sufficient number of digits finds one where the formula returns an incorrect value.'
You will always get an error if you are testing a sufficient number of digits - the algorithm does not use arbitrary precision, so rounding errors will show up eventually.
With an unbounded iteration that breaks when the digit stops changing, it is going to be difficult to determine the minimum precision required for a given number of digits.
Your best bet is to determine it empirically, ideally by comparing against a known correct source, and increasing the number of digits of precision until you get a match; or, if a correct source is not available, start with your maximum precision (which I guess is 14, since the 15th digit will almost always contain a rounding error).
EDIT: To be more precise, the algorithm includes a loop - from 0..n, where n is the digit to compute. Each iteration of the loop will introduce a certain amount of error. After looping a sufficient number of times, the error will encroach into the most significant digit that you are computing, and so the result will be wrong.
The wikipedia article uses 14 digits of precision, and this is sufficient to correctly compute the 10**8-th digit. As you've shown, fewer digits of precision leads to errors occurring earlier, as there is less precision and the error becomes visible with fewer iterations. The net result is that the largest n for which we can correctly compute a digit becomes lower with fewer digits of precision.
If you have D hex digits of precision, that's D*4 bits. With each iteration, an error of 0.5 bits is introduced in the least significant bit, so after 2 iterations there is a chance the LSB is wrong. During summation, these errors are added, and so accumulate. If the number of errors summed reaches the LSB of the most significant digit, then the single digit you extract will be wrong. Roughly speaking, that is when N > 2**(D-0.75). (Correct to some logarithmic base.)
Empirically extrapolating your data, it seems an approximate fit is N=~(2**(2.05*D)), although there are few datapoints so this may not be an accurate predictor.
The BBP algorithm you've chosen is iterative, and so it will take progressively longer to compute digits in the sequence. To compute digits 0..n, will take O(n^2) steps.
The wikipedia article gives a formula for calculating the n'th digit that doesn't require iteration, just exponentiation and rational numbers. This will not suffer the same loss of precision as the iterative algorithm, and you can compute any digit of pi as needed in constant time (or at worst logarithmic time, depending upon the implementation of exponentiation with modulus), so computing n digits will take O(n) time, possibly O(n log n).
from typing import TypeVar
from gmpy2 import mpz, mpq, powmod as gmpy2_powmod, is_signed as gmpy2_is_signed

__all__ = ['PiSlice']

Integer = TypeVar('Integer', int, mpz)

class PiSlice:
    '''
    References
    ----------
    "BBP digit-extraction algorithm for π"
    https://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%93Plouffe_formula
    '''
    version = '1.0.0'

    def __spigot(self, p: Integer, a: Integer, accuracy: mpq) -> mpq:
        def search_junction(p: Integer, a: Integer) -> Integer:
            n = mpz(0)
            divisor = 8 * p + a
            while 16 ** n < divisor:
                n += 1
                divisor -= 8
            return p - (n - 1)

        p = mpz(p)
        junction = search_junction(p, a)
        s = 0
        divisor = a
        for k in range(junction):
            s += mpq(gmpy2_powmod(16, p - k, divisor), divisor)
            divisor += 8
        for n in range(mpz(p - junction), -1, -1):
            if (intermediate := mpq(16 ** n, divisor)) >= accuracy:
                s += intermediate
                divisor += 8
            else:
                return s
        n = mpz(1)
        while (intermediate := mpq(mpq(1, 16 ** n), divisor)) >= accuracy:
            s += intermediate
            n += 1
            divisor += 8
        return s

    def __init__(self, p: Integer):
        '''
        '''
        self.p = p

    def raw(self, m: Integer) -> Integer:
        '''
        Parameters
        ----------
        m: Integer
            Sets the number of slices to return.

        Return
        ------
        random_raw: Integer
            Returns a hexadecimal slice of Pi.
        '''
        p = self.p
        spigot = self.__spigot
        accuracy = mpq(1, 2 ** (mpz(m + 64) * 4))   # 64 is the margin of accuracy.
        sum_spigot = 4 * spigot(p, 1, accuracy) - 2 * spigot(p, 4, accuracy) - spigot(p, 5, accuracy) - spigot(p, 6, accuracy)
        proper_fraction_of_sum_spigot = mpq(sum_spigot.numerator % sum_spigot.denominator, sum_spigot.denominator)
        if gmpy2_is_signed(proper_fraction_of_sum_spigot):
            proper_fraction_of_sum_spigot += 1
        return mpz(mpz(16) ** m * proper_fraction_of_sum_spigot)
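A hedged usage sketch (it assumes gmpy2 is installed and that PiSlice(0) starts at the beginning of the fractional part; the first hexadecimal digits of pi's fractional part are 243F6A88..., so, if I am reading the interface right, the call below should reproduce them):

pi = PiSlice(0)
print(hex(pi.raw(8)))   # expected: 0x243f6a88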