How to compute an expensive high precision sum in Python?

My problem is very simple. I would like to compute the following sum.
from __future__ import division
from scipy.misc import comb
import math
for n in xrange(2,1000,10):
    m = 2.2*n/math.log(n)
    print sum(sum(comb(n,a) * comb(n-a,b) * (comb(a+b,a)*2**(-a-b))**m
                  for b in xrange(n+1))
              for a in xrange(1,n+1))
However, Python gives RuntimeWarning: overflow encountered in multiply, prints nan as the output, and is also very, very slow.
Is there a clever way to do this?

The reason you get NaNs is that you end up evaluating numbers like
comb(600 + 600, 600) == 3.96509646226102e+359
This is too large to fit into a floating point number:
>>> numpy.finfo(float).max
1.7976931348623157e+308
Take logarithms to avoid it:
from __future__ import division, absolute_import, print_function
from scipy.special import betaln
from scipy.misc import logsumexp
import numpy as np
def binomln(n, k):
    # Assumes binom(n, k) >= 0
    return -betaln(1 + n - k, 1 + k) - np.log(n + 1)

for n in range(2, 1000, 10):
    m = 2.2*n/np.log(n)
    a = np.arange(1, n + 1)[np.newaxis,:]
    b = np.arange(n + 1)[:,np.newaxis]
    v = (binomln(n, a)
         + binomln(n - a, b)
         + m*binomln(a + b, a)
         - m*(a+b) * np.log(2))
    term = np.exp(logsumexp(v))
    print(term)

Use the Memoize pattern. With that, redefine comb:
@memoized
def newcomb(a, b):
    return comb(a, b)
And replace all calls to comb with newcomb. Also, for a minor improvement, remove the brackets. If you make explicit lists, you waste time constructing them. If you remove them, you're effectively using generator expressions.
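For example, the only difference is the brackets (term(a) here is a hypothetical stand-in for the real summand):

total = sum([term(a) for a in xrange(1, n+1)])  # builds the whole list in memory first
total = sum(term(a) for a in xrange(1, n+1))    # generator expression: no intermediate list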
Update:
This won't solve the nan issue, but does make it a lot faster.
For everyone who does not see this as being faster, are you applying the memoize decorator? On my machine, the original function takes 29.7s to go up to 200, but only 3.8s with the memoized version.
What memoize does is simply store all your invocations of comb in a lookup table. So if in a later iteration you're invoking comb with the same arguments as you had at some point in the past, it doesn't recalculate it - it simply looks it up in the lookup table.
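The decorator itself is not shown above; a minimal sketch of the classic pattern (the names are illustrative, not from the original answer) looks like this:

from functools import wraps

def memoized(fn):
    cache = {}
    @wraps(fn)
    def wrapper(*args):
        # Recompute only on a cache miss; comb's integer arguments are hashable
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]
    return wrapper

On modern Python, functools.lru_cache(maxsize=None) provides the same behaviour out of the box.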

Related

Python for loop: RuntimeWarning: divide by zero

I'm currently trying to implement the Newton-Raphson algorithm for some finance-based calculations.
I tried it in Python with a simple for loop, but I get RuntimeWarning: divide by zero encountered in double_scalars and no result from the approximation. I tried to fix it by checking every division myself, but I found no step where Python should be forced to divide by zero.
import numpy as np
import math as m
import scipy.stats as si

def totalvol_zero(M):
    v_0 = m.sqrt(2 * abs(M))
    return v_0

def C_prime(M,v):
    C_prime = si.norm.cdf(M/v + v/2) - m.exp(-M)*si.norm.cdf(M/v - v/2)
    return C_prime

def NR(M,C_prime_obs):
    v_0 = totalvol_zero(M)
    for k in range(0,7,1):
        v_0 = v_0 - ((C_prime(M,v_0) - C_prime_obs)/(m.sqrt(1/(m.pi * 2))*m.exp(-0.5*((M/v_0 + v_0/2)**2))))
        k += 1
    return v_0

print(NR(2,2))
This may be a really easy error or typo for some of you, since I am still a beginner in Python, but at the moment I just don't see anything wrong in this code and can't explain why this warning appears or why I get no value as a result.
Edit:
Sorry, I forgot about M and v. They are just explicit formulas, so I didn't think they were the cause of this problem.
def moneyness(S,K,d,r,t):
    F = S * m.exp((r-d)*t)
    M = m.log(F/K)
    return M

def totalvol(sigma,t):
    v = sigma * m.sqrt(t)
    return v
These are the explicit expressions for M and v. M defines the moneyness of an option, while v is its total volatility. But because I didn't express M and v in the for loop like that, and used them just as numbers for the Newton-Raphson, I don't think they will help solve the problem.
C_prime_obs is a converted call price of an option. The value should always be positive, but since I never divide by C_prime_obs, this doesn't change anything.

Does Python have a function which computes multinomial coefficients?

I was looking for a Python library function which computes multinomial coefficients.
I could not find any such function in any of the standard libraries.
For binomial coefficients (of which multinomial coefficients are a generalization) there is scipy.special.binom and also scipy.misc.comb. Also, numpy.random.multinomial draws samples from a multinomial distribution, and sympy.ntheory.multinomial.multinomial_coefficients returns a dictionary related to multinomial coefficients.
However, I could not find a multinomial coefficients function proper, which given a,b,...,z returns (a+b+...+z)!/(a! b! ... z!). Did I miss it? Is there a good reason there is none available?
I would be happy to contribute an efficient implementation to SciPy say. (I would have to figure out how to contribute, as I have never done this).
For background, they do come up when expanding (a+b+...+z)^n. Also, they count the ways of depositing a+b+...+z distinct objects into distinct bins such that the first bin contains a objects, etc. I need them occasionally for a Project Euler problem.
BTW, other languages do offer this function: Mathematica, MATLAB, Maple.
To partially answer my own question, here is my simple and fairly efficient implementation of the multinomial function:
def multinomial(lst):
    res, i = 1, 1
    for a in lst:
        for j in range(1,a+1):
            res *= i
            res //= j
            i += 1
    return res
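Because it uses exact integer arithmetic throughout, small cases are easy to check by hand. For example:

>>> multinomial([2, 3, 4])
1260

which matches 9! / (2! 3! 4!) = 1260.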
It seems from the comments so far that no efficient implementation of the function exists in any of the standard libraries.
Update (January 2020). As Don Hatch has pointed out in the comments, this can be further improved by looking for the largest argument (especially for the case that it dominates all others):
def multinomial(lst):
    res, i = 1, sum(lst)
    i0 = lst.index(max(lst))
    for a in lst[:i0] + lst[i0+1:]:
        for j in range(1,a+1):
            res *= i
            res //= j
            i -= 1
    return res
No, there is not a built-in multinomial library or function in Python.
Anyway, this time math can help you. A simple method for calculating the multinomial while keeping an eye on performance is to rewrite it using the characterization of the multinomial coefficient as a product of binomial coefficients:

C(n1 + n2 + ... + nk; n1, n2, ..., nk) = C(n1, n1) * C(n1 + n2, n2) * ... * C(n1 + n2 + ... + nk, nk)

where of course C(n, k) = n! / (k! (n - k)!).
Thanks to scipy.special.binom and the magic of recursion you can solve the problem like this:
from scipy.special import binom
def multinomial(params):
    if len(params) == 1:
        return 1
    return binom(sum(params), params[-1]) * multinomial(params[:-1])
where params = [n1, n2, ..., nk].
Note: Splitting the multinomial into a product of binomials is also good for preventing overflow in general.
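A quick usage example (keep in mind that scipy.special.binom returns a float, so very large results lose exactness, as discussed in a later answer):

>>> multinomial([1, 2, 3])
60.0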
You wrote "sympy.ntheory.multinomial.multinomial_coefficients returns a dictionary related to multinomial coefficients", but it is not clear from that comment if you know how to extract the specific coefficients from that dictionary. Using the notation from the wikipedia link, the SymPy function gives you all the multinomial coefficients for the given m and n. If you only want a specific coefficient, just pull it out of the dictionary:
In [39]: from sympy import ntheory
In [40]: def sympy_multinomial(params):
...: m = len(params)
...: n = sum(params)
...: return ntheory.multinomial_coefficients(m, n)[tuple(params)]
...:
In [41]: sympy_multinomial([1, 2, 3])
Out[41]: 60
In [42]: sympy_multinomial([10, 20, 30])
Out[42]: 3553261127084984957001360
Busy Beaver gave an answer written in terms of scipy.special.binom. A potential problem with that implementation is that binom(n, k) returns a floating point value. If the coefficient is large enough, it will not be exact, so it would probably not help you with a Project Euler problem. Instead of binom, you can use scipy.special.comb, with the argument exact=True. This is Busy Beaver's function, modified to use comb:
In [46]: from scipy.special import comb
In [47]: def scipy_multinomial(params):
    ...:     if len(params) == 1:
    ...:         return 1
    ...:     coeff = (comb(sum(params), params[-1], exact=True) *
    ...:              scipy_multinomial(params[:-1]))
    ...:     return coeff
    ...:
In [48]: scipy_multinomial([1, 2, 3])
Out[48]: 60
In [49]: scipy_multinomial([10, 20, 30])
Out[49]: 3553261127084984957001360
Here are two approaches, one using factorials, one using Stirling's approximation.
Using factorials
You can define a function to return multinomial coefficients in a single line using vectorised code (instead of for-loops) as follows:
from scipy.special import factorial
def multinomial_coeff(c):
    return factorial(c.sum()) / factorial(c).prod()
(Where c is an np.ndarray containing the number of counts for each different object). Usage example:
>>> import numpy as np
>>> coeffs = np.array([2, 3, 4])
>>> multinomial_coeff(coeffs)
1260.0
In some cases this might be slower, because certain factorial expressions get computed multiple times; in other cases it might be faster, because the vectorised operations run in NumPy's compiled routines. It also reduces the required number of lines and is arguably more readable. If someone has the time to run speed tests on these different options, I'd be interested to see the results; a minimal timing harness is sketched at the end of this answer.
Using Stirling's approximation
In fact the logarithm of the multinomial coefficient is much faster to compute (based on Stirling's approximation) and allows computation of much larger coefficients:
from scipy.special import gammaln
def log_multinomial_coeff(c):
    return gammaln(c.sum()+1) - gammaln(c+1).sum()
Usage example:
>>> import numpy as np
>>> coeffs = np.array([2, 3, 4])
>>> np.exp(log_multinomial_coeff(coeffs))
1259.999999999999
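As a minimal sketch of the speed tests mentioned above (assuming both functions from this answer are defined in the current session), timeit from the standard library can compare them directly:

import timeit
import numpy as np

coeffs = np.array([2, 3, 4])
print(timeit.timeit(lambda: multinomial_coeff(coeffs), number=100000))
print(timeit.timeit(lambda: log_multinomial_coeff(coeffs), number=100000))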
Your own answer (the accepted one) is quite good, and is especially simple. However, it does have one significant inefficiency: your outer loop for a in lst is executed one more time than is necessary. In the first pass through that loop, the values of i and j are always identical, so the multiplications and divisions do nothing. In your example multinomial([123, 134, 145]), there are 123 unneeded multiplications and divisions, adding time to the code.
I suggest finding the maximum value in the parameters and removing it, so those unneeded operations are not done. That adds complexity to the code but reduces the execution time, especially for short lists of large numbers. My code below executes multcoeff(123, 134, 145) in 111 microseconds, while your code takes 141 microseconds. That is not a large improvement, but it adds up if you compute many coefficients. So here is my code. This also takes individual values as parameters rather than a list, so that is another difference from your code.
def multcoeff(*args):
    """Return the multinomial coefficient
    (n1 + n2 + ...)! / n1! / n2! / ..."""
    if not args:  # no parameters
        return 1
    # Find and store the index of the largest parameter so we can skip
    # it (for efficiency)
    skipndx = args.index(max(args))
    newargs = args[:skipndx] + args[skipndx + 1:]
    result = 1
    num = args[skipndx] + 1  # a factor in the numerator
    for n in newargs:
        for den in range(1, n + 1):  # a factor in the denominator
            result = result * num // den
            num += 1
    return result
Starting with Python 3.8, the standard library includes the math.comb function (binomial coefficient), and since the multinomial coefficient can be computed as a product of binomial coefficients, we can implement it without external libraries:
import math
def multinomial(*params):
    return math.prod(math.comb(sum(params[:i]), x) for i, x in enumerate(params, 1))
multinomial(10, 20, 30) # 3553261127084984957001360

Compute sum with huge intermediate values

I would like to compute

M(n) = sum_{k=1}^{n} C(n,k) * (1/n) * (1 - k/n)^(2n-k) * (k/n)^(k-1)

for values of n up to 1000000 as accurately as possible. Here is some sample code.
from __future__ import division
from scipy.misc import comb
def M(n):
    return sum(comb(n,k,exact=True)*(1/n)*(1-k/n)**(2*n-k)*(k/n)**(k-1) for k in xrange(1,n+1))

for i in xrange(1,1000000,100):
    print i,M(i)
The first problem is that I get OverflowError: long int too large to convert to float when n = 1101. This happens because comb(n,k,exact=True) returns an integer too large to be converted to a float. The end result, however, is always a number around 0.159.
I asked a related question at How to compute sum with large intermediate values however this question is different for three main reasons.
The formula I want to compute is different which causes different problems.
The solution proposed before to use exact=True does not help here as can be seen in the example I gave. Coding up my own implementation of comb is also not going to work as I still need to perform the floating point division.
I need to compute the answer for much bigger values than before which causes new problems. I suspect it can't be done without coding up the sum in some clever way.
A solution that doesn't crash is to use
from fractions import Fraction
def M2(n):
    return sum(comb(n,k,exact=True)*Fraction(1,n)*(1-Fraction(k,n))**(2*n-k)*Fraction(k,n)**(k-1) for k in xrange(1,n+1))

for i in xrange(1,1000000,100):
    print i, M2(i)*1.0
Unfortunately it is now so slow that I don't get an answer for n=1101 in a reasonable amount of time.
So the second problem is how to make it fast enough to complete for large n.
You can compute each summand with a logarithm transformation that replaces multiplication, division, and exponentiation with addition, subtraction, and multiplication, respectively.
from math import log, exp

def summand(n,k):
    lk = log(k)
    ln = log(n)
    a = (lk - ln)*(k - 1)
    b = (log(n - k) - ln)*(2*n - k)
    c = -ln
    d = sum(log(x) for x in xrange(n - k + 1, n + 1)) - sum(log(x) for x in xrange(1, k + 1))
    return exp(a + b + c + d)

def M(n):
    return sum(summand(n,k) for k in xrange(1,n))
Note that when k=n the summand will be zero so I do not compute it since the logarithm will be undefined.
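With that in place, values of n that previously overflowed evaluate quickly; for example:

print M(1101)  # per the question, the result should be a number around 0.159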
You can use gmpy2. It has arbitrary precision floating point arithmetic with large exponent bounds.
from __future__ import division
from gmpy2 import comb,mpfr,fsum
def M(n):
    return fsum(comb(n,k)*(mpfr(1)/n)*(mpfr(1)-mpfr(k)/n)**(mpfr(2)*n-k)*(mpfr(k)/n)**(k-1) for k in xrange(1,n+1))

for i in xrange(1,1000000,100):
    print i,M(i)
Here is an excerpt of the output:
2001 0.15857490038127975
2101 0.15857582611615381
2201 0.15857666768820194
2301 0.15857743607577454
2401 0.15857814042739268
2501 0.15857878842787806
2601 0.15857938657957615
Disclaimer: I maintain gmpy2.
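Note that the code above runs at mpfr's default precision of 53 bits; if more accuracy is needed, it can be raised on the context. A minimal sketch, assuming gmpy2 2.x:

import gmpy2
gmpy2.get_context().precision = 100  # bits of working precision for mpfr arithmetic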
A rather brutal method is to compute all the factors and then multiply in such a way that the result stays around 1.0 (Python 3.x):
def M(n):
    return sum(summand(n, k) for k in range(1, n + 1))

def f1(n, k):
    for i in range(k - 1):
        yield k
    for i in range(k):
        yield n - i

def f2(n, k):
    for i in range(k - 1):
        yield 1 / n
    for i in range(2 * n - k):
        yield 1 - k / n
    yield 1 / n
    for i in range(2, k + 1):
        yield 1 / i

def summand(n, k):
    result = 1.0
    factors1 = f1(n, k)
    factors2 = f2(n, k)
    while True:
        empty1 = False
        for factor in factors1:
            result *= factor
            if result > 1:
                break
        else:
            empty1 = True
        for factor in factors2:
            result *= factor
            if result < 1:
                break
        else:
            if empty1:
                break
    return result
For M(1101) I get 0.15855899364641846, but it takes a few seconds. M(2000) takes about 14 seconds and yields 0.15857489065619598.
(I'm sure it can be optimised.)

Mathematica to Python

How can this Mathematica code be ported to Python? I do not know the Mathematica syntax and am having a hard time understanding how this is described in a more traditional language.
Source (pg 5): http://subjoin.net/misc/m496pres1.nb.pdf
This cannot be ported to Python directly as the definition a[j] uses the Symbolic Arithmetic feature of Mathematica.
a[j] is basically the coefficient of x^j in the series expansion of that rational function inside Apart.
Assume you have a[j], then f[n] is easy. A Block in Mathematica basically introduces a scope for variables. The first list initializes the variable, and the rest is the execution of the code. So
from __future__ import division
def f(n):
    v = n // 5
    q = v // 20
    r = v % 20
    return sum(binomial(q+5-j, 5) * a[r+20*j] for j in range(5))
(binomial is the Binomial coefficient.)
Using the proposed solutions from the previous answers I found that sympy sadly doesn't compute the apart() of the rational function immediately. It somehow gets confused. Moreover, the Python list of coefficients returned by Poly.all_coeffs() has different semantics than a Mathematica list. Hence the try-except clause in the definition of a().
The following code does work and the output, for some tested values, concurs with the answers given by the Mathematica formula in Mathematica 7:
from __future__ import division
from sympy import expand, Poly, binomial, apart
from sympy.abc import x
A = Poly(apart(expand(((1-x**20)**5)) / expand((((1-x)**2)*(1-x**2)*(1-x**5)*(1-x**10))))).all_coeffs()

def a(n):
    try:
        return A[n]
    except IndexError:
        return 0

def f(n):
    v = n // 5
    q = v // 20
    r = v % 20
    return sum(a(r+20*j) * binomial(q+5-j, 5) for j in range(5))

print map(f, [100, 50, 1000, 150])
The symbolics can be done with sympy. Combined with KennyTM's answer, something like this might be what you want:
from __future__ import division
from sympy import Symbol, apart, binomial
x = Symbol('x')
poly = (1-x**20)**5 / ((1-x)**2 * (1-x**2) * (1-x**5) * (1-x**10))
poly2 = apart(poly,x)

def a(j):
    return poly2.coeff(x**j)

def f(n):
    v = n // 5
    q = v // 20
    r = v % 20
    return sum(binomial(q+5-j, 5)*a(r+20*j) for j in range(5))
Although I have to admit that f(n) does not work (I'm not very good at Python).

Optimized method for calculating cosine distance in Python

I wrote a method to calculate the cosine distance between two arrays:
from math import sqrt

def cosine_distance(a, b):
    if len(a) != len(b):
        return False
    numerator = 0
    denoma = 0
    denomb = 0
    for i in range(len(a)):
        numerator += a[i]*b[i]
        denoma += abs(a[i])**2
        denomb += abs(b[i])**2
    result = 1 - numerator / (sqrt(denoma)*sqrt(denomb))
    return result
Running it can be very slow on a large array. Is there an optimized version of this method that would run faster?
Update: I've tried all the suggestions to date, including scipy. Here's the version to beat, incorporating suggestions from Mike and Steve:
def cosine_distance(a, b):
    if len(a) != len(b):
        raise ValueError, "a and b must be same length"  # Steve
    numerator = 0
    denoma = 0
    denomb = 0
    for i in range(len(a)):  # Mike's optimizations:
        ai = a[i]  # only calculate once
        bi = b[i]
        numerator += ai*bi  # faster than exponent (barely)
        denoma += ai*ai  # strip abs() since it's squaring
        denomb += bi*bi
    result = 1 - numerator / (sqrt(denoma)*sqrt(denomb))
    return result
If you can use SciPy, you can use cosine from spatial.distance:
http://docs.scipy.org/doc/scipy/reference/spatial.distance.html
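A minimal usage sketch (scipy's cosine returns the cosine distance, i.e. 1 minus the cosine similarity):

from scipy.spatial.distance import cosine
print cosine([1, 0, 0], [0, 1, 0])  # 1.0 for orthogonal vectors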
If you can't use SciPy, you could try to obtain a small speedup by rewriting your Python (EDIT: but it didn't work out like I thought it would, see below).
from itertools import izip
from math import sqrt

def cosine_distance(a, b):
    if len(a) != len(b):
        raise ValueError, "a and b must be same length"
    numerator = sum(tup[0] * tup[1] for tup in izip(a,b))
    denoma = sum(avalue ** 2 for avalue in a)
    denomb = sum(bvalue ** 2 for bvalue in b)
    result = 1 - numerator / (sqrt(denoma)*sqrt(denomb))
    return result
It is better to raise an exception when the lengths of a and b are mismatched.
By using generator expressions inside of calls to sum() you can calculate your values with most of the work being done by the C code inside of Python. This should be faster than using a for loop.
I haven't timed this so I can't guess how much faster it might be. But the SciPy code is almost certainly written in C or C++ and it should be about as fast as you can get.
If you are doing bioinformatics in Python, you really should be using SciPy anyway.
EDIT: Darius Bacon timed my code and found it slower. So I timed my code and... yes, it is slower. The lesson for all: when you are trying to speed things up, don't guess, measure.
I am baffled as to why my attempt to put more work on the C internals of Python is slower. I tried it for lists of length 1000 and it was still slower.
I can't spend any more time on trying to hack the Python cleverly. If you need more speed, I suggest you try SciPy.
EDIT: I just tested by hand, without timeit. I find that for short a and b, the old code is faster; for long a and b, the new code is faster; in both cases the difference is not large. (I'm now wondering if I can trust timeit on my Windows computer; I want to try this test again on Linux.) I wouldn't change working code to try to get it faster. And one more time I urge you to try SciPy. :-)
(I originally thought) you're not going to speed it up a lot without breaking out to C (like numpy or scipy) or changing what you compute. But here's how I'd try that, anyway:
from itertools import imap
from math import sqrt
from operator import mul

def cosine_distance(a, b):
    assert len(a) == len(b)
    return 1 - (sum(imap(mul, a, b))
                / sqrt(sum(imap(mul, a, a))
                       * sum(imap(mul, b, b))))
It's roughly twice as fast in Python 2.6 with 500k-element arrays. (After changing map to imap, following Jarret Hardie.)
Here's a tweaked version of the original poster's revised code:
from itertools import izip
from math import sqrt

def cosine_distance(a, b):
    assert len(a) == len(b)
    ab_sum, a_sum, b_sum = 0, 0, 0
    for ai, bi in izip(a, b):
        ab_sum += ai * bi
        a_sum += ai * ai
        b_sum += bi * bi
    return 1 - ab_sum / sqrt(a_sum * b_sum)
It's ugly, but it does come out faster. . .
Edit: And try Psyco! It speeds up the final version by another factor of 4. How could I forget?
No need to take abs() of a[i] and b[i] if you're squaring it.
Store a[i] and b[i] in temporary variables, to avoid doing the indexing more than once.
Maybe the compiler can optimize this, but maybe not.
Check into the **2 operator. Is it simplifying it into a multiply, or is it using a general power function (log, multiply by 2, antilog)? A quick check is sketched after these suggestions.
Don't do sqrt twice (though the cost of that is small). Do sqrt(denoma * denomb).
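Here is a quick way to check the **2 question with timeit from the standard library (results will vary by interpreter version):

import timeit
print timeit.timeit('x ** 2', setup='x = 1.5')
print timeit.timeit('x * x', setup='x = 1.5')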
Similar to Darius Bacon's answer, I've been toying with operator and itertools to produce a faster answer. The following seems to be 1/3 faster on a 500-item array according to timeit:
from math import sqrt
from itertools import imap
from operator import mul

def op_cosine(a, b):
    dot_prod = sum(imap(mul, a, b))
    a_veclen = sqrt(sum(i ** 2 for i in a))
    b_veclen = sqrt(sum(i ** 2 for i in b))
    return 1 - dot_prod / (a_veclen * b_veclen)
This is faster for arrays of around 1000+ elements.
from numpy import array
from math import sqrt

def cosine_distance(a, b):
    a = array(a)
    b = array(b)
    numerator = (a*b).sum()
    denoma = (a*a).sum()
    denomb = (b*b).sum()
    result = 1 - numerator / sqrt(denoma*denomb)
    return result
Using the C code inside of SciPy wins big for long input arrays. Using simple and direct Python wins for short input arrays; Darius Bacon's izip()-based code benchmarked out best. Thus, the ultimate solution is to decide which one to use at runtime, based on the length of the input arrays:
from scipy.spatial.distance import cosine as scipy_cos_dist
from itertools import izip
from math import sqrt

def cosine_distance(a, b):
    len_a = len(a)
    assert len_a == len(b)
    if len_a > 200:  # 200 is a magic value found by benchmark
        return scipy_cos_dist(a, b)
    # function below is basically just Darius Bacon's code
    ab_sum = a_sum = b_sum = 0
    for ai, bi in izip(a, b):
        ab_sum += ai * bi
        a_sum += ai * ai
        b_sum += bi * bi
    return 1 - ab_sum / sqrt(a_sum * b_sum)
I made a test harness that tested the functions with different length inputs, and found that around length 200 the SciPy function started to win. The bigger the input arrays, the bigger it wins. For very short length arrays, say length 3, the simpler code wins. This function adds a tiny amount of overhead to decide which way to do it, then does it the best way.
In case you are interested, here is the test harness:
from darius2 import cosine_distance as fn_darius2
fn_darius2.__name__ = "fn_darius2"

from ult import cosine_distance as fn_ult
fn_ult.__name__ = "fn_ult"

from scipy.spatial.distance import cosine as fn_scipy
fn_scipy.__name__ = "fn_scipy"

import random
import time

lst_fn = [fn_darius2, fn_scipy, fn_ult]

def run_test(fn, lst0, lst1, test_len):
    start = time.time()
    for _ in xrange(test_len):
        fn(lst0, lst1)
    end = time.time()
    return end - start

for data_len in range(50, 500, 10):
    a = [random.random() for _ in xrange(data_len)]
    b = [random.random() for _ in xrange(data_len)]
    print "len(a) ==", len(a)
    test_len = 10**3
    for fn in lst_fn:
        n = fn.__name__
        r = fn(a, b)
        t = run_test(fn, a, b, test_len)
        print "%s:\t%f seconds, result %f" % (n, t, r)
from math import sqrt

def cd(a,b):
    if(len(a)!=len(b)):
        raise ValueError, "a and b must be the same length"
    rn = range(len(a))
    adb = sum([a[k]*b[k] for k in rn])
    nma = sqrt(sum([a[k]*a[k] for k in rn]))
    nmb = sqrt(sum([b[k]*b[k] for k in rn]))
    result = 1 - adb / (nma*nmb)
    return result
Your updated solution still has two square roots. You can reduce this to one by replacing the sqrt line with:
result = 1 - numerator / sqrt(denoma*denomb)
A multiply is typically quite a bit quicker than a sqrt. It might not seem much as it is only called once in the function, but it sounds like you are calculating a lot of cosine distances, so the improvement will add up.
Your code looks like it should be ripe for vector optimizations. So if cross-platform support is not an issue and you want to speed it up even further, you could code the cosine distance in C and make sure your compiler aggressively vectorizes the resulting code (even a Pentium II is capable of some floating point vectorisation).
