What is faster, scipy fsolve vs root?

I am trying to find the root of a function. I have used fsolve in the past, but as my data sets get larger it seems to get more inconsistent (currently n = 187). Now I am looking for alternatives and have found scipy.optimize.root. I don't understand what the difference is between the two, or which one is better in my scenario.
I am trying to solve a system of 3N coupled equations to find the vectors x, y and z.
My code is the following, where onrec, inrec and rec are predetermined lists of values:
import scipy as sp
import numpy as np
from scipy.optimize import fsolve
import math

def f(w, n, onrec, inrec, rec):
    F = [0]*3*n
    for i in range(n):
        F[i] = -onrec[i]        # k_i>
        F[n+i] = -inrec[i]      # k_i<
        F[(2*n)+i] = -rec[i]    # k_i<>
        for j in range(n):
            if i == j:
                continue
            # below, the three functions are stated. w[i] = x_i, w[n+i] = y_i, w[2*n+i] = z_i
            F[i] += (w[i]*w[n+j])/(1+w[i]*w[n+j]+w[j]*w[n+i]+w[2*n+i]*w[2*n+j])
            F[n+i] += (w[j]*w[n+i])/(1+w[i]*w[n+j]+w[j]*w[n+i]+w[2*n+i]*w[2*n+j])
            F[2*n+i] += (w[(2*n)+i]*w[(2*n)+j])/(1+w[i]*w[n+j]+w[j]*w[n+i]+w[2*n+i]*w[2*n+j])
    return F

u = [1]*3*n
s = fsolve(f, u, args=(n, onrec, inrec, rec))

As @Bob suggests, step 1 must be to vectorise your inner function. After that, to your main question: it's not the right thing to ask, because
fsolve is just a wrapper around the hybr algorithm, which is already provided as one option in root, so there is no speed difference to choose between (a short equivalence sketch follows below); and
you should worry about correctness before performance.
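To make that first point concrete, here is a tiny toy system (my example, not the asker's problem); fsolve and root(method='hybr') drive the same underlying MINPACK solver and return the same root:

from scipy.optimize import fsolve, root

def toy(v):
    x, y = v
    return [x + 2*y - 2, x**2 + y**2 - 1]

print(fsolve(toy, [1.0, 1.0]))                  # root as an ndarray
print(root(toy, [1.0, 1.0], method='hybr').x)   # same root via the root() interface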
It's almost certain that the optimiser is giving up on your problem and the results are invalid. The only circumstance under which I was able to convince it to converge was with n=4 and the Levenberg-Marquardt algorithm. If (four years later) you still need to solve this, I recommend bringing the problem to a different community like Mathematics StackExchange. In the meantime, though, here is a vectorised example with one converging solution:
import numpy as np
from numpy.random import default_rng
from scipy.optimize import root

def lhs(xyz: np.ndarray) -> np.ndarray:
    x, y, z = xyz[..., np.newaxis]
    n = len(x)
    coeff = (1 - np.eye(n)) / (1 + x*y.T + x.T*y + z*z.T)
    numerator = np.stack((
        x * y.T,
        x.T * y,
        z.T * z,
    ))
    result = (numerator * coeff).sum(axis=2)
    return result

def to_root(w: np.ndarray, recs: np.ndarray) -> np.ndarray:
    xyz = w.reshape((3, -1))
    rhs = lhs(xyz) - recs
    return rhs.ravel()

def test_data(n: int = 4) -> tuple[np.ndarray, np.ndarray]:
    rand = default_rng(seed=0)
    secret_solution = rand.uniform(-1, 1, (3, n))
    recs = lhs(secret_solution)
    return secret_solution, recs

def method_search() -> None:
    secret_solution, recs = test_data()
    for method in ('hybr', 'lm', 'broyden1', 'broyden2', 'anderson',
                   'linearmixing', 'diagbroyden', 'excitingmixing',
                   'krylov', 'df-sane'):
        try:
            result = root(
                to_root, x0=np.ones_like(recs),
                args=(recs,), method=method,
                options={
                    'maxiter': 5_000,
                    'maxfev': 5_000,
                },
            )
        except Exception:
            continue
        print(method, result.message,
              f'nfev={getattr(result, "nfev", None)} nit={getattr(result, "nit", None)}')
        print('Estimated RHS:')
        print(lhs(result.x.reshape((3, -1))))
        print('Estimated error:')
        print(to_root(result.x, recs))
        print()

def successful_example() -> None:
    n = 4
    print('n=', n)
    secret_solution, recs = test_data(n=n)
    result = root(
        to_root, x0=np.ones_like(recs),
        args=(recs,), method='lm',
    )
    print(result.message)
    print('function evaluations:', result.nfev)
    error = to_root(result.x, recs)
    print('Error:', error.dot(error))
    print()

if __name__ == '__main__':
    successful_example()
Output:

n= 4
The relative error between two consecutive iterates is at most 0.000000
function evaluations: 1221
Error: 8.721381160163159e-30

Related

How to give an initial simplex to Scipy.minimize?

I tried to pass an initial simplex to Nelder-Mead but got an exception in Python saying the shape is wrong. However, I can't figure out what the shape should be, nor where to look it up.
When using the scipy Nelder-Mead algorithm, I want a flexible initial simplex, and also the ability to restart the optimization from a certain point without building up the initial simplex again.
However, I get an exception that the shape of my initial simplex is wrong:
ValueError: `initial_simplex` should be an array of shape (N+1,N)
I could not find a good description or an example of how to pass an initial simplex to the algorithm. Can someone provide a minimal example including the initial_simplex parameter?
After testing, I found a way that works.
This working example passes a simplex into the algorithm; the simplex is evaluated and the optimization is started from there:
from math import pi, sin
from random import uniform
import matplotlib.pyplot as plt
from scipy.optimize import minimize

def function(x, a, b, c):
    return a * x ** 2 + b * x + c

def cost_function(guess):
    y_test = [function(x_i, *guess) for x_i in x_range]
    differences = [(y_i - data_i)**2 for y_i, data_i in zip(y_test, data)]
    opt_plot.set_ydata(y_test)
    plt.pause(1e-6)
    cost = sum(differences) / len(differences)
    print('cost', cost, 'guess', guess, end='\n')
    return cost

def get_initial_simplex(guess, delta_0=.2):
    print('get simplex')
    simplex = []
    simplex.append([cost_function(guess), guess])
    for i in range(len(guess)):
        simplex_guess = guess.copy()
        simplex_guess[i] += delta_0
        cost = cost_function(simplex_guess)
        simplex.append([cost, simplex_guess])
    simplex = sorted(simplex, key=lambda x: x[0])
    print('done')
    return [elem[1] for elem in simplex]

# create data
x_range = [i / 100 for i in range(-100, 100)]
data = [3 * sin(x_i + pi / 2) + 2 for x_i in x_range]

# plot the data:
fig, ax = plt.subplots()
ax.plot(x_range, data)
opt_plot, = ax.plot(x_range, [0 for _ in data])
guess = [uniform(-1, 1) for _ in range(3)]

# start optimization of mse function
options = {
    'initial_simplex': get_initial_simplex(guess)
}
result = minimize(cost_function, guess, method='Nelder-Mead', options=options)
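For reference, the shape requirement from the error message is all there is to it: for N parameters, initial_simplex must be an (N+1, N) array whose rows are the N+1 starting vertices. A stripped-down sketch without the plotting (my example, using a simple Rosenbrock cost):

import numpy as np
from scipy.optimize import minimize

def rosen(v):
    x, y = v
    return (1 - x)**2 + 100 * (y - x**2)**2

x0 = np.array([0.0, 0.0])            # N = 2 parameters
simplex = np.array([[0.0, 0.0],      # shape (N+1, N) = (3, 2): one vertex per row
                    [0.2, 0.0],
                    [0.0, 0.2]])
res = minimize(rosen, x0, method='Nelder-Mead',
               options={'initial_simplex': simplex})
print(res.x)                         # should approach [1, 1]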

What is the most efficient way to find the root of a complex function in Sympy (a Python library for symbolic mathematics)?

I am using the root_scalar function from the scipy.optimize module to find the root of a complex function defined in sympy. However, the function takes around 15-20 seconds to return the root, and I need to find a way to speed up this computation. Is it possible to convert the entire sympy function to scipy for faster processing, or is there any other way to optimize this process and reduce the computation time?
from sympy.stats import Gamma, density, cdf, E, variance
from sympy import Symbol, pprint, simplify
import sympy as sp
import numpy as np
from scipy.optimize import root_scalar
import time

l = 7
m = 30
p = 17
w = 6
K = 500
c = 6
h = 0.1
mean = 500
std = 296

def calculate_mean(days):
    return mean*days

def calculate_std(days):
    return std*np.sqrt(days)

def calculate_mean_std(days):
    mean = calculate_mean(days)
    std = calculate_std(days)
    return mean, std

mean_m, std_m = calculate_mean_std(m)
mean_l, std_l = calculate_mean_std(l)
shape_m = (mean_m/std_m)**2
scale_m = std_m**2/mean_m
shape_l = (mean_l/std_l)**2
scale_l = std_l**2/mean_l

k = Symbol("k", positive=True)
theta = Symbol("theta", positive=True)
x = Symbol("x")
X = Gamma("z", k, theta)
P = density(X)(x)
C = cdf(X, meijerg=True)(x)

cdf_m_symb = C.subs([(theta, scale_m), (k, shape_m)])
cdf_l_symb = C.subs([(theta, scale_l), (k, shape_l)])
pdf_m_symb = P.subs([(theta, scale_m), (k, shape_m)])
pdf_l_symb = P.subs([(theta, scale_l), (k, shape_l)])
max_Q = np.ceil(mean*(m+l)).astype(int)

def g(r: float) -> float:
    result = sp.N(-p + (p + w * cdf_m_symb.subs(x, max_Q)) * cdf_l_symb.subs(x, r) + \
                  w * sp.Integral(cdf_l_symb * pdf_m_symb.subs(x, (r + max_Q - x)), (x, 0, r)))
    return result

start_time = time.time()
r0 = 200               # initial estimate for the root
bracket = (-10, 5000)  # the upper and lower bounds of where the root is
solution = root_scalar(g, x0=r0, bracket=bracket)
print(solution)                    # info about the convergence
print("Results: ", solution.root)  # the actual number
end_time = time.time()
print("Time taken:", end_time - start_time)
Here is the output from the above code
converged: True
flag: 'converged'
function_calls: 10
iterations: 9
root: 3966.9429368680453
Results: 3966.9429368680453
Time taken: 13.81236743927002
I have provided the code that I am currently using and the output that it produces. Any suggestions or examples of how to optimize this process would be greatly appreciated.
Compute the integral numerically, tabulate g(x), and interpolate the inverse relation x(g). Then your root-finding is nothing but evaluating a spline at a given point. It can't get much faster than that.
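A rough sketch of that idea (my code, with a cheap stand-in for the expensive g; this assumes g is monotonic over the bracket so the inverse interpolation is well defined):

import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

def slow_g(r):
    # placeholder for the expensive sympy-based g(r)
    return np.tanh((r - 3966.9) / 500.0)

r_grid = np.linspace(-10, 5000, 200)             # tabulate g once on a grid
g_grid = np.array([slow_g(r) for r in r_grid])

inverse = InterpolatedUnivariateSpline(g_grid, r_grid, k=3)  # interpolate r as a function of g
print(float(inverse(0.0)))                        # the root is where g == 0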

more accuracy with scipy interp1d

I am trying to implement a non parametric estimation of the KL divergence shown in this paper
Here is my code:
import numpy as np
import math
import itertools
import random
from scipy.interpolate import interp1d

def log(x):
    if x > 0:
        return math.log(x)
    else:
        return 0

g = lambda x, inp, N: sum(0.5 + 0.5 * np.sign(x - inp)) / N

def ecdf(x, N):
    out = [g(i, x, N) for i in x]
    fun = interp1d(x, out, kind='linear', bounds_error=False, fill_value=(0, 1))
    return fun

def KL_est(x, y):
    ex = min(np.diff(sorted(np.unique(x))))
    ey = min(np.diff(sorted(np.unique(y))))
    e = min(ex, ey) * 0.9
    N = len(x)
    x.sort()
    y.sort()
    P = ecdf(x, N)
    Q = ecdf(y, N)
    KL = sum(log(v) for v in ((P(x) - P(x - e)) / (Q(x) - Q(x - e)))) / N
    return KL
My trouble is with scipy interp1d. I am using the function returned from interp1d to find the value of new inputs. The problem is, some of the input values are very close (10^-5 apart) and the function returns the same value for both. In my code above, Q(x) - Q(x-e) leads to a divide by zero error.
Here is some test code that reproduces the problem:
x = np.random.normal(0, 1, 10)
y = np.random.normal(0, 1, 10)
ex = min(np.diff(sorted(np.unique(x))))
ey = min(np.diff(sorted(np.unique(y))))
e = min(ex,ey) * 0.9
N = len(x)
x.sort()
y.sort()
P = ecdf(x,N)
Q = ecdf(y,N)
KL = sum(log(v) for v in ((P(x)-P(x-e))/(Q(x)-Q(x-e))) ) / N
How would I go about getting a more accurate interpolation?
As e gets small you are effectively trying to compute the ratio of derivatives of P and Q numerically. As you are finding, you run out of precision really quickly in floating point doing it this way.
An alternate approach would be to use an interpolation function that can return derivatives directly. For example, you could try scipy.interpolate.InterpolatedUnivariateSpline. You were saying kind='linear' to interp1d, so the equivalent is k=1. Once you construct it, the spline has method derivatives() that gives you all the derivatives at different points. For small values of e you could switch to using the derivative.
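A small sketch of that approach (illustrative only, with made-up ECDF-style data rather than the asker's exact ecdf helper):

import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

rng = np.random.default_rng(0)
x = np.sort(rng.normal(0, 1, 50))
ecdf_values = np.arange(1, len(x) + 1) / len(x)

P = InterpolatedUnivariateSpline(x, ecdf_values, k=1)  # k=1 matches kind='linear'

point = 0.3
value, slope = P.derivatives(point)  # order-0 value and order-1 derivative at the point
print(value, slope)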

Multiprocessing python function for numerical calculations

Hoping to get some help here with parallelising my Python code. I have been struggling with it for a while and run into several errors whichever way I try; currently the code takes about 2-3 hours to complete. The code is given below:
import numpy as np
from scipy.constants import Boltzmann, elementary_charge as kb, e
import multiprocessing
from functools import partial

Tc = 9.2
x = []
g = []

def Delta(T):
    '''
    Delta(T) takes a temperature as an input and calculates a
    temperature dependent variable based on Tc which is defined as a
    global parameter
    '''
    d0 = (np.pi/1.78)*kb*Tc
    D0 = d0*(np.sqrt(1-(T**2/Tc**2)))
    return D0

def element_in_sum(T, n, phi):
    D = Delta(T)
    matsubara_frequency = (np.pi * kb * T) * (2*n + 1)
    factor_d = np.sqrt((D**2 * np.cos(phi/2)**2) + matsubara_frequency**2)
    element = ((2 * D * np.cos(phi/2)) / factor_d) * np.arctan((D * np.sin(phi/2)) / factor_d)
    return element

def sum_elements(T, M, phi):
    '''
    sum_elements(T, M, phi) is the most computationally heavy part
    of the calculations; the larger the M value the more accurate the
    results are.
    T: temperature
    M: number of steps for the calculation; the larger the more accurate
    phi: the phase of the system, between 0 and pi
    '''
    X = list(np.arange(0, M, 1))
    Y = [element_in_sum(T, n, phi) for n in X]
    return sum(Y)

def KO_1(M, T, phi):
    Iko1Rn = (2 * np.pi * kb * T / e) * sum_elements(T, M, phi)
    return Iko1Rn

def main():
    for j in range(1, 92):
        T = 0.1*j
        for i in range(1, 314):
            phi = 0.01*i
            pool = multiprocessing.Pool()
            result = pool.apply_async(KO_1, args=(26000, T, phi,))
            g.append(result)
        pool.close()
        pool.join()
        A = max(g)
        x.append(A)
        del g[:]
My approach was to send the KO_1 function into a multiprocessing pool, but I either get a pickling error or a 'too many open files' error. Any help is greatly appreciated, and if multiprocessing is the wrong approach I would love any guidance.
I haven't tested your code, but you can do several things to improve it.
First of all, don't create arrays unnecessarily. sum_elements creates three array-like containers where a single generator would do: np.arange creates a numpy array, list() turns it into a list, and the list comprehension then builds yet another list, so the function does several times the work it needs to.
The correct way to implement it (in python3) would be:
def sum_elements(T, M, phi):
    return sum(element_in_sum(T, n, phi) for n in range(0, M, 1))
If you use python2, replace range with xrange.
This tip will probably help you in any python script you'll write.
Also, try to utilize multiprocessing better. It seems what you need to do is to create a multiprocessing.Pool object once and use its map/imap methods.
The main function should look like this:
def job(args):
    i, j = args
    T = 0.1*j
    phi = 0.01*i
    return KO_1(26000, T, phi)

def main():
    pool = multiprocessing.Pool(processes=4)  # You can change this number
    x = [max(pool.imap(job, ((i, j) for i in range(1, 314))))
         for j in range(1, 92)]
Notice that I used a tuple in order to pass multiple arguments to job.
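One caveat worth adding (my note, not part of the original answer): keep the pool creation behind a __main__ guard. On platforms that spawn rather than fork worker processes (Windows, newer macOS) the module is re-imported in each worker, and without the guard you get runtime errors or runaway process creation:

if __name__ == '__main__':
    main()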
This is not an answer to the question, but if I may, I would propose how to speed up the code using simple numpy array operations. Have a look at the following code:
import numpy as np
from scipy.constants import Boltzmann, elementary_charge as kb, e
import time

Tc = 9.2
RAM = 4*1024**2  # 4GB

def Delta(T):
    '''
    Delta(T) takes a temperature as an input and calculates a
    temperature dependent variable based on Tc which is defined as a
    global parameter
    '''
    d0 = (np.pi/1.78)*kb*Tc
    D0 = d0*(np.sqrt(1-(T**2/Tc**2)))
    return D0

def element_in_sum(T, n, phi):
    D = Delta(T)
    matsubara_frequency = (np.pi * kb * T) * (2*n + 1)
    factor_d = np.sqrt((D**2 * np.cos(phi/2)**2) + matsubara_frequency**2)
    element = ((2 * D * np.cos(phi/2)) / factor_d) * np.arctan((D * np.sin(phi/2)) / factor_d)
    return element

def KO_1(M, T, phi):
    X = np.arange(M)[:, np.newaxis, np.newaxis]
    sizeX = int((float(RAM) / sum(T.shape)) / sum(phi.shape) / 8)  # 8 bytes per float
    i0 = 0
    Iko1Rn = 0. * T * phi
    while (i0 + sizeX) <= M:
        print("X = %i" % i0)
        indices = slice(i0, i0 + sizeX)
        Iko1Rn += (2 * np.pi * kb * T / e) * element_in_sum(T, X[indices], phi).sum(0)
        i0 += sizeX
    return Iko1Rn

def main():
    T = np.arange(0.1, 9.2, 0.1)[:, np.newaxis]
    phi = np.linspace(0, np.pi, 361)
    M = 26000
    result = KO_1(M, T, phi)
    return result, result.max()

T0 = time.time()
r, rmax = main()
print(time.time() - T0)
It runs in a bit more than 20 seconds on my PC. One has to be careful not to use too much memory; that is why there is still a loop, with a somewhat complicated construction, that processes X only in chunks. If enough memory is available, it is not necessary.
One should also note that this is just the first step of the speed-up. Further improvement could still be reached using, e.g., just-in-time compilation or Cython.
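As a rough illustration of the just-in-time idea (my sketch, assuming numba is installed; the constants here are placeholders rather than the scipy.constants values used above):

import numpy as np
from numba import njit

kb = 1.380649e-23   # placeholder constant for the sketch
Tc = 9.2

@njit
def sum_elements_jit(T, M, phi):
    # the Matsubara sum compiles to machine code, so the explicit loop is cheap
    D = (np.pi / 1.78) * kb * Tc * np.sqrt(1 - (T / Tc) ** 2)
    total = 0.0
    for n in range(M):
        mf = (np.pi * kb * T) * (2 * n + 1)
        factor_d = np.sqrt((D * np.cos(phi / 2)) ** 2 + mf ** 2)
        total += ((2 * D * np.cos(phi / 2)) / factor_d) * np.arctan((D * np.sin(phi / 2)) / factor_d)
    return total

print(sum_elements_jit(0.5, 26000, 1.0))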

NumPy Broadcasting: Calculating sum of squared differences between two arrays

I have the following code. It is taking forever in Python. There must be a way to translate this calculation into a broadcast...
import numpy as np

def euclidean_square(a, b):
    squares = np.zeros((a.shape[0], b.shape[0]))
    for i in range(squares.shape[0]):
        for j in range(squares.shape[1]):
            diff = a[i, :] - b[j, :]
            sqr = diff**2.0
            squares[i, j] = np.sum(sqr)
    return squares
You can use np.einsum after calculating the differences in a broadcasted way, like so -
ab = a[:,None,:] - b
out = np.einsum('ijk,ijk->ij',ab,ab)
Or use scipy's cdist with its optional metric argument set as 'sqeuclidean' to give us the squared euclidean distances as needed for our problem, like so -
from scipy.spatial.distance import cdist
out = cdist(a,b,'sqeuclidean')
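A quick sanity check (illustrative) that the two approaches agree on small random inputs:

import numpy as np
from scipy.spatial.distance import cdist

a = np.random.rand(5, 3)
b = np.random.rand(4, 3)

ab = a[:, None, :] - b
out_einsum = np.einsum('ijk,ijk->ij', ab, ab)
out_cdist = cdist(a, b, 'sqeuclidean')
print(np.allclose(out_einsum, out_cdist))   # True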
I collected the different methods proposed here and in two other questions, and measured their speed:
import numpy as np
import scipy.spatial
import sklearn.metrics

def dist_direct(x, y):
    d = np.expand_dims(x, -2) - y
    return np.sum(np.square(d), axis=-1)

def dist_einsum(x, y):
    d = np.expand_dims(x, -2) - y
    return np.einsum('ijk,ijk->ij', d, d)

def dist_scipy(x, y):
    return scipy.spatial.distance.cdist(x, y, "sqeuclidean")

def dist_sklearn(x, y):
    return sklearn.metrics.pairwise.pairwise_distances(x, y, "sqeuclidean")

def dist_layers(x, y):
    res = np.zeros((x.shape[0], y.shape[0]))
    for i in range(x.shape[1]):
        res += np.subtract.outer(x[:, i], y[:, i])**2
    return res

# inspired by the excellent https://github.com/droyed/eucl_dist
def dist_ext1(x, y):
    nx, p = x.shape
    x_ext = np.empty((nx, 3*p))
    x_ext[:, :p] = 1
    x_ext[:, p:2*p] = x
    x_ext[:, 2*p:] = np.square(x)

    ny = y.shape[0]
    y_ext = np.empty((3*p, ny))
    y_ext[:p] = np.square(y).T
    y_ext[p:2*p] = -2*y.T
    y_ext[2*p:] = 1

    return x_ext.dot(y_ext)

# https://stackoverflow.com/a/47877630/648741
def dist_ext2(x, y):
    return np.einsum('ij,ij->i', x, x)[:, None] + np.einsum('ij,ij->i', y, y) - 2 * x.dot(y.T)
I use timeit to compare the speed of the different methods. For the comparison, I use vectors of length 10, with 100 vectors in the first group, and 1000 vectors in the second group.
import timeit

p = 10
x = np.random.standard_normal((100, p))
y = np.random.standard_normal((1000, p))

for method in dir():
    if not method.startswith("dist_"):
        continue
    t = timeit.timeit(f"{method}(x, y)", number=1000, globals=globals())
    print(f"{method:12} {t:5.2f}ms")
On my laptop, the results are as follows:
dist_direct 5.07ms
dist_einsum 3.43ms
dist_ext1 0.20ms <-- fastest
dist_ext2 0.35ms
dist_layers 2.82ms
dist_scipy 0.60ms
dist_sklearn 0.67ms
While the two methods dist_ext1 and dist_ext2, both based on the idea of writing (x-y)**2 as x**2 - 2*x*y + y**2, are very fast, there is a downside: When the distance between x and y is very small, due to cancellation error the numerical result can sometimes be (very slightly) negative.
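If those tiny negative values matter for your use case, a cheap guard (my addition, not part of the benchmark above) is to clip at zero:

out = dist_ext2(x, y)
out = np.maximum(out, 0.0)   # squared distances cannot be negative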
Another solution besides using cdist is the following:
difference_squared = np.zeros((a.shape[0], b.shape[0]))
for dimension_iterator in range(a.shape[1]):
    difference_squared = difference_squared + np.subtract.outer(a[:, dimension_iterator], b[:, dimension_iterator])**2.
