I am very new to scipy and to doing data analysis in Python. I am trying to solve the following regularized, constrained optimization problem using scipy.optimize, and unfortunately I haven't been able to make much sense of the scipy documentation.
Here is the function I am looking to minimize:
(1/2)*||A - A*W||_F^2 + (beta/2)*||W||_F^2 + lambda*||W||_1
Here A is an m x n matrix, the first term in the minimization is the residual sum of squares, the second term is the squared Frobenius (L2) norm of a sparse n x n matrix W, and the third one is an L1 norm of the same matrix W.
I would like to know how to minimize this function subject to the constraints that:
w_{i,j} >= 0 (all entries of W are non-negative)
w_{j,j} = 0 (zero diagonal)
I would like to use coordinate descent (or any other method that scipy.optimize provides) to solve the above problem. I would like some direction on how to achieve this, as I have no idea how to take the Frobenius norm, or how to tune the parameters beta and lambda, or whether scipy.optimize will tune and return the parameters for me. Any help regarding these questions would be much appreciated.
Thanks in advance!
How large are m and n?
Here is a basic example for how to use fmin:
from scipy import optimize
import numpy as np

m = 5
n = 3
a = np.random.rand(m, n)
idx = np.arange(n)

def func(w, beta, lam):
    # reshape the flat parameter vector into an n x n matrix
    w = w.reshape(n, n)
    # enforce non-negativity and a zero diagonal inside the objective
    w2 = np.abs(w)
    w2[idx, idx] = 0
    return 0.5*((a - np.dot(a, w2))**2).sum() + lam*w2.sum() + 0.5*beta*(w2**2).sum()

w = optimize.fmin(func, np.random.rand(n*n), args=(0.1, 0.2))
w = w.reshape(n, n)
w[idx, idx] = 0
w = np.abs(w)
print(w)
If you want to use coordinate descent, you can implement it with Theano.
http://deeplearning.net/software/theano/
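For illustration, here is a minimal coordinate-descent sketch in plain NumPy (not Theano), assuming the objective used in the code above, 0.5*||A - A@W||_F^2 + 0.5*beta*||W||_F^2 + lam*sum|W|, with non-negative entries and a zero diagonal. The helper name is hypothetical and this is a sketch of the technique, not a tuned implementation: each coordinate update has a closed form (non-negative soft-thresholding), so no line search is needed.

import numpy as np

def coordinate_descent(A, beta, lam, n_sweeps=50):
    n = A.shape[1]
    W = np.zeros((n, n))
    R = A - A @ W                 # residual A - A@W, kept up to date
    col_sq = (A**2).sum(axis=0)   # ||A[:, i]||^2 for each column i
    for _ in range(n_sweeps):
        for j in range(n):
            for i in range(n):
                if i == j:
                    continue      # keep the diagonal of W at zero
                a_i = A[:, i]
                # column-j residual with the (i, j) contribution added back
                r = R[:, j] + a_i * W[i, j]
                # closed-form, non-negative soft-thresholded update
                w_new = max(0.0, (a_i @ r - lam) / (col_sq[i] + beta))
                R[:, j] += a_i * (W[i, j] - w_new)
                W[i, j] = w_new
    return W

# example usage on random data, matching the sizes above
m, n = 5, 3
a = np.random.rand(m, n)
W = coordinate_descent(a, beta=0.1, lam=0.2)
print(W)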
Your problem seems tailor-made for cvxopt - http://cvxopt.org/
and in particular
http://cvxopt.org/userguide/solvers.html#problems-with-nonlinear-objectives
Using fmin would likely be slower, since it does not take advantage of gradient/Hessian information.
The code in HYRY's answer also has the drawback that, as far as fmin is concerned, the diagonal of W is a variable, so fmin will try to move the diagonal values around until it realizes that they don't do anything (since the objective function resets them to zero). Here is a cvxopt implementation of HYRY's code that explicitly enforces the zero constraints and uses gradient information. WARNING: I couldn't derive the Hessian for your objective, and you might want to double-check the gradient as well:
'''CVXOPT version:'''
from numpy import *
from cvxopt import matrix, mul

''' warning: CVXOPT uses column-major order (Fortran) '''
m = 5
n = 3
n_active = (n)*(n-1)
A = matrix(random.rand(m*n), (m, n))
ids = arange(n)
beta = 0.1
lam = 0.2
W = matrix(zeros(n*n), (n, n))

def cvx_objective_func(w=None, z=None):
    if w is None:
        num_nonlinear_constraints = 0
        w_0 = matrix(1, (n_active, 1), 'd')
        return num_nonlinear_constraints, w_0
    # main call:
    # calculate objective:
    # form W matrix; warning: _w is in column-major order (Fortran)
    _w = matrix(w, (n, n-1))
    for k in range(n):
        W[k, 0:k] = _w[k, 0:k]
        W[k, k+1:n] = _w[k, k:n-1]
    squared_error = A - A*W
    objective_value = .5 * sum(mul(squared_error, squared_error)) + \
                      .5 * beta * sum(mul(W, W)) + \
                      lam * sum(abs(W))
    # not sure if i calculated this right...
    _Df = -A.T*(squared_error) + beta*W + lam
    # column-major order!
    Df = matrix(0., (1, n*(n-1)))
    for jdx in arange(n):
        for idx in list(arange(0, jdx)) + list(arange(jdx+1, n)):
            idx = int(idx)
            jdx = int(jdx)
            Df[0, jdx*(n-1) + idx] = _Df[idx, jdx]
    if z is None:
        return objective_value, Df
    # Also form the Hessian of objective + nonlinear constraints
    # (but there are no nonlinear constraints):
    # This is the trickiest part...
    # WARNING: H is for sure coded wrong
    H = matrix(1., (n_active, n_active))
    return objective_value, Df, H

m, w_0 = cvx_objective_func()
print(cvx_objective_func(w_0))
G = -matrix(diag(ones(n_active)), (n_active, n_active))
h = matrix(0., (n_active, 1), 'd')
from cvxopt import solvers
print(solvers.cp(cvx_objective_func, G=G, h=h))
Having said that, the tricks used to eliminate the equality/inequality constraints in HYRY's code are quite cute.
Related
I would like to numerically integrate this ODE from time 0 -> T:
P'(t) = Q + Y^T P(t) + P(t) Y + P(t) U P(t)
where all of the sub-matrices are given numerically in a paper. Here are all of the variables:
import numpy as np
T = 1
eta = np.diag([2e-7, 2e-7])
R = [[0.33, 3.95],
[-2.52, 10.23]]
R = np.array(R)
gamma = 2e-5
GAMMA = 100
S_bar = [54.23, 27.45]
cov = [[0.47, 0.2],
[0.2, 0.14]]
cov = np.array(cov)
shape = cov.shape
Q = 0.5*np.block([[gamma*cov, R],
[np.transpose(R), np.zeros(shape)]])
Y = np.block([[np.zeros(shape), np.zeros(shape)],
[gamma*cov, R]])
U = np.block([[-np.linalg.inv(eta), np.zeros(shape)],
[np.zeros(shape), 2*gamma*cov]])
P_T = np.block([[-GAMMA*np.ones(shape), np.zeros(shape)],
[np.zeros(shape), np.zeros(shape)]])
Now I define the function f so that P' = f(t, P):
n = len(P_T)
def f(t, X):
    X = X.reshape([n, n])
    return (Q + np.transpose(Y) @ X + X @ Y + X @ U @ X).reshape(-1)
Now my goal is to numerically solve this ODE. I'm trying to figure out the right solve function so that if I integrate the ODE from T to 0, and then integrate back from 0 to T starting from the final value I obtained, the two matrices I get are (nearly) the same. Here is my solve function:
from scipy import integrate
def solve(interval, initial_value):
    return integrate.solve_ivp(f, interval, initial_value, method="LSODA", max_step=1e-4)
Now I can test whether the computation is right:
solv = solve([T, 0], P_T.reshape(-1))
y = np.array(solv.y)
solv2 = solve([0, T], y[:, -1])
y2 = np.array(solv2.y)
# print(solv.status)
# print(solv2.status)
# these lines show the difference between the initial matrix at T and the final matrix computed at T
# the smaller the value, the better the computation
print(sum(sum(abs((P_T - y2[:, -1].reshape([n, n]))))))
My issue is: no matter which "solve" configuration I use (different methods, different step sizes, testing all the parameters...), I always get either errors or very bad convergence (the difference between the two matrices is too large).
Knowing that, according to the paper this ODE comes from (equation (23) in https://arxiv.org/pdf/2103.13773v4.pdf), a solution exists, how can I numerically compute it?
I want to use GEKKO to solve the following optimization problem:
Minimize x'Qx + 1e-10 * sum_{i=1}^n x_i^0.1
subject to 1' x = 1 and x >= 0
However, the following code returns sol = [0., 0., 0., 0., 1.] and Objective: 1.99419 as the solution, which is far from optimal; I'll explain why below.
import numpy as np
from gekko import GEKKO
n = 5
m = GEKKO(remote=False)
m.options.SOLVER = 1
m.options.IMODE = 3
x = [m.Var(lb=0, ub=1) for _ in range(n)]
m.Equation(m.sum(x) == 1)
np.random.seed(0)
Q = np.random.uniform(-1, 1, size=(n, n))
Q = np.dot(Q.T, Q)
## Add h_i^p
c, p = 1e-10, 0.1
for i in range(n):
    m.Obj(c * x[i] ** p)
    for j in range(n):
        m.Obj(x[i] * Q[i, j] * x[j])
m.solve(disp=True)
sol = np.array(x).flatten()
This is clearly wrong: if we optimize only the quadratic part (x'Qx) using the code below and plug the solution into the original objective, we get a much smaller objective value (Objective: 0.02489503). The 1e-10 * sum_{i=1}^n x_i^p term is essentially ignored since it is very small.
m1 = GEKKO(remote=False)
m1.options.SOLVER = 1
m1.options.OTOL = 1e-10
x1 = [m1.Var(lb=0, ub=1) for _ in range(n)]
m1.Equation(m1.sum(x1) == 1)
m1.qobj(b=np.zeros(n), A=2 * Q, x=x1, otype='min')
m1.solve(disp=True)
sol = np.array(x1).flatten()
Is there any way to resolve this? Thank you!
Gekko solves nonlinear programming optimization problems with gradient-based methods: interior point and active set SQP. It looks like there is a problem with the objective function. Use matrix operations in Numpy to simplify the objective definition.
## Create Objective
c, p = 1e-10, 0.1
obj = np.dot(np.dot(x,Q),x) + c*m.sum([xi**p for xi in x])
m.Minimize(obj)
Here is the modified script that solves with Gekko. Increase MAX_ITER if the default limit of 250 is reached.
import numpy as np
from gekko import GEKKO
n = 5
m = GEKKO(remote=False)
m.options.SOLVER = 3
m.options.IMODE = 3
x = m.Array(m.Var,n,value=0.1, lb=1e-6, ub=1)
m.Equation(m.sum(x) == 1)
np.random.seed(0)
Q = np.random.uniform(-1, 1, size=(n, n))
Q = np.dot(Q.T, Q)
print(Q)
## Create Objective
c, p = 1e-10, 0.1
obj = np.dot(np.dot(x,Q),x) + c*m.sum([xi**p for xi in x])
m.Minimize(obj)
# adjust solver tolerance
m.options.RTOL=1e-10
m.options.OTOL=1e-10
m.options.MAX_ITER = 1000
m.solve(disp=True)
sol = np.array(x).flatten()
print('x: ', sol)
print('obj: ', m.options.OBJFCNVAL)
This gives an optimal solution that is also global because it is a Quadratic Programming (QP) problem (convex optimization). Using a general nonlinear programming solver for this QP problem, the IPOPT solver returns the solution:
x: [[0.36315827507] [0.081993130341] [1e-06] [0.086231281612] [0.46861632269]]
obj: 0.024895918696
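As a sanity check (not part of the original answer), the same QP can be handed to scipy's SLSQP with the simplex constraint and bounds; with the same seed it should land near the objective value reported above:

import numpy as np
from scipy.optimize import minimize

np.random.seed(0)
n = 5
Q = np.random.uniform(-1, 1, size=(n, n))
Q = Q.T @ Q

c, p = 1e-10, 0.1
obj = lambda x: x @ Q @ x + c * np.sum(x**p)

res = minimize(obj, np.full(n, 1/n), method='SLSQP',
               bounds=[(0, 1)] * n,
               constraints={'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
print(res.x, res.fun)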
As far as I could see, Gekko looks like it's built for machine learning, which focuses on local optimization as opposed to global optimization, and typically most libraries will not be able to guarantee globally optimal solutions.
If you really want optimal solutions, then for this case I would suggest looking into interval arithmetic. There are packages such as mpmath which can offer this, though I have yet to see optimizers using it in my brief time searching.
The TL;DR on how interval arithmetic works is that you feed in a range of inputs and get back a range of outputs. For example, you can test whether 1 is in the range of possible outputs of x1 + x2 + x3 + x4, and you can see the minimum/maximum potential values of your objective function. In this way, you can progressively split your intervals in half, keeping only the intervals for which your constraints can still be satisfied and for which the best value your objective could attain is at least as good as the best bound found so far. This allows you to achieve guaranteed convergence to global optima at the cost of a lot more computation.
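To make the idea concrete, here is a toy one-dimensional sketch of interval branch-and-bound for a minimization problem, using mpmath's interval type (mpmath.mpi). It is not the constrained problem from the question, just the split-and-prune mechanism described above, with a polynomial objective so that overloaded interval arithmetic is all that is needed:

from mpmath import mpi

def F(x):
    # interval extension of f(x) = x**4 - 3*x**2 + x
    return x**4 - 3*x**2 + x

def interval_minimize(lo, hi, tol=1e-6):
    boxes = [mpi(lo, hi)]
    best = float(F(mpi((lo + hi) / 2)).b)       # point evaluation -> valid upper bound
    while max(float(b.b) - float(b.a) for b in boxes) > tol:
        next_boxes = []
        for box in boxes:
            if float(F(box).a) > best:
                continue                         # prune: this box cannot contain the minimum
            mid = (float(box.a) + float(box.b)) / 2
            best = min(best, float(F(mpi(mid)).b))
            next_boxes.append(mpi(float(box.a), mid))
            next_boxes.append(mpi(mid, float(box.b)))
        boxes = next_boxes
    return boxes, best

boxes, best = interval_minimize(-3, 3)
print("number of surviving boxes:", len(boxes))
print("upper bound on the global minimum:", best)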
I am currently trying to write some python code to solve an arbitrary system of first order ODEs, using a general explicit Runge-Kutta method defined by the values alpha, gamma (both vectors of dimension m) and beta (lower triangular matrix of dimension m x m) of the Butcher table which are passed in by the user. My code appears to work for single ODEs, having tested it on a few different examples, but I'm struggling to generalise my code to vector valued ODEs (i.e. systems).
In particular, I try to solve a Van der Pol oscillator ODE (reduced to a first order system) using Heun's method defined by the Butcher Tableau values given in my code, but I receive the errors
"RuntimeWarning: overflow encountered in double_scalars f = lambda t,u: np.array(... etc)" and
"RuntimeWarning: invalid value encountered in add kvec[i] = f(t+alpha[i]*h,y+h*sum)"
followed by my solution vector that is clearly blowing up. Note that the commented out code below is one of the examples of single ODEs that I tried and is solved correctly. Could anyone please help? Here is my code:
import numpy as np
def rk(t, y, h, f, alpha, beta, gamma):
    '''Runge-Kutta iteration'''
    return y + h*phi(t, y, h, f, alpha, beta, gamma)

def phi(t, y, h, f, alpha, beta, gamma):
    '''Phi function for the Runge-Kutta iteration'''
    m = len(alpha)
    count = np.zeros(len(f(t, y)))
    kvec = k(t, y, h, f, alpha, beta, gamma)
    for i in range(1, m+1):
        count = count + gamma[i-1]*kvec[i-1]
    return count

def k(t, y, h, f, alpha, beta, gamma):
    '''returns a vector containing each stage k_i of the m-stage Runge-Kutta method'''
    m = len(alpha)
    kvec = np.zeros((m, len(f(t, y))))
    kvec[0] = f(t, y)
    for i in range(1, m):
        sum = np.zeros(len(f(t, y)))
        for l in range(1, i+1):
            sum = sum + beta[i][l-1]*kvec[l-1]
        kvec[i] = f(t+alpha[i]*h, y+h*sum)
    return kvec
def timeLoop(y0, N, f, alpha, beta, gamma, h, rk):
    '''function that loops through time using the RK method'''
    t = np.zeros([N+1])
    y = np.zeros([N+1, len(y0)])
    y[0] = y0
    t[0] = 0
    for i in range(1, N+1):
        y[i] = rk(t[i-1], y[i-1], h, f, alpha, beta, gamma)
        t[i] = t[i-1]+h
    return t, y
#################################################################
'''f = lambda t,y: (c-y)**2
Y = lambda t: np.array([(1+t*c*(c-1))/(1+t*(c-1))])
h0 = 1
c = 1.5
T = 10
alpha = np.array([0,1])
gamma = np.array([0.5,0.5])
beta = np.array([[0,0],[1,0]])
eff_rk = compute(h0,Y(0),T,f,alpha,beta,gamma,rk, Y,11)'''
#constants
mu = 100
T = 1000
h = 0.01
N = int(T/h)
#initial conditions
y0 = 0.02
d0 = 0
init = np.array([y0,d0])
#Butcher Tableau for Heun's method
alpha = np.array([0,1])
gamma = np.array([0.5,0.5])
beta = np.array([[0,0],[1,0]])
#rhs of the ode system
f = lambda t,u: np.array([u[1],mu*(1-u[0]**2)*u[1]-u[0]])
#solving the system
time, sol = timeLoop(init,N,f,alpha,beta,gamma,h,rk)
print(sol)
Your step size is not small enough. The Van der Pol oscillator with mu=100 is a fast-slow system with very sharp turns at the switching of the modes, so it is rather stiff. With explicit methods this requires small step sizes; the smallest sensible step size is 1e-5 to 1e-6. You get a solution on the limit cycle already for h=0.001, with resulting velocities up to 150.
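For instance, simply re-running the set-up from the question with a smaller step size (reusing f, init, alpha, beta, gamma, T, rk and timeLoop defined above; note that this takes noticeably longer):

h = 0.001
N = int(T/h)
time, sol = timeLoop(init, N, f, alpha, beta, gamma, h, rk)
print(sol)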
You can reduce some of that stiffness by using a different velocity/impulse variable. In the equation
x'' - mu*(1-x^2)*x' + x = 0
you can combine the first two terms into a derivative,
mu*v = x' - mu*(1-x^2/3)*x
so that
x' = mu*(v+(1-x^2/3)*x)
v' = -x/mu
The second equation is now uniformly slow close to the limit cycle, while the first has long relatively straight jumps when v leaves the cubic v=x^3/3-x.
This integrates nicely with the original h=0.01, keeping the solution inside the box [-3,3]x[-2,2], even if it shows some strange oscillations that are not present for smaller step sizes and the exact solution.
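Here is a sketch of this transformed system plugged into the question's own Heun set-up (timeLoop, rk and the Butcher arrays defined above); the initial v is derived from the original initial conditions via v = x'/mu - (x - x^3/3), following the substitution given above:

mu = 100
T = 1000
h = 0.01
N = int(T/h)

# Lienard-transformed right-hand side: u = [x, v]
f_lienard = lambda t, u: np.array([mu*(u[1] + (1 - u[0]**2/3)*u[0]),
                                   -u[0]/mu])

x0, dx0 = 0.02, 0
v0 = dx0/mu - (x0 - x0**3/3)
init_lienard = np.array([x0, v0])

time, sol = timeLoop(init_lienard, N, f_lienard, alpha, beta, gamma, h, rk)
print(sol)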
I am trying to find the elements of a matrix inverse for an ill-conditioned matrix
Consider the complex non-Hermitian matrix M. I know this matrix has one zero eigenvalue and is therefore singular. However, I need to find the sum of the matrix elements v @ f(M) @ u, where u and v are both vectors and f(x) = 1/x (effectively the matrix inverse). I know that the zero eigenvalue does not contribute to this sum, so there is no explicit issue with the singularity. However, my code is very numerically unstable; I presume this is a consequence of an error in finding the eigenvalues of the system.
Starting by building the preliminary matrices:
import numpy as np
import scipy as sc
import scipy.linalg

g0 = np.array([0, 0, 1])
g1 = np.array([0, 1, 0])
e0 = np.array([1, 0, 0])

sm = np.outer(g0, e0)
sp = np.outer(e0, g0)

def spre(op):
    return np.kron(np.eye(op.shape[0]), op)

def spost(op):
    return np.kron(op.T, np.eye(op.shape[0]))

def sprepost(op1, op2):
    return np.kron(op1.T, op2)

sm_reg = spre(sm)
sp_reg = spre(sp)
spsm_reg = spre(sp @ sm)

hil_dim = int(g0.shape[0])
cav_proj = np.eye(hil_dim).reshape(hil_dim**2,)
rho0 = (np.outer(e0, e0)).reshape(hil_dim**2,)

def ham(g):
    return g * (np.outer(g1, e0) + np.outer(e0, g1))

def lind_op(A):
    L = 2 * sprepost(A, A.conj().T) - spre(A.conj().T @ A)
    L += - spost(A.conj().T @ A)
    return L

def JC_lio(g, kappa, gamma):
    unit = -1j * (spre(ham(g)) - spost(ham(g)))
    lind = gamma * lind_op(np.outer(g0, e0)) + kappa * lind_op(np.outer(g0, g1))
    return unit + lind
Now define a function that first finds the left and right eigenvectors, and then finds the sum of the matrix elements:
def power_int(g, kappa, gamma):
    # Construct the non-Hermitian matrix of interest
    lio = JC_lio(g, kappa, gamma)
    # Find its left and right eigenvectors:
    ev, left, right = scipy.linalg.eig(lio, left=True, right=True)
    # Find the appropriate normalisation factors
    norm = np.array([(left.conj().T[ii]).dot(right.conj().T[ii]) for ii in range(len(ev))])
    # Find the similarity transformation for the problem
    P = right
    Pinv = (left/norm).conj().T
    # Find the projectors for the eigenbasis
    Proj = [np.outer(P.conj().T[ii], Pinv[ii]) for ii in range(len(ev))]
    # Find the relevant matrix elements between the eigenbasis and the projectors
    # --- this is where the zero eigenvector gets removed
    PowList = [(spsm_reg @ Proj[ii] @ rho0).dot(cav_proj) for ii in range(len(ev))]
    # apply the function
    Pow = 0
    for ii in range(len(ev)):
        if PowList[ii] == 0:
            Pow = Pow
        else:
            Pow += PowList[ii]/ev[ii]
    return -np.pi * np.real(Pow)

# example run:
grange = np.linspace(0.001, 10, 40)
dat = np.array([power_int(g, 1, 1) for g in grange])
Running this code leads to extremely oscillatory results where I expect a smooth curve. I suspect this error is due to poor accuracy in determining the eigenvectors, but I can't seem to find any documentation on this. Any insights would be welcome.
My objective is to perform an Inverse Laplace Transform on some decay data (NMR T2 decay via CPMG). For that, we were provided with the CONTIN algorithm. This algorithm was adapted to Matlab by Iari-Gabriel Marino, and it works very well. I want to adapt this code into Python. The core of the problem is with scipy.optimize.fmin, which is not minimizing the mean square deviation (MSD) in any way similar to Matlab's fminsearch. The latter results in a good minimization, while the former doesn't.
I have gone through my adapted Python code and the original Matlab line by line. I checked every matrix and every output. I used this to identify that the critical point is in fmin. I also tried scipy.optimize.minimize and other minimization algorithms, but none gave even remotely satisfactory results.
I have made two MWEs, in Python and Matlab, to make this reproducible for everyone. The example data were obtained from the documentation of the Matlab function. Apologies if this is long code, but I don't really know how to shorten it without sacrificing readability and clarity. I tried to have the lines match as closely as possible. I am using Python 3.7.3, scipy v1.3.0, numpy 1.16.2, Matlab R2018b, on Windows 8.1. It's a relatively recent Anaconda install (<2 months).
My code:
import numpy as np
from scipy.optimize import fmin
import matplotlib.pyplot as plt
def msd(g, y, A, alpha, R, w, constraints):
    """ msd: mean square deviation. This is the function to be minimized by fmin"""
    if 'zero_at_extremes' in constraints:
        g[0] = 0
        g[-1] = 0
    if 'g>0' in constraints:
        g = np.abs(g)
    r = np.diff(g, axis=0, n=2)
    yfit = A @ g
    # Sum of weighted square residuals
    VAR = np.sum(w * (y - yfit) ** 2)
    # Regularizor
    REG = alpha ** 2 * np.sum((r - R @ g) ** 2)
    # output to be minimized
    return VAR + REG
# Objective: match this distribution
g0 = np.array([0, 0, 10.1625, 25.1974, 21.8711, 1.6377, 7.3895, 8.736, 1.4256, 0, 0]).reshape((-1, 1))
s0 = np.logspace(-3, 6, len(g0)).reshape((-1, 1))
t = np.linspace(0.01, 500, 100).reshape((-1, 1))
sM, tM = np.meshgrid(s0, t)
A = np.exp(-tM / sM)
np.random.seed(1)
# Creates data from the initial distribution with some random noise.
data = (A @ g0) + 0.07 * np.random.rand(t.size).reshape((-1, 1))
# Parameters and function start
alpha = 1E-2 # regularization parameter
s = np.logspace(-3, 6, 20).reshape((-1, 1)) # x of the ILT
g0 = np.ones(s.size).reshape((-1, 1)) # guess of y of ILT
y = data # noisy data
options = {'maxiter':1e8, 'maxfun':1e8} # for the fmin function
constraints=['g>0', 'zero_at_extremes'] # constraints for the MSD function
R=np.zeros((len(g0) - 2, len(g0)), order='F') # Regularizor
w=np.ones(y.reshape(-1, 1).size).reshape((-1, 1)) # Weights
sM, tM = np.meshgrid(s, t, indexing='xy')
A = np.exp(-tM/sM)
g0 = g0 * y.sum() / (A @ g0).sum() # Makes a "better guess" for the distribution, according to algorithm
print('msd of input data:\n', msd(g0, y, A, alpha, R, w, constraints))
for i in range(5): # Just for testing. If this is extremely high, ~1000, it's still bad.
    g = fmin(func=msd,
             x0=g0,
             args=(y, A, alpha, R, w, constraints),
             **options,
             disp=True)[:, np.newaxis]
    msdfit = msd(g, y, A, alpha, R, w, constraints)
    if 'zero_at_extremes' in constraints:
        g[0] = 0
        g[-1] = 0
    if 'g>0' in constraints:
        g = np.abs(g)
    g0 = g

print('New guess', g)
print('Final msd of g', msdfit)
# Visualize the fit
plt.plot(s, g, label='Initial approximation')
plt.plot(np.logspace(-3, 6, 11), np.array([0, 0, 10.1625, 25.1974, 21.8711, 1.6377, 7.3895, 8.736, 1.4256, 0, 0]), label='Distribution to match')
plt.xscale('log')
plt.legend()
plt.show()
Matlab:
% Objective: match this distribution
g0 = [0 0 10.1625 25.1974 21.8711 1.6377 7.3895 8.736 1.4256 0 0]';
s0 = logspace(-3,6,length(g0))';
t = linspace(0.01,500,100)';
[sM,tM] = meshgrid(s0,t);
A = exp(-tM./sM);
rng(1);
% Creates data from the initial distribution with some random noise.
data = A*g0 + 0.07*rand(size(t));
% Parameters and function start
alpha = 1e-2; % regularization parameter
s = logspace(-3,6,20)'; % x of the ILT
g0 = ones(size(s)); % initial guess of y of ILT
y = data; % noisy data
options = optimset('MaxFunEvals',1e8,'MaxIter',1e8); % constraints for fminsearch
constraints = {'g>0','zero_at_the_extremes'}; % constraints for MSD
R = zeros(length(g0)-2,length(g0));
w = ones(size(y(:)));
[sM,tM] = meshgrid(s,t);
A = exp(-tM./sM);
g0 = g0*sum(y)/sum(A*g0); % Makes a "better guess" for the distribution
disp('msd of input data:')
disp(msd(g0, y, A, alpha, R, w, constraints))
for k = 1:5
    [g,msdfit] = fminsearch(@msd,g0,options,y,A,alpha,R,w,constraints);
    if ismember('zero_at_the_extremes',constraints)
        g(1) = 0;
        g(end) = 0;
    end
    if ismember('g>0',constraints)
        g = abs(g);
    end
    g0 = g;
end
disp('New guess')
disp(g)
disp('Final msd of g')
disp(msdfit)
% Visualize the fit
semilogx(s, g)
hold on
semilogx(logspace(-3,6,11), [0 0 10.1625 25.1974 21.8711 1.6377 7.3895 8.736 1.4256 0 0])
legend('First approximation', 'Distribution to match')
hold off
function out = msd(g,y,A,alpha,R,w,constraints)
    % msd: The mean square deviation; this is the function
    % that has to be minimized by fminsearch
    % Constraints and any 'a priori' knowledge
    if ismember('zero_at_the_extremes',constraints)
        g(1) = 0;
        g(end) = 0;
    end
    if ismember('g>0',constraints)
        g = abs(g); % must be g(i)>=0 for each i
    end
    r = diff(diff(g(1:end))); % second derivative of g
    yfit = A*g;
    % Sum of weighted square residuals
    VAR = sum(w.*(y-yfit).^2);
    % Regularizor
    REG = alpha^2 * sum((r-R*g).^2);
    % Output to be minimized
    out = VAR+REG;
end
Here is the optimization in Python
Here is the optimization in Matlab
I have checked the output of MSD of g0 before starting, and both give the value of 2651. After minimization, Python goes up, to 4547, and Matlab goes down to 0.1381.
I think the problem is one of the following: it's in my implementation, that is, I am using fmin wrong, or there's some other step I got wrong, but I can't figure out what. The fact that the MSD increases when it should have decreased under a minimization function is damning. Reading the documentation, the scipy implementation is different from Matlab's (Matlab uses the Nelder-Mead method described in Lagarias et al., per their documentation, while scipy uses the original Nelder-Mead). Maybe that matters significantly? Or perhaps my initial guess is too bad for scipy's algorithm?
So, quite a long time since I posted this, but I wanted to share what I ended up learning and doing.
The Inverse Laplace Transform for CPMG data is a bit of a misnomer, and it's more properly called just inversion. The general problem is solving a Fredholm integral of the first kind. One way of doing this is the Tikhonov regularization method. Turns out, you can describe this problem quite easily using numpy, and solve it with a scipy package, so I don't have to "reinvent" the wheel with this.
I used the solution shown in this post, and the names here reflect that solution.
import numpy as np
from scipy.optimize import nnls

def tikhonov_regularized_inversion(
    kernel: np.ndarray, alpha: float, data: np.ndarray
) -> np.ndarray:
    data = data.reshape(-1, 1)
    I = alpha * np.eye(*kernel.shape)
    C = np.concatenate([kernel, I], axis=0)
    d = np.concatenate([data, np.zeros_like(data)])
    x, _ = nnls(C, d.flatten())
    return x
Here, kernel is a matrix containing all the possible exponential decay curves, and the solution judges the contribution of each decay curve in the data I received. First, I stack my data as a column, then pad it with zeros, creating the vector d. I then stack my kernel on top of a diagonal matrix containing the regularization parameter alpha along the diagonal, of the same size as the kernel. Last, I call the convenient nnls, a non-negative least squares solver in scipy.optimize. This is because there's no reason to have a negative contribution, only no contribution.
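A hypothetical usage sketch, reusing the t grid, s grid, alpha and the noisy data from the MWE above (each column of the kernel is one candidate decay curve exp(-t/s_j)):

s = np.logspace(-3, 6, 20).reshape((-1, 1))
t = np.linspace(0.01, 500, 100).reshape((-1, 1))
sM, tM = np.meshgrid(s, t)
kernel = np.exp(-tM / sM)

# data is the noisy decay from the MWE above
g = tikhonov_regularized_inversion(kernel, alpha=1e-2, data=data)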
This solved my problem, it's quick and convenient.