OpenMDAO cache_linear_solution not updating initial guess - python

I want to save time on expensive linear solves in my optimization by using the linear solution from one iteration as the initial guess for the next. I'm looking at OpenMDAO's example for the cache_linear_solution feature, which seems to have been developed for this purpose. The code is shown below:
from distutils.version import LooseVersion
import numpy as np
import scipy
from scipy.sparse.linalg import gmres
import openmdao.api as om
class QuadraticComp(om.ImplicitComponent):
    """
    A Simple Implicit Component representing a Quadratic Equation.
    R(a, b, c, x) = ax^2 + bx + c
    Solution via Quadratic Formula:
    x = (-b + sqrt(b^2 - 4ac)) / 2a
    """

    def setup(self):
        self.add_input('a', val=1.)
        self.add_input('b', val=1.)
        self.add_input('c', val=1.)
        self.add_output('states', val=[0,0])
        self.declare_partials(of='*', wrt='*')

    def apply_nonlinear(self, inputs, outputs, residuals):
        a = inputs['a']
        b = inputs['b']
        c = inputs['c']
        x = outputs['states'][0]
        y = outputs['states'][1]
        residuals['states'][0] = a * x ** 2 + b * x + c
        residuals['states'][1] = a * y + b

    def solve_nonlinear(self, inputs, outputs):
        a = inputs['a']
        b = inputs['b']
        c = inputs['c']
        outputs['states'][0] = (-b + (b ** 2 - 4 * a * c) ** 0.5) / (2 * a)
        outputs['states'][1] = -b/a

    def linearize(self, inputs, outputs, partials):
        a = inputs['a'][0]
        b = inputs['b'][0]
        c = inputs['c'][0]
        x = outputs['states'][0]
        y = outputs['states'][1]
        partials['states', 'a'] = [[x**2],[y]]
        partials['states', 'b'] = [[x],[1]]
        partials['states', 'c'] = [[1.0],[0]]
        partials['states', 'states'] = [[2*a*x+b, 0],[0, a]]
        self.state_jac = np.array([[2*a*x+b, 0],[0, a]])

    def solve_linear(self, d_outputs, d_residuals, mode):
        if mode == 'fwd':
            print("incoming initial guess", d_outputs['states'])
            if LooseVersion(scipy.__version__) < LooseVersion("1.1"):
                d_outputs['states'] = gmres(self.state_jac, d_residuals['states'], x0=d_outputs['states'])[0]
            else:
                d_outputs['states'] = gmres(self.state_jac, d_residuals['states'], x0=d_outputs['states'], atol='legacy')[0]
        elif mode == 'rev':
            if LooseVersion(scipy.__version__) < LooseVersion("1.1"):
                d_residuals['states'] = gmres(self.state_jac, d_outputs['states'], x0=d_residuals['states'])[0]
            else:
                d_residuals['states'] = gmres(self.state_jac, d_outputs['states'], x0=d_residuals['states'], atol='legacy')[0]

p = om.Problem()
indeps = p.model.add_subsystem('indeps', om.IndepVarComp(), promotes_outputs=['a', 'b', 'c'])
indeps.add_output('a', 1.)
indeps.add_output('b', 4.)
indeps.add_output('c', 1.)
p.model.add_subsystem('quad', QuadraticComp(), promotes_inputs=['a', 'b', 'c'], promotes_outputs=['states'])
p.model.add_design_var('a', cache_linear_solution=True)
p.model.add_constraint('states', upper=10)
p.setup(mode='fwd')
p.run_model()
print(p['states'])
derivs = p.compute_totals(of=['states'], wrt=['a'])
print(derivs['states', 'a'])
p['a'] = 4
derivs = p.compute_totals(of=['states'], wrt=['a'])
print(derivs['states', 'a'])
The above code gives the following print out:
[-0.26794919 -4. ]
incoming initial guess [0. 0.]
[[-0.02072594]
[ 4. ]]
incoming initial guess [0. 0.]
[[-0.02072594]
[ 4. ]]
From the printout of this example, it doesn't look like the initial guess for the linear solve is actually being updated. Am I missing something? I've also tried running the code with cache_linear_solution set to False, and the result seems to be the same.

Currently, the caching of linear solutions only happens when the total derivatives are computed during the run of a driver, so if you want to check to make sure it's happening during your optimization (in the run_driver call), change
derivs = p.compute_totals(of=['states'], wrt=['a'])
to
derivs = p.driver._compute_totals(of=['states'], wrt=['a'], global_names=False)
When I do that with your code, I get the following output:
[-0.26794919 -4. ]
incoming initial guess [0. 0.]
[[-0.02072594]
[ 4. ]]
incoming initial guess [-0.02072594 4. ]
[[-0.02072594]
[ 4. ]]
Note that the global_names=False arg is only needed if you use promoted names for your of and wrt variables.
I will update our example code to reflect the correct way to do this.
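To see why a cached solution makes a useful starting point, here is a minimal NumPy-only sketch. It is not OpenMDAO code: the helper name linear_system and the perturbed value a = 1.1 are assumptions made purely for illustration. It rebuilds the 2x2 state Jacobian and the forward-mode right-hand side of the QuadraticComp above for two nearby values of a, then compares the initial residual an iterative solver would start from with a zero guess versus the previous solution.

```python
import numpy as np

def linear_system(a, b, c):
    """Hypothetical helper: (dR/dstates, -dR/da) at the converged states."""
    x = (-b + np.sqrt(b**2 - 4 * a * c)) / (2 * a)
    y = -b / a
    jac = np.array([[2 * a * x + b, 0.0], [0.0, a]])
    rhs = -np.array([x**2, y])
    return jac, rhs

# Solution at the original design point (a=1, b=4, c=1); this is what gets cached
jac_old, rhs_old = linear_system(1.0, 4.0, 1.0)
cached = np.linalg.solve(jac_old, rhs_old)

# Slightly changed design point (assumed value, for illustration only)
jac_new, rhs_new = linear_system(1.1, 4.0, 1.0)
cold_residual = np.linalg.norm(rhs_new - jac_new @ np.zeros(2))  # zero initial guess
warm_residual = np.linalg.norm(rhs_new - jac_new @ cached)       # cached initial guess

print("cached solution:", cached)
print("initial residual, zero guess:  ", cold_residual)
print("initial residual, cached guess:", warm_residual)
```

The cached vector matches the derivative printed in the question, and the cached guess starts with a smaller residual than the zero guess, which is what lets an iterative solver such as gmres converge in fewer iterations.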

Related

Python vs MATLAB in symbolic integration

In python, I was trying to resolve this symbolic integral
t = symbols('t')
inte = sympy.Matrix( " long and complex expresion with many exp(t*numbers) in a matrix " )
Gam = sympy.integrate( inte , (t , 0 , 0.1 ) )
Gam = Gam.evalf()
Gam = np.array( Gam ) # to turn the expression into a numpy array
In Matlab, I was trying to resolve the same expression
syms t
inter = eval( int( inte , t , 0 , 0.1 ) ) % to turn the expression into a numerical array
The situation is this:
In Matlab, the result (before it is evaluated with "eval") contains terms of the form "cos(t)", and evaluating it gives an array with only real numbers. This is also the correct answer for the overall exercise.
But when I do the same task in Python, the symbolic result does not contain "cos(t)"; instead it has many "exp(I*t)" terms. So when I evaluate the result to get the array, I get an array of complex numbers.
Because of this, one could think "Matlab is better than Python at solving symbolic integrals".
My question is: is there some way to reach the same result in Python? I refuse to believe that it isn't possible to obtain the same result with some algebraic manipulation, or maybe by solving the integral in another way.
Of course, I have already tried the usual simplify and expand. If you want to run the Python code, here it is.
t = symbols('t')
init_printing(use_unicode=True)
tini = 0
tfin = 10
T = 1e-1
time = np.array( np.arange( tini , tfin , T ) , ndmin=2 )
A = np.array([[-8, 1, 0, ],
[-5, 0, 1, ],
[-6, 0, 0, ]])
B = np.array([[0],
[1],
[0]])
V, R = np.linalg.eig(A)  # eigenvalues in a vector, diagonalizing matrix
M = np.round(np.linalg.inv(R) @ A @ R, 4)  # diagonal matrix
ExpA_t = np.diag( e**(V*t) )
R = smp.Matrix(R)
B = smp.Matrix(B)
ExpA_t = smp.Matrix(ExpA_t)
inte = (R @ ExpA_t @ R**-1) @ B
Gam = integrate( inte , (t , tini , T) ) # here is the issue
Gam = np.array( Gam , dtype = complex )
Thanks.
The solution was to add expand to inte
expand(inte)
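As a small illustration of why a result full of "exp(I*t)" terms can still be numerically real, here is a minimal SymPy sketch. The integrand is a toy expression chosen for this example, not the matrix from the question, and rewrite(cos) is shown as one possible manipulation alongside expand.

```python
import math
import sympy as sp

t = sp.symbols('t')

# Toy integrand written with complex exponentials; mathematically it is cos(t)
integrand = (sp.exp(sp.I * t) + sp.exp(-sp.I * t)) / 2

# Integrate over the same interval as in the question, [0, 0.1]
raw = sp.integrate(integrand, (t, 0, sp.Rational(1, 10)))
value = complex(raw)  # numeric value; the imaginary part should be negligible

# Rewriting the exponentials in terms of cos/sin recovers the real form
rewritten = sp.simplify(integrand.rewrite(sp.cos))

print(raw)
print(value, math.sin(0.1))
print(rewritten)
```

If the imaginary parts are at round-off level, as here, taking the real part of the final NumPy array is also a reasonable way to get a real-valued result.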

Solving matrix differential equations in python with odeint?

I'm trying to solve the system:
dP/dt = AP
where,
A is the matrix [ [1,2], [3,4]]
P is the matrix [ [P1, P2], [P3, P4] ]
with initial condition P0 = [P10, P20, P30, P40] = [1,2,3,5]
I've implemented two different methods in python but I'm getting different answers.
The first is to split the system into 2 pairs of coupled ODEs and then find the eigenvalues and eigenvectors of A. These can then be substituted into a general solution:
Below is some of the code focusing on obtaining a solution for P1.
A = np.array([[1,2],[3,4]])
# extract eigenvalues
lambda1, lambda2, = np.linalg.eig(A)[0]
# extract eigenvectors
x1, x2 = np.linalg.eig(A)[1]
# combine eigenvectors and solve system at t=0 with initial condition to get constants
x1 = x1.reshape(2,1)
x2 = x2.reshape(2,1)
X = np.concatenate((x1,x2),axis=1)
P10 = 1
P30 = 3
Pinitial = np.array([P10, P30])
C1, C2 = np.linalg.solve(X, Pinitial)
# combine everything to get the solution to P1
t = np.linspace(0,3)
P1true = C1 * np.exp(lambda1 * t) * x1[0] + C2 * np.exp(lambda2 * t) * x2[0]
print(P1true)
My output to this implementation is the following:
[ 1.00000000e+00 4.90610966e-01 -1.96909212e-01 -1.13239194e+00
-2.41285490e+00 -4.17309019e+00 -6.60037624e+00 -9.95491931e+00
-1.45982562e+01 -2.10327188e+01 -2.99562678e+01 -4.23386834e+01
-5.95274276e+01 -8.33947363e+01 -1.16541999e+02 -1.62583733e+02
-2.26542159e+02 -3.15395444e+02 -4.38839463e+02 -6.10346242e+02
-8.48634612e+02 -1.17971363e+03 -1.63972185e+03 -2.27887232e+03
-3.16693407e+03 -4.40084834e+03 -6.11531116e+03 -8.49747721e+03
-1.18073904e+04 -1.64063715e+04 -2.27964611e+04 -3.16752246e+04
-4.40119010e+04 -6.11532094e+04 -8.49703621e+04 -1.18063333e+05
-1.64044684e+05 -2.27933923e+05 -3.16705455e+05 -4.40049940e+05
-6.11432160e+05 -8.49560890e+05 -1.18043123e+06 -1.64016232e+06
-2.27894028e+06 -3.16649669e+06 -4.39972081e+06 -6.11323640e+06
-8.49409776e+06 -1.18022094e+07]
My second implementation is to use scipy's odeint:
# function that defines dP/dt
def model(P,t):
    A = np.array([[1,2], [3,4]])
    P = np.array([[P[0], P[1]],[P[2],P[3]]])
    RHS = np.matmul(A,P)
    RHS = RHS.reshape(1,4)
    dPdt = RHS.tolist()
    return dPdt[0]
# initial condition
P0 = [1,2,3,5]
#time points
t = np.linspace(0,3)
#solve model
P = odeint(model,P0,t)
# print P1 solution
print(P[:,0])
For which I get the following output:
[1.00000000e+00 1.50619854e+00 2.20691051e+00 3.17795032e+00
4.52465783e+00 6.39339726e+00 8.98753443e+00 1.25896370e+01
1.75923199e+01 2.45411052e+01 3.41939721e+01 4.76041020e+01
6.62348466e+01 9.21194739e+01 1.28083127e+02 1.78051229e+02
2.47477996e+02 3.43941844e+02 4.77972676e+02 6.64201371e+02
9.22956950e+02 1.28248577e+03 1.78203503e+03 2.47613711e+03
3.44056260e+03 4.78059170e+03 6.64250701e+03 9.22956240e+03
1.28241709e+04 1.78187343e+04 2.47584791e+04 3.44009754e+04
4.77988372e+04 6.64146291e+04 9.22805262e+04 1.28220154e+05
1.78156830e+05 2.47541841e+05 3.43949539e+05 4.77904179e+05
6.64028793e+05 9.22641500e+05 1.28197351e+06 1.78125097e+06
2.47497704e+06 3.43888166e+06 4.77818858e+06 6.63910199e+06
9.22476676e+06 1.28174445e+07]
Which isn't the same as my first implementation? Can anyone see where I've gone wrong? I feel like the problem might be with my second implementation?
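One way to cross-check both implementations is against the closed-form solution P(t) = expm(A t) P(0). The sketch below is only a suggestion, with the variable names chosen for this example. It also uses the fact that np.linalg.eig returns the eigenvectors as the columns of its second output, so unpacking that array row by row (as in x1, x2 = np.linalg.eig(A)[1]) does not give the eigenvectors.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [3.0, 4.0]])
P0 = np.array([[1.0, 2.0], [3.0, 5.0]])  # [[P10, P20], [P30, P40]]
t_end = 3.0

# Reference solution via the matrix exponential
P_expm = expm(A * t_end) @ P0

# Eigen-decomposition: X[:, k] is the eigenvector for eigenvalue lam[k]
lam, X = np.linalg.eig(A)
C = np.linalg.solve(X, P0)                       # constants, one column per column of P0
P_eig = X @ (np.exp(lam * t_end)[:, None] * C)   # X diag(exp(lam t)) X^-1 P0

print("P1(3) via expm:", P_expm[0, 0])
print("P1(3) via eig: ", P_eig[0, 0])
```

If the two values agree with each other and with the last entry of the odeint output, the odeint implementation is the consistent one and the eigenvector extraction in the first implementation is the likely culprit.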

Scipy ODR results with huge relative errors for sd_beta

When running the ODR algorithm on some experimental data, I've been asked to use the following model (func1 in the code below):
It is clear that this fitting function contains a redundant degree of freedom.
When I run the fit on my experimental data I get enormous relative errors for beta, starting from 8000% relative error.
When I run the fit again with a fitting function that doesn't have a redundant degree of freedom (func2 in the code below), such as:
I don't get this kind of problem.
Why is this happening? Why is the ODR algorithm so sensitive to redundant degrees of freedom? I wasn't able to answer these questions for my supervisors. An answer will be much appreciated.
Reproducing code example:
from scipy.odr import RealData, Model, ODR
def func1(a, x):
    return a[0] * (x + a[1]) / (a[3] * (x + a[1]) + a[1] * x) + a[2]

def func2(a, x):
    return a[0] / (x + a[1]) + a[2]
# fmt: off
zx = [
1911.125, 2216.95, 2707.71, 3010.225, 3410.612, 3906.015, 4575.105, 5517.548,
6918.481,
]
dx = [
0.291112577, 0.321695254, 0.370771197, 0.401026507, 0.441068641, 0.490601621,
0.557573268, 0.651755155, 0.79184836,
]
zy = [
0.000998056, 0.000905647, 0.000800098, 0.000751041, 0.000699982, 0.000650532,
0.000600444, 0.000550005, 0.000500201,
]
dy = [
5.49029e-07, 5.02824e-07, 4.5005e-07, 4.25532e-07, 3.99991e-07, 3.75266e-07,
3.50222e-07, 3.25003e-07, 3.00101e-07,
]
# fmt: on
data = RealData(x=zx, y=zy, sx=dx, sy=dy)
print("Func 1")
print("======")
beta01 = [
1.46,
4775.4,
0.01,
1000,
]
model1 = Model(func1)
odr1 = ODR(data, model1, beta0=beta01)
result1 = odr1.run()
print("beta", result1.beta)
print("sd beta", result1.sd_beta)
print("relative", result1.sd_beta / result1.beta * 100)
print()
print()
print("Func 2")
print("======")
beta02 = [
1,
1,
1,
]
model2 = Model(func2)
odr2 = ODR(data, model2, beta0=beta02)
result2 = odr2.run()
print("beta", result2.beta)
print("sd beta", result2.sd_beta)
print("relative", result2.sd_beta / result2.beta * 100)
This prints out:
Func 1
======
beta [ 1.30884537e+00 -2.82585952e+03 7.79755196e-04 9.47943376e+01]
sd beta [1.16144608e+02 3.73765816e+06 6.12613738e-01 4.20775596e+03]
relative [ 8873.82193523 -132266.24068473 78564.88054498 4438.82627453]
Func 2
======
beta [1.40128121e+00 9.80844274e+01 3.00511669e-04]
sd beta [2.73990552e-03 3.22344713e+00 3.74538794e-07]
relative [0.1955286 3.28640051 0.12463369]
Scipy/Numpy/Python version information:
Versions are:
Scipy - 1.4.1
Numpy - 1.18.2
Python - 3.7.2
The problem is not with the degrees of freedom.
The number of degrees of freedom is the difference between the number of data points and the number of fitting parameters.
The two formulae have almost the same number of degrees of freedom: with 9 data points, the first has 4 parameters and the second has 3.
In both cases there are more data points than parameters, which is good news: it means the model can potentially be fitted.
However, you are right that the first expression has a problem: the parameters you are trying to fit are not independent.
This is probably better understood with a simpler example.
Consider the following expression:
y = x + b + c
which you try to fit, given n data points for x and y with n >> 2.
The question is: what are the optimal values for b and c? This cannot be answered. All the x and y data can tell you about is the combination b + c. Therefore, if b + c is 0, the fit cannot tell us whether b = 1000, c = -1000 or b = 1, c = -1, but at least we can say that given b we can determine c.
What is the error on a given b? Potentially infinite. That is the reason for the fitting to give you that large relative error.
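The degeneracy in the toy model y = x + b + c can be made concrete with a minimal NumPy sketch (the sample points are arbitrary and chosen only for illustration). The Jacobian of the model with respect to (b, c) has two identical columns, so the normal-equations matrix J^T J is singular and its inverse, which is what a parameter covariance estimate is built from, does not exist.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)  # arbitrary sample points

# dy/db = 1 and dy/dc = 1 at every data point, so the two columns are identical
J = np.column_stack([np.ones_like(x), np.ones_like(x)])
rank = np.linalg.matrix_rank(J)
JtJ = J.T @ J
det = np.linalg.det(JtJ)

# Reparameterized model y = x + d with d = b + c: one parameter, full rank
J_single = np.ones((x.size, 1))
rank_single = np.linalg.matrix_rank(J_single)

print("rank with (b, c):", rank, "of 2 parameters, det(J^T J) =", det)
print("rank with d = b + c:", rank_single, "of 1 parameter")
```

In a real fit the redundancy is usually only approximate, so the matrix is nearly singular rather than exactly singular, and the reported standard deviations come out very large instead of infinite.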

Manual AIC calculation vs LassoLarsIC

I am trying to calculate AIC manually, but my function gives different scores compared to the LassoLarsIC score. Can someone tell me what is wrong with my calculation?
Here is my function:
def aic(y_pred, y, k):
    ll = (-1/(2*np.var(y)))*np.sum((y_pred-y)**2) - (len(y)/2)*np.log(np.var(y)) - (len(y)/2)*np.log(2*np.pi)
    return -2*ll + 2*k
Thanks a lot
Edit:
My example is simple, here is the complete code:
X = np.array([0, 0.1111, 0.2222, 0.3333, 0.4444, 0.5556, 0.6667, 0.7778, 0.8889, 1]).reshape(-1, 1)
y = np.array([0.0528, 0.798 , 0.8486, 0.8719, 0.1732, -0.3629, -0.7528, -0.9985, -0.6727, -0.1197]).reshape(-1, 1)
poly = plf(9)
F = poly.fit_transform(X)[:, 1:]
scl = StandardScaler()
F = scl.fit_transform(F)
aic_lasso = LassoLarsIC(normalize=False)
aic_lasso.fit(F, y)
aic_lasso.criterion_
Output:
array([10. , 7.29642036, 8.9544056 , 7.06390981, 6.14233987,
7.96489293, 7.76894903, 7.61736515, 7.39575925, 7.25866825,
7.01418447, 6.90314784, 6.6465343 , 6.60361937, 8.12547536,
8.09620652, 8.09610375, 10.09599191, 12.0959849 , 12.09597075,
12.09596367, 12.09579736, 10.09579645, 10.09579616, 12.09579393,
12.09579199, 12.09579079, 14.09541338, 16.01988119])
y_pred = aic_lasso.predict(F)
aic(y_pred, y, 2)
Output:
146.42615433502792
K is 2 because lasso sets the other coefficients to 0.
I guess this answer arrives way too late, but the mistake is that you use var(y) instead of std(residuals).
This will work:
def aic(resid, nparams):
    n = len(resid)
    sig = np.std(resid)
    ll = -n*np.log(sig*np.sqrt(np.pi*2)) - np.sum((resid / sig)**2)/2
    return float(2*nparams - 2*ll)
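As a sanity check on the log-likelihood in this function, the sketch below compares it with the sum of Gaussian log-densities from scipy.stats.norm, using the standard deviation of the residuals as the scale. The residuals are synthetic (random numbers with an assumed seed and scale), so this only verifies the formula; it does not reproduce the LassoLarsIC values from the question.

```python
import numpy as np
from scipy.stats import norm

def aic(resid, nparams):
    n = len(resid)
    sig = np.std(resid)
    ll = -n * np.log(sig * np.sqrt(2 * np.pi)) - np.sum((resid / sig) ** 2) / 2
    return float(2 * nparams - 2 * ll)

# Synthetic residuals, for illustration only
rng = np.random.default_rng(0)
resid = rng.normal(scale=0.3, size=50)

# Reference: Gaussian log-likelihood with zero mean and scale = std(resid)
ll_ref = norm.logpdf(resid, loc=0.0, scale=np.std(resid)).sum()
aic_ref = 2 * 2 - 2 * ll_ref

print(aic(resid, 2), aic_ref)
```

With the fitted model from the question, the call would be aic(y - y_pred, k), where k is the number of non-zero coefficients.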

How to real-time filter with scipy and lfilter?

Disclaimer: I am probably not as good at DSP as I should be and therefore have more issues than I should have getting this code to work.
I need to filter incoming signals as they happen. I tried to make this code to work, but I have not been able to so far.
Referencing scipy.signal.lfilter doc
import numpy as np
import scipy.signal
import matplotlib.pyplot as plt
from lib import fnlib
samples = 100
x = np.linspace(0, 7, samples)
y = [] # Unfiltered output
y_filt1 = [] # Real-time filtered
nyq = 0.5 * samples
f1_norm = 0.1 / nyq
f2_norm = 2 / nyq
b, a = scipy.signal.butter(2, [f1_norm, f2_norm], 'band', analog=False)
zi = scipy.signal.lfilter_zi(b,a)
zi = zi*(np.sin(0) + 0.1*np.sin(15*0))
This sets zi to zi*y[0] initially, which in this case is 0. I took it from the example code in the lfilter documentation, but I am not sure whether it is correct.
Then comes the point where I am not sure what to do with the first few samples.
The coefficient arrays a and b both have length 5 here (len(a) = 5).
Since lfilter takes input values from now back to n-4, do I pad the input with zeros, or do I need to wait until 5 samples have gone by and take them as a single block, then continue sampling each next step in the same way?
for i in range(0, len(a)-1): # Append 0 as initial values, wrong?
    y.append(0)
step = 0
for i in xrange(0, samples): #x:
    tmp = np.sin(x[i]) + 0.1*np.sin(15*x[i])
    y.append(tmp)
    # What to do with the initial filterings until len(y) == len(a) ?
    if (step> len(a)):
        y_filt, zi = scipy.signal.lfilter(b, a, y[-len(a):], axis=-1, zi=zi)
        y_filt1.append(y_filt[4])
print(len(y))
y = y[4:]
print(len(y))
y_filt2 = scipy.signal.lfilter(b, a, y) # Offline filtered
plt.plot(x, y, x, y_filt1, x, y_filt2)
plt.show()
I think I had the same problem, and found a solution on https://github.com/scipy/scipy/issues/5116:
import numpy as np
from scipy import signal

def filter_sbs():
    data = np.random.random(2000)
    b = signal.firwin(150, 0.004)
    z = signal.lfilter_zi(b, 1) * data[0]
    result = np.zeros(data.size)
    for i, x in enumerate(data):
        # Filter one sample and carry the filter state z to the next call
        y, z = signal.lfilter(b, 1, [x], zi=z)
        result[i] = y[0]
    return result

if __name__ == '__main__':
    result = filter_sbs()
The idea is to pass the filter state z in each subsequent call to lfilter. For the first few samples the filter may give strange results, but later (depending on the filter length) it starts to behave correctly.
The problem is not how you are buffering the input. The problem is that in one version the state of the filter is initialized using lfilter_zi, which computes the internal state of an LTI system so that the output is already in steady state when new samples arrive at the input, while in the other version this step is skipped, so the filter's initial state is 0. You can either initialize both versions using lfilter_zi or initialize both to 0. Then it doesn't matter how many samples you filter at a time.
Note, if you initialize to 0, the filter will 'ring' for a certain amount of time before reaching a steady state. In the case of FIR filters, there is an analytic solution for determining this time. For many IIR filters, there is not.
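For the FIR case, the transient can be checked directly. The minimal sketch below assumes the same firwin(150, 0.004) filter used in the answers here and a constant input chosen for illustration: a zero-initialized filter and an lfilter_zi-initialized filter differ only during the first len(b) - 1 samples, after which the FIR delay line is full and the outputs coincide.

```python
import numpy as np
from scipy import signal

b = signal.firwin(150, 0.004)   # FIR low-pass filter with 150 taps
x = np.ones(400)                # constant input (assumed, for illustration)

# Zero initial state: the output ramps up while the delay line fills
y_zero = signal.lfilter(b, 1, x)

# Steady-state initial condition scaled by the first input sample
zi = signal.lfilter_zi(b, 1) * x[0]
y_steady, _ = signal.lfilter(b, 1, x, zi=zi)

n_transient = len(b) - 1        # FIR transient length in samples
print(np.max(np.abs(y_zero[:n_transient] - y_steady[:n_transient])))
print(np.max(np.abs(y_zero[n_transient:] - y_steady[n_transient:])))
```

For an IIR filter the difference only decays asymptotically, so there is no such exact cutoff.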
The following is correct. For simplicity's sake I initialize to 0 and feed the input one sample at a time. However, any non-zero block size will produce equivalent output.
import numpy as np
from scipy import signal

def filter_sbs(data, b):
    # Zero initial state: an FIR filter with len(b) taps has len(b) - 1 state values
    z = np.zeros(b.size-1)
    result = np.zeros(data.size)
    for i, x in enumerate(data):
        y, z = signal.lfilter(b, 1, [x], zi=z)
        result[i] = y[0]
    return result

def filter(data, b):
    result = signal.lfilter(b,1,data)
    return result

if __name__ == '__main__':
    data = np.random.random(20000)
    b = signal.firwin(150, 0.004)
    result1 = filter_sbs(data, b)
    result2 = filter(data, b)
    print(result1 - result2)
Output:
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ... -5.55111512e-17
0.00000000e+00 1.66533454e-16]
