python lmfit, non-modifiable array as parameter for Model class? - python

I am trying to fit a function that returns the Pearson correlation coefficient between two arrays. These two arrays are passed to the function as input parameters, but they do not change; the function should treat them as constants. I found an option that fixes a parameter, i.e., it cannot vary, but it works only for scalar values.
When I call Model.make_params(), the Model class tries to check whether these arrays are lower or greater than the minimum/maximum. This check is not needed, as they are constants.
My function:
def __lin_iteration2__(xref, yref_scaled, xobs, yobs, slope, offset, verbose=False, niter=None):
    Acal = 1 + (offset + slope*xref)/xref
    xr_new = xref * Acal
    obs_interp1d = interp1d(xobs, yobs, kind='cubic')
    yobs_new = scale_vector(obs_interp1d(xr_new))
    rho = Pearson(yref_scaled, yobs_new)
    return rho
Here xref, yref_scaled, xobs and yobs are arrays that do not change, i.e., constants. 'interp1d' is the interpolator from scipy.interpolate, 'scale_vector' scales a vector between -1 and 1, and 'Pearson' calculates the Pearson correlation coefficient.
How I set up the Model class:
m = Model(corr.__lin_iteration3__)
par = m.make_params(yref_scaled = corr.yref_scaled, \
obs_interp1d=corr.obs_interp1d, offset=0, scale=0)
par['yref_scaled'].vary = False
par['obs_interp1d'].vary = False
r = m.fit
The error I got (just in the second line when I call the 'make_params' function of Model Class):
Traceback (most recent call last):
File "<ipython-input-3-c8f6550e831e>", line 1, in <module>
runfile('/home/andrey/Noveltis/tests/new_correl_sp/new_correl.py', wdir='/home/andrey/Noveltis/tests/new_correl_sp')
File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/andrey/Noveltis/tests/new_correl_sp/new_correl.py", line 264, in <module>
obs_interp1d=corr.obs_interp1d, offset=0, scale=0)
File "/usr/lib/python3/dist-packages/lmfit/model.py", line 401, in make_params
params.add(par)
File "/usr/lib/python3/dist-packages/lmfit/parameter.py", line 338, in add
self.__setitem__(name.name, name)
File "/usr/lib/python3/dist-packages/lmfit/parameter.py", line 145, in __setitem__
self._asteval.symtable[key] = par.value
File "/usr/lib/python3/dist-packages/lmfit/parameter.py", line 801, in value
return self._getval()
File "/usr/lib/python3/dist-packages/lmfit/parameter.py", line 786, in _getval
if self._val > self.max:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In lmfit, arguments to the model function are expected to be scalar, floating point parameter values, except for "independent variables" which can be any python objects. By default, the first function argument is assumed to be an independent variable, as is any keyword argument with a non-numeric default value. But, you can specify which arguments are the independent variables (and there can be more than one) when creating your model.
I think what you want is:
m = Model(corr.__lin_iteration3__, independent_vars=['xref', 'yref_scaled', 'xobs', 'yobs'])
But also: you can pass in any Python object as an independent variable, so you could pack your ref and obs data into another structure and do something like
def lin_iteration(Data, slope, offset, verbose=False, niter=None):
    Acal = 1 + (offset + slope*Data['xref'])/Data['xref']
    xr_new = Data['xref'] * Acal
    # or maybe that would be clearer as just
    # xr_new = offset + (1+slope)* Data['xref']
    obs_interp1d = interp1d(Data['xobs'], Data['yobs'], kind='cubic')
    yobs_new = scale_vector(obs_interp1d(xr_new))
    rho = Pearson(Data['yref_scaled'], yobs_new)
    return rho
and
m = Model(lin_iteration)
# parameter names must match the function arguments (slope, offset)
par = m.make_params(offset=0, slope=0)
Data = {'xref': xref, 'yref_scaled': yref_scaled, 'xobs': xobs, 'yobs': yobs}
result = m.fit(Data, par)
Of course, that's all untested but it might make your life easier...

Related

List handling in GEKKO python

I'm currently working on a model of a distillation flask for a university project. The physical problem is described by a DAE system, and I'm trying to solve it using GEKKO.
I'm facing a problem with list handling:
In this case I built a function that outputs the compressibility factor of a mixture; it requires as inputs three GEKKO variables T1, x, y (x and y are arrays).
zv1 = m.Param(value=ZCALC(n,comps,R0,p,T1.value,x,y))
m = GEKKO()
y = m.Array(m.Var,n,value=0.)
x = m.Array(m.Var,n,value=0.)
for i in range(n):
    y[i].value = y0[i]
    x[i].value = x0[i]
T1 = m.Var(value=3.31513478e+02, lb=300, ub=900)
If I leave the three values as they are, I receive errors like:
File "F:\Codice_GEKKO\D86_GEKKO.py", line 113, in <module>
zv1 = m.Param(value=ZCALC(n,comps,R0,p,T1.value,x,y))
File "F:\Codice_GEKKO\compressibilityfactor.py", line 48, in ZCALC
zv=np.max(np.roots([1,-1,(Av-Bv-Bv**2),-Av*Bv]))
File "<__array_function__ internals>", line 6, in roots
File "C:\Users\verci\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\polynomial.py", line 222, in roots
non_zero = NX.nonzero(NX.ravel(p))[0]
File "<__array_function__ internals>", line 6, in nonzero
File "C:\Users\verci\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\fromnumeric.py", line 1908, in nonzero
return _wrapfunc(a, 'nonzero')
File "C:\Users\verci\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\fromnumeric.py", line 67, in _wrapfunc
return _wrapit(obj, method, *args, **kwds)
File "C:\Users\verci\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\fromnumeric.py", line 44, in _wrapit
result = getattr(asarray(obj), method)(*args, **kwds)
File "C:\Users\verci\AppData\Local\Programs\Python\Python36\lib\site-packages\gekko\gk_operators.py", line 25, in __len__
return len(self.value)
File "C:\Users\verci\AppData\Local\Programs\Python\Python36\lib\site-packages\gekko\gk_operators.py", line 144, in __len__
return len(self.value)
TypeError: object of type 'int' has no len()
Traceback (most recent call last):
File "F:\Codice_GEKKO\D86_GEKKO.py", line 113, in <module>
zv1 = m.Param(value=ZCALC(n,comps,R0,p,T1,x,y))
File "F:\Codice_GEKKO\compressibilityfactor.py", line 27, in ZCALC
(1-np.sqrt(t/tc[ii])))**2
TypeError: loop of ufunc does not support argument 0 of type GK_Operators which has no callable sqrt method
The first error is caused by x and y not being lists but GEKKO arrays, and the second error is due to T1 not being a float (t=T1).
I found out that by using T1.value I can avoid the second error, but I still get the first one.
I have read the GEKKO documentation, but I haven't been able to find a method to obtain a "standard" Python list from a GEKKO array.
Thank you in advance for your help.
There are two different methods for obtaining the value of zv.
Option 1: Initialization Calculation
The first method is to use floating point numbers to obtain a single calculation that can be used for initialization of a parameter. This first method allows any type of functions such as np.roots() or np.sqrt(). The function ZCALC() returns a floating point number. Even though Gekko variables are used as an input, the floating point number is accessed from a scalar variable with T1.value or from an array variable with x[i].value.
def ZCALC(n,comps,R0,p,T1,x,y):
    # using the initialized values
    t = T1.value
    i = 0 # select values from x,y arrays
    x1 = x[i].value
    y1 = y[i].value
    print('t,x[0],y[0] initialized values')
    print(t,x1,y1)
    # include equations for compressibility factor
    w = (1-np.sqrt(t/300))**2
    z = np.max(np.roots([1,-1,x1**2,x1*y1]))
    # original equations from question
    #(1-np.sqrt(t/tc[ii])))**2
    #zv=np.max(np.roots([1,-1,(Av-Bv-Bv**2),-Av*Bv]))
    return z
zv1 = m.Param(value=ZCALC(n,comps,R0,p,T1,x,y))
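Option 1 comes down to calling np.roots on plain floats. The sketch below uses the cubic from the question's commented-out line, z^3 - z^2 + (Av - Bv - Bv^2) z - Av*Bv = 0, with made-up values for Av and Bv. The helper largest_real_root is hypothetical, and unlike the code above it discards complex roots before taking the maximum, which is an added assumption about what is wanted.

```python
import numpy as np


def largest_real_root(coeffs):
    """Return the largest real root of a polynomial given by its coefficients."""
    roots = np.roots(coeffs)
    # Keep only roots whose imaginary part is numerically zero.
    real_roots = roots[np.isclose(roots.imag, 0.0)].real
    return float(real_roots.max())


# Hypothetical values, chosen only to illustrate the call.
Av, Bv = 0.1, 0.02
coeffs = [1.0, -1.0, Av - Bv - Bv**2, -Av * Bv]

zv = largest_real_root(coeffs)
print(zv)
```

The resulting float can then be passed as the value of m.Param, as in the line above.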
Option 2: Implicit Calculation
If the compressibility factor needs to change as T1 and x,y change then use Gekko variables so that the model is compiled with that dependency. The functions are only called during problem initialization. Gekko needs the equations with specific Gekko functions to enable automatic differentiation to provide gradients to the solvers.
def ZCALC2(n,comps,R0,p,T1,x,y):
    # using gekko variables
    t = T1
    i = 0
    x1 = x[i] # use index to x array
    y1 = y[i] # use index to y array
    # use Gekko equations, not Numpy
    w = (x1/y1)*(1-m.sqrt(t/300))**2
    # set lower bound to get the maximum root
    zv = m.Var(value=ZCALC(n,comps,R0,p,T1,x,y),lb=10)
    # solve for roots of eq with gekko, not with np.roots
    eq = 1-zv+x1**2*zv+x1*y1*zv**3
    m.Equation(eq==0)
    return zv
zv2 = ZCALC2(n,comps,R0,p,T1,x,y)
Here is a script that shows the two methods:
import numpy as np
from gekko import GEKKO
m=GEKKO(remote=False)
def ZCALC(n,comps,R0,p,T1,x,y):
    # using the initialized values
    t = T1.value
    i = 0 # select values from x,y arrays
    x1 = x[i].value
    y1 = y[i].value
    print('t,x[0],y[0] initialized values')
    print(t,x1,y1)
    # include equations for compressibility factor
    w = (1-np.sqrt(t/300))**2
    z = np.max(np.roots([1,-1,x1**2,x1*y1]))
    # original equations from question
    #(1-np.sqrt(t/tc[ii])))**2
    #zv=np.max(np.roots([1,-1,(Av-Bv-Bv**2),-Av*Bv]))
    return z
def ZCALC2(n,comps,R0,p,T1,x,y):
    # using gekko variables
    t = T1
    i = 0
    x1 = x[i] # use index to x array
    y1 = y[i] # use index to y array
    # use Gekko equations, not Numpy
    w = (x1/y1)*(1-m.sqrt(t/300))**2
    # set lower bound to get the maximum root
    zv = m.Var(value=ZCALC(n,comps,R0,p,T1,x,y),lb=10)
    # solve for roots of eq with gekko, not with np.roots
    eq = 1-zv+x1**2*zv+x1*y1*zv**3
    m.Equation(eq==0)
    return zv
n = 3
y = m.Array(m.Var,n)
x = m.Array(m.Var,n)
x0 = [0.1,0.2,0.3]
y0 = [0.15,0.25,0.35]
for i in range(n):
    y[i].value = y0[i]
    x[i].value = x0[i]
T1 = m.Var(value=331, lb=300, ub=900)
comps = ['C2=','C3=','C2H8']; R0 = 8.314; p=10
# define Zv from initialized values (fixed parameter)
zv1 = m.Param(value=ZCALC(n,comps,R0,p,T1,x,y))
# define Zv from Gekko variables (updates with T1,x,y changes)
zv2 = ZCALC2(n,comps,R0,p,T1,x,y)
# initialized value of zv1 does not update with changes in T1,x,y
# initialized value of zv2 does update with changes in T1,x,y
print('initialized value of zv1, zv2')
print(zv1.value,zv2.value)
If the compressibility factor correlations can't be expressed as Gekko equations then try the cspline for 1D or bspline for 2D functions to create an approximation. You may be able to use the bspline function if compressibility can depend on just 2 variables T and x (replace y with an explicit calculation of x).

Is there a way for scipy.integrate.quad to accept arrays in args?

I have a function:
def xx(th, T, B):
    f = integrate.quad(xint, 0, np.inf, args = (th, T, B))[0]
    a = v(th)*f
    return a
where xint is a function of functions of p, th, T, B. All the preceding functions work well; xx(th, T, B) should then be integrated over th and the other variables are single numbers.
When I run this I get: TypeError: only size-1 arrays can be converted to Python scalars because th is an array rather than a single number.
I've tried using lambda functions, and also dblquad to do both integrals in the same calculation, but nothing has worked. Given the integration limits, I would like to avoid a for loop; is there any way of getting integrate.quad to accept an array argument?
Traceback when run:
run file.py
Traceback (most recent call last):
File "file.py", line 280, in <module>
file()
File "file.py", line 240, in sctif
axes[0, 2].plot(th, xx(th,10, 10))
File "file.py", line 166, in xx
f, _ = integrate.quad(xint, 0, 100,args = (th,T,B))
File "/home/caitlin/anaconda3/lib/python3.7/site-packages/scipy/integrate/quadpack.py", line 352, in quad
points)
File "/home/caitlin/anaconda3/lib/python3.7/site-packages/scipy/integrate/quadpack.py", line 463, in _quad
return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
TypeError: only size-1 arrays can be converted to Python scalars
I never really got the usefulness of the args option. IMHO, the code becomes clearer if you define a function that accepts only one argument, perhaps by wrapping:
th = 2.0
T = 1.0
B = 3.0
def xint(x):
    return th * x ** B / T
f, _ = integrate.quad(xint, 0, np.inf)
Now, if one of th, T, B is a vector, your function is vector-valued, and you cannot use quad anymore. I'd look into quad_vec or quadpy.quad. (quadpy is a project of mine.)
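As a rough illustration of the vector-valued case, the sketch below integrates a stand-in integrand, exp(-th * p), whose integral from 0 to infinity is known to be 1/th. The integrand and the values of th are assumptions, not the asker's xint. It compares one scalar quad call per element of th with a single call to scipy.integrate.quad_vec, which is available in SciPy 1.4 and later.

```python
import numpy as np
from scipy import integrate


def integrand(p, th):
    """Stand-in integrand (hypothetical); works for scalar or array th."""
    return np.exp(-th * p)


th = np.array([0.5, 1.0, 2.0])

# Option A: scalar quad, called once per element of th.
loop_result = np.array(
    [integrate.quad(integrand, 0, np.inf, args=(t,))[0] for t in th]
)

# Option B: one vector-valued integration over the whole th array.
vec_result, vec_error = integrate.quad_vec(lambda p: integrand(p, th), 0, np.inf)

print(loop_result)
print(vec_result)
```

Both results should agree with the analytic value 1/th to within the integration tolerance.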

RuntimeError: Can't call numpy() on Variable that requires grad. Use var.detach().numpy() instead

I'm new to pytorch. I'm trying to implement a custom loss function by computing the absolute and relative distance and concatenating them.
def distance(p1, p2,labels):
    """
    Returns the distance between the point sets p1 and p2
    p1 = m by d matrix containing a set of points
    p2 = m by d matrix containing a different set of points
    returns: an m-length vector containing the distance from each point in
    p1 to the corresponding point in p2
    """
    if not np.all(p1.shape == p2.shape):
        raise ValueError("p1 and p2 must be the same shape.")
    d = p1.shape[1]
    features = np.zeros(dtype=np.float32, shape=(p1.shape[0], d * 2))
    features[:, :d] = np.abs(p1 - p2)
    features[:, d:] = (p1 + p2) / 2
    return features
The problem appears when I try to run it and I get the following error:
File "... line 34, in distance
features[:, :d] = np.abs(p1 - p2)
File "...\Anaconda2\envs\tensorflow_gpuenv\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "...\lib\site-packages\tensorflow\python\ops\math_ops.py", line 266, in abs
x = ops.convert_to_tensor(x, name="x")
File "...\lib\site-packages\tensorflow\python\framework\ops.py", line 1087, in convert_to_tensor
return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
RuntimeError: Can't call numpy() on Variable that requires grad. Use var.detach().numpy() instead.
The error tells you exactly what is going wrong: you first need to convert the PyTorch tensors into NumPy arrays, which you can do with .detach().numpy().
So at the top of your function you can put:
def distance(p1, p2, labels):
    p1 = p1.detach().numpy()
    p2 = p2.detach().numpy()
    labels = labels.detach().numpy()
Be aware, though, that tensors are meant to speed up calculations, and converting them to arrays increases computation time. Detaching also cuts the result off from autograd, so no gradients flow back through it. Using PyTorch functions, as suggested in the comments, is therefore the better approach.
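A PyTorch-only version of the same feature computation might look like the sketch below. The name distance_features is hypothetical, the unused labels argument has been dropped, and the small input tensors are made up for illustration. Because every operation is a torch operation, the result stays attached to the autograd graph.

```python
import torch


def distance_features(p1, p2):
    """Concatenate |p1 - p2| and (p1 + p2) / 2 along the feature axis."""
    if p1.shape != p2.shape:
        raise ValueError("p1 and p2 must be the same shape.")
    # torch.cat joins the two (m, d) blocks into one (m, 2*d) tensor.
    return torch.cat([torch.abs(p1 - p2), (p1 + p2) / 2], dim=1)


# Made-up inputs; p1 requires gradients, as a model output would.
p1 = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
p2 = torch.tensor([[0.5, 2.5], [3.0, 1.0]])

features = distance_features(p1, p2)
features.sum().backward()  # gradients flow back to p1
print(features.shape, p1.grad is not None)
```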

Classifier.fit for oneclassSVM complaining about float Type. TypeError float is required

I'm trying to fit two One-Class SVMs to two small sets of data, called m1 and m2 respectively. m1 and m2 are lists of decimals which are converted to NumPy float arrays t1 and t2.
When I attempt to fit the one-class SVMs to these data sets, I get errors saying that the fit function will only accept a float. Can someone help me fix this problem?
Example Values:
m1 =[0.020000000000000018, 0.22799999999999998, 0.15799999999999992, 0.18999999999999995, 0.264]
m2 = [0.1279999999999999, 0.07400000000000007, 0.75, 1.0, 1.0]
Code below:
classifier1 =sklearn.svm.OneClassSVM(kernel='linear', nu ='0.5',gamma ='auto')
classifier2 = sklearn.svm.OneClassSVM(kernel='linear', nu ='0.5',gamma='auto')
for x in xrange(len(m1)):
    print" Iteration "+str(x)
    t1.append(float(m1[x]))
    t2.append(float(m2[x]))
tx = np.array(t1).astype(float)
ty = np.array(t2).astype(float)
t1 = np.r_[tx+1.0,tx-1.0]
t2 = np.r_[ty+1.0,ty-1.0]
print t1
print t2
clfit1 = classifier1.fit(t1.astype(float))
clfit2 = classifier2.fit(t2.astype(float))
Error on commandline:
/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)
Traceback (most recent call last):
File "normalize_data.py", line 108, in <module>
main()
File "normalize_data.py", line 15, in main
trainSVM(result1[0],yval1,result2[0],yval2,0.04)
File "normalize_data.py", line 99, in trainSVM
clfit1 = classifier1.fit(t1.astype(float))
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/classes.py", line 1029, in fit
**params)
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 193, in fit
fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 251, in _dense_fit
max_iter=self.max_iter, random_seed=random_seed)
File "sklearn/svm/libsvm.pyx", line 59, in sklearn.svm.libsvm.fit (sklearn/svm/libsvm.c:1571)
TypeError: a float is required
I made an error and set nu as a string instead of a float.
Setting nu to a float, e.g. nu=0.05, fixes the problem.
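For reference, a minimal Python 3 sketch of the corrected call, using the m1 values from the question. It passes nu as a float and, as the DeprecationWarning in the traceback suggests, reshapes the 1-D data into a single-feature column. The value nu=0.5 is taken from the question's code and is otherwise arbitrary.

```python
import numpy as np
from sklearn.svm import OneClassSVM

m1 = [0.020000000000000018, 0.22799999999999998, 0.15799999999999992,
      0.18999999999999995, 0.264]

# One sample per row, one feature per column.
X = np.asarray(m1, dtype=float).reshape(-1, 1)

# nu must be a float, not the string '0.5'.
classifier = OneClassSVM(kernel='linear', nu=0.5)
classifier.fit(X)

# predict returns +1 for inliers and -1 for outliers.
labels = classifier.predict(X)
print(labels)
```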

Python: How to use the function nd.Hessiandiag on a complex function

I want to use the Hessdiag class from the numdifftools package to get the diagonal elements of a Hessian matrix at the optimal parameters that minimize a function.
Here's a simple example of the usage of Hessdiag, taken from the numdifftools developers' website:
import numpy as np
import numdifftools as nd
fun = lambda x : x[0] + x[1]**2 + x[2]**3
ddfun = lambda x : np.asarray((0, 2, 6*x[2]))
Hfun = nd.Hessdiag(fun)
hd = Hfun([1,2,3]) # HD = [ 0,2,18]
Say that my function that I want to get the Hessian matrix is too complex to be written using lambda. My function is stored in another file under the name Latent (using the def Latent(x1, x2, x3) command). I can't do the following:
from Latent import Latent # That's my function
import numpy as np
import numdifftools as nd
fun = lambda x1, x2, x3 : Latent(x1, x2, x3)
Hfun = nd.Hessdiag(fun)
hd = Hfun([np.array([1.2, 1.5, 2]), np.array(3.4, 5), 6]) # three parameters
...this doesn't work...
This is the error:
raise ValueError('%s must be scalar, one of [1 2 3 4].' % name)
ValueError: n must be scalar, one of [1 2 3 4].
How can I use nd.Hessdiag with my complex function without using lambda?
UPDATE
I have also tried this:
fun = lambda x: Latent(x[0], x[1], x[2])
Hfun = nd.Hessdiag(fun)
hd = Hfun([x1, x2, x3])
I get this error:
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 601, in runfile
execfile(filename, namespace)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 66, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "F:/dropbox/Dropbox/Research/Fisher Martineau Sheng/SEC/codes/Python Latent Factor/V1/Main_v1.py", line 101, in <module>
hd = Hfun(np.array([est_param, allret_.T, n, capt, num_treat]))
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\numdifftools\core.py", line 1161, in __call__
return self.hessdiag(x)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\numdifftools\core.py", line 1171, in hessdiag
dder, self.error_estimate, self.final_delta = self._partial_der(x)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\numdifftools\core.py", line 830, in _partial_der
self._x = np.asarray(x0, dtype=float)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\numpy\core\numeric.py", line 460, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
