Machine learning and optimizing with SciPy - Python

I am coding a machine learning exercise, and to optimize my cost function I used scipy.optimize.minimize, but SciPy doesn't return the right answer. What should I do?
code:
data1 = pd.read_csv('ex2data1.txt', header=None, names=['exam1', 'exam2', 'y'])
data1['ones'] = pd.Series(np.ones(100), dtype=int)
data1 = data1[['ones', 'exam1', 'exam2', 'y']]
X = np.matrix(data1.iloc[:, 0:3])
y = np.matrix(data1.iloc[:, 3:])

def gFunction(z):
    return sc.special.expit(-z)

def hFunction(theta, X):
    theta = np.matrix(theta).T
    h = np.matrix(gFunction(X.dot(theta)))
    return h

def costFunction(theta, X, y):
    m = y.size
    h = hFunction(theta, X).T
    j = (-1 / m) * (np.dot(np.log(h), y) + np.dot(np.log(1 - h), (1 - y)))
    return j

def gradientDescent(theta, X, y):
    theta = np.matrix(theta)
    m = y.size
    h = hFunction(theta, X)
    gradient = (1 / m) * X.T.dot(h - y)
    return gradient.flatten()

initial_theta = np.zeros(X.shape[1])
cost = costFunction(initial_theta, X, y)
grad = gradientDescent(initial_theta, X, y)
print('Cost: \n', cost)
print('Grad: \n', grad)
Cost:
[[ 0.69314718]]
Grad:
[[ -0.1 -12.00921659 -11.26284221]]
def optimizer(costFunction, theta, X, y, gradientDescent):
    optimum = sc.optimize.minimize(costFunction, theta, args=(X, y),
                                   method=None, jac=gradientDescent,
                                   options={'maxiter': 400})
    return optimum
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 0.693147
Iterations: 0
Function evaluations: 106
Gradient evaluations: 94
Out[46]:
fun: matrix([[ 0.69314718]])
hess_inv: array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
jac: matrix([[ -0.1 , -12.00921659, -11.26284221]])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 106
nit: 0
njev: 94
status: 2
success: False
x: array([ 0., 0., 0.])
This is the output with the message that says success: False. I think I have done everything right; I don't know what's happening.

It's hard to debug something like this when:

the code is not reproducible because of external data
the question does not even try to explain what is being optimized here

There are some strange design decisions:

use of np.matrix -> use np.array instead!
don't call the Jacobian gradientDescent
And then, regarding your observation:
Iterations: 0
Function evaluations: 106
Gradient evaluations: 94
Zero iterations while doing that many function evaluations is a very bad sign. Something is very broken. Probably the line search is going crazy, but that's just a guess.
Now what's broken?

Your Jacobian is definitely broken! I did not check the math, but: your Jacobian's shape depends on the number of samples while the number of variables is fixed -> no! That does not make sense!
Steps to take:

run with jac=False
    if it works: your cost function looks OK
    if it does not: your trouble probably (no proof) starts even there
repair the Jacobian!
check the Jacobian against scipy.optimize.check_grad (a sketch follows below)
I wonder why you don't get any shape errors here; I do when I try to mimic your input shapes and play around with the sample size!
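As a minimal sketch of that last step (my own code, not the asker's; it assumes plain np.array inputs, a 1-D theta, and synthetic data in place of ex2data1.txt), scipy.optimize.check_grad compares an analytic gradient against a finite-difference approximation:

import numpy as np
from scipy.special import expit
from scipy.optimize import check_grad

def cost(theta, X, y):
    # logistic-regression cost; note expit(z), not expit(-z)
    m = y.size
    h = expit(X.dot(theta))
    return (-1 / m) * (y.dot(np.log(h)) + (1 - y).dot(np.log(1 - h)))

def grad(theta, X, y):
    # gradient has shape (n_features,) regardless of the sample count m
    m = y.size
    h = expit(X.dot(theta))
    return (1 / m) * X.T.dot(h - y)

rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])
y = rng.integers(0, 2, size=100).astype(float)
print(check_grad(cost, grad, np.zeros(X.shape[1]), X, y))  # ~1e-6 or smaller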

Related

How to constrain the weight of characteristic variables in regression

I am facing a problem where, for a data sample (let's say 10 continuous variables and one dependent variable), I need to fit a model for prediction. I would like to constrain the weights of all variables to within a particular bound, like abs(0.2), meaning no weight should be more than 0.2 or less than -0.2. I tried lasso and ridge regression in sklearn.linear_model (also ElasticNet) to control the weights, but it wasn't very good: there were always one or two extremely large weights, and when I used a large alpha the R-squared showed the model was really bad. I tried to write my own method, but I could only constrain the sum of the weights, not each individual weight. SVR gives a pretty close answer, but I still want to ask whether there are good choices for multiple regression with self-defined constraints.
import numpy as np
from scipy.optimize import shgo

def my_general_linear_model_func(A1, b1):
    num_x = np.shape(A1)[1]

    def my_func(x):
        ls = 0.5 * (b1 - np.dot(A1, x))**2
        result = np.sum(ls)
        return result

    def g1(x):
        return np.sum(x)      # sum of x >= 0

    def g2(x):
        return 1 - np.sum(x)  # sum of x <= 1

    cons = ({'type': 'ineq', 'fun': g1},
            {'type': 'ineq', 'fun': g2})
    bnds = [(0, 1)]
    for i in range(num_x - 1):
        bnds.append((0, 1))
    res1 = shgo(my_func, bounds=bnds, constraints=cons)
    return res1

A1 = np.array([[0.12, 5.96, 3.14], [0.68, 7.89, 4.56]])
b1 = np.array([3, 5])
my_general_linear_model_func(A1, b1)
The result:
fun: 0.07651391974288956
funl: array([0.07651392, 0.11079534, 0.2564125 ])
message: 'Optimization terminated successfully.'
nfev: 53
nit: 2
nlfev: 49
nlhev: 0
nljev: 12
success: True
x: array([1.12339358e-16, 5.62146099e-02, 9.43785390e-01])
xl: array([[1.12339358e-16, 5.62146099e-02, 9.43785390e-01],
[3.90241087e-01, 5.00000000e-01, 1.09758913e-01],
[5.00000000e-01, 5.00000000e-01, 0.00000000e+00]])
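For the actual constraint in the question, each weight within abs(0.2), a box-bounded linear least-squares solver is enough; no global optimizer is needed. A minimal sketch with synthetic data (my addition, not the poster's code), using scipy.optimize.lsq_linear:

import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))            # 10 continuous predictors
w_true = rng.uniform(-0.15, 0.15, 10)    # hypothetical ground-truth weights
y = X @ w_true + 0.01 * rng.normal(size=50)

res = lsq_linear(X, y, bounds=(-0.2, 0.2))
print(res.x)  # every fitted weight lies in [-0.2, 0.2]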

Maximising a function

Experts, I'm trying to maximize a function my_obj with the Nelder-Mead algorithm to fit my data. For this I have taken help from SciPy's optimize.fmin. I think I am very close to the solution but am missing something, and I get an error.
As explained in the scipy.optimize.minimize documentation, you should use a 1-D array (or a list, which is compatible) as the input to your objective function, instead of multiple parameters:
#!/usr/bin/env python
import numpy as np
from scipy.optimize import minimize

d1 = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
h = np.array([10000720600.0, 10011506200.0, 10057741200.0, 10178305100.0, 10415318500.0])
b = 2.0
cx = 2.0

# objective function
def obj_function(x):  # EDIT: input is a list
    m, n, r = x
    pw = 1 / cx
    c = b * cx
    x1 = 1 + (d1 / n)**c
    x2 = 1 + (d1 / m)**c
    x3 = (x1 / x2)**pw
    dcal = r * x3
    dobs = h
    deld = (np.log10(dcal) - np.log10(dobs))**2
    return np.sum(deld)

print(obj_function([5.0, 10.0, 15.0]))  # EDIT: input is a list
x0 = [5.0, 10.0, 15.0]
print(obj_function(x0))
res = minimize(obj_function, x0, method='nelder-mead')
print(res)
Output:
% python3 script.py
432.6485766651165
432.6485766651165
final_simplex: (array([[7.76285924e+00, 3.02470699e-04, 1.93396980e+01],
[7.76286507e+00, 3.02555020e-04, 1.93397231e+01],
[7.76285178e+00, 3.01100639e-04, 1.93397381e+01],
[7.76286445e+00, 3.01025402e-04, 1.93397169e+01]]), array([0.12196442, 0.12196914, 0.12197448, 0.12198028]))
fun: 0.12196441986340725
message: 'Optimization terminated successfully.'
nfev: 130
nit: 67
status: 0
success: True
x: array([7.76285924e+00, 3.02470699e-04, 1.93396980e+01])
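One point the answer leaves implicit: the question title says "maximising", but scipy.optimize only minimizes. The usual trick, shown here on a toy concave objective (my own stand-in, not the poster's my_obj), is to minimize the negated function:

import numpy as np
from scipy.optimize import minimize

def my_toy_obj(x):
    # concave, with its maximum at (1, -2)
    return -(x[0] - 1.0)**2 - (x[1] + 2.0)**2

res = minimize(lambda x: -my_toy_obj(x), [0.0, 0.0], method='nelder-mead')
print(res.x)  # approximately [1, -2], the maximizer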

Scipy: How can I use Bounds with trust-constr?

For my constrained problem I want to use SciPy's trust-constr algorithm, as I have a multivariable, constrained problem. I don't want to / can't calculate the Jacobian/Hessian analytically, so I let SciPy compute them numerically.
However, when setting the bounds, the computation of the Jacobian crashes:
File "C:\Python27\lib\site-packages\scipy\optimize\_trustregion_constr\tr_interior_point.py", line 56, in __init__
self.jac0 = self._compute_jacobian(jac_eq0, jac_ineq0, s0)
File "C:\Python27\lib\site-packages\scipy\optimize\_trustregion_constr\tr_interior_point.py", line 164, in _compute_jacobian
[J_ineq, S]]))
File "C:\Python27\lib\site-packages\numpy\matrixlib\defmatrix.py", line 1237, in bmat
arr_rows.append(concatenate(row, axis=-1))
ValueError: all the input array dimensions except for the concatenation axis must match exactly
The error occurs both when using old style bounds and the newest Bounds object. I could reproduce the error with this code:
import numpy as np
import scipy.optimize as scopt

def RosenbrockN(x):
    result = 0
    for i in range(len(x) - 1):
        result += 100 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return result

x0 = [0.0, 0.0, 0.0]
#bounds = scopt.Bounds([-2.0, -0.5, -2.0], [2.0, 0.8, 0.7])
bounds = [(-2.0, 2.0), (-0.5, 0.8), (-2.0, 0.7)]
Res = scopt.minimize(RosenbrockN, x0,
                     method='trust-constr', bounds=bounds,
                     jac='2-point', hess=scopt.SR1())
I take it that I just misunderstand how bounds are set, but I can't find my mistake. Advice is appreciated.
EDIT: I also tried the code example from the documentation which gave the same result. Other methods as SLSQP work well with bounds.
SciPy Version 1.1.0, Python Version 2.7.4, OS Win 7 Ent.
I tried several times with the method "trust-constr", and the boundary constraints failed to be incorporated. I solved this issue by using linear constraints for the boundary conditions, following this example:
from scipy.optimize import minimize, LinearConstraint, Bounds

def RosenbrockN(x):
    result = 0
    for i in range(len(x) - 1):
        result += 100 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return result

x0 = [0.0, 0.0, 0.0]

# This will not work:
#bounds = Bounds([-2.0, -0.5, -2.0], [2.0, 0.8, 0.7])

# This works:
lb = [-2.0, -0.5, -2.0]
ub = [2.0, 0.8, 0.7]
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
lcons = LinearConstraint(A, lb=lb, ub=ub, keep_feasible=True)
Res = minimize(RosenbrockN, x0, method='trust-constr', constraints=lcons)
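If you need this workaround for problems of other sizes, a small helper (hypothetical, my addition; not part of the original answer) generalizes the identity-matrix trick:

import numpy as np
from scipy.optimize import LinearConstraint

def bounds_as_linear_constraint(lb, ub):
    # hypothetical helper: express box bounds lb <= x <= ub
    # as a LinearConstraint with an identity matrix
    return LinearConstraint(np.eye(len(lb)), lb=lb, ub=ub, keep_feasible=True)

lcons = bounds_as_linear_constraint([-2.0, -0.5, -2.0], [2.0, 0.8, 0.7])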
I removed your jac and hess arguments and got it to work; perhaps the problem lies there?
import numpy as np
import scipy.optimize as scopt

def RosenbrockN(x):
    result = 0
    for i in range(len(x) - 1):
        result += 100 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return result

x0 = [0.0, 0.0, 0.0]
#bounds = scopt.Bounds([-2.0, -0.5, -2.0], [2.0, 0.8, 0.7])
bounds = [(-2.0, 2.0), (-0.5, 0.8), (-2.0, 0.7)]
Res = scopt.minimize(RosenbrockN, x0, method='SLSQP', bounds=bounds)
print(Res)
The result is:
fun: 0.051111012543332675
jac: array([-0.00297706, -0.50601892, -0.00621008])
message: 'Optimization terminated successfully.'
nfev: 95
nit: 18
njev: 18
status: 0
success: True
x: array([0.89475126, 0.8 , 0.63996894])

Python3 Scipy: Desired error not necessarily achieved due to precision loss

I'm implementing Andrew Ng's Coursera course in Python and I'm doing Ex2 right now, Logistic Regression. I'm trying to use SciPy's optimize.minimize but I can't seem to get it to run correctly. I'll try to give as brief a summary of my code as possible while being thorough. I'm using Python3. Here is my variable setup, I move everything to numpy after using pandas to read in the csv file:
import numpy as np
import pandas as pd
from scipy.optimize import fmin_bfgs
from scipy import optimize as opt
from scipy.optimize import minimize

class Ex2:
    def __init__(self):
        self.pandas_data = pd.read_csv("ex2data1.txt", skipinitialspace=True)
        self.data = self.pandas_data.values
        self.data = np.insert(self.data, 0, 1, axis=1)
        self.x = self.data[:, 0:3]
        self.y = self.data[:, 3:]
        self.theta = np.zeros(shape=(self.x.shape[1]))
x: (100, 3) numpy ndarray
y: (100, 1) numpy ndarray
theta: (3,) numpy ndarray (1-d)
Then, I define a sigmoid, cost and gradient function to give to Scipy's minimize:
@staticmethod
def sigmoid(x):
    return 1 / (1 + np.exp(x))

def cost(self, theta):
    x = self.x
    y = self.y
    m = len(y)
    h = self.sigmoid(x.dot(theta))
    j = (1 / m) * ((-y.T.dot(np.log(h))) - ((1 - y).T.dot(np.log(1 - h))))
    return j[0]

def grad(self, theta):
    x = self.x
    y = self.y
    theta = np.expand_dims(theta, axis=0)
    m = len(y)
    h = self.sigmoid(x.dot(theta.T))
    grad = (1 / m) * (x.T.dot(h - y))
    grad = np.squeeze(grad)
    return grad
These take theta, a 1-D numpy ndarray. Cost returns a scalar (the cost associated with the theta given) and gradient returns a 1-D numpy ndarray of updates for theta.
When I then run this code:
def run(self):
options = {'maxiter': 100}
print(minimize(self.cost, self.theta, jac=self.grad, options=options))
ex2 = Ex2()
ex2.run()
I get:
fun: 0.69314718055994529
hess_inv: array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
jac: array([ -0.1 , -12.00921659, -11.26284221])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 106
nit: 0
njev: 94
status: 2
success: False
x: array([ 0., 0., 0.])
Process finished with exit code 0
Can't quite get the formatting right on the output, apologies. That's the gist of what I'm doing. Am I returning something from cost or gradient incorrectly? That seems most likely to me, but I've been trying various combinations and formats of return values and nothing seems to work. Any help is greatly appreciated.
Edit: Among other things, to debug this I've made sure that cost and grad are returning what I expect, which they are (cost: float, grad: 1-D ndarray). Running both on an initial theta array of zeros gives me the same values as I get in Octave (which I know to be correct thanks to the provided code for the exercises). However, giving these values to the minimize function does not seem to be minimizing the theta values as expected.
If anyone stumbles across this and happens to have the same problem, I figured out that in my sigmoid function I should have had
return 1/(1 + np.exp(-x))
but had
return 1/(1 + np.exp(x))
After fixing that, the minimize function converged normally.
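As a side note (my addition, not part of the original answer): the corrected expression is exactly what scipy.special.expit computes, so the two can be checked against each other:

import numpy as np
from scipy.special import expit

x = np.linspace(-10.0, 10.0, 5)
print(np.allclose(1 / (1 + np.exp(-x)), expit(x)))  # True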

nonlinear optimization with vectors, scalars and inequality constraints

I have a set of equations of the form: Y = aA + bB

where Y is a known vector of floats (only this one is known!), a and b are unknown scalars (floats), and A and B are unknown vectors of floats. Each equation has its own Y, a, b, whereas all equations share the same unknown vectors A and B.

I have a set of such equations, so my problem is to minimize the function:

(Y - aA - bB) + (Y' - a'A - b'B) + ...

I also have many inequality constraints of the type: Ai > Aj (Ai is the i-th element of vector A), Bi >= Bk, Bi > 0, a > a', ...

Is there any software or library (ideally for Python) which can handle this problem?
General remarks
This is a linear problem (at least in the linear least-squares sense; continue reading)!

It's also incompletely specified, as it's not clear whether there should always be a feasible solution in your case or whether you want to minimize some given loss in general. Your text sounds like the latter, but in that case one has to choose the loss (which makes a difference in regard to possible algorithms). Let's take the euclidean norm (probably the best pick here)!

Ignoring constraints for a moment, we can view this problem as a basic least-squares solution to a linear matrix equation (euclidean norm vs. squared euclidean norm makes no difference!).
min || b - Ax ||^2

Here:

M = number of Y's
N = size of each Y

b = (Y0, Y1, ...)                          -> shape: M*N (flattened; Y_x = (y_x_0, y_x_1, ...).T)

A = ((a0,  0,  0, ..., b0,  0,  0, ...),
     ( 0, a0,  0, ...,  0, b0,  0, ...),
     ( 0,  0, a0, ...,  0,  0, b0, ...),
     ...
     (a1,  0,  0, ..., b1,  0,  0, ...))   -> shape: (M*N, N*2)

x = (A0, A1, A2, ..., B0, B1, B2, ...)     -> shape: N*2 (one block for A, one for B)
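To make this construction concrete, here is a small sketch (my own illustration with made-up M, N and data, not part of the original answer) that builds b and A as above and solves the unconstrained problem with numpy's lstsq:

import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 3                        # M equations, unknown vectors of size N
a = rng.uniform(size=M)            # known scalars a_0 .. a_{M-1}
b_sc = rng.uniform(size=M)         # known scalars b_0 .. b_{M-1}
Y = rng.uniform(size=(M, N))       # known vectors Y_0 .. Y_{M-1}

rhs = Y.ravel()                    # b: all Y's stacked, shape (M*N,)
I = np.eye(N)
A_mat = np.vstack([np.hstack([ai * I, bi * I]) for ai, bi in zip(a, b_sc)])  # (M*N, 2N)

x, *_ = np.linalg.lstsq(A_mat, rhs, rcond=None)
A_vec, B_vec = x[:N], x[N:]        # the recovered unknown vectors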
What you should do

If unconstrained:

    Convert to standard form and use numpy's lstsq.

If constrained:

    Either use customized optimization algorithms, or:
    Linear programming (if minimizing absolute differences / the l1-norm)
        I'm too lazy to formulate it for scipy's linprog
        Not that hard, but the l1-norm is non-trivial using scipy's API
        Much easier to formulate with cvxpy (obj=cvxpy.norm(X, 1))
    Quadratic programming / second-order cone programming (if minimizing the euclidean / l2-norm)
        Again, too lazy to formulate it; no specialized solver is available in scipy yet
        Easily formulated with cvxpy (obj=cvxpy.norm(X, 2)); see the sketch after this list
    Emergency: use general-purpose constrained nonlinear optimization algorithms like SLSQP -> see code below
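The cvxpy route mentioned above might look like this (a hedged sketch, my addition; it assumes cvxpy is installed and reuses the block-matrix layout from the lstsq sketch, here with random stand-in data):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
M, N = 4, 3
A_mat = rng.normal(size=(M * N, 2 * N))   # stand-in for the block matrix above
rhs = rng.normal(size=M * N)

x = cp.Variable(2 * N)
objective = cp.Minimize(cp.norm(rhs - A_mat @ x, 2))
constraints = [x[:N] >= x[N:]]            # example: A >= B element-wise
cp.Problem(objective, constraints).solve()
print(x.value)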
Some hacky code (not the best approach!)

This code:

    is just a demo!
    uses general nonlinear optimization algorithms from scipy
        therefore: easier to formulate
        but less fast & robust than LP, QP, SOCP
        yet it will achieve approximately the same result, as convergence is guaranteed for convex optimization problems
    uses numerical differentiation whenever needed
        (the author was too lazy to add gradients)
        this can really hurt if performance is important
    is really ugly in terms of np.repeat vs. broadcasting!
Code:
import numpy as np
from scipy.optimize import minimize
np.random.seed(1)
""" Fake-problem (usually the job of the question-author!) """
def get_partial(N=10):
Y = np.random.uniform(size=N)
a, b = np.random.uniform(size=2)
return Y, a, b
""" Optimization """
def optimize(list_partials, N, M):
""" General approach:
This is a linear system of equations (with constraints)
Basic (unconstrained) form: min || b - Ax ||^2
"""
Y_all = np.vstack(map(lambda x: x[0], list_partials)).ravel() # flat 1d
a_all = np.hstack(map(lambda x: np.repeat(x[1], N), list_partials)) # repeat to be of same shape
b_all = np.hstack(map(lambda x: np.repeat(x[2], N), list_partials)) # """
def func(x):
A = x[:N]
B = x[N:]
return np.linalg.norm(Y_all - a_all * np.repeat(A, M) - b_all * np.repeat(B, M))
""" Example constraints: A >= B element-wise """
cons = ({'type': 'ineq',
'fun' : lambda x: x[:N] - x[N:]})
res = minimize(func, np.zeros(N*2), constraints=cons, method='SLSQP', options={'disp': True})
print(res)
print(Y_all - a_all * np.repeat(res.x[:N], M) - b_all * np.repeat(res.x[N:], M))
""" Test """
M = 4
N = 3
list_partials = [get_partial(N) for i in range(M)]
optimize(list_partials, N, M)
Output:
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.9019356096498999
Iterations: 12
Function evaluations: 96
Gradient evaluations: 12
fun: 0.9019356096498999
jac: array([ 1.03786588e-04, 4.84041870e-04, 2.08129734e-01,
1.57609582e-04, 2.87599862e-04, -2.07959406e-01])
message: 'Optimization terminated successfully.'
nfev: 96
nit: 12
njev: 12
status: 0
success: True
x: array([ 1.82177105, 0.62803449, 0.63815278, -1.16960281, 0.03147683,
0.63815278])
[ 3.78873785e-02 3.41189867e-01 -3.79020251e-01 -2.79338679e-04
-7.98836875e-02 7.94168282e-02 -1.33155595e-01 1.32869391e-01
-3.73398306e-01 4.54460178e-01 2.01297470e-01 3.42682496e-01]
I did not check the result! If there is an error, it's an implementation error, not a conceptual one (in my opinion)!
I agree with sascha that this is a linear problem. As I do not like constraints very much, I actually prefer to make it non-linear without constraints. I do so by setting the vector A = (a1**2, a1**2 + a2**2, a1**2 + a2**2 + a3**2, ...); this ensures that it is all positive and that A_i > A_j for i > j. That makes the errors a bit problematic, as you now have to consider error propagation to get A1, A2, etc., including correlation, but I will make an important point about that at the end. The "simple" solution would look as follows:
import numpy as np
from scipy.optimize import leastsq
from random import random

np.set_printoptions(linewidth=190)

def generate_random_vector(n, sortIt=True):
    out = np.fromiter((random() for x in range(n)), float)
    if sortIt:
        out.sort()
    return out

def residuals(parameters, dataVec, dataLength, vecDims):
    aParams = parameters[:dataLength]
    bParams = parameters[dataLength:2*dataLength]
    AParams = parameters[-2*vecDims:-vecDims]
    BParams = parameters[-vecDims:]
    YList = dataVec
    AVec = [a**2 for a in AParams]  # ensures A_i > 0
    BVec = [b**2 for b in BParams]
    AAVec = np.cumsum(AVec)         # ensures A_i > A_j for i > j
    BBVec = np.cumsum(BVec)
    dist = [np.array(Y) - a*np.array(AAVec) - b*np.array(BBVec) for Y, a, b in zip(YList, aParams, bParams)]
    dist = np.ravel(dist)
    return dist

if __name__ == "__main__":
    aList = generate_random_vector(20, sortIt=False)
    bList = generate_random_vector(20, sortIt=False)
    AVec = generate_random_vector(5)
    BVec = generate_random_vector(5)
    YList = [a*AVec + b*BVec for a, b in zip(aList, bList)]
    aGuess = 20*[.2]
    bGuess = 20*[.3]
    AGuess = 5*[.4]
    BGuess = 5*[.5]
    bestFitValues, covMX, infoDict, messages, ier = leastsq(residuals, aGuess+bGuess+AGuess+BGuess, args=(YList, 20, 5), full_output=True)
    print("a")
    print(aList)
    besta = bestFitValues[:20]
    print(besta)
    print("b")
    print(bList)
    bestb = bestFitValues[20:40]
    print(bestb)
    print("A")
    print(AVec)
    bestA = bestFitValues[-2*5:-5]
    realBestA = np.cumsum([x**2 for x in bestA])
    print(realBestA)
    print("B")
    print(BVec)
    bestB = bestFitValues[-5:]
    realBestB = np.cumsum([x**2 for x in bestB])
    print(realBestB)
    print(covMX)
The problem with errors and correlation is that the solution to the problem is not unique. If Y = aA + bB is a solution and we, e.g., rotate such that A = cE + sF and B = -sE + cF, then Y = (ac - bs)E + (as + bc)F = eE + fF is also a solution. The parameter space is, hence, completely flat at "the solution", resulting in huge errors and apocalyptic correlations.
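A tiny numeric illustration of this non-uniqueness (my addition, not part of the original answer): rotating (A, B) by an arbitrary angle yields a different parameter set that reproduces Y exactly:

import numpy as np

rng = np.random.default_rng(0)
a, b = 1.3, 0.7
A = rng.normal(size=4)
B = rng.normal(size=4)
Y = a * A + b * B

c, s = np.cos(0.5), np.sin(0.5)        # arbitrary rotation angle
E = c * A - s * B                      # chosen so that A = c*E + s*F
F = s * A + c * B                      # and            B = -s*E + c*F
e, f = a * c - b * s, a * s + b * c
print(np.allclose(Y, e * E + f * F))   # True: a second, different solution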
