I'm trying to use scipy.optimize to identify the optimal values for 3 parameters (variables). I am starting with a very simple objective function that sums the analyzed parameters together with some predefined (past) values. The variables are bounded by some fixed limits. I set the sign parameter to -1 because I am dealing with a maximization problem. However, scipy returns [0, 0, 0] as the optimal values (the same as with sign=1), while the correct solution is [2, 2, 2]. Am I setting something up wrong? What am I missing?
import scipy.optimize as optimize
import numpy as np

old = [1, 1, 1]

def f(params, sign=-1.0):
    first, second, third = params
    return sum(old + [first, second, third])

initial_guess = [2, 2, 2]
in1 = 1
in2 = 2
in3 = 1
bnds = ((0, in1+2), (0, in2+2), (0, in3+2))
result = optimize.minimize(f, initial_guess, bounds=bnds)
print(result.x)
In general, when performing nonlinear optimization, libraries like SciPy expect your objective function to take a single parameter vector. A common trick for maximizing a function is to minimize its negative (or, as done below, its reciprocal, which behaves the same as long as the value stays positive). If you simply want to maximize the value of x1 + x2 + x3, I would write things out this way:
from scipy.optimize import minimize

def f(x):
    # maximizing sum(x) by minimizing its reciprocal
    return 1 / sum(x)

guess = [2, 2, 2]
x1bnds = (0, 3)
x2bnds = (0, 4)
x3bnds = (0, 5)
bnds = (x1bnds, x2bnds, x3bnds)
result = minimize(f, guess, bounds=bnds)
print(result.x)

This will give you [3, 4, 5] because the optimizer hit the upper bounds.
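Equivalently, and a little more robustly when sum(x) could reach zero, you can maximize by minimizing the negative of the objective instead of its reciprocal. A minimal sketch of that variant, using the same bounds:

from scipy.optimize import minimize

def neg_f(x):
    # maximizing sum(x) is the same as minimizing -sum(x)
    return -sum(x)

guess = [2, 2, 2]
bnds = ((0, 3), (0, 4), (0, 5))
result = minimize(neg_f, guess, bounds=bnds)
print(result.x)  # also ends at the upper bounds, [3, 4, 5]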
If you want to operate on the distance between your input parameters and some other values, I would modify the setup like so:
from functools import partial
from scipy.optimize import minimize
import numpy as np

other_values = np.asarray([3, 4, 5])

def f(x, other_pts):
    x_lcl = np.asarray(x)
    difference = x_lcl - other_pts
    return 1 / difference.sum()

guess = [2, 2, 2]
x1bnds = (0, 3)
x2bnds = (0, 4)
x3bnds = (0, 5)
bnds = (x1bnds, x2bnds, x3bnds)

# partial binds other_values to the first positional argument of f
f_opt = partial(f, other_values)
result = minimize(f_opt, guess, bounds=bnds)
print(result.x)

This will give you [0, 0, 0] because the optimizer hit the lower bounds.
It is a good idea to make the function you optimize not depend on external data (globals); using a partial keeps everything a little cleaner.
If you don't want to use numpy, you could use a list comprehension to do the elementwise subtraction of x and the other parameter vector, but numpy keeps things a little nicer.
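As an aside, if you would rather not use functools.partial, scipy.optimize.minimize can forward extra arguments to the objective itself via its args keyword. Note that this passes other_values as the second positional argument (other_pts), whereas the partial above binds it to the first. A minimal sketch:

from scipy.optimize import minimize

# minimize will call f(x, other_values) at every iteration,
# with x as the optimization variable
result = minimize(f, guess, args=(other_values,), bounds=bnds)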
How do I write code to show the addition operation between two arrays (row-wise)? I don't want the result of the addition but want to illustrate the operation. Here is what I have; however, my code is not giving me the right output:
import numpy as np
Grid = np.random.randint(-50, 50, size=(5, 4))
iList = np.array([[1, -1, 2, -2]])
result = (Grid.astype(str), iList.astype(str))
print(result)
The output needs to be something to this effect:
([3+1 4-1 4+2 5-2]
 [6+1 9-1 7+2 8-2]
 etc.)
Thank you.
You basically want to apply a function to two numpy arrays of different sizes, making use of numpy's broadcasting capability.
This works:
import numpy as np

grid = np.random.randint(-50, 50, size=(5, 4))
i_list = np.array([[1, -1, 2, -2]])

def sum_text(x: int, y: int):
    return f'{x}+{y}'

# create a ufunc, telling numpy that it takes 2 arguments and returns 1 value
np_sum_text = np.frompyfunc(sum_text, 2, 1)

result = np_sum_text(grid, i_list)
print(result)
Result:
[['46+1' '-27+-1' '35+2' '-3+-2']
['-5+1' '6+-1' '2+2' '22+-2']
['6+1' '-45+-1' '-21+2' '31+-2']
['25+1' '-4+-1' '-24+2' '3+-2']
['-32+1' '-10+-1' '-19+2' '28+-2']]
Or maybe you don't need to reuse that function and like one-liners:
print(np.frompyfunc(lambda x, y: f'{x}+{y}', 2, 1)(grid, i_list))
Getting rid of the + before a negative integer is trivial:
def sum_text(x: int, y: int):
    return f'{x}+{y}' if y >= 0 else f'{x}{y}'
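For completeness, a minimal usage sketch of the sign-aware version with the same ufunc machinery (the small grid below is just an assumed example):

import numpy as np

def sum_text(x: int, y: int):
    # drop the '+' when the offset is negative, e.g. '5-2' instead of '5+-2'
    return f'{x}+{y}' if y >= 0 else f'{x}{y}'

np_sum_text = np.frompyfunc(sum_text, 2, 1)
print(np_sum_text(np.array([[3, 4, 4, 5]]), np.array([[1, -1, 2, -2]])))
# [['3+1' '4-1' '4+2' '5-2']]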
I'm trying to make a custom gradient descent estimator; however, I am encountering an issue with storing the parameter values at every step of the gradient descent algorithm. Here is the code skeleton:
from numpy import *
import pandas as pd
from joblib import Parallel, delayed
from multiprocessing import cpu_count

ftemp = zeros((2, ))
stemp = empty([1, ], dtype='<U10')
la = 10
vals = pd.DataFrame(index=range(la), columns=['a', 'b', 'string'])

def sfun(k1, k2, k3, string):
    a = k1*k2
    b = k2*k3
    s = string
    nums = [a, b]
    strs = [s]
    return (nums, strs)

def store(inp):
    r = sfun(inp[0], inp[1], inp[2], inp[3])
    ftemp = append(ftemp, asarray(r[0]), axis=0)
    stemp = append(stemp, asarray(r[1]), axis=0)
    return (ftemp, stemp)

for l in range(la):
    inputs = [(2, 3, 4, 'he'),
              (4, 6, 2, 'je'),
              (2, 7, 5, 'ke')]

    Parallel(n_jobs=cpu_count)(delayed(store)(i) for i in inputs)

    vals.iloc[l, 0:2] = ftemp[0, 0], ftemp[0, 1]
    vals.iloc[l, 2] = stemp[0]

    d = ftemp[2, 0] - ftemp[0, 0]
Note: most of the gradient descent code is removed because I do not have any issues with it; the main issue I have is storing the values at each step.
sfun() is the loss function (I know it doesn't look like one here) and store() is just an attempt to store the parameter values at each step.
The important aspect here is that I want to parallelize the process, as sfun() is computationally expensive, and the problem is that I want to save the values from all parallel runs.
I tried solving this in many different ways, but I always get a different error.
There is no need to make a temporary storage array; you can store the results of the Parallel() call directly:
a = Parallel(n_jobs=cpu_count())(delayed(store)(i) for i in inputs)
Most importantly, a is populated in the same order as the inputs are given.
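A minimal sketch of that idea, adapted to the example above (the unpacking below is illustrative and assumes sfun returns the (nums, strs) pair shown in the question):

import pandas as pd
from joblib import Parallel, delayed
from multiprocessing import cpu_count

def sfun(k1, k2, k3, string):
    return ([k1 * k2, k2 * k3], [string])

inputs = [(2, 3, 4, 'he'), (4, 6, 2, 'je'), (2, 7, 5, 'ke')]

# results come back as a list with one (nums, strs) entry per input,
# in the same order as the inputs
results = Parallel(n_jobs=cpu_count())(delayed(sfun)(*i) for i in inputs)

vals = pd.DataFrame(index=range(len(inputs)), columns=['a', 'b', 'string'])
for row, (nums, strs) in enumerate(results):
    vals.iloc[row, 0:2] = nums
    vals.iloc[row, 2] = strs[0]
print(vals)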
I want to model a Wiener process and I want to add some implementation of an elementary outcome, so I want to write something like this:
import numpy as np

# w is the elementary outcome
def W(T, w=0, dt=0.001):
    x = [0]
    for t in np.arange(0, T, dt):
        x.append(x[-1] + np.random.normal(0, dt, w))
    return x
and I expect that with the same w I get the same output of W. But np.random.normal doesn't support such a thing. How can I implement it?
Maybe some clarity on how setting the seed works will help.
import numpy as np

np.random.seed(0)
np.random.normal(0, 0.001, 1)
>> array([0.00176405])  # my output
np.random.normal(0, 0.001, 1)
>> array([0.00040016])

np.random.seed(0)
np.random.normal(0, 0.001, 1)
>> array([0.00176405])  # same output after re-seeding
np.random.normal(0, 0.001, 1)
>> array([0.00040016])
Every time you want the same results, you have to reset the seed again.
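Applied to the Wiener-process question, you can derive the random state from w inside W, so the same elementary outcome always reproduces the same path. A minimal sketch, keeping the original normal(0, dt) increments and using numpy's Generator API (using w as the seed is an assumption):

import numpy as np

def W(T, w=0, dt=0.001):
    rng = np.random.default_rng(w)  # same w -> same sequence of increments
    x = [0]
    for t in np.arange(0, T, dt):
        x.append(x[-1] + rng.normal(0, dt))
    return x

# the same w gives the same path
assert W(1, w=42) == W(1, w=42)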
How do I use fsolve to calculate the value of y for the following non-linear equation in Python?
y = x^3 - √y
(when x = 0, 1, 2.3611, 2.9033, 3.2859, 3.5915)
I have tried solving the problem on paper and then using a function to calculate the value of y, but I am unable to use fsolve to do the same for me.
def func(x):
    return np.round(((-1 + np.sqrt(1 + 4*x**3)) / 2)**2, 4)
One should avoid root functions when applying a numerical algorithm, since the square root is non-smooth at zero. Thus substitute y = z², which turns y = x^3 - √y into z^2 + z - x^3 = 0, hoping that z^2 + z is convex enough that, starting at 1.0, the root-finding iteration stays at positive values of z; then square the resulting z to recover y:
from scipy.optimize import fsolve

y = [fsolve(lambda z: z**2 + z - x**3, 1.0)[0]**2 for x in [0, 1, 2.3611, 2.9033, 3.2859, 3.5915]]
This gets the solutions
[0.0, 0.3819660112501052, 10.000316539128024, 20.000195919522547, 30.00100142437062, 40.00161656606038]
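As a quick sanity check (just substituting back, no fsolve needed), each of these y values should satisfy y = x^3 - √y:

import numpy as np

xs = [0, 1, 2.3611, 2.9033, 3.2859, 3.5915]
ys = [0.0, 0.3819660112501052, 10.000316539128024,
      20.000195919522547, 30.00100142437062, 40.00161656606038]
for x, y in zip(xs, ys):
    # the residual is numerically zero for a valid solution
    print(np.isclose(y, x**3 - np.sqrt(y)))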
You can try like this:
import math
from scipy.optimize import fsolve
y = [0.000, 0.3820, 10.00, 20.00, 30.00, 40.00]
def func(x):
    for i in y:
        return x**3 - math.sqrt(i)
x0 = fsolve(func, [0, 1, 2.3611, 2.9033, 3.2859, 3.5915])
Output:
[0.00000000e+000 2.24279573e-109 5.29546580e-109 6.51151078e-109 7.36960349e-109 8.05500243e-109]
That output is in scientific notation: 1e-5 means 1 × 10⁻⁵, in other words 0.00001.
Convert scientific notation to decimals:
Now, x0 = [0.00000000e+000 2.24279573e-109 5.29546580e-109 6.51151078e-109 7.36960349e-109 8.05500243e-109]
for i in x0:
    data = float("{:.8f}".format(float(str(i))))
    print(data)
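Alternatively, if the goal is only to display the array without scientific notation, numpy can do this directly; a minimal sketch, with x0 from the snippet above (values this small will simply render as 0 at the default precision):

import numpy as np

np.set_printoptions(suppress=True)  # print floats in positional notation instead of scientific notation
print(x0)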
I'm wondering how the following code could be faster. At the moment, it seems unreasonably slow, and I suspect I may be using the autograd API wrong. The output I expect is the Jacobian of f, with one row per element of timeline, which I do get, but it takes a long time:
import numpy as np
from autograd import jacobian

def f(params):
    mu_, log_sigma_ = params
    Z = timeline * mu_ / log_sigma_
    return Z

timeline = np.linspace(1, 100, 40000)
gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]))
I would expect the following:
jacobian(f) returns a function that represents the gradient vector w.r.t. the parameters.
jacobian(f)(np.array([1.0, 1.0])) is the Jacobian evaluated at the point (1, 1). To me, this should behave like a vectorized numpy function, so it should execute very fast, even for 40k-length arrays. However, this is not what is happening.
Even something like the following has the same poor performance:
import numpy as np
from autograd import jacobian

def f(params, t):
    mu_, log_sigma_ = params
    Z = t * mu_ / log_sigma_
    return Z

timeline = np.linspace(1, 100, 40000)
gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]), timeline)
From https://github.com/HIPS/autograd/issues/439 I gathered that there is an undocumented function autograd.make_jvp which calculates the jacobian with a fast forward mode.
The link states:
Given a function f, vectors x and v in the domain of f, make_jvp(f)(x)(v) computes both f(x) and the Jacobian of f evaluated at x, right multiplied by the vector v.
To get the full Jacobian of f you just need to write a loop to evaluate make_jvp(f)(x)(v) for each v in the standard basis of f's domain. Our reverse mode Jacobian operator works in the same way.
From your example:
import autograd.numpy as np
from autograd import make_jvp
def f(params):
    mu_, log_sigma_ = params
    Z = timeline * mu_ / log_sigma_
    return Z

timeline = np.linspace(1, 100, 40000)
gradient_at_mle = make_jvp(f)(np.array([1.0, 1.0]))

# loop through each basis vector of the parameter space
# [1, 0] evaluates (f(x), first column of the Jacobian)
# [0, 1] evaluates (f(x), second column of the Jacobian)
for basis in (np.array([1, 0]), np.array([0, 1])):
    val_of_f, col_of_jacobian = gradient_at_mle(basis)
    print(col_of_jacobian)
Output:
[ 1. 1.00247506 1.00495012 ... 99.99504988 99.99752494
100. ]
[ -1. -1.00247506 -1.00495012 ... -99.99504988 -99.99752494
-100. ]
This runs in ~0.005 seconds on Google Colab.
Edit:
Functions like cdf aren't defined for the regular jvp yet, but you can use another undocumented function, make_jvp_reversemode, where it is defined. Usage is similar, except that the output is only the column and not the value of the function:
import autograd.numpy as np
from autograd.scipy.stats.norm import cdf
from autograd.differential_operators import make_jvp_reversemode
def f(params):
    mu_, log_sigma_ = params
    Z = timeline * cdf(mu_ / log_sigma_)
    return Z

timeline = np.linspace(1, 100, 40000)
gradient_at_mle = make_jvp_reversemode(f)(np.array([1.0, 1.0]))

# loop through each basis vector of the parameter space
# [1, 0] evaluates the first column of the Jacobian
# [0, 1] evaluates the second column of the Jacobian
for basis in (np.array([1, 0]), np.array([0, 1])):
    col_of_jacobian = gradient_at_mle(basis)
    print(col_of_jacobian)
Output:
[0.05399097 0.0541246 0.05425823 ... 5.39882939 5.39896302 5.39909665]
[-0.05399097 -0.0541246 -0.05425823 ... -5.39882939 -5.39896302 -5.39909665]
Note that make_jvp_reversemode will be slightly faster than make_jvp by a constant factor due to its use of caching.