I am trying to understand the UCB bandit described in this book, and I'm running into a bit of trouble accessing a given arm of a multi-armed bandit.
To put it in simpler terms: I have a class called NormalArm as follows:
import random

class NormalArm():
    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def draw(self):
        return random.gauss(self.mu, self.sigma)

    def __len__(self):
        return len(self.mu)
Now I can feed NormalArm two vectors of equal length n, mu and sigma, and that will create n "gaussian" arms that return a random value from a normal distribution with mean mu and standard deviation sigma when I access .draw(). I want to add a __getitem__ method so that I can access just the 2nd or 3rd arm. How can I do this?
I think you haven't got your types right; your code cannot work.
You say that mu and sigma are vectors, but you pass them to random.gauss(), which expects scalars. Yet you insist on this thing being a sequence when you define __len__().
What you want, I suspect, is something like this:
arms = NormalArms([0.0, 0.1], [1.0, 1.5])
assert len(arms) == 2 # hint: write tests right now!
second_gaussian = arms[1] # gives you a gaussian(0.1, 1.5)
This class works the way shown above:
import random

class NormalArms(object):
    def __init__(self, mus, sigmas):
        assert len(mus) == len(sigmas), "mus and sigmas lengths don't match"
        self.pairs = list(zip(mus, sigmas))  # materialize: zip() returns an iterator in Python 3

    def __getitem__(self, index):
        mu, sigma = self.pairs[index]
        return random.gauss(mu, sigma)

    def __len__(self):
        return len(self.pairs)
It was not hard.
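Taking the answer's own hint about tests, here is a quick sanity check of the class above (the tolerance is a loose statistical bound, not a strict unit test, since the draws are random):

samples_arms = NormalArms([0.0, 0.1], [1.0, 1.5])
assert len(samples_arms) == 2

# Draw from the second arm many times and check the sample mean is
# roughly mu = 0.1. With 10000 draws the standard error is about 0.015,
# so a 0.1 tolerance fails only with negligible probability.
samples = [samples_arms[1] for _ in range(10000)]
assert abs(sum(samples) / len(samples) - 0.1) < 0.1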
I've got a little project for my college course and I need to write a method which fits an array to some function; here's part of it:
def Linear(self, x, a, b):
    return a*x + b

def Quadratic(self, x, a, b, c):
    return a*(x**2) + b*x + c

def Sinusoid(self, t, a, gam, omega, phi, offset):
    return np.e ** (gam * t) * a * np.sin((2 * np.pi * omega * t) + phi) + offset

def Fit(self, name):
    func = getattr(App, name)
    self.fit_params, self.covariance_matrix = curve_fit(func, self.t, self.a, maxfev=100000)
But it returns absolutely wrong values and doesn't even work at all for the Sinusoid function (OptimizeWarning: Covariance of the parameters could not be estimated). I've already checked that it's not an issue with the getattr call; that part works correctly.
I'm running out of ideas where the issue is.
There are several problems here.
Linear, Quadratic and Sinusoid do not need self; you can define them as static methods:

# @staticmethod is the decorator used to define a static method
@staticmethod
def Linear(x, a, b):
    return a*x + b

The same applies to the other methods (except Fit).
It will help when they are called later. When you call them with curve_fit(func, ...), if func is Sinusoid, what you are doing is
curve_fit(Sinusoid, ...) and not curve_fit(self.Sinusoid, ...). When curve_fit uses Sinusoid, the first argument may be understood as self, and therefore you get an error.
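A minimal sketch of that mismatch (the class is trimmed down purely for illustration):

class App:
    def Linear(self, x, a, b):
        return a * x + b

func = getattr(App, 'Linear')  # a plain function with signature (self, x, a, b)

# curve_fit(func, xdata, ydata) will effectively call func(xdata, a, b):
# xdata lands in the `self` slot and every parameter shifts one position,
# so the fit is performed against the wrong arguments.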
If you define
@staticmethod
def Sinusoid(t, a, gam, omega, phi, offset):
    return np.e ** (gam * t) * a * np.sin((2 * np.pi * omega * t) + phi) + offset
a, gam, omega, phi and offset are constants to fit. Therefore, when you call curve_fit, you need to pass these constants as parameters. The signature is:
scipy.optimize.curve_fit(f, xdata, ydata, p0=None, sigma=None, absolute_sigma=False, check_finite=True, bounds=(-inf, inf), method=None, jac=None, **kwargs)
To clarify: the first argument is the function, the second is your experimental data for x, and the third is the data for y. After that come all the other kwargs, like maxfev. I don't know if self.a is the data for y, or if you want to set an initial value for a; I'm going to guess you want to set the initial value of a. You need to do it like this: p0=[x1, x2, x3, ..., xn], where xi is the initial value of the i-th constant. If you don't pass p0 as an argument to curve_fit, the default value for each constant is 1. But here comes the last problem:
The different functions take a different number of arguments, so for the first one, Linear, you need p0 = [a, 1]. For the second, p0 = [a, 1, 1]. And for the last one, Sinusoid, p0 = [a, 1, 1, 1, 1].
I don't really know whether you can pass p0 = [a] to all of the functions and have curve_fit magically understand. Try it, and if curve_fit doesn't let you, use conditionals:
def Fit(self, name):
    func = getattr(App, name)
    if func is App.Linear:
        p0 = [a, 1]          # a = your chosen initial value for the first constant
    elif func is App.Quadratic:
        p0 = [a, 1, 1]
    elif func is App.Sinusoid:
        p0 = [a, 1, 1, 1, 1]
    self.fit_params, self.covariance_matrix = curve_fit(func, self.t, self.y_values, p0=p0, maxfev=100000)
Where self.y_values is the experimental y data as an array-like object.
If self.a is the data for y, you can cut all of this mumbo jumbo and leave Fit defined as it currently is. But don't forget the @staticmethod on the other functions.
Hope this solves your problem.
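For reference, here is a minimal self-contained sketch of the fixed class, under the assumption that self.a holds the y data; the sample data and noise level are made up for illustration:

import numpy as np
from scipy.optimize import curve_fit

class App:
    def __init__(self, t, a):
        self.t = t   # experimental x data
        self.a = a   # experimental y data (assuming that is what self.a holds)

    @staticmethod
    def Linear(x, a, b):
        return a*x + b

    @staticmethod
    def Quadratic(x, a, b, c):
        return a*(x**2) + b*x + c

    @staticmethod
    def Sinusoid(t, a, gam, omega, phi, offset):
        return np.e ** (gam * t) * a * np.sin((2 * np.pi * omega * t) + phi) + offset

    def Fit(self, name):
        func = getattr(App, name)
        self.fit_params, self.covariance_matrix = curve_fit(
            func, self.t, self.a, maxfev=100000)

# Hypothetical usage: fit a quadratic to noisy quadratic data.
t = np.linspace(-1, 1, 50)
y = 2*t**2 - t + 0.5 + np.random.normal(0, 0.05, t.shape)
app = App(t, y)
app.Fit('Quadratic')
print(app.fit_params)  # should be close to [2, -1, 0.5]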
I would like to perform some tests on my optimisation routine using scipy.optimize.minimize, in particular graphing the convergence (or rather the objective function) at each iteration, over multiple tests.
Suppose I have the following linearly constrained quadratic optimisation problem:
minimise: x_i Q_ij x_j + a * sum_i |x_i|  (summation over repeated indices implied)
subject to: sum(x_i) = 1
I can code this as:
def _fun(x, Q, a):
    c = np.einsum('i,ij,j->', x, Q, x)
    p = np.sum(a * np.abs(x))
    return c + p

def _constr(x):
    return np.sum(x) - 1
And I will implement the optimisation in scipy as:
x_0 = # some initial vector
x_soln = scipy.optimize.minimize(_fun, x_0, args=(Q, a), method='SLSQP',
                                 constraints={'type': 'eq', 'fun': _constr})
I see that there is a callback argument, but it only accepts a single argument: the parameter values at each iteration. How can I utilise this in a more esoteric case where I might have other arguments that need to be supplied to my callback function?
The way I solved this was to use a generic callback cache object referenced each time from my callback function. Let's say you want to do 20 tests and plot the objective function after each iteration in the same chart. You will need an outer loop to run the 20 tests, but we'll create that later.
First let's create a class that will store all iteration objective function values for us, plus a couple extra bits and pieces:
class OpObj(object):
    def __init__(self, Q, a):
        self.Q, self.a = Q, a
        rv = np.random.rand()
        self.x_0 = np.array([rv, (1-rv)/2, (1-rv)/2])
        self.f = np.full(shape=(500,), fill_value=np.nan)
        self.count = 0

    def _fun(self, x):
        return _fun(x, self.Q, self.a)
Also let's add a callback function that manipulates that class object. Don't worry that it has more than one argument for now; we'll fix this later. Just make sure the first parameter is the solution variables.
def cb(xk, obj=None):
    obj.f[obj.count] = obj._fun(xk)
    obj.count += 1
All this does is use the object's functions and values to update itself, counting the number of iterations each time. This function will be called after each iteration.
Putting this all together, all we need is two more things: 1) some matplotlib-ing to do the plot, and 2) fixing the callback to have only one argument. We can do the latter with a wrapper, which is exactly what functools.partial creates: it returns a function with fewer arguments than the original, the rest already bound. So the final code looks like this:
import matplotlib.pyplot as plt
import scipy.optimize as op
import numpy as np
from functools import partial

Q = np.array([[1.0, 0.75, 0.45], [0.75, 1.0, 0.60], [0.45, 0.60, 1.0]])
a = 1.0

def _fun(x, Q, a):
    c = np.einsum('i,ij,j->', x, Q, x)
    p = np.sum(a * np.abs(x))
    return c + p

def _constr(x):
    return np.sum(x) - 1

class OpObj(object):
    def __init__(self, Q, a):
        self.Q, self.a = Q, a
        rv = np.random.rand()
        self.x_0 = np.array([rv, (1-rv)/2, (1-rv)/2])
        self.f = np.full(shape=(500,), fill_value=np.nan)
        self.count = 0

    def _fun(self, x):
        return _fun(x, self.Q, self.a)

def cb(xk, obj=None):
    obj.f[obj.count] = obj._fun(xk)
    obj.count += 1

fig, ax = plt.subplots(1, 1)
x = np.linspace(1, 500, 500)
for test in range(20):
    op_obj = OpObj(Q, a)
    x_soln = op.minimize(_fun, op_obj.x_0, args=(Q, a), method='SLSQP',
                         constraints={'type': 'eq', 'fun': _constr},
                         callback=partial(cb, obj=op_obj))
    ax.plot(x, op_obj.f)

ax.set_ylim((1.71, 1.76))
plt.show()
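As an aside, a closure gives the same single-argument wiring without functools; this is a sketch re-stating just the minimize call from the loop above:

# Equivalent to partial(cb, obj=op_obj): the lambda captures op_obj
# and exposes the single-argument signature that minimize expects.
x_soln = op.minimize(_fun, op_obj.x_0, args=(Q, a), method='SLSQP',
                     constraints={'type': 'eq', 'fun': _constr},
                     callback=lambda xk: cb(xk, obj=op_obj))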
I am doing a multiple integral with 4 variables; 2 of them have limits that are functions of the other variables. However, the error appears on one of my constant-limit variables. I really cannot figure out why. Many thanks for your advice!
from numpy import sqrt, sin, cos, pi, arcsin, maximum
from sympy.functions.special.delta_functions import Heaviside
from scipy.integrate import nquad

def bmax(x):
    return 1.14*10**9/sin(x/2)**(9/7)

def thetal(x,y,z):
    return arcsin(3.7*10**15*sqrt(cos(x/2)**2/10**6-1.23*10**10/z+0.003*sin(x/2)**2*(2.51*10**63/sin(x/2)**9/y**7-1))/(z*sin(x/2)**2*cos(x/2)*(2.51*10**63/sin(x/2)**9/y**7-1)))

def rt(x,y):
    return 3.69*10**12/(2.5*10**63/sin(x/2)**7*y**7-sin(x/2)**2)

def rd(x,y):
    return maximum(1.23*10**10,rt(x,y))

def rl(x,y):
    return rd(x,y)*(sqrt(1+5.04*10**16/(rd(x,y)*cos(x/2)**2))-1)/2

def wbound():
    return [1.23*10**10,3.1*10**16]

def zbound():
    return [10**(-10),pi-10**(-10)]

def ybound(z):
    return [0,bmax(z)-10**(-10)]

def xbound(z,y,w):
    return [thetal(z,y,w),pi-thetal(z,y,w)]

def f(x,y,z,w):
    return [5.77/10**30*sin(z)*sin(z/2)*y*sin(x)*Heaviside(w-rl(z,y))*Heaviside(w-rd(z,y))/w**2]

result = nquad(f, [xbound, ybound, zbound, wbound])
The reason for that error is that although you don't want these bounds to depend on the integration variables, nquad still passes those variables to the functions you provide. So the bound functions have to take the right number of arguments:
def wbound():
    return [1.23*10**10, 3.1*10**16]

def zbound(w_foo):
    return [10**(-10), pi-10**(-10)]

def ybound(z, w_foo):
    return [0, bmax(z)-10**(-10)]

def xbound(z, y, w):
    return [thetal(z,y,w), pi-thetal(z,y,w)]
Now the functions zbound and ybound accept the extra variables but simply ignore them.
I'm not sure about the last bound, xbound(...): Do you want the variables y and z to be flipped? The supposedly correct ordering according to the definition of scipy.integrate.nquad would be
def xbound(y, z, w):
    ...
Edit: As kazemakase pointed out, the function f should return a float instead of a list, so the brackets [...] in the return statement should be removed.
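Combining both fixes, the call would look like this (a sketch reusing the question's definitions, with the bound functions corrected as above; only f's return statement changes):

def f(x, y, z, w):
    # return a float, not a one-element list
    return 5.77/10**30*sin(z)*sin(z/2)*y*sin(x)*Heaviside(w-rl(z,y))*Heaviside(w-rd(z,y))/w**2

result = nquad(f, [xbound, ybound, zbound, wbound])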
nquad expects a sequence of bounds for its second argument, with a rather stringent syntax.
If the integrand f depends on x, y, z, w and this is the order of definition, the terms in bounds must be, in sequence, xb, yb, zb and wb, where each of the bounds can be either a 2-tuple, e.g., xb = (xmin, xmax)
or a function that returns a 2-tuple.
The critical point is the arguments of those functions: when we perform the inner integration, in dx, we have y, z and w available for computing the bounds on x, so it must be
def xb(y, z, w): return (..., ...)
and likewise
def yb(z, w): return (..., ...)
def zb(w): return (..., ...)
The bounds with respect to the last variable of integration must be constant.
To summarize
# DEFINITIONS
def f(x, y, z, w): return ...       # x inner integration, ..., w outer integration
def xb(y, z, w): return (..., ...)  # or simply xb = (..., ...) if it's a constant
def yb(z, w): return (..., ...)     # or yb = (..., ...)
def zb(w): return (..., ...)        # or zb = (..., ...)
wb = (..., ...)

# INTEGRATION
result, _ = nquad(f, [xb, yb, zb, wb])
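To see the bound-ordering rule in isolation, here is a tiny runnable example (the triangle integrand is invented purely for illustration):

from scipy.integrate import nquad

# Integrate f(x, y) = 1 over the triangle 0 <= x <= y, 0 <= y <= 1.
def f(x, y):
    return 1.0

def xb(y):       # bounds for the inner variable may depend on the outer ones
    return (0.0, y)

yb = (0.0, 1.0)  # bounds for the outermost variable must be constant

area, _ = nquad(f, [xb, yb])
print(area)      # 0.5, the area of the triangle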
I have the following code below that prints the PDF graph for a particular mean and standard deviation.
http://imgur.com/a/oVgML
Now I need to find the actual probability of a particular value. So for example if my mean is 0, and my value is 0, my probability is 1. This is usually done by calculating the area under the curve. Similar to this:
http://homepage.divms.uiowa.edu/~mbognar/applets/normal.html
I am not sure how to approach this problem.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

def normal(power, mean, std, val):
    a = 1/(np.sqrt(2*np.pi)*std)
    diff = np.abs(np.power(val-mean, power))
    b = np.exp(-(diff)/(2*std*std))
    return a*b

pdf_array = []
array = np.arange(-2, 2, 0.1)
print(array)
for i in array:
    print(i)
    pdf = normal(2, 0, 0.1, i)
    print(pdf)
    pdf_array.append(pdf)

plt.plot(array, pdf_array)
plt.ylabel('some numbers')
plt.axis([-2, 2, 0, 5])
plt.show()
Unless you have a reason to implement this yourself, all these functions are available in scipy.stats.norm.
I think you are asking for the CDF; in that case, use this code:
from scipy.stats import norm
print(norm.cdf(x, mean, std))
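To get the probability of landing in an interval rather than at a single point, take a difference of CDF values. A small sketch, reusing the question's mean 0 and standard deviation 0.1 (the interval choice is just for illustration):

from scipy.stats import norm

# P(-0.1 <= X <= 0.1) for X ~ N(mean=0, std=0.1): the one-sigma interval.
p = norm.cdf(0.1, 0, 0.1) - norm.cdf(-0.1, 0, 0.1)
print(p)  # ~0.6827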
If you want to write it from scratch:

import math

class PDF:
    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma
        self.data = []

    def calculate_mean(self):
        self.mean = sum(self.data) / len(self.data)  # true division, not //
        return self.mean

    def calculate_stdev(self, sample=True):
        # n-1 for a sample standard deviation, n for a population one
        if sample:
            n = len(self.data) - 1
        else:
            n = len(self.data)
        mean = self.mean
        sigma = 0
        for el in self.data:
            sigma += (el - mean)**2
        sigma = math.sqrt(sigma / n)
        self.stdev = sigma
        return self.stdev

    def pdf(self, x):
        return (1.0 / (self.stdev * math.sqrt(2*math.pi))) * math.exp(-0.5*((x - self.mean) / self.stdev) ** 2)
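A quick usage sketch of this class (the data values are invented):

p = PDF(mu=0, sigma=0.1)
print(p.pdf(0.0))   # peak density, ~3.989 for sigma = 0.1
p.data = [0.05, -0.02, 0.01, 0.03]
print(p.calculate_mean(), p.calculate_stdev())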
The area under a curve y = f(x) from x = a to x = b is the same as the integral of f(x) dx from x = a to x = b, and scipy has a quick, easy way to do integrals. And just so you understand: the probability of a single exact value cannot be one, because the total area under the curve is one (unless maybe it's a delta function). So you should get 0 ≤ probability of value < 1 for any particular value of interest. There may be different ways of doing it, but a conventional way is to assign confidence intervals along the x-axis. I would read up on Gaussian curves and normalization before continuing to code it.
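A sketch of that integration approach with scipy.integrate.quad, reusing the question's mean 0 and standard deviation 0.1 (the interval is illustrative):

from scipy.integrate import quad
from scipy.stats import norm

# Probability that the value lies in [-0.1, 0.1]: the area under the
# N(0, 0.1) density over that interval. quad returns (value, error estimate).
area, err = quad(norm(loc=0, scale=0.1).pdf, -0.1, 0.1)
print(area)  # ~0.6827, matching the CDF-difference approach shown earlier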
I am very new to object-oriented programming in Python and I am working to implement the accepted answer to this question in Python (it's originally in R).
I have a simple question - is it possible to access the output of one method for use in another method without first binding the output to self? I presume the answer is "no" - but I also imagine there is some technique that accomplishes the same task that I am not thinking of.
My start to the code is below. It works fine until you get to the kappa method. I would really like to be able to define kappa as a simple extension of curvature (since it's just the absolute value of the same) but I'm not particularly interested in adding it to the list of attributes. I may just be overthinking this too, and either something like a closure is possible in Python or adding to the attribute list is the Pythonic thing to do?
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

class Road(object):
    def __init__(self, x, y):  # x, y are lists
        # Raw data
        self.x = x
        self.y = y
        # Calculate and set cubic spline functions
        n = range(1, len(x)+1)
        fx = InterpolatedUnivariateSpline(n, x, k=3)
        fy = InterpolatedUnivariateSpline(n, y, k=3)
        self.fx = fx
        self.fy = fy

    def curvature(self, t):
        # Calculate and return the curvature
        xp = self.fx.derivative(); yp = self.fy.derivative()
        xpp = xp.derivative(); ypp = yp.derivative()
        vel = np.sqrt(xp(t)**2 + yp(t)**2)  # Velocity
        curv = (xp(t)*ypp(t) - yp(t)*xpp(t)) / (vel**3)  # Signed curvature
        return curv

    def kappa(self, t):
        return abs(curv)  # NameError: curv only exists inside curvature()
Just call the other method:
class Road(object):
    ...
    def kappa(self, t):
        return abs(self.curvature(t=t))
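A hypothetical usage sketch (the sample points are invented; with k=3 splines you need at least four points):

# Build a Road from four sample points and evaluate both methods at t=2.5.
road = Road([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, -1.0])
print(road.curvature(2.5))  # signed curvature at t=2.5
print(road.kappa(2.5))      # its absolute value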