Normal Equation Implementation in Python / Numpy

Normal Equation Implementation in Python / Numpy - python

I've written some beginner code to calculate the co-efficients of a simple linear model using the normal equation.
# Modules
import numpy as np
# Loading data set
X, y = np.loadtxt('ex1data3.txt', delimiter=',', unpack=True)
data = np.genfromtxt('ex1data3.txt', delimiter=',')
def normalEquation(X, y):
m = int(np.size(data[:, 1]))
# This is the feature / parameter (2x2) vector that will
# contain my minimized values
theta = []
# I create a bias_vector to add to my newly created X vector
bias_vector = np.ones((m, 1))
# I need to reshape my original X(m,) vector so that I can
# manipulate it with my bias_vector; they need to share the same
# dimensions.
X = np.reshape(X, (m, 1))
# I combine these two vectors together to get a (m, 2) matrix
X = np.append(bias_vector, X, axis=1)
# Normal Equation:
# theta = inv(X^T * X) * X^T * y
# For convenience I create a new, tranposed X matrix
X_transpose = np.transpose(X)
# Calculating theta
theta = np.linalg.inv(X_transpose.dot(X))
theta = theta.dot(X_transpose)
theta = theta.dot(y)
return theta
p = normalEquation(X, y)
print(p)
Using the small data set found here:
http://www.lauradhamilton.com/tutorial-linear-regression-with-octave
I get the co-efficients: [-0.34390603; 0.2124426 ] using the above code instead of: [24.9660; 3.3058]. Could anyone help clarify where I am going wrong?

You can implement normal equation like below:
import numpy as np
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X] # add x0 = 1 to each instance
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
X_new = np.array([[0], [2]])
X_new_b = np.c_[np.ones((2, 1)), X_new] # add x0 = 1 to each instance
y_predict = X_new_b.dot(theta_best)
y_predict

This assumes X is an m by n+1 dimensional matrix where x_0 always = 1 and y is a m-dimensional vector.
import numpy as np
step1 = np.dot(X.T, X)
step2 = np.linalg.pinv(step1)
step3 = np.dot(step2, X.T)
theta = np.dot(step3, y) # if y is m x 1. If 1xm, then use y.T

Your implementation is correct. You've only swapped X and y (look closely how they define x and y), that's why you get a different result.
The call normalEquation(y, X) gives [ 24.96601443 3.30576144] as it should.

Here is the normal equation in one line:
theta = np.dot(np.linalg.inv(np.dot(X.T,X)),np.dot(X.T,Y))

Related

SKlearn Gaussian Process with constant, manually set correlation

I want to use the Gaussian Process approximation for a simple 1D test function to illustrate a few things. I want to iterate over a few different values for the correlation matrix (since this is 1D it is just a single value) and show what effect different values have on the approximation. My understanding is, that "theta" is the parameter for this. Therefore I want to set the theta value manually and don't want any optimization/changes to it. I thought the constant kernel and the clone_with_theta function might get me what I want but I didn't get it to work. Here is what I have so far:
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as ConstantKernel
def f(x):
"""The function to predict."""
return x/2 + ((1/10 + x) * np.sin(5*x - 1))/(1 + x**2 * (np.sin(x - (1/2))**2))
# ----------------------------------------------------------------------
# Data Points
X = np.atleast_2d(np.delete(np.linspace(-1,1, 7),4)).T
y = f(X).ravel()
# Instantiate a Gaussian Process model
kernel = ConstantKernel(constant_value=1, constant_value_bounds='fixed')
theta = np.array([0.5,0.5])
kernel = kernel.clone_with_theta(theta)
gp = GaussianProcessRegressor(kernel=kernel, optimizer=None)
# Fit to data using Maximum Likelihood Estimation of the parameters
gp.fit(X, y)
# Make the prediction on the meshed x-axis (ask for MSE as well)
y_pred, sigma = gp.predict(x, return_std=True)
# Plot
# ...

I programmed a simple implementation myself now, which allows to set correlation (here 'b') manually:
import numpy as np
from numpy.linalg import inv
def f(x):
"""The function to predict."""
return x/2 + ((1/10 + x) * np.sin(5*x - 1))/(1 + x**2 * (np.sin(x - (1/2))**2))
def kriging_approx(x,xt,yt,b,mu,R_inv):
N = yt.size
one = np.matrix(np.ones((yt.size))).T
r = np.zeros((N))
for i in range(0,N):
r[i]= np.exp(-b * (xt[i]-x)**2)
y = mu + np.matmul(np.matmul(r.T,R_inv),yt - mu*one)
y = y[0,0]
return y
def calc_R (x,b):
N = x.size
# setup R
R = np.zeros((N,N))
for i in range(0,N):
for j in range(0,N):
R[i][j] = np.exp(-b * (x[i]-x[j])**2)
R_inv = inv(R)
return R, R_inv
def calc_mu_sig (yt, R_inv):
N = yt.size
one = np.matrix(np.ones((N))).T
mu = np.matmul(np.matmul(one.T,R_inv),yt) / np.matmul(np.matmul(one.T,R_inv),one)
mu = mu[0,0]
sig2 = (np.matmul(np.matmul((yt - mu*one).T,R_inv),yt - mu*one))/(N)
sig2 = sig2[0,0]
return mu, sig2
# ----------------------------------------------------------------------
# Data Points
xt = np.linspace(-1,1, 7)
yt = np.matrix((f(xt))).T
# Calc R
R, R_inv = calc_R(xt, b)
# Calc mu and sigma
mu_dach, sig_dach2 = calc_mu_sig(yt, R_inv)
# Point to get approximation for
x = 1
y_approx = kriging_approx(x, xt, yt, b, mu_dach, R_inv)

How can I get data for a quiver plot in torch?

I have some function z(x, y) and I would like to generate a quiver plot (a 2D plot of the gradients). Something like this:
In order to do it, I have to run gradient over a linear mesh and adjust data to the format that matplotlib.quiver does.
A naive way is to iterate forward and backward in a loop:
for i in range(10):
for j in range(10):
x = torch.tensor(1. * i, requires_grad=True)
y = torch.tensor(1. * j, requires_grad=True)
z = x ** 2 + y ** 2
z.backward()
print(x.grad, y.grad)
This is obviously very inefficient. There are some examples on how to generate a linear mesh from x, y but I would need later change the mesh back to the format of the forward formula, get vectors of gradient and put them back, etc..
A simple example in numpy would be:
import matplotlib.pyplot as plt
n = 25
x_range = np.linspace(-25, 25, n)
y_range = np.linspace(-25, 25, n)
X, Y = np.meshgrid(x_range, y_range)
Z = X**2 + Y**2
U, V = 2*X, 2*Y
plt.quiver(X, Y, U, V, Z, alpha=.9)
What would be the standard way of doing this with pytorch? Are there some simple examples available?

You can compute gradients of non-scalars by passing torch.Tensors of ones.
import matplotlib.pyplot as plt
import torch
# create meshgrid
n = 25
a = torch.linspace(-25, 25, n)
b = torch.linspace(-25, 25, n)
x = a.repeat(n)
y = b.repeat(n, 1).t().contiguous().view(-1)
x.requires_grad = True
y.requires_grad=True
z = x**2 + y**2
# this line will compute the gradients
torch.autograd.backward([z], [torch.ones(x.size()), torch.ones(y.size())])
# detach to plot
plt.quiver(x.detach(), y.detach(), x.grad, y.grad, z.detach(), alpha=.9)
plt.show()
If you need to do this repeatedly you need to zero the gradients (set x.grad = y.grad = None).

Create a 2D array that is the product of two functions

I am working on a visualization and trying to create a 2D array that is the product of a normalized Gaussian function on the X axis and a normalized exponential function on the Y axis (using Python).

I would use NumPy for this. You can use np.meshgrid to create the (X, Y) axes and use NumPy's vectorized functions to create the function on these coordinates. The array f below is your two-dimensional array, here containing the product of exp(-X/4) and exp(-((Y-2)/1.5)**2). (Substitute your own normalized functions here.)
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,10,100)
y = np.linspace(0,5,100)
X, Y = np.meshgrid(x, y)
f = np.exp(-X/4.) * np.exp(-((Y-2)/1.5)**2)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(f)
plt.show()
If you can't or don't want to use NumPy, you'll have to loop by hand and use conventional math functions:
import math
dx, dy = 0.1, 0.05
nx, ny = 101, 101
f = [[None]*nx for i in range(ny)]
for ix in range(nx):
x = xmin + dx*ix
for iy in range(ny):
y = ymin + dy*iy
f[iy][ix] = math.exp(-x/4.) * math.exp(-((y-2)/1.5)**2)

I would use numpy for this, because numpy makes it very simple to do what you want. If you can't use it, then something like the following should work:
import math
def gauss(x, mu=0.0, sigma=1.0):
return 1.0 / math.sqrt(2.0*math.pi*sigma**2) * math.exp(-0.5*(x-mu)**2/sigma**2)
def exponential(x, lam=1.0):
return lam * math.exp(-lam * x)
# X values from -10 to 10 with 0.01 step size
xvals = [x * 0.01 for x in range(-1000, 1001)]
# Y values from 0 to 10 with 0.01 step size
yvals = [y * 0.01 for y in range(0, 1001)]
# Calculate your function at the grid points
f = [[gauss(x)*exponential(y) for x in xvals] for y in yvals]

Best way to interpolate a numpy.ndarray along an axis

I have 4-dimensional data, say for the temperature, in an numpy.ndarray.
The shape of the array is (ntime, nheight_in, nlat, nlon).
I have corresponding 1D arrays for each of the dimensions that tell me which time, height, latitude, and longitude a certain value corresponds to, for this example I need height_in giving the height in metres.
Now I need to bring it onto a different height dimension, height_out, with a different length.
The following seems to do what I want:
ntime, nheight_in, nlat, nlon = t_in.shape
nheight_out = len(height_out)
t_out = np.empty((ntime, nheight_out, nlat, nlon))
for time in range(ntime):
for lat in range(nlat):
for lon in range(nlon):
t_out[time, :, lat, lon] = np.interp(
height_out, height_in, t[time, :, lat, lon]
)
But with 3 nested loops, and lots of switching between python and numpy, I don't think this is the best way to do it.
Any suggestions on how to improve this? Thanks

scipy's interp1d can help:
import numpy as np
from scipy.interpolate import interp1d
ntime, nheight_in, nlat, nlon = (10, 20, 30, 40)
heights = np.linspace(0, 1, nheight_in)
t_in = np.random.normal(size=(ntime, nheight_in, nlat, nlon))
f_out = interp1d(heights, t_in, axis=1)
nheight_out = 50
new_heights = np.linspace(0, 1, nheight_out)
t_out = f_out(new_heights)

I was looking for a similar function that works with irregularly spaced coordinates, and ended up writing my own function. As far as I see, the interpolation is handled nicely and the performance in terms of memory and speed is also quite good. I thought I'd share it here in case anyone else comes across this question looking for a similar function:
import numpy as np
import warnings
def interp_along_axis(y, x, newx, axis, inverse=False, method='linear'):
""" Interpolate vertical profiles, e.g. of atmospheric variables
using vectorized numpy operations
This function assumes that the x-xoordinate increases monotonically
ps:
* Updated to work with irregularly spaced x-coordinate.
* Updated to work with irregularly spaced newx-coordinate
* Updated to easily inverse the direction of the x-coordinate
* Updated to fill with nans outside extrapolation range
* Updated to include a linear interpolation method as well
(it was initially written for a cubic function)
Peter Kalverla
March 2018
--------------------
More info:
Algorithm from: http://www.paulinternet.nl/?page=bicubic
It approximates y = f(x) = ax^3 + bx^2 + cx + d
where y may be an ndarray input vector
Returns f(newx)
The algorithm uses the derivative f'(x) = 3ax^2 + 2bx + c
and uses the fact that:
f(0) = d
f(1) = a + b + c + d
f'(0) = c
f'(1) = 3a + 2b + c
Rewriting this yields expressions for a, b, c, d:
a = 2f(0) - 2f(1) + f'(0) + f'(1)
b = -3f(0) + 3f(1) - 2f'(0) - f'(1)
c = f'(0)
d = f(0)
These can be evaluated at two neighbouring points in x and
as such constitute the piecewise cubic interpolator.
"""
# View of x and y with axis as first dimension
if inverse:
_x = np.moveaxis(x, axis, 0)[::-1, ...]
_y = np.moveaxis(y, axis, 0)[::-1, ...]
_newx = np.moveaxis(newx, axis, 0)[::-1, ...]
else:
_y = np.moveaxis(y, axis, 0)
_x = np.moveaxis(x, axis, 0)
_newx = np.moveaxis(newx, axis, 0)
# Sanity checks
if np.any(_newx[0] < _x[0]) or np.any(_newx[-1] > _x[-1]):
# raise ValueError('This function cannot extrapolate')
warnings.warn("Some values are outside the interpolation range. "
"These will be filled with NaN")
if np.any(np.diff(_x, axis=0) < 0):
raise ValueError('x should increase monotonically')
if np.any(np.diff(_newx, axis=0) < 0):
raise ValueError('newx should increase monotonically')
# Cubic interpolation needs the gradient of y in addition to its values
if method == 'cubic':
# For now, simply use a numpy function to get the derivatives
# This produces the largest memory overhead of the function and
# could alternatively be done in passing.
ydx = np.gradient(_y, axis=0, edge_order=2)
# This will later be concatenated with a dynamic '0th' index
ind = [i for i in np.indices(_y.shape[1:])]
# Allocate the output array
original_dims = _y.shape
newdims = list(original_dims)
newdims[0] = len(_newx)
newy = np.zeros(newdims)
# set initial bounds
i_lower = np.zeros(_x.shape[1:], dtype=int)
i_upper = np.ones(_x.shape[1:], dtype=int)
x_lower = _x[0, ...]
x_upper = _x[1, ...]
for i, xi in enumerate(_newx):
# Start at the 'bottom' of the array and work upwards
# This only works if x and newx increase monotonically
# Update bounds where necessary and possible
needs_update = (xi > x_upper) & (i_upper+1<len(_x))
# print x_upper.max(), np.any(needs_update)
while np.any(needs_update):
i_lower = np.where(needs_update, i_lower+1, i_lower)
i_upper = i_lower + 1
x_lower = _x[[i_lower]+ind]
x_upper = _x[[i_upper]+ind]
# Check again
needs_update = (xi > x_upper) & (i_upper+1<len(_x))
# Express the position of xi relative to its neighbours
xj = (xi-x_lower)/(x_upper - x_lower)
# Determine where there is a valid interpolation range
within_bounds = (_x[0, ...] < xi) & (xi < _x[-1, ...])
if method == 'linear':
f0, f1 = _y[[i_lower]+ind], _y[[i_upper]+ind]
a = f1 - f0
b = f0
newy[i, ...] = np.where(within_bounds, a*xj+b, np.nan)
elif method=='cubic':
f0, f1 = _y[[i_lower]+ind], _y[[i_upper]+ind]
df0, df1 = ydx[[i_lower]+ind], ydx[[i_upper]+ind]
a = 2*f0 - 2*f1 + df0 + df1
b = -3*f0 + 3*f1 - 2*df0 - df1
c = df0
d = f0
newy[i, ...] = np.where(within_bounds, a*xj**3 + b*xj**2 + c*xj + d, np.nan)
else:
raise ValueError("invalid interpolation method"
"(choose 'linear' or 'cubic')")
if inverse:
newy = newy[::-1, ...]
return np.moveaxis(newy, 0, axis)
And this is a small example to test it:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d as scipy1d
# toy coordinates and data
nx, ny, nz = 25, 30, 10
x = np.arange(nx)
y = np.arange(ny)
z = np.tile(np.arange(nz), (nx,ny,1)) + np.random.randn(nx, ny, nz)*.1
testdata = np.random.randn(nx,ny,nz) # x,y,z
# Desired z-coordinates (must be between bounds of z)
znew = np.tile(np.linspace(2,nz-2,50), (nx,ny,1)) + np.random.randn(nx, ny, 50)*0.01
# Inverse the coordinates for testing
z = z[..., ::-1]
znew = znew[..., ::-1]
# Now use own routine
ynew = interp_along_axis(testdata, z, znew, axis=2, inverse=True)
# Check some random profiles
for i in range(5):
randx = np.random.randint(nx)
randy = np.random.randint(ny)
checkfunc = scipy1d(z[randx, randy], testdata[randx,randy], kind='cubic')
checkdata = checkfunc(znew)
fig, ax = plt.subplots()
ax.plot(testdata[randx, randy], z[randx, randy], 'x', label='original data')
ax.plot(checkdata[randx, randy], znew[randx, randy], label='scipy')
ax.plot(ynew[randx, randy], znew[randx, randy], '--', label='Peter')
ax.legend()
plt.show()

Following the criteria of numpy.interp, one can assign the left/right bounds to the points outside the range adding this lines after within_bounds = ...
out_lbound = (xi <= _x[0,...])
out_rbound = (_x[-1,...] <= xi)
and
newy[i, out_lbound] = _y[0, out_lbound]
newy[i, out_rbound] = _y[-1, out_rbound]
after newy[i, ...] = ....
If I understood well the strategy used by #Peter9192, I think the changes are in the same line. I've checked a little bit, but maybe some strange case could not work properly.

Understanding scipy's least square function with IRLS

I'm having a bit of trouble understanding how this function works.
a, b = scipy.linalg.lstsq(X, w*signal)[0]
I know that signal is the array representing the signal and currently w is just [1,1,1,1,1...]
How should I manipulate X or w to imitate weighted least squares or iteratively reweighted least squared?

If you product X and y with sqrt(weight) you can calculate weighted least squares.
You can get the formula by following link:
http://en.wikipedia.org/wiki/Linear_least_squares_%28mathematics%29#Weighted_linear_least_squares
here is an example:
Prepare data:
import numpy as np
np.random.seed(0)
N = 20
X = np.random.rand(N, 3)
w = np.array([1.0, 2.0, 3.0])
y = np.dot(X, w) + np.random.rand(N) * 0.1
OLS:
from scipy import linalg
w1 = linalg.lstsq(X, y)[0]
print w1
output:
[ 0.98561405 2.0275357 3.05930664]
WLS:
weights = np.linspace(1, 2, N)
Xw = X * np.sqrt(weights)[:, None]
yw = y * np.sqrt(weights)
print linalg.lstsq(Xw, yw)[0]
output:
[ 0.98799029 2.02599521 3.0623824 ]
Check result by statsmodels:
import statsmodels.api as sm
mod_wls = sm.WLS(y, X, weights=weights)
res = mod_wls.fit()
print res.params
output:
[ 0.98799029 2.02599521 3.0623824 ]

Create a diagonal matrix W from the elementwise square-roots of w. Then I think you just want:
scipy.linalg.lstsq(np.dot(W, X), np.dot(W*signal))
Following http://en.wikipedia.org/wiki/Linear_least_squares_(mathematics)#Weighted_linear_least_squares

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Normal Equation Implementation in Python / Numpy - python

This assumes X is an m by n+1 dimensional matrix where x_0 always = 1 and y is a m-dimensional vector. import numpy as np step1 = np.dot(X.T, X) step2 = np.linalg.pinv(step1) step3 = np.dot(step2, X.T) theta = np.dot(step3, y) # if y is m x 1. If 1xm, then use y.T

Your implementation is correct. You've only swapped X and y (look closely how they define x and y), that's why you get a different result. The call normalEquation(y, X) gives [ 24.96601443 3.30576144] as it should.

Here is the normal equation in one line: theta = np.dot(np.linalg.inv(np.dot(X.T,X)),np.dot(X.T,Y))

Related

SKlearn Gaussian Process with constant, manually set correlation

How can I get data for a quiver plot in torch?

Create a 2D array that is the product of two functions

Best way to interpolate a numpy.ndarray along an axis

Understanding scipy's least square function with IRLS

Categories

Resources