Implementation of cost function in linear regression - python

I am trying to implement the cost function on a simple training dataset and visualise the cost function in 3D.
The shape of my cost function is not as it is supposed to be.
This is my code:
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d.axes3d import Axes3D
import pandas as pd
from scipy.interpolate import griddata
def create_array(start, end, resolution):
    return np.linspace(start, end, int((end - start)/resolution + 1))
def f(x,a,b):
    x = np.array(x)
    return a*x+b # or Theta_1 * x + Theta_0
def get_J(x, y, a, b):
    x = np.array(x)
    y = np.array(y)
    # return 1/(2*len(y)) * sum(pow(f(x,a,b) - y, 2))
    # Simple implementation
    total = 0  # named "total" to avoid shadowing the built-in sum()
    for i in range(0, len(x)):
        total += (f(x[i],a,b) - y[i])**2
    return 1/(2*len(y))*total
# Training set
x = np.array([0,1,2,3])
y = np.array([0,1,2,3])
Theta_0 = create_array(-20, 10, 0.5)
Theta_1 = create_array(-20, 10, 0.5)
X,Y = np.meshgrid(Theta_0, Theta_1)
X=X.flatten()
Y=Y.flatten()
J = [get_J(x, y, X[i], Y[i]) for i in range(0,len(X))]
# simple set to verify 3D plotting is doing as expected - OK
# X = [10, 0, -10,-20, 10, 0, -10,-20, 10, 0,-10, -20, 10, 0, -10,-20]
# Y = [-20, -20, -20, -20, -10, -10, -10, -10, 0, 0, 0, 0, 10, 10, 10, 10]
# J = [50, 25, 26, 60, 24, 10, 11, 26, 10, 0, 2, 11, 52, 26, 27, 63]
# Create the graphing elements
xyz = {'x': X, 'y': Y, 'z': J}
# put the data into a pandas DataFrame (this is what my data looks like)
df = pd.DataFrame(xyz, index=range(len(xyz['x'])))
# re-create the 2D-arrays
x1 = np.linspace(df['x'].min(), df['x'].max(), len(df['x'].unique()))
y1 = np.linspace(df['y'].min(), df['y'].max(), len(df['y'].unique()))
x2, y2 = np.meshgrid(x1, y1)
z2 = griddata((df['x'], df['y']), df['z'], (x2, y2), method='cubic')
fig = plt.figure(figsize =(14, 9))
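ax = fig.add_subplot(projection='3d')  # Axes3D(fig) is no longer attached to the figure automatically in recent matplotlib versions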
surf = ax.plot_surface(x2, y2, z2, rstride=1, cstride=1, cmap=plt.get_cmap('coolwarm'),linewidth=0, antialiased=False)
plt.gca().invert_xaxis()
ax.set_xlabel('\u03B81', fontweight ='bold')
ax.set_ylabel('\u03B80', fontweight ='bold')
ax.set_zlabel('J (\u03B81, \u03B80)', fontweight ='bold')
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
The 3D plot has the following shape:
when it is supposed to have this shape:

If you take paper and pencil and analytically derive the J you have implemented, you arrive at something like this:
a = theta_1: -20 ... 10
b = theta_0: -20 ... 10
J(a,b) ~ b^2 + (a+b-1)^2 + (2a+b-2)^2 + (3a+b-3)^2
This basically means that a and b are coupled, as in a+b. The terms containing a+b are squared, and a plot of (a+b)^2 looks like this (made with gnuplot):
The reference plot has a different form, which looks more like a and b being independent, as in a^2 + b^2. Let's plot this:
So we should be able to reproduce the reference plot if J has the form
J(a, b) ~ a^2 + b^2 + (other terms except a*b)
The form of J is given by the training set x and y. I leave it to you to show analytically that the values in x build the coupling between a and b. For y, I play with the values and arrive at:
x = np.array([-1, 1])
y = np.array([1, -4])
This is the simplest setting I can think of. There are many more possibilities.
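To make the coupling visible without paper and pencil, here is a minimal sketch that expands the sum of squared residuals into its polynomial coefficients. The helper name `quadratic_coefficients` is my own and not part of the question's code. The coefficient of a*b is 2*sum(x), so it vanishes exactly when the x values sum to zero:

```python
import numpy as np

def quadratic_coefficients(x, y):
    """Coefficients of sum((a*x + b - y)**2) expanded in a and b.

    This is 2*m*J(a, b), where m is the number of training points.
    (Hypothetical helper, for illustration only.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return {
        "a^2": np.sum(x**2),
        "b^2": float(len(x)),
        "a*b": 2 * np.sum(x),      # the coupling term between a and b
        "a": -2 * np.sum(x * y),
        "b": -2 * np.sum(y),
        "const": np.sum(y**2),
    }

# Training set from the question: the a*b coefficient is non-zero
original = quadratic_coefficients([0, 1, 2, 3], [0, 1, 2, 3])
# Training set proposed above: x sums to zero, so a and b decouple
centered = quadratic_coefficients([-1, 1], [1, -4])

print("a*b coefficient (original):", original["a*b"])
print("a*b coefficient (centered):", centered["a*b"])
```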
I'm not that deep into machine learning and the meaning of these values. My knowledge basically comes from here. So if I'm wrong please let me know.
Now I get the following image, and I think it is quite close to the reference one, at least the shape:
As a summary: I don't think there is a bug in your implementation. I think the reference plot simply shows J for a different training set than yours.
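If you want to rule out the pandas/griddata round trip as a source of distortion, a sketch like the following evaluates J directly on the parameter grid with NumPy broadcasting. The function name `cost_grid` is hypothetical; the grid bounds and training set are the ones from the question:

```python
import numpy as np

def cost_grid(x, y, theta_1, theta_0):
    """Evaluate J(theta_1, theta_0) on a 2D grid (hypothetical helper).

    Returns the meshgrid arrays A (slopes), B (intercepts) and the cost J,
    all with shape (len(theta_0), len(theta_1)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A, B = np.meshgrid(theta_1, theta_0)
    # Broadcast over the training points along a new last axis
    residuals = A[..., None] * x + B[..., None] - y
    J = np.sum(residuals**2, axis=-1) / (2 * len(y))
    return A, B, J

theta = np.linspace(-20, 10, 61)  # same range and 0.5 step as in the question
A, B, J = cost_grid([0, 1, 2, 3], [0, 1, 2, 3], theta, theta)
print(J.shape, J.min())
```

The arrays A, B and J already have the 2D shape that plot_surface expects, so no interpolation step is needed.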

Related

2D Elliptical fit to x,y data with tilt [duplicate]

How do I create a confidence ellipsis in a scatterplot using matplotlib?
The following code works until creating scatter plot. Then, is anyone familiar with putting confidence ellipses over the scatter plot?
import numpy as np
import matplotlib.pyplot as plt
x = [5,7,11,15,16,17,18]
y = [8, 5, 8, 9, 17, 18, 25]
plt.scatter(x,y)
plt.show()
Following is the reference for Confidence Ellipses from SAS.
http://support.sas.com/documentation/cdl/en/grstatproc/62603/HTML/default/viewer.htm#a003160800.htm
The code in sas is like this:
proc sgscatter data=sashelp.iris(where=(species="Versicolor"));
title "Versicolor Length and Width";
compare y=(sepalwidth petalwidth)
x=(sepallength petallength)
/ reg ellipse=(type=mean) spacing=4;
run;
The following code draws a one, two, and three standard deviation sized ellipses:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
x = [5,7,11,15,16,17,18]
y = [8, 5, 8, 9, 17, 18, 25]
cov = np.cov(x, y)
lambda_, v = np.linalg.eig(cov)
lambda_ = np.sqrt(lambda_)  # standard deviations along the principal axes
ax = plt.subplot(111, aspect='equal')
for j in range(1, 4):  # 1, 2 and 3 standard deviations (xrange is Python 2 only)
    ell = Ellipse(xy=(np.mean(x), np.mean(y)),
                  width=lambda_[0]*j*2, height=lambda_[1]*j*2,
                  angle=np.rad2deg(np.arccos(v[0, 0])),
                  facecolor='none', edgecolor='black')  # explicit edge color keeps the outline visible
    ax.add_artist(ell)
plt.scatter(x, y)
plt.show()
After giving the accepted answer a go, I found that it doesn't choose the quadrant correctly when calculating theta, as it relies on np.arccos:
Taking a look at the 'possible duplicate' and Joe Kington's solution on github, I watered his code down to this:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
def eigsorted(cov):
    vals, vecs = np.linalg.eigh(cov)
    order = vals.argsort()[::-1]
    return vals[order], vecs[:,order]
x = [5,7,11,15,16,17,18]
y = [25, 18, 17, 9, 8, 5, 8]
nstd = 2
ax = plt.subplot(111)
cov = np.cov(x, y)
vals, vecs = eigsorted(cov)
theta = np.degrees(np.arctan2(*vecs[:,0][::-1]))
w, h = 2 * nstd * np.sqrt(vals)
ell = Ellipse(xy=(np.mean(x), np.mean(y)),
width=w, height=h,
angle=theta, color='black')
ell.set_facecolor('none')
ax.add_artist(ell)
plt.scatter(x, y)
plt.show()
In addition to the accepted answer: I think the correct angle should be:
angle=np.rad2deg(np.arctan2(*v[:,np.argmax(abs(lambda_))][::-1])))
and the corresponding width (larger eigenvalue) and height should be:
width=lambda_[np.argmax(abs(lambda_))]*j*2, height=lambda_[1-np.argmax(abs(lambda_))]*j*2
This is because we need the eigenvector that belongs to the largest eigenvalue. According to the documentation (https://numpy.org/doc/stable/reference/generated/numpy.linalg.eig.html), "the eigenvalues are not necessarily ordered", and v[:,i] is the eigenvector corresponding to the eigenvalue lambda_[i], so we should pick the correct column of v with np.argmax(abs(lambda_)).
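As a rough sketch of the same idea, the following helper (the name `ellipse_params` and the `nstd` parameter are my own) uses np.linalg.eigh, which returns eigenvalues in ascending order for symmetric matrices, so the major axis is always the last column. The angle comes from np.arctan2, which resolves the quadrant:

```python
import numpy as np

def ellipse_params(cov, nstd=1.0):
    """Width, height and rotation angle (degrees) of a covariance ellipse.

    Hypothetical helper: assumes cov is a symmetric 2x2 covariance matrix.
    """
    vals, vecs = np.linalg.eigh(cov)   # ascending eigenvalues
    major = vecs[:, -1]                # eigenvector of the largest eigenvalue
    angle = np.degrees(np.arctan2(major[1], major[0]))
    # Full axis lengths: 2 * nstd * standard deviation, major axis first
    width, height = 2.0 * nstd * np.sqrt(vals[::-1])
    return width, height, angle

cov_demo = np.array([[2.0, 1.0],
                     [1.0, 2.0]])
width, height, angle = ellipse_params(cov_demo)
print(width, height, angle)
```

The sign of an eigenvector is arbitrary, so the angle may differ by 180 degrees between runs or platforms. That does not change the drawn ellipse.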
There is no need to compute angles explicitly once you have the eigendecomposition of your covariance matrix: the rotation portion already encodes that information for you for free:
cov = np.cov(x, y)
val, rot = np.linalg.eig(cov)
val = np.sqrt(val)
center = np.mean([x, y], axis=1)[:, None]
t = np.linspace(0, 2.0 * np.pi, 1000)
xy = np.stack((np.cos(t), np.sin(t)), axis=-1)
plt.scatter(x, y)
plt.plot(*(rot @ (val * xy).T + center))
You can expand your ellipse by applying a scale before translation:
plt.plot(*(2 * rot @ (val * xy).T + center))

How do I correctly obscure multiple overlapping plots by fill?

I have a figure where I wish to fill under each plot to obscure the plots behind it.
My desired result is akin to this example:
I think I need to set the zorder of either plot or fill_between (or both?), but I can't seem to get the correct combinations.
My current plot and code are below.
My current code:
import numpy as np
import matplotlib.pyplot as plt
def gaussian(x, mu, sig):
    return np.exp(-(x - mu)*(x - mu) / (2 * sig*sig))
mus = [4, 3, 2, 1, 0, -1, -2, -3, -4]
x = np.linspace(-10, 10, 500)
for i in range(len(mus) - 1, -1, -1):
    mu = mus[i]
    y = gaussian(x, mu, 1) + i * 0.1
    plt.plot(x, y)
    plt.fill_between(x, y, 0, color="lightgray") # The plot lines are not hidden by the fill. Probably need to do something with zorder
plt.show()
and of course, I find the answer after I ask...
for i in range(len(mus) - 1, -1, -1):
    mu = mus[i]
    y = gaussian(x, mu, 1) + i * 0.1
    zorder = len(mus) - i  # zorder increases as I draw the plots "in front"
    plt.plot(x, y, zorder=zorder)
    plt.fill_between(x, y, 0, color="lightgray", zorder=zorder)

Plot average of scattered values in 2D bins as a histogram/hexplot

I have 3 dimensional scattered data x, y, z.
I want to plot the average of z in bins of x and y as a hex plot or 2D histogram plot.
Is there any matplotlib function to do this?
I can only come up with some very cumbersome implementations even though this seems to be a common problem.
E.g. something like this:
Except that the color should depend on the average z values for the (x, y) bin (rather than the number of entries in the (x, y) bin as in the default hexplot/2D histogram functionalities).
If binning is what you are asking, then binned_statistic_2d might work for you. Here's an example:
from scipy.stats import binned_statistic_2d
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(0, 10, 1000)
y = np.random.uniform(10, 20, 1000)
z = np.exp(-(x-3)**2/5 - (y-18)**2/5) + np.random.random(1000)
x_bins = np.linspace(0, 10, 10)
y_bins = np.linspace(10, 20, 10)
ret = binned_statistic_2d(x, y, z, statistic=np.mean, bins=[x_bins, y_bins])
fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(12, 4))
ax0.scatter(x, y, c=z)
ax1.imshow(ret.statistic.T, origin='lower', extent=(0, 10, 10, 20))  # origin must be 'lower' or 'upper'
plt.show()
@Andrea's answer is very clear and helpful, but I wanted to mention a faster alternative that does not use the scipy library.
The idea is to do a 2d histogram of x and y weighted by the z variable (it has the sum of the z variable in each bin) and then normalize against the histogram without weights (it has the number of counts in each bin). In this way, you will calculate the average of the z variable in each bin.
The code:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(0, 10, 10**7)
y = np.random.uniform(10, 20, 10**7)
z = np.exp(-(x-3)**2/5 - (y-18)**2/5) + np.random.random(10**7)
x_bins = np.linspace(0, 10, 50)
y_bins = np.linspace(10, 20, 50)
H, xedges, yedges = np.histogram2d(x, y, bins = [x_bins, y_bins], weights = z)
H_counts, xedges, yedges = np.histogram2d(x, y, bins = [x_bins, y_bins])
H = H/H_counts
plt.imshow(H.T, origin='lower', cmap='RdBu',
extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]])
plt.colorbar()
On my computer, this method is approximately a factor of 5 faster than scipy's binned_statistic_2d.
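One caveat: with sparse data some bins may be empty, and H/H_counts then divides by zero. A minimal sketch of a guard, assuming you want NaN in empty bins (the helper name `binned_mean` is mine):

```python
import numpy as np

def binned_mean(x, y, z, bins):
    """Mean of z per (x, y) bin; empty bins are NaN (hypothetical helper)."""
    sums, xedges, yedges = np.histogram2d(x, y, bins=bins, weights=z)
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    mean = np.full_like(sums, np.nan)
    # Divide only where at least one sample fell into the bin
    np.divide(sums, counts, out=mean, where=counts > 0)
    return mean, xedges, yedges

# Tiny example: two points share one bin, one point sits alone, two bins are empty
x = np.array([0.5, 0.6, 1.5])
y = np.array([0.5, 0.4, 1.5])
z = np.array([2.0, 4.0, 10.0])
edges = np.array([0.0, 1.0, 2.0])
mean, xedges, yedges = binned_mean(x, y, z, [edges, edges])
print(mean)
```

imshow leaves NaN cells blank by default, so empty bins stay visually distinct from bins whose average happens to be zero.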

How can I give specific x values to `scipy.interpolate.splev`?

How can I interpolate a hysteresis loop at specific x points? Multiple related questions/answers are available on SOF regarding B-spline interpolation using scipy.interpolate.splprep (other questions here or here). However, I have hundreds of hysteresis loops at very similar (but not exactly same) x positions and I would like to perform B-spline interpolation on all of them at specific x coordinates.
Taking a previous example:
import numpy as np
from scipy import interpolate
from matplotlib import pyplot as plt
x = np.array([23, 24, 24, 25, 25])
y = np.array([13, 12, 13, 12, 13])
# append the starting x,y coordinates
x = np.r_[x, x[0]]
y = np.r_[y, y[0]]
# fit splines to x=f(u) and y=g(u), treating both as periodic. also note that s=0
# is needed in order to force the spline fit to pass through all the input points.
tck, u = interpolate.splprep([x, y], s=0, per=True)
# evaluate the spline fits for 1000 evenly spaced distance values
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
# plot the result
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, 'or')
ax.plot(xi, yi, '-b')
plt.show()
Is it possible to provide specific x values to interpolate.splev? I get unexpected results:
x2, y2 = interpolate.splev(np.linspace(start=23, stop=25, num=30), tck)
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, 'or')
ax.plot(x2, y2, '-b')
plt.show()
The b-spline gives x and y positions for a given u (between 0 and 1).
Getting y positions for a given x position involves solving for the inverse, and there can be many y's corresponding to one x (in the given example there are places with 4 y's, for example at x=24).
A simple way to get a list of (x,y)'s for x between two limits, is to create a filter:
import numpy as np
from scipy import interpolate
from matplotlib import pyplot as plt
x = np.array([23, 24, 24, 25, 25])
y = np.array([13, 12, 13, 12, 13])
# append the starting x,y coordinates
x = np.r_[x, x[0]]
y = np.r_[y, y[0]]
tck, u = interpolate.splprep([x, y], s=0, per=True)
# evaluate the spline fits for 1000 evenly spaced distance values
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
# plot the result
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, 'or')
ax.plot(xi, yi, '-b')
filter = (xi >= 24) & (xi <= 25)
x2 = xi[filter]
y2 = yi[filter]
ax.scatter(x2, y2, color='c')
plt.show()
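If you need the points at one exact x rather than everything inside a range, one possible approach is to solve x(u) = x0 numerically. The sketch below scans u for sign changes and refines each bracket with scipy.optimize.brentq. The function name `points_at_x` and the `n_scan` parameter are my own, and crossings where the curve only touches x0 without changing sign can be missed:

```python
import numpy as np
from scipy import interpolate, optimize

def points_at_x(tck, x0, n_scan=2000):
    """Find points (x, y) on a parametric spline where x equals x0.

    Hypothetical helper: scans u in [0, 1] for sign changes of x(u) - x0
    and refines each bracket with Brent's method.
    """
    def dx(u):
        return float(interpolate.splev(u, tck)[0]) - x0

    u_scan = np.linspace(0.0, 1.0, n_scan)
    d = interpolate.splev(u_scan, tck)[0] - x0
    points = []
    for k in range(n_scan - 1):
        if d[k] * d[k + 1] < 0:  # x(u) crosses x0 inside this interval
            u_root = optimize.brentq(dx, u_scan[k], u_scan[k + 1])
            xr, yr = interpolate.splev(u_root, tck)
            points.append((float(xr), float(yr)))
    return points

# Same closed loop as in the question
x = np.array([23, 24, 24, 25, 25])
y = np.array([13, 12, 13, 12, 13])
x = np.r_[x, x[0]]
y = np.r_[y, y[0]]
tck, u = interpolate.splprep([x, y], s=0, per=True)

crossings = points_at_x(tck, 24.5)
print(crossings)
```

For hundreds of loops you would call points_at_x once per loop and per target x, then decide which of the returned y values belongs to which branch of the hysteresis loop.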

