Implementing Linear Regression using numpy - python

I am trying to learn the linear equation y = x1 + x2 + e, where e is a random error between 0 and 0.5.
The data is defined as follows:
import numpy as np
import random

X1 = np.random.randint(1, 10000, 5000)
X2 = np.random.randint(1, 10000, 5000)
e = np.array([random.uniform(0, 0.5) for i in range(5000)])
y = X1 + X2 + e
When I implement a simple gradient descent to find the parameters, the loss and the gradients all explode. Where am I going wrong? The code for gradient descent:
w1, w2, b = 1, 1, 0
n = X1.shape[0]
alpha = 0.01
for i in range(5):
    y_pred = w1 * X1 + w2 * X2 + b
    L = np.sum(np.square(y - y_pred)) / (2 * n)
    dL_dw1 = (-1/n) * np.sum((y - y_pred) * X1)
    dL_dw2 = (-1/n) * np.sum((y - y_pred) * X2)
    dL_db = (-1/n) * np.sum((y - y_pred))
    w1 = w1 - alpha * dL_dw1
    w2 = w2 - alpha * dL_dw2
    b = b - alpha * dL_db
    print(L, w1, w2, b)
The output for this is:
0.042928723015982384 13.7023102434034 13.670617201430483 0.00254938447277222
9291487188.8259 -7353857.489486973 -7293941.123714662 -1261.9252592161051
3.096713445664372e+21 4247172241132.3584 4209117175658.749 728518135.2857293
1.0320897597938595e+33 -2.4520737800716524e+18 -2.4298158059267333e+18 -420579738783719.2
3.4398058610314825e+44 1.415615899689713e+24 1.402742160404974e+24 2.428043942370682e+20

All you are missing is data normalization. For gradient-based learning algorithms you have to make sure the data is normalized, i.e. that it has mean = 0 and std = 1.
Let's verify this by using a constant error (say e = 33).
X1 = np.random.randint(1, 10000, 5000)
X2 = np.random.randint(1, 10000, 5000)
e = 33
# Normalize data
X1 = (X1 - np.mean(X1)) / np.std(X1)
X2 = (X2 - np.mean(X2)) / np.std(X2)
y = X1 + X2 + e
w1, w2, b = np.random.rand(), np.random.rand(), np.random.rand()
n = X1.shape[0]
alpha = 0.01
for i in range(1000):
    y_pred = w1 * X1 + w2 * X2 + b
    L = np.sum(np.square(y - y_pred)) / (2 * n)
    dL_dw1 = (-1/n) * np.sum((y - y_pred) * X1)
    dL_dw2 = (-1/n) * np.sum((y - y_pred) * X2)
    dL_db = (-1/n) * np.sum((y - y_pred))
    w1 = w1 - alpha * dL_dw1
    w2 = w2 - alpha * dL_dw2
    b = b - alpha * dL_db
    if i % 100 == 0:
        print("Loss:", L)
print(w1, w2, b)
Output:
Loss: 517.7575710514508
Loss: 69.36601211594098
Loss: 9.29326322560041
Loss: 1.2450619081931993
Loss: 0.16680720657514425
Loss: 0.022348057963833764
Loss: 0.002994096883392299
Loss: 0.0004011372165515275
Loss: 5.374289796164062e-05
Loss: 7.2002934167549005e-06
0.9999609731610163 0.9999911458582055 32.99861157362915
As you can see, it converged.
There are no issues in your code other than the missing normalization.
Now you can plug your random error back in and find the best possible estimates.
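If you need the coefficients back in the original units, you can undo the standardization algebraically. A minimal sketch, assuming you saved the statistics (the hypothetical X1_mean, X1_std, X2_mean, X2_std below) before overwriting X1 and X2 with their normalized versions:
# If y = w1*(X1 - mu1)/s1 + w2*(X2 - mu2)/s2 + b on standardized inputs, then in
# original units: y = (w1/s1)*X1 + (w2/s2)*X2 + (b - w1*mu1/s1 - w2*mu2/s2)
w1_orig = w1 / X1_std
w2_orig = w2 / X2_std
b_orig = b - w1 * X1_mean / X1_std - w2 * X2_mean / X2_std
print(w1_orig, w2_orig, b_orig)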

Okay, there are a few problems with the problem formulation.
Scaling: Gradient descent generally needs the variables to be scaled well so that alpha can be set properly. Everything is relative in most cases, and you can always multiply a problem through by a fixed constant; however, because the weights are adjusted directly through the alpha-scaled gradients, very large or very small weight values are hard to reach. I am therefore scaling your setup down by a factor of about 10000 and reducing the random error to the same scale.
import numpy as np
import random
X1 = np.random.random(5000)
X2 = np.random.random(5000)
e = np.array([random.uniform(0, 0.0005) for i in range(5000)])
y = X1 + X2 + e
Dependence of y_pred on b: I am not sure what b is supposed to do here; by fitting it you are explicitly introducing the error term into y_pred. Your prediction should assume that there is no error :D
If X and y are scaled well, a few tries with the hyperparameters will yield good values.
n = X1.shape[0]
for i in range(5):
    y_pred = w1 * X1 + w2 * X2
    L = np.sum(np.square(y - y_pred)) / (2 * n)
    dL_dw1 = -(1/n) * np.sum((y - y_pred) * X1)
    dL_dw2 = -(1/n) * np.sum((y - y_pred) * X2)
    # dL_db is dropped along with b, since the model no longer has a bias term
    w1 = w1 - alpha * dL_dw1
    w2 = w2 - alpha * dL_dw2
    print(L, w1, w2)
You can play around with these values, but they will converge. Starting from:
w1, w2 = 1.1, 0.9
alpha = 1
0.0008532534726479387 1.0911950693892498 0.9082610891021278
0.0007137567968828647 1.0833134985852988 0.9159869797801239
0.0005971536415151483 1.0761750602775175 0.9231234590515701
0.0004996145120126794 1.0696746682185534 0.9296797694772246
0.0004180103133293466 1.0637407602096771 0.9356885401106588

Related

Orbit spirals using 4th order Yoshida integration

I am attempting to use the 4th-order Yoshida integration technique to model the orbit of satellites in circular orbits around the Earth.
However, the orbits I get spiral away quite quickly. The code for a Moon-like satellite is below. Interestingly, the particles behaved when I used the Euler method; however, I wanted to try a more accurate method, so the issue could be in how I have implemented the algorithm itself.
I have tried using the gravitational parameter rather than computing G*M, but this did not help. I also reduced the time-step, messed around with units, printed and checked values at various integration steps, etc., but could not find anything.
Is this the correct use of this algorithm?
import numpy as np

G = 6.674e-20            # km^3 kg^-1 s^-2
day = 60.0 * 60.0 * 24.0 # length of a day in seconds
dt = day / 10.0
M = 5.972e24             # kg
N = 1
delta = np.random.random(1) * 2.0 * np.pi / N
angles = np.linspace(0.0, 2.0 * np.pi, N) + delta
rad = np.random.uniform(low=384e3, high=384e3, size=(N))  # km (Moon-like radius)
x, y = rad * np.cos(angles), rad * np.sin(angles)
vx, vy = np.sqrt(G*M / rad) * -np.sin(angles), np.sqrt(G*M / rad) * np.cos(angles)
def update(frame):
    global x, y, vx, vy, dt, day
    # positions is a matplotlib Line2D created elsewhere for the animation
    positions.set_data(x, y)
    # coefficients
    q = 2**(1/3)
    w1 = 1 / (2 - q)
    w0 = -q * w1
    d1 = w1
    d3 = w1
    d2 = w0
    c1 = w1 / 2
    c2 = (w0 + w1) / 2
    c3 = c2
    c4 = c1
    # Step 1
    x1 = x + c1*vx*dt
    y1 = y + c1*vy*dt
    dist1 = np.hypot(x1, y1)
    acc1 = -(G*M) / (dist1**2.0)
    dx1 = x1 - x
    dy1 = y1 - y
    accx1 = (acc1*dx1)/(x1)
    accy1 = (acc1*dy1)/(y1)
    vx1 = vx + d1*accx1*dt
    vy1 = vy + d1*accy1*dt
    # Step 2
    x2 = x1 + c2*vx1*dt
    y2 = y1 + c2*vy1*dt
    dist2 = np.hypot(x2, y2)
    acc2 = -(G*M) / (dist2**2.0)
    dx2 = x2 - x1
    dy2 = y2 - y1
    accx2 = (acc2*dx2)/(x2)
    accy2 = (acc2*dy2)/(y2)
    vx2 = vx1 + d2*accx2*dt
    vy2 = vy1 + d2*accy2*dt
    # Step 3
    x3 = x2 + c3*vx2*dt
    y3 = y2 + c3*vy2*dt
    dist3 = np.hypot(x3, y3)
    acc3 = -(G*M) / (dist3**2.0)
    dx3 = x3 - x2
    dy3 = y3 - y2
    accx3 = (acc3*dx3)/(x3)
    accy3 = (acc3*dy3)/(y3)
    vx3 = vx2 + d3*accx3*dt
    vy3 = vy2 + d3*accy3*dt
    # Full step
    x = x3 + c4*vx3*dt
    y = y3 + c4*vy3*dt
    vx = vx3
    vy = vy3
    return positions
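For reference, the usual way to get the Cartesian components of an inverse-square central acceleration is to scale the magnitude -G*M/r^2 by the unit vector (x/r, y/r) at the current substep position, rather than building it from differences between successive positions. A minimal sketch of that decomposition (an illustration of the standard formula, not a verified fix for the code above):
import numpy as np

def accel(xp, yp, GM):
    # acceleration of a body at (xp, yp) attracted toward the origin:
    # a_x = -GM * xp / r^3, a_y = -GM * yp / r^3
    r = np.hypot(xp, yp)
    return -GM * xp / r**3, -GM * yp / r**3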

Fading a Line Exponentially

I'd like to fade the values of a line on the Y axis using a gradient that I can control.
This is a simplified version of what I'm trying right now.
y_values = [1, 1, 1, 1, 1, 1]
for i, y in enumerate(y_values):    # i is the index, y is the individual y value
    perc = i / (len(y_values) - 1)  # progress the fade by the fraction of the index over the total
    amt = y * perc                  # y is the target amount to decrease, scaled by the index percentage
    y -= amt
    print(i, y)
This code produces this:
1.0
0.8
0.6
0.4
0.2
0.0
It's creating a linear fade, but how do I make the fade exponential instead?
Thank you!
To make an exponential fade, you need two coefficients so that the factor is 1.0 at the start value x1 and the desired final factor k at the end of the interval, x2:
y = y * f(x)
f(x) = A * exp(B * x)
So
f(x1) = 1 = A * exp(B * x1)
f(x2) = k = A * exp(B * x2)
Divide the second by the first:
k = exp(B * (x2 - x1))
ln(k) = B * (x2 - x1)
so
B = ln(k) / (x2 - x1)
A = exp(-B * x1)
Example for x1 = 0, x2 = 60, k = 0.01:
B = -4.6 / 60 = -0.076
A = 1
f(x) = exp(-0.076 * x)
f(30) = exp(-0.076 * 30) ≈ 0.1
Python example:
import math

def calcfading(x1, x2, ratio):
    B = math.log(ratio) / (x2 - x1)
    A = math.exp(-B * x1)   # so that f(x1) = A * exp(B * x1) = 1
    return A, B

def coef(x, fade):
    return fade[0] * math.exp(x * fade[1])

cosine = [[x, math.cos(x)] for x in range(0, 11)]
print(cosine)
print()
fade = calcfading(0, 10, 0.01)
expcosine = [[s[0], coef(s[0], fade) * s[1]] for s in cosine]
print(expcosine)
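Applied to the flat line from the question, a short usage sketch (the end ratio of 0.01 is just an example choice):
y_values = [1, 1, 1, 1, 1, 1]
fade = calcfading(0, len(y_values) - 1, 0.01)  # factor 1.0 at index 0, 0.01 at the last index
faded = [y * coef(i, fade) for i, y in enumerate(y_values)]
print(faded)  # exponentially decaying values instead of a linear ramp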

Multivariate Regression Numpy for Math Homework

I'm looking to use multivariate regression with least squares as my cost function to find a, b, c such that ax^2 + bx + c best fits cos(x) on (-2, 2). My cost won't decrease and is ridiculously high. What am I doing wrong?
import numpy as np

x = np.linspace(-2, 2, 100)
y = np.cos(x)
theta = np.random.random((3, 1))
m = len(y)
for i in range(10000):
    # Calculate my y_hat
    y_hat = np.array([(theta[0]*(a**2) + theta[1]*a + theta[2]) for a in x])
    # Calculate my cost based off y_hat and y
    cost = np.sum((y_hat - y) ** 2) * (1/m)
    # Calculate my derivatives based off y_hat and x
    da = (2 / m) * np.sum((y_hat - y) * (x**2))
    db = (2 / m) * np.sum((y_hat - y) * (x))
    dc = (2 / m) * np.sum((y_hat - y))
    # update step
    theta[0] = theta[0] - 0.0001*(da)
    theta[1] = theta[1] - 0.0001*(db)
    theta[2] = theta[2] - 0.0001*(dc)
    print("Epoch Num: {} Cost: {}".format(i, cost))
print(theta)
Your calculation of y_hat is slightly incorrect: it's currently a 2D array of shape (100, 1), so y_hat - y broadcasts to a (100, 100) array, which is why the cost is huge and never decreases.
This should help. It pulls the zeroth element from each of the rows:
theta_ = [(theta[0]*(a**2) + theta[1]*a + theta[2]) for a in x]
y_hat = np.array([t[0] for t in theta_])
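Equivalently, you can skip the Python-level loop entirely; a vectorized sketch that produces a flat (100,) prediction directly, using a flat theta instead of the (3, 1) column:
theta = np.random.random(3)                        # flat (3,) instead of (3, 1)
y_hat = theta[0] * x**2 + theta[1] * x + theta[2]  # shape (100,), matches y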

Neural Network XOR with numpy not converging

I have trained a neural net to solve the XOR problem, but it is not converging. I am using Andrew Ng's methods and notation as taught in the DeepLearning.ai course.
Here's the code :
from __future__ import print_function  # must come before any other import
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(z):
    return sigmoid(z) * (1 - sigmoid(z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([[0, 1, 1, 0]])
np.random.seed(1)
W1 = np.random.randn(3, 2) * 0.0001
b1 = np.ones((3, 1))
W2 = np.random.randn(1, 3) * 0.0001
b2 = np.ones((1, 1))
The next part for the Backpropagation:
learning_rate = 0.01
m = 4
for iteration in range(100000):
    # forward propagation
    # layer1
    Z1 = np.dot(W1, X.T) + b1
    A1 = sigmoid(Z1)
    # layer2
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)
    # backpropagation
    dZ2 = Y - A2
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(dW2.T, dZ2) * sigmoid_gradient(Z1)
    dW1 = (1 / m) * np.dot(dZ1, X)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)
    # checking if shapes are correctly preserved
    assert dZ2.shape == Z2.shape
    assert dW2.shape == W2.shape
    assert db2.shape == b2.shape
    assert dZ1.shape == Z1.shape
    assert dW1.shape == W1.shape
    assert db1.shape == b1.shape
    # update parameters
    W1 = W1 + learning_rate * dW1
    W2 = W2 + learning_rate * dW2
    b1 = b1 + learning_rate * db1
    b2 = b2 + learning_rate * db2
    # print every 10k iterations
    if iteration % 10000 == 0:
        print(A2)
You have made a couple of mistakes in your code, for example in computing the gradient for W2:
...
dZ2 = Y - A2
dW2 = (1 / m) * np.dot(dZ2, A1.T)
...
W2 = W2 + learning_rate * dW2
We want to calculate the derivative of the cost with respect to W2 using the chain rule.
We can write the derivatives as follows:
dL/dW2 = (dL/dA2) * (dA2/dZ2) * (dZ2/dW2)
You haven't implemented the middle part, dA2/dZ2 = sigmoid(Z2) * (1 - sigmoid(Z2)), which computes the derivative through Z2.
You can check out this video, it explains the math part of backpropagation. Moreover, you can check out this simple implementation of the neural network.
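A minimal sketch of the backward pass with that factor included, assuming a mean-squared-error-style cost (which is what the dZ2 = Y - A2 form suggests); it also uses W2.T rather than dW2.T in dZ1, which looks like a second slip:
dA2 = A2 - Y                              # dL/dA2
dZ2 = dA2 * sigmoid_gradient(Z2)          # include the missing dA2/dZ2 factor
dW2 = (1 / m) * np.dot(dZ2, A1.T)
db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
dZ1 = np.dot(W2.T, dZ2) * sigmoid_gradient(Z1)   # W2.T, not dW2.T
dW1 = (1 / m) * np.dot(dZ1, X)
db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)
# with this sign convention, gradient descent subtracts the gradients
W2 = W2 - learning_rate * dW2
b2 = b2 - learning_rate * db2
W1 = W1 - learning_rate * dW1
b1 = b1 - learning_rate * db1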

How to correctly implement a time dependent variable when using scipy.integrate.odeint

I'm trying to solve a system of ODEs basically resembling this one, but with one more spring and damper ==> http://scipy-cookbook.readthedocs.io/items/CoupledSpringMassSystem.html
I have a slight problem, though, because one of the parameters I want to implement is time-dependent. My first attempt is the following one:
import scipy as sci
import numpy as np
import matplotlib.pyplot as plt

def bump(t):
    if t <= (0.25 / 6.9):
        return 0.075 * (1 - np.cos(np.pi * 8 * 6.9 * t))
    else:
        return 0

def membre_droite(w, t, p):
    x1, y1, x2, y2, x3, y3 = w
    m1, m2, m3, k1, k2, k3, l1, l2, l3, c2, c3 = p
    f = [y1,
         (-k1 * (x1 - l1 - bump(t)) + k2 * (x2 - x1 - l2) + c2 * (y2 - y1)) / m1,
         y2,
         (-c2 * (y2 - y1) - k2 * (x2 - x1 - l2) + k3 * (x3 - x2 - l3) + c3 * (y3 - y2)) / m2,
         y3,
         (-c3 * (y3 - y2) - k3 * (x3 - x2 - l3)) / m3]
    return f
# Initial values
x11 = 0.08
y11 = 0
x22 = 0.35
y22 = 0
x33 = 0.6
y33 = 0
# Parameters
m1 = 90
m2 = 4000
m3 = 105
k1 = 250000
k2 = 25000
k3 = 30000
l1 = 0.08
l2 = x22-x11
l3 = x33-x22
c2 = 2500
c3 = 850
# Initial parameters regrouped + time array
time = np.linspace(0.0, 5, 1001)
w0 = [x11,y11,x22,y22,x33,y33]
p0 = [m1,m2,m3,k1,k2,k3,l1,l2,l3,c2,c3]
x1,y1,x2,y2,x3,y3 = sci.integrate.odeint(membre_droite, w0, time, args=(p0,)).T
plt.plot(time,x1,'b')
plt.plot(time,x2,'g')
plt.plot(time,x3,'r')
plt.plot(time,y2,'yellow')
plt.plot(time,y3,'black')
plt.xlabel('t')
plt.grid(True)
plt.legend((r'$x_1$', r'$x_2$', r'$x_3$', r'$y_2$', r'$y_3$'))
plt.show()
The error I get is :
if t <= (0.25 / 6.9):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I've looked for similar cases and came across this topic ==> Solving a system of odes (with changing constant!) using scipy.integrate.odeint?
I then attempted to adapt my code to this format:
import scipy as sci
import numpy as np
import matplotlib.pyplot as plt

def bump(t):
    if t <= (0.25 / 6.9):
        return 0.075 * (1 - np.cos(np.pi * 8 * 6.9 * t))
    else:
        return 0

def membre_droite(w, t, bump):
    x1, y1, x2, y2, x3, y3 = w
    f = [y1,
         (-250000 * (x1 - x11 - bump(t)) + 25000 * (x2 - x1 - x22 + x11) + 2500 * (y2 - y1)) / 90,
         y2,
         (-2500 * (y2 - y1) - 25000 * (x2 - x1 - x22 + x11) + 30000 * (x3 - x2 - x33 + x22) + 850 * (y3 - y2)) / 4000,
         y3,
         (-850 * (y3 - y2) - 30000 * (x3 - x2 - x33 + x22)) / 105]
    return f
# Initial values
x11 = 0.08
y11 = 0
x22 = 0.35
y22 = 0
x33 = 0.6
y33 = 0
# Initial parameters regrouped + time array
time = np.linspace(0.0, 5, 1001)
w0 = [x11,y11,x22,y22,x33,y33]
x1,y1,x2,y2,x3,y3 = sci.integrate.odeint(membre_droite, w0, time, args=(bump,)).T
plt.plot(time,x1,'b')
plt.plot(time,x2,'g')
plt.plot(time,x3,'r')
plt.plot(time,y2,'yellow')
plt.plot(time,y3,'black')
plt.xlabel('t')
plt.grid(True)
plt.legend((r'$x_1$', r'$x_2$', r'$x_3$', r'$y_2$', r'$y_3$'))
plt.show()
Reading the previous link, it should have worked, but I get another error:
(-250000 * (x1 - x11 - bump(t)) + 25000 * (x2 - x1 - x22 + x11) + 2500 * (y2 - y1)) / 90,
TypeError: 'list' object is not callable
Change to:
from scipy.integrate import odeint
and
x1,y1,x2,y2,x3,y3 = odeint(membre_droite, w0, time, args=(bump,)).T
(A plain import scipy does not reliably make the scipy.integrate submodule available; importing odeint directly ensures you are calling the actual function.)
Complete code:
from scipy.integrate import odeint
import numpy as np
import matplotlib.pyplot as plt

def bump(t):
    if t <= (0.25 / 6.9):
        return 0.075 * (1 - np.cos(np.pi * 8 * 6.9 * t))
    else:
        return 0

def membre_droite(w, t, bump):
    x1, y1, x2, y2, x3, y3 = w
    f = [y1,
         (-250000 * (x1 - x11 - bump(t)) + 25000 * (x2 - x1 - x22 + x11) + 2500 * (y2 - y1)) / 90,
         y2,
         (-2500 * (y2 - y1) - 25000 * (x2 - x1 - x22 + x11) + 30000 * (x3 - x2 - x33 + x22) + 850 * (y3 - y2)) / 4000,
         y3,
         (-850 * (y3 - y2) - 30000 * (x3 - x2 - x33 + x22)) / 105]
    return f
# Initial values
x11 = 0.08
y11 = 0
x22 = 0.35
y22 = 0
x33 = 0.6
y33 = 0
# Initial parameters regrouped + time array
time = np.linspace(0.0, 5, 1001)
w0 = [x11,y11,x22,y22,x33,y33]
x1,y1,x2,y2,x3,y3 = odeint(membre_droite, w0, time, args=(bump,)).T
plt.plot(time,x1,'b')
plt.plot(time,x2,'g')
plt.plot(time,x3,'r')
plt.plot(time,y2,'yellow')
plt.plot(time,y3,'black')
plt.xlabel('t')
plt.grid(True)
plt.legend((r'$x_1$', r'$x_2$', r'$x_3$', r'$y_2$', r'$y_3$'))
plt.show()
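If you still want the physical parameters in a list, as in the first attempt, you can pass both the list and the bump function through args; a sketch of that variant (same equations as the first version, since odeint forwards everything in args to the right-hand-side function):
def membre_droite(w, t, p, bump):
    x1, y1, x2, y2, x3, y3 = w
    m1, m2, m3, k1, k2, k3, l1, l2, l3, c2, c3 = p
    f = [y1,
         (-k1 * (x1 - l1 - bump(t)) + k2 * (x2 - x1 - l2) + c2 * (y2 - y1)) / m1,
         y2,
         (-c2 * (y2 - y1) - k2 * (x2 - x1 - l2) + k3 * (x3 - x2 - l3) + c3 * (y3 - y2)) / m2,
         y3,
         (-c3 * (y3 - y2) - k3 * (x3 - x2 - l3)) / m3]
    return f

x1, y1, x2, y2, x3, y3 = odeint(membre_droite, w0, time, args=(p0, bump)).T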
