Logistic regression Python implementation - python

I tried to implement logistic regression with only numpy in Python, but the result is not satisfying. The predictions seem incorrect and the loss is not improving, so something is probably wrong with the code. Does anyone know what could fix it? Thank you very much!
Here is the algorithm:
import numpy as np

# training data and labels
X = np.concatenate((np.random.normal(0.25, 0.1, 50), np.random.normal(0.75, 0.1, 50)), axis=None)
Y = np.concatenate((np.zeros((50,), dtype=np.int32), np.ones((50,), dtype=np.int32)), axis=None)

def logistic_sigmoid(a):
    return 1 / (1 + np.exp(-a))

# forward pass
def forward_pass(w, x):
    return logistic_sigmoid(w * x)

# gradient computation
def backward_pass(x, y, y_real):
    return np.sum((y - y_real) * x)

# computing loss
def loss(y, y_real):
    return -np.sum(y_real * np.log(y) + (1 - y_real) * np.log(1 - y))

# training
def train():
    w = 0.0
    learning_rate = 0.01
    i = 200
    test_number = 0.3
    for epoch in range(i):
        y = forward_pass(w, X)
        gradient = backward_pass(X, y, Y)
        w = w - learning_rate * gradient
        print(f'epoch {epoch + 1}, x = {test_number}, y = {forward_pass(w, test_number):.3f}, loss = {loss(y, Y):.3f}')

train()

At first glance, you are missing your intercept term (typically called b_0, or bias) and its gradient update. Also, in the backward_pass and loss calculations you are not dividing by the number of data samples.
You can see two examples of how to implement it from scratch here:
1: Example based on Andrew Ng's explanations in the Machine Learning course on Coursera
2: Implementation by Jason Brownlee from the Machine Learning Mastery website
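For illustration, here is a minimal sketch of a corrected training loop under those two fixes, reusing the data and naming style from the question; it adds a bias term b with its own gradient and averages both the gradients and the loss over the number of samples:
import numpy as np

def forward_pass(w, b, x):
    # sigmoid(w*x + b), now including the bias term
    return 1 / (1 + np.exp(-(w * x + b)))

def train(X, Y, learning_rate=0.1, epochs=200):
    w, b = 0.0, 0.0
    n = len(X)
    for epoch in range(epochs):
        y = forward_pass(w, b, X)
        # mean gradients of the log loss w.r.t. w and b
        grad_w = np.sum((y - Y) * X) / n
        grad_b = np.sum(y - Y) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        # mean log loss
        current_loss = -np.mean(Y * np.log(y) + (1 - Y) * np.log(1 - y))
    return w, b
With the X and Y defined in the question, calling w, b = train(X, Y) should now show the loss decreasing over epochs.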

Related

Need help in debugging Shallow Neural network using numpy

I'm doing a hands-on exercise for learning and have created a model in Python using numpy that is trained on the breast cancer dataset from the sklearn library. The model runs without any error and gives me train and test accuracy of 92.48826291079813% and 90.9090909090909% respectively. However, I'm somehow not able to complete the hands-on, (probably) because my result differs from the expected one. I don't know where the problem is, because I don't know the right answer and I don't see any error either.
I would appreciate it if someone could help me with this. The code is given below.
#Import numpy as np and pandas as pd
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
**Define method initialiseNetwork() to initialise the weights with zeros of shape (num_features, 1) and the bias b to zero
parameters: num_features (number of input features)
returns: dictionary of weight vector and bias**
def initialiseNetwork(num_features):
    W = np.zeros((num_features,1))
    b = 0
    parameters = {"W": W, "b": b}
    return parameters
** Define function sigmoid for the input z.
parameters: z
returns: $1/(1+e^{(-z)})$ **
def sigmoid(z):
    a = 1/(1 + np.exp(-z))
    return a
** Define method forwardPropagation() which implements forward propagation defined as Z = (W.T dot_product X) + b, A = sigmoid(Z)
parameters: X, parameters
returns: A **
def forwardPropagation(X, parameters):
    W = parameters["W"]
    b = parameters["b"]
    Z = np.dot(W.T,X) + b
    A = sigmoid(Z)
    return A
** Define function cost() which calculates the cost given by −(sum(Y*log(A)+(1−Y)*log(1−A)))/num_samples, where * is the elementwise product
parameters: A, Y, num_samples (number of samples)
returns: cost **
def cost(A, Y, num_samples):
    cost = -1/num_samples * np.sum(Y*np.log(A) + (1-Y)*(np.log(1-A)))
    #cost = Y*np.log(A) + (1-Y)*(np.log(1-A))
    return cost
** Define method backPropagation() to get the derivatives of the weights and bias
parameters: X, Y, A, num_samples
returns: dW, db **
def backPropagration(X, Y, A, num_samples):
    dZ = A - Y
    dW = (np.dot(X,dZ.T))/num_samples  #(X dot_product dZ.T)/num_samples
    db = np.sum(dZ)/num_samples  #sum(dZ)/num_samples
    return dW, db
** Define function updateParameters() to update the current parameters with their derivatives
w = w - learning_rate * dw
b = b - learning_rate * db
parameters: parameters, dW, db, learning_rate
returns: dictionary of updated parameters **
def updateParameters(parameters, dW, db, learning_rate):
    W = parameters["W"] - (learning_rate * dW)
    b = parameters["b"] - (learning_rate * db)
    return {"W": W, "b": b}
** Define the model for forward propagation
parameters: X, Y, num_iter (number of iterations), learning_rate
returns: parameters (dictionary of updated weights and bias) **
def model(X, Y, num_iter, learning_rate):
    num_features = X.shape[0]
    num_samples = X.shape[1]
    parameters = initialiseNetwork(num_features)  #call initialiseNetwork()
    for i in range(num_iter):
        #A = forwardPropagation(X, Y, parameters)  # calculate final output A from forwardPropagation()
        A = forwardPropagation(X, parameters)
        if(i%100 == 0):
            print("cost after {} iteration: {}".format(i, cost(A, Y, num_samples)))
        dW, db = backPropagration(X, Y, A, num_samples)  # calculate derivatives from backpropagation
        parameters = updateParameters(parameters, dW, db, learning_rate)  # update parameters
    return parameters
** Run the cell below to define the function that predicts the output. It takes the updated parameters and input data as function parameters and returns the predicted output. **
def predict(X, parameters):
    W = parameters["W"]
    b = parameters["b"]
    b = b.reshape(b.shape[0],1)
    Z = np.dot(W.T,X) + b
    Y = np.array([1 if y > 0.5 else 0 for y in sigmoid(Z[0])]).reshape(1,len(Z[0]))
    return Y
** The code in the cell below loads the breast cancer dataset from sklearn.
The input variable (X_cancer) describes the dimensions of the tumor cells and the target variable (y_cancer) classifies a tumor as malignant(0) or benign(1). **
(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)
** Split the data into train and test sets using train_test_split(). Set the random state to 25. Refer to the code snippet in topic 4. **
X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer,
                                                    random_state = 25)
** Since the dimensions of the tumors are not uniform, you need to normalize the data before feeding it to the network.
The function below is used to normalize the input data. **
def normalize(data):
    col_max = np.max(data, axis = 0)
    col_min = np.min(data, axis = 0)
    return np.divide(data - col_min, col_max - col_min)
** Normalize X_train and X_test and assign them to X_train_n and X_test_n respectively. **
X_train_n = normalize(X_train)
X_test_n = normalize(X_test)
** Transpose X_train_n and X_test_n so that rows represent features and columns represent the samples.
Reshape y_train and y_test into row vectors whose length is equal to the number of samples. Use np.reshape(). **
X_trainT = X_train_n.T
#print(X_trainT.shape)
X_testT = X_test_n.T
#print(X_testT.shape)
y_trainT = y_train.reshape(1,X_trainT.shape[1])
y_testT = y_test.reshape(1,X_testT.shape[1])
** Train the network using X_trainT, y_trainT with number of iterations 4000 and learning rate 0.75. **
parameters = model(X_trainT, y_trainT, 4000, 0.75)  #call the model() function with the parameters mentioned in the cell above
** Predict the output of the train and test data using X_trainT and X_testT with the predict() method. Use the parameters returned from the trained model. **
yPredTrain = predict(X_trainT, parameters)  # pass weights and bias from the parameters dictionary and X_trainT as input to the function
yPredTest = predict(X_testT, parameters)  # pass the same parameters but X_testT as input data
** Run the cell below to print the accuracy of the model on the train and test data. **
accuracy_train = 100 - np.mean(np.abs(yPredTrain - y_trainT)) * 100
accuracy_test = 100 - np.mean(np.abs(yPredTest - y_testT)) * 100
print("train accuracy: {} %".format(accuracy_train))
print("test accuracy: {} %".format(accuracy_test))
My Output:
train accuracy: 92.48826291079813 %
test accuracy: 90.9090909090909 %
I figured out where the problem was. It was the third line in the predict function, where I was reshaping the bias, which was not necessary at all.
def predict(X, parameters):
    W = parameters["W"]
    b = parameters["b"]
    **b = b.reshape(b.shape[0],1)**
    Z = np.dot(W.T,X) + b
    Y = np.array([1 if y > 0.5 else 0 for y in sigmoid(Z[0])]).reshape(1,len(Z[0]))
    return Y
The third line in the back-propagation function also needed to be corrected to np.sum(dZ)/num_samples.
def backPropagration(X, Y, A, num_samples):
    dZ = A - Y
    dW = (np.dot(X,dZ.T))/num_samples
    **db = np.sum(dZ)/num_samples**
    return dW, db
After I corrected both functions, the model gave me train accuracy of 98.59154929577464% and test accuracy of 93.00699300699301%.
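As a side note, here is a minimal sketch of an equivalent predict() that simply reuses the forwardPropagation() helper defined above, so the bias never needs reshaping (the name predict_v2 is only for illustration):
def predict_v2(X, parameters):
    A = forwardPropagation(X, parameters)  # A has shape (1, num_samples)
    return (A > 0.5).astype(int)           # threshold the probabilities at 0.5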

Pytorch RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This code is built up as follows: my robot takes a picture, and a TensorFlow computer vision model calculates where in the picture the target object starts. This information (the x1 and x2 coordinates) is passed to a PyTorch model, which should learn to predict the correct motor activations in order to get closer to the target. After the movement is executed, the robot takes a picture again and the TF CV model should calculate whether the motor activation brought the robot closer to the desired state (x1 at 10, x2 at 31).
However, every time I run the code, PyTorch is not able to calculate the gradients.
I'm wondering if this is a data-type problem or a more general one: is it impossible to calculate the gradients if the loss is not calculated directly from the PyTorch network's output?
Any help and suggestions will be greatly appreciated.
#define policy model (model to learn a policy for my robot)
import torch
import torch.nn as nn
import torch.nn.functional as F

class policy_gradient_model(nn.Module):
    def __init__(self):
        super(policy_gradient_model, self).__init__()
        self.fc0 = nn.Linear(2, 2)
        self.fc1 = nn.Linear(2, 32)
        self.fc2 = nn.Linear(32, 64)
        self.fc3 = nn.Linear(64,32)
        self.fc4 = nn.Linear(32,32)
        self.fc5 = nn.Linear(32, 2)
    def forward(self,x):
        x = self.fc0(x)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.relu(self.fc4(x))
        x = F.relu(self.fc5(x))
        return x

policy_model = policy_gradient_model().double()
print(policy_model)
optimizer = torch.optim.AdamW(policy_model.parameters(), lr=0.005, betas=(0.9,0.999), eps=1e-08, weight_decay=0.01, amsgrad=False)

#make robot move as predicted by pytorch network (not all code included)
def move(motor_controls):
    #define curvature
    # motor_controls[0] = sigmoid(motor_controls[0])
    activation_left = 1+(motor_controls[0])*99
    activation_right = 1+(1- motor_controls[0])*99
    print("activation left:", activation_left, ". activation right:",activation_right, ". time:", motor_controls[1]*100)
    #start movement

#main
import cv2
import numpy as np
import time
from torch.autograd import Variable
print("start training")
losses=[]
losses_end_of_epoch=[]
number_of_steps_each_epoch=[]
loss_function = nn.MSELoss(reduction='mean')
#each epoch
for epoch in range(2):
    count=0
    target_reached=False
    while target_reached==False:
        print("epoch: ", epoch, ". step:", count)
        ###process and take picture
        indices = process_picture()
        ###binary_network(sliced)=indices as input for policy model
        optimizer.zero_grad()
        ###output: 1 for curvature, 1 for duration of movement
        motor_controls = policy_model(Variable(torch.from_numpy(indices))).detach().numpy()
        print("NO TANH output for motor: 1)activation left, 2)time ", motor_controls)
        motor_controls[0] = np.tanh(motor_controls[0])
        motor_controls[1] = np.tanh(motor_controls[1])
        print("TANH output for motor: 1)activation left, 2)time ", motor_controls)
        ###execute suggested action
        move(motor_controls)
        ###take and process picture2 (after movement)
        indices = (process_picture())
        ###loss=(binary_network(picture2) - desired
        print("calculate loss")
        print("idx", indices, type(torch.tensor(indices)))
        # loss = 0
        # loss = (indices[0]-10)**2+(indices[1]-31)**2
        # loss = loss/2
        print("shape of indices", indices.shape)
        array=np.zeros((1,2))
        array[0]=indices
        print(array.shape, type(array))
        array2 = torch.ones([1,2])
        loss = loss_function(torch.tensor(array).double(), torch.tensor([[10.0,31.0]]).double()).float()
        print("loss: ", loss, type(loss), loss.shape)
        # array2[0] = loss_function(torch.tensor(array).double(), torch.tensor([[10.0,31.0]]).double()).float()
        losses.append(loss)
        #start line causing the error-message (still part of main)
        ###calculate gradients
        loss.backward()
        #end line causing the error-message (still part of main)
        ###apply gradients
        optimizer.step()
#Output (so far as intended) (not all included)
#calculate loss
idx [14. 15.] <class 'torch.Tensor'>
shape of indices (2,)
(1, 2) <class 'numpy.ndarray'>
loss: tensor(136.) <class 'torch.Tensor'> torch.Size([])
#Error Message:
Traceback (most recent call last):
  File "/home/pi/Desktop/GradientPolicyLearning/PolicyModel.py", line 259, in <module>
    array2.backward()
  File "/home/pi/.local/lib/python3.7/site-packages/torch/tensor.py", line 134, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/pi/.local/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
If you call .detach() on the prediction, that will delete the gradients. Since you are first getting indices from the model and then trying to backprop the error, I would suggest
prediction = policy_model(torch.from_numpy(indices))
motor_controls = prediction.clone().detach().numpy()
This keeps the prediction attached to the computation graph, so the gradients can still be backpropagated through it.
Now you can do
loss = loss_function(prediction, torch.tensor([[10.0,31.0]]).double()).float()
Note: you might want to call .double() on the prediction if it throws a dtype error.
It is indeed impossible to calculate the gradients if the loss is not computed directly from the PyTorch network's output, because then the chain rule cannot be applied back through the network to obtain the gradients.
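A minimal sketch illustrating the point (using an arbitrary small nn.Linear model as an assumption): a loss built from the model's live output has a grad_fn, while a loss built from a detached-and-rebuilt tensor does not, which is exactly what triggers the RuntimeError above.
import torch

net = torch.nn.Linear(2, 1)
x = torch.tensor([[1.0, 2.0]])
target = torch.tensor([[0.5]])

out = net(x)                                  # still attached to the graph
print(out.grad_fn is not None)                # True -> backward() would work

rebuilt = torch.tensor(out.detach().numpy())  # leaves the graph and re-enters as a fresh leaf
loss = ((rebuilt - target) ** 2).mean()
print(loss.grad_fn is None)                   # True -> loss.backward() would raise the RuntimeError above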
A simple solution: turn on the context manager that enables gradient calculation, if it is off:
torch.set_grad_enabled(True)  # context-manager
Make sure that all your inputs into the NN, the output of the NN, and the ground truth/target values are all of type torch.Tensor and not list, numpy.array, or any other iterable.
Also, make sure that they are not converted to list or numpy.array at any point either.
In my case, I got this error because I performed a list comprehension on the tensor containing the predicted values from the NN (to get the max value in each row) and then converted the list back to a torch.tensor before calculating the loss.
This back-and-forth conversion disables the gradient calculations.
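A rough sketch of that situation (the Linear layer and shapes are made up for illustration): taking the row-wise maximum with the tensor's own max(dim=1) keeps the result on the computation graph, whereas rebuilding a tensor from a Python list does not.
import torch

preds = torch.nn.Linear(3, 4)(torch.randn(5, 3))   # some differentiable predictions

# breaks the graph: plain Python numbers, then a brand-new tensor
max_listcomp = torch.tensor([float(row.max()) for row in preds])
print(max_listcomp.grad_fn)                        # None

# keeps the graph: tensor op all the way through
max_values, _ = preds.max(dim=1)
print(max_values.grad_fn)                          # <MaxBackward... object>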
In my case, I got past this error by specifying requires_grad=True when defining my input tensors
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('dark_background')

# define rosenbrock function and gradient
a = 1
b = 5
def f(x):
    return (a - x[0]) ** 2 + b * (x[1] - x[0] ** 2) ** 2
def jac(x):
    dx1 = -2 * a + 4 * b * x[0] ** 3 - 4 * b * x[0] * x[1] + 2 * x[0]
    dx2 = 2 * b * (x[1] - x[0] ** 2)
    return np.array([dx1, dx2])

# create stochastic rosenbrock function and gradient
def f_rand(x):
    return f(x) * np.random.uniform(0.5, 1.5)
def jac_rand(x): return jac(x) * np.random.uniform(0.5, 1.5)

# use hand coded adam
x = np.array([0.1, 0.1])
x0 = x.copy()
j = jac_rand(x)
beta1=0.9
beta2=0.999
eps=1e-8
m = x * 0
v = x * 0
learning_rate = .1
for ii in range(200):
    m = (1 - beta1) * j + beta1 * m  # first moment estimate.
    v = (1 - beta2) * (j ** 2) + beta2 * v  # second moment estimate.
    mhat = m / (1 - beta1 ** (ii + 1))  # bias correction.
    vhat = v / (1 - beta2 ** (ii + 1))
    x = x - learning_rate * mhat / (np.sqrt(vhat) + eps)
    x -= learning_rate * v
    j = jac_rand(x)
print('hand code finds optimal to be ', x, f(x))

# attempt to use pytorch
import torch
x_tensor = torch.tensor(x0, requires_grad=True)
optimizer = torch.optim.Adam([x_tensor], lr=learning_rate)
def closure():
    optimizer.zero_grad()
    loss = f_rand(x_tensor)
    loss.backward()
    return loss
for ii in range(200):
    optimizer.step(closure)
print('My PyTorch attempt found ', x_tensor, f(x_tensor))
Following worked for me:
loss.requires_grad = True
loss.backward()

How to implement a negative binomial loss function in python to use in light GBM?

I have a machine learning problem that I believe the negative binomial loss function would fit well, but the LightGBM package doesn't provide it as a standard objective, so I'm trying to implement it myself. I managed to get to the loss function, but I don't know how to get the gradient and Hessian. Does anyone know how I can do this?
import math
import numpy as np

def custom_asymmetric_valid(y_pred,y_true):
    y_true = y_true.get_label()
    p = 0.5
    n = y_pred
    loss = math.gamma(n) + math.gamma(y_true + 1) - math.gamma(n + y_true) - n * math.log(p) - y_true * math.log(1 - p)
    return "custom_asymmetric_eval", np.mean(loss), False
Now, how do I get the gradient and Hessian?
def custom_asymmetric_train(y_pred,y_true):
    residual = (y_true.get_label() - y_pred).astype("float")
    grad = ?
    hess = ?
    return grad, hess
Could anyone help?
This is possible with scipy automatically:
from scipy.misc import derivative
from scipy.special import gamma

def custom_asymmetric_train(y_pred, dtrain):
    y_true = dtrain.label
    p = 0.5
    def loss(x,t):
        loss = gamma(x) + gamma(t+1) - gamma(x+t) - x*np.log(p) - t*np.log(1-p)
        return loss
    partial_d = lambda x: loss(x, y_true)
    grad = derivative(partial_d, y_pred, n=1, dx=1e-6)
    hess = derivative(partial_d, y_pred, n=2, dx=1e-6)
    return grad, hess
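For completeness, a hedged sketch of how such a custom objective and metric are typically wired into LightGBM training. The data here is made up purely to show the call; in older LightGBM releases the objective goes through the fobj argument of lgb.train, while recent releases expect the callable via params["objective"] instead.
import lightgbm as lgb
import numpy as np

# toy count data purely for illustration
X = np.random.rand(500, 5)
y = np.random.poisson(lam=3.0, size=500).astype(float)
dtrain = lgb.Dataset(X, label=y)

# note: the custom loss above assumes positive predictions (gamma is undefined at non-positive integers)
booster = lgb.train(
    {"learning_rate": 0.05, "verbosity": -1},
    dtrain,
    num_boost_round=100,
    fobj=custom_asymmetric_train,    # returns (grad, hess) per sample
    feval=custom_asymmetric_valid,   # returns (name, value, is_higher_better)
    valid_sets=[dtrain],
)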

Train a logistic regression with regularization model from scratch

I am trying to implement a logistic regression model with regularisation. I got stuck computing the gradient: when I run my gradient descent algorithm it actually shows the cost function increasing rather than decreasing.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def Probability(theta, X):
    return sigmoid(np.dot(X,theta))

def cost_function_regression(theta, x, y, Lambda):
    # Computes the cost function for all the training samples
    m = x.shape[0]
    total_cost = (-(1 / m) * np.sum(
        np.dot(y.T, np.log(Probability( theta,x))) + np.dot((1 - y).T, np.log(
            1 - Probability(theta,x))))) + (Lambda/ 2)* np.sum(np.dot(theta, theta.T))
    return total_cost

def Gradient_regression( theta, X,y, Lambda ):
    m=X.shape[0]
    grad=(((1/m)* np.dot(X.T, Probability(theta,X)-y)) + np.sum((Lambda/m )* theta))
    return(grad)
We will start by establishing the theory, follow with a working example, and end with some comments.
Problem statement
The steps in fitting/training a logistic regression model (as with any supervised ML model) using the gradient descent method are as below:
Identify a hypothesis function [h(X)] with parameters [w,b]
Identify a loss function [J(w,b)]
Forward propagation: make predictions using the hypothesis function [y_hat = h(X)]
Calculate the error between the actual label [y] and the predicted label [y_hat] using the loss function.
Backward propagation: adjust the parameters of the hypothesis function based on the error (by calculating the gradients), using the update rule
Go to step 3 if the gradients are high, else end
Calculating gradients
Hypothesis function for logistic regression:
$h(X) = \sigma(w \cdot X + b) = \frac{1}{1 + e^{-(w \cdot X + b)}}$
where X is a vector and $X^i$ is the ith element of the vector.
The commonly used loss function for logistic regression is log loss. The log loss with L2 regularization is:
$J(w, b) = -\frac{1}{n}\sum_{j=1}^{n}\big[y_j \log(\hat{y}_j) + (1 - y_j)\log(1 - \hat{y}_j)\big] + \lambda \sum_i w_i^2$
Let's calculate the gradients:
$\frac{\partial J}{\partial w_i} = -\frac{1}{n}\sum_{j=1}^{n}(y_j - \hat{y}_j)X^i_j + 2\lambda w_i$
Similarly,
$\frac{\partial J}{\partial b} = -\frac{1}{n}\sum_{j=1}^{n}(y_j - \hat{y}_j) + 2\lambda b$
Now that we know the gradients, let's code the gradient descent algorithm to fit the parameters of our logistic regression model (with $\lambda = 1$ in the code below).
Toy Example
import math
import numpy as np
from sklearn import datasets

# load data
iris = datasets.load_iris()
# Lets take only two classes
y = iris.target
X = iris.data[y != 2]
y = y[y != 2]
# Normalize data to 0 mean and 1 std
X[:, 0] = (X[:, 0] - np.mean(X[:, 0]))/np.std(X[:, 0])
X[:, 1] = (X[:, 1] - np.mean(X[:, 1]))/np.std(X[:, 1])
X[:, 2] = (X[:, 2] - np.mean(X[:, 2]))/np.std(X[:, 2])
X[:, 3] = (X[:, 3] - np.mean(X[:, 3]))/np.std(X[:, 3])

def sigmoid(x):
    return 1 / (1+math.exp(-x))

# initialize weights
w0, w1, w2, w3, b = 0.01,0.01,0.01,0.01,0.01
n = len(X)
# Learning rate
alpha = 0.01
# The gradient descent loop
while True:
    y_hat = [sigmoid(w0*x[0] + w1*x[1] + w2*x[2] + w3*x[3] + b) for x in X]
    delta_w0 = -np.sum([(y[j] - y_hat[j])*X[j,0] for j in range(n)])/n + 2*w0
    delta_w1 = -np.sum([(y[j] - y_hat[j])*X[j,1] for j in range(n)])/n + 2*w1
    delta_w2 = -np.sum([(y[j] - y_hat[j])*X[j,2] for j in range(n)])/n + 2*w2
    delta_w3 = -np.sum([(y[j] - y_hat[j])*X[j,3] for j in range(n)])/n + 2*w3
    delta_b = -np.sum([(y[j] - y_hat[j]) for j in range(n)])/n + 2*b
    w0 = w0 - alpha*delta_w0
    w1 = w1 - alpha*delta_w1
    w2 = w2 - alpha*delta_w2
    w3 = w3 - alpha*delta_w3
    b = b - alpha*delta_b
    if np.sum(np.abs([delta_w0, delta_w1, delta_w2, delta_w3, delta_b])) < 1e-5:
        break

# Make predictions
pred = [1 if i > 0.5 else 0 for i in y_hat]
# Find no. of correct predictions
correct = np.sum([1 if pred[i] == y[i] else 0 for i in range(n)])
print (correct)
Comments
The above toy example is coded in the most inefficient way. The intention was to show the steps clearly rather than to be efficient. That said, we will have to vectorize the operations (using np arrays and matrix operations) for efficiency; a rough sketch of this is shown below.
Data normalization is important.
Models are trained on train data and the performance is measured on test/validation data.
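As a rough illustration of that vectorization comment, here is a minimal sketch of the same update loop written with numpy matrix operations; X is the normalized (n, 4) feature matrix and y the label vector from the toy example above, and lam is the regularization strength, set to 1 to match the 2*w terms in the loop:
import numpy as np

def fit_logreg_l2(X, y, alpha=0.01, lam=1.0, tol=1e-5, max_iter=100000):
    n, d = X.shape
    w = np.full(d, 0.01)
    b = 0.01
    for _ in range(max_iter):
        y_hat = 1 / (1 + np.exp(-(X @ w + b)))             # vectorized sigmoid
        delta_w = -(X.T @ (y - y_hat)) / n + 2 * lam * w   # gradient w.r.t. weights
        delta_b = -np.sum(y - y_hat) / n + 2 * lam * b     # gradient w.r.t. bias
        w -= alpha * delta_w
        b -= alpha * delta_b
        if np.sum(np.abs(np.append(delta_w, delta_b))) < tol:
            break
    return w, b
For the toy data above it would be called as w, b = fit_logreg_l2(X, y).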

Training model doesn't work when implementing multilinear regression from scratch in python

I managed to successfully implement multilinear regression using only numpy for the Iris dataset. I wanted to do the same for the Boston housing dataset, but my model won't learn and I have no idea why.
import numpy as np
import pandas as pd

# read data and split into test and training sets
data = pd.read_csv('train.csv')
data = (data - data.mean()) / data.std()  # normalize data
split_data = np.random.rand(len(data)) < 0.8
train_data = data[split_data].round(5)
test_data = data[~split_data]

# create matrices
input_features_train = train_data.drop(['ID', 'medv'], 1).values
output_feature_train = train_data.medv.values.reshape(-1, 1)
ones = np.ones([input_features_train.shape[0], 1])
input_features_train = np.concatenate((ones, input_features_train), 1)
weight = np.zeros([1, 14])

def computeCost(X, y, theta):
    summed = np.power(((X @ theta.T) - y), 2)
    return np.sum(summed) / (2 * len(X))

def gradientDescent(X, y, theta, iters, alpha):
    costs = np.zeros(iters)
    for i in range(iters):
        theta = theta - (alpha / len(X)) * np.sum(X * (X @ theta.T - y), 0)
        costs[i] = computeCost(X, y, theta)
    return theta, costs

learning_rate = 0.01
iterations = 100000
weights, cost = gradientDescent(input_features_train, output_feature_train, weight, iterations, learning_rate)
print("Weights: ", weights)
finalCost = computeCost(input_features_train, output_feature_train, weights)

# test
input_features_test = test_data.drop(['ID', 'medv'], 1).values
output_feature_test = test_data.medv.values.reshape(-1, 1)
ones = np.ones([input_features_test.shape[0], 1])
input_features_test = np.concatenate((ones, input_features_test), 1)

def test_data(input_features, output_feature, weights):
    predictions = np.round(np.dot(input_features, weights.T))
    for i in range(len(output_feature)):
        predicted = predictions[i]
        success = predictions[i] == output_feature[i]
        print('For features: ', input_features[i], ' housing price should be ', output_feature[i])
        print("Predicted: %f" % predicted)
        print("Is success? ", success)
        print()

test_data(input_features_test, output_feature_test, weights)
predictions = np.round(np.dot(input_features_test, weights.T))
accuracy = (sum(predictions == output_feature_test) / float(len(output_feature_test)) * 100)[0]
print("Accuracy of the model is ", accuracy, "% after ", iterations, "iterations")
Example output goes as follows:
Weights: [[ 0.01465871 -0.11583742 0.17729105 0.01249782 0.09822299 -0.31249182
0.25208063 -0.00937766 -0.48751822 0.46772537 -0.27637035 -0.1590125
0.12926108 -0.48910136]]
For features: [ 1. -0.44852959 -0.47141352 0.09095532 -0.25240023 0.13793157
0.46506236 0.03105118 -0.62153314 -0.98758424 -0.79769195 1.18594974
0.37563165 -0.40259248] housing price should be [-0.04019949]
Predicted: 0.000000
Is success? [False]
I tried even 10000000 iterations and it still fails all tests and has 0% accuracy. On the Iris dataset I managed to get 100% with this model, so I don't understand why it won't work here.
I suspect it might be something with data normalization, because without it I get a RuntimeWarning: overflow encountered in power error on the line summed = np.power(((X @ theta.T) - y), 2), which I also don't understand.
Could you please point me in the right direction? Thanks!
I really suggest you use scikit-learn for this. You can use SGDRegressor, or CatBoost's regressor, which have the built-in support required for approaches like these.
The main motive behind this suggestion is that implementing gradient descent manually may lead to logical errors which go undetected.
Try to solve it with scikit-learn. That might help.
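To illustrate the suggestion, here is a minimal sketch using scikit-learn's SGDRegressor; the 'train.csv' file with ID and medv columns is an assumption carried over from the question's code:
import pandas as pd
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('train.csv')                 # assumed to match the question's file
X = data.drop(['ID', 'medv'], axis=1).values
y = data['medv'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

scaler = StandardScaler().fit(X_train)          # feature scaling matters for SGD
model = SGDRegressor(max_iter=10000, tol=1e-6)
model.fit(scaler.transform(X_train), y_train)

print("R^2 on test set:", model.score(scaler.transform(X_test), y_test))
Note that for a continuous target like medv, an exact-match "accuracy" will almost always be 0%; a regression metric such as R² or mean squared error is the appropriate way to evaluate the model.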
