I'm training a CNN to solve a regression problem in PyTorch, where the output is a tensor of 20 values. I planned to use RMSE as the loss, so I used PyTorch's nn.MSELoss() and took the square root with torch.sqrt(), but the results confused me. I'll try my best to explain why. For a batch size bs, the output tensor has dimensions [bs, 20]. I tried to implement an RMSE function of my own:
def loss_function(predicted_x, target):
    # Taking the mean of all the squares by dividing by the number of outputs, i.e. 20 in my case
    loss = torch.sum(torch.square(predicted_x - target), axis=1) / predicted_x.size()[1]
    loss = torch.sqrt(loss)
    loss = torch.sum(loss) / predicted_x.size()[0]  # averaging out by batch size
    return loss
But the output of my loss_function() differed from what I get with PyTorch's nn.MSELoss(). I'm not sure whether my implementation is wrong or whether I'm using nn.MSELoss() the wrong way.
The MSE loss is the mean of the squares of the errors, with no square root. Your function takes a square root, so its output cannot be compared directly to that of PyTorch's nn.MSELoss(): the two compute different values.
However, you could just use nn.MSELoss() to build your own RMSE loss:
loss_fn = nn.MSELoss()
RMSE_loss = torch.sqrt(loss_fn(prediction, target))
RMSE_loss.backward()
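One more thing worth noting about the comparison: the function in the question takes the square root per sample and then averages over the batch, while the snippet above takes a single square root of the MSE over every element. These generally give different numbers. Here is a minimal sketch, assuming random tensors of shape [4, 20] (the shapes, seed, and variable names are made up for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
prediction = torch.randn(4, 20)  # made-up batch of 4 samples, 20 outputs each
target = torch.randn(4, 20)

# Variant A (as in the question): RMSE per sample over the 20 outputs,
# then averaged over the batch.
per_sample_rmse = torch.sqrt(torch.mean((prediction - target) ** 2, dim=1))
mean_of_rmse = per_sample_rmse.mean()

# Variant B (as in this answer): one square root of the MSE over all elements.
rmse_of_batch = torch.sqrt(nn.MSELoss()(prediction, target))

print(mean_of_rmse.item(), rmse_of_batch.item())
```

By Jensen's inequality, variant A can never exceed variant B. Either can serve as a loss, but they should not be expected to match.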
Hope that helps.
To replicate PyTorch's default MSE (mean squared error) loss, you need to change your loss_function to the following:
def loss_function(predicted_x, target):
    loss = torch.sum(torch.square(predicted_x - target), axis=1) / predicted_x.size()[1]
    loss = torch.sum(loss) / loss.shape[0]
    return loss
Here is why the above works: MSE loss means mean squared error loss, so you don't need the square root (torch.sqrt) in your code. By default (reduction='mean'), the PyTorch loss averages over all examples in the batch. Hence the second line of the function body, which averages the per-sample values over the batch.
To implement an RMSELoss and integrate it into your training, you can do it like this:
class RMSELoss(torch.nn.Module):
    def __init__(self):
        super(RMSELoss, self).__init__()

    def forward(self, x, y):
        criterion = nn.MSELoss()
        loss = torch.sqrt(criterion(x, y))
        return loss
And you can call this class similar to any loss function in PyTorch.
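One caveat that may matter in practice: the derivative of the square root is unbounded at zero, so if the MSE ever becomes exactly 0 the backward pass can produce NaN gradients. A common workaround is to add a small epsilon under the root. The sketch below is a hypothetical variant of the class above; the name StableRMSELoss and the default eps=1e-8 are my own choices, not part of PyTorch:

```python
import torch
import torch.nn as nn

class StableRMSELoss(nn.Module):
    """RMSE loss with a small epsilon under the square root (hypothetical variant)."""

    def __init__(self, eps=1e-8):
        super().__init__()
        self.mse = nn.MSELoss()
        self.eps = eps  # assumed value; tune for your data scale

    def forward(self, prediction, target):
        # eps keeps the gradient of sqrt finite when the MSE is exactly zero
        return torch.sqrt(self.mse(prediction, target) + self.eps)

# Worst case: prediction equals target, so the MSE is exactly zero.
criterion = StableRMSELoss()
prediction = torch.zeros(4, 20, requires_grad=True)
target = torch.zeros(4, 20)
loss = criterion(prediction, target)
loss.backward()
print(loss.item(), torch.isfinite(prediction.grad).all().item())
```

The trade-off is a tiny bias in the reported loss (it bottoms out at sqrt(eps) rather than 0), which is usually negligible.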
Related
Is there a way to have two loss functions in Keras in which the second loss function takes the output from the first loss function?
I am working on a neural network with Keras and I want to add another custom function to the loss term inside model.compile() to regularize and penalize the model. My current call has this form:
model.compile(loss_1='mean_squared_error', optimizer=Adam(lr=learning_rate), metrics=['mae'])
I would like to add another loss function as a sum of the predicted values from the Loss_1 outputs so that I can tell the Neural Network to minimize the sum of the predicted values from the Loss_1 model. How can I do that (loss_2)?
Something like:
model.compile(loss_1='mean_squared_error', loss_2= np.sum(****PREDICTED_OUTPUT_FROM_LOSS_FUNCTION_1****), optimizer=Adam(lr=learning_rate), metrics=['mae'])
How can this be implemented?
You should define a custom loss function:
def custom_loss_function(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    absolute_difference = tf.abs(y_true - y_pred)
    loss = tf.reduce_mean(squared_difference, axis=-1) + tf.reduce_mean(absolute_difference, axis=-1)
    return loss

model.compile(optimizer='adam', loss=custom_loss_function)
I believe that would solve your problem
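The loss above combines MSE and MAE. If the goal is specifically to penalize the sum of the predicted values, as the question describes, the same pattern could be adapted roughly as follows. This is only a sketch under my reading of the question; the function name and PENALTY_WEIGHT are hypothetical, and the weight would need tuning:

```python
import tensorflow as tf

PENALTY_WEIGHT = 0.1  # hypothetical coefficient; not from the question

def mse_plus_prediction_sum(y_true, y_pred):
    # First term: ordinary mean squared error per sample
    mse = tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
    # Second term: sum of the predicted values per sample, used as a penalty
    prediction_sum = tf.reduce_sum(y_pred, axis=-1)
    return mse + PENALTY_WEIGHT * prediction_sum

# Tiny check with made-up values: perfect predictions, so only the penalty remains.
y_true = tf.constant([[1.0, 2.0]])
y_pred = tf.constant([[1.0, 2.0]])
example_loss = mse_plus_prediction_sum(y_true, y_pred)
print(example_loss.numpy())
```

It would then be passed as loss=mse_plus_prediction_sum in model.compile. Note that a raw sum is unbounded below if predictions can go negative, so something like tf.abs(y_pred) inside the sum may be the safer choice.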
I am trying to implement logistic regression from scratch using binary cross entropy loss function. The loss function implemented below is created based on the following formula.
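Judging from the implementation that follows, the formula in question is presumably the standard binary cross-entropy averaged over N samples (the symbols N, y_i, and \hat{y}_i are my notation, chosen to match the code's no_of_samples, y, and yhat):

```latex
\mathrm{BCE}(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]
```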
def binary_crossentropy(y, yhat):
    no_of_samples = len(y)
    numerator_1 = y * np.log(yhat)
    numerator_2 = (1 - y) * np.log(1 - yhat)
    loss = -(np.sum(numerator_1 + numerator_2) / no_of_samples)
    return loss
And below is how I implement the training using gradient descent.
L = 0.01
epochs = 40000
no_of_samples = len(x)

# Keeping track of the loss
loss = []
for _ in range(epochs):
    yhat = sigmoid(x * weight + bias)
    # Finding out the loss of each iteration
    loss.append(binary_crossentropy(y, yhat))
    d_weight = np.sum(x * (yhat - y)) / no_of_samples
    d_bias = np.sum(yhat - y) / no_of_samples
    weight = weight - L * d_weight
    bias = bias - L * d_bias
The training above goes fine, in that the weight and bias are adjusted properly. But my question is: why does the loss graph fluctuate so much?
I previously implemented linear regression, and there the loss decreased steadily.
Is there anything incorrect in my logistic regression implementation? If it is already correct, why does the loss fluctuate that way?
You need to tune the hyperparameters to see whether the problem goes away. One thing you can do is change the optimizer. For instance, you can use fmin_tnc (from scipy.optimize) instead of plain gradient descent.
Besides, you can tune the number of epochs, the learning rate L, and, if you use sklearn's LogisticRegression, the solver (‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’).
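I can't tell from the post what actually caused the fluctuation. Two common suspects with hand-written logistic regression are log(0) when yhat saturates, and a step size that is too large for unscaled features. The sketch below guards against both by clipping yhat and standardizing x. The synthetic data, the seed, and the eps value are all assumptions for illustration, not the asker's setup:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y, yhat, eps=1e-12):
    # Clip predictions away from 0 and 1 so np.log never receives 0
    yhat = np.clip(yhat, eps, 1.0 - eps)
    return -np.mean(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))

# Made-up 1-D data: noisy threshold on x
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x + 0.5 * rng.normal(size=200) > 0).astype(float)

# Standardize the feature so the learning rate is not tied to its scale
x = (x - x.mean()) / x.std()

weight, bias, lr = 0.0, 0.0, 0.01
losses = []
for _ in range(2000):
    yhat = sigmoid(x * weight + bias)
    losses.append(binary_crossentropy(y, yhat))
    weight -= lr * np.mean(x * (yhat - y))
    bias -= lr * np.mean(yhat - y)

print(losses[0], losses[-1])
```

With full-batch gradient descent and a small enough step, the loss on this convex problem should go down at every iteration. If your curve still oscillates after scaling the inputs, lowering L is the next thing I would try.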
I wrote a squared loss function for categorisation of one hot encoded data
def squared_categorical_loss(y_true, y_pred):
    return K.mean(K.square(1.0 - K.sum(y_true * y_pred, axis=(1))))
which works when given numpy array examples, such as
y_true = np.asarray([[1,0,0],[0,1,0]])
y_pred = np.asarray([[0.5,0.2,0.3],[0.4,0.6,0]])
squared_categorical_loss(y_true, y_pred)
The example above returns a tensor with the value 0.205, which is the mean of (1-0.5)^2 and (1-0.6)^2. That is the desired result, and it should be an optimisable loss function that generally correlates with accuracy. But when I apply it to a TensorFlow model
model.compile(optimizer='adam',
              loss=squared_categorical_loss,
              metrics=['accuracy'])
the loss decreases to extremely small values while the training accuracy stays below 50%. That shouldn't be possible, since a loss below 0.125 couldn't mathematically be achieved without the accuracy being above 50%. So what is wrong with my implementation?
Thanks!
It will work only if y_pred is normalized (each row sums to 1).
I think you forgot to apply softmax in the last layer of your model.
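To see why normalization matters, here is a small NumPy sketch of the same loss applied to a made-up unnormalized output. The function names and the numbers are mine, chosen only to illustrate the point:

```python
import numpy as np

def squared_categorical_loss_np(y_true, y_pred):
    # NumPy version of the loss from the question
    return np.mean(np.square(1.0 - np.sum(y_true * y_pred, axis=1)))

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

y_true = np.array([[1.0, 0.0, 0.0]])       # true class is 0
raw_output = np.array([[1.0, 5.0, 0.0]])   # made-up unnormalized model output

loss_raw = squared_categorical_loss_np(y_true, raw_output)
predicted_class = int(np.argmax(raw_output, axis=1)[0])
loss_softmax = squared_categorical_loss_np(y_true, softmax(raw_output))

print(loss_raw, predicted_class, loss_softmax)
```

Without normalization the loss is exactly 0 even though the arg-max picks class 1, which is wrong: the model only has to push the true-class output to 1 and is free to make the other outputs larger. After softmax the same output gets a large loss, so loss and accuracy move together again.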
I am building an image segmentation model using Keras and I want to train my model on multiple loss functions. I have seen this link, but I am looking for a simpler, more straightforward solution, as my loss functions are quite complex. Can someone tell me how to build a model with a single output and multiple losses in Keras?
You can use multiple losses with one output using weighted loss, which is a sum of your losses multiplied by weight. Create your custom loss which will return a sum of other losses with coefficients and pass it to model.compile. There is an example here.
This is just an example from here. You could play around with it.
def custom_losses(y_true, y_pred):
    alpha = 0.6
    squared_difference = tf.square(y_true - y_pred)
    Huber = tf.keras.losses.huber(y_true, y_pred)
    return tf.reduce_mean(squared_difference, axis=-1) + (alpha * Huber)

model.compile(optimizer='adam', loss=custom_losses, metrics=['MeanSquaredError'])
I started using PyTorch and I'm currently working on a project where I use a simple feed-forward neural network for linear regression. The problem is that I didn't find anything in PyTorch that gives me the accuracy of a linear regression model, as in Keras or scikit-learn. In Keras it would be as simple as setting metrics=["accuracy"] in the compile function. I searched the docs and the official PyTorch website but didn't find anything; it seems this API doesn't exist in PyTorch. I know that I can watch the loss during training, or compute the test loss and see whether it decreased, but I want the Keras-style output where I get both a loss value and an accuracy value, which looks clearer to me. I also tried to implement an accuracy function using r2_score from sklearn, but it gave me weird values:
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train(model, optimizer, loss_fn):
    def train_step(x, y):
        model.train()
        optimizer.zero_grad()
        out = model(x)
        loss = loss_fn(out, y)
        loss.backward()
        optimizer.step()
        return loss.item()
    return train_step

def fit(epochs=100):
    train_func = train(model, optimizer, criterion)
    count, total = 0, 0
    loss_list, accuracy_list, iters = [], [], []
    for e in range(epochs):
        for X, y in train_loader:
            loss = train_func(X, y)
            count += 1
            total += len(y)
            if count % 50 == 0:
                print("loss= ", loss)
                loss_list.append(loss)
                iters.append(total)
            if count % 100 == 0:
                # I'm not sure if we can do this in PyTorch, i.e. evaluate the model
                # while training. It would be great if you could tell me whether this is OK.
                model.eval()
                out = model(X)
                out = out.detach().numpy()
                y = y.detach().numpy()
                accuracy = r2_score(y, out)  # r2_score is the scikit-learn R² function
                # Here I get weird values that don't improve over time,
                # even though the loss decreases over time.
                print("accuracy = ", accuracy)
                accuracy_list.append(accuracy)
    return iters, loss_list, accuracy_list
I know how to implement an accuracy function for a classification problem, because it uses discrete values. That is clear to me: I only have to count how many predictions the model got right and compute the accuracy from that. But in this case I have continuous values, which is why I couldn't implement the function myself, and it surprised me that PyTorch doesn't have a built-in function for this. Could someone tell me how to implement this, or where to find an implementation of it?
Another question is where to run the evaluation and where to put the model in evaluation mode by calling eval(). Should I evaluate during training as I did in my code, or should I train first and test afterwards? And if I test during training, should I call eval() as I did there, or will that affect training when the loop goes back to training mode? One more thing I didn't find in PyTorch is cross-validation. How should I implement it in PyTorch if there is no API for it like in Keras?
Accuracy does not exist in regression problems.
A similar measure of "accuracy" for a regression problem might be the R-squared score.
If you are using PyTorch to train your neural networks, have a look at the torchmetrics package. You may find what you need there.
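If you'd rather not add a dependency, R² is short enough to write by hand. The sketch below also touches on the eval() part of the question: as far as I know, switching to eval mode in the middle of training is fine as long as the model is put back into train mode afterwards (the train_step in the question already calls model.train() on every step). The helper names r2_score_torch and evaluate_r2 are hypothetical, and the formula assumes a single regression target:

```python
import torch
import torch.nn as nn

def r2_score_torch(y_true, y_pred):
    # R² = 1 - SS_res / SS_tot (single-target case)
    ss_res = torch.sum((y_true - y_pred) ** 2)
    ss_tot = torch.sum((y_true - y_true.mean()) ** 2)
    return (1.0 - ss_res / ss_tot).item()

def evaluate_r2(model, loader):
    was_training = model.training
    model.eval()                      # disable dropout / batch-norm updates
    preds, targets = [], []
    with torch.no_grad():             # no gradients needed for evaluation
        for X, y in loader:
            preds.append(model(X))
            targets.append(y)
    model.train(was_training)         # restore the previous mode
    # Score over the whole loader, not over a single mini-batch
    return r2_score_torch(torch.cat(targets), torch.cat(preds))

# Made-up check: a linear model that fits y = 2x exactly
model = nn.Linear(1, 1)
with torch.no_grad():
    model.weight.fill_(2.0)
    model.bias.fill_(0.0)
X = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X
loader = [(X[:2], y[:2]), (X[2:], y[2:])]  # any iterable of (X, y) batches works
model.train()
score = evaluate_r2(model, loader)
print(score)
```

Scoring over a whole validation loader, rather than the last training batch, may also explain part of the "weird values": R² on one small batch is noisy and can even be strongly negative.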
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
Look here for more info: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html