Training a model with a single output on multiple losses in Keras - python

I am building an image segmentation model using Keras and I want to train my model on multiple loss functions. I have seen this link, but I am looking for a simpler and more straightforward solution for this situation, as my loss functions are quite complex. Can someone tell me how to build a model with a single output and multiple losses in Keras?

You can use multiple losses with one output via a weighted loss, which is a sum of your losses, each multiplied by a weight. Create a custom loss that returns the weighted sum of the other losses and pass it to model.compile. There is an example here.

This is just an example from here. You could play around with it.

import tensorflow as tf

def custom_losses(y_true, y_pred):
    # weighted sum of mean squared error and Huber loss
    alpha = 0.6
    squared_difference = tf.square(y_true - y_pred)
    huber = tf.keras.losses.huber(y_true, y_pred)
    return tf.reduce_mean(squared_difference, axis=-1) + alpha * huber

model.compile(optimizer='adam', loss=custom_losses, metrics=['MeanSquaredError'])

Related

How to add multiple losses into GradientTape

I am testing tf.GradientTape. I wrote a model with several output layers, each with its own loss, and I wanted to integrate GradientTape. My question is: are there specific techniques for passing the several losses to the gradient as targets?
I know one option is to take the mean of the losses. Is that always necessary? Can't I just pass in a list of losses so that GradientTape knows which losses belong to which output layer?
From the TensorFlow documentation: unless you set persistent=True, a GradientTape can only be used to compute one set of gradients.
To calculate multiple losses, you need multiple tapes. Something like:

with tf.GradientTape() as t1:
    pred = model(x)  # the forward pass must run inside the tape so it is recorded
    loss1_result = loss1(true, pred)
grads1 = t1.gradient(loss1_result, var_list1)

with tf.GradientTape() as t2:
    pred = model(x)
    loss2_result = loss2(true, pred)
grads2 = t2.gradient(loss2_result, var_list2)

Then apply the gradients:

opt1.apply_gradients(zip(grads1, var_list1))
opt2.apply_gradients(zip(grads2, var_list2))
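If the two losses share the same forward pass, a single tape can also work: as the documentation quoted above notes, persistent=True lets one tape compute more than one set of gradients, and tape.gradient also accepts a list of target tensors, in which case it returns the gradients of their sum. A minimal sketch under the same assumptions as above (model, x, true, loss1, loss2, var_list1, and var_list2 are placeholders):

with tf.GradientTape(persistent=True) as tape:
    pred = model(x)                      # forward pass recorded once
    loss1_result = loss1(true, pred)
    loss2_result = loss2(true, pred)
grads1 = tape.gradient(loss1_result, var_list1)
grads2 = tape.gradient(loss2_result, var_list2)
del tape  # release the resources held by the persistent tape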

Handling and Combining two loss functions in Keras TF

Is there a way to have two loss functions in Keras, where the second loss function takes the output from the first loss function?
I am working on a neural network with Keras, and I want to add another custom function to the loss term inside model.compile() to regularize and somehow penalize it, which is of the form:
model.compile(loss_1='mean_squared_error', optimizer=Adam(lr=learning_rate), metrics=['mae'])
I would like to add another loss function as a sum of the predicted values from the loss_1 outputs, so that I can tell the neural network to minimize the sum of the predicted values from the loss_1 model. How can I do that (loss_2)?
Something like:
model.compile(loss_1='mean_squared_error', loss_2=np.sum(****PREDICTED_OUTPUT_FROM_LOSS_FUNCTION_1****), optimizer=Adam(lr=learning_rate), metrics=['mae'])
How can this be implemented?
You should define a custom loss function:

import tensorflow as tf

def custom_loss_function(y_true, y_pred):
    # combine the MSE and MAE terms into a single scalar loss
    squared_difference = tf.square(y_true - y_pred)
    absolute_difference = tf.abs(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1) + tf.reduce_mean(absolute_difference, axis=-1)

model.compile(optimizer='adam', loss=custom_loss_function)
I believe that would solve your problem.
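If the second term should specifically be the sum of the predicted values, as the question asks, one way is to fold it into the same custom loss, since Keras passes y_pred to it. A sketch, assuming model is the model from the question; beta is a hypothetical weighting coefficient you would have to tune:

import tensorflow as tf

beta = 0.01  # hypothetical weight for the prediction penalty

def mse_plus_prediction_sum(y_true, y_pred):
    mse = tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
    prediction_penalty = tf.reduce_sum(y_pred, axis=-1)  # the "loss_2" term
    return mse + beta * prediction_penalty

model.compile(optimizer='adam', loss=mse_plus_prediction_sum, metrics=['mae'])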

RMSE loss for a multi-output regression problem in PyTorch

I'm training a CNN architecture to solve a regression problem using PyTorch, where my output is a tensor of 20 values. I planned to use RMSE as the loss function for the model; I tried PyTorch's nn.MSELoss() and took its square root using torch.sqrt(), but got confused after obtaining the results. I'll try my best to explain why. It's obvious that for a batch size bs my output tensor's dimensions would be [bs, 20]. I tried to implement an RMSE function of my own:
def loss_function(predicted_x, target):
    # mean of the squares over the outputs, i.e. dividing by 20 in my case
    loss = torch.sum(torch.square(predicted_x - target), axis=1) / predicted_x.size()[1]
    loss = torch.sqrt(loss)
    loss = torch.sum(loss) / predicted_x.size()[0]  # averaging over the batch size
    return loss
But the output of my loss_function() differed from PyTorch's nn.MSELoss(). I'm not sure whether my implementation is wrong or whether I'm using nn.MSELoss() in the wrong way.
The MSE loss is the mean of the squares of the errors. You're taking the square root after computing the MSE, so there is no way to compare your loss function's output to that of PyTorch's nn.MSELoss(): they compute different values.
However, you could just use nn.MSELoss() to create your own RMSE loss function:
loss_fn = nn.MSELoss()
RMSE_loss = torch.sqrt(loss_fn(prediction, target))
RMSE_loss.backward()
Hope that helps.
To replicate PyTorch's default MSE (mean squared error) loss function, you need to change your loss_function method to the following:

def loss_function(predicted_x, target):
    loss = torch.sum(torch.square(predicted_x - target), axis=1) / predicted_x.size()[1]
    loss = torch.sum(loss) / loss.shape[0]  # average over the batch
    return loss

Here is why the above method works: MSE loss means mean squared error loss, so you do not need to take the square root (torch.sqrt) in your code. By default, loss functions in PyTorch average over all examples in the batch when calculating the loss, hence the second line in the method.
To implement RMSELoss and integrate it into your training, you can do it like this:

import torch
import torch.nn as nn

class RMSELoss(torch.nn.Module):
    def __init__(self):
        super(RMSELoss, self).__init__()
        self.criterion = nn.MSELoss()

    def forward(self, x, y):
        # RMSE is simply the square root of the MSE
        return torch.sqrt(self.criterion(x, y))
And you can call this class like any other loss function in PyTorch.
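A short usage sketch, assuming prediction and target are tensors from your training loop:

criterion = RMSELoss()
loss = criterion(prediction, target)
loss.backward()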

Deal with imbalanced dataset in text classification with Keras and Theano

For a dataset of ~20,000 texts, the true and false samples are ~5,000 against ~15,000. A two-channel textCNN built with Keras and Theano is used to do the classification, with F1 score as the evaluation metric. The F1 score is not bad, but the confusion matrix shows that the accuracy on the true samples is relatively low (~40%), and it is actually very important to predict the true samples accurately. Therefore, I want to design a custom binary cross-entropy loss function that increases the weight of mis-classified true samples and makes the model focus more on predicting the true samples accurately.
I tried class_weight with sklearn in the model.fit method, and it did not work very well, since the weight is applied to all samples of a class instead of only the mis-classified ones.
I tried and adjusted the method mentioned here: https://github.com/keras-team/keras/issues/2115, but the loss function there is categorical cross-entropy and it did not work well for the binary classification problem. I tried to modify the loss function to a binary one but encountered some issues concerning the input dimension.
The sample code of the cost-sensitive loss function focusing on the mis-classified samples is:

from itertools import product
from keras import backend as K

def w_categorical_crossentropy(y_true, y_pred, weights):
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    # cast the boolean mask to floats so it can be multiplied below
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t]
    # note: recent Keras versions expect the (target, output) argument order
    return K.categorical_crossentropy(y_true, y_pred) * final_mask
A custom loss function for binary classification, implemented with Keras and Theano, that focuses on the mis-classified samples would be of great value for this imbalanced dataset. Please help troubleshoot this. Thanks!
Well, when I have to deal with imbalanced datasets in Keras, what I do is first compute the weights for each class and pass them to the model instance during training. This will look something like this:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

w = compute_class_weight('balanced', classes=np.unique(targets), y=targets)

# here I am adding only two categories with their corresponding weights;
# you can spin a loop, or continue by hand, until you include all of your categories
weights = {
    np.unique(targets)[0]: w[0],  # class 0 with weight w[0]
    np.unique(targets)[1]: w[1]   # class 1 with weight w[1]
}

# then during training you do it like this (other fit arguments elided)
model.fit(x=features, y=targets, class_weight=weights)
I believe this will solve your problem.
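If you still want the weighting at the loss level for the binary case, as the question asks, a minimal sketch in Keras-backend style follows; w_pos is a hypothetical up-weighting factor for the positive (true) class that you would tune. Note that this weights all positive samples, not only the mis-classified ones, but the cross-entropy term itself is already largest on mis-classified samples:

from keras import backend as K

def weighted_binary_crossentropy(w_pos):
    # errors on positive samples count w_pos times as much as errors on negatives
    def loss(y_true, y_pred):
        bce = K.binary_crossentropy(y_true, y_pred)
        sample_weights = y_true * w_pos + (1.0 - y_true)
        return K.mean(bce * sample_weights, axis=-1)
    return loss

model.compile(optimizer='adam', loss=weighted_binary_crossentropy(3.0))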

Kaggle Airbus ship detection challenge: how to deal with class imbalance?

My model always predicts probabilities under 0.5 for all pixels.
I dropped all images without ships and have tried focal loss, IoU loss, and weighted loss to deal with the imbalance.
But the result is the same: after a few batches, the predicted masks gradually become all zeros.
In my notebook, basically what I did is:
(1) discard all samples where there is no ship
(2) build a plain U-Net
(3) define three custom loss functions (iouloss, focal_binarycrossentropy, biased_crossentropy), all of which I have tried
(4) train and submit
# define different losses to try
import tensorflow as tf
from keras import backend as K

def iouloss(y_true, y_pred):
    # soft IoU (Jaccard) loss; K.epsilon() guards against division by zero on empty masks
    intersection = K.sum(y_true * y_pred, axis=-1)
    sum_ = K.sum(y_true + y_pred, axis=-1)
    jac = intersection / (sum_ - intersection + K.epsilon())
    return 1 - jac

def focal_binarycrossentropy(y_true, y_pred):
    # focal loss with gamma 8: strongly down-weight easy, well-classified pixels
    t1 = K.binary_crossentropy(y_true, y_pred)
    t2 = tf.where(tf.equal(y_true, 0), t1 * (y_pred ** 8), t1 * ((1 - y_pred) ** 8))
    return t2

def biased_crossentropy(y_true, y_pred):
    # apply 1000x heavier punishment to ship pixels (y_true == 1)
    t1 = K.binary_crossentropy(y_true, y_pred)
    t2 = tf.where(tf.equal(y_true, 0), t1, t1 * 1000)
    return t2
...

# try each loss function in turn
unet.compile(loss=iouloss, optimizer="adam", metrics=[ioumetric])
# or
unet.compile(loss=focal_binarycrossentropy, optimizer="adam", metrics=[ioumetric])
# or
unet.compile(loss=biased_crossentropy, optimizer="adam", metrics=[ioumetric])

...

# start training
unet.train_on_batch(x=image_batch, y=mask_batch)
One option that Keras provides is the class_weight parameter in fit. From the documentation:
class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
This will allow you to counter the imbalance to some extent.
I have heard of the Dice coefficient being used for this problem, although I have no personal experience of having done so. Perhaps you could try it? It is related to the Jaccard index, but I have heard anecdotally that it is easier to train. Sorry not to offer anything more concrete.
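For reference, a minimal sketch of a soft Dice loss in the same Keras-backend style as the losses above; smooth is a commonly used stabilizing constant, and its value here is an assumption:

from keras import backend as K

def dice_loss(y_true, y_pred, smooth=1.0):
    # soft Dice: 1 - 2*|A∩B| / (|A| + |B|), smoothed to avoid division by zero
    intersection = K.sum(y_true * y_pred, axis=-1)
    total = K.sum(y_true, axis=-1) + K.sum(y_pred, axis=-1)
    return 1 - (2.0 * intersection + smooth) / (total + smooth)

unet.compile(loss=dice_loss, optimizer="adam", metrics=[ioumetric])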
