I am using Keras for boundary/contour detection with a U-Net. When I use binary cross-entropy as the loss, the loss decreases over time as expected and the predicted boundaries look reasonable.
However, I have tried a custom Dice loss with various learning rates, and none of them work well.
import keras.backend as K

smooth = 1e-6

def dice_coef(y_true, y_pred):
    # Flatten both tensors and compute the soft Dice coefficient
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice(y_true, y_pred):
    return 1 - dice_coef(y_true, y_pred)
The loss values don't improve. That is, training shows something like
loss: nan - dice: .9607 - val_loss: nan - val_dice: .9631
I get NaNs for the losses, and the values of dice and val_dice barely change from epoch to epoch. This happens regardless of the learning rate, anywhere from .01 to 1e-6.
The train images/labels have dimensions N x H x W x 1, where N is the number of images and H/W are the height/width of each image.
Can anyone help?
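One thing that may be worth checking: with smooth > 0 and finite, non-negative inputs, the Dice formula above cannot produce NaN on its own, so a NaN loss suggests a non-finite value somewhere in the labels or in the network output. The sketch below mirrors dice_coef in plain NumPy (a stand-in for the Keras backend, used only so it runs without a model). The name dice_coef_np and the toy arrays are made up for illustration.

```python
import numpy as np

SMOOTH = 1e-6  # same smoothing constant as in the question


def dice_coef_np(y_true, y_pred, smooth=SMOOTH):
    """NumPy mirror of dice_coef: flatten, then (2*sum(t*p) + s) / (sum(t) + sum(p) + s)."""
    y_true_f = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred_f = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (np.sum(y_true_f) + np.sum(y_pred_f) + smooth)


# Toy batch shaped N x H x W x 1, like the data described above.
y_true = np.zeros((2, 4, 4, 1))
y_true[:, 1:3, 1:3, 0] = 1.0
y_pred = np.full((2, 4, 4, 1), 0.5)

finite_loss = 1.0 - dice_coef_np(y_true, y_pred)

# All-zero labels and predictions: smooth keeps the ratio defined (no 0/0).
empty_dice = dice_coef_np(np.zeros((1, 4, 4, 1)), np.zeros((1, 4, 4, 1)))

# A single NaN in the input is enough to make the whole loss NaN.
y_bad = y_pred.copy()
y_bad[0, 0, 0, 0] = np.nan
nan_loss = 1.0 - dice_coef_np(y_true, y_bad)
```

If the training arrays contain NaN or infinite values, or the labels are not in [0, 1], that would be the first thing I'd look at before tuning the learning rate further.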
I have seen some implementations of Dice loss for binary problems. However, I need to use it for NLP with the Trainer from Hugging Face Transformers. Is there an implementation or guide for using Dice loss on a multiclass problem?
Sample from this link:
def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_multilabel(y_true, y_pred, numLabels=5):
    dice = 0
    for index in range(numLabels):
        dice -= dice_coef(y_true[:, index, :, :, :], y_pred[:, index, :, :, :])
    return dice
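For a classification setting, the same idea can be applied per class to softmax probabilities and one-hot labels. The following is only a sketch in NumPy under my own assumptions: logits of shape (N, C), integer labels of shape (N,), and a mean over classes rather than the negative sum used above. The names softmax and multiclass_dice_loss are hypothetical. As far as I know, a custom loss is usually plugged into Trainer by subclassing it and overriding compute_loss, so the arithmetic would have to be rewritten with torch tensors there.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def multiclass_dice_loss(logits, labels, num_classes, smooth=1e-6):
    """Mean over classes of (1 - soft Dice), for logits (N, C) and int labels (N,)."""
    probs = softmax(np.asarray(logits, dtype=np.float64))
    one_hot = np.eye(num_classes)[np.asarray(labels)]    # (N, C)
    intersection = (probs * one_hot).sum(axis=0)          # one value per class
    denom = probs.sum(axis=0) + one_hot.sum(axis=0)       # one value per class
    dice_per_class = (2.0 * intersection + smooth) / (denom + smooth)
    return 1.0 - dice_per_class.mean()


# Hypothetical 3-class batch of 4 examples.
labels = np.array([0, 1, 2, 1])
confident = 20.0 * np.eye(3)[labels]   # logits strongly favouring the true class
uniform = np.zeros((4, 3))             # logits with no preference

loss_confident = multiclass_dice_loss(confident, labels, num_classes=3)
loss_uniform = multiclass_dice_loss(uniform, labels, num_classes=3)
```

Confident, correct logits drive the loss towards 0, while uninformative logits leave it well above 0, which is the behaviour one would want from a loss.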
I am training a U-Net model to segment particular areas (3 classes + background) in CT scans.
To evaluate the model's performance I use a custom accuracy function:
(1 - alpha) * dice + alpha * hausdorff, 0 < alpha < 1
During training, the Hausdorff part increases too fast: after several epochs it reaches 98%, compared to 30%-40% for the Dice part.
At the end of training:
total accuracy: ~73% | dice: ~50% | hausdorff: ~98%
Additional information about the U-Net model:
Number of epochs: 1 (training time: 9 hours).
Optimizer: Adam.
Data augmentation is used.
Dropout is used in the downsampling and upsampling paths.
The evaluation functions:
Dice:
import segmentation_models as sm

def dice(y_true, y_pred):
    return 1. - sm.losses.dice_loss(gt=y_true, pr=y_pred)
Hausdorff distance:
import numpy as np
from monai.metrics.utils import get_mask_edges, get_surface_distance

def hausdorff(y_true, y_pred):
    h, w = y_true.shape[1], y_true.shape[2]
    max_dist = np.sqrt(w ** 2 + h ** 2)
    (edges_pred, edges_gt) = get_mask_edges(y_pred, y_true)
    surface_distance = get_surface_distance(edges_pred, edges_gt,
                                            distance_metric="euclidean")
    if surface_distance.shape == (0,):
        return 1.0
    dist = np.max(surface_distance)
    if dist > max_dist:
        return 0.0
    return 1. - dist / max_dist
Custom accuracy:
import numpy as np
import tensorflow as tf
import keras.backend as K

# alpha is the global weight from the formula above (0 < alpha < 1), defined elsewhere

def accuracy(y_true, y_pred):
    y_true = tf.cast(y_true, dtype=tf.float32)
    y_pred = tf.cast(y_pred, dtype=tf.float32)
    y_true_2d, y_pred_2d = tf.argmax(y_true, axis=-1), tf.argmax(y_pred, axis=-1)
    # Perfect match: return 1 while keeping the result a tensor
    if np.all(y_true_2d.numpy() == y_pred_2d.numpy()):
        return 0 * K.sum(y_pred) + 1.
    d = dice(y_true, y_pred)
    h = hausdorff(y_true, y_pred)
    acc = (1. - alpha) * d + alpha * h
    return acc
Is the Hausdorff distance implementation correct in this case? Any additional suggestions on how to solve this?
Thanks,
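Two things may be worth checking here, both sketched below in plain NumPy rather than with the MONAI helpers. First, dividing by the image diagonal compresses the score towards 1: even a sizeable boundary error is small compared to the diagonal, which could explain a ~98% value next to a much lower Dice. Second, the Hausdorff distance is usually defined symmetrically (the larger of the two directed distances), whereas the function above takes the maximum over a single get_surface_distance call, which as far as I can tell measures one direction only. The 512 x 512 image size and the toy edge points are assumptions for illustration.

```python
import numpy as np


def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    diff = points_a[:, None, :] - points_b[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))   # shape (len(A), len(B))
    return dists.min(axis=1).max()


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


def hausdorff_score(dist, h, w):
    """Same normalization as in the question: 1 - dist / image diagonal."""
    return 1.0 - dist / np.sqrt(h ** 2 + w ** 2)


# Toy edge pixels as (row, col); B has one outlier far from A.
edges_a = np.array([[10.0, 10.0], [10.0, 20.0]])
edges_b = np.array([[10.0, 10.0], [10.0, 20.0], [40.0, 20.0]])

a_to_b = directed_hausdorff(edges_a, edges_b)   # every A point has an exact match in B
b_to_a = directed_hausdorff(edges_b, edges_a)   # the outlier is 30 px away from A
symmetric = hausdorff_distance(edges_a, edges_b)

# Assumed 512 x 512 image: a 30 px boundary error still scores about 0.96.
score = hausdorff_score(symmetric, 512, 512)
```

In this toy case one direction reports a distance of 0 while the other reports 30 px, so which direction is measured matters, and even the 30 px error maps to a score close to 1 after normalization.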
I am working on an image segmentation and classification model which takes one 3D image (64x64x64) and returns two outputs (a 3D mask of 64x64x64 and a one-hot encoded category). The two outputs are defined like this:
seg_final_stage = Conv3D(seg_classes, kernel_size=3, strides=1, padding="same", name="Seg_Final")(decoder1)
output_seg = Activation('sigmoid', name = "Seg_Final_Sigmoid")(seg_final_stage)
class_final = Dense(texture_classes, name="Class_Final")(class_dense3)
output_class = Activation('softmax', name = "Class_Final_SoftMax")(class_final)
model = Model(input, [output_seg, output_class], name = Config["Model_Name"])
I have tried two different methods, but both failed: it seems that each output returns its own loss instead of there being just one loss. These are the loss functions I have now.
# Custom loss function
def dice_coef(y_true, y_pred):
    smooth = 1.
    y_true_f = tf.reshape(y_true, [-1])
    y_pred_f = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)
    return score

def dice_loss(y_true, y_pred):
    return 1 - dice_coef(y_true, y_pred)

def CCE(y_true, y_pred):
    class_loss = CategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
    return class_loss(y_true, y_pred)

def hybrid(y_true, y_pred):
    mask_weight = 0.8
    class_weight = 0.2
    mask_loss = dice_loss(y_true[0], y_pred[0])
    class_loss = CCE(y_true[1], y_pred[1])
    return mask_weight * mask_loss + class_weight * class_loss
The first try:
I use a single hybrid loss function that tries to sum the Dice loss and CCE with weights. However, the model returns 2 losses, like output_seg_hybrid and output_class_hybrid. I thought y_true and y_pred were passed as arrays, so that I could take the first item to calculate Dice and the second item for CCE.
model.compile(loss=hybrid,
              optimizer=Nadam(learning_rate=Config["learning_rate"], beta_1=0.9, beta_2=0.999, epsilon=1e-07, name="Nadam"),
              metrics=[hybrid])
The second try:
I applied 2 losses as shown below, so that each output has its corresponding loss, expecting to get 2 losses that are summed up with weights. What I actually get is 4 losses: Seg_Final_Sigmoid_dice_loss, Class_Final_SoftMax_dice_loss, Seg_Final_Sigmoid_CCE and Class_Final_SoftMax_CCE.
model.compile(loss={'Seg_Final_Sigmoid': dice_loss, 'Class_Final_SoftMax': CCE},
              optimizer=model.optimizer, metrics=[dice_loss, CCE], loss_weights=[0.8, 0.2])
What should I do if I just want to sum these two losses into a single loss?
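As far as I understand Keras, when one loss per output is given together with loss_weights (as in the second try), the value logged as `loss` is already the weighted sum of the per-output losses, and that single number is what gets optimized. The four extra entries most likely come from passing `metrics` as a flat list, which applies every metric to every output. The arithmetic can be sketched in NumPy (used here only so the example runs without a model). The helper names and the toy targets are hypothetical.

```python
import numpy as np


def dice_loss_np(y_true, y_pred, smooth=1.0):
    """NumPy mirror of dice_loss from the question (smooth = 1)."""
    y_true_f, y_pred_f = y_true.ravel(), y_pred.ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return 1.0 - (2.0 * intersection + smooth) / (y_true_f.sum() + y_pred_f.sum() + smooth)


def cce_np(y_true, y_pred, eps=1e-7):
    """Categorical cross-entropy per sample for one-hot y_true."""
    return -np.sum(y_true * np.log(np.clip(y_pred, eps, 1.0)), axis=-1)


# Hypothetical targets/predictions for the two heads of one sample.
mask_true = np.array([1.0, 1.0, 0.0, 0.0])
mask_pred = np.array([1.0, 0.0, 0.0, 0.0])
class_true = np.array([[0.0, 1.0, 0.0]])
class_pred = np.array([[0.25, 0.5, 0.25]])

seg_loss = dice_loss_np(mask_true, mask_pred)       # loss of the segmentation head
cls_loss = cce_np(class_true, class_pred).mean()    # loss of the classification head
total = 0.8 * seg_loss + 0.2 * cls_loss             # weights taken from loss_weights
```

Passing metrics as a dict keyed by output name, for example {'Seg_Final_Sigmoid': dice_loss, 'Class_Final_SoftMax': CCE}, should stop each metric from being reported for both outputs.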
My U-Net train_dice_loss is decreasing but my val_dice_loss remains at 0.4. It looks like the network is overfitting, but shouldn't val_dice_loss increase at some point?
The network is based on the Carvana Segmentation Competition (Colab Carvana Segmentation). I use the same model, target function and data augmentation pipeline, but I have much less data (~1900 images of 256x256 px). I split my data into training, validation and test sets. On the test set my model predicts quite well (average dice_coeff of ~0.75), but I can't explain this graph.
Additional Information:
def dice_coeff(y_true, y_pred):
    smooth = 1.
    # Flatten
    y_true_f = tf.reshape(y_true, [-1])
    y_pred_f = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (tf.reduce_sum(y_true_f) +
                                            tf.reduce_sum(y_pred_f) + smooth)
    return score

def dice_loss(y_true, y_pred):
    loss = 1 - dice_coeff(y_true, y_pred)
    return loss

def bce_dice_loss(y_true, y_pred):
    loss = losses.binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)
    return loss
I also tried different splits and Keras optimizers. It always ends at ~0.4.
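When comparing the validation loss (~0.4) with the test score (~0.75), it may help to check that both are computed the same way. dice_loss works on raw probabilities during training, whereas an evaluation score is often computed on thresholded masks, and the two can differ a lot for the same prediction. Whether that is what happens here is only an assumption. The NumPy sketch below (a stand-in for the TensorFlow code, with a made-up 10 x 10 example and a 0.5 threshold) just shows the size of the effect.

```python
import numpy as np


def dice_coeff_np(y_true, y_pred, smooth=1.0):
    """NumPy mirror of dice_coeff above (flatten, smooth = 1)."""
    y_true_f, y_pred_f = y_true.ravel(), y_pred.ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (y_true_f.sum() + y_pred_f.sum() + smooth)


# Hypothetical 10 x 10 mask with a 5 x 5 object, and a soft prediction that is
# correct everywhere after thresholding but not very confident.
y_true = np.zeros((10, 10))
y_true[:5, :5] = 1.0
y_prob = np.where(y_true == 1.0, 0.7, 0.2)

soft_dice = dice_coeff_np(y_true, y_prob)                          # on probabilities
hard_dice = dice_coeff_np(y_true, (y_prob > 0.5).astype(float))    # on thresholded mask
```

Here the thresholded prediction is perfect, yet the soft Dice is only about 0.62, so a soft Dice loss can stay near 0.4 even when the thresholded masks look good.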
Assume that y_true and y_pred are in [-1,1]. I want a weighted mean-square-error loss function in which the loss for samples that are positive in y_true and negative in y_pred, or vice versa, is weighted by exp(alpha). Here is my code:
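import tensorflow as tf
import keras.backend as K

alpha = 1.0

def custom_loss(y_true, y_pred):
    se = K.square(y_pred - y_true)
    # A value counts as "negative" when it is <= 0
    true_label = K.less_equal(y_true, 0.0)
    pred_label = K.less_equal(y_pred, 0.0)
    # Weight the squared error by exp(alpha) where the signs disagree
    return K.mean(se * K.exp(alpha * K.cast(K.not_equal(true_label, pred_label), tf.float32)))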
And here is a plot of this loss function. Different curves are for different values of y_true.
I want to know:
Is this a valid loss function, given that it is not differentiable at 0?
Is my code correct?
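A quick way to check the behaviour is to mirror the function in NumPy and evaluate it on either side of zero. The sketch below does that (the name custom_loss_np and the test points are mine). It suggests the loss is not just non-differentiable at y_pred = 0 but jumps there by a factor of exp(alpha). Since the weight comes from a comparison, no gradient flows through it, so away from 0 the gradient is simply the MSE gradient scaled by 1 or exp(alpha).

```python
import numpy as np

ALPHA = 1.0  # same value as in the question


def custom_loss_np(y_true, y_pred, alpha=ALPHA):
    """NumPy mirror of custom_loss: squared error, times exp(alpha) on sign mismatch."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    se = np.square(y_pred - y_true)
    mismatch = (y_true <= 0.0) != (y_pred <= 0.0)   # 0 counts as "negative"
    return np.mean(se * np.exp(alpha * mismatch))


# Same target, predictions on either side of zero.
same_sign = custom_loss_np([0.5], [0.1])     # 0.4**2 with weight 1
opp_sign = custom_loss_np([0.5], [-0.1])     # 0.6**2 with weight e
just_above = custom_loss_np([0.5], [1e-9])   # about 0.25
just_below = custom_loss_np([0.5], [-1e-9])  # about 0.25 * e
```

Gradient-based training can still run with such a jump, but the optimizer only sees the penalty as a larger step size on the mismatched samples, not as a slope that pushes predictions across 0.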
I suggest this type of loss function for handling an imbalanced dataset:
import tensorflow as tf
import keras.backend as K

def focal_loss(y_true, y_pred):
    gamma = 2.0
    alpha = 0.25
    pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
    pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
    return -K.sum(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) - K.sum((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
from this source
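One caveat with the function above: it takes the log of pt_1 and of 1 - pt_0 directly, which is undefined when a prediction is exactly 0 or 1. A common guard is to clip the predictions first. The NumPy sketch below mirrors the formula with such a clip (NumPy is used only so it runs standalone; the eps value and the name focal_loss_np are assumptions of mine, not part of the cited source).

```python
import numpy as np


def focal_loss_np(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """NumPy mirror of focal_loss, with predictions clipped away from 0 and 1."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    pt_1 = np.where(y_true == 1, y_pred, 1.0)   # positives keep p, others become 1
    pt_0 = np.where(y_true == 0, y_pred, 0.0)   # negatives keep p, others become 0
    pos = -np.sum(alpha * (1.0 - pt_1) ** gamma * np.log(pt_1))
    neg = -np.sum((1.0 - alpha) * pt_0 ** gamma * np.log(1.0 - pt_0))
    return pos + neg


easy = focal_loss_np([1, 0], [0.9, 0.1])        # confident and correct
hard = focal_loss_np([1, 0], [0.1, 0.9])        # confident and wrong
saturated = focal_loss_np([1, 0], [0.0, 1.0])   # would hit log(0) without the clip
```

The easy examples contribute almost nothing while the hard ones dominate, which is the point of the (1 - p)^gamma factor, and the saturated case stays finite thanks to the clip.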