DICE loss too low but no overlap between prediction and label - python

I am trying to segment bone on cross-sectional MRI images with the U-Net I found here: https://github.com/zhixuhao/unet. The label is a binary PNG image which I intend to compare to my prediction. For the frames, I wrote my own data loader for my DICOM MRI files.
I am using the following Dice loss in Keras:
from keras import backend as K

def DiceLoss(targets, inputs, smooth=1e-6):
    # flatten label and prediction tensors
    inputs = K.flatten(inputs)
    targets = K.flatten(targets)
    intersection = K.sum(K.dot(targets, inputs))
    dice = (2 * intersection + smooth) / (K.sum(targets) + K.sum(inputs) + smooth)
    return 1 - dice
I have seen it on Kaggle (https://www.kaggle.com/bigironsphere/loss-function-library-keras-pytorch), among other similar Keras implementations I found out there.
But the Dice loss is always very small (around 10^-4) even though there is no overlap between the prediction and the label. Why is this? Should I first convert my prediction to a binary mask? If so, how can I do this in Keras? I tried to convert targets to a NumPy array with targets.numpy() and then apply a threshold, but it throws a "tensor does not have attribute numpy" error. Do you have any other ideas?
thank you
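One thing to check is thresholding the prediction inside the graph rather than via .numpy(). Below is a minimal sketch of a "hard" Dice coefficient using only Keras backend ops; it is a monitoring metric rather than a training loss, since the thresholding step is not differentiable, and the 0.5 threshold is just an assumed example value:

from keras import backend as K

def hard_dice_coef(targets, inputs, threshold=0.5, smooth=1e-6):
    # binarize the prediction inside the graph (not differentiable,
    # so use this as a metric, not as the training loss)
    inputs = K.cast(K.greater(K.flatten(inputs), threshold), 'float32')
    targets = K.flatten(targets)
    intersection = K.sum(targets * inputs)  # element-wise product, not K.dot
    return (2. * intersection + smooth) / (K.sum(targets) + K.sum(inputs) + smooth)

If this hard Dice still looks near-perfect with no visible overlap, it is also worth checking how the labels are loaded and scaled (e.g. masks stored as 0-255 instead of 0-1).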

Related

Trying to define a combined loss function in a CNN with VGG16 perceptual loss and SSIM

Problem definition:
I am implementing a CNN using TensorFlow. The input and output are of size samples x 128 x 128 x 1 (grayscale images). In the loss function I already have SSIM (0-1), and now my goal is to combine the SSIM value with a perceptual loss using a pre-trained VGG16. I have already consulted the following answers (link1, link2), but instead of concatenating the VGG model at the end of the main model I would like to compute feature maps inside the loss function at specific layers (e.g. pool1, pool2, pool3) and compute the overall MSE. I have defined the loss function as follows:
Combined Loss function:
def lossfun(yTrue, yPred):
    alpha = 0.5
    return (1 - alpha) * perceptual_loss(yTrue, yPred) + alpha * K.mean(1 - tf.image.ssim(yTrue, yPred, 1.0))
and perceptual loss:
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Concatenate

model = VGG16()
model = Model(inputs=model.inputs, outputs=model.layers[1].output)

def perceptual_loss(yTrue, yPred):
    true = model(preprocess_input(yTrue))
    P = Concatenate()([yPred, yPred, yPred])
    pred = model(preprocess_input(P))
    vggLoss = tf.math.reduce_mean(tf.math.square(true - pred))
    return vggLoss
The error I am running into is the following:
ValueError: Dimensions must be equal, but are 224 and 128 for 'loss_22/conv2d_132_loss/sub' (op: 'Sub') with input shapes: [?,224,224,64], [?,128,128,64].
The error arises for the following reason:
yPred has size (None, 128, 128, 1); after concatenating it three times and calling pred = model(preprocess_input(P)), I receive a feature map pred of size (None, 128, 128, 64). Meanwhile, after true = model(preprocess_input(yTrue)), the dimension of true is (None, 224, 224, 64). This eventually creates a dimension incompatibility when computing the final vggLoss.
Question
Since I am new to this task, I am not sure if I am approaching the problem in the right manner. Should I create samples of size 224x224 instead of 128x128 in order to avoid this conflict, or is there another workaround to fix this issue?
Thank you !
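One possible workaround is to build VGG16 with include_top=False at the data's native 128x128 resolution, so that yTrue never has to be 224x224. The following is only a sketch, not the original code: it assumes the grayscale inputs are scaled to [0, 1] (hence the multiplication by 255 before preprocess_input) and arbitrarily picks block3_pool as the feature layer:

import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model

# feature extractor at the data's native resolution, so yTrue and yPred
# go through the same 128x128 graph
vgg = VGG16(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
feat_model = Model(inputs=vgg.inputs, outputs=vgg.get_layer('block3_pool').output)
feat_model.trainable = False

def perceptual_loss(yTrue, yPred):
    # replicate the single grayscale channel to the 3 channels VGG expects
    true3 = tf.concat([yTrue, yTrue, yTrue], axis=-1)
    pred3 = tf.concat([yPred, yPred, yPred], axis=-1)
    # preprocess_input expects pixel values in [0, 255] (assumed inputs in [0, 1])
    true_feat = feat_model(preprocess_input(true3 * 255.0))
    pred_feat = feat_model(preprocess_input(pred3 * 255.0))
    return tf.reduce_mean(tf.square(true_feat - pred_feat))

Alternatively, resizing both tensors to 224x224 with tf.image.resize before feeding them to the full VGG16 would also remove the mismatch.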

U-Net multiclass segmentation image input dataset error

I am trying to do multiclass segmentation with U-Net. In previous trials I tried binary segmentation and it worked, but when I try to do multiclass segmentation I am facing this error:
ValueError: generator yielded an element of shape (128, 192, 1) where an element of shape (128, 192, 5) was expected
The 5 denotes the number of classes. This is how I defined my output layer: output: Tensor("output/sigmoid:0", shape=(?, 128, 192, 5), dtype=float32).
I kept a crop size of input_shape: (128, 192, 1) because of the grayscale images, and label_shape: (128, 192, 5).
The data is loaded into a TensorFlow Dataset and consumed through a tf.data iterator. A generator yields the data for the Dataset:
def get_datapoint_generator(self):
    def generator():
        for i in itertools.count(1):
            datapoint_dict = self._get_next_datapoint()
            yield datapoint_dict['image'], datapoint_dict['mask']
The _get_next_datapoint function gets the next datapoint from RAM and applies cropping and augmentation.
Where could this have gone wrong, so that the yielded mask does not match the expected output shape?
Can you try this implementation? I am using this one, but it is in Keras:
from keras import backend as K
import tensorflow as tf

def sparse_crossentropy(y_true, y_pred):
    nb_classes = K.int_shape(y_pred)[-1]
    y_true = K.one_hot(tf.cast(y_true[:, :, 0], dtype=tf.int32), nb_classes + 1)
    return K.categorical_crossentropy(y_true, y_pred)
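Alternatively, if the masks are stored as a single channel of integer class indices, here is a minimal sketch of one-hot encoding inside the generator so the yielded label shape matches (128, 192, 5); the helper name and the NUM_CLASSES constant are just illustrative:

import numpy as np

NUM_CLASSES = 5  # assumed number of classes

def to_one_hot(mask):
    # mask: (128, 192, 1) with integer class indices in [0, NUM_CLASSES - 1]
    indices = mask[..., 0].astype(np.int32)
    return np.eye(NUM_CLASSES, dtype=np.float32)[indices]  # (128, 192, 5)

# inside the generator:
# yield datapoint_dict['image'], to_one_hot(datapoint_dict['mask'])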

Class Imbalance for image classification

I am working on multi-label image classification where some labels have very few images. How should I handle these cases?
Data augmentation, which means making 'clones' of your images (flipping them, rotating them to different angles, etc.).
Do image augmentation for your dataset. Image augmentation means adding variation (noise, resizing, etc.) to your training images in a way that the object you are classifying is still clearly recognizable. Some code examples for image augmentation (using imgaug) are:
Adding noise
import imgaug as ia
import imgaug.augmenters as iaa

gaussian_noise = iaa.AdditiveGaussianNoise(10, 20)
noise_image = gaussian_noise.augment_image(image)
ia.imshow(noise_image)
Cropping
crop = iaa.Crop(percent=(0, 0.3))  # crop image
crop_image = crop.augment_image(image)
ia.imshow(crop_image)
Shearing
shear = iaa.Affine(shear=(0, 40))
shear_image = shear.augment_image(image)
ia.imshow(shear_image)
Flipping
# flipping image horizontally
flip_hr = iaa.Fliplr(p=1.0)
flip_hr_image = flip_hr.augment_image(image)
ia.imshow(flip_hr_image)
Now you just need to put this into your data generator, and your class-imbalance problem will be greatly reduced.
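For example, the individual augmenters above can be combined into one imgaug pipeline and applied per batch inside the generator. This is only a sketch; the specific augmenters and parameter ranges are illustrative:

import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                           # horizontal flip half the time
    iaa.Crop(percent=(0, 0.1)),                # random crop
    iaa.Affine(shear=(-20, 20)),               # shear
    iaa.AdditiveGaussianNoise(scale=(0, 20)),  # noise
], random_order=True)

# batch_of_images: (N, H, W, C) uint8 array produced by the data generator
augmented_batch = seq.augment_images(batch_of_images)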
Besides augmenting your data as suggested in the other answers, you can use different weights to balance your multi-label loss. If n_c is the number of samples in class c, then you can weight the loss value for class c as:
l_c' = (1 / n_c) * l_c
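A minimal sketch of such a weighted loss in Keras, assuming a multi-label model ending in per-label sigmoids; the class counts here are hypothetical placeholders:

import numpy as np
from keras import backend as K

class_counts = np.array([5000, 1200, 300, 60], dtype='float32')  # hypothetical n_c values
class_weights = K.constant(1.0 / class_counts)

def weighted_bce(y_true, y_pred):
    # per-label binary cross-entropy, scaled by 1/n_c for each label
    bce = K.binary_crossentropy(y_true, y_pred)  # shape (batch, C)
    return K.mean(bce * class_weights, axis=-1)

Pass it to model.compile(loss=weighted_bce, ...). In practice the weights are often rescaled (e.g. so they sum to the number of classes) to keep the overall loss magnitude comparable.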

Use trained discriminator in GAN to calculate probabilities

I followed this tutorial on GAN - https://github.com/adeshpande3/Generative-Adversarial-Networks/blob/master/Generative%20Adversarial%20Networks%20Tutorial.ipynb
I want to use the trained discriminator to calculate probabilities for test images (I trained on images which represent a certain set, and want to check the probability that a test image resembles that set). I used the following code (after reloading the model):
newP = sess.run(Dx, feed_dict={x_placeholder: dataset2})
print("prob: " + str(newP))
But it is not giving probabilities, just floats greater than 1. How can I use the trained discriminator to find probabilities?
Use prob = tf.nn.sigmoid(Dx) for your probabilities. Dx is a single raw output per image, and softmax over a single output is always 1 (exp(Dx)/exp(Dx) = 1), so sigmoid is the right way to map it into [0, 1].
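In code, that means squashing the logit through a sigmoid before running the session. This sketch reuses the question's Dx, sess, x_placeholder and dataset2 names:

import tensorflow as tf

prob = tf.nn.sigmoid(Dx)  # Dx is the discriminator's raw logit output
newP = sess.run(prob, feed_dict={x_placeholder: dataset2})
print("prob: " + str(newP))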

Depth prediction in monocular images using NYUv2 dataset through deep learning

Recently I have been working on the research problem of obtaining depth from a monocular image using deep learning. The dataset I am using is the NYUv2 RGB-D dataset. I used the VGGNet-16 network and modified my inputs as VGGNet requires. I used 23000 images as my initial dataset and got an RMSE of 0.13. However, most of the research papers in this area report an RMSE in the range 0.6-0.9 (although the dataset is the same, the subset of images used by each paper is different); one of them was published recently at CVPR. So I am skeptical of my approach to computing the RMSE. I am using Keras for deep learning and Theano as the backend. Below is a snippet of my RMSE code in Keras.
Code 1:
from keras import backend as K

def custom_rmse(y_true, y_pred):
    temp = K.mean(K.square(abs(y_pred - y_true)), axis=-1)
    return K.sqrt(temp)
Some of the research papers mention not considering the zero pixel values in the target depth image, so I also wrote a modified version of the RMSE as follows:
import numpy as np
from keras import backend as K

def custom_rmse1(y_true, y_pred):
    ind = np.nonzero(y_true)
    y_pred = y_pred[ind[0], ind[1], ind[2]]
    y_true = y_true[ind[0], ind[1], ind[2]]
    temp1 = abs(y_pred - y_true)
    temp2 = K.square(temp1)
    temp3 = K.mean(temp2)
    t1 = K.sqrt(temp3)
    return t1
But in both cases the result remained almost the same (0.11-0.20). I also tried different subsets of images; even with 122000 images and 200 epochs I still got an RMSE of 0.19.
Please suggest how to proceed with this problem.
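As a side note, np.nonzero generally cannot index a symbolic tensor inside a compiled Keras loss or metric, which may be why the masked version behaves like the unmasked one. A minimal sketch of the masking done with backend ops (assuming a zero depth value marks an invalid pixel) would be:

from keras import backend as K

def masked_rmse(y_true, y_pred):
    # keep only pixels where the ground-truth depth is non-zero,
    # using backend ops so the masking stays inside the symbolic graph
    mask = K.cast(K.not_equal(y_true, 0.0), K.floatx())
    squared_error = K.square(y_pred - y_true) * mask
    n_valid = K.maximum(K.sum(mask), 1.0)  # avoid division by zero
    return K.sqrt(K.sum(squared_error) / n_valid)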
