Class Imbalance for image classification

I am working on multi-label image classification where some labels have very few images. How should I handle these cases?

Use data augmentation, which means making modified 'clones' of the images you already have (mirroring the image, rotating it to a different angle, etc.).

Apply image augmentation to your dataset. Image augmentation means adding variation (noise, resizing, etc.) to your training images in a way that leaves the object you are classifying still recognizable to the naked eye.
Some code for image augmentation, using the imgaug library:
import imgaug as ia
from imgaug import augmenters as iaa
# image is assumed to be an image already loaded as a NumPy array
Adding noise
gaussian_noise=iaa.AdditiveGaussianNoise(10,20) # Gaussian noise with loc=10, scale=20
noise_image=gaussian_noise.augment_image(image)
ia.imshow(noise_image)
Cropping
crop=iaa.Crop(percent=(0, 0.3)) # crop each side by a random 0-30%
crop_image=crop.augment_image(image)
ia.imshow(crop_image)
Shearing
shear=iaa.Affine(shear=(0,40)) # shear by a random angle between 0 and 40 degrees
shear_image=shear.augment_image(image)
ia.imshow(shear_image)
Flipping
flip_hr=iaa.Fliplr(p=1.0) # flip every image horizontally
flip_hr_image=flip_hr.augment_image(image)
ia.imshow(flip_hr_image)
Now put these augmenters into your data generator and use them to create extra training samples for the under-represented labels, which should reduce the class imbalance.

While you can augment your data as suggested in the other answers, you can also use different weights to balance your multi-label loss. If n_c is the number of samples in class c and l_c is the loss value for class c, then you can weight the loss as:
l_c' = (1/n_c) * l_c
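As a rough sketch of the weighting above in plain Python, assuming binary cross-entropy as the per-class loss l_c (the function names and the sample counts are made up for illustration):

```python
import math

def class_weights(samples_per_class):
    """Return the weight 1/n_c for every class c."""
    return [1.0 / n for n in samples_per_class]

def weighted_multilabel_loss(y_true, y_prob, weights, eps=1e-7):
    """Sum over classes of w_c * l_c, with l_c the binary cross-entropy."""
    total = 0.0
    for t, p, w in zip(y_true, y_prob, weights):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        l_c = -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        total += w * l_c
    return total

# Hypothetical counts: class 0 has 1000 images, class 1 has only 10.
weights = class_weights([1000, 10])

# The same mistake (probability 0.5 for a positive label) on each class in turn.
loss_common = weighted_multilabel_loss([1, 0], [0.5, 0.01], weights)
loss_rare = weighted_multilabel_loss([0, 1], [0.01, 0.5], weights)
```

With these numbers, a mistake on the rare class contributes far more to the loss than the same mistake on the common class, which is the point of the weighting.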

Related

How do you apply a TensorFlow model to a single input and obtain the actual prediction, and how do you use the model in a separate script?

Recently I have been learning TensorFlow and have written a few machine learning programs. However, I am wondering how I can test the model on a single input and receive the prediction, rather than just evaluating the accuracy of the model on a lot of data as you would with the model.fit() function. I am also wondering how I can then use the model in a script that, for example, gathers data, feeds it into the model automatically to obtain the predictions, and then plots the results on a graph.
Thanks in advance.

To use your trained model on a single input, let's call it y, you must process y into the same data format your model was trained on. For example, assume you trained a model on images of cats and dogs. If the model trained properly, you should be able to submit a picture of a cat or a dog to it and have it tell you which it is.
If images were the input used to train the model, they had a certain shape (height, width) and a certain channel format, for example RGB or grayscale. So for the image y you want to predict, you must ensure its height and width match what the model was trained on. If the model was trained on RGB images, then y must be an RGB image.
One more thing: model.predict requires the first dimension of its input to be the batch size. For a single image the batch size is 1, so you need to expand the dimensions of y to include it. The shape of an image y is (height, width, channels); it has no batch dimension, so you need to add one. You can do that with y=np.expand_dims(y,axis=0), which gives y the shape (1, height, width, channels).
For example, assume you trained your model on images of shape (224,224,3) in RGB format. You have an image y you want to classify, and say it is in a directory my_pics. The code below shows how to run a prediction on image y. Somewhere in your training code you need to have an ordered list called classes. For the cat and dog example, the index for cat might be 0 and the index for dog would then be 1, so classes would be classes=['cat', 'dog'].
import cv2
import numpy as np
import tensorflow as tf
classes=['cat', 'dog'] # ordered list of class names from training
model=tf.keras.models.load_model(r'path where model is stored') # load the trained model
image_path=r'my_pics' # path to image y (point this at the image file itself)
y=cv2.imread(image_path) # note: cv2 reads images in BGR order
y=cv2.resize(y, (224,224)) # give y the same height and width as the training images
y=cv2.cvtColor(y, cv2.COLOR_BGR2RGB) # convert from BGR to RGB
# if the training images were rescaled (for example divided by 255), apply the same scaling to y here
y=np.expand_dims(y, axis=0) # y now has shape (1,224,224,3)
prediction=model.predict(y) # make a prediction on y
print(prediction) # array of shape (1, number of classes) with a probability value for each class
class_index=np.argmax(prediction[0]) # index of the entry in prediction with the highest probability
klass=classes[class_index] # select the class name from the ordered list of classes
print(klass)
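To use this from a separate script, one option is to wrap the steps in a function. The sketch below is only an illustration: fake_predict is a hypothetical stand-in for model.predict (it returns fixed probabilities), and the all-zero image is a placeholder, so the snippet runs without a trained model.

```python
import numpy as np

classes = ['cat', 'dog']  # ordered class names, as in the answer above

def fake_predict(batch):
    """Hypothetical stand-in for model.predict: one probability row per image."""
    assert batch.ndim == 4, "expected (batch, height, width, channels)"
    return np.tile(np.array([[0.2, 0.8]]), (batch.shape[0], 1))

def classify(image, predict_fn, class_names):
    """Add the batch dimension, predict, and map the best index to a name."""
    batch = np.expand_dims(image, axis=0)   # (H, W, C) -> (1, H, W, C)
    probabilities = predict_fn(batch)[0]    # row for our single image
    return class_names[int(np.argmax(probabilities))]

image = np.zeros((224, 224, 3), dtype=np.uint8)  # placeholder image
label = classify(image, fake_predict, classes)
```

With a real model you would pass model.predict as predict_fn and call classify on every image your script gathers, collecting the labels for plotting.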

How to calculate similarity between an image and a text?

I have several images and I want to know whether there is an aircraft in each image.
I used the CLIP model shown below, but the output is [[1.0]] even though the image is a human face. I think it is because it uses softmax.
I tried to use logits_per_image instead, but I cannot interpret the value, tensor([[20.03]]).
Is there any way to know how strongly an image is related to a word, as a percentage or similar?
Could I use object detection instead to see whether there are any aircraft in my image?
from PIL import Image
import requests
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
image = Image.open('image_4.jpg')
inputs = processor(text=['aircraft'], images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
probs.tolist()
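A small sketch of why the softmax output is always 1.0 with a single text prompt: softmax normalizes across the candidate texts, so with only one candidate the result is 1.0 whatever the logit is. The value 25.0 below is a made-up logit for a hypothetical second prompt, not a real CLIP output.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

single = softmax([20.03])        # one text prompt, as in the question
pair = softmax([20.03, 25.0])    # second logit is a made-up value
```

This suggests passing several candidate texts to the processor (for example 'aircraft' together with a few alternatives), so that the softmax has something to compare against.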

DICE loss too low but no overlap between prediction and label

I am trying to segment the bone in cross-sectional MRI images with the U-Net I found here https://github.com/zhixuhao/unet. The label is a binary PNG image which I intend to compare to my prediction. For the frames, I wrote my own data loader for my DICOM MRI files.
I am using the following Dice loss in Keras:
def DiceLoss(targets, inputs, smooth=1e-6):
    # flatten label and prediction tensors
    inputs = K.flatten(inputs)
    targets = K.flatten(targets)
    intersection = K.sum(K.dot(targets, inputs))
    dice = (2*intersection + smooth) / (K.sum(targets) + K.sum(inputs) + smooth)
    return 1 - dice
I have seen it on Kaggle (https://www.kaggle.com/bigironsphere/loss-function-library-keras-pytorch), among other similar Keras implementations that I found.
But the Dice loss is always very small (10^-4) even though there is no overlap between the prediction and the label. Why is this? Should I first convert my prediction to a binary mask? If so, how can I do this in Keras? I tried to convert targets to a NumPy array with targets.numpy() and then apply a threshold, but it throws a "tensor does not have attribute numpy" error. Do you have any other ideas?
Thank you.
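For reference, here is a plain-Python sketch of the same formula with the intersection computed as an element-wise product followed by a sum. That is an assumption about the intended behaviour, not a diagnosis of the Keras code above. With it, masks that do not overlap give a loss close to 1 and identical masks give a loss close to 0, which can serve as a sanity check.

```python
def dice_loss(targets, inputs, smooth=1e-6):
    """1 - Dice coefficient for two flat sequences of values in [0, 1]."""
    intersection = sum(t * p for t, p in zip(targets, inputs))
    dice = (2 * intersection + smooth) / (sum(targets) + sum(inputs) + smooth)
    return 1 - dice

label = [1, 1, 0, 0]
no_overlap = dice_loss(label, [0, 0, 1, 1])    # prediction misses the label entirely
full_overlap = dice_loss(label, [1, 1, 0, 0])  # prediction equals the label
```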

Why does my keras model get good accuracy but bad predictions?

I am trying to make a model which can recognize doodles. I am using Google's Quick Draw data (https://console.cloud.google.com/storage/browser/quickdraw_dataset/full/numpy_bitmap), which are images rendered into 28x28 greyscale bitmap NumPy arrays. I chose only 10 classes and took 60,000 images to train and evaluate. I get a test accuracy of 91%. When I make predictions on samples from the test data, it works. But when I make a drawing in Paint and convert it to 28x28, the predictions are poor. What sort of data do I need to have? What kind of preprocessing does the image need?
This is how I preprocessed the data from Google's npy files:
def load_set(name,path,resultx,resulty,label):
    loaded_set = np.load(path+name+".npy")
    loaded_set = loaded_set.reshape(loaded_set.shape[0],1,28,28)
    # print(name,loaded_set.shape)
    loaded_set = loaded_set[0:6000,0:6000,0:6000,0:6000]
    resultx = np.append(resultx,loaded_set,axis=0)
    resulty = createLabelArray(label,loaded_set.shape[0],resulty)
    print("loaded "+name)
    return resultx,resulty
def createLabelArray(label,size,result):
    for i in range(0,size):
        result = np.append(result,[[label]],axis=0)
    return result
Here label is the label I want for that category.
I shuffle the data afterwards.
And this is how I am trying to process new images (drawings by me):
print("[INFO] loading and preprocessing image...")
image = image_utils.load_img(os.path.join(path, name), grayscale=True,target_size=(28, 28))
image = image_utils.img_to_array(image)
print(image.shape)
image = np.expand_dims(image, axis=0)
print(image.shape)
image = image.astype('float32')
image /= 255
return image
Please help, I've been stuck on this for a while now. Thank you

This seems to be a typical case of overfitting.
Try 10-fold cross-validation to get a more reliable estimate of the model's accuracy.
In addition, use regularization and dropout in Keras to reduce overfitting.
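A minimal sketch of how the 10 folds can be formed, in plain Python and with a hypothetical helper name. It only produces index lists; you would shuffle the data first, then train on the train indices and evaluate on the test indices of each fold and average the scores. scikit-learn's sklearn.model_selection.KFold does the same job if you already use that library.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(k_fold_indices(25, k=10))  # 25 is just a small example size
```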

Scikit-learn SVM digit recognition

I want to make a program that recognizes the digit in an image. I am following the tutorial in the scikit-learn documentation.
I can train and fit the SVM classifier as follows.
First, I import the libraries and the dataset:
from sklearn import datasets, svm, metrics
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
Second, I create the SVM model and train it with the dataset.
classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])
And then, I try to read my own image and use the function predict() to recognize the digit.
Here is my image:
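I resize the image to (8, 8), keep a single channel, and later flatten it to a 1D array.
from scipy import misc # misc.imread and misc.imresize only exist in old SciPy releases
img = misc.imread("w1.jpg")
img = misc.imresize(img, (8, 8))
img = img[:, :, 0]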
Finally, when I print out the prediction, it returns [1]:
predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
print(predicted)
Whichever other images I use, it still returns [1].
When I print out the "default" dataset sample of the number "9", it looks like:
My image of the number "9":
You can see that the non-zero values are much larger in my image.
I don't know why. I am looking for help to solve my problem. Thanks.

My best bet would be that there is a problem with your data types and array shapes.
It looks like you are training on numpy arrays that are of the type np.float64 (or possibly np.float32 on 32 bit systems, I don't remember) and where each image has the shape (64,).
Meanwhile your input image for prediction, after the resizing operation in your code, is of type uint8 and shape (1, 64).
I would first try changing the shape of your input image since dtype conversions often just work as you would expect. So change this line:
predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
to this:
predicted = classifier.predict(img.reshape(img.shape[0]*img.shape[1]))
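If that doesn't fix it, you can always try recasting the data type as well with
img = img.astype(digits.images.dtype).
(Note that recent scikit-learn versions expect a 2-D array of shape (1, 64) for a single sample and reject a 1-D array, so there the original reshape is the one to keep.)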
I hope that helps. Debugging by proxy is a lot harder than actually sitting in front of your computer :)
Edit: According to the scikit-learn documentation, the training data contains integer values from 0 to 16. The values in your input image should be scaled to fit the same interval. (http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)

1) You need to create your own training set, based on data similar to what you will be making predictions on. The call to datasets.load_digits() in scikit-learn loads a preprocessed copy of the UCI handwritten digits dataset (8x8 images), which, for all we know, could contain very different images from the ones you are trying to recognise.
2) You need to set the parameters of your classifier properly. The call to svm.SVC(gamma = 0.001) just picks an arbitrary value for the gamma parameter of SVC, which may not be the best option. In addition, you are not configuring the C parameter, which is pretty important for SVMs. I'd bet that this is one of the reasons why your output is 'always 1'.
3) Whatever final settings you choose for your model, you'll need to use a cross-validation scheme to ensure that the algorithm is effectively learning.
There's a lot of machine learning theory behind this, but, as a good start, I would really recommend having a look at SVM - scikit-learn for a more in-depth description of how the SVC implementation in scikit-learn works, and at GridSearchCV for a simple technique for parameter setting.

It's just a guess, but the training set from scikit-learn shows black digits on a white background, and you are trying to predict digits which are white on a black background.
I think you should either train on your own training set or predict on the negative version of your pictures.
I hope this helps!
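If you want to test this guess, taking the negative of a greyscale image is a one-liner. This is a minimal NumPy sketch on a made-up 2x2 patch; it assumes 8-bit pixels, so max_value is 255 (use 16 after rescaling to the 0-16 range of the digits dataset).

```python
import numpy as np

def invert(image, max_value=255):
    """Return the negative of a greyscale image."""
    return max_value - image

# Hypothetical 2x2 greyscale patch: dark stroke (0) on a white page (255).
patch = np.array([[255, 0], [0, 255]], dtype=np.uint8)
negative = invert(patch)
```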

If you look at:
http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits
you can see that each point in the matrix has a value between 0 and 16.
You can try transforming the values of your image to the same 0-16 range. I did that, and now the prediction works well for the digit 9 but not for 8 and 6. It no longer always returns 1.
from sklearn import datasets, svm, metrics
import cv2
import numpy as np
# Load digit database
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
# Train SVM classifier
classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])
# Read image "9" and keep a single channel
img = cv2.imread("w1.jpg")
img = img[:,:,0]
img = cv2.resize(img, (8, 8))
# Normalize the values in the image to 0-16
minValueInImage = np.min(img)
maxValueInImage = np.max(img)
normalizedImg = np.floor((img - minValueInImage).astype(float) / float(maxValueInImage - minValueInImage) * 16)
# Predict
predicted = classifier.predict(normalizedImg.reshape((1, normalizedImg.shape[0]*normalizedImg.shape[1])))
print(predicted)

I solved this problem with the following checks:
Check the number of attributes (features): is it too large or too small?
Check the scale of your grey values; I changed mine to [0,16].
Check the data type; I changed it to uint8.
Check the amount of training data: is it too small?
I hope it helps. ^.^

In addition to @carrdelling's response, I will add that you may be able to use the same training set if you normalize your images to have the same range of values.
For example, you could binarize your data (1 if > 0, else 0), or you could divide by the maximum intensity in your image to map the values to the interval [0, 1].
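Both normalizations can be sketched in a few lines of NumPy (the function names and the 2x2 example values are made up). Whichever one you pick has to be applied to the training images as well as to the image you predict on.

```python
import numpy as np

def binarize(image):
    """1 where the pixel is non-zero, 0 elsewhere."""
    return (image > 0).astype(np.uint8)

def scale_to_unit(image):
    """Divide by the maximum intensity so values fall in [0, 1]."""
    peak = image.max()
    if peak == 0:  # all-black image: avoid dividing by zero
        return image.astype(np.float64)
    return image.astype(np.float64) / peak

pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)
binary = binarize(pixels)
unit = scale_to_unit(pixels)
```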

You probably want to extract features relevant to your dataset from the images and train your model on them.
Here is one example I copied from elsewhere:
surf = cv2.SURF(400)
kp, des = surf.detectAndCompute(img,None)
But the SURF features may not be the most useful or relevant ones for your dataset and training task. You should try others too, such as HOG.
Remember that the more high-level the features you extract, the more general and error-tolerant your model will be on unseen images. However, you may be sacrificing accuracy on your known samples and test cases.
