I'm working on a project that has 700 images for each of 2 classes (1400 total). I'm using VGG16, but I'm new to this model and I don't know what I could do to improve it.
This is my model:
vgg16_model = VGG16(weights="imagenet", include_top=True)
# (1) visualize layers
print("VGG16 model layers")
for i, layer in enumerate(vgg16_model.layers):
    print(i, layer.name, layer.output_shape)
# (2) remove the top layer
base_model = Model(input=vgg16_model.input,
                   output=vgg16_model.get_layer("block5_pool").output)
# (3) attach a new top layer
base_out = base_model.output
base_out = Reshape((25088,))(base_out)
top_fc1 = Dense(256, activation="relu")(base_out)
top_fc1 = Dropout(0.5)(top_fc1)
# output layer: (None, 1)
top_preds = Dense(1, activation="sigmoid")(top_fc1)
# (4) freeze weights until the last but one convolution layer (block4_pool)
for layer in base_model.layers[0:14]:
    layer.trainable = False
# (5) create new hybrid model
model = Model(input=base_model.input, output=top_preds)
# (6) compile and train the model
sgd = SGD(lr=1e-4, momentum=0.9)
model.compile(optimizer=sgd, loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit([data], [labels], nb_epoch=NUM_EPOCHS,
                    batch_size=BATCH_SIZE, validation_split=0.1)
# evaluate final model
vlabels = model.predict(np.array(valid))
model.save('model.h5')
... which gives me the following output:
Train on 1260 samples, validate on 140 samples
Epoch 1/5
1260/1260 [==============================] - 437s 347ms/step - loss: 0.2200 - acc: 0.9746 - val_loss: 2.4432e-05 - val_acc: 1.0000
Epoch 2/5
1260/1260 [==============================] - 456s 362ms/step - loss: 0.0090 - acc: 0.9984 - val_loss: 1.5452e-04 - val_acc: 1.0000
Epoch 3/5
1260/1260 [==============================] - 438s 347ms/step - loss: 1.3702e-07 - acc: 1.0000 - val_loss: 8.4489e-05 - val_acc: 1.0000
Epoch 4/5
1260/1260 [==============================] - 446s 354ms/step - loss: 4.2592e-06 - acc: 1.0000 - val_loss: 7.6768e-05 - val_acc: 1.0000
Epoch 5/5
1260/1260 [==============================] - 457s 363ms/step - loss: 0.0017 - acc: 0.9992 - val_loss: 1.1921e-07 - val_acc: 1.0000
It seems to be overfitting a bit.
My predict.py:
def fix_layer0(filename, batch_input_shape, dtype):
    with h5py.File(filename, 'r+') as f:
        model_config = json.loads(f.attrs['model_config'].decode('utf-8'))
        layer0 = model_config['config']['layers'][0]['config']
        layer0['batch_input_shape'] = batch_input_shape
        layer0['dtype'] = dtype
        f.attrs['model_config'] = json.dumps(model_config).encode('utf-8')
fix_layer0('model.h5', [None, 224, 224, 3], 'float32')
model = load_model('model.h5')
for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224,224))
        ImageCV = cv2.addWeighted(ImageCV,4, cv2.GaussianBlur(ImageCV,(0,0), 224/25), -4, 120) #The same process made when I get data in train
        ImageCV = ImageCV.reshape(-1,224,224,3)
        print(model.predict(ImageCV))
And the results are strange, because only the first 2 images are of 'class 0'; the others are 'class 1':
[[0.99905235]]
[[0.]]
[[1.]]
[[0.012198]]
[[0.]]
[[1.]]
[[1.6363418e-07]]
[[0.99997246]]
[[0.00433112]]
[[0.9996668]]
[[1.]]
[[6.183685e-08]]
What can I do to improve it? I'm a little confused..
ImageCV = cv2.addWeighted(ImageCV,4, cv2.GaussianBlur(ImageCV,(0,0), 224/25), -4, 120)
Not sure why you do this for the test data. For validation/test data, usually only normalization is done. During training you also need to apply the same normalization as the final step before feeding the data to the network.
Refer to this example of fine-tuning VGG16 for a two-class problem (dogs vs. cats):
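A minimal sketch of the point above: keep one preprocessing function and call it from both the training script and the prediction script, so the two cannot drift apart. The function name `preprocess` and the choice of scaling to [0, 1] are assumptions for illustration; the code in the question uses a different, OpenCV-based transform.

```python
import numpy as np

def preprocess(image):
    """Hypothetical shared preprocessing: scale uint8 pixels to [0, 1]."""
    return image.astype("float32") / 255.0

# Stand-in for an image loaded from disk (224x224 RGB, uint8).
raw = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)

# The same call is used when building the training set...
train_sample = preprocess(raw)
# ...and when preparing a single image for model.predict().
predict_batch = preprocess(raw)[np.newaxis, ...]

print(train_sample.min() >= 0.0, train_sample.max() <= 1.0)
print(predict_batch.shape)  # (1, 224, 224, 3)
```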
https://gist.github.com/fchollet/7eb39b44eb9e16e59632d25fb3119975
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
To reduce overfitting, you can do data augmentation for the training data, i.e. feed the original data plus augmented data (apply operations like flip, zoom etc.). Keras ImageDataGenerator makes it easy to do the augmentation, which is explored in the above tutorial as well.
https://keras.io/preprocessing/image/
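To make the augmentation idea concrete, here is a NumPy-only sketch of a horizontal flip, one of the operations ImageDataGenerator can apply when `horizontal_flip=True`. The helper name `with_flips` and the tiny stand-in images are hypothetical; in practice the Keras generator applies such transforms on the fly.

```python
import numpy as np

def with_flips(images, labels):
    """Return the original images plus horizontally flipped copies."""
    flipped = images[:, :, ::-1, :]  # reverse the width axis
    return (np.concatenate([images, flipped], axis=0),
            np.concatenate([labels, labels], axis=0))

# Two tiny stand-in images of shape (height, width, channels) = (4, 4, 3).
images = np.arange(2 * 4 * 4 * 3, dtype="float32").reshape(2, 4, 4, 3)
labels = np.array([0, 1])

aug_images, aug_labels = with_flips(images, labels)
print(aug_images.shape, aug_labels.shape)  # (4, 4, 4, 3) (4,)
```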
First of all, Keras predict returns the raw output of the network, and predict_classes (available on Sequential models in older Keras versions) returns the most likely class. For example, if you classify between cats and dogs with a two-unit softmax output, predict could output 0.2 for cat and 0.8 for dog.
You only get one value per picture because your network has a single sigmoid output neuron. That is a valid setup for two classes: the value is the predicted probability of class 1, and values below 0.5 mean class 0.
If you prefer one score per class, use two output neurons with softmax, together with loss="categorical_crossentropy" and one-hot encoded labels:
top_preds = Dense(2, activation="softmax")(top_fc1)
If you want the most likely class rather than the probabilities, use predict_classes where it is available, or take the argmax (or apply a 0.5 threshold for a single sigmoid output) of what predict returns.
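With a single sigmoid output, as in the question's model, the most likely class can be read off by thresholding the score at 0.5. The sketch below applies that to the scores printed in the question; the 0.5 cut-off is a common convention, not something the model dictates.

```python
import numpy as np

# Scores copied from the question's model.predict() output.
scores = np.array([0.99905235, 0.0, 1.0, 0.012198, 0.0, 1.0,
                   1.6363418e-07, 0.99997246, 0.00433112,
                   0.9996668, 1.0, 6.183685e-08])

# Class 1 if the sigmoid score exceeds 0.5, otherwise class 0.
classes = (scores > 0.5).astype(int)
print(classes)  # [1 0 1 0 0 1 0 1 0 1 1 0]
```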
I am using the LeNet architecture below to train my image classification model. I have noticed that neither train nor val accuracy improves from one iteration to the next. Can anyone with expertise in this area explain what might have gone wrong?
training samples - 110 images belonging to 2 classes.
validation - 50 images belonging to 2 classes.
#LeNet
import keras
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
#import dropout class if needed
from keras.layers import Dropout
from keras import regularizers
model = Sequential()
#Layer 1
#Conv Layer 1
model.add(Conv2D(filters = 6,
                 kernel_size = 5,
                 strides = 1,
                 activation = 'relu',
                 input_shape = (32,32,3)))
#Pooling layer 1
model.add(MaxPooling2D(pool_size = 2, strides = 2))
#Layer 2
#Conv Layer 2
model.add(Conv2D(filters = 16,
                 kernel_size = 5,
                 strides = 1,
                 activation = 'relu',
                 input_shape = (14,14,6)))
#Pooling Layer 2
model.add(MaxPooling2D(pool_size = 2, strides = 2))
#Flatten
model.add(Flatten())
#Layer 3
#Fully connected layer 1
model.add(Dense(units=128, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#Layer 4
#Fully connected layer 2
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#layer 5
#Fully connected layer 3
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#layer 6
#Fully connected layer 4
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#Layer 7
#Output Layer
model.add(Dense(units = 2, activation = 'softmax'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
#Image Augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,  #rescaling pixel value bw 0 and 1
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
#Just Feature scaling
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
    '/Dataset/Skin_cancer/training',
    target_size=(32, 32),
    batch_size=32,
    class_mode='categorical')
test_set = test_datagen.flow_from_directory(
    '/Dataset/Skin_cancer/testing',
    target_size=(32, 32),
    batch_size=32,
    class_mode='categorical')
model.fit_generator(
    training_set,
    steps_per_epoch=50,   # number of batches drawn per epoch
    epochs=25,
    validation_data=test_set,
    validation_steps=10)  # number of validation batches per epoch
Epoch 1/25
50/50 [==============================] - 52s 1s/step - loss: 0.8568 - accuracy: 0.4963 - val_loss: 0.7004 - val_accuracy: 0.5000
Epoch 2/25
50/50 [==============================] - 50s 1s/step - loss: 0.6940 - accuracy: 0.5000 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/25
50/50 [==============================] - 48s 967ms/step - loss: 0.6932 - accuracy: 0.5065 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/25
50/50 [==============================] - 50s 1s/step - loss: 0.6932 - accuracy: 0.4824 - val_loss: 0.6933 - val_accuracy: 0.5000
Epoch 5/25
50/50 [==============================] - 49s 974ms/step - loss: 0.6932 - accuracy: 0.4949 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 6/25
50/50 [==============================] - 51s 1s/step - loss: 0.6932 - accuracy: 0.4854 - val_loss: 0.6931 - val_accuracy: 0.5000
Epoch 7/25
50/50 [==============================] - 49s 976ms/step - loss: 0.6931 - accuracy: 0.5015 - val_loss: 0.6918 - val_accuracy: 0.5000
Epoch 8/25
50/50 [==============================] - 51s 1s/step - loss: 0.6932 - accuracy: 0.4986 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 9/25
50/50 [==============================] - 49s 973ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6929 - val_accuracy: 0.5000
Epoch 10/25
50/50 [==============================] - 50s 1s/step - loss: 0.6931 - accuracy: 0.5044 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 11/25
50/50 [==============================] - 49s 976ms/step - loss: 0.6931 - accuracy: 0.5022 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 12/25
Most importantly, you are using loss = 'categorical_crossentropy'; change it to loss = 'binary_crossentropy', as you have just 2 classes. Also change class_mode='categorical' to class_mode='binary' in flow_from_directory.
As @desertnaut rightly mentioned, categorical_crossentropy goes hand in hand with softmax activation in the last layer, and if you change the loss to binary_crossentropy, the last layer should also be changed to a single unit with sigmoid activation.
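One detail in the question's training log supports the diagnosis that the network is not learning: a loss of 0.6932 is almost exactly ln 2, which is what cross-entropy evaluates to when a two-class model assigns probability 0.5 to each class. A quick check in plain Python (the helper name is only for illustration):

```python
import math

def cross_entropy(p_true_class):
    """Cross-entropy for one sample given the probability of its true class."""
    return -math.log(p_true_class)

chance_loss = cross_entropy(0.5)
print(round(chance_loss, 4))  # 0.6931
```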
Other Improvements:
You have very limited data (160 images), and you have used about a third of it (50 of 160 images) as validation data.
As you are building the model for image classification, you just have two Conv2D layers and 4 Dense layers. The Dense layers add a large number of weights to be learnt. Add a few more Conv2D layers and reduce the number of Dense layers.
Set batch_size = 1 and remove steps_per_epoch. As you have very little input data, let every epoch have the same number of steps as there are input records.
Use the default glorot_uniform kernel initializer.
To further tune your model, build it from multiple Conv2D layers, followed by a GlobalAveragePooling2D layer, an FC layer and a final softmax layer.
Use the data augmentation options of ImageDataGenerator, such as horizontal_flip, vertical_flip, shear_range and zoom_range, to increase the number of training images the model sees.
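On the batch size and steps point in the list above: with a generator, one pass over the data takes ceil(samples / batch_size) steps. The sketch below uses the sample counts and batch size from the question; the helper name is hypothetical.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

train_steps = steps_for(110, 32)  # training images, batch_size=32
val_steps = steps_for(50, 32)     # validation images, batch_size=32
print(train_steps, val_steps)     # 4 2
```

With steps_per_epoch=50, as in the question, each epoch therefore cycles through the 110 training images roughly a dozen times.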
Moving the comments to the answer section as suggested by @desertnaut -
Question - Thanks! Yes, I figured that the small amount of data is the problem. One additional question: why does having more dense layers than conv layers negatively affect the model? Is there any rule to follow when deciding how many conv and dense layers to use? – Arun_Ramji_Shanmugam 2 days ago
Answer - To answer the first part of your question: a Conv2D layer preserves the spatial information of the image, and the number of weights to be learnt depends on the kernel size and on the number of input channels and filters, whereas a Dense layer needs the output of Conv2D to be flattened, which loses the spatial information. Dense layers also add many more weights; for example, connecting two dense layers of 512 units each adds 512*512 = 262144 weights (plus 512 biases) that have to be learnt by the model. That means you have to train for more epochs, with good hyperparameter settings, to learn these weights. – Tensorflow Warriors 2 days ago
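The weight counts mentioned in this comment can be reproduced with simple arithmetic. The helper names below are hypothetical; the formulas are the standard ones for fully connected and 2D convolution layers with biases, applied to layer sizes from the question.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return kernel_size * kernel_size * in_channels * filters + filters

# Dense layer of 512 units fed by another layer of 512 units.
print(dense_params(512, 512))   # 262656 (262144 weights + 512 biases)
# Second convolution from the question: 5x5 kernel, 6 -> 16 channels.
print(conv2d_params(5, 6, 16))  # 2416
# First dense layer from the question: Flatten gives 5*5*16 = 400 inputs.
print(dense_params(400, 128))   # 51328
```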
Answer - To answer the second part of your question: use systematic experiments to discover what works best for your specific dataset. It also depends on the processing power you have. Deeper networks often perform better, at the cost of needing more data and being harder to train. A conventional approach is to look for similar problems and deep learning architectures that have already been shown to work. You can also use pretrained models such as ResNet or VGG by freezing part of the layers and training the remaining ones. – Tensorflow Warriors 2 days ago
Question - Thank you for the detailed answer! If you don't mind one more question: when we use an already trained model (or some of its layers), doesn't it need to have been trained on the same kind of input data as the data we are going to work with? – Arun_Ramji_Shanmugam yesterday
Answer - The intuition behind transfer learning for image classification is that if a model is trained on a large and general enough dataset, it will effectively serve as a generic model of the visual world. You can find a transfer learning example with explanation here: tensorflow.org/tutorials/images/transfer_learning – Tensorflow Warriors yesterday
Remove all kernel_initializer='uniform' arguments from your layers; don't specify anything here, the default initializer glorot_uniform is the highly recommended one (and the uniform is a particularly bad one).
As a general rule, keep in mind that the default values for such rather advanced settings are there for your convenience; they are implicitly recommended, and you had better not change them unless you have specific reasons to do so and you know exactly what you are doing.
For the kernel_initializer argument in particular, I have started believing that it has caused a lot of unnecessary pain to people (just see here for the most recent example).
Also, dropout should not be used by default, especially in cases like here where the model seems to struggle to learn anything; start without any dropout (comment out the respective layers), and only add it back if you see signs of overfitting.
I'm making a simple classification algo with a keras neural network. The goal is to take 3 data points on weather and decide whether or not there's a wildfire. Here's an image of the .csv dataset that I'm using to train the model(this image is only the top few lines and isn't the entire thing ):
wildfire weather dataset
As you can see, there are 4 columns with the fourth being either a "1" which means "fire", or a "0" which means "no fire". I want the algo to predict either a 1 or a 0. This is the code that I wrote:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import csv
#THIS IS USED TO TRAIN THE MODEL
# Importing the dataset
dataset = pd.read_csv('Fire_Weather.csv')
dataset.head()
X=dataset.iloc[:,0:3]
Y=dataset.iloc[:,3]
X.head()
obj=StandardScaler()
X=obj.fit_transform(X)
X_train,X_test,y_train,y_test=train_test_split(X, Y, test_size=0.25)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 3))
# classifier.add(Dropout(p = 0.1))
# Adding the second hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
# classifier.add(Dropout(p = 0.1))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
classifier.fit(X_train, y_train, batch_size = 3, epochs = 10)
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
print(y_pred)
classifier.save("weather_model.h5")
The problem is that whenever I run this, my accuracy is always "0.0000e+00" and my training output looks like this:
Epoch 1/10
2146/2146 [==============================] - 2s 758us/step - loss: nan - accuracy: 0.0238
Epoch 2/10
2146/2146 [==============================] - 1s 625us/step - loss: nan - accuracy: 0.0000e+00
Epoch 3/10
2146/2146 [==============================] - 1s 604us/step - loss: nan - accuracy: 0.0000e+00
Epoch 4/10
2146/2146 [==============================] - 1s 609us/step - loss: nan - accuracy: 0.0000e+00
Epoch 5/10
2146/2146 [==============================] - 1s 624us/step - loss: nan - accuracy: 0.0000e+00
Epoch 6/10
2146/2146 [==============================] - 1s 633us/step - loss: nan - accuracy: 0.0000e+00
Epoch 7/10
2146/2146 [==============================] - 1s 481us/step - loss: nan - accuracy: 0.0000e+00
Epoch 8/10
2146/2146 [==============================] - 1s 476us/step - loss: nan - accuracy: 0.0000e+00
Epoch 9/10
2146/2146 [==============================] - 1s 474us/step - loss: nan - accuracy: 0.0000e+00
Epoch 10/10
2146/2146 [==============================] - 1s 474us/step - loss: nan - accuracy: 0.0000e+00
Does anyone know why this is happening and what I could do to my code to fix this?
Thank You!
EDIT: I realized that my earlier response was highly misleading, which was thankfully pointed out by @xdurch0 and @Timbus Calin. Here is an edited answer.
Check that all your input values are valid. Are there any nan or inf values in your training data?
Try using different activation functions. ReLU is good, but it is prone to what is known as the dying ReLU problem, where units get stuck outputting zero and their weights stop being updated. One possibility is to use Leaky ReLU or PReLU.
Try using gradient clipping, a technique used to tackle exploding gradients (which may be what is happening in your case). Keras lets you configure the clipnorm and clipvalue arguments on optimizers.
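The first check in the list above can be done before training. A minimal sketch with NumPy, where the small array stands in for the real feature matrix (the values are made up for illustration and the helper name is hypothetical):

```python
import numpy as np

def report_invalid(features):
    """Count NaN and infinite entries in a 2D feature array."""
    features = np.asarray(features, dtype="float64")
    return {"nan": int(np.isnan(features).sum()),
            "inf": int(np.isinf(features).sum())}

# Made-up data with one missing value and one infinite value.
X = np.array([[21.0, 0.3, 5.0],
              [np.nan, 0.1, 7.5],
              [19.5, np.inf, 6.2]])

counts = report_invalid(X)
print(counts)  # {'nan': 1, 'inf': 1}

# Keep only rows where every entry is finite.
clean = X[np.isfinite(X).all(axis=1)]
print(clean.shape)  # (1, 3)
```

A single NaN in the inputs is typically enough to turn the loss into nan, so this is worth ruling out first.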
There are posts on SO that report similar problems, such as this one, which might also be of interest to you.
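For reference, the gradient clipping by norm mentioned above rescales a gradient whose L2 norm exceeds a threshold so that its norm equals the threshold. The NumPy sketch below illustrates the idea only; in Keras the optimizer does this internally when clipnorm is set.

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

big = np.array([3.0, 4.0])    # norm 5.0, gets rescaled
small = np.array([0.3, 0.4])  # norm 0.5, left unchanged

clipped_big = clip_by_norm(big, 1.0)
clipped_small = clip_by_norm(small, 1.0)
print(clipped_big, clipped_small)
```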
I am following this guide to learn image classification with neural networks:
https://www.tensorflow.org/tutorials/keras/classification
And I implemented this code for my custom dataset. I have 2300 grayscale 1024x1024 pictures to train the model. I hold all my images in 3D numpy arrays named train_images and test_images. I have 4 classes, which are 0, 1, 2 and 3, and I hold those as a list named "labels".
train_images.shape # returns (2300,1024,1024)
test_images.shape # returns (384,1024,1024)
# normalize values
train_images = train_images / 255.0
test_images = test_images / 255.0
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(1024, 1024)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(4, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, labels, epochs=10)
Everything is almost the same as in the guide, but my epoch accuracy is around 0.4:
Epoch 10/10
...
2176/2300 [===========================>..] - ETA: 0s - loss: 9.5701 - acc: 0.4062
2208/2300 [===========================>..] - ETA: 0s - loss: 9.5628 - acc: 0.4067
2240/2300 [============================>.] - ETA: 0s - loss: 9.5485 - acc: 0.4076
2272/2300 [============================>.] - ETA: 0s - loss: 9.5417 - acc: 0.4080
2300/2300 [==============================] - 12s 5ms/step - loss: 9.5307 - acc: 0.4087
Also, in the guide some predictions are fractional, but when I try to predict, my model's predictions are only 0 or 1. It says it is 100% sure of a class, but it is wrong.
predictions = model.predict(test_images)
print(predictions)
# 0 | 0 | 1 | 0
# 0 | 0 | 1 | 0
# 1 | 0 | 0 | 0
UPDATED
Here are the epoch results for 256*256 images with 2 classes and 100 images per class:
32/200 [===>..........................] - ETA: 0s - loss: 8.5627 - acc: 0.4688
200/200 [==============================] - 0s 317us/step - loss: 8.0590 - acc: 0.5000
Epoch 10/10
I also reduced my classes to 2, but my predictions still return 100% for the wrong class.
I don't know what I am doing wrong. If you have any advice or ideas I would be grateful. Thank you in advance.
40% accuracy is not good. The model needs more training. You should rescale the images to 128 or 256 pixels per side to save time. Also try increasing the epoch count to something like 100, or get the loss down to around 1 before testing. Another thing to check is class imbalance.
According to https://arxiv.org/abs/1708.07747, Fashion MNIST contains 7000 images per class, with 70000 images in total. If your dataset has class imbalance, which seems likely, you should look into other metrics and methods.
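A quick way to check for class imbalance is to count the labels. The label list below is made up for illustration; in the question it would be the "labels" list passed to model.fit.

```python
from collections import Counter

# Hypothetical labels for a four-class problem.
labels = [0, 0, 0, 0, 0, 0, 1, 2, 2, 3]

counts = Counter(labels)
total = len(labels)
for cls in sorted(counts):
    print(cls, counts[cls], f"{counts[cls] / total:.0%}")

# Accuracy obtained by always predicting the most frequent class.
majority_baseline = counts.most_common(1)[0][1] / total
print(majority_baseline)  # 0.6
```

If the reported accuracy is close to this baseline, the model may simply be predicting the majority class.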
This is a newbie question, but I just can't get the simplest Keras experiment work. I went through a course and its samples work well, so my computer is set up correctly.
I've a few thousand 16x12 images, called "GGYRBGBBW.png", "BBYWBRBBB.png" and so on. The images have a single color with minimal shade differences and are reduced to a single pixel when loading. The first character of the filenames serves as training labels (e.g. green images' names start with 'G'). I need to build and train a simple model that indicates one of the 6 possible colors in the image. (This is a first learning step towards a much more complicated project).
I don't want to use ImageDataGenerator because this whole project will grow beyond what a simple directory-structure categorization can do, and I'll do the image randomization with my own external image generator.
I created a Keras model that looks like this:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_1 (Flatten) (None, 3) 0
_________________________________________________________________
dense_1 (Dense) (None, 6) 24
=================================================================
Total params: 24
Trainable params: 24
Non-trainable params: 0
So, pretty simple - input shape is only three values (RGB), flattened, then 6 output neurons for each color category.
When running the experiments, beginning with random initial values, accuracy stays super low, sometimes 0.
210/210 [==============================] - 0s 2ms/step - loss: 7.2430 - acc: 0.2095 - val_loss: 6.7980 - val_acc: 0.2000
Epoch 2/50
210/210 [==============================] - 0s 10us/step - loss: 7.2411 - acc: 0.2095 - val_loss: 9.6617 - val_acc: 0.2000
Epoch 3/50
210/210 [==============================] - 0s 5us/step - loss: 9.9256 - acc: 0.2095 - val_loss: 9.6598 - val_acc: 0.2000
Epoch 4/50
210/210 [==============================] - 0s 5us/step - loss: 9.9236 - acc: 0.2095 - val_loss: 9.6579 - val_acc: 0.2000
Epoch 5/50
210/210 [==============================] - 0s 10us/step - loss: 9.9217 - acc: 0.2095 - val_loss: 9.6560 - val_acc: 0.2000
Epoch 6/50
210/210 [==============================] - 0s 10us/step - loss: 9.9197 - acc: 0.2095 - val_loss: 9.6541 - val_acc: 0.2000
I must be missing something trivial, but since I'm a noob in this, I can't figure out what. Here's the complete source code:
from __future__ import print_function
import random
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
os.system('cls')
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras import backend as K
from keras.preprocessing.image import img_to_array
import numpy as np
num_classes = 6
batch_size = 300
epochs = 50
Image_width, Image_height = 1, 1
train_dir = './OneColorTrainingData'
colors="WYBGRO"
def load(numOfPics):
    X = []
    y = []
    print("Reading training images")
    allFiles = os.listdir(train_dir)
    randomFiles = random.choices(allFiles, k=numOfPics)
    for f in randomFiles:
        path = os.path.join(train_dir, f)
        img = keras.preprocessing.image.load_img(path, grayscale=False, target_size=(Image_width, Image_width))
        img = img_to_array(img)
        img /= 255
        X.append(img)
        y.append(colors.index(f[0]))
    y = keras.utils.to_categorical(y, num_classes=num_classes)
    return X, y
Data, labels = load(batch_size)
print(str(len(Data)) + " training files loaded")
print(labels)
model = Sequential()
model.add(Flatten(input_shape=(Image_height, Image_width, 3)))
model.add(Dense(num_classes, activation=K.tanh))
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])
print(model.input_shape)
print(model.summary())
hist = model.fit(np.array(Data), np.array(labels), batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.3)
score = model.evaluate(np.array(Data), np.array(labels), verbose=0)
print('Test loss: ', score[0])
print('Test accuracy: ', score[1])
Any help would be appreciated.
You have K.tanh as your final Dense layer activation. When doing 1-of-many classification we usually use softmax instead, which produces a probability distribution over the colour classes:
model.add(Dense(num_classes, activation='softmax'))
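To see the difference, the NumPy sketch below applies both activations to the same made-up vector of six pre-activation values. Softmax yields non-negative values that sum to 1; tanh yields values in (-1, 1) that do not form a probability distribution.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

logits = np.array([2.0, -1.0, 0.5, 0.0, -3.0, 1.0])  # made-up values

probs = softmax(logits)
squashed = np.tanh(logits)

print(probs.sum())         # sums to 1 (up to floating point)
print(squashed.min() < 0)  # True: tanh outputs can be negative
```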
Now your target labels will be one-hot vectors such as [0,0,1,0,0,0], indicating which class it is. You can also use the sparse_categorical_crossentropy loss and give the class integer (here 2) as your target. It means the same thing in this context.
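The two label formats carry the same information. A NumPy sketch using the colour order and the example file names from the question; indexing into `np.eye` here stands in for keras.utils.to_categorical.

```python
import numpy as np

colors = "WYBGRO"  # class order used in the question
num_classes = len(colors)

# Integer (sparse) labels taken from the first letter of each file name.
filenames = ["GGYRBGBBW.png", "BBYWBRBBB.png"]
sparse = np.array([colors.index(name[0]) for name in filenames])
print(sparse)  # [3 2]

# One-hot encoding of the same labels.
one_hot = np.eye(num_classes)[sparse]
print(one_hot)
```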
I'm not positive, but I think you need a hidden layer. You currently have a single affine transformation on your input: h_0 = W_0*x + b_0. This is passed through a K.tanh, so your logits are simply y = tanh(h_0). I have a hunch that for a small enough problem like this you may be able to prove that this cannot converge, but I am not certain.
I would just add a second dense layer, and use softmax for your final output as @nuric suggests:
model = Sequential()
model.add(Flatten(input_shape=(Image_height, Image_width, 3)))
model.add(Dense(10, activation=K.tanh))
model.add(Dense(num_classes, activation='softmax'))
OK, I figured it out. I needed to switch to softmax (thanks to all who suggested it) and use many more epochs, because convergence was random and slow. I didn't think I needed more epochs because nothing changed after the first few, but it turned out that once I had 500 instead of 50, the network managed to learn all the colors one by one and hit 100% accuracy (with a single output layer) almost every time. Thank you all for the help!
I'm trying to use Keras and its MobileNet implementation to do object localization (output the x/y coordinates of a few features, instead of classes) and I'm running into some likely very basic issue that I can't figure out.
My code looks like this:
# =============================
# Load MobileNet and change the top layers.
model = applications.MobileNet(weights="imagenet",
                               include_top=False,
                               input_shape=(224, 224, 3))
# Freeze all the layers except the very last 5.
for layer in model.layers[:-5]:
    layer.trainable = False
# Adding custom Layers at the end, after the last Conv2D layer.
x = model.output
x = GlobalAveragePooling2D()(x)
x = Reshape((1, 1, 1024))(x)
x = Dropout(0.5)(x)
x = Conv2D(1024, (1, 1), activation='relu', padding='same', name='conv_preds')(x)
x = Dense(1024, activation="relu")(x)
# I'd like this to output 4 variables, two pairs of x/y coordinates
x = Dense(PREDICT_SIZE, activation="sigmoid")(x)
predictions = Reshape((PREDICT_SIZE,))(x)
# =============================
# Create the new final model.
model_final = Model(input = model.input, output = predictions)
def custom_loss(y_true, y_pred):
    '''Trying to compute the Euclidean distance as a Loss Function'''
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1))
model_final.compile(loss = custom_loss,
                    optimizer = optimizers.adam(lr=0.0001),
                    metrics=["accuracy"])
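To sanity-check the loss outside Keras, the same per-sample Euclidean distance can be written in NumPy. This is only a checking aid, assuming targets and predictions are arrays of shape (batch, 4); the first target row is the example from the question and the prediction rows are made up.

```python
import numpy as np

def euclidean_distance(y_true, y_pred):
    """Per-sample Euclidean distance, mirroring custom_loss above."""
    return np.sqrt(np.sum(np.square(y_true - y_pred), axis=-1))

y_true = np.array([[0.182, -0.0933, 0.072, -0.0453],
                   [0.0, 0.0, 0.0, 0.0]])
y_pred = np.array([[0.182, -0.0933, 0.072, -0.0453],  # perfect prediction
                   [3.0, 4.0, 0.0, 0.0]])              # made-up prediction

distances = euclidean_distance(y_true, y_pred)
print(distances)  # [0. 5.]
```

Note that a sigmoid output lies strictly between 0 and 1, so negative targets such as -0.0933 can never be matched exactly by the output layer as defined.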
With this model, then I load the data and try to train it.
x_train, y_train, x_val, y_val = load_data(DATASET_DIR)
# This load_data is my own implementation. It returns the images
# as tensors.
# ==> x_train[0].shape= (224, 224, 3)
#
# y_train and y_val look like this:
# ==> y_train[0]= [ 0.182 -0.0933 0.072 -0.0453]
#
# holding values in the [0, 1] interval for where the pixel
# is relative to the width/height of the image.
#
model_final.fit(x_train, y_train,
                batch_size=batch_size, epochs=5, shuffle=False,
                validation_data=(x_val, y_val))
Unfortunately, when I train this model, I get something like this:
Train on 45 samples, validate on 5 samples
Epoch 1/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 2/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 3/5
I'm at a loss about why my loss value is "nan". I must be doing something wrong, and I've tried to change everything - the loss function, the shape of the output... but I can't figure out what I'm doing wrong.
Any help would be appreciated!
UPDATE: it seems like the issue is in the way I load_data.
If I create the image data like this it fails and results in loss:nan
i = pil_image.open(img_filename)
img = image.load_img(img_filename, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = keras.applications.mobilenet.preprocess_input(x)
x_train = np.append(x_train, x, axis=0)
but if I do something trivial like this, 'fit' works just fine and computes real values for loss:
x_train = np.random.random((100, 224, 224, 3))
sigh I wonder what's happening...
UPDATE #2: I figured out what the issue was
Documenting this here in case it helps anybody.
The way to properly generate the input tensors for MobileNet is this one:
test_img=[]
for i in range(len(test)):
    temp_img=image.load_img(test_path+test['filename'][i],target_size=(224,224))
    temp_img=image.img_to_array(temp_img)
    test_img.append(temp_img)
test_img=np.array(test_img)
test_img=preprocess_input(test_img)
Notice how converting to a numpy.array and running preprocess_input happen on the whole batch of images. Doing it image by image (what I was doing before) did not seem to work.
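A NumPy-only sketch of the batch approach described above. The scaling x / 127.5 - 1 mirrors what MobileNet's preprocess_input does (mapping pixel values from [0, 255] to [-1, 1]); the helper name and the random stand-in images are assumptions for illustration. The final check is a cheap way to catch NaN or inf values in the inputs before calling fit.

```python
import numpy as np

def scale_like_mobilenet(batch):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return batch.astype("float32") / 127.5 - 1.0

# Stand-ins for images returned by img_to_array (224x224 RGB).
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(224, 224, 3)).astype("float32")
          for _ in range(3)]

batch = np.array(images)             # stack first: (3, 224, 224, 3)
batch = scale_like_mobilenet(batch)  # then preprocess the whole batch

assert np.isfinite(batch).all(), "input contains NaN or inf"
print(batch.shape, batch.min() >= -1.0, batch.max() <= 1.0)
```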
Hope this helps somebody someday.