I've been following Google's official TensorFlow guide and trying to build a simple neural network using Keras. But when it comes to training the model, it does not use the entire dataset (with 60000 entries) and instead uses only 1875 entries for training. Any possible fix?
import tensorflow as tf
from tensorflow import keras
import numpy as np
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0
class_names = ['T-shirt', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot']
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10)
Output:
Epoch 1/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3183 - accuracy: 0.8866
Epoch 2/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3169 - accuracy: 0.8873
Epoch 3/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3144 - accuracy: 0.8885
Epoch 4/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3130 - accuracy: 0.8885
Epoch 5/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3110 - accuracy: 0.8883
Epoch 6/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3090 - accuracy: 0.8888
Epoch 7/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3073 - accuracy: 0.8895
Epoch 8/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3057 - accuracy: 0.8900
Epoch 9/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3040 - accuracy: 0.8905
Epoch 10/10
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3025 - accuracy: 0.8915
<tensorflow.python.keras.callbacks.History at 0x7fbe0e5aebe0>
Here's the original google colab notebook where I've been working on this: https://colab.research.google.com/drive/1NdtzXHEpiNnelcMaJeEm6zmp34JMcN38
The number 1875 shown while fitting the model is not the number of training samples; it is the number of batches.
model.fit includes an optional argument batch_size, which, according to the documentation:
If unspecified, batch_size will default to 32.
So what happens here is that you fit with the default batch size of 32 (since you have not specified anything different), so the total number of batches for your data is
60000 / 32 = 1875
It does not train on 1875 samples.
Epoch 1/10
1875/1875 [===
1875 here is the number of steps, not samples. The fit method has an argument batch_size whose default value is 32, so 1875 * 32 = 60000. The implementation is correct.
If you train with batch_size=16, you will see 3750 steps instead of 1875, since 60000 / 16 = 3750.
Use batch_size=1 if you want all 60000 samples to appear as individual steps in the progress bar.
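As a quick check (a sketch reusing the model from the question), passing an explicit batch_size changes the reported step count exactly as described:
# Default batch_size=32: 60000 / 32 = 1875 steps per epoch.
model.fit(train_images, train_labels, epochs=10)
# batch_size=16: 60000 / 16 = 3750 steps per epoch.
model.fit(train_images, train_labels, epochs=10, batch_size=16)
Either way the model still sees all 60000 samples every epoch; only the number of progress-bar steps changes.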
I am new to Keras and have been practicing with resources from the web. Unfortunately, I cannot build a model without it throwing the following error:
ValueError: logits and labels must have the same shape, received ((None, 10) vs (None, 1)).
I have attempted the following:
# Imports implied by the snippet:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

DF = pd.read_csv("https://raw.githubusercontent.com/EpistasisLab/tpot/master/tutorials/MAGIC%20Gamma%20Telescope/MAGIC%20Gamma%20Telescope%20Data.csv")
X = DF.iloc[:,0:-1]
y = DF.iloc[:,-1]
yBin = np.array([1 if x == 'g' else 0 for x in y ])
scaler = StandardScaler()
X1 = scaler.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X1, yBin, test_size=0.25, random_state=2018)
print(X_train.__class__,X_test.__class__,y_train.__class__,y_test.__class__ )
model=Sequential()
model.add(Dense(6,activation="relu", input_shape=(10,)))
model.add(Dense(10,activation="softmax"))
model.build(input_shape=(None,1))
model.summary()
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x=X_train,
          y=y_train,
          epochs=600,
          validation_data=(X_test, y_test),
          verbose=1)
I have read that my model is likely wrong in terms of input parameters. What is the correct approach?
When I look at the shape of your data
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
I see that X is 10-dimensional and y is 1-dimensional.
Therefore, you need a 10-dimensional input
model.build(input_shape=(None,10))
and a 1-dimensional output in the last dense layer. Note that softmax over a single unit would always output 1, so use sigmoid here:
model.add(Dense(1,activation="sigmoid"))
The target variable yBin/y_train/y_test is a 1-D array (shape (None, 1) for a given batch).
Your logits come from the last Dense layer, which has 10 neurons with softmax activation. So it gives 10 outputs for each input, i.e. (batch_size, 10) for each batch, represented formally as (None, 10).
To resolve this particular shape mismatch, change the neuron count of the last dense layer to 1 and set its activation function to "sigmoid":
model.add(Dense(1,activation="sigmoid"))
As correctly mentioned by @MSS, you need to use a sigmoid activation function with 1 neuron in the last dense layer to match the logits with the labels (1, 0) of your dataset, which indicate the binary class.
Fixed code:
model=Sequential()
model.add(Dense(6,activation="relu", input_shape=(10,)))
model.add(Dense(1,activation="sigmoid"))
#model.build(input_shape=(None,1))
model.summary()
model.compile(optimizer='rmsprop',loss='binary_crossentropy',metrics=['accuracy'])
model.fit(x=X_train,y=y_train,epochs=10,validation_data=(X_test, y_test),verbose=1)
Output:
Epoch 1/10
446/446 [==============================] - 3s 4ms/step - loss: 0.5400 - accuracy: 0.7449 - val_loss: 0.4769 - val_accuracy: 0.7800
Epoch 2/10
446/446 [==============================] - 2s 4ms/step - loss: 0.4425 - accuracy: 0.7987 - val_loss: 0.4241 - val_accuracy: 0.8095
Epoch 3/10
446/446 [==============================] - 2s 3ms/step - loss: 0.4082 - accuracy: 0.8175 - val_loss: 0.4034 - val_accuracy: 0.8242
Epoch 4/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3934 - accuracy: 0.8286 - val_loss: 0.3927 - val_accuracy: 0.8313
Epoch 5/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3854 - accuracy: 0.8347 - val_loss: 0.3866 - val_accuracy: 0.8320
Epoch 6/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3800 - accuracy: 0.8397 - val_loss: 0.3827 - val_accuracy: 0.8364
Epoch 7/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3762 - accuracy: 0.8411 - val_loss: 0.3786 - val_accuracy: 0.8387
Epoch 8/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3726 - accuracy: 0.8432 - val_loss: 0.3764 - val_accuracy: 0.8404
Epoch 9/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3695 - accuracy: 0.8466 - val_loss: 0.3724 - val_accuracy: 0.8408
Epoch 10/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3665 - accuracy: 0.8478 - val_loss: 0.3698 - val_accuracy: 0.8454
<keras.callbacks.History at 0x7f68ca30f670>
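As an aside (not part of the answers above), the "logits" in the error message points to an equivalent fix: keep the last Dense layer linear and tell the loss to expect raw logits, which is numerically more stable than sigmoid plus plain binary_crossentropy. A sketch, assuming import tensorflow as tf:
model = Sequential()
model.add(Dense(6, activation="relu", input_shape=(10,)))
model.add(Dense(1))  # no activation: the output is a raw logit
model.compile(optimizer='rmsprop',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])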
I am working on a project with the Keras captcha OCR model. The model recognizes text captchas with a CTC-encoded output, combining a CNN and an RNN.
I am trying to see accuracy numbers in the training output. How can I get the accuracy and validation accuracy?
Here is the training code from the Keras model:
epochs = 100
early_stopping_patience = 10
# Add early stopping
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=early_stopping_patience, restore_best_weights=True
)
# Train the model
history = model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=epochs,
    callbacks=[early_stopping],
)
And this is the training output:
Epoch 1/100
59/59 [==============================] - 3s 53ms/step - loss: 21.5722 - val_loss: 16.3351
Epoch 2/100
59/59 [==============================] - 2s 27ms/step - loss: 16.3335 - val_loss: 16.3062
Epoch 3/100
59/59 [==============================] - 2s 27ms/step - loss: 16.3360 - val_loss: 16.3116
Epoch 4/100
59/59 [==============================] - 2s 27ms/step - loss: 16.3318 - val_loss: 16.3167
Epoch 5/100
Before calling model.fit, just specify the metric you want computed during training, in this case accuracy:
model.compile(optimizer=your_optimizer, loss=your_loss, metrics=['acc'])
Link to the documentation.
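For example (a sketch; your_optimizer and your_loss stand in for whatever the model already uses, and the captcha model's CTC-layer setup may need a custom metric instead of plain accuracy):
model.compile(optimizer=your_optimizer, loss=your_loss, metrics=['acc'])
history = model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=epochs,
    callbacks=[early_stopping],
)
# fit records one value per epoch under the metric's name:
acc = history.history['acc']
val_acc = history.history['val_acc']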
I have been stuck on this assignment for three days and have checked everything I could find on the internet, but the loss of my model will not go down. The model just makes random guesses on the validation dataset.
[Data source](https://www.kaggle.com/datamunge/sign-language-mnist)
Here are some things I have tried and verified do not work:
Increasing the batch size; the batch size seems irrelevant to the high loss and low accuracy.
Checking the format of the input data; I found nothing, everything seems to work properly.
Removing the image augmentation; the loss does not care.
Changing the optimizer; I have tried Adam, RMSprop, and SGD.
Adding more neurons and more training epochs; this only increases training accuracy, not validation accuracy.
Checking my environment; I have run other CNN sample codes and they worked as expected.
Here is my code and output.
import matplotlib.pyplot as plt
import csv
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from os import getcwd
import sys
def progressbar(it, prefix="", size=29, file=sys.stdout):
    # This def is made by: https://stackoverflow.com/users/1207193/iambr
    # it is the list you are going to iterate
    # prefix is the title of your progress bar
    # size is the length of your progress bar
    count = len(it)

    def show(j):
        x = int(size * j / count)
        file.write("%s[%s%s%s] %i/%i\r" %
                   (prefix, "=" * x, ">", "." * (size - x), j, count))
        file.flush()

    show(0)
    for i, item in enumerate(it):
        yield item
        show(i + 1)
    file.write("\n")
    file.flush()

def get_data(filename):
    with open(filename) as training_file:
        images = np.empty((0, 28, 28), dtype=float)
        labels = np.empty((0), dtype=float)
        # Your code starts here
        raw_file = np.loadtxt(training_file.readlines()[:-1],
                              dtype=float, skiprows=1, delimiter=',')
        for row in progressbar(raw_file, "Loading data: "):
            if len(row) == 785:
                labels = np.append(labels, row[0])
                image = np.reshape(row[1:785], (1, 28, 28))
                images = np.append(image, images, axis=0)
    print(f'read file:{filename} complete')
    return images, labels
# full data set
# path_sign_mnist_train = f'{getcwd()}/tmp2/sign_mnist_train.csv'
# path_sign_mnist_test = f'{getcwd()}/tmp2/sign_mnist_test.csv'
# reduce training set
path_sign_mnist_train = f'{getcwd()}/tmp2/sign_mnist_train_a.csv'
path_sign_mnist_test = f'{getcwd()}/tmp2/sign_mnist_test_a.csv'
training_images, training_labels = get_data(path_sign_mnist_train)
testing_images, testing_labels = get_data(path_sign_mnist_test)
training_images=training_images/255.
testing_images=testing_images/255.
# Keep these
print(training_images.shape)
print(training_labels.shape)
print(testing_images.shape)
print(testing_labels.shape)
print(testing_labels)
# Testing code
plt.imshow(training_images[1], interpolation='nearest')
plt.show()
print(training_labels[1])
train_datagen = ImageDataGenerator(
    featurewise_center=False,             # set input mean to 0 over the dataset
    samplewise_center=False,              # set each sample mean to 0
    featurewise_std_normalization=False,  # divide inputs by std of the dataset
    samplewise_std_normalization=False,   # divide each input by its std
    zca_whitening=False,                  # apply ZCA whitening
    rotation_range=14,                    # randomly rotate images in the range (degrees, 0 to 180)
    zoom_range=0.09,                      # randomly zoom image
    width_shift_range=0.14,               # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.14,              # randomly shift images vertically (fraction of total height)
    horizontal_flip=False,                # randomly flip images horizontally
    vertical_flip=False,                  # randomly flip images vertically
    brightness_range=(0.8, 1.0),          # brightness of image
    rescale=1. / 255.)
validation_datagen = ImageDataGenerator(rescale=1./255.)
training_images = np.reshape(training_images, (-1,28,28,1))
train_datagen.fit(training_images)
testing_images = np.reshape(testing_images,(-1,28,28,1))
training_labels=tf.keras.utils.to_categorical(training_labels,num_classes=25)
testing_labels=tf.keras.utils.to_categorical(testing_labels, num_classes=25)
batch_size = 16
train_generator = train_datagen.flow(
    training_images,
    training_labels, batch_size=batch_size)
validation_generator = validation_datagen.flow(
    testing_images,
    testing_labels, batch_size=batch_size)
# Keep These
print(training_images.shape)
print(testing_images.shape)
# Their output should be:
# (27455, 28, 28, 1)
# (7172, 28, 28, 1)
# Define the model
# Use no more than 2 Conv2D and 2 MaxPooling2D
model = tf.keras.models.Sequential([
    # Your Code Here
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(25, activation='softmax')
])
# Compile Model.
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.005),loss='categorical_crossentropy',metrics=['accuracy'])
model.summary()
# Train the Model
history = model.fit_generator(train_generator,
                              validation_data=validation_generator,
                              steps_per_epoch=len(training_images)//batch_size,
                              epochs=10,
                              validation_steps=len(testing_images)//batch_size
                              )
# model.evaluate(testing_images/255., testing_labels, verbose=0)
# Plot the chart for accuracy and loss on both training and validation
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'r', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'r', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
But the loss barely changes...
Epoch 1/10
WARNING:tensorflow:AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x0000026B4B18F948> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: unsupported operand type(s) for -: 'NoneType' and 'int'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2022-01-27 09:40:05.564400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2022-01-27 09:40:05.743540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-01-27 09:40:06.492580: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
430/437 [============================>.] - ETA: 0s - loss: 3.1891 - accuracy: 0.0461WARNING:tensorflow:AutoGraph could not transform <function Model.make_test_function.<locals>.test_function at 0x0000026B490A4F78> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: unsupported operand type(s) for -: 'NoneType' and 'int'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
437/437 [==============================] - 3s 7ms/step - loss: 3.1890 - accuracy: 0.0463 - val_loss: 3.2067 - val_accuracy: 0.0230
Epoch 2/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1828 - accuracy: 0.0425 - val_loss: 3.1952 - val_accuracy: 0.0333
Epoch 3/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1802 - accuracy: 0.0401 - val_loss: 3.2006 - val_accuracy: 0.0230
Epoch 4/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1789 - accuracy: 0.0434 - val_loss: 3.2012 - val_accuracy: 0.0348
Epoch 5/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1782 - accuracy: 0.0448 - val_loss: 3.2109 - val_accuracy: 0.0345
Epoch 6/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1784 - accuracy: 0.0454 - val_loss: 3.2056 - val_accuracy: 0.0230
Epoch 7/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1782 - accuracy: 0.0407 - val_loss: 3.2032 - val_accuracy: 0.0230
Epoch 8/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1780 - accuracy: 0.0391 - val_loss: 3.2080 - val_accuracy: 0.0230
Epoch 9/10
437/437 [==============================] - 3s 7ms/step - loss: 3.1775 - accuracy: 0.0417 - val_loss: 3.2033 - val_accuracy: 0.0230
Epoch 10/10
418/437 [===========================>..] - ETA: 0s - loss: 3.1773 - accuracy: 0.0460
Traceback (most recent call last):
The only problem with this network is that it is learning too fast.
If you lower the learning rate from 0.005 to 0.0005, the model works fine.
# Compile Model.
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.0005),
              loss='categorical_crossentropy', metrics=['accuracy'])
Do not learn too fast, or you will get stuck in a local minimum and never get out.
Epoch 2/10
437/437 [==============================] - 3s 6ms/step - loss: 2.5773 - accuracy: 0.2133 - val_loss: 2.2050 - val_accuracy: 0.3542
Epoch 3/10
437/437 [==============================] - 3s 6ms/step - loss: 2.1190 - accuracy: 0.3262 - val_loss: 1.6197 - val_accuracy: 0.5278
Epoch 4/10
437/437 [==============================] - 3s 7ms/step - loss: 1.7566 - accuracy: 0.4223 - val_loss: 1.3985 - val_accuracy: 0.5492
Epoch 5/10
437/437 [==============================] - 3s 6ms/step - loss: 1.5062 - accuracy: 0.4929 - val_loss: 1.1146 - val_accuracy: 0.7000
Epoch 6/10
437/437 [==============================] - 3s 6ms/step - loss: 1.3736 - accuracy: 0.5323 - val_loss: 1.0778 - val_accuracy: 0.6756
Epoch 7/10
437/437 [==============================] - 3s 6ms/step - loss: 1.2198 - accuracy: 0.5836 - val_loss: 0.8912 - val_accuracy: 0.7650
Epoch 8/10
437/437 [==============================] - 3s 6ms/step - loss: 1.1396 - accuracy: 0.6066 - val_loss: 0.8298 - val_accuracy: 0.7486
Epoch 9/10
437/437 [==============================] - 3s 6ms/step - loss: 1.1084 - accuracy: 0.6182 - val_loss: 0.9152 - val_accuracy: 0.6830
Epoch 10/10
437/437 [==============================] - 3s 6ms/step - loss: 1.0196 - accuracy: 0.6525 - val_loss: 0.8014 - val_accuracy: 0.7307
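A side note not in the original answer: if you would rather not hand-tune the rate, Keras can lower it automatically when validation loss stalls. A sketch using the built-in ReduceLROnPlateau callback with the question's generators:
# Halve the learning rate whenever val_loss has not improved for 2 epochs.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss', factor=0.5, patience=2, min_lr=1e-5)
history = model.fit_generator(train_generator,
                              validation_data=validation_generator,
                              steps_per_epoch=len(training_images)//batch_size,
                              epochs=10,
                              validation_steps=len(testing_images)//batch_size,
                              callbacks=[reduce_lr])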
By the way, the reading method is not very Pythonic or efficient: np.append copies the entire array on every iteration. This works better:
def get_data(filename):
    with open(filename) as training_file:
        raw_file = np.loadtxt(training_file.readlines()[:-1],
                              dtype=float, skiprows=1, delimiter=',')
    labels = np.array([i[0] for i in raw_file])
    images = np.array([i[1:785] for i in raw_file])
    images = images.reshape(-1, 28, 28)
    print(f'read file:{filename} complete')
    return images, labels
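Even simpler (a variant of the same idea, not in the original answer): np.loadtxt already returns a 2-D array, so plain slicing avoids the Python-level loops entirely:
def get_data(filename):
    # Each row is (label, 784 pixels); column 0 is the label.
    with open(filename) as training_file:
        raw_file = np.loadtxt(training_file.readlines()[:-1],
                              dtype=float, skiprows=1, delimiter=',')
    labels = raw_file[:, 0]
    images = raw_file[:, 1:].reshape(-1, 28, 28)
    print(f'read file:{filename} complete')
    return images, labels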
The value of val_acc does not change over the epochs.
Summary:
I'm using a pre-trained (ImageNet) VGG16 from Keras;
from keras.applications import VGG16
conv_base = VGG16(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
The dataset is from ISBI 2016 (ISIC): a set of 900 skin-lesion images used for binary classification (malignant or benign) for training and validation, plus 379 images for testing;
I use the top dense layers of VGG16 except the last one (which classifies over 1000 classes), and add a binary output with sigmoid activation;
conv_base.layers.pop() # Remove last one
conv_base.trainable = False
model = models.Sequential()
model.add(conv_base)
model.add(layers.Dense(1, activation='sigmoid'))
Unlock the dense layers by setting them to trainable;
Fetch the data, which lives in two different folders, one named "malignant" and the other "benign", inside the "training data" folder;
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
folder = 'ISBI2016_ISIC_Part3_Training_Data'
batch_size = 20
full_datagen = ImageDataGenerator(
    rescale=1./255,
    #rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    validation_split=0.2,  # 20% validation
    horizontal_flip=True)
train_generator = full_datagen.flow_from_directory(  # Found 721 images belonging to 2 classes.
    folder,
    target_size=(224, 224),
    batch_size=batch_size,
    subset='training',
    class_mode='binary')
validation_generator = full_datagen.flow_from_directory(  # Found 179 images belonging to 2 classes.
    folder,
    target_size=(224, 224),
    batch_size=batch_size,
    subset='validation',
    shuffle=False,
    class_mode='binary')
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=0.001),  # High learning rate
              metrics=['accuracy'])
history = model.fit_generator(
    train_generator,
    steps_per_epoch=721 // batch_size + 1,
    epochs=20,
    validation_data=validation_generator,
    validation_steps=180 // batch_size + 1,
)
Then I fine-tune it for 100 more epochs with a lower learning rate, setting the last convolutional layer to trainable.
I've tried many things such as:
Changing the optimizer (RMSprop, Adam and SGD);
Removing the top dense layers of the pre-trained VGG16 and adding mine;
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
Shuffle=True in validation_generator;
Changing batch size;
Varying the learning rate (0.001, 0.0001, 2e-5).
The results are similar to the following:
Epoch 1/100
37/37 [==============================] - 33s 900ms/step - loss: 0.6394 - acc: 0.7857 - val_loss: 0.6343 - val_acc: 0.8101
Epoch 2/100
37/37 [==============================] - 30s 819ms/step - loss: 0.6342 - acc: 0.8107 - val_loss: 0.6342 - val_acc: 0.8101
Epoch 3/100
37/37 [==============================] - 30s 822ms/step - loss: 0.6324 - acc: 0.8188 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 4/100
37/37 [==============================] - 31s 840ms/step - loss: 0.6346 - acc: 0.8080 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 5/100
37/37 [==============================] - 31s 833ms/step - loss: 0.6395 - acc: 0.7843 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 6/100
37/37 [==============================] - 31s 829ms/step - loss: 0.6334 - acc: 0.8134 - val_loss: 0.6340 - val_acc: 0.8101
Epoch 7/100
37/37 [==============================] - 31s 834ms/step - loss: 0.6334 - acc: 0.8134 - val_loss: 0.6340 - val_acc: 0.8101
Epoch 8/100
37/37 [==============================] - 31s 829ms/step - loss: 0.6342 - acc: 0.8093 - val_loss: 0.6339 - val_acc: 0.8101
Epoch 9/100
37/37 [==============================] - 31s 849ms/step - loss: 0.6330 - acc: 0.8147 - val_loss: 0.6339 - val_acc: 0.8101
Epoch 10/100
37/37 [==============================] - 30s 812ms/step - loss: 0.6332 - acc: 0.8134 - val_loss: 0.6338 - val_acc: 0.8101
Epoch 11/100
37/37 [==============================] - 31s 839ms/step - loss: 0.6338 - acc: 0.8107 - val_loss: 0.6338 - val_acc: 0.8101
Epoch 12/100
37/37 [==============================] - 30s 807ms/step - loss: 0.6334 - acc: 0.8120 - val_loss: 0.6337 - val_acc: 0.8101
Epoch 13/100
37/37 [==============================] - 32s 852ms/step - loss: 0.6334 - acc: 0.8120 - val_loss: 0.6337 - val_acc: 0.8101
Epoch 14/100
37/37 [==============================] - 31s 826ms/step - loss: 0.6330 - acc: 0.8134 - val_loss: 0.6336 - val_acc: 0.8101
Epoch 15/100
37/37 [==============================] - 32s 854ms/step - loss: 0.6335 - acc: 0.8107 - val_loss: 0.6336 - val_acc: 0.8101
And it goes on the same way, with constant val_acc = 0.8101.
When I use the test set after training finishes, the confusion matrix gives me 100% correct on benign lesions (304) and 0% on malignant, as follows:
Confusion Matrix
[[304 0]
[ 75 0]]
What could I be doing wrong?
Thank you.
VGG16 was trained on RGB centered data. Your ImageDataGenerator does not enable featurewise_center, however, so you're feeding your net with raw RGB data. The VGG convolutional base can't process this to provide any meaningful information, so your net ends up universally guessing the more common class.
In general, when you see this type of problem (your net exclusively guessing the most common class), it means that there's something wrong with your data, not with the net. It can be caused by a preprocessing step like this or by a significant portion of "poisoned" anomalous training data that actively harms the training process.
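One way to supply the centered input VGG16 expects (a sketch using Keras's own vgg16.preprocess_input instead of featurewise_center; not part of the original code):
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
# preprocess_input subtracts the ImageNet channel means VGG16 was trained
# with; it expects raw 0-255 pixels, so rescale=1./255 must be dropped.
full_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    validation_split=0.2,
    horizontal_flip=True)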
Here is my code:
AE_0 = Sequential()
encoder = Sequential([Dense(output_dim=100, input_dim=256, activation='sigmoid')])
decoder = Sequential([Dense(output_dim=256, input_dim=100, activation='linear')])
AE_0.add(AutoEncoder(encoder=encoder, decoder=decoder, output_reconstruction=True))
AE_0.compile(loss='mse', optimizer=SGD(lr=0.03, momentum=0.9, decay=0.001, nesterov=True))
AE_0.fit(X, X, batch_size=21, nb_epoch=500, show_accuracy=True)
X has shape (537621, 256). I'm trying to find a way to compress the vectors of size 256 to 100, then to 70, then to 50. I have done this in Lasagne, but in Keras it seems easier to work with autoencoders.
Here is the output:
Epoch 1/500
537621/537621 [==============================] - 27s - loss: 0.1339 - acc: 0.0036
Epoch 2/500
537621/537621 [==============================] - 32s - loss: 0.1339 - acc: 0.0036
Epoch 3/500
252336/537621 [=============>................] - ETA: 14s - loss: 0.1339 - acc: 0.0035
And it continues like this on and on..
It's now fixed on master :) Opening an issue is sometimes the best choice.
https://github.com/fchollet/keras/issues/1604
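For reference, the AutoEncoder container used in that snippet was later removed from Keras. In modern tf.keras the same 256 → 100 → 256 stage can be written directly (a sketch, not the fix referenced in the issue; the deeper 100 → 70 and 70 → 50 stages follow the same pattern):
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

encoder = Sequential([Dense(100, activation='sigmoid', input_shape=(256,))])
decoder = Sequential([Dense(256, activation='linear')])
autoencoder = Sequential([encoder, decoder])
autoencoder.compile(loss='mse',
                    optimizer=tf.keras.optimizers.SGD(
                        learning_rate=0.03, momentum=0.9, nesterov=True))
# Train to reconstruct the input; encoder.predict(X) then yields
# the 100-dimensional codes.
autoencoder.fit(X, X, batch_size=21, epochs=500)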