I saved the training history like this:
history = model.fit(train_generator, epochs=epochs, steps_per_epoch=train_steps,
                    verbose=1, callbacks=callbacks, validation_data=val_generator,
                    validation_steps=val_steps, batch_size=16)
with open('history_epochs.pkl', 'wb') as f:
    dump(history.history, f)
Can I use this history file to continue training from the last epoch? If so, how?
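The history file holds only the per-epoch metrics, not the weights, so it cannot restore the model by itself. The steps below apply to any deep learning library: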
Build model
Train model.
Save model (should be saving parameters/weights as well).
Load model from the saved file (any time any where).
Continue with more training.
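To make the point about the history file concrete: the pickle written by `dump(history.history, f)` is just a dictionary of per-epoch metric lists. The sketch below uses made-up metric values and a temporary file (both are assumptions for illustration) to show what can be recovered from it: the number of completed epochs, which could be passed as the `initial_epoch` argument of `model.fit` once the model itself has been reloaded from its own file.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for history.history after 3 epochs (values are made up)
saved_history = {
    "loss": [0.83, 0.49, 0.41],
    "val_loss": [3.46, 1.22, 0.37],
}

# Write and re-read the file the same way the question does
path = os.path.join(tempfile.mkdtemp(), "history_epochs.pkl")
with open(path, "wb") as f:
    pickle.dump(saved_history, f)
with open(path, "rb") as f:
    loaded_history = pickle.load(f)

# The file contains metric names and one value per epoch, nothing else
metric_names = sorted(loaded_history)
completed_epochs = len(loaded_history["loss"])
print(metric_names, completed_epochs)

# After reloading the model from its own file, training could resume with e.g.
# model.fit(..., initial_epoch=completed_epochs, epochs=completed_epochs + extra_epochs)
# where extra_epochs is a hypothetical name for the number of additional epochs.
```

Note that `epochs` in `model.fit` is the index of the final epoch, not the number of additional epochs, which is why the comment adds the two values.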
You can save your model as a pickle file, load it later, and continue training:
Create your model
Train your model
Save your model as a pickle file
Code for the above steps:
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import joblib
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat','Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
fig, axes = plt.subplots(2,5,figsize=(15,6))
for idx, axe in enumerate(axes.flatten()):
    axe.axis('off')
    idx_img = np.argwhere(y_train==idx)[0][0]
    axe.imshow(X_train[idx_img], cmap=plt.cm.binary)
    axe.set_title(class_names[y_train[idx_img]])
X_train = X_train.astype('float32') / 255.0
X_train = tf.expand_dims(X_train, axis=-1)
X_test = X_test.astype('float32') / 255.0
X_test = tf.expand_dims(X_test, axis=-1)
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(X_train.shape[1], X_train.shape[2], 1)))
model.add(tf.keras.layers.Conv2D(128, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Conv2D(64, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Conv2D(128, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Dense(10, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train, batch_size=256, epochs=3, verbose=1, validation_split=.2)
model.evaluate(X_test, y_test, verbose=1)
joblib.dump(model, 'model.pkl')
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 128) 1280
batch_normalization (BatchN (None, 26, 26, 128) 512
ormalization)
dropout (Dropout) (None, 26, 26, 128) 0
conv2d_1 (Conv2D) (None, 24, 24, 64) 73792
batch_normalization_1 (Batc (None, 24, 24, 64) 256
hNormalization)
dropout_1 (Dropout) (None, 24, 24, 64) 0
conv2d_2 (Conv2D) (None, 22, 22, 128) 73856
batch_normalization_2 (Batc (None, 22, 22, 128) 512
hNormalization)
dropout_2 (Dropout) (None, 22, 22, 128) 0
flatten (Flatten) (None, 61952) 0
dense (Dense) (None, 512) 31719936
dropout_3 (Dropout) (None, 512) 0
dense_1 (Dense) (None, 128) 65664
dropout_4 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 10) 1290
=================================================================
Total params: 31,937,098
Trainable params: 31,936,458
Non-trainable params: 640
_________________________________________________________________
Epoch 1/3
188/188 [==============================] - 19s 81ms/step - loss: 0.8264 - accuracy: 0.7398 - val_loss: 3.4644 - val_accuracy: 0.1245
Epoch 2/3
188/188 [==============================] - 14s 75ms/step - loss: 0.4896 - accuracy: 0.8283 - val_loss: 1.2240 - val_accuracy: 0.5802
Epoch 3/3
188/188 [==============================] - 14s 77ms/step - loss: 0.4055 - accuracy: 0.8544 - val_loss: 0.3711 - val_accuracy: 0.8675
313/313 [==============================] - 2s 5ms/step - loss: 0.3850 - accuracy: 0.8591
[0.3849639296531677, 0.8590999841690063]
INFO:tensorflow:Assets written to: ram://****/assets
['model.pkl']
Load your model
Continue Training
Code for the above steps:
model = joblib.load("/content/model.pkl")
model.fit(X_train, y_train, batch_size=256, epochs=2, verbose=1, validation_split=.2)
model.evaluate(X_test, y_test, verbose=1)
Output:
Epoch 1/2
188/188 [==============================] - 17s 84ms/step - loss: 0.4414 - accuracy: 0.8496 - val_loss: 0.3449 - val_accuracy: 0.8697
Epoch 2/2
188/188 [==============================] - 15s 82ms/step - loss: 0.3704 - accuracy: 0.8708 - val_loss: 0.2884 - val_accuracy: 0.8965
313/313 [==============================] - 1s 5ms/step - loss: 0.3114 - accuracy: 0.8938
[0.31136029958724976, 0.8938000202178955]
I have a TensorFlow model like this:
I would like to know the values of the layer marked in red (5 float values) for a specific input, to check how the model responds at this layer (an attention layer). I need these values to verify whether my attention layer is extracting values correctly.
As the model is an end-to-end model, I am unsure how I can extract values of an internal layer for specific input. Can anyone please help?
You can write a custom Callback class, pass your input to the layer you are interested in, and inspect its output:
import tensorflow as tf
import numpy as np
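class CustomCallback(tf.keras.callbacks.Callback):
    def __init__(self):
        super().__init__()
        # Fixed probe input; its width (10) must match the input size of the inspected layer
        self.data = np.random.rand(1,10)
    def on_epoch_end(self, epoch, logs=None):
        # layers[6] is the Dense(5) layer of the model defined below
        dns_layer = self.model.layers[6]
        outputs = dns_layer(self.data)
        tf.print(f'\n input: {self.data}')
        tf.print(f'\n output: {outputs}')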
x_train = tf.random.normal((10, 32, 32))
y_train = tf.random.uniform((10, 1), maxval=10)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(256, input_shape=(x_train.shape[1], x_train.shape[2]), return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.LSTM(256))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(5, activation='softmax'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss = tf.keras.losses.SparseCategoricalCrossentropy(False))
model.summary()
for layer in model.layers:
    print(layer)
model.fit(x_train, y_train , epochs=3, callbacks=[CustomCallback()], batch_size=32)
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 32, 256) 295936
dropout (Dropout) (None, 32, 256) 0
lstm_1 (LSTM) (None, 256) 525312
dropout_1 (Dropout) (None, 256) 0
dense (Dense) (None, 10) 2570
dropout_2 (Dropout) (None, 10) 0
dense_1 (Dense) (None, 5) 55
dropout_3 (Dropout) (None, 5) 0
dense_2 (Dense) (None, 10) 60
=================================================================
Total params: 823,933
Trainable params: 823,933
Non-trainable params: 0
_________________________________________________________________
<keras.layers.recurrent_v2.LSTM object at 0x7f6e2163dbd0>
<keras.layers.core.dropout.Dropout object at 0x7f6da1d2efd0>
<keras.layers.recurrent_v2.LSTM object at 0x7f6d9dfe0a50>
<keras.layers.core.dropout.Dropout object at 0x7f6d9de1ec90>
<keras.layers.core.dense.Dense object at 0x7f6d9de04dd0>
<keras.layers.core.dropout.Dropout object at 0x7f6d9dd549d0>
<keras.layers.core.dense.Dense object at 0x7f6d9dd8ec90>
<keras.layers.core.dropout.Dropout object at 0x7f6d9dedd650>
<keras.layers.core.dense.Dense object at 0x7f6d9ddc2ed0>
Epoch 1/3
1/1 [==============================] - ETA: 0s - loss: 2.4188
input: [[0.91498145 0.98430978 0.22720893 0.76032816 0.78405846 0.72664182
0.7772921 0.9851892 0.41715033 0.21014543]]
output: [[0.5767021 0.04140956 0.1909151 0.06737834 0.12359484]]
1/1 [==============================] - 12s 12s/step - loss: 2.4188
Epoch 2/3
1/1 [==============================] - ETA: 0s - loss: 2.4111
input: [[0.91498145 0.98430978 0.22720893 0.76032816 0.78405846 0.72664182
0.7772921 0.9851892 0.41715033 0.21014543]]
output: [[0.5780218 0.04101932 0.18909878 0.06769065 0.12416941]]
1/1 [==============================] - 0s 376ms/step - loss: 2.4111
Epoch 3/3
1/1 [==============================] - ETA: 0s - loss: 2.3978
input: [[0.91498145 0.98430978 0.22720893 0.76032816 0.78405846 0.72664182
0.7772921 0.9851892 0.41715033 0.21014543]]
output: [[0.579072 0.04067017 0.1874026 0.0679936 0.12486164]]
1/1 [==============================] - 0s 458ms/step - loss: 2.3978
I'm trying to make this model work. Initially x.shape is (6703, 56) and y is a binary column with shape (6703,). Then I run
y = y.to_numpy()
y = y.astype("float32")
y = tf.keras.utils.to_categorical(y, 2)
and now y.shape is (6703, 2). I run
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.2, random_state=42)
and now
X_train shape is (5362, 56)
Y_train shape is (5362, 2)
X_test shape is (1341, 56)
Y_test shape is (1341, 2)
Then I build the model:
model = tf.keras.models.Sequential(name="3layers")
model.add(keras.layers.Dense(N_HIDDEN,
input_shape=(len(X_train[0]),),
name="dense1",
activation="relu"))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(N_HIDDEN,
name="dense2",
activation="relu"))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(NB_CLASSES,
name="dense3",
activation="softmax"))
model.summary()
model.compile(optimizer="SGD", #SGD adam
loss="categorical_crossentropy",
metrics=["accuracy"])
model.fit(X_train, Y_train,
batch_size=BATCH_SIZE,
epochs=EPOCHS,
verbose=VERBOSE,
validation_split=VALIDATION_SPLIT)
test_loss, test_acc = model.evaluate(X_test, Y_test)
The summary is what I expect:
dense1 (Dense) (None, 64) 3648
dropout_18 (Dropout) (None, 64) 0
dense2 (Dense) (None, 64) 4160
dropout_19 (Dropout) (None, 64) 0
dense3 (Dense) (None, 2) 130
but the output is
Epoch 1/5
429/429 [==============================] - 1s 1ms/step - loss: nan - accuracy: 0.5141 - val_loss: nan - val_accuracy: 0.4884
Epoch 2/5
429/429 [==============================] - 0s 1ms/step - loss: nan - accuracy: 0.5143 - val_loss: nan - val_accuracy: 0.4884
Epoch 3/5
429/429 [==============================] - 0s 987us/step - loss: nan - accuracy: 0.5143 - val_loss: nan - val_accuracy: 0.4884
I've tried changing many parameters, I'm stuck.
I found what it was. There were some "None" values in the x matrix that caused the problem. After removing them, the model started reporting a numeric loss. The accuracy is still very poor, but that is a separate problem to solve.
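A check like the following could surface such values before training. It is a minimal sketch in plain Python, assuming the features are available as a list of rows; `find_bad_rows` and the sample data are hypothetical, made up for illustration.

```python
import math

def find_bad_rows(rows):
    """Return indices of rows that contain None or NaN."""
    bad = []
    for i, row in enumerate(rows):
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in row):
            bad.append(i)
    return bad

# Made-up feature matrix with one None and one NaN
x = [
    [0.1, 0.2, 0.3],
    [0.4, None, 0.6],
    [0.7, 0.8, float("nan")],
    [1.0, 1.1, 1.2],
]
y = [0, 1, 0, 1]

bad = set(find_bad_rows(x))
# Drop the bad rows from the features and the labels together
x_clean = [row for i, row in enumerate(x) if i not in bad]
y_clean = [label for i, label in enumerate(y) if i not in bad]
print(sorted(bad), len(x_clean))
```

Dropping the matching labels at the same time keeps x and y aligned, which matters because the split into train and test sets happens afterwards.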
The validation accuracy of my 1D CNN is stuck on 0.5 and that's because I'm always getting the same prediction out of a balanced data set. At the same time my training accuracy keeps increasing and the loss decreasing as intended.
Strangely, if I do model.evaluate() on my training set (that has close to 1 accuracy in the last epoch), the accuracy will also be 0.5. How can the accuracy here differ so much from the training accuracy of the last epoch? I've also tried with a batch size of 1 for both training and evaluating and the problem persists.
Well, I've been searching for different solutions for quite some time but still no luck. Possible problems I've already looked into:
My data set is properly balanced and shuffled;
My labels are correct;
Tried adding fully connected layers;
Tried adding/removing dropout from the fully connected layers;
Tried the same architecture, but with the last layer with 1 neuron and sigmoid activation;
Tried changing the learning rates (went down to 0.0001 but still the same problem).
Here's my code:
import pathlib
import numpy as np
import ipynb.fs.defs.preprocessDataset as preprocessDataset
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import Input
from tensorflow.keras.layers import Conv1D, BatchNormalization, Activation, MaxPooling1D, Flatten, Dropout, Dense
from tensorflow.keras.optimizers import SGD
main_folder = pathlib.Path.cwd().parent
datasetsFolder=f'{main_folder}\\datasets'
trainDataset = preprocessDataset.loadDataset('DatasetTime_Sg12p5_Ov75_Train',datasetsFolder)
testDataset = preprocessDataset.loadDataset('DatasetTime_Sg12p5_Ov75_Test',datasetsFolder)
X_train,Y_train,Names_train=trainDataset[0],trainDataset[1],trainDataset[2]
X_test,Y_test,Names_test=testDataset[0],testDataset[1],testDataset[2]
model = Sequential()
model.add(Input(shape=X_train.shape[1:]))
model.add(Conv1D(16, 61, strides=1, padding="same"))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(2, strides=2, padding="valid"))
model.add(Conv1D(32, 3, strides=1, padding="same"))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(2, strides=2, padding="valid"))
model.add(Conv1D(64, 3, strides=1, padding="same"))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(2, strides=2, padding="valid"))
model.add(Conv1D(64, 3, strides=1, padding="same"))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(2, strides=2, padding="valid"))
model.add(Conv1D(64, 3, strides=1, padding="same"))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(200))
model.add(Activation('relu'))
model.add(Dense(2))
model.add(Activation('softmax'))
opt = SGD(learning_rate=0.01)
model.compile(loss='binary_crossentropy',optimizer=opt,metrics=['accuracy'])
model.summary()
model.fit(X_train,Y_train,epochs=10,shuffle=False,validation_data=(X_test, Y_test))
model.evaluate(X_train,Y_train)
Here's model.fit():
model.fit(X_train,Y_train,epochs=10,shuffle=False,validation_data=(X_test, Y_test))
Epoch 1/10
914/914 [==============================] - 277s 300ms/step - loss: 0.6405 - accuracy: 0.6543 - val_loss: 7.9835 - val_accuracy: 0.5000
Epoch 2/10
914/914 [==============================] - 270s 295ms/step - loss: 0.3997 - accuracy: 0.8204 - val_loss: 19.8981 - val_accuracy: 0.5000
Epoch 3/10
914/914 [==============================] - 273s 298ms/step - loss: 0.2976 - accuracy: 0.8730 - val_loss: 1.9558 - val_accuracy: 0.5002
Epoch 4/10
914/914 [==============================] - 278s 304ms/step - loss: 0.2897 - accuracy: 0.8776 - val_loss: 20.2678 - val_accuracy: 0.5000
Epoch 5/10
914/914 [==============================] - 277s 303ms/step - loss: 0.2459 - accuracy: 0.8991 - val_loss: 5.4945 - val_accuracy: 0.5000
Epoch 6/10
914/914 [==============================] - 268s 294ms/step - loss: 0.2008 - accuracy: 0.9181 - val_loss: 32.4579 - val_accuracy: 0.5000
Epoch 7/10
914/914 [==============================] - 271s 297ms/step - loss: 0.1695 - accuracy: 0.9317 - val_loss: 14.9538 - val_accuracy: 0.5000
Epoch 8/10
914/914 [==============================] - 276s 302ms/step - loss: 0.1423 - accuracy: 0.9452 - val_loss: 1.4420 - val_accuracy: 0.4988
Epoch 9/10
914/914 [==============================] - 266s 291ms/step - loss: 0.1261 - accuracy: 0.9497 - val_loss: 4.3830 - val_accuracy: 0.5005
Epoch 10/10
914/914 [==============================] - 272s 297ms/step - loss: 0.1142 - accuracy: 0.9548 - val_loss: 1.6054 - val_accuracy: 0.5009
Here's model.evaluate():
model.evaluate(X_train,Y_train)
914/914 [==============================] - 35s 37ms/step - loss: 1.7588 - accuracy: 0.5009
Here's model.summary():
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d (Conv1D) (None, 4096, 16) 992
_________________________________________________________________
batch_normalization (BatchNo (None, 4096, 16) 64
_________________________________________________________________
activation (Activation) (None, 4096, 16) 0
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 2048, 16) 0
_________________________________________________________________
conv1d_1 (Conv1D) (None, 2048, 32) 1568
_________________________________________________________________
batch_normalization_1 (Batch (None, 2048, 32) 128
_________________________________________________________________
activation_1 (Activation) (None, 2048, 32) 0
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 1024, 32) 0
_________________________________________________________________
conv1d_2 (Conv1D) (None, 1024, 64) 6208
_________________________________________________________________
batch_normalization_2 (Batch (None, 1024, 64) 256
_________________________________________________________________
activation_2 (Activation) (None, 1024, 64) 0
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 512, 64) 0
_________________________________________________________________
conv1d_3 (Conv1D) (None, 512, 64) 12352
_________________________________________________________________
batch_normalization_3 (Batch (None, 512, 64) 256
_________________________________________________________________
activation_3 (Activation) (None, 512, 64) 0
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 256, 64) 0
_________________________________________________________________
conv1d_4 (Conv1D) (None, 256, 64) 12352
_________________________________________________________________
batch_normalization_4 (Batch (None, 256, 64) 256
_________________________________________________________________
activation_4 (Activation) (None, 256, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 16384) 0
_________________________________________________________________
dropout (Dropout) (None, 16384) 0
_________________________________________________________________
dense (Dense) (None, 200) 3277000
_________________________________________________________________
activation_5 (Activation) (None, 200) 0
_________________________________________________________________
dense_1 (Dense) (None, 2) 402
_________________________________________________________________
activation_6 (Activation) (None, 2) 0
=================================================================
Total params: 3,311,834
Trainable params: 3,311,354
Non-trainable params: 480
_________________________________________________________________
... also tried with sigmoid but the issue persists ...
You don't want to be "trying" out activation functions or loss functions for a well-defined problem statement. It seems you are mixing up a single-label multi-class and a multi-label multi-class architecture.
Your output layer has 2 units with softmax activation, which is the right setup for a single-label, two-class problem. However, you pair it with binary_crossentropy, which in a multi-class setting only makes sense for multi-label problems.
You would want to use categorical_crossentropy instead. I would have suggested focal loss if there were class imbalance, but you seem to have a 50/50 class split, so that's not necessary.
Remember that the metric behind 'accuracy' is chosen based on the loss being used! Check the different classes here. With binary_crossentropy the accuracy reported is BinaryAccuracy, while with categorical_crossentropy it is CategoricalAccuracy.
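To illustrate why the two accuracy metrics can disagree, here is a small plain-Python sketch of both definitions as I understand them: binary accuracy thresholds every output at 0.5 and compares element-wise, while categorical accuracy compares the argmax of the prediction with the argmax of the label. The function names and the three-class example are made up for illustration; with only two softmax outputs the two metrics usually agree, so three classes show the gap more clearly.

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual outputs on the correct side of the threshold."""
    hits = total = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            hits += int((p > threshold) == (t == 1))
            total += 1
    return hits / total

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose highest-scoring class matches the label."""
    hits = 0
    for t_row, p_row in zip(y_true, y_pred):
        hits += int(t_row.index(max(t_row)) == p_row.index(max(p_row)))
    return hits / len(y_true)

y_true = [[0, 1, 0]]
y_pred = [[0.40, 0.35, 0.25]]  # wrong class wins, but no output exceeds 0.5

print(binary_accuracy(y_true, y_pred))       # 2 of 3 outputs match
print(categorical_accuracy(y_true, y_pred))  # the sample is misclassified
```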
Check this chart for details on what to use in what type of problem statement.
Other than that, there is a bottleneck in your network at flatten() and Dense(). The number of trainable parameters is quite high relative to other layers. I would advise using another CNN layer to bring the number of filters to say 128 and the size of sequence even smaller. And reduce the number of neurons for that Dense layer as well.
About 99% (3,277,000/3,311,354) of all of your trainable parameters sit between the Flatten and Dense layers! Not a great architectural choice!
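The share quoted above can be checked from the layer shapes in the summary. The sketch below recomputes the parameter counts with the standard formulas for Conv1D and Dense layers; the helper names are made up, and the single input channel is inferred from the first layer's 992 parameters rather than stated in the question.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One kernel per filter plus one bias per filter
    return kernel_size * in_channels * filters + filters

def dense_params(inputs, units):
    # Weight matrix plus one bias per unit
    return inputs * units + units

conv_total = (
    conv1d_params(61, 1, 16)    # conv1d
    + conv1d_params(3, 16, 32)  # conv1d_1
    + conv1d_params(3, 32, 64)  # conv1d_2
    + conv1d_params(3, 64, 64)  # conv1d_3
    + conv1d_params(3, 64, 64)  # conv1d_4
)
flatten_size = 256 * 64                      # sequence length x filters
big_dense = dense_params(flatten_size, 200)  # dense
trainable_total = 3_311_354                  # from the summary

share = big_dense / trainable_total
print(conv_total, big_dense, f"{share:.2%}")
```

All five convolutional layers together hold far fewer parameters than the single Dense(200) layer, which is the bottleneck described above.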
Outside the above points, the model results are totally dependent on your data itself. I wouldn't be able to help more without knowledge of the data.
The solution to my problem was to use Batch Renormalization: BatchNormalization(renorm=True). In addition, normalizing the inputs helped a lot in improving the overall performance of the neural network.
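Normalizing the inputs can be done in several ways; the self-answer does not say which one was used. One common option is per-feature standardization with statistics computed on the training set only, sketched here in plain Python with made-up numbers and a hypothetical `standardize` helper.

```python
import statistics

def standardize(values, mean, std):
    """Shift and scale values with precomputed statistics."""
    return [(v - mean) / std for v in values]

# Made-up values of a single feature
train_feature = [2.0, 4.0, 6.0, 8.0]
test_feature = [5.0, 9.0]

# Statistics come from the training data only
mean = statistics.fmean(train_feature)
std = statistics.pstdev(train_feature)

train_scaled = standardize(train_feature, mean, std)
test_scaled = standardize(test_feature, mean, std)  # reuse the training statistics
print(train_scaled, test_scaled)
```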
When I try to compute a confusion matrix for a ConvNet, I keep getting the same error.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image
from sklearn.metrics import classification_report, confusion_matrix
img_width, img_height = 150, 150
train_data_dir = "train"
validation_data_dir = "test"
nb_train_samples = 2000
nb_validation_samples = 400
epochs = 50
batch_size = 40 #16
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
train_datagen = ImageDataGenerator(
rescale = 1. / 255,
shear_range = 0.2,
zoom_range= 0.2,
horizontal_flip= True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = test_datagen.flow_from_directory(
train_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
Applying CNN Layers ...
model.compile(loss= 'binary_crossentropy',
optimizer= 'rmsprop',
metrics= ['accuracy'] )
model.fit_generator(
train_generator,
steps_per_epoch= nb_train_samples // batch_size,
epochs= epochs,
validation_data= validation_generator,
validation_steps= nb_validation_samples // batch_size)
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size+1)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
I get the error below and don't know how to resolve it:
ValueError: Found input variables with inconsistent numbers of samples: [400, 440]
I was able to reproduce your error using the Dogs_Vs_Cats dataset, with 2000 samples in the train directory and 400 samples in the validation directory.
Change the model.predict_generator call from
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size+1)
to
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size)
This resolves the ValueError: Found input variables with inconsistent numbers of samples: [400, 440]
The complete code is below:
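The numbers in the error message follow directly from the step count. With 400 validation images and a batch size of 40, the generator yields exactly 10 batches per pass; asking for one extra step makes it wrap around and produce a second copy of the first batch. A short arithmetic sketch (using `math.ceil` as one way, not the only way, to get a step count that also works when the sample count is not a multiple of the batch size):

```python
import math

nb_validation_samples = 400
batch_size = 40

steps_with_extra = nb_validation_samples // batch_size + 1
predictions_with_extra = steps_with_extra * batch_size  # generator wraps around

steps_exact = math.ceil(nb_validation_samples / batch_size)
predictions_exact = min(steps_exact * batch_size, nb_validation_samples)

print(steps_with_extra, predictions_with_extra)  # 11 steps -> 440 predictions
print(steps_exact, predictions_exact)            # 10 steps -> 400 predictions
```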
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image
from sklearn.metrics import classification_report, confusion_matrix
from google.colab import drive
drive.mount('/content/drive')
train_data_dir = '/content/drive/My Drive/Dogs_Vs_Cats/train'
validation_data_dir = '/content/drive/My Drive/Dogs_Vs_Cats/validation'
img_width, img_height = 150, 150
nb_train_samples = 2000
nb_validation_samples = 400
epochs = 10
batch_size = 40 #16
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
train_datagen = ImageDataGenerator(
rescale = 1. / 255,
shear_range = 0.2,
zoom_range= 0.2,
horizontal_flip= True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = test_datagen.flow_from_directory(
train_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
model = Sequential()
model.add(Conv2D(32, (3, 3), strides = (1, 1), input_shape = input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), strides = (1, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(loss = 'binary_crossentropy',
optimizer = 'rmsprop',
metrics = ['accuracy'])
model.fit_generator(
train_generator,
steps_per_epoch= nb_train_samples // batch_size,
epochs= epochs,
validation_data= validation_generator,
validation_steps= nb_validation_samples // batch_size)
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
Output:
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Found 2000 images belonging to 2 classes.
Found 400 images belonging to 2 classes.
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_9 (Conv2D) (None, 148, 148, 32) 896
_________________________________________________________________
activation_9 (Activation) (None, 148, 148, 32) 0
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 74, 74, 32) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 72, 72, 64) 18496
_________________________________________________________________
activation_10 (Activation) (None, 72, 72, 64) 0
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 36, 36, 64) 0
_________________________________________________________________
flatten_5 (Flatten) (None, 82944) 0
_________________________________________________________________
dense_9 (Dense) (None, 64) 5308480
_________________________________________________________________
dropout_5 (Dropout) (None, 64) 0
_________________________________________________________________
dense_10 (Dense) (None, 1) 65
=================================================================
Total params: 5,327,937
Trainable params: 5,327,937
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
50/50 [==============================] - 12s 233ms/step - loss: 0.9345 - accuracy: 0.5375 - val_loss: 0.6303 - val_accuracy: 0.5225
Epoch 2/10
50/50 [==============================] - 11s 226ms/step - loss: 0.6745 - accuracy: 0.5965 - val_loss: 0.6094 - val_accuracy: 0.6725
Epoch 3/10
50/50 [==============================] - 11s 223ms/step - loss: 0.6196 - accuracy: 0.6605 - val_loss: 0.5694 - val_accuracy: 0.7150
Epoch 4/10
50/50 [==============================] - 11s 223ms/step - loss: 0.5501 - accuracy: 0.7285 - val_loss: 0.6216 - val_accuracy: 0.7225
Epoch 5/10
50/50 [==============================] - 11s 221ms/step - loss: 0.4794 - accuracy: 0.7790 - val_loss: 0.6268 - val_accuracy: 0.6025
Epoch 6/10
50/50 [==============================] - 11s 226ms/step - loss: 0.4038 - accuracy: 0.8195 - val_loss: 0.4842 - val_accuracy: 0.6975
Epoch 7/10
50/50 [==============================] - 11s 222ms/step - loss: 0.3207 - accuracy: 0.8595 - val_loss: 0.5600 - val_accuracy: 0.7325
Epoch 8/10
50/50 [==============================] - 13s 257ms/step - loss: 0.2574 - accuracy: 0.8920 - val_loss: 0.9705 - val_accuracy: 0.7525
Epoch 9/10
50/50 [==============================] - 13s 252ms/step - loss: 0.2049 - accuracy: 0.9235 - val_loss: 0.7311 - val_accuracy: 0.7475
Epoch 10/10
50/50 [==============================] - 13s 251ms/step - loss: 0.1448 - accuracy: 0.9515 - val_loss: 1.0541 - val_accuracy: 0.7150
Confusion Matrix
[[200 0]
[200 0]]
I hope this answers your question. If not, please share the complete traceback and code so I can help debug it.
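A note on the confusion matrix shown above: every sample lands in the first column. That is what `np.argmax(Y_pred, axis=1)` produces when the model has a single sigmoid output, because the argmax over a one-element row is always 0. Thresholding the probability is the usual alternative. The sketch below mimics this in plain Python with made-up probabilities; it does not call Keras. Separately, `flow_from_directory` shuffles by default, so the order of `validation_generator.classes` may not match the order of the predictions unless the validation generator is created with `shuffle=False`.

```python
# Made-up sigmoid outputs, shaped like predictions from Dense(1, activation='sigmoid')
Y_pred = [[0.91], [0.12], [0.67], [0.03]]

def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])

# argmax over a single-column row can only ever return 0
labels_argmax = [argmax(row) for row in Y_pred]

# Thresholding at 0.5 recovers both classes
labels_threshold = [int(row[0] > 0.5) for row in Y_pred]

print(labels_argmax)     # [0, 0, 0, 0]
print(labels_threshold)  # [1, 0, 1, 0]
```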
I am new to deep learning and have been trying to convert a Keras Sequential model to the functional API, training on the CIFAR10 image dataset, but I have been having some difficulty. The converted model looks the same except for the input layer, yet the Sequential model reaches an average accuracy of about 70% and my functional model only about 10%. I would really appreciate help figuring out what is going wrong. Here is my functional code:
import tensorflow as tf
from tensorflow import keras
from keras import datasets, layers, models
from keras.models import Model, Input, Sequential
import matplotlib.pyplot as plt
Download and prepare:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
input_shape = train_images[0,:,:,:].shape
Create model:
input = layers.Input(shape=input_shape)
x = layers.Conv2D(32, (3, 3), activation='relu',padding='valid')(input)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dense(10)(x)
model = Model(input, x, name='Functional')
Compile and train:
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
Here is a link to the original Sequential CNN, which is a Google Colaboratory notebook. I would really appreciate any help in understanding and fixing what is going wrong. Thank you in advance.
There seem to be some issues with the SparseCategoricalCrossentropy loss.
Check this: https://github.com/tensorflow/tensorflow/issues/38632
The following model gives good accuracy:
import tensorflow as tf
from tensorflow import keras
from keras import datasets, layers, models
from keras.models import Model, Input, Sequential
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
train_labels, test_labels = tf.keras.utils.to_categorical(train_labels, 10) , tf.keras.utils.to_categorical(test_labels, 10)
input_shape = train_images[0,:,:,:].shape
input = layers.Input(shape=input_shape)
x = layers.Conv2D(32, (3, 3), activation='relu',padding='valid')(input)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dense(10, activation='softmax')(x)
model = Model(input, x, name='Functional')
model.summary()
model.compile(optimizer='adam',
loss=tf.keras.losses.CategoricalCrossentropy(),
metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
conv2d_16 (Conv2D) (None, 30, 30, 32) 896
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 15, 15, 32) 0
_________________________________________________________________
conv2d_17 (Conv2D) (None, 13, 13, 64) 18496
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 6, 6, 64) 0
_________________________________________________________________
conv2d_18 (Conv2D) (None, 4, 4, 64) 36928
_________________________________________________________________
flatten_6 (Flatten) (None, 1024) 0
_________________________________________________________________
dense_11 (Dense) (None, 64) 65600
_________________________________________________________________
dense_12 (Dense) (None, 10) 650
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 15s 305us/step - loss: 1.4870 - accuracy: 0.4600 - val_loss: 1.2874 - val_accuracy: 0.5488
Epoch 2/10
50000/50000 [==============================] - 15s 301us/step - loss: 1.1365 - accuracy: 0.5989 - val_loss: 1.0789 - val_accuracy: 0.6191
Epoch 3/10
50000/50000 [==============================] - 15s 301us/step - loss: 0.9869 - accuracy: 0.6547 - val_loss: 0.9506 - val_accuracy: 0.6700
Epoch 4/10
50000/50000 [==============================] - 15s 301us/step - loss: 0.8896 - accuracy: 0.6907 - val_loss: 0.9509 - val_accuracy: 0.6695
Epoch 5/10
50000/50000 [==============================] - 16s 311us/step - loss: 0.8135 - accuracy: 0.7151 - val_loss: 0.8688 - val_accuracy: 0.7046
Epoch 6/10
50000/50000 [==============================] - 15s 303us/step - loss: 0.7566 - accuracy: 0.7351 - val_loss: 0.8411 - val_accuracy: 0.7141
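As a side note on the two setups in this thread: a `Dense(10)` output with `SparseCategoricalCrossentropy(from_logits=True)` and integer labels, and a `Dense(10, activation='softmax')` output with `CategoricalCrossentropy()` and one-hot labels, describe the same loss mathematically. The plain-Python sketch below checks this on one made-up example with three classes; the helper names are hypothetical and no Keras code is involved, so it says nothing about the library issue linked above.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_logits(label, logits):
    """Cross-entropy from raw logits and an integer class label."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def categorical_ce(one_hot, probs):
    """Cross-entropy from probabilities and a one-hot label."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

logits = [2.0, -1.0, 0.5]  # made-up network output
label = 2
one_hot = [0, 0, 1]

loss_sparse = sparse_ce_from_logits(label, logits)
loss_categorical = categorical_ce(one_hot, softmax(logits))
print(loss_sparse, loss_categorical)
```

The two printed values agree up to floating-point error, so switching between the two setups changes the label encoding and the output activation, not the objective being minimized.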