Loss is NaN using activation softmax and loss function categorical_crossentropy

Loss is NaN using activation softmax and loss function categorical_crossentropy - python

I'm trying to make this model work. Initially x.shape is (6703, 56) and y.shape is a binary column having shape (6703, ). Then I run
y = y.to_numpy()
y = y.astype("float32")
y = tf.keras.utils.to_categorical(y, 2)
and now y.shape is (6703, 2). I run
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.2, random_state=42)
and now
X_train shape is (5362, 56)
Y_train shape is (5362, 2)
X_test shape is (1341, 56)
Y_test shape is (1341, 2)
Then I build the model:
model = tf.keras.models.Sequential(name="3layers")
model.add(keras.layers.Dense(N_HIDDEN,
input_shape=(len(X_train[0]),),
name="dense1",
activation="relu"))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(N_HIDDEN,
name="dense2",
activation="relu"))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(NB_CLASSES,
name="dense3",
activation="softmax"))
model.summary()
model.compile(optimizer="SGD", #SGD adam
loss="categorical_crossentropy",
metrics=["accuracy"])
model.fit(X_train, Y_train,
batch_size=BATCH_SIZE,
epochs=EPOCHS,
verbose=VERBOSE,
validation_split=VALIDATION_SPLIT)
test_loss, test_acc = model.evaluate(X_test, Y_test)
The summary is what I expect:
dense1 (Dense) (None, 64) 3648
dropout_18 (Dropout) (None, 64) 0
dense2 (Dense) (None, 64) 4160
dropout_19 (Dropout) (None, 64) 0
dense3 (Dense) (None, 2) 130
but the output is
Epoch 1/5
> 429/429 [==============================] - 1s 1ms/step - loss: nan - accuracy: 0.5141 - val_loss: nan - val_accuracy: 0.4884
Epoch 2/5
> 429/429 [==============================] - 0s 1ms/step - loss: nan - accuracy: 0.5143 - val_loss: nan - val_accuracy: 0.4884
Epoch 3/5
> 429/429 [==============================] - 0s 987us/step - loss: nan - accuracy: 0.5143 - val_loss: nan - val_accuracy: 0.4884
I've tried changing many parameters, I'm stuck.

I found what it was. There were some "None" values in the x matrix that caused the problem. Removing them it started evaluating a numeric loss. Very poor accuracy, but this will be another problem to solve.

Related

how can i continue training from last epoch?

I saved the history of training by
history = model.fit(train_generator, epochs=epochs, steps_per_epoch=train_steps,
verbose=1, callbacks=callbacks, validation_data=val_generator,
validation_steps=val_steps,batch_size=16)
with open('history_epochs.pkl', 'wb') as f:
dump(history.history, f)
Can I use the file of history to continue from the last epoch? and how please

Below applies to any deep learning library …
Build model
Train model.
Save model (should be saving parameters/weights as well).
Load model from the saved file (any time any where).
Continue with more training.

You can use the pickle file to save and load your model and continue training:
Create your model
Train your model
Save your model as a pickle file
Code for the above steps:
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import joblib
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat','Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
fig, axes = plt.subplots(2,5,figsize=(15,6))
for idx, axe in enumerate(axes.flatten()):
axe.axis('off')
idx_img = np.argwhere(y_train==idx)[0][0]
axe.imshow(X_train[idx_img], cmap=plt.cm.binary)
axe.set_title(class_names[y_train[idx_img]])
X_train = X_train.astype('float32') / 255.0
X_train = tf.expand_dims(X_train, axis=-1)
X_test = X_test.astype('float32') / 255.0
X_test = tf.expand_dims(X_test, axis=-1)
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(X_train.shape[1], X_train.shape[1], 1)))
model.add(tf.keras.layers.Conv2D(128, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Conv2D(64, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Conv2D(128, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Dense(10, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train, batch_size=256, epochs=3, verbose=1, validation_split=.2)
model.evaluate(X_test, y_test, verbose=1)
joblib.dump(model, 'model.pkl')
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 128) 1280
batch_normalization (BatchN (None, 26, 26, 128) 512
ormalization)
dropout (Dropout) (None, 26, 26, 128) 0
conv2d_1 (Conv2D) (None, 24, 24, 64) 73792
batch_normalization_1 (Batc (None, 24, 24, 64) 256
hNormalization)
dropout_1 (Dropout) (None, 24, 24, 64) 0
conv2d_2 (Conv2D) (None, 22, 22, 128) 73856
batch_normalization_2 (Batc (None, 22, 22, 128) 512
hNormalization)
dropout_2 (Dropout) (None, 22, 22, 128) 0
flatten (Flatten) (None, 61952) 0
dense (Dense) (None, 512) 31719936
dropout_3 (Dropout) (None, 512) 0
dense_1 (Dense) (None, 128) 65664
dropout_4 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 10) 1290
=================================================================
Total params: 31,937,098
Trainable params: 31,936,458
Non-trainable params: 640
_________________________________________________________________
Epoch 1/3
188/188 [==============================] - 19s 81ms/step - loss: 0.8264 - accuracy: 0.7398 - val_loss: 3.4644 - val_accuracy: 0.1245
Epoch 2/3
188/188 [==============================] - 14s 75ms/step - loss: 0.4896 - accuracy: 0.8283 - val_loss: 1.2240 - val_accuracy: 0.5802
Epoch 3/3
188/188 [==============================] - 14s 77ms/step - loss: 0.4055 - accuracy: 0.8544 - val_loss: 0.3711 - val_accuracy: 0.8675
313/313 [==============================] - 2s 5ms/step - loss: 0.3850 - accuracy: 0.8591
[0.3849639296531677, 0.8590999841690063]
INFO:tensorflow:Assets written to: ram://****/assets
['model.pkl']
Load your model
Continue Training
Code for the above steps:
model = joblib.load("/content/model.pkl")
model.fit(X_train, y_train, batch_size=256, epochs=2, verbose=1, validation_split=.2)
model.evaluate(X_test, y_test, verbose=1)
Output:
Epoch 1/2
188/188 [==============================] - 17s 84ms/step - loss: 0.4414 - accuracy: 0.8496 - val_loss: 0.3449 - val_accuracy: 0.8697
Epoch 2/2
188/188 [==============================] - 15s 82ms/step - loss: 0.3704 - accuracy: 0.8708 - val_loss: 0.2884 - val_accuracy: 0.8965
313/313 [==============================] - 1s 5ms/step - loss: 0.3114 - accuracy: 0.8938
[0.31136029958724976, 0.8938000202178955]

ValueError: Shapes (None, 1) and (None, 64) are incompatible

I'm currently trying to build a classification model in keras but I keep getting a shape error. This is my model right now. Is there anything that I am doing wrong?
predictors=["Length", "Diameter", "Height", "Shucked weight", "Viscera weight", "Shell weight", "Rings"]
x_train, x_test, y_train, y_test =train_test_split(db[predictors], db["Sex"], test_size=.2)
x_train= x_train.to_numpy()
x_test = x_test.to_numpy()
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(7,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(64, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'],
)
x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = y_train[:1000]
partial_y_train = y_train[1000:]
partial_x_train.shape
history = model.fit(partial_x_train,
partial_y_train,
epochs=20,
batch_size=512,
validation_data=(x_val, y_val))
ValueError: Shapes (None, 1) and (None, 64) are incompatible
Data Source https://www.kaggle.com/rodolfomendes/abalone-dataset

The output of the last layer consists of 64 different values, while your labels are of 1 value only.

This error is because you have 3 classes(labels) in your dataset and you are not defining those in your model's last layer. (As mentioned by #subspring)
model = Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(7,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(3)) # You need to mention this in the last dense layer
As the label data in this dataset is not numeric.
y_train.unique() #array(['I', 'M', 'F'], dtype=object)
For that, you can use LabelEncoder as below:
from sklearn.preprocessing import LabelEncoder
def Labels(y_train, y_test):
LabEnc = LabelEncoder()
LabEnc.fit(y_train)
Enc_y_train = LabEnc.transform(y_train)
Enc_y_test = LabEnc.transform(y_test)
return Enc_y_train, Enc_y_test
y_train, y_test = Labels(y_train, y_test)
y_train # array([1, 1, 2, ..., 2, 2, 0])
Now train the model by converting the input data (x_train,x_test) into an array.
x_train= np.array(x_train)
x_test = np.array(x_test)
#compile the model
model.compile(optimizer='rmsprop',
loss=tf.keras.losses.MeanSquaredError(),
metrics=['accuracy'])
x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = y_train[:1000]
partial_y_train = y_train[1000:]
partial_x_train.shape
#train the model
history = model.fit(partial_x_train,
partial_y_train,
epochs=5,
batch_size=512,
validation_data=(x_val, y_val))
Output:
Epoch 1/5
5/5 [==============================] - 2s 80ms/step - loss: 0.8610 - accuracy: 0.3302 - val_loss: 0.7966 - val_accuracy: 0.2350
Epoch 2/5
5/5 [==============================] - 0s 13ms/step - loss: 0.7997 - accuracy: 0.2563 - val_loss: 0.7491 - val_accuracy: 0.4620
Epoch 3/5
5/5 [==============================] - 0s 16ms/step - loss: 0.7917 - accuracy: 0.3315 - val_loss: 0.7883 - val_accuracy: 0.2680
Epoch 4/5
5/5 [==============================] - 0s 15ms/step - loss: 0.7949 - accuracy: 0.3405 - val_loss: 0.7499 - val_accuracy: 0.3390
Epoch 5/5
5/5 [==============================] - 0s 13ms/step - loss: 0.7884 - accuracy: 0.3306 - val_loss: 0.7605 - val_accuracy: 0.3670

Tensorflow save and load_model not working but save and load_weights does

I am using tensorflow version 2.8.0:
I have seen this issue from multiple sources all over forums, githubs, and even some here for the past 5 years with no definitive answer that has worked for me... For some reason, in certain situations, a loaded model from a previous save yields very different results from the original model evaluation. I haven't seen any well documented and investigative questions about this so I thought I'd show my full code below (simple illustration of the issue).
This is an application of transfer learning from a pre-trained tensorflow model. The model is first trained through 5 epochs on train_data, then fine tuned (with more trainable params) for 5 more. Evaluating the model on test_data shows an accuracy of 0.5671. The model is then saved and loaded in .h5 format (I have also tried the tf SavedModel format and the result is the same). The resultant loaded_model yields an evaluation accuracy on the same, unaltered test_data of 0.4535.
The result should be the same (0.5671)... so to further investigate I decided to save the fine tuned model's weights independently, construct and compile the same model architecture in new_model, and load the saved model's weights into new_model. Evaluating new_model yields the correct result, an accuracy of 0.5671. ----- Okay, so it must be the weights not saving properly right? I pulled the weights from each of these three models (model, loaded_model, new_model) and compared their flattened results. They are all the same. I really have no idea what's going on here but I'm assuming it is not random initialization, because the loaded_model evaluation results really did not perform anywhere near the fine tuned model - I would assume they would converge much closer.
import tensorflow as tf
tf.random.set_seed(42)
import pandas as pd
import numpy as np
import os
import pathlib
data_dir = pathlib.Path("101_food_classes_10_percent/train")
class_names = np.array(sorted([item.name for item in data_dir.glob('*')]))
train_dir = './101_food_classes_10_percent/train/'
test_dir = './101_food_classes_10_percent/test/'
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen=ImageDataGenerator()
train_data = datagen.flow_from_directory(directory = train_dir,
target_size = (224,224),
batch_size = 32,
class_mode='categorical')
test_data = datagen.flow_from_directory(directory = test_dir,
target_size = (224,224),
batch_size = 32,
class_mode='categorical')
from tensorflow.keras.layers.experimental import preprocessing
data_augmentation = tf.keras.Sequential([
preprocessing.RandomFlip('horizontal'),
preprocessing.RandomRotation(0.2),
preprocessing.RandomZoom(0.2),
preprocessing.RandomHeight(0.2),
preprocessing.RandomWidth(0.2)
#preprocessing.Rescaling(1/255.) in EfficientNet it's already scaled but could use this for non-scaled
], name = 'data_augmentation')
Found 7575 images belonging to 101 classes.
Found 25250 images belonging to 101 classes.
# Build headless model - Feature Extraction
# Setup base with frozen layers
base_model = tf.keras.applications.EfficientNetB0(include_top=False)
base_model.trainable=False
inputs = tf.keras.layers.Input(shape = (224,224,3))
x = data_augmentation(inputs)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x) # Pool base_model's outputs into a feature vector
outputs = tf.keras.layers.Dense(len(class_names), activation='softmax')(x)
model = tf.keras.Model(inputs,outputs)
model.compile('Adam', 'categorical_crossentropy', metrics=['accuracy'])
model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
data_augmentation (Sequentia (None, None, None, 3) 0
_________________________________________________________________
efficientnetb0 (Functional) (None, None, None, 1280) 4049571
_________________________________________________________________
global_average_pooling2d_1 ( (None, 1280) 0
_________________________________________________________________
dense_1 (Dense) (None, 101) 129381
=================================================================
Total params: 4,178,952
Trainable params: 129,381
Non-trainable params: 4,049,571
_________________________________________________________________
history = model.fit(train_data, validation_data=test_data,
validation_steps=int(0.15*len(test_data)),
epochs=5, callbacks = [checkpoint_callback])
Epoch 1/5
237/237 [==============================] - 63s 230ms/step - loss: 3.4712 - accuracy: 0.2482 - val_loss: 2.4446 - val_accuracy: 0.4497
Epoch 2/5
237/237 [==============================] - 52s 221ms/step - loss: 2.3575 - accuracy: 0.4561 - val_loss: 2.0051 - val_accuracy: 0.5093
Epoch 3/5
237/237 [==============================] - 51s 216ms/step - loss: 1.9838 - accuracy: 0.5265 - val_loss: 1.8313 - val_accuracy: 0.5360
Epoch 4/5
237/237 [==============================] - 51s 212ms/step - loss: 1.7497 - accuracy: 0.5761 - val_loss: 1.7417 - val_accuracy: 0.5461
Epoch 5/5
237/237 [==============================] - 53s 221ms/step - loss: 1.6035 - accuracy: 0.6141 - val_loss: 1.7012 - val_accuracy: 0.5601
model.evaluate(test_data)
790/790 [==============================] - 87s 110ms/step - loss: 1.7294 - accuracy: 0.5481
[1.7294203042984009, 0.5480791926383972]
# Fine tuning: unfreeze some layers, lower leaning rate by 10x
base_model.trainable=True
# Refreeze every layer except last 5, adjust tiner tuned features down the model
for layer in base_model.layers[:-5]:
layer.trainable=False
# recompile and lower learning rate by 10x
model.compile(tf.keras.optimizers.Adam(learning_rate=0.0001), 'categorical_crossentropy', metrics=['accuracy'])
model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
data_augmentation (Sequentia (None, None, None, 3) 0
_________________________________________________________________
efficientnetb0 (Functional) (None, None, None, 1280) 4049571
_________________________________________________________________
global_average_pooling2d_1 ( (None, 1280) 0
_________________________________________________________________
dense_1 (Dense) (None, 101) 129381
=================================================================
Total params: 4,178,952
Trainable params: 910,821
Non-trainable params: 3,268,131
_________________________________________________________________
# Fine Tune for 5 more epochs starting with last epoch left off at:
fine_tune_epochs=10 # Total number of epochs we're after: 5 feature extraction, 5 fine tuning
history_fine_tune = model.fit(train_data,
validation_data = test_data,
validation_steps=int(0.15*len(test_data)),
epochs = fine_tune_epochs,
initial_epoch = history.epoch[-1])
Epoch 5/10
237/237 [==============================] - 59s 220ms/step - loss: 1.3571 - accuracy: 0.6543 - val_loss: 1.6403 - val_accuracy: 0.5567
Epoch 6/10
237/237 [==============================] - 51s 213ms/step - loss: 1.2478 - accuracy: 0.6688 - val_loss: 1.6805 - val_accuracy: 0.5596
Epoch 7/10
237/237 [==============================] - 46s 193ms/step - loss: 1.1424 - accuracy: 0.6964 - val_loss: 1.6352 - val_accuracy: 0.5736
Epoch 8/10
237/237 [==============================] - 45s 191ms/step - loss: 1.0902 - accuracy: 0.7065 - val_loss: 1.6494 - val_accuracy: 0.5657
Epoch 9/10
237/237 [==============================] - 46s 193ms/step - loss: 1.0229 - accuracy: 0.7275 - val_loss: 1.6348 - val_accuracy: 0.5633
Epoch 10/10
237/237 [==============================] - 45s 191ms/step - loss: 0.9704 - accuracy: 0.7434 - val_loss: 1.6990 - val_accuracy: 0.5670
model.evaluate(test_data)
790/790 [==============================] - 83s 105ms/step - loss: 1.6578 - accuracy: 0.5671
[1.657836675643921, 0.5670890808105469]
model.save("./101_food_classes_10_percent/big_modelh5")
loaded_model = tf.keras.models.load_model("./101_food_classes_10_percent/big_modelh5.h5")
loaded_model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
data_augmentation (Sequentia (None, None, None, 3) 0
_________________________________________________________________
efficientnetb0 (Functional) (None, None, None, 1280) 4049571
_________________________________________________________________
global_average_pooling2d_1 ( (None, 1280) 0
_________________________________________________________________
dense_1 (Dense) (None, 101) 129381
=================================================================
Total params: 4,178,952
Trainable params: 910,821
Non-trainable params: 3,268,131
_________________________________________________________________
loaded_model.evaluate(test_data)
790/790 [==============================] - 85s 104ms/step - loss: 2.1780 - accuracy: 0.4535 - loss: 2.1790 - accuracy
[2.1780412197113037, 0.4534653425216675]
# Try save_weights to another model
model.save_weights('my_model_weights.h5')
inputs = tf.keras.layers.Input(shape = (224,224,3))
x = data_augmentation(inputs)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x) # Pool base_model's outputs into a feature vector
outputs = tf.keras.layers.Dense(len(class_names), activation='softmax')(x)
new_model = tf.keras.Model(inputs,outputs)
new_model.compile('Adam', 'categorical_crossentropy', metrics=['accuracy'])
new_model.summary()
Model: "model_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
data_augmentation (Sequentia (None, None, None, 3) 0
_________________________________________________________________
efficientnetb0 (Functional) (None, None, None, 1280) 4049571
_________________________________________________________________
global_average_pooling2d_2 ( (None, 1280) 0
_________________________________________________________________
dense_2 (Dense) (None, 101) 129381
=================================================================
Total params: 4,178,952
Trainable params: 910,821
Non-trainable params: 3,268,131
_________________________________________________________________
new_model.load_weights('my_model_weights.h5')
# Saving weights works... but not save and load_model
new_model.evaluate(test_data)
790/790 [==============================] - 88s 109ms/step - loss: 1.6578 - accuracy: 0.5671
[1.6578353643417358, 0.5670890808105469]
# Check if weights are the same?
m1 = model.get_weights()
m2 = new_model.get_weights()
m3 = loaded_model.get_weights()
len(m1)==len(m2)==len(m3)
True
from collections.abc import Iterable
def flatten(l):
for el in l:
if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
yield from flatten(el)
else:
yield el
m1 = flatten(m1)
m2 = flatten(m2)
m3 = flatten(m3)
print(list(m1)==list(m2))
print(list(m1)==list(m3))
True
True

This is because you have not saved your entire model using .h5 extension, but you are using .h5 for saving the weights. Please check below code section:
model.save("./101_food_classes_10_percent/big_modelh5") # add .h5
loaded_model = tf.keras.models.load_model("./101_food_classes_10_percent/big_modelh5.h5")
loaded_model.summary()
Use this code to save the entire model to a HDF5 file format and try again loading it:
model.save("./101_food_classes_10_percent/big_modelh5.h5")
Check this for more details on saving model in .hdf5 format.

confusion_matrix() library is giving ValueError

When trying to get confusion matrix for a ConvNet constantly getting the same error.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image
from sklearn.metrics import classification_report, confusion_matrix
img_width, img_height = 150, 150
train_data_dir = "train"
validation_data_dir = "test"
nb_train_samples = 2000
nb_validation_samples = 400
epochs = 50
batch_size = 40 #16
if K.image_data_format() == 'channels_first':
input_shape = (3, img_width, img_height)
else:
input_shape = (img_width, img_height, 3)
train_datagen = ImageDataGenerator(
rescale = 1. / 255,
shear_range = 0.2,
zoom_range= 0.2,
horizontal_flip= True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = test_datagen.flow_from_directory(
train_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')`
Applying CNN Layers ...
model.compile(loss= 'binary_crossentropy',
optimizer= 'rmsprop',
metrics= ['accuracy'] )
`model.fit_generator(
train_generator,
steps_per_epoch= nb_train_samples // batch_size,
epochs= epochs,
validation_data= validation_generator,
validation_steps= nb_validation_samples // batch_size)
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size+1)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))`
Getting error mentioned below but don't know how to resolve it
ValueError: Found input variables with inconsistent numbers of samples: [400, 440]

I am able to recreate your error using Dogs_Vs_Cats dataset. Where i have 2000 samples in train directory and 400 samples in validation directory.
Please change model.predict_generator from
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size+1)
to
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size)
will resolve this ValueError: Found input variables with inconsistent numbers of samples: [400, 440]
Please refer complete code as below
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image
from sklearn.metrics import classification_report, confusion_matrix
from google.colab import drive
drive.mount('/content/drive')
train_data_dir = '/content/drive/My Drive/Dogs_Vs_Cats/train'
validation_data_dir = '/content/drive/My Drive/Dogs_Vs_Cats/validation'
img_width, img_height = 150, 150
nb_train_samples = 2000
nb_validation_samples = 400
epochs = 10
batch_size = 40 #16
if K.image_data_format() == 'channels_first':
input_shape = (3, img_width, img_height)
else:
input_shape = (img_width, img_height, 3)
train_datagen = ImageDataGenerator(
rescale = 1. / 255,
shear_range = 0.2,
zoom_range= 0.2,
horizontal_flip= True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = test_datagen.flow_from_directory(
train_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size= (img_width, img_height),
batch_size= batch_size,
class_mode= 'binary')
model = Sequential()
model.add(Conv2D(32, (3, 3), strides = (1, 1), input_shape = input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), strides = (1, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(loss = 'binary_crossentropy',
optimizer = 'rmsprop',
metrics = ['accuracy'])
model.fit_generator(
train_generator,
steps_per_epoch= nb_train_samples // batch_size,
epochs= epochs,
validation_data= validation_generator,
validation_steps= nb_validation_samples // batch_size)
Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
Output:
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Found 2000 images belonging to 2 classes.
Found 400 images belonging to 2 classes.
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_9 (Conv2D) (None, 148, 148, 32) 896
_________________________________________________________________
activation_9 (Activation) (None, 148, 148, 32) 0
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 74, 74, 32) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 72, 72, 64) 18496
_________________________________________________________________
activation_10 (Activation) (None, 72, 72, 64) 0
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 36, 36, 64) 0
_________________________________________________________________
flatten_5 (Flatten) (None, 82944) 0
_________________________________________________________________
dense_9 (Dense) (None, 64) 5308480
_________________________________________________________________
dropout_5 (Dropout) (None, 64) 0
_________________________________________________________________
dense_10 (Dense) (None, 1) 65
=================================================================
Total params: 5,327,937
Trainable params: 5,327,937
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
50/50 [==============================] - 12s 233ms/step - loss: 0.9345 - accuracy: 0.5375 - val_loss: 0.6303 - val_accuracy: 0.5225
Epoch 2/10
50/50 [==============================] - 11s 226ms/step - loss: 0.6745 - accuracy: 0.5965 - val_loss: 0.6094 - val_accuracy: 0.6725
Epoch 3/10
50/50 [==============================] - 11s 223ms/step - loss: 0.6196 - accuracy: 0.6605 - val_loss: 0.5694 - val_accuracy: 0.7150
Epoch 4/10
50/50 [==============================] - 11s 223ms/step - loss: 0.5501 - accuracy: 0.7285 - val_loss: 0.6216 - val_accuracy: 0.7225
Epoch 5/10
50/50 [==============================] - 11s 221ms/step - loss: 0.4794 - accuracy: 0.7790 - val_loss: 0.6268 - val_accuracy: 0.6025
Epoch 6/10
50/50 [==============================] - 11s 226ms/step - loss: 0.4038 - accuracy: 0.8195 - val_loss: 0.4842 - val_accuracy: 0.6975
Epoch 7/10
50/50 [==============================] - 11s 222ms/step - loss: 0.3207 - accuracy: 0.8595 - val_loss: 0.5600 - val_accuracy: 0.7325
Epoch 8/10
50/50 [==============================] - 13s 257ms/step - loss: 0.2574 - accuracy: 0.8920 - val_loss: 0.9705 - val_accuracy: 0.7525
Epoch 9/10
50/50 [==============================] - 13s 252ms/step - loss: 0.2049 - accuracy: 0.9235 - val_loss: 0.7311 - val_accuracy: 0.7475
Epoch 10/10
50/50 [==============================] - 13s 251ms/step - loss: 0.1448 - accuracy: 0.9515 - val_loss: 1.0541 - val_accuracy: 0.7150
Confusion Matrix
[[200 0]
[200 0]]
Hope this answers your question. If not please share complete traceback and code for debug, i am happy to help you.

Keras Sequential to Functional API

I am new to deep learning and have been trying to convert the Keras sequential API to the functional API running on the CIFAR10 image dataset but have been having some difficulty. I've converted the model which looks the same except for the input layer yet the sequential has an average accuracy of around ~70% and my functional has an average accuracy of around ~10%. I would really appreciate some help with regards to figuring out what is going wrong. Here is my functional code:
import tensorflow as tf
from tensorflow import keras
from keras import datasets, layers, models
from keras.models import Model, Input, Sequential
import matplotlib.pyplot as plt
Download and prepare:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
input_shape = train_images[0,:,:,:].shape
Create model:
input = layers.Input(shape=input_shape)
x = layers.Conv2D(32, (3, 3), activation='relu',padding='valid')(input)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dense(10)(x)
model = Model(input, x, name='Functional')
Compile and train:
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
Here is a link to the original sequential CNN which is a google collaboratory notebook. I would really appreciate any help in trying to understand and fix what is going wrong. Thank you in advance.

There seems to be some issues with SparseCategoricalCrossentropy loss.
Check this: https://github.com/tensorflow/tensorflow/issues/38632
The following model gives good accuracy:
import tensorflow as tf
from tensorflow import keras
from keras import datasets, layers, models
from keras.models import Model, Input, Sequential
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
train_labels, test_labels = tf.keras.utils.to_categorical(train_labels, 10) , tf.keras.utils.to_categorical(test_labels, 10)
input_shape = train_images[0,:,:,:].shape
input = layers.Input(shape=input_shape)
x = layers.Conv2D(32, (3, 3), activation='relu',padding='valid')(input)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dense(10, activation='softmax')(x)
model = Model(input, x, name='Functional')
model.summary()
model.compile(optimizer='adam',
loss=loss=tf.keras.losses.CategoricalCrossentropy(),
metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
conv2d_16 (Conv2D) (None, 30, 30, 32) 896
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 15, 15, 32) 0
_________________________________________________________________
conv2d_17 (Conv2D) (None, 13, 13, 64) 18496
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 6, 6, 64) 0
_________________________________________________________________
conv2d_18 (Conv2D) (None, 4, 4, 64) 36928
_________________________________________________________________
flatten_6 (Flatten) (None, 1024) 0
_________________________________________________________________
dense_11 (Dense) (None, 64) 65600
_________________________________________________________________
dense_12 (Dense) (None, 10) 650
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 15s 305us/step - loss: 1.4870 - accuracy: 0.4600 - val_loss: 1.2874 - val_accuracy: 0.5488
Epoch 2/10
50000/50000 [==============================] - 15s 301us/step - loss: 1.1365 - accuracy: 0.5989 - val_loss: 1.0789 - val_accuracy: 0.6191
Epoch 3/10
50000/50000 [==============================] - 15s 301us/step - loss: 0.9869 - accuracy: 0.6547 - val_loss: 0.9506 - val_accuracy: 0.6700
Epoch 4/10
50000/50000 [==============================] - 15s 301us/step - loss: 0.8896 - accuracy: 0.6907 - val_loss: 0.9509 - val_accuracy: 0.6695
Epoch 5/10
50000/50000 [==============================] - 16s 311us/step - loss: 0.8135 - accuracy: 0.7151 - val_loss: 0.8688 - val_accuracy: 0.7046
Epoch 6/10
50000/50000 [==============================] - 15s 303us/step - loss: 0.7566 - accuracy: 0.7351 - val_loss: 0.8411 - val_accuracy: 0.7141

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Loss is NaN using activation softmax and loss function categorical_crossentropy - python

I found what it was. There were some "None" values in the x matrix that caused the problem. Removing them it started evaluating a numeric loss. Very poor accuracy, but this will be another problem to solve.

Related

how can i continue training from last epoch?

ValueError: Shapes (None, 1) and (None, 64) are incompatible

Tensorflow save and load_model not working but save and load_weights does

confusion_matrix() library is giving ValueError

Keras Sequential to Functional API

Categories

Resources