I am attempting to train a simple convolutional neural network shown below.
model = Sequential()
model.add(Conv1D(32, 3, padding='same', input_shape=(700, 7)))
model.add(Activation('relu'))
model.add(Conv1D(32,3))
model.add(Activation('relu'))
model.add(MaxPooling1D())
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
I fit it for 100 epochs with a validation split of 0.2 on input data shaped [1000L, 700L, 7L]. Every single one of my epochs displayed the following:
loss: nan - acc:0.0000e+00 - val_loss: nan - val_acc: 0.0000e+00
So my question is: what went wrong and how do I fix it? Is the problem with the network, or with how my data is being fed into and fitted to the model?
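(As an aside, a check along these lines on the arrays being passed to fit usually narrows a NaN loss down quickly; this is only a sketch, assuming the features and labels are NumPy arrays x and y:)
import numpy as np
print(np.isnan(x).any(), np.isinf(x).any())   # any NaN/inf hiding in the features?
print(np.unique(y))                           # labels should be exactly 0 and 1 for binary_crossentropy
print(x.min(), x.max())                       # very large ranges often benefit from scaling
x = (x - x.mean(axis=(0, 1))) / (x.std(axis=(0, 1)) + 1e-8)   # simple per-feature standardization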
I have 101 folders, numbered 0-100, containing synthetic training images.
This is my code:
dataset = tf.keras.utils.image_dataset_from_directory(
    'Pictures/synthdataset5', labels='inferred', label_mode='int', class_names=None,
    color_mode='rgb', batch_size=32, image_size=(128, 128), shuffle=True, seed=None,
    validation_split=None, subset=None, interpolation='bilinear', follow_links=False,
    crop_to_aspect_ratio=False
)
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
model = Sequential()
model.add(Conv2D(32, kernel_size=5, activation='relu', input_shape=(128,128,3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=5, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(dataset,epochs=75)
And I always get the same result for every epoch:
Epoch 1/75
469/469 [==============================] - 632s 1s/step - loss: 0.0000e+00 - accuracy: 0.0098
What's wrong???
So it turns out your loss might be the problem after all.
If you use SparseCategoricalCrossentropy as the loss instead, it should work:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
After this you should also adjust the last layer so it has one unit per class. Since the loss above is configured with from_logits=True, leave the activation off so the layer outputs raw logits:
model.add(Dense(101))
(Alternatively, keep activation='softmax' on that layer and drop from_logits=True from the loss.)
Also, don't forget to import TensorFlow: import tensorflow as tf
Let me know if this solves the issue.
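Putting the pieces together, the tail of the model and the compile call would look roughly like this (a sketch under the assumptions above; the convolutional layers and the dataset pipeline stay as they are):
model.add(Flatten())
model.add(Dense(101))  # one raw logit per class (folders 0-100); no softmax because from_logits=True
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(dataset, epochs=75)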
I have a dataset where I need to predict a target that is either 0 or 1.
It would also be useful for the prediction to indicate how close it is to 0 (e.g. 0.20) or to 1 (e.g. 0.89), and so on.
My model structure is this:
model = Sequential()
model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=1, strides=1))
model.add(LSTM(128, return_sequences=True, recurrent_dropout=0.2, activation='relu'))
model.add(Dense(128, activation="relu",
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                bias_regularizer=regularizers.l2(1e-4),
                activity_regularizer=regularizers.l2(1e-5)))
model.add(Dropout(0.4))
model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=1, strides=1))
model.add(LSTM(64, return_sequences=True,activation='relu'))
model.add(Dense(64, activation="relu",
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                bias_regularizer=regularizers.l2(1e-4),
                activity_regularizer=regularizers.l2(1e-5)))
model.add(Dropout(0.4))
model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=1, strides=1))
model.add(LSTM(32, return_sequences=True, recurrent_dropout=0.2, activation='relu'))
model.add(Dense(32, activation="relu",
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                bias_regularizer=regularizers.l2(1e-4),
                activity_regularizer=regularizers.l2(1e-5)))
model.add(Dropout(0.4))
model.add(BatchNormalization())
model.add(Dense(1, activation='linear'))
from keras.metrics import categorical_accuracy
model.compile(optimizer='rmsprop',loss="mse",metrics=['accuracy'])
model.fit(X_train,y_train,epochs=1000, batch_size=16, verbose=1, validation_split=0.1, callbacks=callback)
A summary of the model is here: https://pastebin.com/Ba6ErEzj
The verbose training output is:
Epoch 58/1000
277/277 [==============================] - 1s 5ms/step - loss: 0.2510 - accuracy: 0.4937 - val_loss: 0.2523 - val_accuracy: 0.4878
Epoch 59/1000
277/277 [==============================] - 1s 5ms/step - loss: 0.2515 - accuracy: 0.4941 - val_loss: 0.2504 - val_accuracy: 0.5122
How can I improve this? An accuracy of around 0.50 on a 0/1 output is useless.
This is my Colab code.
To wrap up the suggestions (some already offered in the comments), with some justification...
Mistakes. You are in a binary classification setting, so:
Using MSE is wrong; you should use loss='binary_crossentropy'
In your last single-node layer, you should use activation='sigmoid'.
Best practices. Things like dropout, batch normalization, and kernel & bias regularizers are used for regularization, i.e. (roughly speaking) to avoid overfitting. They should not be used by default, and doing so is well known to prevent learning (as seems to be the case here):
Remove all dropout layers
Remove all batch normalization layers
Remove all kernel, bias, and activity regularizers.
You can consider adding some of these back step by step later, but only if you see signs of overfitting.
General advice. Nowadays the usual first choice of optimizer is Adam, so change to optimizer='adam' as a first approach.
That said, at the end of the day, everything depends on your data (both their quantity & quality) and the particular problem to be addressed. Experimentation is king (but keeping in mind the general principles stated above).
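Applied to the model in the question, the two outright fixes amount to changing the output layer and the compile call (a minimal sketch; removing the dropout, batch-norm, and regularizer arguments is a separate, mechanical step):
model.add(Dense(1, activation='sigmoid'))   # instead of Dense(1, activation='linear')
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])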
When I want to train my model in TensorFlow, it seems like TF is not picking up the right number of samples (see the screenshot).
I expect to see 21759, not 680.
This has been happening since I changed OS (Fedora 30 XFCE -> Fedora 32 GNOME); other laptops do not have this issue.
I am using TF 2.2.
My dataset is made up of some CSV files created by tshark (a screenshot of the dataset is attached).
Here are a few lines of my code:
My model:
model = Sequential()
model.add(LSTM(9, input_shape=dataset[0].shape, activation='relu', return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(9, input_shape=dataset[0].shape, activation='relu', return_sequences=True))
model.add(Dropout(0.3))
model.add(Dense(9, activation='relu'))
model.add(Flatten())
model.add(Dense(2, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=1e-4, decay=1e-5)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
Do you have any ideas?
PS: It also happens with this .py file:
import tensorflow as tf
dataset = [[1, 1],[2, 2]] * 50
label = [0, 1] * 50
print(len(dataset))
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(2, activation="softmax")
])
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="sgd",
    metrics=["accuracy"]
)
history = model.fit(dataset, label, epochs=1)
Output:
100
4/4 [==============================] - 0s 611us/step - loss: 0.6578 - accuracy: 0.5000
As Koralp Catalsakal said, it was just a "configuration difference" issue: in TF 2.x the progress bar counts batches (steps), not samples, so 21759 samples at the default batch size of 32 show up as 680 steps.
So I just had to set the batch_size manually.
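For example, with the toy script above (100 samples), the default batch size of 32 is what produces the 4/4 steps shown; passing batch_size explicitly changes the step count accordingly (a small sketch):
history = model.fit(dataset, label, epochs=1, batch_size=10)
# now reports 10/10 steps: 100 samples / batch_size 10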
I am training a three-layer neural network with Keras:
model = models.Sequential()
model.add(Conv2D(32, (3, 3), padding="same",
                 input_shape=input_shape, strides=2, kernel_regularizer=l2(reg)))
model.add(BatchNormalization(axis=channels))
model.add(Activation("relu"))
model.add(Conv2D(64, (3, 3), padding="same",
                 input_shape=input_shape, strides=2, kernel_regularizer=l2(reg)))
model.add(BatchNormalization(axis=channels))
model.add(Activation("relu"))
model.add(Conv2D(128, (3, 3), padding="same",
                 input_shape=input_shape, strides=2, kernel_regularizer=l2(reg)))
model.add(BatchNormalization(axis=channels))
model.add(Activation("relu"))
model.add(layers.Flatten())
model.add(layers.Dense(neurons, activation='relu', kernel_regularizer=l2(reg)))
model.add(Dropout(0.50))
model.add(Dense(2))
model.add(Activation("softmax"))
My data has two classes, and I am using sparse categorical cross entropy:
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
history = model.fit(x=X, y=y, batch_size=batch_size, epochs=epochs,
                    validation_data=(X_val, y_val),
                    shuffle=True,
                    callbacks=callbacks,
                    verbose=1)
My data has the following shape:
X: (232, 100, 150, 3)
y: (232,)
where X contains images and y is either 1 or 0 (integer labels, since I am using the sparse loss function).
The loss is very high for both training and validation, even when the training accuracy is 1! I get values over 20 for the loss, which I understand is not reasonable.
If I train the model for a few epochs, output the predictions and the true labels, and compute the categorical cross-entropy from them myself, the value I get is < 1, as expected, even when I do the calculation with Keras' own function (I switch to the categorical version because the sparse one gives an error):
21/21 [==============================] - 7s 313ms/step - loss: 44.1764 - acc: 1.0000 - val_loss: 44.7084 - val_acc: 0.7857
cce = tf.keras.losses.CategoricalCrossentropy()
pred = model.predict(x=X_val, batch_size=len(X_val))
loss = cce(true_categorical, pred)
Categorical loss 0.6077293753623962
Is there a way to know exactly how this loss is calculated, and why the values are so high? The batch size is 8.
The loss printed by Keras is the total loss.
Regularization is also a loss added to the model based on the value of the weights.
Since you have a lot of weights, you also have a lot of contributions to the total loss.
That is why it's big.
If you remove the regularization, you will see the final loss equal to the categorical crossentropy loss.
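To see this concretely, one can compare the plain cross-entropy with the regularization penalties that Keras tracks on the model (a sketch, assuming TF 2.x and that the variables from the question are in scope; the match is only approximate because the printed loss is averaged over the epoch):
cce = tf.keras.losses.CategoricalCrossentropy()
pred = model.predict(x=X_val, batch_size=len(X_val))
data_loss = cce(true_categorical, pred).numpy()   # ~0.61, as computed in the question
reg_loss = tf.add_n(model.losses).numpy()         # sum of every l2(reg) penalty in the model
print(data_loss, reg_loss, data_loss + reg_loss)  # the sum should land near the ~44 Keras reports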
I have recently written and run this code to train a CNN with Theano and Keras:
#Building the model
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1,8,182)))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
#Compiling the CNN
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['binary_accuracy'])
#Fitting data and training the model
model.fit(X_train, y_train, batch_size=32, nb_epoch=100, verbose=1)
#Saving weights
model.save_weights('trained_cnn.h5', overwrite=True)
I tested it on my CPU, and each epoch took about 3 minutes. A sample output for the first epoch is this:
Epoch 1/100
72000/72000 [==============================] - 204s - loss: 0.6935 - binary_accuracy: 0.5059
Now, I have migrated to an Nvidia Titan X GPU. I have also been forced to move to Keras 2, and thus have updated my code as follows, implementing the necessary changes:
#Building the model
model = Sequential()
model.add(Conv2D(32, 3, activation='relu', input_shape=(1,8,182)))
model.add(Conv2D(32, 3, activation='relu'))
model.add(Conv2D(32, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
#Compiling the CNN
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['binary_accuracy'])
#Fitting data and training the model
model.fit(X_train, y_train, batch_size=32,epochs=100, verbose=2)
#Saving weights
model.save_weights('trained_cnn_b_2.h5', overwrite=True)
Now, whenever I run on my GPU, the program just gets stuck at
Epoch 1/100
and nothing happens after this, even when I wait for more than 10 minutes.
Why is this happening and how can I fix it? I haven't changed any of my code besides the Keras functions. No errors are thrown. Where am I going wrong? Is there something wrong with the verbose argument that is stopping the program from executing?
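(One aside: with verbose=2 Keras only prints a line when an epoch finishes, so a very slow first epoch looks identical to a hard hang. Switching to verbose=1 on a tiny slice of the data makes any progress visible immediately; this is only a diagnostic sketch:)
model.fit(X_train[:320], y_train[:320], batch_size=32, epochs=1, verbose=1)   # progress bar updates per batch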
Edit 1: I have left my setup running overnight, but there is still no progress past that line.
Edit 2: I am using CUDA 7.5.17
Edit 3: This program from here works perfectly fine. It completes execution in less than 10 seconds, as expected.
# Create your first MLP in Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy
# fix random seed for reproducibility
numpy.random.seed(7)
# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
NOTE: I have verified that my GPU is working completely fine.
EDIT: I migrated to TensorFlow, and it had no problems with the CUDA version being below 9. With TensorFlow, the program executes perfectly.
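(For completeness, a quick way to confirm that the TensorFlow build actually sees the GPU is the usual device listing; a sketch for the TF 1.x era implied by the CUDA version:)
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())   # should include a /device:GPU:0 entry for the Titan X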