After playing around with Keras, I realized that model.fit() somehow doesn't retrain the parameters when I call it again.
Below is my toy example. I call model.fit() six times, and on the fourth call I train on a completely new dataset. The model should therefore change on the fourth iteration, so the fifth iteration should produce different scores from the third. However, this is not what happens.
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(1, 199, 40)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
nb_epoch = 6
# I know you can add nb_epoch into the fit function, but please ignore that for now
for e in range(nb_epoch):
    if e == 3:
        # On the fourth iteration only (e == 3), train the model
        # on a completely new dataset
        model.fit(X_train1, y_train1, nb_epoch=1, batch_size=batch_size)
    else:
        model.fit(X_train2, y_train2, nb_epoch=1, batch_size=batch_size)
Results:
546/546 [==============================] - 11s - loss: 4.0249 - acc: 0.6996
Epoch 1/1
546/546 [==============================] - 11s - loss: 4.0443 - acc: 0.7491
Epoch 1/1
546/546 [==============================] - 11s - loss: 4.0443 - acc: 0.7491
Epoch 1/1
365/365 [==============================] - 7s - loss: 3.7977 - acc: 0.7644
Epoch 1/1
546/546 [==============================] - 11s - loss: 4.0443 - acc: 0.7491
Epoch 1/1
546/546 [==============================] - 11s - loss: 4.0443 - acc: 0.7491
It seems that calling model.fit after the second iteration has no effect on the model at all, even when new data is given.
Any ideas on why this is happening? I also tried train_on_batch, and it produces the same results.
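One observation that may help narrow this down: the repeated loss values can be reproduced by arithmetic alone. The sketch below assumes that Keras clips predicted probabilities at an epsilon of 1e-7 before taking the log, and that the number of correct samples can be recovered by rounding accuracy times sample count (409 of 546 and 279 of 365). The helper name `saturated_loss` is made up for illustration. If every sample gets a probability of essentially 1 for one class, each wrong sample costs -log(1e-7) and each correct sample costs almost nothing.

```python
import math

EPS = 1e-7  # assumed clipping epsilon (the Keras backend default)


def saturated_loss(num_samples, num_correct, eps=EPS):
    """Mean cross-entropy when every prediction is a clipped 0/1 probability.

    Wrong samples contribute -log(eps); correct samples contribute
    -log(1 - eps), which is negligible and is ignored here.
    """
    num_wrong = num_samples - num_correct
    return num_wrong * -math.log(eps) / num_samples


# Correct counts inferred from the logged accuracies (an assumption):
# 0.7491 * 546 is about 409, and 0.7644 * 365 is about 279.
loss_dataset_2 = saturated_loss(546, 409)
loss_dataset_1 = saturated_loss(365, 279)

print(round(loss_dataset_2, 4))
print(round(loss_dataset_1, 4))
```

Both values agree with the logged losses to four decimals, which is consistent with (though not proof of) the network emitting a constant, fully saturated prediction from the second call onward. In that state the gradients are effectively zero, so further calls to fit would not be expected to change anything. The learning rate of 0.1 is one plausible suspect.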
I have built a tensorflow model and am getting no change in my validation accuracy in different epochs, which makes me believe there is something wrong in my setup. Below is my code.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import regularizers
import tensorflow as tf
model = Sequential()
model.add(Conv2D(16, (3, 3), input_shape=(299, 299,3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(32, (3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(64, (3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(64, (3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
# this converts our 3D feature maps to 1D feature vectors
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
batch_size=32
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    # shear_range=0.2,
    # zoom_range=0.2,
    horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
    'Documents/Training',  # this is the target directory
    target_size=(299, 299),  # all images will be resized to 299x299
    batch_size=batch_size,
    class_mode='binary')  # since we use binary_crossentropy loss, we need binary labels
# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
    'Documents/Dev',
    target_size=(299, 299),
    batch_size=batch_size,
    class_mode='binary')
#w1 = tf.Variable(tf.truncated_normal([784, 30], stddev=0.1))
model.fit_generator(
    train_generator,
    steps_per_epoch=50 // batch_size,
    verbose=1,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=8 // batch_size)
Running this produces the following output. Am I missing anything in my architecture or data generation steps? I have looked at "Tensorflow model accuracy not increasing" and "accuracy not increasing in tensorflow model", to no avail so far.
Epoch 1/10
3/3 [==============================] - 2s 593ms/step - loss: 0.6719 - accuracy: 0.6250 - val_loss: 0.8198 - val_accuracy: 0.5000
Epoch 2/10
3/3 [==============================] - 2s 607ms/step - loss: 0.6521 - accuracy: 0.6667 - val_loss: 0.8518 - val_accuracy: 0.5000
Epoch 3/10
3/3 [==============================] - 2s 609ms/step - loss: 0.6752 - accuracy: 0.6250 - val_loss: 0.7129 - val_accuracy: 0.5000
Epoch 4/10
3/3 [==============================] - 2s 611ms/step - loss: 0.6841 - accuracy: 0.6250 - val_loss: 0.7010 - val_accuracy: 0.5000
Epoch 5/10
3/3 [==============================] - 2s 608ms/step - loss: 0.6977 - accuracy: 0.5417 - val_loss: 0.6551 - val_accuracy: 0.5000
Epoch 6/10
3/3 [==============================] - 2s 607ms/step - loss: 0.6508 - accuracy: 0.7083 - val_loss: 0.5752 - val_accuracy: 0.5000
Epoch 7/10
3/3 [==============================] - 2s 615ms/step - loss: 0.6596 - accuracy: 0.6875 - val_loss: 0.9326 - val_accuracy: 0.5000
Epoch 8/10
3/3 [==============================] - 2s 604ms/step - loss: 0.7022 - accuracy: 0.6458 - val_loss: 0.6976 - val_accuracy: 0.5000
Epoch 9/10
3/3 [==============================] - 2s 591ms/step - loss: 0.6331 - accuracy: 0.7292 - val_loss: 0.9571 - val_accuracy: 0.5000
Epoch 10/10
3/3 [==============================] - 2s 595ms/step - loss: 0.6085 - accuracy: 0.7292 - val_loss: 0.6029 - val_accuracy: 0.5000
Out[24]: <keras.callbacks.callbacks.History at 0x1ee4e3a8f08>
You are setting the training steps per epoch to 50 // 32 = 1. Do you have only 50 training images? Similarly, for validation you have steps = 8 // 32 = 0. Do you have only 8 validation images? When you run the program, how many images do the training and validation generators report finding? You will need more images than that. Try setting your batch size to 1.
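The integer division above is easy to check in plain Python. This is a minimal sketch: the sample counts 50 and 8 are the literals from the question's fit_generator call, not the real sizes of the image folders, and the helper names are made up for illustration. Rounding up instead of down is one common way to make sure a partial last batch is still used.

```python
import math


def steps_floor(num_samples, batch_size):
    """Steps as computed in the question: floor division drops the partial batch."""
    return num_samples // batch_size


def steps_ceil(num_samples, batch_size):
    """Steps rounded up, so a partial final batch still counts as a step."""
    return math.ceil(num_samples / batch_size)


batch_size = 32
print(steps_floor(50, batch_size), steps_floor(8, batch_size))
print(steps_ceil(50, batch_size), steps_ceil(8, batch_size))
```

With a batch size of 32, floor division yields zero validation steps for 8 samples, so no validation batch is ever drawn. Rounding up yields one step.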
I'm working on my final project and I'm new to ConvNets. I want to classify images as genuine or spoofed. I have about 8000 images in total. Here is part of my training log.
Epoch 7/100
311/311 [==============================] - 20s 63ms/step - loss: 0.3274 - accuracy: 0.8675 - val_loss: 0.2481 - val_accuracy: 0.9002
Epoch 8/100
311/311 [==============================] - 20s 63ms/step - loss: 0.3189 - accuracy: 0.8691 - val_loss: 0.3015 - val_accuracy: 0.8684
Epoch 9/100
311/311 [==============================] - 19s 62ms/step - loss: 0.3201 - accuracy: 0.8667 - val_loss: 0.2460 - val_accuracy: 0.9036
Epoch 10/100
311/311 [==============================] - 19s 62ms/step - loss: 0.3063 - accuracy: 0.8723 - val_loss: 0.2752 - val_accuracy: 0.8901
Epoch 11/100
311/311 [==============================] - 19s 62ms/step - loss: 0.3086 - accuracy: 0.8749 - val_loss: 0.2717 - val_accuracy: 0.8988
[INFO] evaluating network...
model = Sequential()
inputShape = (height, width, depth)
chanDim = -1
if K.image_data_format() == "channels_first":
    inputShape = (depth, height, width)
    chanDim = 1
model.add(Conv2D(16, (3, 3), padding="same", input_shape=inputShape))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(Conv2D(16, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))
model.add(Conv2D(32, (5, 5), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Dropout(0.6))
model.add(Dense(classes))
model.add(Activation("softmax"))
The input is 32x32 and there are two classes. I use EarlyStopping in Keras to prevent overfitting. I have tried changing the learning rate and the number of neurons, but training always stops before epoch 20. Any advice on preventing overfitting? I'm a beginner with convolutional neural networks. Thanks in advance!
PS LR: 0.001 BS: 20 EPOCHS: 100
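To see why training can stop this early, it may help to trace patience-based early stopping by hand. The sketch below is a plain-Python imitation of the idea, not the Keras EarlyStopping implementation. The function name `stopping_index` and the patience values are assumptions, since the question does not state which patience was used. The validation losses are the five values from the log above (epochs 7 to 11).

```python
def stopping_index(val_losses, patience, min_delta=0.0):
    """Return the index at which training would stop, or None if it never does.

    Training stops once `patience` consecutive epochs pass without the
    validation loss improving on the best value seen so far.
    """
    best = float("inf")
    wait = 0
    for i, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return i
    return None


# val_loss values for epochs 7..11 from the training log above
logged = [0.2481, 0.3015, 0.2460, 0.2752, 0.2717]

print(stopping_index(logged, patience=2))
print(stopping_index(logged, patience=5))
```

Under these assumptions, a patience of 2 stops at the last logged epoch because validation loss fluctuates from epoch to epoch, while a larger patience lets training continue. A noisy validation loss combined with a small patience is therefore one possible explanation for the early stop, separate from any overfitting.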
I'm learning deep learning in keras and I have a problem.
The loss isn't decreasing and it's very high, about 650.
I'm working on the MNIST dataset from tensorflow.keras.datasets.mnist.
There is no error; my NN just isn't learning.
Here is my model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
import tensorflow.nn as tfnn
inputdim = 28 * 28
model = Sequential()
model.add(Flatten())
model.add(Dense(inputdim, activation = tfnn.relu))
model.add(Dense(128, activation = tfnn.relu))
model.add(Dense(10, activation = tfnn.softmax))
model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(X_train, Y_train, epochs = 4)
and my output:
Epoch 1/4
60000/60000 [==============================] - 32s 527us/sample - loss: 646.0926 - acc: 6.6667e-05
Epoch 2/4
60000/60000 [==============================] - 39s 652us/sample - loss: 646.1003 - acc: 0.0000e+00
Epoch 3/4
60000/60000 [==============================] - 35s 590us/sample - loss: 646.1003 - acc: 0.0000e+00
Epoch 4/4
60000/60000 [==============================] - 33s 544us/sample - loss: 646.1003 - acc: 0.0000e+00
OK, I added BatchNormalization between the layers and changed the loss function to 'sparse_categorical_crossentropy'. This is what my NN looks like now:
model = Sequential()
model.add(Flatten())
model.add(BatchNormalization(axis = 1, momentum = 0.99))
model.add(Dense(inputdim, activation = tfnn.relu))
model.add(BatchNormalization(axis = 1, momentum = 0.99))
model.add(Dense(128, activation = tfnn.relu))
model.add(BatchNormalization(axis = 1, momentum = 0.99))
model.add(Dense(10, activation = tfnn.softmax))
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
and these are the results:
Epoch 1/4
60000/60000 [==============================] - 68s 1ms/sample - loss: 0.2045 - acc: 0.9374
Epoch 2/4
60000/60000 [==============================] - 55s 916us/sample - loss: 0.1007 - acc: 0.9689
Thanks for your help!
You could try the sparse_categorical_crossentropy loss function. Also, what is your batch size? And, as has already been suggested, you may want to increase the number of epochs.
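The difference between the two losses is only in the label format they expect: categorical cross-entropy takes one-hot vectors, while the sparse variant takes integer class indices. The following is a plain-Python sketch of the per-sample math, not the Keras implementation. The function names mirror the Keras loss names for readability, and the example probabilities are made up.

```python
import math


def to_one_hot(label, num_classes):
    """Turn an integer class index into a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def categorical_crossentropy(one_hot, probs):
    """Cross-entropy for a one-hot target and predicted class probabilities."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def sparse_categorical_crossentropy(label, probs):
    """Cross-entropy for an integer target: pick the log-prob of that class."""
    return -math.log(probs[label])


probs = [0.1, 0.7, 0.2]  # hypothetical softmax output for three classes
label = 1                # integer label, as MNIST labels are when loaded

dense = categorical_crossentropy(to_one_hot(label, 3), probs)
sparse = sparse_categorical_crossentropy(label, probs)
print(dense, sparse)
```

Both calls give the same value when the labels are encoded the way each loss expects. If Y_train holds raw digit labels, the sparse variant matches them directly. The alternative is to one-hot encode the labels before using categorical_crossentropy.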
I'm running Keras with a tensorflow-backend.
I try to predict images.
My model looks like this:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(50, 50, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(1, activation='softmax'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
Running my code, it produces this output over 10 epochs:
Epoch 1/10
24946/24946 [==============================] - 36s 1ms/sample - loss: 7.9693 - acc: 0.5001
Epoch 2/10
24946/24946 [==============================] - 35s 1ms/sample - loss: 7.9693 - acc: 0.5001
...
Epoch 9/10
24946/24946 [==============================] - 30s 1ms/sample - loss: 7.9693 - acc: 0.5001
Epoch 10/10
24946/24946 [==============================] - 30s 1ms/sample - loss: 7.9693 - acc: 0.5001
1/1 [==============================] - 0s 36ms/step
[[1.]]
Anyhow, I do not understand why the accuracy is always 0.5001 over all 10 epochs.
My question is: Why does the accuracy not change within any epoch?
This part of your code makes no sense:
model.add(Dense(1, activation='softmax'))
Softmax with only one neuron will always produce a constant output of 1.0, due to the normalization. If you want to do binary classification, you should use a sigmoid activation at the output.
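The normalization argument can be checked numerically. This is a small plain-Python sketch of the two activations (illustrative helper functions, not the Keras ones): softmax divides each exponential by the sum of all of them, so with a single logit the ratio is always 1, whatever the logit is.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic sigmoid: maps a single logit to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


for logit in (-5.0, 0.0, 3.0):
    # One-element softmax is always [1.0]; sigmoid varies with the logit.
    print(logit, softmax([logit]), sigmoid(logit))
```

Because the single-unit softmax output never depends on the weights, there is nothing for the optimizer to adjust, which matches the constant loss and accuracy in the log.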
I'm creating a very simple 2 layer feed forward network but am finding that the loss is not updating at all. I have some ideas but I wanted to get additional feedback/guidance.
Details about the data:
X_train:
(336876, 158)
X_dev:
(42109, 158)
Y_train counts:
0 285793
1 51083
Name: default, dtype: int64
Y_dev counts:
0 35724
1 6385
Name: default, dtype: int64
And here is my model architecture:
# define the architecture of the network
model = Sequential()
model.add(Dense(50, input_dim=X_train.shape[1], init="uniform", activation="relu"))
model.add(Dense(3print("[INFO] compiling model...")
adam = Adam(lr=0.01)
model.compile(loss="binary_crossentropy", optimizer=adam,
metrics=['accuracy'])
model.fit(np.array(X_train), np.array(Y_train), epochs=12, batch_size=128, verbose=1)Dense(1, activation = 'sigmoid'))
Now, with this, my loss after the first few epochs are as follows:
Epoch 1/12
336876/336876 [==============================] - 8s - loss: 2.4441 - acc: 0.8484
Epoch 2/12
336876/336876 [==============================] - 7s - loss: 2.4441 - acc: 0.8484
Epoch 3/12
336876/336876 [==============================] - 6s - loss: 2.4441 - acc: 0.8484
Epoch 4/12
336876/336876 [==============================] - 7s - loss: 2.4441 - acc: 0.8484
Epoch 5/12
336876/336876 [==============================] - 7s - loss: 2.4441 - acc: 0.8484
Epoch 6/12
336876/336876 [==============================] - 7s - loss: 2.4441 - acc: 0.8484
Epoch 7/12
336876/336876 [==============================] - 7s - loss: 2.4441 - acc: 0.8484
Epoch 8/12
336876/336876 [==============================] - 6s - loss: 2.4441 - acc: 0.8484
Epoch 9/12
336876/336876 [==============================] - 6s - loss: 2.4441 - acc: 0.8484
And when I test the model after, my f1_score is 0. My main thought was that I may need more data but I'd still expect it to perform better than it is now on the test set. Could it be that it is overfitting? I added Dropout but no luck there either.
Any help would be much appreciated.
At first glance, I believe that your learning rate is too high. Also, consider normalizing your data, especially if different features have different ranges of values (look at scaling). Also consider choosing your output activation depending on whether your labels are multi-class or not. Assuming your code is of this form (you seem to have some typos in the problem description):
# define the architecture of the network
model = Sequential()
# Also, what is the init="uniform" argument? I did not find it in the Keras documentation; consider removing it.
model.add(Dense(50, input_dim=X_train.shape[1], init="uniform",
                activation="relu"))
model.add(Dense(1, activation='sigmoid'))
# A slightly more conservative learning rate; play around with this.
adam = Adam(lr=0.0001)
model.compile(loss="binary_crossentropy", optimizer=adam,
              metrics=['accuracy'])
model.fit(np.array(X_train), np.array(Y_train), epochs=12, batch_size=128,
          verbose=1)
This should lead the loss to converge. If not, please consider deepening your neural net (think about how many parameters you may need).
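As a side check on the numbers in the question, the constant accuracy can be compared with a majority-class baseline. The sketch below uses only the class counts posted in the question. The helper `f1_from_counts` is a made-up name, and the "always predict 0" classifier is an assumption used for comparison, not something the logs prove.

```python
# Class counts copied from the question
train_counts = {0: 285793, 1: 51083}
dev_counts = {0: 35724, 1: 6385}


def f1_from_counts(tp, fp, fn):
    """F1 score from confusion-matrix counts; defined as 0 when undefined."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Accuracy of a classifier that always predicts the majority class (0)
baseline_acc = train_counts[0] / sum(train_counts.values())

# The same classifier on the dev set: no true or false positives,
# and every actual positive is a false negative.
baseline_f1 = f1_from_counts(tp=0, fp=0, fn=dev_counts[1])

print(round(baseline_acc, 4), baseline_f1)
```

The baseline accuracy rounds to the 0.8484 shown in every epoch, and the baseline F1 is 0, the same as the reported test result. That is consistent with the network predicting class 0 for every sample rather than overfitting, so the learning rate and feature scaling suggested above seem like reasonable first things to try.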
Consider adding the classification layer before compiling your model.
model.add(Dense(1, activation = 'sigmoid'))
adam = Adam(lr=0.01)
model.compile(loss="binary_crossentropy", optimizer=adam,
              metrics=['accuracy'])
model.fit(np.array(X_train), np.array(Y_train), epochs=12, batch_size=128, verbose=1)