I am building a Deep Learning model for regression:
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=np.shape(X_train)[1:]),
    keras.layers.Conv1D(filters=30, kernel_size=3, activation=tf.nn.tanh),
    keras.layers.Dropout(0.1),
    keras.layers.AveragePooling1D(pool_size=2),
    keras.layers.Conv1D(filters=20, kernel_size=3, activation=tf.nn.tanh),
    keras.layers.Dropout(0.1),
    keras.layers.AveragePooling1D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dense(30, activation=tf.nn.tanh),
    keras.layers.Dense(20, activation=tf.nn.tanh),
    keras.layers.Dense(10, activation=tf.nn.tanh),
    keras.layers.Dense(3)
])
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
model.fit(
    X_train,
    Y_train,
    epochs=300,
    batch_size=32,
    validation_split=0.2,
    shuffle=True,
    callbacks=[early_stopping]
)
During training, the loss function (and MAE) exhibit this strange behavior:
What does this trend indicate? Could it mean that the model is overfitting?
It looks to me like your optimizer is changing (decreasing) the learning rate at the points where the curve suddenly bends.
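To check whether the learning rate really changes at those points, you could log it at the end of each epoch. A minimal sketch, assuming the compiled model, data, and early_stopping callback from the question; note that this only prints Adam's base learning rate, not the per-parameter step sizes it adapts internally:
from tensorflow import keras

# Print the optimizer's base learning rate after every epoch.
log_lr = keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: print(
        "epoch", epoch, "lr =",
        float(keras.backend.get_value(model.optimizer.learning_rate)))
)

model.fit(X_train, Y_train, epochs=300, batch_size=32,
          validation_split=0.2, shuffle=True,
          callbacks=[early_stopping, log_lr])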
I think there is an issue with your dataset. Your training and validation losses are exactly the same value, which is practically impossible.
Please check your dataset and shuffle it before splitting.
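A minimal sketch of shuffling before splitting, assuming X and Y are the full NumPy arrays (placeholder names). Note that validation_split in model.fit always takes the last fraction of the data without shuffling, which is why shuffling beforehand matters:
from sklearn.model_selection import train_test_split

# Shuffle and split in one step; train_test_split shuffles by default.
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.2, shuffle=True, random_state=42)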
I am working on time series classification using an LSTM model. Here is the architecture:
np.random.seed(16)
python_random.seed(17)
tf.random.set_seed(18)
model = Sequential()
model.add(LSTM(128, input_shape=(50, 5), return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss=tfa.losses.SigmoidFocalCrossEntropy(),
              optimizer=adam,
              metrics=[tf.keras.metrics.AUC(name='auc'),
                       tf.keras.metrics.binary_accuracy,
                       tf.keras.metrics.Recall()])
np.random.seed(25)
python_random.seed(26)
tf.random.set_seed(27)
keras_callbacks = [
    EarlyStopping(monitor='val_loss', patience=20, mode='min'),
    ModelCheckpoint('1LSTM_4_4_2022.h5', monitor='val_loss', save_best_only=True, mode='min')
]
history = model.fit(X_train, y_train, batch_size=256, verbose=1,
                    validation_data=(X_val, y_val), epochs=100,
                    class_weight=class_weights, callbacks=keras_callbacks)
I used early stopping to avoid overfitting. However, I don't understand why I am seeing this oscillatory loss. My dataset is severely imbalanced, with an imbalance ratio of 11500:1, so I used class_weight to handle the imbalance. The class distribution was the same across the train, validation, and test sets. How can I explain this loss curve?
The ROC-AUC plot, however, looked fine. I don't know what I am missing here; I would appreciate your explanations.
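For reference, a minimal sketch of how class weights like the ones passed via class_weight are commonly computed with scikit-learn, assuming y_train holds the 0/1 labels (this mirrors, but does not necessarily reproduce, how your class_weights were built):
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# 'balanced' weights each class inversely proportional to its frequency,
# so the rare positive class gets a much larger weight.
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weights = {int(c): w for c, w in zip(classes, weights)}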
I am trying to use Keras to train a simple feedforward network. I tried two different versions of what I think is the same network, but one performs significantly better. The first (and better-performing) one is the following:
inputs = keras.Input(shape=(384,))
dense = layers.Dense(64, activation="relu")
x = dense(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(384)(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="simple_model")
model.compile(loss='mse',optimizer='Adam')
history = model.fit(X_train,
                    y_train_tf,
                    epochs=20,
                    validation_data=(X_test, y_test),
                    steps_per_epoch=100,
                    validation_steps=50)
and it settles on a validation loss of about 0.2. The second model performs much worse:
model = keras.models.Sequential()
model.add(Dense(64, input_shape=(384,), activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(384, activation='relu'))
optimizer = tf.keras.optimizers.Adam()
model.compile(loss='mse', optimizer=optimizer)
history = model.fit(X_train,
                    y_train_tf,
                    epochs=20,
                    validation_data=(X_test, y_test),
                    steps_per_epoch=100,
                    validation_steps=50)
and this one has a validation loss of around 5. But when I call model.summary(), they look virtually the same. Is there something wrong with the second model?
I am not sure they are the same, since the second model has a relu activation after the last layer (384 units) and the first doesn't. This might be the issue, since the default activation of a Keras Dense layer is None (i.e. linear): the first model's output is unconstrained, while the second's is clipped at zero by relu.
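A minimal sketch of the second model with a linear output, which should make it equivalent to the functional version (assuming the same data and training setup):
from tensorflow import keras
from tensorflow.keras import layers

model = keras.models.Sequential([
    layers.Dense(64, input_shape=(384,), activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(384),  # no activation: linear output, like the functional model
])
model.compile(loss='mse', optimizer=keras.optimizers.Adam())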
I'm working on a simple Keras Sequential model and trying to test different combinations of hyperparameters. Is there a way to try all possible combinations of these hyperparameters automatically and have it return the best combination?
Here's my keras model:
model = Sequential()
input_neurons = 70
model.add(LSTM(input_neurons, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(LeakyReLU(alpha=0.5))
model.add(Dropout(0.1))
model.add(Dense(1))
optimizer = RMSprop(learning_rate=0.00134)
model.compile(loss=loss_func, optimizer=optimizer)
history = model.fit(
    train_X,
    train_y,
    epochs=200, batch_size=72,
    validation_data=(test_X, test_y),
    verbose=2, shuffle=False)
Yes, you can try hyperas and talos, for example, but there are others too. Just look up automatic hyperparameter optimization and you will surely find more options.
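If you prefer to stay with plain Keras, a brute-force grid search is also easy to write by hand. A minimal sketch, assuming train_X, train_y, test_X, and test_y from the question; the hyperparameter values are placeholders and 'mse' stands in for your loss_func:
import itertools
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop

units_grid = [35, 70, 140]        # placeholder values
lr_grid = [1e-3, 1.34e-3, 5e-3]   # placeholder values
dropout_grid = [0.1, 0.3]         # placeholder values

results = []
for units, lr, dropout in itertools.product(units_grid, lr_grid, dropout_grid):
    model = keras.Sequential([
        layers.LSTM(units, input_shape=(train_X.shape[1], train_X.shape[2])),
        layers.LeakyReLU(alpha=0.5),
        layers.Dropout(dropout),
        layers.Dense(1),
    ])
    model.compile(loss='mse', optimizer=RMSprop(learning_rate=lr))
    history = model.fit(train_X, train_y, epochs=50, batch_size=72,
                        validation_data=(test_X, test_y), verbose=0, shuffle=False)
    results.append(((units, lr, dropout), min(history.history['val_loss'])))

# Pick the combination with the lowest validation loss.
best_params, best_val_loss = min(results, key=lambda r: r[1])
print("best:", best_params, "val_loss:", best_val_loss)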
I am a newbie to Keras and machine learning in general. I'm trying to build a classification model using the Sequential API. After some experiments, I see that my validation accuracy stays very low and does not increase, although the training accuracy looks fine. I added regularization to the layers and dropout between the layers, but the behavior persists. Here's my code.
from keras.regularizers import l2
model = keras.models.Sequential()
model.add(keras.layers.Conv1D(filters=32, kernel_size=1, strides=1, padding="SAME", activation="relu", input_shape=[512,1], kernel_regularizer=keras.regularizers.l2(l=0.1)))  # be sure to specify the input shape
keras.layers.Dropout=0.35
model.add(keras.layers.MaxPool1D(pool_size=1,activity_regularizer=l2(0.01)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(256, activation="softmax",activity_regularizer=l2(0.01)))
model.compile(loss="sparse_categorical_crossentropy",
optimizer="adam",
metrics=["accuracy"])
history = model.fit(train_x, trainy, epochs=300,
                    validation_split=0.2,
                    batch_size=16)
And here are the final results I got.
What is the reason behind this? How do I fine-tune the model?
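One thing worth double-checking in the code above: keras.layers.Dropout = 0.35 only rebinds the Dropout name, it never adds a dropout layer to the model. A minimal sketch of how dropout is usually inserted as a layer, keeping the rest of the architecture from the question (the 256 output units are kept as-is, but they must equal the number of classes for sparse_categorical_crossentropy):
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Conv1D(filters=32, kernel_size=1, strides=1, padding="same",
                        activation="relu", input_shape=[512, 1],
                        kernel_regularizer=keras.regularizers.l2(0.1)),
    keras.layers.Dropout(0.35),   # dropout added as a layer, not by assignment
    keras.layers.MaxPool1D(pool_size=1),
    keras.layers.Flatten(),
    # the 256 units here must match the number of classes in the labels
    keras.layers.Dense(256, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])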
I have the following Keras model, although it could be generalised to a normal RNN using GRUs.
model = Sequential()
model.add(GRU(40, batch_input_shape=(batch_size, look_back, 1), stateful=True, return_sequences=True))
model.add(GRU(10, batch_input_shape=(batch_size, look_back, features), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# Train model
iter = 10000
for i in range(iter):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
    if i < (iter - 1):
        model.reset_states()
testPred = model.predict(testX, batch_size=batch_size)
print(mean_squared_error(testY, testPred))
If I don't have the if statement that skips resetting the state on the last iteration, the mean squared error is always higher. Considering that the test set comes right after the training set, wouldn't it make sense to preserve the state of the last memory block?
This tutorial seems to suggest otherwise: http://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/ (i.e. it simply doesn't have that if statement and doesn't explicitly mention anything about keeping the last state).
So just wondering if I am correct about this.
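One way to settle this empirically is to run both variants side by side and compare the test MSE, which is essentially the experiment described above. A minimal sketch, assuming trainX, trainY, testX, testY, batch_size, and a hypothetical build_model() helper that constructs the GRU model shown earlier:
from sklearn.metrics import mean_squared_error

def train_and_score(preserve_final_state, n_passes=100):  # n_passes is a placeholder
    model = build_model()  # hypothetical helper that builds the GRU model above
    for i in range(n_passes):
        model.fit(trainX, trainY, epochs=1, batch_size=batch_size,
                  verbose=0, shuffle=False)
        # Reset between passes; on the last pass, optionally keep the state
        # so prediction on the test set continues from the end of training.
        if i < n_passes - 1 or not preserve_final_state:
            model.reset_states()
    testPred = model.predict(testX, batch_size=batch_size)
    return mean_squared_error(testY, testPred)

print("state preserved:", train_and_score(True))
print("state reset:    ", train_and_score(False))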