I am building a Deep Learning model for regression:
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=np.shape(X_train)[1:]),
    keras.layers.Conv1D(filters=30, kernel_size=3, activation=tf.nn.tanh),
    keras.layers.Dropout(0.1),
    keras.layers.AveragePooling1D(pool_size=2),
    keras.layers.Conv1D(filters=20, kernel_size=3, activation=tf.nn.tanh),
    keras.layers.Dropout(0.1),
    keras.layers.AveragePooling1D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dense(30, activation=tf.nn.tanh),
    keras.layers.Dense(20, activation=tf.nn.tanh),
    keras.layers.Dense(10, activation=tf.nn.tanh),
    keras.layers.Dense(3)
])
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
model.fit(
    X_train,
    Y_train,
    epochs=300,
    batch_size=32,
    validation_split=0.2,
    shuffle=True,
    callbacks=[early_stopping]
)
During training, the loss function (and MAE) exhibit this strange behavior:
What does this trend indicate? Could it mean that the model is overfitting?
It looks to me like your optimizer is changing (decreasing) the learning rate at the points where the curve suddenly bends.
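To check whether the learning rate really changes at those points, you could log it at the end of each epoch. A minimal sketch, assuming the compiled model, data, and early_stopping callback from the question; note that this only prints Adam's base learning rate, not the per-parameter step sizes it adapts internally:
from tensorflow import keras

# Print the optimizer's base learning rate after every epoch.
log_lr = keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: print(
        "epoch", epoch, "lr =",
        float(keras.backend.get_value(model.optimizer.learning_rate)))
)

model.fit(X_train, Y_train, epochs=300, batch_size=32,
          validation_split=0.2, shuffle=True,
          callbacks=[early_stopping, log_lr])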
I think there is an issue with your dataset. Your training and validation losses are exactly the same value, which is practically impossible.
Please check your dataset and shuffle it before splitting.
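A minimal sketch of shuffling before splitting, assuming X and Y are the full NumPy arrays (placeholder names). Note that validation_split in model.fit always takes the last fraction of the data without shuffling, which is why shuffling beforehand matters:
from sklearn.model_selection import train_test_split

# Shuffle and split in one step; train_test_split shuffles by default.
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.2, shuffle=True, random_state=42)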
I am working on time series classification using an LSTM model. Here is the architecture:
np.random.seed(16)
python_random.seed(17)
tf.random.set_seed(18)
model = Sequential()
model.add(LSTM(128, input_shape=(50, 5), return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss=tfa.losses.SigmoidFocalCrossEntropy(),
              optimizer=adam,
              metrics=[tf.keras.metrics.AUC(name='auc'),
                       tf.keras.metrics.binary_accuracy,
                       tf.keras.metrics.Recall()])
np.random.seed(25)
python_random.seed(26)
tf.random.set_seed(27)
keras_callbacks = [
    EarlyStopping(monitor='val_loss', patience=20, mode='min'),
    ModelCheckpoint('1LSTM_4_4_2022.h5', monitor='val_loss', save_best_only=True, mode='min')
]
history = model.fit(X_train, y_train, batch_size=256, verbose=1,
                    validation_data=(X_val, y_val), epochs=100,
                    class_weight=class_weights, callbacks=keras_callbacks)
I used early stopping to avoid overfitting. However, I don't understand why I am seeing this oscillatory loss. My dataset is severely imbalanced, with an imbalance ratio of 11500:1, so I used class_weight to handle the imbalance. The class distribution was the same across the train, validation, and test sets. How can I explain this loss curve?
The ROC-AUC plot, however, looked fine. I don't know what I am missing here; I would appreciate your explanations.
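For reference, a minimal sketch of how class weights like the ones passed via class_weight are commonly computed with scikit-learn, assuming y_train holds the 0/1 labels (this mirrors, but does not necessarily reproduce, how your class_weights were built):
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# 'balanced' weights each class inversely proportional to its frequency,
# so the rare positive class gets a much larger weight.
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weights = {int(c): w for c, w in zip(classes, weights)}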
I am trying to use Keras to train a simple feedforward network. I tried two different versions of what I think is the same network, but one performs significantly better. The first (and better-performing) one is the following:
inputs = keras.Input(shape=(384,))
dense = layers.Dense(64, activation="relu")
x = dense(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(384)(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="simple_model")
model.compile(loss='mse',optimizer='Adam')
history = model.fit(X_train,
                    y_train_tf,
                    epochs=20,
                    validation_data=(X_test, y_test),
                    steps_per_epoch=100,
                    validation_steps=50)
and it settles on a validation loss of about 0.2. The second model performs much worse:
model = keras.models.Sequential()
model.add(Dense(64, input_shape=(384,), activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(384, activation='relu'))
optimizer = tf.keras.optimizers.Adam()
model.compile(loss='mse', optimizer=optimizer)
history = model.fit(X_train,
                    y_train_tf,
                    epochs=20,
                    validation_data=(X_test, y_test),
                    steps_per_epoch=100,
                    validation_steps=50)
and this one has a validation loss of around 5. But when I call model.summary(), they look virtually the same. Is there something wrong with the second model?
I am not sure they are the same, since the second model has a relu activation after the last layer (384 units) and the first doesn't. This might be the issue, since the default activation of a Keras Dense layer is None (i.e. linear): the first model's output is unconstrained, while the second's is clipped at zero by relu.
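A minimal sketch of the second model with a linear output, which should make it equivalent to the functional version (assuming the same data and training setup):
from tensorflow import keras
from tensorflow.keras import layers

model = keras.models.Sequential([
    layers.Dense(64, input_shape=(384,), activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(384),  # no activation: linear output, like the functional model
])
model.compile(loss='mse', optimizer=keras.optimizers.Adam())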
I'm working on a simple Keras Sequential model and trying to test different combinations of hyperparameters. Is there a way to try all possible combinations of these hyperparameters automatically and have it return the best combination?
Here's my keras model:
model = Sequential()
input_neurons = 70
model.add(LSTM(input_neurons, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(LeakyReLU(alpha=0.5))
model.add(Dropout(0.1))
model.add(Dense(1))
optimizer = RMSprop(learning_rate=0.00134)
model.compile(loss=loss_func, optimizer=optimizer)
history = model.fit(
    train_X,
    train_y,
    epochs=200, batch_size=72,
    validation_data=(test_X, test_y),
    verbose=2, shuffle=False)
Yes, you can try hyperas and talos, for example, but there are others too. Just look up automatic hyperparameter optimization and you will surely find more options.
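If you prefer to stay with plain Keras, a brute-force grid search is also easy to write by hand. A minimal sketch, assuming train_X, train_y, test_X, and test_y from the question; the hyperparameter values are placeholders and 'mse' stands in for your loss_func:
import itertools
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop

units_grid = [35, 70, 140]        # placeholder values
lr_grid = [1e-3, 1.34e-3, 5e-3]   # placeholder values
dropout_grid = [0.1, 0.3]         # placeholder values

results = []
for units, lr, dropout in itertools.product(units_grid, lr_grid, dropout_grid):
    model = keras.Sequential([
        layers.LSTM(units, input_shape=(train_X.shape[1], train_X.shape[2])),
        layers.LeakyReLU(alpha=0.5),
        layers.Dropout(dropout),
        layers.Dense(1),
    ])
    model.compile(loss='mse', optimizer=RMSprop(learning_rate=lr))
    history = model.fit(train_X, train_y, epochs=50, batch_size=72,
                        validation_data=(test_X, test_y), verbose=0, shuffle=False)
    results.append(((units, lr, dropout), min(history.history['val_loss'])))

# Pick the combination with the lowest validation loss.
best_params, best_val_loss = min(results, key=lambda r: r[1])
print("best:", best_params, "val_loss:", best_val_loss)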
I am a newbie to Keras and machine learning in general. I'm trying to build a classification model using the Sequential API. After some experiments, I see that my validation accuracy stays very low and does not increase, although the training accuracy looks fine. I added regularization to the layers and dropout between the layers, but the behavior persists. Here's my code.
from keras.regularizers import l2
model = keras.models.Sequential()
model.add(keras.layers.Conv1D(filters=32, kernel_size=1, strides=1, padding="SAME", activation="relu", input_shape=[512,1], kernel_regularizer=keras.regularizers.l2(l=0.1)))  # be sure to specify the input shape
keras.layers.Dropout=0.35
model.add(keras.layers.MaxPool1D(pool_size=1,activity_regularizer=l2(0.01)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(256, activation="softmax",activity_regularizer=l2(0.01)))
model.compile(loss="sparse_categorical_crossentropy",
optimizer="adam",
metrics=["accuracy"])
history = model.fit(train_x, trainy, epochs=300,
                    validation_split=0.2,
                    batch_size=16)
And here are the final results I got.
What is the reason behind this? How do I fine-tune the model?
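One thing worth double-checking in the code above: keras.layers.Dropout = 0.35 only rebinds the Dropout name, it never adds a dropout layer to the model. A minimal sketch of how dropout is usually inserted as a layer, keeping the rest of the architecture from the question (the 256 output units are kept as-is, but they must equal the number of classes for sparse_categorical_crossentropy):
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Conv1D(filters=32, kernel_size=1, strides=1, padding="same",
                        activation="relu", input_shape=[512, 1],
                        kernel_regularizer=keras.regularizers.l2(0.1)),
    keras.layers.Dropout(0.35),   # dropout added as a layer, not by assignment
    keras.layers.MaxPool1D(pool_size=1),
    keras.layers.Flatten(),
    # the 256 units here must match the number of classes in the labels
    keras.layers.Dense(256, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])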
I have the following Keras model, although it could be generalised to a normal RNN using GRUs.
model = Sequential()
model.add(GRU(40, batch_input_shape=(batch_size, look_back, 1), stateful=True, return_sequences=True))
model.add(GRU(10, batch_input_shape=(batch_size, look_back, features), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# Train model
iter = 10000
for i in range(iter):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
    if i < (iter - 1):
        model.reset_states()
testPred = model.predict(testX, batch_size=batch_size)
print(mean_squared_error(testY, testPred))
If I don't have the if statement that skips resetting the state on the last iteration, the mean squared error is always higher. Considering that the test set comes right after the training set, wouldn't it make sense to preserve the state of the last memory block?
This tutorial seems to suggest otherwise: http://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/ (i.e. it simply doesn't have that if statement and doesn't explicitly mention anything about keeping the last state).
So just wondering if I am correct about this.
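One way to settle this empirically is to run both variants side by side and compare the test MSE, which is essentially the experiment described above. A minimal sketch, assuming trainX, trainY, testX, testY, batch_size, and a hypothetical build_model() helper that constructs the GRU model shown earlier:
from sklearn.metrics import mean_squared_error

def train_and_score(preserve_final_state, n_passes=100):  # n_passes is a placeholder
    model = build_model()  # hypothetical helper that builds the GRU model above
    for i in range(n_passes):
        model.fit(trainX, trainY, epochs=1, batch_size=batch_size,
                  verbose=0, shuffle=False)
        # Reset between passes; on the last pass, optionally keep the state
        # so prediction on the test set continues from the end of training.
        if i < n_passes - 1 or not preserve_final_state:
            model.reset_states()
    testPred = model.predict(testX, batch_size=batch_size)
    return mean_squared_error(testY, testPred)

print("state preserved:", train_and_score(True))
print("state reset:    ", train_and_score(False))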