I am new to working with LSTM models, but I have a small network. I extracted MFCC features from my audio files, flattened them, and fed them to the network as input. However, the validation accuracy is stuck between two values and the training accuracy keeps decreasing.
I have used RMSprop with a learning rate of 0.001.
I have tried changing the optimizer and adding dropout and batch normalization.
The dataset is also evenly balanced.
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 3460, 1) 0
_________________________________________________________________
cu_dnnlstm_1 (CuDNNLSTM) (None, 3460, 1024) 4206592
_________________________________________________________________
cu_dnnlstm_2 (CuDNNLSTM) (None, 1024) 8396800
_________________________________________________________________
dense_1 (Dense) (None, 512) 524800
_________________________________________________________________
batch_normalization_1 (Batch (None, 512) 2048
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 131328
_________________________________________________________________
batch_normalization_2 (Batch (None, 256) 1024
_________________________________________________________________
dropout_2 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 1) 257
=================================================================
Total params: 13,262,849
Trainable params: 13,261,313
Non-trainable params: 1,536
_________________________________________________________________
Train on 385 samples, validate on 165 samples
Epoch 1/10
385/385 [==============================] - 61s 160ms/step - loss: 1.0811 - accuracy: 0.5143 - val_loss: 0.6917 - val_accuracy: 0.5273
Epoch 2/10
385/385 [==============================] - 55s 142ms/step - loss: 0.7536 - accuracy: 0.5169 - val_loss: 0.6980 - val_accuracy: 0.4727
Epoch 3/10
385/385 [==============================] - 55s 142ms/step - loss: 0.7484 - accuracy: 0.5039 - val_loss: 0.7002 - val_accuracy: 0.4727
Epoch 4/10
385/385 [==============================] - 55s 142ms/step - loss: 0.7333 - accuracy: 0.5091 - val_loss: 0.7030 - val_accuracy: 0.5273
Epoch 5/10
385/385 [==============================] - 55s 142ms/step - loss: 0.7486 - accuracy: 0.4675 - val_loss: 0.6917 - val_accuracy: 0.5273
Epoch 6/10
385/385 [==============================] - 55s 142ms/step - loss: 0.7222 - accuracy: 0.4935 - val_loss: 0.6917 - val_accuracy: 0.5273
Epoch 7/10
385/385 [==============================] - 55s 143ms/step - loss: 0.7208 - accuracy: 0.4883 - val_loss: 0.6919 - val_accuracy: 0.5273
Epoch 8/10
385/385 [==============================] - 55s 142ms/step - loss: 0.7134 - accuracy: 0.4805 - val_loss: 0.6919 - val_accuracy: 0.5273
Epoch 9/10
385/385 [==============================] - 55s 143ms/step - loss: 0.7168 - accuracy: 0.4987 - val_loss: 0.6927 - val_accuracy: 0.5273
Epoch 10/10
385/385 [==============================] - 55s 143ms/step - loss: 0.7089 - accuracy: 0.4909 - val_loss: 0.6926 - val_accuracy: 0.5273
Here is my code:
def build_model():
    input = Input((20*173,1))
    x = Conv1D(filters=16, kernel_size=4, activation='relu')(input)
    x = AveragePooling1D(pool_size=2)(x)
    x = Conv1D(filters=16, kernel_size=3, activation='relu')(x)
    x = AveragePooling1D(pool_size=2)(x)
    x = Flatten()(x)
    x = keras.layers.Reshape((13808, 1))(x)
    x = CuDNNLSTM(1024, return_sequences=True)(x)
    x = CuDNNLSTM(512)(x)
    x = Dense(256,activation='relu')(x)
    x = Dropout(0.3)(x)
    x = Dense(128,activation='relu')(x)
    x = Dropout(0.3)(x)
    x = Dense(1,activation='sigmoid')(x)
    model = Model(inputs=input, outputs=x)
    return model
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.2,patience=3, min_lr=0.001)
opt = RMSprop(lr=0.0001)
m2 = build_model()
m2.compile(loss = "binary_crossentropy", metrics=['accuracy'],optimizer = opt)
m2.fit(X, y, batch_size=16, epochs=10, validation_split=0.3,callbacks = [reduce_lr])
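A note on the two validation accuracy values in the log above: 0.5273 and 0.4727 add up to 1, and with 165 validation samples they match 87/165 and 78/165. That pattern is consistent with (though it does not prove) the network assigning the same class to every validation sample and occasionally flipping which class it picks. The sketch below only checks the arithmetic; the 87/78 class split is an assumption chosen because it reproduces the logged values, not something stated in the post.

```python
# Validation set size taken from the training log ("validate on 165 samples").
n_val = 165

# Hypothetical class split (an assumption): 87 samples of one class, 78 of the other.
n_class_a, n_class_b = 87, 78


def constant_prediction_accuracy(n_correct, n_total):
    """Accuracy obtained when every sample receives the same predicted label."""
    return n_correct / n_total


acc_always_a = constant_prediction_accuracy(n_class_a, n_val)
acc_always_b = constant_prediction_accuracy(n_class_b, n_val)

print(f"always predict class A: {acc_always_a:.4f}")  # 0.5273
print(f"always predict class B: {acc_always_b:.4f}")  # 0.4727
```

If that reading is right, inspecting the raw sigmoid outputs on the validation set would show them all on one side of 0.5.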
I have a large dataset with 2 million rows and 2800 columns, of which 2% is anomalous. Currently there is a label, 0 or 1, that says whether a row is anomalous; the labels were assigned manually by domain experts. I need to turn this into an unsupervised learning problem.
So I started with PyOD's AutoEncoder, as autoencoders work well on high-dimensional data. The problem is that every configuration I tried gave me a high number of false positives. Based on the tutorial, I developed the following autoencoder:
from pyod.models.auto_encoder import AutoEncoder
encoder=AutoEncoder(contamination=0.02,epochs=12,hidden_neurons=[2000,1000,500,500,1000,2000])
data.shape()
encoder.fit(data)
target
0.0 9737
1.0 263
dtype: int64
Model: "sequential_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_54 (Dense) (None, 2869) 8234030
dropout_44 (Dropout) (None, 2869) 0
dense_55 (Dense) (None, 2869) 8234030
dropout_45 (Dropout) (None, 2869) 0
dense_56 (Dense) (None, 2000) 5740000
dropout_46 (Dropout) (None, 2000) 0
dense_57 (Dense) (None, 1000) 2001000
dropout_47 (Dropout) (None, 1000) 0
dense_58 (Dense) (None, 500) 500500
dropout_48 (Dropout) (None, 500) 0
dense_59 (Dense) (None, 500) 250500
dropout_49 (Dropout) (None, 500) 0
dense_60 (Dense) (None, 1000) 501000
dropout_50 (Dropout) (None, 1000) 0
dense_61 (Dense) (None, 2000) 2002000
dropout_51 (Dropout) (None, 2000) 0
dense_62 (Dense) (None, 2869) 5740869
=================================================================
Total params: 33,203,929
Trainable params: 33,203,929
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/12
282/282 [==============================] - 22s 74ms/step - loss: 261.8725 - val_loss: 164.1958
Epoch 2/12
282/282 [==============================] - 21s 73ms/step - loss: 102.1214 - val_loss: 365.0436
Epoch 3/12
282/282 [==============================] - 21s 73ms/step - loss: 54.5027 - val_loss: 598.0752
Epoch 4/12
282/282 [==============================] - 20s 72ms/step - loss: 28.4714 - val_loss: 867.0073
Epoch 5/12
282/282 [==============================] - 20s 72ms/step - loss: 14.0551 - val_loss: 1149.2327
Epoch 6/12
282/282 [==============================] - 20s 72ms/step - loss: 7.2151 - val_loss: 1323.5684
Epoch 7/12
282/282 [==============================] - 20s 73ms/step - loss: 3.8648 - val_loss: 1449.9386
Epoch 8/12
282/282 [==============================] - 20s 72ms/step - loss: 2.7034 - val_loss: 1611.7833
Epoch 9/12
282/282 [==============================] - 20s 72ms/step - loss: 1.6767 - val_loss: 1712.9929
Epoch 10/12
282/282 [==============================] - 20s 72ms/step - loss: 1.3498 - val_loss: 1777.0973
Epoch 11/12
282/282 [==============================] - 20s 72ms/step - loss: 1.1861 - val_loss: 1821.0354
Epoch 12/12
282/282 [==============================] - 20s 73ms/step - loss: 1.1071 - val_loss: 1846.3872
313/313 [==============================] - 3s 10ms/step
AutoEncoder(batch_size=32, contamination=0.02, dropout_rate=0.2, epochs=12,
hidden_activation='relu',
hidden_neurons=[2000, 1000, 500, 500, 1000, 2000],
l2_regularizer=0.1,
loss=<function mean_squared_error at 0x7fadf2d4cf80>,
optimizer='adam', output_activation='sigmoid', preprocessing=True,
random_state=None, validation_size=0.1, verbose=1)
To speed up the iterations, I took 10K records with about 2% anomalous data in them, as shown in the output of data.shape. As we can see, the training loss is decreasing; however, val_loss is oscillating. The situation is the same even with epochs=100.
When I predict on the same training data (not the validation data):
encoder.predict(data)
I get a very high number of false positives: the model flags 200 records as anomalous, but only 9 of them are actual anomalies according to the manual labels.
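To put numbers on this, here is a minimal sketch of the precision and recall implied by the figures in this post. All counts come from the output above (10,000 rows, 263 labeled anomalies, 200 flagged, 9 correct); the variable names are only illustrative.

```python
# Counts taken from the post.
n_rows = 9737 + 263      # total records in the 10K sample
n_labeled = 263          # records labeled 1.0 (anomalous) by the experts
n_flagged = 200          # records the autoencoder predicts as anomalous
n_true_positive = 9      # flagged records that are labeled anomalous

false_positives = n_flagged - n_true_positive
precision = n_true_positive / n_flagged   # share of flagged records that are real anomalies
recall = n_true_positive / n_labeled      # share of real anomalies that were flagged
base_rate = n_labeled / n_rows            # precision expected from flagging rows at random

print(f"false positives: {false_positives}")
print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")
print(f"base rate: {base_rate:.3f}")
```

The precision is less than twice the base rate, so under these numbers the reconstruction-error ranking is only slightly better than flagging records at random.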
1). Am I using the encoder correctly?
2). Because the data were labeled manually by domain experts, I suspect the data itself doesn't contain enough information to reveal the anomalies. Does it therefore need to be transformed to help models identify anomalies correctly?
Please suggest.
Thanks
I accidentally forgot to change the variable inputs to x at the Conv1D calls. But when I train with that model, the loss is far better than when I fix the error.
The model with the error (scroll to the right).
inputs = keras.layers.Input(shape=self.input)
concat = []
for _ in range(4):
    x = keras.layers.Conv1D(32, kernel_size=3, strides=1, dilation_rate=1, padding="same", activation="relu", use_bias=False)(inputs)
    x = keras.layers.Conv1D(64, kernel_size=3, strides=1, dilation_rate=1, padding="same", activation="relu", use_bias=False)(inputs) # <-- should be Conv1D(...)(x)
    x = keras.layers.Conv1D(128, kernel_size=3, strides=1, dilation_rate=1, padding="same", activation="relu", use_bias=False)(inputs) # <-- should be Conv1D(...)(x)
    x = keras.layers.LSTM(32, activation="sigmoid", return_sequences=True)(x)
    x = keras.layers.LSTM(32, activation="sigmoid", return_sequences=False)(x)
    concat.append(x)
x = keras.layers.Concatenate(axis=1)(concat)
x = keras.layers.Dense(128, activation="relu")(x)
x = keras.layers.Dense(128, activation="relu")(x)
outputs = keras.layers.Dense(self.output)(x)
self.model = keras.models.Model(inputs=inputs, outputs=outputs)
The model summary & training of the model with the error (scroll down).
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 24, 8)] 0
__________________________________________________________________________________________________
conv1d_2 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
conv1d_5 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
conv1d_8 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
conv1d_11 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
lstm (LSTM) (None, 24, 32) 20608 conv1d_2[0][0]
__________________________________________________________________________________________________
lstm_2 (LSTM) (None, 24, 32) 20608 conv1d_5[0][0]
__________________________________________________________________________________________________
lstm_4 (LSTM) (None, 24, 32) 20608 conv1d_8[0][0]
__________________________________________________________________________________________________
lstm_6 (LSTM) (None, 24, 32) 20608 conv1d_11[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (None, 32) 8320 lstm[0][0]
__________________________________________________________________________________________________
lstm_3 (LSTM) (None, 32) 8320 lstm_2[0][0]
__________________________________________________________________________________________________
lstm_5 (LSTM) (None, 32) 8320 lstm_4[0][0]
__________________________________________________________________________________________________
lstm_7 (LSTM) (None, 32) 8320 lstm_6[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 128) 0 lstm_1[0][0]
lstm_3[0][0]
lstm_5[0][0]
lstm_7[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 128) 16512 concatenate[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 128) 16512 dense[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 1) 129 dense_1[0][0]
==================================================================================================
Total params: 161,153
Trainable params: 161,153
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 1/250
628/628 [==============================] - 14s 16ms/step - loss: 1.0818 - precision: 0.5038 - val_loss: 1.0670 - val_precision: 0.5293
Epoch 2/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0783 - precision: 0.5250 - val_loss: 1.0668 - val_precision: 0.5254
Epoch 3/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0769 - precision: 0.5352 - val_loss: 1.0665 - val_precision: 0.5229
Epoch 4/250
628/628 [==============================] - 9s 15ms/step - loss: 1.0762 - precision: 0.5357 - val_loss: 1.0653 - val_precision: 0.5291
Epoch 5/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0755 - precision: 0.5358 - val_loss: 1.0660 - val_precision: 0.5163
Epoch 6/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0738 - precision: 0.5378 - val_loss: 1.0640 - val_precision: 0.5260
Epoch 7/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0727 - precision: 0.5384 - val_loss: 1.0634 - val_precision: 0.5257
Epoch 8/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0706 - precision: 0.5380 - val_loss: 1.0616 - val_precision: 0.5306
Epoch 9/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0692 - precision: 0.5471 - val_loss: 1.0599 - val_precision: 0.5375
Epoch 10/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0684 - precision: 0.5467 - val_loss: 1.0583 - val_precision: 0.5435
Epoch 11/250
628/628 [==============================] - 9s 15ms/step - loss: 1.0665 - precision: 0.5534 - val_loss: 1.0577 - val_precision: 0.5486
Epoch 12/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0658 - precision: 0.5487 - val_loss: 1.0623 - val_precision: 0.5472
Epoch 13/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0642 - precision: 0.5513 - val_loss: 1.0569 - val_precision: 0.5488
Epoch 14/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0634 - precision: 0.5530 - val_loss: 1.0571 - val_precision: 0.5347
Epoch 15/250
628/628 [==============================] - 9s 15ms/step - loss: 1.0622 - precision: 0.5506 - val_loss: 1.0538 - val_precision: 0.5445
Epoch 16/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0607 - precision: 0.5527 - val_loss: 1.0537 - val_precision: 0.5489
Epoch 17/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0594 - precision: 0.5526 - val_loss: 1.0550 - val_precision: 0.5450
Epoch 18/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0583 - precision: 0.5544 - val_loss: 1.0566 - val_precision: 0.5461
Epoch 19/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0556 - precision: 0.5571 - val_loss: 1.0521 - val_precision: 0.5405
Epoch 20/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0545 - precision: 0.5600 - val_loss: 1.0524 - val_precision: 0.5480
Epoch 21/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0532 - precision: 0.5611 - val_loss: 1.0487 - val_precision: 0.5467
Epoch 22/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0520 - precision: 0.5603 - val_loss: 1.0522 - val_precision: 0.5496
Epoch 23/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0508 - precision: 0.5583 - val_loss: 1.0494 - val_precision: 0.5497
Epoch 24/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0480 - precision: 0.5630 - val_loss: 1.0461 - val_precision: 0.5489
Epoch 25/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0463 - precision: 0.5617 - val_loss: 1.0461 - val_precision: 0.5505
Epoch 26/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0457 - precision: 0.5643 - val_loss: 1.0449 - val_precision: 0.5548
Epoch 27/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0430 - precision: 0.5659 - val_loss: 1.0472 - val_precision: 0.5504
Epoch 28/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0426 - precision: 0.5679 - val_loss: 1.0415 - val_precision: 0.5516
Epoch 29/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0389 - precision: 0.5679 - val_loss: 1.0459 - val_precision: 0.5542
Epoch 30/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0379 - precision: 0.5709 - val_loss: 1.0421 - val_precision: 0.5583
Epoch 31/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0366 - precision: 0.5723 - val_loss: 1.0423 - val_precision: 0.5586
Epoch 32/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0335 - precision: 0.5765 - val_loss: 1.0415 - val_precision: 0.5573
Epoch 33/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0318 - precision: 0.5772 - val_loss: 1.0399 - val_precision: 0.5580
Epoch 34/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0287 - precision: 0.5789 - val_loss: 1.0423 - val_precision: 0.5495
Epoch 35/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0276 - precision: 0.5862 - val_loss: 1.0354 - val_precision: 0.5658
Epoch 36/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0252 - precision: 0.5841 - val_loss: 1.0321 - val_precision: 0.5619
Epoch 37/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0233 - precision: 0.5861 - val_loss: 1.0348 - val_precision: 0.5651
Epoch 38/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0215 - precision: 0.5876 - val_loss: 1.0327 - val_precision: 0.5677
Epoch 39/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0187 - precision: 0.5905 - val_loss: 1.0350 - val_precision: 0.5699
Epoch 40/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0158 - precision: 0.5938 - val_loss: 1.0301 - val_precision: 0.5702
Epoch 41/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0154 - precision: 0.5955 - val_loss: 1.0291 - val_precision: 0.5671
Epoch 42/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0099 - precision: 0.5972 - val_loss: 1.0328 - val_precision: 0.5786
Epoch 43/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0076 - precision: 0.5996 - val_loss: 1.0327 - val_precision: 0.5712
Epoch 44/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0030 - precision: 0.6066 - val_loss: 1.0231 - val_precision: 0.5708
Epoch 45/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9996 - precision: 0.6047 - val_loss: 1.0276 - val_precision: 0.5728
Epoch 46/250
628/628 [==============================] - 12s 19ms/step - loss: 0.9965 - precision: 0.6072 - val_loss: 1.0206 - val_precision: 0.5744
Epoch 47/250
628/628 [==============================] - 11s 18ms/step - loss: 0.9910 - precision: 0.6134 - val_loss: 1.0182 - val_precision: 0.5837
Epoch 48/250
628/628 [==============================] - 10s 16ms/step - loss: 0.9865 - precision: 0.6114 - val_loss: 1.0204 - val_precision: 0.5750
Epoch 49/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9808 - precision: 0.6155 - val_loss: 1.0251 - val_precision: 0.5745
Epoch 50/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9773 - precision: 0.6129 - val_loss: 1.0147 - val_precision: 0.5877
Epoch 51/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9737 - precision: 0.6184 - val_loss: 1.0073 - val_precision: 0.5871
Epoch 52/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9696 - precision: 0.6174 - val_loss: 1.0078 - val_precision: 0.5807
Epoch 53/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9631 - precision: 0.6265 - val_loss: 1.0015 - val_precision: 0.5927
Epoch 54/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9618 - precision: 0.6216 - val_loss: 1.0064 - val_precision: 0.5916
Epoch 55/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9571 - precision: 0.6246 - val_loss: 1.0127 - val_precision: 0.5907
Epoch 56/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9588 - precision: 0.6251 - val_loss: 1.0012 - val_precision: 0.5903
Epoch 57/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9499 - precision: 0.6297 - val_loss: 1.0192 - val_precision: 0.5824
Epoch 58/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9471 - precision: 0.6273 - val_loss: 1.0103 - val_precision: 0.5893
Epoch 59/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9428 - precision: 0.6367 - val_loss: 0.9949 - val_precision: 0.5943
Epoch 60/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9366 - precision: 0.6348 - val_loss: 0.9926 - val_precision: 0.5946
Epoch 61/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9356 - precision: 0.6356 - val_loss: 0.9868 - val_precision: 0.6016
Epoch 62/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9280 - precision: 0.6385 - val_loss: 0.9902 - val_precision: 0.5949
Epoch 63/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9255 - precision: 0.6403 - val_loss: 0.9877 - val_precision: 0.5957
Epoch 64/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9217 - precision: 0.6425 - val_loss: 1.0087 - val_precision: 0.5918
Epoch 65/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9156 - precision: 0.6460 - val_loss: 1.0007 - val_precision: 0.5961
Epoch 66/250
628/628 [==============================] - 10s 15ms/step - loss: 0.9155 - precision: 0.6454 - val_loss: 0.9873 - val_precision: 0.5965
09-01-22 15:18:32 - Saving model weights to /vserver/storages/packages/trader/.cache/weights/linear/neuralnet.1.1.14.h5 ... done
09-01-22 15:18:32 - Trained the model in 10.7m.
Training evaluation: loss: 0.9263 - precision: 0.6409
Validation evaluation: loss: 0.9868 - precision: 0.6016
Now the model with the error fixed.
inputs = keras.layers.Input(shape=self.input)
concat = []
for _ in range(4):
    x = keras.layers.Conv1D(128, kernel_size=3, strides=1, dilation_rate=1, padding="same", activation="relu", use_bias=False)(inputs)
    x = keras.layers.LSTM(32, activation="sigmoid", return_sequences=True)(x)
    x = keras.layers.LSTM(32, activation="sigmoid", return_sequences=False)(x)
    concat.append(x)
x = keras.layers.Concatenate(axis=1)(concat)
x = keras.layers.Dense(128, activation="relu")(x)
x = keras.layers.Dense(128, activation="relu")(x)
outputs = keras.layers.Dense(self.output)(x)
self.model = keras.models.Model(inputs=inputs, outputs=outputs)
The model summary and training from the model with the error fixed (scroll down).
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 24, 8)] 0
__________________________________________________________________________________________________
conv1d (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
conv1d_1 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
conv1d_2 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
conv1d_3 (Conv1D) (None, 24, 128) 3072 input_1[0][0]
__________________________________________________________________________________________________
lstm (LSTM) (None, 24, 32) 20608 conv1d[0][0]
__________________________________________________________________________________________________
lstm_2 (LSTM) (None, 24, 32) 20608 conv1d_1[0][0]
__________________________________________________________________________________________________
lstm_4 (LSTM) (None, 24, 32) 20608 conv1d_2[0][0]
__________________________________________________________________________________________________
lstm_6 (LSTM) (None, 24, 32) 20608 conv1d_3[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (None, 32) 8320 lstm[0][0]
__________________________________________________________________________________________________
lstm_3 (LSTM) (None, 32) 8320 lstm_2[0][0]
__________________________________________________________________________________________________
lstm_5 (LSTM) (None, 32) 8320 lstm_4[0][0]
__________________________________________________________________________________________________
lstm_7 (LSTM) (None, 32) 8320 lstm_6[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 128) 0 lstm_1[0][0]
lstm_3[0][0]
lstm_5[0][0]
lstm_7[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 128) 16512 concatenate[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 128) 16512 dense[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 1) 129 dense_1[0][0]
==================================================================================================
Total params: 161,153
Trainable params: 161,153
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 1/250
628/628 [==============================] - 14s 16ms/step - loss: 1.0800 - precision: 0.5006 - val_loss: 1.0678 - val_precision: 0.5036
Epoch 2/250
628/628 [==============================] - 9s 15ms/step - loss: 1.0792 - precision: 0.4970 - val_loss: 1.0678 - val_precision: 0.5091
Epoch 3/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0791 - precision: 0.4990 - val_loss: 1.0680 - val_precision: 0.4909
Epoch 4/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0791 - precision: 0.5016 - val_loss: 1.0683 - val_precision: 0.4909
Epoch 5/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0791 - precision: 0.5018 - val_loss: 1.0678 - val_precision: 0.5091
Epoch 6/250
628/628 [==============================] - 10s 15ms/step - loss: 1.0790 - precision: 0.4996 - val_loss: 1.0678 - val_precision: 0.5091
09-01-22 15:04:12 - Saving model weights to /vserver/storages/packages/trader/.cache/weights/linear/neuralnet.1.1.14.h5 ... done
09-01-22 15:04:12 - Trained the model in 1.0m.
Training evaluation: loss: 1.0788 - precision: 0.5161
Validation evaluation: loss: 1.0678 - precision: 0.5036
As you can see, the model with the error performs far better than the one without, even though they are technically identical.
I have tested it multiple times and it remains the same.
How can this be possible?
Edit: the number of epochs is different because of the EarlyStopping callback.
These models are exactly the same, as you can see in the summaries. In the first code (with the error), the first two Conv1D layers of each loop iteration are never connected to the model, because x is overwritten before it is used, so both versions build the same graph. What makes the difference between the two results, then, is the number of epochs and the initial weights.
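To see why the two extra Conv1D layers disappear, here is a minimal sketch that mimics functional-API wiring with plain Python objects. `Node`, `layer`, and `ancestors` are hypothetical helpers written for this illustration and are not part of Keras; they only track which layer produced which tensor.

```python
class Node:
    """Minimal stand-in for a Keras tensor that remembers what produced it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def layer(name):
    """Return a callable that wraps its input in a new Node, like calling a layer."""
    return lambda tensor: Node(name, parent=tensor)


def ancestors(node):
    """Walk back from a node to the input and list every layer on the path."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return list(reversed(path))


inputs = Node("input")

# Wiring with the error: every Conv1D is applied to `inputs`, so `x` is rebound
# each time and only the last assignment survives.
x = layer("conv32")(inputs)
x = layer("conv64")(inputs)
x = layer("conv128")(inputs)
buggy_path = ancestors(layer("lstm")(x))

# Wiring from the second listing in the question: a single Conv1D(128).
x = layer("conv128")(inputs)
fixed_path = ancestors(layer("lstm")(x))

print(buggy_path)  # ['input', 'conv128', 'lstm']
print(fixed_path)  # ['input', 'conv128', 'lstm']
```

Both paths contain only the 128-filter convolution, which matches the two model summaries listing the same layers and the same 161,153 parameters.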
The initial weights are the random values the model's weights hold before any training. If you create two models with identical layers and evaluate them on a validation set without training, they will give different results, because their initial weights differ (the weights determine what the model outputs).
As for the epochs, there could be a local minimum problem. The optimizer can settle into a local minimum and behave as if the best weights had been found, and I think this is what made training stop at the 6th epoch. You can look up local minima if you are not familiar with them. But the model shouldn't get stuck in a local minimum on every run, because the initial weights change from run to run. So the problem should go away if you run the model a few times.
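The effect of random initial weights can be sketched without any deep learning library. The tiny dense layer below is a hypothetical stand-in for a real model: `init_weights` and `forward` are illustrative helpers, and the uniform initialization range is an assumption, not what Keras uses by default.

```python
import random


def init_weights(n_in, n_out, seed):
    """Draw an (n_in x n_out) weight matrix from a seeded random generator."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]


def forward(weights, inputs):
    """Untrained dense layer without bias: output_j = sum_i inputs_i * w_ij."""
    n_out = len(weights[0])
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        for j in range(n_out)
    ]


sample = [0.2, -1.0, 0.7]

# Same architecture, different seeds: the untrained outputs differ.
out_seed_0 = forward(init_weights(3, 2, seed=0), sample)
out_seed_1 = forward(init_weights(3, 2, seed=1), sample)

# Same architecture, same seed: the untrained outputs are identical.
out_seed_0_again = forward(init_weights(3, 2, seed=0), sample)

print(out_seed_0 == out_seed_1)        # False
print(out_seed_0 == out_seed_0_again)  # True
```

In TensorFlow the analogous check would be to fix the seed (for example with `tf.random.set_seed`) before building each of the two models and then compare the runs; if the gap between them persists with identical seeds, the initial weights alone do not explain it.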
So, I am trying to code a multivariate LSTM for time series forecasting, and in my model the losses decrease but the accuracy metrics do not change at all. I tried changing the number of neurons, the number of layers, the learning rate, early stopping, the activation function on the output layer, and L2 regularization, but nothing works. I am a beginner in machine learning, so any help would be appreciated. Most of my efforts were like throwing stones in the dark. I am attaching the GitHub link to my code, as well as a few of the training epochs.
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from keras.regularizers import l2
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping
model = Sequential()
model.add(LSTM(64,activation='sigmoid',return_sequences=True,input_shape = (trainX.shape[1],trainX.shape[2])))
model.add(LSTM(32,activation='sigmoid',return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(trainY.shape[1]))
opt = Adam(learning_rate= 1e-3)
model.compile(optimizer='adam',loss = 'mse', metrics=['accuracy'])
model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_6 (LSTM) (None, 200, 64) 19200
_________________________________________________________________
lstm_7 (LSTM) (None, 32) 12416
_________________________________________________________________
dropout_3 (Dropout) (None, 32) 0
_________________________________________________________________
dense_3 (Dense) (None, 1) 33
=================================================================
Total params: 31,649
Trainable params: 31,649
Non-trainable params: 0
es_callback = EarlyStopping(monitor='val_loss', patience=3)
history = model.fit(trainX,trainY,epochs=40,batch_size= 32,verbose=1,validation_split=0.2, callbacks= [es_callback])
Epoch 1/40
214/214 [==============================] - 58s 169ms/step - loss: 0.1663 - accuracy: 0.0000e+00 - val_loss: 0.0483 - val_accuracy: 5.8617e-04
Epoch 2/40
214/214 [==============================] - 35s 164ms/step - loss: 0.0497 - accuracy: 0.0000e+00 - val_loss: 0.0446 - val_accuracy: 5.8617e-04
Epoch 3/40
214/214 [==============================] - 35s 164ms/step - loss: 0.0309 - accuracy: 0.0000e+00 - val_loss: 0.0092 - val_accuracy: 5.8617e-04
Epoch 4/40
214/214 [==============================] - 35s 163ms/step - loss: 0.0143 - accuracy: 0.0000e+00 - val_loss: 0.0230 - val_accuracy: 5.8617e-04
Epoch 5/40
214/214 [==============================] - 35s 163ms/step - loss: 0.0115 - accuracy: 0.0000e+00 - val_loss: 0.0160 - val_accuracy: 5.8617e-04
Epoch 6/40
214/214 [==============================] - 35s 163ms/step - loss: 0.0099 - accuracy: 0.0000e+00 - val_loss: 0.0172 - val_accuracy: 5.8617e-04
My code: https://github.com/RiddhimanRaut/Deep-Learning-based-CPR-estimation/blob/main/CPR_prediction_multivariate_LSTM_tobetrialled_1.ipynb
Thank you!
Accuracy is a metric for classification tasks. To judge whether a regression model is good or not, use an error measure such as MSE instead.
I think the discussion here can provide more information.
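As a small illustration of the point above, the sketch below compares an exact-match accuracy with MSE and MAE on continuous targets. The numbers are made up for the example, and `exact_match_accuracy` is a simplified stand-in for a classification accuracy, not the exact computation Keras performs.

```python
# Hypothetical continuous targets and predictions (illustrative values only).
y_true = [0.10, 0.52, 0.33, 0.97, 0.75]
y_pred = [0.12, 0.50, 0.30, 0.95, 0.80]


def exact_match_accuracy(targets, predictions):
    """Fraction of predictions that equal the target exactly."""
    return sum(t == p for t, p in zip(targets, predictions)) / len(targets)


def mean_squared_error(targets, predictions):
    """Average of the squared differences between target and prediction."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def mean_absolute_error(targets, predictions):
    """Average of the absolute differences between target and prediction."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


accuracy = exact_match_accuracy(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)

print(f"accuracy: {accuracy:.4f}")
print(f"MSE:      {mse:.5f}")
print(f"MAE:      {mae:.3f}")
```

The predictions are all close to the targets, yet the accuracy is zero because none of them matches exactly, while MSE and MAE reflect how small the errors are. This is why the accuracy column in the log above stays flat while the loss keeps improving.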
I am trying to make predictions on the Kaggle Diabetic Retinopathy dataset with a CNN model. There are five classes to predict. The label-wise distribution of the data (as a fraction of the total) is shown below.
0 0.73
2 0.15
1 0.07
3 0.02
4 0.02
Name: level, dtype: float64
The relevant code blocks are given below.
# Network training parameters
EPOCHS = 25
BATCH_SIZE =50
VERBOSE = 1
lr=0.0001
OPTIMIZER = tf.keras.optimizers.Adam(lr)
target_size =(256, 256)
NB_CLASSES = 5
The image generator and the preprocessing code are as follows.
data_gen=tf.keras.preprocessing.image.ImageDataGenerator(rotation_range=45,
                                                         horizontal_flip=True,
                                                         vertical_flip=True,
                                                         rescale=1./255,
                                                         validation_split=0.2)
train_gen=data_gen.flow_from_dataframe(
    dataframe=label_csv, directory=IMAGE_FOLDER_PATH,
    x_col='image', y_col='level',
    target_size=target_size,
    class_mode='categorical',
    batch_size=BATCH_SIZE, shuffle=True,
    subset='training',
    validate_filenames=True
)
Found 28101 validated image filenames belonging to 5 classes.
validation_gen=data_gen.flow_from_dataframe(
    dataframe=label_csv, directory=IMAGE_FOLDER_PATH,
    x_col='image', y_col='level',
    target_size=target_size,
    class_mode='categorical',
    batch_size=BATCH_SIZE, shuffle=True,
    subset='validation',
    validate_filenames=True
)
Found 7025 validated image filenames belonging to 5 classes.
train_gen.image_shape
(256, 256, 3)
The model-building code is as follows.
# Architect your CNN model1
model1=tf.keras.models.Sequential()
model1.add(tf.keras.layers.Conv2D(256,(3,3),input_shape=INPUT_SHAPE,activation='relu'))
model1.add(tf.keras.layers.MaxPool2D(pool_size=(2,2)))
model1.add(tf.keras.layers.Conv2D(128,(3,3),activation='relu'))
model1.add(tf.keras.layers.MaxPool2D(pool_size=(2,2)))
model1.add(tf.keras.layers.Conv2D(64,(3,3),activation='relu'))
model1.add(tf.keras.layers.MaxPool2D(pool_size=(2,2)))
model1.add(tf.keras.layers.Conv2D(32,(3,3),activation='relu'))
model1.add(tf.keras.layers.MaxPool2D(pool_size=(2,2)))
model1.add(tf.keras.layers.Flatten())
model1.add(tf.keras.layers.Dense(units=512,activation='relu'))
model1.add(tf.keras.layers.Dense(units=256,activation='relu'))
model1.add(tf.keras.layers.Dense(units=128,activation='relu'))
model1.add(tf.keras.layers.Dense(units=64,activation='relu'))
model1.add(tf.keras.layers.Dense(units=32,activation='relu'))
model1.add(tf.keras.layers.Dense(units=NB_CLASSES,activation='softmax'))
model1.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 256) 7168
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 256) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 125, 125, 128) 295040
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 62, 62, 128) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 60, 60, 64) 73792
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 30, 30, 64) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 28, 28, 32) 18464
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 6272) 0
_________________________________________________________________
dense (Dense) (None, 512) 3211776
_________________________________________________________________
dense_1 (Dense) (None, 256) 131328
_________________________________________________________________
dense_2 (Dense) (None, 128) 32896
_________________________________________________________________
dense_3 (Dense) (None, 64) 8256
_________________________________________________________________
dense_4 (Dense) (None, 32) 2080
_________________________________________________________________
dense_5 (Dense) (None, 5) 165
=================================================================
Total params: 3,780,965
Trainable params: 3,780,965
Non-trainable params: 0
# Compile model1
model1.compile(optimizer=OPTIMIZER,metrics=['accuracy'],loss='categorical_crossentropy')
print (train_gen.n,train_gen.batch_size)
28101 50
STEP_SIZE_TRAIN=train_gen.n//train_gen.batch_size
STEP_SIZE_VALID=validation_gen.n//validation_gen.batch_size
print(STEP_SIZE_TRAIN)
print(STEP_SIZE_VALID)
562
140
# Fit the model1
history1=model1.fit(train_gen,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    validation_data=validation_gen,
                    validation_steps=STEP_SIZE_VALID,
                    epochs=EPOCHS,verbose=1)
The epoch history is below. Training was stopped after epoch 14 because no improvement was observed.
Epoch 1/25
562/562 [==============================] - 1484s 3s/step - loss: 0.9437 - accuracy: 0.7290 - val_loss: 0.8678 - val_accuracy: 0.7309
Epoch 2/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8748 - accuracy: 0.7337 - val_loss: 0.8673 - val_accuracy: 0.7309
Epoch 3/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8681 - accuracy: 0.7367 - val_loss: 0.8614 - val_accuracy: 0.7306
Epoch 4/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8619 - accuracy: 0.7333 - val_loss: 0.8592 - val_accuracy: 0.7306
Epoch 5/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8565 - accuracy: 0.7375 - val_loss: 0.8625 - val_accuracy: 0.7304
Epoch 6/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8608 - accuracy: 0.7357 - val_loss: 0.8556 - val_accuracy: 0.7310
Epoch 7/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8568 - accuracy: 0.7335 - val_loss: 0.8614 - val_accuracy: 0.7304
Epoch 8/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8541 - accuracy: 0.7349 - val_loss: 0.8591 - val_accuracy: 0.7301
Epoch 9/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8582 - accuracy: 0.7321 - val_loss: 0.8583 - val_accuracy: 0.7303
Epoch 10/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8509 - accuracy: 0.7354 - val_loss: 0.8599 - val_accuracy: 0.7311
Epoch 11/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8521 - accuracy: 0.7325 - val_loss: 0.8584 - val_accuracy: 0.7304
Epoch 12/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8422 - accuracy: 0.7352 - val_loss: 0.8481 - val_accuracy: 0.7307
Epoch 13/25
562/562 [==============================] - 1463s 3s/step - loss: 0.8511 - accuracy: 0.7345 - val_loss: 0.8477 - val_accuracy: 0.7307
Epoch 14/25
562/562 [==============================] - 1462s 3s/step - loss: 0.8314 - accuracy: 0.7387 - val_loss: 0.8528 - val_accuracy: 0.7300
Epoch 15/25
73/562 [==>...........................] - ETA: 17:12 - loss: 0.8388 - accuracy: 0.7344
Validation accuracy does not improve beyond 73% even after several epochs. In an earlier trial I used a learning rate of 0.001, but the result was the same, with no improvement.
I would appreciate suggestions to improve the model accuracy.
Also, how can grid search be used when the image generator handles the preprocessing? Suggestions on that are welcome too.
Many thanks in advance.
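Your problem is most likely caused by the class imbalance rather than by overfitting: training and validation accuracy both sit at about 73%, which is the share of class 0, so the model is probably predicting the majority class for almost every image. In addition to looking for a better model, a better learning rate, or a better optimizer, you could create a custom generator to augment and select your data in a more balanced way.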
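As a quick sanity check, the class shares listed in the question can be compared with the reported accuracy. The sketch below computes the accuracy of a model that always predicts the most common class, plus inverse-frequency class weights. The weights are an alternative to a balanced generator, not something the question or this answer already uses. Keras' model.fit accepts such a dictionary through its class_weight argument, but whether it helps here is an assumption to test.

```python
# Class shares copied from the question (label -> fraction of the data).
class_share = {0: 0.73, 1: 0.07, 2: 0.15, 3: 0.02, 4: 0.02}

# Accuracy of a model that always predicts the most common class.
majority_class = max(class_share, key=class_share.get)
baseline_accuracy = class_share[majority_class]

# Inverse-frequency weights, normalised so that a perfectly balanced
# dataset would give a weight of 1.0 to every class.
n_classes = len(class_share)
total = sum(class_share.values())
class_weight = {c: total / (n_classes * share) for c, share in class_share.items()}

print(majority_class, baseline_accuracy)
print(class_weight)
```

A validation accuracy equal to the baseline means the network has learned nothing beyond the class frequencies, so accuracy alone is a misleading metric for this dataset.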
I use custom generators for most of the models at work. I can't share the full generator code, but here is a simplified example of how to create one. It's actually quite fun to play around with and add more steps. You can, and probably should, add pre-processing and post-processing steps, but I hope this code gives you an overall idea of the process.
import random
import numpy as np

class MyCustomGenerator:
    def __init__(self) -> None:
        # Load the dataset into a dict. If it is too big, store only filenames and load the images at runtime.
        # Each dict key is a class name, and each value is a list of images or filenames.
        # loadData() is a placeholder that you have to write for your own data.
        self.dataSet, self.imageHeight, self.imageWidth, self.imageChannels = loadData()
        self.classNames = sorted(self.dataSet.keys())

    def labelBinarizer(self, label):
        # Convert a class name into a one-hot target vector Y.
        oneHot = np.zeros(len(self.classNames))
        oneHot[self.classNames.index(label)] = 1.0
        return oneHot

    def augment(self, image):
        # Put your own augmentation here. As an example, this flips the image
        # horizontally half of the time (assumes a height x width x channels array).
        return image[:, ::-1, :] if random.random() < 0.5 else image

    def yieldData(self):
        while True:  # Keras generators need to run indefinitely
            # One random sample per class in turn, so every class is drawn equally often.
            for className, data in self.dataSet.items():
                yield self.augment(random.choice(data)), self.labelBinarizer(className)

    def getEmptyBatch(self, batchSize):
        return (
            np.empty([batchSize, self.imageHeight, self.imageWidth, self.imageChannels]),
            np.empty([batchSize, len(self.classNames)]),
            0)

    def getBatches(self, batchSize):
        X, Y, i = self.getEmptyBatch(batchSize)
        for image, label in self.yieldData():
            X[i, ...] = image
            Y[i, ...] = label
            i += 1
            if i == batchSize:
                yield X, Y
                X, Y, i = self.getEmptyBatch(batchSize)

# your model definition and other stuff
# ...

# With this way of defining a generator, you have to set the number of steps per epoch.
generator = MyCustomGenerator()
model.fit(
    generator.getBatches(batchSize=256),
    steps_per_epoch=500,
    # other params
)
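To illustrate why cycling over the classes balances the data, here is a small self-contained sketch of the same sampling idea on a toy dataset. The class names and sample counts are hypothetical and only mirror the kind of imbalance described in the question. No images or Keras are involved.

```python
import random
from collections import Counter

random.seed(0)

# Toy imbalanced dataset: class name -> list of sample ids (hypothetical data).
dataset = {
    "class_0": list(range(730)),
    "class_1": list(range(70)),
    "class_2": list(range(150)),
}

def yield_balanced(data):
    """Cycle over the classes forever, drawing one random sample from each in turn."""
    while True:
        for class_name, samples in data.items():
            yield random.choice(samples), class_name

def take(generator, n):
    """Collect the first n items of an infinite generator."""
    return [next(generator) for _ in range(n)]

batch = take(yield_balanced(dataset), 300)
label_counts = Counter(label for _, label in batch)
print(label_counts)
```

Every class ends up with the same number of samples in the batch, regardless of how many examples it has. The trade-off is that samples from the small classes are repeated far more often than samples from the large ones, which is why augmentation matters in this setup.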
I am very much a novice at neural networks and machine learning. I am trying to learn more by using RotNet, a neural network that classifies rotation angles in images. I am trying to train the network on the MNIST dataset. I changed only one line of the repo (a log directory file path) and was otherwise able to run it successfully.
Here is how I am running it based on the README:
& .../Anaconda3/envs/tflow/python.exe .../RotNet/train/train_mnist.py
and then the output:
Using TensorFlow backend.
Input shape: (28, 28, 1)
60000 train samples
10000 test samples
2020-10-16 12:18:17.031214: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 28, 28, 1) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 26, 26, 64) 640
_________________________________________________________________
conv2d_2 (Conv2D) (None, 24, 24, 64) 36928
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 12, 12, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 9216) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 1179776
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_2 (Dense) (None, 360) 46440
=================================================================
Total params: 1,263,784
Trainable params: 1,263,784
Non-trainable params: 0
_________________________________________________________________
Epoch 1/50
1/468 [..............................] - ETA: 2:21 - loss: 5.8862 - angle_error: 87.1406
2020-10-16 12:18:18.337183: I tensorflow/core/profiler/lib/profiler_session.cc:184] Profiler session started.
469/468 [==============================] - 61s 130ms/step - loss: 5.0338 - angle_error: 81.4492 - val_loss: 4.1144 - val_angle_error: 65.9470
Epoch 2/50
469/468 [==============================] - 61s 131ms/step - loss: 4.3072 - angle_error: 64.7485 - val_loss: 3.4630 - val_angle_error: 53.0140
Epoch 3/50
469/468 [==============================] - 63s 134ms/step - loss: 4.0303 - angle_error: 56.3245 - val_loss: 3.2241 - val_angle_error: 47.0283
Epoch 4/50
469/468 [==============================] - 63s 134ms/step - loss: 3.8824 - angle_error: 52.2043 - val_loss: 3.3227 - val_angle_error: 43.2439
Epoch 5/50
469/468 [==============================] - 63s 135ms/step - loss: 3.7982 - angle_error: 49.9996 - val_loss: 3.1930 - val_angle_error: 41.1242
Epoch 6/50
469/468 [==============================] - 73s 155ms/step - loss: 3.7288 - angle_error: 48.4027 - val_loss: 2.9600 - val_angle_error: 39.9322
Epoch 7/50
469/468 [==============================] - 63s 133ms/step - loss: 3.6781 - angle_error: 46.5616 - val_loss: 3.2243 - val_angle_error: 38.6193
Epoch 8/50
469/468 [==============================] - 62s 132ms/step - loss: 3.6439 - angle_error: 45.2133 - val_loss: 2.8629 - val_angle_error: 38.0046
Epoch 9/50
469/468 [==============================] - 62s 132ms/step - loss: 3.6132 - angle_error: 44.7204 - val_loss: 3.0085 - val_angle_error: 37.4514
Epoch 10/50
469/468 [==============================] - 62s 132ms/step - loss: 3.5817 - angle_error: 43.8439 - val_loss: 3.0073 - val_angle_error: 35.8109
The script train_mnist.py is located here and it specifies 50 epochs. I am getting no error, the program simply stops after the 8th or 10th epoch. I am at a loss for how to fix this issue. Any advice would be appreciated!
I took a quick look at the code. In it there is this line:
callbacks=[checkpointer, early_stopping, tensorboard]
The early_stopping callback monitors the validation loss by default. It is configured so that training halts if the validation loss fails to improve for 2 consecutive epochs. That is why it does not train for 50 epochs. If you want it to train for the full 50, remove early_stopping from the line of code above. You can confirm that early_stopping is what terminates the training by changing the code in the script from
early_stopping = EarlyStopping(patience=2)
# change code to
early_stopping = EarlyStopping(patience=2, verbose=1)
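The patience rule can also be checked by hand against the log in the question. The following is a simplified sketch of the logic in plain Python, not the Keras implementation itself. It ignores options such as min_delta, and stopping_epoch is a hypothetical helper name.

```python
# Validation losses copied from the training log in the question (epochs 1 to 10).
val_losses = [4.1144, 3.4630, 3.2241, 3.3227, 3.1930,
              2.9600, 3.2243, 2.8629, 3.0085, 3.0073]

def stopping_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    wait = 0  # number of epochs since the last improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

print(stopping_epoch(val_losses, patience=2))
```

The best validation loss occurs at epoch 8, and epochs 9 and 10 both fail to beat it, so a patience of 2 ends the run after epoch 10. With a patience of 3, the same ten values would not yet trigger a stop.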
Judging from the training output, this model does not appear to be training very well. I suggest you try transfer learning with MobileNet. The code below shows how to use it:
import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adamax

# img_size, classes and lr must be defined for your own data
mobile = tf.keras.applications.mobilenet.MobileNet(include_top=False, input_shape=(img_size, img_size, 3), pooling='max', weights='imagenet', dropout=.5)
x = mobile.layers[-1].output  # the last layer of the MobileNet model, the global max pooling layer
x = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(x)
x = Dense(126, activation='relu')(x)
x = Dropout(rate=.3, seed=123)(x)
predictions = Dense(len(classes), activation='softmax')(x)
model = Model(inputs=mobile.input, outputs=predictions)
Adapt the above to your situation; it should work much better.
for layer in model.layers:
    layer.trainable = True
model.compile(Adamax(learning_rate=lr), loss='categorical_crossentropy', metrics=['accuracy'])