To solve a problem, I developed some deep learning models, such as an MLP (6 Dense layers), a CNN (1 Conv1D + 1 Dense layer), and an LSTM (1 LSTM + 2 Dense layers), and plotted their loss and accuracy charts.
CNN model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, MaxPooling1D, Flatten,
                                     Dense, BatchNormalization, Activation, Dropout)

model = Sequential()
model.add(Embedding(len(vectorizer.get_feature_names()) + 1,
                    64,  # Embedding size
                    input_length=MAX_SEQ_LENGHT))
model.add(Conv1D(64, 5, activation='relu'))
model.add(MaxPooling1D(5))
model.add(Flatten())
model.add(Dense(64))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
I have these questions:
1 - Which number of epochs is suitable for each one?
Chart 1 => 16
Chart 2 => 15
Chart 3 => 5
2 - Do these charts show overfitting? (Especially chart 1: what is wrong with it? How is it possible that the test accuracy is higher than the training accuracy?)
3 - Is it OK when the training loss is higher than the test loss (in chart 1)? Should I increase the number of epochs?
Chart 1:
Chart 2:
Chart 3:
Your model seems to be overfitting in chart 1. If you can't increase the dataset (directly or via a data augmentation technique), then chart 3 seems to be the best model among the three.
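As for picking the number of epochs, one common approach (a minimal sketch, not part of the original answer; X_train/X_val and the hyperparameters are placeholders) is to let Keras stop training when the validation loss stops improving and keep the weights from the best epoch, instead of fixing the epoch count by hand:

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Stop when validation loss has not improved for 3 epochs and restore the best weights.
callbacks = [
    EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
    # Also keep the best model on disk (the file name is just an example).
    ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True),
]

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=50,          # upper bound; early stopping usually ends sooner
                    batch_size=32,
                    callbacks=callbacks)

Regarding question 2, one common reason the test accuracy can sit slightly above the training accuracy is that Dropout and BatchNormalization are active while the training metric is computed, but disabled at evaluation time.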
Related
The shape of the train/test data is (samples, 256, 256, 1). The training set has around 1400 samples, the validation set 150 samples, and the test set 250 samples. I then built a CNN model for a six-object classification task. However, no matter how much I tune the parameters and add/remove layers (conv & dense), I get chance-level accuracy all the time (around 16.5%). So I would like to know whether I made some deadly mistake while building the model, or whether something is wrong with the data itself rather than the CNN model.
Code:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def build_cnn_model(input_shape, activation='relu'):
    model = Sequential()
    # 3 convolution layers with max pooling
    model.add(Conv2D(64, (5, 5), activation=activation, padding='same', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (5, 5), activation=activation, padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(256, (5, 5), activation=activation, padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    # 3 fully connected layers
    model.add(Dense(1024, activation=activation))
    model.add(Dropout(0.5))
    model.add(Dense(512, activation=activation))
    model.add(Dropout(0.5))
    model.add(Dense(6, activation='softmax'))  # 6 classes
    # summarize the model
    print(model.summary())
    return model
def compile_and_fit_model(model, X_train, y_train, X_vali, y_vali, batch_size, n_epochs, LR=0.01):
    # compile the model
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LR),
        loss='sparse_categorical_crossentropy',
        metrics=['sparse_categorical_accuracy'])
    # fit the model
    history = model.fit(x=X_train,
                        y=y_train,
                        batch_size=batch_size,
                        epochs=n_epochs,
                        verbose=1,
                        validation_data=(X_vali, y_vali))
    return model, history
I transformed the MEG data my professor recorded into magnitude scalograms using the CWT; pywt.cwt(data, scales, wavelet) was used. If I plot the coefficients I got from the CWT, I get a graph like this (I merged 62 channels into one graph).
I used the coefficients as train/test data for the CNN model. However, no matter how I tuned the parameters or added/removed layers, the classification accuracy was unchanged. So I want to know where I made a mistake: did I make a mistake building the CNN model, or did I make a mistake with the CWT (the way I handled the data)?
Please give me some advice, thank you.
What is the accuracy on the training data? If you have a small dataset and the model does not overfit after training for a while, then something is wrong with the model. You can also test with an existing dataset that the model should be able to handle (like Fashion MNIST).
Testing if you handled the data correctly is harder. Did you write unit tests for the different steps in the preprocessing pipeline?
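For the Fashion MNIST sanity check, a minimal sketch using the functions from the question could look like the following (swapping the 6-class head for a 10-class one; the epoch count, batch size, and learning rate are arbitrary choices). If this model also stays at chance accuracy, the problem is in the model or training setup rather than in the MEG preprocessing:

import tensorflow as tf

# Load a known-good dataset and bring it to the (samples, H, W, 1) shape the CNN expects.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train[..., None].astype('float32') / 255.0
x_test = x_test[..., None].astype('float32') / 255.0

model = build_cnn_model(input_shape=(28, 28, 1))
# Fashion MNIST has 10 classes, so replace the 6-class output layer.
model.pop()
model.add(tf.keras.layers.Dense(10, activation='softmax'))

model, history = compile_and_fit_model(model, x_train, y_train, x_test, y_test,
                                       batch_size=64, n_epochs=3, LR=0.001)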
I am trying to use Keras to do nonlinear regression. I have simulated 90,000 datasets and labelled them with 2 parameters. My goal is to have a fully connected NN estimate these two parameters after training. Currently, the model works well for fitting only one label. As a test, I have tried fitting each label independently, and this works well; however, when I want to fit both simultaneously, it fails (i.e., the model predicts one label accurately but not the other; in some instances the second label is off by a factor of 1000, and in other cases it simply reads [0.], depending on the activation of my output layer). One label is on the order of 1e7 and the other varies between 0 and 1. I have tried normalizing both labels to lie between 0 and 1; this didn't help. Each input is a vector of size 1024 associated with 2 labels.
Any help or literature suggestions on how to fit multi-labelled data would be much appreciated. Attached below is the code for my model. Thank you.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Build the model
model = Sequential()
# The input layer:
model.add(Dense(1024, kernel_initializer='normal', input_dim=1024, activation='relu'))
# The hidden layers:
model.add(Dense(1024, kernel_initializer='normal', activation='relu'))
model.add(Dense(1024, kernel_initializer='normal', activation='relu'))
model.add(Dense(1024, kernel_initializer='normal', activation='relu'))
# The output layer:
model.add(Dense(2, kernel_initializer='normal', activation='relu'))
# Compile the network:
model.compile(loss='MSE', optimizer='adam', metrics=['MSE'])
model.summary()
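Since the two targets live on very different scales (1e7 vs. 0 to 1), one variation that is often tried for multi-output regression (not part of the original post; the scaler choice and the X_train/y_train/X_test names are placeholders) is to standardize each target column independently and use a linear output layer, inverting the scaling only at prediction time:

from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# y_train has shape (n_samples, 2); scale each target column to zero mean / unit variance.
y_scaler = StandardScaler()
y_scaled = y_scaler.fit_transform(y_train)

model = Sequential([
    Dense(1024, activation='relu', input_dim=1024),
    Dense(1024, activation='relu'),
    Dense(2, activation='linear'),  # linear output so standardized (possibly negative) values are reachable
])
model.compile(loss='mse', optimizer='adam')
model.fit(X_train, y_scaled, epochs=50, batch_size=128, validation_split=0.1)

# Map predictions back to the original units.
y_pred = y_scaler.inverse_transform(model.predict(X_test))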
I created a CNN for a project I am involved with and I need to present it. The issue is, I am not sure how to count the layers.
Here is my model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Activation, MaxPooling2D, Flatten, Dense, Reshape

model = Sequential()
model.add(Conv2D(64, (3, 3), input_shape=(40, 40, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(1600))
model.add(Reshape((40, 40)))
model.add(Activation('sigmoid'))
model.compile(loss='MSE',
              optimizer='SGD',
              metrics=['MAE'])
len(model.layers) returned 12.
So did I use 1 input, 10 hidden, and 1 output layer,
or
should I count them as groups and say 1 input, 2 hidden, and 1 output?
When calculating the depth of a CNN, we usually consider only the layers that have tunable/trainable weights/parameters. In this model, only the convolutional layers and fully connected layers have trainable parameters. If you want to count layers, consider only the convolutional, fully connected, and output layers (Conv2D and Dense).
Max pooling layers are generally counted together with the convolutional layer they follow as one layer.
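One quick way to check this programmatically (a small sketch based on the model above) is to count only the layers that actually carry trainable parameters:

# Layers with trainable parameters (Conv2D and Dense here); Activation, MaxPooling2D,
# Flatten and Reshape all report 0 parameters.
weighted_layers = [layer for layer in model.layers if layer.count_params() > 0]
print(len(model.layers))     # 12 layers in total
print(len(weighted_layers))  # 4: two Conv2D and two Dense layers
for layer in weighted_layers:
    print(layer.name, layer.count_params())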
This is the model I am trying to replicate (more information in linked paper):
In our models, we adopted one dropout layer between LSTM models and the first fully-connected layer and another dropout layer between the first fully-connected layer and the second fully-connected layer. Their masking probabilities are both set to 0.5.
...
For our proposed CBLSTM, one-layer CNN is firstly designed, whose filter number, filter size and pooling size are set to 150, 10 and 5. Therefore, the shape of the raw sensory sequence is changed from 100 x 12 to 19 x 150 after CNN. Then, a two-layer bi-directional LSTM is built on top of the CNN.
Backward and forward LSTMs share the same layer sizes as [150, 200]. Therefore, the output of the LSTM module is the concatenated vector of the representations learned by backward and forward LSTMs, and its dimensionality is 400. Then, before feeding the representation into the linear regression layer, two fully-connected layers with a size of [500, 600] are adopted. The nonlinearity activation functions in our proposed CBLSTM are all set to ReLu.
Source: Zhao, R., Yan, R., Wang, J., & Mao, K. (2017). Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors, 17(2), 273. link to paper
The input is 630 samples x 100 timesteps x 12 features.
How my model looks at the moment:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Bidirectional, LSTM, Dropout, Dense

model = Sequential()
model.add(Conv1D(filters=150, kernel_size=10, activation='relu', input_shape=(100, 12)))
model.add(MaxPooling1D(pool_size=5, strides=None, padding='valid'))
model.add(Bidirectional(LSTM(150, return_sequences=True), merge_mode='concat'))
model.add(Bidirectional(LSTM(200, return_sequences=False), merge_mode='concat'))
model.add(Dropout(0.5))
model.add(Dense(500, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(600, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['mae'])
While the training loss steadily decreases per epoch, the validation loss does not and diverges pretty quickly. This indicates that there is a mistake in my model which I have not yet been able to find. Any ideas as to what is wrong?
Side note: I am using the same data as input as the authors.
I am working on binary sentiment classification (positive or negative). My LSTM code works fine, but when I convert my LSTM code into a CNN I get a ValueError: "input_length" is 30, but received input has shape (None, 1).
My input shape is (30, 1) and my batch size is 24 in the LSTM.
model.add(Embedding(30,30,input_length=30))
model.add(Conv1D(padding='valid',activation='relu',strides=1))
model.add(Dropout(0.2))
model.add(Dense(30))
model.add(Dropout(0.2))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
model.fit(inputBatch, ponlabel,batch_size=24,epochs=20,validation_data=(inputBatch, ponlabel))
For the sake of reference, I have included my LSTM code below.
model.add(LSTM(100, input_shape=(30, 1)))
model.add(Dense(30, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss='mean_absolute_error', optimizer='adam',metrics=["accuracy"])
model.fit(inputBatch, ponlabel,
batch_size=24, epochs=20, verbose=1)
You are using Convolution2D, but your data over timesteps is one-dimensional, so you need to use Conv1D to convolve over the tokens in your sentence. There is a CNN text-classification example in the Keras examples folder, imdb_cnn.py.
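The error message also suggests the Embedding layer is being fed input of shape (None, 1) rather than integer token sequences of shape (None, 30). A minimal sketch of an Embedding + Conv1D binary text classifier for sequences of length 30 might look like the following (the vocabulary size, filter count, and kernel size are assumptions, not values from the question):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense, Dropout

vocab_size = 5000   # assumed vocabulary size; use your tokenizer's actual size

model = Sequential()
# Inputs must be integer token indices of shape (samples, 30), not (samples, 30, 1).
model.add(Embedding(vocab_size, 30, input_length=30))
model.add(Conv1D(filters=64, kernel_size=3, padding='valid', activation='relu', strides=1))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.2))
model.add(Dense(30, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])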