Time series classification using CNN - python

I am trying to build a convolutional neural network that classifies time series data into two classes. For the time being I only have a small dataset, so I first need to augment it before I can feed it into a network.
For the data augmentation task, I found some very helpful methods in the https://github.com/uchidalab/time_series_augmentation repository. So far I have tried adding Gaussian noise, permutation, time warping, window slicing and window warping. These methods are applied to a (batches, batch_rows, channels) = (354, 400, 3) dataset to generate a (1770, 400, 3) dataset (including train and test datasets and their corresponding labels).
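For readers unfamiliar with the noise-based method, here is a minimal NumPy sketch of jittering. It is not the repository's implementation; the function name `jitter`, the value of `sigma` and the zero-filled placeholder array are assumptions made for illustration.

```python
import numpy as np

def jitter(x, sigma=0.03, seed=None):
    """Return a copy of x with zero-mean Gaussian noise added to every value."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)

# Placeholder data with the shape quoted above: (series, timesteps, channels)
x = np.zeros((354, 400, 3))
x_aug = jitter(x, sigma=0.03, seed=0)

# Stack the originals and one augmented copy; labels must be repeated to match
x_all = np.concatenate([x, x_aug], axis=0)
print(x_all.shape)  # (708, 400, 3)
```

It is usually safer to split off the test set first and augment only the training part, so that noisy copies of a test series never end up in the training data.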
Given that I have a limited number of inputs, I would like to know if you have any suggestions for a 1D CNN structure that performs well on these datasets.
What I have tried so far is this network:
verbose, epochs, batch_size = 0, 10, 8
n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
model = Sequential()
model.add(Conv1D(filters=16, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
# evaluate model
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
No matter how I change the parameters and hyperparameters, I always get an accuracy of around 50%, meaning that the network has not learned to separate the two classes.
I would really appreciate it if anyone could tell me what the problem might be. Is this due to poor data quality produced by the augmentation methods, or does it have to do with the network itself?
Thanks in advance.

If it's a classification between two classes, you should use binary_crossentropy as the loss function, together with a single sigmoid output unit and 0/1 labels instead of a two-unit softmax.
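As a side note on the two possible setups: a two-unit softmax with categorical cross-entropy (one-hot labels) and a single sigmoid unit with binary cross-entropy (0/1 labels) are both consistent pairings. The NumPy sketch below, which uses made-up logits and labels and does not call Keras, checks that they give the same per-sample loss when the sigmoid input is the difference of the two softmax logits. What matters in practice is that the output layer, label encoding and loss match each other.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 2))          # two-unit output, before softmax
labels = rng.integers(0, 2, size=6)       # class indices, 0 or 1

# Option 1: two softmax units + categorical cross-entropy (one-hot labels)
p = softmax(logits)
cce = -np.log(p[np.arange(6), labels])

# Option 2: one sigmoid unit + binary cross-entropy (0/1 labels)
q = sigmoid(logits[:, 1] - logits[:, 0])  # probability of class 1
bce = -(labels * np.log(q) + (1 - labels) * np.log(1 - q))

print(np.allclose(cce, bce))  # True
```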

Related

Multi class image classification using CNN

I want to classify images into five classes using a CNN. But with every model I have tried, the training accuracy never rises above 20%. Can someone please help me overcome this? The model is mostly done training within 3 epochs, and adding more epochs does not improve the accuracy. Can anyone suggest a solution or a model, or point out what the problem could be?
Below is one of the models I have used:
#defining training and test sets
x_train,x_val,y_train,y_val=train_test_split(x,y,test_size=0.2, random_state=42)
print('Training data and target sizes: \n{}, {}'.format(x_train.shape,y_train.shape))
print('Test data and target sizes: \n{}, {}'.format(x_val.shape,y_val.shape))
Training data and target sizes:
(2398, 224, 224, 3), (2398,)
Test data and target sizes:
(600, 224, 224, 3), (600,)
img_rows, img_cols, img_channel = 224, 224, 3
base_model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet',pooling='avg', input_shape=(img_rows, img_cols, img_channel))
print(base_model.summary())
#Adding custom Layers
add_model = Sequential()
add_model.add(Dense(1024, activation='relu',input_shape=base_model.output_shape[1:]))
add_model.add(Dropout(0.60))
add_model.add(Dense(1, activation='sigmoid'))
print(add_model.summary())
# creating the final model
model = Model(inputs=base_model.input, outputs=add_model(base_model.output))
# compile the model
opt = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
reduce_lr = ReduceLROnPlateau(monitor='val_acc',
patience=5,
verbose=1,
factor=0.1,
cooldown=10,
min_lr=0.00001)
model.compile(
loss='categorical_crossentropy',
metrics=['acc'],
optimizer='adam'
)
print(model.summary())
n_fold = 5
kf = model_selection.KFold(n_splits = n_fold, shuffle = True)
eval_fun = metrics.roc_auc_score
model.fit(x_train,y_train,epochs=50,batch_size=50,validation_data=(x_val,y_val))
Could you share the part of the code where you fit the model? It's not available in the post.
And since the output is not reproducible without the data, I suggest you go through this link: https://www.kaggle.com/kenconstable/alzheimer-s-multi-class-classification
It's really well explained and covers best practices for multi-class classification, both with transfer learning and from scratch. If you don't find it helpful, please share the training script, including the model.fit() code.
Okay, so here's the issue.
In your code, you create a base model with Inception V3, but you are not really adding that base model to your add_model variable.
Your add_model variable is essentially a dense network and not a CNN. Another thing, although it's not a big deal: you create your own optimiser opt but don't use it in model.compile.
Can you please try this code and let me know if it works:
# imports assumed for this snippet (TensorFlow 2.x Keras)
import tensorflow
from tensorflow.keras import applications, backend
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout, Flatten

# function to build the model
def build_transfer_model(conv_base, dropout, dense_node, learn_rate, metric):
    """
    Build and compile a transfer learning model
    Input: a base model, dropout rate, the number of units in the dense layer,
    the learning rate and performance metrics
    Output: A compiled CNN model
    """
    # clear previous run
    backend.clear_session()
    # build the model
    model = Sequential()
    model.add(conv_base)
    model.add(Dropout(dropout))
    model.add(BatchNormalization())
    model.add(Flatten())
    model.add(Dense(dense_node, activation='relu'))
    # one softmax unit per class; assumes five classes with integer labels 0..4
    model.add(Dense(5, activation='softmax'))
    # compile the model
    model.compile(
        optimizer=tensorflow.keras.optimizers.Adam(learning_rate=learn_rate),
        # sparse variant because y_train holds integer labels, shape (2398,)
        loss='sparse_categorical_crossentropy',
        metrics=metric)
    model.summary()
    return model

img_rows, img_cols, img_channel = 224, 224, 3
base_model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', pooling='avg', input_shape=(img_rows, img_cols, img_channel))
model = build_transfer_model(conv_base=base_model, dropout=0.6, dense_node=1024, learn_rate=0.001, metric=['acc'])
print(model.summary())
model.fit(x_train, y_train, epochs=50, batch_size=50, validation_data=(x_val, y_val))
If you look at the function, the first thing we add to the Sequential() instance is the base model (InceptionV3 in your case). But you were adding a dense layer directly. Although it may get the weights from the output layer of the base Inception V3, it will be a dense network, not a CNN. So please check this out.
I may have changed some variable names, although I tried not to. Please also change the order of the layers in the build_transfer_model function according to your requirements.
In case it doesn't work, let me know.
Thanks.
You have to use model.fit() to actually train the model after compiling. Right now, it has randomly initialized weights, and is therefore making random predictions. Since you have five classes, the accuracy is approximately 1/5 = 20%. Training your model may take time depending on model size and amount of data you have.
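The 1/5 = 20% figure can be checked with a small simulation. This is only a sketch with synthetic labels, assuming roughly balanced classes and predictions that are unrelated to the input; it does not use the asker's data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 100_000, 5

y_true = rng.integers(0, n_classes, size=n_samples)   # roughly balanced labels
y_pred = rng.integers(0, n_classes, size=n_samples)   # predictions unrelated to the input

accuracy = np.mean(y_true == y_pred)
print(accuracy)  # close to 0.2
```

An accuracy that sits at this level from the first epoch onward suggests the model's outputs carry no information about the labels, whatever the underlying cause.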

Why the accuracy of the neural network stops increasing

I'm trying to solve the Titanic competition on Kaggle, but the model accuracy isn't going beyond 80%.
I tried changing the number of hidden nodes and the number of epochs, and also tried applying batch normalization, dropout and different weight initializations, but the accuracy stays at the same 80%. What am I doing wrong?
This is my code below:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(10, input_shape=(5,), kernel_initializer='he_normal', activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(20, kernel_initializer='he_normal', activation='relu'))
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(2, kernel_initializer=tf.keras.initializers.GlorotNormal(), activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
train_scores = model.fit(train_features, train_labels, epochs=200, batch_size=64, verbose=2)
Here is the accuracy over the last few epochs: [image: model accuracy]
How can I improve it?
You can try normalising the data. Generally, while implementing neural networks we don't need to normalise our data (if the network is deep), but since here we are working with only 3 layers, I guess normalising the data might help.
I would suggest splitting your training data again into training and validation sets and using K-fold cross validation (I am not sure about this one; I am new to this field too).
But in general, I have seen that if the accuracy is constant, the best approach is to alter the training data (I mean normalise it, or try imputing NaN values with the mean rather than setting them to 0).
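A minimal NumPy sketch of the two suggestions, mean imputation followed by standardization. The helper names `fit_preprocessor` and `apply_preprocessor` and the toy feature values are made up for illustration; the statistics are computed on the training data only and would be reused unchanged for validation or test data.

```python
import numpy as np

def fit_preprocessor(train):
    """Compute per-column mean and std from the training data, ignoring NaNs."""
    mean = np.nanmean(train, axis=0)
    std = np.nanstd(train, axis=0)
    std[std == 0] = 1.0          # avoid division by zero for constant columns
    return mean, std

def apply_preprocessor(x, mean, std):
    """Replace NaNs with the training mean, then standardize each column."""
    filled = np.where(np.isnan(x), mean, x)
    return (filled - mean) / std

# Toy stand-in for five Titanic features (values are made up)
train = np.array([[22.0, 7.25, 1, 0, 3],
                  [38.0, 71.28, 1, 0, 1],
                  [np.nan, 7.92, 0, 0, 3],
                  [35.0, 53.10, 1, 0, 1]])
mean, std = fit_preprocessor(train)
train_scaled = apply_preprocessor(train, mean, std)
print(np.isnan(train_scaled).any())  # False
```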

Neural Network always predicting the same class

I've developed an image classifier using a convolutional neural network. The whole code is written using Keras. The dataset contains .jpg images of size 360x480 with labels 0, 1, 2 and 3. The data has been balanced, so there is the same number of pictures for each label in both the training and validation datasets.
I've organized my data in directories to use a data generator function that will load the images while the model is training.
The organization of the data is as follows:
Data:
    Train:
        0: a1.jpg
           a2.jpg
           ...
        1: b1.jpg
           b2.jpg
           ...
        ...
Same for all labels and for test data.
The code used to define and fit the neural network is the following:
model = Sequential()
model.add(Conv2D(128, kernel_size=3, activation='relu', input_shape=input_shape, padding="valid"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, kernel_size=3, activation='relu', padding="valid"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(32, kernel_size=3, activation='relu', padding="valid"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(512, activation="relu"))
model.add(Dense(4, activation="softmax"))
opt = SGD(lr=learning_rate)
model.compile(loss="categorical_crossentropy", optimizer=opt,
metrics=["accuracy"])
train_datagen = ImageDataGenerator(
width_shift_range=0.05,
height_shift_range=0.05,
rescale=1./255,
horizontal_flip=True)
validate_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
path_data + '/train',
shuffle = True,
target_size=input_generator,
batch_size=batch_size)
validation_generator = train_datagen.flow_from_directory(
path_data + '/validate',
shuffle = True,
target_size=input_generator,
batch_size=batch_size)
steps_per_epoch = np.ceil(train_size/batch_size)
validation_steps = np.ceil(validation_size/batch_size)
H = model.fit_generator(
train_generator,
steps_per_epoch=steps_per_epoch,
epochs=epochs,
validation_data=validation_generator,
validation_steps=validation_steps)
When I train the network, the training accuracy stays around 0.25, and when I calculate a confusion matrix on the test dataset, I see that it predicts everything to be in the same class. What could be happening?
I've tried several experiments to determine where the problem could come from:
- I've changed the network to a VGG architecture, training all the layers, reducing the size of the pictures to 150x150, with different learning rates (0.001, 0.01, 0.1).
- I've reduced the dataset to only 16 photos for training (4 for each label) and 5 for validation (1 for each of three labels, and 2 for the fourth label). Using VGG as in the previous bullet point, with 100 epochs (trying to force overfitting), the network hasn't learned anything. I even tried reducing the size of the pictures to 50x50 and the problem persists.
- After these attempts, I created a dataset of 16 pictures (150x150) for training and 5 for validation, where each picture is just a plain color (red, blue, yellow, green, one for each label): no shapes, no images, only plain color. The neural network hasn't been able to learn (in the same configuration as in the previous two bullet points).
None of these has solved the problem; the network still predicts everything to be from the same class.
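One way to read the plain-color experiment: such a dataset can be separated by something as simple as the mean color of each image, so a network that cannot fit it points toward the input pipeline, the labels or the optimization setup rather than the difficulty of the data. The sketch below rebuilds a comparable toy dataset in NumPy and classifies it with a nearest-centroid baseline. The color-to-label mapping and the baseline itself are assumptions for illustration, not part of the original experiment.

```python
import numpy as np

# Four plain colors (RGB in [0, 1]), one per label; the mapping is assumed
colors = np.array([[1.0, 0.0, 0.0],   # red    -> label 0
                   [0.0, 0.0, 1.0],   # blue   -> label 1
                   [1.0, 1.0, 0.0],   # yellow -> label 2
                   [0.0, 1.0, 0.0]])  # green  -> label 3

labels = np.repeat(np.arange(4), 4)   # 16 images, 4 per label
images = np.broadcast_to(colors[labels][:, None, None, :], (16, 150, 150, 3)).copy()

# Baseline: average color per image, then nearest class centroid
features = images.mean(axis=(1, 2))   # shape (16, 3)
centroids = np.stack([features[labels == k].mean(axis=0) for k in range(4)])
dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)

print((pred == labels).mean())  # 1.0
```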
CURRENT STATE
- I transformed my data into a binary classification problem (merged labels 0 and 1, and labels 2 and 3). This first step makes sense since the labels were levels of dirt in a pipe (0: 0-25%, 1: 25-50%, ...).
- Used the ResNet50 architecture without initial weights.
- Cropped the images to 224x224.
Results: the output of the model for a few images makes sense (it's not [1,0]), and the accuracy rose over the epochs (at least on the training dataset, meaning that the network is learning).
Next steps will be optimizing the model, and maybe coming back to the original label classification, since that was the original purpose of the project.

CNN model overfitting on multi-class classification

I am trying to use GloVe embeddings to train a CNN model based on this article (also an RNN, which has the same issue). The dataset is labeled data: text (tweets) with labels (hate, offensive or neither).
The problem is that the model performs well on the train set but poorly on the validation set.
Here is the model:
kernel_size = 2
filters = 256
pool_size = 2
gru_node = 64
model = Sequential()
model.add(Embedding(len(word_index) + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=MAX_SEQUENCE_LENGTH,
trainable=True))
model.add(Dropout(0.25))
model.add(Conv1D(filters, kernel_size, activation='relu'))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(Conv1D(filters, kernel_size, activation='softmax'))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, recurrent_dropout=0.2))
model.add(Dense(1024,activation='relu'))
model.add(Dense(nclasses))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
Fitting the model:
X = df.tweet
y = df['classifi'] # classes 0,1,2
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, shuffle=False)
X_train_Glove,X_test_Glove, word_index,embeddings_index = loadData_Tokenizer(X_train,X_test)
model_RCNN = Build_Model_RCNN_Text(word_index,embeddings_index, 20)
model_RCNN.fit(X_train_Glove, y_train,validation_data=(X_test_Glove, y_test),
epochs=15,batch_size=128,verbose=2)
predicted = model_RCNN.predict(X_test_Glove)
predicted = np.argmax(predicted, axis=1)
print(metrics.classification_report(y_test, predicted))
This is what the class distribution looks like (0: hate, 1: offensive, 2: neither): [image]
[image: model summary]
Results:
[image: classification report]
Is this the correct approach, or am I missing something here?
Generally speaking, there are two sides from which you can tackle overfitting:
- Improving the data
  - More unique data
  - Oversampling (to balance the data)
- Limiting the network structure
  - Dropout (you've implemented this)
  - Fewer parameters (you might want to benchmark against a much smaller network)
  - Regularization (e.g. L1 and L2)
I'd suggest trying with significantly fewer parameters (because this is quick) and oversampling (because your data seems lopsided).
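A minimal sketch of random oversampling in NumPy, assuming the features and labels are available as arrays (the question uses pandas objects, so a conversion would be needed). The function name `random_oversample` and the class counts are made up for illustration. Oversampling would normally be applied to the training split only, so that duplicated samples do not appear in the validation data.

```python
import numpy as np

def random_oversample(x, y, seed=0):
    """Resample every class (with replacement) up to the size of the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(y == c)
        extra = rng.choice(members, size=target - n, replace=True)
        idx.append(np.concatenate([members, extra]))
    idx = rng.permutation(np.concatenate(idx))
    return x[idx], y[idx]

# Toy imbalanced labels (counts are made up): 0 = hate, 1 = offensive, 2 = neither
y = np.array([0] * 5 + [1] * 80 + [2] * 15)
x = np.arange(len(y)).reshape(-1, 1)      # stand-in for tokenized tweets

x_bal, y_bal = random_oversample(x, y)
print(np.bincount(y_bal))  # [80 80 80]
```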
Also, you can try hyperparameter fitting: making a large number of networks with different parameters, then picking the best one.
Note: if you do hyperparameter fitting, make sure to have an extra validation set, because you can easily overfit your test set this way.
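The extra validation set can be carved out with a three-way split. The sketch below returns index arrays only; the function name `three_way_split` and the 70/15/15 proportions are assumptions for illustration. Hyperparameters would be chosen on the validation indices, and the test indices touched once at the end.

```python
import numpy as np

def three_way_split(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Return disjoint index arrays for the train, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    n_val = int(round(n_samples * val_frac))
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = three_way_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 700 150 150
```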
Side note: Sometimes when troubleshooting NN it is helpful to set the optimizer to a basic stochastic gradient descent. It slows the training down a bunch but makes the progression much clearer.
Good luck!

Keras LSTM Continue training after save

I'm working on an LSTM model and I'd like to save it and continue training later with extra data as it accumulates.
My problem is that after saving the model and loading it again the next time I run the script, the prediction is completely wrong; it just mimics the data I entered into it.
Here's the model initialization:
# create and fit the LSTM network
if retrain == 1:
    print("Creating a newly retrained network.")
    model = Sequential()
    model.add(LSTM(inputDimension, input_shape=(1, inputDimension)))
    model.add(Dense(inputDimension, activation='relu'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, verbose=2)
    model.save("model.{}.h5".format(interval))
else:
    print("Using an existing network.")
    model = load_model("model.{}.h5".format(interval))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, verbose=2)
    model.save("model.{}.h5".format(interval))
del model
model = load_model("model.{}.h5".format(interval))
model.compile(loss='mean_squared_error', optimizer='adam')
The first dataset, used when retrain is set to 1, has around 10,000 entries and is trained for around 3k epochs with a batch size of 5%.
The second dataset is a single entry (one row), again trained for 3k epochs, with batch_size=1.
Solved
I was reloading the scaler incorrectly:
scaler = joblib.load('scaler.{}.data'.format(interval))
dataset = scaler.fit_transform(dataset)
Correct:
scaler = joblib.load('scaler.{}.data'.format(interval))
dataset = scaler.transform(dataset)
fit_transform recalculates the multipliers for the scaled values, which means there will be an offset from the original data.
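The effect can be reproduced with a tiny stand-in scaler. The class `MinMax` below is a hypothetical NumPy re-implementation of fit/transform semantics written for this example, not the scaler from the post (which is not shown), and the numbers are made up.

```python
import numpy as np

class MinMax:
    """Minimal stand-in for a min-max scaler with fit/transform semantics."""
    def fit(self, data):
        self.lo, self.hi = data.min(axis=0), data.max(axis=0)
        return self

    def transform(self, data):
        return (data - self.lo) / (self.hi - self.lo)

    def fit_transform(self, data):
        return self.fit(data).transform(data)

train = np.array([[0.0], [50.0], [100.0]])   # data the scaler was originally fitted on
new = np.array([[40.0], [60.0]])             # data arriving later

scaler = MinMax().fit(train)
print(scaler.transform(new).ravel())         # [0.4 0.6]
print(MinMax().fit_transform(new).ravel())   # [0. 1.]
```

With transform, the new values keep their position relative to the original range. With fit_transform, the range is recomputed from the new data alone, so the model receives inputs on a different scale from the one it was trained on.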
From the Keras functional Model API documentation for model.fit():
initial_epoch: Integer. Epoch at which to start training (useful for resuming a previous training run).
Setting this parameter might solve your problem.
I think the source of the problem is the adaptive learning rate from Adam. During training, the learning rate naturally declines to allow finer tuning of the model. When you retrain your model with only one sample, the weight updates are much too big (because of the reset learning rate), which can totally destroy your previous weights.
If initial_epoch does not help, then try to start your second training with a lower learning rate.
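To illustrate why a freshly compiled optimizer can disturb trained weights, here is a NumPy sketch of the standard Adam update rule (not Keras internals). When the moment estimates start from zero, the first update has a magnitude of roughly the learning rate, however small the gradient is. The function name `adam_first_step` and the gradient values are assumptions for illustration.

```python
import numpy as np

def adam_first_step(grad, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Size of the first Adam update when the moment estimates start at zero."""
    m = (1 - beta1) * grad            # first moment after one step
    v = (1 - beta2) * grad ** 2       # second moment after one step
    m_hat = m / (1 - beta1)           # bias correction at t = 1
    v_hat = v / (1 - beta2)
    return lr * m_hat / (np.sqrt(v_hat) + eps)

for g in (1e-4, 1.0, 100.0):
    print(g, adam_first_step(g))      # each step is close to lr = 0.001
```

Under this reading, thousands of updates on a single row can move the weights a long way even when the gradients are tiny. Passing an optimizer instance with a smaller rate to compile, for example Adam(learning_rate=1e-4) where the value is only an illustration, is one way to try the lower learning rate suggested above.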
