I want to add some extra information, such as gender, age, or a feature vector, to a CNN...
My CNN takes as input matrices that represent voice histograms with dimensions 125x64. Since they come from different people, I would like to add that information to the model. I would also like to add a 125x1 vector that represents the pitch or the energy of the voice (obtained from feature extraction), but I don't think it is a good idea to attach it to the histogram.
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='valid', strides=1,
                 input_shape=input_shape, activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 3), strides=(1, 3)))
model.add(Conv2D(32, (1, 3), padding='valid', strides=1,
                 activation='relu'))
model.add(MaxPooling2D(pool_size=(1, 3), strides=(1, 3)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])
It indeed doesn't make much sense to attach that data to the histogram. Keras explains how to use multiple inputs in a model in its own documentation: https://keras.io/getting-started/functional-api-guide/. The section "Multi-input and multi-output models" seems to be what you're looking for.
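A minimal sketch of that approach for your case (layer sizes and the auxiliary feature count are illustrative, not taken from your code): the 125x64 histogram goes through the conv stack, while the extra features (e.g. gender, age, and the 125x1 pitch/energy vector) enter through a second input and are concatenated before the dense layers:

```python
from tensorflow.keras import Model
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Flatten,
                                     Dense, Dropout, concatenate)

nb_classes = 10  # illustrative

# Main input: the 125x64 voice histogram (single channel)
hist_in = Input(shape=(125, 64, 1), name='histogram')
x = Conv2D(32, (3, 3), activation='relu')(hist_in)
x = MaxPooling2D(pool_size=(4, 3), strides=(1, 3))(x)
x = Flatten()(x)

# Auxiliary input: e.g. gender (1), age (1), and the 125-dim pitch/energy vector
aux_in = Input(shape=(127,), name='aux_features')

# Merge both branches before the classifier head
merged = concatenate([x, aux_in])
merged = Dense(512, activation='relu')(merged)
merged = Dropout(0.5)(merged)
out = Dense(nb_classes, activation='softmax')(merged)

model = Model(inputs=[hist_in, aux_in], outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adadelta',
              metrics=['accuracy'])
```

Training then takes a list of two arrays, e.g. `model.fit([histograms, aux_features], labels, ...)`.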
So I train my model on a dataset, and with each epoch I can see the loss and val_loss go down (it is important to note that val_loss goes down to a certain point, but then it stagnates as well, with some minor ups and downs) and the accuracy go up, but for some reason my val_accuracy stays at roughly 0.33.
I browsed around and it seems to be an overfitting problem, so I added Dropout layers and L2 regularization on some layers of the model, but it seems to have no effect. Therefore I would like to ask what you think I could improve in my model so that val_loss keeps going down and val_accuracy stops stagnating and keeps going up.
I've tried using more images, but the problem seems to be the same. Not sure if my increase in images was enough, though.
Should I add Dropout layers in the Conv2D layers?
Should I use less or more l2 regularization?
Should I use even more images?
Just some questions that might have something to do with my problem.
My model is below:
model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), input_shape=(580, 360, 1), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # Flattening the 2D arrays for fully connected layers
model.add(Dense(532, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(266, activation='softmax'))
model.add(Reshape((7, 38)))
print(model.summary())
optimizer = keras.optimizers.SGD(learning_rate=0.00001)
# Pass the configured optimizer object; the string 'SGD' would create a
# fresh default optimizer and silently ignore the custom learning rate.
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
Thanks in advance!
PS: Here is the graph of training:
PS2: Here is the end of training:
Epoch 40/40
209/209 [==============================] - 68s 327ms/step - loss: 0.7421 - accuracy: 0.9160 - val_loss: 3.8159 - val_accuracy: 0.3152
This seems to be a classic overfitting problem.
It would help to have a more detailed description of the problem: is it a classification task? Are your images grayscale? What is the purpose of this network?
With the information given, I would say that any proper regularization of the network should help. Some items you could try:
For conv layers I recommend using SpatialDropout layers.
Get more data (if possible)
Use data augmentation (if possible)
Increase the rate of the dropout layers
Try reducing the complexity of your model architecture (maybe fewer layers, fewer number of filters in general, fewer number of neurons in dense layers, etc.)
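As a sketch of the first suggestion (the layer sizes mirror the start of your model but are otherwise illustrative), SpatialDropout2D drops entire feature maps rather than individual activations, which regularizes conv layers more effectively than plain Dropout:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, SpatialDropout2D,
                                     BatchNormalization, MaxPooling2D)

# Illustrative conv block: SpatialDropout2D zeroes whole channels at once,
# so neighbouring pixels (which are strongly correlated) cannot simply
# compensate for dropped units the way they do with ordinary Dropout.
block = Sequential([
    Conv2D(16, (3, 3), padding='same', activation='relu',
           input_shape=(580, 360, 1)),
    BatchNormalization(),
    SpatialDropout2D(0.2),   # drops 20% of the 16 feature maps per step
    MaxPooling2D(pool_size=(2, 2)),
])
```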
Hope this helps!
Just a hint:
You have a problem with your CNN architecture: the size should get lower and lower with each convolution, but in your case it is growing: you have 16, 32, 64, 64, 128. You should do it the other way around: start from input_shape=(580, 360) and then you may go, let us say, to 256, 128, 64, 32 filters for the Conv2D layers.
I'm building a CNN to control a vehicle within a video game. The network takes a screenshot as input and uses controller values as targets. For now I'm just using two controller values as targets: steering, which is a value between -1 and 1, and throttle, which is between 0 and 1. I have rounded steering into 7 values and throttle into 4 values, giving me 28 distinct classes, which I will balance (the reason for rounding was the difficulty of balancing un-binned classes).
My question is whether I should train the network with a single class target, 0-27 (one for each combination), or whether I should use the two rounded controller values as targets (an array: [steering, throttle])? I understand that both represent 28 target classes, but does the structure of the target output affect the performance of the network? Is one of these options noticeably better than the other?
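To make the single-target option concrete, here is a minimal sketch (assuming 7 steering bins and 4 throttle bins, as described above) of mapping a bin pair to a single class index and back:

```python
STEERING_BINS = 7
THROTTLE_BINS = 4

def to_class(steering_bin, throttle_bin):
    """Map a (steering, throttle) bin pair to one of 28 class indices."""
    return steering_bin * THROTTLE_BINS + throttle_bin

def from_class(cls):
    """Invert the mapping: recover the (steering, throttle) bin pair."""
    return divmod(cls, THROTTLE_BINS)

# Example: the last steering bin with the last throttle bin is the last class.
print(to_class(6, 3))    # -> 27
print(from_class(27))    # -> (6, 3)
```

Either way the network sees 28 mutually exclusive classes; this mapping just decides whether that fact is explicit (one softmax over 28) or implicit (two target values).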
The model for preliminary testing:
model = Sequential()
model.add(Conv2D(24, kernel_size=(5, 5), strides=(2, 2), activation='relu', input_shape=INPUT_SHAPE))
model.add(Conv2D(36, kernel_size=(5, 5), strides=(2, 2), activation='relu'))
model.add(Conv2D(48, kernel_size=(5, 5), strides=(2, 2), activation='relu'))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(1164, activation='relu'))
drop_out = 1 - keep_prob  # keep_prob is defined elsewhere
model.add(Dropout(drop_out))
model.add(Dense(100, activation='relu'))
model.add(Dropout(drop_out))
model.add(Dense(50, activation='relu'))
model.add(Dropout(drop_out))
model.add(Dense(10, activation='relu'))
model.add(Dropout(drop_out))
model.add(Dense(OUT_SHAPE, activation='softsign'))
Let's say I have a binary classification task and I'm using a CNN. Simply visualizing the CNN isn't very helpful, as the inputs aren't images. However, I would like to know which particular filters contribute the most to an input sample being assigned a particular class.
Given the following architecture (implemented using Keras), how do I achieve this?
model = Sequential()
model.add(Conv2D(32, kernel_size=(10, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (10, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
I explored resources A and B, but neither seems helpful for what I want to do. If there are other suggestions for understanding what a network learns on non-image datasets, that would be really helpful.
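One possible approach (a sketch, not from the question; the demo model and input are stand-ins for your trained model and a real sample) is to score each filter of a conv layer by the mean absolute gradient of the chosen class output with respect to that filter's activation map, similar in spirit to Grad-CAM:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def filter_importance(model, x, conv_layer_name, class_index):
    """Score each filter of one conv layer by the mean absolute gradient
    of the chosen class score w.r.t. that filter's activation map."""
    conv_layer = model.get_layer(conv_layer_name)
    grad_model = keras.Model(model.inputs, [conv_layer.output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(x[np.newaxis, ...])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)  # shape (1, h, w, n_filters)
    return tf.reduce_mean(tf.abs(grads), axis=(0, 1, 2)).numpy()

# Tiny illustrative model (stand-in for the trained model in the question)
inp = keras.Input(shape=(20, 6, 1))
h = layers.Conv2D(32, (10, 3), activation='relu', name='conv1')(inp)
h = layers.Flatten()(h)
out = layers.Dense(2, activation='softmax')(h)
demo = keras.Model(inp, out)

x = np.random.rand(20, 6, 1).astype('float32')  # one input sample
scores = filter_importance(demo, x, 'conv1', class_index=1)
top = np.argsort(scores)[::-1][:5]  # indices of the 5 most influential filters
```

A filter with a large score moved the class probability the most for this particular sample; averaging the scores over many samples of one class gives a per-class ranking of filters.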
I'm trying to make a simple classification model for the CIFAR-10 dataset. The model fails when it gets to the MaxPooling function. It says the syntax is incorrect, but for the life of me I cannot figure out what's wrong.
Is it the version of Keras I'm using? When I add max pooling to the model with a size of (2, 2) it doesn't work, and in the documentation I am doing the exact same thing, which makes me think it's a version problem.
Sorry if the problem is obvious.
model = Sequential()
model.add(Conv2D(32, (3,3), padding = 'same', input_shape=(32,32,3)))
model.add(Activation('relu')
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu')
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
model.summary()
Max pooling is not the issue. You are missing closing brackets on the previous lines: the two `model.add(Activation('relu')` calls are never closed, so Python reports the syntax error on the following line. Find the corrected code below:
model = Sequential()
model.add(Conv2D(32, (3,3), padding = 'same', input_shape=(32,32,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
model.summary()
Hope this helps.
I am using this CNN architecture:
def createModel():
    model = Sequential()
    model.add(Conv2D(96, (11, 11), strides=(4, 4), padding='same', activation='relu', input_shape=(224, 224, 3)))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same'))
    model.add(Conv2D(256, (5, 5), strides=(1, 1), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same'))
    model.add(Conv2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu'))
    model.add(Conv2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu'))
    model.add(Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu'))
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    return model
In total, that makes 226,065,795 trainable parameters... I train my model on a fairly small dataset: I have three classes, each containing 1000 images, split between training and validation sets. I used one-hot encoding.
However, when I train the model, the overall accuracy tends toward a very low result: 32%. I am using the categorical_crossentropy loss function. Looking deeper, it seems the model always predicts the vector 1,0,0, i.e. the first class and never the others, even on the training data that has already been shown to the model.
What's the problem here? Why do I get poor results? My dataset is balanced, though. Is it because my number of parameters is too high compared to the amount of data? I don't understand...
EDIT: removing Dropout does not change this bad behaviour.
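As a sanity check on the stated size, the parameter count can be reproduced by building the model (assuming `num_classes = 3`, as described) and calling `count_params()`:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

num_classes = 3  # three classes, as described above

model = Sequential([
    Conv2D(96, (11, 11), strides=(4, 4), padding='same', activation='relu',
           input_shape=(224, 224, 3)),
    MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same'),
    Conv2D(256, (5, 5), padding='same', activation='relu'),
    MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same'),
    Conv2D(384, (3, 3), padding='same', activation='relu'),
    Conv2D(384, (3, 3), padding='same', activation='relu'),
    Conv2D(256, (3, 3), padding='same', activation='relu'),
    Flatten(),
    Dense(4096, activation='relu'),
    Dropout(0.5),
    Dense(4096, activation='relu'),
    Dropout(0.5),
    Dense(num_classes, activation='softmax'),
])

# The first 4096-unit Dense layer dominates: Flatten outputs
# 14*14*256 = 50176 units, so that layer alone holds ~205M weights.
print(model.count_params())  # -> 226065795
```

Almost all of those parameters sit in the dense head, which is a lot to fit from roughly 3000 images.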