I would like to train an autoencoder by using only specific PARTS of a layer (the layer named FEATURES in the autoencoder example at the bottom of this question).
In my case, NOK pictures for a new product are very rare, but they are needed for training. The aim is to generate NOK pictures from OK pictures (all examples I found did the opposite). The idea is to force the network to learn OK-picture structure in features[0:n-x] and NOK-picture structure (maybe from a similar product) in features[n-x:n], so that the NOK features can be used as parameters to generate NOK pictures from OK pictures.
Two ideas came to mind, both using a non-random dropout:
(1) keras.layers.Dropout(rate, noise_shape=None, seed=None) has the noise_shape argument, but I am not sure if it helps me as it only describes the shape. It would be perfect to be able to provide a mask consisting of {0,1} to apply on the layer in order to switch on/off specific nodes
(2) creating a custom layer (named MaskLayer below) which performs masking specific nodes of the layer e.g. as a tuple of {0,1}.
I have read this, but I do not think it applies (it generates a layer by concatenating layers which can be frozen separately).
def autoEncGenerate0(imgSizeX=28, imgSizeY=28, imgDepth=1):
    '''keras blog autoencoder'''
    input_img = Input(shape=(imgSizeX, imgSizeY, imgDepth))
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((4, 4), padding='same')(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    encoded0 = MaxPooling2D((8, 8), padding='same', name="FEATURES")(x)
    encoded1 = MaskLayer(mask)(encoded0)  # TO BE DONE (B2) masking layer parts
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(encoded1)
    x = UpSampling2D((8, 8))(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((4, 4))(x)
    decoded = Conv2D(imgDepth, (3, 3), activation='sigmoid', padding='same')(x)
    autoencoder = Model(input_img, decoded)
    autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
    return autoencoder
Thanks for hints.
Every instance of tf.keras.layers.Layer has a trainable attribute; setting it to False excludes that layer's variables from training. The FEATURES layer is a MaxPooling2D layer, which has no variables, so there is nothing in it to train. What you want is to train the variables of the convolutional layer that comes before that pooling layer.
You could do it like this:
# define architecture here
autoencoder = Model(input_img, decoded)
layers_names = [l.name for l in autoencoder.layers]
trainable_layer_index = layers_names.index('FEATURES') - 1
for i in range(len(autoencoder.layers)):
    if i != trainable_layer_index:
        autoencoder.layers[i].trainable = False
# compile here
NOTE that you compile the model AFTER you set layers to trainable/non-trainable.
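For idea (2) from the question, a fixed 0/1 mask can be applied with a small custom layer. The following is only a minimal sketch: the class name MaskLayer comes from the question, but its body, the choice to mask along the channel axis, and the 24/8 split are assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf


class MaskLayer(tf.keras.layers.Layer):
    """Multiplies the last (channel) axis by a fixed 0/1 mask (hypothetical helper)."""

    def __init__(self, mask, **kwargs):
        super().__init__(**kwargs)
        # The mask is a constant, not a trainable weight.
        self.mask = np.asarray(mask, dtype="float32")

    def call(self, inputs):
        # Broadcasts over batch, height and width. Channels multiplied by 0
        # pass nothing forward and get a zero gradient through this layer.
        return inputs * tf.constant(self.mask, dtype=inputs.dtype)


# Assumed split: 32 feature channels, keep the first 24, switch off the last 8.
mask = [1.0] * 24 + [0.0] * 8
features = tf.ones((2, 1, 1, 32))
masked = MaskLayer(mask)(features)
```

Because masked channels receive a zero gradient through this layer, the encoder filters feeding them are not updated by the reconstruction loss while the mask is active, which is roughly the effect of a non-random dropout.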
I'm new to Deep Learning and I can't find anywhere how to do the bottleneck in my AE with convolutional and dense layers. The code below is the specific part where I'm struggling:
...
encoded = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
# encoded = Dense(2)(encoded) # Linear activation function at the bottleneck
decoded = Conv2D(8, (3, 3), activation='relu', padding='same')(decoded)
...
I tried some solutions, like flatten and reshape, but nothing seems to work here. The point is that I need the latent space to be a dense layer of 2 because I need to sample points [x,y] from it. I did it with MLP following this link (https://www.kaggle.com/code/apapiu/manifold-learning-and-autoencoders/notebook) and it worked, but I can't manage to do the same with my structure.
Thanks in advance, and best regards!
Convolution2D expects an input tensor with at least 4 dimensions (batch, height, width, channels), so you need to reshape the flat input before passing it to the Convolution2D layer. You can use a model like the one below.
input_img = Input(shape=(784,))
input_img1 = Reshape(target_shape=(28,28,1))(input_img)
encoded = Convolution2D(8, (3, 3), activation='relu', padding='same')(input_img1)
encoded = Dense(2)(encoded)
decoded1 = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
decoded2 = Flatten()(decoded1)
decoded = Dense(784,)(decoded2)
Please refer to this gist for complete code with random data.
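One caveat: a Dense layer applied to a 4D tensor only acts on the last axis, so the latent code is not a single [x, y] point per image unless the feature map is flattened first. A minimal sketch of a true 2-D bottleneck could look like this; the 28x28x1 input size and the layer name "latent" are assumptions for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense, Reshape
from tensorflow.keras.models import Model

input_img = Input(shape=(28, 28, 1))
x = Conv2D(8, (3, 3), activation="relu", padding="same")(input_img)
x = Flatten()(x)                              # (batch, 28 * 28 * 8)
latent = Dense(2, name="latent")(x)           # linear 2-D bottleneck: one [x, y] per image
x = Dense(28 * 28 * 8, activation="relu")(latent)
x = Reshape((28, 28, 8))(x)                   # back to a feature map for the conv decoder
decoded = Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

autoencoder = Model(input_img, decoded)
encoder = Model(input_img, latent)            # shares weights with the autoencoder

images = np.random.rand(4, 28, 28, 1).astype("float32")
codes = encoder.predict(images, verbose=0)
recon = autoencoder.predict(images, verbose=0)
```

After training the autoencoder, the encoder model gives the 2-D points to sample from.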
I am trying to configure a network for character recognition of sequential data like license plates.
Now I would like to use the architecture which is noted in Table 3 in Deep Automatic Licence Plate Recognition system (link: http://www.ee.iisc.ac.in/people/faculty/soma.biswas/Papers/jain_icgvip2016_alpr.pdf).
The architecture the authors presented is this one:
The first layers are very common, but where I was stumbling was the top (the part in the red frame) of the architecture. They mention 11 parallel layers and I am really unsure how to get this in Python. I coded this architecture but it does not seem to be right to me.
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5), input_shape = (32, 96, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(256, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024, activation = "relu"))
model.add(Dense(11*37, activation="Softmax"))
model.add(keras.layers.Reshape((11, 37)))
Could someone help? How do I have to code the top part to get the same architecture as the authors?
The code below can build the architecture described in the image.
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Input, Reshape, Concatenate, Dropout

def create_model(input_shape=(32, 96, 1)):
    input_img = Input(shape=input_shape)
    '''
    Add the ST Layer here.
    '''
    # The input shape is already fixed by Input, so the conv layers do not need it
    model = Conv2D(64, kernel_size=(5, 5), activation="relu")(input_img)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Dropout(0.25)(model)
    model = Conv2D(128, kernel_size=(3, 3), activation="relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Dropout(0.25)(model)
    model = Conv2D(256, kernel_size=(3, 3), activation="relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Dropout(0.25)(model)
    model = Flatten()(model)
    backbone = Dense(1024, activation="relu")(model)
    # 11 independent classifiers, each reading the same backbone features
    branches = []
    for i in range(11):
        branches.append(backbone)
        branches[i] = Dense(37, activation="softmax", name="branch_"+str(i))(branches[i])
    output = Concatenate(axis=1)(branches)
    output = Reshape((11, 37))(output)
    model = Model(input_img, output)
    return model
From my understanding, your implementation is almost correct. The authors train 11 individual classifiers taking as input the output from the Fully Connected Layer. Here, you can think of "parallel" as "independent".
However, you cannot apply the Softmax activation right after the Fully Connected Layer. Since all the classifiers are independent, we want each of them to output a probability for each possible character. Putting things differently, we want the sum of the outputs of each classifier to be 1. Hence, the correct implementation would be:
...
model.add(Dense(1024, activation = "relu"))
# Feeding every neuron with the previous layer's output
model.add(Dense(11*37))
model.add(keras.layers.Reshape((11, 37)))
# Softmax over the last axis, i.e. over the 37 classes of each of the 11 positions
model.add(keras.layers.Softmax(axis=-1))
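After Reshape((11, 37)) the class scores of each position sit on the last axis, so that is the axis the softmax has to normalise over. The NumPy sketch below (random logits, purely illustrative) shows the difference between normalising over the classes and over the positions.

```python
import numpy as np


def softmax(z, axis):
    # Subtract the max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 11, 37))     # (batch, positions, classes)

per_position = softmax(logits, axis=-1)   # each position's 37 class scores sum to 1
per_class = softmax(logits, axis=1)       # normalises across the 11 positions instead

row_sums_ok = per_position.sum(axis=-1)   # shape (2, 11), all ones
row_sums_bad = per_class.sum(axis=-1)     # shape (2, 11), generally not ones
```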
I am trying to work on the Kaggle dataset (10_monkey_dataset), and I am not getting good accuracy with my model.
Here is the full model, written in Keras.
model.add(Conv2D(32, (3, 3), input_shape=(224, 224, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
I used batch_size = 32 and epochs = 350.
I trained it on Google Colab.
Here I am attaching the graph:
Full code can be found on my GitHub:
https://github.com/harsh52/10_monkey_species_kaggle_dataset/blob/master/CNN_monkey.ipynb
It looks like your model is overfitting as your train accuracy is way higher than your test accuracy. I would suggest applying any of the following techniques:
L1/L2 regularization
Adding more dropout
Batch normalization
Trying a smaller network architecture
You might also want to try the leaky relu activation function as this sometimes gives better results.
Also check out this Notebook; its author seems to get better accuracy with a simpler model.
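As a rough sketch of how these suggestions might be combined (L2 regularization, batch normalization, leaky ReLU, dropout and a smaller network), something like the following could be a starting point. The function name build_small_cnn, the filter counts and the weight_decay value are assumptions for illustration, not tuned settings.

```python
from tensorflow.keras import regularizers
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, LeakyReLU,
                                     MaxPooling2D, GlobalAveragePooling2D,
                                     Dropout, Dense)
from tensorflow.keras.models import Model


def build_small_cnn(input_shape=(224, 224, 3), n_classes=10, weight_decay=1e-4):
    inputs = Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):
        # L2 penalty on the kernels discourages large weights
        x = Conv2D(filters, (3, 3), padding="same",
                   kernel_regularizer=regularizers.l2(weight_decay))(x)
        x = BatchNormalization()(x)
        x = LeakyReLU()(x)
        x = MaxPooling2D((2, 2))(x)
    # Global pooling keeps the classifier head small compared with Flatten + Dense(512)
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)
    outputs = Dense(n_classes, activation="softmax")(x)
    return Model(inputs, outputs)


model = build_small_cnn()
```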
There are many things you can do to improve your accuracy:
Model architecture
You can add more convolutional layers per block. A "block" is a combination of multiple layers such as (CONV -> RELU -> CONV -> RELU -> BATCHNORM -> MAXPOOL)
You can add more blocks in order to have a deeper model
You can add batchnorm as said by @Daan Klijn; it can give more stability to your network and reduce overfitting
You can increase your model width by increasing the number of filters
Add more complex blocks, like squeeze-and-excite blocks (https://github.com/titu1994/keras-squeeze-excite-network), residual blocks, inception blocks
Example of model architecture:
Block 1: 3 conv with 32 filters
Block 2: 3 conv with 64 filters
Block 3: 3 conv with 128 filters
Block 4: 3 conv with 256 filters
Data preparation
You can preprocess your data by normalizing it
Add data augmentation (rotation, shifting, brightness, ...)
Other things
Tweak your hyperparameters, change the optimizer (Adam, SGD, ...), change the learning rate
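For the data preparation points, one possible sketch uses Keras preprocessing layers to normalise and augment images on the fly. The specific layers and factors below are assumptions chosen for illustration, and the random array stands in for real images.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

augment = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255),          # normalise pixel values to [0, 1]
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),           # up to +/- 10% of a full turn
    layers.RandomTranslation(0.1, 0.1),   # shift up to 10% in height and width
])

# Stand-in batch of 4 RGB images with values in [0, 255]
images = tf.constant(np.random.randint(0, 256, size=(4, 64, 64, 3)).astype("float32"))

# training=True enables the random transforms; at inference they are skipped
augmented = np.asarray(augment(images, training=True))
```

The same stack can be placed at the front of the model so that augmentation only runs during training.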
Example of one block with 2 conv and 32 filters:
x = Conv2D(32, (3, 3) , padding='SAME')(model_input)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Conv2D(32, (3, 3) , padding='SAME')(x)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
Example of one block with 3 conv and 64 filters:
x = Conv2D(64, (3, 3) , padding='SAME')(model_input)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3) , padding='SAME')(x)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3) , padding='SAME')(x)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
Along with other great answers you may also try different activation functions to get 1% more accuracy. I saw Mish activation function giving some good results.
https://github.com/digantamisra98/Mish
https://forums.fast.ai/t/meet-mish-new-activation-function-possible-successor-to-relu/53299/76
I have a CNN and I want to sneak in some extra information into one of the final layers.
Here's a simplified version of the code. Watch for the comment.
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3,3)))
    model.add(Conv2D(32, (3,3)))
    model.add(MaxPooling2D((2,2)))
    model.add(Conv2D(64, (3,3)))
    model.add(Conv2D(64, (3,3)))
    model.add(MaxPooling2D((2,2)))
    model.add(Flatten())
    # this next layer is where I want to sneak the neuron(s) in
    model.add(Dense(1024))
    model.add(Dropout(rate=0.4))
    model.add(Dense(168))
    model.compile()
    return model
So I have some additional information about the input image which might be able to help the network. Think of it as a clue which may or may not deserve a reasonable amount of weighting.
The clue is in the form of an integer which technically is in [0, inf) but practically is probably in [0, 20].
So my questions are:
What's the appropriate way to represent that hint, speaking in terms of NN architecture in general?
How do I tweak the Keras model to make that happen in practice?
Bonus: If I wanted to, could I prevent the subsequent dropout from ever dropping out this added feature?
This could work by using Keras' functional API:
def define_model():
    inputs = Input(shape=(...))
    hints = Input(shape=(...))
    x = Conv2D(32, (3,3))(inputs)
    x = Conv2D(32, (3,3))(x)
    x = MaxPooling2D((2,2))(x)
    x = Conv2D(64, (3,3))(x)
    x = Conv2D(64, (3,3))(x)
    x = MaxPooling2D((2,2))(x)
    x = Flatten()(x)
    # Concatenate appends the hint as extra features next to the flattened image features
    x = Concatenate()([x, hints])
    x = Dense(1024)(x)
    x = Dropout(rate=0.4)(x)
    outputs = Dense(168)(x)
    model = Model([inputs, hints], outputs)
    model.compile()
    return model
I don't know about protecting it from the dropout using Keras though.
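One way the hint could be kept out of reach of the dropout is to join it to the network after the Dropout layer, so only the image features are ever dropped. The sketch below is a shortened version of the model under several assumptions: a 32x32x1 input, a single scalar hint scaled by 20 (the practical maximum mentioned in the question), and a softmax output.

```python
import numpy as np
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Flatten,
                                     Dense, Dropout, Concatenate)
from tensorflow.keras.models import Model


def define_model(image_shape=(32, 32, 1), n_classes=168):
    inputs = Input(shape=image_shape)
    hints = Input(shape=(1,))             # one scalar clue per image

    x = Conv2D(32, (3, 3), activation="relu")(inputs)
    x = MaxPooling2D((2, 2))(x)
    x = Flatten()(x)
    x = Dense(1024, activation="relu")(x)
    x = Dropout(rate=0.4)(x)

    # The hint joins after Dropout, so it is never dropped.
    x = Concatenate()([x, hints])
    outputs = Dense(n_classes, activation="softmax")(x)
    return Model([inputs, hints], outputs)


model = define_model()
images = np.random.rand(4, 32, 32, 1).astype("float32")
# Assumed scaling: divide the integer clue by 20 to keep it roughly in [0, 1]
clues = np.random.randint(0, 21, size=(4, 1)).astype("float32") / 20.0
probs = model.predict([images, clues], verbose=0)
```

If the hint should still pass through the 1024-unit layer, an alternative is to apply Dropout to the image features before concatenating and then add the Dense layer afterwards.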
I have the following convolutional neural network to apply to images:
classifier = Sequential()
classifier.add(Convolution2D(128, (3, 3), input_shape = (128, 128, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Convolution2D(64, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Flatten())
After applying the convolutional and max-pooling layers, I flatten the result and want to store only that result (later I want to work with it using unsupervised methods). How do I do that? The only examples I have continue the process to fit the model, and I never store the flattened output.
This is covered in the Keras documentation for pretrained models. See the examples about feature extraction, https://keras.io/applications/#extract-features-with-vgg16
Once you have your model, you just do:
features = model.predict(x)
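Applied to the network from the question, this could look roughly like the sketch below. The random input batch and the output file name features.npy are placeholders, and the weights here are untrained, so the features are only meaningful once trained or pretrained weights are loaded.

```python
import os
import tempfile

import numpy as np
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.models import Sequential

classifier = Sequential([
    Input(shape=(128, 128, 3)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
])

# Placeholder batch; replace with the real images
images = np.random.rand(2, 128, 128, 3).astype("float32")

# predict() runs a forward pass only; no compile() or fit() is needed
features = classifier.predict(images, verbose=0)

# Store the flattened features for later unsupervised analysis
path = os.path.join(tempfile.gettempdir(), "features.npy")
np.save(path, features)
```

The saved array has one row per image and can be loaded again with np.load for clustering or dimensionality reduction.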