I have a CNN built in Keras.
I can view the weights of a specific layer using:
weights,biases = model.layers[n].get_weights()
print(weights,biases)
But I want to view the weights in all relevant layers so I tried:
for layer in model.layers:
    weights, biases = layer.get_weights()
But this throws an error for layers which don't have weights (like the max pool layer).
So is it possible to check if a layer has weights, something like:
if layer.weights != None:
Edit: Added the keras model I am using
model = keras.models.Sequential()
model.add(keras.layers.Conv2D(1, kernel_size=(3,3),activation='relu', input_shape=(6,6,1)))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(len(sign_label_list),activation='relu'))
You could try this:
for layer in model.layers:
    if len(layer.get_weights()) > 0:
        print(layer.name, layer.get_weights())
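Equivalently, layer.weights is an empty list for parameter-free layers such as MaxPool2D and Flatten, so a plain truthiness check also works. The unpacking below assumes each weighted layer has exactly a kernel and a bias, which holds for the Conv2D and Dense layers in this model:

for layer in model.layers:
    if layer.weights:  # empty list for e.g. MaxPool2D, Flatten
        weights, biases = layer.get_weights()
        print(layer.name, weights.shape, biases.shape)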
Related
Does a pre-trained VGG-16 with ImageNet weights work with the LeakyReLU activation function?
Our input has negative values, hence we would like to use LeakyReLU.
Sample code below.
VGG-16 using ReLU:
# CREATE VGG16
vgg = VGG16(include_top=False, pooling = 'max', weights='imagenet', input_shape=(96,96,3))
vgg.get_config()
# freeze weights in all layers except the last 10
for layer in vgg.layers[:-10]:
    layer.trainable = False
# Check if any layer is not frozen
for each in vgg.layers:
    print(each, each.trainable)
# print a newline to keep the output neat
print('')
# Look at the summary of VGG16
vgg.summary()
VGG-16 using LeakyReLU:
# CREATE VGG16
vgg = VGG16(include_top=False, pooling = 'max', weights='imagenet', input_shape=(96,96,3))
for layer in vgg.layers[:-10]:
    if hasattr(layer, 'activation'):
        layer.activation = LeakyReLU(alpha=leaky_relu_alpha)
# freeze weights in all layers except the last 10
for layer in vgg.layers[:-10]:
    layer.trainable = False
# Check if any layer is not frozen
for each in vgg.layers:
    print(each, each.trainable)
# print a newline to keep the output neat
print('')
# Look at the summary of VGG16
vgg.summary()
If the VGG was pretrained with ReLU, then LeakyReLU may not work well: the layers after each ReLU activation never saw feature maps with negative values, so swapping activations may lead to low performance. However, the impact of LeakyReLU on performance depends highly on your task. I suggest you fine-tune the pre-trained VGG with the LeakyReLU activation using a small learning rate for several epochs; this should address the performance degradation.
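For instance, a minimal sketch of such a fine-tuning setup, assuming model is the LeakyReLU variant of the VGG topped with a task-specific classifier head (the optimizer, learning rate, epoch count, and training arrays are placeholders, not taken from the question):

from keras.optimizers import Adam

# a small learning rate lets the pretrained weights adapt gently to the new activation
model.compile(optimizer=Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_val, y_val))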
Another way of handling your negative input is to normalize it into the range [0, 1] or some other range.
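A minimal sketch of that alternative, assuming the raw input is a NumPy array x:

import numpy as np

# min-max scaling: maps the smallest value to 0 and the largest to 1
x_scaled = (x - x.min()) / (x.max() - x.min())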
I am using a pre-trained ResNet model for a classification problem and the model is overfitting the data. I want to try adding dropout after each activation layer of the pre-trained ResNet architecture.
def add_dropouts(model, probability=0.5):
    print("Adding Dropouts")
    updated_model = tf.keras.models.Sequential()
    for layer in model.layers:
        print("layer = ", layer)
        updated_model.add(layer)
        if isinstance(layer, tf.keras.layers.Activation):
            updated_model.add(tf.keras.layers.Dropout(probability))
    print("updated model summary:")
    updated_model.summary()
    print("original model summary:")
    model.summary()
    model = updated_model
    return model
base_model = tf.keras.applications.ResNet50V2(weights=weights, include_top=False, input_shape=input_img_shape, pooling='avg')
base_model = add_dropouts(base_model, probability = 0.5)
But I am getting the following error:
ValueError: Input 0 of layer conv2_block1_3_conv is incompatible with the layer: expected axis -1 of input shape to have value 64 but received input with shape [None, 128, 128, 256]
This is preventing me from copying the layers individually. How can I solve this? If anyone has a better idea for adding Dropout after every activation layer, please share.
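One possible direction, not a definitive fix: ResNet50V2 is not a sequential model (its residual connections branch and merge), which is why copying layers into a Sequential container produces the shape mismatch above. tf.keras.models.clone_model rebuilds the branching topology for you, and its clone_function hook can swap each Activation layer for a small Sequential of the activation plus a Dropout. This is a sketch under those assumptions and may need adjusting across TF versions:

import tensorflow as tf

def insert_dropout(layer, probability=0.5):
    # rebuild each layer from its config; wrap activations with a Dropout
    new_layer = layer.__class__.from_config(layer.get_config())
    if isinstance(layer, tf.keras.layers.Activation):
        return tf.keras.Sequential([new_layer, tf.keras.layers.Dropout(probability)])
    return new_layer

base_model = tf.keras.applications.ResNet50V2(weights=weights, include_top=False,
                                              input_shape=input_img_shape, pooling='avg')
new_model = tf.keras.models.clone_model(base_model, clone_function=insert_dropout)
# Dropout adds no weights, so the flat weight lists still line up
new_model.set_weights(base_model.get_weights())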
I trained a Many-to-Many sequence model in Keras with return_sequences=True and a TimeDistributed wrapper on the last Dense layer:
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=50))
model.add(LSTM(100, return_sequences=True))
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))
# train...
model.save_weights("weights.h5")
So during training the loss is calculated over all hidden states (at every timestep). But for inference I only need the output at the last timestep, so I load the weights into a Many-to-One sequence model without the TimeDistributed wrapper and set return_sequences=False to get only the last output of the LSTM layer:
inference_model = Sequential()
inference_model.add(Embedding(input_dim=vocab_size, output_dim=50))
inference_model.add(LSTM(100, return_sequences=False))
inference_model.add(Dense(vocab_size, activation='softmax'))
inference_model.load_weights("weights.h5")
When I test my inference model on a sequence of length 20 I expect to get a prediction of shape (vocab_size), but inference_model.predict(...) still returns predictions for every timestep: a tensor of shape (20, vocab_size).
If, for whatever reason, you need only the last timestep during inference, you can build a new model which applies the trained model to the input and returns the last timestep as its output, using a Lambda layer:
from keras.models import Model
from keras.layers import Input, Lambda
inp = Input(shape=put_the_input_shape_here)
x = model(inp) # apply trained model on the input
out = Lambda(lambda x: x[:,-1])(x)
inference_model = Model(inp, out)
Side Note: As already stated in this answer, TimeDistributed(Dense(...)) and Dense(...) are equivalent, since the Dense layer is applied on the last dimension of its input tensor; that's why you get the same output shape.
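A quick way to verify that equivalence (the layer sizes here are arbitrary):

from keras.models import Sequential
from keras.layers import Dense, TimeDistributed

m1 = Sequential([Dense(8, input_shape=(20, 50))])
m2 = Sequential([TimeDistributed(Dense(8), input_shape=(20, 50))])
print(m1.output_shape)  # (None, 20, 8)
print(m2.output_shape)  # (None, 20, 8)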
I'm trying to do a simple binary classification problem using Keras and its pre-built ImageNet CNN architectures.
For VGG16, I took the following approach,
vgg16_model = keras.applications.vgg16.VGG16()
'''Rebuild the vgg16 using an empty sequential model'''
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
'''Since the problem is binary, I got rid of the output layer and added a more appropriate output layer.'''
model.pop()
'''Freeze other pre-trained weights'''
for layer in model.layers:
    layer.trainable = False
'''Add the modified final layer'''
model.add(Dense(2, activation = 'softmax'))
And this worked marvelously, with higher accuracy than my custom-built CNN. But it took a while to train, and I wanted to take a similar approach with Xception and InceptionV3, since they are lighter models with higher accuracy.
xception_model = keras.applications.xception.Xception()
model = Sequential()
for layer in xception_model.layers:
    model.add(layer)
When I run the above code, I get the following error:
ValueError: Input 0 is incompatible with layer conv2d_193: expected axis -1 of input shape to have value 64 but got shape (None, None, None, 128)
Basically, I would like to do the same thing as I did with the VGG16 model: keep the pretrained weights as they are and simply replace the 1000-way output layer with a binary classification output. I can see that unlike VGG16, which has a relatively straightforward convolution layer structure, Xception and InceptionV3 have some funky nodes that I'm not 100% familiar with, and I'm assuming those are causing issues.
Your code fails because InceptionV3 and Xception are not Sequential models (i.e., they contain "branches"). So you can't just add the layers into a Sequential container.
Now since the top layers of both InceptionV3 and Xception consist of a GlobalAveragePooling2D layer and the final Dense(1000) layer,
if include_top:
    x = GlobalAveragePooling2D(name='avg_pool')(x)
    x = Dense(classes, activation='softmax', name='predictions')(x)
if you want to remove the final dense layer, you can just set include_top=False plus pooling='avg' when creating these models.
base_model = InceptionV3(include_top=False, pooling='avg')
for layer in base_model.layers:
    layer.trainable = False
output = Dense(2, activation='softmax')(base_model.output)
model = Model(base_model.input, output)
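The same recipe should work for Xception, whose top likewise ends in global average pooling followed by the 1000-way Dense layer:

from keras.applications.xception import Xception
from keras.layers import Dense
from keras.models import Model

base_model = Xception(include_top=False, pooling='avg')
for layer in base_model.layers:
    layer.trainable = False
output = Dense(2, activation='softmax')(base_model.output)
model = Model(base_model.input, output)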
I want to use a pre-trained model to classify images. Every image can have multiple classes (in this case up to 21). Of course not all classes occur in every image, so I want to add a Masking layer to the model which masks all zeros (the value given to a class that doesn't occur).
Now I've never used a Masking layer before, but in most online examples it is added directly after the Input layer. That is not possible in the example below, however, because the Masking layer is not compatible with all layers (in this case the Conv2D layer).
It was possible to insert the Masking layer before the fully-connected output layer, but this did not really affect the loss of the model (the loss is around 20!).
Am I doing it right or is adding the Masking layer at the top of the model totally useless?
My Model
from keras.layers import Input, GlobalAveragePooling2D, Masking, Dense, Dropout
from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
#Define input tensor
input_tensor = Input(shape=(512, 512, 3))
# create the base pre-trained model
base_model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a Masking layer, a fully-connected layer and a dropout
x = Masking(0.)(x)
x = Dense(1024, activation='relu')(x)
x = Dropout(0.2)(x)
# and a logistic layer with 21 classes
predictions = Dense(21, activation='sigmoid')(x)
# create final model
model = Model(inputs=base_model.input, outputs=predictions)
# compile model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
Example labels
# input
np.array([[1,2,0,0,0,6,7,8,0,0,0,12,13,14,15,16,17,18,19,20,21],
          [0,2,0,4,5,6,7,8,9,10,11,0,0,14,15,16,17,18,19,20,21]])