I have a question regarding transfer learning. Let's suppose there is a pre-trained neural network model that takes an input of shape (250,7). I want to initialise my model with the weights of this pre-trained model and then train it on my dataset, so that the weights are updated according to my data. But my dataset has shape (251,8). Is there a way to initialise the weights from the pre-trained model even though my input shape is different? If so, how can I do that? Any insights will be appreciated.
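You could try adding another layer before the pre-trained model that maps your (251,8) input to the (250,7) shape the model expects. Just like a newly added last layer, this layer starts from fresh weights and updates them on your dataset, while the rest of the network is initialised from the pre-trained weights. This should work fine.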
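To make the shape arithmetic of such an adapter layer concrete, here is a minimal NumPy sketch. It assumes a purely linear adapter with two matrices, W_feat and W_time (both names are hypothetical); in a real framework these would be trainable parameters of a layer placed in front of the pre-trained model, and other adapters (cropping, a convolution, and so on) are equally possible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch from the new dataset: (batch, 251, 8)
x_new = rng.normal(size=(4, 251, 8))

# Adapter parameters (assumed linear; in practice they would be trained)
W_feat = rng.normal(scale=0.1, size=(8, 7))      # maps 8 features -> 7 features
W_time = rng.normal(scale=0.1, size=(250, 251))  # maps 251 steps -> 250 steps

def adapter(x):
    """Map (batch, 251, 8) inputs to the (batch, 250, 7) shape the old model expects."""
    x = x @ W_feat   # (batch, 251, 7)
    x = W_time @ x   # (batch, 250, 7); matmul broadcasts over the batch axis
    return x

x_adapted = adapter(x_new)
print(x_adapted.shape)
```

The adapted batch can then be passed to the pre-trained model unchanged, so all of its original weights remain usable.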
Related
I have a pretrained model trained in Keras.
I am trying to use that model in another task, but I don't need all the layers, only the first 4 conv layers.
I have the model saved in "keras_pretrained_model.h5"
Is it possible to initialize the first 4 conv layers of the new model using the weights of the first 4 conv layers of the pretrained model from the '.h5' file?
Is loading the whole pretrained model first always necessary?
The pretrained model takes up a lot of space, and I am not sure how to delete it after initializing the new model with its weights. As far as I understand, using tf.keras.backend.clear_session() will clear the newly created model along with the old one.
So, my questions are:
Is there any way to initialize the weights in the new model's layers without loading the whole pretrained model?
If I have to load the whole pretrained model, how can I delete only the pretrained model without harming the new model in any way?
I have thought of two approaches:
If the names of the first 4 layers of the new model are the same as those of the corresponding 4 layers in the pretrained model, then
new_model.load_weights(path_to_old_model_file, by_name=True)
If the names don't match, then we can do layer-wise weight initialization by taking the weights from the corresponding layers in the old model's h5 file and setting them with the set_weights() method.
I have written some code, which I have uploaded to GitHub, here.
I would be very grateful if anyone seeing this could give feedback on it!
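As a rough sketch of the second (layer-wise) approach, the snippet below copies the first 4 conv layers with get_weights()/set_weights() and then drops the old model. It follows the load-copy-delete route and does not avoid loading the old model. The two builder functions, the layer names, and the helper copy_first_convs are hypothetical stand-ins; in practice the old model would come from load_model("keras_pretrained_model.h5").

```python
import gc

from tensorflow import keras
from tensorflow.keras import layers


def build_old_model():
    # Stand-in for the large pretrained network (hypothetical architecture)
    return keras.Sequential([
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, padding="same", activation="relu", name="old_conv1"),
        layers.Conv2D(8, 3, padding="same", activation="relu", name="old_conv2"),
        layers.Conv2D(16, 3, padding="same", activation="relu", name="old_conv3"),
        layers.Conv2D(16, 3, padding="same", activation="relu", name="old_conv4"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(10, name="old_head"),
    ])


def build_new_model():
    # New task: same first 4 conv layers (different names), new head
    return keras.Sequential([
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, padding="same", activation="relu", name="new_conv1"),
        layers.Conv2D(8, 3, padding="same", activation="relu", name="new_conv2"),
        layers.Conv2D(16, 3, padding="same", activation="relu", name="new_conv3"),
        layers.Conv2D(16, 3, padding="same", activation="relu", name="new_conv4"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(2, name="new_head"),
    ])


def copy_first_convs(old_model, new_model, n=4):
    """Copy the weights of the first n Conv2D layers, matched by position."""
    old_convs = [l for l in old_model.layers if isinstance(l, layers.Conv2D)][:n]
    new_convs = [l for l in new_model.layers if isinstance(l, layers.Conv2D)][:n]
    for old_layer, new_layer in zip(old_convs, new_convs):
        # get_weights() returns NumPy arrays; set_weights() writes them into
        # the new layer's own variables, so the two models share nothing
        new_layer.set_weights(old_layer.get_weights())


old_model = build_old_model()
new_model = build_new_model()
copy_first_convs(old_model, new_model)

# Drop the only reference to the old model and let Python reclaim it;
# the new model keeps its own copies of the weights
del old_model
gc.collect()

print(new_model.get_layer("new_conv1").get_weights()[0].shape)
```

Because the copy goes through NumPy arrays, deleting the old model afterwards should leave the new model untouched, unlike clear_session(), which resets Keras' global state.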
I am new to semantic segmentation. I implemented the FCN network, and now, instead of training from scratch, I want to use the pre-trained VGG16 weights. I saw an implementation like this link, but I am not sure where the new dataset input enters the network.
To be more clear, in the above link, the VGG part returns the image input of the trained network and the outputs of layers 3, 4, and 7.
image_input, pool3_out, pool4_out, fc7_out = self._load_vgg16()
I am not sure where the new batch of data gets into the model. I appreciate your guidance.
Considering the example of image classification on ImageNet, how can a pre-trained model be updated using new data points?
I have loaded the pre-trained model. I have a new data point that is quite different from the distribution of the original data on which the model was trained, so I would like to update/fine-tune the model with this new data point. How should I go about it? Can anyone help me out? I am using PyTorch 0.4.0 for the implementation, running on a Tesla K40C GPU.
If you don't want to change the output of the classifier (i.e. the number of classes), then you can simply continue training the model with new example images, assuming that they are reshaped to the same shape that the pretrained model accepts.
On the other hand, if you want to change the number of classes in a pre-trained model, then you can replace the last fully connected layer with a new one and train only this layer on the new samples. Here is sample code for this case from PyTorch's autograd mechanics notes:
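import torch.nn as nn
import torch.optim as optim
import torchvision

# Load a ResNet-18 pre-trained on ImageNet
# (newer torchvision releases use the weights= argument instead of pretrained=True)
model = torchvision.models.resnet18(pretrained=True)
# Freeze all pre-trained parameters
for param in model.parameters():
    param.requires_grad = False
# Replace the last fully-connected layer
# Parameters of newly constructed modules have requires_grad=True by default
model.fc = nn.Linear(512, 100)
# Optimize only the classifier
optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)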
I have trained a neural network classifier using Keras, and it is currently predicting with 90% accuracy. I would like to make predictions in another file without having to include the entire model. Is it possible to make the predictions using just the output weight matrix? I.e., mathematically, how can I make the same classification using the weight matrix, without having to re-train the model every time I get a new data input that needs a classification prediction?
Save either your whole model or only the weights of your model:
model.save(model_filename)            # architecture + weights
model.save_weights(weights_filename)  # weights only; recent Keras versions expect a name ending in .weights.h5
For loading, in the first case:
from keras.models import load_model
model = load_model(model_filename)
In the second case:
# Recreate the model with the same architecture, then:
model.load_weights(weights_filename)
Then:
results = model.predict(batch_of_data)
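On the "mathematically" part of the question: for a plain stack of Dense layers, a prediction is just repeated matrix multiplication plus bias, followed by the activation. The NumPy sketch below assumes ReLU hidden layers, a softmax output, and weights ordered as [W1, b1, W2, b2, ...], which is how get_weights() orders them for stacked Dense layers. The activations and the toy weights are assumptions; a real model's architecture has to be reproduced exactly, and all layer weights are needed, not only the output matrix.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row-wise max for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def predict_from_weights(weights, x):
    """Forward pass for stacked Dense layers given [W1, b1, W2, b2, ...]."""
    pairs = list(zip(weights[0::2], weights[1::2]))
    h = x
    for kernel, bias in pairs[:-1]:
        h = relu(h @ kernel + bias)      # hidden layers (assumed ReLU)
    kernel, bias = pairs[-1]
    return softmax(h @ kernel + bias)    # output layer (assumed softmax)


# Toy weights standing in for model.get_weights(): 4 inputs -> 5 hidden -> 3 classes
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 5)), np.zeros(5),
           rng.normal(size=(5, 3)), np.zeros(3)]

batch_of_data = rng.normal(size=(2, 4))
probs = predict_from_weights(weights, batch_of_data)
classes = probs.argmax(axis=1)
print(probs.shape, classes)
```

No re-training is involved in either route: once the weights are saved, each new input only needs this forward pass.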
I am working on a project that requires me to add new units to the output layer of a neural network to implement a form of transfer learning. I was wondering if I could do this and set the units' weights using either Keras or TensorFlow.
Specifically I would like to append an output neuron to the output layer of the Keras model and set that neuron's initial weights and bias.
Stumbled upon the answer to my own question. Thanks everyone for the answers/comments.
https://keras.io/layers/about-keras-layers/
The first few lines of this source detail how to load and set weights.
Essentially, appending an output neuron to a Keras model can be accomplished by loading the old output layer, appending the new weights, and setting weights for a new layer. Code is below.
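import numpy as np
from keras.layers import Dense
from keras.models import Model

# Load weights of previous output layer: kernel (bottleneck_size, n_old), bias (n_old,)
old_kernel, old_bias = model.layers[-1].get_weights()
# Initial weights and bias for the one new neuron
new_kernel = np.random.normal(scale=0.01, size=(bottleneck_size, 1))
new_bias = np.zeros(1)
# Build the wider output layer (num_classes = n_old + 1) on the penultimate layer
new_layer = Dense(num_classes, activation=model.layers[-1].activation)
new_model = Model(model.input, new_layer(model.layers[-2].output))
# Append the new weights as an extra column, then set them on the new layer
new_layer.set_weights([np.concatenate([old_kernel, new_kernel], axis=1),
                       np.concatenate([old_bias, new_bias])])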
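The weight bookkeeping behind this can be checked in isolation with NumPy. The sketch below uses toy sizes and random values (all assumptions) to show that appending the new neuron as an extra kernel column and an extra bias entry leaves the logits of the existing classes unchanged. Note that np.append without an axis argument flattens its inputs, so np.concatenate with an explicit axis is used instead.

```python
import numpy as np

rng = np.random.default_rng(0)
bottleneck_size, n_old = 6, 3  # toy sizes, chosen only for illustration

# Old output layer: kernel (bottleneck_size, n_old), bias (n_old,)
old_kernel = rng.normal(size=(bottleneck_size, n_old))
old_bias = rng.normal(size=n_old)

# Initial weights and bias for the single new output neuron
new_kernel = rng.normal(scale=0.01, size=(bottleneck_size, 1))
new_bias = np.zeros(1)

# Widen the layer: one extra column in the kernel, one extra bias entry
wide_kernel = np.concatenate([old_kernel, new_kernel], axis=1)
wide_bias = np.concatenate([old_bias, new_bias])

# Compare pre-activation outputs (logits) on some bottleneck features
features = rng.normal(size=(5, bottleneck_size))
old_logits = features @ old_kernel + old_bias
wide_logits = features @ wide_kernel + wide_bias

print(wide_kernel.shape, wide_bias.shape)
print(np.allclose(wide_logits[:, :n_old], old_logits))
```

With a softmax output, the class probabilities still shift slightly because the new logit enters the normalisation, so some fine-tuning after widening is usually expected.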
You can add new units to the output layer of a pre-trained neural network. A closely related form of transfer learning is known as using the bottleneck features of a pre-trained network. It can be implemented in both TensorFlow and Keras.
Please find the tutorial in Keras below:
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
Also, find the tutorial for tensorflow below:
https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/08_Transfer_Learning.ipynb
Hope this helps!