I want to use a set of pre-trained weights to train my model for MNIST classification. More specifically, I first train my model on one dataset and then want to use the final weights as the starting weights for training the model on a different dataset. To do this, I use:
initial_weights = model1.get_weights()
model2 = create_model()
model2.set_weights(initial_weights)
model2.fit(x=x_train59, y=y_train59, epochs=20, callbacks=[cp_callback2])
My question is whether model2.fit() will ignore the initial weights set using model2.set_weights(). And if it does ignore them, is there a way to make sure that model2.fit() uses the weights obtained previously? Also, is there a way to visualize the starting weights before model2.fit() starts training? Thanks much in advance!
When you call model2.set_weights(initial_weights), you change the weights of model2; that's all. model2.fit() will not ignore them: training simply starts from whatever weights the model currently holds.
You can see the weights the same way: w2 = model2.get_weights(). Then print w2 in a convenient way (it is a list of NumPy arrays, one per weight tensor).
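A minimal sketch of how you might inspect the starting weights and confirm that training actually continued from them (model1, model2, create_model, cp_callback2, x_train59 and y_train59 are taken from the question and assumed to exist):

import numpy as np

initial_weights = model1.get_weights()           # list of NumPy arrays
model2 = create_model()
model2.set_weights(initial_weights)

# Visualize the starting weights before training
for i, w in enumerate(model2.get_weights()):
    print(f"weight tensor {i}: shape={w.shape}, mean={w.mean():.4f}")

model2.fit(x=x_train59, y=y_train59, epochs=20, callbacks=[cp_callback2])

# After training, the weights should have moved away from the starting point
changed = sum(not np.allclose(w0, w1)
              for w0, w1 in zip(initial_weights, model2.get_weights()))
print("weight tensors changed during training:", changed)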
I am trying to implement a model that uses encodings from multiple pre-trained BERT models trained on different datasets and gets a combined representation using a fully-connected layer. In this setup, I want the BERT models to remain fixed and only the fully-connected layers to be trained. Is it possible to achieve this in huggingface-transformers? I don't see any flag which allows me to do that.
PS: I don't want to go down the route of dumping the encodings of the inputs for each BERT model and using them as inputs.
A simple solution is to exclude the parameters belonging to the BERT models when building the optimizer.
# model is assumed to be the full network containing the BERT encoders and the fully-connected layers
param_optimizer = list(model.named_parameters())
params = [p for n, p in param_optimizer if 'bert' not in n]
optimizer = AdamW(params, lr=lr)
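Alternatively, you can freeze the BERT encoders explicitly so that no gradients are computed for them at all. A minimal sketch, assuming a single huggingface BertModel and a hypothetical linear head with two classes:

import torch
import torch.nn as nn
from transformers import BertModel

bert = BertModel.from_pretrained('bert-base-uncased')
for p in bert.parameters():                      # freeze the pre-trained encoder
    p.requires_grad = False

head = nn.Linear(bert.config.hidden_size, 2)     # hypothetical fully-connected head

# Only the head's parameters are handed to the optimizer
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)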
I have a sequential model with a custom loss function for training. For prediction and validation, however, I want to remove one layer. Is there any way to do this? The easiest thing I could think of would be a custom metric that can read the output of a previous layer without access to the input. Alternatively, I could run prediction and validation on a separate model, but I worry about constructing a separate model because I want the weights to be saved. Any suggestions? I have spent a lot of time on this, and everything I try runs into scope issues. I took a look at this: Keras, How to get the output of each layer?, but every answer I see requires me to know the inputs.
You can create separate models. Each model will need to be compiled. My solution was of this form...
from keras.layers import Input, Conv2D
from keras.models import Model

inputs = Input(input_shape)
model = Conv2D(32, [3,3])(inputs)
# pass the model through some layers
# finish the model
model = Model(inputs=inputs, outputs=model)

input_2 = Input(input_shape)
second_model = model(input_2)
# pass the second model through some layers
second_model = Model(inputs=input_2, outputs=second_model)

model.compile(...)
second_model.compile(...)
Now any training done on second_model also updates the weights of model, because model is embedded in second_model as a layer. This lets you train with second_model and predict with model.
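A short usage sketch under the same assumptions (x_train, y_train and x_new are hypothetical arrays):

# Train through the wrapper model (extra layers included)
second_model.fit(x_train, y_train, epochs=10)

# Predict with the inner model (extra layers excluded); the weights are shared
predictions = model.predict(x_new)

# Saving the inner model also saves the shared weights
model.save('model_without_extra_layers.h5')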
Is it possible in Keras for the training of some of the outputs in a multi-output model to start at different epochs? For example, one of the outputs takes some other outputs as its input, but at the beginning those outputs are quite premature, and this puts a huge computational burden on the model. The output whose training I would like to postpone is a custom layer that applies some image-processing operations to its input, which is an image generated by another output; while the generated image is still quite meaningless, I think it is just a waste of time to apply this custom layer during the first epochs. Is there a way to do that? Just as we have weights over each output's loss, do we have a different starting point for calculating each output's loss?
1. Build a model that does not contain the later output.
2. Train that model to the degree you want.
3. Build a new model that incorporates the old model into it.
4. Compile the new model with the new loss functions you want.
5. Train that model.
To elaborate on step 3: Keras models can be used like layers in Keras' functional API.
You can build a normal model like so:
from keras.layers import Input, Dense
from keras.models import Model

input = Input((100,))
x = Dense(50)(input)
x = Dense(1, activation='sigmoid')(x)
model = Model(input, x)
However, if you have another standard Keras model, it can be used just like any other layer. For example, if we have a model (created with Sequential(), Model(), or keras.models.load_model()) called model1, we can put it in like this:
input = Input((100,))
x = model1(input)
x = Dense(1, activation='sigmoid')(x)
model = Model(input, x)
This is equivalent to adding each layer of model1 individually.
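Putting steps 3 and 4 together for the multi-output case in the question, here is a rough sketch. It assumes model1 is the model pre-trained in step 2, custom_layer is the postponed image-processing output, and custom_loss, x_train, y_train_1 and y_train_2 are placeholders for your own objects:

from keras.layers import Input
from keras.models import Model

inp = Input((100,))
first_output = model1(inp)                    # pre-trained part, already meaningful
second_output = custom_layer(first_output)    # the output whose training was postponed

new_model = Model(inp, [first_output, second_output])
new_model.compile(optimizer='adam',
                  loss=['mse', custom_loss],
                  loss_weights=[1.0, 1.0])
new_model.fit(x_train, [y_train_1, y_train_2], epochs=10)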
Considering the example of image classification on ImageNet, how do I update a pre-trained model using new data points?
I have loaded the pre-trained model. I have new data points that are quite different from the distribution of the original data on which the model was previously trained, so I would like to update/fine-tune the model with the help of these new data points. How do I go about doing it? Can anyone help me out? I am using PyTorch 0.4.0 for the implementation, running on a Tesla K40C GPU.
If you don't want to change the output of the classifier (i.e. the number of classes), then you can simply continue training the model with new example images, assuming that they are reshaped to the same shape that the pretrained model accepts.
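A minimal sketch of what that continued training could look like (new_loader is a hypothetical DataLoader over the new images, resized to the input size the pretrained network expects):

import torch.nn as nn
import torch.optim as optim
import torchvision

model = torchvision.models.resnet18(pretrained=True)
model.train()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

for images, labels in new_loader:        # hypothetical DataLoader over the new data
    optimizer.zero_grad()
    outputs = model(images)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()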
On the other hand, if you want to change the number of classes in a pre-trained model, then you can replace the last fully-connected layer with a new one and train only this specific layer on the new samples. Here's sample code for this case from PyTorch's autograd mechanics notes:
import torch.nn as nn
import torch.optim as optim
import torchvision

model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# Replace the last fully-connected layer
# Parameters of newly constructed modules have requires_grad=True by default
model.fc = nn.Linear(512, 100)

# Optimize only the classifier
optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)
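As a quick sanity check (not part of the original notes), you can confirm that only the new classifier is trainable by counting parameters:

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} / {total}")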
I have trained a neural network classifier using Keras, and it currently predicts with 90% accuracy. I would like to make predictions in another file without having to include the entire model. Is it possible to make the predictions using just the output weight matrix? I.e., mathematically, how can I make the same classification using the weight matrix without having to re-train the model every time I get a new data input for which I need a classification prediction?
Save your model or the weights of your model:
import numpy as np

model.save(filename)                       # saves architecture + weights
np.save(filename, model.get_weights())     # saves only the weights (a list of NumPy arrays)
For loading, in the first case:
from keras.models import load_model
model = load_model(filename)
In the second case:
# recreate the model with the same architecture, then:
model.set_weights(np.load(filename, allow_pickle=True))
Then:
results = model.predict(batch_of_data)
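If you really do want to apply the weight matrices by hand instead of calling the model, the forward pass of a simple dense network can be reproduced with NumPy. A rough sketch, assuming a single hidden layer with relu and a sigmoid output, which may not match your actual architecture:

import numpy as np

W1, b1, W2, b2 = model.get_weights()       # assumes exactly two Dense layers

def predict_manually(x):
    h = np.maximum(0, x @ W1 + b1)         # hidden layer with relu activation
    logits = h @ W2 + b2
    return 1 / (1 + np.exp(-logits))       # sigmoid output

# should roughly match model.predict(batch_of_data) if the architecture assumption holds
manual_results = predict_manually(batch_of_data)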