Is it possible to fetch the weights of the previous layer, modify them, and then set them on the next layer? I want to introduce a custom layer in the network which modifies the weights (as per the desired logic) and then sets the modified weight values on the next layer, similar to what is depicted in the figure below:
I am not sure if this is possible or not. I know that we can dump a snapshot and then use it to set new weights. I can also convert the weights using the snapshots. But I don't know how to do this within the network itself (without taking or using any snapshot).
Thanks
KK
tl;dr: Load one model (without compiling) and use its weights to initialize a new model; create new weights only for the layers you want to change.
Full version:
As per this thread, and as explained by fchollet himself, the canonical way to do this is to load your weights into the previous Keras model (you don't need to compile it, so it's instant) and use that model as a queryable data structure to access the weights.
For a sequential model you can do it like this:
weights = model.layers[5].get_weights()  # fetch the layer's weights as a list of numpy arrays
model.layers[5].set_weights(weights)     # write the (possibly modified) weights back
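To modify the weights before writing them back, or to push them into a different layer (as in the question), the same pattern applies. A minimal sketch, assuming layers 5 and 6 have compatible weight shapes, and with my_custom_logic standing in for your transformation:

weights = model.layers[5].get_weights()           # list of numpy arrays, e.g. [kernel, bias]
weights = [my_custom_logic(w) for w in weights]   # apply the desired logic to each array
model.layers[6].set_weights(weights)              # set the modified weights on the next layer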
See also: another discussion on this topic with fchollet.
Related
What I want to test is the difference in the output before and after I modify the last hidden layer's values.
So I can divide my question into two parts.
First:
last_hidden_layer_values=[1,2,3,4,5]
And I want to change it to:
my_hidden_layer_values=[1,1,1,1,1]
Second:
After modifying last_hidden_layer_values into my_hidden_layer_values, how can I get the model's prediction?
I'm using TensorFlow to evaluate this, but I am new to it. I have searched this a little bit. Is tf.train.Checkpoint the answer? It seems like it is only for training, though.
Let's say you have two different models: one with the original hidden-layer values, and one with the altered hidden-layer values. If you want to compare the predictions of these two models on some input, you can use the predict method here.
i.e. something like
model_1.predict(input_x)
# and
model_2.predict(input_x)
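For instance, here is a minimal sketch of the comparison mechanics, assuming a trained Keras model_1; the layer index -2 for the last hidden layer and the choice of overwriting its bias with ones are assumptions for illustration:

import numpy as np
from keras.models import clone_model

model_2 = clone_model(model_1)               # same architecture, fresh weights
model_2.set_weights(model_1.get_weights())   # copy the trained weights over

w = model_2.layers[-2].get_weights()         # [kernel, bias] of the last hidden layer
w[1] = np.ones_like(w[1])                    # alter the layer, e.g. set the bias to ones
model_2.layers[-2].set_weights(w)

print(model_1.predict(input_x))              # prediction before the change
print(model_2.predict(input_x))              # prediction after the change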
Hope this helps!
We now have a trained network for a classification task. The top of the network looks like this:
so the layer relu_fc1 holds something like extracted features, followed by a softmax for the class prediction.
Now we want to extract these features directly. In the normal case, we can do it like this:
y = sess.graph.get_tensor_by_name('relu_fc1:0')  # look up the tensor by name
sess.run(y, ...)                                 # evaluate it on some input
That's great, but we still want to make it faster, so we use TensorRT to convert the saved model. However, after the conversion, we can't get the right tensor for relu_fc1 because TensorRT fused the operations and produced something like TRTEngineOp_1.
I want to know: is there a way to get an intermediate layer's output after TensorRT? I guess it might be easier to delete the last layers in the network and then do the conversion, but I can't find practical material on removing layers in TensorFlow.
For this question: when you do the TF-to-ONNX conversion, you can specify which layer should be the final output of the ONNX model. Then you can do the ONNX-to-TensorRT conversion.
For more details, see tensorflow-onnx. The --outputs parameter is what you want.
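As a hedged illustration of that CLI (the frozen-graph file name and the input tensor name are assumptions; relu_fc1:0 comes from the question):

python -m tf2onnx.convert --graphdef frozen.pb --inputs input:0 --outputs relu_fc1:0 --output model.onnx

The resulting ONNX model then ends at relu_fc1, and can be handed to the ONNX-to-TensorRT converter (for example, trtexec --onnx=model.onnx).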
I have been using tf.reset_default_graph() to test out several graphs/models. However, now I would like to train the same model/graph on different datasets. Therefore, I would like to reset the weights of the created model without having to delete and re-create the whole model. Note that I am going to train the model on 20 different datasets, so resetting only the weights may be a simpler operation than deleting the existing model and creating a new one for each dataset. Please correct me if I'm wrong.
Based on this GitHub issue, there is no single function to do it, but the offered workaround is to save your initial weights and then use them to re-initialize the model later (rather than randomly re-initializing it each time).
Initially,
Winit = model.get_weights()  # do this once, before training
Later,
model.set_weights(Winit)  # call each time you want to reset the weights
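Putting it together for the 20-dataset case, a minimal sketch (build_model and datasets are placeholders for your own model-construction code and data):

model = build_model()          # build the graph once
Winit = model.get_weights()    # snapshot the freshly initialized weights

for x, y in datasets:
    model.set_weights(Winit)   # reset to the same starting point
    model.fit(x, y)            # train on this dataset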
I'm trying to load three different models in the same process. Only the first one works as expected; the rest of them return what look like random results.
Basically the order is as follows:
define and compile first model
load previously trained weights
rename layers
the same process for the second model
the same process for the third model
So, something like:
model1 = Model(inputs=Input(shape=input_size_im), outputs=layers_firstmodel)
model1.compile(optimizer='sgd', loss='mse')
model1.load_weights(weights_first, by_name=True)
# rename layers but didn't work
model2 = Model(inputs=Input(shape=input_size_im), outputs=layers_secondmodel)
model2.compile(optimizer='sgd', loss='mse')
model2.load_weights(weights_second, by_name=True)
# rename layers but didn't work
model3 = Model(inputs=Input(shape=input_size_im), outputs=layers_thirdmodel)
model3.compile(optimizer='sgd', loss='mse')
model3.load_weights(weights_third, by_name=True)
# rename layers but didn't work
for im in list_images:
    results_firstmodel = model1.predict(im)
    results_secondmodel = model2.predict(im)
    results_thirdmodel = model3.predict(im)
I'd like to perform inference over a bunch of images. The idea is to loop over the images, run inference with these three models, and return the results.
I have tried renaming all layers to make them unique, with no success. I also created a different graph for each network and ran inference with a separate session for each. This works, but it's very inefficient (in addition, I have to set the weights every time, because sess.run(tf.global_variables_initializer()) removes them). Each time a session is created, TensorFlow prints "creating tensorflow device (/device:GPU:0)".
I am running Tensorflow 1.4.0-rc0, Keras 2.1.1 and Ubuntu 16.04 kernel 4.14.
The OP is correct here. There is a serious bug when you try to load multiple weight files in the same script; the above answer doesn't solve this. If you actually interrogate the weights when loading weights for multiple models in the same script, you will notice that they differ from when you load the weights for one model on its own. This is where the randomness the OP observes comes from.
EDIT: To solve this problem, encapsulate the model.load_weights call within a function, and the randomness you are experiencing should go away. Something goes wrong when you have multiple load_weights calls at the top level of the same script, as above. If you load the model weights inside a function, your issues should go away.
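A minimal sketch of that workaround (the build_*_model functions are hypothetical stand-ins for whatever constructs each architecture):

def build_and_load(build_fn, weights_path):
    model = build_fn()                         # construct the architecture
    model.compile(optimizer='sgd', loss='mse')
    model.load_weights(weights_path)           # load inside the function scope
    return model

model1 = build_and_load(build_first_model, weights_first)
model2 = build_and_load(build_second_model, weights_second)
model3 = build_and_load(build_third_model, weights_third)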
From the Keras docs we have this explanation for the use of load_weights:
loads the weights of the model from a HDF5 file (created by save_weights). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use by_name=True to load only those layers with the same name.
Therefore, if your architecture is unchanged, you should drop by_name=True or make it False (its default value). This could be causing the inconsistencies you are facing, as your weights are probably not being loaded because your layers have different names.
Another important thing to consider is the nature of your HDF5 file and the way you created it. If it indeed contains only the weights (created with save_weights, as the docs point out), then there should be no problem proceeding as explained before.
Now, if that HDF5 file contains both weights and architecture in the same file, then you should be loading it with keras.models.load_model instead (further reading, if you like, here). If this is the case, that would also explain the inconsistencies.
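In short, the two loading paths look like this (the file names are hypothetical, and build_architecture stands in for your model-construction code):

from keras.models import load_model

# Weights-only file (created with model.save_weights):
# build the same architecture first, then load the weights into it.
model = build_architecture()
model.load_weights('weights_only.h5')

# Full file (created with model.save): architecture + weights in one call.
model = load_model('full_model.h5')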
As a side suggestion, I prefer to save my models using callbacks, such as ModelCheckpoint, or EarlyStopping if you want training to stop automatically at the right time. This not only gives you greater flexibility when training and saving your models (as you can stop at the optimal training epoch or whenever you desire), but also makes loading them easier, as you can simply use load_model to restore both architecture and weights into your desired variable.
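For example, a typical callback setup might look like this (the monitored metric, patience, and file name are assumptions):

from keras.callbacks import ModelCheckpoint, EarlyStopping

callbacks = [
    ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True),
    EarlyStopping(monitor='val_loss', patience=5),
]
model.fit(x_train, y_train, validation_split=0.1, epochs=100, callbacks=callbacks)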
Finally, here is one useful SO post where saving (and loading) Keras models is explained.
I was checking the Caffe LeNet Tutorial here and a question came to mind:
What's the difference between these two pieces of code:
self.solver.step(1)
and
self.solver.net.forward() # train net
They both seem to train the network at least according to the comment.
Personally, I think the first one trains the network on the training data and updates the weights of both net and test_net, but the second one seems to only forward a batch of data and apply the weights learned in the previous step.
If what I think is right, then what is the purpose of the second snippet in the tutorial? Why does the code do a net.forward? Can't solver.step(1) do this by itself?
Thanks for your time
step does one full iteration, covering all three phases: forward evaluation, backward propagation, and the parameter update. The call to forward performs only the first of these. There are also differences in the signatures (parameter lists).
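Roughly, in pycaffe terms (a sketch; the parameter update has no standalone Python call and is applied internally by the solver):

# What solver.step(1) does, approximately:
solver.net.forward()    # 1. forward pass: compute outputs and loss
solver.net.backward()   # 2. backward pass: compute gradients
                        # 3. weight update, applied internally by the solver
# solver.net.forward() on its own performs only step 1.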
I discovered a strange behavior with solver.step(1) and solver.net.forward(). When I used a custom layer for the input network, my layer instance needed a variable to be set before use:
solver.net.layers[0].mySet(variable)
That variable was stored as a local attribute of my layer. But when I called solver.step, the variable was gone; it does persist, however, when I use solver.net.forward(). I am not certain, but maybe solver.step instantiates a new copy of the layer.