I would like to retrain the Keras InceptionV3 model from scratch.
The model is defined here:
https://github.com/keras-team/keras-applications/blob/master/keras_applications/inception_v3.py
I have read several posts. The solutions they list are:
Freeze the layers (this is not what I want):
for layer in model.layers:
    layer.trainable = False
https://stackoverflow.com/a/51727616/7748163
Reset all layers by checking for initializers:
def reset_weights(model):
    session = K.get_session()
    for layer in model.layers:
        if hasattr(layer, 'kernel_initializer'):
            layer.kernel_initializer.run(session=session)
        if hasattr(layer, 'bias_initializer'):
            layer.bias_initializer.run(session=session)
Use tf.variables_initializer
model = InceptionV3()
for layer in model.layers:
    sess.run(tf.variables_initializer(layer.weights))
Reference: https://stackoverflow.com/a/56634827/7748163
This one looks like the best to me, but it raises an error:
sess = tf.Session()
for layer in model.layers:
    for v in layer.__dict__:
        v_arg = getattr(layer, v)
        if hasattr(v_arg, 'initializer'):
            initializer_method = getattr(v_arg, 'initializer')
            initializer_method.run(session=sess)
            print('reinitializing layer {}.{}'.format(layer.name, v))
However, none of them work for InceptionV3.
The error is raised for a BatchNorm layer:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable batch_normalization_9/moving_mean from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/batch_normalization_9/moving_mean/N10tensorflow3VarE does not exist.
[[{{node batch_normalization_9_1/AssignMovingAvg/ReadVariableOp}}]]
[[metrics_1/categorical_accuracy/Identity/_469]]
So how can I retrain an existing Keras model and re-initialize its variables? What is the best practice for retraining a model from Keras applications?
Further discussion:
https://github.com/keras-team/keras/issues/341
Why not simply skip loading the pretrained weights?
model = InceptionV3(..., weights=None, ...)
https://github.com/keras-team/keras-applications/blob/master/keras_applications/inception_v3.py/#L100
I have tried many tools such as onnx2keras and pytorch2keras, but something always goes wrong.
My model is not very complicated: just a ResNet18 encoder plus a decoder with some skip connections. I'm considering simply transferring the weights layer by layer from PyTorch to Keras.
Before I try, I'd like to ask whether anyone has similar experience. I know there is a set_weights method, but that is for Keras-to-Keras transfer, so nothing special. Keras models are object-based, so how can I assign name-based weights, e.g. 'encoder.bn1.bias', 'encoder.bn1.running_mean', 'encoder.bn1.running_var', to a BN layer? I don't want TF1.x solutions because all of my work is on TF2.x.
In my opinion, it would be something like:
# 1. Save weights and names from the PyTorch model
weights_dict = torch_model.state_dict()
# 2. Construct the Keras model
keras_model = tf.keras.models.Model(...)
# 3. Now load weights for each layer in the Keras model
for var_name, weight in weights_dict.items():
    # Assign the conv layer its weight from 'encoder.conv1.weight'
    # Assign the BN layer 'encoder.bn1.weight', 'encoder.bn1.bias', 'encoder.bn1.running_mean', 'encoder.bn1.running_var', 'encoder.bn1.num_batches_tracked'
    pass  # this is the part I don't know how to write
But I don't know how... Look forward to your opinions!
Could you try pt2keras and see if it works?
link: https://github.com/JWLee89/pt2keras/
To install pt2keras, type the following in the terminal:
pip install -U pt2keras
Below is a simple example for converting resnet18.
import tensorflow as tf
from torchvision.models.resnet import resnet18
from pt2keras import Pt2Keras

if __name__ == '__main__':
    input_shape = (1, 3, 224, 224)
    # Grab model
    model = resnet18(pretrained=False).eval()
    # Create pt2keras object
    converter = Pt2Keras()
    # Convert model
    keras_model: tf.keras.Model = converter.convert(model, input_shape, strict=True)
    # Save the model
    keras_model.save('output_model.h5')
    # Do whatever else that you want afterwards ...
I have attached the converted keras model visualized using netron:
Before I try I'd like to ask if you have similar experience? I know there's set_weights method, but that's for keras-to-keras so nothing special. However, Keras is object-based model, so how can I assign name-based weights, e.g. 'encoder.bn1.bias', 'encoder.bn1.running_mean', 'encoder.bn1.running_var' to a BN? I don't want TF1.x solutions because all of my work is on TF2.x.
Unfortunately, as far as I know, you cannot attach name-based weights to individual parameters in Keras the way you can in PyTorch, since Keras is layer-based. You can, however, name the batch-norm layer itself, though I am guessing that is not very useful to you.
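Even without per-parameter names, the layer-by-layer transfer outlined in the question can go through set_weights once the arrays are rearranged. Below is a minimal sketch, assuming the state_dict values have already been converted to NumPy arrays, that PyTorch stores Conv2d kernels as (out, in, kh, kw) while Keras Conv2D expects (kh, kw, in, out), and that the Keras BatchNormalization layer uses the default scale=True and center=True. The helper names are hypothetical, and random arrays stand in for real weights.

```python
import numpy as np

def conv_kernel_to_keras(w):
    """PyTorch Conv2d weight (out, in, kh, kw) -> Keras Conv2D kernel (kh, kw, in, out)."""
    return np.transpose(w, (2, 3, 1, 0))

def bn_weights_to_keras(state, prefix):
    """Collect BN arrays in the order BatchNormalization.set_weights expects:
    gamma, beta, moving_mean, moving_variance (assumes scale=True, center=True).
    num_batches_tracked has no Keras counterpart and is skipped."""
    return [
        state[prefix + ".weight"],
        state[prefix + ".bias"],
        state[prefix + ".running_mean"],
        state[prefix + ".running_var"],
    ]

# Stand-in for {k: v.numpy() for k, v in torch_model.state_dict().items()}
rng = np.random.default_rng(0)
state = {
    "encoder.conv1.weight": rng.normal(size=(64, 3, 7, 7)),
    "encoder.bn1.weight": np.ones(64),
    "encoder.bn1.bias": np.zeros(64),
    "encoder.bn1.running_mean": rng.normal(size=64),
    "encoder.bn1.running_var": np.ones(64),
    "encoder.bn1.num_batches_tracked": np.array(0),
}

conv_kernel = conv_kernel_to_keras(state["encoder.conv1.weight"])
bn_weights = bn_weights_to_keras(state, "encoder.bn1")
print(conv_kernel.shape)  # (7, 7, 3, 64)
```

The results would then be handed to the matching Keras layers, e.g. conv_layer.set_weights([conv_kernel]) for a bias-free convolution and bn_layer.set_weights(bn_weights). Note that the default epsilon differs between the two frameworks (1e-5 in PyTorch, 1e-3 in Keras), so it should be set explicitly on the Keras side if the outputs need to match.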
I would like to create a model consisting of 2 convolutional, one flatten, and one dense layer in Keras. This would be a model with shared weights, so without any predefined input layer.
This is possible with the Sequential API:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu))
model.add(tf.keras.layers.Conv2D(20,3,2,'valid',activation=tf.nn.relu))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(200,activation=tf.nn.relu))
However, using the Functional API produces a TypeError:
model2 = tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu)
model2 = tf.keras.layers.Conv2D(20,3,2,'valid',activation=tf.nn.relu)(model2)
model2 = tf.keras.layers.Flatten()(model2)
model2 = tf.keras.layers.Dense(200,activation=tf.nn.relu)(model2)
Error :
TypeError: Inputs to a layer should be tensors. Got: <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7fb060598100>
Is it impossible to do this way, or am I missing something?
The Keras Sequential API is designed to be easier to use, and as a result it is less flexible than the Functional API. The benefit is that the input shape can be inferred automatically from the data you pass to the model. The downside is that this simpler model can't do things like take multiple inputs.
From the keras docs:
A Sequential model is not appropriate when:
Your model has multiple inputs or multiple outputs
Any of your layers has multiple inputs or multiple outputs
You need to do layer sharing
You want non-linear topology (e.g. a residual connection, a multi-branch model)
The Functional API is designed to be more flexible (for example, it supports multiple inputs), so it doesn't make any automatic inference for you, hence the error. You must explicitly pass an input layer in this case. For your use case it might seem odd that the shape isn't inferred automatically, but it makes sense once you consider the wider range of use cases.
So the second scenario should be:
model2 = tf.keras.layers.Input((64, 64, 3))  # explicit input: example (height, width, channels), large enough for both 'valid' convolutions
model2 = tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu)(model2)
model2 = tf.keras.layers.Conv2D(20,3,2,'valid',activation=tf.nn.relu)(model2)
model2 = tf.keras.layers.Flatten()(model2)
model2 = tf.keras.layers.Dense(200,activation=tf.nn.relu)(model2)
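The input shape has to be chosen with the convolutions in mind: with 'valid' padding, each spatial dimension shrinks to floor((n - kernel) / stride) + 1. The sketch below is plain Python shape arithmetic, not Keras code. It traces two kernel-3, stride-2 convolutions over a 10x3 input and over a 64x64 input; both sizes and the helper names are illustrative assumptions.

```python
def conv_out(size, kernel, stride):
    """Output length of one spatial dimension for a 'valid' convolution."""
    return (size - kernel) // stride + 1

def trace(height, width, convs):
    """Apply each (kernel, stride) pair in turn and return every (h, w) shape."""
    shapes = [(height, width)]
    for kernel, stride in convs:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        shapes.append((height, width))
    return shapes

convs = [(3, 2), (3, 2)]  # the two Conv2D layers: kernel 3, stride 2

small = trace(10, 3, convs)
large = trace(64, 64, convs)
print(small)  # [(10, 3), (4, 1), (1, 0)]
print(large)  # [(64, 64), (31, 31), (15, 15)]
```

A width of 0 after the second layer means the 10x3 input is exhausted by the first convolution, so a model built on it would fail at the second Conv2D; the 64x64 input survives both.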
Update
If you want to create two separate models and join them together, you should use the Functional API, and because of its constraints you must then use input layers. You could do something like:
import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense, concatenate, Conv2D
from tensorflow.keras.models import Model
input1 = Input((10,3,2))
model1 = Dense(200,activation=tf.nn.relu)(input1)
input2 = Input((10,3,2))
model2 = Dense(200,activation=tf.nn.relu)(input2)
merged = concatenate([model1, model2])
merged = Conv2D(10,3,2,'valid',activation=tf.nn.relu)(merged)
merged = Flatten()(merged)
merged = Dense(200,activation=tf.nn.relu)(merged)
model = Model(inputs=[input1, input2], outputs=merged)
Above we have two separate inputs followed by two Dense layers; you can build these separate branches however you want. To merge them and pass the result through a convolutional layer, use tf.keras.layers.concatenate, and then continue the joint model from there. Wrapping the whole thing in a Model object then gives you access to training and inference methods such as fit and predict.
Linking in Keras works by propagating tensors through the layers. In your second example, model2 is initially an instance of keras.layers.Layer and not a tf.Tensor, which is why you get the error.
Input creates a tensor that can then be used to link the layers. So unless you have a specific reason not to, just add one:
model2 = tf.keras.layers.Input((10,3,2))
model2 = tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu)(model2)
I am building a model in TensorFlow 2.0 (upgrading is not an option due to compatibility with my version of CUDA, which I do not have permission to change). I am using tf.distribute.MirroredStrategy() to train my model on 2 GPUs. However, I am trying to instantiate a custom dense layer whose weights are the transpose of the weights of a different dense layer. My code involves this line to build the custom layer:
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class DenseTied(Layer):
    # Really long class, full code can be found at link below
    def build(self, input_shape):
        self.kernel = K.transpose(self.tied_to.kernel)
I am then using this in a model as follows:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model(input_shape):
    model_input = Input(shape=input_shape)
    dense1 = Dense(6144, activation='relu')
    dense_tied1 = DenseTied(49152, tied_to=dense1)
    x = dense1(model_input)
    model_output = dense_tied1(x)
    model = Model(model_input, model_output)
    model.compile(optimizer='adam', loss='mse')
    return model
When trying to build this model I get an error: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute '_distribute_strategy'.
I have been tracking this down for a while now and I pinpointed that the issue is in the line
self.kernel = K.transpose(self.tied_to.kernel)
It seems that self.tied_to.kernel is of type <class 'tensorflow.python.distribute.values.MirroredVariable'>, but after calling K.transpose() on it the result is of type <class 'tensorflow.python.framework.ops.EagerTensor'>. I tried following the instructions here but it did not work: I get AttributeError: 'MirroredStrategy' object has no attribute 'run', even though the docs list that method. So I think my version of TensorFlow may be too old for it.
How can I update a mirrored variable in Tensorflow 2.0?
Also if you want to see the full custom layer code, I am trying to implement the dense tied layer described here.
As of now, the documentation is for TensorFlow 2.3. If you are using 2.0, it should be strategy.experimental_run_v2 instead of strategy.run.
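If the same code has to run on both versions, the method can be looked up at runtime. The helper below is a hypothetical sketch, not part of TensorFlow; it is exercised with stand-in classes so that it runs without TensorFlow installed, on the assumption that both methods accept fn and args in the same way.

```python
def run_on_replicas(strategy, fn, args=()):
    """Use strategy.run when it exists, otherwise fall back to experimental_run_v2."""
    runner = getattr(strategy, "run", None)
    if runner is None:
        runner = strategy.experimental_run_v2
    return runner(fn, args=args)

class OldStrategy:
    """Stand-in for a TF 2.0-era strategy that only has experimental_run_v2."""
    def experimental_run_v2(self, fn, args=(), kwargs=None):
        return fn(*args, **(kwargs or {}))

class NewStrategy:
    """Stand-in for a newer strategy that exposes run."""
    def run(self, fn, args=(), kwargs=None):
        return fn(*args, **(kwargs or {}))

old_result = run_on_replicas(OldStrategy(), lambda x: x * 2, args=(21,))
new_result = run_on_replicas(NewStrategy(), lambda x: x * 2, args=(21,))
print(old_result, new_result)  # 42 42
```

With a real tf.distribute.MirroredStrategy, the strategy object would be passed in place of the stand-ins.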
I am curious whether a loss function in Keras can use intermediate layer outputs without designing the model to expose those layers as outputs. I have seen that one workaround is to redesign the architecture so that it returns the intermediate layer in addition to the final prediction, but I'm unclear whether a layer output can be accessed directly from a loss function.
I'm unclear whether a layer output can be accessed directly from a loss function
It certainly can.
By way of an example, consider this model using the functional API:
import tensorflow as tf
from tensorflow import keras

inp = keras.layers.Input(shape=(28, 28))
flat = keras.layers.Flatten()(inp)
dense = keras.layers.Dense(128, activation=tf.nn.relu)(flat)
out = keras.layers.Dense(10, activation=tf.nn.softmax)(dense)
model = keras.models.Model(inputs=inp, outputs=out)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
If, say, we wanted to introduce a new loss function that also penalised the largest activation in the output of our dense layer, we could write a custom loss function something like this:
def my_funky_loss_fn(y_true, y_pred):
    return (keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
            + keras.backend.max(dense))
which we can use in our model just by passing our new loss function to the compile() method:
model.compile(optimizer='adam',
              loss=my_funky_loss_fn,
              metrics=['accuracy'])
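To make the arithmetic of that loss concrete, here is a small NumPy sketch of the same quantity: the per-sample sparse categorical cross-entropy plus the single largest activation of the hidden layer across the batch. It is an illustration under simplifying assumptions (no probability clipping, hand-picked numbers), not the Keras implementation, and the function and variable names are hypothetical.

```python
import numpy as np

def funky_loss(y_true, y_pred, hidden):
    """Per-sample cross-entropy plus the largest hidden activation in the batch."""
    picked = y_pred[np.arange(len(y_true)), y_true]  # probability of the true class
    return -np.log(picked) + hidden.max()

y_true = np.array([0, 1])                      # integer class labels
y_pred = np.array([[0.5, 0.5], [0.25, 0.75]])  # softmax outputs
hidden = np.array([[0.0, 2.0], [1.0, 0.5]])    # activations of the dense layer

loss = funky_loss(y_true, y_pred, hidden)
print(loss)
```

Because the maximum is taken over the whole batch, the same scalar penalty (2.0 here) is added to every sample's cross-entropy term.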
I have seen code like the following:
embed_word = Embedding(params['word_voc_size'], params['embed_dim'],
                       weights=[word_embed_matrix],
                       input_length=params['word_max_size'],
                       trainable=False, mask_zero=True)
When I looked up the documentation on the Keras website (https://faroit.github.io/keras-docs/2.1.5/layers/embeddings/), I didn't see a weights argument:
keras.layers.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None)
So I am confused: why can we use the weights argument when it is not defined in the Keras documentation?
My keras version is 2.1.5. Hope someone can help me.
Keras' Embedding layer subclasses the Layer class (every Keras layer does this). The weights keyword argument is handled by this base class, so every subclass accepts it and uses it to set the layer's initial weights. This is also why you won't find it in the documentation or in the implementation of the Embedding layer itself.
You can check the base layer implementation here (Ctrl + F for 'weight').
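The mechanism can be illustrated with a pair of toy classes. This is a deliberately simplified sketch, not the Keras source: the class bodies are assumptions made for illustration, and only the general pattern (the base constructor consumes weights from its keyword arguments and applies it once the layer is built) reflects what the answer describes.

```python
class Layer:
    """Toy base class: it, not the subclass, consumes the `weights` keyword."""
    def __init__(self, **kwargs):
        self._initial_weights = kwargs.pop("weights", None)
        if kwargs:
            raise TypeError("Unexpected keyword arguments: %s" % sorted(kwargs))
        self.weights = None

    def build(self):
        # Subclasses create default weights first; any initial weights
        # passed to the constructor then replace them.
        if self._initial_weights is not None:
            self.weights = self._initial_weights

class Embedding(Layer):
    """Toy subclass whose own signature never mentions `weights`."""
    def __init__(self, input_dim, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.input_dim = input_dim
        self.output_dim = output_dim

    def build(self):
        # Default: one zero-filled (input_dim x output_dim) matrix.
        self.weights = [[[0.0] * self.output_dim for _ in range(self.input_dim)]]
        super().build()

layer = Embedding(2, 3, weights=[[[1, 2, 3], [4, 5, 6]]])
layer.build()
print(layer.weights)  # [[[1, 2, 3], [4, 5, 6]]]
```

The subclass signature lists only its own parameters, which is why a per-layer documentation page can omit an argument that the layer still accepts.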