I have tried tons of methods, like onnx2keras, pytorch2keras, and so on, but something always goes wrong...
Since my model is not really complicated (just a ResNet18 encoder + decoder with some skip connections), I'm considering simply transferring the weights one layer at a time, from PyTorch to Keras.
Before I try, I'd like to ask whether you have similar experience. I know there's the set_weights method, but that's for Keras-to-Keras, so nothing special. However, a Keras model is object/layer-based, so how can I assign name-based weights, e.g. 'encoder.bn1.bias', 'encoder.bn1.running_mean', 'encoder.bn1.running_var', to a BN layer? I don't want TF1.x solutions because all of my work is on TF2.x.
So, in my opinion, it would be something like:
# 1. Save weights and names from the PyTorch model
weights_dict = torch_model.state_dict()

# 2. Construct the Keras model
keras_model = tf.keras.models.Model(...)

# 3. Now load the weights for each layer in the Keras model
for var_name, weight in weights_dict.items():
    # Assign a conv layer its weight from 'encoder.conv1.weight'
    # Assign a BN layer 'encoder.bn1.weight', 'encoder.bn1.bias', 'encoder.bn1.running_mean', 'encoder.bn1.running_var', 'encoder.bn1.num_batches_tracked'
But I don't know how... Looking forward to your opinions!
Could you try pt2keras and see if it works?
link: https://github.com/JWLee89/pt2keras/
To install pt2keras, type the following in the terminal:
pip install -U pt2keras
Below is a simple example for converting resnet18.
import tensorflow as tf
from torchvision.models.resnet import resnet18
from pt2keras import Pt2Keras

if __name__ == '__main__':
    input_shape = (1, 3, 224, 224)

    # Grab model
    model = resnet18(pretrained=False).eval()

    # Create pt2keras object
    converter = Pt2Keras()

    # Convert model
    keras_model: tf.keras.Model = converter.convert(model, input_shape, strict=True)

    # Save the model
    keras_model.save('output_model.h5')

    # Do whatever else that you want afterwards ...
I have attached the converted Keras model, visualized using Netron.
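As a quick sanity check (my own snippet, not part of pt2keras), you could also run both models on the same random input and compare the outputs. I'm assuming here that the converted model expects channels-last (NHWC) input; please confirm that in the Netron view first.
import numpy as np
import torch

# Random test input in PyTorch's NCHW layout
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

with torch.no_grad():
    pt_out = model(torch.from_numpy(x)).numpy()

# Transpose to NHWC for the Keras model (assumption: the converted model is channels-last)
keras_out = keras_model.predict(np.transpose(x, (0, 2, 3, 1)))

print('max abs diff:', np.abs(pt_out - keras_out).max())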
Unfortunately, as far as I know, you cannot attach name-based weights to individual parameters in Keras like you can in PyTorch, since Keras is layer-based. However, you can name the batch-norm layer, which I am guessing is not very useful to you.
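That said, you don't strictly need per-parameter names on the Keras side to do the transfer: each Keras layer's set_weights takes a list in a fixed order, so you can map the PyTorch state_dict names to that order yourself. A minimal sketch, assuming your Keras layers are named to mirror the PyTorch modules (the layer names 'bn1' and 'conv1' here are hypothetical):
import numpy as np

# state_dict values are torch tensors; convert them to NumPy first
sd = {k: v.detach().cpu().numpy() for k, v in torch_model.state_dict().items()}

# BatchNormalization expects [gamma, beta, moving_mean, moving_variance]
bn = keras_model.get_layer('bn1')  # hypothetical layer name
bn.set_weights([
    sd['encoder.bn1.weight'],        # gamma
    sd['encoder.bn1.bias'],          # beta
    sd['encoder.bn1.running_mean'],  # moving_mean
    sd['encoder.bn1.running_var'],   # moving_variance
])
# 'encoder.bn1.num_batches_tracked' has no Keras counterpart and can be ignored.

# A Conv2D kernel must be transposed from PyTorch (out, in, kH, kW)
# to Keras (kH, kW, in, out); ResNet18's first conv has no bias.
conv = keras_model.get_layer('conv1')  # hypothetical layer name
conv.set_weights([sd['encoder.conv1.weight'].transpose(2, 3, 1, 0)])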
Related
I have a TensorFlow Keras model (TensorFlow 2.6.0); here's a basic example:
import tensorflow as tf
x = inp = tf.keras.Input((5,))
x = tf.keras.layers.Dense(7, activation="relu")(x)
x = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inp, x)
I would like to get all the tf.Operation objects in the graph for the model, select specific operations, then create a new tf.function or tf.keras.Model to output the values of those tensors on arbitrary inputs.
For example, in my simple model above, I might want to get the outputs of all relu operators. I know in that case, I could redefine the model to include the output of that layer as another output of the model, but the point here is that I already have the model (it's much more complicated than above), and there are specific operators that I want to find to get the outputs of.
Have you tried this:
all_ops = tf.get_default_graph().get_operations()
If you get an empty list and you are using TensorFlow 2.x, you can try this:
import tensorflow as tf
print(tf.__version__)
tf.compat.v1.disable_eager_execution() # disable eager execution
a = tf.constant([1],name='aa')
print(tf.compat.v1.get_default_graph().get_operations())
print(tf.compat.v1.get_default_graph().get_tensor_by_name('aa:0'))
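Alternatively, in TF 2.x you can stay in eager mode and pull the operations out of a traced graph of the model itself. A minimal sketch along those lines (the Relu filtering is just an example of selecting specific operators):
import tensorflow as tf

# The model from the question
x = inp = tf.keras.Input((5,))
x = tf.keras.layers.Dense(7, activation="relu")(x)
x = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inp, x)

# Trace the model into a ConcreteFunction; its graph holds tf.Operation objects
concrete = tf.function(model).get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))
ops = concrete.graph.get_operations()

# Select the operators you care about, e.g. every Relu
relu_tensors = [op.outputs[0] for op in ops if op.type == 'Relu']
print([t.name for t in relu_tensors])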
I'd like to use pre-trained sentence embeddings in my TensorFlow graph-execution model. The embeddings are available dynamically from a function call, which takes in an array of sentences and outputs an array of sentence embeddings. This function uses a pre-trained PyTorch model, so it has to remain separate from the TensorFlow model I'm training:
def get_pretrained_embeddings(sentences):
    return pretrained_pytorch_model.encode(sentences)
My tensorflow model looks like this:
class SentenceModel(tf.keras.Model):

    def __init__(self):
        super().__init__()

    def call(self, sentences):
        embedding_layer = tf.keras.layers.Embedding(
            10_000,
            256,
            embeddings_initializer=tf.keras.initializers.Constant(get_pretrained_embeddings(sentences)),
            trainable=False,
        )
        sentence_text_embedding = tf.keras.Sequential([
            embedding_layer,
            tf.keras.layers.GlobalAveragePooling1D(),
        ])
        return sentence_text_embedding,
But when I try to train this model using
cached_train = train.shuffle(100_000).batch(1024)
model.fit(cached_train)
my embeddings_initializer call gets the error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
I assume this is because tensorflow is trying to compile the graph using symbolic data. How can I get my external function, which relies on the current training data batch, to work with tensorflow's graph training?
TensorFlow compiles models to an execution graph before performing the actual training process. The obvious side effect that clues us into this is that if we have a regular Python print() statement in, e.g., our call() method, it only gets executed once, as TensorFlow runs through your code to construct the execution graph, which it later converts to native code.
The other side effect of this is that you cannot use anything that isn't a tensor of some description when training. By 'tensor' here, all of the following can be considered a tensor:
The input value of your call() method (obviously)
A tf.keras.Sequential
A tf.keras.Model/tf.keras.layers.Layer subclass
A SparseTensor
A tf.constant()
...and probably more that I haven't listed here.
To this end, you would need to convert your PyTorch model to a Tensorflow one to be able to reference it in a subclass of tf.keras.Model/tf.keras.layers.Layer.
As a side note, if you do find you need to iterate a tensor, you should just be able to iterate it on the 1st dimension (i.e. the batch size) like so:
for part in some_tensor:
    pass
If you want to iterate on some other dimension, I recommend doing a tf.unstack(some_tensor, axis=AXIS_NUMBER_HERE) first and iterate over the result thereof.
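For illustration, a minimal eager-mode sketch of both forms of iteration (the shapes and axis here are arbitrary):
import tensorflow as tf

some_tensor = tf.reshape(tf.range(24.0), (2, 3, 4))

# Iterating directly walks the first (batch) dimension
for part in some_tensor:
    print(part.shape)  # (3, 4)

# To walk another dimension, unstack along that axis first
for part in tf.unstack(some_tensor, axis=1):
    print(part.shape)  # (2, 4)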
I am building a model in TensorFlow 2.0 (upgrading is not an option due to compatibility with my version of CUDA, which I do not have permission to change). I am using tf.distribute.MirroredStrategy() to train my model on 2 GPUs. However, I am trying to instantiate a custom dense layer whose weights are the transpose of the weights of a different dense layer. My code involves this line to build the custom layer:
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class DenseTied(Layer):
    # Really long class, full code can be found at link below
    def build(self, input_shape):
        self.kernel = K.transpose(self.tied_to.kernel)
I am then using this in a model as follows:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model(input_shape):
    model_input = Input(shape=input_shape)
    dense1 = Dense(6144, activation='relu')
    dense_tied1 = DenseTied(49152, tied_to=dense1)
    x = dense1(model_input)
    model_output = dense_tied1(x)
    model = Model(model_input, model_output)
    model.compile(optimizer='adam', loss='mse')
    return model
When trying to build this model I get an error: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute '_distribute_strategy'.
I have been tracking this down for a while now and I pinpointed that the issue is in the line
self.kernel = K.transpose(self.tied_to.kernel)
It seems that self.tied_to.kernel is of type <class 'tensorflow.python.distribute.values.MirroredVariable'>, but after calling K.transpose() on it the resulting output is of type <class 'tensorflow.python.framework.ops.EagerTensor'>. I tried following the instructions here but it did not work: I get AttributeError: 'MirroredStrategy' object has no attribute 'run', even though the docs say it exists. So I think my version of TensorFlow may be too old for that method.
How can I update a mirrored variable in Tensorflow 2.0?
Also if you want to see the full custom layer code, I am trying to implement the dense tied layer described here.
As of now, the documentation is for TensorFlow 2.3. If you are using 2.0, it should be strategy.experimental_run_v2 instead of strategy.run.
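For reference, a minimal sketch of the 2.0-style call (it only reads a mirrored variable inside a replica function; it is not the full DenseTied fix):
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    v = tf.Variable(2.0)  # becomes a MirroredVariable under the strategy

def replica_fn(x):
    # Runs once per replica, reading that replica's copy of v
    return v * x

# TF 2.0/2.1 name; renamed to strategy.run in later releases
per_replica = strategy.experimental_run_v2(replica_fn, args=(tf.constant(3.0),))
print(per_replica)  # one result per replica (GPU)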
I have seen code like the following:
embed_word = Embedding(params['word_voc_size'], params['embed_dim'],
                       weights=[word_embed_matrix],
                       input_length=params['word_max_size'],
                       trainable=False, mask_zero=True)
When I look up the documentation on the Keras website (https://faroit.github.io/keras-docs/2.1.5/layers/embeddings/), I don't see a weights argument:
keras.layers.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None)
So I am confused: why can we use the weights argument when it is not defined in the Keras documentation?
My Keras version is 2.1.5. I hope someone can help me.
Keras' Embedding layer subclasses the Layer class (every Keras layer does this). The weights attribute is implemented in this base class, so every subclass allows setting this attribute through a weights argument. This is also why you won't find it in the documentation or in the implementation of the Embedding layer itself.
You can check the base layer implementation here (Ctrl + F for 'weight').
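You can convince yourself of this with a small sketch; the vocabulary size, sequence length, and embedding dimension below are arbitrary:
import numpy as np
from keras.layers import Embedding, Input
from keras.models import Model

vocab_size, embed_dim = 100, 8
word_embed_matrix = np.random.rand(vocab_size, embed_dim)

inp = Input(shape=(10,), dtype='int32')
# 'weights' is handled by the base Layer class, not by Embedding itself
emb = Embedding(vocab_size, embed_dim, weights=[word_embed_matrix], trainable=False)
out = emb(inp)
model = Model(inp, out)

# The layer's weights now match the matrix we passed in
print(np.allclose(emb.get_weights()[0], word_embed_matrix))  # True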
I'm implementing WGAN and need to clip weight variables.
I'm currently using TensorFlow with Keras as the high-level API, building the layers with Keras to avoid manually creating and initializing variables.
The problem is that WGAN needs to clip the weight variables. This can be done with tf.clip_by_value(x, v0, v1) once I have those weight variable tensors, but I don't know how to get them safely.
One possible solution may be to use tf.get_collection() to get all trainable variables, but I don't know how to get only the weight variables without the bias variables.
Another solution is layer.get_weights(), but it returns NumPy arrays. Although I could clip them with NumPy APIs and set them back using layer.set_weights(), this requires CPU-GPU transfers and may not be a good choice, since the clip operation needs to be performed on each training step.
The only way I know is to access them directly by their exact variable names, which I can get from the lower-level TF APIs or TensorBoard, but that may not be safe since Keras' naming rules are not guaranteed to be stable.
Is there any clean way to perform clip_by_value only on the weight (W) variables with TensorFlow and Keras?
You can use the Constraint class (here) to implement new constraints on parameters.
Here is how you can easily implement a clip on the weights and use it in your model.
from keras.constraints import Constraint
from keras import backend as K

class WeightClip(Constraint):
    '''Clips the weights incident to each hidden unit to be inside a range.'''

    def __init__(self, c=2):
        self.c = c

    def __call__(self, p):
        return K.clip(p, -self.c, self.c)

    def get_config(self):
        return {'name': self.__class__.__name__,
                'c': self.c}

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(30, input_dim=100, W_constraint=WeightClip(2)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop')

X = np.random.random((1000, 100))
Y = np.random.random((1000, 1))

model.fit(X, Y)
I have tested that the above code runs, but not the validity of the constraint. You can do so by getting the model weights after training, using model.get_weights() or model.layers[idx].get_weights(), and checking whether they abide by the constraint.
Note: the constraint is not added to all of the model's weights, only to the weights of the specific layer where it is used. Also, W_constraint adds the constraint to the W (kernel) parameter and b_constraint to the b (bias) parameter.
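For instance, a quick check along those lines, run right after model.fit(X, Y) above (the clip range of 2 matches WeightClip(2)):
import numpy as np

# The first Dense layer's weights come back as [W, b]; only W had the constraint
W, b = model.layers[0].get_weights()
print('W within [-2, 2]:', np.all(np.abs(W) <= 2.0))  # should be True
print('b within [-2, 2]:', np.all(np.abs(b) <= 2.0))  # not guaranteed: b was unconstrained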