I'm having some difficulties writing an extract_weights and an initialize function for a tf.Module model that I later convert to TFLite.
The idea is that I want to use this model for on-device training.
The project architecture is as follows:
- first I create a transfer learning model that will later be used for training
- then I load this model into my Android application, where I train it using the tflite.Interpreter
- the model will be trained in a federated fashion using a Flower server
The problem I have at the moment is that Flower needs to collect the weights from each device as ByteBuffers after each training loop, but I don't understand how I could save them in my Android application.
These are the methods that I wrote:
@tf.function
def extract_weights(self):
    """
    Extracts the trainable weights of the head model.
    Parameters:
        None.
    Returns:
        Map of extracted weights and biases.
    """
    tmp_dict = {}
    tensor_names = [weight.name for weight in self.head_model.weights]
    tensors_to_save = [weight.read_value() for weight in self.head_model.weights]
    for index, layer in enumerate(tensors_to_save):
        tmp_dict[tensor_names[index]] = layer
    return tmp_dict
@tf.function(input_signature=[SIGNATURE_DICT])
def initialize_weights(self, weights):
    """
    Initializes the weights of the head model.
    Parameters:
        weights: Tensors used for initialization.
    Returns:
        None.
    """
    tensor_names = [weight.name for weight in self.head_model.weights]
    for i, tensor in enumerate(self.head_model.weights):
        tensor.assign(weights[tensor_names[i]])
Note that when I instantiate a TransferLearningModel (my model class that extends tf.Module) and call these two functions I have no problems, but when I try to convert them to TFLite I get this error:
ValueError: Got a non-Tensor value <tf.Operation 'StatefulPartitionedCall' type=StatefulPartitionedCall> for key 'output_0' in the output of the function __inference_initialize_weights_8582 used to generate the SavedModel signature 'initialize'. Outputs for functions used as signatures must be a single Tensor, a sequence of Tensors, or a dictionary from string to Tensor.
I understand the error, but I don't get why I have to return something when I'm simply initializing the weights of my model.
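For what it's worth, the error message itself spells out the constraint: any tf.function exported as a SavedModel signature must return a single Tensor, a sequence of Tensors, or a dict of Tensors. A minimal sketch of one way to satisfy that, reusing the head_model and SIGNATURE_DICT names from above (untested, just to illustrate the shape of a possible fix):

@tf.function(input_signature=[SIGNATURE_DICT])
def initialize_weights(self, weights):
    # Assign the incoming tensors to the head model's variables, as before.
    tensor_names = [weight.name for weight in self.head_model.weights]
    for i, tensor in enumerate(self.head_model.weights):
        tensor.assign(weights[tensor_names[i]])
    # Return the freshly assigned values so the signature has a valid output
    # (a sequence of Tensors); callers can simply ignore the return value.
    return [weight.read_value() for weight in self.head_model.weights]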
TLDR - what is considered best practice when extracting feature maps from ResNet?
I'm trying to feed the entire CIFAR10 dataset through ResNet18 to extract a new dataset that consists of some non-output activation of every sample in CIFAR10. I have implemented code that generates this dataset, but it runs for a long time and exceeds the free RAM available on Google Colab (which is quite a lot of RAM). The code I've implemented is based on a blog post called Intermediate Activations — the forward hook.
activation = {}

def get_activation(name):
    """
    When passed to register_forward_hook, the returned hook is called implicitly whenever model.forward() runs,
    and it saves the output of layer 'name' in the 'activation' dictionary defined above.
    :param name: key under which the layer's output is stored
    :return: the hook function to register
    """
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook
The get_activation helper function is used inside the activation_maps function, which takes the feature map produced by the 4th layer, 2nd BasicBlock, conv1 layer of ResNet18: (batch_size, 3, 224, 224) -> (batch_size, 512, 7, 7).
(PS - this layer was arbitrarily chosen - is there a known layer from which the activations are better?)
ResNet18 = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)

def activation_maps(name='conv1'):
    """
    This function takes a batch and returns some non-last activation alongside the true labels.
    :return: train_activations_and_true_labels: array of tuples (activation, true_labels) as train data
    """
    non_output_activation_map = ResNet18.layer4[1].register_forward_hook(get_activation(name))
    # Now we create a list of activations and true labels for every sample.
    # This means that if we looped over (X, y) in a dataloader, we can now loop over (activation, y),
    # which is an element of the array below, like a regular dataloader.
    train_activations_and_true_labels = []
    for i, (X_train, y_train) in enumerate(train_dataloader):
        out = ResNet18(X_train)
        train_activations_and_true_labels.append((activation[name], y_train))
        print(f"Training data [{i}/{len(train_dataloader)}]", end='\r')
    non_output_activation_map.remove()  # detaching the hook
    return train_activations_and_true_labels
Now, this code runs, but it exceeds the memory capacity of my PyCharm/Google Colab session. Am I missing something? What is the best approach when extracting feature maps?
What batch size are you using, and how much RAM do you have available? ResNet is a somewhat large model, and the layer you're extracting is quite large as well, so storing all of that in memory might be causing issues.
Try reducing your batch size, or storing intermediary results to disk and clearing them from memory.
You might also consider turning off gradient computation when calling the ResNet18 model; this would save a good bit of memory. Putting the @torch.no_grad() decorator on activation_maps(name='conv1') might work.
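A minimal sketch of what that could look like, reusing the activation_maps function, get_activation hook and train_dataloader from the question (just a sketch of the suggestions above, not a drop-in replacement):

import torch

@torch.no_grad()  # disables gradient tracking for everything inside the function
def activation_maps(name='conv1'):
    non_output_activation_map = ResNet18.layer4[1].register_forward_hook(get_activation(name))
    train_activations_and_true_labels = []
    for i, (X_train, y_train) in enumerate(train_dataloader):
        ResNet18(X_train)  # forward pass only; the hook stores the activation
        # Moving to CPU keeps GPU memory flat; you could also torch.save(...) each batch
        # to disk here and clear the list to keep host RAM bounded.
        train_activations_and_true_labels.append((activation[name].cpu(), y_train))
    non_output_activation_map.remove()
    return train_activations_and_true_labels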
I'd like to use pre-trained sentence embeddings in my tensorflow graph execution model. The embeddings are available dynamically from a function call, which takes in an array of sentences and outputs an array of sentence embeddings. This function uses a pre-trained PyTorch model, so it has to remain separate from the TensorFlow model I'm training:
def get_pretrained_embeddings(sentences):
    return pretrained_pytorch_model.encode(sentences)
My tensorflow model looks like this:
class SentenceModel(tf.keras.Model):
    def __init__(self):
        super().__init__()

    def call(self, sentences):
        embedding_layer = tf.keras.layers.Embedding(
            10_000,
            256,
            embeddings_initializer=tf.keras.initializers.Constant(get_pretrained_embeddings(sentences)),
            trainable=False,
        )
        sentence_text_embedding = tf.keras.Sequential([
            embedding_layer,
            tf.keras.layers.GlobalAveragePooling1D(),
        ])
        return sentence_text_embedding
But when I try to train this model using
cached_train = train.shuffle(100_000).batch(1024)
model.fit(cached_train)
my embeddings_initializer call gets the error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
I assume this is because tensorflow is trying to compile the graph using symbolic data. How can I get my external function, which relies on the current training data batch, to work with tensorflow's graph training?
TensorFlow compiles models to an execution graph before performing the actual training process. The obvious side effect that clues us into this is that if we have a regular Python print() statement in e.g. our call() method, it will only get executed once, as TensorFlow runs through your code to construct the execution graph, which it will later convert to native code.
The other side effect of this is that you cannot use anything that isn't a tensor of some description when training. By 'tensor' here, all of the following can be considered a tensor:
The input value of your call() method (obviously)
A tf.keras.Sequential
A tf.keras.Model/tf.keras.layers.Layer subclass
A SparseTensor
A tf.constant()
....probably more I haven't listed here.
To this end, you would need to convert your PyTorch model to a Tensorflow one to be able to reference it in a subclass of tf.keras.Model/tf.keras.layers.Layer.
As a side note, if you do find you need to iterate a tensor, you should just be able to iterate it on the 1st dimension (i.e. the batch size) like so:
for part in some_tensor:
    pass
If you want to iterate on some other dimension, I recommend doing a tf.unstack(some_tensor, axis=AXIS_NUMBER_HERE) first and iterating over the result thereof.
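As a quick illustrative sketch (the tensor and axis here are made up for the example):

import tensorflow as tf

some_tensor = tf.reshape(tf.range(24), (2, 3, 4))  # example tensor of shape (2, 3, 4)

# Unstack along axis 1 and iterate; each 'part' has shape (2, 4).
for part in tf.unstack(some_tensor, axis=1):
    print(part.shape)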
I initialized nn.Embedding with some pretrained parameters (they are 128-dim vectors); the following code demonstrates how I do this:
self.myvectors = gensim.models.KeyedVectors.load_word2vec_format(cfg.vec_dir)
self.vec_weights = torch.FloatTensor(self.myvectors.vectors)
self.embeds = torch.nn.Embedding.from_pretrained(self.vec_weights)
cfg.vec_dir comes from a JSON config file; vec_dir indicates the path of the pretrained 128-dim vectors I used to initialize this layer.
After the model is trained, I print out this embedding layer, and I found that the parameters are exactly the same as I initialized them, so clearly the parameters are not updated during the training. Why is this happening? What should I do in order to update these vectors?
The torch.nn.Embedding.from_pretrained classmethod by default freezes the parameters. If you want to train the parameters, you need to set the freeze keyword argument to False. See the documentation.
So you might try this instead:
self.embeds = torch.nn.Embedding.from_pretrained(self.vec_weights, freeze=False)
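A quick sanity check after making that change (just a sketch, reusing self.embeds from the snippet above):

# With freeze=False the embedding weights require gradients,
# so the optimizer will update them during training.
print(self.embeds.weight.requires_grad)  # expected: True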
I am using a very simple keras model in TFX, to solve a regression problem.
It seems that TFX wants you to use a Keras model with named outputs, so I made:
output = {key: tf.keras.layers.Dense(1, name=key)(x)
          for key in _transformed_names(_LABEL_KEYS)}
model = tf.keras.Model(inputs, output)
I don't understand how the Evaluator maps the label names in my dataset to the output names of my model.
In my code I set the label_keys and prediction_keys arguments in tfma.ModelSpec to a list of the form:
[["model output name", "Label key in my Dataset"]]
It seems that the proto message is created correctly, but when I run the Evaluator I get the following error:
ValueError: unable to prepare labels and predictions because the labels and/or predictions are dicts with unrecognized keys. If a multi-output keras model (or estimator) was used check that an output_name was provided. If an estimator was used check that common prediction keys were provided (e.g. logistic, probabilities, etc)
If I try to provide a single label key and a single prediction key using the label_key and prediction_key arguments I get the following error:
TypeError: update_state() takes from 2 to 3 positional arguments but 4 were given [while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/EvaluateMetricsAndPlots/ComputeMetricsAndPlots()/ComputePerSlice/ComputeUnsampledMetrics/CombinePerSliceKey/WindowIntoDiscarding']
I have tried every way I can think of, but nothing works.
Is there a way to use a model with no named outputs (a Dense output layer with more than one node)? Or a way to solve this problem?
P.S. Is there a tutorial for a TFX pipeline with a multi-output Keras model?
Thanks.
In eval_config, set
options=Options(include_default_metrics=BoolValue(value=False))
e.g.:
eval_config = tfma.EvalConfig(
    model_specs=[...],
    slicing_specs=[tfma.SlicingSpec(), ...],
    metrics_specs=[...],
    options=Options(include_default_metrics=BoolValue(value=False))
)
evaluator = Evaluator(
    ...
    eval_config=eval_config
)
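Note that Options and BoolValue are not imported in the snippet above. Assuming a fairly standard TFMA setup, they would typically come from the TFMA config messages and the protobuf wrappers, along these lines (hypothetical imports; exact paths may differ across TFMA versions):

# Hypothetical imports; adjust to your TFMA version.
from google.protobuf.wrappers_pb2 import BoolValue
from tensorflow_model_analysis import Options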
Currently I am working on a TensorFlow model. This model classifies a situation based on 2 strings and a number. So my placeholders look as follows:
Input1 = tf.placeholder("string", shape=None, name="string1")
Input2 = tf.placeholder("string", shape=None, name="string2")
Input3 = tf.placeholder("float", shape=None, name="distance")
label = tf.placeholder("int64", shape=None, name="output")
I want to serve this model with Tensorflow Serving with this code:
signature_definition = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={'input1': model_input1, 'input2': model_input2, 'input3': model_input3},
    outputs={'outputs': model_output},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

builder = tf.saved_model.builder.SavedModelBuilder(SERVE_PATH)

builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            signature_definition
    })
But the model I wrote wants the strings as one-hot encoded input. Does anyone know how to transform the input tensors into one-hot encoded ones and feed those to my model?
While training my model, I just transformed them with a function before feeding them. This does not seem possible while serving, because I can only define an input function, not the flow of the input data.
tf.one_hot provides the one-hot encoding.
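As a minimal sketch of that call, assuming the strings have already been mapped to integer vocabulary indices (which is exactly the coordination problem described next):

import tensorflow as tf

vocab_size = 5                    # assumed vocabulary size for the example
indices = tf.constant([0, 2, 4])  # string inputs already mapped to integer ids
one_hot = tf.one_hot(indices, depth=vocab_size)  # shape (3, 5)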
However, more broadly, you need to coordinate training and serving to use the same index. TensorFlow Transform provides a way to do many transformations (one-hot, scale, bucketize) at the training data processing phase and to save the transformation as part of the model graph, so the same transformation is automatically re-applied at serving time, saving you the manual work. Check out their examples at the links below:
Example: https://www.tensorflow.org/tfx/transform/tutorials/TFT_simple_example
Example 2: https://github.com/tensorflow/transform/blob/master/examples/sentiment_example.py
The Full Python API: https://www.tensorflow.org/tfx/transform/api_docs/python/tft
The function you're looking for there is tft.compute_and_apply_vocabulary.
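A minimal sketch of a preprocessing_fn using it might look like this (the output feature names are made up, and I'm assuming the input keys match the placeholder names string1, string2 and distance from the question):

import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Map each string feature to an integer index using a vocabulary computed
    # over the training data; the vocabulary is saved in the transform graph
    # and re-applied identically at serving time.
    return {
        'string1_id': tft.compute_and_apply_vocabulary(inputs['string1']),
        'string2_id': tft.compute_and_apply_vocabulary(inputs['string2']),
        'distance': inputs['distance'],
    }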