Currently I am working on a TensorFlow model. This model classifies a situation based on 2 strings and a number. So my placeholders look as follows:
Input1 = tf.placeholder("string", shape=None, name="string1")
Input2 = tf.placeholder("string", shape=None, name="string2")
Input3 = tf.placeholder("float", shape=None, name="distance")
label = tf.placeholder("int64", shape=None, name="output")
I want to serve this model with Tensorflow Serving with this code:
signature_definition = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={'input1': model_input1, 'input2': model_input2, 'input3': model_input3},
    outputs={'outputs': model_output},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)
builder = tf.saved_model.builder.SavedModelBuilder(SERVE_PATH)
builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            signature_definition
    })
But the model I wrote wants the strings as one-hot encoded input. Does anyone know how to transform the input tensors into one-hot encoded ones and feed those to my model?
While training my model, I just transformed the strings with a function before feeding them. This does not seem possible while serving, because there I can only define an input function, not the flow of the input data.
tf.one_hot provides the one-hot encoding.
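For example (a tiny eager-mode snippet, just to illustrate the op):

import tensorflow as tf

# tf.one_hot turns integer indices into one-hot vectors.
indices = tf.constant([0, 2, 1])
print(tf.one_hot(indices, depth=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]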
However, more broadly, you need to coordinate training and serving so they use the same vocabulary index. TensorFlow Transform provides a way to do many transformations (one-hot encoding, scaling, bucketizing) in the training data processing phase and to save the transformation as part of the model graph, so the same transformation is automatically re-applied at serving time, saving you the manual work. Check out their examples at the links below:
Example: https://www.tensorflow.org/tfx/transform/tutorials/TFT_simple_example
Example 2: https://github.com/tensorflow/transform/blob/master/examples/sentiment_example.py
The Full Python API: https://www.tensorflow.org/tfx/transform/api_docs/python/tft
The function you are looking for there is tft.compute_and_apply_vocabulary.
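A minimal sketch of how that could look for the three inputs in the question (the output feature keys and the preprocessing_fn wiring are illustrative, not taken from your code):

import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Learn a vocabulary over the training data and map each string to its
    # integer index; TFT stores this mapping in the exported graph, so the
    # identical transformation is re-applied at serving time.
    return {
        'input1_id': tft.compute_and_apply_vocabulary(inputs['string1']),
        'input2_id': tft.compute_and_apply_vocabulary(inputs['string2']),
        'distance': inputs['distance'],
        'label': inputs['output'],
    }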
TLDR - what is considered best practice when extracting feature maps from ResNet?
I'm trying to feed the entire CIFAR10 dataset through ResNet18 to extract a new dataset that consists of some non-output activation for every sample in CIFAR10. I have implemented code that generates this dataset, but it runs too long and exceeds the free RAM available on Google Colab (which is quite a lot of RAM). The code I've implemented is based on a blog post called Intermediate Activations — the forward hook.
activation = {}

def get_activation(name):
    """
    When given as input to register_forward_hook, this function is implicitly called when model.forward() is performed
    and saves the output of layer 'name' in the dictionary described above.
    :param name:
    :return:
    """
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook
The get_activation helper function is used inside the activation_maps function, which takes the feature map provided by the conv1 layer of the 2nd BasicBlock in the 4th layer of ResNet18, i.e. (batch_size, 3, 224, 224) -> (batch_size, 512, 7, 7).
(PS - this layer was chosen arbitrarily - is there a known layer whose activations are better?)
ResNet18 = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)

def activation_maps(name='conv1'):
    """
    This function takes a batch and returns some non-last activation alongside the true labels
    :return: train_activations_and_true_labels: array of tuples (activation, true_labels) as train data
    """
    non_output_activation_map = ResNet18.layer4[1].register_forward_hook(get_activation(name))
    # now we create a list of activations and true labels for every sample.
    # This means that if we looped over (X, y) in a dataloader, we can now loop over (activation, y),
    # which is an element in the arrays below, like a regular dataloader.
    train_activations_and_true_labels = []
    for i, (X_train, y_train) in enumerate(train_dataloader):
        out = ResNet18(X_train)
        train_activations_and_true_labels.append((activation[name], y_train))
        print(f"Training data [{i}/{len(train_dataloader)}]", end='\r')
    non_output_activation_map.remove()  # detaching the hook
    return train_activations_and_true_labels
Now, this code runs - but it exceeds the memory capacity of my PyCharm/Google Colab session. Am I missing something? What is the best approach when extracting feature maps?
What batch size are you using, and how much RAM do you have available? ResNet is a somewhat large model, and the layer you're extracting is quite large as well, so storing all of that in memory might be causing issues.
Try reducing your batch size, or storing intermediary results to disk and clearing them from memory.
You might also consider turning off gradient computation when calling the ResNet18 model; this would save a good bit of memory. Putting the @torch.no_grad() decorator on activation_maps(name='conv1') might work.
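A minimal sketch of those ideas combined, reusing the names from the question (ResNet18, get_activation, activation, train_dataloader) and also moving each activation to the CPU so GPU memory gets freed:

import torch

@torch.no_grad()  # no gradient buffers are kept during the forward passes
def activation_maps(name='conv1'):
    handle = ResNet18.layer4[1].register_forward_hook(get_activation(name))
    train_activations_and_true_labels = []
    for X_train, y_train in train_dataloader:
        ResNet18(X_train)  # the hook fills activation[name]
        # .cpu() keeps the stored feature maps out of GPU memory
        train_activations_and_true_labels.append((activation[name].cpu(), y_train))
    handle.remove()
    return train_activations_and_true_labels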
I'd like to use pre-trained sentence embeddings in my TensorFlow graph execution model. The embeddings are available dynamically from a function call, which takes in an array of sentences and outputs an array of sentence embeddings. This function uses a pre-trained PyTorch model, so it has to remain separate from the TensorFlow model I'm training:
def get_pretrained_embeddings(sentences):
    return pretrained_pytorch_model.encode(sentences)
My tensorflow model looks like this:
class SentenceModel(tf.keras.Model):
    def __init__(self):
        super().__init__()

    def call(self, sentences):
        embedding_layer = tf.keras.layers.Embedding(
            10_000,
            256,
            embeddings_initializer=tf.keras.initializers.Constant(get_pretrained_embeddings(sentences)),
            trainable=False,
        )
        sentence_text_embedding = tf.keras.Sequential([
            embedding_layer,
            tf.keras.layers.GlobalAveragePooling1D(),
        ])
        return sentence_text_embedding
But when I try to train this model using
cached_train = train.shuffle(100_000).batch(1024)
model.fit(cached_train)
my embeddings_initializer call gets the error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
I assume this is because tensorflow is trying to compile the graph using symbolic data. How can I get my external function, which relies on the current training data batch, to work with tensorflow's graph training?
TensorFlow compiles models to an execution graph before performing the actual training process. The obvious side effect that clues us into this is that if we have a regular Python print() statement in e.g. our call() method, it will only be executed once, as TensorFlow runs through your code to construct the execution graph, which it will later convert to native code.
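A toy demonstration of that tracing behaviour (the function here is made up):

import tensorflow as tf

@tf.function
def f(x):
    print("tracing")        # plain Python: runs once, while the graph is built
    tf.print("executing")   # graph op: runs on every call
    return x + 1

f(tf.constant(1))  # prints "tracing" then "executing"
f(tf.constant(2))  # prints only "executing"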
The other side effect of this is that you cannot use anything that isn't a tensor of some description during training. By 'tensor' here, all of the following can be considered a tensor:
The input value of your call() method (obviously)
A tf.keras.Sequential
A tf.keras.Model/tf.keras.layers.Layer subclass
A SparseTensor
A tf.constant()
...probably more I haven't listed here.
To this end, you would need to convert your PyTorch model to a Tensorflow one to be able to reference it in a subclass of tf.keras.Model/tf.keras.layers.Layer.
As a side note, if you do find you need to iterate a tensor, you should just be able to iterate it on the 1st dimension (i.e. the batch size) like so:
for part in some_tensor:
    pass
If you want to iterate on some other dimension, I recommend doing a tf.unstack(some_tensor, axis=AXIS_NUMBER_HERE) first and iterate over the result thereof.
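For instance, on a toy tensor:

import tensorflow as tf

x = tf.reshape(tf.range(6), (2, 3))
# Iterate over columns rather than the batch dimension.
for column in tf.unstack(x, axis=1):
    print(column)  # three tensors, each of shape (2,)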
I'm having issues understanding/implementing a custom loss function in my model.
I have a Keras model which is composed of 3 sub-models, as you can see here in the model architecture.
Now, I'd like to use the outputs of model and model_2 in my custom loss function.
I understand that in the loss function definition I can write:
def custom_mse(y_true, y_pred):
    # *calculate stuff*
    return loss
But how do I tell the model to take its 2 outputs as inputs of the loss function?
Maybe, and I hope so, it's super trivial, but I didn't find anything online; if you could help me it'd be fantastic.
Thanks in advance
Context:
model and model_2 are the same pretrained model, a binary classifier, which predicts the interaction between 2 inputs (of image-like type).
model_1 is a generative model which will edit one of the inputs.
Therefore:
complete_model = Model(inputs=[input_1, input_2], outputs=[out_model, out_model2])
opt = *an optimizer*
complete_model.compile(loss=custom_mse,
                       ??????,
                       optimizer=opt,
                       metrics=['whatever'])
The main goal is to compare the prediction with the edited input against the one with the un-edited input, so the model will output the 2 interactions, which I need to use in the loss function.
EDIT:
Thank you Andrey for the solution.
Now, however, I can't manage to make the 2 loss functions work together, namely the one added with add_loss(func) and a classic binary_crossentropy in model.compile(loss='binary_crossentropy', ...).
Can I maybe add an add_loss specifying model_2.output and the label? If yes, do you know how?
They work by themselves but not together; when I try to run the code they raise:
ValueError: Shapes must be equal rank, but are 0 and 4 From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](binary_crossentropy/weighted_loss/value, complete_model/generator/tf_op_layer_SquaredDifference_3/SquaredDifference_3)' with input shapes: [], [?,500,400,1].
You can add a loss with compile() only for a loss function with the standard signature (y_true, y_pred). You cannot use it here because your loss needs something like (y_true, (y_pred1, y_pred2)). Use the add_loss() API instead. See here: https://keras.io/api/losses/
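A minimal sketch of the add_loss() route, assuming the input_1, input_2, out_model, and out_model2 tensors from the question; the squared-difference term is only a placeholder for your actual comparison:

import tensorflow as tf
from tensorflow.keras import Model

complete_model = Model(inputs=[input_1, input_2], outputs=[out_model, out_model2])
# add_loss() takes a tensor, so the loss can depend on both outputs at once.
complete_model.add_loss(
    tf.reduce_mean(tf.math.squared_difference(out_model, out_model2)))
complete_model.compile(optimizer='adam')  # no loss argument needed for this term

Note that tf.reduce_mean collapses the added loss to a scalar, which is also what avoids rank-mismatch errors like the ValueError quoted in the edit above.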
I initialized nn.Embedding with some pretrained parameters (they are 128-dim vectors); the following code demonstrates how I do this:
self.myvectors = gensim.models.KeyedVectors.load_word2vec_format(cfg.vec_dir)
self.vec_weights = torch.FloatTensor(self.myvectors.vectors)
self.embeds = torch.nn.Embedding.from_pretrained(self.vec_weights)
cfg.vec_dir is a JSON file where vec_dir indicates the path of the pretrained 128-dim vectors I used to initialize this layer.
After the model was trained, I printed out this embedding layer and found that the parameters were exactly the same as I initialized them, so clearly the parameters were not updated during training. Why is this happening? What should I do in order to update these vectors?
The torch.nn.Embedding.from_pretrained classmethod by default freezes the parameters. If you want to train the parameters, you need to set the freeze keyword argument to False. See the documentation.
So you might try this instead:
self.embeds = torch.nn.Embedding.from_pretrained(self.vec_weights, freeze=False)
I load features and labels from my training dataset. Both of them are originally numpy arrays, but I convert them to torch tensors using torch.from_numpy(features.copy()) and torch.tensor(labels.astype(np.bool)).
And I noticed that torch.autograd.Variable is something like a placeholder in TensorFlow.
When I train my network, first I tried
features = features.cuda()
labels = labels.cuda()
outputs = Config.MODEL(features)
loss = Config.LOSS(outputs, labels)
Then I tried
features = features.cuda()
labels = labels.cuda()
input_var = Variable(features)
target_var = Variable(labels)
outputs = Config.MODEL(input_var)
loss = Config.LOSS(outputs, target_var)
Both blocks succeed in running the training, but I worry that there might be a subtle difference between them.
According to this question, you no longer need Variables in order to use PyTorch autograd.
Thanks to @skytree, we can make this even more explicit: Variables have been deprecated, i.e. you're not supposed to use them anymore.
Autograd automatically supports Tensors with requires_grad set to True.
And more importantly
Variable(tensor) and Variable(tensor, requires_grad) still work as expected, but they return Tensors instead of Variables.
This means that if your features and labels are tensors already (which they seem to be in your example), Variable(features) and Variable(labels) will just return tensors again.
The original purpose of Variables was to be able to use automatic differentiation (Source):
Variables are just wrappers for the tensors so you can now easily auto compute the gradients.
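For example, a minimal sketch of the modern replacement (the tensor values here are made up):

import torch

# A tensor with requires_grad=True plays the role Variables used to play.
x = torch.ones(3, requires_grad=True)
y = (x * 2).sum()
y.backward()
print(x.grad)  # tensor([2., 2., 2.])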