Following the tutorials in this post, I am trying to train an autoencoder and extract the features from its hidden layer.
So here are my questions:
In the autoencoder class, there is a "forward" function. However, I cannot see anywhere in the code that this function is called. So how does it get trained?
My question above is because I feel that if I want to extract the features, I should add another function ("forward_hidden") to the autoencoder class:
def forward(self, features):
    # print("in forward")
    # print(type(features))
    activation = self.encoder_hidden_layer(features)
    activation = torch.relu(activation)
    code = self.encoder_output_layer(activation)
    code = torch.relu(code)
    activation = self.decoder_hidden_layer(code)
    activation = torch.relu(activation)
    activation = self.decoder_output_layer(activation)
    reconstructed = torch.relu(activation)
    return reconstructed
def forward_hidden(self, features):
    activation = self.encoder_hidden_layer(features)
    activation = torch.relu(activation)
    code = self.encoder_output_layer(activation)
    code = torch.relu(code)
    return code
Then, after training, which means after this line in the main code:
print("AE, epoch : {}/{}, loss = {:.6f}".format(epoch + 1, epochs_AE, loss))
I can put the following code to retrieve the features from the hidden layer:
hidden_features = model_AE.forward_hidden(my_input)
Is this way correct? Still, I am wondering how the forward function is used for training, because I cannot see it being called anywhere in the code.
forward is the essence of your model and actually defines what the model does.
It is implicitly called with model(input) during training.
If you are asking how to extract intermediate features after running the model, you can register a forward hook as described here; it will "catch" the values for you.
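For example, here is a minimal sketch of a forward hook on a toy module (the Sequential model and names below are assumptions for illustration, not the autoencoder from the question):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
captured = {}

def hook(module, inputs, output):
    # Called every time `module` finishes its forward pass.
    captured["hidden"] = output.detach()

handle = model[1].register_forward_hook(hook)
_ = model(torch.randn(3, 8))     # model(...) implicitly calls forward()
print(captured["hidden"].shape)  # torch.Size([3, 4])
handle.remove()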
When you create a class with nn.Module in PyTorch, the forward function is called implicitly; you do not need to call it separately.
I'm currently working on a solution via PyTorch. I'm not going to share the exact solution but I will provide code that reproduces the issue I'm having.
I have a model defined as follows:
import numpy as np
import torch
import torch.nn as nn
from torch.optim import Adam

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 4)

    def forward(self, x):
        return nn.functional.relu(self.fc1(x))
Then I create an instance: my_model = Net(). Next I create an Adam optimizer like so:
optim = Adam(my_model.parameters())
# create a random input
inputs = torch.tensor(np.array([1, 1, 1, 1, 1, 2, 2, 2, 2, 2]), dtype=torch.float32, requires_grad=True)
# get the outputs
outputs = my_model(inputs)
# compute gradients / backprop via
outputs.backward(gradient=torch.tensor([1., 1., 1., 5.]))
# store parameters before optimizer step
before_step = list(my_model.parameters())[0].detach().numpy()
# update parameters via
optim.step()
# collect parameters again
after_step = list(my_model.parameters())[0].detach().numpy()
# Print if parameters are the same or not
print(np.array_equal(before_step, after_step))  # Prints True
I provided my model's parameters to the Adam optimizer, so I'm not exactly sure why the parameters aren't updating. I know in most cases one uses a loss function; however, I cannot do that in my case, but I assumed that if I passed the model's parameters to the optimizer, it would know to connect the two.
Anyone know why the parameters aren't getting updated?
The problem is with detach (docs).
As noted at the bottom:
Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks
So that is exactly what's happening here. To correctly compare the parameters, you need to clone (docs) them to get a real copy.
list(my_model.parameters())[0].clone().detach().numpy()
On a side note, it can be helpful if you check the gradients after optim.step() with print(list(my_model.parameters())[0].grad) to check if the graph is intact. Also, don't forget to call optim.zero_grad().
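Putting it together, here is a minimal self-contained sketch of the fixed comparison (same toy setup as the question):

import numpy as np
import torch
import torch.nn as nn
from torch.optim import Adam

torch.manual_seed(0)  # for reproducibility

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 4)

    def forward(self, x):
        return nn.functional.relu(self.fc1(x))

my_model = Net()
optim = Adam(my_model.parameters())
inputs = torch.tensor([1, 1, 1, 1, 1, 2, 2, 2, 2, 2], dtype=torch.float32)
outputs = my_model(inputs)
outputs.backward(gradient=torch.tensor([1., 1., 1., 5.]))

# clone() makes a real copy, so the snapshot no longer shares storage
# with the parameter tensor that optim.step() mutates in place.
before_step = list(my_model.parameters())[0].clone().detach().numpy()
optim.step()
after_step = list(my_model.parameters())[0].clone().detach().numpy()
print(np.array_equal(before_step, after_step))  # now prints False (whenever gradients are nonzero)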
I'm having issues in understanding/implementing a custom loss function in my model.
I have a Keras model which is composed of 3 sub-models, as you can see here in the model architecture.
Now, I'd like to use the outputs of model and model_2 in my custom loss function.
I understand that in the loss function definition I can write:
def custom_mse(y_true, y_pred):
    # *calculate stuff*
    return loss
But how do I tell the model to take its 2 outputs as inputs of the loss function?
Maybe, and I hope so, it's super trivial, but I didn't find anything online. If you could help me, it'd be fantastic.
Thanks in advance
Context:
model and model_2 are the same pretrained model, a binary classifier, which predicts the interaction between 2 inputs (of image-like type).
model_1 is a generative model which will edit one of the inputs.
Therefore:
complete_model = Model(inputs=[input_1, input_2], outputs=[out_model, out_model2])
opt = *an optimizer*
complete_model.compile(loss=custom_mse,
                       ??????,
                       optimizer=opt,
                       metrics=['whatever'])
The main goal is to compare the prediction with the edited input against the one with the un-edited input; therefore the model will output the 2 interactions, which I need to use in the loss function.
EDIT:
Thank you Andrey for the solution,
Now, however, I can't manage to combine the 2 loss functions, namely the one with add_loss(func) and a classic binary_crossentropy in model.compile(loss='binary_crossentropy', ...).
Can I maybe add an add_loss specifying model_2.output and the label? If yes, do you know how?
They work by themselves, but not together; when I try to run the code they raise
ValueError: Shapes must be equal rank, but are 0 and 4 From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](binary_crossentropy/weighted_loss/value, complete_model/generator/tf_op_layer_SquaredDifference_3/SquaredDifference_3)' with input shapes: [], [?,500,400,1].
You can pass a loss to compile() only with the standard loss-function signature (y_true, y_pred). You cannot use it here because your signature is something like (y_true, (y_pred1, y_pred2)). Use the add_loss() API instead. See here: https://keras.io/api/losses/
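As a hedged, self-contained sketch of the add_loss() route (the tiny two-input architecture below is an assumption purely for illustration, not the asker's model):

import tensorflow as tf
from tensorflow import keras

inp1 = keras.Input(shape=(8,))
inp2 = keras.Input(shape=(8,))
shared = keras.layers.Dense(4, activation="relu")
out1 = shared(inp1)
out2 = shared(inp2)

model = keras.Model(inputs=[inp1, inp2], outputs=[out1, out2])

# add_loss() accepts any tensor built from the model's own tensors,
# so the loss term can see both outputs at once.
model.add_loss(tf.reduce_mean(tf.square(out1 - out2)))

# compile() then needs no per-output loss for this term.
model.compile(optimizer="adam")

Note that the added term is reduced to a scalar with tf.reduce_mean; the ValueError in the edit above looks like what happens when an unreduced, image-shaped tensor is added alongside a scalar compile() loss.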
I am curious whether a loss function can use intermediate layer outputs within Keras, without designing the model to expose those intermediate layers as outputs. I have seen that a workaround is to redesign the architecture to return the intermediate layer in addition to the final prediction, but I'm unclear whether a layer output can be accessed directly from a loss function.
I'm unclear whether a layer output can be accessed directly from a loss function
It certainly can.
By way of an example, consider this model using the functional API:
import tensorflow as tf
from tensorflow import keras

inp = keras.layers.Input(shape=(28, 28))
flat = keras.layers.Flatten()(inp)
dense = keras.layers.Dense(128, activation=tf.nn.relu)(flat)
out = keras.layers.Dense(10, activation=tf.nn.softmax)(dense)

model = keras.models.Model(inputs=inp, outputs=out)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
If, say, we wanted to introduce a new loss function that also penalised the largest weight of the outputs of our dense layer then we could write a custom loss function something like this:
def my_funky_loss_fn(y_true, y_pred):
    return (keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
            + keras.backend.max(dense))
which we can use in our model just by passing our new loss function to the compile() method:
model.compile(optimizer='adam',
              loss=my_funky_loss_fn,
              metrics=['accuracy'])
Currently I am working on a TensorFlow model. This model classifies a situation based on 2 strings and a number. So my placeholders look as follows:
Input1 = tf.placeholder("string", shape=None, name="string1")
Input2 = tf.placeholder("string", shape=None, name="string2")
Input3 = tf.placeholder("float", shape=None, name="distance")
label = tf.placeholder("int64", shape=None, name="output")
I want to serve this model with Tensorflow Serving with this code:
signature_definition = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={'input1': model_input1, 'input2': model_input2, 'input3': model_input3},
    outputs={'outputs': model_output},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

builder = tf.saved_model.builder.SavedModelBuilder(SERVE_PATH)
builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            signature_definition
    })
But the model I wrote wants the strings as one-hot coded input. Does someone know how to transform the input tensors to one-hot coded ones and feed those to my model?
While training my model, I just transformed them with a function before feeding them. This seems not to be possible while serving, because I can only define an input function, not the flow of the input data.
tf.one_hot provides the one-hot encoding.
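In-graph, that can look like the following minimal sketch (TF 1.x style to match the placeholders above; the vocabulary here is an assumption for illustration):

import tensorflow as tf

vocab = ["red", "green", "blue"]  # hypothetical vocabulary for string1
table = tf.contrib.lookup.index_table_from_tensor(
    tf.constant(vocab), num_oov_buckets=1)  # unknown strings -> extra bucket

input1 = tf.placeholder("string", shape=None, name="string1")
indices = table.lookup(input1)
one_hot1 = tf.one_hot(indices, depth=len(vocab) + 1)

# Remember to run tf.tables_initializer() in the session. Because the
# lookup lives in the graph, the same mapping is applied at serving time.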
However, more broadly, you need to coordinate training and serving to use the same vocabulary index. TensorFlow Transform provides a way to do many transformations (one-hot encoding, scaling, bucketizing) at the training data processing phase and to save the transformation as part of the model graph, so the same transformation is automatically re-applied at serving time, saving you the manual work. Check out the examples at the links below:
Example: https://www.tensorflow.org/tfx/transform/tutorials/TFT_simple_example
Example 2: https://github.com/tensorflow/transform/blob/master/examples/sentiment_example.py
The Full Python API: https://www.tensorflow.org/tfx/transform/api_docs/python/tft
The function you look for there is tft.compute_and_apply_vocabulary.
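As a hedged sketch, a preprocessing_fn for the inputs above might look like this (the key names mirror the placeholders and are otherwise assumptions):

import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Map each string to a vocabulary index learned from the training data
    # and saved as part of the transform graph for serving.
    return {
        'string1_id': tft.compute_and_apply_vocabulary(inputs['string1']),
        'string2_id': tft.compute_and_apply_vocabulary(inputs['string2']),
        'distance': tft.scale_to_z_score(inputs['distance']),
        'label': inputs['output'],
    }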
Here is an example in PyTorch:
optimizer = optim.Adam([modifier_var], lr=0.0005)
And here in Tensorflow:
self.train = self.optimizer.minimize(self.loss, var_list=[self.modifier])
But Chainer's optimizers can only be used on a Link. How can I apply an Optimizer to a Variable in Chainer?
In short, there is no way to directly assign a chainer.Variable (or even a chainer.Parameter) to a chainer.Optimizer.
The following is some redundant explanation.
First, I re-define Variable and Parameter to avoid confusion.
Variable is (1) torch.Tensor in PyTorch v0.4, (2) torch.autograd.Variable in PyTorch v0.3, and (3) chainer.Variable in Chainer v4.
Variable is an object that holds two tensors: .data and .grad. This is the necessary and sufficient condition, so a Variable is not necessarily a learnable parameter, which is the target of the optimizer.
In both libraries there is another class, Parameter, which is similar to but not the same as Variable. Parameter is torch.nn.Parameter in PyTorch and chainer.Parameter in Chainer.
Parameter must be a learnable parameter and should be optimized.
Therefore, there should be no case where you register a Variable (not a Parameter) to an Optimizer (although PyTorch allows registering a Variable to an Optimizer: this is just for backward compatibility).
Second, in PyTorch torch.optim.Optimizer directly optimizes Parameter, but in Chainer chainer.Optimizer DOES NOT optimize Parameter: instead, chainer.UpdateRule does. The Optimizer just registers UpdateRules to the Parameters in a Link.
Therefore, it is only natural that chainer.Optimizer does not receive Parameter as its arguments, because it is just a "delivery-man" of UpdateRule.
If you want to attach a different UpdateRule to each Parameter, you should directly create an instance of an UpdateRule subclass and attach it to the Parameter, as in the sketch below.
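Here is a hedged sketch of per-parameter update rules (Chainer v2+ API; the tiny chain is an assumption for illustration):

import chainer
import chainer.links as L

model = chainer.Chain()
with model.init_scope():
    model.l1 = L.Linear(4, 3)
    model.l2 = L.Linear(3, 1)

optimizer = chainer.optimizers.SGD(lr=0.01)
optimizer.setup(model)  # attaches an UpdateRule to every Parameter

# Each Parameter now carries its own UpdateRule, which can be tuned
# individually, e.g. a smaller learning rate for one layer...
model.l1.W.update_rule.hyperparam.lr = 0.001
# ...or freezing a parameter entirely.
model.l2.b.update_rule.enabled = False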
Below is an example to learn regression task by MyChain MLP model using Adam optimizer in Chainer.
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Chain

# Prepare your model (neural network) as `Link` or `Chain`
class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__(
            l1=L.Linear(None, 30),
            l2=L.Linear(None, 30),
            l3=L.Linear(None, 1)
        )

    def __call__(self, x):
        h = self.l1(x)
        h = self.l2(F.sigmoid(h))
        return self.l3(F.sigmoid(h))

model = MyChain()

# Then you can instantiate the optimizer
optimizer = chainer.optimizers.Adam()
# Register the model with the optimizer (to indicate which parameters to update)
optimizer.setup(model)

# Calculate the loss, and update the parameters as follows.
def lossfun(x, y):
    loss = F.mean_squared_error(model(x), y)
    return loss

# This iteration is "training", fitting the model to the desired function.
# (x and y are the training data, prepared elsewhere.)
for i in range(300):
    optimizer.update(lossfun, x, y)
So in summary, you need to set up the model with the optimizer; after that you can use the update function to calculate the loss and update the model's parameters.
The above code comes from here
Also, there are other ways to write training code using the Trainer module. For a more detailed tutorial of Chainer, please refer to the links below:
chainer-handson
deep-learning-tutorial-with-chainer