Compare the following code snippets. I implemented a simple Keras model like this:
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input((10, 2))
x = layers.Flatten()(inp)
x = layers.Dense(5)(x)
m = models.Model(inputs=inp, outputs=x)
For one reason or another, I need to have my model written in an object-oriented way. No problem, it's easy to reimplement it as:
class MyModel(tf.keras.Model):
    def __init__(self, inp_shape, out_size=5):
        super(MyModel, self).__init__()
        self.inp = layers.InputLayer(input_shape=inp_shape)
        self.flatten = layers.Flatten()
        self.dense = layers.Dense(out_size)

    def call(self, a):
        x = self.inp(a)
        x = self.flatten(x)
        x = self.dense(x)
        return x
However in the second case when I try to run:
m = MyModel((10,2))
m.summary()
I get:
ValueError: This model has not yet been built. Build the model first by calling `build()` or calling `fit()` with some data, or specify an `input_shape` argument in the first layer(s) for automatic build.
I don't quite get why. Shouldn't the two be equivalent?
The reason is that when you create an instance of this model you are only creating its layers, not its graph. The output of layer 1 is not yet connected to layer 2; the layers are just separate attributes of the class. Only when you call the model are those attributes connected together to form the graph.
When you define a model with the tf.keras subclassing API, you need to build the model first by calling build() or by running the model on some data.
m = MyModel((10,2))
m.build(input_shape=(None, 10, 2))  # <-- build the model; None is the batch dimension
m.summary()
That said, you also don't need to define self.inp at all when building the model with the subclassed API. Note that .summary() may not look right to you for a subclassed model; you may need to check this instead.
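Alternatively, a minimal sketch (assuming the same (10, 2) input shape) that builds the graph by running the model on a dummy batch instead of calling build():
m = MyModel((10, 2))
m(tf.zeros((1, 10, 2)))  # run one dummy batch so Keras traces call() and builds the weights
m.summary()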
Related
I built and trained my LSTM model for a regression task and everything works fine. I would like to use the fast_gradient_method function from cleverhans (or any other cleverhans function, as the issue stands for any other attack).
I don't understand how I am supposed to pass the model to the function. From cleverhans:
:param model_fn: a callable that takes an input tensor and returns the model logits
Whatever input I give to the function (the model itself, the weights I get with get_weights, the weights of the "stage" right before the dense layer...), I get this error:
TypeError: 'module' object is not callable
What would be the correct input to make it work?
In the only working example I found, the following line of code is used to define logits_model and then pass it as :param model_fn:, but I still get the error above:
logits_model = tf.keras.Model(model.input,model.layers[-1].output)
To pass a valid model, it should be defined in the following way (it is just an example). The make method is only needed for model.summary() to work; I found the code in another SO post that I can't seem to find right now.
class modSubclass(Model):
    def __init__(self):
        super(modSubclass, self).__init__()
        self.lstm1 = GRU(hidden_size1, activation='relu', return_sequences=True, input_shape=(input_size, 1))
        self.lstm2 = GRU(hidden_size2, activation='relu')
        self.dense1 = Dense(K, activation='relu')

    def call(self, x):
        x = self.lstm1(x)
        x = self.lstm2(x)
        x = self.dense1(x)
        return x

    def make(self, input_shape):
        '''
        This method makes the command "model.summary()" work.
        input_shape: (H,W,C), do not specify batch B
        '''
        x = tf.keras.layers.Input(shape=input_shape)
        model = tf.keras.Model(inputs=[x], outputs=self.call(x), name='actor')
        print(model.summary())
        return model
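A hedged sketch of how such a subclassed model can then be handed to the attack; the import path follows the cleverhans 4.x TF2 API, and x_test, eps and norm are placeholder values, not from the original question:
import numpy as np
from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

model = modSubclass()
model.make((input_size, 1))  # optional, only needed for model.summary()

# the subclassed model is itself a callable that maps an input tensor to logits,
# so it can be passed directly as model_fn
x_adv = fast_gradient_method(model, x_test, eps=0.1, norm=np.inf)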
I am planning on learning PyTorch. However, at this stage I would like to ask a question so that I can understand some code I am reading.
When you have a class whose base class is nn.Module, say
class My_model(nn.Module)
how are inferences supposed to be run there?
In the code I am reading it says
tasks_output, other = my_model(data)
Wouldn't that just be creating an object? (like calling the class constructor)
How, in PyTorch, is inference supposed to be done?
(For reference, I am talking about the case when my_model is set to evaluation mode with my_model.eval().)
EDIT: My apologies. I made the mistake of treating the class and the object as one. I have corrected the code.
You are confusing __init__ and __call__.
In your example my_model is a class, therefore calling
my_model_instance = my_model(arguments)
invokes my_model.__init__ with arguments. The result of this call is a new instance of my_model stored in the variable my_model_instance.
Once you have instantiated the class my_model as the variable my_model_instance, you can evaluate the model on the training data:
tasks_output, other = my_model_instance(data)
"Calling" (i.e., putting parenthesis after the variable name) the instance of the model causes python to invoke the method __call__ of the class.
In the case of classes derived from nn.Module, this will invoke __call__ of nn.Module, which does some PyTorch bookkeeping and eventually calls your implementation of the forward method of my_model.
Please see this detailed thread on the difference between __init__ and __call__ in python in general.
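A minimal sketch of the distinction (the layer sizes here are arbitrary placeholders):
import torch
import torch.nn as nn

class my_model(nn.Module):
    def __init__(self):
        super().__init__()             # runs when you write my_model()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):              # runs, via nn.Module.__call__, when you write instance(x)
        return self.linear(x)

my_model_instance = my_model()                       # __init__: constructs the layers
tasks_output = my_model_instance(torch.randn(1, 4))  # __call__ -> forward: runs the model on data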
It is often convenient to follow the PEP8 Style Guide for Python Code:
Class names should normally use the CapWords convention.
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
For example, you have:
class My_model(nn.Module):
    def __init__(self):
        super(My_model, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Call the constructor of the class
my_model = My_model()
It's important to differentiate between the class and the object.
Class names start with a capital letter in Python.
As you can see, the constructor does not take a data/input parameter; only the forward function does.
Then, for training, you need:
a criterion, which computes the error between the model output and the labels;
an optimizer, for the backpropagation algorithm.
Example:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
Finally, inside a loop, you need these elements:
# zero the gradients, then forward + backward + optimize
optimizer.zero_grad()
outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
Here you have one iteration of backpropagation.
PyTorch documentation
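Put together, a minimal sketch of an epoch loop; train_loader and num_epochs are placeholder names, and net, criterion and optimizer are assumed to be defined as above:
for epoch in range(num_epochs):
    for inputs, labels in train_loader:
        optimizer.zero_grad()              # reset gradients from the previous step
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)  # compute the error against the labels
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the weights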
If you want to understand how inference relates to backpropagation, you can read about how to create a layer with PyTorch and how PyTorch uses autograd.
Tensors use autograd for backpropagation. Example from the PyTorch documentation:
import torch
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()
print(w.grad)
print(b.grad)
The result shows the backpropagation, where the cross-entropy criterion computes the distance between the model output and the label. The tensor z is not just a matrix of values but an object that also remembers the computation involving w, b, x and y.
In a layer, the gradient computation uses the forward function, or a backward function if necessary.
Best regards
Models in PyTorch are defined with classes by inheriting from the base nn.Module class:
class Model(nn.Module):
    pass
You can then implement a forward method that acts as the inference code. Whether it be for training or evaluation, it is supposed to return the output of your model.
class Model(nn.Module):
    def forward(self, x):
        return x**2
Once you have that you can initialize a new model with:
model = Model()
To use your newly initialized model, you won't actually call forward directly. The underlying structure of nn.Module makes it so that calling the instance invokes __call__, which handles the call to your forward implementation. To use it, you just call your object like a function:
>>> model(2)
4
In the documentation page you can see that nn.Module.eval will set the model to evaluation mode which affects particular layers such as batch normalization layers and dropouts. These types of layers are usually turned on for training and turned off for evaluation and testing. You can use it as
model.eval()
When doing model evaluation and testing, it is advised to use the torch.no_grad context manager. This avoids having to retain the activations which are used for gradient backpropagation.
with torch.no_grad():
out = model(x)
Or as a decorator on top of your function/method declaration:
@torch.no_grad()
def validate():
    pass
I am trying to apply one idea proposed by Rusu et al. in https://arxiv.org/pdf/1511.06295.pdf, which consists in training a NN changing the output layer according to the class of the input, i.e., provided that we know the id of the input, we would pick the corresponding output layer. This way, all the hidden layers would be trained with all the data, but each output layer would only be trained with its corresponding type of input data.
This is meant to achieve good results in a transfer learning framework.
How can I implement this "change of the last layer" in tensorflow 2.0?
If you use model subclassing, you can define your own forward pass.
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.block_1 = BlockA()
        self.block_2 = BlockB()
        self.global_pool = layers.GlobalAveragePooling2D()
        self.classifier = Dense(num_classes)

    def call(self, inputs):
        if condition:
            x = self.block_1(inputs)
        else:
            x = self.block_2(inputs)
        x = self.global_pool(x)
        return self.classifier(x)
You'll still have the backprop part to figure out, but I think it's fairly easy if you use a multioutput model and train all your "last layers" at the same time.
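For the specific idea in the question (one output layer per input class, all sharing the same hidden layers), a minimal sketch; the layer sizes, num_tasks and num_classes are placeholders, not taken from the paper:
class MultiHeadModel(tf.keras.Model):
    def __init__(self, num_tasks, num_classes):
        super(MultiHeadModel, self).__init__()
        # hidden layers shared by all tasks, trained on all the data
        self.backbone = tf.keras.Sequential([
            layers.Flatten(),
            layers.Dense(64, activation='relu'),
        ])
        # one output layer ("head") per input class / task id
        self.heads = [layers.Dense(num_classes) for _ in range(num_tasks)]

    def call(self, inputs, task_id=0):
        x = self.backbone(inputs)
        return self.heads[task_id](x)  # only the selected head is used, so only its weights get gradients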
I have a complex Keras model in which one of the layers is a custom pretrained layer that expects "int32" inputs. The model is implemented as a class that inherits from Model, like this:
class MyModel(tf.keras.models.Model):
    def __init__(self, size, input_shape):
        super(MyModel, self).__init__()
        self.layer = My_Layer()
        self.build(input_shape)

    def call(self, inputs):
        return self.layer(inputs)
But when it reaches the self.build call, it throws the following error:
ValueError: You cannot build your model by calling `build` if your layers do not support float type inputs. Instead, in order to instantiate and build your model, `call` your model on real tensor data (of the correct dtype).
How can I fix it?
The exception is thrown when building a model with model.build.
The model.build function builds a model based on the given input shape.
The error is raised because, when we try to build the model, it first calls the model with an x argument whose type depends on the input shape, in the following code:
if (isinstance(input_shape, list) and
    all(d is None or isinstance(d, int) for d in input_shape)):
  input_shape = tuple(input_shape)
if isinstance(input_shape, list):
  x = [base_layer_utils.generate_placeholders_from_shape(shape)
       for shape in input_shape]
elif isinstance(input_shape, dict):
  x = {
      k: base_layer_utils.generate_placeholders_from_shape(shape)
      for k, shape in input_shape.items()
  }
else:
  x = base_layer_utils.generate_placeholders_from_shape(input_shape)
x is a TensorFlow placeholder here (a float one). So when the model is called with x as input, it raises a TypeError, the except block kicks in, and you get the error above.
I assume your input shape is 16x16. Instead of using self.build([(16,16)]), call the model on a real tensor:
inputs = tf.keras.Input(shape=(16,))
self.call(inputs)
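Alternatively, a hedged sketch that builds the variables by running one real int32 batch through the model; the shapes and dtype are assumptions, and it presumes the self.build(...) line has been removed from __init__:
model = MyModel(size, (16, 16))                # size as in the original constructor
dummy = tf.zeros((1, 16, 16), dtype=tf.int32)  # one int32 sample, as the custom layer expects
_ = model(dummy)                               # traces call() with the correct dtype
model.summary()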
Workaround
I've encountered the same problem when trying to export a model with multiple int-typed input tensors as a SavedModel. I worked around it by overriding the build method and manually specifying self._build_input_shape. Your solution would then look like:
class MyModel(tf.keras.models.Model):
    def __init__(self, size, input_shape):
        super(MyModel, self).__init__()
        self.layer = My_Layer()
        self.build(input_shape)

    def call(self, inputs):
        return self.layer(inputs)

    def build(self, input_shapes):
        super(tf.keras.Model, self).build(input_shapes)
What happened in the original code
The default build method of tf.keras.Model treats the input tensors as float tensors by default, which ends up throwing the exception.
Such behavior of tf.keras.Model is defined here, where inputs for your model are created by base_layer_utils.generate_placeholders_from_shape, which will specify dtype as float.
What would happen with the workaround
As tf.keras.Model.build finally invokes its superclass's build function, tf.keras.layers.Layer.build, the workaround skips the tf.keras.Model.build logic that causes the problem. You may, however, have to add complementary code afterwards in case you rely on other logic defined in tf.keras.Model.build.
I have a CNN, and I want to fetch the value of some intermediate layer corresponding to some key from the state dict.
How could this be done?
Thanks.
I think you need to create a new class that redefines the forward pass through a given model. However, you will most probably need to adapt the code to the architecture of your model. You can find an example here:
class extract_layers():
    def __init__(self, model, target_layer):
        self.model = model
        self.target_layer = target_layer

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        module = self.model._modules[self.target_layer]
        # get output of the desired layer
        features = module(x)
        # get output of the whole model
        x = self.model(x)
        return x, features

model = models.vgg19(pretrained=True)
target_layer = 'features'
extractor = extract_layers(model, target_layer)
image = Variable(torch.randn(1, 3, 244, 244))
x, features = extractor(image)
In this case, I am using the pre-defined vgg19 network given in the pytorch models zoo. The network has the layers structured in two modules the features for the convolutional part and the classifier for the fully-connected part. In this case, since features wraps all the convolutional layers of the network it is straightforward. If your architecture has several layers with different names, you will need to store their output using something similar to this:
for name, module in self.model._modules.items():
    x = module(x)  # forward the module individually
    if name in self.target_layer:
        features = x  # store the output of the desired layer
Also, you should keep in mind that you need to reshape the output of the layer that connects the convolutional part to the fully-connected one. It should be easy to do if you know the name of that layer.
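For instance, a hedged sketch of that reshape, assuming the fully-connected part is the module named 'classifier' (as in torchvision's VGG models):
for name, module in self.model._modules.items():
    if name == 'classifier':
        x = x.view(x.size(0), -1)  # flatten conv activations to (batch, features) before the FC part
    x = module(x)                  # forward the module individually
    if name in self.target_layer:
        features = x               # store the output of the desired layer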