I have an instance of torchvision.models.ResNet, and I have a class CondBatchNorm2d, a module similar to BatchNorm2d whose forward method accepts an additional input y that does not come from the previous layer, since it is an input of the whole network:
def forward(self, x, y=None):
...
I know how to substitute each BatchNorm2d instance with an instance of CondBatchNorm2d, but I am not sure how to write my own forward method to route the new input to the intermediate CondBatchNorm2d layers. Should I iterate over the ResNet children in forward, or is there a more suitable way to do it?
Hackish, but assuming y is the input of the whole network, you might create a new module that wraps CondBatchNorm2d with y fixed at construction time:
class FedCondBatchNorm2d(nn.Module):
    def __init__(self, y, *args, **kwargs):
        super().__init__()
        self.batch_norm = CondBatchNorm2d(*args, **kwargs)
        self.cond_img = y  # conditioning input, fixed at construction time

    def forward(self, x):
        # single-input signature, so it drops into ResNet unchanged
        return self.batch_norm(x, self.cond_img)
As its forward API now matches the original BatchNorm2d, you can simply swap the blocks in place, e.g. via module.apply, as sketched below.
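A minimal sketch of the swap, assuming CondBatchNorm2d mirrors BatchNorm2d's constructor (the num_features argument and the availability of y are assumptions):

import torch.nn as nn
import torchvision

y = ...  # your conditioning tensor, assumed to be available at swap time

def replace_bn(module):
    # setattr on the parent replaces the child module in place
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, FedCondBatchNorm2d(y, child.num_features))

model = torchvision.models.resnet18()
model.apply(replace_bn)  # apply() visits every submodule recursively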
Related
I created a custom layer (e.g. derived from Dense), to which I want to pass a non-trainable Variable (custom_input) of constant size (custom_input_shape). I'll leave out some of the details to keep it short.
import tensorflow as tf
from typing import Tuple

class CustomLayer(tf.keras.layers.Dense):
    def __init__(self, custom_input_shape, **kwargs):
        super().__init__(**kwargs)
        self.custom_input_shape = custom_input_shape
        self.input_spec = [
            self.input_spec,
            tf.keras.layers.InputSpec(shape=[self.custom_input_shape]),
        ]

    def build(self, input_shape):
        super().build(input_shape)
        self.input_spec = [
            self.input_spec,
            tf.keras.layers.InputSpec(shape=[self.custom_input_shape]),
        ]

    def call(self, inputs: Tuple[tf.Tensor, tf.Tensor]):
        x, custom_input = inputs
        ...
        return x
I adapted the input_spec, and pass in the regular inputs including the custom_input as a tuple (inputs, custom_input).
The custom_input is calculated once in the call() method of my custom model before each layer gets called, and gets passed to each custom layer that needs the custom_input.
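For illustration, the custom model's call described above might look roughly like this (compute_custom_input and self.custom_layers are hypothetical placeholders):

class CustomModel(tf.keras.Model):
    def call(self, inputs):
        # computed once per forward pass, then shared by every CustomLayer
        custom_input = compute_custom_input(inputs)  # hypothetical helper
        x = inputs
        for layer in self.custom_layers:  # hypothetical list of CustomLayer instances
            x = layer((x, custom_input))
        return x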
Training a model with a Layer like this in it works, but I am not sure if this is the recommended way to go about this.
Additionally, saving a custom Model containing this layer fails: model.save() calls the model with some generic input and breaks some custom calculations in call(), because it assumes the custom_input depends on the batch size and therefore passes in an empty Tensor, while it should actually be of fixed size.
I would be glad for any recommendations on how to solve this according to Tensorflow 'rules'.
I am planning on learning PyTorch. However, at this stage I would like to ask a question so that I can understand some code I am reading.
When you have a class whose base class is nn.Module, say
class My_model(nn.Module):
how are inferences supposed to be run there?
In the code I am reading it says
tasks_output, other = my_model(data)
Wouldn't that just be creating an object? (like calling the class constructor)
How, in PyTorch, are inferences supposed to be made?
(For reference, I am talking about the case when my_model has been set to evaluation mode with my_model.eval().)
EDIT: My apologies. I made the mistake of treating the class and the object as one. I have corrected the code.
You are confusing __init__ and __call__.
In your example my_model is a class, therefore calling
my_model_instance = my_model(arguments)
invokes my_model.__init__ with the given arguments. The result of this call is a new instance of my_model stored in the variable my_model_instance.
Once you have instantiated the class my_model as the variable my_model_instance, you can evaluate the model on the training data:
tasks_output, other = my_model_instance(data)
"Calling" (i.e., putting parenthesis after the variable name) the instance of the model causes python to invoke the method __call__ of the class.
In the case of classes derived from nn.Module, this invokes __call__ of nn.Module, which does some PyTorch bookkeeping and eventually calls your implementation of the forward method of my_model.
Please see this detailed thread on the difference between __init__ and __call__ in Python in general.
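To make the distinction concrete, here is a minimal sketch (the two-output forward mirrors the snippet in the question; the layer sizes are arbitrary):

import torch
import torch.nn as nn

class My_model(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, data):
        return self.linear(data), "other"

my_model_instance = My_model()                              # __init__ runs here
tasks_output, other = my_model_instance(torch.randn(1, 4))  # __call__ runs and calls forward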
It is often convenient to follow the PEP 8 Style Guide for Python Code:
Class names should normally use the CapWords convention.
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
For example, you might have:
import torch
import torch.nn as nn
import torch.nn.functional as F

class My_model(nn.Module):
    def __init__(self):
        super(My_model, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Call the constructor of the class
my_model = My_model()
It's important to differentiate between the class and the object.
In Python, class names start with a capital letter.
As you can see, the constructor doesn't take a data/input parameter; only the forward function does.
Then, for training, you need:
a criterion, which computes the error between the model output and the labels;
an optimizer, for the backpropagation algorithm.
Example:
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(my_model.parameters(), lr=0.001, momentum=0.9)
Finally, inside a loop, you need these elements:
# zero the parameter gradients, then forward + backward + optimize
optimizer.zero_grad()
outputs = my_model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
This gives you one iteration of backpropagation.
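Putting the pieces together, a minimal sketch of the full loop (num_epochs and train_loader are assumed to be defined elsewhere):

for epoch in range(num_epochs):            # num_epochs is an assumption
    for inputs, labels in train_loader:    # assumed DataLoader over (input, label) batches
        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = my_model(inputs)         # forward
        loss = criterion(outputs, labels)
        loss.backward()                    # backward
        optimizer.step()                   # optimize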
PyTorch documentation
If you want to understand inference in terms of backpropagation, you can read how to create a layer in PyTorch and how PyTorch uses autograd.
Tensors use autograd for backpropagation. Example from the PyTorch documentation:
import torch
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()
print(w.grad)
print(b.grad)
The result gives the backpropagated gradients, where the cross-entropy criterion computes the distance between the model output and the label. The tensor z is not just a matrix of values, but an object that remembers the computation involving w, b, x, and y.
In a layer, the gradient computation uses the forward function, or a custom backward function if necessary.
Best regards
Models in PyTorch are defined with classes by inheriting from the base nn.Module class:
class Model(nn.Module):
    pass
You can then implement a forward method that acts as the inference code. Whether it be for training or evaluation, it is supposed to return the output of your model.
class Model(nn.Module):
    def forward(self, x):
        return x**2
Once you have that you can initialize a new model with:
model = Model()
To use your newly initialized model, you won't actually call forward directly. The underlying structure of nn.Module makes it such that you call the instance itself, which invokes __call__ and handles the call to your forward implementation. To use it, you just call your object like a function:
>>> model(2)
4
In the documentation page you can see that nn.Module.eval will set the model to evaluation mode, which affects particular layers such as batch normalization and dropout layers. These types of layers are usually turned on for training and turned off for evaluation and testing. You can use it as:
model.eval()
When doing model evaluation and testing, it is advised to use the torch.no_grad context manager. This avoids having to retain the activations which are used for gradient backpropagation.
with torch.no_grad():
out = model(x)
Or as a decorator on top of your function/method declaration:
@torch.no_grad()
def validate():
    pass
I'm trying to use multiple inputs in custom layers in TensorFlow-Keras. The usage can be anything; right now it is defined as multiplying the mask with the image. I've searched SO, and the only answer I could find was for TF 1.x, so it didn't do any good.
class mul(layers.Layer):
def __init__(self, **kwargs):
super().__init__(**kwargs)
# I've added pass because this is the simplest form I can come up with.
pass
def call(self, inputs):
# magic happens here and multiplications occur
return Z
EDIT: Since TensorFlow v2.3/2.4, the contract is to use a list of inputs to the call method. For keras (not tf.keras) I think the answer below still applies.
Implementing multiple inputs is done in the call method of your class; there are two alternatives:
List input: here the inputs parameter is expected to be a list containing all the inputs; the advantage is that it can be of variable size. You can index the list, or unpack it via tuple assignment with the = operator:
def call(self, inputs):
    Z = inputs[0] * inputs[1]
    # Alternatively, unpack the list:
    input1, input2 = inputs
    Z = input1 * input2
    return Z
Multiple input parameters in the call method: this works, but the number of parameters is fixed when the layer is defined:
def call(self, input1, input2):
Z = input1 * input2
return Z
Which method you choose depends on whether you need a fixed or variable number of arguments. Of course, each method changes how the layer has to be called, either by passing a list of arguments or by passing the arguments one by one.
You can also use *args in the first method to allow for a call method with a variable number of arguments, but overall keras' own layers that take multiple inputs (like Concatenate and Add) are implemented using lists.
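For completeness, a sketch of the *args variant (the class name is illustrative):

class MulAll(layers.Layer):
    def call(self, *inputs):
        # multiply an arbitrary number of input tensors together
        Z = inputs[0]
        for t in inputs[1:]:
            Z = Z * t
        return Z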
Try it this way:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Layer
from tensorflow.keras.models import Model

class mul(Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def call(self, inputs):
        inp1, inp2 = inputs
        Z = inp1 * inp2
        return Z

inp1 = Input((10,))
inp2 = Input((10,))
x = mul()([inp1, inp2])
x = Dense(1)(x)
model = Model([inp1, inp2], x)
model.summary()
I am trying to apply one idea proposed by Rusu et al. in https://arxiv.org/pdf/1511.06295.pdf, which consists in training a NN changing the output layer according to the class of the input, i.e., provided that we know the id of the input, we would pick the corresponding output layer. This way, all the hidden layers would be trained with all the data, but each output layer would only be trained with its corresponding type of input data.
This is meant to achieve good results in a transfer learning framework.
How can I implement this "change of the last layer" in tensorflow 2.0?
If you use model subclassing, you can explicitly define your forward pass.
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.block_1 = BlockA()
        self.block_2 = BlockB()
        self.global_pool = layers.GlobalAveragePooling2D()
        self.classifier = layers.Dense(num_classes)

    def call(self, inputs):
        if condition:
            x = self.block_1(inputs)
        else:
            x = self.block_2(inputs)
        x = self.global_pool(x)
        return self.classifier(x)
You'll still have the backprop part to figure out, but I think it's fairly easy if you use a multioutput model and train all your "last layers" at the same time.
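For the paper's setting specifically, one possible sketch (the names, sizes, and the task-id input are all assumptions) is to build every head, run them all, and gather each sample's own head, so that only the selected head receives gradient:

import tensorflow as tf

class MultiHeadModel(tf.keras.Model):
    def __init__(self, num_tasks, num_classes):
        super().__init__()
        self.backbone = tf.keras.layers.Dense(64, activation="relu")
        # one output layer per input class/task
        self.heads = [tf.keras.layers.Dense(num_classes) for _ in range(num_tasks)]

    def call(self, inputs):
        x, task_id = inputs  # task_id: int tensor of shape (batch,)
        h = self.backbone(x)
        all_out = tf.stack([head(h) for head in self.heads], axis=1)  # (batch, tasks, classes)
        # select each sample's own head; gradients only flow to the selected head
        return tf.gather(all_out, task_id, axis=1, batch_dims=1)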
I have a complex keras model in which one of the layers is a custom pretrained layer which expects "int32" as inputs. This model is implemented as a class that inherits from Model and it is implemented like this:
class MyModel(tf.keras.models.Model):
def __init__(self, size, input_shape):
super(MyModel, self).__init__()
self.layer = My_Layer()
self.build(input_shape)
def call(self, inputs):
return self.layer(inputs)
But when it reaches the self.build method, it throws the following error:
ValueError: You cannot build your model by calling `build` if your layers do not support float type inputs. Instead, in order to instantiate and build your model, `call` your model on real tensor data (of the correct dtype).
How can I fix it?
The exception is thrown when building a model with model.build.
The model.build function builds a model based on a given input shape.
The error is raised because, when we try to build the model, it first calls the model with an x argument whose form depends on the input shape type, in the following code:
if (isinstance(input_shape, list) and
all(d is None or isinstance(d, int) for d in input_shape)):
input_shape = tuple(input_shape)
if isinstance(input_shape, list):
x = [base_layer_utils.generate_placeholders_from_shape(shape)
for shape in input_shape]
elif isinstance(input_shape, dict):
x = {
k: base_layer_utils.generate_placeholders_from_shape(shape)
for k, shape in input_shape.items()
}
else:
x = base_layer_utils.generate_placeholders_from_shape(input_shape)
x is a TensorFlow placeholder here, created with a float dtype. So when the model is called with x as input, it raises a TypeError; the surrounding except block then runs and produces the error above.
I assume your input shape is 16x16. Instead of using self.build([(16,16)]), call the model on a real tensor:
inputs = tf.keras.Input(shape=(16,))
self.call(inputs)
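Alternatively (a sketch only; the shape and dtype are assumed from the question, and it assumes the self.build call is removed from __init__), you can build the model by calling it once on concrete data of the correct dtype:

model = MyModel(size=..., input_shape=(16, 16))  # size is elided in the question
dummy = tf.zeros((1, 16, 16), dtype=tf.int32)    # assumed input shape and dtype
_ = model(dummy)  # traces call() and builds every layer with int32 inputs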
Workaround
I've encountered the same problem when trying to export a model with multiple int-typed input tensors as a SavedModel. I worked around it by overriding the build method and manually specifying self._build_input_shape. So your solution would look like:
class MyModel(tf.keras.models.Model):
def __init__(self, size, input_shape):
super(MyModel, self).__init__()
self.layer = My_Layer()
self.build(input_shape)
def call(self, inputs):
return self.layer(inputs)
def build(self, input_shapes):
super(tf.keras.Model, self).build(input_shapes)
What happened in the original code
The default build method of a tf.keras.Model object treats input tensors as float tensors by default, which ends up throwing the exception.
This behavior of tf.keras.Model is defined here, where the inputs for your model are created by base_layer_utils.generate_placeholders_from_shape, which specifies the dtype as float.
What would happen with the workaround
As tf.keras.Model.build finally invokes its superclass's build function, tf.keras.layers.Layer.build, the workaround skips the tf.keras.Model.build logic that causes the problem; but you may have to add complementary code after it if you rely on other logic defined in tf.keras.Model.build.