pytorch nn.Module inference - python

I am planning on learning PyTorch. However, at this stage I would like to ask a question so that I can understand some code I am reading.
When you have a class whose base class is nn.Module, say
class My_model(nn.Module):
how is inference supposed to be run there?
In the code I am reading it says
tasks_output, other = my_model(data)
Wouldn't that just be creating an object? (Like calling the class constructor.)
How, in PyTorch, is inference supposed to be done?
(For reference, I am talking about the case where my_model has been set to evaluation mode with my_model.eval().)
EDIT: My apologies. I made the mistake of using the same name for the class and the object. I have corrected the code.

You are confusing __init__ and __call__.
In your example my_model is a class, therefore calling
my_model_instance = my_model(arguments)
invokes my_model.__init__ with arguments. The result of this call is a new instance of my_model stored in the variable my_model_instance.
Once you have instantiated the class my_model as the variable my_model_instance, you can evaluate the model on the data:
tasks_output, other = my_model_instance(data)
"Calling" (i.e., putting parenthesis after the variable name) the instance of the model causes python to invoke the method __call__ of the class.
In the case of classes derived from nn.Modules this will invoke __call__ of nn.Module that does some pytorch stuff and eventually calls your implementation of forward method of my_class.
Please see this detailed thread on the difference between __init__ and __call__ in python in general.
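To make the distinction concrete, here is a minimal sketch (the class name and layer sizes are illustrative only):
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()            # __init__ sets up the instance
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)             # reached via __call__

model = TinyModel()                   # calls __init__
out = model(torch.randn(1, 4))        # calls __call__, which calls forward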
It is often convenient to follow the PEP8 Style Guide for Python Code:
Class names should normally use the CapWords convention.
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
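Applied to the example above, those conventions would look like this (illustrative names):
class MyModel(nn.Module):    # class name: CapWords
    ...

my_model = MyModel()         # instance/variable name: lowercase with underscores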

For example, you might have:
import torch.nn as nn
import torch.nn.functional as F

class My_model(nn.Module):
    def __init__(self):
        super(My_model, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
# Call the constructor of the class
my_model = My_model()
It's important to differentiate between the class and the object. By convention, class names start with a capital letter in Python.
As you can see, the constructor doesn't take a data/input parameter; only the forward function does.
Then, for training, you need:
a criterion that computes the error between the model's output and the labels;
an optimizer for the backpropagation algorithm.
Example:
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(my_model.parameters(), lr=0.001, momentum=0.9)
Finally, inside a training loop, you need these elements:
# zero gradients + forward + backward + optimize
optimizer.zero_grad()
outputs = my_model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
This is one iteration of backpropagation. (The optimizer.zero_grad() call is needed because gradients otherwise accumulate across iterations.)
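Since the original question is about inference, the matching evaluation sketch would be (test_loader is an assumed DataLoader yielding (inputs, labels) batches):
my_model.eval()                # put dropout/batchnorm layers in eval mode
with torch.no_grad():          # gradients are not needed for inference
    for inputs, labels in test_loader:
        outputs = my_model(inputs)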
PyTorch documentation
If you want to see how inference relates to backpropagation, you can read about how to create a layer in PyTorch and how PyTorch uses autograd. Tensors use autograd for backpropagation. Example from the PyTorch documentation:
import torch
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()
print(w.grad)
print(b.grad)
The result is the backpropagated gradient, where the binary cross-entropy criterion computes the distance between the model output and the label. The tensor z is not just a matrix of values but an object that also remembers the computation it was built from, involving w, b, x, and y.
Inside a layer, the gradient computation uses the forward function, or a custom backward function if necessary.
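You can inspect this recorded computation through the grad_fn attribute:
print(z.grad_fn)     # e.g. an <AddBackward0> node: the op that produced z
print(loss.grad_fn)  # the node for the binary cross-entropy loss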
Best regards

Models in PyTorch are defined with classes by inheriting from the base nn.Module class:
class Model(nn.Module):
    pass
You can then implement a forward method that acts as the inference code. Whether it be for training or evaluation, it is supposed to return the output of your model.
class Model(nn.Module):
    def forward(self, x):
        return x**2
Once you have that you can initialize a new model with:
model = Model()
To use your newly initialized model, you won't actually call forward directly. The underlying structure of nn.Module makes it such that you call the instance instead, which triggers __call__ and handles the call to your forward implementation. To use it, you just call your object like a function:
>>> model(2)
4
In the documentation page you can see that nn.Module.eval sets the model to evaluation mode, which affects particular layers such as batch normalization and dropout layers. These types of layers are usually turned on for training and turned off for evaluation and testing. You can use it as
model.eval()
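For example, a dropout layer behaves differently in the two modes; a minimal sketch:
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(5)

drop.train()       # training mode: elements are randomly zeroed, the rest scaled by 2
print(drop(x))     # e.g. tensor([2., 0., 2., 0., 2.]) (random)

drop.eval()        # evaluation mode: dropout does nothing
print(drop(x))     # tensor([1., 1., 1., 1., 1.])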
When doing model evaluation and testing, it is advised to use the torch.no_grad context manager. This avoids having to retain the activations which are used for gradient backpropagation.
with torch.no_grad():
out = model(x)
Or as a decorator on top of your function/method declaration:
@torch.no_grad()
def validate():
    pass

Related

cleverhans, tf2, fgsm - how can i pass my LSTM regression model to the fast gradient method function in cleverhans? (logits)

I built and trained my LSTM model for a regression task and everything works fine. I would like to use the fast_gradient_method function from cleverhans (or any other cleverhans function, as the issue stands for any other attack).
I don't understand how I am supposed to pass the model to the function. From cleverhans:
:param model_fn: a callable that takes an input tensor and returns the model logits
Whatever input I give to the function (the model itself, the weights I get with get_weights, the weights of the "stage" right before the dense layer...), I get this error:
TypeError: 'module' object is not callable
What would be the correct input to make it work?
In the only working example I found, the following line of code is used to define logits_model and then pass it as :param model_fn:, but I still get the error above:
logits_model = tf.keras.Model(model.input,model.layers[-1].output)
To pass a valid model, it should be defined in the following way (it is just an example).
"make" is only needed for model.summary() to work; I found the code in another SO post that I can't seem to find right now.
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import GRU, Dense

class modSubclass(Model):
    def __init__(self):
        super(modSubclass, self).__init__()
        # hidden_size1, hidden_size2, K and input_size are assumed
        # to be defined elsewhere
        self.lstm1 = GRU(hidden_size1, activation='relu', return_sequences=True, input_shape=(input_size, 1))
        self.lstm2 = GRU(hidden_size2, activation='relu')
        self.dense1 = Dense(K, activation='relu')

    def call(self, x):
        x = self.lstm1(x)
        x = self.lstm2(x)
        x = self.dense1(x)
        return x

    def make(self, input_shape):
        '''
        This method makes the command "model.summary()" work.
        input_shape: (H,W,C), do not specify batch B
        '''
        x = tf.keras.layers.Input(shape=input_shape)
        model = tf.keras.Model(inputs=[x], outputs=self.call(x), name='actor')
        print(model.summary())
        return model
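An instance of this class is itself callable and returns the logits, so it can be passed directly as model_fn. A sketch of the usage, assuming the cleverhans tf2 import path and with illustrative eps and norm values:
from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method
import numpy as np

model = modSubclass()
# x is assumed to be an input batch of shape (batch, input_size, 1)
adv_x = fast_gradient_method(model, x, eps=0.1, norm=np.inf)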

specifying input shape in keras model in object-oriented way

Compare the following code snippets. I implemented a simple Keras model like this:
inp = layers.Input((10,2))
x = layers.Flatten()(inp)
x = layers.Dense(5)(x)
m = models.Model(inputs=inp, outputs=x)
For one reason or another, I need to have my model written in an object-oriented way. No problem, it's easy to reimplement that as:
class MyModel(tf.keras.Model):
    def __init__(self, inp_shape, out_size=5):
        super(MyModel, self).__init__()
        self.inp = layers.InputLayer(input_shape=inp_shape)
        self.flatten = layers.Flatten()
        self.dense = layers.Dense(out_size)

    def call(self, a):
        x = self.inp(a)
        x = self.flatten(x)
        x = self.dense(x)
        return x
However in the second case when I try to run:
m = MyModel((10,2))
m.summary()
I get:
ValueError: This model has not yet been built. Build the model first by calling `build()` or calling `fit()` with some data, or specify an `input_shape` argument in the first layer(s) for automatic build.
I don't quite get why? Shouldn't the above be equivalent?
The reason for this is that when you create an object of this model, you are just creating its layers, not its graph. In short, the output of layer 1 is not connected to layer 2 at construction time, because those are entirely separate attributes of the class; only when you call the model do those separate attributes combine to form the graph.
When you define a model with the tf.keras subclassing API, you need to build the model first, either by calling build() or by running the model on some data.
m = MyModel((10,2))
m.build(input_shape=(None, 10, 2))  # <-- build the model; None is the batch dimension
m.summary()
That said, you also don't need to define self.inp when building the model with the subclassed API. Also, .summary() may not look right to you for a subclassed model; you may need to check this instead.
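Alternatively, running the model once on some dummy data also builds it (a sketch; the batch size of 1 is arbitrary):
import tensorflow as tf

m = MyModel((10, 2))
m(tf.random.normal((1, 10, 2)))  # one forward pass builds the layer graph
m.summary()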

Constrain parameters to be -1, 0 or 1 in neural network in pytorch

I want to constrain the parameters of an intermediate layer in a neural network to prefer discrete values: -1, 0, or 1. The idea is to add a custom objective function that would increase the loss if the parameters take any other value. Note that, I want to constrain parameters of a particular layer, not all layers.
How can I implement this in pytorch? I want to add this custom loss to the total loss in the training loop, something like this:
custom_loss = constrain_parameters_to_be_discrete
loss = other_loss + custom_loss
Maybe using a Dirichlet prior might help; any pointers on this?
Extending upon @Shai's answer and mixing it with this answer, one can do it more simply via a custom layer into which you pass your specific layer.
First, the derivative of torch.abs(x**2 - torch.abs(x)), taken from WolframAlpha (check here), is placed inside the regularize function.
Now the Constrainer layer:
import torch

class Constrainer(torch.nn.Module):
    def __init__(self, module, weight_decay=1.0):
        super().__init__()
        self.module = module
        self.weight_decay = weight_decay
        # Backward hook is registered on the specified module
        self.hook = self.module.register_full_backward_hook(self._weight_decay_hook)
        # Not working with grad accumulation; check the original answer
        # and the pointers there if that's needed

    def _weight_decay_hook(self, *_):
        for parameter in self.module.parameters():
            parameter.grad = self.regularize(parameter)

    def regularize(self, parameter):
        # Derivative of the regularization term proposed by @Shai
        sgn = torch.sign(parameter)
        return self.weight_decay * (
            (sgn - 2 * parameter) * torch.sign(1 - parameter * sgn)
        )

    def forward(self, *args, **kwargs):
        # Simply forward args and kwargs to the wrapped module
        return self.module(*args, **kwargs)
Usage is really simple (with your specified weight_decay hyperparameter if you need more/less force on the params):
constrained_layer = Constrainer(torch.nn.Linear(20, 10), weight_decay=0.1)
Now you don't have to worry about different loss functions and can use your model normally.
You can use the loss function:
def custom_loss_function(x):
    loss = torch.abs(x**2 - torch.abs(x))
    return loss.mean()
This graph plots the proposed loss for a single element:
As you can see, the proposed loss is zero for x={-1, 0, 1} and positive otherwise.
Note that if you want to apply this loss to the weights of a specific layer, then your x here are the weights, not the activations of the layer.
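To apply this to the weights of one particular layer, you would add it to the total loss in the training loop, along these lines (a sketch; model.fc2 is an assumed layer name and 0.1 an illustrative weighting):
custom_loss = custom_loss_function(model.fc2.weight)  # model.fc2 is an assumed layer
loss = other_loss + 0.1 * custom_loss                 # 0.1 is an illustrative weight
loss.backward()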

Understanding Pytorch filter function

I was going through the documentation of the PyTorch framework and found lots of instances where a variable is assigned a function, but when it calls the function the parameters change. I'm not sure how this works; any pointers would be helpful.
What I do understand -
def func1(word):
    print("hello", word)

var1 = func1
Now in this scenario, var1("world") would print the string hello world.
But what I don't understand is some lines from PyTorch like:
class NN(nn.Module):
    def __init__(self, input_size, num_classes):
        super(NN, self).__init__()
        self.fc1 = nn.Linear(input_size, 50)
        self.fc2 = nn.Linear(50, num_classes)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
How do we know that only one parameter should be passed to self.fc2? It seems to be independent of the number of parameters defined in nn.Linear.
Does nn.Linear return a function like func1 that we store in var1 in the earlier example? If so, is there any documentation on what is being returned?
I do find the usage for each function in the nn module, but is there something that gives more details on how exactly this works?
nn.Linear is not a function (and neither are the other layers, like the convolution layers, batchnorms...), but a functor, which means it is a class that implements the __call__ method/operator, which is called when you write something like self.fc2(x).
The __call__ operator is implemented in the nn.Module base class, and it is a call to another method, _call_impl, which itself (basically) calls the forward method. Therefore, thanks to inheritance, when you make a class derive from nn.Module, you only need to implement the forward method.
The signature of this method is kinda up to you, but in most cases it will take a tensor as input and return another tensor.
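A minimal plain-Python functor, to make the idea concrete (the class name is illustrative):
class Square:
    def __call__(self, x):
        # runs when you write instance(x)
        return x ** 2

sq = Square()    # construct the functor
print(sq(3))     # 9: Python dispatches to Square.__call__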
In summary:
# calls the constructor of nn.Linear. self.fc1 is now a functor
self.fc1 = nn.Linear(20, 10)
# calls the fc1 functor on an input with 20 features
y = self.fc1(torch.randn(2, 20))
# which is basically doing
y = self.fc1.forward(torch.randn(2, 20))

Taking a derivative through torch.ge, or how to explicitly define a derivative in pytorch

I am trying to set up a network in which one layer maps from real numbers to {0, 1} (i.e. makes output binary).
What I tried
While I was able to find that torch.ge provides such functionality, whenever I want to train any parameter occurring before that layer in the network, PyTorch breaks.
I have also been trying to find out whether there is any way in PyTorch/autograd to override the derivative of a module by hand. More specifically in this case, I would just like to pass the derivative through torch.ge without changing it.
Minimal Example
Here is a minimal example I produced, which uses a typical neural network training structure in PyTorch.
import torch
import torch.nn as nn
import torch.optim as optim

class LinearGE(nn.Module):
    def __init__(self, features_in, features_out):
        super().__init__()
        self.fc = nn.Linear(features_in, features_out)

    def forward(self, x):
        return torch.ge(self.fc(x), 0)

x = torch.randn(size=(10, 30))
y = torch.randint(2, size=(10, 10))

# Define Model
m1 = LinearGE(30, 10)
opt = optim.SGD(m1.parameters(), lr=0.01)
crit = nn.MSELoss()

# Train Model
for x_batch, y_batch in zip(x, y):
    # zero the parameter gradients
    opt.zero_grad()
    # forward + backward + optimize
    pred = m1(x_batch)
    loss = crit(pred.float(), y_batch.float())
    loss.backward()
    opt.step()
What I encountered
When I run the above code the following error occurs:
File "__minimal.py", line 33, in <module>
loss.backward()
...
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
This error makes sense since torch.ge function is not differentiable. However, since MaxPool2D is also not differentiable, I believe that there are ways of mitigating non-differentiability in PyTorch.
It would be great if someone could point me to any source which can help me either implement my own backprop for a custom module, or any way of avoiding this error message.
Thanks!
Two things I noticed
If your input x is 10x30 (10 examples, 30 features) and the number of output nodes is 10, then the parameter matrix is 30x10 and the expected output matrix is 10x10 (10 examples, 10 output nodes).
ge means greater than or equal to: as the code indicates, it computes x >= 0 element-wise. We can use ReLU instead:
class LinearGE(nn.Module):
    def __init__(self, features_in, features_out):
        super().__init__()
        self.fc = nn.Linear(features_in, features_out)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.fc(x))
or an element-wise torch.max against zero:
return torch.max(self.fc(x), torch.zeros_like(self.fc(x)))
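If you really do want a binary {0, 1} output while still passing gradients through, one common approach (not part of the original answer) is a custom autograd function with a straight-through estimator; a sketch, with an illustrative class name:
import torch

class GEStraightThrough(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # binarize: 1.0 where x >= 0, else 0.0
        return torch.ge(x, 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        # straight-through estimator: pass the incoming gradient unchanged
        return grad_output

# usage inside LinearGE.forward would then be:
# return GEStraightThrough.apply(self.fc(x))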
