I was going through the PyTorch documentation and found lots of instances where a variable is assigned a function, but when that variable is called, the parameters change. I'm not sure how this works; any pointers would be helpful.
What I do understand -
def func1(word):
    print("hello", word)

var1 = func1
Now in this scenario, var1("world") would print the string hello world.
But what I don't understand is some lines from PyTorch like:
class NN(nn.Module):
    def __init__(self, input_size, num_classes):
        super(NN, self).__init__()
        self.fc1 = nn.Linear(input_size, 50)
        self.fc2 = nn.Linear(50, num_classes)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
How do we know that only one argument should be passed to self.fc2? It seems to be independent of the number of parameters defined in nn.Linear's constructor.
Does nn.Linear return a function, like the func1 that we stored in var1 in the earlier example? If so, is there any documentation on what is being returned?
I can find the usage of each function in the nn module, but is there something that explains in more detail how exactly this works?
nn.Linear is not a function (and neither are the other layers, like convolutions and batch norms); it is a functor, which means it is a class that implements the __call__ method/operator, invoked when you write something like self.fc2(x).
The __call__ operator is implemented in the nn.Module base class; it delegates to another method, _call_impl, which (essentially) calls the forward method. Therefore, thanks to inheritance, when you derive a class from nn.Module you only need to implement the forward method.
The signature of this method is largely up to you, but in most cases it takes a tensor as input and returns another tensor.
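As a minimal plain-Python sketch of the functor idea (the class and names here are invented for illustration):

class Doubler:
    def __init__(self, factor):
        # configuration happens once, at construction time
        self.factor = factor

    def __call__(self, x):
        # runs every time you write doubler(x)
        return self.factor * x

doubler = Doubler(2)
print(doubler(21))  # prints 42

nn.Linear works the same way: the constructor fixes the layer's configuration (in_features, out_features), and calling the instance later only needs the input tensor.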
In summary:
# calls the constructor of nn.Linear; self.fc1 is now a functor
self.fc1 = nn.Linear(20, 10)
# calls the fc1 functor on an input (the last dimension must match in_features=20)
y = self.fc1(torch.randn(2, 20))
# which is basically doing
y = self.fc1.forward(torch.randn(2, 20))
Related
I built and trained my LSTM model for a regression task and everything works fine. I would like to use the fast_gradient_method function from cleverhans (or any other cleverhans function, as the issue stands for any other attack).
I don't understand how I am supposed to pass the model to the function. From cleverhans:
:param model_fn: a callable that takes an input tensor and returns the model logits
Whatever input I give to the function (the model itself, the weights I get with get_weights, the weights of the "stage" right before the dense layer...), I get this error:
TypeError: 'module' object is not callable
What would be the correct input to make it work?
In the only working example I found, the following line of code is used to define logits_model, which is then passed as :param model_fn:, but I still get the error above:
logits_model = tf.keras.Model(model.input,model.layers[-1].output)
To pass a valid model, it should be defined in the following way (this is just an example).
The make method is only needed for model.summary() to work; I found that code in another SO post that I can't seem to find right now.
class modSubclass(Model):
    def __init__(self):
        super(modSubclass, self).__init__()
        self.lstm1 = GRU(hidden_size1, activation='relu', return_sequences=True, input_shape=(input_size, 1))
        self.lstm2 = GRU(hidden_size2, activation='relu')
        self.dense1 = Dense(K, activation='relu')

    def call(self, x):
        x = self.lstm1(x)
        x = self.lstm2(x)
        x = self.dense1(x)
        return x

    def make(self, input_shape):
        '''
        This method makes the command "model.summary()" work.
        input_shape: (H,W,C), do not specify batch B
        '''
        x = tf.keras.layers.Input(shape=input_shape)
        model = tf.keras.Model(inputs=[x], outputs=self.call(x), name='actor')
        print(model.summary())
        return model
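With a model defined this way, the model instance itself is the callable that cleverhans expects, since calling it invokes call() and returns the logits. A sketch of the attack call, assuming the cleverhans 4.x TF2 module layout and a trained instance of the class above (the eps value is arbitrary here):

import numpy as np
from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

model = modSubclass()
# ... train the model ...

# the instance is the model_fn: calling it returns the logits
x_adv = fast_gradient_method(model, x_test, eps=0.1, norm=np.inf)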
I have an instance of torchvision.models.ResNet, and I have my own class CondBatchNorm2d, a module similar to BatchNorm2d except that its forward method accepts an additional input y, which does not come from the previous layer but is an input of the whole network:
def forward(self, x, y=None):
    ...
I know how to substitute each BatchNorm2d instance with an instance of CondBatchNorm2d, but I am not sure how to write my own forward method so that the new input reaches the intermediate CondBatchNorm2d layers. Should I iterate over the resnet's children in forward, or is there a more suitable way to do it?
Hackish, but assuming that
it is the input of the whole network
you might create a new module wrapping the conditional CondBatchNorm2d:
class FedCondBatchNorm2d(nn.Module):
    def __init__(self, y, *args, **kwargs):
        super().__init__()
        self.batch_norm = CondBatchNorm2d(*args, **kwargs)
        self.cond_img = y

    def forward(self, x):
        return self.batch_norm(x, self.cond_img)
As its API is now the same as the original ResNet's BatchNorm2d (forward takes only x), you might simply switch blocks via module.apply.
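A sketch of what that swap could look like, assuming CondBatchNorm2d's constructor takes num_features like BatchNorm2d does, and that y is in scope:

import torch.nn as nn

def swap_batchnorms(module):
    # apply() visits every submodule, so we only need to
    # replace the direct BatchNorm2d children of each one
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, FedCondBatchNorm2d(y, child.num_features))

resnet.apply(swap_batchnorms)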
I am planning on learning PyTorch. However, at this stage I would like to ask a question so that I can understand some code I am reading.
When you have a class whose base class is nn.Module, say

class My_model(nn.Module):

how is inference supposed to be run there?
In the code I am reading it says
tasks_output, other = my_model(data)
Wouldn't that just be creating an object (like calling the class constructor)?
How, in PyTorch, is inference supposed to be made?
(for reference, I am talking about when my_model is set to my_model.eval())
EDIT: My apologies. I made the mistake of conflating the class and the object. I have corrected the code.
You are confusing __init__ and __call__.
In your example my_model is a class, therefore calling

my_model_instance = my_model(arguments)

invokes my_model.__init__ with arguments. The result of this call is a new instance of my_model in the variable my_model_instance.
Once you instantiated the class my_model as the variable my_model_instance, you can evaluate the model on the training data:
tasks_output, other = my_model_instance(data)
"Calling" (i.e., putting parenthesis after the variable name) the instance of the model causes python to invoke the method __call__ of the class.
In the case of classes derived from nn.Module, this will invoke __call__ of nn.Module, which does some PyTorch bookkeeping and eventually calls your implementation of the forward method of my_model.
Please see this detailed thread on the difference between __init__ and __call__ in python in general.
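A minimal sketch of the two steps with an nn.Module subclass (the model here is invented for illustration):

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def forward(self, x):
        return 2 * x

my_model_instance = MyModel()            # __init__ runs here
out = my_model_instance(torch.ones(3))   # __call__ runs here, which calls forward
print(out)  # tensor([2., 2., 2.])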
It is often convenient to follow the PEP8 Style Guide for Python Code:
Class names should normally use the CapWords convention.
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
For example, you have:
class My_model(nn.Module):
    def __init__(self):
        super(My_model, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Call the constructor of the class
my_model = My_model()
It's important to differentiate between the class and the object.
Class names start with a capital letter in Python.
As you can see, the constructor doesn't take a data/input parameter; only the forward function does.
Then, for training, you need:
a criterion that computes the error between the model output and the labels;
an optimizer for the backpropagation algorithm.
Example:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
Finally, inside a loop, you need these elements:
# zero the gradients, then forward + backward + optimize
optimizer.zero_grad()
outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
Here you have one iteration of backpropagation.
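For context, a sketch of the loop these elements usually live in, assuming a trainloader that yields (inputs, labels) batches and some num_epochs (both names are placeholders):

for epoch in range(num_epochs):
    for inputs, labels in trainloader:
        optimizer.zero_grad()               # reset gradients from the previous step
        outputs = net(inputs)               # forward pass
        loss = criterion(outputs, labels)   # compute the error
        loss.backward()                     # backward pass
        optimizer.step()                    # update the weights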
Pytorch documentation
If you want to understand how inference relates to backpropagation, you can read about how to create a layer with PyTorch and how PyTorch uses autograd.
Tensors use autograd for backpropagation. Example from the PyTorch documentation:
import torch
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()
print(w.grad)
print(b.grad)
The result shows the gradients computed by backpropagation, where the cross-entropy criterion measures the distance between the model output and the label. The tensor z is not just a matrix of values but an object that remembers the computation it came from, involving w, b, x, and y.
Inside a layer, the gradient is derived from the forward function, or from a custom backward function if necessary.
Best regards
Models in PyTorch are defined with classes by inheriting from the base nn.Module class:
class Model(nn.Module):
    pass
You can then implement a forward method that acts as the inference code. Whether it be for training or evaluation, it is supposed to return the output of your model.
class Model(nn.Module):
    def forward(self, x):
        return x**2
Once you have that you can initialize a new model with:
model = Model()
To use your newly initialized model, you won't actually call forward directly. The underlying structure of nn.Module makes it such that calling the instance invokes __call__, which in turn handles the call to your forward implementation. So you just call your object like a function:
>>> model(2)
4
In the documentation page you can see that nn.Module.eval will set the model to evaluation mode, which affects particular layers such as batch normalization and dropout. These layers behave differently in training than in evaluation and testing. You can use it as:
model.eval()
When doing model evaluation and testing, it is advised to use the torch.no_grad context manager. This avoids having to retain the activations which are used for gradient backpropagation.
with torch.no_grad():
    out = model(x)
Or as a decorator on top of your function/method declaration:
@torch.no_grad()
def validate():
    pass
In keras / tensorflow it is often quite simple to describe layers directly as functions that map their input to an output, like so:
def resnet_block(x, kernel_size):
    ch = x.shape[-1]
    out = Conv2D(ch, kernel_size, strides=(1, 1), padding='same', activation='relu')(x)
    out = Conv2D(ch, kernel_size, strides=(1, 1), padding='same', activation='relu')(out)
    out = Add()([x, out])
    return out
whereas subclassing Layer to get something like
r = ResNetBlock(kernel_size=(3,3))
y = r(x)
is a little more cumbersome (or even a lot more cumbersome for more complex examples).
Since keras seems perfectly happy to construct the underlying weights of its layers when they're being called for the first time, I was wondering if it was possible to just wrap functions such as the one above and let keras figure things out once there are inputs, i.e. I would like it to look like this:
r = FunctionWrapperLayer(lambda x:resnet_block(x, kernel_size=(3,3)))
y = r(x)
I've made an attempt at implementing FunctionWrapperLayer, which looks as follows:
class FunctionWrapperLayer(Layer):
    def __init__(self, fn):
        super(FunctionWrapperLayer, self).__init__()
        self.fn = fn

    def build(self, input_shape):
        shape = input_shape[1:]
        inputs = Input(shape)
        outputs = self.fn(inputs)
        self.model = Model(inputs=inputs, outputs=outputs)
        self.model.compile()

    def call(self, x):
        return self.model(x)
This looks like it might work, however I've run into some bizarre issues whenever I use activations, e.g. with
def bad(x):
    out = tf.keras.activations.sigmoid(x)
    out = Conv2D(1, (1, 1), strides=(1, 1), padding='same')(out)
    return out

x = tf.constant(tf.reshape(tf.range(48, dtype=tf.float32), [1, 4, -1, 1]))
w = FunctionWrapperLayer(bad)
w(x)
I get the following error
FailedPreconditionError: Error while reading resource variable _AnonymousVar34 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar34/class tensorflow::Var does not exist.
[[node conv2d_6/BiasAdd/ReadVariableOp (defined at <ipython-input-33-fc380d9255c5>:12) ]] [Op:__inference_keras_scratch_graph_353]
What this suggests to me is that there is something inherently wrong with initializing models like that in the build method. Maybe someone has a better idea as to what might be going on there or how else to get the functionality I would like.
Update:
As mentioned by jr15, the above does work when the function involved uses only keras layers. However, the following ALSO works, which has me a little puzzled:
i = Input(x.shape[1:])
o = bad(i)
model = Model(inputs=i, outputs=o)
model(x)
Incidentally, model.submodules yields
(<tensorflow.python.keras.engine.input_layer.InputLayer at 0x219d80c77c0>,
<tensorflow.python.keras.engine.base_layer.TensorFlowOpLayer at 0x219d7afc820>,
<tensorflow.python.keras.layers.convolutional.Conv2D at 0x219d7deafa0>)
meaning the activation is automatically turned into a "TensorFlowOpLayer" when doing it like that.
Another update:
Looking at the original error message, it seems like the activation isn't the only culprit. If I remove the convolution and use the wrapper everything works as well and again I find a "TensorFlowOpLayer" when inspecting the submodules.
Your solution actually works! The trouble you're running into is that tf.keras.activations.sigmoid is not a Layer, but a plain TensorFlow function. To make it work, use keras.layers.Activation("sigmoid")(x) instead. For the more general case, where you want to use some TensorFlow function as a layer, you can wrap it in a Lambda layer like so:
out = keras.layers.Lambda(lambda x: tf.some_function(x))(out)
See the docs for more info: https://keras.io/api/layers/core_layers/lambda/
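For instance, the bad function from the question could be rewritten so that every operation is a Keras layer; a sketch based on this suggestion (the name good is mine, and it reuses the question's x and FunctionWrapperLayer):

def good(x):
    # Activation is a Layer, unlike tf.keras.activations.sigmoid
    out = tf.keras.layers.Activation("sigmoid")(x)
    out = Conv2D(1, (1, 1), strides=(1, 1), padding='same')(out)
    return out

w = FunctionWrapperLayer(good)
w(x)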
With Tensorflow 2.4 it apparently just works now. The submodules now show a "TFOpLambda" layer.
To anybody interested, here is some slightly improved wrapper code that also accommodates multi-input models:
class FunctionWrapperLayer(Layer):
    def __init__(self, fn):
        super(FunctionWrapperLayer, self).__init__()
        self.fn = fn

    def build(self, input_shapes):
        super(FunctionWrapperLayer, self).build(input_shapes)
        if type(input_shapes) is list:
            inputs = [Input(shape[1:]) for shape in input_shapes]
        else:
            inputs = Input(input_shapes[1:])
        outputs = self.fn(inputs)
        self.fn_model = Model(inputs=inputs, outputs=outputs)
        self.fn_model.compile()

    def call(self, x):
        return self.fn_model(x)
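A hypothetical usage of the multi-input variant (the shapes and the merge function are made up for illustration):

a = Input((8,))
b = Input((8,))
merged = FunctionWrapperLayer(lambda inputs: inputs[0] + inputs[1])([a, b])
model = Model(inputs=[a, b], outputs=merged)
model.summary()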
I'm trying to use multiple inputs in custom layers in TensorFlow-Keras. The usage can be anything; right now it is defined as multiplying a mask with an image. I've searched SO, and the only answer I could find was for TF 1.x, so it didn't do any good.
class mul(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # I've added pass because this is the simplest form I can come up with.
        pass

    def call(self, inputs):
        # magic happens here and multiplications occur
        return Z
EDIT: Since TensorFlow v2.3/2.4, the contract is to use a list of inputs to the call method. For keras (not tf.keras) I think the answer below still applies.
Implementing multiple inputs is done in the call method of your class; there are two alternatives:
List input: here the inputs parameter is expected to be a list containing all the inputs. The advantage is that the number of inputs can vary. You can index the list, or unpack it with a plain tuple assignment:
def call(self, inputs):
    Z = inputs[0] * inputs[1]

    # alternatively, unpack the list
    input1, input2 = inputs
    Z = input1 * input2

    return Z
Multiple input parameters in the call method: this also works, but then the number of inputs is fixed when the layer is defined:
def call(self, input1, input2):
    Z = input1 * input2
    return Z
Which method you choose depends on whether you need a fixed or a variable number of inputs. Each method also changes how the layer has to be called: either by passing a list of arguments, or by passing the arguments one by one in the function call.
You can also use *args in the second method to allow a variable number of arguments (a sketch follows below), but overall keras' own layers that take multiple inputs (like Concatenate and Add) are implemented using lists.
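A sketch of that *args variant, under the assumption that the layer call machinery forwards extra positional arguments to call (this is how TF2's Layer.__call__ behaves, but treat it as untested here):

def call(self, *inputs):
    # multiply together however many tensors were passed
    Z = inputs[0]
    for t in inputs[1:]:
        Z = Z * t
    return Z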
Try it this way:
class mul(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # I've added pass because this is the simplest form I can come up with.
        pass

    def call(self, inputs):
        inp1, inp2 = inputs
        Z = inp1 * inp2
        return Z

inp1 = Input((10,))
inp2 = Input((10,))
x = mul()([inp1, inp2])
x = Dense(1)(x)
model = Model([inp1, inp2], x)
model.summary()