How is the output shape of submodules in PyTorch determined? Why is the output shape of a certain sub-module modified in the code below?
When I separate the head of a classical classifier from its backbone in the following way:
import torch, torchvision
from torchsummary import summary
effnet = torchvision.models.efficientnet_b0(num_classes = 2)
backbone = torch.nn.Sequential(*(list(effnet.children())[0]))
adaptive_pool = list(effnet.children())[1]
head = list(effnet.children())[2]
model = torch.nn.Sequential(*[backbone, adaptive_pool, head])
summary(model, (3,256,256), device = 'cpu') # <== Error
I get the following error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (2560x1 and 1280x2)
This error is due to the modified output shape of the sub-module adaptive_pool. To work around this problem, a flatten module can be used as follows:
class flatten(torch.nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)
model = torch.nn.Sequential(*[backbone, adaptive_pool, flatten(), head])
summary(model, (3,256,256), device = 'cpu')
Why is the output shape of the sub-module adaptive_pool modified?
The output of an nn.AdaptiveAvgPool2d is 4D even if the average is computed globally, i.e. with output_size=1. In other words, the output shape of your global pooling layer is (N, C, 1, 1). This means you indeed need to flatten it before the fully connected layer.
In the referenced original EfficientNet classification network, the flattening operation is done directly in the forward logic, without a dedicated layer. See this line.
Instead of implementing your own flattening layer, you can use the built-in nn.Flatten. More details about this module can be found here.
>>> model = nn.Sequential(backbone, adaptive_pool, nn.Flatten(1), head)
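As a quick sanity check, here is a minimal shape trace, reusing the backbone and adaptive_pool variables from the snippet above with a dummy batch of two images:

x = torch.randn(2, 3, 256, 256)
feats = adaptive_pool(backbone(x))   # (2, 1280, 1, 1): still 4D after global pooling
flat = torch.nn.Flatten(1)(feats)    # (2, 1280): 2D, as the Linear head expects
print(feats.shape, flat.shape)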
Related
I'm working with the tensorflow.keras API, and I've encountered a syntax I'm unfamiliar with, i.e., applying a layer to a sub-model's output, as shown in the following example from this tutorial:
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import resnet
target_shape = (200, 200)
base_cnn = resnet.ResNet50(
    weights="imagenet", input_shape=target_shape + (3,), include_top=False
)
flatten = layers.Flatten()(base_cnn.output)
dense1 = layers.Dense(512, activation="relu")(flatten)
dense1 = layers.BatchNormalization()(dense1)
dense2 = layers.Dense(256, activation="relu")(dense1)
dense2 = layers.BatchNormalization()(dense2)
output = layers.Dense(256)(dense2)
embedding = Model(base_cnn.input, output, name="Embedding")
In the official reference of layers.Flatten, for example, I couldn't find an explanation of what applying it to a layer's output actually does. In the keras.Layer reference I've encountered this explanation:
call(self, inputs, *args, **kwargs): Called in __call__ after making sure build() has been called. call() performs the logic of applying the layer to the input tensors (which should be passed in as argument).
So my question is:
What does flatten = layers.Flatten()(base_cnn.output) do?
You are creating a model based on a pre-trained model. This pre-trained model will not be actively trained with the rest of your layers unless you explicitly set trainable=True; that is, you are only interested in extracting its useful features. A flattening operation converts a multidimensional output into a one-dimensional tensor per example, and that is exactly what is happening in this line: flatten = layers.Flatten()(base_cnn.output). A one-dimensional feature vector per input is often the desired end result of a model, especially in supervised learning. The output of the pre-trained ResNet model is (None, 7, 7, 2048), and you want to generate a 1D feature vector for each input and compare them, so you flatten that output, resulting in a tensor with the shape (None, 7 * 7 * 2048) = (None, 100352).
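To answer the question directly: this syntax is the Keras functional API. layers.Flatten() constructs a layer object, and calling that object on a symbolic tensor wires it into the computation graph and returns a new symbolic tensor. A minimal sketch of the equivalent two-step form, assuming the base_cnn from the snippet above:

from tensorflow.keras import layers

flatten_layer = layers.Flatten()           # step 1: instantiate the layer object
flatten = flatten_layer(base_cnn.output)   # step 2: apply it to a symbolic tensor
# `flatten` is now a symbolic tensor of shape (None, 7 * 7 * 2048), and Keras
# has recorded the connection for when the Model is built from input to output.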
Alternatives to Flatten would be GlobalMaxPooling2D and GlobalAveragePooling2D, which downsample an input by taking the max or average value along the spatial dimensions. For more information on this topic check out this post.
I am trying to implement a learning-to-rank model using a pre-trained BERT available on tensorflow hub. I am using a variation of ListNet loss function, which requires each training instance to be a list of several ranked documents in relation to a query. I need the model to be able to accept data in a shape (batch_size, list_size, sentence_length), where the model loops over the 'list_size' axis in each training instance, returns the ranks and passes them to the loss function. In a simple model that only consists of dense layers, this is easily done by augmenting the dimensions of the input layer. For example:
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras import Model
input = Input([6,10])
x = Dense(20,activation='relu')(input)
output = Dense(1, activation='sigmoid')(x)
model = Model(inputs=input, outputs=output)
...now the model will perform 6 forward passes over vectors of length 10 before calculating the loss and updating gradients.
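For reference, a quick shape check confirms that Dense acts only on the last axis, so each of the 6 length-10 vectors is scored independently:

print(model.output_shape)   # (None, 6, 1): one output per each of the 6 vectors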
I am trying to do the same with the BERT model and its preprocessing layer:
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text
bert_preprocess_model = hub.KerasLayer('https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3')
bert_model = hub.KerasLayer('https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/1')
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
processed_input = bert_preprocess_model(text_input)
output = bert_model(processed_input)
model = tf.keras.Model(text_input, output)
But when I try to change the shape of 'text_input' to, say, (6), or meddle with it in any way really, it always results in the same type of error:
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (3 total):
* Tensor("inputs:0", shape=(None, 6), dtype=string)
* False
* None
Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (3 total):
* TensorSpec(shape=(None,), dtype=tf.string, name='sentences')
* False
* None
Keyword arguments: {}
....
As per https://www.tensorflow.org/hub/api_docs/python/hub/KerasLayer, it seems like you can configure the input shape of hub.KerasLayer via tf.keras.layers.InputSpec. In my case, I guess it would be something like this:
bert_preprocess_model.input_spec = tf.keras.layers.InputSpec(ndim=2)
bert_model.input_spec = tf.keras.layers.InputSpec(ndim=2)
When I run the above code, the attributes indeed get changed, but when trying to build the model, the same exact error appears.
Is there any way to easily resolve this without the necessity to create a custom training loop?
Suppose you have a batch of B examples, each with exactly N text strings, which makes a 2-dimensional Tensor of shape [B, N]. With tf.reshape(), you can turn that into a 1-dimensional tensor of shape [B*N], send it through BERT (which preserves the order of inputs) and then reshape it back to [B,N]. (There's also tf.keras.layers.Reshape, but that hides the batch dimension from you.)
If it's not exactly N text strings each time, you'll have to do some bookkeeping on the side (e.g., store inputs in a tf.RaggedTensor, run BERT on its .values, and construct a new RaggedTensor with the same .row_splits from the result.)
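A minimal sketch of that reshape-run-reshape pattern (the helper name rank_scores and the score_model argument are hypothetical; N is assumed fixed per batch):

import tensorflow as tf

def rank_scores(texts, score_model):
    # texts: string tensor of shape [B, N]
    # score_model: maps a [B*N] string tensor to a [B*N, 1] score tensor
    b = tf.shape(texts)[0]
    flat = tf.reshape(texts, [-1])        # [B*N]; BERT preserves the order of inputs
    scores = score_model(flat)            # [B*N, 1]
    return tf.reshape(scores, [b, -1])    # back to [B, N], one score per document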
I wish to examine the values of a tensor after the mask is applied to it.
Here is a truncated part of the model. I set temp = x so that I can later print temp to check the exact values.
Given a 4-class classification model using acoustic features, assume I have data of shape (1000, 50, 136) as (batch, timesteps, features).
The objective is to check whether the model is studying the features by timesteps. In other words, we wish to reassure ourselves that the model is learning from slices like the red rectangle in the picture. Logically, this is how a Keras LSTM layer works, but the confusion matrix produced changes considerably when a parameter changes (e.g. the number of Dense units). The validation accuracy stays at 45%, so we would like to inspect the model.
The proposed idea is to print out the first step of the first batch and print out the input in the model. If they are the same, then the model is learning in the right way, consuming the (136, 1) features of one timestep at once, instead of (50, 1) timesteps of a single feature at once.
input_feature = Input(shape=(X_train.shape[1],X_train.shape[2]))
x = Masking(mask_value=0)(input_feature)
temp = x
x = Dense(Dense_unit,kernel_regularizer=l2(dense_reg), activation='relu')(x)
I have tried tf.print(), which gave me AttributeError: 'Tensor' object has no attribute '_datatype_enum'.
Following Get output from a non final keras model layer, as suggested by Lescurel, I also tried:
model2 = Model(inputs=[input_attention, input_feature], outputs=model.get_layer('masking')).output
print(model2.predict(X_test))
AttributeError: 'Masking' object has no attribute 'op'
You want the output after masking. Lescurel's link in the comment shows how to do that; this link to GitHub does, too. You need to make a new model that takes as inputs the input of your model, and as outputs the output of the masking layer.
I tested it with some made-up code derived from your snippets.
import numpy as np
from keras import Input
from keras.layers import Masking, Dense
from keras.regularizers import l2
from keras.models import Sequential, Model
X_train = np.random.rand(4,3,2)
Dense_unit = 1
dense_reg = 0.01
mdl = Sequential()
mdl.add(Input(shape=(X_train.shape[1],X_train.shape[2]),name='input_feature'))
mdl.add(Masking(mask_value=0,name='masking'))
mdl.add(Dense(Dense_unit,kernel_regularizer=l2(dense_reg),activation='relu',name='output_feature'))
mdl.summary()
mdl2mask = Model(inputs=mdl.input,outputs=mdl.get_layer("masking").output)
maskoutput = mdl2mask.predict(X_train)
mdloutput = mdl.predict(X_train)
maskoutput # print output after/of masking
mdloutput # print output of mdl
maskoutput.shape #(4, 3, 2): masking has the shape of the layer before (input here)
mdloutput.shape #(4, 3, 1): shape of the output of dense
I pass a list a to my custom function and I want to tf.tile it after converting it to a constant tensor. The times I tile it depends on the shape of y_true. I don't know how I can get the shape of y_true as integers. Here's the code:
def getloss(a):
    a = tf.constant(a, tf.float32)
    def loss(y_true, y_pred):
        # use a new name: rebinding `a` inside loss would shadow the closure variable
        a_tiled = tf.reshape(a, [1, 1, -1])
        ytrue_shape = y_true.get_shape().as_list() #????
        multiples = tf.constant([ytrue_shape[0], ytrue_shape[1], 1], tf.int32)
        a_tiled = tf.tile(a_tiled, multiples)
        #...
    return loss
I have tried y_true.get_shape().as_list() but it reports an error because the first dimension (batch size) is None when compiling the model. Is there any way I can use the shape of y_true here?
When trying to access the shape of a tensor during the building of the model, when not all shapes are known, it is best to use tf.shape. It will be evaluated when the model is run, as stated in the doc:
tf.shape and Tensor.shape should be identical in eager mode. Within tf.function or within a compat.v1 context, not all dimensions may be known until execution time. Hence when defining custom layers and models for graph mode, prefer the dynamic tf.shape(x) over the static x.shape.
ytrue_shape = tf.shape(y_true)
This will yield a Tensor, so use TF ops to get what you want:
multiples = tf.concat([ytrue_shape[:2], [1]], axis=0)
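Putting it together, a minimal sketch of the loss closure using dynamic shapes (the final reduce_mean line is a hypothetical stand-in for the elided loss computation):

import tensorflow as tf

def getloss(a):
    a = tf.reshape(tf.constant(a, tf.float32), [1, 1, -1])
    def loss(y_true, y_pred):
        # dynamic shape: resolved at run time, so the None batch dimension is fine
        multiples = tf.concat([tf.shape(y_true)[:2], [1]], axis=0)
        tiled = tf.tile(a, multiples)   # (batch, dim1, len(a))
        return tf.reduce_mean(tf.square(y_pred - tiled))  # stand-in for the real loss
    return loss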
I want to write some custom Keras layers and do some advanced calculations in the layer, for example with Numpy, Scikit, OpenCV...
I know there are some math functions in keras.backend that can operate on tensors, but I need some more advanced functions.
However, I have no clue how to implement this correctly; I get the error message:
You must feed a value for placeholder tensor 'input_1' with dtype float and shape [...]
Here is my custom layer:
class MyCustomLayer(Layer):

    def __init__(self, **kwargs):
        super(MyCustomLayer, self).__init__(**kwargs)

    def call(self, inputs):
        """
        How to implement this correctly in Keras?
        """
        nparray = K.eval(inputs)  # <-- does not work
        # do some calculations here with nparray
        # for example with Numpy, Scipy, Scikit, OpenCV...
        result = K.variable(nparray, dtype='float32')
        return result

    def compute_output_shape(self, input_shape):
        output_shape = tuple([input_shape[0], 256, input_shape[3]])
        return output_shape  # (batch, 256, channels)
The error appears here in this dummy model:
inputs = Input(shape=(96, 96, 3))
x = MyCustomLayer()(inputs)
x = Flatten()(x)
x = Activation("relu")(x)
x = Dense(1)(x)
predictions = Activation("sigmoid")(x)
model = Model(inputs=inputs, outputs=predictions)
Thanks for all hints...
TL;DR: You should not mix Numpy inside Keras layers. Keras uses TensorFlow underneath because it has to track all the computations in order to compute the gradients in the backward phase.
If you dig into TensorFlow, you will see that it covers almost all of the Numpy functionality (or even extends it), and if I remember correctly, TensorFlow functionality can be accessed through the Keras backend (K).
What are the advanced calculations/functions you need?
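For illustration, a minimal sketch of keeping the computation in backend ops so gradients can still flow (the layer name is hypothetical, and the mean over the height axis is a stand-in for whatever "advanced" calculation is needed):

from keras import backend as K
from keras.layers import Layer

class MyBackendLayer(Layer):

    def call(self, inputs):
        # stay in tensor land: no K.eval, no Numpy round-trip
        return K.mean(inputs, axis=1)  # (batch, H, W, C) -> (batch, W, C)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[2], input_shape[3])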
I think this kind of processing should be applied before the model, because the process does not contain trainable variables, so it cannot be optimized.
K.eval(inputs) does not work because you are trying to evaluate a placeholder, not a variable; placeholders have no values to evaluate. If you want to get the values, you should feed them in, or you can make a list from the tensors one by one with tf.unstack():
nparray = tf.unstack(tf.unstack(tf.unstack(inputs, 96, 0), 96, 0), 3, 0)
Your call function is also wrong, because it returns a variable; you should return a constant:
result = K.constant(nparray, dtype='float32')
return result