Naming weights of a layer within a custom layer - python

I have a custom layer with a Dense sublayer. I want to be able to name the weights of this sublayer. However, passing name="my_dense" to the sublayer's initializer doesn't seem to do this; the weights are simply named after the outer custom layer.
To illustrate the problem, suppose I want a custom layer that simply stacks two dense layers. I'll print the names of the weights of this custom layer.
import tensorflow as tf
from tensorflow import keras

class DoubleDense(keras.layers.Layer):
    def __init__(self, units, **kwargs):
        self.dense1 = keras.layers.Dense(units, name="first_dense")
        self.dense2 = keras.layers.Dense(units, name="second_dense")
        super(DoubleDense, self).__init__(**kwargs)

    def build(self, input_shape):
        self.dense1.build(input_shape)
        self.dense2.build(self.dense1.units)

    def call(self, input):
        hidden = self.dense1(input)
        return self.dense2(hidden)

dd = DoubleDense(3)
# We need to evaluate the layer once to build the weights
trivial_input = tf.ones((1,10))
output = dd(trivial_input)
# Print the names of all variables in the DoubleDense layer
print([weight.name for weight in dd.weights])
The output is this:
['double_dense_1/kernel:0',
'double_dense_1/bias:0',
'double_dense_1/kernel:0',
'double_dense_1/bias:0']
...but I was expecting something more like this:
['double_dense_1/first_dense_1/kernel:0',
'double_dense_1/first_dense_1/bias:0',
'double_dense_1/second_dense_1/kernel:0',
'double_dense_1/second_dense_1/bias:0']
So, Keras has named these weights ambiguously; there is no way to tell whether a weight tensor belongs to dd.dense1 or dd.dense2 by its name alone. I realise I could select the layer first and then the weights (dd.dense1.weights), but I would prefer not to do this in my application.
Is there a way to name the weights of a sublayer of a custom layer?

If you want the sublayer names to appear in the weight names, you need to open a tf.name_scope for each sublayer and call that sublayer's build inside it.
Below is the modified code, which puts each sublayer's name into the names of its weights.
class DoubleDense(keras.layers.Layer):
    def __init__(self, units, **kwargs):
        self.dense1 = keras.layers.Dense(units)
        self.dense2 = keras.layers.Dense(units)
        super(DoubleDense, self).__init__(**kwargs)

    def build(self, input_shape):
        with tf.name_scope("first_dense"):
            self.dense1.build(input_shape)
        with tf.name_scope("second_dense"):
            self.dense2.build(self.dense1.units)

    def call(self, input):
        hidden = self.dense1(input)
        return self.dense2(hidden)

dd = DoubleDense(3)
# We need to evaluate the layer once to build the weights
trivial_input = tf.ones((1,10))
output = dd(trivial_input)
# Print the names of all variables in the DoubleDense layer
print([weight.name for weight in dd.weights])
Output:
['double_dense/first_dense/kernel:0', 'double_dense/first_dense/bias:0', 'double_dense/second_dense/kernel:0', 'double_dense/second_dense/bias:0']
Hope this answers your question, Happy Learning!

Related

Why is the custom layer decomposed into several operations in Keras?

I want to get the weights of my custom layer, but I couldn't get them with model.layer().get_weights()[X].
So I checked the layers of the model. It seems that the custom layer is decomposed into several operations, and no weights can be found in these layers.
Here is the custom layer code
class PixelBaseConv(Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(PixelBaseConv, self).__init__(**kwargs)

    def build(self, input_shape):
        # kernel_shape: w*h*c*output_dim
        kernel_size = input_shape[1:]
        kernel_shape = (1,) + kernel_size + (self.output_dim, )
        self.kernel = self.add_weight(name='kernel',
                                      shape=kernel_shape,
                                      initializer='uniform',
                                      trainable=True)
        super(PixelBaseConv, self).build(input_shape)

    def call(self, inputs):
        # output_shape: w*h*output_dim
        outputs = []
        inputs = K.cast(inputs, dtype="float32")
        for i in range(self.output_dim):
            #output = tf.keras.layers.Multiply()([inputs, self.kernel[..., i]])
            output = inputs*self.kernel[...,i]
            output = K.sum(output, axis=-1)
            if len(outputs) != 0:
                outputs = np.dstack([outputs, output])
            else:
                outputs = output[..., np.newaxis]
        return tf.convert_to_tensor(outputs)

    def compute_output_shape(self, input_shape):
        return input_shape + (self.output_dim, )
Here is part of the model structure (screenshot not preserved).
I tried different ways to obtain the weights but failed because of these strange layers.
Expected: the first five layers are replaced with a single layer that has a trainable kernel, whose weights can be obtained directly with get_weights().
I listed the length of the weight list for the first 10 layers and printed the weights of layer 1 with the following code:
for i in range(len(model.layers)):
    print("layer " + str(i), len(model.layers[i].get_weights()))
print(model.layers[1].get_weights()[0])
and got the following result and error (screenshots not preserved).
I found why this problem occurred.
I wrote the custom layer by
import tensorflow.python.keras
while using other keras layers and creating the model by
import tensorflow.keras
I think these two import paths may not be compatible, so my custom layer was split into several operation layers. Thus, the weights could not be obtained or updated.
After I changed all imports to tensorflow.keras, everything works.

Trainable weights in custom layers?

I was learning to make custom layers in TensorFlow but could not figure out how to add trainable weights. For example:
class Linear(layers.Layer):
    def __init__(self, units = 32, **kwargs):
        super().__init__(kwargs)
        self.units = units

    def build(self, input_shape):
        self.layer = layers.Dense(self.units, trainable= True)
        super().build(input_shape)

    def call(self, inputs):
        return self.layer(inputs)
Now if I do
linear_layer = Linear(8)
x = tf.ones(shape =(4,3))
y = linear_layer(x)
print(linear_layer.trainable_variables)
I get an empty list, and thus during gradient calculation I get no gradients. My question is how to create custom layers in such a way that the built-in Keras layers inside them are also trainable. One more thing: if I do linear_layer.weights then it gives me the weights, which means there is some problem specifically with the trainable weights.
My mind is stuck on that
You can get the trainable variables through the "layer" attribute of your custom layer:
linear_layer = Linear(8)
x = tf.ones(shape =(4,3))
y = linear_layer(x)
print(linear_layer.layer.trainable_variables)
Note that you are just creating a pre-built layer (Dense) in the build method instead of creating the weights of your custom layer. Look at https://www.tensorflow.org/tutorials/customization/custom_layers
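Another likely contributor is the line super().__init__(kwargs) in the question. Assuming the TF 2.x constructor signature, where trainable is the first positional parameter of Layer.__init__, the kwargs dict is passed positionally and lands in trainable; an empty dict is falsy, so the outer layer would report no trainable variables even though linear_layer.weights is populated. The sketch below reproduces only that argument-passing effect with a toy base class; BaseLayer, Broken and Fixed are hypothetical names, not Keras classes.

```python
class BaseLayer:
    """Toy stand-in whose first positional parameter is `trainable`."""

    def __init__(self, trainable=True, name=None, **kwargs):
        self.trainable = trainable
        self.name = name


class Broken(BaseLayer):
    def __init__(self, units=32, **kwargs):
        super().__init__(kwargs)    # dict passed positionally, so it becomes `trainable`
        self.units = units


class Fixed(BaseLayer):
    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)  # dict unpacked into keyword arguments
        self.units = units


broken, fixed = Broken(8), Fixed(8)
print(repr(broken.trainable), bool(broken.trainable))  # {} False
print(repr(fixed.trainable), bool(fixed.trainable))    # True True
```

If that is what happens here, changing the call to super().__init__(**kwargs) should make linear_layer.trainable_variables return the Dense weights directly.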

How to reshape keras mask within custom layer

Note: I posted about this issue already here. I'm creating a new question because:
1. I think the issue specifically relates to reshaping my mask within my custom layer, but I'm not sure enough of that to completely ignore the other error I wrote about in the original post.
2. There are many posts about reshaping Keras layers or adding Masking layers, but I couldn't find any about reshaping a mask within a layer, so I hope this post can be useful more generally.
The issue:
I have a custom Keras layer that takes 2D input and returns 3D output (batch_size, max_length, 1024), which is passed on to a BiLSTM followed by a CRF.
The custom Keras layer is copied from this repository. The difference is I take the 'elmo' instead of 'default' outputs from the Elmo model, so that the output is 3D as required by the BiLSTM:
result = self.elmo(K.squeeze(K.cast(x, tf.string), axis=1),
                   as_dict=True,
                   signature='default',
                   )['elmo'] # The original code used 'default'
However, the compute_mask function isn't appropriate for my architecture, as its output is 2D. Thus I get the error:
InvalidArgumentError: Incompatible shapes: [32,47] vs. [32,0] [[{{node loss/crf_1_loss/mul_6}}]]
where 32 is batch size and 47 is one less than my specified max_length.
I'm sure I need to reshape the mask, but I couldn't find out anywhere how.
Happy to make a git repo with the whole thing and/or full stack trace if need be.
Custom ELMo Layer:
class ElmoEmbeddingLayer(Layer):
    def __init__(self, **kwargs):
        self.dimensions = 1024
        self.trainable = True
        super(ElmoEmbeddingLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.elmo = hub.Module('https://tfhub.dev/google/elmo/2', trainable=self.trainable, name="{}_module".format(self.name))
        self.trainable_weights += K.tf.trainable_variables(scope="^{}_module/.*".format(self.name))
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x, mask=None):
        result = self.elmo(K.squeeze(K.cast(x, tf.string), axis=1),
                           as_dict=True, signature='default',)['elmo']
        return result

    # Original compute_mask function. Raises:
    # InvalidArgumentError: Incompatible shapes: [32,47] vs. [32,0] [[{{node loss/crf_1_loss/mul_6}}]]
    def compute_mask(self, inputs, mask=None):
        return K.not_equal(inputs, '__PAD__')

    def compute_output_shape(self, input_shape):
        return input_shape[0], 48, self.dimensions
The model is built as follows:
def build_model(): # uses crf from keras_contrib
    input = layers.Input(shape=(1,), dtype=tf.string)
    model = ElmoEmbeddingLayer(name='ElmoEmbeddingLayer')(input)
    model = Bidirectional(LSTM(units=512, return_sequences=True))(model)
    crf = CRF(num_tags)
    out = crf(model)
    model = Model(input, out)
    model.compile(optimizer="rmsprop", loss=crf_loss, metrics=[crf_accuracy, categorical_accuracy, mean_squared_error])
    model.summary()
    return model

Why listing model components in pyTorch is not useful?

I am trying to create a feed-forward neural network with N layers.
The idea is that if I want 2 inputs, 3 hidden units and 2 outputs, I just pass [2,3,2] to the neural network class and the model gets created. Likewise, [100,1000,1000,2] would mean 100 inputs, two hidden layers of 1000 neurons each, and 2 outputs. In other words, I want a fully connected network defined entirely by a list containing the number of neurons in each layer.
For that I have written the following code:
class FeedforwardNeuralNetModel(nn.Module):
    def __init__(self, layers):
        super(FeedforwardNeuralNetModel, self).__init__()
        self.fc=[]
        self.sigmoid=[]
        self.activationValue = []
        self.layers = layers
        for i in range(len(layers)-1):
            self.fc.append(nn.Linear(layers[i],layers[i+1]))
            self.sigmoid.append(nn.Sigmoid())

    def forward(self, x):
        out=x
        for i in range(len(self.fc)):
            out=self.fc[i](out)
            out = self.sigmoid[i](out)
        return out
When I tried to use it, the model appeared to be empty:
model=FeedforwardNeuralNetModel([3,5,10,2])
print(model)
>>FeedforwardNeuralNetModel()
But when I used the following code instead:
class FeedforwardNeuralNetModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(FeedforwardNeuralNetModel, self).__init__()
        # Linear function
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        # Non-linearity
        self.tanh = nn.Tanh()
        # Linear function (readout)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Linear function
        out = self.fc1(x)
        # Non-linearity
        out = self.tanh(out)
        # Linear function (readout)
        out = self.fc2(out)
        return out
and tried to print this model, I got the following result:
print(model)
>>FeedforwardNeuralNetModel(
  (fc1): Linear(in_features=3, out_features=5, bias=True)
  (sigmoid): Sigmoid()
  (fc2): Linear(in_features=5, out_features=10, bias=True)
)
The only difference is that in my code I store the components in lists.
I just want to understand why keeping model components in a list does not work in torch.
If you do print(FeedforwardNeuralNetModel([1,2,3])) it gives the following error
AttributeError: 'FeedforwardNeuralNetModel' object has no attribute '_modules'
which basically means that the object is not able to recognize the modules that you have declared.
Why does this happen?
Currently, the modules are stored in self.fc, which is a plain Python list, so torch has no way of knowing that it contains modules unless it does a deep search of every attribute, which would be slow and inefficient.
How can we let torch know that self.fc is a list of modules?
By using nn.ModuleList (see the modified code below). ModuleList and ModuleDict behave like Python lists and dictionaries respectively, but they also tell torch that the list/dict contains nn modules.
# modified __init__ function
def __init__(self, layers):
    super().__init__()
    self.fc=nn.ModuleList()
    self.sigmoid=[]
    self.activationValue = []
    self.layers = layers
    for i in range(len(layers)-1):
        self.fc.append(nn.Linear(layers[i],layers[i+1]))
        self.sigmoid.append(nn.Sigmoid())
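To illustrate why a plain list goes unnoticed, here is a rough sketch of attribute-based registration in plain Python. It is a toy imitation of the general idea, not PyTorch's actual code; ToyModule, ToyLinear and ToyModuleList are hypothetical names made up for this example.

```python
class ToyModule:
    """Minimal imitation of attribute-based submodule registration."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "_modules", {})

    def __setattr__(self, name, value):
        # Only values that are themselves ToyModule instances get registered.
        if isinstance(value, ToyModule):
            self._modules[name] = value
        object.__setattr__(self, name, value)

    def children(self):
        return list(self._modules.values())


class ToyLinear(ToyModule):
    pass


class ToyModuleList(ToyModule):
    """A container that is itself a module and registers every item it holds."""

    def __init__(self, items=()):
        super().__init__()
        for item in items:
            self.append(item)

    def append(self, item):
        self._modules[str(len(self._modules))] = item


class PlainListNet(ToyModule):
    def __init__(self):
        super().__init__()
        self.fc = [ToyLinear(), ToyLinear()]  # a list is not a ToyModule


class ModuleListNet(ToyModule):
    def __init__(self):
        super().__init__()
        self.fc = ToyModuleList([ToyLinear(), ToyLinear()])


plain, registered = PlainListNet(), ModuleListNet()
print(len(plain.children()))          # 0: the list was never registered
print(len(registered.children()))     # 1: the container itself
print(len(registered.fc.children()))  # 2: the layers inside the container
```

In this toy version the parent only inspects the type of the value being assigned, which is why the container has to be a module itself for its contents to be found.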

Keras weights of first layer didn't change

I'm very new to Keras and I'm writing a custom layer which implements Gaussian function [exp(-(w*x-mean)^2/sigma^2) where W, mean, sigma are all randomly generated].
Below is code for the custom layer:
class Gaussian(Layer):
    def __init__(self,**kwargs):
        super(Gaussian, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create trainable weights for this layer.
        self.W_init = np.random.rand(1,input_shape[1])
        self.W = K.variable(self.W_init, name="W")
        # Create trainable means for this layer.
        self.mean_init = np.random.rand(1,input_shape[1])
        self.mean = K.variable(self.mean_init, name="mean")
        # Create trainable sigmas for this layer.
        self.sigma_init = np.random.rand(1,input_shape[1])
        self.sigma = K.variable(self.sigma_init, name="sigma")
        self.trainable_weights = [self.mean, self.sigma]
        super(Gaussian, self).build(input_shape) # Be sure to call this somewhere!

    def call(self, x):
        result = tf.multiply(x, self.W)
        result = tf.subtract(x, self.mean)
        result = tf.multiply(tf.square(result),-1)
        result = tf.divide(result, tf.square(self.sigma))
        return result

    def compute_output_shape(self, input_shape):
        return input_shape
After putting it as the first layer in a Keras mnist tutorial (I just wanted to make sure it runs without producing errors and didn't care about accuracy) and training the model, it appeared that the loss stopped decreasing after around 4 epochs, and only the values of "mean" and "sigma" changed after training while the values of "W" remained the same. However, this doesn't happen if I put it as the second layer.
I ran the Keras mnist tutorial again without the custom layer and found out that the weights of the first layer didn't change either.
Is not updating the weights of the first layer (more specifically the very first parameter) a Keras thing, or am I missing something? Can I force it to update?
Thank you!
You are not implementing your layer correctly. Keras is not aware of your weights, which means they are not being trained by gradient descent. Take a look at this example:
from keras import backend as K
from keras.engine.topology import Layer
import numpy as np

class MyLayer(Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(MyLayer, self).build(input_shape) # Be sure to call this at the end

    def call(self, x):
        return K.dot(x, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)
Here you have to use add_weight to obtain a trainable weight, not just use K.variable as you are currently doing. This way your weights will be registered with Keras and they will be trained properly. You should do this for all trainable parameters in your layer.
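Two details in the question's code also seem worth checking, independently of add_weight: W is left out of the trainable_weights list, and in call the result of tf.multiply(x, self.W) is overwritten on the next line, so the output never depends on W and no gradient can reach it. The call method also omits the exp from the stated formula. The sketch below shows the overwrite effect with plain Python scalars; the two function names are hypothetical and the numbers are arbitrary.

```python
import math


def gaussian_as_posted(x, w, mean, sigma):
    """Same sequence of steps as call() in the question, on scalars."""
    result = x * w       # computed ...
    result = x - mean    # ... then overwritten, so w no longer matters
    result = -(result ** 2)
    return result / sigma ** 2


def gaussian_intended(x, w, mean, sigma):
    """exp(-(w*x - mean)^2 / sigma^2), the formula stated in the question."""
    z = w * x - mean
    return math.exp(-(z ** 2) / sigma ** 2)


x, mean, sigma = 2.0, 0.5, 1.5
same_posted = gaussian_as_posted(x, 0.1, mean, sigma) == gaussian_as_posted(x, 0.9, mean, sigma)
same_intended = gaussian_intended(x, 0.1, mean, sigma) == gaussian_intended(x, 0.9, mean, sigma)
print(same_posted)    # True: changing w has no effect on the posted version
print(same_intended)  # False: the intended formula does depend on w
```

In the layer itself, the corresponding change would be to subtract the mean from the product of x and W rather than from x.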
