initial state of LSTM - python

I would like to inspect the internal state of an LSTM layer.
In particular I would like to look at the status at prediction time, after feeding a new example to the network.
I have understood this can be done using:
import numpy as np
from keras import backend as K
# load pre-trained model somewhere
# select a LSTM layer
for layer in model.layers:
    if 'LSTM' in str(layer):
        break
# get inputs somewhere
val = np.random.random((...))
x = K.variable(value=val)
initial_state = layer.get_initial_states(???)
[desired_states]=layer.step(inputs=x,states=initial_state)
What is the input shape required by get_initial_states?
Is this a correct way of inspecting an LSTM?
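One possible alternative to get_initial_states/step, sketched here on the assumption that tf.keras (TensorFlow 2) is available: build the LSTM with return_state=True and expose the hidden and cell states as extra model outputs. The sizes and the name state_model are made up for illustration; for an already trained model, a layer built this way could receive the trained layer's weights through set_weights.

```python
import numpy as np
import tensorflow as tf

# Illustrative sizes, not taken from the question.
timesteps, features, units = 5, 3, 4

inputs = tf.keras.Input(shape=(timesteps, features))
# return_state=True makes the layer return its final hidden state (h)
# and cell state (c) in addition to its regular output.
lstm = tf.keras.layers.LSTM(units, return_state=True)
output, state_h, state_c = lstm(inputs)

# Hypothetical helper model that exposes the states at prediction time.
state_model = tf.keras.Model(inputs, [output, state_h, state_c])

x = np.random.random((2, timesteps, features)).astype("float32")
out, h, c = state_model.predict(x, verbose=0)

print(out.shape, h.shape, c.shape)
```

Without return_sequences=True the regular output is the last hidden state, so out and h hold the same values; c is the cell state after the last time step.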

Related

Feed keras model input to the output layer

I am building a Keras Sequential model in which the last (output) layer is an UpSampling2D layer, and I need to feed the input image to that output layer to do a simple operation and return the result. Any ideas?
EDIT:
The model mentioned above is the generator of a GAN, in which I need to add the input image to the output of the generator before feeding it to the discriminator.
1. Define a backbone model that uses the inputs of the pre-trained model and the outputs of the last layer before its output layer.
2. Based on that backbone, define a new model that has the new skip connection and an output layer identical to the one in the pre-trained model.
3. Set the weights of the output layer in the new model equal to the weights of the output layer in the pre-trained model, using: new_model.layers[-1].set_weights(pre_model.layers[-1].get_weights())
Here is a good article about adding layers to the middle of a pre-trained network without invalidating the weights.
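The three steps above might look roughly like this. It is only a sketch with small Dense layers standing in for the real generator, and it assumes tf.keras; pre_model, backbone and new_model are illustrative names, and the Add layer is just one possible skip connection.

```python
import numpy as np
import tensorflow as tf

# Stand-in for the pre-trained model (sizes are made up).
inp = tf.keras.Input(shape=(8,))
h = tf.keras.layers.Dense(8, activation="relu")(inp)
h = tf.keras.layers.Dense(8, activation="relu")(h)
out = tf.keras.layers.Dense(2)(h)
pre_model = tf.keras.Model(inp, out)

# Step 1: backbone = everything up to the layer before the output layer.
backbone = tf.keras.Model(pre_model.input, pre_model.layers[-2].output)

# Step 2: add the skip connection from the input, then a fresh output
# layer with the same configuration as the pre-trained one.
skip = tf.keras.layers.Add()([backbone.output, backbone.input])
new_out = tf.keras.layers.Dense(2)(skip)
new_model = tf.keras.Model(backbone.input, new_out)

# Step 3: copy the pre-trained output layer weights into the new one.
new_model.layers[-1].set_weights(pre_model.layers[-1].get_weights())

pred = new_model.predict(np.random.random((3, 8)).astype("float32"), verbose=0)
print(pred.shape)
```

The backbone layers are shared objects, so they keep their trained weights; only the output layer is new and needs the explicit copy.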
For future reference, I solved it by using Lambda layers as follows:
# z is the input I needed to use later on with the generator output to perform a certain function
generated_image = self.generator(z)
generated_image_modified=tf.keras.layers.Concatenate()([generated_image,z])
# with x[...,variable_you_need_range] you can access the input we just concatenated in your train loop
lambd = tf.keras.layers.Lambda(lambda x: your_function(x[...,2:5],x[...,:2]))(generated_image_modified)
full_model = self.discriminator(lambd)
self.combined = Model(z,outputs = full_model)

Load weights for last layer (output layer) to a new model from trained network

Is it possible to load weights into the last layer of my new model from a trained network using the set_weights and get_weights scheme?
The point is, I saved the weights of each layer as .mat files (after training) to do some calculations in MATLAB, and I want only the modified weights of the last layer to be loaded into the last layer of my new model, while the other layers get the same weights as the trained model. It is a bit tricky, since the saved format is .mat.
weights1 = lstm_model1.layers[0].get_weights()[0]
biases1 = lstm_model1.layers[0].get_weights()[1]
weights2 = lstm_model1.layers[2].get_weights()[0]
biases2 = lstm_model1.layers[2].get_weights()[1]
weights3 = lstm_model1.layers[4].get_weights()[0]
biases3 = lstm_model1.layers[4].get_weights()[1]
# Save the weights and biases for adaptation algorithm
savemat("weights1.mat", mdict={'weights1': weights1})
savemat("biases1.mat", mdict={'biases1': biases1})
savemat("weights2.mat", mdict={'weights2': weights2})
savemat("biases2.mat", mdict={'biases2': biases2})
savemat("weights3.mat", mdict={'weights3': weights3})
savemat("biases3.mat", mdict={'biases3': biases3})
How can I load the old weights of the other layers into the new model (without the last layer) and the modified weights of the last layer into the last layer of the new one?
If it was saved in the .h5 file format, this works. However, I'm not sure about .mat:
Put simply, you just have to call get_weights on the desired layer and, similarly, set_weights on the corresponding layer of the other model:
last_layer_weights = old_model.layers[-1].get_weights()
new_model.layers[-1].set_weights(last_layer_weights)
For a more complete code sample, here you go:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Create an arbitrary model with some weights, for example
model = Sequential(layers = [
    Dense(70, input_shape = (100,)),
    Dense(60),
    Dense(50),
    Dense(5)])
# Save the weights of the model
model.save_weights("model.h5")
# Later, load in the model (we only really need the layer in question)
old_model = Sequential(layers = [
    Dense(70, input_shape = (100,)),
    Dense(60),
    Dense(50),
    Dense(5)])
old_model.load_weights("model.h5")
# Create a new model with slightly different architecture (except for the layer in question, at least)
new_model = Sequential(layers = [
    Dense(80, input_shape = (100,)),
    Dense(60),
    Dense(50),
    Dense(5)])
# Set the weights of the final layer of the new model to the weights of the final layer of the old model, leaving the other layers unchanged.
new_model.layers[-1].set_weights(old_model.layers[-1].get_weights())
# Check that the weights of the final layer are the same, but the others are not.
print(np.all(new_model.layers[-1].get_weights()[0] == old_model.layers[-1].get_weights()[0]))
>> True
print(np.all(new_model.layers[-2].get_weights()[0] == old_model.layers[-2].get_weights()[0]))
>> False
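For the .mat case specifically, something along these lines may work. It is a sketch that assumes scipy.io and tf.keras, uses a temporary file in place of the real MATLAB round trip, and uses made-up layer sizes and key names. One detail to watch: loadmat returns 1-D vectors such as biases as 2-D arrays, so they need reshaping before set_weights.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from scipy.io import loadmat, savemat


def make_model():
    # Small stand-in architecture (sizes are illustrative).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(3),
        tf.keras.layers.Dense(2),
    ])


old_model = make_model()
new_model = make_model()

# Save the last layer to a .mat file, as in the question.
kernel, bias = old_model.layers[-1].get_weights()
path = os.path.join(tempfile.mkdtemp(), "last_layer.mat")
savemat(path, {"weights": kernel, "biases": bias})

# ... the file would be modified in MATLAB here ...

mat = loadmat(path)
new_kernel = mat["weights"]
# loadmat gives the bias back with shape (1, n); restore the (n,) shape.
new_bias = mat["biases"].reshape(bias.shape)

# Copy every layer except the last straight from the trained model.
for old_layer, new_layer in zip(old_model.layers[:-1], new_model.layers[:-1]):
    new_layer.set_weights(old_layer.get_weights())

# Load the (possibly modified) .mat weights into the last layer.
new_model.layers[-1].set_weights([new_kernel, new_bias])
```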

Extracting features from EfficientNet Tensorflow

I have a CNN model trained using EfficientNetB6.
My task is to extract the features of this trained model by removing the last dense layer, and then to use those features to train a boosting model.
I did this using PyTorch earlier: I was able to extract the outputs of the layers I was interested in, predict on my validation set, and then boost.
I am now doing this in TensorFlow but am currently stuck.
Below is my model structure. I have tried using the code on the website but did not have any luck.
I want to remove the last dense layer and predict on the validation set using the remaining layers.
I tried using:
layer_name = 'efficientnet-b6'
intermediate_layer_model = tf.keras.Model(inputs = model.input, outputs = model.get_layer(layer_name).output)
but I get an error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 760, 760, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []
Is there any way to resolve this?
Sorry, my bad.
I simply added a GlobalAveragePooling2D layer after the EfficientNet layer, and I am now able to extract the features and continue :)
Just for reference:
def build_model(dim=CFG['net_size'], ef=0):
    inp = tf.keras.layers.Input(shape=(dim,dim,3))
    base = EFNS[ef](input_shape=(dim,dim,3),weights='imagenet',include_top=False)
    x = base(inp)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(1,activation='sigmoid')(x)
    model = tf.keras.Model(inputs=inp,outputs=x)
    opt = tf.keras.optimizers.Adam(learning_rate=0.001)
    loss = tf.keras.losses.BinaryCrossentropy(label_smoothing=0.05)
    model.compile(optimizer=CFG['optimizer'],loss=loss,metrics=[tf.keras.metrics.AUC(name='auc')])
    print(model.summary())
    return model
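With a model built like this, the features can be read from the pooling layer of the outer model rather than from the nested base model. Here is a small sketch, assuming tf.keras, with a tiny convolutional stack standing in for EfficientNet so that nothing has to be downloaded; the layer name pooled_features and the sizes are made up.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the EfficientNet base (not the real network).
base = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
], name="tiny_base")

inp = tf.keras.Input(shape=(32, 32, 3))
x = base(inp)
x = tf.keras.layers.GlobalAveragePooling2D(name="pooled_features")(x)
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inp, out)

# The pooling layer lives in the outer model's graph, so its output is
# connected to model.input and can be used as the new model output.
feature_model = tf.keras.Model(
    inputs=model.input,
    outputs=model.get_layer("pooled_features").output,
)

images = np.random.random((4, 32, 32, 3)).astype("float32")
features = feature_model.predict(images, verbose=0)
print(features.shape)  # one feature vector per image
```

The resulting 2-D array (samples by features) is the kind of input a boosting model would typically expect.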

Adding Dropout to testing/inference phase

I've trained the following model for some timeseries in Keras:
input_layer = Input(batch_shape=(56, 3864))
first_layer = Dense(24, input_dim=28, activation='relu',
                    activity_regularizer=None,
                    kernel_regularizer=None)(input_layer)
first_layer = Dropout(0.3)(first_layer)
second_layer = Dense(12, activation='relu')(first_layer)
second_layer = Dropout(0.3)(second_layer)
out = Dense(56)(second_layer)
model_1 = Model(input_layer, out)
Then I defined a new model with the trained layers of model_1 and added dropout layers with a different rate, drp, to it:
input_2 = Input(batch_shape=(56, 3864))
first_dense_layer = model_1.layers[1](input_2)
first_dropout_layer = model_1.layers[2](first_dense_layer)
new_dropout = Dropout(drp)(first_dropout_layer)
snd_dense_layer = model_1.layers[3](new_dropout)
snd_dropout_layer = model_1.layers[4](snd_dense_layer)
new_dropout_2 = Dropout(drp)(snd_dropout_layer)
output = model_1.layers[5](new_dropout_2)
model_2 = Model(input_2, output)
Then I get the prediction results of these two models as follows:
result_1 = model_1.predict(test_data, batch_size=56)
result_2 = model_2.predict(test_data, batch_size=56)
I was expecting to get completely different results because the second model has new dropout layers and these two models are different (IMO), but that's not the case. Both generate the same result. Why is that happening?
As I mentioned in the comments, the Dropout layer is turned off in the inference phase (i.e. test mode), so when you use model.predict() the Dropout layers are not active. However, if you would like a model that uses Dropout in both the training and inference phases, you can pass the training argument when calling the layer, as suggested by François Chollet:
# ...
new_dropout = Dropout(drp)(first_dropout_layer, training=True)
# ...
Alternatively, if you have already trained your model and now want to use it in inference mode while keeping the Dropout layers active (and possibly other layers that behave differently in the training and inference phases, such as BatchNormalization), you can define a backend function that takes the model's inputs as well as the Keras learning phase:
from keras import backend as K
func = K.function(model.inputs + [K.learning_phase()], model.outputs)
# to use it pass 1 to set the learning phase to training mode
outputs = func([input_arrays] + [1.])
Your question has a simple solution in recent versions of TensorFlow: you can set the training argument of the call method to True.
You can run code like the following:
model(input, training=True)
With training=True, TensorFlow applies the Dropout layers even when you are only running inference.
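To see the difference between the two modes, here is a small sketch (assuming tf.keras; the model and its sizes are made up) that compares model.predict with direct calls using training=True.

```python
import numpy as np
import tensorflow as tf

# Illustrative model with one Dropout layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])

x = np.random.random((8, 16)).astype("float32")

# predict() runs in inference mode: Dropout is inactive, so two calls
# on the same input give the same result.
det_a = model.predict(x, verbose=0)
det_b = model.predict(x, verbose=0)

# Calling the model with training=True keeps Dropout active, so each
# forward pass draws a new random mask.
sto_a = model(x, training=True).numpy()
sto_b = model(x, training=True).numpy()

print(np.allclose(det_a, det_b))
print(np.allclose(sto_a, sto_b))
```

Averaging many such stochastic passes is the usual way this is used to get an uncertainty estimate (often called Monte Carlo dropout).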
As there are already some working code solutions above, I will simply add a few more details regarding dropout during inference to prevent confusion.
Based on the original paper, Dropout layers randomly turn off units (set their activations to zero) during training to reduce overfitting. Once training is finished and we start testing the model, no units are dropped, so every unit contributes to the prediction. With all units active, the summed input to the next layer is larger than what the network saw during training. To compensate, a scaling factor is applied. To be more precise, if a unit is retained with probability p during training, the outgoing weights of that unit are multiplied by p during the prediction stage.
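Note that Keras applies the equivalent "inverted" form of this scaling: instead of multiplying weights by p at prediction time, it scales the kept units up by 1/(1 - rate) during training, where rate is the drop probability. A quick check, assuming tf.keras, with an all-ones input chosen only to make the scaling visible:

```python
import numpy as np
import tensorflow as tf

rate = 0.5  # probability of dropping a unit
dropout = tf.keras.layers.Dropout(rate)
x = tf.ones((1, 1000))

# Inference mode: the layer passes its input through unchanged.
inference_out = dropout(x, training=False).numpy()

# Training mode: dropped units become 0, kept units become 1 / (1 - rate).
training_out = dropout(x, training=True).numpy()
kept_value = 1.0 / (1.0 - rate)

print(np.unique(inference_out))
print(np.unique(training_out))
print(training_out.mean())  # stays close to the input mean of 1.0
```

Because the expected activation is the same in both modes, no rescaling is needed when the model is used for prediction.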

How to pass a pair of images first through a conv net and then through a recurrent net in Keras?

I would like to compare two images with both a convolutional and a recurrent network. First I want to pass my first image through some VGG-like stack, then feed it into a first RNN input. Then the second image should pass THE SAME VGG and after that go into a second input of the RNN.
How do I implement this topology with Keras?
The recurrent network should remember the first image while processing the second.
UPDATE
Suppose I have two inputs:
input1 = layers.Input(...)
input2 = layers.Input(...)
Currently I have two VGG branches
x1 = vgg_stack(...)(x1)
x2 = vgg_stack(...)(x2)
x = layers.concatenate([x1, x2])
x = final_MLP(...)(x)
How would I replace this with a single vgg_stack applied to both inputs, with the results then passed to an RNN?
You should try the TimeDistributed wrapper. You can find the doc here
It basically treats the first dimension after the batch as a 'temporal dimension' and applies the layer (or model) that you pass as an argument to every temporal step. So use it like this:
from keras.layers import Input, TimeDistributed
input_layer = Input((num_of_images, image_dims...))
# m_cnn is your VGG like model, taking one image as input.
layer1 = TimeDistributed(m_cnn)(input_layer)
layer2 = YourRNNLayer(...)(layer1)
I hope this makes sense to you :)
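A fuller version of the idea might look like this. It is only a sketch, assuming tf.keras, with a tiny CNN in place of the VGG-like stack and an LSTM as the recurrent layer; the image size and layer widths are made up.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the VGG-like stack; it takes ONE image as input.
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(4, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
], name="shared_cnn")

# The pair of images is stacked along a "time" axis of length 2.
pair_input = tf.keras.Input(shape=(2, 32, 32, 3))

# TimeDistributed applies the same cnn (same weights) to both images.
features = tf.keras.layers.TimeDistributed(cnn)(pair_input)

# The LSTM sees the first image's features, then the second's.
encoded = tf.keras.layers.LSTM(8)(features)
score = tf.keras.layers.Dense(1, activation="sigmoid")(encoded)
model = tf.keras.Model(pair_input, score)

pairs = np.random.random((3, 2, 32, 32, 3)).astype("float32")
pred = model.predict(pairs, verbose=0)
print(pred.shape)
```

Two separate image arrays of shape (batch, 32, 32, 3) can be combined into this input with np.stack([images1, images2], axis=1).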
