initial state lstm encoder decoder keras - python

I am trying to build an LSTM encoder decoder where my main goal is that the inital state of the decoder is the same as the encoder. I found the code below from here and tried to attach it to my case. I have a data with shape (1000,20,1). I want that the encoder decoder gives me in the output my input back. I do not know how to correct the code that it is working, even I understand the error. When I try to run this, I get the following error:
The model expects 2 input arrays, but only received one array. Found:
array with shape (10000, 20, 1)
from keras.models import Model
from keras.layers import Input
from keras.layers import LSTM
from keras.layers import Dense
from keras.models import Sequential
latent_dim = 128
encoder_inputs = Input(shape=(20,1))
encoder = LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(20, 1))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(1, activation='tanh')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='mse', metrics=['acc', 'mae'])
history=model.fit(xtrain, xtrain, epochs=200, verbose=2, shuffle=False)
I also have this model, but I am not sure how I can initalize here the encoder state same as the decoder state. Is the repeatvector doing this ?
#define model
model = Sequential()
model.add(LSTM(100, input_shape=(n_timesteps_in, n_features)))
model.add(RepeatVector(n_timesteps_in))
model.add(LSTM(100, return_sequences=True))
model.add(TimeDistributed(Dense(n_features, activation='tanh')))
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
history=model.fit(train, train, epochs=epochs, verbose=2, shuffle=False)

You are building a model with 2 inputs, namely encoder_inputs and decoder_inputs but only giving one input .fit(xtrain, xtrain, ...) the second argument is the output. In case you need to give another argument of the form .fit([xtrain, the_inputs_for_decoder], xtrain, ...)

Related

Adding additional hidden layer and attention layer to LSTM model

sequence_input = Input(shape=(MAX_LENGTH_SEQUENCE,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
l_lstm = Bidirectional(LSTM(10))(embedded_sequences)
preds = Dense(len(macronum), activation='softmax')(l_lstm)
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['acc'])
I need to add additional hidden layer and an attention layer to the above LSTM model , usually i construct the model in this way:
model = tensorflow.keras.Sequential()
model.add(tensorflow.keras.layers.LSTM(128, dropout=0.3,
recurrent_dropout=0.2,input_shape=(N, K), return_sequences=True))
#model.add(tensorflow.keras.layers.LSTM(128, dropout=0.3,
recurrent_dropout=0.2,input_shape=(N, K), return_sequences=True))
model.add(Attention(name='attention_weight'))
model.add(tensorflow.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
print(model.summary())
print(_x_train.shape)
model.fit(_x_train, _y_train, epochs=10, batch_size=1)
scores = model.evaluate(_x_test, _y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
as shown in the second code i normaly stack and add as many layer as i need using
model.add
but for the first code i'm not familiar with this approach,
if i want to add extra LSTM layer and Attention layer to the first code, where should i include them inside the code? what is the right syntax ?

What is the correct method to construct a many-to-many LSTM model using Keras in Python?

I am trying to make a 3 sequence many-to-many LSTM model, but I am confused about it's implementation in Keras. I searched on internet for examples of many-to-many models, but each website gives different method. That has confused me even more. What is the correct method of those? I want a model like this:
Some of the various methods I found were
Using encoder, decoder
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
model = Sequential()
# encoder layer
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))
# repeat vector
model.add(RepeatVector(3))
# decoder layer
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
Another with encoder, decoder
from keras.models import Model
from keras.layers import Input, LSTM, Dense
encoder_inputs = Input(shape=(None, 1))
encoder = LSTM(100, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
decoder_inputs = Input(shape=(None, 1))
decoder_lstm = LSTM(100, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model = Sequential()
model.add(LSTM(100,input_shape=(3,1),return_sequences=True))
model.add(TimeDistributed(Dense(2)))
model.compile(optimizer='adam', loss='mse')
model = Sequential()
model.add(LSTM(100,input_shape=(3,1),return_sequences=True))
model.compile(optimizer='adam', loss='mse')
Which one of these is the correct method? which one will give the model like the one I want?
You have to mention your problem statement first.
1 and 2 are best for neural machine translation problems. While 2 is superior because it is considering return states in LSTM layer. 3 is also a good architecture where logic from input to output is simple. 4 is a very basic architecture becuase nth output in the output array has knowledge about [0 to n-1th input, not later ones] also no fully connected (Dense) layer so even moderate logic cannot be learned here.

Resource localhost/total/N10tensorflow3VarE does not exist

I'm working with Google Colab and trying to train a model using VGG blocks. Like this:
METRICS = [
keras.metrics.TruePositives(name='tp'),
keras.metrics.FalsePositives(name='fp'),
keras.metrics.TrueNegatives(name='tn'),
keras.metrics.FalseNegatives(name='fn'),
keras.metrics.BinaryAccuracy(name='accuracy'),
keras.metrics.Precision(name='precision'),
keras.metrics.Recall(name='recall'),
keras.metrics.AUC(name='auc'),
]
# function for creating a vgg block
def vgg_block(layer_in, n_filters, n_conv):
# add convolutional layers
for _ in range(n_conv):
layer_in = Conv2D(n_filters, (3,3), padding='same', activation='relu')(layer_in)
# add max pooling layer
layer_in = MaxPooling2D((2,2), strides=(2,2))(layer_in)
return layer_in
# define model input
visible = Input(shape=(256, 256, 3))
# add vgg module
layer = vgg_block(visible, 64, 2)
#####################################
flat = Flatten()(layer)
hidden1 = Dense(128, activation='relu')(flat)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
print(model.summary())
# plot model architecture
plot_model(model, show_shapes=True, to_file='vgg_block.png')
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=METRICS)
# New lines to obtain the best model in term of validation accuracy
from keras.callbacks import ModelCheckpoint
filepath="weights-improvement-{epoch:02d}-{val_accuracy:.2f}.h5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
But, when I try to use model.fit_generator it gives me an error. The code I'm using is:
history = model.fit_generator(
train_generator,
steps_per_epoch=2000 // batch_size,
epochs=20,
validation_data=validation_generator,
validation_steps=800 // batch_size,
callbacks=callbacks_list
)
I have tried everything and I don't know what to do. It gives me the following error:
NotFoundError: 2 root error(s) found.
(0) Not found: Resource localhost/total/N10tensorflow3VarE does not exist.
[[{{node metrics/accuracy/AssignAddVariableOp}}]]
[[metrics/precision/Mean/_87]]
(1) Not found: Resource localhost/total/N10tensorflow3VarE does not exist.
[[{{node metrics/accuracy/AssignAddVariableOp}}]]
0 successful operations.
0 derived errors ignored.
I would appreciate any help. I'm kind of new here. What could I do? Thanks!
It seems the problem arises only using the native keras but when I tried to implement your code and modified it in Tensorflow 2.x as below:
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dense, Flatten
from tensorflow.keras.models import Model
METRICS = [
keras.metrics.TruePositives(name='tp'),
keras.metrics.FalsePositives(name='fp'),
keras.metrics.TrueNegatives(name='tn'),
keras.metrics.FalseNegatives(name='fn'),
keras.metrics.BinaryAccuracy(name='accuracy'),
keras.metrics.Precision(name='precision'),
keras.metrics.Recall(name='recall'),
keras.metrics.AUC(name='auc'),
]
# function for creating a vgg block
def vgg_block(layer_in, n_filters, n_conv):
# add convolutional layers
for _ in range(n_conv):
layer_in = Conv2D(n_filters, (3,3), padding='same', activation='relu')(layer_in)
# add max pooling layer
layer_in = MaxPooling2D((2,2), strides=(2,2))(layer_in)
return layer_in
# define model input
visible = Input(shape=(256, 256, 3))
# add vgg module
layer = vgg_block(visible, 64, 2)
#####################################
flat = Flatten()(layer)
hidden1 = Dense(128, activation='relu')(flat)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
print(model.summary())
# # plot model architecture
# plot_model(model, show_shapes=True, to_file='vgg_block.png')
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=METRICS)
# New lines to obtain the best model in term of validation accuracy
from tensorflow.keras.callbacks import ModelCheckpoint
filepath="weights-improvement-{epoch:02d}-{val_accuracy:.2f}.h5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
## Synthetic Inputs
train_input = tf.random.normal((100, 256, 256, 3))
train_output = tf.random.normal((100, 1))
# Test Model.fit same as Model.fit_generator in TF 2.1.0
model.fit(train_input, train_output, epochs = 1)
The problem didn't show up and it is working properly.
You can try this in TF 2.x instead. I hope this solved your problem.
The problem is with the Keras version. Try with Keras 2.3.0 and above versions where they introduce new class-based metrics.
Keras documentation

Can't use TimeDistributed with LSTM

I try to use the TimeDistributed layer of Keras, but I have some issues.
Dataset shape:
Training Set: (800, 7, 231) with 7 being the timestep.
Training Labels: (800, 7)
Validation Set: (700, 7, 231)
Validation labels (700,7)
My goal is to do binary classification. I have information for seven successive day (with explain why 7 is my timestep). I also have the labels for the 7 successive days, which I want to use in order to predict only the last day (the seventh day).
Below is my code:
from keras.layers import LSTM
model = Sequential()
model.add(LSTM(120, input_shape=(final_dataset.shape[1], final_dataset.shape[2]), return_sequences=True))
print('ok')
model.add(TimeDistributed(Dense(15, activation='softmax')))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
history = model.fit(training_set, labels_train, epochs=1, validation_data=(validation_set, labels_validation))
My error: Error when checking target: expected activation 1 to have 3 dimensions, but got array with shape (800, 7)
Edit:
I tried an other version, with encoder decoder, which not works for the moment:
from keras.layers import LSTM
from keras.models import Sequential, Model
from keras.layers import Dense, Input, TimeDistributed, Flatten
# Define an input sequence and process it.
# Input layer of the encoder :
encoder_input = Input(shape=(final_dataset.shape[1], final_dataset.shape[2]))
# Hidden layers of the encoder :
encoder_LSTM = LSTM(120, input_shape=(final_dataset.shape[1], final_dataset.shape[2]), return_sequences=True)(encoder_input)
# Output layer of the encoder :
encoder_LSTM2_layer = LSTM(120, return_state=True)
encoder_outputs, state_h, state_c = encoder_LSTM2_layer(encoder_LSTM)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
# Input layer of the decoder :
decoder_input = Input(shape=(6,))
# Hidden layers of the decoder :
decoder_LSTM_layer = LSTM(120, input_shape=(6,), return_sequences=True)
decoder_LSTM = decoder_LSTM_layer(decoder_input, initial_state = encoder_states)
decoder_LSTM_2_layer = LSTM(120, return_sequences=True, return_state=True)
decoder_LSTM_2,_,_ = decoder_LSTM_2_layer(decoder_LSTM)
# Output layer of the decoder :
decoder_dense = Dense(2, activation='sigmoid')
decoder_outputs = decoder_dense(decoder_LSTM_2)
# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_input, decoder_input], decoder_outputs)
model.summary()
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([final_dataset,
labels_train[:,:6]],
labels_train[:,6])

How to get values of penultimate layer in convolutional neural network(CNN)?

I am trying to implement CNN for a classification task. I want to see the how the weights are being optimized at each epoch. To do so, I need the values of penultimate layer. Also, I will hard code the last layer and backpropagation myself. Please recommend APIs also which which will be helpful.
Edit: I have added a code from keras examples. Looking forward to edit it.
This link provide some hint. I have mentioned the layer after which I require the output.
from __future__ import print_function
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalMaxPooling1D
from keras.datasets import imdb
# set parameters:
max_features = 5000
maxlen = 400
batch_size = 100
embedding_dims = 50
filters = 250
kernel_size = 3
hidden_dims = 250
epochs = 100
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen))
model.add(Dropout(0.2))
# we add a Convolution1D, which will learn filters
# word group filters of size filter_length:
model.add(Conv1D(filters,
kernel_size,
padding='valid',
activation='relu',
strides=1))
# we use max pooling:
model.add(GlobalMaxPooling1D())
# We add a vanilla hidden layer:
model.add(Dense(hidden_dims))
model.add(Dropout(0.2))
model.add(Activation('relu'))
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1))
model.add(Activation('sigmoid')) #<======== I need output after this.
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
validation_data=(x_test, y_test))
You can get the individual layers of your model like this:
num_layer = 7 # Dense(1) layer
layer = model.layers[num_layer]
I want to see the how the weights are being optimized at each epoch.
To get the weights of the layer use layer.get_weights() like this:
w, b = layer.get_weights() # weights and bias of Dense(1)
I need the values of penultimate layer.
To get the value of the evaluation of the last layer use model.predict():
prediction = model.predict(x_test)
To get the evaluation of any other layer do it with tensorflow like this:
input = tf.placeholder(tf.float32) # Create input placeholder
layer_output = layer(input) # create layer output operation
init_op = tf.global_variables_initializer() # initialize variables
with tf.Session() as sess:
sess.run(init_op)
# evaluate layer output
output = sess.run(layer_output, feed_dict = {input: x_test})
print(output)

Categories

Resources