Can't use TimeDistributed with LSTM - python

I'm trying to use the TimeDistributed layer of Keras, but I'm running into some issues.
Dataset shape:
Training Set: (800, 7, 231), with 7 being the timestep
Training Labels: (800, 7)
Validation Set: (700, 7, 231)
Validation Labels: (700, 7)
My goal is to do binary classification. I have information for seven successive days (which explains why 7 is my timestep). I also have the labels for those 7 successive days, but I want to predict only the last day (the seventh day).
Below is my code:
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, Activation
model = Sequential()
model.add(LSTM(120, input_shape=(final_dataset.shape[1], final_dataset.shape[2]), return_sequences=True))
print('ok')
model.add(TimeDistributed(Dense(15, activation='softmax')))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
history = model.fit(training_set, labels_train, epochs=1, validation_data=(validation_set, labels_validation))
My error: Error when checking target: expected activation 1 to have 3 dimensions, but got array with shape (800, 7)
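The error occurs because with return_sequences=True the model produces a 3D output of shape (batch, 7, 1), while the supplied targets are 2D with shape (800, 7). For the stated goal of predicting only the seventh day, a minimal sketch (assuming the targets are reduced to the last column, labels_train[:, 6]) would be:
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# Encode the whole 7-day sequence into one vector (no return_sequences).
model.add(LSTM(120, input_shape=(7, 231)))
# One sigmoid unit for binary classification of the last day only.
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Targets now have shape (800,): the label of day 7 for each sample.
model.fit(training_set, labels_train[:, 6], epochs=1,
          validation_data=(validation_set, labels_validation[:, 6]))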
Edit:
I tried another version, with an encoder-decoder, which does not work yet either:
from keras.layers import LSTM
from keras.models import Sequential, Model
from keras.layers import Dense, Input, TimeDistributed, Flatten
# Define an input sequence and process it.
# Input layer of the encoder :
encoder_input = Input(shape=(final_dataset.shape[1], final_dataset.shape[2]))
# Hidden layers of the encoder :
encoder_LSTM = LSTM(120, input_shape=(final_dataset.shape[1], final_dataset.shape[2]), return_sequences=True)(encoder_input)
# Output layer of the encoder :
encoder_LSTM2_layer = LSTM(120, return_state=True)
encoder_outputs, state_h, state_c = encoder_LSTM2_layer(encoder_LSTM)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
# Input layer of the decoder :
decoder_input = Input(shape=(6,))
# Hidden layers of the decoder :
decoder_LSTM_layer = LSTM(120, input_shape=(6,), return_sequences=True)
decoder_LSTM = decoder_LSTM_layer(decoder_input, initial_state=encoder_states)
decoder_LSTM_2_layer = LSTM(120, return_sequences=True, return_state=True)
decoder_LSTM_2, _, _ = decoder_LSTM_2_layer(decoder_LSTM)
# Output layer of the decoder :
decoder_dense = Dense(2, activation='sigmoid')
decoder_outputs = decoder_dense(decoder_LSTM_2)
# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_input, decoder_input], decoder_outputs)
model.summary()
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([final_dataset, labels_train[:, :6]],
          labels_train[:, 6])
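One apparent problem in this version: decoder_input is declared with shape (6,), i.e. 2D batches, but an LSTM layer requires 3D input of shape (samples, timesteps, features). A sketch of a fix, assuming the six known labels are treated as a one-feature sequence:
import numpy as np
from keras.layers import Input, LSTM

# Declare the decoder input as 6 timesteps of 1 feature each...
decoder_input = Input(shape=(6, 1))
decoder_LSTM_layer = LSTM(120, return_sequences=True)
decoder_LSTM = decoder_LSTM_layer(decoder_input, initial_state=encoder_states)
# ...and reshape the label slice to match before fitting:
decoder_input_data = labels_train[:, :6].reshape(-1, 6, 1)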

Related

What is the correct method to construct a many-to-many LSTM model using Keras in Python?

I am trying to make a 3-sequence many-to-many LSTM model, but I am confused about its implementation in Keras. I searched the internet for examples of many-to-many models, but each website gives a different method, which has confused me even more. Which of those is the correct one? I want a model where each of the 3 input timesteps produces a corresponding output.
Some of the methods I found were:
Method 1: an encoder-decoder using RepeatVector
from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed
model = Sequential()
# encoder layer
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))
# repeat vector
model.add(RepeatVector(3))
# decoder layer
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
Method 2: another encoder-decoder, keeping the LSTM states explicitly
from keras.models import Model
from keras.layers import Input, LSTM, Dense
encoder_inputs = Input(shape=(None, 1))
encoder = LSTM(100, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
decoder_inputs = Input(shape=(None, 1))
decoder_lstm = LSTM(100, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')  # num_decoder_tokens: output dimension, must be defined beforehand
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
Method 3: a single LSTM with return_sequences=True and a TimeDistributed Dense layer
model = Sequential()
model.add(LSTM(100, input_shape=(3, 1), return_sequences=True))
model.add(TimeDistributed(Dense(2)))
model.compile(optimizer='adam', loss='mse')
Method 4: a single LSTM with return_sequences=True and no Dense layer
model = Sequential()
model.add(LSTM(100, input_shape=(3, 1), return_sequences=True))
model.compile(optimizer='adam', loss='mse')
Which one of these is the correct method? Which one will give me the model I want?
You have to consider your problem statement first.
Methods 1 and 2 are best for neural machine translation problems, and 2 is superior because it takes the return states of the LSTM layer into account. Method 3 is also a good architecture when the logic from input to output is simple. Method 4 is a very basic architecture: the nth output only has knowledge of inputs 0 to n-1 (not later ones), and there is no fully connected (Dense) layer, so even moderately complex logic cannot be learned.
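If in doubt, a quick sanity check is to build each variant and inspect its output shape with model.summary(): a true many-to-many model ends in a 3D output such as (None, 3, n). For example, for method 3:
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

model = Sequential()
model.add(LSTM(100, input_shape=(3, 1), return_sequences=True))
model.add(TimeDistributed(Dense(2)))
model.summary()  # final output shape is (None, 3, 2): one 2-dim output per timestep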

ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18]

I'm new to Keras and I'm trying to build a model for personal use/future learning. I've just started with Python and came up with this code (with the help of videos and tutorials). I have a dataset of 16324 instances; each instance consists of 18 features and 1 dependent variable.
import pandas as pd
import os
import time
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, BatchNormalization
from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint
EPOCHS = 10
BATCH_SIZE = 64
NAME = f"-TEST-{int(time.time())}"
df = pd.read_csv("EntryData.csv", names=['1SH5', '1SHA', '1SA5', '1SAA', '1WH5', '1WHA', '2SA5', '2SAA', '2SH5', '2SHA', '2WA5', '2WAA', '3R1', '3R2', '3R3', '3R4', '3R5', '3R6', 'Target'])
df_val = 14554
validation_df = df[df.index > df_val]
df = df[df.index <= df_val]
train_x = df.drop(columns=['Target'])
train_y = df[['Target']]
validation_x = validation_df.drop(columns=['Target'])
validation_y = validation_df[['Target']]
model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1:]), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(2, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
tensorboard = TensorBoard(log_dir=f'logs/{NAME}')
filepath = "RNN_Final-{epoch:02d}-{val_acc:.3f}"
checkpoint = ModelCheckpoint("models/{}.model".format(filepath),
                             monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')  # saves only the best ones
history = model.fit(train_x, train_y,
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    validation_data=(validation_x, validation_y),
                    callbacks=[tensorboard, checkpoint])
score = model.evaluate(validation_x, validation_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
model.save("models/{}".format(NAME))
The line
model.add(LSTM(128, input_shape=(train_x.shape[1:]), return_sequences=True))
is throwing this error:
ValueError: Input 0 of layer lstm is incompatible with the layer:
expected ndim=3, found ndim=2. Full shape received: [None, 18]
I have been searching for a solution on this site and on Google for a few hours now, but I was not able to find a proper answer, or I was not able to implement the solutions to similar problems.
Thank you for any tips.
An LSTM network expects three-dimensional input in this format:
(n_samples, time_steps, features)
There are two main ways this can be a problem.
Your input is 2D
You have stacked (multiple) LSTM layers
1. Your input is 2D
You need to reshape your input to 3D.
x = x.reshape(len(x), 1, x.shape[1])
# or
x = np.expand_dims(x, 1)
Then, specify the right input shape in the first layer:
LSTM(64, input_shape=(x.shape[1:]))
2. You have stacked LSTM layers
By default, LSTM layers will not return sequences, i.e., they will return 2D output. This means that the second LSTM layer will not have the 3D input it needs. To address this, set return_sequences=True on the preceding layer:
tf.keras.layers.LSTM(8, return_sequences=True),
tf.keras.layers.LSTM(8)
Here's how to reproduce and solve the 2D input problem:
import tensorflow as tf
import numpy as np
x = np.random.rand(100, 10)
# x = np.expand_dims(x, 1) # uncomment to solve the problem
y = np.random.randint(0, 2, 100)
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(x, y, validation_split=0.1)
Here's how to reproduce and solve the stacked LSTM layers problem:
import tensorflow as tf
import numpy as np
x = np.random.rand(100, 1, 10)
y = np.random.randint(0, 2, 100)
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8),  # use return_sequences=True to solve the problem
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(x, y, validation_split=0.1)
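Applied to the question above: train_x there is a 2D pandas DataFrame, so this is case 1. A sketch of the conversion, assuming each instance is treated as a single timestep:
import numpy as np

# The question's train_x/validation_x are 2D DataFrames: (n_samples, 18).
# Add a singleton timestep axis so the LSTM sees (n_samples, 1, 18).
train_x = np.expand_dims(train_x.values, 1)
validation_x = np.expand_dims(validation_x.values, 1)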

How to use the last hidden layer weights from one pre-trained MLP as input to a new MLP (transfer learning) with Keras?

I want to do transfer learning with simple MLP models. First I train a feed-forward network with one hidden layer on a large dataset:
net = Sequential()
net.add(Dense(500, input_dim=2048, kernel_initializer='normal', activation='relu'))
net.add(Dense(1, kernel_initializer='normal'))
net.compile(loss='mean_absolute_error', optimizer='adam')
net.fit(x_transf, y_transf,
        epochs=1000, batch_size=8, verbose=0)
Then I want to reuse that single hidden layer as the first layer of a new network, to which I want to add a second layer. The reused layer should not be trainable.
idx = 1 # index of desired layer
input_shape = net.layers[idx].get_input_shape_at(0) # get the input shape of desired layer
input_layer = net.layers[idx]
input_layer.trainable = False
transf_model = Sequential()
transf_model.add(input_layer)
transf_model.add(Dense(input_shape[1], activation='relu'))
transf_model.compile(loss='mean_absolute_error', optimizer='adam')
transf_model.fit(x, y,
                 epochs=10, batch_size=8, verbose=0)
EDIT:
The above code returns:
ValueError: Error when checking target: expected dense_9 to have shape (None, 500) but got array with shape (436, 1)
What's the trick to make this work?
I would simply use the Functional API to build such a model:
shared_layer = net.layers[0] # you want the first layer, so index = 0
shared_layer.trainable = False
inp = Input(the_shape_of_one_input_sample) # e.g. (2048,)
x = shared_layer(inp)
x = Dense(800, ...)(x)
out = Dense(1, ...)(x)
model = Model(inp, out)
# the rest is the same...
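A fuller, runnable version of that sketch, with the imports spelled out (the 800-unit size and relu activation are example choices, not taken from the question):
from keras.models import Model
from keras.layers import Input, Dense

shared_layer = net.layers[0]    # the pre-trained 500-unit hidden layer
shared_layer.trainable = False  # freeze the reused layer

inp = Input(shape=(2048,))      # one sample has 2048 features, as in the question
h = shared_layer(inp)           # reuse the frozen layer
h = Dense(800, activation='relu')(h)  # new second hidden layer
out = Dense(1)(h)               # single regression output, matching the original net
model = Model(inp, out)
model.compile(loss='mean_absolute_error', optimizer='adam')
model.fit(x, y, epochs=10, batch_size=8, verbose=0)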

Initial state of an LSTM encoder-decoder in Keras

I am trying to build an LSTM encoder-decoder where my main goal is that the initial state of the decoder is the same as the final state of the encoder. I found the code below from here and tried to adapt it to my case. I have data with shape (1000, 20, 1). I want the encoder-decoder to give me my input back as output. I do not know how to correct the code so that it works, even though I understand the error. When I try to run this, I get the following error:
The model expects 2 input arrays, but only received one array. Found:
array with shape (10000, 20, 1)
from keras.models import Model
from keras.layers import Input
from keras.layers import LSTM
from keras.layers import Dense
from keras.models import Sequential
latent_dim = 128
encoder_inputs = Input(shape=(20,1))
encoder = LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(20, 1))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(1, activation='tanh')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='mse', metrics=['acc', 'mae'])
history=model.fit(xtrain, xtrain, epochs=200, verbose=2, shuffle=False)
I also have this model, but I am not sure how to initialize the decoder state to be the same as the encoder state here. Is the RepeatVector doing this?
#define model
model = Sequential()
model.add(LSTM(100, input_shape=(n_timesteps_in, n_features)))
model.add(RepeatVector(n_timesteps_in))
model.add(LSTM(100, return_sequences=True))
model.add(TimeDistributed(Dense(n_features, activation='tanh')))
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
history=model.fit(train, train, epochs=epochs, verbose=2, shuffle=False)
You are building a model with 2 inputs, namely encoder_inputs and decoder_inputs, but you are only giving one: in .fit(xtrain, xtrain, ...) the second argument is the target output, not an input. You need to pass both inputs as a list, in the form .fit([xtrain, the_inputs_for_decoder], xtrain, ...).
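A minimal sketch of the corrected call for this autoencoder-style goal, simply reusing xtrain as the decoder input (a more standard seq2seq setup would feed the decoder a shifted copy of the target, i.e. teacher forcing):
# Both model inputs go in a list; the target is still xtrain itself.
history = model.fit([xtrain, xtrain], xtrain, epochs=200, verbose=2, shuffle=False)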

How to get values of the penultimate layer in a convolutional neural network (CNN)?

I am trying to implement a CNN for a classification task. I want to see how the weights are being optimized at each epoch. To do so, I need the values of the penultimate layer. Also, I will hard-code the last layer and the backpropagation myself. Please also recommend any APIs that would be helpful.
Edit: I have added code from the Keras examples, which I am looking to adapt. This link provides some hints. I have marked the layer after which I require the output.
from __future__ import print_function
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalMaxPooling1D
from keras.datasets import imdb
# set parameters:
max_features = 5000
maxlen = 400
batch_size = 100
embedding_dims = 50
filters = 250
kernel_size = 3
hidden_dims = 250
epochs = 100
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
                    embedding_dims,
                    input_length=maxlen))
model.add(Dropout(0.2))
# we add a Convolution1D, which will learn
# word group filters of size kernel_size:
model.add(Conv1D(filters,
                 kernel_size,
                 padding='valid',
                 activation='relu',
                 strides=1))
# we use max pooling:
model.add(GlobalMaxPooling1D())
# We add a vanilla hidden layer:
model.add(Dense(hidden_dims))
model.add(Dropout(0.2))
model.add(Activation('relu'))
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1))
model.add(Activation('sigmoid')) #<======== I need output after this.
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))
You can get the individual layers of your model like this:
num_layer = 7 # Dense(1) layer
layer = model.layers[num_layer]
I want to see how the weights are being optimized at each epoch.
To get the weights of the layer use layer.get_weights() like this:
w, b = layer.get_weights() # weights and bias of Dense(1)
I need the values of penultimate layer.
To get the value of the evaluation of the last layer use model.predict():
prediction = model.predict(x_test)
To get the evaluation of any other layer, do it with TensorFlow like this:
input = tf.placeholder(tf.float32)  # create input placeholder
layer_output = layer(input)  # create layer output operation
init_op = tf.global_variables_initializer()  # initialize variables
with tf.Session() as sess:
    sess.run(init_op)
    # evaluate layer output
    output = sess.run(layer_output, feed_dict={input: x_test})
    print(output)
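As an alternative to raw TensorFlow sessions, one option (a sketch using the standard Keras Model API) is to build a second model that shares the trained layers and outputs the tensor of the layer you need:
from keras.models import Model

# Maps the original input to the output of the chosen layer (here Dense(1), index 7).
intermediate_model = Model(inputs=model.input,
                           outputs=model.layers[num_layer].output)
intermediate_output = intermediate_model.predict(x_test)
print(intermediate_output.shape)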
