I have built an LSTM architecture using Keras. My goal is to map length-29 time series input sequences of floats to length-29 output sequences of floats, i.e. a "many-to-many" approach. I followed this post for implementing such a model.
I start by reshaping each data point into an np.array of shape `(1, 29, 1)`. I have multiple data points and train the model on each one separately. The following code is how I build my model:
def build_model():
    # define model
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.LSTM(29, return_sequences=True, input_shape=(29, 1)))
    model.add(tf.keras.layers.LeakyReLU(alpha=0.3))
    model.compile(optimizer='sgd', loss='mse', metrics=['mae'])

    # cast data
    for point in train_dict:
        train_data = train_dict[point]
        train_dataset = tf.data.Dataset.from_tensor_slices((
            tf.cast(train_data[0], features_type),
            tf.cast(train_data[1], target_type))
        ).repeat()  # cast into X, Y

        # fit model
        model.fit(train_dataset, epochs=100, steps_per_epoch=1, verbose=0)

    print(model.summary())
    return model
I am confused because when I call model.predict(test_point, steps=1, verbose=1), the model returns 29 sequences of length 29! Based on my understanding of the linked post, I don't understand why this is happening. When I try return_state=True instead of return_sequences=True, my code raises this error: ValueError: All layers in a Sequential model should have a single output tensor. For multi-output layers, use the functional API.
How do I solve the problem?
Your model has a few flaws.
The last layer of your model is an LSTM. Assuming you're doing either classification or regression, it should be followed by a Dense layer (softmax/sigmoid for classification, linear for regression). And since this is a time-series problem with an output at every timestep, that Dense layer should be wrapped in a TimeDistributed wrapper.
It's also odd to apply a LeakyReLU directly on top of the LSTM output.
I've fixed the code to address the above issues. See if that helps.
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed
from tensorflow.keras.models import Sequential

def build_model():
    # define model
    model = Sequential()
    model.add(LSTM(29, return_sequences=True, input_shape=(29, 1)))
    model.add(TimeDistributed(Dense(1)))
    model.compile(optimizer='sgd', loss='mse', metrics=['mae'])
    print(model.summary())
    return model
model = build_model()
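As a quick sanity check of the fixed model's output shape (the random test point below is only an illustration, not your actual data):

import numpy as np

# A single test point with the same shape as the training data: (batch, timesteps, features)
test_point = np.random.rand(1, 29, 1).astype('float32')

prediction = model.predict(test_point)
print(prediction.shape)  # (1, 29, 1): one length-29 output sequence, one float per timestep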
I have a dataset of size 273985 x 5 that I'm treating as a path prediction problem. I chose an LSTM inspired by this paper: https://ieeexplore.ieee.org/abstract/document/9225479
Here is my baseline implementation:
# lstm autoencoder recreate sequence
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.callbacks import EarlyStopping
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
my_sequence = np.array(sample)
# reshape input into [samples, timesteps, features]
n_in = len(my_sequence)
my_sequence = my_sequence.reshape((1, n_in, 5))
# define model
model = Sequential()
model.add(LSTM(10, activation='sigmoid', input_shape=(n_in,5)))
model.add(RepeatVector(n_in))
model.add(LSTM(10, activation='sigmoid', return_sequences=True))
model.add(TimeDistributed(Dense(5)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(my_sequence, my_sequence, epochs=300, verbose=0)
# structure of the model and the layers
plot_model(model, show_shapes=True, to_file=path)
# demonstrate recreation
predicted = model.predict(my_sequence, verbose=0)
print(predicted)
print(my_sequence)
Right now I am choosing my training sample by hand, but I want to train on my entire dataset much like bootstrapping: train on 1-50 and predict the next 50, train on 2-50 and predict the next 50, and so on until the end of the test set, then compare my predictions against the actual values.
Would this be done via batching the data or k-fold validation? Also, how would one go about it, and how would one calculate an appropriate evaluation metric?
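Here is a rough sketch of the kind of sliding-window split I have in mind; data stands in for my full 273985 x 5 array and the window lengths are placeholders:

import numpy as np

def sliding_windows(data, train_len=50, pred_len=50):
    # Yields (training window, following window to predict) pairs, sliding one row at a time
    for start in range(0, len(data) - train_len - pred_len + 1):
        train_window = data[start : start + train_len]
        target_window = data[start + train_len : start + train_len + pred_len]
        yield train_window, target_window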
Thank you!
I'm trying to extract features from a pretrained model and use them in my own model. I can successfully instantiate the Inception V3 model and save the outputs to use as inputs for my model, but when I try to use it I get an error. I tried deleting the Flatten layer, but it looks like that isn't the problem. I think the problem is with last_output, but I have no clue how to solve it.
The code:
#%% Imports.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import layers, Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
import os, signal
import numpy as np
#%% Instatiate an Inception V3 model
url = "https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5" # Get the weights from the pretrained model
local_weights_file = tf.keras.utils.get_file("inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5", origin = url, extract = True)
# With include_top=False we load a network that doesn't include the classification
# layers at the top, which is ideal for feature extraction.
pre_trained_model = InceptionV3(input_shape=(150, 150, 3), include_top=False, weights=None)
pre_trained_model.load_weights(local_weights_file)
# Make the model non-trainable, since we will only use it for feature extraction; we won't update the weights of the pretrained model during training.
for layers in pre_trained_model.layers:
layers.trainable = False
# The layer we will use for feature extraction in Inception v3 is called mixed7. It is not
# the bottleneck of the network, but we are using it to keep a sufficiently large feature
# map (7x7 in this case). Using the bottleneck layer would have resulted in a 3x3 feature
# map, which is a bit small.
last_layer = pre_trained_model.get_layer('mixed7')
print('last layer output shape:', last_layer.output_shape)
last_output = last_layer.output
print(last_output)
# %% Stick a fully connected classifier on top of last_output
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
# Configure and compile the model
model = Model(pre_trained_model.input, x)
model.compile(loss='binary_crossentropy',
optimizer=RMSprop(lr=0.0001),
metrics=['acc'])
the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
c:\Users\jpaul\Code\Google_ML_Crash_Course\02_Practica\02_Image_Classification\image_classification_part3.py in
39 # Flatten the output layer to 1 dimension
----> 40 x = layers.Flatten()(last_output)
41
42 # Add a fully connected layer with 1,024 hidden units and ReLU activation
43 x = layers.Dense(1024, activation='relu')(x)
AttributeError: 'Concatenate' object has no attribute 'Flatten'
In your for loop, you overwrote the layers identifier that comes from the import statement
from tensorflow.keras import layers
So when you try to create a new Flatten() layer, the identifier layers refers to a Concatenate layer object rather than the Keras layers module you were expecting.
Change the variable name in your for loop and you should be good.
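For example, renaming the loop variable keeps the layers module intact:

# Freeze the pretrained model without shadowing the imported `layers` module
for layer in pre_trained_model.layers:
    layer.trainable = False

# `layers` still refers to tensorflow.keras.layers, so this now works
x = layers.Flatten()(last_output)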
I'm newish to Keras and am constructing a neural net with two dense layers. There's too much data to hold in memory, so I'm using the fit_generator function, but I get the error ValueError: No data provided for "dense_2". Need data for each key in: ['dense_2']. Small example below:
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

model = Sequential([
    Dense(100, input_shape=(1924800,), activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

def generate_arrays_from_files(path, batch_size=50):
    while True:
        # Do things....
        yield ({'dense_1_input': np.asarray(outdata)}, {'output': np.asarray(outlabels)})

model.fit_generator(generate_arrays_from_files(path), steps_per_epoch=5, epochs=10)
Edit: forgot the compile line
You don't need to specify the layer names in what the generator yields, and you obviously don't need to pass data to the second dense layer. Note that it's better to use a Keras generator; you can create a custom one like this or use a standard one.
You also need to compile your model.
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
import numpy as np

model = Sequential([
    Dense(100, input_shape=(1924800,), activation='relu'),
    Dense(1, activation='sigmoid')
])

optimizer = Adam(lr=1e-3)
model.compile(loss='binary_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

def generate_arrays_from_files(path, batch_size=50):
    while True:
        # Do things....
        yield np.asarray(outdata), np.asarray(outlabels)

model.fit_generator(generate_arrays_from_files(path), steps_per_epoch=5, epochs=10)
Is it normal to feed a vector of (1924800,) to the model by the way?
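As an illustration of the custom-generator option mentioned above, here is a minimal sketch based on keras.utils.Sequence; the file list and the load_batch_from_files helper are hypothetical placeholders for your own loading code:

import numpy as np
from keras.utils import Sequence

class FileBatchSequence(Sequence):
    """Yields (features, labels) batches; the loading logic below is a placeholder."""

    def __init__(self, file_paths, batch_size=50):
        self.file_paths = file_paths
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.file_paths) / self.batch_size))

    def __getitem__(self, idx):
        batch_paths = self.file_paths[idx * self.batch_size:(idx + 1) * self.batch_size]
        # Replace this with your actual file-reading code
        outdata, outlabels = load_batch_from_files(batch_paths)
        return np.asarray(outdata), np.asarray(outlabels)

# model.fit_generator(FileBatchSequence(file_paths), epochs=10)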
I trained a many-to-many sequence model in Keras with return_sequences=True and a TimeDistributed wrapper on the last Dense layer:
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=50))
model.add(LSTM(100, return_sequences=True))
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))
# train...
model.save_weights("weights.h5")
So during training the loss is calculated over all hidden states (at every timestep). But for inference I only need the output at the last timestep, so I load the weights into a many-to-one sequence model without the TimeDistributed wrapper and set return_sequences=False to get only the last output of the LSTM layer:
inference_model = Sequential()
inference_model.add(Embedding(input_dim=vocab_size, output_dim=50))
inference_model.add(LSTM(100, return_sequences=False))
inference_model.add(Dense(vocab_size, activation='softmax'))
inference_model.load_weights("weights.h5")
When I test my inference model on a sequence of length 20, I expect to get a prediction with shape (vocab_size), but inference_model.predict(...) still returns predictions for every timestep: a tensor of shape (20, vocab_size).
If, for whatever reason, you need only the last timestep during inference, you can build a new model that applies the trained model to the input and returns the last timestep as its output, using a Lambda layer:
from keras.models import Model
from keras.layers import Input, Lambda
inp = Input(shape=put_the_input_shape_here)
x = model(inp) # apply trained model on the input
out = Lambda(lambda x: x[:,-1])(x)
inference_model = Model(inp, out)
Side note: as already stated in this answer, TimeDistributed(Dense(...)) and Dense(...) are equivalent, since the Dense layer is applied on the last dimension of its input tensor. That is why you get the same output shape.
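A tiny sketch that illustrates this equivalence; the timestep, feature, and output sizes below are made up purely for illustration:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# Two toy models that differ only in whether Dense is wrapped in TimeDistributed
m1 = Sequential([LSTM(100, return_sequences=True, input_shape=(20, 50)),
                 TimeDistributed(Dense(30, activation='softmax'))])
m2 = Sequential([LSTM(100, return_sequences=True, input_shape=(20, 50)),
                 Dense(30, activation='softmax')])

print(m1.output_shape)  # (None, 20, 30)
print(m2.output_shape)  # (None, 20, 30) - same shape: Dense acts on the last dimension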
I have a multi-input Keras neural network where I want to calculate the gradient of the output with respect to one of the intermediate layers. The layer I chose is inside one of the two Sequential models, each of which handles one of the two inputs to the model.
But when I try to calculate the gradient with respect to this layer, I get [None] as the output. I am able to take the gradient for other layers, such as the concatenate layer, but not for any layers within my two branches. Is it possible to take the gradient of this layer given that it sits inside a Sequential wrapper in my model?
Here is some code that shows what I'm trying to do.
from keras.models import Sequential, Model
from keras.layers import Dense, Input, concatenate
import keras.backend as K
# Init branch 1
branch1 = Sequential()
branch1.add(Dense(64, activation='relu', input_shape=(1000,)))
branch1.add(Dense(32, activation='relu'))
# Init branch 2
branch2 = Sequential()
branch2.add(Dense(16, activation='relu', input_shape=(500,)))
branch2.add(Dense(8, activation='relu'))
branch1_input = Input(shape=(1000,))
branch1_out = branch1(branch1_input)
branch2_input = Input(shape=(500,))
branch2_out = branch2(branch2_input)
# Combine
x = concatenate([branch1_out, branch2_out])
out = Dense(1)(x)
# Create model
model = Model(inputs=[branch1_input, branch2_input], outputs=out)
layer = model.get_layer('sequential_1').get_layer('dense_2')
grads = K.gradients(model.output, layer.output)
print(grads) # prints out [None]
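A possible explanation, and a sketch of one workaround rather than a definitive fix: the layers inside the nested Sequential keep their .output tensors in the Sequential's own internal graph, so those tensors are not on the computational path from your Input tensors to model.output, and K.gradients has nothing to differentiate. Building the branches with the functional API keeps every intermediate tensor in the same graph:

from keras.models import Model
from keras.layers import Dense, Input, concatenate
import keras.backend as K

# Branch 1, built with the functional API so intermediate tensors stay reachable
branch1_input = Input(shape=(1000,))
branch1_hidden = Dense(64, activation='relu')(branch1_input)
branch1_out = Dense(32, activation='relu')(branch1_hidden)

# Branch 2
branch2_input = Input(shape=(500,))
branch2_hidden = Dense(16, activation='relu')(branch2_input)
branch2_out = Dense(8, activation='relu')(branch2_hidden)

# Combine
x = concatenate([branch1_out, branch2_out])
out = Dense(1)(x)
model = Model(inputs=[branch1_input, branch2_input], outputs=out)

# Gradient of the model output with respect to an intermediate tensor of branch 1
grads = K.gradients(model.output, branch1_out)
print(grads)  # a gradient tensor instead of [None]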