I have a dataset of size 273985 x 5 that I'm training as a path prediction problem. I chose an LSTM inspired by this paper: https://ieeexplore.ieee.org/abstract/document/9225479
I have a baseline implementation as such:
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.callbacks import EarlyStopping
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
my_sequence = np.array(sample)
# reshape input into [samples, timesteps, features]
n_in = len(my_sequence)
my_sequence = my_sequence.reshape((1, n_in, 5))
# define model
model = Sequential()
model.add(LSTM(10, activation='sigmoid', input_shape=(n_in,5)))
model.add(RepeatVector(n_in))
model.add(LSTM(10, activation='sigmoid', return_sequences=True))
model.add(TimeDistributed(Dense(5)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(my_sequence, my_sequence, epochs=300, verbose=0)
# structure of the model and the layers
plot_model(model, show_shapes=True, to_file=path)
# demonstrate recreation
predicted = model.predict(my_sequence, verbose=0)
print(predicted)
print(my_sequence)
Right now, I am choosing my training sample by hand but I want to train my entire dataset much like bootstrapping where I train 1-50, predict the next 50; train 2-50, predict the next 50… until the end of the test set then compare my prediction against the actual values.
Would this be done via batching the data or k-fold validation? Also, how would one go about it or calculate the appropriate evaluation metric?
Thank you!
Related
Newish to Keras and constructing a neural net with two dense layers. There's too much data to hold in memory, so I'm using the fit_generator function, but get the error ValueError: No data provided for "dense_2". Need data for each key in: ['dense_2']. Small example below:
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
model = Sequential([
Dense(100, input_shape=(1924800,), activation='relu'),
Dense(1, activation='sigmoid')
])
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
def generate_arrays_from_files(path, batch_size=50):
while True:
# Do things....
yield ({'dense_1_input': np.asarray(outdata)}, {'output': np.asarray(outlabels)})
model.fit_generator(generate_arrays_from_files(path), steps_per_epoch=5, epochs=10)
Edit: forgot the compile line
You don't need to specify the layer in the input, and you obviously won't need to pass data to the second dense layer. Note that it's better to use a Keras generator, you can create a custom one like this or use a standard one.
You also need to compile your model.
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
model = Sequential([
Dense(100, input_shape=(1924800,), activation='relu'),
Dense(1, activation='sigmoid')
])
optimizer = keras.optimizers.Adam(lr=1e-3)
model.compile(loss='binary_crossentropy',
optimizer=optimizer,
metrics=['accuracy'])
def generate_arrays_from_files(path, batch_size=50):
while True:
# Do things....
yield np.asarray(outdata), np.asarray(outlabels)
model.fit_generator(generate_arrays_from_files(path), steps_per_epoch=5, epochs=10)
Is it normal to feed a vector of (1924800,) to the model by the way?
I have a built a LSTM architecture using Keras. My goal is to map length 29 time series input sequences of floats to length 29 output sequences of floats. I am trying to implement a "many-to-many" approach. I followed this post for implementing such a model.
I start by reshaping each data point into an np.array of shape `(1, 29, 1). I have multiple data points and train the model on each one separately. The following code is how I build my model:
def build_model():
# define model
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(29, return_sequences=True, input_shape=(29, 1)))
model.add(tf.keras.layers.LeakyReLU(alpha=0.3))
model.compile(optimizer='sgd', loss='mse', metrics = ['mae'])
#cast data
for point in train_dict:
train_data = train_dict[point]
train_dataset = tf.data.Dataset.from_tensor_slices((
tf.cast(train_data[0], features_type),
tf.cast(train_data[1], target_type))
).repeat() #cast into X, Y
# fit model
model.fit(train_dataset, epochs=100,steps_per_epoch = 1,verbose=0)
print(model.summary())
return model
I am confused because when I call model.predict(test_point, steps = 1, verbose = 1) the model returns 29 length 29 sequences! I don't understand why this is happening, based on my understanding from the linked post. When I try return_state=True instead of return_sequences=True then my code raises this error: ValueError: All layers in a Sequential model should have a single output tensor. For multi-output layers, use the functional API.
How do I solve the problem?
Your model has few flaws.
The last layer of your model is an LSTM. Assuming you're doing either classification / regression. This should be followed by a Dense layer (SoftMax/sigmoid - classification, linear - regression). But since this is a time-series problem, dense layer should be wrapped in a TimeDistributed wrapper.
It's odd to apply a LeakyReLU on top of the LSTM.
I've fixed the code with fixes for above issues. See if that helps.
from tensorflow.keras.layers import Embedding, Input, Bidirectional, LSTM, Dense, Concatenate, LeakyReLU, TimeDistributed
from tensorflow.keras.initializers import Constant
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
def build_model():
# define model
model = Sequential()
model.add(LSTM(29, return_sequences=True, input_shape=(29, 1)))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='sgd', loss='mse', metrics = ['mae'])
print(model.summary())
return model
model = build_model()
Keras prediction returns the same value everywhere.
I have some xyz data that I want to predict in a regular grid, using keras ML.
I am using something wrong and can't figure it out.
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adadelta, Adam
m=1e5
data=np.random.rand(m,3) # let's generate some random data (i do actually have real data that make sense)
dx=0.05
xmin=np.min(data[:,0])
xmax=np.max(data[:,0])
ymin=np.min(data[:,1])
ymax=np.max(data[:,1])
xs=np.arange(xmin,xmax+dx,dx)
ys=np.arange(ymin,ymax+dx,dx)
xg,yg=np.meshgrid(xs,ys)
shape = (len(ys), len(xs))
activation='sigmoid'
hidden_layer_sizes=[128, 64, 32, 16]
keras_model = Sequential()
keras_model.add(Dense(hidden_layer_sizes[0], activation=activation, input_shape=(2, )))
for hl_size in hidden_layer_sizes[1: ]:
keras_model.add(Dense(hl_size, activation=activation))
keras_model.add(Dense(1))
keras_model.compile(loss='mean_squared_error', optimizer=Adam())
keras_model.save_weights('cache.h5')
keras_model.summary()
keras_model.load_weights('cache.h5') # re-initialize Keras model weights
keras_history = keras_model.fit(data[:,:2], data[:,2], batch_size=m, epochs=20000, verbose=1)
X_test = np.vstack((xg.flatten(), yg.flatten())).T
res_keras=keras_model.predict(X_test).reshape(shape)
I am expecting some values "close" to an interpolation function.
Where is the mistake in my code?
change you activation from sigmoid to relu
set
activation='relu'
I'm just trying to explore keras and tensorflow with the famous MNIST dataset.
I already applied some basic neural networks, but when it comes to tuning some hyperparameters, especially the number of layers, thanks to the sklearn wrapper GridSearchCV, I get the error below:
Parameter values for parameter (hidden_layers) need to be a sequence(but not a string) or np.ndarray.
So you can have a better view I post the main parts of my code.
Data preparation
# Extract label
X_train=train.drop(labels = ["label"],axis = 1,inplace=False)
Y_train=train['label']
del train
# Reshape to fit MLP
X_train = X_train.values.reshape(X_train.shape[0],784).astype('float32')
X_train = X_train / 255
# Label format
from keras.utils import np_utils
Y_train = keras.utils.to_categorical(Y_train, num_classes = 10)
num_classes = Y_train.shape[1]
Keras part
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
# Function with hyperparameters to optimize
def create_model(optimizer='adam', activation = 'sigmoid', hidden_layers=2):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(32, activation=activation, input_shape=784))
for i in range(hidden_layers):
# Add one hidden layer
model.add(Dense(16, activation=activation))
# Add an output layer
model.add(Dense(num_classes, activation='softmax'))
#compile model
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
# Model which will be the input for the GridSearchCV function
modelCV = KerasClassifier(build_fn=create_model, verbose=0)
GridSearchCV
from keras.activations import relu, sigmoid
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import Dropout
from keras.utils import np_utils
activations = [sigmoid, relu]
param_grid = dict(hidden_layers=3,activation=activations, batch_size = [256], epochs=[30])
grid = GridSearchCV(estimator=modelCV, param_grid=param_grid, scoring='accuracy')
grid_result = grid.fit(X_train, Y_train)
I just want to let you know that the same kind of question has already been asked here Grid Search the number of hidden layers with keras but the answer is not complete at all and I can't add a comment to reply to the answerer.
Thank you!
You should add:
for i in range(int(hidden_layers)):
# Add one hidden layer
model.add(Dense(16, activation=activation))
Try to add the values of param_grid as lists :
params_grid={"hidden_layers": [3]}
When you are setting your parameter hidden layer =2 it goes as a string thus an error it throw.
Ideally it should a sequence to run the code that's what you error says
I'm trying to train a deep classifier in Keras both with and without pretraining of the hidden layers via stacked autoencoders. My problem is that the pretraining seems to drastically degrade performance (i.e. if pretrain is set to False in the code below the training error of the final classification layer converges much faster). This seems completely outrageous to me given that pretraining should only initialize the weights of the hidden layers and I don't see how that could completely kill the models performance even if that initialization does not work very well. I can not include the specific dataset I used but the effect should occur for any appropriate dataset (e.g. minist). What is going on here and how can I fix it?
EDIT: code is now reproducible with the MNIST data, final line prints change in loss function, which is significantly lower with pre-training.
I have also slightly modified the code and added sample learning curves below:
from functools import partial
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD
from keras.regularizers import l2
from keras.utils import to_categorical
(inputs_train, targets_train), _ = mnist.load_data()
inputs_train = inputs_train[:1000].reshape(1000, 784)
targets_train = to_categorical(targets_train[:1000])
hidden_nodes = [256] * 4
learning_rate = 0.01
regularization = 1e-6
epochs = 30
def train_model(pretrain):
model = Sequential()
layer = partial(Dense,
activation='sigmoid',
kernel_initializer='random_normal',
kernel_regularizer=l2(regularization))
for i, hn in enumerate(hidden_nodes):
kwargs = dict(units=hn, name='hidden_{}'.format(i + 1))
if i == 0:
kwargs['input_dim'] = inputs_train.shape[1]
model.add(layer(**kwargs))
if pretrain:
# train autoencoders
inputs_train_ = inputs_train.copy()
for i, hn in enumerate(hidden_nodes):
autoencoder = Sequential()
autoencoder.add(layer(units=hn,
input_dim=inputs_train_.shape[1],
name='hidden'))
autoencoder.add(layer(units=inputs_train_.shape[1],
name='decode'))
autoencoder.compile(optimizer=SGD(lr=learning_rate, momentum=0.9),
loss='binary_crossentropy')
autoencoder.fit(
inputs_train_,
inputs_train_,
batch_size=32,
epochs=epochs,
verbose=0)
autoencoder.pop()
model.layers[i].set_weights(autoencoder.layers[0].get_weights())
inputs_train_ = autoencoder.predict(inputs_train_)
num_classes = targets_train.shape[1]
model.add(Dense(units=num_classes,
activation='softmax',
name='classify'))
model.compile(optimizer=SGD(lr=learning_rate, momentum=0.9),
loss='categorical_crossentropy')
h = model.fit(
inputs_train,
targets_train,
batch_size=32,
epochs=epochs,
verbose=0)
return h.history['loss']
plt.plot(train_model(pretrain=False), label="Without Pre-Training")
plt.plot(train_model(pretrain=True), label="With Pre-Training")
plt.xlabel("Epoch")
plt.ylabel("Cross-Entropy")
plt.legend()
plt.show()