Why is the LSTM prediction shape not as expected? - python

I have a time series dataset with a timestamp and one value (2 columns in total), and I am training an LSTM to predict the value for each hour.
I prepare the dataset for the model as follows:
take the 5 previous values as X and the observed value for that hour as y.
Then I split X and y into train and test sets.
After scaling with a min-max scaler, the train and test sets have the following shapes:
print(train_X.shape,train_y.shape,test_X.shape,test_y.shape)
(16195, 5) (16195,) (8716, 5) (8716,)
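(A minimal sketch of this kind of windowing, since the exact preparation code is not shown above; the helper name and the toy series below are illustrative only:)
import numpy as np

def make_windows(values, n_steps=5):
    # Each sample is the n_steps previous values; the target is the next value.
    X, y = [], []
    for i in range(len(values) - n_steps):
        X.append(values[i:i + n_steps])
        y.append(values[i + n_steps])
    return np.array(X), np.array(y)

series = np.arange(20, dtype=float)  # stand-in for the scaled value column
X, y = make_windows(series)
print(X.shape, y.shape)  # (15, 5) (15,)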
Then I build the model with:
model = Sequential()
model.add(LSTM(5, input_shape=(n_steps,n_features),recurrent_dropout=0.2,return_sequences=True))
model.add(BatchNormalization())
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
I fit the model and predict with:
history = model.fit(train_X, train_y, epochs=10, batch_size=64,validation_data=(test_X, test_y),shuffle=False)
#predict the instances
predicted = model.predict(test_X)
The predictions now have shape (8716, 5, 1),
which I think is incorrect, because the prediction should have the same shape as test_y, i.e. (8716,).
So when I reshape to inverse-scale:
predicted = predicted.reshape(predicted.shape[0], -1).reshape(-1, 1)
inverse_predictions = scaler_y.inverse_transform(predicted)
this gives shape (43580, 1), which is wrong because predicted has dimensions (8716, 5, 1) instead of (8716,).
I am not sure which part is causing the error. Any help is appreciated.

You can delete return_sequences=True; that should fix the issue.
Alternatively, you could use a flattening layer, but I don't think that is what you want to do here.
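For example, a minimal sketch of the corrected model, assuming n_steps=5 and n_features=1, and that train_X/test_X from the question are reshaped to the 3D (samples, timesteps, features) layout Keras LSTMs expect:
from keras.models import Sequential
from keras.layers import LSTM, Dense, BatchNormalization

n_steps, n_features = 5, 1
train_X3d = train_X.reshape(-1, n_steps, n_features)
test_X3d = test_X.reshape(-1, n_steps, n_features)

model = Sequential()
# return_sequences defaults to False: one output vector per sample
model.add(LSTM(5, input_shape=(n_steps, n_features), recurrent_dropout=0.2))
model.add(BatchNormalization())
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(train_X3d, train_y, epochs=10, batch_size=64,
          validation_data=(test_X3d, test_y), shuffle=False)

predicted = model.predict(test_X3d)  # shape (8716, 1)
inverse_predictions = scaler_y.inverse_transform(predicted)
With a 2D output of (8716, 1), the inverse scaling works without any extra reshaping.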

Related

predict_classes returning a conflicting shape for an LSTM classification model

I have quite a bit of trouble understanding the expected shape of the input/output for an LSTM problem.
Specifically, for this example I have 386 sequences of length 100, each containing 14 features. For each such sequence, I only need to predict whether it is in class 0 or class 1. The respective shapes and model are:
X_test.shape,y_test.shape
((358, 100, 14), (358, 1))
model = Sequential()
model.add(LSTM(64,return_sequences=True,input_shape=(None,14)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy' , metrics=['accuracy'])
Now if (after fitting) I want to predict the output of the model, the shape of the prediction is inconsistent with y_test!
y_pred = model.predict_classes(X_test)
y_pred.shape
(358, 100, 1)
Here I'd expect the shape to match y_test and be (358, 1), instead of the output given by predict_classes().
I am clearly misunderstanding something. What am I missing? Is there a different way to tackle this problem altogether?
With return_sequences=True, the LSTM returns a 3D tensor (one output per timestep), so the input to the last sigmoid layer is 3D and the sigmoid is applied along the last dimension of every timestep.
Just do the following:
model.add(LSTM(64,return_sequences=False,input_shape=(None,14)))
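As a sketch, the full model with that change (an optimizer is added to the compile call, which the snippet in the question omits):
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# return_sequences=False collapses the sequence into a single output per sample
model.add(LSTM(64, return_sequences=False, input_shape=(None, 14)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# After fitting, model.predict_classes(X_test) has shape (358, 1), matching y_test.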

Sliding window approach for DNN

I'm trying to implement the sliding window approach and use a DNN for the forecasting part. The window length is 24.
What I did:
I have x (input) and y (output) in the dataset. I kept the y values as they are (a single array), and on the x values:
def generate_input(data, sequence_length=1):
    x_data = []
    for i in range(len(data) - sequence_length + 1):
        a = data[i:(i + sequence_length)]
        x_data.append(a)
    return np.array(x_data)
sequence_length = 24
x_train = generate_input(train, sequence_length)
#Shape of X train: (201389, 24)
#Shape of y train: (201412,)
model = Sequential()
model.add(Dense(30, input_shape=(x_train.shape[1],)))
model.add(Dense(20))
model.add(Dropout(0.2))
model.compile(loss="mse", optimizer='rmsprop')
model.summary()
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
          validation_split=0.1)
The error message I'm receiving:
Error when checking target: expected dropout_5 to have shape (20,) but got array with shape (1,)
One more question: how can I use the same approach for multivariate time series? I want to use sequences as input to predict y.
I changed the slicing part to:
x_data.append(data[i:i+sequence_length])
But I received an error:
cannot copy sequence with size 24 to array axis with dimension 4
model.summary() should show you that the output layer of your model is the Dropout layer, with a shape of (None, 20). That is probably not what you want. It seems you are trying to predict a single value, so you need to add a Dense(1) layer after it. It is also highly unusual to have dropout as an output layer.
Also, x_train and y_train should have the same shape[0].
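A minimal sketch of the fix; the alignment of y_train below is hypothetical and assumes each 24-step window predicts the series value at its final timestep (adjust it to however you define the target):
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Align the targets with the windows so shape[0] matches (201389 rows each);
# assumes the target of window i is the value at the window's last step.
y_train_aligned = y_train[sequence_length - 1:]

model = Sequential()
model.add(Dense(30, activation='relu', input_shape=(x_train.shape[1],)))
model.add(Dense(20, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))  # single regression output instead of ending on Dropout
model.compile(loss='mse', optimizer='rmsprop')
model.fit(x_train, y_train_aligned, batch_size=128, epochs=10,
          validation_split=0.1)
For the multivariate case, slice data[i:i+sequence_length, :] over a 2D array so each window keeps all feature columns, then either flatten each window for a Dense network or feed the 3D (samples, timesteps, features) array to a recurrent layer.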

PCA Implementation on a Convolutional Neural Network

After applying PCA to the MNIST data, I defined a CNN model and its layers. After fitting the CNN model on (X_train_pca, Y_train), I run into a dimension problem at the evaluation phase. Here is the message:
"ValueError: Error when checking input: expected conv2d_1_input to have shape (1, 10, 10) but got array with shape (1, 28, 28)". When I try to reshape X_test into 10x10 format, I get a very low score.
First I applied min-max scaling, and then PCA, to X_train. Then I produced validation data from X_train. The problem is: I can fit the data in the 100-dimensional format (after applying PCA), where my input becomes 10x10, but when I try to get a score from the fitted model using X_test, which is still (10000, 1, 28, 28), I get the error mentioned above. How can I solve this dimension problem? I also tried transforming X_test with MinMaxScaler and PCA, with no change in score.
pca_3D = PCA(n_components=100)
X_train_pca = pca_3D.fit_transform(X_train)
X_train_pca.shape
cnn_model_1_scores = cnn_model_1.evaluate(X_test, Y_test, verbose=0)
# Split the data into training, validation and test sets
X_train1 = X_pca_proj_3D[:train_size]
X_valid = X_pca_proj_3D[train_size:]
Y_train1 = Y_train[:train_size]
Y_valid = Y_train[train_size:]
# We need to convert the input into (samples, channels, rows, cols) format
X_train1 = X_train1.reshape(X_train1.shape[0], 1, 10, 10).astype('float32')
X_valid = X_valid.reshape(X_valid.shape[0], 1, 10, 10).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
X_train1.shape, X_valid.shape, X_test.shape
((51000, 1, 10, 10), (9000, 1, 10, 10), (10000, 1, 28, 28))
#create model
cnn_model_1 = Sequential()
#1st convolutional layer
cnn_model_1.add(Conv2D(32, kernel_size=(5,5),
                       data_format="channels_first",
                       input_shape=(1,10,10),
                       activation='relu'))
#Max-Pooling
cnn_model_1.add(MaxPooling2D(pool_size=(2,2)))
#Max pooling is a sample-based discretization process. The objective is to
#down-sample an input representation (image, hidden-layer output matrix,
#etc.), reducing its dimensionality
#the number of channels remains unchanged in the pooling operation
#cnn_model_1.add(BatchNormalization())
#Flatten
cnn_model_1.add(Flatten())
#cnn_model_1.add(BatchNormalization())
#2nd Dense Layer
cnn_model_1.add(Dense(128, activation='relu'))
#final softmax layer
cnn_model_1.add(Dense(10, activation='softmax'))
#print a summary and check if you created the network you intended
cnn_model_1.summary()
#Compile Model
cnn_model_1.compile(loss='categorical_crossentropy', optimizer='adam',
                    metrics=['accuracy'])
#Fit the model
cnn_model_1_history = cnn_model_1.fit(X_train1, Y_train1,
                                      validation_data=(X_valid, Y_valid),
                                      epochs=5, batch_size=100, verbose=2)
#Final evaluation of the model
cnn_model_1_scores = cnn_model_1.evaluate(X_test, Y_test, verbose=0)
print("Baseline Test Accuracy={0:.2f}% (categorical_crossentropy) loss={1:.2f}"
      .format(cnn_model_1_scores[1]*100, cnn_model_1_scores[0]))
cnn_model_1_scores
I solved the problem; I am updating the post to give other coders some intuition for debugging their code. First, I had applied PCA to the X_test data, and after getting a low score I tried without applying it. As #Scott suggested, this was wrong. After carefully checking my code, I saw that I had forgotten to change X_test to X_test_pca after applying PCA to the test data when evaluating the CNN model. The PCA also has to be fitted on X_train and only used to transform the X_test data.
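A sketch of the corrected flow, assuming the usual MNIST shapes ((60000, 28, 28) train, (10000, 28, 28) test); the key point is that the scaler and PCA are fitted on the training data only and merely applied to the test data:
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
X_train_flat = X_train.reshape(X_train.shape[0], -1)   # (60000, 784)
X_test_flat = X_test.reshape(X_test.shape[0], -1)      # (10000, 784)

pca_3D = PCA(n_components=100)
X_train_pca = pca_3D.fit_transform(scaler.fit_transform(X_train_flat))
X_test_pca = pca_3D.transform(scaler.transform(X_test_flat))  # transform only

# Both sets now live in the same 100-dimensional PCA space -> (samples, 1, 10, 10)
X_train_pca = X_train_pca.reshape(-1, 1, 10, 10).astype('float32')
X_test_pca = X_test_pca.reshape(-1, 1, 10, 10).astype('float32')

# Evaluate on the transformed test set, not the raw 28x28 images
cnn_model_1_scores = cnn_model_1.evaluate(X_test_pca, Y_test, verbose=0)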

Getting dimensions right for a single layer keras LSTM

I am having a hard time getting the dimensions of an LSTM network right.
So I have the following data:
train_data.shape
(25391, 3) # to be read as 25391 timesteps and 3 features
train_labels.shape
(25391, 1) # to be read as 25391 timesteps and 1 feature
So I thought my input dimensions should be (1, len(train_data), train_data.shape[1]), as I plan to submit 1 batch. But I get the following error:
Error when checking target: expected lstm_10 to have 2 dimensions, but got array with shape (1, 25391, 1)
Here is the model code:
model = Sequential()
model.add(LSTM(1,  # predict one feature and one timestep
               batch_input_shape=(1, len(train_data), train_data.shape[1]),
               activation='tanh',
               return_sequences=False))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
# as 1 sample with len(train_data) time steps and train_data.shape[1] features.
model.fit(x=train_data.values.reshape(1, len(train_data), train_data.shape[1]),
          y=train_labels.values.reshape(1, len(train_labels), train_labels.shape[1]),
          epochs=1,
          verbose=1,
          validation_split=0.8,
          validation_data=None,
          shuffle=False)
What should the input dimensions look like?
The problem is in the target (i.e. labels) shape you provide (hence "Error when checking target"). The output of the LSTM layer in your model, which is also the output of the model, has a shape of (None, 1), since you are specifying that only the final output be returned (i.e. return_sequences=False). In order to have the output of each timestep, you need to set return_sequences=True. This way the output shape of the LSTM layer will be (None, num_timesteps, num_units), which is consistent with the shape of the labels array you provide.
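A sketch with that change (the loss is also swapped to mean squared error, since categorical cross-entropy over a single unit is not meaningful here; shapes assume the 1-sample batch from the question):
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(1,
               batch_input_shape=(1, len(train_data), train_data.shape[1]),
               activation='tanh',
               return_sequences=True))  # one output value per timestep
model.compile(loss='mean_squared_error', optimizer='adam')

# x: (1, 25391, 3) and y: (1, 25391, 1) now match the LSTM output (1, 25391, 1)
model.fit(x=train_data.values.reshape(1, len(train_data), train_data.shape[1]),
          y=train_labels.values.reshape(1, len(train_labels), 1),
          epochs=1, shuffle=False)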

Specifying Dense using keras library

I am slightly confused about how to create a simple Sequential model for my data.
The data has the following dimensions:
X_train.shape
(2369, 12)
y_train.shape
(2369,)
X_test.shape
(592, 12)
y_test.shape
(592,)
This is how I create the model:
batch_size = 128
nb_epoch = 20
in_out_neurons = X_train.shape[1]
dimof_middle = 100
model = Sequential()
model.add(Dense(batch_size, batch_input_shape=(None, in_out_neurons)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(batch_size))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(in_out_neurons))
model.add(Activation('linear'))
# I am solving the regression problem, not the classification one
model.compile(loss="mean_squared_error", optimizer="rmsprop")
history = model.fit(X_train, y_train,
                    batch_size=batch_size, nb_epoch=nb_epoch,
                    verbose=1, validation_data=(X_test, y_test))
The error message:
Exception: Error when checking model input: expected dense_input_14 to have shape (None, 1) but got array with shape (2369, 12)
The error is:
Error when checking model target: expected activation_42 to have shape (None, 12) but got array with shape (2369, 1)
This error occurs at line:
model.add(Dense(in_out_neurons))
How to change Dense to make it work?
Another question: how can I add a simple autoencoder in order to initialize the weights of the ANN?
One of your problems is that you seem to misunderstand what a batch is.
A batch is the number of training samples processed at a time; instead of computing one training sample from X_train at a time, you use, for example, 100 at a time. The important bit here is that this has nothing to do with your model.
So when you write
model.add(Dense(batch_size, batch_input_shape=(None, in_out_neurons)))
you create a fully connected layer with an output size equal to the batch size. That does not make a lot of sense.
Another problem is that your model's output is 12 neurons while your Y is only one value/neuron. Your model looks like this:

 input (12)
     |
     v
   [128]
   [128]
   [ 12]
     |
     v
 output (12)
Then what fit() does is feed a matrix of shape (128, 12), i.e. (batch_size, X_train.shape[1]), into the model and attempt to compare the last layer's output of shape (128, 12) to the corresponding Y values of the batch, which have shape (128, 1); that mismatch is what raises the error.
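A sketch of a model whose output matches Y, keeping dimof_middle=100 from the question as the hidden width (a free choice unrelated to the batch size):
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
# hidden widths are free hyperparameters, decoupled from batch_size
model.add(Dense(100, activation='relu', input_shape=(in_out_neurons,)))
model.add(Dropout(0.2))
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))  # one output to match y_train (2369,)
model.compile(loss='mean_squared_error', optimizer='rmsprop')
history = model.fit(X_train, y_train, batch_size=128, epochs=20,
                    verbose=1, validation_data=(X_test, y_test))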
