Keras: Dense layer expects 2d matrix after LSTM - python

My Y_train is a one-hot encoded label matrix.
The shape of my Y_train is (10, 1000, 3) because I have three different categories.
My model is defined as:
model = Sequential()
model.add(LSTM(100, input_shape=(1000, 38), return_sequences=False))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
When I train my model, I get the following error:
Error when checking target: expected dense_83 to have 2 dimensions,
but got array with shape (8, 1000, 3)
This occurs because my Y_train is a 3D array instead of a 2D matrix. The only way I've been able to get past the error is by setting return_sequences=True, but I'm not sure how that affects my LSTM's output.
Is this the correct way to deal with categorical labels? By setting return_sequences=True as a parameter of LSTM?
In other words, is it okay to return_sequences before a Softmax layer?
Thank you!

Related

ValueError: Input 0 of layer "sequential_8" is incompatible with the layer - deep learning model

I am attempting to setup my first deep learning sequential model with a small test dataset.
Unfortunately, I get the following error message when I call model.fit():
ValueError: Input 0 of layer "sequential_8" is incompatible with the layer: expected shape=(None, 160, 4000), found shape=(32, 4000)
My model is as follows
num_of_classes = 2
input_shape = (1,4000)
y_train_cat = keras.utils.to_categorical(y_train, num_of_classes)
y_test_cat = keras.utils.to_categorical(y_test, num_of_classes)
model = Sequential()
model.add(Conv1D(filters=10, kernel_size=5, input_shape=(160, 4000)))
model.add(MaxPool1D(pool_size=5))
model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(num_of_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
The data is of the following dimensions
x_train.shape is (160, 4000)
y_train_cat is (160, 2)
There are two classes.
Thank you for reading this far and your help in advance
The input_shape you give a layer is supposed to be the shape of a single sample, without the batch dimension. Change
model.add(Conv1D(filters=10, kernel_size=5, input_shape=(160, 4000)))
to
model.add(Conv1D(filters=10, kernel_size=5, input_shape=(4000,1)))
and it should work fine
Edit:
You probably also need to reshape your input to add a dimension:
x_train = np.expand_dims(x_train, 2)
Explanation:
Consider a single sample: a 1D convolution slides a 1D filter along its 4000 steps. Conv1D also expects a channels axis, because each step may carry several channels. Here there is only one channel, hence the trailing 1 in input_shape=(4000, 1).
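The reshape suggested in this answer can be checked with NumPy alone. This is only a sketch: the zero-filled array is a stand-in for the asker's data, and the name x_train_3d is introduced here for illustration.

```python
import numpy as np

# Stand-in data with the shapes from the question: 160 samples of 4000 values.
x_train = np.zeros((160, 4000), dtype="float32")

# Append a channels axis so that each sample has shape (4000, 1).
x_train_3d = np.expand_dims(x_train, axis=2)

print(x_train_3d.shape)      # (160, 4000, 1)
print(x_train_3d.shape[1:])  # (4000, 1): the per-sample shape to pass as input_shape
```

The batch axis (160 here) never appears in input_shape; only the per-sample part does.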

predict_classes returning a conflicting shape for a LSTM classification model

I have quite a bit of trouble understanding the expected shape of the input/output for an LSTM problem.
Specifically, in this example I have 386 sequences of length 100, each containing 14 features. For each sequence, I only need to predict whether it belongs to class 0 or class 1. The respective shapes and model are:
X_test.shape,y_test.shape
((358, 100, 14), (358, 1))
model = Sequential()
model.add(LSTM(64,return_sequences=True,input_shape=(None,14)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy' , metrics=['accuracy'])
Now if (after fitting) I want to predict the output of the model, the shape of the prediction is inconsistent with y_test!
y_pred = model.predict_classes(X_test)
y_pred.shape
(358, 100, 1)
Here I'd expect the shape to match y_test, and be (358,1) instead of the output given by predict_classes()
I am clearly misunderstanding something here. What am I missing here? Is there a different way to tackle this problem altogether?
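With return_sequences=True, the LSTM returns its output at every timestep, so the input to the final sigmoid layer is 3D, with shape (batch, timesteps, 64). The Dense layer is then applied along the last dimension at each timestep, which is why the prediction has shape (358, 100, 1).
To get a single prediction per sequence, set return_sequences=False: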
model.add(LSTM(64,return_sequences=False,input_shape=(None,14)))
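The shape difference can be illustrated without Keras. The following NumPy sketch uses random arrays as stand-ins for the LSTM output and for the weights of Dense(1); the names seq_out, W and b are hypothetical and nothing here is trained.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, units = 358, 100, 64

# Stand-in for the LSTM output with return_sequences=True: one vector per timestep.
seq_out = rng.standard_normal((batch, timesteps, units))

# Stand-in for Dense(1, activation='sigmoid'): weights applied along the last axis.
W = rng.standard_normal((units, 1))
b = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# return_sequences=True: the dense layer is applied at every timestep.
per_step = sigmoid(seq_out @ W + b)
print(per_step.shape)      # (358, 100, 1)

# return_sequences=False: only the last timestep's output reaches the dense layer.
last_out = seq_out[:, -1, :]
per_sequence = sigmoid(last_out @ W + b)
print(per_sequence.shape)  # (358, 1)
```

The second shape matches y_test, which is what a one-label-per-sequence problem needs.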

Keras LSTM different input output shape

In my binary multilabel sequence classification problem, I have 22 timesteps in each input sentence. I have added a 200-dimensional word embedding to each timestep, so my current input shape is (number of input sentences, 22, 200). My output shape would be (number of input sentences, 4), e.g. [1,0,0,1].
My first question is, how to build the Keras LSTM model to accept 3D input and output 2D results. The following code outputs the error:
ValueError: Error when checking target: expected dense_41 to have 3 dimensions, but got array with shape (7339, 4)
My second question is: when I add a TimeDistributed layer, should I set the number of units in the Dense layer to the number of input features, which is 200 in my case?
X_train, X_test, y_train, y_test = train_test_split(padded_docs2, new_y, test_size=0.33, random_state=42)
start = datetime.datetime.now()
print(start)
# define the model
model = Sequential()
e = Embedding(input_dim=vocab_size2, input_length=22, output_dim=200, weights=[embedding_matrix2], trainable=False)
model.add(e)
model.add(LSTM(128, input_shape=(X_train.shape[1],200),dropout=0.2, recurrent_dropout=0.1, return_sequences=True))
model.add(TimeDistributed(Dense(200)))
model.add(Dense(y_train.shape[1],activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
# summarize the model
print(model.summary())
# fit the model
model.fit(X_train, y_train, epochs=300, verbose=0)
end = datetime.datetime.now()
print(end)
print('Time taken to build the model: ', end-start)
Please let me know if I have missed out any information, thanks.
Your LSTM layer receives a 3D input and, with return_sequences=True, produces a 3D output. The same goes for the TimeDistributed layer. If you want the LSTM to return a 2D tensor, set return_sequences to False; you then no longer need the TimeDistributed wrapper. With this setup your model would be:
model = Sequential()
e = Embedding(input_dim=vocab_size2, input_length=22, output_dim=200, weights=[embedding_matrix2], trainable=False)
model.add(e)
model.add(LSTM(128, input_shape=(X_train.shape[1],200),dropout=0.2, recurrent_dropout=0.1, return_sequences=False))
model.add(Dense(200))
model.add(Dense(y_train.shape[1],activation='sigmoid'))
Edit:
TimeDistributed applies a given layer to each temporal slice of its input. In your case, the temporal dimension is X_train.shape[1]. Let's assume X_train.shape[1] == 10 and consider the following line.
model.add(TimeDistributed(Dense(200)))
Here the TimeDistributed wrapper applies the same Dense(200) layer, with a single shared set of weights, to each of the 10 temporal slices. Each slice produces an output of shape (batch_size, 200), so the final output tensor has shape (batch_size, 10, 200). But you said you want a 2D output, so TimeDistributed cannot turn the 3D input into a 2D result.
The other case is if you remove TimeDistributed wrapper and use only dense, like this.
model.add(Dense(200))
Then the dense layer is applied along the last axis of its 3D input. In effect, it flattens the input to shape (batch_size * 10, 128), where 128 is the number of LSTM units, computes the dot product with the fully connected weights, and reshapes the result back to (batch_size, 10, 200). The output is still a 3D tensor.
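The shape arithmetic in this explanation can be reproduced in NumPy. This is a sketch rather than Keras code: W and b are hypothetical stand-ins for the weights of a Dense(200) layer, and the input is assumed to be the 128-feature output of the preceding LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, timesteps, in_features, out_features = 4, 10, 128, 200

x = rng.standard_normal((batch_size, timesteps, in_features))
W = rng.standard_normal((in_features, out_features))  # stand-in Dense kernel
b = rng.standard_normal(out_features)                  # stand-in Dense bias

# A dense layer on a 3D input: matrix product along the last axis.
direct = x @ W + b

# The same result via flatten, apply, reshape.
flat = x.reshape(batch_size * timesteps, in_features)
reshaped = (flat @ W + b).reshape(batch_size, timesteps, out_features)

# The per-timestep view: the same weights applied to every temporal slice.
per_slice = np.stack([x[:, t, :] @ W + b for t in range(timesteps)], axis=1)

print(direct.shape)  # (4, 10, 200)
```

All three routes give the same 3D tensor, so neither a plain dense layer nor a time-distributed one removes the time axis.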
If you would rather keep return_sequences=True on the first LSTM layer, you can replace the TimeDistributed layer with a second LSTM layer that has return_sequences=False. The model would then look like this:
model = Sequential()
e = Embedding(input_dim=vocab_size2, input_length=22, output_dim=200, weights=[embedding_matrix2], trainable=False)
model.add(e)
model.add(LSTM(128, input_shape=(X_train.shape[1],200),dropout=0.2, recurrent_dropout=0.1, return_sequences=True))
model.add(LSTM(200, input_shape=(X_train.shape[1],200),dropout=0.2, recurrent_dropout=0.1, return_sequences=False))
model.add(Dense(y_train.shape[1],activation='sigmoid'))

ValueError: Error when checking : expected dense_1_input to have shape (3,) but got array with shape (1,)

I am trying to predict using the learned .h5 file.
The learning model is as follows.
model =Sequential()
model.add(Dense(12, input_dim=3, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
And I wrote the form of the input as follows.
x = np.array([[band1_input[input_cols_loop][input_rows_loop]],[band2_input[input_cols_loop][input_rows_loop]],[band3_input[input_cols_loop][input_rows_loop]]])
prediction_prob = model.predict(x)
I thought the shape was correct, but the following error occurred.
ValueError: Error when checking : expected dense_1_input to have shape (3,) but got array with shape (1,)
The shape of x is obviously (3,1), but the above error doesn't disappear (the data is from a csv file in the form of (value 1, value 2, value 3, class)).
How can I solve this problem?
"The shape of x is obviously (3,1), but the above error continues."
You are right, but that's not what Keras expects. It expects shape (1, 3): by convention, axis 0 is the batch dimension and axis 1 holds the features. The first Dense layer accepts 3 features, which is why it complains when it sees just one.
The solution is simply to transpose x.
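A quick NumPy check of the transpose, using made-up band values in place of the asker's band1_input, band2_input and band3_input lookups (b1, b2 and b3 are hypothetical):

```python
import numpy as np

# Hypothetical band values for a single pixel.
b1, b2, b3 = 0.1, 0.5, 0.9

# Built as in the question: three rows of one value each.
x = np.array([[b1], [b2], [b3]])
print(x.shape)      # (3, 1): read as 3 samples with 1 feature each

# Transposed: one sample with three features.
x_row = x.T
print(x_row.shape)  # (1, 3)

# Equivalent: build the row directly.
x_direct = np.array([[b1, b2, b3]])
```

Either x_row or x_direct has the (batch, features) layout that a model with input_dim=3 expects.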

Building a LSTM Cell using Keras

I'm trying to build an RNN for text generation, and I'm stuck at building my LSTM cell. The data is shaped like this: X is the input sparse matrix of dimension (90809, 2700) and Y is the output matrix of dimension (90809, 27). The following is my code for defining the LSTM cell:
model = Sequential()
model.add(LSTM(128, input_shape=(X.shape[0], X.shape[1])))
model.add(Dropout(0.2))
model.add(Dense(Y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
My understanding is that the input_shape should be the dimension of the input matrix, and the dense layer should be the size of the output for each observation, i.e 27 in this case. However, I get the following error-
Exception: Error when checking model input: expected lstm_input_3 to have 3 dimensions, but got array with shape (90809, 2700)
I'm not able to figure out what is going wrong. Can anyone please help me figure out why is the lstm_input expecting 3 dimensions?
I tried the following as well-
X = np.reshape(np.asarray(dataX), (n_patterns, n_vocab * seq_length, 1))
Y = np.reshape(np.asarray(dataY), (n_patterns, n_vocab, 1))
This gave me the following error-
Exception: Error when checking model input: expected lstm_input_7 to have shape (None, 90809, 2700) but got array with shape (90809, 2700, 1)
Any help will be appreciated. Thanks!
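You should read about the difference between input_shape, batch_input_shape and input_dim.
With input_shape, you don't specify the batch size, only the shape of a single sample: (timesteps, features). The input array itself must also be 3D, e.g. reshaped to (X.shape[0], X.shape[1], 1). This is how your LSTM layer should look: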
model.add(LSTM(128, input_shape=(X.shape[1], 1)))
or
model.add(LSTM(128, batch_input_shape=(X.shape[0], X.shape[1], 1)))
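For the layer definitions above to accept the data, X itself needs a third axis. A minimal NumPy sketch, assuming a dense array and using a small hypothetical row count in place of the 90809 rows from the question:

```python
import numpy as np

# Small stand-in for the (90809, 2700) input matrix.
n_patterns, n_features = 8, 2700
X = np.zeros((n_patterns, n_features), dtype="float32")

# Add a trailing feature axis: (samples, timesteps, features per timestep).
X_3d = X.reshape(n_patterns, n_features, 1)

print(X_3d.shape)      # (8, 2700, 1)
print(X_3d.shape[1:])  # (2700, 1): matches input_shape=(X.shape[1], 1)
```

If X is a SciPy sparse matrix, as the question suggests, it would presumably have to be converted to a dense array first (for example with its toarray() method) before reshaping. Y can stay 2D, since the final Dense layer outputs one vector per sample.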
