LSTM for 1D input - TensorFlow Exception - python

I'm working on a 2-layer RNN (LSTM). I think I have successfully reshaped my train and test sets, but when I try to run the code, it stops with the exception:
Exception: When using TensorFlow, you should define explicitly the
number of timesteps of your sequences. If your first layer is an
Embedding, make sure to pass it an "input_length" argument. Otherwise,
make sure the first layer has an "input_shape" or "batch_input_shape"
argument, including the time axis.
I tried several configurations, but none of them works. I don't know how to fix it.
Here is the code where I create the model and reshape X_train and X_test:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], EMB_SIZE))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], EMB_SIZE))
print 'Building model...'
model = Sequential()
model.add(LSTM(input_dim=EMB_SIZE, output_dim=100, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2], 1)))
model.add(LSTM(input_dim=EMB_SIZE, output_dim=100, return_sequences=False,input_shape=(X_train.shape[1], X_train.shape[2], 1)))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(optimizer='adam',
              loss='mse',
              metrics=['accuracy'])
model.fit(X_train,
          Y_train,
          nb_epoch=5,
          batch_size=128,
          verbose=1,
          validation_split=0.1)
score = model.evaluate(X_test, Y_test, batch_size=128)
print score
Any help is really appreciated!
Thank you in advance <3

The number of units in the last layer defines the output shape of the model, and the output shape must match the shape of your targets (Y):
Dense(2) -> output shape = (None, 2)
Dense(1) -> output shape = (None, 1)
Y_train -> target shape = (15015, 1)
Since your targets have shape (15015, 1), use Dense(1) so the output and target shapes agree.
Whoa.... Keras 0.3.3? No wonder everything will be problematic.
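For reference, a minimal sketch of a corrected model (my assumption; it keeps the old Sequential style used above and the reshaped X_train from the question):
model = Sequential()
model.add(LSTM(100, return_sequences=True,
               input_shape=(X_train.shape[1], EMB_SIZE)))  # (timesteps, features), no extra axis
model.add(LSTM(100, return_sequences=False))  # only the first layer needs input_shape
model.add(Dense(1))               # output shape (None, 1) matches Y_train (15015, 1)
model.add(Activation('sigmoid'))  # sigmoid is my assumption for a single-unit target
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])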

Related

Value error in convolutional neural network due to data shape

I am trying to predict the number of peaks in time series data using a CNN and keep getting a data shape error. My data looks as follows:
X = list of 520 lists (each is a time series) of various lengths (shortest = 137 elements, longest = 2297 elements)
y = list with 520 elements, each being the number of peaks for the respective time series
Due to the various lengths of the time series, I padded X. The shapes of X_train and X_test, after converting them from numpy arrays to tensors are:
X_train.shape = TensorShape([390, 2297])
X_test.shape = TensorShape([130, 2297])
I am new to Keras and very unsure about the input_shape in the first Conv1D layer. According to this post (Keras/Tensorflow Conv1D expected input shape) I chose it as (2297, 1) or (520, 1), but neither works. The Keras documentation says the input shape should be (batch_size, feature_size, channels), where batch_size is omitted.
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import Adam
#for structure of X and y, see explanation above
X_padded = tf.keras.preprocessing.sequence.pad_sequences(X)
X_train, X_test, y_train, y_test = train_test_split(X_padded, y, test_size=0.25, random_state=33)
X_train = tf.convert_to_tensor(X_train)
X_test = tf.convert_to_tensor(X_test)
y_train = tf.convert_to_tensor(y_train)
y_test = tf.convert_to_tensor(y_test)
model = keras.Sequential()
model.add(Conv1D(filters=16, kernel_size=3, activation = 'relu', strides = 1, padding = 'same', input_shape=(2297, 1)))
model.add(Dropout(0.1))
model.add(Conv1D(filters=32, kernel_size=3, activation = 'relu', strides = 1, padding = 'same'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(9, activation='softmax')) # '9' because there are 9 possible peak counts in the data
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
progress = model.fit(X_train, y_train, epochs = 15, validation_data = (X_test, y_test), verbose=1)
Error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 2297]
What might be the issue here?
I was able to solve it. The correct input shape is given in the answer by user 'rnso' to Convolutional neural network Conv1d input shape.
I reshaped my X_train and X_test (as NumPy arrays):
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
and set the input_shape in the Conv1D call to input_shape=(ncols, 1):
input_shape=(2297, 1)
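Equivalently, a sketch of the same fix using np.expand_dims to add the channel axis before the tensor conversion:
import numpy as np
X_train = np.expand_dims(X_train, axis=-1)  # (390, 2297) -> (390, 2297, 1)
X_test = np.expand_dims(X_test, axis=-1)    # (130, 2297) -> (130, 2297, 1)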

Why do I get a "ValueError: Shape mismatch" when using CategoricalCrossentropy loss function?

I created a simple TensorFlow model (no convolution layers) using the MNIST dataset. I initially used the SparseCategoricalCrossentropy loss function and it worked fine. I then created a nearly identical model, this time using CategoricalCrossentropy loss, and changed the labels to one-hot encoding:
(x_train, y_train), (x_test, y_test) = mnist_data
# scale data:
x_train = x_train / 255
x_test = x_test / 255
# create one-hot encoding:
y_train_one_hot = tf.one_hot(y_train, 10).numpy()
y_test_one_hot = tf.one_hot(y_test, 10).numpy()
# print shapes:
# each image is 28x28; 60,000 examples; 10 possible output values:
print(x_train.shape) # (60000, 28, 28)
print(y_train.shape) # (60000, 10)
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dropout(0.2),
    Dense(10, activation='softmax')
])
model.compile(
    Adam(learning_rate=0.01),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
model_history = model.fit(
    x_train,
    y_train_one_hot,
    epochs=20
)
However, I'm not able to even train the model as I get an error: ValueError: Shape mismatch: The shape of labels (received (320,)) should equal the shape of logits except for the last dimension (received (32, 10)). I don't really understand what shape is wrong or why.
Edit:
I forgot to show how I set mnist_data:
mnist_data = tf.keras.datasets.mnist.load_data()
All I did was change the first line of code to
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
and it trained as it should. Note that you have this line of code:
print(y_train.shape) # (60000, 10)
The dimension is not (60000, 10), it is (60000,). After you one-hot encode it, it will be (60000, 10).
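A quick shape check along these lines (a sketch) makes the mismatch visible:
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(y_train.shape)                # (60000,) -- integer labels
y_train_one_hot = tf.one_hot(y_train, 10).numpy()
print(y_train_one_hot.shape)        # (60000, 10) -- matches Dense(10)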

Keras CNN Incompatible with Convolution2D

I am getting into convolutional neural networks and want to create one for MNIST data. Whenever I add a convolutional layer to my CNN, I get an error:
Input 0 is incompatible with layer conv2d_4: expected ndim=4, found ndim=5
I have attempted to reshape the X_train data set but was not successful.
I tried to add a Flatten layer first, but that returns this error:
Input 0 is incompatible with layer conv2d_5: expected ndim=4, found ndim=2
import keras
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import Flatten, Dense, Dropout
img_width, img_height = 28, 28
mnist = keras.datasets.mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = keras.utils.normalize(X_train, axis=1) #Normalizes from 0-1 (originally each pixel is valued 0-255)
X_test = keras.utils.normalize(X_test, axis=1) #Normalizes from 0-1 (originally each pixel is valued 0-255)
Y_train = keras.utils.to_categorical(Y_train) #Reshapes to allow ytrain to work with x train
Y_test = keras.utils.to_categorical(Y_test)
from sklearn import preprocessing
lb = preprocessing.LabelBinarizer()
Y_train = lb.fit_transform(Y_train)
Y_test = lb.fit_transform(Y_test)
#Model
model = Sequential()
model.add(Flatten())
model.add(Convolution2D(16, 5, 5, activation='relu', input_shape=(1,img_width, img_height, 1)))
model.add(Dense(128, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(.2))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=3, verbose=2)
val_loss, val_acc = model.evaluate(X_test, Y_test) #Check to see if model fits test
print(val_loss, val_acc)
If I comment out the convolutional layer, it works very well (accuracy > 95%), but I am planning on making a more complex neural network that requires convolution in the future, and this is my starting point.
Keras is looking for a tensor with ndim=4 (four dimensions) but is getting one with ndim=2.
First, make sure the kernel size in the Conv2D layer is in parentheses:
model.add(Convolution2D(32, (3, 3), activation='relu', input_shape=(img_height, img_height, 1)))
Second, you need to reshape the X_train and X_test variables, as the Conv2D layer expects a 4D tensor input:
X_train = X_train.reshape(-1,28, 28, 1) #Reshape for CNN - should work!!
X_test = X_test.reshape(-1,28, 28, 1)
model.fit(X_train, Y_train, epochs=3, verbose=2)
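After the reshape, a quick check (a sketch) confirms the 4D shape that Conv2D expects:
print(X_train.shape)  # (60000, 28, 28, 1) -- ndim=4
print(X_test.shape)   # (10000, 28, 28, 1)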
For more information about Conv2D you can look into the Keras documentation.
Hope this helps.
There are two issues in your code.
1. You are encoding your labels twice, once using to_categorical and again using LabelBinarizer. The latter is not needed here, so just encode your labels into categorical once, using to_categorical.
2. Your input shape is incorrect; it should be (28, 28, 1).
Also, you should add a Flatten layer after the convolutional layers so the Dense layer works properly.
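Putting both answers together, a sketch of a corrected pipeline (assuming the imports and mnist load from the question):
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)
Y_train = keras.utils.to_categorical(Y_train)  # encode once; drop the LabelBinarizer step
Y_test = keras.utils.to_categorical(Y_test)
model = Sequential()
model.add(Convolution2D(16, (5, 5), activation='relu', input_shape=(28, 28, 1)))
model.add(Flatten())  # flatten the feature maps before the Dense layers
model.add(Dense(128, activation='relu'))
model.add(Dropout(.2))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])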

How to convert 1D flattened MNIST Keras to LSTM model without unflattening?

I want to change my model architecture a bit on the LSTM so it accepts the exact same flattened inputs that the fully connected approach does.
Working DNN model from Keras examples:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
# import the data
from keras.datasets import mnist
# read the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
num_pixels = x_train.shape[1] * x_train.shape[2] # find size of one-dimensional vector
x_train = x_train.reshape(x_train.shape[0], num_pixels).astype('float32') # flatten training images
x_test = x_test.reshape(x_test.shape[0], num_pixels).astype('float32') # flatten test images
# normalize inputs from 0-255 to 0-1
x_train = x_train / 255
x_test = x_test / 255
# one hot encode outputs
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]
print(num_classes)
# define classification model
def classification_model():
    # create model
    model = Sequential()
    model.add(Dense(num_pixels, activation='relu', input_shape=(num_pixels,)))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # compile model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
# build the model
model = classification_model()
# fit the model
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, verbose=2)
# evaluate the model
scores = model.evaluate(x_test, y_test, verbose=0)
Same problem, but trying an LSTM (still getting an error):
def kaggle_LSTM_model():
    model = Sequential()
    model.add(LSTM(128, input_shape=(x_train.shape[1:]), activation='relu', return_sequences=True))
    # What does return_sequences=True do?
    model.add(Dropout(0.2))
    model.add(Dense(32, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation='softmax'))
    opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=opt,
                  metrics=['accuracy'])
    return model
model_kaggle_LSTM = kaggle_LSTM_model()
# fit the model
model_kaggle_LSTM.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, verbose=2)
# evaluate the model
scores = model_kaggle_LSTM.evaluate(x_test, y_test, verbose=0)
Problem is here:
model.add(LSTM(128, input_shape=(x_train.shape[1:]), activation='relu', return_sequences=True))
ValueError: Input 0 is incompatible with layer lstm_17: expected
ndim=3, found ndim=2
If I go back and don't flatten x_train and y_train, it works. However, I'd like this to be "just another model choice" that feeds off the same pre-processed input. I thought passing shape[1:] would work, as that is the real flattened input_shape. I'm sure it's something easy I'm missing about the dimensionality, but I couldn't get it after an hour of twiddling and debugging. I did figure out that not flattening the 28x28 to 784 works, but I don't understand why. Thanks a lot!
For bonus points, an example of how to do either DNN or LSTM in either 1D (784,) or 2D (28, 28) would be the best.
RNN layers such as LSTM are meant for sequence processing (i.e. a series of vectors whose order of appearance matters). You can look at an image from top to bottom and consider each row of pixels as a vector. The image would then be a sequence of vectors and can be fed to the RNN layer. Therefore, according to this description, you should expect the RNN layer to take an input of shape (sequence_length, number_of_features). That's why when you feed the images to the LSTM network in their original shape, i.e. (28, 28), it works.
Now if you insist on feeding the LSTM model the flattened image, i.e. with shape (784,), you have at least two options: either you can consider this a sequence of length one, i.e. (1, 784), which does not make much sense, or you can add a Reshape layer to your model to restore the input to its original shape, suitable as the input of an LSTM layer, like this:
from keras.layers import Reshape
def kaggle_LSTM_model():
    model = Sequential()
    model.add(Reshape((28, 28), input_shape=x_train.shape[1:]))
    # the rest is the same...
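For the bonus, a sketch of how the Reshape approach might complete (my assumption; it reuses the flattened (784,) inputs and one-hot labels from the DNN example above, so categorical_crossentropy replaces the sparse variant):
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Reshape

def flat_input_lstm_model():
    model = Sequential()
    # restore the flat 784-vector to a 28-step sequence of 28 features
    model.add(Reshape((28, 28), input_shape=(784,)))
    model.add(LSTM(128, activation='relu'))  # no return_sequences: emit only the last step
    model.add(Dropout(0.2))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # categorical_crossentropy because y_train was one-hot encoded above
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model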

Declaring input_shape of a converted Sequence in Keras?

I am trying to run a neural network on text inputs. This is a binary classification problem. Here is my working code so far:
df = pd.read_csv(pathname, encoding = "ISO-8859-1")
df = df[['content_cleaned', 'meaningful']] #Content cleaned: text, meaningful: label
X = df['content_cleaned']
y = df['meaningful']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21)
tokenizer = Tokenizer(num_words=100)
tokenizer.fit_on_texts(X_train)
X_train_encoded = tokenizer.texts_to_sequences(X_train)
X_test_encoded = tokenizer.texts_to_sequences(X_test)
max_len = 100
X_train = pad_sequences(X_train_encoded, maxlen=max_len)
X_test = pad_sequences(X_test_encoded, maxlen=max_len)
batch_size = 100
max_words = 100
input_dim = X_train.shape[1] # Number of features
model = Sequential()
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
history = model.fit(X_train, X_test,
                    batch_size=batch_size,
                    epochs=5,
                    verbose=1,
                    validation_split=0.1)
My question has two parts. The first concerns the input_shape when creating the layers; I am confused by the syntax for declaring it. When running this command:
print(X_train.shape)
I am getting this shape: (3609, 100).
From my understanding, this is telling me that there are 3609 instances. From viewing other examples, my naive assumption was to use the 100, as there are 100 types (I may be understanding this incorrectly) corresponding to the max_words that I initialized. I believe that I may have gotten the syntax wrong when initializing the input_shape.
The second part concerns an error message when running all of this (most likely due to the incorrect input_shape). The error message highlights this line of code:
validation_split=0.1)
The error message is:
ValueError: Error when checking target: expected dense_2 to have shape (None, 1) but got array with shape (1547, 1
Am I going about this problem incorrectly? I am very new to Deep Learning.
The input_shape argument specifies the shape of one training sample. Therefore, you need to set it to X_train.shape[1:] (i.e. ignore the samples or batch axis):
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
Further, pass X_train and y_train to the fit method (instead of X_train and X_test).
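Put together, a sketch of the corrected lines (the target switch to y_train follows the note above; binary_crossentropy is my assumption for a single sigmoid unit on a binary task):
model = Sequential()
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',  # assumption: matches the single sigmoid output
              optimizer='adam',
              metrics=['accuracy'])
history = model.fit(X_train, y_train,      # targets are y_train, not X_test
                    batch_size=batch_size,
                    epochs=5,
                    verbose=1,
                    validation_split=0.1)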
You missed two closing parentheses ) on the line where you defined the input of your model. Also make sure that you provide your activation function.
Change your code as below:
model.add(layers.Dense(10, activation='relu', input_shape=(X_train.shape[0],)))
EDIT:
For your last error just change your input_shape to input_shape=(X_train.shape[0],).
