RNN and input shape on a 1D toy dataset with Keras - Python

I was trying to implement a simple RNN on a one-dimensional toy dataset (just one input variable x and one target variable y) as a learning exercise, but I am having an issue with the input shape.
X_train has shape (848,).
y_train has the same shape.
My model is very simple:
import tensorflow as tf
from keras.layers import Input, SimpleRNN, Dense
from keras.models import Model
def create_rnn_model(train_size):
    # Each sample is declared as a single scalar: shape (1,)
    inputs = Input(shape=(1,))
    rnn = SimpleRNN(units=32, input_shape=(1,))(inputs)
    dense1 = Dense(units=64, activation='relu')(rnn)
    dense2 = Dense(units=64, activation='relu')(dense1)
    modello = Model(inputs=inputs, outputs=dense2)
    return modello
optimizer = tf.optimizers.SGD(learning_rate=0.0001, momentum=0.9)
rnn_model = create_rnn_model(train_size=train_size)
rnn_model.compile(optimizer=optimizer, loss="mse", jit_compile=True)
But whenever I try to fit it, I get this input shape error that doesn't allow me to go any further:
ValueError: Input 0 of layer "simple_rnn_5" is incompatible with the
layer: expected ndim=3, found ndim=2. Full shape received: (None, 1)
I know it depends on the format that the RNN wants, but I don't understand how to organize the data correctly.
I was trying without any window size, but I would really like to understand how to set a window size of 10, for example.
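As a rough sketch of the windowing asked about above: a recurrent layer expects input of shape (samples, time steps, features), so a 1D series can be cut into overlapping windows of 10 steps with one feature each. The helper name make_windows, the synthetic series, and the choice to pair each window with the y value at its last step are all assumptions made for illustration, not something stated in the question.

```python
import numpy as np

def make_windows(x, y, window_size=10):
    """Cut 1-D series into (samples, window_size, 1) inputs (hypothetical helper)."""
    x = np.asarray(x, dtype="float32")
    y = np.asarray(y, dtype="float32")
    n_windows = len(x) - window_size + 1
    # Row i of idx holds the positions i, i+1, ..., i+window_size-1.
    idx = np.arange(window_size)[None, :] + np.arange(n_windows)[:, None]
    X = x[idx][..., np.newaxis]   # add the trailing "features" axis of size 1
    Y = y[window_size - 1:]       # assumption: target is y at the last step of each window
    return X, Y

# Synthetic stand-in for the 848-point dataset from the question.
x_train = np.linspace(0.0, 1.0, 848)
y_train = np.sin(x_train)

X, Y = make_windows(x_train, y_train, window_size=10)
print(X.shape, Y.shape)   # (839, 10, 1) (839,)
```

With data shaped like this, the model input would be declared as Input(shape=(10, 1)). Without windowing, the same idea applies with a single time step: X_train.reshape(-1, 1, 1) together with Input(shape=(1, 1)).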

Related

Correct keras LSTM input shape after text-embedding

I'm trying to understand the keras LSTM layer a bit better in regards to timesteps, but am still struggling a bit.
I want to create a model that is able to compare 2 inputs (siamese network). So my input is twice a preprocessed text. The preprocessing is done as follows:
max_len = 64
data['cleaned_text_1'] = assets.apply(lambda x: clean_string(data[]), axis=1)
data['text_1_seq'] = t.texts_to_sequences(cleaned_text_1.astype(str).values)
data['text_1_seq_pad'] = [list(x) for x in pad_sequences(assets['text_1_seq'], maxlen=max_len, padding='post')]
The same is done for the second text input. t is a keras.preprocessing.text.Tokenizer instance.
I defined the model with:
common_embed = Embedding(
    name="synopsis_embedd",
    input_dim=len(t.word_index)+1,
    output_dim=300,
    input_length=len(data['text_1_seq_pad'].tolist()[0]),
    trainable=True
)
lstm_layer = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(32, dropout=0.2, recurrent_dropout=0.2)
)
input1 = tf.keras.Input(shape=(len(data['text_1_seq_pad'].tolist()[0]),))
e1 = common_embed(input1)
x1 = lstm_layer(e1)
input2 = tf.keras.Input(shape=(len(data['text_1_seq_pad'].tolist()[0]),))
e2 = common_embed(input2)
x2 = lstm_layer(e2)
merged = tf.keras.layers.Lambda(
    function=l1_distance, output_shape=l1_dist_output_shape, name='L1_distance'
)([x1, x2])
conc = Concatenate(axis=-1)([merged, x1, x2])
x = Dropout(0.01)(conc)
preds = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs=[input1, input2], outputs=preds)
That seems to work if I feed the numpy data with the fit method:
model.fit(
    x = [np.array(data['text_1_seq_pad'].tolist()), np.array(data['text_2_seq_pad'].tolist())],
    y = y_train.values.reshape(-1,1),
    epochs=epochs,
    batch_size=batch_size,
    validation_data=([np.array(val['text_1_seq_pad'].tolist()), np.array(val['text_2_seq_pad'].tolist())], y_val.values.reshape(-1,1)),
)
What I'm trying to understand at the moment is what the input shape of the LSTM layer is in my case, in terms of:
samples
time_steps
features
Is it correct that the input_shape for the LSTM layer would be input_shape=(300,1) because I set the embedding output dim to 300 and I have only 1 input feature per LSTM?
And do I need to reshape the embedding output or can I just set
lstm_layer = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(32, input_shape=(300,1), dropout=0.2, recurrent_dropout=0.2)
)
from the embedding output?
Example notebook can be found in Github or as Colab
In general, an LSTM layer needs 3D inputs shaped this way: (batch_size, length of an input sequence, number of features). Batch size is not really important here, so you can just consider that one input needs to have the shape (length of sequence, number of features per item).
In your case, the output dim of your embedding layer is 300. So your LSTM has 300 features.
Then, using an LSTM on sentences requires a constant number of tokens. The LSTM works with a constant input dimension, so you cannot pass it a text with 12 tokens followed by another one with 68 tokens. Instead, you need to fix a limit and pad the sequence if needed.
So, if your sentence is 20 tokens long and your limit is 50, you need to pad the sequence (add at the end of it) with 30 “neutral” tokens (often zeros).
In the end, your LSTM input dimension must be (number of tokens per text, dimension of your embedding outputs) -> (50, 300) in my example.
To learn more about it, I suggest you take a look at this (in your case, you can replace time_steps with number_of_tokens):
https://shiva-verma.medium.com/understanding-input-and-output-shape-in-lstm-keras-c501ee95c65e
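The padding and shape arithmetic described in this answer can be sketched with NumPy alone. This is only an illustration under stated assumptions: pad_post is a hypothetical helper standing in for pad_sequences with padding='post', vocab_size is made up, and the random matrix stands in for trained Embedding weights.

```python
import numpy as np

max_len = 50          # padding limit used in the answer's example
embedding_dim = 300   # embedding output dimension from the question
vocab_size = 1000     # hypothetical vocabulary size

def pad_post(seq, max_len, pad_value=0):
    """Truncate or right-pad a list of token ids to max_len (hypothetical helper)."""
    seq = list(seq)[:max_len]
    return seq + [pad_value] * (max_len - len(seq))

# Two "sentences" of 20 and 12 token ids.
sentences = [list(range(1, 21)), list(range(1, 13))]
padded = np.array([pad_post(s, max_len) for s in sentences])

# Stand-in for an Embedding layer: one 300-dimensional vector per token id.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))
embedded = embedding_matrix[padded]

print(padded.shape)     # (2, 50)
print(embedded.shape)   # (2, 50, 300)
```

The last shape reads as (samples, time_steps, features), which is what the LSTM receives from the embedding. With max_len = 64 as in the question, it would be (samples, 64, 300), so no reshape of the embedding output is needed.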

Keras Functional API embedding layer output to LSTM

When passing the output of my embedding layer to the LSTM layer I'm running into a ValueError that I cannot figure out. My model is:
def lstm_mod(self, n_cells,batch_size):
    input = tf.keras.Input((self.n_seq, self.n_features))
    embedding = tf.keras.layers.Embedding(batch_size,self.n_seq,input_length=self.n_clusters)(input)
    x= tf.keras.layers.LSTM(n_cells)(embedding)
    out = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(input, out,name="LSTM")
    model.compile(loss='mse', optimizer='Adam')
    return model
The error is:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 128, 7, 128]
Given that the dimensions passed to the model input and the embedding layer are consistent through the arguments of the model I'm puzzled by this. Any guidance is appreciated.
Keras adds an additional dimension (None) when you feed your data through your model because it processes your data in batches.
In this line :
input = tf.keras.Input((self.n_seq, self.n_features))
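You've defined a 2-dimensional input, and Keras adds a 3rd dimension (the batch), so the model input has shape (None, n_seq, n_features). A 3-dimensional tensor is also what the LSTM layer expects, hence expected ndim=3.
However, the tensor that reaches the LSTM is 4-dimensional. The Embedding layer sits between the input and the LSTM, and it appends one axis of size output_dim (here self.n_seq) to whatever it receives, so the 3-dimensional input becomes the reported shape [None, 128, 7, 128].
To fix this you need to either give the Embedding layer a 2-D input of integer indices (batch, sequence length), or, if your inputs are already feature vectors, drop the Embedding layer and pass the 3-D input straight to the LSTM.
Print out the values for self.n_seq and self.n_features and compare them with the shape 128, 7, 128; that should show you which axis was added.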
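To see where a fourth axis can come from, here is a minimal NumPy sketch of what an embedding lookup does to shapes. It assumes n_seq = 128 and n_features = 7 (read off the error message) and uses a made-up table size; the random table is only a stand-in for the Embedding weights, not the Keras layer itself.

```python
import numpy as np

n_seq, n_features = 128, 7         # values implied by the error message
vocab_size, output_dim = 32, 128   # hypothetical table size; output_dim mirrors n_seq

rng = np.random.default_rng(0)
table = rng.normal(size=(vocab_size, output_dim))   # stand-in for Embedding weights

# A batch of 4 inputs shaped like Input((n_seq, n_features)), holding integer ids.
ids_3d = rng.integers(0, vocab_size, size=(4, n_seq, n_features))
print(table[ids_3d].shape)   # (4, 128, 7, 128): one axis too many for an LSTM

# A batch of 4 inputs shaped like Input((n_seq,)) yields a 3-D tensor.
ids_2d = rng.integers(0, vocab_size, size=(4, n_seq))
print(table[ids_2d].shape)   # (4, 128, 128)
```

A lookup always appends one axis of size output_dim, so the input to an embedding that feeds an LSTM should be 2-D: (batch, sequence length).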

Creating a Keras CNN for image alteration

I'm working on a problem that involves computationally evaluating three-dimensional data of the shape (32, 16, 5) and providing a corrected form of this data also in the shape of (32, 16, 5). The problem is relatively specific to my field, but it can be viewed as analogous to processing color images (just with five color channels instead of three). If it helps, this could be thought of as a color correction model.
In my initial efforts, I created a random forest model using XGBoost for each of these output parameters. I had good results, but found that the sheer number of output parameters (32*16*5 = 2560) made the runtime of this approach too long, so I am looking for an alternative.
I'm looking at using Keras to solve this, using a convolutional neural network approach, since the adjacent 'pixels' in my data should have some useful information about their neighbors. Note that 'adjacency' here is both spatial and in the color channels. So far, I am doing alright in creating a simple model that I believe has inputs/outputs of the correct shape, but I am running into an issue when I try to train the model on some dummy images:
#!/usr/bin/env python3
import tensorflow as tf
import pandas as pd
import numpy as np
def create_model(image_shape, batch_size = 10):
    width, height, channels = image_shape
    conv_shape = (batch_size, width, height, channels)
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv3D(filters = channels, kernel_size = 3, input_shape = conv_shape, padding = "same"))
    model.add(tf.keras.layers.Dense(channels, activation = "relu"))
    return model
if __name__ == "__main__":
    image_shape = (32, 16, 5)
    # Create test input/output data sets:
    input_img = np.random.rand(*image_shape) # Create one dummy input image
    output_img = np.random.rand(*image_shape) # Create one dummy output image
    # Create a bogus 'training set' by copying the input/output images into lists many times
    inputs = [input_img]*500
    outputs = [output_img]*500
    # Create the model and fit it to the dummy data
    model = create_model(image_shape)
    model.summary()
    model.compile(loss = "mean_squared_error", optimizer = "adam", metrics = ["accuracy"])
    model.fit(input_img, output_img)
However, when I run this code, I get the following error:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=3. Full shape received: [32, 16, 5]
I am not really sure what the other two expected dimensions are for the data passed into model.fit(). I suspect this is a problem with the way that I am formatting my input data. Even if I have a list of input/output images, that will only bring the ndim of my data to 4, not 5.
I have been trying to find similar examples in the documentation and around the web to see what I'm doing incorrectly, but 3D convolution on a non-classifier network seems a bit off the beaten path, and I'm not having much luck (or just don't know the name of what I should search for).
I have tried passing the dummy training set to model.fit instead of two individual images. Fitting with model.fit(inputs, outputs) instead, I get:
ValueError: Layer sequential expects 1 inputs, but it received 500 input tensors.
It seems that passing a list of tensors isn't correct here. If I convert the list of input images to numpy arrays with:
inputs = np.array(inputs)
outputs = np.array(outputs)
This does bring up the number of dimensions in my input data to 4, but Keras is still expecting 5. The error I get in this case is very similar to the first:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=4. Full shape received: [None, 32, 16, 5]
I'm definitely not understanding something here, and any help would be appreciated.
I think you made two mistakes in your code:
Instead of using Conv3D, you need to use Conv2D.
model.fit(input_img, output_img) should be model.fit(inputs, outputs).
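The reason you need Conv2D is that each sample has the shape (length, width, channel). Conv2D expects 4-D input (batch, length, width, channel), whereas Conv3D expects one more axis, which your data doesn't have. For the same reason, input_shape should describe a single sample and must not include the batch size.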
Try the script below
#!/usr/bin/env python3
import tensorflow as tf
import pandas as pd
import numpy as np
def create_model(image_shape, batch_size = 10):
    width, height, channels = image_shape
    conv_shape = (width, height, channels)
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv2D(filters = channels, kernel_size = 3, input_shape = conv_shape, padding = "same"))
    model.add(tf.keras.layers.Dense(channels, activation = "relu"))
    return model
if __name__ == "__main__":
    image_shape = (32, 16, 5)
    # Create test input/output data sets:
    input_img = np.random.rand(*image_shape) # Create one dummy input image
    output_img = np.random.rand(*image_shape) # Create one dummy output image
    # Create a bogus 'training set' by copying the input/output images into lists many times
    inputs = np.array([input_img]*500)
    outputs = np.array([output_img]*500)
    # Create the model and fit it to the dummy data
    model = create_model(image_shape)
    model.summary()
    model.compile(loss = "mean_squared_error", optimizer = "adam", metrics = ["accuracy"])
    model.fit(inputs, outputs)
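The dimension counts behind the two error messages can be checked with NumPy alone. The sketch below also shows one possible alternative that the answer above does not use: if adjacency across the five channels should be convolved over as well, the data could keep Conv3D by gaining a trailing channel axis of size 1. That variant is an assumption about what the asker might want, not part of the posted fix.

```python
import numpy as np

image_shape = (32, 16, 5)

# 500 dummy images stacked into one array: batch axis + 3 data axes.
images = np.stack([np.random.rand(*image_shape) for _ in range(500)])
print(images.shape)    # (500, 32, 16, 5): 4-D, the layout Conv2D expects

# Conv3D expects batch + 3 spatial axes + channels (5-D). Treating the five
# "colour" planes as a third spatial axis means adding a channel axis of size 1.
volumes = images[..., np.newaxis]
print(volumes.shape)   # (500, 32, 16, 5, 1)
```

In the Conv3D variant, the per-sample input_shape would be (32, 16, 5, 1), again without the batch size.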

Shape mismatch with Tensorflow Dataset and Network

I am getting an error relating to shapes whilst defining a very simple network using Tensorflow 2.
My code is:
import tensorflow as tf
import pandas as pd
data = pd.read_csv('data.csv')
target = data.pop('result')
target = tf.keras.utils.to_categorical(target.values, num_classes=3)
data_set = tf.data.Dataset.from_tensor_slices((data.values, target))
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=data.shape[1:]),
    tf.keras.layers.Dense(12, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(data_set, epochs=5)
The call to fit() throws the following error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 12 but received input with shape [12, 1]
Walking through the code:
The input CSV file has thirteen columns - with the last being the label
This is converted to a 3 bit one-hot encoding
The Dataset is constructed of two Tensors - one of shape (12,) and the other of shape (3,)
The network Input layer defines its expected shape as the shape of the data values, ignoring the first axis, which is the batch size
I am stumped about why there is mismatch between the shape of the data and the expected data shape for the network - especially as the latter is defined by reference to the former.
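The dataset is not batched: from_tensor_slices yields one sample of shape (12,) at a time, and fit() does not batch a Dataset for you, so the model treats the 12 features as the batch axis. Add .batch() at the end of the dataset: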
data_set = tf.data.Dataset.from_tensor_slices((data.values, target)).batch(8)
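What batching changes is the shape of each element the model sees. The following is a plain NumPy stand-in for that grouping, not tf.data itself; the batch function and the 20-row array are hypothetical and exist only to show the shapes.

```python
import numpy as np

def batch(rows, batch_size):
    """Yield consecutive groups of rows (simple stand-in for dataset batching)."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

# Hypothetical data: 20 rows with 12 feature columns each.
features = np.zeros((20, 12), dtype="float32")

print(features[0].shape)   # (12,): one unbatched element, no batch axis

batch_shapes = [b.shape for b in batch(features, 8)]
print(batch_shapes)        # [(8, 12), (8, 12), (4, 12)]
```

Only the batched elements have the (batch, 12) layout that matches an Input layer declared with shape (12,).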

How to setup input shape for 1dCNN+LSTM network (Keras)?

I have the following idea to implement:
Input -> CNN-> LSTM -> Dense -> Output
The Input has 100 time steps, each step has a 64-dimensional feature vector
A Conv1D layer will extract features at each time step. The CNN layer contains 64 filters, each has length 16 taps. Then, a maxpooling layer will extract the single maximum value of each convolutional output, so a total of 64 features will be extracted at each time step.
Then, the output of the CNN layer will be fed into an LSTM layer with 64 neurons. Number of recurrence is the same as time step of input, which is 100 time steps. The LSTM layer should return a sequence of 64-dimensional output (the length of sequence == number of time steps == 100, so there should be 100*64=6400 numbers).
mfcc_input = Input(shape=(100,64), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
But this doesn't work. The second line reports "list index out of range" and I don't understand what's going on.
I'm new to Keras, so I appreciate sincerely if anyone could help me with it.
This picture explains how CNN should be applied to EACH TIME STEP
The problem is with your input. Your input is of shape (100, 64), in which the first dimension is the time steps. TimeDistributed applies the wrapped layer to each time step separately, so ignoring that dimension, each input to the Conv1D is of shape (64,).
Now, refer to the Keras Conv1D documentation, which states that the input should be a 3D tensor (batch_size, steps, input_dim). Ignoring the batch_size, your input should be a 2D tensor (steps, input_dim).
So, you are providing a 1D tensor as input where a 2D tensor is expected. For example, if you are providing natural language input to the Conv1D in the form of words, and there are 64 words in your sentence with each word encoded as a vector of length 50, your input should be (64, 50).
Also, make sure that you are feeding the right input to LSTM as given in the code below.
So, the correct code should be
embedding_size = 50 # Set this accordingly
mfcc_input = Input(shape=(100, 64, embedding_size), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
# Directly feeding CNN_out to LSTM will also raise Error, since the 3rd dimension is 1, you need to purge it as
CNN_out = Reshape((int(CNN_out.shape[1]), int(CNN_out.shape[3])))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
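The shapes flowing through the corrected code can be worked out by hand. The sketch below does that arithmetic in plain Python, using the standard output-length formula for 'valid' padding; the helper names are hypothetical, shapes are given per sample (without the batch axis), and embedding_size = 50 is the answer's placeholder value.

```python
def conv1d_valid_len(steps, kernel_size, stride=1):
    """Output length of a 1-D convolution with 'valid' padding."""
    return (steps - kernel_size) // stride + 1

def pool1d_valid_len(steps, pool_size, stride=None):
    """Output length of 1-D pooling with 'valid' padding (stride defaults to pool_size)."""
    stride = pool_size if stride is None else stride
    return (steps - pool_size) // stride + 1

time_steps, steps, embedding_size = 100, 64, 50
filters, kernel_size = 64, 16

conv_len = conv1d_valid_len(steps, kernel_size)             # 64 - 16 + 1 = 49
pool_len = pool1d_valid_len(conv_len, pool_size=conv_len)   # 1

shapes = [
    ("Input", (time_steps, steps, embedding_size)),
    ("TimeDistributed(Conv1D)", (time_steps, conv_len, filters)),
    ("TimeDistributed(MaxPooling1D)", (time_steps, pool_len, filters)),
    ("Reshape", (time_steps, filters)),
]
for name, shape in shapes:
    print(f"{name:32s} {shape}")
```

The final (100, 64) shape is the (time steps, features) layout that the LSTM with return_sequences=True takes as input, and it is why the axis of length 1 left by the pooling has to be removed first.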
