Keras Functional API embedding layer output to LSTM - python

When passing the output of my embedding layer to the LSTM layer I'm running into a ValueError that I cannot figure out. My model is:
def lstm_mod(self, n_cells, batch_size):
    input = tf.keras.Input((self.n_seq, self.n_features))
    embedding = tf.keras.layers.Embedding(batch_size, self.n_seq, input_length=self.n_clusters)(input)
    x = tf.keras.layers.LSTM(n_cells)(embedding)
    out = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(input, out, name="LSTM")
    model.compile(loss='mse', optimizer='Adam')
    return model
The error is:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 128, 7, 128]
Given that the dimensions passed to the model input and the embedding layer are consistent through the arguments of the model I'm puzzled by this. Any guidance is appreciated.

Keras adds an additional dimension (None) when you feed your data through your model because it processes your data in batches.
In this line:
input = tf.keras.Input((self.n_seq, self.n_features))
You've defined a 2-dimensional input, and Keras adds a 3rd dimension (the batch), so the input tensor has shape (None, n_seq, n_features). An LSTM expects exactly that: ndim=3.
However, the tensor that reaches the LSTM is 4-dimensional. The Embedding layer appends an axis of size output_dim to whatever it receives, so (None, 128, 7) becomes (None, 128, 7, 128); here output_dim is the second argument to Embedding, which you set to self.n_seq.
To fix this you need to either give the Embedding a 2-D batch of integer indices, shape (None, n_seq), so that its output is 3-D, or reshape its 4-D output back to 3-D before the LSTM.
Print out the values for self.n_seq and self.n_features and compare them with the shape 128, 7, 128; that should show which axis the Embedding added.
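To see where a fourth axis can come from, here is a minimal NumPy sketch of what an embedding lookup does to shapes. It is a stand-in for the Keras layer, not the layer itself; the vocabulary size (1000) and batch size (4) are made-up values, while 128 and 7 are read off the error message.

```python
import numpy as np

n_seq, n_features = 128, 7          # taken from the reported shape [None, 128, 7, 128]
vocab_size, output_dim = 1000, 128  # vocab_size is an assumed value

# An embedding is a lookup table: one row of length output_dim per integer index.
table = np.random.rand(vocab_size, output_dim)

# Indices shaped like the question's Input: (batch, n_seq, n_features)
idx_3d = np.random.randint(0, vocab_size, size=(4, n_seq, n_features))
out_4d = table[idx_3d]  # the lookup appends an axis of size output_dim

# Indices shaped (batch, n_seq): the lookup result is 3-D
idx_2d = np.random.randint(0, vocab_size, size=(4, n_seq))
out_3d = table[idx_2d]

print(out_4d.shape)  # (4, 128, 7, 128)
print(out_3d.shape)  # (4, 128, 128)
```

The second case is the (batch, timesteps, features) layout that a recurrent layer accepts.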

Related

rnn and input shape on a 1d toy dataset with keras

I was trying to implement a simple RNN on a 1-dimensional toy dataset (just one x and one target variable y) as a learning exercise, but I am having an issue with the input shape.
X_train has shape (848,).
y_train has the same shape.
My model is very simple:
import tensorflow as tf
from keras.layers import Input, SimpleRNN, Dense
from keras.models import Model

def create_rnn_model(train_size):
    inputs = Input(shape=(1,))
    rnn = SimpleRNN(units=32, input_shape=(1,))(inputs)
    dense1 = Dense(units=64, activation='relu')(rnn)
    dense2 = Dense(units=64, activation='relu')(dense1)
    modello = Model(inputs=inputs, outputs=dense2)
    return modello

optimizer = tf.optimizers.SGD(learning_rate=0.0001, momentum=0.9)
rnn_model = create_rnn_model(train_size=train_size)
rnn_model.compile(optimizer=optimizer, loss="mse", jit_compile=True)
But whenever I try to fit it, I get this input shape error, which stops me from going further:
ValueError: Input 0 of layer "simple_rnn_5" is incompatible with the
layer: expected ndim=3, found ndim=2. Full shape received: (None, 1)
I know it depends on the format that the RNN wants, but I don't understand how to organize the data correctly.
I was trying without any window size, but I would really like to understand how to set a window size of 10, for example.
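One common way to get the 3-D (samples, timesteps, features) input that a recurrent layer expects is to cut the series into overlapping windows. The sketch below does that with NumPy for a window of 10. The series values are synthetic, and the helper name make_windows is hypothetical, not a Keras function.

```python
import numpy as np

def make_windows(x, y, window):
    """Return (X, Y) where X[i] holds x[i : i + window] and Y[i] is the
    target aligned with the last step of that window."""
    n = len(x) - window + 1
    X = np.stack([x[i:i + window] for i in range(n)])  # (n, window)
    Y = y[window - 1:]                                  # (n,)
    return X[..., np.newaxis], Y                        # add a features axis of size 1

x = np.linspace(0.0, 1.0, 848)  # synthetic stand-in for X_train
y = 2.0 * x                     # synthetic stand-in for y_train

X_win, y_win = make_windows(x, y, window=10)
print(X_win.shape, y_win.shape)  # (839, 10, 1) (839,)
```

With data in this form, the per-sample input shape would be (10, 1) rather than (1,).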

How to define size of input in Recurrent Neural Network for unsupervised learning model?

I'm trying to fit a neural network model (unsupervised).
I have only an input x, which is a vector with 10000 components, and I have an output, which is also a vector of the same size as x.
I've created the model using the below mentioned code:
model = tf.keras.Sequential()
model.add(tf.keras.layers.SimpleRNN(20,input_shape= (10000,1),activation='tanh',return_sequences=True,kernel_initializer='glorot_normal'))
model.add(tf.keras.layers.Dense(20,activation='tanh'))
model.add(tf.keras.layers.Dense(1,activation='tanh'))
I'm confused about the input shape here, as I'm getting a warning message which reads:
WARNING:tensorflow:Model was constructed with shape (None, 10000, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, 10000, 1), dtype=tf.float32, name='simple_rnn_2_input'), name='simple_rnn_2_input', description="created by layer 'simple_rnn_2_input'"), but it was called on an input with incompatible shape (10000, 1, 1).
How should I define the input shape within the model, and the shape of the input used for calling the model, so that the two match?
Also, is it possible to call the same model using different input sizes, like vector with 10000/100/50 components?
Any help would be appreciated. Thanks in advance.
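The warning says the model was built for (None, 10000, 1) but called with (10000, 1, 1), that is, 10000 sequences of length 1 instead of one sequence of 10000 steps. A minimal NumPy sketch of the two layouts, using a made-up vector in place of the real data:

```python
import numpy as np

x = np.arange(10000, dtype=np.float32)  # stand-in for the 10000-component vector

as_called = x.reshape(-1, 1, 1)    # 10000 sequences, 1 step each, 1 feature
as_expected = x.reshape(1, -1, 1)  # 1 sequence, 10000 steps, 1 feature

print(as_called.shape)    # (10000, 1, 1)
print(as_expected.shape)  # (1, 10000, 1)
```

The second layout matches the (batch, timesteps, features) shape the model was constructed with.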

What is the input shape of the InputLayer in keras Tensorflow?

I have this data
X_regression = tf.range(0, 1000, 5)
y_regression = X_regression + 100
X_reg_train, X_reg_test = X_regression[:150], X_regression[150:]
y_reg_train, y_reg_test = y_regression[:150], y_regression[150:]
I inspect the input data:
X_reg_train[0], X_reg_train[0].shape, X_reg_train[0].ndim
and it returns:
(<tf.Tensor: shape=(), dtype=int32, numpy=0>, TensorShape([]), 0)
I build a model:
# Set the random seed
tf.random.set_seed(42)
# Create the model
model_reg = tf.keras.models.Sequential()
# Add Input layer
model_reg.add(tf.keras.layers.InputLayer(input_shape=[1]))
# Add Hidden layers
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Add last layer
model_reg.add(tf.keras.layers.Dense(units=1))
# Compile the model
model_reg.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss=tf.keras.losses.mae,
                  metrics=[tf.keras.metrics.mae])
# Fit the model
model_reg.fit(X_reg_train, y_reg_train, epochs=10)
The model works.
However, I am confused about input_shape.
Why is it [1] in this situation? Why is it sometimes a tuple?
Would appreciate an explanation of different formats of input_shape in different situations.
Adding an InputLayer is equivalent to specifying the input_shape parameter on the first Dense layer. When you use method 2, Keras creates the InputLayer for you in the background.
# Method 1
model_reg.add(tf.keras.layers.InputLayer(input_shape=(1,)))
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Method 2
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(1,), activation=tf.keras.activations.relu))
The parameter input_shape is supposed to be a tuple. Notice that I set input_shape in your example to (1,), which is a tuple with a single element in it. As your data is 1D, you pass in a single element at a time, and therefore the input shape is (1,).
If your input data was a 2D input for example when trying to predict the price of a house based on multiple variables, you would have multiple rows and multiple columns of data. In this case, you pass in the input shape of the last dimension of the X_reg_train which is the number of inputs. If X_reg_train was (1000,10) then we use the input_shape of (10,).
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(X_reg_train.shape[1],), activation=tf.keras.activations.relu))
Ignoring the batch_size for a moment, with this we are actually just sending a single row of the data to predict a single house price. The batch_size is just there to chunk multiple rows of data together so that we do not have to load the entire dataset into memory at once, which is expensive; instead we send small chunks, with the default size being 32. When running the training you would have noticed that under each epoch it says 5/5, which refers to the 5 batches of data you have: the training size is 150, and 150 / 32 = 4.69, which rounds up to 5.
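The batch count in the progress bar is just a ceiling division. A small sketch of that arithmetic (the function name steps_per_epoch is chosen here for illustration):

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    # The last batch may be smaller than batch_size, so round up.
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(150))  # 5
```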
For a 3D input, the Dense layer effectively flattens it to a 2D input and restores the shape afterwards, i.e. (batch_size, sequence_length, dim) -> (batch_size * sequence_length, dim) -> (batch_size, sequence_length, hidden_units), which is the same as using a Conv1D layer with a kernel size of 1. So I wouldn't even use the Dense layer in this case.
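The flatten-and-restore description can be checked numerically. This is a NumPy sketch with random stand-in weights (W and b are not taken from a real Dense layer), showing that flattening first gives the same result as applying the same weights at every timestep:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, sequence_length, dim, hidden_units = 2, 5, 3, 4

x = rng.normal(size=(batch_size, sequence_length, dim))
W = rng.normal(size=(dim, hidden_units))  # stand-in for the Dense kernel
b = rng.normal(size=(hidden_units,))      # stand-in for the Dense bias

# Flatten to 2-D, apply the affine map, then restore the sequence axis.
flat = x.reshape(batch_size * sequence_length, dim)
via_flatten = (flat @ W + b).reshape(batch_size, sequence_length, hidden_units)

# Apply the same weights to every timestep directly.
per_step = x @ W + b

print(via_flatten.shape)                   # (2, 5, 4)
print(np.allclose(via_flatten, per_step))  # True
```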
In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.
Example: if you have 30 images of 50x50 pixels in RGB (3 channels), the shape of your input data is (30,50,50,3). Then your input layer tensor must have this shape, with the first axis (the 30 samples) left implicit as the batch dimension when you declare the shape (see details in the "shapes in keras" section).
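As a small illustration of the image example, the per-sample shape can be read straight off the data array. The array below is a zero-filled placeholder, not real image data:

```python
import numpy as np

images = np.zeros((30, 50, 50, 3), dtype=np.float32)  # 30 RGB images of 50x50

n_samples = images.shape[0]     # 30, the axis Keras treats as the batch
input_shape = images.shape[1:]  # per-sample shape

print(n_samples, input_shape)  # 30 (50, 50, 3)
```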
Each type of layer requires the input with a certain number of dimensions:
Dense layers require inputs as (batch_size, input_size) or (batch_size, optional,...,optional, input_size) or in your case just (input_size)
2D convolutional layers need inputs as:
if using channels_last: (batch_size, imageside1, imageside2, channels)
if using channels_first: (batch_size, channels, imageside1, imageside2)
1D convolutions and recurrent layers use (batch_size, sequence_length, features)
Here are some helpful links: "Keras input explanation: input_shape, units, batch_size, dim, etc" and https://keras.io/api/layers/core_layers/input/

How to correctly create a multi input neural network

I'm building a NN that takes two car images as input and classifies whether they are the same make and model. My problem is in the fit method of Keras, which raises this error:
ValueError: Error when checking target: expected dense_3 to have shape (1,) but got array with shape (2,)
The network architecture is the following:
input1=Input((150,200,3))
model1=InceptionV3(include_top=False, weights='imagenet', input_tensor=input1)
model1.layers.pop()
input2=Input((150,200,3))
model2=InceptionV3(include_top=False, weights='imagenet', input_tensor=input2)
model2.layers.pop()
for layer in model2.layers:
    layer.name = "custom_layer_" + layer.name
concat = concatenate([model1.layers[-1].output,model2.layers[-1].output])
flat = Flatten()(concat)
dense1=Dense(100, activation='relu')(flat)
do1=Dropout(0.25)(dense1)
dense2=Dense(50, activation='relu')(do1)
do2=Dropout(0.25)(dense2)
dense3=Dense(1, activation='softmax')(do2)
model = Model(inputs=[model1.input,model2.input],outputs=dense3)
My guess is that the error is due to the to_categorical method that I called on the array which stores, as 0 or 1, whether the two cars have the same make and model. Any suggestions?
Since you are doing binary classification with one-hot encoded labels, you should change this line:
dense3=Dense(1, activation='softmax')(do2)
To:
dense3=Dense(2, activation='softmax')(do2)
Softmax with a single neuron makes no sense, because a softmax over one value always outputs 1. Two neurons should be used for binary classification with softmax activation.
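A short pure-Python sketch of why a one-unit softmax is uninformative: whatever the logit is, normalizing a single value by itself gives 1. The softmax function here is written out by hand for illustration and is not the Keras implementation.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([-3.2]))      # [1.0]
print(softmax([57.0]))      # [1.0]
print(softmax([0.0, 0.0]))  # [0.5, 0.5]
```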

How to setup input shape for 1dCNN+LSTM network (Keras)?

I have the following idea to implement:
Input -> CNN-> LSTM -> Dense -> Output
The Input has 100 time steps, each step has a 64-dimensional feature vector
A Conv1D layer will extract features at each time step. The CNN layer contains 64 filters, each has length 16 taps. Then, a maxpooling layer will extract the single maximum value of each convolutional output, so a total of 64 features will be extracted at each time step.
Then, the output of the CNN layer will be fed into an LSTM layer with 64 neurons. Number of recurrence is the same as time step of input, which is 100 time steps. The LSTM layer should return a sequence of 64-dimensional output (the length of sequence == number of time steps == 100, so there should be 100*64=6400 numbers).
mfcc_input = Input(shape=(100,64), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
But this doesn't work. The second line reports "list index out of range" and I don't understand what's going on.
I'm new to Keras, so I appreciate sincerely if anyone could help me with it.
This picture explains how CNN should be applied to EACH TIME STEP
The problem is with your input. Your input is of shape (100, 64), in which the first dimension is the timesteps. TimeDistributed applies Conv1D to each timestep separately, so ignoring that dimension, the Conv1D receives an input of shape (64,).
Now, refer to the Keras Conv1D documentation, which states that the input should be a 3D tensor (batch_size, steps, input_dim). Ignoring the batch_size, your input should be a 2D tensor (steps, input_dim).
So, you are providing a 1D tensor where a 2D tensor is expected. For example, if you are providing natural language input to the Conv1D in the form of words, and there are 64 words in your sentence with each word encoded as a vector of length 50, your input should be (64, 50).
Also, make sure that you are feeding the right input to LSTM as given in the code below.
So, the correct code should be
embedding_size = 50 # Set this accordingly
mfcc_input = Input(shape=(100, 64, embedding_size), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
# Feeding CNN_out directly to the LSTM would also raise an error: after pooling, the 3rd dimension has size 1, so remove it first
CNN_out = Reshape((int(CNN_out.shape[1]), int(CNN_out.shape[3])))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
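The shapes in the corrected code can be traced by hand. The sketch below does the arithmetic in plain Python, assuming 'valid' padding and a convolution stride of 1 as in the code above; the two helper functions are hypothetical and only reproduce the standard output-length formula, they are not Keras calls.

```python
def conv1d_valid_len(steps, kernel_size, stride=1):
    # Output length of a 1-D convolution with 'valid' padding.
    return (steps - kernel_size) // stride + 1

def pool1d_valid_len(steps, pool_size, stride=None):
    # With strides=None, the pooling stride defaults to pool_size.
    stride = pool_size if stride is None else stride
    return (steps - pool_size) // stride + 1

timesteps, steps, embedding_size = 100, 64, 50  # per-sample input: (100, 64, 50)
filters, kernel_size = 64, 16

conv_len = conv1d_valid_len(steps, kernel_size)  # 64 - 16 + 1 = 49
pool_len = pool1d_valid_len(conv_len, conv_len)  # pooling over all 49 values leaves 1

after_conv = (timesteps, conv_len, filters)  # (100, 49, 64)
after_pool = (timesteps, pool_len, filters)  # (100, 1, 64)
after_reshape = (timesteps, filters)         # (100, 64)

print(after_conv, after_pool, after_reshape)
```

After the Reshape, each sample is (100, 64), which becomes the 3-D (batch, timesteps, features) tensor the LSTM expects once the batch axis is included.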
