I have this data
X_regression = tf.range(0, 1000, 5)
y_regression = X_regression + 100
X_reg_train, X_reg_test = X_regression[:150], X_regression[150:]
y_reg_train, y_reg_test = y_regression[:150], y_regression[150:]
I inspect the input data:
X_reg_train[0], X_reg_train[0].shape, X_reg_train[0].ndim
and it returns:
(<tf.Tensor: shape=(), dtype=int32, numpy=0>, TensorShape([]), 0)
I build a model:
# Set the random seed
tf.random.set_seed(42)
# Create the model
model_reg = tf.keras.models.Sequential()
# Add Input layer
model_reg.add(tf.keras.layers.InputLayer(input_shape=[1]))
# Add Hidden layers
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Add last layer
model_reg.add(tf.keras.layers.Dense(units=1))
# Compile the model
model_reg.compile(optimizer=tf.keras.optimizers.Adam(),
loss=tf.keras.losses.mae,
metrics=[tf.keras.metrics.mae])
# Fit the model
model_reg.fit(X_reg_train, y_reg_train, epochs=10)
The model works.
However, I am confused about input_shape
Why is it [1] in this situation? Why is it sometimes a tuple?
Would appreciate an explanation of different formats of input_shape in different situations.
Adding an InputLayer is equivalent to specifying the input_shape parameter on the first Dense layer. When you use method 2, Keras creates an InputLayer in the background.
# Method 1
model_reg.add(tf.keras.layers.InputLayer(input_shape=(1,)))
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Method 2
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(1,), activation=tf.keras.activations.relu))
The input_shape parameter is supposed to be a tuple. Notice that I set input_shape in your example to (1,), which is a tuple with a single element. A list such as [1] is also accepted, which is why your version works. Because your data is 1D, each sample is a single value, so the input shape is (1,).
If your input data were 2D, for example when predicting the price of a house from multiple variables, you would have multiple rows and multiple columns of data. In this case, you pass the size of the last dimension of X_reg_train, which is the number of input features. If X_reg_train had shape (1000, 10), you would use an input_shape of (10,).
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(X_reg_train.shape[1],), activation=tf.keras.activations.relu))
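To make the rule concrete, here is a rough sketch in plain Python. The helper name dense_input_shape is made up for illustration and is not a Keras function; it only assumes that input_shape describes one sample, i.e. the data shape with the first (sample) axis dropped.

```python
def dense_input_shape(data_shape):
    """Return the per-sample shape to pass as input_shape.

    Hypothetical helper (not part of Keras): drops the leading sample
    axis. A 1-D dataset of scalars, e.g. shape (150,), is treated as
    one feature per sample.
    """
    per_sample = tuple(data_shape[1:])
    # A scalar per sample still counts as a single input feature.
    return per_sample if per_sample else (1,)


print(dense_input_shape((150,)))      # 1-D data like X_reg_train
print(dense_input_shape((1000, 10)))  # 1000 rows, 10 columns
```

The first call corresponds to the data in the question, the second to the house-price example.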
Ignoring the batch_size for a moment, with this we are sending a single row of the data to predict a single house price. The batch_size is there to chunk multiple rows of data together so that we do not have to load the entire dataset into memory at once, which is expensive; instead we send small chunks, with the default size being 32. When running the training you will have noticed that each epoch shows 5/5, which refers to the 5 batches of data you have: the training size is 150, and 150 / 32 ≈ 4.7, which rounds up to 5.
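The batch arithmetic can be checked with a few lines of plain Python. This is only a sketch of the counting, using the 150 training samples from the question and the default batch size of 32; the variable names are mine.

```python
import math

n_train = 150      # training samples in the question
batch_size = 32    # default batch size used by model.fit

# Number of batches per epoch: the last one may be only partly filled.
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch = n_train - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch)  # batches shown per epoch in the progress bar
print(last_batch)       # size of the final, partial batch
```

So four full batches of 32 are followed by one smaller batch holding the remaining samples.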
For 3D input with the Dense layer it actually just gets flattened to a 2D input, i.e. from (batch_size, sequence_length, dim) -> (batch_size * sequence_length, dim) -> (batch_size, sequence_length, hidden_units) which is the same as using a Conv1D layer with a kernel of 1. So I wouldn't even use the Dense layer in this case.
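To illustrate what "applied along the last axis" means, here is a minimal plain-Python sketch using nested lists instead of tensors. The function dense_last_axis is a hypothetical stand-in, not the Keras implementation: it applies the same weights and bias to every position of every sequence, so only the last dimension changes.

```python
def dense_last_axis(x, weights, bias):
    """Apply y = x @ W + b along the last axis of a nested list.

    x:       nested list of shape (batch, seq_len, dim)
    weights: nested list of shape (dim, units)
    bias:    list of length units
    """
    units = len(bias)
    out = []
    for sample in x:
        rows = []
        for step in sample:
            # Same weights for every time step: a per-position matmul.
            rows.append([
                sum(step[i] * weights[i][j] for i in range(len(step))) + bias[j]
                for j in range(units)
            ])
        out.append(rows)
    return out


x = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]          # shape (1, 3, 2)
W = [[1.0, 0.0, 1.0, 0.5], [0.0, 1.0, 1.0, 0.5]]    # shape (2, 4)
b = [0.0, 0.0, 0.0, 0.0]

y = dense_last_axis(x, W, b)
print(len(y), len(y[0]), len(y[0][0]))  # batch and sequence axes are unchanged
```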
In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.
Example: if you have 30 images of 50x50 pixels in RGB (3 channels), the shape of your input data is (30,50,50,3). Your input layer tensor must then have this shape (see details in the "shapes in keras" section).
Each type of layer requires the input with a certain number of dimensions:
Dense layers require inputs as (batch_size, input_size) or (batch_size, optional,...,optional, input_size) or in your case just (input_size)
2D convolutional layers need inputs as:
if using channels_last: (batch_size, imageside1, imageside2, channels)
if using channels_first: (batch_size, channels, imageside1, imageside2)
1D convolutions and recurrent layers use (batch_size, sequence_length, features)
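As a quick way to remember the list above, the expected number of dimensions (batch axis included) can be written down as a small lookup. This is my own summary of that list in plain Python, not an official Keras table, and check_rank is a hypothetical helper.

```python
# Expected input rank (batch axis included) per layer family.
# Assumed summary of the list above, not something Keras exposes.
EXPECTED_NDIM = {
    "Dense": None,   # any rank >= 2: (batch_size, ..., input_size)
    "Conv2D": 4,     # (batch_size, side1, side2, channels) or channels first
    "Conv1D": 3,     # (batch_size, sequence_length, features)
    "LSTM": 3,       # (batch_size, sequence_length, features)
}


def check_rank(layer, batch_shape):
    """Return True if batch_shape has a rank the layer family accepts."""
    expected = EXPECTED_NDIM[layer]
    ndim = len(batch_shape)
    if expected is None:
        return ndim >= 2
    return ndim == expected


print(check_rank("Conv2D", (30, 50, 50, 3)))  # the image example above
print(check_rank("LSTM", (32, 10)))           # missing the features axis
```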
Here are some helpful links: "Keras input explanation: input_shape, units, batch_size, dim, etc" and https://keras.io/api/layers/core_layers/input/
When passing the output of my embedding layer to the LSTM layer I'm running into a ValueError that I cannot figure out. My model is:
def lstm_mod(self, n_cells,batch_size):
input = tf.keras.Input((self.n_seq, self.n_features))
embedding = tf.keras.layers.Embedding(batch_size,self.n_seq,input_length=self.n_clusters)(input)
x= tf.keras.layers.LSTM(n_cells)(embedding)
out = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(input, out,name="LSTM")
model.compile(loss='mse', optimizer='Adam')
return model
The error is:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 128, 7, 128]
Given that the dimensions passed to the model input and the embedding layer are consistent through the arguments of the model I'm puzzled by this. Any guidance is appreciated.
Keras adds an additional dimension (None) when you feed your data through your model because it processes your data in batches.
In this line :
input = tf.keras.Input((self.n_seq, self.n_features))
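You've defined a 2-dimensional input, and Keras adds a 3rd dimension (the batch), so the input tensor has shape (None, n_seq, n_features). An LSTM layer expects exactly this kind of 3-dimensional tensor, hence expected ndim=3.
However, the Embedding layer sits between the input and the LSTM. It replaces every integer in its input with a vector of length output_dim, which appends one more dimension. Judging by the shape in the error, (None, 128, 7) becomes (None, 128, 7, 128), and that 4-dimensional tensor is what the LSTM rejects.
To fix this you need to either feed the Embedding layer a single sequence of integer ids per sample, so that its output is 3-dimensional, or, if your features are already numeric, drop the Embedding layer and pass the input straight to the LSTM.
Print out the values for self.n_seq and self.n_features and compare them with the shape 128, 7, 128 to confirm which dimension was added.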
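One way to see where a fourth dimension can come from is to trace the shapes by hand. The sketch below is plain Python; embedding_output_shape is a hypothetical helper that only encodes the fact that an Embedding layer appends an axis of size output_dim. The values 128 and 7 are read off the error message and are an assumption about self.n_seq and self.n_features.

```python
def embedding_output_shape(input_shape, output_dim):
    """Shape after an Embedding layer.

    The layer looks up a vector of length output_dim for every integer
    in its input, so it appends one axis to the input shape.
    """
    return tuple(input_shape) + (output_dim,)


# Assumed from "Full shape received: [None, 128, 7, 128]".
n_seq, n_features = 128, 7

# Input((n_seq, n_features)) plus the batch axis added by Keras.
input_shape = (None, n_seq, n_features)

# In the question, the second Embedding argument (output_dim) is self.n_seq.
shape = embedding_output_shape(input_shape, output_dim=n_seq)
print(shape, len(shape))
```

The resulting rank is one more than the LSTM accepts, which matches the error in the question.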
Can I use Convolutional layers of keras without gpu support? I am getting errors when I use it on Colab with runtime as None.
My code looks like this:
model = tf.keras.Sequential()
model.add(layers.Conv1D(1,5, name='conv1', padding="same", activation='relu',data_format="channels_first", input_shape=(1,2048)))
# model.add(layers.LSTM(5, activation='tanh'))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(num_classes, activation='softmax'))
#model.summary()
model.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
x_train = train_value
y_train = train_label
x_test = test_value
y_test = test_label
print(np.shape(x_train)) #shape of x train is (4459, 1, 2048)
print(np.shape(x_test)) #shape of test is (1340,1,2048)
history = model.fit(x_train, y_train,
batch_size=100,
epochs=30,
verbose=1,
validation_data=(x_test, y_test)
)
It is running fine on GPU but gives following error on CPU:
InvalidArgumentError: Conv2DCustomBackpropFilterOp only supports NHWC.
[[{{node
training/SGD/gradients/gradients/conv1/conv1d_grad/Conv2DBackpropFilter}}]]
UnimplementedError: The Conv2D op currently only supports the NHWC
tensor format on the CPU. The op was given the format: NCHW [[{{node
conv1_1/conv1d}}]]
I have figured out that the problem is with the format of Input Data. My input data are vectors of size (1,2048). Can you please guide me on how to convert these vectors to NHWC format?
I would really appreciate it, if someone can clear this up for me.
Thanks in advance.
Per the Keras documentation
data_format: A string, one of "channels_last" (default) or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, steps, channels) (default format for temporal data in Keras) while "channels_first" corresponds to inputs with shape (batch, channels, steps)
Now Keras in TensorFlow appears to implement Conv1D in terms of a Conv2D operator - basically forming an "image" with 1 row, W columns, and then your C "channels". That's why you're getting error messages about image shapes when you don't have image data.
In the docs above "channels" are the number of data items per time step (e.g. perhaps you have 5 sensor readings at each time step so you'd have 5 channels). From your answers above I believe you're passing tensors with shape (n, 1, 2048) where n is your batch size. So, with channels_last TensorFlow thinks that means you have n examples in your batch each with a sequence length of 1 and 2048 data items per time step - that is only a single time step with 2048 data items per observation (e.g. 2048 sensor readings taken at each time step) in which case the convolution won't be doing a convolution - it'd be equivalent to a single dense layer taking all 2048 numbers as input.
I think in reality you have only a single data item per time step and you have 2048 time steps. That explains why passing channels_first improves your accuracy - now TensorFlow understands that your data represents 1 data item sampled 2048 times and it can do a convolution over that data.
To fix this, you can just tf.reshape(t, (1, 2048, 1)) and remove the channels_first argument (that code assumes you're doing batches of size 1 and your tensor is named t). Now it's in the format (n, s, 1) where n is the batch size (1 here), s is the number of time steps (2048), and 1 indicates one data point per time step. You can now run the same model on the GPU or CPU.
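For a whole batch rather than a single example, the same rearrangement looks like this in plain Python with nested lists. This is only a sketch of what the reshape does to the data; channels_first_to_last is a made-up helper, and in TensorFlow itself I would expect tf.reshape(t, (-1, 2048, 1)) to do the same thing for any batch size.

```python
def channels_first_to_last(batch):
    """Turn (n, 1, steps) nested lists into (n, steps, 1).

    Assumes exactly one channel per sample, as in the question.
    """
    return [[[value] for value in sample[0]] for sample in batch]


# Tiny stand-in for the real data: 2 samples, 1 channel, 4 time steps.
batch = [[[0.1, 0.2, 0.3, 0.4]],
         [[0.5, 0.6, 0.7, 0.8]]]

converted = channels_first_to_last(batch)
print(len(converted), len(converted[0]), len(converted[0][0]))
```

With one channel, moving the channel axis to the end does not reorder any values; it only changes how the 2048 numbers are grouped.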
My data is shaped like this:
(50000, 28, 28)
These are 2D images, and the first dimension is the number of samples. My first layer does flattening. I also want to add BatchNormalization layers:
model.add(ll.InputLayer([28, 28]))
model.add(ll.Flatten())
# network body
model.add(ll.Dense(1000, kernel_regularizer=regularizers.l2(0.01), activation='elu'))
model.add(ll.BatchNormalization())
What is the correct value for axis to pass?
You usually want to normalize the feature dimension over the batch so, for flat inputs, you should set axis=-1 in BatchNormalization.
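To show what "normalize the feature dimension over the batch" means, here is a minimal plain-Python sketch. It is not the Keras layer: it leaves out the learned scale and offset and the moving averages, and batch_norm_last_axis is a name I made up. Each column (feature) gets its own mean and variance, computed across the rows (samples).

```python
def batch_norm_last_axis(batch, epsilon=1e-3):
    """Normalize each feature (last axis) with statistics over the batch.

    Sketch only: no learned gamma/beta and no moving averages.
    batch is a nested list of shape (n_samples, n_features).
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i in range(n):
            out[i][j] = (batch[i][j] - mean) / (var + epsilon) ** 0.5
    return out


batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm_last_axis(batch)

# Every feature column is centred on zero after normalization.
col_means = [sum(row[j] for row in normalized) / len(normalized) for j in range(2)]
print(col_means)
```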
I have the following idea to implement:
Input -> CNN-> LSTM -> Dense -> Output
The Input has 100 time steps, each step has a 64-dimensional feature vector
A Conv1D layer will extract features at each time step. The CNN layer contains 64 filters, each has length 16 taps. Then, a maxpooling layer will extract the single maximum value of each convolutional output, so a total of 64 features will be extracted at each time step.
Then, the output of the CNN layer will be fed into an LSTM layer with 64 neurons. Number of recurrence is the same as time step of input, which is 100 time steps. The LSTM layer should return a sequence of 64-dimensional output (the length of sequence == number of time steps == 100, so there should be 100*64=6400 numbers).
mfcc_input = Input(shape=(100,64), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
But this doesn't work. The second line reports "list index out of range" and I don't understand what's going on.
I'm new to Keras, so I would sincerely appreciate it if anyone could help me with it.
This picture explains how CNN should be applied to EACH TIME STEP
The problem is with your input. Your input is of shape (100, 64) in which the first dimension is the timesteps. So ignoring that, your input is of shape (64) to a Conv1D.
Now, refer to the Keras Conv1D documentation, which states that the input should be a 3D tensor (batch_size, steps, input_dim). Ignoring the batch_size, your input should be a 2D tensor (steps, input_dim).
So you are providing a 1D tensor where a 2D tensor is expected. For example, if you are providing natural language input to the Conv1D in the form of words, and there are 64 words in your sentence with each word encoded as a vector of length 50, your input should be (64, 50).
Also, make sure that you are feeding the right input to LSTM as given in the code below.
So, the correct code should be
embedding_size = 50 # Set this accordingly
mfcc_input = Input(shape=(100, 64, embedding_size), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
# Directly feeding CNN_out to LSTM will also raise Error, since the 3rd dimension is 1, you need to purge it as
CNN_out = Reshape((int(CNN_out.shape[1]), int(CNN_out.shape[3])))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
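To check the shapes through this pipeline without running Keras, the lengths can be worked out by hand. The sketch below is plain Python; conv1d_valid_length is a hypothetical helper implementing the usual output-length formula for 'valid' padding, and the shapes omit the batch axis. The numbers come from the code above (embedding_size = 50 is the placeholder value used there).

```python
def conv1d_valid_length(steps, kernel_size, stride=1):
    """Output length of a 1-D convolution or pooling with 'valid' padding."""
    return (steps - kernel_size) // stride + 1


time_steps, feat_len, embedding_size = 100, 64, 50
filters, kernel_size = 64, 16

conv_len = conv1d_valid_length(feat_len, kernel_size)
pool_size = feat_len - kernel_size + 1
# With strides=None the pooling stride defaults to pool_size.
pooled_len = conv1d_valid_length(conv_len, pool_size, stride=pool_size)

shapes = [
    ("Input", (time_steps, feat_len, embedding_size)),
    ("TimeDistributed(Conv1D)", (time_steps, conv_len, filters)),
    ("TimeDistributed(MaxPooling1D)", (time_steps, pooled_len, filters)),
    ("Reshape", (time_steps, filters)),
    ("LSTM(return_sequences=True)", (time_steps, 64)),
]
for name, shape in shapes:
    print(name, shape)
```

The pooled axis has length 1, which is why the Reshape is needed before the LSTM.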
I've got a question on Tensorflow LSTM-Implementation. There are currently several implementations in TF, but I use:
cell = tf.contrib.rnn.BasicLSTMCell(n_units)
where n_units is the amount of 'parallel' LSTM-Cells.
Then to get my output I call:
rnn_outputs, rnn_states = tf.nn.dynamic_rnn(cell, x,
initial_state=initial_state, time_major=False)
where (as time_major=False) x is of shape (batch_size, time_steps, input_length)
where batch_size is my batch_size
where time_steps is the amount of timesteps my RNN will go through
where input_length is the length of one of my input vectors (vector fed into the network on one specific timestep on one specific batch)
I expect rnn_outputs to be of shape (batch_size, time_steps, n_units, input_length) as I have not specified another output size.
Documentation of nn.dynamic_rnn tells me that output is of shape (batch_size, time_steps, cell.output_size).
The documentation of tf.contrib.rnn.BasicLSTMCell does have a property output_size, which is defaulted to n_units (the amount of LSTM-cells I use).
So does each LSTM-Cell only output a scalar for every given timestep? I would expect it to output a vector of the length of the input vector. This seems not to be the case from how I understand it right now, so I am confused. Can you tell me whether that's the case or how I could change it to output a vector of size of the input vector per single lstm-cell maybe?
I think the primary confusion is about the terminology of the LSTM cell's argument: num_units. Unfortunately it doesn't mean, as the name suggests, "the number of LSTM cells", and it has nothing to do with your time steps. It actually corresponds to the dimensionality of the hidden state (both the cell state and the hidden state vector have this size).
The call to dynamic_rnn() returns a tensor of shape: [batch_size, time_steps, output_size] where,
(Please note this) output_size = num_units if num_proj is None in the LSTM cell,
whereas output_size = num_proj if it is defined.
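The rule for the output shape can be written as a tiny plain-Python function. This is only a sketch of the bookkeeping described above, not TensorFlow code; dynamic_rnn_output_shape is a name I made up, and the example sizes are arbitrary.

```python
def dynamic_rnn_output_shape(batch_size, time_steps, num_units, num_proj=None):
    """Shape of rnn_outputs with time_major=False.

    output_size is num_units unless a projection size is given.
    Note that input_length does not appear anywhere in the result.
    """
    output_size = num_units if num_proj is None else num_proj
    return (batch_size, time_steps, output_size)


# Arbitrary example sizes: 32 sequences, 20 steps, 128 hidden units.
print(dynamic_rnn_output_shape(32, 20, num_units=128))
print(dynamic_rnn_output_shape(32, 20, num_units=128, num_proj=10))
```

So each time step yields one vector of length output_size per sequence, regardless of how long the input vectors are.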
Now, typically, you will extract the last time_step's result and project it to the size of output dimensions using a mat-mul + biases operation manually, or use the num_proj argument in the LSTM cell.
I have been through the same confusion and had to look really deep to get it cleared. Hope this answer clears some of it.