Keras and input shape to Conv1D issues - python

First off, I am very new to Neural Nets and Keras.
I am trying to create a simple Neural Network using Keras where the input is a time series and the output is another time series of same length (1 dimensional vectors).
I made dummy code to create random input and output time series using a Conv1D layer. The Conv1D layer then outputs 6 different time series (because I have 6 filters) and the next layer I define to add all 6 of those outputs into one which is the output to the entire network.
import numpy as np
import tensorflow as tf
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Conv1D, Input, Lambda
def summation(x):
y = tf.reduce_sum(x, 0)
return y
time_len = 100 # total length of time series
num_filters = 6 # number of filters/outputs to Conv1D layer
kernel_len = 10 # length of kernel (memory size of convolution)
# create random input and output time series
X = np.random.randn(time_len)
Y = np.random.randn(time_len)
# Create neural network architecture
input_layer = Input(shape = X.shape)
conv_layer = Conv1D(filters = num_filters, kernel_size = kernel_len, padding = 'same')(input_layer)
summation_layer = Lambda(summation)(conv_layer)
model = Model(inputs = input_layer, outputs = summation_layer)
model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
model.fit(X,Y,epochs = 1, metrics = ['mae'])
The error I get is:
ValueError: Input 0 of layer conv1d_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 100]
Looking at the Keras documentation for Conv1D, the input shape is supposed to be a 3D tensor of shape (batch, steps, channels) which I don't understand if we are working with 1 dimensional data.
Can you explain the meaning of each of the items: batch, steps, and channels? And how should I shape my 1D vectors to allow my network to run?

What is a (training) sample?
The (training) data may consists of tens, hundreds or thousands of samples. For example, each image in an image dataset like Cifar-10 or ImageNet is a sample. As another example, for a timseries dataset which consists of weather statistics recorded during the days over 10 years, each training sample may be a timeseries of each day. If we have recorded 100 measurements during the day and each measurement consists of temperature and humidity (i.e. we have two features per measurement) then the shape of our dataset is roughly (10x365, 100, 2).
What is batch size?
The batch size is simply the number of samples that can be processed by the model at a single time. We can set the batch size using the batch_size argument of fit method in Keras. The common values are 16, 32, 64, 128, 256, etc (though you must choose a number such that your machine could have enough RAM to allocate the required resources).
Further, the "steps" (also called "sequence length") and "channels" (also called "feature size") are the number of measurements and the size of each measurement, respectively. For example in our weather example above, we have steps=100 and channels=2.
To resolve the issue with your code you need to define your training data (i.e. X) such that it has a shape of (num_samples, steps or time_len, channels or feat_size):
n_samples = 1000 # we have 1000 samples in our training data
n_channels = 1 # each measurement has one feature
X = np.random.randn(n_samples, time_len, n_channels)
# if you want to predict one value for each measurement
Y = np.random.randn(n_samples, time_len)
# or if you want to predict one value for each sample
Y = np.random.randn(n_samples)
Edit:
One more thing is that you should pass the shape of one sample as the input shape of the model. Therefore, the input shape of Input layer must be passed like shape=X.shape[1:].

Related

What is the input shape of the InputLayer in keras Tensorflow?

I have this data
X_regression = tf.range(0, 1000, 5)
y_regression = X + 100
X_reg_train, X_reg_test = X_regression[:150], X_regression[150:]
y_reg_train, y_reg_test = y_regression[:150], y_regression[150:]
I inspect the data input data
X_reg_train[0], X_reg_train[0].shape, X_reg_train[0].ndim
and it returns:
(<tf.Tensor: shape=(), dtype=int32, numpy=0>, TensorShape([]), 0)
I build a model:
# Set the random seed
tf.random.set_seed(42)
# Create the model
model_reg = tf.keras.models.Sequential()
# Add Input layer
model_reg.add(tf.keras.layers.InputLayer(input_shape=[1]))
# Add Hidden layers
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Add last layer
model_reg.add(tf.keras.layers.Dense(units=1))
# Compile the model
model_reg.compile(optimizer=tf.keras.optimizers.Adam(),
loss=tf.keras.losses.mae,
metrics=[tf.keras.metrics.mae])
# Fit the model
model_reg.fit(X_reg_train, y_reg_train, epochs=10)
The model works.
However, I am confused about input_shape
Why is it [1] in this situation? Why is it sometimes a tuple?
Would appreciate an explanation of different formats of input_shape in different situations.
InputLayer is actually just the same as specifying the parameter input_shape in a Dense layer. Keras actually uses InputLayer when you use method 2 in the background.
# Method 1
model_reg.add(tf.keras.layers.InputLayer(input_shape=(1,)))
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Method 2
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(1,), activation=tf.keras.activations.relu))
The parameter input_shape is actually supposed to be a tuple, if you noticed that I set the input_shape in your example to be (1,) this is a tuple with a single element in it. As your data is 1D, you pass in a single element at a time therefore the input shape is (1,).
If your input data was a 2D input for example when trying to predict the price of a house based on multiple variables, you would have multiple rows and multiple columns of data. In this case, you pass in the input shape of the last dimension of the X_reg_train which is the number of inputs. If X_reg_train was (1000,10) then we use the input_shape of (10,).
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(X_reg_train.shape[1],), activation=tf.keras.activations.relu))
Ignoring the batch_size for a moment, with this we are actually just sending a single row of the data to predict a single house price. The batch_size is just here to chunk multiple rows of data together so that we do not have to load the entire dataset into memory which is computationally expensive, so we send small chunks, with the default value being 32. When running the training you would have noticed that under each epoch it says 5/5 which are for the 5 batches of data you have, since the training size is 150, 150 / 32 = 5(rounded up).
For 3D input with the Dense layer it actually just gets flattened to a 2D input, i.e. from (batch_size, sequence_length, dim) -> (batch_size * sequence_length, dim) -> (batch_size, sequence_length, hidden_units) which is the same as using a Conv1D layer with a kernel of 1. So I wouldn't even use the Dense layer in this case.
In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.
Example: if you have 30 images of 50x50 pixels in RGB (3 channels), the shape of your input data is (30,50,50,3). Then your input layer tensor, must have this shape (see details in the "shapes in keras" section).
Each type of layer requires the input with a certain number of dimensions:
Dense layers require inputs as (batch_size, input_size) or (batch_size, optional,...,optional, input_size) or in your case just (input_size)
2D convolutional layers need inputs as:
if using channels_last: (batch_size, imageside1, imageside2, channels)
if using channels_first: (batch_size, channels, imageside1, imageside2)
1D convolutions and recurrent layers use (batch_size, sequence_length, features)
Here are some helpful links : Keras input explanation: input_shape, units, batch_size, dim, etc https://keras.io/api/layers/core_layers/input/

Train autoencoder without looping

I have some data that is of shape 10000 x 1440 x 8 where 10000 is the number of days, 1440 the number of minutes and 8 is the number of features.
For each day, ie. each submatrix of size 1440 x 8 I wish to train an autoencoder and extract the weights from the second layer, such that my output will be a matrix output = 10000 x 8
I can do this in a loop with
import numpy as np
from keras.layers import Input, Dense
from keras import regularizers, models, optimizers
data = np.random.random(size=(10000,1440,8))
def AE(y, epochs=100,learning_rate = 1e-4, regularization = 5e-4, epochs=3):
input = Input(shape=(y.shape[1],))
encoded = Dense(1, activation='relu',
kernel_regularizer=regularizers.l2(regularization))(input)
decoded = Dense(y.shape[1], activation='relu',
kernel_regularizer=regularizers.l2(regularization))(encoded)
autoencoder = models.Model(input, decoded)
autoencoder.compile(optimizer=optimizers.Adam(lr=learning_rate), loss='mean_squared_error')
autoencoder.fit(y, y, epochs=epochs, batch_size=10, shuffle=False)
(w1,b1,w2,b2)=autoencoder.get_weights()
return (w1,b1,w2,b2)
lst = []
for i in range(data.shape[0]):
y = data[i]
(_, _, w2, _) = AE(y)
lst.append(w2[0])
output = np.array(lst)
However, this feels very stupid as surely I must be able to just pass the 3D data to the autoencoder and retrieve what I want. However, if I try modify the shape of input to be input = Input(shape=(y.shape[1],y.shape[2]))
I get an error
ValueError: Dimensions must be equal, but are 1440 and 8 for '{{node
mean_squared_error/SquaredDifference}} =
SquaredDifference[T=DT_FLOAT](model_778/dense_1558/Relu,
IteratorGetNext:1)' with input shapes: [?,1440,1440], [?,1440,8].
Any pointers on how to get the shape right?
Simply reshape your your data like so and call the function.
data = data.reshape(data.shape[0]*data.shape[1], -1)
(w1, b1, w2, b2) = AE(data)
print(w2.shape)
Your first layer of the NN is a Dense layer. You can only pass two dimensional data into it. One dimension will be batch size and the other dimension will be the feature vector. When you are using the data in the way you are using it, you are considering each data point independently. Which means that you can join the first two axes together and just pass it on to the NN. However, note that you would still need to modify the code so that you are not passing the entire dataset at once to the NN. You need to split the data into batches and loop over those before passing it on. And honestly, it's the same as what you are doing now. So your looping is not as bad as you think it is for what you are trying to do.
However, also note that you have a time series data and considering each datapoint as an independent point doesn't really make sense. You need an LSTM layer or something to learn the time series encoding.

Creating a Keras CNN for image alteration

I'm working on a problem that involves computationally evaluating three-dimensional data of the shape (32, 16, 5) and providing a corrected form of this data also in the shape of (32, 16, 5). The problem is relatively specific to my field, but it can be viewed as analogous to processing color images (just with five color channels instead of three). If it helps, this could be thought of as a color correction model.
In my initial efforts, I created a random forest model using XGBoost for each of these output parameters. I had good results, but found that the sheer number of output parameters (32*16*5 = 2560) made the runtime of this approach too long, so I am looking for an alternative.
I'm looking at using Keras to solve this, using a convolutional neural network approach, since the adjacent 'pixels' in my data should have some useful information about their neighbors. Note that 'adjacency' here is both spatial and in the color channels. So far, I am doing alright in creating a simple model that I believe has inputs/outputs of the correct shape, but I am running into an issue when I try to train the model on some dummy images:
#!/usr/bin/env python3
import tensorflow as tf
import pandas as pd
import numpy as np
def create_model(image_shape, batch_size = 10):
width, height, channels = image_shape
conv_shape = (batch_size, width, height, channels)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv3D(filters = channels, kernel_size = 3, input_shape = conv_shape, padding = "same"))
model.add(tf.keras.layers.Dense(channels, activation = "relu"))
return model
if __name__ == "__main__":
image_shape = (32, 16, 5)
# Create test input/output data sets:
input_img = np.random.rand(*image_shape) # Create one dummy input image
output_img = np.random.rand(*image_shape) # Create one dummy output image
# Create a bogus 'training set' by copying the input/output images into lists many times
inputs = [input_img]*500
outputs = [output_img]*500
# Create the model and fit it to the dummy data
model = create_model(image_shape)
model.summary()
model.compile(loss = "mean_squared_error", optimizer = "adam", metrics = ["accuracy"])
model.fit(input_img, output_img)
However, when I run this code, I get the following error:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=3. Full shape received: [32, 16, 5]
I am not really sure what the other two expected dimensions are for the data passed into model.fit(). I suspect this is a problem with the way that I am formatting my input data. Even if I have a list of input/output images, that will only bring the ndim of my data to 4, not 5.
I have been trying to find similar examples in the documentation and around the web to see what I'm doing incorrectly, but 3D convolution on a non-classifier network seems a bit off the beaten path, and I'm not having much luck (or just don't know the name of what I should search for).
I have tried passing the dummy training set to model.fit instead of two individual images. Fitting with model.fit(inputs, outputs) instead, I get:
ValueError: Layer sequential expects 1 inputs, but it received 500 input tensors.
It seems that passing a list of tensors isn't correct here. If I convert the list of input images to numpy arrays with:
inputs = np.array(inputs)
outputs = np.array(outputs)
This does bring up the number of dimensions in my input data to 4, but Keras is still expecting 5. The error I get in this case is very similar to the first:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=4. Full shape received: [None, 32, 16, 5]
I'm definitely not understanding something here, and any help would be appreciated.
I think you made two mistakes in your code:
Instead of using Conv3D, you need to use Conv2D.
model.fit(input_img, output_img) should be model.fit(inputs, outputs).
The reason why you need to use Conv2D is the shape of your data is (length,width,channel), it doesn't possess an extra dimension.
Try the script below
#!/usr/bin/env python3
import tensorflow as tf
import pandas as pd
import numpy as np
def create_model(image_shape, batch_size = 10):
width, height, channels = image_shape
conv_shape = (width, height, channels)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(filters = channels, kernel_size = 3, input_shape = conv_shape, padding = "same"))
model.add(tf.keras.layers.Dense(channels, activation = "relu"))
return model
if __name__ == "__main__":
image_shape = (32, 16, 5)
# Create test input/output data sets:
input_img = np.random.rand(*image_shape) # Create one dummy input image
output_img = np.random.rand(*image_shape) # Create one dummy output image
# Create a bogus 'training set' by copying the input/output images into lists many times
inputs = np.array([input_img]*500)
outputs = np.array([output_img]*500)
# Create the model and fit it to the dummy data
model = create_model(image_shape)
model.summary()
model.compile(loss = "mean_squared_error", optimizer = "adam", metrics = ["accuracy"])
model.fit(inputs, outputs)

Keras: Understanding the role of Embedding layer in a Conditional GAN

I am working to understand Erik Linder-Norén's implementation of the Categorical GAN model, and am confused by the generator in that model:
def build_generator(self):
model = Sequential()
# ...some lines removed...
model.add(Dense(np.prod(self.img_shape), activation='tanh'))
model.add(Reshape(self.img_shape))
model.summary()
noise = Input(shape=(self.latent_dim,))
label = Input(shape=(1,), dtype='int32')
label_embedding = Flatten()(Embedding(self.num_classes, self.latent_dim)(label))
model_input = multiply([noise, label_embedding])
img = model(model_input)
return Model([noise, label], img)
My question is: How does the Embedding() layer work here?
I know that noise is a vector that has length 100, and label is an integer, but I don't understand what the label_embedding object contains or how it functions here.
I tried printing the shape of label_embedding to try and figure out what's going on in that Embedding() line but that returns (?,?).
If anyone could help me understand how the Embedding() lines here work, I'd be very grateful for their assistance!
To keep in mind why use embedding here at all: the alternative is to concatenate the noise with the conditioned class, which may cause the generator to completely ignore the noise values, generating data with high similarity in each class (or even just 1 per class).
From the documentation, https://keras.io/layers/embeddings/#embedding,
Turns positive integers (indexes) into dense vectors of fixed size.
eg. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]
In the GAN model, the input integer(0-9) is converted to a vector of shape 100. With this short code snippet, we can feed some test input to check the output shape of the Embedding layer.
from keras.layers import Input, Embedding
from keras.models import Model
import numpy as np
latent_dim = 100
num_classes = 10
label = Input(shape=(1,), dtype='int32')
label_embedding = Embedding(num_classes, latent_dim)(label)
mod = Model(label, label_embedding)
test_input = np.zeros((1))
print(f'output shape is {mod.predict(test_input).shape}')
mod.summary()
output shape is (1, 1, 100)
From model summary, output shape for embedding layer is (1,100) which is the same as output of predict.
embedding_1 (Embedding) (None, 1, 100) 1000
One additional point, in the output shape (1,1,100), the leftmost 1 is the batch size, the middle 1 is the input length. In this case, we provided an input of length 1.
The embedding stores the per label state. If I read the code correctly, each label corresponds to a digit; i.e. there is an embedding that captures how to generate a 0, 1, ... 9.
This code takes some random noise and multiplies it to this per label state. The result should be a vector that leads the generator to display the digit corresponding to the label (i.e. 0..9).

How to setup input shape for 1dCNN+LSTM network (Keras)?

I have the following idea to implement:
Input -> CNN-> LSTM -> Dense -> Output
The Input has 100 time steps, each step has a 64-dimensional feature vector
A Conv1D layer will extract features at each time step. The CNN layer contains 64 filters, each has length 16 taps. Then, a maxpooling layer will extract the single maximum value of each convolutional output, so a total of 64 features will be extracted at each time step.
Then, the output of the CNN layer will be fed into an LSTM layer with 64 neurons. Number of recurrence is the same as time step of input, which is 100 time steps. The LSTM layer should return a sequence of 64-dimensional output (the length of sequence == number of time steps == 100, so there should be 100*64=6400 numbers).
input = Input(shape=(100,64), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...
But this doesn't work. The second line reports "list index out of range" and I don't understand what's going on.
I'm new to Keras, so I appreciate sincerely if anyone could help me with it.
This picture explains how CNN should be applied to EACH TIME STEP
The problem is with your input. Your input is of shape (100, 64) in which the first dimension is the timesteps. So ignoring that, your input is of shape (64) to a Conv1D.
Now, refer to the Keras Conv1D documentation, which states that the input should be a 3D tensor (batch_size, steps, input_dim). Ignoring the batch_size, your input should be a 2D tensor (steps, input_dim).
So, you are providing 1D tensor input, where the expected size of the input is a 2D tensor. For example, if you are providing Natural Language input to the Conv1D in form of words, then there are 64 words in your sentence and supposing each word is encoded with a vector of length 50, your input should be (64, 50).
Also, make sure that you are feeding the right input to LSTM as given in the code below.
So, the correct code should be
embedding_size = 50 # Set this accordingingly
mfcc_input = Input(shape=(100, 64, embedding_size), dtype='float', name='mfcc_input')
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)
# Directly feeding CNN_out to LSTM will also raise Error, since the 3rd dimension is 1, you need to purge it as
CNN_out = Reshape((int(CNN_out.shape[1]), int(CNN_out.shape[3])))(CNN_out)
LSTM_out = LSTM(64,return_sequences=True)(CNN_out)
... (more code) ...

Categories

Resources