Can I use Convolutional layers of keras without gpu support? I am getting errors when I use it on Colab with runtime as None.
My code looks like this:
model = tf.keras.Sequential()
model.add(layers.Conv1D(1,5, name='conv1', padding="same", activation='relu',data_format="channels_first", input_shape=(1,2048)))
# model.add(layers.LSTM(5, activation='tanh'))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(num_classes, activation='softmax'))
#model.summary()
model.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
x_train = train_value
y_train = train_label
x_test = test_value
y_test = test_label
print(np.shape(x_train)) #shape of x train is (4459, 1, 2048)
print(np.shape(x_test)) #shape of test is (1340,1,2048)
history = model.fit(x_train, y_train,
batch_size=100,
epochs=30,
verbose=1,
validation_data=(x_test, y_test)
)
It is running fine on GPU but gives following error on CPU:
InvalidArgumentError: Conv2DCustomBackpropFilterOp only supports NHWC.
[[{{node
training/SGD/gradients/gradients/conv1/conv1d_grad/Conv2DBackpropFilter}}]]
UnimplementedError: The Conv2D op currently only supports the NHWC
tensor format on the CPU. The op was given the format: NCHW [[{{node
conv1_1/conv1d}}]]
I have figured out that the problem is with the format of Input Data. My input data are vectors of size (1,2048). Can you please guide me on how to convert these vectors to NHWC format?
I would really appreciate it, if someone can clear this up for me.
Thanks in advance.
Per the Keras documentation
data_format: A string, one of "channels_last" (default) or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, steps, channels) (default format for temporal data in Keras) while "channels_first" corresponds to inputs with shape (batch, channels, steps)
Now Keras in TensorFlow appears to implement Conv1D in terms of a Conv2D operator - basically forming an "image" with 1 row, W columns, and then your C "channels". That's why your getting error messages about image shapes when you don't have image data.
In the docs above "channels" are the number of data items per time step (e.g. perhaps you have 5 sensor readings at each time step so you'd have 5 channels). From your answers above I believe you're passing tensors with shape (n, 1, 2048) where n is your batch size. So, with channels_last TensorFlow thinks that means you have n examples in your batch each with a sequence length of 1 and 2048 data items per time step - that is only a single time step with 2048 data items per observation (e.g. 2048 sensor readings taken at each time step) in which case the convolution won't be doing a convolution - it'd be equivalent to a single dense layer taking all 2048 numbers as input.
I think in reality you have only a single data item per time step and you have 2048 time steps. That explains why passing channels_first improves your accuracy - now TensorFlow understand that your data represents 1 data item samples 2048 times and it can do a convolution over that data.
To fix you can just tf.reshape(t, (1, 2048, 1)) - and remove the channels_first (that code assumes you're doing batches of size 1 and your tensor is named t). Now it's in the format (n, s, 1) where n is the batch size (1 here), s is the number of time steps (2048), and 1 indicates one data point per time step. You can now run the same model on the GPU or CPU.
Related
I have timeseries data of electricity consumption per hour with length (17544, 1) in the following format:
[[17.6]
[38.2]
[39.4]
...
[46. ]
[44. ]
[40.2]]
My goal is to use as input the last 7 days of data, namely 24*7=168 and predict the next 24 hours of electricity consumption.
I am using the following script to prepare the dataset for training and testing:
# Split into training/test sets
train_size = int(len(data) * 0.7)
val_size = int(len(data) * 0.2)
train, val, test = data[:train_size], data[train_size:(train_size + val_size)], data[(train_size + val_size):]
# Prepare the data in a format required for LSTM (samples, timesteps, features)
def Create_Dataset(df, lookback=1, prediction_horizon=1):
X, Y = [], []
for i in range(lookback, len(df)-lookback):
X.append(df[i-lookback : i, 0])
Y.append(df[i : i + prediction_horizon, 0])
return np.array(X), np.array(Y)
lookback = 7 * 24
prediction_horizon = 24
X_train, Y_train = Create_Dataset(train, lookback, prediction_horizon)
X_val, Y_val = Create_Dataset(val, lookback, prediction_horizon)
X_test, Y_test = Create_Dataset(test, lookback, prediction_horizon)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
X_val = np.reshape(X_val, (X_val.shape[0], X_val.shape[1], 1))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
The model is of the following form:
model = Sequential()
model.add(LSTM(64, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(prediction_horizon))
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_absolute_error'])
I have trained the model successfully and I need to validate it with external data. I expect that by giving the following input of length (168, 1):
[[17.6]
[38.2]
[39.4]
...
[46.9]
[48.6]
[46.1]]
I will get an output of 24 points of the predictions, but instead I am getting an output of shape (168,24) and the following warning:
WARNING:tensorflow:Model was constructed with shape (None, 168, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, 168, 1), dtype=tf.float32, name='lstm_3_input'), name='lstm_3_input', description="created by layer 'lstm_3_input'"), but it was called on an input with incompatible shape (None, 1, 1).
Any idea of what is wrong here?
In Keras LSTMs take a 3D input with shape [batch, timesteps, feature]. In Keras batch is generally shown as None since it can vary (you can see this in the warning). Your timesteps is 168 and feature is 1 since the your only feature is a value. I think the problem is you give an input of (168,1) which doesn't have a 3D shape. My guess is you probably reshape it somewhere and it becomes (168,1,1). Otherwise feeding 2D shape should give an error instead of a warning. Then instead it becomes as if you have a batch of 168 instead of 1.
Thats why it is saying your input is (None,1,1). So you just have to reshape the input into (1,168,1).
To summarize the problem your input must be a 3D tensor with a shape of [batch, timesteps, feature]. If you have a single sample with a 2D shape you just reshape it into (1,168,1).
Lastly one thing I want to mention is that you are predicting the next 24 hours all at once (In the dense layer). Generally we predict one step at a time. Predicting more than one value is a bit against the purpose of LSTMs. Generally after one prediction we feed the output of LSTM into itself as if it is a correct value.
Unfortunately I am not sure how it can be implemented in Keras but I will add a pseudo code to describe the general idea.
out, state = lstm(input.reshape(1,168,1))
for i in range(24):
out,state = lstm(out,state)
Here you give the output of the LSTM to itself in the next timestep. Also you have to give it the state. LSTMs have a state which is passed and altered in each timestep.(LSTMs use this to pass information through time). Each out tensor in the loop will correspond to predictions for each hour. You can check Andrej Karpathy's famous blog post to get a better idea. In that blog a similar idea is used to build network that generates a script based on a single letter. It is the Character-Level Language Models part. The idea is pretty the same.
I have this data
X_regression = tf.range(0, 1000, 5)
y_regression = X + 100
X_reg_train, X_reg_test = X_regression[:150], X_regression[150:]
y_reg_train, y_reg_test = y_regression[:150], y_regression[150:]
I inspect the data input data
X_reg_train[0], X_reg_train[0].shape, X_reg_train[0].ndim
and it returns:
(<tf.Tensor: shape=(), dtype=int32, numpy=0>, TensorShape([]), 0)
I build a model:
# Set the random seed
tf.random.set_seed(42)
# Create the model
model_reg = tf.keras.models.Sequential()
# Add Input layer
model_reg.add(tf.keras.layers.InputLayer(input_shape=[1]))
# Add Hidden layers
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Add last layer
model_reg.add(tf.keras.layers.Dense(units=1))
# Compile the model
model_reg.compile(optimizer=tf.keras.optimizers.Adam(),
loss=tf.keras.losses.mae,
metrics=[tf.keras.metrics.mae])
# Fit the model
model_reg.fit(X_reg_train, y_reg_train, epochs=10)
The model works.
However, I am confused about input_shape
Why is it [1] in this situation? Why is it sometimes a tuple?
Would appreciate an explanation of different formats of input_shape in different situations.
InputLayer is actually just the same as specifying the parameter input_shape in a Dense layer. Keras actually uses InputLayer when you use method 2 in the background.
# Method 1
model_reg.add(tf.keras.layers.InputLayer(input_shape=(1,)))
model_reg.add(tf.keras.layers.Dense(units=10, activation=tf.keras.activations.relu))
# Method 2
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(1,), activation=tf.keras.activations.relu))
The parameter input_shape is actually supposed to be a tuple, if you noticed that I set the input_shape in your example to be (1,) this is a tuple with a single element in it. As your data is 1D, you pass in a single element at a time therefore the input shape is (1,).
If your input data was a 2D input for example when trying to predict the price of a house based on multiple variables, you would have multiple rows and multiple columns of data. In this case, you pass in the input shape of the last dimension of the X_reg_train which is the number of inputs. If X_reg_train was (1000,10) then we use the input_shape of (10,).
model_reg.add(tf.keras.layers.Dense(units=10, input_shape=(X_reg_train.shape[1],), activation=tf.keras.activations.relu))
Ignoring the batch_size for a moment, with this we are actually just sending a single row of the data to predict a single house price. The batch_size is just here to chunk multiple rows of data together so that we do not have to load the entire dataset into memory which is computationally expensive, so we send small chunks, with the default value being 32. When running the training you would have noticed that under each epoch it says 5/5 which are for the 5 batches of data you have, since the training size is 150, 150 / 32 = 5(rounded up).
For 3D input with the Dense layer it actually just gets flattened to a 2D input, i.e. from (batch_size, sequence_length, dim) -> (batch_size * sequence_length, dim) -> (batch_size, sequence_length, hidden_units) which is the same as using a Conv1D layer with a kernel of 1. So I wouldn't even use the Dense layer in this case.
In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.
Example: if you have 30 images of 50x50 pixels in RGB (3 channels), the shape of your input data is (30,50,50,3). Then your input layer tensor, must have this shape (see details in the "shapes in keras" section).
Each type of layer requires the input with a certain number of dimensions:
Dense layers require inputs as (batch_size, input_size) or (batch_size, optional,...,optional, input_size) or in your case just (input_size)
2D convolutional layers need inputs as:
if using channels_last: (batch_size, imageside1, imageside2, channels)
if using channels_first: (batch_size, channels, imageside1, imageside2)
1D convolutions and recurrent layers use (batch_size, sequence_length, features)
Here are some helpful links : Keras input explanation: input_shape, units, batch_size, dim, etc https://keras.io/api/layers/core_layers/input/
Hello people of the internet,
I am quite new and I have made a simple tensorflow model:
model = tf.keras.models.Sequential()
model.add(Input(batch_shape=(batch_size = 100, 50, 100, 12)))
model.add(Conv2D(filters= 1, kernel_size = (1,1))
model.add(Reshape((50, 100)))
model.add(LSTM(units=16))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer = 'Adam')
I have checked the model summary and the dimensions seem to be sane.
However, the problem occurs when I finish running the model for 1 epoch. The reason is that I have less than 100 samples in the last batch as my total sample size is not a multiple of 100. As such the Reshape layer gets the following error:
InvalidArgumentError: Input to reshape is a tensor with 460000 values, but the requested shape has 500000
[[node sequential_18/reshape_9/Reshape (defined at <ipython-input-48-fa829a07c08d>:26) ]] [Op:__inference_distributed_function_167420]
I have technically managed to fix the error by changing the batch_size to a factor of the total sample size. However, I am still wondering if there is a more robust solution that could allow the Reshape layer to take in variable batch sizes.
Many thanks for your help!!
Oh never mind, I realise if I don't specify the batch size in input, it is fine. So instead of batch_shape, use shape in the input layer!
I am trying to create human action recognition model. But when I try to add TimeDistributed feature, I have an input_shape problem. How can I convert input_shape from 4d to 5d?
I want to train my dataset with each 10 images, in order to understand actions.
Dataset size= (28000,90,90,1)
#define CNN model
cnn = Sequential()
cnn.add(Conv2D(filters=32,kernel_size=
(5,5),padding="Same",activation="relu",input_shape=(90,90,1)))
cnn.add(MaxPooling2D(pool_size=(2,2)))
cnn.add(Dropout(0.25))
cnn.add(Conv2D(filters=16,kernel_size=(5,5),padding="Same",activation="relu"))
cnn.add(MaxPooling2D(pool_size=(2,2)))
cnn.add(Dropout(0.25))
cnn.add(Conv2D(filters=32,kernel_size=(5,5),padding="Same",activation="relu"))
cnn.add(MaxPooling2D(pool_size=(2,2)))
cnn.add(Dropout(0.25))
cnn.add(Flatten())
cnn.add(Dense(4096, activation="relu"))
#define LSTM model
model = Sequential()
model.add(TimeDistributed(cnn,input_shape=(10,90,90,1)))
model.add(LSTM(10))
model.add(Dense(2, activation="softmax"))
verbose, epochs, batch_size = 0, 25, 64
optimizer=Adam(lr=0.001,beta_1=0.9,beta_2=0.999)
model.compile(optimizer=optimizer,loss="binary_crossentropy",metrics=["accuracy"])
model.fit(x_train, y_train,validation_data=(x_val,y_val), epochs=epochs, batch_size=batch_size)
Here the error:
ValueError: Error when checking input: expected time_distributed_8_input to have 5 dimensions, but got array with shape (28000, 90, 90)
I had the same issue. I'm using Tensorflow.keras from TensorFlow 2.0 alpha. My input data was formed in the following way:
(list, list, numpy.ndarray, numpy.ndarray, numpy.ndarray) corresponding to number of records in batch, number of timesteps, img_width, img_height, channels.
Turns out the Tensorflow input-shape-validation code actually misses the case where the input shape of a given record is built from a list containing numpy arrays, and the list dimension is stripped off. It does handle almost every other formation of the data.
I modified the Tensorflow library code locally to fix it and have reported the fix (https://github.com/tensorflow/tensorflow/issues/28323, which I hope to fix and submit to TF this week).
That said - I think if you change your input data set into a form consisting of (list, numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray) it may solve the issue you're having.
I am trying to perform the usual classification on the MNIST database but with randomly cropped digits.
Images are cropped the following way : removed randomly first/last and/or row/column.
I would like to use a Convolutional Neural Network using Keras (and Tensorflow backend) to perform convolution and then the usual classification.
Inputs are of variable size and i can't manage to get it to work.
Here is how I cropped digits
import numpy as np
from keras.utils import to_categorical
from sklearn.datasets import load_digits
digits = load_digits()
X = digits.images
X = np.expand_dims(X, axis=3)
X_crop = list()
for index in range(len(X)):
X_crop.append(X[index, np.random.randint(0,2):np.random.randint(7,9), np.random.randint(0,2):np.random.randint(7,9), :])
X_crop = np.array(X_crop)
y = to_categorical(digits.target)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_crop, y, train_size=0.8, test_size=0.2)
And here is the architecture of the model I want to use
from keras.layers import Dense, Dropout
from keras.layers.convolutional import Conv2D
from keras.models import Sequential
model = Sequential()
model.add(Conv2D(filters=10,
kernel_size=(3,3),
input_shape=(None, None, 1),
data_format='channels_last'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train, epochs=100, batch_size=16, validation_data=(X_test, y_test))
Does someone have an idea on how to handle variable sized input in my neural network?
And how to perform classification?
TL/DR - go to point 4
So - before we get to the point - let's fix some problems with your network:
Your network will not work because of activation: with categorical_crossentropy you need to have a softmax activation:
model.add(Dense(10, activation='softmax'))
Vectorize spatial tensors: as Daniel mentioned - you need to, at some stage, switch your vectors from spatial (images) to vectorized (vectors). Currently - applying Dense to output from a Conv2D is equivalent to (1, 1) convolution. So basically - output from your network is spatial - not vectorized what causes dimensionality mismatch (you can check that by running your network or checking the model.summary(). In order to change that you need to use either GlobalMaxPooling2D or GlobalAveragePooling2D. E.g.:
model.add(Conv2D(filters=10,
kernel_size=(3, 3),
input_shape=(None, None, 1),
padding="same",
data_format='channels_last'))
model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
Concatenated numpy arrays need to have the same shape: if you check the shape of X_crop you'll see that it's not a spatial matrix. It's because you concatenated matrices with different shapes. Sadly - it's impossible to overcome this issue as numpy.array need to have a fixed shape.
How to make your network train on examples of different shape: The most important thing in doing this is to understand two things. First - is that in a single batch every image should have the same size. Second - is that calling fit multiple times is a bad idea - as you reset inner model states. So here is what needs to be done:
a. Write a function which crops a single batch - e.g. a get_cropped_batches_generator which given a matrix cuts a batch out of it and crops it randomly.
b. Use train_on_batch method. Here is an example code:
from six import next
batches_generator = get_cropped_batches_generator(X, batch_size=16)
losses = list()
for epoch_nb in range(nb_of_epochs):
epoch_losses = list()
for batch_nb in range(nb_of_batches):
# cropped_x has a different shape for different batches (in general)
cropped_x, cropped_y = next(batches_generator)
current_loss = model.train_on_batch(cropped_x, cropped_y)
epoch_losses.append(current_loss)
losses.append(epoch_losses.sum() / (1.0 * len(epoch_losses))
final_loss = losses.sum() / (1.0 * len(losses))
So - a few comments to code above: First, train_on_batch doesn't use nice keras progress bar. It returns a single loss value (for a given batch) - that's why I added logic to compute loss. You could use Progbar callback for that also. Second - you need to implement get_cropped_batches_generator - I haven't written a code to keep my answer a little bit more clear. You could ask another question on how to implement it. Last thing - I use six to keep compatibility between Python 2 and Python 3.
Usually, a model containing Dense layers cannot have variable size inputs, unless the outputs are also variable. But see the workaround and also the other answer using GlobalMaxPooling2D - The workaround is equivalent to GlobalAveragePooling2D. These are layers that can eliminiate the variable size before a Dense layer and suppress the spatial dimensions.
For an image classification case, you may want to resize the images outside the model.
When my images are in numpy format, I resize them like this:
from PIL import Image
im = Image.fromarray(imgNumpy)
im = im.resize(newSize,Image.LANCZOS) #you can use options other than LANCZOS as well
imgNumpy = np.asarray(im)
Why?
A convolutional layer has its weights as filters. There is a static filter size, and the same filter is applied to the image over and over.
But a dense layer has its weights based on the input. If there is 1 input, there is a set of weights. If there are 2 inputs, you've got twice as much weights. But weights must be trained, and changing the amount of weights will definitely change the result of the model.
As #Marcin commented, what I've said is true when your input shape for Dense layers has two dimensions: (batchSize,inputFeatures).
But actually keras dense layers can accept inputs with more dimensions. These additional dimensions (which come out of the convolutional layers) can vary in size. But this would make the output of these dense layers also variable in size.
Nonetheless, at the end you will need a fixed size for classification: 10 classes and that's it. For reducing the dimensions, people often use Flatten layers, and the error will appear here.
A possible fishy workaround (not tested):
At the end of the convolutional part of the model, use a lambda layer to condense all the values in a fixed size tensor, probably taking a mean of the side dimensions and keeping the channels (channels are not variable)
Suppose the last convolutional layer is:
model.add(Conv2D(filters,kernel_size,...))
#so its output shape is (None,None,None,filters) = (batchSize,side1,side2,filters)
Let's add a lambda layer to condense the spatial dimensions and keep only the filters dimension:
import keras.backend as K
def collapseSides(x):
axis=1 #if you're using the channels_last format (default)
axis=-1 #if you're using the channels_first format
#x has shape (batchSize, side1, side2, filters)
step1 = K.mean(x,axis=axis) #mean of side1
return K.mean(step1,axis=axis) #mean of side2
#this will result in a tensor shape of (batchSize,filters)
Since the amount of filters is fixed (you have kicked out the None dimensions), the dense layers should probably work:
model.add(Lambda(collapseSides,output_shape=(filters,)))
model.add(Dense.......)
.....
In order for this to possibly work, I suggest that the number of filters in the last convolutional layer be at least 10.
With this, you can make input_shape=(None,None,1)
If you're doing this, remember that you can only pass input data with a fixed size per batch. So you have to separate your entire data in smaller batches, each batch having images all of the same size. See here: Keras misinterprets training data shape