I am attempting to train a neural network on the EMNIST dataset, but when I attempt to flatten my image, it throws the following error:
WARNING:tensorflow:Model was constructed with shape (None, 28, 28) for input Tensor("flatten_input:0", shape=(None, 28, 28), dtype=float32), but it was called on an input with incompatible shape (None, 1, 28, 28).
I can't figure out what the problem is. I have tried changing my preprocessing and removing the batch size from my model.fit and my ds.map.
Here is the full code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
from tensorflow import keras
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
def preprocess(dict):
    image = dict['image']
    image = tf.transpose(image)
    label = dict['label']
    return image, label
train_data, validation_data = tfds.load('emnist/letters', split = ['train', 'test'])
train_data_gen = train_data.map(preprocess).shuffle(1000).batch(32)
validation_data_gen = validation_data.map(preprocess).batch(32)
print(train_data_gen)
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape = (28, 28)),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation = 'softmax')
])
model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])
early_stopping = keras.callbacks.EarlyStopping(monitor = 'val_accuracy', patience = 10)
history = model.fit(train_data_gen, epochs = 50, batch_size = 32, validation_data = validation_data_gen, callbacks = [early_stopping], verbose = 1)
model.save('emnistmodel.h5')
There are actually a few things going on here, so let's address them one at a time.
Input shape
So to address your immediate question, you're receiving an incompatible shape error because, well, the shape of the input doesn't match the expected shape.
In this line, tf.keras.layers.Flatten(input_shape=(28, 28)), we are telling the model to expect inputs of shape (28, 28), but this isn't accurate. Our inputs actually have shape (28, 28, 1), because we are taking a 28x28 pixel image with 1 channel (as opposed to a colour image, which would have 3 channels: r, g, and b). So to solve this immediate problem, we simply update the model to use the actual shape of the input, i.e. tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
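If you want to double-check this yourself, one quick way (my own sketch, reusing the same tfds.load call as the question) is to print the dataset's element spec:

import tensorflow_datasets as tfds

# Peek at the dataset's element structure; the 'image' feature should report
# shape (28, 28, 1): height, width, and a single greyscale channel.
train_data, validation_data = tfds.load('emnist/letters', split=['train', 'test'])
print(train_data.element_spec['image'])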
Number of output nodes
As Rishabh suggested in his answer, the full EMNIST dataset has more than 10 balanced classes. However, in your case you appear to be using EMNIST Letters, which has 26 balanced classes. So your neural net should correspondingly have 27 output nodes (since the class labels run from 1..26 while our output nodes correspond to 0..26) to be able to classify the given data. Of course, giving it extra output nodes would still let it run, but those add unnecessary weights to train and increase the training time needed for our model. In short, your final layer should be tf.keras.layers.Dense(27, activation='softmax')
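To make the 27-vs-26 point concrete: with sparse_categorical_crossentropy, each label value indexes directly into the output vector, so a label of 26 requires at least 27 output units (indices 0..26). A tiny standalone illustration (made-up prediction values, not taken from the dataset):

import tensorflow as tf

# A single made-up prediction over 27 classes and a label of 26 (the highest EMNIST Letters label).
# With only 26 output units, label 26 would be out of range for this loss.
y_true = tf.constant([26])
y_pred = tf.random.uniform((1, 27))
y_pred = y_pred / tf.reduce_sum(y_pred, axis=-1, keepdims=True)  # normalise to a probability vector
print(tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred))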
Preprocessing TensorFlow Datasets
Reading your preprocess() function, I believe you're trying to convert the training and validation datasets into tuples of (image, label). Instead of creating our own function, TensorFlow conveniently implements this for us through the parameter as_supervised.
Additionally, I see some extra preprocessing that you're trying to achieve, such as batching and shuffling the data. Again, TensorFlow implements batch_size and shuffle_files (see common arguments) for us! So loading the dataset would look something like
train_data, validation_data = tfds.load('emnist/letters',
                                        split=['train', 'test'],
                                        shuffle_files=True,
                                        batch_size=32,
                                        as_supervised=True)
Some additional notes
Also, as a suggestion, consider excluding batch_size from model.fit(). Defining the same thing at two different places is a recipe for bugs and unexpected behaviours. Moreover, when using TensorFlow Datasets, it's not necessary because they already generate batches.
Overall your updated program should look something like this
import matplotlib.pyplot as plt
import tensorflow_datasets as tfds
from tensorflow import keras
import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
train_data, validation_data = tfds.load('emnist/letters',
                                        split=['train', 'test'],
                                        shuffle_files=True,
                                        batch_size=32,
                                        as_supervised=True)
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(27, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
early_stopping = keras.callbacks.EarlyStopping(
    monitor='val_accuracy', patience=10)
history = model.fit(train_data,
                    epochs=50,
                    validation_data=validation_data,
                    callbacks=[early_stopping],
                    verbose=1)
model.save('emnistmodel.h5')
Hope this helps!
Hi @Rattandeep, I just checked the EMNIST dataset. It has 47 different classes, and in your dense layer you have specified 10.
If you change your code from
tf.keras.layers.Dense(10, activation = 'softmax')
To this one, it will work
tf.keras.layers.Dense(47, activation = 'softmax')
Thanks
Related
So here is my code:
import tensorflow as tf
from tensorflow import keras
from scipy.io import loadmat
# Load the training data and test data
trainingData = loadmat("C:\\Users\\alexb\\vs code python\\train_32x32.mat")
testData = loadmat("C:\\Users\\alexb\\vs code python\\test_32x32.mat")
# Normalize the inputs
trainingData['X'] = trainingData['X'] / 255.0
testData['X'] = testData['X'] / 255.0
# Build the model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32)),   # input layer (1)
    keras.layers.Dense(128, activation='relu'),   # hidden layer (2)
    keras.layers.Dense(10, activation='softmax')  # output layer (3)
])
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train the model
model.fit(trainingData['X'], trainingData['y'], epochs=5)
# Evaluate the model
test_loss, test_acc = model.evaluate(testData['X'], testData['y'], verbose=2)
print('\nTest accuracy:', test_acc)
The error is:
ValueError: Data cardinality is ambiguous:
x sizes: 32
y sizes: 73257
The problem here is that I use the cropped version of the SVHN dataset (http://ufldl.stanford.edu/housenumbers/). testData['X'] is a 4-dimensional array holding a collection of 32 by 32 images, and for some reason the program reads it along the axis of length 32 instead of treating each whole image as one sample.
I tried to change the order of the data in the matrix, but this is not a really clean process and it messes things up in other ways.
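If it helps: the cropped SVHN .mat files store the images as a (32, 32, 3, N) array with the sample axis last, which is consistent with the cardinality error above (Keras sees 32 as the number of samples). A minimal sketch of moving the sample axis to the front with NumPy, assuming that layout (path shortened here for illustration):

import numpy as np
from scipy.io import loadmat

data = loadmat("train_32x32.mat")           # illustrative path; use your own file location
X = np.moveaxis(data['X'], -1, 0) / 255.0   # (32, 32, 3, N) -> (N, 32, 32, 3), then normalise
y = data['y'].flatten() % 10                # SVHN stores digit 0 as label 10; map labels to 0..9
print(X.shape, y.shape)                     # e.g. (73257, 32, 32, 3) (73257,)

Note that with 3-channel images, the Flatten layer would also need input_shape=(32, 32, 3) rather than (32, 32).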
I am used to working in PyTorch but now have to learn Tensorflow for my job. I am trying to get up to speed by creating a simple dense network and training it on the MNIST dataset, but I cannot get it to train. My super simple code:
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
# Load mnist data from keras
(train_data, train_label), (test_data, test_label) = tf.keras.datasets.mnist.load_data(path="mnist.npz")
train_label, test_label = to_categorical(train_label), to_categorical(test_label)
train_data, train_label, test_data, test_label = Flatten()(train_data), Flatten()(train_label), Flatten()(test_data), Flatten()(test_label)
# Create generic SGD optimizer (no learning schedule)
optimizer = SGD(learning_rate = 0.01)
# Define function to build and compile model
def build_mnist_model(input_shape, batch_size = 30):
    input_img = Input(shape = input_shape, batch_size = batch_size)
    # Pass through dense layer
    x = Dense(200, activation = 'relu', use_bias = True)(input_img)
    x = Dense(400, activation = 'relu', use_bias = True)(x)
    scores = Dense(10, activation = 'softmax', use_bias = True)(x)
    # Create and compile tf model
    mnist_model = Model(input_img, scores)
    mnist_model.compile(optimizer = optimizer, loss = 'categorical_crossentropy')
    return mnist_model
# Build the model
mnist_model = build_mnist_model(train_data[0].shape)
# Train the model
mnist_model.fit(
    x = train_data,
    y = train_label,
    batch_size = 30,
    epochs = 20,
    verbose = 2,
    shuffle = True,
    # steps_per_epoch = 200
)
When I run this I get
ValueError: When using data tensors as input to a model, you should specify the `steps_per_epoch` argument.
This does not really make sense to me because my train_data and train_label are just regular tensors and per the Tensorflow documentation in this case it should default to the number of samples in the dataset divided by the batch size (which would be 200 in my case).
At any rate, I tried specifying steps_per_epoch = 200 when I call mnist_model.fit() but then I get a different error:
InvalidArgumentError: Incompatible shapes: [60000,10] vs. [30,1]
[[{{node training_4/SGD/gradients/gradients/loss_5/dense_17_loss/softmax_cross_entropy_with_logits_grad/mul}}]]
I can't seem to discern where a size mismatch would come from. In PyTorch, I am used to manually creating batches (by subindexing my data and label tensors) but in Tensorflow this seems to happen automatically. As such, this leaves me quite confused about what batch has the wrong size, how it got the wrong size, etc. I hope this simple model is way easier than I am making it and I just do not know the Tensorflow tricks yet.
Thanks for the help.
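For what it's worth, here is a minimal sketch of the usual pattern for this kind of model, where the images are flattened as plain NumPy arrays before fit() and Keras creates the batches itself (a generic sketch using the same sizes as the question, not a diagnosis of the exact error above):

import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical

(train_data, train_label), _ = tf.keras.datasets.mnist.load_data(path="mnist.npz")
train_data = train_data.reshape(-1, 784).astype("float32") / 255.0   # flatten the 28x28 images in NumPy
train_label = to_categorical(train_label)                            # one-hot labels for categorical_crossentropy

inputs = Input(shape=(784,))   # no fixed batch_size; fit() splits the data into batches
x = Dense(200, activation='relu')(inputs)
x = Dense(400, activation='relu')(x)
scores = Dense(10, activation='softmax')(x)

mnist_model = Model(inputs, scores)
mnist_model.compile(optimizer=SGD(learning_rate=0.01), loss='categorical_crossentropy')
mnist_model.fit(train_data, train_label, batch_size=30, epochs=20, verbose=2, shuffle=True)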
Okay, so I'm pretty new to deep learning and have a very basic doubt. I have input data in an array containing 255 data points (array shape (255,)) in epochs_data, and their corresponding labels in new_labels (array shape (255,)).
I split the data using the following code:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(epochs_data, new_labels, test_size = 0.2, random_state=30)
I'm using a sequential model:
from keras.models import Sequential
from keras import layers
from keras.layers import Dense, Activation, Flatten
model = Sequential()
I know how to code for the hidden layers and output layer:
model.add(Dense(500, activation='relu')) #Hidden Layer
model.add(Dense(2, activation='softmax')) #Output Layer
But I don't know how to code the layer for the input with the input_shape specified. The X_train is the input. It's an array of shape (180,). Also tell me how to code the model.fit() for the same. Any help is appreciated.
You have to add this line before the hidden layer. You can use whatever activation function you want. Note that this line represents both the input layer and the 1st hidden layer (you have to choose the number of neurons; I put 100):
model.add(Dense(100, input_shape = (X_train.shape[1],)))
EDIT:
Before fitting your model you have to configure your model with this line:
model.compile(loss = 'mse', optimizer = 'Adam', metrics = ['mse'])
So you have to choose a loss and metric, which in this case is Mean Squared Error, and an optimizer like Adam, Adamax, etc.
Then you can fit your model, choosing the data (X, y), the number of epochs, the validation split and the batch size.
history = model.fit(X_train, y_train, epochs = 200,
                    validation_split = 0.1, batch_size=250)
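Putting the answer's input line together with the layers from the question, a minimal end-to-end sketch could look like this. It assumes X_train is 2-D (samples, features), and I've used sparse_categorical_crossentropy with an accuracy metric here instead of the 'mse' above, since the output is a 2-class softmax over integer labels:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Stand-in data with the 2-D (samples, features) layout that input_shape=(X_train.shape[1],) assumes;
# in the thread these would be the X_train / y_train produced by train_test_split above.
X_train = np.random.rand(204, 8)
y_train = np.random.randint(0, 2, size=(204,))

model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(X_train.shape[1],)))  # input + 1st hidden layer
model.add(Dense(500, activation='relu'))                                   # hidden layer from the question
model.add(Dense(2, activation='softmax'))                                  # output layer from the question
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=200, validation_split=0.1, batch_size=250)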
I'm using Keras framework to build a stacked LSTM model as follows:
model.add(layers.LSTM(units=32,
                      batch_input_shape=(1, 100, 64),
                      stateful=True,
                      return_sequences=True))
model.add(layers.LSTM(units=32, stateful=True, return_sequences=True))
model.add(layers.LSTM(units=32, stateful=True, return_sequences=False))
model.add(layers.Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(train_dataset,
          train_labels,
          epochs=1,
          validation_split = 0.2,
          verbose=1,
          batch_size=1,
          shuffle=False)
Knowing that the default batch_size for model.fit, model.predict and model.evaluate is 32, the model forces me to change this default batch_size to the same batch_size value used in batch_input_shape (batch_size, time_steps, input_dims).
My questions are:
1. What is the difference between passing the batch_size into batch_input_shape versus into model.fit?
2. Could I train with a batch_size of, let's say, 10, and then evaluate on a single batch (rather than 10 batches) if I pass the batch_size into the structure of the LSTM layer through batch_input_shape?
When the LSTM layer is in stateful mode, the batch size must be given and cannot be None.
This is because a stateful LSTM needs to know how to carry the hidden states over from the batch at timestep t-1 to the batch at timestep t.
When you create a Sequential() model it is defined to support any batch size. In particular, in TensorFlow 1.* the input is a placeholder that has None as the first dimension:
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.inputs[0].get_shape().as_list()) # [None, 2] <-- supports any batch size
print(model.inputs[0].op.type == 'Placeholder') # True
If you use tf.keras.InputLayer() you can define a fixed batch size like this:
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=50)) # <-- same as using batch_input_shape
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.inputs[0].get_shape().as_list()) # [50, 2] <-- supports only batch_size==50
print(model.inputs[0].op.type == 'Placeholder') # True
The batch size of the model.fit() method is used to split your data into batches. For example, if you use InputLayer() and define a fixed batch size while providing a different batch size value to the model.fit() method, you will get a ValueError:
import tensorflow as tf
import numpy as np
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=2)) # <--batch_size==2
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
model.compile(optimizer=tf.keras.optimizers.Adam(),
loss='categorical_crossentropy')
x_train = np.random.normal(size=(10, 2))
y_train = np.array([[0, 1] for _ in range(10)])
model.fit(x_train, y_train, batch_size=3) # <--batch_size==3
This will raise:
ValueError: The batch_size argument value 3 is incompatible with the specified batch size of your Input Layer: 2
To summarize: if you define the batch size as None, you can pass any number of samples for training or evaluation, even all samples at once without splitting them into batches (if the data is too big you will get an out-of-memory error). If you define a fixed batch size, you will have to use that same fixed batch size for training and evaluation.
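As a small illustration of the None case (my own example, following the same pattern as the snippets above), with no fixed batch size you can even pass the whole training set as a single batch:

import tensorflow as tf
import numpy as np

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))  # batch dimension left as None
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy')

x_train = np.random.normal(size=(10, 2))
y_train = np.array([[0, 1] for _ in range(10)])
model.fit(x_train, y_train, batch_size=10)  # all 10 samples in one batch; any batch size works here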
I have a dataset of 3000 observations. Each observation consists of 3 timeseries of length 200 samples. As the output I have 5 class labels.
So I build train and test sets as follows:
test_split = round(num_samples * 3 / 4)
X_train = X_all[:test_split, :, :] # Start upto just before test_split
y_train = y_all[:test_split]
X_test = X_all[test_split:, :, :] # From test_split to end
y_test = y_all[test_split:]
# Print shapes and class labels
print(X_train.shape)
print(y_train.shape)
> (2250, 200, 3)
> (22250, 5)
I build my network using Keras functional API:
from keras.models import Model
from keras.layers import Dense, Activation, Input, Dropout, concatenate
from keras.layers.recurrent import LSTM
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.callbacks import EarlyStopping
series_len = 200
num_RNN_neurons = 64
ch1 = Input(shape=(series_len, 1), name='ch1')
ch2 = Input(shape=(series_len, 1), name='ch2')
ch3 = Input(shape=(series_len, 1), name='ch3')
ch1_layer = LSTM(num_RNN_neurons, return_sequences=False)(ch1)
ch2_layer = LSTM(num_RNN_neurons, return_sequences=False)(ch2)
ch3_layer = LSTM(num_RNN_neurons, return_sequences=False)(ch3)
visible = concatenate([
    ch1_layer,
    ch2_layer,
    ch3_layer])
hidden1 = Dense(30, activation='linear', name='weighted_average_channels')(visible)
output = Dense(num_classes, activation='softmax')(hidden1)
model = Model(inputs= [ch1, ch2, ch3], outputs=output)
# Compile model
model.compile(loss='categorical_crossentropy', optimizer=SGD(), metrics=['accuracy'])
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-4, patience=5, verbose=1, mode='auto')
Then, I try to fit the model:
# Fit the model
model.fit(X_train, y_train,
          epochs=epochs,
          batch_size=batch_size,
          validation_data=(X_test, y_test),
          callbacks=[monitor],
          verbose=1)
and I get the following error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 3 array(s), but instead got the following list of 1 arrays...
How should I reshape my data, to solve the issue?
You magically assume that a single input X_train containing 3 time series will be split into 3 channels and assigned to the different inputs. Well, this doesn't happen, and that is what the error is complaining about. You have 1 input:
ch123_in = Input(shape=(series_len, 3), name='ch123')
latent = LSTM(num_RNN_neurons)(ch123_in)
hidden1 = Dense(30, activation='linear', name='weighted_average_channels')(latent)
By merging the series together into a single LSTM, the model might also pick up relations across the time series. Now your target shape has to be y_train.shape == (2250, 5); the first dimension must match X_train.shape[0].
Another point: you have a Dense layer with linear activation, which is almost useless as it doesn't provide any non-linearity. You might want to use a non-linear activation function like relu, as in the sketch below.
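For completeness, here is one way the merged-input version could be assembled end to end, reusing the names and sizes from the question (series_len = 200, 3 channels, 5 classes) and a relu activation as suggested above; treat it as a sketch with stand-in data rather than exact code:

import numpy as np
from keras.models import Model
from keras.layers import Dense, Input
from keras.layers.recurrent import LSTM

series_len = 200
num_RNN_neurons = 64
num_classes = 5

# Stand-in data with the shapes from the question; in the thread these would be X_train / y_train.
X_train = np.random.rand(2250, series_len, 3)
y_train = np.eye(num_classes)[np.random.randint(0, num_classes, size=2250)]  # one-hot, shape (2250, 5)

# A single input carrying all 3 channels, so the (samples, 200, 3) array is passed in directly.
ch123_in = Input(shape=(series_len, 3), name='ch123')
latent = LSTM(num_RNN_neurons)(ch123_in)
hidden1 = Dense(30, activation='relu', name='weighted_average_channels')(latent)  # relu instead of linear
output = Dense(num_classes, activation='softmax')(hidden1)

model = Model(inputs=ch123_in, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.25)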