I am new to Keras and have been practicing with resources from the web. Unfortunately, I cannot build a model without it throwing the following error:
ValueError: logits and labels must have the same shape, received ((None, 10) vs (None, 1)).
I have attempted the following:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

DF = pd.read_csv("https://raw.githubusercontent.com/EpistasisLab/tpot/master/tutorials/MAGIC%20Gamma%20Telescope/MAGIC%20Gamma%20Telescope%20Data.csv")
X = DF.iloc[:,0:-1]
y = DF.iloc[:,-1]
yBin = np.array([1 if x == 'g' else 0 for x in y ])
scaler = StandardScaler()
X1 = scaler.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X1, yBin, test_size=0.25, random_state=2018)
print(X_train.__class__,X_test.__class__,y_train.__class__,y_test.__class__ )
model=Sequential()
model.add(Dense(6,activation="relu", input_shape=(10,)))
model.add(Dense(10,activation="softmax"))
model.build(input_shape=(None,1))
model.summary()
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(x=X_train,
y=y_train,
epochs=600,
validation_data=(X_test, y_test), verbose=1
)
I have read that my model is likely wrong in terms of its input parameters. What is the correct approach?
When I look at the shape of your data
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
I see that X is 10-dimensional and y is 1-dimensional.
Therefore, you need a 10-dimensional input
model.build(input_shape=(None,10))
and a 1-dimensional output in the last dense layer
model.add(Dense(1,activation="sigmoid"))
(with a single output unit, use a sigmoid rather than a softmax; a softmax over one unit always outputs 1).
The target variable yBin/y_train/y_test is a 1D array (it has shape (None,1) for a given batch).
Your logits come from the last Dense layer, which has 10 neurons with softmax activation, so it gives 10 outputs for each input, or (batch_size,10) per batch. This is represented formally as (None,10).
To resolve the shape mismatch in question, change the neuron count of that dense layer to 1 and set its activation function to "sigmoid".
model.add(Dense(1,activation="sigmoid"))
As correctly mentioned by @MSS, you need to use a sigmoid activation function with 1 neuron in the last dense layer to match the logits with the labels (1/0) of your dataset, which indicate the binary class.
Fixed code:
model=Sequential()
model.add(Dense(6,activation="relu", input_shape=(10,)))
model.add(Dense(1,activation="sigmoid"))
#model.build(input_shape=(None,1))
model.summary()
model.compile(optimizer='rmsprop',loss='binary_crossentropy',metrics=['accuracy'])
model.fit(x=X_train,y=y_train,epochs=10,validation_data=(X_test, y_test),verbose=1)
Output:
Epoch 1/10
446/446 [==============================] - 3s 4ms/step - loss: 0.5400 - accuracy: 0.7449 - val_loss: 0.4769 - val_accuracy: 0.7800
Epoch 2/10
446/446 [==============================] - 2s 4ms/step - loss: 0.4425 - accuracy: 0.7987 - val_loss: 0.4241 - val_accuracy: 0.8095
Epoch 3/10
446/446 [==============================] - 2s 3ms/step - loss: 0.4082 - accuracy: 0.8175 - val_loss: 0.4034 - val_accuracy: 0.8242
Epoch 4/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3934 - accuracy: 0.8286 - val_loss: 0.3927 - val_accuracy: 0.8313
Epoch 5/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3854 - accuracy: 0.8347 - val_loss: 0.3866 - val_accuracy: 0.8320
Epoch 6/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3800 - accuracy: 0.8397 - val_loss: 0.3827 - val_accuracy: 0.8364
Epoch 7/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3762 - accuracy: 0.8411 - val_loss: 0.3786 - val_accuracy: 0.8387
Epoch 8/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3726 - accuracy: 0.8432 - val_loss: 0.3764 - val_accuracy: 0.8404
Epoch 9/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3695 - accuracy: 0.8466 - val_loss: 0.3724 - val_accuracy: 0.8408
Epoch 10/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3665 - accuracy: 0.8478 - val_loss: 0.3698 - val_accuracy: 0.8454
<keras.callbacks.History at 0x7f68ca30f670>
First of all, I know that there is a similar thread here:
https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn
But unfortunately, it does not help. I probably have a bug inside my code which I cannot find. What I am trying to do is to classify some WAV files. But the model does not learn.
First, I collect the files and save them in an array.
Second, I create new directories, one for the train data and one for the val data.
Next, I read the WAV files, create spectrograms, and save them all to the train directory.
Afterward, I move 20% of the data from the train directory to the val directory, roughly as sketched below.
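A minimal sketch of that move (train_dir and val_dir are placeholder names, not my actual paths):
import os
import random
import shutil

# train_dir / val_dir are placeholders for the actual directories
files = os.listdir(train_dir)
random.shuffle(files)
n_val = int(0.2 * len(files))  # move 20% of the train images to val
for fn in files[:n_val]:
    shutil.move(os.path.join(train_dir, fn), os.path.join(val_dir, fn))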
Note: While creating the spectrograms I check the length of the WAV. If it is too short (less than 2 sec), I double it. Out of this spectrogram I cut a random chunk and save only that. As a result, all images have the same height and width.
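In essence, this step looks like the following sketch (function names and the 300-frame chunk width are illustrative, not my exact code):
import numpy as np

def double_if_short(wav, sr, min_sec=2.0):
    # if the WAV is shorter than 2 seconds, double it
    while len(wav) < int(min_sec * sr):
        wav = np.tile(wav, 2)
    return wav

def random_chunk(spec, chunk_width=300):
    # cut a random fixed-width chunk out of the 2-D spectrogram
    # (freq_bins, time_frames) so all images end up the same size
    start = np.random.randint(0, max(spec.shape[1] - chunk_width, 0) + 1)
    return spec[:, start:start + chunk_width]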
Then, as the next step, I load the train and val images, and here I also do the normalization.
import glob
import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array, load_img

IMG_WIDTH=300
IMG_HEIGHT=300
IMG_DIM = (IMG_WIDTH, IMG_HEIGHT, 3)
train_files = glob.glob(DBMEL_PATH + "*",recursive=True)
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs) / 255 # normalizing Data
train_labels = [fn.split('\\')[-1].split('.')[1].strip() for fn in train_files]
validation_files = glob.glob(DBMEL_VAL_PATH + "*",recursive=True)
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs) / 255 # normalizing Data
validation_labels = [fn.split('\\')[-1].split('.')[1].strip() for fn in validation_files]
I have checked the variables by printing them. I guess this is working quite well. The arrays contain 80% and 20% of the total data, respectively.
#Train dataset shape: (3756, 300, 300, 3)
#Validation dataset shape: (939, 300, 300, 3)
Next, I have also implemented a One-Hot-Encoder.
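The encoding step is essentially this sketch (assuming scikit-learn's LabelEncoder plus Keras' to_categorical):
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

le = LabelEncoder()
train_labels_enc = to_categorical(le.fit_transform(train_labels))
validation_labels_enc = to_categorical(le.transform(validation_labels))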
So far so good. In the next step I create empty DataGenerators, i.e. without any data augmentation. When calling the DataGenerators, once for the train data and once for the val data, I pass the arrays of images (train_imgs, validation_imgs) and the one-hot encoded labels (train_labels_enc, validation_labels_enc).
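Concretely, the empty generators amount to this (the batch size of 32 is an assumption):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator()  # empty: no augmentation
val_datagen = ImageDataGenerator()

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=32)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=32)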
Okay. Here now comes the tricky part.
First, create/load a pre-trained network
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import Model
import tensorflow.keras
input_shape=(IMG_HEIGHT,IMG_WIDTH,3)
restnet = ResNet50(include_top=False, weights='imagenet', input_shape=(IMG_HEIGHT,IMG_WIDTH,3))
output = restnet.layers[-1].output
output = tensorflow.keras.layers.Flatten()(output)
restnet = Model(restnet.input, output)
for layer in restnet.layers:
    layer.trainable = False
And now finally creating the model itself. While creating the model I am using the pre-trained network for transfer learning. I guess somewhere there must be a problem.
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from tensorflow.keras.models import Sequential
from tensorflow.keras import optimizers
model = Sequential()
model.add(restnet) # <-- transfer learning
model.add(Dense(512, activation='relu', input_dim=input_shape))# 512 (num_classes)
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
And the model runs with this:
history = model.fit_generator(train_generator,
steps_per_epoch=100,
epochs=100,
validation_data=val_generator,
validation_steps=10,
verbose=1
)
But even after 50 epochs the accuracy stalls at around 0.15
Epoch 1/100
100/100 [==============================] - 711s 7s/step - loss: 10.6419 - accuracy: 0.1530 - val_loss: 1.9416 - val_accuracy: 0.1467
Epoch 2/100
100/100 [==============================] - 733s 7s/step - loss: 1.9595 - accuracy: 0.1550 - val_loss: 1.9372 - val_accuracy: 0.1267
Epoch 3/100
100/100 [==============================] - 731s 7s/step - loss: 1.9940 - accuracy: 0.1444 - val_loss: 1.9388 - val_accuracy: 0.1400
Epoch 4/100
100/100 [==============================] - 735s 7s/step - loss: 1.9416 - accuracy: 0.1535 - val_loss: 1.9380 - val_accuracy: 0.1733
Epoch 5/100
100/100 [==============================] - 737s 7s/step - loss: 1.9394 - accuracy: 0.1656 - val_loss: 1.9345 - val_accuracy: 0.1533
Epoch 6/100
100/100 [==============================] - 741s 7s/step - loss: 1.9364 - accuracy: 0.1667 - val_loss: 1.9286 - val_accuracy: 0.1767
Epoch 7/100
100/100 [==============================] - 740s 7s/step - loss: 1.9389 - accuracy: 0.1523 - val_loss: 1.9305 - val_accuracy: 0.1400
Epoch 8/100
100/100 [==============================] - 737s 7s/step - loss: 1.9394 - accuracy: 0.1623 - val_loss: 1.9441 - val_accuracy: 0.1667
Epoch 9/100
100/100 [==============================] - 735s 7s/step - loss: 1.9391 - accuracy: 0.1582 - val_loss: 1.9458 - val_accuracy: 0.1333
Epoch 10/100
100/100 [==============================] - 734s 7s/step - loss: 1.9381 - accuracy: 0.1602 - val_loss: 1.9372 - val_accuracy: 0.1700
Epoch 11/100
100/100 [==============================] - 739s 7s/step - loss: 1.9392 - accuracy: 0.1623 - val_loss: 1.9302 - val_accuracy: 0.2167
Epoch 12/100
100/100 [==============================] - 741s 7s/step - loss: 1.9368 - accuracy: 0.1627 - val_loss: 1.9326 - val_accuracy: 0.1467
Epoch 13/100
100/100 [==============================] - 740s 7s/step - loss: 1.9381 - accuracy: 0.1513 - val_loss: 1.9312 - val_accuracy: 0.1733
Epoch 14/100
100/100 [==============================] - 736s 7s/step - loss: 1.9396 - accuracy: 0.1542 - val_loss: 1.9407 - val_accuracy: 0.1367
Epoch 15/100
100/100 [==============================] - 741s 7s/step - loss: 1.9393 - accuracy: 0.1597 - val_loss: 1.9336 - val_accuracy: 0.1333
Epoch 16/100
100/100 [==============================] - 737s 7s/step - loss: 1.9375 - accuracy: 0.1659 - val_loss: 1.9354 - val_accuracy: 0.1267
Epoch 17/100
100/100 [==============================] - 741s 7s/step - loss: 1.9422 - accuracy: 0.1487 - val_loss: 1.9307 - val_accuracy: 0.1567
Epoch 18/100
100/100 [==============================] - 738s 7s/step - loss: 1.9399 - accuracy: 0.1680 - val_loss: 1.9408 - val_accuracy: 0.1567
Epoch 19/100
100/100 [==============================] - 743s 7s/step - loss: 1.9405 - accuracy: 0.1610 - val_loss: 1.9335 - val_accuracy: 0.1533
Epoch 20/100
100/100 [==============================] - 738s 7s/step - loss: 1.9410 - accuracy: 0.1575 - val_loss: 1.9331 - val_accuracy: 0.1533
Epoch 21/100
100/100 [==============================] - 746s 7s/step - loss: 1.9395 - accuracy: 0.1639 - val_loss: 1.9344 - val_accuracy: 0.1733
Epoch 22/100
100/100 [==============================] - 746s 7s/step - loss: 1.9393 - accuracy: 0.1585 - val_loss: 1.9354 - val_accuracy: 0.1667
Epoch 23/100
100/100 [==============================] - 746s 7s/step - loss: 1.9398 - accuracy: 0.1599 - val_loss: 1.9352 - val_accuracy: 0.1500
Epoch 24/100
100/100 [==============================] - 746s 7s/step - loss: 1.9392 - accuracy: 0.1585 - val_loss: 1.9449 - val_accuracy: 0.1667
Epoch 25/100
100/100 [==============================] - 746s 7s/step - loss: 1.9399 - accuracy: 0.1495 - val_loss: 1.9352 - val_accuracy: 0.1600
Can anyone please help to find the problem?
I solved the problem on my own.
I exchanged this
model = Sequential()
model.add(restnet) # <-- transfer learning
model.add(Dense(512, activation='relu', input_dim=input_shape))# 512 (num_classes)
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
with this:
base_model = tf.keras.applications.MobileNetV2(input_shape = (224, 224, 3), include_top = False, weights = "imagenet")
model = Sequential()
model.add(base_model)
model.add(tf.keras.layers.GlobalAveragePooling2D())
model.add(Dropout(0.2))
model.add(Dense(number_classes, activation="softmax"))
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.00001),
loss="categorical_crossentropy",
metrics=['accuracy'])
model.summary()
And I found out one more thing: contrary to some tutorials, using data augmentation is not useful when working with spectrograms.
Without data augmentation I got 0.99 train accuracy and 0.72 val accuracy. But with data augmentation I got only 0.75 train accuracy and 0.16 val accuracy.
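The kind of augmentation meant here would look roughly like this (illustrative settings, not the exact configuration used):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# rotations, shifts, and flips rearrange the time/frequency axes
# of a spectrogram, unlike natural images
augmented_datagen = ImageDataGenerator(rotation_range=20,
                                       width_shift_range=0.1,
                                       height_shift_range=0.1,
                                       horizontal_flip=True)
plain_datagen = ImageDataGenerator()  # no augmentation: what worked better here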
Input X = [[1,1,1,1,1], [1,2,1,3,7], [3,1,5,7,8]], etc.
Output Y = [[0.77],[0.63],[0.77],[1.26]], etc.
Each input x represents some combination, for example:
["car", "black", "sport", "xenon", "5dor"]
["car", "red", "sport", "noxenon", "3dor"] etc...
The output is a score for that combination.
What do I need? I need to predict whether a combination is good or bad.
Dataset size: 10k.
Model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(20, input_dim = 5, activation = 'relu'))
model.add(Dense(20, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
optimizer = adam, loss = mse, validation_split = 0.2, epochs = 30
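Spelled out, those settings correspond to a compile/fit call like this (X and Y being the arrays from the top of the question):
model.compile(optimizer='adam', loss='mse')
model.fit(X, Y, validation_split=0.2, epochs=30)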
Training:
Epoch 1/30
238/238 [==============================] - 0s 783us/step - loss: 29.8973 - val_loss: 19.0270
Epoch 2/30
238/238 [==============================] - 0s 599us/step - loss: 29.6696 - val_loss: 19.0100
Epoch 3/30
238/238 [==============================] - 0s 579us/step - loss: 29.6606 - val_loss: 19.0066
Epoch 4/30
238/238 [==============================] - 0s 583us/step - loss: 29.6579 - val_loss: 19.0050
Epoch 5/30
Not good, this makes no sense...
I need some good documentation on how to properly set up or build this model.
Just tried to reproduce. My results differ from yours. Please check:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras import Model
inputA = Input(shape=(5, ))
x = Dense(20, activation='relu')(inputA)
x = Dense(20, activation='relu')(x)
x = Dense(1, activation='linear')(x)
model = Model(inputs=inputA, outputs=x)
model.compile(optimizer = 'adam', loss = 'mse')
input = tf.random.uniform([10000, 5], 0, 10, dtype=tf.int32)
labels = tf.random.uniform([10000, 1])
model.fit(input, labels, epochs=30, validation_split=0.2)
Results:
Epoch 1/30
250/250 [==============================] - 1s 3ms/step - loss: 0.1980 - val_loss: 0.1082
Epoch 2/30
250/250 [==============================] - 1s 2ms/step - loss: 0.0988 - val_loss: 0.0951
Epoch 3/30
250/250 [==============================] - 1s 2ms/step - loss: 0.0918 - val_loss: 0.0916
Epoch 4/30
250/250 [==============================] - 1s 2ms/step - loss: 0.0892 - val_loss: 0.0872
Epoch 5/30
250/250 [==============================] - 0s 2ms/step - loss: 0.0886 - val_loss: 0.0859
Epoch 6/30
250/250 [==============================] - 1s 2ms/step - loss: 0.0864 - val_loss: 0.0860
Epoch 7/30
250/250 [==============================] - 1s 3ms/step - loss: 0.0873 - val_loss: 0.0863
Epoch 8/30
250/250 [==============================] - 1s 2ms/step - loss: 0.0863 - val_loss: 0.0992
Epoch 9/30
250/250 [==============================] - 0s 2ms/step - loss: 0.0876 - val_loss: 0.0865
The model should work on real data as well.
I am building a training model for my character recognition system. During every epoch I get the same accuracy, and it doesn't improve. I currently have 4000 training images and 77 validation images.
My model is as follows:
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

inputs = Input(shape=(32,32,3))
x = Conv2D(filters = 64, kernel_size = 5, activation = 'relu')(inputs)
x = MaxPooling2D()(x)
x = Conv2D(filters = 32,
kernel_size = 3,
activation = 'relu')(x)
x = MaxPooling2D()(x)
x = Flatten()(x)
x=Dense(256,
activation='relu')(x)
outputs = Dense(1, activation = 'softmax')(x)
model = Model(inputs = inputs, outputs = outputs)
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
data_gen_train = ImageDataGenerator(rescale=1/255)
data_gen_test=ImageDataGenerator(rescale=1/255)
data_gen_valid = ImageDataGenerator(rescale=1/255)
train_generator = data_gen_train.flow_from_directory(directory=r"./drive/My Drive/train_dataset",
target_size=(32,32), batch_size=10, class_mode="binary")
valid_generator = data_gen_valid.flow_from_directory(directory=r"./drive/My Drive/validation_dataset",
target_size=(32,32), batch_size=2, class_mode="binary")
test_generator = data_gen_test.flow_from_directory(
directory=r"./drive/My Drive/test_dataset",
target_size=(32, 32),
batch_size=6,
class_mode="binary"
)
model.fit(
train_generator,
epochs =10,
steps_per_epoch=400,
validation_steps=37,
validation_data=valid_generator)
The result is as follows:
Found 4000 images belonging to 2 classes.
Found 77 images belonging to 2 classes.
Found 6 images belonging to 2 classes.
Epoch 1/10
400/400 [==============================] - 14s 35ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
Epoch 2/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
Epoch 3/10
400/400 [==============================] - 13s 34ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 4/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 5/10
400/400 [==============================] - 18s 46ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5541
Epoch 6/10
400/400 [==============================] - 13s 34ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 7/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 8/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5946
Epoch 9/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
Epoch 10/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
<tensorflow.python.keras.callbacks.History at 0x7fa3a5f4a8d0>
If you are trying to recognize characters of 2 classes, you should:
use class_mode="binary" in the flow_from_directory function
use binary_crossentropy as the loss
make your last layer 1 neuron with a sigmoid activation function (see the sketch after this list)
In case there are more than 2 classes:
do not use class_mode="binary" in the flow_from_directory function
use categorical_crossentropy as the loss
make your last layer n neurons with softmax activation, where n stands for the number of classes
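Applied to the model in the question (2 classes), a minimal sketch of the binary variant could look like this:
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(32, 32, 3))
x = Conv2D(filters=64, kernel_size=5, activation='relu')(inputs)
x = MaxPooling2D()(x)
x = Conv2D(filters=32, kernel_size=3, activation='relu')(x)
x = MaxPooling2D()(x)
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
outputs = Dense(1, activation='sigmoid')(x)  # 1 neuron + sigmoid for 2 classes

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam',
              loss='binary_crossentropy',  # matches class_mode="binary"
              metrics=['accuracy'])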
I'm trying to train a neural network using tensorflow.keras, but I don't understand how to train it with a numpy array of lists (in Python 3).
I have tried changing the input shape of the layers, but I don't really understand how it works.
import tensorflow as tf
from tensorflow import keras
import numpy as np
# Create the array of data
train_data = [[1.0,2.0,3.0],[4.0,5.0,6.0]]
train_data_np = np.asarray(train_data)
train_label = [[1,2,3],[4,5,6]]
train_label_np = np.asarray(train_data)
### Build the model
model = keras.Sequential([
keras.layers.Dense(3,input_shape =(3,2)),
keras.layers.Dense(3,activation=tf.nn.sigmoid)
])
model.compile(optimizer='sgd',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
#Train the model
model.fit(train_data_np,train_label_np,epochs=10)
The error is "Error when checking input: expected dense_input to have 3 dimensions, but got array with shape (2, 3)" when model.fit is called.
While defining a Keras model, you have to provide the input shape to the first layer of the model.
For example, if your training data has n rows and m features, i.e. shape (n, m), you have to set the input_shape of the first Dense layer of the model to (m, ), i.e. the model should expect m features coming into it.
Now coming to your toy data,
train_data = [[1.0,2.0,3.0],[4.0,5.0,6.0]]
train_data_np = np.asarray(train_data)
train_label = [[1,2,3],[4,5,6]]
train_label_np = np.asarray(train_label)
Here, train_data_np.shape is (2, 3), i.e. 2 rows and 3 features; then you have to define the model like this:
model = keras.Sequential([
keras.layers.Dense(3,input_shape =(3, )),
keras.layers.Dense(3,activation=tf.nn.sigmoid)
])
Now, your labels are [[1,2,3],[4,5,6]]. In a normal 3-class classification task these would be one-hot vectors of 1s and 0s. But let's leave that aside, as this is a toy example to check Keras.
If the target label, i.e. y_train, is one-hot, then you have to use the categorical_crossentropy loss instead of sparse_categorical_crossentropy.
So you can compile and train the model like this:
model.compile(optimizer='sgd',loss='categorical_crossentropy',metrics=['accuracy'])
#Train the model
model.fit(train_data_np,train_label_np,epochs=10)
Epoch 1/10
2/2 [==============================] - 0s 61ms/step - loss: 11.5406 - acc: 0.0000e+00
Epoch 2/10
2/2 [==============================] - 0s 0us/step - loss: 11.4970 - acc: 0.5000
Epoch 3/10
2/2 [==============================] - 0s 0us/step - loss: 11.4664 - acc: 0.5000
Epoch 4/10
2/2 [==============================] - 0s 498us/step - loss: 11.4430 - acc: 0.5000
Epoch 5/10
2/2 [==============================] - 0s 496us/step - loss: 11.4243 - acc: 1.0000
Epoch 6/10
2/2 [==============================] - 0s 483us/step - loss: 11.4087 - acc: 1.0000
Epoch 7/10
2/2 [==============================] - 0s 1ms/step - loss: 11.3954 - acc: 1.0000
Epoch 8/10
2/2 [==============================] - 0s 997us/step - loss: 11.3840 - acc: 1.0000
Epoch 9/10
2/2 [==============================] - 0s 1ms/step - loss: 11.3740 - acc: 1.0000
Epoch 10/10
2/2 [==============================] - 0s 995us/step - loss: 11.3653 - acc: 1.0000
I'm working on a multi-class classification problem with Keras 2.1.3 and a TensorFlow backend. I have two numpy arrays, x and y, and I'm using tf.data.Dataset like this:
dataset = tf.data.Dataset.from_tensor_slices(({"sequence": x}, y))
dataset = dataset.apply(tf.contrib.data.batch_and_drop_remainder(self.batch_size))
dataset = dataset.repeat()
xt, yt = dataset.make_one_shot_iterator().get_next()
Then I make my Keras model (omitted for brevity), compile, and fit:
model.compile(
loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'],
)
model.fit(xt, yt, steps_per_epoch=100, epochs=10)
This works perfectly well. But when I add in callbacks, I run into issues. Specifically, if I do this:
callbacks = [
tf.keras.callbacks.ModelCheckpoint("model_{epoch:04d}_{val_acc:.4f}.h5",
monitor='val_acc',
verbose=1,
save_best_only=True,
mode='max'),
tf.keras.callbacks.TensorBoard(os.path.join('.', 'logs')),
tf.keras.callbacks.EarlyStopping(monitor='val_acc', patience=5, min_delta=0, mode='max')
]
model.fit(xt, yt, steps_per_epoch=10, epochs=100, callbacks=callbacks)
I get:
KeyError: 'val_acc'
Also, if I include validation_split=0.1 in my model.fit(...) call, I'm told:
ValueError: If your data is in the form of symbolic tensors, you cannot use validation_split.
What is the normal way to use callbacks and validation splits with tf.data.Dataset (tensors)?
Thanks!
Using the tensorflow keras API, you can provide a Dataset for training and another for validation.
First some imports
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense
import numpy as np
define the function which will split the numpy arrays into training/val
def split(x, y, val_size=50):
    idx = np.random.choice(x.shape[0], size=val_size, replace=False)
    not_idx = list(set(range(x.shape[0])).difference(set(idx)))
    x_val = x[idx]
    y_val = y[idx]
    x_train = x[not_idx]
    y_train = y[not_idx]
    return x_train, y_train, x_val, y_val
define numpy arrays and the train/val tensorflow Datasets
x = np.random.randn(150, 9)
y = np.random.randint(0, 10, 150)
x_train, y_train, x_val, y_val = split(x, y)
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, tf.one_hot(y_train, depth=10)))
train_dataset = train_dataset.batch(32).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, tf.one_hot(y_val, depth=10)))
val_dataset = val_dataset.batch(32).repeat()
Make the model (notice we are using the tensorflow keras API)
model = keras.models.Sequential([Dense(64, input_shape=(9,), activation='relu'),
Dense(64, activation='relu'),
Dense(10, activation='softmax')
])
model.compile(optimizer='Adam', loss='categorical_crossentropy',
metrics=['accuracy'])
model.summary()
model.fit(train_dataset,
epochs=10,
steps_per_epoch=int(100/32)+1,
validation_data=val_dataset,
validation_steps=2)
and the model trains, kind of (output):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 64) 640
_________________________________________________________________
dense_1 (Dense) (None, 64) 4160
_________________________________________________________________
dense_2 (Dense) (None, 10) 650
=================================================================
Total params: 5,450
Trainable params: 5,450
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
4/4 [==============================] - 0s 69ms/step - loss: 2.3170 - acc: 0.1328 - val_loss: 2.3877 - val_acc: 0.0712
Epoch 2/10
4/4 [==============================] - 0s 2ms/step - loss: 2.2628 - acc: 0.2500 - val_loss: 2.3850 - val_acc: 0.0712
Epoch 3/10
4/4 [==============================] - 0s 2ms/step - loss: 2.2169 - acc: 0.2656 - val_loss: 2.3838 - val_acc: 0.0712
Epoch 4/10
4/4 [==============================] - 0s 2ms/step - loss: 2.1743 - acc: 0.3359 - val_loss: 2.3830 - val_acc: 0.0590
Epoch 5/10
4/4 [==============================] - 0s 2ms/step - loss: 2.1343 - acc: 0.3594 - val_loss: 2.3838 - val_acc: 0.0590
Epoch 6/10
4/4 [==============================] - 0s 2ms/step - loss: 2.0959 - acc: 0.3516 - val_loss: 2.3858 - val_acc: 0.0590
Epoch 7/10
4/4 [==============================] - 0s 4ms/step - loss: 2.0583 - acc: 0.3750 - val_loss: 2.3887 - val_acc: 0.0590
Epoch 8/10
4/4 [==============================] - 0s 2ms/step - loss: 2.0223 - acc: 0.4453 - val_loss: 2.3918 - val_acc: 0.0747
Epoch 9/10
4/4 [==============================] - 0s 2ms/step - loss: 1.9870 - acc: 0.4609 - val_loss: 2.3954 - val_acc: 0.1059
Epoch 10/10
4/4 [==============================] - 0s 2ms/step - loss: 1.9523 - acc: 0.4609 - val_loss: 2.3995 - val_acc: 0.1059
Callbacks
Adding callbacks also works,
callbacks = [ tf.keras.callbacks.ModelCheckpoint("model_{epoch:04d}_{val_acc:.4f}.h5",
monitor='val_acc',
verbose=1,
save_best_only=True,
mode='max'),
tf.keras.callbacks.TensorBoard('./logs'),
tf.keras.callbacks.EarlyStopping(monitor='val_acc', patience=5, min_delta=0, mode='max')
]
model.fit(train_dataset, epochs=10, steps_per_epoch=int(100/32)+1, validation_data=val_dataset,
validation_steps=2, callbacks=callbacks)
Output:
Epoch 1/10
4/4 [==============================] - 0s 59ms/step - loss: 2.3274 - acc: 0.1094 - val_loss: 2.3143 - val_acc: 0.0833
Epoch 00001: val_acc improved from -inf to 0.08333, saving model to model_0001_0.0833.h5
Epoch 2/10
4/4 [==============================] - 0s 2ms/step - loss: 2.2655 - acc: 0.1094 - val_loss: 2.3204 - val_acc: 0.1389
Epoch 00002: val_acc improved from 0.08333 to 0.13889, saving model to model_0002_0.1389.h5
Epoch 3/10
4/4 [==============================] - 0s 5ms/step - loss: 2.2122 - acc: 0.1250 - val_loss: 2.3289 - val_acc: 0.1111
Epoch 00003: val_acc did not improve from 0.13889
Epoch 4/10
4/4 [==============================] - 0s 2ms/step - loss: 2.1644 - acc: 0.1953 - val_loss: 2.3388 - val_acc: 0.0556
Epoch 00004: val_acc did not improve from 0.13889
Epoch 5/10
4/4 [==============================] - 0s 2ms/step - loss: 2.1211 - acc: 0.2734 - val_loss: 2.3495 - val_acc: 0.0556
Epoch 00005: val_acc did not improve from 0.13889
Epoch 6/10
4/4 [==============================] - 0s 4ms/step - loss: 2.0808 - acc: 0.2969 - val_loss: 2.3616 - val_acc: 0.0556
Epoch 00006: val_acc did not improve from 0.13889
Epoch 7/10
4/4 [==============================] - 0s 2ms/step - loss: 2.0431 - acc: 0.2969 - val_loss: 2.3749 - val_acc: 0.0712
Epoch 00007: val_acc did not improve from 0.13889
I think your problem is the type of your datasets: xt and yt are not lists or arrays, so they cannot be sliced or split.
Let's assume x is an array with shape (1000, 2) filled with random numbers using numpy, and y is a class label with 10 classes.
# Define x and y
import os
import numpy as np
import tensorflow as tf

np.random.seed(2018)
n_cat = 10
x = np.random.rand(1000,2) # Generate 2 random variables and 1000 rows
y = tf.keras.utils.to_categorical(
    [np.random.randint(0, n_cat) for i in range(len(x))],
    num_classes=n_cat) # one-hot labels for the 10 classes, to match categorical_crossentropy
Then you assemble the model and compile it.
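For example, a minimal model for this data could be (the hidden layer size is arbitrary):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(32, input_shape=(2,), activation='relu'),  # 2 input features
    Dense(n_cat, activation='softmax'),              # 10 output classes
])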
model.compile(
loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'],
)
Create callbacks for checkpointing the model
callbacks = [
tf.keras.callbacks.ModelCheckpoint("model_{epoch:04d}_{val_acc:.4f}.h5",
monitor='val_acc',
verbose=1,
save_best_only=True,
mode='max'),
tf.keras.callbacks.TensorBoard(os.path.join('.', 'logs')),
tf.keras.callbacks.EarlyStopping(monitor='val_acc', patience=5, min_delta=0, mode='max')
]
fit the model
model.fit(x, y, batch_size=32, epochs=100, callbacks=callbacks, validation_split=0.1)
It works for me.