I'm trying to train my neural network for 10 epochs, but my attempts are unsuccessful. I don't understand why I always get something like this:
35/300 [==>...........................] - ETA: 1:09 - loss: 0.0000e+00 - accuracy: 1.0000
36/300 [==>...........................] - ETA: 1:09 - loss: 0.0000e+00 - accuracy: 1.0000
37/300 [==>...........................] - ETA: 1:08 - loss: 0.0000e+00 - accuracy: 1.0000
Here are my batch size, image width/height, and the whole feeding process:
batch_size = 32
img_height = 150
img_width = 150
dataset_url = "http://cnrpark.it/dataset/CNR-EXT-Patches-150x150.zip"
print(dataset_url)
data_dir = tf.keras.utils.get_file(origin=dataset_url,
                                   fname='CNR-EXT-Patches-150x150',
                                   untar=True)
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
num_classes = 1
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
class_names = train_ds.class_names
print(class_names)
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
normalization_layer = tf.keras.layers.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
print(np.min(first_image), np.max(first_image))
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
    tf.keras.layers.Dense(num_classes)
])
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=['accuracy'])
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10
)
From np.min and np.max I'm getting 0.08627451 and 0.5568628, so unnormalized input obviously isn't the problem. What is wrong in my attempt?
EDIT: my epochs now look like this:
5/300 [..............................] - ETA: 1:12 - loss: 0.1564 - accuracy: 0.9750
6/300 [..............................] - ETA: 1:15 - loss: 0.1311 - accuracy: 0.8333
7/300 [..............................] - ETA: 1:13 - loss: 0.1124 - accuracy: 0.7143
8/300 [..............................] - ETA: 1:13 - loss: 0.0984 - accuracy: 0.6250
9/300 [..............................] - ETA: 1:12 - loss: 0.0874 - accuracy: 0.5556
And a little later:
51/300 [====>.........................] - ETA: 1:04 - loss: 0.0154 - accuracy: 0.0980
You have set num_classes = 1, although your dataset has two classes:
LABEL is 0 for free, 1 for busy.
So, if you want to use tf.keras.losses.SparseCategoricalCrossentropy, try:
tf.keras.layers.Dense(2)
You could also consider using binary_crossentropy if you only have two classes. You would have to change your loss function and output layer to:
tf.keras.layers.Dense(1, activation="sigmoid")
You need the following:
num_classes = 2
tf.keras.layers.Dense(num_classes, activation="softmax")
I can also see a normalization_layer, so it looks like you may be rescaling twice (Rescaling(1./255) is already the first layer of the model).
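Putting the answers together, here is a minimal sketch of one plausible repair (keeping the question's layer sizes; the binary head plus matching loss, with rescaling done exactly once):

# Sketch: binary classifier for the two-class parking dataset.
# Rescaling happens inside the model, so fit on the raw train_ds / val_ds
# rather than on normalized_ds.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')  # single unit: P(busy)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])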
I'm trying to build a Siamese neural network to analyze the MNIST dataset, but when fitting the model I hit a shape mismatch between the training data and the labels. I tried changing the loss function and squeezing the labels array, and neither "solution" worked.
Here are the train and labels arrays' shapes:
pairTrain shape: (120000, 2, 28, 28, 1)
labelTrain shape: (120000, 1)
Here's my model:
def build_model(input_shape, embedDim=48):
    inputs = Input(input_shape)
    x = Conv2D(64, (2, 2), padding="same", activation="relu", input_shape=input_shape)(inputs)
    x = MaxPooling2D()(x)
    x = Dropout(0.3)(x)
    x = Conv2D(32, (2, 2), padding="same", activation="relu")(x)
    x = MaxPooling2D()(x)
    x = Dropout(0.3)(x)
    x = Conv2D(16, (2, 2), padding="same", activation="relu")(x)
    x = MaxPooling2D()(x)
    x = Dropout(0.3)(x)
    outputs = Flatten()(x)
    outputs = Dense(embedDim)(outputs)
    model = Model(inputs, outputs)
    return model
And finally, here's the code that generates the error itself:
imgA = Input(shape=(28, 28, 1))
imgB = Input(shape=(28, 28, 1))
featA = build_model((28, 28, 1))(imgA)
featB = build_model((28, 28, 1))(imgB)
distance = Lambda(euclidean_distance)([featA, featB])
output = Dense(1, activation="sigmoid")(distance)
model = Model([imgA, imgB], output)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
history = model.fit(
    [pairTrain[:, 0], pairTrain[:, 1]], labelTrain,
    validation_data=[[pairTest[:, 0], pairTest[:, 1]], labelTest],
    batch_size=64,
    epochs=10
)
model.save("output/siamese_model")
Please help me to resolve the problem.
I was not able to reproduce the error using the code below. I suspect that your labels' shape differs from the one you reported, or that the labels are not strictly binary (0s and 1s only).
Also, you should use tf.keras.losses.BinaryCrossentropy instead of tf.keras.losses.CategoricalCrossentropy as your labels should be binary with the sigmoid activation in the last layer.
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Input, Flatten, Dense, Lambda
from tensorflow.keras.models import Model
import tensorflow as tf
def build_model(input_shape, embedDim=48):
    inputs = Input(input_shape)
    x = Conv2D(64, (2, 2), padding="same", activation="relu", input_shape=input_shape)(inputs)
    x = MaxPooling2D()(x)
    x = Dropout(0.3)(x)
    x = Conv2D(32, (2, 2), padding="same", activation="relu")(x)
    x = MaxPooling2D()(x)
    x = Dropout(0.3)(x)
    x = Conv2D(16, (2, 2), padding="same", activation="relu")(x)
    x = MaxPooling2D()(x)
    x = Dropout(0.3)(x)
    outputs = Flatten()(x)
    outputs = Dense(embedDim)(outputs)
    model = Model(inputs, outputs)
    return model
imgA = Input(shape=(28, 28, 1))
imgB = Input(shape=(28, 28, 1))
featA = build_model((28, 28, 1))(imgA)
featB = build_model((28, 28, 1))(imgB)
distance = Lambda(lambda x: x[0] - x[1])([featA, featB])  # simple difference as a stand-in for euclidean_distance
output = Dense(1, activation="sigmoid")(distance)
model = Model([imgA, imgB], output)
pairTrain = tf.random.uniform((10, 2, 28, 28, 1))
labelTrain = tf.random.uniform(shape=(10, 1), minval=0, maxval=2, dtype=tf.int32)
pairTest = tf.random.uniform((10, 2, 28, 28, 1))
labelTest = tf.random.uniform(shape=(10, 1), minval=0, maxval=2, dtype=tf.int32)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(
    [pairTrain[:, 0], pairTrain[:, 1]], labelTrain,
    validation_data=[[pairTest[:, 0], pairTest[:, 1]], labelTest],
    batch_size=64,
    epochs=10
)
model.save("output/siamese_model")
Epoch 1/10
1/1 [==============================] - 2s 2s/step - loss: 0.7061 - accuracy: 0.5000 - val_loss: 0.6862 - val_accuracy: 0.7000
Epoch 2/10
1/1 [==============================] - 0s 80ms/step - loss: 0.7882 - accuracy: 0.4000 - val_loss: 0.6751 - val_accuracy: 0.6000
Epoch 3/10
1/1 [==============================] - 0s 81ms/step - loss: 0.6358 - accuracy: 0.5000 - val_loss: 0.6755 - val_accuracy: 0.6000
Epoch 4/10
1/1 [==============================] - 0s 79ms/step - loss: 0.7027 - accuracy: 0.5000 - val_loss: 0.6759 - val_accuracy: 0.6000
Epoch 5/10
1/1 [==============================] - 0s 82ms/step - loss: 0.6970 - accuracy: 0.4000 - val_loss: 0.6752 - val_accuracy: 0.6000
Epoch 6/10
1/1 [==============================] - 0s 83ms/step - loss: 0.7564 - accuracy: 0.4000 - val_loss: 0.6779 - val_accuracy: 0.6000
Epoch 7/10
1/1 [==============================] - 0s 73ms/step - loss: 0.7123 - accuracy: 0.6000 - val_loss: 0.6818 - val_accuracy: 0.6000
I'm currently trying to build a classification model in Keras, but I keep getting a shape error. This is my model right now. Is there anything I am doing wrong?
predictors=["Length", "Diameter", "Height", "Shucked weight", "Viscera weight", "Shell weight", "Rings"]
x_train, x_test, y_train, y_test = train_test_split(db[predictors], db["Sex"], test_size=.2)
x_train= x_train.to_numpy()
x_test = x_test.to_numpy()
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(7,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(64, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'],
              )
x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = y_train[:1000]
partial_y_train = y_train[1000:]
partial_x_train.shape
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))
ValueError: Shapes (None, 1) and (None, 64) are incompatible
Data source: https://www.kaggle.com/rodolfomendes/abalone-dataset
The output of the last layer consists of 64 values, while each of your labels is a single value.
This error occurs because you have 3 classes (labels) in your dataset but are not defining them in your model's last layer (as mentioned by @subspring).
model = Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(7,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(3)) # You need to mention this in the last dense layer
The label data in this dataset is not numeric:
y_train.unique() #array(['I', 'M', 'F'], dtype=object)
For that, you can use LabelEncoder as below:
from sklearn.preprocessing import LabelEncoder
def Labels(y_train, y_test):
    LabEnc = LabelEncoder()
    LabEnc.fit(y_train)
    Enc_y_train = LabEnc.transform(y_train)
    Enc_y_test = LabEnc.transform(y_test)
    return Enc_y_train, Enc_y_test
y_train, y_test = Labels(y_train, y_test)
y_train # array([1, 1, 2, ..., 2, 2, 0])
Now train the model after converting the input data (x_train, x_test) into arrays.
x_train= np.array(x_train)
x_test = np.array(x_test)
# compile the model
model.compile(optimizer='rmsprop',
              loss=tf.keras.losses.MeanSquaredError(),
              metrics=['accuracy'])
x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = y_train[:1000]
partial_y_train = y_train[1000:]
partial_x_train.shape
# train the model
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=5,
                    batch_size=512,
                    validation_data=(x_val, y_val))
Output:
Epoch 1/5
5/5 [==============================] - 2s 80ms/step - loss: 0.8610 - accuracy: 0.3302 - val_loss: 0.7966 - val_accuracy: 0.2350
Epoch 2/5
5/5 [==============================] - 0s 13ms/step - loss: 0.7997 - accuracy: 0.2563 - val_loss: 0.7491 - val_accuracy: 0.4620
Epoch 3/5
5/5 [==============================] - 0s 16ms/step - loss: 0.7917 - accuracy: 0.3315 - val_loss: 0.7883 - val_accuracy: 0.2680
Epoch 4/5
5/5 [==============================] - 0s 15ms/step - loss: 0.7949 - accuracy: 0.3405 - val_loss: 0.7499 - val_accuracy: 0.3390
Epoch 5/5
5/5 [==============================] - 0s 13ms/step - loss: 0.7884 - accuracy: 0.3306 - val_loss: 0.7605 - val_accuracy: 0.3670
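For what it's worth, mean squared error is an unusual choice for a 3-class problem; the conventional pairing for a Dense(3) logits head with integer labels is sparse categorical crossentropy. A sketch of that alternative compile step, assuming the same model as above:

# Sketch: integer labels 0/1/2 against 3 raw logits.
# from_logits=True because the final Dense(3) layer has no softmax activation.
model.compile(optimizer='rmsprop',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])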
Good morning, everyone. I built a model from my learning results, but I'm having problems in the training section: when I train the model, the loss and val_loss take nonsensical values, while the accuracy looks normal. This happened when I added classes: in the first experiment, with two classes, I reached 95% accuracy and 93% validation accuracy, but after I added 2 more classes the results stopped making sense. I tried several solutions that didn't help, including softmax with categorical_crossentropy. When I add num_classes to the last Dense layer I get "ValueError: logits and labels must have the same shape ((None, 3) vs (None, 1))". I hope someone can help me. Thank you. I attach my code below.
Load Dataset
training_dir = r"Dataset/train/"
validation_dir = r"Dataset/val/"
testing_dir = r"Dataset/test/"
categories = ['class_A', 'class_B', 'class_C', 'class_D']
Create Training Data
img_size = (128,128)
training_data = []
validation_data = []
testing_data = []
def create_training_data():
    for category in categories:
        path = os.path.join(training_dir, category)
        class_num = categories.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                new_array = cv2.resize(img_array, img_size)
                training_data.append([new_array, class_num])
            except Exception as e:
                pass

def create_validation_data():
    for category in categories:
        path = os.path.join(validation_dir, category)
        class_num = categories.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                new_array = cv2.resize(img_array, img_size)
                validation_data.append([new_array, class_num])
            except Exception as e:
                pass

def create_testing_data():
    for category in categories:
        path = os.path.join(testing_dir, category)
        class_num = categories.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                new_array = cv2.resize(img_array, img_size)
                testing_data.append([new_array, class_num])
            except Exception as e:
                pass
create_training_data()
create_validation_data()
create_testing_data()
Normalization of data to range 0-1 and labeling data
X_train = []
Y_train = []
for features, label in training_data:
    X_train.append(features)
    Y_train.append(label)
X_train = np.array(X_train).reshape(-1,128,128)
X_train = X_train.astype('float32')/255.0
X_train = X_train.reshape(-1,128,128,3)
print(X_train.shape)
X_val = []
Y_val = []
for features, label in validation_data:
    X_val.append(features)
    Y_val.append(label)
X_val = np.array(X_val).reshape(-1,128,128)
X_val = X_val.astype('float32')/255.0
X_val = X_val.reshape(-1,128,128,3)
print(X_val.shape)
X_test = []
Y_test = []
for features, label in testing_data:
    X_test.append(features)
    Y_test.append(label)
X_test = np.array(X_test).reshape(-1,128,128)
X_test = X_test.astype('float32')/255.0
X_test = X_test.reshape(-1,128,128,3)
print(X_test.shape)
Labeling Data using label encoder for Y_train/val/test
lb = LabelEncoder()
Y_train = lb.fit_transform(Y_train)
Y_val = lb.fit_transform(Y_val)
Y_test = lb.fit_transform(Y_test)
ImageDataGenerator
datagen = ImageDataGenerator(
    rotation_range=30,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=False)
Hyperparameter Tuning
HP_APL_DROPOUT = hp.HParam('dropout_apl', hp.RealInterval(0.05, 0.1))
HP_NUM_UNITS = hp.HParam('num_units', hp.Discrete([64, 128]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.25, 0.5))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'adamax']))
METRIC_ACCURACY = 'accuracy'
with tf.summary.create_file_writer('logs/hparam_tuning').as_default():
    hp.hparams_config(
        hparams=[HP_APL_DROPOUT, HP_NUM_UNITS, HP_DROPOUT, HP_OPTIMIZER],
        metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
    )
Define Training Model
def train_test_model(hparams):
    model = Sequential([
        Input(shape=(128, 128, 3)),
        Conv2D(filters=32, kernel_size=3, strides=1, padding='same', activation='swish'),
        BatchNormalization(),
        MaxPooling2D(pool_size=(2, 2)),
        Conv2D(filters=64, kernel_size=3, strides=1, padding='same', activation='swish'),
        BatchNormalization(),
        Conv2D(filters=64, kernel_size=3, strides=1, padding='same', activation='swish'),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(hparams[HP_APL_DROPOUT]),
        Conv2D(filters=128, kernel_size=3, strides=1, padding='same', activation='swish'),
        BatchNormalization(),
        MaxPooling2D(pool_size=(2, 2)),
        Conv2D(filters=128, kernel_size=3, strides=1, padding='same', activation='swish'),
        BatchNormalization(),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(hparams[HP_APL_DROPOUT]),
        Conv2D(filters=256, kernel_size=3, strides=1, padding='same', activation='swish'),
        BatchNormalization(),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(0.05),
        Conv2D(filters=256, kernel_size=3, strides=1, padding='same', activation='swish'),
        BatchNormalization(),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(hparams[HP_APL_DROPOUT]),
        GlobalMaxPool2D(),
        Flatten(),
        Dense(hparams[HP_NUM_UNITS], activation="swish"),
        Dense(128, activation='swish'),
        Dropout(hparams[HP_DROPOUT]),
        Dense(1, activation='sigmoid')
    ])
    model.compile(
        optimizer=hparams[HP_OPTIMIZER],
        loss='binary_crossentropy',
        metrics=['accuracy'],
    )
    datagen.fit(X_train)
    history = model.fit_generator(
        datagen.flow(X_train, Y_train, batch_size=batch_size),
        epochs=5,
        validation_data=datagen.flow(X_val, Y_val))
    _, accuracy = model.evaluate(X_val, Y_val)
    return accuracy
Running Model
def run(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)
        accuracy = train_test_model(hparams)
        tf.summary.scalar(METRIC_ACCURACY, accuracy, step=50)

session_num = 0
for dropout_apl_rate in (HP_APL_DROPOUT.domain.min_value, HP_APL_DROPOUT.domain.max_value):
    for num_units in HP_NUM_UNITS.domain.values:
        for dropout_rate in (HP_DROPOUT.domain.min_value, HP_DROPOUT.domain.max_value):
            for optimizer in HP_OPTIMIZER.domain.values:
                hparams = {
                    HP_APL_DROPOUT: dropout_apl_rate,
                    HP_NUM_UNITS: num_units,
                    HP_DROPOUT: dropout_rate,
                    HP_OPTIMIZER: optimizer,
                }
                run_name = "run-%d" % session_num
                print('--- Starting trial: %s' % run_name)
                print({h.name: hparams[h] for h in hparams})
                run('logs/hparam_tuning/' + run_name, hparams)
                session_num += 1
Error Log
--- Starting trial: run-0
{'dropout_apl': 0.05, 'num_units': 64, 'dropout': 0.25, 'optimizer': 'adam'}
C:\anaconda3\envs\sub_base_one\lib\site-packages\keras\engine\training.py:1972: UserWarning: `Model.fit_generator` is deprecated and will be removed in a future version. Please use `Model.fit`, which supports generators.
warnings.warn('`Model.fit_generator` is deprecated and '
Epoch 1/5
478/478 [==============================] - 57s 111ms/step - loss: -1034990.9375 - accuracy: 0.3468 - val_loss: -3408716.0000 - val_accuracy: 0.2863
Epoch 2/5
478/478 [==============================] - 52s 109ms/step - loss: -27187870.0000 - accuracy: 0.3779 - val_loss: -70205120.0000 - val_accuracy: 0.3440
Epoch 3/5
478/478 [==============================] - 53s 111ms/step - loss: -159927984.0000 - accuracy: 0.3481 - val_loss: -305764640.0000 - val_accuracy: 0.4137
Epoch 4/5
478/478 [==============================] - 55s 114ms/step - loss: -513765920.0000 - accuracy: 0.3257 - val_loss: -822113920.0000 - val_accuracy: 0.2790
Epoch 5/5
478/478 [==============================] - 54s 113ms/step - loss: -1208896128.0000 - accuracy: 0.3155 - val_loss: -1744491776.0000 - val_accuracy: 0.3015
60/60 [==============================] - 1s 17ms/step - loss: -1766485504.0000 - accuracy: 0.3639
--- Starting trial: run-1
{'dropout_apl': 0.05, 'num_units': 64, 'dropout': 0.25, 'optimizer': 'adamax'}
Epoch 1/5
478/478 [==============================] - 56s 115ms/step - loss: -34721.3828 - accuracy: 0.3015 - val_loss: -56985.7461 - val_accuracy: 0.2465
Epoch 2/5
478/478 [==============================] - 55s 115ms/step - loss: -642847.1875 - accuracy: 0.3482 - val_loss: -1573540.5000 - val_accuracy: 0.4442
Epoch 3/5
478/478 [==============================] - 55s 114ms/step - loss: -3433380.0000 - accuracy: 0.4208 - val_loss: -6417029.0000 - val_accuracy: 0.4373
Epoch 4/5
478/478 [==============================] - 55s 115ms/step - loss: -10932957.0000 - accuracy: 0.3973 - val_loss: -16847372.0000 - val_accuracy: 0.3382
Epoch 5/5
478/478 [==============================] - 56s 117ms/step - loss: -26557560.0000 - accuracy: 0.3720 - val_loss: -38307184.0000 - val_accuracy: 0.4483
60/60 [==============================] - 1s 17ms/step - loss: -39560612.0000 - accuracy: 0.4704
--- Starting trial: run-2
{'dropout_apl': 0.05, 'num_units': 64, 'dropout': 0.25, 'optimizer': 'sgd'}
Epoch 1/5
478/478 [==============================] - 56s 114ms/step - loss: nan - accuracy: 0.2530 - val_loss: nan - val_accuracy: 0.2533
Epoch 2/5
478/478 [==============================] - 54s 113ms/step - loss: nan - accuracy: 0.2532 - val_loss: nan - val_accuracy: 0.2533
Epoch 3/5
148/478 [========>.....................] - ETA: 33s - loss: nan - accuracy: 0.2544
I hope someone can help solve the problem and explain its cause. Thank you for helping.
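The large negative losses are what binary_crossentropy produces once integer labels go above 1, and the "(None, 3) vs (None, 1)" error is what appears when a multi-unit head is paired with that binary loss. A minimal sketch of a multi-class head and loss that keeps the integer labels from LabelEncoder, inside train_test_model (assuming num_classes should equal the number of category folders):

# Sketch: multi-class head with integer labels (no one-hot needed).
num_classes = len(categories)

# ... same convolutional stack as above, but replace the final layer with:
#     Dense(num_classes, activation='softmax')

model.compile(
    optimizer=hparams[HP_OPTIMIZER],
    loss='sparse_categorical_crossentropy',  # matches integer labels 0..num_classes-1
    metrics=['accuracy'],
)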
I want to predict 8-character license plates, so I wrote the model below in Keras:
x = Input(shape=(HEIGHT, WIDTH, CHANNELS))
base_model = InceptionV3(include_top=False, weights='imagenet', input_shape=(HEIGHT, WIDTH, CHANNELS))
base_model.trainable = False
y = base_model(x)
y = Reshape((8, 9 * 256))(y)
y = LSTM(units=20, return_sequences=True)(y)
y = Dropout(0.5)(y)
y = TimeDistributed(Dense(TOTAL_CHARS, activation="softmax", activity_regularizer=regularizers.l2(REGUL_PARAM)))(y)
y = Dropout(0.25)(y)
model = Model(inputs=x, outputs=y)
model.compile(loss="categorical_crossentropy", optimizer='rmsprop', metrics=['accuracy'])
I have about 6000 training samples, which I augment with ImageDataGenerator. My problem is that the loss and accuracy stay approximately constant over time:
************************************************************
Epoch: 1
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4525 - acc: 0.1924Epoch 00001: val_loss improved from 2.17175 to 2.15020, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4535 - acc: 0.1924 - val_loss: 2.1502 - val_acc: 0.2232
************************************************************
Epoch: 2
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6848/6869 [============================>.] - ETA: 0s - loss: 5.4543 - acc: 0.1959Epoch 00001: val_loss improved from 2.15020 to 2.11809, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 26s 4ms/step - loss: 5.4537 - acc: 0.1958 - val_loss: 2.1181 - val_acc: 0.2281
************************************************************
Epoch: 3
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4284 - acc: 0.1977Epoch 00001: val_loss improved from 2.11809 to 2.09679, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4282 - acc: 0.1978 - val_loss: 2.0968 - val_acc: 0.2304
************************************************************
Epoch: 4
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4500 - acc: 0.2004Epoch 00001: val_loss did not improve
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4490 - acc: 0.2004 - val_loss: 2.1146 - val_acc: 0.2355
************************************************************
Epoch: 5
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6848/6869 [============================>.] - ETA: 0s - loss: 5.4399 - acc: 0.2006Epoch 00001: val_loss did not improve
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4374 - acc: 0.2009 - val_loss: 2.1102 - val_acc: 0.2324
************************************************************
Epoch: 6
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4636 - acc: 0.1977Epoch 00001: val_loss improved from 2.09679 to 2.09076, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4629 - acc: 0.1978 - val_loss: 2.0908 - val_acc: 0.2341
************************************************************
Now, I am not sure about the correctness of my model, and I think the problem is the model itself. Is this the correct way to combine a CNN and an LSTM?
I have also tried the model below:
REGUL_PARAM = 0
image = Input(shape=(HEIGHT, WIDTH, CHANNELS))
x = Reshape((8, HEIGHT, int(WIDTH/8), CHANNELS))(image)
y = TimeDistributed(Conv2D(16, (3, 3), activation='relu', padding='same', activity_regularizer=regularizers.l2(REGUL_PARAM)))(x)
y = TimeDistributed(MaxPooling2D((2, 2)))(y)
y = TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same', activity_regularizer=regularizers.l2(REGUL_PARAM)))(y)
y = TimeDistributed(MaxPooling2D((2, 2)))(y)
y = TimeDistributed(Conv2D(64, (3, 3), activation='relu', padding='same', activity_regularizer=regularizers.l2(REGUL_PARAM)))(y)
y = Reshape((int(y.shape[1]), int(y.shape[4]*y.shape[3]*y.shape[2])))(y)
y = Bidirectional(LSTM(units=50, return_sequences=True))(y)
y = TimeDistributed(Dense(64, activity_regularizer=regularizers.l2(REGUL_PARAM), activation='relu'))(y)
y = Dropout(0.25)(y)
y = TimeDistributed(Dense(TOTAL_CHARS, activity_regularizer=regularizers.l2(REGUL_PARAM), activation='softmax'))(y)
y = Dropout(0.25)(y)
model = Model(inputs=image, outputs=y)
The accuracy for this one is about 70%, but the point is that I cannot overfit even a small portion of my data.
Apparently, your model doesn't work well.
You may take a look at this code.
'''Train a recurrent convolutional network on the IMDB sentiment
classification task.
Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.
'''
from __future__ import print_function
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
from keras.layers import Conv1D, MaxPooling1D
from keras.datasets import imdb
# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128
# Convolution
kernel_size = 5
filters = 64
pool_size = 4
# LSTM
lstm_output_size = 70
# Training
batch_size = 30
epochs = 2
'''
Note:
batch_size is highly sensitive.
Only 2 epochs are needed as the dataset is very small.
'''
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Conv1D(filters,
                 kernel_size,
                 padding='valid',
                 activation='relu',
                 strides=1))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(lstm_output_size))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print('Train...')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)