Unbelievable difference of custom metric between fit and evaluate - python

I run a tensorflow u-net model without dropout (but BN) with a custom metric called "average accuracy". This is literally the section of code. As you can see, datasets must be the same as I do nothing in between fit and evaluate.
model.fit(x=train_ds, epochs=epochs, validation_data=val_ds, shuffle=True,
callbacks=callbacks)
model.evaluate(train_ds)
model.evaluate(val_ds)
train_ds and val_ds are tf.Dataset. And here the output.
...
Epoch 10/10
148/148 [==============================] - 103s 698ms/step - loss: 0.1765 - accuracy: 0.5872 - average_accuracy: 0.9620 - val_loss: 0.5845 - val_accuracy: 0.5788 - val_average_accuracy: 0.5432
148/148 [==============================] - 22s 118ms/step - loss: 0.5056 - accuracy: 0.4540 - average_accuracy: 0.3654
29/29 [==============================] - 5s 122ms/step - loss: 0.5845 - accuracy: 0.5788 - average_accuracy: 0.5432
There is an unbelievable difference between average_accuracy during training (fit) and average_accuracy of evaluate (both on training dataset). I know that BN can have this effect and also that performance changes during training so they will never be equal. But from 96% to 36%?
My custom accuracy is defined here but I doubt it's my personal implementation as it should be somehow close no matter what I did (I think).
Any hint here is useful. I don't know if I should review the custom metric, the dataset, the model. It seems outside all of them.
I tried to continue training after stopping and average_accuracy starts from where it left at more than 90%.
Context of custom metric. I use it for semantic segmentation. So each image has an image of labels as output of WxHx4 (4 are my total number of classes).
It computes the average accuracy, for example, the accuracy of each class separately and then, if they were 4 classes it does sum(accuracies per class) / 4.
Here the main code:
def custom_average_accuracy(y_true, y_pred):
# Mask to remove the labels (y_true) that are zero: ex. [0, 0, 0]
remove_zeros_mask = tf.math.logical_not(tf.math.reduce_all(tf.math.logical_not(tf.cast(y_true, bool)), axis=-1))
y_true = tf.boolean_mask(y_true, remove_zeros_mask)
y_pred = tf.boolean_mask(y_pred, remove_zeros_mask)
num_cls = y_true.shape[-1]
y_pred = tf.math.argmax(y_pred, axis=-1) # ex. [0, 0, 1] -> [2]
y_true = tf.math.argmax(y_true, axis=-1)
accuracies = tf.TensorArray(tf.float32, size=0, dynamic_size=True)
for i in range(0, num_cls):
cls_mask = y_true == i
cls_y_true = tf.boolean_mask(y_true, cls_mask)
if not tf.equal(tf.size(cls_y_true), 0): # Some images don't have all the classes present.
new_acc = _accuracy(y_true=cls_y_true, y_pred=tf.boolean_mask(y_pred, cls_mask))
accuracies = accuracies.write(accuracies.size(), new_acc)
accuracies = accuracies.stack()
return tf.math.reduce_sum(accuracies) / tf.cast(len(accuracies), dtype=accuracies.dtype)
I believe the problem might be on the if not tf.equal(tf.size(cls_y_true), 0) line but I still can't seem were.
More wird information. This is exactly my lines of code:
x_input, y_true = np.concatenate([x for x, y in ds], axis=0), np.concatenate([y for x, y in ds], axis=0)
model.evaluate(x=x_input, y=y_true) # This gets 38% accuracy
model.evaluate(ds) # This gets 55% accuracy
What the hell is going on here? How can those lines of code give a different result?!?!
So now I have that if I don't do the ds = ds.shuffle() the example up (30ish vs 50ish ACC values) are Ok.

I tried to reproduce this behavior but could not find the discrepancies you noted. The only thing I changed was not tf.equal to tf.math.not_equal:
import pathlib
import tensorflow as tf
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)
num_classes = 5
batch_size = 32
img_height = 180
img_width = 180
val_ds = tf.keras.utils.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
train_ds = tf.keras.utils.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
def to_categorical(images, labels):
return images, tf.one_hot(labels, num_classes)
train_ds = train_ds.map(to_categorical)
val_ds = val_ds.map(to_categorical)
model = tf.keras.Sequential([
tf.keras.layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_classes)
])
def _accuracy(y_true, y_pred):
y_true.shape.assert_is_compatible_with(y_pred.shape)
if y_true.dtype != y_pred.dtype:
y_pred = tf.cast(y_pred, y_true.dtype)
reduced_sum = tf.reduce_sum(tf.cast(tf.math.equal(y_true, y_pred), tf.keras.backend.floatx()), axis=-1)
return tf.math.divide_no_nan(reduced_sum, tf.cast(tf.shape(y_pred)[-1], reduced_sum.dtype))
def custom_average_accuracy(y_true, y_pred):
# Mask to remove the labels (y_true) that are zero: ex. [0, 0, 0]
remove_zeros_mask = tf.math.logical_not(tf.math.reduce_all(tf.math.logical_not(tf.cast(y_true, bool)), axis=-1))
y_true = tf.boolean_mask(y_true, remove_zeros_mask)
y_pred = tf.boolean_mask(y_pred, remove_zeros_mask)
num_cls = y_true.shape[-1]
y_pred = tf.math.argmax(y_pred, axis=-1) # ex. [0, 0, 1] -> [2]
y_true = tf.math.argmax(y_true, axis=-1)
accuracies = tf.TensorArray(tf.float32, size=0, dynamic_size=True)
for i in range(0, num_cls):
cls_mask = y_true == i
cls_y_true = tf.boolean_mask(y_true, cls_mask)
if tf.math.not_equal(tf.size(cls_y_true), 0): # Some images don't have all the classes present.
new_acc = _accuracy(y_true=cls_y_true, y_pred=tf.boolean_mask(y_pred, cls_mask))
accuracies = accuracies.write(accuracies.size(), new_acc)
accuracies = accuracies.stack()
return tf.math.reduce_sum(accuracies) / tf.cast(len(accuracies), dtype=accuracies.dtype)
model.compile(optimizer='adam',
loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
metrics=['accuracy', custom_average_accuracy])
epochs=10
history = model.fit(
train_ds,
validation_data=val_ds,
epochs=epochs)
model.evaluate(train_ds)
model.evaluate(val_ds)
Found 3670 files belonging to 5 classes.
Using 734 files for validation.
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
Epoch 1/10
92/92 [==============================] - 11s 95ms/step - loss: 1.6220 - accuracy: 0.2868 - custom_average_accuracy: 0.2824 - val_loss: 1.2868 - val_accuracy: 0.4946 - val_custom_average_accuracy: 0.4597
Epoch 2/10
92/92 [==============================] - 8s 85ms/step - loss: 1.2131 - accuracy: 0.4785 - custom_average_accuracy: 0.4495 - val_loss: 1.2051 - val_accuracy: 0.4673 - val_custom_average_accuracy: 0.4350
Epoch 3/10
92/92 [==============================] - 8s 84ms/step - loss: 1.0713 - accuracy: 0.5620 - custom_average_accuracy: 0.5404 - val_loss: 1.1070 - val_accuracy: 0.5232 - val_custom_average_accuracy: 0.5003
Epoch 4/10
92/92 [==============================] - 8s 83ms/step - loss: 0.9463 - accuracy: 0.6281 - custom_average_accuracy: 0.6203 - val_loss: 0.9880 - val_accuracy: 0.5967 - val_custom_average_accuracy: 0.5755
Epoch 5/10
92/92 [==============================] - 8s 84ms/step - loss: 0.8400 - accuracy: 0.6771 - custom_average_accuracy: 0.6730 - val_loss: 0.9420 - val_accuracy: 0.6308 - val_custom_average_accuracy: 0.6245
Epoch 6/10
92/92 [==============================] - 8s 83ms/step - loss: 0.7594 - accuracy: 0.7027 - custom_average_accuracy: 0.7004 - val_loss: 0.8972 - val_accuracy: 0.6431 - val_custom_average_accuracy: 0.6328
Epoch 7/10
92/92 [==============================] - 8s 82ms/step - loss: 0.6211 - accuracy: 0.7619 - custom_average_accuracy: 0.7563 - val_loss: 0.8999 - val_accuracy: 0.6431 - val_custom_average_accuracy: 0.6174
Epoch 8/10
92/92 [==============================] - 8s 82ms/step - loss: 0.5108 - accuracy: 0.8116 - custom_average_accuracy: 0.8046 - val_loss: 0.8809 - val_accuracy: 0.6689 - val_custom_average_accuracy: 0.6457
Epoch 9/10
92/92 [==============================] - 8s 83ms/step - loss: 0.3985 - accuracy: 0.8535 - custom_average_accuracy: 0.8534 - val_loss: 0.9364 - val_accuracy: 0.6676 - val_custom_average_accuracy: 0.6539
Epoch 10/10
92/92 [==============================] - 8s 83ms/step - loss: 0.3023 - accuracy: 0.8995 - custom_average_accuracy: 0.9010 - val_loss: 1.0118 - val_accuracy: 0.6662 - val_custom_average_accuracy: 0.6405
92/92 [==============================] - 6s 62ms/step - loss: 0.2038 - accuracy: 0.9363 - custom_average_accuracy: 0.9357
23/23 [==============================] - 2s 50ms/step - loss: 1.0118 - accuracy: 0.6662 - custom_average_accuracy: 0.663

Well, I was using a TensorFlow dataset. I changed to NumPy and now all seems logical and works.
Still, I need to know the reason tf ds didn't work but at least I don't longer have these weird results.
Not tested yet (I would need to get the code back to what it was, probably do it someday) but this might be related.

Related

High diversity between val_accuracy and evaluation results

I'm working on a multiclass text classification problem.
After splitting the data to train and validation data frames, I've performed text augmentation to balance the data (only on the train data of course).
I've ended up with a balanced trained data and 44325 samples (of trained data).
Later on I've applied the "clean text" task for getting (i.e. stemming and stuff) on the trained data.
train['text'] = train['text'].apply(clean_text)
X_train = train.iloc[:, :-1]
y_train = train.iloc[:, -1:]
X_test = valid.iloc[:, :-1]
y_test = valid.iloc[:, -1:]
y_test = pd.DataFrame(y_test).reset_index(drop=True)
tokenizer = Tokenizer(num_words=vocab_size, oov_token='<OOV>')
tokenizer.fit_on_texts(X_train['text'])
train_seq = tokenizer.texts_to_sequences(X_train['text'])
train_padded = pad_sequences(train_seq, maxlen=max_length, padding=padding_type, truncating=trunc_type)
validation_seq = tokenizer.texts_to_sequences(X_test['text'])
validation_padded = pad_sequences(validation_seq, maxlen=max_length, padding=padding_type, truncating=trunc_type)
print('Shape of train data tensor:', train_padded.shape)
print('Shape of validation data tensor:', validation_padded.shape)
Output:
Shape of train data tensor: (44325, 200)
Shape of validation data tensor: (5466, 200)
Here's the encoding section:
encode = OneHotEncoder()
training_labels = encode.fit_transform(y_train)
validation_labels = encode.transform(y_test)
training_labels = training_labels.toarray()
validation_labels = validation_labels.toarray()
Model:
model = Sequential()
model.add(Embedding(vocab_size, embedding_dim, input_length=train_padded.shape[1]))
model.add(Conv1D(48, len(GROUPS), activation='relu', padding='valid'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(len(GROUPS), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
epochs = 100
batch_size = 32
history = model.fit(train_padded, training_labels, shuffle=True ,
epochs=epochs, batch_size=batch_size,
validation_split=0.2,
validation_data=(validation_padded, validation_labels),
callbacks=[ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.0001),
EarlyStopping(monitor='val_loss', mode='min', patience=2, verbose=1),
EarlyStopping(monitor='val_accuracy', mode='max', patience=5, verbose=1)])
Model output:
Epoch 1/100
1109/1109 [==============================] - 88s 79ms/step - loss: 1.2021 - accuracy: 0.5235 - val_loss: 0.8374 - val_accuracy: 0.7232
Epoch 2/100
1109/1109 [==============================] - 87s 79ms/step - loss: 0.9505 - accuracy: 0.6645 - val_loss: 0.7488 - val_accuracy: 0.7461
Epoch 3/100
1109/1109 [==============================] - 86s 77ms/step - loss: 0.8378 - accuracy: 0.7058 - val_loss: 0.6686 - val_accuracy: 0.7663
Epoch 4/100
1109/1109 [==============================] - 88s 79ms/step - loss: 0.7391 - accuracy: 0.7382 - val_loss: 0.6134 - val_accuracy: 0.7891
Epoch 5/100
1109/1109 [==============================] - 89s 80ms/step - loss: 0.6763 - accuracy: 0.7546 - val_loss: 0.5832 - val_accuracy: 0.7997
Epoch 6/100
1109/1109 [==============================] - 87s 79ms/step - loss: 0.6185 - accuracy: 0.7760 - val_loss: 0.5529 - val_accuracy: 0.8050
Epoch 7/100
1109/1109 [==============================] - 87s 79ms/step - loss: 0.5737 - accuracy: 0.7912 - val_loss: 0.5311 - val_accuracy: 0.8153
Epoch 8/100
1109/1109 [==============================] - 88s 80ms/step - loss: 0.5226 - accuracy: 0.8080 - val_loss: 0.5268 - val_accuracy: 0.8226
Epoch 9/100
1109/1109 [==============================] - 88s 79ms/step - loss: 0.4955 - accuracy: 0.8171 - val_loss: 0.5142 - val_accuracy: 0.8285
Epoch 10/100
1109/1109 [==============================] - 88s 80ms/step - loss: 0.4665 - accuracy: 0.8265 - val_loss: 0.5035 - val_accuracy: 0.8338
Epoch 11/100
1109/1109 [==============================] - 88s 79ms/step - loss: 0.4410 - accuracy: 0.8348 - val_loss: 0.5082 - val_accuracy: 0.8399
Epoch 12/100
1109/1109 [==============================] - 88s 80ms/step - loss: 0.4190 - accuracy: 0.8407 - val_loss: 0.5160 - val_accuracy: 0.8414
Epoch 00012: early stopping
... and to the last part I'm unsure of:
def evaluate_preds(y_true, y_preds):
"""
Performs evaluation comparison on y_true labels vs. y_pred labels
on a classification.
"""
accuracy = accuracy_score(y_true, y_preds)
precision = precision_score(y_true, y_preds, average='micro')
recall = recall_score(y_true, y_preds, average='micro')
f1 = f1_score(y_true, y_preds, average='micro')
metric_dict = {"accuracy": round(accuracy, 2),
"precision": round(precision, 2),
"recall": round(recall, 2),
"f1": round(f1, 2)}
print(f"Acc: {accuracy * 100:.2f}%")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 score: {f1:.2f}")
return metric_dict
predicted = model.predict(validation_padded)
evaluate_preds(np.argmax(validation_labels, axis=1), np.argmax(predicted, axis=1))
Output:
Acc: 40.16%
Precision: 0.40
Recall: 0.40
F1 score: 0.40
I can't understand what am I doing wrong.
How come accuracy of the last mentioned method is so low compare to val_accuracy?

Preprocessing for Transfer Learning Model with an Inception Network

I am trying to build an image classification model using an Inception Network as the base. This is a simple binary classification model.
My images are available in many smaller directories within one big directory. Each of them has its own 'image id' and that is how they have been named. In addition to this, I have a few tsv files which contain these image ids and the respective labels ('Positive' or 'Negative').
When I train the model, I see that my accuracy fluctuates without much progress. I was wondering if there is anything wrong with the way that I have prepared my dataset. I have written a few functions for this purpose.
Before I get to these functions, given below is how I have defined my model,
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(2, activation='sigmoid')(x)
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
layer.trainable = False
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
These are the functions that I have written in order to prepare my data,
def vectorize_img(img_path):
img = load_img(img_path, target_size=(224, 224)) #size is 224,224 by default
x = img_to_array(img) #change to np array
x = preprocess_input(x) #make input confirm to InceptionV3 input format
return x
def prepare_features(base_dir, limit):
features_dict = dict()
for dir1 in os.listdir(base_dir):
for dir2 in os.listdir(base_dir + dir1):
for file in os.listdir(base_dir + dir1 + '/' + dir2):
if len(features_dict) < limit:
try:
img_path = base_dir + dir1 + '/' + dir2 + '/' + file
x = vectorize_img(img_path)
name_id = file.split('.')[0] #take the file name and use as id in dict
features_dict[name_id] = x
except Exception as e:
print(e)
pass
return features_dict
def prepare_data(file_path, features_dict):
inputs = []
labels = []
df = pd.read_csv(file_path, sep='\t')
df = df[['image_id', 'label_text_image']]
df['class'] = df[['image_id', 'label_text_image']].apply(lambda x: 1 if x['label_text_image'] == 'Positive' else 0, axis = 1)
for index, row in df.iterrows():
try:
inputs.append(features_dict[row['image_id']])
labels.append(row['class'])
except:
pass
return np.asarray(inputs), tf.one_hot(np.asarray(labels), depth=2)
These functions are then called to prepare my dataset,
features_dict = prepare_features('/path/to/img/dir', 8000)
x_train, y_train = prepare_data('/path/to/train/tsv', features_dict)
x_dev, y_dev = prepare_data('/path/to/dev/tsv', features_dict)
x_test, y_test = prepare_data('/path/to/test/tsv', features_dict)
Finally, the model is trained,
EPOCHS = 50
BATCH_SIZE = 32
STEPS_PER_EPOCH = 1
history = model.fit(x=x_train, y=y_train, validation_data=(x_dev, y_dev), epochs=EPOCHS, steps_per_epoch=STEPS_PER_EPOCH, batch_size=BATCH_SIZE)
model.evaluate(x=x_test, y=y_test, batch_size=BATCH_SIZE)
Am I doing something wrong?
Here are the results that my model achieves,
Epoch 1/50
1/1 [==============================] - 158s 158s/step - loss: 0.8298 - accuracy: 0.5000 - val_loss: 0.7432 - val_accuracy: 0.5227
Epoch 2/50
1/1 [==============================] - 113s 113s/step - loss: 0.7775 - accuracy: 0.4688 - val_loss: 0.8225 - val_accuracy: 0.5153
Epoch 3/50
1/1 [==============================] - 113s 113s/step - loss: 0.7663 - accuracy: 0.5625 - val_loss: 0.8431 - val_accuracy: 0.5174
Epoch 4/50
1/1 [==============================] - 156s 156s/step - loss: 1.1292 - accuracy: 0.5312 - val_loss: 0.7763 - val_accuracy: 0.5227
Epoch 5/50
1/1 [==============================] - 114s 114s/step - loss: 0.7452 - accuracy: 0.5312 - val_loss: 0.7332 - val_accuracy: 0.5448
Epoch 6/50
1/1 [==============================] - 156s 156s/step - loss: 0.7884 - accuracy: 0.5312 - val_loss: 0.7072 - val_accuracy: 0.5606
Epoch 7/50
1/1 [==============================] - 114s 114s/step - loss: 0.7856 - accuracy: 0.5312 - val_loss: 0.7195 - val_accuracy: 0.5764
Epoch 8/50
1/1 [==============================] - 156s 156s/step - loss: 0.9203 - accuracy: 0.5312 - val_loss: 0.7348 - val_accuracy: 0.5616
Epoch 9/50
1/1 [==============================] - 156s 156s/step - loss: 0.8639 - accuracy: 0.4062 - val_loss: 0.7275 - val_accuracy: 0.5690
Epoch 10/50
1/1 [==============================] - 156s 156s/step - loss: 0.6170 - accuracy: 0.7188 - val_loss: 0.7125 - val_accuracy: 0.5880
Epoch 11/50
1/1 [==============================] - 156s 156s/step - loss: 0.5756 - accuracy: 0.7188 - val_loss: 0.6979 - val_accuracy: 0.6017
Epoch 12/50
1/1 [==============================] - 113s 113s/step - loss: 0.9976 - accuracy: 0.4375 - val_loss: 0.6834 - val_accuracy: 0.5933
Epoch 13/50
1/1 [==============================] - 156s 156s/step - loss: 0.7025 - accuracy: 0.5938 - val_loss: 0.6863 - val_accuracy: 0.5838
You mentioned that it is a binary classification hence labels are {0,1}. In this case your model output should either be
predictions = Dense(2, activation='softmax')(x)
with categorical labels [0,1] or [1,0]
or
predictions = Dense(1, activation='sigmoid')(x)
with binary label 1 or 0
but you are using output 2 with sigmoid i.e. predictions = Dense(2, activation='sigmoid')(x).

Model learns when using keras but doesn't with tf.keras

The model that I am using is this:
from keras.layers import (Input, MaxPooling1D, Dropout,
BatchNormalization, Activation, Add,
Flatten, Conv1D, Dense)
from keras.models import Model
import numpy as np
class ResidualUnit(object):
"""References
----------
.. [1] K. He, X. Zhang, S. Ren, and J. Sun, "Identity Mappings in Deep Residual Networks,"
arXiv:1603.05027 [cs], Mar. 2016. https://arxiv.org/pdf/1603.05027.pdf.
.. [2] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in 2016 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778. https://arxiv.org/pdf/1512.03385.pdf
"""
def __init__(self, n_samples_out, n_filters_out, kernel_initializer='he_normal',
dropout_rate=0.8, kernel_size=17, preactivation=True,
postactivation_bn=False, activation_function='relu'):
self.n_samples_out = n_samples_out
self.n_filters_out = n_filters_out
self.kernel_initializer = kernel_initializer
self.dropout_rate = dropout_rate
self.kernel_size = kernel_size
self.preactivation = preactivation
self.postactivation_bn = postactivation_bn
self.activation_function = activation_function
def _skip_connection(self, y, downsample, n_filters_in):
"""Implement skip connection."""
# Deal with downsampling
if downsample > 1:
y = MaxPooling1D(downsample, strides=downsample, padding='same')(y)
elif downsample == 1:
y = y
else:
raise ValueError("Number of samples should always decrease.")
# Deal with n_filters dimension increase
if n_filters_in != self.n_filters_out:
# This is one of the two alternatives presented in ResNet paper
# Other option is to just fill the matrix with zeros.
y = Conv1D(self.n_filters_out, 1, padding='same',
use_bias=False,
kernel_initializer=self.kernel_initializer
)(y)
return y
def _batch_norm_plus_activation(self, x):
if self.postactivation_bn:
x = Activation(self.activation_function)(x)
x = BatchNormalization(center=False, scale=False)(x)
else:
x = BatchNormalization()(x)
x = Activation(self.activation_function)(x)
return x
def __call__(self, inputs):
"""Residual unit."""
x, y = inputs
n_samples_in = y.shape[1]
downsample = n_samples_in // self.n_samples_out
n_filters_in = y.shape[2]
y = self._skip_connection(y, downsample, n_filters_in)
# 1st layer
x = Conv1D(self.n_filters_out, self.kernel_size, padding='same',
use_bias=False,
kernel_initializer=self.kernel_initializer
)(x)
x = self._batch_norm_plus_activation(x)
if self.dropout_rate > 0:
x = Dropout(self.dropout_rate)(x)
# 2nd layer
x = Conv1D(self.n_filters_out, self.kernel_size, strides=downsample,
padding='same', use_bias=False,
kernel_initializer=self.kernel_initializer
)(x)
if self.preactivation:
x = Add()([x, y]) # Sum skip connection and main connection
y = x
x = self._batch_norm_plus_activation(x)
if self.dropout_rate > 0:
x = Dropout(self.dropout_rate)(x)
else:
x = BatchNormalization()(x)
x = Add()([x, y]) # Sum skip connection and main connection
x = Activation(self.activation_function)(x)
if self.dropout_rate > 0:
x = Dropout(self.dropout_rate)(x)
y = x
return [x, y]
# ----- Model ----- #
kernel_size = 16
kernel_initializer = 'he_normal'
signal = Input(shape=(1000, 12), dtype=np.float32, name='signal')
age_range = Input(shape=(6,), dtype=np.float32, name='age_range')
is_male = Input(shape=(1,), dtype=np.float32, name='is_male')
x = signal
x = Conv1D(64, kernel_size, padding='same', use_bias=False,
kernel_initializer=kernel_initializer
)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x, y = ResidualUnit(512, 128, kernel_size=kernel_size,
kernel_initializer=kernel_initializer
)([x, x])
x, y = ResidualUnit(256, 196, kernel_size=kernel_size,
kernel_initializer=kernel_initializer
)([x, y])
x, y = ResidualUnit(64, 256, kernel_size=kernel_size,
kernel_initializer=kernel_initializer
)([x, y])
x, _ = ResidualUnit(16, 320, kernel_size=kernel_size, kernel_initializer=kernel_initializer
)([x, y])
x = Flatten()(x)
diagn = Dense(2, activation='sigmoid', kernel_initializer=kernel_initializer)(x)
model = Model(signal, diagn)
model.summary()
# ----- Train ----- #
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
loss = 'binary_crossentropy'
lr = 0.001
batch_size = 64
opt = Adam(learning_rate=0.001)
callbacks = [ReduceLROnPlateau(monitor='val_loss',
factor=0.1,
patience=7,
min_lr=lr / 100)]
model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
history = model.fit(x_train, y_train,
batch_size=batch_size,
epochs=70,
initial_epoch=0,
validation_split=0.1,
shuffle='batch',
callbacks=callbacks,
verbose=1)
# Save final result
model.save("./final_model_middle_one.hdf5")
When I substitute the use of Keras with tf.keras, which I need to use the qkeras library, the model doesn't learn and gets stuck at a much lower accuracy at every iteration. What could be causing this?
When I use keras the accuracy start high at 83% and slightly increases during training.
Train on 17340 samples, validate on 1927 samples
Epoch 1/70
17340/17340 [==============================] - 33s 2ms/step - loss: 0.3908 - accuracy: 0.8314 - val_loss: 0.3283 - val_accuracy: 0.8710
Epoch 2/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3641 - accuracy: 0.8416 - val_loss: 0.3340 - val_accuracy: 0.8612
Epoch 3/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3525 - accuracy: 0.8483 - val_loss: 0.3847 - val_accuracy: 0.8550
Epoch 4/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3354 - accuracy: 0.8563 - val_loss: 0.4641 - val_accuracy: 0.8215
Epoch 5/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3269 - accuracy: 0.8590 - val_loss: 0.7172 - val_accuracy: 0.7870
Epoch 6/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3202 - accuracy: 0.8630 - val_loss: 0.3599 - val_accuracy: 0.8617
Epoch 7/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3101 - accuracy: 0.8678 - val_loss: 0.2659 - val_accuracy: 0.8934
Epoch 8/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.3058 - accuracy: 0.8688 - val_loss: 0.5683 - val_accuracy: 0.8293
Epoch 9/70
17340/17340 [==============================] - 31s 2ms/step - loss: 0.2980 - accuracy: 0.8739 - val_loss: 0.3442 - val_accuracy: 0.8643
Epoch 10/70
7424/17340 [===========>..................] - ETA: 17s - loss: 0.2966 - accuracy: 0.8707
When I use tf.keras the accuracy starts at 50% and does not increase considerably during training:
Epoch 1/70
271/271 [==============================] - 30s 110ms/step - loss: 0.9325 - accuracy: 0.5093 - val_loss: 0.6973 - val_accuracy: 0.5470 - lr: 0.0010
Epoch 2/70
271/271 [==============================] - 29s 108ms/step - loss: 0.8424 - accuracy: 0.5157 - val_loss: 0.6660 - val_accuracy: 0.6528 - lr: 0.0010
Epoch 3/70
271/271 [==============================] - 29s 108ms/step - loss: 0.8066 - accuracy: 0.5213 - val_loss: 0.6441 - val_accuracy: 0.6539 - lr: 0.0010
Epoch 4/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7884 - accuracy: 0.5272 - val_loss: 0.6649 - val_accuracy: 0.6559 - lr: 0.0010
Epoch 5/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7888 - accuracy: 0.5368 - val_loss: 0.6899 - val_accuracy: 0.5760 - lr: 0.0010
Epoch 6/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7617 - accuracy: 0.5304 - val_loss: 0.6641 - val_accuracy: 0.6533 - lr: 0.0010
Epoch 7/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7485 - accuracy: 0.5333 - val_loss: 0.6450 - val_accuracy: 0.6544 - lr: 0.0010
Epoch 8/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7431 - accuracy: 0.5382 - val_loss: 0.6599 - val_accuracy: 0.6539 - lr: 0.0010
Epoch 9/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7336 - accuracy: 0.5421 - val_loss: 0.6532 - val_accuracy: 0.6554 - lr: 0.0010
Epoch 10/70
271/271 [==============================] - 29s 108ms/step - loss: 0.7274 - accuracy: 0.5379 - val_loss: 0.6753 - val_accuracy: 0.6492 - lr: 0.0010
The lines that have been changed between the two trials are the lines where I import keras modules by adding 'tensorflow.' in front of them. I don't know why the results would be so different, possibly due to different default values of certain parameters?
It might be related to how the accuracy metric is computed in keras vs tf.keras. As far as I can tell the accuracy function is usually used when you have one-hot-encoded output. However, it seems that you are outputting two values [A, B] with a sigmoid function applied to each value.
Since I don't know the labels you're using, there might be two cases:
a) You want to predict A or B. If sos I would change the activation function to softmax
b) You want to predict between A or not A and B or not B. In this case I would modify the output tensor shape to have two heads, each with two values: head_A = [A, not_A] and head_B = [B, not_B]. I would then hot-encode the labels respectively and then I would assume you could use the accuracy metric.
Alternatively, you can create a custom metric that is appropriate to your output shape.
I have a similar (same?) problem, I was manipulating some examples from Kaggle, and was unable to save the model using keras. After much Googling I realised that I needed to use tensorflow.keras. This solved my problem, but the 60000 data items I have and was using for training dropped to a reported 1875. Although the error was still 10%.
1875 * 32 = 60000.
This is my fit.
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, verbose=True,
callbacks=[early_stopping_monitor])
1539/1875 [=======================>......] - ETA: 3s - loss: 0.4445 - accuracy: 0.8418
It turns out that fit defaults to a batch size of 32. If I increase the batch size to 64 I get half the reported data sets, which makes sense:
model.fit(X_train, y_train, batch_size=64, validation_data=(X_test, y_test), epochs=epochs, verbose=True,
callbacks=[early_stopping_monitor])
938/938 [==============================] - 16s 17ms/step - loss: 0.4568 - accuracy: 0.8388
I noticed from your code that you've set batch_size to 64, and your reported data items reduce from 17340 to 271 which is about a 64th, this must also affect your accuracy due to the data you are using.
From the docs here: https://www.tensorflow.org/api_docs/python/tf/keras/Sequential
batch_size
Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of a dataset, generators, or keras.utils.Sequence instances (since they generate batches).
From the Keras docs: https://keras.rstudio.com/reference/fit.html, it also says that the batch size defaults to 32, it must just be reported differently when training the model.
Hope this helps.

Keras LSTM Model not learning

I wrote this code a few days ago and I had a few bugs but with some help, I was able to fix them. The Model is not learning. I tried different batch sizes, different amount of epochs, different activation functions, checked my data a few times for flaws I wasn't able to find any. It is due in a week or so for a school project. Any help will be very much valued.
Here is the code.
from keras.layers import Dense, Input, Concatenate, Dropout
from sklearn.preprocessing import MinMaxScaler
from keras.models import Model
from keras.layers import LSTM
import tensorflow as tf
import NetworkRequest as NR
import ParseNetworkRequest as PNR
import numpy as np
def buildModel():
_Price = Input(shape=(1, 1))
_Volume = Input(shape=(1, 1))
PriceLayer = LSTM(128)(_Price)
VolumeLayer = LSTM(128)(_Volume)
merged = Concatenate(axis=1)([PriceLayer, VolumeLayer])
Dropout(0.2)
dense1 = Dense(128, input_dim=2, activation='relu', use_bias=True)(merged)
Dropout(0.2)
dense2 = Dense(64, input_dim=2, activation='relu', use_bias=True)(dense1)
Dropout(0.2)
output = Dense(1, activation='softmax', use_bias=True)(dense2)
opt = tf.keras.optimizers.Adam(learning_rate=1e-3, decay=1e-6)
_Model = Model(inputs=[_Price, _Volume], output=output)
_Model.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
return _Model
if __name__ == '__main__':
api_key = "47BGPYJPFN4CEC20"
stock = "DJI"
Index = ['4. close', '5. volume']
RawData = NR.Initial_Network_Request(api_key, stock)
Closing = PNR.Parse_Network_Request(RawData, Index[0])
Volume = PNR.Parse_Network_Request(RawData, Index[1])
Length = len(Closing)
scalar = MinMaxScaler(feature_range=(0, 1))
Closing_scaled = scalar.fit_transform(np.reshape(Closing[:-1], (-1, 1)))
Volume_scaled = scalar.fit_transform(np.reshape(Volume[:-1], (-1, 1)))
Labels_scaled = scalar.fit_transform(np.reshape(Closing[1:], (-1, 1)))
Train_Closing = Closing_scaled[:int(0.9 * Length)]
Train_Closing = np.reshape(Train_Closing, (Train_Closing.shape[0], 1, 1))
Train_Volume = Volume_scaled[:int(0.9 * Length)]
Train_Volume = np.reshape(Train_Volume, (Train_Volume.shape[0], 1, 1))
Train_Labels = Labels_scaled[:int((0.9 * Length))]
Train_Labels = np.reshape(Train_Labels, (Train_Labels.shape[0], 1))
# -------------------------------------------------------------------------------------------#
Test_Closing = Closing_scaled[int(0.9 * Length):(Length - 1)]
Test_Closing = np.reshape(Test_Closing, (Test_Closing.shape[0], 1, 1))
Test_Volume = Volume_scaled[int(0.9 * Length):(Length - 1)]
Test_Volume = np.reshape(Test_Volume, (Test_Volume.shape[0], 1, 1))
Test_Labels = Labels_scaled[int(0.9 * Length):(Length - 1)]
Test_Labels = np.reshape(Test_Labels, (Test_Labels.shape[0], 1))
Predict_Closing = Closing_scaled[-1]
Predict_Closing = np.reshape(Predict_Closing, (Predict_Closing.shape[0], 1, 1))
Predict_Volume = Volume_scaled[-1]
Predict_Volume = np.reshape(Predict_Volume, (Predict_Volume.shape[0], 1, 1))
Predict_Label = Labels_scaled[-1]
Predict_Label = np.reshape(Predict_Label, (Predict_Label.shape[0], 1))
model = buildModel()
model.fit(
[
Train_Closing,
Train_Volume
],
[
Train_Labels
],
validation_data=(
[
Test_Closing,
Test_Volume
],
[
Test_Labels
]
),
epochs=10,
batch_size=Length
)
This is the output when I run it.
Using TensorFlow backend.
2020-01-01 16:31:47.905012: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199985000 Hz
2020-01-01 16:31:47.906105: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x49214f0 executing computations on platform Host. Devices:
2020-01-01 16:31:47.906137: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
/home/martin/PycharmProjects/MarketPredictor/Model.py:26: UserWarning: Update your `Model` call to the Keras 2 API: `Model(inputs=[<tf.Tenso..., outputs=Tensor("de...)`
_Model = Model(inputs=[_Price, _Volume], output=output)
Train on 4527 samples, validate on 503 samples
Epoch 1/10
4527/4527 [==============================] - 1s 179us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 2/10
4527/4527 [==============================] - 0s 41us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 3/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 4/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 5/10
4527/4527 [==============================] - 0s 43us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 6/10
4527/4527 [==============================] - 0s 39us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 7/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 8/10
4527/4527 [==============================] - 0s 39us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 9/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 10/10
4527/4527 [==============================] - 0s 38us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Process finished with exit code 0
The loss is high, and the accuracy is 0.
Please help.
You're using activation functions and metrics made for a classification task, not a stock forecasting task (with a continuous target).
For continuous targets, your final activation layer should be linear. Metrics should be mse or mae, not accuracy.
accuracy would only be satisfied is the dji prediction is exactly equal to the actual price. Since dji has at least 7 digits, it's nearly impossible.
Here's my suggestion:
Use a simpler network: Not sure how big is your dataset, but sometimes using dense. layer isn't helpful. Looks like the weights of there intermediate layers are not changing at all. Try the model with just one dense layer.
Reduce dropout: Try with using one dropout layer with Dropout(0.1).
Adam defaults: Start with using adam optimizer with its default parameters.
Metric selection: As mentioned by Nicolas's answer, use a regression metric instead of accuracy.

Neural network in keras not converging

I'm building a simple Neural network in Keras, like the following:
# create model
model = Sequential()
model.add(Dense(1000, input_dim=x_train.shape[1], activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='mean_squared_error', metrics=['accuracy'], optimizer='RMSprop')
# Fit the model
model.fit(x_train, y_train, epochs=20, batch_size=700, verbose=2)
# evaluate the model
scores = model.evaluate(x_test, y_test, verbose=0)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
The shape of the used data is:
x_train = (49972, 601)
y_train = (49972, 1)
My problem is that the network is not converging, the accuracy is fixed on 0.0168, like below:
Epoch 1/20
- 1s - loss: 3.2222 - acc: 0.0174
Epoch 2/20
- 1s - loss: 3.1757 - acc: 0.0187
Epoch 3/20
- 1s - loss: 3.1731 - acc: 0.0212
Epoch 4/20
- 1s - loss: 3.1721 - acc: 0.0220
Epoch 5/20
- 1s - loss: 3.1716 - acc: 0.0225
Epoch 6/20
- 1s - loss: 3.1711 - acc: 0.0235
Epoch 7/20
- 1s - loss: 3.1698 - acc: 0.0245
Epoch 8/20
- 1s - loss: 3.1690 - acc: 0.0251
Epoch 9/20
- 1s - loss: 3.1686 - acc: 0.0257
Epoch 10/20
- 1s - loss: 3.1679 - acc: 0.0261
Epoch 11/20
- 1s - loss: 3.1674 - acc: 0.0267
Epoch 12/20
- 1s - loss: 3.1667 - acc: 0.0277
Epoch 13/20
- 1s - loss: 3.1656 - acc: 0.0285
Epoch 14/20
- 1s - loss: 3.1653 - acc: 0.0288
Epoch 15/20
- 1s - loss: 3.1653 - acc: 0.0291
I used Sklearn library to build the same structure with the same data, and it works perfectly, shown me an accuracy higher than 0.5:
model = Pipeline([
('classifier', MLPClassifier(hidden_layer_sizes=(1000), activation='relu',
max_iter=20, verbose=2, batch_size=700, random_state=0))
])
I'm totally sure that I used the same data for both models, and this is how I prepare it:
def load_data():
le = preprocessing.LabelEncoder()
with open('_DATA_train.txt', 'rb') as fp:
train = pickle.load(fp)
with open('_DATA_test.txt', 'rb') as fp:
test = pickle.load(fp)
x_train = train[:,0:(train.shape[1]-1)]
y_train = train[:,(train.shape[1]-1)]
y_train = le.fit_transform(y_train).reshape([-1,1])
x_test = test[:,0:(test.shape[1]-1)]
y_test = test[:,(test.shape[1]-1)]
y_test = le.fit_transform(y_test).reshape([-1,1])
print(x_train.shape, ' ' , y_train.shape)
print(x_test.shape, ' ' , y_test.shape)
return x_train, y_train, x_test, y_test
What is the problem with the Keras structure?
Edited:
it's a multi-class classification problem: y_training [0 ,1, 2, 3]
For a multiclass problem your labels should be one hot encoded. For example if the options are [0 ,1, 2, 3] and the label is 1 then it should be [0, 1, 0, 0].
Your final layer should be a dense layer with 4 units and an activation of softmax.
model.add(Dense(4, activation='softmax'))
And your loss should be categorical_crossentropy
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='RMSprop')

Categories

Resources