I am building a multi-class classifier with Keras 2.02 (with the TensorFlow backend), and I do not know how to calculate precision and recall in Keras. Please help me.
The Python package keras-metrics could be useful for this (I'm the package's author).
import keras
import keras_metrics

model = keras.models.Sequential()
model.add(keras.layers.Dense(1, activation="sigmoid", input_dim=2))
model.add(keras.layers.Dense(1, activation="softmax"))

model.compile(optimizer="sgd",
              loss="binary_crossentropy",
              metrics=[keras_metrics.precision(), keras_metrics.recall()])
UPDATE: Starting with Keras version 2.3.0, metrics such as precision and recall are provided in the library distribution.
The usage is the following:
model.compile(optimizer="sgd",
loss="binary_crossentropy",
metrics=[keras.metrics.Precision(), keras.metrics.Recall()])
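For a multi-class softmax model with one-hot labels, these built-in metrics can also be restricted to a single class via their class_id argument. A minimal sketch, assuming a recent tf.keras (the layer sizes and the choice of class 0 are illustrative, not from the original answer):
from tensorflow import keras

# Toy 3-class softmax classifier with one-hot labels.
model = keras.models.Sequential([
    keras.layers.Dense(16, activation="relu", input_dim=2),
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              # Precision/recall of class 0 only; omit class_id for binary tasks.
              metrics=[keras.metrics.Precision(class_id=0),
                       keras.metrics.Recall(class_id=0)])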
As of Keras 2.0, precision and recall were removed from the master branch, so you will have to implement them yourself. Follow this guide to create custom metrics: here.
The precision and recall equations can be found here.
Or reuse the code from Keras before it was removed: here.
These metrics were removed because they were computed batch-wise, so the reported values may or may not be correct.
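To make the batch-wise issue concrete, here is a small sketch (the numbers are made up purely for illustration) showing that averaging per-batch precision generally differs from precision computed over the whole epoch:
import numpy as np
from sklearn.metrics import precision_score

# Two batches with very different numbers of predicted positives.
y_true_batches = [np.array([1, 1, 1, 1]), np.array([1, 0, 0, 0])]
y_pred_batches = [np.array([1, 1, 1, 1]), np.array([0, 1, 0, 0])]

# Mean of per-batch precisions (what a batch-wise metric reports).
per_batch = [precision_score(t, p) for t, p in zip(y_true_batches, y_pred_batches)]
print(np.mean(per_batch))               # 0.5

# Precision over all samples at once (the value you usually want).
y_true = np.concatenate(y_true_batches)
y_pred = np.concatenate(y_pred_batches)
print(precision_score(y_true, y_pred))  # 0.8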
My answer is based on a comment in a Keras GitHub issue. It calculates validation precision and recall at every epoch for a one-hot-encoded classification task. Also, please look at this SO answer to see how the same thing can be done with keras.backend functionality.
import keras
import numpy as np
from keras.optimizers import SGD
from sklearn.metrics import precision_score, recall_score

model = keras.models.Sequential()
# ...
sgd = SGD(lr=0.001, momentum=0.9)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])


class Metrics(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self._data = []

    def on_epoch_end(self, epoch, logs={}):
        X_val, y_val = self.validation_data[0], self.validation_data[1]
        y_predict = np.asarray(model.predict(X_val))

        # Collapse one-hot targets and predictions to class indices.
        y_val = np.argmax(y_val, axis=1)
        y_predict = np.argmax(y_predict, axis=1)

        self._data.append({
            # For more than two classes, pass an average argument
            # (e.g. average='macro') to these scorers.
            'val_recall': recall_score(y_val, y_predict),
            'val_precision': precision_score(y_val, y_predict),
        })
        return

    def get_data(self):
        return self._data


metrics = Metrics()
history = model.fit(X_train, y_train, epochs=100, validation_data=(X_val, y_val), callbacks=[metrics])
metrics.get_data()
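If it helps, the per-epoch values collected by the callback drop straight into a pandas DataFrame for inspection or plotting; a small optional sketch (pandas is an extra dependency not used in the answer above):
import pandas as pd

# One row per epoch, with columns 'val_recall' and 'val_precision'.
df = pd.DataFrame(metrics.get_data())
print(df.tail())
df.plot(title='Validation precision / recall per epoch')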
This thread is a little stale, but just in case it helps someone landing here: if you are willing to upgrade to Keras v2.1.6, there has been a lot of work on getting stateful metrics to work, though more work seems to be in progress (https://github.com/keras-team/keras/pull/9446).
Anyway, I found that the best way to integrate precision/recall was to use a custom metric that subclasses Layer, shown by example in BinaryTruePositives.
For recall, this would look like:
import keras
from keras import backend as K


class Recall(keras.layers.Layer):
    """Stateful metric to count the total recall over all batches.

    Assumes predictions and targets of shape `(samples, 1)`.

    # Arguments
        name: String, name for the metric.
    """

    def __init__(self, name='recall', **kwargs):
        super(Recall, self).__init__(name=name, **kwargs)
        self.stateful = True
        self.recall = K.variable(value=0.0, dtype='float32')
        self.true_positives = K.variable(value=0, dtype='int32')
        self.false_negatives = K.variable(value=0, dtype='int32')

    def reset_states(self):
        K.set_value(self.recall, 0.0)
        K.set_value(self.true_positives, 0)
        K.set_value(self.false_negatives, 0)

    def __call__(self, y_true, y_pred):
        """Updates the running true-positive and false-negative counts.

        # Arguments
            y_true: Tensor, batch-wise labels
            y_pred: Tensor, batch-wise predictions

        # Returns
            The recall over all batches seen so far this epoch.
        """
        y_true = K.cast(y_true, 'int32')
        y_pred = K.cast(K.round(y_pred), 'int32')

        # False negatives: label is 1 but the rounded prediction is 0.
        false_neg = K.cast(K.sum(K.cast(K.greater(y_true, y_pred), 'int32')), 'int32')
        self.add_update(K.update_add(self.false_negatives, false_neg),
                        inputs=[y_true, y_pred])

        # True positives: prediction matches the label and the label is 1.
        correct_preds = K.cast(K.equal(y_pred, y_true), 'int32')
        true_pos = K.cast(K.sum(correct_preds * y_true), 'int32')
        self.add_update(K.update_add(self.true_positives, true_pos),
                        inputs=[y_true, y_pred])

        # Combine: recall = TP / (TP + FN)
        recall = (K.cast(self.true_positives, 'float32') /
                  (K.cast(self.true_positives, 'float32') +
                   K.cast(self.false_negatives, 'float32') +
                   K.cast(K.epsilon(), 'float32')))
        self.add_update(K.update(self.recall, recall),
                        inputs=[y_true, y_pred])

        return recall
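A hedged usage sketch (the architecture below is illustrative, assuming a binary classifier with a single sigmoid output): the stateful metric is passed to compile like any other metric, accumulates across batches within an epoch, and is reset by reset_states at epoch boundaries.
model = keras.models.Sequential()
model.add(keras.layers.Dense(16, activation='relu', input_dim=10))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='sgd',
              loss='binary_crossentropy',
              metrics=['accuracy', Recall()])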
Use the scikit-learn framework for this.
import numpy as np
from sklearn.metrics import classification_report
history = model.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1, validation_data=(x_test, y_test), shuffle=True)
pred = model.predict(x_test, batch_size=32, verbose=1)
predicted = np.argmax(pred, axis=1)
report = classification_report(np.argmax(y_test, axis=1), predicted)
print(report)
This blog is very useful.
I am trying to integrate a Keras deep neural network as a classifier within code for sequential backward feature selection in Python. (Originally, I tried to wrap the Keras deep neural network within SciKeras to use it with scikit-learn's built-in sequential feature selection models, but I kept getting error messages.)
I found this from-scratch code for sequential backward feature selection (taken from https://vitalflux.com/sequential-backward-feature-selection-python-example/), and I have been trying to integrate a Keras model into the code to replace the "estimator" within the function, but I keep getting this error: ValueError: Input 0 of layer "sequential_410" is incompatible with the layer: expected shape=(None, 45), found shape=(None, 44)
Here is the code that I have so far for the sequential backward feature selection and the deep neural network:
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import Adam
from scikeras.wrappers import KerasClassifier, KerasRegressor
# SBS (sequential backward feature selection) from scratch
#=====================================================
from sklearn.metrics import accuracy_score
from itertools import combinations
from sklearn.base import clone
class SequentialBackwardSearch():
    '''
    Instantiate with Estimator and given number of features
    '''
    def __init__(self, estimator, k_features):
        self.estimator = clone(estimator)
        self.k_features = k_features

    '''
    X_train - Training data Pandas dataframe
    X_test - Test data Pandas dataframe
    y_train - Training label Pandas dataframe
    y_test - Test data Pandas dataframe
    '''
    def fit(self, X_train, X_test, y_train, y_test):
        dim = X_train.shape[1]
        self.indices_ = tuple(range(dim))
        self.subsets_ = [self.indices_]
        score = self._calc_score(X_train.values, X_test.values,
                                 y_train.values, y_test.values, self.indices_)
        self.scores_ = [score]
        '''
        Iterate through all the dimensions until k_features is reached
        At the end of loop, dimension count is reduced by 1
        '''
        while dim > k_features:
            scores = []
            subsets = []
            '''
            Iterate through different combinations of features, train the model,
            record the score
            '''
            for p in combinations(self.indices_, r=dim - 1):
                score = self._calc_score(X_train.values, X_test.values, y_train.values, y_test.values, p)
                scores.append(score)
                subsets.append(p)
            #
            # Get the index of best score
            #
            best_score_index = np.argmax(scores)
            #
            # Record the best score
            #
            self.scores_.append(scores[best_score_index])
            #
            # Get the indices of features which gave best score
            #
            self.indices_ = subsets[best_score_index]
            #
            # Record the indices of features for best score
            #
            self.subsets_.append(self.indices_)
            dim -= 1  # Dimension is reduced by 1

    '''
    Transform training, test data set to the data set
    having features which gave best score
    '''
    def transform(self, X):
        return X.values[:, self.indices_]

    '''
    Train models with specific set of features
    indices - indices of features
    '''
    def _calc_score(self, X_train, X_test, y_train, y_test, indices):
        self.estimator.fit(X_train[:, indices], y_train.ravel())
        y_pred = self.estimator.predict(X_test[:, indices])
        score = accuracy_score(y_test, y_pred)
        return score
# ===============================================
# Keras deep neural network
def dnn():
    model = keras.Sequential([
        layers.Dense(20, activation='relu', input_shape = (X_train.shape[1])),
        layers.Dropout(0.3),
        layers.Dense(20, activation='relu'),
        layers.Dropout(0.3),
        layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['binary_accuracy']
    )
    early_stopping = keras.callbacks.EarlyStopping(
        patience=5,
        min_delta=0.001,
        restore_best_weights=True,
    )
    history = model.fit(
        X_train, y_train,
        validation_data=(X_test, y_test),
        batch_size=512,
        callbacks=[early_stopping],
    )
    history_df = pd.DataFrame(history.history)
    print("Minimum Validation Loss: {:0.4f}".format(history_df['val_loss'].min()))
    history_df.loc[:, ['loss', 'val_loss']].plot(title="Cross-entropy")
    history_df.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot(title="Accuracy")
    return model


keras_clf = KerasClassifier(dnn,
                            epochs=5,
                            verbose=False)
keras_clf._estimator_type = "classifier"
And this is the code I have for trying to integrate them together:
k_features = 5
#
# Instantiate SequentialBackwardSearch
#
sbs = SequentialBackwardSearch(keras_clf, k_features)
#
# Fit the data to determine the k_features which give the
# most optimal model performance
#
sbs.fit(X_train, X_test, y_train, y_test)
#
# Transform the training data set to dataset having k_features
# giving most optimal model performance
#
X_train_kfeatures = sbs.transform(X_train)
#
# Transform the test data set to dataset having k_features
#
X_test_kfeatures = sbs.transform(X_test)
sbs.indices_
X_train.columns[[sbs.indices_]] # sbs is an instance of SequentialBackwardSearch class
I am wondering whether this is even possible (integrating a neural network into the existing code for sequential backward feature selection), or whether there's anything I can do to get it to run and output the top 5 features from the training dataset. I have tried to address the error message by altering the input shape of the neural network, but I believe it is correct (45 features). Any help or advice would be welcome!
This should work with SciKeras!
I had to clean up your code and fix some bugs. I first did a sanity check using scikit-learn's MLPClassifier, then I ran it against an MLPClassifier-style model built with Keras. Details may differ for more complex model architectures, but this shows that it does work.
import numpy as np
# SBS (sequential backward feature selection) from scratch
#=====================================================
from sklearn.metrics import accuracy_score
from itertools import combinations
from sklearn.base import clone
class SequentialBackwardSearch:
    '''
    Instantiate with Estimator and given number of features
    '''
    def __init__(self, estimator, k_features):
        self.estimator = clone(estimator)
        self.k_features = k_features

    '''
    X_train - Training data Pandas dataframe
    X_test - Test data Pandas dataframe
    y_train - Training label Pandas dataframe
    y_test - Test data Pandas dataframe
    '''
    def fit(self, X_train, X_test, y_train, y_test):
        dim = X_train.shape[1]
        self.indices_ = tuple(range(dim))
        self.subsets_ = [self.indices_]
        score = self._calc_score(X_train, X_test,
                                 y_train, y_test, self.indices_)
        self.scores_ = [score]
        '''
        Iterate through all the dimensions until k_features is reached
        At the end of loop, dimension count is reduced by 1
        '''
        while dim > self.k_features:
            scores = []
            subsets = []
            '''
            Iterate through different combinations of features, train the model,
            record the score
            '''
            for p in combinations(self.indices_, r=dim - 1):
                score = self._calc_score(X_train, X_test, y_train, y_test, p)
                scores.append(score)
                subsets.append(p)
            #
            # Get the index of best score
            #
            best_score_index = np.argmax(scores)
            #
            # Record the best score
            #
            self.scores_.append(scores[best_score_index])
            #
            # Get the indices of features which gave best score
            #
            self.indices_ = subsets[best_score_index]
            #
            # Record the indices of features for best score
            #
            self.subsets_.append(self.indices_)
            dim -= 1  # Dimension is reduced by 1

    '''
    Transform training, test data set to the data set
    having features which gave best score
    '''
    def transform(self, X):
        return X.values[:, self.indices_]

    '''
    Train models with specific set of features
    indices - indices of features
    '''
    def _calc_score(self, X_train, X_test, y_train, y_test, indices):
        self.estimator.fit(X_train[:, indices], y_train.ravel())
        y_pred = self.estimator.predict(X_test[:, indices])
        score = accuracy_score(y_test, y_pred)
        return score
# Sklearn MLPClassifier
from sklearn.neural_network import MLPClassifier
estimator = MLPClassifier()
search = SequentialBackwardSearch(estimator, 1)
X = np.random.randint(0, 2, size=(100, 5))
y = X[:, -1]
search.fit(X, X, y, y)
assert list(search.indices_) == [4]
# SciKeras MLPClassifier
# see https://www.adriangb.com/scikeras/stable/notebooks/MLPClassifier_MLPRegressor.html
import tensorflow.keras as keras
from scikeras.wrappers import KerasClassifier
class KerasMLPClassifier(KerasClassifier):
    def __init__(
        self,
        hidden_layer_sizes=(100, ),
        optimizer="adam",
        optimizer__learning_rate=0.001,
        epochs=200,
        verbose=0,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.hidden_layer_sizes = hidden_layer_sizes
        self.optimizer = optimizer
        self.epochs = epochs
        self.verbose = verbose

    def _keras_build_fn(self, compile_kwargs):
        model = keras.Sequential()
        inp = keras.layers.Input(shape=(self.n_features_in_,))
        model.add(inp)
        for hidden_layer_size in self.hidden_layer_sizes:
            layer = keras.layers.Dense(hidden_layer_size, activation="relu")
            model.add(layer)
        if self.target_type_ == "binary":
            n_output_units = 1
            output_activation = "sigmoid"
            loss = "binary_crossentropy"
        elif self.target_type_ == "multiclass":
            n_output_units = self.n_classes_
            output_activation = "softmax"
            loss = "sparse_categorical_crossentropy"
        else:
            raise NotImplementedError(f"Unsupported task type: {self.target_type_}")
        out = keras.layers.Dense(n_output_units, activation=output_activation)
        model.add(out)
        model.compile(loss=loss, optimizer=compile_kwargs["optimizer"])
        return model
estimator2 = KerasMLPClassifier()
search2 = SequentialBackwardSearch(estimator2, 1)
search2.fit(X, X, y, y)
assert list(search2.indices_) == [4]
Notebook version (can't promise this will be around forever): https://colab.research.google.com/drive/1EWxT3GWZsqhftz4f7W5GsXNe_SPtva4H#scrollTo=chU7wLn1BTU1
I came across a weird difference between the Keras model.fit() and sklearn model.fit() functions. When model.fit() is called inside a loop, I get inconsistent predictions using a Keras Sequential model. This is not the case when using an sklearn model. See the sample code below to reproduce the phenomenon.
from numpy.random import seed
seed(1337)
import tensorflow as tf
tf.random.set_seed(1337)
from sklearn.linear_model import LogisticRegression
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import InputLayer
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler
import numpy as np
def get_sequential_dnn(NUM_COLS, NUM_ROWS):
    # code for model
    ...

if __name__ == "__main__":
    input_size = 10
    X, y = make_blobs(n_samples=100, centers=2, n_features=input_size,
                      random_state=1)
    scalar = MinMaxScaler()
    scalar.fit(X)
    X = scalar.transform(X)

    model = get_sequential_dnn(X.shape[1], X.shape[0])
    # print(model.summary())
    # model = LogisticRegression()

    for i in range(2):
        model.fit(X, y, epochs=100, verbose=0, shuffle=False)
        # model.fit(X, y)

        Xnew, _ = make_blobs(n_samples=3, centers=2, n_features=10, random_state=1)
        Xnew = scalar.transform(Xnew)
        # make a prediction
        # ynew = model.predict_proba(Xnew)[:, 1]
        ynew = model.predict_proba(Xnew)
        ynew = np.array(ynew)
        # show the inputs and predicted outputs
        print('--------------')
        for i in range(len(Xnew)):
            print("X=%s \n Predicted=%s" % (Xnew[i], ynew[i]))
The output of this is
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=[0.9931685]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=[0.35249507]
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=[0.35249507]
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=[1.]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=[0.17942095]
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=[0.17942095]
Whereas if I use a logistic regression (un-comment the commented lines), the predictions are consistent:
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=0.038716408758471876
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=0.038716408758471876
I get that the obvious solution is to fit the model before the loop, and presumably there is strong randomness in how Keras models fit the data to the labels, but there are cases where you need a loop to get prediction scores, for example when performing 10-fold cross-validation to get AUC, sensitivity, and specificity values on training data. In these situations this randomness is unacceptable.
What is causing this inconsistency, and what is the solution to it?
There are a couple of issues with the way you are trying to get reproducible results with Keras.
1. You are calling fit (when i==1) on the already-fitted model (from i==0), so the optimizer sees a different set of initial weights each time and you end up with two different models. Solution: get a fresh model every time. This is not the case with sklearn, which starts with freshly initialized weights every time fit is called.
2. model.fit internally uses the current state of the random number generator. You seeded it outside the loop, so the state is different when fit is called the second time. Solution: seed inside the loop.
Sample code with issue
# Issue 2 here
tf.random.set_seed(1337)

def get_model():
    model = Sequential()
    model.add(Dense(4, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

X = np.random.randn(10,8)
y = np.random.randn(10,1)

# Issue 1 here
model = get_model()
results = []
for i in range(10):
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))
As you can see, the assert fails.
Corrected code
results = []
for i in range(10):
    tf.random.set_seed(1337)
    model = get_model()
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))
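In addition to re-seeding and rebuilding the model, it can help to clear the Keras global state between iterations so that leftover graph state and layer name counters do not accumulate. A minimal sketch of that pattern, reusing get_model, X, and y from above (the seed value is just the one used earlier):
import tensorflow as tf

results = []
for i in range(10):
    # Drop any global Keras state left over from the previous iteration.
    tf.keras.backend.clear_session()
    tf.random.set_seed(1337)
    model = get_model()          # fresh, identically initialized model
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))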
I'm using TensorFlow to train a network to predict the third item in a list of numbers.
When I train, the network appears to train quite well and does well on both the training and test sets. However, when I evaluate its performance myself, it seems to be doing quite poorly.
For example, at the end of training, TensorFlow says that the validation loss is 2.1 x 10^(-5). However, when I compute it myself, I get 0.17. What am I doing wrong?
Here's code that can be run on Google Colab:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
def create_dataset(k=5, n=2, example_amount=200):
    '''Create a dataset of numbers where the goal is to always output the nth number'''
    # UPGRADE: this could be done better with numpy to just generate all the examples at once
    example_amount = 1000
    x = []
    y = []
    ans = [x, y]
    for i in range(example_amount):
        example_x = np.random.rand(k)
        example_y = example_x[n]
        x.append(example_x)
        y.append(example_y)
    return ans


def tensorize(tensor_like) -> tf.Tensor:
    '''Turn stuff into tensors'''
    return tf.convert_to_tensor(tensor_like, dtype=tf.float32)


def split_dataset(dataset, train_split=0.8, random_state=42):
    '''
    Takes in a list (or tuple) where index 0 contains the inputs and index 1 contains the outputs
    outputs x_train, x_test, y_train, y_test, train_indexes, test_indexes all as tf.Tensor
    '''
    indices = np.arange(len(dataset[0]))
    return tuple([tensorize(data) for data in train_test_split(dataset[0], dataset[1], indices, train_size=train_split, random_state=random_state)])
# how many numbers in each example
K = 5
# the index of the solution
N = 2
# how many examples
EXAMPLE_AMOUNT = 20000
# what percentage of the examples are in the training set
TRAIN_SPLIT = 0.5
# how long to train for
epochs = 50
dataset = create_dataset(K, N, EXAMPLE_AMOUNT)
x_train, x_test, y_train, y_test, train_indexes, test_indexes = split_dataset(dataset, train_split=TRAIN_SPLIT)
model_input = tf.keras.layers.Input(shape=(K,), name="input")
model_dense1 = tf.keras.layers.Dense(10, name="dense1")(model_input)
model_dense2 = tf.keras.layers.Dense(10, name="dense2")(model_dense1)
model_output = tf.keras.layers.Dense(1, name="output")(model_dense2)
model = tf.keras.Model(inputs=model_input, outputs=model_output)
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
history = model.fit(x=x_train, y=y_train, validation_data=(x_test, y_test), epochs=epochs)
# the validation loss as Tensorflow computes it
print(history.history["val_loss"][-1]) # 2.1036579710198566e-05
# the validation loss as I compute it
val_loss = tf.math.reduce_mean(tf.keras.losses.MSE(y_test, model.predict(x_test))).numpy()
print(val_loss) # 0.1655631
What you're missing is the shape of y_test:
y_test.numpy().shape
(500,)  <-- this is what causes the behaviour
Simply reshape it like this:
val_loss = tf.math.reduce_mean(tf.keras.losses.MSE(y_test.numpy().reshape(-1,1), model.predict(x_test))).numpy()
print(val_loss) # 1.1548506e-05
Also:
history.history["val_loss"][-1] # 1.1548506336112041e-05
Or you can flatten() both arrays while calculating it:
val_loss = tf.math.reduce_mean(tf.keras.losses.MSE(y_test.numpy().flatten(), model.predict(x_test).flatten())).numpy()
print(val_loss) # 1.1548506e-05
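The reason the shapes matter is broadcasting: with y_test of shape (500,) and predictions of shape (500, 1), the element-wise difference inside MSE broadcasts to a (500, 500) matrix, so you average the error between every label and every prediction instead of matching pairs. A small standalone sketch (the array size is chosen only for illustration):
import numpy as np

y_true = np.arange(4, dtype=np.float32)         # shape (4,)
y_pred = y_true.reshape(-1, 1)                  # shape (4, 1), same values

# Mismatched shapes broadcast to (4, 4): every label vs every prediction.
print((y_true - y_pred).shape)                  # (4, 4)
print(np.mean((y_true - y_pred) ** 2))          # 2.5, not 0

# Matching shapes compare element-wise, as intended.
print(np.mean((y_true - y_pred.ravel()) ** 2))  # 0.0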
I have been working with binary sequential inputs and outputs using TensorFlow 2.0, and I've been wondering which approach TensorFlow uses to compute metrics such as recall or accuracy during training in those scenarios.
Each sample to my network consists of 60 timesteps, each with 300 features, so my expected output is a (60, 1) array of 1s and 0s. Suppose I have 2000 validation samples. When evaluating the validation set for each epoch, does TensorFlow concatenate all 2000 samples into a single (2000*60=120000, 1) array and then compare it to the concatenated ground-truth labels, or does it evaluate each of the (60, 1) outputs individually and then return the mean of those values? Is there any way to modify this behavior?
TensorFlow/Keras by default computes the metrics batch-wise for the training data, while it computes the same metrics on ALL the data passed in the validation_data parameter of the fit method.
This means that the metric printed during fitting for the training data is the mean of that score calculated over all the batches. In other words, for the training set Keras evaluates each batch individually and then returns the mean of those values. For the validation data it is different: Keras takes all the validation samples and compares them with the "concatenated" ground-truth labels.
To demonstrate this behavior with code, I propose a dummy example. I provide a custom callback that computes the accuracy score on ALL the data passed, at the end of each epoch (for train and, optionally, validation). This is useful for understanding the behavior of TensorFlow during training.
import numpy as np
from sklearn.metrics import accuracy_score
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.callbacks import *
class ACC_custom(tf.keras.callbacks.Callback):
    def __init__(self, train, validation=None):
        super(ACC_custom, self).__init__()
        self.validation = validation
        self.train = train

    def on_epoch_end(self, epoch, logs={}):
        logs['ACC_score_train'] = float('-inf')
        X_train, y_train = self.train[0], self.train[1]
        y_pred = (self.model.predict(X_train).ravel() > 0.5) + 0
        score = accuracy_score(y_train.ravel(), y_pred)

        if (self.validation):
            logs['ACC_score_val'] = float('-inf')
            X_valid, y_valid = self.validation[0], self.validation[1]
            y_val_pred = (self.model.predict(X_valid).ravel() > 0.5) + 0
            val_score = accuracy_score(y_valid.ravel(), y_val_pred)
            logs['ACC_score_train'] = np.round(score, 5)
            logs['ACC_score_val'] = np.round(val_score, 5)
        else:
            logs['ACC_score_train'] = np.round(score, 5)
Create dummy data:
x_train = np.random.uniform(0,1, (1000,60,10))
y_train = np.random.randint(0,2, (1000,60,1))
x_val = np.random.uniform(0,1, (500,60,10))
y_val = np.random.randint(0,2, (500,60,1))
Fit the model:
inp = Input(shape=((60,10)), dtype='float32')
x = Dense(32, activation='relu')(inp)
out = Dense(1, activation='sigmoid')(x)
model = Model(inp, out)
es = EarlyStopping(patience=10, verbose=1, min_delta=0.001,
                   monitor='ACC_score_val', mode='max', restore_best_weights=True)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, verbose=2,
                    callbacks=[ACC_custom(train=(x_train,y_train), validation=(x_val,y_val)), es],
                    validation_data=(x_val,y_val))
In the graphs below I compare the accuracies computed by our callback with the accuracies computed by Keras:
import matplotlib.pyplot as plt

plt.figure()
plt.plot(history.history['ACC_score_train'], label='accuracy_callback_train')
plt.plot(history.history['accuracy'], label='accuracy_default_train')
plt.legend(); plt.title('train accuracy')

plt.figure()
plt.plot(history.history['ACC_score_val'], label='accuracy_callback_valid')
plt.plot(history.history['val_accuracy'], label='accuracy_default_valid')
plt.legend(); plt.title('validation accuracy')
As we can see, the accuracy on the train data (first plot) differs between the default method and our callback. This means that the train accuracy is calculated batch-wise.
The validation accuracy (second plot) calculated by our callback and by the default method is the same! This means that the score on the validation data is computed in one shot.
I am trying to define a custom metric in Keras that takes into account sample weights. When fitting the model I use the sample weights as follows:
training_history = model.fit(
    train_data,
    train_labels,
    sample_weight=train_weights,
    epochs=num_epochs,
    batch_size=128,
    validation_data=(validation_data, validation_labels, validation_weights),
)
An example of a custom metric I am using is the AUC (area under the ROC curve), which I defined as follows:
from keras import backend as K
import tensorflow as tf

def auc(true_labels, predictions, weights=None):
    auc = tf.metrics.auc(true_labels, predictions, weights=weights)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc
and I use this metric when compiling the model:
model.compile(
    optimizer=optimizer,
    loss='binary_crossentropy',
    metrics=['accuracy', auc]
)
But as far as I can tell, the metric does not take the sample weights into account. In fact, I verified this by comparing the metric value I see during training using the custom metric defined above with what I get by computing it myself from the model output and the sample weights, and the results are indeed very different. How would I define the auc metric shown above so that it takes the sample weights into account?
You could wrap your metric with another function that takes sample_weights as an argument:
def auc(weights):
    def metric(true_labels, predictions):
        auc = tf.metrics.auc(true_labels, predictions, weights=weights)[1]
        K.get_session().run(tf.local_variables_initializer())
        return auc
    return metric
And then define an extra input placeholder that will receive the sample weights:
sample_weights = Input(shape=(1,))
Your model can then be compiled as follows:
model.compile(
    optimizer=optimizer,
    loss='binary_crossentropy',
    metrics=['accuracy', auc(sample_weights)]
)
NOTE: Not tested.
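As a side note (not part of the original answer, and depending on your Keras/TF version): newer tf.keras releases let you skip the wrapper entirely by listing the metric under weighted_metrics, in which case the sample_weight passed to fit is applied to the metric as well. A hedged sketch, assuming TF 2.x and reusing the variable names from the question above:
import tensorflow as tf

model.compile(
    optimizer=optimizer,
    loss='binary_crossentropy',
    metrics=['accuracy'],
    # Metrics listed here are evaluated with the sample_weight passed to fit().
    weighted_metrics=[tf.keras.metrics.AUC(name='weighted_auc')],
)
model.fit(train_data, train_labels, sample_weight=train_weights, epochs=num_epochs)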