I am training a Transformer model for time series classification. To check the results, I am using a baseline model that uses the previous target as the next prediction. I am using a data generator to handle the data. The dataset is imbalanced, so I am using sample_weights to deal with this; the data generator therefore outputs three variables: inputs, labels, and sample_weights.
I have tried setting the sample_weights to all 1's in order to test that things are working. The baseline model produces identical results for the weighted accuracy and the regular accuracy, which is expected. For the Transformer, however, I am seeing completely different values for the weighted accuracy and the regular accuracy, even though the sample_weights are all 1's. Since the sample weights are all 1's, I would expect the weighted accuracy to equal the regular accuracy. Why are these different?
It looks as though the regular metric is normalized to 1 while the weighted metric is normalized to 100, but why would these differ in this case?
Code:
Function from the data generator class to get sample weights:
def get_sample_weights(self, inputs, labels):
    '''Obtains sample weights for any number of classes.
    NOTE: sample_weights assign a weighting to each label.
    '''
    # start with all sample weights equal to 1
    sample_weights = tf.ones_like(labels, dtype=tf.float64)
    # get the classes and the count of each one
    class_counts = np.bincount(self.train_df.price_change)
    total = class_counts.sum()
    n_classes = len(class_counts)
    weights = tf.constant([1, 1, 1], dtype=tf.float64)
    for idx, count in enumerate(class_counts):
        # compute weight
        # weight = total / (n_classes * count)
        weight = weights[idx]
        # update the weight wherever the label equals this class index
        sample_weights = tf.where(tf.equal(labels, float(idx)),
                                  weight,
                                  sample_weights)
    return inputs, labels, sample_weights
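For context, a three-element batch like this is what Keras expects for per-sample weighting. A minimal sketch of how such a mapping is typically applied with tf.data (the toy dataset below is made up, not the question's generator class):

import numpy as np
import tensorflow as tf

# hypothetical stand-in for the generator: a dataset of (inputs, labels) batches
ds = tf.data.Dataset.from_tensor_slices(
    (np.random.random((100, 5)).astype("float32"),
     np.random.randint(3, size=(100,)).astype("float64"))
).batch(32)

def add_sample_weights(inputs, labels):
    # all-ones weights, matching the sanity check described in the question
    return inputs, labels, tf.ones_like(labels, dtype=tf.float64)

# Keras treats the third element of each batch as sample_weight
weighted_ds = ds.map(add_sample_weights)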
Get baseline results:
baseline = Baseline(label_index=single_gen.column_indices['price_change'])
baseline.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                 metrics=['accuracy'],
                 weighted_metrics=['accuracy'])
train_metrics = baseline.evaluate(single_gen.train)
val_metrics = baseline.evaluate(single_gen.valid)
Get results with the Transformer:
transformer_model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                          optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                          metrics=['sparse_categorical_accuracy'],
                          weighted_metrics=['sparse_categorical_accuracy'])
history = transformer_model.fit(aapl_gen.train,
                                epochs=2,
                                validation_data=aapl_gen.valid)
train_metrics = transformer_model.evaluate(data_gen.train)
val_metrics = transformer_model.evaluate(data_gen.valid)
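As a minimal sanity check of the expectation stated above, a metric and its weighted counterpart should agree when all sample weights are 1. The sketch below is mine (a made-up toy model and data, not the question's Baseline or Transformer) and only illustrates that comparison:

import numpy as np
import tensorflow as tf

x = np.random.random((64, 8)).astype("float32")
y = np.random.randint(3, size=(64,))
w = np.ones_like(y, dtype="float32")  # unit sample weights

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy",
              metrics=["sparse_categorical_accuracy"],
              weighted_metrics=["sparse_categorical_accuracy"])

# with w all equal to 1, the plain and weighted accuracies should match
print(model.evaluate(x, y, sample_weight=w, verbose=0))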
Related
I want to set up a Keras model (TensorFlow backend) for a multi-class classification problem with 4 different classes. I have both labeled and unlabeled data.
I have worked out the case in which I only train with the labeled data, and my model looks something like this:
# create model
inputs = keras.Input(shape=(len(config.variables), ))
X = layers.Dense(units=200, activation="relu")(inputs)
output = layers.Dense(units=4, activation="softmax", name="output")(X)
model = keras.Model(inputs=inputs, outputs=output)
model.compile(optimizer=optimizers.Adam(1e-4), loss=loss_function, metrics=["accuracy"])

# train model
model.fit(
    x=train_data,
    y=train_class_labels,
    batch_size=200,
    epochs=200,
    verbose=2,
    validation_split=0.2,
    sample_weight=class_weights
)
I have working models with two different losses, namely categorical_crossentropy and sparse_categorical_crossentropy. Depending on the loss function, my train_class_labels were either in one-hot representation (e.g. [[0,1,0,0], [0,0,0,1], ...]) or in integer representation (e.g. [0,0,2,1,0,3, ...]), and everything worked fine. class_weights is some weight vector (e.g. [0.78, 1.34, ...]).
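As an aside, a minimal sketch of the two label representations and how to convert between them (the four-class toy labels below are made up):

import numpy as np
from tensorflow import keras

# integer labels for 4 classes, as used with sparse_categorical_crossentropy
int_labels = np.array([0, 0, 2, 1, 0, 3])

# one-hot labels, as used with categorical_crossentropy
one_hot_labels = keras.utils.to_categorical(int_labels, num_classes=4)

# going back from one-hot to integer labels
recovered = np.argmax(one_hot_labels, axis=1)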
Now, for my further plans, I need to include the unlabeled data in the training process, but it needs to be ignored by the loss function.
What I have tried:
Setting the labels of the unlabeled data to [0,0,0,0] when using categorical_crossentropy as the loss, because I thought my unlabeled data would then be ignored by the loss function. Somehow this changed the predictions after training.
I also tried setting the weights of the unlabeled data to 0, but that did not have an effect either.
I concluded that I need to somehow mark my unlabeled data and customize my loss function so that it can be told to ignore those samples. Something like:
def custom_loss(y_true, y_pred):
    if y_true == labeled data:
        return normal loss function
    if y_true == unlabeled data:
        return 0
These are some snippets that I have found, but they do not seem to work:
def custom_loss(y_true, y_pred):
    loss = losses.sparse_categorical_crossentropy(y_true, y_pred)
    return K.switch(K.flatten(K.equal(y_true, -1)), K.zeros_like(loss), loss)

def custom_loss2(y_true, y_pred):
    idx = tf.not_equal(y_true, -1)
    y_true = tf.boolean_mask(y_true, idx)
    y_pred = tf.boolean_mask(y_pred, idx)
    return losses.sparse_categorical_crossentropy(y_true, y_pred)
In these examples I set the labels of the unlabeled data to -1, so train_class_labels would look something like this: [0,-1,2,0,3, ...].
But when using the first loss function I just get NaNs, and when using the second one I get the following error:
Invalid argument: logits and labels must have the same first dimension, got logits shape [1,5000] and labels shape [5000]
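For reference, one way such a masked loss could be written is sketched below. This is my own illustration under the same convention of marking unlabeled samples with -1; it is not taken from the question or the answer, and the helper name is made up:

import tensorflow as tf

def masked_sparse_categorical_crossentropy(y_true, y_pred):
    # y_true: integer labels with -1 marking unlabeled samples
    # y_pred: softmax output of shape (batch, n_classes)
    y_true = tf.cast(tf.reshape(y_true, [-1]), tf.int32)
    mask = tf.not_equal(y_true, -1)
    # replace -1 with a valid dummy label so the crossentropy op does not fail,
    # then zero out the contribution of those samples using the mask
    safe_labels = tf.where(mask, y_true, tf.zeros_like(y_true))
    per_sample = tf.keras.losses.sparse_categorical_crossentropy(safe_labels, y_pred)
    per_sample = per_sample * tf.cast(mask, per_sample.dtype)
    # average over the labeled samples only
    n_labeled = tf.maximum(tf.reduce_sum(tf.cast(mask, per_sample.dtype)), 1.0)
    return tf.reduce_sum(per_sample) / n_labeled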
I think that setting the labels to [0,0,0,0] would be just fine, because the loss is calculated as the sum of the log losses of your instances per class (in your case the loss would be 0 for instances with no label).
I don't understand why you are inserting unlabeled data into your training in a supervised setting.
I think that the differences you obtain are due to the batch size and the gradient step. If there are instances that do not contribute to the gradient descent, the computed loss will be different than before, and then you get the difference in the predictions.
Basically, there would be fewer informative instances per batch.
If you use the size of the whole dataset as the batch size, there would be no difference from a previous training without the unlabeled instances (but only when that previous training also used batch size = size of the dataset).
I am a chemist and still learning ML...
I have trained 7 different models with Keras, using different types of molecular fingerprints as features to predict a property; however, the accuracy was not that good.
So, using a tutorial I found online:
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.metrics import (accuracy_score, auc, confusion_matrix,
                             f1_score, roc_curve)
# n_members, loss_function and normalize are defined elsewhere in the tutorial

def optimized_weights(prd, y_fold):
    # define bounds on each weight
    bound_w = [(0.0, 1.0) for _ in range(n_members)]
    # arguments to the loss function
    search_arg = (prd, y_fold)
    # global optimization of ensemble weights
    result = differential_evolution(loss_function, bound_w, search_arg,
                                    maxiter=2000, tol=0.0001)
    # get the chosen weights
    weights = normalize(result['x'])
    return weights

def weighted_accuracy(prd, weights, y_fold):
    # weighted average of the member predictions
    summed = np.tensordot(prd, weights, axes=((0), (0)))
    yhat = np.round(summed)
    score = accuracy_score(y_fold, yhat)
    f1 = f1_score(y_fold, yhat)
    fpr, tpr, thresholds = roc_curve(y_fold, summed, pos_label=1)
    auc_test = auc(fpr, tpr)
    conf_matrix = confusion_matrix(y_fold, yhat)
    total = sum(sum(conf_matrix))
    sensitivity = conf_matrix[0, 0] / (conf_matrix[0, 0] + conf_matrix[0, 1])
    specificity = conf_matrix[1, 1] / (conf_matrix[1, 0] + conf_matrix[1, 1])
    return score, auc_test, sensitivity, specificity, f1
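A rough sketch of how these two pieces fit together (the array shapes and variable names here are my assumptions, not from the original code):

# prd: stacked member predictions with shape (n_members, n_samples)
# y_fold: labels of the 20% hold-out split used to fit the ensemble weights
weights = optimized_weights(prd, y_fold)
score, auc_test, sensitivity, specificity, f1 = weighted_accuracy(prd, weights, y_fold)
print(f"weighted-ensemble accuracy: {score:.3f}, AUC: {auc_test:.3f}")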
For the weighted average ensemble model, I trained the models on 80% of the data, and the remaining 20% was used to find the optimized weights with differential_evolution (from SciPy) for maximum accuracy, but I think this accuracy is biased toward the test data...
I also repeated the same process with 5-fold cross validation and determined the average accuracy...
Is this acceptable?
If not, then please tell me what I can do.
Thanks
DeepStack offers an interface for stacking and "ensembling" Keras models. It also offers performance tests based on validation data out of the box.
I am trying to build a neural network in TensorFlow where the cost of a Type I error (false positive) is higher than that of a Type II error (false negative). Is there a way to impose this during the training process (i.e. by inputting a cost matrix)? This is possible with simple models like logistic regression in scikit-learn by specifying the class_weight parameter.
cw = {0: 3, 1: 1}
clf = LogisticRegression(class_weight=cw)
In this case, incorrectly predicting a 0 is 3x more costly than incorrectly predicting a 1. However, I have not been able to do the same with a neural network, so I want to see whether it is possible in TensorFlow.
Thanks
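As a side note before the answer below: Keras models also accept a class_weight dictionary directly in fit, analogous to the scikit-learn parameter. A minimal sketch, with a made-up toy model and data:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

x = np.random.random((64, 10))
y = np.random.randint(2, size=(64,))

# samples whose true class is 0 contribute 3x as much to the loss
model.fit(x, y, epochs=1, class_weight={0: 3.0, 1: 1.0})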
You could use tf.nn.weighted_cross_entropy_with_logits and its pos_weight argument.
This argument weights the positive class, as described by the documentation (in TF 2.0 at least):
A value pos_weight > 1 decreases the false negative count, hence increasing the recall.
Conversely, setting pos_weight < 1 decreases the false positive count and increases the precision.
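For intuition, the quantity this op computes per sample is roughly the following (a naive restatement; the op itself uses a numerically stable formulation):

import tensorflow as tf

def weighted_bce_reference(labels, logits, pos_weight):
    # pos_weight scales only the loss term of the positive labels
    p = tf.sigmoid(logits)
    return -(pos_weight * labels * tf.math.log(p) + (1.0 - labels) * tf.math.log(1.0 - p))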
In your case, you could create a custom loss function like this:
import tensorflow as tf

# pass the output logits from your network, not the values after sigmoid activation
class WeightedBinaryCrossEntropy:
    def __init__(self, positive_weight: float):
        self.positive_weight = positive_weight

    def __call__(self, targets, logits, sample_weight=None):
        return tf.nn.weighted_cross_entropy_with_logits(
            targets, logits, pos_weight=self.positive_weight
        )
And create a custom neural network that uses it, for example with tf.keras (samples are weighted as they were in your question):
import numpy as np

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(32, input_shape=(10,)),
        tf.keras.layers.Activation("relu"),
        tf.keras.layers.Dense(10),
        tf.keras.layers.Activation("relu"),
        # Output one logit for binary classification
        tf.keras.layers.Dense(1),
    ]
)

# Example random data; the labels are cast to float so they match the logits' dtype
data = np.random.random((32, 10))
targets = np.random.randint(2, size=(32, 1)).astype(np.float32)

# 3 times as costly to make type I error
model.compile(optimizer="rmsprop", loss=WeightedBinaryCrossEntropy(positive_weight=3))
model.fit(data, targets, batch_size=32)
You can use a logarithmic scale. For a 0 incorrectly predicted as 1, y - ŷ = -1 and the expression below evaluates to 1.71. For a 1 incorrectly predicted as 0, y - ŷ = 1 and it evaluates to 0.63. For y == ŷ it equals 0. That makes a 0 incorrectly predicted as 1 almost three times more costly.
import numpy as np
from math import exp

loss = abs(1 - exp(-np.log(exp(y - ŷ))))
# abs(1-exp(-np.log(exp(0))))
# Out[53]: 0.0
# abs(1-exp(-np.log(exp(-1))))
# Out[54]: 1.718281828459045
# abs(1-exp(-np.log(exp(1))))
# Out[55]: 0.6321205588285577
Then you will have a convex optimization. Implementing it as a Keras loss (using backend ops so it works on tensors):

import keras.backend as K

def custom_loss(y_true, y_pred):
    return K.mean(K.abs(1 - K.exp(-K.log(K.exp(y_true - y_pred)))))

Then:

model.compile(loss=custom_loss, optimizer=sgd, metrics=['accuracy'])
I am using TensorFlow Hub for an image retraining classification task. The TensorFlow script retrain.py calculates cross_entropy and accuracy by default.

train_accuracy, cross_entropy_value = sess.run(
    [evaluation_step, cross_entropy],
    feed_dict={bottleneck_input: train_bottlenecks,
               ground_truth_input: train_ground_truth})

I would like to get the F1 score, precision, recall and confusion matrix. How could I get these values using this script?
Below I include a method to calculate the desired metrics using the scikit-learn package.
You can calculate F1 score, precision and recall using the precision_recall_fscore_support method, and the confusion matrix using the confusion_matrix method:

from sklearn.metrics import precision_recall_fscore_support, confusion_matrix

Both methods take two 1D array-like objects which store the ground-truth and predicted labels respectively.
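For example (toy labels, just to show the call signatures):

from sklearn.metrics import precision_recall_fscore_support, confusion_matrix

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average='micro')
cm = confusion_matrix(y_true, y_pred)
print(precision, recall, f1)
print(cm)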
In the code provided, the ground-truth labels for the training data are stored in the train_ground_truth variable, which is defined in lines 1054 and 1060, while validation_ground_truth stores the ground-truth labels for the validation data and is defined in line 1087.
The tensor that calculates the predicted class labels is defined and returned by the add_evaluation_step function. You can modify line 1034 in order to capture that tensor object:
evaluation_step, prediction = add_evaluation_step(final_tensor, ground_truth_input)
# now prediction stores the tensor object that
# calculates predicted class labels
Now you can update line 1076 in order to evaluate prediction when calling sess.run():
train_accuracy, cross_entropy_value, train_predictions = sess.run(
    [evaluation_step, cross_entropy, prediction],
    feed_dict={bottleneck_input: train_bottlenecks,
               ground_truth_input: train_ground_truth})
# train_predictions now stores class labels predicted by the model

# calculate precision, recall and F1 score
(train_precision,
 train_recall,
 train_f1_score, _) = precision_recall_fscore_support(y_true=train_ground_truth,
                                                      y_pred=train_predictions,
                                                      average='micro')

# calculate confusion matrix
train_confusion_matrix = confusion_matrix(y_true=train_ground_truth,
                                          y_pred=train_predictions)
Similarly, you can compute metrics for the validation subset by modifying line 1095:
validation_summary, validation_accuracy, validation_predictions = sess.run(
    [merged, evaluation_step, prediction],
    feed_dict={bottleneck_input: validation_bottlenecks,
               ground_truth_input: validation_ground_truth})
# validation_predictions now stores class labels predicted by the model

# calculate precision, recall and F1 score
(validation_precision,
 validation_recall,
 validation_f1_score, _) = precision_recall_fscore_support(y_true=validation_ground_truth,
                                                           y_pred=validation_predictions,
                                                           average='micro')

# calculate confusion matrix
validation_confusion_matrix = confusion_matrix(y_true=validation_ground_truth,
                                               y_pred=validation_predictions)
Finally, the code calls run_final_eval to evaluate the trained model on the test data. In this function, prediction and test_ground_truth are already defined, so you only need to add code to calculate the required metrics:
test_accuracy, predictions = eval_session.run(
    [evaluation_step, prediction],
    feed_dict={
        bottleneck_input: test_bottlenecks,
        ground_truth_input: test_ground_truth
    })

# calculate precision, recall and F1 score
(test_precision,
 test_recall,
 test_f1_score, _) = precision_recall_fscore_support(y_true=test_ground_truth,
                                                     y_pred=predictions,
                                                     average='micro')

# calculate confusion matrix
test_confusion_matrix = confusion_matrix(y_true=test_ground_truth,
                                         y_pred=predictions)
Note that the provided code calculates global F1 scores by setting average='micro'. The different averaging methods supported by the scikit-learn package are described in its User Guide.
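A small illustration of how the averaging choice changes the result (toy labels, same API as above):

from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]

# 'micro': metrics computed globally over all predictions
print(precision_recall_fscore_support(y_true, y_pred, average='micro'))
# 'macro': metrics computed per class, then averaged without class weighting
print(precision_recall_fscore_support(y_true, y_pred, average='macro'))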
I am relatively new to machine learning, especially when it comes to implementing algorithms. I am using Python and the TensorFlow library to implement a neural network to train on a dataset which has about 20 classes. I am able to train and get predictions successfully, but I have a question:
Is it possible to get the top k classes along with their probabilities using TensorFlow, instead of just a single prediction?
If it is possible, how can this be done? Thanks for your guidance.
Update 01:
I am adding the code of what I am doing. I built a neural network with 3 hidden layers using tanh, sigmoid, and sigmoid respectively as their activation functions, and softmax for the output layer. The code for training and prediction is as follows:
y_pred = None
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        # run one training epoch
        _, cost = sess.run([optimizer, cost_function],
                           feed_dict={X: tr_features, Y: tr_labels})
        cost_history = np.append(cost_history, cost)
    # predict results based on the trained model
    y_pred = sess.run(tf.argmax(y_, 1), feed_dict={X: ts_features})
Right now y_pred is a list of class labels, one for each test example in ts_features. But instead of getting a single class label per test example, I am hoping to get the top-k predictions for each example, each of the k predictions accompanied by some kind of probability.
Using tf.nn.top_k():
top_k_values, top_k_indices = tf.nn.top_k(predictions, k=k)
If predictions is a vector of probabilities per class (i.e. predictions[i] = prediction probability for class i), then top_k_values will contain the k highest probabilities in predictions, and top_k_indices will contain the indices of these probabilities, i.e. the corresponding classes.
Supposing that in your code, y_ is the vector of predicted probabilities per class:
k = 3  # replace with your value

# Instead of `y_pred`:
y_k_probs, y_k_pred = sess.run(
    tf.nn.top_k(y_, k=k), feed_dict={X: ts_features})
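To make the output concrete, a small follow-up of my own (assuming y_ holds softmax probabilities, as described above):

# y_k_probs has shape (n_test_examples, k): the k highest probabilities per example
# y_k_pred has shape (n_test_examples, k): the corresponding class indices
for probs, classes in zip(y_k_probs[:3], y_k_pred[:3]):
    print("top classes:", classes, "with probabilities:", probs)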