I tried Googling, but could not find how to implement Sklearn metrics such as Cohen's kappa, ROC AUC, or F1 score in Keras as metrics for imbalanced data.
How to implement Sklearn Metric in Keras as Metric?
Metrics in Keras and in Sklearn mean different things.
In Keras, metrics behave much like losses: they are called during training at the end of each batch and each epoch for reporting and logging purposes. A typical use case is training with the loss 'mse' while still wanting to monitor 'mae'; in that case you add 'mae' to the model's metrics.
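For instance, a minimal sketch of that setup (the optimizer choice here is arbitrary):
# train on MSE, but also report MAE at the end of every batch and epoch
model.compile(optimizer='adam', loss='mse', metrics=['mae'])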
In Sklearn, metric functions are applied to predictions, as per the documentation: "The metrics module implements functions assessing prediction error for specific purposes". While there is overlap, many of Sklearn's statistical functions do not fit Keras's definition of a metric: an Sklearn metric can return a float, an array, or a 2D array with both dimensions greater than one, and Keras has no such object it can log from its training loop.
Answer to your question:
It depends on where you want to trigger the computation:
End of each batch or each epoch
You can write a custom callback that is fired at the end of each batch or epoch (see the callback sketch after this list).
After prediction
This seems to be easier. Let Keras predict on the entire dataset, capture the result and then feed the y_true and y_pred arrays to the respective Sklearn metric.
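A minimal sketch of the callback approach for a binary classifier, assuming x_val and y_val are held-out validation arrays (placeholder names, not from the question):
import numpy as np
from sklearn.metrics import f1_score, cohen_kappa_score
from tensorflow import keras

class SklearnMetricsCallback(keras.callbacks.Callback):
    """Compute sklearn metrics on validation data at the end of each epoch."""

    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val = x_val
        self.y_val = y_val

    def on_epoch_end(self, epoch, logs=None):
        # predict on the validation set and threshold the probabilities
        y_prob = self.model.predict(self.x_val, verbose=0)
        y_pred = (y_prob > 0.5).astype(int).ravel()
        logs = logs or {}
        logs['val_f1'] = f1_score(self.y_val, y_pred)
        logs['val_kappa'] = cohen_kappa_score(self.y_val, y_pred)
        print(f"epoch {epoch}: val_f1={logs['val_f1']:.4f}, val_kappa={logs['val_kappa']:.4f}")

# usage: model.fit(x_train, y_train, callbacks=[SklearnMetricsCallback(x_val, y_val)])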
Everything you need lives in the confusion matrix. Calculate the confusion matrix and follow the formulas below:
In practice, this is done as follows:
from sklearn.metrics import confusion_matrix
NBC = NBC.fit(X_train,y_train)
cm = confusion_matrix(y_test, NBC.predict(X_test))
tn, fp, fn, tp = cm.ravel()
print('tn: ',tn)
print('fp: ',fp)
print('fn: ',fn)
print('tp: ',tp)
print('------------------')
print(cm)
and now:
p_0 = (tn+tp)/(tn+fp+fn+tp)
print('p_0: ', p_0)
P_class0 = ((tn+fp)/(tn+fp+fn+tp))*((tn+fn)/(tn+fp+fn+tp))
print('P_class0: ', P_class0)
P_class1 = ((fn+tp)/(tn+fp+fn+tp))*((fp+tp)/(tn+fp+fn+tp))
print('P_class1: ', P_class1)
pe = P_class0 + P_class1
print('pe: ', pe)
kappa = (p_0-pe)/(1-pe)
print('kappa: ', kappa)
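For a quick sanity check, sklearn can compute the same value directly; a short sketch assuming the same NBC classifier and test split as above:
from sklearn.metrics import cohen_kappa_score

# should match the manually computed kappa above
print('kappa (sklearn): ', cohen_kappa_score(y_test, NBC.predict(X_test)))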
Related
I am using tf 2.5 to evaluate a multiclass classification problem. I am using F1 score since my dataset is highly imbalanced. The F1 metric I am using is from the tensorflow-addons package.
When I use it with a binary model everything works fine, but results and training get weird when I am doing multiclass models.
During training and evaluation of the multiclass problem, the F1 score is way higher than it should be. To check whether the score was correct I used scikit-learn's F1 score metric, and it gave a much more reasonable result. Interestingly, when manually evaluating the prediction with the tfa F1 metric using update_state(), the score is the same as scikit-learn's. I am not sure about the reason for that. Probably because evaluate() and fit() use batches? But how could I overcome this problem? For evaluation it's not so much of a problem, since I can just use predict. But how can I show a valid F1 training score?
Example
F1-Score definition for my 7 class problem
tfa.metrics.F1Score(num_classes=7, average='macro', threshold=0.5)
Training
model.fit(ds.train_ds,validation_data=ds.val_ds,epochs=EPOCHS)
F1: 0.4163
Evaluation results
model.evaluate(ds.test_ds)
F1: 0.44059306383132935
Prediction
pred = model.predict(ds.test_ds)
metric = tfa.metrics.F1Score(num_classes=7, average='macro', threshold=0.5)
metric.update_state(y_true, y_pred)
result = metric.result()
result.numpy()
F1: 0.1444352
Scikit-Evaluation
from sklearn.metrics import f1_score
print(f1_score(y_true, y_pred, average='macro'))
F1: 0.1444351874222774
The problem was that the test dataset was shuffled after each full iteration. Disabling this led to consistent scores across all evaluation methods.
I simply added an additional parameter for my dataset tuning function:
def __configureperformance__(self, ds, shuffle=True):
    ds = ds.cache()
    if shuffle:
        ds = ds.shuffle(buffer_size=1000)
    ds = ds.batch(self.batch_size)
    ds = ds.prefetch(buffer_size=self.AUTOTUNE)
    return ds
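A possible way to call it, assuming the raw train/val/test datasets are created elsewhere in the class (placeholder names):
# only the training split should be shuffled; validation/test order stays fixed
self.train_ds = self.__configureperformance__(train_ds, shuffle=True)
self.val_ds = self.__configureperformance__(val_ds, shuffle=False)
self.test_ds = self.__configureperformance__(test_ds, shuffle=False)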
I'm using the AUC metric for a multilabel classification. Since Keras has removed predict_classes for obtaining the predicted classes, I just use a threshold of 0.5 to get the output classes. However, as I understand it, for AUC the threshold should not be 0.5 for an imbalanced data set. How can I get the threshold that was used for training the model?
Besides, I know that AUC is used for binary classification. Can I just use it for a multilabel problem? How should I calculate the threshold: by taking the average or not?
You can use AUC for the multi-label problem; check the example below.
import numpy as np
import tensorflow as tf

y_true = np.random.randint(0, 2, (100, 4))
y_pred = np.random.randint(0, 2, (100, 4))
m = tf.keras.metrics.AUC(multi_label=True, thresholds=[0, 0.5])
m(y_true, y_pred).numpy()
FYI, from TF 2.5 it also supports logit predictions (from_logits=True).
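A minimal sketch of that option, assuming the model outputs raw logits rather than probabilities:
# from_logits is available from TF 2.5 onwards
m = tf.keras.metrics.AUC(multi_label=True, from_logits=True)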
Whenever an sklearn model is fit to some data, it minimizes some loss function. How can I obtain the model loss using that loss function?
e.g.
model = sklearn.linear_model.LogisticRegression().fit(X_train,y_train)
model.get_loss(X_train, y_train) #gives the loss for these values
model.get_loss(X_test, y_test) #gives the loss for other values
Note that the .score method does NOT do this thing.
LogisticRegression minimises log loss, so you might expect .score to return that loss (negated); however, it actually returns the mean accuracy.
To calculate log loss you need to use the log_loss metric:
I haven't tested it, but something like this:
from sklearn.metrics import log_loss

model = sklearn.linear_model.LogisticRegression().fit(X_train, y_train)
loss = log_loss(y_test, model.predict_proba(X_test))
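The training loss can be obtained the same way, assuming the same variables:
train_loss = log_loss(y_train, model.predict_proba(X_train))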
I am working on a regression model based on the IMDB data, to predict the IMDB rating. For my linear regression, I was unable to obtain the accuracy score.
my line of code:
metrics.accuracy_score(test_y, linear_predicted_rating)
Error :
ValueError: continuous is not supported
If I were to change that line to obtain the R2 score,
metrics.r2_score(test_y, linear_predicted_rating)
I was able to obtain R2 without any error.
Any clue why I am seeing this?
Thanks.
Edit:
One thing I found out is that test_y is a pandas DataFrame whereas linear_predicted_rating is a NumPy array.
metrics.accuracy_score measures classification accuracy; it can't be used to measure the accuracy of a regression model, because accuracy doesn't make sense for regression: predictions rarely equal the expected values exactly. If predictions differ from the expected values by only 1%, the accuracy will be zero, even though those predictions are excellent.
Here are some metrics for regression: http://scikit-learn.org/stable/modules/classes.html#regression-metrics
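For example, a short sketch using the variables from the question (assuming test_y and linear_predicted_rating are aligned arrays of true and predicted ratings):
from sklearn import metrics

# error measures that do make sense for a regression model
print('MAE: ', metrics.mean_absolute_error(test_y, linear_predicted_rating))
print('MSE: ', metrics.mean_squared_error(test_y, linear_predicted_rating))
print('R2:  ', metrics.r2_score(test_y, linear_predicted_rating))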
NOTE: Accuracy (e.g. classification accuracy) is a measure for classification, not regression, so we can't calculate accuracy for a regression model. For regression, one of the metrics we can use as a score (sometimes loosely called accuracy) is R-squared (R2).
You can get the R2 score of your prediction using the score(X, y, sample_weight=None) method of LinearRegression, adapting the logic to your code as follows.
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train,y_train)
r2_score = regressor.score(x_test,y_test)
print(r2_score*100,'%')
output (a/c to my model)
86.23%
The above is the R-squared value, not the accuracy:
# R squared value
metrics.explained_variance_score(y_test, predictions)
What do your variables look like? The code below works fine.
from sklearn import metrics
test_y, linear_predicted_rating = [1,2,3,4], [1,2,3,5]
metrics.accuracy_score(test_y, linear_predicted_rating)
You cannot compute accuracy for a regression model; however, you can analyze your model using mean absolute error, mean squared error, root mean squared error, max error, median absolute error, R-squared, etc.
For reference, you can go through the scikit-learn regression metrics documentation to gain more knowledge.
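A short sketch of the less common ones, assuming y_test and predictions hold the true and predicted values (placeholder names):
from sklearn import metrics

print('Max error: ', metrics.max_error(y_test, predictions))
print('Median absolute error: ', metrics.median_absolute_error(y_test, predictions))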
As an R user, I wanted to also get up to speed on scikit.
Creating a linear regression model(s) is fine, but can't seem to find a reasonable way to get a standard summary of regression output.
Code example:
# Linear Regression
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression
# Load the diabetes datasets
dataset = datasets.load_diabetes()
# Fit a linear regression model to the data
model = LinearRegression()
model.fit(dataset.data, dataset.target)
print(model)
# Make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# Summarize the fit of the model
mse = np.mean((predicted-expected)**2)
print(model.intercept_, model.coef_, mse)
print(model.score(dataset.data, dataset.target))
Issues:
seems like the intercept and coef are built into the model, and I just type print (second to last line) to see them.
What about all the other standard regression output like R^2, adjusted R^2, p-values, etc.? If I read the examples correctly, it seems like you have to write a function/equation for each of these and then print it.
So, is there no standard summary output for lin. reg. models?
Also, in the printed array of coefficient outputs, there are no variable names associated with the values; I just get the numeric array. Is there a way to print the coefficients together with the variables they belong to?
My printed output:
LinearRegression(copy_X=True, fit_intercept=True, normalize=False)
152.133484163 [ -10.01219782 -239.81908937 519.83978679 324.39042769 -792.18416163
476.74583782 101.04457032 177.06417623 751.27932109 67.62538639] 2859.69039877
0.517749425413
Notes: Started off with Linear, Ridge and Lasso. I have gone through the examples. Below is for the basic OLS.
There exists no R type regression summary report in sklearn. The main reason is that sklearn is used for predictive modelling / machine learning and the evaluation criteria are based on performance on previously unseen data (such as predictive r^2 for regression).
There does exist a summary function for classification called sklearn.metrics.classification_report which calculates several types of (predictive) scores on a classification model.
For a more classic statistical approach, take a look at statsmodels.
I use:
import numpy as np
import sklearn.metrics as metrics

def regression_results(y_true, y_pred):
    # Regression metrics
    explained_variance = metrics.explained_variance_score(y_true, y_pred)
    mean_absolute_error = metrics.mean_absolute_error(y_true, y_pred)
    mse = metrics.mean_squared_error(y_true, y_pred)
    mean_squared_log_error = metrics.mean_squared_log_error(y_true, y_pred)
    median_absolute_error = metrics.median_absolute_error(y_true, y_pred)
    r2 = metrics.r2_score(y_true, y_pred)

    print('explained_variance: ', round(explained_variance, 4))
    print('mean_squared_log_error: ', round(mean_squared_log_error, 4))
    print('r2: ', round(r2, 4))
    print('MAE: ', round(mean_absolute_error, 4))
    print('MSE: ', round(mse, 4))
    print('RMSE: ', round(np.sqrt(mse), 4))
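A possible usage, assuming a fitted estimator model and held-out arrays X_test and y_test (placeholder names):
regression_results(y_test, model.predict(X_test))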
The statsmodels package gives a quite decent summary:
from statsmodels.api import OLS
OLS(dataset.target,dataset.data).fit().summary()
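Note that OLS called this way fits without an intercept; a sketch with an explicit constant column added (assuming the same diabetes dataset object):
import statsmodels.api as sm

X_const = sm.add_constant(dataset.data)  # prepend an intercept column
print(sm.OLS(dataset.target, X_const).fit().summary())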
You can do this using statsmodels:
import statsmodels.api as sm

X = sm.add_constant(X.ravel())
results = sm.OLS(y, X).fit()
results.summary()
results.summary() will organize the results into three tables.
You can use the following option to have a summary table:
import statsmodels.api as sm
#log_clf = LogisticRegression()
log_clf = sm.Logit(y_train, X_train)
classifier = log_clf.fit()
y_pred = classifier.predict(X_test)
print(classifier.summary2())
Use model.summary() after predict (note that summary() is a statsmodels method; sklearn's LinearRegression does not provide one, so this only works with a statsmodels model).
# Linear Regression
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression
# load the diabetes datasets
dataset = datasets.load_diabetes()
# fit a linear regression model to the data
model = LinearRegression()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# >>>>>>>Print out the statistics<<<<<<<<<<<<<
model.summary()
# summarize the fit of the model
mse = np.mean((predicted-expected)**2)
print(model.intercept_, model.coef_, mse)
print(model.score(dataset.data, dataset.target))