I want to calculate the F1 score of my models. But I receive a warning and get a 0.0 F1-score and I don't know what to do.
here is the source code:
def model_evaluation(dict):
for key,value in dict.items():
classifier = Pipeline([('tfidf', TfidfVectorizer()),
('clf', value),
])
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
print("Accuracy Score of" , key , ": ", metrics.accuracy_score(y_test,predictions))
print(metrics.classification_report(y_test,predictions))
print(metrics.f1_score(y_test, predictions, average="weighted", labels=np.unique(predictions), zero_division=0))
print("---------------","\n")
dlist = { "KNeighborsClassifier": KNeighborsClassifier(3),"LinearSVC":
LinearSVC(), "MultinomialNB": MultinomialNB(), "RandomForest": RandomForestClassifier(max_depth=5, n_estimators=100)}
model_evaluation(dlist)
And here is the result:
Accuracy Score of KNeighborsClassifier : 0.75
precision recall f1-score support
not positive 0.71 0.77 0.74 13
positive 0.79 0.73 0.76 15
accuracy 0.75 28
macro avg 0.75 0.75 0.75 28
weighted avg 0.75 0.75 0.75 28
0.7503192848020434
---------------
Accuracy Score of LinearSVC : 0.8928571428571429
precision recall f1-score support
not positive 1.00 0.77 0.87 13
positive 0.83 1.00 0.91 15
accuracy 0.89 28
macro avg 0.92 0.88 0.89 28
weighted avg 0.91 0.89 0.89 28
0.8907396950875212
---------------
Accuracy Score of MultinomialNB : 0.5357142857142857
precision recall f1-score support
not positive 0.00 0.00 0.00 13
positive 0.54 1.00 0.70 15
accuracy 0.54 28
macro avg 0.27 0.50 0.35 28
weighted avg 0.29 0.54 0.37 28
0.6976744186046512
---------------
C:\Users\Cey\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1272: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
Accuracy Score of RandomForest : 0.5714285714285714
precision recall f1-score support
not positive 1.00 0.08 0.14 13
positive 0.56 1.00 0.71 15
accuracy 0.57 28
macro avg 0.78 0.54 0.43 28
weighted avg 0.76 0.57 0.45 28
0.44897959183673475
---------------
Can someone tell me what to do? I only receive this message when using the "MultinomialNB()" classifier
Second:
When extending the dictionary by using the Gausian classifier (GaussianNB()) I receive this error message:
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
What should I do here ?
Together with UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples (main credits go there) and #yatu's answer, I could at least find a workaround for the warning:
UndefinedMetricWarning: Precision is ill-defined and being set to 0.0
due to no predicted samples. Use zero_division parameter to control
this behavior. _warn_prf(average, modifier, msg_start, len(result))
Quote from sklearn.metrics.f1_score in the Notes at the bottom:
When true positive + false positive == 0, precision is undefined. When
true positive + false negative == 0, recall is undefined. In such
cases, by default the metric will be set to 0, as will f-score, and
UndefinedMetricWarning will be raised. This behavior can be modified
with zero_division.
Thus, you cannot avoid this error if your data does not output a difference between true positives and false positives.
That being said, you can only suppress the warning at least, adding zero_division=0 to the functions mentioned in the quote. In either case, set to 0 or 1, you will get a 0 value as the return anyway.
precision = precision_score(y_test, y_pred, zero_division=0)
print('Precision score: {0:0.2f}'.format(precision))
recall = recall_score(y_test, y_pred, zero_division=0)
print('Recall score: {0:0.2f}'.format(recall))
f1 = f1_score(y_test, y_pred, zero_division=0)
print('f1 score: {0:0.2f}'.format(recall))
Can someone tell me what to do? I only receive this message when using the "MultinomialNB()" classifier
The first error seems to be indicating that a specific label is not predicted when using the MultinomialNB, which results in an undefined f-score, or ill-defined, since the missing values are set to 0. This is explained here
When extending the dictionary by using the Gausian classifier (GaussianNB()) I receive this error message:
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
As per this question, the error is quite explicit, the issue is that TfidfVectorizer is returning a sparse matrix, which cannot be used as input for the GaussianNB. So the way I see it, you either avoid using the GaussianNB, or you add an intermediate transformer to turn the sparse array to dense, which I wouldn't advise being the result of a tf-idf vectorization.
Related
I am working on a churn classification with 3 classes 0, 1,2 but want to optimize class 0 and 1 for recall, does that mean sklearn needs to take classes 0 & 1 to be the positive classes. How can I explicitly mention for which class do I want to optimise recall , if that is not possible should I consider renaming the classes in an ascending order so that 1, 2 are default positive?
precision recall f1-score support
0 0.71 0.18 0.28 2611
1 0.57 0.54 0.56 5872
2 0.70 0.88 0.78 8913
accuracy 0.66 17396
macro avg 0.66 0.53 0.54 17396
weighted avg 0.66 0.66 0.63 17396
Here is the code I am using for reference (although I need more of an understanding of how to optimize for recall for only 0, 1 class here)
param_test1={'learning_rate':(0.05,0.1),'max_depth':(3,5)}
estimator=GridSearchCV(estimator=GradientBoostingClassifier(loss='deviance',subsample=0.8,random_state=10,
n_estimators=200),param_grid=param_test1,cv=2, refit='recall_score')
estimator.fit(df[predictors],df[target])
Suppose I have such data :
x1 x2 x3 y
0.85 0.95 0.22 1
0.35 0.26 0.42 0
0.89 0.82 0.82 1
0.36 0.14 0.32 0
0.44 0.53 0.82 1
0.75 0.78 0.52 1
I predict binary classification but the only thing that matters ,is the correct prediction of the 1s, and if the prediction is 0, it will not affect my accuracy.
I simply used the following code :
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
loss='binary_crossentropy',
metrics=['accuracy'])
But this code also includes zeros in its accuracy.
How can I apply to the network that only the prediction of 1 is important ?
In other words, During fitting model, if the prediction was zero , this zero predication does not apply to the model accuracy.
It looks like you care about precision of the model. Precision means for all instances that you predict 1, what portion of them is correct.
If yes, use tf.keras.metrics.Precision() as metrics.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
loss='binary_crossentropy',
metrics=[tf.keras.metrics.Precision()])
I have a dataset including
{0: 6624, 1: 75} 0 for nonobservational sentences and 1 for observational sentences. (basically, I annotate my sentences using Named Entity Recognition, If there is a specific entity like DATA, TIME, LONG (coordinate) I put label 1)
Now I want to make a model to classify them, the best model (CV =3 FOR ALL) that I made is the ensembling model of
clf= SGDClassifier()
trial_05=Pipeline([("vect",vec),("clf",clf)])
which has:
precision recall f1-score support
0 1.00 1.00 1.00 6624
1 0.73 0.57 0.64 75
micro avg 0.99 0.99 0.99 6699
macro avg 0.86 0.79 0.82 6699
weighted avg 0.99 0.99 0.99 669
[[6611 37]
[ 13 38]]
and this model which used resampled sgd for classifcation
precision recall f1-score support
0 1.00 0.92 0.96 6624
1 0.13 1.00 0.22 75
micro avg 0.92 0.92 0.92 6699
macro avg 0.56 0.96 0.59 6699
weighted avg 0.99 0.92 0.95 6699
[[6104 0]
[ 520 75]]
As you see the problem in both cases is class 1, but in forst one we have fairly good precision and f1 score versus in the second one we have a very good recall
So I decided to use ensemble model using both in this way:
from sklearn.ensemble import VotingClassifier#create a dictionary of our models
estimators=[("trail_05",trial_05), ("resampled", SGD_RESAMPLED_Model)]#create our voting classifier, inputting our models
ensemble = VotingClassifier(estimators, voting='hard')
now I have this result:
precision recall f1-score support
0 0.99 1.00 1.00 6624
1 0.75 0.48 0.59 75
micro avg 0.99 0.99 0.99 6699
macro avg 0.87 0.74 0.79 6699
weighted avg 0.99 0.99 0.99 6699
[[6612 39]
[ 12 36]]
As you the ensembe model has better precision regarding to class 1,but worse recall and f1 socre which caused to worse confusion matrix regarding classed 1 (36 TP vs 38 TP for class 1)
MY aim is to improve TP for class one (f1 score, recall for class 1)
what do you recommend to improve TP for class one (f1score, recall for class 1?
generaly do you have any idea regarding my workflow?
I have tried parameter tuning, it i does not improve sgd model.
I imported classification_report from sklearn.metrics and when I enter my np.arrays as parameters I get the following error :
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1135:
UndefinedMetricWarning: Precision and F-score are ill-defined and
being set to 0.0 in labels with no predicted samples. 'precision',
'predicted', average, warn_for)
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1137:
UndefinedMetricWarning: Recall and F-score are ill-defined and being
set to 0.0 in labels with no true samples. 'recall', 'true',
average, warn_for)
Here is the code :
svclassifier_polynomial = SVC(kernel = 'poly', degree = 7, C = 5)
svclassifier_polynomial.fit(X_train, y_train)
y_pred = svclassifier_polynomial.predict(X_test)
poly = classification_report(y_test, y_pred)
When I was not using np.array in the past it worked just fine, any ideas on how i can correct this ?
This is not an error, just a warning that not all your labels are included in your y_pred, i.e. there are some labels in your y_test that your classifier never predicts.
Here is a simple reproducible example:
from sklearn.metrics import precision_score, f1_score, classification_report
y_true = [0, 1, 2, 0, 1, 2] # 3-class problem
y_pred = [0, 0, 1, 0, 0, 1] # we never predict '2'
precision_score(y_true, y_pred, average='macro')
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
0.16666666666666666
precision_score(y_true, y_pred, average='micro') # no warning
0.3333333333333333
precision_score(y_true, y_pred, average=None)
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
array([0.5, 0. , 0. ])
Exact same warnings are produced for f1_score (not shown).
Practically this only warns you that in the classification_report, the respective values for labels with no predicted samples (here 2) will be set to 0:
print(classification_report(y_true, y_pred))
precision recall f1-score support
0 0.50 1.00 0.67 2
1 0.00 0.00 0.00 2
2 0.00 0.00 0.00 2
micro avg 0.33 0.33 0.33 6
macro avg 0.17 0.33 0.22 6
weighted avg 0.17 0.33 0.22 6
[...] UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
When I was not using np.array in the past it worked just fine
Highly doubtful, since in the example above I have used simple Python lists, and not Numpy arrays...
It means that some labels are only present in train data and some labels are only present in test dataset. Run the following codes, to understand the distribution of train and test labels.
from collections import Counter
Counter(y_train)
Counter(y_test)
Use stratified train_test_split to get rid of the situation where some labels are present only in test dataset.
It might have worked in past simply because of the random splitting of dataset. Hence, stratified splitting is always recommended.
The first situation is more about model fine tuning or choice of model.
I split my dataset in two parts: training set and test set. For now just forget the test set and use the training set with the function GridSearchCV of the package sklearn.model_selection to search the best parameters for an SVM:
Cs = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
gammas = [0.001, 0.01, 0.1, 1]
# Set the parameters by cross-validation
param_grid = [{'kernel': ['rbf'], 'gamma': gammas, 'C': Cs}]
clf = GridSearchCV(svm.SVC(), param_grid = param_grid, cv=nfolds, verbose=1.)
clf.fit(x_train, labels)
after found my best C and gamma parameters, I create an SVM and I fit it with the training set (used before to search the best C and gamma):
model = svm.SVC(kernel='rbf', C = clf.best_params_['C'], gamma = clf.best_params_['gamma'])
model.fit(x_train, y_train)
At this point I tried one thing, I used the predict() function of the GridSearchCV object and the one of the svm.SVC object:
predicted_label1 = model.predict(x_test)
predicted_label2 = clf.predict(x_test)
and then I used the classification_report(y_test, predicted_label) to valuate my two predicted_label vectors. In my mind I should obtain the same values but this not happens...Here my output:
precision recall f1-score support
0.0 0.24 0.97 0.39 357
1.0 0.00 0.00 0.00 358
2.0 0.00 0.00 0.00 357
3.0 0.00 0.00 0.00 357
avg / total 0.06 0.24 0.10 1429
fine parametri
training set and test set saved
Create SVM classifier
precision recall f1-score support
0.0 0.70 0.63 0.66 357
1.0 0.89 0.90 0.90 358
2.0 0.89 0.94 0.91 357
3.0 0.85 0.88 0.86 357
avg / total 0.83 0.84 0.83 1429
The first is from the GridSearchCV and the second from the SVM...
Is this normal?
What does GridSearchCV returns? Does it fit with the passed training set?