plot_confusion_matrix() got an unexpected keyword argument 'classes' using sklearn - python

I am new to Python and deep learning. I trained a multi-class classifier and want to plot a confusion matrix, but I am getting an error.
Here is my code:
import os
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, plot_confusion_matrix, ConfusionMatrixDisplay
Y_pred = model.predict_generator(test_generator)
y_pred = np.argmax(Y_pred, axis=1)
category_names = sorted(os.listdir('D:/DiabaticRetinopathy/mq_dataset/DR_Normal/train'))
print(category_names)
cm = confusion_matrix(test_generator.classes, y_pred)
plot_confusion_matrix(cm, classes = category_names, title='Confusion Matrix', normalize=False, figname = 'Confusion_matrix_concrete.jpg')
I updated sklearn to version 0.24 and restarted my kernel afterwards, but it still gives an error:
TypeError: plot_confusion_matrix() got an unexpected keyword argument 'classes'

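Use display_labels (for the tick names) instead of classes, and remove title and figname, which plot_confusion_matrix does not accept. The function also needs a fitted estimator as its first argument, followed by the test inputs and the true labels, and normalize must be 'true', 'pred', 'all', or None rather than False. Here estimator, X_test and y_test stand for a fitted scikit-learn classifier and its test data:
plot_confusion_matrix(estimator, X_test, y_test, display_labels=category_names, normalize=None)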
Documentation: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.plot_confusion_matrix.html

There is a keyword labels, but not classes, so you can change it to that.

The error states that the keyword classes you provided is not a keyword that this function recognizes. This happens in your last line.
The documentation lists the keywords that you can use.
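Because the confusion matrix in the question is already computed, one option that sidesteps plot_confusion_matrix is ConfusionMatrixDisplay, which the question already imports. The following is a minimal sketch, not the asker's actual pipeline: the labels, the three class names, and the output file name are made up, and the title and file name are set through matplotlib rather than keyword arguments.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Made-up stand-ins for test_generator.classes, y_pred and category_names
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])
category_names = ["class_a", "class_b", "class_c"]  # hypothetical names

cm = confusion_matrix(y_true, y_pred)

# Build the display from the precomputed matrix; no estimator is involved
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=category_names)
disp.plot(cmap=plt.cm.Blues)
disp.ax_.set_title("Confusion Matrix")

# Save through the matplotlib figure (hypothetical file name)
out_path = os.path.join(tempfile.gettempdir(), "confusion_matrix_example.png")
disp.figure_.savefig(out_path)
```

In an interactive session, plt.show() after disp.plot() would display the figure instead of saving it.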

Related

Trying to plot confusion matrix, but I find this error: Singleton array 23 cannot be considered a valid collection

I'm trying to plot the confusion matrix of a neural network that I have already built and saved. I have 11 labels in my dataset.
I'm using this code:
import pandas as pd
import numpy as np
from scipy import stats
from sklearn import metrics
from sklearn.metrics import classification_report, confusion_matrix
rounded_labels = np.argmax(y_test, axis=-1) #y_test are the test label, I use np.argmax to find an integer
test_model = load_model('/model.h5')
y_pred = test_model.predict(X_test, steps=1, verbose=0)
rounded_y_pred = np.argmax(y_pred, axis=-1) #I use np.argmax to find an integer prediction
When I print rounded_y_pred I get integers from 0 to 10, which looks right because I have eleven labels. But when I try to compute the confusion matrix:
cm = confusion_matrix(y_true=rounded_labels, y_pred=rounded_y_pred)
I get this error: TypeError: Singleton array 23 cannot be considered a valid collection.
I really don't know how to fix it. Could someone help me? Thank you so much
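No answer is recorded for this question here, so the following is only one possible cause, not a diagnosis. If y_test already holds integer class labels (a 1-D array) rather than one-hot rows, np.argmax over it returns a single index, and scikit-learn rejects such a zero-dimensional input as a singleton array. A small sketch with made-up labels; the helper name to_class_indices is hypothetical:

```python
import numpy as np

# Hypothetical labels: already integer-encoded, NOT one-hot
y_test_int = np.array([3, 0, 10, 7, 2])

# argmax over a 1-D array collapses it to a single index (a 0-d value)
collapsed = np.argmax(y_test_int, axis=-1)
print(np.ndim(collapsed))

# One-hot labels have shape (n_samples, n_classes); argmax along the last axis is then correct
y_test_onehot = np.eye(11)[y_test_int]
rounded_labels = np.argmax(y_test_onehot, axis=-1)


def to_class_indices(y):
    """Return integer class indices for either one-hot (2-D) or integer (1-D) labels."""
    y = np.asarray(y)
    return np.argmax(y, axis=-1) if y.ndim > 1 else y.astype(int)
```

Checking rounded_labels.shape and rounded_y_pred.shape before calling confusion_matrix would show whether either one has collapsed to a scalar.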

keep returning 'numpy.float64' object is not callable when running GridSearchCV

I am running a block of code that uses GridSearchCV to find the best parameters for LinearSVC.
However, I keep running into the same error,
TypeError 'numpy.float64' object is not callable
even after converting all my inputs to float64. Can anyone help?
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer
clf = LinearSVC()
parameters = {'random_state':[0, 1, 42], 'tol':[1e-5, 1e-4, 1e-3]}
scorer = make_scorer(fbeta_score(y_val.values.ravel().astype('float64'),
                                 y_pred.astype('float64'), beta=0.5))
grid_obj = GridSearchCV(clf, parameters, scoring=scorer)
grid_fit = grid_obj.fit(X_train.values.astype('float64'),
                        y_train.values.ravel().astype('float64'))
Why are you converting your y_train values to float? I assume you are doing classification since you use SVC. The target values should be integers.
It turns out the issue was my make_scorer call: I passed it the result of calling fbeta_score (a float) rather than the function itself, so the scorer could not be called later. It should be written as make_scorer(fbeta_score, beta=0.5).
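For illustration, a minimal sketch of the corrected scorer on synthetic data. The dataset, the max_iter value, the cv setting, and the reduced parameter grid are assumptions made so the snippet runs on its own; they are not the asker's setup.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic binary classification data standing in for X_train / y_train
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Pass the metric function itself plus its keyword arguments, not the result of calling it
scorer = make_scorer(fbeta_score, beta=0.5)

parameters = {"tol": [1e-5, 1e-4, 1e-3]}
grid_obj = GridSearchCV(
    LinearSVC(max_iter=10000, random_state=0), parameters, scoring=scorer, cv=3
)
grid_fit = grid_obj.fit(X, y)

print(grid_fit.best_params_)
```

GridSearchCV calls the scorer itself on each validation fold, so no y_val or y_pred needs to be supplied when the scorer is built.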

plot_confusion_matrix without estimator

I'm trying to use plot_confusion_matrix,
from sklearn.metrics import confusion_matrix
y_true = [1, 1, 0, 1]
y_pred = [1, 1, 0, 0]
confusion_matrix(y_true, y_pred)
Output:
array([[1, 0],
       [1, 2]])
Now, when I use either of the following, with or without 'classes':
from sklearn.metrics import plot_confusion_matrix
plot_confusion_matrix(y_true, y_pred, classes=[0,1], title='Confusion matrix, without normalization')
or
plot_confusion_matrix(y_true, y_pred, title='Confusion matrix, without normalization')
I expect to get output similar to this, apart from the numbers inside.
Plotting a simple diagram should not require an estimator.
Using mlxtend.plotting,
from mlxtend.plotting import plot_confusion_matrix
import matplotlib.pyplot as plt
import numpy as np
binary1 = np.array([[4, 1],
                    [1, 2]])
fig, ax = plot_confusion_matrix(conf_mat=binary1)
plt.show()
It provides the same output.
Based on this,
it requires a classifier:
disp = plot_confusion_matrix(classifier, X_test, y_test,
                             display_labels=class_names,
                             cmap=plt.cm.Blues,
                             normalize=normalize)
Can I plot it without a classifier?
plot_confusion_matrix expects a trained classifier. If you look at the source code, what it does is perform the prediction to generate y_pred for you:
y_pred = estimator.predict(X)
cm = confusion_matrix(y_true, y_pred, sample_weight=sample_weight,
                      labels=labels, normalize=normalize)
So in order to plot the confusion matrix without specifying a classifier, you'll have to go with some other tool, or do it yourself.
A simple option is to use seaborn:
import seaborn as sns
cm = confusion_matrix(y_true, y_pred)
f = sns.heatmap(cm, annot=True)
I am a bit late here, but I thought other people might benefit from my answer.
As others have mentioned using plot_confusion_matrix is not an option without the classifier but it is still possible to use sklearn to obtain a similar-looking confusion matrix without the classifier. The function below does exactly this.
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
def confusion_ma(y_true, y_pred, class_names):
    cm = confusion_matrix(y_true, y_pred, normalize='true')
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
    disp.plot(cmap=plt.cm.Blues)
    return plt.show()
The confusion_matrix function returns a plain ndarray. Passing this, together with the labels of the predictions, to the ConfusionMatrixDisplay class gives a similar-looking matrix. In the definition I've added class_names to be displayed instead of 0 and 1, chosen to normalize the output, and specified a colormap; change these according to your needs.
Since plot_confusion_matrix requires the argument 'estimator' not to be None, the answer is: no, you can't. But you can plot your confusion matrix in other ways, for example see this answer: How can I plot a confusion matrix?
I tested the following "identity classifier" in a Jupyter notebook running the conda_python3 kernel in Amazon SageMaker. The reason is that SageMaker's transformation job is async and so does not allow the classifier to be used in the parameters of plot_confusion_matrix, y_pred has to be calculated before calling the function.
IC = type('IdentityClassifier', (), {"predict": lambda i : i, "_estimator_type": "classifier"})
plot_confusion_matrix(IC, y_pred, y_test, normalize='true', values_format='.2%');
So while plot_confusion_matrix indeed expects an estimator, you'll not necessarily have to use another tool IMO, if this solution fits your use case.
simplified POC from the notebook
I solved the problem by using a custom classifier; you can build any custom classifier and pass an instance of it to plot_confusion_matrix:
class MyModelPredict(object):
    def __init__(self, model):
        # Keep a reference to the wrapped model so predict() can use it
        self.model = model
        self._estimator_type = 'classifier'

    def predict(self, X):
        # Placeholder: compute and return your own class predictions for X
        return your_custom_prediction

wrapped = MyModelPredict(model)
plot_confusion_matrix(wrapped, X, y_true)

AttributeError: 'NumpyArrayIterator' object has no attribute 'classes'

I get this error:
AttributeError: 'NumpyArrayIterator' object has no attribute 'classes'
I am trying to make a confusion matrix to evaluate the neural net I have trained. I am using ImageDataGenerator and datagen.flow before calling fit_generator for training.
For predictions I use the predict_generator function on the test set. All is working fine so far. The issue arises in the following:
test_generator.reset()
pred = model.predict_generator(test_generator, steps=len(test_generator), verbose=2)
from sklearn.metrics import classification_report, confusion_matrix, cohen_kappa_score
y_pred = np.argmax(pred, axis=1)
print('Confusion Matrix')
print(pd.DataFrame(confusion_matrix(test_generator.classes, y_pred)))
I should be seeing a confusion matrix but instead I see an error. I ran the same code with sample data before I ran on the actual dataset and that did show me the results.
First, extract the labels from the generator, then pass them to confusion_matrix. To extract labels, use x_gen, y_gen = test_generator.next(). Note that the labels are one-hot encoded, and that next() returns a single batch, so this matches pred only when the whole test set fits in one batch. Example:
test_generator.reset()
pred = model.predict_generator(test_generator, steps=len(test_generator), verbose=2)
from sklearn.metrics import classification_report, confusion_matrix, cohen_kappa_score
y_pred = np.argmax(pred, axis=1)
x_gen,y_gen = test_generator.next()
y_gen = np.argmax(y_gen, axis=1)
print('Confusion Matrix')
print(pd.DataFrame(confusion_matrix(y_gen, y_pred)))
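When the test set spans more than one batch, the labels have to be gathered from every batch. Below is a hedged sketch of one way to do that. It assumes the generator supports len() and integer indexing (as Keras Sequence-style iterators do) and was created with shuffle=False, so that the label order matches the prediction order. collect_labels is a hypothetical helper, and FakeIterator is a stand-in so the snippet runs without Keras.

```python
import numpy as np


def collect_labels(generator):
    """Gather class indices from every batch of an indexable (x, y) batch generator."""
    batches = [generator[i][1] for i in range(len(generator))]
    y = np.concatenate(batches, axis=0)
    # One-hot labels are 2-D; convert them to class indices
    return np.argmax(y, axis=1) if y.ndim > 1 else y


class FakeIterator:
    """Hypothetical stand-in for an unshuffled NumpyArrayIterator."""

    def __init__(self, x, y, batch_size):
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Number of batches, counting a final partial batch
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, i):
        s = slice(i * self.batch_size, (i + 1) * self.batch_size)
        return self.x[s], self.y[s]


labels = np.array([0, 2, 1, 1, 0])
gen = FakeIterator(np.zeros((5, 4)), np.eye(3)[labels], batch_size=2)
y_all = collect_labels(gen)
print(y_all)
```

The result can then be passed as the first argument to confusion_matrix in place of test_generator.classes.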

Tensorflow LinearRegressor not converging

I'm attempting to do a toy linear regression in Python with TensorFlow, using the pre-built estimator tf.contrib.learn.LinearRegressor instead of building my own estimator.
The inputs I'm using are real-valued numbers between 0 and 1, and the outputs are just 3*inputs. TensorFlow seems to fit the data (no errors raised), but the outputs have no correlation to what they should be.
I'm not sure I'm getting the predictions done correctly; the documentation for the predict() function is pretty sparse.
Any ideas for how to improve the fitting?
import numpy as np
import pandas as pd
import tensorflow as tf
import itertools
import matplotlib.pyplot as plt
#Defining data set
x = np.random.rand(200)
y = 3.0*x
data = pd.DataFrame({'X':x, 'Y':y})
training_data = data[50:]
test_data= data[:50]
COLUMNS = ['Y','X']
FEATURES = ['X']
LABELS = 'Y'
#Wrapper function for the inputs of LinearRegressor
def get_input_fn(data_set, num_epochs=None, shuffle=True):
    return tf.estimator.inputs.pandas_input_fn(
        x=pd.DataFrame(data_set[FEATURES]),
        y=pd.Series(data_set[LABELS]),
        num_epochs=num_epochs,
        shuffle=shuffle)
feature_cols = [tf.feature_column.numeric_column(k) for k in FEATURES]
regressor = tf.contrib.learn.LinearRegressor(feature_columns=feature_cols)
regressor.fit(input_fn=get_input_fn(test_data), steps=100)
results = regressor.predict(input_fn=get_input_fn(test_data,
                                                  num_epochs=1))
predictions = list(itertools.islice(results, 50))
#Visualizing the results
fig = plt.figure(figsize=[8,8])
ax = fig.add_subplot(111)
ax.scatter(test_data[LABELS], predictions)
ax.set_xlabel('Actual')
ax.set_ylabel('Predicted')
plt.show()
Scatter plot of results
Figured out the answer; posting it here for posterity:
My input function to LinearRegressor had shuffle=True as its default, and my predict() call did not set shuffle=False. So the outputs were shuffled around, making it look like the model hadn't converged!
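The effect can be reproduced without TensorFlow. The NumPy sketch below is an illustration under stated assumptions, not the original estimator code: a closed-form least-squares fit stands in for the trained regressor, and a random permutation stands in for the shuffled input function.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(200)
y = 3.0 * x

# Stand-in for a converged model: least-squares fit of y = w * x
w = np.sum(x * y) / np.sum(x * x)
predictions = w * x

# Predictions returned in input order line up with the labels
aligned_corr = np.corrcoef(y, predictions)[0, 1]

# Predictions returned in shuffled order no longer correspond to the labels row by row
shuffled_predictions = rng.permutation(predictions)
shuffled_corr = np.corrcoef(y, shuffled_predictions)[0, 1]

print(aligned_corr, shuffled_corr)
```

The fitted weight is the same in both cases; only the pairing of predictions with labels differs, which is why a scatter plot of shuffled predictions looks like a model that failed to fit.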
