Keras model.evaluate() print vector of loss values - python

Using model.evaluate() prints the mean loss over the test set. Is it possible to instead get a vector of loss values for all data points in the test set (the average of which would then be what model.evaluate() prints)?

As a built-in function, the loss is reported as an average, which is usually what you want for monitoring and visualization. To get per-sample results, just compute them yourself and you will be set. For example:
import tensorflow as tf

def squared_error(y_true, y_pred):
    return tf.math.square(y_true - y_pred)

predictions = model.predict(x_test)
losses = squared_error(y_test, predictions)
Alternatively, this may be doable with the built-in loss functions, using a batch size of 1:
loss_fn = tf.keras.losses.mse
# x_test should be (batches, your_dims) where you make sure each batch only has 1 sample
predictions = model.predict(x_test)
losses = loss_fn(y_test, predictions)
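Another option, not from the original answer: tf.keras also lets you construct a loss object with reduction=tf.keras.losses.Reduction.NONE, so it returns one loss value per example instead of a mean (the variable names below are placeholders).
import tensorflow as tf

# Sketch: per-sample losses, assuming a trained `model` and arrays x_test, y_test.
per_sample_loss = tf.keras.losses.MeanSquaredError(
    reduction=tf.keras.losses.Reduction.NONE)

predictions = model.predict(x_test)
losses = per_sample_loss(y_test, predictions)   # shape: (num_test_samples,)
mean_loss = tf.reduce_mean(losses)              # roughly what model.evaluate() reports for the loss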

Huge difference in accuracy between model.evaluate and model.predict for a TensorFlow CNN model

I am using ImageDataGenerator(validation_split).flow_from_directory(subset) for my training and validation sets. So the training and validation data get their own generators.
After training, I run model.evaluate() on my validation generator and get about 75% accuracy. However, when I run model.predict() on that same validation generator, the accuracy falls to 1%.
The model is a multiclass CNN compiled with categorical cross-entropy loss and the accuracy metric, which should default to categorical accuracy. (Edit: changed to categorical_accuracy anyway.)
# Compile
learning_rate = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=initial_lr,
    decay_steps=steps,
    end_learning_rate=end_lr)
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate),
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

# Validation set evaluation
val_loss, val_accuracy = model.evaluate(val_generator,
                                        steps=int(val_size/bs)+1)
print('Accuracy: {}'.format(val_accuracy))

# Validation set predict
y_val = val_generator.classes
pred = model.predict(val_generator,
                     verbose=1,
                     steps=int(val_size/bs)+1)
accuracy_TTA = np.mean(np.equal(y_val, np.argmax(pred, axis=-1)))
print('Accuracy: {}'.format(accuracy_TTA))
The problem of differing accuracy values from model.evaluate and model.predict seems to be solved by creating separate instances of ImageDataGenerator(), but with the same seed.
Also, after a KeyboardInterrupt during training or after loading a checkpoint, the generator instance should be re-initialised, since the same problem can occur.
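A rough sketch of that workaround (the directory path, image size, rescaling and the shuffle=False flag on the prediction generator are assumptions added here, not taken from the original post):
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

bs = 32  # batch size placeholder

# Two separate generator instances, built with the same seed and split.
eval_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
pred_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

eval_gen = eval_datagen.flow_from_directory('data/', subset='validation',
                                            target_size=(224, 224), batch_size=bs,
                                            class_mode='categorical', seed=42)
pred_gen = pred_datagen.flow_from_directory('data/', subset='validation',
                                            target_size=(224, 224), batch_size=bs,
                                            class_mode='categorical', seed=42,
                                            shuffle=False)  # keep file order fixed so .classes matches predictions

val_loss, val_accuracy = model.evaluate(eval_gen)
pred = model.predict(pred_gen)
accuracy = np.mean(np.equal(pred_gen.classes, np.argmax(pred, axis=-1)))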

logistic regression predicts 1 for all samples

I am trying to train a logistic regression model with data as follows:
Categorical Variable: either 0 or 1
Numerical Variables: Continuous number between 8 and 20
I have 20 numerical variables and I want to only use one at a time for the predicting model, and see which is the best feature to use.
The code I'm using is:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

for variable in numerical_variable:
    X = data[[variable]]
    y = data[categorical_variable]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
    logreg = LogisticRegression()
    logreg.fit(X_train, y_train)
    y_pred = logreg.predict(X_test)
    print(y_pred)
    cnf_matrix = metrics.confusion_matrix(y_test, y_pred)
    print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
    print("Precision:", metrics.precision_score(y_test, y_pred))
    print("Recall:", metrics.recall_score(y_test, y_pred))
The categorical variable is biased towards 1: there are about 800 1s to 200 0s. So I think this is why it always predicts 1, regardless of the test samples (if I don't set random_state=0) and regardless of the numerical variable.
(using python 3)
Any thoughts on how to fix this?
Thanks
Use the joblib library to save your model:
import joblib
your_model = LogisticRegression()
your_model.fit(X_train, y_train)
filename = 'finalized_model.sav'
joblib.dump(your_model, filename)
This code will save your model as 'finalized_model.sav'. The extension doesn't matter; you can even omit it.
Then you can load the exact same fitted model with this code and get the same predictions every time.
your_loaded_model = joblib.load('finalized_model.sav')
As a prediction example:
your_loaded_model.predict(X_test)
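The answer above is about persisting the model rather than about the class imbalance raised in the question. One common remedy for the imbalance itself (not suggested in this thread, so treat it as an assumption) is LogisticRegression's class_weight option:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# `data`, `variable` and `categorical_variable` are the names used in the question.
X = data[[variable]]
y = data[categorical_variable]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

# class_weight='balanced' re-weights classes inversely to their frequency,
# so the ~200 zeros count as much as the ~800 ones during fitting.
logreg = LogisticRegression(class_weight='balanced')
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
print(metrics.classification_report(y_test, y_pred))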

Which loss function and metrics to use for multi-label classification with very high ratio of negatives to positives?

I am training a multi-label classification model for detecting attributes of clothes. I am using transfer learning in Keras, retraining the last few layers of the vgg-19 model.
The total number of attributes is 1000 and about 99% of them are 0s. Metrics like accuracy, precision, recall, etc. all fail, as the model can predict all zeroes and still achieve a very high score. As for loss functions, binary cross-entropy, Hamming loss, etc. haven't worked either.
I am using the deep fashion dataset.
So, which metrics and loss functions can I use to measure my model correctly?
What hassan has suggested is not correct -
Categorical Cross-Entropy loss or Softmax Loss is a Softmax activation plus a Cross-Entropy loss. If we use this loss, we will train a CNN to output a probability over the C classes for each image. It is used for multi-class classification.
What you want is multi-label classification, so you will use Binary Cross-Entropy Loss or Sigmoid Cross-Entropy loss. It is a Sigmoid activation plus a Cross-Entropy loss. Unlike Softmax loss it is independent for each vector component (class), meaning that the loss computed for every CNN output vector component is not affected by other component values. That’s why it is used for multi-label classification, where the insight of an element belonging to a certain class should not influence the decision for another class.
Now, for handling class imbalance, you can use a weighted Sigmoid Cross-Entropy loss, so that wrong predictions are penalized based on the number/ratio of positive examples.
Actually, you should use tf.nn.weighted_cross_entropy_with_logits.
It is not only meant for multi-label classification, it also has a pos_weight argument that lets you put as much emphasis on the positive classes as you need.
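A minimal sketch of how that could be wired into Keras (the pos_weight value and the assumption that the final layer outputs raw logits are illustrative, not from the answer above):
import tensorflow as tf

POS_WEIGHT = 10.0  # illustrative: values > 1 up-weight the rare positive labels

def weighted_bce_with_logits(y_true, y_pred_logits):
    # Assumes the model's final Dense layer has no activation (outputs logits).
    loss = tf.nn.weighted_cross_entropy_with_logits(
        labels=tf.cast(y_true, tf.float32),
        logits=y_pred_logits,
        pos_weight=POS_WEIGHT)
    return tf.reduce_mean(loss, axis=-1)  # average over the label dimension

model.compile(optimizer='adam', loss=weighted_bce_with_logits)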
Multi-class and binary-class classification determine the number of output units, i.e. the number of neurons in the final layer.
Multi-label vs. single-label determines which activation function for the final layer and which loss function you should use.
For single-label, the standard choice is Softmax with categorical cross-entropy; for multi-label, switch to Sigmoid activations with binary cross-entropy.
Categorical Cross-Entropy: $L(y, \hat{y}) = -\sum_{c=1}^{C} y_c \log(\hat{y}_c)$
Binary Cross-Entropy: $L(y, \hat{y}) = -\sum_{c=1}^{C} \left[ y_c \log(\hat{y}_c) + (1 - y_c) \log(1 - \hat{y}_c) \right]$
In both cases, the cost over a mini-batch is $J = \frac{1}{m} \sum_{i=1}^{m} L(y^{(i)}, \hat{y}^{(i)})$.
C is the number of classes, and m is the number of examples in the current mini-batch. L is the loss function and J is the cost function.
In the loss function, you are iterating over different classes. In the cost function, you are iterating over the examples in the current mini-batch.
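To make the single-label vs multi-label choice concrete, here is a minimal Keras head for the multi-label case in the question (the backbone and the AUC metric are placeholder assumptions; the question retrains VGG-19 instead):
import tensorflow as tf
from tensorflow.keras import layers

num_attributes = 1000  # from the question: 1000 binary attributes per image

# Placeholder backbone; the question retrains the last layers of VGG-19 instead.
inputs = tf.keras.Input(shape=(224, 224, 3))
x = layers.Conv2D(32, 3, activation='relu')(inputs)
x = layers.GlobalAveragePooling2D()(x)

# Multi-label head: one independent sigmoid per attribute + binary cross-entropy.
outputs = layers.Dense(num_attributes, activation='sigmoid')(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.AUC(multi_label=True)])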
You can refer to this GitHub repository. It has binary, multi-class and multi-label losses, and also options to force the model to learn values close to 0 and 1, or to simply learn probabilities.
https://github.com/monkeyDemon/AI-Toolbox/blob/master/computer_vision/image_classification_keras/loss_function/focal_loss.py
Steve
I have been in a similar situation to yours.
You can use a softmax activation function in the output layer with categorical_crossentropy. To check other metrics such as precision, recall and F1 score, you can use the sklearn library as follows:
import numpy as np
from sklearn.metrics import classification_report

y_pred = model.predict(x_test, batch_size=64, verbose=1)
y_pred_bool = np.argmax(y_pred, axis=1)
print(classification_report(y_test, y_pred_bool))
As for the training stage, as far as I know, there is the accuracy metric, as follows:
model.compile(loss='categorical_crossentropy',
              metrics=['acc'], optimizer='adam')
If it helps, you can plot the training history for loss and accuracy using matplotlib as follows:
import matplotlib.pyplot as plt

hist = model.fit(x_train, y_train, batch_size=24, epochs=1000, verbose=2,
                 callbacks=[checkpoint],
                 validation_data=(x_valid, y_valid))
# Plot training & validation accuracy values
plt.plot(hist.history['acc'])
plt.plot(hist.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
# Plot training & validation loss values
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

How to get the model loss in sklearn

Whenever an sklearn model is fit to some data, it minimizes some loss function. How can I obtain the model loss using that loss function?
e.g.
model = sklearn.linear_model.LogisticRegression().fit(X_train,y_train)
model.get_loss(X_train, y_train) #gives the loss for these values
model.get_loss(X_test, y_test) #gives the loss for other values
Note that the .score method does NOT do this thing.
LogisticRegression minimises log loss, so you might expect .score to return the loss (negated); however, .score actually returns the mean accuracy.
To calculate the log loss you need to use the log_loss metric:
I haven't tested it, but something like this:
import sklearn.linear_model
from sklearn.metrics import log_loss

model = sklearn.linear_model.LogisticRegression().fit(X_train, y_train)
loss = log_loss(y_test, model.predict_proba(X_test), eps=1e-15)
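If you want something closer to the get_loss interface imagined in the question, a small helper (a sketch; there is no such method in scikit-learn itself) could wrap the same call:
from sklearn.metrics import log_loss

def get_loss(fitted_model, X, y):
    # Mean log loss of a fitted classifier on (X, y).
    return log_loss(y, fitted_model.predict_proba(X))

train_loss = get_loss(model, X_train, y_train)
test_loss = get_loss(model, X_test, y_test)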

Relationship between sklearn .fit() and .score()

While working with a linear regression model I split the data into a training set and test set. I then calculated R^2, RMSE, and MAE using the following:
lm.fit(X_train, y_train)
R2 = lm.score(X,y)
y_pred = lm.predict(X_test)
RMSE = np.sqrt(metrics.mean_squared_error(y_test, y_pred))
MAE = metrics.mean_absolute_error(y_test, y_pred)
I thought that I was calculating R^2 for the entire data set (instead of comparing the training and original data). However, I learned that you must fit the model before you score it, therefore I'm not sure if I'm scoring the original data (as inputted in R2) or the data that I used to fit the model (X_train, and y_train). When I run:
lm.fit(X_train, y_train)
lm.score(X_train, y_train)
I get a different result than what I got when I was scoring X and y. So my question is: are the inputs to .score evaluated against the model that was fitted (making lm.fit(X, y); lm.score(X, y) the R^2 value for the original data, and lm.fit(X_train, y_train); lm.score(X, y) the R^2 value for the original data based on the model created in .fit), or is something else entirely happening?
fit() only fits the data, which is synonymous with training: fitting the data means training the model.
score() is more like testing or predicting: it evaluates the fitted model on the data you pass to it.
So you should use different datasets for training the classifier and for testing its accuracy.
You can do it like this:
from sklearn import neighbors
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = neighbors.KNeighborsClassifier()
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
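To make the behaviour in the original question concrete: score() always evaluates the already-fitted model on whatever data you pass it, so the three calls below can all return different R^2 values (a sketch with made-up placeholder data):
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the question's X and y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

lm = LinearRegression().fit(X_train, y_train)   # fit only on the training split
print(lm.score(X_train, y_train))  # R^2 on the data the model was fitted on
print(lm.score(X_test, y_test))    # R^2 on held-out data
print(lm.score(X, y))              # R^2 on the full dataset, using the same fitted model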
