Find predicted classification on validation data set - python

I am working on a neural network using TensorFlow that takes in feature vectors of length 1476 and attempts to classify each feature vector into one of 6 categories (labels). The NN itself works, but I have been unable to find out how to get the predicted labels on the validation data set. For reference, here is my neural network:
#define the model
model = Sequential()
model.add(Dense(units=100,activation='tanh', input_dim=1476))
And then here is the model being fitted with the validation data set included (Note that I recognize fit_transform should be converted to fit method):
The generator and validationGenerator methods just fetch the training/validation data and labels to be fed in. When the model is run, I can see the standard output per epoch, containing the training accuracy and loss along with the validation accuracy and loss.
My question is whether there is a way to see more detail behind the validation accuracy. I want to see the predicted outputs vs. the ground truths for each feature vector in the validation data set, per epoch. Is this at all possible? I want to check whether my model is classifying heavily towards only a few specific labels.
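One way to get this is to call model.predict on the validation features at the end of each epoch (for example from the on_epoch_end method of a custom Keras callback) and compare the argmax of each output row with the true label. The sketch below shows only that comparison step in plain Python. The sample scores and the summarize_predictions helper are made up for illustration and are not part of Keras.

```python
from collections import Counter


def argmax(values):
    """Return the index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def summarize_predictions(probabilities, true_labels, num_classes):
    """Compare predicted labels with ground truth.

    probabilities: one list of class scores per validation sample
    true_labels:   one integer class index per validation sample
    Returns (predicted_labels, confusion), where confusion[t][p] counts
    the samples whose true class is t and whose predicted class is p.
    """
    predicted_labels = [argmax(row) for row in probabilities]
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for truth, predicted in zip(true_labels, predicted_labels):
        confusion[truth][predicted] += 1
    return predicted_labels, confusion


# Hypothetical scores for 4 validation samples and 3 classes.
example_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
example_truth = [0, 1, 2, 1]

predicted, confusion = summarize_predictions(example_probs, example_truth, 3)
print("predicted labels:", predicted)
print("predicted label counts:", Counter(predicted))
for truth, row in enumerate(confusion):
    print("true class", truth, "-> predicted counts", row)
```

A column of the confusion table that collects most of the counts would indicate that the model favours that label.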


What are training accuracy and training loss, and why do we need to compute them?

I am new to LSTMs and machine learning and I am trying to understand some of their concepts. Below is the code of my LSTM model.
LSTM model:
model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=max_length-1))
model.add(Dense(vocab_size, activation='softmax'))
early_stopping = EarlyStopping(monitor='val_loss', patience=42)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, y, validation_split=0.2, epochs=500, verbose=2, batch_size=20)
Below is a sample of my output:
And the train/test accuracy and train/test loss diagrams:
My understanding (and please correct me if I am wrong) is that val_loss and val_accuracy are the loss and accuracy on the test data. My question is: what are the train accuracy and train loss, and how are these values computed? Thank you.
1. loss and val_loss-
In deep learning, the loss is the value that a neural network is trying to minimize. That is how a neural network learns by adjusting weights and biases in a manner that reduces the loss.
loss and val_loss differ because the former is computed on the training set and the latter on the held-out validation set (here, the 20% set aside by validation_split=0.2). As such, the latter is a good indication of how the model performs on unseen data.
2. accuracy and val_accuracy-
Once again, acc is computed on the training data and val_acc on the validation data. It's best to rely on val_acc for a fair representation of model performance, because a network with enough capacity can end up fitting the training data at 100% while still performing poorly on unseen data.
Training should be stopped when val_acc stops increasing; otherwise your model will probably overfit. You can use the EarlyStopping callback to stop training.
3. Why do we need train accuracy and loss?
On its own, training accuracy is not a meaningful evaluation metric, because a neural network with sufficient parameters can essentially memorize the labels of the training data and then perform no better than random guessing on previously unseen examples.
However, it can be useful to monitor the accuracy and loss at some fixed interval during training as it may indicate whether the backend is functioning as expected and if the training process needs to be stopped.
Refer here for a detailed explanation about earlystopping.
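To make the stopping rule concrete, here is a minimal sketch of the patience logic in plain Python. It is a simplification and not the Keras implementation: it ignores options such as min_delta and restore_best_weights, and the early_stopping_epoch helper and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best value: remember it and reset the counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.64]
stop_epoch = early_stopping_epoch(val_losses, patience=3)
print("training would stop at epoch index:", stop_epoch)
```

With patience=42, as in the question, the model is allowed 42 epochs in a row without a new best val_loss before training stops.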
4. How are accuracy and loss calculated?
Loss and accuracy are calculated as you train, according to the loss and metrics specified in compiling the model. Before you train, you must compile your model to configure the learning process. This allows you to specify the optimizer, loss function, and metrics, which in turn are how the model fit function knows what loss function to use, what metrics to keep track of, etc.
The loss function (like binary cross entropy) documentation can be found here and the metrics (like accuracy) documentation can be found here.
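As a rough illustration of what those two numbers mean for the categorical_crossentropy loss and accuracy metric used in the question, the sketch below computes both by hand for a tiny batch. The targets and predicted probabilities are made up, and real implementations also guard against log(0), which is omitted here.

```python
import math


def categorical_crossentropy(one_hot_targets, probabilities):
    """Mean over the samples of -sum(target * log(probability))."""
    total = 0.0
    for target, probs in zip(one_hot_targets, probabilities):
        total += -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)
    return total / len(one_hot_targets)


def accuracy(one_hot_targets, probabilities):
    """Fraction of samples whose highest-probability class is the target class."""
    correct = 0
    for target, probs in zip(one_hot_targets, probabilities):
        if probs.index(max(probs)) == target.index(max(target)):
            correct += 1
    return correct / len(one_hot_targets)


# Hypothetical batch: 3 samples, 3 classes.
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
probs = [[0.8, 0.1, 0.1], [0.3, 0.5, 0.2], [0.6, 0.3, 0.1]]

loss = categorical_crossentropy(targets, probs)
acc = accuracy(targets, probs)
print("loss:", loss)
print("accuracy:", acc)
```

The third sample is misclassified, so it lowers the accuracy and, because the true class only received probability 0.1, it also contributes the largest term to the loss.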

Why am I getting 100% accuracy using feed-forward neural networks for separate training, validation, and testing datasets in Keras?

Today I was working on a classifier to detect whether or not a mushroom was poisonous given its features. The data was in a .csv file (read into a pandas DataFrame), and the link to the data can be found at the end.
I used scikit-learn's train_test_split function to split the data into training and testing sets.
I then removed the column that specified whether or not the mushroom was poisonous from the training and testing inputs and assigned it to the yTrain and yTest variables as the labels.
I then applied one-hot encoding (using pd.get_dummies()) to the data, since the parameters were categorical.
After this, I normalized the training and testing input data.
Essentially, the training and testing input data was a distinct list of one-hot-encoded parameters, and the output data was a list of ones and zeros representing the output (one meant poisonous, zero meant edible).
I used Keras and a simple feed-forward network for this project. This network comprises three layers: a simple Dense layer (Linear layer for PyTorch users) with 300 neurons, a Dense layer with 100 neurons, and a Dense layer with two neurons, each representing the probability that the given parameters of the mushroom signified it was poisonous or edible. Adam was the optimizer I used, and sparse categorical cross-entropy was my loss function.
I trained my network for 60 epochs. After about 5 epochs the loss was basically zero, and my accuracy was 1. After training, I was worried that my network had overfitted, so I tried it on my distinct testing data. The results were the same as the training and validation data; the accuracy was at 100% and my loss was negligible.
My validation loss at the end of 50 epochs is 2.258996e-07, and my training loss is 1.998715e-07. My testing loss was 4.732502e-09. I am really confused by this; is the loss supposed to be this low? I don't think I am overfitting, and my validation loss is only a bit higher than my training loss, so I don't think I am underfitting either.
Do any of you know the answer to this question? I am sorry if I had messed up in a silly way of some sort.
Link to dataset:
It seems that the Kaggle dataset is solvable, in the sense that you can create a model which gives the correct answer 100% of the time (if these results are to be believed). If you look at those results, you can see that the author was able to find models which give 100% accuracy using several methods, including decision trees.

Validation loss is zero on first epoch only

I am trying to build a regression model in TensorFlow using the dataset and Keras APIs. The target contains quite a lot of zeros, and the non-zero values are roughly normally distributed, though all are positive.
When I try to create the linear model below, I notice that on the first epoch the validation loss and both of the validation metrics are exactly 0, though the training loss and metrics are not. This problem disappears after the first epoch. Though this doesn't keep me from creating the model, I still cannot explain why this happens or what I might do about it.
Tried so far
What I have tried, unsuccessfully, to pinpoint where the problem lies:
Swapping train and test
A smaller batch size (32 instead of 255)
Shuffling test
Setting the initializers of the dense layer to normal
My code
train =
test = # a smaller csv file
model = Sequential([
Dense(1, activation='linear') # we are doing a regression
metrics=['mean_squared_error', 'mean_absolute_error']
This is completely normal, because the normal metrics and the validation metrics are computed at different times.
Normal metrics are computed during the epoch, so you can see the loss and other values that come from training.
Validation metrics are used to see whether the parameters computed during a certain epoch are good. This means that some of the training data used in a specific epoch (you can specify the percentage) is not used for training but is used instead to compute the network's accuracy. So these values are computed at the end of the epoch, when you have a certain parameter configuration.

Cross validation with CNN

I would like to know whether my code is doing what I want it to do. To give you some background, I'm implementing a CNN for image classification. I'm trying to use cross-validation to compare my different neural network architectures.
Here is the code:
def create_model():
    model = Sequential()
    model.add(Dense(128, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(12, activation='softmax'))
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
model = KerasClassifier(build_fn=create_model, epochs=5, batch_size=20, verbose=1)
# 3-Fold Crossvalidation
kfold = KFold(n_splits=3, shuffle=True, random_state=2019)
results = cross_val_score(model, train_X, train_Y_one_hot, cv=kfold)
model.fit(train_X, train_Y_one_hot, validation_data=(valid_X, valid_label), class_weight=class_weights)
y_pred = model.predict(test_X)
test_eval = model.evaluate(test_X, y_pred, verbose=0)
I found the cross-validation part on the internet, but I have some trouble understanding it.
My questions: 1 => Can I use cross-validation to improve my accuracy? For example, I run my neural network 10 times and my model keeps the weights from the run where the best accuracy occurred.
2 => If I understand correctly, in the code above, results runs my CNN 3 times and shows me the accuracy. But when I use model.fit, the model is run only one time. Am I right?
Thanks for your help
Not really. Cross-validation is more a way to guard against overfitting and to avoid being misled by abnormal results from a badly split dataset, so that you get a relevant estimate of your model's performance. If you want to tune the hyperparameters of your model, you would do better to use sklearn.model_selection.GridSearchCV or sklearn.model_selection.RandomizedSearchCV.
When you call cross_val_score, sklearn does a fit and then a predict/evaluate for each train/test split. Each split uses a new instance of the model, so you have 1 fit and then 1 predict/evaluate per instance. Otherwise your cross-validation would not be valid, because it would depend on fitting on a previous dataset (and maybe on test data!).
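The "new instance per split" idea can be sketched without any library. The code below is only an illustration under stated assumptions: k_fold_indices and NearestMeanModel are hypothetical stand-ins (not sklearn or Keras classes), the folds are not shuffled, and the data is made up.

```python
def k_fold_indices(num_samples, num_folds):
    """Yield (train_indices, test_indices) for each fold, without shuffling."""
    fold_sizes = [num_samples // num_folds] * num_folds
    for i in range(num_samples % num_folds):
        fold_sizes[i] += 1
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, num_samples))
        yield train_idx, test_idx
        start += size


class NearestMeanModel:
    """Toy 1-D classifier: predicts the class whose training mean is closest."""

    def fit(self, xs, ys):
        self.means = {}
        for label in set(ys):
            members = [x for x, y in zip(xs, ys) if y == label]
            self.means[label] = sum(members) / len(members)
        return self

    def predict(self, x):
        return min(self.means, key=lambda label: abs(x - self.means[label]))

    def score(self, xs, ys):
        hits = sum(1 for x, y in zip(xs, ys) if self.predict(x) == y)
        return hits / len(ys)


# Made-up data: class 0 clusters near 1.0, class 1 near 3.0.
xs = [1.0, 1.2, 3.1, 0.8, 2.9, 3.3, 1.1, 3.0, 0.9]
ys = [0, 0, 1, 0, 1, 1, 0, 1, 0]

scores = []
for train_idx, test_idx in k_fold_indices(len(xs), 3):
    model = NearestMeanModel()  # fresh, unfitted model for every fold
    model.fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    scores.append(model.score([xs[i] for i in test_idx], [ys[i] for i in test_idx]))

print("per-fold accuracy:", scores)
```

Because a new model is created inside the loop, nothing learned on one fold's training data can leak into the evaluation of another fold.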
There are two key terms here that you should get familiarized with:
Hyperparameters control the general architecture of a model. These are what the programmer or data scientist controls. In case of a CNN, this refers to the number of layers, their configurations, activations, optimizers etc. For a simple polynomial regression model this would be the degree of the polynomial.
Parameters refer to the actual values of weights or coefficients that the model ends up with after it solves the optimization using gradient descent or whatever method you use. In a CNN this would be the weights matrix for each layer. For a polynomial regression this would be the coefficients and bias.
Cross validation is used to find the best set of hyperparameters. The best set of parameters is obtained by the optimizer (gradient descent, Adam, etc.) for a given set of hyperparameters and data.
To answer your questions:
You would run cross validation several times, each time with a different hyperparameter configuration (network architecture). That's the only thing you can control. At the end you pick the best architecture based on accuracy. The weights of the model would be different for each fold but finding the best weights is the optimizer's job, not yours.
Yes. In 3-fold CV, the model is trained 3 times and evaluated 3 times. When you make the separate call afterwards, you are making predictions only once, on a new dataset.
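To illustrate picking a hyperparameter by cross-validation, here is a small sketch that compares several values of k for a toy k-nearest-neighbours classifier and keeps the one with the best mean fold accuracy. Everything in it is an assumption made for illustration: the data is made up, knn_predict and cross_val_accuracy are hypothetical helpers, and k stands in for the network architecture you would vary in practice.

```python
def knn_predict(train_xs, train_ys, x, k):
    """Majority label among the k training points closest to x."""
    by_distance = sorted(zip(train_xs, train_ys), key=lambda pair: abs(pair[0] - x))
    votes = [label for _, label in by_distance[:k]]
    return max(set(votes), key=votes.count)


def cross_val_accuracy(xs, ys, k, num_folds=3):
    """Mean accuracy of the k-NN classifier over interleaved folds."""
    fold_scores = []
    for fold in range(num_folds):
        pairs = list(enumerate(zip(xs, ys)))
        train = [pair for i, pair in pairs if i % num_folds != fold]
        test = [pair for i, pair in pairs if i % num_folds == fold]
        train_xs = [x for x, _ in train]
        train_ys = [y for _, y in train]
        hits = sum(1 for x, y in test if knn_predict(train_xs, train_ys, x, k) == y)
        fold_scores.append(hits / len(test))
    return sum(fold_scores) / num_folds


# Made-up 1-D data: low values are mostly class 0, high values are class 1,
# with one deliberately noisy label at x = 3.
xs = [1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16]
ys = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]

candidates = [1, 3, 5]  # hyperparameter values being compared (odd, so votes cannot tie)
mean_scores = {k: cross_val_accuracy(xs, ys, k) for k in candidates}
best_k = max(candidates, key=lambda k: mean_scores[k])

print("mean CV accuracy per k:", mean_scores)
print("selected k:", best_k)
```

Only the hyperparameter is chosen this way. After selecting it, you would train one final model with that configuration and let the optimizer find the weights.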

Different accuracy by fit() and evaluate() in Keras with the same dataset

I wrote Keras code to train GoogleNet. However, the accuracy reported by fit() is 100%, yet when the same training dataset is passed to evaluate(), the accuracy is only 25%, which is a huge discrepancy! Also, unlike with fit(), the accuracy from evaluate() does not improve with more training; it stays at almost 25%.
Does anyone have an idea of what is wrong in this situation?
# Training dataset and labels are given. Here, load the GoogleNet model
from keras.models import load_model
model = load_model('FT_InceptionV3.h5')
# Training Phase,
#Testing Phase
train_loss , train_acc=model.evaluate(X_train, y_train, verbose=1)
print("Train loss=",train_loss,"Train accuracy",train_acc)
Training Result
Testing Result
After some digging into Keras issues, I found this.
The reason for this is that when you use fit, the weights are updated at each batch of the training data. The loss value returned by the fit method is not the mean of the loss of the final model, but the mean of the losses of all the slightly different models used on each batch.
On the other hand, when you use evaluate, the same model is used on the whole dataset. And this model never actually appears in the loss of the fit method, since even at the last batch of training, the loss computed is used to update the model's weights.
To sum everything up, fit and evaluate have two completely different behaviours.
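The averaging effect described above can be reproduced with a toy model in plain Python. This is only a sketch under made-up assumptions (a one-weight linear model, two batches, a hand-written gradient step); it is not how Keras is implemented internally, but it shows why the loss averaged during an epoch differs from the loss of the final weights.

```python
def mse(weight, batch):
    """Mean squared error of the model y = weight * x on a list of (x, y) pairs."""
    return sum((weight * x - y) ** 2 for x, y in batch) / len(batch)


def mse_gradient(weight, batch):
    """Derivative of the batch MSE with respect to the weight."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)


# Made-up data following y = 2x, split into two batches.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
batches = [data[0:2], data[2:4]]

weight = 0.0
learning_rate = 0.05
batch_losses = []
for batch in batches:  # one epoch
    # Loss is measured with the weights as they are BEFORE this batch's update.
    batch_losses.append(mse(weight, batch))
    weight -= learning_rate * mse_gradient(weight, batch)

reported_loss = sum(batch_losses) / len(batch_losses)  # average over the epoch, fit-style
evaluated_loss = mse(weight, data)  # final weights on the whole dataset, evaluate-style

print("loss averaged during the epoch:", reported_loss)
print("loss of the final weights:", evaluated_loss)
```

Here the two numbers differ even though both are computed on the same data, because each batch loss was measured with a different, earlier version of the weight.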

