Trying to add AUC-ROC score into CNN training - Python

My current CNN has relatively high accuracy but a low AUC score, so I want to train my model considering both accuracy and AUC. However, when I try to add 'auc' as a second metric, training never gets past the start of the first epoch.
This is the error message I am getting:
FailedPreconditionError: Error while reading resource variable conv2d_4/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/conv2d_4/kernel/N10tensorflow3VarE does not exist.
[[{{node conv2d_4/Conv2D/ReadVariableOp}}]]
I have tried the auc function provided in previous discussions; sorry, I can't find the post now.
import tensorflow as tf
import keras
from keras import models, layers
from keras import backend as K

def auc(y_true, y_pred):
    auc = tf.metrics.auc(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc
auc_model = models.Sequential()
auc_model.add(layers.Conv1D(kernel_size=200, filters=10, input_shape=(1644, 1), activation='relu'))
auc_model.add(layers.MaxPooling1D(pool_size=50, strides=10))
auc_model.add(layers.Reshape((40, 35, 1)))
auc_model.add(layers.Conv2D(16, (3, 3), activation='relu'))
auc_model.add(layers.Conv2D(16, (3, 3), activation='relu'))
auc_model.add(layers.MaxPooling2D((2, 2)))
auc_model.add(layers.Flatten())
auc_model.add(layers.Dense(32, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
auc_model.add(layers.Dropout(rate=0.2))
auc_model.add(layers.Dense(1, activation='sigmoid'))

auc_model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy', auc])
auc_model.summary()
from tensorflow.keras.callbacks import EarlyStopping

target = y_tr.columns[0]
rows_tr = np.isfinite(y_tr[target]).values
rows_te = np.isfinite(y_te[target]).values
x_train = x_tr[rows_tr].reshape((x_tr[rows_tr].shape[0], 1644, 1))
x_test = x_te[rows_te].reshape((x_te[rows_te].shape[0], 1644, 1))

auc_model.fit(x_train, y_tr[target][rows_tr],
              validation_data=(x_test, y_te[target][rows_te]), epochs=5)

print('\n# Evaluate on test data')
results = auc_model.evaluate(x_test, y_te[target][rows_te], batch_size=8, verbose=1)
I want to start my training process considering both accuracy and AUC score. Thanks.

A metric is only used to report an evaluation of your model at each epoch; it does not change anything about the training itself.
If you want your model to take the AUC into account, you have to modify the loss. Minimizing the binary_crossentropy loss naturally maximizes accuracy without regard to the AUC. This becomes especially problematic when you have an imbalanced data set, e.g. one heavily skewed class.
If you really only want it as a metric, you can see this post:
How to compute Receiving Operating Characteristic (ROC) and AUC in keras?
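For reference, a minimal sketch of the metric-only route, assuming a recent TensorFlow with tf.keras available (the built-in tf.keras.metrics.AUC keeps its own state and avoids the uninitialized-variable error that the hand-rolled auc function above triggers):

import tensorflow as tf

auc_model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy', tf.keras.metrics.AUC(name='auc')])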
But if you truly want your model to maximize the AUC, you should write a custom loss function in Keras and pass it as the loss of your model.
There is a good discussion here:
https://www.kaggle.com/c/invasive-species-monitoring/discussion/32762
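As a rough illustration of the idea discussed in that thread (a sketch, not the exact code from it), a pairwise ranking surrogate can stand in for the non-differentiable AUC. It assumes binary 0/1 labels and a single sigmoid output, and batches that contain only one class contribute nothing:

import tensorflow as tf

def pairwise_auc_loss(y_true, y_pred):
    # Soft-AUC surrogate: push every positive score above every negative score.
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred = tf.reshape(y_pred, [-1])
    pos = tf.boolean_mask(y_pred, tf.equal(y_true, 1.0))   # scores of positive examples
    neg = tf.boolean_mask(y_pred, tf.equal(y_true, 0.0))   # scores of negative examples
    # difference of every positive score against every negative score; AUC wants these > 0
    diff = tf.expand_dims(pos, 1) - tf.expand_dims(neg, 0)
    # smooth version of the 0/1 ranking loss; the epsilon guards single-class batches
    return tf.reduce_sum(tf.sigmoid(-diff)) / (tf.cast(tf.size(diff), tf.float32) + 1e-7)

You could then compile with loss=pairwise_auc_loss (or a weighted sum of it and binary_crossentropy) and keep AUC as a metric for monitoring.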

Related

Transfer Learning model with Keras: using other metrics than accuracy

I'm working on a binary classification model for leaves from the Swedish leaves dataset and thought transfer learning could be practical. I found this tutorial, but in the compile function I want to use metrics other than accuracy. When I try to get AUC or FP/FN/TP/TN, a ValueError is raised, claiming that the shape of the true y, (None, 1), and the shape of y_pred, (None, 2), are incompatible.
I fail to understand:
why would y_pred have this shape?
how can the accuracy be calculated, but not the parts of the confusion matrix?!
A solution without a reasoned explanation is also very welcome :)
import tensorflow as tf
import tensorflow_hub as hub

feature_extractor_model = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
pretrained_model_without_top_layer = hub.KerasLayer(
    feature_extractor_model, input_shape=(224, 224, 3), trainable=False)

classes_num = 2
model = tf.keras.Sequential([
    pretrained_model_without_top_layer,
    tf.keras.layers.Dense(classes_num)
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[['acc'], [tf.keras.metrics.TruePositives(), tf.keras.metrics.FalsePositives(),
                       tf.keras.metrics.TrueNegatives(), tf.keras.metrics.FalseNegatives()]])

model.fit(X_train_scaled, y_train, steps_per_epoch=9, epochs=5)
If you have two classes (e.g. cats and dogs), you can encode them either sparsely, as zero or one, or one-hot, as [0, 1] and [1, 0].
Your training data is sparsely encoded, which is why your loss is SparseCategoricalCrossentropy, while the model's final Dense(2) layer makes y_pred come out with shape (None, 2). Metrics are functionally just like losses, so any metric you use needs to accept the sparse labels too. In your case, write a "custom" metric wrapper that takes the sparse y_true, one-hots it (or picks out the positive-class probability), and passes the result on to the true-positive/false-positive/etc. metric.
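A minimal sketch of that wrapper idea, assuming TF 2.x and the two-unit logit output above (SparseAUC is a hypothetical name, not a built-in Keras class):

import tensorflow as tf

class SparseAUC(tf.keras.metrics.AUC):
    # Accepts sparse 0/1 labels plus 2-unit logits and delegates to the built-in AUC.
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_prob = tf.nn.softmax(y_pred)[:, 1]                    # probability of class 1 (model outputs logits)
        y_true = tf.cast(tf.reshape(y_true, [-1]), tf.float32)  # flatten (None, 1) labels
        return super().update_state(y_true, y_prob, sample_weight)

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["acc", SparseAUC(name="auc")])

The same pattern works for TruePositives and friends; alternatively, a single sigmoid output trained with binary_crossentropy makes all of the binary metrics work out of the box.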

The difference between mean squared error of train data and test data is very large

I used linear regression to make an ML model but ran into a problem.
These are my result values:
Model1 Training Mean squared error: 154.96
Model1 Test Mean squared error: 72018955075415565139968.00
training score: 0.48
testing score: -236446352571492139008.00
I don't know why these values are printed. Is it because of overfitting?
I am using TensorFlow 1.13.1 and Python 3.7.
This seems to be a case of overfitting.
You can:
Ensure that you are following the data pre-processing steps, i.e. 1. missing-value imputation, 2. handling outliers, 3. scaling or normalizing the features (a short scaling sketch follows this list of suggestions)
Ensure that you are performing feature engineering (removing undesired features, adding meaningful ones)
Shuffle the data by using shuffle=True in model.fit. Code is shown below:
history = cnn_model.fit(x=X_train_reshaped,
                        y=y_train,
                        batch_size=512,
                        epochs=epochs, callbacks=[callback],
                        verbose=1, validation_data=(X_test_reshaped, y_test),
                        validation_steps=10, steps_per_epoch=steps_per_epoch,
                        shuffle=True)
Use Early Stopping. Code is shown below
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15)
Use Dropout
Use regularization. Code for regularization is shown below (you can try l1 regularization or l1_l2 regularization as well):
from tensorflow.keras.regularizers import l2

Regularizer = l2(0.001)
model.add(tf.keras.layers.Dense(units=64, activation='relu',
                                kernel_regularizer=Regularizer,
                                bias_regularizer=Regularizer,
                                activity_regularizer=Regularizer))
model.add(tf.keras.layers.Dropout(0.4))
model.add(tf.keras.layers.Dense(units=10, activation='sigmoid',
                                activity_regularizer=Regularizer,
                                kernel_regularizer=Regularizer))
Last but not least, try removing some layers, as that reduces the number of trainable parameters.
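For the scaling step mentioned above, a minimal sketch with scikit-learn's StandardScaler (fit on the training split only, then applied to both splits; the variable names are placeholders for your own data):

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)   # learn mean/std from the training data only
X_test_scaled = scaler.transform(X_test)         # reuse the same statistics on the test data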
If your test error and testing score still don't improve despite following the above suggestions, please share the complete code so that we can help you.
Hope this helps. Happy learning!

Extracting weights from best Neural Network in Tensorflow/Keras - multiple epochs

I am working on a 1-hidden-layer neural network with 2000 neurons and 8 + constant input neurons for a regression problem.
In particular, as the optimizer I am using RMSprop with learning rate = 0.001, ReLU activation from the input to the hidden layer and linear activation from the hidden layer to the output. I am also using mini-batch gradient descent (32 observations) and running the model for 2000 epochs.
My goal is, after training, to extract the weights from the best neural network out of the 2000 epochs (where, after many trials, the best one is never the last; by best I mean the one that leads to the smallest MSE).
Using save_weights('my_model_2.h5', save_format='h5') actually works, but to my understanding it extracts the weights from the last epoch, while I want those from the epoch in which the NN performed best. Please find the code I have written:
def build_first_NN():
    model = keras.Sequential([
        layers.Dense(2000, activation=tf.nn.relu, input_shape=[len(X_34.keys())]),
        layers.Dense(1)
    ])
    optimizer = tf.keras.optimizers.RMSprop(0.001)
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

first_NN = build_first_NN()
history_firstNN_all_nocv = first_NN.fit(X_34, y_34, epochs=2000)
first_NN.save_weights('my_model_2.h5', save_format='h5')

trained_weights_path = 'C:/Users/Myname/Desktop/otherfolder/Data/my_model_2.h5'
trained_weights = h5py.File(trained_weights_path, 'r')
weights_0 = pd.DataFrame(trained_weights['dense/dense/kernel:0'][:])
weights_1 = pd.DataFrame(trained_weights['dense_1/dense_1/kernel:0'][:])
The weights extracted this way are those from the last of the 2000 epochs: how can I instead get those from the epoch in which the MSE was smallest?
Looking forward to any comment.
EDIT: SOLVED
Building on the received suggestions, and for general interest, here is how I updated my code to meet my goal:
# build_first_NN() as defined before
from tensorflow.keras.callbacks import ModelCheckpoint

first_NN = build_first_NN()

trained_weights_path = 'C:/Users/Myname/Desktop/otherfolder/Data/my_model_2.h5'
checkpoint = ModelCheckpoint(trained_weights_path,
                             monitor='mean_squared_error',
                             verbose=1,
                             save_best_only=True,
                             mode='min')

history_firstNN_all_nocv = first_NN.fit(X_34, y_34,
                                        epochs=2000,
                                        callbacks=[checkpoint])

trained_weights = h5py.File(trained_weights_path, 'r')
weights_0 = pd.DataFrame(trained_weights['model_weights/dense/dense/kernel:0'][:])
weights_1 = pd.DataFrame(trained_weights['model_weights/dense_1/dense_1/kernel:0'][:])
Use the ModelCheckpoint callback from Keras.
from keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint(filepath, monitor='val_mean_squared_error', verbose=1,
                             save_best_only=True, mode='min')
Use this as a callback in your model.fit(). This will always save the model with the best validation result so far (lowest MSE on validation) at the location specified by filepath.
You can find the documentation here.
Of course, you need validation data during training for this. Otherwise, you can check for the lowest training MSE by writing a callback yourself.
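A minimal sketch of that do-it-yourself route, assuming the regression setup above (KeepBestTrainingWeights is just an illustrative name): the callback keeps in memory the weights from the epoch with the lowest training loss and restores them when training ends.

import numpy as np
import tensorflow as tf

class KeepBestTrainingWeights(tf.keras.callbacks.Callback):
    # Remember the weights of the epoch with the lowest training loss.
    def on_train_begin(self, logs=None):
        self.best_loss = np.inf
        self.best_weights = None

    def on_epoch_end(self, epoch, logs=None):
        loss = logs.get('loss')
        if loss is not None and loss < self.best_loss:
            self.best_loss = loss
            self.best_weights = self.model.get_weights()

    def on_train_end(self, logs=None):
        if self.best_weights is not None:
            self.model.set_weights(self.best_weights)

# usage: first_NN.fit(X_34, y_34, epochs=2000, callbacks=[KeepBestTrainingWeights()])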

How to improve neural network prediction (classification)

I am trying to learn some neural networks for fun. I decided to try to classify some Pokemon legendary cards, using a dataset from Kaggle. I read up on documentation and followed the Machine Learning Mastery guides, while reading up on Medium to try to understand the process.
My problem/question: I tried predicting and everything is predicted as "0". I assume that is wrong. Is my 92% accuracy false? I read something about false accuracy online.
Please help!
Some background information: the dataset has 800 rows and 12 columns. I am predicting the last column (true/false). I am using attributes of the data that are numerical and categorical, and I label-encoded the categorical columns. 92% of these cards are False, 8% are True.
I sampled and ran a neural network on 200 cards and got 91% accuracy... I also reset everything and got 92% accuracy on all 800 cards. Am I overfitting?
Thank you for the help in advance.
dataFrame = dataFrame.fillna(value='NaN')
labelencoder = LabelEncoder()
numpy_dataframe = dataFrame.as_matrix()
numpy_dataframe[:, 0] = labelencoder.fit_transform(numpy_dataframe[:, 0])
numpy_dataframe[:, 1] = labelencoder.fit_transform(numpy_dataframe[:, 1])
numpy_dataframe
X = numpy_dataframe[:,0:10]
Y = numpy_dataframe[:,10]
model = Sequential()
model.add(Dense(12, input_dim=10, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=10)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
#this shows that we have 91.88% accuracy with the whole dataframe
dataFrame200False = dataFrame
dataFrame200False['Legendary'] = dataFrame200False['Legendary'].astype(str)
dataFrame200False= dataFrame200False[dataFrame200False['Legendary'].str.contains("False")]
dataFrame65True = dataFrame
dataFrame65True['Legendary'] = dataFrame65True['Legendary'].astype(str)
dataFrame65True= dataFrame65True[dataFrame65True['Legendary'].str.contains("True")]
DataFrameFalseSample = dataFrame200False.sample(200)
DataFrameFalseSample
dataFrameSampledTrueFalse = dataFrame65True.append(DataFrameFalseSample, ignore_index=True)
dataFrameSampledTrueFalse
#label encoding the files
labelencoder = LabelEncoder()
numpy_dataSample = dataFrameSampledTrueFalse.as_matrix()
numpy_dataSample[:, 0] = labelencoder.fit_transform(numpy_dataSample[:, 0])
numpy_dataSample[:, 1] = labelencoder.fit_transform(numpy_dataSample[:, 1])
numpy_dataSample
a = numpy_dataframe[:,0:10]
b = numpy_dataframe[:,10]
model = Sequential()
model.add(Dense(12, input_dim=10, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(a, b, epochs=1000, batch_size=10)
scoresSample = model.evaluate(a, b)
print("\n%s: %.2f%%" % (model.metrics_names[1], scoresSample[1]*100))
dataFramePredictSample = dataFrame.sample(500)
labelencoder = LabelEncoder()
numpy_dataframeSamples = dataFramePredictSample.as_matrix()
numpy_dataframeSamples[:, 0] = labelencoder.fit_transform(numpy_dataframeSamples[:, 0])
numpy_dataframeSamples[:, 1] = labelencoder.fit_transform(numpy_dataframeSamples[:, 1])
Xnew = numpy_dataframeSamples[:,0:10]
Ynew = numpy_dataframeSamples[:,10]
# make a prediction
Y = model.predict_classes(Xnew)
# show the inputs and predicted outputs
for i in range(len(Xnew)):
    print("X=%s, Predicted=%s" % (Xnew[i], Y[i]))
Problem:
The problem is that, as you stated, your dataset is heavily imbalanced. This means that you have a lot more training examples for class 0 than class 1. This causes the network, during training, to develop a heavy bias towards predicting class 0.
Evaluation:
The first thing you should do is not use accuracy as your evaluation measure! My suggestion would be to draw a confusion matrix so that you see exactly what the model is predicting. You could also look into macro-averaging (read this if you're not familiar with the technique).
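For instance, a minimal sketch with scikit-learn, where X_test and y_test are placeholders for a held-out split (which you should also set aside instead of evaluating on the training data):

from sklearn.metrics import confusion_matrix, classification_report

y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()   # threshold the sigmoid outputs
print(confusion_matrix(y_test, y_pred))                       # rows: true class, columns: predicted class
print(classification_report(y_test, y_pred))                  # includes macro-averaged precision/recall/F1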
Dealing with the problem:
There are two ways you can improve the performance of the model:
Resample your data so that it becomes balanced. You have a couple of options here. The most common way is to oversample the minority class (e.g. with SMOTE) until it reaches the population of the majority. Another option is to undersample the majority class (e.g. with Cluster Centroids) so that its population drops to that of the minority.
Use class weights during training. This forces the network to pay more attention to samples from the minority class (read this post for more info). Both options are sketched below.
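A minimal sketch of both options, assuming the imbalanced-learn package is installed for the resampling route; the class-weight numbers below just mirror the roughly 92/8 split mentioned in the question and are placeholders:

# Option 1: oversample the minority class with SMOTE (imbalanced-learn package)
from imblearn.over_sampling import SMOTE

X_resampled, Y_resampled = SMOTE().fit_resample(X, Y)
model.fit(X_resampled, Y_resampled, epochs=150, batch_size=10)

# Option 2: keep the data as-is, but weight the rare class more heavily in the loss
class_weight = {0: 1.0, 1: 92 / 8}   # roughly the inverse of the class frequencies
model.fit(X, Y, epochs=150, batch_size=10, class_weight=class_weight)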

What should I do to get low average loss?

I'm a student in hydraulic engineering, working on a neural network during my internship, so this is something new for me.
I created my neural network, but it gives me a high loss and I don't know what the problem is... you can see the code:
def create_model():
    model = Sequential()
    # Adding the input layer
    model.add(Dense(26, activation='relu', input_shape=(n_cols,)))
    # Adding the hidden layer
    model.add(Dense(60, activation='relu'))
    model.add(Dense(60, activation='relu'))
    model.add(Dense(60, activation='relu'))
    # Adding the output layer
    model.add(Dense(2))
    # Compiling the RNN
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
    return model
kf = KFold(n_splits=5, shuffle=True)
model = create_model()
scores = []

for i in range(5):
    result = next(kf.split(data_input), None)
    input_train = data_input[result[0]]
    input_test = data_input[result[1]]
    output_train = data_output[result[0]]
    output_test = data_output[result[1]]
    # Fitting the RNN to the Training set
    model.fit(input_train, output_train, epochs=5000, batch_size=200, verbose=2)
    predictions = model.predict(input_test)
    scores.append(model.evaluate(input_test, output_test))

print('Scores from each Iteration: ', scores)
print('Average K-Fold Score :', np.mean(scores))
And when I execute my code, the result looks like this:
Scores from each Iteration: [[93.90406122928908, 0.8907562990148529], [89.5892979597845, 0.8907563030218878], [81.26530176050522, 0.9327731132507324], [56.46526102659081, 0.9495798339362905], [54.314151876112994, 0.9579831877676379]]
Average K-Fold Score : 38.0159922589274
Can anyone help me, please? What can I do to make the loss lower?
There are several issues, both with your questions and with your code...
To start with, in general we cannot say that an MSE loss of X value is low or high. Unlike the accuracy in classification problems which is by definition in [0, 1], the loss is not similarly bounded, so there is no general way of saying that a particular value is low or high, as you imply here (it always depends on the specific problem).
Having clarified this, let's go to your code.
First, judging from your loss='mean_squared_error', it would seem that you are in a regression setting, in which accuracy is meaningless; see What function defines accuracy in Keras when the loss is mean squared error (MSE)?. You have not shared what exact problem you are trying to solve here, but if it is indeed a regression one (i.e. prediction of some numeric value), you should get rid of metrics=['accuracy'] in your model compilation, and possibly change your last layer to a single unit, i.e. model.add(Dense(1)).
Second, as your code currently is, you don't actually fit independent models from scratch in each of your CV folds (which is the very essence of CV); in Keras, model.fit works cumulatively, i.e. it does not "reset" the model each time it is called, but it continues fitting from the previous call. That's exactly why if you see your scores, it is evident that the model is significantly better in the later folds (which already gives a hint for improving: add more epochs). To fit independent models as you should do for a proper CV, you should move create_model() inside the for loop.
Third, your usage of np.mean() here is again meaningless, as you average both the loss and the accuracy (i.e. apples with oranges) together; the fact that from 5 values of loss between 54 and 94 you end up with an "average" of 38 should have already alerted you that you are attempting something wrong. Truth is, if you dismiss the accuracy metric, as argued above, you would not have this problem here.
All in all, here is how it seems that your code should be in principle (but again, I have not the slightest idea of the exact problem you are trying to solve, so some details might be different):
def create_model():
    model = Sequential()
    # Adding the input layer
    model.add(Dense(26, activation='relu', input_shape=(n_cols,)))
    # Adding the hidden layer
    model.add(Dense(60, activation='relu'))
    model.add(Dense(60, activation='relu'))
    model.add(Dense(60, activation='relu'))
    # Adding the output layer
    model.add(Dense(1))  # change to 1 unit
    # Compiling the RNN
    model.compile(optimizer='adam', loss='mean_squared_error')  # dismiss accuracy
    return model
kf = KFold(n_splits=5, shuffle=True)
scores = []

for i in range(5):
    result = next(kf.split(data_input), None)
    input_train = data_input[result[0]]
    input_test = data_input[result[1]]
    output_train = data_output[result[0]]
    output_test = data_output[result[1]]
    # Fitting the RNN to the Training set
    model = create_model()  # move create_model here
    model.fit(input_train, output_train, epochs=10000, batch_size=200, verbose=2)  # increase the epochs
    predictions = model.predict(input_test)
    scores.append(model.evaluate(input_test, output_test))

print('Loss from each Iteration: ', scores)
print('Average K-Fold Loss :', np.mean(scores))
