I was trying to plot train and test learning curve in keras, however, the following code produces KeyError: 'val_acc error.
The official document <https://keras.io/callbacks/> states that in order to use 'val_acc' I need to enable validation and accuracy monitoring which I dont understand and dont know how to use in my code.
Any help would be much appreciated.
Thanks.
seed = 7
np.random.seed(seed)
dataframe = pandas.read_csv("iris.csv", header=None)
dataset = dataframe.values
X = dataset[:,0:4].astype(float)
Y = dataset[:,4]
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
dummy_y = np_utils.to_categorical(encoded_Y)
kfold = StratifiedKFold(y=Y, n_folds=10, shuffle=True, random_state=seed)
cvscores = []
for i, (train, test) in enumerate(kfold):
model = Sequential()
model.add(Dense(12, input_dim=4, init='uniform', activation='relu'))
model.add(Dense(3, init='uniform', activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history=model.fit(X[train], dummy_y[train], nb_epoch=200, batch_size=5, verbose=0)
scores = model.evaluate(X[test], dummy_y[test], verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
cvscores.append(scores[1] * 100)
print( "%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))
print(history.history.keys())
# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
Looks like in Keras + Tensorflow 2.0 val_acc was renamed to val_accuracy
history_dict = history.history
print(history_dict.keys())
if u print keys of history_dict, you will get like this dict_keys(['loss', 'acc', 'val_loss', 'val_acc']).
and edit a code like this
acc = history_dict['acc']
val_acc = history_dict['val_acc']
loss = history_dict['loss']
val_loss = history_dict['val_loss']
Keys and error
You may need to enable the validation split of your trainset. Usually, the validation happens in 1/3 of the trainset. In your code, make the change as given below:
history=model.fit(X[train], dummy_y[train],validation_split=0.33,nb_epoch=200, batch_size=5, verbose=0)
It works!
The main point everyone misses to mention is that this Key Error is related to the naming of metrics during model.compile(...). You need to be consistent with the way you name your accuracy metric inside model.compile(....,metrics=['<metric name>']). Your history callback object will receive the dictionary containing key-value pairs as defined in metrics.
So, if your metric is metrics=['acc'], you can access them in history object with history.history['acc'] but if you define metric as metrics=['accuracy'], you need to change to history.history['accuracy'] to access the value, in order to avoid Key Error. I hope it helps.
N.B. Here's a link to the metrics you can use in Keras.
If you upgrade keras older version (e.g. 2.2.5) to 2.3.0 (or newer) which is compatible with Tensorflow 2.0, you might have such error (e.g. KeyError: 'acc'). Both acc and val_acc has been renamed to accuracy and val_accuracy respectively. Renaming them in script will solve the issue.
to get any val_* data (val_acc, val_loss, ...), you need to first set the validation.
first method (will validate from what you give it):
model.fit(validation_data=(X_test, Y_test))
second method (will validate from a part of the training data):
model.fit(validation_split=0.5)
I have changed acc to accuracy and my problem solved. Tensorflow 2+
e.g.
accuracy = history_dict['accuracy']
val_accuracy = history_dict['val_acccuracy']
This error also happens when you specify the validation_data=(X_test, Y_test) and your X_test and/or Y_test are empty. To check this, print the shape of X_test and Y_test respectively. In this case, the model.fit(validation_data=(X_test, Y_test), ...) method ran but because the validation set was empty, it didn't create a dictionary key for val_loss in the history.history dictionary.
What worked for me was changing objective='val_accuracy'to objective=["val_accuracy"] in
tuner = kt.BayesianOptimization(model_builder,
objective=["val_accuracy"],
max_trials=80,
seed=123)
tuner.search(X_train, y_train, epochs=50, validation_split=0.2)
I have TensorFlow 2+.
Related
I'm currently trying to visualise the performance of my prediction model by showing the val_mse in every epoch. The code that used to work for model.fit() doesn't work for tuner.search(). Can anyone provide me with some guide on this. Thank you.
Previous code:
import matplotlib.pyplot as plt
def plot_model(history):
hist = pd.DataFrame (history.history)
hist['epoch'] = history.epoch
plt.figure()
plt.xlabel('Epoch')
plt.ylabel('Mean Absolute Error')
plt.plot(hist['epoch'], hist['mae'],
label='Train Error')
plt.plot(hist['epoch'], hist['val_mae'],
label = 'Val Error')
plt.legend()
plt.ylim([0,20])
plt.figure()
plt.xlabel('Epoch')
plt.ylabel('Mean Square Error')
plt.plot (hist['epoch'], hist['mse'],
label='Train Error')
plt.plot (hist['epoch'], hist['val_mse'],
label = 'Val Error')
plt.legend()
plt.ylim([0,400])
plot_model(history)
keras.tuner code:
history = tuner.search(x = normed_train_data,
y = y_train,
epochs = 200,
batch_size=64,
validation_data=(normed_test_data, y_test),
callbacks = [early_stopping])
Before using tuner.search to search the best model, you need to install and import keras_tuner:
!pip install keras-tuner --upgrade
import keras_tuner as kt
from tensorflow import keras
Then, define the hyperparameter (hp) in the model definition, for instance as below:
def build_model(hp):
model = keras.Sequential()
model.add(keras.layers.Dense(
hp.Choice('units', [8, 16, 32]), # define the hyperparameter
activation='relu'))
model.add(keras.layers.Dense(1, activation='relu'))
model.compile(loss='mse')
return model
Initialize the tuner:
tuner = kt.RandomSearch(build_model,objective='val_loss',max_trials=5)
Now, Start the search and get the best model by using tuner.search:
tuner.search(x = normed_train_data,
y = y_train,
epochs = 200,
batch_size=64,
validation_data=(normed_test_data, y_test),
callbacks = [early_stopping])
best_model = tuner.get_best_models()[0]
Hence, Now you can use this best_model to train and evaluate the model with your dataset and will get a significant decrease in loss.
Please check this link as a reference for more detail.
I am new to machine learning and I am performing a Multivariate Time Series Forecast using LSTMs in Keras. I have a monthly timeseries dataset with 4 input variables (temperature, precipitation, Dew and wind_spreed) and 1 output variable (pollution). Using this data i framed a forecasting problem where, given the weather conditions and pollution for prior months, I forecast the pollution at the next month. Below is my code
X = df[['Temperature', 'Precipitation', 'Dew', 'Wind_speed' ,'Pollution (t_1)']].values
y = df['Pollution (t)'].values
y = y.reshape(-1,1)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(X)
#dataset has 359 samples in total
train_X, train_y = X[:278], y[:278]
test_X, test_y = X[278:], y[278:]
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)
model = Sequential()
model.add(LSTM(100, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dropout(0.2))
# model.add(LSTM(70))
# model.add(Dropout(0.3))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(train_X, train_y, epochs=700, batch_size=70, validation_data=(test_X, test_y), verbose=2, shuffle=False)
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
To do predictions i use the following code
from sklearn.metrics import mean_squared_error,r2_score
yhat = model.predict(test_X)
mse = mean_squared_error(test_y, yhat)
rmse = np.sqrt(mse)
r2 = r2_score(test_y, yhat)
print("test set performance")
print("--------------------")
print("MSE:",mse)
print("RMSE:",rmse)
print("R^2: ",r2)
fig, ax = plt.subplots(figsize=(10,5))
ax.plot(range(len(test_y)), test_y, '-b',label='Actual')
ax.plot(range(len(yhat)), yhat, 'r', label='Predicted')
plt.legend()
plt.show()
Running this code i fell into the following issues:
For some reason am getting a lagged result for my test set which is not in my training data as shown on the below image. I do not understand why i have these lagged results (does it have something to do with including 'pollution (t_1)' as part of my inputs)?
Graph Results:
By adding "pollution (t_1)" which is a shift by 1 lag of the polution variable as part of my inputs this variable now seems to dominate the prediction as removing the other varibales seems to have no influence on my results (r-squared and rmse) which is strange since all these variables do assit in pollution prediction.
Is there something i am doing wrong in my code which is the reason for these issues? I am new to python so any help to answer the above 2 questions will be greatly appreaciated.
First of all, I think it is not appropriate to input '1' as Timesteps value, because LSTM model is the one treating timeseries or sequence data.
I think the following script of data mining will work well
def lstm_data(df,timestamps):
array_data=df.values
sc=MinMaxScaler()
array_data_=sc.fit_transform(array_data)
array=np.empty((0,array_data_.shape[1]))
range_=array_data_.shape[0]-(timestamps-1)
for t in range(range_):
array_data_p=array_data_[t:t+sequenth_length,:]
array=np.vstack((array,array_data_p))
array_=array.reshape(-1,timestamps, array.shape[1])
return array_
#timestamps depend on your objection, but not '1'
x_data=lstm_data(x, timestamps=4)
y_data=lstm_data(y, timestamps=4)
y_data=y_data.reshape(-1,1)
#Divide each data into train and test
#Input the divided data into your LSTM model
After reading a few tutorials, this is the first time I have built a Keras Deep Learning Model as I am a beginner in machine learning and deep learning. Most of the tutorials use the train-test split to train and test the model. However, I chose to use StratifiedKFold CV. The code is as below.
X = dataset[:,0:80].astype(float)
Y = dataset[:,80]
kfold = StratifiedKFold(n_splits=10,random_state=seed)
for train, test in kfold.split(X, Y):
# create model
model = Sequential()
model.add(Dense())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam',metrics=['accuracy'])
model.fit(X[train], Y[train], epochs=100,batch_size=128, verbose=0)
scores = model.evaluate(X[test], Y[test], verbose=1)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))
Y[pred]= model.predict(X[test])
acc = accuracy_score(Y[test],Y[pred])
confusion = confusion_matrix(Y[test], Y[pred])
print(confusion)
plot_confusion_matrix(confusion, classes =['No','Yes'],title='Confusion Matrix')
TP= confusion[1,1]
TN= confusion[0,0]
FP= confusion[0,1]
FN= confusion[1,0]
print('Accuracy: ')
print((TP + TN) / float(TP + TN + FP + FN))
print(accuracy_score(Y[test],Y[pred]))
fpr, tpr, thresholds = roc_curve(Y[test], y_pred_prob)
plt.plot(fpr, tpr)
print(roc_auc_score(y_test, y_pred_prob))
y_pred_class = binarize([y_pred_prob], 0.3)[0]
confusion_new = confusion_matrix(Y[test], y_pred_class)
print(confusion_new)
I have understood the theoretical concept of Kfold CV and StratifiedKFoldCV. I have come across What does KFold in python exactly do?, KFolds Cross Validation vs train_test_split, and a few more links. But when I calculate the performance metrics it gives me the following errors.
NameError: name 'pred' is not defined
NameError: name 'y_pred_prob' is not defined
NameError: name 'roc_curve' is not defined
What I am doing wrong here? Why am I getting these errors? How do I fix this?
Thanks.
Here's a way you can try:
X = dataset[:,0:80].astype(float)
Y = dataset[:,80]
# define model
model = Sequential()
model.add(Dense(10))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam',metrics=['accuracy'])
# create folds
folds = list(StratifiedKFold(n_splits=10, shuffle=True, random_state=1).split(X, Y))
# train model for every fold
for j, (train_idx, val_idx) in enumerate(folds):
print('\nFold ',j)
X_train_cv = X[train_idx]
y_train_cv = Y[train_idx]
X_valid_cv = X[val_idx]
y_valid_cv= Y[val_idx]
model.fit(X_train_cv,
y_train_cv,
epochs=100,
batch_size=128,
validation_data = (X_valid_cv, y_valid_cv),
verbose=0)
print(model.evaluate(X_valid_cv, y_valid_cv))
# check metrics for each fold
pred = model.predict(X_valid_cv)
acc = accuracy_score(y_valid_cv, pred)
confusion = confusion_matrix(y_valid_cv, pred)
print(confusion)
For a regression problem I want to compare some metrics but I am only able to get accuracy from the history which makes no sense wrt to regression purposes. How can I get other metrics like mean_squared_error and so on?
create_model(...)
input_layer = ...
output_laye = ...
model = Model(input_layer, output_layer)
model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
return model
model = KerasRegressor(build_fn=create_model, verbose=0)
batch_size = [1, 2]
epochs = [1, 2]
optimizer = ['Adam', 'sgd']
param_grid = dict(batch_size=batch_size
, optimizer = optimizer
)
grid_obj = RandomizedSearchCV(estimator=model
, param_grid=hypparas
, n_jobs=1
, cv = 3
, scoring = ['explained_variance', 'neg_mean_squared_error', 'r2']
, refit = 'neg_mean_squared_error'
, return_train_score=True
, verbose = 2
)
grid_result = grid_obj.fit(X_train1, y_train1)
X_train1, X_val1, y_train1, y_val1 = train_test_split(X_train1, y_train1, test_size=0.2, shuffle=False)
grid_best = grid_result.best_estimator_
history = grid_best.fit(X_train1, y_train1
, validation_data=(X_val1, y_val1)
)
print(history.history.keys())
> dict_keys(['val_loss', 'val_accuracy', 'loss', 'accuracy'])
I have seen https://stackoverflow.com/a/50137577/6761328 to get e.g.
history.history['accuracy']
which works but I can't access mean_squared_error or something else:
history.history['neg_mean_squared_error']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-473-eb96973bf014> in <module>
----> 1 history.history['neg_mean_squared_error']
KeyError: 'neg_mean_squared_error'
This question is finally a follow-up on How to compare different metrics? as I think this question is the answer to the other one.
In stand-alone Keras (not sure for the scikit-learn wrapper), history.history['loss'] (or val_loss respectively for the validation set) would do the job.
Here, 'loss' and 'val_loss' are keys; give
print(history.history.keys())
to see what keys are available in your case, and you will find among them the required ones for the loss (might even be the same, i.e. 'loss' and 'val_loss').
As a side note, you should remove completely metrics=['accuracy'] from your model compilation - as you correctly point out, accuracy is meaningless in regression settings (you might want to check What function defines accuracy in Keras when the loss is mean squared error (MSE)?).
I want the classifier to run faster and stop early if the patience reaches the number I set. In the following code it does 10 iterations of fitting the model.
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.wrappers.scikit_learn import KerasClassifier
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.constraints import maxnorm
from keras.optimizers import SGD
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("sonar.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
calls=[EarlyStopping(monitor='acc', patience=10), ModelCheckpoint('C:/Users/Nick/Data Science/model', monitor='acc', save_best_only=True, mode='auto', period=1)]
def create_baseline():
# create model
model = Sequential()
model.add(Dropout(0.2, input_shape=(33,)))
model.add(Dense(33, init='normal', activation='relu', W_constraint=maxnorm(3)))
model.add(Dense(16, init='normal', activation='relu', W_constraint=maxnorm(3)))
model.add(Dense(122, init='normal', activation='softmax'))
# Compile model
sgd = SGD(lr=0.1, momentum=0.8, decay=0.0, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
return model
numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=calls)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Here is the resulting error-
RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x000000001D691438>, as the constructor does not seem to set parameter callbacks
I changed the cross_val_score in the following-
numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=calls)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'callbacks':calls})
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
and now I get this error-
ValueError: need more than 1 value to unpack
This code came from here. The code is by far the most accurate I've used so far. The problem is that there is no defined model.fit() anywhere in the code. It also takes forever to fit. The fit() operation occurs at the results = cross_val_score(...) and there's no parameters to throw a callback in there.
How do I go about doing this?
Also, how do I run the model trained on a test set?
I need to be able to save the trained model for later use...
Reading from here, which is the source code of KerasClassifier, you can pass it the arguments of fit and they should be used.
I don't have your dataset so I cannot test it, but you can tell me if this works and if not I will try and adapt the solution. Change this line :
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=[...your_callbacks...])))
A small explaination of what's happening : KerasClassifier is taking all the possibles arguments for fit, predict, score and uses them accordingly when each method is called. They made a function that filters the arguments that should go to each of the above functions that can be called in the pipeline.
I guess there are several fit and predict calls inside the StratifiedKFold step to train on different splits everytime.
The reason why it takes forever to fit and it fits 10 times is because one fit is doing 300 epochs, as you asked. So the KFold is repeating this step over the different folds :
calls fit with all the parameters given to KerasClassifier (300 epochs and batch size = 16). It's training on 9/10 of your data and using 1/10 as validation.
EDIT :
Ok, so I took the time to download the dataset and try your code... First of all you need to correct a "few" things in your network :
your input have a 60 features. You clearly show it in your data prep :
X = dataset[:,:60].astype(float)
so why would you have this :
model.add(Dropout(0.2, input_shape=(33,)))
please change to :
model.add(Dropout(0.2, input_shape=(60,)))
About your targets/labels. You changed the objective from the original code (binary_crossentropy) to categorical_crossentropy. But you didn't change your Y array. So either do this in your data preparation :
from keras.utils.np_utils import to_categorical
encoded_Y = to_categorical(encoder.transform(Y))
or change your objective back to binary_crossentropy.
Now the network's output size : 122 on the last dense layer? your dataset obviously has 2 categories so why are you trying to output 122 classes? it won't match the target. Please change back your last layer to :
model.add(Dense(2, init='normal', activation='softmax'))
if you choose to use categorical_crossentropy, or
model.add(Dense(1, init='normal', activation='sigmoid'))
if you go back to binary_crossentropy.
So now that your network compiles, I could start to troubleshout.
here is your solution
So now I could get the real error message. It turns out that when you feed fit_params=whatever in the cross_val_score() function, you are feeding those parameters to a pipeline. In order to know to which part of the pipeline you want to send those parameters you have to specify it like this :
fit_params={'mlp__callbacks':calls}
Your error was saying that the process couldn't unpack 'callbacks'.split('__', 1) into 2 values. It was actually looking for the name of the pipeline's step to apply this to.
It should be working now :)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'mlp__callbacks':calls})
BUT, you should be aware of what's happening here... the cross validation actually calls the create_baseline() function to recreate the model from scratch 10 times an trains it 10 times on different parts of the dataset. So it's not doing epochs as you were saying, it's doing 300 epochs 10 times.
What is also happening as a consequence of using this tool : since the models are always differents, it means the fit() method is applied 10 times on different models, therefore, the callbacks are also applied 10 different times and the files saved by ModelCheckpoint() get overriden and you find yourself only with the best model of the last run.
This is intrinsec to the tools you use, I don't see any way around this. This comes as consequence to using different general tools that weren't especially thought to be used together with all the possible configurations.
Try:
estimators.append(('mlp',
KerasClassifier(build_fn=create_model2,
nb_epoch=300,
batch_size=16,
verbose=0,
callbacks=[list_of_callbacks])))
where list_of_callbacks is a list of callbacks you want to apply. You could find details here. It's mentioned there that parameters fed to KerasClassifier could be legal fitting parameters.
It's also worth to mention that if you are using multiple runs with GPUs there might be a problem due to several reported memory leaks especially when you are using theano. I also noticed that running multiple fits consequently may show results which seem to be not independent when using sklearn API.
Edit:
Try also:
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params = {'mlp__callbacks': calls})
Instead of putting callbacks list in a wrapper instantiation.
This is what I have done
results = cross_val_score(estimator, X, Y, cv=kfold,
fit_params = {'callbacks': [checkpointer,plateau]})
and has worked so far
Despite the TensorFlow, Keras & SciKeras documentation suggesting you can define training callbacks via the fit method, for my setup it turns out (like #NassimBen suggests) you should do it through the model constructor instead.
Rather than this:
model = KerasClassifier(..).fit(X, y, callbacks=[<HERE>])
Try this:
model = KerasClassifier(callbacks=[<HERE>]).fit(X, y)