I have a generator function that generates tuples of (inputs, targets) on which my model is trained using the fit_generator() method in Keras.
My dataset is divided into 9 equal parts. I wish to perform leave-one-out cross-validation on the dataset using the fit_generator() method while keeping the learned parameters of the previous training intact.
My question is: will calling fit_generator() multiple times on the model make it re-learn its parameters on the previous train and validation sets from scratch, or will it keep those learned parameters intact, leading to an improvement in accuracy?
After a little digging I found that the fit() method in Keras retains the learned parameters, as described in Calling "fit" multiple times in Keras, but I'm not sure whether the same happens for fit_generator() and, if it does, whether it can be used for cross-validation of data.
The pseudo-code I'm thinking of implementing to achieve the cross-validation is as follows:
class DatasetGenerator(Sequence):
    def __init__(self, validation_id, mode):
        # Some code
    def __getitem__(self, idx):
        # The generator function
        # Some code
        return (inputs, targets)
for id in range(9):
    train_set = DatasetGenerator(id, 'train')
    # train_set contains all 8 parts, leaving the id part out for validation.
    val_set = DatasetGenerator(id, 'val')
    # val_set contains only the id part.
    history = model.fit_generator(train_set, epochs=10, steps_per_epoch=24000,
                                  validation_data=val_set, validation_steps=3000)
    print('History Dict:', history.history)
    results = model.evaluate_generator(test_set, steps=steps)
    print('Test loss, acc:', results)
Will the model keep the learned parameters intact and improve upon them for each iteration of the for loop?
fit and fit_generator behave the same in that regard: calling them again will resume training from the previously trained weights.
Also note that what you are trying to do is not cross-validation. To do real cross-validation, you train one model for each fold, and the models are completely independent, not continued from the training of the previous fold.
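For reference, a minimal sketch of what real cross-validation over the nine parts could look like with fit_generator; build_model() is an assumed factory that returns a freshly compiled model, and the other names follow the question's pseudo-code:
# Train an independent model per fold so no weights carry over between folds.
fold_results = []
for fold_id in range(9):
    train_set = DatasetGenerator(fold_id, 'train')  # the 8 remaining parts
    val_set = DatasetGenerator(fold_id, 'val')      # the held-out part

    model = build_model()  # fresh weights for every fold
    history = model.fit_generator(train_set,
                                  epochs=10,
                                  steps_per_epoch=24000,
                                  validation_data=val_set,
                                  validation_steps=3000)
    fold_results.append(history.history['val_loss'][-1])

print('Mean validation loss over folds:', sum(fold_results) / len(fold_results))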
As far as I know it will keep the previously trained parameters. Also, I think what you are trying to do can be done by modifying the on_epoch_end() method of Sequence. It could be something like this:
class DatasetGenerator(Sequence):
    def __init__(self, id, mode):
        self.id = id
        self.mode = mode
        self.current_epoch = 0
        # some code

    def __getitem__(self, idx):
        id = self.id
        # Some code
        return (inputs, targets)

    def on_epoch_end(self):
        self.current_epoch += 1
        if self.current_epoch % 10 == 0:
            self.id += 1
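A possible way to drive that sketch (my illustration only; the epoch counts are placeholders) is a single continuous training run that is long enough for on_epoch_end to rotate through all nine parts:
# One uninterrupted run; the generator advances the held-out part every
# 10 epochs, so the weights are kept across the rotation as intended.
train_gen = DatasetGenerator(0, 'train')
model.fit_generator(train_gen,
                    steps_per_epoch=24000,
                    epochs=90)  # 9 parts x 10 epochs each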
I would like to implement a model checkpoint callback based on the balanced accuracy score. For this, I implemented the following class:
import tensorflow as tf
from tensorflow import keras
from sklearn.metrics import balanced_accuracy_score

class BalAccScore(keras.callbacks.Callback):
    def __init__(self, validation_data=None):
        super(BalAccScore, self).__init__()
        self.validation_data = validation_data

    def on_train_begin(self, logs={}):
        self.balanced_accuracy = []

    def on_epoch_end(self, epoch, logs={}):
        y_predict = tf.argmax(self.model.predict(self.validation_data[0]), axis=1)
        y_true = tf.argmax(self.validation_data[1], axis=1)
        balacc = balanced_accuracy_score(y_true, y_predict)
        self.balanced_accuracy.append(round(balacc, 6))
        logs["val_bal_acc"] = balacc
        keys = list(logs.keys())
        print("\n ------ validation balanced accuracy score: %f ------\n" % balacc)
I then define the following callbacks:
balAccScore = BalAccScore(validation_data=(X_2, y_2))
mc = ModelCheckpoint(filepath=callback_path, monitor="val_bal_acc", verbose=1,
                     save_best_only=True, save_freq='epoch')

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['val_bal_acc'])

history = model.fit(X_1, y_1, epochs=5, batch_size=512,
                    callbacks=[balAccScore, mc],
                    validation_data=(X_2, y_2))
I then get the error:
ValueError: Unknown metric function: val_bal_acc
despite the fact that I find it under history when using, for example, accuracy instead, i.e. by setting metrics=["acc"] when compiling. In that case I get the expected warning:
WARNING:tensorflow:Can save best model only with val_bal_acc available, skipping.
but otherwise the model runs perfectly. I am not sure why it does not run otherwise.
You should just remove the quotation marks in compile:
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=[val_bal_acc])
or at least this is how it works in R.
You're getting that error because you're not passing the balanced_accuracy_score as a value for the metrics argument when compiling the model. The string 'val_bal_acc' that you passed in the metrics argument doesn't work because it's not a known metric. You can access metrics by their string name only for those metrics already implemented in tf.keras.metrics. If you want to monitor the validation balanced accuracy during training, you should implement a custom metric class (you can look here to see how to do it) and then pass it to the metrics argument. Once you've done this, you can monitor your custom metric using the name you gave it with the prefix 'val' if you want to monitor it at validation time. There's no need for a supplementary custom callback as you did; the logs are updated automatically once you've defined the metric. For this particular case, you can find some implementations of this metric in the answers to this question.
If you prefer a callback approach instead, you don't need to define a custom metric but can take advantage of the already logged metrics. You can find an implementation of that in my answer here.
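To make the custom-metric route concrete, here is a rough sketch (my own illustration, not an official implementation) of what such a metric class could look like, assuming one-hot encoded labels and a known, fixed number of classes; the class name BalancedAccuracy and the constructor argument num_classes are chosen here for illustration:
import tensorflow as tf

class BalancedAccuracy(tf.keras.metrics.Metric):
    """Balanced accuracy = mean per-class recall, accumulated over batches."""

    def __init__(self, num_classes, name="bal_acc", **kwargs):
        super().__init__(name=name, **kwargs)
        self.num_classes = num_classes
        # Running confusion matrix accumulated across all batches of an epoch.
        self.confusion = self.add_weight(
            name="confusion",
            shape=(num_classes, num_classes),
            initializer="zeros",
        )

    def update_state(self, y_true, y_pred, sample_weight=None):
        true_labels = tf.argmax(y_true, axis=-1)
        pred_labels = tf.argmax(y_pred, axis=-1)
        batch_cm = tf.math.confusion_matrix(
            true_labels, pred_labels, num_classes=self.num_classes)
        self.confusion.assign_add(tf.cast(batch_cm, self.dtype))

    def result(self):
        per_class_total = tf.reduce_sum(self.confusion, axis=1)
        per_class_correct = tf.linalg.diag_part(self.confusion)
        # Classes absent from the data so far contribute 0 recall here.
        recall = tf.math.divide_no_nan(per_class_correct, per_class_total)
        return tf.reduce_mean(recall)

    def reset_state(self):  # reset_states() in older TF versions
        self.confusion.assign(tf.zeros_like(self.confusion))
With something along those lines, compiling with metrics=[BalancedAccuracy(num_classes=...)] should make 'bal_acc' and 'val_bal_acc' appear in the logs, so ModelCheckpoint can monitor 'val_bal_acc' directly.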
I'm trying to implement a boosting model using TensorFlow's BoostedTreesRegressor.
For that, I need to implement a custom loss function so that during training the loss is calculated according to the logic defined in my custom function rather than the usual mean squared error.
I read in articles that this can be implemented using the BoostedTreesEstimator interface by specifying a head. So, I tried to implement my model as follows:
import numpy as np
import tensorflow as tf

# define custom loss function to calculate SMAPE
def custom_loss_fn(labels, logits):
    return (np.abs(logits - labels) / (np.abs(logits) + np.abs(labels))) * 2

# create input functions
def make_input_fn(X, y, n_epochs=None, shuffle=True):
    def input_fn():
        dataset = tf.data.Dataset.from_tensor_slices((dict(X), y))
        if shuffle:
            dataset = dataset.shuffle(NUM_EXAMPLES)
        dataset = dataset.repeat(n_epochs)
        dataset = dataset.batch(NUM_EXAMPLES)
        return dataset
    return input_fn

train_input_fn = make_input_fn(dftrain, y_train)
eval_input_fn = make_input_fn(dfeval, y_eval, n_epochs=1, shuffle=False)

my_head = tf.estimator.RegressionHead(loss_fn=custom_loss_fn)

# Training a boosted trees model
est = tf.estimator.BoostedTreesEstimator(feature_columns,
                                         head=my_head,
                                         n_batches_per_layer=1,
                                         n_trees=90,
                                         max_depth=2)

est.train(train_input_fn, max_steps=100)
predictions = list(est.predict(eval_input_fn))
This code produced the following error:
NotImplementedError: Subclasses of Head must implement create_estimator_spec() or _create_tpu_estimator_spec().
As I read in articles, create_estimator_spec() is used when we define a model_fn() for a new Estimator. Here, I do not want to create any new models or Estimators; I only want to use a custom loss function (instead of the default mean squared error) during training, where the training model should be equivalent to BoostedTreesRegressor/BoostedTreesEstimator.
It would be a great help if anybody could give me a hint on how to implement this model.
Make sure you aren't using NumPy functions in your loss function; you cannot convert tensors to NumPy arrays inside it. Try replacing np.abs with tf.abs. You might be getting the NotImplementedError because your loss function is breaking.
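As a sketch of that suggestion (assuming labels and logits are float tensors of compatible shape; not a tested drop-in), the same SMAPE-style loss could be written with TensorFlow ops only:
import tensorflow as tf

def custom_loss_fn(labels, logits):
    # Same formula as before, but expressed as a tensor computation.
    labels = tf.cast(labels, logits.dtype)
    # divide_no_nan guards against labels and logits both being zero.
    return 2.0 * tf.math.divide_no_nan(tf.abs(logits - labels),
                                       tf.abs(logits) + tf.abs(labels))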
I would like to ensure that my code for running cross-validation of a Keras model is correct. Currently I suspect that it is wrong, because the results appear to be over-fitting.
My code structure generally looks as follows:
def get_model():
    ....
    # code to create a Keras neural network model using the functional API

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
splits = list(enumerate(kfold.split(X, y)))  # X is the train feature matrix, y the target

model = get_model()    # LINE A
model.compile(...)     # LINE B

for k in range(0, len(splits)):    # LINE C
    split = splits[k]
    X_split_train = ...  # slice X into corresponding training parts
    X_split_test = ...
    y_split_train = ...  # slice y into corresponding parts
    model.fit(X_split_train, y_split_train, ...)
    prediction_prob = model.predict(X_split_test)
    # ... code for evaluating the result for this fold
And I suspect my code is wrong. Specifically, lines A and B should be inside the loop (line C).
Reasons for my suspicion:
Looking at the training log generated over all epochs, there seems to be a continuation of model performance across folds. Say the model obtains an accuracy of 75% on the first fold; in the second fold it starts reporting accuracies from 75.x% upwards.
The model seems to be over-fitting, as it soon reports a training accuracy of 1.0.
For some rare classes that have only 1 instance in the dataset, the model in some cases even reported 100% F1 for those classes, which doesn't make sense.
All these seem to suggest that the model parameters and the learned class distribution are carried forward between folds. The only way to fix this, I suppose, is to re-create the model in every fold. Is this correct?
Thanks
No, this code is not doing cross-validation correctly. For each fold you should train a new model from scratch; here you are reusing the model from the previous fold, which is incorrect.
I would do it like this:
for k in range(0, len(splits)):    # LINE C
    model = get_model()    # LINE A
    model.compile(...)     # LINE B

    split = splits[k]
    X_split_train = ...  # slice X into corresponding training parts
    X_split_test = ...
    y_split_train = ...  # slice y into corresponding parts

    model.fit(X_split_train, y_split_train, ...)
    prediction_prob = model.predict(X_split_test)
    del model
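Optionally (this is my addition, not part of the answer above), the end of each fold can also clear the Keras backend session so that graph state left over from the discarded model is released before the next fold:
from tensorflow.keras import backend as K

# Inside the loop, right after `del model`:
K.clear_session()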
Could you help me understand how to implement a callback function that determines the performance of a model on the test and validation data?
I got a bit confused reading this in a nice blog entry:
len(self.model.validation_data) == 3, because
validation_data[0] ==> train_x (that you input in model.fit()),
validation_data[1] ==> train_y,
validation_data[2] ==> sample_weight.
As you can see, the blogger states that validation_data here is the training data. From the keyword "validation_data", which is also used in model.fit, I would assume that self.model.validation_data returns the validation data, and that something like self.model.x would return the training data (as model.fit(x= ...,) is the input for the training data).
Could someone shed light on this?
Thanks in advance for your help
Edit:
I checked with dir(self) whether there is something like an x for the training data. But indeed, there is only validation_data. Could someone explain to me how I can differentiate between test and validation?
Is validation_data always the training data when used in on_train_end(self, logs={}), and does it become the actual validation set when used in on_epoch_end(self, epoch, logs={})?
Edit 2:
In the same blog the author talks about self.model.training_data, which he found but could not retrieve. I searched for this, but apparently it was removed.
So my question may be more on point: how can you load the training data that was used, at the end of an epoch, in a callback?
The answer is quite simple.
Before you call model.fit() you can attach anything to the model. I just attached X_train and Y_train to it. You can attach the validation or test data instead, whatever you want:
model.X_train = X_train
model.Y_train = Y_train
Then you use those values with the sklearn library to calculate any performance metric (accuracy, F1, kappa, ...).
First, add variables to self so that each epoch's result can later be appended to them:
class yourowncallbackname(Callback):
    def on_train_begin(self, logs={}):
        # define the variables that will collect the results
        self.train_f1 = []
        self.train_k = []
        self.val_f1 = []
        self.val_k = []
Then define what happens at the end of each epoch in the same class "yourowncallbackname" (on_epoch_end works best). Here you use the data attached to the model (e.g. model.X_train) to get the results from model.predict:
def on_epoch_end(self, epoch, logs={}):
    # LOAD DATA
    train_predict = np.asarray(self.model.predict(self.model.X_train))
    train_true = self.model.Y_train
    val_predict = np.asarray(self.model.predict(self.validation_data[0]))
    val_true = self.validation_data[1]
Maybe, depending on what you want to predict, you have to round the predictions with .round():
val_predict = np.asarray(self.model.predict(self.validation_data[0])).round()
Then calculate any performance metrics you like on the predictions (the same ones defined in on_train_begin), e.g. for the training and validation data:
_train_f1 = f1_score(train_true, train_predict, labels=...)
_train_k = cohen_kappa_score(train_true.ravel(), train_predict, labels=...)
_val_f1 = f1_score(val_true, val_predict, labels=...)
_val_k = cohen_kappa_score(val_true.ravel(), val_predict, labels=...)
To collect the results, append them to self. They stay attached to the callback itself, so you can retrieve them later:
self.train_f1.append(_train_f1)
self.train_k.append(_train_k)
self.val_f1.append(_val_f1)
self.val_k.append(_val_k)
You now have to initialize this callback wherever you want to fit your model:
callbackmetric = yourowncallbackname()
Then you can put it into the fit:
history = model.fit(...,
                    callbacks=[callbackmetric])
If you then want to use the per-epoch results that were appended, you can access them as attributes of the callback instance:
Yourcallbackresults.val_f1 = callbackmetric.val_f1
Yourcallbackresults.train_f1 = callbackmetric.train_f1
For completeness, I will post the whole callback class again. You just have to adapt the correct performance calculation with all its parameters etc.:
class yourowncallbackname(Callback):
    def on_train_begin(self, logs={}):
        self.val_f1 = []
        self.val_k = []
        self.train_f1 = []
        self.train_k = []

    def on_epoch_end(self, epoch, logs={}):
        # LOAD DATA
        train_predict = np.asarray(self.model.predict(self.model.X_train))
        train_true = self.model.Y_train
        val_predict = np.asarray(self.model.predict(self.validation_data[0]))
        val_true = self.validation_data[1]

        # CALC. PERFORMANCE
        _train_f1 = f1_score(...)
        _train_k = cohen_kappa_score(...)
        _val_f1 = f1_score(...)
        _val_k = cohen_kappa_score(...)

        self.train_f1.append(_train_f1)
        self.train_k.append(_train_k)
        self.val_f1.append(_val_f1)
        self.val_k.append(_val_k)

        print(" val_f1: %f val_k: %f" % (_val_f1, _val_k))
        return
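Putting the pieces together, a minimal end-to-end sketch could look like the following; X_train, Y_train, X_val, y_val and the already compiled model are placeholders, not data from the original answer:
# Attach the training arrays to the model so the callback can reach them.
model.X_train = X_train
model.Y_train = Y_train

callbackmetric = yourowncallbackname()

history = model.fit(X_train, Y_train,
                    epochs=10,
                    batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[callbackmetric])

# Per-epoch results collected by the callback:
train_f1_per_epoch = callbackmetric.train_f1
val_f1_per_epoch = callbackmetric.val_f1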
So I'm trying to use Keras' fit_generator with a custom data generator to feed into an LSTM network.
What works
To illustrate the problem, I have created a toy example trying to predict the next number in a simple ascending sequence, and I use the Keras TimeseriesGenerator to create a Sequence instance:
WINDOW_LENGTH = 4

data = np.arange(0, 100).reshape(-1, 1)
data_gen = TimeseriesGenerator(data, data, length=WINDOW_LENGTH,
                               sampling_rate=1, batch_size=1)
I use a simple LSTM network:
data_dim = 1
input1 = Input(shape=(WINDOW_LENGTH, data_dim))
lstm1 = LSTM(100)(input1)
hidden = Dense(20, activation='relu')(lstm1)
output = Dense(data_dim, activation='linear')(hidden)
model = Model(inputs=input1, outputs=output)
model.compile(loss='mse', optimizer='rmsprop', metrics=['accuracy'])
and train it using the fit_generator function:
model.fit_generator(generator=data_gen,
                    steps_per_epoch=32,
                    epochs=10)
And this trains perfectly, and the model makes predictions as expected.
The problem
Now the problem is, in my non-toy situation I want to process the data coming out from the TimeseriesGenerator before feeding the data into the fit_generator. As a step towards this, I create a generator function which just wraps the TimeseriesGenerator used previously.
def get_generator(data, targets, window_length=5, batch_size=32):
    while True:
        data_gen = TimeseriesGenerator(data, targets, length=window_length,
                                       sampling_rate=1, batch_size=batch_size)
        for i in range(len(data_gen)):
            x, y = data_gen[i]
            yield x, y

data_gen_custom = get_generator(data, data,
                                window_length=WINDOW_LENGTH, batch_size=1)
But now the strange thing is that when I train the model as before, but using this generator as the input,
model.fit_generator(generator=data_gen_custom,
                    steps_per_epoch=32,
                    epochs=10)
there is no error, but the training error is all over the place (jumping up and down instead of consistently going down as it did with the other approach), and the model doesn't learn to make good predictions.
Any ideas what I'm doing wrong with my custom generator approach?
It could be because the object type is changed from a Sequence (which is what a TimeseriesGenerator is) to a generic generator. The fit_generator function treats these differently. A cleaner solution would be to inherit from the class and override the processing bit:
class CustomGen(TimeseriesGenerator):
    def __getitem__(self, idx):
        x, y = super()[idx]
        # do processing here
        return x, y
And use this class as before, as the rest of the internal logic will remain the same.
I personally had a problem with the code by nuric. For some reason I got an error saying super is not subscriptable. Here is my possible fix; let me know if this works:
class CustomGen(TimeseriesGenerator):
    def __getitem__(self, idx):
        x, y = super().__getitem__(idx)
        return x, y
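For completeness, a small usage sketch mirroring the toy setup from the question (nothing new beyond that), showing the subclass plugged into the same training call:
# The subclass keeps the Sequence behaviour of TimeseriesGenerator,
# so it can be passed to fit_generator directly.
data_gen_custom = CustomGen(data, data, length=WINDOW_LENGTH,
                            sampling_rate=1, batch_size=1)

model.fit_generator(generator=data_gen_custom,
                    steps_per_epoch=32,
                    epochs=10)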