I have an LSTM model that was trained on a multi-feature daily dataset and predicts the target feature's value one day into the future.
How should I retrain the model each day as new data becomes available? Should I rerun model.fit with the full dataset (which grows each day), like in the example below?
model.fit(x_train, y_train, epochs=50, batch_size=20,
validation_data=(x_test, y_test), verbose=2, shuffle=False)
Or can I call model.fit with only the newly available data?
# run once at the beginning
model.fit(x_train, y_train, epochs=50, batch_size=20,
validation_data=(x_test, y_test), verbose=2, shuffle=False)
# run every day as new data becomes available
model.fit(x_yesterday, y_yesterday)
Assuming you are using Keras, I would use train_on_batch. See this previous question for the answer: What is the use of train_on_batch() in keras?
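For example, a minimal sketch of the daily update (x_new and y_new are placeholder names for yesterday's sample, shaped as (samples, timesteps, features) like the rest of your LSTM input):
# each day, take one gradient step on the newest data only
loss = model.train_on_batch(x_new, y_new)
# optionally persist the updated weights after each update
model.save_weights('lstm_daily_weights.h5')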
So I have a GRU model that predicts output power. For the training data I have a CSV file with data from 2018, while my testing data is a different CSV file with data from 2019.
I just had two short questions.
Since I'm using two different CSV files, one for testing and one for training, do I not need train_test_split?
When it comes to model.fit, I really don't know the difference between validation_data and validation_split, and which one should I use?
I have tested these three lines separately; the 2nd and 3rd lines give me exactly the same results, while the first gives me a much lower val_loss.
Thank you.
history=model.fit(X_train, y_train, batch_size=256, epochs=25, validation_split=0.1, verbose=1, callbacks=[TensorBoardColabCallback(tbc)])
history=model.fit(X_train, y_train, batch_size=256, epochs=25, validation_data=(X_test, y_test), verbose=1, callbacks=[TensorBoardColabCallback(tbc)])
history=model.fit(X_train, y_train, batch_size=256, epochs=25, validation_data=(X_test, y_test), validation_split=0.1, verbose=1, callbacks=[TensorBoardColabCallback(tbc)])
Yes, you can use one file to train and one to validate; in that case you do not need train_test_split. You could also merge the two files and then use train_test_split if you wish. In fact, I would recommend merging them: since the data come from different periods of time, there may be distributional differences between the two years.
Using validation_data means you provide the training set and validation set yourself, whereas using validation_split means you provide only a training set and Keras splits off a validation set for you (taking the last validation_split fraction of the samples, before any shuffling). Note that if you pass both, validation_data overrides validation_split, which is why your second and third lines give identical results.
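Roughly, validation_split=0.1 is equivalent to slicing off the last 10% of the samples yourself (a sketch that mirrors, not reproduces, the Keras internals; it assumes X_train and y_train are NumPy arrays):
n_val = int(0.1 * len(X_train))                   # size of the validation slice
X_tr, X_val = X_train[:-n_val], X_train[-n_val:]  # Keras takes the split
y_tr, y_val = y_train[:-n_val], y_train[-n_val:]  # from the end, pre-shuffle
history = model.fit(X_tr, y_tr, batch_size=256, epochs=25,
                    validation_data=(X_val, y_val), verbose=1)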
I split my data into training and test samples (70/30) for a regression/forecasting problem (MLP, LSTM, etc.).
Within the code:
history = model.fit(X_train, y_train, epochs=100, batch_size=32,
validation_data=(X_test, y_test), verbose=0, shuffle=False)
I put my test data in as the validation set and made a couple of weeks' worth of predictions. So I did not hold back the test data...
But now that I think about it, I guess it was wrong to put the test data into the fit function, or was it OK?
NEVER EVER use your test data as part of training or validation! The test set should only be used for inference after training. So yes, it is wrong to pass your test data to the fit function; it should only appear in model.predict(X_test) (or model.evaluate(X_test, y_test)).
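A minimal sketch of the intended workflow (the 0.2 validation fraction is an arbitrary illustrative choice):
# hold the test set out of training entirely
history = model.fit(X_train, y_train, epochs=100, batch_size=32,
                    validation_split=0.2, verbose=0, shuffle=False)
# touch the test set only once, for the final evaluation
test_loss = model.evaluate(X_test, y_test, verbose=0)
y_pred = model.predict(X_test)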
I'm using tensorboard with keras this way:
from keras.callbacks import TensorBoard
tensorboard = TensorBoard(log_dir='./logs', histogram_freq=0,
write_graph=True, write_images=False)
# define model
model.fit(X_train, Y_train,
batch_size=batch_size,
epochs=nb_epoch,
validation_data=(X_test, Y_test),
shuffle=True,
callbacks=[tensorboard])
If I run training a second time by calling model.fit(…) again, TensorBoard resets the step counter, so the metric plots start to look like a mess. How can I make it append the new results to the previous ones?
Another question: how do I create another run so I can compare results across runs in TensorBoard?
To resume a previous training run, set the initial_epoch argument of model.fit; the new metrics will then be appended to the existing TensorBoard logs. To compare separate runs, give each run's TensorBoard callback its own subdirectory under log_dir.
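A sketch of both ideas, assuming the first run already trained epochs 0-9 (the run1/run2 directory names and other_model are illustrative placeholders):
from keras.callbacks import TensorBoard

# resume the first run: epoch numbering continues from 10, so the
# TensorBoard curves keep counting up instead of resetting
tb_run1 = TensorBoard(log_dir='./logs/run1')
model.fit(X_train, Y_train, epochs=20, initial_epoch=10,
          validation_data=(X_test, Y_test), callbacks=[tb_run1])

# log a separate experiment to its own subdirectory; TensorBoard then
# shows run1 and run2 side by side for comparison
tb_run2 = TensorBoard(log_dir='./logs/run2')
other_model.fit(X_train, Y_train, epochs=20, callbacks=[tb_run2])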
I created a neural network in python that is predicting my time-series very well.
My issue is that I want to build a neural network that can predict multiple time series at the same time.
Is this possible and how would I go about it?
This is the code to build the NN for a single time series
nn_model = Sequential()
nn_model.add(Dense(12, input_dim=1, activation='relu'))
nn_model.add(Dense(1))
nn_model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse', 'mae'])
early_stop = EarlyStopping(monitor='loss', patience=2, verbose=1)
history = nn_model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=1, callbacks=[early_stop], shuffle=False)
Any ideas about how to convert this to run for multiple time series?
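One common approach (a sketch, not a drop-in answer) is to widen the input and output layers so the network maps all series at once; n_series is an assumed variable for the number of series, and X_train/y_train would need shape (samples, n_series):
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

n_series = 3  # assumed number of parallel time series

nn_model = Sequential()
nn_model.add(Dense(12, input_dim=n_series, activation='relu'))
nn_model.add(Dense(n_series))  # one output unit per series
nn_model.compile(loss='mean_squared_error', optimizer='adam',
                 metrics=['mse', 'mae'])
early_stop = EarlyStopping(monitor='loss', patience=2, verbose=1)
history = nn_model.fit(X_train, y_train, epochs=100, batch_size=1,
                       verbose=1, callbacks=[early_stop], shuffle=False)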
I'm training my data in batches using train_on_batch, but it seems train_on_batch doesn't accept callbacks, which seem to be a requirement for using checkpoints.
I can't use model.fit, as that seems to require loading all of my data into memory.
model.fit_generator is giving me strange problems (like hanging at the end of an epoch).
Here is the example from Keras API docs showing the use of ModelCheckpoint:
from keras.callbacks import ModelCheckpoint
model = Sequential()
model.add(Dense(10, input_dim=784, kernel_initializer='uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
checkpointer = ModelCheckpoint(filepath='/tmp/weights.hdf5', verbose=1,
save_best_only=True)
model.fit(x_train, y_train, batch_size=128, epochs=20,
          verbose=0, validation_data=(x_test, y_test), callbacks=[checkpointer])
If you train on each batch manually, you can do whatever you want at any epoch or batch boundary. There is no need for a callback; just call model.save or model.save_weights yourself.
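For example (a sketch; batch_generator, n_epochs, and the filename pattern are placeholders, not from the question):
for epoch in range(n_epochs):
    for x_batch, y_batch in batch_generator():  # yields one batch at a time
        loss = model.train_on_batch(x_batch, y_batch)
    # checkpoint manually at the end of every epoch
    model.save_weights('weights_epoch_{:02d}.hdf5'.format(epoch))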