Future Predictions LSTM - Python

I have a dataset of length 1400.
I did all the preprocessing and trained an LSTM model using Keras in Python, in an attempt to predict future points. My trained model learns well.
After compiling the model, I added a new test set. My intent is to predict unseen future values, for example the next 15 days, which are not included in the training data. When I feed in test data with a constant value, my predictions become constant.
But when I feed in the real values that I am trying to predict, my model fits this test data well.
So how can I handle this?
Why do the model's predictions change depending on the test data?
How can I predict the next 15 days that are not included in my training set?
How can I predict the unseen future?
If LSTM models work only on known train and test sets, why should I use them?

Related

In Keras, after you train a stateful LSTM model, do you have to re-train the model as you predict values?

I've been trying to create a stateful LSTM model with Keras, and I've pretty much figured out the training part, but I don't get the predicting part.
So, let's imagine that we have 10000 time-series datapoints. We use the first 9000 for training and the remaining 1000 for testing. As we train, we set the window length to 2 and slide the window forward, setting the input (X) to the first datapoint and the output (y) to the second.
And as we train, the model converges because of its stateful nature. Finally we finish training.
Now, we are left with a model and some test data. The problem begins here. We test the first datapoint.
It returns a guessed value. Nice.
We test the second datapoint of the test set.
We get an output. But the problem is that, because we are using a stateful model and give it only one value as input, the only way the model can figure out the next value is from its memory of the previous time steps.
But since we didn't train the model on the first datapoint of the test set, the time series is broken, and the model will treat the second datapoint of the test set as if it were the first!
So, my question is:
does Keras take care of this and automatically train the network as it's predicting?
or do I have to train the net as I am predicting?
or is there some other reason that lets me just keep predicting without training the model further?
A stateful LSTM will retain information in its cells as you predict. If you were to take any point in the train or test dataset and repeatedly predict on it, your answer would change each time, because the model keeps seeing this data and uses it every time it predicts. The only way to get a repeatable answer is to call reset_states().
You should be calling reset_states() after each training epoch, and when you save the model, those cells should be empty. Then, if you want to start predicting on the test set, you can predict on the last n training points (without saving the values anywhere) and start saving values once you get to your first test point.
It is often good practice to seed the model before prediction. If I want to evaluate on test_set[10:20,:], I can let the model predict on test_set[:10,:] first to seed it, then start saving my predicted values once I get to the range I am interested in.
To address the further-training question: you do not need to train the model further to predict. Training is only for tuning the model's weights. Look into this blog for more information on stateful vs stateless LSTMs.
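A minimal sketch of that seeding idea, assuming a trained stateful Keras model built with batch_size=1 and arrays train_set and test_set of shape (n_samples, timesteps, features); the variable names here are made up for illustration:

import numpy as np

# Clear any leftover cell state before starting a fresh prediction run.
model.reset_states()

# Seed: predict on the last 10 training points and discard the outputs,
# so the cell state reflects the recent history.
for x in train_set[-10:]:
    model.predict(x[np.newaxis, ...])

# Now keep the predictions for the range we actually care about.
predictions = [model.predict(x[np.newaxis, ...]) for x in test_set[:20]]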

Train and validation data structure

What will happen if I use the same training data and validation data for my machine learning classifier?
If the train data and the validation data are the same, the trained classifier will have a high accuracy, because it has already seen the data. That is why we use train-test splits. We take 60-70% of the data to train the classifier, and then run the classifier against the remaining 30-40%, the validation data, which the classifier has not seen yet. This helps measure the accuracy of the classifier and its behavior, such as overfitting or underfitting, before it faces a real test set with no labels.
We create multiple models and then use the validation to see which model performed the best. We also use the validation data to reduce the complexity of our model to the correct level. If you use train data as your validation data, you will achieve incredibly high levels of success (your misclassification rate or average square error will be tiny), but when you apply the model to real data that isn't from your train data, your model will do very poorly. This is called OVERFITTING to the train data.
Basically, nothing useful happens. You are just validating your model's performance on the same data it was trained on, which doesn't yield anything new. It is like teaching someone to recognize an apple and then asking them to recognize that very same apple to see how well they perform.
Why is a validation set used then? In short, the train and validation sets are assumed to be generated from the same distribution, so a model trained on the training set should perform almost equally well on examples from the validation set that it has not seen before.
Generally, we divide the data into training and validation sets to prevent overfitting. To explain, consider a model that classifies whether an image shows a human, and a dataset containing 1000 human images. If you train your model on all the images in that dataset and then validate it on the same dataset, your accuracy will be about 99%. However, when you give the model an image from a different dataset to classify, the accuracy will be much lower. For this example, generalizing the model means training it to look for a basic stickman shape to decide whether an image shows a human, instead of looking for one specific handsome blond man. That is why we divide the dataset into validation and training sets: to generalize the model and prevent overfitting.
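A minimal sketch of such a split, assuming a feature matrix X, labels y, and a scikit-learn style classifier clf (all hypothetical names); the 70/30 ratio is just an example:

from sklearn.model_selection import train_test_split

# Hold out 30% of the data that the classifier never sees during training.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3)

clf.fit(X_train, y_train)
print("train accuracy:     ", clf.score(X_train, y_train))  # optimistic
print("validation accuracy:", clf.score(X_val, y_val))      # estimates generalization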
TLDR;
If you use the same dataset for training and validation then:
training_accuracy = testing_accuracy
Your testing_accuracy will be the same as your training_accuracy if you use the training dataset as the validation dataset. Therefore you will NOT be able to tell whether your model has overfit or not.
Let's talk about datasets and evaluation metrics. Here is some terminology (reference) -
Datasets:
Training dataset: The data used to fit the model.
Validation dataset: The data used to validate the generalization ability of the model, or for early stopping during the training process. In most cases this is the same as the test dataset.
Evaluations:
Training accuracy: The accuracy you achieve when comparing predictions and actuals from the training data itself.
Testing accuracy: The accuracy you achieve when comparing predictions and actuals from the testing/validation data.
With the training_accuracy, you can get a sense of how well a model fits your data, and the testing_accuracy tells you how well that model generalizes. If training_accuracy is low, your model has underfitted and you may need a better model (better features, a different architecture, etc.) for the given problem. If training_accuracy is high but testing_accuracy is low, your model fits the data well but is not generalizable to unseen data. This is overfitting.
Note: In practice, it is better to have an overfit model and regularize it heavily than to work with an underfit model.
Another important thing to understand is that training a model (fit) and inference from a model (predict / score) are 2 separate tasks. Therefore, when you use the training dataset as your validation dataset, you are still training the model on the training dataset, but during inference you are scoring on that same training dataset, which will give you the same accuracy as the training_accuracy.
You will therefore not come to know whether you have overfit at all. BUT that doesn't mean you will get 99% accuracy like the other answer suggests! You may still underfit and get an extremely low model accuracy.
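To illustrate the last point, here is a sketch assuming a compiled Keras classifier model and training arrays X_train and y_train; passing the training data as validation data just echoes the training accuracy (in older Keras versions the history keys are 'acc' and 'val_acc'):

history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_train, y_train))  # same data!

# val_accuracy simply mirrors accuracy, so it reveals nothing about
# overfitting; a very low value would still reveal underfitting, though.
print(history.history['accuracy'][-1])
print(history.history['val_accuracy'][-1])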

conv net save weight and new test set

I'm using a conv net for image classification.
There is something I don't understand theoretically.
For training, I split my data 60% train / 20% validation / 20% test.
I save the weights when the metric on the validation set is best (I get the same performance on the training and validation sets).
Now, I do a new split. Some data from the old training set ends up in the new test set. I load the weights and classify the new test set.
Since the weights were computed on part of the new test set, do we agree that this is a bad procedure and that I should retrain my model with my new training/validation sets?
Yes. For a fair evaluation, no sample in the test set should be seen during training.
The whole purpose of having a test set is that the model must never see it until the very last moment.
So if your model was trained on some of the data in your test set, the test set becomes useless and the results it gives you will have no meaning.
So basically:
1. Train on your train set
2. Validate on your validation set
3. Repeat 1 and 2 until you are happy with the results
4. At the very end, finally test your model on the test set
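A minimal sketch of a reproducible 60/20/20 split, assuming arrays X and y; fixing random_state keeps the split identical across runs, so saved weights are never evaluated on samples that were once in the training set:

from sklearn.model_selection import train_test_split

# First carve off 40%, then split it half-and-half into validation and test.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)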

How to get the prediction of new data by LSTM in python

This is a univariate time-series prediction problem. As the following code shows, I divide the initial data into a train dataset (trainX) and a test dataset (testX), then I create an LSTM network with Keras. Next, I train the model on the train dataset. However, when I want to get predictions, I need to supply the test values, so my problem is: why do I have to predict at all, since I already know the true values (the test dataset)? What I actually want are predicted values for future times. If I have misunderstood the LSTM network, please tell me.
Thank you!
from keras.models import Sequential
from keras.layers import Dense, LSTM

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
Since we don't have the future values while training the model, we divide the data into train and test sets and simply pretend that the test set lies in the future. We train our model using the train set (and usually also a validation set), and after the model is trained, we test it on the test set to check its performance.
why do I have to predict since I already know the true values (the test dataset)? What I want are predicted values for future times.
In ML, we give the model test data X and it returns Y. In the case of time series this may mislead a beginner a bit, since the input is X and the output apparently is X as well. The difference is that we input old values of the time series as X, and the output Y is a value of the same time series, but at a future point (it can also be applied to the present or even the past), as you have correctly identified.
(P.S.: I would recommend beginning with simple regression and then moving on to LSTMs etc., if your goal is to learn machine learning.)
I think the correct term in this context is 'Forecasting'.
A good explanation is: after you train and test your model with the data you already have (as the others said before me), you want to predict future data, which is, I think, the truly interesting thing about recurrent networks.
So in order to do this, you need to start by predicting the value for the day after the final date in your original dataset, using the model (which was trained on this past data). Once you predict this value, you do the same thing again, but now taking the last predicted value into account, and so on.
The fact that you are using predictions to make further predictions means it is much harder to get good results, so it is common to predict only short ranges of time.
The exact code you need could vary, but I think that is the core concept; here is a rough sketch.
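This sketch assumes the trained model from the question, the same look_back window length, and a 1-D NumPy array series already scaled the same way as the training data:

import numpy as np

horizon = 15                         # days to forecast
window = series[-look_back:].copy()  # last known values
forecast = []

for _ in range(horizon):
    x = window.reshape(1, 1, look_back)  # (samples, timesteps, features)
    yhat = model.predict(x)[0, 0]
    forecast.append(yhat)
    # Slide the window: drop the oldest value, append the new prediction.
    window = np.append(window[1:], yhat)

print(forecast)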
In the link below, in the last part, where a forecast is performed, the author shows us code and an explanation of how he did it.
https://towardsdatascience.com/time-series-forecasting-with-recurrent-neural-networks-74674e289816
I guess that's it.

Test neural network using Keras Python

I have trained and tested a feed-forward neural network using Keras in Python with a dataset. But each time, in order to recognize a new test set with external data (external because the data are not included in the original dataset), I have to re-train the network before computing predictions for the test set. For instance, each time I have to do:
model.fit(data, output_data)
prediction = model.predict_classes(new_test)
print("Prediction:", prediction)
Obtaining the correct output:
Prediction: [1 2 3 4 5 1 2 3 1 2 3]
Acc: 100%
Now I would like to test a new test set, namely "new_test2.csv", without re-training, just using what the network has already learned. I am also thinking about a sort of real-time recognition.
How should I do that?
Thanks in advance
With a well trained model you can make predictions on any new data. You don't have to retrain anything because (hopefully) your model can generalize its learning to unseen data and will achieve comparable accuracy.
Just feed in the data from "new_test2.csv" to your predict function:
prediction=model.predict_classes(content_of_new_test2)
Obviously you need data of the same type and classes. In addition to that you need to apply any transformations to the new data in the same way you may have transformed the data you trained your model on.
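For example (a sketch, assuming "new_test2.csv" holds comma-separated feature rows preprocessed exactly like the training data):

import numpy as np

new_test2 = np.loadtxt('new_test2.csv', delimiter=',')
prediction = model.predict_classes(new_test2)
print("Prediction:", prediction)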
If you want real-time predictions you could set up an API with Flask:
http://flask.pocoo.org/
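A minimal sketch of such an API, assuming a model saved as 'my_model.h5'; the route name and the JSON payload format are made up for illustration:

import numpy as np
from flask import Flask, request, jsonify
from keras.models import load_model

app = Flask(__name__)
model = load_model('my_model.h5')

@app.route('/predict', methods=['POST'])
def predict():
    # Expects e.g. {"features": [[0.1, 0.2, ...]]}
    features = np.array(request.json['features'])
    prediction = model.predict_classes(features)
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run()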
Regarding terminology and correct method of training:
You train on a training set (e.g. 70% of all the data you have).
You validate your training with a validation set (e.g. 15% of your data). You use the accuracy and loss values from your training to tune your hyperparameters.
You then evaluate your model's final performance by predicting data from your test set (again 15% of your data). That has to be data your network hasn't seen before at all and that you haven't used to optimize training parameters.
After that you can predict on production data.
If you want to save your trained model use this (taken from Keras documentation):
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')
https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
In your training file, you can save the model using
model.save('my_model.h5')
Later, whenever you want to test, you can load it with
from keras.models import load_model
model = load_model('my_model.h5')
Then you can call model.predict and whatnot.
