Time series forecasting (eventually with python) [closed]

Time series forecasting (eventually with python) [closed] - python

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
Improve this question
What algorithms exist for time series forecasting/regression ?
What about using neural networks ? (best docs about this topic ?)
Are there python libraries/code snippets that can help ?

The classical approaches to time series regression are:
auto-regressive models (there are whole literatures about them)
Gaussian Processes
Fourier decomposition or similar to extract the periodic components of the signal (i.e., hidden oscillations in the data)
Other less common approaches that I know about are
Slow Feature Analysis, an algorithm that extract the driving forces of a time series, e.g., the parameters behind a chaotic signal
Neural Network (NN) approaches, either using recurrent NNs (i.e., built to process time signals) or classical feed-forward NNs that receive as input part of the past data and try to predict a point in the future; the advantage of the latter is that recurrent NNs are known to have a problem with taking into account the distant past
In my opinion for financial data analysis it is important to obtain not only a best-guess extrapolation of the time series, but also a reliable confidence interval, as the resulting investment strategy could be very different depending on that. Probabilistic methods, like Gaussian Processes, give you that "for free", as they return a probability distribution over possible future values. With classical statistical methods you'll have to rely on bootstrapping techniques.
There are many Python libraries that offer statistical and Machine Learning tools, here are the ones I'm most familiar with:
NumPy and SciPy are a must for scientific programming in Python
There is a Python interface to R, called RPy
statsmodel contains classical statistical model techniques, including autoregressive models; it works well with Pandas, a popular data analysis package
scikits.learn, MDP, MLPy, Orange are collections of machine learning algorithms
PyMC A python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo.
PyBrain contains (among other things) implementations of feed-forward and recurrent neural networks
at the Gaussian Process site there is a list of GP software, including two Python implementations
mloss is a directory of open source machine learning software

I've no idea about python libraries, but there are good forecasting algorithms in R which are open source. See the forecast package for code and references for time series forecasting.

Two approaches
There are two ways on how to deal with temporal structured input for classification, regression, clustering, forecasting and related tasks:
Dedicated Time Series Model: The machine learning algorithm incorporates such time series directly. Such a model is like a black box and it can be hard to explain the behavior of the model. Example are autoregressive models.
Feature based approach: Here the time series are mapped to another, possibly lower dimensional, representation. This means that the feature extraction algorithm calculates characteristics such as the average or maximal value of the time series. The features are then passed as a feature matrix to a "normal" machine learning such as a neural network, random forest or support vector machine. This approach has the advantage of a better explainability of the results. Further it enables us to use a well developed theory of supervised machine learning.
tsfresh calculates a huge number of features
The python package tsfresh calculate a huge number of such features from a pandas.DataFrame containing the time series. You can find its documentation at http://tsfresh.readthedocs.io.
Disclaimer: I am one of the authors of tsfresh.

Speaking only about the algorithms behind them, I recently used the double exponential smoothing in a project and it did well by forecasting new values when there is a trend in the data.
The implementation is pretty trivial, but maybe the algorithm is not sufficiently elaborated for your case.

Did you tried Autocorrelation for finding periodical patterns in time series ?
You can do that with numpy.correlate function.

Group method of data handling is widely used to forecast financial data.

If you want to understand Time Series Forecasting using Python then below link is very helpful.
https://github.com/ManojKumarMaruthi/Time-Series-Forecasting

Related

Applying CNN to Fast time Fourier Transform?

I have data that fast time fourier transform is applied.
(amplitudes at specific Hzs)
There are solutions on internet that CNN is applied to mel spectrogram, however, I see no solution that CNN is applied to Fast Fourier Transformed signal.
Is it possible that CNN is applied to Fast Fourier Transformed signals?
Or is it not possible because CNN is considering temporal attribute?
Thanks!

I'm assuming each row of your spreadsheet is IID, e.g. it wouldn't change the problem to re-order the rows in that spreadsheet.
In this case you have a pretty typical ML problem. The fact that the FFT has already been applied and specific frequency responses (columns) have been extracted is a process called "feature engineering". Prior to the common use of neural networks, this was a standard step in all machine learning problems and remains common to a great many domains.
With data that has been feature engineered, you should look to traditional ML algorithms. Random Forests, XGBoost, and Linear Regression come to mind. A fully connected neural network is also appropriate, but I would typically expect it to under-perform other ML methods.
The hallmark of a CNN is that it operates on an ordered sequence of data. In your case the raw data, from which your dataset was derived, would be appropriate for a CNN. In a sound file you have a 1D sequence of information. You could not re-order the data in the time dimension without fundamentally changing its meaning.
A 2D CNN operates over an image where the pixel order in X and Y cannot be changed. Again the sequential order of the data matters. The same applies for 3D CNNs.
Be aware that the application of a FFT has fundamentally biased your solution by representing it only in a limited set of frequency responses. All feature engineering is fundamentally biasing the problem, presumably in a well thoughout-out way. However, it's entirely possible that other useful signals in the data exist, which aren't expressed by the FFT # 10, 20, 30 Hz, etc. The CNN has the capacity to learn its own version of an FFT as well as other non cyclic patterns. Typically, the lack of a feature engineering step is the key differentiator between the CNN and traditional ML algorithms.

How do I make a good feature using machine learning on a timeseries forecast that has only traffic volume as an input?

So I have a time series that only has traffic volume. I've done FB prophet and neural prophet. They work okay, but I would like to do something using machine learning. So far I have the problem of trying to make my features. Using the classical dayofyear, month, etc does not give me good results. I have tried using shift where I get the average, minimum, and max of the two previous days. However that would work, but my problem is when I try to predict days in advance the feature doesn't really work for that since I cant get the average of that day. My main concern is trying to find a good feature that my predicting future dataframe also has. A picture of my data is included. Does anyone know how I would do this?

First of all, you have to clarify some definitions. FBProphet works on a mechanism that is the same as any machine learning algorithm that is, fitting the model and then predicting the output. Being an additive regression model with a piecewise linear or logistic growth curve trend, it can be considered as a Machine Learning method that allows us to predict a continuous outcome variable.
Secondly, I think you missed the most important word that your question was about - namely: Feature engineering.
Feature Engineering includes :
Process of using domain knowledge to Extract features (characteristics, properties, attributes) from raw data.
Process of transforming raw data into features that better represent the underlying problem to the predictive models, etc..
But it's very unlikely to use Machine Learning to do Feature engineering. You do Feature engineering in order to improve your Machine Learning model. Many techniques such as imputation, handling outliers, binning, log transform, one-hot encoding, grouping operations, feature split, scaling are hybrid methods using a statistical approach and/or domain knowledge.
Regarding your data, bearing in my mind that the seasonality is already handled by FBProphet, I am not confident if feature engineering transformations such as adding the day of the week, adding holidays periods, etc... could really help improve performance...
To conclude, it is not possible to create ex-nihilo new features that would outperform your model. Whether you process/transform your data or add external domain-knowledge dataset

Deep learning, signal processing and feature engineering

I have a signals represented in python in dense matrices (the values are y-coordinates from a chart - eg. weather temp etc. in different locations around the world).
I'm currently trying to process/recognize these matrices via different kinds of deep neural networks.
I'm trying to enhance the important parts of the signals in many different ways as well...I know when we talk about sound / human voice...Log mel, MFCC etc. are useful tool to do such and enhance the important parts and vise-versa...
What are my best options to enhance the most important features (in a similar way) in my dense matrices (in other words to feature engineer the most important features)? I'm of course aware about the fact, that the model can't know what the most important features are and not (without some kind of help / feature engineering - but what more can I do as a human to improve the model). Like set the importance of each feature?

Machine Learning, What are the common techniques for feature engineering and presenting the model?

I am having a ML language identification project (Python) that requires a multi-class classification model with high dimension feature input.
Currently, all I can do to improve accuracy is through trail-and-error. Mindlessly combining available feature extraction algorithms and available ML models and see if I get lucky.
I am asking if there is a commonly accepted workflow that find a ML solution systematically.
This thought might be naive, but I am thinking if I can somehow visualize those high dimension data and the decision boundaries of my model. Hopefully this visualization can help me to do some tuning. In MATLAB, after training, I can choose any two features among all features and MATLAB will give a decision boundary accordingly. Can I do this in Python?
Also, I am looking for some types of graphs that I can use in the presentation to introduce my model and features. What are the most common graphs used in the field?
Thank you

Feature engineering is more of art than technique. That might require domain knowledge or you could try adding, subtracting, dividing and multiplying different columns to make features out of it and check if it adds value to the model. If you are using Linear Regression then the adjusted R-squared value must increase or in the Tree models, you can see the feature importance, etc.

Does the SVM in sklearn support incremental (online) learning?

I am currently in the process of designing a recommender system for text articles (a binary case of 'interesting' or 'not interesting'). One of my specifications is that it should continuously update to changing trends.
From what I can tell, the best way to do this is to make use of machine learning algorithm that supports incremental/online learning.
Algorithms like the Perceptron and Winnow support online learning but I am not completely certain about Support Vector Machines. Does the scikit-learn python library support online learning and if so, is a support vector machine one of the algorithms that can make use of it?
I am obviously not completely tied down to using support vector machines, but they are usually the go to algorithm for binary classification due to their all round performance. I would be willing to change to whatever fits best in the end.

While online algorithms for SVMs do exist, it has become important to specify if you want kernel or linear SVMs, as many efficient algorithms have been developed for the special case of linear SVMs.
For the linear case, if you use the SGD classifier in scikit-learn with the hinge loss and L2 regularization you will get an SVM that can be updated online/incrementall. You can combine this with feature transforms that approximate a kernel to get similar to an online kernel SVM.
One of my specifications is that it should continuously update to changing trends.
This is referred to as concept drift, and will not be handled well by a simple online SVM. Using the PassiveAggresive classifier will likely give you better results, as it's learning rate does not decrease over time.
Assuming you get feedback while training / running, you can attempt to detect decreases in accuracy over time and begin training a new model when the accuracy starts to decrease (and switch to the new one when you believe that it has become more accurate). JSAT has 2 drift detection methods (see jsat.driftdetectors) that can be used to track accuracy and alert you when it has changed.
It also has more online linear and kernel methods.
(bias note: I'm the author of JSAT).

Maybe it's me being naive but I think it is worth mentioning how to actually update the sci-kit SGD classifier when you present your data incrementally:
clf = linear_model.SGDClassifier()
x1 = some_new_data
y1 = the_labels
clf.partial_fit(x1,y1)
x2 = some_newer_data
y2 = the_labels
clf.partial_fit(x2,y2)

Technical aspects
The short answer is no. Sklearn implementation (as well as most of the existing others) do not support online SVM training. It is possible to train SVM in an incremental way, but it is not so trivial task.
If you want to limit yourself to the linear case, than the answer is yes, as sklearn provides you with Stochastic Gradient Descent (SGD), which has option to minimize the SVM criterion.
You can also try out pegasos library instead, which supports online SVM training.
Theoretical aspects
The problem of trend adaptation is currently very popular in ML community. As #Raff stated, it is called concept drift, and has numerous approaches, which are often kinds of meta models, which analyze "how the trend is behaving" and change the underlying ML model (by for example forcing it to retrain on the subset of the data). So you have two independent problems here:
the online training issue, which is purely technical, and can be addressed by SGD or other libraries than sklearn
concept drift, which is currently a hot topic and has no just works answers There are many possibilities, hypothesis and proofes of concepts, while there is no one, generaly accepted way of dealing with this phenomena, in fact many phd dissertations in ML are currenlly based on this issue.

SGD for batch learning tasks normally has a decreasing learning rate and goes over training set multiple times. So, for purely online learning, make sure learning_rate is set to 'constant' in sklearn.linear_model.SGDClassifier() and eta0= 0.1 or any desired value. Therefore the process is as follows:
clf= sklearn.linear_model.SGDClassifier(learning_rate = 'constant', eta0 = 0.1, shuffle = False, n_iter = 1)
# get x1, y1 as a new instance
clf.partial_fit(x1, y1)
# get x2, y2
# update accuracy if needed
clf.partial_fit(x2, y2)

A way to scale SVM could be split your large dataset into batches that can be safely consumed by an SVM algorithm, then find support vectors for each batch separately, and then build a resulting SVM model on a dataset consisting of all the support vectors found in all the batches.
Updating to trends could be achieved by maintaining a time window each time you run your training pipeline. For example, if you do your training once a day and there is enough information in a month's historical data, create your traning dataset from the historical data obtained in the recent 30 days.

If interested in online learning with concept drift then here is some previous work
Learning under Concept Drift: an Overview
https://arxiv.org/pdf/1010.4784.pdf
The problem of concept drift: definitions and related work
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.58.9085&rep=rep1&type=pdf
A Survey on Concept Drift Adaptation
http://www.win.tue.nl/~mpechen/publications/pubs/Gama_ACMCS_AdaptationCD_accepted.pdf
MOA Concept Drift Active Learning Strategies for Streaming Data
http://videolectures.net/wapa2011_bifet_moa/
A Stream of Algorithms for Concept Drift
http://people.cs.georgetown.edu/~maloof/pubs/maloof.heilbronn12.handout.pdf
MINING DATA STREAMS WITH CONCEPT DRIFT
http://www.cs.put.poznan.pl/dbrzezinski/publications/ConceptDrift.pdf
Analyzing time series data with stream processing and machine learning
http://www.ibmbigdatahub.com/blog/analyzing-time-series-data-stream-processing-and-machine-learning

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.