I am trying to train a Hidden Markov Model (HMM) using the GHMM library. So far, I have been able to train both a discrete model, and a continuous model using a single Gaussian for each of the states.
There are really good examples on how to do it here.
However, I would like to train a continuous HMM with a single covariance matrix tied across all states (instead of having one for each state). Is that possible with GHMM lib? If it is, I would love to see some examples. If not, could somebody point me to some other code, or refer me to another HMM python/c library that can actually do it?
Thank you!
So, I have found this great package in C that has an HMM implementation exactly the way I wanted: Queen Mary Digital Signal Processing Library. More specifically, the HMM implementation is in these files. So, no need to use GHMM lib anymore.
Related
For a project I am working on, I need to find a model for the data graphed below that includes a sine or cosine component (hard to tell from the image but the data does follow a trig-like function for each period, although the amplitude/max/mins are changing).
data
I originally planned on finding a simple regression model for my data using Desmos before I saw how complex the data was, but alas, I do not think I am capable of determining what equation to use without the help of Python. I don't have much experience with regression in Python, I've only done basic linear modeling where I knew the type of equation and was just determining the coefficients/constants. Could anyone offer a guiding example, git code, or resources that would be useful for this?
Your question is pretty generic and looking at the graph, we cannot tell much about the data to give you a more detailed answer, but i'd say have a look at OLS
https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.html
You could also look at scikit learn for the various regression models it provides.
http://scikit-learn.org/stable/modules/linear_model.html
Essentially,these packages will help you figure our the equation you are looking to have for your data.
Also, looks like your graph has an outlier ? Please note regression is very sensitive to outliers, so you may want to handle those data points before fitting the model.
I am working on a hmm for financial time series price data using the hmmlearn package (http://hmmlearn.readthedocs.io/en/latest/). I would like to implement a 2 or three state model to fit my data on. However I would like to have different distributions depending on the hidden state. It seems you can implement a custom emission probability using the _BaseHMM override. However I am not completely sure how to do this form the documentation. Are there any available example or could somebody please provide how to set e.g. two different emission probabilities for each state in a hmm framework like this?
Many thanks in advance.
Is there a way to have an x,y pair dataset given to a function that will return a list of curve fit models and the coeff. The program DataFit does this with about 200 different models, but we are looking for a pythonic way. From exponential to inverse polynomial etc.
I have seen many posts of manually using scipy to type each model, but this is not feasible for the number of models we want to test.
The closest I found was pyeq2, but this is not returning the list of functions, and seems to be a rabbit hole to code for.
If R has this available, we could use that but python is really the goal
Below is an example of the data, we want to find the best way to describe this curve
You can try library splines in R. I have used this for higher order curve fitting to some univariate data. You can try to change and achieve similar thing with corresponding R^2 errors.
You can either decide to do the following:
Choose a model to fit a parameters. This model should be based on a single independent variable. This can be done by python's scipy.optimize curve_fit function. You can choose something like a hyberbola.
Choose a model that is complex and likely represents an underlying mechanism of something at work. Like the system of ODE's from a disease SIR model. Fitting the parameters will be no easy task. This will be done by Markov Chain Monte Carlo (MCMC) methods. This is VERY difficult.
Realise that you have data and can use machine learning via scikit learn to predict from your data. This is a method that doesn't require parameters.
Machine learning and neural networks don't fit something and can't really tell you about the underlying mechanism but can make predicitions just as a best fit model would...dare I say even better.
In the end, we found that Eureqa software was able to achieve this. https://www.nutonian.com/products/eureqa/
I had some trouble finding a good transductive svm (semi-supervised support vector machine or s3vm) implementation for python. Finally I found the implementation of Fabian Gieseke of Oldenburg University, Germany (code is here: https://www.ci.uni-oldenburg.de/60506.html, paper title: Fast and Simple Gradient-Based Optimization for Semi-Supervised Support Vector Machines).
I now try to integrate the learned model into my scikit-learn code.
1) This works already:
I've got a binary classification problem. I defined a new method inside the S3VM-code returning the self.__c-coeficients (these are needed for the decision function of the classifier).
I then assign these (in my own scikit-code where clf stands for a svm.SVC-classifier) to clf.dual_coefs_ and properly change clf.support_ too (which holds the indices of the support vectors). It takes a while because sometimes you need numpy-arrays and sometimes lists etc. But it works quite well.
2) This doesnt work at all:
I want to adapt this now to a multi-class-classification problem (again with an svm.SVC-classifier).
I see the scheme for multi-class dual_coef_ in the docs at
http://scikit-learn.org/stable/modules/svm.html
I tried some things already but seem to mess it up all the time. My strategy is as follows:
for all pairs in classes:
calculate the coefficients with qns3vm for the properly binarized labeled training set (filling 0s into spaces in the coef-vector where instances have been in the labeled training set that are not in the current class-pair) --> get a 1x(l+u)-np.array of coefficients
horizontally stack these to get a (n_class*(n_class-1)/2)x(l+u) matrix | I do not have a clue why the specs say that this should be of shape [n_class-1, n_SV(=l+u)]?
replace clf.dual_coef_ with this matrix
Does anybody know the right way to replace dual_coef_ in the multi-class-setting? Or is there a neat piece of example code someone can recommend? Or at least a better explanation for the shape of dual_coef_ in the one-vs-one-multiclass-setting?
Thanks!
Damian
I'm wondering what the set_weights method of the Maxent class in NLTK is used for (or more specifically how to use it). As I understand, it allows you to manually assign weights to certain features? Could somebody provide a basic example of the type of parameter that would be passed into it?
Thanks
Alex
It apparently allows you to set the coefficient matrix of the classifier. This may be useful if you have an external MaxEnt/logistic regression learning package from which you can export the coefficients. The train_maxent_classifier_with_gis and train_maxent_classifier_with_iis learning algorithms call this function.
If you don't know what a coefficient matrix is; it's the β mentioned in Wikipedia's treatment of MaxEnt.
(To be honest, it looks like NLTK is either leaking implementation details here, or has a very poorly documented API.)