Is there a Python-based library providing an SVM implementation under the GPL or another open-source license? I have come across a few that wrap SVM logic written in C, but none that are coded entirely in Python.
Regards,
Mandar
libsvm has Python bindings.
Edit: Googling found PyML, but I haven't used it.
You might want to check out this link. It has a big collection of machine learning software and lists 50+ libraries written in Python:
http://mloss.org/software/language/python/
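To give an idea of what "coded entirely in Python" could look like, here is a minimal sketch of a linear SVM trained with the Pegasos sub-gradient method, using only the standard library. It is not how libsvm or PyML work internally. The function names, the hyperparameters (`lam`, `epochs`) and the toy data are assumptions made for illustration, and it only covers the linear, two-class case.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Train a linear SVM with Pegasos-style sub-gradient steps.

    X is a list of feature lists, y holds labels in {-1, +1}.
    The bias is learned as the weight of a constant 1.0 feature
    (so it is regularised too, a simplification of this sketch).
    """
    rng = random.Random(seed)
    data = [list(x) + [1.0] for x in X]   # append the bias feature
    w = [0.0] * len(data[0])
    order = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decreasing step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, data[i]))
            # Shrink the weights (gradient of the L2 penalty).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, data[i])]
    return w


def predict(w, x):
    """Return +1 or -1 for a single feature list x."""
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable toy data.
X = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
predictions = [predict(w, x) for x in X]
```

For kernels or larger data sets, the libsvm bindings mentioned above are likely to be the more practical choice.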
I am trying to extract text from an image using keras-ocr. Does it support other written languages?
I can't find proper documentation on support for other languages.
Unfortunately the answer is no. Keras is written in Python and can only be used in this language. What you are looking for is referred to as an API wrapper, and there is none available at this time. As an alternative, you can look into converting your Keras model into another language, for example with keras2cpp, as explained in a similar question.
I want to do topic modeling on short texts. I did some research on LDA and found that it doesn't go well with short texts. What methods would be better and do they have Python implementations?
You can try Short Text Topic Modelling (refer to https://www.groundai.com/project/sttm-a-tool-for-short-text-topic-modeling/1; code available at https://github.com/qiang2100/STTM). It combines state-of-the-art algorithms with traditional topic models for long text, all of which can conveniently be applied to short text.
For more specialised libraries, try lda2vec-tf, which combines word vectors with LDA topic vectors. It is a fork of the original lda2vec that improves on it and gives better results than the original library.
Besides GSDMM, there is also biterm, implemented in Python, for short text topic modeling.
The only Python implementation of short text topic modeling is GSDMM. Unfortunately, most of the others are written in Java.
Here's a very fast and easy to use implementation of GSDMM that can be used in Python that I wrote recently: https://github.com/centre-for-humanities-computing/tweetopic
I found the existing implementations quite lacking, especially performance-wise. This one usually runs about 60 times faster than gsdmm, is much better documented, and is fully compatible with sklearn.
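On the biterm approach mentioned in the answers above: as far as I understand it, the biterm topic model learns topics from unordered word pairs ("biterms") that co-occur in the same short text, pooled over the whole corpus, which sidesteps the sparsity of per-document word counts. The sketch below only shows the biterm extraction step, not the topic inference, and the function name and toy documents are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered pair of tokens from one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical tokenised short texts.
docs = [["cheap", "flight", "deal"], ["flight", "delay", "news"]]

# Corpus-level biterm counts: the input a biterm model would work from.
biterm_counts = Counter(b for doc in docs for b in extract_biterms(doc))
```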
Has anyone tried to run the CHAID algorithm on continuous predictors?
At first, I used SPSS Modeler and it worked fine.
But when I tried it in Python 3.6, it didn't work for me.
Thanks :)
P.S. The CHAID package can be found here:
https://github.com/Rambatino/CHAID
I'm the author of that library.
It's usually better to post on the issues tab of the GitHub repo, as questions have more visibility there.
Unfortunately, with regard to continuous predictors, they need to be binned before they can be run through CHAID. We haven't implemented a binning strategy as it's very subjective (SPSS makes a lot of decisions under the hood).
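As one possible way to do the binning yourself, here is a minimal sketch of equal-frequency (quantile) binning using only the standard library. This is just one strategy among many and not something the CHAID package provides; the function names, the number of bins and the `ages` data are assumptions for illustration.

```python
from bisect import bisect_right


def quantile_cut_points(values, n_bins):
    """Pick n_bins - 1 cut points so each bin holds roughly equal counts."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]


def quantile_bin(values, n_bins=4):
    """Map each continuous value to a bin index in 0 .. n_bins - 1."""
    cuts = quantile_cut_points(values, n_bins)
    # A value equal to a cut point goes into the upper bin.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical continuous predictor.
ages = [18, 22, 25, 31, 38, 44, 52, 67]
age_bins = quantile_bin(ages, n_bins=4)
```

The resulting bin indices keep their order, so the binned column could then be treated as an ordinal predictor rather than a continuous one.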
Is there any Python library with functions to perform fixed or random effects meta-analysis?
I have searched through Google, PyPI and other sources, but it seems that the most popular Python stats libraries lack this functionality.
It would be great if it also provided graphical tools to produce funnel plots and forest plots.
Forest plot example:
I thought of something similar to the R package rmeta.
I've found some people creating their own functions manually, but that isn't an actual library. In addition, metasoft was promising, but it uses Python only to convert between formats.
Just to say, it seems the most widely used tool is R's metafor, which provides seemingly every possible method and includes essential plotting functions.
In Python, PythonMeta is the backend for a web-based tool, PyMeta, and offers many of the methods (fixed and random effects, various data types) found in metafor.
The PyMARE project is still under development but does provide various fixed and random effects meta-analysis estimators (it is a spin-off from the rather more mature NiMARE tool for neuroimaging meta-analysis).
statsmodels now also offers some options for meta-analysis and visualization of its results. More information here:
https://www.statsmodels.org/devel/examples/notebooks/generated/metaanalysis1.html
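If you only need the pooled estimate, the core arithmetic is short enough to write by hand. Below is a minimal sketch of inverse-variance fixed-effect pooling and the DerSimonian-Laird random-effects estimator, standard library only. It is not the API of any of the packages above; the function names and the example effect sizes are made up for illustration, and it assumes at least two studies with known sampling variances.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se


def random_effects_dl(effects, variances):
    """DerSimonian-Laird random-effects pooling (needs >= 2 studies)."""
    weights = [1.0 / v for v in variances]
    pooled_fe, _ = fixed_effect(effects, variances)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - pooled_fe) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    k = len(effects)
    tau2 = max(0.0, (q - (k - 1)) / c)   # between-study variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2


# Hypothetical effect sizes and sampling variances from two studies.
effects = [0.0, 1.0]
variances = [0.01, 0.01]
fe_pooled, fe_se = fixed_effect(effects, variances)
re_pooled, re_se, tau2 = random_effects_dl(effects, variances)
```

Forest and funnel plots would still need a plotting library on top of this.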
What are the standard tf-idf implementations/APIs available in Python? I've come across the one in nltk. I want to know which other libraries provide this feature.
There is a package called scikit-learn that calculates tf-idf scores.
You can refer to my answer to this question:
Python: tf-idf-cosine: to find document similarity
and also see the code in that question. Thanks.
Try these libraries, which implement the TF-IDF algorithm in Python:
http://code.google.com/p/tfidf/
https://github.com/hrs/python-tf-idf
Unfortunately, questions asking for a tool or library are off-topic on SO. There are a lot of machine learning libraries implementing tf-idf. In my view, the two most comprehensive, besides the already mentioned nltk, are sklearn and gensim.
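For reference, the basic calculation these libraries implement fits in a few lines. This is a minimal sketch of the textbook formula (term frequency times log inverse document frequency), standard library only. The function name and the toy documents are made up for illustration, and libraries such as scikit-learn apply smoothing and normalisation by default, so their numbers will differ from this.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf weights for a list of tokenised documents.

    Returns one {term: weight} dict per document, where
    weight = (count / document length) * log(n_docs / document frequency).
    """
    n_docs = len(docs)
    # Number of documents each term appears in.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenised documents.
docs = [["python", "svm"], ["python", "tfidf", "tfidf"]]
scores = tf_idf(docs)
```

A term that appears in every document, like "python" here, gets a weight of zero because its inverse document frequency is log(1).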