I am trying to extract text from images using keras-ocr. Does it support written languages other than English?
I can't find proper documentation on support for other languages.
Unfortunately, the answer is no. Keras is written in Python and can only be used from that language. What you are looking for is referred to as an API wrapper, and none is available at this time. As an alternative, you can look into converting your Keras model into another language, for example to C++ using keras2cpp, as explained in a similar question.
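For reference, basic keras-ocr usage from Python looks like the sketch below, following the project's README; as far as I know the default pipeline ships weights trained on English alphanumeric text, which is why other scripts are not supported out of the box:

    import keras_ocr

    # Build the default detector + recognizer pipeline (English-trained weights).
    pipeline = keras_ocr.pipeline.Pipeline()

    # Read one or more images (file paths or URLs) and run recognition.
    images = [keras_ocr.tools.read("example.png")]
    predictions = pipeline.recognize(images)

    # Each image's prediction is a list of (word, bounding_box) tuples.
    for word, box in predictions[0]:
        print(word)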
I want to do topic modeling on short texts. I did some research on LDA and found that it doesn't work well with short texts. What methods would be better, and do they have Python implementations?
You can try Short Text Topic Modelling (STTM, described at https://www.groundai.com/project/sttm-a-tool-for-short-text-topic-modeling/1, with code at https://github.com/qiang2100/STTM). It combines state-of-the-art short-text algorithms with traditional long-text topic modelling algorithms that can conveniently be applied to short texts.
For a more specialised library, try lda2vec-tf, which combines word vectors with LDA topic vectors. It is a fork of the original lda2vec, has been improved upon, and gives better results than the original library.
Besides GSDMM, there is also biterm, implemented in Python for short-text topic modeling.
The only Python implementation of short-text topic modeling is GSDMM. Unfortunately, most of the others are written in Java.
Here's a very fast and easy-to-use implementation of GSDMM for Python that I wrote recently: https://github.com/centre-for-humanities-computing/tweetopic
I found the existing implementations quite lacking, especially performance-wise. This one usually runs about 60 times faster than gsdmm, is much better documented, and is fully compatible with sklearn.
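A minimal usage sketch, assuming the sklearn-style API from tweetopic's README (the class and parameter names, e.g. DMM and n_components, may change between versions):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.pipeline import Pipeline
    from tweetopic import DMM

    # Toy corpus of short documents; use your own tweets/texts here.
    texts = [
        "new gpu released today",
        "team wins the championship game",
        "gpu prices keep climbing",
        "final game goes to overtime",
    ]

    # Vectorize the texts, then fit a Dirichlet Multinomial Mixture model.
    pipeline = Pipeline([
        ("vectorizer", CountVectorizer()),
        ("dmm", DMM(n_components=2, n_iterations=100, alpha=0.1, beta=0.1)),
    ])
    pipeline.fit(texts)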
I have been very happy with some language translations I did with Prolog, though that was long ago. I'm now using Python for general-purpose programming. The application area is DNA sequencing data processing, but that's beside the point.
I am interested in using a DCG (definite clause grammar) for translation into a target language. (A DCG is very close to being a set of Prolog predicates, and a DCG to Prolog interpretation layer is almost trivial, as I recall.) The method I used was to parse an input language, and at the same time as parsing the input expressions, build a network structure to represent a deeper model of the expression. Another grammar then served to elaborate that model into a valid expression in the target language.
This time, though, I'm looking to do just the second half: to take an internal model (a network of Python objects) and translate it into a target language. (This target language is a workflow configuration language, incidentally, and the network of objects is the one used by a pre-existing, less general workflow engine that I hope to abandon.)
So, are there any modern, supported Prolog implementations that cleanly interface to Python?
YAP provides a Python interface package:
http://www.dcc.fc.up.pt/~vsc/yap/
If you want to try it, I suggest you start with the current git version found at:
https://github.com/vscosta/yap-6.3
Some examples are provided with the distribution:
https://github.com/vscosta/yap-6.3/tree/master/packages/python/examples
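YAP's own Python package is the direct answer here; for illustration only, the general query-and-bindings pattern for driving Prolog from Python looks like the sketch below, using pyswip (which binds SWI-Prolog, not YAP):

    from pyswip import Prolog

    prolog = Prolog()

    # Assert a tiny mapping from model nodes to target-language snippets
    # (hypothetical predicate and atom names, just to show the pattern).
    prolog.assertz("emits(align_node, 'step: align_reads')")
    prolog.assertz("emits(qc_node, 'step: run_fastqc')")

    # Each solution is a dict of Prolog variable bindings.
    for solution in prolog.query("emits(Node, Text)"):
        print(solution["Node"], "->", solution["Text"])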
I have a Python script that reads in a file. It should then detect the language of the code in the file, get the language ID from https://ghostbin.com/languages.json, and upload the file to https://ghostbin.com with the language ID as a parameter.
The problem is detecting the programming language used. I haven't found any library to help me out.
Here is a module that uses a Naive Bayes classifier to do what you want, with a corresponding discussion. The caveat is that the module needs to be trained on code samples. It should be easy enough to modify it to retain its training.
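As a sketch of the same approach (not the linked module's API), you can train a Naive Bayes classifier over character n-grams with scikit-learn; the training data below is a toy illustration, and a real classifier needs many samples per language:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Toy labelled code samples; replace with a real corpus.
    samples = [
        "def main():\n    print('hello')",
        "#include <stdio.h>\nint main(void) { return 0; }",
        "function main() { console.log('hello'); }",
    ]
    labels = ["python", "c", "javascript"]

    # Character n-grams capture syntax better than word tokens do for code.
    clf = make_pipeline(
        CountVectorizer(analyzer="char", ngram_range=(2, 4)),
        MultinomialNB(),
    )
    clf.fit(samples, labels)

    print(clf.predict(["console.log(42);"]))  # e.g. ['javascript']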
What are the standard tf-idf implementations/APIs available in Python? I've come across the one in nltk. I want to know which other libraries provide this feature.
There is a package called scikit-learn which calculates tf-idf scores.
You can refer to my answer to this question:
Python: tf-idf-cosine: to find document similarity
and also see the code in that question.
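For example, a minimal run with scikit-learn's TfidfVectorizer:

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = [
        "the cat sat on the mat",
        "the dog sat on the log",
    ]

    vectorizer = TfidfVectorizer()
    tfidf_matrix = vectorizer.fit_transform(docs)  # sparse (n_docs, n_terms)

    # get_feature_names() in older scikit-learn versions.
    print(vectorizer.get_feature_names_out())
    print(tfidf_matrix.toarray())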
Try these libraries, which implement the TF-IDF algorithm in Python:
http://code.google.com/p/tfidf/
https://github.com/hrs/python-tf-idf
Unfortunately, questions asking for a tool or library are off-topic on SO. There are a lot of machine learning libraries implementing tf-idf. The two most comprehensive of them, besides the already-mentioned nltk, are in my view sklearn and gensim.
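In gensim, tf-idf lives in models.TfidfModel; a minimal sketch:

    from gensim import corpora, models

    # Pre-tokenized toy corpus.
    texts = [
        ["human", "machine", "interface"],
        ["survey", "user", "computer", "system"],
    ]

    dictionary = corpora.Dictionary(texts)
    bow_corpus = [dictionary.doc2bow(text) for text in texts]

    tfidf = models.TfidfModel(bow_corpus)  # fits IDF statistics
    print(tfidf[bow_corpus[0]])            # [(term_id, weight), ...]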
Is there a Python-based library providing an SVM implementation with a GPL or any other open-source license? I have come across a few that provide a Python wrapper around SVM logic coded in C, but none that are coded entirely in Python.
libsvm has Python bindings.
Edit
Googling found PyML, but I haven't used it.
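A minimal sketch with the libsvm Python bindings (note the core is C, so this is a wrapper rather than a pure-Python SVM; the import path varies by version: the libsvm-official package on PyPI exposes libsvm.svmutil, while older installs use a top-level svmutil module):

    from libsvm.svmutil import svm_train, svm_predict

    # Two toy samples in libsvm's sparse format: {feature_index: value}.
    labels = [1, -1]
    features = [{1: 1.0, 2: 1.0}, {1: -1.0, 2: -1.0}]

    model = svm_train(labels, features, "-c 4")  # C-SVC with C=4
    predicted, accuracy, values = svm_predict(labels, features, model)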
You might want to check out this link; it has a big collection of machine learning software and lists 50+ libraries that have been written in Python:
http://mloss.org/software/language/python/