I am working on a Django website. I want to search through a lot of text stored in Django models (the text is something like Stack Overflow questions). I am doing the search with Haystack + Whoosh, and it is very nice to use, much better than Model.objects.filter(body_text__icontains="food").
So I would like to know whether I can get spelling suggestions using Whoosh or some other pure-Python package. I don't like Solr, since it needs Java, and after every update I need to rebuild the index using Java (Solr).
Whoosh's documentation for version 2.4.1 indicates it does indeed have a pure-Python spelling suggestion module.
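For reference, a minimal sketch of what that looks like when talking to Whoosh 2.x directly (the index directory and the "body" field name are just examples, not anything Haystack-specific):

```python
# Sketch of Whoosh's "did you mean" support, based on the 2.x docs.
# Depending on your Whoosh version you may need TEXT(spelling=True)
# on the field in your schema for suggestions to work.
from whoosh import index
from whoosh.qparser import QueryParser

ix = index.open_dir("search_index")  # path to an existing index
with ix.searcher() as searcher:
    # Suggest corrections for a single misspelled word, drawn from the
    # terms actually indexed in the "body" field.
    corrector = searcher.corrector("body")
    print(corrector.suggest("fodd", limit=3))

    # Or correct a whole query string at once.
    qstring = "fodd recipies"
    query = QueryParser("body", ix.schema).parse(qstring)
    corrected = searcher.correct_query(query, qstring)
    if corrected.query != query:
        print("Did you mean:", corrected.string)
```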
So, I have a problem statement in which I want to extract the list of users who are following a particular hashtag, like #obama, #corona, etc.
The challenge here is that I want to extract this data anonymously, i.e. without providing any account keys.
I tried a library named twint that is capable of doing this, but it's very slow. Can anyone recommend a better alternative for my use case?
There's no such library available that satisfies your use case. Yes, there's the twint library, but as you mentioned it's too slow for your needs, so try libraries in some other languages and see if something is available over there.
You can try to write a script in Python using Selenium, and I think you could get the names of those users quite fast.
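Something along these lines, as a rough sketch only: the search URL and the CSS selector are placeholders (Twitter's markup changes frequently, so inspect the page and adjust them), and it collects the handles of users tweeting the hashtag rather than literal "followers" of it.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()  # assumes chromedriver is on your PATH
driver.get("https://twitter.com/search?q=%23corona&f=live")
time.sleep(5)  # crude wait for the page and the first tweets to load

# Scroll a few times so more tweets are rendered.
for _ in range(5):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)

# Placeholder selector: replace with whatever currently matches the
# tweet author elements on the rendered page.
handles = {
    el.text
    for el in driver.find_elements(By.CSS_SELECTOR, "div[data-testid='User-Name'] span")
    if el.text.startswith("@")
}
print(handles)
driver.quit()
```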
This GitHub repo I found might be useful. It does not require authentication to get Twitter data. Have a look at it: https://github.com/bisguzar/twitter-scraper
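The basic usage from its README looks roughly like this; whether it supports hashtag/search queries (rather than just usernames), and whether it still works against current Twitter pages, is worth checking in the repo before relying on it.

```python
# pip install twitter-scraper
from twitter_scraper import get_tweets

# Prints the text of recent tweets from an account, per the README example.
for tweet in get_tweets("twitter", pages=1):
    print(tweet["text"])
```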
I tried this approach last year, but found out that my date range fell well outside of the available info provided by Twitter, and had to use the Premium API. If this is not a constraint for you, and since you do not want to code your own scraper, take a look at this option:
TweetScraper: updated in September last year, and it also provides MongoDB integration. I haven't tried it, but it seems to work OK. I don't know about its time performance.
I have a project in which I need to automatically generate a Python class from some configuration files, using Python.
During my searches I came across Jinja2, which seems to be very popular for generating web pages, but I couldn't really find a similar case where Jinja is used to generate Python code (I know it is definitely possible; the lack of examples just made me hesitate).
Does it make sense to use Jinja2 for my case, or is there an easier solution for generating Python from Python?
You can use Jinja to generate any text. I use Jinja myself to generate Python, and there is at least one previous Stack Overflow post on this.
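To make it concrete, here is roughly what that looks like; the class name and fields below are made up for illustration and would come from your configuration files.

```python
from jinja2 import Template

# Template for a simple Python class with keyword-argument constructor.
CLASS_TEMPLATE = Template('''\
class {{ class_name }}:
    def __init__(self{% for field in fields %}, {{ field }}=None{% endfor %}):
{% for field in fields %}        self.{{ field }} = {{ field }}
{% endfor %}''')

# In your project this dict would be built from the configuration files.
config = {"class_name": "ServerConfig", "fields": ["host", "port", "timeout"]}

source = CLASS_TEMPLATE.render(**config)
with open("generated_config.py", "w") as f:
    f.write(source)
print(source)
```

Rendering produces an importable module containing the generated class; Jinja doesn't care that the output happens to be Python rather than HTML.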
I am a new user of Python and would like to try subnets-resolver. However, I can't find the documentation for this package. Can someone point me to it?
It seems you are out of luck. Google does not bring anything up, and the code contains no docstrings (quite unpythonic). You will have to figure it out yourself. You could write documentation in the process and make it available for others, though.
I'm trying to create a simple Go function that takes in a string of reddit-style Markdown and returns the appropriate HTML.
Right now, I know that having Discount installed is a prerequisite and that at least the following three files are used by reddit as wrappers around Discount:
https://github.com/reddit/reddit/blob/master/r2/r2/lib/c/reddit-discount-wrapper.c
https://github.com/reddit/reddit/blob/master/r2/r2/lib/c_markdown.py
https://github.com/reddit/reddit/blob/master/r2/r2/lib/py_markdown.py
Based on this, does anyone know how I can sort of glue all this together with Cgo and go-python to create a simple Markdown function? (independent of the rest of the reddit source code)
If all you want is Markdown, I don't see how Python fits into this. Maybe there's more to it, but if at all possible you should leave Python out of this. If there's a reason to use Python that wasn't in the question, I can edit this answer and address that.
First, try this native Go Markdown package: https://github.com/knieriem/markdown
If that doesn't work, the next easiest thing is to take Discount (or any other Markdown library written in C, such as GitHub's Upskirt fork) and wrap it with cgo or SWIG.
I am working on a Django project where I need to implement full-text search. I have looked at SOLR and found some good comments about it, but since it is implemented in Java, it would need a Java environment installed on the system along with Python. Looking for a Python equivalent of SOLR, I found Whoosh, but I am not sure whether Whoosh is as efficient and capable as SOLR. Should I go with SOLR anyway, or are there better options than Whoosh and SOLR for Python?
Please suggest.
Thanks in advance
Whoosh is actually very fast for a python-only implementation. That said, it's still at least an order of magnitude slower than SOLR. Depending on the amount of data you need to index and search, and your requirements on maximum allowable latency and concurrent searches, it may not be an option.
SOLR is a bit of a complicated beast, but it's by far the most comprehensive search solution. Mix it with solrpy for stunning results. Yes, you will need Java hosting.
You might also want to check out the Python bindings for Xapian. Xapian is very fast, but less of a complete solution than SOLR. The bindings are GPL-licensed though, so that may or may not be viable for you.
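To give a sense of what the Whoosh option above looks like in code, here is a minimal index-and-search round trip; the directory name and field names are illustrative only.

```python
import os
from whoosh.index import create_in
from whoosh.fields import Schema, ID, TEXT
from whoosh.qparser import QueryParser

# Define the fields to index; stored fields can be returned with hits.
schema = Schema(pk=ID(stored=True, unique=True), body=TEXT(stored=True))

os.makedirs("search_index", exist_ok=True)
ix = create_in("search_index", schema)

# Add a couple of documents and commit the index.
writer = ix.writer()
writer.add_document(pk="1", body="Where can I find good street food in Lisbon?")
writer.add_document(pk="2", body="Django model filtering with icontains is slow.")
writer.commit()

# Parse a query against the "body" field and print matching documents.
with ix.searcher() as searcher:
    query = QueryParser("body", ix.schema).parse("food")
    for hit in searcher.search(query, limit=10):
        print(hit["pk"], hit["body"])
```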
I have used Lucene and Lucene extensions like SOLR and Nutch, and I found that Lucene pretty much satisfies what I need. I've only tried Whoosh once, but I chose Lucene because:
1) I am using Java
2) I had trouble making UTF-8 work with Whoosh (not sure if it works out of the box now). In Lucene, I had no trouble working with Chinese characters.
If you're using Python as your programming language and Whoosh satisfies your needs, then I'd suggest you use it over the Java alternatives: you get better integration, avoid external dependencies, and can customize faster if you need to code additional functionality.
UPDATE: If you're interested in using Lucene, it has a Python wrapper; see http://lucene.apache.org/pylucene/