How can I sort posts alphabetically in flask? - python

I have been following the tutorial provided by Flask. I'm trying to change things around a bit and make it fit the criterion for a glossary.
I suspect that my issue lies in this line of code in my flaskr.py file:
cur = db.execute('select title, text from entries order by id desc')
The reason why I suspect this is because when I mess with it it breaks everything. As well, when I tried to "sort" everything it did nothing, oh and it says to order by id descending... that's mainly why.
What I tried was:
@app.route('/order', methods=['POST'])
def order_entry():
    entries.sort()
    return entries
Which is probably crude and sort of silly, but I'm particularly new to programming. I can't find any other places in my code where entries are being ordered.
I have looked for different ways to organize a dictionary alphabetically but haven't had too much luck making it work. As you can tell.

Assuming this is the Flask tutorial you're following, I think your function is missing some things. Is entries some sort of global variable, or did you just remove the part where it was created? I've tried to combine your code with one of the examples from the tutorial, and added some comments.
@app.route('/order', methods=['POST'])
def order_entry():
    # the following line creates a 'cursor' which you need to retrieve data
    # from the database
    cur = g.db.execute('select title, text from entries order by id desc')
    # the following line uses that cursor ("cur"), fetches the data,
    # turns it into an (unsorted) list of dictionaries
    entries = [dict(title=row[0], text=row[1]) for row in cur.fetchall()]
    # let's sort the list by the 'title' key now
    entries = sorted(entries, key=lambda d: d['title'])
    # or if you prefer, you could say: "entries.sort(key=lambda d: d['title'])"
    # return the template with the sorted entries in
    return render_template('show_entries.html', entries=entries)
Now, I don't know Flask at all, but I think this is the gist of what you want to do.
You may want to go through some Python tutorials (before tackling Flask), since there are a few basic concepts that, once you grasp, I think will make everything else much easier.
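The sorting can also be pushed into the query itself, since the line from the question orders by id. Here is a minimal sketch, assuming a table shaped like the tutorial's `entries` (columns `title` and `text`). The in-memory database and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the tutorial database (sample rows are invented).
db = sqlite3.connect(":memory:")
db.execute(
    'create table entries ('
    'id integer primary key autoincrement, title text not null, "text" text not null)'
)
db.executemany(
    "insert into entries (title, text) values (?, ?)",
    [("zebra", "striped"), ("Mango", "fruit"), ("apple", "also fruit")],
)

# Order by title instead of id; 'collate nocase' ignores ASCII letter case,
# so "Mango" sorts between "apple" and "zebra".
cur = db.execute("select title, text from entries order by title collate nocase asc")
entries = [dict(title=row[0], text=row[1]) for row in cur.fetchall()]
titles = [entry["title"] for entry in entries]
print(titles)
```

In the Flask view, only the `order by` clause of the existing query would need to change; the rest of the function can stay as it is.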

Related

Arcpy, select features based on part of a string

So for my example, I have a large shapefile of state parks where some of them are actual parks and others are just trails. However there is no column defining which are trails vs actual parks, and I would like to select those that are trails and remove them. I DO have a column for the name of each feature, that usually contains the word "trail" somewhere in the string. It's not always at the beginning or end however.
I'm only familiar with Python at a basic level and while I could go through manually selecting the ones I want, I was curious to see if it could be automated. I've been using arcpy.Select_analysis and tried using "LIKE" in my where_clause and have seen examples using slicing, but have not been able to get a working solution. I've also tried using the 'is in' function but I'm not sure I'm using it right with the where_clause. I might just not have a good enough grasp of the proper terms to use when asking and searching. Any help is appreciated. I've been using the Python Window in ArcMap 10.3.
Currently I'm at:
arcpy.Select_analysis ("stateparks", "notrails", ''trail' is in \"SITE_NAME\"')
Although using the Select tool is a good choice, the syntax for the SQL expression can be a challenge. Consider using an Update Cursor to tackle this problem.
import arcpy
stateparks = r"C:\path\to\your\shapefile.shp"
notrails = r"C:\path\to\your\shapefile_without_trails.shp"
# Make a copy of your shapefile
arcpy.CopyFeatures_management(stateparks, notrails)
# Check if "trail" exists in the string--delete row if so
with arcpy.da.UpdateCursor(notrails, "SITE_NAME") as cursor:
    for row in cursor:
        # row[0] is the "SITE_NAME" value of the current row;
        # lower() also catches "Trail" and "TRAIL"
        if "trail" in row[0].lower():
            cursor.deleteRow()  # Delete the row if condition is true

How to disable query cache?

First of all, sorry that the question title is not 100% clear.
It is easier to explain with few lines of code:
query = {...}
while True:
    elastic_response = elastic_client.search(elastic_index, body=query, request_cache=False)
    if elastic_response["hits"]["total"] == 0:
        break
    else:
        for doc in elastic_response["hits"]["hits"]:
            print("delete {}".format(doc["_id"]))
            elastic_client.delete(index=elastic_index, doc_type=doc["_type"], id=doc["_id"])
I make a search, then delete all the docs and then do the search again to get the next bunch.
BUT the search query gives me the same docs! This results in a 404 exception on delete. It has to be some kind of cache, but I did not find anything, and "request_cache" doesn't help.
I can probably refactor this code to use batch delete, but I want to understand what is wrong here.
P.S. I'm using the official Python client.
If using a sleep() after the deletes makes the documents go away, then it's not about the cache. It's about the refresh_interval and the near-real-time nature of Elasticsearch.
So, call _refresh after your code leaves the for loop. Also, don't delete document by document, but create a _bulk request where you delete all your documents in batches, depending on how many they are.
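A rough sketch of that advice: build the delete actions from the search hits, send them in one bulk call, then refresh the index before searching again. The helper name `build_delete_actions` and the sample hits are invented for illustration. The calls against a live cluster are only shown as comments, and they assume the official client's `helpers.bulk` and `indices.refresh`.

```python
def build_delete_actions(hits):
    """Turn search hits into delete actions in the shape helpers.bulk expects."""
    actions = []
    for hit in hits:
        action = {"_op_type": "delete", "_index": hit["_index"], "_id": hit["_id"]}
        # Older clusters with mapping types also want the type; copy it if present.
        if "_type" in hit:
            action["_type"] = hit["_type"]
        actions.append(action)
    return actions


# Invented sample hits, shaped like elastic_response["hits"]["hits"].
sample_hits = [
    {"_index": "logs", "_type": "doc", "_id": "1"},
    {"_index": "logs", "_type": "doc", "_id": "2"},
]
actions = build_delete_actions(sample_hits)
print(actions)

# With a live cluster (not run here), the body of the while loop would become roughly:
#   from elasticsearch import helpers
#   helpers.bulk(elastic_client, actions)
#   elastic_client.indices.refresh(index=elastic_index)
```

The refresh makes the deletes visible to the next search, so the loop should stop returning documents that are already gone.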

Compare two files and make a list

I have two files that I want to compare with each other to build a list. Each file has its own class: Book and Person. Each class has different attributes. The ones I want to compare are: person.personalcode == book.borrowed. From this I want a list of all the borrowed books. I have started like this:
for person in person_list:
    for book in booklibrary_list:
        if person.personalcode == book.borrowed:
            person.books.append(book, person)
for person in person_list:
    if len(person.books) > 0:
        print(person.personalcode + "," + person.firstname + person.lastname + "have borrowed the following books: ")
        for book in person.books:
            print(book)
for person in person_list:
    person.books = []
But it does not work, what have I missed or done wrong?
Posting as an answer as this is too long for a comment.
First: improve your question. Show how you construct the Person and the Book class, and how you populate them. Describe what the personalcode is and how come personalcode would be the same as a book code. Some sample data and a bit more code would make this easier to answer.
Second: reading your other question, you seem to be storing your data in a text file, loading and querying, modifying and saving the data directly. This will lead you to problems and instead you should consider going down one of two lines:
Use an SQL database, possibly the easiest to start with is SQLite as it does not need a server to be set up and there is a module in the standard library that is very easy to use. Store your data there and you will find it easier in the long run.
Use Python objects (e.g. three classes: Person, Book, and BorrowedBook), manage lists of them within the program, and use shelve from the standard library to store and retrieve these lists of objects between queries.
The use of shelve would be easier if you have not used SQL before, and I hope you will forgive the pun when I say that it might be very appropriate for a book-related application!
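For comparison, the SQLite option could look roughly like this. It is only a sketch: the table layout, column names and sample data are assumptions based on the attribute names in the question (`personalcode`, `borrowed`), not the asker's actual classes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path instead to keep the data
conn.execute(
    "create table person (personalcode text primary key, firstname text, lastname text)"
)
# borrowed holds the borrower's personalcode, or NULL if the book is on the shelf
conn.execute("create table book (title text, borrowed text)")

# Invented sample data.
conn.executemany(
    "insert into person values (?, ?, ?)",
    [("8001", "Ada", "Lovelace"), ("8002", "Alan", "Turing")],
)
conn.executemany(
    "insert into book values (?, ?)",
    [("Dune", "8001"), ("Emma", None), ("Ulysses", "8001")],
)

# The join does the person.personalcode == book.borrowed comparison.
rows = conn.execute(
    "select p.personalcode, b.title "
    "from person p join book b on b.borrowed = p.personalcode "
    "order by p.personalcode, b.title"
).fetchall()

borrowed = {}
for personalcode, title in rows:
    borrowed.setdefault(personalcode, []).append(title)
print(borrowed)
```

People without borrowed books simply do not show up in the join, so there is no need for a separate length check.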

How do you improve search?

I just got haystack with solr installed and created a custom view:
from haystack.query import SearchQuerySet
def post_search(request, template_name='search/search.html'):
    getdata = request.GET.copy()
    try:
        results = SearchQuerySet().filter(title=getdata['search'])[:10]
    except:
        results = None
    return render_to_response(template_name, locals(), context_instance=RequestContext(request))
This view only returns exact matches on the title field. How do I do at least things like the sql LIKE '%string%' (or at least i think it's this) where if I search 'i' or 'IN' or 'index' I will get the result 'index'?
Also are most of the ways you search edited using haystack or solr?
What other good practices/search improvements do you suggest (please give implementation too)?
Thanks a bunch in advance!
When you use Haystack/Solr, the idea is that you have to tell Haystack/Solr what you want indexed for a particular object. So say you wanted to build a find as you type index for a basic dictionary. If you wanted it to just match prefixes, for the word Boston, you'd need to tell it to index B, Bo, Bos, etc. and then you'd issue a query for whatever the current search expression was and you could return the results. If you wanted to search any part of the word, you'd need to build suffix trees and then Solr would take care of indexing them.
Look at templates in Haystack for more info. http://docs.haystacksearch.org/dev/best_practices.html#well-constructed-templates
The question you're asking is fairly generic, it might help to give specifics about what people are searching for. Then it'll be easier to suggest how to index the data. Good luck.
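To make the prefix idea concrete (B, Bo, Bos, ...), here is a small pure-Python sketch of what such an index boils down to. It is not how Solr stores anything internally, and the function names and sample words are made up. As far as I know, Haystack's `EdgeNgramField` is the field type intended for this kind of prefix matching.

```python
def prefixes(word, min_len=1):
    """Return every leading slice of the word, lowercased: b, bo, bos, ..."""
    word = word.lower()
    return [word[:end] for end in range(min_len, len(word) + 1)]


def build_prefix_index(words):
    """Map each prefix to the set of words that start with it."""
    index = {}
    for word in words:
        for prefix in prefixes(word):
            index.setdefault(prefix, set()).add(word)
    return index


index = build_prefix_index(["Boston", "Bolton", "index"])

# A find-as-you-type lookup is then a single dictionary access.
matches = sorted(index.get("bo", set()))
print(matches)
```

Matching in the middle of a word (the SQL LIKE '%string%' case) would need every substring indexed, not only the leading ones, which is why the index grows quickly.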

How to implement full text search in Django?

I would like to implement a search function in a django blogging application. The status quo is that I have a list of strings supplied by the user and the queryset is narrowed down by each string to include only those objects that match the string.
See:
if request.method == "POST":
    form = SearchForm(request.POST)
    if form.is_valid():
        posts = Post.objects.all()
        for string in form.cleaned_data['query'].split():
            posts = posts.filter(
                Q(title__icontains=string) |
                Q(text__icontains=string) |
                Q(tags__name__exact=string)
            )
        return archive_index(request, queryset=posts, date_field='date')
Now, what if I didn't want to combine the searched words with a logical AND but with a logical OR? How would I do that? Is there a way to do that with Django's own Queryset methods or does one have to fall back to raw SQL queries?
In general, is it a proper solution to do full text search like this or would you recommend using a search engine like Solr, Whoosh or Xapian. What are their benefits?
I suggest you adopt a search engine.
We've used Haystack search, a modular search application for django supporting many search engines (Solr, Xapian, Whoosh, etc...)
Advantages:
Faster
perform search queries even without querying the database.
Highlight searched terms
"More like this" functionality
Spelling suggestions
Better ranking
etc...
Disadvantages:
Search Indexes can grow in size pretty fast
One of the best search engines (Solr) runs as a Java servlet (Xapian does not)
We're pretty happy with this solution and it's pretty easy to implement.
Actually, the query you have posted does use OR rather than AND - you're using | to separate the Q objects. AND would be &.
In general, I would highly recommend using a proper search engine. We have had good success with Haystack on top of Solr - Haystack manages all the Solr configuration, and exposes a nice API very similar to Django's own ORM.
Answer to your general question: Definitely use a proper application for this.
With your query, you always examine the whole content of the fields (title, text, tags). You gain no benefit from indexes, etc.
With a proper full text search engine (or whatever you call it), text (words) is (are) indexed every time you insert new records. So queries will be a lot faster especially when your database grows.
SOLR is very easy to setup and integrate with Django. Haystack makes it even simpler.
For full text search in Python, look at PyLucene. It allows for very complex queries. The main problem here is that you must find a way to tell your search engine which pages changed and update the index eventually.
Alternatively, you can use Google Sitemaps to tell Google to index your site faster and then embed a custom query field in your site. The advantage here is that you just need to tell Google the changed pages and Google will do all the hard work (indexing, parsing the queries, etc). On top of that, most people are used to searching with Google, and it will keep your site current in the global Google searches, too.
I think full text search on an application level is more a matter of what you have and how you expect it to scale. If you run a small site with low usage, I think it might be more affordable to put some time into making a custom full text search rather than installing an application to perform the search for you. An application would create more dependencies, more maintenance and extra effort when storing data. By writing the search yourself, you can build in nice custom features. For example, if the search text exactly matches one title, you can direct the user to that page instead of showing the results. Another would be to allow title: or author: prefixes on keywords.
Here is a method I've used for generating relevant search results from a web query.
import shlex
# Post is the blog's Django model; import it from your own app
class WeightedGroup:
    def __init__(self):
        # using a dictionary will make the results not paginate
        # but it will be a lot faster when storing data
        self.data = {}
    def list(self, max_len=0):
        # returns a sorted list of the items with heaviest weight first
        # (note: this empties self.data as it goes)
        res = []
        while len(self.data) != 0:
            nominated_weight = 0
            for item, weight in self.data.items():
                if weight > nominated_weight:
                    nominated = item
                    nominated_weight = weight
            self.data.pop(nominated)
            res.append(nominated)
            if len(res) == max_len:
                return res
        return res
    def append(self, weight, item):
        if item in self.data:
            self.data[item] += weight
        else:
            self.data[item] = weight
def search(searchtext):
    candidates = WeightedGroup()
    for arg in shlex.split(searchtext):  # shlex understands quotes
        # Search TITLE
        # order by date so we get most recent posts
        query = Post.objects.filter(title__icontains=arg).order_by('-date')
        arg_hits = query.count()  # count is cheap
        if arg_hits > 1000:
            continue  # skip keywords which have too many hits
        # Each of these is expensive as it would transfer data
        # from the db and build a python object,
        for post in query[:50]:  # so we limit it to 50 for example
            # the more hits a keyword has, the less relevant it is
            candidates.append(100.0 / arg_hits, post.post_id)
        # TODO add searches for other areas
        # Weight might also be adjusted with number of hits within the text
        # or perhaps you can find other metrics to value a post higher,
        # like number of views
    # candidates can contain a lot of stuff now, show most relevant only
    sorted_result = Post.objects.filter(post_id__in=candidates.list(20))
    return sorted_result
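The weighting step can also be written with the built-in `sorted`, which avoids the repeated scans over the dictionary and leaves the data intact. This is a sketch of the same idea rather than a drop-in replacement; the function name `rank_by_weight` and the sample weights are invented.

```python
def rank_by_weight(weights, max_len=None):
    """Return the keys of a {item: weight} dict, heaviest first."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked if max_len is None else ranked[:max_len]


# Accumulate weights per post id, the way WeightedGroup.append does.
weights = {}
for weight, post_id in [(50.0, 7), (10.0, 3), (25.0, 7), (60.0, 9)]:
    weights[post_id] = weights.get(post_id, 0) + weight

top = rank_by_weight(weights, max_len=2)
print(top)
```

Post 7 collects two hits and ends up ahead of post 9, which is the effect the accumulated weights are meant to have.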
