Find page for a specific item in paginate() SQLAlchemy - python

I am using Flask-SQLAlchemy's paginate(). Now I need to find which page a specific comment id is on.
For example, this works if all comments are on the same page:
new_dict['url'] = '/comments#comment_' + str(comment.id)
However, in my case I need this structure:
/comments?page=1#comment_73
How can I find the page number?

From the docs, the Pagination class has .items and .has_next properties and a .next method we can use:
page_number = 0
search = Comment.query.get(15)
query = Comment.query.filter(Comment.id < 40)
for num in range(1, query.paginate(1).pages + 1):
    if search in query.paginate(num).items:
        page_number = num
        break
or
page_number = 0
search = Comment.query.get(15)
pag = Comment.query.filter(Comment.id < 40).paginate(1)
while pag.has_next:
    if search in pag.items:
        page_number = num
        break
    pag.next()

As far as I know, Celeo's answer won't work. According to the documentation, this is what pag.next() does in his code:
Returns a Pagination object for the next page.
So it does nothing unless you assign the result back to your variable. I also recommend not creating a new query, since you already have the comment_id:
comment_id = request.args.get('comment_id')
if comment_id and comment_id.isdigit():
    comment_id = int(comment_id)
    page_number = -1
    index = 1  # page numbers are 1-indexed in the Pagination object
    while True:
        for comment in comments_pagination_object.items:
            if comment.id == comment_id:
                page_number = index
                break
        # stop once the comment is found or the last page has been checked
        if page_number != -1 or not comments_pagination_object.has_next:
            break
        index += 1
        comments_pagination_object = comments_pagination_object.next()
Then, in the URL, you will have something like:
/comments?comment_id=2
Assigning the result of comments_pagination_object.next() moves the Pagination object to the next page until one of its items (each an instance of Comment in this case) has the same id as the one in your request args.
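If the comments are paginated in a stable order, the page can also be computed without walking through the pages: count how many comments come before the target in that order, then divide by the page size. This is a different technique from the loops above, shown as a minimal sketch. It assumes ascending order by id, and page_for_position and per_page=20 are illustrative, not part of Flask-SQLAlchemy:

```python
def page_for_position(position, per_page):
    """Return the 1-indexed page holding the item at 0-indexed `position`."""
    if position < 0 or per_page <= 0:
        raise ValueError("position must be >= 0 and per_page must be > 0")
    return position // per_page + 1


# Hypothetical usage with the Comment model from the question, assuming the
# paginated query is ordered by Comment.id ascending:
#   position = Comment.query.filter(Comment.id < comment_id).count()
#   page = page_for_position(position, per_page=20)
#   url = '/comments?page=%d#comment_%d' % (page, comment_id)
```

The count has to use the same filters and ordering as the paginated query, otherwise the computed page will not match what paginate() returns.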

Related

BeautifulSoup checking if an element has a specific class

for containerElement in container:
    brandingElement = containerElement.find("div", class_="item-branding")
    titleElement = containerElement.find("a", class_="item-title")
    rating = brandingElement.find("i", {"class": "rating"})["aria-label"]
    priceElement = containerElement.find("li", class_="price-current")
This for loop collects the price, rating, and name of each item on a website, and it works. However, some items have no reviews, in which case it fails. How do I fix this? I was thinking of an if statement to check whether the containerElement (the container holding the item and all its information) has a rating, but I'm not exactly sure how to do that.
for containerElement in container:
    brandingElement = containerElement.find("div", class_="item-branding")
    titleElement = containerElement.find("a", class_="item-title")
    # find() returns None when the item has no rating, so look it up once and fall back to ""
    ratingElement = brandingElement.find("i", {"class": "rating"})
    rating = ratingElement["aria-label"] if ratingElement else ""
    priceElement = containerElement.find("li", class_="price-current")
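The same guard can be pulled into a small helper so that every optional attribute lookup reads the same way. This is only a sketch: attr_or_default is a hypothetical helper name, and it relies on find() returning None when nothing matches:

```python
def attr_or_default(tag, attr, default=""):
    """Return tag[attr], or `default` if the tag is missing or lacks the attribute."""
    if tag is None:
        return default
    return tag.get(attr, default)


# Hypothetical usage inside the loop from the question:
#   rating = attr_or_default(brandingElement.find("i", {"class": "rating"}), "aria-label")
```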

Displaying a List of Data from in Flask REST API Python

In my REST API I have the following code
i = 0
for item in similar_items:
    name = main.get_name_from_index(item[0])
    url = main.get_url_from_index(item[0])
    category = main.get_categories_from_index(item[0])
    if (name != None):
        return {'Name': name, 'Category': category, 'URL': url}, 200  # return data and 200 OK code
    i = i + 1
    if i > 20:
        break
This is meant to iterate through similar_items and return the top 20; however, it currently only sends the JSON object for the first one. I believe the problem is with the return statement, but no matter where I place it I run into the same problem.
I would really appreciate it if anyone could share how I can return the desired number of objects instead of only the first one.
Your code above is returning a dictionary containing a single item, where it seems like it should be returning a list of such dictionaries. Try something like this:
i = 0
results = []  # prepare an empty results list
for item in similar_items:
    name = main.get_name_from_index(item[0])
    url = main.get_url_from_index(item[0])
    category = main.get_categories_from_index(item[0])
    if name is not None:
        results.append({'Name': name, 'Category': category, 'URL': url})  # add the current item into the results list
    i = i + 1
    if i > 20:  # NB if you want 20 items this should be >= not just > :-)
        break
return results, 200  # return data and 200 OK code, also when there are fewer items than the limit
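The collect-then-return pattern can also be sketched as a standalone function. This is an illustration under assumptions: collect_results is a hypothetical name, and the three lookup callables stand in for main.get_name_from_index, main.get_url_from_index and main.get_categories_from_index:

```python
def collect_results(similar_items, get_name, get_url, get_category, limit=20):
    """Build up to `limit` result dicts, skipping items whose name is None."""
    results = []
    for item in similar_items:
        name = get_name(item[0])
        if name is None:
            continue  # skip entries without a name
        results.append({
            'Name': name,
            'Category': get_category(item[0]),
            'URL': get_url(item[0]),
        })
        if len(results) >= limit:
            break  # stop as soon as enough items have been collected
    return results


# Hypothetical usage inside the Flask view:
#   return collect_results(similar_items, main.get_name_from_index,
#                          main.get_url_from_index,
#                          main.get_categories_from_index), 200
```

This version counts collected results rather than loop iterations, so items that are skipped for having no name do not use up the limit.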

Python: not every web page have a certain element

When I tried to use URLs to scrape web pages, I found that some elements exist only on some pages and not on others. Take this code for example:
for urls in article_url_set:
    re=requests.get(urls)
    soup=BeautifulSoup(re.text.encode('utf-8'), "html.parser")
    title_tag = soup.select_one('.page_article_title')
    if title_tag=True:
        print(title_tag.text)
    else:
        #do something
If title_tag exists, I want to print it; if not, just skip it.
Also, I need to save title_tag.text and other elements in data:
data = {
    "Title": title_tag.text,
    "Registration": fruit_tag.text,
    "Keywords": list2
}
This raises an error because not every article has a title: 'NoneType' object has no attribute 'text'. What should I do to skip those when I save?
Edit: I decided not to skip them and to keep them as Null or None instead.
Your code is wrong:
for urls in article_url_set:
    re=requests.get(urls)
    soup=BeautifulSoup(re.text.encode('utf-8'), "html.parser")
    title_tag = soup.select_one('.page_article_title')
    if title_tag=True: # wrong
        print(title_tag.text)
    else:
        #do something
Your code has if title_tag=True. A single = is assignment, not comparison, so this is a syntax error; the comparison operator is ==.
Some people recommend writing the constant first in comparisons:
title_tag == True => True == title_tag
That way, mistyping = for == fails immediately, because True = title_tag raises an error.
Note, however, that comparing a BeautifulSoup tag to True with == does not test whether the tag exists; a plain truth test does.
You can simply use a truth test to check whether the tag exists, and otherwise assign a value like None; then you can insert it into the data container:
title_tag = soup.select_one('.page_article_title')
if title_tag:
    print(title_tag.text)
    title = title_tag.text
else:
    title = None
Or in one line:
title = title_tag.text if title_tag else None
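When several optional tags feed the same record, the one-line form can be wrapped in a helper. A small sketch, where text_or_none is a hypothetical helper and fruit_tag and list2 are the names from the question:

```python
def text_or_none(tag):
    """Return tag.text, or None when select_one() found nothing."""
    return tag.text if tag is not None else None


# Hypothetical usage when building the record from the question:
#   data = {
#       "Title": text_or_none(title_tag),
#       "Registration": text_or_none(fruit_tag),
#       "Keywords": list2,
#   }
```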

Tweepy api.followers and count limits

I'm looking to retrieve the followers of some accounts with more than 5000 followers.
I've seen in another topic that it's easier to proceed by page than by items per page. (link: tweepy count limited to 200?)
Once it has read all the ids, it must check whether each profile description contains an element of a list I created earlier.
Here's my previous code (note: without using api.followers):
for element in liste2:
    print element
    resultats = api.search_users(q=element, count=5000)
    for user in resultats:
        print user.id
        i = 0
        user = api.get_user(user.id)
        print user.name
        while (i != 4):
            if (user.description.find(liste1[i])!= 1):
                print user.name + " valide"
                i = 4
                statuses = api.user_timeline(id = user.id, count = 200)
The counter doesn't work, which is why I want to switch to api.followers, which seems nicer.
Thanks for reading
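For the description check in the loop above, note that str.find returns -1 (not 1) when the substring is absent, so the in operator is usually simpler. Below is a minimal sketch of that check alone. description_matches is a hypothetical helper, the case-insensitive comparison is a design choice rather than something the question asks for, and nothing here calls the Twitter API:

```python
def description_matches(description, keywords):
    """Return True if any keyword occurs in the description (case-insensitive)."""
    text = (description or "").lower()  # a missing description is treated as empty
    return any(keyword.lower() in text for keyword in keywords)


# Hypothetical usage with the names from the question:
#   if description_matches(user.description, liste1):
#       print(user.name + " valide")
```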

counting/filtering database-entries over multiple foreign key-relations

These are my DB-Models:
class Category(models.Model):
    name = models.CharField(max_length = 20, unique = True)
    ...
class Feed(models.Model):
    title = models.CharField(max_length = 100)
    category = models.ForeignKey(Category)
    ...
class Article(models.Model):
    title = models.CharField(max_length = 100)
    read = models.BooleanField(default = False)
    feed = models.ForeignKey(Feed)
    ...
Every Article belongs to one Feed (source) and each Feed is in a Category.
Now I want to create a view that displays all categories with some meta-information,
e.g. how many unread articles are in category x.
I tried things like this, but nothing worked:
categories = Category.objects.filter(feed__article__read=False)\
                             .annotate(Count('feed__article'))
What is the proper way to extract this information?
Especially if I want to add further information, such as the number of feeds in a category and
the number of favored articles, in one QuerySet (if possible).
Any ideas?
Thanks.
EDIT: Since I had no idea how to solve this problem, I've written an ugly workaround:
result = categories.values_list('name',
                                'feed__title',
                                'feed__article__title',
                                'feed__article__read')
for i in range(0, len(result)):
    #if pointer changed to a new category
    #dump current dict to list and clear dict for the new values
    if last != result[i][0]:
        category_list.append(category_info.copy())
        category_info.clear()
        last = result[i][0]
    if some values None:
        insert values
    elif some other values None:
        insert values
    else:
        category_info['name'] = result[i][0]
        category_info['feed_count'] = category_info.get('feed_count', 0) + 1
        category_info['all_article_count'] = category_info.get('all_article_count', 0) + 1
        #if a article has not been read yet
        if result[i][3] == False:
            category_info['unread_article_count'] = category_info.get('unread_article_count', 0) + 1
    #if this category is the last in the result-list
    if i+1 == len(result):
        category_list.append(category_info.copy())
    i += 1
I am pretty sure there is a quicker and nicer way to get this information, but at least I can work with it for the moment :/
You must name the annotation. With the query below, you should be able to use category.article_count on each item in the queryset.
categories = Category.objects.filter(feed__article__read=False)\
                             .annotate(article_count=Count('feed__article'))
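If the per-category counts are computed in Python from values_list rows, as in the question's workaround, grouping into dictionaries avoids having to track the previous row. This is a pure-Python sketch, not an ORM query. summarize_rows is a hypothetical name, each row is assumed to be (category name, feed title, article title, read flag), and feeds are told apart by title, which assumes titles are unique within a category:

```python
from collections import defaultdict


def summarize_rows(rows):
    """Aggregate (category, feed_title, article_title, read) rows per category."""
    feeds = defaultdict(set)
    totals = defaultdict(lambda: {'all_article_count': 0, 'unread_article_count': 0})
    for category, feed_title, article_title, read in rows:
        feed_titles = feeds[category]  # registers the category even if it has no feeds
        if feed_title is not None:
            feed_titles.add(feed_title)
        if article_title is None:
            continue  # row without an article: nothing more to count
        totals[category]['all_article_count'] += 1
        if not read:
            totals[category]['unread_article_count'] += 1
    return [
        {'name': category, 'feed_count': len(feeds[category]), **totals[category]}
        for category in feeds
    ]
```

Categories come out in the order they first appear in the rows, so the input does not need to be sorted by category.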
