Python Boto3 Paginating Error: 'PageIterator' object is not subscriptable

I'm trying to get a complete list of instances available in a region. The code will iterate over a number of pages but stops short with an error:
Traceback (most recent call last):
File "list_available_instance_offerings.py", line 29, in <module>
marker = page_iterator['Marker']
TypeError: 'PageIterator' object is not subscriptable
How can I iterate over all of the pages without erroring prematurely?
Here's my script:
import sys
import boto3
ec2 = boto3.client("ec2")
marker = None
while True:
    paginator = ec2.get_paginator('describe_instance_type_offerings')
    page_iterator = paginator.paginate(
        LocationType='availability-zone',Filters=[{'Name': 'location', 'Values':['us-east-1a']}],
        PaginationConfig={
            'PageSize': 50,
            'StartingToken': marker})
    for page in page_iterator:
        offerings = page['InstanceTypeOfferings']
        for offer in offerings:
            print(offer['InstanceType'])
    try:
        marker = page_iterator['Marker']
    except KeyError:
        sys.exit()

There is no such property as Marker, and a PageIterator cannot be indexed with [] at all, which is what the TypeError is telling you. I believe that you are after NextToken from page. In this case, it should be:
    try:
        marker = page['NextToken']
    except KeyError:
        sys.exit()

When using a boto3 paginator, you don't need to worry about the Marker. The purpose of the paginator is to manage that for you.
import boto3
client = boto3.client('route53')
paginator = client.get_paginator('list_health_checks')
response_iterator = paginator.paginate(
    PaginationConfig={
        'PageSize': 10
    }
)
for page in response_iterator:
    for healthcheck in page['HealthChecks']:
        print(healthcheck["Id"])
If you have 87 health checks, this will list 8 pages of 10 and 1 page of 7.
You can also set MaxItems in PaginationConfig if you want to limit the output. For example, with MaxItems=80 you will get 8 pages of 10.
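Applied to the script in the question, the same idea might look like the sketch below. The function name list_instance_types and its parameters are my own (not part of boto3); the client is passed in so the loop can be exercised without AWS access. It assumes only the paginator calls already shown in this thread.

```python
def list_instance_types(ec2_client, zone, page_size=50):
    """Return every instance type offered in one availability zone."""
    paginator = ec2_client.get_paginator('describe_instance_type_offerings')
    page_iterator = paginator.paginate(
        LocationType='availability-zone',
        Filters=[{'Name': 'location', 'Values': [zone]}],
        PaginationConfig={'PageSize': page_size},
    )
    instance_types = []
    # The iterator requests the next page itself until none are left,
    # so there is no marker or token to track and no outer while loop.
    for page in page_iterator:
        for offer in page['InstanceTypeOfferings']:
            instance_types.append(offer['InstanceType'])
    return instance_types

# Usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   for name in list_instance_types(boto3.client("ec2"), "us-east-1a"):
#       print(name)
```

The for loop simply ends after the last page, so no sys.exit() is needed either.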

Related

Python KeyError after returning the value (also error_id 502)

I'm using Stack Overflow's API and want to use 'quota_remaining' so I can perform pagination.
But when I try to print 'quota_remaining' I get a KeyError after the print. So it prints the value, but I am not able to store it in a variable because it throws a KeyError afterwards.
This is my code:
# Get data
users_url = 'https://api.stackexchange.com/2.3/users?page=1&pagesize=100&order=desc&sort=modified&site=stackoverflow&filter=!56ApJn82ELRG*IWQxo6.gXu9qS90qXxNmY8e9b'
# Make the API call
response = requests.get(users_url)
result = response.json()
print(result)
print(result['quota_remaining']) # line 33
quota_remaining = result['quota_remaining']
And this is what is returned (I included a sample of the print(result)):
{'items': [{'badge_counts': {'bronze': 6, 'silver': 0, 'gold': 0}, 'view_count': 21, 'answer_count': 2, 'question_count': 14, 'reputation_change_quarter': 0, 'reputation': 75, 'user_id': 2498916, 'link': 'https://stackoverflow.com/users/2498916/oscar-salas', 'display_name': 'Oscar Salas'}], 'has_more': True, 'backoff': 10, 'quota_max': 300, 'quota_remaining': 261, 'page': 1, 'page_size': 100}
261
1
{'error_id': 502, 'error_message': 'Violation of backoff parameter', 'error_name': 'throttle_violation'}
Traceback (most recent call last):
File "test.py", line 33, in <module>
print(result['quota_remaining'])
KeyError: 'quota_remaining'
I also don't understand why I am getting the error 502, what am I violating? What is the backoff parameter?
Change variable name, try this:
users_url = 'https://api.stackexchange.com/2.3/users?page=1&pagesize=100&order=desc&sort=modified&site=stackoverflow&filter=!56ApJn82ELRG*IWQxo6.gXu9qS90qXxNmY8e9b'
# Make the API call
response = requests.get(users_url)
result = response.json()
print(result)
print(result['quota_remaining']) # line 33
quota_remaining1 = result['quota_remaining']
Okay, I've figured it out. I was making too many requests, as @SimonT pointed out.
The backoff parameter meant I had to wait that many seconds before hitting the same method again. In my case I had backoff = 10, so I set up a time.sleep(10) between requests.
This is actually my full code (I had only a sample in the question, as I did not understand that the KeyError was actually caused by the throttle violation, a rookie mistake):
while quota > 240:
    # Get data
    users_url = f'https://api.stackexchange.com/2.3/users?page={page}&pagesize=100&order=desc&sort=modified&site=stackoverflow&filter=!56ApJn82ELRG*IWQxo6.gXu9qS90qXxNmY8e9b'
    # Make the API call
    response = requests.get(users_url)
    result = response.json()
    print(result['quota_remaining'])
    print(result['page'])
    # Update counters
    quota = result['quota_remaining']
    page = result['page']
    # Save users in a list
    for user in result["items"]:
        user['_id'] = user.pop('user_id')
        results.append(user)
    if result['has_more'] == True:
        time.sleep(10)
        page += 1
    else:
        break
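Rather than hard-coding 10 seconds, the loop could read the backoff value from each response and stop cleanly when the API returns an error wrapper. This is only a sketch: collect_users and its fetch_page argument are hypothetical names of mine, where fetch_page stands in for the requests.get(...).json() call above. It relies only on the fields visible in the output in the question.

```python
import time

def collect_users(fetch_page, min_quota=240, sleep=time.sleep):
    """Collect items page by page, pausing for whatever backoff the API asks.

    fetch_page is a hypothetical callable: page number -> decoded JSON dict.
    """
    results = []
    page = 1
    while True:
        result = fetch_page(page)
        # A throttled call returns an error wrapper with no 'quota_remaining',
        # which is what caused the KeyError in the question.
        if 'error_id' in result:
            raise RuntimeError(
                f"{result.get('error_name')}: {result.get('error_message')}")
        results.extend(result['items'])
        if not result.get('has_more') or result['quota_remaining'] <= min_quota:
            break
        # 'backoff' is only present when a pause is requested.
        sleep(result.get('backoff', 0))
        page += 1
    return results
```

With requests, fetch_page could be something like lambda p: requests.get(url_for(p)).json(), where url_for is whatever builds the URL for page p.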

Getting KeyError: 'viewCount' for using Youtube API in Python

I'm trying to get the view count for a list of videos from a channel. I've written a function, and when I run it with just 'video_id', 'title' and 'published date' I get the output. However, when I ask for the view count or anything else from the statistics part of the API, it raises a KeyError.
Here's the code:
def get_video_details(youtube, video_ids):
    all_video_stats = []
    for i in range(0, len(video_ids), 50):
        request = youtube.videos().list(
            part='snippet,statistics',
            id = ','.join(video_ids[i:i+50]))
        response = request.execute()
        for video in response['items']:
            video_stats = dict(
                Video_id = video['id'],
                Title = video['snippet']['title'],
                Published_date = video['snippet']['publishedAt'],
                Views = video['statistics']['viewCount'])

            all_video_stats.append(video_stats)
    return all_video_stats
get_video_details(youtube, video_ids)
And this is the error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18748/3337790216.py in <module>
----> 1 get_video_details(youtube, video_ids)
~\AppData\Local\Temp/ipykernel_18748/1715852978.py in get_video_details(youtube, video_ids)
14 Title = video['snippet']['title'],
15 Published_date = video['snippet']['publishedAt'],
---> 16 Views = video['statistics']['viewCount'])
17
18 all_video_stats.append(video_stats)
KeyError: 'viewCount'
I was referencing this Youtube video to write my code.
Thanks in advance.
I got it.
I had to use .get() to avoid the KeyError. It returns None when the key is missing instead of raising.
I replaced the Views line with this to get the solution:
Views = video['statistics'].get('viewCount')
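To see the difference on a small scale, here is a sketch with hand-made dictionaries shaped like the items in the question. The helper name video_row is mine, and the sample values are invented for illustration.

```python
def video_row(video):
    """Flatten one item of a videos().list() response into a plain dict."""
    # Fall back to an empty dict in case 'statistics' itself is absent.
    stats = video.get('statistics', {})
    return {
        'Video_id': video['id'],
        'Title': video['snippet']['title'],
        'Published_date': video['snippet']['publishedAt'],
        # .get() yields None when the field is missing.
        'Views': stats.get('viewCount'),
    }

# Invented sample items: one with a view count, one without.
with_views = {'id': 'a1', 'snippet': {'title': 'T1', 'publishedAt': '2021-01-01'},
              'statistics': {'viewCount': '42'}}
without_views = {'id': 'b2', 'snippet': {'title': 'T2', 'publishedAt': '2021-01-02'},
                 'statistics': {}}
print(video_row(with_views)['Views'])
print(video_row(without_views)['Views'])
```

Rows with Views set to None can then be filtered out or filled in later, instead of one missing field aborting the whole loop.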

Python code in Zapier (invalid syntax (usercode.py, line 42))

This code is pre-made in a Zapier forum to pull failed responses from another piece of software called iAuditor. When I plug in the code and update the API token and webhook URL this error pops up:
Traceback (most recent call last):
SyntaxError: invalid syntax (usercode.py, line 42)
Here is the code:
[code]
import json
import requests
auth_header = {'Authorization': 'a4fca847d3f203bd7306ef5d1857ba67a2b3d66aa455e06fac0ad0be87b9d226'}
webhook_url = 'https://hooks.zapier.com/hooks/catch/3950922/efka9n/'
api_url = 'https://api.safetyculture.io/audits/'
audit_id = input['audit_id']
audit_doc = requests.get(api_url + audit_id, headers=auth_header).json()
failed_items = []
audit_author = audit_doc['audit_data']['authorship']['author']
conducted_on = audit_doc['audit_data']['date_completed']
conducted_on = conducted_on[:conducted_on.index('T')]
audit_title = audit_doc['template_data']['metadata']['name']
for item in audit_doc['items']:
    if item.get('responses') and item['responses'].get('failed') == True:
        label = item.get('label')
        if label is None:
            label = 'no_label'
        responses = item['responses']
        response_label = responses['selected'][0]['label']
        notes = responses.get('text')
        if notes is None:
            notes = ''
        failed_items.append({'label': label,
                             'response_label': response_label,
                             'conducted_on': conducted_on,
                             'notes': notes,
                             'author': audit_author
                             })
for item in failed_items:
    r = requests.post(webhook_url, data = item)
return response.json()
[/code]
This looks like an error from the platform. It looks like Zapier uses a script called usercode.py to bootstrap launching your script and the error seems to be coming from that part.

Cloud firestore exceptions not caught by except block

import datetime
from threading import Timer
import firebase_admin
from firebase_admin import firestore
import calendar
db = firestore.Client()
col_ref = db.collection(u'tblAssgin').get()
current_serving = [doc.id for doc in col_ref]
#print(current_serving)
def sit_time():
    for i in current_serving:
        try:
            doc_ref = db.collection(u'tblAssgin').document(i)
        except:
            current_serving.remove(doc_ref)
        else:
            doc = doc_ref.get()
            a = doc.get('assginTime')
            assign_time = datetime.datetime.fromtimestamp(calendar.timegm(a.timetuple()))
            now = datetime.datetime.now()
            sitting_time = now - assign_time
            hours,remainder = divmod(sitting_time.seconds, 3600)
            minutes, seconds = divmod(remainder, 60)
            print('minutes:',minutes)
            updates = {u'sitting_time':minutes}
            doc_ref.update(updates)
t = None
def refresh():
    global t
    sit_time()
    t = Timer(60, refresh)
    t.daemon = True
    t.start()
refresh()
Basically, the code above first fetches all the document IDs of the collection 'tblAssgin' and stores them in the 'current_serving' list. It then loops over each document, calculates the sitting time, and runs again every 60 seconds. Now suppose I delete a document: that document will no longer be found. When that happens, I want an exception to be raised and the document ID to be removed from the 'current_serving' list. But the exception is not caught.
Please help
Thanks in advance..!!
You're assuming CollectionReference.document() will throw an exception if the document does not exist. It doesn't.
>>> client.collection('non-existing').document('also-non-existing')
<google.cloud.firestore_v1beta1.document.DocumentReference object at 0x10feac208>
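But DocumentReference.get() will throw an exception if the document doesn't exist (at least in the firestore_v1beta1 client shown here), so that is the call that belongs inside the try block.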
>>> client.collection('non-existing').document('also-non-existing').get()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "~/gae/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/document.py", line 432, in get
raise exceptions.NotFound(self._document_path)
google.api_core.exceptions.NotFound: 404 ~/databases/(default)/documents/non-existing/also-non-existing
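Putting that together, the lookup that can fail goes inside the try, and the surviving IDs are collected into a new list instead of calling remove() on the list being iterated (which can skip elements, and in the question would also be passed doc_ref rather than the ID). The sketch below is deliberately generic: still_existing, get_doc and not_found are names I made up. In the question, get_doc would be along the lines of lambda i: db.collection(u'tblAssgin').document(i).get() and not_found would be google.api_core.exceptions.NotFound, as seen in the traceback above.

```python
def still_existing(doc_ids, get_doc, not_found):
    """Return the ids whose document can still be fetched.

    get_doc and not_found are stand-ins (assumptions) for the Firestore
    lookup and the exception class it raises for a missing document.
    """
    kept = []
    for doc_id in doc_ids:
        try:
            get_doc(doc_id)      # the fetch itself is what may raise
        except not_found:
            continue             # document was deleted: drop its id
        kept.append(doc_id)
    return kept

# Demonstration with a plain dict standing in for the collection.
store = {'a': {'assginTime': 1}, 'c': {'assginTime': 3}}
current_serving = still_existing(['a', 'b', 'c'], store.__getitem__, KeyError)
print(current_serving)
```

Catching one specific exception class, rather than a bare except, also keeps unrelated errors visible.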

TypeError: 'NoneType' object is not iterable: Webcrawler to scrape email addresses

I am trying to get the program below working. It is supposed to find email addresses on a website, but it is breaking. I suspect the problem is with initializing result = [] inside the crawl function. Below is the code:
# -*- coding: utf-8 -*-
import requests
import re
import urlparse
# In this example we're trying to collect e-mail addresses from a website
# Basic e-mail regexp:
# letter/number/dot/comma @ letter/number/dot/comma . letter/number
email_re = re.compile(r'([\w\.,]+@[\w\.,]+\.\w+)')
# HTML <a> regexp
# Matches href="" attribute
link_re = re.compile(r'href="(.*?)"')
def crawl(url, maxlevel):
    result = []
    # Limit the recursion, we're not downloading the whole Internet
    if(maxlevel == 0):
        return
    # Get the webpage
    req = requests.get(url)
    # Check if successful
    if(req.status_code != 200):
        return []
    # Find and follow all the links
    links = link_re.findall(req.text)
    for link in links:
        # Get an absolute URL for a link
        link = urlparse.urljoin(url, link)
        result += crawl(link, maxlevel - 1)
    # Find all emails on current page
    result += email_re.findall(req.text)
    return result
emails = crawl('http://ccs.neu.edu', 2)
print "Scrapped e-mail addresses:"
for e in emails:
    print e
The error I get is below:
C:\Python27\python.exe "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py"
Traceback (most recent call last):
File "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py", line 41, in <module>
emails = crawl('http://ccs.neu.edu', 2)
File "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py", line 35, in crawl
result += crawl(link, maxlevel - 1)
File "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py", line 35, in crawl
result += crawl(link, maxlevel - 1)
TypeError: 'NoneType' object is not iterable
Process finished with exit code 1
Any suggestions will help. Thanks!
The problem is this:
    if(maxlevel == 0):
        return
Currently it returns None when maxlevel == 0, and result += None raises the TypeError because you can't extend a list with a None object.
You need to return an empty list [] there, so that every code path returns a list.
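The effect is easy to reproduce without any network access. The sketch below is written for Python 3 (unlike the Python 2 script in the question) and swaps the HTTP fetches for two made-up in-memory dicts, links and emails, so only the recursion is on display.

```python
def crawl(page, links, emails, maxlevel):
    """Collect emails reachable from page, following links maxlevel deep.

    links and emails are hypothetical dicts standing in for fetched pages.
    """
    if maxlevel == 0:
        return []      # always a list, so the caller's += keeps working
    result = []
    for target in links.get(page, []):
        result += crawl(target, links, emails, maxlevel - 1)
    result += emails.get(page, [])
    return result

# Made-up site: home links to about, about links to team.
links = {'home': ['about'], 'about': ['team']}
emails = {'home': ['a@example.org'], 'about': ['b@example.org'],
          'team': ['c@example.org']}
found = crawl('home', links, emails, 2)
print(found)
```

With maxlevel set to 2 the 'team' page hits the depth limit, returns [] and contributes nothing, instead of crashing the whole crawl.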
