I'm trying to get a complete list of instances available in a region. The code will iterate over a number of pages but stops short with an error:
Traceback (most recent call last):
File "list_available_instance_offerings.py", line 29, in <module>
marker = page_iterator['Marker']
TypeError: 'PageIterator' object is not subscriptable
How can I iterate over all of the pages without erroring prematurely?
Here's my script:
import sys
import boto3
ec2 = boto3.client("ec2")
marker = None
while True:
    paginator = ec2.get_paginator('describe_instance_type_offerings')
    page_iterator = paginator.paginate(
        LocationType='availability-zone',
        Filters=[{'Name': 'location', 'Values': ['us-east-1a']}],
        PaginationConfig={
            'PageSize': 50,
            'StartingToken': marker})
    for page in page_iterator:
        offerings = page['InstanceTypeOfferings']
        for offer in offerings:
            print(offer['InstanceType'])
    try:
        marker = page_iterator['Marker']
    except KeyError:
        sys.exit()
There is no such property as Marker. I believe that you are after NextToken from the page. In this case, it should be:
try:
    marker = page['NextToken']
except KeyError:
    sys.exit()
When using a boto3 paginator, you don't need to worry about the Marker. The purpose of the paginator is to manage that for you.
client = boto3.client('route53')
paginator = client.get_paginator('list_health_checks')
response_iterator = paginator.paginate(
    PaginationConfig={
        'PageSize': 10
    }
)
for page in response_iterator:
    for healthcheck in page['HealthChecks']:
        print(healthcheck["Id"])
If you have 87 health checks, this will list 8 pages of 10 and 1 page of 7.
You can also use MaxItems if you want to limit the output. For example, with MaxItems=80 you will get 8 pages of 10.
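Applying the same idea to the script in the question, a minimal sketch without any manual Marker handling might look like the following (same availability-zone filter as the original; the paginator follows the service's continuation token internally, and a MaxItems entry could be added to the PaginationConfig to cap the output):

import boto3

ec2 = boto3.client("ec2")
paginator = ec2.get_paginator('describe_instance_type_offerings')

# No while loop and no StartingToken: the paginator requests
# successive pages on its own until the service runs out.
page_iterator = paginator.paginate(
    LocationType='availability-zone',
    Filters=[{'Name': 'location', 'Values': ['us-east-1a']}],
    PaginationConfig={'PageSize': 50})

for page in page_iterator:
    for offer in page['InstanceTypeOfferings']:
        print(offer['InstanceType'])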
Related
I'm using Stack Overflow's API and want to use 'quota_remaining' so I can perform pagination.
But when I try to print 'quota_remaining' I am getting a KeyError after the print. It prints the value, but I am not able to store it in a variable because a KeyError is thrown afterwards.
This is my code:
import requests

# Get data
users_url = 'https://api.stackexchange.com/2.3/users?page=1&pagesize=100&order=desc&sort=modified&site=stackoverflow&filter=!56ApJn82ELRG*IWQxo6.gXu9qS90qXxNmY8e9b'
# Make the API call
response = requests.get(users_url)
result = response.json()
print(result)
print(result['quota_remaining'])  # line 33
quota_remaining = result['quota_remaining']
And this is what is returned (I included a sample of the print(result)):
{'items': [{'badge_counts': {'bronze': 6, 'silver': 0, 'gold': 0}, 'view_count': 21, 'answer_count': 2, 'question_count': 14, 'reputation_change_quarter': 0, 'reputation': 75, 'user_id': 2498916, 'link': 'https://stackoverflow.com/users/2498916/oscar-salas', 'display_name': 'Oscar Salas'}], 'has_more': True, 'backoff': 10, 'quota_max': 300, 'quota_remaining': 261, 'page': 1, 'page_size': 100}
261
1
{'error_id': 502, 'error_message': 'Violation of backoff parameter', 'error_name': 'throttle_violation'}
Traceback (most recent call last):
File "test.py", line 33, in <module>
print(result['quota_remaining'])
KeyError: 'quota_remaining'
I also don't understand why I am getting error 502: what am I violating? And what is the backoff parameter?
Change the variable name; try this:
users_url = 'https://api.stackexchange.com/2.3/users?page=1&pagesize=100&order=desc&sort=modified&site=stackoverflow&filter=!56ApJn82ELRG*IWQxo6.gXu9qS90qXxNmY8e9b'
# Make the API call
response = requests.get(users_url)
result = response.json()
print(result)
print(result['quota_remaining']) # line 33
quota_remaining1 = result['quota_remaining']
Okay, I've figured it out. I was making too many requests, as @SimonT pointed out.
The backoff parameter meant I had to wait that many seconds before hitting the same method again. In my case I had backoff = 10, so I set up a time.sleep(10) between requests.
This is actually my full code (I had only a sample in the question, as I did not understand that the KeyError was actually caused by the throttle violation - a rookie mistake):
import time
import requests

# Assumed setup, not shown in the original post
quota = 300
page = 1
results = []

while quota > 240:
    # Get data
    users_url = f'https://api.stackexchange.com/2.3/users?page={page}&pagesize=100&order=desc&sort=modified&site=stackoverflow&filter=!56ApJn82ELRG*IWQxo6.gXu9qS90qXxNmY8e9b'
    # Make the API call
    response = requests.get(users_url)
    result = response.json()
    print(result['quota_remaining'])
    print(result['page'])
    # Update counters
    quota = result['quota_remaining']
    page = result['page']
    # Save users in a list
    for user in result["items"]:
        user['_id'] = user.pop('user_id')
        results.append(user)
    if result['has_more'] == True:
        time.sleep(10)
        page += 1
    else:
        break
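Rather than hard-coding time.sleep(10), the wait can be driven by the backoff field itself, since the API only includes it when it wants you to slow down. A small sketch of that idea (the fetch_page helper is mine, not from the original code):

import time
import requests

def fetch_page(url):
    result = requests.get(url).json()
    # 'backoff' is only present when the API asks you to wait
    # that many seconds before calling the same method again.
    backoff = result.get('backoff')
    if backoff:
        time.sleep(backoff)
    return result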
I'm trying to get the view count for a list of videos from a channel. I've written a function, and when I run it with just 'video_id', 'title' and 'published date' I get the output. However, when I ask for the view count, or anything else from the statistics part of the API, it raises a KeyError.
Here's the code:
def get_video_details(youtube, video_ids):
    all_video_stats = []
    for i in range(0, len(video_ids), 50):
        request = youtube.videos().list(
            part='snippet,statistics',
            id=','.join(video_ids[i:i+50]))
        response = request.execute()
        for video in response['items']:
            video_stats = dict(
                Video_id=video['id'],
                Title=video['snippet']['title'],
                Published_date=video['snippet']['publishedAt'],
                Views=video['statistics']['viewCount'])
            all_video_stats.append(video_stats)
    return all_video_stats

get_video_details(youtube, video_ids)
And this is the error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18748/3337790216.py in <module>
----> 1 get_video_details(youtube, video_ids)
~\AppData\Local\Temp/ipykernel_18748/1715852978.py in get_video_details(youtube, video_ids)
14 Title = video['snippet']['title'],
15 Published_date = video['snippet']['publishedAt'],
---> 16 Views = video['statistics']['viewCount'])
17
18 all_video_stats.append(video_stats)
KeyError: 'viewCount'
I was referencing this Youtube video to write my code.
Thanks in advance.
I got it.
I had to use .get() to avoid the KeyError; it returns None when the key is missing.
I replaced this line to get the solution:
Views = video['statistics'].get('viewCount')
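For context: this usually happens because the API omits viewCount for videos whose statistics are hidden, so .get() sidesteps exactly those items. If a number is preferable to None, a default can be passed instead (a small variation on the fix above, not part of the original answer):

Views = video['statistics'].get('viewCount', 0)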
This code is a pre-made script from a Zapier forum that pulls failed responses from another piece of software called iAuditor. When I plug in the code and update the API token and webhook URL, this error pops up:
Traceback (most recent call last):
SyntaxError: invalid syntax (usercode.py, line 42)
Here is the code:
import json
import requests
auth_header = {'Authorization': 'a4fca847d3f203bd7306ef5d1857ba67a2b3d66aa455e06fac0ad0be87b9d226'}
webhook_url = 'https://hooks.zapier.com/hooks/catch/3950922/efka9n/'
api_url = 'https://api.safetyculture.io/audits/'
audit_id = input['audit_id']
audit_doc = requests.get(api_url + audit_id, headers=auth_header).json()
failed_items = []
audit_author = audit_doc['audit_data']['authorship']['author']
conducted_on = audit_doc['audit_data']['date_completed']
conducted_on = conducted_on[:conducted_on.index('T')]
audit_title = audit_doc['template_data']['metadata']['name']
for item in audit_doc['items']:
    if item.get('responses') and item['responses'].get('failed') == True:
        label = item.get('label')
        if label is None:
            label = 'no_label'
        responses = item['responses']
        response_label = responses['selected'][0]['label']
        notes = responses.get('text')
        if notes is None:
            notes = ''
        failed_items.append({'label': label,
                             'response_label': response_label,
                             'conducted_on': conducted_on,
                             'notes': notes,
                             'author': audit_author
                             })
for item in failed_items:
    r = requests.post(webhook_url, data=item)
return response.json()
This looks like an error from the platform. Zapier appears to use a script called usercode.py to bootstrap your code, and the error seems to be coming from that part.
import datetime
from threading import Timer
import firebase_admin
from firebase_admin import firestore
import calendar
db = firestore.Client()
col_ref = db.collection(u'tblAssgin').get()
current_serving = [doc.id for doc in col_ref]
#print(current_serving)
def sit_time():
    for i in current_serving:
        try:
            doc_ref = db.collection(u'tblAssgin').document(i)
        except:
            current_serving.remove(doc_ref)
        else:
            doc = doc_ref.get()
            a = doc.get('assginTime')
            assign_time = datetime.datetime.fromtimestamp(calendar.timegm(a.timetuple()))
            now = datetime.datetime.now()
            sitting_time = now - assign_time
            hours, remainder = divmod(sitting_time.seconds, 3600)
            minutes, seconds = divmod(remainder, 60)
            print('minutes:', minutes)
            updates = {u'sitting_time': minutes}
            doc_ref.update(updates)

t = None

def refresh():
    global t
    sit_time()
    t = Timer(60, refresh)
    t.daemon = True
    t.start()

refresh()
So basically the above code first fetches all the document IDs in the 'tblAssgin' collection and stores them in the 'current_serving' list. Then it loops over each document, calculates the time, and runs again every 60 seconds. Now suppose I delete one document; that document will no longer be found. What I want is for an exception to be raised when the document is not found, and for that document ID to be removed from the 'current_serving' list. But the exception is not caught.
Please help
Thanks in advance..!!
You're assuming CollectionReference.document() will throw an exception if the document does not exist. It doesn't.
>>> client.collection('non-existing').document('also-non-existing')
<google.cloud.firestore_v1beta1.document.DocumentReference object at 0x10feac208>
But DocumentReference.get() will throw an exception if the document doesn't exist.
>>> client.collection('non-existing').document('also-non-existing').get()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "~/gae/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/document.py", line 432, in get
raise exceptions.NotFound(self._document_path)
google.api_core.exceptions.NotFound: 404 ~/databases/(default)/documents/non-existing/also-non-existing
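Applied to the code in the question, a minimal sketch of the loop that moves the try around the get() call and catches that specific error (assuming the v1beta1 client from the traceback above):

from google.api_core import exceptions

def sit_time():
    for i in list(current_serving):  # iterate over a copy so removal is safe
        doc_ref = db.collection(u'tblAssgin').document(i)
        try:
            doc = doc_ref.get()  # this is the call that raises NotFound
        except exceptions.NotFound:
            current_serving.remove(i)
        else:
            ...  # the existing sitting-time calculation and update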
I am trying to get the below program working. It is supposed to find email addresses on a website, but it breaks. I suspect the problem is with initializing result = [] inside the crawl function. Below is the code:
# -*- coding: utf-8 -*-
import requests
import re
import urlparse
# In this example we're trying to collect e-mail addresses from a website
# Basic e-mail regexp:
# letter/number/dot/comma @ letter/number/dot/comma . letter/number
email_re = re.compile(r'([\w\.,]+@[\w\.,]+\.\w+)')
# HTML <a> regexp
# Matches href="" attribute
link_re = re.compile(r'href="(.*?)"')
def crawl(url, maxlevel):
    result = []
    # Limit the recursion, we're not downloading the whole Internet
    if(maxlevel == 0):
        return
    # Get the webpage
    req = requests.get(url)
    # Check if successful
    if(req.status_code != 200):
        return []
    # Find and follow all the links
    links = link_re.findall(req.text)
    for link in links:
        # Get an absolute URL for a link
        link = urlparse.urljoin(url, link)
        result += crawl(link, maxlevel - 1)
    # Find all emails on current page
    result += email_re.findall(req.text)
    return result

emails = crawl('http://ccs.neu.edu', 2)
print "Scrapped e-mail addresses:"
for e in emails:
    print e
The error I get is below:
C:\Python27\python.exe "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py"
Traceback (most recent call last):
File "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py", line 41, in <module>
emails = crawl('http://ccs.neu.edu', 2)
File "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py", line 35, in crawl
result += crawl(link, maxlevel - 1)
File "C:/Users/Sagar Shah/PycharmProjects/crawler/webcrawler.py", line 35, in crawl
result += crawl(link, maxlevel - 1)
TypeError: 'NoneType' object is not iterable
Process finished with exit code 1
Any suggestions will help. Thanks!
The problem is this:
if(maxlevel == 0):
    return
Currently it returns None when maxlevel == 0. You can't concatenate a list with a None object.
You need to return an empty list [] to be consistent.
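A minimal version of that fix in place (only the base case changes; everything else in crawl stays the same):

# Base case: an empty list concatenates cleanly; None does not
if(maxlevel == 0):
    return []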