I have a Python script that calls the Google Analytics API once for every day that I'm trying to get data for. However, on some calls I apparently receive nothing — that, or I'm handling errors incorrectly. Here is the function I'm using to call the API:
def run_query(hour_in_dim, start_date, sessions_writer, connection_error_count, pageToken=None):
    # Try to run api request for one day. Wait 10 seconds if "service is currently unavailable."
    try:
        traffic_results = get_api_query(analytics, start_date, start_date, pageToken)
    except HttpError as err:
        if err.resp.status in [503]:
            print("Sleeping, api service temporarily unavailable.")
            time.sleep(10)
            run_query(hour_in_dim, start_date, sessions_writer, connection_error_count, pageToken)
        else:
            raise
    except ConnectionResetError:
        connection_error_count += 1
        time.sleep(10)
        if connection_error_count > 2:
            raise
        else:
            run_query(hour_in_dim, start_date, sessions_writer, connection_error_count, pageToken)
    # TODO: solve random occurrences of "UnboundLocalError: local variable 'traffic_results' referenced before assignment"
    dimensions_ga = traffic_results['reports'][0]['columnHeader']['dimensions']
    rows = traffic_results['reports'][0]['data']['rows']
The UnboundLocalError comes from the second line from the bottom, where I reference traffic_results and try to assign its contents to the dimensions_ga variable.
I believe the problem is that I was using recursion instead of a loop. I used the sample code provided here:
https://developers.google.com/analytics/devguides/reporting/core/v3/errors
also changing "except HttpError, error:" to "except HttpError as error:" for Python 3.
I'm not sure of the best way to test this, as the error is not manually reproducible.
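For reference, a loop-based rewrite would guarantee that traffic_results is either assigned or the failure is raised explicitly. This is only a sketch under the question's own names (get_api_query, analytics); the three-attempt cap and the final RuntimeError are my choices, not from the original:

def run_query(hour_in_dim, start_date, sessions_writer, connection_error_count=0, pageToken=None):
    # Retry the one-day API request in a loop instead of recursing.
    traffic_results = None
    for attempt in range(3):  # attempt cap is an assumption, tune as needed
        try:
            traffic_results = get_api_query(analytics, start_date, start_date, pageToken)
            break  # success, stop retrying
        except HttpError as err:
            if err.resp.status == 503:
                print("Sleeping, api service temporarily unavailable.")
                time.sleep(10)
            else:
                raise
        except ConnectionResetError:
            time.sleep(10)
    if traffic_results is None:
        raise RuntimeError("API request failed after 3 attempts")
    dimensions_ga = traffic_results['reports'][0]['columnHeader']['dimensions']
    rows = traffic_results['reports'][0]['data']['rows']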
Using G Suite for Education.
I have an app that wants to:
Create a new calendar.
Add an ACL to that calendar, so the student's role would be "reader".
Everything is run through a service account.
The calendar is created just fine, but inserting the ACL throws a 404 error (redacted for privacy):
<HttpError 404 when requesting https://www.googleapis.com/calendar/v3/calendars/MY_DOMAIN_long_string%40group.calendar.google.com/acl?alt=json returned "Not Found">
The function that tries to insert the ACL:
def _create_calendar_acl(calendar_id, user, role='reader'):
    credentials = service_account.Credentials.from_service_account_file(
        CalendarAPI.module_path)
    scoped_credentials = credentials.with_scopes(
        ['https://www.googleapis.com/auth/calendar'])
    delegated_credentials = scoped_credentials.with_subject(
        'an_admin_email')
    calendar_api = googleapiclient.discovery.build('calendar',
                                                   'v3',
                                                   credentials=delegated_credentials)
    body = {'role': role,
            'scope': {'type': 'user',
                      'value': user}}
    answer = calendar_api.acl().insert(calendarId=calendar_id,
                                       body=body,
                                       ).execute()
    return answer
The funniest thing is, if I retry the operation a couple of times, it finally succeeds. Hence, that's what my code does:
def create_student_schedule_calendar(email):
    MAX_RETRIES = 5
    # Get student information
    # Create calendar
    answer = Calendar.create_calendar('a.calendar.owner#mydomain',
                                      f'Student Name - schedule',
                                      timezone='Europe/Madrid')
    calendar_id = answer['id']
    counter = 0
    while counter < MAX_RETRIES:
        try:
            print('Try ' + str(counter + 1))
            _create_calendar_acl(calendar_id=calendar_id, user=email)  # This is where the 404 is thrown
            break
        except HttpError:  # this is where the 404 is caught
            counter += 1
            print('Wait ' + str(counter ** 2))
            time.sleep(counter ** 2)
            continue
    if counter == MAX_RETRIES:
        raise Exception(f'Exceeded retries to create ACL for {calendar_id}')
Anyway, it takes four tries (between 14 and 30 seconds) to succeed, and sometimes it runs out of retries.
Would it be possible that the recently created calendar is not immediately available for the API using it?
Propagation is often an issue with cloud-based services. Large-scale online services are distributed across a network of machines, each of which has some level of latency: there is a discrete, non-zero amount of time that information takes to propagate along a network and update everywhere.
The fact that everything works after the first call that doesn't return a 404 demonstrates this process.
Mitigation:
If you're creating and editing in the same function call, I suggest implementing some kind of wait/sleep for a moment to mitigate getting 404s. This can be done in Python using the time library:
import time
# calendar creation code here
time.sleep(2)
# calendar edit code here
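If a fixed sleep feels fragile, another option is to poll until the new calendar is actually visible before touching its ACL. A sketch, assuming googleapiclient.errors.HttpError is imported and calendar_api is the service object built in the question (calendars().get() is a standard Calendar v3 call):

def _wait_for_calendar(calendar_api, calendar_id, max_attempts=5):
    # Poll until the freshly created calendar is visible to the API.
    for attempt in range(max_attempts):
        try:
            calendar_api.calendars().get(calendarId=calendar_id).execute()
            return  # calendar is visible, safe to insert the ACL now
        except HttpError as err:
            if err.resp.status != 404:
                raise  # only retry on "not found"
            time.sleep(2 ** attempt)  # back off: 1, 2, 4, 8... seconds
    raise Exception(f'Calendar {calendar_id} never became visible')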
I have a microservice with a job that needs to happen only if a different server is up.
For a few weeks it worked great: if the server was down, the microservice slept a bit without doing the job (as it should), and if the server was up, the job was done.
The server is never down for more than a few minutes (for sure! the server is highly monitored), so the job is skipped 2-3 times tops.
Today I entered my Docker container and noticed in the logs that the job hadn't even tried to run for a few weeks now (bad choice not to monitor, I know), indicating, I assume, that some kind of deadlock happened.
I also assume that the problem is with my exception handling; I could use some advice. I work alone.
def is_server_healthy():
    url = "url"  # correct url for health check path
    try:
        res = requests.get(url)
    except Exception as ex:
        LOGGER.error(f"Can't health check!{ex}")
    finally:
        pass
    return res
def init():
    while True:
        LOGGER.info(f"Sleeping for {SLEEP_TIME} Minutes")
        time.sleep(SLEEP_TIME * ONE_MINUTE)
        res = is_server_healthy()
        if res.status_code == 200:
            my_api.DoJob()
            LOGGER.info(f"Server is: {res.text}")
        else:
            LOGGER.info(f"Server is down... {res.status_code}")
(The names of the variables were changed to simplify the question)
The health check is simple enough: the server returns "up" if it's up. Anything else is considered down, so unless status 200 and "up" come back, I consider the server to be down.
In case your server is down, you get an uncaught error:
NameError: name 'res' is not defined
Why? See:
def is_server_healthy():
    url = "don't care"
    try:
        raise Exception()  # simulate fail
    except Exception as ex:
        print(f"Can't health check!{ex}")
    finally:
        pass
    return res  ## name is not known ;o)

res = is_server_healthy()
if res.status_code == 200:  # here, next exception bound to happen
    my_api.DoJob()
    LOGGER.info(f"Server is: {res.text}")
else:
    LOGGER.info(f"Server is down... {res.status_code}")
Even if you had declared the name, the code would then try to access an attribute that's not there:
if res.status_code == 200:  # here - object has no attribute 'status_code'
    my_api.DoJob()
    LOGGER.info(f"Server is: {res.text}")
else:
    LOGGER.info(f"Server is down... {res.status_code}")
It would try to access a member that's simply not there => exception, and the process is gone.
You are probably better off using some system-specific way to call your script once every minute (cron jobs, Task Scheduler) than idling in a while True: with sleep.
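Either way, the immediate fix is to make the failure explicit instead of leaving res unbound. A sketch based on the question's code (the timeout value is my assumption, not from the question):

def is_server_healthy():
    url = "url"  # correct url for health check path
    try:
        return requests.get(url, timeout=10)  # timeout value is an assumption
    except Exception as ex:
        LOGGER.error(f"Can't health check!{ex}")
        return None  # explicit "no response" instead of an unbound name

def init():
    while True:
        LOGGER.info(f"Sleeping for {SLEEP_TIME} Minutes")
        time.sleep(SLEEP_TIME * ONE_MINUTE)
        res = is_server_healthy()
        if res is not None and res.status_code == 200:
            my_api.DoJob()
            LOGGER.info(f"Server is: {res.text}")
        else:
            LOGGER.info("Server is down...")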
I'm trying to implement a method that makes a few attempts to download an image from a URL. To do so, I'm using the requests lib. An example of my code:
while attempts < nmr_attempts:
    try:
        attempts += 1
        response = requests.get(self.basis_url, params=query_params, timeout=response_timeout)
    except Exception as e:
        pass
Each attempt shouldn't spend more than response_timeout making the request. However, it seems that the timeout argument is not doing anything, since the limits I set are not respected.
How can I limit the maximum blocking time of the requests.get() call?
Thanks in advance
Can you try the following (get rid of the try-except block) and see if it helps? except Exception is probably suppressing the exception that requests.get throws.
while attempts < nmr_attempts:
    attempts += 1
    response = requests.get(self.basis_url, params=query_params, timeout=response_timeout)
Or with your original code, you can catch requests.exceptions.ReadTimeout exception. Such as:
while attempts < nmr_attempts:
    try:
        attempts += 1
        response = requests.get(self.basis_url, params=query_params, timeout=response_timeout)
    except requests.exceptions.ReadTimeout as e:
        do_something()
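One more detail worth knowing (my addition, not from the answer above): requests applies the timeout to the connection attempt and to each read from the socket separately, not to the total download time, and it accepts a (connect, read) tuple:

# connect timeout of 3.05s, then up to 10s between bytes received;
# note this is NOT a cap on the total time the download may take
response = requests.get(self.basis_url, params=query_params, timeout=(3.05, 10))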
I'm sending Apple push notifications via AWS SNS via Lambda with Boto3 and Python.
from __future__ import print_function
import boto3

def lambda_handler(event, context):
    client = boto3.client('sns')
    for record in event['Records']:
        if record['eventName'] == 'INSERT':
            rec = record['dynamodb']['NewImage']
            competitors = rec['competitors']['L']
            for competitor in competitors:
                if competitor['M']['confirmed']['BOOL'] == False:
                    endpoints = competitor['M']['endpoints']['L']
                    for endpoint in endpoints:
                        print(endpoint['S'])
                        response = client.publish(
                            # TopicArn='string',
                            TargetArn=endpoint['S'],
                            Message='test message'
                            # Subject='string',
                            # MessageStructure='string',
                        )
Everything works fine! But when an endpoint is invalid for some reason (at the moment this happens every time I run a development build on my device, since I get a different endpoint then; it will be either not found or deactivated), the Lambda function fails and gets called all over again. In this particular case, if for example the second endpoint fails, it will send the push over and over again to endpoint 1, to infinity.
Is it possible to ignore invalid endpoints and just keep going with the function?
Thank you
Edit:
Thanks to your help I was able to solve it with:
try:
    response = client.publish(
        # TopicArn='string',
        TargetArn=endpoint['S'],
        Message='test message'
        # Subject='string',
        # MessageStructure='string',
    )
except Exception as e:
    print(e)
    continue
AWS Lambda on failure retries the function until the event expires from the stream.
In your case, since the exception on the 2nd endpoint is not handled, the retry mechanism ensures the re-execution of the publish to the first endpoint.
If you handle the exception and ensure the function successfully ends even when there is a failure, then the retries will not happen.
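If you'd rather not swallow every error, a narrower variant is to catch only the endpoint-specific failures. A sketch, assuming boto3's generated exception classes for SNS (client.exceptions.EndpointDisabledException):

for endpoint in endpoints:
    try:
        response = client.publish(
            TargetArn=endpoint['S'],
            Message='test message'
        )
    except client.exceptions.EndpointDisabledException as e:
        print(f"Skipping disabled endpoint {endpoint['S']}: {e}")
        continue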
I made a simple script for amusement that takes the latest comment from http://www.reddit.com/r/random/comments.json?limit=1 and speaks it through espeak. I ran into a problem, however: if Reddit fails to give me the JSON data, which it commonly does, the script stops and gives a traceback. Is there any sort of way to retry getting the JSON if it fails to load? I am using requests, if that means anything.
If you need it, here is the part of the code that gets the JSON data:
url = 'http://www.reddit.com/r/random/comments.json?limit=1'
r = requests.get(url)
quote = r.text
body = json.loads(quote)['data']['children'][0]['data']['body']
subreddit = json.loads(quote)['data']['children'][0]['data']['subreddit']
For the vocabulary: the actual error you're having is an exception that was thrown at some point in the program because of a detected runtime error, and the traceback is the call-stack listing that tells you where the exception was thrown.
Basically, what you want is an exception handler:
try:
    url = 'http://www.reddit.com/r/random/comments.json?limit=1'
    r = requests.get(url)
    quote = r.text
    body = json.loads(quote)['data']['children'][0]['data']['body']
    subreddit = json.loads(quote)['data']['children'][0]['data']['subreddit']
except Exception as err:
    print(err)
so that you jump over the part that depends on the thing that couldn't work. Have a look at this doc as well: HandlingExceptions - Python Wiki
As pss suggests, if you want to retry after the url failed to load:
done = False
while not done:
    try:
        url = 'http://www.reddit.com/r/random/comments.json?limit=1'
        r = requests.get(url)
    except Exception as err:
        print(err)
        continue  # the request failed, retry
    done = True
    quote = r.text
    body = json.loads(quote)['data']['children'][0]['data']['body']
    subreddit = json.loads(quote)['data']['children'][0]['data']['subreddit']
N.B.: That solution may not be optimal, since if you're offline or the URL is always failing, it'll loop forever. If you retry too fast and too often, Reddit may also ban you.
N.B. 2: I'm using the newest Python 3 syntax for exception handling, which may not work with Python older than 2.7.
N.B. 3: You may also want to choose a class other than Exception for the exception handling, to be able to select what kind of error you want to handle. It mostly depends on your app design; given what you say, you might want to handle requests.exceptions.ConnectionError, but have a look at requests' doc to choose the right one.
Here's what you may want, but please think this through and adapt it to your use case:
import requests
import time
import json
import sys

def get_reddit_comments():
    retries = 5
    while retries != 0:
        try:
            url = 'http://www.reddit.com/r/random/comments.json?limit=1'
            r = requests.get(url)
            break  # if the request succeeded we get out of the loop
        except requests.exceptions.ConnectionError as err:
            print("Warning: couldn't get the URL: {}".format(err))
            time.sleep(1)  # wait 1 second between two requests
            retries -= 1
    if retries == 0:  # if we've done 5 attempts, we fail loudly
        return None
    return r.text

def use_data(quote):
    if not quote:
        print("could not get URL, despite multiple attempts!")
        return False
    data = json.loads(quote)
    if 'error' in data.keys():
        print("could not get data from reddit: error code #{}".format(data['error']))
        return False
    body = data['data']['children'][0]['data']['body']
    subreddit = data['data']['children'][0]['data']['subreddit']
    # … do stuff with your data here
    return True

if __name__ == "__main__":
    quote = get_reddit_comments()
    if not use_data(quote):
        print("Fatal error: Couldn't handle data receipt from reddit.")
        sys.exit(1)
I hope this snippet helps you correctly design your program. And now that you've discovered exceptions, please always remember that exceptions are for handling things that should stay exceptional. Whenever you throw an exception in a program, ask yourself whether you're reacting to something truly unexpected (like a webpage not loading) or to an expected outcome (like a page loading but returning data that isn't what you expected).