How to handle multiple missing keys in a dict? - python

I'm using an API to get basic information about shops in my area: name of shop, address, postcode, phone number, etc. The API returns a long list of details about each shop, but I only want some of the data for each one.
I created a for loop that takes just the information I want for every shop the API returns. This all works fine.
The problem is that not all shops have a phone number or a website, so I get a KeyError because the key website does not exist in every shop's record. I tried try and except, which works as long as only one key can be missing, but a shop might lack both a phone number and a website, which leads to a second KeyError.
What can I do to check every key in my for loop and, if a key is missing, just add the value "none"?
My code:
import requests
import geocoder
import pprint

g = geocoder.ip('me')
print(g.latlng)
latitude, longitude = g.latlng

URL = "https://discover.search.hereapi.com/v1/discover"
latitude = xxxx
longitude = xxxx
api_key = 'xxxxx' # Acquire from developer.here.com
query = 'food'
limit = 12
PARAMS = {
    'apikey': api_key,
    'q': query,
    'limit': limit,
    'at': '{},{}'.format(latitude, longitude)
}

# sending the GET request and saving the response as a response object
r = requests.get(url=URL, params=PARAMS)
data = r.json()
#print(data)

for x in data['items']:
    title = x['title']
    address = x['address']['label']
    street = x['address']['street']
    postalCode = x['address']['postalCode']
    position = x['position']
    access = x['access']
    typeOfBusiness = x['categories'][0]['name']
    contacts = x['contacts'][0]['phone'][0]['value']
    try:
        website = x['contacts'][0]['www'][0]['value']
    except KeyError:
        website = "none"
    resultList = {
        'BUSINESS NAME:': title,
        'ADDRESS:': address,
        'STREET NAME:': street,
        'POSTCODE:': postalCode,
        'POSITION:': position,
        'POSITION2:': access,
        'TYPE:': typeOfBusiness,
        'PHONE:': contacts,
        'WEBSITE:': website
    }
    print("--" * 80)
    pprint.pprint(resultList)

I think a good way to handle it would be to use operator.itemgetter() to create a callable that will attempt to retrieve all the keys at once; if any of them is missing, it raises a KeyError.
A short demonstration of what I mean:
from operator import itemgetter
test_dict = dict(name="The Shop", phone='123-45-6789', zipcode=90210)
keys = itemgetter('name', 'phone', 'zipcode')(test_dict)
print(keys) # -> ('The Shop', '123-45-6789', 90210)
keys = itemgetter('name', 'address', 'phone', 'zipcode')(test_dict)
# -> KeyError: 'address'
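If you want every missing key to simply default to "none" instead, a small helper that walks the nested keys also works and avoids one try/except per field. A minimal sketch; the safe_get helper below is hypothetical (not part of any library), and it assumes the same item structure as in the question:

def safe_get(data, *keys, default="none"):
    # Walk a chain of nested keys/indexes; return the default
    # as soon as any step is missing.
    for key in keys:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            return default
    return data

Inside the loop from the question this would read:

website = safe_get(x, 'contacts', 0, 'www', 0, 'value')
contacts = safe_get(x, 'contacts', 0, 'phone', 0, 'value')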

Related

OSM Overpass missing data in query result

I'm gathering all cities, towns and villages of some countries from OSM using an Overpass query in a Python program.
Everything seems to be correct, but I found a town in Luxembourg that is missing from my result set. It concerns the town Kiischpelt.
import requests
import json

Country = 'LU'
overpass_url = "http://overpass-api.de/api/interpreter"
overpass_query = """
[out:json];
area["ISO3166-1"=""" + Country + """][admin_level=2]->.search;
(node["place"="city"](area.search);
node["place"="town"](area.search);
node["place"="village"](area.search);
way["place"="city"](area.search);
way["place"="town"](area.search);
way["place"="village"](area.search);
rel["place"="city"](area.search);
rel["place"="town"](area.search);
rel["place"="village"](area.search);
);
out center;
"""
response = requests.get(overpass_url,
                        params={'data': overpass_query})
data = response.json()
filename = """C:/Data/GetGeoData/data/""" + Country + 'cities' + '.json'
f = open(filename, 'w', encoding="utf-8")
json.dump(data, f)
f.close()
When searching on the OSM site for Kiischpelt, I get a result of type relation, but it doesn't appear in my result set.
The same happens when I change the query to rel["place"]; which should return places of all kinds (city, town, village, isolated dwelling, ...).
Any idea what I'm doing wrong?
Many thanks!
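For reference, the broadened query described above would look something like this in full (a sketch, assuming the same area filter as the original query, and not taken from the original post):

[out:json];
area["ISO3166-1"="LU"][admin_level=2]->.search;
(node["place"](area.search);
way["place"](area.search);
rel["place"](area.search);
);
out center;

Dropping the value filter on place returns nodes, ways and relations of every place type.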

Python Microsoft Graph API

I am using the Microsoft Graph API to pull my emails in Python and return them as a JSON object. There is a limitation that it only returns 12 emails. The code is:
def get_calendar_events(token):
    graph_client = OAuth2Session(token=token)
    # Configure query parameters to modify the results
    query_params = {
        #'$select': 'subject,organizer,start,end,location',
        #'$orderby': 'createdDateTime DESC'
        '$select': 'sender, subject',
        '$skip': 0,
        '$count': 'true'
    }
    # Send GET to /me/messages
    events = graph_client.get('{0}/me/messages'.format(graph_url), params=query_params)
    events = events.json()
    # Return the JSON result
    return events
The response I get is twelve emails with subject and sender, plus the total count of my emails.
Now I want to iterate over the emails, changing the $skip value in query_params to get the next 12. Is there a method to iterate using loops or recursion?
I'm thinking something along the lines of this:
def get_calendar_events(token):
    graph_client = OAuth2Session(token=token)
    json_list = []
    ct = 0
    while True:
        # Configure query parameters to modify the results
        query_params = {
            #'$select': 'subject,organizer,start,end,location',
            #'$orderby': 'createdDateTime DESC'
            '$select': 'sender, subject',
            '$skip': ct,
            '$count': 'true'
        }
        # Send GET to /me/messages
        events = graph_client.get('{0}/me/messages'.format(graph_url), params=query_params)
        events = events.json()
        # Stop once a page comes back empty, otherwise this loops forever
        if not events.get('value'):
            break
        json_list.append(events)
        ct += 12
    # Return the list of JSON pages
    return json_list
It may require some tweaking, but essentially you add 12 to the offset on each pass until the server stops returning messages. Each page of JSON is appended to a list, which is then returned.
If you know how many emails you have, you could also batch the requests that way.
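Microsoft Graph also returns an @odata.nextLink URL whenever more pages exist, which avoids hard-coding the page size of 12. A minimal sketch of that approach, assuming the same OAuth2Session setup and graph_url as above:

def get_all_messages(token):
    graph_client = OAuth2Session(token=token)
    url = '{0}/me/messages'.format(graph_url)
    params = {'$select': 'sender,subject'}
    messages = []
    while url:
        page = graph_client.get(url, params=params).json()
        messages.extend(page.get('value', []))
        url = page.get('@odata.nextLink')  # absent on the last page
        params = None  # the nextLink URL already embeds the query parameters
    return messages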

Parsing the GitHub API: getting "string indices must be integers" error

I need to loop through commits and get the name, date, and message info from the GitHub API:
https://api.github.com/repos/droptable461/Project-Project-Management/commits
I have tried many different things, but I keep getting stuck on the "string indices must be integers" error:
def git():
    # name, date, message
    # https://api.github.com/repos/droptable461/Project-Project-Management/commits
    # commit { author { name and date
    # commit { message
    #with urlopen('https://api.github.com/repos/droptable461/Project Project-Management/commits') as response:
    #    source = response.read()
    #    data = json.loads(source)
    #    state = []
    #    for state in data['committer']:
    #        state.append(state['name'])
    #    print(state)
    link = 'https://api.github.com/repos/droptable461/Project-Project-Management/events'
    r = requests.get('https://api.github.com/repos/droptable461/Project-Project-Management/commits')
    #print(r)
    #one = r['commit']
    #print(one)
    for item in r.json():
        for c in item['commit']['committer']:
            print(c['name'], c['date'])
    return 'suc'
Need to get person who did the commit, date and their message.
item['commit']['committer'] is a dictionary object, so the line
for c in item['commit']['committer']: iterates over the dictionary's keys.
Since you are then calling [] with a string index on each key (which is itself a string), you get the error.
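A quick illustration with a hypothetical committer dict (not real API data):

committer = {'name': 'droptable461', 'date': '2020-01-01T00:00:00Z'}
for c in committer:
    print(c)  # prints the keys: 'name', then 'date'
    # c['name'] would raise TypeError: string indices must be integers,
    # because c is the string 'name', not a dictionary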
Instead, that code should look more like:
import requests

def git():
    r = requests.get('https://api.github.com/repos/droptable461/Project-Project-Management/commits')
    for item in r.json():
        committer = item['commit']['committer']
        print(committer['name'])
        print(committer['date'])
        print(item['commit']['message'])
    return 'suc'

How do I get an associated value from a JSON variable using Python?

How do I look up the 'id' associated with a person's 'name' when the two are in a dictionary?
user = 'PersonA'
id = ? # How do I retrieve the 'id' from the user_stream JSON variable?
The JSON, stored in a variable named "user_stream":
[
    {
        "name": "PersonA",
        "id": "135963"
    },
    {
        "name": "PersonB",
        "id": "152265"
    }
]
You'll have to decode the JSON structure and loop through all the dictionaries until you find a match:
for person in json.loads(user_stream):
    if person['name'] == user:
        id = person['id']
        break
else:
    # The else branch is only ever reached if no match was found
    raise ValueError('No such person')
If you need to make multiple lookups, you probably want to transform this structure to a dict to ease lookups:
name_to_id = {p['name']: p['id'] for p in json.loads(user_stream)}
then look up the id directly:
id = name_to_id.get(name) # if name is not found, id will be None
The above example assumes that names are unique; if they are not, use:
from collections import defaultdict

name_to_id = defaultdict(list)
for person in json.loads(user_stream):
    name_to_id[person['name']].append(person['id'])

# lookup
ids = name_to_id.get(name, [])  # list of ids, defaults to empty
This is as always a trade-off, you trade memory for speed.
Martijn Pieters's solution is correct, but if you intend to make many such look-ups it's better to load the json and iterate over it just once, and not for every look-up.
name_id = {}
for person in json.loads(user_stream):
    name = person['name']
    id = person['id']
    name_id[name] = id

user = 'PersonA'
print(name_id[user])
persons = json.loads(user_stream)
results = [p for p in persons if p['name'] == 'avi']
if results:
    id = results[0]["id"]
results can contain more than one match, of course.

YouTube API v3 and Python to generate a list of views/likes on your own YouTube videos

I'm trying to get a RaspberryPi3 Python-based birdhouse to read the popularity of its own uploaded videos (which enables it to deduce which ones should be deleted, avoiding hundreds of uploaded files).
I thought the best way to read #views/#likes was to use yt_analytics_report.py, but it always returns 0 values.
When I input:
$python yt_analytics_report.py --filters="video==bb_o--nP1mg"
or
$python yt_analytics_report.py --filters="channel==UCGXIq7h2UeFh9RhsSEsFsiA"
The output is:
$ python yt_analytics_report.py --filters="video==bb_o--nP1mg"
{'metrics': 'views,estimatedMinutesWatched,averageViewDuration', 'filters': 'video==bb_o--nP1mg', 'ids': 'channel==MINE', 'end_date': '2018-01-12', 'start_date': '2018-01-06'}
Please visit this URL to authorize this application: [note: here was the URL with the auth sequence etc., which I acknowledged] followed by the result:
views estimatedMinutesWatched averageViewDuration
0.0 0.0 0.0
I'm new to this; for the last 3 days I've been testing a variety of filters, but the result is always the same. I guess I'm doing something severely wrong.
The (auto sensor-triggered) video uploads work excellently, so I presume the root cause is related to the way I'm using the yt_analytics example.
Any suggestions on the root cause, or alternative methods to retrieve #views/#likes of self-uploaded YouTube videos, are appreciated.
After a few days of trying, I have found a solution for generating, with Python and the YouTube API v3, a list that contains the views, likes, etc. of the uploaded videos in my own YouTube channel.
I would like to share the complete code in case anyone is facing the same challenge. The code contains remarks and referrals to additional information.
Please be aware that using the API consumes API credits. This implies that (when you run this script continuously or often) you can run out of the daily maximum number of API credits set by Google.
# This Python 2.7.14 example shows how to retrieve with the YouTube API v3 a list of the videos uploaded to a channel, and also
# shows additional statistics of each individual video, such as number of views, likes etc.
# Please notice that YOU HAVE TO change API_KEY and the YouTube channelID
# Have a look at the referred examples to understand how the API works
#
# The code consists of two parts:
# - The first part queries the videos in a channel and stores them in a list
# - The second part queries each individual video in detail
#
# Credits to the Coding 101 team, the guy previously guiding me to a query, and the Google API explorer, who/which got me on track.
#
# RESULTING EXAMPLE OUTPUT: The output will look a bit like this:
#
# https://www.youtube.com/watch?v=T3U2oz_Y8T0
# Upload date: 2018-01-13T09:43:27.000Z
# Number of views: 8
# Number of likes: 2
# Number of dislikes: 0
# Number of favorites:0
# Number of comments: 0
#
# https://www.youtube.com/watch?v=EFyC8nhusR8
# Upload date: 2018-01-06T14:24:34.000Z
# Number of views: 6
# Number of likes: 2
# Number of dislikes: 0
# Number of favorites:0
# Number of comments: 0
#
import urllib   # imported for its urlencode function
import urllib2  # for making http requests
import json     # for decoding a JSON response
#
API_KEY = 'PLACE YOUR OWN YOUTUBE API HERE' # What? How? Learn here: https://www.youtube.com/watch?v=JbWnRhHfTDA
ChannelIdentifier = 'PLACE YOUR OWN YOUTUBE channelID HERE' # What? How? Learn here: https://www.youtube.com/watch?v=tf42K4pPWkM
#
# This first part queries the list of videos uploaded to a specific channel
# The identification is done through the ChannelIdentifier which you have defined as a variable
# The results from this first part are stored in the list videoMetadata. This is used in the second part of the code below.
#
# This code is based on a very good example from Coding 101 which you can find here: https://www.youtube.com/watch?v=_M_wle0Iq9M
#
url = 'https://www.googleapis.com/youtube/v3/search?part=snippet&channelId='+ChannelIdentifier+'&maxResults=50&type=video&key='+API_KEY
response = urllib2.urlopen(url) # makes the call to YouTube
videos = json.load(response)    # decodes the response so we can work with it
videoMetadata = []              # declaring our list
for video in videos['items']:
    if video['id']['kind'] == 'youtube#video':
        videoMetadata.append(video['id']['videoId']) # appends each videoID to our list
#
# In this second part, a loop runs through the list videoMetadata
# During each step the details of a specific video are retrieved and displayed
# The structure of the API return can be tested with the API explorer (which you can execute without OAuth):
# https://developers.google.com/apis-explorer/#p/youtube/v3/youtube.videos.list?part=snippet%252CcontentDetails%252Cstatistics&id=Ks-_Mh1QhMc&_h=1&
#
for metadata in videoMetadata:
    print "https://www.youtube.com/watch?v="+metadata # here the videoID is printed
    SpecificVideoID = metadata
    SpecificVideoUrl = 'https://www.googleapis.com/youtube/v3/videos?part=snippet%2CcontentDetails%2Cstatistics&id='+SpecificVideoID+'&key='+API_KEY
    response = urllib2.urlopen(SpecificVideoUrl) # makes the call for a specific video
    videos = json.load(response)                 # decodes the response so we can work with it
    for video in videos['items']:
        if video['kind'] == 'youtube#video':
            print "Upload date: "+video['snippet']['publishedAt']      # upload date of the specific video
            print "Number of views: "+video['statistics']['viewCount'] # number of views of the specific video
            print "Number of likes: "+video['statistics']['likeCount'] # etc
            print "Number of dislikes: "+video['statistics']['dislikeCount']
            print "Number of favorites:"+video['statistics']['favoriteCount']
            print "Number of comments: "+video['statistics']['commentCount']
            print "\n"
Building on Sefo's answer above, I was able to clean up the outputs a bit.
The first function creates a list of relevant videos (you could replace this however you want), and the second iterates through this list and grabs the statistics and basic text data associated with each individual video.
The output is a list of dictionaries, perfect for conversion into a pandas DataFrame.
from googleapiclient.discovery import build

# Assumes DEVELOPER_KEY, YOUTUBE_API_SERVICE_NAME and YOUTUBE_API_VERSION
# are defined as in the answer below.

def youtube_search_list(q, max_results=10):
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
                    developerKey=DEVELOPER_KEY)
    # Call the search.list method to retrieve results matching the specified
    # query term.
    search_response = youtube.search().list(
        q=q,
        part='id,snippet',
        maxResults=max_results,
        order='viewCount'
    ).execute()
    return search_response

def youtube_search_video(q='spinners', max_results=10):
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
                    developerKey=DEVELOPER_KEY)
    # Return list of matching records up to max_results
    search_result = youtube_search_list(q, max_results)
    videos_list = []
    for search_result in search_result.get("items", []):
        if search_result["id"]["kind"] == 'youtube#video':
            temp_dict_ = {}
            # Available from initial search
            temp_dict_['title'] = search_result['snippet']['title']
            temp_dict_['vidId'] = search_result['id']['videoId']
            # Secondary call to find statistics results for individual video
            response = youtube.videos().list(
                part='statistics, snippet',
                id=search_result['id']['videoId']
            ).execute()
            response_statistics = response['items'][0]['statistics']
            response_snippet = response['items'][0]['snippet']
            snippet_list = ['publishedAt', 'channelId', 'description',
                            'channelTitle', 'tags', 'categoryId',
                            'liveBroadcastContent', 'defaultLanguage']
            for val in snippet_list:
                try:
                    temp_dict_[val] = response_snippet[val]
                except KeyError:
                    # Not stored if not present
                    temp_dict_[val] = 'xxNoneFoundxx'
            stats_list = ['favoriteCount', 'viewCount', 'likeCount',
                          'dislikeCount', 'commentCount']
            for val in stats_list:
                try:
                    temp_dict_[val] = response_statistics[val]
                except KeyError:
                    # Not stored if not present
                    temp_dict_[val] = 'xxNoneFoundxx'
            # add back to main list
            videos_list.append(temp_dict_)
    return videos_list
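Since the output is a list of dictionaries, the conversion mentioned above is a one-liner. A usage sketch, assuming pandas is installed and the API constants are set:

import pandas as pd

videos = youtube_search_video(q='spinners', max_results=10)
df = pd.DataFrame(videos)  # one row per video, one column per dict key
print(df[['title', 'viewCount', 'likeCount']].head())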
This code will help you a ton; I was struggling with this for a long time. Just provide the API key, the YouTube channel name, and the channel ID in the search list.
from googleapiclient.discovery import build

DEVELOPER_KEY = "paste your API key here"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

title = []
channelId = []
channelTitle = []
categoryId = []
videoId = []
viewCount = []
likeCount = []
dislikeCount = []
commentCount = []
favoriteCount = []
category = []
videos = []
tags = []

max_search = 50
order = "relevance"
token = None
location = None
location_radius = None

youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
                developerKey=DEVELOPER_KEY)

# Get the videos of the channel by its name and ID
search_result = youtube.search().list(q="put the channel name here", type="video",
                                      pageToken=token, order=order, part="id,snippet",
                                      maxResults=max_search, location=location,
                                      locationRadius=location_radius,
                                      channelId='put the channel ID here').execute()

for search_result in search_result.get("items", []):
    if search_result["id"]["kind"] == 'youtube#video':
        title.append(search_result['snippet']['title'])  # the title of the video
        videoId.append(search_result['id']['videoId'])   # the ID of the video
        # A second request is needed because the statistics and snippet
        # details are only available from the videos().list endpoint
        response = youtube.videos().list(part='statistics, snippet',
                                         id=search_result['id']['videoId']).execute()
        channelId.append(response['items'][0]['snippet']['channelId'])        # channel ID, which is constant here
        channelTitle.append(response['items'][0]['snippet']['channelTitle'])  # channel title, also constant
        categoryId.append(response['items'][0]['snippet']['categoryId'])      # stores the categories of the videos
        favoriteCount.append(response['items'][0]['statistics']['favoriteCount'])  # stores the favourite count
        viewCount.append(response['items'][0]['statistics']['viewCount'])          # stores the view counts
        # The likes and dislikes had a bug all along, which requires the
        # if/else below instead of just behaving like viewCount
        if 'likeCount' in response['items'][0]['statistics'].keys():  # checks for likes count, stores 0 if absent
            likeCount.append(response['items'][0]['statistics']['likeCount'])
        else:
            likeCount.append('0')
        if 'dislikeCount' in response['items'][0]['statistics'].keys():  # checks for dislikes count, stores 0 if absent
            dislikeCount.append(response['items'][0]['statistics']['dislikeCount'])
        else:
            dislikeCount.append('0')
        if 'commentCount' in response['items'][0]['statistics'].keys():  # checks for comment count, stores 0 if absent
            commentCount.append(response['items'][0]['statistics']['commentCount'])
        else:
            commentCount.append('0')
        if 'tags' in response['items'][0]['snippet'].keys():  # checks for tags, stores 0 if absent
            tags.append(response['items'][0]['snippet']['tags'])
        else:
            tags.append('0')

youtube_dict = {
    'tags': tags,
    'channelId': channelId,
    'channelTitle': channelTitle,
    'categoryId': categoryId,
    'title': title,
    'videoId': videoId,
    'viewCount': viewCount,
    'likeCount': likeCount,
    'dislikeCount': dislikeCount,
    'commentCount': commentCount,
    'favoriteCount': favoriteCount
}
for x in youtube_dict:
    print(x)
    for y in youtube_dict[x]:
        print(y)
Good luck!
