How do I extract the list of hospitals in each neighborhood of a city using the Foursquare API and put it into a DataFrame?
This is what I am trying to achieve as a DataFrame:
   Neighborhood   No. of hospitals
0  Neighborhood1  5
1  Neighborhood2  1
2  Neighborhood3  3
3  Neighborhood4  4
4  Neighborhood5  5
I am trying out code from a previous tutorial to achieve this. I expected an error, but I don't know where to start.
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={}&query=supermarket,{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude',
                             'Neighborhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']
    return(nearby_venues)
Next cell:
Toronto_venues = getNearbyVenues(names=Toronto_df['Neighborhood'],
                                 latitudes=Toronto_df['Latitude'],
                                 longitudes=Toronto_df['Longitude']
                                 )
Thank you in advance!
Thank you for your response.
Toronto_venues = getNearbyVenues(names=Toronto_df['Neighborhood'],
                                 latitudes=Toronto_df['Latitude'],
                                 longitudes=Toronto_df['Longitude']
                                 )
But this cell raises the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-16-03f6027f84a2> in <module>()
1 Toronto_venues = getNearbyVenues(names=Toronto_df['Neighborhood'],
2 latitudes=Toronto_df['Latitude'],
----> 3 longitudes=Toronto_df['Longitude']
4 )
<ipython-input-13-0c3ca691c166> in getNearbyVenues(names, latitudes, longitudes, radius)
16
17 # make the GET request
---> 18 results = requests.get(url).json()["response"]['groups'][0]['items']
19
20 # return only relevant information for each nearby venue
KeyError: 'groups'
You need to count the venues per neighborhood, then keep a single column of the result and rename it.
df = Toronto_venues.groupby('Neighborhood').count() # Get the counts
df = pd.DataFrame(df['Venue']) # Convert the counts to a dataframe
df.rename(columns={'Venue': 'No. of Hospitals'}, inplace=True)
At this point you will have a DataFrame, but the neighborhood names are in the index rather than in a column. If you want to pull them out into a column, use this code as well:
df.reset_index(level=0, inplace=True)
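On the KeyError: 'groups' reported in the question: as far as I know, the v2 venues/search endpoint lists its matches under response['venues'], while the response['groups'][0]['items'] layout belongs to venues/explore, so the tutorial's parsing line does not fit a search request. Note also that the URL in the question has a single placeholder after ll= and sends the longitude into query=supermarket,{}. Below is a minimal sketch of the parsing step under that assumed layout; parse_search_response and the sample payload are made up for illustration.

```python
def parse_search_response(name, lat, lng, payload):
    """Turn one venues/search JSON payload into rows for the DataFrame.

    Assumes (not verified against the live API) that matches are listed
    under payload['response']['venues'].
    """
    venues = payload.get("response", {}).get("venues", [])
    rows = []
    for v in venues:
        # Some venues may come back without any category.
        categories = v.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((
            name, lat, lng,
            v["name"],
            v["location"]["lat"],
            v["location"]["lng"],
            category,
        ))
    return rows


# Hypothetical payload, shaped like the assumed search response.
sample = {
    "response": {
        "venues": [
            {"name": "General Hospital",
             "location": {"lat": 43.65, "lng": -79.38},
             "categories": [{"name": "Hospital"}]},
            {"name": "Unnamed Clinic",
             "location": {"lat": 43.66, "lng": -79.39},
             "categories": []},
        ]
    }
}

rows = parse_search_response("Neighborhood1", 43.65, -79.38, sample)
print(len(rows))  # 2
```

Inside getNearbyVenues, venues_list.append(parse_search_response(name, lat, lng, requests.get(url).json())) could then replace the line that indexes 'groups', with ll={},{} and query=hospital in the URL.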
Related
I was analyzing a data set with addresses in Philadelphia, PA. To make use of these, I wanted to get the exact longitude and latitude so that I can later show them on a map.
I have extracted the unique entries of the column as a list and implemented a loop to get the longitude and latitude, but it gives me the same coordinates for every city, and sometimes even ones that are outside of Philadelphia.
Here's what I did so far:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="my_user_agent")
geocode = lambda query: geolocator.geocode("%s, Philadelphia PA" % query)
cities = list(philly["station_name"].unique())
for city in cities:
    address = city
    location = geolocator.geocode(address)
    if(location != None):
        philly["longitude"] = location.longitude
        philly["latitude"] = location.latitude
philly["coordinates"] = list(zip(philly["latitude"], philly["longitude"]))
If "philly" is a list of dictionary objects then you can iterate over the list and add the location properties to each record.
from geopy.geocoders import Nominatim
philly = [{'station_name': '30th Street Station'}]
geolocator = Nominatim(user_agent="my_user_agent")
for row in philly:
    address = row["station_name"]
    location = geolocator.geocode(f"{address}, Philadelphia, PA", country_codes="us")
    if location:
        print(address)
        print(">>", location.longitude, location.latitude)
        row["longitude"] = location.longitude
        row["latitude"] = location.latitude
        row["coordinates"] = (location.longitude, location.latitude)
print(philly)
Output:
30th Street Station
>> -75.1821442 39.9552836
[{'station_name': '30th Street Station', 'longitude': -75.1821442, 'latitude': 39.9552836, 'coordinates': (-75.1821442, 39.9552836)}]
If working with a Pandas dataframe then you can iterate over each record in the dataframe then set the latitude, longitude and coordinates fields in it.
You can do something like this:
from geopy.geocoders import Nominatim
import pandas as pd
geolocator = Nominatim(user_agent="my_user_agent")
philly = [{'station_name': '30th Street Station'}]
df = pd.DataFrame(philly)
# add empty location columns to the data frame (None keeps them as object dtype)
df["latitude"] = None
df["longitude"] = None
df["coordinates"] = None
for idx, row in df.iterrows():
    address = row.station_name
    location = geolocator.geocode(f"{address}, Philadelphia, PA", country_codes="us")
    if location:
        # write through df.at: the row yielded by iterrows() may be a copy
        df.at[idx, "latitude"] = location.latitude
        df.at[idx, "longitude"] = location.longitude
        df.at[idx, "coordinates"] = (location.longitude, location.latitude)
print(df)
Output:
station_name latitude longitude coordinates
0 30th Street Station 39.955284 -75.182144 (-75.1821442, 39.9552836)
If you have a list with duplicate station names then you should cache the results so you don't make duplicate geolocation requests.
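The caching suggested above could look roughly like this. It is only a sketch: geocode_cached and fake_geocode are hypothetical names, and the stand-in geocoder exists solely so the example runs without network access. In real use you would pass a function that calls geolocator.geocode.

```python
def geocode_cached(names, geocode):
    """Geocode each name once; repeated names reuse the stored result."""
    cache = {}
    results = []
    for name in names:
        if name not in cache:
            # Only the first occurrence of a name reaches the geocoder.
            cache[name] = geocode(name)
        results.append(cache[name])
    return results


# Stand-in for a real geocoder so the sketch runs offline (hypothetical data).
calls = []

def fake_geocode(name):
    calls.append(name)
    return {"30th Street Station": (39.9552836, -75.1821442)}.get(name)


stations = ["30th Street Station", "Unknown Stop", "30th Street Station"]
coords = geocode_cached(stations, fake_geocode)
print(coords)
print(len(calls))  # 2 lookups for 3 names
```

With geopy, the second argument could be something like lambda s: geolocator.geocode(f"{s}, Philadelphia, PA", country_codes="us"), in which case the results are Location objects (or None for failed lookups) rather than tuples.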
I'm using an API to get basic information about shops in my area: name of shop, address, postcode, phone number, etc. The API returns a long list of details for each shop, but I only want some of the data from each shop.
I created a for loop that takes just the information that I want for every shop that the API has returned. This all works fine.
The problem is that not all shops have a phone number or a website, so I get a KeyError because the key website does not exist in every shop's result. I tried try and except, which works, but only if I handle one key; a shop might have neither a phone number nor a website, which leads to a second KeyError.
What can I do to check for every key in my for loop and, if a key is missing, just use the value "none"?
My code:
import requests
import geocoder
import pprint
g = geocoder.ip('me')
print(g.latlng)
latitude, longitude = g.latlng
URL = "https://discover.search.hereapi.com/v1/discover"
latitude = xxxx
longitude = xxxx
api_key = 'xxxxx' # Acquire from developer.here.com
query = 'food'
limit = 12
PARAMS = {
    'apikey':api_key,
    'q':query,
    'limit': limit,
    'at':'{},{}'.format(latitude,longitude)
}
# sending get request and saving the response as response object
r = requests.get(url = URL, params = PARAMS)
data = r.json()
#print(data)
for x in data['items']:
    title = x['title']
    address = x['address']['label']
    street = x['address']['street']
    postalCode = x['address']['postalCode']
    position = x['position']
    access = x['access']
    typeOfBusiness = x['categories'][0]['name']
    contacts = x['contacts'][0]['phone'][0]['value']
    try:
        website = x['contacts'][0]['www'][0]['value']
    except KeyError:
        website = "none"
    resultList = {
        'BUSINESS NAME:':title,
        'ADDRESS:':address,
        'STREET NAME:':street,
        'POSTCODE:':postalCode,
        'POSITION:':position,
        'POSITSION2:':access,
        'TYPE:':typeOfBusiness,
        'PHONE:':contacts,
        'WEBSITE:':website
    }
    print("--"*80)
    pprint.pprint( resultList)
I think a good way to handle it would be to use operator.itemgetter() to create a callable that will attempt to retrieve all the keys at once; if any of them is not found, it raises a KeyError.
A short demonstration of what I mean:
from operator import itemgetter
test_dict = dict(name="The Shop", phone='123-45-6789', zipcode=90210)
keys = itemgetter('name', 'phone', 'zipcode')(test_dict)
print(keys) # -> ('The Shop', '123-45-6789', 90210)
keys = itemgetter('name', 'address', 'phone', 'zipcode')(test_dict)
# -> KeyError: 'address'
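Since the goal in the question is to fall back to "none" rather than raise, another option is a small helper that walks a path of keys and indexes and returns a default when any step is missing. This is a sketch, not part of any API client: dig is a hypothetical name, and the sample item is invented to mirror the structure used in the question.

```python
def dig(obj, path, default="none"):
    """Follow a sequence of dict keys / list indexes; return default if any step fails."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


# Invented item: it has a phone number but no website and no address.
item = {
    "title": "The Shop",
    "contacts": [{"phone": [{"value": "123-45-6789"}]}],
}

phone = dig(item, ["contacts", 0, "phone", 0, "value"])
website = dig(item, ["contacts", 0, "www", 0, "value"])
street = dig(item, ["address", "street"])
print(phone, website, street)  # 123-45-6789 none none
```

In the loop from the question, each assignment would then become a single call such as website = dig(x, ['contacts', 0, 'www', 0, 'value']), with no try/except per field.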
I am trying to use the function below to retrieve venues for different locations, but I keep getting this error. I can't figure it out, because I used the same function before with different locations and it worked perfectly. Please help!
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood',
                             'Neighbourhood Latitude',
                             'Neighbourhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']
    return(nearby_venues)
london_venues = getNearbyVenues(names=df['Location'],
                                latitudes=df['Latitude'],
                                longitudes=df['Longitude']
                                )
This is the error I am getting
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-180-4f13fb178c94> in <module>
1 london_venues = getNearbyVenues(names=df['Location'],
2 latitudes=df['Latitude'],
----> 3 longitudes=df['Longitude']
4 )
<ipython-input-177-d194f1c67c83> in getNearbyVenues(names, latitudes, longitudes, radius)
16
17 # make the GET request
---> 18 results = requests.get(url).json()["response"]['groups'][0]['items']
19
20 # return only relevant information for each nearby venue
KeyError: 'groups'
You might have exceeded your API call limit if you are using a sandbox account, or the response may simply not contain a key named "groups". If neither is the case, please provide the coordinates of the location.
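To tell these cases apart, it can help to inspect the response before indexing into it. The sketch below assumes the v2 response envelope with a meta object carrying code, errorType and errorDetail; extract_items is a hypothetical helper and both payloads are invented.

```python
def extract_items(payload):
    """Return the explore items, or raise with the API's own error details."""
    meta = payload.get("meta", {})
    if meta.get("code") != 200:
        raise RuntimeError(
            "Foursquare request failed: code={} type={} detail={}".format(
                meta.get("code"), meta.get("errorType"), meta.get("errorDetail")
            )
        )
    groups = payload.get("response", {}).get("groups", [])
    # An empty result means the request succeeded but returned no groups.
    return groups[0]["items"] if groups else []


# Invented payloads: one successful, one rejected for exceeding the quota.
ok = {"meta": {"code": 200},
      "response": {"groups": [{"items": [{"venue": {"name": "Cafe"}}]}]}}
over_quota = {"meta": {"code": 429, "errorType": "quota_exceeded",
                       "errorDetail": "Quota exceeded"},
              "response": {}}

print(len(extract_items(ok)))  # 1
try:
    extract_items(over_quota)
except RuntimeError as err:
    print(err)
```

Calling extract_items(requests.get(url).json()) in place of the chained indexing would surface the server's error message instead of a bare KeyError: 'groups'.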
I'm trying to examine the sushi venues within 5 different cities, using Foursquare.
I can get the data and filter it correctly. Code below.
city = {'City':['Brunswick','Auckland','Wellington','Christchurch','Hamilton','Ponsonby'],
        'Latitude':[-37.7670,-36.848461,-41.28664,-43.55533,-37.78333,-36.8488],
        'Longitude':[144.9621,174.763336,174.77557,172.63333,175.28333,174.7381]}
df_location= pd.DataFrame(city, columns = ['City','Latitude','Longitude'])
def getNearbyVenues(names, latitudes, longitudes, radius=2000, LIMIT=100):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT,
            "4bf58dd8d48988d1d2941735")
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        venues_list.append([(
            name,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
        'City',
        'Venue',
        'Venue Latitude',
        'Venue Longitude',]
    return(nearby_venues)
sushi_venues = getNearbyVenues(names = df_location['City'],
                               latitudes = df_location['Latitude'],
                               longitudes = df_location['Longitude'])
cities = df_location["City"]
latitude = df_location["Latitude"]
longitude = df_location["Longitude"]
I'm getting stuck on creating the maps and I'm not sure how I should iterate through the cities to create a map for each.
Here's the code I have.
maps = {}
for city in cities:
    maps[city] = folium.Map(location = [latitude, longitude],zoom_start=10)
    for lat, lng, neighborhood in zip(sushi_venues['Venue Latitude'], sushi_venues['Venue Longitude'], sushi_venues['Venue']):
        label = '{}'.format(neighborhood)
        label = folium.Popup(label, parse_html = True)
        folium.CircleMarker(
            [lat, lng],
            radius = 5,
            popup = label,
            color = 'blue',
            fill = True,
            fill_color = '#3186cc',
            fill_opacity = 0.7,
            parse_html = False).add_to(maps[city])
maps[cities[0]]
For this code, 'maps[cities[0]]' brings up a blank Folium map.
If I change the code to reference the row of the city in df_location, e.g.
maps = {}
for city in cities:
    maps[city] = folium.Map(location = [latitude[0], longitude[0]], zoom_start=10)
Then 'maps[cities[0]]' brings up a correctly labeled Folium map of Brunswick with the corresponding venues marked.
So my question is, how can I correctly iterate through all 5 cities, so that I can pull a new map for each without changing the location each time? I'm unable to zip the locations because it needs to be a single lat/long to initialize the Folium map.
Thanks so much for your help!
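The loop in the question passes the whole latitude and longitude Series to folium.Map and does not filter the venues by city. One possible approach is to first pair each city with its own centre and its own venues, and only then build one map per city. The sketch below is an assumption-laden illustration: plan_city_maps and build_maps are hypothetical names, the venue names are invented, and folium is imported lazily so the grouping step runs without it.

```python
def plan_city_maps(cities, venues):
    """Group venues by city.

    cities: iterable of (name, lat, lng)
    venues: iterable of (city, venue_name, lat, lng)
    Returns {city: {"center": [lat, lng], "markers": [(lat, lng, label), ...]}}.
    """
    plans = {name: {"center": [lat, lng], "markers": []}
             for name, lat, lng in cities}
    for city, venue, lat, lng in venues:
        if city in plans:
            plans[city]["markers"].append((lat, lng, venue))
    return plans


def build_maps(plans):
    """Create one folium map per city from the grouped plan."""
    import folium  # lazy import: only needed when maps are actually built
    maps = {}
    for city, plan in plans.items():
        city_map = folium.Map(location=plan["center"], zoom_start=12)
        for lat, lng, label in plan["markers"]:
            folium.CircleMarker(
                [lat, lng], radius=5, popup=label, color="blue",
                fill=True, fill_color="#3186cc", fill_opacity=0.7,
            ).add_to(city_map)
        maps[city] = city_map
    return maps


# Two of the cities from the question, with invented venues.
cities = [("Brunswick", -37.7670, 144.9621), ("Auckland", -36.848461, 174.763336)]
venues = [("Brunswick", "Sushi A", -37.76, 144.96),
          ("Auckland", "Sushi B", -36.85, 174.76),
          ("Brunswick", "Sushi C", -37.77, 144.97)]

plans = plan_city_maps(cities, venues)
print({city: len(plan["markers"]) for city, plan in plans.items()})
```

With the DataFrames from the question, cities could be zip(df_location["City"], df_location["Latitude"], df_location["Longitude"]) and venues could be sushi_venues.itertuples(index=False, name=None); build_maps(plans)["Brunswick"] would then display that city's map in a notebook.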
I have a pandas DataFrame df_data with the names of cities and their latitude and longitude.
I am trying to create a Foursquare request with my credentials to find all the malls located within a specific radius around those cities.
Malls_id is the Foursquare-defined category ID for malls.
I get an error with my request.
I think the error comes from adding categoryId, but I can't find the problem.
What am I doing wrong?
Thanks
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=5):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            Malls_id,
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude',
                             'Neighborhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']
    return(nearby_venues)
# Run the above function on each location and create a new dataframe called location_venues and display it.
location_venues = getNearbyVenues(names=df_data['City'],
                                  latitudes=df_data['Latitude'],
                                  longitudes=df_data['Longitude']
                                  )
It runs over all my locations
Warsaw
Carmel
Chesterton
Granger
Plainfield
and then it stops.
Here is the full stack trace:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-21-537f5e1b4f8a> in <module>()
2 location_venues = getNearbyVenues(names=df_data['City'],
3 latitudes=df_data['Latitude'],
----> 4 longitudes=df_data['Longitude']
5 )
<ipython-input-20-23c3db7fc2a3> in getNearbyVenues(names, latitudes, longitudes, radius, LIMIT)
36 'Venue Latitude',
37 'Venue Longitude',
---> 38 'Venue Category']
39
40 return(nearby_venues)
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/pandas/core/generic.py in __setattr__(self, name, value)
3625 try:
3626 object.__getattribute__(self, name)
-> 3627 return object.__setattr__(self, name, value)
3628 except AttributeError:
3629 pass
pandas/_libs/properties.pyx in pandas._libs.properties.AxisProperty.__set__()
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
557
558 def _set_axis(self, axis, labels):
--> 559 self._data.set_axis(axis, labels)
560 self._clear_item_cache()
561
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
3067 raise ValueError('Length mismatch: Expected axis has %d elements, '
3068 'new values have %d elements' %
-> 3069 (old_len, new_len))
3070
3071 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 0 elements, new values have 7 elements
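The mismatch of 0 versus 7 elements suggests that no venue rows were collected for any location, so the DataFrame was created without columns before the seven names were assigned to it. One way to make the function tolerate empty results is to pass the column names to the constructor. This is a sketch assuming pandas is installed; rows_to_frame is a hypothetical helper and the sample row is invented.

```python
import pandas as pd

COLUMNS = ['Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude',
           'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']


def rows_to_frame(venues_list):
    """Flatten per-location venue lists into a DataFrame with fixed columns."""
    rows = [item for venue_list in venues_list for item in venue_list]
    # Passing columns= to the constructor also works when rows is empty.
    return pd.DataFrame(rows, columns=COLUMNS)


# Five locations, none of which returned a venue (as the error suggests).
empty = rows_to_frame([[], [], [], [], []])
print(empty.shape)  # (0, 7)

# One invented row, to show the non-empty case still works.
one = rows_to_frame([[("Warsaw", 41.2, -85.8, "Some Mall", 41.21, -85.81, "Shopping Mall")]])
print(len(one))  # 1
```

If the frame does come back empty, it may be worth checking whether radius=500 is too small for malls or whether the Malls_id value is the intended category.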