Parsing data from MapQuest reverse geocoding api in Python?


My code:
from urllib import request
import json
lat = 31.33 ; long = -84.52
webpage = "http://www.mapquestapi.com/geocoding/v1/reverse?key=MY_KEY&callback=renderReverse&location={},{}".format(lat, long)
response = request.urlopen(webpage)
json_data = response.read().decode(response.info().get_param('charset') or 'utf-8')
data = json.loads(json_data)
print(data)
This gives me following error:
ValueError: Expecting value: line 1 column 1 (char 0)
I am trying to read county and state from MapQuest reverse geocoding api. The response looks like this:
renderReverse({"info":{"statuscode":0,"copyright":{"text":"\u00A9 2015 MapQuest, Inc.","imageUrl":"http://api.mqcdn.com/res/mqlogo.gif","imageAltText":"\u00A9 2015 MapQuest, Inc."},"messages":[]},"options":{"maxResults":1,"thumbMaps":true,"ignoreLatLngInput":false},"results":[{"providedLocation":{"latLng":{"lat":32.841516,"lng":-83.660992}},"locations":[{"street":"562 Patterson St","adminArea6":"","adminArea6Type":"Neighborhood","adminArea5":"Macon","adminArea5Type":"City","adminArea4":"Bibb","adminArea4Type":"County","adminArea3":"GA","adminArea3Type":"State","adminArea1":"US","adminArea1Type":"Country","postalCode":"31204-3508","geocodeQualityCode":"P1AAA","geocodeQuality":"POINT","dragPoint":false,"sideOfStreet":"L","linkId":"0","unknownInput":"","type":"s","latLng":{"lat":32.84117,"lng":-83.660973},"displayLatLng":{"lat":32.84117,"lng":-83.660973},"mapUrl":"http://www.mapquestapi.com/staticmap/v4/getmap?key=MY_KEY&type=map&size=225,160&pois=purple-1,32.84117,-83.660973,0,0,|&center=32.84117,-83.660973&zoom=15&rand=-189494136"}]}]})
How do I convert this string to a dict from which I can query using a key? Any help would be appreciated. Thanks!

First, remove the callback parameter from your URL, since that is what causes the response to be wrapped in renderReverse():
webpage = "http://www.mapquestapi.com/geocoding/v1/reverse?key=MY_KEY&location={},{}".format(lat, long)
That will give you valid JSON, which works with the json.loads call you already have. At that point data behaves like a dictionary, and you can get the county and state names by key. MapQuest structures its JSON oddly: each adminAreaN value has a companion adminAreaNType key that says what it holds, so you may have to match on the type to find the right key. In this response 'adminArea4Type' is set to 'County', so 'adminArea4' holds the county name:
data['results'][0]['locations'][0]['adminArea4']
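A minimal sketch of the key matching described above, assuming the response has already had its callback wrapper removed. The helper name admin_area and the trimmed sample string are illustrative only and not part of the MapQuest API; the field names are copied from the response shown in the question.

```python
import json

# Trimmed copy of the response from the question (callback wrapper already removed).
sample = json.loads(
    '{"results":[{"locations":[{"adminArea5":"Macon","adminArea5Type":"City",'
    '"adminArea4":"Bibb","adminArea4Type":"County",'
    '"adminArea3":"GA","adminArea3Type":"State"}]}]}'
)


def admin_area(location, area_type):
    """Return the adminAreaN value whose adminAreaNType equals area_type, else None."""
    for key, value in location.items():
        # Type keys look like 'adminArea4Type'; the matching value key is 'adminArea4'.
        if key.startswith("adminArea") and key.endswith("Type") and value == area_type:
            return location.get(key[: -len("Type")])
    return None


location = sample["results"][0]["locations"][0]
county = admin_area(location, "County")
state = admin_area(location, "State")
print(county, state)
```

Matching on the type instead of hard-coding 'adminArea4' keeps the lookup working if a response happens to put the county at a different level.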

Related

want to access specific dictionary value from an url and getting key 0 error in python

I want to access the place value in the dictionary returned by this URL:
https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2016-10-01&endtime=2016-10-02
I wrote the code below, but it raises a KeyError for key 0. I want to access values such as place, title, and geometry from the URL and write them to a CSV file.
import urllib2
import json
url = 'https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2016-10-01&endtime=2016-10-02'
# download the json string
json_string = urllib2.urlopen(url).read()
# de-serialize the string so that we can work with it
j = json.loads(json_string)
names = [d['properties'] for d in j[0]['type']]
print names
I am new to python.

j is a dictionary with the keys ['type', 'metadata', 'features', 'bbox'], but no key 0. You are probably looking for j['features'], not j[0]['type']. Even then, names would be a list of properties dictionaries rather than a list of names. I suspect the site's JSON API has changed since you (or whoever wrote the code) last used it.
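A sketch of how place, title, and geometry could be pulled out of j['features'] and written as CSV. The sample dictionary below is a made-up, trimmed stand-in for the real response, and the coordinate order (longitude, latitude, depth) is assumed from the GeoJSON convention; check both against the live feed.

```python
import csv
import io

# Hypothetical, trimmed stand-in for the parsed response `j`; a real
# response has many more fields per feature.
j = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"place": "10km N of Example Town",
                           "title": "M 1.2 - 10km N of Example Town"},
            "geometry": {"type": "Point", "coordinates": [-120.5, 36.1, 4.2]},
        },
        {
            "type": "Feature",
            "properties": {"place": "5km SW of Sample City",
                           "title": "M 2.0 - 5km SW of Sample City"},
            "geometry": {"type": "Point", "coordinates": [-150.0, 61.3, 30.0]},
        },
    ],
}

rows = []
for feature in j["features"]:
    props = feature["properties"]
    # Assumed order: longitude, latitude, depth.
    lon, lat, depth = feature["geometry"]["coordinates"]
    rows.append([props["place"], props["title"], lon, lat, depth])

# Written to an in-memory buffer here; a file opened with
# open("quakes.csv", "w", newline="") can be passed to csv.writer the same way.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["place", "title", "longitude", "latitude", "depth"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```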

Extracting Data from JSON

I have a large JSON item returned through a REST API. I won't clutter the question with the full text, but here is the code I am currently using:
import urllib2
import json
req = urllib2.Request('http://elections.huffingtonpost.com/pollster/api/polls.json?state=IA')
response = urllib2.urlopen(req)
the_page = response.read()
decode = json.loads(the_page)
#print = decode #removed, because it is not actually related to the question
print decode
I have been trying to extract information out of it such as the date polls are updated, the actual data from the polls etc (particularly the presidential polls) but I am having trouble returning any data at all. Can anyone assist?
EDIT:
The actual question is how to query data from the returned array/dict

The problem is that you overwrite print with your data instead of printing the data. Just remove the = in the last line and it should work fine:
print decode
If you want to use Python 3, you need parentheses for print. It would look like this:
print(decode)
Edit: As you updated your question, here is an answer to your actual question: the loads function returns the data as a combination of dicts and lists, so you can access it like any dict or list. For example, to get the last_updated field of all polls in one list, you can do something like this:
all_last_updated = [poll['last_updated'] for poll in decode]
Or to just get the end date of all polls sponsored by "Constitutional Responsibility Project", you could do this:
end_dates = [poll['end_date'] for poll in decode if any(sponsor['name'] == 'Constitutional Responsibility Project' for sponsor in poll['sponsors'])]
Or if you just want the id of the first poll in the list, do:
the_id = decode[0]['id']
You can access anything else in the JSON in a similar way.

it is because you do
print = decode
instead, if you are using python 2 do
print decode
or in python 3 do
print(decode)

Quandl data, API call

Recently I have been reading a stock price database on Quandl, using API calls to extract the data, but I am really confused by the example I have.
import requests
api_url = 'https://www.quandl.com/api/v1/datasets/WIKI/%s.json' % stock
session = requests.Session()
session.mount('http://', requests.adapters.HTTPAdapter(max_retries=3))
raw_data = session.get(api_url)
Can anyone explain that to me?
1) For api_url, if I open that address in a browser, it says 404 not found. So if I want to use another database, how do I prepare this api_url? What does '% stock' mean?
2) Here requests appears to be used to extract the data. What is the format of raw_data? How do I know the column names? How do I extract the columns?

To expand on my comment above:
% stock is a string formatting operation: it replaces %s in the preceding string with the value referenced by stock. Further details can be found in the Python documentation on printf-style string formatting.
raw_data actually references a Response object (part of the requests module; details are in the requests documentation).
To expand on your code:
import requests
#Set the stock we are interested in, AAPL is Apple stock code
stock = 'AAPL'
#Your code
api_url = 'https://www.quandl.com/api/v1/datasets/WIKI/%s.json' % stock
session = requests.Session()
session.mount('http://', requests.adapters.HTTPAdapter(max_retries=3))
raw_data = session.get(api_url)
# Probably want to check that requests.Response is 200 - OK here
# to make sure we got the content successfully.
# requests.Response has a function to return json file as python dict
aapl_stock = raw_data.json()
# We can then look at the keys to see what we have access to
aapl_stock.keys()
# column_names Seems to be describing the individual data points
aapl_stock['column_names']
# A big list of data, lets just look at the first ten points...
aapl_stock['data'][0:10]
Edit to answer question in comment
So aapl_stock['column_names'] shows Date and Open as the first and second values respectively. This means they correspond to positions 0 and 1 in each element of the data.
Slicing aapl_stock['data'] selects rows, not columns, so aapl_stock['data'][0:10][0] is just the first row. To get the date for the first ten items use [row[0] for row in aapl_stock['data'][0:10]], and to get the open value for the first 78 items use [row[1] for row in aapl_stock['data'][0:78]].
To get a list covering the whole dataset, where each element is a list with the values for Date and Open, you could add something like aapl_date_open = [row[0:2] for row in aapl_stock['data']].
If you are new to Python I seriously recommend reading up on list slice notation and list comprehensions.
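A short sketch of column extraction by name, using a made-up, trimmed stand-in for raw_data.json(). The numbers are invented for illustration; only the column_names/data layout is taken from the answer above.

```python
# Hypothetical, trimmed stand-in for raw_data.json(); the values are made up.
aapl_stock = {
    "column_names": ["Date", "Open", "High", "Low", "Close"],
    "data": [
        ["2015-01-05", 108.29, 108.65, 105.41, 106.25],
        ["2015-01-02", 111.39, 111.44, 107.35, 109.33],
    ],
}

# Look up column positions by name instead of hard-coding 0 and 1.
date_idx = aapl_stock["column_names"].index("Date")
open_idx = aapl_stock["column_names"].index("Open")

# Slicing the outer list selects rows; a comprehension is needed to pick a column.
dates = [row[date_idx] for row in aapl_stock["data"][0:10]]
date_open = [[row[date_idx], row[open_idx]] for row in aapl_stock["data"]]

# Each row can also be turned into a dict keyed by column name.
records = [dict(zip(aapl_stock["column_names"], row)) for row in aapl_stock["data"]]

print(dates)
print(date_open)
print(records[0]["Close"])
```

Looking the index up from column_names avoids silent breakage if the column order ever differs between datasets.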

python dictionary from url

I am trying to gather weather data from the national weather service and read it into a python script. They offer a JSON return, but they also offer another return which isn't formatted JSON but has more variables (which I want). This set of data looks like it is formatted as a python dictionary. It looks like this:
stations={
KAPC:
{
'id':'KAPC',
'stnid':'92',
'name':'Napa, Napa County Airport',
'elev':'33',
'latitude':'38.20750',
'longitude':'-122.27944',
'distance':'',
'provider':'NWS/FAA',
'link':'http://www.wrh.noaa.gov/mesowest/getobext.php?sid=KAPC',
'Date':'24 Feb 8:54 am',
'Temp':'39',
'TempC':'4',
'Dewp':'29',
'Relh':'67',
'Wind':'NE#6',
'Direction':'50&#176',
'Winds':'6',
'WindChill':'35',
'Windd':'50',
'SLP':'1027.1',
'Altimeter':'30.36',
'Weather':'',
'Visibility':'10.00',
'Wx':'',
'Clouds':'CLR',
[...]
So, to me, it looks like it defines a variable stations equal to a dictionary of dictionaries containing the stations and their variables. My question is how do I access this data. Right now I am trying:
import urllib
response = urllib.urlopen(url)
r = response.read()
If I try to use the JSON module, it clearly fails because this isn't json. And if I just try to read the file, it comes back with a long string of characters. Any suggestions on how to extract this data? If possible, I would just like to get the dictionary as it exists in the url return, ie stations={...} Thanks!

As far as I can infer from the question, you have the data as text that is not valid JSON. Given a text like line = "stations={'KAPC':{'id':'KAPC', 'stnid':'92', 'name':'Napa, Napa County Airport'}}" (say), we can extract the dictionary by splitting the text at the first = sign and passing the remainder to eval(), which builds the dictionary from the text. Splitting only once matters because values such as the link URL contain = themselves. Note that eval() executes whatever it is given, so only use it on input you trust.
dictionary_text = line.split("=", 1)[1]
python_dictionary = eval(dictionary_text)
print python_dictionary
>>> {'KAPC': {'id': 'KAPC', 'name': 'Napa, Napa County Airport', 'stnid': '92'}}
python_dictionary now behaves like a regular Python dictionary of key/value pairs, and you can access any attribute with, for example, python_dictionary["KAPC"]["id"].
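Because eval() runs any expression it is handed, a safer variant is ast.literal_eval from the standard library, which accepts only Python literals. The sketch below assumes the station keys are quoted, as in the example text above; if the real feed uses bare keys such as KAPC:, the text would need to be fixed up first.

```python
import ast

# Same illustrative text as in the answer, plus a value that contains '='.
line = ("stations={'KAPC':{'id':'KAPC', 'stnid':'92', "
        "'link':'http://www.wrh.noaa.gov/mesowest/getobext.php?sid=KAPC'}}")

# partition splits only at the first '=', so '=' inside values is left alone.
name, _, dictionary_text = line.partition("=")

# literal_eval accepts only Python literals (dicts, lists, strings, numbers, ...)
# and raises ValueError or SyntaxError for anything else instead of executing it.
stations = ast.literal_eval(dictionary_text)
print(name, stations["KAPC"]["id"])
```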

trouble scraping from JSONP feed

I asked a similar question earlier
python JSON feed returns string not object
but I am having a little more trouble and don't understand it.
For about half of the dates this works and returns a JSON object
for example November 9 2013 works
url = 'http://data.ncaa.com/jsonp/scoreboard/basketball-men/d1/2013/11/09/scoreboard.html?callback=c'
r = requests.get(url)
jsonObj = json.loads(r.content[2:-2])
but if I try November 11 2013:
url = 'http://data.ncaa.com/jsonp/scoreboard/basketball-men/d1/2013/11/11/scoreboard.html?callback=c'
r = requests.get(url)
jsonObj = json.loads(r.content[2:-2])
I get this error
ValueError: No JSON object could be decoded
I don't understand why. When I put both URLs into a browser they look exactly the same.

The JSON in the second feed is, in fact, invalid. I found this by removing the callback function and running the result through http://jsonlint.com/
To see for yourself, search for the following ID: 336252
The lines just above that ID contain two commas in a row, which is disallowed by the JSON spec.
My guess is that the server at data.ncaa.com is trying to generate JSON itself rather than using a JSON library. You should contact the site administrator and make them aware of this error.
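Slicing with [2:-2] only works while the callback name and trailing characters stay exactly the same. A sketch of a more tolerant unwrapping step is shown below; the helper name unwrap_jsonp and the two sample strings are made up for illustration, with the second one imitating the doubled comma described above.

```python
import json
import re

# Matches name( ... ) with an optional trailing semicolon; the name may contain dots.
JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Return the parsed payload of a JSONP response, or raise ValueError."""
    match = JSONP_RE.match(text)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


good = 'c({"scoreboard": [{"day": "2013-11-09"}]});'
bad = 'c({"games": [1,, 2]});'  # two commas in a row, as in the failing feed

print(unwrap_jsonp(good))
try:
    unwrap_jsonp(bad)
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    print("invalid JSON:", exc)
```

Unwrapping cannot repair a payload that is itself malformed; it only separates "wrong wrapper" from "bad JSON inside the wrapper", which makes failures like the November 11 one easier to diagnose.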

Using demjson
demjson.decode(r.content[2:-2])
seems to work
