How do i access and manipulate the api values - python

I would like some direction on how i can access the data and do some modifications etc. for example accessing and listing only emails, etc please
import requests,json
api = "https://reqres.in/api/users?page=2"
test = requests.get(api)
x = test.json()
data_structure = []
data_structure.append(x)
print(data_structure)
Output
[{'page': 2, 'per_page': 6, 'total': 12, 'total_pages': 2, 'data': [{'id': 7, 'email': 'michael.lawson#reqres.in', 'first_name': 'Michael', 'last_name': 'Lawson', 'avatar': 'https://reqres.in/img/faces/7-image.jpg'}, {'id': 8, 'email': 'lindsay.ferguson#reqres.in', 'first_name': 'Lindsay', 'last_name': 'Ferguson', 'avatar': 'https://reqres.in/img/faces/8-image.jpg'}, {'id': 9, 'email': 'tobias.funke#reqres.in', 'first_name': 'Tobias', 'last_name': 'Funke', 'avatar': 'https://reqres.in/img/faces/9-image.jpg'}, {'id': 10, 'email': 'byron.fields#reqres.in', 'first_name': 'Byron', 'last_name': 'Fields', 'avatar': 'https://reqres.in/img/faces/10-image.jpg'}, {'id': 11, 'email': 'george.edwards#reqres.in', 'first_name': 'George', 'last_name': 'Edwards', 'avatar': 'https://reqres.in/img/faces/11-image.jpg'}, {'id': 12, 'email': 'rachel.howell#reqres.in', 'first_name': 'Rachel', 'last_name': 'Howell', 'avatar': 'https://reqres.in/img/faces/12-image.jpg'}], 'support': {'url': 'https://reqres.in/#support-heading', 'text': 'To keep ReqRes free, contributions towards server costs are appreciated!'}}]

First, I highly recommend you to install the JSON Viewer extension, which will help you a lot to see what's going on your API.
https://chrome.google.com/webstore/detail/json-viewer/gbmdgpbipfallnflgajpaliibnhdgobh?hl=es
Then, you don't need to create a new list, since the x = test.json() already outputs the same dictionary you brought from the API.
So your first chunk of code should look like this
import requests,json
api = "https://reqres.in/api/users?page=2"
test = requests.get(api)
x = test.json()
Then you can access all the data inside that dictionary, for example let's get all the emails.
To make it easier you should go to the api link using JSON Viewer to see the structure of your dictionary.
To access the emails we first need to access the "data" key on your dictionary
import requests,json
api = "https://reqres.in/api/users?page=2"
test = requests.get(api)
x = test.json()
data_list = x["data"]
After that you can see that data_list is a new list of dictionaries with all the data from each element on your API (in your case each id).
So finally, to access the emails, we need to iterate trough that list and get the "email" key from each dictionary on the list.
import requests,json
api = "https://reqres.in/api/users?page=2"
test = requests.get(api)
x = test.json()
data_list = x["data"]
for i in data_list:
print(i["email"])
And that my friend, is how you get info from an API, the same way you manipulate lists and dictionaries.

Related

Replace single quotes with doubles to turn contents of a file into a nested JSON and normalize it afterwards

I have 70k files all of which look similar to this:
{'id': 24, 'name': None, 'city': 'City', 'region_id': 19,
'story_id': 1, 'description': 'text', 'uik': None, 'ustatus': 'status',
'wuiki_tik_name': '', 'reaction': None, 'reaction_official': '',
'created_at': '2011-09-07T07:24:44.420Z', 'lat': 54.7, 'lng': 20.5,
'regions': {'id': 19, 'name': 'name'}, 'stories': {'id': 1, 'name': '2011-12-04'}, 'assets': [], 'taggings': [{'tags': {'id': 6, 'name': 'name',
'tag_groups': {'id': 3, 'name': 'Violation'}}},
{'tags': {'id': 8, 'name': 'name', 'tag_groups': {'id': 5, 'name': 'resource'}}},
{'tags': {'id': 1, 'name': '01. Federal', 'tag_groups': {'id': 1, 'name': 'Level'}}},
{'tags': {'id': 3, 'name': '03. Local', 'tag_groups': {'id': 1, 'name': 'stuff'}}},
{'tags': {'id': 2, 'name': '02. Regional', 'tag_groups':
{'id': 1, 'name': 'Level'}}}], 'message_id': None, '_count': {'assets': 0, 'other_messages': 0, 'similars': 0, 'taggings': 5}}
The ultimate goal is to export it into a single CSV file. It can be successfully done without flattening. But since it has a lot of nested values, I would like to flatten it, and this is where I began facing problems related to data types. Here's the code:
import json
from pandas.io.json import json_normalize
import glob
path = glob.glob("all_messages/*.json")
for file in path:
with open(file, "r") as filer:
content = json.loads(json.dumps(filer.read()))
if content != 404:
df_main = json_normalize(content)
df_regions = json_normalize(content, record_path=['regions'], record_prefix='regions.', meta=['id'])
df_stories = json_normalize(content, record_path=['stories'], record_prefix='stories.', meta=['id'])
#... More code related to normalization
df_out.to_csv('combined_json.csv')
This code occasionally throws:
AttributeError: 'str' object has no attribute 'values' or ValueError: DataFrame constructor not properly called!. I realise that this is caused by json.dumps() JSON string output. However, I have failed to turn it into anything useable.
Any possible solutions to this?
If you only need to change ' to ":
...
for file in path:
with open(file, "r") as filer:
filer.replace("\'", "\"")
...
Making copies and using grep would be easier
While it is not the solution I was initially expecting, this approach worked as well. I kept getting error messages related to the structure of the dict literals that were reluctant to become json, so I took the csv file that I wanted to normalise and worked with each column one by one:
df = pd.read_csv("combined_json.csv")
df['regions'] = df['regions'].apply(lambda x: x.replace("'", '"'))
regions = pd.json_normalize(df['regions'].apply(json.loads).tolist()).rename(
columns=lambda x: x.replace('regions.', ''))
df['regions'] = regions['name']
Or, if it had more nested levels:
df['taggings'] = df['taggings'].apply(lambda x: x.replace("'", '"'))
taggings = pd.concat([pd.json_normalize(json.loads(j)) for j in df['taggings']])
df = df.reset_index(drop=True)
taggings = taggings.reset_index(drop=True)
df[['tags_id', 'nametag', 'group_tag', 'group_tag_name']] = taggings[['tags.id', 'tags.name', 'tags.tag_groups.id', 'tags.tag_groups.name']]
Which was eventually df.to_csv().

Accessing keys/values in a paginated/nested dictionary

I know that somewhat related questions have been asked here: Accessing key, value in a nested dictionary and here: python accessing elements in a dictionary inside dictionary among other places but I can't quite seem to apply the answers' methodology to my issue.
I'm getting a KeyError trying to access the keys within response_dict, which I know is due to it being nested/paginated and me going about this the wrong way. Can anybody help and/or point me in the right direction?
import requests
import json
URL = "https://api.constantcontact.com/v2/contacts?status=ALL&limit=1&api_key=<redacted>&access_token=<redacted>"
#make my request, store it in the requests object 'r'
r = requests.get(url = URL)
#status code to prove things are working
print (r.status_code)
#print what was retrieved from the API
print (r.text)
#visual aid
print ('---------------------------')
#decode json data to a dict
response_dict = json.loads(r.text)
#show how the API response looks now
print(response_dict)
#just for confirmation
print (type(response_dict))
print('-------------------------')
# HERE LIES THE ISSUE
print(response_dict['first_name'])
And my output:
200
{"meta":{"pagination":{}},"results":[{"id":"1329683950","status":"ACTIVE","fax":"","addresses":[{"id":"4e19e250-b5d9-11e8-9849-d4ae5275509e","line1":"222 Fake St.","line2":"","line3":"","city":"Kansas City","address_type":"BUSINESS","state_code":"","state":"OK","country_code":"ve","postal_code":"19512","sub_postal_code":""}],"notes":[],"confirmed":false,"lists":[{"id":"1733488365","status":"ACTIVE"}],"source":"Site Owner","email_addresses":[{"id":"1fe198a0-b5d5-11e8-92c1-d4ae526edd6c","status":"ACTIVE","confirm_status":"NO_CONFIRMATION_REQUIRED","opt_in_source":"ACTION_BY_OWNER","opt_in_date":"2018-09-11T18:18:20.000Z","email_address":"rsmith#fake.com"}],"prefix_name":"","first_name":"Robert","middle_name":"","last_name":"Smith","job_title":"I.T.","company_name":"FBI","home_phone":"","work_phone":"5555555555","cell_phone":"","custom_fields":[],"created_date":"2018-09-11T15:12:40.000Z","modified_date":"2018-09-11T18:18:20.000Z","source_details":""}]}
---------------------------
{'meta': {'pagination': {}}, 'results': [{'id': '1329683950', 'status': 'ACTIVE', 'fax': '', 'addresses': [{'id': '4e19e250-b5d9-11e8-9849-d4ae5275509e', 'line1': '222 Fake St.', 'line2': '', 'line3': '', 'city': 'Kansas City', 'address_type': 'BUSINESS', 'state_code': '', 'state': 'OK', 'country_code': 've', 'postal_code': '19512', 'sub_postal_code': ''}], 'notes': [], 'confirmed': False, 'lists': [{'id': '1733488365', 'status': 'ACTIVE'}], 'source': 'Site Owner', 'email_addresses': [{'id': '1fe198a0-b5d5-11e8-92c1-d4ae526edd6c', 'status': 'ACTIVE', 'confirm_status': 'NO_CONFIRMATION_REQUIRED', 'opt_in_source': 'ACTION_BY_OWNER', 'opt_in_date': '2018-09-11T18:18:20.000Z', 'email_address': 'rsmith#fake.com'}], 'prefix_name': '', 'first_name': 'Robert', 'middle_name': '', 'last_name': 'Smith', 'job_title': 'I.T.', 'company_name': 'FBI', 'home_phone': '', 'work_phone': '5555555555', 'cell_phone': '', 'custom_fields': [], 'created_date': '2018-09-11T15:12:40.000Z', 'modified_date': '2018-09-11T18:18:20.000Z', 'source_details': ''}]}
<class 'dict'>
-------------------------
Traceback (most recent call last):
File "C:\Users\rkiek\Desktop\Python WIP\Chris2.py", line 20, in <module>
print(response_dict['first_name'])
KeyError: 'first_name'
first_name = response_dict["results"][0]["first_name"]
Even though I think this question would be better answered by yourself by reading some documentation, I will explain what is going on here. You see the dict-object of the man named "Robert" is within a list which is a value under the key "results". So, at first you need to access the value within results which is a python-list.
Then you can use a loop to iterate through each of the elements within the list, and treat each individual element as a regular dictionary object.
results = response_dict["results"]
results = response_dict.get("results", None)
# use any one of the two above, the first one will throw a KeyError if there is no key=="results" the other will return NULL
# this results is now a list according to the data you mentioned.
for item in results:
print(item.get("first_name", None)
# here you can loop through the list of dictionaries and treat each item as a normal dictionary

converting list to dictionary in a python

I have a data in a list like below:
[
{'id': 1, 'first_name': 'Jeanette', 'last_name': 'Penddreth', 'email': 'jpenddreth0#census.gov', 'gender': 'Female', 'ip_address': '26.58.193.2'},
{'id': 2, 'first_name': 'Giavani', 'last_name': 'Frediani', 'email': 'gfrediani1#senate.gov', 'gender': 'Male', 'ip_address': '229.179.4.212'},
{'id': 3, 'first_name': 'Noell', 'last_name': 'Bea', 'email': 'nbea2#imageshack.us', 'gender': 'Female', 'ip_address': '180.66.162.255'},
{'id': 4, 'first_name': 'Willard', 'last_name': 'Valek', 'email': 'wvalek3#vk.com', 'gender': 'Male', 'ip_address': '67.76.188.26'}
]
I am loading the data into the dynamoDb. It is failing with the error type: <class 'list'>, valid types: <class 'dict'>: ParamValidationError.
How do I convert the above list into a dictionary?
EDIT Code used:
import boto3
import json
s3_client=boto3.client('s3')
dynamodb=boto3.resource('dynamodb')
def lambda_handler(event, context):
bucket=event['Records'][0]['s3']['bucket']['name']
json_filename=event['Records'][0]['s3']['object']['key']
json_object=s3_client.get_object(Bucket=bucket,Key=json_filename)
jsonFileReader=json_object['Body'].read()
jsonDictionary=json.loads(jsonFileReader)
table=dynamodb.Table('EMPLOYEE_DETAILS')
table.put_item(Item=jsonDictionary)
return 'Done'
I'm not familiar with dynamodb, but I imagine this will work. If your JSON is a list, then you need to iterate through the items in your list, adding each one to the table.
Replace the line:
table.put_item(Item=jsonDictionary)
with:
if type(jsonDictionary) == type([]):
# It is a list - iterate through it
for item in jsonDictionary:
table.put_item(Item=item)
else:
table.put_item(Item=jsonDictionary)

Extract specific keys from list of dict in python. Sentinelhub

I seem to be stuck on very simple task. I'm still dipping my toes into Python.
I'm trying to download Sentinel 2 Images with SentinelHub API:SentinelHub
The result of data that my code returns is like this:
{'geometry': {'coordinates': [[[[35.895906644, 31.602691754],
[36.264307655, 31.593801516],
[36.230618703, 30.604681346],
[35.642363693, 30.617971909],
[35.678587829, 30.757888786],
[35.715700562, 30.905919341],
[35.754290061, 31.053632806],
[35.793289298, 31.206946419],
[35.895906644, 31.602691754]]]],
'type': 'MultiPolygon'},
'id': 'ee923fac-0097-58a8-b861-b07d89b99310',
'properties': {'**productType**': '**S2MSI1C**',
'centroid': {'coordinates': [18.1321538275, 31.10368655], 'type': 'Point'},
'cloudCover': 10.68,
'collection': 'Sentinel2',
'completionDate': '2017-06-07T08:15:54Z',
'description': None,
'instrument': 'MSI',
'keywords': [],
'license': {'description': {'shortName': 'No license'},
'grantedCountries': None,
'grantedFlags': None,
'grantedOrganizationCountries': None,
'hasToBeSigned': 'never',
'licenseId': 'unlicensed',
'signatureQuota': -1,
'viewService': 'public'},
'links': [{'href': 'http://opensearch.sentinel-hub.com/resto/collections/Sentinel2/ee923fac-0097-58a8-b861-b07d89b99310.json?&lang=en',
'rel': 'self',
'title': 'GeoJSON link for ee923fac-0097-58a8-b861-b07d89b99310',
'type': 'application/json'}],
'orbitNumber': 10228,
'organisationName': None,
'parentIdentifier': None,
'platform': 'Sentinel-2',
'processingLevel': '1C',
'productIdentifier': 'S2A_OPER_MSI_L1C_TL_SGS__20170607T120016_A010228_T36RYV_N02.05',
'published': '2017-07-26T13:09:17.405352Z',
'quicklook': None,
'resolution': 10,
's3Path': 'tiles/36/R/YV/2017/6/7/0',
's3URI': 's3://sentinel-s2-l1c/tiles/36/R/YV/2017/6/7/0/',
'sensorMode': None,
'services': {'download': {'mimeType': 'text/html',
'url': 'http://sentinel-s2-l1c.s3-website.eu-central-1.amazonaws.com#tiles/36/R/YV/2017/6/7/0/'}},
'sgsId': 2168915,
'snowCover': 0,
'spacecraft': 'S2A',
'startDate': '2017-06-07T08:15:54Z',
'thumbnail': None,
'title': 'S2A_OPER_MSI_L1C_TL_SGS__20170607T120016_A010228_T36RYV_N02.05',
'updated': '2017-07-26T13:09:17.405352Z'},
'type': 'Feature'}
Can you explain how can I iterate through this set of data and extract only 'productType'? For example, if there are several similar data sets it would return only different product types.
My code is :
import matplotlib.pyplot as plt
import numpy as np
from sentinelhub import AwsProductRequest, AwsTileRequest, AwsTile, BBox, CRS
betsiboka_coords_wgs84 = [31.245117,33.897777,34.936523,36.129002]
bbox = BBox(bbox=betsiboka_coords_wgs84, crs=CRS.WGS84)
date= '2017-06-05',('2017-06-08')
data=sentinelhub.opensearch.get_area_info(bbox, date_interval=date, maxcc=None)
for i in data:
print(i)
Based on what you have provided, replace your bottom for loop:
for i in data:
print(i)
with the following:
for i in data:
print(i['properties']['**productType**'])
If you want to access only the propertyType you can use i['properties']['productType'] in your for loop. If you want to access it any time you want without writing each time those keys, you can define a generator like this:
def property_types(data_array):
for data in data_array
yield data['properties']['propertyType']
So you can use it like this in a loop (your data_array is data, as returned by sentinelhub api):
for property_type in property_types(data):
# do stuff with property_type
keys = []
for key in d.keys():
if key == 'properties':
for k in d[key].keys():
if k == '**productType**' and k not in keys:
keys.append(d[key][k])
print(keys)
Getting only specific (nested) values: Since your request key is nested, and resides inside the parent "properties" object, you need to access it first, preferably using the get method. This can be done as follows (note the '{}' parameter in the first get, this returns an empty dictionary if the first key is not present)
data_dictionary = json.loads(data_string)
product_type = data_dictionary.get('properties', {}).get('**productType**')
You can then aggregate the different product_type objects in a set, which will automatically guarantee that no 2 objects are the same
product_type_set = set()
product_type.add(product_type)

Outputting just value without format from mysql in Python

I'm building API for my app in Python and Flask. I'm trying to output some data in JSON format, however I get weirdly formatted data from my sql query. I would like to see numbers (i.e. field initial_price) as number, not as a Decimal('3.99'), similarly with id and timestamp format.
Also, is it the right way to produce JSON?
This is output from my API:
$ curl 127.0.0.1:5000/1/product?code="9571%2F702"
[{'update_time': datetime.datetime(2013, 1, 7, 22, 25, 50), 'code': '9571/702', 'description': '', 'gender': '', 'brand': 'Zara', 'initial_price': Decimal('3.99'), 'image_link': 'http://static.zara.net/photos//2012/I/0/3/p/9571/702/401/9571702401_1_1_3.jpg', 'currency': 'GBP', 'colors': 'NAVY', 'link': 'http://www.zara.com/webapp/wcs/stores/servlet/product/uk/en/zara-neu-W2012-s/341501/883021/', 'current_price': Decimal('990.00'), 'original_category': 'Girl (2-14 years)', 'id': 1623L, 'name': '"I LOVE ..." T-SHIRT'},
{'update_time': datetime.datetime(2013, 1, 7, 22, 25, 50), 'code': '9571/702', 'description': '', 'gender': '', 'brand': 'Zara', 'initial_price': Decimal('3.99'), 'image_link': 'http://static.zara.net/photos//2012/I/0/3/p/9571/702/401/9571702401_1_1_3.jpg', 'currency': 'GBP', 'colors': 'LIGHT', 'link': 'http://www.zara.com/webapp/wcs/stores/servlet/product/uk/en/zara-neu-W2012-s/341501/883021/', 'current_price': Decimal('990.00'), 'original_category': 'Girl (2-14 years)', 'id': 1624L, 'name': '"I LOVE ..." T-SHIRT'},
{'update_time': datetime.datetime(2013, 1, 7, 22, 25, 50), 'code': '9571/702', 'description': '', 'gender': '', 'brand': 'Zara', 'initial_price': Decimal('3.99'), 'image_link': 'http://static.zara.net/photos//2012/I/0/3/p/9571/702/401/9571702401_1_1_3.jpg', 'currency': 'GBP', 'colors': 'ECRU', 'link': 'http://www.zara.com/webapp/wcs/stores/servlet/product/uk/en/zara-neu-W2012-s/341501/883021/', 'current_price': Decimal('990.00'), 'original_category': 'Girl (2-14 years)', 'id': 1625L, 'name': '"I LOVE ..." T-SHIRT'}]
My code is as following:
from flask import Flask, url_for, session, redirect, escape, request
from subprocess import Popen, PIPE
import socket
import MySQLdb
import urllib
#app.route('/1/product')
def product_search():
[some not important stuff here...]
#creating list of codes with lowest Damerau-Levenshtein numbers
best_matching_codes = []
for k, v in lv:
if v == min:
best_matching_codes.append(k)
#returning JSON with best matching products info
products_json = []
for code in best_matching_codes:
cur = db.cursor()
query = "SELECT * FROM %s WHERE code LIKE '%s'" % (PRODUCTS_TABLE_NAME, product_code)
cur.execute(query)
columns = [desc[0] for desc in cur.description]
rows = cur.fetchall()
for row in rows:
products_json.append(dict((k,v) for k,v in zip(columns,row)))
return str(products_json)
Use the json module. It will output proper json from a dict:
import json
return json.dumps(products_json)
Using str does not produce valid json!
You need to use a json library to produce json.
Add to top:
import json
Change last line to:
return json.dumps(products_json)

Categories

Resources