Accessing keys/values in a paginated/nested dictionary

Accessing keys/values in a paginated/nested dictionary - python

I know that somewhat related questions have been asked here: Accessing key, value in a nested dictionary and here: python accessing elements in a dictionary inside dictionary among other places but I can't quite seem to apply the answers' methodology to my issue.
I'm getting a KeyError trying to access the keys within response_dict, which I know is due to it being nested/paginated and me going about this the wrong way. Can anybody help and/or point me in the right direction?
import requests
import json
URL = "https://api.constantcontact.com/v2/contacts?status=ALL&limit=1&api_key=<redacted>&access_token=<redacted>"
#make my request, store it in the requests object 'r'
r = requests.get(url = URL)
#status code to prove things are working
print (r.status_code)
#print what was retrieved from the API
print (r.text)
#visual aid
print ('---------------------------')
#decode json data to a dict
response_dict = json.loads(r.text)
#show how the API response looks now
print(response_dict)
#just for confirmation
print (type(response_dict))
print('-------------------------')
# HERE LIES THE ISSUE
print(response_dict['first_name'])
And my output:
200
{"meta":{"pagination":{}},"results":[{"id":"1329683950","status":"ACTIVE","fax":"","addresses":[{"id":"4e19e250-b5d9-11e8-9849-d4ae5275509e","line1":"222 Fake St.","line2":"","line3":"","city":"Kansas City","address_type":"BUSINESS","state_code":"","state":"OK","country_code":"ve","postal_code":"19512","sub_postal_code":""}],"notes":[],"confirmed":false,"lists":[{"id":"1733488365","status":"ACTIVE"}],"source":"Site Owner","email_addresses":[{"id":"1fe198a0-b5d5-11e8-92c1-d4ae526edd6c","status":"ACTIVE","confirm_status":"NO_CONFIRMATION_REQUIRED","opt_in_source":"ACTION_BY_OWNER","opt_in_date":"2018-09-11T18:18:20.000Z","email_address":"rsmith#fake.com"}],"prefix_name":"","first_name":"Robert","middle_name":"","last_name":"Smith","job_title":"I.T.","company_name":"FBI","home_phone":"","work_phone":"5555555555","cell_phone":"","custom_fields":[],"created_date":"2018-09-11T15:12:40.000Z","modified_date":"2018-09-11T18:18:20.000Z","source_details":""}]}
---------------------------
{'meta': {'pagination': {}}, 'results': [{'id': '1329683950', 'status': 'ACTIVE', 'fax': '', 'addresses': [{'id': '4e19e250-b5d9-11e8-9849-d4ae5275509e', 'line1': '222 Fake St.', 'line2': '', 'line3': '', 'city': 'Kansas City', 'address_type': 'BUSINESS', 'state_code': '', 'state': 'OK', 'country_code': 've', 'postal_code': '19512', 'sub_postal_code': ''}], 'notes': [], 'confirmed': False, 'lists': [{'id': '1733488365', 'status': 'ACTIVE'}], 'source': 'Site Owner', 'email_addresses': [{'id': '1fe198a0-b5d5-11e8-92c1-d4ae526edd6c', 'status': 'ACTIVE', 'confirm_status': 'NO_CONFIRMATION_REQUIRED', 'opt_in_source': 'ACTION_BY_OWNER', 'opt_in_date': '2018-09-11T18:18:20.000Z', 'email_address': 'rsmith#fake.com'}], 'prefix_name': '', 'first_name': 'Robert', 'middle_name': '', 'last_name': 'Smith', 'job_title': 'I.T.', 'company_name': 'FBI', 'home_phone': '', 'work_phone': '5555555555', 'cell_phone': '', 'custom_fields': [], 'created_date': '2018-09-11T15:12:40.000Z', 'modified_date': '2018-09-11T18:18:20.000Z', 'source_details': ''}]}
<class 'dict'>
-------------------------
Traceback (most recent call last):
File "C:\Users\rkiek\Desktop\Python WIP\Chris2.py", line 20, in <module>
print(response_dict['first_name'])
KeyError: 'first_name'

first_name = response_dict["results"][0]["first_name"]
Even though I think this question would be better answered by yourself by reading some documentation, I will explain what is going on here. You see the dict-object of the man named "Robert" is within a list which is a value under the key "results". So, at first you need to access the value within results which is a python-list.
Then you can use a loop to iterate through each of the elements within the list, and treat each individual element as a regular dictionary object.
results = response_dict["results"]
results = response_dict.get("results", None)
# use any one of the two above, the first one will throw a KeyError if there is no key=="results" the other will return NULL
# this results is now a list according to the data you mentioned.
for item in results:
print(item.get("first_name", None)
# here you can loop through the list of dictionaries and treat each item as a normal dictionary

Related

How to get a nested response data from json python

I call an API and this my JSON response:
{'data': [{
'id': 'd-1225959',
'startTime': '2022-12-30T00:00:00.000Z',
'endTime': '2022-12-30T23:59:00.000Z',
'checkedInAt': None,
'checkedOutAt': None,
'status': 'PENDING',
'space': {
'id': 'd-4063963',
'name': '082',
'type': 'DESK',
'createdAt': '2021-07-06T11:48:57.000Z',
'updatedAt': '2021-07-06T11:48:57.000Z',
'isAvailable': False,
'assignedTo': None,
'locationId': '133778',
'floorId': '41681',
'floorName': 'Car Park',
'neighborhoodId': '92267',
'neighborhoodName': 'NEI1'}}
I'm struggling to get the 'space' 'id' and 'name' extracted out if I do a nested python loop like so it only returns the headers like 'id' and 'name' not the values held within.
for order in response['data']:
print(order['id'])
print(order['startTime'])
print(order['endTime'])
print(order['checkedInAt'])
print(order['checkedOutAt'])
print(order['status'])
print(order['space'])
for doc in response['space']:
print(doc['id'], doc['name'])
Any help with this would be much appreciated!

for doc in response['space']: will iterate over the keys in response['space'] dict, i.e. doc will be str.
You want to do doc = response['space'] instead and then print(doc['id']). or directly print(response['space']['id']).
Note, you may want to use dict.get() method to avoid KeyError.
# if response dict has no 'space' key, return empty dict.
# if no 'id' key - return None
space_id = response.get('space', {}).get('id')

Parsing data in a JSON file

I am wanting to parse some information from a JSON file. I cant find a find to successfully retrieve the data I want.
In a file I want to output the profile name.
This is the code on how I am reading and parsing.
with open(json_data) as f:
accounts = dict(json.loads(f.read()))
shell_script = accounts['OCredit']['Profile Name']
print(shell_script)
This gives me the output of
OCredit
In a sense this is what I want, but in the application the value thats now "OCredit"(first bracket) would depend on the user.
with open(json_data) as f:
accounts = dict(json.loads(f.read()))
shell_script = accounts['OCredit']
print(shell_script)
This outputs :
{'Profile Name': 'OCredit', 'Name': 'Andrew Long', 'Email':
'asasa#yahoo.com', 'Tel': '2134568790', 'Address': '213 clover ','Zip':
'95305', 'City': 'brooklyn', 'State': 'NY','CreditCard':'213456759090',
'EXP': '12/21', 'CVV': '213'}
The actual JSON file is :
{'OCredit': {'Profile Name': 'OCredit',
'Name': 'Andrew Long',
'Email': 'asasa#yahoo.com',
'Tel': '2134568790',
'Address': '213 clover ',
'Zip': '95305',
'City': 'Brooklyn',
'State': 'NY',
'CreditCard': '213456759090',
'EXP': '12/21',
'CVV': '213'}}
So, to sum it up. I want to get inside JSON file and just print out the value that "Profile Name" has without hardcoding the fist value of the bracket.
Im not sure if I have to change the way im saving the JSON file to achieve this. Any help would be appreciated.

Try this:
for key in accounts:
print(accounts[key]['Profile Name'])
# OCredit
Or:
for value in accounts.values():
print(value['Profile Name'])

Extract specific keys from list of dict in python. Sentinelhub

I seem to be stuck on very simple task. I'm still dipping my toes into Python.
I'm trying to download Sentinel 2 Images with SentinelHub API:SentinelHub
The result of data that my code returns is like this:
{'geometry': {'coordinates': [[[[35.895906644, 31.602691754],
[36.264307655, 31.593801516],
[36.230618703, 30.604681346],
[35.642363693, 30.617971909],
[35.678587829, 30.757888786],
[35.715700562, 30.905919341],
[35.754290061, 31.053632806],
[35.793289298, 31.206946419],
[35.895906644, 31.602691754]]]],
'type': 'MultiPolygon'},
'id': 'ee923fac-0097-58a8-b861-b07d89b99310',
'properties': {'**productType**': '**S2MSI1C**',
'centroid': {'coordinates': [18.1321538275, 31.10368655], 'type': 'Point'},
'cloudCover': 10.68,
'collection': 'Sentinel2',
'completionDate': '2017-06-07T08:15:54Z',
'description': None,
'instrument': 'MSI',
'keywords': [],
'license': {'description': {'shortName': 'No license'},
'grantedCountries': None,
'grantedFlags': None,
'grantedOrganizationCountries': None,
'hasToBeSigned': 'never',
'licenseId': 'unlicensed',
'signatureQuota': -1,
'viewService': 'public'},
'links': [{'href': 'http://opensearch.sentinel-hub.com/resto/collections/Sentinel2/ee923fac-0097-58a8-b861-b07d89b99310.json?&lang=en',
'rel': 'self',
'title': 'GeoJSON link for ee923fac-0097-58a8-b861-b07d89b99310',
'type': 'application/json'}],
'orbitNumber': 10228,
'organisationName': None,
'parentIdentifier': None,
'platform': 'Sentinel-2',
'processingLevel': '1C',
'productIdentifier': 'S2A_OPER_MSI_L1C_TL_SGS__20170607T120016_A010228_T36RYV_N02.05',
'published': '2017-07-26T13:09:17.405352Z',
'quicklook': None,
'resolution': 10,
's3Path': 'tiles/36/R/YV/2017/6/7/0',
's3URI': 's3://sentinel-s2-l1c/tiles/36/R/YV/2017/6/7/0/',
'sensorMode': None,
'services': {'download': {'mimeType': 'text/html',
'url': 'http://sentinel-s2-l1c.s3-website.eu-central-1.amazonaws.com#tiles/36/R/YV/2017/6/7/0/'}},
'sgsId': 2168915,
'snowCover': 0,
'spacecraft': 'S2A',
'startDate': '2017-06-07T08:15:54Z',
'thumbnail': None,
'title': 'S2A_OPER_MSI_L1C_TL_SGS__20170607T120016_A010228_T36RYV_N02.05',
'updated': '2017-07-26T13:09:17.405352Z'},
'type': 'Feature'}
Can you explain how can I iterate through this set of data and extract only 'productType'? For example, if there are several similar data sets it would return only different product types.
My code is :
import matplotlib.pyplot as plt
import numpy as np
from sentinelhub import AwsProductRequest, AwsTileRequest, AwsTile, BBox, CRS
betsiboka_coords_wgs84 = [31.245117,33.897777,34.936523,36.129002]
bbox = BBox(bbox=betsiboka_coords_wgs84, crs=CRS.WGS84)
date= '2017-06-05',('2017-06-08')
data=sentinelhub.opensearch.get_area_info(bbox, date_interval=date, maxcc=None)
for i in data:
print(i)

Based on what you have provided, replace your bottom for loop:
for i in data:
print(i)
with the following:
for i in data:
print(i['properties']['**productType**'])

If you want to access only the propertyType you can use i['properties']['productType'] in your for loop. If you want to access it any time you want without writing each time those keys, you can define a generator like this:
def property_types(data_array):
for data in data_array
yield data['properties']['propertyType']
So you can use it like this in a loop (your data_array is data, as returned by sentinelhub api):
for property_type in property_types(data):
# do stuff with property_type

keys = []
for key in d.keys():
if key == 'properties':
for k in d[key].keys():
if k == '**productType**' and k not in keys:
keys.append(d[key][k])
print(keys)

Getting only specific (nested) values: Since your request key is nested, and resides inside the parent "properties" object, you need to access it first, preferably using the get method. This can be done as follows (note the '{}' parameter in the first get, this returns an empty dictionary if the first key is not present)
data_dictionary = json.loads(data_string)
product_type = data_dictionary.get('properties', {}).get('**productType**')
You can then aggregate the different product_type objects in a set, which will automatically guarantee that no 2 objects are the same
product_type_set = set()
product_type.add(product_type)

Remove nested keys and move value to main dictionary keys

I have a problem when I try to merge two dictionaries to fit for doing a post later. For some reason the get seems to be nested and Im not sure how to clean it up. Would be great to get some tips on optimizing the code as well, right now it looks a bit messy.
for network in networks:
post_dict = {e1:e2 for e1,e2 in network['extattrs'].iteritems() if e1 not in keys }
pprint (post_dict['Stuff-Name']['value'])
post_dict['name'] = post_dict.pop('Stuff-Name')
post_dict['sid'] = post_dict.pop('Stuff-id')
dict_to_post = merge_two_dicts(post_dict, default_keys)
network:
{u'_ref': u'ref number',
u'comment': u'Name of object',
u'extattrs': {u'Network-Type': {u'value': u'Internal'},
u'Stuff-Id': {u'value': 110},
u'Stuff-Name': {u'value': u'Name of object'}},
u'network': u'Subnet-A',
u'network_view': u'default'}
default_keys:
default_keys = {'status':'Active',
'group':None,
'site':'City-A',
'role':'Production',
'description':None,
'custom_fields':None,
'tenant':None}
post_dict:
{'name': {u'value': u'Name of object'},
'sid': {u'value': 110}}
So what I want to achive is to get rid of the nested keys (within key "name" and "sid" so the key and value pair should be "name: Name of object" and "sid: 110"
The post function is not yet defined.

In my understanding, you case is really specific and I would probably go for a easy & dirty solution. First of all have you tried this:
post_dict['name'] = (post_dict.pop('Stuff-Name'))['value']
Secondly, how about making use of the "filter and renaming" and collapse the indexing there? This is not advisable, but if you are trying to do a lazy work-around it will suffice. I recommend you go with my first suggestion, as I'm pretty confident that it will solve your issue.

To get this first value of any nested dictionary you could use this
d = {'custom_fields': None, 'description': None, 'group': None, 'name':
{'value': 'Name of object'}, 'role': 'Production', 'site': 'City-A',
'status': 'Active', 'tenant': None, 'sid': {'value': 110}}
for key in d.keys():
if type(d[key]) == dict:
d[key] = d[key].popitem()[1]
It returns
{'custom_fields': None, 'description': None, 'group': None, 'name': 'Name of
object', 'role': 'Production', 'site': 'City-A', 'status': 'Active',
'tenant': None, 'sid': 110}
I think it's this step that's causing the dictionaries to be nested in the first place
post_dict['name'] = post_dict.pop('Stuff-Name')
post_dict['sid'] = post_dict.pop('Stuff-id')
You could try popitem()[1] here if you'll only ever need value of that dictionary and not the key.

Schema not validating

I'm working with a JSON schema and I'm trying to use the python JSON module to validate some JSON I output against the schema.
I get the following error, indicating that the Schema itself is not validating:
validation
Traceback (most recent call last):
File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/gc_aicep-395698294.764.py", line 814, in <module>
validate(entry,gc_schema)
File "/Library/Python/2.7/site-packages/jsonschema/validators.py", line 468, in validate
cls(schema, *args, **kwargs).validate(instance)
File "/Library/Python/2.7/site-packages/jsonschema/validators.py", line 117, in validate
raise error
jsonschema.exceptions.ValidationError: ({'website': 'www.stepUp.com', 'bio': '', 'accreditations': {'portfolio': '', 'certifications': [], 'degrees': {'degree_name': [], 'major': '', 'institution_name': '', 'graduation_distinction': '', 'institution_website': ''}}, 'description': 'A great counselor', 'photo': '', 'twitter': '', 'additionaltext': '', 'linkedin': '', 'price': {'costtype': [], 'costrange': []}, 'phone': {'phonetype': [], 'value': '1234567891'}, 'facebook': '', 'counselingtype': [], 'logourl': '', 'counselingoptions': [], 'linkurl': '', 'name': {'first_name': u'Rob', 'last_name': u'Er', 'middle_name': u'', 'title': u''}, 'email': {'emailtype': [], 'value': ''}, 'languages': 'english', 'datasource': {'additionaltext': '', 'linkurl': '', 'linktext': '', 'logourl': ''}, 'linktext': '', 'special_needs_offer': '', 'company': 'Step Up', 'location': {'city': 'New York', 'zip': '10011', 'locationtype': '', 'state': 'NY', 'address': '123 Road Dr', 'loc_name': '', 'country': 'united states', 'geo': ['', '']}},) is not of type 'object'
The validationError message indicates that what follows the colon is not a valid JSON object, I think, but I can't figure out why it would not be.
This JSON validates using a validator like JSON Lint if you replace the single quotation marks with double and remove the basic parentheses from either side.
The 'u' before the name has been flagged as a possible bug.
This is the code which outputs the name:
name = HumanName(row['name'])
first_name = name.first
middle_name = name.middle
last_name = name.last
title = name.title
full_name = dict(first_name=first_name, middle_name=middle_name, last_name=last_name, title=title)
name is inserted into the JSON using the following:
gc_ieca = dict(
name = full_name,
twitter = twitter,
logourl = logourl,
linktext = linktext,
linkurl = linkurl,
additionaltext = additionaltext,
datasource = datasource,
phone=phone,
email = email,
price = price,
languages = languages,
special_needs_offer = special_needs_offer,
# location
location = location,
accreditations = accreditations,
website = website
),

That isn't what a ValidationError indicates. It indicates that validation failed :), not that the JSON is invalid (jsonschema doesn't even deal with JSON, it deals with deserialized JSON i.e. Python objects, here a dict). If the JSON were invalid you'd get an error when you called json.load.
The reason it's failing is because that in fact is not an object, it's a tuple with a single element, an object, so it is in fact invalid. Why it is a tuple is a bug in your code (you have a stray comma at the end there I see).
(And FYI, the u prefixes are because those are unicode literals, and the single quotes are because that's the repr of str, nothing to do with JSON).

I see two potential issues here:
Use of single quotes. Strictly speaking, the json spec calls for a use of double-quotes for strings. Your final note sort of implies this is not your issue, however it is worth mentioning, as something to check if fixing #2 doesn't resolve the problem.
Values for name: these are listed as u'...' which is not valid json. Use of u must be followed by 4 hex digits, and should fall within the double-quotes surrounding a string, after a \ escape character.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Accessing keys/values in a paginated/nested dictionary - python

Related

How to get a nested response data from json python

Parsing data in a JSON file

Extract specific keys from list of dict in python. Sentinelhub

Remove nested keys and move value to main dictionary keys

Schema not validating

Categories

Resources