I seem to be stuck on very simple task. I'm still dipping my toes into Python.
I'm trying to download Sentinel 2 Images with SentinelHub API:SentinelHub
The result of data that my code returns is like this:
{'geometry': {'coordinates': [[[[35.895906644, 31.602691754],
[36.264307655, 31.593801516],
[36.230618703, 30.604681346],
[35.642363693, 30.617971909],
[35.678587829, 30.757888786],
[35.715700562, 30.905919341],
[35.754290061, 31.053632806],
[35.793289298, 31.206946419],
[35.895906644, 31.602691754]]]],
'type': 'MultiPolygon'},
'id': 'ee923fac-0097-58a8-b861-b07d89b99310',
'properties': {'**productType**': '**S2MSI1C**',
'centroid': {'coordinates': [18.1321538275, 31.10368655], 'type': 'Point'},
'cloudCover': 10.68,
'collection': 'Sentinel2',
'completionDate': '2017-06-07T08:15:54Z',
'description': None,
'instrument': 'MSI',
'keywords': [],
'license': {'description': {'shortName': 'No license'},
'grantedCountries': None,
'grantedFlags': None,
'grantedOrganizationCountries': None,
'hasToBeSigned': 'never',
'licenseId': 'unlicensed',
'signatureQuota': -1,
'viewService': 'public'},
'links': [{'href': 'http://opensearch.sentinel-hub.com/resto/collections/Sentinel2/ee923fac-0097-58a8-b861-b07d89b99310.json?&lang=en',
'rel': 'self',
'title': 'GeoJSON link for ee923fac-0097-58a8-b861-b07d89b99310',
'type': 'application/json'}],
'orbitNumber': 10228,
'organisationName': None,
'parentIdentifier': None,
'platform': 'Sentinel-2',
'processingLevel': '1C',
'productIdentifier': 'S2A_OPER_MSI_L1C_TL_SGS__20170607T120016_A010228_T36RYV_N02.05',
'published': '2017-07-26T13:09:17.405352Z',
'quicklook': None,
'resolution': 10,
's3Path': 'tiles/36/R/YV/2017/6/7/0',
's3URI': 's3://sentinel-s2-l1c/tiles/36/R/YV/2017/6/7/0/',
'sensorMode': None,
'services': {'download': {'mimeType': 'text/html',
'url': 'http://sentinel-s2-l1c.s3-website.eu-central-1.amazonaws.com#tiles/36/R/YV/2017/6/7/0/'}},
'sgsId': 2168915,
'snowCover': 0,
'spacecraft': 'S2A',
'startDate': '2017-06-07T08:15:54Z',
'thumbnail': None,
'title': 'S2A_OPER_MSI_L1C_TL_SGS__20170607T120016_A010228_T36RYV_N02.05',
'updated': '2017-07-26T13:09:17.405352Z'},
'type': 'Feature'}
Can you explain how can I iterate through this set of data and extract only 'productType'? For example, if there are several similar data sets it would return only different product types.
My code is :
import matplotlib.pyplot as plt
import numpy as np
from sentinelhub import AwsProductRequest, AwsTileRequest, AwsTile, BBox, CRS
betsiboka_coords_wgs84 = [31.245117,33.897777,34.936523,36.129002]
bbox = BBox(bbox=betsiboka_coords_wgs84, crs=CRS.WGS84)
date= '2017-06-05',('2017-06-08')
data=sentinelhub.opensearch.get_area_info(bbox, date_interval=date, maxcc=None)
for i in data:
print(i)
Based on what you have provided, replace your bottom for loop:
for i in data:
print(i)
with the following:
for i in data:
print(i['properties']['**productType**'])
If you want to access only the propertyType you can use i['properties']['productType'] in your for loop. If you want to access it any time you want without writing each time those keys, you can define a generator like this:
def property_types(data_array):
for data in data_array
yield data['properties']['propertyType']
So you can use it like this in a loop (your data_array is data, as returned by sentinelhub api):
for property_type in property_types(data):
# do stuff with property_type
keys = []
for key in d.keys():
if key == 'properties':
for k in d[key].keys():
if k == '**productType**' and k not in keys:
keys.append(d[key][k])
print(keys)
Getting only specific (nested) values: Since your request key is nested, and resides inside the parent "properties" object, you need to access it first, preferably using the get method. This can be done as follows (note the '{}' parameter in the first get, this returns an empty dictionary if the first key is not present)
data_dictionary = json.loads(data_string)
product_type = data_dictionary.get('properties', {}).get('**productType**')
You can then aggregate the different product_type objects in a set, which will automatically guarantee that no 2 objects are the same
product_type_set = set()
product_type.add(product_type)
Related
I'm getting members from a Google group using the directory API. This is what is returned:
{'etag': '"..."',
'kind': 'admin#directory#members',
'members': [{'email': 'a#a.com',
'etag': '"..."',
'id': '....',
'kind': 'admin#directory#member',
'role': 'MEMBER',
'type': 'USER'},
{'email': 'b#a.com',
'etag': '"..."',
'id': '...',
'kind': 'admin#directory#member',
'role': 'MEMBER',
'type': 'USER'},
...],
'nextPageToken': '...'}
There are multiple pages being returned and I want to append the next set of members to the members key of this dictionary...I think.
Here is my code so far:
for group in groups_list:
members_service = service.members()
member_request = members_service.list(groupKey=group['email'], maxResults=200, roles='MEMBER')
memberEmails = {}
while member_request is not None:
member_results = member_request.execute()
memberEmails.update(member_results)
member_request = members_service.list_next(member_request, member_results)
I don't think memberEmails.update(member_results) will work here because it overwrites and updates existing keys.
How can I accomplish this? or what are some alternatives?
I wonder if you want to treat memberEmails like a list and extend it with the value associated with the members key in the result. For example:
for group in groups_list:
members_service = service.members()
member_request = members_service.list(groupKey=group['email'], maxResults=200, roles='MEMBER')
member_emails = list()
while member_request is not None:
member_results = member_request.execute()
member_emails.extend(member_request['members'])
member_request = members_service.list_next(member_request, member_results)
I call an API and this my JSON response:
{'data': [{
'id': 'd-1225959',
'startTime': '2022-12-30T00:00:00.000Z',
'endTime': '2022-12-30T23:59:00.000Z',
'checkedInAt': None,
'checkedOutAt': None,
'status': 'PENDING',
'space': {
'id': 'd-4063963',
'name': '082',
'type': 'DESK',
'createdAt': '2021-07-06T11:48:57.000Z',
'updatedAt': '2021-07-06T11:48:57.000Z',
'isAvailable': False,
'assignedTo': None,
'locationId': '133778',
'floorId': '41681',
'floorName': 'Car Park',
'neighborhoodId': '92267',
'neighborhoodName': 'NEI1'}}
I'm struggling to get the 'space' 'id' and 'name' extracted out if I do a nested python loop like so it only returns the headers like 'id' and 'name' not the values held within.
for order in response['data']:
print(order['id'])
print(order['startTime'])
print(order['endTime'])
print(order['checkedInAt'])
print(order['checkedOutAt'])
print(order['status'])
print(order['space'])
for doc in response['space']:
print(doc['id'], doc['name'])
Any help with this would be much appreciated!
for doc in response['space']: will iterate over the keys in response['space'] dict, i.e. doc will be str.
You want to do doc = response['space'] instead and then print(doc['id']). or directly print(response['space']['id']).
Note, you may want to use dict.get() method to avoid KeyError.
# if response dict has no 'space' key, return empty dict.
# if no 'id' key - return None
space_id = response.get('space', {}).get('id')
I want to convert this nested json into a df.
Tried different functions but none works correctly.
The encoding that worked for my was -
encoding = "utf-8-sig"
[{'replayableActionOperationState': 'SKIPPED',
'replayableActionOperationGuid': 'RAO_1037351',
'failedMessage': 'Cannot replay action: RAO_1037351: com.ebay.sd.catedor.core.model.DTOEntityPropertyChange; local class incompatible: stream classdesc serialVersionUID = 7777212484705611612, local class serialVersionUID = -1785129380151507142',
'userMessage': 'Skip all mode',
'username': 'gfannon',
'sourceAuditData': [{'guid': '24696601-b73e-43e4-bce9-28bc741ac117',
'operationName': 'UPDATE_CATEGORY_ATTRIBUTE_PROPERTY',
'creationTimestamp': 1563439725240,
'auditCanvasInfo': {'id': '165059', 'name': '165059'},
'auditUserInfo': {'id': 1, 'name': 'gfannon'},
'externalId': None,
'comment': None,
'transactionId': '0f135909-66a7-46b1-98f6-baf1608ffd6a',
'data': {'entity': {'guid': 'CA_2511202',
'tagType': 'BOTH',
'description': None,
'name': 'Number of Shelves'},
'propertyChanges': [{'propertyName': 'EntityProperty',
'oldEntity': {'guid': 'CAP_35',
'name': 'DisableAsVariant',
'group': None,
'action': 'SET',
'value': 'true',
'tagType': 'SELLER'},
'newEntity': {'guid': 'CAP_35',
'name': 'DisableAsVariant',
'group': None,
'action': 'SET',
'value': 'false',
'tagType': 'SELLER'}}],
'entityChanges': None,
'primary': True}}],
'targetAuditData': None,
'conflictedGuids': None,
'fatal': False}]
This is what i tried so far, there are more tries but that got me as close as i can.
with open(r"Desktop\Ann's json parsing\report.tsv", encoding='utf-8-sig') as data_file:
data = json.load(data_file)
df = json_normalize(data)
print (df)
pd.DataFrame(df) ## The nested lists are shown as a whole column, im trying to parse those colums - 'failedMessage' and 'sourceAuditData'`I also tried json.loads/json(df) but the output isnt correct.
pd.DataFrame.from_dict(a['sourceAuditData'][0]['data']['propertyChanges'][0]) ##This line will retrive one of the outputs i need but i dont know how to perform it on the whole file.
The expected result should be a csv/xlsx file with a column and value for each row.
For your particular example:
def unroll_dict(d):
data = []
for k, v in d.items():
if isinstance(v, list):
data.append((k, ''))
data.extend(unroll_dict(v[0]))
elif isinstance(v, dict):
data.append((k, ''))
data.extend(unroll_dict(v))
else:
data.append((k,v))
return data
And given the data in your question is stored in the variable example:
df = pd.DataFrame(unroll_dict(example[0])).set_index(0).transpose()
I know that somewhat related questions have been asked here: Accessing key, value in a nested dictionary and here: python accessing elements in a dictionary inside dictionary among other places but I can't quite seem to apply the answers' methodology to my issue.
I'm getting a KeyError trying to access the keys within response_dict, which I know is due to it being nested/paginated and me going about this the wrong way. Can anybody help and/or point me in the right direction?
import requests
import json
URL = "https://api.constantcontact.com/v2/contacts?status=ALL&limit=1&api_key=<redacted>&access_token=<redacted>"
#make my request, store it in the requests object 'r'
r = requests.get(url = URL)
#status code to prove things are working
print (r.status_code)
#print what was retrieved from the API
print (r.text)
#visual aid
print ('---------------------------')
#decode json data to a dict
response_dict = json.loads(r.text)
#show how the API response looks now
print(response_dict)
#just for confirmation
print (type(response_dict))
print('-------------------------')
# HERE LIES THE ISSUE
print(response_dict['first_name'])
And my output:
200
{"meta":{"pagination":{}},"results":[{"id":"1329683950","status":"ACTIVE","fax":"","addresses":[{"id":"4e19e250-b5d9-11e8-9849-d4ae5275509e","line1":"222 Fake St.","line2":"","line3":"","city":"Kansas City","address_type":"BUSINESS","state_code":"","state":"OK","country_code":"ve","postal_code":"19512","sub_postal_code":""}],"notes":[],"confirmed":false,"lists":[{"id":"1733488365","status":"ACTIVE"}],"source":"Site Owner","email_addresses":[{"id":"1fe198a0-b5d5-11e8-92c1-d4ae526edd6c","status":"ACTIVE","confirm_status":"NO_CONFIRMATION_REQUIRED","opt_in_source":"ACTION_BY_OWNER","opt_in_date":"2018-09-11T18:18:20.000Z","email_address":"rsmith#fake.com"}],"prefix_name":"","first_name":"Robert","middle_name":"","last_name":"Smith","job_title":"I.T.","company_name":"FBI","home_phone":"","work_phone":"5555555555","cell_phone":"","custom_fields":[],"created_date":"2018-09-11T15:12:40.000Z","modified_date":"2018-09-11T18:18:20.000Z","source_details":""}]}
---------------------------
{'meta': {'pagination': {}}, 'results': [{'id': '1329683950', 'status': 'ACTIVE', 'fax': '', 'addresses': [{'id': '4e19e250-b5d9-11e8-9849-d4ae5275509e', 'line1': '222 Fake St.', 'line2': '', 'line3': '', 'city': 'Kansas City', 'address_type': 'BUSINESS', 'state_code': '', 'state': 'OK', 'country_code': 've', 'postal_code': '19512', 'sub_postal_code': ''}], 'notes': [], 'confirmed': False, 'lists': [{'id': '1733488365', 'status': 'ACTIVE'}], 'source': 'Site Owner', 'email_addresses': [{'id': '1fe198a0-b5d5-11e8-92c1-d4ae526edd6c', 'status': 'ACTIVE', 'confirm_status': 'NO_CONFIRMATION_REQUIRED', 'opt_in_source': 'ACTION_BY_OWNER', 'opt_in_date': '2018-09-11T18:18:20.000Z', 'email_address': 'rsmith#fake.com'}], 'prefix_name': '', 'first_name': 'Robert', 'middle_name': '', 'last_name': 'Smith', 'job_title': 'I.T.', 'company_name': 'FBI', 'home_phone': '', 'work_phone': '5555555555', 'cell_phone': '', 'custom_fields': [], 'created_date': '2018-09-11T15:12:40.000Z', 'modified_date': '2018-09-11T18:18:20.000Z', 'source_details': ''}]}
<class 'dict'>
-------------------------
Traceback (most recent call last):
File "C:\Users\rkiek\Desktop\Python WIP\Chris2.py", line 20, in <module>
print(response_dict['first_name'])
KeyError: 'first_name'
first_name = response_dict["results"][0]["first_name"]
Even though I think this question would be better answered by yourself by reading some documentation, I will explain what is going on here. You see the dict-object of the man named "Robert" is within a list which is a value under the key "results". So, at first you need to access the value within results which is a python-list.
Then you can use a loop to iterate through each of the elements within the list, and treat each individual element as a regular dictionary object.
results = response_dict["results"]
results = response_dict.get("results", None)
# use any one of the two above, the first one will throw a KeyError if there is no key=="results" the other will return NULL
# this results is now a list according to the data you mentioned.
for item in results:
print(item.get("first_name", None)
# here you can loop through the list of dictionaries and treat each item as a normal dictionary
I have a problem when I try to merge two dictionaries to fit for doing a post later. For some reason the get seems to be nested and Im not sure how to clean it up. Would be great to get some tips on optimizing the code as well, right now it looks a bit messy.
for network in networks:
post_dict = {e1:e2 for e1,e2 in network['extattrs'].iteritems() if e1 not in keys }
pprint (post_dict['Stuff-Name']['value'])
post_dict['name'] = post_dict.pop('Stuff-Name')
post_dict['sid'] = post_dict.pop('Stuff-id')
dict_to_post = merge_two_dicts(post_dict, default_keys)
network:
{u'_ref': u'ref number',
u'comment': u'Name of object',
u'extattrs': {u'Network-Type': {u'value': u'Internal'},
u'Stuff-Id': {u'value': 110},
u'Stuff-Name': {u'value': u'Name of object'}},
u'network': u'Subnet-A',
u'network_view': u'default'}
default_keys:
default_keys = {'status':'Active',
'group':None,
'site':'City-A',
'role':'Production',
'description':None,
'custom_fields':None,
'tenant':None}
post_dict:
{'name': {u'value': u'Name of object'},
'sid': {u'value': 110}}
So what I want to achive is to get rid of the nested keys (within key "name" and "sid" so the key and value pair should be "name: Name of object" and "sid: 110"
The post function is not yet defined.
In my understanding, you case is really specific and I would probably go for a easy & dirty solution. First of all have you tried this:
post_dict['name'] = (post_dict.pop('Stuff-Name'))['value']
Secondly, how about making use of the "filter and renaming" and collapse the indexing there? This is not advisable, but if you are trying to do a lazy work-around it will suffice. I recommend you go with my first suggestion, as I'm pretty confident that it will solve your issue.
To get this first value of any nested dictionary you could use this
d = {'custom_fields': None, 'description': None, 'group': None, 'name':
{'value': 'Name of object'}, 'role': 'Production', 'site': 'City-A',
'status': 'Active', 'tenant': None, 'sid': {'value': 110}}
for key in d.keys():
if type(d[key]) == dict:
d[key] = d[key].popitem()[1]
It returns
{'custom_fields': None, 'description': None, 'group': None, 'name': 'Name of
object', 'role': 'Production', 'site': 'City-A', 'status': 'Active',
'tenant': None, 'sid': 110}
I think it's this step that's causing the dictionaries to be nested in the first place
post_dict['name'] = post_dict.pop('Stuff-Name')
post_dict['sid'] = post_dict.pop('Stuff-id')
You could try popitem()[1] here if you'll only ever need value of that dictionary and not the key.