Line wrapping code for nested dictionaries in Python

With the new Entity Resolution in Alexa, dictionaries become very deeply nested. What's the most Pythonic way to refer to a deeply nested value, and how do I write the code while keeping within 79 characters per line?
This is what I currently have, and while it works, I'm pretty sure there is a better way:
if 'VolumeQuantity' in intent['slots']:
    if 'resolutions' in intent['slots']['VolumeQuantity']:
        half_decibels = intent['slots']['VolumeQuantity']['resolutions']['resolutionsPerAuthority'][0]['values'][0]['value']['name'].strip()
    elif 'value' in intent['slots']['VolumeQuantity']:
        half_decibels = intent['slots']['VolumeQuantity']['value'].strip()
Here is a partial sample of the JSON from Alexa:
{
"type": "IntentRequest",
"requestId": "amzn1.echo-api.request.9a...11",
"timestamp": "2018-03-28T20:37:21Z",
"locale": "en-US",
"intent": {
"name": "RelativeVolumeIntent",
"confirmationStatus": "NONE",
"slots": {
"VolumeQuantity": {
"name": "VolumeQuantity",
"confirmationStatus": "NONE"
},
"VolumeDirection": {
"name": "VolumeDirection",
"value": "softer",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "amzn1.er-authority.echo-blah-blah-blah",
"status": {
"code": "ER_SUCCESS_MATCH"
},
"values": [
{
"value": {
"name": "down",
"id": "down"
}
}
]
}
]
},
"confirmationStatus": "NONE"
}
}
},
"dialogState": "STARTED"
}

You are probably referring to nested dictionaries; lists only accept integer indices.
Anyway, (ab?)using the implied line continuation inside parentheses, I think this is pretty readable:
>>> d = {'a':{'b':{'c':'value'}}}
>>> (d
... ['a']
... ['b']
... ['c']
... )
'value'
or alternatively
>>> (d['a']
... ['b']
... ['c'])
'value'
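Applied to the expression from the question, the same trick might look like the sketch below. The intent dictionary here is a hypothetical sample shaped like the question's JSON, with resolutions filled in for the VolumeQuantity slot; the values are made up for illustration.

```python
# Hypothetical sample shaped like the question's JSON (values are made up).
intent = {
    'slots': {
        'VolumeQuantity': {
            'name': 'VolumeQuantity',
            'value': ' 3 ',
            'resolutions': {
                'resolutionsPerAuthority': [
                    {'values': [{'value': {'name': ' 3 ', 'id': '3'}}]},
                ],
            },
        },
    },
}

# Implied line continuation inside parentheses keeps every line short.
half_decibels = (
    intent['slots']['VolumeQuantity']
    ['resolutions']['resolutionsPerAuthority'][0]
    ['values'][0]['value']['name']
    .strip()
)
print(half_decibels)
```

Where the line breaks go is a matter of taste; grouping the keys that belong together logically tends to read better than one key per line.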

First, you can use some well-named intermediate variables to make the program more readable as well as simpler and faster:
volumes = intent['slots']  # Pick meaningful names. I'm just guessing.
if 'VolumeQuantity' in volumes:
    quantity = volumes['VolumeQuantity']
    if 'resolutions' in quantity:
        half_decibels = quantity['resolutions']['resolutionsPerAuthority'][0]['values'][0]['value']['name'].strip()
    elif 'value' in quantity:
        half_decibels = quantity['value'].strip()
Second, you can write a helper function nav(structure, path) for navigating through these structures, so that, e.g.
nav(quantity, 'resolutions.resolutionsPerAuthority.0.values.0.value.name')
splits up the given path and does the sequence of indexing/lookup operations. It could use dict.get(key, default) so you don't have to do so many if key in dict checks.
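A minimal sketch of such a helper follows. The answer only names nav(structure, path); the default parameter, the treatment of numeric path parts as list indices, and the sample data are assumptions made for illustration. It checks membership explicitly rather than calling dict.get, so that lists can be handled in the same loop.

```python
def nav(structure, path, default=None):
    """Follow a dot-separated path through nested dicts and lists.

    Returns `default` as soon as a step cannot be followed.
    """
    current = structure
    for part in path.split('.'):
        if isinstance(current, dict):
            if part not in current:
                return default
            current = current[part]
        elif isinstance(current, list):
            # Purely numeric parts are treated as list indices.
            if not part.isdigit() or int(part) >= len(current):
                return default
            current = current[int(part)]
        else:
            return default
    return current


# Hypothetical slot data shaped like the question's JSON.
quantity = {
    'resolutions': {
        'resolutionsPerAuthority': [
            {'values': [{'value': {'name': ' 3 '}}]},
        ],
    },
}
name = nav(quantity,
           'resolutions.resolutionsPerAuthority.0.values.0.value.name')
missing = nav({'value': 'x'}, 'resolutions.resolutionsPerAuthority')
print(repr(name), missing)
```

Note that this sketch cannot address keys that themselves contain a dot; a list of path parts instead of a string would avoid that limitation.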

Related

Python function to extract specific values from complex JSON logs data

I am trying to write a Python function (for use in a Google Cloud Function) that extracts specific values from JSON logs data. Ordinarily, I do this using the standard method of sorting through keys:
my_data['key1'], etc.
This JSON data, however, is quite different, since the data I need appears as lists inside of dictionaries. Here is a sample of the logs data:
{
"insertId": "-mgv16adfcja",
"logName": "projects/my_project/logs/cloudaudit.googleapis.com%2Factivity",
"protoPayload": {
"@type": "type.googleapis.com/google.cloud.audit.AuditLog",
"authenticationInfo": {
"principalEmail": "email@email.com"
},
"authorizationInfo": [{
"granted": true,
"permission": "resourcemanager.projects.setIamPolicy",
"resource": "projects/my_project",
"resourceAttributes": {
"name": "projects/my_project",
"service": "cloudresourcemanager.googleapis.com",
"type": "cloudresourcemanager.googleapis.com/Project"
}
},
{
"granted": true,
"permission": "resourcemanager.projects.setIamPolicy",
"resource": "projects/my_project",
"resourceAttributes": {
"name": "projects/my_project",
"service": "cloudresourcemanager.googleapis.com",
"type": "cloudresourcemanager.googleapis.com/Project"
}
}
],
"methodName": "SetIamPolicy",
"request": {
"@type": "type.SetIamPolicyRequest",
"policy": {
"bindings": [{
"members": [
"serviceAccount:my-test-sa@my_project.iam.gserviceaccount.com"
],
"role": "projects/my_project/roles/PubBuckets"
},
{
"members": [
"serviceAccount:my-test-sa-2@my_project.iam.gserviceaccount.com"
],
"role": "roles/owner"
},
{
"members": [
"serviceAccount:my-test-sa-3@my_project.iam.gserviceaccount.com",
"serviceAccount:my-test-sa-4@my_project.iam.gserviceaccount.com"
]
}
My goal with this data is to extract the "role":"roles/editor" and the associated "members." So in this case, I would like to extract service accounts my-test-sa-3, 4, and 5, and print them.
When the JSON enters my cloud function I do the following:
pubsub_message = base64.b64decode(event['data']).decode('utf-8')
msg = json.loads(pubsub_message)
print(msg)
And I can get to other data that I need, e.g., project id-
proj_id = msg['resource']['labels']['project_id']
But I cannot get into the lists within the dictionaries effectively. The deepest I can currently get is to the 'bindings' key.
I have additionally tried restructuring and flattening output as a list:
policy_request = credentials.projects().getIamPolicy(resource=proj_id, body={})
policy_response = policy_request.execute()
my_bindings = policy_response['bindings']
flat_list = []
for element in my_bindings:
    if type(element) is list:
        for item in element:
            flat_list.append(item)
    else:
        flat_list.append(element)
print('Here is flat_list: ', flat_list)
I then use an if statement to search the list, which returns nothing. I can't use indices, because the output will change consistently, so I need a solution that can extract the values by a key, value approach if at all possible.
Expected Output:
Role: roles/editor
Members:
sa-1@gcloud.com
sa2@gcloud.com
sa3@gcloud.com
and so on
Appreciate any help.
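One way to look bindings up by key and value rather than by index is sketched below. It assumes the decoded message follows the shape of the sample above (protoPayload, then request, policy, bindings); the function name members_for_role and the sample message are hypothetical, made up for illustration.

```python
def members_for_role(bindings, role):
    """Collect the members of every binding whose 'role' equals `role`."""
    members = []
    for binding in bindings:
        # .get() tolerates bindings that carry no 'role' key at all.
        if binding.get('role') == role:
            members.extend(binding.get('members', []))
    return members


# Hypothetical message trimmed to the keys that matter here.
msg = {
    'protoPayload': {
        'request': {
            'policy': {
                'bindings': [
                    {'role': 'roles/owner',
                     'members': ['serviceAccount:sa-1@example.com']},
                    {'role': 'roles/editor',
                     'members': ['serviceAccount:sa-2@example.com',
                                 'serviceAccount:sa-3@example.com']},
                ],
            },
        },
    },
}

bindings = msg['protoPayload']['request']['policy']['bindings']
role = 'roles/editor'
print('Role:', role)
print('Members:')
for member in members_for_role(bindings, role):
    print(member)
```

Each element of bindings is itself a dictionary, so there is nothing to flatten: the loop walks the list and compares the 'role' value of each entry.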

Return list of values from list inside json

I have some JSON and I'd like to get only specific values into a list. I can do this just fine by iterating through it, but I'm wondering if there's an easy one-liner list comprehension to do this. Suppose I have this JSON:
{
"results": {
"types":
[
{
"ID": 1,
"field": [
{
"type": "date",
"field": "PrjDate"
},
{
"type": "date",
"field": "ComplDate"
}
]
}
]
}
}
I'd like to get all of the field values into a single list:
fieldsList = ['PrjDate', 'ComplDate']
I can do this easily with
fieldsList = []
for types in myjson['results']['types']:
    for fields in types['field']:
        fieldsList.append(fields['field'])
But that seems unnecessarily clunky. Is there an easy one-liner list comprehension I can use here?
You could try:
myfields = [fields['field'] for types in myjson['results']['types'] for fields in types['field']]

Looping over a couple of dictionaries within a list in Python

I have this list:
"guests_info": [
{
"title": "Mr.",
"first_name": "John",
"mi": null,
"last_name": "Smith",
"suffix": "Sr.",
"age": "18+"
},
{
"title": "Mr.",
"first_name": "James",
"mi": null,
"last_name": "Jones",
"suffix": null,
"age": "18+"
}
]
It's JSON. What I have to do is create a body like this one:
{
"profile": {
"name": {
"title": "Miss",
"firstName": "Kairi",
"lastName": "asdasd"
},
"age": 8
},
"preferences": {
"avatarIdentifier": "15655408",
"favoriteCharacterIdentifier": "15655408"
},
"friendsAndFamily": {
"groupClassification": {
"name": "TRAVELLING_PARTY"
},
"accessClassification": {
"name": "PLAN_VIEW_SHARED"
}
}
}
for every guest in the list, and do a POST to this URL:
https://env5.xxx.api.go.com/xxxxx/xxxx/id;swid=" + swid_id + "/managed-guests
Some things, like preferences, are the same for every guest. I have a little experience with Python. I suppose I have to put guests_info inside a for loop and, for every dictionary, create a new string containing a new body, but I'm not sure how to count exactly how many dictionaries are in guests_info.
Any help will be really appreciated.
Like @Cfreak suggested, just make a function that posts the data you wish to post. To create that data, loop over the list, and then over the keys of each dictionary.
E.g.,
dict_list = [
{ 'foo' : 'baz',
'bar' : 'qux'
},
{ 'foo' : 'qux',
'bar' : 'baz'
}
]
Followed by, at some point
for each in dict_list:
    for key in each:
        print(key + ": " + each[key])
Just be aware that on Python versions before 3.7 the keys won't come back in any guaranteed order (from 3.7 on, dictionaries preserve insertion order). If you need a specific order, you can reference each key explicitly.
To determine the length of the list, you can just call
len(dict_list)
which returns 2 here.
If you want assistance with creating the body of the JSON query, I can certainly help with that.
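As a rough sketch of the body-building step, assuming the field mapping title to title, first_name to firstName, and last_name to lastName (the question does not spell the mapping out), something like the following could work. The names SHARED and build_body are hypothetical, and the actual POST is left out.

```python
import json

# Values shared by every guest, copied from the example body in the question.
SHARED = {
    'preferences': {
        'avatarIdentifier': '15655408',
        'favoriteCharacterIdentifier': '15655408',
    },
    'friendsAndFamily': {
        'groupClassification': {'name': 'TRAVELLING_PARTY'},
        'accessClassification': {'name': 'PLAN_VIEW_SHARED'},
    },
}


def build_body(guest):
    """Map one guests_info entry onto the body layout from the question."""
    body = {
        'profile': {
            'name': {
                'title': guest['title'],
                'firstName': guest['first_name'],
                'lastName': guest['last_name'],
            },
            # Passed through unchanged; the example body shows a number
            # here, but guests_info holds strings such as "18+".
            'age': guest['age'],
        },
    }
    body.update(SHARED)
    return body


guests_info = [
    {'title': 'Mr.', 'first_name': 'John', 'mi': None,
     'last_name': 'Smith', 'suffix': 'Sr.', 'age': '18+'},
    {'title': 'Mr.', 'first_name': 'James', 'mi': None,
     'last_name': 'Jones', 'suffix': None, 'age': '18+'},
]

# The loop visits each dictionary in turn; no manual counting is needed.
bodies = [json.dumps(build_body(guest)) for guest in guests_info]
print(len(bodies))
```

Each string in bodies could then be sent as the payload of one POST request per guest.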

Extracting values from deeply nested JSON structures

This is a structure I'm getting from elsewhere, that is, a list of deeply nested dictionaries:
{
"foo_code": 404,
"foo_rbody": {
"query": {
"info": {
"acme_no": "444444",
"road_runner": "123"
},
"error": "no_lunch",
"message": "runner problem."
}
},
"acme_no": "444444",
"road_runner": "123",
"xyzzy_code": 200,
"xyzzy_rbody": {
"api": {
"items": [
{
"desc": "OK",
"id": 198,
"acme_no": "789",
"road_runner": "123",
"params": {
"bicycle": "2wheel",
"willie": "hungry",
"height": "1",
"coyote_id": "1511111"
},
"activity": "TRAP",
"state": "active",
"status": 200,
"type": "chase"
}
]
}
}
}
{
"foo_code": 200,
"foo_rbody": {
"query": {
"result": {
"acme_no": "260060730303258",
"road_runner": "123",
"abyss": "26843545600"
}
}
},
"acme_no": "260060730303258",
"road_runner": "123",
"xyzzy_code": 200,
"xyzzy_rbody": {
"api": {
"items": [
{
"desc": "OK",
"id": 198,
"acme_no": "789",
"road_runner": "123",
"params": {
"bicycle": "2wheel",
"willie": "hungry",
"height": "1",
"coyote_id": "1511111"
},
"activity": "TRAP",
"state": "active",
"status": 200,
"type": "chase"
}
]
}
}
}
Asking for different structures is out of the question (legacy APIs etc.).
So I'm wondering if there's some clever way of extracting selected values from such a structure.
The candidates I was thinking of:
1. Flatten particular dictionaries, building composite keys, something like:
{
"foo_rbody.query.info.acme_no": "444444",
"foo_rbody.query.info.road_runner": "123",
...
}
Pro: every value is reachable with one access, and if a predictable key is not there, it means that the structure was not there (as you might have noticed, dictionaries may have different structures depending on whether the operation succeeded, an error happened, etc.).
Con: what to do with lists?
2. Use some recursive function that would do successive key lookups, say by "foo_rbody", then by "query", "info", etc.
Any better candidates?
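For reference, the flattening candidate could be sketched roughly as follows. Writing list positions as [i] inside the composite key is one possible answer to the "what to do with lists?" concern, not the only one; the function name flatten and the trimmed sample data are made up for illustration.

```python
def flatten(obj, prefix=''):
    """Flatten nested dicts and lists into one dict with composite keys.

    Dict keys are joined with '.', list positions are written as '[i]'.
    Empty dicts and lists contribute no entries.
    """
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f'{prefix}.{key}' if prefix else str(key)
            flat.update(flatten(value, path))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f'{prefix}[{index}]'))
    else:
        # Leaf value: record it under the accumulated path.
        flat[prefix] = obj
    return flat


# Trimmed-down version of the first structure in the question.
data = {
    'foo_code': 404,
    'foo_rbody': {'query': {'info': {'acme_no': '444444'}}},
    'xyzzy_rbody': {'api': {'items': [{'desc': 'OK', 'id': 198}]}},
}
flat = flatten(data)
print(flat['foo_rbody.query.info.acme_no'])
print(flat.get('xyzzy_rbody.api.items[0].desc'))
print(flat.get('foo_rbody.query.result.acme_no'))
```

A missing structure then shows up as a missing key, so flat.get() with a default covers the success and error shapes alike.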
You can try this rather trivial function to access nested properties:
import re

def get_path(dct, path):
    # Each match is a tuple (index, key); exactly one of the two is non-empty.
    for i, p in re.findall(r'(\d+)|(\w+)', path):
        dct = dct[p or int(i)]
    return dct
Usage:
value = get_path(data, "xyzzy_rbody.api.items[0].params.bicycle")
Maybe the function byPath in my answer to this post might help you.
You could create your own path mechanism and then query the complicated dict with paths. Example:
/ : get the root object
/key: get the value of root_object['key'], e.g. /foo_code --> 404
/key/key: nesting: /foo_rbody/query/info/acme_no -> 444444
/key[i]: get ith element of that list, e.g. /xyzzy_rbody/api/items[0]/desc --> "OK"
The path can also return a dict which you then run more queries on, etc.
It would be fairly easy to implement recursively.
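A small sketch of that path mechanism is shown below, written iteratively rather than recursively for brevity. The function name query, the segment regex, and the trimmed sample data are assumptions for illustration; keys containing '/' or square brackets are not supported.

```python
import re

# One path segment: a key optionally followed by list indices, e.g. items[0].
_SEGMENT = re.compile(r'^([^\[\]]+)((?:\[\d+\])*)$')


def query(root, path):
    """Resolve a '/'-separated path such as '/xyzzy_rbody/api/items[0]/desc'."""
    current = root
    for segment in path.strip('/').split('/'):
        if not segment:
            continue  # '/' alone refers to the root object
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f'bad path segment: {segment!r}')
        key, indices = match.groups()
        current = current[key]
        # Apply any trailing [i] lookups in order.
        for index in re.findall(r'\[(\d+)\]', indices):
            current = current[int(index)]
    return current


# Trimmed-down version of the first structure in the question.
data = {
    'foo_code': 404,
    'foo_rbody': {'query': {'info': {'acme_no': '444444'}}},
    'xyzzy_rbody': {'api': {'items': [{'desc': 'OK', 'id': 198}]}},
}
print(query(data, '/foo_code'))
print(query(data, '/foo_rbody/query/info/acme_no'))
print(query(data, '/xyzzy_rbody/api/items[0]/desc'))
```

Because a query can return a dict, the result of one call can be passed back in as the root of the next.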
I can think of two more solutions:
You can try the package Pynq, described here - structured query language for JSON (in Python). As far as I understand, it's some kind of LINQ for Python.
You may also try to convert your JSON to XML and then use the XQuery language to get data from it - XQuery library under Python

Sort a list of dictionaries by a key, based on the order of keys in a separate list in Python

I have a list of dictionaries which looks something like this:
[
{
"format": "format1",
"code": "tr"
},
{
"format": "format2",
"code": "hc"
},
{
"format": "format3",
"code": "bx"
},
{
"format": "format4",
"code": "mm"
},
{
"format": "format5",
"code": "el"
}
]
I need to order this list based on the value of the code key, but the order of the codes is determined by a separate list:
code_order = ["mm", "hc", "el", "tr", "bx"]
So the final list should look like this:
[
{
"format": "format4",
"code": "mm"
},
{
"format": "format2",
"code": "hc"
},
{
"format": "format5",
"code": "el"
},
{
"format": "format1",
"code": "tr"
},
{
"format": "format3",
"code": "bx"
}
]
Does anyone have any suggestions on how to achieve this? I'm having a difficult time figuring out how to do this kind of sort.
Python 2.7+:
lookup = {s: i for i, s in enumerate(code_order)}
print(sorted(l, key=lambda o: lookup[o['code']]))
Older:
lookup = dict((s, i) for i, s in enumerate(code_order))
print sorted(l, key=lambda o: lookup[o['code']])
If l is your list of dicts, then
sorted(l, key=lambda d: code_order.index(d['code']))
should do the trick. Read that as:
key is the function that looks up code in the given dict d, then checks the index of that code in code_order, so the final sort is by those indices.
(If code_order gets really large, then keep in mind that list.index takes linear time so you'd better replace it by a dict. But for this short code_order, it shouldn't matter.)
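Put together with the data from the question, the dictionary-based variant might look like this. The name formats stands in for the list called l above, and the fallback for codes missing from code_order is an addition for illustration, not part of either answer.

```python
code_order = ["mm", "hc", "el", "tr", "bx"]
formats = [
    {"format": "format1", "code": "tr"},
    {"format": "format2", "code": "hc"},
    {"format": "format3", "code": "bx"},
    {"format": "format4", "code": "mm"},
    {"format": "format5", "code": "el"},
]

# Map each code to its position once, so every key lookup is O(1).
position = {code: i for i, code in enumerate(code_order)}

# Codes missing from code_order sort last instead of raising KeyError.
ordered = sorted(
    formats,
    key=lambda d: position.get(d["code"], len(code_order)),
)
print([d["code"] for d in ordered])
```

Because sorted is stable, entries that share a code, or that all fall into the "missing" bucket, keep their original relative order.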
