Python String to List of Dicts - python

I have reviewed a number of similar questions on stackoverflow, but been unable to locate an answer which applies to my data/string.
I have a string which is effectively a list of dictionaries. In the fields, numbers are not surrounded by double quotes. If I try to use ast to evaluate the string, part of the string is cut off and I am unsure why. Could someone help me determine an appropriate way to read in this string and create a list of dicts.
Thanks,
>>> print(ascii_data)
[{"measurement": "cpu_load_short","tags": {"host": "server999","region": "us-west-1"},"fields": {"value": 0.99}},{"measurement": "cpu_load_short","tags": {"host": "server888","region": "us-east-1"},"fields": {"value": 0.88}}]
>>> x = ast.literal_eval(ascii_data)
>>> print(x)
[{'fields': {'value': 0.99}, 'tags': {'host': 'server999', 'region': 'us-west-1'}, 'measurement': 'cpu_load_short'}, {'fields': {'value': 0.88}, 'tags': {'host': 'server888', 'region': 'us-east-1'}, 'measurement': 'cpu_load_short'}]

Given:
>>> s
'[{"measurement": "cpu_load_short","tags": {"host": "server999","region": "us-west-1"},"fields": {"value": 0.99}},{"measurement": "cpu_load_short","tags": {"host": "server888","region": "us-east-1"},"fields": {"value": 0.88}}]'
You can use json
>>> import json
>>> json.loads(s)
[{u'fields': {u'value': 0.99}, u'tags': {u'host': u'server999', u'region': u'us-west-1'}, u'measurement': u'cpu_load_short'}, {u'fields': {u'value': 0.88}, u'tags': {u'host': u'server888', u'region': u'us-east-1'}, u'measurement': u'cpu_load_short'}]
Or ast:
>>> import ast
>>> ast.literal_eval(s)
[{'fields': {'value': 0.99}, 'tags': {'host': 'server999', 'region': 'us-west-1'}, 'measurement': 'cpu_load_short'}, {'fields': {'value': 0.88}, 'tags': {'host': 'server888', 'region': 'us-east-1'}, 'measurement': 'cpu_load_short'}]
And they produce the same Python data structure (at least with ascii input...):
>>> json.loads(s)==ast.literal_eval(s)
True
Since in each case the result is a Python dict know that the order may be different than the string's order. Python dicts are unordered and will usually be different than the creation order (at least prior to Python 3.6).
Under Python 3.6, they resulting dict is in the same order:
>>> json.loads(s)
[{'measurement': 'cpu_load_short', 'tags': {'host': 'server999', 'region': 'us-west-1'}, 'fields': {'value': 0.99}}, {'measurement': 'cpu_load_short', 'tags': {'host': 'server888', 'region': 'us-east-1'}, 'fields': {'value': 0.88}}]
>>> ast.literal_eval(s)
[{'measurement': 'cpu_load_short', 'tags': {'host': 'server999', 'region': 'us-west-1'}, 'fields': {'value': 0.99}}, {'measurement': 'cpu_load_short', 'tags': {'host': 'server888', 'region': 'us-east-1'}, 'fields': {'value': 0.88}}]
Python 3.6 is great...

Use json.
In [1]: s = '''[{"measurement": "cpu_load_short","tags": {"host": "server999","region": "us-west-1"},"fields": {"value": 0.99}},{"measuremen
...: t": "cpu_load_short","tags": {"host": "server888","region": "us-east-1"},"fields": {"value": 0.88}}]'''
In [2]: import json
In [3]: import pprint
In [4]: pprint.pprint(json.loads(s))
[{'fields': {'value': 0.99},
'measurement': 'cpu_load_short',
'tags': {'host': 'server999', 'region': 'us-west-1'}},
{'fields': {'value': 0.88},
'measurement': 'cpu_load_short',
'tags': {'host': 'server888', 'region': 'us-east-1'}}]
In [11]: json.loads(s)[0]['tags']['host']
Out[11]: 'server999'

How about json.loads?
j = json.loads(ascii_data)
ast.literal_eval may not be the best option. If your data source comes from some API, it would definitely be json format.
And if the order of dict keys matters to you, try specifying the object_pairs_hook argument to JSONDecoder. (ref: Can I get JSON to load into an OrderedDict in Python?)

Related

JSON viewers don't accept my pattern even after dict going through json.dumps() + json.loads()

The result when printing after a = json.dumps(dicter) and print(json.loads(a)) is this:
{
'10432981': {
'tournament': {
'name': 'Club Friendly Games',
'slug': 'club-friendly-games',
'category': {
'name': 'World',
'slug': 'world',
'sport': {
'name': 'Football',
'slug': 'football',
'id': 1
},
'id': 1468,
'flag': 'international'
},
'uniqueTournament': {
'name': 'Club Friendly Games',
'slug': 'club-friendly-games',
'category': {
'name': 'World',
'slug': 'world',
'sport': {
'name': 'Football',
'slug': 'football',
'id': 1
},
'id': 1468,
'flag': 'international'
},
'userCount': 0,
'hasPositionGraph': False,
'id': 853,
'hasEventPlayerStatistics': False,
'displayInverseHomeAwayTeams': False
},
'priority': 0,
'id': 86
}
}
}
But when trying to read in any json viewer, they warn that the format is incorrect but don't specify where the problem is.
If it doesn't generate any error when converting the dict to JSON and not even when reading it, why do views warn of failure?
You must enclose the strings using double quotes ("). The json.loads returns a python dictionary, so it is not a valid JSON object. If you want to get valid JSON you can get the string that json.dumps returns.

How to parse json.response dict

I'm looking to parse the following dictionary received json response and am running into issues. My dictionary name is results.
This is a simple json response that appears to be dict.
{'resultType': 'vector', 'result': [{'metric': {'agent_host': 'x.x.x.x', 'cluster_name': 'test_cluser', 'device_type': 'switch', 'hostname': 'myswitch', 'ifName': 'xe-0/0/10', 'instance': 'telegraf:1111', 'job': 'telegraf', 'rack_name': 'test_rack', 'site_name': 'test_site'}, 'value': [1631917506.324, '0.00009262475396549728']}]}
Type confirms that:
<class 'dict'>
Ultimately what I'd like to do is something along the lines of:
for key, value in results.items():
(rx_error, rx_error_freq) = value[16]
In order to get the value 0.00009262475396549728 from above. How would I go about doing this?
One great way to visualize nested dictionaries is to use pprint which comes built into python3
from pprint import pprint
d = {'resultType': 'vector', 'result': [{'metric': {'agent_host':
'x.x.x.x', 'cluster_name': 'test_cluser', 'device_type': 'switch',
'hostname': 'myswitch', 'ifName': 'xe-0/0/10', 'instance':
'telegraf:1111', 'job': 'telegraf', 'rack_name': 'test_rack',
'site_name': 'test_site'}, 'value': [1631917506.324,
'0.00009262475396549728']}]}
pprint(d)
>>> {'result': [{'metric': {'agent_host': 'x.x.x.x',
'cluster_name': 'test_cluser',
'device_type': 'switch',
'hostname': 'myswitch',
'ifName': 'xe-0/0/10',
'instance': 'telegraf:1111',
'job': 'telegraf',
'rack_name': 'test_rack',
'site_name': 'test_site'},
'value': [1631917506.324, '0.00009262475396549728']}],
'resultType': 'vector'}
This allows us to see the different key value pairs a lot more easily. So the data you're looking to access would is
rx_error, rx_error_freq = d["result"][0]["value"]
Loop over the result list to get the value, then print the second element
>>> data = {'resultType': 'vector', 'result': [{'metric': {'agent_host': 'x.x.x.x', 'cluster_name': 'test_cluser', 'device_type': 'switch', 'hostname': 'myswitch', 'ifName': 'xe-0/0/10', 'instance': 'telegraf:1111', 'job': 'telegraf', 'rack_name': 'test_rack', 'site_name': 'test_site'}, 'value': [1631917506.324, '0.00009262475396549728']}]}
>>> for r in data['result']:
... print(r['value'][1])
...
0.00009262475396549728

Python PrettyPrinter shows object address but not the contents

I am trying to pretty-print a python object by calling:
from pprint import pprint
...
pprint(update)
But the output looks like this:
<telegram.update.Update object at 0xffff967e62b0>
However, using Python's internal print() I get the correct output:
{'update_id': 14191809, 'message': {'message_id': 22222, 'date': 11111, 'chat': {'id': 00000, 'type': 'private', 'username': 'xxxx', 'first_name': 'X', 'last_name': 'Y'}, 'text': '/start', 'entities': [{'type': 'bot_command', 'offset': 0, 'length': 6}], 'caption_entities': [], 'photo': [], 'new_chat_members': [], 'new_chat_photo': [], 'delete_chat_photo': False, 'group_chat_created': False, 'supergroup_chat_created': False, 'channel_chat_created': False, 'from': {'id': 01010101, 'first_name': 'X', 'is_bot': False, 'last_name': 'Y', 'username': 'xxxx', 'language_code': 'en'}}}
Is there a way to make pprint(), show the object-data correctly and formatted?
pprint uses the representation (__repr__() method) of the object while print uses __str__(). What you see in print output is not a dictionary but a string representation of the inner structure of the telegram.update.Update instance.
There is no generic solution to this, but since your question is about a specific library, consulting the relevant docs shows that there is a .to_json() method, so you can do this:
import json
from pprint import pprint
...
pprint(json.loads(update.to_json()))

How to iterate over nested list of dictionaries?

I need to get the 'ids' of this json response,the thing is that, there are many dictionaries with a list of dictionaries inside,how can I do this??(PS:len(items) is 20,so I need to get the 20 ids in the form of a dictionary.
{'playlists': {'href': 'https://api.spotify.com/v1/search?query=rewind-The%25&type=playlist&offset=0&limit=20',
'items': [{'collaborative': False,
'description': 'Remember what you listened to in 2010? Rewind and rediscover your favorites.',
'external_urls': {'spotify': 'https://open.spotify.com/playlist/37i9dQZF1DXc6IFF23C9jj'},
'href': 'https://api.spotify.com/v1/playlists/37i9dQZF1DXc6IFF23C9jj',
'id': '37i9dQZF1DXc6IFF23C9jj',
'images': [{'height': None,
'url': 'https://i.scdn.co/image/ab67706f0000000327ba1078080355421d1a49e2',
'width': None}],
'name': 'Rewind - The Sound of 2010',
'owner': {'display_name': 'Spotify',
'external_urls': {'spotify': 'https://open.spotify.com/user/spotify'},
'href': 'https://api.spotify.com/v1/users/spotify',
'id': 'spotify',
'type': 'user',
'uri': 'spotify:user:spotify'},
'primary_color': None,
'public': None,
'snapshot_id': 'MTU5NTUzMTE1OSwwMDAwMDAwMGQ0MWQ4Y2Q5OGYwMGIyMDRlOTgwMDk5OGVjZjg0Mjdl',
'tracks': {'href': 'https://api.spotify.com/v1/playlists/37i9dQZF1DXc6IFF23C9jj/tracks',
'total': 100},
'type': 'playlist',
'uri': 'spotify:playlist:37i9dQZF1DXc6IFF23C9jj'},
Im trying to get it through this:
dict={'id':''}
for playlists in playlist_data['playlists']:
for items in playlists['items']:
for item in items:
for dic in range(len(item)):
for id in dic['id']:
dict.update('id')
print(dict)
I get this error:
TypeError: string indices must be integers ```
Try something like this:
ids = [item["id"] for item in json_data["playlists"]["items"]]
This is called a list comprehension.
You want to iterate over all of the "items" within the "playlists" key.
You can access that list of items:
json_data["playlists"]["items"]
Then you iterate over each item within items:
for item in json_data["playlists"]["items"]
Then you access the "id" of each item:
item["id"]
You can index an object using the keys of object. I can see there are two places where id is present in an object. To retrieve those two ids and store them in a dictionary format, you can use the following approach -
_json = {
'playlists': {
'href': 'https://api.spotify.com/v1/search?query=rewind-The%25&type=playlist&offset=0&limit=20',
'items': [{
'collaborative': False,
'description': 'Remember what you listened to in 2010? Rewind and rediscover your favorites.',
'external_urls': {
'spotify': 'https://open.spotify.com/playlist/37i9dQZF1DXc6IFF23C9jj'
},
'href': 'https://api.spotify.com/v1/playlists/37i9dQZF1DXc6IFF23C9jj',
'id': '37i9dQZF1DXc6IFF23C9jj',
'images': [{
'height': None,
'url': 'https://i.scdn.co/image/ab67706f0000000327ba1078080355421d1a49e2',
'width': None
}],
'name': 'Rewind - The Sound of 2010',
'owner': {
'display_name': 'Spotify',
'external_urls': {
'spotify': 'https://open.spotify.com/user/spotify'
},
'href': 'https://api.spotify.com/v1/users/spotify',
'id': 'spotify',
'type': 'user',
'uri': 'spotify:user:spotify'
},
'primary_color': None,
'public': None,
'snapshot_id': 'MTU5NTUzMTE1OSwwMDAwMDAwMGQ0MWQ4Y2Q5OGYwMGIyMDRlOTgwMDk5OGVjZjg0Mjdl',
'tracks': {
'href': 'https://api.spotify.com/v1/playlists/37i9dQZF1DXc6IFF23C9jj/tracks',
'total': 100
},
'type': 'playlist',
'uri': 'spotify:playlist:37i9dQZF1DXc6IFF23C9jj'
}, ]
}
}
res_dict = {'id':[items['id'], items['owner']['id']] for items in _json['playlists']['items']}
print(res_dict)
OUTPUT :
{'id': ['37i9dQZF1DXc6IFF23C9jj', 'spotify']}
If you don't need the second id that's present in the json object, you can just remove it from above res_dict and modify it as -
res_dict = {'id':items['id'] for items in _json['playlists']['items']}
This will only fetch the id present in the items array as key of any element and not any further nested ids (like items[i]->owner->id won't be in the final res as it was in the fist case ).

Accessing YAML data in Python

I have a YAML file that parses into an object, e.g.:
{'name': [{'proj_directory': '/directory/'},
{'categories': [{'quick': [{'directory': 'quick'},
{'description': None},
{'table_name': 'quick'}]},
{'intermediate': [{'directory': 'intermediate'},
{'description': None},
{'table_name': 'intermediate'}]},
{'research': [{'directory': 'research'},
{'description': None},
{'table_name': 'research'}]}]},
{'nomenclature': [{'extension': 'nc'}
{'handler': 'script'},
{'filename': [{'id': [{'type': 'VARCHAR'}]},
{'date': [{'type': 'DATE'}]},
{'v': [{'type': 'INT'}]}]},
{'data': [{'time': [{'variable_name': 'time'},
{'units': 'minutes since 1-1-1980 00:00 UTC'},
{'latitude': [{'variable_n...
I'm having trouble accessing the data in python and regularly see the error TypeError: list indices must be integers, not str
I want to be able to access all elements corresponding to 'name' so to retrieve each data field I imagine it would look something like:
import yaml
settings_stream = open('file.yaml', 'r')
settingsMap = yaml.safe_load(settings_stream)
yaml_stream = True
print 'loaded settings for: ',
for project in settingsMap:
print project + ', ' + settingsMap[project]['project_directory']
and I would expect each element would be accessible via something like ['name']['categories']['quick']['directory']
and something a little deeper would just be:
['name']['nomenclature']['data']['latitude']['variable_name']
or am I completely wrong here?
The brackets, [], indicate that you have lists of dicts, not just a dict.
For example, settingsMap['name'] is a list of dicts.
Therefore, you need to select the correct dict in the list using an integer index, before you can select the key in the dict.
So, giving your current data structure, you'd need to use:
settingsMap['name'][1]['categories'][0]['quick'][0]['directory']
Or, revise the underlying YAML data structure.
For example, if the data structure looked like this:
settingsMap = {
'name':
{'proj_directory': '/directory/',
'categories': {'quick': {'directory': 'quick',
'description': None,
'table_name': 'quick'}},
'intermediate': {'directory': 'intermediate',
'description': None,
'table_name': 'intermediate'},
'research': {'directory': 'research',
'description': None,
'table_name': 'research'},
'nomenclature': {'extension': 'nc',
'handler': 'script',
'filename': {'id': {'type': 'VARCHAR'},
'date': {'type': 'DATE'},
'v': {'type': 'INT'}},
'data': {'time': {'variable_name': 'time',
'units': 'minutes since 1-1-1980 00:00 UTC'}}}}}
then you could access the same value as above with
settingsMap['name']['categories']['quick']['directory']
# quick

Categories

Resources