Extracting information from a JSON file - python

I got results through results = requests.get(url).json()
results look like this:
{'type': 'FeatureCollection', 'crs': {'type': 'name', 'properties':
{'name': 'EPSG:4326'}}, 'features': [{'type': 'Feature',
'properties': {'kode': '0101',
'navn': 'København',
'region_kode': '1084.0',
'region_navn': 'Hovedstaden'}, 'bbox': [12.453042062098154,
55.612994971371606,
12.734252598475942,
55.732491190632494]}]}
with results['features'], I am getting this
[{'type': 'Feature', 'properties': {'kode': '0101', 'navn':
'København', 'region_kode': '1084.0', 'region_navn':
'Hovedstaden'}, 'bbox': [12.453042062098154,
55.612994971371606,
12.734252598475942,
55.732491190632494]}]
I want to get the value of navn,
and I tried all of these combinations:
results['features']['properties']['navn']
results['features']['navn']
results['features']['properties']
They all show the same error message: list indices must be integers or slices, not str
Apparently, results['features'] is a list with a length of 1.
How can I get to the navn information?
As you can imagine, I want to make several calls.

The results['features'] object is a list, so try results['features'][0]['properties']['navn']
This selects the first element of the list (index 0), which is a dictionary, then its 'properties' dictionary, and from that the 'navn' key. The result is the value of 'navn'.
Note that Python lists are written between [] with items separated by commas, and Python dictionaries are written between {} and consist of key: value pairs separated by commas.

try this
results['features'][0]['properties']['navn']

You can try code below:
results['features'][0]['properties']['navn']

You should try accessing the first element of the list in results['features'], i.e.:
results['features'][0]['properties']['navn']
Full code:
results = {'type': 'FeatureCollection', 'crs': {'type': 'name', 'properties': {'name': 'EPSG:4326'}}, 'features': [{'type': 'Feature',
'properties': {'kode': '0101', 'navn': 'København', 'region_kode': '1084.0', 'region_navn': 'Hovedstaden'}, 'bbox': [12.453042062098154, 55.612994971371606, 12.734252598475942, 55.732491190632494]}]}
print(results['features'][0]['properties']['navn'])
# København

results = {'type': 'FeatureCollection', 'crs': {'type': 'name', 'properties': {'name': 'EPSG:4326'}},
'features': [{'type': 'Feature',
'properties': {'kode': '0101', 'navn': 'København', 'region_kode': '1084.0',
'region_navn': 'Hovedstaden'},
'bbox': [12.453042062098154, 55.612994971371606, 12.734252598475942, 55.732491190632494]}]}
navn = results['features'][0]['properties']['navn']
print(navn)
You got the error because features holds a list, and a list cannot be indexed with a str key. To reach properties inside features, index the list with [0] first; that gives you the dictionary, and from it you can get the value.
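Since the question mentions making several calls, the same lookup can be wrapped so that it copes with any number of features. This is only a minimal sketch: the helper name extract_navn is hypothetical, and the sample dict stands in for whatever requests.get(url).json() returns.

```python
def extract_navn(results):
    """Return the 'navn' value of every feature in a FeatureCollection dict."""
    names = []
    for feature in results.get('features', []):
        # .get() avoids a KeyError if a feature lacks 'properties' or 'navn'
        navn = feature.get('properties', {}).get('navn')
        if navn is not None:
            names.append(navn)
    return names


# Sample shaped like the response in the question (trimmed for brevity)
sample = {'type': 'FeatureCollection',
          'features': [{'type': 'Feature',
                        'properties': {'kode': '0101', 'navn': 'København'}}]}

names = extract_navn(sample)
print(names)
```

Returning a list rather than a single string means the caller does not have to assume that results['features'] always has exactly one element.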

Related

Using json_normalize() for missing keys Python Pandas DataFrame

I have this snapshot of my dataset
test={'data': [{'name': 'john',
'insights': {'data': [{'account_id': '123',
'test_id': '456',
'date_start': '2022-12-31',
'date_stop': '2023-01-29',
'impressions': '4070',
'spend': '36.14'}],
'paging': {'cursors': {'before': 'MAZDZD', 'after': 'MAZDZD'}}},
'status': 'ACTIVE',
'id': '789'},
{'name': 'jack', 'status': 'PAUSED', 'id': '420'}]
}
I want to create a pandas dataframe where the columns are the name, date_start, date_stop, impressions, and spend.
When I tried json_normalize(), it raised an error because some of the keys are missing when 'status': 'PAUSED'. Is there a way to drop the records whose keys are missing, or another way of using json_normalize()? I tried errors='ignore' but that doesn't work either.
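One possible way around the missing keys, sketched here in plain Python rather than with json_normalize(), is to build the rows by hand and skip entries that have no 'insights'. The variable name rows is an assumption for illustration; the data is the snapshot from the question.

```python
test = {'data': [{'name': 'john',
                  'insights': {'data': [{'account_id': '123',
                                         'test_id': '456',
                                         'date_start': '2022-12-31',
                                         'date_stop': '2023-01-29',
                                         'impressions': '4070',
                                         'spend': '36.14'}],
                               'paging': {'cursors': {'before': 'MAZDZD',
                                                      'after': 'MAZDZD'}}},
                  'status': 'ACTIVE',
                  'id': '789'},
                 {'name': 'jack', 'status': 'PAUSED', 'id': '420'}]}

wanted = ['date_start', 'date_stop', 'impressions', 'spend']

rows = []
for item in test['data']:
    # Entries without 'insights' (the PAUSED one here) contribute no rows
    for record in item.get('insights', {}).get('data', []):
        row = {'name': item['name']}
        # .get() leaves None where an individual field is absent
        row.update({key: record.get(key) for key in wanted})
        rows.append(row)

print(rows)
```

The resulting list of dicts can be passed to pd.DataFrame(rows). Whether paused entries should instead appear as rows of missing values is a choice the question leaves open; this sketch simply drops them.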

Python: Flatten multiple nested dict and append

Hello all,
I am looking for help in trying to flatten multiple nested dicts and append them to a new list.
I have multiple dicts, loaded from a GeoJSON file like this:
data = json.load(open("xy.geojson"))
They are all structured like this:
{'type': 'Feature', 'properties': {'tags': {'highway': 'cycleway', 'lit': 'yes', 'source': 'survey 08.2018 and Esri', 'surface': 'paving_stones', 'traffic_sign': 'DE:237 Radweg'}}, 'geometry': {'type': 'LineString', 'coordinates': [[6.7976974, 51.1935231], [6.7977131, 51.1935542], [6.7977735, 51.1935719], [6.7978679, 51.193578], [6.798005, 51.1936044], [6.7982118, 51.1936419], [6.7983474, 51.1936511], [6.7984899, 51.1936365], [6.7985761, 51.193623], [6.7986739, 51.1936186], [6.7987574, 51.1936188], [6.7988269, 51.1936342], [6.7988893, 51.1936529], [6.7989378, 51.1936778], [6.7990085, 51.1937739]]}}
Now I'd like to flatten the 'tags'-part of the dict, so I get:
{'type': 'Feature', 'properties': {'highway': 'cycleway', 'lit': 'yes', 'source': 'survey 08.2018 and Esri', 'surface': 'paving_stones', 'traffic_sign': 'DE:237 Radweg'}, 'geometry': {'type': 'LineString', 'coordinates': [[6.7976974, 51.1935231], [6.7977131, 51.1935542], [6.7977735, 51.1935719], [6.7978679, 51.193578], [6.798005, 51.1936044], [6.7982118, 51.1936419], [6.7983474, 51.1936511], [6.7984899, 51.1936365], [6.7985761, 51.193623], [6.7986739, 51.1936186], [6.7987574, 51.1936188], [6.7988269, 51.1936342], [6.7988893, 51.1936529], [6.7989378, 51.1936778], [6.7990085, 51.1937739]]}}
What I've done so far is setting up a new list and starting a for-loop:
filtered = []
for geo in data['features']:
But how can I flatten geo['properties']['tags'] within the loop and append the result for each dict to filtered?
Thank you all so much, appreciate your help!
Clemens
There's probably a better way, but this seems to work:
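filtered = []
for geo in data["features"]:
    # Shallow copy, so the top-level keys of the original feature are untouched
    updated = dict(**geo)
    # Replace the whole 'properties' dict with its 'tags' sub-dict
    updated["properties"] = geo["properties"]["tags"]
    filtered.append(updated)
print(filtered)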
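A slightly more defensive variant of the loop above is sketched below. It assumes that some features might lack a 'tags' key or carry other properties worth keeping; the helper name flatten_tags and the shortened sample data are hypothetical.

```python
import copy


def flatten_tags(feature):
    """Return a copy of the feature with properties['tags'] merged one level up."""
    updated = copy.deepcopy(feature)      # leave the input feature unmodified
    props = updated.get('properties', {})
    tags = props.pop('tags', {})          # an absent 'tags' key is treated as empty
    props.update(tags)                    # tag keys overwrite same-named properties
    updated['properties'] = props
    return updated


# Shortened stand-in for the GeoJSON loaded in the question
data = {'features': [
    {'type': 'Feature',
     'properties': {'tags': {'highway': 'cycleway', 'lit': 'yes'}},
     'geometry': {'type': 'LineString',
                  'coordinates': [[6.7976974, 51.1935231],
                                  [6.7977131, 51.1935542]]}},
    {'type': 'Feature', 'properties': {}, 'geometry': None},
]}

filtered = [flatten_tags(geo) for geo in data['features']]
print(filtered[0]['properties'])
```

The deep copy costs more than dict(**geo) but guarantees that editing the flattened result never changes the loaded data.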

How to perform an assertion to verify an item is in a list of dicts in Python

I am trying to figure out how to do an assertion to see if a number exists in a list.
So my list looks like:
data = [{'value': Decimal('4.21'), 'Type': 'sale'},
{'value': Decimal('84.73'), 'Type': 'sale'},
{'value': Decimal('70.62'), 'Type': 'sale'},
{'value': Decimal('15.00'), 'Type': 'credit'},
{'value': Decimal('2.21'), 'Type': 'credit'},
{'value': Decimal('4.21'), 'Type': 'sale'},
{'value': Decimal('84.73'), 'Type': 'sale'},
{'value': Decimal('70.62'), 'Type': 'sale'},
{'value': Decimal('15.00'), 'Type': 'credit'},
{'value': Decimal('2.21'), 'Type': 'credit'}]
Now I am trying to iterate through the list like:
for i in data:
    s = i['value']
    print(s)
    assert 2.21 in i['value'], "Value should be there"
I am somehow only getting the first number returned for "value" i.e. 4.21
You have two problems as other commenters pointed out. You compare the wrong data types (str against Decimal, or after your edit, float against Decimal) and you also terminate on first failure. You probably wanted to write:
assert Decimal('2.21') in (d["value"] for d in data)
This will extract the value of the "value" key from each sub-dictionary inside the list and search for Decimal('2.21') in them.
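To illustrate the data-type point, here is a small sketch on a shortened copy of the list. The names float_matches and decimal_matches are made up for the example.

```python
from decimal import Decimal

data = [{'value': Decimal('4.21'), 'Type': 'sale'},
        {'value': Decimal('15.00'), 'Type': 'credit'},
        {'value': Decimal('2.21'), 'Type': 'credit'}]

# The float 2.21 is not exactly equal to Decimal('2.21'), so nothing matches
float_matches = any(d['value'] == 2.21 for d in data)

# Comparing Decimal against Decimal finds the entry
decimal_matches = any(d['value'] == Decimal('2.21') for d in data)

# One assertion over the whole list, instead of one per element
assert decimal_matches, "Value should be there"
print(float_matches, decimal_matches)
```

Using any() over the whole list keeps the check to a single assertion, so it no longer stops at the first element that does not match.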

Converting Dictionary Values to new Dictionary

I am pulling json data from an API and have a number of columns in my dataframe that contain dictionaries. These dictionaries are written so that the id and the value are two separate entries in the dictionary like so:
{'id': 'AnnualUsage', 'value': '13071'}
Some of the rows for these columns contain only one dictionary entry like shown above, but others can contain up to 7:
[{'id': 'AnnualUsage', 'value': '13071'},
{'id': 'TestId', 'value': 'Z13753'},
{'id': 'NumberOfMe', 'value': '3'},
{'id': 'Prem', 'value': '960002'},
{'id': 'ProjectID', 'value': '0039'},
{'id': 'Region', 'value': 'CHR'},
{'id': 'Tariff', 'value': 'Multiple'},
{'id': 'Number', 'value': '06860702'}]
When I attempt to break this dictionary down into separate column attributes
CTG_df2 = pd.concat([CTG_df['id'], CTG_df['applicationUDFs'].apply(pd.Series)], axis=1)
I end up with columns in a dataframe each containing a dictionary of the above entry i.e.
{'id': 'AnnualUsageDE', 'value': '13071'}
Is there a way for me to convert my dictionary values into new key-value pairs? For instance I would like to convert from:
{'id': 'AnnualUsageDE', 'value': '13071'}
to
{'AnnualUsageDE': '13071'}
If this is possible I will then be able to create new columns from these attributes.
You can do a dict comprehension. From your list of dicts, compose a new dict where the key is the id of each element and the value is the value of each element.
original = [{'id': 'AnnualUsage', 'value': '13071'},
{'id': 'TestId', 'value': 'Z13753'},
{'id': 'NumberOfMe', 'value': '3'},
{'id': 'Prem', 'value': '960002'},
{'id': 'ProjectID', 'value': '0039'},
{'id': 'Region', 'value': 'CHR'},
{'id': 'Tariff', 'value': 'Multiple'},
{'id': 'Number', 'value': '06860702'}]
newdict = {subdict['id']: subdict['value'] for subdict in original}
print(newdict)
# {'AnnualUsage': '13071',
# 'Number': '06860702',
# 'NumberOfMe': '3',
# 'Prem': '960002',
# 'ProjectID': '0039',
# 'Region': 'CHR',
# 'Tariff': 'Multiple',
# 'TestId': 'Z13753'}
You can iterate through the values and set each of them to the dictionary value:
newdict = dict()
for x in original:
    newdict[x["id"]] = x["value"]
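Because the question says a cell may hold either a single dict or a list of them, the comprehension can be wrapped in a small helper that accepts both. This is a sketch; the function name to_mapping is hypothetical.

```python
def to_mapping(entry):
    """Turn one {'id', 'value'} dict, or a list of them, into {id: value}."""
    if isinstance(entry, dict):
        entry = [entry]  # treat a lone dict as a one-element list
    return {item['id']: item['value'] for item in entry}


single = {'id': 'AnnualUsage', 'value': '13071'}
several = [{'id': 'AnnualUsage', 'value': '13071'},
           {'id': 'TestId', 'value': 'Z13753'},
           {'id': 'Region', 'value': 'CHR'}]

print(to_mapping(single))
print(to_mapping(several))
```

With the dataframe from the question, something along the lines of CTG_df['applicationUDFs'].apply(to_mapping).apply(pd.Series) might then expand the ids into columns, though that has not been checked against the real data.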

Python jsonpath Filter Expression

Background:
I have the following example data structure in JSON:
{'sensor' : [
{'assertions_enabled': 'ucr+',
'deassertions_enabled': 'ucr+',
'entity_id': '7.0',
'lower_critical': 'na',
'lower_non_critical': 'na',
'lower_non_recoverable': 'na',
'reading_type': 'analog',
'sensor_id': 'SR5680 TEMP (0x5d)',
'sensor_reading': {'confidence_interval': '0.500',
'units': 'degrees C',
'value': '42'},
'sensor_type': 'Temperature',
'status': 'ok',
'upper_critical': '59.000',
'upper_non_critical': 'na',
'upper_non_recoverable': 'na'}
]}
The sensor list will actually contain many of these dicts containing sensor info.
Problem:
I'm trying to query the list using jsonpath to return me a subset of sensor dicts that have sensor_type=='Temperature' but I'm getting 'False' returned (no match). Here's my jsonpath expression:
results = jsonpath.jsonpath(ipmi_node, "$.sensor[?(@.['sensor_type']=='Temperature')]")
When I remove the filter expression and just use "$.sensor.*" I get a list of all sensors, so I'm sure the problem is in the filter expression.
I've scanned multiple sites/posts for examples and I can't seem to find anything specific to Python (Javascript and PHP seem to be more prominent). Could anyone offer some guidance please?
The following expression does what you need (notice how the attribute is specified):
jsonpath.jsonpath(ipmi_node, "$.sensor[?(@.sensor_type=='Temperature')]")
I am using jsonpath-ng, which seems to be actively maintained (as of 23.11.20). Here is a solution based on Pedro's jsonpath expression:
data = {
'sensor' : [
{'sensor_type': 'Temperature', 'id': '1'},
{'sensor_type': 'Humidity' , 'id': '2'},
{'sensor_type': 'Temperature', 'id': '3'},
{'sensor_type': 'Density' , 'id': '4'}
]}
from jsonpath_ng.ext import parser
for match in parser.parse("$.sensor[?(@.sensor_type=='Temperature')]").find(data):
    print(match.value)
Output:
{'sensor_type': 'Temperature', 'id': '1'}
{'sensor_type': 'Temperature', 'id': '3'}
NOTE: besides the basic documentation on the project's homepage, I found additional information in the tests.
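If installing a JSONPath library is not an option, the same subset can be selected with a plain list comprehension. This swaps JSONPath for ordinary Python filtering, so it is an alternative rather than a fix for the filter expression; the variable name temperature_sensors is made up for the example.

```python
ipmi_node = {
    'sensor': [
        {'sensor_type': 'Temperature', 'id': '1'},
        {'sensor_type': 'Humidity', 'id': '2'},
        {'sensor_type': 'Temperature', 'id': '3'},
        {'sensor_type': 'Density', 'id': '4'},
    ]}

# Keep only the sensor dicts whose 'sensor_type' is 'Temperature';
# .get() skips entries that lack the key instead of raising KeyError
temperature_sensors = [sensor for sensor in ipmi_node['sensor']
                       if sensor.get('sensor_type') == 'Temperature']

for sensor in temperature_sensors:
    print(sensor)
```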
