How to extract JSON from a nested JSON file? - python

I am calling an API and getting a response like the below.
{
"status": 200,
"errmsg": "OK",
"data": {
"total": 12,
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example123",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example456",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
}
]
}
}
I need to clean up this JSON a bit and produce a simpler JSON like the one below, where escalatingChainName is the name from the escalatingChain object, so that I can write it to a CSV file.
{
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChainName": "Example123"
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChainName": "Example456"
}
]
}
Is there a JSON function that I can use to copy only the necessary key-value or nested key-values to a new JSON object?
With the below code, I am able to get the details list.
json_response = response.json()
items = json_response['data']
details = items['items']
I can print individual list items using
for x in details:
    print(x)
How do I take it from here to pull only the necessary fields like id, name, priority and the name from escalatingchain to create a new list or JSON?

There is no existing function that will do what you want, so you'll need to write one. Fortunately that's not too hard in this case — basically you just create a list of new items by extracting the pieces of data you want from the existing ones.
import json
json_response = """\
{
"status": 200,
"errmsg": "OK",
"data": {
"total": 12,
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example123",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example456",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
}
]
}
}
"""
response = json.loads(json_response)
cleaned = []
for item in response['data']['items']:
    # Copy the wanted top-level fields and pull the name out of the nested object
    cleaned.append({'id': item['id'],
                    'name': item['name'],
                    'priority': item['priority'],
                    'levelStr': item['levelStr'],
                    'escalatingChainId': item['escalatingChainId'],
                    'escalatingChainName': item['escalatingChain']['name']})
print('cleaned:')
print(json.dumps(cleaned, indent=4))
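Since the question mentions writing the result to a CSV file, here is a minimal sketch of that last step using the standard csv module. The helper name to_csv_text is hypothetical, and the sketch assumes every dictionary in the list has the same keys; it writes to an in-memory buffer so it can be run without touching the disk.

```python
import csv
import io

# Sample of the cleaned list produced by the loop above.
cleaned = [
    {'id': 11, 'name': 'BBC', 'priority': 4, 'levelStr': 'All',
     'escalatingChainId': 3, 'escalatingChainName': 'Example123'},
    {'id': 21, 'name': 'CNBC', 'priority': 4, 'levelStr': 'All',
     'escalatingChainId': 3, 'escalatingChainName': 'Example456'},
]

def to_csv_text(rows):
    """Serialize a list of flat dicts to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    # Assumption: all rows share the keys of the first row.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv_text(cleaned)
print(csv_text)
```

To write a real file instead, the same DictWriter calls can be pointed at a file opened with open(path, 'w', newline='').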

You can try:
data = {
"status": 200,
"errmsg": "OK",
"data": {
"total": 12,
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": False,
"throttlingAlerts": 20,
"enableThrottling": True,
"name": "Example123",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": False,
"throttlingAlerts": 20,
"enableThrottling": True,
"name": "Example456",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
}
]
}
}
for single_item in data["data"]["items"]:
    print(single_item["id"])
    print(single_item["name"])
    print(single_item["priority"])
    print(single_item["levelStr"])
    print(single_item["escalatingChain"]["inAlerting"])
    # and so on

There are two ways of approaching this, depending on whether you're dealing with a variable or a .json file, both using Python list and dictionary comprehensions:
Where the data variable (a nested dictionary) is already defined:
# keys you want
to_keep = ['id', 'name', 'priority', 'levelStr', 'escalatingChainId',
           'escalatingChainName']
new_data = [{k: v for k, v in low_dict.items() if k in to_keep}
            for low_dict in data['data']['items']]
# where item is dictionary at lowest level
escalations = [{v + 'Name': k[v]['name']} for k in data['data']['items']
               for v in k if type(k[v]) == dict]
# merge both lists of python dictionaries to produce flattened list of dictionaries
new_data = [{**new, **escl} for new, escl in zip(new_data, escalations)]
Or (since you refer to the json package), if you have saved the response as a .json file:
import json
with open('response.json', 'r') as handl:
    data = json.load(handl)
to_keep = ['id', 'name', 'priority', 'levelStr', 'escalatingChainId',
           'escalatingChainName']
new_data = [{k: v for k, v in low_dict.items() if k in to_keep}
            for low_dict in data['data']['items']]
escalations = [{v + 'Name': k[v]['name']} for k in data['data']['items']
               for v in k if type(k[v]) == dict]
new_data = [{**new, **escl} for new, escl in zip(new_data, escalations)]
Both produce output:
[{'id': 11,
'name': 'BBC',
'priority': 4,
'levelStr': 'All',
'escalatingChainId': 3,
'escalatingChainName': 'Example123'},
{'id': 21,
'name': 'CNBC',
'priority': 4,
'levelStr': 'All',
'escalatingChainId': 3,
'escalatingChainName': 'Example456'}]
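If some items might lack the nested escalatingChain object (an assumption; the sample response always includes it), a more defensive variant can use dict.get so that missing values become None instead of raising KeyError. The helper name flatten_item and the second sample item are hypothetical and exist only to illustrate the missing-key case.

```python
def flatten_item(item, keep=("id", "name", "priority", "levelStr",
                             "escalatingChainId")):
    """Copy the wanted top-level keys and add the nested chain name (hypothetical helper)."""
    flat = {key: item.get(key) for key in keep}
    # Fall back to an empty dict when the nested object is absent or None.
    chain = item.get("escalatingChain") or {}
    flat["escalatingChainName"] = chain.get("name")
    return flat

items = [
    {"id": 11, "name": "BBC", "priority": 4, "levelStr": "All",
     "escalatingChainId": 3,
     "escalatingChain": {"name": "Example123", "id": 3}},
    # Hypothetical item without a nested escalatingChain object.
    {"id": 99, "name": "NoChain", "priority": 1, "levelStr": "All",
     "escalatingChainId": None},
]

flattened = [flatten_item(item) for item in items]
print(flattened)
```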

Related

Split Python Array based on a property

I have a Python List like this:
myList = [
{
"key": 1,
"date": "2020-01-02"
},
{
"key": 2,
"date": "2020-02-02"
},
{
"key": 3,
"date": "2020-01-03"
},
{
"key": 4,
"date": "2020-01-02"
},
{
"key": 5,
"date": "2020-02-02"
},
]
Now I want to split the array based on the property "date". I want my list to look like this
myList = [
[
{
"key": 1,
"date": "2020-01-02"
},
{
"key": 4,
"date": "2020-01-02"
},
],
[
{
"key": 2,
"date": "2020-02-02"
},
{
"key": 5,
"date": "2020-02-02"
},
],
[
{
"key": 3,
"date": "2020-01-03"
},
]
]
So I want a new array for each specific date in the current list. Can someone help me to achieve that?
d = {}
for i in range(len(myList)):
    # Collect the indices of all items that share the same date
    d.setdefault(myList[i]['date'], []).append(i)
myList = [[myList[i] for i in v] for k, v in d.items()]  # replace the original `myList`, following the OP's behavior
Logic:
You want to group the data based on 'date' that means you need a dictionary data structure. The rest are just implementation details.
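The same grouping idea can be sketched with collections.defaultdict, storing the items themselves rather than their indices. The function name group_by is hypothetical, and the sketch relies on dictionaries preserving insertion order (Python 3.7+), so groups appear in the order their date is first seen.

```python
from collections import defaultdict

my_list = [
    {"key": 1, "date": "2020-01-02"},
    {"key": 2, "date": "2020-02-02"},
    {"key": 3, "date": "2020-01-03"},
    {"key": 4, "date": "2020-01-02"},
    {"key": 5, "date": "2020-02-02"},
]

def group_by(items, field):
    """Return a list of lists, one per distinct value of `field` (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        groups[item[field]].append(item)
    return list(groups.values())

grouped = group_by(my_list, "date")
print(grouped)
```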

Merge data from list of dict

I think what I want to do is easy but I don't know the correct way to do this.
I have a list as:
[
{
"id": 16,
"condition": true,
"tags": 6,
},
{
"id": 16,
"condition": true,
"tags": 1,
},
{
"id": 16,
"condition": true,
"tags": 4,
},
{
"id": 3,
"condition": false,
"tags": 3,
}
]
And I want to group the elements of my list by id and condition; the output would be:
[
{
"id": 16,
"condition": true,
"tags": [6, 1, 4],
},
{
"id": 3,
"condition": false,
"tags": [3],
}
]
I can do this by looping on my list and creating another array but I was wondering about a better way to do this.
For now my code is like this:
def format(self):
    list_assigned = the_list_I_want_to_modify
    res = []
    for x in list_assigned:
        exists = [v for v in res if
                  v['id'] == x['id'] and v['condition'] == x['condition']]
        if not exists:
            x['tags'] = [x['tags']]
            res.append(x)
        else:
            exists[0]['tags'].append(x['tags'])
    return res
Thanks
There might be a prettier solution, but you could solve it by first creating a temporary dictionary whose keys are tuples of all the (key, value) pairs you need to group by, and appending the tags to a list stored under each key. I use the dictionary method .setdefault(key, default) with an empty list as the default, so every new key starts out with an empty list.
Then you can unpack that dictionary into a list again afterwards with a list comprehension.
a = [
{
"id": 16,
"condition": True,
"tags": 6,
},
{
"id": 16,
"condition": True,
"tags": 1,
},
{
"id": 16,
"condition": True,
"tags": 4,
},
{
"verified_by": 3,
"condition": False,
"tags": 3,
}
]
tmp = {}
for elem in a:
    # Everything except 'tags' forms the grouping key
    groupby_keys = tuple(sorted((k, v) for k, v in elem.items() if k != 'tags'))
    tmp.setdefault(groupby_keys, []).append(elem['tags'])
out = [{pair[0]: pair[1] for pair in list(k) + [('tags', v)]} for k, v in tmp.items()]
print(out)
Output:
>>> out
[{'id': 16, 'condition': True, 'tags': [6, 1, 4]}, {'verified_by': 3, 'condition': False, 'tags': [3]}]
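When the grouping keys are known in advance, a variant of the same temporary-dictionary idea can name them explicitly. This is only a sketch: the function name merge_tags and its group_keys parameter are hypothetical, and the sample data uses "id" for every element, as in the question. Unlike the loop in the question, it builds new dictionaries and leaves the input list unmodified.

```python
records = [
    {"id": 16, "condition": True, "tags": 6},
    {"id": 16, "condition": True, "tags": 1},
    {"id": 16, "condition": True, "tags": 4},
    {"id": 3, "condition": False, "tags": 3},
]

def merge_tags(rows, group_keys=("id", "condition")):
    """Group rows by `group_keys` and collect their tags (hypothetical helper)."""
    merged = {}
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        if key not in merged:
            # First time this key is seen: start a fresh output dict.
            merged[key] = {k: row[k] for k in group_keys}
            merged[key]["tags"] = []
        merged[key]["tags"].append(row["tags"])
    return list(merged.values())

result = merge_tags(records)
print(result)
```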

Pandas dataframe into nested child dictionary

I have a dataframe like below, where each 'level' drills down into more detail, with the last level having an id value.
data = [
{'id': 1, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Felidae', 'level_4': 'Siamese Cat'},
{'id': 2, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Felidae', 'level_4': 'Javanese Cat'},
{'id': 3, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Ursidae', 'level_4': 'Polar Bear'},
{'id': 4, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Canidae', 'level_4': 'Labradore Retriever'},
{'id': 5, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Canidae', 'level_4': 'Golden Retriever'}
]
I want to turn this into a nested dictionary of parent / child relationships like below.
var data = {
"name": "Animals",
"children": [
{
"name": "Carnivores",
"children": [
{
"name": "Felidae",
"children": [
{
"id": 1,
"name": "Siamese Cat",
"children": []
},
{
"id": 2,
"name": "Javanese Cat",
"children": []
}
]
},
{
"name": "Ursidae",
"children": [
{
"id": 3,
"name": "Polar Bear",
"children": []
}
]
},
{
"name": "Canidae",
"children": [
{
"id": 4,
"name": "Labradore Retriever",
"children": []
},
{
"id": 5,
"name": "Golden Retriever",
"children": []
}
]
}
]
}
]
}
I've tried several approaches of grouping the dataframe and also looping over individual rows, but haven't been able to find a working solution yet. Any help would be greatly appreciated!
The answer from @Timus mimics your intention; however, you might encounter some difficulties searching this dictionary, as each level has a key name and a key children. If this is what you intended, ignore my answer. However, if you would like a dictionary that you can search more easily through unique keys, you can try:
df = df.set_index(['level_1', 'level_2', 'level_3', 'level_4'])
def make_dictionary(df):
    # Base case: only one index level left, so return the remaining columns as a dict
    if df.index.nlevels == 1:
        return df.to_dict()
    dictionary = {}
    for key in df.index.get_level_values(0).unique():
        # Select the rows under this key and recurse on the remaining levels
        sub_df = df.xs(key)
        dictionary[key] = make_dictionary(sub_df)
    return dictionary
make_dictionary(df)
It requires setting the different levels as index, and you will end up with a slightly different dictionary:
{'Animals':
{'Carnivores':
{'Felidae':
{'id': {'Siamese Cat': 1,
'Javanese Cat': 2}},
'Ursidae':
{'id': {'Polar Bear': 3}},
'Canidae':
{'id': {'Labradore Retriever': 4,
'Golden Retriever': 5}}}
}
}
EDIT: Had to make an adjustment, because the result wasn't exactly as expected.
Here's an attempt that produces the expected output (if I haven't made a mistake, which wouldn't be a surprise, because I've made several on the way):
import pandas as pd
def pack_level(df):
    # Lowest level: build the leaf entries from the id and name columns
    if df.columns[0] == 'id':
        return [{'id': i, 'name': name, 'children': []}
                for i, name in zip(df[df.columns[0]], df[df.columns[1]])]
    # Higher levels: flatten the already packed child lists under one parent
    return [{'name': df.iloc[0, 0],
             'children': [entry for lst in df[df.columns[1]]
                          for entry in lst]}]
df = pd.DataFrame(data)
columns = list(df.columns[1:])
df = df.groupby(columns[:-1]).apply(pack_level)
for i in range(1, len(columns) - 1):
    df = (df.reset_index(level=-1, drop=False).groupby(columns[:-i])
          .apply(pack_level)
          .reset_index(level=-1, drop=True))
var_data = {'name': df.index[0], 'children': df.iloc[0]}
The result looks a bit different at first glance, but that should be only due to the sorting (from printing):
{
"children": [
{
"children": [
{
"children": [
{
"children": [],
"id": 4,
"name": "Labradore Retriever"
},
{
"children": [],
"id": 5,
"name": "Golden Retriever"
}
],
"name": "Canidae"
},
{
"children": [
{
"children": [],
"id": 1,
"name": "Siamese Cat"
},
{
"children": [],
"id": 2,
"name": "Javanese Cat"
}
],
"name": "Felidae"
},
{
"children": [
{
"children": [],
"id": 3,
"name": "Polar Bear"
}
],
"name": "Ursidae"
}
],
"name": "Carnivores"
}
],
"name": "Animals"
}
I've tried to be as generic as possible, but the first column has to be named id (as in your sample).
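For comparison, the name/children structure can also be sketched in plain Python, without pandas, by walking each row from the top level down and creating nodes as needed. The function name build_tree and the LEVELS list are hypothetical. The sketch assumes all rows share a single root value and that names are unique among siblings; two leaves with the same name under one parent would collapse into one.

```python
rows = [
    {'id': 1, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Felidae', 'level_4': 'Siamese Cat'},
    {'id': 2, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Felidae', 'level_4': 'Javanese Cat'},
    {'id': 3, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Ursidae', 'level_4': 'Polar Bear'},
    {'id': 4, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Canidae', 'level_4': 'Labradore Retriever'},
    {'id': 5, 'level_1': 'Animals', 'level_2': 'Carnivores', 'level_3': 'Canidae', 'level_4': 'Golden Retriever'},
]

LEVELS = ['level_1', 'level_2', 'level_3', 'level_4']

def build_tree(rows, levels):
    """Return a list of root nodes shaped like {'name': ..., 'children': [...]}."""
    roots = []
    for row in rows:
        siblings = roots
        for depth, level in enumerate(levels):
            name = row[level]
            # Reuse the node if this name already exists at the current depth.
            node = next((n for n in siblings if n['name'] == name), None)
            if node is None:
                node = {'name': name, 'children': []}
                if depth == len(levels) - 1:
                    # Only the leaf level carries the id.
                    node = {'id': row['id'], **node}
                siblings.append(node)
            siblings = node['children']
    return roots

tree = build_tree(rows, LEVELS)[0]
print(tree)
```

Because nodes are created in the order rows are encountered, the children keep the order of the input rather than being sorted.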

Extract values from Json code using Python [duplicate]

I'm trying to extract the values from the following json code in this format by displaying each device in a different row. Any ideas on how to do it? Thanks in advance
device1 : 232323 Completed
device2 : 345848 Completed etc
I tried the below which gives me the following output:
[{'cid': 631084346, 'data': [[{'text': 'device1'}], [{'text': '732536:Completed.'}], [{'text': '1'}]], 'id': 1750501182}, {'cid': 1891684718, 'data': [[{'text': 'device2'}], [{'text': '732536:Completed.'}], [{'text': '1'}]], 'id': 2218910703}
api_readable = api_response.json()
computer_name = str(api_readable['data']['result_sets'][0]['rows'])
# computer_name = re.sub("[{},':]", "", computer_name)
print(computer_name)
Sample Json used
{
"data": {
"max_available_age": "",
"now": "2020/01/09 22:43:58 GMT-0000",
"result_sets": [{
"age": 0,
"archived_question_id": 0,
"columns": [{
"hash": 3409330187,
"name": "Computer Name",
"type": 1
},
{
"hash": 1792443391,
"name": "Action Statuses",
"type": 1
},
{
"hash": 0,
"name": "Count",
"type": 3
}
],
"error_count": 0,
"estimated_total": 129,
"expire_seconds": 3660,
"filtered_row_count": 11,
"filtered_row_count_machines": 11,
"id": 2880628,
"issue_seconds": 0,
"item_count": 11,
"mr_passed": 129,
"mr_tested": 129,
"no_results_count": 0,
"passed": 128,
"question_id": 2880628,
"report_count": 233,
"row_count": 11,
"row_count_machines": 11,
"rows": [{
"cid": 631084346,
"data": [
[{
"text": "device1"
}],
[{
"text": "732536:Completed."
}],
[{
"text": "1"
}]
],
"id": 1750501182
},
{
"cid": 1891684718,
"data": [
[{
"text": "device2"
}],
[{
"text": "732536:Completed."
}],
[{
"text": "1"
}]
],
"id": 2218910703
}
api_readable['data']['result_sets'][0]['rows'] is a list, so you can simply loop over it and print the items you want.
for row in api_readable['data']['result_sets'][0]['rows']:
    print(f"{row['data'][0][0]['text']} : {row['data'][1][0]['text']}")
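If positional indexes such as row['data'][1][0] become hard to read, one option is to pair each cell with its column name first. This is a sketch under assumptions: the function name rows_as_dicts is hypothetical, the sample below is trimmed to the fields it needs, and each cell is assumed to hold exactly one {"text": ...} entry, as in the sample JSON.

```python
api_readable = {
    "data": {
        "result_sets": [{
            "columns": [
                {"name": "Computer Name"},
                {"name": "Action Statuses"},
                {"name": "Count"},
            ],
            "rows": [
                {"cid": 631084346, "id": 1750501182,
                 "data": [[{"text": "device1"}],
                          [{"text": "732536:Completed."}],
                          [{"text": "1"}]]},
                {"cid": 1891684718, "id": 2218910703,
                 "data": [[{"text": "device2"}],
                          [{"text": "732536:Completed."}],
                          [{"text": "1"}]]},
            ],
        }]
    }
}

def rows_as_dicts(result_set):
    """Map each row's cells to their column names (hypothetical helper)."""
    names = [col["name"] for col in result_set["columns"]]
    return [
        {name: cell[0]["text"] for name, cell in zip(names, row["data"])}
        for row in result_set["rows"]
    ]

records = rows_as_dicts(api_readable["data"]["result_sets"][0])
for rec in records:
    print(f"{rec['Computer Name']} : {rec['Action Statuses']}")
```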

Merge JSON data with Python

As part of a Python program, I want to merge JSON objects that contain identically structured data. For instance:
{
"responseStatus": "SUCCESS",
"responseDetails": {
"total": 5754,
},
"data": [
{
"id": 1324651
},
{
"id": 5686131
}
]
}
What I want to do is to add the content of the data array of my second object into the data array of my first object.
So, assuming:
thejson1 = json.loads('{"responseStatus": "SUCCESS","responseDetails": {"total": 5754},"data": [{"id": 1324651},{"id": 5686131}]}')
thejson2 = json.loads('{"responseStatus": "SUCCESS","responseDetails": {"total": 1234},"data": [{"id": 2165735},{"id": 2133256}]}')
I thought that executing:
thejson1["data"].append(thejson2["data"])
Would expand thejson1 into:
{
"responseStatus": "SUCCESS",
"responseDetails": {
"total": 5754,
},
"data": [
{
"id": 1324651
},
{
"id": 5686131
},
{
"id": 2165735
},
{
"id": 2133256
}
]
}
But what it does instead is add thejson2 data as an array within the data array of thejson1:
{
"responseStatus": "SUCCESS",
"responseDetails": {
"total": 5754,
},
"data": [
{
"id": 1324651
},
{
"id": 5686131
},
[
{
"id": 2165735
},
{
"id": 2133256
}
]
]
}
So, what am I doing wrong? It looks like append adds the data array of the second JSON object instead of its content, but note that I can't know in advance the contents of the "data" array in my JSON input, so I can't write code that specifically loops in the "id" objects to add them one by one.
Thanks in advance!
R.
You're looking for extend, not append.
thejson1["data"].extend(thejson2["data"])
append takes a single argument and adds it to the end of the list as one element, while extend adds each individual value from its argument to the end. Both modify the list in place and return None, so call them on a copy rather than assigning their result.
# example:
a = [1, 2, 3]
b = a[:]
b.append([4, 5])
# b = [1, 2, 3, [4, 5]]
c = a[:]
c.extend([4, 5])
# c = [1, 2, 3, 4, 5]
thejson1 = {"responseStatus": "SUCCESS","responseDetails": {"total": 5754,},"data": [{"id": 1324651},{"id": 5686131}]}
thejson2 = {"responseStatus": "SUCCESS","responseDetails": {"total": 1234,},"data": [{"id": 2165735},{"id": 2133256}]}
thejson1["data"] += thejson2["data"]
Output:
{'responseDetails': {'total': 5754}, 'data': [{'id': 1324651}, {'id': 5686131}, {'id': 2165735}, {'id': 2133256}], 'responseStatus': 'SUCCESS'}
You can also use += to extend.
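Both extend and += change thejson1 in place. If the original objects should stay untouched, a minimal sketch is to copy first and extend the copy. The function name merge_responses is hypothetical, and the sketch deliberately leaves responseDetails (including total) as it was in the first object, since the question does not say how those fields should be combined.

```python
import copy

def merge_responses(first, *others):
    """Return a new dict whose "data" list holds the items of all inputs (hypothetical helper)."""
    merged = copy.deepcopy(first)
    for other in others:
        # Deep-copy so the merged result shares no objects with the inputs.
        merged["data"].extend(copy.deepcopy(other["data"]))
    return merged

thejson1 = {"responseStatus": "SUCCESS", "responseDetails": {"total": 5754},
            "data": [{"id": 1324651}, {"id": 5686131}]}
thejson2 = {"responseStatus": "SUCCESS", "responseDetails": {"total": 1234},
            "data": [{"id": 2165735}, {"id": 2133256}]}

merged = merge_responses(thejson1, thejson2)
print(merged["data"])
```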
