I'm building a Python application that receives a REST response in the format below:
[
{
'metric': 'pass_status',
'history': [
{
'date': '2019-02-20T10:26:52+0000',
'value': 'OK'
},
{
'date': '2019-03-13T11:37:39+0000',
'value': 'FAIL'
},
{
'date': '2019-03-13T12:00:57+0000',
'value': 'OK'
}
]
},
{
'metric': 'bugs',
'history': [
{
'date': '2019-02-20T10:26:52+0000',
'value': '1'
},
{
'date': '2019-03-13T11:37:39+0000',
'value': '6'
},
{
'date': '2019-03-13T12:00:57+0000',
'value': '2'
}
]
},
{
'metric': 'code_smells',
'history': [
{
'date': '2019-02-20T10:26:52+0000',
'value': '0'
},
{
'date': '2019-03-13T11:37:39+0000',
'value': '1'
},
{
'date': '2019-03-13T12:00:57+0000',
'value': '2'
}
]
}
]
As you can see, the dates are the same for each metric.
I want to collate this data by date, i.e. my resulting JSON/dictionary should look like:
{
'2019-02-20T10:26:52+0000' : {
'pass_status' : 'OK',
'bugs' : '1',
'code_smells' : '0'
},
'2019-03-13T11:37:39+0000' : {
'pass_status' : 'FAIL',
'bugs' : '6',
'code_smells' : '1'
},
'2019-03-13T12:00:57+0000' : {
'pass_status' : 'OK',
'bugs' : '2',
'code_smells' : '2'
}
}
What would be the suggested approach to do this?
Thanks
I tried some itertools.groupby magic, but it turned into a mess.
Plain iteration with a defaultdict keeps it simple,
like this:
from collections import defaultdict

result = defaultdict(dict)
for metric_dict in data:
    metric_name = metric_dict['metric']
    for entry in metric_dict['history']:
        result[entry['date']][metric_name] = entry['value']
print(dict(result))
or a full example with the data:
data = [
{
'metric': 'pass_status',
'history': [
{
'date': '2019-02-20T10:26:52+0000',
'value': 'OK'
},
{
'date': '2019-03-13T11:37:39+0000',
'value': 'FAIL'
},
{
'date': '2019-03-13T12:00:57+0000',
'value': 'OK'
}
]
},
{
'metric': 'bugs',
'history': [
{
'date': '2019-02-20T10:26:52+0000',
'value': '1'
},
{
'date': '2019-03-13T11:37:39+0000',
'value': '6'
},
{
'date': '2019-03-13T12:00:57+0000',
'value': '2'
}
]
},
{
'metric': 'code_smells',
'history': [
{
'date': '2019-02-20T10:26:52+0000',
'value': '0'
},
{
'date': '2019-03-13T11:37:39+0000',
'value': '1'
},
{
'date': '2019-03-13T12:00:57+0000',
'value': '2'
}
]
}
]
from collections import defaultdict

result = defaultdict(dict)
for metric_dict in data:
    metric_name = metric_dict['metric']
    for entry in metric_dict['history']:
        result[entry['date']][metric_name] = entry['value']
print(result)
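If the collated result needs to go out as plain JSON with the dates in order, the same loop can be wrapped in a function. This is only a sketch: collate_by_date is a name made up here, the sample is trimmed to two metrics and two dates, and sorting the keys as strings assumes every timestamp uses the same format and UTC offset, as in the question.

```python
from collections import defaultdict
import json


def collate_by_date(data):
    """Return {date: {metric: value}} with dates in ascending order."""
    result = defaultdict(dict)
    for metric_dict in data:
        metric_name = metric_dict['metric']
        for entry in metric_dict['history']:
            result[entry['date']][metric_name] = entry['value']
    # Assumption: all timestamps share one format and offset, so string
    # order matches chronological order.
    return {date: result[date] for date in sorted(result)}


# Trimmed sample in the same shape as the REST response above.
sample = [
    {'metric': 'pass_status', 'history': [
        {'date': '2019-03-13T11:37:39+0000', 'value': 'FAIL'},
        {'date': '2019-02-20T10:26:52+0000', 'value': 'OK'},
    ]},
    {'metric': 'bugs', 'history': [
        {'date': '2019-02-20T10:26:52+0000', 'value': '1'},
        {'date': '2019-03-13T11:37:39+0000', 'value': '6'},
    ]},
]

collated = collate_by_date(sample)
print(json.dumps(collated, indent=2))
```

Returning a plain dict rather than the defaultdict also means a later lookup of an unknown date raises KeyError instead of silently creating an empty entry.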
Related
I have below list of dicts
[{
'NAV': 50,
'id': '61e6b2a1d0c32b744d3e3b2d'
}, {
'NAV': 25,
'id': '61e7fbe2d0c32b744d3e6ab4'
}, {
'NAV': 30,
'id': '61e801cbd0c32b744d3e7003'
}, {
'NAV': 30,
'id': '61e80663d0c32b744d3e7c51'
}, {
'NAV': 30,
'id': '61e80d9ad0c32b744d3e8da6'
}, {
'NAV': 30,
'id': '61e80f5fd0c32b744d3e93f0'
}, {
'NAV': 30,
'id': '61e90908d0c32b744d3ea967'
}, {
'NAV': 30,
'id': '61ea7cf3d0c32b744d3ed1b2'
}, {
'NAV': 50,
'id': '61fa387127e14670f3a67194'
}, {
'NAV': 30,
'id': '61fa3cea27e14670f3a6772c'
}, {
'Amount': 30,
'id': '61e6b373d0c32b744d3e3d14'
}, {
'Amount': 30,
'id': '61e6b49cd0c32b744d3e3ea0'
}, {
'Amount': 25,
'id': '61e7fe90d0c32b744d3e6ccd'
}, {
'Amount': 20,
'id': '61e80246d0c32b744d3e7242'
}, {
'Amount': 20,
'id': '61e80287d0c32b744d3e74ae'
}, {
'Amount': 20,
'id': '61e80253d0c32b744d3e733e'
}, {
'Amount': 34,
'id': '61e80697d0c32b744d3e7edd'
}, {
'Amount': 20,
'id': '61e806a3d0c32b744d3e7ff9'
}, {
'Amount': 30,
'id': '61e80e0ad0c32b744d3e906e'
}, {
'Amount': 30,
'id': '61e80e22d0c32b744d3e9198'
}, {
'Amount': 20,
'id': '61e81011d0c32b744d3e978e'
}, {
'Amount': 20,
'id': '61e8104bd0c32b744d3e9a92'
}, {
'Amount': 20,
'id': '61e81024d0c32b744d3e98cd'
}, {
'Amount': 20,
'id': '61e90994d0c32b744d3eac2b'
}, {
'Amount': 20,
'id': '61e909aad0c32b744d3ead76'
}, {
'Amount': 50,
'id': '61fa392a27e14670f3a67337'
}, {
'Amount': 50,
'id': '61fa393727e14670f3a67347'
}, {
'Amount': 50,
'id': '61fa3d6727e14670f3a67750'
}, {
'Amount': 150,
'id': '61fa3d7127e14670f3a67760'
}]
The list above contains dicts whose key is either NAV or Amount. I need to find the maximum value of NAV and of Amount separately across all the dicts, so that the output is
NAV = 50
Amount = 150
I have tried some approach like:
max(outList, key=lambda x: x['NAV'])
But this raises a KeyError for 'NAV'. What is the best way to do it?
I don't understand why you are calling Current NAV ($ M). It doesn't exist in the list you have provided. Anyway, I came up with the code below:
def getMax(value):
    if "NAV" in value:
        return value["NAV"]
    else:
        return value["Amount"]

max(outList, key=getMax)
If you want the max value for NAV and Amount separately, you can filter the list for each key and take the max of the values:
print(max([x["NAV"] for x in outList if "NAV" in x]))
print(max([x["Amount"] for x in outList if "Amount" in x]))
You could try something like this (t is the list of dicts):
print(max([i["NAV"] for i in t if "NAV" in i]))
print(max([i["Amount"] for i in t if "Amount" in i]))
Result:
50
150
def max_from_list(t_list, key):
    return max([i[key] for i in t_list if key in i])
If you are not only looking for NAV and Amount, maybe you can do as below:
# d is the list of dicts from the question
res = {}
for i in d:
    for k, v in i.items():
        if k != 'id':
            # fall back to v itself so that negative values are handled too
            res[k] = max(res.get(k, v), v)
You are on the right track with your max solution.
Assuming your list is called outList (BTW, that is not a Pythonic name; try out_list instead):
nav = max(outList, key=lambda item: item.get("NAV", float("-inf")))['NAV']
amount = max(outList, key=lambda item: item.get("Amount", float("-inf")))['Amount']
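One caveat with the key= approaches above: if no dict in the list has the key at all, max still returns some item and the final ['NAV'] lookup raises KeyError. A sketch that sidesteps this takes the max over the values only and uses the default argument of max (available since Python 3.4). The helper name max_for_key, the key 'Price' and the shortened out_list are made up for illustration.

```python
def max_for_key(items, key):
    """Largest value stored under `key`, or None if no dict has that key."""
    return max((item[key] for item in items if key in item), default=None)


# Shortened stand-in for the list in the question.
out_list = [
    {'NAV': 50, 'id': '61e6b2a1d0c32b744d3e3b2d'},
    {'NAV': 25, 'id': '61e7fbe2d0c32b744d3e6ab4'},
    {'Amount': 30, 'id': '61e6b373d0c32b744d3e3d14'},
    {'Amount': 150, 'id': '61fa3d7127e14670f3a67760'},
]

nav = max_for_key(out_list, 'NAV')
amount = max_for_key(out_list, 'Amount')
missing = max_for_key(out_list, 'Price')  # hypothetical key, absent everywhere
print(nav, amount, missing)
```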
My data, a list of dictionaries, is below.
I need the counts for:
1. BusinessArea and the count of each of its values
2. Designation and the count of each of its values
test= [ { 'masterid': '1', 'name': 'Group1', 'BusinessArea': [ 'Accounting','Research'], 'Designation': [ 'L1', 'L2' ] }, { 'masterid': '2', 'name': 'Group1', 'BusinessArea': ['Research','Accounting' ], 'Role': [ { 'id': '5032', 'name': 'Tester' }, { 'id': '5033', 'name': 'Developer' } ], 'Designation': [ 'L1', 'L2' ]}, { 'masterid': '3', 'name': 'Group1', 'BusinessArea': [ 'Engineering' ], 'Role': [ { 'id': '5032', 'name': 'Developer' }, { 'id': '5033', 'name': 'Developer', 'parentname': '' } ], 'Designation': [ 'L1' ]}]
For each BusinessArea and Designation value, I want to count the masterids in which it appears.
The expected output is below:
[
{
"name": "BusinessArea",
"values": [
{
"name": "Accounting",
"count": "2"
},
{
"name": "Research",
"count": "2"
},
{
"name": "Engineering",
"count": "1"
}
]
},
{
"name": "Designation",
"values": [
{
"name": "L1",
"count": "3"
},
{
"name": "L2",
"count": "2"
}
]
}
]
L1 appears for masterid 1, 2 and 3, and L2 appears for masterid 1 and 2, so the counts are L1: 3 and L2: 2.
Something like the below (not exactly the output you mentioned, but quite close):
from collections import defaultdict
test = [{'masterid': '1', 'name': 'Group1', 'BusinessArea': ['Accounting', 'Research'], 'Designation': ['L1', 'L2']},
{'masterid': '2', 'name': 'Group1', 'BusinessArea': ['Research', 'Accounting'],
'Role': [{'id': '5032', 'name': 'Tester'}, {'id': '5033', 'name': 'Developer'}], 'Designation': ['L1', 'L2']},
{'masterid': '3', 'name': 'Group1', 'BusinessArea': ['Engineering'],
'Role': [{'id': '5032', 'name': 'Developer'}, {'id': '5033', 'name': 'Developer', 'parentname': ''}],
'Designation': ['L1']}]
b_area = defaultdict(int)
des = defaultdict(int)
for entry in test:
    for val in entry['BusinessArea']:
        b_area[val] += 1
    for val in entry['Designation']:
        des[val] += 1
print(b_area)
print(des)
output
defaultdict(<class 'int'>, {'Accounting': 2, 'Research': 2, 'Engineering': 1})
defaultdict(<class 'int'>, {'L1': 3, 'L2': 2})
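To get from these counts to the list-of-dicts layout asked for in the question, collections.Counter can do the tallying. A sketch, assuming the counts should be strings as in the expected output; summarise is a made-up helper name and the test data is reduced to the fields that matter here.

```python
from collections import Counter


def summarise(records, fields):
    """Build [{'name': field, 'values': [{'name': v, 'count': 'n'}, ...]}, ...]."""
    summary = []
    for field in fields:
        # Tally every value of this field across all records; records
        # without the field contribute nothing.
        counts = Counter(val for rec in records for val in rec.get(field, []))
        summary.append({
            'name': field,
            'values': [{'name': name, 'count': str(n)}
                       for name, n in counts.items()],
        })
    return summary


test = [
    {'masterid': '1', 'BusinessArea': ['Accounting', 'Research'], 'Designation': ['L1', 'L2']},
    {'masterid': '2', 'BusinessArea': ['Research', 'Accounting'], 'Designation': ['L1', 'L2']},
    {'masterid': '3', 'BusinessArea': ['Engineering'], 'Designation': ['L1']},
]

summary = summarise(test, ['BusinessArea', 'Designation'])
print(summary)
```

Passing the field names as a list means other list-valued fields can be counted the same way without adding another defaultdict per field.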
I want to remove some duplicates in my merged dictionary.
My data:
mongo_data = [{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Harry Potter', 'value': '10.0'},
{'key': 'Discovery of Witches', 'value': '8.5'},],
'vendor': 'Fantasy'
},{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Hunger Games', 'value': '10.0'},
{'key': 'Maze Runner', 'value': '5.5'},],
'vendor': 'Dystopia'
},{
'url': 'https://kindle.com/',
'variables': [{'key': 'Divergent', 'value': '9.0'},
{'key': 'Lord of the Rings', 'value': '9.0'},],
'vendor': 'Fantasy'
},{
'url': 'https://kindle.com/',
'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
{'key': 'Divergent', 'value': '9.0'},],
'vendor': 'Fantasy'
}]
My code:
from itertools import groupby

searches = []
for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {"url": key, "results": []}
    for vendor, group2 in groupby(group, key=lambda chunk2: chunk2['vendor']):
        result = {
            "genre": vendor,
            "data": [{'key': key['key'], 'value': key['value']}
                     for result2 in group2
                     for key in result2["variables"]],
        }
        search["results"].append(result)
    searches.append(search)
My result:
[
{
"url": "https://goodreads.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Harry Potter",
"value": "10.0"
},
{
"key": "Discovery of Witches",
"value": "8.5"
}
]
},
{
"genre": "Dystopia",
"data": [
{
"key": "Hunger Games",
"value": "10.0"
},
{
"key": "Maze Runner",
"value": "5.5"
}
]
}
]
},
{
"url": "https://kindle.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Divergent",
"value": "9.0"
},
{
"key": "Lord of the Rings",
"value": "9.0"
},
{
"key": "The Handmaids Tale",
"value": "10.0"
},
{
"key": "Divergent",
"value": "9.0"
}
]
}
]
}
]
I do not want any duplicates in my structure, but I am not sure how to remove them. My expected result is shown below.
Expected result:
[
{
"url": "https://goodreads.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Harry Potter",
"value": "10.0"
},
{
"key": "Discovery of Witches",
"value": "8.5"
}
]
},
{
"genre": "Dystopia",
"data": [
{
"key": "Hunger Games",
"value": "10.0"
},
{
"key": "Maze Runner",
"value": "5.5"
}
]
}
]
},
{
"url": "https://kindle.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Divergent",
"value": "9.0"
},
{
"key": "Lord of the Rings",
"value": "9.0"
},
{
"key": "The Handmaids Tale",
"value": "10.0"
}
]
}
]
}
]
Divergent is repeated in the last list of dictionaries. When I merged my dictionaries, the duplicate entries under https://kindle.com/ --> Fantasy ended up in the same merged list. Is there a way for me to remove the duplicate dictionary?
I want the https://kindle.com/ part to look like:
{
"url": "https://kindle.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Divergent",
"value": "9.0"
},
{
"key": "Lord of the Rings",
"value": "9.0"
},
{
"key": "The Handmaids Tale",
"value": "10.0"
}
]
}
]
}
You can try converting those dicts to a set of tuples first and then converting back to a list of dicts afterwards (note that a set does not preserve the original order):
mongo_data = [{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Harry Potter', 'value': '10.0'},
{'key': 'Discovery of Witches', 'value': '8.5'},],
'vendor': 'Fantasy'
},{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Hunger Games', 'value': '10.0'},
{'key': 'Maze Runner', 'value': '5.5'},],
'vendor': 'Dystopia'
},{
'url': 'https://kindle.com/',
'variables': [{'key': 'Divergent', 'value': '9.0'},
{'key': 'Lord of the Rings', 'value': '9.0'},],
'vendor': 'Fantasy'
},{
'url': 'https://kindle.com/',
'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
{'key': 'Divergent', 'value': '9.0'},],
'vendor': 'Fantasy'
}]
from itertools import groupby

searches = []
for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {"url": key, "results": []}
    for vendor, group2 in groupby(group, key=lambda chunk2: chunk2['vendor']):
        result = {
            "genre": vendor,
            "data": set((key['key'], key['value'])
                        for result2 in group2
                        for key in result2["variables"]),
        }
        result['data'] = [{"key": tup[0], "value": tup[1]} for tup in result['data']]
        search["results"].append(result)
    searches.append(search)
searches
Output:
[{'results': [{'data': [{'key': 'Harry Potter', 'value': '10.0'},
{'key': 'Discovery of Witches', 'value': '8.5'}],
'genre': 'Fantasy'},
{'data': [{'key': 'Maze Runner', 'value': '5.5'},
{'key': 'Hunger Games', 'value': '10.0'}],
'genre': 'Dystopia'}],
'url': 'https://goodreads.com/'},
{'results': [{'data': [{'key': 'The Handmaids Tale', 'value': '10.0'},
{'key': 'Lord of the Rings', 'value': '9.0'},
{'key': 'Divergent', 'value': '9.0'}],
'genre': 'Fantasy'}],
'url': 'https://kindle.com/'}]
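Two things to keep in mind with this approach: a set does not keep the original order (visible in the output above), and itertools.groupby only merges adjacent items, so the input should be sorted by the grouping keys first. A sketch that keeps first-seen order by using a dict as an ordered set (dicts preserve insertion order from Python 3.7); the function name group_unique and the shortened data are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def group_unique(rows):
    """Group rows by url, then vendor, dropping duplicate key/value pairs."""
    searches = []
    # groupby only groups adjacent rows, so sort by the grouping keys first.
    ordered = sorted(rows, key=itemgetter('url', 'vendor'))
    for url, url_rows in groupby(ordered, key=itemgetter('url')):
        search = {'url': url, 'results': []}
        for vendor, vendor_rows in groupby(url_rows, key=itemgetter('vendor')):
            # dict.fromkeys keeps the first occurrence of each (key, value) pair.
            pairs = dict.fromkeys(
                (var['key'], var['value'])
                for row in vendor_rows
                for var in row['variables']
            )
            search['results'].append({
                'genre': vendor,
                'data': [{'key': k, 'value': v} for k, v in pairs],
            })
        searches.append(search)
    return searches


# Only the two kindle.com rows from the question, where the duplicate occurs.
mongo_data = [
    {'url': 'https://kindle.com/', 'vendor': 'Fantasy',
     'variables': [{'key': 'Divergent', 'value': '9.0'},
                   {'key': 'Lord of the Rings', 'value': '9.0'}]},
    {'url': 'https://kindle.com/', 'vendor': 'Fantasy',
     'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
                   {'key': 'Divergent', 'value': '9.0'}]},
]

searches = group_unique(mongo_data)
print(searches)
```

Note that sorting puts the urls and vendors in alphabetical order, which may differ from the order in which they arrive from the database.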
I am trying to convert nested, complex JSON into a DataFrame. My JSON format is shown below.
There are multiple JSON records like this; I have included only one here, keyed by the ID at the top. When I try to convert it to a DataFrame, I do not get the output I want.
I also want to display, for each innings, the batsmen with their name and stats (runs, ballsFaced). I want to show the complete data in the DataFrame.
{
'1164223': {
'fullSummaryLink': '/series/19014/scorecard/1164223/new-zealand-a-vs-india-a-1st-unofficial-odi-india-a-tour-of-nz-2018-19',
'innings': {
'1': {
'batsmen': [
{
'name': 'HD Rutherford',
'href': 'http://www.espncricinfo.com/ci/content/player/331375.html',
'stats': {
'runs': {
'name': 'runs',
'text': 'RUNS',
'value': '70'
},
'ballsFaced': {
'name': 'ballsFaced',
'text': 'BF',
'value': '66'
},
'notouts': {
'name': 'notouts',
'text': 'Not Out',
'value': '0'
}
}
},
{
'name': 'JDS Neesham',
'href': 'http://www.espncricinfo.com/ci/content/player/355269.html',
'stats': {
'runs': {
'name': 'runs',
'text': 'RUNS',
'value': '79'
},
'ballsFaced': {
'name': 'ballsFaced',
'text': 'BF',
'value': '48'
},
'notouts': {
'name': 'notouts',
'text': 'Not Out',
'value': '1'
}
}
}
],
'team': {
'teamDisplayName': 'NEW ZEALAND A',
'innDisplayName': 'INNINGS',
'runs': 308,
'overs': 50,
'wickets': 6,
'description': 'complete',
'inningsRunWicket': '308/6',
'inningStatus': ''
},
'bowlers': [
{
'name': 'S Kaul',
'href': 'http://www.espncricinfo.com/ci/content/player/326017.html',
'stats': {
'overs': {
'name': 'overs',
'text': 'O',
'value': '10'
},
'conceded': {
'name': 'conceded',
'text': 'R',
'value': '74'
},
'wickets': {
'name': 'wickets',
'text': 'E',
'value': '2'
}
}
},
{
'name': 'K Gowtham',
'href': 'http://www.espncricinfo.com/ci/content/player/424377.html',
'stats': {
'overs': {
'name': 'overs',
'text': 'O',
'value': '9'
},
'conceded': {
'name': 'conceded',
'text': 'R',
'value': '46'
},
'wickets': {
'name': 'wickets',
'text': 'E',
'value': '1'
}
}
}
]
},
'2': {
'bowlers': [
{
'name': 'HK Bennett',
'href': 'http://www.espncricinfo.com/ci/content/player/226493.html',
'stats': {
'overs': {
'name': 'overs',
'text': 'O',
'value': '10'
},
'conceded': {
'name': 'conceded',
'text': 'R',
'value': '65'
},
'wickets': {
'name': 'wickets',
'text': 'E',
'value': '2'
}
}
},
{
'name': 'LH Ferguson',
'href': 'http://www.espncricinfo.com/ci/content/player/493773.html',
'stats': {
'overs': {
'name': 'overs',
'text': 'O',
'value': '10'
},
'conceded': {
'name': 'conceded',
'text': 'R',
'value': '75'
},
'wickets': {
'name': 'wickets',
'text': 'E',
'value': '2'
}
}
}
],
'batsmen': [
{
'name': 'V Shankar',
'href': 'http://www.espncricinfo.com/ci/content/player/477021.html',
'stats': {
'runs': {
'name': 'runs',
'text': 'RUNS',
'value': '87'
},
'ballsFaced': {
'name': 'ballsFaced',
'text': 'BF',
'value': '80'
},
'notouts': {
'name': 'notouts',
'text': 'Not Out',
'value': '1'
}
}
},
{
'name': 'SS Iyer',
'href': 'http://www.espncricinfo.com/ci/content/player/642519.html',
'stats': {
'runs': {
'name': 'runs',
'text': 'RUNS',
'value': '54'
},
'ballsFaced': {
'name': 'ballsFaced',
'text': 'BF',
'value': '54'
},
'notouts': {
'name': 'notouts',
'text': 'Not Out',
'value': '0'
}
}
}
],
'team': {
'teamDisplayName': 'INDIA A',
'innDisplayName': 'INNINGS',
'runs': 311,
'overs': 49,
'wickets': 6,
'description': 'target reached',
'inningsRunWicket': '311/6',
'inningStatus': ''
}
}
},
'isAvailable': True
}
}
The output of the code I tried was shared as an image sample. I am trying to get the data into a simple, flat format, which I have not been able to do.
This is my Python code, which converts the dictionary of JSON values into a pandas DataFrame:
import pandas as pd
df = pd.concat({k: pd.DataFrame(v) for k, v in scorecard_summary.items()})
df
I am trying to convert the complex JSON into a pandas DataFrame, but I am having problems with the conversion and am not sure how to do it.
I want to show all the data in the DataFrame; the JSON format is shared above so that the structure is clear.
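One way to get a flat table from this structure is to walk the nesting explicitly and build one row per batsman, then hand the rows to pandas. The sketch below is plain Python so the flattening logic is easy to follow; batsmen_rows is a made-up helper name, the column names (match_id, innings, team) are my own choice, and the sample is cut down to one batsman per innings.

```python
def batsmen_rows(scorecard_summary):
    """Return one flat dict per batsman across all matches and innings."""
    rows = []
    for match_id, match in scorecard_summary.items():
        for innings_no, innings in match.get('innings', {}).items():
            team = innings.get('team', {}).get('teamDisplayName')
            for batsman in innings.get('batsmen', []):
                row = {'match_id': match_id, 'innings': innings_no,
                       'team': team, 'name': batsman['name']}
                # Each stat looks like {'name': 'runs', 'text': 'RUNS', 'value': '70'};
                # keep only the value, keyed by the stat name.
                for stat_name, stat in batsman.get('stats', {}).items():
                    row[stat_name] = stat['value']
                rows.append(row)
    return rows


# Cut-down sample in the same shape as the JSON above.
scorecard_summary = {
    '1164223': {
        'innings': {
            '1': {
                'batsmen': [{'name': 'HD Rutherford', 'stats': {
                    'runs': {'name': 'runs', 'text': 'RUNS', 'value': '70'},
                    'ballsFaced': {'name': 'ballsFaced', 'text': 'BF', 'value': '66'},
                }}],
                'team': {'teamDisplayName': 'NEW ZEALAND A'},
            },
            '2': {
                'batsmen': [{'name': 'V Shankar', 'stats': {
                    'runs': {'name': 'runs', 'text': 'RUNS', 'value': '87'},
                    'ballsFaced': {'name': 'ballsFaced', 'text': 'BF', 'value': '80'},
                }}],
                'team': {'teamDisplayName': 'INDIA A'},
            },
        },
        'isAvailable': True,
    }
}

rows = batsmen_rows(scorecard_summary)
for row in rows:
    print(row)
```

With pandas installed, pd.DataFrame(rows) turns these rows into a DataFrame with one column per key, and the same pattern should carry over to the 'bowlers' lists.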