Create JSON from another JSON with duplicate values in Python - python

I have a JSON:
[{'job': 'fireman', 'salary': 30000, 'country': 'USA'}, {'job': 'doctor', 'salary': 50000, 'country': 'Canada'}, {'job': 'fireman', 'salary': 60000, 'country': 'France'}, {'job': 'Engineer', 'salary': 45000, 'country': 'Mexico'}]
I want to combine the entries that share the same job and create a JSON like:
[
  {"job": "fireman",
   "summary": [{"country": "USA", "Salary": 30000}, {"country": "France", "Salary": 60000}],
   "total": 90000},
  {"job": "doctor",
   "summary": [{"country": "Canada", "Salary": 50000}],
   "total": 50000},
  ....
]

Try this:
non_summarized = [{'job': 'fireman', 'salary': 30000, 'country':'USA'}, {'job': 'doctor', 'salary': 50000, 'country': 'Canada'},{'job': 'fireman', 'salary': 60000, 'country':'France'}, {'job': 'Engineer', 'salary': 45000, 'country':'Mexico'}]
# sort the list of dictionaries by the 'job' key, so that equal jobs are adjacent in the loop
non_summarized = sorted(non_summarized, key=lambda i: i['job'])
summarized = list()
last_value = dict()
for d in non_summarized:
    # check whether the last value has the same job;
    # if not, create a new dict and register it in the result list
    if last_value.get('job') != d.get('job'):
        last_value = {
            'job': d.get('job'),
            'total': 0,
            'summary': list()
        }
        summarized.append(last_value)
    # last_value is the same object that was appended, so updating it updates the result
    last_value['total'] += d.get('salary', 0)
    last_value['summary'].append({
        'country': d.get('country'),
        'salary': d.get('salary')
    })
print(summarized)
Please let me know if you need any clarification.
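If sorting the input is undesirable, the same grouping can probably be done in a single pass with a dictionary keyed by job. The following is only a minimal sketch and not part of the answer above: the function name `summarize_by_job` is made up for illustration, and it assumes every record has `job`, `salary`, and `country` keys with a numeric salary.

```python
import json


def summarize_by_job(records):
    """Group records by 'job', keeping jobs in first-seen order."""
    groups = {}  # job -> summary dict; dicts keep insertion order on Python 3.7+
    for rec in records:
        # fetch the entry for this job, creating it on first sight
        entry = groups.setdefault(
            rec["job"], {"job": rec["job"], "summary": [], "total": 0}
        )
        entry["summary"].append({"country": rec["country"], "salary": rec["salary"]})
        entry["total"] += rec["salary"]
    return list(groups.values())


non_summarized = [
    {"job": "fireman", "salary": 30000, "country": "USA"},
    {"job": "doctor", "salary": 50000, "country": "Canada"},
    {"job": "fireman", "salary": 60000, "country": "France"},
    {"job": "Engineer", "salary": 45000, "country": "Mexico"},
]

summarized = summarize_by_job(non_summarized)
# json.dumps turns the Python structure into an actual JSON string
print(json.dumps(summarized, indent=2))
```

Unlike the sorted version, this keeps the jobs in the order they first appear in the input.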

Related

How to loop through a list of dictionary and extract those with the same 'name' and 'school' into a new list while getting their other values in it

I have a list of dictionaries. I would like to merge the entries that have exactly the same 'name' and 'school' into a single dictionary whose 'age' is a list of their ages, and keep the remaining dictionaries as they are.
Here is an example of the list of dictionaries:
[{'name': 'Jane', 'age':12, 'school': 'SIT'}, {'name': 'John', 'age':13, 'school': 'SMU'},{'name': 'Jane', 'age':14, 'school': 'SIT'}, {'name': 'Jane', 'age':16, 'school': 'SIT'}, {'name': 'John', 'age':13, 'school': 'NUS'}]
and I would like to turn it into something like this:
[{'name': 'Jane', 'age': [12,14,16], 'school': 'SIT'}, {'name': 'John', 'age': 13, 'school': 'SMU'}, {'name': 'John', 'age':13, 'school': 'NUS'}]
I am using Python. I tried Counter and loops but still can't get it to work. Please help!
You could use itertools.groupby().
Example:
import itertools
from pprint import pprint
data = [{'name': 'Jane', 'age':12, 'school': 'SIT'}, {'name': 'John', 'age':13, 'school': 'SMU'},{'name': 'Jane', 'age':14, 'school': 'SIT'}, {'name': 'Jane', 'age':16, 'school': 'SIT'}, {'name': 'John', 'age':13, 'school': 'NUS'}]
keyfunc = lambda x: (x["name"], x["school"])
# needs to be sorted to use groupby
data.sort(key=keyfunc)
output = []
for k, v in itertools.groupby(data, key=keyfunc):
    this_group = {
        "name": k[0],
        "school": k[1],
        "age": [i["age"] for i in v],
    }
    output.append(this_group)
pprint(output)
The output is:
[{'age': [12, 14, 16], 'name': 'Jane', 'school': 'SIT'},
{'age': [13], 'name': 'John', 'school': 'NUS'},
{'age': [13], 'name': 'John', 'school': 'SMU'}]
If you wish to go with the solution based on a buffer dictionary, please check out the dict.setdefault() method.
Example:
buffer = {}
for i in data:
    buffer.setdefault((i["name"], i["school"]), []).append(i["age"])
For reference:
https://docs.python.org/3/library/itertools.html#itertools.groupby
https://docs.python.org/3/library/stdtypes.html#dict.setdefault
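The `setdefault()` snippet stops once the buffer is filled. As a rough sketch of the remaining step, the buffer can be turned back into a list of dictionaries. Unwrapping single-element lists so that a lone age stays a plain number is my reading of the desired output in the question, not something the answer specifies.

```python
data = [
    {"name": "Jane", "age": 12, "school": "SIT"},
    {"name": "John", "age": 13, "school": "SMU"},
    {"name": "Jane", "age": 14, "school": "SIT"},
    {"name": "Jane", "age": 16, "school": "SIT"},
    {"name": "John", "age": 13, "school": "NUS"},
]

# (name, school) -> list of ages, in first-seen order of the keys
buffer = {}
for i in data:
    buffer.setdefault((i["name"], i["school"]), []).append(i["age"])

# rebuild one dictionary per group; a group with a single age keeps a scalar
output = [
    {"name": name, "age": ages[0] if len(ages) == 1 else ages, "school": school}
    for (name, school), ages in buffer.items()
]
print(output)
```

No sorting is needed here, so the groups come out in the order in which each (name, school) pair first appears.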
x = [{'name': 'Jane', 'age':12, 'school': 'SIT'}, {'name': 'John', 'age':13, 'school': 'SMU'},{'name': 'Jane', 'age':14, 'school': 'SIT'}, {'name': 'Jane', 'age':16, 'school': 'SIT'}, {'name': 'John', 'age':13, 'school': 'NUS'}]
new_x = {}
for r in x:
    # group on both fields; keying on the name alone would merge John/SMU with John/NUS
    key = (r['name'], r['school'])
    if key in new_x:
        # promote a scalar age to a list before adding a second value
        if not isinstance(new_x[key]['age'], list):
            new_x[key]['age'] = [new_x[key]['age']]
        if r['age'] not in new_x[key]['age']:
            new_x[key]['age'].append(r['age'])
    else:
        new_x[key] = {'name': r['name'], 'age': r['age'], 'school': r['school']}
z = list(new_x.values())
Here is a universal solution to your problem. Only name and school are considered "special". All other keys, like age, are converted to a list when a new value has to be added.
l = [
    {"name": "Jane", "age": 12, "school": "SIT"},
    {"name": "John", "age": 13, "school": "SMU"},
    {"name": "Jane", "age": 14, "school": "SIT"},
    {"name": "Jane", "age": 16, "school": "SIT"},
    {"name": "John", "age": 13, "school": "NUS"},
]
r = {}
for x in l:
    key = f"{x['name']}-{x['school']}"  # named key to avoid shadowing the built-in id()
    if key in r:
        for k, v in x.items():
            if k not in ["name", "school"]:
                if k in r[key]:
                    if isinstance(r[key][k], list):
                        r[key][k].append(v)
                    else:
                        r[key][k] = [r[key][k], v]
                else:
                    r[key][k] = v
    else:
        r[key] = dict(x)  # copy, so the input dictionaries are not modified
result = list(r.values())

Flat json to nested json python

I want to convert the input JSON below into the nested JSON shown after it. I can't think of any JSON library that would help me achieve this.
Input JSON:
[{'Name': 'John', 'state': 'Boston', 'currency': 'USD', 'marks': 100},
{'Name': 'Rohan', 'state': 'Paris', 'currency': 'EUR', 'marks': 20},
{'Name': 'Rohan', 'state': 'Lyon', 'currency': 'EUR', 'marks': 11.4},
{'Name': 'Messi', 'state': 'Madrid', 'currency': 'EUR', 'marks': 9.9},
{'Name': 'Lampard', 'state': 'London', 'currency': 'GBP', 'marks': 12.2},
{'Name': 'Lampard', 'state': 'London', 'currency': 'FBP', 'marks': 10.9}]
output json
{
"USD": {
"John": {
"Boston": [
{
"Marks": 100
}
]
},
The current scenario nests by currency, name, state, and marks.
The nesting should be configurable up to n levels if required, for example name, state, and marks; or name, currency, state, and marks; or name, currency, and marks.
So you want currency > name > state > list of marks.
One solution would be to create the structure using defaultdicts, and then just add to it.
from collections import defaultdict
data = [...]  # the input list from the question
def ddmaker(type_):
    # return a zero-argument factory that builds a defaultdict of type_
    def caller():
        return defaultdict(type_)
    return caller
# create the structure of the output
output = defaultdict(ddmaker(ddmaker(list)))
# add to it
for item in data:
    currency = item["currency"]
    name = item["Name"]
    state = item["state"]
    mark = item["marks"]
    output[currency][name][state].append({'Marks': mark})
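The question also asks for a configurable number of levels. One possible way to get that, sketched here with plain dictionaries and `dict.setdefault()` rather than the nested `defaultdict` factories above, is to pass the nesting order as a list of keys. The function name `nest` and its parameters are hypothetical, and the leaf entries reuse the input key (`marks`) rather than the capitalised `Marks` shown in the question.

```python
import json


def nest(records, keys, leaf_key):
    """Nest flat records under `keys` (outermost first); assumes `keys` is non-empty."""
    root = {}
    for rec in records:
        node = root
        # walk down, creating one dict level per key except the last
        for key in keys[:-1]:
            node = node.setdefault(rec[key], {})
        # the last key maps to a list of leaf entries
        node.setdefault(rec[keys[-1]], []).append({leaf_key: rec[leaf_key]})
    return root


data = [
    {"Name": "John", "state": "Boston", "currency": "USD", "marks": 100},
    {"Name": "Rohan", "state": "Paris", "currency": "EUR", "marks": 20},
    {"Name": "Rohan", "state": "Lyon", "currency": "EUR", "marks": 11.4},
    {"Name": "Messi", "state": "Madrid", "currency": "EUR", "marks": 9.9},
    {"Name": "Lampard", "state": "London", "currency": "GBP", "marks": 12.2},
    {"Name": "Lampard", "state": "London", "currency": "FBP", "marks": 10.9},
]

by_currency = nest(data, ["currency", "Name", "state"], "marks")
by_name = nest(data, ["Name", "currency"], "marks")
print(json.dumps(by_currency, indent=2))
```

Because the result is built from ordinary dictionaries, it can be passed to `json.dumps()` directly, and a missing key raises `KeyError` instead of silently creating an empty branch.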

How do I iterate a nested dictionary with string formatting?

I checked a few other posts and either they didn't contain the information I need or I didn't understand them. I want to make this program print the sentence for every entry in the nested dictionary, and maybe also make a function to do this as well (not familiar with these yet).
I know it will use a for loop but what I can't figure out is how to configure the keys(?).
people = {
    1: {
        'name': 'David Wallace',
        'age': 50,
        'occupation': 'CFO',
        'ethnicity': 'American',
        'location': 'New York'
    },
    2: {
        'name': 'Michael',
        'age': 42,
        'occupation': 'Regional Manager',
        'ethnicity': 'American',
        'location': 'Scranton, Pennsylvania'
    },
    3: {
        'name': 'Jim',
        'age': 27,
        'occupation': 'Sales Rep',
        'ethnicity': 'American',
        'location': 'Scranton, Pennsylvania'
    }
}
print('{name} is a {age} year-old {ethnicity} {occupation} from {location}.'.format(**people))
You're really treating the top-level dict more like a list, so you can write a for loop traversing the top-level like so:
people = {
    1: {
        'name': 'David Wallace',
        'age': 50,
        'occupation': 'CFO',
        'ethnicity': 'American',
        'location': 'New York'
    },
    2: {
        'name': 'Michael',
        'age': 42,
        'occupation': 'Regional Manager',
        'ethnicity': 'American',
        'location': 'Scranton, Pennsylvania'
    },
    3: {
        'name': 'Jim',
        'age': 27,
        'occupation': 'Sales Rep',
        'ethnicity': 'American',
        'location': 'Scranton, Pennsylvania'
    }
}
for person in people.values():
    print('{name} is a {age} year-old {ethnicity} {occupation} from {location}.'.format(**person))
The full reference for Python dictionaries is here: https://docs.python.org/3/library/stdtypes.html#dict.items
Edit: Thanks to user Chris Charley for the suggestion to use people.values() instead of people.items()
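Since the question also mentions wrapping this in a function, here is one way it might look. This is a sketch only: the names `TEMPLATE` and `describe` are invented for illustration, and the dictionary is shortened to two of the entries from the question. It uses `people.items()` so that the numeric key is available alongside each inner dictionary.

```python
TEMPLATE = "{name} is a {age} year-old {ethnicity} {occupation} from {location}."


def describe(person):
    """Return the sentence for one inner dictionary."""
    # ** unpacks the dict so each key fills the placeholder of the same name
    return TEMPLATE.format(**person)


people = {
    1: {
        "name": "David Wallace",
        "age": 50,
        "occupation": "CFO",
        "ethnicity": "American",
        "location": "New York",
    },
    3: {
        "name": "Jim",
        "age": 27,
        "occupation": "Sales Rep",
        "ethnicity": "American",
        "location": "Scranton, Pennsylvania",
    },
}

# items() yields (key, inner dict) pairs; values() would yield only the inner dicts
for person_id, person in people.items():
    print(person_id, describe(person))
```

The attempt in the question fails because `**people` unpacks the outer dictionary, whose keys are the numbers 1, 2, 3 rather than the field names the template expects.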

Retrieve only certain keys and values from a dictionary, nested inside a list

I've been stuck on this for hours. I want to retrieve only one individual's keys and values from a dictionary that is nested inside a list.
GAMERS = [{
    'name': 'Fatboi',
    'parent': 'Dick Van Dyke',
    'game': 'Dark Souls 3',
    'weight': '420 lbs'
},
{
    'name': 'Justin',
    'parent': 'Heather Blueberry',
    'game': 'Tetris',
    'weight': '180 lbs'
},
{
    'name': 'jerkhead',
    'parent': 'none',
    'games': 'Hello Kitty',
    'weight': '240 lbs'
},{
    'name': 'Tumor',
    'parent': 'Jack Black',
    'games': 'Trying to live',
    'weight': '150 lbs'
}]
So for instance, I want to print only Justin's information, nobody else's. Any insights?
You can pass the key you want and collect the matching key-value pairs in a separate list.
GAMERS = [{
    'name': 'Fatboi',
    'parent': 'Dick Van Dyke',
    'game': 'Dark Souls 3',
    'weight': '420 lbs'
},
{
    'name': 'Justin',
    'parent': 'Heather Blueberry',
    'game': 'Tetris',
    'weight': '180 lbs'
},{
    'name': 'jerkhead',
    'parent': 'none',
    'games': 'Hello Kitty',
    'weight': '240 lbs'
}]
def get_key_pair_list(input_dict, key):
    new_list = []
    for item in input_dict:
        my_dict = {}
        # skip dictionaries that do not have the requested key
        if key in item.keys():
            my_dict[key] = item[key]
            new_list.append(my_dict)
    return new_list
print(get_key_pair_list(GAMERS, 'name'))
Output:
[{'name': 'Fatboi'}, {'name': 'Justin'}, {'name': 'jerkhead'}]
Using a list comprehension:
key = 'name'
# {key: value} builds a dict; {key, value} would build a set with no guaranteed order
my_list = [{key: item[key]} for item in GAMERS if key in item]
print(my_list)
Output:
[{'name': 'Fatboi'}, {'name': 'Justin'}, {'name': 'jerkhead'}]
You want to filter the list and grab the first value that matches a predicate. Make sure to handle the case where the item doesn't exist!
filtered_info = (
    item for item in GAMERS if item['name'] == 'Justin'
)
justin_info = next(filtered_info, None)
if justin_info is not None:
    print(justin_info)
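The same idea can be wrapped in a small reusable helper. The sketch below is an illustration rather than part of the answer: `find_first` is a made-up name, the sample list is shortened, and `dict.get()` is used so that records lacking the key are skipped instead of raising `KeyError`.

```python
def find_first(records, key, value, default=None):
    """Return the first dict whose `key` equals `value`, or `default` if none match."""
    # the generator is lazy, so the search stops at the first match
    return next((rec for rec in records if rec.get(key) == value), default)


GAMERS = [
    {"name": "Fatboi", "parent": "Dick Van Dyke", "game": "Dark Souls 3", "weight": "420 lbs"},
    {"name": "Justin", "parent": "Heather Blueberry", "game": "Tetris", "weight": "180 lbs"},
]

justin = find_first(GAMERS, "name", "Justin")
if justin is not None:
    # print only Justin's keys and values
    for field, field_value in justin.items():
        print(f"{field}: {field_value}")
```

One caveat with `get()`: searching for the value `None` would also match records that lack the key entirely, so this helper is best suited to searching for concrete values.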

Dictionary keys and values

I'm working with Python dictionaries and there's something I don't understand when I use the function dict.values() and dict.keys().
Why do they give as a result also the "description" of the function? Am I missing something here?
participant = {"name": "Lisa", "age": 16, "activities": [{"name": "running", "duration": 340},{"name": "walking", "duration": 790}]}
print(participant.values())
print(participant.keys())
The print gives these results:
dict_values([[{'duration': 340, 'name': 'running'}, {'duration': 790, 'name': 'walking'}], 'Lisa', 16])
dict_keys(['activities', 'name', 'age'])
I don't want 'dict_values' and 'dict_keys' in the result. What am I doing wrong?
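For this purpose you can wrap the result in the built-in list() constructor (list is a type, not a keyword):
list(participant.keys()) # ['name', 'age', 'activities'] on Python 3.7+, where dicts keep insertion order
list(participant.values())
# ['Lisa', 16, [{'name': 'running', 'duration': 340}, {'name': 'walking', 'duration': 790}]]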
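As for why the `dict_keys(...)` and `dict_values(...)` wrappers appear at all: in Python 3 these methods return view objects that stay linked to the dictionary, whereas `list()` takes a snapshot. The short sketch below illustrates the difference; the `city` entry is a hypothetical extra field added only for the demonstration.

```python
participant = {"name": "Lisa", "age": 16}

keys_view = participant.keys()      # a live view onto the dictionary, not a copy
keys_snapshot = list(participant)   # iterating a dict yields its keys

participant["city"] = "Oslo"        # hypothetical field, added after both were created

print(keys_view)          # the view reflects the new key
print(keys_snapshot)      # the list does not change
print("age" in keys_view)  # views support membership tests without conversion
```

So converting to a list is only needed when you want a plain list, for example for indexing or printing; for looping or membership tests the view can be used directly.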
