Flat json to nested json python - python

I want to convert input json to nested json defined, I am not able to think of any json library which help me achieve this
Input json
[{'Name': 'John', 'state': 'Boston', 'currency': 'USD', 'marks': 100},
{'Name': 'Rohan', 'state': 'Paris', 'currency': 'EUR', 'marks': 20},
{'Name': 'Rohan', 'state': 'Lyon', 'currency': 'EUR', 'marks': 11.4},
{'Name': 'Messi', 'state': 'Madrid', 'currency': 'EUR', 'marks': 9.9},
{'Name': 'Lampard', 'state': 'London', 'currency': 'GBP', 'marks': 12.2},
{'Name': 'Lampard', 'state': 'London', 'currency': 'FBP', 'marks': 10.9}]
output json
{
"USD": {
"John": {
"Boston": [
{
"Marks": 100
}
]
},
Current scenario based on value Currency,Name,state,marks
The nested json can be put upto n level if required such as Name and state and marks or it can be Name , curreny , state and marks or Name,curreny and marks

So you want currency > name > state > list of marks.
One solution would be to create the structure using defaultdicts, and then just add to it.
from collections import defaultdict
from functools import wraps
data = [...]
def ddmaker(type_):
#wraps(dict)
def caller():
return defaultdict(type_)
return caller
# create the structure of the output
output = defaultdict(ddmaker(ddmaker(list)))
# add to it
for item in data:
currency = item["currency"]
name = item["Name"]
state = item["state"]
mark = item["marks"]
output[currency][name][state].append({'Marks': mark})

Related

How do extract query-like results from a nested dictionary in Python?

I have a relatively simple nested dictionary as below:
emp_details = {
'Employee': {
'jim': {'ID':'001', 'Sales':'75000', 'Title': 'Lead'},
'eva': {'ID':'002', 'Sales': '50000', 'Title': 'Associate'},
'tony': {'ID':'003', 'Sales': '150000', 'Title': 'Manager'}
}
}
I can get the sales info of 'eva' easily by:
print(emp_details['Employee']['eva']['Sales'])
but I'm having difficulty writing a statement to extract information on all employees whose sales are over 50000.
You can't use one statement because the list initializer expression can't have an if without an else.
Use a for loop:
result = {} # dict expression
result_list = [] # list expression using (key, value)
for key, value in list(emp_details['Employee'].items())): # iterate from all items in dictionary
if int(value['Sales']) > 50000: # your judgement
result[key] = value # add to dict
result_list.append((key, value)) # add to list
print(result)
print(result_list)
# should say:
'''
{'jim': {'ID':'001', 'Sales':'75000', 'Title': 'Lead'}, 'tony': {'ID':'003', 'Sales': '150000', 'Title': 'Manager'}}
[('jim', {'ID':'001', 'Sales':'75000', 'Title': 'Lead'}), ('tony', {'ID':'003', 'Sales': '150000', 'Title': 'Manager'})]
'''
Your Sales is of String type.
Therefore, we can do something like this to get the information of employees whose sales are over 50000 : -
Method1 :
If you just want to get the information : -
emp_details={'Employee':{'jim':{'ID':'001', 'Sales':'75000', 'Title': 'Lead'}, \
'eva':{'ID':'002', 'Sales': '50000', 'Title': 'Associate'}, \
'tony':{'ID':'003', 'Sales': '150000', 'Title': 'Manager'}
}}
for emp in emp_details['Employee']:
if int(emp_details['Employee'][emp]['Sales']) > 50000:
print(emp_details['Employee'][emp])
It print outs to -:
{'ID': '001', 'Sales': '75000', 'Title': 'Lead'}
{'ID': '003', 'Sales': '150000', 'Title': 'Manager'}
Method2 : You can use Dict and List comprehension to get complete information : -
emp_details={'Employee':{'jim':{'ID':'001', 'Sales':'75000', 'Title': 'Lead'}, \
'eva':{'ID':'002', 'Sales': '50000', 'Title': 'Associate'}, \
'tony':{'ID':'003', 'Sales': '150000', 'Title': 'Manager'}
}}
emp_details_dictComp = {k:v for k,v in list(emp_details['Employee'].items()) if int(v['Sales']) > 50000}
print(emp_details_dictComp)
emp_details_listComp = [(k,v) for k,v in list(emp_details['Employee'].items()) if int(v['Sales']) > 50000]
print(emp_details_listComp)
Result : -
{'jim': {'ID': '001', 'Sales': '75000', 'Title': 'Lead'}, 'tony': {'ID': '003', 'Sales': '150000', 'Title': 'Manager'}}
[('jim', {'ID': '001', 'Sales': '75000', 'Title': 'Lead'}), ('tony', {'ID': '003', 'Sales': '150000', 'Title': 'Manager'})]

How do I iterate a nested dictionary with string formatting?

I checked a few other posts and either they didn't contain the information I need or I didn't understand them. I want to make this program print the sentence for every entry in the nested dictionary, and maybe also make a function to do this as well (not familiar with these yet).
I know it will use a for loop but what I can't figure out is how to configure the keys(?).
people = {
1: {
'name': 'David Wallace',
'age': 50,
'occupation': 'CFO',
'ethnicity': 'American',
'location': 'New York'
},
2: {
'name': 'Michael',
'age': 42,
'occupation': 'Regional Manager',
'ethnicity': 'American',
'location': 'Scranton, Pennsylvania'
},
3: {
'name': 'Jim',
'age': 27,
'occupation': 'Sales Rep',
'ethnicity': 'American',
'location': 'Scranton, Pennsylvania'
}
}
print('{name} is a {age} year-old {ethnicity} {occupation} from {location}.'.format(**people))
You're really treating the top-level dict more like a list, so you can write a for loop traversing the top-level like so:
people = {
1: {
'name': 'David Wallace',
'age': 50,
'occupation': 'CFO',
'ethnicity': 'American',
'location': 'New York'
},
2: {
'name': 'Michael',
'age': 42,
'occupation': 'Regional Manager',
'ethnicity': 'American',
'location': 'Scranton, Pennsylvania'
},
3: {
'name': 'Jim',
'age': 27,
'occupation': 'Sales Rep',
'ethnicity': 'American',
'location': 'Scranton, Pennsylvania'
}
}
for person in people.values():
print('{name} is a {age} year-old {ethnicity} {occupation} from {location}.'.format(**person))
The full reference for Python dictionaries is here: https://docs.python.org/3/library/stdtypes.html#dict.items
Edit: Thanks to user Chris Charley for the suggestion to use people.values() instead of people.items()

How to rename keys in a dictionary and make a dataframe of it?

I have a complex situation which I hope to solve and which might profit us all. I collected data from my API, added a pagination and inserted the complete data package in a tuple named q1 and finally I have made a dictionary named dict_1of that tuple which looks like this:
dict_1 = {100: {'ID': 100, 'DKSTGFase': None, 'DK': False, 'KM': None,
'Country: {'Name': GE', 'City': {'Name': 'Berlin'}},
'Type': {'Name': '219'}, 'DKObject': {'Name': '8555', 'Object': {'Name': 'Car'}},
'Order': {'OrderId': 101, 'CreatedOn': '2018-07-06T16:54:36.783+02:00',
'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
'Name': Audi, 'Client': {‘1’ }}, 'DKComponent': {'Name': ‘John’}},
{200: {'ID': 200, 'DKSTGFase': None, 'DK': False, ' KM ': None,
'Country: {'Name': ES', 'City': {'Name': 'Madrid'}}, 'Type': {'Name': '220'},
'DKObject': {'Name': '8556', 'Object': {'Name': 'Car'}},
'Order': {'OrderId': 102, 'CreatedOn': '2018-07-06T16:54:36.783+02:00',
'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
'Name': Mercedes, 'Client': {‘2’ }}, 'DKComponent': {'Name': ‘Sergio’}},
Please note that in the above dictionary I have just stated 2 records. The actual dictionary has 1400 records till it reaches ID 1500.
Now I want to 2 things:
I want to change some keys for all the records. key DK has to become DK1. Key Name in Country has to become Name1 and Name in Object has to become 'Name2'
The second thing I want is to make a dataFrame of the whole bunch of data. My expected outcome is:
This is my code:
q1 = response_2.json()
next_link = q1['#odata.nextLink']
q1 = [tuple(q1.values())]
while next_link:
new_response = requests.get(next_link, headers=headers, proxies=proxies)
new_data = new_response.json()
q1.append(tuple(new_data.values()))
next_link = new_data.get('#odata.nextLink', None)
dict_1 = {
record['ID']: record
for tup in q1
for record in tup[2]
}
#print(dict_1)
for x in dict_1.values():
x['DK1'] = x['DK']
x['Country']['Name1'] = x['Country']['Name']
x['Object']['Name2'] = x['Object']['Name']
df = pd.DataFrame(dict_1)
When i run this I receive the following Error:
Traceback (most recent call last):
File "c:\data\FF\Desktop\Python\PythongMySQL\Talky.py", line 57, in <module>
x['Country']['Name1'] = x['Country']['Name']
TypeError: 'NoneType' object is not subscriptable
working code
lists=[]
alldict=[{100: {'ID': 100, 'DKSTGFase': None, 'DK': False, 'KM': None,
'Country': {'Name': 'GE', 'City': {'Name': 'Berlin'}},
'Type': {'Name': '219'}, 'DKObject': {'Name': '8555', 'Object': {'Name': 'Car'}},
'Order': {'OrderId': 101, 'CreatedOn': '2018-07-06T16:54:36.783+02:00',
'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
'Name': 'Audi', 'Client': {'1' }}, 'DKComponent': {'Name': 'John'}}}]
for eachdict in alldict:
key=list(eachdict.keys())[0]
eachdict[key]['DK1']=eachdict[key]['DK']
del eachdict[key]['DK']
eachdict[key]['Country']['Name1']=eachdict[key]['Country']['Name']
del eachdict[key]['Country']['Name']
eachdict[key]['DKObject']['Object']['Name2']=eachdict[key]['DKObject']['Object']['Name']
del eachdict[key]['DKObject']['Object']['Name']
lists.append([key, eachdict[key]['DK1'], eachdict[key]['KM'], eachdict[key]['Country']['Name1'],
eachdict[key]['Country']['City']['Name'], eachdict[key]['DKObject']['Object']['Name2'], eachdict[key]['Order']['Client']])
pd.DataFrame(lists, columns=[<columnNamesHere>])
Output:
{100: {'ID': 100,
'DKSTGFase': None,
'KM': None,
'Country': {'City': {'Name': 'Berlin'}, 'Name1': 'GE'},
'Type': {'Name': '219'},
'DKObject': {'Name': '8555', 'Object': {'Name2': 'Car'}},
'Order': {'OrderId': 101,
'CreatedOn': '2018-07-06T16:54:36.783+02:00',
'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
'Name': 'Audi',
'Client': {'1'}},
'DKComponent': {'Name': 'John'},
'DK1': False}}

Retrieve only certain keys and values from a dictionary, nested inside a list

I've been stuck on this for hours.. I want to retrieve only ONE individuals keys and values from a dictionary that is nested inside of a list.
GAMERS = [{
'name': 'Fatboi',
'parent': 'Dick Van Dyke',
'game': 'Dark Souls 3',
'weight': '420 lbs'
},
{
'name': 'Justin',
'parent': 'Heather Blueberry',
'game': 'Tetris',
'weight': '180 lbs'
},
{
'name': 'jerkhead',
'parent': 'none',
'games': 'Hello Kitty',
'weight': '240 lbs'
},{
'name': 'Tumor',
'parent': 'Jack Black',
'games': 'Trying to live',
'weight': '150 lbs'
}]
So for instance I want to get Justins information printed only, nobody elses. Any insights?
You can pass the key which you want and push it to separate list.
GAMERS = [{
'name': 'Fatboi',
'parent': 'Dick Van Dyke',
'game': 'Dark Souls 3',
'weight': '420 lbs'
},
{
'name': 'Justin',
'parent': 'Heather Blueberry',
'game': 'Tetris',
'weight': '180 lbs'
},{
'name': 'jerkhead',
'parent': 'none',
'games': 'Hello Kitty',
'weight': '240 lbs'
}]
def get_key_pair_list(input_dict, key):
new_list = []
for item in input_dict:
my_dict = {}
if key in item.keys():
my_dict[key] = item[key]
new_list.append(my_dict)
return new_list
print(get_key_pair_list(GAMERS, 'name'))
Output:
[{'name': 'Fatboi'}, {'name': 'Justin'}, {'name': 'jerkhead'}]
Comprehensive way:
key = 'name'
my_list = [{key, item[key]} for item in GAMERS if key in item.keys() ]
print(my_list)
output:
[{'name', 'Fatboi'}, {'name', 'Justin'}, {'name', 'jerkhead'}]
You want to filter the list and grab the first value that matches a predicate. Make sure to handle the case where the item doesnt exist!
filtered_info = (
item for item in GAMERS if item['name'] == 'Justin'
)
justin_info = next(filtered_info, None)
if justin_info is not None:
print(justin_info)

pystache: render context inside lambda

This is very similar to https://github.com/defunkt/pystache/issues/157, however in the mentioned post didn't really answer...
My target: print the following lines:
Al,John,Jack
Tim,Tom,Todd
without a final comma.
I tried this way:
ctx = {
'gangs': [
{'gangsters': [ {'name': 'Al' }, {'name': 'John'}, {'name': 'Jack'}]},
{'gangsters': [ {'name': 'Tim'}, {'name': 'Tom'} , {'name': 'Todd'}]},
]
}
class Lambdas(object):
def __init__(self, renderer):
self.renderer = renderer
def rstrip(self):
"Remove last character"
print self.renderer.context
return lambda s: self.renderer.render(s, self.renderer.context)[:-1]
renderer = pystache.Renderer(missing_tags='strict')
print renderer.render("""
{{#gangs}}
{{#rstrip}}{{#gangsters}}{{name}},{{/gangsters}}{{/rstrip}}
{{/gangs}}
""", ctx, Lambdas(renderer))
The output:
ContextStack({'gangs': [{'gangsters': [{'name': 'Al'}, {'name': 'John'}, {'name': 'Jack'}]}, {'gangsters': [{'name': 'Tim'}, {'name': 'Tom'}, {'name': 'Todd'}]}]}, <__main__.Lambdas object at 0x15cadb10>, {'gangsters': [{'name': 'Al'}, {'name': 'John'}, {'name': 'Jack'}]})
ContextStack({'gangs': [{'gangsters': [{'name': 'Al'}, {'name': 'John'}, {'name': 'Jack'}]}, {'gangsters': [{'name': 'Tim'}, {'name': 'Tom'}, {'name': 'Todd'}]}]}, <__main__.Lambdas object at 0x15cadb10>, {'gangsters': [{'name': 'Al'}, {'name': 'John'}, {'name': 'Jack'}]})
Al,John,Jack
Al,John,Jack
The culprit is the invocation to render() inside rstrip. Notice how, during the second call, the 3d element of the ContextStack is exactly identical to the previous call.
Is this a bug, or am I missing something?!?
Answered upstream: https://github.com/defunkt/pystache/issues/158
def rstrip(self):
"Remove last character"
return lambda s: copy.deepcopy(self.renderer).render(s, self.renderer.context)[:-1]

Categories

Resources