How to select multiple JSON objects using Python

I have JSON data as follows. The JSON contains many such objects with the same NameId:
[{
    "NameId": "name1",
    "exp": {
        "exp1": "test1"
    }
}, {
    "NameId": "name1",
    "exp": {
        "exp2": "test2"
    }
}]
Now, what I am after is to create a new JSON object with a merged exp and write a file like the one below, so that I do not have multiple entries with the same NameId:
[{
    "NameId": "name1",
    "exp": {
        "exp1": "test1",
        "exp2": "test2"
    }
}]
Is there a way I can achieve this using Python?

You can do the merging manually, rebuilding the structure as you go. Keep a dictionary mapping each NameId to its exp dict so later entries with the same name can be merged into it.
import json

jsonData = [{
    "NameId": "name1",
    "exp": {
        "exp1": "test1"
    }
}, {
    "NameId": "name1",
    "exp": {
        "exp2": "test2"
    }
}, {
    "NameId": "name2",
    "exp": {
        "exp3": "test3"
    }
}]

result = []
expsDict = {}
for entry in jsonData:
    nameId = entry["NameId"]
    exp = entry["exp"]
    if nameId in expsDict:
        # Merge exp into resultExp.
        # Note that resultExp belongs to both result and expsDict,
        # so changes made here are reflected in both containers!
        resultExp = expsDict[nameId]
        for (expName, expValue) in exp.items():
            resultExp[expName] = expValue
    else:
        # Copy, otherwise merging would modify jsonData too!
        exp = exp.copy()
        entry = entry.copy()
        entry["exp"] = exp
        # Add a new item to the result.
        result.append(entry)
        # Store exp to later merge other entries with the same name.
        expsDict[nameId] = exp

print(result)
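Since the question also asks about writing the merged structure to a file, a minimal sketch using the standard json module (the filename merged.json is just an example, not anything the question specifies):

# Write the merged list to a file; "merged.json" is an arbitrary example name.
with open("merged.json", "w") as f:
    json.dump(result, f, indent=4)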

You can use itertools.groupby and functools.reduce
d = [{
    "NameId": "name1",
    "exp": {
        "exp1": "test1"
    }
}, {
    "NameId": "name1",
    "exp": {
        "exp2": "test2"
    }
}]
from itertools import groupby
from functools import reduce

[{'NameId': k, 'exp': reduce(lambda x, y: {**x, **y}, (e["exp"] for e in v))}
 for k, v in groupby(sorted(d, key=lambda x: x["NameId"]), lambda x: x["NameId"])]
#output
[{'NameId': 'name1', 'exp': {'exp1': 'test1', 'exp2': 'test2'}}]
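If the one-liner is hard to follow, the same grouping can be written more explicitly with a plain dict and setdefault. This is only a sketch of an alternative, not the answer's own code; note that groupby only groups consecutive items, which is why sorted is applied above.

merged = {}
for entry in d:
    # setdefault creates the merged exp dict the first time a NameId is seen.
    merged.setdefault(entry["NameId"], {}).update(entry["exp"])

result = [{"NameId": name, "exp": exp} for name, exp in merged.items()]
print(result)  # [{'NameId': 'name1', 'exp': {'exp1': 'test1', 'exp2': 'test2'}}]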

Related

Convert Pandas Dataframe or csv file to Custom Nested JSON

I have a CSV file with a DataFrame structured as follows:
(my dataframe)
I want to write the data in the following JSON format using Python. I looked at a couple of links (but got lost in the nested part). The links I checked:
How to convert pandas dataframe to uniquely structured nested json
convert dataframe to nested json
"PHI": 2,
"firstname": "john",
"medicalHistory": {
"allergies": "egg",
"event": {
"inPatient":{
"hospitalized": {
"visit" : "7-20-20",
"noofdays": "5",
"test": {
"modality": "xray"
}
"vitalSign": {
"temperature": "32",
"heartRate": "80"
},
"patientcondition": {
"headache": "1",
"cough": "0"
}
},
"icu": {
"visit" : "",
"noofdays": "",
},
},
"outpatient": {
"visit":"5-20-20",
"vitalSign": {
"temperature": "32",
"heartRate": "80"
},
"patientcondition": {
"headache": "1",
"cough": "1"
},
"test": {
"modality": "blood"
}
}
}
}
If anyone can help me with the nested array, that will be really helpful.
You need one or more helper functions to unpack data laid out like this. Write a main helper function that accepts two arguments: the df and a schema. The schema is used to unpack each row of the df into a nested structure. The schema below shows how to achieve this for a subset of the logic you describe. It is not exactly what you specified in your example, but it should be enough of a hint for you to complete the rest of the task on your own.
from operator import itemgetter

groupby_idx = ['PHI', 'firstName']
groups = df.groupby(groupby_idx, as_index=False)

schema = {
    "event": {
        "eventType": itemgetter('event'),
        "visit": itemgetter('visit'),
        "noOfDays": itemgetter('noofdays'),
        "test": {
            "modality": itemgetter('test')
        },
        "vitalSign": {
            "temperature": itemgetter('temperature'),
            "heartRate": itemgetter('heartRate')
        },
        "patientCondition": {
            "headache": itemgetter('headache'),
            "cough": itemgetter('cough')
        }
    }
}

def unpack(obj, schema):
    # Recursively apply the schema to one row.
    tmp = {}
    for k, v in schema.items():
        if isinstance(v, dict):
            tmp[k] = unpack(obj, v)
        if callable(v):
            tmp[k] = v(obj)
    return tmp

def apply_unpack(groups, schema):
    # Build a list of nested event dicts for every group of rows.
    results = {}
    for gidx, df in groups:
        events = []
        for ridx, obj in df.iterrows():
            d = unpack(obj, schema)
            events.append(d)
        results[gidx] = events
    return results

unpacked = apply_unpack(groups, schema)
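For completeness, a hedged sketch of how this could be driven end to end. The DataFrame columns and values here are hypothetical, chosen only to match the schema above; in practice the df would come from the CSV file.

import json
import pandas as pd

# Hypothetical flat table standing in for the CSV data.
df = pd.DataFrame([
    {"PHI": 2, "firstName": "john", "event": "inPatient", "visit": "7-20-20",
     "noofdays": "5", "test": "xray", "temperature": "32", "heartRate": "80",
     "headache": "1", "cough": "0"},
])

groups = df.groupby(['PHI', 'firstName'], as_index=False)
unpacked = apply_unpack(groups, schema)

# Group keys are tuples, which json cannot serialize directly, so stringify them.
print(json.dumps({str(k): v for k, v in unpacked.items()}, indent=2))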

How to make dynamic nested updates to a dict?

I have a requirement where I have to update/merge a nested child of a dict. I have tried dict.update, but it strips the sibling keys (get_users in the example below).
I can update a dict like tree['endpoints']['get_tickets']['handlers']['after'] = 'new_after_handler', but those dict keys will be dynamic, coming from a string. Any idea how to achieve this?
So I basically want to get the test below to pass; of course, endpoints.get_tickets.handlers will be dynamic.
def test_partial_merge(self):
    source = {
        "name": "tucktock",
        "endpoints": {
            "get_tickets": {
                "path": "tickets",
                "handlers": {
                    "after": "after_handler",
                    "after_each": "after_each_handler"
                }
            },
            "get_users": {},
        },
    }
    merging = {
        "after": "new_after_handler",
    }
    expected = {
        "name": "tucktock",
        "endpoints": {
            "get_tickets": {
                "path": "tickets",
                "handlers": {
                    "after": "new_after_handler",
                    "after_each": "after_each_handler"
                }
            },
            "get_users": {},
        },
    }
    merger = Merger()
    result = merger.merge(source, merging, "endpoints.get_tickets.handlers")
    self.assertEqual(expected, result)
You can do something like this:
source = {
    "name": "tucktock",
    "endpoints": {
        "get_tickets": {
            "path": "tickets",
            "handlers": {
                "after": "after_handler",
                "after_each": "after_each_handler"
            }
        },
        "get_users": {},
    },
}
merging = {
    "after": "new_after_handler",
}
expected = {
    "name": "tucktock",
    "endpoints": {
        "get_tickets": {
            "path": "tickets",
            "handlers": {
                "after": "new_after_handler",
                "after_each": "after_each_handler"
            }
        },
        "get_users": {},
    },
}

def merge(a, b, dict_path):  # modifies a in place
    # Walk down to the nested dict addressed by dict_path, then update it.
    for key in dict_path:
        a = a[key]
    a.update(b)

merge(source, merging, "endpoints.get_tickets.handlers".split('.'))
print(source == expected)  # prints True
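The question's test calls a Merger class rather than a free function. A minimal sketch of such a wrapper around the same idea follows; the deep copy is an assumption so that the caller's source is left untouched and a fresh result is returned, which is enough for the assertEqual in the test.

import copy

class Merger:
    def merge(self, source, merging, path):
        # Deep-copy so the caller's source dict is not mutated.
        result = copy.deepcopy(source)
        target = result
        for key in path.split('.'):
            target = target[key]
        target.update(merging)
        return result

With this, merger.merge(source, merging, "endpoints.get_tickets.handlers") returns the expected structure while leaving get_users intact.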
In your Merger.merge method you can convert the source to collections.defaultdict(dict). Then you can iterate over the third parameter ("endpoints.get_tickets.handlers".split('.')) and iteratively go to the level of depth you need, then update this part.
Example:
from collections import defaultdict

def merge(source, merging, path):
    result = defaultdict(dict)
    result.update(source)
    current_level = result
    for key in path.split('.'):
        current_level = current_level[key]
    current_level.update(merging)
    return result

Add the sub-dictionary element to a list in Python

I am trying to add my sub-dictionary elements to a list, but it is giving me a TypeError.
Here is the dictionary and my code:
{
    "key1": "value1",
    "key2": {
        "skey1": "svalue2",
        "skey2": {
            "sskey1": [{
                "url": "value",
                "sid": "511"
            },
            {
                "url": "value",
                "sid": "522"
            },
            {
                "url": "value",
                "sid": "533"
            }]
        }
    }
}
I want to collect the sid values into a list like [511, 522, 533].
Here is my code:
rsId = []
for i in op['key2']['skey2']['sskey1']:
    for k, v in i.items():
        if k == 'sid':
            rsId.append(v)
D = {
    "key1": "value1",
    "key2": {
        "skey1": "svalue2",
        "skey2": {
            "sskey1": [{
                "url": "value",
                "sid": "511"
            },
            {
                "url": "value",
                "sid": "522"
            },
            {
                "url": "value",
                "sid": "533"
            }]
        }
    }
}

res = []
for i in D['key2']['skey2']['sskey1']:
    res.append(i['sid'])

print(res)
Result:
['511', '522', '533']
Or as a one-liner:
res = [i['sid'] for i in D['key2']['skey2']['sskey1']]
You can use a list comprehension:
rsId = [v for item in op['key2']['skey2']['sskey1'] for k, v in item.items() if k == 'sid']
You can try it in one line with map, something like this:
print(list(map(lambda x: x['sid'], data['key2']['skey2']['sskey1'])))
output:
['511', '522', '533']
If you want the values as int, then:
print(list(map(lambda x: int(x['sid']), data['key2']['skey2']['sskey1'])))
output:
[511, 522, 533]
where data is:
data = {
    "key1": "value1",
    "key2": {
        "skey1": "svalue2",
        "skey2": {
            "sskey1": [{
                "url": "value",
                "sid": "511"
            },
            {
                "url": "value",
                "sid": "522"
            },
            {
                "url": "value",
                "sid": "533"
            }]
        }
    }
}
Get the int values as output
The TypeError is probably due to the fact that the list items are strings. Transforming them to numbers with int() solves your problem.
The only change to your code is in the last line.
op = {
    "key1": "value1",
    "key2": {
        "skey1": "svalue2",
        "skey2": {
            "sskey1": [{
                "url": "value",
                "sid": "511"
            },
            {
                "url": "value",
                "sid": "522"
            },
            {
                "url": "value",
                "sid": "533"
            }]
        }
    }
}

rsId = []
for i in op['key2']['skey2']['sskey1']:
    for k, v in i.items():
        if k == 'sid':
            rsId.append(int(v))  # put the int here
output
>>> rsId
[511, 522, 533]
Another approach: checking every key that has a dictionary as a value
op = {
    "key1": "value1",
    "key2": {
        "skey1": "svalue2",
        "skey2": {
            "sskey1": [
                {
                    "url": "value",
                    "sid": "511"
                },
                {
                    "url": "value",
                    "sid": "522"
                },
                {
                    "url": "value",
                    "sid": "533"
                }
            ]
        }
    }
}

l = []
for k in op:                             # searching in the main dictionary
    if type(op[k]) is dict:              # if the value contains a dict (sub1)
        for k2 in op[k]:                 # for every key
            if type(op[k][k2]) is dict:  # if the value is a dict (sub2)
                for k3 in op[k][k2]:     # for each key of subdict 2
                    for i in op[k][k2][k3]:       # for every item of the list
                        for k4 in i:              # for each key in the item (a dict)
                            if k4 == 'sid':       # if the key is 'sid'
                                l.append(int(i[k4]))  # append the value
print(l)
output
[511, 522, 533]
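The hard-coded nesting above only works for this exact depth. As a hedged generalization beyond the original answer, a recursive walker could collect every 'sid' value at any depth, which may be useful if the structure varies:

def collect_sids(node):
    """Recursively collect int 'sid' values from nested dicts/lists."""
    sids = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'sid':
                sids.append(int(value))
            else:
                sids.extend(collect_sids(value))
    elif isinstance(node, list):
        for item in node:
            sids.extend(collect_sids(item))
    return sids

print(collect_sids(op))  # [511, 522, 533]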

Filtering out desired data from a JSON file (Python)

This is a sample of my JSON file:
{
    "pops": [{
            "name": "pop_a",
            "subnets": {
                "Public": ["1.1.1.0/24,2.2.2.0/24"],
                "Private": ["192.168.0.0/24,192.168.1.0/24"],
                "more DATA": ""
            }
        },
        {
            "name": "pop_b",
            "subnets": {
                "Public": ["3.3.3.0/24,4.4.4.0/24"],
                "Private": ["192.168.2.0/24,192.168.3.0/24"],
                "more DATA": ""
            }
        }
    ]
}
After I read it, I want to make a dict object and store some of the things that I need from this file.
I want my object to be like this:
[{
    "name": "pop_a",
    "subnets": {"Public": ["1.1.1.0/24,2.2.2.0/24"], "Private": ["192.168.0.0/24,192.168.1.0/24"]}
},
{
    "name": "pop_b",
    "subnets": {"Public": ["3.3.3.0/24,4.4.4.0/24"], "Private": ["192.168.2.0/24,192.168.3.0/24"]}
}]
Then I want to be able to access some of the Public/Private values.
Here is what I tried; I know there are also update() and setdefault(), which gave the same unwanted results:
def my_funckion():
    nt_json = [{'name': "", 'subnets': []}]
    Pname = []
    Psubnet = []
    for pop in pop_json['pops']:  # it prints only the last key/value
        nt_json[0]['name'] = pop['name']
        nt_json[0]['subnet'] = pop['subnet']
    pprint(nt_json)
    for pop in pop_json['pops']:
        """
        it prints the names in a row, then all of the IPs
        """
        Pname.append(pop['name'])
        Pgre.append(pop['subnet'])
    nt_json['pop_name'] = Pname
    nt_json['subnet'] = Psubnet
    pprint(nt_json)
Here's a quick solution using a list comprehension. Note that this approach can only be taken with enough knowledge of the JSON structure.
>>> import json
>>>
>>> data = ... # your data
>>> new_data = [{ "name" : x["name"], "subnets" : {"Public" : x["subnets"]["Public"], "Private" : x["subnets"]["Private"]}} for x in data["pops"]]
>>>
>>> print(json.dumps(new_data, indent=2))
[
  {
    "name": "pop_a",
    "subnets": {
      "Private": [
        "192.168.0.0/24,192.168.1.0/24"
      ],
      "Public": [
        "1.1.1.0/24,2.2.2.0/24"
      ]
    }
  },
  {
    "name": "pop_b",
    "subnets": {
      "Private": [
        "192.168.2.0/24,192.168.3.0/24"
      ],
      "Public": [
        "3.3.3.0/24,4.4.4.0/24"
      ]
    }
  }
]
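Since the question also asks about accessing the Public/Private values afterwards, a short usage sketch on top of new_data (the lookup-by-name loop is just one option):

# Look up the subnets of a given pop by name.
for pop in new_data:
    if pop["name"] == "pop_a":
        print(pop["subnets"]["Public"])   # ['1.1.1.0/24,2.2.2.0/24']
        print(pop["subnets"]["Private"])  # ['192.168.0.0/24,192.168.1.0/24']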

How can I convert my JSON into the format required to make a D3 sunburst diagram?

I have the following JSON data:
{
    "data": {
        "databis": {
            "dataexit": {
                "databis2": {
                    "1250": {}
                }
            },
            "datanode": {
                "20544": {}
            }
        }
    }
}
I want to use it to generate a D3 sunburst diagram, but that requires a different data format:
{
    "name": "data",
    "children": [
        {
            "name": "databis",
            "children": [
                {
                    "name": "dataexit",
                    "children": [
                        {
                            "name": "databis2",
                            "size": "1250"
                        }
                    ]
                },
                {
                    "name": "datanode",
                    "size": "20544"
                }
            ]
        }
    ]
}
How can I do this with Python? I think I need to use a recursive function, but I don't know where to start.
You could use a recursive solution with a function that takes a name and a dictionary as parameters. For every item in the given dict it calls itself again to generate a list of children, each of which looks like this: {'name': 'name here', 'children': []}.
It then checks for the special case where there is only one child whose children key is an empty list. In that case it returns a dict that uses the given parameter as the name and the child's name as the size. In all other cases the function returns a dict with name and children.
import json

data = {
    "data": {
        "databis": {
            "dataexit": {
                "databis2": {
                    "1250": {}
                }
            },
            "datanode": {
                "20544": {}
            }
        }
    }
}

def helper(name, d):
    # Collect all children
    children = [helper(*x) for x in d.items()]
    # Return a dict containing size in case the only child looks like this:
    # {'name': '1250', 'children': []}
    # Note that get is used so that cases where a child already has size
    # instead of children work correctly
    if len(children) == 1 and children[0].get('children') == []:
        return {'name': name, 'size': children[0]['name']}
    # Normal case where the returned dict has children
    return {'name': name, 'children': children}

def transform(d):
    return helper(*next(iter(d.items())))

print(json.dumps(transform(data), indent=4))
Output:
{
    "name": "data",
    "children": [
        {
            "name": "databis",
            "children": [
                {
                    "name": "dataexit",
                    "children": [
                        {
                            "name": "databis2",
                            "size": "1250"
                        }
                    ]
                },
                {
                    "name": "datanode",
                    "size": "20544"
                }
            ]
        }
    ]
}
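D3 sunburst examples typically load this hierarchy from a file, so as a final hedged step you could write the transformed structure to disk; the filename flare.json here is just an example, not anything the original answer specifies.

# Write the transformed hierarchy so a D3 page can fetch it, e.g. with d3.json().
with open("flare.json", "w") as f:
    json.dump(transform(data), f, indent=4)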
