How can I get certain levels of JSON in Python? - python

If my JSON data looks like this:
{
"name": "root",
"children": [
{
"name": "a",
"children": [
{
"name": "b",
"children": [
{
"name": "c",
"size": "1"
},
{
"name": "d",
"size": "2"
}
]
},
{
"name": "e",
"size": 3
}
]
},
{
"name": "f",
"children": [
{
"name": "g",
"children": [
{
"name": "h",
"size": "1"
},
{
"name": "i",
"size": "2"
}
]
},
{
"name": "j",
"size": 5
}
]
}
]
}
How can I return two adjacent levels in Python?
For example return:
a - b,e
f - g,j
The data could become very large, so I have to slice it into smaller pieces.
Thanks for any help.

You need to build a tree of dicts, with values as the leaves:
{'a': {'b': {'c': '1', 'd': '2'}, 'e': 3}, 'f': {'g': {'h': '1', 'i': '2'}, 'j': 5}}
This can be decomposed into three separate actions:
get the "name" of a node for use as a key
if the node has "children", transform them to a dict
if the node has a "size", transform that to the single value
Unless your data is deeply nested, recursion is a straightforward approach:
def compress(node: dict) -> dict:
    name = node['name']  # get the name
    try:
        children = node['children']  # get the children...
    except KeyError:
        return {name: node['size']}  # or return name and value
    else:
        data = {}
        for child in children:  # collect and compress all children
            data.update(compress(child))
        return {name: data}
This compresses the entire hierarchy, including the "root" node:
>>> compress(data)
{'root': {'a': {'b': {'c': '1', 'd': '2'}, 'e': 3},
'f': {'g': {'h': '1', 'i': '2'}, 'j': 5}}}
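To get the two adjacent levels asked for in the question (each child of the root together with its own direct children), one option is to read them off the compressed tree. This is a minimal sketch, assuming the `compress` function above (rewritten here without try/except so the block is self-contained) and a trimmed copy of the sample data; `adjacent_levels` is a hypothetical helper name, not part of the original answer.

```python
def compress(node: dict) -> dict:
    """Turn a {"name": ..., "children" or "size": ...} node into {name: subtree_or_size}."""
    name = node["name"]
    if "children" not in node:
        return {name: node["size"]}
    data = {}
    for child in node["children"]:
        data.update(compress(child))
    return {name: data}


def adjacent_levels(tree: dict) -> dict:
    """Map each key of `tree` to the names of its direct children (hypothetical helper)."""
    return {
        parent: list(sub) if isinstance(sub, dict) else []
        for parent, sub in tree.items()
    }


# Trimmed sample data in the same shape as the question's JSON.
data = {
    "name": "root",
    "children": [
        {"name": "a", "children": [
            {"name": "b", "children": [{"name": "c", "size": "1"}]},
            {"name": "e", "size": 3},
        ]},
        {"name": "f", "children": [
            {"name": "g", "children": [{"name": "h", "size": "1"}]},
            {"name": "j", "size": 5},
        ]},
    ],
}

levels = adjacent_levels(compress(data)["root"])
for parent, children in levels.items():
    print(parent, "-", ",".join(children))
```

Note that this still compresses the whole tree first, so it does not by itself address very large inputs.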

Try this solution and tell me whether it works.
dictVar = {
"name": "root",
"children": [
{
"name": "a",
"children": [
{
"name": "b",
"children": [
{
"name": "c",
"size": "1"
},
{
"name": "d",
"size": "2"
}
]
},
{
"name": "e",
"size": 3
}
]
},
{
"name": "f",
"children": [
{
"name": "g",
"children": [
{
"name": "h",
"size": "1"
},
{
"name": "i",
"size": "2"
}
]
},
{
"name": "j",
"size": 5
}
]
}
]
}
name = {}
for dobj in dictVar['children']:
    for c in dobj['children']:
        if dobj['name'] not in name:
            name[dobj['name']] = [c['name']]
        else:
            name[dobj['name']].append(c['name'])
print(name)
If you need the full child objects rather than just their names, use this variant:
name = {}
for dobj in dictVar['children']:
    for c in dobj['children']:
        if dobj['name'] not in name:
            name[dobj['name']] = [c]
        else:
            name[dobj['name']].append(c)
print(name)
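The loops above are fixed to the first level below the root. As a rough sketch of how the same idea could be generalized to any depth, the function below walks down level by level and then pairs each node with its children. The name `children_by_parent` and the `depth` parameter are illustrative assumptions, and the data is a trimmed version of `dictVar`.

```python
def children_by_parent(root: dict, depth: int = 1) -> dict:
    """Return {node name: [child names]} for every node `depth` levels below root."""
    level = [root]
    for _ in range(depth):
        # Step one level down; nodes without "children" contribute nothing.
        level = [child for node in level for child in node.get("children", [])]
    return {
        node["name"]: [child["name"] for child in node["children"]]
        for node in level
        if "children" in node
    }


# Trimmed sample in the same shape as dictVar.
dictVar = {
    "name": "root",
    "children": [
        {"name": "a", "children": [
            {"name": "b", "children": [{"name": "c", "size": "1"}]},
            {"name": "e", "size": 3},
        ]},
        {"name": "f", "children": [
            {"name": "g", "children": [{"name": "h", "size": "1"}]},
            {"name": "j", "size": 5},
        ]},
    ],
}

print(children_by_parent(dictVar, depth=0))
print(children_by_parent(dictVar, depth=1))
print(children_by_parent(dictVar, depth=2))
```

Because it uses a loop rather than recursion, this version is not limited by Python's recursion depth, although it still expects the whole structure to be loaded in memory.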


Elastic Search query list with sublist

I have an index in Elastic that contains an array of keys and values.
For example - a single document looks like this:
{
"_index": "my_index",
"_source": {
"name": "test",
"values": [
{
"name": "a",
"score": 10
},
{
"name": "b",
"score": 4
},
{
"name": "c",
"score": 2
},
{
"name": "d",
"score": 1
}
]
},
"fields": {
"name": [
"test"
],
"values.name.keyword": [
"a",
"b",
"c",
"d"
],
"name.keyword": [
"test"
],
"values.score": [
10,
4,
2,
1
],
"values.name": [
"a",
"b",
"c",
"d"
]
}
}
I want to create an Elastic query (through API) that retrieves a sum of all the name scores filtered by a list of names.
For example, for the input:
names = ['a', 'b']
The result will be: 14
Any idea how to do it?
You can do this by mapping the values array as nested. Example mapping:
{
"mappings": {
"properties": {
"values": { "type": "nested" }
}
}
}
The following query will give the result you want:
{
"size":0,
"aggs": {
"asd": {
"nested": {
"path": "values"
},
"aggs": {
"filter_agg": {
"filter": {
"terms": {
"values.name.keyword": [
"a",
"b"
]
}
},
"aggs": {
"sum": {
"sum": {
"field": "values.score"
}
}
}
}
}
}
}
}
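Since the question asks about doing this through the API with a list of names, here is a sketch of building the same request body in Python for an arbitrary list. The function name `build_sum_query` and the aggregation names (`scores`, `by_name`, `total`) are illustrative choices, not anything Elasticsearch requires; sending the body to a cluster is left out because it depends on your client setup. With these names, the sum would be expected under `aggregations.scores.by_name.total.value` in the response.

```python
import json


def build_sum_query(names, path="values"):
    """Build a nested + filter + sum aggregation body for the given names."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {
            "scores": {
                "nested": {"path": path},
                "aggs": {
                    "by_name": {
                        "filter": {"terms": {f"{path}.name.keyword": list(names)}},
                        "aggs": {"total": {"sum": {"field": f"{path}.score"}}},
                    }
                },
            }
        },
    }


body = build_sum_query(["a", "b"])
print(json.dumps(body, indent=2))

# Local sanity check of the expected total on the sample document (no cluster involved).
sample_values = [
    {"name": "a", "score": 10},
    {"name": "b", "score": 4},
    {"name": "c", "score": 2},
    {"name": "d", "score": 1},
]
expected = sum(v["score"] for v in sample_values if v["name"] in {"a", "b"})
print(expected)  # 14
```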

Convert a dataframe to JSON via a dictionary using "to_dict()" and "json.dump()"

I'm trying to convert a dataframe to a particular JSON format. I've attempted doing this using the methods "to_dict()" and "json.dump()" from the pandas and json modules, respectively, but I can't get the JSON format I'm after. To illustrate:
import json
import pandas as pd

df = pd.DataFrame({
    "Location": ["1ST"] * 3 + ["2ND"] * 3,
    "Date": ["2019-01", "2019-02", "2019-03"] * 2,
    "Category": ["A", "B", "C"] * 2,
    "Number": [1, 2, 3, 4, 5, 6]
})

def dataframe_to_dictionary(df, orientation):
    dictionary = df.to_dict(orient=orientation)
    return dictionary

dict_records = dataframe_to_dictionary(df, "records")
with open("./json_records.json", "w") as json_records:
    json.dump(dict_records, json_records, indent=2)

dict_index = dataframe_to_dictionary(df, "index")
with open("./json_index.json", "w") as json_index:
    json.dump(dict_index, json_index, indent=2)
When I convert "dict_records" to JSON, I get an array of the form:
[
{
"Location": "1ST",
"Date": "2019-01",
"Category": "A",
"Number": 1
},
{
"Location": "1ST",
"Date": "2019-02",
"Category": "B",
"Number": 2
},
...
]
And, when I convert "dict_index" to JSON, I get an object of the form:
{
"0": {
"Location": "1ST",
"Date": "2019-01",
"Category": "A",
"Number": 1
},
"1": {
"Location": "1ST",
"Date": "2019-02",
"Category": "B",
"Number": 2
}
...
}
However, I'm trying to get a format like the following, where each key is a location and each value is a list of records. Thanks in advance for your help.
{
"1ST": [
{
"Date": "2019-01",
"Category": "A",
"Number": 1
},
{
"Date": "2019-02",
"Category": "B",
"Number": 2
},
{
"Date": "2019-03",
"Category": "C",
"Number": 3
}
],
"2ND": [
{},
{},
{}
]
}
This can be achieved via groupby:
gb = df.groupby('Location')
{k: v.drop('Location', axis=1).to_dict(orient='records') for k, v in gb}
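For comparison, the same grouping can be sketched without pandas, starting from the `records`-oriented list the question already produces. This is only an illustration of the target structure: `group_records` is a hypothetical helper, and the sample rows are copied from the question's DataFrame.

```python
import json
from collections import defaultdict

# The same rows as the question's DataFrame, in "records" orientation.
records = [
    {"Location": "1ST", "Date": "2019-01", "Category": "A", "Number": 1},
    {"Location": "1ST", "Date": "2019-02", "Category": "B", "Number": 2},
    {"Location": "1ST", "Date": "2019-03", "Category": "C", "Number": 3},
    {"Location": "2ND", "Date": "2019-01", "Category": "A", "Number": 4},
    {"Location": "2ND", "Date": "2019-02", "Category": "B", "Number": 5},
    {"Location": "2ND", "Date": "2019-03", "Category": "C", "Number": 6},
]


def group_records(rows, key):
    """Group rows by row[key], dropping the key column from each grouped row."""
    grouped = defaultdict(list)
    for row in rows:
        rest = {k: v for k, v in row.items() if k != key}
        grouped[row[key]].append(rest)
    return dict(grouped)


by_location = group_records(records, "Location")
print(json.dumps(by_location, indent=2))
```

Either way, the result is a plain dict of lists, so it can be written out with `json.dump` just like `dict_records` in the question.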

Adding a key/value pair once I have recursively searched a dict

I have searched a nested dict for certain keys and can locate the keys I am looking for, but I am not sure how to add a key/value pair at the location where the matching key is. Is there a way to tell Python to append the data entry at the location it is currently looking at?
Code:
import os
import json
import shutil
import re
import fileinput
from collections import OrderedDict
#Finds and lists the folders that have been provided
d='.'
folders = list(filter (lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d)))
print("Folders found: ")
print(folders)
print("\n")
def processModelFolder(inFolder):
    #Creating the file names
    fileName = os.path.join(d, inFolder, inFolder + ".mdl")
    fileNameTwo = os.path.join(d, inFolder, inFolder + ".vg2.json")
    fileNameThree = os.path.join(d, inFolder, inFolder + "APPENDED.vg2.json")
    #copying the json file so the new copy can be appended
    shutil.copyfile(fileNameTwo, fileNameThree)
    #assigning IDs and properties to search for in the mdl file
    IDs = ["7f034e5c-24df-4145-bab8-601f49b43b50"]
    Properties = ["IDSU_FX[0]"]
    #Basic check to see if IDs and Properties are valid
    for i in IDs:
        if len(i) != 36:
            print("ID may not have been valid and might not return the results you expect, check to ensure the characters are correct: ")
            print(i)
            print("\n")
    if len(IDs) == 0:
        print("No IDs were given!")
    elif len(Properties) == 0:
        print("No Properties were given!")
    #Reads code until an ID is found
    else:
        with open(fileName , "r") as in_file:
            IDCO = None
            for n, line in enumerate(in_file, 1):
                if line.startswith('IDCO_IDENTIFICATION'):
                    #Checks if the second part of each line is a ID tag in IDs
                    if line.split('"')[1] in IDs:
                        #If ID found it is stored as IDCO
                        IDCO = line.split('"')[1]
                    else:
                        if IDCO:
                            pass
                        IDCO = None
                #Checks if the first part of each line is a Prop in Properties
                elif IDCO and line.split(' ')[0] in Properties:
                    print('Found! ID:{} Prop:{} Value: {}'.format(IDCO, line.split('=')[0][:-1], line.split('=')[1][:-1]))
                    print("\n")
                    #Stores the property name and value
                    name = str(line.split(' ')[0])
                    value = str(line.split(' ')[2])
                    #creates the entry to be appended to the dict
                    #json file editing
                    with open(fileNameThree , "r+") as json_data:
                        python_obj = json.load(json_data)
                        #calling recursive search
                        get_recursively(python_obj, IDCO, name, value)
                    with open(fileNameThree , "w") as json_data:
                        json.dump(python_obj, json_data, indent = 1)
        print('Processed {} lines in file: {}'.format(n , fileName))
def get_recursively(search_dict, IDCO, name, value):
    """
    Takes a dict with nested lists and dicts,
    and searches all dicts for a key of the field
    provided, when key "id" is found it checks to,
    see if its value is the current IDCO tag, if so it appends the new data.
    """
    fields_found = []
    for key, value in search_dict.iteritems():
        if key == "id":
            if value == IDCO:
                print("FOUND IDCO IN JSON: " + value +"\n")
        elif isinstance(value, dict):
            results = get_recursively(value, IDCO, name, value)
            for result in results:
                x = 1
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    more_results = get_recursively(item, IDCO, name, value)
                    for another_result in more_results:
                        x=1
    return fields_found

for modelFolder in folders:
    processModelFolder(modelFolder)
In short, once it finds a key/id value pair that I want, can I tell it to append name/value to that location directly and then continue?
nested dict:
{
"id": "79cb20b0-02be-42c7-9b45-96407c888dc2",
"tenantId": "00000000-0000-0000-0000-000000000000",
"name": "2-stufiges Stirnradgetriebe",
"description": null,
"visibility": "None",
"method": "IDM_CALCULATE_GEAR_COUPLED",
"created": "2018-10-16T10:25:20.874Z",
"createdBy": "00000000-0000-0000-0000-000000000000",
"lastModified": "2018-10-16T10:25:28.226Z",
"lastModifiedBy": "00000000-0000-0000-0000-000000000000",
"client": "STRING_BEARINX_ONLINE",
"project": {
"id": "10c37dcc-0e4e-4c4d-a6d6-12cf65cceaf9",
"name": "proj 2",
"isBookmarked": false
},
"rootObject": {
"id": "6ff0010c-00fe-485b-b695-4ddd6aca4dcd",
"type": "IDO_GEAR",
"children": [
{
"id": "1dd94d1a-e52d-40b3-a82b-6db02a8fbbab",
"type": "IDO_SYSTEM_LOADCASE",
"children": [],
"childList": "SYSTEMLOADCASE",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "1dd94d1a-e52d-40b3-a82b-6db02a8fbbab"
},
{
"name": "IDCO_DESIGNATION",
"value": "Lastfall 1"
},
{
"name": "IDSLC_TIME_PORTION",
"value": 100
},
{
"name": "IDSLC_DISTANCE_PORTION",
"value": 100
},
{
"name": "IDSLC_OPERATING_TIME_IN_HOURS",
"value": 1
},
{
"name": "IDSLC_OPERATING_TIME_IN_SECONDS",
"value": 3600
},
{
"name": "IDSLC_OPERATING_REVOLUTIONS",
"value": 1
},
{
"name": "IDSLC_OPERATING_DISTANCE",
"value": 1
},
{
"name": "IDSLC_ACCELERATION",
"value": 9.81
},
{
"name": "IDSLC_EPSILON_X",
"value": 0
},
{
"name": "IDSLC_EPSILON_Y",
"value": 0
},
{
"name": "IDSLC_EPSILON_Z",
"value": 0
},
{
"name": "IDSLC_CALCULATION_WITH_OWN_WEIGHT",
"value": "CO_CALCULATION_WITHOUT_OWN_WEIGHT"
},
{
"name": "IDSLC_CALCULATION_WITH_TEMPERATURE",
"value": "CO_CALCULATION_WITH_TEMPERATURE"
},
{
"name": "IDSLC_FLAG_FOR_LOADCASE_CALCULATION",
"value": "LB_CALCULATE_LOADCASE"
},
{
"name": "IDSLC_STATUS_OF_LOADCASE_CALCULATION",
"value": false
}
],
"position": 1,
"order": 1,
"support_vector": {
"x": 0,
"y": 0,
"z": 0
},
"u_axis_vector": {
"x": 1,
"y": 0,
"z": 0
},
"w_axis_vector": {
"x": 0,
"y": 0,
"z": 1
},
"role": "_none_"
},
{
"id": "ab7fbf37-17bb-4e60-a543-634571a0fd73",
"type": "IDO_SHAFT_SYSTEM",
"children": [
{
"id": "7f034e5c-24df-4145-bab8-601f49b43b50",
"type": "IDO_RADIAL_ROLLER_BEARING",
"children": [
{
"id": "0b3e695b-6028-43af-874d-4826ab60dd3f",
"type": "IDO_RADIAL_BEARING_INNER_RING",
"children": [
{
"id": "330aa09d-60fb-40d7-a190-64264b3d44b7",
"type": "IDO_LOADCONTAINER",
"children": [
{
"id": "03036040-fc1a-4e52-8a69-d658e18a8d4a",
"type": "IDO_DISPLACEMENT",
"children": [],
"childList": "DISPLACEMENT",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "03036040-fc1a-4e52-8a69-d658e18a8d4a"
},
{
"name": "IDCO_DESIGNATION",
"value": "Displacement 1"
}
],
"position": 1,
"order": 1,
"support_vector": {
"x": -201.3,
"y": 0,
"z": -229.8
},
"u_axis_vector": {
"x": 1,
"y": 0,
"z": 0
},
"w_axis_vector": {
"x": 0,
"y": 0,
"z": 1
},
"shaftSystemId": "ab7fbf37-17bb-4e60-a543-634571a0fd73",
"role": "_none_"
},
{
"id": "485f5bf4-fb97-415b-8b42-b46e9be080da",
"type": "IDO_CUMULATED_LOAD",
"children": [],
"childList": "CUMULATEDLOAD",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "485f5bf4-fb97-415b-8b42-b46e9be080da"
},
{
"name": "IDCO_DESIGNATION",
"value": "Cumulated load 1"
},
{
"name": "IDCO_X",
"value": 0
},
{
"name": "IDCO_Y",
"value": 0
},
{
"name": "IDCO_Z",
"value": 0
}
],
"position": 2,
"order": 1,
"support_vector": {
"x": -201.3,
"y": 0,
"z": -229.8
},
"u_axis_vector": {
"x": 1,
"y": 0,
"z": 0
},
"w_axis_vector": {
"x": 0,
"y": 0,
"z": 1
},
"shaftSystemId": "ab7fbf37-17bb-4e60-a543-634571a0fd73",
"role": "_none_"
}
],
"childList": "LOADCONTAINER",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "330aa09d-60fb-40d7-a190-64264b3d44b7"
},
{
"name": "IDCO_DESIGNATION",
"value": "Load container 1"
},
{
"name": "IDLC_LOAD_DISPLACEMENT_COMBINATION",
"value": "LOAD_MOMENT"
},
{
"name": "IDLC_TYPE_OF_MOVEMENT",
"value": "LB_ROTATING"
},
{
"name": "IDLC_NUMBER_OF_ARRAY_ELEMENTS",
"value": 20
}
],
"position": 1,
"order": 1,
"support_vector": {
"x": -201.3,
"y": 0,
"z": -229.8
},
"u_axis_vector": {
"x": 1,
"y": 0,
"z": 0
},
"w_axis_vector": {
"x": 0,
"y": 0,
"z": 1
},
"shaftSystemId": "ab7fbf37-17bb-4e60-a543-634571a0fd73",
"role": "_none_"
},
{
"id": "3258d217-e6e4-4a5c-8677-ae1fca26f21e",
"type": "IDO_RACEWAY",
"children": [],
"childList": "RACEWAY",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "3258d217-e6e4-4a5c-8677-ae1fca26f21e"
},
{
"name": "IDCO_DESIGNATION",
"value": "Raceway 1"
},
{
"name": "IDRCW_UPPER_DEVIATION_RACEWAY_DIAMETER",
"value": 0
},
{
"name": "IDRCW_LOWER_DEVIATION_RACEWAY_DIAMETER",
"value": 0
},
{
"name": "IDRCW_PROFILE_OFFSET",
"value": 0
},
{
"name": "IDRCW_PROFILE_ANGLE",
"value": 0
},
{
"name": "IDRCW_PROFILE_CURVATURE_RADIUS",
"value": 0
},
{
"name": "IDRCW_PROFILE_CENTER_POINT_OFFSET",
"value": 0
},
{
"name": "IDRCW_PROFILE_NUMBER_OF_WAVES",
"value": 0
},
{
"name": "IDRCW_PROFILE_AMPLITUDE",
"value": 0
},
{
"name": "IDRCW_PROFILE_POSITION_OF_FIRST_WAVE",
"value": 0
},
Bug
First of all, rename the loop's value variable: you have a value variable as the function argument and another variable with the same name when iterating over the dictionary:
for key, value in search_dict.iteritems(): # <-- RENAME value TO SOMETHING ELSE LIKE val
Otherwise you will have bugs, because the loop variable shadows the argument and the new value you want to insert is lost. If you iterate with for key, val in ... instead, you can still use the outer value variable inside the loop.
Adding The Value Pair
It seems id is a key inside your search_dict, but reading your JSON file your search_dict may have several nested lists like properties and/or children, so it depends on where you want to add the new pair.
If you want to add it to the same dictionary where your id is:
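if key == "id":
    if val == IDCO:  # val is the renamed loop variable
        print("FOUND IDCO IN JSON: " + val + "\n")
        # note: adding a new key while iterating over search_dict can raise
        # "dictionary changed size during iteration"; loop over a copy such as
        # list(search_dict.items()) to avoid that
        search_dict[name] = value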
Result:
{
"id": "3258d217-e6e4-4a5c-8677-ae1fca26f21e",
"type": "IDO_RACEWAY",
"children": [],
"childList": "RACEWAY",
"<new name>": "<new value>",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "3258d217-e6e4-4a5c-8677-ae1fca26f21e"
},
If you want to add it to the children or properties list inside the dictionary where id is:
if key == "id":
    if val == IDCO:  # val is the renamed loop variable
        print("FOUND IDCO IN JSON: " + val + "\n")
        if "properties" in search_dict:  # you can swap "properties" to "children", depends on your use case
            search_dict["properties"].append({"name": name, "value": value})  # a new dictionary with 'name' and 'value' keys
Result:
{
"id": "3258d217-e6e4-4a5c-8677-ae1fca26f21e",
"type": "IDO_RACEWAY",
"children": [],
"childList": "RACEWAY",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "3258d217-e6e4-4a5c-8677-ae1fca26f21e"
},
{
"name": "<new name>",
"value": "<new value>"
},
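Putting the two points together, a minimal Python 3 sketch of the whole recursive update might look like the following. It is an assumption-laden illustration rather than the asker's final code: `add_property` is a hypothetical name, the sample document is a tiny stand-in for the real JSON, and the property value `"12.5"` is a placeholder. It appends to the `properties` list (the second variant above) and iterates over a snapshot of each dict's values, so modifying the dict during the walk is safe.

```python
def add_property(node, target_id, name, value):
    """Append {"name": name, "value": value} to the "properties" list of every
    dict whose "id" equals target_id. Returns the number of dicts updated."""
    updated = 0
    if isinstance(node, dict):
        if node.get("id") == target_id:
            node.setdefault("properties", []).append({"name": name, "value": value})
            updated += 1
        # Walk a snapshot of the values so the dict can be modified safely.
        for child in list(node.values()):
            updated += add_property(child, target_id, name, value)
    elif isinstance(node, list):
        for item in node:
            updated += add_property(item, target_id, name, value)
    return updated


# Tiny stand-in for the real document; ids are shortened placeholders.
doc = {
    "id": "model",
    "rootObject": {
        "id": "gear",
        "children": [
            {
                "id": "bearing",
                "properties": [{"name": "IDCO_DESIGNATION", "value": "Bearing 1"}],
                "children": [],
            }
        ],
    },
}

count = add_property(doc, "bearing", "IDSU_FX[0]", "12.5")  # placeholder value
print(count)
print(doc["rootObject"]["children"][0]["properties"])
```

The function keeps searching after a match, which mirrors the "append and then continue" behaviour asked for in the question.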

Python3: Converting a nested dict into a list of objects with "children" and "leaf"?

I'm currently trying to convert a nested dict into a list of objects with "children" and "leaf".
Here are my input dict and the output I'm trying to obtain:
Input:
{
"a": {
"aa": {}
},
"b": {
"c": {
"d": {
'label': 'yoshi'
}
},
"e": {},
"f": {}
}
}
I'm trying to obtain this:
[
{
"text": "a",
"children": [
{
"text": "aa",
"leaf": "true"
}
]
},
{
"text": "b",
"children": [
{
"text": "c",
"children": [
{
"text": "d",
"leaf": "true",
"label": "yoshi"
}
]
},
{
"text": "e",
"leaf": "true"
},
{
"text": "f",
"leaf": "true"
}
]
}
]
I've tried a few unflatten Python libraries on PyPI, but none seems able to output a list format like this.
I have commented the function where I felt it was necessary.
def convert(d):
    children = []
    # iterate over each child's name and its dict (the child's children)
    for child, childs_childs in d.items():
        # check that it is not a leaf node
        if childs_childs and \
                all(isinstance(v, dict) for k, v in childs_childs.items()):
            # recursively call ourselves to get the child's children
            children.append({'text': child,
                             'children': convert(childs_childs)})
        else:
            # if the child is a leaf, append it to children as necessary
            # the **-unpacking accommodates the 'label': 'yoshi' item
            children.append({'text': child,
                             'leaf': True,
                             **childs_childs})
    return children
which gives:
[
{
"text": "a",
"children": [
{
"text": "aa",
"leaf": true
}
]
},
{
"text": "b",
"children": [
{
"text": "c",
"children": [
{
"text": "d",
"leaf": true,
"label": "yoshi"
}
]
},
{
"text": "e",
"leaf": true
},
{
"text": "f",
"leaf": true
}
]
}
]
Here's a rough solution. Here I assumed that all labeled nodes are leaves that just have label information.
def make_objects(d):
    result = []
    for k, v in d.items():
        if v == {}:
            result.append({"text": k, "leaf": True})
        elif len(v) == 1 and "label" in v:
            result.append({"text": k, "leaf": True, "label": v.get("label")})
        else:
            result.append({"text": k, "children": make_objects(v)})
    return result
With your example input as d:
from pprint import pprint
pprint(make_objects(d))
prints
[{'children': [{'leaf': True, 'text': 'aa'}], 'text': 'a'},
{'children': [{'children': [{'label': 'yoshi', 'leaf': True, 'text': 'd'}],
'text': 'c'},
{'leaf': True, 'text': 'e'},
{'leaf': True, 'text': 'f'}],
'text': 'b'}]
Try this solution (data is your input dictionary):
def walk(text, d):
    result = {'text': text}
    # get all children
    children = [walk(k, v) for k, v in d.items() if k != 'label']
    if children:
        result['children'] = children
    else:
        result['leaf'] = True
    # add label if exists
    label = d.get('label')
    if label:
        result['label'] = label
    return result

[walk(k, v) for k, v in data.items()]
Output:
[{'text': 'a', 'children': [{'text': 'aa', 'leaf': True}]},
{'text': 'b',
'children': [{'text': 'c',
'children': [{'text': 'd', 'leaf': True, 'label': 'yoshi'}]},
{'text': 'e', 'leaf': True},
{'text': 'f', 'leaf': True}]}]

Merging two json files using python

I'm new to Python and want to merge two JSON files.
There should not be any duplicates:
if the value and name are the same, I add up the keys and keep a single record; otherwise, I keep the record as is.
File 1:
[ {
"key": 1,
"name": "test",
"value": "NY"
},
{
"key": 1,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "MA"
},
{
"key": 1,
"name": "test",
"value": "MA"
}
]
File 2:
[ {
"key": 1,
"name": "test",
"value": "NJ"
},
{
"key": 1,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "TX"
},
{
"key": 1,
"name": "test",
"value": "MA"
}
]
and the merged file output should be:
[
{
"key": 1,
"name": "test",
"value": "NY"
},
{
"key": 3,
"name": "test",
"value": "MA"
},
{
"key": 1,
"name": "test",
"value": "NJ"
},
{
"key": 2,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "TX"
}
]
The order of the records does not matter.
I have tried several approaches, such as merging the files and then iterating over them, or parsing both files separately, but I'm running into issues, being new to Python.
This should help.
# -*- coding: utf-8 -*-
f1 = [ {
"key": 1,
"value": "NY"
},
{
"key": 1,
"value": "CA"
},
{
"key": 1,
"value": "MA"
}
]
f2 = [ {
"key": 1,
"value": "NJ"
},
{
"key": 1,
"value": "CA"
},
{
"key": 1,
"value": "TX"
}
]
check = [i["value"] for i in f1]  # check list to see if the value already exists in f1
for i in f2:
    if i['value'] not in check:
        f1.append(i)
print(f1)
Output:
[{'value': 'NY', 'key': 1}, {'value': 'CA', 'key': 1}, {'value': 'MA', 'key': 1}, {'value': 'NJ', 'key': 1}, {'value': 'TX', 'key': 1}]
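The snippet above removes duplicates but does not add up the keys the way the question's expected output does. A possible sketch of that variant is shown below: records are identified by their (name, value) pair and the `key` fields of matching records are summed. The helper name `merge_records` is an assumption, and the lists are written inline instead of being read with `json.load` so that the example is self-contained.

```python
def merge_records(*record_lists):
    """Merge lists of records, summing "key" for records sharing name and value."""
    merged = {}
    for records in record_lists:
        for rec in records:
            ident = (rec["name"], rec["value"])
            if ident in merged:
                merged[ident]["key"] += rec["key"]
            else:
                merged[ident] = dict(rec)  # copy so the inputs stay unchanged
    return list(merged.values())


# Contents of File 1 and File 2 from the question.
file1 = [
    {"key": 1, "name": "test", "value": "NY"},
    {"key": 1, "name": "test", "value": "CA"},
    {"key": 1, "name": "test", "value": "MA"},
    {"key": 1, "name": "test", "value": "MA"},
]
file2 = [
    {"key": 1, "name": "test", "value": "NJ"},
    {"key": 1, "name": "test", "value": "CA"},
    {"key": 1, "name": "test", "value": "TX"},
    {"key": 1, "name": "test", "value": "MA"},
]

result = merge_records(file1, file2)
for rec in result:
    print(rec)
```

With real files, each list would come from `json.load` on an opened file, and the merged result could be written back with `json.dump`.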
