In the following JSON object (which I can rewrite if that makes things easier) I need to search for the sshPort where id="talend", envs.id="DEV" and sshServer="jardev0".
I've tried using Pandas, but it flattens the nested JSON objects, so I have to create three intermediate Pandas objects and search them three times.
I would expect an easier syntax, something like [x.id=="talend" && x[envs].id=="DEV" && x[envs].sshServer=="jardev0"].sshPort
[
{
"id": "talend",
"envs":
[
{"id":"DEV",
"lServeurs" :
[
{"sshServer":"jardev0", "sshPort":"20022","sshUser":"talend"},
{"sshServer":"jardev01","sshPort":"20022","sshUser":"talend"}
]
},
{"id": "PROD",
"lServeurs" :
[
{"sshServer": "jardprd01", "sshPort": "20023","sshUser":"talend"}
]
}
]
},
{
"id": "eprel",
"envs":
[
{"id": "DEV",
"lServeurs" :
[
{"sshServer": "jardev0", "sshPort": "20024", "sshUser":"eprel"},
{"sshServer": "jardev01", "sshPort": "20025", "sshUser":"eprel"}
]
},
{"id": "PROD",
"lServeurs" :
[
{"sshServer": "jardprd01", "sshPort": "20026","sshUser":"eprel"}
]
}
]
}
]
This should return "20022".
LINQ-style filtering syntax is, as far as I know, not available in Python unless you go for a list comprehension.
But you can write a function like this:
import json

def get_ssh_port(id, env_id, ssh_server):
    # Parse the JSON string (input_json is defined in the usage example below)
    # and collect every sshPort matching the requested application id,
    # environment id and server name.
    ssh_ports = []
    input_dict = json.loads(input_json)
    output_dict = [x for x in input_dict if x['id'] == id]
    for x in output_dict:
        for i in x['envs']:
            if i['id'] == env_id:
                for j in i['lServeurs']:
                    if j['sshServer'] == ssh_server:
                        ssh_ports.append(j['sshPort'])
    return ssh_ports
Usage:
input_json = """
[
{
"id": "talend",
"envs":
[
{"id":"DEV",
"lServeurs" :
[
{"sshServer":"jardev0", "sshPort":"20022","sshUser":"talend"},
{"sshServer":"jardev01","sshPort":"20022","sshUser":"talend"}
]
},
{"id": "PROD",
"lServeurs" :
[
{"sshServer": "jardprd01", "sshPort": "20023","sshUser":"talend"}
]
}
]
},
{
"id": "eprel",
"envs":
[
{"id": "DEV",
"lServeurs" :
[
{"sshServer": "jardev0", "sshPort": "20024", "sshUser":"eprel"},
{"sshServer": "jardev01", "sshPort": "20025", "sshUser":"eprel"}
]
},
{"id": "PROD",
"lServeurs" :
[
{"sshServer": "jardprd01", "sshPort": "20026","sshUser":"eprel"}
]
}
]
}
]
"""
get_ssh_port(id='talend', env_id='DEV', ssh_server='jardev0')  # returns ['20022']
Using a nested list comprehension (three nested iterators are rather clumsy, though):
import json
jsn = """[
{
"id": "talend",
"envs":
[
{"id":"DEV",
"lServeurs" :
[
{"sshServer":"jardev0", "sshPort":"20022","sshUser":"talend"},
{"sshServer":"jardev01","sshPort":"20022","sshUser":"talend"}
]
},
{"id": "PROD",
"lServeurs" :
[
{"sshServer": "jardprd01", "sshPort": "20023","sshUser":"talend"}
]
}
]
},
{
"id": "eprel",
"envs":
[
{"id": "DEV",
"lServeurs" :
[
{"sshServer": "jardev0", "sshPort": "20024", "sshUser":"eprel"},
{"sshServer": "jardev01", "sshPort": "20025", "sshUser":"eprel"}
]
},
{"id": "PROD",
"lServeurs" :
[
{"sshServer": "jardprd01", "sshPort": "20026","sshUser":"eprel"}
]
}
]
}
]"""
x = json.loads(jsn)
targetPorts = [
    server['sshPort']
    for envsKV in x
    for env in envsKV['envs']
    for server in env['lServeurs']
    if envsKV['id'] == "talend" and env['id'] == "DEV" and server['sshServer'] == "jardev0"
]
print(targetPorts[0])  # 20022
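If you only need the first match, a generator expression with next() avoids building the whole list. This is just a sketch over the same parsed data x, with a None default when nothing matches:

port = next(
    (server['sshPort']
     for envsKV in x
     for env in envsKV['envs'] if envsKV['id'] == "talend"
     for server in env['lServeurs'] if env['id'] == "DEV"
     if server['sshServer'] == "jardev0"),
    None)
print(port)  # 20022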
Suppose I have the following JSON file. With data1["tenants"][1]['name'] I can select uniquename2. Is there a way to find that index ('1') by looping over the document?
{
"tenants": [{
"key": "identifier",
"name": "uniquename",
"image": "url",
"match": [
"identifier"
],
"tags": [
"tag1",
"tag2"
]
},
{
"key": "identifier",
"name": "uniquename2",
"image": "url",
"match": [
"identifier1",
"identifier2"
],
"tags": ["tag"]
}
]
}
In short: data1["tenants"][1]['name'] == "uniquename2" and data1["tenants"][0]['name'] == "uniquename".
How can I find out which index has which name? So if I have "uniquename2", what number/index corresponds to it?
You can iterate over the tenants to map each index to its name:
data = {
"tenants": [{
"key": "identifier",
"name": "uniquename",
"image": "url",
"match": [
"identifier"
],
"tags": [
"tag1",
"tag2"
]
},
{
"key": "identifier",
"name": "uniquename2",
"image": "url",
"match": [
"identifier1",
"identifier2"
],
"tags": ["tag"]
}
]
}
for index, tenant in enumerate(data['tenants']):
    print(index, tenant['name'])
OUTPUT
0 uniquename
1 uniquename2
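If you need to go the other way (from a name straight to its index), you can also build a lookup dict once. A minimal sketch over the same data, assuming the names are unique as in the example:

name_to_index = {tenant['name']: index for index, tenant in enumerate(data['tenants'])}
print(name_to_index['uniquename2'])  # 1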
Assuming you have already turned your JSON into a dictionary, this is how you can get the index of the first occurrence of a name in your list (this relies on the names actually being unique):
data = {
"tenants": [{
"key": "identifier",
"name": "uniquename",
"image": "url",
"match": [
"identifier"
],
"tags": [
"tag1",
"tag2"
]
},
{
"key": "identifier",
"name": "uniquename2",
"image": "url",
"match": [
"identifier1",
"identifier2"
],
"tags": ["tag"]
}
]
}
def index_of(tenants, tenant_name):
    try:
        return tenants.index(
            next(
                tenant for tenant in tenants
                if tenant["name"] == tenant_name
            )
        )
    except StopIteration:
        raise ValueError(
            f"tenants list does not have tenant by name {tenant_name}."
        )
index_of(data["tenants"], "uniquename") # 0
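For completeness, the other lookups against the sample data behave like this (the missing name is just an illustrative placeholder):

index_of(data["tenants"], "uniquename2")   # 1
index_of(data["tenants"], "no-such-name")  # raises ValueError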
I am running Python 3.6 without additional supporting modules. I would like to append/add new entries to the "polygon" section. Any suggestions on how to do this?
JSON Structure:
{
"messageId":775,
"value":{
"dataFrames":[
{
"content":{
"workZone":[
{
"item":{
"text":"Test"
}
},
{
"item":{
"itis":333
}
}
]
},
"duratonTime":24,
"frameType":"road Signage",
"msgId":{
"roadSignID":{
"mutcdCode":"warning",
"position":{
"elevation":634.0,
"lat":30.2,
"long":-80.5
},
"viewAngle":"111111"
}
},
"priority":1,
"regions":[
{
"anchor":{
"elevation":634.0,
"lat":34.3,
"long":-80.5
},
"description":{
"geometry":{
"direction":"0000",
"laneWidth":5.123,
"polygon":[
[
]
]
}
},
"directionality":"forward"
}
],
"sspLocationRights":1,
"sspMsgRights1":0,
"sspMsgRights2":0,
"sspTimRights":1,
"startTime":1599041581.5259035,
"startYear":2020
}
],
"msgCnt":0,
"packetID":775,
"source":"XX"
}
}
I tried building a JSON string for the polygon and adding it, but it gets inserted with quotes. So I am not sure whether the best way is to access the polygon section directly and add new entries there.
Expected JSON:
{
"messageId":775,
"value":{
"msgCnt":0,
"packetID":775,
"source":"C-V2X",
"dataFrames":[
{
"sspTimRights":1,
"frameType":"road Signage",
"msgId":{
"roadSignID":{
"position":{
"lat":-80.38466909433639,
"long":37.17942971412366,
"elevation":634.0
},
"viewAngle":"1000000000000000",
"mutcdCode":"warning"
}
},
"startYear":2020,
"startTime":1598992048.1489706,
"duratonTime":24,
"priority":1,
"sspLocationRights":1,
"regions":[
{
"anchor":{
"lat":-80.38466909433639,
"long":37.17942971412366,
"elevation":634.0
},
"directionality":"forward",
"description":{
"geometry":{
"direction":"1000000000000000",
"laneWidth":5.123,
"polygon":[
[
[
37.17942971412366,
-80.38466909433639
],
[
37.179543821887314,
-80.38487318094833
],
[
37.17967679727881,
-80.38510713731363
],
[
37.17995588265411,
-80.38560355518067
],
[
37.17998272884397,
-80.38557977915941
],
[
37.179703594552834,
-80.38508327461031
],
[
37.17957064376986,
-80.38484936187977
],
[
37.17945660930624,
-80.38464540586482
],
[
37.17942971412366,
-80.38466909433639
]
]
]
}
}
}
],
"sspMsgRights1":0,
"sspMsgRights2":0,
"content":{
"workZone":[
{
"item":{
"text":"Buffer Area"
}
},
{
"item":{
"itis":775
}
}
]
}
}
]
}
}
As @Gledi suggested, using a list is the proper way to insert the "polygon" information into the JSON structure.
Assuming data points to the dict in the question, the code below should work.
data['value']['dataFrames'][0]['regions'][0]['description']['geometry']['polygon'][0].append('something')
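To get unquoted coordinates in the output, append lists of floats (not a pre-formatted string) and then serialize with json.dump. A minimal sketch, using the first coordinate pair from the expected JSON and a hypothetical output filename:

import json

polygon = data['value']['dataFrames'][0]['regions'][0]['description']['geometry']['polygon'][0]
polygon.append([37.17942971412366, -80.38466909433639])  # a coordinate pair as numbers, not a string

with open('updated_message.json', 'w') as fp:  # hypothetical filename
    json.dump(data, fp, indent=4)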
My final output JSON file is in the following format:
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087095600"
],
"value": [
NaN
]
},
{
"timestamp": [
"1087182000"
],
"value": [
7091.62
]
},
I want to remove the whole object if the "value" is NaN.
Expected Output
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087182000"
],
"value": [
7091.62
]
},
I cannot remove the blank values from my csv file because of the format of the file.
I have tried this:
with open('Result.json', 'r') as j:
    json_dict = json.loads(j.read())

json_dict['data'] = [item for item in json_dict['data'] if
                     len([val for val in item['value'] if isnan(val)]) == 0]

print(json_dict)
Error - json_dict['data'] = [item for item in json_dict['data'] if len([val for val in item['value'] if isnan(val)]) == 0]
TypeError: list indices must be integers or slices, not str
The TypeError happens because the top level of your JSON is a list, not a dict, so json_dict['data'] fails; you have to walk into each element's 'resource' node. In case you have more than one value inside "value": [...], you can do it like this:
import json
from math import isnan
json_str = '''
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087095600"
],
"value": [
NaN
]
}
]
}
}
]
'''
json_dict = json.loads(json_str)

for typeObj in json_dict:
    resource_node = typeObj['resource']
    resource_node['data'] = [
        item for item in resource_node['data']
        if len([val for val in item['value'] if isnan(val)]) == 0
    ]

print(json_dict)
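If you then want to persist the cleaned structure, a minimal follow-up sketch (the output filename here is just an assumption):

with open('Result_clean.json', 'w') as fp:  # hypothetical filename
    json.dump(json_dict, fp, indent=4)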
For testing whether a value is NaN you can use the math.isnan() function:
data = '''{"data": [
{
"timestamp": [
"1058367600"
],
"value": [
9.65
]
},
{
"timestamp": [
"1058368500"
],
"value": [
NaN
]
},
{
"timestamp": [
"1058367600"
],
"value": [
4.75
]
}
]}'''
import json
from math import isnan
data = json.loads(data)
data['data'] = [i for i in data['data'] if not isnan(i['value'][0])]
print(json.dumps(data, indent=4))
Prints:
{
"data": [
{
"timestamp": [
"1058367600"
],
"value": [
9.65
]
},
{
"timestamp": [
"1058367600"
],
"value": [
4.75
]
}
]
}
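As a side note, bare NaN is not valid strict JSON, but Python's json module accepts it by default and parses it into float('nan'), which is why math.isnan() is the right check:

import json
from math import isnan

print(isnan(json.loads('NaN')))  # True with the default (non-strict) parser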
I'm trying to update a JSON file from XLS file data.
This is what I wish to do:
Extract namesFromJson
Extract namesFromXLS
for nameFromXLS in namesFromXLS:
    check if nameFromXLS is in namesFromJson:
        if true, then:
            extract the xls row (for this name)
            update the json file (for this name)
My problem is: when it's true, how can I update the JSON file?
Python code:
import xlrd
import unicodedata
import json

intents_file = open("C:\myJsonFile.json", "rU")
json_intents_data = json.load(intents_file)

book = xlrd.open_workbook("C:\myXLSFile.xlsx")
sheet = book.sheet_by_index(0)

row = ""
nameXlsValues = []
intentJsonNames = []

# Collect every intent name from the JSON
for entity in json_intents_data["intents"]:
    intentJsonName = entity["name"]
    intentJsonNames.append(intentJsonName)

# Walk the spreadsheet rows
for row_index in xrange(sheet.nrows):
    nameXlsValue = sheet.cell(rowx=row_index, colx=0).value
    nameXlsValues.append(nameXlsValue)
    if nameXlsValue in intentJsonNames:
        # here I have to extract the row values from the XLS file and update the JSON file
        for col_index in xrange(sheet.ncols):
            value = sheet.cell(rowx=row_index, colx=col_index).value
            if type(value) is unicode:
                value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
            row += "{0} - ".format(value)
My JSON file looks like this:
{
"intents": [
{
"id": "id1",
"name": "name1",
"details": {
"tags": [
"tag1"
],
"answers": [
{
"type": "switch",
"cases": [
{
"case": "case1",
"answers": [
{
"tips": [
""
],
"texts": [
"my text to be updated"
]
}
]
},
{
"case": "case2",
"answers": [
{
"tips": [
"tip2"
],
"texts": [
]
}
]
}
]
}
],
"template": "json",
"sentences": [
"sentence1",
" sentence2",
" sentence44"]
}
},
{
"id": "id2",
"name": "name3",
"details": {
"tags": [
"tag2"
],
"answers": [
{
"type": "switch",
"cases": [
{
"case": "case1",
"answers": [
{
"texts": [
""
]
}
]
},
{
"case": "case2",
"answers": [
{
"texts": [
""
]
}
]
}
]
}
],
"sentences": [
"sentence44",
"sentence2"
]
}
}
]
}
My XLS file has the intent names in its first column (screenshot not reproduced here).
When you load the JSON data from the file into memory it becomes a Python dict named json_intents_data.
When the condition 'if nameXlsValue in intentJsonNames' is True, you need to update that dict with the data read from the Excel sheet (it looks like you already know how to do that).
When the loop 'for row_index in xrange(sheet.nrows):' is done, your dict is updated and you can save it back as a JSON file:
import json
with open('updated_data.json', 'w') as fp:
    json.dump(json_intents_data, fp)
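To actually perform the update, you can index the intents by name and then assign values taken from the matching row. This is only a sketch: the column mapping (column 1 holding the new text) and the exact field to overwrite are assumptions, since the XLS layout is only shown as a screenshot.

# Hypothetical column layout: column 0 = intent name, column 1 = replacement text.
intents_by_name = {entity["name"]: entity for entity in json_intents_data["intents"]}

for row_index in xrange(sheet.nrows):
    name = sheet.cell(rowx=row_index, colx=0).value
    if name in intents_by_name:
        new_text = sheet.cell(rowx=row_index, colx=1).value
        intent = intents_by_name[name]
        # Replace the "texts" list of the first answer of the first case.
        intent["details"]["answers"][0]["cases"][0]["answers"][0]["texts"] = [new_text]

After that loop, dumping json_intents_data as shown above writes the updated file.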
Each key has a list of strings that I compare against another list. The dictionary is deeply nested, so I use a recursive function to get the data of each key.
But it takes a long time to get through the entire list. Is there a faster way?
This is the code:
import re
# Assumes a stemmer instance is available, e.g. porter = nltk.stem.PorterStemmer()

def get_industry(industry_data, industry_category):
    category_list = list()
    for category in industry_category:
        for key, item in category.items():
            # Build one regex that matches any stemmed keyword of this category
            r = re.compile('|'.join([r'\b%s\b' % porter.stem("".join(w.split())) for w in item['key_list']]), flags=re.I)
            words_found = r.findall(industry_data)
            if words_found:
                category_list.extend([key])
                # Recurse into the subcategories using only the words that matched
                new_list = get_industry(' '.join(words_found), item["Subcategories"])
                category_list.extend(new_list)
    return category_list
This is an example of a JSON file.
[
{
"Agriculture": {
"Subcategories": [
{
"Fruits ": {
"Subcategories": [
{
"Fresh Fruits": {
"Subcategories": [
{
"Apricots": {
"Subcategories": [],
"key_list": [
"Apricots"
]
}
},
{
"Tamarinds": {
"Subcategories": [],
"key_list": [
"Tamarinds"
]
}
}
],
"key_list": [
"loganberries",
"medlars"
]
}
}
],
"key_list": [
"lemons",
"tangelos"
]
}
},
{
"Vegetables ": {
"Subcategories": [
{
"Beetroot": {
"Subcategories": [],
"key_list": [
"Beetroot"
]
}
},
{
"Wasabi": {
"Subcategories": [],
"key_list": [
"Wasabi"
]
}
}
],
"key_list": [
"kohlrabies",
"wasabi "
]
}
}
],
"key_list": [
"wasabi",
"batatas"
]
}
}
]
This is an example of the list I want to compare it with.
["lemons","wasabi","washroom","machine","grapefruit","about","city"]
The answer should return this list:
["Agriculture","Vegetables","Wasabi"]
Comparing the list against the categories and returning the matching category names takes about 3-5 seconds. I have heard that using Pandas would significantly increase the speed.
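One possible speed-up, offered only as a hedged sketch: the regular expression for a category depends only on its key_list, so it can be compiled once and cached instead of being rebuilt on every recursive call (this assumes the category keys are unique, as in the sample data above):

_pattern_cache = {}

def category_pattern(key, key_list):
    # Compile (and stem) each category's keyword pattern only once.
    if key not in _pattern_cache:
        stems = [r'\b%s\b' % porter.stem("".join(w.split())) for w in key_list]
        _pattern_cache[key] = re.compile('|'.join(stems), flags=re.I)
    return _pattern_cache[key]

Inside get_industry, r = category_pattern(key, item['key_list']) would then replace the re.compile(...) line.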