I'm trying to update a JSON file with data from an XLS file.
This is what I want to do:
Extract namesFromJson
Extract namesFromXLS
for nameFromXLS in namesFromXLS:
    check if nameFromXLS is in namesFromJson
    if true:
        extract the XLS row (for this name)
        update the JSON file (for this name)
My problem is: when the check is true, how can I update the JSON file?
Python code:
import xlrd
import unicodedata
import json
# Raw strings keep the backslashes in the Windows paths literal.
with open(r"C:\myJsonFile.json", "rU") as intents_file:
    json_intents_data = json.load(intents_file)
book = xlrd.open_workbook(r"C:\myXLSFile.xlsx")
sheet = book.sheet_by_index(0)
row = ""
nameXlsValues = []
intentJsonNames = []
for entity in json_intents_data["intents"]:
    intentJsonName = entity["name"]
    intentJsonNames.append(intentJsonName)
for row_index in xrange(sheet.nrows):
    nameXlsValue = sheet.cell(rowx=row_index, colx=0).value
    nameXlsValues.append(nameXlsValue)
    if nameXlsValue in intentJsonNames:
        # here, I have to extract row values from xlsFile and update jsonFile
        for col_index in xrange(sheet.ncols):
            value = sheet.cell(rowx=row_index, colx=col_index).value
            if type(value) is unicode:
                value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
            row += "{0} - ".format(value)
My JSON file looks like this:
{
"intents": [
{
"id": "id1",
"name": "name1",
"details": {
"tags": [
"tag1"
],
"answers": [
{
"type": "switch",
"cases": [
{
"case": "case1",
"answers": [
{
"tips": [
""
],
"texts": [
"my text to be updated"
]
}
]
},
{
"case": "case2",
"answers": [
{
"tips": [
"tip2"
],
"texts": [
]
}
]
}
]
}
],
"template": "json",
"sentences": [
"sentence1",
" sentence2",
" sentence44"]
}
},
{
"id": "id2",
"name": "name3",
"details": {
"tags": [
"tag2"
],
"answers": [
{
"type": "switch",
"cases": [
{
"case": "case1",
"answers": [
{
"texts": [
""
]
}
]
},
{
"case": "case2",
"answers": [
{
"texts": [
""
]
}
]
}
]
}
],
"sentences": [
"sentence44",
"sentence2"
]
}
}
]
}
My XLS file looks like this:
[image: screenshot of the XLS sheet, not reproduced here]
When you load the JSON data from the file into memory, it becomes a Python dict named 'json_intents_data'.
When the condition 'if nameXlsValue in intentJsonNames' is True, update that dict with the data you read from the Excel file. (It looks like you already know how to do that.)
When the loop 'for row_index in xrange(sheet.nrows):' finishes, the dict is up to date and you can save it as a JSON file:
import json
with open('updated_data.json', 'w') as fp:
    json.dump(json_intents_data, fp)
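The update step itself might look like the minimal sketch below. It assumes the new text goes into the first case of the first answer, which the question does not actually specify. The sample data and the `xls_rows` list are hypothetical stand-ins for the real JSON file and the rows read with xlrd.

```python
import json

# Hypothetical sample shaped like the question's JSON (trimmed).
json_intents_data = {
    "intents": [
        {"name": "name1",
         "details": {"answers": [{"cases": [
             {"case": "case1", "answers": [{"texts": ["my text to be updated"]}]}
         ]}]}},
        {"name": "name3",
         "details": {"answers": [{"cases": [
             {"case": "case1", "answers": [{"texts": [""]}]}
         ]}]}},
    ]
}

# Hypothetical rows as they might come out of the sheet: (name, new text).
xls_rows = [("name1", "text from XLS"), ("unknown", "ignored")]

# Index the intents by name so each row is matched with one dict lookup.
intents_by_name = {intent["name"]: intent for intent in json_intents_data["intents"]}

for name, new_text in xls_rows:
    intent = intents_by_name.get(name)
    if intent is None:
        continue  # this name from the sheet is not present in the JSON
    # Assumption: the target is the first case of the first answer.
    first_case = intent["details"]["answers"][0]["cases"][0]
    first_case["answers"][0]["texts"] = [new_text]

updated_json = json.dumps(json_intents_data, indent=4)
print(updated_json)
```

Because `intents_by_name` holds references to the same dicts as `json_intents_data`, changing an intent through the lookup also changes the structure that is later dumped to disk.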
I was wondering how to do the following:
1. Search for strings in a JSON file and extract their nested components.
Given:
"type": "blah",
"animals": [
{
"type": "dog1",
"name": "oscar",
}
},
{
"type": "dog2",
"name": "John",
}
},
{
"type": "cat1",
"name": "Fred",
"Colors": [
"Red"
],
"Contact_info": [
{
"Owner": "Jill",
"Owner_number": "123"
}
],
},
{
"type": "cat3",
"name": "Freddy",
"Colors": [
"Blue"
],
"Contact_info": [
{
"Owner": "Ann",
"Owner_number": "1323"
}
],
From this JSON, I would like to extract all of the animals whose type contains cat, such as cat1 and cat3, along with all of the information within each block. For example, if I search for cat it should return:
{
"type": "cat1",
"name": "Fred",
"Colors": [
"Red"
],
"Contact_info": [
{
"Owner": "Jill",
"Owner_number": "123"
}
],
},
{
"type": "cat3",
"name": "Freddy",
"Colors": [
"Blue"
],
"Contact_info": [
{
"Owner": "Ann",
"Owner_number": "1323"
}
],
Not necessarily in that format, just all of the information for entries whose type contains cat. I'm trying to search for objects in a JSON file and extract fields from the matches, including anything nested inside them.
Here is my code so far:
import json
f = open('data.json')
# returns JSON object as
# a dictionary
data = json.load(f)
# Iterating through the json
# list
for i in data:
    if i['type'] == 'cat':
        print(i['name'])
        print(i['colors'])
        break
# Closing file
f.close()
To begin with, I recommend opening the file in a with statement. It creates a runtime context in which a group of statements runs under the control of a context manager.
This is more readable, and you can skip closing the file because the context manager does it for you.
Now to your problem.
Suppose your file is called animals.json:
# Import json library to work with json files
import json
# Use context manager
with open("animals.json", "rb") as f:
    # Load animals list from json file
    animals = json.load(f)["animals"]
# Create a list of dictionaries if animal type contains "cat"
cats = [animal for animal in animals if "cat" in animal.get("type", "")]
# Write data to cats.json, again through a context manager
with open("cats.json", "w", encoding="utf-8") as out:
    json.dump(cats, out, indent=4, sort_keys=False, ensure_ascii=False)
This code outputs the formatted cats.json file with all necessary elements:
[
{
"type": "cat1",
"name": "Fred",
"Colors": [
"Red"
],
"Contact_info": [
{
"Owner": "Jill",
"Owner_number": "123"
}
]
},
{
"type": "cat3",
"name": "Freddy",
"Colors": [
"Blue"
],
"Contact_info": [
{
"Owner": "Ann",
"Owner_number": "1323"
}
]
}
]
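If the search term changes often, the filter can be wrapped in a small helper. The sketch below is one possible shape, not part of the original answer. The name `filter_by_type` is hypothetical, and it matches on a case-insensitive prefix rather than a substring, which is an assumption about what "type cat" should mean.

```python
def filter_by_type(animals, prefix):
    """Return the animal dicts whose "type" starts with prefix (case-insensitive)."""
    prefix = prefix.lower()
    return [
        animal for animal in animals
        if str(animal.get("type", "")).lower().startswith(prefix)
    ]

# Trimmed sample based on the question's data.
animals = [
    {"type": "dog1", "name": "oscar"},
    {"type": "cat1", "name": "Fred", "Colors": ["Red"]},
    {"type": "cat3", "name": "Freddy", "Colors": ["Blue"]},
    {"name": "no type at all"},
]

cats = filter_by_type(animals, "cat")
print([cat["name"] for cat in cats])
```

A prefix match avoids false positives such as a type named "bobcat_toy", which a plain substring test would also return. Use `in` instead of `startswith` if substring matching is what you want.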
I'm learning how to do some cartography with Python, but first I would like to convert my JSON file to GeoJSON dynamically with Python. This is what my JSON looks like:
[
{
"shape_name": "unit_shape",
"source_name": "7724C_BUSSY_SAINT_GEORGES_Bussycomore",
"building_id": "7724C",
"chaud_froid": "Chaud",
"geojson": {
"type": "LineString",
"properties": {
"densite_cc": null
},
"coordinates": [
[
2.726142,
48.834359
],
[
2.726149,
48.834367
],
[
2.726202,
48.834422
],
[
2.726262,
48.834429
],
[
2.726307,
48.834434
],
[
2.726316,
48.834435
]
]
},
"to_drop": null,
"id_enquete": null,
"shape_nom": null
},
...(many other features similar to the previous one)]
Can someone please help me?
import json
with open("./data/data.json", "r", encoding="utf-8") as f:
    input_file = json.load(f)
# Wrap every record in a GeoJSON Feature; the whole record is kept as properties.
geojs = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": d["geojson"]["coordinates"],
            },
            "properties": d,
        } for d in input_file
    ]
}
with open("./data/geodata.json", "w", encoding="utf-8") as output_file:
    json.dump(geojs, output_file)
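The answer above copies the whole record, including the nested "geojson" object, into "properties". A possible variant, sketched below, takes the geometry type from each record and keeps the geometry out of the properties. The helper name `to_feature` is hypothetical, and merging the nested "properties" into the feature's properties is an assumption about the desired output.

```python
import json

def to_feature(record):
    """Build one GeoJSON Feature from a record shaped like the question's sample."""
    geometry = record["geojson"]
    # Keep every top-level field except the nested geometry as properties.
    properties = {key: value for key, value in record.items() if key != "geojson"}
    # Assumption: fields nested under geojson["properties"] belong to the feature too.
    properties.update(geometry.get("properties") or {})
    return {
        "type": "Feature",
        "geometry": {
            "type": geometry["type"],
            "coordinates": geometry["coordinates"],
        },
        "properties": properties,
    }

# Trimmed sample based on the question's data.
records = [{
    "shape_name": "unit_shape",
    "building_id": "7724C",
    "geojson": {
        "type": "LineString",
        "properties": {"densite_cc": None},
        "coordinates": [[2.726142, 48.834359], [2.726149, 48.834367]],
    },
}]

collection = {
    "type": "FeatureCollection",
    "features": [to_feature(record) for record in records],
}
print(json.dumps(collection, indent=2))
```

Reading the type from the record means the same helper keeps working if some records are not LineStrings.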
This is the first time I'm working with JSON, and I'm trying to pull the url out of the JSON below.
{
"name": "The_New11d112a_Company_Name",
"sections": [
{
"name": "Products",
"payload": [
{
"id": 1,
"name": "TERi Geriatric Patient Skills Trainer,
"type": "string"
}
]
},
{
"name": "Contact Info",
"payload": [
{
"id": 1,
"name": "contacts",
"url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
"contacts": [
{
"name": "User",
"email": "Company Email",
"phone": "Company PhoneNumber"
}
],
"type": "contact"
}
]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"
}
I have been able to access description and _id via
data = json.loads(line)
if 'xpath' in data:
    xpath = data["_id"]
    description = data["sections"][0]["payload"][0]["description"]
However, I can't figure out how to access url. Another issue is that sections could contain other items, which makes indexing into Contact Info by position a non-starter.
Hope this helps:
import json
with open("test.json", "r") as f:
    json_out = json.load(f)
for i in json_out["sections"]:
    for j in i["payload"]:
        for key in j:
            if "url" in key:
                print(key, '->', j[key])
I think your JSON is malformed; it should look like this:
{
"name": "The_New11d112a_Company_Name",
"sections": [
{
"name": "Products",
"payload": [
{
"id": 1,
"name": "TERi Geriatric Patient Skills Trainer",
"type": "string"
}
]
},
{
"name": "Contact Info",
"payload": [
{
"id": 1,
"name": "contacts",
"url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
"contacts": [
{
"name": "User",
"email": "Company Email",
"phone": "Company PhoneNumber"
}
],
"type": "contact"
}
]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"
}
You can check it on http://json.parser.online.fr/.
And if you want to get the value of the url:
import json
with open('yourJSONfile.json') as f:
    j = json.load(f)
print(j['sections'][1]['payload'][0]['url'])
I think it's worth writing a short function that collects the url(s). You can then decide whether to use the first url in the returned list, or to skip processing if your data contains no url.
The function could look like this:
def extract_urls(data):
    payloads = []
    for section in data['sections']:
        payloads += section.get('payload') or []
    urls = [x['url'] for x in payloads if 'url' in x]
    return urls
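A usage sketch for the function above, repeated here so the snippet runs on its own. The sample document and the example.com address are hypothetical, and treating "no url" as `None` is just one way to signal that a record should be skipped.

```python
def extract_urls(data):
    # Same logic as the function above.
    payloads = []
    for section in data['sections']:
        payloads += section.get('payload') or []
    return [x['url'] for x in payloads if 'url' in x]

# Hypothetical, trimmed-down version of the question's document.
data = {
    "sections": [
        {"name": "Products", "payload": [{"id": 1, "type": "string"}]},
        {"name": "Contact Info", "payload": [{"id": 1, "url": "https://example.com/item.html"}]},
        {"name": "Section without payload"},
    ]
}

urls = extract_urls(data)
first_url = urls[0] if urls else None  # None means "skip this record"
print(first_url)
```

Because the function scans every section, it does not matter where "Contact Info" sits in the list or whether other sections are added later.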
This should print out the URL:
import json
# open json file to read
with open('test.json','r') as f:
    # load json, parameter as json text (file contents)
    data = json.loads(f.read())
# after observing format of JSON data, the location of the URL key
# is determined and the data variable is manipulated to extract the value
print(data['sections'][1]['payload'][0]['url'])
The exact location of the 'url' key:
The value of the key 'sections' is an array; take the element at position 1.
That element is a dict whose key 'payload' contains another array.
The element at position 0 of that array is a dict with the key 'url'.
While testing my solution, I noticed that the JSON provided is flawed. After fixing its three flaws, I ended up with this:
{
"name": "The_New11d112a_Company_Name",
"sections": [
{
"name": "Products",
"payload": [
{
"id": 1,
"name": "TERi Geriatric Patient Skills Trainer",
"type": "string"
}
]
},
{
"name": "Contact Info",
"payload": [
{
"id": 1,
"name": "contacts",
"url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
"contacts": [
{
"name": "User",
"email": "Company Email",
"phone": "Company PhoneNumber"
}
],
"type": "contact"
}
]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"}
After using the JSON provided by Vincent55, I wrote working code with exception handling, under certain assumptions.
Working code:
## Assuming that the target data is always under sections[i].payload
from json import loads
with open("data.json") as f:
    data = loads(f.read())["sections"]
for x in data:
    try:
        # With assumption that there is only one payload
        if x["payload"][0]["url"]:
            print(x["payload"][0]["url"])
    except KeyError:
        pass
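Since the question notes that the position of "Contact Info" inside sections is not stable, another option is to select the section by its "name" rather than by index. This is a minimal sketch under that assumption. The helper name `url_for_section` and the sample data are hypothetical.

```python
def url_for_section(data, section_name):
    """Return the first url in the payload of the named section, or None."""
    for section in data.get("sections", []):
        if section.get("name") != section_name:
            continue
        for item in section.get("payload") or []:
            if "url" in item:
                return item["url"]
    return None

# Hypothetical, trimmed-down version of the question's document.
data = {
    "sections": [
        {"name": "Products", "payload": [{"id": 1, "type": "string"}]},
        {"name": "Contact Info", "payload": [{"id": 1, "url": "https://example.com/item.html"}]},
    ]
}

print(url_for_section(data, "Contact Info"))
print(url_for_section(data, "Missing section"))
```

Returning `None` instead of raising keeps the caller's loop simple when some documents have no such section.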
Suppose I have the following json file. With data1["tenants"][1]['name'] I can select uniquename2. Is there a way to collect the '1' number by looping over the document?
{
"tenants": [{
"key": "identifier",
"name": "uniquename",
"image": "url",
"match": [
"identifier"
],
"tags": [
"tag1",
"tag2"
]
},
{
"key": "identifier",
"name": "uniquename2",
"image": "url",
"match": [
"identifier1",
"identifier2"
],
"tags": ["tag"]
}
]
}
In short: data1["tenants"][1]['name'] = uniquename2 and data1["tenants"][0]['name'] = uniquename.
How can I find out which index has which name? For example, if I have uniquename2, which index corresponds to it?
You can iterate over the tenants to map each index to its name:
data = {
"tenants": [{
"key": "identifier",
"name": "uniquename",
"image": "url",
"match": [
"identifier"
],
"tags": [
"tag1",
"tag2"
]
},
{
"key": "identifier",
"name": "uniquename2",
"image": "url",
"match": [
"identifier1",
"identifier2"
],
"tags": ["tag"]
}
]
}
for index, tenant in enumerate(data['tenants']):
    print(index, tenant['name'])
OUTPUT
0 uniquename
1 uniquename2
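If you need to go from a name back to its index more than once, the same enumerate loop can build a lookup table first. This is a small sketch with a trimmed sample list; the variable name `index_by_name` is only illustrative.

```python
# Trimmed sample: only the "name" field matters for the lookup.
tenants = [
    {"key": "identifier", "name": "uniquename"},
    {"key": "identifier", "name": "uniquename2"},
]

# Map each tenant name to its position in the list.
index_by_name = {tenant["name"]: index for index, tenant in enumerate(tenants)}

print(index_by_name["uniquename2"])  # 1
```

If two tenants share a name, the later index overwrites the earlier one, so this relies on the names being unique.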
Assuming you have already turned your JSON into a dictionary, this is how you can get the index of the first occurrence of a name in your list (this relies on the names actually being unique):
data = {
"tenants": [{
"key": "identifier",
"name": "uniquename",
"image": "url",
"match": [
"identifier"
],
"tags": [
"tag1",
"tag2"
]
},
{
"key": "identifier",
"name": "uniquename2",
"image": "url",
"match": [
"identifier1",
"identifier2"
],
"tags": ["tag"]
}
]
}
def index_of(tenants, tenant_name):
    try:
        return tenants.index(
            next(
                tenant for tenant in tenants
                if tenant["name"] == tenant_name
            )
        )
    except StopIteration:
        raise ValueError(
            f"tenants list does not have tenant by name {tenant_name}."
        )
index_of(data["tenants"], "uniquename") # 0
My final output JSON file is in the following format:
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087095600"
],
"value": [
NaN
]
},
{
"timestamp": [
"1087182000"
],
"value": [
7091.62
]
},
I want to remove the whole object if the "value" is NaN.
Expected Output
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087182000"
],
"value": [
7091.62
]
},
I cannot remove the blank values from my csv file because of the format of the file.
I have tried this:
with open('Result.json' , 'r') as j:
    json_dict = json.loads(j.read())
json_dict['data'] = [item for item in json_dict['data'] if
                     len([val for val in item['value'] if isnan(val)]) == 0]
print(json_dict)
Error - json_dict['data'] = [item for item in json_dict['data'] if len([val for val in item['value'] if isnan(val)]) == 0]
TypeError: list indices must be integers or slices, not str
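The TypeError occurs because the top level of your JSON is a list, so it cannot be indexed with 'data'; loop over the list and go through each element's 'resource' first.
In case you have more than one value in "value": [...], this filter drops an entry if any of its values is NaN: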
import json
from math import isnan
json_str = '''
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087095600"
],
"value": [
NaN
]
}
]
}
}
]
'''
json_dict = json.loads(json_str)
for typeObj in json_dict:
    resource_node = typeObj['resource']
    resource_node['data'] = [
        item for item in resource_node['data']
        if len([val for val in item['value'] if isnan(val)]) == 0
    ]
print(json_dict)
To test whether a value is NaN, you can use the math.isnan() function:
data = '''{"data": [
{
"timestamp": [
"1058367600"
],
"value": [
9.65
]
},
{
"timestamp": [
"1058368500"
],
"value": [
NaN
]
},
{
"timestamp": [
"1058367600"
],
"value": [
4.75
]
}
]}'''
import json
from math import isnan
data = json.loads(data)
data['data'] = [i for i in data['data'] if not isnan(i['value'][0])]
print(json.dumps(data, indent=4))
Prints:
{
"data": [
{
"timestamp": [
"1058367600"
],
"value": [
9.65
]
},
{
"timestamp": [
"1058367600"
],
"value": [
4.75
]
}
]
}
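Putting the two answers together for the nested structure from the question, a sketch could look like the one below. The helper name `drop_nan_entries` is hypothetical, and the sample is trimmed. The `isinstance` check is an added precaution so that non-numeric values in "value" do not make `isnan` raise.

```python
import json
from math import isnan

def drop_nan_entries(entries):
    """Keep only the entries whose "value" list contains no NaN."""
    return [
        entry for entry in entries
        if not any(isinstance(v, float) and isnan(v) for v in entry["value"])
    ]

# Trimmed sample shaped like the question's file (Python's json module accepts NaN on input).
raw = '''
[{"Type": "UPDATE",
  "resource": {"name": "Total Flow",
               "data": [{"timestamp": ["1087009200"], "value": [6945.68]},
                        {"timestamp": ["1087095600"], "value": [NaN]},
                        {"timestamp": ["1087182000"], "value": [7091.62]}]}}]
'''
document = json.loads(raw)
for obj in document:
    obj["resource"]["data"] = drop_nan_entries(obj["resource"]["data"])

# allow_nan=False makes json.dumps raise ValueError if a NaN slipped through.
cleaned = json.dumps(document, indent=4, allow_nan=False)
print(cleaned)
```

Serializing with `allow_nan=False` doubles as a check: standard JSON has no NaN literal, so the call only succeeds once every NaN entry has been removed.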