Replacing words in JSON file using Python

smpl.json file:
[
{
"add":"dtlz",
"emp_details":[
[
"Shubham",
"ksing.shubh#gmail.com",
"intern"
],
[
"Gaurav",
"gaurav.singh#cobol.in",
"developer"
],
[
"Nikhil",
"nikhil#geeksforgeeks.org",
"Full Time"
]
]
}
]
Python file:
import json
with open('smpl.json', 'r') as file:
    json_data = json.load(file)
for item in json_data["emp_details"]:
    if item[''] in ['Shubham']:
        item[''] = 'Indra'
with open('zz_smpl.json', 'w') as file:
    json.dump(json_data, file, indent=4)
I'm having trouble with this code. Any help would be great.
Looking forward to your help. Thanks in advance!

First, you need to understand the list/array and map data structures and how JSON represents them. Seriously, you must understand those data structures in order to use JSON.
An empty array a1
a1 = []
Array with 3 integers
a2 = [1, 2, 3]
Indexing starts at 0, so to address the 2nd value you use index 1
a2[0] is the 1st value
a2[1] is the 2nd value
In Python, to slice a2 into its 2nd and 3rd values
a3 = a2[1:]
Maps/dicts are containers of key:value pairs.
An empty map (called a dict in Python)
d1 = {}
Maps with 2 pairs
d2 = { 'name' : 'Chandra Gupta Maurya' , 'age' : 2360 }
d3 = { 'street' : 'ashoka' , 'location' : 'windsor place' , 'city' : 'delhi' }
such that value of
d2['name'] is 'Chandra Gupta Maurya'
An array of two maps. When you do this in python (and javaScript)
ad1 = [ d2, d3 ]
you are equivalently doing this:
ad1 = [
{ 'name' : 'Chandra Gupta Maurya' , 'age' : 2360 } ,
{ 'street' : 'ashoka' , 'location' : 'windsor place' , 'city' : 'delhi' }
]
so that ad1[0] is
{ 'name' : 'Chandra Gupta Maurya' , 'age' : 2360 }
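As a small sketch of how these pieces combine (the variable names follow the examples above; the replacement value 'new delhi' is made up for illustration), indexes and keys can be chained to reach a nested value:

```python
# The same two maps as above, held in an array.
d2 = {'name': 'Chandra Gupta Maurya', 'age': 2360}
d3 = {'street': 'ashoka', 'location': 'windsor place', 'city': 'delhi'}
ad1 = [d2, d3]

# An integer index selects an element of the array; a key selects a value in a map.
first_name = ad1[0]['name']
city = ad1[1]['city']

# Assigning through the same chain replaces the stored value in place.
ad1[1]['city'] = 'new delhi'
print(first_name, city, ad1[1]['city'])
```

Note that ad1 holds references to d2 and d3, so the assignment also changes d3 itself.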
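In your file the root is an array, and "emp_details" is a key of the map at position 0 of that array:
json_data[0]['emp_details']
json_data[0]['emp_details'] is itself an array of arrays (not maps), so its elements are addressed by integer index, not by key.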
>>> json.dumps (json_data[0]["emp_details"] , indent=2)
produces
'[\n  [\n    "Shubham",\n    "ksing.shubh#gmail.com",\n    "intern"\n  ],\n  [\n    "Gaurav",\n    "gaurav.singh#cobol.in",\n    "developer"\n  ],\n  [\n    "Nikhil",\n    "nikhil#geeksforgeeks.org",\n    "Full Time"\n  ]\n]'
and
>>> print ( json.dumps (json_data[0]["emp_details"], indent=2) )
produces
[
  [
    "Shubham",
    "ksing.shubh#gmail.com",
    "intern"
  ],
  [
    "Gaurav",
    "gaurav.singh#cobol.in",
    "developer"
  ],
  [
    "Nikhil",
    "nikhil#geeksforgeeks.org",
    "Full Time"
  ]
]
Therefore,
>>> json_data[0]["emp_details"][1]
['Gaurav', 'gaurav.singh#cobol.in', 'developer']
Then you might wish to do the replacement
>>> json_data[0]["emp_details"][1][2] = 'the rain in maine falls plainly insane'
>>> json_data[0]["emp_details"][1][1] = "I'm sure the lure in jaipur pours with furore"
>>> print ( json.dumps (json_data, indent=2) )
produces
[
  {
    "add": "dtlz",
    "emp_details": [
      [
        "Shubham",
        "ksing.shubh#gmail.com",
        "intern"
      ],
      [
        "Gaurav",
        "I'm sure the lure in jaipur pours with furore",
        "the rain in maine falls plainly insane"
      ],
      [
        "Nikhil",
        "nikhil#geeksforgeeks.org",
        "Full Time"
      ]
    ]
  }
]

There are 2 problems with your code.
First, the root of the JSON is an array. Therefore you need to get the emp_details property of its first item:
for item in json_data[0]["emp_details"]:
Second, each item is itself a list, so you need to check the element at index zero:
if item[0] in ['Shubham']:
Here is the full working code:
import json
with open('smpl.json', 'r') as file:
    json_data = json.load(file)
for item in json_data[0]["emp_details"]:
    if item[0] in ['Shubham']:
        item[0] = 'Indra'
with open('zz_smpl.json', 'w') as file:
    json.dump(json_data, file, indent=4)
The working repl.it link: https://repl.it/#HarunYlmaz/python-json-write

Here's a more generic solution where the outermost JSON array could have multiple entries (dictionaries):
import json
with open('test.json', 'r') as file:
    json_data = json.load(file)
for item in json_data:
    for emp in item['emp_details']:
        if emp[0] in ['Shubham']:
            emp[0] = 'Indra'
with open('zz_smpl.json', 'w') as file:
    json.dump(json_data, file, indent=4)
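If the position of the word inside the structure is not known in advance, one possible approach is to walk the whole document recursively. The sketch below is only an illustration: replace_strings is a hypothetical helper, the sample is a shortened inline copy of smpl.json, and only values (not dictionary keys) are replaced.

```python
import json

def replace_strings(node, old, new):
    """Return a copy of node with every string value equal to old replaced by new."""
    if isinstance(node, dict):
        return {key: replace_strings(value, old, new) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_strings(value, old, new) for value in node]
    if isinstance(node, str) and node == old:
        return new
    return node

# Shortened inline stand-in for the contents of smpl.json.
sample = json.loads(
    '[{"add": "dtlz", "emp_details": [["Shubham", "intern"], ["Gaurav", "developer"]]}]'
)
updated = replace_strings(sample, 'Shubham', 'Indra')
print(json.dumps(updated, indent=4))
```

Because the helper builds new containers, the loaded data is left untouched and the result can be passed to json.dump as in the code above.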

Related

Python: Get all values of a specific key from json file

I'm getting the JSON data from a file:
{
    "students": [
        {
            "name" : "ben",
            "age" : 15
        },
        {
            "name" : "sam",
            "age" : 14
        }
    ]
}
Here's my initial code:
def get_names():
    students = open('students.json')
    data = json.load(students)
I want to get the values of all names
[ben,sam]
You need to extract the names from the students list:
data = {"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
names = [each_student['name'] for each_student in data['students']]
print(names) #['ben', 'sam']
Try using a list comprehension:
>>> [dct['name'] for dct in data['students']]
['ben', 'sam']
>>>
import json
with open('./students.json', 'r') as students_file:
    students_content = json.load(students_file)
print([student['name'] for student in students_content['students']]) # ['ben', 'sam']
JSON's load function from the docs:
Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object...
The JSON file in students.json will look like:
{
"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
The JSON load function can then be used to deserialize this JSON object in the file to a Python dictionary:
import json
# use with context manager to ensure the file closes properly
with open('students.json', 'rb') as students_fp:
    data = json.load(students_fp)
print(type(data)) # dict i.e. a Python dictionary
# list comprehension to take the name of each student
names = [student['name'] for student in data['students']]
Where names now contains the desired:
["ben", "sam"]
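If some records might lack a "name" key, the comprehension can filter them out instead of raising KeyError. A minimal sketch, assuming an inline string in place of students.json and a made-up third record without a name:

```python
import json

# Inline stand-in for students.json; the third record is a hypothetical entry without "name".
raw = '{"students": [{"name": "ben", "age": 15}, {"name": "sam", "age": 14}, {"age": 13}]}'
data = json.loads(raw)

# Keep only the records that actually carry a "name" key.
names = [student['name'] for student in data.get('students', []) if 'name' in student]
print(names)
```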

How can I even use the 'else' syntax in Python?

I am reading data from a JSON file to check the existence of some values.
In the JSON structure below, I try to find adomain from the data in bid and check if there is a cat value, which is not always present.
How do I fix it in the syntax below?
import pandas as pd
import json
path = 'C:/MyWorks/Python/Anal/data_sample.json'
records = [json.loads(line) for line in open(path, encoding='utf-8')]
adomain = [
    rec['win_res']['seatbid'][0]['bid'][0]['adomain']
    for rec in records
    if 'adomain' in rec
]
Here is a data sample:
[
{ "win_res": {
"id": "12345",
"seatbid": [
{
"bid": [
{
"id": "12345",
"impid": "1",
"price": 0.1,
"adm": "",
"adomain": [
"adomain.com"
],
"iurl": "url.com",
"cid": "11",
"crid": "11",
"cat": [
"IAB12345"
],
"w": 1,
"h": 1
}
],
"seat": "1"
}
]
}}
]
The adomain value is always present, but the cat value is sometimes missing.
So, when a cat value exists alongside adomain, I want to extract adomain and cat as above, but how should I handle the case where there is no cat value?
Your question is not clear but I think this is what you are looking for:
import json
path = 'C:/MyWorks/Python/Anal/data_sample.json'
with open(path, encoding='utf-8') as f:
    records = json.load(f)
adomain = [
    _['win_res']['seatbid'][0]['bid'][0]['adomain']
    for _ in records
    if _['win_res']['seatbid'][0]['bid'][0].get('adomain', None) and
    _['win_res']['seatbid'][0]['bid'][0].get('cat', None)
]
The code above will add the value of ['win_res']['seatbid'][0]['bid'][0]['adomain'] to the list adomain only if there is a ['win_res']['seatbid'][0]['bid'][0]['cat'] corresponding value.
The code will be a lot clearer if we just walk through a bids list. Something like this:
import json
path = 'C:/MyWorks/Python/Anal/data_sample.json'
with open(path, encoding='utf-8') as f:
    records = json.load(f)
bids = [_['win_res']['seatbid'][0]['bid'][0] for _ in records]
adomain = [
    _['adomain']
    for _ in bids
    if _.get('adomain', None) and _.get('cat', None)
]
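If the goal is instead to keep every record and use a fallback when cat is missing, rather than dropping the record, a conditional expression (x if condition else y) can sit in the output part of the comprehension. A minimal sketch, assuming two made-up bids shaped like the sample and None as the placeholder for a missing cat:

```python
# Two hypothetical bids shaped like the sample above; the second has no "cat".
bids = [
    {"adomain": ["adomain.com"], "cat": ["IAB12345"]},
    {"adomain": ["other.com"]},
]

# value_if_true if condition else value_if_false, evaluated once per bid.
pairs = [
    (bid["adomain"], bid["cat"] if "cat" in bid else None)
    for bid in bids
]
print(pairs)
```

An if placed after the for clause filters elements out; an if/else placed before it chooses the value for each element, which is the only place else is allowed in a comprehension.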

Edit JSON file while keeping it properly formatted (e.g. indentation)

I have a JSON file with the format below and I would like a method to easily edit the data in the two datasets.
Given the tables to insert (2 columns each) in a .txt or .xls file, how can I easily replace the two data tables [x,x]?
I tried to do it with the jsondecode and jsonencode functions in MATLAB, but when I write back to a .json file all the indentation and line breaks are lost. How (and with which software) can I do it while keeping the file properly formatted?
{
"Compounds" :
[ "frutafresca" ],
"Property 1" :
{
"Scheme" : "Test1" ,
"StdValue" : 0.01 ,
"Data":
[
[ 353.15 , 108320 ],
[ 503.15 , 5120000 ],
[ 513.15 , 6071400 ]
]
},
"Property 2" :
{
"Scheme" : "Test 1" ,
"StdValue" : 0.01 ,
"Data":
[
[ 273.15 , 806.25 ],
[ 283.15 , 797.92 ],
[ 293.15 , 789.39 ],
[ 453.15 , 598.39 ],
[ 463.15 , 578.21 ],
[ 473.15 , 556.79 ]
]
}
}
Is there a reason not to use the standard lib json module?
json module
From the docs:
json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
If indent is a non-negative integer or string, then JSON array
elements and object members will be pretty-printed with that indent
level. An indent level of 0, negative, or "" will only insert
newlines. None (the default) selects the most compact representation.
Using a positive integer indent indents that many spaces per level. If
indent is a string (such as "\t"), that string is used to indent each
level.
import json
data = None
with open('data.json', 'r') as _file:
    data = json.load(_file)
assert data is not None
## do your changes to data dict
with open('data.json', 'w') as _file:
    json.dump(data, _file, indent=2) ## indent output with 2 spaces per level
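As a rough sketch of the "do your changes" step for the question's case, the following replaces one "Data" table with rows read from whitespace-separated text. The inline strings stand in for the real .json and .txt files and are assumptions, as is the two-column, space-separated layout of the text file.

```python
import io
import json

# Inline stand-ins for the JSON file and a two-column text file (both hypothetical).
json_text = '{"Property 1": {"Scheme": "Test1", "StdValue": 0.01, "Data": [[353.15, 108320]]}}'
table_txt = io.StringIO("273.15 806.25\n283.15 797.92\n")

data = json.loads(json_text)

# Parse each non-empty line into a row of floats.
rows = [[float(x) for x in line.split()] for line in table_txt if line.strip()]
data["Property 1"]["Data"] = rows

# indent=2 pretty-prints the result; the original hand-made layout is not reproduced exactly.
output = json.dumps(data, indent=2)
print(output)
```

With indent set, the json module puts every array element on its own line, so the output is consistently indented but each [x, y] pair spans several lines rather than one.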

Correctly referencing a JSON in Python. Strings versus Integers and Nested items

Sample JSON file below
{
"destination_addresses" : [ "New York, NY, USA" ],
"origin_addresses" : [ "Washington, DC, USA" ],
"rows" : [
{
"elements" : [
{
"distance" : {
"text" : "225 mi",
"value" : 361715
},
"duration" : {
"text" : "3 hours 49 mins",
"value" : 13725
},
"status" : "OK"
}
]
}
],
"status" : "OK"
}
I'm looking to reference the text value for distance and duration. I've done research but I'm still not sure what I'm doing wrong...
I have a workaround using several lines of code, but I'm looking for a clean one-line solution.
Thanks for your help!
If you're using the regular JSON module:
import json
And you're opening your JSON like this:
json_data = open("my_json.json").read()
data = json.loads(json_data)
# Equivalent to:
data = json.load(open("my_json.json"))
# Notice json.load vs. json.loads
Then this should do what you want:
distance_text, duration_text = [data['rows'][0]['elements'][0][key]['text'] for key in ['distance', 'duration']]
Hope this is what you wanted!
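On the strings-versus-integers point: dictionaries (JSON objects) are indexed with string keys, while lists (JSON arrays) are indexed with integers, which is why the path alternates between the two. A small sketch with a hypothetical helper named dig, using a trimmed inline copy of the sample response:

```python
from functools import reduce

def dig(obj, *path):
    """Follow a sequence of string keys and integer indexes into nested dicts and lists."""
    return reduce(lambda node, step: node[step], path, obj)

# Trimmed copy of the sample response above.
data = {
    "rows": [{"elements": [{
        "distance": {"text": "225 mi", "value": 361715},
        "duration": {"text": "3 hours 49 mins", "value": 13725},
    }]}]
}

# Strings address dict keys, integers address list positions.
distance_text = dig(data, "rows", 0, "elements", 0, "distance", "text")
duration_text = dig(data, "rows", 0, "elements", 0, "duration", "text")
print(distance_text, duration_text)
```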

How to parse empty JSON property/element in Python

I am attempting to parse some JSON that I am receiving from a RESTful API, but I am having trouble accessing the data in Python because it appears that there is an empty property name.
A sample of the JSON returned:
{
"extractorData" : {
"url" : "RetreivedDataURL",
"resourceId" : "e38e1a7dd8f23dffbc77baf2d14ee500",
"data" : [ {
"group" : [ {
"CaseNumber" : [ {
"text" : "PO-1994-1350",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "03/11/1994"
} ],
"CaseDescription" : [ {
"text" : "Mary v. JONES"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY BETH (Plaintiff)"
} ]
}, {
"CaseNumber" : [ {
"text" : "NP-1998-2194",
"href" : "http://www.referenceURL.net"
}, {
"text" : "FD-1998-2310",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "08/13/1993"
}, {
"text" : "06/02/1998"
} ],
"CaseDescription" : [ {
"text" : "IN RE: NOTARY PUBLIC VS REDACTED"
}, {
"text" : "REDACTED"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY H (Plaintiff)"
}, {
"text" : "Lastname, MARY BETH (Defendant)"
} ]
} ]
} ]
And the Python code I am attempting to use
import requests
import json
FirstName = raw_input("Please Enter First name: ")
LastName = raw_input("Please Enter Last Name: ")
with requests.Session() as c:
    url = ('https://www.requestURL.net/?name={}&lastname={}').format(LastName, FirstName)
    page = c.get(url)
    data = page.content
theJSON = json.loads(data)
def myprint(d):
    stack = d.items()
    while stack:
        k, v = stack.pop()
        if isinstance(v, dict):
            stack.extend(v.iteritems())
        else:
            print("%s: %s" % (k, v))
print myprint(theJSON["extractorData"]["data"]["group"])
I get the error:
TypeError: list indices must be integers, not str
I am new to parsing JSON in Python, and to anything beyond simple Python in general, so excuse my ignorance. What leads me to believe that it is an empty property is that when I use an online tool to view the JSON visually, I get empty brackets.
Any help parsing this data into text would be of great help.
EDIT: Now I am able to reference a certain node with this code:
for d in group:
    print group[0]['CaseNumber'][0]["text"]
But now how can I iterate over all the dictionaries listed in the group property to list all the nodes labeled "CaseNumber" because it should exist in every one of them. e.g
print group[0]['CaseNumber'][0]["text"]
then
for d in group:
    print group[1]['CaseNumber'][0]["text"]
and so on and so forth. Perhaps incrementing some sort of integer until it reaches the end? I am not quite sure.
If you look at the JSON carefully, the data key that you are accessing is actually a list, but data['group'] tries to access it as if it were a dictionary, which raises the TypeError.
Reduced to its skeleton, your JSON is something like this:
{
"extractorData": {
"url": "string",
"resourceId": "string",
"data": [{
"group": []
}]
}
}
So if you want to access group, you should first retrieve data which is a list.
data = sample['extractorData']['data']
then you can iterate over data and get group within it
for d in data:
    group = d['group']
I hope this clarifies things a bit for you.
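For the follow-up in the EDIT (listing every "CaseNumber"), no counter is needed: iterating over each list directly visits every element. A minimal sketch, assuming a trimmed inline copy of the response in a variable named sample, written with print() as a function:

```python
# Trimmed, hypothetical copy of the response structure shown in the question.
sample = {"extractorData": {"data": [{"group": [
    {"CaseNumber": [{"text": "PO-1994-1350"}]},
    {"CaseNumber": [{"text": "NP-1998-2194"}, {"text": "FD-1998-2310"}]},
]}]}}

case_numbers = []
for d in sample["extractorData"]["data"]:         # "data" is a list
    for entry in d["group"]:                      # "group" is a list of dicts
        for case in entry.get("CaseNumber", []):  # each value is again a list
            case_numbers.append(case["text"])

print(case_numbers)
```

Using the loop variable (entry, case) instead of a fixed index such as group[0] is what makes each pass look at a different element.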
