Challenge with cleaning up JSON and export into JSONL

Challenge with cleaning up JSON and export into JSONL - python

Breaking my brain on following challenge ..
Trying to clean up JSON sample data and trying to export it to a JSONL file.
Org JSON file:
{"data": {"allSites": {"activeSys": 123, "totalSys": 24718}},
"sites": [{"accountId": "12345", "accountName": "system_a"},
{"accountId": "67890","accountName": "system_b"}]}
Required JSONL data format:
{"accountId": "12345", "accounName": "system_a"}
{"accountId": "67890", "accounName": "system_b"}

You can use ast.literal_eval conversion as an
intermediate step in order to extract the desired part and then use json.dumps to convert to a pretty format for newly created json object such as
import json
import ast
Js= """{"data": {"allSites": {"activeSys": 123, "totalSys": 24718}},
"sites": [{"accountId": "12345", "accountName": "system_a"},
{"accountId": "67890","accountName": "system_b"}]}"""
data = ast.literal_eval(Js)
newJs = data["sites"]
updatedJs = json.dumps(newJs, indent=4)
print(updatedJs)
if you want to read from a file(json.file) and write to another file(data_new.json), then use
import json
import ast
with open('data.json') as f, open('data_new.json', 'w') as f_out:
data = ast.literal_eval(f.read())
newJs = data["sites"]
f_out.write(format(newJs))

Related

Why does python generate a string json file?

I would like to generate a json file from data retrieved in an api request and another json file. The problem is that in my generated json file, the braces are surrounded by double quotes and I also have "\n" and "\r" everywhere. Do you have a solution to generate a json file correctly?
A piece of my python code:
def write_json(data, filename='data.json'):
with open(filename, 'w', encoding='utf-8') as wf:
json.dump(data, wf, ensure_ascii=False, indent=4)
# JSON file to fetch IDs
with open('sast-projects.json', 'r', encoding='utf-8') as f:
projects = json.load(f)
with open('data.json') as json_file:
data = json.load(json_file)
temp = data['data']
for project in projects:
result_detail = requests.get(url_detail + str(project['id']), headers=header_detail)
temp.append(result_detail.text)
write_json(data)
And an example of my outpout (data.json):
{
"data": [
"{\r\n \"id\": 12,\r\n \"teamId\": 34,\r\n \"owner\": \"Some text\",\r\n \"name\": \"Some text\"\r\n }\r\n ]\r\n}",
"{\r\n \"id\": 98,\r\n \"teamId\": 65,\r\n \"owner\": \"Some text\",\r\n \"name\": \"Some text\"\r\n }\r\n ]\r\n}"
]
}

Change result_detail.text to result_detail.json(). You're trying to store the raw json string instead of a json object, which is causing double encoding issues.

Is there a way to rename empty header CSV file converting to JSON in Python without pandas?

I am trying to convert a CSV file to JSON but there is a header in my csv that is empty. Is there a way to name it when outputting it to JSON?
Example data
"" Calories Fat Sodium
Bread 100 10 23
I got this code from geeksforgeeks
import csv
import json
# Function to convert a CSV to JSON
# Takes the file paths as arguments
def make_json(csvFilePath, jsonFilePath):
# create a dictionary
data = {}
# Open a csv reader called DictReader
with open(csvFilePath, encoding='utf-8') as csvf:
csvReader = csv.DictReader(csvf)
# Convert each row into a dictionary
# and add it to data
for rows in csvReader:
# Assuming a column named 'No' to
# be the primary key
key = rows['']
data[key] = rows
# Open a json writer, and use the json.dumps()
# function to dump data
with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
jsonf.write(json.dumps(data, indent=4))
# Driver Code
# Decide the two file paths according to your
# computer system
csvFilePath = r'Names.csv'
jsonFilePath = r'Names.json'
# Call the make_json function
make_json(csvFilePath, jsonFilePath)
I did this and it gets the first row, but i'm not sure how to rename it when i output it to JSON.
It appears as "":"Bread" in the JSON file.
key = rows['']
Thanks in advance if anyone can help!
Edit: Expected output
{
"Food": "Bread",
"Calories": "45",
"Fat (g)": "0",
"Carb. (g)": "11",
"Fiber (g)": "0",
"Protein": "0",
"Sodium": "10"
}

Save a "pretty" JSON Object to disc with json.dump in Python

I try to save a "pretty" json object which I created from a pandas dataframe.
df = pd.read_csv("https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv")
import json
d = df.to_dict(orient='records')
j = json.dumps(d, indent=2)
print(j)
The printed output looks great and when I copy it to an editor, it seems to work.
[
{
"model": "Mazda RX4",
"mpg": 21.0,
"cyl": 6,
"disp": 160.0,
"hp": 110,
"drat": 3.9,
"wt": 2.62,
"qsec": 16.46,
"vs": 0,
"am": 1,
"gear": 4,
"carb": 4
}
]
However, when I save it to disc, I does not look like expected.
with open("beispiel.json", "w") as write_file:
json.dump(j, write_file)
Everything is in one line and is not formatted at all:
"[\n {\n \"model\": \"Mazda RX4\",\n \"mpg\": 21.0,\n \"cyl\": 6,\n \"disp\": 160.0,\n
What am I doing wrong here?

The reason is that j is a string, so when you do:
with open("beispiel.json", "w") as write_file:
json.dump(j, write_file)
you are writing the string to the file. Just do:
json.dump(d, write_file, indent=2)

Try this:
import json
import pandas as pd
df = pd.read_csv("https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv")
d = df.to_dict(orient='records')
# 1st form
with open("beispiel.json", "w") as write_file:
write_file.write(json.dumps(d, indent=2))
# or 2nd form
with open("beispiel.json", "w") as write_file:
json.dump(d, write_file, indent=2)

How to find all "Name" parameters from big Json data using python3

How can I extract all the names from big JSON file using Python3.
with open('out.json', 'r') as f:
data = f.read()
Here I'm opening JSON file after that I tried this
a = json.dumps(data)
b= json.loads(a)
print (b)
Here is my data from JSON file.
{"data": [
{"errorCode":"E0000011","errorSummary":"Invalid token provided","errorLink":"E0000011","errorId":"oaeZ3PywqdMRWSQuA9_KML-ow","errorCauses":[]},
{"errorCode":"E0000011","errorSummary":"Invalid token provided","errorLink":"E0000011","errorId":"oaet_rFPO5bSkuEGKNI9a5vgQ","errorCauses":[]},
{"errorCode":"E0000011","errorSummary":"Invalid token provided","errorLink":"E0000011","errorId":"oaejsPt3fprRCOiYx-p7mbu5g","errorCauses":[]}]}
I need output like this
{"oaeZ3PywqdMRWSQuA9_KML-ow","oaet_rFPO5bSkuEGKNI9a5vgQ","oaejsPt3fprRCOiYx-p7mbu5g"}
I want all errorId.

Try like this :
n = {b['name'] for b in data['movie']['people']['actors']}

If you want to get or process the JSON data, you have to load the JSON first.
Here the example of the code
from json import loads
with open('out.json', 'r') as f:
data = f.read()
load = loads(data)
names = [i['name'] for i in data['movie']['people']['actors']]
or you can change names = [i['name'] for i in data['movie']['people']['actors']] to Vikas P answers

Try using json module for the above.
import json
with open('path_to_file/data.json') as f:
data = json.load(f)
actor_names = { names['name'] for names in data['movie']['people']['actors'] }

python - extract a specific key / value from json file by a variable

a bit new to python and json.
i have this json file:
{ "hosts": {
"example1.lab.com" : ["mysql", "apache"],
"example2.lab.com" : ["sqlite", "nmap"],
"example3.lab.com" : ["vim", "bind9"]
}
}
what i want to do is use the hostname variable and extract the values of each hostname.
its a bit hard to explain but im using saltstack, which already iterates over hosts and i want it to be able to extract each host's values from the json file using the hostname variable.
hope im understood.
thanks
o.

You could do something along these lines:
import json
j='''{ "hosts": {
"example1.lab.com" : ["mysql", "apache"],
"example2.lab.com" : ["sqlite", "nmap"],
"example3.lab.com" : ["vim", "bind9"]
}
}'''
specific_key='example2'
found=False
for key,di in json.loads(j).iteritems(): # items on Py 3k
for k,v in di.items():
if k.startswith(specific_key):
found=True
print k,v
break
if found:
break
Or, you could do:
def pairs(args):
for arg in args:
if arg[0].startswith(specific_key):
k,v=arg
print k,v
json.loads(j,object_pairs_hook=pairs)
Either case, prints:
example2.lab.com [u'sqlite', u'nmap']

If you have the JSON in a string then just use Python's json.loads() function to load JSON parse the JSON and load its contents into your namespace by binding it to some local name
Example:
#!/bin/env python
import json
some_json = '''{ "hosts": {
"example1.lab.com" : ["mysql", "apache"],
"example2.lab.com" : ["sqlite", "nmap"],
"example3.lab.com" : ["vim", "bind9"]
}
}'''
some_stuff = json.loads(some_json)
print some_stuff['hosts'].keys()
---> [u'example1.lab.com', u'example3.lab.com', u'example2.lab.com']
As shown you then access the contents of some_stuff just as you would any other Python dictionary ... all the top level variable declaration/assignments which were serialized (encoded) in the JSON will be keys in that dictionary.
If the JSON contents are in a file you can open it like any other file in Python and pass the file object's name to the json.load() function:
#!/bin/python
import json
with open("some_file.json") as f:
some_stuff = json.load(f)
print ' '.join(some_stuff.keys())

If the above json file is stored as 'samplefile.json', you can write following in python:
import json
f = open('samplefile.json')
data = json.load(f)
value1 = data['hosts']['example1.lab.com']
value2 = data['hosts']['example2.lab.com']
value3 = data['hosts']['example3.lab.com']

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Challenge with cleaning up JSON and export into JSONL - python

Related

Why does python generate a string json file?

Is there a way to rename empty header CSV file converting to JSON in Python without pandas?

Save a "pretty" JSON Object to disc with json.dump in Python

How to find all "Name" parameters from big Json data using python3

python - extract a specific key / value from json file by a variable

Categories

Resources