JSON to CSV - Python & cStringIO - python

I'm trying to convert a JSON file to CSV format (in memory), so that I can pass it to another Transformer in Mulesoft. Here is a snippet of the JSON:
[
{
"observationid": 1,
"fkey_observation": 1,
"value": 1,
"participantid": null,
"uom": "ppb",
"finishtime": 1008585047000,
"starttime": 1008581447000,
"observedproperty": "NO2",
"measuretime": 1008581567000,
"measurementid": 1,
"longitude": 3.1415,
"identifier": "Test-1",
"latitude": 10
},
{
"observationid": 1,
"fkey_observation": 1,
"value": 12,
"participantid": null,
"uom": "ppb",
"finishtime": 1008585047000,
"starttime": 1008581447000,
"observedproperty": "SO2",
"measuretime": 1008582047000,
"measurementid": 2,
"longitude": 5,
"identifier": "Test-1",
"latitude": 11
}
]
Essentially, this should create a CSV (in memory) with 2 rows, that looks like this:
1,1,1,N,ppb,1008585047000,1008581447000,NO2,1008581567000,1,3.1415,Test-1,10
1,1,12,N,ppb,1008585047000,1008581447000,SO2,1008582047000,2,5,Test-1,11
Currently, the output comes out like this, which is wrong:
[1 1 1 None u'ppb' 1008585047000L 1008581447000L u'NO2' 1008581567000L 1 3.1415 u'Test-1' 10]
[1 1 12 None u'ppb' 1008585047000L 1008581447000L u'SO2' 1008582047000L 2 5 u'Test-1' 11]
I believe the 'u' bit refers to Unicode, but I don't know how to change the encoding.
Any help would be greatly appreciated!
Here is the Python code I have so far:
import json
import cStringIO
f = open('test.json')
data = json.load(f)
f.close()
output = cStringIO.StringIO()
for item in data:
    output.write(str([item['observationid'], item['fkey_observation'], item['value'], item['participantid'], item['uom'], item['finishtime'], item['starttime'], item['observedproperty'], item['measuretime'],item['measurementid'], item['longitude'], item['identifier'], item['latitude']]) + '\n')
contents = output.getvalue()
print contents
EDIT
Hi guys, slight change of plan.
Essentially, I have a String object, but it actually is structured like a JSON file:
"[{observationid=1, fkey_observation=1, value=1, participantid=null, uom=ppb, finishtime=2001-12-17 10:30:47.0, starttime=2001-12-17 09:30:47.0, observedproperty=NO2, measuretime=2001-12-17 09:32:47.0, measurementid=1, longitude=3.1415, identifier=CITISENSE-Test-00000001, latitude=10}, {observationid=1, fkey_observation=1, value=12, participantid=null, uom=ppb, finishtime=2001-12-17 10:30:47.0, starttime=2001-12-17 09:30:47.0, observedproperty=SO2, measuretime=2001-12-17 09:40:47.0, measurementid=2, longitude=5, identifier=CITISENSE-Test-00000001, latitude=11}, {observationid=1, fkey_observation=1, value=7000, participantid=null, uom=ppb, finishtime=2001-12-17 10:30:47.0, starttime=2001-12-17 09:30:47.0, observedproperty=NO2, measuretime=2001-12-17 09:52:47.0, measurementid=3, longitude=6, identifier=CITISENSE-Test-00000001, latitude=9}, {observationid=2, fkey_observation=2, value=5, participantid=null, uom=ppb, finishtime=2001-12-18 10:30:47.0, starttime=2001-12-18 09:30:47.0, observedproperty=SO2, measuretime=2001-12-18 09:32:47.0, measurementid=4, longitude=7, identifier=CITISENSE-Test-00000001, latitude=8}, {observationid=2, fkey_observation=2, value=6, participantid=null, uom=ppb, finishtime=2001-12-18 10:30:47.0, starttime=2001-12-18 09:30:47.0, observedproperty=PM10, measuretime=2001-12-18 09:34:47.0, measurementid=5, longitude=8, identifier=CITISENSE-Test-00000001, latitude=10}, {observationid=3, fkey_observation=3, value=10000, participantid=null, uom=ppb, finishtime=2001-12-19 10:30:47.0, starttime=2001-12-19 09:30:47.0, observedproperty=SO2, measuretime=2001-12-19 09:38:47.0, measurementid=6, longitude=9, identifier=CITISENSE-Test-00000001, latitude=11.2}]"
How do I go about converting this to CSV? I can't use the json module as it is not a JSON file.

Here is my approach: use csv.DictWriter to handle converting from a dictionary to a row of CSV data:
import csv
import json
from cStringIO import StringIO
with open('test.json') as f:
    my_data = json.load(f)
headers = [
    'observationid', 'fkey_observation', 'value',
    'participantid', 'uom', 'finishtime', 'starttime',
    'observedproperty', 'measuretime', 'measurementid',
    'longitude', 'identifier', 'latitude']
buffer = StringIO()
writer = csv.DictWriter(buffer, headers)
for row in my_data:
    writer.writerow(row)
print buffer.getvalue()
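On Python 3, where cStringIO no longer exists, the same idea can be sketched with io.StringIO. This is only a minimal sketch: the inline JSON_TEXT and the shortened FIELDS list are assumptions standing in for test.json and the full header list. By default csv.DictWriter writes None as an empty field.

```python
import csv
import io
import json

# Assumption: a shortened version of the question's records, inlined instead of read from test.json.
JSON_TEXT = """[
  {"observationid": 1, "value": 1, "participantid": null, "uom": "ppb", "observedproperty": "NO2", "longitude": 3.1415},
  {"observationid": 1, "value": 12, "participantid": null, "uom": "ppb", "observedproperty": "SO2", "longitude": 5}
]"""

# Column order of the CSV output; every key in the records must appear here.
FIELDS = ["observationid", "value", "participantid", "uom", "observedproperty", "longitude"]

records = json.loads(JSON_TEXT)

buffer = io.StringIO()  # in-memory text buffer, the Python 3 replacement for cStringIO
writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
writer.writerows(records)  # one CSV row per dictionary

csv_text = buffer.getvalue()
print(csv_text)
```

If a marker such as N is wanted for nulls, the values would have to be replaced before writing; the csv module does not do that on its own.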

Here's a little snippet I wrote up, I think it should handle your scenario and give you a list of lists. Ereli is onto something with that module though, it might make your life easier. But in the meantime maybe this will help.
import json
myFile = open('myJson.json','r+')
myData = json.load(myFile)
myFile.close()
myList = []
for x in range(0,len(myData)):
    myList.append([])
    for key in myData[x].keys():
        value = myData[x][key]
        if isinstance(value,(str,unicode)):
            value = value.encode('ascii','ignore')
        myList[x].append(value)
print myList

You should probably consider using something like csv.writer. It will handle the escaping and the delimiter setting for you.
Here is an example for Python 3:
import csv
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    for line in data:
        writer.writerow(line)
It can also be used with cStringIO.
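For the string shown in the EDIT, which is not valid JSON, one possible approach is to split it by hand before passing the records to the csv module. The sketch below is Python 3 and rests on loud assumptions: RAW is a shortened copy of that string, parse_records is a hypothetical helper, and no value contains ", ", "=" or braces.

```python
import csv
import io
import re

# Assumption: a shortened copy of the string from the EDIT.
RAW = ("[{observationid=1, value=1, participantid=null, "
       "starttime=2001-12-17 09:30:47.0, observedproperty=NO2}, "
       "{observationid=1, value=12, participantid=null, "
       "starttime=2001-12-17 09:30:47.0, observedproperty=SO2}]")


def parse_records(text):
    """Hypothetical helper: turn '{k=v, k=v}, {...}' into a list of dicts of strings."""
    records = []
    # Each {...} group is one record; braces are assumed never to nest.
    for body in re.findall(r"\{([^{}]*)\}", text):
        # Pairs are separated by ", " and each pair is split on its first "=".
        pairs = (pair.split("=", 1) for pair in body.split(", "))
        records.append({key: value for key, value in pairs})
    return records


records = parse_records(RAW)

buffer = io.StringIO()
# Use the key order of the first record as the column order.
writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
writer.writerows(records)

csv_text = buffer.getvalue()
print(csv_text)
```

All values stay strings here, so null is written literally as null unless it is mapped to something else first.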

Related

Is there a way to rename empty header CSV file converting to JSON in Python without pandas?

I am trying to convert a CSV file to JSON but there is a header in my csv that is empty. Is there a way to name it when outputting it to JSON?
Example data
"" Calories Fat Sodium
Bread 100 10 23
I got this code from geeksforgeeks
import csv
import json
# Function to convert a CSV to JSON
# Takes the file paths as arguments
def make_json(csvFilePath, jsonFilePath):
    # create a dictionary
    data = {}
    # Open a csv reader called DictReader
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        # Convert each row into a dictionary
        # and add it to data
        for rows in csvReader:
            # Assuming a column named 'No' to
            # be the primary key
            key = rows['']
            data[key] = rows
    # Open a json writer, and use the json.dumps()
    # function to dump data
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonf.write(json.dumps(data, indent=4))
# Driver Code
# Decide the two file paths according to your
# computer system
csvFilePath = r'Names.csv'
jsonFilePath = r'Names.json'
# Call the make_json function
make_json(csvFilePath, jsonFilePath)
I did this and it gets the first row, but I'm not sure how to rename it when I output it to JSON.
It appears as "":"Bread" in the JSON file.
key = rows['']
Thanks in advance if anyone can help!
Edit: Expected output
{
"Food": "Bread",
"Calories": "45",
"Fat (g)": "0",
"Carb. (g)": "11",
"Fiber (g)": "0",
"Protein": "0",
"Sodium": "10"
}
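One possible approach, sketched here without pandas, is to read the header row yourself and substitute a name for the empty cell before building the dictionaries. The sample data, the helper name rows_with_named_header and the replacement name "Food" are assumptions for illustration; the data is assumed to be comma-separated.

```python
import csv
import io
import json

# Assumption: comma-separated sample data whose first header cell is empty.
SAMPLE_CSV = '"",Calories,Fat,Sodium\nBread,100,10,23\n'


def rows_with_named_header(csv_file, name_for_empty="Food"):
    """Hypothetical helper: yield one dict per row, renaming the empty header."""
    reader = csv.reader(csv_file)
    header = next(reader)  # first line holds the column names
    header = [name_for_empty if name == "" else name for name in header]
    for row in reader:
        yield dict(zip(header, row))


rows = list(rows_with_named_header(io.StringIO(SAMPLE_CSV)))
print(json.dumps(rows, indent=4))
```

With a real file, the io.StringIO object would be replaced by the handle returned from open(csvFilePath, encoding='utf-8', newline='').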

Save a "pretty" JSON Object to disc with json.dump in Python

I am trying to save a "pretty" JSON object which I created from a pandas dataframe.
df = pd.read_csv("https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv")
import json
d = df.to_dict(orient='records')
j = json.dumps(d, indent=2)
print(j)
The printed output looks great and when I copy it to an editor, it seems to work.
[
{
"model": "Mazda RX4",
"mpg": 21.0,
"cyl": 6,
"disp": 160.0,
"hp": 110,
"drat": 3.9,
"wt": 2.62,
"qsec": 16.46,
"vs": 0,
"am": 1,
"gear": 4,
"carb": 4
}
]
However, when I save it to disk, it does not look as expected.
with open("beispiel.json", "w") as write_file:
    json.dump(j, write_file)
Everything is in one line and is not formatted at all:
"[\n {\n \"model\": \"Mazda RX4\",\n \"mpg\": 21.0,\n \"cyl\": 6,\n \"disp\": 160.0,\n
What am I doing wrong here?
The reason is that j is already a JSON string, so when you do:
with open("beispiel.json", "w") as write_file:
    json.dump(j, write_file)
you are serializing that string a second time, which wraps it in quotes and escapes the newlines. Instead, dump the original object:
    json.dump(d, write_file, indent=2)
Try this:
import json
import pandas as pd
df = pd.read_csv("https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv")
d = df.to_dict(orient='records')
# 1st form
with open("beispiel.json", "w") as write_file:
    write_file.write(json.dumps(d, indent=2))
# or 2nd form
with open("beispiel.json", "w") as write_file:
    json.dump(d, write_file, indent=2)
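The difference between the two calls can be checked in memory. This is a minimal sketch: the records list is an assumed stand-in for df.to_dict(orient='records'), and io.StringIO replaces the real file.

```python
import io
import json

# Assumption: a one-row stand-in for df.to_dict(orient='records').
records = [{"model": "Mazda RX4", "mpg": 21.0, "cyl": 6}]

# json.dumps returns a str that already contains formatted JSON text.
pretty = json.dumps(records, indent=2)

# Serializing that str again produces a JSON *string literal*:
# it gains surrounding quotes and every newline becomes the two characters \n.
double_encoded = json.dumps(pretty)

# Serializing the original object once keeps the indentation.
buffer = io.StringIO()
json.dump(records, buffer, indent=2)
single_encoded = buffer.getvalue()

print(double_encoded[:20])
print(single_encoded)
```

The double-encoded text is still valid JSON, but it decodes to a string rather than to a list, which is why the file looked like a single escaped line.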

How to write line by line request output

I am trying to write the JSON output from my Python request to a file line by line. I already checked a similar question on StackOverflow (write to file line by line python), without success.
Here is the code:
myfile = open ("data.txt", "a")
for item in pretty_json["geonames"]:
    print (item["geonameId"],item["name"])
    myfile.write ("%s\n" % item["geonameId"] + "https://www.geonames.org/" + item["name"])
myfile.close()
Here is the output from my pretty_json["geonames"]:
{
"adminCode1": "FR",
"lng": "7.2612",
"geonameId": 2661847,
"toponymName": "Aeschlenberg",
"countryId": "2658434",
"fcl": "P",
"population": 0,
"countryCode": "CH",
"name": "Aeschlenberg",
"fclName": "city, village,...",
"adminCodes1": {
"ISO3166_2": "FR"
},
"countryName": "Switzerland",
"fcodeName": "populated place",
"adminName1": "Fribourg",
"lat": "46.78663",
"fcode": "PPL"
}
Then, as the output saved in my data.txt, I get:
11048419
https://www.geonames.org/Aïre2661847
https://www.geonames.org/Aeschlenberg2661880
https://www.geonames.org/Aarberg6295535
The expected result should be something like:
Aïre , https://www.geonames.org/11048419
Aeschlenberg , https://www.geonames.org/2661847
Aarberg , https://www.geonames.org/2661880
Writing the output in CSV could be a solution?
Regards.
Using the csv module.
Ex:
import csv
with open("data.txt", "a") as myfile:
    writer = csv.writer(myfile) #Create Writer Object
    for item in pretty_json["geonames"]: #Iterate list
        writer.writerow([item["name"], "https://www.geonames.org/{}".format(item["geonameId"])]) #Write row.
If I understand correctly, you want the same screen output written to your file. That's easy. If you are on Python 3, just add a file argument to your print call:
print (item["geonameId"],item["name"], file=myfile)
Just compose a proper printing format for the needed items:
...
for item in pretty_json["geonames"]:
    print("{}, https://www.geonames.org/{}".format(item["name"], item["geonameId"]))
Sample output:
Aeschlenberg, https://www.geonames.org/2661847
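The shifted lines in the question come from the write call itself: "%s\n" % item["geonameId"] is evaluated first, so the newline lands right after the id and the URL plus name spill onto the next line. A minimal sketch of building the whole line before the newline follows; the geonames list is an assumed stand-in for pretty_json["geonames"], and io.StringIO stands in for the file.

```python
import io

# Assumption: a two-item stand-in for pretty_json["geonames"], using values from the question.
geonames = [
    {"geonameId": 11048419, "name": "Aïre"},
    {"geonameId": 2661847, "name": "Aeschlenberg"},
]

# A real file opened with open("data.txt", "a", encoding="utf-8") can be used the same way.
out = io.StringIO()
for item in geonames:
    # Format the complete line first, then finish it with exactly one newline.
    line = "{} , https://www.geonames.org/{}\n".format(item["name"], item["geonameId"])
    out.write(line)

text = out.getvalue()
print(text)
```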

Not getting expected output in python when converting a csv to json

I have an Excel file in which data is saved in CSV format, as shown below, all under column A. (The CSV file is generated by LabVIEW code that I wrote to produce the data.) I have also attached an image of the CSV file for reference at the end of my question.
RPM,Load Current,Battery Output,Power Capacity
1200,30,12,37
1600,88,18,55
I want to create a Json file in such format
{
"power_capacity_data" :
{
"rpm" : ["1200","1600"],
"load_curr" : ["30","88"],
"batt_output" : ["12","18"],
"power_cap" : ["37","55"]
}
}
This is my code
import csv
import json
def main():
    #created a dictionary so that i can append data to it afterwards
    power_data = {"rpm":[],"load_curr":[],"batt_output":[],"power_cap":[]}
    with open('power1.lvm') as f:
        reader = csv.reader(f)
        #trying to append the data of column "RPM" to dictionary
        rowcount = 0
        for row in reader:
            if rowcount == 0:
                #trying to skip the first row
                rowcount = rowcount + 1
            else:
                power_data['rpm'].append(row[0])
            print(row)
    json_report = {}
    json_report['pwr_capacity_data'] = power_data
    with open('LVMJSON', "w") as f1:
        f1.write(json.dumps(json_report, sort_keys=False, indent=4, separators=(',', ': '),encoding="utf-8",ensure_ascii=False))
        f1.close()
if __name__ == "__main__":
    main()
The output json file that i am getting is this:(please ignore the print(row) statement in my code)
{
"pwr_capacity_data":
{
"load_curr": [],
"rpm": [
"1200,30,12.62,37.88",
"1600,88,18.62,55.88"
],
"batt_output": [],
"power_cap": []
}
}
The whole row is being saved in the list, but I only want the values under the RPM column to be saved. Can someone help me work out what I may be doing wrong? Thanks in advance. I have attached an image of the CSV file in case it helps.
You could use Python's defaultdict to make it a bit easier, along with a dictionary that maps your header values to the output keys.
from collections import defaultdict
import csv
import json
power_data = defaultdict(list)
header_mappings = {
'RPM' : 'rpm',
'Load Current' : 'load_curr',
'Battery Output' : 'batt_output',
'Power Capacity' : 'power_cap'}
with open('power1.lvm', newline='') as f_input:
    csv_input = csv.DictReader(f_input)
    for row in csv_input:
        for key, value in row.items():
            power_data[header_mappings[key]].append(value)
with open('LVMJSON.json', 'w') as f_output:
    json.dump({'power_capacity_data' : power_data}, f_output, indent=2)
Giving you an output JSON file looking like:
{
"power_capacity_data": {
"batt_output": [
"12",
"18"
],
"power_cap": [
"37",
"55"
],
"load_curr": [
"30",
"88"
],
"rpm": [
"1200",
"1600"
]
}
}
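The column-wise grouping can also be checked in memory by transposing the rows with zip. This is a sketch of a different technique from the defaultdict answer, not a replacement for it; SAMPLE and OUTPUT_KEYS are assumptions copied from the question's data and desired key names, and the file is assumed to be genuinely comma-separated with one value per field.

```python
import csv
import io
import json

# Assumption: the sample data from the question, inlined instead of read from power1.lvm.
SAMPLE = "RPM,Load Current,Battery Output,Power Capacity\n1200,30,12,37\n1600,88,18,55\n"

# Output key names, in the same order as the CSV columns.
OUTPUT_KEYS = ["rpm", "load_curr", "batt_output", "power_cap"]

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row

# zip(*rows) turns a list of rows into a sequence of columns.
columns = zip(*reader)
power_data = {key: list(column) for key, column in zip(OUTPUT_KEYS, columns)}

report = {"power_capacity_data": power_data}
print(json.dumps(report, indent=2))
```

If every row still arrives as a single string, as in the question's output, the file is probably not being split on commas at all, and the delimiter or quoting would need checking first.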

Python write multiple array value into csv

With my code, I read the values from JSON data and insert them into an array:
def retrive_json():
    with open('t_v1.json') as json_data:
        d = json.load(json_data)
    array = []
    for i in d['ride']:
        origin_lat = i['origin']['lat']
        origin_lng = i['origin']['lng']
        destination_lat = i['destination']['lat']
        destination_lng = i['destination']['lng']
        array.append([origin_lat,origin_lng,destination_lat,destination_lng])
    return array
The resulting array is this:
[[39.72417, -104.99984, 39.77446, -104.9379], [39.77481, -104.93618, 39.6984, -104.9652]]
How can I write each element of each inner array into its own field in a CSV file?
I have tried it this way:
wrt = csv.writer(open('t_.csv', 'w'), delimiter=',',lineterminator='\n')
for x in jjson:
    wrt.writerow([x])
but the values of each array are all stored in one field.
How can I solve this and write each value into its own field?
This is my JSON file:
{
"ride":[
{
"origin":{
"lat":39.72417,
"lng":-104.99984,
"eta_seconds":null,
"address":""
},
"destination":{
"lat":39.77446,
"lng":-104.9379,
"eta_seconds":null,
"address":null
}
},
{
"origin":{
"lat":39.77481,
"lng":-104.93618,
"eta_seconds":null,
"address":"10 Albion Street"
},
"destination":{
"lat":39.6984,
"lng":-104.9652,
"eta_seconds":null,
"address":null
}
}
]
}
Let's say we have this:
jsonstring = """{
"ride":[
{
"origin":{
"lat":39.72417,
"lng":-104.99984,
"eta_seconds":null,
"address":""
},
"destination":{
"lat":39.77446,
"lng":-104.9379,
"eta_seconds":null,
"address":null
}
},
{
"origin":{
"lat":39.77481,
"lng":-104.93618,
"eta_seconds":null,
"address":"10 Albion Street"
},
"destination":{
"lat":39.6984,
"lng":-104.9652,
"eta_seconds":null,
"address":null
}
}
]
}"""
Here is a pandas solution:
import pandas as pd
import json
# Load json to dataframe
df = pd.DataFrame(json.loads(jsonstring)["ride"])
# Create the new columns
df["o1"] = df["origin"].apply(lambda x: x["lat"])
df["o2"] = df["origin"].apply(lambda x: x["lng"])
df["d1"] = df["destination"].apply(lambda x: x["lat"])
df["d2"] = df["destination"].apply(lambda x: x["lng"])
#export
print(df.iloc[:,2:].to_csv(index=False, header=True))
#use below for file
#df.iloc[:,2:].to_csv("output.csv", index=False, header=True)
Returns:
o1,o2,d1,d2
39.72417,-104.99984,39.77446,-104.9379
39.77481,-104.93618,39.6984,-104.9652
Condensed answer:
import pandas as pd
import json
with open('data.json') as json_data:
    d = json.load(json_data)
df = pd.DataFrame(d["ride"])
df["o1"],df["o2"] = zip(*df["origin"].apply(lambda x: (x["lat"],x["lng"])))
df["d1"],df["d2"] = zip(*df["destination"].apply(lambda x: (x["lat"],x["lng"])))
df.iloc[:,2:].to_csv("t_.csv",index=False,header=False)
Or, maybe the most readable solution:
import json
import pandas as pd
with open('data.json') as json_data:
    d = json.load(json_data)
df = pd.json_normalize(d["ride"])
cols = ["origin.lat","origin.lng","destination.lat","destination.lng"]
df[cols].to_csv("output.csv",index=False,header=False)
This might help:
import json
import csv
def retrive_json():
    with open('data.json') as json_data:
        d = json.load(json_data)
    array = []
    for i in d['ride']:
        origin_lat = i['origin']['lat']
        origin_lng = i['origin']['lng']
        destination_lat = i['destination']['lat']
        destination_lng = i['destination']['lng']
        array.append([origin_lat,origin_lng,destination_lat,destination_lng])
    return array
res = retrive_json()
csv_cols = ["origin_lat", "origin_lng", "dest_lat", "dest_lng"]
with open("output_csv.csv", 'w') as out:
    writer = csv.DictWriter(out, fieldnames=csv_cols)
    writer.writeheader()
    for each_list in res:
        d = dict(zip(csv_cols,each_list))
        writer.writerow(d)
The generated CSV output is:
origin_lat,origin_lng,dest_lat,dest_lng
39.72417,-104.99984,39.77446,-104.9379
39.77481,-104.93618,39.6984,-104.9652
To me it looks like you've got an array of arrays and you want the individual elements, so you'll want a nested for loop. Your current for loop gets each inner array; to split each array into its elements, loop through those as well. I'd suggest something like this:
for x in jjson:
    for y in x:
        wrt.writerow([y])
You might need to adjust the bracketing; this is just to give you an idea of how to solve your issue.
Let me know how it goes!
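Note that a nested loop calling writerow([y]) puts each value on its own line rather than in its own field. If the goal is one CSV row per inner list, passing the inner list itself to the writer may be closer to what the question asks. A minimal sketch, where rides is an assumed copy of the array from the question and io.StringIO stands in for the output file:

```python
import csv
import io

# Assumption: the array returned by retrive_json() in the question.
rides = [
    [39.72417, -104.99984, 39.77446, -104.9379],
    [39.77481, -104.93618, 39.6984, -104.9652],
]

buffer = io.StringIO()  # open("t_.csv", "w", newline="") would be used for a real file
writer = csv.writer(buffer, delimiter=",", lineterminator="\n")

# writerows treats each inner list as one row and each element as one field.
writer.writerows(rides)

csv_text = buffer.getvalue()
print(csv_text)
```

The original attempt, writerow([x]), wrapped the whole inner list in another list, so the writer saw a single field containing the list's string form.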
Why use the csv library at all?
array = [[1, 2, 3, 4], [5, 6, 7, 8]]
with open('test.csv', 'w') as csv_file:
    csv_file.write("# Header Info\n"
                   "# Value1, Value2, Value3, Value4\n")  # The header might be optional
    for row in array:
        # str.join() only accepts strings, so convert each number first
        csv_file.write(",".join(str(value) for value in row) + "\n")
