Loading data from a JSON file - python

I am trying to get some data from a JSON file. Here is the code for it -
import csv
import json

ifile = open('facebook.csv', "rb")
reader = csv.reader(ifile)

rownum = 0
for row in reader:
    try:
        csvfile = open('facebook.csv', 'r')
        jsonfile = open('file.json', 'r+')
        fieldnames = ("USState", "NOFU2008", "NOFU2009", "NOFU2010", "12MI%", "24MI%")
        reader = csv.DictReader(csvfile, fieldnames)
        for row in reader:
            json.dump(row, jsonfile)
            jsonfile.write('\n')
        data = json.load(jsonfile)
        print data["USState"]
    except ValueError:
        continue
I am not getting any output on the console from the print statement. The JSON is in the following format:
{"USState": "US State", "12MI%": "12 month increase %", "24MI%": "24 month increase %", "NOFU2010": "Number of Facebook UsersJuly 2010", "NOFU2008": "Number of Facebook usersJuly 2008", "NOFU2009": "Number of Facebook UsersJuly 2009"}
{"USState": "Alabama", "12MI%": "109.3%", "24MI%": "400.7%", "NOFU2010": "1,452,300", "NOFU2008": "290,060", "NOFU2009": "694,020"}
I want to access fields like NOFU2008 for all the rows.

The problem is in the way you're creating the JSON file. You don't want to use json.dump() for each row and then append those to the JSON file.
To create a JSON file, you should first create a data structure in Python that represents the entire file the way you want it, and then call json.dump() one time only to dump out the entire structure to JSON format.
Making a single json.dump() call for your entire file will ensure that it is valid JSON.
I'd also recommend wrapping your list/array of rows inside a dict/object so you have a place to put other properties that pertain to the entire JSON file as opposed to a single row.
It looks like the first couple of rows of your facebook.csv are something like this (with or without the quotes):
"US State","12 month increase %","24 month increase %","Number of Facebook UsersJuly 2010","Number of Facebook usersJuly 2008","Number of Facebook UsersJuly 2009"
"Alabama","109.3%","400.7%","1,452,300","290,060","694,020"
Let's say we want to generate this JSON file from that (indented here for clarity):
{
    "rows": [
        {
            "USState": "US State",
            "12MI%": "Number of Facebook usersJuly 2008",
            "24MI%": "Number of Facebook UsersJuly 2009",
            "NOFU2010": "Number of Facebook UsersJuly 2010",
            "NOFU2008": "12 month increase %",
            "NOFU2009": "24 month increase %"
        },
        {
            "USState": "Alabama",
            "12MI%": "290,060",
            "24MI%": "694,020",
            "NOFU2010": "1,452,300",
            "NOFU2008": "109.3%",
            "NOFU2009": "400.7%"
        }
    ]
}
Note that the top level of the JSON file is an object (not an array), and this object has a rows property which is the array of rows.
We can create this JSON file and test it with this Python code:
import csv
import json

# Read the CSV file and convert it to a list of dicts
with open( 'facebook.csv', 'rb' ) as csvfile:
    fieldnames = (
        "USState", "NOFU2008", "NOFU2009", "NOFU2010",
        "12MI%", "24MI%"
    )
    reader = csv.DictReader( csvfile, fieldnames )
    rows = list( reader )

# Wrap the list inside an outer dict
wrap = {
    'rows': rows
}

# Format and write the entire JSON in one fell swoop
with open( 'file.json', 'wb' ) as jsonfile:
    json.dump( wrap, jsonfile )

# Now test the file by reading it and parsing it
with open( 'file.json', 'rb' ) as jsonfile:
    data = json.load( jsonfile )

# For fun, convert the data back to JSON again and pretty-print it
print json.dumps( data, indent=4 )
A few notes... This code does not have the nested reader loops from the original. I have no idea what those were for. One reader should be enough.
In fact, this version doesn't use a loop at all. This line generates a list of rows from the reader object:
rows = list( reader )
Also pay close attention to the use of with where the CSV and JSON files are opened. This is a great way to open a file because the file will be automatically closed at the end of the with block.
Now having said all this, I have to wonder if this exact JSON structure is what you really want? It looks like the first row of the CSV is a header row, so you may want to skip that row? You can do that easily by adding a reader.next() call before converting the rest of the CSV data to a list:
reader.next()
rows = list( reader )
Also I'm not sure I understand how you want to access the resulting data. You wouldn't be able to use data["USState"], because USState is a property of each individual row object. So say a little more about how you want to access the data and we can sort it out.
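For example, to read NOFU2008 for every row (what the original question asked for), a minimal sketch against the JSON structure above could look like this, assuming the header row has been skipped so each row holds real data:

import json

with open( 'file.json', 'rb' ) as jsonfile:
    data = json.load( jsonfile )

# data is the outer dict; data['rows'] is the list of row objects
for row in data['rows']:
    print row['USState'], row['NOFU2008']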

If you want to create a list of JSON objects in the file, you should first look at what a list in JSON looks like.
Since list elements are separated by a comma, you would need something like this in the code:
jsonfile.write(',\n')
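Note, though, that comma-separated objects alone still will not parse as JSON; a valid JSON array also needs the enclosing square brackets. A rough sketch of hand-building such an array (collecting the rows in a list and dumping it once, as in the answer above, is usually simpler):

import csv
import json

with open('facebook.csv', 'r') as csvfile, open('file.json', 'w') as jsonfile:
    fieldnames = ("USState", "NOFU2008", "NOFU2009", "NOFU2010", "12MI%", "24MI%")
    reader = csv.DictReader(csvfile, fieldnames)
    jsonfile.write('[\n')
    first = True
    for row in reader:
        if not first:
            jsonfile.write(',\n')   # comma between objects, but not after the last one
        json.dump(row, jsonfile)
        first = False
    jsonfile.write('\n]\n')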

Related

Failing to write specific data from a csv file into a json file

I want to get only the values of 3 columns from a CSV file and write those values to a JSON file. I also want to remove all rows where latitude is empty.
I am trying to avoid pandas, if possible, so that I don't load more libraries.
I manage to read the CSV file and print the values from the columns I want (Latitude, Longitude and (LastUL)SNR), but when I print the data the order is not the same as I have written in the code.
Here is an example of the output:
[{'-8.881253', '-110', '38.569244417'}]
[{'-8.881253', '-110', '38.569244417'}, {'-8.910678', '-122', '38.6256140816'}]
[{'-8.881253', '-110', '38.569244417'}, {'-8.910678', '-122', '38.6256140816'}, {'38.6256782222', '-127', '-8.913913'}]
I am also failing to dump the data into a JSON file, even though I have used the code as is in this link on other occasions and it worked fine.
As I am new to Python, I am not getting the reason why that is happening.
Insights would be very much appreciated.
The CSV file example is the following:
MAC LORA,Nº Série,Modelo,Type,Freguesia,PT,Latitude-Instalação,Longitude-Instalação,Latitude - API,Longitude - API,Latitude,Longitude,Data Instalação,Semana,Instalador,total instaladas,TYPO,Instaladas,Registadas no NS,Registadas Arquiled,Registadas no Máximo,Último UL,Último JoinRequest,Último Join Accept,(LastUL)RSSI,(LastUL)SNR,JR(RSSI),JR(SNR),Mesmo Poste,Substituidas,Notas,Issue,AVARIAS
0004A30B00FB82F0,202103000002777,OCTANS 40,Luminária,Freguesia de PALMELA,PT1508D2052900,38.569244417,-8.88123655,38.569244,-8.881253,38.569244417,-8.88123655,2022-04-11,2022W15,,,,1.0,1,1,1,2022-07-25 06:16:47,2022-08-10 06:18:45,2022-07-25 21:33:41,-110,"7,2","-115,00","-0,38",,,,Sem JA,
0004A30B00FA89D1,PF_A0000421451,I-TRON Zero 2Z8 4.30-3M,Luminária,Freguesia de PINHAL NOVO,PT1508D2069100,38.6256140816,-8.9107094238,38.625622,-8.910678,38.6256140816,-8.9107094238,2022-03-10,2022W10,,,,1.0,1,1,1,2022-08-10 06:31:29,2022-08-09 22:18:17,2022-08-09 22:18:17,-122,0,"-121,60","-3,00",,,,Ok,
0004A30B00FAB0D9,PF_A0000421452,I-TRON Zero 2Z8 4.30-3M,Luminária,Freguesia de PINHAL NOVO,PT1508D2026300,38.6256782222,-8.91389057769,38.625687,-8.913913,38.6256782222,-8.91389057769,2022-03-10,2022W10,,,,1.0,1,1,1,2022-07-22 06:16:25,00:00:00,2022-07-27 06:29:46,-127,"-15,5",0,0,,,,Sem JR,
The Python code is as follows:
import json
import csv
from csv import DictReader
from json import dumps

csvFilePath = "csv_files/test.csv"
jsonFilePath = "rssi.json"

try:
    with open(csvFilePath, 'r') as csvFile:
        reader = csv.reader(csvFile)
        next(reader)
        data = []
        for row in reader:
            data.append({row[9], row[10], row[24]})
            print(data)
    with open(jsonFilePath, "w") as outfile:
        json.dump(data, outfile, indent=4, ensure_ascii=False)
except:
    print("Something didn't go as planned!")
else:
    print("Successfully exported the files to json!")
It prints the right columns but in the wrong order (I want Latitude, Longitude and then lastULSNR), and after that it doesn't write to the JSON file.
Curly braces in {row[9], row[10], row[24]} mean a set in Python. A set doesn't preserve order; it only keeps a unique collection of values. A set is also not serializable to JSON.
Try to use tuples or lists, e.g. (row[9], row[10], row[24]).
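For example, the minimal change inside the loop, keeping the three columns in the order they are written (lists preserve order and serialize to JSON arrays):

data.append([row[9], row[10], row[24]])   # a list instead of a set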
You could also use dict to make your code/output more readable:
row = {
    "Latitude": row[9],
    "Longitude": row[10],
    "lastULSNR": row[24]
}
if row["Latitude"]:
    # if latitude is not empty
    # add row to output
    data.append(row)

print(data)
# [{'Latitude': 38.569244417, 'Longitude': -8.881253, 'lastULSNR': 110}]
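Putting it together, a rough sketch of the corrected loop, keeping the original file paths and column indices and writing the JSON file once after the loop:

import csv
import json

csvFilePath = "csv_files/test.csv"
jsonFilePath = "rssi.json"

data = []
with open(csvFilePath, 'r') as csvFile:
    reader = csv.reader(csvFile)
    next(reader)                          # skip the header row
    for row in reader:
        entry = {
            "Latitude": row[9],
            "Longitude": row[10],
            "lastULSNR": row[24],
        }
        if entry["Latitude"]:             # drop rows where latitude is empty
            data.append(entry)

with open(jsonFilePath, "w") as outfile:
    json.dump(data, outfile, indent=4, ensure_ascii=False)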

Using Python to convert JSON to CSV

I have tried a few different ways using pandas to convert my JSON to a CSV file.
import pandas as pd

df = pd.read_json("CDMP_E2.json")
df.to_csv("CDMP_Output.csv")
The problem is when I run that code it makes the output all in one "column".
The column header shows up as Credit-NoSQL.
Then the data in the column is everything from each "object"
'date':'2021-08-01','type':'CARD','amount':'100'
So it looks like this:
Credit-NoSQL
'date':'2021-08-01','type':'CARD','amount':'100'
I would instead expect to see date, type and amount as the headers.
account date type amount returneddate
ABCD 2021-08-01 CARD 100
EFGHI 2021-08-01 CARD 150 2021-08-04
My JSON file looks as such:
[
    {
        "Credit-NoSQL": {
            "account": "ABCD",
            "date": "2021-08-01",
            "type": "CARD",
            "amount": "100"
        }
    },
    {
        "Credit-NoSQL": {
            "account": "EFGHI",
            "date": "2021-08-02",
            "type": "CARD",
            "amount": "150",
            "returneddate": "2021-08-04"
        }
    }
]
So I am not sure if it is the way my JSON file is set up, with its list and such, or if I am missing something in my Python command. I am new to Python and still learning, so I am at a loss as to what I can do next.
No need to use pandas for this.
import json, csv

with open("CDMP_E2.json") as json_file:
    data = [item['Credit-NoSQL'] for item in json.load(json_file)]

# Get the union of all dictionary keys
fieldnames = set()
for row in data:
    fieldnames.update(row)   # iterating a dict yields its keys

with open("CDMP_Output.csv", "w", newline="") as csv_file:
    cwrite = csv.DictWriter(csv_file, fieldnames=fieldnames)
    cwrite.writeheader()
    cwrite.writerows(data)
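If you would rather stay with pandas, as in the original attempt, a rough sketch (assuming the same file names) is to pull out the inner Credit-NoSQL dicts first and build the DataFrame from those:

import json
import pandas as pd

with open("CDMP_E2.json") as json_file:
    records = [item["Credit-NoSQL"] for item in json.load(json_file)]

df = pd.DataFrame(records)                 # columns: account, date, type, amount, returneddate
df.to_csv("CDMP_Output.csv", index=False)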

Python + CSV : How can I create a new csv and write new columns to that in Python?

I want to make a new CSV from scratch. In that CSV I'll store new cells row-wise. Each cell value will be computed dynamically and each row will be written to the CSV in a loop. Unfortunately, all of the available code for this purpose is for already existing CSVs. Code without using a pandas DataFrame is preferred.
Final CSV should look like this:
If you have data coming in as lists:
import csv

list_1 = ['UOM', 'BelRd(D2)', 'Ulsoor(D2)', 'Chrch(D2)', 'BlrClub(D2)', 'Indrangr(D1)', 'Krmngl(D1', 'KrmnglBkry(D1)']
list_2 = ['PKT', 0, 0, 0, 0, 0, 0, 1]

with open('/path/filename.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(list_1)
    writer.writerow(list_2)
You can create your own CSV file; here I would like to show how you can create a CSV file with headers.
import csv

rowHeaders = ["Title", "Coupon code", "Description", "Image path", "Website link", "offer expire"]

fp = open('groupon_output.csv', 'w')
mycsv = csv.DictWriter(fp, fieldnames=rowHeaders)

# writeheader will write your desired header
mycsv.writeheader()

# you can write multiple values by taking this inside a loop
# you can write row values using DictWriter
title = "testing"
coupon_code = "xx"
description = "nothing much"
image_path = "not given"
current_page_url = "www.google.com"

mycsv.writerow({"Title": title, "Coupon code": coupon_code, "Description": description,
                "Image path": image_path, "Website link": current_page_url,
                "offer expire": "Not Available"})
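As the comments suggest, the writerow call can go inside a loop. A quick sketch of that, reusing the mycsv writer from above (the records list here is made up for illustration):

records = [
    {"Title": "testing", "Coupon code": "xx", "Description": "nothing much",
     "Image path": "not given", "Website link": "www.google.com", "offer expire": "Not Available"},
    # ... more rows, computed dynamically ...
]
for record in records:
    mycsv.writerow(record)   # one CSV row per dict
fp.close()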
import csv

data = [
    ['a', 'b', 'c', 'd'],
    ['b', '1', '2', '3'],
    ['c', '4', '5', '6'],
    ['d', '7', '8', '9'],
]

with open('output.csv', 'w') as output:
    writer = csv.writer(output)
    writer.writerows(data)
This could help you!
import csv

def header():
    # Instead of hard coding like below, pass variables which hold dynamic values
    # This kind of hard coding can help you when headers are fixed
    h1 = ['Date']
    h2 = ['Feed']
    h3 = ['Status']
    h4 = ['Path']
    rows = zip(h1, h2, h3, h4)
    with open('test.txt', 'a') as f:
        wr = csv.writer(f)
        for row in rows:
            wr.writerow(row)
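Since the question asks for rows whose cells are computed dynamically and written inside a loop, here is a minimal sketch of that pattern (the compute_row function, column names and range are made-up placeholders):

import csv

def compute_row(i):
    # made-up computation; replace with whatever produces your cell values
    return [i, i * 2, i ** 2]

with open('dynamic.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['n', 'double', 'square'])    # header row
    for i in range(5):
        writer.writerow(compute_row(i))           # one computed row per iteration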

CSV to Multi level JSON structure

I have the following csv file (1.csv):
"STUB_1","current_week","previous_week","weekly_diff"
"Crude Oil",1184.951,1191.649,-6.698
I need to convert it to the following JSON:
json_body = [
    {
        "measurement": "Crude Oil",
        "fields":
        {
            "weekly_diff": -6.698,
            "current_week": 1184.951,
            "previous_week": 1191.649
        }
    }
]
import pandas as pd

df = pd.read_csv("1.csv")
df = df.rename(columns={'STUB_1': 'measurement'})
j = (df.groupby(['measurement'], as_index=True)
     .apply(lambda x: x[['current_week', 'previous_week', 'weekly_diff']].to_dict('r'))
     .reset_index()
     .rename(columns={0: 'fields'})
     .to_json(orient='records'))
print j
output:
[
    {
        "measurement": "Crude Oil",
        "fields":
        [   # extra bracket
            {
                "weekly_diff": -6.698,
                "current_week": 1184.951,
                "previous_week": 1191.649
            }
        ]   # extra bracket
    }
]
which is almost what I need, but with the extra [ ].
Can anyone tell me what I did wrong? Thank you!
Don't use pandas for this - you would have to do a lot of manual unraveling to turn your table data into a hierarchical structure, so why not skip the middle man and use the built-in csv and json modules to do the task for you, e.g.
import csv
import json
with open("1.csv", "rU") as f: # open your CSV file for reading
    reader = csv.DictReader(f, quoting=csv.QUOTE_NONNUMERIC) # DictReader for convenience
    data = [{"measurement": r.pop("STUB_1", None), "fields": r} for r in reader] # convert!
data_json = json.dumps(data, indent=4) # finally, serialize the data to JSON
print(data_json)
and you get:
[
    {
        "measurement": "Crude Oil",
        "fields": {
            "current_week": 1184.951,
            "previous_week": 1191.649,
            "weekly_diff": -6.698
        }
    }
]
However, keep in mind that if you have multiple entries with the same STUB_1 value only the latest will be kept - otherwise you'd have to store your fields as a list which will bring you to your original problem with the data.
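If duplicate STUB_1 values do need to be kept, a rough sketch of that list-based variant, grouping rows under each measurement (which reintroduces the inner array the question wanted to avoid):

import csv
import json
from collections import defaultdict

groups = defaultdict(list)
with open("1.csv", "rU") as f:
    reader = csv.DictReader(f, quoting=csv.QUOTE_NONNUMERIC)
    for r in reader:
        groups[r.pop("STUB_1", None)].append(r)   # all rows sharing a STUB_1 value

data = [{"measurement": key, "fields": rows} for key, rows in groups.items()]
print(json.dumps(data, indent=4))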
A quick note on how it does what it does - first we create a csv.DictReader - it's a convenience reader that will map each row's entry with the header fields. It also uses quoting=csv.QUOTE_NONNUMERIC to ensure automatic conversion to floats for all non-quoted fields in your CSV. Then, in the list comprehension, it essentially reads row by row from the reader and creates a new dict for each row - the measurement key contains the STUB_1 entry (which gets immediately removed with dict.pop()) and fields contains the remaining entries in the row. Finally, the json module is used to serialize this list into a JSON that you want.
Also, keep in mind that JSON (and Python <3.5) doesn't guarantee the order of elements so your measurement entry might appear after the fields entry and same goes for the sub-entries of fields. Order shouldn't matter anyway (except for a few very specific cases) but if you want to control it you can use collections.OrderedDict to build your inner dictionaries in the order you prefer to look at once serialized to JSON.
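If the key order matters in the serialized output, a minimal sketch using collections.OrderedDict for each row object, reusing the reader from the snippet above (plain dicts also preserve insertion order from Python 3.7 onward):

from collections import OrderedDict

data = [
    OrderedDict([
        ("measurement", r.pop("STUB_1", None)),   # measurement first
        ("fields", r),                            # then the remaining columns
    ])
    for r in reader
]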

Python csv writing

I'm trying to write to a CSV file that I then open in Excel.
import csv

with open('Daily.csv', 'w') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow(["Sales Order;Company Name;Ship Date"])
This results in the following text being put into A1 and only A1.
Sales Order;Company Name;Ship Date
It appears the delimiter isn't working at all. I would like the data to be across three columns, not just one.
Use writer.writerow(["Sales Order", "Company Name", "Ship Date"]), passing each column as a separate list element.
The delimiter parameter of csv.writer doesn't split the strings you pass to writerow; it sets the character the writer places between fields in the file. Because the original code passes a single string as the only element of the list, the whole string ends up in one field. csv.writer.writerow needs a list of separate values, so if you use the solution by Oleg you'll get a file with values delimited by ";".
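A small sketch of the corrected call, keeping the ";" delimiter from the question and passing each column as its own list element:

import csv

with open('Daily.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow(["Sales Order", "Company Name", "Ship Date"])   # three separate fields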
