I took a Python beginners' course last year. Now I am trying to write a CSV to JSON converter. I have searched for quite some time and adapted some of the code I found until the output looked similar to what I want. I am using Python 3.4.2.
@kvorobiev this is only an excerpt of my CSV, but it will do for this case. The first conversion works. From the second conversion on, you will see that the order of the headings changes within the JSON file.
The CSV file looks like this:
Document;Item;Category
4;10;C
What I am getting in the output file as of now (after applying the changes from kvorobiev):
[
    {
        "Item": "10",
        "Category": "C",
        "Document": "4"
    };
]
The json string I want to get in the output file should look like:
[
    {
        "Document": "4",
        "Item": "10",
        "Category": "C"
    },
]
You will notice the headings are in the wrong order.
Here is the code:
import json
import csv

csvfile = open('file1.csv', 'r')
jsonfile = open('file1.csv'.replace('.csv','.json'), 'w')
jsonfile.write('[' + '\n' + ' ')
fieldnames = csvfile.readline().replace('\n','').split(';')
num_lines = sum(1 for line in open('file.csv')) -1
reader = csv.DictReader(csvfile, fieldnames)
i = 0
for row in reader:
    i += 1
    json.dump(row, jsonfile, indent=4,sort_keys=False)
    if i < num_lines:
        jsonfile.write(',')
    jsonfile.write('\n')
jsonfile.write(' ' + ']')
print('Done')
Thanks for helping.
Replace the line
reader = csv.DictReader(csvfile, fieldnames)
with
reader = csv.DictReader(csvfile, fieldnames, delimiter=';')
Also, you open file1.csv but later count the lines of a different file, file.csv:
num_lines = sum(1 for line in open('file.csv')) -1
Your solution could be reduced to
import json
import csv

csvfile = open('file1.csv', 'r')
jsonfile = open('file1.csv'.replace('.csv','.json'), 'w')
jsonfile.write('[\n')
fieldnames = csvfile.readline().replace('\n','').split(';')
reader = csv.DictReader(csvfile, fieldnames, delimiter=';')
for i, row in enumerate(reader):
    if i > 0:
        jsonfile.write(',\n')  # comma between objects, none after the last one
    json.dump(row, jsonfile, indent=4)
jsonfile.write('\n]')
csvfile.close()
jsonfile.close()
If you want to preserve the column order from the CSV, you could use
from collections import OrderedDict
...
for i, row in enumerate(reader):
    if i > 0:
        jsonfile.write(',\n')
    # rebuild each row in the order given by the header line
    json.dump(OrderedDict([(f, row[f]) for f in fieldnames]), jsonfile, indent=4)
jsonfile.write('\n]')
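On Python 3.7 and later, plain dicts keep insertion order and csv.DictReader fills each row in header order, so the column order may survive without OrderedDict. This does not apply to the Python 3.4.2 used in the question. The following is only a minimal sketch under that assumption; the inline sample data and the names `csv_text`, `rows` and `json_text` are illustrative and not part of the original post.

```python
import csv
import io
import json

# Illustrative sample data with the same shape as the excerpt in the question.
csv_text = "Document;Item;Category\n4;10;C\n5;20;D\n"

# DictReader reads the header row itself when fieldnames is omitted.
reader = csv.DictReader(io.StringIO(csv_text), delimiter=';')
rows = [dict(row) for row in reader]

# Dumping the whole list in one call avoids hand-written brackets and commas.
json_text = json.dumps(rows, indent=4)
print(json_text)
```

Building the list first and dumping it once also removes the need to count lines in order to decide where the commas go.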
Related
Python - 3.8
I am trying to write a CSV file from a list of dictionaries. I followed the example on the official website. The writer is able to write the headers, but it does not write the rows from the dictionaries that I loop over in the list.
csv_path = "/home/tmp/file.csv"
with open(csv_path, 'w') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=["RequestTimestamp", "MerchantId", "MerchantName", "ChannelName",
                                                 "MerchantUSN", "BatchNumber", "UniqueId", "TransactionId", "InvoiceId",
                                                 "ReferenceTimestamp", "CustomerName", "AccountHolder", "PaymentType",
                                                 "Currency", "Debit", "Credit", "SettledAmount", "Result", "bankResult", "ReconStatus"])
    writer.writeheader()
    for data in recons:
        result_dict = utils.data_formatter_handler(data)
        writer.writerow(data)
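For comparison, here is a minimal, self-contained sketch of writing a list of dictionaries with csv.DictWriter. The sample records, the shortened field list and the `format_record` helper are hypothetical stand-ins for `recons` and `utils.data_formatter_handler`, whose real behavior is not shown in the question. Note that the sketch passes the formatted dictionary, not the raw item, to writerow.

```python
import csv
import io

fieldnames = ["MerchantId", "Currency", "Debit"]

# Hypothetical input records; the real `recons` list is not shown in the question.
recons = [
    {"merchant": "M1", "currency": "EUR", "debit": "10.00"},
    {"merchant": "M2", "currency": "USD", "debit": "7.50"},
]

def format_record(record):
    """Hypothetical formatter: map raw keys onto the CSV column names."""
    return {
        "MerchantId": record["merchant"],
        "Currency": record["currency"],
        "Debit": record["debit"],
    }

# io.StringIO stands in for a real file opened with open(path, 'w', newline='').
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
for record in recons:
    writer.writerow(format_record(record))  # write the formatted dict

csv_output = buffer.getvalue()
print(csv_output)
```

If only the header appears in the real file, it may also be worth checking that `recons` is not empty at the point where the loop runs.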
Currently, I'm trying to import a json file created by the following python script from a csv file.
import csv, json

csvFilePath ='USvideos.csv'
jsonFilePath = 'USvideos.json'
data = {}
with open(csvFilePath, encoding = 'utf8') as csvFile:
    csvReader = csv.DictReader(csvFile)
    for csvRow in csvReader:
        video_id = csvRow['video_id']
        data[video_id] = csvRow
with open(jsonFilePath, 'w') as jsonFile:
    jsonFile.write(json.dumps(data, indent=4))
Problem statement
The problem is that I need to get a JSON file without the parts in parentheses, by modifying the Python script that produces it:
("2kyS6SvSYSE": ) {
    "video_id": "2kyS6SvSYSE",
    "trending_date": "17.20.11",
    "title": "WE WANT TO TALK ABOUT OUR MARRIAGE"
},
("1ZAPwfrtAFY":) {
    "video_id": "1ZAPwfrtAFY",
    "trending_date": "17.20.11"
}
Purpose of solving it
I need to solve this because I want to import data appropriately in MongoDB
Guessing as to the output JSON format you need but can you give this a try?
import csv, json

csvFilePath ='USvideos.csv'
jsonFilePath = 'USvideos.json'
data = []
with open(csvFilePath, encoding = 'utf8') as csvFile:
    csvReader = csv.DictReader(csvFile)
    for csvRow in csvReader:
        data.append(csvRow)
with open(jsonFilePath, 'w') as jsonFile:
    jsonFile.write(json.dumps(data, indent=4))
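If the keyed dictionary from the original script already exists, its values can also be turned into a list directly. A small sketch, using two shortened records from the question as illustrative data (the names `records` and `json_text` are not from the original post):

```python
import json

# Illustrative data shaped like the dictionary built by the original script.
data = {
    "2kyS6SvSYSE": {"video_id": "2kyS6SvSYSE", "trending_date": "17.20.11"},
    "1ZAPwfrtAFY": {"video_id": "1ZAPwfrtAFY", "trending_date": "17.20.11"},
}

# Dropping the outer keys leaves a plain list of records.
records = list(data.values())
json_text = json.dumps(records, indent=4)
print(json_text)
```

Each record still carries its own video_id field, so no information is lost by dropping the outer keys.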
I have an Excel file in which the data is saved in CSV format as shown below, all under column A. (The CSV file is generated by LabVIEW software code that I wrote.) I have also attached an image of the CSV file for reference at the end of my question.
RPM,Load Current,Battery Output,Power Capacity
1200,30,12,37
1600,88,18,55
I want to create a JSON file in this format:
{
    "power_capacity_data" :
    {
        "rpm" : ["1200","1600"],
        "load_curr" : ["30","88"],
        "batt_output" : ["12","18"],
        "power_cap" : ["37","55"]
    }
}
This is my code
import csv
import json

def main():
    #created a dictionary so that i can append data to it afterwards
    power_data = {"rpm":[],"load_curr":[],"batt_output":[],"power_cap":[]}
    with open('power1.lvm') as f:
        reader = csv.reader(f)
        #trying to append the data of column "RPM" to dictionary
        rowcount = 0
        for row in reader:
            if rowcount == 0:
                #trying to skip the first row
                rowcount = rowcount + 1
            else:
                power_data['rpm'].append(row[0])
                print(row)
    json_report = {}
    json_report['pwr_capacity_data'] = power_data
    with open('LVMJSON', "w") as f1:
        f1.write(json.dumps(json_report, sort_keys=False, indent=4, separators=(',', ': '),encoding="utf-8",ensure_ascii=False))
        f1.close()

if __name__ == "__main__":
    main()
The output JSON file that I am getting is this (please ignore the print(row) statement in my code):
{
    "pwr_capacity_data":
    {
        "load_curr": [],
        "rpm": [
            "1200,30,12.62,37.88",
            "1600,88,18.62,55.88"
        ],
        "batt_output": [],
        "power_cap": []
    }
}
The whole row is getting saved in the list, but I just want the values under the column RPM to be saved. Can someone help me find out what I may be doing wrong? Thanks in advance. I have attached an image of the CSV file in case it helps.
You could use Python's defaultdict to make it a bit easier. Also a dictionary to map all your header values.
from collections import defaultdict
import csv
import json

power_data = defaultdict(list)
header_mappings = {
    'RPM' : 'rpm',
    'Load Current' : 'load_curr',
    'Battery Output' : 'batt_output',
    'Power Capacity' : 'power_cap'}
with open('power1.lvm', newline='') as f_input:
    csv_input = csv.DictReader(f_input)
    for row in csv_input:
        for key, value in row.items():
            power_data[header_mappings[key]].append(value)
with open('LVMJSON.json', 'w') as f_output:
    json.dump({'power_capacity_data' : power_data}, f_output, indent=2)
Giving you an output JSON file looking like:
{
  "power_capacity_data": {
    "batt_output": [
      "12",
      "18"
    ],
    "power_cap": [
      "37",
      "55"
    ],
    "load_curr": [
      "30",
      "88"
    ],
    "rpm": [
      "1200",
      "1600"
    ]
  }
}
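Another way to turn rows into per-column lists is to transpose them with zip. This is only a sketch: it assumes the columns always appear in the fixed order shown in the question and that there is at least one data row, and the names `csv_text`, `output_keys` and `columns` are illustrative.

```python
import csv
import io
import json

# Illustrative data matching the CSV excerpt in the question.
csv_text = (
    "RPM,Load Current,Battery Output,Power Capacity\n"
    "1200,30,12,37\n"
    "1600,88,18,55\n"
)
# Assumed output keys, listed in the same order as the CSV columns.
output_keys = ["rpm", "load_curr", "batt_output", "power_cap"]

reader = csv.reader(io.StringIO(csv_text))
next(reader)                  # skip the header row
columns = list(zip(*reader))  # transpose the data rows into columns

power_data = {key: list(col) for key, col in zip(output_keys, columns)}
json_text = json.dumps({"power_capacity_data": power_data}, indent=2)
print(json_text)
```

Unlike the header mapping above, this version relies on column position rather than on header names, so it would silently mislabel data if the column order changed.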
I have some code like this:
import json
param1 = "xxxxxx"
param2 = "anaaaahhhhhhhhhj"
param3 = "333333333"
with open('data/'+param1+'.json','a') as f:
    data = param2,
    json.dump(data, f, sort_keys=True, indent=1,ensure_ascii=False)
When I execute this, the output looks like this, which is not a real dictionary:
[
"anaaaahhhhhhhhhj"
][
"anaaaahhhhhhhhhj"
]
I want to change
[
"anaaaahhhhhhhhhj"
][
"anaaaahhhhhhhhhj"
]
to
[
"anaaaahhhhhhhhhj",
"blablablabal"
]
Can anyone help me?
PS: I am new to Python.
Assuming you're dealing with a list and trying to append to it each time you run the script: Load the list first, perform an append operation, and dump it back out.
This assumes you have an existing file named "mydata.json" with [] in it.
First method (more understandable):
import json

fileName = 'mydata.json'
# Load the list first
with open(fileName, 'r') as f:
    data = json.load(f)
# Perform an append operation
data.append('New Value')
# Dump it back out
with open(fileName, 'w') as f:
    json.dump(data, f, sort_keys=True, indent=1, ensure_ascii=False)
Second method (using seek):
import json

fileName = 'mydata.json'
with open(fileName, 'r+') as f:
    # Load the list first
    data = json.load(f)
    # Perform an append operation
    data.append('New Value')
    # Dump it back out
    f.seek(0)
    json.dump(data, f, sort_keys=True, indent=1, ensure_ascii=False)
    f.truncate()
Running either of these programs 3 times will result in "mydata.json" containing:
[
"New Value",
"New Value",
"New Value"
]
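Both methods assume the file already exists and contains a list. A possible variation, sketched here with a hypothetical helper named `append_to_json_list`, starts from an empty list when the file is missing. Treating a missing file as an empty list is an assumption, not something stated in the answer.

```python
import json
import os
import tempfile

def append_to_json_list(path, value):
    """Append value to the JSON list stored at path, creating the list if needed."""
    try:
        with open(path, 'r') as f:
            data = json.load(f)
    except FileNotFoundError:
        data = []  # assumption: a missing file means "start with an empty list"
    data.append(value)
    with open(path, 'w') as f:
        json.dump(data, f, indent=1, ensure_ascii=False)
    return data

# Usage with a throwaway file path (hypothetical name).
demo_path = os.path.join(tempfile.mkdtemp(), 'mydata.json')
append_to_json_list(demo_path, 'anaaaahhhhhhhhhj')
result = append_to_json_list(demo_path, 'blablablabal')
print(result)
```

The file is rewritten in 'w' mode on every call, which avoids the back-to-back arrays that appending in 'a' mode produces.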
How can I get a nested dictionary, where both the keys and the subkeys are precisely in the same order as in the csv file?
I tried
import csv
from collections import OrderedDict

filename = "test.csv"
aDict = OrderedDict()
with open(filename, 'r') as f:
    csvReader = csv.DictReader(f)
    for row in csvReader:
        key = row.pop("key")
        aDict[key] = row
where test.csv looks like
key,number,letter
eins,1,a
zwei,2,b
drei,3,c
But the sub-dictionaries are not ordered (the columns letter and number are swapped). So how can I populate aDict[key] in an ordered manner?
You have to build the dictionaries and sub-dictionaries yourself from rows returned from csv.reader which are sequences, instead of using csv.DictReader.
Fortunately that's fairly easy:
import csv
from collections import OrderedDict

filename = 'test.csv'
aDict = OrderedDict()
# Text mode with newline='' is what the csv module expects on Python 3
# ('rb' was the Python 2 idiom).
with open(filename, 'r', newline='') as f:
    csvReader = csv.reader(f)
    fields = next(csvReader)  # header row
    for row in csvReader:
        temp = OrderedDict(zip(fields, row))
        key = temp.pop("key")
        aDict[key] = temp

import json # just to create output
print(json.dumps(aDict, indent=4))
Output:
{
    "eins": {
        "number": "1",
        "letter": "a"
    },
    "zwei": {
        "number": "2",
        "letter": "b"
    },
    "drei": {
        "number": "3",
        "letter": "c"
    }
}
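On Python 3.7 and later, where plain dicts keep insertion order and csv.DictReader fills rows in header order, the original DictReader approach may already give the desired ordering. A minimal sketch under that assumption, with the sample file inlined as the illustrative string `csv_text`:

```python
import csv
import io

# Inline copy of the test.csv contents from the question.
csv_text = "key,number,letter\neins,1,a\nzwei,2,b\ndrei,3,c\n"

a_dict = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    row = dict(row)       # plain dict; keeps header order on Python 3.7+
    key = row.pop("key")  # remove the key column, keep the rest in order
    a_dict[key] = row

print(a_dict)
```

On older interpreters this ordering is not guaranteed, which is why the answers here build OrderedDict objects explicitly.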
This is one way:
import csv
from collections import OrderedDict

filename = "test.csv"
aDict = OrderedDict()
with open(filename, 'r') as f:
    order = next(csv.reader(f))[1:]
    f.seek(0)
    csvReader = csv.DictReader(f)
    for row in csvReader:
        key = row.pop("key")
        aDict[key] = OrderedDict((k, row[k]) for k in order)
In Python versions before 3.6, csv.DictReader loads the rows into a regular dict and not an ordered one. There you have to read the CSV manually into an OrderedDict to get the order you need:
from collections import OrderedDict

filename = "test.csv"
dictRows = []
with open(filename, 'r') as f:
    rows = (line.strip().split(',') for line in f)
    # read column names from first row
    columns = next(rows)
    for row in rows:
        dictRows.append(OrderedDict(zip(columns, row)))
You can take advantage of the existing csv.DictReader class, but alter the rows it returns. To do that, add the following class to the beginning of your script (after the imports of csv and OrderedDict):
class OrderedDictReader(csv.DictReader):
    # Python 3 calls the iterator method __next__ (Python 2 used next)
    def __next__(self):
        # Get a row using csv.DictReader
        row = super().__next__()
        # Create a new row using OrderedDict
        new_row = OrderedDict((k, row[k]) for k in self.fieldnames)
        return new_row
Then, use this class in place of csv.DictReader:
csvReader = OrderedDictReader(f)
The rest of your code remains the same.