Json file to csv in Python

Json file to csv in Python - python

If I use this script to convert from Json to Csv in Python:
import json
import csv
with open("data.json") as file:
data = json.loads(file)
with open("data.csv", "w") as file:
csv_file = csv.writer(file)
for item in data:
csv_file.writerow([item['studio'], item['title']] + item['release_dates'].values())
It throws an error message:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "C:\Python27\lib\json\__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

You're using json.loads() when you should be using json.load().
json.loads is for reading from a string, whereas json.load is for reading from a file.
Visit here for more information: https://docs.python.org/2/library/json.html.
Also, unrelated, but you can chain with statements.
with open("data.json") as json_file, open("data.csv", "w") as csv_file:
csv_file = csv.writer(csv_file)
for item in json.load(json_file):
csv_file.writerow(...)

Related

`r+`, `a+`, `w+` for doing both reading and writing file in python

I have a file named dict_file.json in current folder with empty dict content {}
I want to open it such that I can do both read and modify the content. That is I will read the json file as dict, modify that dict and write it back to json file.
I tried r and r+ below:
import json
f = open('dict_file.json', 'r') # same output for r+
json.loads(list_file.read())
This prints
{}
When I tried w+:
f = open('dict_file.json', 'r')
this first clears the file. Then,
json.loads(list_file.read())
gives error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "F:\ProgramFiles\Python37\lib\json\__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "F:\ProgramFiles\Python37\lib\json\__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "F:\ProgramFiles\Python37\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "F:\ProgramFiles\Python37\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
When I tried a+:
f = open('dict_file.json', 'a+')
json.loads(list_file.read())
gives error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "F:\ProgramFiles\Python37\lib\json\__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "F:\ProgramFiles\Python37\lib\json\__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "F:\ProgramFiles\Python37\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "F:\ProgramFiles\Python37\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Though, it does not clear the file.
So, I guess I should be using r+ for my usecase scenario. Also I tried reading writing both :
f = open('dict_file.json', 'r+')
a = json.loads(list_file.read())
a = {'key':'value'} # this involves complex logic instead of plain assignment
json.dump(a,f)
f.flush()
Now when I open the file, its contents are weird:
{}{'key':'value'}
whereas I want it to be:
{'key':'value'}
So I have few questions:
Q1. Why w+ and a+ gives error?
Q2. Why file contained {}{'key':'value'} instead of {'key':'value'}?
Q3. How to correctly do reading and writing (with or without with block)?
PS: I am reading file, then running some loop which computes new dict and write it to file. Then loop sleeps for some time and repeats the same. Thats why I felt flush() will be correct here. That is I open file once outside loop and only flush inside the loop. No need to open file in each iteration.

Why w+ and a+ gives error?
Because the file reader is pointed at the end of the file, where there's no json object to read
its contents are weird
Seems like you're not resetting the file to an actual valid JSON object throughout your tests, and so you've managed to clear the file, read nothing, maybe write some other object, clear it again, then append one or two objects to it, etc etc.
Keep it simple. Start over
filename = "data.json"
with open(filename) as f:
data = json.load(f)
data["foo"] = "bar"
with open(filename, "w") as f:
json.dump(data, f)

You have to think in terms of what the operations do to a file when they open it:
r and r+ open a file for reading. This means that the contents of the file is intact, and the file pointer is the beginning of the file. The entire file is visible to the reader. After reading, the file pointer will be at the end, meaning that you're effectively appending at that point.
w and w+ open a file for writing. That means that the contents of the file is truncated. There is nothing to be read in, since the original contents is destroyed.
a and a+ open the file for appending. The contents of the file are unchanged. However, the file pointer is at the end, so a reader will see no data and raise an error.
The correct way to do this is to open the file twice, and use with blocks to do it:
with open('dict_file.json', 'r') as f:
a = json.load(f)
# you don't need the file to be open here
a = {'key': 'value'}
with open('dict_file.json', 'w') as f:
json.dump(a, f)
You might be tempted to try to open the file in read-write mode (r+), read it, then rewind it, so the pointer returns to the beginning, and then overwrite. There are two reasons not to do this:
Not every stream is rewindable
If the contents of the file decreases in size, you will end up with trash at the end. You would need to truncate to the current pointer after writing, which is an unnecessary layer of complexity

Python 3 JSON writing to CSV Error

I am trying to write out a csv file from data in JSON format. I can get the fieldnames to write to the csv file but not the item value I need. This is my first time coding in python so any help would be appreciated. The json file can be found below for reference:
https://data.ny.gov/api/views/nqur-w4p7/rows.json?accessType=DOWNLOAD
Here is my error:
Traceback (most recent call last):
File "ChangeDataType.py", line 5, in <module>
data = json.dumps(inputFile)
File "/usr/lib64/python3.4/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib64/python3.4/json/encoder.py", line 192, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib64/python3.4/json/encoder.py", line 250, in iterencode
return _iterencode(o, 0)
File "/usr/lib64/python3.4/json/encoder.py", line 173, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <_io.TextIOWrapper name='rows.json?accessType=DOWNLOAD' mode='r' encoding='UTF-8'> is not JSON serializable
Here is my code:
import json
import csv
inputFile = open("rows.json?accessType=DOWNLOAD", "r")
data = json.dumps(inputFile)
with open("Data.csv","w") as csvfile:
writer = csv.DictWriter(csvfile, extrasaction='ignore', fieldnames=["date", "new_york_state_average_gal", "albany_average_gal", "binghamton_average_gal",\
"buffalo_average_gal", "nassau_average_gal", "new_york_city_average_gal", "rochester_average_gal", "syracuse_average_gal","utica_average_gal"])
writer.writeheader()
for row in data:
writer.writerow([row["date"], row["new_york_state_average_gal"], row["albany_average_gal"], row["binghamton_average_gal"],\
row["buffalo_average_gal"], row["nassau_average_gal"], row["new_york_city_average_gal"], row["rochester_average_gal"], row["syracuse\
_average_gal"],row["utica_average_gal"]])

If you want to read a JSON file you should use json.load instead of json.dumps:
data = json.load(inputFile)

Seems you're still having problems even opening the file.
Python json to CSV
You were told to use json.load
dumps takes an object to a string. You want to read JSON to a dictionary.
You therefore need to load the JSON file, and you can open two files at once
with open("Data.csv","w") as csvfile, open("rows.json?accessType=DOWNLOAD") as inputfile:
data = json.load(inputfile)
writer = csv.DictWriter(csvfile,...
Also, for example, considering the JSON data looks like "fieldName" : "syracuse_average_gal", and that is the only occurrence of the Syracuse average value, row["syracuse_average_gal"] is not correct.
Carefully inspect your JSON and figure out to parse it from the very top bracket

python load json data error

I want to load this JSON data in python :
JSON data:
{"status":"success","result":{"users":{"5543295":{"status":{"payment_verified":false,"identity_verified":false,"email_verified":true,"deposit_made":false,"phone_verified":false,"facebook_connected":false,"profile_complete":true},"avatar_large":"/img/unknown.png","avatar_cdn":"//cdn6.f-cdn.com/img/unknown.png","spam_profile":null,"search_languages":null,"corporate_users":null,"support_status":null,"last_name":null,"suspended":null,"primary_language":"en","timezone":{"country":"US","offset":-7.0,"id":105,"timezone":"America/Los_Angeles"},"membership_package":null,"qualifications":[{"description":"Foundation vWorker Member","level":null,"icon_url":"/img/insignia/vworker.png","icon_name":"vworker","score_percentage":null,"user_percentile":null,"type":null,"id":null,"name":null}],"id":5543295,"badges":null,"hourly_rate":null,"responsiveness":null,"first_name":null,"display_name":"vw693015vw","tagline":null,"account_balances":null,"public_name":"vw693015vw","role":"employer","location":{"administrative_area":null,"city":null,"country":{"highres_flag_url":"/img/flags/highres_png/unknown.png","code":null,"name":"","flag_url_cdn":"//cdn5.f-cdn.com/img/flags/png/unknown.png","highres_flag_url_cdn":"//cdn5.f-cdn.com/img/flags/highres_png/unknown.png","flag_url":"/img/flags/png/unknown.png"},"vicinity":null,"longitude":null,"full_address":null,"latitude":null},"closed":false,"email":null,"username":"vw693015vw","is_local":null,"endorsements":null,"jobs":[],"employer_reputation":{"user_id":5543295,"last3months":{"completion_rate":null,"all":null,"incomplete_reviews":null,"complete":null,"on_time":null,"on_budget":null,"positive":0.0,"overall":0.0,"reviews":null,"category_ratings":{"communication":0.0,"clarity_spec":0.0,"payment_prom":0.0,"work_for_again":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":null},"entire_history":{"completion_rate":null,"all":null,"incomplete_reviews":null,"complete":null,"on_time":null,"on_budget":null,"positive":0.0,"overall":0.0,"reviews":null,"category_ratings":{"communication":0.0,"clarity_spec":0.0,"payment_prom":0.0,"work_for_again":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":null},"earnings_score":0.0,"role":"employer","job_history":{"job_counts":[],"count_other":0},"last12months":{"completion_rate":null,"all":null,"incomplete_reviews":null,"complete":null,"on_time":null,"on_budget":null,"positive":0.0,"overall":0.0,"reviews":null,"category_ratings":{"communication":0.0,"clarity_spec":0.0,"payment_prom":0.0,"work_for_again":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":null},"project_stats":{"draft":0,"open":0,"complete":0,"pending":0,"work_in_progress":0}},"company":"","registration_date":1041968572,"is_active":null,"avatar_large_cdn":"//cdn6.f-cdn.com/img/unknown.png","profile_description":"","address":null,"limited_account":false,"portfolio_count":0,"preferred_freelancer":false,"true_location":null,"primary_currency":{"code":"USD","name":"US Dollar","country":"US","sign":"$","exchange_rate":1.0,"id":1},"mobile_tracking":null,"test_user":false,"chosen_role":"both","reputation":{"user_id":5543295,"last3months":{"completion_rate":0.0,"all":0,"incomplete_reviews":0,"complete":0,"on_time":0.0,"on_budget":0.0,"positive":0.0,"overall":0.0,"reviews":0,"category_ratings":{"communication":0.0,"expertise":0.0,"hire_again":0.0,"quality":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":0},"entire_history":{"completion_rate":0.0,"all":0,"incomplete_reviews":0,"complete":0,"on_time":0.0,"on_budget":0.0,"positive":0.0,"overall":0.0,"reviews":0,"category_ratings":{"communication":0.0,"expertise":0.0,"hire_again":0.0,"quality":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":0},"earnings_score":0.0,"role":"freelancer","job_history":{"job_counts":[],"count_other":0},"last12months":{"completion_rate":0.0,"all":0,"incomplete_reviews":0,"complete":0,"on_time":0.0,"on_budget":0.0,"positive":0.0,"overall":0.0,"reviews":0,"category_ratings":{"communication":0.0,"expertise":0.0,"hire_{"status":"success","result":{"users":{"5543296":{"status":{"payment_verified":false,"identity_verified":false,"email_verified":true,"deposit_made":false,"phone_verified":false,"facebook_connected":false,"profile_complete":true},"avatar_large":"/img/unknown.png","avatar_cdn":"//cdn6.f-cdn.com/img/unknown.png","spam_profile":null,"search_languages":null,"corporate_users":null,"support_status":null,"last_name":null,"suspended":null,"primary_language":"en","timezone":{"country":"IN","offset":5.5,"id":164,"timezone":"Asia/Calcutta"},"membership_package":null,"qualifications":[{"description":"Foundation vWorker Member","level":null,"icon_url":"/img/insignia/vworker.png","icon_name":"vworker","score_percentage":null,"user_percentile":null,"type":null,"id":null,"name":null}],"id":5543296,"badges":null,"hourly_rate":null,"responsiveness":null,"first_name":null,"display_name":"vw1619364vw","tagline":null,"account_balances":null,"public_name":"vw1619364vw","role":"employer","location":{"administrative_area":null,"city":null,"country":{"highres_flag_url":"/img/flags/highres_png/unknown.png","code":null,"name":"","flag_url_cdn":"//cdn5.f-cdn.com/img/flags/png/unknown.png","highres_flag_url_cdn":"//cdn5.f-cdn.com/img/flags/highres_png/unknown.png","flag_url":"/img/flags/png/unknown.png"},"vicinity":null,"longitude":null,"full_address":null,"latitude":null},"closed":false,"email":null,"username":"vw1619364vw","is_local":null,"endorsements":null,"jobs":[],"employer_reputation":{"user_id":5543296,"last3months":{"completion_rate":null,"all":null,"incomplete_reviews":null,"complete":null,"on_time":null,"on_budget":null,"positive":0.0,"overall":0.0,"reviews":null,"category_ratings":{"communication":0.0,"clarity_spec":0.0,"payment_prom":0.0,"work_for_again":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":null},"entire_history":{"completion_rate":null,"all":null,"incomplete_reviews":null,"complete":null,"on_time":null,"on_budget":null,"positive":0.0,"overall":0.0,"reviews":null,"category_ratings":{"communication":0.0,"clarity_spec":0.0,"payment_prom":0.0,"work_for_again":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":null},"earnings_score":0.0,"role":"employer","job_history":{"job_counts":[],"count_other":0},"last12months":{"completion_rate":null,"all":null,"incomplete_reviews":null,"complete":null,"on_time":null,"on_budget":null,"positive":0.0,"overall":0.0,"reviews":null,"category_ratings":{"communication":0.0,"clarity_spec":0.0,"payment_prom":0.0,"work_for_again":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":null},"project_stats":{"draft":0,"open":0,"complete":0,"pending":0,"work_in_progress":0}},"company":"","registration_date":1124680669,"is_active":null,"avatar_large_cdn":"//cdn6.f-cdn.com/img/unknown.png","profile_description":"","address":null,"limited_account":false,"portfolio_count":0,"preferred_freelancer":false,"true_location":null,"primary_currency":{"code":"USD","name":"US Dollar","country":"US","sign":"$","exchange_rate":1.0,"id":1},"mobile_tracking":null,"test_user":false,"chosen_role":"both","reputation":{"user_id":5543296,"last3months":{"completion_rate":0.0,"all":0,"incomplete_reviews":0,"complete":0,"on_time":0.0,"on_budget":0.0,"positive":0.0,"overall":0.0,"reviews":0,"category_ratings":{"communication":0.0,"expertise":0.0,"hire_again":0.0,"quality":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":0},"entire_history":{"completion_rate":0.0,"all":0,"incomplete_reviews":0,"complete":0,"on_time":0.0,"on_budget":0.0,"positive":0.0,"overall":0.0,"reviews":0,"category_ratings":{"communication":0.0,"expertise":0.0,"hire_again":0.0,"quality":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":0},"earnings_score":0.0,"role":"freelancer","job_history":{"job_counts":[],"count_other":0},"last12months":{"completion_rate":0.0,"all":0,"incomplete_reviews":0,"complete":0,"on_time":0.0,"on_budget":0.0,"positive":0.0,"overall":0.0,"reviews":0,"category_ratings":{"communication":0.0,"expertise":0.0,"hire_agaiagain":0.0,"quality":0.0,"professionalism":0.0},"earnings":null,"rehire_rate":null,"incomplete":0},"project_stats":null},"avatar":"/img/unknown.png","cover_image":{"current_image":{"url":"//cdn2.f-cdn.com/static/img/profiles/cover-product.jpg","width":1920,"height":550,"id":null,"description":""},"past_images":null},"corporate":null,"force_verify":null}}},"request_id":"2520481ca4d6b8130bdc20f6114d8037"}
My Python code :
import json
json_data = open('test.txt', 'r').read()
data = json.loads(json_data)
I'm sure JSON data is valid, because is result of one website, I need to load this in python and get username value.
Actual error :
MacBook-Pro-di-admin:freelancer11 admin$ python load.py
Traceback (most recent call last):
File "load.py", line 6, in <module>
data = json.loads(json_data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting : delimiter: line 1 column 4099 (char 4098)

So if you head over to your favorite JSON validator, like JSONlint, you will see the JSON is not valid, and that is precisely what the error in python means.

Read compressed stdin

I would like to have such call:
pv -ptebar compressed.csv.gz | python my_script.py
Inside my_script.py I would like to decompress compressed.csv.gz and parse it using Python csv parser. I would expect something like this:
import csv
import gzip
import sys
with gzip.open(fileobj=sys.stdin, mode='rt') as f:
reader = csv.reader(f)
print(next(reader))
print(next(reader))
print(next(reader))
Of course it doesn't work because gzip.open doesn't have fileobj argument. Could you provide some working example solving this issue?
UPDATE
Traceback (most recent call last):
File "my_script.py", line 8, in <module>
print(next(reader))
File "/usr/lib/python3.5/gzip.py", line 287, in read1
return self._buffer.read1(size)
File "/usr/lib/python3.5/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 404, in _read_gzip_header
magic = self._fp.read(2)
File "/usr/lib/python3.5/gzip.py", line 91, in read
self.file.read(size-self._length+read)
File "/usr/lib/python3.5/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
The traceback above appeared after applying #Rawing advice.

In python 3.3+, you can pass a file object to gzip.open:
The filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.
So your code should work if you just omit the fileobj=:
with gzip.open(sys.stdin, mode='rt') as f:
Or, a slightly more efficient solution:
with gzip.open(sys.stdin.buffer, mode='rb') as f:
If for some odd reason you're using a python older than 3.3, you can directly invoke the gzip.GzipFile constructor. However, these old versions of the gzip module didn't have support for files opened in text mode, so we'll use sys.stdin's underlying buffer instead:
with gzip.GzipFile(fileobj=sys.stdin.buffer) as f:

Using gzip.open(sys.stdin.buffer, 'rt') fixes issue for Python 3.

Reading JSON file with Python 3

I'm using Python 3.5.2 on Windows 10 x64. The JSON file I'm reading is this which is a JSON array containing 2 more arrays.
I'm trying to parse this JSON file using the json module. As described in the docs the JSON file must be compliant to RFC 7159. I checked my file here and it tells me it's perfectly fine with the RFC 7159 format, but when trying to read it using this simple python code:
with open(absolute_json_file_path, encoding='utf-8-sig') as json_file:
text = json_file.read()
json_data = json.load(json_file)
print(json_data)
I'm getting this exception:
Traceback (most recent call last):
File "C:\Program Files (x86)\JetBrains\PyCharm 4.0.5\helpers\pydev\pydevd.py", line 2217, in <module>
globals = debugger.run(setup['file'], None, None)
File "C:\Program Files (x86)\JetBrains\PyCharm 4.0.5\helpers\pydev\pydevd.py", line 1643, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files (x86)\JetBrains\PyCharm 4.0.5\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/Andres Torti/Git-Repos/MCF/Sur3D.App/shapes-json-checker.py", line 14, in <module>
json_data = json.load(json_file)
File "C:\Users\Andres Torti\AppData\Local\Programs\Python\Python35-32\lib\json\__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "C:\Users\Andres Torti\AppData\Local\Programs\Python\Python35-32\lib\json\__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "C:\Users\Andres Torti\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Users\Andres Torti\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I can read this exact file perfectly fine on Javascript but I can't get Python to parse it. Is there anything wrong with my file or is any problem with the Python parser?

Try this
import json
with open('filename.txt', 'r') as f:
array = json.load(f)
print (array)

Based on reading over the documentation again, it appears you need to either change the third line to
json_data = json.loads(text)
or remove the line
text = json_file.read()
since read() causes the file's index to reach the end of the file. (I suppose, alternatively, you can reset the index of the file).

For python3, only the below worked for me - none of the above.
import json
with open('/emp.json', 'r') as f:
data=f.read()
print(data)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Json file to csv in Python - python

Related

`r+`, `a+`, `w+` for doing both reading and writing file in python

Python 3 JSON writing to CSV Error

python load json data error

Read compressed stdin

Reading JSON file with Python 3

Categories

Resources