Read file saved using gzip: why is the file handle a string? - python

I have a list of lists, let's suppose it is:
sentcorpus = [["hello", "how", "are", "you", "?"], ["hello", "I", "'m", "fine"]]
I want to save it in gzip format:
import gzip
import json
with gzip.open('corpus.json.gz', 'wb') as fileh:
    fileh.write(json.dumps(sentcorpus).encode("utf8"))
Then it would be logical to read it back like this:
with gzip.open('wbec_corpus.json.gz', 'rb') as fileh:
    sentcorpus = json.load(fileh.read().decode("utf8"))
But no:
AttributeError: 'str' object has no attribute 'read'
Instead this one works:
with gzip.open('wbec_corpus.json.gz', 'rb') as fileh:
    sentcorpus = json.load(fileh)
Why is fileh a string and not a file handle?

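The file object is not the problem; the error comes from the json library. To understand it, compare json.load and json.loads:
json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserialize fp (a .read()-supporting text file or binary file
containing a JSON document) to a Python object using this conversion
table.
json.loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table.
In short, json.load expects a file object and calls .read() on it itself, whereas json.loads expects the already-read contents: a string, or the result of file.read(). In your code, fileh.read().decode("utf8") is a str, and json.load then tries to call .read() on that str, hence the AttributeError. fileh itself is still a file handle.
So both of the lines below will work: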
sentcorpus = json.loads(fileh.read().decode("utf8"))
sentcorpus = json.load(fileh)
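As a minimal sketch of the round trip described above, the following writes the corpus to a gzip file and reads it back both ways, then reproduces the failing call. The temporary path is an assumption made so the example does not touch real files; the variable names via_load, via_loads and error_message are illustrative only.

```python
import gzip
import json
import os
import tempfile

sentcorpus = [["hello", "how", "are", "you", "?"], ["hello", "I", "'m", "fine"]]

# Hypothetical scratch location so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "corpus.json.gz")

# Write: serialize to str, encode to bytes, write to the binary gzip stream.
with gzip.open(path, "wb") as fileh:
    fileh.write(json.dumps(sentcorpus).encode("utf8"))

# Read, option 1: hand the file object to json.load, which calls .read() itself.
with gzip.open(path, "rb") as fileh:
    via_load = json.load(fileh)

# Read, option 2: read and decode yourself, then parse the str with json.loads.
with gzip.open(path, "rb") as fileh:
    via_loads = json.loads(fileh.read().decode("utf8"))

# The failing variant from the question: json.load receives a str, not a file.
with gzip.open(path, "rb") as fileh:
    try:
        json.load(fileh.read().decode("utf8"))
    except AttributeError as exc:
        error_message = str(exc)
```

Both reading options should give back the original list; the last block only captures the error so that it can be inspected.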

Related

Opening a json file from computer as a dictionary

I wrote the following function that I want to apply to a json file:
import json
def key_replacer(dictionary):
    #doing things
    return new_dictionary
data = """{
#a json file as a dictionary
}"""
info = json.loads(data)
refined = key_replacer(info)
new_data = json.dumps(refined)
print(new_data)
It works fine, but how do I do it when I want to import a file from my computer? json.loads takes a string as input and returns a dictionary as output, and json.dumps takes a dictionary as input and returns a string as output. I tried with:
with open('C:\\Users\\SONY\\Desktop\\test.json', 'r', encoding="utf8") as data:
    info = json.loads(data)
But I get TypeError: the JSON object must be str, bytes or bytearray, not TextIOWrapper.
You are passing a file object instead of a string. To fix that, read the file first: json.loads(data.read())
However, you can load JSON directly from a file using json.load(open('myFile','r')) or, in your case, json.load(data).
loads and dumps work on strings. If you want to work on files, use load and dump instead.
Here is an example:
from json import dump, load
with open('myfile.json','r') as my_file:
    content = load(my_file)
#do stuff on content
with open('myooutput.json','w') as my_output:
    dump(content, my_output)
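The split between the string functions and the file functions can be sketched side by side. This is only an illustration: the record dictionary and the temporary path are made up, and the last block deliberately passes a file object to json.loads to reproduce the TypeError from the question.

```python
import json
import os
import tempfile

record = {"name": "test", "values": [1, 2, 3]}  # hypothetical data

# Strings: dumps returns a str, loads parses a str.
text = json.dumps(record)
from_text = json.loads(text)

# Files: dump writes to an open file, load reads from an open file.
path = os.path.join(tempfile.mkdtemp(), "test.json")  # hypothetical path
with open(path, "w", encoding="utf8") as fh:
    json.dump(record, fh)
with open(path, "r", encoding="utf8") as fh:
    from_file = json.load(fh)

# Passing the file object to loads reproduces the TypeError from the question.
with open(path, "r", encoding="utf8") as fh:
    try:
        json.loads(fh)
    except TypeError as exc:
        error_message = str(exc)
```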

What is the best way to write json objects to a text file and then read them back in?

I am trying to write a service to read twitter feed stream data and then write it to a file. I am writing each JSON structure to a line in the file. With a different service I need to read each line of the file and load the json structure for further operations.
My problem is that I can read the first line then the JSON loader says the rest are not JSON structures. They look fine. Not sure what is going on.
Writing file:
self.output = open(os.path.join(self.outputdir, self.filename), 'w')
self.output.write(status + "\n")
Reading File:
with open(file) as f:
    for line in f:
        line = line.replace("\n","")
        tweet = json.loads(line)
        print tweet['text']
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Example json file:
JSON File
Your file is composed of multiple JSON objects separated by empty lines.
You need to load each line as a new json object and ignore empty lines:
>>> with open('streamer.151205-071156.json') as f:
...     data = [json.loads(l) for l in f if len(l) > 1]
...
>>> len(data)
7
>>> print(data[0]['text'])
u'Mnjd \U0001f642\U0001f602 https://t.co/BL5Ezxtt0i'
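The same idea, written out as a loop, might look like the sketch below. The statuses list and the temporary file are stand-ins invented for the example (the real data would come from the Twitter stream); the blank line after each record mimics the file layout described in the answer.

```python
import json
import os
import tempfile

# Hypothetical stand-ins for the streamed statuses.
statuses = [{"text": "first tweet"}, {"text": "second tweet"}]

path = os.path.join(tempfile.mkdtemp(), "stream.json")

# Write one JSON document per line, with a blank line after each
# to mimic the file described above.
with open(path, "w", encoding="utf8") as out:
    for status in statuses:
        out.write(json.dumps(status) + "\n\n")

# Read back: strip each line and skip the ones that are empty.
tweets = []
with open(path, encoding="utf8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        tweets.append(json.loads(line))

texts = [tweet["text"] for tweet in tweets]
```

Stripping before the emptiness check also covers lines that contain only whitespace, which a plain length test would let through.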

TypeError: ... is not JSON serializable error when adding new values to an object by Python

I have such a json object:
{
"people":[
{"firstName":"Hasan Sait", "lastName":"Arslan", "email":"hasan.sait.arslan#gmail.com"}]
}
I want to add a new value to this JSON object with Python, as follows:
import json
with open('data.json', 'r+') as json_file:
    json_data = json.load(json_file)
    people = json_data['people']
    people.append({"firstName":"Mehmet"})
    json_file.seek(0, 0)
    json.dump(json_file, json_data)
    json_file.truncate()
I get the following error: TypeError: <open file 'data.json', mode 'r+' at 0x7f3f85a4b5d0> is not JSON serializable
Similar questions have been asked on Stack Overflow before, but I couldn't find a useful solution in them.
Could you tell me where I am wrong?
json.dumps doesn't write to streams; it simply takes the object and returns the JSON-serialized string. You can then write that string to the file.
import json
with open('data.json', 'r+') as json_file:
    json_data = json.load(json_file)
    people = json_data['people']
    people.append({"firstName":"Mehmet"})
    json_file.seek(0, 0)
    jsonString = json.dumps(json_data)
    json_file.write(jsonString)
    json_file.truncate()
You just got the order of json_file and json_data wrong, so it tells you that it can't serialize the file pointer as JSON. With json.dump, the object comes first and the file pointer second.
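With the arguments in the right order, the original approach using json.dump works as well. The sketch below assumes a scratch copy of data.json created in a temporary directory, with a shortened version of the record from the question.

```python
import json
import os
import tempfile

# Hypothetical scratch copy of data.json with content like that in the question.
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w") as json_file:
    json.dump({"people": [{"firstName": "Hasan Sait", "lastName": "Arslan"}]}, json_file)

with open(path, "r+") as json_file:
    json_data = json.load(json_file)
    json_data["people"].append({"firstName": "Mehmet"})
    json_file.seek(0)                # rewind before overwriting
    json.dump(json_data, json_file)  # object first, file second
    json_file.truncate()             # drop leftovers if the new text is shorter

# Re-open the file to check what was actually written.
with open(path) as json_file:
    updated = json.load(json_file)
```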

Valid JSON in text file but python json.loads gives "JSON object could be decoded"

I have a valid JSON (checked using Lint) in a text file.
I am loading the json as follows
test_data_file = open('jsonfile', "r")
json_str = test_data_file.read()
the_json = json.loads(json_str)
I have verified the json data in file on Lint and it shows it as valid. However the json.loads throws
ValueError: No JSON object could be decoded
I am a newbie to Python, so I am not sure how to do it the right way. Please help.
(I assume it has something to do with encoding the string from UTF-8 to unicode, as the data in the file is retrieved as a string.)
I tried with open('jsonfile', 'r') and it works now.
Also I did the following on the file
json_newfile = open('json_newfile', 'w')
json_oldfile = open('json_oldfile', 'r')
old_data = json_oldfile.read()
json.dump(old_data, json_newfile)
and now I am reading the new file.
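One caveat about the last snippet, offered as a sketch rather than a diagnosis of the original file: json.dump applied to the raw text (a str) writes a single JSON string, not the object the text describes, so reading the new file back yields a str that has to be parsed a second time. The old_data value below is a made-up example, and an in-memory buffer stands in for the new file.

```python
import io
import json

old_data = '{"key": "value"}'  # hypothetical file contents, read as text

# Dumping the raw text serializes it as one JSON string, with quotes escaped.
buffer = io.StringIO()
json.dump(old_data, buffer)
written = buffer.getvalue()

first_pass = json.loads(written)      # a str, not a dict
second_pass = json.loads(first_pass)  # the actual object
```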

How to parse values from a JSON file in Python

I'm trying to get the values from the json file, and the error that I'm getting is TypeError: expected string or buffer. I believe I'm parsing the file correctly, and I think my json file format is also correct. Where am I going wrong?
Both the files are in the same directory.
Main_file.py
import json
json_data = open('meters_parameters.json')
data = json.loads(json_data)  # TypeError: expected string or buffer
print data
json_data.close()
meters_parameters.json
{
    "cilantro" : [{
        "cem_093":[{
            "kwh":{"function":"3","address":"286","length":"2"},
            "serial_number":{"function":"3","address":"298","length":"2"},
            "slave_id":{"function":"3","address":"15","length":"2"}
        }]
    }],
    "elmeasure" : [{
        "lg1119_d":[{
            "kwh":{"function":"3","address":"286","length":"2"},
            "serial_number":{"function":"3","address":"298","length":"2"},
            "slave_id":{"function":"3","address":"15","length":"2"}
        }]
    }]
}
loads expects a string not a file handle. You need json.load:
import json
with open('meters_parameters.json') as f:
    data = json.load(f)
print data
You're trying to load the file object, when you want to load everything in the file. Do:
data = json.loads(json_data.read())
.read() gets everything from the file and returns it as a string.
A with statement is much more pythonic here as well:
with open('meters_parameters.json') as myfile:
    data = json.loads(myfile.read())
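Once the file is loaded, getting at individual values is plain dict and list indexing. The sketch below uses a trimmed copy of meters_parameters.json embedded as a string so it runs on its own; the variable names meter, kwh_address and slave_address are illustrative and not from the original post.

```python
import json

# A trimmed copy of meters_parameters.json from the question.
raw = """
{
  "cilantro": [{
    "cem_093": [{
      "kwh": {"function": "3", "address": "286", "length": "2"},
      "slave_id": {"function": "3", "address": "15", "length": "2"}
    }]
  }]
}
"""

data = json.loads(raw)

# JSON objects become dicts and arrays become lists, so index step by step.
meter = data["cilantro"][0]["cem_093"][0]
kwh_address = meter["kwh"]["address"]              # values stay strings here
slave_address = int(meter["slave_id"]["address"])  # convert when a number is needed
```

Note that the numbers in this file are quoted, so they arrive as strings and need an explicit int() where arithmetic is intended.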
