Python: Assistance with json and reading from a file - python

Say I have a Notepad file (.txt) with the following content:
"Hello I am really bad at programming"
Using json, how would I read the sentence from the file into my Python program so that I can use it as a variable?
So far I have this code:
newfile = open((compfilename)+'.txt', 'r')
saveddata = json.load(newfile)
orgsentence = saveddata[0]
I always get this error:
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
Thanks in advance for any help!

Since this is a plain .txt file, you could read it without json. But as you asked about json in the question, you can try it like this:
hello.txt
"Hello I am really bad at programming"
To read this txt file,
import json
from pprint import pprint

with open('hello.txt') as myfile:
    mydata = json.load(myfile)  # parse the file contents as JSON
    # json.load() consumes the file, so calling myfile.read() here would return ''
pprint(mydata)  # Python 2 shows the u'' prefix; Python 3 omits it
Output:
u'Hello I am really bad at programming'

import json
with open('file.txt') as f:
    data = json.load(f)
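
As the first answer says, the file can also be read as plain text. Here is a minimal sketch of both routes, assuming Python 3 and a UTF-8 file; the temporary hello.txt is created only to make the example runnable. Note that when the JSON document is a single string, json.load() returns a str, so saveddata[0] from the question would be just the first character. One guess, since the actual file isn't shown: byte 0xe2 at position 0 is how a UTF-8 typographic quote (“) starts, and json would reject that even once the file is decoded correctly, so the plain-text route may be the simpler one.

```python
import json
import os
import tempfile

# Hypothetical sample file, created here only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write('"Hello I am really bad at programming"')

# Route 1: plain text, no json. Strip whitespace and the surrounding quotes.
with open(path, encoding="utf-8") as f:
    orgsentence = f.read().strip().strip('"')

# Route 2: json. A document that is one JSON string is loaded as a str.
with open(path, encoding="utf-8") as f:
    saveddata = json.load(f)

print(orgsentence)
print(saveddata)
print(saveddata[0])  # first character of the string, not the whole sentence
```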

Related

JSON import in Python

I would like to import the JSON file located at "https://www.drivy.com/cars/458342/reviews?page=1&paginate_per=6&rel=next" in python.
When I run this:
with open('C:/Users/coppe/Documents/py trials/eval.json') as json_file:
    reviews = json.load(json_file)
I get an error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 6776: character maps to <undefined>
This error is caused by a special character in the value of the html key. Knowing that this character is an emoticon (a thumb), how can I still import my JSON and ignore it?
You need to specify the encoding when you open the file; the bytes are decoded by open(), not by the json module. Most JSON files are UTF-8, so use something like:
reviews = json.load(
    open("C:/Users/coppe/Documents/py trials/eval.json", encoding="utf8")
)
or, so that the file is closed automatically:
with open('C:/Users/coppe/Documents/py trials/eval.json', encoding="utf8") as json_file:
    reviews = json.load(json_file)
Good Luck!
Use this, where json_file is the path to the file:
open(json_file, encoding="utf8")
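
To see why the explicit encoding matters, here is a small sketch. It assumes the Windows default codec was cp1252 (the question does not say) and uses a made-up one-key document. The thumbs-up emoji is four bytes in UTF-8, and the last one, 0x8d, has no mapping in cp1252, which matches the byte named in the error above.

```python
import json

# Hypothetical stand-in for the downloaded file: UTF-8 bytes with a thumbs-up emoji.
raw = json.dumps({"comment": "Great car \U0001F44D"}, ensure_ascii=False).encode("utf-8")

# Decoding with cp1252 (assumed here to be the platform default) fails on 0x8d.
try:
    raw.decode("cp1252")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# Decoding as UTF-8 works, and the emoji is kept rather than ignored.
reviews = json.loads(raw.decode("utf-8"))
print(ascii(reviews["comment"]))  # ascii() keeps the output printable on any console
```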

Why do I get a decode error when using json.load in Python?

I'm trying to open a JSON file but I get a decode error, and I can't find a solution. How can I decode this data?
The code gives the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 3765: invalid start byte
import json
url = 'users.json'
with open(url) as json_data:
    data = json.load(json_data)
That means that the data you're trying to decode isn't encoded in UTF-8.
EDIT:
You can decode it yourself before parsing it with json, using something like this:
with open(url, 'rb') as f:
    data = f.read()
data_str = data.decode("utf-8", errors='ignore')
data = json.loads(data_str)  # json.loads() parses a string; json.load() expects a file object
https://www.tutorialspoint.com/python/string_decode.htm
Be aware that you WILL lose some data in this process, because errors='ignore' silently drops every byte that is not valid UTF-8. A safer way is to decode the file with the same encoding that was used to write it, or to store raw binary data in the JSON as something like base64.
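
A short sketch of that data loss, assuming (this is a guess, the question does not say) that users.json was written as Latin-1, where byte 0xf6 is the letter ö. The sample record is made up.

```python
import json

# Hypothetical record, encoded as Latin-1 so that it contains the byte 0xf6.
raw = json.dumps({"name": "Bj\u00f6rn"}, ensure_ascii=False).encode("latin-1")

# errors='ignore' drops the byte that is not valid UTF-8, so a letter disappears.
lossy = json.loads(raw.decode("utf-8", errors="ignore"))

# Decoding with the encoding the file was actually written in keeps everything.
exact = json.loads(raw.decode("latin-1"))

print(ascii(lossy["name"]))
print(ascii(exact["name"]))
```

If the real encoding is unknown, comparing the two results on a sample like this is a quick way to check whether a candidate encoding gives sensible text.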

UnicodeDecodeError: 'gbk' codec can't decode byte when reading JSON that contains Chinese

I'm switching from Python 2 to 3
In my jupyter notebook the code is
file = "./data/test.json"
with open(file) as data_file:
    data = json.load(data_file)
It used to work fine with Python 2, but now that I have switched to Python 3 it gives me this error:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 123: illegal multibyte sequence
The test.json file is like this:
[{
    "name": "Daybreakers",
    "detail_url": "http://www.movieinsider.com/m4120/daybreakers/",
    "movie_tt_id": "中文"
}]
If I delete the Chinese text, there is no error.
So what should I do?
There are a lot of similar questions in SO, but I didn't find a good solution for my case. If you find an applicable one, please tell me and I'll close this one.
Thanks a lot!
You need to specify the correct encoding when you open the file. If the JSON is encoded with UTF-8 you can do this:
import json
fname = "test.json"
with open(fname, encoding='utf-8') as data_file:
    data = json.load(data_file)
print(data)
output
[{'name': 'Daybreakers', 'detail_url': 'http://www.movieinsider.com/m4120/daybreakers/', 'movie_tt_id': '中文'}]
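
The 'gbk' in the error comes from the platform default: when no encoding is given, Python 3's open() has traditionally used the locale's preferred encoding. A minimal sketch to check that default and to read the file with an explicit encoding; the temporary test.json is a cut-down copy of the one above, written here only so the example runs on its own.

```python
import json
import locale
import os
import tempfile

# The encoding open() falls back to when none is passed (e.g. gbk or cp936
# on a Chinese-locale Windows machine).
print(locale.getpreferredencoding(False))

# Hypothetical sample file, written as UTF-8.
fname = os.path.join(tempfile.mkdtemp(), "test.json")
with open(fname, "w", encoding="utf-8") as out:
    json.dump([{"name": "Daybreakers", "movie_tt_id": "\u4e2d\u6587"}], out, ensure_ascii=False)

# Passing the encoding explicitly makes the result independent of the locale.
with open(fname, encoding="utf-8") as data_file:
    data = json.load(data_file)

print(ascii(data[0]["movie_tt_id"]))
```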

pandas reading csv file encoding error

I have an ISO-8859-9 encoded CSV file and I'm trying to read it into a DataFrame.
Here is the code and the error I get:
iller = pd.read_csv('/Users/me/Documents/Works/map/dist.csv', sep=';', encoding='iso-8859-9')
iller.head()
The error is:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 250: ordinal not in range(128)
The code below works without error:
import codecs
myfile = codecs.open('/Users/me/Documents/Works/map/dist.csv', "r", encoding='iso-8859-9')
for a in myfile:
    print(a)
My question is: why is pandas not reading my correctly encoded file, and is there any way to make it read it?
It's not possible to see what could be off with your data, of course, but if you can read the data without issues using codecs, then one idea would be to write the file out in UTF-8 encoding:
import codecs
filename = '/Users/me/Documents/Works/map/dist.csv'
target_filename = '/Users/me/Documents/Works/map/dist-utf-8.csv'
myfile = codecs.open(filename, "r", encoding='iso-8859-9')
f_contents = myfile.read()
or
import codecs
with codecs.open(filename, 'r', encoding='iso-8859-9') as fh:
    f_contents = fh.read()
# write out in UTF-8
with codecs.open(target_filename, 'w', encoding='utf-8') as fh:
    fh.write(f_contents)
I hope this helps!

Error using jsbeautifier in python with unicode text

I use the following code to beautify a js file (with jsbeautifier module) using python (3.4)
import jsbeautifier
def write_file(output, fn):
    file = open(fn, "w")
    file.write(output)
    file.close()

def beautify_file():
    res = jsbeautifier.beautify_file("myfile.js")
    write_file(res, "myfile-exp.js")
    print("beautify_file done")

def main():
    beautify_file()
    print("done")
    pass

if __name__ == '__main__':
    main()
The file contains the following contents:
function MyFunc(){
return {Language:"Мова",Theme:"ТÑма"};
}
When I run the python code, I get the following error:
'charmap' codec can't decode byte 0x90 in position 43: character maps to <undefined>
Can someone guide me as to how to handle unicode/utf-8 charsets with the beautifier?
Thanks
It's hard to tell without a full stack trace, but it looks like jsbeautifier isn't fully Unicode-aware.
Try one of the following:
Decode js file to Unicode:
with open("myfile.js", "r", encoding="UTF-8") as myfile:
    input_string = myfile.read()
res = jsbeautifier.beautify(input_string)
or, if that fails
Open file as binary:
with open("myfile.js", "rb") as myfile:
    input_string = myfile.read()
res = jsbeautifier.beautify(input_string)
In addition, you may run into issues when writing. You really need to set the encoding on the output file:
file = open(fn, "w", encoding="utf-8")
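
Putting the read side and the write side together, here is a sketch of the whole round trip with the encoding set explicitly on both ends. The helper name rewrite_text_file and its transform parameter are made up for illustration; by default the text passes through unchanged so the example runs without jsbeautifier installed, and in the question's case you would pass jsbeautifier.beautify as the transform.

```python
import os
import tempfile

def rewrite_text_file(src, dst, transform=lambda text: text, encoding="utf-8"):
    """Read src, apply transform to its text, and write the result to dst.

    Hypothetical helper: the same explicit encoding is used for reading and
    writing, so the platform default codec never comes into play.
    """
    with open(src, "r", encoding=encoding) as fh:
        text = fh.read()
    with open(dst, "w", encoding=encoding) as fh:
        fh.write(transform(text))

# Demo with a small made-up file containing Cyrillic text (written as escapes).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "myfile.js")
dst = os.path.join(workdir, "myfile-exp.js")
sample = 'function MyFunc(){return {Language:"\u041c\u043e\u0432\u0430"};}'
with open(src, "w", encoding="utf-8") as fh:
    fh.write(sample)

rewrite_text_file(src, dst)

with open(dst, encoding="utf-8") as fh:
    result = fh.read()
print(result == sample)
```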
