Error in outputting CSV with Django? - python

I am trying to output my model as a CSV file. It works fine when the model holds a small amount of data, but it is very slow with large data. Secondly, there is an error when outputting the model as CSV. The logic I am using is:
def some_view(request):
    # Create the HttpResponse object with the appropriate CSV header.
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="news.csv"'
    writer = csv.writer(response)
    news_obj = News.objects.using('cms').all()
    for item in news_obj:
        #writer.writerow([item.newsText])
        writer.writerow([item.userId.name])
    return response
and the error I am getting is:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)
and further it says:
The string that could not be encoded/decoded was: عبدالله الحذ

Replace the line
writer.writerow([item.userId.name])
with:
writer.writerow([item.userId.name.encode('utf-8')])
In Python 2 the csv module writes byte strings, so a unicode string must be encoded before it is written. Most systems use UTF-8 by default, so it is a safe choice.
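On Python 3 the csv module works with text directly, so the encoding step moves to the point where text becomes bytes. The following is a minimal sketch, assuming Python 3; `rows_to_csv_bytes` is a hypothetical helper (not part of Django or the question's code) that builds the CSV in memory and encodes it once as UTF-8:

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize rows to CSV text in memory, then encode once as UTF-8.

    Hypothetical helper for illustration; `rows` is any iterable of lists.
    """
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue().encode("utf-8")


csv_bytes = rows_to_csv_bytes([["عبدالله"], ["plain ascii"]])
```

The resulting bytes could then be used as the body of the response, ideally with a content type that names the charset, such as 'text/csv; charset=utf-8'.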

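The error shows that the content written to the CSV file is being encoded as ASCII. You can encode it yourself and tell Python to skip the characters that ASCII cannot represent:
>>> u'aあä'.encode('ascii', 'ignore')
'a'
You can avoid the error by ignoring the non-ASCII characters, but note that they are dropped from the output:
writer.writerow([item.userId.name.encode('ascii', 'ignore')])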
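To make the trade-off concrete, here is a small sketch (assuming Python 3, where `encode` returns `bytes`) that compares the lossy ASCII encoding with a UTF-8 round trip. The sample string is the one from the example above, written with escapes:

```python
name = "a\u3042\u00e4"  # the string "aあä"

# errors="ignore" silently drops every character ASCII cannot represent.
lossy = name.encode("ascii", "ignore")

# UTF-8 can represent every Unicode character, so the round trip is exact.
lossless = name.encode("utf-8")
restored = lossless.decode("utf-8")
```

For a name written entirely in Arabic, the ASCII variant would produce an empty field, so UTF-8 is usually the better fit for this data.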

Related

JSON import in Python

I would like to import the JSON file located at "https://www.drivy.com/cars/458342/reviews?page=1&paginate_per=6&rel=next" in python.
When I run this:
with open('C:/Users/coppe/Documents/py trials/eval.json') as json_file:
    reviews = json.load(json_file)
I get an error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 6776: character maps to <undefined>
This error is caused by a special character contained in the html key's value. Knowing that this character is an emoticon (a thumb), how can I still import my JSON while ignoring it?
You need to tell open() which encoding the file uses; otherwise Python falls back to the platform default (here a Windows code page, hence the 'charmap' codec). Most JSON files are UTF-8, so use something like:
reviews = json.load(
    open("C:/Users/coppe/Documents/py trials/eval.json", encoding="utf8")
)
or, so that the file is closed automatically:
with open('C:/Users/coppe/Documents/py trials/eval.json', encoding="utf8") as json_file:
    reviews = json.load(json_file)
Good Luck!
Pass the encoding when you open the file:
open(json_file, encoding="utf8")
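A self-contained sketch of the same idea, assuming Python 3. The sample data and the temporary file are made up for illustration and stand in for eval.json:

```python
import json
import os
import tempfile

# Write a small UTF-8 JSON file containing an emoji (hypothetical sample data).
sample = {"review": "Great car \U0001F44D"}
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", encoding="utf-8", delete=False
) as tmp:
    json.dump(sample, tmp, ensure_ascii=False)
    path = tmp.name

# Read it back, naming the encoding explicitly instead of relying on the
# platform default.
with open(path, encoding="utf-8") as json_file:
    reviews = json.load(json_file)

os.remove(path)
```

Naming the encoding on both the write and the read side keeps the result independent of the machine's locale settings.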

Why do I get a decode error when using json.load in Python?

I am trying to open a JSON file but get a decode error. I can't find a solution for this. How can I decode this data?
The code gives the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 3765: invalid start byte
import json
url = 'users.json'
with open(url) as json_data:
    data = json.load(json_data)
That means that the data you're trying to decode isn't encoded in UTF-8.
EDIT:
You can decode it yourself before parsing it with json, using something like this:
with open(url, 'rb') as f:
    data = f.read()
data_str = data.decode("utf-8", errors='ignore')
parsed = json.loads(data_str)  # json.loads takes a string; json.load expects a file object
https://www.tutorialspoint.com/python/string_decode.htm
Be careful: you WILL lose some data during this process, because every byte that is not valid UTF-8 is dropped. A safer way is to decode with the same encoding that was used to write your JSON file, or to store raw bytes in something like base64.
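When the original encoding is unknown, one option is to try a short list of candidates instead of discarding bytes. This is only a sketch under assumptions: `load_json_bytes` is a hypothetical helper, and the candidate list (UTF-8 first, then Latin-1, in which byte 0xf6 is "ö") is a guess that should be replaced with whatever encoding the file's producer actually used:

```python
import json


def load_json_bytes(raw, encodings=("utf-8", "latin-1")):
    """Decode raw bytes with the first candidate encoding that fits, then parse.

    Hypothetical helper; the candidate list is an assumption. Latin-1 accepts
    every byte value, so it must come last or it will hide other candidates.
    """
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return json.loads(text)
    raise ValueError("none of the candidate encodings could decode the data")


raw = '{"name": "Bj\u00f6rn"}'.encode("latin-1")  # contains the byte 0xf6
data = load_json_bytes(raw)
```

Unlike errors='ignore', this keeps every character, but it can still pick the wrong encoding silently, so knowing the real one remains the safer route.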

How to extract String from a Unicoded JSONObject in Python?

I'm getting the error below when I try to parse a string containing Unicode characters such as the ' symbol, emojis, etc.:
UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f33b' in position 19: ordinal not in range(128)
Sample Object:
{"user":{"name":"\u0e2a\u0e31\u0e48\u0e07\u0e14\u0e48\u0e27\u0e19 \u0e2b\u0e21\u0e14\u0e44\u0e27 \u0e40\u0e14\u0e23\u0e2a\u0e41\u0e1f\u0e0a\u0e31\u0e48\u0e19\u0e21\u0e32\u0e43\u0e2b\u0e21\u0e48 \u0e23\u0e32\u0e04\u0e32\u0e40\u0e1a\u0e32\u0e46 \u0e2a\u0e48\u0e07\u0e17\u0e31\u0e48\u0e27\u0e44\u0e17\u0e22 \u0e44\u0e14\u0e49\u0e02\u0e2d\u0e07\u0e0a\u0e31\u0e27\u0e23\u0e4c\u0e08\u0e49\u0e32 \u0e2a\u0e19\u0e43\u0e08\u0e15\u0e34\u0e14\u0e15\u0e48\u0e2d\u0e2a\u0e2d\u0e1a\u0e16\u0e32\u0e21 Is it","tag":"XYZ"}}
I'm able to extract tag value, but I'm unable to extract name value.
Here is my code:
dict = json.loads(json_data)
print('Tag - ' + dict['user']['tag'])
print('Name - ' + dict['user']['name'])
You can save the data in CSV format, which can also be opened in Excel. When you open a file with open(filename, "w"), Python uses the platform's default encoding, which may not be able to represent every character (for example ASCII or a Windows code page); writing such characters raises UnicodeEncodeError. To store Unicode data reliably, open the file with UTF-8 encoding.
mydict = json.loads(json_data) # or whatever dictionary it is...
# Open the file with UTF-8 encoding, most important step
f = open("userdata.csv", "w", encoding='utf-8')
f.write(mydict['user']['name'] + ", " + mydict['user']['tag'] + "\n")
f.close()
Feel free to change the code based on the data you have.
That's it...
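If a value may itself contain a comma, joining the fields by hand produces a broken row. A possible variant, assuming Python 3, uses csv.writer on a file opened as UTF-8 so that quoting is handled for you. The JSON string and the temporary directory below are made up for illustration:

```python
import csv
import json
import os
import tempfile

# Hypothetical sample: the name contains a comma and an emoji (surrogate pair).
json_data = r'{"user": {"name": "caf\u00e9, \ud83c\udf3b", "tag": "XYZ"}}'
mydict = json.loads(json_data)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "userdata.csv")

    # newline="" lets the csv module manage line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([mydict["user"]["name"], mydict["user"]["tag"]])

    # Read the file back to confirm the row survived intact.
    with open(path, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f))
```

The writer wraps the name in quotes because of the embedded comma, so the row still has exactly two fields when it is read back.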

Unicode error reading Python log file (logging)

I am creating a log file using Python's logging library. When I try to read it with Python and print it on an HTML page (using Flask), I get:
<textarea cols="80" rows="20">{% for line in log %}{{line}}{% endfor %}
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36: ordinal not in range(128)
I guess this is because the log file is written in some other encoding, but which one?
This is the line setting the log file if it helps:
fileLogger = logging.handlers.TimedRotatingFileHandler(filename = 'log.log', when = 'midnight', backupCount = 30)
How do I solve this problem?
The logging package's file handlers encode any Unicode object you send to them as UTF-8, unless you specify a different encoding.
Use io.open() to read the file back as UTF-8 data. You'll get unicode objects again, which is ideal for Jinja2:
import io
log = io.open('log.log', encoding='utf8')
You could also specify a different encoding for the TimedRotatingFileHandler, but UTF-8 is an excellent default. Use the encoding keyword argument if you want to pick a different encoding:
fileLogger = logging.handlers.TimedRotatingFileHandler(
    filename='log.log', when='midnight', backupCount=30,
    encoding='Latin1')
I'm not familiar with Flask, but if you can grab the contents of the log as a string, you can encode it to UTF-8 like so:
string = string.encode('utf-8') # string is the log's contents, now in utf-8
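The write and read sides can be checked together in a short round trip. This is a sketch assuming Python 3; the logger name, the message, and the temporary directory are invented for the example. The handler is given an explicit UTF-8 encoding and the file is read back with the same one:

```python
import logging
import logging.handlers
import os
import tempfile

with tempfile.TemporaryDirectory() as log_dir:
    log_path = os.path.join(log_dir, "log.log")
    handler = logging.handlers.TimedRotatingFileHandler(
        filename=log_path, when="midnight", backupCount=30, encoding="utf-8"
    )
    logger = logging.getLogger("encoding_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo message out of other handlers
    logger.addHandler(handler)

    logger.info("Gr\u00fc\u00dfe from the log")  # "ü" is stored as bytes 0xc3 0xbc

    # Detach and close the handler before reading the file back.
    logger.removeHandler(handler)
    handler.close()

    with open(log_path, encoding="utf-8") as log_file:
        log_lines = log_file.read().splitlines()
```

The lines read this way are ordinary text strings and can be handed to the template as they are.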

How can I get my Python to parse the following text?

I have a sample of the text:
"PROTECTING-ħarsien",
I'm trying to parse with the following
import csv, json
with open('./dict.txt') as maltese:
    entries = maltese.readlines()
    for entry in entries:
        tokens = entry.replace('"', '').replace(",", "").replace("\r\n", "").split("-")
        if len(tokens) == 1:
            pass
        else:
            print tokens[0] + "," + unicode(tokens[1])
But I'm getting an error message
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
What am I doing wrong?
It appears that dict.txt is UTF-8 encoded (ħ is 0xc4 0xa7 in UTF-8).
You should open the file as UTF-8, then:
import codecs
with codecs.open('./dict.txt', encoding="utf-8") as maltese:
    # etc.
You will then have Unicode strings instead of bytestrings to work with; you therefore don't need to call unicode() on them, but you may have to re-encode them to the encoding of the terminal you're outputting to.
You have to change your last line to (this has been tested to work on your data):
print tokens[0] + "," + unicode(tokens[1], 'utf8')
If you leave out the 'utf8' argument, Python assumes that the source is ASCII-encoded, hence the error.
See http://docs.python.org/2/howto/unicode.html#the-unicode-type
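For readers on Python 3, where `unicode()` and the print statement no longer exist, the same parsing could look roughly like this. It is a sketch, not the original poster's code: `parse_entry` is a hypothetical helper, and the sample bytes mimic one line of a UTF-8 encoded dict.txt:

```python
def parse_entry(line):
    """Split a line such as '"PROTECTING-ħarsien",' into (english, maltese).

    Hypothetical helper; returns None when the line has no "-" separator.
    """
    cleaned = line.strip().strip(",").replace('"', "")
    english, sep, maltese = cleaned.partition("-")
    if not sep:
        return None
    return english, maltese


raw_line = b'"PROTECTING-\xc4\xa7arsien",\r\n'  # bytes as stored in a UTF-8 file
entry = parse_entry(raw_line.decode("utf-8"))
```

Here `partition` splits at the first hyphen only, which differs from `split("-")` when the Maltese word itself contains a hyphen. In a real script, opening the file with open('./dict.txt', encoding='utf-8') performs the decoding step for every line.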
