This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
UnicodeDecodeError when passing GET data in Python/AppEngine
I get the following traceback locally and on production when trying to submit a form. Can you explain where I should look or should I start making debugging statements to see where in the code the exception occurs?
--> --> -->
Traceback (most recent call last):
File "/media/Lexar/montao/google/appengine/tools/dev_appserver.py", line 3858, in _HandleRequest
self._Dispatch(dispatcher, self.rfile, outfile, env_dict)
File "/media/Lexar/montao/google/appengine/tools/dev_appserver.py", line 3792, in _Dispatch
base_env_dict=env_dict)
File "/media/Lexar/montao/google/appengine/tools/dev_appserver.py", line 580, in Dispatch
base_env_dict=base_env_dict)
File "/media/Lexar/montao/google/appengine/tools/dev_appserver.py", line 2918, in Dispatch
self._module_dict)
File "/media/Lexar/montao/google/appengine/tools/dev_appserver.py", line 2822, in ExecuteCGI
reset_modules = exec_script(handler_path, cgi_path, hook)
File "/media/Lexar/montao/google/appengine/tools/dev_appserver.py", line 2704, in ExecuteOrImportScript
script_module.main()
File "/media/Lexar/montao/classifiedsmarket/main.py", line 2497, in main
util.run_wsgi_app(application)
File "/media/Lexar/montao/google/appengine/ext/webapp/util.py", line 98, in run_wsgi_app
run_bare_wsgi_app(add_wsgi_middleware(application))
File "/media/Lexar/montao/google/appengine/ext/webapp/util.py", line 116, in run_bare_wsgi_app
result = application(env, _start_response)
File "/media/Lexar/montao/google/appengine/ext/webapp/__init__.py", line 655, in __call__
response.wsgi_write(start_response)
File "/media/Lexar/montao/google/appengine/ext/webapp/__init__.py", line 274, in wsgi_write
body = self.out.getvalue()
File "/usr/lib/python2.6/StringIO.py", line 270, in getvalue
self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
The error manifested itself in /usr/lib/python2.6/StringIO.py i.e. the Python StringIO module. We don't need to read too far into that source file (line 49) to find this warning:
The StringIO object can accept either Unicode or 8-bit strings, but
mixing the two may take some care. If both are used, 8-bit strings that
cannot be interpreted as 7-bit ASCII (that use the 8th bit) will
cause a UnicodeError to be raised when getvalue() is called.
Bingo! And the warning is repeated again in the getvalue() method. Note that the warning is ancient; it mentions UnicodeError instead of UnicodeDecodeError, but you get the drift.
I'd suggest patching the module so that it displays what's in the bag when the error happens. Wrap up the offending statement at line 270 like this:
if self.buflist:
try:
self.buf += ''.join(self.buflist)
except UnicodeDecodeError:
import sys
print >> sys.stderr, "*** error context: buf=%r buflist=%r" % (self.buf, self.buflist)
raise
self.buflist = []
return self.buf
If the idea of patching a Python-supplied module in situ horrifies you, put the patched version in a directory that's earlier in sys.path than /usr/lib/python2.6.
Here's an example of mixing non-ASCII str and unicode:
>>> from StringIO import StringIO
>>> f = StringIO()
>>> f.write('ascii')
>>> f.write(u'\u1234'.encode('utf8'))
>>> f.write(u'\u5678')
>>> f.getvalue()
*** error context: buf='' buflist=['ascii', '\xe1\x88\xb4', u'\u5678']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\python26\lib\StringIO.py", line 271, in getvalue
self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 0: ordinal not in range(128)
>>>
Then you can run your application and look at what is in buflist: which parts are data that you wrote, and which are provided by GAE. You need to look at the GAE docs to see whether it is expecting str contents (with what encoding?) or unicode contents, and adjust your code accordingly.
Related
I am trying to save control identifiers to a variable.
The reason is that: i use main_dlg.print_control_identifiers(filename="control_ids.text") but i am getting this error:
Traceback (most recent call last):
File "run_tests.py", line 7, in <module>
main_dlg.print_control_identifiers(filename="control_ids.text")
File "C:\python\lib\site-packages\pywinauto\application.py", line 696, in prin
t_control_identifiers
print_identifiers([this_ctrl, ], log_func=log_func)
File "C:\python\lib\site-packages\pywinauto\application.py", line 685, in prin
t_identifiers
print_identifiers(ctrl.children(), current_depth + 1, log_func)
File "C:\python\lib\site-packages\pywinauto\application.py", line 681, in prin
t_identifiers
log_func(output)
File "C:\python\lib\site-packages\pywinauto\application.py", line 694, in log_
func
log_file.write(str(msg) + os.linesep)
File "C:\python\lib\codecs.py", line 721, in write
return self.writer.write(data)
File "C:\python\lib\codecs.py", line 377, in write
data, consumed = self.encode(object, self.errors)
File "C:\python\lib\encodings\cp1252.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 71-76: c
haracter maps to <undefined>
because the program i am trying to control has greek characters inside.
So, i decided to save the control identifiers to a variable and then save it with the right encoding to a txt file.
A simple solution i thought is:
from pywinauto.application import Application
import os
app = Application(backend="uia").start("C:/python/Lib/site-packages/QtDesigner/designer.exe")
main_dlg = app.QtDesigner
main_dlg.wait('visible')
main_dlg.print_control_identifiers()
and then:
python script_name.py > output.txt
after that in output.txt file there are the control identifiers of the program (in this example QtDesigner).
I'm trying to import an excel file into Pandas. I'm using df=pd.read_excel(file_path) but it keeps getting me this error:
*** No CODEPAGE record, no encoding_override: will use 'ascii'
*** No CODEPAGE record, no encoding_override: will use 'ascii'
Traceback (most recent call last):
File "/Users/santanna_santanna/PycharmProjects/KlooksExplore/FindCos/FindCos_Functions.py", line 5468, in <module>
adjust_sheet(y1,y2,y3)
File "/Users/santanna_santanna/PycharmProjects/KlooksExplore/FindCos/FindCos_Functions.py", line 5130, in adjust_sheet
y1=pd.read_excel(y1)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/pandas/util/_decorators.py", line 118, in wrapper
return func(*args, **kwargs)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/pandas/io/excel.py", line 230, in read_excel
io = ExcelFile(io, engine=engine)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/pandas/io/excel.py", line 294, in __init__
self.book = xlrd.open_workbook(self._io)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/__init__.py", line 162, in open_workbook
ragged_rows=ragged_rows,
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/book.py", line 119, in open_workbook_xls
bk.get_sheets()
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/book.py", line 719, in get_sheets
self.get_sheet(sheetno)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/book.py", line 710, in get_sheet
sh.read(self)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/sheet.py", line 815, in read
strg = unpack_string(data, 6, bk.encoding or bk.derive_encoding(), lenlen=2)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/biffh.py", line 249, in unpack_string
return unicode(data[pos:pos+nchars], encoding)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/timemachine.py", line 30, in <lambda>
unicode = lambda b, enc: b.decode(enc)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1: ordinal not in range(128)
The file I'm trying to import is this one.
Is that an encoding problem or some character in the file is causing this? What would be the way to solve it?
pd.read_excel('data.csv' encoding='utf-8')
#astrobiologist gave a good hint
Since I didn't want the hassle of going into patches, the way I found to solve was to open the file in Open Office and save it as an Excel 97 file. Finally worked
I am trying to do a redis dump using the python package redis-dump-load.
It is in UTF-8, except for apparently one key, which I'm being told is in ascii. No idea why, but I thought, ok, if there is a UnicodeDecodeError for this one key (I am receiving lots of data from the dump stream up to this point), what I will do is get the encoding of the bytes string I have, and decode with that instead.
However, I am still getting a UnicodeDecodeError for ASCII as well! It is bytes, it is ascii, I think it might just be corrupt and I plan to just skip it, but curious if anyone had any other ideas.
This is my code snippet:
value = {}
for k in response:
try:
value[k.decode(encoding)] = response[k].decode(encoding)
except UnicodeDecodeError:
print("Error for", k)
print(type(k))
orig_encoding = chardet.detect(k)['encoding']
print(orig_encoding)
value[k.decode(orig_encoding)] = response[k].decode(orig_encoding)
return value
And here is the ouput I see:
Error for b'meta'
<class 'bytes'>
ascii
Traceback (most recent call last):
File "redisdl.py", line 243, in handle_response
value[k.decode(encoding)] = response[k].decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "redisdl.py", line 635, in <module>
main()
File "redisdl.py", line 626, in main
do_dump(options)
File "redisdl.py", line 547, in do_dump
dump(output, **kwargs)
File "redisdl.py", line 174, in dump
for key, type, ttl, value in _reader(r, pretty, encoding, keys):
File "redisdl.py", line 293, in _reader
type, ttl, value = _read_key(encoded_key, r, pretty, encoding)
File "redisdl.py", line 284, in _read_key
value = reader.handle_response(results[2], pretty, encoding)
File "redisdl.py", line 250, in handle_response
response[k].decode(orig_encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
Am I missing something? I have looked at multiple SO answers, believe me, and I thought up until now I understood str, bytes, decoding, and encoding. But maybe not. Even py2 & py3 differences I get. I think. Starting to doubt everything...
Getting the error below trying to install ez_install, Windows 7 64 bit machine with fresh Python 2.7. Any ideas?
Installing Setuptools
Traceback (most recent call last):
File "setup.py", line 17, in
exec(init_file.read(), command_ns)
File "", line 8, in
File "c:\users\namar\appdata\local\temp\tmp1tanvy\setuptools-2.1\setuptools\__
init__.py", line 11, in
from setuptools.extension import Extension
File "c:\users\namar\appdata\local\temp\tmp1tanvy\setuptools-2.1\setuptools\ex
tension.py", line 5, in
from setuptools.dist import _get_unpatched
File "c:\users\namar\appdata\local\temp\tmp1tanvy\setuptools-2.1\setuptools\di
st.py", line 15, in
from setuptools.compat import numeric_types, basestring
File "c:\users\namar\appdata\local\temp\tmp1tanvy\setuptools-2.1\setuptools\co
mpat.py", line 19, in
from SimpleHTTPServer import SimpleHTTPRequestHandler
File "c:\python27\lib\SimpleHTTPServer.py", line 27, in
class SimpleHTTPRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
File "c:\python27\lib\SimpleHTTPServer.py", line 208, in SimpleHTTPRequestHand
ler
mimetypes.init() # try to read system mime.types
File "c:\python27\lib\mimetypes.py", line 358, in init
db.read_windows_registry()
File "c:\python27\lib\mimetypes.py", line 258, in read_windows_registry
for subkeyname in enum_types(hkcr):
File "c:\python27\lib\mimetypes.py", line 249, in enum_types
ctype = ctype.encode(default_encoding) # omit in 3.x!
UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 11: ordinal
not in range(128)
Something went wrong during the installation.
See the error message above.
C:\Users\namar\Downloads>cd\
C:\>cd Python27
I met the same problem, solved it by deleting non-ASCII character in window registry "HKEY_CLASSES_ROOT\MIME\Database\Content Type" and it works.The reason is
ctype = ctype.encode(default_encoding)
will throw UnicodeDecodeError when it met non-ASCII characters and the program will stop.
Check this link.(http://blog.csdn.net/hugleecool/article/details/17996993).
In C:\Python27\Lib\mimetypes.py find next string
default_encoding = sys.getdefaultencoding()
replace it with
if sys.getdefaultencoding() != 'gbk':
reload(sys)
sys.setdefaultencoding('gbk')
default_encoding = sys.getdefaultencoding()
Notice: instead 'gbk' choose your encoding.
In C:\Python27\Lib\mimetypes.py find next string
default_encoding = sys.getdefaultencoding()
replace it with
if sys.getdefaultencoding() != 'gbk':
reload(sys)
sys.setdefaultencoding('gbk')
default_encoding = sys.getdefaultencoding()
Notice: instead 'gbk' choose your encoding.
The above answer satisfies my case of unicode decode error in using google app engine local web server for php and python .
In my case, simply comment the following four line worked,
try:
ctype = ctype.encode(default_encoding) # omit in 3.x!
except UnicodeEncodeError:
pass
And replace UnicodeEncodeError to be UnicodeError worked.
For more info, you can check this question.
I was using rdflib version 3.2.3 and everything was working fine. After upgrading to 4.0.1 I started getting the error:
RDFa parsing Error! 'ascii' codec can't decode byte 0xc3 in position 5454: ordinal not in range(128)
I tried various way to make this work but so far have not succeeded. Below are my attempts.
In each case I:
from rdflib import Graph
First attempt:
>>> lg =Graph()
>>> len(lg.parse('http://creativecommons.org/licenses/by/3.0/'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/graph.py", line 1002, in parse
parser.parse(source, self, **args)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/structureddata.py", line 268, in parse
vocab_cache=vocab_cache)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/structureddata.py", line 148, in _process
_check_error(processor_graph)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/structureddata.py", line 57, in _check_error
raise Exception("RDFa parsing Error! %s" % msg)
Exception: RDFa parsing Error! 'ascii' codec can't decode byte 0xc3 in position 4801: ordinal not in range(128)
Second attempt:
>>> lg =Graph()
>>> len(lg.parse('http://creativecommons.org/licenses/by/3.0/rdf'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/graph.py", line 1002, in parse
parser.parse(source, self, **args)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/rdfxml.py", line 570, in parse
self._parser.parse(source)
File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
xmlreader.IncrementalParser.parse(self, source)
File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
self.feed(buffer)
File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
self._parser.Parse(data, isFinal)
File "/usr/lib/python2.7/xml/sax/expatreader.py", line 349, in end_element_ns
self._cont_handler.endElementNS(pair, None)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/rdfxml.py", line 160, in endElementNS
self.current.end(name, qname)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/rdfxml.py", line 461, in property_element_end
current.data, literalLang, current.datatype)
File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/term.py", line 541, in __new__
raise Exception("'%s' is not a valid language tag!"%lang)
Exception: 'i18n' is not a valid language tag!
Third attempt: gives no errors but also does not give any results
>>> lg =Graph()
>>> len(lg.parse('http://creativecommons.org/licenses/by/3.0/rdf', format='rdfa'))
0
So someone please tell me what I am dong wrong! :)
As Graham replied on the rdflib mailinglist, there is a html5lib problem - we will pin it correctly for python 2 for the next release, but for now just do:
pip install html5lib==0.95
The second problem is a problem in the data from creative commons, "i18n" really isn't a valid language tag according to rfc5646. I added the check, but in retrospect it seems to strict to raise an exception. I guess I'll change it to a warning.