Encoding error when opening an Excel file with python xlrd module

Encoding error when opening an Excel file with python xlrd module - python

I have some excel files with extensions that are xls，when I use xlrd to open these files, it failed，I do not know how to solve it.
oldbook=xlrd.open_workbook('file.xls')
oldsheet=oldbook.sheets()[0]
PS C:\Users\我是猫\Desktop\python> python -u "c:\Users\我是猫\Desktop\python\a.py"
Traceback (most recent call last):
File "c:\Users\我是猫\Desktop\python\a.py", line 64, in <module>
oldbook=xlrd.open_workbook(result)
File "E:\python\lib\site-packages\xlrd\__init__.py", line 157, in open_workbook
ragged_rows=ragged_rows,
File "E:\python\lib\site-packages\xlrd\book.py", line 117, in open_workbook_xls
bk.parse_globals()
File "E:\python\lib\site-packages\xlrd\book.py", line 1209, in parse_globals
self.handle_format(data)
File "E:\python\lib\site-packages\xlrd\formatting.py", line 538, in handle_format
unistrg = unpack_unicode(data, 2)
File "E:\python\lib\site-packages\xlrd\biffh.py", line 284, in unpack_unicode
strg = unicode(rawstrg, 'utf_16_le')
File "E:\python\lib\site-packages\xlrd\timemachine.py", line 31, in <lambda>
unicode = lambda b, enc: b.decode(enc)
File "E:\python\lib\encodings\utf_16_le.py", line 16, in decode
return codecs.utf_16_le_decode(input, errors, True)
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 10-11: illegal encoding
PS C:\Users\我是猫\Desktop\python>

Try overriding the encoding used:
oldbook = xlrd.open_workbook('file.xls', encoding_override="cp1252")
You can also try encoding_override="utf-8", play around with the encoding till you get the right one.

Related

pywinauto save (permant) control identifiers to variable

I am trying to save control identifiers to a variable.
The reason is that: i use main_dlg.print_control_identifiers(filename="control_ids.text") but i am getting this error:
Traceback (most recent call last):
File "run_tests.py", line 7, in <module>
main_dlg.print_control_identifiers(filename="control_ids.text")
File "C:\python\lib\site-packages\pywinauto\application.py", line 696, in prin
t_control_identifiers
print_identifiers([this_ctrl, ], log_func=log_func)
File "C:\python\lib\site-packages\pywinauto\application.py", line 685, in prin
t_identifiers
print_identifiers(ctrl.children(), current_depth + 1, log_func)
File "C:\python\lib\site-packages\pywinauto\application.py", line 681, in prin
t_identifiers
log_func(output)
File "C:\python\lib\site-packages\pywinauto\application.py", line 694, in log_
func
log_file.write(str(msg) + os.linesep)
File "C:\python\lib\codecs.py", line 721, in write
return self.writer.write(data)
File "C:\python\lib\codecs.py", line 377, in write
data, consumed = self.encode(object, self.errors)
File "C:\python\lib\encodings\cp1252.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 71-76: c
haracter maps to <undefined>
because the program i am trying to control has greek characters inside.
So, i decided to save the control identifiers to a variable and then save it with the right encoding to a txt file.
A simple solution i thought is:
from pywinauto.application import Application
import os
app = Application(backend="uia").start("C:/python/Lib/site-packages/QtDesigner/designer.exe")
main_dlg = app.QtDesigner
main_dlg.wait('visible')
main_dlg.print_control_identifiers()
and then:
python script_name.py > output.txt
after that in output.txt file there are the control identifiers of the program (in this example QtDesigner).

cocos creator can not build for android platform

The information I got as follows.
Traceback (most recent call last):
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\bin\cocos.py", line 983, in
run_plugin(command, argv, plugins)
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\bin\cocos.py", line 875, in run_plugin
plugin.run(argv, dependencies_objects)
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\plugins\plugin_new\project_new.py", line 258, in run
self.parse_args(argv)
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\plugins\plugin_new\project_new.py", line 104, in parse_args
description=self.__class__.brief_description())
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\plugins\plugin_new\project_new.py", line 43, in brief_description
return MultiLanguage.get_string('NEW_BRIEF')
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\bin\MultiLanguage.py", line 52, in get_string
fmt = cls.get_instance().get_current_string(key)
File "C:\CocosCreator\resources\cocos2d-x\tools\cocos2d-console\bin\MultiLanguage.py", line 158, in get_current_string
ret = ret.encode(self.encoding)
File "C:\CocosCreator\resources\utils\Python27\lib\encodings\cp1252.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-8: character maps to <undefined>
any one knows why this happened?

I do not know python, but the problem seems solved afterI made the modification to the line 158 of MultiLanguage.py(in the get_current_string method):
ret = ret.encode(self.encoding)
to the following line:
ret = ret.encode("utf8") #self.encoding
Hope this helps people who encounter the same problem.

Take a look at the versions of NDK and SDK, NDK. The recommended version is R17 - R19

Pandas can't read excel encoding

I'm trying to import an excel file into Pandas. I'm using df=pd.read_excel(file_path) but it keeps getting me this error:
*** No CODEPAGE record, no encoding_override: will use 'ascii'
*** No CODEPAGE record, no encoding_override: will use 'ascii'
Traceback (most recent call last):
File "/Users/santanna_santanna/PycharmProjects/KlooksExplore/FindCos/FindCos_Functions.py", line 5468, in <module>
adjust_sheet(y1,y2,y3)
File "/Users/santanna_santanna/PycharmProjects/KlooksExplore/FindCos/FindCos_Functions.py", line 5130, in adjust_sheet
y1=pd.read_excel(y1)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/pandas/util/_decorators.py", line 118, in wrapper
return func(*args, **kwargs)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/pandas/io/excel.py", line 230, in read_excel
io = ExcelFile(io, engine=engine)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/pandas/io/excel.py", line 294, in __init__
self.book = xlrd.open_workbook(self._io)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/__init__.py", line 162, in open_workbook
ragged_rows=ragged_rows,
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/book.py", line 119, in open_workbook_xls
bk.get_sheets()
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/book.py", line 719, in get_sheets
self.get_sheet(sheetno)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/book.py", line 710, in get_sheet
sh.read(self)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/sheet.py", line 815, in read
strg = unpack_string(data, 6, bk.encoding or bk.derive_encoding(), lenlen=2)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/biffh.py", line 249, in unpack_string
return unicode(data[pos:pos+nchars], encoding)
File "/Users/santanna_santanna/anaconda3/lib/python3.6/site-packages/xlrd/timemachine.py", line 30, in <lambda>
unicode = lambda b, enc: b.decode(enc)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1: ordinal not in range(128)
The file I'm trying to import is this one.
Is that an encoding problem or some character in the file is causing this? What would be the way to solve it?

pd.read_excel('data.csv' encoding='utf-8')

#astrobiologist gave a good hint
Since I didn't want the hassle of going into patches, the way I found to solve was to open the file in Open Office and save it as an Excel 97 file. Finally worked

'utf-8' decode error in tensorflow tutorial

I'm running into this bizarre problem where when I run
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('/home/fqiao/development/MNIST_data/', one_hot=True)
I get:
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/tensorflow/examples/tutorials/mnist/input_data.py", line 199, in read_data_sets
train_images = extract_images(local_file)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/examples/tutorials/mnist/input_data.py", line 58, in extract_images
magic = _read32(bytestream)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/examples/tutorials/mnist/input_data.py", line 51, in _read32
return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
File "/usr/lib/python3.5/gzip.py", line 274, in read
return self._buffer.read(size)
File "/usr/lib/python3.5/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 404, in _read_gzip_header
magic = self._fp.read(2)
File "/usr/lib/python3.5/gzip.py", line 91, in read
self.file.read(size-self._length+read)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/default/_gfile.py", line 45, in sync
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/default/_gfile.py", line 199, in read
return self._fp.read(n)
File "/usr/lib/python3.5/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
However, if I just run the code in input_data.py directly, everything appears to be fine:
>>> dt = numpy.dtype(numpy.uint32).newbyteorder('>')
>>> f = tf.gfile.Open('/home/fqiao/development/MNIST_data/train-images-idx3-ubyte.gz', 'rb')
>>> bytestream = gzip.GzipFile(fileobj=f)
>>> testbytes = numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
>>> testbytes
2051
Anyone has any idea what's going on?
My system: Ubuntu 15.10 x64 python 3.5.0.

The bug has been addressed by a recent change 555e73d. MNIST files need to be opened with binary 'rb' mode instead of just text 'r'.

In my case, the problem was in the encoding of the data file.
Open the file using vim and execute:
:set fileencoding=utf-8
That solved the issue in my case.

import CGIHTTPServer results UnicodeDecodeError

import CGIHTTPServer results UnicodeDecodeError on Python 2.7.6.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\CGIHTTPServer.py", line 30, in <module>
import SimpleHTTPServer
File "C:\Python27\lib\SimpleHTTPServer.py", line 27, in <module>
class SimpleHTTPRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
File "C:\Python27\lib\SimpleHTTPServer.py", line 208, in SimpleHTTPRequestHand
ler
mimetypes.init() # try to read system mime.types
File "C:\Python27\lib\mimetypes.py", line 358, in init
db.read_windows_registry()
File "C:\Python27\lib\mimetypes.py", line 258, in read_windows_registry
for subkeyname in enum_types(hkcr):
File "C:\Python27\lib\mimetypes.py", line 249, in enum_types
ctype = ctype.encode(default_encoding) # omit in 3.x!
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 0: ordinal
not in range(128)"
Please advise me.

This is a bug in Python 2.7, reported as issue 21138; the mimetypes module doesn't use the correct encoding for Unicode mimetypes in the Windows registry.
The bug ticket contains a work-around; you'll have to edit mimetypes.py.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Encoding error when opening an Excel file with python xlrd module - python

Try overriding the encoding used: oldbook = xlrd.open_workbook('file.xls', encoding_override="cp1252") You can also try encoding_override="utf-8", play around with the encoding till you get the right one.

Related

pywinauto save (permant) control identifiers to variable

cocos creator can not build for android platform

Pandas can't read excel encoding

'utf-8' decode error in tensorflow tutorial

import CGIHTTPServer results UnicodeDecodeError

Categories

Resources