The ERROR: xlrd.biffh.XLRDError: Excel xlsx file; not supported - python

I'd like to ask a question about opening an Excel file.
I'm trying to open it with this program:
data = pd.read_excel(r'C:\Users\Acer\Desktop\OffshoringData.xlsx')
print(data)
The problem is that it raises the following error:
**xlrd.biffh.XLRDError: Excel xlsx file; not supported**
What should I do in this case?

You need to use a different engine in pandas.read_excel().
For security reasons, xlrd (since version 2.0) no longer supports .xlsx files, but openpyxl still does.
So you need to pass engine='openpyxl' to the function call.
Here's the documentation:
https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html
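A minimal sketch of the engine choice described above, assuming pandas and openpyxl are installed. The helper names `pick_engine` and `read_workbook` are hypothetical and not part of pandas; only `pd.read_excel(..., engine=...)` is the real call.

```python
from pathlib import Path

# Hypothetical mapping (an assumption for illustration): which read_excel
# engine to request for each file extension.
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",  # xlrd 2.0+ refuses these files
    ".xlsm": "openpyxl",
    ".xls": "xlrd",       # legacy format, still handled by xlrd
}


def pick_engine(path):
    """Return the engine name for a spreadsheet path, based on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError("no engine known for suffix %r" % suffix)
    return ENGINE_BY_SUFFIX[suffix]


def read_workbook(path):
    """Read a spreadsheet into a DataFrame with an explicitly chosen engine."""
    import pandas as pd  # imported lazily so pick_engine works without pandas

    return pd.read_excel(path, engine=pick_engine(path))
```

With this in place, read_workbook(r'C:\Users\Acer\Desktop\OffshoringData.xlsx') would call read_excel with engine='openpyxl'.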

This error usually results from a mismatch between the installed xlrd version and the file format. Recent releases of xlrd (2.0 and later, including 2.0.1) read only the legacy .xls format, so upgrading xlrd will not make it handle .xlsx files. Install openpyxl and pass engine='openpyxl' to read_excel instead.
reference - xlrd python package

Related

XLRDError: Excel xlsx file; not supported Databricks

I'm using Azure Databricks and trying to read an Excel file. I have an encrypted file with the extension .xlsx.pgp. After decrypting it, I get the content as a byte array. Here's the call I use to read it into a pandas DataFrame:
df = pd.read_excel(BytesIO(orig))
However, this is giving me the following error:
XLRDError: Excel xlsx file; not supported
Now, based on this documentation:
I have added openpyxl to the cluster and then tried to run the following:
df = pd.read_excel(BytesIO(orig),engine=`openpyxl`)
I'm getting the error:
global name 'openpyxl' is not defined
With the following command, I get:
df = pd.read_excel(BytesIO(orig),engine='openpyxl')
The error I get is:
ValueError: Unknown engine: openpyxl
How can I resolve this issue?
Thanks for all the help!
The errors suggest that the openpyxl library is not properly installed, or that it is not available to the notebook.
Install openpyxl on the cluster attached to the notebook:
Step 1: Select the cluster and click Libraries.
Step 2: Click Install New, then click PyPI. Enter the library name, openpyxl, and click Install.
Step 3: Check the status of the openpyxl library.
Step 4: Confirm that openpyxl shows as successfully installed.
Edit:
Note: the pandas version should be 1.0.1 or above.
If the pandas version is below 1.0.1, upgrade it with pip install --upgrade pandas.
Check the pandas version with pd.__version__.
For more information, refer to this answer from rama-a.
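A quick way to confirm from inside the notebook that the libraries are visible to the running interpreter is sketched below. It uses only the standard library; `is_installed` and `report` are hypothetical helper names introduced for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


def report(modules=("pandas", "openpyxl")):
    """Map each module name to whether it is importable here."""
    return {name: is_installed(name) for name in modules}


# False for openpyxl means the library is not installed in the environment
# this notebook is actually running in.
print(report())
```

If openpyxl is reported as available but read_excel still raises "Unknown engine", the pandas version is the next thing to check.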

OSError: No suitable library found for xls

I've imported pyexcel as pe and am trying to run get_array, and iget_records, on an xls file and I'm getting a strange error, surrounding which I haven't found much documentation. Error: "OSError: No suitable library found for xls."
When I run these commands on Test.csv, I get no error. But I need it to work for xls files because I'm dealing with non-English characters, which I understand don't appear in csv files.
my_array = pe.get_array(file_name="Test.xls")
print(my_array)
separately,
records = pe.iget_records(file_name="/Tests/Test.xls")
for record in records:
    print(record['alpha'], record['beta'], record['charlie'])
Ideas, anyone?
python 3.5.2;
windows 7 64 bit
To get rid of this error you need to install pyexcel-xls. With pip:
pip install pyexcel-xls
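As a side note on the CSV concern raised in the question: CSV files can carry non-English characters as long as the file is written and read with an explicit encoding. The sketch below uses only the standard library csv module rather than pyexcel; the sample rows and the helper name `roundtrip` are made up for illustration.

```python
import csv
import os
import tempfile

# Made-up sample data containing non-ASCII text.
rows = [["alpha", "beta", "charlie"], ["naïve", "日本語", "Ελληνικά"]]


def roundtrip(data):
    """Write rows to a temporary UTF-8 CSV file and read them back."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(data)
        with open(path, newline="", encoding="utf-8") as f:
            return list(csv.reader(f))
    finally:
        os.remove(path)


result = roundtrip(rows)
```

If characters appear garbled in a CSV, the usual cause is a mismatch between the encoding used for writing and the one assumed when reading.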

openpyxl hides new columns

I use openpyxl to read in an xlsx file and then make additions to it before saving it to a new file.
This used to work well on openpyxl 2.1.2, but after installing Python on a new computer with the newest version of openpyxl, it hides all the added columns.
I have tried adding:
ws.column_dimensions['A'].hidden = False
...
ws.column_dimensions['Z'].hidden = False
But that does not change anything.
Any ideas? Having to open up each spreadsheet generated just to unhide and save is kinda annoying.
openpyxl does not hide columns by default. If you think there is a problem then please submit a bug report.
Try upgrading openpyxl to version 2.3 or later. I had the same issue before, and upgrading solved it.

Import error for openpyxl

I'm very very new to coding and I'm having an issue importing openpyxl into my python program. I imagine the issue is due to where I have it saved on my computer.
I've downloaded other libraries (xlrd, xlwt, xlutils) before and just saved them in one of these directories: C:\Python27\ArcGIS10.1\Lib, C:\Python27\ArcGIS10.1\Lib\site-packages, C:\Python27\ArcGISx6410.1\Lib, or C:\Python27\ArcGISx6410.1\Lib\site-packages, and Python has been able to "see" them when I import them into a script.
I've done some trolling on the web and it looks like I may be performing the "installation" of openpyxl incorrectly. I downloaded "setuptools-5.7" in order to try to run the setup.py script contained within the openpyxl library, and so far I haven't gotten that to work out.
Since I'm so new to python, I don't really understand some of the other stuff I've been finding about how to correctly install the library, like "pip install" etc.
If anyone has any ideas about how I can install or save or locate the openpyxl library in the easiest fashion (without using other programs that I don't already have), that would be great!
Your import is probably incorrect. Python names are case-sensitive, so it needs to be:
from openpyxl import Workbook
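Since the question mentions several Python directories, it can also help to check which interpreter a script runs under and where that interpreter looks for packages. The sketch below uses only the standard library; `describe_environment` is a hypothetical helper name, and the pip command it builds is shown as a list of arguments rather than executed.

```python
import sys
import sysconfig


def describe_environment():
    """Collect the interpreter path, its site-packages directory, and the
    pip invocation that would install openpyxl for this exact interpreter."""
    return {
        "executable": sys.executable,
        "site_packages": sysconfig.get_paths()["purelib"],
        # Running pip through the interpreter avoids installing into a
        # different Python than the one the script uses.
        "pip_command": [sys.executable, "-m", "pip", "install", "openpyxl"],
    }


info = describe_environment()
for key in ("executable", "site_packages", "pip_command"):
    print(key, "=", info[key])
```

Running the printed pip command from a command prompt installs the package into the same interpreter that produced the output, assuming pip is available for it.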

xlrd library not working with xlsx files. Any way to convert xlsx to xls using Python?

I want to convert an xlsx file to xls format using Python. The reason is that I'm using the xlrd library to parse xls files, but xlrd is not able to parse xlsx files.
Switching to a different library is not feasible for me at this stage: the entire project uses xlrd, so a lot of changes would be required.
So, is there any way I can programmatically convert an xlsx file to xls using Python?
Please help.
Thank you.
If you're using Python on Windows and you have Excel installed, you could use the Python for Windows Extensions to do it. Here's a sample piece of python code that did the job for me:
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
xl.DisplayAlerts = False
wb = xl.Workbooks.Open(r"C:\PATH\TO\SOURCE_FILENAME.XLSX")
wb.SaveAs(r"C:\PATH\TO\DESTINATION_FILENAME.XLS", FileFormat = 56)
wb.Close()
xl.Quit()
I tested this using Python 2.7.2 with pywin32 build 216 and Excel 2007 on Windows 7.
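The same approach can be wrapped in a function that closes the workbook and quits Excel even if saving fails. This is a sketch under the same assumptions as the answer (Windows, Excel installed, pywin32 available); `xls_target` and `convert_to_xls` are hypothetical helper names, and 56 is the Excel constant xlExcel8 (the .xls format).

```python
from pathlib import PureWindowsPath

XL_EXCEL8 = 56  # Excel FileFormat constant xlExcel8, i.e. the legacy .xls format


def xls_target(src):
    """Return the destination path: same folder and name, .xls extension."""
    return str(PureWindowsPath(src).with_suffix(".xls"))


def convert_to_xls(src):
    """Open src in Excel and save a copy as .xls; returns the new path."""
    import win32com.client  # pywin32; only available on Windows

    dst = xls_target(src)
    xl = win32com.client.Dispatch("Excel.Application")
    xl.DisplayAlerts = False  # suppress overwrite and compatibility prompts
    try:
        wb = xl.Workbooks.Open(src)
        try:
            wb.SaveAs(dst, FileFormat=XL_EXCEL8)
        finally:
            wb.Close()
    finally:
        xl.Quit()  # always release the Excel process
    return dst
```

Excel expects absolute paths here, as in the original snippet.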
xlrd 0.9.2 can extract data from Excel spreadsheets (.xls and .xlsx, versions 2.0 onwards) on any platform.
