Open a read-only Excel file using Python

I have a program (zTree) that writes an Excel file and updates it constantly. I need my Python program to read the data from the Excel file while it is being updated. The problem is that when I try to read the data using xlrd, I get the error:
peek = f.read(peeksz)
IO Error: [Errno 13] Permission denied
which comes up because Excel is in read-only mode. Is there any way to read in the data of an Excel file in read-only mode using Python?

I just tested it on Windows 7 (64-bit), and in this case it works:
import xlrd
workbook = xlrd.open_workbook('C:/User/myaccount/Book1.xls')
worksheet = workbook.sheet_by_name('Sheet1')
print(worksheet)
Could it be that you are trying to copy the file first, or that your Python is trying to put a temporary copy of the file in the Python directory? That would give the IOError.
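If the error only shows up at the moments when the other program is writing the file, retrying the open call after a short pause may be enough. Below is a minimal sketch of that idea. The helper name open_with_retry and its parameters are made up for illustration, and it assumes Python 3, where "[Errno 13] Permission denied" is raised as PermissionError.

```python
import time


def open_with_retry(opener, path, attempts=5, delay=0.5):
    """Call opener(path), retrying while the OS reports the file as locked."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return opener(path)
        except PermissionError as exc:
            # Remember the failure, wait a moment, then try again.
            last_error = exc
            time.sleep(delay)
    # Every attempt failed: re-raise the last PermissionError.
    raise last_error
```

With xlrd this would be called as open_with_retry(xlrd.open_workbook, 'C:/User/myaccount/Book1.xls'). It will not help if the writing program keeps the file locked the whole time.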

Related

Reading the excel file from python pandas given MSAT extension error

raise CompDocError(msg)
xlrd.compdoc.CompDocError: MSAT extension: accessing sector 131072 but only 22863 in file
You might be trying to open a corrupt Excel file. Assuming you're opening the file using xlrd, you could try adding the ignore_workbook_corruption=True parameter:
workbook = xlrd.open_workbook('file_name.xls', ignore_workbook_corruption=True)
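Before ignoring corruption, it may be worth checking that the file really is a legacy .xls file, since an .xlsx or a CSV renamed to .xls will also fail in xlrd's compound document parser. The sketch below only looks at the first bytes of the file. The function name sniff_excel_container is hypothetical, and the check says nothing about whether the content is intact.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound document)
ZIP_MAGIC = b"PK\x03\x04"                       # .xlsx is a ZIP archive


def sniff_excel_container(path):
    """Return 'ole2', 'zip' or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

A result of 'zip' suggests the file is really an .xlsx and should be opened with openpyxl rather than xlrd.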

Openpyxl-Made changes to excel and store it in a dataframe, how to kill the Excel without saving all the changes and avoid further recovery dialogue?

I need to open and edit my Excel file with openpyxl, store the sheet as a DataFrame, and close the file without saving any changes. Is there a way to kill Excel and disable the auto-recovery dialog that may pop up later?
The reason I'm asking is that my code worked perfectly fine in PyCharm, but after I packaged it into an .exe with PyInstaller it stopped working. The error said: "Excel cannot access the file. There are several possible reasons: the file name or path does not exist, or the file is being used by another program, or the workbook you are saving has the same name as a currently open workbook."
I assume this is because openpyxl did not really close the Excel file, and I exported it to a different folder with the same file name.
Here is my code:
import openpyxl
import pandas as pd

wb1 = openpyxl.load_workbook(my_path, keep_vba=True)
ws1 = wb1["sheet name"]
# ... make changes to ws1 here ...
ws1_df = pd.DataFrame(ws1.values)
wb1.close()
Many thanks ahead :)
You can do this the following way:
from win32com.client import Dispatch
# Start excel application
xl = Dispatch('Excel.Application')
# Open existing excel file
book = xl.Workbooks.Open('workbook.xlsx')
# Some arbitrary excel operations ...
# Close excel application without saving file
book.Close(SaveChanges=False)
xl.Quit()
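If an exception is raised between Open and Quit, the Excel process stays alive in the background and can keep the file locked. One way to guard against that is a small context manager that always calls Quit. This is only a sketch: excel_session and make_app are hypothetical names, and the application object is created by a callable you pass in, so the helper itself does not import win32com. DisplayAlerts and Quit are members of the Excel Application object; with DisplayAlerts set to False, Excel quits without prompting to save.

```python
from contextlib import contextmanager


@contextmanager
def excel_session(make_app):
    """Yield an Excel application object and always quit it afterwards."""
    app = make_app()
    # Suppress "save changes?" prompts so Quit discards unsaved edits silently.
    app.DisplayAlerts = False
    try:
        yield app
    finally:
        # Runs on normal exit and when an exception is raised in the with-block.
        app.Quit()
```

Usage would look like: with excel_session(lambda: Dispatch('Excel.Application')) as xl: followed by the workbook operations in the indented block.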

openpyxl cannot read Strict Open XML Spreadsheet format: UserWarning: File contains an invalid specification for Sheet1. This will be removed

A few of my users (all of whom use Mac) have uploaded an Excel file into my application, which then rejected it because the file appeared to be empty. After some debugging, I've determined that the file was saved in Strict Open XML Spreadsheet format, and that openpyxl (2.6.0) doesn't raise an error, but rather prints a warning to stderr.
To reproduce, open a file, add a few rows and save it in Strict Open XML Spreadsheet (*.xlsx) format.
import openpyxl
with open('excel_open_strict.xlsx', 'rb') as f:
    workbook = openpyxl.load_workbook(filename=f)
This will print the following warning, but will not throw any exception:
UserWarning: File contains an invalid specification for Sheet1. This will be removed
Furthermore, the workbook appears to have no sheets:
assert workbook.get_sheet_names() == []
I've now had three Mac users experience this issue. It seems like Mac will sometimes default to using this Strict Open XML Spreadsheet format. If this is a normal case, then openpyxl should be able to handle it. Otherwise, it would be great if openpyxl would just raise an exception. As a workaround, it seems I can do the following:
import openpyxl
with open('excel_open_strict.xlsx', 'rb') as f:
    workbook = openpyxl.load_workbook(filename=f)
    if not workbook.get_sheet_names():
        raise Exception("The Excel was saved in an incorrect format")
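Another option, since the message is issued as a UserWarning, is to have Python turn that warning into an exception for the duration of the load call. The sketch below uses only the standard warnings module. The wrapper name load_or_raise is made up, and it will also raise on any other UserWarning the loader emits.

```python
import warnings


def load_or_raise(loader, *args, **kwargs):
    """Call loader(...) with every UserWarning escalated to an exception."""
    with warnings.catch_warnings():
        # Inside this block a UserWarning is raised instead of printed.
        warnings.simplefilter("error", UserWarning)
        return loader(*args, **kwargs)
```

It would be called as load_or_raise(openpyxl.load_workbook, filename=f), wrapped in try/except UserWarning to report the bad upload.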
I had similar problems with XLSX files created using the R library openxlsx. A sample error message from a simple python program to open the file and retrieve a single value from sheet Crops:
Warning (from warnings module):
File "C:\Python38\lib\site-packages\openpyxl\reader\workbook.py", line 88
warn(msg)
UserWarning: File contains an invalid specification for Crops. This will be removed
My first, very clumsy solution:
Open with Excel
Save the file as *.xls, which triggered a warning about compatibility.
Re-save as *.xlsx
My second solution works if you only need to read the file:
Impose a read-only restriction:
wb = load_workbook(filename = 'CAF_LTAR_crops_out_0.3.xlsx', read_only=True)
The broad lesson seems to be that the XLSX file specification is not uniformly (correctly?) implemented across programming languages.
I am working on a Windows PC and had the same problem with openpyxl. I got an Excel template that was saved as Strict Open XML Spreadsheet (*.xlsx). When I tried to fill out the template, I always got the warning below for each worksheet, and when I printed the list of worksheet names it was empty ([]).
UserWarning: File contains an invalid specification for Sheetname. This will be removed
Solution
I saved the file as Excel Workbook (*.xlsx) instead of Strict Open XML Spreadsheet (*.xlsx). After that the warning was gone, the list included all worksheets, and I could fill out the template with openpyxl.
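To detect such a file up front and ask the user to re-save it, you can look inside the .xlsx archive. Strict files declare the namespace http://purl.oclc.org/ooxml/spreadsheetml/main, while ordinary workbooks use http://schemas.openxmlformats.org/spreadsheetml/2006/main. The sketch below is a rough check under one assumption: that the workbook part is stored at the conventional location xl/workbook.xml. The function name is_strict_ooxml is made up for illustration.

```python
import zipfile

STRICT_NS = b"http://purl.oclc.org/ooxml/spreadsheetml/main"


def is_strict_ooxml(path_or_file):
    """Return True if xl/workbook.xml declares the Strict spreadsheet namespace."""
    with zipfile.ZipFile(path_or_file) as archive:
        # Assumes the conventional part name used by Excel.
        workbook_xml = archive.read("xl/workbook.xml")
    return STRICT_NS in workbook_xml
```

If the archive has no xl/workbook.xml, archive.read raises KeyError, which the caller may want to treat as "not a usable workbook".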

Python; Error with Pandas to_excel() function, Permission Error [win32]

I'm reading from an excel (.xlsx) file using pandas read_excel() and trying to write back to the same file using pandas to_excel() function. For some reason with small files (20-30 rows) it works fine but when I put in a larger file (200,000 rows) it gives me a permission error.
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\...\\AppData\\Local\\Temp\\1\\openpyxl._fbk93l5'
I'm assuming the reader somehow still has its hands on the file when it attempts to overwrite back to it but I'm not sure how to resolve this. I make sure to close the file from excel before running the program.
edit:
these are my read and write functions
from time import sleep

import pandas as pd

def readData(excelFilePath):
    print("Reading data...\n")
    data = pd.read_excel(excelFilePath)
    return data

def writeData(data, excelFilePath):
    data.to_excel(excelFilePath, index=False)
    print("\nData Updated...\nProgram exiting...")
    sleep(2)
I read the data, manipulate it then write back to the same file
Any help is appreciated,
Thanks
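One defensive pattern when overwriting the file you just read is to write to a temporary file in the same folder and only swap it in once the write has succeeded. This does not explain the error above; it just keeps the original file intact if the write fails. A minimal sketch using only the standard library, where write_then_replace and the writer callback are hypothetical names:

```python
import os
import tempfile


def write_then_replace(target_path, writer):
    """Run writer(tmp_path), then replace target_path with the finished file."""
    target_path = os.path.abspath(target_path)
    suffix = os.path.splitext(target_path)[1]  # keep ".xlsx" for the writer
    fd, tmp_path = tempfile.mkstemp(suffix=suffix, dir=os.path.dirname(target_path))
    os.close(fd)  # release our handle so the writer can open the file on Windows
    try:
        writer(tmp_path)
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In writeData this would be used as write_then_replace(excelFilePath, lambda p: data.to_excel(p, index=False)).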

'[Errno 13]' Permission denied: Openpyxl and win32com conflict

I'm using win32com to run macros in Excel and openpyxl to modify cell values. While debugging, I created a simplified version of my existing code but still ran into the same error:
[Errno 13] Permission denied: 'C:\\Users\\NAME\\Desktop\\old\\Book1.xlsx'
I believe that the error is caused by the two packages (win32com and openpyxl) opening the same file and, when attempting to save/close, cannot close the instance open in the other package.
When I attempt to save/close with openpyxl before saving/closing with win32com, I run into the permission denied error. This makes sense; Openpyxl probably does not have permission to close the excel instance open through win32com. Code is below:
wb.save(r"C:\Users\NAME\Desktop\old\Book1.xlsx")
xel.Workbooks(1).Close(SaveChanges=True)
However, when I switch the order:
xel.Workbooks(1).Close(SaveChanges=True)
wb.save(r"C:\Users\NAME\Desktop\old\Book1.xlsx")
Excel attempts to save a backup file (randomly named "522FED10" or "35C0ED10", etc.) and when I press save, Excel crashes.
What's the workaround? I was thinking that you could use win32com to run the macros, save under a different filename, then use openpyxl to access that file and edit values. However, this is extremely inefficient (I'm dealing with excel files that have hundreds of thousands of rows of data). I could consider just using win32com, but that would require a revamp of a system.
Simple code:
import openpyxl as xl
import win32com.client
xel=win32com.client.Dispatch("Excel.Application")
xel.Workbooks.Open(Filename=r"C:\Users\NAME\Desktop\old\Book1.xlsx")
wb = xl.load_workbook(r"C:\Users\NAME\Desktop\old\Book1.xlsx")
ws = wb.active
xel.visible = False
xel.Cells(1,1).Value = 'Hello Excel'
ws.cell(row = 1,column = 2).value = "test"
xel.Workbooks(1).Close(SaveChanges=True)
wb.save(r"C:\Users\NAME\Desktop\old\Book1.xlsx")
Current issue
You should definitely not mix win32com and openpyxl operations.
The win32com statement xel.Workbooks.Open() loads the workbook contents into a memory space controlled by an Excel process. The openpyxl xl.load_workbook() statement on the other hand loads the workbook contents into a completely separate memory space controlled by a Python process.
Hence any subsequent win32com commands will do nothing to affect the workbook that's living inside the Python-process-controlled memory, and vice versa: any openpyxl commands will do nothing to affect the workbook that's living inside the Excel-process-controlled memory.
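The effect can be reproduced without Excel. The sketch below is only an analogy: a JSON file stands in for the workbook, two plain dicts stand in for the two memory spaces, and file locking is ignored entirely. All names in it are made up for illustration.

```python
import json
import os
import tempfile


def load(path):
    with open(path) as f:
        return json.load(f)


def save(data, path):
    with open(path, "w") as f:
        json.dump(data, f)


path = os.path.join(tempfile.mkdtemp(), "book.json")
save({"A1": None, "B1": None}, path)

excel_copy = load(path)   # stands in for xel.Workbooks.Open(...)
python_copy = load(path)  # stands in for xl.load_workbook(...)

excel_copy["A1"] = "Hello Excel"  # edit made "inside Excel"
python_copy["B1"] = "test"        # edit made "inside Python"

save(excel_copy, path)
save(python_copy, path)  # overwrites the file; the A1 edit is lost

final = load(path)
```

After both saves, final contains only the B1 edit, because each copy was modified independently and the last save wins.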
Solution
You mentioned that you have to run some excel macros. This rules out an openpyxl-only solution. My suggestion would be to use xlwings, which is in essence a powerful and user-friendly wrapper around the win32com API.
Here is a simple example of how you can execute Excel macros and manually update cell values within a single python script:
import xlwings as xw
# Start Excel app (invisibly in the background)
app = xw.App(visible=False)
# Load excel file into active Excel app
book = app.books.open(r"Book1.xlsm")
# Instruct Excel to execute the pre-existing excel macro named "CleanUpMacro"
book.macro("CleanUpMacro")()
# Instruct Excel to write a cell value in the first sheet
book.sheets["Sheet1"].range('A1').value = 42
# Save workbook and terminate Excel application
book.save()
book.close()
app.kill()
