We're getting an Excel file from a client that has open protection and Write Reserve protection turned on. I want to remove the protection so I can open the Excel file with the python xlrd module. I've installed the pywin32 package to access the Excel file through COM, and I can open it with my program supplying the two passwords, save, and close the file with no errors. I'm using Unprotect commands as described in MSDN network, and they're not failing, but they're also not removing the protection. The saved file still requires two passwords to open it after my program is done. Here's what I have so far:
import os, sys
impdir = "\\\\xxx.x.xx.x\\allshare\\IT\\NewBusiness\\Python_Dev\\import\\"
sys.path.append(impdir)
from UsefulFunctions import *
import win32com.client
wkgdir = pjoin(nbShare, 'NorthLake\\_testing')
filename = getFilename(wkgdir, '*Collections*.xls*')
xcl = win32com.client.Dispatch('Excel.Application')
xcl.visible = True
pw_str = raw_input("Enter password: ")
try:
    wb = xcl.workbooks.open(filename, 0, False, None, pw_str, pw_str)
except Exception as e:
    print "Error:", str(e)
    sys.exit()
wb.Unprotect(pw_str)
wb.UnprotectSharing(pw_str)
wb.Save()
xcl.Quit()
Can anyone provide me the correct syntax for unprotect commands that will work?
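This function works for me:
import win32com.client

def Remove_password_xlsx(filename, pw_str):
    xcl = win32com.client.Dispatch("Excel.Application")
    # The open password is the fifth positional argument of Workbooks.Open.
    wb = xcl.Workbooks.Open(filename, False, False, None, pw_str)
    xcl.DisplayAlerts = False  # suppress the overwrite prompt
    # Empty strings for Password and WriteResPassword remove both protections.
    wb.SaveAs(filename, None, '', '')
    xcl.Quit()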
This post helped me a lot. I thought I would post what I used for my solution in case it helps someone else. Just Unprotect, DisplayAlerts=False, and Save. That made it easy for me, and the file is overwritten with a usable, unprotected file.
import os, sys
import win32com.client
def unprotect_xlsx(filename):
    xcl = win32com.client.Dispatch('Excel.Application')
    pw_str = '12345'
    wb = xcl.workbooks.open(filename)
    wb.Unprotect(pw_str)
    wb.UnprotectSharing(pw_str)
    xcl.DisplayAlerts = False
    wb.Save()
    xcl.Quit()

if __name__ == '__main__':
    filename = 'test.xlsx'
    unprotect_xlsx(filename)
You can remove sheet protection from an Excel file with the openpyxl module without knowing the password:
from openpyxl import load_workbook
sample = load_workbook(filename="sample.xlsx")
for sheet in sample:
    sheet.protection.disable()
sample.save(filename="sample.xlsx")
sample.close()
Here the "filename" parameter is the path to your Excel file; I have used a path in the local directory.
If you are on macOS (or maybe Linux; not tested), you have to install Microsoft Excel and xlwings:
pip install xlwings
Then run this:
import pandas as pd
import xlwings as xw
def _process(filename):
    wb = xw.Book(filename)
    sheet = wb.sheets[0]
    df = sheet.used_range.options(pd.DataFrame, index=False, header=True).value
    wb.close()
    return df
Resources:
Adapted from this script:
https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/
xlwings documentation: https://docs.xlwings.org/en/stable/api.html
The suggestion from @Tim Williams worked: use SaveAs and pass empty strings for the Password and WriteResPassword parameters. I passed None for the FileFormat parameter after the filename, and I used a new filename to keep Excel from prompting me to confirm overwriting the existing file. I also found that I did not need the wb.Unprotect and wb.UnprotectSharing calls with this approach.
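One way to pick the new filename mentioned above is to derive it from the original path. This is only a minimal sketch: the helper name unprotected_name, the "_unprotected" suffix and the C:\data path are hypothetical, and the commented-out SaveAs line assumes a workbook object wb that was opened through COM as in the question.

```python
from pathlib import PureWindowsPath


def unprotected_name(path, suffix="_unprotected"):
    """Return a sibling path with `suffix` inserted before the extension."""
    p = PureWindowsPath(path)
    return str(p.with_name(p.stem + suffix + p.suffix))


# Hypothetical input path, used only for illustration.
new_path = unprotected_name(r"C:\data\Collections.xls")
print(new_path)

# With a workbook already opened through COM, the save would then be:
# wb.SaveAs(new_path, None, '', '')  # empty Password and WriteResPassword
```

Saving under a different name sidesteps the overwrite prompt without having to set DisplayAlerts to False.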
Hey, I tried the solution provided by @Enoch Sit:
def Remove_password_xlsx(filename, pw_str):
    xcl = win32com.client.Dispatch("Excel.Application")
    wb = xcl.Workbooks.Open(filename, False, False, None, pw_str)
    xcl.DisplayAlerts = False
    wb.SaveAs(filename, None, '', '')
    xcl.Quit()
but got the error "NameError: name 'pw_str' is not defined".
:'(
Related
I am trying to run a VBA Macro in an xlsm workbook using python 3.7 in Spyder. This workbook has two worksheets.
The code that I have currently runs and saves the new file with no problems, however it is not triggering the VBA like it should.
I know this macro works because if I manually click the button in Excel, it works just fine.
Could someone assist with this? I checked the Macro Settings under the Trust Center and all macros are enabled so I do not think it is a permissions issue, however I am not an admin on this pc.
The code is below:
import os
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
wb = xl.Workbooks.Open(r"Z:\FolderName\FolderName2\FileName.xlsm")
xl.Application.Run("MacroName")
wb.SaveAs(r"Z:\FolderName\FolderName2\FileName1.xlsm")
wb.Close()
xl.Quit()
This can be done easily through xlwings. Once I switched to that library then I was able to quickly get this script working.
First, make sure your All.xlsm file is in your current working directory or in your User/Documents folder (sometimes it works from the Documents directory and sometimes not, so it is safer to have it in both).
Pass your macro name along with the name of the file that contains the macro. You can change parameters such as ReadOnly or SaveChanges according to your requirements.
And make sure to delete the xl object after each run.
import win32com.client
xl =win32com.client.dynamic.Dispatch('Excel.Application')
xl.Workbooks.Open(Filename='XYZ.xls', ReadOnly=1)
xl.Application.Run('All.xlsm!yourmacroname')
xl.Workbooks(1).Close(SaveChanges=1)
xl.Application.Quit()
del xl
Running Excel Macro from Python
To run an Excel macro from Python, you need almost nothing. Below is a script that does the job. The advantage of updating data from a macro inside Excel is that you see the result immediately; you don't have to save or close the workbook first. I use this method to update real-time stock quotes. It is fast and stable.
This is just an example, but you can do anything with macros inside Excel.
from os import system, path
import win32com.client as win32
from time import sleep

def isWorkbookOpen(xlPath, xlFileName):
    SeachXl = xlPath + "~$" + xlFileName
    if path.exists(SeachXl):
        return True
    else:
        return False

def xlRunMacro(macroLink):
    PathFile = macroLink[0]
    xlMacro = macroLink[1]
    isLinkReady = False
    # Create the link with the open existing workbook
    win32.pythoncom.CoInitialize()
    xl = win32.Dispatch("Excel.Application")
    try:
        wb = win32.GetObject(PathFile)
        isLinkReady = True
    except:
        NoteToAdd = 'Can not create the link with ' + PathFile
        print(NoteToAdd)
    if isLinkReady:
        # If the link with the workbook exists, then run the Excel macro
        try:
            xl.Application.Run(xlMacro)
        except:
            NoteToAdd = 'Running Excel Macro ' + xlMacro + ' failed !!!'
            print(NoteToAdd)
    del xl

def mainProgam(macroSettings):
    FullMacroLink = []
    PathFile = macroSettings[0] + macroSettings[1]
    FullMacroLink.append(PathFile)
    FullModuleSubrout = macroSettings[1] + '!' + macroSettings[2] + '.' + macroSettings[3]
    FullMacroLink.append(FullModuleSubrout)
    if isWorkbookOpen(macroSettings[0], macroSettings[1]) == False:
        # If the workbook is not open, open the workbook first.
        system(f'start excel.exe "{PathFile}"')
        # Give Excel some time to start up
        sleep(2)
    xlRunMacro(FullMacroLink)

def main():
    macroSettings = []
    # The settings below will execute the macro example
    xlPath = 'C:\\test\\'  # Change to suit your needs
    macroSettings.append(xlPath)
    workbookName = 'Example.xlsm'  # Change to suit your needs
    macroSettings.append(workbookName)
    xlModule = "Updates"  # Change to suit your needs
    macroSettings.append(xlModule)
    xlSubroutine = "UpdateCurrentTime"  # Change to suit your needs
    macroSettings.append(xlSubroutine)
    mainProgam(macroSettings)

if __name__ == "__main__":
    main()
    exit()
VBA Excel Macro
Option Explicit
Sub UpdateCurrentTime()
    Dim sht As Worksheet
    Set sht = ThisWorkbook.Sheets("Current-Time")
    With sht
        sht.Cells(2, 1).Value = Format(Now(), "hh:mm:ss")
    End With
End Sub
You can also use it as a module. Save the script above as RunExcelMacro.py in your Python project, then use the following lines:
from RunExcelMacro import mainProgam
mainProgam(macroSettings)
That will do the job. Good luck!
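The lock-file check that isWorkbookOpen performs in the script above can also be sketched with pathlib. This is a heuristic under the assumption that Excel creates a "~$"-prefixed owner file next to an open workbook; the function name is_workbook_open is hypothetical, and a stale owner file left behind by a crash would give a false positive.

```python
from pathlib import Path


def is_workbook_open(workbook_path):
    """Return True if a '~$' owner file exists next to the workbook."""
    workbook = Path(workbook_path)
    # Assumption: the owner file is named "~$" + the workbook's file name.
    lock_file = workbook.with_name("~$" + workbook.name)
    return lock_file.exists()
```

Taking a single path avoids the string concatenation of directory and file name, so a missing trailing separator cannot break the check.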
You need to reference the module name as well.
For example, here is my VBA code under Module1:
Option Explicit
Public Sub Example()
    MsgBox "Hello 0m3r"
End Sub
And here is my Python:
from win32com.client import Dispatch

def run_excel_macro():
    try:
        excel = Dispatch("Excel.Application")
        excel.Visible = True
        workbook = excel.Workbooks.Open(
            r"D:\Documents\Book1.xlsm")
        workbook.Application.Run("Module1.Example")
        workbook.SaveAs(r"D:\Documents\Book5.xlsm")
        excel.Quit()
    except IOError:
        print("Error")

if __name__ == "__main__":
    run_excel_macro()
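When several workbooks are open, the macro reference can also carry the workbook name in front of the module name. The helper below is a small sketch that only builds that string; qualified_macro is a hypothetical name, and the quoting of the workbook name follows the form Excel generally accepts for names containing spaces, which is an assumption worth checking against your Excel version.

```python
def qualified_macro(workbook_name, module, procedure):
    """Build a macro reference such as 'Book1.xlsm'!Module1.Example."""
    # Single quotes around the workbook name keep names with spaces in one piece.
    return f"'{workbook_name}'!{module}.{procedure}"


print(qualified_macro("Book1.xlsm", "Module1", "Example"))
# The result would be passed to Application.Run, for example:
# workbook.Application.Run(qualified_macro("Book1.xlsm", "Module1", "Example"))
```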
I have an Excel file that I run a Python script on. The Excel file has external data connections that need to be refreshed before the Python script is run. The functionality I'm referring to is here:
I'm using Python 2.7 and am relying on Pandas for most of the Excel data parsing.
CalculateUntilAsyncQueriesDone() will hold the program and wait until the refresh has completed.
xlapp = win32com.client.DispatchEx("Excel.Application")
wb = xlapp.Workbooks.Open(<path_to_excel_workbook>)
wb.RefreshAll()
xlapp.CalculateUntilAsyncQueriesDone()
wb.Save()
xlapp.Quit()
If you're on windows, and I believe you are given the screenshot, you can use the win32com module. It will allow you - from python - to open up Excel, load a workbook, refresh all data connections and then quit. The syntax ends up being pretty close to VBA.
I suggest you install pypiwin32 via pip (pip install pypiwin32).
import win32com.client
# Start an instance of Excel
xlapp = win32com.client.DispatchEx("Excel.Application")
# Open the workbook in said instance of Excel
wb = xlapp.workbooks.open(<path_to_excel_workbook>)
# Optional, e.g. if you want to debug
# xlapp.Visible = True
# Refresh all data connections.
wb.RefreshAll()
wb.Save()
# Quit
xlapp.Quit()
Adding this as an answer since this is the first Google link - the code in the first answer worked but has incorrect capitalization, it should be:
import win32com.client
import time
xlapp = win32com.client.DispatchEx("Excel.Application")
wb = xlapp.Workbooks.Open(<path_to_excel_workbook>)
wb.RefreshAll()
time.sleep(5)
wb.Save()
xlapp.Quit()
A small but important note: all of the code above is correct, but it can raise a permission error (Errno 13) because the workbook is only saved, not closed.
Add wb.Close() after the save; otherwise the opened Excel instance remains running in the background, and if you process 500 such files you may run into trouble.
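The close-then-quit ordering described above can be wrapped in a context manager so that it also runs when an error occurs part-way through. This is a sketch, not part of pywin32: excel_workbook is a hypothetical helper, and it only relies on the Workbooks.Open, Close and Quit calls already used in this thread.

```python
from contextlib import contextmanager


@contextmanager
def excel_workbook(app, path):
    """Open `path` in the Excel instance `app`; always close it and quit Excel."""
    wb = None
    try:
        wb = app.Workbooks.Open(path)
        yield wb
    finally:
        # Close the workbook first, then quit, so no Excel instance lingers.
        if wb is not None:
            wb.Close(SaveChanges=False)
        app.Quit()


# Usage on Windows with pywin32 installed (not executed here):
# import win32com.client
# app = win32com.client.DispatchEx("Excel.Application")
# with excel_workbook(app, path_to_workbook) as wb:
#     wb.RefreshAll()
#     wb.Save()
```

Because the workbook is saved explicitly inside the with block, Close is called with SaveChanges=False so that Excel does not prompt again on exit.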
Adding on top of what everyone else has said, I kept getting the save dialog again when the code got to the Quit line. I set the DisplayAlerts flag to false and it fixed my issue. I didn't need the sleep timer either. This is what worked for me:
xlapp = win32com.client.DispatchEx("Excel.Application")
wb = xlapp.Workbooks.Open(<path_to_excel_workbook>)
wb.RefreshAll()
xlapp.CalculateUntilAsyncQueriesDone()
xlapp.DisplayAlerts = False
wb.Save()
xlapp.Quit()
Adding another slightly changed answer as I was stumped by this and none of the solutions were working. What worked for me was enabling Xlsx.DisplayAlerts = True and Xlsx.Visible = True, then at end saving the book with book.Save() and also closing with save: book.Close(SaveChanges=True).
It's a bit cumbersome with Excel opening and closing every time (I am iterating through many Excel files), but it works, so that's good.
import win32com.client as win32
import pythoncom

def open_close_as_excel(file_path):
    try:
        pythoncom.CoInitialize()
        Xlsx = win32.DispatchEx('Excel.Application')
        Xlsx.DisplayAlerts = True
        Xlsx.Visible = True
        book = Xlsx.Workbooks.Open(file_path)
        book.RefreshAll()
        Xlsx.CalculateUntilAsyncQueriesDone()
        book.Save()
        book.Close(SaveChanges=True)
        Xlsx.Quit()
        pythoncom.CoUninitialize()
        book = None
        Xlsx = None
        del book
        del Xlsx
        print("-- Opened/Closed as Excel --")
    except Exception as e:
        print(e)
    finally:
        # RELEASES RESOURCES
        book = None
        Xlsx = None
I haven't found much on the topic of creating a password-protected Excel file using Python.
In Openpyxl, I did find a SheetProtection module using:
from openpyxl.worksheet import SheetProtection
However, the problem is I'm not sure how to use it. It's not an attribute of Workbook or Worksheet so I can't just do this:
wb = Workbook()
ws = wb.worksheets[0]
ws_encrypted = ws.SheetProtection()
ws_encrypted.password = 'test'
...
Does anyone know if such a request is even possible with Python? Thanks!
Here's a workaround I use. It generates a VBS script and calls it from within your python script.
import subprocess
from pathlib import Path

def set_password(excel_file_path, pw):
    excel_file_path = Path(excel_file_path)
    vbs_script = f"""' Save with password required upon opening
Set excel_object = CreateObject("Excel.Application")
Set workbook = excel_object.Workbooks.Open("{excel_file_path}")
excel_object.DisplayAlerts = False
excel_object.Visible = False
workbook.SaveAs "{excel_file_path}",, "{pw}"
excel_object.Application.Quit
"""
    # write
    vbs_script_path = excel_file_path.parent.joinpath("set_pw.vbs")
    with open(vbs_script_path, "w") as file:
        file.write(vbs_script)
    # execute
    subprocess.call(['cscript.exe', str(vbs_script_path)])
    # remove
    vbs_script_path.unlink()
    return None
Looking at the docs for openpyxl, I noticed there is indeed a openpyxl.worksheet.SheetProtection class. However, it seems to be already part of a worksheet object:
>>> wb = Workbook()
>>> ws = wb.worksheets[0]
>>> ws.protection
<openpyxl.worksheet.protection.SheetProtection object at 0xM3M0RY>
Checking dir(ws.protection) shows there is a method set_password that when called with a string argument does indeed seem to set a protected flag.
>>> ws.protection.set_password('test')
>>> wb.save('random.xlsx')
I opened random.xlsx in LibreOffice and the sheet was indeed protected. However, I only needed to toggle an option to turn off protection, and not enter any password, so I might be doing it wrong still...
You can use python win32com to save an excel file with a password.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
#Before saving the file set DisplayAlerts to False to suppress the warning dialog:
excel.DisplayAlerts = False
wb = excel.Workbooks.Open(your_file_name)
# refer https://learn.microsoft.com/en-us/previous-versions/office/developer/office-2007/bb214129(v=office.12)?redirectedfrom=MSDN
# FileFormat = 51 is for .xlsx extension
wb.SaveAs(your_file_name, 51, 'your password')
wb.Close()
excel.Application.Quit()
Here is a rework of Michał Zawadzki's solution that doesn't require creating and executing a separate vbs file:
def PassProtect(Path, Pass):
    from win32com.client.gencache import EnsureDispatch
    xlApp = EnsureDispatch("Excel.Application")
    xlwb = xlApp.Workbooks.Open(Path)
    xlApp.DisplayAlerts = False
    xlApp.Visible = False  # Visible belongs to the application, not the workbook
    xlwb.SaveAs(Path, Password = Pass)
    xlwb.Close()
    xlApp.Quit()

PassProtect(FullExcelWorkbookPathGoesHere, DesiredPasswordGoesHere)
If you wanted to choose a file name that's in your project's folder, you could also do:
from os.path import abspath
PassProtect(abspath(FileNameInsideProjectFolderGoesHere), DesiredPasswordGoesHere)
openpyxl is unlikely ever to provide workbook encryption. However, you can add this yourself because Excel files (xlsx format version >= 2010) are zip-archives: create a file in openpyxl and add a password to it using standard utilities.
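As a rough illustration of the zip-archive point, the container of a workbook file can be checked with the standard library alone. The sketch below rests on one assumption that goes beyond the text above: an .xlsx saved with an open password is wrapped in an OLE compound file (the container also used by legacy .xls), so it no longer looks like a zip archive. The name container_kind is hypothetical.

```python
import zipfile

# Signature bytes at the start of an OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def container_kind(path):
    """Classify a file as 'zip', 'ole' or 'unknown' from its contents."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as fh:
        if fh.read(8) == OLE_MAGIC:
            return "ole"
    return "unknown"
```

A result of "ole" for a file with an .xlsx extension is a hint, not proof, that the workbook is encrypted and needs decrypting before openpyxl can load it.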
Whenever I have the file open in Excel and run the code, I get the following error which is surprising because I thought read_excel should be a read only operation and would not require the file to be unlocked?
Traceback (most recent call last):
File "C:\Users\Public\a.py", line 53, in <module>
main()
File "C:\Users\Public\workspace\a.py", line 47, in main
blend = plStream(rootDir);
File "C:\Users\Public\workspace\a.py", line 20, in plStream
df = pd.read_excel(fPath, sheetname="linear strategy", index_col="date", parse_dates=True)
File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\pandas\io\excel.py", line 163, in read_excel
io = ExcelFile(io, engine=engine)
File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\pandas\io\excel.py", line 206, in __init__
self.book = xlrd.open_workbook(io)
File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\xlrd\__init__.py", line 394, in open_workbook
f = open(filename, "rb")
PermissionError: [Errno 13] Permission denied: '<Path to File>'
Generally, Excel imposes a lot of restrictions when opening files (you can't open the same file twice, can't open two different files with the same name, etc.).
I don't have Excel on my machine to test, but checking the docs for read_excel I noticed that it allows you to set the engine.
From the stack trace you posted, it seems the error is thrown by xlrd, which is the default engine used by pandas.
Try using one of the other ones:
Supported engines: “xlrd”, “openpyxl”, “odf”, “pyxlsb”, default “xlrd”.
For example:
df = pd.read_excel(fPath, sheetname="linear strategy", index_col="date", parse_dates=True, engine="openpyxl")
I know this is not a real answer, but you might want to submit a bug report to pandas or xlrd teams.
As a workaround I suggest making python create a copy of the original file then read from the copy. After that the code should delete the copied file. It's a bit of extra work but should work.
Example
import shutil
shutil.copy("C://Test//Test.xlsx", "C://Test//koko.xlsx")
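The copy, read, delete sequence described above can be folded into one helper. This is a minimal sketch using only the standard library; read_via_copy and its reader argument are hypothetical names, and whether the copy itself succeeds while Excel holds the file open depends on how Excel has locked it.

```python
import os
import shutil
import tempfile


def read_via_copy(src_path, reader):
    """Copy `src_path` to a temporary file, call `reader` on the copy, then delete it."""
    suffix = os.path.splitext(src_path)[1]
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed; shutil reopens the file itself
    try:
        shutil.copy2(src_path, tmp_path)
        return reader(tmp_path)
    finally:
        os.remove(tmp_path)


# With pandas, the call from the question might become (not executed here):
# df = read_via_copy(fPath, lambda p: pd.read_excel(p, index_col="date", parse_dates=True))
```

Keeping the original extension on the temporary file matters because some readers choose their parser from the file suffix.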
I would suggest using the xlwings module instead which allows for greater functionality.
Firstly, you will need to load your workbook using the following line:
If the spreadsheet is in the same folder as your python script:
import xlwings as xw
workbook = xw.Book('myfile.xls')
Alternatively:
workbook = xw.Book(r'C:\Users\...\myfile.xls')
Then, you can create your Pandas DataFrame, by specifying the sheet within your spreadsheet and the cell where your dataset begins:
df = workbook.sheets[0].range('A1').options(pd.DataFrame,
                                            header=1,
                                            index=False,
                                            expand='table').value
When specifying a sheet you can either specify a sheet by its name or by its location (i.e. first, second etc.) in the following way:
workbook.sheets[0] or workbook.sheets['sheet_name']
Lastly, you can install the xlwings module with pip: pip install xlwings
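On the Windows path passed to xw.Book above: in an ordinary Python string literal a backslash starts an escape sequence, and in Python 3 a sequence such as "\U" is a syntax error, so the path is safest written as a raw string or with doubled backslashes. The short sketch below shows that both spellings give the same string; the user name "me" in the path is a hypothetical stand-in.

```python
from pathlib import PureWindowsPath

raw = r"C:\Users\me\myfile.xls"        # raw string: backslashes are kept literally
escaped = "C:\\Users\\me\\myfile.xls"  # the same path with escaped backslashes
assert raw == escaped

p = PureWindowsPath(raw)
print(p.name)    # myfile.xls
print(p.parent)  # C:\Users\me
```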
Most likely there is no issue in your code. [If you publish the code, it will be easier to tell.]
You need to change the permissions of the directory you are using so that all users have read and write permissions.
I got this to work by first setting the working directory, then opening the file. Maybe something to do with shared drive permissions and read_excel function.
import os
import pandas as pd
os.chdir("c:\\Users\\...\\")
filepath = "...\\filename.xlsx"
sheetname = 'sheet1'
df_xls = pd.read_excel(filepath, sheet_name=sheetname, engine='openpyxl')
I fixed this error by simply closing the .xlsx file that was open.
You can set engine = 'xlrd', then you can run the code while Excel has the file open.
df = pd.read_excel(filename, sheetname, engine = 'xlrd')
You may need to pip install xlrd if you don't have it
You may also want to check if the file has a password? Alternatively you can open the file with the password required using the code below:
import sys
import win32com.client
xlApp = win32com.client.Dispatch("Excel.Application")
print "Excel library version:", xlApp.Version
filename, password = <-- enter your own filename and password
xlwb = xlApp.Workbooks.Open(filename, Password=password)
# xlwb = xlApp.Workbooks.Open(filename)
xlws = xlwb.Sheets([insert number here]) # counts from 1, not from 0
print xlws.Name
print xlws.Cells(1, 1) # that's A1
You can set engine='python' then you can run it even if the file is open
df = pd.read_excel(filename, engine = 'python')
I looked at the previous threads regarding this topic, but they have not helped solve the problem.
how to read password protected excel in python
How to open write reserved excel file in python with win32com?
I'm trying to open a password protected file in excel without any user interaction. I searched online, and found this code which uses win32com.client
When I run this, I still get the prompt to enter the password...
from xlrd import *
import win32com.client
import csv
import sys
xlApp = win32com.client.Dispatch("Excel.Application")
print "Excel library version:", xlApp.Version
filename,password = r"\\HRA\Myfile.xlsx", 'caa team'
xlwb = xlApp.Workbooks.Open(filename, Password=password)
I don't think that named parameters work in this case. So you'd have to do something like:
xlwb = xlApp.Workbooks.Open(filename, False, True, None, password)
See http://msdn.microsoft.com/en-us/library/office/ff194819.aspx for details on the Workbooks.Open method.
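If keyword arguments are not honoured, the positional order has to be right. The helper below is a sketch that builds the argument tuple in the documented order of Workbooks.Open (Filename, UpdateLinks, ReadOnly, Format, Password, WriteResPassword); open_args is a hypothetical name, and passing False for UpdateLinks simply mirrors the call shown above.

```python
def open_args(filename, password=None, write_res_password=None, read_only=False):
    """Return positional arguments for Workbooks.Open in documented order."""
    # Order: Filename, UpdateLinks, ReadOnly, Format, Password, WriteResPassword
    args = [filename, False, read_only, None, password, write_res_password]
    # Drop trailing None values so that unused optional arguments are not sent.
    while args and args[-1] is None:
        args.pop()
    return tuple(args)


# Usage with an existing Excel application object (not executed here):
# xlwb = xlApp.Workbooks.Open(*open_args(filename, password=password, read_only=True))
```

Building the tuple in one place makes it harder to put the password in the Format slot by accident, which is an easy mistake when counting positional arguments by hand.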
I recently discovered a Python library that makes this task simple.
It does not require Excel to be installed and, because it's pure Python, it's cross-platform too!
msoffcrypto-tool supports password-protected (encrypted) Microsoft Office documents, including the older XLS binary file format.
Install msoffcrypto-tool:
pip install msoffcrypto-tool
You could create an unencrypted version of the workbook from the command line:
msoffcrypto-tool Myfile.xlsx Myfile-decrypted.xlsx -p "caa team"
Or, you could use msoffcrypto-tool as a library. While you could write an unencrypted version to disk like above, you may prefer to create a decrypted in-memory file and pass this to your Python Excel library (openpyxl, xlrd, etc.).
import io
import msoffcrypto
import openpyxl
decrypted_workbook = io.BytesIO()
with open('Myfile.xlsx', 'rb') as file:
    office_file = msoffcrypto.OfficeFile(file)
    office_file.load_key(password='caa team')
    office_file.decrypt(decrypted_workbook)
# `filename` can also be a file-like object.
workbook = openpyxl.load_workbook(filename=decrypted_workbook)
If your file is small, you can probably save it as ".csv" and then read that.
It worked for me :)
The openpyxl package works if you are on a Linux system. You can secure the file by setting a password and then open the file using the same password.
For more info:
https://www.quora.com/How-do-I-open-read-password-protected-xls-or-xlsx-Excel-file-using-python-in-Linux
Thank you so much for the great answers on this topic. Trying to collate all of it. My requirement was to open a bunch of password protected excel files ( all had same password ) so that I could do some more processing on those. Please find the code below.
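import os
from tempfile import NamedTemporaryFile

import pandas as pd
import win32com.client as w3c

xlCSVWindows = 23  # value of Excel's XlFileFormat.xlCSVWindows constant
src_dir = 'C:\\users\\files\\'
xlapp = w3c.Dispatch("Excel.Application")  # assumed: not defined in the original snippet
files = os.listdir(src_dir)                # assumed: not defined in the original snippet
df_list = []
# print(len(files))
for f in files:
    # print(f)
    if '.xlsx' in f:
        xlwb = xlapp.Workbooks.Open(src_dir + f, False, True, None, 'password')
        # Reserve a temporary .csv name, then delete the empty file so Excel can create it.
        temp_f = NamedTemporaryFile(delete=False, suffix='.csv')
        temp_f.close()
        os.unlink(temp_f.name)
        xlwb.SaveAs(Filename=temp_f.name, FileFormat=xlCSVWindows)
        df = pd.read_csv(temp_f.name, encoding='Latin-1')  # Read that CSV with pandas
        df.to_excel(src_dir + 'password_removed\\' + f)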