How to open a password-protected Excel file using Python?

I looked at the previous threads regarding this topic, but they have not helped solve the problem.
how to read password protected excel in python
How to open write reserved excel file in python with win32com?
I'm trying to open a password-protected file in Excel without any user interaction. I searched online and found this code, which uses win32com.client.
When I run it, I still get the prompt to enter the password:
from xlrd import *
import win32com.client
import csv
import sys
xlApp = win32com.client.Dispatch("Excel.Application")
print("Excel library version:", xlApp.Version)
filename,password = r"\\HRA\Myfile.xlsx", 'caa team'
xlwb = xlApp.Workbooks.Open(filename, Password=password)

I don't think that named parameters work in this case. So you'd have to do something like:
xlwb = xlApp.Workbooks.Open(filename, False, True, None, password)
See http://msdn.microsoft.com/en-us/library/office/ff194819.aspx for details on the Workbooks.Open method.
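To make the positional form less error-prone, here is a minimal sketch of a helper that places values in the documented leading parameter order of Workbooks.Open (FileName, UpdateLinks, ReadOnly, Format, Password, WriteResPassword). The helper name `open_args` is hypothetical, and the commented-out usage assumes Excel and pywin32 are installed.

```python
# Leading parameters of Workbooks.Open, in their documented order.
OPEN_PARAMS = ("FileName", "UpdateLinks", "ReadOnly", "Format",
               "Password", "WriteResPassword")


def open_args(filename, **named):
    """Return positional arguments for Workbooks.Open (hypothetical helper).

    Parameters that are not supplied are passed as None, up to the last
    parameter that was supplied.
    """
    unknown = set(named) - set(OPEN_PARAMS[1:])
    if unknown:
        raise TypeError("unsupported parameter(s): " + ", ".join(sorted(unknown)))
    values = [filename] + [named.get(name) for name in OPEN_PARAMS[1:]]
    # Trim trailing parameters that were not supplied.
    last = max(i for i, name in enumerate(OPEN_PARAMS) if i == 0 or name in named)
    return tuple(values[: last + 1])


args = open_args(r"\\HRA\Myfile.xlsx", UpdateLinks=False, ReadOnly=True,
                 Password="caa team")
print(args)

# With Excel and pywin32 available, this could be used as:
# xlwb = xlApp.Workbooks.Open(*args)
```

Naming the parameters at the call site and letting the helper order them avoids counting commas by hand.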

I recently discovered a Python library that makes this task simple.
It does not require Excel to be installed and, because it's pure Python, it's cross-platform too!
msoffcrypto-tool supports password-protected (encrypted) Microsoft Office documents, including the older XLS binary file format.
Install msoffcrypto-tool:
pip install msoffcrypto-tool
You could create an unencrypted version of the workbook from the command line:
msoffcrypto-tool Myfile.xlsx Myfile-decrypted.xlsx -p "caa team"
Or, you could use msoffcrypto-tool as a library. While you could write an unencrypted version to disk as above, you may prefer to create a decrypted in-memory file and pass this to your Python Excel library (openpyxl, xlrd, etc.).
import io
import msoffcrypto
import openpyxl
decrypted_workbook = io.BytesIO()
with open('Myfile.xlsx', 'rb') as file:
    office_file = msoffcrypto.OfficeFile(file)
    office_file.load_key(password='caa team')
    office_file.decrypt(decrypted_workbook)
# `filename` can also be a file-like object.
workbook = openpyxl.load_workbook(filename=decrypted_workbook)
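Before decrypting, it can help to check whether a workbook is encrypted at all. A minimal sketch, assuming the usual file signatures: an unencrypted .xlsx is a ZIP archive (starts with `PK\x03\x04`), while a password-encrypted .xlsx is wrapped in an OLE compound file (starts with `D0 CF 11 E0 A1 B1 1A E1`). Legacy .xls files are always OLE containers, so this check says nothing about them. The function name `container_kind` and the demo file names are hypothetical.

```python
import os
import tempfile

ZIP_MAGIC = b"PK\x03\x04"
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def container_kind(path):
    """Classify a file by its leading bytes: 'zip', 'ole', or 'unknown'."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(ZIP_MAGIC):
        return "zip"   # plain .xlsx: readable without a password
    if head.startswith(OLE_MAGIC):
        return "ole"   # for an .xlsx name, this usually means encryption
    return "unknown"


# Demonstration with two throwaway files that carry only the signatures.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "plain.xlsx")
    locked = os.path.join(tmp, "locked.xlsx")
    with open(plain, "wb") as fh:
        fh.write(ZIP_MAGIC + b"rest of archive")
    with open(locked, "wb") as fh:
        fh.write(OLE_MAGIC + b"rest of container")
    kinds = (container_kind(plain), container_kind(locked))
    print(kinds)
```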

If the file is small, you can probably save it as a ".csv" file and then read that instead.
It worked for me :)

The openpyxl package works if you are using a Linux system. You can secure the file by setting a password and then open the file using the same password.
For more info:
https://www.quora.com/How-do-I-open-read-password-protected-xls-or-xlsx-Excel-file-using-python-in-Linux

Thank you so much for the great answers on this topic. I'm trying to collate all of them. My requirement was to open a bunch of password-protected Excel files (all with the same password) so that I could do some more processing on them. Please find the code below.
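import os
from tempfile import NamedTemporaryFile

import pandas as pd
import win32com.client as w3c

xlCSVWindows = 23  # Excel's XlFileFormat value for Windows CSV (0x17)
folder = 'C:\\users\\files\\'
files = os.listdir(folder)  # assumption: the original did not show how `files` was built
xlapp = w3c.Dispatch('Excel.Application')  # assumption: `xlapp` was not defined in the original
df_list = []
for f in files:
    if f.endswith('.xlsx'):
        # Positional arguments: UpdateLinks=False, ReadOnly=True, Format=None, Password
        xlwb = xlapp.Workbooks.Open(folder + f, False, True, None, 'password')
        # Reserve a unique .csv name, then delete the placeholder so Excel can create the file
        temp_f = NamedTemporaryFile(delete=False, suffix='.csv')
        temp_f.close()
        os.unlink(temp_f.name)
        xlwb.SaveAs(Filename=temp_f.name, FileFormat=xlCSVWindows)
        xlwb.Close(False)  # release the file without prompting to save
        df = pd.read_csv(temp_f.name, encoding='Latin-1')  # read that CSV with pandas
        df.to_excel(folder + 'password_removed\\' + f)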
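The create-close-unlink sequence above only serves to reserve a unique file name. A minimal alternative sketch, assuming the intermediate CSVs are not needed afterwards: build the paths inside a `tempfile.TemporaryDirectory`, which removes everything on exit. The helper name `csv_path_for` and the sample file names are hypothetical; the Excel and pandas calls are left as comments because they need Windows with Excel installed.

```python
import os
import tempfile


def csv_path_for(workbook_name, temp_dir):
    """Return a .csv path inside temp_dir that mirrors the workbook's name."""
    stem, _ext = os.path.splitext(os.path.basename(workbook_name))
    return os.path.join(temp_dir, stem + ".csv")


files = ["a.xlsx", "notes.txt", "b.xlsx"]  # hypothetical directory listing
planned = []
with tempfile.TemporaryDirectory() as temp_dir:
    for f in files:
        if not f.lower().endswith(".xlsx"):
            continue  # skip anything that is not a workbook
        target = csv_path_for(f, temp_dir)
        planned.append(os.path.basename(target))
        # xlwb.SaveAs(Filename=target, FileFormat=23)  # 23 = xlCSVWindows
        # df = pd.read_csv(target, encoding='Latin-1')
print(planned)
```

Checking the extension with `endswith` rather than `'.xlsx' in f` also avoids matching names that merely contain that text somewhere in the middle.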

Related

Creating view in browser functionality with python

I have been struggling with this problem for a while but can't seem to find a solution for it. The situation is that I need to open a file in browser and after the user closes the file the file is removed from their machine. All I have is the binary data for that file. If it matters, the binary data comes from Google Storage using the download_as_string method.
After doing some research I found that the tempfile module would suit my needs, but I can't get the tempfile to open in browser because the file only exists in memory and not on the disk. Any suggestions on how to solve this?
This is my code so far:
import tempfile
import webbrowser
# grabbing binary data earlier on
temp = tempfile.NamedTemporaryFile()
temp.name = "example.pdf"
temp.write(binary_data_obj)
temp.close()
webbrowser.open('file://' + os.path.realpath(temp.name))
When this is run, my computer gives me an error that says that the file cannot be opened since it is empty. I am on a Mac and am using Chrome if that is relevant.
You could try using a temporary directory instead:
import os
import tempfile
import webbrowser
# I used an existing pdf I had laying around as sample data
with open('c.pdf', 'rb') as fh:
    data = fh.read()
# Gives a temporary directory you have write permissions to.
# The directory and files within will be deleted when the with context exits.
with tempfile.TemporaryDirectory() as temp_dir:
    temp_file_path = os.path.join(temp_dir, 'example.pdf')
    # write a normal file within the temp directory
    with open(temp_file_path, 'wb+') as fh:
        fh.write(data)
    webbrowser.open('file://' + temp_file_path)
This worked for me on Mac OS.
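One small refinement, offered as a sketch rather than a required change: concatenating 'file://' with a path does not escape characters such as spaces. `pathlib.Path.as_uri()` builds an encoded file URL from an absolute path. The file name below is made up for illustration, and the `webbrowser.open` call is commented out so the snippet runs without launching anything.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    # A file name with a space, to show the percent-encoding.
    temp_file = Path(temp_dir).resolve() / "example file.pdf"
    temp_file.write_bytes(b"%PDF-1.4 placeholder bytes")
    url = temp_file.as_uri()
    print(url)
    # webbrowser.open(url)  # would hand the encoded URL to the default browser
```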

Python reading a password protected excel

Problem: the Python packages pandas and openpyxl can't read a password-protected Excel file.
Action: I reviewed ways to decrypt Excel files, but they still did not work.
Result: only the password input box pops up.
Using: Anaconda, Jupyter Notebook.
What to do? Read the encrypted Excel file, export it without encryption, and proceed from there.
import win32com.client
excel = win32com.client.Dispatch('Excel.Application')
workbook = excel.Workbooks.Open(r'C:\ltexSales.xlsx',False, True, None, 'password')
xlCSVWindows = 0x17
workbook.SaveAs(r'C:\ltexSales_decrypted.csv', FileFormat = xlCSVWindows,
Password = None)
workbook.Close()
import pandas as pd
df = pd.read_csv(r'C:\ltexSales_decrypted.csv')
So now you may use pandas to read it.
Extra: pay attention to the capital letters.

Save Visio Document as HTML

I'm trying to convert a lot of Visio files from .vsd to .html, but each file has a lot of pages, so I need to convert all pages to a single .html file.
Using the Python code below, I'm able to convert to PDF, but what I really need is HTML. I noticed I can use win32com.client.Dispatch("SaveAsWeb.VisSaveAsWeb"), but how to use it? Any ideas?
import sys
import win32com.client
from os.path import abspath
f = abspath(sys.argv[1])
visio = win32com.client.Dispatch("Visio.InvisibleApp")
doc = visio.Documents.Open(f)
doc.ExportAsFixedFormat(1, '{}.pdf'.format(f), 0, 0)
visio.Quit()
exit(0)
Visio cannot do that. You cannot "convert all pages into a single HTML file". You'll have a "root" file and a folder of "supporting" files.
VisSaveAsWeb is pretty well documented, no need to guess:
https://msdn.microsoft.com/en-us/vba/visio-vba/articles/vissaveasweb-object-visio-save-as-web
-- update
With Python, it turned out to be not that trivial to deal with SaveAsWeb. It seems to default to a custom (non-dispatch) interface. I don't think it's possible to deal with this using the win32com library, but comtypes seems to work (the comtypes library builds the client from the type library, i.e. it also supports "custom" interfaces):
import sys
import comtypes
from comtypes import client
from os.path import abspath
f = abspath(sys.argv[1])
visio = comtypes.client.CreateObject("Visio.InvisibleApp")
doc = visio.Documents.Open(f)
comtypes.client.GetModule("{}\\SAVASWEB.DLL".format(visio.Path))
saveAsWeb = visio.SaveAsWebObject.QueryInterface(comtypes.gen.VisSAW.IVisSaveAsWeb)
webPageSettings = saveAsWeb.WebPageSettings.QueryInterface(comtypes.gen.VisSAW.IVisWebPageSettings)
webPageSettings.TargetPath = "{}.html".format(f)
webPageSettings.QuietMode = True
saveAsWeb.AttachToVisioDoc(doc)
saveAsWeb.CreatePages()
visio.Quit()
exit(0)
Other than that, you can try "command line" interface:
http://visualsignals.typepad.co.uk/vislog/2010/03/automating-visios-save-as-web-output.html
import sys
import win32com.client
from os.path import abspath
f = abspath(sys.argv[1])
visio = win32com.client.Dispatch("Visio.InvisibleApp")
doc = visio.Documents.Open(f)
visio.Addons("SaveAsWeb").Run("/quiet=True /target={}.htm".format(f))
visio.Quit()
exit(0)
Other than that you could give a try to my visio svg-export :)

Password Protecting Excel file using Python

I haven't found much on the topic of creating a password-protected Excel file using Python.
In Openpyxl, I did find a SheetProtection module using:
from openpyxl.worksheet import SheetProtection
However, the problem is I'm not sure how to use it. It's not an attribute of Workbook or Worksheet so I can't just do this:
wb = Workbook()
ws = wb.worksheets[0]
ws_encrypted = ws.SheetProtection()
ws_encrypted.password = 'test'
...
Does anyone know if such a request is even possible with Python? Thanks!
Here's a workaround I use. It generates a VBS script and calls it from within your python script.
import subprocess
from pathlib import Path

def set_password(excel_file_path, pw):
    excel_file_path = Path(excel_file_path)
    # VBScript that reopens the workbook and saves it with an open password
    vbs_script = f"""' Save with password required upon opening
Set excel_object = CreateObject("Excel.Application")
Set workbook = excel_object.Workbooks.Open("{excel_file_path}")
excel_object.DisplayAlerts = False
excel_object.Visible = False
workbook.SaveAs "{excel_file_path}",, "{pw}"
excel_object.Application.Quit
"""
    # write
    vbs_script_path = excel_file_path.parent.joinpath("set_pw.vbs")
    with open(vbs_script_path, "w") as file:
        file.write(vbs_script)
    # execute
    subprocess.call(['cscript.exe', str(vbs_script_path)])
    # remove
    vbs_script_path.unlink()
    return None
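One caveat with generating the script through an f-string: a double quote inside the path or password would end the VBScript string literal early. In VBScript a literal double quote is written by doubling it, so a small escaping helper avoids that. This is a sketch; `vbs_string` and `build_script` are hypothetical names, the sample path and password are made up, and the script is only built here, not executed.

```python
def vbs_string(value):
    """Return value as a VBScript string literal, doubling embedded quotes."""
    return '"' + str(value).replace('"', '""') + '"'


def build_script(excel_file_path, pw):
    """Build the same VBScript as above, with escaped path and password."""
    path = vbs_string(excel_file_path)
    lines = [
        "' Save with password required upon opening",
        'Set excel_object = CreateObject("Excel.Application")',
        "Set workbook = excel_object.Workbooks.Open(" + path + ")",
        "excel_object.DisplayAlerts = False",
        "excel_object.Visible = False",
        "workbook.SaveAs " + path + ",, " + vbs_string(pw),
        "excel_object.Application.Quit",
    ]
    return "\n".join(lines) + "\n"


script = build_script(r"C:\data\book.xlsx", 'pa"ss')
print(script)
```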
Looking at the docs for openpyxl, I noticed there is indeed an openpyxl.worksheet.SheetProtection class. However, it seems to be already part of a worksheet object:
>>> wb = Workbook()
>>> ws = wb.worksheets[0]
>>> ws.protection
<openpyxl.worksheet.protection.SheetProtection object at 0xM3M0RY>
Checking dir(ws.protection) shows there is a method set_password that when called with a string argument does indeed seem to set a protected flag.
>>> ws.protection.set_password('test')
>>> wb.save('random.xlsx')
I opened random.xlsx in LibreOffice and the sheet was indeed protected. However, I only needed to toggle an option to turn off protection, and not enter any password, so I might be doing it wrong still...
You can use python win32com to save an excel file with a password.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
#Before saving the file set DisplayAlerts to False to suppress the warning dialog:
excel.DisplayAlerts = False
wb = excel.Workbooks.Open(your_file_name)
# refer https://learn.microsoft.com/en-us/previous-versions/office/developer/office-2007/bb214129(v=office.12)?redirectedfrom=MSDN
# FileFormat = 51 is for .xlsx extension
wb.SaveAs(your_file_name, 51, 'your password')
wb.Close()
excel.Application.Quit()
Here is a rework of Michał Zawadzki's solution that doesn't require creating and executing a separate vbs file:
def PassProtect(Path, Pass):
    from win32com.client.gencache import EnsureDispatch
    xlApp = EnsureDispatch("Excel.Application")
    xlApp.DisplayAlerts = False
    xlApp.Visible = False  # Visible is set on the application, not on the workbook
    xlwb = xlApp.Workbooks.Open(Path)
    xlwb.SaveAs(Path, Password=Pass)
    xlwb.Close()
    xlApp.Quit()
PassProtect(FullExcelWorkbookPathGoesHere, DesiredPasswordGoesHere)
If you wanted to choose a file name that's in your project's folder, you could also do:
from os.path import abspath
PassProtect(abspath(FileNameInsideProjectFolderGoesHere), DesiredPasswordGoesHere)
openpyxl is unlikely ever to provide workbook encryption. However, you can add this yourself because Excel files (xlsx format version >= 2010) are zip-archives: create a file in openpyxl and add a password to it using standard utilities.

Unprotect an Excel file programmatically

We're getting an Excel file from a client that has open protection and Write Reserve protection turned on. I want to remove the protection so I can open the Excel file with the python xlrd module. I've installed the pywin32 package to access the Excel file through COM, and I can open it with my program supplying the two passwords, save, and close the file with no errors. I'm using Unprotect commands as described in MSDN network, and they're not failing, but they're also not removing the protection. The saved file still requires two passwords to open it after my program is done. Here's what I have so far:
import os, sys
impdir = "\\\\xxx.x.xx.x\\allshare\\IT\\NewBusiness\\Python_Dev\\import\\"
sys.path.append(impdir)
from UsefulFunctions import *
import win32com.client
wkgdir = pjoin(nbShare, 'NorthLake\\_testing')
filename = getFilename(wkgdir, '*Collections*.xls*')
xcl = win32com.client.Dispatch('Excel.Application')
xcl.visible = True
pw_str = input("Enter password: ")
try:
    wb = xcl.workbooks.open(filename, 0, False, None, pw_str, pw_str)
except Exception as e:
    print("Error:", str(e))
    sys.exit()
wb.Unprotect(pw_str)
wb.UnprotectSharing(pw_str)
wb.Save()
xcl.Quit()
Can anyone provide me the correct syntax for unprotect commands that will work?
This function works for me
def Remove_password_xlsx(filename, pw_str):
    xcl = win32com.client.Dispatch("Excel.Application")
    wb = xcl.Workbooks.Open(filename, False, False, None, pw_str)
    xcl.DisplayAlerts = False
    wb.SaveAs(filename, None, '', '')
    xcl.Quit()
This post helped me a lot. I thought I would post what I used for my solution in case it may help someone else. Just Unprotect, DisplayAlerts=False, and Save. That made it easy for me, and the file is overwritten with a usable unprotected file.
import os, sys
import win32com.client
def unprotect_xlsx(filename):
    xcl = win32com.client.Dispatch('Excel.Application')
    pw_str = '12345'
    wb = xcl.workbooks.open(filename)
    wb.Unprotect(pw_str)
    wb.UnprotectSharing(pw_str)
    xcl.DisplayAlerts = False
    wb.Save()
    xcl.Quit()
if __name__ == '__main__':
    filename = 'test.xlsx'
    unprotect_xlsx(filename)
You can unprotect the sheets of an Excel file with the Python openpyxl module without knowing the password:
from openpyxl import load_workbook
sample = load_workbook(filename="sample.xlsx")
for sheet in sample: sheet.protection.disable()
sample.save(filename="sample.xlsx")
sample.close()
Here the "filename" parameter is the path to your Excel file; I used a path in the local directory.
If you are on macOS (or maybe Linux? not tested):
You have to install Microsoft Excel and xlwings:
pip install xlwings
Then run this:
import pandas as pd
import xlwings as xw
def _process(filename):
    wb = xw.Book(filename)
    sheet = wb.sheets[0]
    df = sheet.used_range.options(pd.DataFrame, index=False, header=True).value
    wb.close()
    return df
Resources:
Adapted from this script:
https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/
xlwings documentation: https://docs.xlwings.org/en/stable/api.html
The suggestion from @Tim Williams worked. (Use SaveAs and pass empty strings for the Password and WriteResPassword parameters.) I used 'None' for the 'format' parameter after the filename, and I used a new filename to keep Excel from prompting me to confirm overwriting the existing file. I also found that I did not need the wb.Unprotect and wb.UnprotectSharing calls using this approach.
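The SaveAs call described here relies on positional order: Filename, FileFormat, Password, WriteResPassword. A minimal sketch of how the arguments line up; `save_as_args`, the suffix, and the sample file name are hypothetical, and the actual COM call is commented out because it needs Excel and pywin32.

```python
import os


def save_as_args(source_path, suffix="_unprotected"):
    """Return positional arguments for Workbook.SaveAs that clear both passwords.

    Order: Filename, FileFormat, Password, WriteResPassword.
    A new file name is used so Excel has no existing file to overwrite.
    """
    stem, ext = os.path.splitext(source_path)
    new_name = stem + suffix + ext
    return (new_name, None, "", "")


args = save_as_args("Collections.xlsx")
print(args)

# With a workbook opened through win32com, this could be used as:
# wb.SaveAs(*args)
```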
Hey, I tried the solution provided by @Enoch Sit:
def Remove_password_xlsx(filename, pw_str):
    xcl = win32com.client.Dispatch("Excel.Application")
    wb = xcl.Workbooks.Open(filename, False, False, None, pw_str)
    xcl.DisplayAlerts = False
    wb.SaveAs(filename, None, '', '')
    xcl.Quit()
but got the error NameError: name 'pw_str' is not defined
:'(
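The NameError most likely comes from the call site rather than from the function itself: `pw_str` is only a parameter name inside `Remove_password_xlsx`, so the caller has to pass its own value. A small sketch with a stand-in function (no Excel involved; `remove_password_stub` and the sample values are hypothetical) shows the difference.

```python
def remove_password_stub(filename, pw_str):
    """Stand-in for Remove_password_xlsx: only reports what it received."""
    return "would open {} with password {!r}".format(filename, pw_str)


# Passing an undefined name fails before the function is even entered.
try:
    remove_password_stub("test.xlsx", pw_str)  # pw_str is not defined here
    outcome = "no error"
except NameError as exc:
    outcome = str(exc)
print(outcome)

# Passing the password as a value (or a variable defined beforehand) works.
result = remove_password_stub("test.xlsx", "12345")
print(result)
```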
