Is there a way to simplify this with same kind with os.walk or glob?
from win32com.client import Dispatch
inputwb1 = "D:/apera/Workspace/Sounding/sounding001.xlsx"
inputwb2 = "D:/apera/Workspace/Sounding/sounding002.xlsx"
Sheet = 'OUTPUT'
excel = Dispatch("Excel.Application")
source = excel.Workbooks.Open(inputwb1)
source.Worksheets(Sheet).Range('F1:H500').Copy()
source.Worksheets(Sheet).Range('I1:K500').PasteSpecial(Paste=-4163)
source = excel.Workbooks.Open(inputwb2)
source.Worksheets(Sheet).Range('F1:H500').Copy()
source.Worksheets(Sheet).Range('I1:K500').PasteSpecial(Paste=-4163)
because this thing will take so much space if I want to write hundreds of it.
You seem to have almost answered your own question. Something like this might do it:
import glob
from win32com.client import Dispatch
Sheet = 'OUTPUT'
excel = Dispatch("Excel.Application")
for filename in glob.glob("D:/apera/Workspace/Sounding/sounding*.xlsx"):
source = excel.Workbooks.Open(filename)
source.Worksheets(Sheet).Range('F1:H500').Copy()
source.Worksheets(Sheet).Range('I1:K500').PasteSpecial(Paste=-4163)
you can open multiple excel files like this (example) :
import win32com.client
import os
path = ('D:\\New Folder\\MyExcelFiles\\')
fileslist = os.listdir('D:\\New Folder\\MyExcelFiles\\')
xl = win32com.client.DispatchEx('Excel.Application')
xl.Visible = True
for i in fileslist :
xl.Workbooks.Open(path+i)
if xl.Cells.Find('2014'):
xl.Cells.Replace('2015')
xl.Save()
xl.Workbooks.Close()
else:
xl.Workbooks.Close()
Related
How can you open an xlsx file in a separate excel application then a second xls file, so that during a condition, I just want to close one of them and not both. The current code below, opens both files in the same excel application even though I am opening specifying two separate excel objects. And when call the close() function on one of the files, it closes both. My requirement is that I have to do his with dispatch from win32com. Thanks.
import time, os.path, os
from win32com.client import Dispatch
path1 = 'C:\\Todolist.xlsx'
path2 = 'C:\\Todolist2.xlsx'
xla = Dispatch("Excel.Application")
xla.DisplayAlerts = False
xla.Visible = True
xl = Dispatch("Excel.Application")
xl.DisplayAlerts = False
xl.Visible = True
wb1=xla.Workbooks.Open(Filename=path1)
wb2= xl.Workbooks.Open(Filename=path2)
time.sleep(3)
wbCloseA = xla.Workbooks.Close()
I have roughly 30 excel workbooks I need to combine into one. Each workbook has a variable number of sheets but the sheet I need to combine from each workbook is called "Output" and the format of the columns in this sheet is consistent.
I need to import the Output sheet from the first file, then append the remaining files and ignore the header row.
I have tried to do this using glob/pandas to no avail.
You could use openpyxl. Here's a sketch of the code:
from openpyxl import load_workbook
compiled_wb = load_workbook(filename = 'yourfile1.xlsx')
compiled_ws = compiled['Output']
for i in range(1, 30):
wb = load_workbook(filename = 'yourfile{}.xlsx'.format(i))
ws = wb['Output']
compiled_ws.append(ws.rows()[1:]) # ignore row 0
compiled_wb.save('compiled.xlsx')
Method shown by Clinton c. Brownley in Foundations for Analytics with Python:
execute in shell indicating the path to the folder with excel files ( make sure the argument defining all_workbooks is correct) and then followed by the excel output file as follows:
python script.py <the /path/ to/ excel folder/> < your/ final/output.xlsx>
script.py:
import pandas as pd
import sys
import os
import glob
input_path = sys.argv[1]
output_file = sys.argv[2]
all_workbooks = glob.glob(os.path.join(input_file, '*.xlsx'))
all_df = []
for workbook in all_workbooks:
all_worksheets = pd.read_excel(workbook, sheetname='Output', index_col=None)
for worksheet, data in all_worksheets.items:
all_df.append(data)
data_concatenated = pd.concat(all_df, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
data_concatenated.to_excel(writer, sheetname='concatenated_Output', index=False)
writer.save()
This will probably get down-voted because this isn't a Python answer, but honestly, I wouldn't use Python for this kind of task. I think you are far better off installing the AddIn below, and using that for the job.
https://www.rondebruin.nl/win/addins/rdbmerge.htm
Click 'Merge all files from the folder in the Files location selection' and click 'Use a Worksheet name' = 'Output', and finally, I think you want 'First cell'. Good luck!
I want to make a pdf somposed by ranges in all Excel-workbooks located in a given folder (folderwithallfiles). All workbooks will have the same structure so the range reference will be the same for all workbooks.
I thought I got it with the script below, but it does not work.
import win32com.client as win32
import glob
import os
xlfiles = sorted(glob.glob("*.xlsx"))
#print "Reading %d files..."%len(xlfiles)
cwd = "C:\\Users\\user\folderwithallfiles"
#cwd = os.getcwd()
path_to_pdf = r'C:\\Users\\user\folderwithallfiles\multitest.pdf'
excel = win32.gencache.EnsureDispatch('Excel.Application')
for xlfile in xlfiles:
wb = excel.Workbooks.Open(cwd+"\\"+xlfile)
ws = wb.Sheets('sheet 1')
ws.Range("A1:Q59").Select()
wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
Please check the below code if it works. I have written on the fly. Let me know if you find issues in it.
import pandas as pd
import numpy as np
import glob
import pdfkit as pdf
all_data = pd.DataFrame()
for f in glob.glob("filepath\file*.xlsx"):
df = pd.read_excel(f)
all_data = all_data.append(df, ignore_index=True)
all_data.to_html("filepath\all_data.html)
pdf.from_file("filepath\all_data.html", "filepath\all_data.pdf")
I havent found much of the topic of creating a password protected Excel file using Python.
In Openpyxl, I did find a SheetProtection module using:
from openpyxl.worksheet import SheetProtection
However, the problem is I'm not sure how to use it. It's not an attribute of Workbook or Worksheet so I can't just do this:
wb = Workbook()
ws = wb.worksheets[0]
ws_encrypted = ws.SheetProtection()
ws_encrypted.password = 'test'
...
Does anyone know if such a request is even possible with Python? Thanks!
Here's a workaround I use. It generates a VBS script and calls it from within your python script.
def set_password(excel_file_path, pw):
from pathlib import Path
excel_file_path = Path(excel_file_path)
vbs_script = \
f"""' Save with password required upon opening
Set excel_object = CreateObject("Excel.Application")
Set workbook = excel_object.Workbooks.Open("{excel_file_path}")
excel_object.DisplayAlerts = False
excel_object.Visible = False
workbook.SaveAs "{excel_file_path}",, "{pw}"
excel_object.Application.Quit
"""
# write
vbs_script_path = excel_file_path.parent.joinpath("set_pw.vbs")
with open(vbs_script_path, "w") as file:
file.write(vbs_script)
#execute
subprocess.call(['cscript.exe', str(vbs_script_path)])
# remove
vbs_script_path.unlink()
return None
Looking at the docs for openpyxl, I noticed there is indeed a openpyxl.worksheet.SheetProtection class. However, it seems to be already part of a worksheet object:
>>> wb = Workbook()
>>> ws = wb.worksheets[0]
>>> ws.protection
<openpyxl.worksheet.protection.SheetProtection object at 0xM3M0RY>
Checking dir(ws.protection) shows there is a method set_password that when called with a string argument does indeed seem to set a protected flag.
>>> ws.protection.set_password('test')
>>> wb.save('random.xlsx')
I opened random.xlsx in LibreOffice and the sheet was indeed protected. However, I only needed to toggle an option to turn off protection, and not enter any password, so I might be doing it wrong still...
You can use python win32com to save an excel file with a password.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
#Before saving the file set DisplayAlerts to False to suppress the warning dialog:
excel.DisplayAlerts = False
wb = excel.Workbooks.Open(your_file_name)
# refer https://learn.microsoft.com/en-us/previous-versions/office/developer/office-2007/bb214129(v=office.12)?redirectedfrom=MSDN
# FileFormat = 51 is for .xlsx extension
wb.SaveAs(your_file_name, 51, 'your password')
wb.Close()
excel.Application.Quit()
Here is a rework of MichaĆ Zawadzki's solution that doesn't require creating and executing a separate vbs file:
def PassProtect(Path, Pass):
from win32com.client.gencache import EnsureDispatch
xlApp = EnsureDispatch("Excel.Application")
xlwb = xlApp.Workbooks.Open(Path)
xlApp.DisplayAlerts = False
xlwb.Visible = False
xlwb.SaveAs(Path, Password = Pass)
xlwb.Close()
xlApp.Quit()
PassProtect(FullExcelWorkbookPathGoesHere, DesiredPasswordGoesHere)
If you wanted to choose a file name that's in your project's folder, you could also do:
from os.path import abspath
PassProtect(abspath(FileNameInsideProjectFolderGoesHere), DesiredPasswordGoesHere)
openpyxl is unlikely ever to provide workbook encryption. However, you can add this yourself because Excel files (xlsx format version >= 2010) are zip-archives: create a file in openpyxl and add a password to it using standard utilities.
We're getting an Excel file from a client that has open protection and Write Reserve protection turned on. I want to remove the protection so I can open the Excel file with the python xlrd module. I've installed the pywin32 package to access the Excel file through COM, and I can open it with my program supplying the two passwords, save, and close the file with no errors. I'm using Unprotect commands as described in MSDN network, and they're not failing, but they're also not removing the protection. The saved file still requires two passwords to open it after my program is done. Here's what I have so far:
import os, sys
impdir = "\\\\xxx.x.xx.x\\allshare\\IT\\NewBusiness\\Python_Dev\\import\\"
sys.path.append(impdir)
from UsefulFunctions import *
import win32com.client
wkgdir = pjoin(nbShare, 'NorthLake\\_testing')
filename = getFilename(wkgdir, '*Collections*.xls*')
xcl = win32com.client.Dispatch('Excel.Application')
xcl.visible = True
pw_str = raw_input("Enter password: ")
try:
wb = xcl.workbooks.open(filename, 0, False, None, pw_str, pw_str)
except Exception as e:
print "Error:", str(e)
sys.exit()
wb.Unprotect(pw_str)
wb.UnprotectSharing(pw_str)
wb.Save()
xcl.Quit()
Can anyone provide me the correct syntax for unprotect commands that will work?
This function works for me
def Remove_password_xlsx(filename, pw_str):
xcl = win32com.client.Dispatch("Excel.Application")
wb = xcl.Workbooks.Open(filename, False, False, None, pw_str)
xcl.DisplayAlerts = False
wb.SaveAs(filename, None, '', '')
xcl.Quit()
This post helped me a lot. I thought I would post what I used for my solution in case it may help someone else. Just Unprotect, DisaplyAlerts=False, and Save. Made it easy for me and the file is overwritten with a usable unprotected file.
import os, sys
import win32com.client
def unprotect_xlsx(filename):
xcl = win32com.client.Dispatch('Excel.Application')
pw_str = '12345'
wb = xcl.workbooks.open(filename)
wb.Unprotect(pw_str)
wb.UnprotectSharing(pw_str)
xcl.DisplayAlerts = False
wb.Save()
xcl.Quit()
if __name__ == '__main__':
filename = 'test.xlsx'
unprotect_xlsx(filename)
you can unprotect excel file sheets with python openpyxl module without knowing the password:
from openpyxl import load_workbook
sample = load_workbook(filename="sample.xlsx")
for sheet in sample: sheet.protection.disable()
sample.save(filename="sample.xlsx")
sample.close()
where parameter "filename" is the path of your excel file which in here i have used local dir path.
if you are on MacOS (or maybe Linux? not tested)
You have to install Microsoft Excel and xlwings
pip install xlwings
Then run this:
import pandas as pd
import xlwings as xw
def _process(filename):
wb = xw.Book(filename)
sheet = wb.sheets[0]
df = sheet.used_range.options(pd.DataFrame, index=False, header=True).value
wb.close()
return df
Resources:
Adapted from this script:
https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/
xlwings documentation: https://docs.xlwings.org/en/stable/api.html
The suggestion from #Tim Williams worked. (Use SaveAs and pass empty strings for the Password and WriteResPassword parameters.) I used 'None' for the 'format' parameter after filename, and I used a new filename to keep Excel from prompting me asking if OK to overwrite the existing file. I also found that I did not need the wb.Unprotect and wb.UnprotectSharing calls using this approach.
Hey I tried the solution provided by #Enoch Sit
def Remove_password_xlsx(filename, pw_str):
xcl = win32com.client.Dispatch("Excel.Application")
wb = xcl.Workbooks.Open(filename, False, False, None, pw_str)
xcl.DisplayAlerts = False
wb.SaveAs(filename, None, '', '')
xcl.Quit()
but got the error NameError: name 'pw_str' is not defined
:'(