Python - Excel to HTML (keeping format)

I am trying to convert an Excel file to an HTML file while keeping the format of the workbook.
Using Excel, I am able to switch from xlsx to htm: File -> Save as -> Web page (*.html, *.htm)
Using Python, I always get gibberish in the resulting workbook.htm or workbook.html:
import xlwings as xw
file_path = "*.xlsx"
excel_app = xw.App(visible=False)
wb = excel_app.books.open(file_path)
wb.save("*.html")
wb.save("*.htm")
from xlsx2html import xlsx2html
xlsx2html('*xlsx', '*.htm')
xlsx2html('*xlsx', '*.html')
I have used dummy file names. I am just trying to go from the xlsx file to the htm/html file using Python while keeping the format, e.g. background colors, borders, etc.

I used to have the same problem. I also used the xlwings library and got it working by customizing it. Find the save method in xlwings/_xlwindows.py and add an entry for '.html' as follows:
def save(self, path=None):
    saved_path = self.xl.Path
    source_ext = os.path.splitext(self.name)[1] if saved_path else None
    target_ext = os.path.splitext(path)[1] if path else '.xlsx'
    if saved_path and source_ext == target_ext:
        file_format = self.xl.FileFormat
    else:
        ext_to_file_format = {'.xlsx': FileFormat.xlOpenXMLWorkbook,
                              '.xlsm': FileFormat.xlOpenXMLWorkbookMacroEnabled,
                              '.xlsb': FileFormat.xlExcel12,
                              '.xltm': FileFormat.xlOpenXMLTemplateMacroEnabled,
                              '.xltx': FileFormat.xlOpenXMLTemplateMacroEnabled,
                              '.xlam': FileFormat.xlOpenXMLAddIn,
                              '.xls': FileFormat.xlWorkbookNormal,
                              '.xlt': FileFormat.xlTemplate,
                              '.xla': FileFormat.xlAddIn,
                              '.html': FileFormat.xlHtml  # ---> add new
                              }
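If patching the installed xlwings source is not an option, a similar result may be reachable by calling Excel's own SaveAs through the book's api property. This is only a sketch, assuming Windows with Excel and xlwings installed. The helper names html_target and save_book_as_html are hypothetical, and 44 is the numeric value of Excel's xlHtml file-format constant.

```python
from pathlib import Path

XL_HTML = 44  # numeric value of Excel's XlFileFormat.xlHtml constant


def html_target(xlsx_path):
    """Return the absolute .html path that sits next to the given workbook."""
    return Path(xlsx_path).resolve().with_suffix(".html")


def save_book_as_html(xlsx_path):
    """Open a workbook in a hidden Excel instance and save it as a web page.

    Assumes Windows with Excel and xlwings installed; xlwings is imported
    lazily so the helper above can be used without it.
    """
    import xlwings as xw

    source = Path(xlsx_path).resolve()
    target = html_target(source)
    app = xw.App(visible=False)
    app.display_alerts = False  # suppress prompts such as overwrite confirmation
    try:
        book = app.books.open(str(source))
        # COM expects plain strings, so the Path objects are converted first
        book.api.SaveAs(str(target), XL_HTML)
        book.close()
    finally:
        app.quit()
    return target
```

Calling save_book_as_html("workbook.xlsx") would then be expected to write workbook.html (plus the supporting files folder Excel creates) next to the source file.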

Related

Python - Open file in default program and save with default program extension (or the like)

I'm currently trying to do the following:
Open up an .xml file that's already in spreadsheet format with Excel
Save the .xml file as .xlsx without corrupting the file
Other options that I can take via Python are:
Convert the .xml to .xlsx
Copy specific columns (A1:AC6000) to another Excel workbook
Import an XML file directly in an Excel workbook.
I failed at all of them and can't think of a different way so here I am asking for help. My latest code is here:
# importing openpyxl module
import openpyxl as xl
# opening the source excel file
file = 'C:\\Users\\ddejean\\Desktop\\HESKlogin\\Downloads\\data.xlsx'
wb1 = xl.load_workbook(file)
ws1 = wb1['Sheet1']
# opening the destination excel file
filename1 = 'C:\\Users\\ddejean\\Desktop\\HESKlogin\\Downloads\\updated.xlsx'
wb2 = xl.load_workbook(filename1)
ws2 = wb2['Sheet1']
# calculate total number of rows and
# columns in source excel file
mr = ws1.max_row
mc = ws1.max_column
# copying the cell values from source
# excel file to destination excel file
for i in range(1, mr + 1):
    for j in range(1, mc + 1):
        # reading cell value from source excel file
        c = ws1.cell(row=i, column=j)
        # writing the read value to destination excel file
        ws2.cell(row=i, column=j).value = c.value
# saving the destination excel file
wb2.save(filename1)
I also tried changing the file extension, which ultimately corrupted the file:
import os
A = r"C:\Users\ddejean\Desktop\HESKlogin\Downloads\data.xml"
pre, ext = os.path.splitext(A)
# os.rename only renames the file; it does not convert its contents
os.rename(A, pre + ".xlsx")
I tried importing the file into Excel, which was terrible since none of the data in the xml has properly named attributes to differentiate the data. I also tried calling a macro, but I get an error with each macro on my network, so I discarded that alternative.
Any assistance you can give would be much appreciated! I also think it's important to say that I'm a noob.
This works for me :)
import os
import win32com.client as win32
import requests as r
import pandas as pd
hesk = "C:\\Users\\ddejean\\Desktop\\TEST\\hesk.xml"
folder = "C:\\Users\\ddejean\\Desktop\\TEST"
output = "C:\\Users\\ddejean\\Desktop\\TEST\\output.csv"
cd = os.path.dirname(os.path.abspath(folder))
xmlfile = os.path.join(cd, hesk)
csvfile = os.path.join(cd, output)
# EXCEL COM TO SAVE EXCEL XML AS CSV
if os.path.exists(csvfile):
    os.remove(csvfile)
try:
    excel = win32.gencache.EnsureDispatch('Excel.Application')
    wb = excel.Workbooks.OpenXML(xmlfile)
    wb.SaveAs(csvfile, 6)  # 6 = xlCSV
    wb.Close(True)
except Exception as e:
    print(e)
finally:
    # RELEASES RESOURCES
    wb = None
    excel = None
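Since the question asked for an .xlsx file rather than a CSV, the same COM approach can presumably target that format by changing the file-format code. The sketch below assumes Windows with Excel and pywin32 installed. The names FILE_FORMATS, file_format_for and convert_with_excel are hypothetical, and DispatchEx is used here (instead of the EnsureDispatch call above) so that a separate Excel instance is started and can be quit safely.

```python
import os

# Excel XlFileFormat codes for the target extensions handled here
FILE_FORMATS = {".csv": 6, ".xlsx": 51}  # xlCSV, xlOpenXMLWorkbook


def file_format_for(path):
    """Look up the Excel file-format code from the target file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in FILE_FORMATS:
        raise ValueError(f"unsupported target extension: {ext!r}")
    return FILE_FORMATS[ext]


def convert_with_excel(source, target):
    """Let Excel open `source` and re-save it in the format implied by `target`.

    Assumes Windows with Excel and pywin32 installed (imported lazily here).
    """
    import win32com.client as win32

    excel = win32.DispatchEx("Excel.Application")  # separate Excel instance
    excel.DisplayAlerts = False  # avoid overwrite prompts
    wb = None
    try:
        wb = excel.Workbooks.OpenXML(os.path.abspath(source))
        wb.SaveAs(os.path.abspath(target), file_format_for(target))
    finally:
        if wb is not None:
            wb.Close(False)
        excel.Quit()
```

With the paths from the answer, convert_with_excel(hesk, r"C:\Users\ddejean\Desktop\TEST\output.xlsx") would be the intended call.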

How to run Excel VBA / Macro from Python

I am trying to run a VBA Macro in an xlsm workbook using python 3.7 in Spyder. This workbook has two worksheets.
The code that I have currently runs and saves the new file with no problems, however it is not triggering the VBA like it should.
I know this macro works because if I manually click the button in Excel, it works just fine.
Could someone assist with this? I checked the Macro Settings under the Trust Center and all macros are enabled so I do not think it is a permissions issue, however I am not an admin on this pc.
The code is below:
import os
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
wb = xl.Workbooks.Open(r"Z:\FolderName\FolderName2\FileName.xlsm")
xl.Application.Run("MacroName")
wb.SaveAs(r"Z:\FolderName\FolderName2\FileName1.xlsm")
wb.Close()
xl.Quit()
This can be done easily through xlwings. Once I switched to that library then I was able to quickly get this script working.
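The answer above does not include its xlwings script. A minimal sketch of what such a script might look like is given here, assuming Excel and xlwings are installed. The function name run_macro and its save_as parameter are hypothetical; Book.macro and Book.save are the xlwings calls it relies on.

```python
def run_macro(workbook_path, macro_name, *args, save_as=None):
    """Run a VBA macro through xlwings and return its result (if any).

    Assumes Excel and xlwings are installed; xlwings is imported lazily so
    that the function can be defined on machines without it.
    """
    import xlwings as xw

    app = xw.App(visible=False)
    try:
        book = app.books.open(workbook_path)
        # macro_name can be "MacroName" or qualified as "Module1.MacroName"
        macro = book.macro(macro_name)
        result = macro(*args)
        # save_as=None saves in place, otherwise under the new path
        book.save(save_as)
        return result
    finally:
        app.quit()
```

For the workbook in the question, the intended call would be run_macro(r"Z:\FolderName\FolderName2\FileName.xlsm", "MacroName", save_as=r"Z:\FolderName\FolderName2\FileName1.xlsm").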
First, make sure the All.xlsm file is in your current working directory or in your User/Documents folder (sometimes it works from the Documents directory and sometimes not, so it is better to have it in both).
Pass your macro name along with the name of the file that contains the macro. You can change parameters like ReadOnly or SaveChanges according to your requirements.
Make sure to delete the xl object after each run.
import win32com.client
xl = win32com.client.dynamic.Dispatch('Excel.Application')
xl.Workbooks.Open(Filename='XYZ.xls', ReadOnly=1)
xl.Application.Run('All.xlsm!yourmacroname')
xl.Workbooks(1).Close(SaveChanges=1)
xl.Application.Quit()
del xl
Running Excel Macro from Python
To run an Excel macro from Python, you need almost nothing. Below is a script that does the job. The advantage of updating data from a macro inside Excel is that you immediately see the result. You don't have to save or close the workbook first. I use this method to update real-time stock quotes. It is fast and stable.
This is just an example, but you can do anything with macros inside Excel.
from os import system, path
import win32com.client as win32
from time import sleep
def isWorkbookOpen(xlPath, xlFileName):
    # Excel keeps a "~$<file name>" lock file next to an open workbook
    SeachXl = xlPath + "~$" + xlFileName
    if path.exists(SeachXl):
        return True
    else:
        return False
def xlRunMacro(macroLink):
    PathFile = macroLink[0]
    xlMacro = macroLink[1]
    isLinkReady = False
    # Create the link with the open existing workbook
    win32.pythoncom.CoInitialize()
    xl = win32.Dispatch("Excel.Application")
    try:
        wb = win32.GetObject(PathFile)
        isLinkReady = True
    except:
        NoteToAdd = 'Can not create the link with ' + PathFile
        print(NoteToAdd)
    if isLinkReady:
        # If the link with the workbook exists, then run the Excel macro
        try:
            xl.Application.Run(xlMacro)
        except:
            NoteToAdd = 'Running Excel Macro ' + xlMacro + ' failed !!!'
            print(NoteToAdd)
    del xl
def mainProgam(macroSettings):
    FullMacroLink = []
    PathFile = macroSettings[0] + macroSettings[1]
    FullMacroLink.append(PathFile)
    FullModuleSubrout = macroSettings[1] + '!' + macroSettings[2] + '.' + macroSettings[3]
    FullMacroLink.append(FullModuleSubrout)
    if isWorkbookOpen(macroSettings[0], macroSettings[1]) == False:
        # If the workbook is not open, open the workbook first.
        system(f'start excel.exe "{PathFile}"')
        # Give some time to start up Excel
        sleep(2)
    xlRunMacro(FullMacroLink)
def main():
    macroSettings = []
    # The settings below will execute the macro example
    xlPath = r'C:\test\\'  # Change to your needs
    macroSettings.append(xlPath)
    workbookName = 'Example.xlsm'  # Change to your needs
    macroSettings.append(workbookName)
    xlModule = "Updates"  # Change to your needs
    macroSettings.append(xlModule)
    xlSubroutine = "UpdateCurrentTime"  # Change to your needs
    macroSettings.append(xlSubroutine)
    mainProgam(macroSettings)
if __name__ == "__main__":
    main()
    exit()
VBA Excel Macro
Option Explicit
Sub UpdateCurrentTime()
    Dim sht As Worksheet
    Set sht = ThisWorkbook.Sheets("Current-Time")
    With sht
        sht.Cells(2, 1).Value = Format(Now(), "hh:mm:ss")
    End With
End Sub
You can also use it as a module. Save the script above as RunExcelMacro.py in your Python project, then use the following lines:
from RunExcelMacro import mainProgam
mainProgam(macroSettings)
It will do the job. Good luck!
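The isWorkbookOpen check in the script above builds the lock-file path by string concatenation, so it only works when the folder string ends with a path separator. A small sketch of the same check using pathlib is shown here. The function name is_workbook_open is hypothetical, and it rests on the same assumption as the original: that Excel keeps a "~$<file name>" file next to a workbook while it is open.

```python
from pathlib import Path


def is_workbook_open(folder, file_name):
    """Return True if Excel's lock file for the workbook exists.

    Assumption: Excel keeps a "~$<file name>" file next to a workbook
    while it is open, the convention the original check relies on.
    """
    return (Path(folder) / f"~${file_name}").exists()
```

Because the path is joined with pathlib, both r"C:\test" and "C:\\test\\" are accepted as the folder argument on Windows.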
You need to reference the module name as well.
For example, here is my VBA code under Module1:
Option Explicit
Public Sub Example()
    MsgBox "Hello 0m3r"
End Sub
and here is my python
from win32com.client import Dispatch
def run_excel_macro():
    try:
        excel = Dispatch("Excel.Application")
        excel.Visible = True
        workbook = excel.Workbooks.Open(
            r"D:\Documents\Book1.xlsm")
        workbook.Application.Run("Module1.Example")
        workbook.SaveAs(r"D:\Documents\Book5.xlsm")
        excel.Quit()
    except IOError:
        print("Error")
if __name__ == "__main__":
    run_excel_macro()
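The answers above pass Application.Run either a module-qualified name ("Module1.Example") or a workbook-qualified one ('All.xlsm!yourmacroname'). A small helper can build the fully qualified form from a workbook path. The function name qualified_macro_name is hypothetical, and the single quotes around the workbook name are added so that names containing spaces stay intact.

```python
import ntpath  # understands both "\\" and "/" separators on any platform


def qualified_macro_name(workbook_path, macro, module=None):
    """Build a macro reference such as 'Book1.xlsm'!Module1.Example.

    The workbook name is wrapped in single quotes so that names
    containing spaces are passed to Excel as one unit.
    """
    book = ntpath.basename(workbook_path)
    target = f"{module}.{macro}" if module else macro
    return f"'{book}'!{target}"
```

The result can be handed to the call from the answer, for example workbook.Application.Run(qualified_macro_name(r"D:\Documents\Book1.xlsm", "Example", "Module1")).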

Print Excel to pdf with xlwings

I am trying to print Excel files to pdf with xlwings. I am using the excel api for this.
I have tried it in two ways:
1/ Using the PrintOut() call with PrintToFile argument:
wb.api.PrintOut(PrintToFile=True, PrToFileName="5.pdf", Preview=True)
The problem here is Excel just prints the file, ignoring my additional settings.
2/ Using ExportAsFixedFormat
wb.api.ExportAsFixedFormat(0, str(SwmId) + ".pdf")
Here Excel flashes a bit, but does not do anything in the end.
For the record: I can't use a macro and call it from Python because I have about a thousand of these Excel files, so I can't put the macro in every single one of them. A possible workaround would be to create a custom function in VBA and then call it for every file. But, honestly, it would be easier if I could just do this directly from Python, in one line of code.
Below is a self-standing code example of what worked on my machine to print an excel workbook to pdf (using the ExportAsFixedFormat method):
# Environment
# -----------
# OS: Windows 10
# Excel: 2013
# python: 3.7.4
# xlwings: 0.15.8
import os
import xlwings as xw
# Initialize new excel workbook
book = xw.Book()
sheet = book.sheets[0]
sheet.range("A1").value = "dolphins"
# Construct path for pdf file
current_work_dir = os.getcwd()
pdf_path = os.path.join(current_work_dir, "workbook_printout.pdf")
# Save excel workbook to pdf file
print(f"Saving workbook as '{pdf_path}' ...")
book.api.ExportAsFixedFormat(0, pdf_path)
# Open the created pdf file
print(f"Opening pdf file with default application ...")
os.startfile(pdf_path)
xlwings documentation recommends using xw.App():
from pathlib import Path
import xlwings as xw
import os
with xw.App() as app:
    # user will not even see the excel opening up
    app.visible = False
    book = app.books.open(path_to_excelfile)
    sheet = book.sheets[0]
    sheet.page_setup.print_area = '$A$1:$Q$66'
    sheet.range("A1").value = "experimental"
    # Construct path for pdf file
    current_work_dir = os.getcwd()
    pdf_file_name = "pdf_workbook_printout.pdf"
    pdf_path = Path(current_work_dir, pdf_file_name)
    # Save excel workbook as pdf and showing it
    sheet.to_pdf(path=pdf_path, show=True)
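Since the question mentions about a thousand workbooks, the same ExportAsFixedFormat call can be wrapped in a loop that reuses one Excel instance. This is a sketch under assumptions: the function names pdf_jobs and export_all are hypothetical, all workbooks are taken to be .xlsx files in a single folder, and Excel and xlwings must be installed for export_all to run.

```python
from pathlib import Path


def pdf_jobs(folder, out_dir):
    """Pair every .xlsx file in `folder` with a PDF path in `out_dir`."""
    out = Path(out_dir)
    return [
        (src, out / (src.stem + ".pdf"))
        for src in sorted(Path(folder).glob("*.xlsx"))
        if not src.name.startswith("~$")  # skip Excel lock files
    ]


def export_all(folder, out_dir):
    """Export each workbook to PDF, reusing a single hidden Excel instance.

    Assumes Excel and xlwings are installed (imported lazily here).
    """
    import xlwings as xw

    jobs = pdf_jobs(folder, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with xw.App(visible=False) as app:
        for src, pdf in jobs:
            book = app.books.open(str(src.resolve()))
            try:
                # 0 = xlTypePDF; COM needs a plain string path
                book.api.ExportAsFixedFormat(0, str(pdf.resolve()))
            finally:
                book.close()
    return [pdf for _, pdf in jobs]
```

Passing absolute string paths may also matter for the original attempt, since a relative name like "5.pdf" is resolved by Excel rather than by Python.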

Objects of type 'WindowsPath' can not be converted to a COM VARIANT

I have an xlsx file which is a template for a receipt. It contains images and cells. I used to go into the file manually, update the information and then export to pdf before sending to my clients. I would like to be able to convert an xlsx to pdf through python if possible.
My problem is that no tutorial shows how to simply pick an xlsx file and convert it to pdf, and I could not find a decent video tutorial either.
I've tried getting openpyxl to save it with a .pdf extension, but I knew that was a long shot. I also tried to follow an example on Stack Overflow, but it didn't work that well.
I keep getting:
File "<COMObject <unknown>>", line 5, in ExportAsFixedFormat
Objects of type 'WindowsPath' can not be converted to a COM VARIANT
and I'm pretty stuck.
#this file will open a wb and save it as another file name
#this first part opens a file from a location and makes a copy to another location
from pathlib import Path
from win32com import client
#sets filename and file
file_name = 'After Summer Bookings.xlsx'
dir_path = Path('C:/Users/BOTTL/Desktop/Business')
new_file_name = 'hello.pdf'
new_save_place = Path('C:/Users/BOTTL/Desktop/Business Python')
xlApp = client.Dispatch("Excel.Application")
books = xlApp.Workbooks.Open(dir_path / file_name)
ws = books.Worksheets[0]
ws.Visible = 1
ws.ExportAsFixedFormat(0, new_save_place / new_file_name)
I'd like it to open the xlsx file I have called After Summer Bookings.xlsx and save it as a pdf file called hello.pdf
Solved it myself :)
from pathlib import Path
from win32com import client
#sets filename and file
file_name = 'After Summer Bookings.xlsx'
dir_path = Path('C:/Users/BOTTL/Desktop/Business')
new_file_name = 'hello.pdf'
new_save_place = ('C:/Users/BOTTL/Desktop/Business Python/')
path_and_place = new_save_place + new_file_name
xlApp = client.Dispatch("Excel.Application")
books = xlApp.Workbooks.Open(dir_path / file_name)
ws = books.Worksheets[0]
ws.Visible = 1
ws.ExportAsFixedFormat(0,path_and_place)
When concatenating the location and the file name, it didn't like that I had made it a Path object. Now that I have removed Path, it works like a dream :)
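The error message suggests that the COM layer cannot convert pathlib objects, so an alternative to dropping Path altogether is to keep building paths with pathlib and convert them to str right before the COM call. A minimal sketch, where com_path is a hypothetical helper name:

```python
from pathlib import Path


def com_path(*parts):
    """Join path parts with pathlib and return a plain str for COM calls."""
    return str(Path(*parts))


# Usage with the names from the question (needs Windows, Excel and pywin32):
# books = xlApp.Workbooks.Open(com_path(dir_path, file_name))
# ws.ExportAsFixedFormat(0, com_path(new_save_place, new_file_name))
```

Converting the argument to Workbooks.Open in the same way is a precaution; the question only reports the error for ExportAsFixedFormat.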

Error while opening the Excel file after using openpyxl in Macintosh

I received the following error as I tried to open the Excel file which I have saved using openpyxl:
Excel could not open excel_test.xlsx because some content is unreadable. Do you want to open and repair this workbook?
Is there a way to tackle this issue without having to click the "repair" option? I have looked into here but it does not solve the problem.
Here is my code:
import openpyxl
path_excel = ""
workbook_test = openpyxl.load_workbook(filename = path_excel)
worksheet_test = workbook_test["Sheet1"]
for row_, cellObj_ in enumerate(worksheet_test["D"], 1):
    if row_ == 1:
        cellObj_.value = "SUM"
    else:
        cellObj_.value = "= $B${0} + $C${0}".format(row_)
workbook_test.save(filename = path_excel)
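One thing that may be worth trying, as an assumption rather than a confirmed diagnosis, is writing the formula without the space after the equals sign, in case Excel rejects the stored formula text because of it. The sketch below keeps the logic of the code above; sum_formula and fill_sum_column are hypothetical names, and openpyxl must be installed for fill_sum_column to run.

```python
def sum_formula(row):
    """Return a formula adding columns B and C of `row`, without spaces."""
    return "=$B${0}+$C${0}".format(row)


def fill_sum_column(path):
    """Write a "SUM" header and one formula per row into column D.

    Assumes openpyxl is installed (imported lazily here) and that the
    workbook has a sheet named "Sheet1", as in the question.
    """
    import openpyxl

    workbook = openpyxl.load_workbook(filename=path)
    worksheet = workbook["Sheet1"]
    for row, cell in enumerate(worksheet["D"], 1):
        cell.value = "SUM" if row == 1 else sum_formula(row)
    workbook.save(filename=path)
```

If the repair prompt still appears with this version, the cause presumably lies elsewhere in the workbook.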
