I'm trying to map all VBA codes I've in some excel in the office.
In my job, we have more than two hundred excel files, with a lot of macro in each. How can I extract the code text of some module?
I've see something like
workbook = excel.Workbooks.Open("{}{}.xlsm".format(path, file), True, True)
for i in workbook.VBProject.VBComponents:
print(i.Name)
Return
Plan1
Plan2
Main
Plan5
Plan3
Plan4
Sheet1
How can I get the VBA code in these Modules?
The solution could be in VBA or in Python
Perhaps this is close to what you are looking for?
Sub GEN_USE_Export_all_modules_from_Project()
' https://www.ozgrid.com/forum/forum/help-forums/excel-general/60787-export-all-modules-in-current-project
' reference to extensibility library
Dim tday As String
tday = Date
Dim objMyProj As VBProject
Dim objVBComp As VBComponent
Dim destFolder as String
destFolder = "C:\Users\WHATEVER\Documents\Business\EXCEL NOTES\"
Set objMyProj = Application.VBE.ActiveVBProject
tday = WorksheetFunction.Substitute(tday, "/", "-")
For Each objVBComp In objMyProj.VBComponents
If objVBComp.Type = vbext_ct_StdModule Then
objVBComp.Export destFolder & tday & " - " & objVBComp.name & ".bas"
End If
Next
MsgBox ("Saved to " & destFolder)
End Sub
This will loop through your VBAProject (where this macro itself is stored), and will save the modules in a .bas file which you can open and go through.
Edit: You can replace .bas with .txt and it works, FYI.
I've found a solution:
import win32com.client
import pandas as pd
excel = win32com.client.Dispatch("Excel.Application")
workbook = excel.Workbooks.Open("{}{}.xlsm".format(path, file), True, True)
dict_modules = {}
for i in workbook.VBProject.VBComponents:
name = i.name
lines = workbook.VBProject.VBComponents(name).CodeModule.CountOfLines
# To jump empty modules
if lines == 0:
pass
else:
text = workbook.VBProject.VBComponents(name).CodeModule.Lines(1,lines)
dict_modules[name] = [text]
df = pd.DataFrame(dict_modules)
The DataFrame are returned with the modules name like the head of the table.
To look each text I found
# To get the full text
module_name = df["module_name"][0]
#To get by line
module_text_by_line = module_name.splitlines()
Thanks who tried help me.
Related
I have a number of HTML files that I need to open up or import into a single Excel Workbook and simply save the Workbook. Each HTML file should be on its own Worksheet inside the Workbook.
My existing code does not work and it crashes on the workbook.Open(html) line and probably will on following lines. I can't find anything searching the web specific to this topic.
import win32com.client as win32
import pathlib as path
def save_html_files_to_worksheets(read_directory):
read_path = path.Path(read_directory)
save_path = read_path.joinpath('Single_Workbook_Containing_HTML_Files.xlsx')
excel_app = win32.gencache.EnsureDispatch('Excel.Application')
workbook = excel_app.Workbooks.Add() # create a new excel workbook
indx = 1 # used to add new worksheets dependent on number of html files
for html in read_path.glob('*.html'): # loop through directory getting html files
workbook.Open(html) # open the html in the newly created workbook - this doesn't work though
worksheet = workbook.Worksheets(indx) # each iteration in loop add new worksheet
worksheet.Name = 'Test' + str(indx) # name added worksheets
indx += 1
workbook.SaveAs(str(save_path), 51) # win32com requires string like path, 51 is xlsx extension
excel_app.Application.Quit()
save_html_files_to_worksheets(r'C:\Users\<UserName>\Desktop\HTML_FOLDER')
The following code does half of want I want, if this helps. It will convert each HTML file into a separate Excel file. I need each HTML file in one Excel file with multiple WorkSheets.
import win32com.client as win32
import pathlib as path
def save_as_xlsx(read_directory):
read_path = path.Path(read_directory)
excel_app = win32.gencache.EnsureDispatch('Excel.Application')
for html in read_path.glob('*.html'):
save_path = read_path.joinpath(html.stem + '.xlsx')
wb = excel_app.Workbooks.Open(html)
wb.SaveAs(str(save_path), 51)
excel_app.Application.Quit()
save_as_xlsx(r'C:\Users\<UserName>\Desktop\HTML_FOLDER')
Here is a link to a sample HTML file you can use, the data in the file is not real: HTML Download Link
One solution would be to open the HTML file into a temporary workbook, and copy the sheet from there into the workbook containing all of them:
workbook = excel_app.Application.Workbooks.Add()
sheet = workbook.Sheets(1)
for path in read_path.glob('*.html'):
workbook_tmp = excel_app.Application.Workbooks.Open(path)
workbook_tmp.Sheets(1).Copy(Before=sheet)
workbook_tmp.Close()
# Remove the redundant 'Sheet1'
excel_app.Application.ShowAlerts = False
sheet.Delete()
excel_app.Application.ShowAlerts = True
I believe pandas will make your job much easier.
pip install pandas
Here's an example on how to get multiple tables from a wikipedia html and input it into a Pandas DataFrame and save it to disk.
import pandas as pd
url = "https://en.wikipedia.org/wiki/List_of_American_films_of_2017"
wikitables = pd.read_html(url, header=0, attrs={"class":"wikitable"})
for idx,df in enumerate(wikitables):
df.to_csv('{}.csv'.format(idx),index=False)
For your use case, something like this should work:
import pathlib as path
import pandas as pd
def save_as_xlsx(read_directory):
read_path = path.Path(read_directory)
for html in read_path.glob('*.html'):
save_path = read_path.joinpath(html.stem + '.xlsx')
dfs_from_html = pd.read_html(html, header=0,)
for idx, df in enumerate(dfs_from_html):
df.to_excel('{}.xlsx'.format(idx),index=False)
** Make sure to set the correct html attribute in the pd.read_html function.
How about this?
Sub From_XML_To_XL()
'UpdatebyKutoolsforExcel20151214
Dim xWb As Workbook
Dim xSWb As Workbook
Dim xStrPath As String
Dim xFileDialog As FileDialog
Dim xFile As String
Dim xCount As Long
On Error GoTo ErrHandler
Set xFileDialog = Application.FileDialog(msoFileDialogFolderPicker)
xFileDialog.AllowMultiSelect = False
xFileDialog.Title = "Select a folder [Kutools for Excel]"
If xFileDialog.Show = -1 Then
xStrPath = xFileDialog.SelectedItems(1)
End If
If xStrPath = "" Then Exit Sub
Application.ScreenUpdating = False
Set xSWb = ThisWorkbook
xCount = 1
xFile = Dir(xStrPath & "\*.xml")
Do While xFile <> ""
Set xWb = Workbooks.OpenXML(xStrPath & "\" & xFile)
xWb.Sheets(1).UsedRange.Copy xSWb.Sheets(1).Cells(xCount, 1)
xWb.Close False
xCount = xSWb.Sheets(1).UsedRange.Rows.Count + 2
xFile = Dir()
Loop
Application.ScreenUpdating = True
xSWb.Save
Exit Sub
ErrHandler:
MsgBox "no files xml", , "Kutools for Excel"
End Sub
Suppose I have an excel file excel_file.xlsx and i want to send it to my printer using Python so I use:
import os
os.startfile('path/to/file','print')
My problem is that this only prints the first sheet of the excel workbook but i want all the sheets printed. Is there any way to print the entire workbook?
Also, I used Openpyxl to create the file, but it doesn't seem to have any option to select the number of sheets for printing.
Any help would be greatly appreciated.
from xlrd import open_workbook
from openpyxl.reader.excel import load_workbook
import os
import shutil
path_to_workbook = "/Users/username/path/sheet.xlsx"
worksheets_folder = "/Users/username/path/worksheets/"
workbook = open_workbook(path_to_workbook)
def main():
all_sheet_names = []
for s in workbook.sheets():
all_sheet_names.append(s.name)
for sheet in workbook.sheets():
if not os.path.exists("worksheets"):
os.makedirs("worksheets")
working_sheet = sheet.name
path_to_new_workbook = worksheets_folder + '{}.xlsx'.format(sheet.name)
shutil.copyfile(path_to_workbook, path_to_new_workbook)
nwb = load_workbook(path_to_new_workbook)
print "working_sheet = " + working_sheet
for name in all_sheet_names:
if name != working_sheet:
nwb.remove_sheet(nwb.get_sheet_by_name(name))
nwb.save(path_to_new_workbook)
ws_files = get_file_names(worksheets_folder, ".xlsx")
# Uncomment print command
for f in xrange(0, len(ws_files)):
path_to_file = worksheets_folder + ws_files[f]
# os.startfile(path_to_file, 'print')
print 'PRINT: ' + path_to_file
# remove worksheets folder
shutil.rmtree(worksheets_folder)
def get_file_names(folder, extension):
names = []
for file_name in os.listdir(folder):
if file_name.endswith(extension):
names.append(file_name)
return names
if __name__ == '__main__':
main()
probably not the best approach, but it should work.
As a workaround you can create separate .xlsx files where each has only one spreadsheet and then print them with os.startfile(path_to_file, 'print')
I have had this issue(on windows) and it was solved by using pywin32 module and this code block(in line 5 you can specify the sheets you want to print.)
import win32com.client
o = win32com.client.Dispatch('Excel.Application')
o.visible = True
wb = o.Workbooks.Open('/Users/1/Desktop/Sample.xlsx')
ws = wb.Worksheets([1 ,2 ,3])
ws.printout()
you could embed vBa on open() command to print the excel file to a default printer using xlsxwriter's utility mentioned in this article:
PBPYthon's Embed vBA in Excel
Turns out, the problem was with Microsoft Excel,
os.startfile just sends the file to the system's default app used to open those file types. I just had to change the default to another app (WPS Office in my case) and the problem was solved.
Seems like you should be able to just loop through and change which page is active. I tried this and it did print out every sheet, BUT for whatever reason on the first print it grouped together two sheets, so it gave me one duplicate page for each workbook.
wb = op.load_workbook(filepath)
for sheet in wb.sheetnames:
sel_sheet = wb[sheet]
# find the max row and max column in the sheet
max_row = sel_sheet.max_row
max_column = sel_sheet.max_column
# identify the sheets that have some data in them
if (max_row > 1) & (max_column > 1):
# Creating new file for each sheet
sheet_names = wb.sheetnames
wb.active = sheet_names.index(sheet)
wb.save(filepath)
os.startfile(filepath, "print")
I am new to use of win32com module . Following code i found on web for generating excel :
import win32com.client as win32
class generate_excel:
def excel(self):
excel = win32.gencache.EnsureDispatch('Excel.Application')
excel.Visible = True
wb = excel.Workbooks.Add()
ws = wb.Worksheets('Sheet1')
ws.Name = 'Share'
ws.Range(ws.Cells(2,2),ws.Cells(2,3)).Value = [1,70]
ws.Range(ws.Cells(3,2),ws.Cells(3,3)).Value = [2,90]
ws.Range(ws.Cells(4,2),ws.Cells(4,3)).Value = [3,92]
ws.Range(ws.Cells(5,2),ws.Cells(5,3)).Value = [4,95]
ws.Range(ws.Cells(6,2),ws.Cells(6,3)).Value = [5,98]
wb.SaveAs('hi.xlsx')
excel.Application.Quit()
d=generate_excel()
d.excel()
In this code i want to add a VB Script to draw a pie chart before the excel is closed . The VB Script is as follows :
Sub Macro2()
'
' Macro2 Macro
'
'
ActiveSheet.Shapes.AddChart.Select
ActiveChart.ChartType = xlPie
ActiveChart.SetSourceData Source:=Range("C2:C6")
End Sub
Please tell how to embed this in my Python Script .
Bit late and you've probably already found a solution but hopefully this helps someone else.
from win32com.client import DispatchEx
# Store vba code in string
vbaCode = """
Sub Macro2()
'
' Macro2 Macro
'
'
ActiveSheet.Shapes.AddChart.Select
ActiveChart.ChartType = xlPie
ActiveChart.SetSourceData Source:=Range("C2:C6")
End Sub
"""
# Init Excel and open workbook
xl = DispatchEx("Excel.Application")
wb = xl.Workbooks.Add(path_to_workbook)
# Create a new Module and insert the macro code
mod = wb.VBProject.VBComponents.Add(1)
mod.CodeModule.AddFromString(vbaCode)
# Run the new Macro
xl.Run("Macro2")
# Save the workbook and close Excel
wb.SaveAs(path_to_file, FileFormat=52)
xl.Quit()
I'm using the following code to run an Excel macro from Python:
import pymysql
import datetime
import csv
import math
import os
import glob
import sys
import win32com.client
import numpy
from tkinter import *
from tkinter import ttk
import tkinter.messagebox
def run_macro():
print('macro')
#this if is here because if an executable is created, __file__ doesn't work
if getattr(sys, 'frozen', False):
name = (os.path.dirname(sys.executable) + '\\Forecast template.xlsm')
else:
name = str(os.path.dirname(os.path.realpath(__file__)) + '\\Forecast template.xlsm')
print(name)
#this part runs the macro from excel
if os.path.exists(name):
xl=win32com.client.Dispatch("Excel.Application")
xl.Workbooks.Open(Filename=name, ReadOnly=1)
xl.Application.Run("ThisWorkbook.LoopFilesInFolder")
xl.Application.Quit() # Comment this out if your excel script closes
del xl
print('File refreshed!')
I seem to be be having a certain issue with this, after running this, I go to open any excel file and I only get a grey window:
Any idea of why this happens? Also how do I add to the code something to just open a file in Excel? (not to get the information, but to just open that file in Excel)
Extra question: How do I get this not to close all open Excel files?
EDIT: I just checked the macro, and that works just fine, the problem seems to come just from when I run the code.
NEW EDIT:
This is the code from the macro:
Sub LoopFilesInFolder()
Dim wb1 As Workbook
Dim wb2 As Workbook
Dim path As String
Dim file As String
Dim extension As String
Dim myFileName As String
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Set wb1 = ActiveWorkbook
path = ActiveWorkbook.path & "\csvs\"
extension = "*.csv"
file = Dir(path & extension)
Do While file <> ""
Set wb2 = Workbooks.Open(Filename:=path & file)
wb2.Activate
'this section is for the avail heads file, basically it just opens it and copies the info to the template
If wb2.Name = "avail_heads.csv" Then
Range("A1").Select
Range(Selection, Selection.End(xlDown)).Select
Range(Selection, Selection.End(xlToRight)).Select
Selection.Copy
wb1.Activate
Worksheets("raw data").Range("B88").PasteSpecial xlPasteValues
End If
'this section is for the forecast file, basically it just opens it and copies the info to the template
If wb2.Name = "forecast.csv" Then
Range("A1").Select
Range(Selection, Selection.End(xlDown)).Select
Range(Selection, Selection.End(xlToRight)).Select
Selection.Copy
wb1.Activate
Worksheets("raw data").Range("B74").PasteSpecial xlPasteValues
End If
'this section is for the income file, basically it just opens it and copies the info to the template
If wb2.Name = "income volume.csv" Then
Range("A1").Select
Range(Selection, Selection.End(xlDown)).Select
Range(Selection, Selection.End(xlToRight)).Select
Selection.Copy
wb1.Activate
Worksheets("raw data").Range("B3").PasteSpecial xlPasteValues
End If
'this section is for the outgoing volume file, basically it just opens it and copies the info to the template
If wb2.Name = "outgoing_volume.csv" Then
Range("A1").Select
Range(Selection, Selection.End(xlDown)).Select
Range(Selection, Selection.End(xlToRight)).Select
Selection.Copy
wb1.Activate
Worksheets("raw data").Range("B36").PasteSpecial xlPasteValues
End If
'this section is for the required heads file, basically it just opens it and copies the info to the template
If wb2.Name = "required_heads.csv" Then
Range("A1").Select
Range(Selection, Selection.End(xlDown)).Select
Range(Selection, Selection.End(xlToRight)).Select
Selection.Copy
wb1.Activate
Worksheets("raw data").Range("B102").PasteSpecial xlPasteValues
End If
wb2.Close
file = Dir
Loop
'myFileName = ActiveWorkbook.path & "\forecast_for_w" & Format(Now, "ww") + 1
myFileName = ActiveWorkbook.path & "\yoda_forecast"
ActiveWorkbook.SaveAs Filename:=myFileName, FileFormat:=xlWorkbookNormal
'MsgBox "Done!"
Application.DisplayAlerts = True
End Sub
I have had this on bounty for almost a week and I really don know how many people have had this issue before. After tearing my hair out for over a week and considering multiple times to jump out the office window, I figured what the problem was.
Thee problem is not in python (sort of) but mostly on the VBA code, I just added
Application.ScreenUpdating = True
At the very end and the problem stopped. Not sure if an excel bug for not updating when the macro is done or a python bug for not allowing the screen to update once the macro has finished. However after doing that everything is fine now.
Thanks!!
You need to set the Visibility to True after dispatching:
xl = win32com.client.Dispatch("Excel.Application")
xl.Application.Visible = True
Also, calling the macro shouldn't need the ThisWorkbook:
xl.Application.Run("LoopFilesInFolder")
To be able to close a workbook again, you need to assign it to a variable:
wb = xl.Workbooks.Open(Filename=name, ReadOnly=1)
wb.Close(SaveChanges=False)
I wrote the package xlwings to make things easier:
from xlwings import Workbook, Range
wb = Workbook(r'C:\path\to\workbook.xlsx') # open a workbook if not open yet
Range('A1').value = 123 # Write a value to cell A1
wb.close() # close the workbook again
Running a Macro hasn't been implemented yet, but there's an issue open for that. In the meantime, you can workaround this by doing (following from above):
wb.xl_app.Run("macro_name")
Running Excel Macro from Python
To Run a Excel Marcro from python, You don't need almost nothing. Below a script that does the job. The advantage of Updating data from a macro inside Excel is that you immediatly see the result. You don't have to save or close the workbook first. I use this methode to update real-time stock quotes. It is fast and stable.
This is just an example, but you can do anything with macros inside Excel.
from os import system, path
import win32com.client as win32
from time import sleep
def isWorkbookOpen(xlPath, xlFileName):
SeachXl = xlPath + "~$" + xlFileName
if path.exists(SeachXl):
return True
else:
return False
def xlRunMacro(macroLink):
PathFile = macroLink[0]
xlMacro = macroLink[1]
isLinkReady = False
# Create the link with the open existing workbook
win32.pythoncom.CoInitialize()
xl = win32.Dispatch("Excel.Application")
try:
wb = win32.GetObject(PathFile)
isLinkReady = True
except:
NoteToAdd = 'Can not create the link with ' + PathFile
print(NoteToAdd)
if isLinkReady:
# If the link with the workbook exist, then run the Excel macro
try:
xl.Application.Run(xlMacro)
except:
NoteToAdd = 'Running Excel Macro ' + xlMacro + ' failed !!!'
print(NoteToAdd)
del xl
def mainProgam(macroSettings):
FullMacroLink = []
PathFile = macroSettings[0] + macroSettings[1]
FullMacroLink.append(PathFile)
FullModuleSubrout = macroSettings[1] + '!' + macroSettings[2] + '.' + macroSettings[3]
FullMacroLink.append(FullModuleSubrout)
if isWorkbookOpen(macroSettings[0], macroSettings[1]) == False:
# If the workbook is not open, Open workbook first.
system(f'start excel.exe "{PathFile}"')
# Give some time to start up Excel
sleep(2)
xlRunMacro(FullMacroLink)
def main():
macroSettings = []
# The settings below will execute the macro example
xlPath = r'C:\test\\' # Change add your needs
macroSettings.append(xlPath)
workbookName = 'Example.xlsm' # Change add your needs
macroSettings.append(workbookName)
xlModule = "Updates" # Change add your needs
macroSettings.append(xlModule)
xlSubroutine = "UpdateCurrentTime" # Change add your needs
macroSettings.append(xlSubroutine)
mainProgam(macroSettings)
if __name__ == "__main__":
main()
exit()
VBA Excel Macro
Option Explicit
Sub UpdateCurrentTime()
Dim sht As Worksheet
Set sht = ThisWorkbook.Sheets("Current-Time")
With sht
sht.Cells(2, 1).Value = Format(Now(), "hh:mm:ss")
End With
End Sub
You can use it also as a dynamic module too. Save the module above as RunExcelMacro.py in Your python project. After just use the following lines:
from RunExcelMacro import mainProgam
mainProgram(macroSettings)
It will do the job, succes ...
I am developing this app that loads an excel sheet and allows users to run macros and tag cells. I ran in to a problem trying to read Macros from the Excel sheet. I am using win32com module for all the excel access. So far it has been good to me. But am struggling trying to create code that will provide me with all the macros. I have the corresponding VBA code that does provide with me the macros but I have so far been unsuccessful trying to translate it in to python. I want to read all the Macros using the Python code.
heres the VBA code:
Sub ListMacros()
Const vbext_pk_Proc = 0
Dim VBComp As Object
Dim VBCodeMod As Object
Dim oListsheet As Object
Dim StartLine As Long
Dim ProcName As String
Dim iCount As Integer
Application.ScreenUpdating = False
On Error Resume Next
Set oListsheet = ActiveWorkbook.Worksheets.Add
iCount = 1
oListsheet.Range("A1").Value = "Macro"
For Each VBComp In ThisWorkbook.VBProject.VBComponents
Set VBCodeMod = ThisWorkbook.VBProject.VBComponents(VBComp.Name).CodeModule
With VBCodeMod
StartLine = .CountOfDeclarationLines + 1
Do Until StartLine >= .CountOfLines
oListsheet.[a1].Offset(iCount, 0).Value = _
.ProcOfLine(StartLine, vbext_pk_Proc)
iCount = iCount + 1
StartLine = StartLine + _
.ProcCountLines(.ProcOfLine(StartLine, _
vbext_pk_Proc), vbext_pk_Proc)
Loop
End With
Set VBCodeMod = Nothing
Next VBComp
Application.ScreenUpdating = True
End Sub
I do not have any prior experience in VBA. I am struggling with this. Any suggestions as to how to approach this problem will be very much Appreciated.Thanks
Edit: Heres what I have so far!!
# RunInExcel.py - opens named Excel file fName, returns the sheets, modules
import os, os.path, win32com.client
def run_macro(fName, path=os.getcwd()):
print(path)
fName = os.path.join(path, fName)
print(fName)
xlApp = win32com.client.Dispatch("Excel.Application")
fTest = xlApp.Workbooks.Open(fName)
for i in fTest.VBProject.VBComponents:
print i.Name
if __name__ == "__main__":
run_macro("Testing1.xlsm")
But I need the name of the Macros. Thank you for the help!
Unfortunately I don't think you can get the macro names directly using VBA. Instead you scan the code module looking for Sub xxx, etc.
def run_macro(fName, path=os.getcwd()):
print(path)
fName = os.path.join(path, fName)
print(fName)
xlApp = win32com.client.Dispatch("Excel.Application")
fTest = xlApp.Workbooks.Open(fName)
for i in fTest.VBProject.VBComponents:
print i.Name
num_lines = i.CodeModule.CountOfLines
for j in range(1, num_lines+1):
if 'Sub' in i.CodeModule.Lines(j, 1) and not 'End Sub' in i.CodeModule.Lines(j, 1):
print i.CodeModule.Lines(j, 1)
xlApp.Application.Quit()
del xlApp
if __name__ == "__main__":
run_macro("Testing1.xlsm")
Note that you need to make sure you exit the script properly or you will leave Excel running (check taskmgr).