Using Python to read VBA from an Excel spreadsheet - python

I would like to write a VBA diff program in (preferably) Python. Is there a Python library that will allow me to read the VBA contained in an Excel spreadsheet?

Here's some quick-and-dirty boilerplate to get you started. It uses the Excel COM object (a Windows-only solution):
from win32com.client import Dispatch
wbpath = 'C:\\example.xlsm'
xl = Dispatch("Excel.Application")
xl.Visible = 1
wb = xl.Workbooks.Open(wbpath)
vbcode = wb.VBProject.VBComponents(1).CodeModule
print(vbcode.Lines(1, vbcode.CountOfLines))
This prints the silly macro I recorded for this example:
Sub silly_macro()
'
' silly_macro Macro
'
'
Range("B2").Select
End Sub
Note that Lines and VBComponents use 1-based indexing. VBComponents also supports indexing by module name. Also note that Excel requires backslashes in paths.
To dive deeper see Pearson's Programming The VBA Editor. (The above example was cobbled together from what I skimmed from there.)
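Since the goal is a diff program, the snippet above could be extended along these lines. This is only a sketch under a few assumptions: module_sources and diff_sources are hypothetical helper names, the workbook argument is assumed to be the object returned by xl.Workbooks.Open, and Excel needs "Trust access to the VBA project object model" enabled before VBProject can be read. The comparison itself uses difflib from the standard library.

```python
import difflib


def module_sources(workbook):
    """Return {module name: source text} for every VBA component in a workbook."""
    components = workbook.VBProject.VBComponents
    sources = {}
    for index in range(1, components.Count + 1):  # COM collections are 1-based
        component = components.Item(index)
        code = component.CodeModule
        count = code.CountOfLines
        # Only ask Lines() for text when the module is not empty.
        sources[component.Name] = code.Lines(1, count) if count else ""
    return sources


def diff_sources(old, new):
    """Return unified-diff lines for every module that differs between two dicts."""
    result = []
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "").splitlines()
        after = new.get(name, "").splitlines()
        result.extend(
            difflib.unified_diff(before, after, "old/" + name, "new/" + name, lineterm="")
        )
    return result


# Usage on Windows (paths are placeholders):
#   from win32com.client import Dispatch
#   xl = Dispatch("Excel.Application")
#   old = module_sources(xl.Workbooks.Open(r"C:\old.xlsm"))
#   new = module_sources(xl.Workbooks.Open(r"C:\new.xlsm"))
#   print("\n".join(diff_sources(old, new)))
```

Keeping the COM access and the diffing in separate functions means the diff part can be tried on plain dictionaries without Excel installed.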

I have created an application that does this called VbaDiff. If you provide it two Excel files it will compare the VBA code in each. You can also run it from the command line, or use the version that comes with an API if you want to integrate it with your own programs.
You can find out more at http://www.technicana.com/vbadiff-information.html
Chris

Related

How do I send/read data from VBA in Python?

Background
Right now I'm creating a macro to help automate the creation of some graphs in VBA. However, the creation of the graphs requires specific tasks to be done, for example, certain points in a series to be larger depending on previous instances. I would much rather do this data manipulation in python.
Problem
I want to use Excel for its user-friendly interface but handle all the data manipulation within Python. How can I send data I create in VBA to Python? To clarify, I'm not trying to read specific cells in the Excel sheet.
If I define a string in VBA say...
Dim example_string as String
example_string = "Hello, 1, 2, 3, Bye"
How can I send this information I created within VBA to Python for manipulation?
More Specifics
I have a textbox in excel that is filled by the user, which I read using VBA. I want to send that txt data from VBA to python. The user highlights the desired cells, which are not necessarily the same each time, clicks a button and fills a textbox. I don't want to use range or specific cell selection since this would require the user to specifically enter all the desired data into cells (too time-consuming).
I want to understand the basic procedure of how to send data between VBA and python.
You can do the whole thing in Python, which will be more efficient. You can use either Excel or sqlite3 as the database, build a graphical interface with tkinter, and process your data with pandas and numpy.
If you insist on sending data to Python, import sys in your Python script to read the parameters, and then run the script from VBA through a shell call.
EDIT: You wanted an example, so here it is:
Open a new excel file, create a procedure like this (VBA CODE):
Sub sendToPython()
    Dim shell As Object
    Dim python As String
    Dim callThis As String
    Dim passing As String
    Set shell = VBA.CreateObject("Wscript.Shell")
    ' This is where you installed Python (notice the triple quotes, and always use your own path)
    python = """C:\Users\yourUserName\appdata\local\programs\python\python37\python.exe"""
    ' This is the data you'll be passing to the Python script
    passing = "The*eye*of*the*tiger"
    callThis = "C:\Users\yourUserName\desktop\yourScriptName.py " & passing
    ' The space keeps the interpreter path and the script path as separate arguments
    shell.Run python & " " & callThis
End Sub
The idea is to create some kind of a parser in python, this is my silly example (PYTHON CODE):
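import sys
# Split the single argument on the "*" delimiter and rejoin it with spaces
arg = " ".join(sys.argv[1].split("*"))
# Write to a file, because the console window closes almost immediately
with open("log.txt", "w") as f:
    print("This is the parameter I've entered: " + arg, file=f)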
Notice how I used sys to read a parameter, and that I exported the result to a file to actually see something; otherwise you'll just see a black screen pop up for about a millisecond.
I also found this article, but it requires you to wrap the Python script in a class, and I don't know if that works for you.
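One thing the example above glosses over is that the command line is split on spaces before Python sees it, so a string like the "Hello, 1, 2, 3, Bye" from the question arrives as several arguments unless it is quoted. A small sketch of a more forgiving parser follows; decode_args is a hypothetical helper name, and the "*" delimiter is just the convention used in the VBA example.

```python
import sys


def decode_args(argv, sep="*"):
    """Rebuild the text sent by VBA and split it on the delimiter."""
    # argv[0] is the script path; everything after it is data from VBA.
    # The pieces are joined back with spaces first, because the command
    # line was split on spaces before Python received it.
    text = " ".join(argv[1:])
    return text.split(sep)


if __name__ == "__main__":
    # Redirect or log this output as needed; the console window closes quickly.
    print(decode_args(sys.argv))
```

With this, passing The*eye*of*the*tiger yields five fields, and an unquoted comma-separated string can be recovered by calling decode_args(sys.argv, sep=", ").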

Links break when copying Excel sheets with win32com

I am trying to use win32com to copy a worksheet from my workbook to a new workbook. The code is working fine but the cell formulas in the new book point back to the original book. I would like to break the links in the new book so that these formulas are replaced with raw numbers. This is trivial to do in Excel but I haven't been able to find out how to do it using the win32com client in Python.
Here is a snippet of my code:
import win32com.client
xl = win32com.client.gencache.EnsureDispatch('Excel.Application')
xl.Visible = True
#Open & Refresh Spreadsheet
wb = xl.Workbooks.Open(r"C:\Users\me\dummy.xlsx") #Dummy path
print("Refreshing data...")
wb.RefreshAll()
#Create new book and copy target sheet over
print("Opening new workbook")
nwb = xl.Workbooks.Add()
newfile = r"C:\Users\me\dummy2.xlsx"
wb.Worksheets(["Target Sheet"]).Copy(Before=nwb.Worksheets(1))
nwb.SaveAs(newfile)
This code works fine but in the saved "dummy2" file each of the cells containing formulas reference the original sheet. How can I break the links in the new book and/or copy values only from the original book?
Edit in response to @martineau's downvote of the answer and of the (admittedly unsatisfactory) Microsoft documentation.
I think you haven't been able to find out how to do this because you have been looking in the wrong place. Your question really has little to do with Python or with win32com.
This line
xl = win32com.client.gencache.EnsureDispatch('Excel.Application')
fires up a COM client called xl that talks to excel.exe. Your variable xl is a thin Python wrapper around a Microsoft COM object that can call Excel VBA functions. When you type xl., everything after the dot is expected to be a VBA object or method. Any value (other than strings and floats) that you get back from a call is a VBA object in a thin Python wrapper. Python conventions do not necessarily apply to such objects.
So to find out about what functions you need to call, you need to be looking at the Excel VBA documentation. One difficulty with that documentation is that it assumes you are writing VBA, not Python. The other is that it isn't all that well-written.
The VBA method you need is Workbook.BreakLink().
Call it after copying the original workbook and before saving the copy, like this (I'm using your dummy filename here, don't expect it to actually work without fixing that):
wb.Worksheets(["Target Sheet"]).Copy(Before=nwb.Worksheets(1))
nwb.BreakLink(Name=r"C:\Users\me\dummy.xlsx", Type=1)
nwb.SaveAs(newfile)
The name of the link is the filename it points to, and the type of the link is 1 (for a link to an Excel spreadsheet). In this case you know the name of the link source (since you just made a copy of it) so there is no need to ask what the filename is, but in the general case you need to call Workbook.LinkSources() to find out what they are, and break them one by one.
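For the general case, where the link names are not known in advance, the loop over Workbook.LinkSources() might look roughly like this. It is a sketch rather than tested production code: break_all_links is a hypothetical helper name, and it assumes pywin32 hands back None when the workbook has no links of the requested type (Excel returns Empty in that case).

```python
XL_EXCEL_LINKS = 1  # the same link type value passed to BreakLink above


def break_all_links(workbook, link_type=XL_EXCEL_LINKS):
    """Break every link of the given type and return the names that were broken."""
    sources = workbook.LinkSources(link_type)
    if not sources:  # no links of this type in the workbook
        return []
    broken = []
    for name in sources:
        workbook.BreakLink(Name=name, Type=link_type)
        broken.append(name)
    return broken


# Usage with the variables from the question:
#   break_all_links(nwb)
#   nwb.SaveAs(newfile)
```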

Find if a value exists in a column in Excel using python

I have an Excel file with one worksheet that has sediment collection data. I am running a long Python script.
In the worksheet is a column titled “CollectionYear.” Say I want the year 2010. If the year 2010 exists in the “CollectionYear” column, I want the rest of the script to run, if not then I want the script to stop.
This seems like an easy enough task but for the life of me I cannot figure it out nor find any examples.
Any help would be greatly appreciated.
I use xlrd all the time and it works great for me. Something like this might be helpful
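from xlrd import open_workbook

def main():
    # Note: xlrd 2.0 and later read only .xls files; for .xlsx use an
    # older xlrd or another library such as openpyxl.
    book = open_workbook('example.xlsx')
    sheet = book.sheet_by_index(0)
    collection_year_col = 2  # Just an example
    test_year = 2010
    for row in range(sheet.nrows):
        if sheet.cell(row, collection_year_col).value == test_year:
            runCode()
            break  # one match is enough

def runCode():
    pass  # your code

main()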
I hope this points you in the right direction. More help could be given if the details of your problem were known.
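If the column position is not fixed, the check can be written against the "CollectionYear" header instead of a hard-coded index. The sketch below is library-neutral and works on any table of rows; column_has_value and require_year are hypothetical names, and with xlrd the rows could be built as [sheet.row_values(i) for i in range(sheet.nrows)]. Excel numbers usually arrive as floats, which is fine because 2010.0 == 2010 in Python.

```python
import sys


def column_has_value(rows, header, wanted):
    """Return True if `wanted` appears under `header` in a table of rows."""
    rows = iter(rows)
    titles = next(rows, None)  # the first row holds the column titles
    if titles is None or header not in titles:
        return False
    col = list(titles).index(header)
    return any(len(row) > col and row[col] == wanted for row in rows)


def require_year(rows, year, header="CollectionYear"):
    """Stop the script with a message unless the year is present."""
    if not column_has_value(rows, header, year):
        sys.exit("Year %s not found in column %s" % (year, header))
```

Calling require_year(rows, 2010) near the top of the long script lets everything after it run only when the year exists.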
Here is what I learned from tackling a needle-in-a-haystack problem for a gigantic pile of .xls files. There are some things xlrd and friends can't (or won't) do, such as getting the formula of a cell. For that, you'll need to use the Microsoft Component Object Model (COM)1.
I recommend you find yourself a copy of Python Programming on Win32 by Mark Hammond. It's still useful 20 years later. Python Programming on Win32 covers the basics of the COM and how to access it using the pywin32 library (also from Mark Hammond).
In a nutshell, you can think of the COM as an API between a server (say, Excel) and a client (such as a Python script)2.
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
The COM API is reasonably well documented. Once you get used to the terminology, things become straight-forward albeit tedious. For example, an Excel file is technically a "Workbook". The "Workbooks" COM object has the Open method which provides a handle for Python to interact with the "Workbook". (Did you notice the different 's' endings on those?)
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
A "Workbook" contains a "Sheet", accessed here through the "Sheets" COM object:
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
sht1 = wb.Sheets.Item(1)
Finally, the 'Cells' property of a worksheet "returns a Range object that represents all the cells on the worksheet". The Range object then has a Find method which will search within the range. The LookIn parameter allows for searching cell values, formulas, and comments.
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
sht1 = wb.Sheets.Item(1)
match = sht1.Cells.Find('search string')
The result of Find is a Range object which has many useful properties, like Formula, GetAddress, Value, and Text. You'll also find, as with anything Microsoft, that it's good enough for government work.
Finally, don't forget to close the workbook and to quit Excel!
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
sht1 = wb.Sheets.Item(1)
match = sht1.Cells.Find('search string')
print(match.Formula)
wb.Close(SaveChanges=False)
xl.Quit()
You can extend these ideas with Sheets.Item and Sheets.Count and iterate over all sheets in a workbook (or all workbooks in a directory). You can have lots of fun!
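As a rough illustration of iterating with Count and Item, the search could be wrapped like this. find_in_workbook is a hypothetical helper, and the sketch uses Worksheets rather than Sheets so that chart sheets, which have no Cells, are skipped. It relies on Find handing back None in Python when nothing matches.

```python
def find_in_workbook(workbook, text):
    """Search every worksheet of an open workbook; return (sheet name, formula) pairs."""
    hits = []
    sheets = workbook.Worksheets
    for index in range(1, sheets.Count + 1):  # COM collections are 1-based
        sheet = sheets.Item(index)
        match = sheet.Cells.Find(text)
        if match is not None:  # no match on this sheet gives None
            hits.append((sheet.Name, match.Formula))
    return hits


# Usage with the objects from the example above:
#   for name, formula in find_in_workbook(wb, 'search string'):
#       print(name, formula)
```

This reports only the first hit per sheet, which matches what a single Find call returns.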
The headaches you may encounter include VBA macros and embedded objects, as well as the various alerts each can produce. Performance is also an issue. The following settings silence notifications and can dramatically improve performance:
Application
xl.DisplayAlerts = False
xl.AutomationSecurity = 3  # msoAutomationSecurityForceDisable
xl.Interactive = False
xl.PrintCommunication = False
xl.ScreenUpdating = False
xl.StatusBar = False
Workbook
wb.DoNotPromptForConvert = True
wb.EnableAutoRecover = False
wb.KeepChangeHistory = False
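Because a crashed script can leave Excel with alerts and interaction switched off, it may be worth restoring the previous values afterwards. One possible sketch is a context manager; quiet_excel and QUIET_SETTINGS are hypothetical names, and only three of the Application properties listed above are included to keep it short.

```python
from contextlib import contextmanager

# A subset of the Application properties listed above.
QUIET_SETTINGS = {
    "DisplayAlerts": False,
    "ScreenUpdating": False,
    "Interactive": False,
}


@contextmanager
def quiet_excel(xl, settings=QUIET_SETTINGS):
    """Apply the quiet settings, then put the old values back on exit."""
    previous = {name: getattr(xl, name) for name in settings}
    for name, value in settings.items():
        setattr(xl, name, value)
    try:
        yield xl
    finally:
        # Runs even if the body raises, so Excel is not left unresponsive.
        for name, value in previous.items():
            setattr(xl, name, value)


# Usage:
#   with quiet_excel(xl):
#       wb = xl.Workbooks.Open(Filename=myfile)
```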
Another potential issue is late/early binding. Basically, does Python have information about the COM object? This affects things like introspection and how COM objects are referenced. The win32com.client package uses late-bound automation by default.
With late-bound automation, Python doesn't know much about the COM object:
>>> import win32com.client
>>> xl = win32com.client.Dispatch("Excel.Application")
>>> xl
<COMObject Excel.Application>
>>> len(dir(xl))
55
With early-bound automation, Python has full knowledge of the object:
>>> import win32com.client
>>> xl = win32com.client.Dispatch("Excel.Application")
>>> xl
<win32com.gen_py.Microsoft Excel 16.0 Object Library._Application instance at 0x2583562290680>
>>> len(dir(xl))
125
To enable early binding, you must run makepy.py which is included with pywin32. Running makepy.py will prompt for the library to bind with.
(venv) c:\temp\venv\Lib\site-packages\win32com\client>python makepy.py
The process creates a Python file (in Temp\) which maps the methods and properties of the COM object.
(venv) c:\temp\venv\Lib\site-packages\win32com\client>python makepy.py
Generating to C:\Users\Lorem\AppData\Local\Temp\gen_py\3.6\00020813-0000-0000-C000-000000000046x0x1x9.py
Building definitions from type library...
Generating...
Importing module
Early binding also provides access to COM constants, such as msoAutomationSecurityForceDisable and xlAscending and is case-sensitive (whereas late-binding is not).
That should be enough info to implement a Python-to-Excel library (like xlwings), overkill notwithstanding.
1 Actually, xlwings works by utilizing the COM through pywin32. Here's to one less dependency!
2 This example uses win32com.client.Dispatch which requires processing happen through a single Excel instance. Use win32com.client.DispatchEx to create separate instances of Excel.
Try using the xlwings library to interface with Excel from Python.
Example from their docs (written for the API before xlwings 0.9; later releases use xw.Book and sheet.range instead):
from xlwings import Workbook, Sheet, Range, Chart
wb = Workbook() # Creates a connection with a new workbook
Range('A1').value = 'Foo 1'
Range('A1').value
>>> 'Foo 1'
Range('A1').value = [['Foo 1', 'Foo 2', 'Foo 3'], [10.0, 20.0, 30.0]]

Use Python to Write VBA Script?

This might be a bit of a stretch, but is it possible to use a Python script to create VBA in MS Excel (or any other MS Office product that uses VBA) using pythonwin or any other module?
This idea came from the inability of Python's openpyxl module to autofit column widths. The script I have creates a workbook in memory and eventually saves it to disk. There are quite a few sheets, and within each sheet there are quite a few columns. I got to thinking: what if I just use Python to import a VBA script (saved somewhere in Notepad or something) into the VBA editor in Excel and then run that script from Python using pythonwin?
Something like:
Workbooks.worksheets.Columns("A:Z").EntireColumn.Autofit
Before you comment: yes, I have seen lots of Pythonic examples of how to work around auto-adjusting columns in openpyxl, but I see some interesting opportunities in the VBA functionality that may not be available in Python.
Anyway, I dug around the internet a bit and didn't see anything that indicates I can, so I thought I'd ask.
Cheers,
Mike
Yes, it is possible. A good starting point is the Microsoft KB article on generating a VBA macro from VB.
The Python code below illustrates how you can do the same; it is a basic port of the first half of the KB sample code:
import win32com.client as win32
xl = win32.gencache.EnsureDispatch('Excel.Application')
xl.Visible = True
ss = xl.Workbooks.Add()
sh = ss.ActiveSheet
xlmodule = ss.VBProject.VBComponents.Add(1) # vbext_ct_StdModule
sCode = '''sub VBAMacro()
msgbox "VBA Macro called"
end sub'''
xlmodule.CodeModule.AddFromString(sCode)
If you look at the macros in the visible, automated Excel instance, you will see the VBAMacro defined above.
The top answer only adds the macro. If you actually want to execute it, there is one more step.
import win32com.client as win32
xl = win32.gencache.EnsureDispatch('Excel.Application')
xl.Visible = True
ss = xl.Workbooks.Add()
xlmodule = ss.VBProject.VBComponents.Add(1)
xlmodule.Name = 'testing123'
code = '''sub TestMacro()
msgbox "Testing 1 2 3"
end sub'''
xlmodule.CodeModule.AddFromString(code)
ss.Application.Run('testing123.TestMacro')
Adding a module name will help deconflict from any existing scripts.
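Tying this back to the autofit use case in the question, the add-then-run steps could be wrapped in one helper. Treat this as a sketch: add_and_run_macro and the AutoFitAllSheets macro are hypothetical names, the workbook is assumed to come from xl.Workbooks.Add() or Open() as above, and the VBA text could just as well be read from a file with open(path).read().

```python
VBEXT_CT_STDMODULE = 1  # same value the examples above pass to VBComponents.Add

# VBA source kept as a plain string; ThisWorkbook is the workbook holding the module.
AUTOFIT_MACRO = '''Sub AutoFitAllSheets()
    Dim ws As Worksheet
    For Each ws In ThisWorkbook.Worksheets
        ws.Columns.AutoFit
    Next ws
End Sub'''


def add_and_run_macro(workbook, module_name, macro_name, code):
    """Insert VBA source as a new standard module, then run one macro from it."""
    module = workbook.VBProject.VBComponents.Add(VBEXT_CT_STDMODULE)
    module.Name = module_name
    module.CodeModule.AddFromString(code)
    # Qualifying the macro with the module name avoids clashes with existing macros.
    return workbook.Application.Run(module_name + "." + macro_name)


# Usage:
#   add_and_run_macro(ss, "AutoFitTools", "AutoFitAllSheets", AUTOFIT_MACRO)
```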

Calling python script from excel/vba

I have Python code that reads 3 arguments (scalars) and a text file, and then returns a vector of doubles. I want to write a macro in VBA that calls this Python code and writes the results to a sheet in the same Excel workbook. I would like to know the easiest way to do it. Here are some options I found:
Call the Shell() function in VBA, but it doesn't seem so easy to get the return value.
Register the Python code as a COM object and call it from VBA. I don't know how to do that, so examples would be more than welcome.
Create a custom tool in a custom toolbox, create a geoprocessing object in VBA, and then call addtoolbox so the custom tool can be used directly via the geoprocessing object. This is also something I don't know how to do.
Any tips?
Follow these steps carefully
Go to ActiveState and get the ActivePython 2.5.7 MSI installer.
I had DLL hell problems with 2.6.x.
Install it on your Windows machine.
Once the install is complete, open Command Prompt and go to
C:\Python25\lib\site-packages\win32comext\axscript\client
Execute \> python pyscript.py
You should see the message Registered: Python
Go to MS Office Excel and open a worksheet.
Go to Tools > Macros > Visual Basic Editor.
Add a reference to the Microsoft Script Control.
Add a new UserForm. In the UserForm, add a CommandButton.
Switch to the code editor and insert the following code:
Dim WithEvents PyScript As MSScriptControl.ScriptControl
Private Sub CommandButton1_Click()
    If PyScript Is Nothing Then
        Set PyScript = New MSScriptControl.ScriptControl
        PyScript.Language = "python"
        PyScript.AddObject "Sheet", Workbooks(1).Sheets(1)
        PyScript.AllowUI = True
    End If
    PyScript.ExecuteStatement "Sheet.cells(1,1).value='Hello'"
End Sub
Execute. Enjoy and expand as necessary
Do you have to call the Python code as a macro? You could use COM hooks within the Python script to direct Excel and avoid having to use another language:
import win32com.client
# Start Excel
xlApp = win32com.client.Dispatch( "Excel.Application" )
workbook = xlApp.Workbooks.Open( <some-file> )
sheet = workbook.Sheets( <some-sheet> )
sheet.Activate( )
# Get values
spam = sheet.Cells( 1, 1 ).Value
# Process values
...
# Write values
sheet.Cells( ..., ... ).Value = <result>
# Goodbye Excel
workbook.Save( )
workbook.Close( )
xlApp.Quit( )
1.
Here is a good link for Excel from/to Python usage:
continuum.io/using-excel
It mentions xlwings, DataNitro, ExPy, PyXLL, XLLoop, openpyxl, xlrd, xlsxwriter, xlwt.
Also, I found that ExcelPython is under active development.
https://github.com/ericremoreynolds/excelpython
2.
What you can do with VBA + Python is the following:
Compile your py scripts that take inputs and generate outputs as text files or from the console. Then VBA will prepare the input for py, call the pre-compiled py script, and read back its output.
3.
Consider Google Docs, OpenOffice or LibreOffice, which support Python scripts.
This assumes that the available options with COM or MS script interfaces do not satisfy your needs.
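For option 2, the Python side of the file-based exchange might look something like the sketch below, shaped after the question: three scalar arguments plus an input text file, with the vector of doubles written to an output file that VBA reads back. The compute function is a made-up placeholder for the real calculation, and the argument order is an assumption.

```python
import sys


def compute(a, b, c, values):
    """Placeholder calculation standing in for the real model (hypothetical)."""
    return [a * v * v + b * v + c for v in values]


def main(argv):
    """argv layout (assumed): [script, a, b, c, input_path, output_path]."""
    a, b, c = (float(x) for x in argv[1:4])
    input_path, output_path = argv[4], argv[5]
    # One number per line in the input file; blank lines are ignored.
    with open(input_path) as src:
        values = [float(line) for line in src if line.strip()]
    # One result per line, so VBA can read the file back line by line.
    with open(output_path, "w") as dst:
        for result in compute(a, b, c, values):
            dst.write(repr(result) + "\n")
    return 0


if __name__ == "__main__":
    if len(sys.argv) != 6:
        print("usage: script.py a b c input.txt output.txt")
    else:
        main(sys.argv)
```

On the VBA side, the macro would write the input file, launch the script, wait for it to finish, and then load the output file into the sheet.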
This is not a free approach, but it is worth mentioning (featured in Forbes and the New York Times):
https://datanitro.com
This is not free for commercial use:
PyXLL - Excel addin that enables functions written in Python to be called in Excel.
Updated 2018
xlwings is a BSD-licensed Python library that makes it easy to call Python from Excel and vice versa.
Scripting: Automate/interact with Excel from Python using a syntax that is close to VBA.
Macros: Replace your messy VBA macros with clean and powerful Python code.
UDFs: Write User Defined Functions (UDFs) in Python (Windows only).
There's a tutorial on CodeProject on how to do this.
See http://www.codeproject.com/Articles/639887/Calling-Python-code-from-Excel-with-ExcelPython.
Another open source python-excel in process com tool.
This allows executing python scripts from excel in a tightly integrated manner.
https://pypi.python.org/pypi/Python-For-Excel/beta,%201.1
