Specify range as only selecting filled cells/ending at empty cell in Python

Is there a way, using win32com, to specify that Python only selects/copies/pastes/autofills/etc a range that stops when it reaches an empty cell?
i.e.
Range(A1:A%End)
Certainly open to xlrd library suggestions, but my entire script is already using win32com. Thanks for any tips folks!
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
source = excel.Workbooks.Open(r"C:\source")
excel.Range("A:AA").Select()
excel.Selection.Copy()
copy = excel.Workbooks.Open(r"C:\copy")
excel.Range("E:AE").Select()
excel.Selection.PasteSpecial()

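You can get the last non-empty cell of the contiguous block that starts at A1 via the following, where ws is a worksheet object such as excel.ActiveSheet:
XlDirectionDown = -4121
last = ws.Range("A1").End(XlDirectionDown)
rng = ws.Range("A1:A" + str(last.Row))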
XlDirectionDown is the xlDown member of the XlDirection enum. You can also get its value from COM by dispatching via EnsureDispatch:
import win32com.client
from win32com.client import constants as cc
xlApp = win32com.client.gencache.EnsureDispatch('Excel.Application')
XlDirectionDown = cc.xlDown
The EnsureDispatch call builds the type library for Excel for win32com, which makes the constants available.
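The idea in this answer can be wrapped in a small helper. This is only a sketch: the function name `filled_range_address` and its parameters are hypothetical, `ws` is assumed to be a worksheet COM object, and -4121 is the numeric value of xlDown.

```python
XL_DOWN = -4121  # numeric value of XlDirection.xlDown


def filled_range_address(ws, column="A", first_row=1):
    """Return an address such as 'A1:A7' for the filled block in a column.

    `ws` is assumed to be an Excel worksheet COM object (hypothetical helper).
    """
    start = ws.Range(f"{column}{first_row}")
    # End(xlDown) behaves like Ctrl+Down: from a filled cell it jumps to the
    # last cell of the contiguous filled block that contains the start cell.
    last_row = start.End(XL_DOWN).Row
    return f"{column}{first_row}:{column}{last_row}"
```

The returned string could then be passed to something like excel.Range(...).Copy(). Note that if the cell directly below the start cell is empty, End(xlDown) jumps to the next filled cell or to the bottom of the sheet, so the helper only suits columns with at least two contiguous filled cells.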

Related

Read checkbox values from excel using python without using win32 library

I need to read checkbox values and have accomplished this using the code below:
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(r'\Test.xlsx')
ws = wb.Worksheets("Sheet1")
cb_dict = {}
for cb in ws.CheckBoxes():
    cb_dict[cb.Name] = cb.Value
print(cb_dict)
excel.Application.Quit()
This works fine when called on Windows, but when the script is run on other operating systems the win32 library isn't available.
If anyone has a different approach, please share.
Unzip the name.xlsx file into a folder. You'll find a file at name/xl/drawings/vmlDrawing1.vml. It holds the information for each checkbox, including Anchor and Checked. The value of each checkbox is stored with its shape.
We can parse vmlDrawing1.vml just like any other XML file. I used xml.etree.ElementTree to find the checkbox information in the XML.
Reference: https://hypotenuselabs.medium.com/attack-on-checkbox-when-data-ingestion-gets-ugly-999fcdc5e000
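A minimal sketch of that approach, assuming the VML part is well-formed XML and uses the usual Excel namespaces. The function name `read_checkboxes` and the `vml_part` default are hypothetical, and the keys are the shape ids found in the VML, which may differ from the names that win32com reports.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by Excel's legacy VML drawing parts.
NS = {
    "v": "urn:schemas-microsoft-com:vml",
    "x": "urn:schemas-microsoft-com:office:excel",
}


def read_checkboxes(xlsx_file, vml_part="xl/drawings/vmlDrawing1.vml"):
    """Map each checkbox shape id to True (checked) or False.

    `xlsx_file` is a path or a binary file-like object. The part name is an
    assumption: a workbook may hold vmlDrawing2.vml and so on.
    """
    # An .xlsx file is a zip archive, so no manual unzip step is needed.
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(vml_part))

    states = {}
    for shape in root.findall(".//v:shape", NS):
        client = shape.find("x:ClientData", NS)
        if client is None or client.get("ObjectType") != "Checkbox":
            continue  # skip comments, buttons and other shapes
        checked = client.find("x:Checked", NS)
        # A missing <x:Checked> element is treated as unchecked here.
        is_checked = checked is not None and (checked.text or "").strip() == "1"
        states[shape.get("id")] = is_checked
    return states
```

Because it only relies on the standard library, this runs on any operating system, for example as read_checkboxes("Test.xlsx").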

How to return the PrintArea from Excel in Python

I'm trying to create a Python script (I'm using Python 3.7.3 with UTF-8 encoding on Windows 10 64-bit with Microsoft Office 365) that exports user selected worksheets to PDF, after the user has selected the Excel-files.
The Excel-files contain a lot of different settings for page setup and each worksheet in each Excel-file has a different page setup.
The task is therefore that I need to read all current variables regarding page setup to be able to assign them to the related variables for export.
The problem is when I'm trying to get Excel to return the current print area of the worksheet, which I can't figure out.
As far as I understand I need to be able to read the current print area, to be able to set it for the export.
The Excel-files are a mixture of ".xlsx" and ".xlsm".
I've tried using all kind of different methods from the Excel VBA documentation, but nothing has worked so far e.g. by adding ".Range" and ".Address" etc.
I've also tried the ".UsedRange", but there is no significant difference in the cells that I can search for and I can't format them in a specific way so I can't use this.
I've also tried using the "IgnorePrintAreas = False" variable in the "ExportAsFixedFormat"-function, but that didn't work either.
#This is some of the script.
#I've left out irrelevant parts (dialogboxes etc.) just to make it shorter
#Import pywin32 and open Excel and selected workbook.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch("Excel.Application")
excel.Visible = False
wb = excel.Workbooks.Open(wb_path)
#Select the 1st worksheet in the workbook
#This is just used for testing
wb.Sheets([1]).Select()
#This is the line I can't get to work
ps_prar = wb.ActiveSheet.PageSetup.PrintArea
#This is just used to test if I get the print area
print(ps_prar)
#This is exporting the selected worksheet to PDF
wb.Sheets([1]).Select()
wb.ActiveSheet.ExportAsFixedFormat(0, pdf_path, Quality = 0, IncludeDocProperties = True, IgnorePrintAreas = False, OpenAfterPublish = True)
#This closes the workbook and the Excel-file (although Excel sometimes still exists in Task Manager)
wb.Close()
wb = None
excel.Quit()
excel = None
If I leave the code as above and open a test Excel-file (.xlsx) with a small PrintArea (A1:H8), the print function just gives me a blank line.
If I add something to .PrintArea (as mentioned above) I get 1 of 2 errors:
"TypeError: 'str' object is not callable".
or
"ps_prar = wb.ActiveSheet.PageSetup.PrintArea.Range
AttributeError: 'str' object has no attribute 'Range'"
I'm hoping someone can help me in this matter - thanks, in advance.
Try
wb = excel.Workbooks.OpenXML(wb_path)
instead of
wb = excel.Workbooks.Open(wb_path)
My problem was with a German version of MS Office. It works now. Check here https://social.msdn.microsoft.com/Forums/de-DE/3dce9f06-2262-4e22-a8ff-5c0d83166e73/excel-api-interne-namen?forum=officede

Excel pivot table filter in Python via win32com

I tried really hard to find how to do these simple lines of VBA code in Python via win32com but I couldn't find how to execute it properly :
ActiveSheet.PivotTables("PivotTable1").PivotFields("Quarters").ClearAllFilters
ActiveSheet.PivotTables("PivotTable1").PivotFields("Effective deadline"). _
PivotFilters.Add2 Type:=xlBefore, Value1:="10/10/2017"
When running these lines :
from win32com.client import DispatchEx
excel = DispatchEx('Excel.Application')
wb = excel.Workbooks.Open('myfile.xlsx')
ws = wb.Worksheets('MySheet')
ws.PivotTables(1).PivotFields("Quarters").PivotFilters('Add2', 'xlBefore', '10/10/2017')
I end up with an 'Invalid number of parameters' error, so I guess I'm quite close, but I can't find the documentation to complete my code.
Has anyone ever managed to do this kind of work ?
You are calling the wrong method. You should call .Add2 after the PivotFilters property:
ws.PivotTables(1).PivotFields("Effective deadline").ClearAllFilters()
ws.PivotTables(1).PivotFields("Effective deadline").PivotFilters.Add2(31, None, '10/10/2017')
Also, notice that you need to specify the XlPivotFilterType enumeration value for the type of filter you want to apply (in this case xlBefore = 31). The None occupies the DataField position, so the date is passed as Value1.

Python win32 read/modify/store/map Name Box

I'm trying to do some work on a complex Excel Workbook which has a large number of variables which have been created and used using the Name Box feature. See picture attached for example/detail.
I'd like to store or change DeathRate or maybe read all the Name Boxes and create a dictionary between names and locations of the cell from outside Excel.
I'm using the win32com library in Python but I guess I could switch to another Excel reader as long as it copes with XLSX files.
Has someone come across this before?
Found the solution, see code below:
import os
from win32com.client import Dispatch #win32com is based around cells beginning at one.
app_xl = Dispatch("Excel.Application")
WORKING_DIR = os.getcwd()
excelPath = os.path.join(WORKING_DIR, "SampleModel.xls")
wb = app_xl.Workbooks.Open(excelPath)
# Get Named Boxes
name_box_list = [x for x in app_xl.ActiveWorkbook.Names]
name_box_map = {x.Name:x.Value for x in name_box_list}
print(name_box_list)
print(name_box_map)
# Change Named Boxes
name_box_list[0].Name = u'NewName'
name_box_list[0].Value = u'=model!$B$5'
name_box_map = {x.Name:x.Value for x in name_box_list}
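To get from names to cell locations, the reference strings can be split into sheet and cell parts. The following is a sketch only: `split_reference` and `names_to_locations` are hypothetical helpers, and they assume each Name object exposes .Name and .Value as in the code above, with values of the form '=model!$B$5'.

```python
def split_reference(refers_to):
    """Split a reference such as '=model!$B$5' into ('model', '$B$5')."""
    sheet, _, cell = refers_to.lstrip("=").rpartition("!")
    # Sheet names containing spaces are quoted, e.g. ='My Sheet'!$A$1.
    return sheet.strip("'"), cell


def names_to_locations(names):
    """Build {name: (sheet, cell)} from an iterable of Name-like objects."""
    return {n.Name: split_reference(n.Value) for n in names}
```

With the workbook opened as above, names_to_locations(wb.Names) would give the dictionary. A name that holds a constant rather than a cell reference comes back with an empty sheet part.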

Find if a value exists in a column in Excel using python

I have an Excel file with one worksheet that has sediment collection data. I am running a long Python script.
In the worksheet is a column titled “CollectionYear.” Say I want the year 2010. If the year 2010 exists in the “CollectionYear” column, I want the rest of the script to run, if not then I want the script to stop.
This seems like an easy enough task but for the life of me I cannot figure it out nor find any examples.
Any help would be greatly appreciated.
I use xlrd all the time and it works great for me. Something like this might be helpful
from xlrd import open_workbook

def main():
    # Note: xlrd 2.0 and later reads only .xls files.
    book = open_workbook('example.xlsx')
    sheet = book.sheet_by_index(0)
    collection_year_col = 2  # Just an example
    test_year = 2010
    for row in range(sheet.nrows):
        if sheet.cell(row, collection_year_col).value == test_year:
            runCode()
            break  # the year exists, so stop scanning

def runCode():
    pass  # your code
I hope this points you in the right direction. More help could be given if the details of your problem were known.
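Since the question asks for the script to stop when the year is missing, the check can also be written as a guard at the top of the script. This is a sketch with hypothetical names (`year_exists`, `require_year`); it assumes the column has already been read into a list, for example with sheet.col_values(collection_year_col).

```python
import sys


def year_exists(column_values, year):
    """Return True if `year` appears among the column values.

    Excel stores numbers as floats, so 2010 is read back as 2010.0;
    Python treats 2010.0 == 2010 as True, so a plain comparison works.
    """
    return any(value == year for value in column_values)


def require_year(column_values, year):
    """Stop the script with a message when the year is missing."""
    if not year_exists(column_values, year):
        sys.exit(f"{year} not found in CollectionYear; stopping.")
```

Calling require_year(...) before the long-running part lets the rest of the script stay unindented and run only when the year is present.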
Here is what I learned from tackling a needle-in-a-haystack problem for a gigantic pile of .xls files. There are some things xlrd and friends can't (or won't) do, such as getting the formula of a cell. For that, you'll need to use the Microsoft Component Object Model (COM) [1].
I recommend you find yourself a copy of Python Programming on Win32 by Mark Hammond. It's still useful 20 years later. Python Programming on Win32 covers the basics of the COM and how to access it using the pywin32 library (also from Mark Hammond).
In a nutshell, you can think of the COM as an API between a server (say, Excel) and a client (such as a Python script) [2].
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
The COM API is reasonably well documented. Once you get used to the terminology, things become straightforward, albeit tedious. For example, an Excel file is technically a "Workbook". The "Workbooks" COM object has the Open method, which provides a handle for Python to interact with the "Workbook". (Did you notice the different 's' endings on those?)
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
A "Workbook" contains a "Sheet", accessed here through the "Sheets" COM object:
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
sht1 = wb.Sheets.Item(1)
Finally, the 'Cells' property of a worksheet "returns a Range object that represents all the cells on the worksheet". The Range object then has a Find method which will search within the range. The LookIn parameter allows for searching cell values, formulas, and comments.
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
sht1 = wb.Sheets.Item(1)
match = sht1.Cells.Find('search string')
The result of Find is a Range object which has many useful properties, like Formula, GetAddress, Value, and Text. You'll also find, as with anything Microsoft, that it's good enough for government work.
Finally, don't forget to close the workbook and to quit Excel!
import win32com.client
# Connect to Excel server
xl = win32com.client.Dispatch("Excel.Application")
myfile = r'C:\temp\myworkbook.xls'
wb = xl.Workbooks.Open(Filename=myfile)
sht1 = wb.Sheets.Item(1)
match = sht1.Cells.Find('search string')
print(match.Formula)
wb.Close(SaveChanges=False)
xl.Quit()
You can extend these ideas with Sheets.Item and Sheets.Count and iterate over all sheets in a workbook (or all workbooks in a directory). You can have lots of fun!
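Iterating over all sheets might look like the sketch below. The function name `find_in_all_sheets` is hypothetical, `wb` is assumed to be a Workbook COM object whose sheets are all worksheets (chart sheets have no Cells property), and a failed Find is assumed to come back as None.

```python
def find_in_all_sheets(wb, text):
    """Return (sheet name, cell address) for the first match on each sheet."""
    hits = []
    # COM collections are 1-based, so count from 1 to Sheets.Count inclusive.
    for index in range(1, wb.Sheets.Count + 1):
        sheet = wb.Sheets.Item(index)
        match = sheet.Cells.Find(text)
        if match is not None:
            hits.append((sheet.Name, match.Address))
    return hits
```

For the year check in the question, a non-empty result from find_in_all_sheets(wb, '2010') would mean the value exists somewhere in the workbook.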
The headaches you may encounter include VBA macros and embedded objects, as well as the various alerts each can produce. Performance is also an issue. The following property settings silence notifications and can dramatically improve performance:
Application
xl.DisplayAlerts = False
xl.AutomationSecurity = msoAutomationSecurityForceDisable
xl.Interactive = False
xl.PrintCommunication = False
xl.ScreenUpdating = False
xl.StatusBar = False
Workbook
wb.DoNotPromptForConvert = True
wb.EnableAutoRecover = False
wb.KeepChangeHistory = False
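Because these properties stay changed on the Excel instance after the script finishes, it can help to restore them afterwards. The following context manager is a sketch under that assumption: `quiet_excel` and `QUIET_SETTINGS` are hypothetical names, only a subset of the Application properties listed above is included, and `xl` is assumed to be the Application object.

```python
from contextlib import contextmanager

# A subset of the Application properties from the list above (assumed values).
QUIET_SETTINGS = {
    "DisplayAlerts": False,
    "Interactive": False,
    "ScreenUpdating": False,
}


@contextmanager
def quiet_excel(xl, settings=QUIET_SETTINGS):
    """Apply the settings to `xl`, then restore the previous values."""
    previous = {name: getattr(xl, name) for name in settings}
    for name, value in settings.items():
        setattr(xl, name, value)
    try:
        yield xl
    finally:
        # Restore even if the body raised, so Excel is left usable.
        for name, value in previous.items():
            setattr(xl, name, value)
```

Usage would be along the lines of `with quiet_excel(xl): ...` around the search loop.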
Another potential issue is late/early binding. Basically, does Python have information about the COM object? This affects things like introspection and how COM objects are referenced. The win32com.client package uses late-bound automation by default.
With late-bound automation, Python doesn't know much about the COM object:
>>> import win32com.client
>>> xl = win32com.client.Dispatch("Excel.Application")
>>> xl
<COMObject Excel.Application>
>>> len(dir(xl))
55
With early-bound automation, Python has full knowledge of the object:
>>> import win32com.client
>>> xl = win32com.client.Dispatch("Excel.Application")
>>> xl
<win32com.gen_py.Microsoft Excel 16.0 Object Library._Application instance at 0x2583562290680>
>>> len(dir(xl))
125
To enable early binding, you must run makepy.py which is included with pywin32. Running makepy.py will prompt for the library to bind with.
(venv) c:\temp\venv\Lib\site-packages\win32com\client>python makepy.py
python makepy.py
The process creates a Python file (in Temp\) which maps the methods and properties of the COM object.
(venv) c:\temp\venv\Lib\site-packages\win32com\client>python makepy.py
python makepy.py
Generating to C:\Users\Lorem\AppData\Local\Temp\gen_py\3.6\00020813-0000-0000-C000-000000000046x0x1x9.py
Building definitions from type library...
Generating...
Importing module
Early binding also provides access to COM constants, such as msoAutomationSecurityForceDisable and xlAscending and is case-sensitive (whereas late-binding is not).
That should be enough info to implement a Python-to-Excel library (like xlwings), overkill notwithstanding.
[1] Actually, xlwings works by utilizing the COM through pywin32. Here's to one less dependency!
[2] This example uses win32com.client.Dispatch, which requires that processing happen through a single Excel instance. Use win32com.client.DispatchEx to create separate instances of Excel.
Try using the xlwings library to interface with Excel from Python.
An example from their docs:
from xlwings import Workbook, Sheet, Range, Chart
wb = Workbook() # Creates a connection with a new workbook
Range('A1').value = 'Foo 1'
Range('A1').value
>>> 'Foo 1'
Range('A1').value = [['Foo 1', 'Foo 2', 'Foo 3'], [10.0, 20.0, 30.0]]
