I need to get an image file into an Excel header in an automated fashion; I don't believe this is possible in openpyxl, but I thought it might be doable in win32com, though I am not able to get it working. Does anyone know a way to do this? I found an excel macro that successfully does it within Excel:
Sub InsertPicture
With ActiveSheet.PageSetup.LeftHeaderPicture
.FileName = "C:/Users/bharris/Desktop/my_image_file.png"
.Height = 45
.Width = 30
.CropTop = -30
End With
ActiveSheet.PageSetup.LeftHeader = "&G"
End Sub
So I tried implementing from within python using win32com:
from win32com.client import DispatchEx
excel = DispatchEx('Excel.Application')
excel.Visible = True
wb = excel.Workbooks.Open('my_excel_file.xlsx')
ws = wb.Sheets(1)
ws.PageSetup.LeftHeaderPicture.FileName = "my_image_file.png"
ws.PageSetup.LeftHeader = "&G"
wb.SaveAs('my_excel_file.xlsx')
wb.Close(True, 'my_excel_file.xlsx')
excel.Application.Quit()
but it gives me:
raise AttributeError("Property '%s.%s' can not be set." % (self._username_, attr))
AttributeError: Property '<unknown>.FileName' can not be set.
So it looks like FileName is not a property within win32com like it is in VBA within Excel. I've tried some different combinations of things and nothing seems to work. If anyone knows how to do this, with any type of openpyxl or win32com code, or any other code (though it has to be able to edit an existing spreadsheet, not just write a new one like xlsxwriter), your help is much appreciated!
P.S. I have found several solutions for how to insert an image into a cell, but this question is for inserting into the header specifically.
Thanks much,
Robert
Is there a way to make excel visible with openpyxl module in python 3.7.5?
I did some research in the documentation and other resources but did not find an answer.
https://openpyxl.readthedocs.io/en/2.6/api/openpyxl.workbook.properties.html
openpyxl.__version__
> '2.6.0'
My objective is to obtain the same result as with use of win32com.client
xlApp = win32com.client.Dispatch('Excel.Application')
xlApp.Application.Visible = True
Tried setting visibility = 'visible' and minimized=False within parameters of openpyxl.workbook.views.BookView object
Also tried setting different parameters within 'sheetview' as specified in this topic:
Set workbook view with openpyxl?
Yet with no success. I believe there is a way to do this, but I couldn't find it.
Would appreciate getting some help with the package as documentation does not include detailed descriptions.
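For context, openpyxl only reads and writes the workbook file; it never starts Excel, so it has no application window to show. The `visibility` setting on `BookView` is saved into the file as a window property and does not launch anything. One way to get a similar effect is to save the workbook and then ask the operating system to open it. A minimal sketch, assuming Windows (where `os.startfile` exists); `show_workbook` and its `opener` parameter are hypothetical names used for illustration:

```python
import os
import sys


def show_workbook(path, opener=None):
    """Open a saved workbook in the system's default application.

    `opener` is a hypothetical hook: any callable that takes a path.
    It defaults to os.startfile, which only exists on Windows.
    """
    full_path = os.path.abspath(path)
    if opener is None:
        if sys.platform != "win32":
            raise RuntimeError("os.startfile is only available on Windows")
        opener = os.startfile  # same effect as double-clicking the file
    opener(full_path)
    return full_path
```

Typical use would be `wb.save("report.xlsx")` followed by `show_workbook("report.xlsx")`. If you need to keep controlling the visible window afterwards, win32com is still the tool for that part.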
I tested the openpyxl .remove() function, and it works on multiple empty files.
Problem: I have a more complex Excel file with multiple sheets that I need to remove. If I remove one or two, it works; when I try to remove three or more, Excel raises an error when I open the file.
Sorry, we have troubles getting info in file bla bla.....
logs talking about pictures troubles
logs about error105960_01.xml ?
The strange thing is that the logs mention a problem with pictures, but I don't get this error unless I remove three or more sheets, and I'm not trying to remove sheets that contain images!
Even stranger, it only depends on the number: any individual sheet can be deleted without trouble, but if I remove three or more, Excel complains.
It's fine when Excel "repairs" the "error", but sometimes Excel resets the formatting of the sheets (cell size, bold text, character length, etc.) and everything fails :(
bad visual that I want to avoid
If someone has an idea, I'm running out of creativity!
For the code, I only use basic functions (simplified here, as the full version would be too long to present).
import openpyxl
INPUT_EXCEL_PATH = "my_excel.xlsx"
OUTPUT_EXCEL_PATH = "new_excel.xlsx"
wb = openpyxl.load_workbook(INPUT_EXCEL_PATH)
ws = wb["sheet1"]
wb.remove(ws)
ws = wb["sheet2"]
wb.remove(ws)
ws = wb["sheet3"]
wb.remove(ws)
wb.save(OUTPUT_EXCEL_PATH)
In my case it was a leftover empty CalculationChainPart. I used DocxToSource to investigate the corrupted file. Excel will attempt to fix the file on load. Save this repaired file and compare its structure to the original file. To delete descendant parts you can use the DeletePart() method.
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(document, true)) {
    // A spreadsheet package exposes WorkbookPart (MainDocumentPart belongs to Word documents).
    WorkbookPart workbookPart = doc.WorkbookPart;
    if (workbookPart.CalculationChainPart != null) {
        workbookPart.DeletePart(workbookPart.CalculationChainPart);
    }
}
A CalculationChainPart can safely be removed at any time.
While calculation chain information can be loaded by a spreadsheet application, it is not required. A calculation chain can be constructed in memory at load-time (source)
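Since the question uses Python, the same cleanup can be sketched with the standard library alone, because an .xlsx file is a zip package. This is only an illustration under assumptions: `strip_calc_chain` is a hypothetical helper, it assumes the part lives at the usual `xl/calcChain.xml`, and it edits the two XML manifests with regular expressions rather than a real XML parser.

```python
import io
import re
import zipfile

CALC_CHAIN = "xl/calcChain.xml"  # assumed default location of the part


def strip_calc_chain(xlsx_bytes):
    """Return a copy of an .xlsx package without its calculation chain."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == CALC_CHAIN:
                continue  # drop the part itself
            data = src.read(item.filename)
            if item.filename == "[Content_Types].xml":
                # drop the <Override> that declares the part's content type
                data = re.sub(
                    rb'<Override\s[^>]*PartName="/xl/calcChain\.xml"[^>]*/>',
                    b"", data)
            elif item.filename == "xl/_rels/workbook.xml.rels":
                # drop the relationship that points at the part
                data = re.sub(
                    rb'<Relationship\s[^>]*Target="[^"]*calcChain\.xml"[^>]*/>',
                    b"", data)
            dst.writestr(item, data)
    return out.getvalue()
```

You would read the file saved by openpyxl as bytes, pass it through the function, and write the result back out. Excel should rebuild the chain in memory on load, as the quoted passage describes.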
I'm trying to create a Python script (I'm using Python 3.7.3 with UTF-8 encoding on Windows 10 64-bit with Microsoft Office 365) that exports user selected worksheets to PDF, after the user has selected the Excel-files.
The Excel-files contain a lot of different settings for page setup and each worksheet in each Excel-file has a different page setup.
The task is therefore that I need to read all current variables regarding page setup to be able to assign them to the related variables for export.
The problem is when I'm trying to get Excel to return the current print area of the worksheet, which I can't figure out.
As far as I understand I need to be able to read the current print area, to be able to set it for the export.
The Excel-files are a mixture of ".xlsx" and ".xlsm".
I've tried using all kind of different methods from the Excel VBA documentation, but nothing has worked so far e.g. by adding ".Range" and ".Address" etc.
I've also tried the ".UsedRange", but there is no significant difference in the cells that I can search for and I can't format them in a specific way so I can't use this.
I've also tried using the "IgnorePrintAreas = False" variable in the "ExportAsFixedFormat"-function, but that didn't work either.
#This is some of the script.
#I've left out irrelevant parts (dialogboxes etc.) just to make it shorter
#Import pywin32 and open Excel and selected workbook.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch("Excel.Application")
excel.Visible = False
wb = excel.Workbooks.Open(wb_path)
#Select the 1st worksheet in the workbook
#This is just used for testing
wb.Sheets([1]).Select()
#This is the line I can't get to work
ps_prar = wb.ActiveSheet.PageSetup.PrintArea
#This is just used to test if I get the print area
print(ps_prar)
#This is exporting the selected worksheet to PDF
wb.Sheets([1]).Select()
wb.ActiveSheet.ExportAsFixedFormat(0, pdf_path, Quality = 0, IncludeDocProperties = True, IgnorePrintAreas = False, OpenAfterPublish = True)
#This closes the workbook and the Excel-file (although Excel sometimes still exists in Task Manager)
wb.Close()
wb = None
excel.Quit()
excel = None
If I leave the code as above and try to open a test Excel-file (.xlsx) with a small PrintArea (A1:H8), the print function just gives me a blank line.
If I add something to .PrintArea (as mentioned above) I get 1 of 2 errors:
"TypeError: 'str' object is not callable".
or
"ps_prar = wb.ActiveSheet.PageSetup.PrintArea.Range
AttributeError: 'str' object has no attribute 'Range'"
I'm hoping someone can help me in this matter - thanks, in advance.
Try
wb = excel.Workbooks.OpenXML(wb_path)
instead of
wb = excel.Workbooks.Open(wb_path)
My problem was with a German version of MS Office. It works now. Check here https://social.msdn.microsoft.com/Forums/de-DE/3dce9f06-2262-4e22-a8ff-5c0d83166e73/excel-api-interne-namen?forum=officede
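As a side note on the two errors quoted in the question: they indicate that `PageSetup.PrintArea` comes back to Python as a plain string (an A1-style address such as `$A$1:$H$8`), not as a Range object, and an empty string means no print area is set. A small sketch of handling that string; `split_print_area` is a hypothetical helper, and the assumption that multiple areas are separated by a comma or a semicolon (depending on regional settings) should be checked on your own installation:

```python
import re


def split_print_area(print_area):
    """Turn a PrintArea string such as '$A$1:$H$8' into plain addresses."""
    if not print_area:
        return []  # empty string: no explicit print area on this sheet
    # Assumption: areas are separated by ',' or ';' depending on locale.
    parts = re.split(r"[;,]", print_area)
    return [part.strip().replace("$", "") for part in parts if part.strip()]
```

With a worksheet object `ws`, something like `split_print_area(ws.PageSetup.PrintArea)` would then give addresses that can be passed to `ws.Range(...)` when an actual Range object is needed.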
I tried really hard to find how to do these simple lines of VBA code in Python via win32com but I couldn't find how to execute it properly :
ActiveSheet.PivotTables("PivotTable1").PivotFields("Quarters").ClearAllFilters
ActiveSheet.PivotTables("PivotTable1").PivotFields("Effective deadline"). _
PivotFilters.Add2 Type:=xlBefore, Value1:="10/10/2017"
When running these lines :
from win32com.client import DispatchEx
excel = DispatchEx('Excel.Application')
wb = excel.Workbooks.Open('myfile.xlsx')
ws = wb.Worksheets('MySheet')
ws.PivotTables(1).PivotFields("Quarters").PivotFilters('Add2', 'xlBefore', '10/10/2017')
I end up with an 'Invalid number of parameters' error, so I guess I'm quite close, but I can't find the documentation to complete my code.
Has anyone ever managed to do this kind of work ?
You are calling the wrong method. You should call .Add2 after the PivotFilters property:
ws.PivotTables(1).PivotFields("Effective deadline").ClearAllFilters()
ws.PivotTables(1).PivotFields("Effective deadline").PivotFilters.Add2(31, None, '10/10/2017')
Also, note that you need to pass the numeric value from the XlPivotFilterType enumeration for the type of filter you want to apply (in this case xlBefore = 31), not the name as a string.
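To keep the magic number out of the call site, the two calls can be wrapped in a small function. This is only a sketch: `filter_before` and `XL_BEFORE` are hypothetical names, and the function works on whatever pivot field object you pass in.

```python
XL_BEFORE = 31  # XlPivotFilterType.xlBefore, the value given above


def filter_before(pivot_field, date_text):
    """Clear a pivot field's filters, then keep only dates before date_text."""
    pivot_field.ClearAllFilters()
    # Positional arguments are Type, DataField, Value1; DataField is not
    # needed for this filter, so None is passed, as in the answer above.
    return pivot_field.PivotFilters.Add2(XL_BEFORE, None, date_text)
```

Usage would look like `filter_before(ws.PivotTables(1).PivotFields("Effective deadline"), '10/10/2017')`. If Excel was started through `win32com.client.gencache.EnsureDispatch`, the same value should also be available as `win32com.client.constants.xlBefore`.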
I am trying to use win32com to copy a worksheet from my workbook to a new workbook. The code is working fine but the cell formulas in the new book point back to the original book. I would like to break the links in the new book so that these formulas are replaced with raw numbers. This is trivial to do in Excel but I haven't been able to find out how to do it using the win32com client in Python.
Here is a snippet of my code:
import win32com.client
xl = win32com.client.gencache.EnsureDispatch('Excel.Application')
xl.Visible = True
#Open & Refresh Spreadsheet
wb = xl.Workbooks.Open(r"C:\Users\me\dummy.xlsx") #Dummy path
print("Refreshing data...")
wb.RefreshAll()
#Create new book and copy target sheet over
print("Opening new workbook")
nwb = xl.Workbooks.Add()
newfile = r"C:\Users\me\dummy2.xlsx"
wb.Worksheets(["Target Sheet"]).Copy(Before=nwb.Worksheets(1))
nwb.SaveAs(newfile)
This code works fine but in the saved "dummy2" file each of the cells containing formulas reference the original sheet. How can I break the links in the new book and/or copy values only from the original book?
Edit in response to @martineau's downvote of the answer and of the (admittedly unsatisfactory) Microsoft documentation.
I think you haven't been able to find out how to do this because you have been looking in the wrong place. Your question really has little to do with Python or with win32com.
This line
xl = win32com.client.gencache.EnsureDispatch('Excel.Application')
fires up a COM client called xl that talks to excel.exe. Your variable xl is a thin Python wrapper around a Microsoft COM object that can call Excel VBA functions. When you type xl., everything after the dot is expected to be a VBA object or method. Any value (other than strings and floats) that you get back from a call is a VBA object in a thin Python wrapper. Python conventions do not necessarily apply to such objects.
So to find out about what functions you need to call, you need to be looking at the Excel VBA documentation. One difficulty with that documentation is that it assumes you are writing VBA, not Python. The other is that it isn't all that well-written.
The VBA method you need is Workbook.BreakLink().
Call it after copying the worksheet into the new workbook and before saving the copy, like this (I'm using your dummy filename here; don't expect it to actually work without fixing that):
wb.Worksheets(["Target Sheet"]).Copy(Before=nwb.Worksheets(1))
nwb.BreakLink(Name=r"C:\Users\me\dummy.xlsx", Type=1)
nwb.SaveAs(newfile)
The name of the link is the filename it points to, and the type of the link is 1 (for a link to an Excel spreadsheet). In this case you know the name of the link source (since you just made a copy of it) so there is no need to ask what the filename is, but in the general case you need to call Workbook.LinkSources() to find out what they are, and break them one by one.
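For that general case, the loop might look like the sketch below. `break_all_links` and `XL_EXCEL_LINKS` are hypothetical names; the function assumes it is handed a win32com workbook object and that `LinkSources` returns nothing (None in Python) when the workbook has no links.

```python
XL_EXCEL_LINKS = 1  # link type for links to other Excel workbooks


def break_all_links(workbook):
    """Break every Excel link in a workbook and return the source names."""
    sources = workbook.LinkSources(XL_EXCEL_LINKS)
    if not sources:
        return []  # no external links, nothing to break
    broken = []
    for name in sources:
        # Replaces formulas that point at `name` with their current values.
        workbook.BreakLink(Name=name, Type=XL_EXCEL_LINKS)
        broken.append(name)
    return broken
```

In the snippet from the question you would call `break_all_links(nwb)` between the `Copy` and the `SaveAs`.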