I just created a script with openpyxl to update an xlsx file that we usually update manually every month.
It works fine, but the new file lost all the graphs and images that were in the workbook. Is there a way to keep them?
openpyxl version 2.5 will preserve charts in existing files.
Use xlwings instead, it works as a wrapper on pywin32 and can preserve the existing Excel graphs, pivot-tables, etc.
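Whichever library you end up using, it can help to check what actually survived the rewrite. An .xlsx file is a zip package, and Excel conventionally stores charts, drawings (shapes) and images under xl/charts/, xl/drawings/ and xl/media/. The sketch below compares those parts before and after a script has run. The function names are made up for this example, and comparing part names is only a rough check, since a library may renumber parts when it saves.

```python
import zipfile

# Folders inside the .xlsx package that conventionally hold graphical content.
GRAPHIC_PREFIXES = ("xl/charts/", "xl/drawings/", "xl/media/")


def graphic_parts(xlsx_path):
    """Return the sorted names of chart, drawing and image parts in a workbook."""
    with zipfile.ZipFile(xlsx_path) as package:
        return sorted(
            name for name in package.namelist()
            if name.startswith(GRAPHIC_PREFIXES)
        )


def lost_parts(original_path, rewritten_path):
    """Return parts present in the original workbook but absent after the rewrite."""
    before = set(graphic_parts(original_path))
    after = set(graphic_parts(rewritten_path))
    return sorted(before - after)
```

Calling lost_parts on the manually maintained file and on the file your script produced gives a quick list of what went missing; an empty list suggests the graphics were carried over.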
I have an Excel form with shapes. If I change only the sheet contents with Python, without changing the form itself, and save it, all the shapes disappear.
I'm using openpyxl.
How to persist excel shapes in python?
I think you're asking how to encode an Excel file's shapes in openpyxl data, and your goal is to have that data persist when you change the openpyxl data back into an Excel file format.
openpyxl is centered on transferability of data, data visualizations (e.g. charts), and some visual formatting information between Excel and python. But there is an openpyxl.chart.shapes submodule that may provide what you need.
I think you need "xlwings".
xlwings does a lot of things that openpyxl can't: it keeps the shapes you want, keeps the formulas, and so on. However, be careful, as it cannot operate in an environment without Excel.
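import xlwings
# xlwings drives a real Excel instance, so Excel must be installed locally.
app = xlwings.App(visible=False)
wb = app.books.open('test.xlsx')
ws = wb.sheets[0]
ws.shapes  # the shapes on the first sheet are still there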
xlwings docs
Situation
I'm working on a data project integrating Python in Google Colab and Excel 365 on Win 8.1. My Python code collects new data updates on a regimented schedule and then writes the data (i.e. overwrites, not appends) to a report on an Excel spreadsheet.
I have no issue getting this to work going to a standalone spreadsheet.
I know I could potentially do all this in Python and not use Excel at all, but I prefer not to reinvent the wheel and not spend hours hardcoding all the formulas and links already existing in Excel.
Goal
My goal is to:
1. Use new data from my python export to populate/overwrite a data table on Sheet A in an existing Excel workbook.
2. Then I have a separate Sheet B in the same Excel workbook performing calculations via pre-existing links connecting to the original data table on Sheet A. I then want the links to auto update each time my python export updates the data table on the first sheet.
Problem
The issues I am running into are that if I use the df.to_excel function to export the data and even if I use the spreadsheet name parameter, the export overrides the data table and names the tab okay, but wipes out any other pre-existing sheets within the same workbook.
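The sheet-wiping behaviour comes from writing a fresh file on every export. One possible way around it is to open the existing workbook in append mode and replace only the target sheet. The sketch below assumes pandas 1.3 or later with the openpyxl engine installed; replace_sheet is a name made up for this example. Because the file is round-tripped through openpyxl, formulas on other sheets are kept as text and should be recalculated when the workbook is next opened in Excel, but charts or shapes in the workbook may not survive.

```python
import pandas as pd


def replace_sheet(workbook_path, sheet_name, df):
    """Overwrite one sheet of an existing .xlsx file and keep the other sheets."""
    # mode="a" opens the existing file instead of creating a new one;
    # if_sheet_exists="replace" drops the old sheet and writes a fresh one.
    with pd.ExcelWriter(
        workbook_path,
        engine="openpyxl",
        mode="a",
        if_sheet_exists="replace",
    ) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
```

With this, the data table on Sheet A can be refreshed on each scheduled run while Sheet B and its formulas stay in the file.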
So I attempted a workaround by exporting to an external workbook and then trying to update the links in the second workbook automatically. The problem is that the links don't appear to update unless the source data file and the second workbook with the links are both manually opened and the updated file is then manually saved.
I tried using openpyxl to control the excel files but it appeared to have no effect on the files and no data was updated. (See code block and result at the end of this post.)
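One likely reason the openpyxl attempt had no effect: in the sample at the end of this post, Workbook.save and Workbook.close are only referenced on the class, never called on the loaded workbook, and no cell is ever changed. A minimal sketch of what a working version might look like follows; overwrite_table and its parameters are hypothetical names, and it assumes the target sheet already exists. Note that openpyxl does not evaluate formulas, so any dependent values are only recomputed once Excel opens the file.

```python
from openpyxl import load_workbook


def overwrite_table(workbook_path, sheet_name, rows):
    """Write rows (a list of lists) into one sheet from A1 and save in place."""
    wb = load_workbook(workbook_path)  # formulas are loaded as text, not values
    ws = wb[sheet_name]
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            # Only the cells covered by `rows` are overwritten; others are kept.
            ws.cell(row=r, column=c, value=value)
    wb.save(workbook_path)  # called on the instance, with parentheses
    wb.close()
```

The other sheets in the workbook are written back unchanged, including their formula text.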
Assistance
Does anybody know a way to use python to:
1. Overwrite a specific sheet within an Excel workbook without wiping out the other existing sheets? And then have the links on another sheet automatically update which are connected to the new data?
Or
2. Auto update external links between separate Excel workbooks while the files are unopened?
Or
3. Control an instance of excel that can open both files to allow the links to auto update and then save and close the files automatically?
I found a post from some years ago that identified a win32 package for python that appeared to be able to control instances of excel. When I try doing a pip install in Colab I got an error that the package was unrecognized or doesn't exist.
Ideally, I would prefer not to use VB if at all possible to solve this.
Any solutions are much appreciated.
Thanks in advance.
Sample Code that isn't producing any results:
import openpyxl
# Example code
from openpyxl import load_workbook
from openpyxl import Workbook
wb = load_workbook('/content/drive/MyDrive/Data/Series/AC5M.xlsx', keep_links=True)
ws = wb.active
Workbook.save
Workbook.close
print(ws)
Result:
"function openpyxl.workbook.workbook.Workbook.close"
Assuming I have an excel sheet already open, make some changes in the file and use pd.read_excel to create a dataframe based on that sheet, I understand that the dataframe will only reflect the data in the last saved version of the excel file. I would have to save the sheet first in order for pandas dataframe to take into account the change.
Is there any way for pandas or other Python packages to read an open Excel file and refresh its data in real time (without saving or closing the file)?
Have you tried the mitosheet package? It doesn't answer your question directly, but it lets you work on pandas DataFrames as you would on Excel sheets. That way you can edit the data on the fly as in Excel and still get a pandas DataFrame as a result (while it generates the code to perform the same operations in Python). Does this help?
There is no way to do this with pandas alone. The unsaved changes are not on disk, so pandas cannot read them from disk.
Be careful not to over-engineer, that being said:
Depending on your use case, if this is really needed, I could theoretically imagine a Robotic Process Automation like e.g. BluePrism, UiPath or PowerAutomate loading live data from Excel into a Python environment with a pandas DataFrame continuously and then changing it.
This use case would have to be a really important process though, otherwise licensing RPA is not worth it here.
import pandas as pd
df = pd.read_excel("path")
If you run the program in the Spyder IDE, you can see the data in the Variable Explorer.
I create new workbooks via xlsxwriter. Each of them needs a formatted header sheet, which is stored in a separate template workbook. I know this is impossible with xlsxwriter alone, because it cannot open the template workbook.
I thought of doing it with xlrd: read the sheet, then write it into the new workbook with xlsxwriter.
But is it possible to use a combination of those two libraries?
I know this question doesn't include any code, but I'm new to Python, and I would be grateful for any advice on how to deal with my problem.
xlrd and xlsxwriter aren't really designed to work together. Consider switching to the openpyxl library, which allows both reading and writing of spreadsheets and might allow you to do what you need quite easily.
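A minimal sketch of the single-library approach with openpyxl: instead of copying the header sheet into a new workbook (openpyxl's copy_worksheet only works within one workbook), load the template, add the data sheet to it, and save under a new file name. The function and parameter names here are made up for illustration.

```python
from openpyxl import load_workbook


def workbook_from_template(template_path, output_path, data_sheet, rows):
    """Create a new workbook that keeps the template's formatted header sheet."""
    wb = load_workbook(template_path)
    ws = wb.create_sheet(data_sheet)  # added after the existing sheets
    for row in rows:
        ws.append(row)  # each row is a list of cell values
    wb.save(output_path)  # the template file itself is left unchanged
```

Each generated workbook then starts as a copy of the template, so cell styles on the header sheet should carry over without being rebuilt by hand.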
I recently started to automate a report at work using Python. Since my data was provided to me in the form of an excel sheet, I felt the best way to do this was to use an excel python module. My module of choice was openpyxl. It worked great, I've used it to perform calculations and organise my data ready to plot charts. Now here's the problem...
I know that you cannot update existing charts using openpyxl so that option went out the window.
What I then tried to do was link the data in my openpyxl spreadsheet to another spreadsheet containing the charts (which is then linked to my Word document where the charts are to be displayed). After doing this I ran my script and, to my annoyance, the data links between my openpyxl spreadsheet and the charts spreadsheet had been severed. I guess this is because openpyxl creates a new spreadsheet when you save using the save function, so the links are severed.
My question is: are there any ways to maintain the data links?
It is currently not possible to maintain links between files. I think it would be possible to keep them as metadata but, for fairly obvious reasons, it won't necessarily be possible to validate them. The best way for this to happen would be through a pull request.
If you're on Windows you might look at using the Python for Windows extensions (pywin32), which allow you to remote-control the applications.