I have a live Excel worksheet on SharePoint. I want to update the contents of that sheet using data scraped from a website. The data scraping part is already done.
Things I want to do:
Connect to SharePoint and open the specific Excel sheet (by its link) so I can update it.
The worksheet has 8 columns. I want to update only 6 of them; the other 2 columns should be left untouched.
I would like to know how to make these two things happen. A little guidance would be helpful.
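For the second point, here is a minimal sketch using openpyxl. It assumes the workbook has already been downloaded from SharePoint to a local path (the download and upload steps are not shown), and the function name update_columns and its parameters are hypothetical. Only the listed columns are written, so the remaining columns keep whatever they contained.

```python
from openpyxl import load_workbook


def update_columns(path, sheet_name, rows, target_cols, first_row=2):
    """Write each row of scraped values into target_cols only.

    rows        : list of lists, one inner list per worksheet row
    target_cols : 1-based column numbers to update, e.g. [1, 2, 3, 4, 5, 6]
    first_row   : first worksheet row to write (2 assumes a header in row 1)
    """
    wb = load_workbook(path)
    ws = wb[sheet_name]
    for offset, values in enumerate(rows):
        for col, value in zip(target_cols, values):
            # Cells in columns outside target_cols are never touched.
            ws.cell(row=first_row + offset, column=col, value=value)
    wb.save(path)
```

For example, update_columns("report.xlsx", "Data", scraped_rows, [1, 2, 3, 4, 5, 6]) would leave columns 7 and 8 as they were; the file and sheet names here are placeholders.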
Related
Situation
I'm working on a data project integrating Python in Google Colab with Excel 365 on Windows 8.1. My Python code collects new data on a regular schedule and then writes it (i.e. overwrites rather than appends) to a report in an Excel spreadsheet.
I have no trouble getting this to work when writing to a standalone spreadsheet.
I know I could potentially do all this in Python and not use Excel at all, but I prefer not to reinvent the wheel and not spend hours hardcoding all the formulas and links already existing in Excel.
Goal
My goal is to:
1. Use new data from my Python export to populate/overwrite a data table on Sheet A in an existing Excel workbook.
2. Have a separate Sheet B in the same workbook, which performs calculations through pre-existing links to the data table on Sheet A, update automatically each time my Python export refreshes that table.
Problem
The issue I am running into is that when I export the data with df.to_excel, even with the sheet_name parameter set, the export overwrites the data table and names the tab correctly but wipes out every other pre-existing sheet in the same workbook.
So I attempted a workaround: exporting to an external workbook and then trying to update the links in the second workbook automatically. The problem is that the links don't appear to update unless both the source data file and the workbook containing the links are opened manually and the updated file is then saved manually.
I tried using openpyxl to control the Excel files, but it appeared to have no effect on the files and no data was updated. (See code block and result at the end of this post.)
Assistance
Does anybody know a way to use python to:
1. Overwrite a specific sheet within an Excel workbook without wiping out the other existing sheets, and then have the links on another sheet that point to the new data update automatically?
Or
2. Auto update external links between separate Excel workbooks while the files are unopened?
Or
3. Control an instance of Excel that can open both files to allow the links to auto update and then save and close the files automatically?
I found a post from some years ago that identified a win32 package for Python that appeared to be able to control instances of Excel. When I tried a pip install in Colab, I got an error saying the package was unrecognized or doesn't exist.
Ideally, I would prefer not to use VB if at all possible to solve this.
Any solutions are much appreciated.
Thanks in advance.
Sample Code that isn't producing any results:
import openpyxl
# Example code
from openpyxl import load_workbook
from openpyxl import Workbook
wb = load_workbook('/content/drive/MyDrive/Data/Series/AC5M.xlsx', keep_links=True)
ws = wb.active
Workbook.save
Workbook.close
print(ws)
Result:
"function openpyxl.workbook.workbook.Workbook.close"
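A note on the sample above: Workbook.save and Workbook.close only name the methods on the class; nothing is actually called, which is why the output is just a function reference. The methods have to be called on the loaded workbook, e.g. wb.save(path). Below is a minimal sketch for option 1, assuming the data table starts at cell A1 of the sheet; the function name overwrite_table is hypothetical. It rewrites the cells of one sheet in place and leaves the other sheets and their formulas alone. openpyxl does not evaluate formulas itself, so as far as I know the linked values are recalculated when the file is next opened in Excel. The win32 package most likely failed to install because Colab runs on Linux, where pywin32 is not available.

```python
from openpyxl import load_workbook


def overwrite_table(path, sheet_name, header, rows):
    """Replace the contents of one sheet; other sheets are left as they are."""
    wb = load_workbook(path)  # formulas are loaded as formula strings by default
    ws = wb[sheet_name]

    # Blank out the old table so stale rows do not survive a shorter export.
    for row in ws.iter_rows():
        for cell in row:
            cell.value = None

    # Write the header and the new data starting at A1 (an assumption).
    for r, values in enumerate([header, *rows], start=1):
        for c, value in enumerate(values, start=1):
            ws.cell(row=r, column=c, value=value)

    wb.save(path)  # called on the instance, with parentheses and a path
```

With a DataFrame df, the arguments could be list(df.columns) and df.values.tolist().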
I have a script that extracts data from a website using multiple URLs stored in an Excel file. The problem is that when the extraction is interrupted, all the data is lost and nothing is saved to the Excel file. I need something that reads the URLs from the Excel file, scrapes them one by one, and saves the Excel file after every iteration.
Please let me know if you need more information.
Thanks
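One way to approach this is to save the workbook after every URL and to skip rows that already have a result, so a rerun resumes where the last run stopped. The sketch below uses openpyxl and assumes the URLs are in column A and the results go in column B, with a header in row 1; scrape_all and the scrape callback are hypothetical names, and scrape stands in for whatever extraction function you already have.

```python
from openpyxl import load_workbook


def scrape_all(path, scrape, url_col=1, result_col=2, first_row=2):
    """Scrape each URL listed in the sheet, saving the file after every row."""
    wb = load_workbook(path)
    ws = wb.active
    for row in range(first_row, ws.max_row + 1):
        url = ws.cell(row=row, column=url_col).value
        done = ws.cell(row=row, column=result_col).value
        if not url or done is not None:
            continue  # skip blank rows and rows finished in an earlier run
        ws.cell(row=row, column=result_col, value=scrape(url))
        wb.save(path)  # checkpoint: results so far survive an interruption
```

Saving on every iteration is slow for large workbooks; saving every N rows would be a reasonable compromise.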
I have an existing Excel workbook with 6 sheets in it named Z_df_pca, Z_df_kpca, Z_df_ae, Y_pred_pca, Y_pred_kpca and Y_pred_ae. Imagine these sheets to be full of data obtained from a simulation.
I want Python to revise or update one of these sheets under the following conditions:
Other sheets should not be affected.
It should not create copies of sheets or of the workbook (for example, creating Z_df_pca-1 instead of updating Z_df_pca).
What's the simplest way to write to these 6 sheets, given that the output to be written is a dataframe?
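A minimal sketch with pandas, assuming pandas 1.3 or later (where the if_sheet_exists option is available) and openpyxl as the engine. Opening the writer in append mode keeps the other sheets, and if_sheet_exists="replace" swaps out the named sheet instead of creating a numbered copy. The helper name replace_sheet is hypothetical.

```python
import pandas as pd


def replace_sheet(path, sheet_name, df):
    """Replace one sheet of an existing .xlsx file with the contents of df."""
    with pd.ExcelWriter(
        path,
        engine="openpyxl",
        mode="a",                    # append to the existing workbook
        if_sheet_exists="replace",   # overwrite the sheet rather than add a copy
    ) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
```

Calling replace_sheet("results.xlsx", "Z_df_pca", df) once per sheet would cover all six; the workbook name is a placeholder. Note that the sheet is deleted and recreated, so any cell formatting on it is not preserved.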
I am looking to write certain columns of data from an Excel sheet to an HTML table. I don't want to always write specific, fixed cells into the table; the selection needs to be based on conditions. For example, if I have a table with columns Name/Age/Occupation, I would like to make an HTML table using just the Name and Occupation columns. Also, within Name, I would only like to write the names starting with 'N', along with the corresponding Occupation. The Excel sheet changes dynamically with new data every time, so I want to select the data by the conditions I set rather than by a specific cell or range of cells. Any suggestions using python/html/jquery or other methods are welcome.
First you should edit the Excel file, export it as a .csv file, and then work on that file using a programming language of your preference. It would be much more complicated to work directly on the .xls or .xlsx files. I recommend using Python with the pandas library, which reads csv files.
For parsing Excel files, I've had good success using openpyxl, a Python library to read/write Excel 2010 xlsx/xlsm files.
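As a rough sketch of the openpyxl route, the function below reads the first sheet, keeps only the requested columns, and emits a row only when a condition on that row holds. It assumes the first row of the sheet holds the column names; sheet_to_html and keep_row are hypothetical names chosen for this example.

```python
from html import escape

from openpyxl import load_workbook


def sheet_to_html(path, columns, keep_row):
    """Build an HTML table from selected columns of rows that pass keep_row."""
    wb = load_workbook(path, data_only=True)  # read cached values, not formulas
    rows = wb.active.iter_rows(values_only=True)
    header = list(next(rows))                 # first row = column names
    idx = [header.index(name) for name in columns]

    head = "".join(f"<th>{escape(name)}</th>" for name in columns)
    parts = ["<table>", f"<tr>{head}</tr>"]
    for row in rows:
        record = dict(zip(header, row))       # e.g. {"Name": ..., "Age": ...}
        if keep_row(record):
            cells = "".join(f"<td>{escape(str(row[i]))}</td>" for i in idx)
            parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)
```

For the example in the question, the call could look like sheet_to_html("people.xlsx", ["Name", "Occupation"], lambda r: str(r["Name"]).startswith("N")), where the file name is a placeholder. Because the filter runs on every call, new rows in the sheet are picked up without changing the code.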
I do a lot of data analysis in Excel and have been exploring Python and DataNitro to streamline my workflow. I specifically am trying to copy certain cells from one sheet in one Excel workbook, and paste them into certain cells in a certain sheet in another Excel workbook.
I have been storing ("copying") using CellRange (DataNitro), but am not sure how to copy the stored contents into a particular sheet, in another Excel workbook. Any clue how I may go about this? Also, is it possible to make the range defined for a CellRange conditional on certain cell properties?
I would really appreciate any help! Thank you, all.
Here's an example of copying:
data = CellRange("A1:A10").value
active_wkbk("Book2.xlsx")
CellRange("A1:A10").value = data
You can make the range conditional using regular Python logic (if statements, etc.).
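If DataNitro is not available, a similar copy can be sketched with openpyxl instead (a different library, working on saved files rather than a live Excel session). The function name copy_range and the condition callback are hypothetical; the condition here tests the cell value, but any check on the cell could be substituted. The range is assumed to span more than one cell.

```python
from openpyxl import load_workbook


def copy_range(src_path, src_sheet, dst_path, dst_sheet, cell_range,
               condition=lambda value: True):
    """Copy values in cell_range to the same cells of another workbook."""
    src = load_workbook(src_path, data_only=True)[src_sheet]
    dst_wb = load_workbook(dst_path)
    dst = dst_wb[dst_sheet]
    for row in src[cell_range]:          # e.g. "A1:A10" yields rows of cells
        for cell in row:
            if condition(cell.value):    # plain Python logic decides what to copy
                dst[cell.coordinate] = cell.value
    dst_wb.save(dst_path)
```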