I have a DataFrame and an Excel template file. The template has a worksheet that contains column headers and some formulas, plus pivot tables on other sheets.
I want to paste the data into that worksheet and then save the template as a new Excel file.
The first thing I noticed is that I cannot save the template as a new Excel file.
The second is that I cannot write the DataFrame to the existing worksheet; pandas creates a new sheet for the data instead.
Then I found an option for pd.ExcelWriter on the internet, if_sheet_exists='overlay', but it gives me this error:
'overlay' is not valid for if_sheet_exists. Valid options are 'error', 'new' and 'replace'.
I'm using pandas version 1.5.1. Is it still possible to achieve this, or is there a better solution?
from datetime import datetime
import pandas as pd
from openpyxl import Workbook, load_workbook
def write_report(df):
    template_filename = 'Daily Quality Report Template.xlsx'
    today_str = datetime.strftime(datetime.now(), '%Y%m%d')
    result_filename = f'Report\\Daily Quality Report {today_str}.xlsx'
    result_sheetname = today_str
    # create new file
    xlresult = Workbook()
    xlresult.save(result_filename)
    # write
    writer = pd.ExcelWriter(result_filename, engine='openpyxl', mode='a', if_sheet_exists='overlay')
    writer.book = load_workbook(template_filename)
    writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
    df.to_excel(writer, sheet_name=result_sheetname, startrow=1, header=False, index=False)
    writer.close()
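One possible approach, sketched under assumptions rather than offered as the definitive answer: copy the template to the new filename first, then open the copy in append mode and overlay the data onto the existing sheet. This assumes pandas 1.4 or later (the release that added 'overlay' to if_sheet_exists) and that the target sheet already exists in the template. The function name write_report_from_template and its parameters are made up for illustration.

```python
import shutil

import pandas as pd


def write_report_from_template(df, template_path, result_path, sheet_name):
    """Copy the template, then write df below the header row of an existing sheet."""
    # Work on a copy so the template itself is never modified.
    shutil.copyfile(template_path, result_path)
    # mode="a" opens the existing workbook; "overlay" writes into the
    # existing sheet instead of replacing it or creating a new one.
    with pd.ExcelWriter(result_path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        # startrow=1 skips the header row that the template already contains.
        df.to_excel(writer, sheet_name=sheet_name, startrow=1,
                    header=False, index=False)
```

Because the file is copied before it is opened, the other sheets in the workbook are carried over by openpyxl when the copy is saved.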
Related
I have an Excel file that contains 3 sheets (PizzaHut, InAndOut, ColdStone). I want to add an empty column to the InAndOut sheet.
import pandas as pd
path = 'C:\\testing\\test.xlsx'
data = pd.ExcelFile(path)
sheets = data.sheet_names
if 'InAndOut' in sheets:
    # something something: add an empty column called "toppings" to the sheet
    pass
# pd.ExcelFile has no to_excel method, so this line fails
data.to_excel('output.xlsx')
Been looking around, but I couldn't find an intuitive solution to this.
Any help will be appreciated!
Read in the sheet by name.
Do what you need to do.
Overwrite the sheet with the modified data.
import pandas as pd
path = 'C:\\testing\\test.xlsx'
sheet_name = 'InAndOut'
df = pd.read_excel(path, sheet_name=sheet_name)
# Do whatever
with pd.ExcelWriter(path, engine="openpyxl", mode="a", if_sheet_exists="replace") as writer:
    df.to_excel(writer, sheet_name=sheet_name, index=False)
See pd.read_excel and pd.ExcelWriter.
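For the specific case in the question, the "do what you need to do" step could look roughly like the sketch below. It assumes pandas 1.3 or later (for if_sheet_exists="replace"); the helper name add_empty_column is hypothetical. Note that replacing a sheet rewrites it from the DataFrame, so any cell formatting on that sheet is not kept.

```python
import pandas as pd


def add_empty_column(path, sheet_name, column):
    """Read one sheet, add an empty column, and overwrite that sheet in place."""
    df = pd.read_excel(path, sheet_name=sheet_name)
    # None values are written as empty cells; only the header gets text.
    df[column] = None
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
    return df
```

Usage would be something like add_empty_column(path, 'InAndOut', 'toppings'); the other sheets in the workbook are left alone.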
Hi, I am hoping someone can help me.
I have a large spreadsheet of data from which I have created a dictionary of DataFrames. However, I am struggling to export it to a single Excel file in which each DataFrame has its own sheet. As this could be used by other people, I would also like to make the file export flexible (it will be a clickable .exe file).
I have looked at the following posts for help but just can't seem to get my head around them:
Python - splitting dataframe into multiple dataframes based on column values and naming them with those values
Save list of DataFrames to multisheet Excel spreadsheet
My code is as follows:
# Sort the DataFrame (sort_values returns a new frame, so assign it back)
df = df.sort_values(by='Itinerary_Departure_Date')
# Separate bookings by itinerary
df_dict = dict(iter(df.groupby('Itinerary_Departure_Date')))
filepath = filedialog.asksaveasfilename(defaultextension='.xlsx')
def frames_to_excel(df_dict, path):
    # Write a dictionary of DataFrames to separate sheets within one file.
    with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
        for tab_name, frame in df_dict.items():
            # sheet names must be strings
            frame.to_excel(writer, sheet_name=str(tab_name))
frames_to_excel(df_dict, filepath)
Fixed it!
Went down a different rabbit hole!
# Separate bookings by itinerary
dict_of_itin = {k: v for k, v in df.groupby('Itinerary_Departure_Date')}
# Choose an empty Excel file from wherever it was saved
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
# mode='a' keeps the sheets already in the workbook and adds the new ones
with pd.ExcelWriter(file_path, engine='openpyxl', mode='a') as writer:
    for df_name, frame in dict_of_itin.items():
        frame.to_excel(writer, sheet_name=str(df_name))
It relies on the user first saving an empty spreadsheet wherever they want; the script then writes to it.
Not as elegant, but it works! :D
Chris
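For anyone who would rather not depend on a pre-saved empty workbook, here is a minimal sketch that creates the file directly. It is not the approach used above, just an alternative under assumptions: the names safe_sheet_name and frames_to_excel are made up here, and the cleanup rule reflects Excel's sheet-name limits (at most 31 characters, none of the characters [ ] : * ? / \). Two keys that become identical after cleanup would collide, which this sketch does not handle.

```python
import re

import pandas as pd


def safe_sheet_name(key, max_len=31):
    """Turn a group key into a string that Excel accepts as a sheet name."""
    # Replace the characters Excel forbids in sheet names with a dash.
    name = re.sub(r"[\[\]:*?/\\]", "-", str(key))
    # Excel limits sheet names to 31 characters.
    return name[:max_len] or "Sheet"


def frames_to_excel(frames, path):
    """Write each DataFrame in the dict to its own sheet of a new workbook."""
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for key, frame in frames.items():
            frame.to_excel(writer, sheet_name=safe_sheet_name(key), index=False)
```

The path could come from filedialog.asksaveasfilename(defaultextension='.xlsx'), so the user picks where the new file goes.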
I am trying to create a "master" repository Excel file from a CSV that is generated and overwritten every couple of hours. The code below creates a new Excel file and writes the contents of "combo1.csv" to "master.xlsx". However, whenever combo1.csv is updated, the code overwrites the contents of "master.xlsx". I need to append the contents of "combo1.csv" to "master.xlsx" without the headers being inserted every time. Can someone help me with this?
import pandas as pd
writer = pd.ExcelWriter('master.xlsx', engine='xlsxwriter')
df = pd.read_csv('combo1.csv')
df.to_excel(writer, sheet_name='sheetname')
writer.close()
Refer to the "Append Data at the End of an Excel Sheet" section of this Medium article:
Using Python Pandas with Excel Sheets
(Credit to Nensi Trambadiya for the article)
Basically, you have to read the Excel file first and count its rows before writing the new data below them.
reader = pd.read_excel(r'master.xlsx')
df.to_excel(writer,index=False,header=False,startrow=len(reader)+1)
First read the Excel file, then use the method below to append the rows.
import pandas as pd
df = pd.DataFrame({'Name': ['abc', 'def', 'xyz', 'ysv'],
                   'Age': [8, 45, 32, 26]})
# xlsxwriter cannot open existing files, so use openpyxl in append mode
# (if_sheet_exists='overlay' requires pandas 1.4 or later)
reader = pd.read_excel('master.xlsx')
with pd.ExcelWriter('master.xlsx', engine='openpyxl', mode='a',
                    if_sheet_exists='overlay') as writer:
    # rows already used: one header row plus len(reader) data rows
    # (assumes the data lives on the default sheet, 'Sheet1')
    df.to_excel(writer, index=False, header=False, startrow=len(reader) + 1)
import pandas as pd
# new dataframe with same columns
df = pd.read_csv('combo1.csv')
# read the existing file to find out how many rows it already has
reader = pd.read_excel('master.xlsx')
# open the existing workbook in append mode and write below the existing rows
# (if_sheet_exists='overlay' requires pandas 1.4 or later)
with pd.ExcelWriter('master.xlsx', engine='openpyxl', mode='a',
                    if_sheet_exists='overlay') as writer:
    df.to_excel(writer, index=False, header=False, startrow=len(reader) + 1)
Note that master.xlsx has to exist before running the script.
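If the requirement that the master file already exists is a nuisance, a small wrapper can cover both cases. The following is only a sketch: append_to_master is a hypothetical name, and it assumes the CSV always has the same columns in the same order as the master sheet. It writes the header once when the file is first created and uses openpyxl's ws.append on later runs.

```python
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook


def append_to_master(df, master_path, sheet_name="Sheet1"):
    """Create the master workbook on the first run, append rows afterwards."""
    master_path = Path(master_path)
    if not master_path.exists():
        # First run: write the data together with the header row.
        df.to_excel(master_path, sheet_name=sheet_name, index=False,
                    engine="openpyxl")
        return
    wb = load_workbook(master_path)
    ws = wb[sheet_name]
    # ws.append writes each row directly below the last used row.
    for row in df.itertuples(index=False, name=None):
        ws.append(list(row))
    wb.save(master_path)
```

Each scheduled run would then call something like append_to_master(pd.read_csv('combo1.csv'), 'master.xlsx').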
I'm in the midst of writing an IPython notebook that pulls the contents of a .csv file and pastes them into a specified tab of an .xlsx file. The tab is filled with pre-programmed formulas so that I can run an analysis on the original contents of the .csv file.
I've run into a snag, however, with the date fields that I copy over from the .csv into the .xlsx file.
The dates are not processed properly by the Excel formulas unless I double-click the date cells or apply Excel's "text to columns" function to the column of dates with a tab as the delimiter (which, I should note, does not split the cell).
I'm wondering if there's a way to either...
write a helper function that logs the keystrokes of applying the "text to columns" function call
write a helper function to double click and return down each row of the column of dates
from openpyxl import load_workbook
import pandas as pd
def transfer_hours(report_name, ER_hours_analysis_wb):
    df = pd.read_csv(report_name, index_col=0)
    book = load_workbook(ER_hours_analysis_wb)
    sheet_name = "ER Work Log"
    with pd.ExcelWriter("ER Hours Analysis 248112.xlsx",
                        engine='openpyxl') as writer:
        writer.book = book
        writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
        df.to_excel(writer, sheet_name=sheet_name,
                    startrow=1, startcol=0, engine='openpyxl')
Use the openpyxl module:
from openpyxl import load_workbook
wb = load_workbook(filename=filePath, read_only=False, data_only=False)
With data_only=False, cells return their formulas; with data_only=True, they return the values Excel cached the last time it saved the file instead.
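To make the difference concrete, here is a small sketch that reads the same cell both ways. The helper name read_cell_both_ways is invented for this example. One caveat worth knowing: openpyxl never evaluates formulas itself, so a workbook that was last saved by openpyxl rather than by Excel has no cached values, and the data_only=True read comes back as None.

```python
from openpyxl import load_workbook


def read_cell_both_ways(path, sheet_name, cell_ref):
    """Return (formula, cached_value) for one cell of a workbook."""
    # data_only=False: formula cells give back the formula text.
    with_formulas = load_workbook(path, data_only=False)
    # data_only=True: formula cells give back the last value stored in the file.
    with_values = load_workbook(path, data_only=True)
    formula = with_formulas[sheet_name][cell_ref].value
    cached = with_values[sheet_name][cell_ref].value
    return formula, cached
```

For a cell that holds a plain number or string rather than a formula, both reads return the same value.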
As great a tool as pandas is, there may be no reason to include it in this case.
Here is a shorter structure for what you're trying to accomplish:
import csv
import datetime
from openpyxl import load_workbook
def transfer_hours(report_name, ER_hours_analysis_wb):
    wb = load_workbook(ER_hours_analysis_wb)
    ws = wb['ER Work Log']
    with open(report_name, 'rt', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        # openpyxl rows and columns are 1-indexed
        for rownum, row in enumerate(reader, start=1):
            for colnum, col in enumerate(row, start=1):
                # this version treats every cell as a date
                dttm = datetime.datetime.strptime(col, "%m/%d/%Y")
                ws.cell(column=colnum, row=rownum).value = dttm
    wb.save('new_spreadsheet.xlsx')
From here you can decide which columns get which format based on their position in the CSV. Here is an example:
for rownum, row in enumerate(reader, start=1):
    # first column: copy the text as-is
    ws.cell(column=1, row=rownum, value=row[0])
    # second column: convert to a real datetime
    dttm = datetime.datetime.strptime(row[1], "%m/%d/%Y")
    ws.cell(column=2, row=rownum).value = dttm
For reference:
https://openpyxl.readthedocs.io/en/stable/usage.html
In Python, how do I read a file line-by-line into a list?
How to format columns with headers using OpenPyXL
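If you would still like to keep pandas in the picture, another option is to let read_csv parse the dates and then write the cells with openpyxl. This is only a sketch under assumptions: the function name csv_to_sheet_with_dates and all of its parameters are hypothetical, row 1 of the target sheet is assumed to hold the headers, and the CSV columns are assumed to line up with the sheet's columns starting at column A.

```python
import pandas as pd
from openpyxl import load_workbook


def csv_to_sheet_with_dates(csv_path, workbook_path, sheet_name,
                            date_columns, out_path):
    """Copy a CSV into an existing sheet, writing date columns as real dates."""
    # parse_dates makes pandas convert the named columns to datetime values.
    df = pd.read_csv(csv_path, parse_dates=date_columns)
    wb = load_workbook(workbook_path)
    ws = wb[sheet_name]
    # Data starts at row 2 because row 1 is assumed to be the header row.
    for r, row in enumerate(df.itertuples(index=False, name=None), start=2):
        for c, value in enumerate(row, start=1):
            if isinstance(value, pd.Timestamp):
                # Store a plain datetime so Excel sees a date, not text.
                value = value.to_pydatetime()
            ws.cell(row=r, column=c, value=value)
    wb.save(out_path)
```

Since the cells then hold real datetimes instead of strings, the double-click or "text to columns" step should no longer be needed.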
I am trying to use an existing Excel file as a template: append a DataFrame to it and save the result as another Excel file.
book = openpyxl.load_workbook(r'desktop\Template.xlsx')
writer = pd.ExcelWriter(r'desktop\Template.xlsx', engine='openpyxl', datetime_format='m/d/yyyy hh:mm:ss AM/PM')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
data_f1.to_excel(writer, sheet_name='Tickets', startrow=1, header=False, index=False)
writer.close()
I am also trying to change how pandas datetime values are displayed in Excel, but the datetime_format option is not working with openpyxl.
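One possible workaround, sketched here as an assumption rather than a confirmed fix: after the data has been written, reopen the file with openpyxl and set number_format on every cell that holds a datetime. The function name apply_datetime_format is made up; number_format is the regular openpyxl cell attribute.

```python
import datetime

from openpyxl import load_workbook


def apply_datetime_format(path, sheet_name,
                          number_format="m/d/yyyy hh:mm:ss AM/PM"):
    """Set the display format of every datetime cell on one sheet."""
    wb = load_workbook(path)
    ws = wb[sheet_name]
    for row in ws.iter_rows():
        for cell in row:
            # Only touch cells that actually contain a datetime value.
            if isinstance(cell.value, datetime.datetime):
                cell.number_format = number_format
    wb.save(path)
```

This changes only how the values are displayed; the stored datetimes themselves stay the same.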