I open a multi-sheet Excel file, read one sheet into a DataFrame, modify it, and write it back. Doing this creates a new sheet (sheet1), but the objective is to overwrite the old target sheet. When I try deleting the sheet before writing the DataFrame, it says 'sheet' does not exist.
Here is the code:
import openpyxl as op
import pandas as pd
basePath = filePath
wbk = op.load_workbook(basePath + "file.xlsx")
writer = pd.ExcelWriter(basePath + "file.xlsx", engine = 'openpyxl', mode="a", if_sheet_exists="replace")
writer.book = wbk
df = pd.read_excel(basePath + "file.xlsx", sheet_name="sheet")
df.insert(0,"newCol2","")
#wbk.remove_sheet(wbk.get_sheet_by_name('sheet'))
df.to_excel(writer, sheet_name = 'sheet', index=False)
writer.save()
writer.close()
What am I doing wrong?
Related
import xlrd
import pandas as pd
wb = xlrd.open_workbook("excel_1.xlsx")
sheets = wb.sheet_names()
xl = pd.ExcelFile("excel_1.xlsx")
for sheet in sheets:
    df = xl.parse(sheet)
    df = df.sort_values(by="column2")
    writer = pd.ExcelWriter("excel_2.xlsx")
    df.to_excel(writer, sheet_name=sheet, index=False)
    writer.save()
In excel_2.xlsx I can only find the sorted version of the last sheet of excel_1.xlsx.
Please help me in this regard.
Thank You
Create and save the writer outside of the loop.
You're overwriting the file for each sheet now.
import xlrd
import pandas as pd
wb = xlrd.open_workbook("excel_1.xlsx")
xl = pd.ExcelFile("excel_1.xlsx")
writer = pd.ExcelWriter("excel_2.xlsx")
for sheet in wb.sheet_names():
    df = xl.parse(sheet).sort_values(by="column2")
    df.to_excel(writer, sheet_name=sheet, index=False)
writer.save()
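The same idea can be sketched without xlrd and without an explicit save call, assuming a reasonably recent pandas with openpyxl installed. The function name sort_all_sheets and the demo data are made up for illustration; only the column name "column2" comes from the question.

```python
import os
import tempfile

import pandas as pd


def sort_all_sheets(src_path, dst_path, by):
    """Copy every sheet of src_path into dst_path, sorted by column `by`."""
    # sheet_name=None returns a dict mapping each sheet name to its DataFrame.
    sheets = pd.read_excel(src_path, sheet_name=None)
    # One writer for the whole workbook; it is saved once when the block exits.
    with pd.ExcelWriter(dst_path) as writer:
        for name, df in sheets.items():
            df.sort_values(by=by).to_excel(writer, sheet_name=name, index=False)
    return list(sheets)


# Demo with a throwaway two-sheet workbook (names and values are assumptions).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "excel_1.xlsx")
dst = os.path.join(tmp_dir, "excel_2.xlsx")
with pd.ExcelWriter(src) as writer:
    pd.DataFrame({"column2": [3, 1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"column2": [2, 3, 1]}).to_excel(writer, sheet_name="second", index=False)

written = sort_all_sheets(src, dst, by="column2")
```

Because the writer is created once, before the loop, every sheet lands in the same output file.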
How can I insert a row at the top of an Excel sheet? Goal as follows:
The file itself is written by using pandas.df.to_excel as follows:
import pandas
with pandas.ExcelWriter(output_filename) as writer:
    for file in files:
        df = pandas.read_csv(file)
        df.to_excel(writer, sheet_name=file.replace(".csv", "").replace("_", " ").title(), index=False)
Here is one way to do it using XlsxWriter as the Excel engine:
with pandas.ExcelWriter(output_filename, engine='xlsxwriter') as writer:
    for file in files:
        df = pandas.read_csv(file)
        sheet_name = file.replace(".csv", "").replace("_", " ").title()
        df.to_excel(writer, sheet_name=sheet_name, index=False, startrow=1)
        worksheet = writer.sheets[sheet_name]
        worksheet.write('A1', 'Here is some additional text')
You can use openpyxl to edit your Excel file afterwards:
import contextlib
import openpyxl
import pandas as pd
new_row = "THIS ROW IS APPENDED AFTER THE FILE IS WRITTEN BY PANDAS"
with contextlib.closing(openpyxl.open(output_filename)) as wb:
    for file in files:
        sheet_name = file.replace(".csv", "").replace("_", " ").title()
        sheet = wb[sheet_name]
        sheet.insert_rows(1)  # openpyxl rows are 1-indexed; this shifts everything down by one
        sheet["A1"] = new_row
    wb.save(output_filename)
My purpose:
if the Excel file does not exist, create it and copy the data table into it;
if the Excel file exists, copy the data table to a new sheet.
But when the following code runs, it only copies the data to the new sheet, and the original sheet in the Excel file is removed.
import os
import pandas as pd
import openpyxl
f_name = "123.xlsx" #target excel file
if os.path.exists(f_name):
    """if the excel file exists, add the table to another sheet"""
    wb = openpyxl.load_workbook(f_name) #load excel file
    writer = pd.ExcelWriter(f_name, engine="openpyxl")
    writer.wb = wb
    df = pd.DataFrame(pd.read_excel("table_2.xlsx")) #get table to be added to the excel file
    df.to_excel(writer, sheet_name="sheet2",index=False) #write to another sheet
    writer.save()
    writer.close()
else:
    """if the excel file does not exist, create it"""
    df_1 = pd.DataFrame() # create excel file
    df_1.to_excel(f_name)
    writer = pd.ExcelWriter(f_name)
    df_2 = pd.DataFrame(pd.read_excel("table_1.xlsx")) # get table_1
    df_2.to_excel(writer, sheet_name="sheet1",index=False) # write table_1 into excel file
    writer.save()
    writer.close()
import os
import pandas as pd
import openpyxl
f_name = "123.xlsx" #target excel file
if os.path.exists(f_name):
    """if the excel file exists, add the table to another sheet"""
    wb = openpyxl.load_workbook(f_name) #load excel file
    writer = pd.ExcelWriter(f_name, engine="openpyxl") #assign engine
    writer.book = wb #without this line the existing workbook is overwritten
    df = pd.DataFrame(pd.read_excel("table_2.xlsx")) #get table to be added to the excel file
    df.to_excel(writer, sheet_name="table_2",index=False) #write to another sheet
    writer.save()
    writer.close()
else:
    """if the excel file does not exist, create it"""
    df_1 = pd.DataFrame() # create excel file
    df_1.to_excel(f_name)
    writer = pd.ExcelWriter(f_name)
    df_2 = pd.DataFrame(pd.read_excel("table_1.xlsx")) # get table_1
    df_2.to_excel(writer, sheet_name="table_1",index=False) # write table_1 into excel file
    writer.save()
    writer.close()
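With newer pandas the create-or-append logic can probably be written without touching writer.book at all. The following is only a sketch, assuming pandas 1.3 or later and openpyxl; the function name write_sheet, the temporary path, and the small DataFrames standing in for table_1.xlsx and table_2.xlsx are all made up for illustration.

```python
import os
import tempfile

import pandas as pd


def write_sheet(path, df, sheet_name):
    """Add df as a sheet to path, creating the workbook if it does not exist."""
    if os.path.exists(path):
        # Append mode loads the existing workbook, so earlier sheets survive.
        writer_kwargs = {"mode": "a", "if_sheet_exists": "replace"}
    else:
        # Write mode creates a fresh workbook (if_sheet_exists is append-only).
        writer_kwargs = {"mode": "w"}
    with pd.ExcelWriter(path, engine="openpyxl", **writer_kwargs) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)


# Demo: the first call creates the file, the second adds a sheet to it.
f_name = os.path.join(tempfile.mkdtemp(), "123.xlsx")
write_sheet(f_name, pd.DataFrame({"x": [1, 2]}), "table_1")
write_sheet(f_name, pd.DataFrame({"y": [3, 4]}), "table_2")

with pd.ExcelFile(f_name) as xl:
    sheets_after = xl.sheet_names
```

The key point is the same as in the code above: the existing workbook has to be loaded before writing, and mode="a" does that for you.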
I am updating an existing Excel workbook using pandas. When using an ExcelWriter object, can I overwrite a sheet if it exists and otherwise create a new sheet? The code I have appends new sheets, but when I try to overwrite an existing sheet it appends a new sheet with a slightly varied name (ex: If sheet 'data1' exists, running the code appends a new sheet named 'data1 1').
import pandas as pd
import openpyxl
path = 'test-out.xlsx'
book = openpyxl.load_workbook(path)
df1 = pd.DataFrame({'a': range(10), 'b': range(10)})
writer = pd.ExcelWriter(path, mode='a')
writer.book = book
df1.to_excel(writer, sheet_name='data1')
writer.save()
Pass the sheets to the writer with writer.sheets = dict((ws.title, ws) for ws in book.worksheets):
import pandas as pd
import openpyxl
path = 'test-out.xlsx'
book = openpyxl.load_workbook(path)
df1 = pd.DataFrame({'a': range(10), 'b': range(10)})
writer = pd.ExcelWriter(path, mode='a')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df1.to_excel(writer, sheet_name='data1')
writer.save()
Edit:
Seems like you don't even need mode='w', writer = pd.ExcelWriter(path, mode='a') is still working...
This program should take the contents of individual sheets and put them into one Excel workbook. It almost does that, but it overwrites the output instead of appending new sheets to the final workbook. I read that the pandas ExcelWriter is the way to go for this. Any ideas as to why it's behaving this way?
import xlwt, csv, os
from openpyxl import load_workbook
import pandas as pd
from pandas import ExcelWriter
csv_folder = r'C:\Users\Me\Desktop\Test_Folder\\'
for fil in os.listdir(csv_folder):
    if '.xlsx' not in fil:
        continue
    else:
        pass
    df = pd.read_excel(csv_folder+fil, encoding = 'utf8')
    file_name = fil.replace('.xlsx','')
    writer = pd.ExcelWriter('condensed_output.xlsx', engine = 'xlsxwriter')
    df.to_excel(writer, sheet_name = file_name)
    writer.save()
    #writer.close()
Make sure the writer.save() is outside of the loop. Also be aware of the character limit on sheetnames, so if the file names are the same up to a certain point, you run the risk of writing over a sheetname that way as well.
import xlwt, csv, os
from openpyxl import load_workbook
import pandas as pd
from pandas import ExcelWriter
csv_folder = r'C:\Users\Me\Desktop\Test_Folder\\'
writer = pd.ExcelWriter('condensed_output.xlsx', engine = 'xlsxwriter')
for fil in os.listdir(csv_folder):
    if '.xlsx' not in fil:
        continue
    else:
        pass
    df = pd.read_excel(csv_folder+fil, encoding = 'utf8')
    file_name = fil.replace('.xlsx','')
    df.to_excel(writer, sheet_name = file_name)
writer.save() #make sure this is outside of the loop.
ETA: establish the writer outside of the loop as well
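For reference, here is a rough sketch of the same approach that also handles the sheet name limit mentioned above (Excel caps worksheet names at 31 characters). It assumes a recent pandas with openpyxl installed; the function name combine_workbooks, the suffix scheme for colliding names, and the demo files are made up for illustration.

```python
import os
import tempfile

import pandas as pd

MAX_SHEET_NAME = 31  # Excel's limit on worksheet name length


def combine_workbooks(folder, out_path):
    """Write the first sheet of every .xlsx in folder to its own sheet in out_path."""
    used = set()
    with pd.ExcelWriter(out_path) as writer:  # one writer, created before the loop
        for fil in sorted(os.listdir(folder)):
            full = os.path.join(folder, fil)
            # Skip non-Excel files and the output file itself.
            if not fil.endswith(".xlsx") or os.path.abspath(full) == os.path.abspath(out_path):
                continue
            base = os.path.splitext(fil)[0][:MAX_SHEET_NAME]
            name, n = base, 1
            # If truncation makes two names collide, add a numeric suffix.
            while name.lower() in used:
                suffix = f"_{n}"
                name = base[:MAX_SHEET_NAME - len(suffix)] + suffix
                n += 1
            used.add(name.lower())
            pd.read_excel(full).to_excel(writer, sheet_name=name, index=False)
    return sorted(used)


# Demo: two files whose names are identical in the first 31 characters.
in_dir, out_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
for quarter in ("q1", "q2"):
    pd.DataFrame({"v": [1]}).to_excel(
        os.path.join(in_dir, f"report_for_the_northern_region_2023_{quarter}.xlsx"),
        index=False)

out_file = os.path.join(out_dir, "condensed_output.xlsx")
names = combine_workbooks(in_dir, out_file)
```

The with block saves the workbook once, after all sheets have been written, which is the point of moving the writer and the save out of the loop.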