The problem is the following:
I'm loading an existing Excel file as follows:
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('template.xlsx')
writer = pd.ExcelWriter('template.xlsx', engine='openpyxl')
writer.book = book
Then I perform some modifications to the file and save it with
writer.save()
Since this procedure is part of a bigger pipeline, it would be useful to be able to save the modified workbook under a different name instead of overwriting template.xlsx. Is that possible?
Thanks in advance for any suggestions!
Why not just pass a new name to pd.ExcelWriter(...)?
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('template.xlsx')
writer = pd.ExcelWriter('foo.xlsx', engine='openpyxl')
writer.book = book
writer.save()
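If the modifications are made on the openpyxl workbook itself, a rough sketch with plain openpyxl may be enough, since a loaded workbook can be saved under any path. The helper name `save_copy_as` and the temporary directory are only illustrative assumptions, not something from the question:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def save_copy_as(src_path, dst_path):
    """Load src_path and save the (possibly modified) workbook as dst_path."""
    book = load_workbook(src_path)
    # ... modifications to `book` would go here ...
    book.save(dst_path)  # src_path on disk is left untouched
    return dst_path


# Small demo in a temporary directory (hypothetical file names).
workdir = tempfile.mkdtemp()
template = os.path.join(workdir, "template.xlsx")
Workbook().save(template)  # stand-in for the real template

renamed = save_copy_as(template, os.path.join(workdir, "foo.xlsx"))
print(os.path.exists(template), os.path.exists(renamed))
```

The template stays on disk unchanged and the edited copy gets the new name.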
Related
I'm creating an Excel workbook with many pandas DataFrames as the sheets. The code runs and creates the file, but the snippet I found below does not print a very useful notification once the file has been written.
What is a good way to do that?
import pandas as pd
df = pd.DataFrame(data={'col1':[9,3,4,5,1,1,1,1], 'col2':[6,7,8,9,5,5,5,5]})
df2 = pd.DataFrame(data={'col1':[25,35,45,55,65,75], 'col2':[61,71,81,91,21,31]})
with pd.ExcelWriter('test.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='testSheetJ', startrow=1, startcol=0)
    df2.to_excel(writer, sheet_name='testSheetJ', startrow=1+len(df)+3, startcol=0)
import os
if os.path.exists('test.xlsx'):
    print('test.xlsx is present')
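One possible way to get a more informative message is to report the file size along with the name, and to say so explicitly when the file is missing. This is only a sketch using the standard library; `describe_output` is a made-up helper name:

```python
from pathlib import Path


def describe_output(path):
    """Return a short status message for a file that should have been written."""
    p = Path(path)
    if p.is_file():
        # The size gives a quick sanity check that the file is not empty.
        return f"{p.name} written ({p.stat().st_size} bytes)"
    return f"{p.name} was NOT created"


# Call this after the `with pd.ExcelWriter(...)` block has finished.
print(describe_output("test.xlsx"))
```

Calling it after the `with` block matters, because the file is only finalized when the writer is closed.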
I am using openpyxl to add some data to an existing Excel file, but unfortunately it also changes the formatting of my chart (border, background and curve colors) and deletes a text box.
Does anyone know how to prevent these changes?
Thanks in advance!
Below is a simplified version of my code and a screenshot of my Excel file, so everyone can reproduce it.
import os
import sys
import pandas as pd
from openpyxl import load_workbook
folder = r'C:\MyFolder'
filename = r'test.xlsx'
writer = pd.ExcelWriter(os.path.join(folder, filename),
                        engine='openpyxl',
                        datetime_format='dd/mm/yyyy',
                        date_format='dd/mm/yyyy')
writer.book = load_workbook(os.path.join(folder, filename))
writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
writer.save()
I am new to Python. I am trying to load a large Excel file (15 MB) with 3 sheets/tabs, and I want to update the third one. Because I need to edit that sheet, I load the file with openpyxl.load_workbook() without read_only. My system hangs while loading. Could you please help? I don't want to use read_only=True, because I need to edit the third sheet.
Thanks,
import pandas as pd
from openpyxl import load_workbook
meta_df = pd.read_csv('metafile')
file = 'file.xlsx'
book = load_workbook(file)
writer = pd.ExcelWriter(file, engine='openpyxl')
writer.book = book
writer.sheets = dict((wsh.title, wsh) for wsh in book.worksheets)
meta_df.to_excel(writer, 'meta_data', index=False, header=False, startrow=1)
writer.save()
I have some code that goes like this:
#After performing some operation using pandas I have written df to the .xlsx
df.to_excel('file5.xlsx',index=False) # This excel has a single tab(sheet) inside
Then I have another .xlsx file (already provided), Final.xlsx, which contains multiple sheets: file1, file2, file3 and file4. I want to add the newly created file5.xlsx to Final.xlsx as a new sheet after sheet file4.
The answer below, provided by Anky, does add the content of file5.xlsx to 'Final.xlsx' as a sheet, but the existing sheets file1 to file4 lose content: the formatting is broken and data is missing.
import pandas
from openpyxl import load_workbook
book = load_workbook('foo.xlsx')
writer = pandas.ExcelWriter('foo.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df1 = pandas.read_excel('file5.xlsx')  # the module is imported as `pandas`, not `pd`
df1.to_excel(writer, sheet_name="new", index=False)
writer.save()
I need help to fix this.
I have asked this as a separate question: "Data missing, format changed in .xlsx file having multiple sheets using pandas, openpyxl while adding new sheet in existing .xlsx file".
import pandas
from openpyxl import load_workbook
book = load_workbook('foo.xlsx')
writer = pandas.ExcelWriter('foo.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, "file5",index=False)
writer.save()
The sheet name can be whatever you like, e.g. file5.
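On newer pandas versions the `writer.book = book` / `writer.save()` pattern may no longer work (as far as I know, both were deprecated in 1.5 and removed in 2.0). A sketch of the same idea using the append mode of `pd.ExcelWriter` is below. The temporary directory and the small stand-in for Final.xlsx are assumptions made only so the example is self-contained:

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

# Build a stand-in for the existing Final.xlsx (sheets file1..file4).
workdir = tempfile.mkdtemp()
final_path = os.path.join(workdir, "Final.xlsx")
wb = Workbook()
wb.active.title = "file1"
for name in ("file2", "file3", "file4"):
    wb.create_sheet(name)
wb.save(final_path)

df = pd.DataFrame({"col1": [1, 2, 3]})

# mode="a" opens the existing workbook and appends to it instead of
# starting from an empty one; the context manager saves on exit.
with pd.ExcelWriter(final_path, engine="openpyxl", mode="a") as writer:
    df.to_excel(writer, sheet_name="file5", index=False)

sheet_names = load_workbook(final_path).sheetnames
print(sheet_names)
```

Note that the workbook still goes through openpyxl, so anything openpyxl cannot read from the original file may still be lost on save.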
I have code from a while ago that I am re-using for a new task. The task is to write a new DataFrame into a new sheet, into an existing excel file. But there is one part of the code that I do not understand, but it just makes the code "work".
working:
from openpyxl import load_workbook
import pandas as pd
file = r'YOUR_PATH_TO_EXCEL_HERE'
df1 = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
book = load_workbook(file)
writer = pd.ExcelWriter(file, engine='openpyxl')
writer.book = book # <---------------------------- piece i do not understand
df1.to_excel(writer, sheet_name='New', index=None)
writer.save()
The little line writer.book = book has me stumped. Without it, all other sheets in the Excel file are deleted, and only the sheet named in the sheet_name= parameter of df1.to_excel remains.
I looked at xlsxwriter's documentation as well as openpyxl's, but cannot figure out why that line gives me the expected output. Any ideas?
Edit: I believe this post is where I got the original idea from.
In the source code of ExcelWriter, the openpyxl engine creates a new, empty workbook and removes its default sheet, so none of the sheets from the existing file are present. That's why you need to assign the loaded workbook explicitly:
class _OpenpyxlWriter(ExcelWriter):
    engine = 'openpyxl'
    supported_extensions = ('.xlsx', '.xlsm')

    def __init__(self, path, engine=None, **engine_kwargs):
        # Use the openpyxl module as the Excel writer.
        from openpyxl.workbook import Workbook

        super(_OpenpyxlWriter, self).__init__(path, **engine_kwargs)

        # Create workbook object with default optimized_write=True.
        self.book = Workbook()

        # Openpyxl 1.6.1 adds a dummy sheet. We remove it.
        if self.book.worksheets:
            try:
                self.book.remove(self.book.worksheets[0])
            except AttributeError:
                # compat
                self.book.remove_sheet(self.book.worksheets[0])
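The effect can be reproduced with plain openpyxl, without pandas. The following is only an illustration of the mechanism, not the pandas code itself; the sheet names and the temporary path are made up:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

# An "existing" file with two sheets.
existing = Workbook()
existing.active.title = "Old1"
existing.create_sheet("Old2")
existing.save(path)

# Roughly what happens without `writer.book = book`: a brand-new workbook
# is filled with the new sheet and saved over the same path.
fresh = Workbook()
fresh.remove(fresh.worksheets[0])  # drop the default sheet
fresh.create_sheet("New")
fresh.save(path)
after_overwrite = load_workbook(path).sheetnames

# Restore the file, then start from the loaded workbook instead.
existing.save(path)
book = load_workbook(path)
book.create_sheet("New")
book.save(path)
after_append = load_workbook(path).sheetnames

print(after_overwrite, after_append)
```

In the first case only the new sheet survives, because the file is simply replaced; in the second the old sheets are kept because they were part of the workbook object that got saved.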