unable to append pandas Dataframe to existing excel sheet - python

I am quite new to Python/pandas. I have to update an existing sheet with new data every week. This 'new' data is processed from raw CSV files that are generated every week, and I have already written Python code that produces it as a pandas DataFrame. Now I want to append this DataFrame to an existing sheet in my Excel workbook. I am currently using the code below to write the DataFrame to a specific sheet in the workbook.
import openpyxl
import pandas
workbook_master=openpyxl.load_workbook(r'C:\Claro\Pre-Sales\E2E Optimization\Transport\Transport Network Dashboard.xlsx')
writer=pandas.ExcelWriter(r'C:\Claro\Pre-Sales\E2E Optimization\Transport\Transport Network Dashboard.xlsx',engine='openpyxl',mode='a')
df_latency.to_excel(writer,sheet_name='Latency',startrow=workbook_master['Latency'].max_row,startcol=0,header=False,index=False)
writer.save()
writer.close()
Now the problem: when I run the code and open the Excel file, instead of writing the DataFrame to the existing sheet 'Latency', the code creates a new sheet 'Latency1' and writes the DataFrame there. The contents and positioning of the DataFrame are correct, but I do not understand why the code creates a new sheet 'Latency1' instead of writing into the existing sheet 'Latency'.
I would greatly appreciate any help here.
Thanks
Faheem

By default, when ExcelWriter is instantiated, its sheets attribute starts out empty, so the writer behaves as if the workbook had no worksheets.
So when you try to write data into 'Latency', it creates a new blank worksheet instead. In addition, the openpyxl library performs a check before writing to "avoid duplicate names" (see openpyxl docs: line 18), which numerically increments the sheet name, so the data ends up in 'Latency1' instead.
To get around this problem, copy the existing worksheets into the ExcelWriter.sheets attribute after the writer is created.
Like this:
writer.sheets = dict((ws.title, ws) for ws in workbook_master.worksheets)
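In newer pandas releases (1.5 and later, as far as I know), writer.sheets can no longer be assigned to, so the line above may raise an AttributeError there. A minimal sketch of an alternative, assuming pandas 1.4 or later with openpyxl installed, where append mode accepts if_sheet_exists="overlay". The helper name append_df_to_sheet is hypothetical, not part of pandas.

```python
import pandas as pd


def append_df_to_sheet(path, df, sheet_name):
    """Append df below the last used row of an existing sheet (hypothetical helper)."""
    # mode="a" opens the existing workbook; "overlay" writes into the
    # existing sheet instead of creating a renamed copy such as 'Latency1'.
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        # In append mode, writer.sheets maps titles to the loaded worksheets.
        start = writer.sheets[sheet_name].max_row
        # startrow is 0-based, so max_row points at the first free row.
        df.to_excel(writer, sheet_name=sheet_name, startrow=start,
                    header=False, index=False)
```

With the question's names, the call would look like append_df_to_sheet(path_to_workbook, df_latency, 'Latency'). The context manager saves and closes the file, so no explicit save() or close() is needed.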

Related

Struggling to append dataframe to existing .xlsx file in Python

I am trying to append a dataframe to an existing excel spreadsheet, but I am having trouble appending it to an existing SHEET (my excel file only has one sheet, titled "Sheet1," that contains the existing dataset).
with pd.ExcelWriter(xlsx_path, mode="a", engine="openpyxl", sheet_name="Sheet1", if_sheet_exists="overlay") as writer:
    transfer.to_excel(writer, header=None, index=False)
When I use the aforementioned code, when I open the existing spreadsheet, the new data from the dataframe I requested to be appended via the to_excel function appears in a separate sheet, entitled "Sheet 11." Can someone elucidate why this is occurring? How can I just get the new data from the dataframe to appear at the bottom of the existing spreadsheet in Sheet1?
Thanks!
Refer to notes written above.
I don't know why the data is appended to 'Sheet11'; however, sheet_name is not a parameter of ExcelWriter, so you should get a warning about that. It should be passed to to_excel instead.
You'll need to state which row to append from, otherwise the new data will start from row 1 and overwrite any existing data. You can get the max row for the sheet and use that.
sheet_to_update = 'Sheet1'
with pd.ExcelWriter(xlsx_path,
                    mode="a",
                    engine="openpyxl",
                    if_sheet_exists="overlay") as writer:
    transfer.to_excel(writer,
                      header=None,
                      index=False,
                      sheet_name=sheet_to_update,
                      startrow=writer.sheets[sheet_to_update].max_row)

How to export python dataframe into existing excel sheet and retain formatting?

I am trying to export a dataframe I've generated in Pandas to an Excel Workbook. I have been able to get that part working, but unfortunately no matter what I try, the dataframe goes into the workbook as a brand new worksheet.
What I am ultimately trying to do here is create a program that pulls API data from a website and imports it in an existing Excel sheet in order to create some sort of "live updating excel workbook". This means that the worksheet already has proper formatting, vba, and other calculated columns applied, and all of this would ideally stay the same except for the basic data in the dataframe I'm importing.
Is there any way to go about this? Any direction at all would be quite helpful. Thanks.
Here is my current code:
import pandas as pd
file='testbook.xlsx'
writer = pd.ExcelWriter(file, engine = 'xlsxwriter')
df.to_excel(writer, sheet_name="Sheet1")
workbook = writer.book
worksheet = writer.sheets["Sheet1"]
writer.close()  # saves the file; the original had writer.save without parentheses
If your existing Excel file and your DataFrame have the same format, you can read the existing file into another DataFrame, concatenate the two, and save the result to a new Excel file or to the existing one.
import pandas as pd
df1 = pd.read_excel('testbook.xlsx')
df2 = df  # your DataFrame
combined = pd.concat([df1, df2], ignore_index=True)
combined.to_excel('testbook.xlsx', index=False)
There are multiple ways of doing it; if you want to do it entirely with the pandas library, this will work. Note that this rewrites the whole file, so formatting already applied to the worksheet is not kept.
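To leave the existing cells and their formatting in place, the new rows can instead be appended with openpyxl directly. This is a minimal sketch under assumptions: the helper name append_rows_in_place is hypothetical, the workbook is a plain .xlsx file, and only values are written. openpyxl does not preserve everything in a workbook (a macro-enabled file would need load_workbook(..., keep_vba=True), for example), so it is worth testing on a copy first.

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def append_rows_in_place(path, df, sheet_name):
    """Append df's values to an existing sheet (hypothetical helper)."""
    wb = load_workbook(path)
    ws = wb[sheet_name]
    # dataframe_to_rows yields one list per row; skip the index and header
    # so only the data lands below the existing content.
    for row in dataframe_to_rows(df, index=False, header=False):
        ws.append(row)  # writes to the first row after the last used one
    wb.save(path)
```

Because the existing cells are never rewritten, their fonts, fills, and number formats stay as they were; the appended rows get default formatting.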

Is it possible to append data to an xls file in Python?

I am trying to add a large dataset to an existing xls spreadsheet.
I'm currently writing to it using a pandas DataFrame and the .to_excel() function; however, this erases the existing data in the (multi-sheet) workbook. The existing spreadsheet is very large and complex, and it also interacts with several other files, so I can't convert it to xlsx or read and rewrite all of the data, as I've seen suggested on other questions. I want the data that I am adding to be pasted starting from a set row in an existing sheet.
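You can use the XlsxWriter library (https://xlsxwriter.readthedocs.io) to write a workbook. Note that XlsxWriter only creates new .xlsx files and cannot read or modify an existing file, so the example below writes a fresh workbook rather than appending to your .xls file.
Code example:
import xlsxwriter
name = "MyFile" + ".xlsx"
workbook = xlsxwriter.Workbook(name)
worksheet = workbook.add_worksheet()
# On Python 3, strings are already Unicode, so no .decode("utf-8") is needed
worksheet.write("A1", "Incident category")
worksheet.write("B1", "Longitude")
worksheet.write("C1", "Latitude")
workbook.close()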

XlsxWriter: Generate multi-worksheet workbook from separate python files?

I am writing a Python script using XlsxWriter to generate an .xlsx file comprising multiple worksheets. Each worksheet will have multiple tables and lots of formatting, hence my code is getting pretty long. Therefore, I am looking for a way to split the code up, e.g. Worksheet 1 corresponding to worksheet1.py, with a 'main' file to compile the worksheets into a single workbook.
I have tried using a function to create a worksheet and calling that from another file to add to an existing workbook - but this method does not work. XlsxWriter requires you to add the worksheet to an existing workbook. (If I'm missing something and this is possible please let me know).
Alternately, I thought of creating individual workbooks with a single worksheet inside and using a second package (openpyxl) to collate the worksheets. However, I think this will alter the formatting on the worksheets. (Again, please let me know if I am missing something).
Any ideas on this subject would be gratefully received.
Thanks
Edit: example table
Pandas will actually be very helpful in this case.
First, create a writer for your Excel file:
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
Create your tables as DataFrames, then place each table into whichever sheet you want by passing the sheet name as an argument:
df.to_excel(writer,sheet_name='Sheet 1',startrow=0 , startcol=0)
Put another table in the same sheet:
df_1.to_excel(writer,sheet_name='Sheet 1',startrow=20 , startcol=0)
Change startrow to control where the table starts, or change the sheet name to target a different sheet. Call writer.close() at the end to save the file.
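To split the worksheet code across files, one option is to give each worksheet its own function that receives the shared writer. The sketch below assumes pandas with the xlsxwriter engine installed; the function names, sheet names, and sample data are all hypothetical, and in a real project each build_* function could live in its own module (worksheet1.py, worksheet2.py) and be imported by the main file.

```python
import pandas as pd


def build_summary_sheet(writer):
    """Could live in worksheet1.py (hypothetical module)."""
    df = pd.DataFrame({"item": ["a", "b"], "value": [1, 2]})
    df.to_excel(writer, sheet_name="Summary", startrow=0, startcol=0, index=False)
    # writer.sheets exposes the underlying XlsxWriter worksheet for formatting.
    writer.sheets["Summary"].set_column(0, 1, 15)


def build_detail_sheet(writer):
    """Could live in worksheet2.py (hypothetical module)."""
    df = pd.DataFrame({"x": [10, 20], "y": [30, 40]})
    df.to_excel(writer, sheet_name="Detail", startrow=0, index=False)
    # Second table in the same sheet, placed a few rows below the first.
    df.to_excel(writer, sheet_name="Detail", startrow=len(df) + 3, index=False)


def build_workbook(path):
    """The 'main' file: one writer, shared by every worksheet builder."""
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        build_summary_sheet(writer)
        build_detail_sheet(writer)
```

Calling build_workbook('test.xlsx') writes both sheets into one file. The same pattern should work without pandas by passing an xlsxwriter.Workbook object to each function and calling add_worksheet inside it.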

How to write a dataframe to excel, SPECIFICALLY to an existing sheet by not overwriting existing data

I need to write multiple dataframes to an Excel file. These dataframes need to be written to a specific sheet, and they should not overwrite existing data on that sheet.
The code I have is as follows:
import pandas as pd
from openpyxl import load_workbook
excelbook = 'test.xlsx'
book = load_workbook(excelbook)
writer = pd.ExcelWriter(excelbook, engine = 'openpyxl')
writer.book = book
df.to_excel(writer, sheet_name = 'apple', startcol=5, startrow=0)
writer.save()
writer.close()
The problem with my code is that each time I run it to write a dataframe, it creates a new sheet in the Excel file. For example, if the sheet name I need is "apple", then since I'm running this piece of code 3 times (to write 3 dataframes to the same sheet), it creates a new sheet each time and names them "apple1", "apple2" and "apple3".
I need to write multiple dataframes to the same excel file, to the same sheet in that file, without overwriting the existing data in the sheet.
Please help. Thanks in advance.
