I would like to know if there is a way to write a dataframe to Excel using pandas.DataFrame.to_excel while keeping the data already in an existing sheet. I need this because many dataframes are created in a loop, so I need to store them together by appending each one. Here is some code:
import pandas as pd

lista = [1, 2, 3, 4, 5, 6]
with pd.ExcelWriter('teste.xlsx', mode='a') as writer:
    pd.DataFrame(lista).to_excel(writer, header=False, index=False, startrow=3, startcol=1,
                                 sheet_name="Hoja1")
The result is very annoying because even though I pass "Hoja1" as the sheet_name it ends up creating another sheet with a similar name.
I am trying to append a dataframe to an existing excel spreadsheet, but I am having trouble appending it to an existing SHEET (my excel file only has one sheet, titled "Sheet1," that contains the existing dataset).
with pd.ExcelWriter(xlsx_path, mode="a", engine="openpyxl", sheet_name="Sheet1", if_sheet_exists="overlay") as writer:
    transfer.to_excel(writer, header=None, index=False)
When I use the aforementioned code, when I open the existing spreadsheet, the new data from the dataframe I requested to be appended via the to_excel function appears in a separate sheet, entitled "Sheet 11." Can someone elucidate why this is occurring? How can I just get the new data from the dataframe to appear at the bottom of the existing spreadsheet in Sheet1?
Thanks!
Refer to notes written above.
I don't know why the data is appended to 'Sheet11'. However, 'sheet_name' is not a parameter of ExcelWriter, so you should get a warning about that; it should be passed to 'to_excel' instead.
You'll also need to state which row to start appending from; otherwise the new data will start at row 1, overwriting any existing data. You can get the max row of the sheet and use that.
sheet_to_update = 'Sheet1'
with pd.ExcelWriter(xlsx_path,
                    mode="a",
                    engine="openpyxl",
                    if_sheet_exists="overlay") as writer:
    transfer.to_excel(writer,
                      header=None,
                      index=False,
                      sheet_name=sheet_to_update,
                      startrow=writer.sheets[sheet_to_update].max_row)
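For anyone who wants to try this end to end, here is a minimal sketch of the same approach. It assumes a pandas version that supports if_sheet_exists="overlay" (1.4 or later) with openpyxl installed; the temporary file and the two small dataframes are made up purely for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file and data, used only for this illustration.
xlsx_path = os.path.join(tempfile.mkdtemp(), "example.xlsx")
existing = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
transfer = pd.DataFrame({"a": [7, 8], "b": [9, 10]})

# Create a workbook whose only sheet, "Sheet1", already holds data.
existing.to_excel(xlsx_path, sheet_name="Sheet1", index=False)

sheet_to_update = "Sheet1"
with pd.ExcelWriter(xlsx_path, mode="a", engine="openpyxl",
                    if_sheet_exists="overlay") as writer:
    # max_row is the 1-based number of the last used row and startrow is
    # 0-based, so this begins writing on the first free row.
    first_free_row = writer.sheets[sheet_to_update].max_row
    transfer.to_excel(writer, sheet_name=sheet_to_update,
                      header=False, index=False, startrow=first_free_row)

# Read every sheet back to check that no extra sheet was created.
result = pd.read_excel(xlsx_path, sheet_name=None)
sheet_names = list(result)
combined = result["Sheet1"]
```

After this runs, the workbook should still contain only Sheet1, with the new rows directly under the old ones. Running the with block again inside a loop appends each further dataframe the same way.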
I am trying to export a dataframe I've generated in Pandas to an Excel Workbook. I have been able to get that part working, but unfortunately no matter what I try, the dataframe goes into the workbook as a brand new worksheet.
What I am ultimately trying to do here is create a program that pulls API data from a website and imports it in an existing Excel sheet in order to create some sort of "live updating excel workbook". This means that the worksheet already has proper formatting, vba, and other calculated columns applied, and all of this would ideally stay the same except for the basic data in the dataframe I'm importing.
Is there any way to go about this? Any direction at all would be quite helpful. Thanks.
Here is my current code:
file = 'testbook.xlsx'
writer = pd.ExcelWriter(file, engine='xlsxwriter')
df.to_excel(writer, sheet_name="Sheet1")
workbook = writer.book
worksheet = writer.sheets["Sheet1"]
writer.save()
If the existing Excel file and your DataFrame have the same format, you can simply read the existing file into another DataFrame, concatenate the two, and save the result to a new file or over the existing one.
df1 = pd.read_excel('testbook.xlsx')  # existing data
df2 = ...  # your DataFrame
df = pd.concat([df1, df2], ignore_index=True)
df.to_excel('testbook.xlsx', index=False)
There are multiple ways of doing it; if you want to do it using only the pandas library, this will work. Note that this rewrites the whole file, so formatting, formulas, and macros in the existing workbook are not preserved.
I have a dataframe in my Jupyter notebook that I can successfully write to an Excel file with pandas ExcelWriter, but I'd rather split the dataframe into smaller dataframes (based on its index), then loop through them to write each to a different sheet in one Excel file. This seems syntactically correct but my code cell just runs without ever finishing:
path = r'/root/notebooks/my_file.xlsx'
writer = ExcelWriter(path)
sheets = df.index.unique().tolist()
for sheet in sheets:
    df.loc[sheet].to_excel(writer, sheet_name=sheet, index=False)
writer.save()
I've tried a few different approaches without any luck. Am I missing something simple?
It is hard to determine the issue on your system without an error message (as you have said, the cell just keeps running). You might check the size of your dataset, since you are writing one Excel sheet per unique index value. If the index has many unique values, you will get that many sheets.
However, I tried your code with my own dataset and there are some errors that can be fixed anyway.
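import pandas as pd

path = 'raw/test_so.xlsx'
writer = pd.ExcelWriter(path)
sheets = df.index.unique().tolist()
for sheet in sheets:
    # df.loc[[sheet]] always returns a DataFrame, even for a single row
    df.loc[[sheet]].to_excel(writer, sheet_name=str(sheet), index=False)
# close() saves the file; writer.save() was removed in pandas 2.0
writer.close()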
Note the df.loc[[sheet]] for each sheet: the double brackets keep the result a dataframe, so it is written to Excel with its column headers.
If your dataframe index is an integer, make sure that you use sheet_name=str(sheet), as the sheet name cannot be an integer.
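As an alternative way to express the same idea, the loop can be written with groupby and a context manager, which saves and closes the file automatically. This is only a sketch: the sample dataframe, its "year" index, and the temporary path are assumptions made for illustration, and openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data: the index holds the value that decides the sheet.
df = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=pd.Index([2021, 2021, 2022, 2023], name="year"),
)
path = os.path.join(tempfile.mkdtemp(), "by_index.xlsx")

with pd.ExcelWriter(path, engine="openpyxl") as writer:
    # groupby(level=0) yields one sub-DataFrame per unique index value.
    for key, group in df.groupby(level=0):
        # Sheet names must be strings, so convert the key explicitly.
        group.to_excel(writer, sheet_name=str(key), index=False)

# Read all sheets back as a dict of {sheet name: DataFrame}.
written = pd.read_excel(path, sheet_name=None)
sheet_names = list(written)
```

Each group is always a dataframe, so the column headers are kept without needing the double-bracket form. Keep in mind that Excel limits sheet names to 31 characters, so long index values may need shortening.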
I am quite new to Python/Pandas. I have a situation where I have to update an existing sheet with new data every week. This 'new' data is processed from raw CSV files that are generated every week, and I have already written Python code to produce it as a pandas DataFrame. Now I want to append this DataFrame to an existing sheet in my Excel workbook. I am already using the code below to write the DataFrame to a specific sheet of the workbook.
workbook_master = openpyxl.load_workbook(r'C:\Claro\Pre-Sales\E2E Optimization\Transport\Transport Network Dashboard.xlsx')
writer = pandas.ExcelWriter(r'C:\Claro\Pre-Sales\E2E Optimization\Transport\Transport Network Dashboard.xlsx', engine='openpyxl', mode='a')
df_latency.to_excel(writer, sheet_name='Latency', startrow=workbook_master['Latency'].max_row, startcol=0, header=False, index=False)
writer.save()
writer.close()
Now the problem is that when I run the code and open the Excel file, instead of writing the DataFrame to the existing sheet 'Latency', the code creates a new sheet 'Latency1' and writes the DataFrame to it. The contents and the positioning of the DataFrame are correct, but I do not understand why the code creates a new sheet 'Latency1' instead of writing the DataFrame into the existing sheet 'Latency'.
I will greatly appreciate any help here.
Thanks
Faheem
By default, when ExcelWriter is instantiated, it assumes a new Empty Workbook with no Worksheets.
So when you try to write data into 'Latency', it creates a new blank Worksheet instead. In addition, the openpyxl library performs a check before writing to "avoid duplicate names" (see the openpyxl docs, line 18), which numerically increments the sheet name, so the data is written to 'Latency1' instead.
To work around this problem, copy the existing Worksheets into the ExcelWriter.sheets attribute after the writer is created.
Like this:
writer.sheets = dict((ws.title, ws) for ws in workbook_master.worksheets)
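To make the renaming behaviour concrete, here is a simplified sketch of that kind of duplicate-name check. It is not openpyxl's actual code; the function name next_free_title is hypothetical and the logic is only an approximation of what the library does.

```python
import re


def next_free_title(existing_titles, requested):
    """Return `requested`, or `requested` plus a number if it is taken."""
    taken = {title.lower() for title in existing_titles}
    if requested.lower() not in taken:
        return requested
    # Collect numeric suffixes already in use, e.g. "Latency3" -> 3.
    pattern = re.compile(re.escape(requested) + r"(\d*)$", re.IGNORECASE)
    used = [
        int(m.group(1))
        for title in existing_titles
        if (m := pattern.match(title)) and m.group(1)
    ]
    # Append the next number after the highest suffix found (or 1 if none).
    return f"{requested}{max(used, default=0) + 1}"


print(next_free_title(["Latency"], "Latency"))
print(next_free_title(["Sheet1"], "Sheet1"))
```

Under these assumptions the two calls print Latency1 and Sheet11. The second case shows why writing to an existing 'Sheet1' produces a sheet called 'Sheet11': the number is appended to the full requested name.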
I want to import the values from a Pandas dataframe into an existing Excel sheet. I want to insert the data into the sheet without deleting what is already in the other cells (such as formulas that use that data).
I tried using data.to_excel like:
writer = pd.ExcelWriter(r'path\TestBook.xlsm')
data.to_excel(writer, 'Sheet1', startrow=1, startcol=11, index = False)
writer.save()
The problem is that this way I overwrite the entire sheet.
Is there a way to only add the dataframe? It would be perfect if I could also keep the format of the destination cells.
Thanks
I found a good solution for it. xlwings natively supports pandas dataframes:
https://docs.xlwings.org/en/stable/datastructures.html#pandas-dataframes
ExcelWriter provides a mode parameter to write a new file ('w') or append to an existing one ('a'). In append mode, the if_sheet_exists parameter controls what happens when the target sheet already exists. See the example below:
with pd.ExcelWriter(p_file_name, mode='a') as writer:
    df.to_excel(writer, sheet_name='Data', startrow=2, startcol=2)
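A small sketch of what append mode does when the target sheet does not exist yet: the new sheet is added and the sheets already in the workbook are left alone. The file path, the 'Summary' sheet, and the sample data are assumptions for illustration, and openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook that already contains one sheet, "Summary".
p_file_name = os.path.join(tempfile.mkdtemp(), "report.xlsx")
pd.DataFrame({"x": [1, 2]}).to_excel(p_file_name, sheet_name="Summary", index=False)

df = pd.DataFrame({"y": [3, 4]})
# mode="a" opens the existing file instead of replacing it.
with pd.ExcelWriter(p_file_name, mode="a", engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Data", startrow=2, startcol=2, index=False)

# List the sheets now present in the workbook.
sheets = list(pd.read_excel(p_file_name, sheet_name=None))
summary = pd.read_excel(p_file_name, sheet_name="Summary")
```

Here the workbook ends up with both 'Summary' and 'Data'. If 'Data' already existed, the outcome would depend on if_sheet_exists, which defaults to raising an error in recent pandas versions.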