Append data to the last row of the Excel sheet using Pandas - python

I have Excel data for three variables (Acct, Order, Date) in a sheet named Orders.
I have created a DataFrame by reading this sheet:
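import pandas as pd
sheet_file = pd.ExcelFile("Orders.xlsx", engine="openpyxl")
worksheets = sheet_file.sheet_names  # assumed: read every sheet in the file
append_data = []  # collect one DataFrame per worksheet
for sheet_name in worksheets:
    df = pd.read_excel(sheet_file, sheet_name=sheet_name, header=1)
    append_data.append(df)
append_data = pd.concat(append_data)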
I have another Excel file called "Total_Orders.xlsx" with ~100k rows and I need to append the above dataframe to this excel file (Sheet Name="Orders")
with pd.ExcelWriter('Total_Orders.xlsx',sheet_name='Orders',engine="openpyxl") as writer:
    append_data.to_excel(writer,startrow=2,header=False,index=False)
    writer.save()
The above is overwriting the data instead of appending it. I know startrow is the key here but I am not sure how to fix this. Any help is much appreciated

Have you tried mode="a", along these lines:
with pd.ExcelWriter("Total_Orders.xlsx", mode="a", engine="openpyxl") as writer:
    append_data.to_excel(writer, sheet_name="Orders")
EDIT - in response to comment
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import load_workbook
append_data = pd.DataFrame([{'Acct':3, 'Order':333, 'Note':'third'},
                            {'Acct':4, 'Order':444, 'Note':'fourth'}])
wb = load_workbook(filename = "stackoverflow.xlsx")
ws = wb["Orders"]
# No index, and don't append the column headers
for r in dataframe_to_rows(append_data, index=False, header=False):
    ws.append(r)
wb.save("stackoverflow.xlsx")
After running this, the new rows are added at the end of the 'Orders' sheet in stackoverflow.xlsx, and the 'Other' sheet is not affected.
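Newer pandas releases can also do the append through ExcelWriter itself. The following is only a minimal sketch, assuming pandas 1.4 or later (where if_sheet_exists="overlay" is available) and the openpyxl engine; the helper name append_with_overlay is hypothetical and not part of either library. It uses the sheet's max_row as startrow, which is what the question was trying to work out.

```python
import pandas as pd


def append_with_overlay(path, df, sheet_name):
    """Hypothetical helper: write df below the last used row of an existing sheet."""
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        # In append mode writer.book is the loaded openpyxl workbook.
        # max_row is 1-based and startrow is 0-based, so startrow=max_row
        # points at the first row after the existing data.
        last_row = writer.book[sheet_name].max_row
        df.to_excel(writer, sheet_name=sheet_name, startrow=last_row,
                    header=False, index=False)
```

It could be called as append_with_overlay("Total_Orders.xlsx", append_data, "Orders"). One caveat: openpyxl reports max_row as 1 for a completely empty sheet, so in that case the first row would be left blank.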

Related

Overwrite a sheet in excel using python

I'm trying to overwrite one sheet of my excel file with data from a .txt file. The excel file I'm bringing the data into has several sheets but I only want to overwrite the 'Previous Month' sheet. Every time I run this code and open the excel file only the previous month sheet is there and nothing else. Many solutions on here show how to add more sheets, I'm trying to update an already existing sheet in an excel with 8 sheets total.
How can I fix my code so that only the one sheet is edited but all of them stay there?
import pandas as pd
#importing previous month data#
writer = pd.ExcelWriter('file.xlsx')
df = pd.read_csv('file.txt', sep='\t')
df.to_excel(writer, sheet_name='Previous Month', startrow=4, startcol=2)
writer.save()
writer.close()
Edited code (whatever is happening here keeps corrupting my original file):
import pandas as pd
import openpyxl
#importing previous month data#
writer= pd.ExcelWriter('file.xlsx', mode= 'a', engine="openpyxl", if_sheet_exists="replace")
df = pd.read_csv('file.txt', sep='\t')
df.to_excel(writer, sheet_name="Previous Month", startrow=2, startcol=4)
writer.save()
writer.close()
You can use openpyxl.load_workbook() to do what you are looking for. I tried the suggestions above, but they didn't work for me, whereas load_workbook() usually runs without issues, so I hope this works for you as well.
I open the output file with load_workbook(), delete the existing sheet (Sheet2 here) if it exists, then create it again with create_sheet() and write the data with dataframe_to_rows (Ref). Let me know if you have any questions or issues.
import pandas as pd
import openpyxl
from openpyxl.utils.dataframe import dataframe_to_rows
df = pd.read_csv('file.txt', sep='\t')
wb = openpyxl.load_workbook('output.xlsx')  # Open workbook
if "Sheet2" in wb.sheetnames:  # If sheet exists, delete it
    del wb['Sheet2']
ws = wb.create_sheet(title='Sheet2')  # Create new sheet
rows = dataframe_to_rows(df, index=False, header=True)  # Write dataframe as rows
for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
        # The 2 and 4 are the offsets, similar to startrow and startcol in your code
        ws.cell(row=r_idx+2, column=c_idx+4, value=value)
wb.save('output.xlsx')
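For comparison, the pandas-only route attempted in the question can be sketched with a context manager. This is a minimal sketch, assuming pandas 1.3 or later (where if_sheet_exists="replace" is available) and the openpyxl engine; replace_sheet is a hypothetical helper name. The with block saves and closes the file on exit, so no separate writer.save() or writer.close() call is needed.

```python
import pandas as pd


def replace_sheet(path, df, sheet_name, startrow=0, startcol=0):
    """Hypothetical helper: overwrite one sheet, leaving the other sheets in place."""
    # mode="a" loads the existing workbook instead of starting a new one;
    # if_sheet_exists="replace" drops the old sheet before writing the new data.
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        df.to_excel(writer, sheet_name=sheet_name,
                    startrow=startrow, startcol=startcol, index=False)
```

With the names from the question, a call might look like replace_sheet('file.xlsx', df, 'Previous Month', startrow=2, startcol=4).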

Read each excel sheet as a different dataframe in Python

I have an Excel file with 40 sheets. I want to read each sheet into a different DataFrame, so I can export an xlsx file for each sheet.
Instead of writing all the sheet names one by one, I want to create a loop that gets all the sheet names and passes each one to the sheet_name option of pandas.read_excel.
I am trying to avoid this:
df1 = pd.read_excel(r'C:\Users\filename.xlsx', sheet_name= 'Sheet1');
df2 = pd.read_excel(r'C:\Users\filename.xlsx', sheet_name= 'Sheet2');
....
df40 = pd.read_excel(r'C:\Users\filename.xlsx', sheet_name= 'Sheet40');
thank you all guys
Specifying sheet_name as None with read_excel reads all worksheets and returns a dict of DataFrames.
import pandas as pd
file = r'C:\Users\filename.xlsx'  # raw string, so the backslashes are not treated as escapes
xl = pd.read_excel(file, sheet_name=None)
sheets = xl.keys()
for sheet in sheets:
    xl[sheet].to_excel(f"{sheet}.xlsx")
I think this is what you are looking for.
import pandas as pd
xlsx = pd.read_excel('file.xlsx', sheet_name=None, header=None)
for sheet in xlsx.keys():
    xlsx[sheet].to_excel(sheet+'.xlsx', header=False, index=False)

Pandas create a new sheet instead of adding the data in the active one

I am creating a spreadsheet with openpyxl and adding some data.
import pandas as pd
import numpy as np
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import load_workbook
from collections import OrderedDict
workbook = Workbook()
sheet = workbook.active
def fill_static_values():
    sheet["A1"] = "Run No."
    sheet["A2"] = "MLIDMLPA"
    sheet["A48"] = "Patients here"
    sheet["B1"] = "Patient"
fill_static_values()
output = "./Name_of_run.xlsx"
workbook.save(filename=output)
Then my application does some data management, and I want to add some of this data to the existing file.
book = load_workbook(output)
writer = pd.ExcelWriter(output, engine='openpyxl')
writer.book = book
## ExcelWriter for some reason uses writer.sheets to access the sheet.
## If you leave it empty it will not know that sheet Main is already there
## and will create a new sheet.
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
data_no_control.to_excel(writer, "sheet", startrow=2, startcol=3,
                         header=False,
                         index=False)
writer.save()
Solution found on this StackOverflow link
However, this adds the data in the correct position but in a new sheet called sheet2. What am I doing wrong?
The sheet name passed to to_excel is incorrect: the S should be capitalized. Change the line from
data_no_control.to_excel(writer, "sheet", startrow=2, startcol=3,
to
data_no_control.to_excel(writer, "Sheet", startrow=2, startcol=3,
Because the workbook already contains a sheet, the data is being written to Sheet2.
EDIT
I noticed that you are using writer.sheets. If you want the program to pick up the first sheet from the workbook automatically, you can use this as well:
data_no_control.to_excel(writer, sheet_name=list(writer.sheets.keys())[0], startrow=2, startcol=3,
This will pick up the first sheet (in your case the only sheet) as the worksheet to update
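The naming issue can be checked directly in openpyxl. A minimal sketch, using only a fresh in-memory workbook like the one the question creates with Workbook(); the variable names are illustrative:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

# openpyxl titles the first sheet of a new workbook "Sheet"
default_title = sheet.title

# sheetnames is a plain list of strings, so the membership test is case-sensitive
has_lowercase = "sheet" in workbook.sheetnames
has_capitalized = "Sheet" in workbook.sheetnames

print(default_title, has_lowercase, has_capitalized)  # Sheet False True
```

Passing sheet.title as the sheet name, rather than typing the name by hand, is one way to avoid this kind of mismatch.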

Appending to a sheet in Excel creates a new sheet instead of appending

I am trying to use this code to append a dataframe to an existing sheet in Excel, but instead of appending the new data to it, it creates a new sheet. Here is the code:
import pandas as pd
import openpyxl as op
df = ['normal_dataframe']
with pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name='Sheet1', header=False, index=False)
'test.xlsx' has a 'Sheet1', but after the file is appended to, there are two sheets: 'Sheet1' and 'Sheet11'.
One approach with COM:
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
path = r'c:\Users\Alex20\Documents\test.xlsx'
wb = xl.Workbooks.Open(path)
ws = wb.Worksheets("Sheet1")
ws.Range("E9:F10").Value = [[9,9],[10,10]]
wb.Close(True)
xl.Quit()

Creating a Master excel file from dynamic CSV output using Python

I am trying to create a repository "Master" excel file from a CSV which will be generated and overwritten every couple of hours. The code below creates a new excel file and writes the content from "combo1.csv" to "master.xlsx". However, whenever the combo1 file is updated, the code basically overwrites the contents in the "master.xlsx" file. I need to append the contents from "combo1" to "Master" without the headers being inserted every time. Can someone help me with this?
import pandas as pd
writer = pd.ExcelWriter('master.xlsx', engine='xlsxwriter')
df = pd.read_csv('combo1.csv')
df.to_excel(writer, sheet_name='sheetname')
writer.save()
Refer to the "Append Data at the End of an Excel Sheet" section of this Medium article: Using Python Pandas with Excel Sheets (credit to Nensi Trambadiya for the article).
Basically you'll have to first read the Excel file and find the number of rows before pushing the new data.
reader = pd.read_excel(r'master.xlsx')
df.to_excel(writer,index=False,header=False,startrow=len(reader)+1)
Read the Excel file first, then use the approach below to append the rows.
import pandas as pd
df = pd.DataFrame({'Name': ['abc','def','xyz','ysv'],
                   'Age': [8,45,32,26]})
sheet = 'sheetname'  # sheet name used in the question
# xlsxwriter cannot open existing files, so use openpyxl in append mode
# (if_sheet_exists='overlay' needs pandas 1.4 or later)
reader = pd.read_excel(r'master.xlsx', sheet_name=sheet)
with pd.ExcelWriter('master.xlsx', engine='openpyxl', mode='a',
                    if_sheet_exists='overlay') as writer:
    df.to_excel(writer, sheet_name=sheet, index=False, header=False,
                startrow=len(reader)+1)
import pandas as pd
from openpyxl import load_workbook
# new dataframe with same columns
df = pd.read_csv('combo.csv')
writer = pd.ExcelWriter('master.xlsx', engine='openpyxl')
# try to open an existing workbook
writer.book = load_workbook('master.xlsx')
# copy existing sheets
writer.sheets = dict((ws.title, ws) for ws in writer.book.worksheets)
# read existing file
reader = pd.read_excel(r'master.xlsx')
# write out the new sheet
df.to_excel(writer, index=False, header=False, startrow=len(reader) + 1)
writer.close()
Note that master.xlsx has to be created before running the script.
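The requirement that the master file already exists can be relaxed by checking for it first. A minimal sketch, assuming openpyxl is installed; append_to_master is a hypothetical helper name, and the default sheet name is taken from the question. On the first run it writes the header row once; on later runs it appends only the data rows.

```python
import os

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def append_to_master(df, master_path, sheet_name="sheetname"):
    """Hypothetical helper: create the master file once, then append rows to it."""
    if not os.path.exists(master_path):
        # First run: create the workbook and write the header row once
        df.to_excel(master_path, sheet_name=sheet_name, index=False,
                    engine="openpyxl")
        return
    wb = load_workbook(master_path)
    ws = wb[sheet_name]
    # Later runs: add only the data rows below the existing ones
    for row in dataframe_to_rows(df, index=False, header=False):
        ws.append(row)
    wb.save(master_path)
```

Each time combo1.csv is regenerated, the script could call append_to_master(pd.read_csv('combo1.csv'), 'master.xlsx').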
