I just want to overwrite certain columns of an existing Excel file with the contents of my dataframe. Suppose df2 is my dataframe.
Below is the code I use. The problem is that it overwrites the other columns and rows, even though I told it to start at column 80.
I want it to overwrite column 80 and beyond only, and nothing before column 80. 80 is the column index, not its name.
import pandas as pd
import xlsxwriter
df2 = pd.read_excel(r'C:\Users\RUI LEONHART\Google Drive\Shop\STOCK V2.xlsx',
usecols=['XS1','S1','M1','L1','XL1','XXL1'])
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df2.to_excel(writer, sheet_name='Sheet1', startcol=80)
# Get the xlsxwriter objects from the dataframe writer object.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Close the Pandas Excel writer and output the Excel file.
writer.close()
I searched around for a solution. The closest one is this:
python: update dataframe to existing excel sheet without overwriting contents on the same sheet and other sheets
but it still overwrites the columns and rows that I don't want touched.
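One approach that may help here, sketched under a few assumptions: XlsxWriter can only create new files, so it cannot keep what is already in a workbook. Since pandas 1.4, pd.ExcelWriter with the openpyxl engine accepts mode="a" together with if_sheet_exists="overlay", which writes into an existing sheet and leaves cells outside the written block alone. The file, its contents and the column values below are made up for illustration, and the workbook is created in a temporary folder only so the sketch is self-contained.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical stand-in for the existing workbook (created here for the demo).
path = os.path.join(tempfile.mkdtemp(), "pandas_simple.xlsx")
pd.DataFrame({"item": ["shirt", "cap"], "price": [10, 5]}).to_excel(
    path, sheet_name="Sheet1", index=False
)

# Hypothetical size columns to place at column index 80 and beyond.
df2 = pd.DataFrame({"XS1": [1, 2], "S1": [3, 4]})

# mode="a" reopens the existing file; "overlay" writes into the existing
# sheet instead of replacing it (assumes pandas 1.4 or newer).
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    df2.to_excel(writer, sheet_name="Sheet1", startcol=80, index=False)

# Check the result: openpyxl uses 1-based rows and columns.
ws = load_workbook(path)["Sheet1"]
kept = ws.cell(row=2, column=1).value      # original data in the first column
written = ws.cell(row=1, column=81).value  # header written at 0-based column 80
```

Only the cells covered by df2 (starting at 0-based column 80) are rewritten; anything already stored in earlier columns stays as it was.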
I'm trying to append some information to an Excel file using pandas.
My workbook has several sheets, most of them with formulas.
I'm only trying to replace the cells of one specific sheet, sheet2.
Function:
def write():
    df = pd.read_excel(path, "sheet2")
    df.loc['A','B'] = 10
    with pd.ExcelWriter(path, engine="openpyxl", mode="a", if_sheet_exists="replace") as writer:
        df.to_excel(writer, sheet_name="sheet2")
The problem
The cell values in that sheet get replaced, BUT in the other sheets the cells with formulas are empty.
When I open the file in Excel's Protected View they are empty, but once I enable editing they reappear.
Help.
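A likely explanation, offered as an assumption rather than a confirmed diagnosis: openpyxl keeps formulas as text and never computes them, so after it saves the file the formula cells carry no cached result until Excel recalculates them, which Protected View may not do. The sketch below uses a made-up workbook in a temporary folder to show that the formula itself survives the pandas write while its cached value is missing.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "book.xlsx")

# Hypothetical workbook: sheet1 holds a formula, sheet2 is the sheet to replace.
wb = Workbook()
ws = wb.active
ws.title = "sheet1"
ws["A1"] = 1
ws["A2"] = 2
ws["A3"] = "=SUM(A1:A2)"
wb.create_sheet("sheet2")
wb.save(path)

# Replace sheet2 only, as in the question.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="replace"
) as writer:
    pd.DataFrame({"B": [10]}, index=["A"]).to_excel(writer, sheet_name="sheet2")

# The formula text is still stored in sheet1 ...
formula = load_workbook(path)["sheet1"]["A3"].value
# ... but no computed value is cached, because Excel has not recalculated yet.
cached = load_workbook(path, data_only=True)["sheet1"]["A3"].value
```

If this is indeed the cause, opening the file in Excel with editing enabled and saving it once should store the computed values again.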
I created a pandas dataframe in my code and tried to append the final output to an existing Excel workbook. The existing workbook is called "Directory" and has three different sheets in it. I want to append my output to one of the sheets in the workbook, called "raw_data". This sheet already has some data in it, and its columns match the columns of my new dataframe. Here is my code:
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
from openpyxl import Workbook
with pd.ExcelWriter(r'C:\Users\Documents\Directory.xlsx', engine='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name='raw_data', index=False, header=False)
# The with block saves and closes the file on exit,
# so no explicit writer.save() or writer.close() is needed.
My code "runs" without any error, but when I check the workbook afterwards, the data frame has not been appended to the specified sheet, "raw_data". Instead, a new sheet called "raw_data1" is created and the data is stored in that tab. I couldn't figure out which part of my code is incorrect. Could anyone please help me with this? Thank you.
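A minimal sketch of one way to get the append behaviour, assuming pandas 1.4 or newer with the openpyxl engine: pass if_sheet_exists="overlay" so the existing sheet is reused, and start writing at the first free row. The workbook, sheet contents and column names here are invented and created in a temporary folder so the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook and data, created only to make the sketch runnable.
path = os.path.join(tempfile.mkdtemp(), "Directory.xlsx")
existing = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
new_rows = pd.DataFrame({"id": [3, 4], "value": [30, 40]})

with pd.ExcelWriter(path, engine="openpyxl") as writer:
    existing.to_excel(writer, sheet_name="raw_data", index=False)
    pd.DataFrame({"note": ["untouched"]}).to_excel(
        writer, sheet_name="other", index=False
    )

# Reopen the file and write below the last used row of the existing sheet.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    start = writer.sheets["raw_data"].max_row  # rows already used, header included
    new_rows.to_excel(
        writer, sheet_name="raw_data", startrow=start, index=False, header=False
    )

result = pd.read_excel(path, sheet_name="raw_data")
with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names
```

Because startrow is zero-based and max_row counts the rows already in use, the new rows land directly under the existing data, and no extra "raw_data1" sheet is created.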
I have an Excel workbook that has more than one worksheet (i.e. sheet1 and sheet2),
and I did this:
import pandas
df1 = pandas.read_excel('file.xlsx', sheet_name='sheet1')
#### doing something on sheet1, sheet2 is not touched ####
df1.to_excel('file.xlsx', sheet_name='sheet1')
After doing the above, I found sheet2 missing once the file was saved.
Is there a way to open a file and save to the same file without affecting the other worksheets?
A possible way to do that is to load all of your sheets and then modify only the first one. Although it works, you may lose any custom styling in your tables.
import pandas as pd
# Load all sheets into a dict of {sheet name: DataFrame}
workbook = pd.read_excel('file.xlsx', sheet_name=None)
# do something to workbook['sheet1']
# Write all sheets back to the Excel file; the with block saves on exit
with pd.ExcelWriter('file.xlsx', engine='xlsxwriter') as writer:
    for sheet, df in workbook.items():
        # index=False avoids adding an extra index column on every round trip
        df.to_excel(writer, sheet_name=sheet, index=False)
As far as I know, the only way to overwrite a sheet, while keeping the other ones untouched, requires using third-party libraries. For instance, here's an option with openpyxl:
First, modify the data as you wish:
import pandas as pd
fname = 'file.xlsx'
target_sheet = 'sheet1'
df = pd.read_excel(fname, sheet_name=target_sheet)
# further modification to `df` ...
then, save it to the specified sheet:
# Load required functions
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
# Read excel file (all sheets)
wb = load_workbook(fname)
# Get the index from target-sheet
idx = wb.sheetnames.index(target_sheet)
# Delete the existing target-sheet
del wb[target_sheet]
# Create a new empty target-sheet
wb.create_sheet(target_sheet, idx)
# Write `df` data on it
for r in dataframe_to_rows(df, index=False, header=True):
    wb[target_sheet].append(r)
# Save file
wb.save(fname)
I suspect something in the section below is throwing off the code:
####doing something on ws1, ws2 is not touched######
When I ran your code on my system, the workbook still contained both worksheets.
As an isolation test, can you comment out or remove the code in that section and confirm whether the error still appears?
Using xlsxwriter, one can write a dataframe 'df' to Excel 'simple.xlsx' using code such as:
import pandas as pd
writer = pd.ExcelWriter('simple.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.close()
With the above code, I see that all cells in the resulting Excel sheet (except the header) are left-aligned by default.
Question:
How can I make the Excel cell values center-aligned?
I did explore conditional formatting, but since my cell values are a combination of blanks, zeros, floats, strings and integers, I am wondering if there is another way.
Is there a smarter/quicker way to do either or both of the following?
Any way to write the dataframe to Excel center-aligned? Or...
Any way to center-align the Excel sheet (for the cell range occupied by the dataframe) once the dataframe has already been written to Excel?
You can add the line below to your code:
df = df.style.set_properties(**{'text-align': 'center'})
Your complete code would be:
import pandas as pd
# `df` is the existing dataframe from the question
# df.style returns a Styler (needs jinja2); note that `df` is rebound to it
df = df.style.set_properties(**{'text-align': 'center'})
with pd.ExcelWriter('simple.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1')
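For the second option in the question, centering cells after the dataframe has already been written, here is a rough sketch that swaps in openpyxl instead of XlsxWriter (an assumption on my part, since XlsxWriter cannot reopen an existing file). The dataframe and the file location are made up for the example.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Alignment

# Hypothetical data with the mix of blanks, floats, strings and integers.
path = os.path.join(tempfile.mkdtemp(), "simple.xlsx")
df = pd.DataFrame({"a": [1, 2.5, None], "b": ["x", "", "z"]})
df.to_excel(path, sheet_name="Sheet1", engine="openpyxl")

# Reopen the written file and center every cell in the used range.
wb = load_workbook(path)
ws = wb["Sheet1"]
center = Alignment(horizontal="center")
for row in ws.iter_rows(min_row=1, max_row=ws.max_row, max_col=ws.max_column):
    for cell in row:
        cell.alignment = center
wb.save(path)

# Read the file back to check the alignment of the first data cell.
b2_alignment = load_workbook(path)["Sheet1"]["B2"].alignment.horizontal
```

The alignment is set per cell regardless of its type, so no conditional formatting is involved.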
I want to write the values from a pandas dataframe into an existing Excel sheet. I want to insert the data into the sheet without deleting what is already in the other cells (such as formulas that use the data).
I tried using data.to_excel like this:
writer = pd.ExcelWriter(r'path\TestBook.xlsm')
data.to_excel(writer, sheet_name='Sheet1', startrow=1, startcol=11, index=False)
writer.close()
The problem is that this way I overwrite the entire sheet.
Is there a way to add only the dataframe? It would be perfect if I could also keep the formatting of the destination cells.
Thanks
I found a good solution for it. xlwings natively supports pandas dataframes:
https://docs.xlwings.org/en/stable/datastructures.html#pandas-dataframes
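pd.ExcelWriter provides a mode parameter to either write a new file (w) or append to an existing one (a). To write a dataframe into a sheet that already exists, also pass if_sheet_exists='overlay' (pandas 1.4 or newer, openpyxl engine); see the example below:
with pd.ExcelWriter(p_file_name, engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
    df.to_excel(writer, sheet_name='Data', startrow=2, startcol=2)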