Using xlsxwriter, how do I insert a new row to an Excel worksheet? For instance, there is an existing data table at the cell range A1:G10 of the Excel worksheet, and I want to insert a row (A:A) to give it some space for the title of the report.
I looked through the documentation here http://xlsxwriter.readthedocs.io/worksheet.html, but couldn't find such method.
import xlsxwriter
# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('Expenses01.xlsx')
worksheet = workbook.add_worksheet()
worksheet.insert_row(1) # This method doesn't exist
As of December 2021, this is still not possible. You can work around it with some planning: write your dataframe starting on a different row. Building on the example from the xlsxwriter documentation:
import pandas as pd
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
with pd.ExcelWriter('my_excel_spreadsheet.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', startrow=4)  # <<< notice the startrow here
And then, while the writer is still open (that is, inside the with block), you can write to the earlier rows as mentioned in other comments:
workbook = writer.book
worksheet = writer.sheets['Sheet1']
worksheet.write(row, 0, 'Some Text') # <<< Then you can write to a different row
Not quite the insert() method we want, but better than nothing.
I have found that the planning involved in this process is not really ever something I can get around, even if I didn't have this problem. When I reach the stage where I am taking my data to excel, I have to do a little 'by hand' work in order to make the excel sheet pretty enough for human consumption, which is the whole point of moving things to excel. So, I don't look at the need to pre-plan my start rows as too much out of my way.
By using openpyxl you can insert new rows and columns:
import openpyxl
file = "xyz.xlsx"
# Load the workbook based on the file name provided by the user
book = openpyxl.load_workbook(file)
# Open the sheet whose index is 0
sheet = book.worksheets[0]
# insert_rows(idx, amount=1) inserts a row or rows before row idx; amount is the
# number of rows to add and is optional
sheet.insert_rows(13)
# Save the workbook so the change is written to disk
book.save(file)
Hope this helps
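To see what insert_rows() does to existing cells, here is a minimal in-memory sketch. The sample rows and the title text are made up for illustration. Note that, per the openpyxl documentation, inserting rows does not adjust dependent objects such as formulas, tables, or charts.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Hypothetical sample data standing in for the existing table.
for row in [["Language", "Score"], ["Python", 100], ["Java", 98]]:
    ws.append(row)

# Insert one empty row before row 1; the existing cells shift down by one.
ws.insert_rows(1)
ws["A1"] = "Report title"

# The old header now lives in row 2 and the data follows it.
print(ws["A2"].value, ws["B3"].value)
```

Call wb.save(...) afterwards if you want the result on disk.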
Unfortunately this is not something xlsxwriter can do.
openpyxl is a good alternative to xlsxwriter, and if you are starting a new project do not use xlsxwriter.
At the time of writing openpyxl could not insert rows (newer versions provide insert_rows()), but here is an extension class for openpyxl that can.
openpyxl also allows reading of excel documents, which xlsxwriter does not.
You can try this
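import xlsxwriter
wb = xlsxwriter.Workbook("name.xlsx")
ws = wb.add_worksheet("sheetname")
# A blank cell is only written if it has a format, so create one
cell_format = wb.add_format()
# Write a blank cell
ws.write_blank(0, 0, None, cell_format)
ws.write_blank('A2', None, cell_format)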
Here is the official documentation:
Xlsxwriter worksheet.write_blank() method
Another alternative is to merge a few blank columns
ws.merge_range('A1:D1', "")
Otherwise you'll need to run a loop to write each blank cell
# Replace 1 with the row number you need
for c in range(0, 10):
    ws.write_blank(1, c, None, cell_format)
Inserting a row is equivalent to adding +1 to your row count. Technically there is no need for a "blank row" method and I'm pretty sure that's why it isn't there.
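A rough sketch of that idea: do the "insert" on a plain Python list before anything is written, then let xlsxwriter write the rows in order. The table contents, the title, and the temporary path are placeholders, not anything from the question.

```python
import os
import tempfile

import xlsxwriter

# Hypothetical table rows; in practice these come from your own data.
table = [
    ["Language", "Score"],
    ["Python", 100],
    ["Java", 98],
]

# "Insert" the title row in memory, before anything is written.
rows = [["Report title"]] + table

path = os.path.join(tempfile.mkdtemp(), "report.xlsx")  # placeholder path
workbook = xlsxwriter.Workbook(path)
worksheet = workbook.add_worksheet()
for row_num, row in enumerate(rows):
    # write_row() writes a list of values starting at (row, col).
    worksheet.write_row(row_num, 0, row)
workbook.close()
```

Because xlsxwriter only writes, any reshuffling of rows has to happen in your own data structure like this.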
You should use write().
Read this: set_column(first_col, last_col, width, cell_format, options)
For example:
import xlsxwriter
workbook = xlsxwriter.Workbook('xD.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write(0, 0, 'First Name')  # row 0, column 0
workbook.close()
I am very much unhappy with the answers. The XlsxWriter library handles most of these operations easily.
To add a row to the existing worksheet, you can use write_row(), which is a worksheet method:
worksheet.write_row(rowNumber, columnNumber, listToAdd)
I have written some content to an xlsx file using xlsxwriter:
workbook = xlsxwriter.Workbook(file_name)
worksheet = workbook.add_worksheet()
worksheet.write(row, col, value)
workbook.close()
I'd like to add a dataframe after the existing rows in this file using to_excel:
df.to_excel(file_name,
            startrow=len(existing_content),
            engine='xlsxwriter')
However, this does not seem to work: the dataframe is not inserted into the file. Does anyone know why?
Since the question does not give full details, let's walk through to_excel and XlsxWriter with an example.
using xlsxwriter
import xlsxwriter
# Create a new Excel file and add a worksheet
workbook = xlsxwriter.Workbook('example.xlsx')
worksheet = workbook.add_worksheet()
# Add some data to the worksheet
worksheet.write('A1', 'Language')
worksheet.write('B1', 'Score')
worksheet.write('A2', 'Python')
worksheet.write('B2', 100)
worksheet.write('A3', 'Java')
worksheet.write('B3', 98)
worksheet.write('A4', 'Ruby')
worksheet.write('B4', 88)
# Save the file
workbook.close()
Using the above code, we have saved the table similar to the one below to an Excel file.
Language Score
Python 100
Java 98
Ruby 88
Next, suppose we want to add rows using DataFrame.to_excel:
using to_excel
import pandas as pd
# Load an existing Excel file
existing_file = pd.read_excel('example.xlsx')
# Create a new DataFrame to append
df = pd.DataFrame({
    'Language': ['C++', 'Javascript', 'C#'],
    'Score': [78, 97, 67]
})
# Append the new DataFrame to the existing file
result = pd.concat([existing_file, df])
# Write the combined DataFrame to the existing file
result.to_excel('example.xlsx', index=False)
The reason for using pandas concat:
To append, it is necessary to use pandas.ExcelWriter() in append mode, but the XlsxWriter engine does not support append mode.
Although the task could also be accomplished with pandas.DataFrame.append(), that method is deprecated (and was removed in pandas 2.0), so we use concat instead.
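If re-reading and rewriting the whole sheet is undesirable, one alternative is to open the file with openpyxl and append the new rows directly. This is only a sketch under assumptions: the temporary path and the sample data are placeholders, and the first to_excel call merely stands in for the file created earlier.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

path = os.path.join(tempfile.mkdtemp(), "example.xlsx")  # placeholder path

# Stand-in for the existing file (header row plus two data rows).
pd.DataFrame({"Language": ["Python", "Java"], "Score": [100, 98]}).to_excel(
    path, index=False, engine="openpyxl"
)

# New rows to add below the existing ones.
df = pd.DataFrame({"Language": ["C++", "C#"], "Score": [78, 67]})

wb = load_workbook(path)
ws = wb.active
for row in df.itertuples(index=False, name=None):
    ws.append(row)  # append() writes to the first row after the existing data
wb.save(path)

result = pd.read_excel(path, engine="openpyxl")
```

Unlike the concat approach, this keeps whatever formatting the existing cells already have, since the sheet is modified rather than regenerated.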
The OP is using xlsxwriter in the engine parameter. Per XlsxWriter documentation "XlsxWriter is designed only as a file writer. It cannot read or modify an existing Excel file." (link to XlsxWriter Docs).
Below I've provided a fully reproducible example of how you can go about modifying an existing .xlsx workbook using the openpyxl module (link to Openpyxl Docs).
For demonstration purposes, I'll first create a workbook called test.xlsx using pandas:
import pandas as pd
df = pd.DataFrame({'Col_A': [1,2,3,4],
                   'Col_B': [5,6,7,8],
                   'Col_C': [0,0,0,0],
                   'Col_D': [13,14,15,16]})
df.to_excel('test.xlsx', index=False)
This is the Expected output at this point:
Using openpyxl you can use another dataset to load the existing workbook ('test.xlsx') and modify the third column with different data from the new dataframe while preserving the other existing data. In this example, for simplicity, I update it with a one column dataframe but you could extend it to update or add more data.
from openpyxl import load_workbook
import pandas as pd
df_new = pd.DataFrame({'Col_C': [9, 10, 11, 12]})
wb = load_workbook('test.xlsx')
ws = wb['Sheet1']
for index, row in df_new.iterrows():
    # Row 1 holds the header, so dataframe index 0 maps to worksheet row 2
    cell = 'C%d' % (index + 2)
    ws[cell] = row.iloc[0]
wb.save('test.xlsx')
With the Expected output at the end:
I'm attempting to create a script to process several Excel sheets at once, and one of the steps I'm trying to get Python to handle is creating a table using data passed from a pandas data frame. Creating a table seems pretty straightforward looking at the documentation.
Following the example from here:
# define a table style
mediumstyle = TableStyleInfo(name='TableStyleMedium2', showRowStripes=True)
# create a table
table = Table(displayName='IdlingReport', ref='A1:C35', tableStyleInfo=mediumstyle)
# add the table to the worksheet
sheet2.add_table(table)
# Saving the report
wb.save(openexcel.filename)
print('Report Saved')
However this creates an empty table, instead of using the data present in cells 'A1:C35'. I can't seem to find any examples anywhere that go beyond these steps so any help with what I may be doing wrong is greatly appreciated.
The data in 'A1:C35' is being written to Excel as follows:
while i < len(self.sheets):
    with pd.ExcelWriter(filename, engine='openpyxl') as writer:
        writer.book = excelbook
        writer.sheets = dict((ws.title, ws) for ws in excelbook.worksheets)
        self.df_7.to_excel(writer, self.sheets[i], index=False, header=True, startcol=0, startrow=0)
        writer.save()
    i += 1
The output looks something like this
Time Location Duration
1/01/2019 [-120085722,-254580042] 5 Min
1/02/2019 [-120085722,-254580042] 15 Min
1/02/2019 [-120085722,-254580042] 7 Min
Just to clarify: right now I first write my data frame to Excel and then, after formatting the data, turn what I've written into a table. Reversing these steps by creating the table first and then writing to Excel fills the table, but gets rid of the formatting (font color, font type, size, etc.), which means I'd have to add an additional step to fix the formatting (which I'd like to avoid if possible).
Your command
# create a table
table = Table(displayName='IdlingReport', ref='A1:C35', tableStyleInfo=mediumstyle)
creates a special Excel object — an empty table with the name IdlingReport.
You probably want something else - to fill a sheet of your Excel workbook with data from a Pandas dataframe.
For this purpose there is a function dataframe_to_rows():
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows
wb = Workbook()
ws = wb.active # to rename this sheet: ws.title = "some_name"
# to create a new sheet: ws = wb.create_sheet("some_name")
for row in dataframe_to_rows(df, index=True, header=True):
    ws.append(row) # appends this row after a previous one
wb.save("something.xlsx")
See Working with Pandas Dataframes and Tutorial.
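Putting the two pieces together, here is a hedged sketch that first writes the rows and then defines the table over the range that is actually filled, rather than a hard-coded 'A1:C35'. The dataframe and the temporary path are made-up stand-ins for the question's data, and plain itertuples() is used here in place of dataframe_to_rows() just to keep the example small.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook
from openpyxl.worksheet.table import Table, TableStyleInfo

# Hypothetical stand-in for the question's dataframe.
df = pd.DataFrame({
    "Time": ["1/01/2019", "1/02/2019", "1/02/2019"],
    "Location": ["[-120085722,-254580042]"] * 3,
    "Duration": ["5 Min", "15 Min", "7 Min"],
})

wb = Workbook()
ws = wb.active

# Header first (table headers must be strings), then one row per record.
ws.append(list(df.columns))
for row in df.itertuples(index=False, name=None):
    ws.append(row)

# Size the table from the cells in use instead of a fixed range.
table = Table(displayName="IdlingReport", ref=ws.dimensions)
table.tableStyleInfo = TableStyleInfo(name="TableStyleMedium2", showRowStripes=True)
ws.add_table(table)

path = os.path.join(tempfile.mkdtemp(), "idling_report.xlsx")  # placeholder path
wb.save(path)
```

The table object only describes a range; the cells inside that range have to be written separately, which is why the table looked empty.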
I am trying to add an empty excel sheet into an existing Excel File using python xlsxwriter.
Setting the formula up as follows works well.
workbook = xlsxwriter.Workbook(file_name)
worksheet_cover = workbook.add_worksheet("Cover")
Output4 = workbook
Output4.close()
But once I try to add further sheets with dataframes into the Excel it overwrites the previous excel:
with pd.ExcelWriter('Luther_April_Output4.xlsx') as writer:
    data_DifferingRates.to_excel(writer, sheet_name='Differing Rates')
    data_DifferingMonthorYear.to_excel(writer, sheet_name='Differing Month or Year')
    data_DoubleEntries.to_excel(writer, sheet_name='Double Entries')
How should I write the code, so that I can add empty sheets and existing data frames into an existing excel file.
Alternatively it would be helpful to answer how to switch engines, once I have produced the Excel file...
Thanks for any help!
If you're not forced to use xlsxwriter, try openpyxl. Simply pass 'openpyxl' as the engine to the pandas built-in ExcelWriter class. I had asked a question a while back on why this works. It is helpful code: it works well with the syntax of DataFrame.to_excel() and it won't overwrite your already existing sheets.
from openpyxl import load_workbook
import pandas as pd
book = load_workbook(file_name)
writer = pd.ExcelWriter(file_name, engine='openpyxl')
writer.book = book
data_DifferingRates.to_excel(writer, sheet_name='Differing Rates')
data_DifferingMonthorYear.to_excel(writer, sheet_name='Differing Month or Year')
data_DoubleEntries.to_excel(writer, sheet_name='Double Entries')
writer.save()
You could use pandas.ExcelWriter with the optional mode='a' argument to append to an existing Excel workbook. From the documentation:
You can also append to an existing Excel file:
>>> with ExcelWriter('path_to_file.xlsx', mode='a') as writer:
...     df.to_excel(writer, sheet_name='Sheet3')
Unfortunately, this requires a different engine since, as you observed, the xlsxwriter engine does not support the optional mode='a' (append). If you try to pass this parameter to the constructor, it raises an error.
So you will need to use a different engine to do the append, like openpyxl. You'll need to ensure that the package is installed, otherwise you'll get a ModuleNotFoundError. I have tested using openpyxl as the engine, and it is able to append a new worksheet to an existing workbook:
with pd.ExcelWriter(engine='openpyxl', path='Luther_April_Output4.xlsx', mode='a') as writer:
    data_DifferingRates.to_excel(writer, sheet_name='Differing Rates')
    data_DifferingMonthorYear.to_excel(writer, sheet_name='Differing Month or Year')
    data_DoubleEntries.to_excel(writer, sheet_name='Double Entries')
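For completeness, here is a small end-to-end sketch of the whole flow: create the workbook with an empty "Cover" sheet, then append a dataframe as an extra sheet. The path and the dataframe are placeholders, and openpyxl is used for both steps so that no engine switch is needed.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "output.xlsx")  # placeholder path

# Step 1: create the workbook with an empty "Cover" sheet.
wb = Workbook()
wb.active.title = "Cover"
wb.save(path)

# Step 2: append a dataframe as a new sheet; mode='a' keeps "Cover".
df = pd.DataFrame({"Rate": [1.5, 2.0]})  # hypothetical data
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    df.to_excel(writer, sheet_name="Differing Rates", index=False)

# Both sheets are present afterwards.
sheet_names = load_workbook(path).sheetnames
print(sheet_names)
```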
I think you need to write the data into a new file. This works for me:
# Write multiple tabs (sheets) to a new file
import pandas as pd
from openpyxl import load_workbook
Work_PATH = r'C:\PythonTest'+'\\'
ar_source = Work_PATH + 'Test.xlsx'
Output_Wkbk = Work_PATH + 'New_Wkbk.xlsx'
# Need workbook from openpyxl load_workbook to enumerate tabs
# is there another way with only xlsxwriter?
workbook = load_workbook(filename=ar_source)
# Set sheet names in workbook as a list.
# You can also set the list manually: tabs = ['sheet1', 'sheet2']
tabs = workbook.sheetnames
print('\nWorkbook sheets: ', tabs, '\n')
# Replace this function with functions for what you need to do
def default_col_width(df, sheetname, writer):
    # Note, this seems to use xlsxwriter as the default engine.
    for column in df:
        # map col width to col name. Ugh.
        column_width = max(df[column].astype(str).map(len).max(), len(column))
        # set special column widths
        narrower_col = ['OS', 'URL']  # change to fit your workbook
        if column in narrower_col: column_width = 10
        if column_width > 30: column_width = 30
        if column == 'IP Address': column_width = 15  # change for your workbook
        col_index = df.columns.get_loc(column)
        writer.sheets[sheetname].set_column(col_index, col_index, column_width)
    return
# Note nothing is returned. The function changes writer.sheets in place.
with pd.ExcelWriter(Output_Wkbk, engine='xlsxwriter') as writer:
    # Iterate through the list of sheet names
    for tab in tabs:
        df1 = pd.read_excel(ar_source, tab).astype(str)
        # I need to trim my input
        df1.drop(list(df1)[23:], axis='columns', inplace=True, errors='ignore')
        try:
            # Set spreadsheet focus
            df1.to_excel(writer, sheet_name=tab, index=False, na_rep=' ')
            # Do something with the spreadsheet - Calling a function
            default_col_width(df1, tab, writer)
        except Exception:
            # Function call failed so just copy tab with no changes
            df1.to_excel(writer, sheet_name=tab, index=False, na_rep=' ')
If I use the input file name as the output file name, it fails and erases the original. There is no need to save or close if you use with; it closes automatically.
I have an Excel sheet which came from a pandas dataframe. I then use XlsxWriter to add formulas, new columns and formatting. The problem is I only seem to be able to format what I've written using xlsxwriter and nothing that came from the dataframe. So what I get is something like this half-formatted table.
As you can see from the image the two columns from the dataframe remain untouched. They must have some kind of default formatting that is overriding mine.
Since I don't know how to convert a worksheet back into a dataframe, the code below is obviously completely wrong, but it's just to give an idea of what I'm looking for.
export = "files/sharepointExtract.xlsx"
df = pd.read_excel(export)# df = dataframe
writer = pd.ExcelWriter('files/new_report-%s.xlsx' % (date.today()), engine = 'xlsxwriter')
workbook = writer.book
# Code to make the header red, this works fine because
# it's written in xlsxwriter using write.row()
colour_format = workbook.add_format()
colour_format.set_bg_color('#640000')
colour_format.set_font_color('white')
worksheet.set_row(0, 15, colour_format)
table_body_format = workbook.add_format()
table_body_format.set_bg_color('blue')
for row in worksheet.rows:
    row.set_row(0,15, table_body_format)
This code raises an AttributeError, but even without the for loop we just get what can be seen in the image.
The following should work:
import pandas as pd
from datetime import date
export = "files/sharepointExtract.xlsx"
df = pd.read_excel(export)
writer = pd.ExcelWriter('files/new_report-{}.xlsx'.format(date.today()), engine='xlsxwriter')
# Note: the encoding argument of to_excel was removed in pandas 2.0, so it is omitted here
df.to_excel(writer, sheet_name='Sheet1', startrow=1, startcol=0, header=False, index=False)
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Code to make the header red background with white text
colour_format = workbook.add_format()
colour_format.set_bg_color('#640000')
colour_format.set_font_color('white')
# Code to make the body blue
table_body_format = workbook.add_format()
table_body_format.set_bg_color('blue')
# Set the header (row 0) to height 15 using colour_format
worksheet.set_row(0, 15, colour_format)
# Set the default width and format for columns A to Z
worksheet.set_column('A:Z', 15, table_body_format)
# Write the header manually
for colx, value in enumerate(df.columns.values):
    worksheet.write(0, colx, value)
# close() saves the file (writer.save() was removed in pandas 2.0)
writer.close()
When Pandas is used to write the header, it uses its own format style which overwrites the underlying xlsxwriter version. The simplest approach is to stop it from writing the header and get it to write the rest of the data from row 1 onwards (not 0). This avoids the formatting from being altered. You can then easily write your own header using the column values from the dataframe.
I'm new to Python so I hope this sounds right.
How could I use Python to write to an Excel file from user input?
I want my script to ask users "Name:" "Job Title:" "Building Number:" "Date:" etc. and from that raw input, fill in the corresponding columns one after the other in an Excel spreadsheet. I don't want future use of the script to overwrite previous data in the sheet either. I'd like each time to create a new line in the spreadsheet and then fill in the correct entries in each row. I hope that makes sense. Thank you so much in advance for your help.
You could use openpyxl to write to the workbook. Here's some basic usage, which should help avoid overwriting:
import openpyxl
wb = openpyxl.load_workbook('C:/test.xlsx')
ws = wb.active
i = 1
# Find the first row whose column A cell is blank
# (worksheet rows are 1-indexed and empty cells hold None)
while ws['A' + str(i)].value is not None:
    i += 1
# Modify the sheet, starting with row i
wb.save('C:/test.xlsx')
Hope This Helps.
Edited, getting input and time:
For getting information from the user, use
x = input('Prompt: ')
However, if you want the actual current date and time, I suggest using the time module:
>>> from time import strftime
>>> date = strftime('%m-%d-%y')
>>> time = strftime('%I:%M%p')
>>> print(date)
08-28-15
>>> print(time)
01:57AM
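Tying this together, here is a rough sketch of a helper that appends one record per call without overwriting earlier rows. The function name append_record, the HEADERS list, the sample values, and the temporary path are all hypothetical; in a real script the values would come from input() prompts instead of being hard-coded.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

HEADERS = ["Name", "Job Title", "Building Number", "Date"]


def append_record(path, record):
    """Append one row to the workbook at path, creating it with headers if needed."""
    if os.path.exists(path):
        wb = load_workbook(path)
        ws = wb.active
    else:
        wb = Workbook()
        ws = wb.active
        ws.append(HEADERS)
    ws.append(record)  # lands on the first row after the existing data
    wb.save(path)
    return ws.max_row


path = os.path.join(tempfile.mkdtemp(), "staff.xlsx")  # placeholder path
# In a real script these values would come from input('Name: ') and so on.
append_record(path, ["Ada", "Engineer", 12, "08-28-15"])
last_row = append_record(path, ["Grace", "Analyst", 7, "08-29-15"])
```

Using ws.append() avoids having to search for the first blank row by hand.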
I will also add that XlsxWriter is also an excellent library for writing to Excel, however, unlike OpenPyXl, it is only a writer and does not read Excel files.
An example found from their documentation is as follows:
import xlsxwriter
# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('Expenses01.xlsx')
worksheet = workbook.add_worksheet()
# Some data we want to write to the worksheet.
expenses = (
['Rent', 1000],
['Gas', 100],
['Food', 300],
['Gym', 50],
)
# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0
# Iterate over the data and write it out row by row.
for item, cost in (expenses):
    worksheet.write(row, col, item)
    worksheet.write(row, col + 1, cost)
    row += 1
# Write a total using a formula.
worksheet.write(row, 0, 'Total')
worksheet.write(row, 1, '=SUM(B1:B4)')
workbook.close()
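One small refinement, offered as a sketch rather than as part of the documentation example: the hard-coded 'B1:B4' in the formula can be derived from the data with xl_range() from xlsxwriter.utility, so the total stays correct if rows are added later.

```python
from xlsxwriter.utility import xl_range

expenses = (
    ["Rent", 1000],
    ["Gas", 100],
    ["Food", 300],
    ["Gym", 50],
)

# Rows and columns are zero indexed: column 1 is B, and rows
# 0 .. len(expenses) - 1 hold the costs.
cost_range = xl_range(0, 1, len(expenses) - 1, 1)
formula = "=SUM({})".format(cost_range)
print(formula)
```

The resulting string can be passed to worksheet.write() in place of the literal formula.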
You may want to use the pandas module. It makes reading, writing, and manipulating Excel files very easy:
http://pandas.pydata.org/
Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.