Cell Format for a range of cells using xlsxwriter - python

I am using xlsxwriter to export a pandas dataframe to an Excel file. I need to format a range of cells without using the worksheet.write function, as the data is already present in the cells.
If I use set_row or set_column, the format is applied to the entire row or column.
Please help me find a solution.

I need to format a range of cells without using worksheet.write function as the data is already present in cells.
In general that isn't possible with XlsxWriter. If you want to specify formatting for cells then you need to do it when you write the data to the cells.
There are some options which may or may not suit your needs:
Row and Column formatting. However that formats the rest of the row or column and not just the cells with data.
Add a table format via add_table().
Add a conditional format via conditional_format().
However, these are just workarounds. If you really need to format the cells then you will need to do it when using write().
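One way to follow that advice after df.to_excel() has already run is to write the same values again, this time with a format. As far as I know, a later write() to a cell replaces the earlier one in XlsxWriter when constant_memory mode is off. A minimal sketch; the helper name write_block and its parameters are hypothetical, and worksheet is assumed to be an XlsxWriter worksheet:

```python
def write_block(worksheet, rows, first_row, first_col, cell_format):
    """Write a 2-D block of values with one format, starting at (first_row, first_col).

    `rows` is a list of lists, e.g. df.values.tolist() (an assumption about the caller).
    Returns the number of cells written.
    """
    written = 0
    for r, row_values in enumerate(rows):
        for c, value in enumerate(row_values):
            # Re-supply the value together with the format for this one cell.
            worksheet.write(first_row + r, first_col + c, value, cell_format)
            written += 1
    return written

# Possible usage with pandas (names are illustrative):
#   df.to_excel(writer, sheet_name="Sheet1", index=False)
#   worksheet = writer.sheets["Sheet1"]
#   fmt = writer.book.add_format({"bg_color": "#FFFF00"})
#   write_block(worksheet, df.values.tolist(), 1, 0, fmt)  # row 0 holds the header
```

Only the cells in the block get the format, so nothing spills into the rest of the row or column.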

Related

pandas ExcelWriter merge but keep value that's already there

I have a few small data frames that I'm outputting to Excel on one sheet. To make them fit better, I need to merge some cells in one table, but to write this in XlsxWriter, I need to specify the data parameter. I want to keep the data that is already written in the left cell by the to_excel() call. Is there a way to do this without having to specify the data parameter? Or do I need to look up the value in the dataframe to put in there?
For example:
df.to_excel(writer, 'sheet') gives similar to the following output:
Then I want to merge across C:D for this table without having to specify what data should be there (because it is already in column C), using something like:
worksheet.merge_range('C1:D1', cell_format = fmat) etc.
to get below:
Is this possible? Or will I need to lookup the values in the dataframe?
Is this possible? Or will I need to lookup the values in the dataframe?
You will need to look up the data from the dataframe. There is no way in XlsxWriter to write formatting on top of existing data. The data and formatting need to be written at the same time (apart from Conditional Formatting, which can't be used for merging anyway).
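A rough sketch of that lookup, assuming the dataframe was written with the default single header row and single index column, so DataFrame cell (i, j) sits at worksheet row i + 1, column j + 1. The helper name merge_keeping_value and its parameters are hypothetical:

```python
def merge_keeping_value(worksheet, df, df_row, df_col, span, cell_format,
                        header_rows=1, index_cols=1):
    """Merge `span` cells starting at a DataFrame cell, re-supplying its value.

    header_rows / index_cols describe how to_excel() offset the data
    (assumed defaults: one header row, one index column).
    """
    value = df.iat[df_row, df_col]          # look the value up in the dataframe
    row = df_row + header_rows              # worksheet row of that cell
    first_col = df_col + index_cols         # worksheet column of that cell
    last_col = first_col + span - 1
    # merge_range() needs the data again; it cannot reuse what is already there.
    worksheet.merge_range(row, first_col, row, last_col, value, cell_format)
    return row, first_col, value

# Possible usage (names are illustrative):
#   df.to_excel(writer, sheet_name="sheet")
#   worksheet = writer.sheets["sheet"]
#   merge_keeping_value(worksheet, df, 0, 1, 2, fmat)
```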

The conditional format still applies after writing a file using Python - xlsxwriter. Is there a way to make this not work?

I compared two Excel files and, wherever there is a change, I added "-->" to identify it.
I used XlsxWriter and applied conditional formatting to highlight the cells that contain "-->". I save and close the workbook.
Now, when I open the saved Excel file and change "-->" to blank, the applied conditional formatting also disappears. But I want to keep the formatting even after I remove the "-->" value from the cell. Could someone help me pls?
Below is my code for conditional formatting
worksheet.conditional_format(1, 1, df.shape[0], df.shape[1],
                             {'type': 'text',
                              'criteria': 'containing',
                              'value': ' -->',
                              'format': green_fmt})
I want to keep the conditional formatting even after I remove "-->" value from the cell
As far as I know that isn't possible in Excel (and therefore not in XlsxWriter). A conditional format is based on a condition. If you remove the condition you will turn off the formatting.
As an alternative, you could iterate over the data in the DataFrame to look for matches and add cell formatting to the matching cells using write() with a format.
You could also use Pandas styling for this, although you will need to use openpyxl as the Excel engine to be able to export that.
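The iteration approach could look roughly like the following. It is only a sketch: the helper name highlight_marked_cells is hypothetical, and the default offsets assume the data starts at worksheet row 1, column 1, as in the conditional_format() call from the question.

```python
def highlight_marked_cells(worksheet, rows, marker, cell_format,
                           first_row=1, first_col=1):
    """Rewrite every string cell containing `marker` with a permanent cell format.

    `rows` is a list of lists, e.g. df.values.tolist() (an assumption about the caller).
    Returns the worksheet coordinates of the cells that were formatted.
    """
    hits = []
    for r, row_values in enumerate(rows):
        for c, value in enumerate(row_values):
            if isinstance(value, str) and marker in value:
                # A plain cell format stays in place even if the text is edited later.
                worksheet.write(first_row + r, first_col + c, value, cell_format)
                hits.append((first_row + r, first_col + c))
    return hits

# Possible usage (names are illustrative):
#   highlight_marked_cells(worksheet, df.values.tolist(), "-->", green_fmt)
```

Because the format belongs to the cell rather than to a condition, clearing "-->" in Excel afterwards leaves the highlight in place.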

XlsxWriter set_column limit rows

I have a pandas dataframe that I'm writing to an Excel file with XlsxWriter. I'm setting the cell format with
worksheet.set_column(first_index, last_index, None, cell_format)
on a few of my columns. By doing this, however, not only do the cells with values in my Excel file get the format applied, but seemingly infinite rows get the cell format applied as well.
How can I limit the cell format to a set of rows?
You can use conditional formatting:
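# conditional_format() takes first_row, first_col, last_row, last_col.
# last_row is the last data row to cover, e.g. len(df) when one header row is written.
worksheet.conditional_format(1, first_index, last_row, last_index,
                             {'type': 'no_blanks',
                              'format': cell_format})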
This works if the other rows are blank. I don't know if that's the case for you.
By doing this however not only the cells with values in my excel file gets the format applied, but seemingly infinite rows get the cell format applied.
That is how column formatting works in Excel.
How can I limit the cell format to a set of rows?
You can set row formatting with the set_row() method but from the overall question it sounds like you want to limit the formatting to a range of cells.
The only way to do that in Excel, or XlsxWriter, is to format the cells individually (apart from solutions like using conditional formatting or worksheet tables that can be applied to a range).
In order to do that with a dataframe you would need to avoid df.to_excel() and write the data cell by cell using XlsxWriter methods.

Maintaining Formulae when Adding Rows/Columns

I'm doing some excel sheet Python automation using openpyxl and I'm having an issue when I try to insert columns or rows into my sheet.
I'm modifying an existing Excel sheet which has basic formulae in it (e.g. =F2-G2). However, when I insert a row or column before these cells, the formulae do not adjust accordingly, as they would if you performed that action in Excel.
For example, inserting a column before column F should change the formula to =G2-H2 but instead it stays at =F2-G2...
Is there any way to work around this issue? I can't really iterate through all the cells and fix the formula because the file contains many columns with formula in them.
openpyxl is a file format library and not an application like Excel and does not attempt to provide the same functionality. Translating formulae in cells that are moved should be possible with the library's tokeniser but this ignores any formulae that refer to the cells being moved on the same worksheet or in the same workbook.
Easy, just iterate from your inserted row downward to the max row and change the formulae's row numbers accordingly. The code below is just an example:
# Insert a new row before the identified row.
ws.insert_rows(InsertedRowNo)
# Every time you insert a new row, rewrite the formulas from that row down so their row numbers match.
for i in range(InsertedRowNo, ws.max_row + 1):
    ws.cell(row=i, column=20).value = '=HYPERLINK(VLOOKUP(TRIM(A{0}),dict!$A$2:$B$1001,2,0),A{0})'.format(i)
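The tokeniser-based translation mentioned in the first answer is available in openpyxl as the Translator class. A small sketch of how it might be used for a formula whose cell has moved one column to the right; the wrapper name translate_moved_formula is hypothetical, and the cell addresses are made up for illustration:

```python
from openpyxl.formula.translate import Translator

def translate_moved_formula(formula, origin, dest):
    """Rewrite relative references as if `formula` moved from cell `origin` to `dest`."""
    return Translator(formula, origin=origin).translate_formula(dest)

# A formula that lived in H2 ends up in I2 after a column is inserted before F.
shifted = translate_moved_formula("=F2-G2", origin="H2", dest="I2")
print(shifted)
```

This only rewrites the formula you pass in. As noted above, formulae elsewhere that refer to the moved cells are not touched and would have to be found and translated separately.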

Pandas: how to format both rows and columns when exporting to Excel (row format takes precedence)?

I am using pandas and xlsxwriter to export and format a number of dataframes to Excel.
The xlsxwriter documentation mentions that:
http://xlsxwriter.readthedocs.io/worksheet.html?highlight=set_column
A row format takes precedence over a default column format
Precedence means that, if you format column B as percentage, and then row 2 as bold, cell B2 won't be bold and in % - it will be bold only, but not in %!
I have provided an example below. Is there a way around it? Maybe an engine other than xlsxwriter? Maybe some way to apply formatting after exporting the dataframes to Excel?
It makes no difference whether I format the row first and the columns later, or vice versa.
It's not shown in the example below, but in my code I export a number of dataframes, all with the same columns, to the same Excel sheet. The dataframes are the equivalent of an Excel Pivot table, with a 'total' row at the bottom. I'd like the header row and the total row to be bold, and each column to have a specific formatting depending on the data (%, thousands, millions, etc). Sample code below:
import pandas as pd
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
wk = writer.book.add_worksheet('Test')
fmt_bold = writer.book.add_format({'bold':True})
fmt_pct = writer.book.add_format({'num_format': '0.0%'})
wk.write(1,1,1)
wk.write(2,1,2)
wk.set_column(1,1, None, fmt_pct)
wk.set_row(1,None, fmt_bold)
writer.close()
As @jmcnamara notes, openpyxl provides different formatting options because it essentially allows you to process a dataframe within a worksheet. NB: openpyxl does not support row or column formats.
The openpyxl dataframe_to_rows() function converts a dataframe to a generator of values, row by row allowing you to apply whatever formatting or additional processing you like.
In this case you will need to create another format that is a combination of the row and column formats and apply it to the cell.
In order to do that you will need to iterate over the data frame and call XlsxWriter directly, rather than using the Pandas-Excel interface.
Alternatively, you may be able to do this using OpenPyXL as the pandas Excel engine. Recent versions of the Pandas interface added the ability to add formatting to the Excel data after writing the dataframe, when using OpenPyXL.
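Combining the row and column formats can be done on the property dictionaries before calling add_format(). A minimal sketch; combine_format_props is a hypothetical helper, and later dictionaries win if the same property appears twice:

```python
def combine_format_props(*prop_dicts):
    """Merge several XlsxWriter format property dicts into one (later dicts win)."""
    combined = {}
    for props in prop_dicts:
        combined.update(props)
    return combined

col_props = {"num_format": "0.0%"}   # column-level formatting from the question
row_props = {"bold": True}           # row-level formatting from the question
bold_pct_props = combine_format_props(col_props, row_props)

# With a real workbook the combined format is applied per cell, for example:
#   fmt_bold_pct = writer.book.add_format(bold_pct_props)
#   wk.write(1, 1, 1, fmt_bold_pct)   # cell B2: bold and shown as a percentage
```

Keeping the properties as plain dictionaries makes it easy to build one format per row-style and column-style combination and reuse it for every cell in that intersection.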
