OpenPyXL set number_format for the whole column - python

I'm exporting some data to Excel, and I've managed to format each populated cell in a column during the export with this:
import openpyxl
from openpyxl.utils import get_column_letter
wb = openpyxl.Workbook()
ws = wb.active
# Add rows to worksheet
for row in data:
    ws.append(row)
# Disable formatting numbers in columns from index 3 onwards
# (index 3 is column `C`; use 4 to start at column `D`)
# Need to do this manually for every cell
for col in range(3, ws.max_column+1):
    for cell in ws[get_column_letter(col)]:
        cell.number_format = '#'
# Export data to Excel file...
But this only formats the populated cells in each column. The other cells in the column still have General formatting.
How can I set all the empty cells in these columns to # as well, so that anyone who edits them in the exported Excel file has no problems entering, let's say, phone numbers as actual numbers?

For openpyxl you must always set the styles for every cell individually. If you set them for the column, then Excel will apply them when it creates new cells, but styles are always still applied to individual cells.
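A minimal sketch of the two levels described above, assuming a small illustrative sheet (the `id`/`phone` data and the choice of column B are made up for the example). The column dimension carries the default that Excel uses for cells it creates later, while the cells openpyxl has already created are styled one by one:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(['id', 'phone'])
ws.append([1, '0123456789'])

# Column-level default: picked up by cells that Excel creates later in column B.
ws.column_dimensions['B'].number_format = '@'  # '@' is Excel's Text format

# Cells that already exist must still be styled individually.
for cell in ws['B']:
    cell.number_format = '@'
```

Setting both is what makes populated and empty cells in the column behave the same once the file is opened in Excel.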

Because you iterate only over the cells that already exist in each column (up to ws.max_row), those are the only cells that get reformatted. While you can't reformat a whole column this way, you can set the format on every cell up to a chosen row:
last_cell = 100
for col in range(3, ws.max_column+1):
    for row in range(1, last_cell + 1):
        # '#' shows whole numbers only; use '@' for Excel's Text format
        ws.cell(column=col, row=row).number_format = '#'
This formats all the cells in each column up to last_cell. It's not exactly what you need, but it's close enough.

Conditional formatting is a workaround for putting a number format on an entire column. To apply a thousands separator to a whole column, this worked for me:
from openpyxl.formatting.rule import Rule
from openpyxl.styles.differential import DifferentialStyle
from openpyxl.styles.numbers import NumberFormat
diff_style = DifferentialStyle(numFmt = NumberFormat(numFmtId='4',formatCode='#,##0.00'))
rule1 = Rule(type="expression", dxf=diff_style)
rule1.formula = ["=NOT(ISBLANK($H2))"]  # column on which the thousands separator is to be applied
work_sheet.conditional_formatting.add("$H2:$H500001", rule1)  # provide a range of cells

Related

Bolding and coloring rows using xlsxwriter

Here is my current code:
import xlsxwriter
user_input = [["10002",'01/04/23','',"300",'',"300",'','','',"44.44",'','','','',"34232",'','','',"34",'','',"2312"],["10001","01/30/2023","63","15","12345","gatorade","0.1234","a0001","4","50","50","115.4","123","33456","34543","34234","3432","34.22","1800","1800","0","0"]]
#Lists are entered here
column_titles = ['1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22']
user_input.insert(0, column_titles)
#Adds column titles to be in first row of Excel
workbook = xlsxwriter.Workbook('workbook.xlsx')
worksheet = workbook.add_worksheet()
for row_num, data in enumerate(user_input):
    worksheet.write_row(row_num, 0, data)
#Adds to Excel doc
I have tried to follow https://xlsxwriter.readthedocs.io/tutorial02.html and How to set formatting for entire row or column in xlsxwriter Python? , but every time I try and edit those to work for my own code, my workbook just comes back blank. Doesn't error out or anything.
This is my first time using xlsxwriter, so I'm not quite sure how to do much yet. I'm trying to take the first row in the spreadsheet, and put it all in bold. (My attempts of this are not in my example code). As well as putting the first 5 columns in the first row, and highlighting those boxes to be blue. Can anybody help me with this?
I'm thinking maybe the way I have the column titles list being appended into the original list may be part of what's complicating this? But I'm unsure. Thank you in advance for any help.
You may need to separate the data writes depending on formatting required.
To add formatting you can create a format and apply that when writing or set to a row or column.
Looking at the header row only, there are two requirements:
Bold all values
First 5 cells highlighted in blue
In the example code below there are two formats: the bolding and the cell highlight.
Here bolding is set on row index 0, i.e. Excel row 1; the 'set_row' line can be added before or after the list is written. 'set_row' applies the format 'header_row_format' to the whole row, from A1 to the last possible column the sheet can contain.
While bolding all the cells in the first row may be OK, highlighting the whole row probably wouldn't be, and your requirement is to highlight only the first 5 anyway. So in this case we create another format, 'cell_format', and add it only to the first 5 cells as we write the cell values.
A format passed to write() replaces the row format for that cell instead of combining with it, so 'cell_format' also sets bold to keep the highlighted cells bold. If you only wanted to bold the cells you write data to, you would drop 'set_row' and use two cell formats, both bold, one with the bg colour and one without.
import xlsxwriter
column_titles = ['1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22']
workbook = xlsxwriter.Workbook('workbook.xlsx')
worksheet = workbook.add_worksheet()
# Bold for the whole header row; used by cells written without their own format
header_row_format = workbook.add_format({'bold': True})
worksheet.set_row(0, None, header_row_format)
# Highlight format; bold is repeated because a cell format overrides the row format
cell_format = workbook.add_format()
cell_format.set_bold(True)
cell_format.set_bg_color('blue')
for col_num, data in enumerate(column_titles):
    if col_num < 5:
        worksheet.write(0, col_num, data, cell_format)
    else:
        worksheet.write(0, col_num, data)
workbook.close()

OpenPyXL - Change font for entire worksheet, column or row

I am unsure as to what the below means from the OpenPyXL documentation:
"Styles can also be applied to columns and rows but note that this applies only to cells created (in Excel) after the file is closed. If you want to apply styles to entire rows and columns then you must apply the style to each cell yourself. This is a restriction of the file format:"
I'm unsure what "this applies only to cells created (in Excel) after the file is closed" means.
>>> col = ws.column_dimensions['A']
>>> col.font = Font(bold=True)
>>> row = ws.row_dimensions[1]
>>> row.font = Font(underline="single")
I tried the below code to change row 4's font but it isn't having any impact.
row = ws.row_dimensions[4]
row.font = Font(name='Arial', size=8)
Is there a solution instead of changing every individual cell?
Thanks
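One way to read the documentation quote is that the row dimension only sets a default for cells Excel adds later, so cells that already exist need the font applied directly. A minimal sketch of that per-cell approach, assuming a made-up five-row sheet (the data is illustrative only):

```python
from openpyxl import Workbook
from openpyxl.styles import Font

wb = Workbook()
ws = wb.active
for r in range(1, 6):
    ws.append([f'row {r}', r, r * 2])

small_arial = Font(name='Arial', size=8)

# ws[4] returns the cells that already exist in row 4.
for cell in ws[4]:
    cell.font = small_arial
```

The same loop over `ws['A']` would cover a column, and a nested loop over `ws.iter_rows()` would cover the whole worksheet.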

Openpyxl - Remove formatting from all sheets in an Excel file

I have files with a lot of weird formatting and I'm trying to create a function that removes any formatting from an xlsx file.
Some guys in here suggested to use "cell.fill = PatternFill(fill_type=None)" to clear any format from a given cell.
import openpyxl as xl
from openpyxl.styles import PatternFill
path = r'C:\Desktop\Python\Openpyxl\Formatted.xlsx'
wb = xl.load_workbook(filename = path)
def removeFormatting(file):
    ws = wb[file]
    for row in ws.iter_rows():
        for cell in row:
            if cell.value is None:
                cell.fill = PatternFill(fill_type=None)
    wb.save(path)
for s in wb.sheetnames:
    removeFormatting(s)
But this won't change anything. If the cells are empty but colored, then openpyxl still sees them as non empty.
Following this post:
Openpyxl check for empty cell
"The problem with ws.max_column and ws.max_row is that it will count blank columns as well, thus defeating the purpose."
@bhaskar was right.
When I try to get the max column, I get the same value for every sheet as for the first sheet.
col = []
for sheet in wb.worksheets:
    col.append(sheet.max_column)
So even if the sheets have different dimensions, any cell that has a background color or other formatting is treated as a valid non-empty cell.
Does anyone know how to solve this?
Thanks in advance!
This function removes styles from all cells in a given worksheet (passed as ws object).
So you can open your file, iterate over all worksheets and apply this function to each one:
def removeFormatting(ws):
    # ws is not the worksheet name, but the worksheet object
    for row in ws.iter_rows():
        for cell in row:
            cell.style = 'Normal'
If you also want to check info about how to define and apply named styles, take a look here:
https://openpyxl.readthedocs.io/en/stable/styles.html#cell-styles-and-named-styles
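A sketch of the "open, iterate over all worksheets, apply" step, assuming a small workbook built in memory as a stand-in for the real file (the cell contents and the `remove_formatting` name are illustrative). With a file on disk you would pass its path to `load_workbook` and `wb.save` instead of the buffers:

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font, PatternFill


def remove_formatting(ws):
    """Reset every existing cell in the worksheet object to the 'Normal' style."""
    for row in ws.iter_rows():
        for cell in row:
            cell.style = 'Normal'


# Build a small styled workbook in memory to stand in for a real file.
source = Workbook()
sheet = source.active
sheet['A1'] = 'header'
sheet['A1'].font = Font(bold=True)
sheet['B2'].fill = PatternFill(fill_type='solid', start_color='FFFF00')
buffer = BytesIO()
source.save(buffer)
buffer.seek(0)

# Open the workbook and clean every worksheet, then save once at the end.
wb = load_workbook(buffer)
for ws in wb.worksheets:
    remove_formatting(ws)

cleaned = BytesIO()
wb.save(cleaned)
```

Saving once after the loop avoids rewriting the file for every sheet.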

Fill an existing Excel file with data from a Pandas DataFrame

I have a Pandas DataFrame with a bunch of rows and labeled columns.
I also have an excel file which I prepared with one sheet which contains no data but only
labeled columns in row 1 and each column is formatted as it should be: for example if I
expect percentages in one column then that column will automatically convert a raw number to percentage.
What I want to do is fill the raw data from my DataFrame into that Excel sheet in such a way
that row 1 remains intact so the column names remain. The data from the DataFrame should fill
the excel rows starting from row 2 and the pre-formatted columns should take care of converting
the raw numbers to their appropriate type, hence filling the data should not override the column format.
I tried using openpyxl but it ended up creating a new sheet and overriding everything.
Any help?
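If you're certain the order of the columns is the same, you can try this, where writer is a pandas ExcelWriter opened on the existing file with the openpyxl engine (startrow is zero-based, so 1 means Excel row 2):
df.to_excel(writer, startrow=1, index=False, header=False)
If the number and order of the columns are the same, you may also try the xlsxwriter engine and name the sheet, but note that XlsxWriter can only create new files, so this overwrites filename.xlsx rather than filling in the existing sheet:
df.to_excel('filename.xlsx', engine='xlsxwriter', sheet_name='sheetname', index=False)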
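Another possible route is to write the values with openpyxl directly, leaving row 1 alone. This is only a sketch under assumptions: `fill_rows` is a hypothetical helper, the workbook built here is a stand-in for the prepared template, and it assumes the column formats are recorded on the column dimensions, since cells created by openpyxl do not inherit them automatically:

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter


def fill_rows(ws, rows, first_row=2):
    """Write rows of values below the header, reusing each column's number format."""
    for r_offset, row in enumerate(rows):
        for c_index, value in enumerate(row, start=1):
            cell = ws.cell(row=first_row + r_offset, column=c_index, value=value)
            letter = get_column_letter(c_index)
            # New cells do not inherit column-level formats, so copy them explicitly.
            if letter in ws.column_dimensions:
                cell.number_format = ws.column_dimensions[letter].number_format


# Stand-in for the prepared template: header in row 1, column B formatted as percent.
wb = Workbook()
ws = wb.active
ws.append(['name', 'share'])
ws.column_dimensions['B'].number_format = '0.00%'

# With pandas, the rows could come from df.itertuples(index=False, name=None).
fill_rows(ws, [('a', 0.25), ('b', 0.5)])
```

With a real template you would open it with `openpyxl.load_workbook`, pick the sheet, call the helper, and save under the same or a new name.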

xlsxwriter format specific cell

I've seen many pages on here about formatting the width of an entire column, but is there a way to format an individual cell's width? My issue is that I'm creating a sheet that has a "header" more or less: several rows where each column is a different length because they've been merged to include unique information. Below this section will be a standard dataframe, for which each entire column's width will need to be fitted to the data. But for the first five rows I need to specify unique width values. Is this possible?
XlsxWriter has a format feature that describes how to change the properties of a spreadsheet cell: https://xlsxwriter.readthedocs.io/format.html
import xlsxwriter
# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('Expenses01.xlsx')
worksheet = workbook.add_worksheet()
cell_format = workbook.add_format()
cell_format.set_bold()
cell_format.set_font_color('red')
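Formats cover properties such as font, colour, borders, alignment and number format, but not width: in Excel, width belongs to the whole column and is set in XlsxWriter with worksheet.set_column(). For header cells that need to look wider than the columns beneath them, merge cells across several columns with worksheet.merge_range().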
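Since Excel stores width per column rather than per cell, a header with differently sized cells is usually built by merging. A minimal sketch, assuming a made-up layout (the titles, the four columns and the width of 12 are illustrative); it writes to an in-memory buffer so nothing is left on disk:

```python
from io import BytesIO

import xlsxwriter

output = BytesIO()
workbook = xlsxwriter.Workbook(output, {'in_memory': True})
worksheet = workbook.add_worksheet()

title_format = workbook.add_format({'bold': True, 'align': 'center'})

# Width is a column property: columns A to D are each 12 characters wide.
worksheet.set_column(0, 3, 12)

# Header cells of different apparent widths are built by merging columns.
worksheet.merge_range('A1:D1', 'Report title', title_format)  # spans 4 columns
worksheet.merge_range('A2:B2', 'Left block', title_format)    # spans 2 columns
worksheet.merge_range('C2:D2', 'Right block', title_format)   # spans 2 columns

# Regular data below the header uses the plain column widths.
worksheet.write_row(5, 0, ['a', 'b', 'c', 'd'])

workbook.close()
```

The column widths can then be chosen to suit the dataframe below, and the header rows adapt by spanning more or fewer columns.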
