xlsxwriter doesn't change date alignment on data from pandas.to_excel() - python

Example data:
A | B
1 | 2020/10/01
2 | 2021/10/01
I'm using pandas.to_excel like so:
df = pd.DataFrame(list(data))
writer = pd.ExcelWriter("excel.xlsx", engine='xlsxwriter', date_format="dd/mm/yyyy;#")
df.to_excel(writer_head, sheet_name='Sheet 1', index=False, startrow=4)
Then I create the formatting like this:
df.to_excel(writer, sheet_name='Sheet 1', index=False, startrow=1)
workbook = writer.book
worksheet = writer.sheets['Sheet 1']
# format
date_align = workbook.add_format({
    'align': 'center',
    'valign': 'vcenter',
    'num_format': 'dd/mm/yyyy;#',
})
So I tried to apply the formatting like this:
worksheet.set_column('B:B', 13, date_align)
writer.save()
But it didn't work: the dates written by pd.to_excel() change neither alignment nor number format. If I write the data manually, it works:
worksheet.write('B', datetime.now().today())
worksheet.set_column('B:B', 13, date_align)
writer.save()
That works, but I want the data from pd.to_excel() to be formatted. I checked that the type in the list is indeed datetime.date, and the Excel output has the category 'Date', not 'Custom' or anything else. The alignment also works fine with pd.to_excel() as long as the column is not a date or datetime.
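One likely explanation, going by the note in the XlsxWriter column-formats example quoted further down this page: cells that contain dates or datetimes are written by pandas with their own cell format, and a format set with set_column() does not replace a format that a cell already has. The sketch below shows one possible workaround under that assumption: after to_excel(), rewrite the date cells with write_datetime() and the desired format. The sample data, the in-memory buffer and the plain 'dd/mm/yyyy' number format are assumptions made for illustration, not the asker's actual setup.

```python
import datetime as dt
import io

import pandas as pd

# Sample data standing in for the asker's list (an assumption).
df = pd.DataFrame({
    "A": [1, 2],
    "B": [dt.date(2020, 10, 1), dt.date(2021, 10, 1)],
})

buffer = io.BytesIO()  # write in memory; a file path works the same way
with pd.ExcelWriter(buffer, engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Sheet 1", index=False)
    workbook = writer.book
    worksheet = writer.sheets["Sheet 1"]

    date_align = workbook.add_format({
        "align": "center",
        "valign": "vcenter",
        "num_format": "dd/mm/yyyy",
    })

    # Overwrite each date cell so it carries the new format.
    # Row 0 holds the header, so the data starts at row 1.
    date_col = df.columns.get_loc("B")
    for row, value in enumerate(df["B"], start=1):
        worksheet.write_datetime(row, date_col, value, date_align)

    # The column width can still be set at column level.
    worksheet.set_column(date_col, date_col, 13)
```

If startrow is used in to_excel(), the row offset in the loop has to be shifted by the same amount.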

Related

Applying conditional formatting to excel column from pandas dataframe

I'm trying to make an Excel document with multiple sheets and apply conditional formatting to selected columns in each sheet. However, for some reason the conditional formatting is not applied when I open the sheet.
newexcelfilename= 'ResponseData_'+date+'.xlsx'
exceloutput = "C:\\Users\\jimbo\\Desktop\\New folder (3)\\output\\"+newexcelfilename
print("Writing to Excel file...")
# Given a dict of pandas dataframes
dfs = {'Tracts': tracts_finaldf, 'Place':place_finaldf,'MCDs':MCD_finaldf,'Counties': counties_finaldf, 'Congressional Districts':cd_finaldf,'AIAs':aia_finaldf}
writer = pd.ExcelWriter(exceloutput, engine='xlsxwriter')
workbook = writer.book
## columns for 3 color scale formatting export out of pandas as text, need to convert to number format.
numberformat = workbook.add_format({'num_format': '00.0'})
## manually applying header format
header_format = workbook.add_format({
    'bold': True,
    'text_wrap': False,
    'align': 'left',
})
for sheetname, df in dfs.items(): # loop through `dict` of dataframes
    df.to_excel(writer, sheet_name=sheetname, startrow=1,header=False,index=False) # send df to writer
    worksheet = writer.sheets[sheetname] # pull worksheet object
    for col_num, value in enumerate(df.columns.values):
        worksheet.write(0, col_num, value, header_format)
    for idx, col in enumerate(df): # loop through all columns
        series = df[col]
        col_len = len(series.name) # len of column name/header
        worksheet.set_column(idx,idx,col_len)
        if col in ['Daily Internet Response Rate (%)',
                   'Daily Response Rate (%)',
                   'Cumulative Internet Response Rate (%)',
                   'Cumulative Response Rate (%)']:
            worksheet.set_column(idx,idx,col_len,numberformat)
        if col == 'DATE':
            worksheet.set_column(idx,idx,10)
        if col == 'ACO':
            worksheet.set_column(idx,idx,5)
    ## applying conditional formatting to columns which were converted to the numberformat
    if worksheet == 'Tracts':
        worksheet.conditional_format('E2:H11982', {'type':'3_color_scale',
                                                   'min_color': 'FF5733',
                                                   'mid_color':'FFB233',
                                                   'max_color': 'C7FF33',
                                                   'min_value': 0,
                                                   'max_vallue': 100})
writer.save()
Everything else in the code works: the column widths are resized and the numeric format is applied to the specified columns. However, I cannot get the conditional formatting to apply.
I've searched the other questions on Stack Exchange but cannot find an answer.
You have a few errors in the conditional format: the colours are not specified in HTML format (with a leading #), and max_value is misspelled as max_vallue. Once those are fixed it should work. Here is a smaller working example based on yours:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Apply a conditional format to the cell range.
worksheet.conditional_format('B2:B8',
                             {'type': '3_color_scale',
                              'min_color': '#FF5733',
                              'mid_color': '#FFB233',
                              'max_color': '#C7FF33',
                              'min_value': 0,
                              'max_value': 100})
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()
Also, this line:
if worksheet == 'Tracts':
Should probably be:
if sheetname == 'Tracts':
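The hard-coded range 'E2:H11982' in the question could also be derived from the size of the dataframe, so that it keeps matching the data if the row count changes. A minimal sketch, assuming the data starts directly below a one-row header as in the question; xl_range() comes from xlsxwriter.utility, while data_range and its parameters are hypothetical names introduced here:

```python
from xlsxwriter.utility import xl_range


def data_range(n_rows, first_col, last_col, header_rows=1):
    """Return the A1-style range covering n_rows of data below the header.

    Rows and columns are zero-indexed, as in the XlsxWriter API.
    """
    first_row = header_rows
    last_row = header_rows + n_rows - 1
    return xl_range(first_row, first_col, last_row, last_col)


# 11981 data rows in columns E..H (zero-indexed columns 4..7).
print(data_range(11981, 4, 7))  # E2:H11982
```

Inside the loop this could be used as worksheet.conditional_format(data_range(len(df), 4, 7), {...}) once the sheetname check is in place.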

How to color text in a cell containing a specific string using pandas

After running my algorithms I saved all the data in an excel file using pandas.
writer = pd.ExcelWriter('Diff.xlsx', engine='xlsxwriter')
Now, some of the cells contain strings that include "-->". I have the row and column numbers for those cells using:
xl_rowcol_to_cell(rows[i],cols[i])
But I couldn't figure out how to color those cells, or at least the whole text in them.
Any suggestions/tips?
You could use a conditional format in Excel like this:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': ['foo', 'a --> b', 'bar']})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Add a format. Light red fill with dark red text.
format1 = workbook.add_format({'bg_color': '#FFC7CE',
                               'font_color': '#9C0006'})
# Apply a conditional format to the cell range.
worksheet.conditional_format(1, 1, len(df), 1,
                             {'type': 'text',
                              'criteria': 'containing',
                              'value': '-->',
                              'format': format1})
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()
See Adding Conditional Formatting to Dataframe output in the XlsxWriter docs.
def highlight(row):
    # With axis=1, apply() passes each row as a Series; return one CSS string per cell.
    color = 'yellow' if '-->' in str(row['c']) else 'white'
    return [f'background-color: {color}'] * len(row)

df.style.apply(highlight, axis=1)

XlsxWriter: add color to cells

I'm trying to write a dataframe to xlsx and color some of its cells.
I use
worksheet.conditional_format('A1:C1', {'type': '3_color_scale'})
but it doesn't color the cells, and I want a single color for these cells anyway.
I saw cell_format.set_font_color('#FF0000'),
but that doesn't specify which cells it applies to.
sex = pd.concat([df2[["All"]],df3], axis=1)
excel_file = 'example.xlsx'
sheet_name = 'Sheet1'
writer = pd.ExcelWriter(excel_file, engine='xlsxwriter')
sex.to_excel(writer, sheet_name=sheet_name, startrow=1)
workbook = writer.book
worksheet = writer.sheets[sheet_name]
format = workbook.add_format()
format.set_pattern(1)
format.set_bg_color('gray')
worksheet.write('A1:C1', 'Ray', format)
writer.save()
I need to color A1:C1, but write() makes me supply a value for the cell. How can I color several cells of my df?
The problem is that worksheet.write('A1:C1', 'Ray', format) writes only a single cell.
One way to write several cells in a row is write_row():
worksheet.write_row("A1:C1", ['Ray','Ray2','Ray3'], format)
Remember that write_row() takes a list of values to write, one per cell.
If you pass a bare string, as in worksheet.write_row("A1:C1", 'Ray', format), you get R in the first cell, a in the second and y in the third.
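The character-per-cell result follows from write_row() looping over whatever sequence it is given, and a Python string is itself a sequence of characters. A minimal sketch of that behaviour in plain Python; as_row_values is a hypothetical helper (not part of XlsxWriter) that could guard a call such as worksheet.write_row('A1', as_row_values(value), format):

```python
def as_row_values(data):
    """Wrap a bare string in a list so it fills one cell instead of one per character."""
    if isinstance(data, str):
        return [data]
    return list(data)


print(list("Ray"))                      # ['R', 'a', 'y']
print(as_row_values("Ray"))             # ['Ray']
print(as_row_values(("Ray", "Ray2")))   # ['Ray', 'Ray2']
```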
cf = workbook.add_format({'bg_color': 'yellow'})
worksheet.write('A1', 'Column name', cf)

Pandas Xlsxwriter time format

I am trying to write out my pandas table using xlsxwriter. I have two columns:
Date | Time
10/10/2015 8:57
11/10/2015 10:23
But when I use xlsxwriter, the output is:
Date | Time
10/10/2015 0.63575435
11/10/2015 0.33256774
I tried using datetime_format = 'hh:mm:ss' but this didn't change it. How else can I get the time to format correctly without affecting the date column?
The following code works for me, with a caveat: whether a custom format works depends on the Windows/Excel version you open the file with, because Excel's custom formatting depends on the language settings of the Windows OS (see Excel's custom formatting options and the Windows date/time settings).
So it's not the best solution, but the idea is to set the format for each column instead of changing how a type of data is interpreted for the whole Excel file being created.
import pandas as pd
from datetime import datetime, date
# Create a Pandas dataframe from some datetime data.
df = pd.DataFrame({'Date and time': [date(2015, 1, 1),
                                     date(2015, 1, 2),
                                     date(2015, 1, 3),
                                     date(2015, 1, 4),
                                     date(2015, 1, 5)],
                   'Time only': ["11:30:55",
                                 "1:20:33",
                                 "11:10:00",
                                 "16:45:35",
                                 "12:10:15"],
                   })
df['Time only'] = df['Time only'].apply(pd.to_timedelta)
#df['Date and time'] = df['Date and time'].apply(pd.to_datetime)
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("pandas_datetime.xlsx",
                        engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects in order to set the column
# widths, to make the dates clearer.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Experiment with num_format: what works depends on your Windows and Excel date/time settings.
# Add some cell formats.
format1 = workbook.add_format({'num_format': 'd-mmm-yy'})
format2 = workbook.add_format({'num_format': "h:mm:ss"})
# Set the width and the format of each column in one call; a later set_column()
# call on the same column would replace the earlier settings.
worksheet.set_column('B:B', 20, format1)
worksheet.set_column('C:C', 20, format2)
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()
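The decimals in the question's output look like Excel's internal representation of a time of day: a fraction of a 24-hour day, which a number format such as h:mm:ss then displays as a clock time. A minimal sketch of that conversion in plain Python; both helper names are hypothetical and exist only for illustration:

```python
from datetime import timedelta


def time_to_excel_fraction(hours, minutes, seconds=0):
    """Return a time of day as a fraction of a 24-hour day."""
    elapsed = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return elapsed / timedelta(days=1)


def excel_fraction_to_hms(fraction):
    """Format a fraction of a day as h:mm:ss, rounded to the nearest second."""
    total_seconds = round(fraction * 86400)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


print(time_to_excel_fraction(12, 0))   # 0.5
print(excel_fraction_to_hms(0.75))     # 18:00:00
```

This is why a time-style num_format on the column, rather than a change to the stored values, is what turns the decimals back into readable times.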

Pandas ExcelWriter set_column fails to format numbers after DataFrame.to_excel used

I have tried the example code found on the xlsxwriter webpage at http://xlsxwriter.readthedocs.org/en/latest/example_pandas_column_formats.html
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Numbers':    [1010, 2020, 3030, 2020, 1515, 3030, 4545],
                   'Percentage': [.1, .2, .33, .25, .5, .75, .45],
                   })
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("pandas_column_formats.xlsx", engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Add some cell formats.
format1 = workbook.add_format({'num_format': '#,##0.00'})
format2 = workbook.add_format({'num_format': '0%'})
# Note: It isn't possible to format any cells that already have a format such
# as the index or headers or any cells that contain dates or datetimes.
# Set the column width and format.
worksheet.set_column('B:B', 18, format1)
# Set the format but not the column width.
worksheet.set_column('C:C', None, format2)
# Close the Pandas Excel writer and output the Excel file.
writer.save()
However, it does not format the columns as expected: they simply appear unformatted (no numeric rounding, no percentages).
I am using Pandas 0.15.2. Has this changed recently in Pandas, perhaps?
Any ideas would be welcome.
This seems to be fixed in Pandas 0.16. See https://github.com/jmcnamara/XlsxWriter/issues/204
