Print Colored Font Into Excel Using Pandas - python

I'm trying to write green, bold text into an Excel spreadsheet.
I can print this in a Jupyter notebook without a problem, but this is what I get in the spreadsheet: [1m[92mHello
import pandas as pd
import numpy as np
writer = pd.ExcelWriter('out.xlsx')
pd.DataFrame([1,'\033[1m' + '\033[92m'+ 'Hello',3]).to_excel(writer, sheet_name= 'sheet1')
writer.save()

ExcelWriter exposes the underlying workbook and worksheet objects, so you can apply formatting through the engine. To get bold green text, write the plain string and attach an XlsxWriter format to its row:
import pandas as pd
import numpy as np
writer = pd.ExcelWriter('out.xlsx', engine='xlsxwriter')
pd.DataFrame([1,'Hello',3]).to_excel(writer, sheet_name= 'sheet1')
worksheet = writer.sheets['sheet1']
workbook = writer.book
cell_format = workbook.add_format({'bold':True, 'font_color': 'green'})
worksheet.set_row(2,None,cell_format)
writer.close()  # use writer.save() on older pandas; save() was removed in pandas 2.0
See the pandas documentation of ExcelWriter for details.
Also, if you want to change the format of the header, you have to reset the header style first. Put the following before defining the writer:
pd.io.formats.excel.header_style = None
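The string in the question carries ANSI escape sequences (\033[1m and \033[92m). Terminals and Jupyter interpret them, but Excel stores them as literal characters, which is why [1m[92mHello shows up in the cell. If the data already contains such codes, a minimal sketch for stripping them before writing could look like this. The helper name strip_ansi and the pattern are my own assumptions, and the pattern only covers color/style (SGR) codes:

```python
import re

# Matches SGR sequences such as "\033[1m" (bold) or "\033[92m" (bright green).
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

def strip_ansi(value):
    """Return value without ANSI color/style codes; non-strings pass through."""
    if isinstance(value, str):
        return ANSI_SGR.sub("", value)
    return value

cleaned = [strip_ansi(v) for v in [1, "\033[1m" + "\033[92m" + "Hello", 3]]
print(cleaned)  # [1, 'Hello', 3]
```

The cleaned list can then be passed to pd.DataFrame and formatted through the writer engine as shown in the answer above.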

pandas will not turn those escape codes into cell formatting.
You can use openpyxl to do so.
Export your data to Excel using pandas, then load the workbook using openpyxl and handle the coloring and other visual aspects from there.
from openpyxl import load_workbook
from openpyxl.styles import Font
wb = load_workbook('yourworkbookname.xlsx')
ws = wb.active
a1 = ws['A1']
d4 = ws['D4']
# '00FF00' is green as a hex RGB value; the colors.GREEN constant is not defined in recent openpyxl releases
ft = Font(color='00FF00', bold=True)
a1.font = ft
d4.font = ft
wb.save('yourworkbookname.xlsx')  # save() requires a file name
For more details, see the openpyxl documentation.
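The two-step workflow described in this answer (write with pandas, then style with openpyxl) might look like this end to end. This is only a sketch: the temporary file path is hypothetical, and the target cell B3 follows from the assumption that the frame is written with the default header and index.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "styled.xlsx")

# Step 1: pandas writes the raw values (assumes the openpyxl engine is installed).
pd.DataFrame([1, "Hello", 3]).to_excel(path, sheet_name="sheet1", engine="openpyxl")

# Step 2: openpyxl reopens the file and styles the cell holding "Hello".
wb = load_workbook(path)
ws = wb["sheet1"]
# Row 1 is the header and column A is the index, so "Hello" sits in B3.
ws["B3"].font = Font(color="00FF00", bold=True)
wb.save(path)
```

Reopening the file costs a second pass over the workbook, so for large files formatting through the writer engine in a single pass may be preferable.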

Related

Add formats to dataframe and insert in excel using python?

I have to insert data into Excel with borders, and all values in the DataFrame should be centered. I tried formatting the cells, but it does not work:
df1.to_excel(writer,index=False,header=True,startrow=12,sheet_name='Sheet1')
writer.close()
writer=pd.ExcelWriter(s, engine="xlsxwriter")
writer.book = load_workbook(s)
workbooks= writer.book
worksheet = workbooks['Sheet1']
f1= workbooks.add_format()
worksheet.conditional_format(12,0,len(df1)+1,7,{'format':f1})
Can you please help me with this?
I'm not going to lie: I've done this for the first time right now, so this might not be a very good solution. I'm using openpyxl because it seems more flexible to me than XlsxWriter. I hope you can use it too.
My assumption is that the variable file_name contains a valid file name.
First your Pandas step:
with pd.ExcelWriter(file_name, engine='xlsxwriter') as writer:
    df1.to_excel(writer, index=False, header=True, startrow=12, sheet_name='Sheet1')
Then the necessary imports from openpyxl:
from openpyxl import load_workbook
from openpyxl.styles import NamedStyle, Alignment, Border, Side
Loading the workbook and selecting the worksheet:
wb = load_workbook(file_name)
ws = wb['Sheet1']
Defining the required style:
centered_with_frame = NamedStyle('centered_with_frame')
centered_with_frame.alignment = Alignment(horizontal='center')
bd = Side(style='thin')
centered_with_frame.border = Border(left=bd, top=bd, right=bd, bottom=bd)
Selecting the relevant cells:
cells = ws[ws.cell(row=12+1, column=1).coordinate:
           ws.cell(row=12+1+df1.shape[0], column=df1.shape[1]).coordinate]
Applying the defined style to the selected cells:
for row in cells:
    for cell in row:
        cell.style = centered_with_frame
Finally saving the workbook:
wb.save(file_name)
As I said: This might not be optimal.

python: xlsxwriter adding dataframe + formula to excel file

I found two different methods to write DataFrames and formulas to an Excel file.
import pandas as pd
import numpy as np
import xlsxwriter
# Method 1
writer = pd.ExcelWriter('example.xlsx', engine='xlsxwriter')
A = pd.DataFrame(np.array([[1,2,3],[4,5,6],[7,8,9]]))
A.to_excel(writer , sheet_name='Sheet1')
writer.save()
# Method 2
workbook = xlsxwriter.Workbook('example.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_formula('B5' , '=_xlfn.STDEV.S(B3:B5)')
workbook.close()
One works for adding the dataframe the other for the formula. Problem: Method 2 deletes what was written to the file with Method 1. How can I combine them?
Here is one way to do it if you want to combine the two actions into one:
import pandas as pd
import numpy as np
import xlsxwriter
# Method 1
writer = pd.ExcelWriter('example.xlsx', engine='xlsxwriter')
A = pd.DataFrame(np.array([[1,2,3],[4,5,6],[7,8,9]]))
A.to_excel(writer , sheet_name='Sheet1')
# Get the xlsxwriter objects from the dataframe writer object.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Write the formula.
worksheet.write_formula('B5' , '=_xlfn.STDEV.S(B2:B4)')
# Or Create a new worksheet and add the formula there.
worksheet = workbook.add_worksheet()
worksheet.write_formula('B5' , '=_xlfn.STDEV.S(Sheet1!B2:B4)')
writer.close()  # use writer.save() on older pandas; save() was removed in pandas 2.0
See the Working with Python Pandas and XlsxWriter section of the XlsxWriter docs.
Note, I corrected the range of the formula to avoid a circular reference.
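The circular reference came from a formula in B5 that referred to B3:B5, a range that includes the formula cell itself. One way to avoid hard-coding the range is to derive it from the DataFrame's shape. The following is a sketch under the assumption that the frame is written at the top-left corner with the default header and index, so the data starts at zero-based row 1, column 1:

```python
import pandas as pd
from xlsxwriter.utility import xl_range, xl_rowcol_to_cell

A = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# With the default header and index, data starts at row 1, column 1 (zero-based).
first_row, col = 1, 1
last_row = first_row + len(A) - 1

data_range = xl_range(first_row, col, last_row, col)  # data cells of the first column
formula_cell = xl_rowcol_to_cell(last_row + 1, col)   # first free cell below them
formula = f"=_xlfn.STDEV.S({data_range})"
print(formula_cell, formula)  # B5 =_xlfn.STDEV.S(B2:B4)
```

The resulting pair can be passed to worksheet.write_formula(formula_cell, formula), which keeps the formula cell outside its own range even if the number of rows changes.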

Moving sheets from excel files into one workbook using openpyxl

I've been trying to write a script that loads all Excel files from a specific location and moves the worksheets inside these files into one workbook. I end up with an error:
AttributeError: 'DataFrame' object has no attribute 'DataFrame'.
I'm pretty new to this, so I would really appreciate any tip on how to make it work. I can only use openpyxl because at the moment I cannot install the xlrd module on my workstation.
from pandas import ExcelWriter
import glob
import pandas as pd
import openpyxl
writer = ExcelWriter("output.xlsx")
for filename in glob.glob (r"C:\path\*.xlsx"):
    wb = openpyxl.load_workbook(filename)
    for ws in wb.sheetnames:
        ws = wb[ws]
        print (ws)
        data = ws.values
        columns = next(data)[0:]
        df= pd.DataFrame(data, columns=columns)
        print(df)
        for df in df.DataFrame:
            df.to_excel([writer,sheet_name= ws)
writer.save()
First, sheet_name must be a string, not a worksheet object. Second, the last for loop is not needed, because you are already looping over the sheet names.
from pandas import ExcelWriter
import glob
import pandas as pd
import openpyxl
writer = ExcelWriter("output.xlsx")
for filename in glob.glob(r"C:\path\*.xlsx"):
    wb = openpyxl.load_workbook(filename)
    for ws in wb.sheetnames:
        ws1 = wb[ws]
        data = ws1.values
        # The first row of the sheet becomes the column header.
        columns = next(data)[0:]
        df = pd.DataFrame(data, columns=columns)
        df.to_excel(writer, sheet_name=ws, index=False)
writer.close()  # use writer.save() on older pandas; save() was removed in pandas 2.0

to_excel() without index layout

I'm using to_excel to write multiple DataFrames to multiple Excel documents. This works fine except that the index of the Dataframes is appended in bold with a border around each cell (see image).
The following code is a simplification of the code I use but has the same problem:
import numpy as np
import pandas as pd
from openpyxl import load_workbook
df = pd.DataFrame(np.random.randint(50,60, size=(20, 3)))
xls_loc = r'test_doc.xlsx'
wb = load_workbook(xls_loc)
writer = pd.ExcelWriter(xls_loc, engine='openpyxl')
writer.book = wb
df.to_excel(writer, sheet_name='test sheet',index=True,startrow=1,startcol=1, header=False)
writer.save()
writer.close()
Is there a way to write the index without making it bold and adding borders?
Make the index a new column and then set index=False in to_excel():
df.insert(0, 'index', df.index)
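A fuller sketch of this suggestion, reusing the DataFrame and the to_excel arguments from the question. The temporary path and the read-back check at the end are my own additions for illustration, and the openpyxl engine is assumed to be installed:

```python
import os
import tempfile

import numpy as np
import pandas as pd
from openpyxl import load_workbook

df = pd.DataFrame(np.random.randint(50, 60, size=(20, 3)))
# Copy the index into an ordinary column so pandas writes it as plain data.
df.insert(0, "index", df.index)

path = os.path.join(tempfile.mkdtemp(), "plain_index.xlsx")  # hypothetical path
df.to_excel(path, sheet_name="test sheet", index=False, header=False,
            startrow=1, startcol=1, engine="openpyxl")

# Read one former index cell back; its bold flag should be falsy.
ws = load_workbook(path)["test sheet"]
first_index_cell = ws.cell(row=2, column=2)  # startrow=1, startcol=1 are zero-based
print(first_index_cell.value, first_index_cell.font.bold)
```

Because the index values are written as a regular column, they receive the same default cell style as the rest of the data.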
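You could insert the DataFrame using xlwings to avoid formatting:
import pandas as pd
import xlwings as xw
# Any DataFrame works here; the pd._testing helpers are private and may be missing in recent pandas.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
with xw.App(visible=False) as app:
    wb = xw.Book()
    wb.sheets[0]["A1"].value = df
    wb.save("test.xlsx")
    wb.close()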
Another option is to drop both the index and the header when writing:
import pandas as pd
data = [11,12,13,14,15]
df = pd.DataFrame(data)
wb = pd.ExcelWriter('FileName.xlsx', engine='xlsxwriter')
df.style.set_properties(**{'text-align': 'center'}).to_excel(wb, sheet_name='sheet_01',index=False,header=None)
wb.close()  # use wb.save() on older pandas; save() was removed in pandas 2.0
The main trick is passing index=False and header=None to to_excel().

Formatting integers with comma separator using openpyxl and to_excel

I am writing DataFrames to excel using to_excel(). I need to use openpyxl instead of XlsxWriter, I think, as the writer engine because I need to open existing Excel files and add sheets. Regardless, I'm deep into other formatting using openpyxl so I'm not keen on changing.
This writes the DataFrame, and formats the floats, but I can't figure out how to format the int dtypes.
import pandas as pd
from openpyxl import load_workbook
df = pd.DataFrame({'county':['Cnty1','Cnty2','Cnty3'], 'ints':[5245,70000,4123123], 'floats':[3.212, 4.543, 6.4555]})
fileName = "Maryland - test.xlsx"
book = load_workbook(fileName)
writer = pd.ExcelWriter(fileName, engine='openpyxl')
writer.book = book
df.to_excel(writer, sheet_name='Test', float_format='%.2f', header=False, index=False, startrow=3)
ws = writer.sheets['Test']
writer.save()
writer.close()
Tried using this, but I think it only works with XlsxWriter:
intFormat = book.add_format({'num_format': '#,###'})
ws.set_column('B:B', intFormat)
This type of thing could be used cell-by-cell with a loop, but there's A LOT of data:
ws['B2'].number_format = '#,###'
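This can be done by setting number_format with one of the built-in format constants from openpyxl.styles.numbers:
from openpyxl.styles import numbers
def format_column(ws, column_letter):
    # FORMAT_NUMBER_COMMA_SEPARATED1 is '#,##0.00', which displays a number like 2,000.00
    for cell in ws[column_letter]:
        cell.number_format = numbers.FORMAT_NUMBER_COMMA_SEPARATED1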
See the openpyxl documentation for further reading.
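For the integer column in the question, a format without decimals is probably closer to what is wanted. As far as I know, openpyxl has no single call that formats all existing cells of a column, so the sketch below still loops over the cells. The in-memory workbook and the sample rows are assumptions standing in for the sheet that to_excel() produced:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
for row in [["Cnty1", 5245], ["Cnty2", 70000], ["Cnty3", 4123123]]:
    ws.append(row)

# Apply a thousands-separator format without decimals to every cell in column B.
# '#,##0' (unlike '#,###') still displays a zero value as "0".
for cell in ws["B"]:
    cell.number_format = "#,##0"

print(ws["B3"].number_format)  # #,##0
```

The loop only assigns a format string per cell, so it should stay cheap even for many rows. In the question's code, ws would be writer.sheets['Test'] and the loop would run before the writer is closed.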
