I want to format a column in a dataframe so that large numbers have ',' separators once I send the df to_excel. I have code that works, but it selects the column based on its position. I want code that selects the column by its name, not its position. Can someone help me please?
df.to_excel(writer, sheet_name = 'Final Trade List')
wb = writer.book
ws = writer.sheets['Final Trade List']
format = wb.add_format({'num_format': '#,##'})
ws.set_column('O:O', 12, format) # this code works but its based on position and not name
ws.set_column(df['$ to buy'], 12, format) # this gives me an error
writer.save()
TypeError: cannot convert the series to <class 'int'>
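This should do the trick (note that it converts the values to strings, so Excel will treat them as text rather than numbers):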
import pandas as pd
df['columnname'] = pd.Series([format(val, ',') for val in df['columnname']], index = df.index)
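If you would rather keep the values numeric and still pick the column by name, one option is to look up the column's position with `df.columns.get_loc` and pass that integer to `set_column`, which accepts zero-based column numbers as well as 'O:O'-style ranges. This is only a sketch: it assumes unique column names and the default `index=True` in `to_excel` (which shifts the data columns to the right), and the helper names `excel_col_index` and `write_trade_list` and the `'#,##0'` format string are my own choices for illustration.

```python
import pandas as pd


def excel_col_index(df, name, index=True):
    """Zero-based worksheet column that `name` lands in after df.to_excel()."""
    # to_excel() writes the index first, one worksheet column per index level.
    offset = df.index.nlevels if index else 0
    return df.columns.get_loc(name) + offset


def write_trade_list(df, path, column="$ to buy"):
    """Write df and apply a thousands-separator format to one named column."""
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name="Final Trade List")
        wb = writer.book
        ws = writer.sheets["Final Trade List"]
        num_fmt = wb.add_format({"num_format": "#,##0"})
        col = excel_col_index(df, column)
        # set_column(first_col, last_col, width, cell_format)
        ws.set_column(col, col, 12, num_fmt)
```

Calling `write_trade_list(df, 'trades.xlsx')` would then format the '$ to buy' column wherever it happens to sit in the dataframe.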
I have an Excel file like this, and I want the numbers in the date field to be converted to a date like (2021.7.22) and written back to the date field using Python.
You can try something like this:
import pandas as pd
dfs = pd.read_excel('Test.xlsx', sheet_name=None)
output = {}
for ws, df in dfs.items():
    if 'date' in df.columns:
        df['date'] = pd.to_datetime(df['date'].apply(lambda x: f'{str(x)[:4]}.{str(x)[4:6 if len(str(x)) > 7 else 5]}.{str(x)[-2:]}')).dt.date
    output[ws] = df
writer = pd.ExcelWriter('TestOutput.xlsx')
for ws, df in output.items():
    df.to_excel(writer, index=None, sheet_name=ws)
# close() also saves the file; a separate writer.save() is not needed
writer.close()
For each worksheet containing the column date in the input xlsx file, it will convert the integer it finds to a date, assuming that the month portion may be 1 or 2 digits and that the day portion is always a full 2 digits. If the actual month/day protocol in your data is different, you can adjust the logic accordingly.
The code creates a new output xlsx reflecting the above changes.
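To make the parsing assumption concrete, here is a small standalone sketch of the same idea without pandas: the first four digits are the year, the last two are the day, and whatever is left in the middle is the month. The function name `int_to_date` is just an illustrative choice, not part of the code above.

```python
from datetime import date


def int_to_date(value):
    """Parse integers such as 2021722 or 20211105 into a date.

    Assumes a 4-digit year, a 1- or 2-digit month, and a 2-digit day.
    """
    s = str(value)
    year, month, day = s[:4], s[4:-2], s[-2:]
    return date(int(year), int(month), int(day))


print(int_to_date(2021722))   # 2021-07-22
print(int_to_date(20211105))  # 2021-11-05
```

If your data uses 1-digit days as well, the value becomes ambiguous (for example 2021111) and this approach would need an extra rule.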
When I read an Excel file into a pandas dataframe I get this strange date format; I want to get rid of the zeros.
'''
for object in json_meta_data:
    header = object['excel_header'] - 1
    excel_filename = object['excel_filename']
    sheet_name = object['excel_sheet_name']
    usecols = object['excel_usecols']
    nrows = object['excel_nrows']
    dataframes.append(pd.read_excel(full_excel_path,sheet_name=sheet_name,engine='openpyxl',header=header,usecols=usecols,nrows=nrows,dtype=str))
'''
You need to change the dtype. Pandas has a to_datetime function that is good for this.
Example: df['time'] = pd.to_datetime(df['time']).dt.date
For your column you might use
df['date_arrete'] = pd.to_datetime(df['date_arrete']).dt.date
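Here is a small self-contained version you can run to see the effect. The sample strings are hypothetical; they mimic what a date column tends to look like after being read with dtype=str.

```python
import pandas as pd

# Hypothetical sample: date cells read as strings carry a trailing time part.
df = pd.DataFrame({"date_arrete": ["2021-07-22 00:00:00", "2021-08-01 00:00:00"]})

# Parse the strings into datetimes, then keep only the calendar date.
df["date_arrete"] = pd.to_datetime(df["date_arrete"]).dt.date

print(df["date_arrete"].tolist())
```

The column now holds plain `datetime.date` objects, so the zeros no longer show up when the dataframe is printed.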
I am writing a dataframe to Excel and using XlsxWriter to give my date columns a custom format, but the Excel file always contains a full datetime value and ignores the custom formatting specified in my code. Here is the code:
writer = ExcelWriter(path+'test.xlsx', engine='xlsxwriter')
workbook = writer.book
df.to_excel(writer,sheet_name='sheet1', index=False, startrow = 1, header=False)
worksheet1 = writer.sheets['sheet1']
fmt = workbook.add_format({'num_format':'d-mmm-yy'})
worksheet1.set_column('C:C', None, fmt)
# Adjusting column width
worksheet1.set_column(0, 20, 12)
# Adding back the header row
column_list = df.columns
for idx, val in enumerate(column_list):
    worksheet1.write(0, idx, val)
writer.save()
Here I want 'd-mmm-yy' format for column C but the exported excel contains datetime values. I also don't want to use strftime to convert my columns to strings to ensure easy date filtering in excel.
The reason this doesn't work as expected is that Pandas applies a default datetime format to datetime objects at the cell level. In XlsxWriter, and in Excel, a cell format overrides a column format, so your column format has no effect.
The easiest way to handle this is to specify the Pandas date (or datetime) format as a parameter in pd.ExcelWriter():
import pandas as pd
from datetime import date
df = pd.DataFrame({'Dates': [date(2020, 2, 1),
                             date(2020, 2, 2),
                             date(2020, 2, 3),
                             date(2020, 2, 4),
                             date(2020, 2, 5)]})
writer = pd.ExcelWriter('pandas_datetime.xlsx',
                        engine='xlsxwriter',
                        date_format='d-mmm-yy')
df.to_excel(writer, sheet_name='Sheet1')
# close() saves the file; writer.save() was removed in pandas 2.0
writer.close()
See also this Pandas Datetime example from the XlsxWriter docs.
I have a DataFrame read from an Excel sheet, to which I've added a few new columns using XlsxWriter. Now I need to filter this new set of data using one of the new columns I created in XlsxWriter (which is a date column, btw). Is there a way to turn this new worksheet into a dataframe again so I can filter on the new column? I'll try to provide any useful code:
export = "files/extract.xlsx"
future_days = 12
writer = pd.ExcelWriter('files/new_report-%s.xlsx' % (date.today()), engine ='xlsxwriter')
workbook = writer.book
df = pd.read_excel(export)
df.to_excel(writer, 'Full Log', index=False)
log_sheet = writer.sheets['Full Log']
new_headers = ('todays date', 'Milestone Date')
log_sheet.write_row('CW1', new_headers)
# This for loop just writes in the formula for my new columns on every line
for row_num in range(2, len(df.index)+2):
    log_sheet.write_formula('CX' + str(row_num),'=IF(AND($BS{0}>1/1/1990,$BT{0}<>"Yes"),IF($BS{0}<=$CW{0},$BS{0},"Date In Future"),IF(AND($BW{0}>1/1/1990,$BX{0}<>"Yes"),IF($BW{0}<=CW{0},$BW{0},"Date In Future"),IF(AND($CA{0}>1/1/1990,$CCW{0}<>"Yes"),IF($CA{0}<=CW{0},$CA{0},"Date In Future"),IF(AND($CE{0}>1/1/1990,$CF{0}<>"Yes"),IF($CE{0}<CW{0},$CE{0},"Date In Future"),IF(AND($CI{0}>1/1/1990,$CJ{0}<>"Yes"),IF($CI{0}<CW{0},$CI{0},"Date In Future"),IF(AND($CM{0}>1/1/1990,$CN{0}<>"Yes"),IF($CM{0}<CW{0},$CM{0},"Date In Future"),"No Date"))))))'.format(row_num))
    log_sheet.write_formula('CW' + str(row_num), '=TODAY()+' + str(future_days))
    log_sheet.write_formula('CY' + str(row_num), '=IF(AND(AI{0}>DATEVALUE("1/1/1900"), AH{0}>DATEVALUE("1/1/1900"),A{0}<>"Test",A{0}<>"Dummy Test"),NETWORKDAYS(AH{0},AI{0}-1),"Test")'.format(row_num))
So now that's all done, I need to filter this "Full Log" sheet so it only keeps rows where the value in the new milestone date column has passed today's date. I've used XlsxWriter's autofilter for this, but I don't like it because it doesn't actually apply the filter; it just sets it.
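You can close the writer (which saves the file) and then load the file into a new dataframe. Pass the path of the workbook you wrote and pick the sheet with sheet_name:
writer.close()
df2 = pd.read_excel('files/new_report-%s.xlsx' % (date.today()), sheet_name='Full Log')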
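One caveat: as far as I know, XlsxWriter does not evaluate formulas itself. It stores a placeholder result of 0 until the file is opened and recalculated in Excel, so formula columns read back with pandas right after saving may not contain the real dates. An alternative is to do the date comparison in pandas directly. The sketch below is a different technique from the formula approach, and everything in it is hypothetical sample data: it assumes the milestone date is already available as a datetime column named 'Milestone Date'.

```python
import pandas as pd

future_days = 12

# Hypothetical data standing in for the real log.
df = pd.DataFrame({
    "task": ["a", "b", "c"],
    "Milestone Date": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-02-15"]),
})


def due_rows(frame, today, days_ahead):
    """Rows whose milestone falls on or before today + days_ahead."""
    cutoff = pd.Timestamp(today) + pd.Timedelta(days=days_ahead)
    return frame[frame["Milestone Date"] <= cutoff]


# With today = 2021-01-10 the cutoff is 2021-01-22, so tasks a and b remain.
print(due_rows(df, "2021-01-10", future_days))
```

In real use you would pass `date.today()` as `today` and write the filtered frame to its own sheet.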
I am working on a project where I am writing out onto an xlsx spreadsheet and need to format the one column for 'Date'. I get the program to run and all but the column format is still set to 'General'.
I'm trying this a different way, with different code, to see if anyone answers:
for row in cur.execute('''SELECT `Mapline`,`Plant`,`Date`,`Action` from AEReport'''):
    lengthOfHeadings = len(row)
output = '%s-%s.xlsx' % ("AEReport",now.strftime("%m%d%Y-%H%M"))
workbook = xlsxwriter.Workbook(output, {'strings_to_numbers':True})
worksheet = workbook.add_worksheet()
format=workbook.add_format({'font_size':'8','border':True})
format2=workbook.add_format({'font_size':'8','border':True,'num_format':'mm/dd/yy hh:mm'})
count = 0
for name in range(0,lengthOfHeadings):
    if name==row[2]:
        name=int(name)
        worksheet.write(counter, count, row[name],format2)
    else:
        worksheet.write(counter, count, row[name],format)
    count += 1
counter += 1
To get the datetime format working, you have to convert the date value to an Excel serial date value.
Here is an example showing how it works:
import pandas as pd
data = pd.DataFrame({'test_date':pd.date_range('1/1/2011', periods=12, freq='M') })
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
# Excel serial number = whole days since 1899-12-30 (1900 date system)
data['test_date'] = (data['test_date'] - pd.Timestamp('1899-12-30')).dt.days
data.to_excel(writer, sheet_name='test', index=False)
workbook = writer.book
worksheet = writer.sheets['test']
formatdict = {'num_format':'mm/dd/yyyy'}
fmt = workbook.add_format(formatdict)
# plain numbers get no cell-level format from pandas, so the column format applies
worksheet.set_column('A:A', None, fmt)
writer.close()
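For reference, the serial-date conversion itself can be written in a few lines of plain Python. This is a sketch that assumes Excel's default 1900 date system and dates on or after 1900-03-01 (earlier dates are off by one because Excel treats 1900 as a leap year); the name `to_excel_serial` is my own.

```python
from datetime import date, datetime

# Day zero that makes serial numbers match Excel's 1900 date system.
EXCEL_EPOCH = date(1899, 12, 30)


def to_excel_serial(d):
    """Whole-day Excel serial number for a date or datetime."""
    if isinstance(d, datetime):
        d = d.date()  # drop the time-of-day part
    return (d - EXCEL_EPOCH).days


print(to_excel_serial(date(2011, 1, 1)))   # 40544
print(to_excel_serial(date(2011, 1, 31)))  # 40574
```

A cell holding 40544 with the number format 'mm/dd/yyyy' is displayed by Excel as 01/01/2011.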
from datetime import datetime
date_format = workbook.add_format({'num_format':'yyyy-mm-dd hh:mm:ss'})
worksheet.write(0, 0, datetime.today(),date_format)
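date = workbook.add_format({'num_format': 'dd-mm-yyyy'})
# Pass a datetime object: a bare 02-12-199 is not a valid date literal (the year 1999 here is an assumption)
worksheet.write(1, 1, datetime(1999, 12, 2), date)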