how to format a column in pandas using a column name

I want to format a column in a DataFrame so that large numbers show ',' as a thousands separator once I send the df to Excel with to_excel. I have code that works, but it selects the column by its position. I want code that selects the column by its name instead. Can someone help me, please?
df.to_excel(writer, sheet_name = 'Final Trade List')
wb = writer.book
ws = writer.sheets['Final Trade List']
format = wb.add_format({'num_format': '#,##'})
ws.set_column('O:O', 12, format) # this code works but its based on position and not name
ws.set_column(df['$ to buy'], 12, format) # this gives me an error
writer.save()
TypeError: cannot convert the series to <class 'int'>

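This should do the trick. Note that it converts the values to strings, so Excel will treat them as text rather than numbers:
import pandas as pd
# '{:,}'.format avoids the built-in format(), which the question's code shadows with `format = wb.add_format(...)`
df['columnname'] = df['columnname'].map('{:,}'.format)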
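An alternative that keeps the cells numeric is to look up the column's position from its name and pass that position to set_column, which also accepts integer column indices. The following is a minimal sketch, not the answerer's method: the helper names `excel_column_index` and `write_with_thousands`, the sample data, and the '#,##0' number format are assumptions made for illustration; only the '$ to buy' column name and the sheet name come from the question.

```python
import pandas as pd


def excel_column_index(df, column_name, index_written=True):
    """Return the zero-based worksheet column that to_excel() uses for column_name."""
    position = df.columns.get_loc(column_name)
    # With index=True (the to_excel default) the index levels are written first,
    # so every DataFrame column is shifted right by df.index.nlevels.
    offset = df.index.nlevels if index_written else 0
    return position + offset


def write_with_thousands(df, path, column_name, sheet_name="Final Trade List"):
    """Write df to path and apply a '#,##0' number format to one named column."""
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name=sheet_name)
        wb = writer.book
        ws = writer.sheets[sheet_name]
        num_fmt = wb.add_format({"num_format": "#,##0"})
        col = excel_column_index(df, column_name)
        # set_column(first_col, last_col, width, cell_format) with integer indices
        ws.set_column(col, col, 12, num_fmt)


# Hypothetical sample data; only the '$ to buy' name comes from the question
df = pd.DataFrame({"ticker": ["AAA", "BBB"], "$ to buy": [1250000, 98000]})
print(excel_column_index(df, "$ to buy"))  # 2
```

Calling `write_with_thousands(df, "trades.xlsx", "$ to buy")` would then format the column wherever it sits in the DataFrame; the file name here is only a placeholder.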

Related

How to change my excel file values using python

I have an Excel file like this, and I want the numbers in the date field to be converted to dates such as 2021.7.22 and written back to the date field using Python.

You can try something like this:
import pandas as pd
# sheet_name=None reads every worksheet into a dict of {sheet name: DataFrame}
dfs = pd.read_excel('Test.xlsx', sheet_name=None)
output = {}
for ws, df in dfs.items():
    if 'date' in df.columns:
        df['date'] = pd.to_datetime(df['date'].apply(lambda x: f'{str(x)[:4]}.{str(x)[4:6 if len(str(x)) > 7 else 5]}.{str(x)[-2:]}')).dt.date
    output[ws] = df
writer = pd.ExcelWriter('TestOutput.xlsx')
for ws, df in output.items():
    df.to_excel(writer, index=None, sheet_name=ws)
# close() saves the file; writer.save() was removed in pandas 2.0
writer.close()
For each worksheet containing the column date in the input xlsx file, it will convert the integer it finds to a date, assuming that the month portion may be 1 or 2 digits and that the day portion is always a full 2 digits. If the actual month/day protocol in your data is different, you can adjust the logic accordingly.
The code creates a new output xlsx reflecting the above changes.
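If the integers are always eight digits in YYYYMMDD order, an explicit format string is a stricter alternative to slicing the string by hand. This is only a sketch under that assumption: the question's actual data is not shown, and the sample values below are made up.

```python
import pandas as pd

# Hypothetical sample values; the real column layout is not shown in the question
df = pd.DataFrame({"date": [20210722, 20211105, 20220101]})

# Assumption: every value is an 8-digit YYYYMMDD integer.
# An explicit format makes pandas raise on anything that does not match.
df["date"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d").dt.date

print(df["date"].tolist())
```

Values with a one-digit month would not match this format, so the slicing approach above remains the better fit for that layout.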

excel column type Date format problems in pandas dataframe

When I read an Excel file into a pandas DataFrame, I get dates in this strange format, and I want to get rid of the zeros.
for object in json_meta_data:
    header = object['excel_header'] - 1
    excel_filename = object['excel_filename']
    sheet_name = object['excel_sheet_name']
    usecols = object['excel_usecols']
    nrows = object['excel_nrows']
    dataframes.append(pd.read_excel(full_excel_path,sheet_name=sheet_name,engine='openpyxl',header=header,usecols=usecols,nrows=nrows,dtype=str))

You need to change the dtype. pandas has a to_datetime function that is good for this. For example:
df['time'] = pd.to_datetime(df['time']).dt.date
For your column you might use:
df['date_arrete'] = pd.to_datetime(df['date_arrete']).dt.date
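As a small illustration of what this does, the sketch below assumes the "zeros" are a trailing 00:00:00 time part, which is what reading date cells with dtype=str commonly produces. The sample strings and the `as_date` and `as_text` column names are hypothetical; only `date_arrete` comes from the answer.

```python
import pandas as pd

# Hypothetical values shaped like what dtype=str commonly yields for date cells
df = pd.DataFrame({"date_arrete": ["2021-07-22 00:00:00", "2021-12-31 00:00:00"]})

parsed = pd.to_datetime(df["date_arrete"])
df["as_date"] = parsed.dt.date                  # Python date objects, no time part
df["as_text"] = parsed.dt.strftime("%Y-%m-%d")  # plain strings, if text is preferred

print(df[["as_date", "as_text"]])
```

Which of the two forms is preferable depends on whether the column is used for further date arithmetic or only for display.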

Using xlsx writer to write custom date format

I am writing a DataFrame to Excel and using XlsxWriter to apply a custom format to my date columns, but the Excel file always contains a datetime value and ignores the custom formatting specified in my code. Here is the code:
writer = ExcelWriter(path+'test.xlsx', engine='xlsxwriter')
workbook = writer.book
df.to_excel(writer,sheet_name='sheet1', index=False, startrow = 1, header=False)
worksheet1 = writer.sheets['sheet1']
fmt = workbook.add_format({'num_format':'d-mmm-yy'})
worksheet1.set_column('C:C', None, fmt)
# Adjusting column width
worksheet1.set_column(0, 20, 12)
# Adding back the header row
column_list = df.columns
for idx, val in enumerate(column_list):
    worksheet1.write(0, idx, val)
writer.save()
Here I want the 'd-mmm-yy' format for column C, but the exported Excel file contains datetime values. I also don't want to use strftime to convert my columns to strings, because I want to keep date filtering easy in Excel.

This doesn't work as expected because pandas uses a default datetime format for datetime objects and applies it at the cell level. In XlsxWriter, and in Excel, a cell format overrides a column format, so your column format has no effect.
The easiest way to handle this is to specify the pandas date (or datetime) format as a parameter in pd.ExcelWriter(). The date_format parameter applies to date values; pd.ExcelWriter() also accepts datetime_format for values that carry a time component, such as datetime64 columns:
import pandas as pd
from datetime import date
df = pd.DataFrame({'Dates': [date(2020, 2, 1),
                             date(2020, 2, 2),
                             date(2020, 2, 3),
                             date(2020, 2, 4),
                             date(2020, 2, 5)]})
writer = pd.ExcelWriter('pandas_datetime.xlsx',
                        engine='xlsxwriter',
                        date_format='d-mmm-yy')
df.to_excel(writer, sheet_name='Sheet1')
writer.close()  # close() saves the file; writer.save() was removed in pandas 2.0
See also this Pandas Datetime example from the XlsxWriter docs.

Turn Xlsxwriter sheet into Pandas Dataframe

I have a DataFrame read from an Excel sheet, to which I've added a few new columns using XlsxWriter. Now I need to filter this new set of data using the new column I created in XlsxWriter (which is a date column, by the way). Is there a way to turn this new worksheet into a DataFrame again so I can filter on the new column? I'll try to provide any useful code:
export = "files/extract.xlsx"
future_days = 12
writer = pd.ExcelWriter('files/new_report-%s.xlsx' % (date.today()), engine ='xlsxwriter')
workbook = writer.book
df = pd.read_excel(export)
df.to_excel(writer, 'Full Log', index=False)
log_sheet = writer.sheets['Full Log']
new_headers = ('todays date', 'Milestone Date')
log_sheet.write_row('CW1', new_headers)
# This for loop just writes in the formula for my new columns on every line
for row_num in range(2, len(df.index)+2):
    log_sheet.write_formula('CX' + str(row_num),'=IF(AND($BS{0}>1/1/1990,$BT{0}<>"Yes"),IF($BS{0}<=$CW{0},$BS{0},"Date In Future"),IF(AND($BW{0}>1/1/1990,$BX{0}<>"Yes"),IF($BW{0}<=CW{0},$BW{0},"Date In Future"),IF(AND($CA{0}>1/1/1990,$CCW{0}<>"Yes"),IF($CA{0}<=CW{0},$CA{0},"Date In Future"),IF(AND($CE{0}>1/1/1990,$CF{0}<>"Yes"),IF($CE{0}<CW{0},$CE{0},"Date In Future"),IF(AND($CI{0}>1/1/1990,$CJ{0}<>"Yes"),IF($CI{0}<CW{0},$CI{0},"Date In Future"),IF(AND($CM{0}>1/1/1990,$CN{0}<>"Yes"),IF($CM{0}<CW{0},$CM{0},"Date In Future"),"No Date"))))))'.format(row_num))
    log_sheet.write_formula('CW' + str(row_num), '=TODAY()+' + str(future_days))
    log_sheet.write_formula('CY' + str(row_num), '=IF(AND(AI{0}>DATEVALUE("1/1/1900"), AH{0}>DATEVALUE("1/1/1900"),A{0}<>"Test",A{0}<>"Dummy Test"),NETWORKDAYS(AH{0},AI{0}-1),"Test")'.format(row_num))
Now that's all done, I need to filter this "Full Log" sheet so it only keeps rows where the value in the new Milestone Date column has passed today's date. I've used XlsxWriter's autofilter for this, but I don't like it because it doesn't actually apply the filter; it just sets it.

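You can close the writer, which saves the file, and then load the sheet into a new DataFrame. Note that XlsxWriter does not calculate formula results, so formula cells may read back as 0 or empty unless the file has first been opened and re-saved in Excel:
writer.close()  # on pandas versions before 2.0, writer.save() also works
df2 = pd.read_excel('files/new_report-%s.xlsx' % (date.today()), sheet_name='Full Log')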
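Because the filter depends on values that only exist as Excel formulas, another option is to compute the cutoff and apply the filter in pandas before writing anything. The sketch below is a simplified illustration, not a translation of the question's nested IF formula: the `task` and `milestone` columns and their values are hypothetical, and only `future_days` is taken from the question.

```python
import pandas as pd

future_days = 12  # same name and value as in the question

# Hypothetical stand-in for the exported log; the real sheet has many more columns
df = pd.DataFrame({
    "task": ["a", "b", "c"],
    "milestone": pd.to_datetime(["2000-01-15", "2200-01-01", None]),
})

# Equivalent of the =TODAY()+future_days helper column
cutoff = pd.Timestamp.today().normalize() + pd.Timedelta(days=future_days)

# Keep rows whose milestone is on or before the cutoff; missing dates (NaT) drop out
due = df[df["milestone"] <= cutoff]
print(due)
```

The filtered frame can then be written with to_excel as usual, so the sheet contains only the rows of interest rather than an autofilter setting.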

how to set a column to DATE format in xlsxwriter

I am working on a project where I write to an xlsx spreadsheet and need to format one column as 'Date'. The program runs, but the column format is still set to 'General'.
I am reposting this with different code to see if anyone can answer:
for row in cur.execute('''SELECT `Mapline`,`Plant`,`Date`,`Action` from AEReport'''):
    lengthOfHeadings = len(row)
    output = '%s-%s.xlsx' % ("AEReport",now.strftime("%m%d%Y-%H%M"))
    workbook = xlsxwriter.Workbook(output, {'strings_to_numbers':True})
    worksheet = workbook.add_worksheet()
    format=workbook.add_format({'font_size':'8','border':True})
    format2=workbook.add_format({'font_size':'8','border':True,'num_format':'mm/dd/yy hh:mm'})
    count = 0
    for name in range(0,lengthOfHeadings):
        if name==row[2]:
            name=int(name)
            worksheet.write(counter, count, row[name],format2)
        else:
            worksheet.write(counter, count, row[name],format)
        count += 1
    counter += 1

To get the date format working, you have to convert the date values to Excel serial date values.
Here is an example showing how it works:
import pandas as pd
# freq='M' gives month-end dates; pandas 2.2+ prefers the alias 'ME'
data = pd.DataFrame({'test_date': pd.date_range('1/1/2011', periods=12, freq='M')})
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
# Days since 1899-12-30 give the Excel serial number (pd.datetime no longer exists)
data['test_date'] = (data['test_date'] - pd.Timestamp(1899, 12, 30)) / pd.Timedelta(days=1)
data.to_excel(writer, sheet_name='test', index=False)
workbook = writer.book
worksheet = writer.sheets['test']
formatdict = {'num_format': 'mm/dd/yyyy'}
fmt = workbook.add_format(formatdict)
worksheet.set_column('A:A', None, fmt)
writer.close()  # close() saves the file; writer.save() was removed in pandas 2.0

from datetime import datetime
date_format = workbook.add_format({'num_format':'yyyy-mm-dd hh:mm:ss'})
worksheet.write(0, 0, datetime.today(),date_format)

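date = workbook.add_format({'num_format': 'dd-mm-yyyy'})
# The value must be a date or datetime object; a bare 02-12-199 is not valid Python
worksheet.write(1, 1, my_date, date)  # my_date: placeholder for a datetime.date or datetime.datetime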
