I am trying to convert a date column from Excel into a format that can be read into a MySQL database. I am using Python to process the date column, export it back to a CSV file, and then load it into a MySQL table.
Here is what the date column looks like:
Date
6/10/13
6/17/13
6/24/13
I want it to be in the format 2013-06-10, i.e. "%Y-%m-%d".
Here is my code:
import datetime
import csv

def read(filename):
    new_date=[]
    cr = csv.reader(open(filename,"rU").readlines()[1:], dialect='excel')
    for row in cr:
        # print row[0]
        cols=datetime.datetime.strptime(row[0] , "%m/%d/%y" )
        newcols=cols.strftime("%Y-%m-%d")
        # print newcols
        new_date.append(newcols)
    print new_date[0]
    with open('new_file.csv', 'wb') as f:
        writer = csv.writer(f)
        for date in new_date:
            writer.writerow([date])
The code runs, but when I open new_file.csv, the date column automatically reverts to the old format in Excel.
How can I change this?
Thanks,
Have you tried opening the .csv in a plain text editor (Notepad, for example) to see whether the dates were written to the file correctly? My guess is that Excel, which recognizes dates, is simply reformatting them to its default date format when you open the file. If you check the .csv in a text editor, you'll see that your code has correctly reformatted the dates.
(By default, Excel formats dates according to the settings in Control Panel, so changing the default date format there changes Excel's default formatting.)
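The same check can be done from Python by reading the file back as raw text. The following is a minimal Python 3 sketch rather than the asker's exact script: the helper name convert_dates, the in-memory sample rows, and the temporary file location are assumptions made for illustration.

```python
import csv
import datetime
import os
import tempfile


def convert_dates(rows):
    """Convert m/d/yy strings in the first column to YYYY-MM-DD strings."""
    converted = []
    for row in rows:
        parsed = datetime.datetime.strptime(row[0], "%m/%d/%y")
        converted.append(parsed.strftime("%Y-%m-%d"))
    return converted


# Sample values from the question; the header row is assumed to be skipped already.
sample_rows = [["6/10/13"], ["6/17/13"], ["6/24/13"]]
new_dates = convert_dates(sample_rows)

# Write to a temporary CSV (hypothetical location), one date per row.
path = os.path.join(tempfile.mkdtemp(), "new_file.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    for date in new_dates:
        writer.writerow([date])

# Read the raw text back; newline="" keeps the line endings untouched.
with open(path, newline="") as f:
    raw_text = f.read()

# repr() shows exactly what is stored, independent of any spreadsheet formatting.
print(repr(raw_text))
```

If the printed text contains 2013-06-10 and so on, the file is correct and whatever Excel shows is only its display formatting.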
You can change the way that Excel formats dates as described here in the documentation.
Select the cells you want to format.
Press CTRL+1.
In the Format Cells box, click the Number tab.
In the Category list, click Date.
You can create a custom date format from this menu if the format you see is not there. Details here.
Code:
import openpyxl
from openpyxl.utils.dataframe import dataframe_to_rows

def write_pandas_dataframe_to_excel(df):
    book = openpyxl.load_workbook('~/Documents/test.xlsm', read_only=False, keep_vba=True)
    sheet = book['Database']
    # Delete all existing rows (header included) so that we can replace them with the contents of our pandas dataframe
    sheet.delete_rows(1, sheet.max_row)
    # Write values from the pandas dataframe to the sheet (include_index is assumed to be defined elsewhere)
    for r in dataframe_to_rows(df, index=include_index, header=True):
        sheet.append(r)
    for row in sheet[2:sheet.max_row]:  # skip the header
        cell = row[0]  # column A is a Date field
        cell.number_format = 'YYYY-mm-dd'
    book.save(excel_file_path)  # excel_file_path is assumed to be defined elsewhere
    book.close()
Expected Result: I open up test.xlsm, and in column A, all dates should already be in the format YYYY-mm-dd
Actual Result: The Python code runs without any issues, but the YYYY-mm-dd format only takes effect after I open the Excel file, select each cell manually and hit 'Return' in the formula bar.
Is there a way for my specified date format to be applied through the python code rather than having to manually apply it by opening up excel and selecting each cell, going to the formula bar and hitting 'Return' every time?
Thanks in advance!
I've figured out the answer. Put simply, the date was being written to Excel as a string, and that was causing the issue.
In the pandas dataframe that holds my data, I had used strptime to format the date, which converted the column to a generic 'object' type. I removed the strptime so that the column keeps its datetime values; that way each date is written to Excel as a pandas Timestamp object rather than a string.
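The difference between the two cases can be reproduced with openpyxl alone. This is a small sketch under assumptions, not the poster's code: the helper names build_workbook and roundtrip and the sample values are made up for illustration, and the workbook is saved to an in-memory buffer instead of test.xlsm. It requires the third-party openpyxl package.

```python
import datetime
from io import BytesIO

import openpyxl


def build_workbook():
    """Write one real datetime and one date-like string, both with a date format."""
    wb = openpyxl.Workbook()
    ws = wb.active
    ws.append(["Date", "Kind"])
    ws.append([datetime.datetime(2013, 6, 10), "real datetime"])
    ws.append(["2013-06-10", "plain string"])
    for row in ws.iter_rows(min_row=2, max_col=1):
        row[0].number_format = "yyyy-mm-dd"
    return wb


def roundtrip(wb):
    """Save to an in-memory buffer and load it again, as Excel would read it."""
    buf = BytesIO()
    wb.save(buf)
    buf.seek(0)
    return openpyxl.load_workbook(buf).active


sheet = roundtrip(build_workbook())
for cell in (sheet["A2"], sheet["A3"]):
    # is_date is True only when the stored value is a date, not text.
    print(cell.coordinate, type(cell.value).__name__, cell.is_date)
```

Only the cell that was written as a datetime comes back as a date. The string cell keeps the number format as an attribute, but the format has nothing to act on until the text is re-entered, which matches the behaviour described in the question.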
I am asking a follow-up question to this one (File downloaded is different from what is on server).
I have date values in a CSV file that are getting reformatted.
My CSV has data like this: 1-Jan-15, 1-Feb-15, 1-Mar-15.
But the reformatted CSV looks like Jan-15, Feb-15, Mar-15, ...
Is there any way to stop the automatic reformatting of the data?
Instead of opening the .csv file directly in Excel, open a new blank workbook in Excel and use Get Data from Text (under the Data tab of the Ribbon) to import the .csv file.
This will open the Text Import Wizard, which has 3 total screens.
On Step 1, choose Delimited.
On Step 2, choose Comma.
And on Step 3, highlight all columns with dates in them and choose Text.
Click Finish.
The General format (which is also what is applied by default if you open a .csv file in Excel directly) will recognize the dates as dates and reformat them according to your locale settings. If you instruct Excel to interpret those columns as text, they will not be recognized as dates and will therefore be left as they are.
I am reading a table and writing it to CSV. I have some columns where I want to change the date format. Below is some of my code:
for row in out[1]:
    dt = datetime.datetime.strptime(row[18], '%Y-%m-%d %H:%M:%S.%f')
    row[18] = dt.strftime('%Y-%m-%d')
    print(row[18])
It does seem to change the date format to what I'm looking for, because the print statement shows as much, but when I open the Excel sheet that it wrote, the date format is like 11/12/2019 instead of 2020-11-12.
Any advice? I'm trying to avoid having to change the date format manually after opening Excel.
Most likely, what Excel shows you is not what the underlying data is. Excel auto-formats fields when it displays them. Hence, if it opens a CSV and sees 2020-11-12, it will recognize it as a date and display it as a formatted date field. If you want to save a field as text that Excel will treat as text, the cell has to start with an apostrophe ('). Try this:
row[18] = dt.strftime('\'%Y-%m-%d')
I have a CSV file where the date is formatted as yy/mm/dd, but Excel is reading it wrongly as dd/mm/yyyy (e.g. 8th September 2015 is read as 15th of September 2008).
I know how to change the format that Excel outputs, but how can I change the format it uses to interpret the CSV data?
I'd like to keep it to Excel if possible, but I could work with a Python program.
Option 3. Import it properly
Use DATA, Get External Data, From Text and when the wizard prompts you choose the appropriate DMY combination (Step 3 of 3, Under Column data format, and Date).
Option 1. Change the format Excel reads in
Edit: a better method is suggested in the comments on the question; I was not aware you could do that.
Excel uses your Windows settings, so you can go to
Control Panel > Clock, Language, Region > (under Region and Language) Change the date, time or number format
and enter the appropriate format.
Option 2. Change the CSV date format
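import csv
from dateutil.parser import parse

date_index = 0  # assumption: index of the column holding the date; adjust for your file

# Python 2 file modes; on Python 3 use open(..., "w", newline="") and open(..., newline="")
with open("output.csv","wb") as fout:
    csv_out = csv.writer(fout)
    for row in csv.reader(open("input.csv","rb")):
        row[date_index] = parse(row[date_index]).strftime("%x")
        csv_out.writerow(row)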
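Because a value such as 15/09/08 is ambiguous, it may be safer to state the input format explicitly instead of letting a parser guess it. The following Python 3 sketch uses only the standard library; the names reformat_date and convert_csv, the sample rows, and the choice of YYYY-MM-DD as the output format are assumptions for illustration, and the sample data has no header row.

```python
import csv
import datetime
import io


def reformat_date(value, in_format="%y/%m/%d", out_format="%Y-%m-%d"):
    """Parse a yy/mm/dd string explicitly and return it as YYYY-MM-DD."""
    return datetime.datetime.strptime(value, in_format).strftime(out_format)


def convert_csv(source, target, date_index=0):
    """Copy CSV rows from source to target, rewriting the date column."""
    writer = csv.writer(target)
    for row in csv.reader(source):
        row[date_index] = reformat_date(row[date_index])
        writer.writerow(row)


# In-memory stand-ins for input.csv and output.csv.
source = io.StringIO("15/09/08,widget\n15/10/01,gadget\n")
target = io.StringIO()
convert_csv(source, target)
print(target.getvalue())
```

With real files, pass objects opened with open("input.csv", newline="") and open("output.csv", "w", newline="") instead of the StringIO buffers.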
I'm trying to write some dates from one excel spreadsheet to another. Currently, I'm getting a representation in excel that isn't quite what I want such as this: "40299.2501157407"
I can get the date to print out fine to the console, however it doesn't seem to work right writing to the excel spreadsheet -- the data must be a date type in excel, I can't have a text version of it.
Here's the line that reads the date in:
date_ccr = xldate_as_tuple(sheet_ccr.cell(row_ccr_index, 9).value, book_ccr.datemode)
Here's the line that writes the date out:
row.set_cell_date(11, datetime(*date_ccr))
There isn't anything being done to date_ccr in between those two lines other than a few comparisons.
Any ideas?
You can write the floating point number directly to the spreadsheet and set the number format of the cell. Set the format using the num_format_str of an XFStyle object when you write the value.
https://secure.simplistix.co.uk/svn/xlwt/trunk/xlwt/doc/xlwt.html#xlwt.Worksheet.write-method
The following example writes the date 01-05-2010. (Also includes time of 06:00:10, but this is hidden by the format chosen in this example.)
import xlwt
# d can also be a datetime object
d = 40299.2501157407
wb = xlwt.Workbook()
sheet = wb.add_sheet('new')
style = xlwt.XFStyle()
style.num_format_str = 'DD-MM-YYYY'
sheet.write(5, 5, d, style)
wb.save('test_new.xls')
There are examples of number formats (num_formats.py) in the examples folder of the xlwt source code. On my Windows machine: C:\Python26\Lib\site-packages\xlwt\examples
You can read about how Excel stores dates (third section on this page): https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html
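To see why 40299.2501157407 corresponds to that date, the serial number can be converted by hand. This is a minimal sketch that assumes the workbook uses the 1900 date system (xlrd datemode 0) and a serial of 61 or greater, since earlier values are affected by Excel's treatment of 1900 as a leap year; the function name excel_serial_to_datetime is made up for illustration. In real code, xlrd's own conversion helpers handle these details.

```python
import datetime

# Day 0 of the 1900 date system, chosen so that serials from 1 March 1900 onward line up.
EXCEL_1900_EPOCH = datetime.datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial number (days plus fraction of a day) to a datetime."""
    return EXCEL_1900_EPOCH + datetime.timedelta(days=serial)


d = excel_serial_to_datetime(40299.2501157407)
print(d.date())  # 2010-05-01
```

The integer part counts days and the fractional part is the time of day, which is why the value above also carries a time of about 06:00:10.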