Change date format when importing from CSV - python

I have a CSV file where the date is formatted as yy/mm/dd, but Excel is reading it wrongly as dd/mm/yyyy (e.g. 8th September 2015 is read as 15th of September 2008).
I know how to change the format that Excel outputs, but how can I change the format it uses to interpret the CSV data?
I'd like to keep it to Excel if possible, but I could work with a Python program.

Option 3. Import it properly
Use DATA > Get External Data > From Text. When the wizard reaches Step 3 of 3, select Date under Column data format and choose the day/month/year order that matches the file (YMD for yy/mm/dd data).

Option 1. Change the format Excel reads in
Edit: a better method for this is suggested in the comments on the question; I was not aware you could do that.
Excel uses your Windows regional settings, so you can go to
Control Panel > Clock, Language, Region > (under Region and Language) Change the date, time, or number format
and enter the appropriate format.
Option 2. Change the CSV date format
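import csv
from dateutil.parser import parse

date_index = 0  # position of the date column; adjust for your file
with open("input.csv", newline="") as fin, open("output.csv", "w", newline="") as fout:
    csv_out = csv.writer(fout)
    for row in csv.reader(fin):
        # yearfirst=True reads 15/09/08 as 8 September 2015
        row[date_index] = parse(row[date_index], yearfirst=True).strftime("%x")
        csv_out.writerow(row)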
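If the input format is known in advance, the same rewrite can be sketched with the standard library only, which avoids any guessing about the field order. This is a minimal sketch, not a definitive implementation: the function name reformat_dates, the sample rows, and the choice of yyyy-mm-dd as the output format are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime


def reformat_dates(src, dst, date_index=0, in_fmt="%y/%m/%d", out_fmt="%Y-%m-%d"):
    """Copy CSV rows from src to dst, rewriting the date in one column."""
    writer = csv.writer(dst)
    for row in csv.reader(src):
        # strptime fails loudly (ValueError) if a value does not match in_fmt
        parsed = datetime.strptime(row[date_index], in_fmt)
        row[date_index] = parsed.strftime(out_fmt)
        writer.writerow(row)


# Hypothetical sample data standing in for input.csv
src = io.StringIO("15/09/08,widget\n16/01/02,gadget\n")
dst = io.StringIO()
reformat_dates(src, dst)
print(dst.getvalue())
```

With real files, the two StringIO objects would be replaced by files opened with newline="" as in the snippet above.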

Related

exporting to csv converts text to date

From Python I want to export a dataframe to CSV format.
The dataframe contains two columns like this
So when I write this:
df['NAME'] = df['NAME'].astype(str) # or .astype('string')
df.to_csv('output.csv',index=False,sep=';')
the CSV output opened in Excel shows this:
and reads the value "MAY8218" as the date "may-18", while I want it to be read as "MAY8218".
I've tried many ways but none of them works. I don't want a workaround like putting quotation marks to the left and right of the value.
Thanks.
If you want to export the dataframe to use it in excel just export it as xlsx. It works for me and maintains the value as string in the original format.
df.to_excel('output.xlsx',index=False)
The CSV format is a text format. The file contains no hint for the type of the field. The problem is that Excel has the worst possible support for CSV files: it assumes that CSV files always use its own conventions when you try to read one. In short, one Excel implementation can only read correctly what it has written...
That means that you cannot prevent Excel from interpreting the CSV data the way it wants, at least when you open a CSV file directly. Fortunately you have other options:
Import the CSV file instead of opening it. This time you have options to configure the way the file should be processed.
Use LibreOffice Calc for processing CSV files. LibreOffice is a little behind Microsoft Office on most points except for CSV file handling, where it has excellent support.
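To see that the reinterpretation happens in Excel and not in the file, it can help to look at the raw text that gets written. The following is a small sketch using only the standard csv module; the VALUE column and its contents are hypothetical, since the question does not show the second column.

```python
import csv
import io

# Hypothetical two-column data; only NAME and "MAY8218" come from the question
rows = [["NAME", "VALUE"], ["MAY8218", "1"]]

buffer = io.StringIO()
csv.writer(buffer, delimiter=";").writerows(rows)
raw_text = buffer.getvalue()
print(raw_text)  # the file content is plain text, with no type information

# Reading the text back with a CSV parser returns the original strings
read_back = list(csv.reader(io.StringIO(raw_text), delimiter=";"))
print(read_back)
```

The value survives the round trip unchanged, so any conversion to "may-18" is applied by the program that opens the file.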

Any way to stop auto reformatting of data in excel

I am asking a follow up question from here (File downloaded is different from what is on server).
I have datetime in csv file which is getting reformatted.
My CSV has data like this 1-Jan-15,1-Feb-15,1-Mar-15.
But the reformatted CSV shows Jan-15, Feb-15, Mar-15, ...
Is there any way to stop automatic reformatting of data?
Instead of opening the .csv file directly in Excel, open a new blank workbook in Excel and use Get Data from Text (under the Data tab of the Ribbon) to import the .csv file.
This will open the Text Import Wizard, which has 3 total screens.
On Step 1, choose Delimited.
On Step 2, choose Comma.
And on Step 3, highlight all columns with dates in them and choose Text.
Click Finish.
The General format (which is also what happens by default if you open a .csv file in Excel directly) will recognize the dates as being dates and reformat them according to your locale settings. By instructing Excel to interpret those columns as text, they will not be recognized as dates and therefore left as they are.

Converting dates with multiple formats in a CSV file

I have a CSV full of tweets containing a few headers. Among them, for some unknown reason, the date format changes midway from %Y-%m-%d to %d/%m/%Y as shown in the image below.
This makes it difficult when trying to export it into another program e.g. Matlab. I'm attempting to solve this in Python, but any other solution would be great.
I've attempted multiple solutions from just googling around, mainly passing a date format when reading the CSV, datetime.strptime, and others. I'm very new to Python, so I'm sorry if I'm a bit clueless.
I'm looking to standardise all the dates, e.g. changing the %d/%m/%Y ones to the other format, while keeping each row separate.
I'm thinking of following the approach held here, but adding an if statement if it recognises a certain format. How would I go about breaking the date down and changing it then?
This might work but I'm too lazy to check it against an image of a CSV file.
import pandas as pd
# Put all the formats into a list
possible_formats = ['%Y-%m-%d', '%d/%m/%Y']
# Read in the data
data = pd.read_csv("data_file.csv")
date_column = "date"
# Parse the dates in each format and stash them in a list
fixed_dates = [pd.to_datetime(data[date_column], errors='coerce', format=fmt) for fmt in possible_formats]
# Anything we could parse goes back into the CSV
data[date_column] = pd.NaT
for fixed in fixed_dates:
    data.loc[~pd.isnull(fixed), date_column] = fixed[~pd.isnull(fixed)]
data.to_csv("new_file.csv")
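The "if statement per format" idea from the question can also be sketched without pandas, by trying each candidate format on every value in turn. This is only an illustration under assumptions: the helper name normalise_date and the sample values are made up, and unparseable values are returned as None so they can be inspected later.

```python
from datetime import datetime

# The two formats named in the question
POSSIBLE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y"]


def normalise_date(text, formats=POSSIBLE_FORMATS, out_fmt="%Y-%m-%d"):
    """Return text rewritten in out_fmt, or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt).strftime(out_fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None


# Hypothetical sample values
samples = ["2015-03-01", "14/03/2015", "not a date"]
normalised = [normalise_date(s) for s in samples]
print(normalised)  # ['2015-03-01', '2015-03-14', None]
```

The order of the format list matters only when a value could match more than one format, which is not the case for these two.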

Converting datetime to be mysql format using python

I am trying to convert a date format in Excel to be formatted such that it can be read into a MySQL database. I am using python to process the date column, export it back into a csv file and then dump it into a MySQL table.
Here is what the date column looks like:
Date
6/10/13
6/17/13
6/24/13
I want it to be in the format : 2013-06-10 or ("%Y-%m-%d")
Here is my code:
import csv
import datetime

def read(filename):
    new_date = []
    with open(filename, newline='') as src:
        cr = csv.reader(src, dialect='excel')
        next(cr)  # skip the header row
        for row in cr:
            # print(row[0])
            cols = datetime.datetime.strptime(row[0], "%m/%d/%y")
            newcols = cols.strftime("%Y-%m-%d")
            # print(newcols)
            new_date.append(newcols)
    print(new_date[0])
    with open('new_file.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        for date in new_date:
            writer.writerow([date])
The code runs, but when I open new_file.csv in Excel, the date column automatically reverts to the old format.
How can I change this?
Thanks,
Have you tried opening the .csv in a simple text editor (Notepad, for example) to see whether the date is being written to that file correctly? My guess is that Excel, which recognizes dates, is just reformatting them to its default date format when you open the file, and that if you check the .csv in a text editor, you'll see that your code has correctly reformatted the dates.
(By default, Excel formats dates as you specify in Control Panel, and you can change your default date formatting in Control Panel to change Excel's default formatting.)
You can change the way that Excel formats dates as described here in the documentation.
Select the cells you want to format.
Press CTRL+1.
In the Format Cells box, click the Number tab.
In the Category list, click Date.
You can create a custom date format from this menu if the format you see is not there. Details here.

datetime issue with xlrd & xlwt python libs

I'm trying to write some dates from one excel spreadsheet to another. Currently, I'm getting a representation in excel that isn't quite what I want such as this: "40299.2501157407"
I can get the date to print out fine to the console, however it doesn't seem to work right writing to the excel spreadsheet -- the data must be a date type in excel, I can't have a text version of it.
Here's the line that reads the date in:
date_ccr = xldate_as_tuple(sheet_ccr.cell(row_ccr_index, 9).value, book_ccr.datemode)
Here's the line that writes the date out:
row.set_cell_date(11, datetime(*date_ccr))
There isn't anything being done to date_ccr in between those two lines other than a few comparisons.
Any ideas?
You can write the floating point number directly to the spreadsheet and set the number format of the cell. Set the format using the num_format_str of an XFStyle object when you write the value.
https://secure.simplistix.co.uk/svn/xlwt/trunk/xlwt/doc/xlwt.html#xlwt.Worksheet.write-method
The following example writes the date 01-05-2010. (Also includes time of 06:00:10, but this is hidden by the format chosen in this example.)
import xlwt
# d can also be a datetime object
d = 40299.2501157407
wb = xlwt.Workbook()
sheet = wb.add_sheet('new')
style = xlwt.XFStyle()
style.num_format_str = 'DD-MM-YYYY'
sheet.write(5, 5, d, style)
wb.save('test_new.xls')
There are examples of number formats (num_formats.py) in the examples folder of the xlwt source code. On my Windows machine: C:\Python26\Lib\site-packages\xlwt\examples
You can read about how Excel stores dates (third section on this page): https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html
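As a rough illustration of that storage scheme, a serial number like the one in the question can be converted by hand. This sketch assumes the 1900 date system (xlrd's datemode 0) and serials that fall after February 1900; the function name excel_serial_to_datetime is made up here, and with xlrd itself xldate_as_tuple already does this job.

```python
from datetime import datetime, timedelta

# Day zero that makes 1900-system serials line up for dates after Feb 1900
EXCEL_EPOCH_1900 = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel 1900-system serial number to a datetime."""
    whole_days = int(serial)
    # The fractional part is the time of day; round to the nearest second
    seconds = round((serial - whole_days) * 86400)
    return EXCEL_EPOCH_1900 + timedelta(days=whole_days, seconds=seconds)


print(excel_serial_to_datetime(40299.2501157407))  # 2010-05-01 06:00:10
```

The integer part counts days and the fraction is the time of day, which is why the cell shows a bare number until a date number format is applied.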
