Bad datetime conversion in pandas when a CSV file is opened

I have a simple CSV with a Date column and an Activity column, like this:
When I open it with pandas and convert the Date column with pd.to_datetime, the dates change. It happens where the month changes, like this:
It seems that pandas swaps the day and the month, or something like that:
The date format I want is dd-mm-yyyy or yyyy-mm-dd.
This is the code I am using:
import pandas as pd
dataset = pd.read_csv(directory + "Time 2020 (Activities).csv", sep = ";")
dataset[["Date"]] = dataset[["Date"]].apply(pd.to_datetime)
How can I fix that?

You could specify the date format in the pd.to_datetime parameters:
dataset['Date'] = pd.to_datetime(dataset['Date'], format='%Y-%m-%d')
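The format string has to match how the dates are written in the file. As a minimal sketch, assuming the file stores dates as dd-mm-yyyy (the sample data is not shown above, so the values below are made up for illustration), the conversion might look like this:

```python
import pandas as pd

# Hypothetical sample values in dd-mm-yyyy order (the real file is not shown).
dataset = pd.DataFrame({"Date": ["31-01-2020", "01-02-2020", "02-02-2020"]})

# An explicit format stops pandas from guessing the day/month order.
dataset["Date"] = pd.to_datetime(dataset["Date"], format="%d-%m-%Y")

# Show the parsed dates in yyyy-mm-dd form.
print(dataset["Date"].dt.strftime("%Y-%m-%d").tolist())
```

With an explicit format, 01-02-2020 is read as 1 February rather than 2 January.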

Related

Date format changes after the 240th Row in Date Column of CSV: goes from 1880-01-15 (%Y-%m-%d) to 15-01-00 (%d-%m-%y). How do I make them the same?

I need to change the dates in new_file["Date"] from the 15,01,00 (%d,%m,%y) format to the 1880-01-15 (%Y-%m-%d) format so that the column is completely uniform. The format changes at the 240th index of the column.
Here is my attempt so far. Can someone help me?
from datetime import date, datetime
import pandas as pd
file = pd.read_csv("sea_levels_1880_2015.csv")
#print(file)
#converting data frame to csv
headerList = ["Date", "Global_Mean_Sea_Level", "GMSL_Uncertainty"]
file.to_csv("sea_lvl_1880_2015.csv", header=headerList, index=False)
# display modified csv file
new_file = pd.read_csv("sea_lvl_1880_2015.csv")
#changing format from 15,01,00 to 15-01-00
new_file['Date'] = new_file['Date'].str.replace(',','-')
#new_file['Date'] = pd.to_datetime( "%Y-%m-%d")
print(new_file)

Assuming that:
each date in your example represents a separate month,
for some reason the dates moved to the 15th of each month after 1900 instead of the 1st, as they were in the 1800s (or you cut off the last 5 when preparing the example),
your data doesn't go past 1999,
this should solve your problem:
new_file['Date'] = new_file['Date'].str.replace(r'(\d\d),(\d\d),(\d\d)', r'19\3-\2-\1', regex=True)
If you want to just keep it in the file and not do any date operations, you don't really need to import anything from datetime.
I hope that helps.
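If you do want real datetime values afterwards, a sketch along the following lines may help. The sample rows are made up, and it relies on the same assumption as above, namely that every dd,mm,yy value belongs to the 1900s:

```python
import pandas as pd

# Hypothetical sample: ISO-style dates followed by dd,mm,yy dates.
new_file = pd.DataFrame(
    {"Date": ["1899-11-15", "1899-12-15", "15,01,00", "15,02,00"]}
)

# Rewrite only the dd,mm,yy values; values already in ISO form do not match.
# Assumption: every two-digit year belongs to 19xx.
new_file["Date"] = new_file["Date"].str.replace(
    r"^(\d\d),(\d\d),(\d\d)$", r"19\3-\2-\1", regex=True
)

# The column is now uniform, so one format string parses every row.
new_file["Date"] = pd.to_datetime(new_file["Date"], format="%Y-%m-%d")
print(new_file)
```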

Changing date format convert mmm-yy to yyyy/mm/dd

I have a .CSV file with a column "Date". It has the full date in it e.g. 1/9/2020 but is formatted to Sep-20. (All dates are the first of every month)
The issue is that Python reads the formatted value, Sep-20, from the .CSV file. How do I change all the values to a yyyy/mm/dd (2020/09/01) format?
This is what I have tried so far, to no avail:
import pandas as pd
tw_df = pd.read_csv("tw_data.csv", index_col = "Date", parse_dates = True, format = "%Y%m%d")
Error Message
TypeError: parser_f() got an unexpected keyword argument 'format'

You can use datetime.strptime to parse a string in a given format into a datetime object that pandas can work with.
See the code below:
import pandas as pd
from datetime import datetime
df = pd.read_csv('tw_data.csv')
conv = lambda x: datetime.strptime(x, "%b-%y")
df["Date"] = df["Date"].apply(conv)
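A pandas-only variant is sketched below. It assumes the month abbreviations are English (%b depends on the locale), and it uses made-up sample values in place of tw_data.csv:

```python
import pandas as pd

# Hypothetical sample standing in for the Date column of tw_data.csv.
tw_df = pd.DataFrame({"Date": ["Sep-20", "Oct-20", "Jan-21"]})

# Parse "Sep-20" style strings; the day defaults to the 1st of the month.
parsed = pd.to_datetime(tw_df["Date"], format="%b-%y")

# Render as yyyy/mm/dd strings, as asked for in the question.
tw_df["Date"] = parsed.dt.strftime("%Y/%m/%d")
print(tw_df)
```

Keep the parsed datetime column instead of the strings if you need to sort or do date arithmetic later.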

defining month first dateformat in pandas?

How can I define a month-first date format in pandas?
For a day-first format I use the dayfirst argument:
dateCols = ['Document Date']
data = pd.read_excel(os.path.join(delivery_path, f), parse_dates=dateCols,
dayfirst=True, sheet_name='Refined', skiprows=1)
There is no monthfirst argument. How should I specify that when reading the file? Also, what is the default date format pandas uses when reading date columns?
e.g. October 1st = 10/01/2019

It is not clear whether the values in your date column look like October 1st=10/01/2019 or like 10/01/2019. If the column holds October 1st=10/01/2019:
import pandas as pd
def clean(value):
    # Keep only the part after the '=' sign, e.g. '10/01/2019'
    return str(value).split('=')[1]
# pd.to_datetime expects a single column (Series), so convert each listed column in turn
for col in dateCols:
    data[col] = pd.to_datetime(data[col].apply(clean), format='%m/%d/%Y')
If it holds 10/01/2019:
for col in dateCols:
    data[col] = pd.to_datetime(data[col], format='%m/%d/%Y')
You can learn more about the format codes at http://strftime.org/
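On the question of the default: for ambiguous strings such as 10/01/2019, pandas reads the month first unless dayfirst=True is passed. A small sketch with a single value shows the difference:

```python
import pandas as pd

# Default behaviour: ambiguous dates are read month first.
month_first = pd.to_datetime("10/01/2019")

# dayfirst=True flips the interpretation to day first.
day_first = pd.to_datetime("10/01/2019", dayfirst=True)

# An explicit format removes the ambiguity altogether.
explicit = pd.to_datetime("10/01/2019", format="%m/%d/%Y")

print(month_first.date(), day_first.date(), explicit.date())
```

So leaving dayfirst at its default already gives month-first parsing; an explicit format is the stricter option.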

python pandas alter column from timestamp iso format to regular

I have a data frame in which some of the columns hold dates in this format (ISO format):
YYYY-MM-DDThh:mm:ssTZD
I want to convert it to
YYYY-MM-DD HH:MM[:SS[.SSSSSS]]
For example when I do:
print (df["create_date"])
I get:
2014-11-24 20:21:49-05:00
How can I change the format of the dates in the column?

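If the column already has a datetime dtype, use the .dt accessor (a Series has no strftime method of its own):
import pandas as pd
df["new_date"] = df["create_date"].dt.strftime("%Y-%m-%d %H:%M:%S")
If the column holds strings, convert it first:
df["new_date"] = pd.to_datetime(df["create_date"]).dt.strftime("%Y-%m-%d %H:%M:%S")
The square brackets in YYYY-MM-DD HH:MM[:SS[.SSSSSS]] only mark optional parts, so they do not belong in the format string; append .%f if you need microseconds.
Then write the result to CSV/Excel:
df.to_csv("\\path\\file.csv")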
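As a self-contained sketch of the string case, with made-up sample values and assuming every row carries the same UTC offset (mixed offsets would need extra handling such as utc=True):

```python
import pandas as pd

# Hypothetical sample values in ISO 8601 form, all with the same offset.
df = pd.DataFrame(
    {"create_date": ["2014-11-24T20:21:49-05:00", "2014-11-25T08:00:00-05:00"]}
)

# Parse the strings; the result is a timezone-aware datetime column.
parsed = pd.to_datetime(df["create_date"])

# Format the local wall-clock time without the "T" or the offset.
df["new_date"] = parsed.dt.strftime("%Y-%m-%d %H:%M:%S")
print(df["new_date"].tolist())
```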

pandas reading dates from csv in yy-mm-dd format

I have a CSV file with dates displayed in the format dd-mmm-yy, and I want to read them in the format yyyy-mm-dd. The parse_dates option works, but it does not convert dates before 2000 correctly.
Example: the actual date is 01-Aug-1968. It is displayed as 01-Aug-68. Pandas date parsing (with correction=true) reads the date as 01-Aug-2068.
Is there any option in pandas to read dates before 2000 correctly?

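Let's assume you have a CSV like this:
mydates
18-Aug-68
13-Jul-45
12-Sep-00
20-Jun-10
15-Jul-60
Start with the imports:
import datetime
import pandas as pd
from dateutil.relativedelta import relativedelta
Define your date format (pd.datetime has been removed from pandas, so use the standard library):
d = lambda x: datetime.datetime.strptime(x, '%d-%b-%y')
Put a constraint on the dates: anything parsed into the future is moved back 100 years
dateparse = lambda x: d(x) if d(x) < datetime.datetime.now() else d(x) - relativedelta(years=100)
Read your CSV (date_parser is deprecated as of pandas 2.0, so newer versions may warn here):
df = pd.read_csv("myfile.csv", parse_dates=['mydates'], date_parser=dateparse)
Here is your result:
print(df)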
mydates
0 1968-08-18
1 1945-07-13
2 2000-09-12
3 2010-06-20
4 1960-07-15
Voilà
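The same idea can be sketched without date_parser by converting the column after reading it. The helper fix_century and its cutoff parameter are hypothetical names introduced for this illustration, and the CSV is inlined so that the example is self-contained:

```python
import io
import pandas as pd

def fix_century(dates, cutoff):
    """Move any date later than `cutoff` back by 100 years (hypothetical helper)."""
    return dates.where(dates <= cutoff, dates - pd.DateOffset(years=100))

# Inline stand-in for myfile.csv, using the sample rows from above.
csv_text = "mydates\n18-Aug-68\n13-Jul-45\n12-Sep-00\n20-Jun-10\n15-Jul-60\n"
df = pd.read_csv(io.StringIO(csv_text))

# %y maps 00-68 to 2000-2068, so 68 is first read as 2068.
df["mydates"] = pd.to_datetime(df["mydates"], format="%d-%b-%y")

# Assumption: no genuine date lies after the cutoff, so later ones are 19xx.
df["mydates"] = fix_century(df["mydates"], cutoff=pd.Timestamp("2025-01-01"))
print(df)
```

A fixed cutoff keeps the result reproducible; pass pd.Timestamp.now() instead to mirror the "not in the future" rule used above.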
