Changing date format: convert mmm-yy to yyyy/mm/dd - python

I have a .CSV file with a "Date" column. Each cell holds a full date, e.g. 1/9/2020, but is formatted to display as Sep-20. (All dates are the first of the month.)
The issue is that Python reads the displayed value, Sep-20, rather than the underlying date. How do I change all the values to a yyyy/mm/dd (2020/09/01) format?
Here is what I have tried so far, to no avail:
import pandas as pd
tw_df = pd.read_csv("tw_data.csv", index_col = "Date", parse_dates = True, format = "%Y%m%d")
Error Message
TypeError: parser_f() got an unexpected keyword argument 'format'

You can use datetime to convert the values to dates inside pandas: strptime parses a string in a given format into a datetime object that pandas can work with. The "%b-%y" format matches an abbreviated month name and a two-digit year, such as Sep-20.
Check the code below:
import pandas as pd
from datetime import datetime
df = pd.read_csv('tw_data.csv')
conv = lambda x: datetime.strptime(x, "%b-%y")
df["Date"] = df["Date"].apply(conv)
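The snippet above leaves the column as datetime objects. If the yyyy/mm/dd text from the question is what you need, a minimal sketch could look like the following. It uses pd.to_datetime with the same "%b-%y" format, and the inline sample data is only a stand-in for tw_data.csv (an assumption, not the asker's real file):

```python
import pandas as pd

# Hypothetical sample standing in for the "Date" column of tw_data.csv.
df = pd.DataFrame({"Date": ["Sep-20", "Oct-20", "Jan-21"]})

# Parse the abbreviated month and two-digit year; the day defaults to 1.
parsed = pd.to_datetime(df["Date"], format="%b-%y")

# Render as yyyy/mm/dd strings (the dtype becomes object, not datetime).
df["Date"] = parsed.dt.strftime("%Y/%m/%d")
print(df["Date"].tolist())
```

Note that "%b" depends on the locale, so this assumes English month abbreviations.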

Related

pd.to_datetime() not working on a column with different date formats

I wrote a program where a dataframe is traversed & when any column with the name 'Date' is encountered, all the rows under that column are supposed to be converted to a 'datetime' object using 'pd.to_datetime' in the format mentioned. But this method is not working for me.
The 'Date' column in my dataset consists of dates in various formats & with different separators. Example: 26/04/2007, 01-15-1998, 2020-12-2. When I do debugging I get an error message for those dates that are not in the format specified.
Isn't the whole point of the method to convert dates in any format to datetime objects in the format specified?
My code:
from dateutil.parser import parse
import re
from datetime import datetime
import calendar
import pandas as pd
def date_fun(filepath):
    date_list=['Date', 'date', 'Dates', 'dates']
    for i in filepath.columns:
        for j in date_list:
            if i==j:
                filepath[i]=pd.to_datetime(filepath[i], format='%d-%m-%Y')
main_path = pd.read_csv('C:/Data_Cleansing/lockdown_us.csv')
fpath=main_path.copy()
date_fun(fpath)
Error: time data '26-2016-09' does not match format '%d-%m-%Y' (match)
Where is the mistake in my code?
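One point that may help here: the format argument of pd.to_datetime describes how the input strings look, not how the output should look, so a single format cannot match values written in several layouts. A rough sketch of one workaround is to try a list of candidate formats per value. The format list and the helper name parse_mixed are assumptions made for illustration, based only on the three example dates in the question:

```python
from datetime import datetime

import pandas as pd

# Assumed candidate formats, tried in order; adjust them to the real data.
CANDIDATE_FORMATS = ["%d/%m/%Y", "%m-%d-%Y", "%Y-%m-%d"]


def parse_mixed(value):
    """Return a Timestamp for the first matching format, or NaT if none match."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return pd.Timestamp(datetime.strptime(value, fmt))
        except ValueError:
            continue
    return pd.NaT


dates = pd.Series(["26/04/2007", "01-15-1998", "2020-12-2"])
parsed = dates.apply(parse_mixed)
print(parsed)
```

The order of the list matters: a value such as 01-02-2020 is genuinely ambiguous, and this sketch simply accepts the first format that fits.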

How to convert a "yyyy-MM-dd'T'HH:mm:ssZ'" format in a dataframe to a datetime format

How can I convert a "yyyy-MM-dd'T'HH:mm:ssZ'" format in a dataframe to a datetime format that I can then use as an index?
2021-01-02T05:22:58.000Z is one of the dates in the dataframe.
I've tried this line of code:
df['created_at_tweet']= pd.to_datetime(df['created_at_tweet'], format=("yyyy-MM-dd'T'HH :mm:ss.SSS'Z'"))
but I get the error
ValueError: time data '2021-01-02T01:43:32.000Z' does not match format 'yyyy-MM-dd'T'HH :mm:ss.SSS'Z'' (match)
Any ideas?
This works:
df = pd.DataFrame({'created_at_tweet': ['2021-01-02T01:43:32.000Z'], 'tweet': ['Hello Twitter!']})
df['created_at_tweet'] = pd.to_datetime(
    df['created_at_tweet'],
    format='%Y-%m-%dT%H:%M:%S.%f')
yields
df
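The format in the answer does not mention the trailing Z, and whether pandas tolerates that may depend on the installed version. A hedged variant is to add %z, which accepts a literal Z and produces timezone-aware UTC timestamps. The sketch below also sets the parsed column as the index, which is what the question is ultimately after:

```python
import pandas as pd

df = pd.DataFrame({"created_at_tweet": ["2021-01-02T05:22:58.000Z"],
                   "tweet": ["Hello Twitter!"]})

# %z accepts the trailing "Z" and yields timezone-aware (UTC) timestamps.
df["created_at_tweet"] = pd.to_datetime(
    df["created_at_tweet"], format="%Y-%m-%dT%H:%M:%S.%f%z"
)

# Use the parsed column as the index.
df = df.set_index("created_at_tweet")
print(df.index)
```

Note that Python-style directives such as %Y and %m are required here; the Java-style pattern from the question (yyyy-MM-dd...) is not understood by pandas.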

how to change date format where the source contain multiple format

How do I convert a date such as 12-Mar-2022 to the format '%d/%m/%Y' in Python?
The problem is that I read data from a Google Sheet that contains multiple date formats: some values look like 12/03/2022 and some like 12-Mar-2022.
I tried the following and got an error, of course, because the format doesn't match 12-Mar-2022:
defectData_x['date'] = pd.to_datetime(defectData_x['date'], format='%d/%m/%Y')
Appreciate your help
defectData_x['date1'] = defectData_x['date'].dt.strftime('%d/%m/%Y')
Don't forget that date1's dtype is object, not datetime, so it is better to keep both the date and date1 columns while you work toward the final result.
Once you have the final result, you can drop the date column.
Here is my example:
import pandas as pd
df = pd.DataFrame(["12/03/2022", "12-Mar-2022"], columns=["date"])
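# dayfirst=True reads 12/03/2022 as 12 March; on pandas 2.x, mixed layouts may also need format="mixed"
df["date1"] = pd.to_datetime(df["date"], dayfirst=True)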
df['date2'] = df['date1'].dt.strftime('%d/%m/%Y')
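If you would rather not rely on format inference, whose behaviour differs between pandas versions, another option is to parse each known layout explicitly and combine the results. This is only a sketch under the assumption that the sheet contains exactly the two layouts mentioned in the question:

```python
import pandas as pd

df = pd.DataFrame(["12/03/2022", "12-Mar-2022"], columns=["date"])

# Parse each known layout separately; values that do not match become NaT.
slash = pd.to_datetime(df["date"], format="%d/%m/%Y", errors="coerce")
named = pd.to_datetime(df["date"], format="%d-%b-%Y", errors="coerce")

# Keep the first successful parse for every row.
df["date1"] = slash.fillna(named)
df["date2"] = df["date1"].dt.strftime("%d/%m/%Y")
print(df)
```

Any row that matches neither layout stays NaT in date1, which makes leftover bad values easy to spot.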

Bad datetime conversion in pandas when a csv file it's opened

I have a simple CSV with a Date column and an Activity column.
When I open it with pandas and convert the Date column with pd.to_datetime, the dates change. When the month changes, it seems that pandas swaps the day and the month, or something like that.
The date format that I want is dd-mm-yyyy or yyyy-mm-dd.
This is the code that I am using:
import pandas as pd
dataset = pd.read_csv(directory + "Time 2020 (Activities).csv", sep = ";")
dataset[["Date"]] = dataset[["Date"]].apply(pd.to_datetime)
How can I fix that?
You could specify the date format in the pd.to_datetime parameters:
dataset['Date'] = pd.to_datetime(dataset['Date'], format='%Y-%m-%d')
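The format string has to describe the dates as they appear in the file. To illustrate the day/month swap described in the question, here is a small sketch that assumes the file stores dates day-first as dd-mm-yyyy (the question does not show the raw values, so the sample strings are made up):

```python
import pandas as pd

raw = pd.Series(["01-02-2020", "02-02-2020"])  # made-up day-first values

# Without a format, pandas reads these ambiguous values month-first.
guessed = pd.to_datetime(raw)

# An explicit day-first format removes the ambiguity.
explicit = pd.to_datetime(raw, format="%d-%m-%Y")

print(guessed.dt.month.tolist())
print(explicit.dt.month.tolist())
```

With the explicit format, both values land in February; the guessed parse puts the first one in January.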

Converting date between DD/MM/YYYY and YYYY-MM-DD?

Using a Python script, I need to read a CSV file where dates are formatted as DD/MM/YYYY, and convert them to YYYY-MM-DD before saving them into a SQLite database.
This almost works, but fails because I don't provide time:
from datetime import datetime
lastconnection = datetime.strptime("21/12/2008", "%Y-%m-%d")
#ValueError: time data did not match format: data=21/12/2008 fmt=%Y-%m-%d
print lastconnection
I assume there's a method in the datetime object to perform this conversion very easily, but I can't find an example of how to do it. Thank you.
Your example code is wrong. This works:
import datetime
datetime.datetime.strptime("21/12/2008", "%d/%m/%Y").strftime("%Y-%m-%d")
The call to strptime() parses the first argument according to the format specified in the second, so those two need to match. Then you can call strftime() to format the result into the desired final format.
You first need to convert the string into a datetime object, and then convert that datetime object back to a string. It goes like this:
lastconnection = datetime.strptime("21/12/2008", "%d/%m/%Y").strftime('%Y-%m-%d')
I am new to programming. I wanted to convert from yyyy-mm-dd to dd/mm/yyyy to print out a date in the format that people in my part of the world use and recognise.
The accepted answer above got me on the right track.
The answer I ended up with to my problem is:
import datetime
today_date = datetime.date.today()
print(today_date)
new_today_date = today_date.strftime("%d/%m/%Y")
print (new_today_date)
The first two lines after the import statement give today's date in ISO 8601 format (2017-01-26). The last two lines convert this to the day-first format recognised in the UK and other countries (26/01/2017).
You can shorten this code, but I left it as is because it is helpful to me as a beginner. I hope this helps other beginner programmers starting out!
Does anyone else think it's a waste to convert these strings to date/time objects for what is, in the end, a simple text transformation? If you're certain the incoming dates will be valid, you can just use:
>>> ddmmyyyy = "21/12/2008"
>>> yyyymmdd = ddmmyyyy[6:] + "-" + ddmmyyyy[3:5] + "-" + ddmmyyyy[:2]
>>> yyyymmdd
'2008-12-21'
This will almost certainly be faster than the conversion to and from a date.
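The "if you're certain the incoming dates will be valid" caveat is worth seeing in action. The sketch below wraps both approaches in two illustrative helpers (the names by_slicing and by_parsing are made up here) and feeds them an impossible date:

```python
from datetime import datetime


def by_slicing(ddmmyyyy):
    # Pure text shuffle: no check that the parts form a real date.
    return ddmmyyyy[6:] + "-" + ddmmyyyy[3:5] + "-" + ddmmyyyy[:2]


def by_parsing(ddmmyyyy):
    # strptime raises ValueError for impossible dates such as 31/02/2008.
    return datetime.strptime(ddmmyyyy, "%d/%m/%Y").strftime("%Y-%m-%d")


print(by_slicing("31/02/2008"))
try:
    by_parsing("31/02/2008")
except ValueError as exc:
    print("rejected:", exc)
```

Slicing happily emits 2008-02-31, while the strptime route rejects it, so the speed gain comes at the cost of validation.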
case_date = "03/31/2020"
# case_date holds the value above in mm/dd/yyyy format
demo = case_date.split("/")
new_case_date = demo[1] + "-" + demo[0] + "-" + demo[2]
# the new date is in dd-mm-yyyy format; test by printing it
print(new_case_date)
If you need to convert an entire column (from pandas DataFrame), first convert it (pandas Series) to the datetime format using to_datetime and then use .dt.strftime:
def conv_dates_series(df, col, old_date_format, new_date_format):
    df[col] = pd.to_datetime(df[col], format=old_date_format).dt.strftime(new_date_format)
    return df
Sample usage:
import pandas as pd
test_df = pd.DataFrame({"Dates": ["1900-01-01", "1999-12-31"]})
old_date_format='%Y-%m-%d'
new_date_format='%d/%m/%Y'
conv_dates_series(test_df, "Dates", old_date_format, new_date_format)
        Dates
0  01/01/1900
1  31/12/1999
The simplest way:
While reading the CSV file, pass the parse_dates argument:
df = pd.read_csv("sample.csv", parse_dates=['column_name'])
This parses the named column into datetime values, which pandas displays in YYYY-MM-DD form.
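For DD/MM/YYYY input like the question's, read_csv also takes a dayfirst argument so that ambiguous values are read day-first. A minimal sketch, using an in-memory string as a stand-in for sample.csv (the column names and rows are invented for the example):

```python
import io

import pandas as pd

# In-memory stand-in for sample.csv; the column names and rows are made up.
csv_text = "column_name,value\n21/12/2008,1\n01/02/2009,2\n"

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["column_name"], dayfirst=True)

# The column is now datetime64; format it explicitly when text is needed.
df["as_text"] = df["column_name"].dt.strftime("%Y-%m-%d")
print(df)
```

The parsed column holds datetime values rather than strings, so the explicit strftime step is what produces YYYY-MM-DD text for storage.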
To convert DD/MM/YYYY to YYYY-MM-DD, as in your question, you can use this:
from datetime import datetime
lastconnection = datetime.strptime("21/12/2008", "%d/%m/%Y").strftime("%Y-%m-%d")
print(lastconnection)
Here df is your DataFrame and Dateclm is the column that you want to change.
The column should first be converted to the datetime dtype:
df['Dateclm'] = pd.to_datetime(df['Dateclm'])
df.dtypes
#Here is the solution to change the format of the column
df["Dateclm"] = pd.to_datetime(df["Dateclm"]).dt.strftime('%Y-%m-%d')
print(df)
