I have a series of dates in a format like "1OCT20" or "30MAR19". How can I convert them to datetime?
Thanks in advance.
Use pd.to_datetime with the format argument set to %d%b%y:
%d Day of the month as a zero-padded decimal number.
%b Month as locale’s abbreviated name.
%y Year without century as a zero-padded decimal number.
I usually use this https://strftime.org/ website when looking for specific datetime formats.
pd.to_datetime('1OCT20',format='%d%b%y')
Timestamp('2020-10-01 00:00:00')
pd.to_datetime('30MAR19',format='%d%b%y')
Timestamp('2019-03-30 00:00:00')
On your dataset, you can apply it directly to the column:
df['trgdate'] = pd.to_datetime(df['srcdate'],format='%d%b%y')
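For a quick check without pandas, the same format string also works with the standard library's datetime.strptime. This is only a minimal sketch, assuming an English (or default C) locale so that %b recognises OCT and MAR; the variable names are illustrative.

```python
from datetime import datetime

# Same directives as above: day, abbreviated month name, two-digit year.
FMT = "%d%b%y"

samples = ["1OCT20", "30MAR19"]  # the two strings from the question

# strptime matches month names case-insensitively, so upper-case "OCT" is fine.
parsed = [datetime.strptime(s, FMT) for s in samples]

for raw, dt in zip(samples, parsed):
    print(raw, "->", dt.date().isoformat())
```

Note that %d also accepts a day without a leading zero when parsing, which is why "1OCT20" works.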
I am trying to convert a dataframe column "date" from string to datetime. I have this format: "January 1, 2001 Monday".
I tried to use the following:
from dateutil import parser
for index, v in df['date'].items():
    df['date'][index] = parser.parse(df['date'][index])
But it gives me the following error:
ValueError: Cannot set non-string value '2001-01-01 00:00:00' into a StringArray.
I checked the datatype of the column "date" and it tells me string type.
This is the snippet of the dataframe:
Any help would be most appreciated!
Why not use pandas instead of dateutil? It offers much simpler tools, such as the pd.to_datetime function:
df['date'] = pd.to_datetime(df['date'], format='%B %d, %Y %A')
You need to specify the format of the date string in order for it to be parsed correctly. The documentation helps with this:
%A is for Weekday as locale’s full name, e.g., Monday
%B is for Month as locale’s full name, e.g., January
%d is for Day of the month as a zero-padded decimal number.
%Y is for Year with century as a decimal number, e.g., 2021.
Combining all of them we have the following function:
from datetime import datetime
def mdy_to_ymd(d):
    return datetime.strptime(d, '%B %d, %Y %A').strftime('%Y-%m-%d')
print(mdy_to_ymd('January 1, 2021 Monday'))
> 2021-01-01
One more thing: for your case, .apply() will work faster, so the code is:
df['date'] = df['date'].apply(mdy_to_ymd)
Feel free to add Hour-Minute-Second if needed.
I have some dates in my json file that look like this:
2020-12-11
2020-5-1
2020-3-21
and I want to convert them to YYYY-MM-DD format. They are already in a similar format, but I want to add leading zeros for single-digit month and day numbers.
The output should look like this:
2020-12-11
2020-05-01
2020-03-21
How can I do this?
The parser in dateutil can be used as follows (d1 is the original date string):
from dateutil import parser
d2 = parser.parse(d1).date()
This produces the following (date objects, which can be converted to strings with strftime() if required):
2020-12-11
2020-05-01
2020-03-21
There is also an option (dayfirst=True) to declare day-before-month.
The usual way to reformat a date string is to use the datetime module (see How to convert a date string to different format).
Here you want to use the format codes (quoting the descriptions from the documentation)
%Y - Year with century as a decimal number.
%m - Month as a zero-padded decimal number.
%d - Day of the month as a zero-padded decimal number.
According to the footnote (9) in the documentation of datetime, %m and %d accept month and day numbers without leading zeros when used with strptime, but will output zero-padded numbers when used with strftime.
So you can use the same format string %Y-%m-%d to do a round-trip with strptime and strftime to add the zero-padding.
from datetime import datetime
def reformat(date_str):
    fmt = '%Y-%m-%d'
    return datetime.strptime(date_str, fmt).strftime(fmt)
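Applied to the three dates from the question, the round trip can be checked like this. It is a small sketch: the list names are illustrative, and the function is repeated so the snippet runs on its own.

```python
from datetime import datetime


def reformat(date_str):
    # Parse leniently (no leading zeros required), then print zero-padded.
    fmt = "%Y-%m-%d"
    return datetime.strptime(date_str, fmt).strftime(fmt)


dates = ["2020-12-11", "2020-5-1", "2020-3-21"]  # values from the question
padded = [reformat(d) for d in dates]

for before, after in zip(dates, padded):
    print(before, "->", after)
```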
Today one of my scripts gave an error for an invalid datetime format in its input. The script expects the datetime input as '%m/%d/%Y', but it received an entirely different format. For example, the date should have been 5/2/2022 but it was May 2, 2022. To add a bit more information for clarity, the input comes from a Google Sheet and the entire date is in a single cell (rather than in separate cells for month, day and year).
Is there a way to convert this kind of worded format to the desired datetime format before the script starts any kind of processing?
If the dates contain the full month name, try this:
>>> pd.to_datetime(df["Date"], format="%B %d, %Y")
0 2022-05-02
Name: Date, dtype: datetime64[ns]
According to the Python docs:
%B: "Month as locale’s full name".
%d: "Day of the month as a zero-padded decimal number". (Although it seems to work in this case)
%Y: "Year with century as a decimal number."
Now, if you want to transform this date to the format you initially expected, just transform the series using .dt.strftime:
>>> pd.to_datetime(df["Date"], format="%B %d, %Y").dt.strftime("%m/%d/%Y")
0 05/02/2022
Name: Date, dtype: object
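If the cell may arrive in either form, one option is to try a short list of formats before any other processing. The sketch below uses only the standard library; normalize_date, KNOWN_FORMATS and TARGET_FORMAT are hypothetical names, and the two formats listed are just the ones mentioned in the question (an English locale is assumed for %B).

```python
from datetime import datetime

# Formats the input is assumed to arrive in (taken from the question).
KNOWN_FORMATS = ("%m/%d/%Y", "%B %d, %Y")
# Format the rest of the script expects.
TARGET_FORMAT = "%m/%d/%Y"


def normalize_date(text):
    """Return text re-rendered in TARGET_FORMAT, trying each known format in turn."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime(TARGET_FORMAT)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognised date format: {text!r}")


print(normalize_date("May 2, 2022"))
print(normalize_date("5/2/2022"))
```

Raising on unknown input, rather than guessing, keeps unexpected formats visible instead of silently producing a wrong date.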
I'm trying to convert a string into a datetime. However, it says that I don't follow the format, and I am confused. Could anyone help me, please?
Code:
from datetime import datetime
date = datetime.strptime('2021-11-27 00:00', '%y-%m-%d %H:%M')
Error:
ValueError: time data '2021-11-27 00:00' does not match format '%y-%m-%d %H:%M'
The %y code matches a two-digit year - for a four-digit year, you should use %Y instead.
date = datetime.strptime('2021-11-27 00:00', '%Y-%m-%d %H:%M')
As per the documentation, %y is
Year without century as a zero-padded decimal number.
and %Y is
Year with century as a decimal number.
so
from datetime import datetime
date = datetime.strptime('2021-11-27 00:00', '%Y-%m-%d %H:%M')
date
will give
datetime.datetime(2021, 11, 27, 0, 0)
I'm not too familiar with Python, but the documentation says %y is for the year without century and %Y is for the year with century:
Directive  Meaning
%y  Year without century as a zero-padded decimal number.
%Y  Year with century as a decimal number.
So it looks like the correct format should be %Y-%m-%d %H:%M
Here is a demonstration.
Remember, almost all programming languages have custom format specifiers like these, and most of them are case sensitive.
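The difference between the two directives can be checked directly. This is a minimal sketch; try_parse is a hypothetical helper that returns None instead of raising when the format does not match.

```python
from datetime import datetime


def try_parse(value, fmt):
    """Return the parsed datetime, or None if value does not match fmt."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


text = "2021-11-27 00:00"

# %y expects exactly two digits, so a four-digit year fails to match.
two_digit = try_parse(text, "%y-%m-%d %H:%M")
# %Y accepts the four-digit year.
four_digit = try_parse(text, "%Y-%m-%d %H:%M")
# %y does work when the string really has a two-digit year.
short_year = try_parse("21-11-27 00:00", "%y-%m-%d %H:%M")

print(two_digit)
print(four_digit)
print(short_year)
```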
In my project I have a string like this one:
"2018-03-07 06:46:02.737951"
I would like to get two variables: one in date format that contains the date, and the other for the time.
I tried:
from datetime import datetime
datetime_object = datetime.strptime('2018-03-07 06:46:02.737951', '%b %d %Y %I:%M%p')
but I get an error.
Then I tried:
from dateutil import parser
dt = parser.parse("2018-03-07 06:46:02.737951")
but I don't know what I can do with these results.
How can I extract the values for my vars "date_var" and "time_var"?
You need to match your string exactly. Reference: strftime-and-strptime-behavior
from datetime import datetime
dt = datetime.strptime('2018-03-07 06:46:02.737951', '%Y-%m-%d %H:%M:%S.%f')
print(dt.date())
print(dt.time())
d = dt.date() # extract date
t = dt.time() # extract time
print(type(d)) # printout the types
print(type(t))
Output:
2018-03-07
06:46:02.737951
<class 'datetime.date'>
<class 'datetime.time'>
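Because this particular string is in ISO 8601 style, datetime.fromisoformat (available since Python 3.7) is an alternative that needs no format string. It is a different technique from strptime and is shown only as a sketch; date_var and time_var are the names from the question.

```python
from datetime import datetime

raw = "2018-03-07 06:46:02.737951"

# fromisoformat accepts "YYYY-MM-DD HH:MM:SS.ffffff" without a format string.
dt = datetime.fromisoformat(raw)

date_var = dt.date()  # datetime.date object
time_var = dt.time()  # datetime.time object

print(date_var)
print(time_var)
```

For strings that are not ISO-like, strptime with an explicit format remains the more general tool.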
Your format string is something along the lines of:
%b Month as locale’s abbreviated name.
%d Day of the month as a zero-padded decimal number.
%Y Year with century as a decimal number.
%I Hour (12-hour clock) as a zero-padded decimal number.
%M Minute as a zero-padded decimal number.
%p Locale’s equivalent of either AM or PM.
with some spaces and a : in it, which does not match your string.
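To see what kind of string the format from the question would match, you can render a datetime with it. This is a small sketch, assuming an English (or default C) locale for the month abbreviation and the AM/PM marker; the variable names are illustrative.

```python
from datetime import datetime

# The format string used in the question's first attempt.
question_fmt = "%b %d %Y %I:%M%p"

# An arbitrary datetime, rendered with that format to show its shape.
dt = datetime(2018, 3, 7, 6, 46)
rendered = dt.strftime(question_fmt)
print(rendered)

# A string of that shape parses back with the same format.
round_trip = datetime.strptime(rendered, question_fmt)
print(round_trip)
```

The rendered text looks nothing like "2018-03-07 06:46:02.737951", which is why strptime rejects that input with this format.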
# Accessing the time as an object:
the_time = dt.time()
#the_time
datetime.time(23, 55)
# Accessing the time as a string:
the_time.strftime("%H:%M:%S")
'23:55:00'
The same works for the date.
Refer here