I have a column hour with values like '2019091300' and '2019091301', where the last two digits are the hour. I want to transform them into '2019-09-13 00:00:00', '2019-09-13 01:00:00', etc. The '20190913' part can be handled by datetime formatting, but I am stuck on the trailing hour part. Any solution would be helpful.
Add parameter format with %Y for YYYY, %m for MM, %d for DD and %H for HH in to_datetime:
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d%H')
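The format string can be checked on a couple of values before applying it to the whole column. This is a minimal sketch using only the standard library's datetime.strptime, which accepts the same %Y, %m, %d and %H directives; the names samples, parsed and formatted are made up for illustration.

```python
from datetime import datetime

# Sample values copied from the question; the last two digits are the hour.
samples = ["2019091300", "2019091301"]

# %Y reads the 4-digit year, %m the month, %d the day, %H the 24-hour hour.
parsed = [datetime.strptime(s, "%Y%m%d%H") for s in samples]

# Render each value in the layout the question asks for.
formatted = [p.strftime("%Y-%m-%d %H:%M:%S") for p in parsed]
print(formatted)  # ['2019-09-13 00:00:00', '2019-09-13 01:00:00']
```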
I am trying to convert a dataframe column "date" from string to datetime. I have this format: "January 1, 2001 Monday".
I tried to use the following:
from dateutil import parser
for index,v in df['date'].items():
    df['date'][index] = parser.parse(df['date'][index])
But it gives me the following error:
ValueError: Cannot set non-string value '2001-01-01 00:00:00' into a StringArray.
I checked the datatype of the column "date" and it tells me string type.
This is the snippet of the dataframe:
Any help would be most appreciated!
Why not use pandas instead of dateutil? pandas offers simpler tools, such as the pd.to_datetime function:
df['date'] = pd.to_datetime(df['date'], format='%B %d, %Y %A')
You need to specify the format so that the string is parsed correctly. The documentation lists the relevant directives:
%A is for Weekday as locale’s full name, e.g., Monday
%B is for Month as locale’s full name, e.g., January
%d is for Day of the month as a zero-padded decimal number.
%Y is for Year with century as a decimal number, e.g., 2021.
Combining all of them we have the following function:
from datetime import datetime
def mdy_to_ymd(d):
    return datetime.strptime(d, '%B %d, %Y %A').strftime('%Y-%m-%d')
print(mdy_to_ymd('January 1, 2021 Monday'))
> 2021-01-01
One more thing: in your case, .apply() is usually faster than looping over the rows. Pass the function itself rather than wrapping it in a lambda that never calls it:
df['date'] = df['date'].apply(mdy_to_ymd)
Feel free to add Hour-Minute-Second if needed.
I'm trying to convert a string into a datetime. However, it says that I don't follow the format, and I am confused. Could anyone help me please?
Code:
from datetime import datetime
date = datetime.strptime('2021-11-27 00:00', '%y-%m-%d %H:%M')
Error:
ValueError: time data '2021-11-27 00:00' does not match format '%y-%m-%d %H:%M'
The %y code matches a two-digit year - for a four-digit year, you should use %Y instead.
date = datetime.strptime('2021-11-27 00:00', '%Y-%m-%d %H:%M')
as per the documentation, %y is
Year without century as a zero-padded decimal number.
and %Y is
Year with century as a decimal number.
so
from datetime import datetime
date = datetime.strptime('2021-11-27 00:00', '%Y-%m-%d %H:%M')
date
will give
datetime.datetime(2021, 11, 27, 0, 0)
I'm not very familiar with Python, but the documentation says %y is for the year without century, while %Y is for the year with century:
Directive    Meaning
%y           Year without century as a zero-padded decimal number.
%Y           Year with century as a decimal number.
So it looks like the correct format should be %Y-%m-%d %H:%M.
Remember, almost all programming languages have custom format specifiers like these, and most of them are case-sensitive.
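The difference between %y and %Y can be seen by running both against the string from the question. This is a small sketch with the standard library only; the variable names two_digit_ok, four_digit and short are made up for illustration.

```python
from datetime import datetime

text = "2021-11-27 00:00"

# %y consumes exactly two year digits, so "2021-..." cannot match:
# after reading "20" the parser expects "-" but finds "2".
try:
    datetime.strptime(text, "%y-%m-%d %H:%M")
    two_digit_ok = True
except ValueError:
    two_digit_ok = False

# %Y consumes the full four-digit year.
four_digit = datetime.strptime(text, "%Y-%m-%d %H:%M")

# %y does work when the input really has a two-digit year.
short = datetime.strptime("21-11-27 00:00", "%y-%m-%d %H:%M")

print(two_digit_ok)         # False
print(four_digit == short)  # True
```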
time data '07/10/2019:08:00:00 PM' does not match format '%m/%d/%Y:%H:%M:%S.%f'
I am not sure what is wrong. Here is the code that I have been using:
import datetime as dt
df['DATE'] = df['DATE'].apply(lambda x: dt.datetime.strptime(x,'%m/%d/%Y:%H:%M:%S.%f'))
Here's a sample of the column:
Transaction_date
07/10/2019:08:00:00 PM
07/23/2019:08:00:00 PM
3/15/2021
8/15/2021
8/26/2021
Your format is incorrect. Try:
df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%d/%Y:%I:%M:%S %p")
You should be using %I to specify a 12-hour format and %p for AM/PM.
Separately, just use pd.to_datetime instead of importing datetime and using apply.
Example:
>>> pd.to_datetime('07/10/2019:08:00:00 PM', format="%m/%d/%Y:%I:%M:%S %p")
Timestamp('2019-07-10 20:00:00')
Edit:
To handle multiple formats, you can use pd.to_datetime with fillna:
df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%d/%Y:%I:%M:%S %p", errors="coerce").fillna(pd.to_datetime(df["DATE"], format="%m/%d/%Y", errors="coerce"))
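The same "try one format, then fall back to the other" idea can be sketched without pandas. This is not the answer's method but a plain-datetime alternative for checking the two format strings on individual values; parse_mixed and FORMATS are hypothetical names, and returning None for unparseable input is an assumption that mirrors errors="coerce".

```python
from datetime import datetime

# Formats taken from the answer: 12-hour timestamp first, then date only.
FORMATS = ("%m/%d/%Y:%I:%M:%S %p", "%m/%d/%Y")

def parse_mixed(text):
    """Return the first successful parse of text, or None if no format fits."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None

print(parse_mixed("07/10/2019:08:00:00 PM"))  # 2019-07-10 20:00:00
print(parse_mixed("3/15/2021"))               # 2021-03-15 00:00:00
```

With a DataFrame, a function like this could be passed to .apply(), although the vectorized pd.to_datetime calls above are likely to be faster on large columns.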
I have a series of dates but in a format like "1OCT20" or "30MAR19", how can I convert them into datetime?
thanks in advance
Use pd.to_datetime with the format argument set to %d%b%y:
%d Day of the month as a zero-padded decimal number.
%b Month as locale’s abbreviated name.
%y Year without century as a zero-padded decimal number.
I usually use the https://strftime.org/ website when looking up specific datetime format codes.
pd.to_datetime('1OCT20',format='%d%b%y')
Timestamp('2020-10-01 00:00:00')
pd.to_datetime('30MAR19',format='%d%b%y')
Timestamp('2019-03-30 00:00:00')
On your dataset, you can convert the column directly:
df['trgdate'] = pd.to_datetime(df['srcdate'],format='%d%b%y')
I have a timestamp column in my dataframe which is originally a str type. Some sample values:
'6/13/2015 6:45:58 AM'
'6/13/2015 7:00:37 PM'
I use the following code to convert these values into datetime with 24-hour format:
df['timestampx'] = pd.to_datetime(df['timestamp'], format='%m/%d/%Y %H:%M:%S %p')
And, I obtain this result:
2015-06-13 06:45:58
2015-06-13 07:00:37
That means the dates are NOT converted to 24-hour format, and I am also losing the AM/PM info. Any help?
You're reading it in as a 24 hour time, but really the current format isn't 24 hour time, it's 12 hour time. Read it in as 12 hour with the suffix (AM/PM), then you'll be OK to output in 24 hour time later if need be.
import pandas as pd
df = pd.DataFrame(['6/13/2015 6:45:58 AM','6/13/2015 7:00:37 PM'], columns = ['timestamp'])
df['timestampx'] = pd.to_datetime(df['timestamp'], format='%m/%d/%Y %I:%M:%S %p')
print(df)
              timestamp          timestampx
0  6/13/2015 6:45:58 AM 2015-06-13 06:45:58
1  6/13/2015 7:00:37 PM 2015-06-13 19:00:37
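The effect of %H versus %I can also be reproduced with the standard library, where %p only changes the parsed hour when the hour itself is read with %I. A minimal sketch, with as_24h and as_12h as illustrative names:

```python
from datetime import datetime

text = "6/13/2015 7:00:37 PM"

# With %H the hour is taken literally as 7 and the PM marker is ignored.
as_24h = datetime.strptime(text, "%m/%d/%Y %H:%M:%S %p")

# With %I the hour is read as 12-hour time and PM shifts it to 19.
as_12h = datetime.strptime(text, "%m/%d/%Y %I:%M:%S %p")

print(as_24h.hour)  # 7
print(as_12h.hour)  # 19

# The AM/PM marker is not stored; it can be regenerated when formatting.
print(as_12h.strftime("%Y-%m-%d %H:%M:%S"))
print(as_12h.strftime("%m/%d/%Y %I:%M:%S %p"))
```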