time data not matching format - python

time data '07/10/2019:08:00:00 PM' does not match format '%m/%d/%Y:%H:%M:%S.%f'
I am not sure what is wrong. Here is the code that I have been using:
import datetime as dt
df['DATE'] = df['DATE'].apply(lambda x: dt.datetime.strptime(x,'%m/%d/%Y:%H:%M:%S.%f'))
Here's a sample of the column:
Transaction_date
07/10/2019:08:00:00 PM
07/23/2019:08:00:00 PM
3/15/2021
8/15/2021
8/26/2021

Your format is incorrect. Try:
df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%d/%Y:%I:%M:%S %p")
You should be using %I to specify a 12-hour format and %p for AM/PM.
Separately, just use pd.to_datetime instead of importing datetime and using apply.
Example:
>>> pd.to_datetime('07/10/2019:08:00:00 PM', format="%m/%d/%Y:%I:%M:%S %p")
Timestamp('2019-07-10 20:00:00')
Edit:
To handle multiple formats, you can use pd.to_datetime with fillna:
df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%d/%Y:%I:%M:%S %p", errors="coerce").fillna(pd.to_datetime(df["DATE"], format="%m/%d/%Y", errors="coerce"))
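The same fallback idea can be sketched for individual strings using only the standard library. This is a minimal sketch, not part of the original answer: the helper name parse_mixed and the FORMATS tuple are hypothetical, and it assumes the column contains only the two layouts shown in the question.

```python
from datetime import datetime

# Candidate layouts, tried in order (assumption: only these two occur).
FORMATS = ("%m/%d/%Y:%I:%M:%S %p", "%m/%d/%Y")


def parse_mixed(text):
    """Return a datetime for the first matching format, or None if none match."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


samples = ["07/10/2019:08:00:00 PM", "3/15/2021", "not a date"]
parsed = [parse_mixed(s) for s in samples]
print(parsed)
```

Returning None for unparseable values plays the same role as errors="coerce": bad rows are flagged rather than raising. On large columns the vectorised to_datetime version is likely the faster choice.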

Related

Convert String "YYYY-MM-DD hh:mm:ss Etc/GMT" to timestamp in UTC pandas

I have a pandas column of datetime-like string values like this example:
exammple_value = "2022-06-24 16:57:33 Etc/GMT"
Expected output
Timestamp('2022-06-24 16:57:33+0000', tz='UTC')
Etc/GMT is the timezone name; you can look it up in Python with:
import pytz
list(filter(lambda x: 'GMT' in x, pytz.all_timezones))[0]
----
OUT:
'Etc/GMT'
Use to_datetime with %Z to parse the timezone name, then convert to UTC with Timestamp.tz_convert:
exammple_value = "2022-06-24 16:57:33 Etc/GMT"
print (pd.to_datetime(exammple_value, format='%Y-%m-%d %H:%M:%S %Z').tz_convert('UTC'))
2022-06-24 16:57:33+00:00
Another idea is to remove the timezone name with rsplit and then localize the result to UTC:
print (pd.to_datetime(exammple_value.rsplit(maxsplit=1)[0]).tz_localize('UTC'))
2022-06-24 16:57:33+00:00
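For comparison, here is a minimal standard-library sketch of the split approach. It assumes every value ends in a zone name that is an alias for UTC; the UTC_ALIASES set and the helper name to_utc are hypothetical and not part of the original answers.

```python
from datetime import datetime, timezone

# Assumption: these trailing zone names all mean UTC in the data.
UTC_ALIASES = {"Etc/GMT", "Etc/UTC", "GMT", "UTC"}


def to_utc(text):
    """Split off the trailing zone name and attach UTC to the parsed datetime."""
    stamp, zone = text.rsplit(maxsplit=1)
    if zone not in UTC_ALIASES:
        raise ValueError(f"unexpected timezone name: {zone!r}")
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)


result = to_utc("2022-06-24 16:57:33 Etc/GMT")
print(result.isoformat())
```

Raising on an unexpected zone name is a deliberate choice here: silently labelling a non-UTC value as UTC would shift the data without any warning.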

How to convert "07-JUL-22 08.54.22.153000000 AM" to datetime object in python

I want to convert "07-JUL-22 08.54.22.153000000 AM" to a datetime object in python in order to be able to perform timedelta operations inside a pandas dataframe!
Your help is much appreciated.
pd.to_datetime can infer the format once the dots in the time part are replaced with colons (07-JUL-22 08:54:22.153000000 AM), so you can do:
df['date2'] = pd.to_datetime(
df['date'].str.replace(r'(\d{2})\.(\d{2})\.(\d{2})', r'\1:\2:\3', regex=True)
)
print(df)
date date2
0 07-JUL-22 08:54:22.153000000 AM 2022-07-07 08:54:22.153
Nanoseconds are a bit of a problem for the batteries included with Python:
from datetime import datetime
d = "07-jul-22 08.54.22.153000 am"
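# %I (12-hour clock) is required for %p (AM/PM) to affect the hour; %H would ignore it
dt = datetime.strptime(d, "%d-%b-%y %I.%M.%S.%f %p")
Note that the input here is already truncated to microseconds, because %f accepts at most six digits. How important are the nanoseconds for your application?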
You can truncate nanoseconds to microseconds with a regex:
import re
d = "07-jul-22 08.54.22.153000000 am"
d = re.sub(r"(\d\d\d\d\d\d)\d\d\d", r"\1", d)
(Cheat sheet: https://strftime.org/ )
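Putting the two steps together, a minimal sketch (the helper name parse_timestamp is hypothetical): trim the fractional seconds to six digits, then parse with %I so that AM/PM is honoured. It assumes that dropping sub-microsecond digits is acceptable and that month abbreviations are English, since %b depends on the locale.

```python
import re
from datetime import datetime, timedelta


def parse_timestamp(text):
    """Parse strings like '07-JUL-22 08.54.22.153000000 AM' (nanoseconds dropped)."""
    # Keep at most six fractional digits, because %f accepts no more than six.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", text)
    return datetime.strptime(trimmed, "%d-%b-%y %I.%M.%S.%f %p")


start = parse_timestamp("07-JUL-22 08.54.22.153000000 AM")
later = parse_timestamp("07-JUL-22 08.54.22.153000000 PM")

# The results are ordinary datetime objects, so timedelta arithmetic works.
deadline = start + timedelta(minutes=30)
print(start, later - start, deadline)
```

In a dataframe the helper could be applied per value with df['date'].apply(parse_timestamp), at the cost of being slower than a vectorised pd.to_datetime call.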

How to convert date time to timestamp in python?

I have a dataset that has a column of that date written in the following format. How can I convert them to timestamp?
date = 1/1/2016 1:00:00 AM
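You may need to use the datetime library:
import datetime
datestr = "1/1/2016 1:00:00 AM"
# %I (12-hour clock) is needed for %p (AM/PM) to affect the hour
datetm = datetime.datetime.strptime(
    datestr, '%d/%m/%Y %I:%M:%S %p')
datetm.timestamp()
output (with the local timezone set to UTC; a naive datetime is interpreted as local time):
1451610000.0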
More info on formats, etc.: docs
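Because timestamp() interprets a naive datetime in the machine's local timezone, the number above can differ between machines. The following is a minimal sketch that pins the timezone explicitly. It assumes, hypothetically, that the strings are in UTC and written month-first; the helper name to_epoch_utc is made up for illustration. It also shows why %I rather than %H is needed alongside %p.

```python
from datetime import datetime, timezone

FMT_12H = "%m/%d/%Y %I:%M:%S %p"  # %I: 12-hour clock, so %p is applied
FMT_24H = "%m/%d/%Y %H:%M:%S %p"  # %H: %p is matched but does not change the hour


def to_epoch_utc(text):
    """Parse a 12-hour date string and return POSIX seconds, treating it as UTC."""
    naive = datetime.strptime(text, FMT_12H)
    return naive.replace(tzinfo=timezone.utc).timestamp()


epoch = to_epoch_utc("1/1/2016 1:00:00 AM")
hour_with_I = datetime.strptime("1/1/2016 1:00:00 PM", FMT_12H).hour
hour_with_H = datetime.strptime("1/1/2016 1:00:00 PM", FMT_24H).hour
print(epoch, hour_with_I, hour_with_H)
```

For AM values the two formats happen to agree, which is why the mistake is easy to miss until a PM value shows up.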

How to format datetime in a dataframe the way I want?

I cannot find the correct format for this datetime. I have tried several formats, %Y/%m/%d%I:%M:%S%p is the closest format I can find for the example below.
df['datetime'] = '2019-11-13 16:28:05.779'
df['datetime'] = pd.to_datetime(df['datetime'], format="%Y/%m/%d%I:%M:%S%p")
Result:
ValueError: time data '2019-11-13 16:28:05.779' does not match format '%Y/%m/%d%I:%M:%S%p' (match)
Before guessing the format yourself, let pandas make the first guess:
df['datetime'] = pd.to_datetime(df['datetime'], infer_datetime_format=True)
0 2019-11-13 16:28:05.779
Name: datetime, dtype: datetime64[ns]
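You can probably solve this by using the parameter infer_datetime_format=True (deprecated since pandas 2.0, where consistent format inference is the default and passing it has no effect). Here's an example: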
df = {}
df['datetime'] = '2019-11-13 16:28:05.779'
df['datetime'] = pd.to_datetime(df['datetime'], infer_datetime_format=True)
print(df['datetime'])
print(type(df['datetime']))
Output:
2019-11-13 16:28:05.779000
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
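Here is the pandas.to_datetime() call with the correct format string: pd.to_datetime(df['datetime'], format="%Y-%m-%d %H:%M:%S.%f")
Your format used slashes where the data has dashes, was missing the space between the date and the time, and did not account for the fractional seconds (%f). Also, %I is for 12-hour time (the example time you gave is 16:28, so it needs %H), and %p is, to quote the docs, the "Locale's equivalent of either AM or PM", which your data does not contain.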
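A quick way to check a format string before applying it to a whole column is to try it on a single value with the standard library, which uses largely the same directives. A minimal sketch, using the sample value from the question:

```python
from datetime import datetime

value = "2019-11-13 16:28:05.779"
fmt = "%Y-%m-%d %H:%M:%S.%f"  # dashes, a space, 24-hour clock, fractional seconds

parsed = datetime.strptime(value, fmt)
print(parsed)

# Round-trip check: formatting with the same directives reproduces the input,
# except that %f always prints six digits.
round_trip = parsed.strftime(fmt)
print(round_trip)
```

If strptime raises a ValueError here, the format string does not fit the data, and the error is easier to read than one raised from a whole column.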

How to fix date formatting using python3

I have data with the date format as follows:
date_format = 190410
year = 19
month = 04
date = 10
I want to change the date format, to be like this:
date_format = 10-04-2019
How do I solve this problem?
>>> import datetime
>>> date = 190410
>>> datetime.datetime.strptime(str(date), "%y%m%d").strftime("%d-%m-%Y")
'10-04-2019'
datetime.strptime() takes a date string and a format and turns them into a datetime object; datetime objects have a method called strftime that turns them back into a string with a given format. The meanings of %y, %m, %d and %Y are listed in the strftime/strptime format codes table in the Python docs.
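A minimal sketch wrapping the two calls in a helper (the name reformat_yymmdd is hypothetical). It assumes the value may arrive as an integer that has lost a leading zero, so it pads to six digits first; drop the zfill call if that does not apply to your data.

```python
from datetime import datetime


def reformat_yymmdd(value):
    """Convert a YYMMDD value (int or str) to a 'DD-MM-YYYY' string."""
    text = str(value).zfill(6)  # e.g. 90410 -> '090410'
    return datetime.strptime(text, "%y%m%d").strftime("%d-%m-%Y")


first = reformat_yymmdd(190410)
second = reformat_yymmdd(90410)
print(first, second)
```

Keep in mind that %y maps two-digit years 69 to 99 onto the 1900s and 00 to 68 onto the 2000s, so older dates may need separate handling.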
This is what you want (notice that you have to change your format):
import datetime
date_format = '2019-04-10'
date_time_obj = datetime.datetime.strptime(date_format, '%Y-%m-%d')
print(date_time_obj)
Here is another example:
import datetime
date_time_str = '2018-06-29 08:15:27.243860'
date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S.%f')
print('Date:', date_time_obj.date())
print('Time:', date_time_obj.time())
print('Date-time:', date_time_obj)
You can also do this:
from datetime import datetime
s = "20120213"
# you could also import date instead of datetime and use that.
date = datetime(year=int(s[0:4]), month=int(s[4:6]), day=int(s[6:8]))
print(date)
There are many ways to achieve what you want.
