Conversion of date format in Python - python

I have a date as 05 May, 2015. I want it as 05-May-2015. On another line I also want it as 05052015. How can I do this in Python?

I would suggest reading the documentation for Python's datetime library:
In [28]: from datetime import datetime
In [29]: my_date = '05 May, 2015'
In [30]: custom_date = datetime.strptime(my_date,'%d %B, %Y')
In [34]: first_date_format = custom_date.strftime('%d-%B-%Y')
In [35]: first_date_format
Out[35]: '05-May-2015'
In [36]: second_date_format = custom_date.strftime('%d%m%Y')
In [37]: second_date_format
Out[37]: '05052015'
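One detail worth noting about the answer above: %B is the full month name, which for May happens to coincide with the abbreviation. The following is a minimal sketch contrasting %B with %b; the January date is an invented example, and the output assumes the default English (C) locale.

```python
from datetime import datetime

# Hypothetical input in the same layout as the question's date.
parsed = datetime.strptime("05 January, 2015", "%d %B, %Y")

full_month = parsed.strftime("%d-%B-%Y")   # full month name
short_month = parsed.strftime("%d-%b-%Y")  # abbreviated month name
digits_only = parsed.strftime("%d%m%Y")    # day, month and year as digits

print(full_month)
print(short_month)
print(digits_only)
```

If the abbreviated form is what you need for every month, %b is probably the safer directive.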

Related

format a pandas dataframe column to datatime format [duplicate]

I have a pandas column like the following:
January 2014
February 2014
I want to convert it to the following format:
201401
201402
I am doing the following:
df.date = pd.to_datetime(df.date, format='%Y%B')
But it gives me an error.
You shouldn't need the format string, it just works:
In [207]:
pd.to_datetime('January 2014')
Out[207]:
Timestamp('2014-01-01 00:00:00')
Besides, your format string is incorrect; it should be '%B %Y':
In [209]:
pd.to_datetime('January 2014', format='%B %Y')
Out[209]:
Timestamp('2014-01-01 00:00:00')
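Parsing is only half of what the question asks for, since the target is a string such as 201401. Here is a minimal sketch of both steps using only the standard library (the list name is hypothetical). On a pandas datetime column, the analogous formatting step would be Series.dt.strftime('%Y%m').

```python
from datetime import datetime

month_labels = ["January 2014", "February 2014"]  # hypothetical sample data

# Parse with '%B %Y' (full month name, then year), then format as YYYYMM.
yyyymm = [
    datetime.strptime(label, "%B %Y").strftime("%Y%m")
    for label in month_labels
]
print(yyyymm)
```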

Remaining Hour Per day for difference between two timestamp using pandas

I am working on log data where I have to find the usage of a piece of software on a daily basis. For instance, the log shows for a user: start time 04/01/2019 9:15 AM, end time 04/03/2019 12:00 PM. If I take the difference between these two timestamps, I get the usage for the whole span, not for a particular day. Is there a way to get the usage per day up to the end date?
The data has a form similar to the one shown below,
and here is what I am trying to achieve.
Since you didn't provide any original data, I created some fake data myself. I am also not sure from your description whether you mean to compare the start date with the end date. If I have misunderstood you, please post a comment below.
In [10]: import pandas as pd
In [11]: import numpy as np
In [12]: df1 = pd.DataFrame({"A":[1,2], "Start":[20190302, 20190401], "End": [20190304, 20190402]})
In [13]: df1
Out[13]:
A Start End
0 1 20190302 20190304
1 2 20190401 20190402
In [14]: df2 = pd.DataFrame(df1.values.repeat((df1.End - df1.Start > 1) + 1, axis=0), columns=df1.columns)
In [15]: df2
Out[15]:
A Start End
0 1 20190302 20190304
1 1 20190302 20190304
2 2 20190401 20190402
If you need to compare actual dates, you may want to use something like the datetime library. For example:
In [28]: import datetime
In [29]: dt1 = datetime.datetime.strptime("11/30/2018 17:13", "%m/%d/%Y %H:%M")
In [30]: dt1
Out[30]: datetime.datetime(2018, 11, 30, 17, 13)
In [31]: dt2 = datetime.datetime.strptime("11/29/2018 17:13", "%m/%d/%Y %H:%M")
In [32]: dt3 = datetime.datetime.strptime("11/28/2018 17:13", "%m/%d/%Y %H:%M")
In [33]: dt1 - dt2
Out[33]: datetime.timedelta(days=1)
In [34]: (dt1 - dt2).days
Out[34]: 1
In [35]: (dt1 - dt3).days
Out[35]: 2
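The question itself asks for usage per calendar day rather than a difference in days. One possible way to sketch that with the standard library is to cut the interval at every midnight. The function name hours_per_day is hypothetical, and the timestamps are the ones from the question.

```python
from datetime import datetime, timedelta


def hours_per_day(start, end):
    """Split [start, end) into (date, hours) pairs, one per calendar day."""
    usage = []
    cursor = start
    while cursor < end:
        # Midnight at the start of the day after `cursor`.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        segment_end = min(next_midnight, end)
        hours = (segment_end - cursor).total_seconds() / 3600
        usage.append((cursor.date(), hours))
        cursor = segment_end
    return usage


start = datetime.strptime("04/01/2019 9:15 AM", "%m/%d/%Y %I:%M %p")
end = datetime.strptime("04/03/2019 12:00 PM", "%m/%d/%Y %I:%M %p")
per_day = hours_per_day(start, end)
for day, hours in per_day:
    print(day, hours)
```

This treats the timestamps as naive local times, so it ignores daylight-saving transitions; with time zone aware data the midnight cut would need more care.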

How can I convert a text string to hour format with python [duplicate]

How do I convert the following string to a datetime object?
"Jun 1 2005 1:33PM"
datetime.strptime parses an input string in the user-specified format into a timezone-naive datetime object:
>>> from datetime import datetime
>>> datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
datetime.datetime(2005, 6, 1, 13, 33)
To obtain a date object using an existing datetime object, convert it using .date():
>>> datetime.strptime('Jun 1 2005', '%b %d %Y').date()
datetime.date(2005, 6, 1)
Links:
strptime docs: Python 2, Python 3
strptime/strftime format string docs: Python 2, Python 3
strftime.org format string cheatsheet
Notes:
strptime = "string parse time"
strftime = "string format time"
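To make the two mnemonics concrete, here is a small round-trip sketch using the same format string in both directions. It assumes the default English (C) locale; note that strftime zero-pads the day and the hour, while strptime accepts both padded and unpadded input.

```python
from datetime import datetime

fmt = "%b %d %Y %I:%M%p"

parsed = datetime.strptime("Jun 1 2005 1:33PM", fmt)  # string -> datetime
formatted = parsed.strftime(fmt)                      # datetime -> string
reparsed = datetime.strptime(formatted, fmt)          # and back again

print(parsed)
print(formatted)
```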
Use the third-party dateutil library:
from dateutil import parser
parser.parse("Aug 28 1999 12:00AM") # datetime.datetime(1999, 8, 28, 0, 0)
It can handle most date formats and is more convenient than strptime since it usually guesses the correct format. It is also very useful for writing tests, where readability is more important than performance.
Install it with:
pip install python-dateutil
Check out strptime in the time module. It is the inverse of strftime.
$ python
>>> import time
>>> my_time = time.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
>>> my_time
time.struct_time(tm_year=2005, tm_mon=6, tm_mday=1,
                 tm_hour=13, tm_min=33, tm_sec=0,
                 tm_wday=2, tm_yday=152, tm_isdst=-1)
>>> timestamp = time.mktime(my_time)
>>> # convert the timestamp to a datetime
>>> from datetime import datetime
>>> my_datetime = datetime.fromtimestamp(timestamp)
>>> # convert the timestamp to a date
>>> from datetime import date
>>> my_date = date.fromtimestamp(timestamp)
Python >= 3.7
To convert a YYYY-MM-DD string to a datetime object, datetime.fromisoformat could be used.
from datetime import datetime
date_string = "2012-12-12 10:10:10"
print (datetime.fromisoformat(date_string))
2012-12-12 10:10:10
Caution from the documentation:
This does not support parsing arbitrary ISO 8601 strings - it is only intended as the inverse operation of datetime.isoformat(). A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.
I have put together a project that can convert some really neat expressions. Check out timestring.
Here are some examples below:
pip install timestring
>>> import timestring
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm')
<timestring.Date 2015-08-15 20:40:00 4491909392>
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm').date
datetime.datetime(2015, 8, 15, 20, 40)
>>> timestring.Range('next week')
<timestring.Range From 03/10/14 00:00:00 to 03/03/14 00:00:00 4496004880>
>>> (timestring.Range('next week').start.date, timestring.Range('next week').end.date)
(datetime.datetime(2014, 3, 10, 0, 0), datetime.datetime(2014, 3, 14, 0, 0))
Remember this and you won't need to get confused about datetime conversion again:
String to datetime object = strptime
datetime object to other formats = strftime
Jun 1 2005 1:33PM
corresponds to
%b %d %Y %I:%M%p
%b Month as locale’s abbreviated name (Jun)
%d Day of the month as a zero-padded decimal number (1)
%Y Year with century as a decimal number (2005)
%I Hour (12-hour clock) as a zero-padded decimal number (01)
%M Minute as a zero-padded decimal number (33)
%p Locale’s equivalent of either AM or PM (PM)
So you need strptime, i.e. converting a string to a datetime object:
>>> dates = []
>>> dates.append('Jun 1 2005 1:33PM')
>>> dates.append('Aug 28 1999 12:00AM')
>>> from datetime import datetime
>>> for d in dates:
...     date = datetime.strptime(d, '%b %d %Y %I:%M%p')
...     print(type(date))
...     print(date)
...
Output:
<class 'datetime.datetime'>
2005-06-01 13:33:00
<class 'datetime.datetime'>
1999-08-28 00:00:00
What if your dates come in different formats? You can use pandas or dateutil.parser:
>>> from dateutil import parser
>>> dates = []
>>> dates.append('12 1 2017')
>>> dates.append('1 1 2017')
>>> dates.append('1 12 2017')
>>> dates.append('June 1 2017 1:30:00AM')
>>> [parser.parse(x) for x in dates]
Output:
[datetime.datetime(2017, 12, 1, 0, 0), datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 1, 12, 0, 0), datetime.datetime(2017, 6, 1, 1, 30)]
Many timestamps have an implied timezone. To ensure that your code will work in every timezone, you should use UTC internally and attach a timezone each time a foreign object enters the system.
Python 3.2+:
>>> datetime.datetime.strptime(
... "March 5, 2014, 20:13:50", "%B %d, %Y, %H:%M:%S"
... ).replace(tzinfo=datetime.timezone(datetime.timedelta(hours=-3)))
This assumes you know the offset. If you don't, but you know e.g. the location, you can use the pytz package to query the IANA time zone database for the offset. I'll use Tehran here as an example because it has a half-hour offset:
>>> tehran = pytz.timezone("Asia/Tehran")
>>> local_time = tehran.localize(
... datetime.datetime.strptime("March 5, 2014, 20:13:50",
... "%B %d, %Y, %H:%M:%S")
... )
>>> local_time
datetime.datetime(2014, 3, 5, 20, 13, 50, tzinfo=<DstTzInfo 'Asia/Tehran' +0330+3:30:00 STD>)
As you can see, pytz has determined that the offset was +3:30 at that particular date. You can now convert this to UTC time, and it will apply the offset:
>>> utc_time = local_time.astimezone(pytz.utc)
>>> utc_time
datetime.datetime(2014, 3, 5, 16, 43, 50, tzinfo=<UTC>)
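For the simpler case where the offset is already known, the same conversion to UTC can be sketched with the standard library alone. This assumes the UTC-3 offset from the first example and is not a replacement for a time zone database lookup.

```python
from datetime import datetime, timedelta, timezone

naive = datetime.strptime("March 5, 2014, 20:13:50", "%B %d, %Y, %H:%M:%S")

# Attach the known fixed offset; the wall-clock time is left unchanged.
aware = naive.replace(tzinfo=timezone(timedelta(hours=-3)))

# Convert to UTC; the offset is applied to the wall-clock time.
as_utc = aware.astimezone(timezone.utc)
print(as_utc)
```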
Note that dates before the adoption of timezones will give you weird offsets. This is because the IANA has decided to use Local Mean Time:
>>> chicago = pytz.timezone("America/Chicago")
>>> weird_time = chicago.localize(
... datetime.datetime.strptime("November 18, 1883, 11:00:00",
... "%B %d, %Y, %H:%M:%S")
... )
>>> weird_time.astimezone(pytz.utc)
datetime.datetime(1883, 11, 18, 7, 34, tzinfo=<UTC>)
The weird "7 hours and 34 minutes" are derived from the longitude of Chicago. I used this timestamp because it is right before standardized time was adopted in Chicago.
If your string is in ISO 8601 format and you have Python 3.7+, you can use the following simple code:
import datetime
aDate = datetime.date.fromisoformat('2020-10-04')
for dates and
import datetime
aDateTime = datetime.datetime.fromisoformat('2020-10-04 22:47:00')
for strings containing both a date and a time. If a time is included, datetime.datetime.fromisoformat() supports the following format:
YYYY-MM-DD[*HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]]
where * matches any single character.
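A short sketch of a few inputs that fit the pattern above (Python 3.7+ assumed). The variable names are illustrative only.

```python
from datetime import date, datetime, timedelta

only_date = date.fromisoformat("2020-10-04")

# The separator between date and time may be any single character.
with_space = datetime.fromisoformat("2020-10-04 22:47:00")
with_t = datetime.fromisoformat("2020-10-04T22:47:00")

# A trailing UTC offset produces a time zone aware datetime.
with_offset = datetime.fromisoformat("2020-10-04T22:47:00+02:00")

print(only_date, with_space, with_offset)
```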
Here are two solutions using Pandas to convert dates formatted as strings into datetime.date objects.
import pandas as pd
dates = ['2015-12-25', '2015-12-26']
# 1) Use a list comprehension.
>>> [d.date() for d in pd.to_datetime(dates)]
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]
# 2) Convert the dates to a DatetimeIndex and extract the python dates.
>>> pd.DatetimeIndex(dates).date.tolist()
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]
Timings
dates = pd.date_range(start='2000-1-1', end='2010-1-1', freq='D').date.tolist()
>>> %timeit [d.date() for d in pd.to_datetime(dates)]
# 100 loops, best of 3: 3.11 ms per loop
>>> %timeit pd.DatetimeIndex(dates).date.tolist()
# 100 loops, best of 3: 6.85 ms per loop
And here is how to convert the OP's original date-time examples:
datetimes = ['Jun 1 2005 1:33PM', 'Aug 28 1999 12:00AM']
>>> pd.to_datetime(datetimes).to_pydatetime().tolist()
[datetime.datetime(2005, 6, 1, 13, 33),
datetime.datetime(1999, 8, 28, 0, 0)]
There are many options for converting from the strings to Pandas Timestamps using to_datetime, so check the docs if you need anything special.
Likewise, Timestamps have many properties and methods that can be accessed in addition to .date
I personally like the solution using the parser module, which is the second answer to this question and is beautiful, as you don't have to construct any format strings to get it working. One downside is that it is much slower than the accepted answer with strptime, roughly eight times slower in the timing below.
from dateutil import parser
from datetime import datetime
import timeit
def dt():
    dt = parser.parse("Jun 1 2005 1:33PM")
def strptime():
    datetime_object = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
print(timeit.timeit(stmt=dt, number=10**5))
print(timeit.timeit(stmt=strptime, number=10**5))
Output:
10.70296801342902
1.3627995655316933
As long as you are not doing this a million times over and over again, I still think the parser method is more convenient and will handle most of the time formats automatically.
Something that isn't mentioned here and is useful: adding a suffix to the day. I decoupled the suffix logic so you can use it for any number you like, not just dates.
import time
def num_suffix(n):
    '''
    Returns the suffix for any given int
    '''
    suf = ('th', 'st', 'nd', 'rd')
    n = abs(n)  # wise guy
    tens = int(str(n)[-2:])
    units = n % 10
    if tens > 10 and tens < 20:
        return suf[0]  # teens with 'th'
    elif units <= 3:
        return suf[units]
    else:
        return suf[0]  # 'th'
def day_suffix(t):
    '''
    Returns the suffix of the given struct_time day
    '''
    return num_suffix(t.tm_mday)
# Examples
print(num_suffix(123))
print(num_suffix(3431))
print(num_suffix(1234))
print('')
print(day_suffix(time.strptime("1 Dec 00", "%d %b %y")))
print(day_suffix(time.strptime("2 Nov 01", "%d %b %y")))
print(day_suffix(time.strptime("3 Oct 02", "%d %b %y")))
print(day_suffix(time.strptime("4 Sep 03", "%d %b %y")))
print(day_suffix(time.strptime("13 Nov 90", "%d %b %y")))
print(day_suffix(time.strptime("14 Oct 10", "%d %b %y")))
In [34]: import datetime
In [35]: _now = datetime.datetime.now()
In [36]: _now
Out[36]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)
In [37]: print(_now)
2016-01-19 09:47:00.432000
In [38]: _parsed = datetime.datetime.strptime(str(_now),"%Y-%m-%d %H:%M:%S.%f")
In [39]: _parsed
Out[39]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)
In [40]: assert _now == _parsed
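One caveat with the str()/strptime round trip above: when the microsecond field is 0, str() omits the fractional part, so a format containing ".%f" no longer matches. A minimal sketch of the failure, plus the isoformat()/fromisoformat() pair (Python 3.7+) as one possible alternative:

```python
from datetime import datetime

whole_second = datetime(2016, 1, 19, 9, 47, 0)  # microsecond == 0
text = str(whole_second)  # the string has no fractional seconds

try:
    datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    strptime_ok = True
except ValueError:
    # ".%f" requires a fractional part that the string does not contain.
    strptime_ok = False

# isoformat() and fromisoformat() are designed as inverses of each other.
round_tripped = datetime.fromisoformat(whole_second.isoformat())

print(text, strptime_ok, round_tripped)
```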
Django Timezone aware datetime object example.
import datetime
from django.utils.timezone import get_current_timezone
tz = get_current_timezone()
fmt = '%b %d %Y %I:%M%p'
date_object = datetime.datetime.strptime('Jun 1 2005 1:33PM', fmt)
# localize() is a pytz method, so this assumes the current time zone is a pytz time zone
date_obj = tz.localize(date_object)
This conversion matters in Django when you have USE_TZ = True; passing a naive datetime to a model field produces a warning such as:
RuntimeWarning: DateTimeField MyModel.created received a naive datetime (2016-03-04 00:00:00) while time zone support is active.
Create a small utility function like:
def date(datestr="", format="%Y-%m-%d"):
    from datetime import datetime
    if not datestr:
        return datetime.today().date()
    return datetime.strptime(datestr, format).date()
This is versatile enough:
If you don't pass any arguments it will return today's date.
There's a date format as default that you can override.
You can easily modify it to return a datetime.
This would be helpful for converting a string to datetime and also with a time zone:
def convert_string_to_time(date_string, timezone):
    from datetime import datetime
    import pytz
    date_time_obj = datetime.strptime(date_string[:26], '%Y-%m-%d %H:%M:%S.%f')
    date_time_obj_timezone = pytz.timezone(timezone).localize(date_time_obj)
    return date_time_obj_timezone
date = '2018-08-14 13:09:24.543953+00:00'
TIME_ZONE = 'UTC'
date_time_obj_timezone = convert_string_to_time(date, TIME_ZONE)
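Note that the function above slices off the "+00:00" suffix and attaches the zone passed by the caller. When the string already carries its own offset, an alternative sketch is to let strptime read it with %z. This assumes Python 3.7 or later, where %z accepts an offset written with a colon.

```python
from datetime import datetime, timedelta

stamp = "2018-08-14 13:09:24.543953+00:00"

# %f reads the microseconds, %z reads the trailing UTC offset.
aware = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z")

print(aware)
print(aware.utcoffset())
```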
arrow offers many useful functions for dates and times. This bit of code provides an answer to the question and shows that arrow is also capable of formatting dates easily and displaying information for other locales.
>>> import arrow
>>> dateStrings = [ 'Jun 1 2005 1:33PM', 'Aug 28 1999 12:00AM' ]
>>> for dateString in dateStrings:
...     dateString
...     arrow.get(dateString.replace(' ',' '), 'MMM D YYYY H:mmA').datetime
...     arrow.get(dateString.replace(' ',' '), 'MMM D YYYY H:mmA').format('ddd, Do MMM YYYY HH:mm')
...     arrow.get(dateString.replace(' ',' '), 'MMM D YYYY H:mmA').humanize(locale='de')
...
'Jun 1 2005 1:33PM'
datetime.datetime(2005, 6, 1, 13, 33, tzinfo=tzutc())
'Wed, 1st Jun 2005 13:33'
'vor 11 Jahren'
'Aug 28 1999 12:00AM'
datetime.datetime(1999, 8, 28, 0, 0, tzinfo=tzutc())
'Sat, 28th Aug 1999 00:00'
'vor 17 Jahren'
See http://arrow.readthedocs.io/en/latest/ for more.
You can also check out dateparser:
dateparser provides modules to easily parse localized dates in almost
any string formats commonly found on web pages.
Install:
pip install dateparser
This is, I think, the easiest way you can parse dates.
The most straightforward way is to use the dateparser.parse function,
that wraps around most of the functionality in the module.
Sample code:
import dateparser
t1 = 'Jun 1 2005 1:33PM'
t2 = 'Aug 28 1999 12:00AM'
dt1 = dateparser.parse(t1)
dt2 = dateparser.parse(t2)
print(dt1)
print(dt2)
Output:
2005-06-01 13:33:00
1999-08-28 00:00:00
You can use easy_date to make it easy:
import date_converter
converted_date = date_converter.string_to_datetime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
If you want only a date, you can build it manually by passing the individual fields:
>>> import datetime
>>> date = datetime.date(int('2017'), int('12'), int('21'))
>>> date
datetime.date(2017, 12, 21)
>>> type(date)
<class 'datetime.date'>
You can also split a string and pass the parts to build a date:
selected_month_rec = '2017-09-01'
year, month, day = (int(part) for part in selected_month_rec.split('-'))
date_formate = datetime.date(year, month, day)
The resulting value is a datetime.date object.
Similar to Javed's answer, I just wanted date from string - so combining Simon's and Javed's logic, we get:
from dateutil import parser
import datetime
s = '2021-03-04'
parser.parse(s).date()
Output
datetime.date(2021, 3, 4)
It seems using pandas Timestamp is the fastest:
import pandas as pd
N = 1000
l = ['Jun 1 2005 1:33PM'] * N
format = '%b %d %Y %I:%M%p'
list(pd.to_datetime(l, format=format))
%timeit _ = list(pd.to_datetime(l, format=format))
1.58 ms ± 21.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Other solutions
from datetime import datetime
%timeit _ = list(map(lambda x: datetime.strptime(x, format), l))
9.41 ms ± 95.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
from dateutil.parser import parse
%timeit _ = list(map(lambda x: parse(x), l))
73.8 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
If the string is an ISO 8601 string, please use ciso8601:
import ciso8601
l = ['2014-01-09'] * N
%timeit _ = list(map(lambda x: ciso8601.parse_datetime(x), l))
186 µs ± 4.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
If you don't want to explicitly specify which format your date string is in, you can use this hack to bypass that step:
from dateutil.parser import parse
# Function that guesses the format and converts the string into a Python datetime
def update_event(start_datetime=None, end_datetime=None, description=None):
    if start_datetime is not None:
        new_start_time = parse(start_datetime)
        return new_start_time
# Sample input dates in different formats
d = ['06/07/2021 06:40:23.277000', '06/07/2021 06:40', '06/07/2021']
new = [update_event(i) for i in d]
for date in new:
    print(date)
# Sample output: the parsed Python datetime objects
# 2021-06-07 06:40:23.277000
# 2021-06-07 06:40:00
# 2021-06-07 00:00:00
If you want to convert it into some other datetime format, just modify the last line with the format you like, for example date.strftime('%Y/%m/%d %H:%M:%S.%f'):
from dateutil.parser import parse
def update_event(start_datetime=None, end_datetime=None, description=None):
    if start_datetime is not None:
        new_start_time = parse(start_datetime)
        return new_start_time
# Sample input dates in different formats
d = ['06/07/2021 06:40:23.277000', '06/07/2021 06:40', '06/07/2021']
# Passing the dates one by one through the function
new = [update_event(i) for i in d]
for date in new:
    print(date.strftime('%Y/%m/%d %H:%M:%S.%f'))
# Sample output dates in the required format
# 2021/06/07 06:40:23.277000
# 2021/06/07 06:40:00.000000
# 2021/06/07 00:00:00.000000
Try running the above snippet to see how it behaves.
See my answer.
In real-world data this is a real problem: multiple, mismatched, incomplete, inconsistent and multilanguage/region date formats, often mixed freely in one dataset. It's not ok for production code to fail, let alone go exception-happy like a fox.
We need to try multiple datetime formats fmt1, fmt2, ..., fmtn and suppress/handle the exceptions (from strptime()) for all those that mismatch (and, in particular, avoid needing a yucky n-deep indented ladder of try..except clauses). From my solution:
from datetime import datetime
def try_strptime(s, fmts=('%d-%b-%y', '%m/%d/%Y')):
    for fmt in fmts:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:  # this format did not match; try the next one
            continue
    return None  # or reraise the ValueError if no format matched, if you prefer
A short sample mapping a yyyy-mm-dd date string to a datetime.date object:
from datetime import date
date_from_yyyy_mm_dd = lambda δ : date(*[int(_) for _ in δ.split('-')])
date_object = date_from_yyyy_mm_dd('2021-02-15')
Use:
emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv")
emp.info()
It shows that the "Start Date" and "Last Login Time" columns are both of dtype object (i.e. strings) in the DataFrame:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name 933 non-null object
Gender 855 non-null object
Start Date 1000 non-null object
Last Login Time 1000 non-null object
Salary 1000 non-null int64
Bonus % 1000 non-null float64
Senior Management 933 non-null object
Team 957 non-null object
dtypes: float64(1), int64(1), object(6)
memory usage: 62.6+ KB
By using the parse_dates option of read_csv, you can convert your string datetimes into the pandas datetime format:
emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv", parse_dates=["Start Date", "Last Login Time"])
emp.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name 933 non-null object
Gender 855 non-null object
Start Date 1000 non-null datetime64[ns]
Last Login Time 1000 non-null datetime64[ns]
Salary 1000 non-null int64
Bonus % 1000 non-null float64
Senior Management 933 non-null object
Team 957 non-null object
dtypes: datetime64[ns](2), float64(1), int64(1), object(4)
memory usage: 62.6+ KB
# Convert string to datetime
>>> from datetime import datetime
>>> x = datetime.strptime('Jun 1 2005', '%b %d %Y')
>>> print(x, type(x))
2005-06-01 00:00:00 <class 'datetime.datetime'>
# Convert datetime to string (the reverse of the above)
>>> y = x.strftime('%b %d %Y')
>>> print(y, type(y))
Jun 01 2005 <class 'str'>

Split and Combine Date

I am trying to write a Python script that compares dates from two different pages. The date format on one page is Oct 03 2016, whereas on the other page it is (10/3/2016). My goal is to compare these two dates. I was able to convert Oct to 10 but don't know how to turn the whole date into 10/3/2016.
You should really be using the dateutil library for this.
>>> import dateutil.parser
>>> first_date = dateutil.parser.parse('Oct 03 2016')
>>> second_date = dateutil.parser.parse('10/3/2016')
>>> first_date
datetime.datetime(2016, 10, 3, 0, 0)
>>> second_date
datetime.datetime(2016, 10, 3, 0, 0)
>>> first_date == second_date
True
>>>
Use datetime module to convert your string to datetime object and then compare both. For example:
>>> from datetime import datetime
>>> date1 = datetime.strptime('Oct 03 2016', '%b %d %Y')
>>> date2 = datetime.strptime('10/3/2016', '%m/%d/%Y')
>>> date1 == date2
True
Further, you may convert this datetime object to your custom format using datetime.strftime():
>>> date1.strftime('%d * %B * %Y')
'03 * October * 2016'
A list of all the directives usable in the format string is available in the strftime documentation.
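If the goal is literally the unpadded text 10/3/2016, note that %d and %m always zero-pad, and the non-padding variants of those directives are platform-specific. One portable sketch is to build the string from the datetime's own attributes:

```python
from datetime import datetime

first = datetime.strptime("Oct 03 2016", "%b %d %Y")
second = datetime.strptime("10/3/2016", "%m/%d/%Y")

# month and day are plain integers, so no zero padding is added.
unpadded = f"{first.month}/{first.day}/{first.year}"

print(unpadded)
print(first == second)
```

Comparing the parsed datetime objects, as in the answers above, is usually more robust than comparing formatted strings.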

String to date type conversion in Python [duplicate]

How do I convert the following string to a datetime object?
"Jun 1 2005 1:33PM"
datetime.strptime parses an input string in the user-specified format into a timezone-naive datetime object:
>>> from datetime import datetime
>>> datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
datetime.datetime(2005, 6, 1, 13, 33)
To obtain a date object using an existing datetime object, convert it using .date():
>>> datetime.strptime('Jun 1 2005', '%b %d %Y').date()
date(2005, 6, 1)
Links:
strptime docs: Python 2, Python 3
strptime/strftime format string docs: Python 2, Python 3
strftime.org format string cheatsheet
Notes:
strptime = "string parse time"
strftime = "string format time"
Use the third-party dateutil library:
from dateutil import parser
parser.parse("Aug 28 1999 12:00AM") # datetime.datetime(1999, 8, 28, 0, 0)
It can handle most date formats and is more convenient than strptime since it usually guesses the correct format. It is also very useful for writing tests, where readability is more important than performance.
Install it with:
pip install python-dateutil
Check out strptime in the time module. It is the inverse of strftime.
$ python
>>> import time
>>> my_time = time.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
time.struct_time(tm_year=2005, tm_mon=6, tm_mday=1,
tm_hour=13, tm_min=33, tm_sec=0,
tm_wday=2, tm_yday=152, tm_isdst=-1)
timestamp = time.mktime(my_time)
# convert time object to datetime
from datetime import datetime
my_datetime = datetime.fromtimestamp(timestamp)
# convert time object to date
from datetime import date
my_date = date.fromtimestamp(timestamp)
Python >= 3.7
To convert a YYYY-MM-DD string to a datetime object, datetime.fromisoformat could be used.
from datetime import datetime
date_string = "2012-12-12 10:10:10"
print (datetime.fromisoformat(date_string))
2012-12-12 10:10:10
Caution from the documentation:
This does not support parsing arbitrary ISO 8601 strings - it is only intended as the inverse operation of datetime.isoformat(). A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.
I have put together a project that can convert some really neat expressions. Check out timestring.
Here are some examples below:
pip install timestring
>>> import timestring
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm')
<timestring.Date 2015-08-15 20:40:00 4491909392>
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm').date
datetime.datetime(2015, 8, 15, 20, 40)
>>> timestring.Range('next week')
<timestring.Range From 03/10/14 00:00:00 to 03/03/14 00:00:00 4496004880>
>>> (timestring.Range('next week').start.date, timestring.Range('next week').end.date)
(datetime.datetime(2014, 3, 10, 0, 0), datetime.datetime(2014, 3, 14, 0, 0))
Remember this and you didn't need to get confused in datetime conversion again.
String to datetime object = strptime
datetime object to other formats = strftime
Jun 1 2005 1:33PM
is equals to
%b %d %Y %I:%M%p
%b Month as locale’s abbreviated name(Jun)
%d Day of the month as a zero-padded decimal number(1)
%Y Year with century as a decimal number(2015)
%I Hour (12-hour clock) as a zero-padded decimal number(01)
%M Minute as a zero-padded decimal number(33)
%p Locale’s equivalent of either AM or PM(PM)
so you need strptime i-e converting string to
>>> dates = []
>>> dates.append('Jun 1 2005 1:33PM')
>>> dates.append('Aug 28 1999 12:00AM')
>>> from datetime import datetime
>>> for d in dates:
... date = datetime.strptime(d, '%b %d %Y %I:%M%p')
... print type(date)
... print date
...
Output
<type 'datetime.datetime'>
2005-06-01 13:33:00
<type 'datetime.datetime'>
1999-08-28 00:00:00
What if you have different format of dates you can use panda or dateutil.parse
>>> import dateutil
>>> dates = []
>>> dates.append('12 1 2017')
>>> dates.append('1 1 2017')
>>> dates.append('1 12 2017')
>>> dates.append('June 1 2017 1:30:00AM')
>>> [parser.parse(x) for x in dates]
OutPut
[datetime.datetime(2017, 12, 1, 0, 0), datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 1, 12, 0, 0), datetime.datetime(2017, 6, 1, 1, 30)]
Many timestamps have an implied timezone. To ensure that your code will work in every timezone, you should use UTC internally and attach a timezone each time a foreign object enters the system.
Python 3.2+:
>>> datetime.datetime.strptime(
... "March 5, 2014, 20:13:50", "%B %d, %Y, %H:%M:%S"
... ).replace(tzinfo=datetime.timezone(datetime.timedelta(hours=-3)))
This assumes you know the offset. If you don't, but you know e.g. the location, you can use the pytz package to query the IANA time zone database for the offset. I'll use Tehran here as an example because it has a half-hour offset:
>>> tehran = pytz.timezone("Asia/Tehran")
>>> local_time = tehran.localize(
... datetime.datetime.strptime("March 5, 2014, 20:13:50",
... "%B %d, %Y, %H:%M:%S")
... )
>>> local_time
datetime.datetime(2014, 3, 5, 20, 13, 50, tzinfo=<DstTzInfo 'Asia/Tehran' +0330+3:30:00 STD>)
As you can see, pytz has determined that the offset was +3:30 at that particular date. You can now convert this to UTC time, and it will apply the offset:
>>> utc_time = local_time.astimezone(pytz.utc)
>>> utc_time
datetime.datetime(2014, 3, 5, 16, 43, 50, tzinfo=<UTC>)
Note that dates before the adoption of timezones will give you weird offsets. This is because the IANA has decided to use Local Mean Time:
>>> chicago = pytz.timezone("America/Chicago")
>>> weird_time = chicago.localize(
... datetime.datetime.strptime("November 18, 1883, 11:00:00",
... "%B %d, %Y, %H:%M:%S")
... )
>>> weird_time.astimezone(pytz.utc)
datetime.datetime(1883, 11, 18, 7, 34, tzinfo=<UTC>)
The weird "7 hours and 34 minutes" are derived from the longitude of Chicago. I used this timestamp because it is right before standardized time was adopted in Chicago.
If your string is in ISO 8601 format and you have Python 3.7+, you can use the following simple code:
import datetime
aDate = datetime.date.fromisoformat('2020-10-04')
for dates and
import datetime
aDateTime = datetime.datetime.fromisoformat('2020-10-04 22:47:00')
for strings containing date and time. If timestamps are included, the function datetime.datetime.isoformat() supports the following format:
YYYY-MM-DD[*HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]]
Where * matches any single character. See also here and here.
Here are two solutions using Pandas to convert dates formatted as strings into datetime.date objects.
import pandas as pd
dates = ['2015-12-25', '2015-12-26']
# 1) Use a list comprehension.
>>> [d.date() for d in pd.to_datetime(dates)]
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]
# 2) Convert the dates to a DatetimeIndex and extract the python dates.
>>> pd.DatetimeIndex(dates).date.tolist()
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]
Timings
dates = pd.DatetimeIndex(start='2000-1-1', end='2010-1-1', freq='d').date.tolist()
>>> %timeit [d.date() for d in pd.to_datetime(dates)]
# 100 loops, best of 3: 3.11 ms per loop
>>> %timeit pd.DatetimeIndex(dates).date.tolist()
# 100 loops, best of 3: 6.85 ms per loop
And here is how to convert the OP's original date-time examples:
datetimes = ['Jun 1 2005 1:33PM', 'Aug 28 1999 12:00AM']
>>> pd.to_datetime(datetimes).to_pydatetime().tolist()
[datetime.datetime(2005, 6, 1, 13, 33),
datetime.datetime(1999, 8, 28, 0, 0)]
There are many options for converting from the strings to Pandas Timestamps using to_datetime, so check the docs if you need anything special.
Likewise, Timestamps have many properties and methods that can be accessed in addition to .date
I personally like the solution using the parser module, which is the second answer to this question and is beautiful, as you don't have to construct any string literals to get it working. But, one downside is that it is 90% slower than the accepted answer with strptime.
from dateutil import parser
from datetime import datetime
import timeit
def dt():
dt = parser.parse("Jun 1 2005 1:33PM")
def strptime():
datetime_object = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
print(timeit.timeit(stmt=dt, number=10**5))
print(timeit.timeit(stmt=strptime, number=10**5))
Output:
10.70296801342902
1.3627995655316933
As long as you are not doing this a million times over and over again, I still think the parser method is more convenient and will handle most of the time formats automatically.
Something that isn't mentioned here and is useful: adding a suffix to the day. I decoupled the suffix logic so you can use it for any number you like, not just dates.
import time
def num_suffix(n):
    '''
    Returns the suffix for any given int
    '''
    suf = ('th', 'st', 'nd', 'rd')
    n = abs(n)  # wise guy
    tens = int(str(n)[-2:])
    units = n % 10
    if tens > 10 and tens < 20:
        return suf[0]  # teens with 'th'
    elif units <= 3:
        return suf[units]
    else:
        return suf[0]  # 'th'
def day_suffix(t):
    '''
    Returns the suffix of the given struct_time day
    '''
    return num_suffix(t.tm_mday)
# Examples
print(num_suffix(123))
print(num_suffix(3431))
print(num_suffix(1234))
print('')
print(day_suffix(time.strptime("1 Dec 00", "%d %b %y")))
print(day_suffix(time.strptime("2 Nov 01", "%d %b %y")))
print(day_suffix(time.strptime("3 Oct 02", "%d %b %y")))
print(day_suffix(time.strptime("4 Sep 03", "%d %b %y")))
print(day_suffix(time.strptime("13 Nov 90", "%d %b %y")))
print(day_suffix(time.strptime("14 Oct 10", "%d %b %y")))
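The suffix idea can be combined with strftime to build a full label such as "13th Nov 1990". This is only a minimal sketch for Python 3 using the standard library; ordinal and format_with_suffix are names made up here for illustration, and the suffix rule is restated compactly rather than copied from the answer above. It assumes an English locale for the month abbreviation.

```python
from datetime import datetime

def ordinal(n):
    """Return n followed by its English ordinal suffix, e.g. 1 -> '1st'."""
    n = abs(n)
    # Numbers ending in 11, 12 or 13 take 'th' even though they end in 1, 2, 3
    if 10 < n % 100 < 14:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f"{n}{suffix}"

def format_with_suffix(dt):
    """Format a datetime as '<day with suffix> <abbreviated month> <year>'."""
    return f"{ordinal(dt.day)} {dt.strftime('%b %Y')}"

parsed = datetime.strptime("13 Nov 90", "%d %b %y")
label = format_with_suffix(parsed)
print(label)
```

Keeping ordinal separate from the date formatting preserves the decoupling described above, so the same helper works for any integer.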
In [34]: import datetime
In [35]: _now = datetime.datetime.now()
In [36]: _now
Out[36]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)
In [37]: print(_now)
2016-01-19 09:47:00.432000
In [38]: _parsed = datetime.datetime.strptime(str(_now),"%Y-%m-%d %H:%M:%S.%f")
In [39]: _parsed
Out[39]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)
In [40]: assert _now == _parsed
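One caveat with this round trip: str() omits the fractional part when the microsecond field is 0, so the "%Y-%m-%d %H:%M:%S.%f" format would then fail to match. As a hedged alternative, assuming Python 3.7 or later, isoformat() and fromisoformat() round-trip in both cases. The sample values below are illustrative.

```python
from datetime import datetime

with_micro = datetime(2016, 1, 19, 9, 47, 0, 432000)
without_micro = datetime(2016, 1, 19, 9, 47, 0)  # microsecond == 0

for original in (with_micro, without_micro):
    # isoformat(sep=' ') produces the same text as str(original)
    text = original.isoformat(sep=' ')
    restored = datetime.fromisoformat(text)
    assert restored == original
    print(text)
```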
Django Timezone aware datetime object example.
import datetime
from django.utils.timezone import get_current_timezone
tz = get_current_timezone()
format = '%b %d %Y %I:%M%p'
date_object = datetime.datetime.strptime('Jun 1 2005 1:33PM', format)
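date_obj = tz.localize(date_object)  # localize() is a pytz method; with zoneinfo time zones use django.utils.timezone.make_aware(date_object, tz)
This conversion matters in Django when you have USE_TZ = True; saving a naive datetime produces a warning such as: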
RuntimeWarning: DateTimeField MyModel.created received a naive datetime (2016-03-04 00:00:00) while time zone support is active.
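To see the difference between a naive and an aware datetime outside Django, here is a minimal standard-library sketch. It assumes the parsed time is meant to be UTC; for any other zone you would attach that zone instead.

```python
from datetime import datetime, timezone

# strptime without %z returns a naive datetime (tzinfo is None)
naive = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')

# Attach UTC without shifting the clock time (assumption: the input is in UTC)
aware = naive.replace(tzinfo=timezone.utc)

print(naive.tzinfo)
print(aware.isoformat())
```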
Create a small utility function like:
def date(datestr="", format="%Y-%m-%d"):
    from datetime import datetime
    if not datestr:
        return datetime.today().date()
    return datetime.strptime(datestr, format).date()
This is versatile enough:
If you don't pass any arguments it will return today's date.
There's a date format as default that you can override.
You can easily modify it to return a datetime.
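As a sketch of the last point, a variant that returns a datetime instead of a date could look like this. The name to_datetime and the parameter name fmt are chosen here for illustration and are not part of the original answer.

```python
from datetime import datetime

def to_datetime(datestr="", fmt="%Y-%m-%d"):
    """Parse datestr with fmt; return the current datetime when datestr is empty."""
    if not datestr:
        return datetime.now()
    return datetime.strptime(datestr, fmt)

print(to_datetime("2016-03-04"))
print(to_datetime("04/03/2016 18:30", fmt="%d/%m/%Y %H:%M"))
```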
This would be helpful for converting a string to datetime and also with a time zone:
def convert_string_to_time(date_string, timezone):
    from datetime import datetime
    import pytz
    date_time_obj = datetime.strptime(date_string[:26], '%Y-%m-%d %H:%M:%S.%f')
    date_time_obj_timezone = pytz.timezone(timezone).localize(date_time_obj)
    return date_time_obj_timezone
date = '2018-08-14 13:09:24.543953+00:00'
TIME_ZONE = 'UTC'
date_time_obj_timezone = convert_string_to_time(date, TIME_ZONE)
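If the string already carries a UTC offset, as this sample does, a standard-library alternative is to parse the offset with %z rather than slicing it off. This is a sketch assuming Python 3.7 or later, where %z accepts offsets written with a colon such as +00:00.

```python
from datetime import datetime, timedelta

date_string = '2018-08-14 13:09:24.543953+00:00'

# %f reads the microseconds and %z reads the trailing UTC offset,
# so the result is timezone-aware without a separate localize step
aware = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f%z')

print(aware.utcoffset())
```

Note that this keeps whatever offset the string states, whereas the function above discards it and applies the zone you pass in.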
arrow offers many useful functions for dates and times. This bit of code provides an answer to the question and shows that arrow is also capable of formatting dates easily and displaying information for other locales.
>>> import arrow
>>> dateStrings = [ 'Jun 1 2005 1:33PM', 'Aug 28 1999 12:00AM' ]
>>> for dateString in dateStrings:
...     dateString
...     arrow.get(dateString.replace(' ',' '), 'MMM D YYYY H:mmA').datetime
...     arrow.get(dateString.replace(' ',' '), 'MMM D YYYY H:mmA').format('ddd, Do MMM YYYY HH:mm')
...     arrow.get(dateString.replace(' ',' '), 'MMM D YYYY H:mmA').humanize(locale='de')
...
'Jun 1 2005 1:33PM'
datetime.datetime(2005, 6, 1, 13, 33, tzinfo=tzutc())
'Wed, 1st Jun 2005 13:33'
'vor 11 Jahren'
'Aug 28 1999 12:00AM'
datetime.datetime(1999, 8, 28, 0, 0, tzinfo=tzutc())
'Sat, 28th Aug 1999 00:00'
'vor 17 Jahren'
See http://arrow.readthedocs.io/en/latest/ for more.
You can also check out dateparser:
dateparser provides modules to easily parse localized dates in almost
any string formats commonly found on web pages.
Install:
pip install dateparser
This is, I think, the easiest way you can parse dates.
The most straightforward way is to use the dateparser.parse function,
that wraps around most of the functionality in the module.
Sample code:
import dateparser
t1 = 'Jun 1 2005 1:33PM'
t2 = 'Aug 28 1999 12:00AM'
dt1 = dateparser.parse(t1)
dt2 = dateparser.parse(t2)
print(dt1)
print(dt2)
Output:
2005-06-01 13:33:00
1999-08-28 00:00:00
You can use easy_date to make it easy:
import date_converter
converted_date = date_converter.string_to_datetime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
If you want only date format then you can manually convert it by passing your individual fields like:
>>> import datetime
>>> date = datetime.date(int('2017'),int('12'),int('21'))
>>> date
datetime.date(2017, 12, 21)
>>> type(date)
<class 'datetime.date'>
You can also split a string and pass the pieces to build a date:
selected_month_rec = '2017-09-01'
year, month, day = selected_month_rec.split('-')
date_format = datetime.date(int(year), int(month), int(day))
The result is a datetime.date object.
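For strings that are already in yyyy-mm-dd form, the standard library can do the splitting for you. A minimal sketch, assuming Python 3.7 or later:

```python
from datetime import date

selected_month_rec = '2017-09-01'

# date.fromisoformat parses 'YYYY-MM-DD' directly and raises ValueError on bad input
parsed = date.fromisoformat(selected_month_rec)

print(parsed, type(parsed))
```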
Similar to Javed's answer, I just wanted date from string - so combining Simon's and Javed's logic, we get:
from dateutil import parser
import datetime
s = '2021-03-04'
parser.parse(s).date()
Output
datetime.date(2021, 3, 4)
It seems using pandas Timestamp is the fastest:
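import pandas as pd
N = 1000
l = ['Jun 1 2005 1:33PM'] * N
format = '%b %d %Y %I:%M%p'  # not defined in the original snippet; assumed to be the format used elsewhere in this thread
list(pd.to_datetime(l, format=format))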
%timeit _ = list(pd.to_datetime(l, format=format))
1.58 ms ± 21.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Other solutions
from datetime import datetime
%timeit _ = list(map(lambda x: datetime.strptime(x, format), l))
9.41 ms ± 95.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
from dateutil.parser import parse
%timeit _ = list(map(lambda x: parse(x), l))
73.8 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
If the string is an ISO 8601 string, please use ciso8601:
import ciso8601
l = ['2014-01-09'] * N
%timeit _ = list(map(lambda x: ciso8601.parse_datetime(x), l))
186 µs ± 4.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
If you don't want to explicitly specify the format of your date string, you can use this hack to bypass that step:
from dateutil.parser import parse
# Function that'll guess the format and convert it into the python datetime format
def update_event(start_datetime=None, end_datetime=None, description=None):
    if start_datetime is not None:
        new_start_time = parse(start_datetime)
        return new_start_time
# Sample input dates in different formats
d = ['06/07/2021 06:40:23.277000', '06/07/2021 06:40', '06/07/2021']
new = [update_event(i) for i in d]
for date in new:
    print(date)
# Sample output dates in Python datetime object
# 2021-06-07 06:40:23.277000
# 2021-06-07 06:40:00
# 2021-06-07 00:00:00
If you want to convert it into some other datetime format, just modify the last line with the format you like, for example date.strftime('%Y/%m/%d %H:%M:%S.%f'):
from dateutil.parser import parse
def update_event(start_datetime=None, end_datetime=None, description=None):
    if start_datetime is not None:
        new_start_time = parse(start_datetime)
        return new_start_time
# Sample input dates in different formats
d = ['06/07/2021 06:40:23.277000', '06/07/2021 06:40', '06/07/2021']
# Passing the dates one by one through the function
new = [update_event(i) for i in d]
for date in new:
    print(date.strftime('%Y/%m/%d %H:%M:%S.%f'))
# Sample output dates in required Python datetime object
# 2021/06/07 06:40:23.277000
# 2021/06/07 06:40:00.000000
# 2021/06/07 00:00:00.000000
Try running the above snippet for better clarity.
See my answer.
In real-world data this is a real problem: multiple, mismatched, incomplete, inconsistent and multilanguage/region date formats, often mixed freely in one dataset. It's not ok for production code to fail, let alone go exception-happy like a fox.
We need to try multiple datetime formats fmt1, fmt2, ..., fmtn and suppress/handle the exceptions (from strptime()) for all those that mismatch (and in particular, avoid needing a yukky n-deep indented ladder of try..except clauses). From my solution:
from datetime import datetime
def try_strptime(s, fmts=('%d-%b-%y', '%m/%d/%Y')):
    for fmt in fmts:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:  # strptime raises ValueError on a format mismatch
            continue
    return None # or reraise the ValueError if no format matched, if you prefer
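A short usage sketch of this approach on mixed input. The helper is restated so the example runs on its own, catching only ValueError (the exception strptime raises on a mismatch); the sample strings are made up for illustration.

```python
from datetime import datetime

def try_strptime(s, fmts=('%d-%b-%y', '%m/%d/%Y')):
    """Return the first successful parse of s against fmts, or None."""
    for fmt in fmts:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    return None

samples = ['05-May-15', '12/31/2020', 'not a date']
results = [try_strptime(s) for s in samples]

for text, parsed in zip(samples, results):
    print(text, '->', parsed)
```

Returning None for unparseable values lets the caller decide whether to drop, log, or flag those rows rather than aborting the whole run.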
A short sample mapping a yyyy-mm-dd date string to a datetime.date object:
from datetime import date
date_from_yyyy_mm_dd = lambda δ : date(*[int(_) for _ in δ.split('-')])
date_object = date_from_yyyy_mm_dd('2021-02-15')
Use:
emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv")
emp.info()
It shows that the "Start Date" and "Last Login Time" columns are both of type object (that is, strings) in the DataFrame:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name 933 non-null object
Gender 855 non-null object
Start Date 1000 non-null object
Last Login Time 1000 non-null object
Salary 1000 non-null int64
Bonus % 1000 non-null float64
Senior Management 933 non-null object
Team 957 non-null object
dtypes: float64(1), int64(1), object(6)
memory usage: 62.6+ KB
By using the parse_dates option of read_csv, you can convert your string datetime columns into the pandas datetime format.
emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv", parse_dates=["Start Date", "Last Login Time"])
emp.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name 933 non-null object
Gender 855 non-null object
Start Date 1000 non-null datetime64[ns]
Last Login Time 1000 non-null datetime64[ns]
Salary 1000 non-null int64
Bonus % 1000 non-null float64
Senior Management 933 non-null object
Team 957 non-null object
dtypes: datetime64[ns](2), float64(1), int64(1), object(4)
memory usage: 62.6+ KB
#Convert String to datetime
>>> from datetime import datetime
>>> x=datetime.strptime('Jun 1 2005', '%b %d %Y')
>>> print(x,type(x))
2005-06-01 00:00:00 <class 'datetime.datetime'>
#Convert datetime to String (Reverse above process)
>>> y=x.strftime('%b %d %Y')
>>> print(y,type(y))
Jun 01 2005 <class 'str'>
