How did datetime decide '22' in %y is 2022 and not 1922? - python

Example code
from datetime import datetime

a = '27-10-80'
b = '27-10-22'
c = '27-10-50'
x = datetime.strptime(a, '%d-%m-%y')
print(x)
y = datetime.strptime(b, '%d-%m-%y')
print(y)
z = datetime.strptime(c, '%d-%m-%y')
print(z)
Output:
1980-10-27 00:00:00
2022-10-27 00:00:00
2050-10-27 00:00:00
How does datetime decide which century to use when it parses a year from a string with the '%y' format? Why did 80 become 1980 while 50 became 2050?

The Python documentation indicates that strftime/strptime follows the semantics of C
Most of the functions defined in this module call platform C library functions with the same name.
as well as documents the behaviour of %y:
Function strptime() can parse 2-digit years when given %y format code. When 2-digit years are parsed, they are converted according to the POSIX and ISO C standards: values 69–99 are mapped to 1969–1999, and values 0–68 are mapped to 2000–2068.
But we can confirm this behaviour by looking up STRPTIME(3), for %y:
the year within the current century. When a century is not otherwise specified, values in the range 69-99 refer to years in the twentieth century (1969 to 1999 inclusive); values in the range 00-68 refer to years in the twenty-first century (2000 to 2068 inclusive). Leading zeros are permitted but not required.
emphasis mine.
No reasoning is given for that cutoff, but it is reasonable to think that it stems from the Unix epoch being 1970, with a year of safety margin at the boundary.
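The cutoff described above is easy to check directly. This is a minimal sketch that parses two-digit years on both sides of the boundary; the `years` dictionary is only an illustrative name.

```python
from datetime import datetime

# Parse two-digit years on either side of the 68/69 pivot.
years = {text: datetime.strptime(text, "%y").year
         for text in ("00", "68", "69", "99")}

for text, year in years.items():
    print(text, "->", year)
# 00 -> 2000
# 68 -> 2068
# 69 -> 1969
# 99 -> 1999
```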

This is documented in a comment in _strptime.py:
if group_key == 'y':
    year = int(found_dict['y'])
    # Open Group specification for strptime() states that a %y
    #value in the range of [00, 68] is in the century 2000, while
    #[69,99] is in the century 1900
    if year <= 68:
        year += 2000
    else:
        year += 1900

From the documentation of the time module:
Function strptime() can parse 2-digit years when given %y format code. When 2-digit years are parsed, they are converted according to the POSIX and ISO C standards: values 69–99 are mapped to 1969–1999, and values 0–68 are mapped to 2000–2068.
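If the fixed 68/69 pivot does not suit the data (birth dates, for example), one possible workaround is to parse normally and then move any year that lands too far in the future back by a century. The sketch below assumes this approach; `parse_with_pivot` and `latest_year` are hypothetical names, not part of the standard library.

```python
from datetime import datetime

def parse_with_pivot(text, fmt, latest_year):
    """Parse text with strptime, then move years after latest_year back a century."""
    parsed = datetime.strptime(text, fmt)
    if parsed.year > latest_year:
        # replace() raises ValueError for 29 February when the earlier year
        # is not a leap year (for example 2000 -> 1900). That is the right
        # outcome, because the shifted date never existed.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed

print(parse_with_pivot("27-10-50", "%d-%m-%y", 2030))  # 1950-10-27 00:00:00
print(parse_with_pivot("27-10-22", "%d-%m-%y", 2030))  # 2022-10-27 00:00:00
```

The caller chooses `latest_year`, so the cutoff can follow the data set instead of the POSIX default.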

Related

Why is a datetime string format not reversible?

I expected datetime.strftime and datetime.strptime calls to be reversible, such that calling
datetime.strptime(datetime.now().strftime(fmt), fmt)
would give the closest reconstruction of now() (given the information preserved by the format).
However, this is not the case when formatting a date to a string with a YYYY-Week# format:
>>> yyyy_u = datetime.datetime(1992, 5, 17).strftime('%Y-%U')
>>> print(yyyy_u)
1992-20
Formatting the string back to a date does not give the expected response:
>>> datetime.datetime.strptime(yyyy_u, '%Y-%U')
datetime.datetime(1992, 1, 1, 0, 0)
I would have expected the response to be the first day of week 20 in 1992 (17 May 1992).
Is this a failure of the %U format option or more generally are datetime.strftime and datetime.strptime calls not meant to be reversible?
From the Python docs regarding strptime() behaviour:
When used with the strptime() method, %U and %W are only used in calculations when the day of the week and the year are specified.
The day of the week must be specified along with the week number and the year, for example with the format %Y-%U-%w:
datetime.datetime.strptime('1992-20-0', '%Y-%U-%w') returns the first day of week 20 of 1992.
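A short round-trip check of this, using the date from the question. It is a minimal sketch; the variable names are only illustrative.

```python
from datetime import datetime

fmt = "%Y-%U-%w"                     # year, week number (Sunday first), weekday
original = datetime(1992, 5, 17)     # a Sunday, so %w is 0
text = original.strftime(fmt)
restored = datetime.strptime(text, fmt)

print(text)      # 1992-20-0
print(restored)  # 1992-05-17 00:00:00
```

With the weekday included, the format carries enough information for strptime to reconstruct the date.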

How to get Year-Week format in ISO calendar format?

I am trying to get the current date in ISO calendar format as follows, with zero padding on the week number:
2019/W06
I tried the following, but prefer something using strftime as it is much easier to read.
print(str(datetime.datetime.today().isocalendar()[0]) + '/W' + str(datetime.datetime.today().isocalendar()[1]))
2019/W6
Use the following code:
print(datetime.now().strftime('%Y/W%V'))
%Y - Year with century as a decimal number.
%V - The ISO 8601 week number of the current year (01 to 53), where
week 1 is the first week that has at least 4 days in the current year,
and with Monday as the first day of the week.
https://docs.python.org/3.7/library/datetime.html#strftime-and-strptime-behavior
Solution with strftime:
If you want the zero padding:
datetime.date.today().strftime("%Y/W%V")
Output:
2019/W06
If you don't want it:
datetime.date.today().strftime("%Y/W%-V")
Output:
2019/W6
Note that "%V" returns the week number, and the "-" is what removes the leading zero. The "-" flag is a platform-specific extension: it works with glibc on Linux but not on Windows.
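One caveat with these answers: %Y is the calendar year, which can differ from the ISO 8601 year in the days around 1 January. strftime's %G directive gives the ISO year that pairs with %V. A platform-independent sketch using isocalendar() is shown below; `iso_year_week` is a hypothetical helper name.

```python
from datetime import date

def iso_year_week(d):
    """Format d as ISO 'YYYY/Www' with a zero-padded week number."""
    iso_year, iso_week, _ = d.isocalendar()
    return f"{iso_year}/W{iso_week:02d}"

print(iso_year_week(date(2019, 2, 4)))    # 2019/W06
print(iso_year_week(date(2018, 12, 31)))  # 2019/W01 (calendar year is 2018)
```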

How do I parse a date without zero padding, in the format (1 or 2-digit year)-(Month abbreviation)?

I need to parse a few dates that are roughly in the format (1 or 2-digit year)-(Month abbreviation), for example:
5-Jun (June 2005)
13-Jan (January 2013)
I tried using strptime with the format %b-%y but it did not consistently produce the desired date. Per the documentation, this is because some years in my dataset are not zero-padded.
Further, when I tested the dateutil parser (please see my code below) on the string "5-Jun", I got "2019-06-05" instead of the desired result (June 2005), even when I set yearfirst=True when calling parse.
from dateutil.parser import parse
parsed = parse("5-Jun",yearfirst=True)
print(parsed)
It is easier if single-digit years are padded with 0, because the string can then be parsed directly with a format. A regular expression is used here to replace any standalone single digit with its zero-padded value. I've used regex from here.
Sample code:
import re
from datetime import datetime

match_condn = r'\b([0-9])\b'
replace_str = r'0\1'
datetime.strptime(re.sub(match_condn, replace_str, '15-Jun'), '%y-%b').strftime("%B %Y")
Output:
June 2015
One approach is to use str.zfill
Ex:
import datetime

d = ["5-Jun", "13-Jan"]
for date in d:
    date, month = date.split("-")
    date = date.zfill(2)
    print(datetime.datetime.strptime(date+"-"+month, "%y-%b").strftime("%B %Y"))
Output:
June 2005
January 2013
Ah. I see from @Rakesh's answer what your data is about. I thought you needed to parse the full name of the month. So you had your two terms %b and %y backwards, but then you had the problem with the single-digit years. I get it now. Here's a much simpler way to get what you want if you can assume your dates are always in one of the two formats you indicate:
import time

inp = "5-Jun"
t = time.strptime(("0" + inp)[-6:], "%y-%b")
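The padding is needed because Python's strptime expects exactly two digits for %y. The sketch below shows the failure on an unpadded year and wraps the zfill approach in a helper; `parse_year_month` is a hypothetical name, and %b assumes English month abbreviations in the active locale.

```python
from datetime import datetime

def parse_year_month(text):
    """Parse 'Y-Mon' or 'YY-Mon' by zero-padding the year to two digits."""
    year, month = text.split("-")
    return datetime.strptime(f"{year.zfill(2)}-{month}", "%y-%b")

try:
    datetime.strptime("5-Jun", "%y-%b")   # one-digit year does not match %y
except ValueError as exc:
    print("unpadded year rejected:", exc)

for raw in ("5-Jun", "13-Jan"):
    print(parse_year_month(raw).strftime("%B %Y"))
# June 2005
# January 2013
```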

Datetime strftime formatting

I am converting the date column of my dataframe to strings in order to strip the time element when I write it to Excel. For some reason I can't seem to write the date in the format I need, which is 10/2/2016. I can get it to appear as 10/02/2016, but two issues arise: I need the day to be one digit rather than two, and the result is not in date order (it seems to sort on the month rather than on year, then month, then day).
Here is my code:
df8 = df.set_index('DATE').resample('W-WED').apply(pd.DataFrame.tail, n=1)
df8.index= df8.index.droplevel(0)
df8 = df8.reset_index('DATE', drop=False)
df8['DATE'] = pd.to_datetime(df8['DATE']).apply(lambda x:x.date().strftime('%m/%d/%Y'))
Sample data (this is what is showing with the above formatting)
DATE          Distance (cm)
01/02/2013    206.85
01/04/2012    315.33
01/05/2011    219.46
01/06/2016    180.44
01/07/2015    168.55
01/08/2014    156.89
You can use the .day attribute instead of the %d directive. It is an integer, so it carries no leading zero, which gives the desired date strings as shown:
pd.to_datetime(df8['DATE']).map(lambda x: '{}/{}/{}'.format(x.month, x.day, x.year))
Demo:
df = pd.DataFrame(dict(date=['2016/10/02', '2016/10/03',
'2016/10/04', '2016/10/05', '2016/10/06']))
>>> pd.to_datetime(df['date']).map(lambda x: '{}/{}/{}'.format(x.month, x.day, x.year))
0 10/2/2016
1 10/3/2016
2 10/4/2016
3 10/5/2016
4 10/6/2016
Name: date, dtype: object
EDIT based on sample data added:
In order for it to affect only the days and not the months, pad the left side of the .month value with zeros using str.zfill with a width of 2, so that single-digit months are left-padded with 0 and double-digit ones are left unchanged.
>>> pd.to_datetime(df['DATE']).map(lambda x: '{}/{}/{}'.format(str(x.month).zfill(2), x.day, x.year))
0 01/2/2013
1 01/4/2012
2 01/5/2011
3 01/6/2016
4 01/7/2015
5 01/8/2014
Name: DATE, dtype: object
You can drop the zero padding with the platform-specific "-" flag:
Code   Meaning                                           Example
%m     Month as a zero-padded decimal number.            09
%-m    Month as a decimal number. (Platform specific)    9
So use %-m instead of %m; the same flag applies to the day, with %-d instead of %d.
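Because the "-" flag is not available on every platform, a portable alternative is to build the string from the integer attributes. This is a minimal sketch; `format_mdy` and its `pad_month` parameter are hypothetical names.

```python
from datetime import date

def format_mdy(d, pad_month=False):
    """Return month/day/year with an unpadded day and an optionally padded month."""
    month = f"{d.month:02d}" if pad_month else str(d.month)
    return f"{month}/{d.day}/{d.year}"

print(format_mdy(date(2016, 10, 2)))                 # 10/2/2016
print(format_mdy(date(2013, 1, 2), pad_month=True))  # 01/2/2013
```

Strings in this format sort lexically by month first, so any sorting is best done while the column still holds datetime values.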

Strange timestamp conversion

I have these dates
2016-02-26 12:12:12
2016-02-friday 12:12:12
(Those two dates refer to the same day.)
If I convert the first one to a timestamp and then convert it back to a readable format, it works.
But if I try the same with the second one, it does not convert back to the right day!
Here's what I did:
import datetime
import time

sTimestamp = time.mktime(
    datetime.datetime.strptime(
        "2016-02-26 12:12:12",
        "%Y-%m-%d %H:%M:%S")
    .timetuple())
print("date from timestamp = " +
      datetime.datetime.fromtimestamp(int(sTimestamp))
      .strftime('%Y-%m-%d %H:%M:%S'))
sTimestamp = time.mktime(
    datetime.datetime.strptime(
        "2016-02-friday 12:12:12",
        "%Y-%m-%A %H:%M:%S")
    .timetuple())
print("date from timestamp = " +
      datetime.datetime.fromtimestamp(int(sTimestamp)).
      strftime('%Y-%m-%d %H:%M:%S'))
The output of those two statements is:
date from timestamp = 2016-02-26 12:12:12
date from timestamp = 2016-02-01 12:12:12
As you can see, the first one comes back as the 26th, but the second one comes back as the 1st for an unknown reason. And by the way, the 1st is a Monday...
For information, I am using Python 3.4 on Windows.
The first problem:
(Those two dates refer to the same day.)
No, they don't. The first one refers to the last Friday of February in the year 2016; the second refers to, at best, a Friday in February in the year 2016.
Further, strptime is meant to be used with numbers, because strings like "Friday" are not exact. The Python docs say:
For time objects, the format codes for year, month, and day should not be used, as time objects have no such values. If they’re used anyway, 1900 is substituted for the year, and 1 for the month and day.
So it looks like inexact values such as "Friday" trigger the same fallback, with the day defaulting to 1.
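The fallback can be reproduced without the timestamp round trip. This is a minimal sketch, assuming English weekday names in the active locale. It also shows that the weekday name is used once a week number (%U) accompanies the year.

```python
from datetime import datetime

# Weekday name without a day of month or week number: the day defaults to 1.
parsed = datetime.strptime("2016-02-friday 12:12:12", "%Y-%m-%A %H:%M:%S")
print(parsed)                 # 2016-02-01 12:12:12
print(parsed.strftime("%A"))  # Monday

# Year, week number and weekday together identify one specific date.
with_week = datetime.strptime("2016-08-Friday 12:12:12", "%Y-%U-%A %H:%M:%S")
print(with_week)              # 2016-02-26 12:12:12
```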
