Formatting string into datetime using Pandas - trouble with directives - python

I have a string that is the full year followed by the ISO week of the year (so some years have 53 weeks, because the week counting starts at the first full week of the year). I want to convert it to a datetime object using pandas.to_datetime(). So I do:
pandas.to_datetime('201145', format='%Y%W')
and it returns:
Timestamp('2011-01-01 00:00:00')
which is not right. Or if I try:
pandas.to_datetime('201145', format='%Y%V')
it tells me that %V is a bad directive.
What am I doing wrong?

I think that the following question would be useful to you: Reversing date.isocalender()
Using the functions provided in that question this is how I would proceed:
import datetime
import pandas as pd
def iso_year_start(iso_year):
    "The gregorian calendar date of the first day of the given ISO year"
    fourth_jan = datetime.date(iso_year, 1, 4)
    delta = datetime.timedelta(fourth_jan.isoweekday()-1)
    return fourth_jan - delta
def iso_to_gregorian(iso_year, iso_week, iso_day):
    "Gregorian calendar date for the given ISO year, week and day"
    year_start = iso_year_start(iso_year)
    return year_start + datetime.timedelta(days=iso_day-1, weeks=iso_week-1)
def time_stamp(yourString):
    # Split 'YYYYWW' into ISO year and week; default to Monday (day 1)
    year = int(yourString[0:4])
    week = int(yourString[-2:])
    day = 1
    return year, week, day
yourTimeStamp = iso_to_gregorian(*time_stamp('201145'))
print(yourTimeStamp)
Then run that function for your values and append them as date time objects to the dataframe.
The result I got from your specified string was:
2011-11-07
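On Python 3.6 and later, the standard library can parse this kind of string directly with the ISO directives, so the helper functions are not strictly needed. A minimal sketch, assuming the input is always a four-digit ISO year followed by a two-digit ISO week; the variable names are only illustrative. Recent pandas versions are reported to accept the same directives in to_datetime, but treat that as an assumption to check against your installed version.

```python
from datetime import date, datetime

raw = "201145"  # ISO year followed by ISO week

# Appending "1" selects Monday; %G, %V and %u are the ISO year, week and weekday.
monday = datetime.strptime(raw + "1", "%G%V%u").date()

# Equivalent construction without string parsing (Python 3.8+).
same_monday = date.fromisocalendar(int(raw[:4]), int(raw[4:]), 1)

print(monday, same_monday)
```

Both expressions give the Monday of ISO week 45 of 2011, matching the result above.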

Related

Extraction of day, month and year from the date 4.5.6

Which function can I use to extract day, month and year from dates written in this manner 4.5.6 where 4 is the day, 5 is the month and 6 is the year (presumably 2006). I have already tried using dateparser.parse but it is not working.
day, month, year = map(int, '4.5.6'.split('.'))
And then add 2000 as necessary to the year.
You can then construct a datetime object with
from datetime import datetime
dt = datetime(year, month, day)
While it would be logical to use datetime.strptime, the one-digit year messes things up, and the above will just work fine.
Here is how you can use the datetime.datetime.strptime() method:
import datetime
s = "4.5.6"
i = s.rindex('.') + 1
s = s[:i] + s[i:].rjust(2, '0') # Add padding to year
dt = datetime.datetime.strptime(s, "%d.%m.%y")
print(dt)
Output:
2006-05-04 00:00:00
With the resulting datetime.datetime object, you can access plenty of information about the date, for example you can get the year by printing dt.year (outputs 2006).

Calculate Last Friday of Month in Pandas

I've written this function to get the last Thursday of the month
def last_thurs_date(date):
    month=date.dt.month
    year=date.dt.year
    cal = calendar.monthcalendar(year, month)
    last_thurs_date = cal[4][4]
    if month < 10:
        thurday_date = str(year)+'-0'+ str(month)+'-' + str(last_thurs_date)
    else:
        thurday_date = str(year) + '-' + str(month) + '-' + str(last_thurs_date)
    return thurday_date
But it's not working with the lambda function.
datelist['Date'].map(lambda x: last_thurs_date(x))
Where datelist is
datelist = pd.DataFrame(pd.date_range(start = pd.to_datetime('01-01-2014',format='%d-%m-%Y')
, end = pd.to_datetime('06-03-2019',format='%d-%m-%Y'),freq='D').tolist()).rename(columns={0:'Date'})
datelist['Date']=pd.to_datetime(datelist['Date'])
Jpp already added the solution, but just to add a slightly more readable formatted string - see this awesome website.
import calendar
def last_thurs_date(date):
    year, month = date.year, date.month
    cal = calendar.monthcalendar(year, month)
    # cal[4] is the fifth week row; with the default Monday-first calendar,
    # column 4 is Friday (Thursday would be column 3, calendar.THURSDAY).
    # When that cell is 0, fall back to the fourth row (February exception).
    last_thurs_date = cal[4][4] if cal[4][4] > 0 else cal[3][4]
    return f'{year}-{month:02d}-{last_thurs_date}'
Also added a bit of logic - e.g. you got 2019-02-0 as February doesn't have 4 full weeks.
Scalar datetime objects don't have a dt accessor, series do: see pd.Series.dt. If you remove this, your function works fine. The key is understanding that pd.Series.apply passes scalars to your custom function via a loop, not an entire series.
def last_thurs_date(date):
    month = date.month
    year = date.year
    cal = calendar.monthcalendar(year, month)
    last_thurs_date = cal[4][4]
    if month < 10:
        thurday_date = str(year)+'-0'+ str(month)+'-' + str(last_thurs_date)
    else:
        thurday_date = str(year) + '-' + str(month) + '-' + str(last_thurs_date)
    return thurday_date
You can rewrite your logic more succinctly via f-strings (Python 3.6+) and a ternary statement:
def last_thurs_date(date):
    month = date.month
    year = date.year
    last_thurs_date = calendar.monthcalendar(year, month)[4][4]
    return f'{year}{"-0" if month < 10 else "-"}{month}-{last_thurs_date}'
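Indexing a fixed row such as cal[4] depends on how many week rows the month has: some months have only four rows (so cal[4] raises IndexError) and others have six. A minimal sketch that avoids picking a row, assuming the default Monday-first calendar; last_weekday_of_month is a hypothetical helper name, not part of the answers above.

```python
import calendar
from datetime import date

def last_weekday_of_month(year, month, weekday=calendar.THURSDAY):
    """Return the date of the last given weekday in a month."""
    # monthcalendar() yields one row per week; days outside the month are 0,
    # so the largest value in the weekday column is the last occurrence.
    day = max(week[weekday] for week in calendar.monthcalendar(year, month))
    return date(year, month, day)

print(last_weekday_of_month(2019, 2))                   # last Thursday
print(last_weekday_of_month(2019, 2, calendar.FRIDAY))  # last Friday
```

With a pandas column this could be applied per element, for example datelist['Date'].map(lambda d: last_weekday_of_month(d.year, d.month)).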
I know that a lot of time has passed since the date of this post, but I think it would be worth adding another option if someone came across this thread
Even though I use pandas every day at work, in this case my suggestion would be to just use the dateutil library. The solution is a simple one-liner, without unnecessary combinations.
from dateutil.rrule import rrule, MONTHLY, FR, SA
from datetime import datetime as dt
import pandas as pd
# monthly options expiration dates calculated for 2022
monthly_options = list(rrule(MONTHLY, count=12, byweekday=FR, bysetpos=3, dtstart=dt(2022,1,1)))
# last Saturday of the month
last_saturdays = list(rrule(MONTHLY, count=12, byweekday=SA, bysetpos=-1, dtstart=dt(2022,1,1)))
and then of course:
pd.DataFrame({'LAST_ST':last_saturdays}) #or whatever you need
This answers the question Calculate Last Friday of Month in Pandas.
This can be modified by selecting the appropriate day of the week, here freq='W-FRI'.
I think the easiest way is to create a pandas.DataFrame using pandas.date_range and specifying freq='W-FRI'.
W-FRI is Weekly Fridays
pd.date_range(df.Date.min(), df.Date.max(), freq='W-FRI')
Creates all the Fridays in the date range between the min and max of the dates in df
Use a .groupby on year and month, and select .last(), to get the last Friday of every month for every year in the date range.
Because this method finds all the Fridays for every month in the range and then chooses .last() for each month, there's not an issue with trying to figure out which week of the month has the last Friday.
With this, use pandas: Boolean Indexing to find values in the Date column of the dataframe that are in last_fridays_in_daterange.
Use the .isin method to determine containment.
pandas: DateOffset objects
import pandas as pd
# test data: given a dataframe with a datetime column
df = pd.DataFrame({'Date': pd.date_range(start=pd.to_datetime('2014-01-01'), end=pd.to_datetime('2020-08-31'), freq='D')})
# create a dataframe with all Fridays in the date range between the min and max of df.Date
fridays = pd.DataFrame({'datetime': pd.date_range(df.Date.min(), df.Date.max(), freq='W-FRI')})
# use groupby and last, to get the last Friday of each month into a list
last_fridays_in_daterange = fridays.groupby([fridays.datetime.dt.year, fridays.datetime.dt.month]).last()['datetime'].tolist()
# find the data for the last Friday of the month
df[df.Date.isin(last_fridays_in_daterange)]

Python Date Index: finding the closest date a year ago from today

I have a panda dataframe (stock prices) with an index in a date format. It is daily but only for working days.
I basically try to compute some price performance YTD and from a year ago.
To get the first date of the actual year in my dataframe I used the following method:
today = str(datetime.date.today())
curr_year = int(today[:4])
curr_month = int(today[5:7])
first_date_year = (df[str(curr_year)].first_valid_index())
Now I try to get the closest date a year ago (exactly one year from the last_valid_index()). I could extract the month and the year but then it wouldn't be as precise. Any suggestion ?
Thanks
Since you didn't provide any data, I am assuming that you have a list of dates (string types) like the following:
dates = ['11/01/2016', '12/01/2016', '02/01/2017', '03/01/2017']
You then need to transform that into datetime format, I would suggest using pandas:
pd_dates = pd.to_datetime(dates)
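Then you have to define today and one year ago. I would suggest using datetime for that (note that the second line raises ValueError when today is February 29):
from datetime import datetime
today = datetime.today()
date_1yr_ago = datetime(today.year-1, today.month, today.day)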
Lastly, you slice the date list for dates larger than the date_1yr_ago value and get the first value of that slice:
pd_dates[pd_dates > date_1yr_ago][0]
This will return the first date that is larger than the 1 year ago date.
output:
Timestamp('2017-02-01 00:00:00')
You can convert that datetime value to string with the following code:
datetime.strftime(pd_dates[pd_dates > date_1yr_ago][0], '%Y/%m/%d')
output:
'2017/02/01'
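If the index is already a sorted DatetimeIndex, the nearest date to "one year before the last date" can also be looked up by position. A rough sketch under stated assumptions: the price series below is made-up data on business days, and prices, target and closest_date are illustrative names only.

```python
import pandas as pd

# Hypothetical price series on business days only
index = pd.bdate_range("2016-01-01", "2017-12-25")
prices = pd.Series(range(len(index)), index=index, dtype="float64")

last_date = prices.index[-1]                 # most recent date in the data
target = last_date - pd.DateOffset(years=1)  # same calendar day, one year earlier

# Position of the index entry nearest to the target date (index must be sorted)
position = prices.index.get_indexer([target], method="nearest")[0]
closest_date = prices.index[position]

print(target.date(), "->", closest_date.date())
```

Here the target falls on a Sunday, so the nearest available business day is chosen instead. Unlike the slice above, this can pick a date slightly before the target when that one is closer.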

Get date from week number

Please what's wrong with my code:
import datetime
d = "2013-W26"
r = datetime.datetime.strptime(d, "%Y-W%W")
print(r)
It displays "2013-01-01 00:00:00". Thanks.
A week number is not enough to generate a date; you need a day of the week as well. Add a default:
import datetime
d = "2013-W26"
r = datetime.datetime.strptime(d + '-1', "%Y-W%W-%w")
print(r)
The -1 and -%w pattern tells the parser to pick the Monday in that week. This outputs:
2013-07-01 00:00:00
%W uses Monday as the first day of the week. While you can pick your own weekday, you may get unexpected results if you deviate from that.
See the strftime() and strptime() behaviour section in the documentation, footnote 4:
When used with the strptime() method, %U and %W are only used in calculations when the day of the week and the year are specified.
Note, if your week number is a ISO week date, you'll want to use %G-W%V-%u instead! Those directives require Python 3.6 or newer.
In Python 3.8 there is the handy datetime.date.fromisocalendar:
>>> from datetime import date
>>> date.fromisocalendar(2020, 1, 1) # (year, week, day of week)
datetime.date(2019, 12, 30)
In older Python versions (3.7 and earlier), the calculation can use the information from datetime.date.isocalendar to work out the Monday of an ISO 8601 week:
from datetime import date, timedelta
def monday_of_calenderweek(year, week):
    first = date(year, 1, 1)
    base = 1 if first.isocalendar()[1] == 1 else 8
    return first + timedelta(days=base - first.isocalendar()[2] + 7 * (week - 1))
Both also work with datetime.datetime.
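As a sanity check, the manual calculation can be compared with date.fromisocalendar for every ISO week over a range of years. A minimal sketch, assuming Python 3.8+; the function from the answer is repeated so the block runs on its own, and weeks_in_iso_year is a hypothetical helper added only for this check.

```python
from datetime import date, timedelta

def monday_of_calenderweek(year, week):
    first = date(year, 1, 1)
    base = 1 if first.isocalendar()[1] == 1 else 8
    return first + timedelta(days=base - first.isocalendar()[2] + 7 * (week - 1))

def weeks_in_iso_year(year):
    # December 28 always falls in the last ISO week of its year
    return date(year, 12, 28).isocalendar()[1]

# Collect every (year, week) where the two approaches disagree
mismatches = [
    (year, week)
    for year in range(2000, 2031)
    for week in range(1, weeks_in_iso_year(year) + 1)
    if monday_of_calenderweek(year, week) != date.fromisocalendar(year, week, 1)
]
print(mismatches)
```

An empty list means the two approaches agree for every week in the tested range.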
To complete the other answers - if you are using ISO week numbers, this string is appropriate (to get the Monday of a given ISO week number):
import datetime
d = '2013-W26'
r = datetime.datetime.strptime(d + '-1', '%G-W%V-%u')
print(r)
%G, %V, %u are ISO equivalents of %Y, %W, %w, so this outputs:
2013-06-24 00:00:00
Available in Python 3.6+; from the docs.
import datetime
res = datetime.datetime.strptime("2018 W30 w1", "%Y W%W w%w")
print(res)
Using 1 as the weekday yields the start (Monday) of that week; adding timedelta(days=6) to the result gives the end of the week. The code above prints:
2018-07-23 00:00:00
If anyone is looking for a simple function that returns all working days (Mo-Fr) dates from a week number consider this (based on accepted answer)
import datetime
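def weeknum_to_dates(weeknum):
    # x runs from -5 to -1, so str(x) supplies both the '-' separator and the weekday digit (5 = Friday ... 1 = Monday)
    return [datetime.datetime.strptime("2021-W"+ str(weeknum) + str(x), "%Y-W%W-%w").strftime('%d.%m.%Y') for x in range(-5,0)]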
weeknum_to_dates(37)
Output:
['17.09.2021', '16.09.2021', '15.09.2021', '14.09.2021', '13.09.2021']
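For ISO week numbers, the same list can be built without string parsing. A minimal sketch, assuming Python 3.8+ and that the week number follows ISO 8601 (which can differ from %W in some years); iso_week_workdays is a hypothetical name.

```python
from datetime import date

def iso_week_workdays(year, week):
    """Monday to Friday of the given ISO week, as formatted strings."""
    return [
        date.fromisocalendar(year, week, weekday).strftime("%d.%m.%Y")
        for weekday in range(1, 6)  # 1 = Monday ... 5 = Friday
    ]

print(iso_week_workdays(2021, 37))
```

This returns the days in ascending order, Monday first.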
If you have the week number within the year, you can add that many weeks to the first day of the year. Note that this counts 7-day blocks from January 1 and is not aligned to Monday-based or ISO weeks.
>>> import datetime
>>> from dateutil.relativedelta import relativedelta
>>> week = 40
>>> year = 2019
>>> date = datetime.date(year,1,1)+relativedelta(weeks=+week)
>>> date
datetime.date(2019, 10, 8)
Another solution which worked for me that accepts series data as opposed to strptime only accepting single string values:
#fw_to_date
import datetime
import pandas as pd
# fw is input in format 'YYYY-WW'
# Add weekday number to string 1 = Monday
fw = fw + '-1'
# dt is output column
# Use %G-%V-%w if input is in ISO format
dt = pd.to_datetime(fw, format='%Y-%W-%w', errors='coerce')
Here's a handy function including the issue with zero-week.

How to get week number in Python?

How can I find out which week of the current year June 16th falls in (wk24) with Python?
datetime.date has a isocalendar() method, which returns a tuple containing the calendar week:
>>> import datetime
>>> datetime.date(2010, 6, 16).isocalendar()[1]
24
datetime.date.isocalendar() is an instance-method returning a tuple containing year, weeknumber and weekday in respective order for the given date instance.
In Python 3.9+ isocalendar() returns a namedtuple with the fields year, week and weekday which means you can access the week explicitly using a named attribute:
>>> import datetime
>>> datetime.date(2010, 6, 16).isocalendar().week
24
You can get the week number directly from datetime as string.
>>> import datetime
>>> datetime.date(2010, 6, 16).strftime("%V")
'24'
Also you can get different "types" of the week number of the year changing the strftime parameter for:
%U - Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0. Examples: 00, 01, …, 53
%W - Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. Examples: 00, 01, …, 53
[...]
(Added in Python 3.6, backported to some distribution's Python 2.7's) Several additional directives not required by the C89 standard are included for convenience. These parameters all correspond to ISO 8601 date values. These may not be available on all platforms when used with the strftime() method.
[...]
%V - ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4. Examples: 01, 02, …, 53
from: datetime — Basic date and time types — Python 3.7.3 documentation
I've found out about it from here. It worked for me in Python 2.7.6
I believe date.isocalendar() is going to be the answer. This article explains the math behind ISO 8601 Calendar. Check out the date.isocalendar() portion of the datetime page of the Python documentation.
>>> import datetime
>>> dt = datetime.date(2010, 6, 16)
>>> wk = dt.isocalendar()[1]
>>> wk
24
.isocalendar() returns a 3-tuple (year, week number, weekday): dt.isocalendar()[0] is the year, dt.isocalendar()[1] the week number, and dt.isocalendar()[2] the weekday. Simple as can be.
There are many systems for week numbering. The following are the most common systems simply put with code examples:
ISO: First week starts with Monday and must contain the January 4th (or first Thursday of the year). The ISO calendar is already implemented in Python:
>>> from datetime import date
>>> date(2014, 12, 29).isocalendar()[:2]
(2015, 1)
North American: First week starts with Sunday and must contain the January 1st. The following code is my modified version of Python's ISO calendar implementation for the North American system:
from datetime import date
def week_from_date(date_object):
    date_ordinal = date_object.toordinal()
    year = date_object.year
    week = ((date_ordinal - _week1_start_ordinal(year)) // 7) + 1
    if week >= 52:
        if date_ordinal >= _week1_start_ordinal(year + 1):
            year += 1
            week = 1
    return year, week
def _week1_start_ordinal(year):
    jan1 = date(year, 1, 1)
    jan1_ordinal = jan1.toordinal()
    jan1_weekday = jan1.weekday()
    week1_start_ordinal = jan1_ordinal - ((jan1_weekday + 1) % 7)
    return week1_start_ordinal
>>> from datetime import date
>>> week_from_date(date(2014, 12, 29))
(2015, 1)
MMWR (CDC): First week starts with Sunday and must contain the January 4th (or first Wednesday of the year). I created the epiweeks package specifically for this numbering system (also has support for the ISO system). Here is an example:
>>> from datetime import date
>>> from epiweeks import Week
>>> Week.fromdate(date(2014, 12, 29))
(2014, 53)
Here's another option:
import time
d = time.strptime("16 Jun 2010", "%d %b %Y")
print(time.strftime('%U', d))  # time.strftime takes the format first, then the struct_time
which prints 24.
See: http://docs.python.org/library/datetime.html#strftime-and-strptime-behavior
The ISO week suggested by others is a good one, but it might not fit your needs. It assumes each week begins with a Monday, which leads to some interesting anomalies at the beginning and end of the year.
If you'd rather use a definition that says week 1 is always January 1 through January 7, regardless of the day of the week, use a derivation like this:
>>> testdate=datetime.datetime(2010,6,16)
>>> print(((testdate - datetime.datetime(testdate.year,1,1)).days // 7) + 1)
24
Generally to get the current week number (starts from Sunday):
from datetime import datetime
today = datetime.today()
print(today.strftime("%U"))
For the integer value of the instantaneous week of the year try:
import datetime
datetime.datetime.utcnow().isocalendar()[1]
If you are only using the isocalendar week number across the board the following should be sufficient:
import datetime
week = datetime.date(year=2014, month=1, day=1).isocalendar()[1]
This retrieves the second member of the tuple returned by isocalendar for our week number.
However, if you are going to be using date functions that deal in the Gregorian calendar, isocalendar alone will not work! Take the following example:
import datetime
date = datetime.datetime.strptime("2014-1-1", "%Y-%W-%w")
week = date.isocalendar()[1]
The string here says to return the Monday of the first week in 2014 as our date. When we use isocalendar to retrieve the week number here, we would expect to get the same week number back, but we don't. Instead we get a week number of 2. Why?
With the %W directive (what this answer calls the Gregorian week), week 1 starts on the first Monday of the year and any earlier days belong to week 0. In the ISO calendar, week 1 is the first week containing a Thursday. The partial week at the beginning of 2014 contains a Thursday, so it is week 1 by isocalendar, which puts our date in week 2.
If we want to get the Gregorian week, we will need to convert from the isocalendar to the Gregorian. Here is a simple function that does the trick.
import datetime
def gregorian_week(date):
    # The isocalendar week for this date
    iso_week = date.isocalendar()[1]
    # The baseline Gregorian date for the beginning of our date's year
    base_greg = datetime.datetime.strptime('%d-1-1' % date.year, "%Y-%W-%w")
    # If the isocalendar week for this date is not 1, we need to
    # decrement the iso_week by 1 to get the Gregorian week number
    return iso_week if base_greg.isocalendar()[1] == 1 else iso_week - 1
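The Monday-based %W number can also be read straight from strftime, which makes it easy to compare the two conventions side by side. A minimal sketch; week_numbers is a hypothetical helper name, and the dates are chosen to sit near a year boundary.

```python
from datetime import date

def week_numbers(d):
    """Return (%W week, ISO week) for a date, both as integers."""
    return int(d.strftime("%W")), d.isocalendar()[1]

for d in (date(2014, 1, 1), date(2014, 1, 6), date(2021, 1, 3)):
    print(d, week_numbers(d))
```

Days before the first Monday of the year are week 0 under %W, while ISO assigns them to week 1 of the new year or to the last week of the previous one.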
I found these to be the quickest way to get the week number; all of the variants.
from datetime import datetime
dt = datetime(2021, 1, 3) # Date is January 3rd 2021 (Sunday), year starts with Friday
dt.strftime("%W") # '00'; Monday is considered first day of week, Sunday is the last day of the week which started in the previous year
dt.strftime("%U") # '01'; Sunday is considered first day of week
dt.strftime("%V") # '53'; ISO week number; result is '53' since there is no Thursday in this year's part of the week
Further clarification for %V can be found in the Python doc:
The ISO year consists of 52 or 53 full weeks, and where a week starts on a Monday and ends on a Sunday. The first week of an ISO year is the first (Gregorian) calendar week of a year containing a Thursday. This is called week number 1, and the ISO year of that Thursday is the same as its Gregorian year.
https://docs.python.org/3/library/datetime.html#datetime.date.isocalendar
NOTE: Bear in mind the return value is a string, so pass the result to an int constructor if you need a number.
I summarize the discussion in two steps:
Convert the raw format to a datetime object.
Use the function of a datetime object or a date object to calculate the week number.
Warm up
from datetime import datetime, date, time
d = date(2005, 7, 14)
t = time(12, 30)
dt = datetime.combine(d, t)
print(dt)
1st step
To manually generate a datetime object, we can use datetime.datetime(2017,5,3) or datetime.datetime.now().
But in reality, we usually need to parse an existing string. We can use the strptime function, such as datetime.strptime('2017-5-3','%Y-%m-%d'), in which you have to specify the format. Details of the different format codes can be found in the official documentation.
Alternatively, a more convenient way is to use the dateparser module. Examples are dateparser.parse('16 Jun 2010'), dateparser.parse('12/2/12') or dateparser.parse('2017-5-3')
The above two approaches will return a datetime object.
2nd step
Use the obtained datetime object to call strftime(format). For example,
dt = datetime.strptime('2017-01-1','%Y-%m-%d') # return a datetime object. This day is Sunday
print(dt.strftime("%W")) # '00' Monday as the 1st day of the week. All days in a new year preceding the 1st Monday are considered to be in week 0.
print(dt.strftime("%U")) # '01' Sunday as the 1st day of the week. All days in a new year preceding the 1st Sunday are considered to be in week 0.
print(dt.strftime("%V")) # '52' Monday as the 1st day of the week. Week 01 is the week containing Jan 4.
It's very tricky to decide which format to use. A better way is to get a date object and call isocalendar(). For example,
dt = datetime.strptime('2017-01-1','%Y-%m-%d') # return a datetime object
d = dt.date() # convert to a date object; equivalent to d = date(2017,1,1), with the parsing done by datetime.strptime
year, week, weekday = d.isocalendar()
print(year, week, weekday) # 2016 52 7 in the ISO standard
In reality, you will be more likely to use date.isocalendar() to prepare a weekly report, especially in the Christmas-New Year shopping season.
You can try %W directive as below:
d = datetime.datetime.strptime('2016-06-16','%Y-%m-%d')
print(datetime.datetime.strftime(d,'%W'))
'%W': Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. (00, 01, ..., 53)
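For pandas users, if you want to get a column of week numbers (Series.dt.week was deprecated in pandas 1.1 and removed in 2.0, so use dt.isocalendar().week there):
df['weekofyear'] = df['Date'].dt.isocalendar().week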
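Near a year boundary the ISO week belongs to a different year than the calendar date, so it can help to keep the ISO year next to the week number. A minimal sketch, assuming pandas 1.1 or later; the two sample dates and the iso_year column name are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Date": pd.to_datetime(["2010-06-16", "2014-12-29"])})

# dt.isocalendar() returns a DataFrame with the columns year, week and day
iso = df["Date"].dt.isocalendar()
df["iso_year"] = iso["year"]
df["weekofyear"] = iso["week"]

print(df)
```

The second date is in calendar year 2014 but in ISO week 1 of 2015.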
isocalendar() returns the ISO year and week number, which can differ from the calendar year for dates near a year boundary and may not be what you expect:
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime as dt
>>> myDateTime = dt.datetime.strptime("20141229T000000.000Z",'%Y%m%dT%H%M%S.%fZ')
>>> yr,weekNumber,weekDay = myDateTime.isocalendar()
>>> print "Year is " + str(yr) + ", weekNumber is " + str(weekNumber)
Year is 2015, weekNumber is 1
Compare with Mark Ransom's approach:
>>> yr = myDateTime.year
>>> weekNumber = ((myDateTime - dt.datetime(yr,1,1)).days/7) + 1
>>> print "Year is " + str(yr) + ", weekNumber is " + str(weekNumber)
Year is 2014, weekNumber is 52
Let's say you need to have a week combined with the year of the current day as a string.
import datetime
year,week = datetime.date.today().isocalendar()[:2]
week_of_the_year = f"{year}-{week}"
print(week_of_the_year)
You might get something like 2021-28
If you want to change the first day of the week you can make use of the calendar module. Note, however, that calendar.setfirstweekday() only affects the calendar module's own functions; isocalendar() always uses Monday-based ISO weeks, so the calls below return the ISO week regardless of that setting.
import calendar
import datetime
sweek = '2021-01-01'  # example input date string
calendar.setfirstweekday(calendar.WEDNESDAY)
isodate = datetime.datetime.strptime(sweek,"%Y-%m-%d").isocalendar()
week_of_year = isodate[1]
For example, calculate the sprint number for a week starting on WEDNESDAY:
def calculate_sprint(sweek):
    calendar.setfirstweekday(calendar.WEDNESDAY)
    isodate=datetime.datetime.strptime(sweek,"%Y-%m-%d").isocalendar()
    return "{year}-{week}".format(year=isodate[0], week=isodate[1])
calculate_sprint('2021-01-01')
>>>'2020-53'
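To actually number weeks that start on a different weekday, the week number has to be computed by hand. A rough sketch under an assumed convention that is not taken from the answer above: week 1 is the (possibly partial) week containing January 1, and weeks begin on first_weekday. The name week_of_year and its parameters are hypothetical.

```python
from datetime import date

def week_of_year(d, first_weekday=2):
    """Week number of d when weeks start on first_weekday (0 = Monday, 2 = Wednesday).

    Assumed convention: week 1 is the (possibly partial) week containing January 1.
    """
    jan1 = date(d.year, 1, 1)
    # Days between the start of the week containing January 1 and January 1 itself
    offset = (jan1.weekday() - first_weekday) % 7
    day_of_year = d.timetuple().tm_yday
    return (day_of_year - 1 + offset) // 7 + 1

print(week_of_year(date(2021, 1, 5)), week_of_year(date(2021, 1, 6)))
```

With Wednesday as the first day, Tuesday 2021-01-05 is still in week 1 and Wednesday 2021-01-06 opens week 2.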
We had a similar issue and came up with this logic.
I tested it against one year of test cases and all passed.
import datetime
from math import ceil
def week_of_month(dt):
    first_day = dt.replace(day=1)
    dom = dt.day
    if first_day.weekday() == 6:
        adjusted_dom = dom
    else:
        adjusted_dom = dom + first_day.weekday()
    if adjusted_dom % 7 == 0 and first_day.weekday() != 6:
        value = adjusted_dom / 7.0 + 1
    elif first_day.weekday() == 6 and adjusted_dom % 7 == 0 and adjusted_dom == 7:
        value = 1
    else:
        value = int(ceil(adjusted_dom / 7.0))
    return int(value)
year = 2020
month = 1
day = 1
date_value = datetime.datetime(year, month, day).date()
no = week_of_month(date_value)
import datetime
userInput = input("Please enter project deadline date (dd/mm/yyyy): ")
currentDate = datetime.datetime.today()
# %m matches the numeric month promised by the prompt (%b would expect a name such as 'Jun')
testVar = datetime.datetime.strptime(userInput, "%d/%m/%Y").date()
remainDays = testVar - currentDate.date()
remainWeeks = (remainDays.days / 7.0) + 1
print("Please pay attention: the deadline of project X in days and weeks is:", remainDays.days, "days and", remainWeeks, "weeks.\nSo hurry up!")
A lot of answers have been given, but I'd like to add to them.
If you need the week to display in a year/week style (e.g. 1953 for week 53 of 2019, 2001 for week 1 of 2020), you can do this:
import datetime
year = datetime.datetime.now()
week_num = datetime.date(year.year, year.month, year.day).strftime("%V")
long_week_num = str(year.year)[2:4] + str(week_num)  # last two digits of the year, then the week
It takes the current year and week; on the day of writing, long_week_num is:
>>> 2006
