I've seen about 30 posts similar to this one, but none does exactly what I'm looking for, and some just don't work.
I'm trying to return a list of N business dates, to then iterate through a dictionary and pull data out according to the corresponding dates.
Assuming the current date is:
refreshed = str(data['Meta Data']['3. Last Refreshed'])
For completeness, the value of the above right now is: 2020-1-30
I want to be able to calculate the dates n days prior to this date.
I don't really want to import a bunch of funky modules, and have tried a function using a loop and datetime.date.isoweekday() - but I always come across an issue when passing refreshed in.
One of the main issues I'm seeing with examples elsewhere is that they calculate the dates from datetime.date.today(). Seemingly it's fine to call isoweekday() on that, but I can't call isoweekday() on refreshed (a string) to get its 1-7 weekday number. I've tried using strftime() to reformat the date into a suitable format for isoweekday(), but to no avail.
Subtracting days from a date
You can subtract 30 days from a datetime.datetime object by subtracting a datetime.timedelta object:
>>> import datetime
>>> datetime.datetime.today()
datetime.datetime(2020, 10, 31, 10, 20, 0, 704133)
>>> datetime.datetime.today() - datetime.timedelta(30)
datetime.datetime(2020, 10, 1, 10, 19, 49, 680385)
>>> datetime.datetime.strptime('2020-01-30', '%Y-%m-%d') - datetime.timedelta(30)
datetime.datetime(2019, 12, 31, 0, 0)
Skipping week-ends by subtracting 7 days instead of 5
Suppose you start from date d and want to subtract N=30 non-week-end days. A general way could be:
Figure out which day of the week is d;
Figure out how many week-ends there are between d and d-N;
Remove the appropriate number of days.
However, you want to subtract 30 days, and 30 is a multiple of 5. This makes things particularly easy: when you step back 5 business days from a weekday, you are guaranteed to cross exactly one week-end. So you can immediately remove 7 calendar days instead of 5.
Removing 30 days is the same as removing 6 times 5 days. So you can remove 6 times 7 days instead, which is achieved by subtracting datetime.timedelta(42) from your date.
Note: this accounts for week-ends, but not for special holidays.
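The general steps above can be sketched in closed form. This is only a sketch under stated assumptions: the function name subtract_business_days is made up for illustration, d is assumed to be a Monday-Friday date, n is assumed non-negative, and holidays are ignored.

```python
from datetime import date, timedelta

def subtract_business_days(d, n):
    """Return the date n business days (Mon-Fri) before d.

    Hypothetical helper: assumes d is a weekday and n >= 0; ignores holidays.
    """
    if d.weekday() >= 5:
        raise ValueError("this sketch assumes d is a Monday-Friday date")
    if n < 0:
        raise ValueError("this sketch assumes n >= 0")
    # Every full block of 5 business days costs 7 calendar days.
    weeks, rem = divmod(n, 5)
    days = weeks * 7 + rem
    # Stepping back the remaining days crosses one more week-end
    # when there are more of them than days since Monday.
    if rem > d.weekday():
        days += 2
    return d - timedelta(days=days)

# 30 business days before Thursday 2020-01-30 is 42 calendar days back.
print(subtract_business_days(date(2020, 1, 30), 30))  # 2019-12-19
```

For N=30 the remainder is zero, so the function reduces to the timedelta(42) shortcut described above.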
Skipping week-ends iteratively
You can test for days of the week using .weekday(). This is already answered on this other question: Loop through dates except for week-ends
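A minimal sketch of the iterative approach, using only the standard library. The function name previous_business_days is hypothetical, the input is assumed to be a string like the refreshed value from the question, and "prior" is assumed to exclude the starting date itself. Holidays are not handled.

```python
from datetime import date, datetime, timedelta

def previous_business_days(refreshed, n):
    """Return the n business dates (Mon-Fri) strictly before `refreshed`, newest first.

    `refreshed` is a 'YYYY-M-D' string; it is parsed into a date first,
    because weekday() is a method of date objects, not of strings.
    """
    current = datetime.strptime(refreshed, "%Y-%m-%d").date()
    found = []
    while len(found) < n:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            found.append(current)
    return found

for d in previous_business_days("2020-1-30", 5):
    print(d.isoformat())
```

The returned date objects can be formatted with isoformat() or strftime() into whatever key format the dictionary uses.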
You can add N days using a timedelta:
data['Meta Data']['3. Last Refreshed'] = pd.to_datetime(data['Meta Data']['3. Last Refreshed']) + pd.to_timedelta(4, unit="D")
Replace 4 with your n days. To go back n days instead, subtract the timedelta rather than adding it.
I want to find the first day of the month that falls roughly 90 days before an arbitrary date. For instance:
December 15 -- returns August 30
December 30 -- returns August 30
December 1st -- returns August 30
I know this can be done with Pandas pd.DateOffset:
print(pd.Timestamp("2019-12-15") - pd.DateOffset(days=90))
but then I'll get something like September 15th.
I know I can count minus 90 days, select the month, subtract 1 and then select last day of the obtained month, but I was wondering if this can be easily done in one line of code, efficiently.
Assume that the date in question is:
dat = pd.Timestamp('2019-12-15')
To compute the date 90 days before, run:
dat2 = dat - pd.DateOffset(days=90)
getting 2019-09-16.
And finally, to get the start of this month, roll back to the month start:
pd.offsets.MonthBegin().rollback(dat2)
getting 2019-09-01. (Subtracting pd.offsets.MonthBegin(0) is not a reliable way to do this: an offset with n=0 rolls a date that is not on the anchor forward, not back.)
To put the whole thing short, run just:
pd.offsets.MonthBegin().rollback(dat - pd.DateOffset(days=90))
A subtle difference becomes visible if you start from a date which,
turned 90 days back, gives exactly the first day of a month. E.g.
dat = pd.Timestamp('2019-11-30')
dat2 = dat - pd.DateOffset(days=90)
gives 2019-09-01.
Then rollback leaves the date unchanged, because it is already a month start.
If you want in this case the start date of the previous month, run:
dat2 - pd.offsets.MonthBegin(1)
getting 2019-08-01. Note that for a date that is not a month start, subtracting MonthBegin(1) gives the start of its own month (2019-09-16 becomes 2019-09-01).
So choose the variant which suits your needs.
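For comparison, the same two steps (go back 90 days, then snap to the start of that month) can be sketched with only the standard library. The function name month_start_n_days_back is made up for illustration; it always returns the first day of the month the shifted date lands in.

```python
from datetime import date, timedelta

def month_start_n_days_back(d, days=90):
    """Go back `days` calendar days, then return the first day of that month."""
    shifted = d - timedelta(days=days)
    return shifted.replace(day=1)

print(month_start_n_days_back(date(2019, 12, 15)))  # 2019-09-01
print(month_start_n_days_back(date(2019, 11, 30)))  # 2019-09-01
```

If the shifted date is already a month start, this variant keeps it, which matches the rollback behaviour rather than the "previous month" variant.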
Edit: the (puzzling) behavior below was for pandas 0.17.1. It appears fixed in 0.18.1.
Is there a way to represent an arbitrary time span with a pandas.Period?
Specifically, I was trying to contrive a pandas.Period() to represent an arbitrary n-day span (with the goal of making a multi-year Period).
I tried a few things, and it seems that playing with the freq argument gets me more or less what I want. However, I was surprised by the unexpected end_time of the period in the case of the freq argument having a multiplier (as in freq='2D').
import pandas as pd
p = pd.Period(1970, freq='2D')
p # Period('1970-01-01', '2D')
p.start_time # Timestamp('1970-01-01 00:00:00')
p.end_time # Timestamp('1970-01-04 23:59:59.999999999')
p.end_time - p.start_time
# Timedelta('3 days 23:59:59.999999')
Why? That's 4 days, not 2.
However:
p+1 # Period('1970-01-03', '2D')
(p+1).start_time # Timestamp('1970-01-03 00:00:00')
So, (p+1) gives me the expected (a period starting 2 days after p's start).
But what's the deal with end_time? What's the relationship between freq='nD' to actual duration in days?
def actual_span(n, unit='D'):
    p = pd.Period(1970, freq='{}{}'.format(n, unit))
    # add 1 ns because end_time is the last nanosecond inside the period
    return p.end_time + pd.Timedelta(1) - p.start_time
x = pd.DataFrame({'n': range(1, 10)})
x['span'] = x.n.apply(actual_span)
print(x.set_index('n'))
# span
# n
# 1 1 days
# 2 4 days
# 3 9 days
# 4 16 days
# 5 25 days
# 6 36 days
# 7 49 days
# 8 64 days
# 9 81 days
Why is it the square of the requested number of days?
Note that (p+1).start_time is correct (gives us n days).
Small print: Python 3.5.1, pandas 0.17.1 (originally misreported as 0.18.1).
pd.Period(1970, freq='2D') has the expected start_time and end_time for me, also using Pandas 0.18.1. Maybe try restarting your interpreter, and run the first bit of code you posted again to verify that you're still getting the unexpected output?
I have to save the time in AM PM format.
But I am having trouble deciding how to enter the midnight time.
Suppose the time is 9PM to 6AM the next morning. I have to divide it on a day-by-day basis, like this:
t1 = datetime.datetime.strptime('09:00PM', '%I:%M%p').time()
t2 = datetime.datetime.strptime('12:00AM', '%I:%M%p').time()
t3 = datetime.datetime.strptime('06:00AM', '%I:%M%p').time()
Now I want to know whether t2 should be
12:00AM or 11:59PM.
If I use 12:00AM then I can't compare whether 9PM > 12AM, but 11:59PM looks odd, or maybe it is the right way.
You should always use 00:00 (or 12:00 AM) to represent midnight.
Using 23:59 (or 11:59 PM) is problematic for a couple of reasons:
Precision matters in the comparison. Is 23:59:01 not before midnight? What about 23:59:59.9999?
Duration calculation will be thrown off by whatever precision you chose. Consider that 10:00 pm to midnight is 2 hours, not 1 hour and 59 minutes.
To avoid these problems, you should always treat time intervals as half-open intervals. That is, the range has an inclusive start, and an exclusive end. In interval notation: [start, end)
Now with regard to crossing the midnight boundary:
When you are comparing times that are associated with a date, you can just compare directly:
[2015-01-01T21:00, 2015-01-02T06:00) = 9 hours
2015-01-01T21:00 < 2015-01-02T06:00
When you do not have a date, you can determine duration, but you cannot determine order!
[21:00, 06:00) = 9 hours
21:00 < 06:00 OR 21:00 > 06:00
The best you can do is determine whether a time is between the points covered by the range.
Both 23:00 and 01:00 are in the range [21:00, 06:00)
21:00 is also in that range, but 06:00 is NOT.
Think about a clock. It's modeled as a circle, not as a straight line.
To calculate duration of a time-only interval that can cross midnight, use the following pseudocode:
if (start <= end)
    duration = end - start
else
    duration = end - start + 24_hours
Or more simply:
duration = (end - start + 24_hours) % 24_hours
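In Python, datetime.time values cannot be subtracted directly, so one way to apply the modulo formula above is to convert each time to a timedelta since midnight first. This is a sketch; the helper names since_midnight and duration are made up for illustration.

```python
from datetime import time, timedelta

DAY = timedelta(hours=24)

def since_midnight(t):
    """Convert a time-of-day into a timedelta measured from 00:00."""
    return timedelta(hours=t.hour, minutes=t.minute,
                     seconds=t.second, microseconds=t.microsecond)

def duration(start, end):
    """Length of the half-open interval [start, end), which may cross midnight."""
    return (since_midnight(end) - since_midnight(start) + DAY) % DAY

print(duration(time(21, 0), time(6, 0)))   # 9:00:00
print(duration(time(22, 0), time(0, 0)))   # 2:00:00
```

Note that with this formula an interval whose start equals its end has zero length, not 24 hours.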
To determine whether a time-only value falls within a time-only interval that can cross midnight, use this pseudocode:
if (start <= end)
    is_between = start <= value AND end > value
else
    is_between = start <= value OR end > value
Note that in the above pseudocode, I am referring to the magnitude of the values, as compared numerically - not the logical time values which, as said earlier, cannot be compared independently without a reference date.
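The membership test can be sketched in Python with datetime.time values, which compare by magnitude exactly as the pseudocode requires. The function name is_between is an illustrative choice.

```python
from datetime import time

def is_between(value, start, end):
    """True if `value` lies in the half-open interval [start, end).

    The interval may cross midnight (start > end), e.g. [21:00, 06:00).
    """
    if start <= end:
        return start <= value < end
    # Crossing midnight: either late in the evening or early in the morning.
    return start <= value or value < end

start, end = time(21, 0), time(6, 0)
print(is_between(time(23, 0), start, end))  # True
print(is_between(time(1, 0), start, end))   # True
print(is_between(time(21, 0), start, end))  # True  (inclusive start)
print(is_between(time(6, 0), start, end))   # False (exclusive end)
```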
Also, much of this is covered in my Date and Time Fundamentals course on Pluralsight (towards the very end, in "Working With Ranges").
How about making t1 = 09:00PM, t2 = 11.59PM, t3 = 12:00AM and t4 = 06:00AM. Then you have definite time ranges per day. Of course, adding the date would make time differences evident as well.
Given a date range how to calculate the number of weekends partially or wholly within that range?
(A few definitions as requested:
take 'weekend' to mean Saturday and Sunday.
The date range is inclusive i.e. the end date is part of the range
'wholly or partially' means that any part of the weekend falling within the date range means the whole weekend is counted.)
To simplify I imagine you only actually need to know the duration and what day of the week the initial day is...
I darn well know it's going to involve doing integer division by 7 and some logic to add 1 depending on the remainder, but I can't quite work out what...
extra points for answers in Python ;-)
Edit
Here's my final code.
Weekends are Friday and Saturday (as we are counting nights stayed) and days are 0-indexed starting from Monday. I used onebyone's algorithm and Tom's code layout. Thanks a lot folks.
def calc_weekends(start_day, duration):
    # per start day (0 = Monday): offset subtracted from the duration before counting weeks
    days_until_weekend = [5, 4, 3, 2, 1, 1, 6]
    adjusted_duration = duration - days_until_weekend[start_day]
    if adjusted_duration < 0:
        weekends = 0
    else:
        weekends = (adjusted_duration // 7) + 1  # integer division
    if start_day == 5 and duration % 7 == 0:  # Saturday to Saturday is an exception
        weekends += 1
    return weekends

if __name__ == "__main__":
    days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
    for start_day in range(0, 7):
        for duration in range(1, 16):
            print("%s to %s (%s days): %s weekends" % (days[start_day], days[(start_day + duration) % 7], duration, calc_weekends(start_day, duration)))
        print()
General approach for this kind of thing:
For each day of the week, figure out how many days are required before a period starting on that day "contains a weekend". For instance, if "contains a weekend" means "contains both the Saturday and the Sunday", then we have the following table:
Sunday: 8
Monday: 7
Tuesday: 6
Wednesday: 5
Thursday: 4
Friday: 3
Saturday: 2
For "partially or wholly", we have:
Sunday: 1
Monday: 6
Tuesday: 5
Wednesday: 4
Thursday: 3
Friday: 2
Saturday: 1
Obviously this doesn't have to be coded as a table, now that it's obvious what it looks like.
Then, given the day-of-week of the start of your period, subtract[*] the magic value from the length of the period in days (probably end-start+1, to include both fenceposts). If the result is less than 0, it contains 0 weekends. If it is equal to or greater than 0, then it contains (at least) 1 weekend.
Then you have to deal with the remaining days. In the first case this is easy, one extra weekend per full 7 days. This is also true in the second case for every starting day except Sunday, which only requires 6 more days to include another weekend. So in the second case for periods starting on Sunday you could count 1 weekend at the start of the period, then subtract 1 from the length and recalculate from Monday.
More generally, what's happening here for "whole or part" weekends is that we're checking to see whether we start midway through the interesting bit (the "weekend"). If so, we can either:
1) Count one, move the start date to the end of the interesting bit, and recalculate.
2) Move the start date back to the beginning of the interesting bit, and recalculate.
In the case of weekends, there's only one special case which starts midway, so (1) looks good. But if you were getting the date as a date+time in seconds rather than day, or if you were interested in 5-day working weeks rather than 2-day weekends, then (2) might be simpler to understand.
[*] Unless you're using unsigned types, of course.
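The "partially or wholly" table and the Sunday special case described above can be sketched as follows. This assumes weekends are Saturday and Sunday, weekdays are numbered Monday=0 to Sunday=6 (as date.weekday() does), and length is the inclusive number of days. The names DAYS_TO_TOUCH and count_weekends_touched are made up for illustration.

```python
# Days a period must last, per start weekday (Mon=0 .. Sat=5), before it
# touches a weekend. Sunday (6) is handled separately below.
DAYS_TO_TOUCH = [6, 5, 4, 3, 2, 1]

def count_weekends_touched(start_weekday, length):
    """Count weekends (Sat/Sun) partially or wholly inside a period."""
    if length <= 0:
        return 0
    if start_weekday == 6:
        # Starting midway through a weekend: count it, then restart on Monday.
        return 1 + count_weekends_touched(0, length - 1)
    remaining = length - DAYS_TO_TOUCH[start_weekday]
    if remaining < 0:
        return 0
    # One weekend reached, plus one more per further full 7 days.
    return 1 + remaining // 7

print(count_weekends_touched(0, 5))   # Mon..Fri -> 0
print(count_weekends_touched(0, 6))   # Mon..Sat -> 1
print(count_weekends_touched(5, 8))   # Sat..Sat -> 2
print(count_weekends_touched(6, 7))   # Sun..Sat -> 2
```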
My general approach for this sort of thing: don't start messing around trying to reimplement your own date logic. It's hard, i.e. you'll screw it up for the edge cases and look bad. Hint: if you have mod 7 arithmetic anywhere in your program, or are treating dates as integers anywhere in your program: you fail. If I saw the "accepted solution" anywhere in (or even near) my codebase, someone would need to start over. It beggars the imagination that anyone who considers themselves a programmer would vote that answer up.
Instead, use the built in date/time logic that comes with Python:
First, get a list of all of the days that you're interested in:
from datetime import date, timedelta
FRI = 4; SAT = 5  # date.weekday() numbering: Monday is 0
# a couple of random test dates
now = date.today()
start_date = now - timedelta(57)
end_date = now - timedelta(13)
print(start_date, '...', end_date) # debug
days = [date.fromordinal(d) for d in
        range(start_date.toordinal(),
              end_date.toordinal() + 1)]
Next, filter down to just the days which are weekends. In your case you're interested in Friday and Saturday nights, which weekday() numbers 4 and 5. (Notice how I'm not trying to roll this part into the previous list comprehension, since that'd be hard to verify as correct).
weekend_days = [d for d in days if d.weekday() in (FRI, SAT)]
for day in weekend_days: # debug
    print(day, day.weekday()) # debug
Finally, you want to figure out how many weekends are in your list. This is the tricky part, but there are really only four cases to consider, one for each end for either Friday or Saturday. Concrete examples help make it clearer, plus this is really the sort of thing you want documented in your code:
num_weekends = len(weekend_days) // 2
# if we start on Friday and end on Saturday we're ok,
# otherwise add one weekend
#
# F,S|F,S|F,S ==3 and 3we, +0
# F,S|F,S|F ==2 but 3we, +1
# S|F,S|F,S ==2 but 3we, +1
# S|F,S|F ==2 but 3we, +1
# (assumes the range contains at least one weekend day)
ends = (weekend_days[0].weekday(), weekend_days[-1].weekday())
if ends != (FRI, SAT):
    num_weekends += 1
print(num_weekends) # your answer
Shorter, clearer and easier to understand means that you can have more confidence in your code, and can get on with more interesting problems.
To count whole weekends, just adjust the number of days so that you start on a Monday, then divide by seven. (Note that if the start day is a weekday, add days to move to the previous Monday, and if it is on a weekend, subtract days to move to the next Monday since you already missed this weekend.)
days = {"Saturday":-2, "Sunday":-1, "Monday":0, "Tuesday":1, "Wednesday":2, "Thursday":3, "Friday":4}

def n_full_weekends(n_days, start_day):
    # shift the period so that it effectively starts on a Monday
    n_days += days[start_day]
    if n_days <= 0:
        n_weekends = 0
    else:
        n_weekends = n_days // 7
    return n_weekends

if __name__ == "__main__":
    tests = [("Tuesday", 10, 1), ("Monday", 7, 1), ("Wednesday", 21, 3), ("Saturday", 1, 0), ("Friday", 1, 0),
             ("Friday", 3, 1), ("Wednesday", 3, 0), ("Sunday", 8, 1), ("Sunday", 21, 2)]
    for start_day, n_days, expected in tests:
        print(start_day, n_days, expected, n_full_weekends(n_days, start_day))
If you want to know partial weekends (or weeks), just look at the fractional part of the division by seven.
You would need external logic besides raw math. You need a calendar library (or, if you have a decent amount of time, implement it yourself) to define what a weekend is, what day of the week you start on, end on, etc.
Take a look at Python's calendar module.
Without a logical definition of days in your code, a purely mathematical method would fail on corner cases, like an interval of 1 day or, I believe, anything shorter than a full week (or shorter than 6 days if you allow partials).