Detecting a US Holiday - python

What's the simplest way to determine if a date is a U.S. bank holiday in Python? There seem to be various calendars and webservices listing holidays for various countries, but I haven't found anything specific to banks in the U.S.

The Pandas package provides a convenient solution for this:
import datetime
from pandas.tseries.holiday import USFederalHolidayCalendar
cal = USFederalHolidayCalendar()
holidays = cal.holidays(start='2014-01-01', end='2014-12-31').to_pydatetime()
if datetime.datetime(2014, 1, 1) in holidays:
    print(True)

Use the holidays library in Python.
pip install holidays
For US holidays:
1. To check whether a date is a holiday:
from datetime import date
import holidays
# Select country
us_holidays = holidays.US()
# Prints True if the date is a holiday, otherwise False
print('01-01-2018' in us_holidays)
print('02-01-2018' in us_holidays)
# Which holiday is it? (get() returns None if the date is not a holiday)
print(us_holidays.get('01-01-2018'))
print(us_holidays.get('02-01-2018'))
2. To list out all holidays in US:
from datetime import date
import holidays
# Select country
us_holidays = holidays.US()
# Print all the US holidays in 2018
for ptr in holidays.US(years=2018).items():
    print(ptr)
You can find holidays for any country you like; the list of supported countries is on my blog.

Some general comments:
I don't think that #ast4 really means "nth day of nth week of nth month algorithm". The notion of "nth week in nth month" is mind-snapping (like the "ISO calendar"). I've never seen a holiday defined in terms of "nth week". Martin Luther King Day is an example of the "Nth weekday in month" type of holiday:
MON, TUE, WED, THU, FRI, SAT, SUN = range(7)
JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC = range(1, 13)
Holiday("Martin L King's Birthday", type='floating',
ordinal=3, weekday=MON, month=JAN)
Holiday("Memorial Day", type='floating',
ordinal=-1, weekday=MON, month=MAY)
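The "Nth weekday in month" rule can be sketched with the standard library alone. The function name nth_weekday and its signature are made up for illustration; they are not part of any library. A negative ordinal counts from the end of the month, mirroring ordinal=-1 above.

```python
import calendar
from datetime import date, timedelta

def nth_weekday(year, month, weekday, ordinal):
    """Return the date of the ordinal-th given weekday in a month.

    weekday follows datetime's convention (Monday=0 ... Sunday=6).
    ordinal=1 is the first occurrence, ordinal=-1 the last.
    """
    if ordinal > 0:
        first = date(year, month, 1)
        # Days from the 1st to the first occurrence of the weekday.
        offset = (weekday - first.weekday()) % 7
        result = first + timedelta(days=offset + 7 * (ordinal - 1))
    elif ordinal < 0:
        last = date(year, month, calendar.monthrange(year, month)[1])
        # Days back from the last day of the month to the last occurrence.
        offset = (last.weekday() - weekday) % 7
        result = last - timedelta(days=offset + 7 * (-ordinal - 1))
    else:
        raise ValueError("ordinal must be non-zero")
    if result.month != month:
        raise ValueError("the month has no such occurrence of that weekday")
    return result

MON, JAN, MAY = 0, 1, 5
print(nth_weekday(2014, JAN, MON, 3))   # third Monday of January 2014
print(nth_weekday(2014, MAY, MON, -1))  # last Monday of May 2014
```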
The USA doesn't have Easter-related holidays. Definition is not difficult:
Holiday("Good Friday", type='moveable',
base='gregorian_easter', delta_days=-2)
Holiday("Easter Monday", etc, delta_days=1)
# Some states in Australia used to have Easter Tuesday (no kidding)
Holiday("Easter Tuesday", etc, delta_days=2)
The 'base' idea can be used to cater for lunar new year, in fact any holiday that is an offset from a base date that needs a special procedure to calculate it.
The so-called "static" holidays are not fixed when the "fixed" date is a Saturday or Sunday and may even vanish (no alternative day off work):
# Americans will get a day off on Friday 31 Dec 2010
# because 1 Jan 2011 is a Saturday.
Holiday("New Year's Day", type='fixed',
day=1, month=JAN, sat_adj=-1, sun_adj=????)
# Australia observes ANZAC Day on the day, with no day off
# if the fixed date falls on a weekend.
Holiday("ANZAC Day", type='fixed', day=25, month=APR, sat_adj=0, sun_adj=0)
# Two consecutive "fixed" holidays is OK; three would need re-thinking.
# Australia again:
Holiday("Christmas Day", type='fixed', day=25, month=DEC, sat_adj=2, sun_adj=1)
Holiday("Boxing Day", type='fixed', day=26, month=DEC, sat_adj=2, sun_adj=2)
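One possible reading of the sat_adj / sun_adj idea, as a small standard-library sketch. The observed function is hypothetical and only illustrates the day-shifting; the New Year's Day call passes sat_adj only, since the Sunday value is left open above.

```python
from datetime import date, timedelta

SAT, SUN = 5, 6  # values returned by date.weekday()

def observed(year, month, day, sat_adj=0, sun_adj=0):
    """Shift a fixed-date holiday by sat_adj or sun_adj days
    when the fixed date lands on a Saturday or Sunday."""
    actual = date(year, month, day)
    if actual.weekday() == SAT:
        return actual + timedelta(days=sat_adj)
    if actual.weekday() == SUN:
        return actual + timedelta(days=sun_adj)
    return actual

# 1 Jan 2011 is a Saturday, so sat_adj=-1 moves the day off to Friday.
print(observed(2011, 1, 1, sat_adj=-1))
# Christmas Day and Boxing Day 2010 fall on Saturday and Sunday.
print(observed(2010, 12, 25, sat_adj=2, sun_adj=1))
print(observed(2010, 12, 26, sat_adj=2, sun_adj=2))
```

With the adjustments used above, the two consecutive holidays end up on different weekdays (Monday and Tuesday) rather than colliding.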
I'm sure there are ways of specifying holidays that aren't catered for by the above rules ... any such news is welcome.

Make sure date != one of these:
http://www.buyusa.gov/uk/en/us_bank_holidays.html

I've actually worked recently on a problem much like this one. The static holidays are rather trivial to generate (e.g. New Year's Eve is December 31st; just cycle through the years). There are well-defined algorithms out there to generate the floating holidays. Essentially you have a starting date (e.g. January 1st, 1900) and work from there. What I ended up implementing was a nth day of nth week of nth month algorithm (e.g. MLK day = 3rd Monday of January). Easter is a bit different to do, but again there are well-defined algorithms for that already out there (Good Friday is trivial once you have Easter).
There's a fairly decent book on this out there you may want to check out: Calendrical Calculations
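As an illustration of the kind of well-defined algorithm meant here, this sketch computes Gregorian Easter Sunday with the widely published "anonymous Gregorian" (Meeus/Jones/Butcher) computus and derives Good Friday from it. It is not necessarily the algorithm the answer's author used, and the function names are invented for the example.

```python
from datetime import date, timedelta

def gregorian_easter(year):
    """Easter Sunday in the Gregorian calendar (anonymous Gregorian algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)

def good_friday(year):
    """Good Friday is two days before Easter Sunday."""
    return gregorian_easter(year) - timedelta(days=2)

print(gregorian_easter(2019), good_friday(2019))
```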

I should caution contributors against thinking this is all solvable by an algorithm. Three examples:
most Islamic holidays are lunar. The moon is predictable (you need a new moon before the month starts, and that affects public holidays like Eid, at the end of Ramadan). Some countries use algorithms to predict the moon, but others explicitly require that the new moon be actually sighted in the country. Some even post people to the tops of tall mountains, tasked to try to spot the new moon. If the night is cloudy, the new moon may not be sighted, so the following day is not a holiday.
a committee in China meets around November each year to decide on the number of holidays that will be given for Chinese New Year in mainland China the following February. As an added complication, because there are so many public holidays given for Chinese New Year, the Chinese Government sometimes nominates weekends after the holiday as working days, to boost economic activity.
stock exchanges in India sometimes have a very short trading period at a weekend for auspicious religious reasons.
For this reason, there are companies that do this research, get the updates, and publish holidays via an API, for a fee. Users typically query that API every day in case new holidays are announced.

Related

Python: get first date of the week from calendar week and year (working for 2020 but not for 2021)

I would like to extract the first date of the week from the year and the calendar week (in Europe, where the first calendar week is the week that includes the 4th of January).
My code works correctly for
import datetime
year=20
week=53
print(datetime.datetime.strptime(str(year) + "-"+ str(week-1) +'-1-CET', "%y-%U-%w-%Z"))
the output is 2020-12-28 00:00:00 which is correct
for
import datetime
year=21
week=1
print(datetime.datetime.strptime(str(year) + "-"+ str(week-1) +'-1-CET', "%y-%U-%w-%Z"))
I also get 2020-12-28 00:00:00. The correct output would be 2021-01-04.
Could you please tell me where my mistake is?
Thanks
Using %V instead of %U should make things easier.
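A minimal sketch of that suggestion, assuming four-digit years rather than the two-digit ones in the question. In strptime, %V (ISO week) has to be combined with %G (ISO year) and a weekday directive such as %u; on Python 3.8 or later, date.fromisocalendar does the same thing directly. The helper name first_day_of_iso_week is made up for the example.

```python
from datetime import date, datetime

def first_day_of_iso_week(year, week):
    """Monday of the given ISO 8601 week (week 1 contains 4 January)."""
    return date.fromisocalendar(year, week, 1)

print(first_day_of_iso_week(2020, 53))
print(first_day_of_iso_week(2021, 1))

# The same lookup via strptime: ISO year, ISO week, ISO weekday (1 = Monday).
print(datetime.strptime("2021-01-1", "%G-%V-%u").date())
```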

Get week of UK fiscal year

I want to get the week number corresponding to the UK fiscal year (which runs 6th April to 5th April). report_date.strftime('%V') will give me the week number corresponding to the calendar year (1st January to 31st December).
For example, today is 2nd February which is UK fiscal week 44, but %V would return 05.
I've seen the https://pypi.org/project/fiscalyear/ library but it doesn't seem to offer a way to do this. I know that I can work out the number of days since April 6th and divide by 7, but just curious if there's a better way.
This does the job in Python. It counts the number of days since April 6th of the given year (formatted_report_date), and if the answer is negative (because April 6th hasn't passed yet), then a year is subtracted. Then divide by 7 and add 1 (for 1-indexing). The answer will be between 1-53.
def get_fiscal_week(formatted_report_date):
    """
    Given a date, returns the week number (from 1-53) since the last April 6th.
    :param formatted_report_date: the formatted date to be converted into a
        fiscal week.
    """
    fiscal_start = formatted_report_date.replace(month=4, day=6)
    days_since_fiscal_start = (formatted_report_date - fiscal_start).days
    if days_since_fiscal_start < 0:
        # April 6th has not passed yet this year, so count from last year's.
        fiscal_start = fiscal_start.replace(year=fiscal_start.year - 1)
        days_since_fiscal_start = (formatted_report_date - fiscal_start).days
    # Integer division, so that the result is a whole week number.
    return (days_since_fiscal_start // 7) + 1

Unable to parse an exact result from a webpage using requests

I've created a script in Python to parse two fields from a webpage: total revenue and its concerning date. The fields I'm after are rendered by JavaScript, but they are available in the page source within a JSON array. The following script can parse those two fields accordingly.
However, the problem is the date visible in that page is different from the one available in page source.
Webpage link
The date in that webpage is like this
The date in page source is like this
There is clearly a variation of one day.
After visiting that webpage when you click on this tab Quarterly you can see the results there:
I've tried with:
import re
import json
import requests
url = 'https://finance.yahoo.com/quote/GTX/financials?p=GTX'
res = requests.get(url)
data = re.findall(r'root.App.main[^{]+(.*);',res.text)[0]
jsoncontent = json.loads(data)
container = jsoncontent['context']['dispatcher']['stores']['QuoteSummaryStore']['incomeStatementHistoryQuarterly']['incomeStatementHistory']
total_revenue = container[0]['totalRevenue']['raw']
concerning_date = container[0]['endDate']['fmt']
print(total_revenue,concerning_date)
Result I get (revenue in million):
802000000 2019-06-30
Result I wish to get:
802000000 2019-06-29
When I try with the ticker AAPL, I get the exact date, so subtracting or adding a day is not an option.
How can I get the exact date from that site?
Btw, I know how to get them using selenium, so I would only like to stick to requests.
As mentioned in the comments, you need to convert the date to the appropriate timezone (EST), which can be done with datetime and dateutil.
Here is a working example:
import re
import json
import requests
from datetime import datetime, timezone
from dateutil import tz
url = 'https://finance.yahoo.com/quote/GTX/financials?p=GTX'
res = requests.get(url)
data = re.findall(r'root.App.main[^{]+(.*);',res.text)[0]
jsoncontent = json.loads(data)
container = jsoncontent['context']['dispatcher']['stores']['QuoteSummaryStore']['incomeStatementHistoryQuarterly']['incomeStatementHistory']
total_revenue = container[0]['totalRevenue']['raw']
EST = tz.gettz('EST')
raw_date = datetime.fromtimestamp(container[0]['endDate']['raw'], tz=EST)
concerning_date = raw_date.date().strftime('%d-%m-%Y')
print(total_revenue, concerning_date)
The updated section of this answer outlines the root cause of the date differences.
ORIGINAL ANSWER
Some of the raw values in your JSON are UNIX timestamps.
Reference from your code with modifications:
concerning_date_fmt = container[0]['endDate']['fmt']
concerning_date_raw = container[0]['endDate']['raw']
print(f'{concerning_date_fmt} -- {concerning_date_raw}')
# output
2019-07-28 -- 1564272000
'endDate': {'fmt': '2019-07-28', 'raw': 1564272000}
1564272000 is the number of elapsed seconds since January 01 1970. This date was the start of the Unix Epoch and the time is in Coordinated Universal Time (UTC). 1564272000 is the equivalent to: 07/28/2019 12:00am (UTC).
You can convert these timestamps to a standard datetime format by using built-in Python functions:
from datetime import datetime
unix_timestamp = int('1564272000')
converted_timestamp = datetime.utcfromtimestamp(unix_timestamp).strftime('%Y-%m-%dT%H:%M:%SZ')
print (converted_timestamp)
# output Coordinated Universal Time (or UTC)
2019-07-28T00:00:00Z
reformatted_timestamp = datetime.strptime(converted_timestamp, '%Y-%m-%dT%H:%M:%SZ').strftime('%d-%m-%Y')
print (reformatted_timestamp)
# output
28-07-2019
This still does not solve your original problem related to JSON dates and column dates being different at times. But here is my current hypothesis related to the date disparities that are occurring.
The json date (fmt and raw) that are being extracted from root.App.main are in Coordinated Universal Time (UTC). This is clear because of the UNIX timestamp in raw.
The dates being displayed in the table columns seem to be in US Eastern Time. Eastern Time is currently on daylight saving time (EDT), which is UTC-4. This means that 2019-07-28 22:00 (10pm) EDT would be 2019-07-29 02:00 (2am) UTC. The server hosting finance.yahoo.com looks to be in the United States, based on the traceroute results. These values are also in the json file:
'exchangeTimezoneName': 'America/New_York'
'exchangeTimezoneShortName': 'EDT'
There is also the possibility that some of the date differences are linked to the underlying React code, which the site uses. This issue is harder to diagnose, because the code isn't visible.
At this time I believe that the best solution would be to use the UNIX timestamp as your ground truth time reference. This reference could be used to replace the table column's date.
There is definitely some type of conversion happening between the JSON file and the columns.
NVIDIA JSON FILE: 'endDate': {'raw': 1561766400, 'fmt': '2019-06-29'}
NVIDIA Associated Total Revenue column: 6/30/2019
BUT the Total Revenue column date should be 6/28/2019 (EDT), because the UNIX time stamp for 1561766400 is 06/29/2019 12:00am (UTC).
The disparity with DELL is greater than a basic UNIX timestamp and a EDT timestamp conversion.
DELL JSON FILE:{"raw":1564704000,"fmt":"2019-08-02"}
DELL Associated Total Revenue column: 7/31/2019
If we convert the UNIX timestamp to an EDT timestamp, the result would be 8/1/2019, but that is not the case in the DELL example, which is 7/31/2019. Something within the Yahoo code base has to be causing this difference.
I'm starting to believe that React might be the culprit with these date differences, but I cannot be sure without doing more research.
If React is the root cause then the best option would be to use the date elements from the JSON data.
UPDATED ANSWER 10-17-2019
This problem is very interesting, because it seems that these column dates are linked to a company's official end of fiscal quarter and not a date conversion issue.
Here are several examples for
Apple Inc. (AAPL)
Atlassian Corporation Plc (TEAM)
Arrowhead Pharmaceuticals, Inc. (ARWR):
Their column dates are:
6/30/2019
3/31/2019
12/31/2018
9/30/2018
These dates match to these fiscal quarters.
Quarter 1 (Q1): January 1 - March 31.
Quarter 2 (Q2): April 1 - June 30.
Quarter 3 (Q3): July 1 - September 30.
Quarter 4 (Q4): October 1 - December 31
These fiscal quarter end dates can vary greatly as this DELL example shows.
DELL (posted in NASDAQ)
End of fiscal quarter: July 2019
Yahoo Finance
Column date: 7/31/2019
JSON date: 2019-08-02
From the company's website:
When does Dell Technologies’ fiscal year end?
Our fiscal year is the 52- or 53-week period ending on the Friday nearest January 31. Our 2020 fiscal year will end on January 31, 2020. For prior fiscal years, see list below: Our 2019 fiscal year ended on February 1, 2019 Our 2018 fiscal year ended on February 2, 2018 Our 2017 fiscal year ended on February 3, 2017 Our 2016 fiscal year ended on January 29, 2016 Our 2015 fiscal year ended on January 30, 2015 Our 2014 fiscal year ended on January 31, 2014 Our 2013 fiscal year ended on February 1, 2013
NOTE: The 05-03-19 and 08-02-19 dates.
These are from the JSON quarter data for DELL:
{'raw': 1564704000, 'fmt': '2019-08-02'}
{'raw': 1556841600, 'fmt': '2019-05-03'}
It seems that these column dates are linked to a company's fiscal quarter end dates. So I would recommend that you use either the JSON date or the corresponding column date as your primary reference element.
P.S. There is some type of date voodoo occurring at Yahoo, because they seem to move these column quarter dates based on holidays, weekends and end of month.
Instead of getting the fmt of the concerning_date, it's better to get the timestamp.
concerning_date = container[0]['endDate']['raw']
In the example above you will get the result 1561852800 (2019-06-30 00:00:00 UTC), which you can convert into a date in a given timezone. (Hint: use datetime and pytz.) This timestamp will yield the following results depending on the timezone:
Date in Los Angeles*: 29/06/2019, 17:00:00
Date in Berlin*: 30/06/2019, 02:00:00
Date in Beijing*: 30/06/2019, 08:00:00
Date in New York*: 29/06/2019, 20:00:00
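A standard-library sketch of that conversion. To stay self-contained it uses fixed UTC offsets instead of pytz; the offsets are assumptions that hold for late June 2019 (daylight saving time in effect in the US and Germany), and the helper name local_time is made up. A real timezone database (pytz, dateutil, or zoneinfo) is the better choice for arbitrary dates.

```python
from datetime import datetime, timedelta, timezone

timestamp = 1561852800  # the 'raw' endDate value quoted above

# Fixed offsets, assumed valid only around June 2019.
zones = {
    "UTC": timezone.utc,
    "Los Angeles": timezone(timedelta(hours=-7)),
    "New York": timezone(timedelta(hours=-4)),
    "Berlin": timezone(timedelta(hours=2)),
    "Beijing": timezone(timedelta(hours=8)),
}

def local_time(ts, tz):
    """Format a Unix timestamp as local wall-clock time in the given zone."""
    return datetime.fromtimestamp(ts, tz=tz).strftime("%d/%m/%Y, %H:%M:%S")

for name, tz in zones.items():
    print(f"{name}: {local_time(timestamp, tz)}")
```

The same instant falls on 29 June in the American zones and on 30 June in UTC, Berlin and Beijing, which is why the calendar date depends on the timezone chosen.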

Adjust datetime in Pandas to get CustomBusinessWeek

I have a long series of daily stock prices and I am trying to get weekly prices to do some calculations. I have been reading the documentation and I see you can set offsets to get a specific day of the week, which is what I want. This is the code (assume stock is part of a loop I am running):
df_clean_BW['WEEKLY_PricesFriday'] = stock.resample('W-FRI').last()
But for the US stock market there are many Fridays that are holidays, so then I saw you can adjust this for US calendar holidays. This is the code I was using:
from pandas.tseries.offsets import CustomBusinessDay
from pandas.tseries.holiday import USFederalHolidayCalendar
bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
But I don't know how to combine the two so that, if there is a holiday on Friday, the prior day (the Thursday) is taken instead. Something like this, but this throws an error:
df_clean_BW['WEEKLY_PricesFriday'] = stock.resample('W-FRI' & bday_us).last()
I have a long list of dates, so I don't want to create a list of exception days because that would be too long. Here is an example of the output I would want. In this case Jan 1, 2016 was a Friday, so I just want to take December 31, 2015 instead. This must be a common request for anyone who looks at stock data, but I can't figure out a way to do it.
Date         Price      Week Price
12/30/2015   103.3227
12/31/2015   101.3394
1/4/2016     101.426    101.3394   << Take 12/31 as 1/1 is a holiday
1/5/2016     98.8844
1/6/2016     96.9492
1/7/2016     92.8575
1/8/2016     93.3485    93.3485
First generate your array of Fridays including holidays. Then use np.busday_offset() to offset them like this:
np.busday_offset(fridays, 0, roll='backward', busdaycal=bday_us.calendar)
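To show what roll='backward' does to a Friday that is a holiday, here is the same rolling logic written with plain datetime objects instead of NumPy. The previous_business_day function and the one-entry holiday set are illustrative assumptions, not part of pandas or NumPy.

```python
from datetime import date, timedelta

def previous_business_day(day, holidays):
    """Step backward until the day is a weekday that is not a holiday.
    A day that is already a business day is returned unchanged."""
    while day.weekday() >= 5 or day in holidays:
        day -= timedelta(days=1)
    return day

holidays = {date(2016, 1, 1)}                   # New Year's Day 2016, a Friday
fridays = [date(2016, 1, 1), date(2016, 1, 8)]  # week-ending dates

week_ends = [previous_business_day(d, holidays) for d in fridays]
print(week_ends)
```

The adjusted dates can then be used to look up the prices in the daily series.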

Linking Simpy simulation time to Python Calendar for week day specific actions

I want to build a simulation model of a production network with SimPy comprising the following features with regard to time:
Plants work from Monday to Friday (with two shifts of 8 hours)
Heavy trucks drive on all days of the week except Sunday
Light trucks drive on all days of the week, including Sunday
To this purpose, I want to construct a BroadcastPipe as given in the docs combined with timeouts to make the objects wait during days they are not working (for the plants additional logic is required to model shifts). This BroadcastPipe would just count the days (assuming 24*60 minutes for each day) and then say "It's Monday, everybody". The objects (plant, light and heavy trucks) would then process this information individually and act accordingly.
Now, I wonder whether there is an elegant method to link simulation time to regular Python calendar objects in order to easily access days of the week. This would be useful for clarity and enhancements like bank holidays and varying starting days. Do you have any advice on how to do this (or general advice on how to model this better)? Thanks in advance!
I usually set a start date and define it to be equal to the simulation time (Environment.now) 0. Since SimPy's simulation time has no inherent unit, I also define that it is in seconds. Using arrow, I can then easily calculate an actual date and time from the current simulation time:
import arrow
import simpy
start = arrow.get('2015-01-01T00:00:00')
env = simpy.Environment()
# do some simulation ...
# shift() adds a relative offset; older arrow versions used replace(seconds=...)
current_date = start.shift(seconds=env.now)
print('Current weekday:', current_date.weekday())
You might use the datetime module and create a day_of_week object, though you would still need to calculate the elapsed time:
import datetime
# yyyy = four digit year integer
# mm = 1- or 2-digit month integer
# dd = 1- or 2-digit day integer
day_of_week = datetime.datetime(yyyy, mm, dd).strftime('%a')
if day_of_week == 'Mon':
    # Do Monday tasks...
    pass
elif day_of_week == 'Tue':
    # Tuesday...
    pass
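The elapsed-time calculation can be handled with timedelta. The sketch below assumes, as in the question, that one simulation time unit is a minute, and that simulation time 0 corresponds to an arbitrarily chosen start date; START and the helper names are illustrative and not part of SimPy. In a model, env.now would be passed in as now_minutes.

```python
from datetime import datetime, timedelta

START = datetime(2015, 1, 1)  # assumed calendar moment for simulation time 0

def sim_to_datetime(now_minutes):
    """Translate simulation time (in minutes) into a calendar datetime."""
    return START + timedelta(minutes=now_minutes)

def plant_is_open(now_minutes):
    """Plants work Monday to Friday (weekday() 0..4); shifts are ignored here."""
    return sim_to_datetime(now_minutes).weekday() < 5

def heavy_truck_drives(now_minutes):
    """Heavy trucks drive on every day except Sunday (weekday() == 6)."""
    return sim_to_datetime(now_minutes).weekday() != 6

day = 24 * 60  # minutes per simulated day
for n in range(4):
    t = n * day
    print(sim_to_datetime(t).strftime('%a'), plant_is_open(t), heavy_truck_drives(t))
```

Bank holidays could be added by checking sim_to_datetime(...).date() against a set of holiday dates in the same helpers.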
