I'm looking into getting data from the American stock exchanges for some Python code. Basically, I need to enter a ticker and a past date and get all the data for the next 10 days on which the market was open. Is this possible?
market = input("Market:")
ticker = input("Ticker:")
ticker = ticker.upper()
ystartdate = (input("Start Date IN FORMAT yyyy-mm-dd:"))
day1=input("Day1 :")
day2=input("Day2 :")
day3=input("Day3 :")
day4=input("Day4 :")
day5=input("Day5 :")
day6=input("Day6 :")
day7=input("Day7 :")
day8=input("Day8 :")
day9=input("Day9 :")
day10=input("Day10:")
Currently I have to input all the data manually, which is a pain. Ideally I would enter a stock and a date such as 2012-10-15, and the program would look up the stock on that date and for the next 10 trading days. If it's possible, it would be a lifesaver! Thanks
You should be working with a proper time format, not strings for this.
You can use pandas for example with datetime64.
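import pandas as pd
# Read the start date as text, e.g. 2012-10-15; pandas parses it below
start_date = input("Starting Date: ")
# Ten consecutive calendar days beginning at start_date
dates = pd.date_range(start=start_date, periods=10)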
There is also the datetime package which has timedelta concepts which may help you if you don't want to use pandas.
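If pandas is not available, the same idea can be sketched with only the standard library. The helper name next_weekdays is made up for illustration, and it skips only Saturdays and Sundays, so exchange holidays would still need separate handling.

```python
from datetime import date, timedelta


def next_weekdays(start, count=10):
    """Return `count` weekdays (Mon-Fri) starting at `start`, inclusive."""
    days = []
    current = start
    while len(days) < count:
        # weekday() is 0 for Monday ... 6 for Sunday
        if current.weekday() < 5:
            days.append(current)
        current += timedelta(days=1)
    return days


# The start date from the question, parsed from yyyy-mm-dd text
days = next_weekdays(date.fromisoformat("2012-10-15"))
for d in days:
    print(d.isoformat())
```

Each element is a datetime.date, so it can be compared, sorted, or formatted without any string handling.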
I think what you need is included in pandas. You can use either pandas.bdate_range or pandas.date_range with the freq argument set to B (I think both are more or less the same). These generate business days, that is, they do not include weekends. bdate_range also allows you to specify holidays, so I think it might be a little more flexible.
>>> import pandas as pd
>>> dates = pd.bdate_range(start='2018-10-25', periods=10) # Start date is a Thursday
>>> print(dates)
DatetimeIndex(['2018-10-25', '2018-10-26', '2018-10-29', '2018-10-30',
'2018-10-31', '2018-11-01', '2018-11-02', '2018-11-05',
'2018-11-06', '2018-11-07'],
dtype='datetime64[ns]', freq='B')
Note how this excludes the 27th (a Saturday) and the 28th (a Sunday). If you want to specify holidays, you need to specify freq='C'.
Having these dates in separate variables is kind of ugly, but if you really want to, you can then go and unpack them like this:
>>> day1, day2, day3, day4, day5, day6, day7, day8, day9, day10 = dates
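As a rough sketch of the holiday option mentioned above, bdate_range accepts a holidays list when freq='C' is used. The single holiday below (Christmas Day 2018) is only an illustrative assumption; a real exchange calendar would need the full list of closures.

```python
import pandas as pd

# Illustrative holiday list; not a complete exchange calendar
holidays = ["2018-12-25"]

# freq="C" (custom business day) is required for holidays to be honoured
dates = pd.bdate_range(start="2018-12-24", periods=3, freq="C", holidays=holidays)
print(dates)
```

The 25th is skipped even though it falls on a Tuesday, because it appears in the holidays list.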
I am encountering some issues when using the .between method in Python.
I have a simple dataset consisting of ~59000 records
The date format is in DD/MM/YYYY and I would like to filter the days in the month of April in the year 2014.
psi_df = pd.read_csv('thecsvfile.csv')
psi_west_df = psi_df[['24-hr_psi','west']]
april_records = psi_west_df[psi_west_df['24-hr_psi'].between('1/4/2014','31/4/2014')]
april_records.head(100)
I received the output whereby the date suddenly jumps from 3/4/2014 (3rd April) - 10/4/2014 (10th April). This pattern recurs for every month and for every year up till the year 2020 (the final year of this dataset), which was not my original intention of obtaining the data for the month of April in the year 2014.
As I am still rather new to python, I decided to perform some fixes in Excel instead. I separated the date and the time columns and reran the code with the necessary syntax updated.
psi_df = pd.read_csv('psi_new.csv')
psi_west_df = psi_df[['date','west']]
april_records = psi_west_df[psi_west_df['date'].between('1/4/2014','31/4/2014')]
april_records.head(100)
I still faced the same issue and now, I am totally stumped as to why this is occurring. Am I using the .between method wrongly? Seeking everyone's kind guidance and directions as to why this is occurring. Much appreciated and many thanks everyone.
The csv file that I am using can be obtained from this website:
https://data.gov.sg/dataset/historical-24-hr-psi
The first problem is that your date column isn't a datetime column but an object (string) column, so between compares the values as text.
Make sure the column really holds dates by converting it with the pandas to_datetime function.
psi_west_df['date'] = pd.to_datetime(psi_west_df['date'], format='%d/%m/%Y')
Once the column is a real datetime column, pass between two datetime objects rather than strings. Note that April has only 30 days, so the end date is 30/4/2014:
start_day = pd.to_datetime('1/4/2014', format='%d/%m/%Y')
end_day = pd.to_datetime('30/4/2014', format='%d/%m/%Y')
april_records = psi_west_df[psi_west_df['date'].between(start_day, end_day)]
So all together:
import pandas as pd
psi_df = pd.read_csv('psi_new.csv')
# .copy() avoids a SettingWithCopyWarning when the column is reassigned below
psi_west_df = psi_df[['date','west']].copy()
psi_west_df['date'] = pd.to_datetime(psi_west_df['date'], format='%d/%m/%Y')
start_day = pd.to_datetime('1/4/2014', format='%d/%m/%Y')
end_day = pd.to_datetime('30/4/2014', format='%d/%m/%Y')
april_records = psi_west_df[psi_west_df['date'].between(start_day, end_day)]
april_records.head(100)
Note - this code should work on the data after you have changed it in Excel, meaning you have separate columns for date and time.
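To see why the string version skips days, here is a small self-contained sketch. The five sample values are invented for illustration and are not taken from the PSI dataset.

```python
import pandas as pd

# Invented sample dates in DD/MM/YYYY text form
raw = pd.Series(["1/4/2014", "3/4/2014", "4/4/2014", "10/4/2014", "1/5/2014"])

# Text comparison is character by character, so "4/4/2014" sorts after "31/4/2014"
as_text = raw[raw.between("1/4/2014", "31/4/2014")].tolist()

# Parsing first makes between() compare real dates
parsed = pd.to_datetime(raw, format="%d/%m/%Y")
in_april = parsed.between(pd.Timestamp(2014, 4, 1), pd.Timestamp(2014, 4, 30))
as_dates = raw[in_april].tolist()

print(as_text)   # 4/4/2014 is missing and 1/5/2014 slips in
print(as_dates)  # only the April dates
```

The text filter drops 4 April and keeps 1 May, which matches the jump from the 3rd to the 10th described in the question.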
I have a question. I have a set of numeric values that represent dates exported from SAS, but they are apparently not formatted as dates. For example, the value 5893 is 19.02.1976 in SAS when formatted correctly. I want to achieve the same conversion in Python/PySpark. From what I've found so far, there is a function fromtimestamp.
However, when I do this, it gives a wrong date:
value = 5893
date = datetime.datetime.fromtimestamp(value)
print(date)
1970-01-01 02:38:13
Any proposals to get the correct date? Thank you! :-)
EDIT: And what would the code look like when this operation is applied to a dataframe column rather than a single variable?
The Epoch, as far as SAS is concerned, is 1st January 1960. The number you have (5893) is the number of elapsed days since that Epoch. Therefore:
from datetime import timedelta, date
print(date(1960, 1, 1) + timedelta(days=5893))
...will give you the desired result
import pandas as pd
# For a pandas column: treat the values as day counts and add the SAS epoch
ser = pd.Series([19411.0, 19325.0, 19325.0, 19443.0, 19778.0])
ser = pd.to_timedelta(ser, unit='D') + pd.Timestamp('1960-1-1')
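The same conversion applied to a DataFrame column might look like the sketch below. The column names sas_days and date are hypothetical, and the sample rows reuse values from the posts above.

```python
import pandas as pd

# SAS counts dates as days elapsed since 1 January 1960
SAS_EPOCH = pd.Timestamp("1960-01-01")

# Hypothetical frame; "sas_days" stands in for the real column name
df = pd.DataFrame({"sas_days": [5893, 0, 19411]})

# Convert the day counts to timedeltas and shift them by the epoch
df["date"] = SAS_EPOCH + pd.to_timedelta(df["sas_days"], unit="D")
print(df)
```

The first row comes out as 1976-02-19, the value the question expects for 5893.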
Revised question with appropriate MCVE:
As part of a script I'm writing, I need a loop that uses a different pair of dates on each iteration: the first and last available stock trading dates of each month. I have found a calendar that provides the available dates as an index, but despite my research I am not sure how to select the correct dates from this index so that they can be used in the datetime variables start and end.
Here is as far as my research has got me and I will continue to search for and build my own solution which I will post if I manage to find one:
from __future__ import division
import numpy as np
import pandas as pd
import datetime
import pandas_market_calendars as mcal
from pandas_datareader import data as web
from datetime import date
'''
Full date range:
'''
startrange = datetime.date(2016, 1, 1)
endrange = datetime.date(2016, 12, 31)
'''
Tradable dates in the year:
'''
nyse = mcal.get_calendar('NYSE')
available = nyse.valid_days(start_date='2016-01-01', end_date='2016-12-31')
'''
The loop that needs to take first and last trading date of each month:
'''
dict1 = {}
for i in available:
    start = datetime.date('''first available trade day of the month''')
    end = datetime.date('''last available trade day of the month''')
    diffdays = ((end - start).days)/365
    dict1[i] = diffdays
print(dict1)
That is probably because 1 January 2016 was not a trading day. To check if I am right, try giving it the date 4 January 2016, which was the following Monday. If that works, then you will have to be more sophisticated about the dates you ask for.
Look in the documentation for dm.BbgDataManager(). It is possible that you can ask it which dates are available.
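For the first and last trading day of each month, one possible approach is to group the available dates by year and month and take the minimum and maximum of each group. In the sketch below, pd.bdate_range is only a stand-in for the nyse.valid_days(...) result (it does not remove exchange holidays), and the names bounds and year_fraction are made up for illustration.

```python
import pandas as pd

# Stand-in for nyse.valid_days(...): plain weekdays, exchange holidays NOT removed
available = pd.bdate_range("2016-01-01", "2016-12-31")

# Group the dates by (year, month) and keep the earliest and latest of each group
days = pd.Series(available, index=available)
by_month = days.groupby([available.year, available.month])
bounds = pd.DataFrame({"start": by_month.min(), "end": by_month.max()})

# Same calculation as in the question: span of each month as a fraction of a year
bounds["year_fraction"] = (bounds["end"] - bounds["start"]).dt.days / 365
print(bounds)
```

Swapping in the real market calendar index for available should give holiday-aware results, assuming it is a DatetimeIndex like the stand-in.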
Currently I am trying to split a date into day, month and year with the following code.
#Code from my local machine
from datetime import datetime
from datetime import timedelta
five_days_ago = datetime.now()-timedelta(days=5)
# result: 2017-07-14 19:52:15.847476
get_date = str(five_days_ago).rpartition(' ')[0]
#result: 2017-07-14
#Extract the day
day = get_date.rpartition('-')[2]
# result: 14
#Extract the year
year = get_date.rpartition('-')[0]
# result: 2017-07
I am not a Python professional, as I only picked up the language a couple of months ago, but I want to understand a few things here:
Why did I receive 2017-07 if str.rpartition() is supposed to split a string at the separator I give it (-, /, " ")? I was expecting to receive 2017...
Is there an efficient way to separate day, month and year? I do not want to repeat the same mistakes with my shaky code.
I tried my code in the following setups:
local machine with Python 3.5.2 (x64), Python 3.6.1 (x64) and repl.it with Python 3.6.1
To try the code online, copy and paste the lines of code.
Try the following:
from datetime import date, timedelta
five_days_ago = date.today() - timedelta(days=5)
day = five_days_ago.day
month = five_days_ago.month
year = five_days_ago.year
If what you want is a date (not a date and time), use date instead of datetime. Then the day, month and year are simply attributes of the date object.
As to your question regarding rpartition, it splits on the rightmost separator (in your case, the hyphen between the month and the day); that's what the r in rpartition means. So get_date.rpartition('-') returns the tuple ('2017-07', '-', '14').
If you want to persist with your approach, your year code will work if you replace rpartition with partition, e.g.:
year = get_date.partition('-')[0]
# result: 2017
However, there's also a related (better) approach - use split:
parts = get_date.split('-')
year = parts[0]
month = parts[1]
day = parts[2]
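The difference between the three string methods can be checked side by side. This is just an illustration using the fixed date string from the question.

```python
get_date = "2017-07-14"

# partition splits at the FIRST separator, rpartition at the LAST one
left = get_date.partition("-")
right = get_date.rpartition("-")

# split breaks at every separator; int() turns the pieces into numbers
year, month, day = (int(part) for part in get_date.split("-"))

print(left)   # ('2017', '-', '07-14')
print(right)  # ('2017-07', '-', '14')
print(year, month, day)
```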
I currently have a program set up to run in two different ways. One way is to run over a specified time frame, and the other is to run every day. However, when I have it set to run every day, I only want it to continue if it's a business day. From research I've seen that you can iterate through business days using pandas like so:
start = "2016-08-05"
end = datetime.date.today().strftime("%Y-%m-%d")
for day in pd.bdate_range(start, end):
    print(str(day) + " is a business Day")
And this works great when I run my program over the specified period.
But when I want to have the program run every day, I can't quite figure out how to test whether one specific day is a business day. Basically I want to do something like this:
start = datetime.date.today().strftime("%Y-%m-%d")
end = datetime.date.today().strftime("%Y-%m-%d")
if start == end:
    # Bdate is a placeholder for the check I am looking for
    if not Bdate(start):
        print("Not a Business day")
I know I could probably set up pd.bdate_range() to do what I'm looking for, but in my opinion that would be sloppy and not intuitive. I feel like there must be a simpler solution to this. Any advice?
Since len of pd.bdate_range() tells us how many business days are in the supplied range of dates, we can cast this to a bool to determine if a range of a single day is a business day:
def is_business_day(date):
    return bool(len(pd.bdate_range(date, date)))
I just found a different solution to this. This might be interesting if you want to find the next business day if your date is not a business day.
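from pandas.tseries.offsets import BDay
bdays = BDay()
def is_business_day(date):
    # A business day is unchanged when zero business days are added
    return date == date + 0*bdays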
Adding 0*bdays rolls forward to the next business day, counting the current day if it is already one. Unfortunately, subtracting 0*bdays does not roll backwards (at least with the pandas version I was using).
Because of this behavior, you also need to be careful: for a given date, it is not necessarily true that
date + 0*bdays + 1*bdays == date + 1*bdays
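The caveat can be made concrete with a Saturday. The sketch below assumes a reasonably recent pandas, and the dates were picked only for illustration (20 August 2020 was a Thursday, 22 August 2020 a Saturday).

```python
import pandas as pd
from pandas.tseries.offsets import BDay

bday = BDay()
thursday = pd.Timestamp("2020-08-20")
saturday = pd.Timestamp("2020-08-22")

# A business day is left unchanged; a Saturday rolls forward to Monday
unchanged = thursday + 0 * bday
rolled = saturday + 0 * bday

# Rolling first and then stepping lands one day later than stepping directly
via_zero = saturday + 0 * bday + 1 * bday
direct = saturday + 1 * bday

print(unchanged.date(), rolled.date(), via_zero.date(), direct.date())
```

Starting from the Saturday, the two-step sum reaches Tuesday while the direct step reaches Monday.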
There is a built-in method to do this in pandas.
For Pandas version <1.0
from pandas.tseries.offsets import Day, BDay
from datetime import datetime
bday=BDay()
is_business_day = bday.onOffset(datetime(2020,8,20))
For Pandas version >=1.1.0 (onOffset is deprecated)
from pandas.tseries.offsets import Day, BDay
from datetime import datetime
bday=BDay()
is_business_day = bday.is_on_offset(datetime(2020,8,20))
With NumPy 1.7.0 or later, try np.is_busday():
import datetime
import numpy as np
start = datetime.date.today().strftime("%Y-%m-%d")
end = datetime.date.today().strftime("%Y-%m-%d")
if start == end:
    # added code here
    if not np.is_busday(start):
        print("Not a Business day")
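np.is_busday also takes a holidays argument, which may help when weekends alone are not enough. The dates and the one-entry holiday list below are illustrative assumptions, not a real market calendar.

```python
import numpy as np

# 2020-08-20 was a Thursday, 2020-08-22 a Saturday
weekday_ok = bool(np.is_busday("2020-08-20"))
weekend_ok = bool(np.is_busday("2020-08-22"))

# 2020-12-25 was a Friday; it only counts as closed if listed as a holiday
plain = bool(np.is_busday("2020-12-25"))
with_holiday = bool(np.is_busday("2020-12-25", holidays=["2020-12-25"]))

print(weekday_ok, weekend_ok, plain, with_holiday)
```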
I use an old trick from Excel:
from pandas.tseries.offsets import Day, BDay
def is_bday(x):
    # Step forward one calendar day, then back one business day
    return x == x + Day(1) - BDay(1)
Please have a look at the bdateutil module.
The code below uses it:
from bdateutil import isbday
from datetime import datetime,date
now = datetime.now()
val = isbday(date(now.year, now.month, now.day))
print(val)
Please let me know if this helps.