Loading data from Yahoo! Finance with pandas - python

I am working my way through Wes McKinney's book Python For Data Analysis and on page 139 under Correlation and Covariance, I am getting an error when I try to run his code to obtain data from Yahoo! Finance.
Here is what I am running:
# CORRELATION AND COVARIANCE
from pandas import DataFrame
import pandas.io.data as web
all_data = {}
for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOG']:
    all_data[ticker] = web.get_data_yahoo(ticker, '1/1/2003', '1/1/2013')
price = DataFrame({tic: data['Adj Close']
                   for tic, data in all_data.iteritems()})
volume = DataFrame({tic: data['Volume']
                    for tic, data in all_data.iteritems()})
Here is the error I am getting:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\io\data.py", line 390, in get_data_yahoo
adjust_price, ret_index, chunksize, 'yahoo', name)
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\io\data.py", line 336, in _get_data_from
hist_data = src_fn(symbols, start, end, retry_count, pause)
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\io\data.py", line 190, in _get_hist_yahoo
return _retry_read_url(url, retry_count, pause, 'Yahoo!')
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\io\data.py", line 169, in _retry_read_url
"return a 200 for url %r" % (retry_count, name, url))
IOError: after 3 tries, Yahoo! did not return a 200 for url 'http://ichart.yahoo.com/table.csv?s=GOOG&a=0&b=1&c=2000&d=0&e=1&f=2010&g=d&ignore=.csv'
Any idea on what the problem is?

As Karl pointed out, the ticker had changed, so Yahoo returns a 'page not found' response.
When polling data from the web, it is a good idea to wrap the call in a try/except block:
all_data = {}
for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOG']:
    try:
        all_data[ticker] = web.get_data_yahoo(ticker, '1/1/2003', '1/1/2013')
    except IOError:
        # get_data_yahoo raises IOError when Yahoo! does not return a 200
        print "Can't find", ticker
# build the frames once, from whichever tickers downloaded successfully
price = DataFrame({tic: data['Adj Close']
                   for tic, data in all_data.iteritems()})
volume = DataFrame({tic: data['Volume']
                    for tic, data in all_data.iteritems()})
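The same try/except idea can be sketched in Python 3 so that failed tickers are collected instead of only printed. This is a minimal sketch, not the library's API: fetch_all and fake_fetch are hypothetical names, and fake_fetch stands in for a real download call such as get_data_yahoo so the example runs without a network connection.

```python
def fetch_all(tickers, fetch):
    """Call fetch(ticker) for each ticker and return (results, failures)."""
    results, failures = {}, {}
    for ticker in tickers:
        try:
            results[ticker] = fetch(ticker)
        except IOError as exc:
            # Remember why this ticker failed and keep going with the rest
            failures[ticker] = exc
    return results, failures


def fake_fetch(ticker):
    # Hypothetical stand-in for a real download; 'GOOG' simulates a dead ticker
    if ticker == 'GOOG':
        raise IOError('no data for %s' % ticker)
    return [1.0, 2.0, 3.0]


results, failures = fetch_all(['AAPL', 'GOOG'], fake_fetch)
print(sorted(results))   # tickers that downloaded
print(sorted(failures))  # tickers that raised
```

Catching only IOError (rather than a bare except) lets unrelated bugs, such as a typo in a variable name, still surface as errors.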

Had the same problem and changing 'GOOG' to 'GOOGL' seems to work, once you've followed these instructions to switch from pandas.io.data to pandas_datareader.data.
http://pandas-datareader.readthedocs.org/en/latest/remote_data.html#yahoo-finance

As of 6/1/17, I pieced the following together from this page and a couple of others:
from pandas import DataFrame
from pandas_datareader import data as web
# import pandas.io.data as web
import fix_yahoo_finance
import datetime
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2017, 6, 1)
all_data = {}
for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOGL']:
    all_data[ticker] = web.get_data_yahoo(ticker, start, end)
# .items() works on both Python 2 and Python 3 (.iteritems() is Python 2 only)
price = DataFrame({tic: data['Adj Close']
                   for tic, data in all_data.items()})
volume = DataFrame({tic: data['Volume']
                    for tic, data in all_data.items()})

I'm using the code snippet below to load Yahoo Finance data.
import pandas_datareader as pdr
from datetime import datetime
from pandas import DataFrame as df
def get_data(selection, sdate, edate):
    data = pdr.get_data_yahoo(symbols=selection, start=sdate, end=edate)
    data = df(data['Adj Close'])
    return data
start_date = datetime(2017, 1, 1)
end_date = datetime(2019,4,28)
selected = [ 'TD.TO', 'AC.TO', 'BNS.TO', 'ENB.TO', 'MFC.TO','RY.TO','BCE.TO']
print(get_data(selected, start_date, end_date).head(1))
https://repl.it/repls/DevotedBetterAlgorithms

Related

pyfinance dividends and documentation

I'm trying to use pyfinance to pull data, but I have run into issues with the dividends. Below is the code; the error I get is shown after it.
import yfinance as yf
print('Enter Ticker:')
symbol = input()
symbol = yf.Ticker(symbol)
print('Forward PE:')
print(symbol.info['forwardPE'])
print('Dividends:')
info = yf.Ticker(symbol).info
div = info.get('trailingAnnualDividendYield')
print(div)
Does anyone have documentation for pyfinance? What I have been able to find is slim. How can I view the modules, classes, etc.?
Error from python interpreter:
Enter Ticker:
c
Forward PE:
8.224477
Dividends:
Traceback (most recent call last):
File "/home/user/Desktop/test.py", line 10, in <module>
info = yf.Ticker(symbol).info
File "/home/user/.local/lib/python3.9/site-packages/yfinance/base.py", line 49, in __init__
self.ticker = ticker.upper()
AttributeError: 'Ticker' object has no attribute 'upper'
You assign symbol = yf.Ticker(symbol), so symbol is now a yfinance.Ticker object, not a string. You then call yf.Ticker(symbol).info, which passes that object to the constructor where a string is expected, hence the error; the second yf.Ticker call is not needed anyway. Don't skimp on variable names:
import yfinance as yf
print('Enter Ticker:')
symbol = input()
s = yf.Ticker(symbol)
print('Forward PE:')
print(s.info['forwardPE'])
print('Dividends:')
div = s.info.get('trailingAnnualDividendYield')
print(div)
And the results:
Enter Ticker:
ibm
Forward PE:
12.553453
Dividends:
0.049131215
This fixed it:
import yfinance as yf
print('Enter Ticker:')
x = input()
symbol = x
symbol = yf.Ticker(symbol)
info = yf.Ticker(x).info
div = info.get('trailingAnnualDividendYield')
print('Forward PE:')
print(symbol.info['forwardPE'])
print('Dividend:')
print(div)
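On the documentation part of the question: when a package's docs are thin, Python's own introspection tools can list what a module or class exposes. The sketch below is only an illustration; public_members and first_doc_line are hypothetical helper names, and the standard-library json module stands in for the package being explored so the example runs without extra installs.

```python
import inspect
import json  # stand-in for the package you want to explore


def public_members(obj):
    """Return the sorted public attribute names of a module, class, or instance."""
    return sorted(name for name in dir(obj) if not name.startswith('_'))


def first_doc_line(obj):
    """Return the first line of an object's docstring, or '' if it has none."""
    doc = inspect.getdoc(obj) or ''
    return doc.splitlines()[0] if doc else ''


print(public_members(json))        # names the module exposes
print(first_doc_line(json.dumps))  # one-line summary of a function
```

Passing an imported package or one of its classes instead of json should work the same way, and calling help(obj) in an interactive session prints the full docstrings.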

Python KeyError : 0, can you help me find the error?

I have a KeyError: 0 in my Python code.
I don't really understand what it means in my case. I have read a lot about it, but I can't find the error on my own.
Can somebody help me find it and maybe explain it to me?
Regards,
import requests
import pandas as pd
# use a function to pull all info from website
def getdata(stock):
    # company quote group of items
    company_quote = requests.get(f"https://financialmodelingprep.com/api/v3/quote/{stock}")
    company_quote = company_quote.json()
    share_price = float("{0:.2f}".format(company_quote[0]['price']))
    # balance sheet
    BS = requests.get(f"https://financialmodelingprep.com/api/v3/financials/balance-sheet-statement/{stock}?period=quarter")
    BS = BS.json()
    # total debt
    debt = float("{0:.2f}".format(float(BS['financials'][0]['Total debt'])/10**9))
    # total cash
    cash = float("{0:.2f}".format(float(BS['financials'][0]['Cash and short-term investments'])/10**9))
    # income statement group of items
    IS = requests.get(f"https://financialmodelingprep.com/api/v3/financials/income-statement/{stock}?period=quarter")
    IS = IS.json()
    # most recent quarterly revenue
    qRev = float("{0:.2f}".format(float(IS['financials'][0]['Revenue'])/10**9))
    # company profile group of items
    company_info = requests.get(f"https://financialmodelingprep.com/api/v3/company/profile/{stock}")
    company_info = company_info.json()
    # CEO
    ceo = company_info['profile']['ceo']
    return (share_price, cash, debt, qRev, ceo)
tickers = ('AAPL', 'MSFT', 'GOOG', 'MVIS')
data = map(getdata, tickers)
# create the dataframe with pandas to store all of the info
# (five column names, one per value returned by getdata)
df = pd.DataFrame(data, columns = ['Share Price', 'Total Cash', 'Total Debt', 'Q3 2019 Revenue', 'CEO'], index = tickers)
print(df)
# writing to excel
writer = pd.ExcelWriter('example.xlsx')
df.to_excel(writer, 'Statistics')
writer.save()
I just ran the code you pasted, and the issue seems to be that you are not calling the API correctly: the request is missing an API key. With your code I get this:
{'Error Message': 'Invalid API KEY. Please retry or visit our documentation to create one FREE https://financialmodelingprep.com/developer/docs'}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 5, in getdata
KeyError: 0
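So take a look at the API documentation and send the required values (probably a missing API key parameter or header). Because the error response is a JSON object (a dict) rather than a list, company_quote[0] looks up the key 0 in that dict, which raises KeyError: 0.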
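To make the failure mode above concrete, here is a small sketch that checks the shape of the decoded JSON before indexing into it. The first_quote helper is a hypothetical name, and the two payloads are made-up examples modelled on the success and error responses described in this thread.

```python
def first_quote(payload):
    """Return the first quote from a decoded JSON payload, or raise a clear error."""
    if isinstance(payload, dict) and 'Error Message' in payload:
        raise RuntimeError('API error: ' + payload['Error Message'])
    if not isinstance(payload, list) or not payload:
        raise RuntimeError('unexpected payload: %r' % (payload,))
    return payload[0]


ok = [{'symbol': 'AAPL', 'price': 123.456}]    # list of quotes (success case)
bad = {'Error Message': 'Invalid API KEY.'}    # dict (error case)

print(first_quote(ok)['price'])

try:
    bad[0]  # what the original code effectively does with an error response
except KeyError as exc:
    print('KeyError:', exc)

try:
    first_quote(bad)
except RuntimeError as exc:
    print(exc)
```

Indexing a dict with 0 is a key lookup, not a positional one, which is why the error message is KeyError: 0 rather than an IndexError.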

How do you use the python alpha_vantage API to return extended intraday data?

I have been working with the alpha vantage python API for a while now, but I have only needed to pull daily and intraday timeseries data. I am trying to pull extended intraday data, but am not having any luck getting it to work. Trying to run the following code:
from alpha_vantage.timeseries import TimeSeries
apiKey = 'MY API KEY'
ts = TimeSeries(key = apiKey, output_format = 'pandas')
totalData, _ = ts.get_intraday_extended(symbol = 'NIO', interval = '15min', slice = 'year1month1')
print(totalData)
gives me the following error:
Traceback (most recent call last):
File "/home/pi/Desktop/test.py", line 9, in <module>
totalData, _ = ts.get_intraday_extended(symbol = 'NIO', interval = '15min', slice = 'year1month1')
File "/home/pi/.local/lib/python3.7/site-packages/alpha_vantage/alphavantage.py", line 219, in _format_wrapper
self, *args, **kwargs)
File "/home/pi/.local/lib/python3.7/site-packages/alpha_vantage/alphavantage.py", line 160, in _call_wrapper
return self._handle_api_call(url), data_key, meta_data_key
File "/home/pi/.local/lib/python3.7/site-packages/alpha_vantage/alphavantage.py", line 354, in _handle_api_call
json_response = response.json()
File "/usr/lib/python3/dist-packages/requests/models.py", line 889, in json
self.content.decode(encoding), **kwargs
File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 518, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
What is interesting is that if you look at the TimeSeries class, it states that extended intraday is returned as a "time series in one csv_reader object" whereas everything else, which works for me, is returned as "two json objects". I am 99% sure this has something to do with the issue, but I'm not entirely sure because I would think that calling intraday extended function would at least return SOMETHING (despite it being in a different format), but instead just gives me an error.
Another interesting little note is that the function refuses to take "adjusted = True" (or False) as an input despite it being in the documentation... likely unrelated, but maybe it might help diagnose.
It seems that TIME_SERIES_INTRADAY_EXTENDED can only return CSV, but the alpha_vantage wrapper tries to parse the response as JSON, which results in the error.
My workaround:
from alpha_vantage.timeseries import TimeSeries
import pandas as pd
apiKey = 'MY API KEY'
ts = TimeSeries(key = apiKey, output_format = 'csv')
#download the csv
totalData = ts.get_intraday_extended(symbol = 'NIO', interval = '15min', slice = 'year1month1')
#csv --> dataframe
df = pd.DataFrame(list(totalData[0]))
#setup of column and index
header_row=0
df.columns = df.iloc[header_row]
df = df.drop(header_row)
df.set_index('time', inplace=True)
#show output
print(df)
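The CSV-versus-JSON mismatch described in this answer can be reproduced without calling the API. In the sketch below, csv_text is a made-up payload (not real market data) shaped like a CSV response; feeding it to a JSON parser fails at the very first character, while the csv module reads it without trouble.

```python
import csv
import io
import json

# Made-up CSV payload in the shape a CSV-only endpoint might return
csv_text = "time,open,close\n2021-01-04 09:45:00,50.1,50.4\n"

error = None
try:
    json.loads(csv_text)  # what a JSON-only wrapper would attempt
except json.JSONDecodeError as exc:
    error = str(exc)
print(error)  # Expecting value: line 1 column 1 (char 0)

# Parsing the same text as CSV works
rows = list(csv.reader(io.StringIO(csv_text)))
header, records = rows[0], rows[1:]
print(header)
print(records)
```

The message matches the one at the bottom of the traceback in the question, which supports the diagnosis that the wrapper was handed CSV text.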
This is an easy way to do it: pandas can read the CSV straight from the URL.
import pandas as pd
ticker = 'IBM'
date = 'year1month2'
apiKey = 'MY API KEY'
df = pd.read_csv('https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY_EXTENDED&symbol='+ticker+'&interval=15min&slice='+date+'&apikey='+apiKey+'&datatype=csv&outputsize=full')
# show output
print(df)
import pandas as pd
symbol = 'AAPL'
interval = '15min'
slice = 'year1month1'
api_key = ''
adjusted = '&adjusted=true'
csv_url = 'https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY_EXTENDED&symbol='+symbol+'&interval='+interval+'&slice='+slice+adjusted+'&apikey='+api_key
data = pd.read_csv(csv_url)
print(data.head())
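Building the query string by hand, as in the snippets above, makes it easy to drop or double an ampersand. A possible alternative, sketched here with the standard library's urlencode, assembles the URL from a dict. The build_csv_url name and the placeholder key are assumptions for illustration; the parameter names are the ones used in the URLs in this thread.

```python
from urllib.parse import urlencode


def build_csv_url(symbol, interval, slice_name, api_key, adjusted=True):
    """Assemble the query URL from named parameters instead of string pieces."""
    params = {
        'function': 'TIME_SERIES_INTRADAY_EXTENDED',
        'symbol': symbol,
        'interval': interval,
        'slice': slice_name,
        'adjusted': 'true' if adjusted else 'false',
        'apikey': api_key,
    }
    # urlencode joins the pairs with '&' and escapes unsafe characters
    return 'https://www.alphavantage.co/query?' + urlencode(params)


url = build_csv_url('AAPL', '15min', 'year1month1', 'MY_API_KEY')
print(url)
```

The resulting string can then be passed to pd.read_csv in the same way as the hand-built URL.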

Getting a Request Error 404 in Python while accessing an API

Previously, I used a Morningstar API to get stock data; however, now that I am away from the USA for a week, I am unable to access the data.
This is the code snippet:
import datetime as dt
from dateutil.relativedelta import relativedelta
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
import pandas_datareader.data as web
import csv
from mpl_finance import candlestick_ohlc
import matplotlib.dates as mdates
from matplotlib.dates import DateFormatter, MonthLocator, YearLocator, DayLocator, WeekdayLocator
style.use( 'ggplot' )
end = dt.date.today()
start_48 = end - relativedelta( years=4 )
start_120 = end - relativedelta( years=10 )
ticker = input( 'Ticker: ' ) #should be in Uppercase
ticker = ticker.upper()
df_w = web.DataReader( ticker, 'morningstar', start_48, end )
df_m = web.DataReader( ticker, 'morningstar', start_120, end )
print()
file_name_w = ticker + 'weekly.csv'
file_name_m = ticker + 'monthly.csv'
df_w.to_csv( file_name_w )
df_m.to_csv( file_name_m )
df_w = pd.read_csv( file_name_w, parse_dates=True, index_col=0 )
df_m = pd.read_csv( file_name_m, parse_dates=True, index_col=0 )
This is the error message:
Ticker: spy
Traceback (most recent call last):
File "/Users/zubairjohal/Documents/OHLC.py", line 24, in <module>
df_w = web.DataReader( ticker, 'morningstar', start_48, end )
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas_datareader/data.py", line 391, in DataReader
session=session, interval="d").read()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas_datareader/mstar/daily.py", line 219, in read
df = self._dl_mult_symbols(symbols=symbols)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas_datareader/mstar/daily.py", line 130, in _dl_mult_symbols
resp.status_code, resp.reason))
Exception: Request Error!: 404 : Not Found
Is it an IP issue, and is there a way to fix this? I know that this code is fine because it worked perfectly well two days ago.
I had the same problem too, here in the USA. The datareader service (morningstar) worked 3 days ago and stopped working the day before yesterday. I believe that Morningstar changed their REST interface, so there is not much we can do except wait for the developers to fix it.
404 means 'not found'. Assuming you didn't change anything and it suddenly stopped working, I would say that either the API URL is not accessible from that country (or is blocked on that specific network), or their API changed (or is under maintenance). If you know the API URL, try it directly in a browser over different Internet connections.
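As a small aid for reading status codes like the 404 discussed above, the standard library's http.HTTPStatus maps a numeric code to its reason phrase. The describe_status helper below is a hypothetical name, and the client/server split simply follows the 4xx and 5xx ranges of HTTP.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)
    if 400 <= code < 500:
        kind = 'client error'   # the request (for example, the URL) was rejected
    elif 500 <= code < 600:
        kind = 'server error'   # the service itself failed
    else:
        kind = 'not an error'
    return '%d %s (%s)' % (code, status.phrase, kind)


print(describe_status(404))
print(describe_status(503))
```

A 4xx code for a URL that used to work is consistent with the endpoint having moved or been withdrawn, whereas a 5xx code would point more towards an outage.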

Python reading date from excel throws error

I am trying to read a date from an Excel file using the xlrd module. Below is my code:
# Variables
myfile = '/home/mobaxterm/.git/Operation_Documentation/docs/Servicing Portal User & Certificate Inventory.xlsx'
mydate = 'Expiration Date'
row_head = 0
# Import required modules
import xlrd
import datetime
today = datetime.date.today()
book = xlrd.open_workbook(myfile)
sheet = book.sheet_by_index(1)
for col_index in range(sheet.ncols):
    print xlrd.cellname(row_head,col_index),"-",
    print sheet.cell(row_head,col_index).value
    if sheet.cell(row_head,col_index).value == mydate:
        for raw_index in range(sheet.nrows):
            expire = sheet.cell(raw_index,col_index).value
            print expire
            expire_date = datetime.datetime(*xlrd.xldate_as_tuple(expire, book.datemode))
            print 'datetime: %s' % expire_date
        break
While running the code, I am getting the following error:
Traceback (most recent call last):
File "cert_monitor.py", line 31, in <module>
expire_date = datetime.datetime(*xlrd.xldate_as_tuple(expire, book.datemode))
File "/usr/lib/python2.6/site-packages/xlrd/xldate.py", line 61, in xldate_as_tuple
xldays = int(xldate)
ValueError: invalid literal for int() with base 10: 'Expiration Date'
Can anyone suggest what could be the issue here?
Thanks for your time.
I believe you only need to skip the header row:
for raw_index in range(1, sheet.nrows):
    ...
You check that sheet.cell(row_head, col_index).value == mydate and then iterate over the rows of that column, but you need to skip row_head first: its value is the string 'Expiration Date', not a date.
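The header-skipping fix can be illustrated without a workbook. The sketch below uses a plain list as a stand-in for one sheet column and a simplified conversion in place of xlrd.xldate_as_tuple; it assumes the 1900 date system (book.datemode == 0) and serial numbers on or after 1 March 1900. serial_to_datetime and the sample values are hypothetical.

```python
import datetime

# Day zero for Excel serial dates in the 1900 date system (assumption: datemode 0)
EXCEL_EPOCH = datetime.datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel serial number (days since the epoch) to a datetime."""
    return EXCEL_EPOCH + datetime.timedelta(days=float(serial))


# Stand-in for one sheet column: header string first, then serial dates
column = ['Expiration Date', 43831.0, 44196.5]
row_head = 0

dates = []
for value in column[row_head + 1:]:  # start after the header row
    dates.append(serial_to_datetime(value))
print(dates)
```

Starting the loop at row_head + 1 keeps the header string away from the numeric conversion, which is the step that raised the ValueError in the question.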
