API CALL STOCK DATA - python

import requests
import pandas as pd
import os
import io
import time
import csv

# Method 1: pre-enter list of stocks inside stock_list.
# stock_list = ['QQQ', 'AAPL', 'TSLA', 'AMZN', 'GOOG',
#               'MSFT', 'META', 'BA', 'PFE', 'MRNA', 'BAC']
stock_list = ['TSLA', 'XLE']

for stock in stock_list:
    os.chdir('C:/Users/bean/Desktop')
    path = f'C:/Users/bean/Desktop'
    API = 'APIKEY'
    symbol = stock
    if not os.path.exists(os.path.join(path, symbol)):
        os.makedirs(os.path.join(symbol + '/months'))
        # os.makedirs(os.path.join(symbol))

    # Slice months for api calls.
    month_slices = [f'year1month1', f'year1month2', f'year1month3',
                    f'year1month4', f'year1month5', f'year1month6',
                    f'year1month7', f'year1month8', f'year1month9',
                    f'year1month10', f'year1month11', f'year1month12',
                    f'year2month1', f'year2month2', f'year2month3',
                    f'year2month4', f'year2month5', f'year2month6',
                    f'year2month7', f'year2month8', f'year2month9',
                    f'year2month10', f'year2month11', f'year2month12']

    # Get all URL links.
    urls = []
    for stock in stock_list:
        for slice in month_slices:
            url = f'https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY_EXTENDED&symbol={stock}&interval=1min&slice={slice}&apikey={API}'
            urls.append(url)
            print(url)

    # Append the data.
    data = []
    counter = 0
    for url in urls:
        response = requests.get(url)
        # df = pd.DataFrame(url)
        df = pd.read_csv(io.BytesIO(response.content))
        df.to_csv(
            f'C:/Users/bean/Desktop/{stock}/months/{stock}_{slice}.csv', index=False)
        data.append(df)
        counter += 1
        if counter % 5 == 0:
            print(
                f'counter is: {counter} for symbol: {stock}. ')
            print(
                'Sleeping one minute. API allows 5 calls per minute; 500 total daily.')
            time.sleep(60)
            counter = 0

    # Combine and save sheets to your destination.
    months_df = pd.concat(data)
    months_df.to_csv(
        f'C:/Users/bean/Desktop/{stock}/combined_{stock}_data.csv', index=False)
    print(f' finished: {months_df}')
Essentially, I'm using Alpha Vantage to try to get minute data for the market. Can anyone help me with this code? I made a long version that works, but here I'm trying to make it more concise using a loop and a counter. Since the free version of the API only allows 5 calls per minute, I need to make the program sleep. The problem is that when the program comes back from the sleep, it makes only one API call instead of the next 5 like it should. Then it stops again for 60 seconds and proceeds again, one call at a time.
I am not sure why it wouldn't just repeat. I liked the idea of checking the remainder (counter % 5 == 0) because, if I have a lot of symbols in the list, it can keep going.
Does this have to do with the indentation of the counter?
Thanks
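For reference, here is a minimal, self-contained sketch of the pacing described above (the URLs are dummies standing in for the real Alpha Vantage calls). Using enumerate() keeps the count in lockstep with the loop, so there is nothing to reset after each sleep:
import time

urls = [f'https://example.com/{i}' for i in range(12)]  # dummy stand-ins

for counter, url in enumerate(urls, start=1):
    print(f'fetching {url}')  # the real code would call requests.get(url) here
    if counter % 5 == 0 and counter < len(urls):  # skip the pointless final sleep
        print('Sleeping one minute: the free tier allows 5 calls per minute.')
        time.sleep(60)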

Related

Making the python version of a SAS macro/call

I'm trying to create in Python what a macro does in SAS. I have a list of over 1K tickers that I'm trying to download information for, but doing all of them in one step made Python crash, so I split the data into 11 portions. Below is the code we're working with:
t0 = t.time()
printcounter = 0
for ticker in tickers1:
    printcounter += 1
    print(printcounter)
    try:
        selected = yf.Ticker(ticker)
        shares = selected.get_shares()
        shares_wide = shares.transpose()
        info = selected.info
        market_cap = info['marketCap']
        sector = info['sector']
        name = info['shortName']
        comb = shares_wide.assign(market_cap_oct22=market_cap, sector=sector, symbol=ticker, name=name)
        company_info_1 = company_info_1.append(comb)
    except:
        comb = pd.DataFrame()
        comb = comb.append({'symbol': ticker, 'ERRORFLAG': 'ERROR'}, ignore_index=True)
        company_info_1 = company_info_1.append(comb)
print("total run time:", round(t.time() - t0, 3), "s")
What I'd like to do, instead of re-writing and running this code for all 11 portions of data and manually changing "tickers1" and "company_info_1" to "tickers2"/"company_info_2", "tickers3"/"company_info_3" (and so on), is to see if there is a way to make a Python version of a SAS macro/call so that I can get this data more dynamically. Is there a way to do this in Python?
You need to generalize your existing code and wrap it in a function.
def company_info(tickers):
    company_info = pd.DataFrame()  # initialize so the appends below have a target
    for ticker in tickers:
        try:
            selected = yf.Ticker(ticker)  # you may also have to pass the yf object
            shares = selected.get_shares()
            shares_wide = shares.transpose()
            info = selected.info
            market_cap = info['marketCap']
            sector = info['sector']
            name = info['shortName']
            comb = shares_wide.assign(market_cap_oct22=market_cap, sector=sector, symbol=ticker, name=name)
            company_info = company_info.append(comb)
        except:
            comb = pd.DataFrame()
            comb = comb.append({'symbol': ticker, 'ERRORFLAG': 'ERROR'}, ignore_index=True)
            company_info = company_info.append(comb)
    return company_info  # return the dataframe
Create a master dataframe to collect your results from the function call. Loop over the 11 groups of tickers passing each group into your function. Append the results to your master.
# master df to collect results
master = pd.DataFrame()

# assuming you have your tickers in a list of lists,
# loop over each of the 11 groups of tickers
for tickers in groups_of_tickers:
    df = company_info(tickers)  # fetch data from Yahoo Finance
    master = master.append(df)
Please note I typed this on the fly. I have no way of testing this. I'm quite sure there are syntactical issues to work through. Hopefully it provides a framework for how to think about the solution.
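If your tickers start out as one flat list rather than a list of lists, one way to build groups_of_tickers is a small chunking helper. The group size of 100 below is an arbitrary choice (roughly 1K tickers gives about 11 groups), and all_tickers is a hypothetical name for your full list:
def chunk(seq, size):
    # Yield successive size-item slices of seq.
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

groups_of_tickers = list(chunk(all_tickers, 100))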

What is the fastest way to iterate through a list of yfinance tickers?

I'm using the python yfinance Yahoo API for stock data retrieval. Right now I'm getting the PEG ratio, which is an indicator of a company's price relative to its growth and earnings. I have a CSV downloaded from here: https://www.nasdaq.com/market-activity/stocks/screener.
It has exactly 8000 stocks.
What I do is get the symbol list and iterate over it to access the Yahoo ticker. Then I use the ticker.info method, which returns a dictionary, and I repeat this process across the 8000 symbols. It goes at a speed of 6 symbols per minute, which is not viable. Is there a faster way, with another API or another structure? I don't care about the API as long as I can get basic info like growth, earnings, EPS and those things.
Here is the code:
import pandas as pd
import yfinance as yf

data = pd.read_csv("data/stock_list.csv")
symbols = data['Symbol']
for symbol in symbols:
    stock = yf.Ticker(symbol)
    try:
        if stock.info['pegRatio']:
            print(stock.info['shortName'] + " : " + str(stock.info['pegRatio']))
    except KeyError:
        pass
It seems that when certain data are needed from the Ticker.info attribute, HTTP requests are made to acquire them. Multithreading will help to improve matters. Try this:
import pandas as pd
import yfinance as yf
import concurrent.futures

data = pd.read_csv('data/stock_list.csv')

def getPR(symbol):
    sn = None
    pr = None
    try:
        stock = yf.Ticker(symbol)
        pr = stock.info['pegRatio']
        sn = stock.info['shortName']
    except Exception:
        pass
    return (sn, pr)

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = {executor.submit(getPR, sym): sym for sym in data['Symbol']}
    for future in concurrent.futures.as_completed(futures):
        sn, pr = future.result()
        if sn:
            print(f'{sn} : {pr}')
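If the default worker count turns out to hammer the endpoint, capping it is one knob to try; the value 8 here is only a guess, not a tested recommendation:
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    futures = {executor.submit(getPR, sym): sym for sym in data['Symbol']}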

Handling Exceptions with Bulk API Requests

I am pulling data from an API that allows batch requests and then storing the data in a DataFrame. When there is an exception with one of the items being looked up via the API, I want to either skip that item entirely (or write zeroes to the DataFrame) and then go on to the next item.
But my issue is that because the API data is accessed in bulk (i.e., not by looping through each item in the list), an exception for any item in the list breaks the program. So how can I elegantly handle exceptions without looping through each individual item in the tickers list?
Note that removing ERROR from the tickers list will enable the program to run successfully:
import os
from iexfinance.stocks import Stock
import iexfinance

# Set IEX Finance API Token (Sandbox)
os.environ['IEX_API_VERSION'] = 'iexcloud-sandbox'
os.environ['IEX_TOKEN'] = 'Tpk_a4bc3e95d4c94810a3b2d4138dc81c5d'

# List of companies to get data for
tickers = ['MSFT', 'ERROR', 'AMZN']
batch = Stock(tickers, output_format='pandas')
income_ttm = 0

try:
    # Get income from last 4 quarters, sum it, and store to temp Dataframe
    df_income = batch.get_income_statement(period="year")
    print(df_income)
except (iexfinance.utils.exceptions.IEXQueryError, iexfinance.utils.exceptions.IEXSymbolError) as e:
    pass
This should do the job:
import os
from copy import deepcopy
from iexfinance.stocks import Stock
import iexfinance

def find_wrong_symbol(tickers, err):
    wrong_ticker = []
    for one_ticker in tickers:
        if one_ticker.upper() in err:
            wrong_ticker.append(one_ticker)
    return wrong_ticker

# Set IEX Finance API Token (Sandbox)
os.environ['IEX_API_VERSION'] = 'iexcloud-sandbox'
os.environ['IEX_TOKEN'] = 'Tpk_a4bc3e95d4c94810a3b2d4138dc81c5d'

# List of companies to get data for
tickers = ['MSFT', 'AMZN', 'failing']
batch = Stock(tickers, output_format='pandas')
income_ttm = 0

try:
    # Get income from last 4 quarters, sum it, and store to temp Dataframe
    df_income = batch.get_income_statement(period="year")
    print(df_income)
except (iexfinance.utils.exceptions.IEXQueryError, iexfinance.utils.exceptions.IEXSymbolError) as e:
    wrong_tickers = find_wrong_symbol(tickers, str(e))
    tickers_to_get = deepcopy(tickers)
    assigning_dict = {}
    for wrong_ticker in wrong_tickers:
        tickers_to_get.pop(tickers_to_get.index(wrong_ticker))
        assigning_dict.update({wrong_ticker: lambda x: 0})
    new_batch = Stock(tickers_to_get, output_format='pandas')
    df_income = new_batch.get_income_statement(period="year").assign(**assigning_dict)
I created a small function to find the tickers that are not handled by the API. After deleting the wrong tickers, I call the API again without them, and with the assign function I add the missing columns with 0 values (it could be anything: a NaN or another default value).
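To see just the .assign(**assigning_dict) step in isolation, here is a toy DataFrame (not real IEX output): a column is appended for each dropped symbol and filled with the default value.
import pandas as pd

df = pd.DataFrame({'MSFT': [1, 2], 'AMZN': [3, 4]})
filler = {'failing': lambda x: 0}  # same shape as assigning_dict above
print(df.assign(**filler))
#    MSFT  AMZN  failing
# 0     1     3        0
# 1     2     4        0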

Retry Single Iteration in For Loop (Python)

Python novice here (sorry if this is a dumb question)! I'm currently using a for loop to download and manipulate data. Unfortunately, I occasionally run into brief network issues that cause portions of the loop to fail.
Originally, I was doing something like this:
# Import Modules
import fix_yahoo_finance as yf
import pandas as pd
from stockstats import StockDataFrame as sdf

# Stock Tickers to Gather Data For - in my full code I have thousands of tickers
Ticker = ['MSFT', 'SPY', 'GOOG']

# Data Start and End Date
Data_Start_Date = '2017-03-01'
Data_End_Date = '2017-06-01'

# Create Data List to Append
DataList = pd.DataFrame([])

# Initialize Loop
for i in Ticker:
    # Download Data
    data = yf.download(i, Data_Start_Date, Data_End_Date)
    # Create StockDataFrame
    stock_df = sdf.retype(data)
    # Calculate RSI
    data['rsi'] = stock_df['rsi_14']
    DataList.append(pd.DataFrame(data))

DataList.to_csv('DataList.csv', header=True, index=True)
With that basic layout, whenever I had a network error, it caused the entire program to halt and spit out an error.
I did some research and tried modifying the for loop to the following:
for i in Ticker:
    try:
        # Download Data
        data = yf.download(i, Data_Start_Date, Data_End_Date)
        # Create StockDataFrame
        stock_df = sdf.retype(data)
        # Calculate RSI
        data['rsi'] = stock_df['rsi_14']
        DataList.append(pd.DataFrame(data))
    except:
        continue
With this, the code always ran without issue, but whenever I encountered a network error, it skipped whichever tickers it was on (failed to download their data).
I want this to download the data for each ticker once. If it fails, I want it to try again until it succeeds once, and then move on to the next ticker. I tried using while True and variations of it, but that caused the loop to download the same ticker multiple times!
Any help or advice is greatly appreciated! Thank you!
If you can continue after you've hit a glitch (some protocols support it), then you're better off not using this exact approach. But for a slightly brute-force method:
for i in Ticker:
    incomplete = True
    tries = 10
    while incomplete and tries > 0:
        try:
            # Download Data
            data = yf.download(i, Data_Start_Date, Data_End_Date)
            incomplete = False
        except:
            tries -= 1
    if incomplete:
        print("Oops, it is really failing a lot, skipping: %r" % (i,))
        continue  # not technically needed, but in case you opt to add
                  # anything afterward ...
    else:
        # Create StockDataFrame
        stock_df = sdf.retype(data)
        # Calculate RSI
        data['rsi'] = stock_df['rsi_14']
        DataList.append(pd.DataFrame(data))
This is slightly different from Prune's in that it stops after 10 attempts ... if it fails that many times, that indicates you may want to divert some energy into fixing a different problem, such as network connectivity.
If it gets to that point, it will continue through the list of Tickers, so perhaps you can get most of what you need.
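Here is a variant of the same bounded-retry idea, sketched with a pause between attempts. The delays and the 10-try cap are arbitrary choices; yf, sdf, pd, Ticker, and the date variables come from the question above, and a frames list plus pd.concat stands in for the DataFrame.append calls:
import time

frames = []
for i in Ticker:
    for attempt in range(10):
        try:
            data = yf.download(i, Data_Start_Date, Data_End_Date)
            break  # success: stop retrying this ticker
        except Exception:
            time.sleep(min(60, 2 ** attempt))  # back off: 1s, 2s, 4s, ...
    else:
        # runs only if all 10 attempts failed (no break)
        print("Giving up on %r after 10 attempts" % (i,))
        continue
    stock_df = sdf.retype(data)
    data['rsi'] = stock_df['rsi_14']
    frames.append(pd.DataFrame(data))

DataList = pd.concat(frames)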
You can use a wrapper loop to continue until you get a good result.
for i in Ticker:
    fail = True
    while fail:  # Keep trying until it works
        try:
            # Download Data
            data = yf.download(i, Data_Start_Date, Data_End_Date)
            # Create StockDataFrame
            stock_df = sdf.retype(data)
            # Calculate RSI
            data['rsi'] = stock_df['rsi_14']
            DataList.append(pd.DataFrame(data))
        except:
            continue
        else:
            fail = False

(Python) Passing multiple changes to URL

I have a list of several stock tickers:
ticker = ('GE', 'IBM', 'GM', 'F', 'PG', 'CSCO')
that I want to pass into the URL in my Python program:
url = "https://www.quandl.com/api/v3/datasets/WIKI/FB.json"
I'm trying to pass a new ticker into the URL on each subsequent pass through my program. I'm struggling with how to pass each ticker from the list into the URL as the program loops. The program needs to grab a new ticker from the list and replace the one currently in the URL.
Example: after the first pass, the program should grab GE from the list, replace FB in the URL, and continue looping until all tickers have been passed into the URL. I'm not sure how best to deal with this part of the program. Any help would be appreciated.
import requests

url_tpl = "https://www.quandl.com/api/v3/datasets/WIKI/{ticker}.json"

# Here your results will be stored
jsons = {}

for ticker in ('FB', 'GE', 'IBM', 'GM', 'F', 'PG', 'CSCO'):
    res = requests.get(url_tpl.format(ticker=ticker))
    if res.status_code == 200:
        jsons[ticker] = res.json()
    else:
        print('error while fetching {ticker}, response code: '
              '{status}'.format(ticker=ticker, status=res.status_code))
