I am trying to pull data for multiple tickers from the yfinance API and save it to a CSV file. In total I have 1000 tickers, and for each one I need the entire table (date, open, high, low, close, volume, etc.). So far I am able to get data for one ticker using the following Python code:
import yfinance as yf
def yfinance(ticker_symbol):
    ticker_data = yf.Ticker(ticker_symbol)
    tickerDF = ticker_data.history(period='1d', start='2020-09-30', end='2020-10-31')
    print(tickerDF)
yfinance('000001.SS')
However, this doesn't work when I try it on multiple tickers. The yfinance docs say to use the following for multiple tickers:
tickers = yf.Tickers('msft aapl goog')
# ^ returns a named tuple of Ticker objects
# access each ticker using (example)
tickers.tickers.MSFT.info
tickers.tickers.AAPL.history(period="1mo")
tickers.tickers.GOOG.actions
I have a couple of issues here. The docs use a string such as 'aapl', but my tickers are all in a digit format like '000001.SS', and the ".SS" part is proving to be an issue when I put it into the code:
tickers.tickers.000001.SS.history(period="1mo")
# Clearly this won't work for a start
The next issue I am having is, even if I pass in for example 3 tickers to my function like so:
yfinance('000001.SS 000050.KS 00006.KS')
# similar to yfinance docs of tickers = yf.Tickers('msft aapl goog')
I get errors like:
AttributeError: 'Tickers' object has no attribute '000001.SS'
(I have also tried running these through a for loop and passing each one to the Tickers object, but I get the same error.)
I'm stuck now. I don't know how to pass multiple tickers to yfinance and get back the data that I want, and the docs aren't very helpful.
Is anyone able to help me with this?
Could you not just store them in an array with dtype object and then pull the data from that?
import yfinance as yf
import numpy as np
tickers = ['msft', 'aapl', 'goog']
totalPortfolio = np.empty([len(tickers)], dtype=object)
for num, ticker in enumerate(tickers):
    # one DataFrame per ticker, stored at the matching position
    totalPortfolio[num] = yf.download(ticker, start='2020-09-30', end='2020-10-31', interval="1d")
Take a look at the code below:
test = yf.Tickers("A B C")
# creates test as a yf.Tickers object
test_dict = test.tickers
# creates a dict object containing the individual tickers. Can be checked with type()
You are trying to use "tickers.tickers.MSFT.info" to retrieve the ticker data from your dictionary "tickers.tickers", but as your error message indicates, there are no attributes named after your specific ticker names. This is in general not how you access elements in a dictionary; you look them up by key.
Instead you should use the code as below (like with all dict objects):
#old code from above
test = yf.Tickers("A B C")
test_dict = test.tickers
#new code accessing the dict correctly
a_data = test_dict["A"]
a_data = test.tickers["A"] #does the same as the line above
b_data = test.tickers["B"] #and so on for the other tickers
In a loop this could look something like this:
ticker_list = ["A", "B", "C"] #add tickers as needed
tickers_data = {}
tickers_history = {}
for ticker in ticker_list:
    tickers_data[ticker] = yf.Ticker(ticker)
    #store each history under its ticker so later iterations do not overwrite it
    tickers_history[ticker] = tickers_data[ticker].history(period='1d', start='2020-09-30', end='2020-10-31')
#access the dicts as needed using tickers_data["your ticker name"] or tickers_history["your ticker name"]
Alternatively, you can also use "yf.Tickers" to retrieve multiple tickers at once, but because you save each history separately I don't think this will necessarily improve your code much.
Pay attention, however, that "yf.Ticker()" and "yf.Tickers()" are different from each other, with differing syntax, and are not interchangeable.
You mixed them up when you tried accessing multiple tickers with your custom "yfinance()" function, which was defined with "yf.Ticker()" and thus only accepts one symbol at a time.
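As a rough sketch of the loop above applied to the original goal (one CSV file per ticker), the helper below keys everything by the symbol string, so symbols such as '000001.SS' need no special handling. The names save_histories and fetch_history are made up for this illustration. The fetcher is passed in as an argument so the loop can be tried without network access, and the commented usage assumes the same yf.Ticker(symbol).history(...) call as in the question.

```python
from pathlib import Path


def save_histories(symbols, fetch_history, out_dir):
    """Fetch one table per symbol and write it to <out_dir>/<symbol>.csv.

    fetch_history is any callable that takes a symbol string and returns an
    object with a to_csv(path) method (a pandas DataFrame has one).
    Returns a dict mapping each symbol to the path that was written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for symbol in symbols:
        table = fetch_history(symbol)
        path = out_dir / f"{symbol}.csv"  # dots in the symbol are fine in a file name
        table.to_csv(path)
        written[symbol] = path
    return written


# Assumed usage with yfinance (requires network access):
# import yfinance as yf
# save_histories(
#     ["000001.SS", "000050.KS"],
#     lambda s: yf.Ticker(s).history(start="2020-09-30", end="2020-10-31"),
#     "histories",
# )
```

With 1000 tickers it may be worth wrapping the fetch in a try/except so that one bad symbol does not stop the whole run.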
I'm trying to create in Python what a macro does in SAS. I have a list of over 1K tickers that I'm trying to download information for, but doing all of them in one step made Python crash, so I split the data into 11 portions. Below is the code we're working with:
t0=t.time()
printcounter=0
for ticker in tickers1:
    printcounter+=1
    print(printcounter)
    try:
        selected = yf.Ticker(ticker)
        shares = selected.get_shares()
        shares_wide = shares.transpose()
        info=selected.info
        market_cap=info['marketCap']
        sector=info['sector']
        name=info['shortName']
        comb = shares_wide.assign(market_cap_oct22=market_cap,sector=sector,symbol=ticker,name=name)
        company_info_1 = company_info_1.append(comb)
    except:
        comb = pd.DataFrame()
        comb = comb.append({'symbol':ticker,'ERRORFLAG':'ERROR'},ignore_index=True)
        company_info_1 = company_info_1.append(comb)
print("total run time:", round(t.time()-t0,3),"s")
What I'd like to do is instead of re-writing and running this code for all 11 portions of data and manually changing "tickers1" and "company_info_1" to "tickers2" "company_info_2" "tickers3" "company_info_3" (and so on)... I'd like to see if there is a way to make a python version of a SAS macro/call so that I can get this data more dynamically. Is there a way to do this in python?
You need to generalize your existing code and wrap it in a function.
def company_info(tickers):
    frames = [] # one DataFrame per ticker
    for ticker in tickers:
        try:
            selected = yf.Ticker(ticker) # you may also have to pass the yf object
            shares = selected.get_shares()
            shares_wide = shares.transpose()
            info=selected.info
            market_cap=info['marketCap']
            sector=info['sector']
            name=info['shortName']
            comb = shares_wide.assign(market_cap_oct22=market_cap,sector=sector,symbol=ticker,name=name)
        except Exception:
            # flag tickers that failed instead of stopping the loop
            comb = pd.DataFrame([{'symbol':ticker,'ERRORFLAG':'ERROR'}])
        frames.append(comb)
    # DataFrame.append was removed in pandas 2.0, so combine with pd.concat
    return pd.concat(frames) if frames else pd.DataFrame() # return the dataframe
Create a master dataframe to collect your results from the function call. Loop over the 11 groups of tickers passing each group into your function. Append the results to your master.
# master df to collect results
master = pd.DataFrame()
# assuming you have your tickers in a list of lists
# loop over each of the 11 groups of tickers
for tickers in groups_of_tickers:
    df = company_info(tickers) # fetch data from Yahoo Finance
    master = pd.concat([master, df])
Please note I typed this on the fly. I have no way of testing this. I'm quite sure there are syntactical issues to work through. Hopefully it provides a framework for how to think about the solution.
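The answer assumes the tickers are already in a list of lists. As a minimal sketch, a flat list can be split into such groups with a small helper, so there is no need to maintain tickers1, tickers2 and so on by hand. The name chunked and the sample symbols are placeholders for this illustration.

```python
def chunked(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


all_tickers = ["AAPL", "MSFT", "GOOG", "AMZN", "META"]  # placeholder symbols
groups_of_tickers = chunked(all_tickers, 2)
print(groups_of_tickers)
# [['AAPL', 'MSFT'], ['GOOG', 'AMZN'], ['META']]
```

For roughly 1000 tickers in 11 portions, a size of about 100 would give groups comparable to the manual split.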
Good evening, everyone.
I got the following JSON file from Walmart with their product items and prices.
I loaded up Jupyter Notebook, imported pandas, and loaded the file into a DataFrame with custom columns, as shown in the pics below.
This is what I want to do:
make new columns named min price and max price and load the data into them.
How can I do that?
Here is the code in Jupyter Notebook for reference.
I also want the offer price, as some items don't have minPrice and maxPrice :)
EDIT: here is the PYTHON Code:
import json
import pandas as pd
with open("walmart.json") as f:
    data = json.load(f)
walmart = data["items"]
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
print(wdf.loc[0,"primaryOffer"])
pd.set_option('display.max_colwidth', None)
print(wdf)
Here is the JSON File:
https://pastebin.com/sLGCFCDC
The following code snippet on top of your code would achieve the required task:
min_prices = []
max_prices = []
offer_prices = []
for i,row in wdf.iterrows():
    if('showMinMaxPrice' in row['primaryOffer']):
        min_prices.append(row['primaryOffer']['minPrice'])
        max_prices.append(row['primaryOffer']['maxPrice'])
        offer_prices.append('N/A')
    else:
        min_prices.append('N/A')
        max_prices.append('N/A')
        offer_prices.append(row['primaryOffer']['offerPrice'])
wdf['minPrice'] = min_prices
wdf['maxPrice'] = max_prices
wdf['offerPrice'] = offer_prices
Here we are checking for the 'showMinMaxPrice' element from the json in the column named 'primaryOffer'. For cases where the minPrice and maxPrice is available, the offerPrice is shown as 'N/A' and vice-versa. These are first stored in lists and later added to the dataframe as columns.
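A variation on the same idea, sketched under the assumption that each primaryOffer value is a plain dict: instead of testing for 'showMinMaxPrice', look up each price key with dict.get and leave missing ones as None. The function name extract_prices and the sample records are invented for this illustration and only imitate the shape described above.

```python
def extract_prices(offer):
    """Return minPrice, maxPrice and offerPrice from one primaryOffer dict.

    Missing keys become None instead of the string 'N/A'.
    """
    return {
        "minPrice": offer.get("minPrice"),
        "maxPrice": offer.get("maxPrice"),
        "offerPrice": offer.get("offerPrice"),
    }


# Made-up records, not taken from the actual JSON file.
offers = [
    {"showMinMaxPrice": True, "minPrice": 5.0, "maxPrice": 9.5},
    {"offerPrice": 12.99},
]
rows = [extract_prices(offer) for offer in offers]
print(rows)
```

Using None rather than 'N/A' means pandas can treat the gaps as missing values and keep the price columns numeric, for example when the rows are passed to pd.DataFrame(rows).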
The output for wdf.head() would then be:
I parsed a html table for financial transactions and have 3 different lists:
1. DATE
2. TICKER
3. MOTHER COMPANY
I would like to pull stock prices for the stocks in my TICKER list for the maximum possible period.
I am new to Python and can't figure out how to get the data for the stocks in my TICKER list... Any guidance would be of great help.
Many thanks in advance
TICKERS
['OSR', 'NWSA', 'MNK', 'ZTS', 'FNAC', 'WWAV', 'NRZ', 'CST', 'BPY', 'ERA', 'AXLL', 'LMCAD', 'ABBV']
I am trying some simple code but can't get it to work:
import yfinance as yf
for ticker in tickers:
    data = yf.download(ticker, period="max")
The download function in yfinance accepts either a list of tickers or a single string with the tickers separated by spaces.
To download the data for all your tickers for the maximum period, simply call it this way.
For example, if you want to download the data for 'OSR', 'NWSA' and 'MNK':
import yfinance as yf
tickers = 'OSR NWSA MNK'
data = yf.download(tickers, period='max')
With the default column layout, the result is grouped by field first, so one ticker's closing prices are data['Close'][ticker]. If you pass group_by='ticker' to download, you can access each ticker's data using data[ticker].
If you have your tickers as a list and want to convert it to a space-delimited string, use join:
ticker_list = ['OSR', 'NWSA', 'MNK']
ticker_str = ' '.join(ticker_list)
I'm trying to get some data for multiple stocks, but simple for loop does not iterate over the class.
For example:
In[2]: import yfinance as yf
stock = yf.Ticker('AAPL')
stock.info.get('sharesOutstanding')
Out[2]: 4375479808
And when I'm trying something like:
t = ['AAPL', 'MSFT']
for str in t:
    stock = yf.Ticker(str)
    a = []
    a = stock.info.get('sharesOutstanding')
I get only MSFT shares outstanding.
Ideally, the result would be a dataframe like:
      sharesOutstanding
AAPL         4375479808
MSFT         7606049792
Any ideas how to do this? I actually have a list of about 6375 stocks, but if there is a solution for two stocks, I think the code sample can be used for more.
PROBLEM SOLVING:
a = []
b = []
for str in t:
    try:
        stock = yf.Ticker(str)
        a.append(stock.info.get('sharesOutstanding'))
        b.append(stock.info.get('symbol'))
    except KeyError:
        continue
    except IndexError:
        continue
shares_ots = pd.DataFrame(a, b)
The problem most likely occurs because the a list is declared locally within the loop, meaning that the data it holds is overridden each iteration.
To solve the issue, we can declare the list outside of the scope of the loop. This way, it can retain its information.
t = ['AAPL', 'MSFT']
a = []
for str in t:
    stock = yf.Ticker(str)
    a.append(stock.info.get('sharesOutstanding'))
Alternatively, you can use the Tickers class from the API, as shown in the docs.
tickers = yf.Tickers('aapl msft')
# tickers.tickers is a dict of Ticker objects keyed by symbol
# (upper-case keys in recent yfinance versions)
# access each ticker
tickers.tickers['MSFT'].info.get('sharesOutstanding')
tickers.tickers['AAPL'].info.get('sharesOutstanding')
EDIT
If you prefer, you can simplify the loop with list comprehension as shown:
t = ['AAPL', 'MSFT']
a = [yf.Ticker(str).info.get('sharesOutstanding') for str in t]
Because the Ticker(str).info object is a Python dictionary, we can pass an additional argument to its get method to specify a fallback value.
a = [yf.Ticker(str).info.get('sharesOutstanding', 'NaN') for str in t]
In this case, if the dictionary does not have the 'sharesOutstanding' key, the entry falls back to the string 'NaN' instead of None. Either way a gets one entry per ticker, so len(a) == len(t).
To create a pandas data frame, try something like
df = pd.DataFrame(a, t, columns=['sharesOutstanding'])
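The pieces above can be pulled together in one small helper, sketched here with stand-in data so it runs offline. The name collect_field and its get_info parameter are hypothetical, 'XXXX' is an invented symbol, and the share counts are simply the ones quoted in the question. The commented usage assumes that yf.Ticker(symbol).info returns a dict, as described above.

```python
def collect_field(symbols, get_info, field, default=None):
    """Map each symbol to info[field], or to `default` when the key is absent.

    get_info is any callable that takes a symbol and returns a dict.
    """
    return {symbol: get_info(symbol).get(field, default) for symbol in symbols}


# Stand-in data so the sketch runs without network access.
fake_info = {
    "AAPL": {"sharesOutstanding": 4375479808},
    "MSFT": {"sharesOutstanding": 7606049792},
    "XXXX": {},  # invented symbol with no share count
}
shares = collect_field(["AAPL", "MSFT", "XXXX"], fake_info.__getitem__, "sharesOutstanding")
print(shares)
# {'AAPL': 4375479808, 'MSFT': 7606049792, 'XXXX': None}

# Assumed usage with yfinance (requires network access):
# import yfinance as yf
# shares = collect_field(t, lambda s: yf.Ticker(s).info, "sharesOutstanding")
```

Because the result is keyed by symbol, it can be turned into a one-column frame with pd.Series(shares, name='sharesOutstanding').to_frame().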
You are re-creating the list on each iteration, and then overwriting it instead of appending to it. Try this:
t = ['AAPL', 'MSFT']
a = []
for str in t:
    stock = yf.Ticker(str)
    a.append(stock.info.get('sharesOutstanding'))
I'm gathering data from a bunch of ETFs through Yahoo Finance using Pandas-Datareader and I'm getting odd errors with a handful of the tickers even though the data seems available. The code is very simple:
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2017, 1, 1)
for ticker in TICKERS:
    f = dr.DataReader(ticker, 'yahoo', start, end)
and works for most of my tickers but not all:
EMLP
GDVD (Failed to get data for GDVD)
AMZA
RFDI
ARKK
ARKW
SECT (Failed to get data for SECT)
EMLP works fine. Datareader produces urls like this url for GDVD even though the historical data for GDVD is available on the website. I see the following error in Chrome using the GDVD url:
{"finance": {"error": {"code": "Unauthorized","description": "Invalid cookie"}}}
Is there a way to get historical prices for these tickers? The full list of failed tickers in case anyone can see a pattern:
['GDVD', 'SECT', 'DWLD', 'CCOR', 'DFNL', 'DUSA', 'AIEQ', 'CACG', 'QSY', 'ACT', 'TAXR', 'TTAI', 'FLIO', 'FMDG', 'VGFO', 'FFSG', 'LRGE', 'YLDE', 'VESH', 'DEMS', 'SQZZ']
Using the yahoo_fin package, I was able to get the data for the tickers you listed. Check out this link: http://theautomatic.net/yahoo_fin-documentation/.
My code looks like this:
from yahoo_fin.stock_info import get_data
tickers = ['GDVD', 'SECT', 'DWLD', 'CCOR', 'DFNL', 'DUSA', 'AIEQ', 'CACG',
           'QSY', 'ACT', 'TAXR', 'TTAI', 'FLIO', 'FMDG', 'VGFO', 'FFSG',
           'LRGE', 'YLDE', 'VESH', 'DEMS', 'SQZZ']
stocks = {}
for ticker in tickers:
    stocks[ticker] = get_data(ticker)
So the data gets stored into a dictionary, where the keys are the tickers, and the values are the data frames containing each stock's data.
Alternatively, you could use a dictionary comprehension, like this:
stocks = {ticker : get_data(ticker) for ticker in tickers}
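Since the question is about a handful of tickers failing, it may help to record failures instead of letting one exception stop the loop. The following is only a sketch: fetch_all and fake_fetch are made-up names, and the stand-in fetcher fails on purpose for one symbol so the example runs offline.

```python
def fetch_all(tickers, fetch):
    """Call fetch(ticker) for every ticker; return (results, failed).

    results maps ticker -> fetched value; failed maps ticker -> error message.
    """
    results, failed = {}, {}
    for ticker in tickers:
        try:
            results[ticker] = fetch(ticker)
        except Exception as exc:  # keep going and record what went wrong
            failed[ticker] = f"{type(exc).__name__}: {exc}"
    return results, failed


def fake_fetch(ticker):
    # Stand-in that fails for one symbol so the sketch runs without a network.
    if ticker == "SECT":
        raise ValueError("no data")
    return f"data for {ticker}"


stocks, failed = fetch_all(["EMLP", "SECT", "ARKK"], fake_fetch)
print(sorted(stocks))
# ['ARKK', 'EMLP']
print(failed)
# {'SECT': 'ValueError: no data'}
```

With the package from this answer, the call would presumably be fetch_all(tickers, get_data), and the failed dict then lists the symbols worth retrying.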
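If you want to collapse all of the data sets into a single data frame, you could use reduce from the functools module like this (pd.concat stands in for DataFrame.append, which was removed in pandas 2.0):
from functools import reduce
import pandas as pd
combined = reduce(lambda x, y: pd.concat([x, y]), stocks.values())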