I have a list of stock indexes (indizies) and several lists of stock tickers (e.g. gdaxi, mdaxi).
I want to download the stocks from Yahoo using two nested loops.
Background: in the real program the user can choose which index or indexes to download.
The problem is that index_name is a string, but the inner loop needs it to be the list of tickers. Because the inner loop receives a string, it iterates over its characters.
Result: it tries to download a CSV for G, D, A, X, I.
Question: how can I get from the string index_name to the corresponding list?
from pandas_datareader import data as pdr
indizies = ['GDAXI', 'MDAXI']
gdaxi = ["ADS.DE", "AIR.DE", "ALV.DE"]
mdaxi = ["AIXA.DE", "AT1.DE"]
for index_name in indizies:
    for ticker in index_name:
        df = pdr.get_data_yahoo(ticker)
        df.to_csv(f'{ticker}.csv')
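In "for ticker in index_name" you are iterating over the letters of each string, because indizies contains the index names as strings rather than the ticker lists themselves.
I guess you need to change your code to something like: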
from pandas_datareader import data as pdr
gdaxi = ["ADS.DE", "AIR.DE", "ALV.DE"]
mdaxi = ["AIXA.DE", "AT1.DE"]
indizies = [gdaxi, mdaxi]
for index_name in indizies:
    for ticker in index_name:
        df = pdr.get_data_yahoo(ticker)
        df.to_csv(f'{ticker}.csv')
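If the user still needs to pick indexes by name, one possible variation is to keep the names as dictionary keys and the ticker lists as values. The sketch below is only an illustration: index_members and tickers_for are hypothetical names that do not appear in the question, and the download call is left out so the snippet runs without network access.

```python
# Map each index name to its list of tickers (names and tickers taken from the question).
index_members = {
    "GDAXI": ["ADS.DE", "AIR.DE", "ALV.DE"],
    "MDAXI": ["AIXA.DE", "AT1.DE"],
}


def tickers_for(selected_indexes, members=index_members):
    """Return the tickers of all selected indexes, in order (hypothetical helper)."""
    tickers = []
    for index_name in selected_indexes:
        # Look the list up by name instead of iterating over the name itself.
        tickers.extend(members[index_name])
    return tickers


# Example: the user picked only one index.
print(tickers_for(["MDAXI"]))
```

The resulting list can then be fed to the same inner loop as above, calling pdr.get_data_yahoo(ticker) for each entry.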
I am trying to pull data for multiple tickers from the yfinance API and save it to a CSV file. In total I have 1000 tickers, and for each one I need the entire table of date, open, high, low, close, volume, etc. So far I can successfully get data for one ticker using the following Python code:
import yfinance as yf
def yfinance(ticker_symbol):
    ticker_data = yf.Ticker(ticker_symbol)
    tickerDF = ticker_data.history(period='1d', start='2020-09-30', end='2020-10-31')
    print(tickerDF)
yfinance('000001.SS')
However, this doesn't work when I try it on multiple tickers. The yfinance docs say to use the following for multiple tickers:
tickers = yf.Tickers('msft aapl goog')
# ^ returns a named tuple of Ticker objects
# access each ticker using (example)
tickers.tickers.MSFT.info
tickers.tickers.AAPL.history(period="1mo")
tickers.tickers.GOOG.actions
I have a couple of issues here. The docs use a string such as 'aapl', but my tickers are all in a digit format like '000001.SS', and the ".SS" part is proving to be an issue when passing it into the code:
tickers.tickers.000001.SS.history(period="1mo")
# Clearly this won't work for a start
The next issue I am having is, even if I pass in for example 3 tickers to my function like so:
yfinance('000001.SS 000050.KS 00006.KS')
# similar to yfinance docs of tickers = yf.Tickers('msft aapl goog')
I get errors like:
AttributeError: 'Tickers' object has no attribute '000001.SS'
(I have also tried running these through a for loop and passing each one to the Tickers object, but I get the same error.)
I'm stuck now. I don't know how to pass multiple tickers to yfinance and get back the data I want, and the docs aren't very helpful.
Is anyone able to help me with this?
Could you not just store them in an array with dtype object and then pull the data from that?
import yfinance as yf
import numpy as np
tickers = ['msft', 'aapl', 'goog']
totalPortfolio = np.empty([len(tickers)], dtype=object)
num = 0
for ticker in tickers:
    totalPortfolio[num] = yf.download(ticker, start='2020-09-30', end='2020-10-31', interval="1d")
    num = num + 1
Take a look at the code below:
test = yf.Tickers("A B C")
# creates test as a yf.Tickers object
test_dict = test.tickers
# creates a dict object containing the individual tickers. Can be checked with type()
You are trying to use attribute access such as "tickers.tickers.MSFT.info" to retrieve the ticker data from the dictionary "tickers.tickers", but, as your error message indicates, there are no attributes named after your specific ticker names. Attribute access is in general not how you retrieve elements from a dictionary, and a symbol like '000001.SS' is not even a valid attribute name.
Instead, index the dict by key, as below (like with all dict objects):
#old code from above
test = yf.Tickers("A B C")
test_dict = test.tickers
#new code accessing the dict correctly
a_data = test_dict["A"]
a_data = test.tickers["A"] #does the same as the line above
b_data = test.tickers["B"] #and so on for the other tickers
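To see why key lookup handles symbols like '000001.SS' while dotted attribute access cannot, here is a small sketch that uses a plain dict as a stand-in. The dict and its placeholder string values are made up for illustration and are not real yfinance objects.

```python
# A plain dict standing in for the mapping of symbols to ticker objects.
# The values are placeholder strings, not real yfinance objects.
ticker_objects = {
    "MSFT": "ticker object for MSFT",
    "000001.SS": "ticker object for 000001.SS",
}

for symbol in ticker_objects:
    # str.isidentifier() tells whether the symbol could be written after a dot.
    # Key lookup works either way, because any string can be a dict key.
    print(symbol, symbol.isidentifier(), ticker_objects[symbol])
```

A symbol that starts with a digit or contains a dot fails the identifier check, which is why only the bracket form works for the asker's tickers.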
In a loop this could look something like this:
ticker_list = ["A", "B", "C"] #add tickers as needed
tickers_data = {}
tickers_history = {}
for ticker in ticker_list:
    tickers_data[ticker] = yf.Ticker(ticker)
    #store each history under its ticker instead of overwriting the whole dict
    tickers_history[ticker] = tickers_data[ticker].history(period='1d', start='2020-09-30', end='2020-10-31')
#access the dicts as needed using tickers_data["your ticker name"]
Alternatively, you can use "yf.Tickers" to retrieve multiple tickers at once, but because you save the history separately I don't think this will necessarily improve your code much.
Pay attention, however, that "yf.Ticker()" and "yf.Tickers()" are different from each other, have differing syntax, and are not interchangeable.
You mixed them up when you tried to access multiple tickers with your custom "yfinance()" function, which was defined with "yf.Ticker()" and thus only accepts one symbol at a time.
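With around 1000 symbols it may also help to keep going when a single symbol fails. The following is a minimal sketch of that idea, not part of yfinance: collect_histories and fake_fetch are hypothetical names, and the fetcher is passed in as a parameter so the example runs without network access.

```python
def collect_histories(symbols, fetch):
    """Call fetch(symbol) for every symbol and keep the results in a dict.

    `fetch` is a hypothetical callable supplied by the caller, for example
    lambda s: yf.Ticker(s).history(start='2020-09-30', end='2020-10-31').
    Returns (histories, failed), where failed lists the symbols that raised.
    """
    histories = {}
    failed = []
    for symbol in symbols:
        try:
            histories[symbol] = fetch(symbol)
        except Exception as exc:  # keep going when one symbol fails
            failed.append((symbol, repr(exc)))
    return histories, failed


# Stand-in fetcher so the sketch runs without network access.
def fake_fetch(symbol):
    if symbol == "BAD":
        raise ValueError("no data")
    return f"history for {symbol}"


histories, failed = collect_histories("000001.SS 000050.KS BAD".split(), fake_fetch)
print(sorted(histories))
print([symbol for symbol, _ in failed])
```

If the real fetcher returns a DataFrame per symbol, each value in histories could then be written out with to_csv(f'{symbol}.csv').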
I parsed a html table for financial transactions and have 3 different lists:
1. DATE
2. TICKER
3. MOTHER COMPANY
I would like to get stock prices for the stocks in my TICKER list for the maximum possible period.
I am new to Python and can't figure out how to get the data for the stocks in my TICKER list. Any guidance would be of great help.
Many thanks in advance
TICKERS
['OSR', 'NWSA', 'MNK', 'ZTS', 'FNAC', 'WWAV', 'NRZ', 'CST', 'BPY', 'ERA', 'AXLL', 'LMCAD', 'ABBV']
I am trying with some simple code but can't get it to work:
import yfinance as yf
for ticker in tickers:
    data = yf.download(ticker, period="max")
The download function in yfinance accepts multiple tickers as a single string separated by spaces (a Python list of tickers also works).
To download the data for all your tickers for the maximum period, simply call it this way.
For example, if you want to download the data for 'OSR', 'NWSA' and 'MNK':
import yfinance as yf
tickers = 'OSR NWSA MNK'
data = yf.download(tickers, period='max')
You can then access each ticker's data by selecting its column. With the default column grouping the price field is the outer column level, so use data['Close'][ticker]; pass group_by='ticker' to yf.download if you prefer data[ticker].
If you have your tickers as a list and want to convert to a space-delimited string use join:
ticker_list = ['OSR', 'NWSA', 'MNK']
ticker_str = ' '.join(ticker_list)
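When the list comes from a parsed HTML table it may contain duplicates or blank entries, so a slightly more defensive version of the join could look like the sketch below. to_ticker_string is a hypothetical helper, and upper-casing the symbols is an assumption about how the tickers should be normalised.

```python
def to_ticker_string(ticker_list):
    """Join tickers into one space-delimited string, dropping blanks and duplicates."""
    seen = []
    for ticker in ticker_list:
        ticker = ticker.strip().upper()  # assumption: symbols are case-insensitive
        if ticker and ticker not in seen:
            seen.append(ticker)
    return " ".join(seen)


# Sample input with a duplicate in different case and a blank entry.
tickers = ['OSR', 'NWSA', 'MNK', 'ZTS', 'nwsa', ' ']
print(to_ticker_string(tickers))
```

The returned string has the same shape as the 'OSR NWSA MNK' example, so it can be passed to yf.download in the same way.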
I am fairly new to python and coding in general.
I have a big data file that provides daily data for the period 2011-2018 for a number of stock tickers (~300).
The data is a .csv file with circa 150k rows and looks as follows (short example):
Date,Symbol,ShortExemptVolume,ShortVolume,TotalVolume
20110103,AAWW,0.0,28369,78113.0
20110103,AMD,0.0,3183556,8095093.0
20110103,AMRS,0.0,14196,18811.0
20110103,ARAY,0.0,31685,77976.0
20110103,ARCC,0.0,177208,423768.0
20110103,ASCMA,0.0,3930,26527.0
20110103,ATI,0.0,193772,301287.0
20110103,ATSG,0.0,23659,72965.0
20110103,AVID,0.0,7211,18896.0
20110103,BMRN,0.0,21740,213974.0
20110103,CAMP,0.0,2000,11401.0
20110103,CIEN,0.0,625165,1309490.0
20110103,COWN,0.0,3195,24293.0
20110103,CSV,0.0,6133,25394.0
I have a function that allows me to filter for a specific symbol and get 10 observations before and after a specified date (could be any date between 2011 and 2018).
import pandas as pd
from datetime import datetime
import urllib
import datetime
def get_data(issue_date, stock_ticker):
    df = pd.read_csv (r'D:\Project\Data\Short_Interest\exampledata.csv')
    df['Date'] = pd.to_datetime(df['Date'], format="%Y%m%d")
    d = df
    df = pd.DataFrame(d)
    short = df.loc[df.Symbol.eq(stock_ticker)]
    # get the index of the row of interest
    ix = short[short.Date.eq(issue_date)].index[0]
    # get the position of that row within the filtered frame
    iloc_ix = short.index.get_loc(ix)
    # get the 10 rows before and after (+11 because the slice end is exclusive), i.e. +/-10 trading days
    short_data = short.iloc[iloc_ix-10: iloc_ix+11]
    return [short_data]
I want to create a script that iterates over a list of 'issue_dates' and 'stock_tickers'. The list (a .csv) looks as follows:
ARAY,07/08/2017
ARAY,24/04/2014
ACETQ,16/11/2015
ACETQ,16/11/2015
NVLNA,15/08/2014
ATSG,29/09/2017
ATI,24/05/2016
MDRX,18/06/2013
MDRX,18/06/2013
AMAGX,10/05/2017
AMAGX,14/02/2014
AMD,14/09/2016
To break down my problem and question I would like to know how to do the following:
First, how do I load the inputs?
Second, how do I call the function on each input?
And last, how do I accumulate all the function returns in one dataframe?
To load the inputs and call the function for each row, iterate over the csv file, pass each row's values to the function, and accumulate the results in a list.
I modified your function a bit: removed the DataFrame creation so it is only done once and added a try/except block to account for missing dates or tickers (your example data didn't match up too well). The dates in the second csv look like they are day/month/year so I converted them for that format.
import pandas as pd
import datetime, csv
def get_data(df, issue_date, stock_ticker):
    '''Return the rows for the ticker centered on the issue date.
    '''
    short = df.loc[df.Symbol.eq(stock_ticker)]
    # get the index of the row of interest
    try:
        ix = short[short.Date.eq(issue_date)].index[0]
        # get the position of that row within the filtered frame
        iloc_ix = short.index.get_loc(ix)
        # get up to 10 rows before and after; max() keeps the slice start from going negative
        short_data = short.iloc[max(iloc_ix-10, 0): iloc_ix+11]
    except IndexError:
        msg = f'no data for {stock_ticker} on {issue_date}'
        #log.info(msg)
        print(msg)
        short_data = None
    return short_data
df = pd.read_csv (datafile)  # datafile: path to the big data csv
df['Date'] = pd.to_datetime(df['Date'], format="%Y%m%d")
results = []
with open('issues.csv') as issues:
    for ticker,date in csv.reader(issues):
        # dates in this file are day/month/year; parse to a datetime so it compares with the datetime64 column
        date = datetime.datetime.strptime(date, r'%d/%m/%Y')
        s = get_data(df,date,ticker)
        results.append(s)
        # print(s)
Creating a single DataFrame or table for all that info may be problematic, especially since the date ranges are all different. You should probably ask a separate question about that. Its MCVE should just include a few minimal Pandas Series with a couple of different date ranges and tickers.
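For the simple case where every slice has the same columns, one possible way to get a single table is to stack the results with pd.concat after dropping the misses. This is only a sketch under that assumption; the rows below are made-up stand-ins for what get_data returns, not real data.

```python
import pandas as pd

# Two tiny stand-ins for the slices returned by get_data, plus a None for a miss.
results = [
    pd.DataFrame({"Date": pd.to_datetime(["2017-08-04", "2017-08-07"]),
                  "Symbol": ["ARAY", "ARAY"],
                  "ShortVolume": [100, 120]}),
    None,
    pd.DataFrame({"Date": pd.to_datetime(["2017-09-29"]),
                  "Symbol": ["ATSG"],
                  "ShortVolume": [50]}),
]

# Drop the misses, then stack the remaining frames on top of each other.
frames = [r for r in results if r is not None]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Because each row keeps its own Date and Symbol, differing date ranges do not prevent stacking; whether a long table like this suits the later analysis is a separate design question.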
The following code downloads Close price data for the list of tickers in "tickers". My objective is to get a new list called "valid_tickers" containing the tickers that meet my criteria. In this example, the criterion is that a ticker has more than 1,323 data points.
In other words, I want to eliminate stocks that have a shorter price history (FTV in this example). If I apply dropna() to the entire data, all the n/a are eliminated but I am also shortening the price history of the stock that has full data (MSFT in this example).
This is undesirable. Therefore, I want to eliminate n/a just for the ticker where they are found, then measure the length of its price history, and include the ticker in the "valid_tickers" list only if it has more than 1,323 points. However, dropna() does not want to work on data[ticker] for some reason. What am I doing wrong here?
import yfinance as yf
import pandas as pd
tickers = ['FTV','MSFT']
data = yf.download(tickers, start="2012-04-03", end="2017-07-07")['Close']
data = data.reset_index()
valid_tickers =[]
for ticker in tickers:
    data[ticker] = pd.DataFrame(data, columns = [ticker])
    data[ticker] = data[ticker].dropna()
    if len(data[ticker]) > 1323:
        valid_tickers.append(ticker)
print (valid_tickers)
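I think there's a problem with the re-use of the name data. Assigning the result of dropna() back to data[ticker] realigns it to the DataFrame's index, so the dropped rows come back as NaN and the length never changes. What would happen if you replaced data[ticker] with tmp?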
tickers = ['FTV','MSFT']
data = yf.download(tickers, start="2012-04-03", end="2017-07-07")['Close']
data = data.reset_index()
valid_tickers =[]
for ticker in tickers:
    tmp = pd.DataFrame(data, columns = [ticker])
    tmp = tmp.dropna()
    if len(tmp) > 1323:
        valid_tickers.append(ticker)
print (valid_tickers)
Do you just want to find the tickers with more than 1323 valid price observations? If so, use this:
valid_obs_series = data[tickers].notnull().sum() # get total non-na observations per ticker (ticker columns only, so the Date column is not counted)
valid_tickers = list(valid_obs_series[valid_obs_series > 1323].index) # get valid tickers
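The counting approach can be checked on a small made-up frame without downloading anything. In this sketch the column names, prices and the threshold of 3 are all hypothetical stand-ins for the real tickers and the 1323 limit.

```python
import numpy as np
import pandas as pd

# Made-up prices: "LONG" has 6 observations, "SHORT" is missing the first 4.
prices = pd.DataFrame({
    "LONG": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
    "SHORT": [np.nan, np.nan, np.nan, np.nan, 2.0, 2.1],
})

min_points = 3  # hypothetical threshold, standing in for 1323
counts = prices.count()  # non-NaN observations per column
valid_tickers = counts[counts > min_points].index.tolist()
print(valid_tickers)
```

Here count() plays the same role as notnull().sum(): it counts per column, so the NaNs of a short history never shorten the history of the other columns.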
You can also do this in one line with df.dropna():
# your code:
import yfinance as yf
tickers = ['FTV','MSFT']
data = yf.download(tickers, start="2012-04-03", end="2017-07-07")['Close']
data = data.reset_index()
valid_tickers =[]
# my addition:
# thresh is the minimum number of non-NA values, so 1324 means "more than 1323";
# columns[1:] skips the Date column created by reset_index()
valid_tickers.extend(data.dropna(thresh=1324, axis=1).columns[1:]) # this line here
It adds the column names of the stocks that have > 1323 valid data points to the valid_tickers list. In your case, only 'MSFT'.
import datetime
import numpy as np
import pandas as pd
all_data = {}
for ticker in ['TWTR', 'SNAP', 'FB']:
    all_data[ticker] = np.array(pd.read_csv('https://www.google.com/finance/getprices?i=60&p=10d&f=d,o,h,l,c,v&df=cpct&q={}'.format(ticker), skiprows=7, header=None))
date = []
for i in np.arange(0, len(all_data['SNAP'])):
    if all_data['SNAP'][i][0][0] == 'a':
        t = datetime.datetime.fromtimestamp(int(all_data['SNAP'][i][0].replace('a','')))
        date.append(t)
    else:
        date.append(t+ datetime.timedelta(minutes= int(all_data['SNAP'][i][0])))
Hi, this code creates a dictionary (all_data) and fills it with intraday data for Twitter, Snapchat and Facebook from the URL. The dates are in epoch time format, so I wrote a second for loop to convert them.
I was only able to do so for one of the tickers (SNAP), and I was wondering if anyone knows how to iterate over all the data to do the same for every ticker.
With pandas, you normally convert a timestamp to datetime using:
df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit="s")
Note:
Your script seems to contain other errors, which are outside the scope of the question.
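To apply the conversion from the question to every ticker rather than only SNAP, the loop body can be moved into a function and called once per dictionary entry. The sketch below assumes the layout described in the question (a value prefixed with 'a' is an epoch timestamp, a plain number is an offset in intervals). decode_times and the sample values are hypothetical, and it uses UTC where the question's code uses local time.

```python
from datetime import datetime, timedelta, timezone


def decode_times(raw_values, interval_seconds=60):
    """Turn the first column of one ticker into datetimes (hypothetical helper).

    Assumes the layout from the question: a value such as 'a1500000000' is an
    epoch timestamp in seconds, and a plain number counts intervals since the
    most recent 'a' row.
    """
    times = []
    base = None
    for value in raw_values:
        value = str(value)
        if value.startswith("a"):
            # Absolute row: remember it as the base for the following offsets.
            base = datetime.fromtimestamp(int(value[1:]), tz=timezone.utc)
            times.append(base)
        else:
            # Offset row: add the given number of intervals to the base.
            times.append(base + timedelta(seconds=int(value) * interval_seconds))
    return times


# Made-up sample rows standing in for column 0 of all_data[ticker].
sample = {
    "SNAP": ["a1500000000", "1", "2"],
    "TWTR": ["a1500000600", "1"],
}
dates = {ticker: decode_times(column) for ticker, column in sample.items()}
print(dates["SNAP"][2] - dates["SNAP"][0])
```

With the arrays from the question, the same dictionary comprehension could iterate over all_data.items() and pass all_data[ticker][:, 0] as the column.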