Index problem when downloading data in Python


I want to calculate the total return, (last price - first price) / first price. I would like to do it by downloading the data directly from Yahoo, without first downloading the CSV file and then loading it into Python.
Any idea why I get this error and how to solve it?
Thank you!
import pandas as pd
import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
import datetime as dt
start=dt.datetime(2019,5,1)
end=dt.datetime.now()
data= yf.download('TSLA',start,end)['Adj Close']
data = pd.DataFrame(data).dropna()
total_return = (data[-1] - data[0]) / data[0]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\anaconda4\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2894 try:
-> 2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-45-bb800ce55aa9> in <module>
----> 1 total_return = (data[0]- data[0])/ data[0]
~\anaconda4\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2900 if self.columns.nlevels > 1:
2901 return self._getitem_multilevel(key)
-> 2902 indexer = self.columns.get_loc(key)
2903 if is_integer(indexer):
2904 indexer = [indexer]
~\anaconda4\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
-> 2897 raise KeyError(key) from err
2898
2899 if tolerance is not None:
KeyError: 0

The problem is that you neglected to specify the DataFrame column you want to use. Because data is a DataFrame, data[0] looks for a column labelled 0, which does not exist.
In that last line, replace each reference to data with data["Adj Close"].
total_return = (data["Adj Close"][-1] - data["Adj Close"][0]) / data["Adj Close"][0]
For the code
data = pd.DataFrame(data).dropna()
print(data["Adj Close"][-1], data["Adj Close"][0])
total_return = (data["Adj Close"][-1] - data["Adj Close"][0]) / data["Adj Close"][0]
print(total_return)
The output is
570.22998046875 46.801998138427734
11.1838810980284
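A minimal sketch of the same calculation using positional access with iloc, which does not depend on how the index is labelled. The prices below are made-up values standing in for the downloaded 'Adj Close' column, and total_return is a hypothetical helper name, not something from yfinance.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded prices: a Series indexed by date.
prices = pd.Series(
    [50.0, 55.0, 60.0],
    index=pd.to_datetime(["2019-05-01", "2019-05-02", "2019-05-03"]),
    name="Adj Close",
)


def total_return(series: pd.Series) -> float:
    """Return (last price - first price) / first price using positional access."""
    first = series.iloc[0]   # first row by position, whatever its label
    last = series.iloc[-1]   # last row by position
    return (last - first) / first


result = total_return(prices)
print(result)
```

With real data the same helper could be called as total_return(data["Adj Close"]), assuming data has a single 'Adj Close' column.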

Related

Python "KeyError: 0" - detect the crossed cells

I have tried hard to understand this kind of error.
Code
# assign each cell from each site to a dataframe
cell_list = []
grouped = df.groupby(df['Cell Name'])
for i in range(len(df.index)):
    if (df['Cell Name'].iloc[i] not in cell_list):
        cell_list.append(df['Cell Name'].iloc[i])
for cell in cell_list:
    g = grouped.get_group(cell)
    for j in range(len(g)-1):
        T_vals1 = g[j].values
        diff_TCH = T_vals1[1:] - T_vals1[:-1]
        T_vals2 = g[j-1].values
        diff_TCH2 = T_vals2[1:] - T_vals2[:-1]
        print(diff_TCH)
        product23 = np.multiply(diff_TCH,diff_TCH2)
        print(product23)
Error
This is the error I got:
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
C:\Users\IMENEP~1\AppData\Local\Temp/ipykernel_6864/4137740194.py in <module>
2 g = grouped.get_group(cell)
3 for j in range(len(g)-1):
----> 4 T_vals1 = g[j].values
5 diff_TCH = T_vals1[1:] - T_vals1[:-1]
6 T_vals2 = g[j-1].values
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
3456 if self.columns.nlevels > 1:
3457 return self._getitem_multilevel(key)
-> 3458 indexer = self.columns.get_loc(key)
3459 if is_integer(indexer):
3460 indexer = [indexer]
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
-> 3363 raise KeyError(key) from err
3364
3365 if is_scalar(key) and isna(key) and not self.hasnans:
KeyError: 0
Can you help me understand those errors?

Not able to display the column of a dataframe

When I try to print a single column of my data set, I get this error:
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2645 try:
-> 2646 return self._engine.get_loc(key)
2647 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Label'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
in <module>
----> 1 data['Label']
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2798 if self.columns.nlevels > 1:
2799 return self._getitem_multilevel(key)
-> 2800 indexer = self.columns.get_loc(key)
2801 if is_integer(indexer):
2802 indexer = [indexer]
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2646 return self._engine.get_loc(key)
2647 except KeyError:
-> 2648 return self._engine.get_loc(self._maybe_cast_indexer(key))
2649 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2650 if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Label'
data['Label']

The column name may have leading or trailing spaces. Print the column names to verify:
print(data.columns)
or strip the whitespace from the names and then print them:
data.columns = data.columns.str.strip()
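A small sketch of how stray whitespace in a header can be spotted and removed. The frame below is invented for illustration (the real data set is not shown in the question) and assumes the offending header is 'Label ' with one trailing space.

```python
import pandas as pd

# Hypothetical frame whose first header carries a trailing space ("Label ").
data = pd.DataFrame({"Label ": [0, 1, 1], "Text": ["a", "b", "c"]})

# repr() makes otherwise invisible whitespace visible in the printed names.
print([repr(name) for name in data.columns])

# Names that change when stripped are the ones with stray whitespace.
suspect = [name for name in data.columns if name != name.strip()]
print(suspect)

# After stripping, the plain name works as a key.
data.columns = data.columns.str.strip()
labels = data["Label"].tolist()
print(labels)
```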

If you have a DataFrame and want to select specific rows or columns from it, you can use square brackets.
To select a column from your DataFrame data:
data["Label"]
If you do not know the column names, get the list of columns and then select one by its position in that list (index below is that position):
columns = data.columns.values.tolist()
data[columns[index]]

KeyError while trying to write an additional field in Python Pandas dataframe

I want to add a calculated field 'Score' in dataframe positions_deposits.
When I run the following operation on pandas dataframe positions_deposits,
for i in range(len(positions_deposits)):
    <Read some values from the dataframe which would be passed to a function in the next line>
    Score = RAG_function (Amber_threshold, Red_threshold, Type_threshold, Values)
    positions_deposits['Score'].loc[i] = Score
I get the following error. Can you please guide me through what error I am making and how to resolve it?
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2894 try:
-> 2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Score'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-201-7d0481b84aa4> in <module>
6 Values = positions_deposits['Values'].loc[i]
7 # Score = RAG_function (Amber_threshold, Red_threshold, Type_threshold, Values)
----> 8 positions_deposits["Score"].loc[i] = RAG_function (Amber_threshold, Red_threshold, Type_threshold, Values)
9
10 # print("Score is %i.00" %Score)
~/.local/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
2904 if self.columns.nlevels > 1:
2905 return self._getitem_multilevel(key)
-> 2906 indexer = self.columns.get_loc(key)
2907 if is_integer(indexer):
2908 indexer = [indexer]
~/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
-> 2897 raise KeyError(key) from err
2898
2899 if tolerance is not None:
KeyError: 'Score'
Please note: if I print(Score) instead, there is no error. So RAG_function is being executed, but the assignment to the dataframe fails.
Thanks!

The KeyError is raised because the 'Score' column does not exist yet when positions_deposits['Score'] is read. You'll probably want to read up on how .loc and .iloc work. That said, there is a better way:
import pandas
import random
df = pandas.DataFrame([{"A": random.randint(0,100), "B": random.randint(0,100)} for _ in range(100)])
def rag_function(row):
    A = row["A"]
    B = row["B"]
    return A * B
df["Score"] = df.apply(rag_function, axis=1)
NOTE: I don't have your RAG_function, so I've created a stand-in function. The idea is that you apply this function to every row in the dataframe.
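If you prefer to keep the explicit loop, a minimal sketch follows. It assumes the KeyError comes from reading positions_deposits['Score'] before that column exists, which is what the traceback shows. The data, the thresholds and the rag_score rule are all hypothetical, since the real RAG_function is not shown.

```python
import pandas as pd

# Hypothetical data; the real dataframe and RAG_function are not in the question.
positions_deposits = pd.DataFrame({"Values": [5, 15, 25]})


def rag_score(value, amber_threshold=10, red_threshold=20):
    """Illustrative rule: 0 below amber, 1 from amber up to red, 2 at or above red."""
    if value >= red_threshold:
        return 2
    if value >= amber_threshold:
        return 1
    return 0


# Create the column first, so that it exists before any row is written.
positions_deposits["Score"] = 0

# Write each row with a single .loc[row, column] call instead of chaining
# positions_deposits["Score"].loc[i], which reads the column before writing.
for i in positions_deposits.index:
    value = positions_deposits.loc[i, "Values"]
    positions_deposits.loc[i, "Score"] = rag_score(value)

scores = positions_deposits["Score"].tolist()
print(scores)
```

For large frames the apply version above is usually the simpler choice; the loop is shown only because the question was written that way.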

Money Flow Index keyerror

I've obtained the historical values for an example stock (Apple in this case) and was following an example I saw online. Their code succeeded, but mine fails with a KeyError.
Could anyone show me what's wrong and, hopefully, how to fix it? The error is:
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2645 try:
-> 2646 return self._engine.get_loc(key)
2647 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 1
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-60-d89407b24e87> in <module>
3
4 for i in range(1, len(typical_price)):
----> 5 if typical_price[i] > typical_price[i-1]:
6 positive_flow.append(money_flow[i-1])
7 negative_flow.append(0)
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2798 if self.columns.nlevels > 1:
2799 return self._getitem_multilevel(key)
-> 2800 indexer = self.columns.get_loc(key)
2801 if is_integer(indexer):
2802 indexer = [indexer]
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2646 return self._engine.get_loc(key)
2647 except KeyError:
-> 2648 return self._engine.get_loc(self._maybe_cast_indexer(key))
2649 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2650 if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 1
Code is:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import math
import pandas_datareader as pdr
stocks = ['AAPL']
data_close = pdr.get_data_yahoo(stocks, start='2020-01-01')['Close']
data_high = pdr.get_data_yahoo(stocks, start='2020-01-01')['High']
data_low = pdr.get_data_yahoo(stocks, start='2020-01-01')['Low']
data_volume = pdr.get_data_yahoo(stocks, start='2020-01-01')['Volume']
typical_price = (data_close + data_high + data_low)/3;
money_flow = typical_price * data_volume;
positive_flow = []
negative_flow = []
for i in range(1, len(typical_price)):
    if typical_price[i] > typical_price[i-1]:
        positive_flow.append(money_flow[i-1])
        negative_flow.append(0)
    elif typical_price[i] < typical_price[i-1]:
        positive_flow.append(0)
        negative_flow.append(money_flow[i-1])
    else:
        positive_flow.append(0)
        negative_flow.append(0)
Error appears when I run the final part of the code where I try to retrieve the positive and negative moneyflow for my MFI algorithm.

Use iloc indexing:
for i in range(1, len(typical_price)):
    if typical_price.iloc[i].item() > typical_price.iloc[i-1].item():
        positive_flow.append(money_flow.iloc[i-1].item())
        negative_flow.append(0)
    elif typical_price.iloc[i].item() < typical_price.iloc[i-1].item():
        positive_flow.append(0)
        negative_flow.append(money_flow.iloc[i-1].item())
    else:
        positive_flow.append(0)
        negative_flow.append(0)
typical_price and money_flow are indexed by date, not by integers, and because they are DataFrames, typical_price[i] looks for a column labelled i rather than the i-th row. If you want to access a row by integer location, use iloc.
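A self-contained sketch of the difference between label lookup and positional lookup, using made-up numbers in place of the Yahoo download. It assumes, as the traceback suggests, that typical_price and money_flow are one-column DataFrames indexed by date.

```python
import pandas as pd

# Hypothetical values standing in for the downloaded data; one column per ticker.
dates = pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"])
typical_price = pd.DataFrame({"AAPL": [10.0, 11.0, 10.5, 10.5]}, index=dates)
money_flow = pd.DataFrame({"AAPL": [100.0, 220.0, 315.0, 210.0]}, index=dates)

# Square brackets on a DataFrame look up a column label, so this raises KeyError.
try:
    typical_price[1]
    lookup_failed = False
except KeyError:
    lookup_failed = True

positive_flow = []
negative_flow = []
for i in range(1, len(typical_price)):
    today = typical_price.iloc[i].item()          # row i by position
    yesterday = typical_price.iloc[i - 1].item()  # previous row by position
    flow = money_flow.iloc[i - 1].item()          # same row choice as the loop above
    positive_flow.append(flow if today > yesterday else 0)
    negative_flow.append(flow if today < yesterday else 0)

print(positive_flow)
print(negative_flow)
```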

Unexpected Python KeyError

I have loaded a CSV file into a Pandas dataframe:
import pandas as pd
Name ID Sex M_Status DaysOff
Joe 3 M S 1
NaN NaN NaN NaN 2
NaN NaN NaN NaN 3
df = pd.read_csv('People.csv')
This data will then be loaded into an HTML file.
test = """
HTML code
"""
Now for preparing data for the HTML file:
df1 = df.filter(['Name','ID','Sex','M_Status','DaysOff'])
file = ""
for i, rows in df1.iterrows():
    name = (df1['Name'][i])
    id = (df1['ID'][i])
    sex = (df1['Sex'][i])
    m_status = (df1['M_Status'][i])
    days_off = (df1['DaysOff'][i])
    with open(f"personInfo{i}.html", "w") as file:
        file.write(test.format(name,id,sex,m_status,days_off))
        file.close()
And the error:
KeyError: 'days_off'
Note: This error occurs within the for loop.
Can anyone see where I'm going wrong? This error is generated when you try to grab data from a column using a name that doesn't match, or when the column doesn't have that header name. However, it does.
Error Information:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'days_off'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-16-35e6b916521b> in <module>
1 name = (df1['Name'][i])
2 id = (df1['ID'][i])
3 sex = (df1['Sex'][i])
4 m_status = (df1['M_Status'][i])
----> 5 days_off = (df1['DaysOff'][i])
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2925 if self.columns.nlevels > 1:
2926 return self._getitem_multilevel(key)
-> 2927 indexer = self.columns.get_loc(key)
2928 if is_integer(indexer):
2929 indexer = [indexer]
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2657 return self._engine.get_loc(key)
2658 except KeyError:
-> 2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
2660 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2661 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'days_off'

Just a hunch, but your error message suggests that you are trying to access your dataframe column with the key days_off, when it should be DaysOff. I don't see any place in the code you provided where this happens, but I would double-check your source code file to make sure that you are using the right key name.

I've resolved this and what a stupid error it was!
Basically there was a space at the end of the header name.
What Python wanted/was expecting:
days_off = (df1['DaysOff '][i])
whereas I was giving it:
days_off = (df1['DaysOff'][i])
Very stupid human error. Thanks to all that looked into it though
