Key error message when calculating variables using pandas and yfinance - python

I'm trying to calculate some variables from yfinance data using the column df['Close'],
but I'm getting an error I have not seen before. Here is the code:
import os
import pandas as pd
import plotly.graph_objects as go
symbols = 'AAPL'
for filename in os.listdir('datasets/'):
    #print(filename)
    symbol = filename.split('.')[0]
    #print(symbol)
    df = pd.read_csv('datasets/{}'.format(filename))
    if df.empty:
        continue
    df['20_sma'] = df['Close'].rolling(window=20).mean()
    df['stddev'] = df['Close'].rolling(window=20).std()
    # lower band = mean minus two standard deviations, upper band = mean plus two
    df['lowerband'] = df['20_sma'] - (2 * df['stddev'])
    df['upperband'] = df['20_sma'] + (2 * df['stddev'])
    if symbol in symbols:
        print(df)
And here is the error message:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Close'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/Kit/Documents/TTM_squeezer/squeeze.py", line 16, in <module>
df['20_sma'] = df['Close'].rolling(window=20).mean()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/frame.py", line 2906, in __getitem__
indexer = self.columns.get_loc(key)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
raise KeyError(key) from err
KeyError: 'Close'
It seems the 'Close' column is causing this error, but I just can't figure out why.
Many thanks

It turns out there was an error in the process that saved the local file.
Case closed, thanks all.
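A minimal sketch, assuming the saved CSV may simply lack a 'Close' column: checking df.columns before computing the bands turns the KeyError into a message that lists what the file actually contains. The helper name add_bollinger_bands and the sample frames are hypothetical, not part of the original script.

```python
import pandas as pd


def add_bollinger_bands(df, column="Close", window=20):
    """Return a copy of df with rolling bands, or raise a readable error."""
    if column not in df.columns:
        raise ValueError(
            f"column {column!r} not found; available columns: {list(df.columns)}"
        )
    out = df.copy()
    out["20_sma"] = out[column].rolling(window=window).mean()
    out["stddev"] = out[column].rolling(window=window).std()
    out["upperband"] = out["20_sma"] + 2 * out["stddev"]
    out["lowerband"] = out["20_sma"] - 2 * out["stddev"]
    return out


# Made-up data: a frame with the expected column works as usual.
good = pd.DataFrame({"Close": [float(i) for i in range(1, 31)]})
bands = add_bollinger_bands(good)

# Made-up data: a badly saved file (here the column is lower-case) is reported clearly.
bad = pd.DataFrame({"close": [1.0, 2.0]})
try:
    add_bollinger_bands(bad)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Printing df.columns right after pd.read_csv is an even quicker check when only one file is suspect.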

Related

How do I handle a date error in eventstudy package?

I'm trying to run multiple event studies in Python with the eventstudy package. However, I keep getting the same date error no matter what I try (different date formats, naming or not naming the columns, setting the date_format parameter, ...).
Would you know what is wrong, or how else I could run these multiple event studies?
This is my code now:
import eventstudy as es
import pandas as pd
returns = "C:/Users/Artur Andrade/OneDrive/Documents/_Others/Monografia/Eu/Base de dados/RETURNS.csv"
events = "C:/Users/Artur Andrade/OneDrive/Documents/_Others/Monografia/Eu/Base de dados/TESTE.csv"
es.Single.import_returns("C:/Users/Artur Andrade/OneDrive/Documents/_Others/Monografia/Eu/Base de dados/RETURNS.csv", is_price=True)
energy = es.Multiple.from_csv(
    path = "C:/Users/Artur Andrade/OneDrive/Documents/_Others/Monografia/Eu/Base de dados/TESTE.csv",
    event_study_model = es.Single.market_model,
    event_window = (-5,+10),
    estimation_size = 100,
    buffer_size = 30,
    ignore_errors = True
)
energy.results()
energy.plot()
And I keep getting this error:
Traceback (most recent call last):
File "C:\Users\Artur Andrade\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexes\base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'date'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Artur Andrade\OneDrive\Documents\_Others\Monografia\Eu\Base de dados\codigo\evento.py", line 11, in <module>
es.Single.import_returns("C:/Users/Artur Andrade/OneDrive/Documents/_Others/Monografia/Eu/Base de dados/RETURNS.csv", is_price=True)
File "C:\Users\Artur Andrade\AppData\Local\Programs\Python\Python310\lib\site-packages\eventstudy\single.py", line 327, in import_returns
data = read_csv(path, format_date=True, date_format=date_format)
File "C:\Users\Artur Andrade\AppData\Local\Programs\Python\Python310\lib\site-packages\eventstudy\utils.py", line 112, in read_csv
df[date_column] = pd.to_datetime(df[date_column], format=date_format)
File "C:\Users\Artur Andrade\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py", line 3505, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\Artur Andrade\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexes\base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: 'date'
My database right now is just like this:
RETURNS.csv:
Date,IBOV,IEE,AESB3,EGIE3,ENBR3,EQTL3,MEGA3,NEOE3,ENEV3
2017-01-02,"59,589","36,062",14, ,11.18,10.53, , ,2.97
.
.
.
2022-09-23,"111,716","82,969",9.64,40.7,23.71,26.97,10.72,16.48,15.34
2022-09-26,"109,114","80,913",9.6,39.7,23.16,26.71,10.42,16.11,14.81
2022-09-27,"108,376","79,121",9.55,39.07,22.72,26.15,10.36,15.72,14.37
2022-09-28,"108,451","78,427",9.44,38.54,22.04,26.1,10.59,15.35,14.26
2022-09-29,"107,664","77,950",9.37,38.29,21.72,26.01,10.38,15.23,14.45
TESTE.csv:
security_ticker,market_ticker,event_date
AESB3,IBOV,2017-01-13
.
.
.
ENBR3,IBOV,2022-04-20
EQTL3,IBOV,2021-06-02
MEGA3,IBOV,2022-07-04
ENEV3,IBOV,2021-12-15
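One thing the traceback suggests, offered as an unconfirmed guess: the failing line is df[date_column] with the key 'date', while the header of RETURNS.csv spells it Date. A minimal pandas-only sketch of renaming that column before handing the file to the package follows. The in-memory sample is cut down from the rows above, the output file name in the comment is hypothetical, and whether eventstudy needs anything else is not checked here.

```python
import io

import pandas as pd

# A cut-down copy of the RETURNS.csv layout shown in the question.
sample = io.StringIO(
    "Date,IBOV,IEE,AESB3\n"
    '2022-09-28,"108,451","78,427",9.44\n'
    '2022-09-29,"107,664","77,950",9.37\n'
)

# thousands="," parses quoted values such as "108,451" as numbers.
returns = pd.read_csv(sample, thousands=",")

# Rename the header so that a lookup of 'date' succeeds.
returns = returns.rename(columns={"Date": "date"})

# In practice this would go to a file, e.g.
# returns.to_csv("RETURNS_fixed.csv", index=False)  # hypothetical file name
fixed_csv = returns.to_csv(index=False)
print(fixed_csv.splitlines()[0])
```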

iterate a dataframe

I'm trying to iterate over a dataframe to run MongoDB queries from a list and save each result to a CSV file. The connection works without errors, but when I iterate it only creates the first file (0.csv) and I get an error for the second row of the dataframe.
This is my code:
sql = [
    ('tran','transactions',{"den": "00100002773060"}),
    ('tran','Data',{'name': 'john'}),
]
df = pd.DataFrame(sql, columns = ["database", "entity", "sql"])
for i in range(len(df)):
    database = df.iloc[i]["database"]
    entity=df.iloc[i]["entity"]
    myquery=df.iloc[i]["sql"]
    collection = client[database][entity]
    try:
        mydoc = list(collection.find(myquery))
        if len(mydoc) > 0:
            df = pd.DataFrame(mydoc)
            df.pop("_id")
            df.to_csv(str(i) + '.csv')
            print("file saved")
    except:
        print("error on file")
And this is the error:
Traceback (most recent call last):
File "/home/r/Desktop/table_csv/entorno_virtual/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3629, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'database'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "getSql.py", line 12, in <module>
database = df.iloc[i]["database"]
File "/home/r/Desktop/table_csv/entorno_virtual/lib/python3.8/site-packages/pandas/core/series.py", line 958, in __getitem__
return self._get_value(key)
File "/home/r/Desktop/table_csv/entorno_virtual/lib/python3.8/site-packages/pandas/core/series.py", line 1069, in _get_value
loc = self.index.get_loc(label)
File "/home/r/Desktop/table_csv/entorno_virtual/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3631, in get_loc
raise KeyError(key) from err
KeyError: 'database'
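From what I can see, you are reassigning your df variable inside the loop here:
df = pd.DataFrame(mydoc)
so on the second iteration df no longer has a 'database' column. Renaming the inner variable should fix it.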
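A minimal sketch of the rename, assuming nothing else in the loop needs to change. To keep it runnable without a database, fake_find is a hypothetical stand-in for collection.find(), and the results are collected in a list instead of being written with to_csv.

```python
import pandas as pd


def fake_find(query):
    """Hypothetical stand-in for collection.find(); returns made-up documents."""
    return [{"_id": 1, **query}, {"_id": 2, **query}]


queries = [
    ("tran", "transactions", {"den": "00100002773060"}),
    ("tran", "Data", {"name": "john"}),
]
df = pd.DataFrame(queries, columns=["database", "entity", "sql"])

saved = []
for i in range(len(df)):
    row = df.iloc[i]
    docs = fake_find(row["sql"])
    if docs:
        # A separate name keeps the query table in df intact for the next iteration.
        result = pd.DataFrame(docs).drop(columns="_id")
        # The original script would call result.to_csv(str(i) + '.csv') here.
        saved.append((f"{i}.csv", result))

print([name for name, _ in saved])
```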

Creating a deltatime array in Python

I am new to Python, so I decided to start a project to improve my skills and began working through this one from GeeksForGeeks. Now I am having difficulty appending a deltaTime variable to an array. I tried a numpy array as well, but it did not work.
My code:
from matplotlib.ticker import Formatter
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import numpy as np
from pandas._libs.tslibs import timestamps
birdData = pd.read_csv("bird_tracking.csv")
birdNames = pd.unique(birdData.bird_name)
# Getting the time interval
timestamps = []
for i in range(len(birdData)):
    timestamps.append(datetime.datetime.strptime(birdData.date_time.iloc[i][:-3], "%Y-%m-%d %H:%M:%S"))
birdData["timestamps"] = pd.Series(timestamps, index = birdData.index)
plt.figure(figsize=(7, 7))
for name in birdNames:
    times = birdData.timestamps[birdData.bird_name == name]
    elapsedTime = []
    for time in times:
        x = time-times[0]
        #print(x)
        elapsedTime.append(x)
    plt.plot(np.array(elapsedTime)/datetime.timedelta(days=1), label = name)
plt.xlabel(" Observation ")
plt.ylabel(" Elapsed time (days) ")
plt.show()
The error I am getting:
Traceback (most recent call last):
File "C:\Users\User\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1625, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1632, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\Users\User\Documents\GitHub\TrackingBirdMigration\dataTime.py", line 24, in <module>
x = time-times[0]
File "C:\Users\User\anaconda3\lib\site-packages\pandas\core\series.py", line 853, in __getitem__
return self._get_value(key)
File "C:\Users\User\anaconda3\lib\site-packages\pandas\core\series.py", line 961, in _get_value
loc = self.index.get_loc(label)
File "C:\Users\User\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
raise KeyError(key) from err
KeyError: 0
[Done] exited with code=1 in 8.313 seconds
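A likely cause, judging from the traceback: times[0] looks up the index label 0, and the filtered Series keeps the original row labels, so for every bird whose rows do not start at label 0 that label is missing. A minimal sketch with made-up data (the bird names and timestamps below are hypothetical) that uses positional access with .iloc[0] instead:

```python
import pandas as pd

# Made-up stand-in for bird_tracking.csv after the timestamps column was built.
birdData = pd.DataFrame(
    {
        "bird_name": ["Eric", "Eric", "Nico", "Nico"],
        "timestamps": pd.to_datetime(
            [
                "2013-08-15 00:00:00",
                "2013-08-15 06:00:00",
                "2013-08-15 00:00:00",
                "2013-08-16 12:00:00",
            ]
        ),
    }
)

times = birdData.timestamps[birdData.bird_name == "Nico"]
print(times.index.tolist())  # the original row labels survive the filter

# .iloc[0] is the first element by position, whatever its label is.
first = times.iloc[0]

# Subtracting a Timestamp from the Series gives timedeltas; dividing by one day gives floats.
elapsed_days = (times - first) / pd.Timedelta(days=1)
print(elapsed_days.tolist())
```

Because the subtraction is vectorised, the inner for loop and the elapsedTime list are not needed in this version.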

Problem accessing pandas data that is represented with commas?

I have the following lines:
data = pd.read_csv("file.csv", sep=";", encoding='ISO-8859-1', engine = 'python')
test = str(data['information'])
I'm trying to access a CSV column whose cells contain data like "1000,10500,2500".
I get an error:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Vastuualue'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/erik.ilonen/Desktop/Projekti_csv_data/Toinen_testiohjelma/toinen_datan_kasittely_ohjelma.py", line 12, in <module>
test = str(dataAlkuperainen['Vastuualue'])
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/frame.py", line 3024, in __getitem__
indexer = self.columns.get_loc(key)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3082, in get_loc
raise KeyError(key) from err
KeyError: 'information'
Your separator is not right.
sep should be a comma, not a semicolon, so use sep="," instead of sep=";".
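A quick way to check whether the separator matches the file, sketched with a made-up two-column sample (the id column and the values are hypothetical): when sep is wrong, pandas reads the whole header line as a single column name, so a lookup such as data['information'] cannot succeed.

```python
import io

import pandas as pd

# Made-up comma-separated sample; the real file and its other columns are not shown.
raw = "id,information\n1,a\n2,b\n"

wrong = pd.read_csv(io.StringIO(raw), sep=";")
right = pd.read_csv(io.StringIO(raw), sep=",")

# Inspecting the column labels shows immediately which read split the header.
print(wrong.columns.tolist())
print(right.columns.tolist())

test = str(right["information"])
```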

Reading a file with pandas and use correlation coefficients on two columns

I have a file like the following, with no header:
0.000000 0.330001 0.280120
1.000000 0.355590 0.298581
2.000000 0.305945 0.280231
I want to read this file into a pandas dataframe and compute the correlation coefficient between the second and the third column.
This is what I am trying:
import pandas as pd
df = pd.read_csv('COLVAR_hbondnohead', header=None)
df['1'].corr(df['2'])
It pops up with a huge error message. Am I not treating the columns properly? Any suggestion or hint?
Error message
Traceback (most recent call last):
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3063, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2685, in __getitem__
return self._getitem_column(key)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2486, in _get_item_cache
values = self._data.get(item)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '1'
You have to specify the separator, which is a space, when reading the file. Then access the columns by position. The code below should work.
df = pd.read_csv('test.txt', sep=' ', header=None)
df[1].corr(df[2])
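A self-contained sketch of the same idea, using the three sample rows from the question held in memory instead of a file: with header=None the column labels are the integers 0, 1 and 2, which is why the string keys '1' and '2' raise a KeyError.

```python
import io

import pandas as pd

# The three rows shown in the question, separated by single spaces.
sample = io.StringIO(
    "0.000000 0.330001 0.280120\n"
    "1.000000 0.355590 0.298581\n"
    "2.000000 0.305945 0.280231\n"
)

df = pd.read_csv(sample, sep=" ", header=None)
print(df.columns.tolist())  # integer labels, not strings

# Second and third columns are at integer labels 1 and 2.
r = df[1].corr(df[2])
print(r)
```

If the real file is padded with more than one space between values, sep=r"\s+" is the more forgiving choice.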
Roy, what is the file extension? Is it .csv? If so, you should add it to the end of the file name, like pd.read_csv('COLVAR_hbondnohead.csv', header=None).
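You don't have columns named '1' and '2': with header=None the columns are labelled with the integers 0, 1 and 2. If you want string names, assign them when reading the file (reindex would only create new, empty columns):
import pandas as pd
df = pd.read_csv('COLVAR_hbondnohead', sep=r'\s+', header=None, names=['1', '2', '3'])
then
df['2'].corr(df['3'])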
