Python code acts strange in apply function from groupby - python

Hello, I have observed some quite strange pandas/Python behavior and would like to understand where it comes from.
See the code below. When I run it with line 2 commented out, there is no problem (the function does nothing useful yet) and it prints "hello" once per group. When I run it with line 2 active, it prints the correct result, then "hello", and then fails. According to the error message, it fails in line 2 with KeyError: 0.
1 |def somefunction(group):
2 |    # print(group["Date"][0]) # <- This line causes the problem...
3 |    print("hello")
4 |    return group
5 |
6 |
7 |groups = df_test.groupby(by=["UnitNumber", "Date"]).apply(somefunction)
How can this be? I am confused about why it fails and yet still prints a message, or why the program seems to fail in the return because of a print.
In case anyone is fighting with the same problem I fixed it by using the following line instead of line 2.
print(group.iloc[0]["Date"].strftime('%Y-%m-%d'))
But I would still like to understand why the code above acts so strangely. In case there are any known issues: I am using JupyterLab.
EDIT:
As requested the error message (note that the line numbers are of course different):
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-266-55f0006d1f02> in <module>
----> 1 groups= df_test.groupby(by=["UnitNumber", "Date"]).apply(somefunction)
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
1251 with option_context("mode.chained_assignment", None):
1252 try:
-> 1253 result = self._python_apply_general(f, self._selected_obj)
1254 except TypeError:
1255 # gh-20949
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f, data)
1285 data after applying f
1286 """
-> 1287 keys, values, mutated = self.grouper.apply(f, data, self.axis)
1288
1289 return self._wrap_applied_output(
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\groupby\ops.py in apply(self, f, data, axis)
818 # group might be modified
819 group_axes = group.axes
--> 820 res = f(group)
821 if not _is_indexed_like(res, group_axes, axis):
822 mutated = True
<ipython-input-264-c99a31bdd0cf> in somefunction(group)
1 def somefunction(group):
----> 2 print(group["Date"][0]) # <- This line causes the problem...
3 # print("hello")
4 return group
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
940
941 elif key_is_scalar:
--> 942 return self._get_value(key)
943
944 if is_hashable(key):
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
1049
1050 # Similar to Index.get_value, but we do not fall back to positional
-> 1051 loc = self.index.get_loc(label)
1052 return self.index._get_values_for_loc(self, loc, label)
1053
~\scoop\apps\anaconda3\2021.05\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
-> 3363 raise KeyError(key) from err
3364
3365 if is_scalar(key) and isna(key) and not self.hasnans:
KeyError: 0
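A likely explanation, sketched under the assumption that df_test has a default integer index: each group keeps the row labels of the original frame, so group["Date"][0] is a label lookup that only succeeds for the group that happens to contain the row labelled 0. That group prints its result and "hello"; the next group has no label 0 and raises KeyError: 0. The small frame below is a hypothetical stand-in for df_test, and it loops over the groups directly instead of calling apply so that every step is visible.

```python
import pandas as pd

# Hypothetical stand-in for df_test; the real data is not shown in the question.
df_test = pd.DataFrame(
    {
        "UnitNumber": [1, 1, 2, 2],
        "Date": pd.to_datetime(
            ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"]
        ),
    }
)

first_dates = {}
failed_keys = []
for key, group in df_test.groupby(by=["UnitNumber", "Date"]):
    # Each group keeps the row labels of the original frame.
    print(key, list(group.index))
    try:
        group["Date"][0]  # label lookup: needs a row labelled 0 in this group
    except KeyError:
        failed_keys.append(key)
    # Positional access works for every group, whatever its labels are.
    first_dates[key] = group["Date"].iloc[0]

print("groups without label 0:", failed_keys)
```

This is also why the group.iloc[0]["Date"] workaround from the question works: .iloc is positional and does not depend on the labels.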

Related

KeyError: 0 The above exception was the direct cause of the following exception:

The code was working correctly, but problems appeared in other parts of the program. After I fixed those, this part stopped working, and I have not been able to solve it. I hope to find help here.
I am getting an error in this part:
`
value_result = []
value_skill = []
for i in range(df4.shape[0]):
    job_description = df4['job_description'][i]
    annotations = skill_extractor.annotate(job_description)
    for type_matching, arr_skills in annotations["results"].items():
        for skill in arr_skills:
            if SKILL_DB[skill["skill_id"]]["skill_name"] in new_stopwords:
                value_result.append(SKILL_DB[skill["skill_id"]]["skill_name"])
                value_skill.append(SKILL_DB[skill["skill_id"]]["skill_type"])
`
and this is the error message
`
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3802 try:
-> 3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_1268\3053340950.py in <module>
2 value_skill = []
3 for i in range(df4.shape[0]):
----> 4 job_description = df4['job_description'][i]
5 annotations = skill_extractor.annotate(job_description)
6 for type_matching, arr_skills in annotations["results"].items():
~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
979
980 elif key_is_scalar:
--> 981 return self._get_value(key)
982
983 if is_hashable(key):
~\anaconda3\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
1087
1088 # Similar to Index.get_value, but we do not fall back to positional
-> 1089 loc = self.index.get_loc(label)
1090 return self.index._get_values_for_loc(self, loc, label)
1091
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:
-> 3805 raise KeyError(key) from err
3806 except TypeError:
3807 # If we have a listlike key, _check_indexing_error will raise
KeyError: 0
`
I have tried to solve this but could not.
Many thanks for your time, much appreciated.
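One possible cause, given that the code worked before other parts were changed: if rows were dropped or filtered out of df4 in the meantime, its index may no longer start at 0, and df4['job_description'][i] looks rows up by label, not by position. The sketch below uses a hypothetical stand-in for df4 and shows two ways around that, iterating over the values directly or resetting the index first.

```python
import pandas as pd

# Hypothetical stand-in for df4; the real frame is not shown in the question.
df4 = pd.DataFrame({"job_description": ["first", "second", "third"]})
df4 = df4[df4["job_description"] != "first"]  # index is now [1, 2], label 0 is gone

# Option 1: iterate over the values, no labels involved.
descriptions = [text for text in df4["job_description"]]

# Option 2: rebuild a 0..n-1 index, then range(df4.shape[0]) matches the labels again.
df4_reset = df4.reset_index(drop=True)
by_label = [df4_reset["job_description"][i] for i in range(df4_reset.shape[0])]

print(descriptions)
print(by_label)
```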

How come my connecting DTW-Python lines aren't showing up in my time-series graphs?

I am trying to use dtw-python inside of a jupyter notebook in order to graph the Dynamic Time Warping between two time-series graphs.
I have successfully imported dtw-python into my Jupyter notebook, but when I try to use dtw to plot my two time series, I keep getting an error and only the two graphs show up (not the dtw lines connecting them). Below is the script I tried, which only plots the two graphs without the connecting dtw lines, followed by the error message I receive.
SCRIPT I AM USING
from dtw import *
import matplotlib.pyplot as plt
dtw(USA_Unemployment_Inflation['Inflation'], USA_Unemployment_Inflation['Unemployment'], keep_internals=True,
step_pattern=rabinerJuangStepPattern(6, "c"))\
.plot(type="twoway",offset=-2)
ERROR MESSAGE I RECEIVE
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3628 try:
-> 3629 return self._engine.get_loc(casted_key)
3630 except KeyError as err:
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
C:\Users\Public\Documents\Wondershare\CreatorTemp\ipykernel_17236\2682910787.py in <module>
2 import matplotlib.pyplot as plt
3
----> 4 dtw(USA_Unemployment_Inflation['Inflation'], USA_Unemployment_Inflation['Unemployment'], keep_internals=True,
5 step_pattern=rabinerJuangStepPattern(6, "c"))\
6 .plot(type="twoway",offset=-2)
~\anaconda3\lib\site-packages\dtw\dtw.py in plot(self, type, **kwargs)
122 """
123 # ENDIMPORT
--> 124 return dtwPlot(self, type, **kwargs)
125
126
~\anaconda3\lib\site-packages\dtw\dtwPlot.py in dtwPlot(x, type, **kwargs)
68 return dtwPlotAlignment(x, **kwargs)
69 elif type == "twoway":
---> 70 return dtwPlotTwoWay(x, **kwargs)
71 elif type == "threeway":
72 return dtwPlotThreeWay(x, **kwargs)
~\anaconda3\lib\site-packages\dtw\dtwPlot.py in dtwPlotTwoWay(d, xts, yts, offset, ts_type, match_indices, match_col, xlab, ylab, **kwargs)
192 col = []
193 for i in idx:
--> 194 col.append([(d.index1[i], xts[d.index1[i]]),
195 (d.index2[i], -offset + yts[d.index2[i]])])
196
~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
956
957 elif key_is_scalar:
--> 958 return self._get_value(key)
959
960 if is_hashable(key):
~\anaconda3\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
1067
1068 # Similar to Index.get_value, but we do not fall back to positional
-> 1069 loc = self.index.get_loc(label)
1070 return self.index._get_values_for_loc(self, loc, label)
1071
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3629 return self._engine.get_loc(casted_key)
3630 except KeyError as err:
-> 3631 raise KeyError(key) from err
3632 except TypeError:
3633 # If we have a listlike key, _check_indexing_error will raise
KeyError: 0
OUTPUT I RECEIVE WITHOUT THE DTW LINES CONNECTED BETWEEN THE TWO GRAPHS
If anyone has any tips to fix this so that my dtw lines actually show up, or if my syntax is incorrect please ANYTHING HELPS!! THANKS.
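Judging from the traceback, the plot code indexes the stored series with xts[d.index1[i]], and because the inputs are pandas Series that lookup goes by label. If the DataFrame's index does not contain the label 0 (an assumption, since the index is not shown), the lookup fails with KeyError: 0. A minimal sketch of one workaround is to hand dtw plain NumPy arrays instead; the data below is a hypothetical stand-in, and the dtw call itself is left as a comment so the snippet runs without dtw-python installed.

```python
import pandas as pd

# Hypothetical stand-in for USA_Unemployment_Inflation with a non-zero-based index.
USA_Unemployment_Inflation = pd.DataFrame(
    {"Inflation": [1.6, 2.1, 2.4], "Unemployment": [4.9, 4.4, 3.9]},
    index=[2016, 2017, 2018],
)

# NumPy arrays carry no labels, so [0] always means "first element".
inflation = USA_Unemployment_Inflation["Inflation"].to_numpy()
unemployment = USA_Unemployment_Inflation["Unemployment"].to_numpy()
print(inflation[0], unemployment[0])

# The arrays can then replace the Series in the call from the question:
# dtw(inflation, unemployment, keep_internals=True,
#     step_pattern=rabinerJuangStepPattern(6, "c")).plot(type="twoway", offset=-2)
```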

Having issues getting rb.fit() to work for a sentiment analysis project. Python 3.9.12

Getting back into coding after a while of being away.
I'm having issues getting rb.fit() to work for a sentiment analysis project (Python 3.9.12). I'm also open to other recommendations if you think RandomBaseline is just no good.
rb = RandomBaseline()
#rb.fit(data, target_col)
rb.fit(df_text_analysis, df_text_analysis['Q6'])
I found the code at https://asperbrothers.com/blog/sentiment-analysis-in-python/ and it seems I have it set up the same way, but I'm getting a KeyError: nan. The call in the original article is:
rb = RandomBaseline()
rb.fit(df.iloc[X_train.index], "Sentiment")
Error output below.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File /opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py:3621, in Index.get_loc(self, key, method, tolerance)
3620 try:
-> 3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
File /opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()
File /opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()
File pandas/_libs/hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas/_libs/hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: nan
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Input In [12], in <cell line: 3>()
1 rb = RandomBaseline()
2 #rb.fit(data, target_col)
----> 3 rb.fit(df_text_analysis, df_text_analysis['Q6'])
Input In [11], in RandomBaseline.fit(self, data, target_col)
9 agg = data.groupby(target_col).count()
11 for n in feels:
---> 12 self.categories[n] = agg.loc[n][0] / len(data)
File /opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py:967, in _LocationIndexer.__getitem__(self, key)
964 axis = self.axis or 0
966 maybe_callable = com.apply_if_callable(key, self.obj)
--> 967 return self._getitem_axis(maybe_callable, axis=axis)
File /opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py:1202, in _LocIndexer._getitem_axis(self, key, axis)
1200 # fall thru to straight lookup
1201 self._validate_key(key, axis)
-> 1202 return self._get_label(key, axis=axis)
File /opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py:1153, in _LocIndexer._get_label(self, label, axis)
1151 def _get_label(self, label, axis: int):
1152 # GH#5667 this will fail if the label is not present in the axis.
-> 1153 return self.obj.xs(label, axis=axis)
File /opt/anaconda3/lib/python3.9/site-packages/pandas/core/generic.py:3864, in NDFrame.xs(self, key, axis, level, drop_level)
3862 new_index = index[loc]
3863 else:
-> 3864 loc = index.get_loc(key)
3866 if isinstance(loc, np.ndarray):
3867 if loc.dtype == np.bool_:
File /opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py:3623, in Index.get_loc(self, key, method, tolerance)
3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
-> 3623 raise KeyError(key) from err
3624 except TypeError:
3625 # If we have a listlike key, _check_indexing_error will raise
3626 # InvalidIndexError. Otherwise we fall through and re-raise
3627 # the TypeError.
3628 self._check_indexing_error(key)
KeyError: nan
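A guess at what is happening, assuming feels in RandomBaseline.fit is built from the unique values of the target column (that line is not visible in the traceback): unique() keeps NaN, while groupby drops NaN keys by default, so agg.loc[nan] has nothing to find. The sketch below uses a hypothetical stand-in for df_text_analysis and computes the same class shares after dropping rows with a missing label. Note also that the article passes a column name ("Sentiment"), whereas the failing call passes the column itself.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_text_analysis with one missing label in Q6.
df_text_analysis = pd.DataFrame({"Q6": ["positive", "negative", np.nan, "positive"]})

labels = df_text_analysis["Q6"].unique()       # includes NaN
agg = df_text_analysis.groupby("Q6").size()    # NaN group is dropped by default
print(pd.isna(labels).any(), agg.index.hasnans)

# Drop rows without a label, then compute each class's share of the data.
clean = df_text_analysis.dropna(subset=["Q6"])
categories = clean["Q6"].value_counts(normalize=True).to_dict()
print(categories)
```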

If conditional | Dataframe | Key error: 12690

I have a dataframe with a lot of columns, but in this case I am trying to write an if condition for just one of them.
The idea is to compare each row with the previous one to check whether they are equal, but the code doesn't work.
Proyectonevera2['CodProducto']
Out:
0 10390792
1 10390792
2 10390792
3 10390792
4 10390792
...
12685 10229147
12686 10229147
12687 10229147
12688 10229147
12689 10229147
Name: CodProducto, Length: 12690, dtype: object
The column is called "CodProducto" and the type is object
for i in range(0,len(Proyectonevera2)):
    if Proyectonevera2.loc[i+1,'CodProducto'] == Proyectonevera2.loc[i,'CodProducto']:
        Proyectonevera2.loc[i,'Prueba'] = 1
    else:
        Proyectonevera2.loc[i,'Prueba'] = 0
But when I run the code, this error appears:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
C:\ProgramData\Anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
C:\ProgramData\Anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 12690
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_17980/3964080695.py in <module>
1 for i in range(0,len(Proyectonevera2)):
----> 2 if Proyectonevera2.loc[i+1,'CodProducto'] == Proyectonevera2.loc[i,'CodProducto']:
3 Proyectonevera2.loc[i,'Prueba'] = 1
4 else:
5 Proyectonevera2.loc[i,'Prueba'] = 0
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
923 with suppress(KeyError, IndexError):
924 return self.obj._get_value(*key, takeable=self._takeable)
--> 925 return self._getitem_tuple(key)
926 else:
927 # we by definition only have the 0th axis
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_tuple(self, tup)
1098 def _getitem_tuple(self, tup: tuple):
1099 with suppress(IndexingError):
-> 1100 return self._getitem_lowerdim(tup)
1101
1102 # no multi-index, so validate all of the indexers
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_lowerdim(self, tup)
836 # We don't need to check for tuples here because those are
837 # caught by the _is_nested_tuple_indexer check above.
--> 838 section = self._getitem_axis(key, axis=i)
839
840 # We should never have a scalar section here, because
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1162 # fall thru to straight lookup
1163 self._validate_key(key, axis)
-> 1164 return self._get_label(key, axis=axis)
1165
1166 def _get_slice_axis(self, slice_obj: slice, axis: int):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
1111 def _get_label(self, label, axis: int):
1112 # GH#5667 this will fail if the label is not present in the axis.
-> 1113 return self.obj.xs(label, axis=axis)
1114
1115 def _handle_lowerdim_multi_index_axis0(self, tup: tuple):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
3774 raise TypeError(f"Expected label or tuple of labels, got {key}") from e
3775 else:
-> 3776 loc = index.get_loc(key)
3777
3778 if isinstance(loc, np.ndarray):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
-> 3363 raise KeyError(key) from err
3364
3365 if is_scalar(key) and isna(key) and not self.hasnans:
KeyError: 12690
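The loop fails on its last iteration: when i is 12689, i+1 is 12690, a label that does not exist in the index. You don't need a loop to do that: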
Proyectonevera2['Prueba'] = (Proyectonevera2['CodProducto'] == Proyectonevera2['CodProducto'].shift()).astype('int')
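To see what the shift-based comparison produces, here is a small sketch on made-up data (the four values are hypothetical, and the PruebaNext column is only added for illustration). shift() compares each row with the previous one, as described in the question; shift(-1) compares with the next row, which is what the loop's i+1 actually did.

```python
import pandas as pd

# Hypothetical miniature of Proyectonevera2.
df = pd.DataFrame({"CodProducto": ["10390792", "10390792", "10229147", "10229147"]})

# 1 where the row equals the previous row; the first row has no predecessor.
df["Prueba"] = (df["CodProducto"] == df["CodProducto"].shift()).astype("int")

# 1 where the row equals the next row; the last row has no successor.
df["PruebaNext"] = (df["CodProducto"] == df["CodProducto"].shift(-1)).astype("int")

print(df)
```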

KeyError Traceback using for+if in python

I am a newbie in Python. I am trying to check the results of a train/test run, and I have to compare my predictions with the actual test results (data_train).
data_train is a dictionary and the predictions are an array (both were shown as images in the original post).
The code aims to count the consistent classifications between the predictions and the test results.
consistent=0
inconsistent=0

for i in np.linspace(1,len_test,len_test):
    if data_train['class'][i] == predictions[i]:
        consistent=consistent+1
    else:
        inconsistent=inconsistent+1
I get this error. What does it mean? I don't know how to handle an error like this one in Python.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\anaconda4\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
~\anaconda4\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\anaconda4\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 1
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10076/1764309407.py in <module>
1 for i in np.linspace(1,len_test,len_test):
----> 2 a=data_train['class'][i]
3 b=predictions[i]
4 if a == b:
5 consistent=consistent+1
~\anaconda4\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
940
941 elif key_is_scalar:
--> 942 return self._get_value(key)
943
944 if is_hashable(key):
~\anaconda4\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
1049
1050 # Similar to Index.get_value, but we do not fall back to positional
-> 1051 loc = self.index.get_loc(label)
1052 return self.index._get_values_for_loc(self, loc, label)
1053
~\anaconda4\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
-> 3363 raise KeyError(key) from err
3364
3365 if is_scalar(key) and isna(key) and not self.hasnans:
KeyError: 1.0
I'm not entirely sure of the code's context, but it seems you're using NumPy. As I understand it, referencing an array element can only be done using an integer (or slices, ellipses etc.), but not strings. You appear to be attempting to reference using the string 'class' and I don't understand why. My suggestion would be to remove the ['class'] reference and simply use the integer reference, such as: a=data_train[i]
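Another reading of the traceback: the failing lookup happens inside pandas (series.py), which suggests data_train['class'] is a pandas Series, and the final key is 1.0 because np.linspace yields floats. A float label that is not in the index raises KeyError. A minimal sketch that sidesteps labels entirely is to compare the values position by position; the data below is a hypothetical stand-in, and it assumes both sequences have the same length and order.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the objects in the question.
data_train = {"class": pd.Series([0, 1, 0], index=["r1", "r2", "r3"])}
predictions = np.array([0, 0, 0])

# Convert to a plain array so the comparison is positional, not label-based.
actual = np.asarray(data_train["class"])
matches = actual == predictions

consistent = int(matches.sum())
inconsistent = int(len(matches) - consistent)
print(consistent, inconsistent)
```

If a loop is still preferred, range(len_test) produces integer positions starting at 0, which can be used with .iloc[i] on the Series.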
