AssertionError using Basemap and Pandas

I'm trying to follow the tutorial here:
http://nbviewer.ipython.org/github/ehmatthes/intro_programming/blob/master/notebooks/visualization_earthquakes.ipynb#install_standard
However, I am using pandas instead of Python's built-in csv module. My code is as follows:
import pandas as pd
eq_data = pd.read_csv('earthquake_data.csv')
map2 = Basemap(projection='robin'
, resolution='l'
, area_thresh=1000.0
, lat_0=0
, lon_0=0)
map2.drawcoastlines()
map2.drawcountries()
map2.fillcontinents(color = 'gray')
map2.drawmapboundary()
map2.drawmeridians(np.arange(0, 360, 30))
map2.drawparallels(np.arange(-90, 90, 30))
x,y = map2(eq_data['longitude'].values, eq_data['latitude'].values)
map2.plot(x,y, marker='0', markercolor='red', markersize=6)
This produces an AssertionError but with no description:
AssertionError Traceback (most recent call last)
<ipython-input-64-d3426e1f175d> in <module>()
14 x,y = map2(range(20), range(20))#eq_data['longitude'].values, eq_data['latitude'].values)
15
---> 16 map2.plot(x,y, marker='0', markercolor='red', markersize=6)
c:\Python27\lib\site-packages\mpl_toolkits\basemap\__init__.pyc in with_transform(self, x, y, *args, **kwargs)
540 # convert lat/lon coords to map projection coords.
541 x, y = self(x,y)
--> 542 return plotfunc(self,x,y,*args,**kwargs)
543 return with_transform
544
c:\Python27\lib\site-packages\mpl_toolkits\basemap\__init__.pyc in plot(self, *args, **kwargs)
3263 ax.hold(h)
3264 try:
-> 3265 ret = ax.plot(*args, **kwargs)
3266 except:
3267 ax.hold(b)
c:\Python27\lib\site-packages\matplotlib\axes.pyc in plot(self, *args, **kwargs)
4135 lines = []
4136
-> 4137 for line in self._get_lines(*args, **kwargs):
4138 self.add_line(line)
4139 lines.append(line)
c:\Python27\lib\site-packages\matplotlib\axes.pyc in _grab_next_args(self, *args, **kwargs)
315 return
316 if len(remaining) <= 3:
--> 317 for seg in self._plot_args(remaining, kwargs):
318 yield seg
319 return
c:\Python27\lib\site-packages\matplotlib\axes.pyc in _plot_args(self, tup, kwargs)
303 ncx, ncy = x.shape[1], y.shape[1]
304 for j in xrange(max(ncx, ncy)):
--> 305 seg = func(x[:, j % ncx], y[:, j % ncy], kw, kwargs)
306 ret.append(seg)
307 return ret
c:\Python27\lib\site-packages\matplotlib\axes.pyc in _makeline(self, x, y, kw, kwargs)
255 **kw
256 )
--> 257 self.set_lineprops(seg, **kwargs)
258 return seg
259
c:\Python27\lib\site-packages\matplotlib\axes.pyc in set_lineprops(self, line, **kwargs)
198 raise TypeError('There is no line property "%s"' % key)
199 func = getattr(line, funcName)
--> 200 func(val)
201
202 def set_patchprops(self, fill_poly, **kwargs):
c:\Python27\lib\site-packages\matplotlib\lines.pyc in set_marker(self, marker)
851
852 """
--> 853 self._marker.set_marker(marker)
854
855 def set_markeredgecolor(self, ec):
c:\Python27\lib\site-packages\matplotlib\markers.pyc in set_marker(self, marker)
231 else:
232 try:
--> 233 Path(marker)
234 self._marker_function = self._set_vertices
235 except ValueError:
c:\Python27\lib\site-packages\matplotlib\path.pyc in __init__(self, vertices, codes, _interpolation_steps, closed, readonly)
145 codes[-1] = self.CLOSEPOLY
146
--> 147 assert vertices.ndim == 2
148 assert vertices.shape[1] == 2
149
AssertionError:
I thought the problem was caused by a pandas update that no longer allows passing a Series directly, as described here:
Runtime error using python basemap and pyproj?
But as you can see, I adjusted my code for this and it didn't fix the problem. At this point I am lost.
I am using Python 2.7.6, pandas 0.15.2, and basemap 1.0.7 on Windows Server 2012 x64.

There are two problems with my code. First, the plot method of the map2 object passes its arguments on to matplotlib's plot (the traceback shows it calling ax.plot), so the marker cannot be '0' (the digit zero); it needs to be 'o' (the letter). Second, there is no markercolor keyword; it is called color. The code below should work.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

eq_data = pd.read_csv('earthquake_data.csv')
map2 = Basemap(projection='robin',
               resolution='l',
               area_thresh=1000.0,
               lat_0=0,
               lon_0=0)
map2.drawcoastlines()
map2.drawcountries()
map2.fillcontinents(color='gray')
map2.drawmapboundary()
map2.drawmeridians(np.arange(0, 360, 30))
map2.drawparallels(np.arange(-90, 90, 30))
# Convert longitude/latitude to map projection coordinates.
x, y = map2(eq_data['longitude'].values, eq_data['latitude'].values)
# marker='o' is the letter o (circle); linestyle='' draws markers only.
map2.plot(x, y, marker='o', color='red', markersize=6, linestyle='')
plt.show()
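The '0' versus 'o' mix-up is easy to miss when reading code. As a minimal sketch, assuming matplotlib is installed, the helper below looks a string up in MarkerStyle.markers, matplotlib's table of built-in marker codes. The name is_known_marker is made up for this illustration and is not part of matplotlib or Basemap.

```python
from matplotlib.markers import MarkerStyle


def is_known_marker(marker):
    """Return True if `marker` is one of matplotlib's built-in marker codes.

    Hypothetical helper for illustration; it only handles hashable codes
    such as strings, not custom vertex lists or Path objects.
    """
    return marker in MarkerStyle.markers


print(is_known_marker('o'))  # letter o, the circle marker
print(is_known_marker('0'))  # digit zero as a string
```

A check like this only covers the built-in codes. matplotlib also accepts other marker forms (for example vertex lists), which is why the old version in the traceback tried to build a Path from '0' and failed with a bare assertion.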

Related

How to create a regplot in Pyspark - tough time with sparse vector

I am using a Colab Notebook and doing a linear regression in Pyspark.
I completed the whole regression with no problems. However, we were asked to add a regression plot, and that's where I am stuck.
I had multiple features in my dataset, which I combined into one column using the feature assembler. Here is what my predicted dataset looks like:
I now want to create something like this:
And I am using the following code:
import chart_studio.plotly as py
import plotly.graph_objects as go
x = mdata.toPandas()['Independent_Features'].values.tolist()
y = mdata.toPandas()['MonthlyCharges'].values.tolist()
y_pred=mdata.toPandas()['prediction'].values.tolist()
fig = go.Figure()
fig.add_trace(
go.Scatter(
x=x,
y=y,
mode='markers',
name='Original_Data',
))
fig.add_trace(
go.Scatter(
x=x,
y=y_pred,
name='Predicted'
))
fig.update_layout(
title="Linear Regression",
xaxis_title="Independent Features",
yaxis_title="Monthly Charges",
font=dict(
family="Courier New, monospace",
size=18,
color="#7f7f7f"
)
)
fig.show()
But I am getting this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-45-cb270db7e5d4> in <module>()
30
31
---> 32 fig.show()
11 frames
/usr/local/lib/python3.7/dist-packages/plotly/basedatatypes.py in show(self, *args, **kwargs)
3396 import plotly.io as pio
3397
-> 3398 return pio.show(self, *args, **kwargs)
3399
3400 def to_json(self, *args, **kwargs):
/usr/local/lib/python3.7/dist-packages/plotly/io/_renderers.py in show(fig, renderer, validate, **kwargs)
387
388 # Mimetype renderers
--> 389 bundle = renderers._build_mime_bundle(fig_dict, renderers_string=renderer, **kwargs)
390 if bundle:
391 if not ipython_display:
/usr/local/lib/python3.7/dist-packages/plotly/io/_renderers.py in _build_mime_bundle(self, fig_dict, renderers_string, **kwargs)
295 setattr(renderer, k, v)
296
--> 297 bundle.update(renderer.to_mimebundle(fig_dict))
298
299 return bundle
/usr/local/lib/python3.7/dist-packages/plotly/io/_base_renderers.py in to_mimebundle(self, fig_dict)
389 default_width="100%",
390 default_height=525,
--> 391 validate=False,
392 )
393
/usr/local/lib/python3.7/dist-packages/plotly/io/_html.py in to_html(fig, config, auto_play, include_plotlyjs, include_mathjax, post_script, full_html, animation_opts, default_width, default_height, validate, div_id)
144
145 # ## Serialize figure ##
--> 146 jdata = to_json_plotly(fig_dict.get("data", []))
147 jlayout = to_json_plotly(fig_dict.get("layout", {}))
148
/usr/local/lib/python3.7/dist-packages/plotly/io/_json.py in to_json_plotly(plotly_object, pretty, engine)
122 from _plotly_utils.utils import PlotlyJSONEncoder
123
--> 124 return json.dumps(plotly_object, cls=PlotlyJSONEncoder, **opts)
125 elif engine == "orjson":
126 JsonConfig.validate_orjson()
/usr/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
236 check_circular=check_circular, allow_nan=allow_nan, indent=indent,
237 separators=separators, default=default, sort_keys=sort_keys,
--> 238 **kw).encode(obj)
239
240
/usr/local/lib/python3.7/dist-packages/_plotly_utils/utils.py in encode(self, o)
57 """
58 # this will raise errors in a normal-expected way
---> 59 encoded_o = super(PlotlyJSONEncoder, self).encode(o)
60 # Brute force guessing whether NaN or Infinity values are in the string
61 # We catch false positive cases (e.g. strings such as titles, labels etc.)
/usr/lib/python3.7/json/encoder.py in encode(self, o)
197 # exceptions aren't as detailed. The list call should be roughly
198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
200 if not isinstance(chunks, (list, tuple)):
201 chunks = list(chunks)
/usr/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
255 self.key_separator, self.item_separator, self.sort_keys,
256 self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)
258
259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
/usr/local/lib/python3.7/dist-packages/_plotly_utils/utils.py in default(self, obj)
134 except NotEncodable:
135 pass
--> 136 return _json.JSONEncoder.default(self, obj)
137
138 @staticmethod
/usr/lib/python3.7/json/encoder.py in default(self, o)
177
178 """
--> 179 raise TypeError(f'Object of type {o.__class__.__name__} '
180 f'is not JSON serializable')
181
TypeError: Object of type SparseVector is not JSON serializable
Here is a link to the notebook: https://colab.research.google.com/drive/1NLibPkZgOE_w7dVTerAF4nUQiPXW_o06?usp=sharing
The error mentions SparseVector, so I tried to convert the column to a DenseVector, but none of the commands I tried worked.
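The last line of the traceback comes from the JSON encoder: the figure data contains objects it does not know how to serialize. As a rough sketch of that failure and one general way around it, the code below uses a made-up StandInVector class in place of a real pyspark vector (so it runs without Spark) and converts each vector to a plain list before serializing. The class and the to_plain_lists helper are assumptions for illustration only.

```python
import json


class StandInVector:
    """Hypothetical stand-in for a vector type the json module cannot encode."""

    def __init__(self, values):
        self.values = list(values)

    def toArray(self):
        # Returns a plain list here; this is a simplification for the sketch.
        return list(self.values)


def to_plain_lists(vectors):
    """Convert every vector into a plain Python list of floats."""
    return [list(vector.toArray()) for vector in vectors]


rows = [StandInVector([1.0, 0.0]), StandInVector([0.0, 3.5])]

try:
    json.dumps(rows)  # fails: the encoder does not know StandInVector
except TypeError as exc:
    print("serialization failed:", exc)

print(json.dumps(to_plain_lists(rows)))  # [[1.0, 0.0], [0.0, 3.5]]
```

pyspark's vector classes do provide a toArray() method, so the same idea may carry over to the Independent_Features column. Even then, a scatter trace needs one number per point on the x axis, so a single feature (or a row index) would still have to be chosen from each vector.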

custom function for scatterplot in bokeh not running

The dataset I am working with:
https://www.kaggle.com/code/gauravsahani/housing-in-london-for-beginners/data
I am using Bokeh and have produced the following scatter plot:
source=ColumnDataSource(data=dict(df,av=df.average_price,cr=df.no_of_crimes,ar=df.area))
p=figure(sizing_mode='stretch_width',toolbar_location=None,height=500,
x_axis_label='Average Salary',y_axis_label='crime rate')
p.xaxis.formatter = BasicTickFormatter(use_scientific=False)
p.add_layout(Legend(), 'right')
p.scatter(x='av',y='cr',source=source,size=9,alpha=0.4,legend_field='area',fill_color=factor_cmap('area',palette=magma(34),factors=df.area.unique()))
p.xgrid.grid_line_color=None
p.legend.label_text_font_size='12px'
p.legend.padding=4
p.legend.orientation='vertical'
p.legend.spacing=-7
p.add_tools(HoverTool(tooltips=[('Area','@ar')]))
show(p)
I am trying to produce a custom function for reusability which is the following:
source=ColumnDataSource(data=dict(df,av=df.average_price,
                                  cr=df.no_of_crimes,
                                  ar=df.area))
tooltips=[('Area','@ar')]
def scatter(source,x,y,xlabel=None,ylabel=None,size=None,alpha=None,legend_field=None,fill_color=None):
    p=figure(sizing_mode='stretch_width',
             toolbar_location=None,
             height=500,
             x_axis_label=xlabel,
             y_axis_label=ylabel)
    p.xgrid.grid_line_color=None
    p.xaxis.formatter = BasicTickFormatter(use_scientific=False)
    p.legend.label_text_font_size='12px'
    p.legend.padding=4
    p.legend.orientation='vertical'
    p.legend.spacing=-7
    p.scatter(source=source,x=x,y=y,size=size,alpha=alpha,legend_field=legend_field,fill_color=fill_color)
    return p
p=scatter(source,x='av',y='cr')
show(p)
However, I keep getting an error, and I can't figure out why, since both my x and y columns are numeric and the same code runs successfully outside a function. The error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [234], in <cell line: 24>()
20 p.scatter(source=source,x=x,y=y,size=size,alpha=alpha,legend_field=legend_field,fill_color=fill_color)
22 return p
---> 24 p=scatter(source,x='av',y='cr')
26 show(p)
Input In [234], in scatter(source, x, y, xlabel, ylabel, size, alpha, legend_field, fill_color)
17 p.legend.orientation='vertical'
18 p.legend.spacing=-7
---> 20 p.scatter(source=source,x=x,y=y,size=size,alpha=alpha,legend_field=legend_field,fill_color=fill_color)
22 return p
File ~\Anaconda3\lib\site-packages\bokeh\plotting\glyph_api.py:962, in GlyphAPI.scatter(self, *args, **kwargs)
960 return self.circle(*args, **kwargs)
961 else:
--> 962 return self._scatter(*args, marker=marker_type, **kwargs)
File ~\Anaconda3\lib\site-packages\bokeh\plotting\_decorators.py:86, in glyph_method.<locals>.decorator.<locals>.wrapped(self, *args, **kwargs)
84 if self.coordinates is not None:
85 kwargs.setdefault("coordinates", self.coordinates)
---> 86 return create_renderer(glyphclass, self.plot, **kwargs)
File ~\Anaconda3\lib\site-packages\bokeh\plotting\_renderer.py:116, in create_renderer(glyphclass, plot, **kwargs)
113 # handle the mute glyph, we always set one
114 muted_visuals = pop_visuals(glyphclass, kwargs, prefix='muted_', defaults=glyph_visuals, override_defaults={'alpha':0.2})
--> 116 glyph = make_glyph(glyphclass, kwargs, glyph_visuals)
117 nonselection_glyph = make_glyph(glyphclass, kwargs, nonselection_visuals)
118 selection_glyph = make_glyph(glyphclass, kwargs, selection_visuals)
File ~\Anaconda3\lib\site-packages\bokeh\plotting\_renderer.py:145, in make_glyph(glyphclass, kws, extra)
143 kws = kws.copy()
144 kws.update(extra)
--> 145 return glyphclass(**kws)
File ~\Anaconda3\lib\site-packages\bokeh\model\model.py:128, in Model.__init__(self, **kwargs)
121 def __init__(self, **kwargs: Any) -> None:
122
123 # "id" is popped from **kw in __new__, so in an ideal world I don't
124 # think it should be here too. But Python has subtle behavior here, so
125 # it is necessary
126 kwargs.pop("id", None)
--> 128 super().__init__(**kwargs)
129 default_theme.apply_to_model(self)
File ~\Anaconda3\lib\site-packages\bokeh\core\has_props.py:206, in HasProps.__init__(self, **properties)
203 self._unstable_themed_values = {}
205 for name, value in properties.items():
--> 206 setattr(self, name, value)
208 self._initialized = True
File ~\Anaconda3\lib\site-packages\bokeh\core\has_props.py:230, in HasProps.__setattr__(self, name, value)
228 properties = self.properties(_with_props=True)
229 if name in properties:
--> 230 return super().__setattr__(name, value)
232 descriptor = getattr(self.__class__, name, None)
233 if isinstance(descriptor, property): # Python property
File ~\Anaconda3\lib\site-packages\bokeh\core\property\descriptors.py:283, in PropertyDescriptor.__set__(self, obj, value, setter)
280 class_name = obj.__class__.__name__
281 raise RuntimeError(f"{class_name}.{self.name} is a readonly property")
--> 283 value = self.property.prepare_value(obj, self.name, value)
284 old = self._get(obj)
285 self._set(obj, old, value, setter=setter)
File ~\Anaconda3\lib\site-packages\bokeh\core\property\dataspec.py:515, in SizeSpec.prepare_value(self, cls, name, value)
513 except TypeError:
514 pass
--> 515 return super().prepare_value(cls, name, value)
File ~\Anaconda3\lib\site-packages\bokeh\core\property\bases.py:365, in Property.prepare_value(self, owner, name, value, hint)
363 else:
364 obj_repr = owner if isinstance(owner, HasProps) else owner.__name__
--> 365 raise ValueError(f"failed to validate {obj_repr}.{name}: {error}")
367 if isinstance(owner, HasProps):
368 obj = owner
ValueError: failed to validate Scatter(id='40569', ...).size: expected an element of either String, Dict(Enum('expr', 'field', 'value', 'transform'), Either(String, Instance(Transform), Instance(Expression), Float)) or Float, got None
Any help would be great.

def scatter(source, x, y, xlabel=None, ylabel=None, size=5, alpha=1, legend_field=None, fill_color=None):
    # size and alpha get numeric defaults: the error above shows that
    # size=None fails validation.
    p = figure(sizing_mode='stretch_width',
               toolbar_location=None,
               height=500,
               x_axis_label=xlabel,
               y_axis_label=ylabel)
    p.xgrid.grid_line_color = None
    p.xaxis.formatter = BasicTickFormatter(use_scientific=False)
    p.legend.label_text_font_size = '12px'
    p.legend.padding = 4
    p.legend.orientation = 'vertical'
    p.legend.spacing = -7
    if legend_field is None:
        # Leave legend_field out of the call when no legend is wanted.
        p.scatter(source=source, x=x, y=y, size=size, alpha=alpha, fill_color=fill_color)
    else:
        p.scatter(source=source, x=x, y=y, size=size, alpha=alpha, legend_field=legend_field, fill_color=fill_color)
    return p
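The function above handles one optional argument with an if/else. With several optional arguments, a more general pattern is to drop every keyword whose value is None before making the call. The sketch below shows that pattern in plain Python. Both drop_none and fake_glyph are hypothetical names for illustration, and fake_glyph merely stands in for a plotting call such as p.scatter.

```python
def drop_none(**kwargs):
    """Return only the keyword arguments whose value is not None."""
    return {name: value for name, value in kwargs.items() if value is not None}


def fake_glyph(**kwargs):
    """Hypothetical stand-in for a plotting call; reports what it received."""
    return sorted(kwargs)


# Only the options that were actually set reach the plotting call.
options = drop_none(size=9, alpha=None, legend_field=None, fill_color="red")
print(options)                # {'size': 9, 'fill_color': 'red'}
print(fake_glyph(**options))  # ['fill_color', 'size']
```

Inside the wrapper this would look like p.scatter(source=source, x=x, y=y, **drop_none(size=size, alpha=alpha, legend_field=legend_field)). Apply the filter selectively: where None is itself a meaningful value for a property (a colour, for example), dropping it changes what the call does.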

seaborn lmplot logistic raises AttributeError: module 'pandas' has no attribute 'Panel'

I am using the code below that I took from the Seaborn documentation as it is. Running this code results in an error.
AttributeError: module 'pandas' has no attribute 'Panel'
I am wondering if there is a way around this problem without reverting to a previous version of Pandas. Can anyone help?
tips = sns.load_dataset("tips")
tips["big_tip"] = (tips.tip / tips.total_bill) > .15
sns.lmplot(x="total_bill", y="big_tip", data=tips,
logistic=True, y_jitter=.03);
The version info as well as the complete error message are as follows:
pandas : 1.3.5
seaborn: '0.11.2'
--------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-4-2a96c34ef86c> in <module>
2 tips["big_tip"] = (tips.tip / tips.total_bill) > .15
3 sns.lmplot(x="total_bill", y="big_tip", data=tips,
----> 4 logistic=True, y_jitter=.03);
~/anaconda3/lib/python3.7/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in lmplot(x, y, data, hue, col, row, palette, col_wrap, height, aspect, markers, sharex, sharey, hue_order, col_order, row_order, legend, legend_out, x_estimator, x_bins, x_ci, scatter, fit_reg, ci, n_boot, units, seed, order, logistic, lowess, robust, logx, x_partial, y_partial, truncate, x_jitter, y_jitter, scatter_kws, line_kws, facet_kws, size)
643 scatter_kws=scatter_kws, line_kws=line_kws,
644 )
--> 645 facets.map_dataframe(regplot, x=x, y=y, **regplot_kws)
646 facets.set_axis_labels(x, y)
647
~/anaconda3/lib/python3.7/site-packages/seaborn/axisgrid.py in map_dataframe(self, func, *args, **kwargs)
775
776 # Draw the plot
--> 777 self._facet_plot(func, ax, args, kwargs)
778
779 # For axis labels, prefer to use positional args for backcompat
~/anaconda3/lib/python3.7/site-packages/seaborn/axisgrid.py in _facet_plot(self, func, ax, plot_args, plot_kwargs)
804 plot_args = []
805 plot_kwargs["ax"] = ax
--> 806 func(*plot_args, **plot_kwargs)
807
808 # Sort out the supporting information
~/anaconda3/lib/python3.7/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in regplot(x, y, data, x_estimator, x_bins, x_ci, scatter, fit_reg, ci, n_boot, units, seed, order, logistic, lowess, robust, logx, x_partial, y_partial, truncate, dropna, x_jitter, y_jitter, label, color, marker, scatter_kws, line_kws, ax)
861 scatter_kws["marker"] = marker
862 line_kws = {} if line_kws is None else copy.copy(line_kws)
--> 863 plotter.plot(ax, scatter_kws, line_kws)
864 return ax
865
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in plot(self, ax, scatter_kws, line_kws)
368
369 if self.fit_reg:
--> 370 self.lineplot(ax, line_kws)
371
372 # Label the axes
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in lineplot(self, ax, kws)
411 """Draw the model."""
412 # Fit the regression model
--> 413 grid, yhat, err_bands = self.fit_regression(ax)
414 edges = grid[0], grid[-1]
415
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in fit_regression(self, ax, x_range, grid)
209 from statsmodels.genmod.families import Binomial
210 yhat, yhat_boots = self.fit_statsmodels(grid, GLM,
--> 211 family=Binomial())
212 elif self.lowess:
213 ci = None
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in fit_statsmodels(self, grid, model, **kwargs)
279 return yhat
280
--> 281 yhat = reg_func(X, y)
282 if self.ci is None:
283 return yhat, None
~/anaconda3/lib/python3.7/site-packages/seaborn/regression.py in reg_func(_x, _y)
273 def reg_func(_x, _y):
274 try:
--> 275 yhat = model(_y, _x, **kwargs).fit().predict(grid)
276 except glm.PerfectSeparationError:
277 yhat = np.empty(len(grid))
~/anaconda3/lib/python3.7/site-packages/statsmodels/genmod/generalized_linear_model.py in __init__(self, endog, exog, family, offset, exposure, freq_weights, var_weights, missing, **kwargs)
289 offset=offset, exposure=exposure,
290 freq_weights=freq_weights,
--> 291 var_weights=var_weights, **kwargs)
292 self._check_inputs(family, self.offset, self.exposure, self.endog,
293 self.freq_weights, self.var_weights)
~/anaconda3/lib/python3.7/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
214
215 def __init__(self, endog, exog=None, **kwargs):
--> 216 super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
217 self.initialize()
218
~/anaconda3/lib/python3.7/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
66 hasconst = kwargs.pop('hasconst', None)
67 self.data = self._handle_data(endog, exog, missing, hasconst,
---> 68 **kwargs)
69 self.k_constant = self.data.k_constant
70 self.exog = self.data.exog
~/anaconda3/lib/python3.7/site-packages/statsmodels/base/model.py in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
89
90 def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
---> 91 data = handle_data(endog, exog, missing, hasconst, **kwargs)
92 # kwargs arrays could have changed, easier to just attach here
93 for key in kwargs:
~/anaconda3/lib/python3.7/site-packages/statsmodels/base/data.py in handle_data(endog, exog, missing, hasconst, **kwargs)
631 exog = np.asarray(exog)
632
--> 633 klass = handle_data_class_factory(endog, exog)
634 return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
635 **kwargs)
~/anaconda3/lib/python3.7/site-packages/statsmodels/base/data.py in handle_data_class_factory(endog, exog)
611 if data_util._is_using_ndarray_type(endog, exog):
612 klass = ModelData
--> 613 elif data_util._is_using_pandas(endog, exog):
614 klass = PandasData
615 elif data_util._is_using_patsy(endog, exog):
~/anaconda3/lib/python3.7/site-packages/statsmodels/tools/data.py in _is_using_pandas(endog, exog)
99
100 def _is_using_pandas(endog, exog):
--> 101 from statsmodels.compat.pandas import data_klasses as klasses
102 return (isinstance(endog, klasses) or isinstance(exog, klasses))
103
~/anaconda3/lib/python3.7/site-packages/statsmodels/compat/pandas.py in <module>
21 except ImportError:
22 from pandas.tseries import frequencies
---> 23 data_klasses = (pandas.Series, pandas.DataFrame, pandas.Panel)
24 else:
25 try:
~/anaconda3/lib/python3.7/site-packages/pandas/__init__.py in __getattr__(name)
242 return _SparseArray
243
--> 244 raise AttributeError(f"module 'pandas' has no attribute '{name}'")
245
246
AttributeError: module 'pandas' has no attribute 'Panel'

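You are using a recent version of pandas; Panel was removed in pandas 0.25.0. The last frames of the traceback show that the reference to pandas.Panel comes from statsmodels (statsmodels/compat/pandas.py), not from seaborn or your own code, so the installed statsmodels is probably older than that pandas release. Upgrading statsmodels should resolve the error without downgrading pandas.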
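Because the traceback passes through seaborn, statsmodels, and pandas, it can help to confirm which versions are actually installed in the environment that runs the notebook. A minimal sketch, assuming Python 3.8 or newer (where importlib.metadata is in the standard library); installed_version is a helper name invented for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages that appear in the traceback.
for name in ("pandas", "statsmodels", "seaborn"):
    print(name, installed_version(name))
```

On Python 3.7, as in the traceback, the module is not available in the standard library; printing each package's __version__ attribute gives the same information.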

TypeError: unhashable type: 'numpy.ndarray' when trying to plot a DataFrame

I am new to Python and trying to create a plot graph from a DataFrame.
I used the following piece of code:
predictions = list()
for the in range(10):
    predicted = StartARIMAForecasting(RegistrationRates, 5, 1, 0)
    predictions.append(predicted)
    RegistrationRates.append(predicted)
data = {'Year':['2016','2017','2018','2019','2020','2021','2022','2023','2024','2025'], 'Registration Rate':predictions}
resultdf = pd.DataFrame(data)
print(resultdf)
plt.xlabel('Year')
plt.ylabel('Registration Rate')
plt.plot(resultdf)
Following output is seen:
0 2016 [50.68501406476124]
1 2017 [52.41297372600995]
2 2018 [54.0703599343735]
3 2019 [53.58327982434545]
4 2020 [55.647237533704754]
5 2021 [54.398197822219714]
6 2022 [55.06459335430334]
7 2023 [56.00171430250292]
8 2024 [55.70449088032122]
9 2025 [57.7127557392168]
but blank graph is plotted with following error:
TypeError: unhashable type: 'numpy.ndarray'
Full stack-trace is provided below:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-53-1d843e3f6a23> in <module>
58 plt.xlabel('Year')
59 plt.ylabel('Registration Rate')
---> 60 plt.plot(resultdf)
61
62 root.mainloop()
~\Anaconda3\lib\site-packages\matplotlib\pyplot.py in plot(*args, **kwargs)
3356 mplDeprecation)
3357 try:
-> 3358 ret = ax.plot(*args, **kwargs)
3359 finally:
3360 ax._hold = washold
~\Anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
1853 "the Matplotlib list!)" % (label_namer, func.__name__),
1854 RuntimeWarning, stacklevel=2)
-> 1855 return func(ax, *args, **kwargs)
1856
1857 inner.__doc__ = _add_data_doc(inner.__doc__,
~\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in plot(self, *args, **kwargs)
1525 kwargs = cbook.normalize_kwargs(kwargs, _alias_map)
1526
-> 1527 for line in self._get_lines(*args, **kwargs):
1528 self.add_line(line)
1529 lines.append(line)
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _grab_next_args(self, *args, **kwargs)
404 this += args[0],
405 args = args[1:]
--> 406 for seg in self._plot_args(this, kwargs):
407 yield seg
408
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _plot_args(self, tup, kwargs)
381 x, y = index_of(tup[-1])
382
--> 383 x, y = self._xy_from_xy(x, y)
384
385 if self.command == 'plot':
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _xy_from_xy(self, x, y)
214 if self.axes.xaxis is not None and self.axes.yaxis is not None:
215 bx = self.axes.xaxis.update_units(x)
--> 216 by = self.axes.yaxis.update_units(y)
217
218 if self.command != 'plot':
~\Anaconda3\lib\site-packages\matplotlib\axis.py in update_units(self, data)
1467 neednew = self.converter != converter
1468 self.converter = converter
-> 1469 default = self.converter.default_units(data, self)
1470 if default is not None and self.units is None:
1471 self.set_units(default)
~\Anaconda3\lib\site-packages\matplotlib\category.py in default_units(data, axis)
113 # default_units->axis_info->convert
114 if axis.units is None:
--> 115 axis.set_units(UnitData(data))
116 else:
117 axis.units.update(data)
~\Anaconda3\lib\site-packages\matplotlib\category.py in __init__(self, data)
180 self._counter = itertools.count(start=0)
181 if data is not None:
--> 182 self.update(data)
183
184 def update(self, data):
~\Anaconda3\lib\site-packages\matplotlib\category.py in update(self, data)
197 data = np.atleast_1d(np.array(data, dtype=object))
198
--> 199 for val in OrderedDict.fromkeys(data):
200 if not isinstance(val, VALID_TYPES):
201 raise TypeError("{val!r} is not a string".format(val=val))
TypeError: unhashable type: 'numpy.ndarray'

If you check the type of the values in your 'Registration Rate' column, you will see that each one is a numpy.ndarray, as the error message says:
type(resultdf['Registration Rate'][0])
So change how predictions is built so that each entry is a single number:
predictions = list()
for the in range(10):
    predicted = StartARIMAForecasting(RegistrationRates, 5, 1, 0)
    # predicted is a numpy.ndarray, len = 1
    p = predicted[0]
    predictions.append(p)
Then run your code again.
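The traceback ends in matplotlib's category module at OrderedDict.fromkeys(data), a step that has to hash every value it is given. The sketch below reproduces that step in plain Python, with one-element lists standing in for the one-element numpy arrays (both are unhashable). try_as_categories is a made-up helper for illustration, not a matplotlib function.

```python
def try_as_categories(values):
    """Mimic the hashing step: return the error text, or None on success."""
    try:
        dict.fromkeys(values)  # hashes each value, like OrderedDict.fromkeys
    except TypeError as exc:
        return str(exc)
    return None


# Each forecast wrapped in a one-element container, as in the printed output.
wrapped = [[50.68], [52.41], [54.07]]
print(try_as_categories(wrapped))    # an "unhashable type" message

# Taking element 0 of each leaves plain floats, which hash without trouble.
unwrapped = [item[0] for item in wrapped]
print(try_as_categories(unwrapped))  # None
print(unwrapped)                     # [50.68, 52.41, 54.07]
```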

pandas Series' object has no attribute 'find'

I am trying to make a simple plot of some data and am getting the following error. Any help is very much appreciated.
AttributeError: 'Series' object has no attribute 'find'
Versions :
python3 ,
matplotlib (2.0.2) ,
pandas (0.20.3) ,
jupyter (1.0.0).
Code:
import pandas as pd
import matplotlib.pyplot as plt
pd_hr_data = pd.read_csv("/Users/pc/Downloads/HR_comma_sep.csv")
#print(pd_hr_data['average_montly_hours'],pd_hr_data['sales'])
take_ten_data = pd_hr_data[0:19]
x = take_ten_data['average_montly_hours'].astype(int)
y = take_ten_data['sales'].astype(str)
print(type(x[0]))
print(type(y[0]))
#print(x,y) ---- this gives me all the 20 values
#print(type(y[0]))
plt.plot(x,y)
plt.show()
Output / Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in ()
9 #print(type(y[0]))
10
---> 11 plt.plot(x,y)
12 plt.show()
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/matplotlib/pyplot.py
in plot(*args, **kwargs)
3315 mplDeprecation)
3316 try:
-> 3317 ret = ax.plot(*args, **kwargs)
3318 finally:
3319 ax._hold = washold
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/matplotlib/__init__.py
in inner(ax, *args, **kwargs)
1896 warnings.warn(msg % (label_namer, func.__name__),
1897 RuntimeWarning, stacklevel=2)
-> 1898 return func(ax, *args, **kwargs)
1899 pre_doc = inner.__doc__
1900 if pre_doc is None:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/matplotlib/axes/_axes.py
in plot(self, *args, **kwargs)
1404 kwargs = cbook.normalize_kwargs(kwargs, _alias_map)
1405
-> 1406 for line in self._get_lines(*args, **kwargs):
1407 self.add_line(line)
1408 lines.append(line)
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/matplotlib/axes/_base.py
in _grab_next_args(self, *args, **kwargs)
405 return
406 if len(remaining) <= 3:
--> 407 for seg in self._plot_args(remaining, kwargs):
408 yield seg
409 return
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/matplotlib/axes/_base.py
in _plot_args(self, tup, kwargs)
355 ret = []
356 if len(tup) > 1 and is_string_like(tup[-1]):
--> 357 linestyle, marker, color = _process_plot_format(tup[-1])
358 tup = tup[:-1]
359 elif len(tup) == 3:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/matplotlib/axes/_base.py
in _process_plot_format(fmt)
92 # handle the multi char special cases and strip them from the
93 # string
---> 94 if fmt.find('--') >= 0:
95 linestyle = '--'
96 fmt = fmt.replace('--', '')
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/generic.py
in __getattr__(self, name)
3079 if name in self._info_axis:
3080 return self[name]
-> 3081 return object.__getattribute__(self, name)
3082
3083 def __setattr__(self, name, value):
AttributeError: 'Series' object has no attribute 'find'

I think you can use DataFrame.plot and specify x and y by column name, because it has better support for plotting non-numeric values:
take_ten_data = pd_hr_data[0:19]
x = take_ten_data['average_montly_hours'].astype(int)
y = take_ten_data['sales'].astype(str)
take_ten_data.plot(x='average_montly_hours', y='sales')
#working without x,y also, but less readable
#take_ten_data.plot('average_montly_hours','sales')
plt.show()
Sample:
take_ten_data = pd.DataFrame({'average_montly_hours':[3,10,12], 'sales':[10,20,30]})
x = take_ten_data['average_montly_hours'].astype(int)
y = take_ten_data['sales'].astype(str)
take_ten_data.plot(x='average_montly_hours', y='sales')
plt.show()
If all values are numeric, plt.plot works fine:
take_ten_data = pd.DataFrame({'average_montly_hours':[3,10,12], 'sales':['10','20','30']})
x = take_ten_data['average_montly_hours'].astype(int)
#convert to int if necessary
y = take_ten_data['sales'].astype(int)
plt.plot(x,y)
plt.show()

The following worked for me; I hope it helps. The issue was mixing different data types when plotting.
import pandas as pd
import matplotlib.pyplot as plt
pd_hr_data = pd.read_csv("/Users/pc/Downloads/HR_comma_sep.csv")
take_ten_data = pd_hr_data[0:4]
y = take_ten_data['average_montly_hours'].astype(int)
x = list(range(1, len(y) + 1))  # bar positions, generated from the number of rows
names = take_ten_data['sales']
plt.bar(x,y, align='center')
#plt.plot(x,y) ---- use this if you want
plt.xticks(x, names)
plt.show()
