Is statsmodels exponential smoothing working correctly? - python

I am performing a time series analysis using statsmodels and the exponential smoothing method. I am trying to reproduce the results from
https://www.statsmodels.org/devel/examples/notebooks/generated/exponential_smoothing.html
with a particular dataframe (with the same format as the example, but only one outcome).
Here are the lines of code:
import matplotlib.pyplot as plt
from statsmodels.tsa.api import ExponentialSmoothing, SimpleExpSmoothing, Holt

fig, ax = plt.subplots()  # axes object used by the plotting calls below
fit = ExponentialSmoothing(dataframe, seasonal_periods=4, trend='add', seasonal='mul', initialization_method="estimated").fit()
simulations = fit.simulate(5, repetitions=100, error='mul')
fit.fittedvalues.plot(ax=ax, style='--', color='green')
simulations.plot(ax=ax, style='-', alpha=0.05, color='grey', legend=False)
fit.forecast(8).rename('Holt-Winters (add-mul-seasonal)').plot(ax=ax, style='--', marker='o', color='green', legend=True)
However, when I run it, I get the error
TypeError: __init__() got an unexpected keyword argument 'initialization_method'
but when I check the parameters of ExponentialSmoothing in statsmodels, initialization_method is one of them, so I don't know what is happening there.
Moving on, I removed initialization_method from the parameters of ExponentialSmoothing, and then I get another error on the line below it:
AttributeError: 'ExponentialSmoothing' object has no attribute 'simulate'
Again, I checked whether simulate had been deprecated in the latest version of statsmodels, and no, it is still an attribute.
I upgraded statsmodels, I upgraded pip, and I still get the same errors.
What is going on there?
Thanks in advance for any help!

Indeed, there was a bug in the previous version that was corrected in statsmodels 0.12.0. One only needs to update to statsmodels 0.12.0 and this issue is solved.
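A quick way to confirm the upgrade took effect (a minimal sketch; the upgrade command is the standard pip invocation, nothing specific to this question):
# in a shell: pip install --upgrade statsmodels
import statsmodels
print(statsmodels.__version__)  # should print 0.12.0 or later; initialization_method and simulate() were added in 0.12.0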

Related

Issue with SHAP dependency plots: TypeError: loop of ufunc does not support argument 0 of type Explanation which has no callable conjugate method

I am attempting to use the SHAP Python library on an XGBoost model I built for binary classification on tabular data.
While attempting to generate dependency plots in particular, I get the following error.
TypeError: loop of ufunc does not support argument 0 of type Explanation which has no callable conjugate method
Here is the full traceback:
https://pastebin.com/u3imDXir
My code for the SHAP analysis seems fairly basic and I am able to generate beeswarm plots with no trouble.
import shap
import matplotlib.pyplot as plt

explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_train)
max_display = 20
shap.summary_plot(shap_values, X_train, max_display=max_display, color_bar=False, show=False)
plt.gcf().set_size_inches(5, 7)
plt.colorbar(label='feature value', ticks=[], fraction=0.03, pad=0.04)
plt.title("SHAP beeswarm summary plot")  # plt.title is a function; assigning to it would shadow it
plt.savefig('../reports/figures/01_successbutdialysisDurReversed/all_Data_plotted/summary_plot_limFeatures_' + str(max_display) + '.png', dpi=600, bbox_inches='tight')
It is when I attempt to make the dependence_plot that I start to get the errors.
shap.dependence_plot("iv_total", shap_values, X_train)
plt.show()
I have thus far attempted to wipe my virtualenv and install the latest anaconda distributions for major packages, on Python 3.10. If it is of relevance, I am using an M1 Mac. I would appreciate any help with this!
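One hedged workaround, assuming the error comes from dependence_plot predating the Explanation API (NumPy ends up calling conjugate on the Explanation object itself), is to pass the underlying array instead:
# sketch, not a confirmed fix: dependence_plot expects a plain array of SHAP values
shap.dependence_plot("iv_total", shap_values.values, X_train)
# or the newer plotting API, which accepts Explanation objects directly
shap.plots.scatter(shap_values[:, "iv_total"])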

Unexpected behavior of pyplot in the seaborn library. Bug?

I'm trying to understand the pointplot function (Link to pointplot doc) to plot error bars.
Setting the 'errorbar' argument to 'sd' should plot the standard deviation along with the mean. But calculating the standard deviation manually results in a different value.
I used the example provided in the documentation:
import seaborn as sns
df = sns.load_dataset("penguins")
ax = sns.pointplot(data=df, x="island", y="body_mass_g", errorbar="sd")
data = ax.lines[1].get_ydata()
print(data[1] - data[0]) # prints 248.57843137254895
sd = df[df['island'] == 'Torgersen']['body_mass_g'].std()
print(sd) # prints 445.10794020256765
I expected both printed values to be the same, since both data[1] - data[0] and sd should equal the standard deviation of the variable 'body_mass_g' for the category 'Torgersen'. The other standard deviations drawn by sns.pointplot are also not as expected.
I must be missing something obvious here but for the life of me I can't figure it out.
Appreciate any help. I tested the code locally and in google colab with the same results.
My PC had an outdated version of seaborn (0.11.2), where the argument 'errorbar' was named 'ci'. Using the correct argument resolves the problem. Strangely, Google Colab also uses version 0.11.2, contrary to their claim that they auto-update their packages.
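A version-aware sketch, assuming the rename to errorbar happened in seaborn 0.12 and that ci='sd' is the older spelling for requesting the standard deviation:
import seaborn as sns

df = sns.load_dataset("penguins")
# crude parse of the first two version components, e.g. "0.11.2" -> (0, 11)
major_minor = tuple(int(p) for p in sns.__version__.split(".")[:2])
if major_minor >= (0, 12):
    ax = sns.pointplot(data=df, x="island", y="body_mass_g", errorbar="sd")
else:
    ax = sns.pointplot(data=df, x="island", y="body_mass_g", ci="sd")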

solve_ivp Error: "missing 2 required positional arguments:"

The function that I am using for solve_ivp is defined as:
def ydot(t,y,kappa4,kappa16):
Upon using solve_ivp as below:
sol=solve_ivp(ydot,[0,10],initial_condition(),args=(50,100))
I get the following error:
ydot() missing 2 required positional arguments: 'kappa4' and 'kappa16'
I am not able to debug the code even though I have defined the function ydot the way the scipy documentation for solve_ivp specifies (https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html).
There's even an example at the end of the documentation that demonstrates passing arguments implemented the same way as I have done.
I believe the problem is somewhere in the two pieces of code I have provided above, taken from an otherwise long program.
I was able to replicate the error with scipy 1.1.0. Upgrading scipy to the latest version via cmd (pip install scipy==1.4.1) solved that error message for me.
Then the minimal reproducible example gave another error:
TypeError: ydot() argument after * must be an iterable, not int
which was solved by the solution given by Tejas. The full working minimal script is hence:
from scipy.integrate import solve_ivp

def ydot(t, y, a):
    return -a * y

sol = solve_ivp(ydot, [0, 10], [5], args=(8,))
print(sol.y)
I had faced the same issue recently, but the great Warren Weckesser helped me out. Change
args=(10)
to
args=(10,)
and your MWE will work fine.
This happens because of how tuples with a single element are written: (10) is just the integer 10 in parentheses, while the trailing comma in (10,) makes it a one-element tuple, which is what the args parameter expects.
For reference, see pg 30 of the Python tutorial PDF (Release 3.5.1) on python.org.
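The distinction is quick to verify in a REPL:
# parentheses alone do not create a tuple; the trailing comma does
print(type((10)))   # <class 'int'>
print(type((10,)))  # <class 'tuple'>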

Python ARIMA predictions returning NaNs

I am trying a simple prediction using ARIMA. The code below produces all NaNs as output for an order argument of (1,1,3), but for order arguments of (1,1,2) and (1,1,4) I am able to get proper (numerical) output. The same function works fine in some other installations with older/newer versions of pandas, statsmodels and pmdarima. I checked related questions here on Stack Overflow, but since the same function with the same argument works under other library versions, I assume there is nothing wrong with the order argument of (1,1,3), and the bug is probably with library versions or some other configuration. Any help is appreciated.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_model import ARIMA

def testarima():
    trainseries = pd.Series([600.00, 10.00, 405.00, 900.00, 500.00, 500.00, 500.00, 500.00, 500.00,
                             500.00, 1000.00, 533.00, 2784.11, 1775.00, 300.00, 4289.42, 1270.00,
                             500.00, 2145.00, 1650.00, 1750.00, 785.00, 4137.50, 2450.00, 2194.00,
                             1750.00, 1000.00, 2250.00, 1000.00, 1055.98, 1000.00, 250.00, 450.00,
                             540.00, 2247.50, 200.00, 820.00, 570.00, 555.00])
    model = ARIMA(trainseries, order=(1, 1, 3))
    # print("train: " + str(train))
    try:
        model_fit = model.fit(disp=0)
        fc, se, conf = model_fit.forecast(24, alpha=0.05)
        print('result: ' + str(fc))
        return fc
    except:
        return np.zeros(24)
statsmodels v0.10.2
pmdarima v1.5.1
pandas v0.25.3
Python 3.7.5
There is a warning output
C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\tsa\kalmanf\kalmanfilter.py:225: RuntimeWarning: invalid value encountered in log
Z_mat.astype(complex), R_mat, T_mat)
C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\tsa\kalmanf\kalmanfilter.py:225: RuntimeWarning: invalid value encountered in true_divide
Z_mat.astype(complex), R_mat, T_mat)
C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py:492: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available
'available', HessianInversionWarning)
But in the other installations where this works, these warnings also appear, yet there I get proper numerical output.
Use pandas version 0.22.0.
If it's still not resolved, avoid statsmodels and just import ARIMA; if you are using TensorFlow in Colab, note that this does not support the TensorFlow 2 version.
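As a side note, statsmodels.tsa.arima_model.ARIMA has since been deprecated and removed; a sketch of the same fit with the replacement class (assuming statsmodels >= 0.12) would look like this:
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA  # note: arima.model, not arima_model

# first values of trainseries from the question, just to make the sketch self-contained
series = pd.Series([600.00, 10.00, 405.00, 900.00, 500.00, 500.00, 500.00, 500.00,
                    500.00, 500.00, 1000.00, 533.00, 2784.11, 1775.00, 300.00, 4289.42])
model = ARIMA(series, order=(1, 1, 3))
result = model.fit()          # the new API takes no disp argument
print(result.forecast(24))    # 24-step-ahead point forecasts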

skimage - TypeError: peak_local_max() got an unexpected keyword argument 'num_peaks_per_label'

The following code gives me the error present in the title:
from skimage.feature import peak_local_max

local_maxi = peak_local_max(imd, labels=iml,
                            indices=False, num_peaks_per_label=2)
Where imd is a "distance transformed image" which was obtained with :
from scipy import ndimage
imd = ndimage.distance_transform_edt(im)
im is the input binary image that I would like to later segment with the watershed function of scikit-image. But to use this function properly, I first need to find the markers which will serve as the starting flooding points: that's what I'm trying to do with the peak_local_max function.
Also, iml is the labeled version of im, that I got with :
from skimage.measure import label
iml = label(im)
I don't know what I've been doing wrong. Also, I've noticed that the function seems to totally ignore its num_peaks argument. For instance, when I do:
local_maxi = peak_local_max(imd, labels=iml,
                            indices=True, num_peaks=1)
I always get the same number of peaks detected as when I set num_peaks=500 or num_peaks=np.inf. What am I missing here, please?
As @a_guest pointed out, my version of skimage didn't match the version of the documentation I was referring to. The num_peaks_per_label argument is currently only available in the v0.13dev version. Updating my version to the dev version also fixed my problem with the num_peaks argument.
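A generic way to check which keyword arguments your installed version actually accepts before calling it (plain introspection, nothing skimage-specific):
import inspect
import skimage
from skimage.feature import peak_local_max

print(skimage.__version__)                 # installed version
print(inspect.signature(peak_local_max))   # parameters this version supports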
