Python ARIMA predictions returning NaNs

I am trying a simple prediction using ARIMA. The code below produces all NaNs as output for order=(1, 1, 3), but for order=(1, 1, 2) and order=(1, 1, 4) I get proper (numerical) output. The same function works fine in some other installations with older or newer versions of pandas, statsmodels and pmdarima. I checked related questions here on Stack Overflow, but since the same function with the same argument works with other library versions, I assume there is nothing wrong with the order argument of (1, 1, 3) and that the problem lies with the library versions or some other configuration. Any help is appreciated.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_model import ARIMA

def testarima():
    trainseries = pd.Series([600.00,10.00,405.00,900.00,500.00,500.00,500.00,500.00,500.00,
        500.00,1000.00,533.00,2784.11,1775.00,300.00,4289.42,1270.00,
        500.00,2145.00,1650.00,1750.00,785.00,4137.50,2450.00,2194.00,
        1750.00,1000.00,2250.00,1000.00,1055.98,1000.00,250.00,450.00,
        540.00,2247.50,200.00,820.00,570.00,555.00])
    model = ARIMA(trainseries, order=(1, 1, 3))
    # print("train: " + str(train))
    try:
        model_fit = model.fit(disp=0)
        # forecast() returns point forecasts, standard errors and confidence intervals
        fc, se, conf = model_fit.forecast(24, alpha=0.05)
        print('result: '+str(fc))
        return fc
    except Exception:
        # fall back to an all-zero forecast if fitting or forecasting raises
        return np.zeros(24)
statsmodels v 0.10.2
pmdarima v 1.5.1
pandas 0.25.3
python 3.7.5
The following warnings are printed:
C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\tsa\kalmanf\kalmanfilter.py:225: RuntimeWarning: invalid value encountered in log
Z_mat.astype(complex), R_mat, T_mat)
C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\tsa\kalmanf\kalmanfilter.py:225: RuntimeWarning: invalid value encountered in true_divide
Z_mat.astype(complex), R_mat, T_mat)
C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py:492: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available
'available', HessianInversionWarning)
These warnings also appear in the other installations where the code works, but there I get proper numerical output.

Use pandas version 0.22.0.
If that does not resolve it, avoid importing the whole of statsmodels and import only ARIMA. If you are using TensorFlow in Colab, note that this does not work with TensorFlow 2.
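Independently of the library versions, the failure can be detected instead of being returned silently. The sketch below is a minimal illustration and not part of statsmodels: `first_finite_forecast` and the `fit_and_forecast` callable are hypothetical names, and the callable is assumed to wrap something like `ARIMA(trainseries, order=order).fit(disp=0).forecast(24)[0]`. It tries each candidate order in turn and returns the first forecast that contains only finite numbers.

```python
import math


def first_finite_forecast(fit_and_forecast, orders):
    """Return (order, forecast) for the first order whose forecast is all finite.

    `fit_and_forecast` is a hypothetical callable: it takes an (p, d, q) order
    and returns a sequence of forecast values. Returns (None, []) if no order works.
    """
    for order in orders:
        try:
            forecast = list(fit_and_forecast(order))
        except Exception:
            # A fit that raises is treated the same as an unusable forecast.
            continue
        if forecast and all(math.isfinite(value) for value in forecast):
            return order, forecast
    return None, []


def demo_fit_and_forecast(order):
    """Stand-in for a real model: pretends that order (1, 1, 3) yields NaNs."""
    if order == (1, 1, 3):
        return [float("nan")] * 3
    return [500.0, 510.0, 520.0]


chosen_order, forecast = first_finite_forecast(
    demo_fit_and_forecast, [(1, 1, 3), (1, 1, 2), (1, 1, 4)]
)
print(chosen_order, forecast)
```

This does not explain why (1, 1, 3) fails in one installation, but it avoids passing an all-NaN array downstream while the versions are being compared.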

Related

Issue with SHAP dependency plots: TypeError: loop of ufunc does not support argument 0 of type Explanation which has no callable conjugate method

I am attempting to use the SHAP Python library on an XGBoost model I built for binary classification on tabular data.
While attempting to generate dependency plots in particular, I get the following error.
TypeError: loop of ufunc does not support argument 0 of type Explanation which has no callable conjugate method
Here is the full traceback:
https://pastebin.com/u3imDXir
My code for the SHAP analysis seems fairly basic and I am able to generate beeswarm plots with no trouble.
import matplotlib.pyplot as plt
import shap
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_train)
max_display = 20
shap.summary_plot(shap_values, X_train, max_display=max_display, color_bar=False, show=False)
plt.gcf().set_size_inches(5, 7)
plt.colorbar(label='feature value', ticks=[], fraction=0.03, pad=0.04)
plt.title("SHAP beeswarm summary plot")  # plt.title is a function; assigning to it would overwrite it
plt.savefig(('../reports/figures/01_successbutdialysisDurReversed/all_Data_plotted/summary_plot_limFeatures_' + str(max_display) + '.png'), dpi=600, bbox_inches='tight')
It is when I attempt to make the dependence_plot that I start to get the errors.
shap.dependence_plot("iv_total", shap_values, X_train)
plt.show()
I have thus far attempted to wipe my virtualenv and install the latest anaconda distributions for major packages, on Python 3.10. If it is of relevance, I am using an M1 Mac. I would appreciate any help with this!
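One likely cause, offered here as an assumption rather than a confirmed diagnosis: `shap.dependence_plot` is the older plotting interface and expects a plain array of SHAP values, while `explainer(X_train)` returns an `Explanation` object whose raw array is stored in its `.values` attribute. The helper below unwraps that array before plotting. `as_shap_matrix` is a hypothetical name, and `FakeExplanation` is only a stand-in so the snippet runs without shap installed.

```python
def as_shap_matrix(shap_values):
    """Return the raw SHAP array from an Explanation-like object.

    Objects without a `.values` attribute (for example a plain list or a
    NumPy array) are returned unchanged.
    """
    return getattr(shap_values, "values", shap_values)


class FakeExplanation:
    """Stand-in for shap.Explanation, used only to demonstrate the helper."""

    def __init__(self, values):
        self.values = values


raw = [[0.1, -0.2], [0.3, 0.0]]
unwrapped = as_shap_matrix(FakeExplanation(raw))  # extracts the inner array
unchanged = as_shap_matrix(raw)                   # already a plain array-like

# Assumed usage with the objects from the question:
# shap.dependence_plot("iv_total", as_shap_matrix(shap_values), X_train)
```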

Is statsmodel/exponential smoothing working correctly?

I am performing a time series analysis using statsmodels and the exponential smoothing method. I am trying to reproduce the results from
https://www.statsmodels.org/devel/examples/notebooks/generated/exponential_smoothing.html
with a particular dataframe (with the same format as the example, but only one outcome).
Here are the lines of code:
import matplotlib.pyplot as plt
from statsmodels.tsa.api import ExponentialSmoothing, SimpleExpSmoothing, Holt

# Assumed setup, not shown in the original excerpt: one shared axes for all plots
fig, ax = plt.subplots()
fit = ExponentialSmoothing(dataframe, seasonal_periods=4, trend='add', seasonal='mul', initialization_method="estimated").fit()
simulations = fit.simulate(5, repetitions=100, error='mul')
fit.fittedvalues.plot(ax=ax, style='--', color='green')
simulations.plot(ax=ax, style='-', alpha=0.05, color='grey', legend=False)
fit.forecast(8).rename('Holt-Winters (add-mul-seasonal)').plot(ax=ax, style='--', marker='o', color='green', legend=True)
However, when I run it, I get the error
TypeError: __init__() got an unexpected keyword argument 'initialization_method'
but when I check the parameters of ExponentialSmoothing in statsmodels, initialization_method is one of them, so I don't know what is happening there.
Moving forward, I removed initialization_method from the parameters of ExponentialSmoothing in the code, and then I get another error on the next line:
AttributeError: 'ExponentialSmoothing' object has no attribute 'simulate'
Again, I checked whether simulate is deprecated in the latest version of statsmodels, and no, it is still an attribute.
I upgraded statsmodels and pip, and I still get the same errors.
What is going on there?
Thanks in advance for any help!
Indeed, there was a bug in the previous version that was corrected in the new version of statsmodels. One only needs to update to statsmodels 0.12.0 and the issue is solved.
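When an upgrade appears to have no effect, it can help to confirm which statsmodels version the running interpreter actually sees, since pip may have upgraded a different environment. The sketch below assumes Python 3.8+ (for `importlib.metadata`). `version_tuple`, `meets_minimum` and `installed_version` are hypothetical helper names, and the version parsing is deliberately simplistic.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn a version string such as '0.12.0' or '0.12.0rc1' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="0.12.0"):
    """True if `installed` is at least `minimum`, comparing numeric components only."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '0.12' compares equal to '0.12.0'
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package="statsmodels"):
    """Return the version of `package` seen by this interpreter, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("statsmodels")
if found is None:
    print("statsmodels is not installed in this environment")
else:
    print("statsmodels", found, "meets 0.12.0:", meets_minimum(found))
```

Running this inside the same notebook or script that raises the error shows whether the upgrade reached that environment.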

'tensorflow' has no attribute 'to_int32'

I am trying to implement CTC loss to audio files but I get the following error:
TensorFlow has no attribute 'to_int32'
I'm running tf.version 2.0.0.
I think the problem is with the version I'm currently using, since the error is thrown inside the package's own 'tensorflow_backend.py' code.
I have imported the packages as "tensorflow.keras.class_name", with the backend as K.
You can cast the tensor in TensorFlow 2 as follows:
tf.cast(my_tensor, tf.int32)
You can read the documentation of the method in https://www.tensorflow.org/api_docs/python/tf/cast
You can also see that the to_int32 is deprecated and was used in TensorFlow 1
https://www.tensorflow.org/api_docs/python/tf/compat/v1/to_int32
After the import, just write
tf.to_int32 = lambda x: tf.cast(x, tf.int32)
This restores the behavior of tf.to_int32 everywhere in the code, so you don't have to manually edit TF 1.x code.

fbprophet predict() method scalar values error

I'm trying to follow the basic tutorial for fbprophet and am getting an error that doesn't really make sense on the Prophet.predict() method. My code follows the tutorial exactly:
import pandas as pd
import numpy as np
from fbprophet import Prophet
df = pd.read_csv("example_wp_peyton_manning.csv")
df['y'] = np.log(df['y'])
m = Prophet()
m.fit(df)
future = m.make_future_dataframe(periods = 365)
forecast = m.predict(future)
On the predict call, I get
ValueError: If using all scalar values, you must pass an index
I've seen this before when trying to use DataFrame constructors improperly, but this seems to be happening under the hood in the fbprophet code, which is strange because the passed dataframe comes from the package's own make_future_dataframe method. Has anyone else experienced this/know a work-around?
For context, I'm using Python 3.6.0, with Visual C++ 14.0, Numpy 1.13.1, Pandas 0.21.0, pystan 2.17.0.0 and fbprophet 0.2
There doesn't seem to be a tag for fbprophet and I don't have the reputation to make one
I got some other error but it works after adding:
...
m = Prophet()
m.daily_seasonality=True
...
Maybe you should try python 2.

R Arima works but Python statsmodels SARIMAX throws invertibility error

I am comparing SARIMAX fitting results between R (3.3.1) forecast package (7.3) and Python's (3.5.2) statsmodels (0.8).
The R-code is:
library(forecast)
data("AirPassengers")
Arima(AirPassengers, order=c(2,1,1), seasonal=list(order=c(0,1,0),
period=12))$aic
[1] 1017.848
The Python code is:
from statsmodels.tsa.statespace import sarimax
import pandas as pd
AirlinePassengers = pd.Series([112,118,132,129,121,135,148,148,136,119,104,118,115,126,
141,135,125,149,170,170,158,133,114,140,145,150,178,163,
172,178,199,199,184,162,146,166,171,180,193,181,183,218,
230,242,209,191,172,194,196,196,236,235,229,243,264,272,
237,211,180,201,204,188,235,227,234,264,302,293,259,229,
203,229,242,233,267,269,270,315,364,347,312,274,237,278,
284,277,317,313,318,374,413,405,355,306,271,306,315,301,
356,348,355,422,465,467,404,347,305,336,340,318,362,348,
363,435,491,505,404,359,310,337,360,342,406,396,420,472,
548,559,463,407,362,405,417,391,419,461,472,535,622,606,
508,461,390,432])
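# pd.date_range replaces the DatetimeIndex(end=..., periods=...) constructor form,
# which newer pandas versions no longer accept
AirlinePassengers.index = pd.date_range(end='1960-12-31',
    periods=len(AirlinePassengers), freq='1M')
print(sarimax.SARIMAX(AirlinePassengers, order=(2,1,1),
    seasonal_order=(0,1,0,12)).fit().aic)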
Which throws an error: ValueError: Non-stationary starting autoregressive parameters found with enforce_stationarity set to True.
If I set enforce_stationarity (and enforce_invertibility, which is also required) to False, the model fit works but AIC is very poor (>1400).
Using some other model parameters for the same data, e.g., ARIMA(0,1,1)(0,0,1)[12] I can get identical results from R and Python with stationarity and invertibility checks enabled in Python.
My main question is: what explains the difference in behavior for some model parameters? Are statsmodels' invertibility checks different from those of forecast's Arima, and is one of them somehow "more correct"?
I also found a pull request related to fixing an invertibility calculation bug in statsmodels: https://github.com/statsmodels/statsmodels/pull/3506
After re-installing statsmodels with the latest source code from GitHub, I still get the same error with the code above, but with enforce_stationarity=False and enforce_invertibility=False I get an AIC of around 1010, which is lower than in the R case. However, the model parameters are also vastly different.
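For reference, "stationary autoregressive parameters" means that all roots of the AR polynomial lie outside the unit circle. The sketch below checks this for the two AR coefficients of an order-(2, d, q) model. `ar2_is_stationary` is a hypothetical helper for illustration only and is not claimed to match the check statsmodels performs internally.

```python
import cmath


def ar2_is_stationary(phi1, phi2):
    """True when all roots of 1 - phi1*z - phi2*z**2 lie outside the unit circle."""
    if phi2 == 0:
        # Degenerates to AR(1): the single root is z = 1/phi1,
        # which lies outside the unit circle exactly when |phi1| < 1.
        return abs(phi1) < 1
    # Roots of phi2*z**2 + phi1*z - 1 = 0 via the quadratic formula.
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    roots = [(-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2)]
    return all(abs(root) > 1 for root in roots)


# Illustrative coefficients, not values estimated from the AirPassengers data.
for phi1, phi2 in [(0.5, 0.3), (0.7, 0.4), (1.2, -0.3)]:
    print(phi1, phi2, ar2_is_stationary(phi1, phi2))
```

Starting values that fail a condition of this kind would be rejected when enforce_stationarity is True, even if the final fitted parameters would have been acceptable.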
