SARIMAX (Python): np.linalg.linalg.LinAlgError: LU decomposition error

I have a problem with time series analysis. I have a dataset with 5 features; the following is a subset of my input data:
date,price,year,day,totaltx
1/1/2016 0:00,434.46,2016,1,126762
1/2/2016 0:00,433.59,2016,2,147449
1/3/2016 0:00,430.36,2016,3,148661
1/4/2016 0:00,433.49,2016,4,185279
1/5/2016 0:00,432.25,2016,5,178723
1/6/2016 0:00,429.46,2016,6,184207
My endogenous data is the price column and my exogenous data is the totaltx column.
This is the code I am running, which raises an error:
import statsmodels.api as sm
import pandas as pd
import numpy as np
from numpy.linalg import LinAlgError

def arima(filteredData, coinOutput, window, horizon, trainLength):
    start_index = 0
    end_index = 0
    inputNumber = filteredData.shape[0]
    predictions = np.array([], dtype=np.float32)
    prices = np.array([], dtype=np.float32)
    # slide over the time series with a 1-day step
    while end_index < inputNumber - 1:
        end_index = start_index + trainLength
        trainFeatures = filteredData[start_index:end_index]["totaltx"]
        trainOutput = coinOutput[start_index:end_index]["price"]
        arima = sm.tsa.statespace.SARIMAX(endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0))
        arima_fit = arima.fit(disp=0)
        testdata = filteredData[end_index:end_index + 1]["totaltx"]
        total_sample = end_index - start_index
        predicted = arima_fit.predict(start=total_sample, end=total_sample, exog=np.array(testdata.values).reshape(-1, 1))
        price = coinOutput[end_index:end_index + 1]["price"].values
        predictions = np.append(predictions, predicted)
        prices = np.append(prices, price)
        start_index = start_index + 1
    return predictions, prices
def processCoins(bitcoinPrice, window, horizon):
    output = bitcoinPrice[horizon:][["date", "day", "year", "price"]]
    return output

trainLength = 100
for window in [3, 5]:
    for horizon in [1, 2, 5, 7, 10]:
        bitcoinPrice = pd.read_csv("..\\prices.csv", sep=",")
        coinOutput = processCoins(bitcoinPrice, window, horizon)
        predictions, prices = arima(bitcoinPrice, coinOutput, window, horizon, trainLength)
In this code I am using a rolling-window regression technique: I train the ARIMA on start_index:end_index and predict the test point at end_index:end_index+1.
This is the error thrown by my code:
Traceback (most recent call last):
File "C:/PycharmProjects/coinLogPrediction/src/arima.py", line 115, in <module>
predictions, prices = arima(filteredBitcoinPrice, coinOutput, window, horizon, trainLength, outputFile)
File "C:/PycharmProjects/coinLogPrediction/src/arima.py", line 64, in arima
arima_fit = arima.fit(disp=0)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 469, in fit
skip_hessian=True, **kwargs)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 466, in fit
full_output=full_output)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\optimizer.py", line 191, in _fit
hess=hessian)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\optimizer.py", line 410, in _fit_lbfgs
**extra_kwargs)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 193, in fmin_l_bfgs_b
**opts)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 328, in _minimize_lbfgsb
f, g = func_and_grad(x)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 273, in func_and_grad
f = fun(x, *args)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\optimize.py", line 292, in function_wrapper
return function(*(wrapper_args + args))
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 440, in f
return -self.loglike(params, *args) / nobs
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 646, in loglike
loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 825, in loglike
kfilter = self._filter(**kwargs)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 747, in _filter
self._initialize_state(prefix=prefix, complex_step=complex_step)
File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\representation.py", line 723, in _initialize_state
self._statespaces[prefix].initialize_stationary(complex_step)
File "_representation.pyx", line 1351, in statsmodels.tsa.statespace._representation.dStatespace.initialize_stationary
File "_tools.pyx", line 1151, in statsmodels.tsa.statespace._tools._dsolve_discrete_lyapunov
numpy.linalg.linalg.LinAlgError: LU decomposition error.

This looks like it might be a bug. In the meantime, you may be able to fix this by using a different initialization, like so:
arima = sm.tsa.statespace.SARIMAX(
    endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0),
    initialization='approximate_diffuse')
If you get a chance, please file a bug report at https://github.com/statsmodels/statsmodels/issues/new!
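Since only some rolling windows may trigger the error, one option (a sketch on my part, not from the original answer; fit_sarimax is a hypothetical helper) is to try the default stationary initialization first and retry with the approximate-diffuse initialization only when the fit raises the LinAlgError that the question already imports:
def fit_sarimax(endog, exog, window):
    # try the default (stationary) initialization first
    model = sm.tsa.statespace.SARIMAX(endog=endog, exog=exog, order=(window, 0, 0))
    try:
        return model.fit(disp=0)
    except LinAlgError:
        # fall back to the initialization suggested above
        model = sm.tsa.statespace.SARIMAX(endog=endog, exog=exog,
                                          order=(window, 0, 0),
                                          initialization='approximate_diffuse')
        return model.fit(disp=0)

# inside the while loop:
# arima_fit = fit_sarimax(trainOutput.values, trainFeatures.values, window)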

I had the same error.
Erroneous code:
mod = sm.tsa.SARIMAX(y, order=(0, 1, 0), seasonal_order=(1, 0, 0, 12))
res = mod.fit()
This gave me the error:
LinAlgError: Schur decomposition solver error
I was able to solve it by passing the argument enforce_stationarity=False:
mod = sm.tsa.SARIMAX(y, order=(0, 1, 0), seasonal_order=(1, 0, 0, 12), enforce_stationarity=False)
res = mod.fit()

LU decomposition error in statsmodels ARIMA model

I know there is a very similar question and answer on stackoverflow (here), but this seems to be distinctly different. I am using statsmodels v 0.13.2, and I am using an ARIMA model as opposed to a SARIMAX model.
I am trying to fit a list of time series data sets with an ARIMA model. The offending piece of my code is here:
import math
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

items = np.log(og_items)
items['count'] = items['count'].apply(lambda x: 0 if math.isnan(x) or math.isinf(x) else x)
model = ARIMA(items, order=(14, 0, 7))
trained = model.fit()
items is a dataframe containing a date index and a single column, count.
I apply the lambda on the second line because some counts can be 0, which becomes negative infinity after the log is applied. The final product going into the ARIMA contains no NaNs and no infinite numbers. However, when I try this without the log transform, I do not get the error. The error only occurs on certain series, and there seems to be no rhyme or reason to which are affected: one series had about half of its values as zero after applying the lambda, while another did not have a single zero. Here is the error:
Traceback (most recent call last):
File "item_pipeline.py", line 267, in <module>
main()
File "item_pipeline.py", line 234, in main
restaurant_predictions = make_predictions(restaurant_data=restaurant_data, models=models,
File "item_pipeline.py", line 138, in make_predictions
predictions = model(*data_tuple[:2], min_date=min_date, max_date=max_date,
File "/Users/rob/Projects/5out-ml/models/item_level/items/predict_arima.py", line 127, in predict_daily_arima
predict_date_arima(prediction_dict, item_dict, prediction_date, x_days_out=x_days_out, log_vals=log_vals,
File "/Users/rob/Projects/5out-ml/models/item_level/items/predict_arima.py", line 51, in predict_date_arima
raise e
File "/Users/rob/Projects/5out-ml/models/item_level/items/predict_arima.py", line 47, in predict_date_arima
fitted = model.fit()
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/arima/model.py", line 390, in fit
res = super().fit(
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 704, in fit
mlefit = super(MLEModel, self).fit(start_params, method=method,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/model.py", line 563, in fit
xopt, retvals, optim_settings = optimizer._fit(f, score, start_params,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/optimizer.py", line 241, in _fit
xopt, retvals = func(objective, gradient, start_params, fargs, kwargs,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/optimizer.py", line 651, in _fit_lbfgs
retvals = optimize.fmin_l_bfgs_b(func, start_params, maxiter=maxiter,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_lbfgsb_py.py", line 199, in fmin_l_bfgs_b
res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_lbfgsb_py.py", line 362, in _minimize_lbfgsb
f, g = func_and_grad(x)
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 286, in fun_and_grad
self._update_grad()
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 256, in _update_grad
self._update_grad_impl()
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 173, in update_grad
self.g = approx_derivative(fun_wrapped, self.x, f0=self.f,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py", line 505, in approx_derivative
return _dense_difference(fun_wrapped, x0, f0, h,
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py", line 576, in _dense_difference
df = fun(x) - f0
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py", line 456, in fun_wrapped
f = np.atleast_1d(fun(x, *args, **kwargs))
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 137, in fun_wrapped
fx = fun(np.copy(x), *args)
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/model.py", line 531, in f
return -self.loglike(params, *args) / nobs
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 939, in loglike
loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/kalman_filter.py", line 983, in loglike
kfilter = self._filter(**kwargs)
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/kalman_filter.py", line 903, in _filter
self._initialize_state(prefix=prefix, complex_step=complex_step)
File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/representation.py", line 983, in _initialize_state
self._statespaces[prefix].initialize(self.initialization,
File "statsmodels/tsa/statespace/_representation.pyx", line 1362, in statsmodels.tsa.statespace._representation.dStatespace.initialize
File "statsmodels/tsa/statespace/_initialization.pyx", line 288, in statsmodels.tsa.statespace._initialization.dInitialization.initialize
File "statsmodels/tsa/statespace/_initialization.pyx", line 406, in statsmodels.tsa.statespace._initialization.dInitialization.initialize_stationary_stationary_cov
File "statsmodels/tsa/statespace/_tools.pyx", line 1206, in statsmodels.tsa.statespace._tools._dsolve_discrete_lyapunov
numpy.linalg.LinAlgError: LU decomposition error.
The solution in the other stackoverflow post was to initialize the statespace differently. It does look like the statespace is involved, judging by the last few lines of the error. However, that workflow does not seem to be exposed in the newer version of statsmodels. Is it? If not, what else can I try to circumvent this error?
So far, I have tried manually initializing the model to approximate diffuse, and manually setting the initialize property to approximate diffuse. Neither seems to be valid in the new statsmodels code.
Turns out there's a new way to initialize. The second line below is the operative line.
model = ARIMA(items, order=(14, 0, 7))
model.initialize_approximate_diffuse() # this line
trained = model.fit()
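If you only want the diffuse initialization for the series that actually fail, a small sketch (my assumption, mirroring the SARIMAX workaround in the earlier question) wraps the default fit in a try/except:
from numpy.linalg import LinAlgError

try:
    trained = ARIMA(items, order=(14, 0, 7)).fit()
except LinAlgError:
    # retry only the offending series with the diffuse initialization
    model = ARIMA(items, order=(14, 0, 7))
    model.initialize_approximate_diffuse()
    trained = model.fit()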

Sklearn: only size-1 arrays can be converted to Python scalars

When I tried to use word vectors built from Chinese text as features for sklearn, an error occurred.
The shapes of x_train and word_vector are (747,) and (1, 100), and the latter's dtype is float64.
I guessed the data types might differ, but I traversed all the data and everything looked fine.
Here is the code:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
import SZ_function as sz
import gensim
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

def remove_stop_words(text):
    stop_words = sz.get_step_words('notebook/HIT.txt')
    text = text.split()
    word_list = ''
    for word in text:
        if word not in stop_words:
            word_list += word
            word_list += ' '
    return word_list

def pre_process(path):
    data = pd.read_excel(path)
    data['text'] = data['text'].apply(sz.remove_number_en)
    data['text'] = data['text'].apply(sz.cut_words)
    data['text'] = data['text'].apply(remove_stop_words)
    data = data.replace(to_replace='', value='None')
    data = data.replace(to_replace='None', value=np.nan).dropna()
    return data

def create_corpus(data):
    text = data['text']
    return [sentences.split() for sentences in text]

def word_vec(corpus):
    model = gensim.models.word2vec.Word2Vec(corpus)
    return model

def get_sent_vec(sent, model, size):
    vec = np.zeros(size).reshape((1, size))
    count = 0
    for word in sent[1:]:
        try:
            vec += model.wv[word].reshape((1, size))
            count += 1
        except:
            continue
    if count != 0:
        vec /= count
    return vec

if __name__ == '__main__':
    data = pre_process('datasets_demo.xlsx')
    corpus = create_corpus(data)
    model = word_vec(corpus)
    data['text'] = data['text'].apply(get_sent_vec, model=model, size=100)
    x_train, y_train, x_test, y_test = train_test_split(data['text'], data['label'])
    estimator = MultinomialNB()
    estimator.fit(x_train, y_train)
Here is the full traceback:
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\12996\AppData\Local\Temp\jieba.cache
Loading model cost 0.628 seconds.
Prefix dict has been built successfully.
TypeError: only size-1 arrays can be converted to Python scalars
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\IPython\core\interactiveshell.py", line 3457, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-8366eff678ac>", line 1, in <module>
runfile('C:/Users/12996/Desktop/Tensorflow_/datasets_demo.py', wdir='C:/Users/12996/Desktop/Tensorflow_')
File "E:\pycharm\PyCharm 2022.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "E:\pycharm\PyCharm 2022.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/12996/Desktop/Tensorflow_/datasets_demo.py", line 66, in <module>
estimator.fit(x_train,y_train)
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\sklearn\naive_bayes.py", line 663, in fit
X, y = self._check_X_y(X, y)
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\sklearn\naive_bayes.py", line 523, in _check_X_y
return self._validate_data(X, y, accept_sparse="csr", reset=reset)
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\sklearn\base.py", line 581, in _validate_data
X, y = check_X_y(X, y, **check_params)
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\sklearn\utils\validation.py", line 976, in check_X_y
estimator=estimator,
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\sklearn\utils\validation.py", line 746, in check_array
array = np.asarray(array, order=order, dtype=dtype)
File "E:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\pandas\core\series.py", line 857, in __array__
return np.asarray(self._values, dtype)
ValueError: setting an array element with a sequence.
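The ValueError at the bottom arises when sklearn calls np.asarray on a pandas Series whose cells are themselves (1, 100) arrays. A minimal sketch of a workaround (my assumption, not part of the original post) is to stack the sentence vectors into one 2-D matrix before splitting. Two further caveats baked into the sketch: train_test_split returns the feature splits first and the label splits second, and MultinomialNB rejects the negative values word2vec produces, so GaussianNB stands in here purely for illustration:
from sklearn.naive_bayes import GaussianNB

# stack the column of (1, 100) arrays into a single (n_samples, 100) matrix
X = np.vstack(data['text'].values)
y = data['label'].values
x_train, x_test, y_train, y_test = train_test_split(X, y)
# word2vec features can be negative, which MultinomialNB rejects
estimator = GaussianNB()
estimator.fit(x_train, y_train)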

Buffer is too small for requested array-Astropy

I'm reading TESS data, and as you might expect the files can be really large; my buffer is too small. Is there a way I can avoid this error, maybe by skipping files that are too large? Or is there a more permanent solution (not involving more memory)? My code is below, along with the full error.
from lightkurve import TessTargetPixelFile
import lightkurve as lk
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def func(tpf):
    tpf.plot(aperture_mask=tpf.pipeline_mask)
    lc = tpf.to_lightcurve()
    mask = (lc.time.value < 1464)
    masked_lc = lc[mask]
    clipped_lc = masked_lc.remove_outliers(sigma=5)
    flat_lc = clipped_lc.flatten()
    binned_lc = flat_lc.bin(binsize=5)
    periodogram = binned_lc.to_periodogram(method="bls", period=np.arange(1, 20, 0.001))
    print(periodogram.plot())
    planet_b_period = periodogram.period_at_max_power
    planet_b_t0 = periodogram.transit_time_at_max_power
    planet_b_dur = periodogram.duration_at_max_power
    ax = binned_lc.fold(period=planet_b_period, epoch_time=planet_b_t0).scatter()
    ax.set_xlim(-5, 5)
    best_fit_period = periodogram.period_at_max_power
    print('Best fit period: {:.3f}'.format(best_fit_period))
    bfc = best_fit_period
    print(bfc)
    folded_lc = binned_lc.fold(period=bfc)
    folded_lc.scatter(s=7)
    plt.show()
    planet_b_model = periodogram.get_transit_model(period=planet_b_period,
                                                   transit_time=planet_b_t0,
                                                   duration=planet_b_dur)
    ax = binned_lc.fold(planet_b_period, planet_b_t0).scatter(s=7)
    planet_b_model.fold(planet_b_period, planet_b_t0).plot(ax=ax, c='r', lw=2)
    ax.set_xlim(-5, 5)

dataset = pd.read_csv("all_targets_S001_v1.csv")
d = dataset["TICID"]
print(d)
IDs = []
for i in range(len(d)):
    IDs.append("TIC" + str(d[i]))
for i in range(len(IDs)):
    tpf_file = lk.search_targetpixelfile(IDs[i], mission="TESS", sector=5).download(quality_bitmask='default')
    try:
        func(tpf_file)
        continue
    except:
        continue
Thanks
The Full Error
WARNING: File may have been truncated: actual file length (262144) is smaller than the expected size (46402560) [astropy.io.fits.file]
Traceback (most recent call last):
File "lc.py", line 42, in <module>
tpf_file = lk.search_targetpixelfile(IDs[i], mission="TESS", sector=5).download(quality_bitmask='default')
File "C:\ProgramData\Anaconda3\lib\site-packages\lightkurve\utils.py", line 555, in wrapper
return f(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\lightkurve\search.py", line 355, in download
return self._download_one(
File "C:\ProgramData\Anaconda3\lib\site-packages\lightkurve\search.py", line 290, in _download_one
return read(path, quality_bitmask=quality_bitmask, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\lightkurve\io\read.py", line 112, in read
return getattr(__import__("lightkurve"), filetype)(path_or_url, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\lightkurve\targetpixelfile.py", line 2727, in __init__
quality_array=self.hdu[1].data["QUALITY"], bitmask=quality_bitmask
File "C:\ProgramData\Anaconda3\lib\site-packages\astropy\utils\decorators.py", line 758, in __get__
val = self.fget(obj)
File "C:\ProgramData\Anaconda3\lib\site-packages\astropy\io\fits\hdu\table.py", line 399, in data
data = self._get_tbdata()
File "C:\ProgramData\Anaconda3\lib\site-packages\astropy\io\fits\hdu\table.py", line 171, in _get_tbdata
raw_data = self._get_raw_data(self._nrows, columns.dtype,
File "C:\ProgramData\Anaconda3\lib\site-packages\astropy\io\fits\hdu\base.py", line 520, in _get_raw_data
return self._file.readarray(offset=offset, dtype=code, shape=shape)
File "C:\ProgramData\Anaconda3\lib\site-packages\astropy\io\fits\file.py", line 330, in readarray
return np.ndarray(shape=shape, dtype=dtype, offset=offset,
TypeError: buffer is too small for requested array
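Note that in the traceback the failure comes from the download/read call itself (line 42), which sits outside your try/except, so the bare except never gets a chance to skip the bad target. A minimal sketch of a workaround (my assumption, following your own idea of skipping problem files) is to move the download inside the try and catch the TypeError; since the warning says the file may have been truncated, deleting the partial file from lightkurve's download cache and retrying may also fix individual targets:
for i in range(len(IDs)):
    try:
        tpf_file = lk.search_targetpixelfile(IDs[i], mission="TESS", sector=5).download(quality_bitmask='default')
        func(tpf_file)
    except TypeError:
        # truncated FITS file ("buffer is too small"); skip this target
        print("Skipping {}: truncated or unreadable file".format(IDs[i]))
        continue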

sklearn kneighbours memory error python

I am working on Windows 7 with 8 GB of RAM.
This is the vectorizer I am using to vectorize a free-text column in my 52 MB training dataset:
vec = CountVectorizer(analyzer='word',stop_words='english',decode_error='ignore',binary=True)
I want to find the 5 nearest neighbours in this dataset for each record in an 18 MB test set.
nbrs = NearestNeighbors(n_neighbors=5).fit(vec.transform(data['clean_sum']))
vectors = vec.transform(data_test['clean_sum'])
distances,indices = nbrs.kneighbors(vectors)
This is the stack trace:
Traceback (most recent call last):
File "cr_nearness.py", line 224, in <module>
distances,indices = nbrs.kneighbors(vectors)
File "C:\Anaconda2\lib\site-packages\sklearn\neighbors\base.py", line 371, in kneighbors
n_jobs=n_jobs, squared=True)
File "C:\Anaconda2\lib\site-packages\sklearn\metrics\pairwise.py", line 12…, in pairwise_distances
return _parallel_pairwise(X, Y, func, n_jobs, **kwds)
File "C:\Anaconda2\lib\site-packages\sklearn\metrics\pairwise.py", line 10…, in _parallel_pairwise
return func(X, Y, **kwds)
File "C:\Anaconda2\lib\site-packages\sklearn\metrics\pairwise.py", line 23…, in euclidean_distances
distances = safe_sparse_dot(X, Y.T, dense_output=True)
File "C:\Anaconda2\lib\site-packages\sklearn\utils\extmath.py", line 181, in safe_sparse_dot
ret = ret.toarray()
File "C:\Anaconda2\lib\site-packages\scipy\sparse\compressed.py", line 940, in toarray
return self.tocoo(copy=False).toarray(order=order, out=out)
File "C:\Anaconda2\lib\site-packages\scipy\sparse\coo.py", line 250, in toarray
B = self._process_toarray_args(order, out)
File "C:\Anaconda2\lib\site-packages\scipy\sparse\base.py", line 817, in _process_toarray_args
return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError
Any ideas?
Use KNN with a KD-tree:
model = KNeighborsClassifier(n_neighbors=5, algorithm='kd_tree').fit(X_train, Y_train)
By default the algorithm resolves to brute force here, and brute force takes too much memory.
I think for your model it should look like this:
nbrs = NearestNeighbors(n_neighbors=5, algorithm='kd_tree').fit(vec.transform(data['clean_sum']))
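One caveat: scikit-learn's tree-based neighbour searches (kd_tree, ball_tree) do not accept the sparse matrices that CountVectorizer produces, so the suggestion above may require dense input. An alternative sketch (my own, not from the original answer) keeps the brute-force search on sparse vectors but queries the test set in batches, so only a small slice of the dense distance matrix exists at any time:
import numpy as np

X_test = vec.transform(data_test['clean_sum'])
batch = 1000  # tune to the memory you have available
all_dist, all_idx = [], []
for start in range(0, X_test.shape[0], batch):
    d, idx = nbrs.kneighbors(X_test[start:start + batch])
    all_dist.append(d)
    all_idx.append(idx)
distances = np.vstack(all_dist)
indices = np.vstack(all_idx)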

Pymc 2d gaussian fitting

I am trying to fit a predefined 2D Gaussian function to some observed data with pymc. I keep running into errors, and the last one I got was ValueError: setting an array element with a sequence. I understand what the error means, but I am not sure where in the code it occurs. My naive guess is that the random variables are being assigned to array elements somewhere. Any suggestions would be much appreciated. Here is my code so far:
import pymc as mc
import numpy as np
import pyfits as pf

arr = pf.getdata('img.fits')
x = y = np.arange(0, 71)
xx, yy = np.meshgrid(x, y)
err_map = pf.getdata('imgwht.fits')

def model((x, y), arr):
    amp = mc.Uniform('amp', lower=-1, upper=1, doc='Amplitude')
    x0 = mc.Uniform('x0', lower=21, upper=51, doc='xo')
    y0 = mc.Uniform('y0', lower=21, upper=51, doc='yo')
    sigx = mc.Uniform('sigx', lower=0.1, upper=10, doc='Sigma in X')
    sigy = mc.Uniform('sigy', lower=0.1, upper=10, doc='Sigma in Y')
    thta = mc.Uniform('theta', lower=0, upper=2*np.pi, doc='Rotation')
    os = mc.Uniform('c', lower=-1, upper=1, doc='Vertical offset')

    @mc.deterministic(plot=False, trace=False)
    def gaussian((x, y)=(xx, yy), amplitude=amp, xo=x0, yo=y0,
                 sigma_x=sigx, sigma_y=sigy, theta=thta, offset=os):
        xo = float(xo)
        yo = float(yo)
        a = (mc.cos(theta)**2)/(2*sigma_x**2) + (mc.sin(theta)**2)/(2*sigma_y**2)
        b = -(mc.sin(2*theta))/(4*sigma_x**2) + (mc.sin(2*theta))/(4*sigma_y**2)
        c = (mc.sin(theta)**2)/(2*sigma_x**2) + (mc.cos(theta)**2)/(2*sigma_y**2)
        gauss = offset + amplitude*mc.exp(-1*(a*((x-xo)**2) + 2*b*(x-xo)*(y-yo) + c*((y-yo)**2)))
        return gauss

    flux = mc.Normal('flux', mu=gaussian, tau=err_map, value=arr, observed=True, doc='Observed Flux')
    return locals()

mdl = mc.MCMC(model((xx, yy), arr))
mdl.sample(iter=1e5, burn=9e4)
Full traceback:
File "model.py", line 31, in <module>
mdl = mc.MCMC(model((xx,yy),arr))
File "model.py", line 29, in model
flux = mc.Normal('flux',mu=gaussian,tau=err_map,value=arr,observed=True,doc='Observed Flux')
File "/usr/lib64/python2.7/site-packages/pymc/distributions.py", line 318, in __init__
**arg_dict_out)
File "/usr/lib64/python2.7/site-packages/pymc/PyMCObjects.py", line 761, in __init__
verbose=verbose)
File "/usr/lib64/python2.7/site-packages/pymc/Node.py", line 219, in __init__
Node.__init__(self, doc, name, parents, cache_depth, verbose=verbose)
File "/usr/lib64/python2.7/site-packages/pymc/Node.py", line 129, in __init__
self.parents = parents
File "/usr/lib64/python2.7/site-packages/pymc/Node.py", line 152, in _set_parents
self.gen_lazy_function()
File "/usr/lib64/python2.7/site-packages/pymc/PyMCObjects.py", line 810, in gen_lazy_function
self._logp.force_compute()
File "LazyFunction.pyx", line 257, in pymc.LazyFunction.LazyFunction.force_compute (pymc/LazyFunction.c:2409)
File "/usr/lib64/python2.7/site-packages/pymc/distributions.py", line 2977, in wrapper
return f(value, **kwds)
File "/usr/lib64/python2.7/site-packages/pymc/distributions.py", line 2168, in normal_like
return flib.normal(x, mu, tau)
ValueError: setting an array element with a sequence.
I've run into an issue like this before, but never had a chance to track it down to its source. The problem line in your code is the one for the observed Stochastic:
flux = mc.Normal('flux',mu=gaussian,tau=err_map,value=arr,observed=True,doc='Observed Flux')
I know a work-around that you can use, which is to check if the mu variable is a pymc.Node, and only find the likelihood if it is not:
@mc.observed
def flux(mu=gaussian, tau=err_map, value=arr):
    if isinstance(mu, mc.Node):
        return 0
    else:
        return mc.normal_like(value, mu, tau)
I think it would be worth filing a bug report in the PyMC github issue tracker if you have time.
The @mc.deterministic decorator returns a deterministic variable. To get the value of the variable, use its value attribute:
flux = mc.Normal('flux',mu=gaussian.value,tau=err_map,value=arr,observed=True,doc='Observed Flux')
