Multivariate Normal pdf in Scipy - python

Trying to evaluate scipy's multivariate_normal.pdf function, but keep getting errors. MWE:
import numpy as np
from scipy.stats import multivariate_normal as mvnorm
x = np.random.rand(5)
mvnorm.pdf(x)
gives
TypeError: pdf() takes at least 4 arguments (2 given)
The docs say both the mean and cov arguments are optional, and that the last axis of x labels the components. Since x.shape = (5,), it seems like all is kosher. I am expecting a single number as output.

It looks like these parameters aren't optional.
If I pass the default values for mean and cov like:
import numpy as np
from scipy.stats import multivariate_normal as mvnorm
x = np.random.rand(5)
mvnorm.pdf(x, mean=0, cov=1)
I get the following output:
array([ 0.35082878, 0.27012396, 0.26986049, 0.39887847, 0.36116341])
While using:
import numpy as np
from scipy.stats import multivariate_normal as mvnorm
x = np.random.rand(5)
mvnorm.pdf(x)
gives me the same error:
TypeError: pdf() takes at least 4 arguments (2 given)
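Note that with mean=0 and cov=1 the output above has five entries, i.e. x was treated as five one-dimensional points rather than one five-dimensional point. A minimal sketch of how to get the single number the question expects, assuming a standard normal in five dimensions is what is wanted (the zero mean vector and identity covariance are assumptions, not something stated in the question):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvnorm

x = np.random.rand(5)

# Spell out a 5-dimensional mean and covariance so that x is
# interpreted as ONE point with five components.
density = mvnorm.pdf(x, mean=np.zeros(5), cov=np.eye(5))

# Cross-check against the closed-form standard normal density.
expected = (2 * np.pi) ** (-5 / 2) * np.exp(-0.5 * np.dot(x, x))
print(density, expected)
```

Passing the parameters explicitly also sidesteps the question of whether they are optional in the installed SciPy.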

Related

How to calculate improper integral in python?

How can I calculate the value of this integral:
f_tu(t) is given as numpy.array. The graph looks like this:
How can I implement this?
Everything I could find looks something like this
from math import sin
from scipy.integrate import quad
def f(x):
    return 1/sin(x)
I = quad(f, 0, 1)
but I have an array there, not a specific function like sin.
How about auc from sklearn.metrics?
import numpy as np
from scipy.integrate import quad
from sklearn.metrics import auc
x = np.arange(0, 100, 0.001)
y = np.sin(x)
print('auc:', auc(x,y))
print('quad:', quad(np.sin, 0, 100))
auc: 0.13818791291277366
quad: (0.1376811277123232, 9.459751315610276e-09)
Okay, so you have one of those pesky infinity integrals. Here is how I would deal with it:
import numpy as np
from scipy.integrate import quad
def f(x):
    # put your function to integrate here
    # (1/x**2 is only a placeholder; its integral from 0 diverges)
    return 1/(x**2)
print(quad(f, 0, np.inf))  # integrates from 0 to infinity
This returns two values. The first is the estimated value of the integral, and the second is the approximate absolute error of the integral which is useful to know.
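As a small sketch of the two return values, here is the same call on an integrand whose improper integral converges. The choice of exp(-x) is an assumption made for illustration; its integral from 0 to infinity is exactly 1.

```python
import numpy as np
from scipy.integrate import quad

def f(x):
    # Illustrative integrand, not the one from the question.
    return np.exp(-x)

# quad returns (estimate, estimated absolute error).
value, abs_error = quad(f, 0, np.inf)
print(value, abs_error)
```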
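If you want to integrate a numpy array of samples, here is a simple solution using the trapezoidal rule (pass the sample points as the second argument if they are not spaced 1 apart):
import numpy as np
print(np.trapz(f_tu, t))  # f_tu: your array of samples, t: the matching sample points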
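For sampled data, a runnable sketch might look like the following. The arrays t and f_tu are made up here (samples of exp(-t) on [0, 20]) and stand in for the data from the question. The upper limit is truncated where the samples end, so this only approximates the improper integral when the tail is negligible. scipy.integrate.trapezoid is used as the trapezoidal-rule routine.

```python
import numpy as np
from scipy.integrate import trapezoid

# Hypothetical sampled data: t are the sample points, f_tu the values.
t = np.linspace(0.0, 20.0, 2001)
f_tu = np.exp(-t)

# Trapezoidal rule over the sampled range; passing x=t accounts for
# the actual spacing of the samples.
area = trapezoid(f_tu, x=t)
print(area)
```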

python - What produces the same plot as autocorrelation_plot()?

I need the values of the autocorrelation coefficients coming from the autocorrelation_plot(). The problem is that the output coming from this function is not accessible, so I need another function to get such values. That's why I used acf() from statsmodels but it didn't get the same plot as autocorrelation_plot() does. Here is my code:
from statsmodels.tsa.stattools import acf
from pandas.plotting import autocorrelation_plot
import matplotlib.pyplot as plt
import numpy as np
y = np.sin(np.arange(1,6*np.pi,0.1))
plt.plot(acf(y))
plt.show()
So the result is not the same as this:
autocorrelation_plot(y)
plt.show()
This seems to be related to the nlags parameter of acf:
nlags: int, optional
Number of lags to return autocorrelation for.
I don't know exactly what this does, but in the source of acf there is a slice that shortens the array:
avf = acovf(x, unbiased=unbiased, demean=True, fft=fft, missing=missing)
acf = avf[:nlags + 1] / avf[0]
If you use statsmodels.tsa.stattools.acovf directly the result is the same as with autocorrelation_plot:
avf = acovf(x, unbiased=unbiased, demean=True, fft=fft, missing=missing)
So you can call it like
plt.plot(acf(y, nlags=len(y)))
to make it work.
An explanation of lag: https://math.stackexchange.com/questions/2548314/what-is-lag-in-a-time-series/2548350
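If only the coefficient values are needed, they can also be computed directly. The sketch below assumes that the plot shows the usual biased sample autocorrelation (each lag's covariance divided by n and normalised by the lag-0 value). The helper name autocorr_coefficients is hypothetical.

```python
import numpy as np

def autocorr_coefficients(series):
    """Biased sample autocorrelation for lags 0 .. n-1."""
    data = np.asarray(series, dtype=float)
    n = data.size
    centered = data - data.mean()
    c0 = np.sum(centered ** 2) / n  # lag-0 autocovariance
    return np.array([
        np.sum(centered[: n - h] * centered[h:]) / n / c0
        for h in range(n)
    ])

y = np.sin(np.arange(1, 6 * np.pi, 0.1))
coeffs = autocorr_coefficients(y)
print(coeffs[:5])
```

The loop is quadratic in the series length, which is fine for short series like this one.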

Runtime Return array size Error Python

I am trying to solve a simple differential equation using the odeint function. It gives an error about mismatched array sizes. I think my initial_condi does not match the equation function, but I can't figure out where the error actually is. Below are the error and the code. Any help would be greatly appreciated.
RuntimeError: The size of the array returned by func (1) does not match the size of y0 (3)
from scipy import *
from scipy.integrate import odeint
from operator import itemgetter
import matplotlib as plt
from matplotlib.ticker import FormatStrFormatter
from pylab import *
from itertools import product
import itertools
from numpy import zeros_like
import operator
initial_condi = [1, 1, 1]
t_range = arange(0.0,60.0,1.0)
def equation(w, t):
    T,I,V = w
    dT= V*I*10.24-T*1.64
    return dT
result_init = odeint(equation, initial_condi, t_range)
plt.plot(t, result_init[:, 0])
plt.show()
As your state vector has 3 components, the return value of the ODE function also needs to have 3 components, the derivatives of T,I,V. You only provided dT, but should return [dT, dI, dV ].
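A minimal sketch of the corrected function follows. The question does not give equations for I and V, so dI and dV are set to zero here purely as placeholders, which keeps I and V constant. Replace them with the real right-hand sides.

```python
import numpy as np
from scipy.integrate import odeint

def equation(w, t):
    T, I, V = w
    dT = V * I * 10.24 - T * 1.64
    dI = 0.0  # placeholder: the equation for I is not given in the question
    dV = 0.0  # placeholder: the equation for V is not given in the question
    # One derivative per component of the state vector.
    return [dT, dI, dV]

initial_condi = [1.0, 1.0, 1.0]
t_range = np.arange(0.0, 60.0, 1.0)

result_init = odeint(equation, initial_condi, t_range)
print(result_init.shape)  # one row per time point, one column per state
```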

Python fmin(find minimum) for a vector function

I would like to find the minimum of 3dvar function defined as:
J(x)=(x-x_b)B^{-1}(x-x_b)^T + (y-H(x)) R^{-1} (y-H(x))^T (latex code)
with B,H,R,x_b,y given.
I would like to find the argmin(J(x)). However it seems fmin in python does not work. (the function J works correctly)
Here is my code:
import numpy as np
from scipy.optimize import fmin
import math
def dvar_3(x):
    B=np.eye(5)
    H=np.ones((3,5))
    R=np.eye(3)
    xb=np.ones(5)
    Y=np.ones(3)
    Y.shape=(Y.size,1)
    xb.shape=(xb.size,1)
    value=np.dot(np.dot(np.transpose(x-xb),(np.linalg.inv(B))),(x-xb)) +np.dot(np.dot(np.transpose(Y-np.dot(H,x)),(np.linalg.inv(R))),(Y-np.dot(H,x)))
    return value[0][0]
ini=np.ones(5) #
ini.shape=(ini.size,1) #change initial to vertical vector
fmin(dvar_3,ini) #start at initial vector
I receive this error:
ValueError: operands could not be broadcast together with shapes (5,5) (3,3)
How can I solve this problem? Thank you in advance.
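Reshape the argument x inside the function dvar_3. fmin() works with a one-dimensional array: it flattens the initial guess and passes a 1-D x to the function, so x has to be turned back into a column vector there.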
import numpy as np
from scipy.optimize import fmin
import math
def dvar_3(x):
    x = x[:, None]
    B=np.eye(5)
    H=np.ones((3,5))
    R=np.eye(3)
    xb=np.ones(5)
    Y=np.ones(3)
    Y.shape=(Y.size,1)
    xb.shape=(xb.size,1)
    value=np.dot(np.dot(np.transpose(x-xb),(np.linalg.inv(B))),(x-xb)) +np.dot(np.dot(np.transpose(Y-np.dot(H,x)),(np.linalg.inv(R))),(Y-np.dot(H,x)))
    return value[0][0]
ini=np.ones(5) #
fmin(dvar_3,ini) #start at initial vector
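An alternative sketch keeps every vector one-dimensional, so no reshaping is needed, and checks the optimiser against the closed-form minimiser. This assumes H is linear and B, R are symmetric, as in the example. In that case setting the gradient of J to zero gives (B^{-1} + H^T R^{-1} H) x = B^{-1} x_b + H^T R^{-1} y. The use of scipy.optimize.minimize instead of fmin is a substitution made here, not part of the original answer.

```python
import numpy as np
from scipy.optimize import minimize

B = np.eye(5)
H = np.ones((3, 5))
R = np.eye(3)
xb = np.ones(5)
y = np.ones(3)
B_inv = np.linalg.inv(B)
R_inv = np.linalg.inv(R)

def dvar_3(x):
    # x, xb and y are all 1-D, so @ yields a plain scalar.
    dx = x - xb
    dy = y - H @ x
    return dx @ B_inv @ dx + dy @ R_inv @ dy

res = minimize(dvar_3, np.ones(5))

# Closed-form minimiser of the quadratic cost, for comparison.
x_closed = np.linalg.solve(B_inv + H.T @ R_inv @ H,
                           B_inv @ xb + H.T @ R_inv @ y)
print(res.x, x_closed)
```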

why 1D scipy.interpolate.griddata using method=nearest produces nans?

I am running scipy.interpolate.griddata on a set of coordinates that could be of many dimensions (even 1). When the coordinates are 1D the nearest method produces nans instead of the closest values when outside boundaries. An example:
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
target_points = [1.,2.,3.,4.,5.,6.,7.]
points = np.random.rand(50)*2*np.pi
values = np.sin(points)
interp = griddata(points, values, target_points, method='nearest')
plt.plot(points,values,'o')
plt.plot(target_points,interp,'ro')
print(interp)
plt.show()
The last value printed is a NaN. Am I doing something wrong? If this is a limitation of scipy do you have a smart workaround?
Note that linear/cubic modes are expected to give NaNs, but this should not be the case for the 'nearest' mode.
When the data is 1-dimensional, griddata defers to interpolate.interp1d:
if ndim == 1 and method in ('nearest', 'linear', 'cubic'):
    from .interpolate import interp1d
    points = points.ravel()
    ...
    ip = interp1d(points, values, kind=method, axis=0, bounds_error=False,
                  fill_value=fill_value)
    return ip(xi)
So even with method='nearest', griddata will not extrapolate: interp1d, called with bounds_error=False, returns fill_value (NaN by default) for any point outside the range of the data.
However, there are other tools, such as scipy.cluster.vq (vector quantization), which you could use to find the nearest value. For example,
import numpy as np
import scipy.cluster.vq as vq
import matplotlib.pyplot as plt
target_points = np.array([1.,2.,3.,4.,5.,6.,7.])
points = (np.random.rand(50)*2*np.pi)
values = np.sin(points)
code, dist = vq.vq(target_points, points)
interp = values[code]
plt.plot(points,values,'o')
plt.plot(target_points,interp,'ro')
print(interp)
plt.show()
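For one-dimensional data, the same nearest-value lookup can also be sketched with plain NumPy broadcasting, with no extra SciPy module. The fixed points array below is made up so that the result is reproducible. It builds a full distance matrix, so it suits small inputs.

```python
import numpy as np

target_points = np.array([1., 2., 3., 4., 5., 6., 7.])
points = np.array([0.5, 2.0, 6.5])  # illustrative sample locations
values = np.sin(points)

# Distance from every target (rows) to every sample point (columns),
# then the column index of the closest sample for each target.
nearest_idx = np.abs(target_points[:, None] - points[None, :]).argmin(axis=1)
interp = values[nearest_idx]
print(interp)
```

Targets beyond the last sample (here 7.0) simply take the value of the closest sample instead of NaN.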
This looks like a bug in scipy.interpolate.griddata, because the behaviour contradicts the documentation, which clearly states that the input argument "fill_value" has no effect when method is "nearest".
The output of the following line:
scipy.interpolate.griddata(points=np.array([1,2]), values=np.array([10,20]), xi=3, method='nearest', fill_value=-1)
is array(-1.0) which proves that the fill_value has an impact on the output contrary to what is stated in the documentation.
