How to use a user-defined metric in KernelDensity from scikit-learn (Python)

I'm using scikit-learn (0.14) and trying to implement a user-defined metric for my KernelDensity estimation.
The following code is an example of how my code is structured:
import numpy as np
from sklearn.neighbors import DistanceMetric, KernelDensity

def myDistance(x, y):
    return np.sqrt(sum((x - y)**2))

dt = DistanceMetric.get_metric("pyfunc", func=myDistance)
kernelModel = KernelDensity(algorithm='ball_tree', metric='pyfunc')
kernelModel.fit(X)
According to the documentation, the BallTree algorithm should accept user-defined metrics.
If I run this code as given here, I get the following error:
TypeError: __init__() takes exactly 1 positional argument (0 given)
The error seems to come from:
sklearn.neighbors.dist_metrics.PyFuncDistance.__init__
I don't understand this. If I check what 'dt' in the code above gives me, I get what I expect: dt.pairwise(X) returns the correct values.
What am I doing wrong?
Thanks in advance.

The solution is:
kernelModel = KernelDensity(..., metric='pyfunc', metric_params={"func": myDistance})
The call to DistanceMetric.get_metric is not necessary.
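Putting this together, a minimal working sketch (the random X and the score_samples call are illustration only, not from the original post):

import numpy as np
from sklearn.neighbors import KernelDensity

def myDistance(x, y):
    # plain Euclidean distance, as in the question
    return np.sqrt(sum((x - y)**2))

X = np.random.rand(100, 2)  # placeholder data
kernelModel = KernelDensity(algorithm='ball_tree', metric='pyfunc',
                            metric_params={"func": myDistance})
kernelModel.fit(X)
print(kernelModel.score_samples(X[:5]))  # log-density at the first five samples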

Related

Can't specify initialization method in Exponential smoothing

When going through the exponential smoothing tutorial provided by statsmodels, I had an issue with specifying an initialization method in an Anaconda notebook. The error returned was:
TypeError: __init__() got an unexpected keyword argument 'initialization_method'
It would reference the first line in the cell:
fit1 = Holt(air, initialization_method="estimated").fit(smoothing_level=0.8, smoothing_trend=0.2, optimized=False)
When looking for the solution, I downgraded my prompt-toolkit, but that didn't do anything. I can't find anyone else having this problem. Any ideas?
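Though no version is given in the post, this looks like the same version-mismatch pattern as the filtfilt question below (an assumption, not a confirmed diagnosis): initialization_method was added to Holt and ExponentialSmoothing around statsmodels 0.12, so an older install raises exactly this TypeError. prompt-toolkit is unrelated to statsmodels. A quick first check:

import statsmodels
print(statsmodels.__version__)  # Holt(initialization_method=...) needs a recent statsmodels (>= 0.12)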

Recurrent problem using scipy.optimize.fmin

I am encountering some problems when translating the following code from MATLAB to Python:
Matlab code snippet:
x=M_test %M_test is a 1x3 array that holds the adjustment points for the function
y=R_test %R_test is also a 1x3 array
>> M_test=[0.513,7.521,13.781]
>> R_test=[2.39,3.77,6.86]
expo3= @(b,x) b(1).*(exp(-(b(2)./x).^b(3)));
NRCF_expo3= @(b) norm(y-expo3(b,x));
B0_expo3=[fcm28;1;1];
B_expo3=fminsearch(NRCF_expo3,B0_expo3);
Data_raw.fcm_expo3=(expo3(B_expo3,Data_raw.M));
The translated (python) code:
expo3=lambda x,M_test: x[0]*(1-exp(-1*(x[1]/M_test)**x[2]))
NRCF_expo3=lambda R_test,x,M_test: np.linalg.norm(R_test-expo3,ax=1)
B_expo3=scipy.optimize.fmin(func=NRCF_expo3,x0=[fcm28,1,1],args=(x,))
For clarity, the objective function 'expo3' is meant to pass through the adjustment points defined by M_test.
'NRCF_expo3' is the function to be minimised, which is basically the error between R_test and the fitted exponential function.
When I run the code, I obtain the following error message:
B_expo3=scipy.optimize.fmin(func=NRCF_expo3,x0=[fcm28,1,1],args=(x,))
NameError: name 'x' is not defined
There are a lot of similar questions that I have perused.
If I delete 'args' from the optimization call, as the question "numpy/scipy analog of matlab's fminsearch"
seems to indicate it is not necessary, I obtain the error:
line 327, in function_wrapper
return function(*(wrapper_args+args))
TypeError: <lambda>() missing 2 required positional arguments: 'x' and 'M_test'
There are a lot of other modifications that I have tried, following examples like "Using scipy to minimize a function that also takes non variational parameters" or those found in "Open source examples", but nothing works for me.
I expect this is probably quite obvious, but I am very new to Python and I feel like I am looking for a needle in a haystack. What am I not seeing?
Any help would be really appreciated. I can also provide more code, if that is necessary.
I think you shouldn't use lambdas in your code; instead make a single target function with your three parameters (see PEP 8). There is a lot of missing information in your post, but from what I can infer, you want something like this:
import numpy as np
from scipy.optimize import fmin

# Define parameters
M_TEST = np.array([0.513, 7.521, 13.781])
R_TEST = np.array([2.39, 3.77, 6.86])
X0 = np.array([10, 1, 1])  # whatever your variable fcm28 is

def nrcf_expo3(b, m_test, r_test):
    # residual norm between the data and the fitted exponential
    expo3 = b[0] * (1 - np.exp(-(b[1] / m_test) ** b[2]))
    return np.linalg.norm(r_test - expo3)

B_expo3 = fmin(nrcf_expo3, X0, args=(M_TEST, R_TEST))
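Note that fmin passes the current parameter vector as the first positional argument of the objective and unpacks args after it, which is why the fixed data (M_TEST, R_TEST) goes into args while the three fit parameters travel in b.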

AdaDelta optimization algorithm using Python

I was studying the AdaDelta optimization algorithm so I tried to implement it in Python, but there is something wrong with my code, since I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'sqrt'
I did not find anything about what is causing this error. According to the message, it's caused by this line of code:
rms_grad = np.sqrt(self.e_grad + epsilon)
This line is similar to this equation:
RMS[g]_t = √(E[g²]_t + ϵ)
I got the core equations of the algorithm in this article: http://ruder.io/optimizing-gradient-descent/index.html#adadelta
Just one more detail: I'm initializing the E[g²]_t matrix like this:
self.e_grad = (1 - mu)*np.square(nabla)
where nabla is the gradient, similar to this equation:
E[g²]_t = γ·E[g²]_{t−1} + (1−γ)·g²_t
(the first term is equal to zero in the first iteration, just like the line of code above)
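For reference, one step of that running average on an ordinary float array works as expected (a minimal sketch, with mu playing the role of γ as in the question; the gradient values are made up):

import numpy as np

mu, epsilon = 0.9, 1e-8
e_grad = 0.0                                     # E[g²]_0
g = np.array([0.1, -0.2])                        # example gradient
e_grad = mu * e_grad + (1 - mu) * np.square(g)   # E[g²]_t
rms_grad = np.sqrt(e_grad + epsilon)             # RMS[g]_t, no error here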
So I want to know if I'm initializing the E matrix the wrong way or doing the square root inappropriately. I tried to use the pow() function but it doesn't work. If anyone could help me with this I would be very grateful; I've been trying this for weeks.
Additional details requested by andersource:
Here is the entire source code on github: https://github.com/pedrovbeltran/neural-networks-and-deep-learning/blob/experimental/modified-networks/network2_with_adadelta.py .
I think the problem is that self.e_grad_w is an ndarray of shape (2,) which further contains two additional ndarrays with 2d shapes, instead of directly containing data. This seems to be initialized in e_grad_initializer, in which nabla_w has the same structure. I didn't track where this comes from all the way back, but I believe once you fix this issue the problem will be resolved.
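That diagnosis matches the error text: when np.sqrt receives an object-dtype array, NumPy falls back to calling a .sqrt() method on each element, and plain ndarrays have no such method. A minimal reproduction (the layer shapes here are made up):

import numpy as np

# two weight matrices of different shapes, collected into a shape-(2,) object array
e_grad_w = np.array([np.ones((3, 2)), np.ones((1, 3))], dtype=object)

# np.sqrt(e_grad_w)  # raises: AttributeError: 'numpy.ndarray' object has no attribute 'sqrt'

# one workaround: apply the ufunc layer by layer
rms_grad = [np.sqrt(g + 1e-8) for g in e_grad_w]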

Scipy.signal method 'filtfilt()' isn't recognized correctly

It's my first time working with the scipy.signal library and I am experiencing an error with the method filtfilt().
This is the code I am trying to execute:
import numpy as np
from scipy import signal

Fs = 1000
# s is an array of numbers
a = signal.firwin(10, cutoff=0.5/(Fs/2))
ss = s - np.mean(s)
se = signal.filtfilt(a, 1, ss, method="gust")
When I execute this code I get the following error:
TypeError: filtfilt() got an unexpected keyword argument 'method'
But in the documentation of the method it is clearly shown that the parameter 'method' exists.
What could be the problem?
I would guess you have different versions of scipy in use. The documentation of filtfilt says the 'gust' method was added in 0.16. I assume the method parameter does not exist in earlier versions.
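A quick way to confirm the installed version (a standard check, nothing specific to this post):

import scipy
print(scipy.__version__)  # filtfilt's method="gust" was added in scipy 0.16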

Calling goodness-of-fit value with result.prsquared() in statsmodels results in TypeError: 'numpy.float64' object is not callable

I am a complete beginner, and I'm currently doing this tutorial about logit regression models in Python 3.4, with statsmodels 0.6.1 and PyCharm Community Edition 4.5.1:
http://blog.yhathq.com/posts/logistic-regression-and-python.html
It runs smoothly. I try to add my own lines, to try out a few things.
After the part where I fit the data
train_cols = data.columns[1:]
logit = sm.Logit(data['admit'], data[train_cols])
result = logit.fit()
and I print out the summary
print(result.summary())
I tried to take a little detour from the tutorial, to print only the goodness-of-fit measurement (in this case, a pseudo R-squared value). According to the documentation it is a method of the result object (same as summary), so it should work like this:
print(result.prsquared())
However, running this code results in a TypeError on a line containing only print(result.prsquared()):
TypeError: 'numpy.float64' object is not callable
It really bugs me, because if I were to compare several models, pseudo R-squared would be my first choice for doing it.
prsquared is an attribute, not a function. Try:
print(result.prsquared)
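The error message itself points the same way: result.prsquared is already a numpy.float64, and calling a float raises exactly that TypeError. To see the distinction (illustrative only, using the result object from the question):

print(type(result.prsquared))  # <class 'numpy.float64'>: a value, not a method
print(result.summary())        # summary, by contrast, is a method and needs ()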
