How to use a user-defined metric in KernelDensity from scikit-learn (Python)

I'm using scikit-learn (0.14) and trying to implement a user-defined metric for my KernelDensity estimation.
The following code is an example of how my code is structured:
import numpy as np
from sklearn.neighbors import DistanceMetric, KernelDensity

def myDistance(x, y):
    return np.sqrt(sum((x - y)**2))

dt = DistanceMetric.get_metric("pyfunc", func=myDistance)
kernelModel = KernelDensity(algorithm='ball_tree', metric='pyfunc')
kernelModel.fit(X)
According to the documentation, the BallTree algorithm should accept user-defined metrics.
If I run this code as given here, I get the following error:
TypeError: __init__() takes exactly 1 positional argument (0 given)
The error seems to come from:
sklearn.neighbors.dist_metrics.PyFuncDistance.__init__
I don't understand this. If I check what 'dt' in the code above gives me, I get what I expect: dt.pairwise(X) returns the correct values.
What am I doing wrong?
Thanks in advance.

The solution is:
kernelModel = KernelDensity(..., metric='pyfunc', metric_params={"func": myDistance})
The call to DistanceMetric.get_metric is not necessary.
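Putting this together, a minimal working sketch (the random X and the score_samples call are illustration only, not from the original post):

import numpy as np
from sklearn.neighbors import KernelDensity

def myDistance(x, y):
    # plain Euclidean distance, as in the question
    return np.sqrt(sum((x - y)**2))

X = np.random.rand(100, 2)  # placeholder data
kernelModel = KernelDensity(algorithm='ball_tree', metric='pyfunc',
                            metric_params={"func": myDistance})
kernelModel.fit(X)
print(kernelModel.score_samples(X[:5]))  # log-density at the first five samples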

Related

Can't specify initialization method in Exponential smoothing

When going through the exponential smoothing tutorial provided by statsmodels, I had an issue with specifying an initialization method in an Anaconda notebook. The error returned was:
TypeError: __init__() got an unexpected keyword argument 'initialization_method'
It would reference the first line in the cell:
fit1 = Holt(air, initialization_method="estimated").fit(smoothing_level=0.8, smoothing_trend=0.2, optimized=False)
When looking for the solution, I downgraded my prompt-toolkit, but that didn't do anything. I can't find anyone else having this problem. Any ideas?
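Though no version is given in the post, this looks like the same version-mismatch pattern as the filtfilt question below (an assumption, not a confirmed diagnosis): initialization_method was added to Holt and ExponentialSmoothing around statsmodels 0.12, so an older install raises exactly this TypeError. prompt-toolkit is unrelated to statsmodels. A quick first check:

import statsmodels
print(statsmodels.__version__)  # Holt(initialization_method=...) needs a recent statsmodels (>= 0.12)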

Recurrent problem using scipy.optimize.fmin

I am encountering some problems when translating the following code from MATLAB to Python:
Matlab code snippet:
x=M_test %M_test is a 1x3 array that holds the adjustment points for the function
y=R_test %R_test is also a 1x3 array
>> M_test=[0.513,7.521,13.781]
>> R_test=[2.39,3.77,6.86]
expo3= @(b,x) b(1).*(exp(-(b(2)./x).^b(3)));
NRCF_expo3= @(b) norm(y-expo3(b,x));
B0_expo3=[fcm28;1;1];
B_expo3=fminsearch(NRCF_expo3,B0_expo3);
Data_raw.fcm_expo3=(expo3(B_expo3,Data_raw.M));
The translated (python) code:
expo3=lambda x,M_test: x[0]*(1-exp(-1*(x[1]/M_test)**x[2]))
NRCF_expo3=lambda R_test,x,M_test: np.linalg.norm(R_test-expo3,ax=1)
B_expo3=scipy.optimize.fmin(func=NRCF_expo3,x0=[fcm28,1,1],args=(x,))
For clarity, the objective function 'expo3' is meant to pass through the adjustment points defined by M_test.
'NRCF_expo3' is the function to be minimised, which is basically the error between R_test and the fitted exponential function.
When I run the code, I obtain the following error message:
B_expo3=scipy.optimize.fmin(func=NRCF_expo3,x0=[fcm28,1,1],args=(x,))
NameError: name 'x' is not defined
There are a lot of similar questions that I have perused.
If I delete 'args' from the optimization call, as the question "numpy/scipy analog of matlab's fminsearch"
seems to indicate it is not necessary, I obtain the error:
line 327, in function_wrapper
return function(*(wrapper_args+args))
TypeError: <lambda>() missing 2 required positional arguments: 'x' and 'M_test'
There are a lot of other modifications that I have tried, following examples like "Using scipy to minimize a function that also takes non variational parameters" or those found in "Open source examples", but nothing works for me.
I expect this is probably quite obvious, but I am very new to Python and I feel like I am looking for a needle in a haystack. What am I not seeing?
Any help would be really appreciated. I can also provide more code, if that is necessary.
I think you shouldn't use lambdas in your code; instead make a single target function with your three parameters (see PEP 8). There is a lot of missing information in your post, but from what I can infer, you want something like this:
import numpy as np
from scipy.optimize import fmin

# Define parameters
M_TEST = np.array([0.513, 7.521, 13.781])
R_TEST = np.array([2.39, 3.77, 6.86])
X0 = np.array([10, 1, 1])  # whatever your variable fcm28 is

def nrcf_expo3(b, m_test, r_test):
    # residual norm between the data and the fitted exponential
    expo3 = b[0] * (1 - np.exp(-(b[1] / m_test) ** b[2]))
    return np.linalg.norm(r_test - expo3)

B_expo3 = fmin(nrcf_expo3, X0, args=(M_TEST, R_TEST))
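Note that fmin passes the current parameter vector as the first positional argument of the objective and unpacks args after it, which is why the fixed data (M_TEST, R_TEST) goes into args while the three fit parameters travel in b.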

AdaDelta optimization algorithm using Python

I was studying the AdaDelta optimization algorithm so I tried to implement it in Python, but there is something wrong with my code, since I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'sqrt'
I did not find anything about what is causing this error. According to the message, it's caused by this line of code:
rms_grad = np.sqrt(self.e_grad + epsilon)
This line is similar to this equation:
RMS[g]_t = √(E[g²]_t + ϵ)
I got the core equations of the algorithm in this article: http://ruder.io/optimizing-gradient-descent/index.html#adadelta
Just one more detail: I'm initializing the E[g²]_t matrix like this:
self.e_grad = (1 - mu)*np.square(nabla)
where nabla is the gradient, similar to this equation:
E[g²]_t = γ·E[g²]_{t−1} + (1−γ)·g²_t
(the first term is equal to zero in the first iteration, just like the line of code above)
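For reference, one step of that running average on an ordinary float array works as expected (a minimal sketch, with mu playing the role of γ as in the question; the gradient values are made up):

import numpy as np

mu, epsilon = 0.9, 1e-8
e_grad = 0.0                                     # E[g²]_0
g = np.array([0.1, -0.2])                        # example gradient
e_grad = mu * e_grad + (1 - mu) * np.square(g)   # E[g²]_t
rms_grad = np.sqrt(e_grad + epsilon)             # RMS[g]_t, no error here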
So I want to know if I'm initializing the E matrix the wrong way or doing the square root inappropriately. I tried to use the pow() function but it doesn't work. If anyone could help me with this I would be very grateful; I've been trying this for weeks.
Additional details requested by andersource:
Here is the entire source code on github: https://github.com/pedrovbeltran/neural-networks-and-deep-learning/blob/experimental/modified-networks/network2_with_adadelta.py .
I think the problem is that self.e_grad_w is an ndarray of shape (2,) which further contains two additional ndarrays with 2d shapes, instead of directly containing data. This seems to be initialized in e_grad_initializer, in which nabla_w has the same structure. I didn't track where this comes from all the way back, but I believe once you fix this issue the problem will be resolved.
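That diagnosis matches the error text: when np.sqrt receives an object-dtype array, NumPy falls back to calling a .sqrt() method on each element, and plain ndarrays have no such method. A minimal reproduction (the layer shapes here are made up):

import numpy as np

# two weight matrices of different shapes, collected into a shape-(2,) object array
e_grad_w = np.array([np.ones((3, 2)), np.ones((1, 3))], dtype=object)

# np.sqrt(e_grad_w)  # raises: AttributeError: 'numpy.ndarray' object has no attribute 'sqrt'

# one workaround: apply the ufunc layer by layer
rms_grad = [np.sqrt(g + 1e-8) for g in e_grad_w]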

Scipy.signal method 'filtfilt()' isn't recognized correctly

It's my first time working with the scipy.signal library and I am experiencing an error with the method filtfilt().
This is the code I am trying to execute:
import numpy as np
from scipy import signal

Fs = 1000
# s is an array of numbers
a = signal.firwin(10, cutoff=0.5/(Fs/2))
ss = s - np.mean(s)
se = signal.filtfilt(a, 1, ss, method="gust")
When I execute this code I get the following error:
TypeError: filtfilt() got an unexpected keyword argument 'method'
But in the documentation of the method it is clearly shown that the parameter 'method' exists.
What could be the problem?
I would guess you have different versions of scipy in use. The documentation of filtfilt says the 'gust' method was added in 0.16. I assume the method parameter does not exist in earlier versions.
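A quick way to confirm the installed version (a standard check, nothing specific to this post):

import scipy
print(scipy.__version__)  # filtfilt's method="gust" was added in scipy 0.16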

Calling goodness-of-fit value with result.prsquared() in statsmodels results in TypeError: 'numpy.float64' object is not callable

I am a complete beginner, and I'm currently doing this tutorial about logit regression models in Python 3.4, with statsmodels 0.6.1 and PyCharm Community Edition 4.5.1:
http://blog.yhathq.com/posts/logistic-regression-and-python.html
It runs smoothly. I try to add my own lines, to try out a few things.
After the part where I fit the data
train_cols = data.columns[1:]
logit = sm.Logit(data['admit'], data[train_cols])
result = logit.fit()
and I print out the summary
print(result.summary())
I tried to take a little detour from the tutorial, to print only the goodness-of-fit measurement (in this case, a pseudo R-squared value). According to the documentation it is a method of the result object (same as summary), so it should work like this:
print(result.prsquared())
However, running this code results in a TypeError on a line containing only print(result.prsquared()):
TypeError: 'numpy.float64' object is not callable
It really bugs me, because if I were to compare several models, pseudo R-squared would be my first choice for doing it.
prsquared is an attribute, not a function. Try:
print(result.prsquared)
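The error message itself points the same way: result.prsquared is already a numpy.float64, and calling a float raises exactly that TypeError. To see the distinction (illustrative only, using the result object from the question):

print(type(result.prsquared))  # <class 'numpy.float64'>: a value, not a method
print(result.summary())        # summary, by contrast, is a method and needs ()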
