Ambiguity using a customized kernel for the `sklearn.svm` regressor - Python

I want to use a customized kernel function in the Epsilon-Support Vector Regression module of sklearn.svm (SVR). I found this code as an example of a customized kernel for SVC in the scikit-learn documentation:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features. We could
                      # avoid this ugly slicing by using a two-dim dataset
Y = iris.target


def my_kernel(X, Y):
    """
    We create a custom kernel:

                 (2  0)
    k(X, Y) = X  (    ) Y.T
                 (0  1)
    """
    M = np.array([[2, 0], [0, 1.0]])
    return np.dot(np.dot(X, M), Y.T)


h = .02  # step size in the mesh

# we create an instance of SVM and fit our data.
clf = svm.SVC(kernel=my_kernel)
clf.fit(X, Y)

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.Paired, edgecolors='k')
plt.title('3-Class classification using Support Vector Machine with custom'
          ' kernel')
plt.axis('tight')
plt.show()
I want to define some function like:
def my_new_kernel(X):
    a, b, c = (random.randint(0, 100) for _ in range(3))
    # imagine f1, f2, f3 are functions like sin(x), cos(x), ...
    ans = a*f1(X) + b*f2(X) + c*f3(X)
    return ans
My understanding of the kernel method was that a kernel is a function that takes the matrix of features (X) as input and returns a matrix of shape (n, 1); the SVM then appends the returned matrix to the feature columns and uses that to classify the labels Y.
In the code above the kernel is used inside svm.fit, and I can't figure out what the X and Y inputs of the kernel are, or their shapes. If X and Y (the inputs of the my_kernel method) are the features and labels of the dataset, then how does the kernel work for test data, where we have no labels?
Actually, I want to use SVM on a dataset of shape (10000, 6) (5 columns = features, 1 column = label). If I want to use the my_new_kernel method, what would the inputs and output be, and what are their shapes?

Your exact issue is quite unclear; here are some remarks which may be helpful nevertheless.
I can't figure out what the X and Y inputs of the kernel are, or their shapes. If X and Y (inputs of the my_kernel method) are the features and labels of the dataset,
Indeed they are; from the documentation of fit:
Parameters:

X : {array-like, sparse matrix}, shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and
    n_features is the number of features. For kernel="precomputed",
    the expected shape of X is (n_samples, n_samples).

y : array-like, shape (n_samples,)
    Target values (class labels in classification, real numbers in regression)
exactly like they are for the default available kernels.
then how does the kernel work for test data, where we have no labels?
A close look at the code you have provided will reveal that the labels Y are indeed used only during training (fit); they are of course not used during prediction (clf.predict() in the code above; don't get confused by yy, which has nothing to do with Y).
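To make the shapes concrete for the (10000, 6) case mentioned in the question, here is a minimal sketch (not part of the original answer; my_linear_kernel and the random data are illustrative assumptions). In scikit-learn, a kernel callable receives two arrays of samples and must return the matrix of kernel values between them, so it only ever sees the 5 feature columns; the label column is passed separately to fit and is never needed by predict:

import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X_train = rng.rand(1000, 5)   # 5 feature columns (10000 rows in the question; fewer here)
y_train = rng.rand(1000)      # the label column, passed separately to fit()
X_test = rng.rand(100, 5)     # unlabeled test samples

def my_linear_kernel(A, B):
    # A: (n_samples_A, 5), B: (n_samples_B, 5); the callable must return
    # the (n_samples_A, n_samples_B) matrix of kernel values between the two sample sets
    return A @ B.T

reg = SVR(kernel=my_linear_kernel)
reg.fit(X_train, y_train)     # labels are used here, and only here
pred = reg.predict(X_test)    # no labels needed; result has shape (100,)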

Related

How to interpret the coefficients returned from a multivariate cubic regression (polynomial degree 3) when using LinearRegression().coef_?

I am trying to fit a hyperplane to a dataset with 2 features and 1 target variable. I processed the features with PolynomialFeatures(degree=3).fit_transform(), and then fitted those features and the target variable with a LinearRegression() model. When I use LinearRegression().coef_ to get the coefficients in order to write out a function for the hyperplane (I want the written-out function itself), 10 coefficients are returned and I don't know how to interpret them as a function. I know that for a PolynomialFeatures(degree=2) model, 6 coefficients are returned and the function looks like m[0] + x1*m[1] + x2*m[2] + (x1**2)*m[3] + (x2**2)*m[4] + x1*x2*m[5], where m is the list of coefficients returned in that order. How would I interpret the cubic one?
Here is what my code for the cubic model looks like:
from sklearn.preprocessing import PolynomialFeatures as polyF
from sklearn.linear_model import LinearRegression as linR

poly = polyF(degree=3)
x_poly = poly.fit_transform(x)

model = linR()
model.fit(x_poly, y)
model.coef_
(returns):
array([ 0.00000000e+00, -1.50603348e+01, 2.33283686e+00, 6.73172519e-01,
-1.93686431e-01, -7.30930307e-02, -9.31687047e-03, 3.48729458e-03,
1.63718406e-04, 2.26682333e-03])
Following the same ordering convention: for two features, (X1, X2) transforms under degree 2 to
(1, X1, X2, X1^2, X1*X2, X2^2)
so under degree 3 it transforms to
(1,
X1, X2,
X1^2, X1*X2, X2^2,
X1^3, X1^2*X2, X1*X2^2, X2^3)
i.e. 10 terms, one per returned coefficient (the leading coefficient belongs to the constant term).
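To avoid guessing the order, you can ask PolynomialFeatures itself (a small sketch, assuming scikit-learn 1.0+ where get_feature_names_out is available; older versions call it get_feature_names):

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=3)
poly.fit(np.zeros((1, 2)))                       # 2 features, as in the question
print(poly.get_feature_names_out(['x1', 'x2']))
# ['1' 'x1' 'x2' 'x1^2' 'x1 x2' 'x2^2' 'x1^3' 'x1^2 x2' 'x1 x2^2' 'x2^3']

Each entry of coef_ then pairs with the feature name at the same index.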
I was facing the same question and developed the following code block to print the fit equation. To do so, it was necessary to set include_bias=True in PolynomialFeatures and fit_intercept=False in LinearRegression, as opposed to conventional use:
import numpy as np
import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

def polyReg():
    seed = 12341
    df = pd.read_csv("input.txt", delimiter=', ', engine='python')
    X = df[["x1", "x2", "x3"]]
    y = df["y"]
    poly = PolynomialFeatures(degree=2, include_bias=True)
    poly_X = poly.fit_transform(X)
    X_train, X_test, y_train, y_test = train_test_split(poly_X, y, test_size=0.5, random_state=seed)
    regression = linear_model.LinearRegression(fit_intercept=False)
    fit = regression.fit(X_train, y_train)
    variable_names = poly.get_feature_names_out(X.columns)
    variable_names = np.core.defchararray.replace(variable_names.astype(str), ' ', '*')
    fit_coeffs = ["{:0.5g}".format(x) for x in fit.coef_]
    arr_list = [fit_coeffs, variable_names]
    fit_equation = np.apply_along_axis(join_txt, 0, arr_list)
    fit_equation = '+'.join(fit_equation)
    fit_equation = fit_equation.replace("*1+", "+")
    fit_equation = fit_equation.replace("+-", "-")
    print("Fit equation:")
    print(fit_equation)

def join_txt(text, delim='*'):
    return np.asarray(delim.join(text), dtype=object)

2D lat/lon KernelDensity Estimator for sklearn

Using KernelDensity of sklearn.neighbors, I am getting density values that are much smaller than I expect. The density estimates have values that are about 1/200th of what I would expect.
I have reviewed sklearn's Kernel Density Estimate of Species Distribution, and I went down that path of converting my input lat/lon data to radians and using a haversine distance metric, but I was getting strange results.
I've thought about this a lot and here are the parameters that make most sense to me.
bandwidth = 1.0 map units
metric = "Euclidean" for my 2D space
kernel = "Gaussian" for probability density
Questions
Does this seem like a reasonable approach to this problem?
Why are the density values so much smaller than I would expect?
Here is my function and the parameters that I pass to it.
import rasterio
from rasterio.crs import CRS
from sklearn.neighbors import KernelDensity
import numpy as np

def kernel_density_lat_lon(positions, bandwidth, metric, kernel,
                           cell_size, extent, output_raster, multiplier=None):
    # Set the bounds of the output raster based on the extent
    x_min = extent[0]
    x_max = extent[1]
    y_min = extent[2]
    y_max = extent[3]

    # Create arrays, based on cell_size and bounds
    # These arrays hold x locations and y locations for each pixel in the output raster
    x = np.arange(x_min, x_max, cell_size)
    y = np.arange(y_min, y_max, cell_size)

    # Create a meshgrid, which has cells whose values are the (x,y) location at each cell
    xx, yy = np.meshgrid(x, y)

    # Pair the x locations with y locations
    xys = np.vstack((xx.ravel(), yy.ravel())).T

    # Create a density map
    x_shape = xx.shape

    # Get the kernel density estimator
    kde = KernelDensity(bandwidth=bandwidth, metric=metric,
                        kernel=kernel, algorithm='ball_tree')
    # Fit it to the coordinate pairs
    _ = kde.fit(positions)
    # Evaluate
    z = np.exp(kde.score_samples(xys))
    print(np.max(z))

    zi = np.arange(xys.shape[0])

    # Plug densities into grid
    zg = -9999 + np.zeros(xys.shape[0])
    zg[zi] = z

    xyz = np.hstack((xys[:, :2], zg[:, None]))

    # Get the density values arranged on the grid
    z = xyz[:, 2].reshape(x_shape)
    temp = z[::-1, :]
    output_arr = temp.reshape(-1, temp.shape[0], temp.shape[1])

    # Write the densities to a raster
    with rasterio.open(
        output_raster,
        'w',
        driver='GTiff',
        height=output_arr.shape[1],
        width=output_arr.shape[2],
        dtype=output_arr.dtype,
        crs=CRS.from_epsg(4326),
        count=1,
        transform=rasterio.transform.from_bounds(x_min, y_min, x_max, y_max, output_arr.shape[2], output_arr.shape[1])
    ) as dst:
        dst.write(output_arr)
if __name__ == "__main__":
    positions = [[126.82800884821953, 8.021550450814345],
[123.0835913004416, 15.887493017360754],
[122.87172138544588, 15.155979776107289],
[122.48465193221716, 15.233649683534475],
[122.26320643954872, 16.71625103407011],
[122.13275884500477, 15.941644592949958],
[120.63772441542471, 7.078277119741588],
[120.57180822188472, 7.537689414917545],
[119.53047809084589, 1.396741864447578],
[119.51652407635684, 1.7028166423529711],
[119.35538543402562, 7.795232293743844],
[119.35371605376332, 1.7139590065581176],
[118.21983976700818, 0.2725608428591114],
[116.32507063966972, -2.0478066628388163],
[115.9455871941716, -2.2758686356158915],
[110.54879990595637, 4.849182291868757],
[109.00373897612512, 12.330559666134512],
[108.56317006080423, 23.10356852435795],
[107.95374212609899, -3.878293744564539],
[107.6618148392204, -4.215545933851648],
[107.39598092145678, -3.3557991558597426],
[107.38347877309276, -4.243848824653475],
[107.3802332907293, -4.724984303635246],
[106.92298020128571, 3.3377440975999058],
[106.8467663232349, -3.427384435159751],
[106.6198566766759, 3.327030211530555],
[106.59035576911651, 3.409433089119516],
[106.48649132403538, 3.5936300679047966],
[106.2879431146126, 3.039670857739856],
[105.96323043582797, 2.5103916023650656],
[105.9540323861389, 2.596746532847891],
[105.80111748849575, 3.388380151516756],
[105.62119198430719, 3.2169296961449554],
[105.43276377101233, 2.6840109661437204],
[105.29236334314527, 2.420170430982717],
[104.94141265184744, 3.091707354213681],
[103.08902291491331, 3.1932135322924133],
[102.59488296531873, 14.93503092216549],
[100.7213889691745, 5.834246665586201],
[100.70491932538964, 5.2594820067014245],
[100.51665775078591, 6.0369426594605855],
[100.51156199546038, 5.491942119998682],
[100.45311457726862, 5.281343969279209],
[99.984116880471, 5.658350660638604],
[93.51170627287425, 24.024373245961645],
[93.34991893283902, 23.04050533807432],
[84.93884193888668, 19.384547030288207],
[84.30999142795147, 18.825326243832105],
[84.1630944193751, 19.06013889689632],
[83.80094785724114, 18.57306909774846],
[74.16321921976069, 23.579347585345776],
[72.4113965790803, 21.875517403679595],
[49.40472412468231, 32.2487630729451],
[42.90510332039255, -12.821849510976579],
[42.408207428324495, -12.31050970009727],
[42.36825610793828, -13.083052941231413],
[42.30285486383656, -12.234780003717532],
[15.328057669295298, -7.460883355600632],
[14.631592099379093, -7.440778982157976],
[14.563929300312948, -7.140268202440664],
[14.446656807020666, -4.699494598106393],
[14.188788859460905, -6.430418645148537],
[13.44490187975298, -2.8654279482460323],
[13.301089335672936, -2.593387816196834],
[13.131727857324034, -3.412434046655619],
[11.637624067618695, 5.306602656962694],
[11.537324701566494, 1.5773310360579327],
[11.056051828014489, 5.372994263069668],
[10.981944105212998, 6.05789466930291],
[10.978615683124655, 5.7586879077143225],
[10.384229532923067, 2.6509917300959476],
[10.293978958054748, 5.6087142487617045],
[9.724503564938162, 5.965801337392755],
[9.228154036572047, 6.4564328707855605],
[8.847083818460739, 4.696640992862242],
[8.724622829999017, 5.5476494764785516],
[8.483278678008926, 6.612624047942372],
[8.44366045716664, 6.2122982089038725],
[8.4255624128847, 4.755664077859387],
[8.11860899795907, 5.659724263701104],
[7.912362077517271, 4.87480562915889],
[7.563449250527216, 2.842579773546474],
[7.2608575851074, 5.16577516485171],
[7.004069229900638, 3.5416918941072804],
[6.9915716303567494, 3.7362296571866294],
[6.468876406999725, 5.010859767233725],
[6.203147917904825, 4.992482439632923],
[5.4017709770599325, 7.676092696459705],
[5.350100368207385, 7.762605113995827],
[5.279221956366327, 4.915935839020336],
[5.213104554080347, 8.281676925077297],
[5.1108484406102805, 7.9040681892696485],
[5.059337403465768, 8.140534352024792],
[4.861618772269268, 8.322655646328752],
[4.80376638793241, 8.062341031849334],
[4.665446704573248, 7.477404025788393],
[4.6477402888853145, 7.797020093234158],
[4.609044098910636, 38.765860093618905],
[4.555126307535386, 7.873929016757312],
[4.4195324599539845, 7.394848626095032],
[4.400283930670644, 8.038284539940614],
[4.347819621721147, 8.443859742876246],
[4.240704264765369, 6.955830447603886],
[4.227870824209585, 5.751072313355475],
[4.033821062618696, 7.0740805209122595],
[3.665972118522844, 6.545536856751896],
[3.4165849005141005, 7.191717476638518],
[3.121450235674562, 8.103710628355616],
[1.8057346437941182, 1.3314371195302515],
[0.21998421850813876, 6.744306925430884],
[-12.310298533627448, 11.362835062050264],
[-49.352317054841336, 2.010101652464972],
[-49.56587070660965, 1.366869361066606],
[-49.5821860267535, 1.824258170311353],
[-70.58665807820438, 20.03257364630837],
[-70.6803277335339, 19.902301232265422],
[-70.78620439744233, 20.024999949922996],
[-70.86459827149523, 20.273742251629713],
[-71.02033226779315, 19.891866165854587],
[-73.57317798569044, 12.265930473198331],
[-75.32300214385347, -10.734649751468147],
[-75.36631826293349, -10.206201123969526],
[-75.37463804230384, -10.724232696199014],
[-75.40829227919468, -10.817431611704407],
[-75.46984739081694, -10.195876463554633],
[-75.56266706716431, -10.202240256127965],
[-75.74233061116121, -10.647556252995775],
[-75.90503122834087, -10.297561312609464],
[-75.94114020328095, -10.530481915516726],
[-78.13302896559648, -1.2629721839381856],
[-78.42506520505198, -0.6805387090496724],
[-78.68351568134375, -1.1006283268898114],
[-79.09221180056895, -1.5423219306900116],
[-90.05839881111541, 21.022199691388156],
[-91.3208074507767, 20.58263399988673],
[-91.86906142999138, 20.169783366358622],
[-91.89838954465436, 20.49386425203851]]
    bandwidth = 1.0
    cell_size = 0.1
    extent = [-180, 180, -90, 90]
    metric = "euclidean"
    kernel = "gaussian"
    output_raster = metric + "_" + kernel + "_" + str(bandwidth).split(".")[0] + ".tif"

    # The parameters that I think should do the trick
    kernel_density_lat_lon(positions, bandwidth, metric, kernel, cell_size,
                           extent, output_raster)

    # The parameters that get me closest to the desired output
    # This requires multiplying all of the density probabilities by 205...
    bandwidth = 1.0
    cell_size = 0.1
    extent = [-180, 180, -90, 90]
    metric = "euclidean"
    kernel = "epanechnikov"
    multiplier = 205
    output_raster = metric + "_" + kernel + "_" + str(bandwidth).split(".")[0] + ".tif"
    kernel_density_lat_lon(positions, bandwidth, metric, kernel, cell_size,
                           extent, output_raster, multiplier)
I've thought about this a lot, and I am stumped as to why the density estimates are less than I would expect. Thanks for your help.
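One way to sanity-check the magnitudes (a sketch added here for illustration, with made-up positions and a smaller grid standing in for the variables above; it is not part of the original post): exp(score_samples) is a probability density per unit of map area, so summing it over the grid times the cell area should come out near 1, which forces small per-cell values when the points are spread over a large extent.

import numpy as np
from sklearn.neighbors import KernelDensity

# Illustrative stand-ins for positions, cell_size and the grid in kernel_density_lat_lon
positions = np.random.RandomState(0).uniform(-50, 50, size=(100, 2))
cell_size = 0.5
x = np.arange(-60, 60, cell_size)
y = np.arange(-60, 60, cell_size)
xx, yy = np.meshgrid(x, y)
xys = np.vstack((xx.ravel(), yy.ravel())).T

kde = KernelDensity(bandwidth=1.0, metric="euclidean", kernel="gaussian").fit(positions)
density = np.exp(kde.score_samples(xys))   # probability density per unit area

print(density.max())                       # small individual values...
print(density.sum() * cell_size**2)        # ...but they integrate to roughly 1.0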

Trouble calculating slope and intercept in NumPy/SciPy using linear regression

I'm new to this forum.
I have a small problem understanding how to calculate the slope and intercept from values that are in a CSV file.
This is my working code (minquadbasso.py is the program's name):
import numpy as np
import matplotlib.pyplot as plt # To visualize
import pandas as pd # To read data
from sklearn.linear_model import LinearRegression
data = pd.read_csv('TelefonoverticaleAsseY.csv') # load data set
X = data.iloc[:, 0].values.reshape(-1, 1) # .values converts the column into a numpy array
Y = data.iloc[:, 1].values.reshape(-1, 1) # -1 lets numpy infer the number of rows; keep 1 column
linear_regressor = LinearRegression() # create object for the class
linear_regressor.fit(X, Y) # perform linear regression
Y_pred = linear_regressor.predict(X) # make predictions
plt.scatter(X, Y)
plt.plot(X, Y_pred, color='black')
plt.show()
If I use:
from scipy.stats import linregress
linregress(X, Y)
the interpreter gives me this error:
Traceback (most recent call last):
  File "minquadbasso.py", line 11, in <module>
    linregress(X, Y)
  File "/usr/local/lib/python3.7/dist-packages/scipy/stats/_stats_mstats_common.py", line 116, in linregress
    ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
ValueError: too many values to unpack (expected 4)
Can you help me understand what I'm doing wrong and suggest what to change in order to successfully calculate the slope and intercept?
My go-to for linear regression is np.polyfit. If you have an array (or list) of x data and an array or list of y data, just use
coeff = np.polyfit(x, y, deg=1)
coeff is now a list of least-squares coefficients that fit your data, with the highest power of x first. So for a first-degree fit y = ax + b,
a = coeff[0] and b = coeff[1]
deg is the degree of the polynomial you want to fit to your data. To evaluate your regression (predict) you can use np.polyval:
y_prediction = np.polyval(coeff, x)
If you want the covariance matrix for the fit
coeff, cov = np.polyfit(x,y, deg = 1, cov = True)
you can find more on it here.
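Applied to the arrays from the question (a sketch: X and Y were reshaped into (n, 1) column vectors for LinearRegression, while np.polyfit expects 1-D data, so flatten them first):

import numpy as np

# X and Y are the (n, 1) column vectors from the question; flatten them for polyfit
coeff = np.polyfit(X.ravel(), Y.ravel(), deg=1)
slope, intercept = coeff[0], coeff[1]
print("slope =", slope, "intercept =", intercept)

The same flattening also makes the original scipy.stats.linregress(X.ravel(), Y.ravel()) call work, since linregress expects 1-D inputs.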

Canonical Discriminant Function in Python sklearn

I am learning about Linear Discriminant Analysis and am using the scikit-learn module. I am confused by the "coef_" attribute from the LinearDiscriminantAnalysis class. As far as I understand, these are the discriminant function coefficients (sklearn calls them weight vectors). Since there should be (n_classes-1) discriminant functions, I would expect the coef_ attribute to be an array with shape (n_components, n_features), but instead it prints an (n_classes, n_features) array. Below is an example of this using the Iris dataset example from sklearn. Since there are 3 classes and 2 components, I would expect print(lda.coef_) to give me a 2x4 array instead of a 3x4 array...
Maybe I'm misinterpreting what the weight vectors are, perhaps they are the coefficients for the classification function?
And how do I get the coefficients for each variable in each discriminant/canonical function?
Code here:
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
import numpy as np

iris = datasets.load_iris()
X = iris.data
y = iris.target
target_names = iris.target_names

lda = LinearDiscriminantAnalysis(n_components=2, store_covariance=True)
X_r = lda.fit(X, y).transform(X)

plt.figure()
colors = ['navy', 'turquoise', 'darkorange']
for color, i, target_name in zip(colors, [0, 1, 2], target_names):
    plt.scatter(X_r[y == i, 0], X_r[y == i, 1], alpha=.8, color=color,
                label=target_name)
plt.legend(loc='best', shadow=False, scatterpoints=1)
plt.xlabel('Function 1 (%.2f%%)' % (lda.explained_variance_ratio_[0]*100))
plt.ylabel('Function 2 (%.2f%%)' % (lda.explained_variance_ratio_[1]*100))
plt.title('LDA of IRIS dataset')

print(lda.coef_)
# output -> [[  6.24621637  12.24610757 -16.83743427 -21.13723331]
#            [ -1.51666857  -4.36791652   4.64982565   3.18640594]
#            [ -4.72954779  -7.87819105  12.18760862  17.95082737]]
You can calculate the coefficients with the following code:
import numpy as np
import pandas as pd

def LDA_coefficients(X, lda):
    nb_col = X.shape[1]
    matrix = np.zeros((nb_col+1, nb_col), dtype=int)
    Z = pd.DataFrame(data=matrix, columns=X.columns)
    for j in range(0, nb_col):
        Z.iloc[j, j] = 1
    LD = lda.transform(Z)
    nb_funct = LD.shape[1]
    results = pd.DataFrame()
    index = ['const']
    for j in range(0, LD.shape[0]-1):
        index = np.append(index, 'C' + str(j+1))
    for i in range(0, LD.shape[1]):
        coef = [LD[-1][i]]
        for j in range(0, LD.shape[0]-1):
            coef = np.append(coef, LD[j][i] - LD[-1][i])
        result = pd.Series(coef)
        result.index = index
        column_name = 'LD' + str(i+1)
        results[column_name] = result
    return results
Before calling this function you need to complete the linear discriminant analysis:
lda = LinearDiscriminantAnalysis()
lda.fit(X,y)
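For example, with the iris data from the question loaded into a pandas DataFrame (a sketch; note that the function above indexes X.columns, so X has to be a DataFrame rather than a plain NumPy array):

import pandas as pd
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

iris = datasets.load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = iris.target

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)

print(LDA_coefficients(X, lda))   # one column per discriminant function (LD1, LD2)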

Extract decision boundary with scikit-learn linear SVM

I have a very simple 1D classification problem: a list of values [0, 0.5, 2] and their associated classes [0, 1, 2]. I would like to get the classification boundaries between those classes.
Adapting the iris example (for visualization purposes), getting rid of the non-linear models:
import numpy as np
from sklearn import svm

X = np.array([[x, 1] for x in [0, 0.5, 2]])
Y = np.array([1, 0, 2])

C = 1.0  # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, Y)
lin_svc = svm.LinearSVC(C=C).fit(X, Y)
This gives the following result: LinearSVC returns junk (why?), but SVC with a linear kernel works okay. So I would like to get the boundary values, which you can guess graphically: ~0.25 and ~1.25.
That's where I'm lost: svc.coef_ returns
array([[ 0.5       ,  0.        ],
       [-1.33333333,  0.        ],
       [-1.        ,  0.        ]])
while svc.intercept_ returns array([-0.125 , 1.66666667, 1. ]).
This is not explicit.
I must be missing something silly; how do I obtain those values? They seem obvious to compute, and it would be ridiculous to iterate over the x-axis to find the boundary...
I had the same question and eventually found the solution in the sklearn documentation.
Given the weights W = svc.coef_[0] and the intercept I = svc.intercept_, the decision boundary is the line
y = a*x - b
with
a = -W[0]/W[1]
b = I[0]/W[1]
Exact boundary calculated from coef_ and intercept_
I think this is a great question and haven't been able to find a general answer to it anywhere in the documentation. This site really needs Latex, but anyway, I'll try to do my best without...
In general, a hyperplane is defined by its unit normal and an offset from the origin. So we hope to find some decision function of the form: x dot n + d > 0 (where the > may of course be replaced with >=).
In the case of the SVM Margins Example, we can manipulate the equation they start with to clarify its conceptual significance. First, let's establish the notational convenience of writing coef to represent coef_[0] and intercept to represent intercept_[0], since these arrays only have 1 value. Then some simple substitution yields the equation:
y + coef[0]*x/coef[1] + intercept/coef[1] = 0
Multiplying through by coef[1], we obtain
coef[1]*y + coef[0]*x + intercept = 0
And so we see that the coefficients and intercept function roughly as their names would imply. Applying one quick generalization of notation should make the answer clear - we will replace x and y with a single vector x.
coef[0]*x[0] + coef[1]*x[1] + intercept = 0
In general, the coef_ and intercept_ members of the svm classifier will have dimension matching the data set it was trained on, so we can extrapolate this equation to data of arbitrary dimension. And to avoid leading anyone astray, here is the final generalized decision boundary using the original variable names from the svm:
coef_[0][0]*x[0] + coef_[0][1]*x[1] + coef_[0][2]*x[2] + ... + coef_[0][n-1]*x[n-1] + intercept_[0] = 0
where the dimension of the data is n.
Or more tersely:
sum(coef_[0][i]*x[i]) + intercept_[0] = 0
where i sums over the range of the dimension of the input data.
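Applied to the numbers in the question (a sketch added for illustration; since the second feature is the constant 1, each row k of coef_ defines a one-dimensional boundary at x = -(intercept_[k] + coef_[k][1]*1) / coef_[k][0]):

import numpy as np
from sklearn import svm

X = np.array([[x, 1] for x in [0, 0.5, 2]])
Y = np.array([1, 0, 2])
svc = svm.SVC(kernel='linear', C=1.0).fit(X, Y)

# One row of coef_/intercept_ per pair of classes (one-vs-one)
boundaries = -(svc.intercept_ + svc.coef_[:, 1]) / svc.coef_[:, 0]
print(boundaries)

With the coef_ and intercept_ values reported in the question this gives 0.25, 1.25 and 1.0, so two of the three pairwise boundaries land at the values guessed graphically.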
Get decision line from SVM, demo 1
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs
# we create 40 separable points
X, y = make_blobs(n_samples=40, centers=2, random_state=6)
# fit the model, don't regularize for illustration purposes
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
# plot the decision function
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
linestyles=['--', '-', '--'])
# plot support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
linewidth=1, facecolors='none')
plt.show()
Prints: (a plot of the two blobs with the maximum-margin separating line, its margins, and the support vectors highlighted)
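To turn the fitted clf from demo 1 into an explicit line equation (a short sketch using the formula above; coef_[0] holds one weight per feature in this binary, linear-kernel case):

# w0*x0 + w1*x1 + b = 0  =>  x1 = -(w0/w1)*x0 - b/w1
w = clf.coef_[0]
b = clf.intercept_[0]
slope = -w[0] / w[1]
intercept = -b / w[1]
print("decision line: x1 = {:.3f}*x0 + {:.3f}".format(slope, intercept))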
Approximate the separating n-1 dimensional hyperplane of an SVM, Demo 2
import numpy as np
from sklearn import svm
from sklearn.svm import SVC
import matplotlib.pyplot as plt

np.random.seed(0)

mean1, cov1, n1 = [1, 5], [[1, 1], [1, 2]], 200  # 200 samples of class 1
x1 = np.random.multivariate_normal(mean1, cov1, n1)
y1 = np.ones(n1, dtype=int)

mean2, cov2, n2 = [2.5, 2.5], [[1, 0], [0, 1]], 300  # 300 samples of class -1
x2 = np.random.multivariate_normal(mean2, cov2, n2)
y2 = 0 * np.ones(n2, dtype=int)

X = np.concatenate((x1, x2), axis=0)  # concatenate the 1 and -1 samples
y = np.concatenate((y1, y2))

clf = svm.SVC()
# fit the hyperplane between the clouds of data, should be fast as hell
clf.fit(X, y)
# which returns the fitted estimator:
# SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
#     decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
#     max_iter=-1, probability=False, random_state=None, shrinking=True,
#     tol=0.001, verbose=False)
production_point = [1., 2.5]
answer = clf.predict([production_point])
print("Answer: " + str(answer))
plt.plot(x1[:,0], x1[:,1], 'ob', x2[:,0], x2[:,1], 'or', markersize = 5)
colormap = ['r', 'b']
color = colormap[answer[0]]
plt.plot(production_point[0], production_point[1], 'o' + str(color), markersize=20)
#I want to draw the decision lines
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
linestyles=['--', '-', '--'])
plt.show()
Prints the predicted class for the production point, then shows a plot of the two point clouds with the (non-linear, RBF) decision boundary and margins.
These hyperplanes are all straight as an arrow, they're just straight in higher dimensions and can't be comprehended by mere mortals confined to 3-dimensional space. These hyperplanes are cast into higher dimensions with the creative kernel functions, then flattened back into the visible dimension for your viewing pleasure. Here is a video trying to impart some intuition of what is going on in demo 2: https://www.youtube.com/watch?v=3liCbRZPrZA
