I want to build a chart similar to this
I have created a bar chart, and I have the logistic regression completed.
#imports
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
plt.bar(prob_df['diff'], prob_df['full_win_prob'])
plt.show()
#logistic regression
X = dfx['home_diff'].values
y = dfx['away_win'].values
X = X.reshape(-1, 1)
logreg = LogisticRegression()
logreg.fit(X, y)
print(logreg.intercept_, logreg.coef_)
[-0.67032214] [[0.04948131]] #results
I have the chart and I have the model, but I can't figure out how to plot the model on top of the chart. It's a bit frustrating because I'm sure the answer is simple. I would prefer an answer in matplotlib, but seaborn is also OK.
You can use yy = logreg.predict_proba(XX)[:,1] to plot the logistic curve for an array of x-values. logreg.predict_proba(XX)[:,0] gives the inverted curve, the probability of being 0. logreg.predict(XX) gives the predictions, i.e. the logistic curve rounded to 0 or 1.
Here is an example starting from some generated test data.
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
import numpy as np
np.random.seed(123)
# logistic regression
X = np.random.uniform(1, 9, 20) # dfx['home_diff'].values
y = np.random.choice([0, 1], 20, p=[0.8, 0.2])
y = np.where(X < 5, y, 1 - y) # dfx['away_win'].values
X = X.reshape(-1, 1)
plt.scatter(X[:, 0], y, color='black', label='Given values')
logreg = LogisticRegression()
logreg.fit(X, y)
XX = np.linspace(0, 10, 1000).reshape(-1, 1)
prediction = logreg.predict(XX)
probability_0, probability_1 = logreg.predict_proba(XX).T
plt.plot(XX[:, 0], prediction, color='limegreen', lw=2, alpha=0.7, label='Predicted values')
plt.plot(XX[:, 0], probability_0, color='crimson', lw=2, alpha=0.7, ls='--', label='Probability of being 0')
plt.plot(XX[:, 0], probability_1, color='deepskyblue', lw=2, alpha=0.7, label='Probability of being 1')
plt.legend()
plt.show()
Thanks to JohanC's help I was able to get it. Clearly my answer is heavily inspired by his posting. I adjusted the code a bit so it displays the model over the bar graph rather than the scatter plot. I tried to upvote you but I don't have enough karma.
Here is what I used and the result:
# logistic regression
X = dfx['home_diff'].values
y = dfx['away_win'].values
X = X.reshape(-1, 1)
logreg = LogisticRegression()
logreg.fit(X, y)
prediction = logreg.predict(X)
probability_0, probability_1 = logreg.predict_proba(X).T
plt.plot(X[:, 0], probability_1, color='blue', lw=2, alpha=0.6, label='Probability of being 1')
plt.bar(prob_df['diff'], prob_df['full_win_prob'], color='orange')
plt.legend()
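One thing to watch out for: plt.plot connects points in the order they appear, so if home_diff isn't sorted the curve can zigzag. A minimal sketch of a smoother alternative, evaluating the fitted model on a dense, sorted grid (assuming X and logreg from the code above):
import numpy as np
# evaluate the model on a sorted grid instead of the raw (possibly unsorted) data
XX = np.linspace(X.min(), X.max(), 500).reshape(-1, 1)
plt.plot(XX[:, 0], logreg.predict_proba(XX)[:, 1], color='blue', lw=2, label='Probability of being 1')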
I have an assignment in which I need to compare my own multi-class logistic regression and the built-in SKlearn one.
As part of it, I need to plot the decision boundaries of each, on the same figure (for 2,3, and 4 classes separately).
This is my model's decision boundaries for 3 classes:
Made with this code:
x1_min, x1_max = X[:,0].min()-.5, X[:,0].max()+.5
x2_min, x2_max = X[:,1].min()-.5, X[:,1].max()+.5
xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
grid = np.c_[xx1.ravel(), xx2.ravel()]
for i in range(len(ws)):
    probs = ol.predict_prob(grid, ws[i]).reshape(xx1.shape)
    plt.contour(xx1, xx2, probs, [0.5], linewidths=1, colors='green')
where:
ol - my own logistic regression model
ws - the current weights
That's how I tried to plot the Sklearn boundaries:
for i in range(len(clf.coef_)):
    w = clf.coef_[i]
    a = -w[0] / w[1]
    xx = np.linspace(x1_min, x1_max)
    yy = a * xx - (clf.intercept_[0]) / w[1]
    plt.plot(xx, yy, 'k-')
Resulting in:
I understand that it's due to the 1-dim vs 2-dim grids, but I can't work out how to solve it.
I also tried to use the built-in DecisionBoundaryDisplay, but I couldn't figure out how to plot it together with my boundaries; besides, it doesn't plot only the lines, it also paints the whole background in the corresponding class color.
A couple of fixes:
Change clf.intercept_[0] to clf.intercept_[i]
If the xlimits and ylimits in the plot look strange, you can constrain them.
ax.set_xlim([x1_min, x1_max])
ax.set_ylim([x2_min, x2_max])
MRE:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
X, y = make_blobs(n_features=2, centers=3, random_state=42)
fig, ax = plt.subplots(1, 2)
x1_min, x1_max = X[:,0].min()-.5, X[:,0].max()+.5
x2_min, x2_max = X[:,1].min()-.5, X[:,1].max()+.5
def draw_coef_lines(clf, X, y, ax, title):
    for i in range(len(clf.coef_)):
        w = clf.coef_[i]
        a = -w[0] / w[1]
        xx = np.linspace(x1_min, x1_max)
        yy = a * xx - (clf.intercept_[i]) / w[1]
        ax.plot(xx, yy, 'k-')
    ax.scatter(X[:, 0], X[:, 1], c=y)
    ax.set_xlim([x1_min, x1_max])
    ax.set_ylim([x2_min, x2_max])
    ax.set_title(title)
clf1 = LogisticRegression().fit(X, y)
clf2 = LogisticRegression(multi_class="ovr").fit(X, y)
draw_coef_lines(clf1, X, y, ax[0], "Multinomial")
draw_coef_lines(clf2, X, y, ax[1], "OneVsRest")
plt.show()
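If you still want to use the built-in DecisionBoundaryDisplay mentioned in the question, it can draw only the boundary lines (no filled background) by choosing plot_method="contour". A minimal sketch, assuming scikit-learn >= 1.1 and reusing clf1, X and ax[0] from the MRE above:
from sklearn.inspection import DecisionBoundaryDisplay
# plot_method="contour" draws only the class-boundary lines instead of filling the background
DecisionBoundaryDisplay.from_estimator(
    clf1, X, response_method="predict",
    plot_method="contour", colors="green", ax=ax[0],
)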
I'm having some trouble plotting the image that's in my head.
I want to visualize the Kernel-trick with Support Vector Machines. So I made some two-dimensional data consisting of two circles (an inner and an outer circle) which should be separated by a hyperplane. Obviously this isn't possible in two dimensions - so I transformed them into 3D. Let n be the number of samples. Now I have an (n,3)-array (3 columns, n rows) X of data points and an (n,1)-array y with labels. Using sklearn I get the linear classifier via
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
I already plot the data points as scatter plot via
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
Now I want to plot the separating hyperplane as a surface plot. My problem here is the missing explicit representation of the hyperplane, because the decision function only yields it implicitly via decision_function = 0. Therefore I need to plot the level set (at level 0) of a 4-dimensional object.
Since I'm not a Python expert, I would appreciate it if somebody could help me out! I know this isn't really the usual "style" of using an SVM, but I need this image as an illustration for my thesis.
Edit: my current "code"
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs, make_circles
from tikzplotlib import save as tikz_save
plt.close('all')
# we create 50 separable points
#X, y = make_blobs(n_samples=40, centers=2, random_state=6)
X, y = make_circles(n_samples=50, factor=0.5, random_state=4, noise=.05)
X2, y2 = make_circles(n_samples=50, factor=0.2, random_state=5, noise=.08)
X = np.append(X,X2, axis=0)
y = np.append(y,y2, axis=0)
# shift X to [0,2]x[0,2]
X = np.array([[item[0] + 1, item[1] + 1] for item in X])
X[X<0] = 0.01
clf = svm.SVC(kernel='rbf', C=1000)
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
# plot the decision function
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5, linestyles=['--','-','--'])
# plot support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
linewidth=1, facecolors='none', edgecolors='k')
################## KERNEL TRICK - 3D ##################
trans_X = np.array([[item[0]**2, item[1]**2, np.sqrt(2*item[0]*item[1])] for item in X])
fig = plt.figure()
ax = plt.axes(projection ="3d")
# creating scatter plot
ax.scatter3D(trans_X[:,0],trans_X[:,1],trans_X[:,2], c = y, cmap=plt.cm.Paired)
clf2 = svm.SVC(kernel='linear', C=1000)
clf2.fit(trans_X, y)
ax = plt.gca(projection='3d')
xlim = ax.get_xlim()
ylim = ax.get_ylim()
zlim = ax.get_zlim()
### from here i don't know what to do ###
xx = np.linspace(xlim[0], xlim[1], 3)
yy = np.linspace(ylim[0], ylim[1], 3)
zz = np.linspace(zlim[0], zlim[1], 3)
ZZ, YY, XX = np.meshgrid(zz, yy, xx)
xyz = np.vstack([XX.ravel(), YY.ravel(), ZZ.ravel()]).T
Z = clf2.decision_function(xyz).reshape(XX.shape)
#ax.contour(XX, YY, ZZ, Z, colors='k', levels=[-1, 0, 1], alpha=0.5, linestyles=['--','-','--'])
Desired Output
I want to get something like that.
In general I want to reconstruct what they do in this article, especially "Non-linear transformations".
Part of your question is addressed in this question on linear-kernel SVMs. It's only a partial answer, because this representation works for linear kernels only: with a linear kernel the hyperplane coefficients are directly accessible from the estimator.
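For that linear-kernel case, the plane has an explicit representation: coef_ gives the normal vector (w0, w1, w2) and intercept_ the offset b, so you can solve w0*x + w1*y + w2*z + b = 0 for z and draw it with plot_surface. A minimal sketch, assuming clf2 was fitted with kernel='linear' on the 3-D points trans_X and ax is the 3-D axes, as in the question's code:
w, b = clf2.coef_[0], clf2.intercept_[0]
# solve the plane equation for z
plane_z = lambda x, y: (-b - w[0] * x - w[1] * y) / w[2]
gx, gy = np.meshgrid(np.linspace(trans_X[:, 0].min(), trans_X[:, 0].max(), 20),
                     np.linspace(trans_X[:, 1].min(), trans_X[:, 1].max(), 20))
ax.plot_surface(gx, gy, plane_z(gx, gy), alpha=0.3)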
Another solution is to find the isosurface with marching_cubes
This solution requires installing the scikit-image toolkit (https://scikit-image.org), which can find an isosurface at a given value (here I used 0, since it represents the distance to the hyperplane) from the mesh grid of the 3D coordinates.
In the code below (copied from yours), I implement the idea for any kernel (in the example, I used the RBF kernel), and the output is shown beneath the code. Please consider my footnote about 3D plotting with matplotlib, which may be another issue in your case.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from skimage import measure
from sklearn.datasets import make_blobs, make_circles
from tikzplotlib import save as tikz_save
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
plt.close('all')
# we create 50 separable points
#X, y = make_blobs(n_samples=40, centers=2, random_state=6)
X, y = make_circles(n_samples=50, factor=0.5, random_state=4, noise=.05)
X2, y2 = make_circles(n_samples=50, factor=0.2, random_state=5, noise=.08)
X = np.append(X,X2, axis=0)
y = np.append(y,y2, axis=0)
# shift X to [0,2]x[0,2]
X = np.array([[item[0] + 1, item[1] + 1] for item in X])
X[X<0] = 0.01
clf = svm.SVC(kernel='rbf', C=1000)
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
# plot the decision function
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5, linestyles=['--','-','--'])
# plot support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
linewidth=1, facecolors='none', edgecolors='k')
################## KERNEL TRICK - 3D ##################
trans_X = np.array([[item[0]**2, item[1]**2, np.sqrt(2*item[0]*item[1])] for item in X])
fig = plt.figure()
ax = plt.axes(projection ="3d")
# creating scatter plot
ax.scatter3D(trans_X[:,0],trans_X[:,1],trans_X[:,2], c = y, cmap=plt.cm.Paired)
clf2 = svm.SVC(kernel='rbf', C=1000)
clf2.fit(trans_X, y)
# note: clf2 uses an RBF kernel here, so coef_ / intercept_ are not available and the plane
# has no explicit equation; the isosurface found below is used instead
ax = plt.gca(projection='3d')
xlim = ax.get_xlim()
ylim = ax.get_ylim()
zlim = ax.get_zlim()
### from here i don't know what to do ###
xx = np.linspace(xlim[0], xlim[1], 50)
yy = np.linspace(ylim[0], ylim[1], 50)
zz = np.linspace(zlim[0], zlim[1], 50)
XX, YY, ZZ = np.meshgrid(xx, yy, zz)
xyz = np.vstack([XX.ravel(), YY.ravel(), ZZ.ravel()]).T
Z = clf2.decision_function(xyz).reshape(XX.shape)
# find isosurface with marching cubes
dx = xx[1] - xx[0]
dy = yy[1] - yy[0]
dz = zz[1] - zz[0]
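# note (newer library versions): in scikit-image >= 0.19 marching_cubes_lewiner was removed;
# the equivalent call is measure.marching_cubes(Z, 0, spacing=(1, 1, 1), step_size=2)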
verts, faces, _, _ = measure.marching_cubes_lewiner(Z, 0, spacing=(1, 1, 1), step_size=2)
verts *= np.array([dx, dy, dz])
verts -= np.array([xlim[0], ylim[0], zlim[0]])
# add as Poly3DCollection
mesh = Poly3DCollection(verts[faces])
mesh.set_facecolor('g')
mesh.set_edgecolor('none')
mesh.set_alpha(0.3)
ax.add_collection3d(mesh)
ax.view_init(20, -45)
plt.savefig('kerneltrick')
Running the code produces the following image with Matplotlib, where the green semi-transparent surface represents the non-linear decision boundary.
Footnote: 3D plotting with matplotlib
Note that Matplotlib 3D is not able to manage the "depth" of objects in some cases, because it can conflict with the zorder of the object. This is why the hyperplane sometimes appears to be plotted "on top of" the points, even though it should be "behind" them. This issue is a known limitation discussed in the matplotlib 3d documentation and in this answer.
If you want to have better rendering results, you may want to use Mayavi, as recommended by the Matplotlib developers, or any other 3D Python plotting library.
I am following this script from scikit-learn to plot the margins and the hyperplane for SVC. I'm executing each line in a different cell, in the same order, i.e. cell 1 for line 1, cell 2 for line 2, and so on. When I finally get to the plotting part, say in cell k (the last cell in my notebook, which holds the last 7 lines of the code), I get UserWarning: No contour levels were found within the data range. and the plot shown in the link above does not appear.
However, when I execute all the lines in the same cell, the code works as expected. What am I doing wrong here?
The code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs
# we create 40 separable points
X, y = make_blobs(n_samples=40, centers=2, random_state=6)
# fit the model, don't regularize for illustration purposes
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
# plot the decision function
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
linestyles=['--', '-', '--'])
# plot support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
linewidth=1, facecolors='none', edgecolors='k')
plt.show()
Using SVM with the sklearn library, I would like to plot the data with each label represented by its own color. I don't want to color the points; instead I want to fill the areas with colors.
I now have:
d_pred, d_train_std, d_test_std, l_train, l_test
d_pred are the predicted labels.
I would like to plot d_pred together with d_train_std, which has shape (70000, 2), with the first column on the X-axis and the second column on the Y-axis.
Thank you.
You cannot visualize the decision surface for a lot of features. This is because the dimensions will be too many and there is no way to visualize an N-dimensional surface.
However, you can use 2 features and plot nice decision surfaces as follows.
I have also written an article about this here:
https://towardsdatascience.com/support-vector-machines-svm-clearly-explained-a-python-tutorial-for-classification-problems-29c539f3ad8?source=friends_link&sk=80f72ab272550d76a0cc3730d7c8af35
Case 1: 2D plot for 2 features and using the iris dataset
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
iris = datasets.load_iris()
X = iris.data[:, :2] # we only take the first two features.
y = iris.target
def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy
def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out
model = svm.SVC(kernel='linear')
clf = model.fit(X, y)
fig, ax = plt.subplots()
# title for the plots
title = ('Decision surface of linear SVC ')
# Set-up grid for plotting.
X0, X1 = X[:, 0], X[:, 1]
xx, yy = make_meshgrid(X0, X1)
plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('y label here')
ax.set_xlabel('x label here')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title(title)
ax.legend()
plt.show()
Case 2: 3D plot for 3 features and using the iris dataset
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from mpl_toolkits.mplot3d import Axes3D
iris = datasets.load_iris()
X = iris.data[:, :3] # we only take the first three features.
Y = iris.target
#make it binary classification problem
X = X[np.logical_or(Y==0,Y==1)]
Y = Y[np.logical_or(Y==0,Y==1)]
model = svm.SVC(kernel='linear')
clf = model.fit(X, Y)
# The equation of the separating plane is given by all x so that np.dot(svc.coef_[0], x) + b = 0.
# Solve for w3 (z)
z = lambda x,y: (-clf.intercept_[0]-clf.coef_[0][0]*x -clf.coef_[0][1]*y) / clf.coef_[0][2]
tmp = np.linspace(-5,5,30)
x,y = np.meshgrid(tmp,tmp)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot3D(X[Y==0,0], X[Y==0,1], X[Y==0,2],'ob')
ax.plot3D(X[Y==1,0], X[Y==1,1], X[Y==1,2],'sr')
ax.plot_surface(x, y, z(x,y))
ax.view_init(30, 60)
plt.show()
It can be difficult to plot the decision function itself in 3D. An easy way to get a visualization is to generate a large number of points that cover your point space, run them through your learned function (my_model.predict), keep the points that land inside the class, and visualize them. The more points you add, the more defined the boundary will be.
Here's my code that does what @Christian Tuchez describes:
outputs = my_clf.predict(X_test)  # X_test: your test feature array (e.g. d_test_std)
hits = []
for i in range(outputs.size):
    if outputs[i] == 1:
        hits.append(i)  # save the index where it's 1
This saves the index of all the points that hit in the function (saved in the "hits" list). You can probably accomplish this without a loop, I just found it easiest for me.
Then to display just those points, you'd write something like this:
ax.scatter(X_test[hits, 0], X_test[hits, 1], X_test[hits, 2], c="cyan", s=2, edgecolor=None)
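As noted above, the loop isn't strictly needed; a minimal sketch of the same selection using boolean indexing (again assuming X_test is the test feature array and ax is a 3-D axes):
inside = X_test[my_clf.predict(X_test) == 1]  # rows predicted as class 1
ax.scatter(inside[:, 0], inside[:, 1], inside[:, 2], c="cyan", s=2, edgecolor=None)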
I'm trying to merge two plots in one:
http://scikit-learn.org/stable/auto_examples/linear_model/plot_sgd_iris.html
http://scikit-learn.org/stable/auto_examples/ensemble/plot_voting_decision_regions.html#sphx-glr-auto-examples-ensemble-plot-voting-decision-regions-py
In the left plot I want to display the decision boundary with the hyperplane corresponding to the OVA classifiers and in the right plot I would like to show the decision probabilities.
This is the code so far:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sn
from sklearn import datasets
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.svm import SVC
def plot_hyperplane(c, color, fitted_model):
    """
    Plot the one-against-all classifiers for the given model.
    Parameters
    --------------
    c : index of the hyperplane to be plotted
    color : color to be used when drawing the line
    fitted_model : the fitted model
    """
    xmin, xmax = plt.xlim()
    ymin, ymax = plt.ylim()
    try:
        coef = fitted_model.coef_
        intercept = fitted_model.intercept_
    except:
        return
    def line(x0):
        return (-(x0 * coef[c, 0]) - intercept[c]) / coef[c, 1]
    plt.plot([xmin, xmax], [line(xmin), line(xmax)], ls="--", color=color, zorder=3)
def plot_decision_boundary(X, y, fitted_model, features, targets):
    """
    This function plots a model decision boundary as well as it tries to plot
    the decision probabilities, if available.
    Requires a model fitted with two features only.
    Parameters
    --------------
    X : the data to learn
    y : the classification labels
    fitted_model : the fitted model
    """
    cmap = plt.get_cmap('Set3')
    prob = cmap
    colors = [cmap(i) for i in np.linspace(0, 1, len(fitted_model.classes_))]
    plt.figure(figsize=(9.5, 5))
    for i, plot_type in enumerate(['Decision Boundary', 'Decision Probabilities']):
        plt.subplot(1, 2, i+1)
        mesh_step_size = 0.01  # step size in the mesh
        x_min, x_max = X[:, 0].min() - .1, X[:, 0].max() + .1
        y_min, y_max = X[:, 1].min() - .1, X[:, 1].max() + .1
        xx, yy = np.meshgrid(np.arange(x_min, x_max, mesh_step_size), np.arange(y_min, y_max, mesh_step_size))
        # First plot, predicted results using the given model
        if i == 0:
            Z = fitted_model.predict(np.c_[xx.ravel(), yy.ravel()])
            for h, color in zip(fitted_model.classes_, colors):
                plot_hyperplane(h, color, fitted_model)
        # Second plot, predicted probabilities using the given model
        else:
            prob = 'RdYlBu_r'
            try:
                Z = fitted_model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
            except:
                plt.text(0.4, 0.5, 'Probabilities Unavailable', horizontalalignment='center',
                         verticalalignment='center', transform=plt.gca().transAxes, fontsize=12)
                plt.axis('off')
                break
        Z = Z.reshape(xx.shape)
        # Display Z
        plt.imshow(Z, interpolation='nearest', cmap=prob, alpha=0.5,
                   extent=(x_min, x_max, y_min, y_max), origin='lower', zorder=1)
        # Plot the data points
        for i, color in zip(fitted_model.classes_, colors):
            idx = np.where(y == i)
            plt.scatter(X[idx, 0], X[idx, 1], facecolor=color, edgecolor='k', lw=1,
                        label=iris.target_names[i], cmap=cmap, alpha=0.8, zorder=2)
        plt.title(plot_type + '\n' +
                  str(fitted_model).split('(')[0] + ' Test Accuracy: ' + str(np.round(fitted_model.score(X, y), 5)))
        plt.xlabel(features[0])
        plt.ylabel(features[1])
        plt.gca().set_aspect('equal')
    plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
    plt.tight_layout()
    plt.subplots_adjust(top=0.9, bottom=0.08, wspace=0.02)
    plt.show()
if __name__ == '__main__':
    iris = datasets.load_iris()
    X = iris.data[:, [0, 2]]
    y = iris.target
    scaler = preprocessing.StandardScaler().fit_transform(X)
    clf1 = DecisionTreeClassifier(max_depth=4)
    clf2 = KNeighborsClassifier(n_neighbors=7)
    clf3 = SVC(kernel='rbf', probability=True)
    clf4 = SGDClassifier(alpha=0.001, n_iter=100).fit(X, y)
    clf1.fit(X, y)
    clf2.fit(X, y)
    clf3.fit(X, y)
    clf4.fit(X, y)
    plot_decision_boundary(X, y, clf1, iris.feature_names, iris.target_names[[0, 2]])
    plot_decision_boundary(X, y, clf2, iris.feature_names, iris.target_names[[0, 2]])
    plot_decision_boundary(X, y, clf3, iris.feature_names, iris.target_names[[0, 2]])
    plot_decision_boundary(X, y, clf4, iris.feature_names, iris.target_names[[0, 2]])
And the results:
As can be seen, for the last example (clf4 in the given code) the hyperplanes are plotted in the wrong position, and so far I have been unable to correct this. They should be translated to the correct range of the features used to fit the model.
Thanks.
Apparently, the problem is that the endpoints of the dashed lines representing the hyperplanes are not consistent with the final (expected) xlim and ylim. A good thing about this case is that you already have x_min, x_max, y_min, y_max defined, so use them to fix xlim and ylim by applying the following three lines before plotting the hyperplanes (specifically, add them just before your comment line # First plot, predicted results using the given model).
ax = plt.gca()
ax.set_xlim((x_min, x_max), auto=False)
ax.set_ylim((y_min, y_max), auto=False)
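In context, the placement inside plot_decision_boundary would look roughly like this (a sketch of the relevant lines only, not the full function):
xx, yy = np.meshgrid(np.arange(x_min, x_max, mesh_step_size), np.arange(y_min, y_max, mesh_step_size))
# freeze the axis limits before any hyperplane is drawn
ax = plt.gca()
ax.set_xlim((x_min, x_max), auto=False)
ax.set_ylim((y_min, y_max), auto=False)
# First plot, predicted results using the given model
if i == 0:
    Z = fitted_model.predict(np.c_[xx.ravel(), yy.ravel()])
    for h, color in zip(fitted_model.classes_, colors):
        plot_hyperplane(h, color, fitted_model)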