Multiple confusion matrices with plot_confusion_matrix - python

I want to plot three confusion matrices in the same window.
I used plot_confusion_matrix because I have an np.array holding the data for each confusion matrix (I took several confusion matrices and performed some operations on them).
When I plot them, I don't see the labels on any of them.
a1 = np.zeros(shape=(19, 19))
a2 = np.zeros(shape=(19, 19))
a3 = np.zeros(shape=(19, 19))
for element in total_cm_svm:
    a1 = a1 + element
for element in total_cm_lda:
    a2 = a2 + element
for element in total_cm_etc:
    a3 = a3 + element
fig, axes = plt.subplots(nrows=1, ncols=3)
cm = plot_confusion_matrix(conf_mat=a1, colorbar=False, class_names=label_ax, axis=axes[0])
cm1 = plot_confusion_matrix(conf_mat=a2, colorbar=False, class_names=label_ax, axis=axes[1])
cm2 = plot_confusion_matrix(conf_mat=a3, colorbar=False, class_names=label_ax, axis=axes[2])
plt.show()
total_cm_svm, total_cm_lda and total_cm_etc are lists containing confusion matrices, each computed as confusion_matrix(y_test, y_pred, labels=label_ax).
I would like to use labels = ['0','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18'] and show the axis names on each plot.
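One possible way to get labeled, titled panels is to draw each matrix with plain matplotlib instead of plot_confusion_matrix. The sketch below is not the asker's method: the random matrices stand in for a1, a2 and a3, and the panel titles and axis names are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the summed matrices a1, a2, a3 from the question.
rng = np.random.default_rng(0)
n_classes = 19
matrices = [rng.integers(0, 50, size=(n_classes, n_classes)) for _ in range(3)]
titles = ["SVM", "LDA", "ETC"]  # assumed panel titles
labels = [str(i) for i in range(n_classes)]

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(18, 6))
for ax, cm, title in zip(axes, matrices, titles):
    ax.imshow(cm, cmap="Blues")
    # One tick per class, labeled explicitly.
    ax.set_xticks(range(n_classes))
    ax.set_yticks(range(n_classes))
    ax.set_xticklabels(labels, rotation=90)
    ax.set_yticklabels(labels)
    ax.set_xlabel("predicted label")
    ax.set_ylabel("true label")
    ax.set_title(title)
fig.tight_layout()
```

Calling plt.show() afterwards (with an interactive backend) displays the three panels side by side, each with its own tick labels and axis names.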

Related

plot a point within ridgeplots

Given the following dataframe:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import joypy
sample1 = np.random.normal(5, 10, size = (200, 5))
sample2 = np.random.normal(40, 5, size = (200, 5))
sample3 = np.random.normal(10, 5, size = (200, 5))
b = []
for i in range(0, 3):
    a = "Sample" + "{}".format(i)
    lst = np.repeat(a, 200)
    b.append(lst)
b = np.asarray(b).reshape(600,1)
data_arr = np.vstack((sample1,sample2, sample3))
df1 = pd.DataFrame(data = data_arr, columns = ["foo", "bar", "qux", "corge", "grault"])
df1.insert(0, column="sampleNo", value = b)
I am able to produce the following ridgeplot:
fig, axes = joypy.joyplot(df1, column = ['foo'], by = 'sampleNo',
                          alpha=0.6,
                          linewidth=.5,
                          linecolor='w',
                          fade=True)
Now, let's say I have the following vector:
vectors = np.asarray([10, 40, 50])
How do I plot each of those points on the density plots? E.g., on the distribution plot of sample 1, I'd like to have a single point (or line) at 10; on sample 2 at 40, etc.
I've tried to use axvline, and I sort of expected this to work, but no luck:
for ax in axes:
    ax.axvline(vectors(ax))
I am not sure if what I want is possible at all...
You almost had the correct approach.
axes holds 4 Axes objects, in order: the three stacked plots from top to bottom, then the big one that contains the other three. So,
for ax, v in zip(axes, vectors):
    ax.axvline(v)
zip() will only zip up to the shorter iterable, which is vectors. So, it will match each point from vectors with each axis from the stacked plots.
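The truncating behaviour of zip() can be checked without joypy. In this minimal sketch, four plain matplotlib axes stand in for the list that joyplot returns (that stand-in is an assumption for illustration); only the first three receive a vertical line.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

vectors = np.asarray([10, 40, 50])

# Stand-in for joypy's return value: three stacked axes plus one extra axis.
fig, axes = plt.subplots(nrows=4, ncols=1)

# zip() stops after the third pair because vectors has only three entries.
for ax, v in zip(axes, vectors):
    ax.axvline(v, color="k")

lines_per_axis = [len(ax.lines) for ax in axes]
print(lines_per_axis)  # → [1, 1, 1, 0]
```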

For Looping Subplots with Datashader / Holoviews / Bokeh

The general notation for creating subplots with datashade/holoviews/Bokeh is using a '+' notation:
plot = plot1 + plot2 + plot3
However, I'm trying to generate plots inside a for loop like I can with Matplotlib. In Seaborn I can just do this to create subplots while incrementing through the dataframe:
fig, axes = plt.subplots(nrows=len(DF_cols), ncols=1, figsize=(10,10), sharex = True)
count = 0
for i in DF_cols:
    sns.lineplot(data=df[i], label=i, ax=axes[count])
    count += 1
return fig, axes
How do I convert the method I have below for Datashader/HoloViews into a more automated process?
c1 = hv.Curve(df['T'])
c2 = hv.Curve(df['A'])
c3 = hv.Curve(df['B'])
c4 = hv.Curve(df['C'])
plot1 = dynspread(datashade(c1))
plot2 = dynspread(datashade(c2))
plot3 = dynspread(datashade(c3))
plot4 = dynspread(datashade(c4))
plot = (plot1 + plot2 + plot3 + plot4).cols(1)
plot
My initial approach was to build a custom string that mimics the normal Datashader notation and run exec() on it, but that doesn't work inside functions, or it eventually runs into other errors.
You can programmatically create layouts by passing a list of elements to hv.Layout. In this case, the following line should do the trick:
hv.Layout([plot1, plot2, plot3, plot4]).cols(1)

Seaborn barplot legend labels lose color

I have a seaborn barplot. When I try to use plt.legend("Strings") to rename the legend labels, the labels lose their colors. I need to change the labels while keeping the color coding, but I could not find out how to do this after searching for an answer.
The hue values 1-4 in the legend run from 1 = very interested in politics to 4 = not at all interested. I want to change the legend labels from 1-4 to these descriptions of how interested people are in politics.
My code is:
Packages
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
I didn't know how to create the dataframe in a simpler way, so I did this:
a1 = {'Reads Newspapers': 0, 'Interest in Politics': 1}
a2 = {'Reads Newspapers': 0, 'Interest in Politics': 2}
a3 = {'Reads Newspapers': 0, 'Interest in Politics': 3}
a4 = {'Reads Newspapers': 0, 'Interest in Politics': 4}
b1 = {'Reads Newspapers': 1, 'Interest in Politics': 1}
b2 = {'Reads Newspapers': 1, 'Interest in Politics': 2}
b3 = {'Reads Newspapers': 1, 'Interest in Politics': 3}
b4 = {'Reads Newspapers': 1, 'Interest in Politics': 4}
df1 = pd.DataFrame(data=a1, index=range(1))
df1 = pd.concat([df1]*23)
df2 = pd.DataFrame(data=a2, index=range(1))
df2 = pd.concat([df2]*98)
df3 = pd.DataFrame(data=a3, index=range(1))
df3 = pd.concat([df3]*99)
df4 = pd.DataFrame(data=a4, index=range(1))
df4 = pd.concat([df4]*18)
b1 = pd.DataFrame(data=b1, index=range(1))
b1 = pd.concat([b1]*468)
b2 = pd.DataFrame(data=b2, index=range(1))
b2 = pd.concat([b2]*899)
b3 = pd.DataFrame(data=b3, index=range(1))
b3 = pd.concat([b3]*413)
b4 = pd.DataFrame(data=b4, index=range(1))
b4 = pd.concat([b4]*46)
data = pd.concat([df1,df2,df3,df4,b1,b2,b3,b4])
The actual plotting that produces the problem:
plt.figure(figsize=(10,8))
g = sns.barplot(data=data, x='Reads Newspapers', estimator=len,y='Interest in Politics', hue='Interest in Politics' )
plt.ylabel("Sample Size")
ax = plt.subplot()
ax = ax.set_xticklabels(["No","Yes"])
#plt.legend(["very interested","somewhat interested", "only a little interested", "not at all interested "],)
#plt.savefig('Newspaper policy')
I tried using plt.legend, but then the legend labels lose their colors and become plain strings with no color association, which is even worse than before.
I have now edited in the entirety of my script.
https://github.com/HenrikMorpheus/Newspaper-reading-survey/blob/master/politicalinterest.ipynb
It loads with an error for some reason I don't understand, but you should be able to open the notebook in Jupyter.
Use dedicated dataframe column
One option is to create a new column in the dataframe that contains the respective labels and use this column as the input for hue, so that the desired legend labels are created automatically.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame({"reads" : ["Yes"] * 250 + ["No"]*150,
"interest" : [4,2,2,2,2,3,3,1,1,1]*40})
labels=["very interested","somewhat interested",
"only a little interested", "not at all interested"]
# Create new dataframe column with the labels instead of numbers
df["Interested in politics"] = df["interest"].map(dict(zip(range(1,5), labels)))
plt.figure(figsize=(10,8))
# Use newly created dataframe column as hue
ax = sns.barplot(data=df, x='reads', estimator=len,y='interest',
hue='Interested in politics', hue_order=labels)
ax.set_ylabel("Sample Size")
plt.show()
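The mapping step is the core of this approach, so here it is in isolation. This is a small sketch on a four-row dataframe made up for illustration; code_to_label is a hypothetical name for the dictionary that the one-liner above builds inline.

```python
import pandas as pd

labels = ["very interested", "somewhat interested",
          "only a little interested", "not at all interested"]
df = pd.DataFrame({"interest": [4, 2, 1, 3]})

# dict(zip(range(1, 5), labels)) pairs the codes 1..4 with the label texts.
code_to_label = dict(zip(range(1, 5), labels))
df["Interested in politics"] = df["interest"].map(code_to_label)

print(df["Interested in politics"].tolist())
# → ['not at all interested', 'somewhat interested', 'very interested', 'only a little interested']
```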
Setting the labels manually.
You may obtain the handles and labels for the legend via ax.get_legend_handles_labels() and use them to create a new legend with the labels from the list.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame({"reads" : ["Yes"] * 250 + ["No"]*150,
"interest" : [4,2,2,2,2,3,3,1,1,1]*40})
labels=["very interested","somewhat interested",
"only a little interested", "not at all interested"]
plt.figure(figsize=(10,8))
ax = sns.barplot(data=df, x='reads', estimator=len,y='interest', hue='interest' )
ax.set_ylabel("Sample Size")
h, l = ax.get_legend_handles_labels()
ax.legend(h, labels, title="Interested in politics")
plt.show()
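The same handle-reuse idea can be tried with plain matplotlib bars, without seaborn. In this sketch the bar heights and colors are made up; the point is that the handles returned by get_legend_handles_labels() keep their colors when they are paired with new label strings.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Four bars, each with its own color and a numeric legend label.
for code, color in zip([1, 2, 3, 4], ["C0", "C1", "C2", "C3"]):
    ax.bar(code, 10 * code, color=color, label=str(code))

new_labels = ["very interested", "somewhat interested",
              "only a little interested", "not at all interested"]

# Reuse the existing (colored) handles, but swap in the new label texts.
handles, old_labels = ax.get_legend_handles_labels()
legend = ax.legend(handles, new_labels, title="Interested in politics")
```

Calling plt.legend with only a list of strings, by contrast, leaves matplotlib to guess which artists the strings belong to, which may be why the colors went missing in the question.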

Arrangement of pie charts using matplotlib subplot

I have 7 pie charts (4 are listed below). I am trying to create a dashboard with 4 pie charts in the first row and 3 pie charts in the second row. I am not sure where I am going wrong with the code below. Are there any alternatives to achieve this? Any help would be appreciated.
from matplotlib import pyplot as PLT
fig = PLT.figure()
ax1 = fig.add_subplot(221)
line1 = plt.pie(df_14,colors=("g","r"))
plt.title('EventLogs')
ax1 = fig.add_subplot(223)
line2 = plt.pie(df_24,colors=("g","r"))
plt.title('InstalledApp')
ax1 = fig.add_subplot(222)
line3 = plt.pie(df_34,colors=("g","r"))
plt.title('Drive')
ax1 = fig.add_subplot(224)
line4 = plt.pie(df_44,colors=("g","r"))
plt.title('SQL Job')
ax1 = fig.add_subplot(321)
line5 = plt.pie(df_54,colors=("g","r"))
plt.title('Administrators')
ax2 = fig.add_subplot(212)
PLT.show()
A better method, which I always use and which is more intuitive (at least for me), is subplot2grid:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(18,10), dpi=1600)
# the shape (2, 4) in the next line gives a grid with 2 rows and 4 columns;
# (0, 0) is the upper-left cell of that grid
ax1 = plt.subplot2grid((2,4),(0,0))
plt.pie(df_14,colors=("g","r"))
plt.title('EventLogs')
#next one
ax1 = plt.subplot2grid((2, 4), (0, 1))
plt.pie(df_24,colors=("g","r"))
plt.title('InstalledApp')
You can go on like this; when you want to switch rows, just write the coordinate as (1, 0), which is the second row, first column.
An example with 2 rows and 2 cols -
fig = plt.figure(figsize=(18,10), dpi=1600)
#2 rows 2 columns
#first row, first column
ax1 = plt.subplot2grid((2,2),(0,0))
plt.pie(df.a,colors=("g","r"))
plt.title('EventLogs')
#first row sec column
ax1 = plt.subplot2grid((2,2), (0, 1))
plt.pie(df.a,colors=("g","r"))
plt.title('EventLog_2')
#Second row first column
ax1 = plt.subplot2grid((2,2), (1, 0))
plt.pie(df.a,colors=("g","r"))
plt.title('InstalledApp')
#second row second column
ax1 = plt.subplot2grid((2,2), (1, 1))
plt.pie(df.a,colors=("g","r"))
plt.title('InstalledApp_2')
Hope this helps!
Use this if you want to arrange subplots more quickly
In addition to hashcode55's code:
If you want to avoid making multiple DataFrames, I recommend assigning integers to your feature column and iterating through those. Make sure you have a dictionary that maps the integers to the feature names, though.
Here I am making a plot with 4 columns and 2 rows.
fig = plt.figure(figsize=(25,10)) #,dpi=1600)
n_cols = 4 # number of columns in the grid
for i in range(7): # i is the integer code of the feature
    # row (r) and column (c) of the i-th plot:
    # i = 0..3 fill the first row, i = 4..6 fill the second row
    r, c = divmod(i, n_cols)
    ax1 = plt.subplot2grid((2, n_cols), (r, c))
    plt.pie(data[data.feature == i].something , labels = ..., autopct='%.0f%%')
    plt.title(feature[i])
plt.show()
This code will go on until the amount of features is exhausted.
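For a version that runs as-is, here is a minimal sketch of the 4 + 3 layout. The pie values are made up, and the last two titles are placeholders because the question does not name the sixth and seventh charts; divmod turns the running index into a (row, column) pair.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical data: seven pairs of values, one pair per pie chart.
values = [(8, 2), (5, 5), (9, 1), (7, 3), (6, 4), (3, 7), (4, 6)]
titles = ["EventLogs", "InstalledApp", "Drive", "SQL Job",
          "Administrators", "Chart 6", "Chart 7"]  # last two names are made up

n_rows, n_cols = 2, 4
fig = plt.figure(figsize=(16, 8))
for i, (vals, title) in enumerate(zip(values, titles)):
    r, c = divmod(i, n_cols)  # i = 0..3 -> row 0, i = 4..6 -> row 1
    ax = plt.subplot2grid((n_rows, n_cols), (r, c), fig=fig)
    ax.pie(vals, colors=("g", "r"))
    ax.set_title(title)
```

The eighth grid cell simply stays empty, so the second row shows three charts aligned under the first three of the top row.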

Different color for each set in scatter plot on matplotlib [duplicate]

This question already has answers here:
Setting different color for each series in scatter plot on matplotlib
(8 answers)
Closed 7 years ago.
I have two sample sets drawn from multivariate normal distributions.
How can I set a different color for each set in a matplotlib scatter plot? E.g., plotting the values from A1 in blue and the values from A2 in red.
N= 4
A1 = np.random.multivariate_normal(mean=[1,-4], cov=[[2,-1],[-1,2]],size = N)
A2 = np.random.multivariate_normal(mean=[1,-3], cov=[[1,1.5],[1.5,3]],size= N)
>>> print(A1)
[[ 0.16820131 -2.14909926]
[ 0.57792273 -2.43727122]
[-0.06946973 -3.72143292]
[ 2.59454949 -5.34776438]]
>>> print(A2)
[[ 0.98396671 -1.68934158]
[-0.33756576 -3.28187214]
[ 1.49767632 -3.46575623]
[ 1.47036718 -1.58453858]]
Could someone help me? Thanks in advance.
This should work for you.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
N = 1000
A1 = np.random.multivariate_normal(mean=[1,-4], cov=[[2,-1],[-1,2]],size = N)
A2 = np.random.multivariate_normal(mean=[1,-3], cov=[[1,1.5],[1.5,3]],size= N)
fig, ax = plt.subplots()
ax.scatter(A1[:,0], A1[:,1], color="blue", alpha=0.2)
ax.scatter(A2[:,0], A2[:,1], color="red", alpha=0.2)
plt.show()
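With more than two sets, the same idea can be written as a loop. This is a sketch under assumptions: the dictionaries samples and colors are hypothetical names, and the sample size of 200 is arbitrary. Each set gets its own scatter call, color and legend entry.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
samples = {
    "A1": rng.multivariate_normal(mean=[1, -4], cov=[[2, -1], [-1, 2]], size=200),
    "A2": rng.multivariate_normal(mean=[1, -3], cov=[[1, 1.5], [1.5, 3]], size=200),
}
colors = {"A1": "blue", "A2": "red"}

fig, ax = plt.subplots()
for name, points in samples.items():
    # One scatter call per set, so each set keeps a single color and label.
    ax.scatter(points[:, 0], points[:, 1], color=colors[name], alpha=0.2, label=name)
ax.legend()
```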
