How to add legends to sns subplots? - python

Desired Outcome:
I am trying to add legends to the subplots.
I tried adding ax[0].legend() to the code, but it raises "No handles with labels found to put in legend."
fig, ax = plt.subplots(1, 2, figsize=(15, 6))
ax = ax.flatten()
sns.histplot(train[train['Survived'] == 0]['Age'], ax=ax[0])
sns.histplot(train[train['Survived'] == 1]['Age'], color='red', alpha=0.4, ax=ax[0])
sns.histplot(train[train['Survived'] == 0]['Fare'], ax=ax[1])
sns.histplot(train[train['Survived'] == 1]['Fare'], color='red', alpha=0.4, ax=ax[1])
plt.show()

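I know what to do now. Pass the labels to legend() before plt.show(), in the order the histograms were drawn (Survived == 0 first, then Survived == 1):
ax[0].legend(['Not Survived', 'Survived'])
ax[1].legend(['Not Survived', 'Survived'])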
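
Passing a list to legend() depends on the order in which the histograms were drawn. A sketch of an alternative is to attach a label= to each histogram as it is drawn and then call legend() with no arguments. The example uses plain matplotlib ax.hist with synthetic arrays standing in for the train DataFrame (the data, the bins value, and the Agg backend are assumptions for illustration); sns.histplot should accept the same label= keyword, since it forwards extra keywords to matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for train[train['Survived'] == 0][col] and == 1 (assumed data)
samples = {
    'Age': (rng.normal(30, 12, 300), rng.normal(28, 14, 200)),
    'Fare': (rng.exponential(25, 300), rng.exponential(45, 200)),
}

fig, ax = plt.subplots(1, 2, figsize=(15, 6))
for axis, name in zip(ax, ('Age', 'Fare')):
    not_survived, survived = samples[name]
    axis.hist(not_survived, bins=20, label='Not Survived')
    axis.hist(survived, bins=20, color='red', alpha=0.4, label='Survived')
    axis.set_xlabel(name)
    axis.legend()  # uses the labels attached to each histogram above

legend_texts = [t.get_text() for t in ax[0].get_legend().get_texts()]
print(legend_texts)
```

Because each label travels with its own histogram, reordering the plotting calls cannot mislabel the legend.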

Related

How can I implement plt.subplot correctly so my graphs can be side by side?

I am drawing two scatter plots and want them to appear side by side.
When I use plt.subplots, my ax1 and ax2 aren't recognized. How can I place the bottom two scatter plots next to each other? Whenever I use plt.subplots, it just creates empty graphs.
# Scatter plots.
ax1 = df_Baker.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(10, 7))
df_Muriel.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(10, 7), ax=ax1)
df_Tanner.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(10, 7), ax=ax1)
# regression lines
plt.plot(df_Baker.HS_GPA, Baker_fit[0] * df_Baker.HS_GPA + Baker_fit[1], color='darkblue', linewidth=2)
plt.plot(df_Tanner.HS_GPA, Tanner_fit[0] * df_Tanner.HS_GPA + Tanner_fit[1], color='deeppink', linewidth=2)
plt.plot(df_Muriel.HS_GPA, Muriel_fit[0] * df_Muriel.HS_GPA + Muriel_fit[1], color='deeppink', linewidth=2)
plt.legend(labels=['_h', '_hii', '_', '10 - 20','1 - 5'], title='Legend Test')
plt.title('BIO: Basic Concepts', size=24)
plt.xlabel('High school gpa', size=18)
plt.ylabel('Course Grade', size=18);
#-----------------------------------------------------------------------------
# Scatter plots.
ax2 = df_Baker.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(6, 3))
df_Muriel.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(6, 3), ax=ax2)
df_Tanner.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(6, 3), ax=ax2)
# regression lines
plt.plot(df_Baker.HS_GPA, Baker_fit[0] * df_Baker.HS_GPA + Baker_fit[1], color='black', linewidth=2)
plt.plot(df_Tanner.HS_GPA, Tanner_fit[0] * df_Tanner.HS_GPA + Tanner_fit[1], color='black', linewidth=2)
plt.plot(df_Muriel.HS_GPA, Muriel_fit[0] * df_Muriel.HS_GPA + Muriel_fit[1], color='black', linewidth=2)
plt.legend(labels=['_h', '_hii', '_', '10 - 20','1 - 5'], title='Legend Test')
plt.title('BIO: Basic Concepts', size=24)
plt.xlabel('High school gpa', size=18)
plt.ylabel('Course Grade', size=18);
Output graphs so far
In this line:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3))
you create the two axes objects and attach the names ax1 and ax2 to them.
Later, in
ax1 = df_Baker.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(10, 7))
(and similarly in the line for ax2), you create new axes objects and assign the names ax1 and ax2 to them, so the axes returned by plt.subplots stay empty.
It seems that this is not what you want. Rather, I guess you want to use the previously generated axes objects in the calls to df_Baker.plot(). You can achieve this by using the ax= keyword:
df_Baker.plot(kind='scatter', x='HS_GPA', y='Course_Grade', color='black', alpha=0.5, figsize=(10, 7), ax=ax1)
You will also have to change the plt.plot(...) calls to ax1.plot(...) or ax2.plot(...), and make the analogous change for plt.xlabel, plt.ylabel, and plt.legend (ax1.set_xlabel, ax1.set_ylabel, ax1.legend).
I would suggest reading the blog post https://matplotlib.org/matplotblog/posts/pyplot-vs-object-oriented-interface/ on the difference between the pyplot and object-oriented interfaces to Matplotlib. You can also look at the examples referenced in https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.subplots.html
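
The advice above can be sketched as follows. This is a minimal illustration rather than the asker's full script: df_Baker is filled with synthetic data, Baker_fit is assumed to come from np.polyfit (the original does not show how it was computed), and the Agg backend is chosen only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for portability
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Hypothetical stand-in for df_Baker
df_Baker = pd.DataFrame({'HS_GPA': rng.uniform(2.0, 4.0, 50)})
df_Baker['Course_Grade'] = 20 * df_Baker['HS_GPA'] + rng.normal(0, 5, 50)
# Assumed: the fit is [slope, intercept] from a degree-1 polynomial fit
Baker_fit = np.polyfit(df_Baker.HS_GPA, df_Baker.Course_Grade, 1)

# Create both axes once, then draw everything onto them explicitly
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3))
for ax, line_color in ((ax1, 'darkblue'), (ax2, 'black')):
    df_Baker.plot(kind='scatter', x='HS_GPA', y='Course_Grade',
                  color='black', alpha=0.5, ax=ax)
    ax.plot(df_Baker.HS_GPA, Baker_fit[0] * df_Baker.HS_GPA + Baker_fit[1],
            color=line_color, linewidth=2)
    ax.set_title('BIO: Basic Concepts')
    ax.set_xlabel('High school GPA')
    ax.set_ylabel('Course grade')
fig.tight_layout()
```

No call goes through plt.plot or plt.xlabel here, so nothing depends on which axes happens to be "current".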

How to add vital few to Pareto Chart in python?

I use this code to draw the chart:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.DataFrame({"Type of defect":["A","B","C","D","E","F","G","Other"], "Count":[17,202,387,25,825,12,3,45]})
data=data.set_index("Type of defect")
data = pd.concat([data[data.index!='Other'].sort_values(by='Count',ascending = False), data[data.index=='Other']])
data['Accumulated frequency'] = 100 *data['Count'].cumsum() / data['Count'].sum()
data['Limit']=80
data['Vital few']=np.where((data['Limit'] <= data['Accumulated frequency']) & (data['Limit'].shift(1) <= data['Accumulated frequency'].shift(1)), 0, 100)
fig, axes = plt.subplots()
ax1 = data.plot(use_index=True, y='Count', kind='bar', ax=axes)
ax2 = data.plot(use_index=True, y='Accumulated frequency', marker='D', color="C1", kind='line', ax=axes, secondary_y=True)
ax2.set_ylim([0,110])
ax3 = data.plot(use_index=True, y='Limit', color="gray", kind='line', linestyle='dashed', ax=axes, secondary_y=True)
ax4 = data.plot(use_index=True, y='Vital few', color="yellow", kind='area', ax=axes, secondary_y=True, alpha=0.1)
I get the following picture
However, I need to get this chart
The main problem is how to display "vital few" (Yellow area). There are also problems with the location of the legend and row/column labels. Please help me with this.
Area graphs cannot draw rectangles, so you need to use matplotlib's axvspan(). axvspan() is not reflected in the legend, so you need to add it, and use Patch to set the rectangle and label.
from matplotlib.patches import Patch
fig, axes = plt.subplots()
ax1 = data.plot(use_index=True, y='Count', kind='bar', ax=axes)
ax2 = data.plot(use_index=True, y='Accumulated frequency', marker='D', color="C1", kind='line', ax=axes, secondary_y=True)
ax2.set_ylim([0,110])
ax3 = data.plot(use_index=True, y='Limit', color="gray", kind='line', linestyle='dashed', ax=axes, secondary_y=True)
#ax4 = data.plot(use_index=True, y='Vital few', color="yellow", kind='area', ax=axes, secondary_y=True, alpha=0.1)
axes.axvspan(-0.5,1.25, ymax=0.95,facecolor='yellow', alpha=0.1)
handler1, label1 = ax1.get_legend_handles_labels()
#handler2, label2 = ax2.get_legend_handles_labels()
handler3, label3 = ax3.get_legend_handles_labels()
#print(label1, label2, label3)
add_legend = [Patch(facecolor='yellow', edgecolor='yellow', alpha=0.1, label='Vital few(right)')]
axes.legend(handles=handler1+handler3+add_legend)
plt.show()
EDIT:
If it is strictly linked to the y-axis value, it can be handled by a bar chart as an alternative method. By increasing the default width, the bars will be connected.
ax4 = data.plot(use_index=True, y='Vital few',color='yellow', kind='bar', width=1.0,ax=axes, secondary_y=True, alpha=0.1)
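
The axvspan() limits in the answer are hardcoded. A possible sketch for deriving the right edge from the data is shown below. It assumes that a category counts as "vital" when the cumulative share before it is still below the limit, which mirrors the question's 'Vital few' column; n_vital and limit are names introduced here for illustration, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for portability
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({"Type of defect": ["A", "B", "C", "D", "E", "F", "G", "Other"],
                     "Count": [17, 202, 387, 25, 825, 12, 3, 45]})
data = data.set_index("Type of defect")
# Sort by count, keeping 'Other' as the last category
data = pd.concat([data[data.index != 'Other'].sort_values(by='Count', ascending=False),
                  data[data.index == 'Other']])
data['Accumulated frequency'] = 100 * data['Count'].cumsum() / data['Count'].sum()

limit = 80
# Assumption: a category is vital if the cumulative share *before* it is below the limit
cumulative_before = data['Accumulated frequency'].shift(1, fill_value=0)
n_vital = int((cumulative_before < limit).sum())

fig, axes = plt.subplots()
data.plot(y='Count', kind='bar', ax=axes)
# Bars are drawn at integer x positions 0, 1, 2, ...;
# shade the slots belonging to the first n_vital bars
axes.axvspan(-0.5, n_vital - 0.5, facecolor='yellow', alpha=0.1)
print(n_vital, list(data.index[:n_vital]))
```

With a different definition of "vital few" (for example, excluding the category that crosses the limit), only the comparison that produces n_vital needs to change.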

Subplot date formatting in Axis

I have a figure with two subplots (2 rows, 1 column), both with dates on the x-axis. I used datetime.strptime to convert the date strings into datetime.datetime objects.
However, I am planning to have about 12 subplots, and repeating the setup for each one as shown below seems messy.
Is there a better 'pythonic' way?
This is what I have:
xx = plt.subplot(2,1,1)
xx.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%y %H:%M'))
plt.grid(True)
plt.ylabel('paramA',fontsize=8, color = "blue")
plt.tick_params(axis='both', which='major', labelsize=8)
plt.plot(date_list, myarray[:,0], '-b', label='paramA')
plt.setp(plt.xticks()[1], rotation=30, ha='right') # ha is the same as horizontalalignment
xx = plt.subplot(2,1,2)
xx.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%y %H:%M'))
plt.grid(True)
plt.ylabel('paramB', fontsize=8, color = "blue") # amount of virtual mem
plt.tick_params(axis='both', which='major', labelsize=8)
plt.plot(date_list, myarray[:,1], '-y', label='paramB')
plt.setp(plt.xticks()[1], rotation=30, ha='right') # ha is the same as horizontalalignment
PS: Initially I tried defining the plot as follows. This however did not work:
fig, axs = plt.subplots(2,1,figsize=(15,15))
plt.title('My graph')
for ax in enumerate(axs):
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%y %H:%M:%S'))
You failed to provide any data or a Minimal, Complete, and Verifiable example. Nevertheless, something like this should work. You can extend it to your real case by using desired number of rows and columns in the first command.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
fig, axes = plt.subplots(nrows=2, ncols=3)
labels = ['paramA', 'paramB', 'paramC', 'paramD', 'paramE', 'paramF']
for i, ax in enumerate(axes.flatten()):
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%y %H:%M'))
    ax.grid(True)
    ax.set_ylabel(labels[i], fontsize=8, color="blue")
    ax.tick_params(axis='both', which='major', labelsize=8)
    ax.plot(date_list, myarray[:,i], '-b', label=labels[i])
    # rotate this subplot's own tick labels; ha is the same as horizontalalignment
    plt.setp(ax.get_xticklabels(), rotation=30, ha='right')
EDIT:
Change your code to the following. enumerate(axs) yields (index, axes) tuples, so iterate over axs directly:
fig, axs = plt.subplots(2,1,figsize=(15,15))
plt.title('My graph')
for ax in axs:
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%y %H:%M:%S'))
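
For the 12-subplot case mentioned in the question, the loop approach might look like the sketch below. date_list and myarray are synthetic stand-ins for the asker's data, the 4x3 grid and the param0..param11 labels are assumptions, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for portability
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
from datetime import datetime, timedelta

# Synthetic stand-ins for the question's date_list and myarray
date_list = [datetime(2020, 1, 1) + timedelta(hours=6 * k) for k in range(40)]
myarray = np.random.default_rng(2).normal(size=(40, 12))

fig, axes = plt.subplots(nrows=4, ncols=3, figsize=(15, 15))
for i, ax in enumerate(axes.flatten()):
    ax.plot(date_list, myarray[:, i], '-b')
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%y %H:%M'))
    ax.set_ylabel(f'param{i}', fontsize=8, color='blue')
    # Per-axes tick settings: rotation and size apply to this subplot only
    ax.tick_params(axis='x', labelrotation=30, labelsize=8)
    ax.grid(True)
fig.suptitle('My graph')  # figure-level title instead of a title on one subplot
fig.tight_layout()

formatter_types = {type(ax.xaxis.get_major_formatter()).__name__
                   for ax in axes.flatten()}
```

Every setting is applied through the ax object, so the same loop body works for any grid size.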

KeyError with plotting cluster centers

ax = df_seeds_train[df_seeds_train['cluster']==0].plot(kind='scatter', x='asymmetry', y='perimeter', s=50, c='green', sharex=False)
df_seeds_train[df_seeds_train['cluster']==1].plot(kind='scatter',x='asymmetry',y='perimeter',s=50, c='orange', sharex=False, ax = ax)
df_seeds_train[df_seeds_train['cluster']==2].plot(kind='scatter',x='asymmetry',y='perimeter',s=50, c='purple', sharex=False, ax = ax)
centers.plot(kind = 'scatter', x='asymmetry', y='perimeter', c='red', s=50, marker='x', sharex=False, ax=ax)
I need to get red markers in the center of my clusters, but I keep getting KeyErrors for 'asymmetry' and 'perimeter'. Does anybody know how to fix this? I added an image of the outcome I get now.
Thanks in advance!

How to delete extra plots on an AxesSubplot object?

I have an AxesSubplot object ax1 from this:
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
I plot multiple times on this ax1 to see how alpha values will set the plots' appearance:
first = ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.3)
second = ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.6)
third = ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.9)
But these three plots overlap each other:
How can I erase the previous histogram so that only one plot is shown at a time? And by the way, what does the alpha argument do?
Thanks. :)
If I understand what you want, then try this:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
# one subplot per histogram, so the plots no longer overlap
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
first = ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.3)
second = ax2.hist(np.random.randn(100), bins=20, color='k', alpha=0.6)
third = ax3.hist(np.random.randn(100), bins=20, color='k', alpha=0.9)
plt.show()
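
If the goal is instead to reuse the same axes and show only one histogram at a time, one option is to clear the axes before each new draw with ax1.clear(). As for alpha, it sets the opacity of the drawn artists, from 0 (fully transparent) to 1 (fully opaque), which is why the overlapping histograms blend together. A minimal sketch, assuming synthetic data and the non-interactive Agg backend:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for portability
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)

patch_counts = []
for alpha in (0.3, 0.6, 0.9):
    ax1.clear()  # remove everything drawn on ax1 so far
    ax1.hist(np.random.randn(100), bins=20, color='k', alpha=alpha)
    ax1.set_title(f'alpha={alpha}')
    fig.canvas.draw()  # render the current state of the figure
    patch_counts.append(len(ax1.patches))  # bars currently on the axes
```

After each pass only the latest histogram's bars remain on ax1. In an interactive session, a pause or a redraw between iterations would be needed to actually see each version.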
