I've tried the other threads, but can't work out how to solve this. I'm attempting to create a discrete colorbar. Much of the code appears to be working and a discrete bar does appear, but the labels are wrong and it throws the error: "No mappable was found to use for colorbar creation. First define a mappable such as an image (with imshow) or a contour set (with contourf)."
Pretty sure the error is because I'm missing an argument in plt.colorbar, but not sure what it's asking for or how to define it.
Below is what I have. Any thoughts gratefully received:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
norm = mpl.colors.BoundaryNorm(np.arange(-0.5,4), cmap.N)
ex2 = sample_data.plot.scatter(x='order_count', y='total_value',c='cluster', marker='+', ax=ax, cmap='plasma', norm=norm, s=100, edgecolor ='none', alpha=0.70)
plt.colorbar(ticks=np.linspace(0,3,4))
plt.show()
Indeed, the first argument to colorbar should be a ScalarMappable, which would be the scatter plot PathCollection itself.
Setup
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({"x" : np.linspace(0,1,20),
"y" : np.linspace(0,1,20),
"cluster" : np.tile(np.arange(4),5)})
cmap = mpl.colors.ListedColormap(["navy", "crimson", "limegreen", "gold"])
norm = mpl.colors.BoundaryNorm(np.arange(-0.5,4), cmap.N)
Pandas plotting
The problem is that pandas does not provide you access to this ScalarMappable directly. So one can catch it from the list of collections in the axes, which is easy if there is only one single collection present: ax.collections[0].
fig, ax = plt.subplots()
df.plot.scatter(x='x', y='y', c='cluster', marker='+', ax=ax,
cmap=cmap, norm=norm, s=100, edgecolor ='none', alpha=0.70, colorbar=False)
fig.colorbar(ax.collections[0], ticks=np.linspace(0,3,4))
plt.show()
Matplotlib plotting
One could consider using matplotlib directly to plot the scatter in which case you would directly use the return of the scatter function as argument to colorbar.
fig, ax = plt.subplots()
scatter = ax.scatter(x='x', y='y', c='cluster', marker='+', data=df,
cmap=cmap, norm=norm, s=100, edgecolor ='none', alpha=0.70)
fig.colorbar(scatter, ticks=np.linspace(0,3,4))
plt.show()
Output in both cases is identical.
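As a quick check of the point above, here is a minimal sketch (not part of the original answer) showing that the object returned by ax.scatter is a ScalarMappable and is the same object found in ax.collections[0], so either one can be handed to fig.colorbar. The data and variable names are made up for illustration.

```python
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# Illustrative data: 20 points in 4 clusters labelled 0..3 (an assumption).
x = np.linspace(0, 1, 20)
cluster = np.tile(np.arange(4), 5)

cmap = mpl.colors.ListedColormap(["navy", "crimson", "limegreen", "gold"])
# Boundaries at -0.5, 0.5, ..., 3.5 put each integer label in its own bin.
norm = mpl.colors.BoundaryNorm(np.arange(-0.5, 4), cmap.N)

fig, ax = plt.subplots()
scatter = ax.scatter(x, x, c=cluster, cmap=cmap, norm=norm, marker="+", s=100)

# The scatter's PathCollection is the mappable that colorbar needs.
is_mappable = isinstance(scatter, mpl.cm.ScalarMappable)
same_object = ax.collections[0] is scatter

cbar = fig.colorbar(scatter, ax=ax, ticks=np.arange(4))
```

Call plt.show() or fig.savefig(...) afterwards to view the figure.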
I have the following code for plotting the histogram and the kde-functions (Kernel density estimation) of a training and validation dataset:
#Plot histograms
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
displot_dataTrain=sns.displot(data_train, bins='auto', kde=True)
displot_dataTrain._legend.remove()
plt.ylabel('Count')
plt.xlabel('Training Data')
plt.title("Histogram Training Data")
plt.show()
displot_dataValid =sns.displot(data_valid, bins='auto', kde=True)
displot_dataValid._legend.remove()
plt.ylabel('Count')
plt.xlabel('Validation Data')
plt.title("Histogram Validation Data")
plt.show()
# Try to plot the kde-functions together --> yields an AttributeError
X1 = np.linspace(data_train.min(), data_train.max(), 1000)
X2 = np.linspace(data_valid.min(), data_valid.max(), 1000)
fig, ax = plt.subplots(1,2, figsize=(12,6))
ax[0].plot(X1, displot_dataTest.kde.pdf(X1), label='train')
ax[1].plot(X2, displot_dataValid.kde.pdf(X1), label='valid')
The plotting of the histograms and kde-functions inside one plot works without problems. Now I would like to have the 2 kde-functions inside one plot but when using the posted code, I get the following error AttributeError: 'FacetGrid' object has no attribute 'kde'
Do you have any idea how I can combine the 2 kde-functions inside one plot (without the histogram)?
sns.displot() returns a FacetGrid. That doesn't work as input for ax.plot(). Also, displot_dataTest.kde.pdf is never valid. However, you can write sns.kdeplot(data=data_train, ax=ax[0]) to create a kdeplot inside the first subplot. See the docs; note the optional parameters cut= and clip= that can be used to adjust the limits.
If you only want one subplot, you can use fig, ax = plt.subplots(1, 1, figsize=(12,6)) and use ax=ax instead of ax=ax[0] as in that case ax is just a single subplot, not an array of subplots.
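The difference between the two return types can be seen in a small sketch like the one below. It uses only matplotlib and NumPy; the titles are placeholders, and an Axes obtained either way could then be passed as ax= to sns.kdeplot as described above.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

# Two columns: plt.subplots returns a NumPy array of Axes, so index it.
fig2, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].set_title("train")
axes[1].set_title("valid")

# One subplot: plt.subplots returns a single Axes object, not an array.
fig1, ax = plt.subplots(1, 1, figsize=(12, 6))
ax.set_title("train and valid")

returns_array = isinstance(axes, np.ndarray)
returns_single_axes = isinstance(ax, Axes)
```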
The following code has been tested using the latest seaborn version:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
fig, ax = plt.subplots(figsize=(12, 6))
sns.kdeplot(data=np.random.normal(0.1, 1, 100).cumsum(),
color='crimson', label='train', fill=True, ax=ax)
sns.kdeplot(data=np.random.normal(0.1, 1, 100).cumsum(),
color='limegreen', label='valid', fill=True, ax=ax)
ax.legend()
plt.tight_layout()
plt.show()
I have a parallel coordinates plot with lots of data points so I'm trying to use a continuous colour bar to represent that, which I think I have worked out. However, I haven't been able to remove the default key that is put in when creating the plot, which is very long and hinders readability. Is there a way to remove this table to make the graph much easier to read?
This is the code I'm currently using to generate the parallel coordinates plot:
parallel_coordinates(data[[' male_le', 'female_le', 'diet', 'activity', 'obese_perc', 'median_income']],
'median_income', colormap='rainbow', alpha=0.5)
fig, ax = plt.subplots(figsize=(6, 1))
fig.subplots_adjust(bottom=0.5)
cmap = mpl.cm.rainbow
bounds = [0.00,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N,)
plt.colorbar(mpl.cm.ScalarMappable(norm = norm, cmap=cmap),cax = ax, orientation = 'horizontal',
label = 'normalised median income', alpha = 0.5)
plt.show()
Current Output:
I want my legend to be represented as a color bar, like this:
Any help would be greatly appreciated. Thanks.
You can use ax.legend_.remove() to remove the legend.
The cax parameter of plt.colorbar indicates the subplot in which to put the colorbar. If you leave it out, matplotlib will create a new subplot, "stealing" space from the current subplot (subplots are often referred to as ax in matplotlib). So, here, leaving out cax will create the desired colorbar (adding ax=ax isn't necessary, as ax is already the current subplot).
The code below uses seaborn's penguin dataset to create a standalone example.
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
import numpy as np
from pandas.plotting import parallel_coordinates
penguins = sns.load_dataset('penguins')
fig, ax = plt.subplots(figsize=(10, 4))
cmap = plt.get_cmap('rainbow')
bounds = np.arange(penguins['body_mass_g'].min(), penguins['body_mass_g'].max() + 200, 200)
norm = mpl.colors.BoundaryNorm(bounds, 256)
penguins = penguins.dropna(subset=['body_mass_g'])
parallel_coordinates(penguins[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']],
'body_mass_g', colormap=cmap, alpha=0.5, ax=ax)
ax.legend_.remove()
plt.colorbar(mpl.cm.ScalarMappable(norm=norm, cmap=cmap),
ax=ax, orientation='horizontal', label='body mass', alpha=0.5)
plt.show()
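The difference between ax= and cax= described above can be checked with a small sketch like this one. It is an illustration only, using a plain Normalize and made-up figure layouts rather than the penguin data.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt

cmap = plt.get_cmap("rainbow")
norm = mpl.colors.Normalize(vmin=0, vmax=1)
mappable = mpl.cm.ScalarMappable(norm=norm, cmap=cmap)

# Case 1: pass ax=... and let matplotlib create the colorbar axes itself,
# taking space away from ax_a.
fig_a, ax_a = plt.subplots()
cbar_a = fig_a.colorbar(mappable, ax=ax_a, orientation="horizontal")
n_axes_stolen = len(fig_a.axes)  # main axes plus the new colorbar axes

# Case 2: pass cax=... and the colorbar is drawn into that existing axes,
# so no additional axes is created.
fig_b, (ax_b, cax_b) = plt.subplots(2, 1)
cbar_b = fig_b.colorbar(mappable, cax=cax_b, orientation="horizontal")
n_axes_given = len(fig_b.axes)  # still the two axes from plt.subplots
```

In the question, the second call pattern is why the colorbar filled the whole small figure instead of sitting next to the parallel coordinates plot.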
How do I set the font size of the colorbar label?
ax = sns.heatmap(table, vmin=60, vmax=100, xticklabels=[4,8,16,32,64,128], yticklabels=[2,4,6,8], cmap="PuBu", linewidths=.0,
annot=True, cbar_kws={'label': 'Accuracy %'})
Unfortunately seaborn does not give access to the objects it creates. So one needs to take the detour, using the fact that the colorbar is an axes in the current figure and that it is the last one created, hence
ax = sns.heatmap(...)
cbar_axes = ax.figure.axes[-1]
For this axes, we can set the font size by getting the y-axis label and calling its set_size method.
Example, setting the fontsize to 20 points:
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(0)
import seaborn as sns
data = np.random.rand(10, 12)*100
ax = sns.heatmap(data, cbar_kws={'label': 'Accuracy %'})
ax.figure.axes[-1].yaxis.label.set_size(20)
plt.show()
Note that the same can of course be achieved via
ax = sns.heatmap(data)
ax.figure.axes[-1].set_ylabel('Accuracy %', size=20)
without the keyword argument passing.
You could also explicitly pass in the axes objects into heatmap and modify them directly:
grid_spec = {"width_ratios": (.9, .05)}
f, (ax, cbar_ax) = plt.subplots(1,2, gridspec_kw=grid_spec)
sns.heatmap(data, ax=ax, cbar_ax=cbar_ax, cbar_kws={'label': 'Accuracy %'})
cbar_ax.yaxis.label.set_size(20)
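The "last axes in the figure" trick can be illustrated without seaborn. The sketch below uses plain matplotlib (imshow plus fig.colorbar) as a stand-in, on the assumption that seaborn's heatmap creates its colorbar through the same figure-level colorbar call; the data is random and only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.random((10, 12)) * 100

fig, ax = plt.subplots()
im = ax.imshow(data)
cbar = fig.colorbar(im, ax=ax, label="Accuracy %")

# The colorbar lives in its own axes, which was appended to the figure last.
cbar_axes = fig.axes[-1]
cbar_axes.yaxis.label.set_size(20)

label_text = cbar_axes.yaxis.label.get_text()
label_size = cbar_axes.yaxis.label.get_size()
```

If other axes are added after the colorbar, fig.axes[-1] no longer points at it, which is one reason the explicit cbar_ax approach can be more robust.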
I have plotted my data with factorplot in seaborn and get a FacetGrid object, but I still cannot work out how the following attributes can be set in such a plot:
Legend size: when I plot lots of variables, I get very small legends, with small fonts.
Font sizes of y and x labels (a similar problem as above)
You can scale up the fonts in your call to sns.set().
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
x = np.random.normal(size=37)
y = np.random.lognormal(size=37)
# defaults
sns.set()
fig, ax = plt.subplots()
ax.plot(x, y, marker='s', linestyle='none', label='small')
ax.legend(loc='upper left', bbox_to_anchor=(0, 1.1))
sns.set(font_scale=5) # crazy big
fig, ax = plt.subplots()
ax.plot(x, y, marker='s', linestyle='none', label='big')
ax.legend(loc='upper left', bbox_to_anchor=(0, 1.3))
The FacetGrid plot does produce pretty small labels. While @paul-h has described the use of sns.set as a way to change the font scaling, it may not be the optimal solution since it will change the font_scale setting for all plots.
You could use the seaborn.plotting_context to change the settings for just the current plot:
# name a context explicitly; font_scale is applied relative to it
with sns.plotting_context("notebook", font_scale=1.5):
    sns.factorplot(x, y ...)
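To see that the change really is limited to the with block, a minimal sketch is to read one of the affected rcParams before, inside, and after it. Naming the "notebook" context explicitly is an assumption of this sketch: as far as I can tell, seaborn only applies font_scale when a named context is given.

```python
import matplotlib as mpl
import seaborn as sns

size_before = mpl.rcParams["axes.labelsize"]

# "notebook" is seaborn's default context; it is named here (an assumption
# of this sketch) so that font_scale has a base size to scale.
with sns.plotting_context("notebook", font_scale=1.5):
    size_inside = mpl.rcParams["axes.labelsize"]

# Leaving the block restores whatever was configured before.
size_after = mpl.rcParams["axes.labelsize"]
```

Any plot created inside the block picks up the larger fonts; plots created afterwards use the previous settings again.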
I've made some modifications to @paul-H's code, so that you can independently set the font size for the x/y axes and legend:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
x = np.random.normal(size=37)
y = np.random.lognormal(size=37)
# defaults
sns.set()
fig, ax = plt.subplots()
ax.plot(x, y, marker='s', linestyle='none', label='small')
ax.legend(loc='upper left', fontsize=20,bbox_to_anchor=(0, 1.1))
ax.set_xlabel('X_axis', fontsize=20)
ax.set_ylabel('Y_axis', fontsize=20)
plt.show()
This is the output:
For the legend, you can use this
plt.setp(g._legend.get_title(), fontsize=20)
Where g is your facetgrid object returned after you call the function making it.
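The same plt.setp call can be tried on an ordinary matplotlib legend. This is a sketch under the assumption that g._legend is a regular matplotlib Legend object; the lines, labels, and the "Class" title are invented for the example. It also sets the size of the legend entries, which are separate from the title.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="a")
ax.plot([0, 1], [1, 0], label="b")
legend = ax.legend(title="Class")

# Title and entries are separate Text objects, so size them separately.
plt.setp(legend.get_title(), fontsize=20)
plt.setp(legend.get_texts(), fontsize=15)

title_size = legend.get_title().get_fontsize()
entry_sizes = [t.get_fontsize() for t in legend.get_texts()]
```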
This worked for me
g = sns.catplot(x="X Axis", hue="Class", kind="count", legend=False, data=df, height=5, aspect=7/4)
g.ax.set_xlabel("",fontsize=30)
g.ax.set_ylabel("Count",fontsize=20)
g.ax.tick_params(labelsize=15)
What did not work was to call set_xlabel directly on g like g.set_xlabel() (then I got a "Facetgrid has no set_xlabel" method error)
I'm working with data that has 3 plotting parameters: x, y, c. How do you create a custom color value for a scatter plot?
Extending this example I'm trying to do:
import matplotlib
import matplotlib.pyplot as plt
cm = matplotlib.cm.get_cmap('RdYlBu')
colors=[cm(1.*i/20) for i in range(20)]
xy = range(20)
plt.subplot(111)
colorlist=[colors[x//2] for x in xy] #actually some other non-linear relationship
plt.scatter(xy, xy, c=colorlist, s=35, vmin=0, vmax=20)
plt.colorbar()
plt.show()
but the result is TypeError: You must first set_array for mappable
From the matplotlib docs on scatter:
cmap is only used if c is an array of floats
So colorlist needs to be a list of floats rather than a list of tuples as you have it now.
plt.colorbar() wants a mappable object, like the PathCollection that plt.scatter() returns.
vmin and vmax can then control the limits of your colorbar. Things outside vmin/vmax get the colors of the endpoints.
How does this work for you?
import matplotlib.pyplot as plt
cm = plt.get_cmap('RdYlBu')
xy = range(20)
z = xy
sc = plt.scatter(xy, xy, c=z, vmin=0, vmax=20, s=35, cmap=cm)
plt.colorbar(sc)
plt.show()
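The remark about vmin and vmax can be checked with a short sketch. The limits here are deliberately narrower than the data (an arbitrary choice for illustration) so that some points fall outside them and take the endpoint colours.

```python
import numpy as np
import matplotlib.pyplot as plt

xy = np.arange(20)
z = xy.astype(float)  # one float per point, not a list of RGBA tuples

fig, ax = plt.subplots()
# Deliberately narrow limits so some values fall outside [vmin, vmax].
sc = ax.scatter(xy, xy, c=z, vmin=5, vmax=15, s=35, cmap="RdYlBu")
fig.colorbar(sc, ax=ax)

rgba = sc.to_rgba(z)  # colours the mappable assigns to each value
low_clipped = np.allclose(rgba[0], rgba[5])     # z=0 drawn like z=5
high_clipped = np.allclose(rgba[19], rgba[15])  # z=19 drawn like z=15
```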
Here is the OOP way of adding a colorbar:
fig, ax = plt.subplots()
im = ax.scatter(x, y, c=c)
fig.colorbar(im, ax=ax)
If you're looking to scatter by two variables and color by the third, Altair can be a great choice.
Creating the dataset
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame(40*np.random.randn(10, 3), columns=['A', 'B','C'])
Altair plot
import altair as alt
alt.Chart(df).mark_circle().encode(x='A', y='B', color='C').properties(width=200, height=150)