I have the following code for plotting the histogram and the kde-functions (Kernel density estimation) of a training and validation dataset:
#Plot histograms
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
displot_dataTrain=sns.displot(data_train, bins='auto', kde=True)
displot_dataTrain._legend.remove()
plt.ylabel('Count')
plt.xlabel('Training Data')
plt.title("Histogram Training Data")
plt.show()
displot_dataValid =sns.displot(data_valid, bins='auto', kde=True)
displot_dataValid._legend.remove()
plt.ylabel('Count')
plt.xlabel('Validation Data')
plt.title("Histogram Validation Data")
plt.show()
# Try to plot the kde-functions together --> yields an AttributeError
X1 = np.linspace(data_train.min(), data_train.max(), 1000)
X2 = np.linspace(data_valid.min(), data_valid.max(), 1000)
fig, ax = plt.subplots(1,2, figsize=(12,6))
ax[0].plot(X1, displot_dataTest.kde.pdf(X1), label='train')
ax[1].plot(X2, displot_dataValid.kde.pdf(X1), label='valid')
Plotting each histogram together with its KDE in one plot works without problems. Now I would like to have the two KDE curves in one plot, but with the posted code I get the following error: AttributeError: 'FacetGrid' object has no attribute 'kde'
Do you have any idea how I can combine the two KDE curves in one plot (without the histograms)?
sns.displot() returns a FacetGrid. That doesn't work as input for ax.plot(). Also, displot_dataTest.kde.pdf is never valid. However, you can write sns.kdeplot(data=data_train, ax=ax[0]) to create a kdeplot inside the first subplot. See the docs; note the optional parameters cut= and clip= that can be used to adjust the limits.
If you only want one subplot, you can use fig, ax = plt.subplots(1, 1, figsize=(12,6)) and use ax=ax instead of ax=ax[0] as in that case ax is just a single subplot, not an array of subplots.
The following code has been tested using the latest seaborn version:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
fig, ax = plt.subplots(figsize=(12, 6))
sns.kdeplot(data=np.random.normal(0.1, 1, 100).cumsum(),
            color='crimson', label='train', fill=True, ax=ax)
sns.kdeplot(data=np.random.normal(0.1, 1, 100).cumsum(),
            color='limegreen', label='valid', fill=True, ax=ax)
ax.legend()
plt.tight_layout()
plt.show()
Related
I'm trying to create some scatter plots with seaborn, with a specific area of each plot highlighted in red. However, when I add the code for axvspan, it changes the x-axis. This is how the plots look before axvspan is applied.
When I add the line for axvspan:
fig, (ax0, ax1) = plt.subplots(2,1, figsize=(5,10))
ax0.axvspan("0.4", "0.8", color='red', alpha=0.3, label ='Problem Area')
sns.scatterplot(x='Values_1', y='Values_2', data=df3, color='green', ax=ax0)
sns.scatterplot(x='Values_3', y='Values_4', data=df3, color='green', ax=ax1)
plt.show()
It ends up looking like this:
Ultimately, the red section needs to only cover the data between 0.4 and 0.7, but by altering the x-axis it ends up covering all of it.
Any advice?
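The unexpected behavior results from passing the xmin and xmax arguments to axvspan as str instead of float. Matplotlib treats strings as categorical values, so "0.4" and "0.8" become categories on the x-axis rather than the positions 0.4 and 0.8. Pass numbers instead: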
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
# generate data
rng = np.random.default_rng(12)
df3 = pd.DataFrame({"Values_2": rng.random(100), "Values_1": np.linspace(0., 0.6, 100)})
fig, ax0 = plt.subplots(1,1, figsize=(6, 4))
ax0.axvspan(0.4, 0.8, color='red', alpha=0.3, label ='Problem Area')
sns.scatterplot(x='Values_1', y='Values_2', data=df3, color='green', ax=ax0)
plt.show()
This gives:
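When the limits arrive as text (read from a file or a form, for example), converting them before the call avoids the problem. A minimal sketch, assuming a hypothetical helper named highlight_span and placeholder points in place of df3:

```python
import matplotlib.pyplot as plt


def highlight_span(ax, xmin, xmax, **kwargs):
    """Shade a vertical band, converting the limits to float first."""
    # float("0.4") gives 0.4, so text input ends up as a numeric position
    return ax.axvspan(float(xmin), float(xmax), **kwargs)


fig, (ax0, ax1) = plt.subplots(2, 1, figsize=(5, 10))
for ax in (ax0, ax1):
    highlight_span(ax, "0.4", "0.8", color="red", alpha=0.3, label="Problem Area")
    # Placeholder points; the real data would come from the DataFrame
    ax.plot([0.0, 0.3, 0.6], [0.2, 0.9, 0.5], "o", color="green")
```

Both subplots keep a numeric x-axis, and the band covers only the range from 0.4 to 0.8.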
I'm using two related packages that generate plots I want to overlay for comparison. I call a method called plot_spectro from each package which plots to plt. Then I must do plt.legend() and plt.show() to see them. What happens is two plots with the same data ranges appear, but I would like to overlay (superimpose) them.
import matplotlib.pyplot as plt
s.plot_spectro(xaxis=x, yaxis=y)
plt.xlim(-6,2)
plt.ylim(-2.5,2.5)
o1.plot_spectro(xaxis=x, yaxis=y, color='b')
plt.xlim(-6,2)
plt.ylim(-2.5,2.5)
plt.legend()
plt.show()
Create an Axes instance and pass it to both plot calls, as shown below (this assumes plot_spectro accepts an ax argument):
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
s.plot_spectro(xaxis=x, yaxis=y, ax=ax) # <--- pass ax=ax here
o1.plot_spectro(xaxis=x, yaxis=y, color='b', ax=ax) # <--- pass ax=ax here
plt.xlim(-6,2)
plt.ylim(-2.5,2.5)
plt.legend()
plt.show()
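The pattern can be tried without the original packages. In the sketch below, plot_curve is a hypothetical stand-in for a method like plot_spectro, written on the assumption that such a method accepts an optional ax and otherwise draws on the current axes:

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_curve(x, y, ax=None, **kwargs):
    """Hypothetical stand-in for a package's plotting method."""
    if ax is None:
        ax = plt.gca()  # fall back to the current axes, as pyplot-style code does
    ax.plot(x, y, **kwargs)
    return ax


x = np.linspace(-6, 2, 200)
fig, ax = plt.subplots()
# Both calls receive the same Axes, so the curves are superimposed
plot_curve(x, np.sin(x), ax=ax, color="r", label="first")
plot_curve(x, np.cos(x), ax=ax, color="b", label="second")
ax.set_xlim(-6, 2)
ax.set_ylim(-2.5, 2.5)
ax.legend()
```

Setting the limits on ax rather than through plt keeps everything tied to the one Axes that both calls share.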
How can I draw the following graph, showing the difference against the average, using matplotlib, seaborn, Plotly or any other framework?
I have found that some people call this plot a mean-indexed bar chart. Using seaborn, it can be drawn with code like the following:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white", context="talk")
f, ax1 = plt.subplots(figsize=(7, 5))
# Use the mean of the plotted column; df.mean() would return one value per column
mean = df["your column"].mean()
y2 = mean - df["your column"]
sns.barplot(x=df.index, y=y2, palette="deep", ax=ax1)
ax1.axhline(0, color="k", clip_on=False)
ax1.set_ylabel("Diverging")
# Finalize the plot
sns.despine(bottom=True)
plt.setp(f.axes, yticks=[])
plt.tight_layout(h_pad=2)
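For a self-contained version, here is a sketch with plain matplotlib. The DataFrame, the column name "score" and the colors are placeholders chosen for illustration. Note that it computes value minus mean, so bars above the average point upward.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder data; the column name "score" is an assumption for illustration
df = pd.DataFrame({"score": [3.0, 7.0, 5.0, 9.0, 1.0]}, index=list("ABCDE"))

# Difference of each value from the column mean (positive means above average)
diff = df["score"] - df["score"].mean()
colors = ["seagreen" if d >= 0 else "indianred" for d in diff]

fig, ax = plt.subplots(figsize=(7, 5))
ax.bar(diff.index, diff.values, color=colors)
ax.axhline(0, color="k")  # the zero line marks the mean
ax.set_ylabel("Difference from mean")
fig.tight_layout()
```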
I've tried the other threads but can't work out how to solve this. I'm attempting to create a discrete colorbar. Most of the code appears to work and a discrete bar does appear, but the labels are wrong and it throws the error: "No mappable was found to use for colorbar creation. First define a mappable such as an image (with imshow) or a contour set (with contourf)."
Pretty sure the error is because I'm missing an argument in plt.colorbar, but not sure what it's asking for or how to define it.
Below is what I have. Any thoughts gratefully received:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
norm = mpl.colors.BoundaryNorm(np.arange(-0.5,4), cmap.N)
ex2 = sample_data.plot.scatter(x='order_count', y='total_value',c='cluster', marker='+', ax=ax, cmap='plasma', norm=norm, s=100, edgecolor ='none', alpha=0.70)
plt.colorbar(ticks=np.linspace(0,3,4))
plt.show()
Indeed, the first argument to colorbar should be a ScalarMappable, which here is the PathCollection created by the scatter plot itself.
Setup
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({"x" : np.linspace(0,1,20),
                   "y" : np.linspace(0,1,20),
                   "cluster" : np.tile(np.arange(4),5)})
cmap = mpl.colors.ListedColormap(["navy", "crimson", "limegreen", "gold"])
norm = mpl.colors.BoundaryNorm(np.arange(-0.5,4), cmap.N)
Pandas plotting
The problem is that pandas does not provide you access to this ScalarMappable directly. So one can catch it from the list of collections in the axes, which is easy if there is only one single collection present: ax.collections[0].
fig, ax = plt.subplots()
df.plot.scatter(x='x', y='y', c='cluster', marker='+', ax=ax,
                cmap=cmap, norm=norm, s=100, edgecolor ='none', alpha=0.70, colorbar=False)
fig.colorbar(ax.collections[0], ticks=np.linspace(0,3,4))
plt.show()
Matplotlib plotting
One could consider using matplotlib directly to plot the scatter in which case you would directly use the return of the scatter function as argument to colorbar.
fig, ax = plt.subplots()
scatter = ax.scatter(x='x', y='y', c='cluster', marker='+', data=df,
                     cmap=cmap, norm=norm, s=100, edgecolor ='none', alpha=0.70)
fig.colorbar(scatter, ticks=np.linspace(0,3,4))
plt.show()
Output in both cases is identical.
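A third possibility, sketched here as an assumption rather than as part of the answer above, is to build a standalone ScalarMappable from the same cmap and norm and hand that to colorbar. The tick labels "cluster 0" to "cluster 3" are placeholders.

```python
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

cmap = mpl.colors.ListedColormap(["navy", "crimson", "limegreen", "gold"])
norm = mpl.colors.BoundaryNorm(np.arange(-0.5, 4), cmap.N)

fig, ax = plt.subplots()
clusters = np.tile(np.arange(4), 5)
ax.scatter(np.linspace(0, 1, 20), np.linspace(0, 1, 20),
           c=clusters, cmap=cmap, norm=norm)

# A mappable that carries only the colormap and the norm, independent of the artist
mappable = mpl.cm.ScalarMappable(norm=norm, cmap=cmap)
cbar = fig.colorbar(mappable, ax=ax, ticks=np.arange(4))
cbar.set_ticklabels(["cluster 0", "cluster 1", "cluster 2", "cluster 3"])
```

Because the mappable is independent of the plotted artist, this also works when the plotting call does not return the collection.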
I have plotted my data with factorplot in seaborn and got a FacetGrid object, but I still cannot work out how to set the following attributes in such a plot:
Legend size: when I plot lots of variables, I get very small legends, with small fonts.
Font sizes of y and x labels (a similar problem as above)
You can scale up the fonts in your call to sns.set().
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
x = np.random.normal(size=37)
y = np.random.lognormal(size=37)
# defaults
sns.set()
fig, ax = plt.subplots()
ax.plot(x, y, marker='s', linestyle='none', label='small')
ax.legend(loc='upper left', bbox_to_anchor=(0, 1.1))
sns.set(font_scale=5) # crazy big
fig, ax = plt.subplots()
ax.plot(x, y, marker='s', linestyle='none', label='big')
ax.legend(loc='upper left', bbox_to_anchor=(0, 1.3))
The FacetGrid plot does produce pretty small labels. While @paul-h has described the use of sns.set as a way to change the font scaling, it may not be the optimal solution since it will change the font_scale setting for all plots.
You could use seaborn.plotting_context as a context manager to change the settings for just the current plot. Pass a context name such as "notebook" explicitly; when context is left as None, seaborn returns the current settings and ignores font_scale:
with sns.plotting_context("notebook", font_scale=1.5):
    sns.factorplot(x, y ...)
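A runnable sketch of the same idea follows. It uses sns.catplot, the name factorplot was given in seaborn 0.9, and a small placeholder DataFrame; both are assumptions for illustration.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder data for illustration
df = pd.DataFrame({"group": ["a", "b", "c"], "value": [1.0, 3.0, 2.0]})

size_before = mpl.rcParams["axes.labelsize"]

# Name the context explicitly so that font_scale takes effect
with sns.plotting_context("notebook", font_scale=1.5):
    size_inside = mpl.rcParams["axes.labelsize"]
    g = sns.catplot(data=df, x="group", y="value", kind="bar")

size_after = mpl.rcParams["axes.labelsize"]  # restored when the block exits
```

Only the figure created inside the with block picks up the larger fonts; plots made afterwards use the previous settings again.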
I've made some modifications to @paul-H's code so that you can independently set the font size for the x/y axis labels and the legend:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
x = np.random.normal(size=37)
y = np.random.lognormal(size=37)
# defaults
sns.set()
fig, ax = plt.subplots()
ax.plot(x, y, marker='s', linestyle='none', label='small')
ax.legend(loc='upper left', fontsize=20,bbox_to_anchor=(0, 1.1))
ax.set_xlabel('X_axis', fontsize=20)
ax.set_ylabel('Y_axis', fontsize=20)
plt.show()
This is the output:
For the legend title, you can use this:
plt.setp(g._legend.get_title(), fontsize=20)
Here g is the FacetGrid object returned by the plotting function you called.
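The legend entries can be resized the same way through get_texts(). A minimal sketch, assuming a placeholder DataFrame and a grid created with sns.catplot:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder data for illustration
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "kind": ["x", "y", "x", "y"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

g = sns.catplot(data=df, x="group", y="value", hue="kind", kind="bar")

legend = g._legend  # private attribute, as used in the answer above
plt.setp(legend.get_title(), fontsize=20)  # legend title
plt.setp(legend.get_texts(), fontsize=16)  # legend entries
```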
This worked for me
g = sns.catplot(x="X Axis", hue="Class", kind="count", legend=False, data=df, height=5, aspect=7/4)
g.ax.set_xlabel("",fontsize=30)
g.ax.set_ylabel("Count",fontsize=20)
g.ax.tick_params(labelsize=15)
What did not work was calling set_xlabel directly on g, as in g.set_xlabel(); that raised an error saying FacetGrid has no set_xlabel method.
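FacetGrid does provide the plural methods set_xlabels and set_ylabels, which pass extra keyword arguments on to the underlying axes. A sketch with placeholder data and column names:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder data; the column name "X Axis" mirrors the snippet above
df = pd.DataFrame({"X Axis": ["a", "b", "a", "c", "b", "a"]})

g = sns.catplot(data=df, x="X Axis", kind="count", height=5, aspect=7/4)
g.set_xlabels("Category", fontsize=30)  # applied to the bottom row of axes
g.set_ylabels("Count", fontsize=20)     # applied to the left column of axes
for ax in g.axes.flat:
    ax.tick_params(labelsize=15)
```

These methods work on grids with several facets as well, where g.ax is not available.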