How to center x axis values on seaborn histogram? - python

I know that when discrete=True, the x-axis values are aligned with the center of each bar. However, I don't understand why this breaks when I create a histogram with a specific number of bins (e.g., when setting bins=19):
sns.histplot(data=df_ckd, x="HEIGHT", hue="SEX", multiple="stack",bins=19)
plt.xticks(np.arange(32, 198, 12))
plt.show()
How can I put those x axis values in the center?

You can use xlim together with explicit ticks, for example:
import matplotlib.pyplot as plt
import seaborn as sns
data = [5,8,12,18,19,19.9,20.1,21,24,28]
fig, ax = plt.subplots()
sns.histplot(data, ax=ax) # distplot is deprecated and replaced by histplot
ax.set_xlim(1,31)
ax.set_xticks(range(1,32))
plt.show()

Related

Removing legend from mpl parallel coordinates plot?

I have a parallel coordinates plot with lots of data points so I'm trying to use a continuous colour bar to represent that, which I think I have worked out. However, I haven't been able to remove the default key that is put in when creating the plot, which is very long and hinders readability. Is there a way to remove this table to make the graph much easier to read?
This is the code I'm currently using to generate the parallel coordinates plot:
parallel_coordinates(data[[' male_le', 'female_le', 'diet', 'activity', 'obese_perc', 'median_income']],
                     'median_income', colormap='rainbow', alpha=0.5)
fig, ax = plt.subplots(figsize=(6, 1))
fig.subplots_adjust(bottom=0.5)
cmap = mpl.cm.rainbow
bounds = [0.00,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N,)
plt.colorbar(mpl.cm.ScalarMappable(norm = norm, cmap=cmap),cax = ax, orientation = 'horizontal',
label = 'normalised median income', alpha = 0.5)
plt.show()
Current Output:
I want my legend to be represented as a color bar, like this:
Any help would be greatly appreciated. Thanks.
You can use ax.legend_.remove() to remove the legend.
The cax parameter of plt.colorbar specifies the subplot in which to draw the colorbar. If you leave it out, matplotlib will create a new subplot, "stealing" space from the current subplot (subplots are often referred to as ax in matplotlib). So leaving out cax here will create the desired colorbar (adding ax=ax isn't strictly necessary, since ax is the current subplot).
The code below uses seaborn's penguins dataset to create a standalone example.
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
import numpy as np
from pandas.plotting import parallel_coordinates
penguins = sns.load_dataset('penguins')
fig, ax = plt.subplots(figsize=(10, 4))
cmap = plt.get_cmap('rainbow')
bounds = np.arange(penguins['body_mass_g'].min(), penguins['body_mass_g'].max() + 200, 200)
norm = mpl.colors.BoundaryNorm(bounds, 256)
penguins = penguins.dropna(subset=['body_mass_g'])
parallel_coordinates(penguins[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']],
'body_mass_g', colormap=cmap, alpha=0.5, ax=ax)
ax.legend_.remove()
plt.colorbar(mpl.cm.ScalarMappable(norm=norm, cmap=cmap),
ax=ax, orientation='horizontal', label='body mass', alpha=0.5)
plt.show()
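Note that ax.legend_ is None when the axes has no legend, in which case ax.legend_.remove() raises an AttributeError. A slightly more defensive variant, sketched here on a made-up two-line plot rather than on the parallel coordinates plot, goes through ax.get_legend():

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="series A")  # illustrative data only
ax.plot([0, 1, 2], [0, 2, 3], label="series B")
ax.legend()

# get_legend() returns None when the axes has no legend,
# so this guard avoids an AttributeError
legend = ax.get_legend()
if legend is not None:
    legend.remove()
```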

Seaborn Adjusting Markers

As you can see here, the x-axis labels are quite unreadable. This happens regardless of how I adjust the figure size. I'm trying to figure out how to adjust the labeling so that it only shows certain points. The x-axis values are all numerical, between -1 and 1, and I think it would be more viewer-friendly to have labels at -1, -.5, 0, .5 and 1.
Is there a way to do this? Thank you!
Here's my code
sns.set(rc={'figure.figsize':(20,8)})
ax = sns.countplot(musi['Positivity'])
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha='right')
plt.tight_layout()
plt.show()
Seaborn is essentially a wrapper around matplotlib, so you can use matplotlib's ticker module to do the job. See the example below.
First, let's place a tick at every 1 unit.
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns
sns.set_theme(style="whitegrid")
x = [0,5,9,10,15]
y = [0,1,2,3,4]
tick_spacing = 1
fig, ax = plt.subplots(1,1)
sns.lineplot(x=x, y=y, ax=ax)  # recent seaborn versions require keyword arguments for x and y
ax.xaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
plt.show()
Now let's place a tick at every 5 units.
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns
sns.set_theme(style="whitegrid")
x = [0,5,9,10,15]
y = [0,1,2,3,4]
tick_spacing = 5
fig, ax = plt.subplots(1,1)
sns.lineplot(x=x, y=y, ax=ax)  # recent seaborn versions require keyword arguments for x and y
ax.xaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
plt.show()
P.S.: This solution gives you explicit control of the tick spacing via the number passed to ticker.MultipleLocator(), allows automatic limit determination, and is easy to read later.

Seaborn heatmaps in subplots - align x-axis

I am trying to plot a figure containing two subplots, a seaborn heatmap and simple matplotlib lines. However, when sharing the x-axis for both plots, they do not align as can be seen in this figure:
It would seem that the problem is similar to this post, but when displaying ax[0].get_xticks() and ax[1].get_xticks() I get the same positions, so I don't know what to change. And in my picture the deviation seems to be more than a 0.5 shift.
What am I doing wrong?
The code I used to plot the figure is the following:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
M_1=np.random.random((15,15))
M_2=np.random.random((15,15))
L_1=np.random.random(15)
L_2=np.random.random(15)
x=range(15)
cmap = sns.color_palette("hot", 100)
sns.set(style="white")
fig, ax = plt.subplots(2, 1, sharex='col', figsize=(10, 12))
ax[0].plot(x,L_1,'-', marker='o',color='tab:orange')
sns.heatmap(M_1, cmap=cmap, vmax=np.max(M_1), center=np.max(M_1)/2., square=False, ax=ax[1])
@Mr-T's comment is spot on. The easiest would be to create the axes beforehand instead of letting heatmap() shrink your axes in order to make room for the colorbar.
There is the added complication that the labels for the heatmap are not actually placed at [0,1,...] but are in the middle of each cell at [0.5, 1.5, ...]. So if you want your upper plot to align with the labels at the bottom (and with the center of each cell), you may have to shift your plot by 0.5 units to the right:
M_1=np.random.random((15,15))
M_2=np.random.random((15,15))
L_1=np.random.random(15)
L_2=np.random.random(15)
x=np.arange(15)
cmap = sns.color_palette("hot", 100)
sns.set(style="white")
fig, ax = plt.subplots(2, 2, sharex='col', gridspec_kw={'width_ratios':[100,5]})
ax[0,1].remove() # remove unused upper right axes
ax[0,0].plot(x+0.5,L_1,'-', marker='o',color='tab:orange')
sns.heatmap(M_1, cmap=cmap, vmax=np.max(M_1), center=np.max(M_1)/2., square=False, ax=ax[1,0], cbar_ax=ax[1,1])
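The 0.5 offset can be checked directly by reading the tick positions back from a small heatmap. This is just a quick sanity check on a made-up 3x3 matrix, with xticklabels=True so that every column gets a tick:

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots()
# xticklabels=True asks seaborn to label every column
sns.heatmap(np.random.random((3, 3)), xticklabels=True, cbar=False, ax=ax)

# Ticks sit at the cell centers (0.5, 1.5, 2.5), not at 0, 1, 2
tick_positions = ax.get_xticks()
print(tick_positions)
```

This is why a line plotted at x = 0, 1, 2, ... on a shared axis lands on the left edge of each cell unless it is shifted by 0.5.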

Seaborn gives wrong values on x-axis ticks?

In the code below Matplotlib gives the correct range of 5.0 to 10.0, why is Seaborn different?
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib import ticker
sns.set()
fig, (ax1, ax2) = plt.subplots(2)
x = np.linspace(5, 10)
y = x ** 2
sns.barplot(x=x, y=y, ax=ax1)
ax1.xaxis.set_major_locator(ticker.MultipleLocator(5))
ax1.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
ax2.bar(x, y, width = 0.1)
ax2.xaxis.set_major_locator(ticker.MultipleLocator(5))
ax2.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
plt.show()
Seaborn's barplot is a categorical plot. This means it places the bars at successive integer positions (0,1,...N-1). Hence, if you have N bars, the axis will range from -0.5 to N-0.5.
There is no way to tell seaborn to place the bars at different positions; but you can of course fake the labels to let it appear as such. E.g. to label every 5th bar with the value from x:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib import ticker
sns.set()
fig, ax = plt.subplots()
x = np.linspace(5, 10)
y = x ** 2
sns.barplot(x=x, y=y, ax=ax)  # recent seaborn versions require keyword arguments for x and y
ax.xaxis.set_major_locator(ticker.FixedLocator(np.arange(0, len(x), 5)))
ax.xaxis.set_major_formatter(ticker.FixedFormatter(x[::5]))
ax.tick_params(axis="x", rotation=90)
plt.tight_layout()
plt.show()
Conversely, it is possible to create categorical plots with matplotlib. To this end, one needs to plot strings.
ax.bar(x.astype(str), y)
ax.xaxis.set_major_locator(ticker.FixedLocator(np.arange(0, len(x), 5)))
ax.xaxis.set_major_formatter(ticker.FixedFormatter(x[::5]))
ax.tick_params(axis="x", rotation=90)
If you want a numerical bar plot, i.e. a plot where each bar is at the axis position of x, you would need to use matplotlib. This is the default case also shown in the question, where the bars range between 5 and 10. One should make sure to have the width of the bars smaller than the difference between successive x positions in this case.
ax.bar(x, y, width=np.diff(x).mean()*0.8)
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
ax.tick_params(axis="x", rotation=90)

Plot intersection between y-axis grid and kdeplot on seaborn

I have created the following plot using seaborn kdeplot and customizing the gridlines.
sns.set_style('whitegrid')
cdf_accuracy = sns.kdeplot(eval_df['accuracy'], cumulative=True)
cdf_accuracy.yaxis.set_major_locator(ticker.MultipleLocator(0.25))
cdf_accuracy.xaxis.set_major_locator(ticker.MultipleLocator(10))
However, I would like to show the gridlines on the x-axis only at the points where the y-axis gridlines intersect the curve. Is there a way to do this?
Thanks for your answers
As long as your curve is monotonic, which is the case for a cumulative distribution, you can simply interpolate on the y-values:
import numpy as np
y_intrsct = [.25, .5, .75]
x_intrsct = np.interp(y_intrsct, y_data, x_data)
which results in
array([67.69792378, 83.24194722, 92.24041857])
plotted with the following code:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(x_data, y_data)
ax.set_yticks(np.linspace(0, 1, 5))
ax.grid(axis='y')
ax.vlines(x_intrsct, *ax.get_ylim())
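The snippets above leave x_data and y_data undefined. One way to obtain them is to read the plotted curve back from the Axes with Line2D.get_data(). The sketch below does this for a stand-in curve drawn with plain matplotlib; for the Axes returned by sns.kdeplot(..., cumulative=True) the curve is presumably available as ax.lines[0] as well, but that is an assumption worth verifying on your seaborn version. It also limits each vertical line to the height of its intersection.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the cumulative curve (assumption): y = (x / 100)^2 on [0, 100]
x = np.linspace(0, 100, 201)
fig, ax = plt.subplots()
ax.plot(x, (x / 100) ** 2)

# Read the plotted curve back from the Axes
x_data, y_data = ax.lines[0].get_data()

y_intrsct = np.array([0.25, 0.5, 0.75])
# np.interp requires its sample points (here y_data) to be increasing
x_intrsct = np.interp(y_intrsct, y_data, x_data)

ax.set_yticks(np.linspace(0, 1, 5))
ax.grid(axis="y")
# Draw each vertical line only up to its intersection with the curve
ax.vlines(x_intrsct, 0, y_intrsct, linestyles="dashed")
```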
