Seaborn Barplot - Inconsistency when displaying values

For a multi-group bar plot in Seaborn, I would like to add text, taken from the int_txt column, on top of each bar.
However, the text is not placed as intended.
For example, the code below
import seaborn as sns
import pandas as pd
from matplotlib import pyplot as plt
# Create an example dataframe
data = {'pdvalue': [1, 1, 1, 1, 4, 4, 4, 4, 2, 2, 2, 2, 8, 8, 8, 8],
        'xval': [0, 0, 0.5, 0.5, 0.2, 0, 0.2, 0.2, 0.3, 0.3, 0.4, 0.1, 1, 1.1, 3, 1],
        'int_txt': [11, 14, 4, 5.1, 1, 2, 5.1, 1, 2, 4, 1, 3, 6, 6, 2, 3],
        'group': ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']}
df = pd.DataFrame(data)
df['int_txt'] = df['int_txt'].round(0).astype(int)
df = df.sort_values(by='pdvalue', ascending=True)
g = sns.barplot(data=df, x="pdvalue", y="xval", hue="group")
for idx, p in enumerate(g.patches):
    if p.get_height() != 0:
        val_me = df['int_txt'][idx]
        g.annotate(format(val_me, '.1f'),
                   (p.get_x() + p.get_width() / 2., p.get_height()),
                   ha='center', va='center',
                   xytext=(0, 9),
                   textcoords='offset points')
plt.show()
produces a plot in which the labels do not sit on the bars they belong to.
In the expected output, the text appended to each bar is taken from the look-up table,
and for any xval equal to zero, no text is appended.
Where did I go wrong?

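You didn't really do anything wrong. Seaborn simply draws the bars hue by hue, so the order of g.patches does not follow the row order of df. To see this, annotate each bar with its enumeration index: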
for idx, p in enumerate(g.patches):
    # annotate the enumeration
    g.annotate(format(idx, '.1f'),
               (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='center',
               xytext=(0, 9),
               textcoords='offset points')
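The numbers on top of the bars then show the drawing order: all bars of one hue level come first, then those of the next.
One way around this is to sort your data by the hue column first, then access the values positionally with .iloc: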
# sort by group first
df = df.sort_values(by=['group', 'pdvalue'], ascending=True)
g = sns.barplot(data=df, x="pdvalue", y="xval", hue="group")
for idx, p in enumerate(g.patches):
    if p.get_height() != 0:
        # access with `iloc`, not `loc`
        val_me = df['int_txt'].iloc[idx]
        g.annotate(format(val_me, '.1f'),
                   (p.get_x() + p.get_width() / 2., p.get_height()),
                   ha='center', va='center',
                   xytext=(0, 9),
                   textcoords='offset points')
This gives the expected annotation.
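The ordering argument can be checked without drawing anything. The sketch below is only an illustration: labels_in_patch_order is a hypothetical helper, not a seaborn function, and it assumes one bar per (x, hue) pair drawn in hue-major order with the x categories sorted ascending. It builds the label list in that order and leaves None where the bar height is zero.

```python
# Hypothetical helper (not part of seaborn): list the annotation texts in the
# order the bars are assumed to be drawn, i.e. hue-major order.
def labels_in_patch_order(rows, x_order, hue_order):
    """rows: iterable of (x, hue, height, text) tuples, one per bar."""
    lookup = {(x, hue): (height, text) for x, hue, height, text in rows}
    labels = []
    for hue in hue_order:      # outer loop: one pass per hue level
        for x in x_order:      # inner loop: x categories within that hue level
            height, text = lookup[(x, hue)]
            # No label for zero-height bars, as required in the question.
            labels.append(None if height == 0 else text)
    return labels


pdvalue = [1, 1, 1, 1, 4, 4, 4, 4, 2, 2, 2, 2, 8, 8, 8, 8]
xval = [0, 0, 0.5, 0.5, 0.2, 0, 0.2, 0.2, 0.3, 0.3, 0.4, 0.1, 1, 1.1, 3, 1]
int_txt = [11, 14, 4, 5.1, 1, 2, 5.1, 1, 2, 4, 1, 3, 6, 6, 2, 3]
group = ['a', 'b', 'c', 'd'] * 4

# round() mirrors the .round(0).astype(int) step applied to int_txt.
rows = [(x, g, h, round(t)) for x, g, h, t in zip(pdvalue, group, xval, int_txt)]
labels = labels_in_patch_order(rows,
                               x_order=sorted(set(pdvalue)),
                               hue_order=['a', 'b', 'c', 'd'])
print(labels)
```

If the assumption holds for the installed seaborn version, the labels can be paired with the bars through zip(g.patches, labels). It is worth comparing len(g.patches) with len(labels) first, since the pairing breaks if the Axes holds more patches than bars.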

Related

How to remove spaces between multiple colorbars in one figure

I am trying to plot three colorbars horizontally. I would like to remove the white spaces between the three colorbars. Is there a way to do this and/or to gradually adjust the space?
Code for reproduction:
import matplotlib as mpl
import matplotlib.pyplot as plt
fig, axes = plt.subplots(figsize=(8, 2), nrows=3, ncols=1, sharex=True, sharey=True)
fig.suptitle('Bar comparison')
# upper colorbar
bar1 = [['a', 0, 0.6], ['b', 0.6, 1.2], ['a', 1.2, 1.8], ['b', 1.8, 4]]
colors1 = ['yellow', 'blue', 'yellow', 'blue']
cmap1 = mpl.colors.ListedColormap(colors1)
bounds1 = [0] + [i[2] for i in bar1]
norm1 = mpl.colors.BoundaryNorm(bounds1, len(colors1))
plt.colorbar(mpl.cm.ScalarMappable(cmap=cmap1, norm=norm1),
             cax=axes[0],
             ticks=[[0], [bar1[-1][2]]],
             spacing='proportional',
             orientation='horizontal')
# middle colorbar
bar2 = [['a', 0, 0.5], ['b', 0.5, 1], ['a', 1, 2], ['b', 2, 3.8], ['a', 3.8, 4]]
colors2 = ['yellow', 'blue', 'yellow', 'blue', 'yellow']
cmap2 = mpl.colors.ListedColormap(colors2)
bounds2 = [0] + [i[2] for i in bar2]
norm2 = mpl.colors.BoundaryNorm(bounds2, len(colors2))
plt.colorbar(mpl.cm.ScalarMappable(cmap=cmap2, norm=norm2),
             cax=axes[1],
             ticks=[[0], [bar2[-1][2]]],
             spacing='proportional',
             orientation='horizontal')
# lower colorbar
bar3 = [['a', 0, 0.5], ['b', 0.5, 1], ['a', 1, 2], ['b', 2, 3.8], ['a', 3.8, 4]]
colors3 = ['green', 'green', 'green', 'green', 'red']
cmap3 = mpl.colors.ListedColormap(colors3)
bounds3 = [0] + [i[2] for i in bar3]
norm3 = mpl.colors.BoundaryNorm(bounds3, len(colors3))
plt.colorbar(mpl.cm.ScalarMappable(cmap=cmap3, norm=norm3),
             cax=axes[2],
             ticks=[[0], [bar3[-1][2]]],
             spacing='proportional',
             orientation='horizontal')
# Figure settings
# Hide x labels and tick labels for all but bottom plot.
for ax in axes:
    ax.label_outer()
plt.show()

To remove the space between the colorbars, set hspace=0 with subplots_adjust(). Add this line anywhere after the figure is created and before plt.show():
plt.subplots_adjust(hspace=0)
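As an alternative sketch (not part of the answer above), the same spacing can be requested when the figure is created, through the gridspec_kw argument of plt.subplots. hspace is expressed as a fraction of the average axes height, so a small positive value such as 0.05 should leave a thin gap instead of none. The snippet only builds the empty axes and measures the gaps between them; the colorbars from the question would be drawn into axes[0], axes[1] and axes[2] as before.

```python
import matplotlib.pyplot as plt

# Three stacked axes with no vertical gap between them.
fig, axes = plt.subplots(figsize=(8, 2), nrows=3, ncols=1,
                         sharex=True, sharey=True,
                         gridspec_kw={'hspace': 0})

# Vertical gap between consecutive axes, in figure coordinates:
# bottom edge of the upper axes minus top edge of the axes below it.
gaps = [axes[i].get_position().y0 - axes[i + 1].get_position().y1
        for i in range(len(axes) - 1)]
print(gaps)
```

Changing 'hspace': 0 to a larger value and printing gaps again is a quick way to tune the spacing gradually.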

Add style details to a barplot

I have prepared the following plot:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5, 6]
y = [-0.015, 0.386, -0.273, -0.091, 0.955, 1.727]
errors = [0.744, 0.954, 0.737, 0.969, 0.848, 0.460]
plt.figure(figsize=(8, 4))
plt.bar(x, y, yerr=errors, align='center', alpha=0.5, color='grey')
plt.xticks((0, 1, 2, 3, 4, 5, 6, 7), ('', 'I1', 'I2', 'I3', 'I4', 'I5', 'I6', ''))
plt.ylim((-3, 3))
plt.show()
I have a couple of questions:
How do I add the bottom and top dashes in the error segment?
I would like to color the background green if y > 0.8, yellow if -0.8 <= y <= 0.8, red if y < -0.8. How can I do it?

To add the horizontal caps, use capsize=n; the larger n is, the wider the caps. You can set the background color using axvspan(), choosing the color from the value of y. Below is the updated code. I changed one y value to show the red color...
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5, 6]
y = [-0.015, 0.386, -0.273, -0.091, -0.955, 1.727]
errors = [0.744, 0.954, 0.737, 0.969, 0.848, 0.460]
plt.figure(figsize=(8, 4))
ax=plt.bar(x, y, yerr=errors, align='center', alpha=0.5, color='grey', capsize=5)
plt.xticks((0, 1, 2, 3, 4, 5, 6, 7), ('', 'I1', 'I2', 'I3', 'I4', 'I5', 'I6', ''))
plt.ylim((-3, 3))
for i in range(1,7):
    # pick the background color from the value of y
    if y[i-1] > 0.8:
        color = 'green'
    elif y[i-1] >= -0.8:  # inclusive, so that y == -0.8 is yellow as asked
        color = 'yellow'
    else:
        color = 'red'
    plt.axvspan(i-0.5, i+.5, facecolor=color, alpha=0.3)
plt.show()
EDIT:
As requested, you are looking for horizontal ranges, so you need axhspan() instead. To make the gray bars solid, change alpha to 1 and bring the bars to the front with the zorder argument. I also added edgecolor='gray' to keep the look of a fully solid bar. Code is below...
plt.figure(figsize=(8, 4))
ax=plt.bar(x, y, yerr=errors, align='center', alpha=1, color='grey', edgecolor='gray',capsize=10, zorder=2)
plt.xticks((0, 1, 2, 3, 4, 5, 6, 7), ('', 'I1', 'I2', 'I3', 'I4', 'I5', 'I6', ''))
plt.ylim((-3, 3))
plt.axhspan(0.8, plt.gca().get_ylim()[1], facecolor='green', alpha=0.3, zorder=1)
plt.axhspan(-0.8, 0.8, facecolor='yellow', alpha=0.3, zorder=1)
plt.axhspan(plt.gca().get_ylim()[0], -0.8, facecolor='red', alpha=0.3, zorder=1)
plt.show()
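The color rule from the question (green if y > 0.8, yellow if -0.8 <= y <= 0.8, red if y < -0.8) can also be kept in one small function, which makes the boundary cases easy to check. zone_color is a hypothetical name chosen for this sketch; the thresholds are the ones stated in the question.

```python
def zone_color(y, low=-0.8, high=0.8):
    """Map a value to a background color using the thresholds from the question."""
    if y > high:
        return 'green'
    if y >= low:   # both bounds of the yellow band are inclusive
        return 'yellow'
    return 'red'


y = [-0.015, 0.386, -0.273, -0.091, -0.955, 1.727]
colors = [zone_color(v) for v in y]
print(colors)  # ['yellow', 'yellow', 'yellow', 'yellow', 'red', 'green']
```

The returned string can be passed directly as facecolor to axvspan() when coloring one vertical band per bar.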

Sharing Y-axis in a matplotlib subplots

I have been trying to create a matplotlib subplot (1 x 3) with horizontal bar plots on either side of a lineplot.
The code that generates the plot:
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
u_list = [2, 0, 0, 0, 1, 5, 0, 4, 0, 0]
n_list = [0, 0, 1, 0, 4, 3, 1, 1, 0, 6]
arr_ = list(np.arange(10, 11, 0.1))
data_ = pd.DataFrame({
'points': list(np.arange(0, 10, 1)),
'value': [10.4, 10.5, 10.3, 10.7, 10.9, 10.5, 10.6, 10.3, 10.2, 10.4][::-1]
})
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(20, 8))
ax1 = plt.subplot(1, 3, 1)
sns.barplot(x=u_list, y=arr_, orient="h", ax=ax1)
ax2 = plt.subplot(1, 3, 2)
x = data_['points'].tolist()
y = data_['value'].tolist()
ax2.plot(x, y)
ax2.set_yticks(arr_)
plt.gca().invert_yaxis()
ax3 = plt.subplot(1, 3, 3, sharey=ax1, sharex=ax1)
sns.barplot(x=n_list, y=arr_, orient="h", ax=ax3)
fig.tight_layout()
plt.show()
Edit
How do I share the y-axis of the central line plot with the other horizontal bar plots?

I would set the limits of all y-axes to the same range, set the ticks in all axes, and then set the ticks/tick labels of all but the leftmost axis to be empty. Here is what I mean:
from matplotlib import pyplot as plt
import numpy as np
u_list = [2, 0, 0, 0, 1, 5, 0, 4, 0, 0]
n_list = [0, 0, 1, 0, 4, 3, 1, 1, 0, 6]
arr_ = list(np.arange(10, 11, 0.1))
x = list(np.arange(0, 10, 1))
y = [10.4, 10.5, 10.3, 10.7, 10.9, 10.5, 10.6, 10.3, 10.2, 10.4]
fig, axs = plt.subplots(1, 3, figsize=(20, 8))
axs[0].barh(arr_,u_list,height=0.1)
axs[0].invert_yaxis()
axs[1].plot(x, y)
axs[1].invert_yaxis()
axs[2].barh(arr_,n_list,height=0.1)
axs[2].invert_yaxis()
for i in range(1,len(axs)):
    axs[i].set_ylim( axs[0].get_ylim() ) # align axes
    axs[i].set_yticks([]) # set ticks to be empty (no ticks, no tick-labels)
fig.tight_layout()
plt.show()
This is a minimal example, and for the sake of conciseness I refrained from mixing matplotlib and seaborn. Since seaborn uses matplotlib under the hood, you can reproduce the same output there (but with nicer bars).
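A possible variation, offered as a sketch rather than as the method of the answer above: creating the axes with sharey=True lets matplotlib keep the three y-axes in sync, so the limits do not have to be copied by hand. The data are the ones from the question; everything else is plain matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt

u_list = [2, 0, 0, 0, 1, 5, 0, 4, 0, 0]
n_list = [0, 0, 1, 0, 4, 3, 1, 1, 0, 6]
arr_ = np.arange(10, 11, 0.1)
x = np.arange(0, 10, 1)
y = [10.4, 10.5, 10.3, 10.7, 10.9, 10.5, 10.6, 10.3, 10.2, 10.4]

# sharey=True links the y-axes of all three panels.
fig, axs = plt.subplots(1, 3, figsize=(20, 8), sharey=True)
axs[0].barh(arr_, u_list, height=0.1)
axs[1].plot(x, y)
axs[2].barh(arr_, n_list, height=0.1)

# Inverting one of the shared axes inverts the linked ones as well.
axs[0].invert_yaxis()
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. With shared axes created this way, matplotlib shows y tick labels only on the leftmost panel by default.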

deleting line from figure in bokeh

I am new to Bokeh. I made a widget in which clicking a checkbox should add or delete a line in a Bokeh figure. I have 20 such checkboxes and I don't want to replot the whole figure, just delete one line when its checkbox is unchecked.
This is done through a callback, where I have access to the figure object. I would imagine there is a way to do something like this:
F=figure()
F.line('x', 'y', source=source, name='line1')
F.line('x', 'z', source=source, name='line2')
# in callback
selected_line_name = 'line1' # this would be determined by checkbox
selected_line = F.children[selected_line_name]
delete(selected_line)
However, I am unable to figure out how to
1) access a glyph from its parent object
2) delete a glyph
I tried setting the datasource 'y'=[], but since all column data sources have to be the same size, this removes all the plots...

There are several ways:
# Keep the glyphs in a variable:
line2 = F.line('x', 'z', source=source, name='line2')
# or get the glyph from the Figure:
line2 = F.select_one({'name': 'line2'})
# in callback:
line2.visible = False

This will work to maintain a shared 'x' data source column if glyphs are assigned as a variable and given a name attribute. The remove function fills the appropriate 'y' columns with nans, and the restore function replaces nans with the original values.
The functions require numpy and bokeh GlyphRenderer imports. I'm not sure that this method is worthwhile given the simple visible on/off option, but I am posting it anyway just in case this helps in some other use case.
Glyphs to remove or restore are referenced by glyph name(s), contained within a list.
src_dict = source.data.copy()
def remove_glyphs(figure, glyph_name_list):
    renderers = figure.select(dict(type=GlyphRenderer))
    for r in renderers:
        if r.name in glyph_name_list:
            col = r.glyph.y
            r.data_source.data[col] = [np.nan] * len(r.data_source.data[col])
def restore_glyphs(figure, src_dict, glyph_name_list):
    renderers = figure.select(dict(type=GlyphRenderer))
    for r in renderers:
        if r.name in glyph_name_list:
            col = r.glyph.y
            r.data_source.data[col] = src_dict[col]
Example:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import Range1d, ColumnDataSource
from bokeh.models.renderers import GlyphRenderer
import numpy as np
output_notebook()
p = figure(plot_width=200, plot_height=150,
           x_range=Range1d(0, 6),
           y_range=Range1d(0, 10),
           toolbar_location=None)
source = ColumnDataSource(data=dict(x=[1, 3, 5],
                                    y1=[1, 1, 2],
                                    y2=[1, 2, 6],
                                    y3=[1, 3, 9]))
src_dict = source.data.copy()
line1 = p.line('x', 'y1',
               source=source,
               color='blue',
               name='g1',
               line_width=3)
line2 = p.line('x', 'y2',
               source=source,
               color='red',
               name='g2',
               line_width=3)
line3 = p.line('x', 'y3',
               source=source,
               color='green',
               name='g3',
               line_width=3)
print(source.data)
show(p)
out:
{'x': [1, 3, 5], 'y1': [1, 1, 2], 'y2': [1, 2, 6], 'y3': [1, 3, 9]}
remove_glyphs(p, ['g1', 'g2'])
print(source.data)
show(p)
out:
{'x': [1, 3, 5], 'y1': [nan, nan, nan], 'y2': [nan, nan, nan], 'y3': [1, 3, 9]}
restore_glyphs(p, src_dict, ['g1', 'g3'])
print(source.data)
show(p)
('g3' was already on the plot, and is not affected)
out:
{'x': [1, 3, 5], 'y1': [1, 1, 2], 'y2': [nan, nan, nan], 'y3': [1, 3, 9]}
restore_glyphs(p, src_dict, ['g2'])
print(source.data)
show(p)
out:
{'x': [1, 3, 5], 'y1': [1, 1, 2], 'y2': [1, 2, 6], 'y3': [1, 3, 9]}

How to annotate a stacked bar plot and add legend labels

In short:
Height of bars does not match the numbers.
Labels seem to be placed on the wrong height. (should be right in the middle of each bar)
On the very bottom I also see the '0' labels which I really don't want to see in the graph.
Explained:
I'm trying to make a stacked bar chart and label each bar with its value. But for some reason the height of the bars is completely wrong. For the first week, the green bar should be 20 points long but it is only 10, and the red bar should be 10 points long but it is only 8 or so. Week 17 should have multiple bars in it but instead has only one (the white one).
I am guessing that because of the wrong bar heights the labels are misplaced too. I have no idea why the 0's on the very bottom are also showing but that's a problem too.
I don't know if these are all separate questions and should be asked in separate posts, but I feel like they are all connected and that there is an answer that solves them all.
import matplotlib.pyplot as plt
import numpy as np
newYearWeek =[201613, 201614, 201615, 201616, 201617, 201618, 201619, 201620, 201621, 201622]
uniqueNames = ['Word1', 'Word2', 'Word3', 'Word4', 'Word5', 'Word6',
               'Word7', 'Word8', 'Word9', 'Word10', 'Word11']
#Each column in the multiarray from top to bottom represents 1 week
#Each row from left to right represents the values of that word.
#So that makes 11 rows and 10 columns.
#And yes the multidimensional array have to be like this with the 0's in it.
keywordsMuliarray = [
[20, 3, 1, 0, 0, 1, 6, 3, 1, 2],
[10, 1, 0, 0, 3, 1, 3, 1, 0, 2],
[2, 2, 5, 3, 5, 4, 5, 4, 3, 2],
[0, 4, 3, 3, 1, 0, 2, 7, 1, 2],
[0, 0, 2, 0, 1, 1, 1, 0, 1, 3],
[0, 0, 3, 2, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 7, 6, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 2, 0, 1]]
fig = plt.figure(figsize=(8.5, 5.5))
ax = fig.add_subplot(111)
fig.subplots_adjust(top=0.85)
N = len(newYearWeek)
ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars: can also be len(x) sequence
colors = ['seagreen', 'indianred', 'steelblue', 'darkmagenta', 'wheat',
          'orange', 'mediumslateblue', 'silver',
          'whitesmoke', 'black', 'darkkhaki', 'dodgerblue', 'crimson',
          'sage', 'navy', 'plum', 'darkviolet', 'lightpink']
def autolabel(rects, values):
    # Attach some text labels.
    for (rect, value) in zip(rects, values):
        ax.text(rect.get_x() + rect.get_width() / 2.,
                rect.get_y() + rect.get_height() / 2.,
                '%d'%value,
                ha = 'center',
                va = 'center')
left = np.zeros(len(uniqueNames)) # left alignment of data starts at zero
helpingNumber = 0
for i in range(0, len(newYearWeek)):
    rects1 = plt.bar(ind, keywordsMuliarray[helpingNumber][:],width, color=colors[helpingNumber], label=uniqueNames[helpingNumber])
    autolabel(rects1, keywordsMuliarray[helpingNumber][:])
    helpingNumber = helpingNumber+1
# Shrink current axis by 20%
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 1, box.height])
# Put a legend to the right of the current axis
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
#plt.ylabel('Scores')
plt.xticks(ind + width/2., newYearWeek, fontsize=8)
#plt.yticks(np.arange(0, 81, 10))
plt.margins(x=0.02)
plt.tight_layout(rect=[0,0,0.8,1])
plt.show()
This is how the graph looks now:

To get what you want, each set of bars has to start where the previous ones end: pass the summed heights of all previous bars in the current column (the list bot_heights) as bottom, like here:
import matplotlib.pyplot as plt
import numpy as np
newYearWeek =[201613, 201614, 201615, 201616, 201617, 201618, 201619, 201620, 201621, 201622]
uniqueNames = ['Word1', 'Word2', 'Word3', 'Word4', 'Word5', 'Word6',
               'Word7', 'Word8', 'Word9', 'Word10', 'Word11']
#Each column in the multiarray from top to bottom represents 1 week
#Each row from left to right represents the values of that word.
#So that makes 11 rows and 10 columns.
#And yes the multidimensional array have to be like this with the 0's in it.
keywordsMuliarray = [
[20, 3, 1, 0, 0, 1, 6, 3, 1, 2],
[10, 1, 0, 0, 3, 1, 3, 1, 0, 2],
[2, 2, 5, 3, 5, 4, 5, 4, 3, 2],
[0, 4, 3, 3, 1, 0, 2, 7, 1, 2],
[0, 0, 2, 0, 1, 1, 1, 0, 1, 3],
[0, 0, 3, 2, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 7, 6, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 2, 0, 1]]
fig = plt.figure(figsize=(8.5, 5.5))
ax = fig.add_subplot(111)
fig.subplots_adjust(top=0.85)
N = len(newYearWeek)
ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars: can also be len(x) sequence
colors = ['seagreen', 'indianred', 'steelblue', 'darkmagenta', 'wheat',
          'orange', 'mediumslateblue', 'silver',
          'whitesmoke', 'black', 'darkkhaki', 'dodgerblue', 'crimson',
          'sage', 'navy', 'plum', 'darkviolet', 'lightpink']
def autolabel(rects, values):
    # Attach some text labels
    for (rect, value) in zip(rects, values):
        if value > 0:
            ax.text(rect.get_x() + rect.get_width() / 2.,
                    rect.get_y() + rect.get_height() / 2.,
                    '%d'%value, ha = 'center', va = 'center', size = 9)
left = np.zeros(len(uniqueNames)) # left alignment of data starts at zero
# plot the first bars
rects1 = plt.bar(ind, keywordsMuliarray[0][:],width,
                 color=colors[0], label=uniqueNames[0])
autolabel(rects1, keywordsMuliarray[0][:])
# put other bars on previous
bot_heights = [0.] * len(keywordsMuliarray[0][:])
for i in range(1,N):
    bot_heights = [bot_heights[j] + keywordsMuliarray[i-1][j] for j in range(len(bot_heights))]
    rects1 = plt.bar(ind, keywordsMuliarray[i][:],width,
                     color=colors[i], label=uniqueNames[i],
                     bottom=bot_heights)
    autolabel(rects1, keywordsMuliarray[i][:])
# Shrink current axis by 20%
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 1, box.height])
# Put a legend to the right of the current axis
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
#plt.ylabel('Scores')
plt.xticks(ind + width/2., newYearWeek, fontsize=8)
plt.yticks(np.arange(0, 41, 5))
plt.margins(x=0.02)
plt.tight_layout(rect=[0,0,0.8,1])
plt.show()
To prevent bar labels from overlapping, I recommend not adding a label when the value is zero (see the modified autolabel function). The result is a stacked chart with one centered label per non-zero segment.
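The stacking arithmetic itself can be checked on the raw numbers before any plotting. The following is a minimal sketch with a hypothetical helper, stacked_bottoms, which returns for every row the per-week sum of all rows before it. It loops over every row of the array, so it covers all eleven words.

```python
keywords = [
    [20, 3, 1, 0, 0, 1, 6, 3, 1, 2],
    [10, 1, 0, 0, 3, 1, 3, 1, 0, 2],
    [2, 2, 5, 3, 5, 4, 5, 4, 3, 2],
    [0, 4, 3, 3, 1, 0, 2, 7, 1, 2],
    [0, 0, 2, 0, 1, 1, 1, 0, 1, 3],
    [0, 0, 3, 2, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 7, 6, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 2, 0, 1]]


def stacked_bottoms(rows):
    """Return (bottoms, totals): bottoms[i] is the column-wise sum of rows[:i]."""
    running = [0] * len(rows[0])
    bottoms = []
    for row in rows:
        bottoms.append(list(running))   # where this row's bars start
        running = [b + v for b, v in zip(running, row)]
    return bottoms, running             # running now holds the full bar heights


bottoms, totals = stacked_bottoms(keywords)
print(bottoms[1])   # the second word starts on top of the first: [20, 3, 1, 0, 0, 1, 6, 3, 1, 2]
print(totals[0])    # total height of the first week's bar: 33
```

Each (row, bottom) pair from zip(keywords, bottoms) can then be passed to plt.bar(ind, row, width, bottom=bottom).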

The other answer doesn't plot data for 'Word11'
Lists and arrays of data can most easily be plotted by loading them into pandas
Plot the dataframe with pandas.DataFrame.plot and kind='bar'
When plotting data from pandas, the index values become the axis tick labels and the column names are the segment labels
matplotlib.pyplot.bar_label can be used to add annotations
See Adding value labels on a matplotlib bar chart for more options using .bar_label.
Tested in pandas 1.3.1, python 3.8 [1], and matplotlib 3.4.2 [1].
[1] Minimum version required.
On Python older than 3.8, build the labels without the assignment expression (:=): labels = [f'{v.get_height():0.0f}' if v.get_height() > 0 else '' for v in c ]
import pandas as pd
import matplotlib.pyplot as plt
# create a dataframe from the data in the OP and transpose it with .T
df = pd.DataFrame(data=keywordsMuliarray, index=uniqueNames, columns=newYearWeek).T
# display(df.head())
#         Word1  Word2  Word3  Word4  Word5  Word6  Word7  Word8  Word9  Word10  Word11
# 201613     20     10      2      0      0      0      1      0      0       0       0
# 201614      3      1      2      4      0      0      0      0      1       0       0
# 201615      1      0      5      3      2      3      1      0      0       0       0
# 201616      0      0      3      3      0      2      0      1      0       0       0
# 201617      0      3      5      1      1      0      1      0      7       0       0
colors = ['seagreen', 'indianred', 'steelblue', 'darkmagenta', 'wheat', 'orange', 'mediumslateblue', 'silver', 'whitesmoke', 'black', 'darkkhaki']
# plot the dataframe
ax = df.plot(kind='bar', stacked=True, figsize=(9, 6), color=colors, rot=0, ec='k')
# Put a legend to the right of the current axis
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
# add annotations
for c in ax.containers:
    # customize the label to account for cases when there might not be a bar section
    labels = [f'{h:0.0f}' if (h := v.get_height()) > 0 else '' for v in c ]
    # set the bar label
    ax.bar_label(c, labels=labels, label_type='center', fontsize=8)
plt.show()
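The label rule used in the loop can be tried on plain numbers, without a plot. This is only an illustrative sketch: segment_labels is a hypothetical helper that takes a list of bar heights instead of a bar container, and it avoids the assignment expression so that it also runs on Python versions older than 3.8.

```python
def segment_labels(heights):
    """Format each height with no decimals; return '' for empty segments."""
    labels = []
    for h in heights:
        labels.append(f'{h:0.0f}' if h > 0 else '')
    return labels


print(segment_labels([20.0, 0.0, 3.0]))  # ['20', '', '3']
```

Inside the plotting loop, the same list could be built from [v.get_height() for v in c] and passed to ax.bar_label through its labels argument.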
