In the Bokeh guide there are examples of various bar charts that can be created. http://docs.bokeh.org/en/0.10.0/docs/user_guide/charts.html#id4
This code will create one:
from bokeh.charts import Bar, output_file, show
from bokeh.sampledata.autompg import autompg as df
p = Bar(df, 'cyl', values='mpg', title="Total MPG by CYL")
output_file("bar.html")
show(p)
My question is whether it's possible to add data labels to each individual bar of the chart. I searched online but could not find a clear answer.
Use LabelSet
Use LabelSet to create a label over each individual bar.
In my example I'm using vbar with the plotting interface. It is a little more low-level than the Charts interface, but there might be a way to add it to the Bar chart.
from bokeh.palettes import PuBu
from bokeh.io import show, output_notebook
from bokeh.models import ColumnDataSource, ranges, LabelSet
from bokeh.plotting import figure
output_notebook()
source = ColumnDataSource(dict(x=['Áætlaðir','Unnir'],y=[576,608]))
x_label = ""
y_label = "Tímar (klst)"
title = "Tímar; núllti til þriðji sprettur."
plot = figure(plot_width=600, plot_height=300, tools="save",
              x_axis_label=x_label,
              y_axis_label=y_label,
              title=title,
              x_minor_ticks=2,
              x_range=source.data["x"],
              y_range=ranges.Range1d(start=0, end=700))
labels = LabelSet(x='x', y='y', text='y', level='glyph',
                  x_offset=-13.5, y_offset=0, source=source, render_mode='canvas')
plot.vbar(source=source,x='x',top='y',bottom=0,width=0.3,color=PuBu[7][2])
plot.add_layout(labels)
show(plot)
You can find more about labelset here: Bokeh annotations
NOTE FROM BOKEH MAINTAINERS: The portions of the answer below that refer to bokeh.charts are of historical interest only. The bokeh.charts API was deprecated and subsequently removed from Bokeh. See the answers here and above for information on the stable bokeh.plotting API.
Yes, you can add labels to each bar of the chart. There are a few ways to do this. By default, your labels are tied to your data. But you can change what is displayed. Here are a few ways to do that using your example:
from bokeh.charts import Bar, output_file, show
from bokeh.sampledata.autompg import autompg as df
from bokeh.layouts import gridplot
from pandas import DataFrame
from bokeh.plotting import figure, ColumnDataSource
from bokeh.models import Range1d, HoverTool
# output_file("bar.html")
""" Adding some sample labels a few different ways.
Play with the sample data and code to get an idea what does what.
See below for output.
"""
Sample data (new labels):
I used some logic to determine the new dataframe column. Of course you could use another column already in df (it all depends on what data you're working). All you really need here is to supply a new column to the dataframe.
# One method
labels = []
for number in df['cyl']:
    if number == 3:
        labels.append("three")
    if number == 4:
        labels.append("four")
    if number == 5:
        labels.append("five")
    if number == 6:
        labels.append("six")
    if number == 8:
        labels.append("eight")
df['labels'] = labels
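The chain of if statements above can also be written as a dictionary lookup. This is a minimal sketch on a plain list rather than the DataFrame column; CYL_NAMES and label_cylinders are illustrative names, not part of Bokeh or the original answer, and the fallback for unknown values is an assumption.

```python
# Hypothetical lookup table: cylinder count -> spelled-out label.
CYL_NAMES = {3: "three", 4: "four", 5: "five", 6: "six", 8: "eight"}


def label_cylinders(cylinders):
    """Return one label per value; unknown counts fall back to str(value)."""
    return [CYL_NAMES.get(n, str(n)) for n in cylinders]


print(label_cylinders([8, 4, 6, 3]))
```

With pandas, the same dictionary could be passed directly to df['cyl'].map(CYL_NAMES).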
Another way to get a new dataframe column. Again, we just need to supply df a new column to use on our bar plot.
# Another method
def new_labels(x):
    if x % 2 != 0 or x == 6:
        y = "Inline"
    elif x % 2 == 0:
        y = "V"
    else:
        y = "nan"
    return y
df["more_labels"] = df["cyl"].map(new_labels)
Now the bar chart:
I've done it two ways. p1 just specifies the new labels. Note that because I used strings it put them in alphabetical order on the chart. p2 uses the original labels, plus adds my new labels on the same bar.
# Specifying your labels
p1 = Bar(df, label='labels', values='mpg',
         title="Total MPG by CYL, remapped labels, p1",
         width=400, height=400, legend="top_right")
p2 = Bar(df, label=['cyl', 'more_labels'], values='mpg',
         title="Total MPG by CYL, multiple labels, p2", width=400, height=400,
         legend="top_right")
Another way:
Bokeh has three main "interface levels". The high-level charts interface provides quick, easy access but limited functionality; plotting gives more options; models gives even more.
Here I'm using the plotting interface and the Figure class that contains a rect method. This gives you more detailed control of your chart.
# Plot with "intermediate-level" bokeh.plotting interface
new_df = DataFrame(df.groupby(['cyl'])['mpg'].sum())
factors = ["three", "four", "five", "six", "eight"]
ordinate = new_df['mpg'].tolist()
mpg = [x * 0.5 for x in ordinate]
p3 = figure(x_range=factors, width=400, height=400,
            title="Total MPG by CYL, using 'rect' instead of 'bar', p3")
p3.rect(factors, y=mpg, width=0.75, height=ordinate)
p3.y_range = Range1d(0, 6000)
p3.xaxis.axis_label = "x axis name"
p3.yaxis.axis_label = "Sum(Mpg)"
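Note that rect positions each rectangle by its center, which is why the y values passed to p3.rect are half of each bar's height. A small sketch of that arithmetic, using made-up totals rather than the autompg sums (bar_centers is an illustrative name):

```python
def bar_centers(heights):
    """Center y-coordinate for bars that should start at y = 0."""
    return [h / 2 for h in heights]


heights = [60.0, 120.0, 90.0]  # made-up totals for illustration
centers = bar_centers(heights)
print(centers)
```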
A fourth way to add specific labels:
Here I'm using the hover plot tool. Hover over each bar to display your specified label.
# With HoverTool, using 'quad' instead of 'rect'
top = [int(x) for x in ordinate]
bottom = [0] * len(top)
left = [x - 0.2 for x in range(1, len(top) + 1)]
right = [x + 0.2 for x in range(1, len(top) + 1)]
cyl = ["three", "four", "five", "six", "eight"]
source = ColumnDataSource(
    data=dict(
        top=top,
        bottom=bottom,
        left=left,
        right=right,
        cyl=cyl,
    )
)
hover = HoverTool(
    tooltips=[
        ("cyl", "@cyl"),
        ("sum", "@top")
    ]
)
p4 = figure(width=400, height=400,
            title="Total MPG by CYL, with HoverTool and 'quad', p4")
p4.add_tools(hover)
# Refer to the columns in source by name so the hover tool can look them up
p4.quad(top='top', bottom='bottom', left='left', right='right',
        color="green", source=source)
p4.xaxis.axis_label = "x axis name"
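Unlike rect, quad takes explicit edges. The left and right lists above place bar i between i - 0.2 and i + 0.2 on a numeric axis. A sketch of that calculation (quad_edges and half_width are illustrative names, not Bokeh parameters):

```python
def quad_edges(n_bars, half_width=0.2):
    """Left/right edges for bars centered on 1, 2, ..., n_bars."""
    centers = range(1, n_bars + 1)
    left = [c - half_width for c in centers]
    right = [c + half_width for c in centers]
    return left, right


left, right = quad_edges(5)
print(list(zip(left, right)))
```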
Show all four charts in a grid:
grid = gridplot([[p1, p2], [p3, p4]])
show(grid)
These are the ways I am aware of. There may be others. Change whatever you like to fit your needs. Here is what running all of this will output (you'll have to run it or serve it to get the hovertool):
I've included the PolyDrawTool in my Bokeh plot to let users circle points. When a user draws a line near the edge of the plot the tool expands the axes which often messes up the shape. Is there a way to freeze the axes while a user is drawing on the plot?
I'm using bokeh 1.3.4
MRE:
import numpy as np
import pandas as pd
import string
from bokeh.io import show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.models import PolyDrawTool, MultiLine
def prepare_plot():
    embedding_df = pd.DataFrame(np.random.random((100, 2)), columns=['x', 'y'])
    embedding_df['word'] = embedding_df.apply(lambda x: ''.join(np.random.choice(list(string.ascii_lowercase), (8,))), axis=1)
    # Plot preparation configuration Data source
    source = ColumnDataSource(ColumnDataSource.from_df(embedding_df))
    labels = LabelSet(x="x", y="y", text="word", y_offset=-10, x_offset=5,
                      text_font_size="10pt", text_color="#555555",
                      source=source, text_align='center')
    plot = figure(plot_width=1000, plot_height=500, active_scroll="wheel_zoom",
                  tools='pan, box_select, wheel_zoom, save, reset')
    # Configure free-hand draw
    draw_source = ColumnDataSource(data={'xs': [], 'ys': [], 'color': []})
    renderer = plot.multi_line('xs', 'ys', line_width=5, alpha=0.4, color='color', source=draw_source)
    renderer.selection_glyph = MultiLine(line_color='color', line_width=5, line_alpha=0.8)
    draw_tool = PolyDrawTool(renderers=[renderer], empty_value='red')
    plot.add_tools(draw_tool)
    # Add the data and labels to plot
    plot.circle("x", "y", size=0, source=source, line_color="black", fill_alpha=0.8)
    plot.add_layout(labels)
    return plot
if __name__ == '__main__':
    plot = prepare_plot()
    show(plot)
The PolyDrawTool actually updates a ColumnDataSource to drive a glyph that draws what the user indicates. The behavior you are seeing is a natural consequence of that fact, combined with Bokeh's default auto-ranging DataRange1d (which by default also considers every glyph when computing the auto-bounds). So, you have two options:
Don't use DataRange1d at all, e.g. you can provide fixed axis bounds when you call figure:
p = figure(..., x_range=(0, 10), y_range=(-20, 20))
or you can set them after the fact:
p.x_range = Range1d(0, 10)
p.y_range = Range1d(-20, 20)
Of course, with this approach you will no longer get any auto-ranging at all; you will need to set the axis ranges to exactly the start/end that you want.
Make DataRange1d be more selective by explicitly setting its renderers property:
r = p.circle(...)
p.x_range.renderers = [r]
p.y_range.renderers = [r]
Now the DataRange models will only consider the circle renderer when computing the auto-ranged start/end.
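For the first option, the fixed bounds can be computed once from the scatter data so that the initial view still fits the points. A minimal sketch, assuming a fractional margin around the data is acceptable; padded_bounds is a hypothetical helper, not a Bokeh function:

```python
def padded_bounds(values, pad=0.05):
    """Return (start, end) covering values plus a fractional margin."""
    lo, hi = min(values), max(values)
    margin = (hi - lo) * pad
    return lo - margin, hi + margin


xs = [0.0, 2.5, 10.0]  # made-up coordinates for illustration
print(padded_bounds(xs, pad=0.1))
```

The resulting tuples could then be passed as x_range and y_range when calling figure, as in the first snippet above.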
I have more than one line on a bokeh plot, and I want the HoverTool to show the value for each line, but using the method from a previous stackoverflow answer isn't working:
https://stackoverflow.com/a/27549243/3087409
Here's the relevant code snippet from that answer:
fig = bp.figure(tools="reset,hover")
s1 = fig.scatter(x=x,y=y1,color='#0000ff',size=10,legend='sine')
s1.select(dict(type=HoverTool)).tooltips = {"x":"$x", "y":"$y"}
s2 = fig.scatter(x=x,y=y2,color='#ff0000',size=10,legend='cosine')
fig.select(dict(type=HoverTool)).tooltips = {"x":"$x", "y":"$y"}
And here's my code:
from bokeh.models import HoverTool
from bokeh.plotting import figure
source = ColumnDataSource(data=dict(
    x = [list of datetimes],
    wind = [some list],
    coal = [some other list],
    )
)
hover = HoverTool(mode = "vline")
plot = figure(tools=[hover], toolbar_location=None,
              x_axis_type='datetime')
plot.line('x', 'wind')
plot.select(dict(type=HoverTool)).tooltips = {"y":"@wind"}
plot.line('x', 'coal')
plot.select(dict(type=HoverTool)).tooltips = {"y":"@coal"}
As far as I can tell, it's equivalent to the code in the answer I linked to, but when I hover over the figure, both hover tool boxes show the same value, that of the wind.
You need to add a separate HoverTool, restricted to a single renderer, for each line. Check this. Also, do not use the same label for both values; give each tooltip its own name.
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure, show
source = ColumnDataSource(data=df)
plot = figure(x_axis_type='datetime', plot_width=800, plot_height=300)
plot1 = plot.line(x='x', y='wind', source=source, color='blue')
plot.add_tools(HoverTool(renderers=[plot1], tooltips=[('wind', "@wind")], mode='vline'))
plot2 = plot.line(x='x', y='coal', source=source, color='red')
plot.add_tools(HoverTool(renderers=[plot2], tooltips=[("coal", "@coal")], mode='vline'))
show(plot)
The output looks like this.
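Each HoverTool above pairs one renderer with one tooltip list. When there are many series, those lists can be generated from the column names. The sketch below only builds the tooltip lists (tooltip_for is an illustrative name); the @{...} form is Bokeh's syntax for column names that contain spaces.

```python
def tooltip_for(column):
    """Build a one-row tooltip list referencing a data-source column."""
    # Column names with spaces must be wrapped in braces: @{col name}
    field = "@{%s}" % column if " " in column else "@" + column
    return [(column, field)]


for name in ["wind", "coal", "natural gas"]:
    print(tooltip_for(name))
```

Each list could then be passed as tooltips= to a HoverTool whose renderers= contains only the matching line.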
I am trying to plot a few points on a graph, similarly to a heat map.
Sample code (adapted from the heat map section here):
import pandas as pd
from bokeh.io import output_notebook, show
from bokeh.models import BasicTicker, ColorBar, ColumnDataSource, LinearColorMapper, PrintfTickFormatter
from bokeh.plotting import figure
from bokeh.transform import transform
import numpy as np
# change this if you don't run it on a Jupyter Notebook
output_notebook()
testx = np.random.randint(0,10,10)
testy = np.random.randint(0,10,10)
npdata = np.stack((testx,testy), axis = 1)
hist, bins = np.histogramdd(npdata, normed = False, bins = (10,10), range=((0,10),(0,10)))
data = pd.DataFrame(hist, columns = [str(x) for x in range(10)])
data.columns.name = 'y'
data['x'] = [str(x) for x in range(10)]
data = data.set_index('x')
df = pd.DataFrame(data.stack(), columns=['present']).reset_index()
source = ColumnDataSource(df)
colors = ['lightblue', "yellow"]
mapper = LinearColorMapper(palette=colors, low=df.present.min(), high=df.present.max())
p = figure(plot_width=400, plot_height=400, title="test circle map",
           x_range=list(data.index), y_range=list((data.columns)),
           toolbar_location=None, tools="", x_axis_location="below")
p.circle(x="x", y="y", size=20, source=source,
         line_color=None, fill_color=transform('present', mapper))
p.axis.axis_line_color = None
p.axis.major_tick_line_color = None
p.axis.major_label_text_font_size = "10pt"
p.axis.major_label_standoff = 10
p.xaxis.major_label_orientation = 0
show(p)
That returns:
Now, as you can see, the grid lines are centered on the points (circles). Instead, I would like each circle to be enclosed in a square formed by the grid lines.
I went through this to see if I could find information on how to offset the grid lines by 0.5 (that would have worked), but I was not able to.
There's nothing built into Bokeh to accomplish this kind of offsetting of categorical ticks, but you can write a custom extension to do it:
from bokeh.models import CategoricalTicker
CS_CODE = """
import {CategoricalTicker} from "models/tickers/categorical_ticker"
export class MyTicker extends CategoricalTicker
  type: "MyTicker"
  get_ticks: (start, end, range, cross_loc) ->
    ticks = super(start, end, range, cross_loc)
    # shift the default tick locations by half a categorical bin width
    ticks.major = ([x, 0.5] for x in ticks.major)
    return ticks
"""
class MyTicker(CategoricalTicker):
    __implementation__ = CS_CODE
p.xgrid.ticker = MyTicker()
p.ygrid.ticker = MyTicker()
Note that Bokeh assumes CoffeeScript by default when the code is just a string, but it's possible to use pure JS or TypeScript as well. Adding this to your code yields:
Please note the comment about output_notebook: you must call it (possibly again, if you have called it previously) after the custom model is defined, due to #6107.
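The comprehension inside get_ticks turns each categorical tick x into the pair [x, 0.5], that is, the category plus an offset of half a bin. The same mapping written in Python, purely to illustrate what the ticker returns (shift_ticks is a hypothetical name and is not a substitute for the custom model):

```python
def shift_ticks(factors, offset=0.5):
    """Pair each categorical factor with a numeric offset."""
    return [(f, offset) for f in factors]


print(shift_ticks(["0", "1", "2"]))
```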
I have a dataframe as
df = pd.DataFrame(data={'Country': ['Spain', 'Japan', 'Brazil'], 'Number': [10, 20, 30]})
I wanted to plot a bar chart with labels (that is, the value of 'Number') annotated on top of each bar, and proceeded as follows.
from bokeh.charts import Bar, output_file,output_notebook, show
from bokeh.models import Label
p = Bar(df,'Country', values='Number',title="Analysis", color = "navy")
label = Label(x='Country', y='Number', text='Number', level='glyph',x_offset=5, y_offset=-5)
p.add_annotation(label)
output_notebook()
show(p)
But I got the error ValueError: expected a value of type Real, got Country of type str.
How do I solve this issue?
Label produces a single label at position x and y. In your example, you are trying to add multiple labels using the data from your DataFrame as coordinates, which is why you are getting the error message: x and y need to be real coordinate values that map to the figure's x_range and y_range. You should look into using LabelSet (link), which can take a Bokeh ColumnDataSource as an argument and build multiple labels.
Unfortunately, you are also using a Bokeh Bar chart, a high-level chart that creates a categorical y_range. Bokeh cannot put labels on categorical y_ranges for now. You can circumvent this problem by creating a lower-level vbar chart using placeholder x values and then styling it to give it the same look as your original chart. Here it is in action.
import pandas as pd
from bokeh.plotting import output_file, show, figure
from bokeh.models import LabelSet, ColumnDataSource, FixedTicker
# arbitrary placeholders which depend on the length and number of labels
x = [1, 2, 3]
# This offset is based on the length of the string and the placeholder size
offset = -0.05
x_label = [xi + offset for xi in x]
df = pd.DataFrame(data={'Country': ['Spain', 'Japan', 'Brazil'],
                        'Number': [10, 20, 30],
                        'x': x,
                        'y_label': [-1.25, -1.25, -1.25],
                        'x_label': x_label})
source = ColumnDataSource(df)
p = figure(title="Analysis", x_axis_label='Country', y_axis_label='Number')
p.vbar(x='x', width=0.5, top='Number', color="navy", source=source)
p.xaxis.ticker = FixedTicker(ticks=x) # Create custom ticks for each country
p.xaxis.major_label_text_font_size = '0pt' # turn off x-axis tick labels
p.xaxis.minor_tick_line_color = None # turn off x-axis minor ticks
label = LabelSet(x='x_label', y='y_label', text='Number',
                 level='glyph', source=source)
p.add_layout(label)
show(p)
I'm new to bokeh and I just jumped right into using hovertool as that's why I wanted to use bokeh in the first place.
Now I'm plotting genes and what I want to achieve is multiple lines with the same y-coordinate and when you hover over a line you get the name and position of this gene.
I have tried to mimic this example, but for some reason I can't even get it to show coordinates.
I'm sure that if someone who actually knows their way around bokeh looks at this code, the mistake will be apparent and I'd be very thankful if they showed it to me.
from bokeh.plotting import figure, HBox, output_file, show, VBox, ColumnDataSource
from bokeh.models import Range1d, HoverTool
from collections import OrderedDict
import random
ys = [10 for x in range(len(levelsdf2[(name, 'Start')]))]
xscale = zip(levelsdf2[('Log', 'Start')], levelsdf2[('Log', 'Stop')])
yscale = zip(ys,ys)
TOOLS="pan,wheel_zoom,box_zoom,reset,hover"
output_file("scatter.html")
hover_tips = levelsdf2.index.values
colors = ["#%06x" % random.randint(0,0xFFFFFF) for c in range(len(xscale))]
source = ColumnDataSource(
    data=dict(
        x=xscale,
        y=yscale,
        gene=hover_tips,
        colors=colors,
    )
)
p1 = figure(plot_width=1750, plot_height=950,y_range=[0, 15],tools=TOOLS)
p1.multi_line(xscale[1:10],yscale[1:10], alpha=1, source=source,line_width=10, line_color=colors[1:10])
hover = p1.select(dict(type=HoverTool))
hover.tooltips = [
    ("index", "$index"),
    ("(x,y)", "($x, $y)"),
]
show(p1)
levelsdf2 is a pandas.DataFrame, if it matters.
I figured it out on my own. It turns out that version 0.8.2 of Bokeh doesn't allow hovertool for lines so I did the same thing using quads.
from bokeh.plotting import figure, HBox, output_file, show, VBox, ColumnDataSource
from bokeh.models import Range1d, HoverTool
from collections import OrderedDict
import random
xscale = zip(levelsdf2[('series1', 'Start')], levelsdf2[('series1', 'Stop')])
xscale2 = zip(levelsdf2[('series2', 'Start')], levelsdf2[('series2', 'Stop')])
yscale2 = zip([9.2 for x in range(len(levelsdf2[(name, 'Start')]))],[9.2 for x in range(len(levelsdf2[(name, 'Start')]))])
TOOLS="pan,wheel_zoom,box_zoom,reset,hover"
output_file("linesandquads.html")
hover_tips = levelsdf2.index.values
colors = ["#%06x" % random.randint(0,0xFFFFFF) for c in range(len(xscale))]
proc1 = 'Log'
proc2 = 'MazF2h'
expression1 = levelsdf2[(proc1, 'Level')]
expression2 = levelsdf2[(proc2, 'Level')]
source = ColumnDataSource(
    data=dict(
        start=[min(xscale[x]) for x in range(len(xscale))],
        stop=[max(xscale[x]) for x in range(len(xscale))],
        start2=[min(xscale2[x]) for x in range(len(xscale2))],
        stop2=[max(xscale2[x]) for x in range(len(xscale2))],
        gene=hover_tips,
        colors=colors,
        expression1=expression1,
        expression2=expression2,
    )
)
p1 = figure(plot_width=900, plot_height=500,y_range=[8,10.5],tools=TOOLS)
p1.quad(left="start", right="stop", top=[9.211 for x in range(len(xscale))],
        bottom = [9.209 for x in range(len(xscale))], source=source, color="colors")
p1.multi_line(xscale2,yscale2, source=source, color="colors", line_width=20)
hover = p1.select(dict(type=HoverTool))
hover.tooltips = OrderedDict([
    (proc1+" (start,stop, expression)", "(@start| @stop| @expression1)"),
    ("Gene","@gene"),
])
show(p1)
Works like a charm.
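Note that the code above indexes the result of zip, which only works in Python 2. In Python 3, zip returns an iterator, so the pairs would need to be materialized first. A sketch of building the quad columns that way, with made-up coordinates (interval_columns is an illustrative helper, not part of Bokeh):

```python
def interval_columns(starts, stops):
    """Return start/stop columns for quads, tolerating reversed pairs."""
    pairs = list(zip(starts, stops))  # list() makes the pairs reusable
    return {
        "start": [min(p) for p in pairs],
        "stop": [max(p) for p in pairs],
    }


# Made-up coordinates; the second pair is deliberately reversed.
cols = interval_columns([100, 900, 1500], [400, 650, 1800])
print(cols)
```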
EDIT: Added a picture of the result, as requested and edited code to match the screenshot posted.
It's not the best solution, as it turns out it's not all that easy to plot several series of quads on one plot. It's probably possible, but as it didn't matter much in my use case I didn't investigate too vigorously.
As all genes are represented on all series at the same place, I just added tooltips for all series to the quads and plotted the other series as multi_line plots on the same figure.
This means that if you hovered on the top line at 9.21 you'd get tooltips for the line at 9.2 as well, but if you hovered on the 9.2 line you wouldn't get a tooltip at all.