I have a DataFrame with multiple columns. The first two columns are x and y coordinates, and the remaining columns are different property values for each (x, y) pair.
import pandas as pd
import numpy as np
df = pd.DataFrame()
df['x'] = np.random.randint(1,1000,100)
df['y'] = np.random.randint(1,1000,100)
df['val1'] = np.random.randint(1,1000,100)
df['val2'] = np.random.randint(1,1000,100)
df['val3'] = np.random.randint(1,1000,100)
print(df.head())
     x    y  val1  val2  val3
0  337  794   449   969   933
1   19  563   592   677   886
2  512  467   664   160    16
3   36  112    91   230   910
4  972  572   336   879   860
Using CustomJS in Bokeh, I would like to change which column determines the color in a 2-D heatmap by providing a drop-down menu.
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
from bokeh.models import LinearColorMapper
from bokeh.palettes import RdYlBu11 as palette
p = figure()
source = ColumnDataSource(df)
color_mapper = LinearColorMapper(palette=palette)
p.patches('x', 'y', source=source,\
fill_color={'field': 'val1', 'transform':color_mapper})
show(p)
The above commands plot a color map whose color is determined by the column 'val1'. I would like to plot different columns (either val1, val2, or val3) based on whatever is selected in the drop down menu.
I can create a drop down widget in bokeh by doing
from bokeh.models.widgets import Select
select = Select(title="Option:", value="val1", options=["val1","val2","val3"])
But I'm not sure how to use the selected value to update the plot through a callback.
Could someone give me a guideline here?
Thanks.
I have included an example with comments inline in the code. The main step is to write the JavaScript code that is executed each time the selected option on the widget changes. The code simply re-assigns which of the columns supplies the values for the 'y' column of the data source.
The other issue is that your data is just x and y points. The patches glyph requires a different data structure, one that defines the boundaries of the patches. I believe there are better ways to make a heatmap in Bokeh; there should be numerous examples on Stack Overflow and in the Bokeh docs.
import pandas as pd
import numpy as np
from bokeh.io import show
from bokeh.layouts import row
from bokeh.models import ColumnDataSource, CustomJS
df = pd.DataFrame()
df['x'] = np.random.randint(1,1000,1000)
df['y'] = np.random.randint(1,1000,1000)
df['val1'] = np.random.randint(1,1000,1000)
df['val2'] = np.random.randint(1,1000,1000)
df['val3'] = np.random.randint(1,1000,1000)
from bokeh.plotting import figure
from bokeh.models import LinearColorMapper
from bokeh.palettes import RdYlBu11 as palette
p = figure(x_range=(0,1000),y_range=(0,1000))
source = ColumnDataSource(df)
source_orig = ColumnDataSource(df)
color_mapper = LinearColorMapper(palette=palette)
p.rect('x', 'y', source=source,width=4,height=4,
color={'field': 'val1', 'transform':color_mapper})
from bokeh.models.widgets import Select
select = Select(title="Option:", value="val1", options=["val1","val2","val3"])
callback = CustomJS(args={'source':source},code="""
// print the selected value of the select widget -
// this is printed in the browser console.
// cb_obj is the callback object, in this case the select
// widget. cb_obj.value is the selected value.
console.log(' changed selected option', cb_obj.value);
// create a new variable for the data of the column data source
// this is linked to the plot
var data = source.data;
// allocate the selected column to the field for the y values
data['y'] = data[cb_obj.value];
// register the change - this is required to process the change in
// the y values
source.change.emit();
""")
# Add the callback to the select widget.
# This executes each time the selected option changes
select.js_on_change('value', callback)
show(row(p,select))
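As a side note on the patches remark above: patches expects one list of boundary coordinates per patch, not one point per row. Here is a minimal sketch in plain Python of building such lists, assuming each (x, y) point should become a square of a given size. The helper name square_patches and the size are illustrative only and are not part of Bokeh.

```python
def square_patches(xs, ys, size):
    """Return (patch_xs, patch_ys): one list of four corners per point."""
    half = size / 2
    # Corners are listed counter-clockwise, starting at the lower left.
    patch_xs = [[x - half, x + half, x + half, x - half] for x in xs]
    patch_ys = [[y - half, y - half, y + half, y + half] for y in ys]
    return patch_xs, patch_ys


patch_xs, patch_ys = square_patches([10, 20], [5, 15], size=4)
print(patch_xs[0])  # x coordinates of the first square's corners
print(patch_ys[0])  # y coordinates of the first square's corners
```

The two nested lists could then be stored as columns of a ColumnDataSource and passed to p.patches in place of 'x' and 'y'.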
The Goal:
To generate a scatter plot from a pandas DataFrame with 3 columns: x, y, type (either a, b or c). The data points should have different colors based on the type. Every data point should have a hover effect. However, data points with type c should have a tap effect too. The data file (data_file.csv) looks something like:
x   y   z
1   4   a
2   3   b
3   2   a
4   4   c
..  ..  ..
My attempt:
First, I imported the DataFrame and divided it into two parts: one with the type c data and another with everything else. Then I created two ColumnDataSource objects and plotted the data. Is there a shortcut or a better way than this? Also, I couldn't achieve some features (see below).
Code:
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, OpenURL, TapTool
from bokeh.models.tools import HoverTool
from bokeh.transform import factor_cmap
import pandas as pd
file = "data.csv"
df = pd.read_csv(file, skiprows=1, header=None, sep="\t")
# now I will separate the dataframe into two: one with types a & b
# and another dataframe containing type c
c_df = df.drop(df[df[2] != 'c'].index)
ab_df = df.drop(df[df[2] == 'c'].index)
ab_source = ColumnDataSource(data=dict(
Independent = ab_df[0],
Dependent = ab_df[1],
Type = ab_df[2]
))
c_source = ColumnDataSource(data=dict(
Independent = c_df[0],
Dependent = c_df[1],
Type = c_df[2],
link = "http://example.com/" + c_df[0].apply(str) + ".php"
))
p = figure(title="Random PLot")
p.circle('Independent', 'Dependent',
size=10,
source=ab_source,
color=factor_cmap('Type',
['red', 'blue'],
['a', 'b']),
legend_group='Type'
)
p.circle('Independent', 'Dependent',
size=12,
source=c_source,
color=factor_cmap('Type',
['green'],
['c']),
name='needsTapTool'
)
p.legend.title = "Type"
hover = HoverTool()
hover.tooltips = """
<div>
<h3>Type: @Type</h3>
<p> @Independent and @Dependent </p>
</div>
"""
p.add_tools(hover)
url = "@link"
tap = TapTool(names=['needsTapTool'])
tap.callback = OpenURL(url=url)
p.add_tools(tap)
show(p)
Problems:
(1) How can I add two different hover tools so that data points behave differently depending on their type? Whenever I add another hover tool, only the last one takes effect.
(2) How can I use only part of a value in a CDS? For example, imagine I have a column called 'href' that contains a link including the "http://www" part. How can I set the 'link' variable inside a CDS so that it doesn't contain this part? When I try:
c_source = ColumnDataSource(data=dict(
link = c_df[3].apply(str)[10:]
))
I get a KeyError. Any help will be appreciated.
It is possible to define multiple tools, even multiple HoverTools, in one plot. The trick is to collect the renderers and assign them to a specific tool.
In the example below, two HoverTools and one TapTool are added.
import pandas as pd
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, OpenURL, TapTool, HoverTool
output_notebook()
df = pd.DataFrame({'x':[1,2,3,4], 'y':[4,3,2,1], 'z':['a','b','a','c']})
>>
x y z
0 1 4 a
1 2 3 b
2 3 2 a
3 4 1 c
color = {'a':'red', 'b':'blue', 'c':'green'}
p = figure(width=300, height=300)
# list to collect the renderers
renderers = []
for item in df['z'].unique():
    df_group = df[df['z']==item].copy()
    # if group 'c', add some urls
    if item == 'c':
        # url with "https"
        df_group['link'] = df_group.loc[:,'x'].apply(lambda x: f"https://www.example.com/{x}.php")
        # url without "https"
        df_group['plain_link'] = df_group.loc[:,'x'].apply(lambda x: f"example.com/{x}.php")
    renderers.append(
        p.circle(
            'x',
            'y',
            size=10,
            source=ColumnDataSource(df_group),
            color=color[item],
            legend_label=item
        )
    )
p.legend.title = "Type"
# HoverTool for renderers of group 'a' and 'b'
hover = HoverTool(renderers=renderers[:2])
hover.tooltips = """
<div>
<h3>Type: @z</h3>
<p> @x and @y </p>
</div>
"""
p.add_tools(hover)
# HoverTool for renderer of group 'c'
hover_c = HoverTool(renderers=[renderers[-1]])
hover_c.tooltips = """
<div>
<h3>Type: @z</h3>
<p> @x and @y </p>
<p> @plain_link </p>
</div>
"""
p.add_tools(hover_c)
# TapTool for renderer of group 'c'
tap = TapTool(renderers=[renderers[-1]], callback=OpenURL(url="@link"))
p.add_tools(tap)
show(p)
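Regarding problem (2): indexing the whole column with [10:] slices its rows, not the characters of each string, so the trimming has to be applied to each value before the column goes into the ColumnDataSource (with pandas, the .str accessor does this per element, e.g. c_df[3].str[10:]). Here is a minimal sketch in plain Python, assuming a hypothetical list of href values and the prefix "http://www."; the helper strip_prefix is illustrative only.

```python
def strip_prefix(url, prefix="http://www."):
    """Drop the prefix from url if present; otherwise return url unchanged."""
    if url.startswith(prefix):
        return url[len(prefix):]
    return url


# Hypothetical href values standing in for the 'href' column.
hrefs = ["http://www.example.com/1.php", "http://www.example.com/4.php"]
links = [strip_prefix(h) for h in hrefs]
print(links)
```

The resulting list has the same length as the original column, so it can be added to the data source next to the other columns.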
I have 2 Bokeh rows. The top row contains a DataTable and a TextInput, both of which are able to stretch_width in order to fit the width of the browser. The bottom row contains a gridplot, which is able to stretch_width, but only does so by distorting the scale of the image. Ideally, I would like the gridplot to update the number of columns displayed based on the size of the browser.
Consider the following example:
import pandas as pd
from bokeh.models.widgets import DataTable, TableColumn
from bokeh.models import ColumnDataSource, TextInput
from bokeh.plotting import figure, output_file, save
from bokeh.layouts import row, column, gridplot
def get_datatable():
    """this can stretch width without issue"""
    df = pd.DataFrame({'a': [0, 1, 2], 'b': [2, 3, 4]})
    source = ColumnDataSource(df)
    Columns = [TableColumn(field=i, title=i) for i in df.columns]
    data_table = DataTable(columns=Columns, source=source, sizing_mode='stretch_width', max_width=9999)
    return data_table

def get_text_input():
    """this can stretch width without issue"""
    return TextInput(value='Example', title='Title', sizing_mode="stretch_width", max_width=9999)

def get_gridplot():
    """
    this requires columns to be hard-coded
    stretch_width is an option, but only distorts the images if enabled
    """
    figs = []
    for _ in range(30):
        fig = figure(x_range=(0,10), y_range=(0,10))
        _ = fig.image_rgba(image=[], x=0, y=0)
        figs.append(fig)
    return gridplot(children=figs, ncols=2)

top_row = row([get_datatable(), get_text_input()], max_width=9999, sizing_mode='stretch_width')
bottom_row = row(get_gridplot())
col = column(top_row, bottom_row, sizing_mode="stretch_width")
output_file("example.html")
save(col)
My end goal is to have the gridplot automatically update the number of columns based on the width of the browser. Is there a way to do this natively in Bokeh? If not, is it possible to do this via a CustomJS JavaScript callback?
Solution
Consider using sizing_mode="scale_width" when calling figure.
fig = figure(x_range=(0,10), y_range=(0,10), sizing_mode="scale_width")
Note
It may be preferable to use scale_width instead of stretch_width more generally.
Bokeh Doc Example: https://docs.bokeh.org/en/latest/docs/user_guide/layout.html#multiple-objects
Here's my data: https://drive.google.com/drive/folders/1CabmdDQucaKW2XhBxQlXVNOSiNRtkMm-?usp=sharing
I want to use the Select to choose the stock to show, the sliders to choose the year range to show, and the CheckboxGroup to choose the indexes to compare with.
The problem is that when I adjust a slider the figure updates, but when I use the Select or the CheckboxGroup the figure doesn't update.
What's the reason?
from bokeh.io import curdoc
from bokeh.layouts import column, row
from bokeh.models import ColumnDataSource, Slider, TextInput , Select , Div, CheckboxGroup
from bokeh.plotting import figure
import pandas as pd
import numpy as np
price=pd.read_excel('price.xlsx',index_col=0)
# input control
stock = Select(title='Stock',value='AAPL',options=[x for x in list(price.columns) if x not in ['S&P','DOW']])
yr_1 = Slider(title='Start year',value=2015,start=2000,end=2020,step=1)
yr_2 = Slider(title='End year',value=2020,start=2000,end=2020,step=1)
index = CheckboxGroup(labels=['S&P','DOW'],active=[0,1])
def get_data():
    compare_index = [index.labels[i] for i in index.active]
    stocks = stock.value
    start_year = str(yr_1.value)
    end_year = str(yr_2.value)
    select_list = []
    select_list.append(stocks)
    select_list.extend(compare_index)
    selected = price[select_list]
    selected = selected[start_year:end_year]
    for col in selected.columns:
        selected[col]=selected[col]/selected[col].dropna()[0]
    return ColumnDataSource(selected)

def make_plot(source):
    fig=figure(plot_height=600, plot_width=700, title="",sizing_mode="scale_both", x_axis_type="datetime")
    data_columns = list(source.data.keys())
    for data in data_columns[1:]:
        fig.line(x=data_columns[0],y=data,source=source,line_width=3, line_alpha=0.6, legend_label=data)
    return fig

def update(attrname, old, new):
    new_source = get_data()
    source.data.clear()
    source.data.update(new_source.data)

#get the initial value and plot
source = get_data()
plot = make_plot(source)
#control_update
stock.on_change('value', update)
yr_1.on_change('value', update)
yr_2.on_change('value', update)
index.on_change('active', update)
# Set up layouts and add to document
inputs = column(stock, yr_1, yr_2, index)
curdoc().add_root(row(inputs, plot, width=800))
curdoc().title = "Stocks"
You're creating a new ColumnDataSource for new data. That's not a good approach.
Instead, create it once and then just assign its data as appropriate.
In your case, I would do it like this:
Create ColumnDataSource just once, as described above
Do not use .update on CDS, just reassign .data
Create the legend manually
For the one line that depends on the Select's value, choose a static field name and use it everywhere instead of the stock's own name
When the Select's value changes, update the first legend item's label so that it shows the correct stock name instead of that static field name
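The idea of keeping the column names fixed can be sketched without Bokeh. The following is only an illustration, assuming the prices are available as a plain dict of lists; build_data, normalise, the key "stock", and the sample numbers are all hypothetical. Because the returned dict always has the same keys, glyphs created once in make_plot would keep pointing at valid columns when the result is assigned to source.data.

```python
def normalise(values):
    """Scale a series so that its first value becomes 1.0."""
    base = values[0]
    return [v / base for v in values]


def build_data(prices, stock, indices=("S&P", "DOW")):
    """Return a dict with fixed keys, whichever stock is selected."""
    data = {"date": list(prices["date"])}
    # The selected stock always lands in the static column "stock".
    data["stock"] = normalise(prices[stock])
    for name in indices:
        data[name] = normalise(prices[name])
    return data


# Made-up sample data for illustration only.
prices = {
    "date": [2018, 2019, 2020],
    "AAPL": [40.0, 50.0, 80.0],
    "MSFT": [100.0, 150.0, 200.0],
    "S&P": [2500.0, 3000.0, 3500.0],
    "DOW": [25000.0, 26000.0, 30000.0],
}
first = build_data(prices, "AAPL")
second = build_data(prices, "MSFT")
print(sorted(first) == sorted(second))  # compares the column names
# In the Bokeh app, update() would then do: source.data = build_data(...)
```

This sketch leaves out the year range and the CheckboxGroup; how unchecked indexes are handled (for example by hiding their renderers) is a separate design choice.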
I have a pandas DataFrame that I am pulling data from and showing as a bar plot using Bokeh. What I want is to show the max value of each bar on hover. This is my first day using Bokeh, I have already changed the code a couple of times, and I'm really confused about how to set it up. I added the:
p.add_tools(HoverTool(tooltips=[("x_ax", "@x_ax"), ("y_ax", "@y_ax")]))
line, but just don't understand it.
Here's the code:
import numpy as np
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource, HoverTool
# prepare some data
# x = pd.Series(range(1,36))
x_ax = FAdf['SampleID']
y_ax = FAdf['First Run Au (ppm)']
# output to static HTML file
output_file("bars.html")
# create a new plot with a title and axis labels
p = figure(x_range=x_ax, title="Batch results", x_axis_label='sample', y_axis_label='Au (ppm)',
toolbar_location="above", plot_width=1200, plot_height=800)
p.add_tools(HoverTool(tooltips=[("x_ax", "@x_ax"), ("y_ax", "@y_ax")]))
# setup for the bars
p.vbar(x=x_ax, top=y_ax, width=0.9)
p.xgrid.grid_line_color = None
p.y_range.start = 0
# turn bar tick labels 45 deg
p.xaxis.major_label_orientation = np.pi/3.5
# show the results
show(p)
Sample from the FAdf database:
SampleID:
0 KR-19 349
1 KR-19 351
2 Blank_2
3 KR-19 353
First Run Au (ppm):
0 0.019
1 0.002
2 0.000
3 0.117
If you pass actual literal data sequences to a glyph method as you have above, Bokeh uses generic field names taken from the glyph parameters (for vbar here, "x" and "top"), since it has no way of knowing any other names to use. These are the columns you would need to configure the hover tool with:
HoverTool(tooltips=[("x_ax", "@x"), ("y_ax", "@top")])
Alternatively, you can pass a source argument to the vbar method so that the columns have the column names that you prefer. This is described in the Users Guide:
https://docs.bokeh.org/en/latest/docs/user_guide/data.html
I am using Bokeh in Jupyter notebooks to help with data visualization. I want to plot the data from a pandas DataFrame so that, when I hover over the Bokeh plot, all the feature values are visible in the hover box. However, with the code below, only the index displays correctly; all the other fields appear as ???, and I'm not sure why.
Here is my working example
# Import everything necessary
import numpy as np
import pandas as pd
from bokeh.layouts import row, widgetbox, column
from bokeh.models import CustomJS, Slider, Select, HoverTool
from bokeh.plotting import figure, output_file, show, ColumnDataSource
from bokeh.io import push_notebook, output_notebook, curdoc
from bokeh.client import push_session
#from bokeh.scatter_with_hover import scatter_with_hover
output_notebook()
np.random.seed(0)
samples = np.random.randint(low = 0, high = 1000, size = 1000)
samples = samples.reshape(200,5)
cols = ["A", "B", "C", "D", "E"]
df = pd.DataFrame(samples, columns=cols)
# Here is a dict of some keys that I want to be able to pick from for plotting
labels = list(df.columns.values)
axis_map = {key:key for key in labels}
code2 = ''' var data = source.data;
//axis values with select widgets
var value1 = val1.value;
var value2 = val2.value;
var original_data = original_source.data
// get data corresponding to selection
x = original_data[value1];
y = original_data[value2];
data['x'] = x;
data['y'] = y;
source.trigger('change');
// set axis labels
x_axis.axis_label = value1;
y_axis.axis_label = value2;
'''
datas = "datas"
source = ColumnDataSource(data=dict( x=df['A'], y=df['B'],
label = labels, datas = df))
original_source = ColumnDataSource(data=df.to_dict(orient='list'))
a= source.data[datas].columns.values
#print a.columns.values
print(a)
TOOLS = [ HoverTool(tooltips= [(c, '@' + c) for c in source.data[datas].columns.values] +
[('index', '$index')] )]
# hover.tooltips.append(('index', '$index'))
#plot the figures
plot = figure(plot_width=800, plot_height=800, tools= TOOLS)
plot.scatter(x= "x",y="y", source=source, line_width=2, line_alpha=0.6,
size = 3)
callback = CustomJS(args=dict(source=source, original_source = original_source,
x_axis=plot.xaxis[0],y_axis=plot.yaxis[0]), code=code2)
#Create two select widgets to pick the features of interest
x_axis = Select(title="X Axis", options=sorted(axis_map.keys()), value="A", callback = callback)
callback.args["val1"] = x_axis
callbackDRange.args["val1"]= x_axis
y_axis = Select(title="Y Axis", options=sorted(axis_map.keys()), value="B", callback = callback)
callback.args["val2"] = y_axis
callbackDRange.args["val2"]= y_axis
plot.xaxis[0].axis_label = 'A'
plot.yaxis[0].axis_label = 'B'
#Display the graph in a jupyter notebook
layout = column(plot, x_axis, y_axis )
show(layout, notebook_handle=True)
I'm even passing in the full dataframe into the source ColumnDataSource so I can access it later, but it won't work. Any guidance would be greatly appreciated!
Running your code in a recent version of Bokeh results in a warning, which suggests the root of the problem. If we actually look at the data source you create for the glyph:
source = ColumnDataSource(data=dict(x=df['A'],
y=df['B'],
label=labels,
datas = df))
It's apparent that two things are going wrong:
You are violating the fundamental assumption that all CDS columns must always be the same length at all times. The CDS is like a "Cheap DataFrame", i.e. it doesn't have ragged columns. All the columns must have the same length just like a DataFrame.
You are configuring a HoverTool to report values from columns named "A", "B", etc. but your data source has no such columns at all! The CDS you create above for your glyph has columns named:
"x", "y"
"label" which is the wrong length
"datas" which has a bad value, columns should normally be 1-d arrays
The last column "datas" is irrelevant, BTW. I think you must think the hover tool will somehow look in that column for information to display but that is not how hover tool works. Bottom line:
If you configure a tooltip for a field @A, then your CDS must have a column named "A"
And that's exactly what's not the case here.
It's hard to say exactly how you should change your code without more context about what exactly you want to do. I guess I'd suggest taking a closer look at the documentation for hover tools in the User's Guide.
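The two rules above (equal column lengths, and a column for every @field named in the tooltips) can be checked before building the data source. Here is a minimal sketch in plain Python; check_source is a hypothetical helper, not a Bokeh function, and it only recognises simple @name fields, not the @{name with spaces} form.

```python
import re


def check_source(data, tooltips):
    """Return a list of problems for a dict of columns and hover tooltips."""
    problems = []
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) > 1:
        problems.append(f"columns differ in length: {lengths}")
    for label, spec in tooltips:
        # Collect simple @name references; $index and similar are skipped.
        for field in re.findall(r"@(\w+)", spec):
            if field not in data:
                problems.append(f"tooltip {label!r} needs a column {field!r}")
    return problems


# Data shaped like the question's source: "label" is longer than "x" and "y".
data = {"x": [1, 2, 3], "y": [4, 5, 6], "label": ["A", "B", "C", "D", "E"]}
tooltips = [("A", "@A"), ("index", "$index")]
problems = check_source(data, tooltips)
for problem in problems:
    print(problem)
```

A dict that passes this check, for example one built with df.to_dict(orient='list') as in the question's original_source, has a column for each of "A" to "E" and could back both the glyph and the tooltips.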