I plotted a line graph using Bokeh in Python. I want to highlight the areas selected with the Box Select tool and read off their values (max, min, and the x, y coordinates), as shown below. But when I select a section of the graph with the Box Select tool, the color of the selected part does not change. How can I solve this?
Example
import numpy as np
import pandas as pd
from bokeh.plotting import figure,show,output_file
from bokeh.models import ColumnDataSource
output_file("PlottingTest.html")
dataset = pd.read_csv("data.csv")
data = dataset.iloc[:,3]
time = np.linspace(1, 500, num = 500)
TOOLS ="pan,wheel_zoom,reset,hover,poly_select,xbox_select,lasso_select"
s1 = ColumnDataSource(data=dict(x=time, y=data))
p = figure(title = 'Test',x_axis_label = 'time', y_axis_label='csv Data',plot_width=1000, plot_height=500,tools=TOOLS)
p.line ('date', 't1', source=s1, selection_color="orange")
p.line(time, data, legend_label="Current", line_width=1)
p.toolbar.autohide = True
show(p)
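Not a full answer, but for the "take the values" part, here is a minimal sketch under some assumptions. As far as I know, recent Bokeh versions expose the selected row positions as a list of integers in `source.selected.indices` on a `ColumnDataSource`, and reading that list from Python requires a running Bokeh server rather than a static `output_file` page. The helper `selection_summary` is a hypothetical name, not part of Bokeh, and the data is a stand-in for the `x` and `y` columns of `s1`.

```python
def selection_summary(xs, ys, indices):
    """Return min/max of x and y over the selected row positions, or None if empty."""
    if not indices:
        return None
    sel_x = [xs[i] for i in indices]
    sel_y = [ys[i] for i in indices]
    return {
        "x_min": min(sel_x), "x_max": max(sel_x),
        "y_min": min(sel_y), "y_max": max(sel_y),
    }

# Stand-in data; in the plot above these would be the source's "x" and "y" columns.
xs = [1, 2, 3, 4, 5]
ys = [10.0, 12.5, 9.0, 14.0, 11.0]

# Pretend the box selection covered rows 1..3.
summary = selection_summary(xs, ys, indices=[1, 2, 3])
print(summary)
```

The function only needs the two columns and a list of row positions, so it can be tried with hand-written indices before wiring it to an actual selection.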
Related
I want to add labels with the values above the bars, as in "How to add data labels to a bar chart in Bokeh?", but I don't know how to do it. My code looks different than the other examples; it works, but maybe it is not the right way to do it.
My code:
from bokeh.io import export_png
from bokeh.io import output_file, show
from bokeh.palettes import Spectral5
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg_clean as df
from bokeh.transform import factor_cmap
from bokeh.models import ColumnDataSource, ranges, LabelSet, Label
import pandas as pd
d = {'lvl': ["lvl1", "lvl2", "lvl2", "lvl3"],
'feature': ["test1", "test2","test3","test4"],
'count': ["5", "20","8", "90"]}
dfn = pd.DataFrame(data=d)
sourceframe = ColumnDataSource(data=dfn)
groupn = dfn.groupby(by=['lvl', 'feature'])
index_cmapn = factor_cmap('lvl_feature', palette=Spectral5, factors=sorted(dfn.lvl.unique()), end=1)
pn = figure(plot_width=800, plot_height=300, title="Count",x_range=groupn, toolbar_location=None)
labels = LabelSet(x='feature', y='count', text='count', level='glyph',x_offset=0, y_offset=5, source=sourceframe, render_mode='canvas',)
pn.vbar(x='lvl_feature', top="count_top" ,width=1, source=groupn,line_color="white", fill_color=index_cmapn, )
pn.y_range.start = 0
pn.x_range.range_padding = 0.05
pn.xgrid.grid_line_color = None
pn.xaxis.axis_label = "levels"
pn.xaxis.major_label_orientation = 1.2
pn.outline_line_color = None
pn.add_layout(labels)
export_png(pn, filename="color.png")
I think it has something to do with my dfn.groupby(by=['lvl', 'feature']) and the (probably wrong) sourceframe = ColumnDataSource(data=dfn).
The plot at this moment:
You can add the group names to the initial dictionary like this:
d = {'lvl': ["lvl1", "lvl2", "lvl2", "lvl3"],
'feature': ["test1", "test2","test3","test4"],
'count': ["5", "20","8", "90"],
'groups': [('lvl1', 'test1'), ('lvl2', 'test2'), ('lvl2', 'test3'), ('lvl3', 'test4')]}
Then call LabelSet with the groups column as the x values:
labels = LabelSet(x='groups', y='count', text='count', level='glyph',x_offset=20, y_offset=0, source=sourceframe, render_mode='canvas',)
This way the labels appear. Note that I played with the offsets a bit to check whether they were the problem; you can adjust them manually.
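If typing the tuples by hand is tedious, the groups column can also be derived from the two existing columns. This is only a sketch of that idea with the dictionary from the question; it assumes the rows of `lvl` and `feature` are already in matching order.

```python
d = {
    'lvl': ["lvl1", "lvl2", "lvl2", "lvl3"],
    'feature': ["test1", "test2", "test3", "test4"],
    'count': ["5", "20", "8", "90"],
}

# Pair each level with its feature, row by row, to get the (lvl, feature) tuples.
d['groups'] = list(zip(d['lvl'], d['feature']))
print(d['groups'])
```

The resulting list has the same content as the hand-written `groups` entry above, so the LabelSet call stays the same.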
I'm trying to plot a simple heatmap using bokeh/holoviews. My data (a pandas dataframe) has categoricals on y and datetimes on x. The problem is that there are more than 3000 categorical elements, so the plot ends up with overlapping tick labels on the y axis, which makes it useless. Is there currently a reliable way in bokeh to show only a subset of the ticks depending on the zoom level?
I've already tried plotly and the result looks perfect, but I need to use bokeh/holoviews and datashader. I also want to avoid replacing the categorical ticks with numerical ones.
I've also tried this solution, but it doesn't work (bokeh 1.2.0).
This is a toy example representing my use case (here there are only 100 y categories, but it gives the idea):
from datetime import datetime
import pandas as pd
import numpy as np
from bokeh.plotting import figure, show
from bokeh.transform import linear_cmap
from bokeh.io import output_notebook
output_notebook()
# build sample data
index = pd.date_range(start='1/1/2019', periods=1000, freq='T')
data = np.random.rand(1000,100)
columns = ['col'+ str(n) for n in range(100)]
# initial data format
df = pd.DataFrame(data=data, index=index, columns=columns)
# bokeh
df = df.stack().reset_index()
df.rename(columns={'level_0':'x','level_1':'y', 0:'z'},inplace=True)
df.sort_values(by=['y'],inplace=True)
x = [
date.to_datetime64().astype('M8[ms]').astype('O')
for date in df.x.to_list()
]
data = {
'value': df.z.to_list(),
'x': x,
'y': df.y.to_list(),
'date' : df.x.to_list()
}
p = figure(x_axis_type='datetime', y_range=columns, width=900, tooltips=[("x", "@date"), ("y", "@y"), ("value", "@value")])
p.rect(x='x', y='y', width=60*1000, height=1, line_color=None,
fill_color=linear_cmap('value', 'Viridis256', low=df.z.min(), high=df.z.max()), source=data)
show(p)
In the end, I partially followed James's suggestion and got it working with a Python callback for the tick labels. The solution was hard to find: I searched the Bokeh docs, examples and source code for days.
The main problem for me was that the docs do not mention how to use "ColumnDataSource" objects in a custom callback:
https://docs.bokeh.org/en/1.2.0/docs/reference/models/formatters.html#bokeh.models.formatters.FuncTickFormatter.from_py_func
In the end, this helped a lot:
https://docs.bokeh.org/en/1.2.0/docs/user_guide/interaction/callbacks.html#customjs-with-a-python-function.
So I modified the original code as follows, in the hope that it is useful to someone:
from datetime import datetime
import pandas as pd
import numpy as np
from bokeh.plotting import figure, show
from bokeh.transform import linear_cmap
from bokeh.io import output_notebook
from bokeh.models import FuncTickFormatter
from bokeh.models import ColumnDataSource
output_notebook()
# build sample data
index = pd.date_range(start='1/1/2019', periods=1000, freq='T')
data = np.random.rand(1000,100)
columns_labels = ['col'+ str(n) for n in range(100)]
columns = [n for n in range(100)]
# initial data format
df = pd.DataFrame(data=data, index=index, columns=columns)
# bokeh
df = df.stack().reset_index()
df.rename(columns={'level_0':'x','level_1':'y', 0:'z'},inplace=True)
df.sort_values(by=['y'],inplace=True)
x = [
date.to_datetime64().astype('M8[ms]').astype('O')
for date in df.x.to_list()
]
data = {
'value': df.z.to_list(),
'x': x,
'y': df.y.to_list(),
'y_labels_tooltip' : [columns_labels[k] for k in df.y.to_list()],
'y_ticks' : columns_labels*1000,
'date' : df.x.to_list()
}
cd = ColumnDataSource(data=data)
def ticker(source=cd):
    # `tick` is the tick value Bokeh makes available inside the formatter function
    labels = source.data['y_ticks']
    return "{}".format(labels[tick])
#p = figure(x_axis_type='datetime', y_range=columns, width=900, tooltips=[("x", "@date{%F %T}"), ("y", "@y_labels"), ("value", "@value")])
p = figure(x_axis_type='datetime', width=900, tooltips=[("x", "@date{%F %T}"), ("y", "@y_labels_tooltip"), ("value", "@value")])
p.rect(x='x', y='y', width=60*1000, height=1, line_color=None,
fill_color=linear_cmap('value', 'Viridis256', low=df.z.min(), high=df.z.max()), source=cd)
p.hover.formatters = {'date': 'datetime'}
p.yaxis.formatter = FuncTickFormatter.from_py_func(ticker)
p.yaxis[0].ticker.desired_num_ticks = 20
show(p)
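One caveat with the `labels[tick]` lookup above: the tick positions chosen by the axis are not guaranteed to be whole numbers inside the label range, for example after zooming in between two rows or panning past the data. The sketch below shows the lookup logic with a guard, in plain Python. `tick_label` is a hypothetical helper, and I have not checked whether it converts cleanly through `from_py_func`, so treat it as an illustration of the idea only.

```python
def tick_label(tick, labels):
    """Return the label for an integer tick position, or '' when there is none."""
    index = int(round(tick))
    # Skip fractional ticks and positions outside the label list.
    if abs(tick - index) > 1e-9 or not 0 <= index < len(labels):
        return ""
    return labels[index]

columns_labels = ['col' + str(n) for n in range(100)]

for tick in (20, 99.0, 2.5, -5, 100):
    print(tick, repr(tick_label(tick, columns_labels)))
```

Returning an empty string for positions without a category leaves those ticks unlabeled instead of showing an undefined value.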
I am intending to create a scatter plot with a linear colormapper. The dataset is the popular Female Literacy and Birthrate dataset.
The plot would have "GDP per capita" on the x axis and "Life expectancy at birth" on the y axis. In addition, and this is where I am running into the issue, I want to vary the color of the points according to "Birth rate".
Current Code:
#DATA MANIPULATION
# import Pandas, Bokeh, etc
import numpy as np
import pandas as pd
from bokeh.io import show, output_file
from bokeh.models import ColumnDataSource
from bokeh.palettes import Viridis256 as palette
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg as df
from bokeh.transform import linear_cmap
# load the data file
excel_file = '../factbook.xlsx'
#(removed url above since it is private)
factbook = pd.read_excel(excel_file)
source = ColumnDataSource(factbook)
colormapper = linear_cmap(field_name = factbook["Birth rate"], palette=palette, low=min(factbook["Birth rate"]), high=max(factbook["Birth rate"]))
p = figure(title = "UN Factbook Bubble Visualization",
x_axis_label = 'GDP per capita', y_axis_label = 'Life expectancy at birth')
p.circle(x = 'GDP per capita', y = 'Life expectancy at birth', source = source, color =colormapper)
output_file("file", title="Bubble Graph")
show(p)
The p.circle line has a problem consuming the colormapper. I would like help understanding how to resolve this.
The field_name parameter should be provided with the name of a column. You are supplying the entire data column itself. Since you have not provided a complete runnable example, it is impossible to test for sure, but presumably you want:
linear_cmap(field_name="Birth rate", ...)
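To see why a column name plus scalar `low` and `high` values is all the mapper needs, here is a rough sketch of the arithmetic a linear color mapping performs for each row. It is not Bokeh's actual implementation (Bokeh does the mapping in the browser); `linear_color` is a hypothetical helper, the three-color palette is a stand-in for Viridis256, and the birth rates are made-up numbers rather than values from the factbook file.

```python
def linear_color(value, palette, low, high):
    """Map value linearly from [low, high] onto an entry of palette, clamping at the ends."""
    if high == low:
        return palette[0]
    fraction = (value - low) / (high - low)
    fraction = min(max(fraction, 0.0), 1.0)
    index = min(int(fraction * len(palette)), len(palette) - 1)
    return palette[index]

palette = ["#440154", "#21908d", "#fde725"]  # three-color stand-in for Viridis256
birth_rates = [8.0, 14.5, 21.0, 45.0]        # made-up values for illustration

# low and high are single numbers computed once from the column.
low, high = min(birth_rates), max(birth_rates)
colors = [linear_color(v, palette, low, high) for v in birth_rates]
print(colors)
```

The mapping is applied value by value, which is why the mapper only has to be told which column to read and what range to scale against.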
I am trying to append an AdjointLayout of a Scatter plot with two supporting histograms to a Bokeh dashboard. However, whenever I try to put the two in a single row, the Bokeh widgets have display issues and the AdjointLayout never scales. Is this currently the expected behavior, or is there a different approach I need to take to accomplish this?
Minimal Example of the problem:
import numpy as np
import pandas as pd
import holoviews as hv
from bokeh.layouts import layout
from bokeh.models import Select
from bokeh.io import curdoc
renderer = hv.renderer('bokeh').instance(mode='server')
np.random.seed(10)
data = np.random.rand(100,4)
opts = {}
opts['color_index'] = 2
opts['size_index'] = 3
opts['scaling_factor'] = 50
points = hv.Points(data, vdims=['z', 'size']).opts(plot=opts)
fields = ['berry', 'cherry', 'dairy']
x = Select(title='X-Axis:', value=fields[0], options=fields)
y = Select(title='Y-Axis:', value=fields[1], options=fields)
dashboard = points + points[0.3:0.7, 0.3:0.7].hist()
app = renderer.get_plot(dashboard).state
dashboard = layout([
[[x, y], app],
])
curdoc().add_root(dashboard)
Using Bokeh 0.13.0 and Holoviews 1.10.5
I am unable to plot an area chart in Bokeh for some reason.
Below is the code I used:
from bokeh.charts import Area, show, output_file
Areadict = dict(
I = df['IEXT'],
Date=df['Month'],
O = df['OT']
)
area = Area(Areadict, x='Date', y=['I','O'], title="Area Chart",
legend="top_left",
xlabel='time', ylabel='memory')
output_file('area.html')
show(area)
All I see is the date axis being plotted, with no sign of the two area charts I am interested in.
Please advise.
I would recommend looking at Holoviews which is a very high level API built on top of Bokeh, and is endorsed by the Bokeh team. You can see an Area chart example in their documentation. Basically it looks like:
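import holoviews as hv
# create holoviews objects (python_array, pypy_array and jython_array hold the data)
dims = dict(kdims='time', vdims='memory')
python = hv.Area(python_array, label='python', **dims)
pypy = hv.Area(pypy_array, label='pypy', **dims)
jython = hv.Area(jython_array, label='jython', **dims)
# overlay the three areas so they can be shown as-is and stacked
overlay = python * pypy * jython
# plot
overlay.relabel("Area Chart") + hv.Area.stack(overlay).relabel("Stacked Area Chart")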
Otherwise, as of Bokeh 0.13 to create a stacked area chart with the stable bokeh.plotting API, you will need to stack the data yourself, as shown in this example:
import numpy as np
import pandas as pd
from bokeh.plotting import figure, show, output_file
from bokeh.palettes import brewer
N = 20
cats = 10
df = pd.DataFrame(np.random.randint(10, 100, size=(N, cats))).add_prefix('y')
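def stacked(df):
    # running totals across columns give the top edge of each area
    df_top = df.cumsum(axis=1)
    # the bottom edge is the previous column's top (0 for the first), reversed to close the polygon
    df_bottom = df_top.shift(axis=1).fillna({'y0': 0})[::-1]
    df_stack = pd.concat([df_bottom, df_top], ignore_index=True)
    return df_stack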
areas = stacked(df)
colors = brewer['Spectral'][areas.shape[1]]
x2 = np.hstack((df.index[::-1], df.index))
p = figure(x_range=(0, N-1), y_range=(0, 800))
p.grid.minor_grid_line_color = '#eeeeee'
p.patches([x2] * areas.shape[1], [areas[c].values for c in areas],
color=colors, alpha=0.8, line_color=None)
show(p)
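For anyone who finds the pandas version of "stack the data yourself" hard to follow, the same idea can be sketched in plain Python: keep a running total per x position, use the total before adding a series as its bottom edge and the total after as its top edge, then walk the bottom edge right to left and the top edge left to right to get a closed polygon. The names `stack_areas` and `outline` are hypothetical helpers written for this illustration.

```python
def stack_areas(series):
    """Return (bottoms, tops): the baseline and running total for each series."""
    running = [0] * len(series[0])
    bottoms, tops = [], []
    for values in series:
        bottoms.append(list(running))
        running = [r + v for r, v in zip(running, values)]
        tops.append(list(running))
    return bottoms, tops

def outline(x, bottom, top):
    """Closed polygon: bottom edge right-to-left, then top edge left-to-right."""
    xs = list(reversed(x)) + list(x)
    ys = list(reversed(bottom)) + list(top)
    return xs, ys

x = [0, 1, 2]
series = [[1, 2, 3], [4, 5, 6]]

bottoms, tops = stack_areas(series)
xs, ys = outline(x, bottoms[1], tops[1])
print(xs, ys)
```

Each `(xs, ys)` pair has the same layout as one entry of the two lists passed to `p.patches` in the example above.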