I recently started using Bokeh for interactive network visualization. I'm plotting coordinates for 50 points, where each node represents a machine. Below are an image showing how my data is laid out and my code (I only included 14 machines to keep it simple).
I've managed to plot the points, but I have a question I couldn't find a specific solution for anywhere. I have the temperature for some machines but not for others. How can I give the machines that have temperature information a different color?
For example, the ones with this information would be red and the ones without it would be blue. All the tutorials I found about changing node colors involve palettes, which isn't much use to me right now.
import pandas
from bokeh.io import output_notebook, show, save
from bokeh.models import Range1d, Circle, ColumnDataSource, MultiLine
from bokeh.plotting import figure
output_notebook()
from bokeh.plotting import ColumnDataSource, figure, output_file, show
df = pandas.read_excel('Pasta1.xlsx', engine='openpyxl')
source = ColumnDataSource(data=dict(x = df['x'] ,y = df['y']))
TOOLTIPS = [("index", "$index"),("(x,y)", "($x, $y)")]
p = figure(width=1000, height=500, tooltips=TOOLTIPS,title="Redes")
p.circle('x', 'y', size=10,fill_color='red', source=source)
show(p)
The solution is to use the color keyword of p.circle() and pass a list (or array), or the name of a column in the source, instead of a static value. If you know the rules for your colors, it should be easy to pass this information to your plot.
The keyword fill_color also accepts lists (or arrays), if you prefer that.
Example
The example below first creates the color column in the pandas DataFrame using np.where(). This can also be done by hand or with other techniques.
import numpy as np
import pandas as pd
from bokeh.io import output_notebook, show, save
from bokeh.models import Range1d, Circle, ColumnDataSource, MultiLine
from bokeh.plotting import figure
output_notebook()
df = pd.DataFrame({
'machine':['J'+str(i) for i in range(13)],
'x':list(range(13)),
'y':list(range(13)),
'Temp' : [np.nan, 32, np.nan, 33, np.nan, np.nan, np.nan, np.nan, 35, np.nan, np.nan, 32, np.nan]
})
df['color'] = np.where(df['Temp'].isna(), 'blue', 'red')
source = ColumnDataSource(df)
TOOLTIPS = [("index", "$index"),("(x,y)", "($x, $y)"), ('name', "@machine")]
p = figure(width=500, height=500, tooltips=TOOLTIPS,title="Redes")
p.circle(x='x', y='y', size=10, color='color', source=source)
show(p)
Output
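As one of the "by hand" alternatives mentioned above, here is a minimal sketch that builds the same color list without NumPy. The function name temperature_colors and its parameters are made up for illustration, and it assumes a missing temperature shows up as either None or NaN.

```python
import math


def temperature_colors(temps, known="red", missing="blue"):
    """Return one color per machine: `known` if a temperature exists, else `missing`."""
    colors = []
    for t in temps:
        # Treat both None and NaN as "no temperature available".
        is_missing = t is None or (isinstance(t, float) and math.isnan(t))
        colors.append(missing if is_missing else known)
    return colors


# Made-up sample values: two machines without a temperature, two with one.
temps = [float("nan"), 32, None, 33]
colors = temperature_colors(temps)
print(colors)
```

With the DataFrame from the example, df['color'] = temperature_colors(df['Temp']) should produce the same column as the np.where() line.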
Related
I'm trying to highlight the last value of a time series plot by showing its value on the y-axis, as shown in this question. I prefer LabelSet over Legend because you can control the text position precisely and also update it through a data source. Unfortunately, I cannot find out how to draw label text outside the plot box.
Here is some code that plots a LabelSet; notice that the text is only shown inside the box (66.1x is partially hidden by the y-axis):
import pandas as pd
from bokeh.io import output_notebook
output_notebook()
from bokeh.plotting import figure, show
from bokeh.models import LabelSet, ColumnDataSource
#import bokeh.sampledata
#bokeh.sampledata.download()
from bokeh.sampledata.stocks import MSFT
df = pd.DataFrame(MSFT)[:50]
df["date"] = pd.to_datetime(df["date"])
p = figure(
x_axis_type="datetime", width=1000, toolbar_location='left',
title = "MSFT Candlestick", y_axis_location="right")
p.line(df.date, df.close)
ds = ColumnDataSource({'x': [df.date.iloc[-1]], 'y': [df.close.iloc[-1]], 'text': [' ' + str(df.close.iloc[-1])]})
ls = LabelSet(x='x', y='y', text='text', source=ds)
p.add_layout(ls)
show(p)
Please let me know how to show the LabelSet outside the box. Thanks.
I discovered Bokeh recently, and I am trying to display a legend entry for each day of the week (represented by 'startdate_dayweek'). The legend should contain the color for each row corresponding to each day.
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.io import output_file
from bokeh.palettes import Set1_7
output_file("conso_daily.html")
treatcriteria_data_global = pd.read_csv(r"treatcriteria_evolution.csv", sep=';')
final_global_data = treatcriteria_data_global.groupby(['startdate_weekyear','startdate_dayweek'],as_index = False).sum().pivot('startdate_weekyear','startdate_dayweek').fillna(0)
numlines = len(final_global_data.columns)
palette = Set1_7[0:numlines]
ts_list_of_list = []
for i in range(0,len(final_global_data.columns)):
    ts_list_of_list.append(final_global_data.index)
vals_list_of_list = final_global_data.values.T.tolist()
p = figure(width=500, height=300)
p.left[0].formatter.use_scientific = False
p.multi_line(ts_list_of_list, vals_list_of_list,
legend='startdate_dayweek',
line_color = palette,
line_width=4)
show(p)
But I don't have the expected result in the legend:
How can I get a legend entry for each day? Is the problem caused by the fact that I created a MultiIndex table? Thanks.
The multi_line() function accepts the parameter legend_field or legend_group. Both work well for your use case if you use a ColumnDataSource as the source. Keep in mind that Bokeh raises an error if you pass both parameters at the same time.
Minimal Example
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource
output_notebook()
source = ColumnDataSource(dict(
xs=[[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]],
ys=[[1,2,3,4,5],[1,1,1,1,5],[5,4,3,2,1]],
legend =['red', 'green', 'blue'],
line_color = ['red', 'green', 'blue']))
p = figure(width=500, height=300)
p.multi_line(xs='xs',
ys='ys',
legend_field ='legend',
line_color = 'line_color',
source=source,
line_width=4)
show(p)
Output
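To connect this back to the pivoted table in the question, the following is a rough sketch of how one column per day could be turned into the xs, ys, legend and line_color columns used above. The helper name multi_line_columns and the sample values are hypothetical, and plain lists stand in for the DataFrame columns.

```python
def multi_line_columns(x_values, series_by_name, palette):
    """Build the xs/ys/legend/line_color columns for one line per named series."""
    names = list(series_by_name)
    if len(names) > len(palette):
        raise ValueError("palette has fewer colors than there are series")
    return {
        "xs": [list(x_values) for _ in names],           # every line shares the same x values
        "ys": [list(series_by_name[n]) for n in names],  # one list of y values per line
        "legend": [str(n) for n in names],               # labels shown in the legend
        "line_color": list(palette[:len(names)]),        # one color per line
    }


# Made-up data: week numbers on the x-axis, one series per day of the week.
weeks = [1, 2, 3]
by_day = {"Mon": [5, 6, 7], "Tue": [2, 0, 4]}
data = multi_line_columns(weeks, by_day, ["red", "green", "blue"])
print(data["legend"], data["line_color"])
```

The resulting dict can be wrapped in a ColumnDataSource and passed to multi_line() with legend_field='legend', as in the minimal example.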
I have a dataframe that details sales of various product categories vs. time. I'd like to make a "line and marker" plot of sales vs. time, per category. To my surprise, this appears to be very difficult in Bokeh.
The scatter plot is easy. But then trying to overplot a line of sales vs. date with the same source (so I can update both scatter and line plots in one go when the source updates) and in such a way that the colors of the line match the colors of the scatter plot markers proves near impossible.
Minimal reproducible example with contrived data:
import pandas as pd
df = pd.DataFrame({'Date':['2020-01-01','2020-01-02','2020-01-01','2020-01-02'],\
'Product Category':['shoes','shoes','grocery','grocery'],\
'Sales':[100,180,21,22],'Colors':['red','red','green','green']})
df['Date'] = pd.to_datetime(df['Date'])
from bokeh.io import output_notebook
output_notebook()
from bokeh.io import output_file, show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
source = ColumnDataSource(df)
plot = figure(x_axis_type="datetime", plot_width=800, toolbar_location=None)
plot.scatter(x="Date",y="Sales",size=15, source=source, fill_color="Colors", fill_alpha=0.5, \
line_color="Colors",legend="Product Category")
for cat in list(set(source.data['Product Category'])):
    tmp = source.to_df()
    col = tmp[tmp['Product Category']==cat]['Colors'].values[0]
    plot.line(x="Date",y="Sales",source=source, line_color=col)
show(plot)
Here's what it looks like, which is clearly wrong:
Here's what I want and don't know how to make:
Can Bokeh not make such plots, where scatter markers and lines have the same color per category, with a legend?
With Bokeh it is often helpful to first think about the visualisation you want and then structure the data source accordingly. You want two lines, one per category, where the x-axis is time and the y-axis is sales. A natural way to structure your data source is then the following:
import pandas as pd
# parse the dates so they match the datetime x-axis
df = pd.DataFrame({'Date': pd.to_datetime(['2020-01-01','2020-01-02']),
'Shoe Sales':[100, 180],
'Grocery Sales': [21, 22]
})
from bokeh.io import output_notebook
output_notebook()
from bokeh.io import output_file, show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
source = ColumnDataSource(df)
plot = figure(x_axis_type="datetime", plot_width=800, toolbar_location=None)
categories = ["Shoe Sales", "Grocery Sales"]
colors = {"Shoe Sales": "red", "Grocery Sales": "green"}
for category in categories:
    # one scatter and one line per category, sharing the same color
    plot.scatter(x="Date",y=category,size=15, source=source, fill_color=colors[category], legend_label=category)
    plot.line(x="Date",y=category,source=source, line_color=colors[category])
show(plot)
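If the data starts out in the long format from the question, the restructuring described above could be sketched as follows. The helper long_to_wide is hypothetical and works on plain dicts rather than a DataFrame; the column names are taken from the question.

```python
def long_to_wide(records, key="Date", category="Product Category", value="Sales"):
    """Pivot long records into one column per category, aligned on `key`."""
    keys = sorted({r[key] for r in records})
    categories = sorted({r[category] for r in records})
    lookup = {(r[key], r[category]): r[value] for r in records}
    wide = {key: keys}
    for cat in categories:
        # Missing (date, category) pairs become None so all columns have equal length.
        wide[cat] = [lookup.get((k, cat)) for k in keys]
    return wide


records = [
    {"Date": "2020-01-01", "Product Category": "shoes", "Sales": 100},
    {"Date": "2020-01-02", "Product Category": "shoes", "Sales": 180},
    {"Date": "2020-01-01", "Product Category": "grocery", "Sales": 21},
    {"Date": "2020-01-02", "Product Category": "grocery", "Sales": 22},
]
wide = long_to_wide(records)
print(wide)
```

Each category then has its own column, so one scatter call and one line call per column can share a single source.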
The solution is to group your data. Then you can plot one line for each group.
Minimal Example
import pandas as pd
from bokeh.plotting import figure, show, output_notebook
output_notebook()
df = pd.DataFrame({'Date':['2020-01-01','2020-01-02','2020-01-01','2020-01-02'],
'Product Category':['shoes','shoes','grocery','grocery'],
'Sales':[100,180,21,22],'Colors':['red','red','green','green']})
df['Date'] = pd.to_datetime(df['Date'])
plot = figure(x_axis_type="datetime",
plot_width=400,
plot_height=400,
toolbar_location=None
)
plot.scatter(x="Date",
y="Sales",
size=15,
source=df,
fill_color="Colors",
fill_alpha=0.5,
line_color="Colors",
legend_field="Product Category"
)
for color in df['Colors'].unique():
    plot.line(x="Date", y="Sales", source=df[df['Colors']==color], line_color=color)
show(plot)
Output
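The grouping step can also be written out explicitly. The sketch below uses plain dicts instead of a DataFrame, groups by category rather than by color, and assumes each category has exactly one color; the function name line_data_per_category is made up for illustration.

```python
from collections import defaultdict


def line_data_per_category(rows):
    """Return, per category, the date-sorted columns and color needed for one line."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Product Category"]].append(row)
    result = {}
    for name, members in groups.items():
        # Sort by date so each line is drawn from left to right.
        members.sort(key=lambda r: r["Date"])
        result[name] = {
            "Date": [r["Date"] for r in members],
            "Sales": [r["Sales"] for r in members],
            "color": members[0]["Colors"],  # assumption: one color per category
        }
    return result


# Same data as the question, deliberately out of date order.
rows = [
    {"Date": "2020-01-02", "Product Category": "shoes", "Sales": 180, "Colors": "red"},
    {"Date": "2020-01-01", "Product Category": "shoes", "Sales": 100, "Colors": "red"},
    {"Date": "2020-01-01", "Product Category": "grocery", "Sales": 21, "Colors": "green"},
    {"Date": "2020-01-02", "Product Category": "grocery", "Sales": 22, "Colors": "green"},
]
lines = line_data_per_category(rows)
print(lines["shoes"])
```

Each entry then holds the x and y values for one plot.line() call together with its line_color.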
I am trying to plot my data with Bokeh's Scatter.
Here is my code:
from bokeh.plotting import Scatter, output_file, show
import pandas
df=pandas.Dataframe(colume["X","Y"])
df["X"]=[1,2,3,4,5,6,7]
df["Y"]=[23,43,32,12,34,54,33]
p=Scatter(df,x="X",y="Y", title="Day Temperature measurement", xlabel="Tempetature", ylabel="Day")
output_file("File.html")
show(p)
The Output should look like this:
Expected Output
The error is:
ImportError                               Traceback (most recent call last)
<ipython-input-14-1730ac6ad003> in <module>
----> 1 from bokeh.plotting import Scatter, output_file, show
      2 import pandas
      3
      4 df=pandas.Dataframe(colume["X","Y"])
      5
ImportError: cannot import name 'Scatter' from 'bokeh.plotting' (C:\Users\LENOVO\Anaconda3\lib\site-packages\bokeh\plotting\__init__.py)
I also found that Scatter is no longer maintained. Is there any way to still use it?
Also, what alternatives do I have for representing the data the same way as Scatter, using any other Python library?
Would using an older version of Bokeh resolve this issue?
Scatter (with a capital S) has never been part of bokeh.plotting. It used to be part of the old bokeh.charts API that was removed several years ago. However, it is not needed at all to create basic scatter plots, since the glyph methods in bokeh.plotting (e.g. circle, square) are all implicitly scatter-type functions to begin with:
from bokeh.plotting import figure, show
import pandas as pd
df = pd.DataFrame({"X" :[1,2,3,4,5,6,7],
"Y": [23,43,32,12,34,54,33]})
p = figure(x_axis_label="Temperature", y_axis_label="Day",
title="Day Temperature measurement")
p.circle("X", "Y", size=15, source=df)
show(p)
Which yields:
You can also just pass the data directly to circle as in the other answer.
If you want to do fancier things, like mapping the marker type based on a column, there is also a plot.scatter (lowercase s) method on the figure:
from bokeh.plotting import figure, show
from bokeh.sampledata.iris import flowers
from bokeh.transform import factor_cmap, factor_mark
SPECIES = ['setosa', 'versicolor', 'virginica']
MARKERS = ['hex', 'circle_x', 'triangle']
p = figure(title = "Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Sepal Width'
p.scatter("petal_length", "sepal_width", source=flowers, legend_field="species", fill_alpha=0.4, size=12,
marker=factor_mark('species', MARKERS, SPECIES),
color=factor_cmap('species', 'Category10_3', SPECIES))
show(p)
which yields:
If you look up "scatter" in the docs, you'll find
Scatter Markers
To scatter circle markers on a plot, use the circle() method of Figure:
from bokeh.plotting import figure, output_file, show
# output to static HTML file
output_file("line.html")
p = figure(plot_width=400, plot_height=400)
# add a circle renderer with a size, color, and alpha
p.circle([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=20, color="navy", alpha=0.5)
# show the results
show(p)
To work with dataframes, just pass in the columns like df.X and df.Y to the x and y args.
from bokeh.plotting import figure, show, output_file
import pandas as pd
df = pd.DataFrame(columns=["X","Y"])
df["X"] = [1,2,3,4,5,6,7]
df["Y"] = [23,43,32,12,34,54,33]
p = figure()
p.scatter(df.X, df.Y, marker="circle")
#from bokeh.io import output_notebook
#output_notebook()
show(p) # or output to a file...
I want to draw circles with Bokeh, where the color of each circle depends on a column of a DataFrame, but I get an empty plot. If I don't specify a color argument for p.circle, it works fine.
Here is the code, you can copy and paste and run it.
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource, CategoricalColorMapper
from bokeh.palettes import Spectral11
import pandas as pd
df = pd.DataFrame({
'price':[10,15,20,25,30],
'action':[0,1,0,2,3],
'sign':[0,-1,0,1,-1]
})
source = ColumnDataSource(data=dict(
index=df.index,
price=df.price,
action=df.action,
sign=df.sign
))
color_mapper = CategoricalColorMapper(factors= [str(i) for i in list(df.sign.unique())], palette=Spectral11)
p = figure(plot_width=800, plot_height=400)
# this works fine
p.circle('index', 'price', radius=0.2 , source=source)
# this doesn't work
p.circle('index', 'price', radius=0.2 , color={'field':'sign', 'transform':color_mapper}, source=source)
show(p)
Bokeh doesn't like it when you take some information from a ColumnDataSource and other information from a different source. This worked for me (in a notebook):
from bokeh.plotting import figure, output_notebook, show
from bokeh.models import ColumnDataSource, CategoricalColorMapper
from bokeh.palettes import Spectral11
import pandas as pd
output_notebook()
df = pd.DataFrame({
'price':[10,15,20,25,30],
'action':[0,1,0,2,3],
'sign':[0,-1,0,1,-1],
})
source = ColumnDataSource(data=dict(
index=df.index,
price=df.price,
action=df.action,
sign=df.sign,
color=[Spectral11[i+1] for i in df.sign]
))
p = figure(plot_width=800, plot_height=400)
# this works: the color is read from the 'color' column of the same source
p.circle('index', 'price', radius=0.2 ,
color='color',
source=source)
show(p)
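One more thing that may be worth checking in the original code: the mapper's factors are strings (built with str(i)), while the sign column holds integers, so it is plausible that no value ever matches a factor. The sketch below shows, in plain Python, how the values could be converted to strings so they line up with the factors, and how a color column equivalent to the one above could be derived from them. The three hex colors are placeholders, not the Spectral11 values.

```python
signs = [0, -1, 0, 1, -1]

# Convert the values to strings so they can match string factors.
sign_labels = [str(s) for s in signs]
factors = sorted(set(sign_labels))

# Placeholder palette (assumption): any list with at least len(factors) colors works.
palette = ["#1f77b4", "#ff7f0e", "#2ca02c"]
color_by_factor = dict(zip(factors, palette))

# One color per row, suitable for a 'color' column in the data source.
colors = [color_by_factor[label] for label in sign_labels]
print(factors)
print(colors)
```

If the string labels are stored in the source instead of the integers, the same factors list could be given to CategoricalColorMapper; whether that alone fixes the empty plot in the question is an assumption, not something verified here.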