How to use RangetoolLink with holoviews in an Overlayed plot - python

I am trying to use the HoloViews RangeToolLink in a HoloViews overlaid plot, but I cannot get the range linking to work. Is this possible?
Based on the linked example 1 and example 2, I tried the same options with an overlaid plot instead of a single curve plot, but this didn't work. Below is an example with similar dummy data.
import pandas as pd
import holoviews as hv
from holoviews import opts
import numpy as np
from holoviews.plotting.links import RangeToolLink
hv.extension('bokeh')
# Generate random data
def randomDataGenerator(noOfSampleDataSets):
    for i in range(noOfSampleDataSets):
        res = np.random.randn(1000).cumsum()
        yield res
# Overlay plots
overlaid_plot = hv.Overlay([hv.Curve(data)
                            .opts(width=800, height=600, axiswise=True, default_tools=[])
                            for data in randomDataGenerator(5)])
# Adjust Source Height
source = overlaid_plot.opts(height=200)
# adjust target plot attributes
target = source.opts(clone=True, width=800, labelled=['y'],)
# Link source and target
rtlink = RangeToolLink(source, target)
# Compose and plot.
(target + source).cols(1).opts(merge_tools=False)
I expect that the source plot will show up with a range tool as shown in the example and be able to select a range in it which should select the same data points in the target plot.

The following code works in my case. I refactored the code slightly, but the logic is the same: if we have an overlaid plot, linking one of the curves in the source overlay to the corresponding curve in the target overlay is enough, and the remaining curves follow.
The code works in a Jupyter notebook. It has not been tested in other environments.
import holoviews as hv
import numpy as np
hv.extension('bokeh')
from holoviews.plotting.links import RangeToolLink
# Generate random data
def randomDataGenerator(noOfSampleDataSets):
    for i in range(noOfSampleDataSets):
        res = np.random.randn(1000).cumsum()
        yield res
# Generate all curves
def getCurves(n):
    for data in randomDataGenerator(n):
        curve = hv.Curve(data)
        yield curve
source_curves, target_curves = [], []
for curve in getCurves(10):
    # Without relabel, the curve somehow shares the ranging properties. opts with clone=True doesn't help either.
    src = curve.relabel('').opts(width=800, height=200, yaxis=None, default_tools=[])
    tgt = curve.opts(width=800, labelled=['y'], toolbar='disable')
    source_curves.append(src)
    target_curves.append(tgt)
# Link the RangeTool for the first curves in the lists.
RangeToolLink(source_curves[0], target_curves[0])
#Overlay the source and target curves
overlaid_plot_src = hv.Overlay(source_curves).relabel('Source')
overlaid_plot_tgt = hv.Overlay(target_curves).relabel('Target').opts(height=600)
# layout the plot and render
layout = (overlaid_plot_tgt + overlaid_plot_src).cols(1)
layout.opts(merge_tools=False,shared_axes=False)

Related

How to specify axis limits in geoviews (python)?

I have been using geopandas, but I am trying to switch to geoviews because it is more interactive. I'm wondering how to specify the axis limits for plotted data as a default view. I understand that it will always plot all of the data that exist, but it would be nice to have a given zoom for the purpose of this project. I posted the image of the map output below. However, I want it to output with xlim = ([-127, -102]) and ylim = ([25, 44]). I looked on stackoverflow and other places online and was unable to find the answer.
import pandas as pd
import geoviews as gv
gv.extension('bokeh')
# Read in shapefiles
fire = pd.read_pickle(r'fire_Aug2020.pkl')
fire = fire.loc[fire['FRP'] != -999.0, :]
# Assign gv.Image
data = gv.Dataset(fire[['Lon','Lat','YearDay']])
points = data.to(gv.Points, ['Lon','Lat'])
m = (points).opts(tools = ['hover'], width = 400, height = 200)
m
You are very close to a working solution. Try adding xlim and ylim as tuples to the opts call.
Minimal Example
Comment: GeoViews is based on Holoviews, see the documentation for more details.
Because GeoViews is not installed on my machine, the example below uses HoloViews. I will update this answer soon.
import pandas as pd
import numpy as np
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
fire = pd.DataFrame({'Lat':np.random.randint(10,80,50),'Lon':np.random.randint(-160,-60,50)})
data = hv.Dataset(fire[['Lon','Lat']])
points = data.to(hv.Points, ['Lon','Lat'])
(points).opts(tools = ['hover'], width = 400)
Above is the output without limits and below I make use of xlim and ylim.
(points).opts(tools = ['hover'], width = 400, height = 200, xlim=(-127,-102), ylim=(25,44))

Chord Graph with Holoviews: Troubles concerning adding colors and labels

I've tried to plot a chord graph according to http://holoviews.org/reference/elements/bokeh/Chord.html#elements-bokeh-gallery-chord
Nonetheless, I'm receiving a graph lacking labels and colors, which is totally different from the picture shown on the website:
The following is my code.
import pandas as pd
import holoviews as hv
from holoviews import opts, dim
from bokeh.sampledata.les_mis import data
hv.extension('bokeh')
hv.output(size=200)
links = pd.DataFrame(data['links'])
nodes = hv.Dataset(pd.DataFrame(data['nodes']), 'index')
chord = hv.Chord((links, nodes)).select(value=(5, None))
chord.opts(opts.Chord(labels='name',
                      cmap='Category20',
                      edge_cmap='Category20',
                      node_color=dim('name').str()))
mr = hv.renderer('matplotlib')
mr.show(chord)
Any suggestion will be much appreciated.

Holoviews scatter plot color by categorical data

I've been trying to understand how to accomplish this very simple task of plotting two datasets, each with a different color, but nothing I found online seems to do it. Here is some sample code:
import pandas as pd
import numpy as np
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
ds1x = np.random.randn(1000)
ds1y = np.random.randn(1000)
ds2x = np.random.randn(1000) * 1.5
ds2y = np.random.randn(1000) + 1
ds1 = pd.DataFrame({'dsx' : ds1x, 'dsy' : ds1y})
ds2 = pd.DataFrame({'dsx' : ds2x, 'dsy' : ds2y})
ds1['source'] = ['ds1'] * len(ds1.index)
ds2['source'] = ['ds2'] * len(ds2.index)
ds = pd.concat([ds1, ds2])
The goal is to combine the two datasets in a single frame, with a categorical column keeping track of the source. Then I try plotting a scatter plot.
scatter = hv.Scatter(ds, 'dsx', 'dsy')
scatter
And that works as expected. But I cannot seem to understand how to color the two datasets differently based on the source column. I tried the following:
scatter = hv.Scatter(ds, 'dsx', 'dsy', color='source')
scatter = hv.Scatter(ds, 'dsx', 'dsy', cmap='source')
Both throw warnings and produce no color. I tried this:
scatter = hv.Scatter(ds, 'dsx', 'dsy')
scatter.opts(color='source')
Which throws an error. I tried converting the thing to a Holoviews dataset, same type of thing.
Why is something that is supposed to be so simple so obscure?
P.S. Yes, i know i can split the data and overlay two scatter plots and that will give different colors. But surely there has to be a way to accomplish this based on categorical data.
You can create a scatterplot in Holoviews with different colors per category as follows. They are all elegant one-liners:
1) By simply using .hvplot() on your dataframe to do this for you.
import hvplot
import hvplot.pandas
df.hvplot(kind='scatter', x='col1', y='col2', by='category_col')
# If you are using bokeh as a backend you can also just use 'color' parameter.
# I like this one more because it creates a hv.Scatter() instead of hv.NdOverlay()
# 'category_col' is here just an extra vdim, which is used for colors
df.hvplot(kind='scatter', x='col1', y='col2', color='category_col')
2) By creating an NdOverlay scatter plot as follows:
import holoviews as hv
hv.Dataset(df).to(hv.Scatter, 'col1', 'col2').overlay('category_col')
3) Or doppler's answer slightly adjusted, which sets 'category_col' as an extra vdim and is then used for the colors:
hv.Scatter(
    data=df, kdims=['col1'], vdims=['col2', 'category_col'],
).opts(color='category_col', cmap=['blue', 'orange'])
You need the following sample data if you want to use my example directly:
import numpy as np
import pandas as pd
# create sample dataframe
df = pd.DataFrame({
    'col1': np.random.normal(size=30),
    'col2': np.random.normal(size=30),
    'category_col': np.random.choice(['category_1', 'category_2'], size=30),
})
As an extra:
I find it interesting that there are basically 2 solutions to the problem.
You can create a hv.Scatter() with the category_col as an extra vdim which provides the colors or alternatively 2 separate scatterplots which are put together by hv.NdOverlay().
In the backend the hv.Scatter() solution will look like this:
:Scatter [col1] (col2,category_col)
And the hv.NdOverlay() backend looks like this:
:NdOverlay [category_col] :Scatter [col1] (col2)
This may help: http://holoviews.org/user_guide/Style_Mapping.html
Concretely, you cannot use a dim transform on a dimension that is not declared, not obscure at all :)
scatter = hv.Scatter(ds, 'dsx', ['dsy', 'source']).opts(
    color=hv.dim('source').categorize({'ds1': 'blue', 'ds2': 'orange'}))
should get you there (haven't tested it myself).
Related:
Holoviews color per category
Overlay NdOverlays while keeping color / changing marker

Building a chord diagram with Holoviews: no error, but no image saved either

I want to draw a chord diagram. To first get the method working, I was following this example. (Note that for this, on the command line, you have to type 'bokeh sampledata' to download the sample data).
The code I used (taken mostly from the example, but adding in matplotlib to save the image) is:
import holoviews as hv
from holoviews import opts, dim
from bokeh.sampledata.airport_routes import routes, airports
import matplotlib.pyplot as plt
hv.extension('bokeh')
# Count the routes between Airports
route_counts = routes.groupby(['SourceID', 'DestinationID']).Stops.count().reset_index()
nodes = hv.Dataset(airports, 'AirportID', 'City')
chord = hv.Chord((route_counts, nodes), ['SourceID', 'DestinationID'], ['Stops'])
# Select the 20 busiest airports
busiest = list(routes.groupby('SourceID').count().sort_values('Stops').iloc[-20:].index.values)
busiest_airports = chord.select(AirportID=busiest, selection_mode='nodes')
fig = plt.figure()
busiest_airports.opts(opts.Chord(cmap='Category20', edge_color=dim('SourceID').str(),height=800, labels='City', node_color=dim('AirportID').str(), width=800))
fig.savefig('plot.png')
There is a file called plot.png made, but it is empty. I've tried editing the code slightly in different ways (e.g. doing different image formats such as .pdf) but it doesn't change. Does anyone have any ideas?
Update to what works:
import holoviews as hv
from holoviews import opts, dim
from bokeh.sampledata.airport_routes import routes, airports
hv.extension('bokeh')
# Count the routes between Airports
route_counts = routes.groupby(['SourceID', 'DestinationID']).Stops.count().reset_index()
nodes = hv.Dataset(airports, 'AirportID', 'City')
chord = hv.Chord((route_counts, nodes), ['SourceID', 'DestinationID'], ['Stops'])
# Select the 20 busiest airports
busiest = list(routes.groupby('SourceID').count().sort_values('Stops').iloc[-20:].index.values)
busiest_airports = chord.select(AirportID=busiest, selection_mode='nodes')
plot = busiest_airports.opts(opts.Chord(cmap='Category20', edge_color=dim('SourceID').str(),height=800, labels='City', node_color=dim('AirportID').str(), width=800))
hv.save(plot,'plot2.png')
and it works, as suggested below.

Add text annotations to each data point in Holoviews plot

In Bokeh I am able to add a text annotation to each point in my plot programmatically by using LabelSet. Below I give an example for a simple bar plot:
import numpy as np
import pandas as pd
# Make some data
dat = (
    pd.DataFrame({'team': ['a', 'b', 'c'], 'n_people': [10, 5, 12]})
    .assign(n_people_percent=lambda x: (x['n_people'] / np.sum(x['n_people']) * 100)
            .round(1).astype('string') + '%')
)
dat
# Bar plot with text annotations for every bar
from bkcharts import show, Bar
from bkcharts.attributes import CatAttr
from bokeh.models import (ColumnDataSource, LabelSet)
source_labs = ColumnDataSource(data = dat)
p = Bar(data = dat, label = CatAttr(columns = 'team'), values = 'n_people')
labels = LabelSet(x = 'team', y = 'n_people',
                  text = 'n_people_percent', source = source_labs)
p.add_layout(labels)
show(p)
However I am not sure how to achieve the same thing with Holoviews. I can make the same bar plot without the annotations very easily:
import holoviews as hv
hv.extension('bokeh')
p = hv.Bars(dat, kdims=['team'], vdims=['n_people'])
p
I can add a single text label by overlaying an hv.Text element:
p * hv.Text('a', 11, '37.0%')
But I have no idea how I can label each bar without explicitly calling hv.Text separately for every data point (bar). The problem seems to be that hv.Text does not accept a data argument like other elements e.g. hv.Bars, instead just x and y coordinates. My intuition would be that I should be able to do something like
p * hv.Text(dat, kdims=['team'], vdims=['n_people_percent'])
Any help with this appreciated!
Looks like this commit adds vectorized labels to hv.Labels, so try:
import holoviews as hv
hv.extension('bokeh')
p = hv.Bars(dat, kdims=['team'], vdims=['n_people'])
p * hv.Labels(dat, kdims=['team', 'n_people'], vdims=['n_people_percent'])
