Datashader canvas.line() aliasing - python

I use bokeh to plot temperature curves, but in some cases the dataset is quite big (> 500k measurements) and I get a laggy user experience with bokeh (even with output_backend="webgl"). So I'm experimenting with datashader to get faster rendering and a smoother user experience.
But the visual result given by datashader is not as beautiful as bokeh's: the datashader result has aliasing.
I obtain this side-by-side comparison with the following code:
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf
from bokeh.plotting import figure
from bokeh.io import output_notebook, show
from bokeh.models import ColumnDataSource
from bokeh.layouts import row
import numpy as np
output_notebook()
# generate signal
n = 2000
start = 0
end = 70
signal = [np.sin(x) for x in np.arange(start, end, step=(end-start)/n)]
signal = pd.DataFrame(signal, columns=["signal"])
signal = signal.reset_index()
# create a bokeh plot
source = ColumnDataSource(signal)
p = figure(plot_height=300, plot_width=400, title="bokeh plot")
p.line(source=source, x="index", y="signal")
# create a datashader image and put it in a bokeh plot
x_range = (signal["index"].min(), signal["index"].max())
y_range = (signal["signal"].min(), signal["signal"].max())
cvs = ds.Canvas(x_range=x_range, y_range=y_range, plot_height=300, plot_width=400)
agg = cvs.line(signal, 'index', 'signal')
img = tf.shade(agg)
image_source = ColumnDataSource(data=dict(image = [img.data]))
q = figure(x_range=x_range, y_range=y_range, plot_height=300, plot_width=400, title="datashader + bokeh")
q.image_rgba(source = image_source,
image="image",
dh=(y_range[1] - y_range[0]),
dw=(x_range[1] - x_range[0]),
x=x_range[0],
y=y_range[0],
dilate=False)
# visualize both plot, bokeh on left
show(row(p, q))
Do you have any idea how to fix this aliasing and get a smooth result (similar to bokeh's)?

Here's a runnable version of your code, using HoloViews in a Jupyter notebook:
import pandas as pd, numpy as np, holoviews as hv
from holoviews.operation.datashader import datashade, dynspread
hv.extension("bokeh")
%opts Curve RGB [width=400]
n, start, end = 2000, 0, 70
sine = [np.sin(x) for x in np.arange(start, end, step=(end-start)/n)]
signal = pd.DataFrame(sine, columns=["signal"]).reset_index()
curve = hv.Curve(signal)
curve + datashade(curve)
It's true that the datashaded output here doesn't look very nice. Datashader's timeseries support, like the rest of datashader, was designed to allow accurate accumulation and summation of huge numbers of mathematically perfect (i.e., infinitely thin) curves on a raster grid, so that every x location on every curve will fall into one and only one y location in the grid. Here you just seem to want server-side rendering of a large timeseries, which requires partial incrementing of multiple nearby bins in the grid and isn't something that datashader is optimized for yet.
One thing you can do already is to render the curve at a high resolution then "spread" it so that each non-zero pixel will show up in neighboring pixels as well:
curve + dynspread(datashade(curve, height=1200, width=1200, dynamic=False, \
cmap=["#30a2da"]), max_px=3, threshold=1)
Here I set the color to match Bokeh's default, then forced HoloViews' "dynspread" function to spread by 3 pixels. Using Datashader+Bokeh as in your version, you would do `img = tf.spread(tf.shade(agg), px=3)` and increase the plot size in the Canvas call to get a similar result.
I haven't tried running a simple smoothing filter over the result of tf.shade() or tf.spread(), but those both just return RGB images, so some filter like that would probably give good results.
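The answer leaves the smoothing idea untested. As a minimal sketch, a 3x3 box filter over a 2D float array (for example the aggregate counts or one image channel, converted to a NumPy array) could look like the following. The helper name `box_blur` and the synthetic input are assumptions for illustration and are not part of datashader.

```python
import numpy as np

def box_blur(img, radius=1):
    """Average each pixel with its neighbours in a (2*radius+1)^2 window."""
    img = np.asarray(img, dtype=float)
    size = 2 * radius + 1
    # Replicate edge pixels so the output keeps the input shape.
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / size**2

# Hypothetical stand-in for a rasterized line: a one-pixel-wide diagonal.
raster = np.eye(5)
smooth = box_blur(raster)
```

A box filter trades sharpness for smoothness, so a small radius is probably the sensible starting point.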
The real solution would be to implement an optional antialiased line-drawing function for datashader, operating when the lines are drawn first rather than fixing up the pixels later, but that would take some work. Contributions welcome!
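To illustrate what "partial incrementing of multiple nearby bins" means, here is a rough NumPy sketch that splits each sample's weight between the two nearest rows of the grid. It is not datashader code, it works on samples rather than line segments (a real implementation would need something like Xiaolin Wu's line algorithm), and the function name `rasterize_antialiased` is hypothetical.

```python
import numpy as np

def rasterize_antialiased(xs, ys, width, height, x_range, y_range):
    """Accumulate samples into a grid, splitting each one between two rows."""
    grid = np.zeros((height, width))
    # Map data coordinates to continuous pixel coordinates.
    px = (np.asarray(xs) - x_range[0]) / (x_range[1] - x_range[0]) * (width - 1)
    py = (np.asarray(ys) - y_range[0]) / (y_range[1] - y_range[0]) * (height - 1)
    col = np.clip(np.rint(px).astype(int), 0, width - 1)
    row0 = np.clip(np.floor(py).astype(int), 0, height - 1)
    row1 = np.clip(row0 + 1, 0, height - 1)
    frac = py - np.floor(py)
    # np.add.at accumulates correctly even when indices repeat.
    np.add.at(grid, (row0, col), 1.0 - frac)
    np.add.at(grid, (row1, col), frac)
    return grid

# Same signal shape as in the question: 2000 samples of a sine wave.
x = np.linspace(0, 70, 2000)
grid = rasterize_antialiased(x, np.sin(x), 400, 300, (0, 70), (-1, 1))
```

Each sample still contributes a total weight of 1, so sums over the grid stay meaningful, but the result is a plain array rather than a datashader aggregate.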

Related

How to plot a line graph of density over a density colour map plot in Python

First time user so apologies for any mistakes.
I have some code (pasted below) which is used to analyse and gain values/graphs from a simulation I have run.
This results in the following image:
I would now like to plot a line graph on top of this, using the colour-map values along r = 0 on the y-axis at every point on the x-axis. However, I'm completely lost on where to begin. I've tried looking into KDE and similar things, but I'm not sure how to get at the numerical values that were used to generate the colour map.
from openpmd_viewer import OpenPMDTimeSeries
from openpmd_viewer.addons import LpaDiagnostics
import numpy as np
from scipy.constants import c, e, m_e
import matplotlib.pyplot as plt
from matplotlib import gridspec
# Replace the string below, to point to your data
ts = OpenPMDTimeSeries(r"/Users/bentorrance/diags/hdf5/")
ts_2d = LpaDiagnostics(r"/Users/bentorrance/diags/hdf5/")
plt.figure(1)
Ez = ts.get_field(iteration=5750, field='E', coord='z', plot=True, cmap='inferno')
plt.title(r'Electric Field Density $E_{z}$')
plt.show()
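One possible starting point, sketched with synthetic data because the simulation files are not available here: if the field is held as a 2D array indexed as [r, z] together with 1D coordinate arrays, the on-axis line-out is simply the row whose r value is closest to 0. The array layout and the names `r`, `z` and `field` are assumptions for illustration; check what get_field returns in your version of openpmd_viewer.

```python
import numpy as np

# Synthetic stand-in for a 2D field with axes (r, z); the layout is assumed.
r = np.linspace(-20e-6, 20e-6, 41)
z = np.linspace(0.0, 100e-6, 200)
field = np.exp(-(r[:, None] / 5e-6) ** 2) * np.sin(2 * np.pi * z[None, :] / 25e-6)

# Index of the row closest to r = 0.
i_axis = np.argmin(np.abs(r))
lineout = field[i_axis, :]  # one value per z position
```

The resulting `lineout` could then be drawn against `z` on a twin axis (matplotlib's `ax.twinx()`) so that it overlays the colour map.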

4D Density Plot in Python

I am looking to plot some density maps from some grid-like data:
X,Y,Z = np.mgrid[-5:5:50j, -5:5:50j, -5:5:50j]
rho = np.random.rand(50,50,50) #for the sake of argument
I am interested in producing an interpolated density plot as shown below, from Mathematica here, using Python.
Is there any solution in Matplotlib or another plotting suite for this sort of plot?
To be clear, I do not want a scatterplot of coloured points, which is not suitable for the plot I am trying to make. I would like a 3D interpolated density plot, as shown below.
Plotly
Plotly Approach from https://plotly.com/python/3d-volume-plots/ uses np.mgrid
import plotly.graph_objects as go
import numpy as np
X, Y, Z = np.mgrid[-8:8:40j, -8:8:40j, -8:8:40j]
values = np.sin(X*Y*Z) / (X*Y*Z)
fig = go.Figure(data=go.Volume(
x=X.flatten(),
y=Y.flatten(),
z=Z.flatten(),
value=values.flatten(),
isomin=0.1,
isomax=0.8,
opacity=0.1, # needs to be small to see through all surfaces
surface_count=17, # needs to be a large number for good volume rendering
))
fig.show()
Pyvista
Volume Rendering example:
https://docs.pyvista.org/examples/02-plot/volume.html#sphx-glr-examples-02-plot-volume-py
3D-interpolation code you might need with pyvista:
interpolate 3D volume with numpy and or scipy

Dynamic spectrum using plotly

I want to plot time vs. frequency on the x and y axes, with a third parameter shown as the intensity of the plot at each (time, frequency) point. That is, instead of adding a third axis in a 3D visualisation, I want a 2D plot in which the amplitude is encoded by the intensity (colour) at (x, y).
Can someone please suggest a plot type like this? These plots are called dynamic spectra.
PS: I am plotting in Python offline. I have gone through https://plot.ly/python/, but I am still not sure which plot type will serve my purpose.
Please suggest something that will help me accomplish the above :)
This is the code to compute and visualize the spectrogram with plotly. I tested the code with this audio file: vignesh.wav
The code was tested in Jupyter notebook using python 3.6
# Full example
import numpy as np
import matplotlib.pyplot as plt
# plotly offline
import plotly.offline as pyo
from plotly.offline import init_notebook_mode #to plot in jupyter notebook
import plotly.graph_objs as go
init_notebook_mode() # init plotly in jupyter notebook
from scipy.io import wavfile # scipy library to read wav files
AudioName = "vignesh.wav" # Audio File
fs, Audiodata = wavfile.read(AudioName)
Audiodata = Audiodata / (2.**15) # Normalized between [-1,1]
#Spectrogram
from scipy import signal
plt.figure()
N = 512 #Number of point in the fft
w = signal.windows.blackman(N) # window functions live in scipy.signal.windows
freqs, bins, Pxx = signal.spectrogram(Audiodata, fs,window = w,nfft=N)
# Plot with plotly
trace = [go.Heatmap(
x= bins,
y= freqs,
z= 10*np.log10(Pxx),
colorscale='Jet',
)]
layout = go.Layout(
title = 'Spectrogram with plotly',
yaxis = dict(title = 'Frequency'), # y-axis label
xaxis = dict(title = 'Time'), # x-axis label
)
fig = go.Figure(data=trace, layout=layout)
pyo.iplot(fig, filename='Spectrogram')
I'd suggest the pcolormesh plot
import matplotlib.pyplot as mp
import numpy as np
# meshgrid your timevector to get it in the desired format
X, Y = np.meshgrid(timevector, range(num_of_frequency_bins))
fig1, ax1 = mp.subplots()
Plothandle = mp.pcolormesh(X, Y, frequencies, cmap=mp.cm.jet, antialiased=True, linewidth=0)
Here num_of_frequency_bins is the number of frequencies to display on your y-axis. For example, to go from 0 Hz to 1000 Hz with 10 Hz resolution you would use range(0,1000,10) instead.
Antialiased is just for looks, and so is linewidth.
The jet colormap is usually not recommended because of its non-linear gray-scale, but it is regularly used in the frequency domain, so I used it here. Python has some nice linear gray-scale colormaps as well!
To the topic of using plotly: If you just want a static image, you don't have to use plotly. If you want to have an interactive image where you can drag around axes and stuff like this, you should take a look at plotly.
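The pcolormesh snippet above leaves timevector and frequencies undefined. As a small sketch of how the arrays have to line up, assuming hypothetical axes of 50 time steps and 10 Hz frequency bins:

```python
import numpy as np

# Hypothetical axes: 50 time steps and frequency bins from 0 Hz to 990 Hz.
timevector = np.linspace(0.0, 5.0, 50)
freq_axis = np.arange(0, 1000, 10)

# meshgrid returns arrays of shape (len(freq_axis), len(timevector)).
X, Y = np.meshgrid(timevector, freq_axis)

# The data array must be indexed as [frequency, time] to match X and Y.
frequencies = np.random.rand(len(freq_axis), len(timevector))
```

If your spectrum is stored as [time, frequency] instead, transposing it with `.T` before plotting should make the shapes agree.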

Python Matplotlib : how to put label next to each element in the bubble plot

I have a bubble plot like this, and I would like to put labels (their names) next to each bubble. Does anybody know how to do that?
@Falko referred to another post that indicates you should be looking for the text method of the axes. However, your problem is quite a bit more involved than that, because you'll need to implement an offset that scales dynamically with the size of the "bubble" (the marker). That means you'll be looking into the transformation methods of matplotlib.
As you didn't provide a simple example dataset to experiment with, I've used one that is freely available: earthquakes of 1974. In this example, I'm plotting the depth of the quake vs the date on which it occurred, using the magnitude of the earthquake as the size of the bubbles/markers. I'm appending the locations of where these earthquakes happened next to the markers, not inside (which is far easier: ignore the offset and set ha='center' in the call to ax.text).
Note that the bulk of this code example is merely about getting some dataset to toy with. What you really needed was just the ax.text method with the offset.
from __future__ import division # use real division in Python2.x
from matplotlib.dates import date2num
import matplotlib.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Get a dataset
data_url = 'http://datasets.flowingdata.com/earthquakes1974.csv'
df = pd.read_csv(data_url, parse_dates=['time'])
# Select a random subset of that dataframe to generate some variance in dates, magnitudes, ...
data = np.random.choice(df.shape[0], 10)
records = df.loc[data]
# Taint the dataset to add some bigger variance in the magnitude of the
# quake to show that the offset varies with the size of the marker
records.mag.values[:] = np.arange(10)
records.mag.values[0] = 50
records.mag.values[-1] = 100
dates = [date2num(dt) for dt in records.time]
f, ax = plt.subplots(1,1)
ax.scatter(dates, records.depth, s=records.mag*100, alpha=.4) # markersize is given in points**2 in recent versions of mpl
for _, record in records.iterrows():
    # Specify an offset for the text annotation:
    # it is approx the radius of the disc + 10 points to the right
    dx, dy = np.sqrt(record.mag*100)/f.dpi/2 + 10/f.dpi, 0.
    offset = transforms.ScaledTranslation(dx, dy, f.dpi_scale_trans)
    ax.text(date2num(record.time), record.depth, s=record.place,
            va='center', ha='left',
            transform=ax.transData + offset)
ax.set_xticks(dates)
ax.set_xticklabels([el.strftime("%Y-%m") for el in records.time], rotation=-60) # %m is the month; %M would be minutes
ax.set_ylabel('depth of earthquake')
plt.show()
For one such run, I got:
Definitely not pretty because of the overlapping labels, but it was just an example to show how to use the transforms in matplotlib to dynamically add an offset to the labels.
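The offset arithmetic can be isolated in a small helper. This sketch assumes, as the comment in the code above says, that the scatter size s is given in points squared, and it uses the fixed 72 points per inch instead of the figure dpi that the answer uses as an approximation. The name `label_offset_inches` is hypothetical.

```python
import math

POINTS_PER_INCH = 72.0

def label_offset_inches(marker_area_pt2, gap_pt=10.0):
    """Horizontal offset (in inches) from a marker centre to its label."""
    # The scatter size is treated as the marker diameter squared, in points.
    radius_pt = math.sqrt(marker_area_pt2) / 2.0
    return (radius_pt + gap_pt) / POINTS_PER_INCH

# A marker with s=400 is 20 pt wide, so its radius is 10 pt.
dx = label_offset_inches(400.0)
```

The returned value can be passed as dx to transforms.ScaledTranslation(dx, 0, f.dpi_scale_trans), since that transform works in inches.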

Infinite horizontal line in Bokeh

Is there a way to plot an infinite horizontal line with Bokeh?
The endpoints of the line should never become visible, no matter how far out the user is zooming.
This is what I've tried so far. It just prints an empty canvas:
import bokeh.plotting as bk
import numpy as np
p = bk.figure()
p.line([-np.inf,np.inf], [0,0], legend="y(x) = 0")
bk.show(p)
One way would be to set the endpoints extremely high/low and the figure's x_range and y_range very small in relation to them.
import bokeh.plotting as bk
import numpy as np
p = bk.figure(x_range=[-10,10])
p.line([-np.iinfo(np.int64).max, np.iinfo(np.int64).max], [0,0], legend="y(x) = 0")
bk.show(p)
However, I am hoping that somebody has a more elegant solution.
Edit: removed outdated solution
You are looking for "spans":
Spans (line-type annotations) have a single dimension (width or height) and extend to the edge of the plot area.
Please, take a look at
http://docs.bokeh.org/en/latest/docs/user_guide/annotations.html#spans
So, the code will look like:
import numpy as np
import bokeh.plotting as bk
from bokeh.models import Span
p = bk.figure()
# Vertical line
vline = Span(location=0, dimension='height', line_color='red', line_width=3)
# Horizontal line
hline = Span(location=0, dimension='width', line_color='green', line_width=3)
p.renderers.extend([vline, hline])
bk.show(p)
With this solution users are allowed to pan and zoom at will. The end of the lines will never show up.
The Bokeh documentation on segments and rays indicates the following solution (using ray):
To have an “infinite” ray, that always extends to the edge of the
plot, specify 0 for the length.
And indeed, the following code produces an infinite, horizontal line:
import numpy as np
import bokeh.plotting as bk
p = bk.figure()
p.ray(x=[0], y=[0], length=0, angle=0, line_width=1)
p.ray(x=[0], y=[0], length=0, angle=np.pi, line_width=1)
bk.show(p)
If you plot two rays from the middle, they won't get smaller as you zoom in or out, since the length is in pixels. So something like this:
p.ray(x=[0],y=[0],length=300, angle=0, legend="y(x) = 0")
p.ray(x=[0],y=[0],length=300, angle=np.pi, legend="y(x) = 0")
But if the user pans in either direction the end of the ray will show up. If you can prevent the user from panning at all (even when they zoom) then this is a little nicer code for a horizontal line.
If the user is able to zoom and pan anywhere they please, there is no good way (as far as I can tell) to get a horizontal line as you describe.
In case you are wondering how to use spans in combination with time series, convert your dates to unix timestamps:
import time, datetime # needed for the timestamp conversion below
start_date = time.mktime(datetime.date(2018, 3, 19).timetuple())*1000
vline = Span(location=start_date,dimension='height', line_color='red',line_width=3)
Or see this link for a full example.
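Note that time.mktime interprets the date in the machine's local time zone, so the span position can shift by a few hours from one machine to another. A timezone-independent variant, sketched with the standard library only (`to_epoch_ms` is a hypothetical helper name):

```python
import calendar
import datetime

def to_epoch_ms(d):
    """Milliseconds since the Unix epoch for midnight UTC on date d."""
    # calendar.timegm treats the time tuple as UTC, unlike time.mktime.
    return calendar.timegm(d.timetuple()) * 1000

start_date = to_epoch_ms(datetime.date(2018, 3, 19))
```

The value can be used as the Span location in the same way as above.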
