Bokeh patches z order - python

I have a number of patches with associated z values, and I'd like them to display so that if two patches overlap the one with the higher z value is shown. It seems that the way to do this is by ordering the renderers according to this github conversation: https://github.com/bokeh/bokeh/issues/696
I have done this successfully by applying a series of glyphs using patch in order of their z value. Unfortunately, using patch for all 460 glyphs takes 25 seconds for a response so it isn't a viable solution.
How do I change the render order for patches? I tried ordering the input data, but this didn't seem to have an impact.
edit:
It seems there might not be a way to accomplish what I want. https://github.com/bokeh/bokeh/issues/3601

As of Bokeh 0.12.10 patch draw order is always stable (see #7049 for details). Therefore the answer is: upgrade to Bokeh 0.12.10 or newer and put the data in the order you would like it to be drawn.
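A minimal sketch of that advice, assuming the patches are stored as parallel lists of x coordinates, y coordinates and z values. The helper names `order_patches_by_z` and `make_figure` and the sample data are hypothetical, not part of Bokeh. The data is sorted by ascending z, so higher z values come last and are drawn on top:

```python
def order_patches_by_z(xs, ys, zs):
    """Return the patch lists sorted by ascending z (higher z is drawn last)."""
    # sorted() is stable, so patches with equal z keep their original order.
    order = sorted(range(len(zs)), key=lambda i: zs[i])
    return ([xs[i] for i in order],
            [ys[i] for i in order],
            [zs[i] for i in order])


def make_figure(xs, ys, zs):
    """Draw all patches with one vectorized glyph, in z order."""
    # Imported here so the sorting helper also works without Bokeh installed.
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    sx, sy, sz = order_patches_by_z(xs, ys, zs)
    source = ColumnDataSource(data=dict(xs=sx, ys=sy, z=sz))
    fig = figure()
    # A single patches call; rows are drawn in data order on Bokeh >= 0.12.10.
    fig.patches(xs="xs", ys="ys", source=source)
    return fig


# Hypothetical example data: two overlapping squares.
xs = [[1, 3, 3, 1], [0, 2, 2, 0]]
ys = [[1, 1, 3, 3], [0, 0, 2, 2]]
zs = [5, 1]
sorted_xs, sorted_ys, sorted_zs = order_patches_by_z(xs, ys, zs)
```

Using one `patches` call instead of hundreds of separate `patch` calls should also avoid the long response time described in the question, although that has not been measured here.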

Related

Is there a PyQtGraph parameter for autorange which limits how many points are visible?

I have a PyQtGraph (line graph) which constantly has new values added to it, and I am using the plot.autoRange() function to update the viewBox. The problem is that I am using custom ticks (time, 12:00PM for example), and if the graph has more than 10-ish values the x-ticks overlap when it auto ranges. Is it possible, for example, to make autoRange only show the last 10 values?
Currently I found a workaround by removing the first value once 10 has been reached, but this really isn't optimal since the old data isn't in the graph anymore.
Generally speaking if you want that kind of custom behavior, you need to use the setRange method:
https://pyqtgraph.readthedocs.io/en/latest/graphicsItems/viewbox.html#pyqtgraph.ViewBox.setRange
If you want to stick with autoRange, the autoRange method takes an optional items argument; what you can do is create an overlapping curve of just the last 10 points (that you want to display) and call the autoRange function on just that curve, not all the items in the ViewBox. If they overlap entirely it should be visually not noticeable (but if you have mouse related events, you may have more complication).
Hopefully that helps
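As a rough sketch of the setRange approach, the visible x range can be computed from the last few points and handed to the view. The helper `last_n_x_range` and the sample data are hypothetical. The commented call assumes `plot` is an existing PlotItem or PlotWidget:

```python
def last_n_x_range(x_values, n=10):
    """Return (x_min, x_max) spanning only the last n x values."""
    if not x_values:
        raise ValueError("x_values must not be empty")
    window = x_values[-n:]  # the whole list if it has fewer than n items
    return min(window), max(window)


# Hypothetical data: 25 samples at x = 0..24.
x_values = list(range(25))
x_min, x_max = last_n_x_range(x_values, n=10)

# With a real plot, the result would be applied after each data update, e.g.:
#     plot.getViewBox().setRange(xRange=(x_min, x_max), padding=0)
```

All data stays in the curve; only the visible window changes, so older points can still be reached by panning.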

Creating a packed bubble / scatter plot in python (jitter based on size to avoid overlapping)

I have come across a number of plots (end of page) that are very similar to scatter / swarm plots which jitter the y-axis in order to avoid overlapping dots / bubbles.
How can I get the y values (ideally in an array) based on a given set of x and z values (dot sizes)?
I found the python circlify library but it's not quite what I am looking for.
Example of what I am trying to create
EDIT: For this project I need to be able to output the x, y and z values so that they can be plotted in the user's tool of choice. Therefore I am more interested in solutions that generate the y-coords rather than the actual plot.
Answer:
What you describe in your text is known as a swarm plot (or beeswarm plot) and there are python implementations of these (esp see seaborn), but also, eg, in R. That is, these plots allow adjustment of the y-position of each data point so they don't overlap, but otherwise are closely packed.
Seaborn swarm plot:
Discussion:
But the plots that you show aren't standard swarm plots (which almost always have the weird looking "arms"), but instead seem to be driven by some type of physics engine which allows for motion along x as well as y, which produces the well packed structures you see in the plots (eg, like a water drop on a spider's web).
That is, in the plot above, by imagining moving points only along the vertical axis so that it packs better, you can see that, for the most part, you can't really do it. (Honestly, maybe the data shown could be packed a bit better, but not dramatically so -- eg, the first arm from the left couldn't be improved, and if any of them could, it's only by moving one or two points inward). Instead, to get the plot like you show, you'll need some motion in x, like would be given by some type of physics engine, which hopefully is holding x close to its original value, but also allows for some variation. But that's a trade-off that needs to be decided on a data level, not a programming level.
For example, here's a plotting library, RAWGraphs, which produces a compact beeswarm plot like the Politico graphs in the question:
But critically, they give the warning:
"It’s important to keep in mind that a Beeswarm plot uses forces to avoid collision between the single elements of the visual model. While this helps to see all the circles in the visualization, it also creates some cases where circles are not placed in the exact position they should be on the linear scale of the X Axis."
Or, similarly, in notes from this D3 package: "Other implementations use force layout, but the force layout simulation naturally tries to reach its equilibrium by pushing data points along both axes, which can be disruptive to the ordering of the data." And here's a nice demo based on D3 force layout where sliders adjust the relative forces pulling the points to their correct values.
Therefore, this plot is a compromise between a swarm plot and a violin plot (which shows a smoothed average for the distribution envelope). Both of those plots give an honest representation of the data, whereas the closely packed representation comes at the cost of misrepresenting the x-position of the individual data points. Its advantage seems to be that you can color and click on the individual points (where, if you wanted, you could give the actual x-data, although that's not done in the linked plots).
Seaborn violin plot:
Personally, I'm really hesitant to misrepresent the data in some unknown way (that's the outcome of a physics engine calculation but not obvious to the reader). Maybe a better compromise would be a violin filled with non-circular patches, or something like a Raincloud plot.
I created an Observable notebook to calculate the y values of a beeswarm plot with variable-sized circles. The image below gives an example of the results.
If you need to use the JavaScript code in a script, it should be straightforward to copy and paste the code for the AccurateBeeswarm class.
The algorithm simply places the points one by one, as close as possible to the horizontal axis (the y = 0 line) while avoiding overlaps. There are also options to add a little randomness to improve the appearance. x values are never altered; this is the one big advantage of this approach over force-directed algorithms such as the one used by RAWGraphs.
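To illustrate the idea in Python, here is a simplified greedy sketch. It is not a port of the AccurateBeeswarm class from the notebook: the function name `beeswarm_y`, the placement in input order and the sample data are all assumptions. Each circle keeps its x value and gets the non-overlapping y closest to the axis:

```python
import math


def beeswarm_y(xs, radii):
    """Return y offsets so circles with fixed x and given radii do not overlap."""
    placed = []  # (x, y, r) of circles that already have a position
    ys = []
    eps = 1e-9   # tolerance so circles that just touch count as non-overlapping

    for x, r in zip(xs, radii):
        # Candidates: the axis itself, plus the two positions that just touch
        # each placed circle that is close enough in x to collide.
        candidates = [0.0]
        for px, py, pr in placed:
            dx = x - px
            reach = r + pr
            if abs(dx) < reach:
                dy = math.sqrt(reach * reach - dx * dx)
                candidates.extend((py + dy, py - dy))

        def fits(y):
            return all(
                math.hypot(x - px, y - py) >= r + pr - eps
                for px, py, pr in placed
            )

        # Take the valid candidate nearest to the axis. The topmost candidate
        # is always valid, so a position is always found.
        y = next(c for c in sorted(candidates, key=abs) if fits(c))
        placed.append((x, y, r))
        ys.append(y)
    return ys


# Hypothetical data: x positions and circle radii derived from z (dot size).
xs = [0.0, 0.5, 1.0, 0.2, 3.0]
radii = [1.0, 1.0, 0.5, 0.3, 0.4]
ys = beeswarm_y(xs, radii)
```

The resulting `xs`, `ys` and `radii` can then be exported and drawn in any plotting tool. The cost grows quadratically with the number of points, which is fine for a few thousand circles but would need a spatial index beyond that.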

Why is part of my contour plot showing white?

I am using Python's matplotlib.pyplot.contourf to create a contour plot of my data with a color bar. I have done this successfully countless times, even with other layers of the same variable. However, when the values get small (on the order of 1E-12), parts of the contour show up white. The white color does not show up in the color bar either. Does anyone know what causes this and how to fix this? The faulty contour is attached below.
a1 = plt.contourf(np.linspace(1,24,24),np.linspace(1,20,20),np.transpose(data[:,:,15]))
plt.colorbar(a1)
plt.show()
tl;dr
Given the new information, matplotlib couldn't set the right number of levels (see parameters in the documentation) for your data leaving data unplotted. To fix that you need to tell matplotlib to extend the limits with either plt.contourf(..., extend="max") or plt.contourf(..., extend="both")
Extensive answer
There are a few reasons why contourf() is showing white zones with a colormap that doesn't include white.
NaN values
NaN values are never plotted.
Masked data
If you mask data before plotting, it won't appear in the plot. But you should know if you masked your data.
However, you may have masked your data without noticing if you use something like a tick locator such as locator = LogLocator().
Matplotlib couldn't set the right levels for your data
Sometimes matplotlib doesn't set the right levels, leaving some of your data without plotting.
To fix that you can use plt.contourf(..., extend=EXTENDS) where EXTENDS can be "neither", "both", "min" or "max".
Coarse grid
contourf plots whitespace over finite data. Past answers do not correct
One remark, white section in the plot can also occur if the X and Y vectors data points are not equally spaced. In that case best to use function tricontourf().
I was facing the same problem recently, when there was data available even higher/lower than the levels I have set. So, the plt.contourf fills the contours exclusively given by you, and it neglects any other higher or lower values present in your data.
I solved this by adding a key word argument extend="both", which for your case would be something like this:
a1 = plt.contourf(np.linspace(1,24,24),np.linspace(1,20,20),np.transpose(data[:,:,15]), extend="both")
or in general form:
a1 = plt.contourf(x,y,variable[:,:,15],extend="both")
By doing this, you're instructing the module to plot the higher(/lower) values according to the highest(/lowest) filled contour.
If you want only to extend in the lower or higher range, you can change the keyword argument to
extend="min" or extend="max"
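The two fixes discussed above can also be combined: pass explicit levels that span the full data range and keep extend="both" as a safety net. The following is a sketch under assumptions, with hypothetical helper names (`covering_levels`, `plot_small_values`) and made-up sample data on the order of 1e-12:

```python
def covering_levels(values, n_levels=10):
    """Return n_levels + 1 evenly spaced boundaries from min to max of values."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("values must not all be equal")
    step = (hi - lo) / n_levels
    levels = [lo + i * step for i in range(n_levels)]
    levels.append(hi)  # use the exact maximum to avoid rounding drift
    return levels


def plot_small_values(x, y, z):
    """Filled contour plot whose levels cover every value in z."""
    # Imported here so covering_levels can be used without matplotlib.
    import matplotlib.pyplot as plt

    flat = [v for row in z for v in row]
    levels = covering_levels(flat)
    # extend="both" also fills anything that falls outside the levels.
    filled = plt.contourf(x, y, z, levels=levels, extend="both")
    plt.colorbar(filled)
    plt.show()


# Hypothetical data with very small magnitudes, as in the question.
z = [[1e-12 * (i + j) for i in range(4)] for j in range(3)]
levels = covering_levels([v for row in z for v in row], n_levels=5)
```

Calling `plot_small_values(range(4), range(3), z)` would draw the sample grid. NaN or masked cells will still appear white, since those are never filled.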

Highlighting many ranges on an axis of a Bokeh plot?

I have a scatter plot of data and would like to highlight certain ranges of the x-axis. When the number ranges to highlight are relatively small, using BoxAnnotation works well. However, I'm trying to make many adjacent highlightings (with different opacity). With many adjacent BoxAnnotations, zoomed out, the boxes slightly overlap, creating lines. Additionally, thousands of BoxAnnotations takes a long time to generate and does not run smoothly when interacting with the plot.
To be more specific about my case, I have some temporal data and a predictive model detecting the probability of some event occurring in the data. I want each segment to be highlighted with an opacity given by the probability that an event is occurring at that point in time. However, my current BoxAnnotation approach results in artificial lines from overlap of boxes when zoomed out (they disappear when zooming in on a region), and slow responsiveness of the interactive plot.
Is there a way to accomplish something similar to this without the artifacts and with a smoother experience?
Current method:
source = ColumnDataSource(data=data_frame)
figure_ = figure(x_axis_label='Time', y_axis_label='Intensity')
for index in range(data_frame.shape[0] - 1):
    figure_.add_layout(
        BoxAnnotation(left=data_frame['time'].values[index], right=data_frame['time'].values[index + 1],
                      fill_alpha=data_frame['prediction'].values[index], fill_color='red', line_alpha=0)
    )
figure_.circle(x='time', y='intensity', source=source)
show(figure_)
Example of artificial lines when there are too many small adjacent BoxAnnotations:
When zooming on the x-axis, the lines disappear:
There's probably not any way to salvage this exact approach. The artifacts are due to the functioning of the underlying raster HTML canvas, and there's not anything that can be done about that. And any slowness is due to the fact that this kind of use of BoxAnnotation (with so very many individual instances) is not at all what was envisioned, and it is simply not optimized to show hundreds of instances the way e.g. scatter glyphs are. You are trying to use box annotations to construct a sort of translucent heat map, and that is not a good fit for it, for the reasons above.
You could potentially overcome slowness by using a single rect or vbar glyph that draws all the boxes at once in a vectorized way. But that won't alleviate the compositing issues.
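A sketch of the single-glyph idea, assuming the times are sorted and each interval takes the prediction at its left edge, as in the question's loop. The helpers `highlight_columns` and `add_highlights` and the fixed `y_low`/`y_high` bounds are hypothetical. Unlike a BoxAnnotation, a vbar needs explicit vertical limits:

```python
def highlight_columns(times, predictions, y_low, y_high):
    """Build column data with one bar per interval between consecutive times."""
    lefts, rights = times[:-1], times[1:]
    count = len(lefts)
    return {
        "x": [(a + b) / 2 for a, b in zip(lefts, rights)],  # bar centers
        "width": [b - a for a, b in zip(lefts, rights)],
        "alpha": list(predictions[:count]),
        "bottom": [y_low] * count,
        "top": [y_high] * count,
    }


def add_highlights(fig, times, predictions, y_low, y_high):
    """Draw every highlight with a single vectorized vbar glyph."""
    # Imported here so highlight_columns works without Bokeh installed.
    from bokeh.models import ColumnDataSource

    data = highlight_columns(times, predictions, y_low, y_high)
    source = ColumnDataSource(data=data)
    fig.vbar(x="x", width="width", bottom="bottom", top="top",
             fill_alpha="alpha", fill_color="red", line_alpha=0,
             source=source)


# Hypothetical data: three time stamps give two highlighted intervals.
columns = highlight_columns([0, 1, 3], [0.2, 0.8, 0.5], y_low=0, y_high=10)
```

This replaces thousands of annotation objects with one glyph and one data source, which addresses the speed problem only; the thin lines between adjacent bars can still appear when zoomed out.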
Your best bet is to create a semi-transparent "heatmap" image overlay yourself with a tool or code that can afford better control over the details of rasterization and compositing. I can't really advise you on how to do that in any detail. The Datashader library might be useful for this.

Matplotlib example output significantly differs from website

When I run the Matplotlib API example code radar_chart.py on my computer, the output differs from the result on the Matplotlib website at a crucial point. The zero values, of which there are plenty, do not hit the origin of the chart on the Matplotlib website (see the chart at the link). When I run the exact same code on my own computer, the zero values do hit the origin (see picture below). This results in a less smooth and less readable chart compared to the one on the Matplotlib website, which is not what one would expect. Could anyone please tell me why this difference exists?
The reason for this difference is that the linked example is produced using matplotlib 2.0, while on your computer you run <= 1.5.
It can be observed when looking at the old example on the matplotlib page.
This difference is due to the axes margins being set to 0 in matplotlib 1.5 and to 0.05 in matplotlib 2.0.
There are several ways to set the margins, one being plt.margins(x=0.05, y=0.05).
Since here you want to have the same margins for all axes, one easy method is to use rc params. Adding
plt.rcParams['axes.xmargin'] = 0.05
plt.rcParams['axes.ymargin'] = 0.05
at the top of the script, will set the margins to the values used by default in matplotlib 2.0. Of course you can play around with them and see which values best fit your needs.
