Better interpolation for Plotly Scatter splines - python

Plotly currently supports Catmull-Rom splines for interpolation of the lines between markers on a Scatter plot.
I have graphs where the data is fundamentally a normal distribution. Cubic or Hermite interpolation works very well for this type of data in other graphing frameworks. Unfortunately, the Catmull-Rom splines (or at least Plotly's implementation of them) really don't.
I've experimented with values of "smoothing" between 0.0 and 1.0 (it seems, though this is not documented, that values over 1.0 make no further difference). Unfortunately, they all look bad.
I've seen a suggestion elsewhere that it might make sense to do my own interpolation using scipy's interpolate.interp2d, and graph that line separately. However, this fails for my use case, since I want the color of the line to be paired with the color of the markers, and for both to appear on the legend as a single item, as shown above.
Has anyone had any experience making the Plotly splines look nicer than they do on a quasi-normal distribution using smoothing=1.0?
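One possible way around the legend problem described above is to tie a hand-interpolated line to the marker trace with Plotly's legendgroup attribute and hide the line's own legend entry with showlegend=False. The sketch below is only an illustration under assumptions: it uses scipy's PchipInterpolator (a shape-preserving cubic Hermite interpolator for 1-D data) rather than interp2d, the helper name smooth_scatter_traces and the sample data are made up, and the traces are built as plain dicts so the interpolation can be checked without Plotly installed.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator


def smooth_scatter_traces(x, y, name, color, points=200):
    """Return two Plotly trace dicts: raw markers plus a PCHIP-interpolated line.

    Hypothetical helper for illustration; not part of Plotly.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)  # PCHIP needs strictly increasing x
    x, y = x[order], y[order]

    dense_x = np.linspace(x[0], x[-1], points)
    dense_y = PchipInterpolator(x, y)(dense_x)

    markers = dict(type="scatter", mode="markers", x=x.tolist(), y=y.tolist(),
                   name=name, legendgroup=name, marker=dict(color=color))
    # Same legendgroup, but no legend entry of its own.
    line = dict(type="scatter", mode="lines", x=dense_x.tolist(), y=dense_y.tolist(),
                name=name, legendgroup=name, showlegend=False,
                line=dict(color=color), hoverinfo="skip")
    return [markers, line]


# Illustrative data only: nine samples of a bell curve.
xs = np.linspace(-3, 3, 9)
ys = np.exp(-xs ** 2 / 2)
traces = smooth_scatter_traces(xs, ys, name="series A", color="#1f77b4")

# To draw it (requires plotly):
#   import plotly.graph_objects as go
#   go.Figure(data=traces).show()
```

Clicking the single legend entry should toggle both traces because they share a legendgroup. The legend swatch itself shows only the marker symbol, which may or may not be acceptable for a given chart.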

Related

Python way of plotting a Tephigram and Hodograph

I am looking to plot a Tephigram using Python. So far I have not found a direct way of doing so, and I do not have the meteorological training to start from a SkewT plot and customize it into a Tephigram. I am currently using MetPy to create plots like the one below, but I do not think it is a valid Tephigram. I have looked at Issue 1, Issue 2 and Issue 3 as well as this library, but I am open to any library or method.
To be clear, no amount of customization of a Skew-T will produce a Tephigram. Skew-T log-P diagrams plot pressure vs. temperature, with pressure on a logarithmic vertical scale and the temperature axis skewed (nominally by 45 degrees). Tephigrams fundamentally plot entropy (proportional to the logarithm of potential temperature) vs. temperature, with those two axes rotated 45 degrees.
The tephigram_python library you mentioned would have been what I linked you to, so I'm curious what its deficiencies are.
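To make the difference in coordinates concrete, here is a minimal sketch of how a (temperature, pressure) pair could be mapped to Tephigram plot coordinates. It is not taken from MetPy or tephigram_python. The function names, the constant values and especially entropy_scale are assumptions for illustration; entropy_scale is an arbitrary factor chosen so that one unit of scaled ln(theta) is roughly 1 K of potential temperature near 300 K.

```python
import math

KAPPA = 0.2857   # R_d / c_p for dry air (approximate)
P0_HPA = 1000.0  # reference pressure for potential temperature


def potential_temperature(temp_k, pressure_hpa):
    """Poisson's equation: theta = T * (p0 / p) ** kappa, in kelvin."""
    return temp_k * (P0_HPA / pressure_hpa) ** KAPPA


def tephigram_xy(temp_k, pressure_hpa, entropy_scale=300.0):
    """Map (T, p) to rotated Tephigram plot coordinates.

    entropy_scale is an assumed factor that puts ln(theta) on a numeric
    range comparable to temperature in kelvin.
    """
    t = temp_k
    s = entropy_scale * math.log(potential_temperature(temp_k, pressure_hpa))
    # Rotate the (T, s) axes clockwise by 45 degrees: isotherms then run
    # from bottom-left to top-right, dry adiabats from top-left to bottom-right.
    c = math.sqrt(0.5)
    return (t + s) * c, (s - t) * c


# Along an isotherm, lower pressure ends up higher on the page.
x_sfc, y_sfc = tephigram_xy(280.0, 1000.0)
x_mid, y_mid = tephigram_xy(280.0, 500.0)
```

A Skew-T would instead use the logarithm of pressure directly as the vertical coordinate, which is why restyling one diagram does not give the other.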

Creating a packed bubble / scatter plot in python (jitter based on size to avoid overlapping)

I have come across a number of plots (end of page) that are very similar to scatter / swarm plots, which jitter the y-axis in order to avoid overlapping dots / bubbles.
How can I get the y values (ideally in an array) based on a given set of x and z values (dot sizes)?
I found the python circlify library but it's not quite what I am looking for.
Example of what I am trying to create
EDIT: For this project I need to be able to output the x, y and z values so that they can be plotted in the user's tool of choice. Therefore I am more interested in solutions that generate the y-coords rather than the actual plot.
Answer:
What you describe in your text is known as a swarm plot (or beeswarm plot) and there are python implementations of these (esp see seaborn), but also, eg, in R. That is, these plots allow adjustment of the y-position of each data point so they don't overlap, but otherwise are closely packed.
Seaborn swarm plot:
Discussion:
But the plots that you show aren't standard swarm plots (which almost always have the weird looking "arms"). Instead they seem to be driven by some type of physics engine which allows for motion along x as well as y, which produces the well packed structures you see in the plots (eg, like a water drop on a spider's web).
That is, in the plot above, by imagining moving points only along the vertical axis so that it packs better, you can see that, for the most part, you can't really do it. (Honestly, maybe the data shown could be packed a bit better, but not dramatically so -- eg, the first arm from the left couldn't be improved, and if any of them could, it's only by moving one or two points inward). Instead, to get the plot like you show, you'll need some motion in x, like would be given by some type of physics engine, which hopefully is holding x close to its original value, but also allows for some variation. But that's a trade-off that needs to be decided on a data level, not a programming level.
For example, here's a plotting library, RAWGraphs, which produces a compact beeswarm plot like the Politico graphs in the question:
But critically, they give the warning:
"It’s important to keep in mind that a Beeswarm plot uses forces to avoid collision between the single elements of the visual model. While this helps to see all the circles in the visualization, it also creates some cases where circles are not placed in the exact position they should be on the linear scale of the X Axis."
Or, similarly, in notes from this D3 package: "Other implementations use force layout, but the force layout simulation naturally tries to reach its equilibrium by pushing data points along both axes, which can be disruptive to the ordering of the data." And here's a nice demo based on D3 force layout where sliders adjust the relative forces pulling the points to their correct values.
Therefore, this plot is a compromise between a swarm plot and a violin plot (which shows a smoothed average for the distribution envelope). Both of those plots give an honest representation of the data, whereas the closely packed representation comes at the cost of misrepresenting the x-position of the individual data points. Its advantage seems to be that you can color and click on the individual points (where, if you wanted, you could give the actual x-data, although that's not done in the linked plots).
Seaborn violin plot:
Personally, I'm really hesitant to misrepresent the data in some unknown way (that's the outcome of a physics engine calculation but not obvious to the reader). Maybe a better compromise would be a violin filled with non-circular patches, or something like a Raincloud plot.
I created an Observable notebook to calculate the y values of a beeswarm plot with variable-sized circles. The image below gives an example of the results.
If you need to use the JavaScript code in a script, it should be straightforward to copy and paste the code for the AccurateBeeswarm class.
The algorithm simply places the points one by one, as close as possible to the y=0 line (the x axis) while avoiding overlaps. There are also options to add a little randomness to improve the appearance. x values are never altered; this is the one big advantage of this approach over force-directed algorithms such as the one used by RAWGraphs.
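For anyone who wants the y-coordinates in Python rather than JavaScript, the placement idea can be sketched roughly as follows. This is a simplified reimplementation of the general approach, not a port of the AccurateBeeswarm class: the function name beeswarm_y is made up, z is assumed to be a circle radius in the same units as x, points are placed in input order, and there is no randomness option.

```python
import math


def beeswarm_y(xs, radii):
    """Return y offsets so that circles (xs[i], y[i], radii[i]) do not overlap.

    x values are left untouched; each circle is placed, in input order, at the
    smallest |y| that clears every circle placed before it.
    """
    placed = []  # (x, y, r) of circles already positioned
    ys = []
    for x, r in zip(xs, radii):
        # Circles close enough in x to possibly collide with the new one.
        near = [(px, py, pr) for px, py, pr in placed if abs(px - x) < pr + r]

        # Candidate heights: on the axis, or just touching a neighbour
        # from above or from below.
        candidates = [0.0]
        for px, py, pr in near:
            dy = math.sqrt((pr + r) ** 2 - (px - x) ** 2)
            candidates += [py + dy, py - dy]

        def is_free(y):
            # Small tolerance so that exactly touching circles count as free.
            return all(
                (px - x) ** 2 + (py - y) ** 2 >= (pr + r) ** 2 - 1e-9
                for px, py, pr in near
            )

        y = min((c for c in candidates if is_free(c)), key=abs)
        placed.append((x, y, r))
        ys.append(y)
    return ys


# Illustrative input: x positions and circle radii.
xs = [0.0, 0.5, 1.0, 1.2, 3.0]
radii = [1.0, 0.5, 0.8, 0.3, 1.0]
ys = beeswarm_y(xs, radii)
```

The resulting xs, ys and radii can then be exported and drawn in any tool. This version checks every earlier circle for each new one, so it is quadratic or worse in the number of points, which is fine for a few thousand points but not much more.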

Plotting distribution from sampled data in python

I have two sets of sampled points in 2D space [x, y]; each set represents one class. When I plot all the points, it's a mess and one can't see anything in it. I need to somehow plot the distribution of each set (if it's possible on the same canvas with different colours, even better). Does anybody know of a good library for this?
Matplotlib is a very good library for that task. You can plot histograms, scatter plots and a lot of other things. You just have to know what you want, and then you can probably do it with that. I use it for similar tasks a lot.
[UPDATE]
As I said, you can do that with matplotlib. Here is an example from their gallery: http://matplotlib.org/examples/pylab_examples/scatter_hist.html
It's not as pretty as the answer in @lejlot's comment, but still correct.
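As a rough illustration of "plot the distribution rather than the points", one option is to bin both classes on a shared grid with numpy.histogram2d and overlay the two densities as contour lines. The function names below are made up for this sketch, the bin count is an arbitrary choice, and the sample data in the usage lines is random rather than anything from the question.

```python
import numpy as np


def class_densities(points_a, points_b, bins=30):
    """Bin two (n, 2) point sets on a shared grid; return edges and densities."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    both = np.vstack([a, b])
    x_edges = np.linspace(both[:, 0].min(), both[:, 0].max(), bins + 1)
    y_edges = np.linspace(both[:, 1].min(), both[:, 1].max(), bins + 1)
    dens_a, _, _ = np.histogram2d(a[:, 0], a[:, 1], bins=[x_edges, y_edges], density=True)
    dens_b, _, _ = np.histogram2d(b[:, 0], b[:, 1], bins=[x_edges, y_edges], density=True)
    return x_edges, y_edges, dens_a, dens_b


def plot_class_densities(points_a, points_b, bins=30):
    """Overlay the two densities as contour lines in different colours."""
    import matplotlib.pyplot as plt  # imported here so the binning works without it

    x_edges, y_edges, dens_a, dens_b = class_densities(points_a, points_b, bins)
    x_mid = 0.5 * (x_edges[:-1] + x_edges[1:])
    y_mid = 0.5 * (y_edges[:-1] + y_edges[1:])
    fig, ax = plt.subplots()
    # histogram2d returns arrays indexed [x, y]; contour expects [y, x].
    ax.contour(x_mid, y_mid, dens_a.T, colors="tab:blue")
    ax.contour(x_mid, y_mid, dens_b.T, colors="tab:red")
    return fig, ax


# Illustrative random data: two overlapping Gaussian clouds.
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(500, 2))
class_b = rng.normal(3.0, 1.0, size=(500, 2))
x_edges, y_edges, dens_a, dens_b = class_densities(class_a, class_b, bins=20)
```

Calling plot_class_densities(class_a, class_b) and then matplotlib.pyplot.show() should display both classes on one canvas. With few samples the raw histogram is noisy, so a smaller bin count or a kernel density estimate may look better.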

Matplotlib alternative for 3D scatter plots

I am having a hard time using Matplotlib to visualize reprojection results of my data in 3 dimensions after applying principal component analysis or linear discriminant analysis. After doing a scatter plot, I cannot easily rotate the data or change the point of view while zooming (the rotation axis stays the same even after you zoom, and if you zoom too much points just disappear), and every change takes one second to occur. Matplotlib is very useful, but for this specific use case it starts to get very frustrating as it probably wasn't designed for such tasks. Is there an alternative to Matplotlib in Python that can handle 3D scatter plots better and where one could fluidly navigate through the cloud?
An example is shown in the next figure. I have drawn spheres around each data cluster corresponding to a specific class and colored overlapping spheres with red. Now I want to see how these spheres intersect. I think the biggest problem with Matplotlib is that it doesn't allow shifting of the whole graph with the mouse; it only allows rotation around a fixed point, which makes things very messy once you zoom a bit.
matplotlib is not quite mature for 3D graphics:
http://matplotlib.org/mpl_toolkits/mplot3d/faq.html
mplot3d was intended to allow users to create simple 3D graphs with the same “look-and-feel” as matplotlib’s 2D plots. Furthermore, users can use the same toolkit that they are already familiar with to generate both their 2D and 3D plots.
I don't think easy navigation in a 3d plot is easily doable (even 3d scaling is not possible without tweaking the lib). mplot3d was not really intended to be a full-fledged 3D graphics library in the beginning, but more a nice addition for people who needed basic 3D and who were acquainted with matplotlib 2D plot structure.
You might want to take a look at MayaVI (which is pretty good):
MayaVi2 is a very powerful and featureful 3D graphing library. For advanced 3D scenes and excellent rendering capabilities, it is highly recommended to use MayaVi2.
Note that unlike matplotlib, MayaVI is not yet compatible with Python3 (and might not be in the foreseeable future), so you'll need a Python2 installation.
A very good alternative, but not in Python, is the 3D plot from ILNumerics (http://ilnumerics.net/). It is in .NET
Matplotlib works all right for 3D; however, it is not very fast when interactivity is needed:
https://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html
Mayavi is really fast and compatible with Python 3:
https://docs.enthought.com/mayavi/mayavi/mlab.html#id1

Periodic Axes class in matplotlib?

I have a collection of latitude/longitude points that straddle the longitude=0 line. I'd like to plot these using a matplotlib Axes class that "wraps" the horizontal dimension such that, when looking towards l=360, points at l=1 are plotted at the equivalent of l=361. Ideally, I'd also like something that defines the pan/zoom actions so I can use the plot interactively.
I know that it is possible to define custom projections in matplotlib, but I haven't found the equivalent of a Cylindrical projection that implements all of this functionality. I'd rather not use basemap. Does anyone know if something like this exists somewhere?
You can get exactly what you are asking for by modifying the matplotlib API example custom_projection_example.py. You just need to decide whether you would like a spherical or a cylindrical representation. If the latter, you may find more useful code in custom_scale_example.py, which also includes panning and zooming but deliberately limits the data to +-90 degrees; you will need to wrap instead.
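The wrapping step itself is independent of the Axes class and can be sketched on the data side. The helper below is hypothetical (the name wrap_longitudes and its center parameter are not part of matplotlib). It only shifts each longitude by a multiple of 360 so that it lands within 180 degrees of a chosen view centre; it does not by itself give the interactive pan/zoom behaviour asked for.

```python
def wrap_longitudes(lons, center=0.0):
    """Shift longitudes into [center - 180, center + 180) in steps of 360."""
    return [((lon - center + 180.0) % 360.0) - 180.0 + center for lon in lons]


# Looking towards 0: a point at 359 is drawn at -1, next to the point at 1.
near_zero = wrap_longitudes([359.0, 1.0], center=0.0)

# Looking towards 360: a point at 1 is drawn at 361, next to the point at 359.
near_360 = wrap_longitudes([359.0, 1.0], center=360.0)
```

In an interactive plot the same function could be reapplied with the current view centre whenever the x limits change, for example from a callback on the Axes xlim_changed event, though that part is left out here.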
