I am implementing the FastDTW algorithm to find the optimal path for aligning two time series. I would like to produce a plot like this:
However, I have never made this kind of plot before. I guess I need to use the imshow() function in matplotlib, but I don't know how to draw the extra trajectory on top of it.
I would appreciate a similar example drawn in this style. I will adjust the parameters myself.
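A minimal sketch of this style of plot, under some assumptions: it uses a plain quadratic-time DTW written with numpy instead of FastDTW, and the two sine-wave series are made up for illustration. The function name dtw_path is hypothetical. The key idea is that imshow() draws the cost matrix and an ordinary ax.plot() call on the same axes draws the warping path, with column indices as x and row indices as y. If your FastDTW implementation returns the path as a list of (i, j) index pairs, only the plotting part should be needed.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt


def dtw_path(a, b):
    """Plain O(n*m) DTW (not FastDTW): return accumulated costs and the warping path."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # Backtrack from the last cell to the first, always taking the cheapest predecessor
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return acc[1:, 1:], path[::-1]


# Made-up example data: two phase-shifted sine waves
t = np.linspace(0, 2 * np.pi, 60)
x = np.sin(t)
y = np.sin(t - 0.6)

acc, path = dtw_path(x, y)
rows, cols = zip(*path)

fig, ax = plt.subplots()
im = ax.imshow(acc, origin="lower", cmap="gray_r", interpolation="nearest")
ax.plot(cols, rows, color="red", linewidth=2)  # x = column index, y = row index
ax.set_xlabel("index in second series")
ax.set_ylabel("index in first series")
fig.colorbar(im, ax=ax, label="accumulated cost")
# fig.savefig("dtw_path.png") or plt.show() to view the result
```

Setting origin="lower" puts index (0, 0) in the bottom-left corner, so the path runs from bottom-left to top-right.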
I am doing some computational fluid dynamics (CFD) simulations for some research, and I have come across a paper that I would like to build upon.
In principle, I am trying to simulate flows, viscosities, etc. inside a triangular-shaped container. Some of the cavity-flow and Navier-Stokes equations are quite long, so some of these equations have kindly been written up and made publicly available in Python here. The code for these equations uses numpy.meshgrid() and numpy.linspace() extensively to produce the rectangular plots in the link. There is nothing wrong with the equations and they are mathematically sound.
However, I would like to replicate these results by simulating them inside a triangular container instead. The plots would therefore look like the plots provided on page 28 of this paper. Note that these are not rectangular plots with only a triangular subsection plotted; rather, the "grid" in this simulation is itself triangular.
My question is whether numpy has a specific feature that would allow for these triangular grids. My research so far has led me to scour the documentation on non-rectangular arrays; however, the closest that I could find was numpy.tril() and numpy.triu(), which still give me rectangular arrays with zeros in the upper and lower triangles of the array respectively. I was wondering if there is any numpy method that allows for the creation of these triangular containers to simulate fluids in.
My last hope would be to create some kind of dictionary, with row numbers as keys and lists storing the columns as values. That way I could create a triangular dictionary. But this would not integrate with the previously mentioned mathematical equations that have been written for numpy.
TLDR
How can I use the existing numpy libraries to create triangular grids so that I can have plots that look like this
to then look like this
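For what it's worth, numpy arrays are always rectangular, so one common workaround is to keep the meshgrid and mark which nodes lie inside the triangle with a boolean mask. The sketch below assumes a hypothetical container with vertices (0, 0), (1, 0) and (0.5, 1), and the field p is a placeholder rather than a solved flow quantity.

```python
import numpy as np

nx, ny = 41, 41
x = np.linspace(0.0, 1.0, nx)
y = np.linspace(0.0, 1.0, ny)
X, Y = np.meshgrid(x, y)

# Hypothetical triangle: base on y = 0, apex at (0.5, 1).
# A node is inside if it lies below both slanted walls (small tolerance for rounding).
tol = 1e-12
inside = (Y <= 2.0 * X + tol) & (Y <= 2.0 * (1.0 - X) + tol)

# Placeholder field standing in for pressure or velocity from the solver
p = np.sin(np.pi * X) * np.cos(np.pi * Y)

# Masked array: nodes outside the triangle are ignored by numpy.ma reductions
# and left blank by matplotlib functions such as contourf or pcolormesh
p_tri = np.ma.masked_where(~inside, p)
```

Masking only hides the outside nodes. A solver written for the rectangular cavity would still need its boundary conditions applied along the slanted walls. For plotting on a genuinely triangular mesh, matplotlib.tri.Triangulation is another option.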
I have come across a number of plots (end of page) that are very similar to scatter / swarm plots which jitter the y-axis in order to avoid overlapping dots / bubbles.
How can I get the y values (ideally in an array) based on a given set of x and z values (dot sizes)?
I found the python circlify library but it's not quite what I am looking for.
Example of what I am trying to create
EDIT: For this project I need to be able to output the x, y and z values so that they can be plotted in the user's tool of choice. Therefore I am more interested in solutions that generate the y-coords rather than the actual plot.
Answer:
What you describe in your text is known as a swarm plot (or beeswarm plot), and there are Python implementations of these (see especially seaborn), as well as implementations in other languages such as R. These plots adjust the y-position of each data point so that the points don't overlap but are otherwise closely packed.
Seaborn swarm plot:
Discussion:
But the plots that you show aren't standard swarm plots (which almost always have the weird-looking "arms"). Instead, they seem to be driven by some type of physics engine that allows motion along x as well as y, which produces the well-packed structures you see in the plots (e.g., like a water drop on a spider's web).
That is, in the plot above, by imagining moving points only along the vertical axis so that it packs better, you can see that, for the most part, you can't really do it. (Honestly, maybe the data shown could be packed a bit better, but not dramatically so -- eg, the first arm from the left couldn't be improved, and if any of them could, it's only by moving one or two points inward). Instead, to get the plot like you show, you'll need some motion in x, like would be given by some type of physics engine, which hopefully is holding x close to its original value, but also allows for some variation. But that's a trade-off that needs to be decided on a data level, not a programming level.
For example, here's a plotting library, RAWGraphs, which produces a compact beeswarm plot like the Politico graphs in the question:
But critically, they give the warning:
"It’s important to keep in mind that a Beeswarm plot uses forces to avoid collision between the single elements of the visual model. While this helps to see all the circles in the visualization, it also creates some cases where circles are not placed in the exact position they should be on the linear scale of the X Axis."
Or, similarly, in notes from this D3 package: "Other implementations use force layout, but the force layout simulation naturally tries to reach its equilibrium by pushing data points along both axes, which can be disruptive to the ordering of the data." And here's a nice demo based on D3 force layout where sliders adjust the relative forces pulling the points to their correct values.
This plot is therefore a compromise between a swarm plot and a violin plot (which shows a smoothed average for the distribution envelope). Both of those give an honest representation of the data, whereas the closely packed representation comes at the cost of misrepresenting the x-position of the individual data points. Its advantage seems to be that you can color and click on the individual points (where, if you wanted, you could give the actual x-data, although that's not done in the linked plots).
Seaborn violin plot:
Personally, I'm really hesitant to misrepresent the data in some unknown way (that's the outcome of a physics engine calculation but not obvious to the reader). Maybe a better compromise would be a violin filled with non-circular patches, or something like a Raincloud plot.
I created an Observable notebook to calculate the y values of a beeswarm plot with variable-sized circles. The image below gives an example of the results.
If you need to use the JavaScript code in a script, it should be straightforward to copy and paste the code for the AccurateBeeswarm class.
The algorithm simply places the points one by one, as close as possible to the y=0 line while avoiding overlaps. There are also options to add a little randomness to improve the appearance. The x values are never altered; this is the one big advantage of this approach over force-directed algorithms such as the one used by RAWGraphs.
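The placement idea can be sketched in pure Python. This is a rough illustration of greedy one-by-one placement, not the AccurateBeeswarm code from the notebook: the function name beeswarm_y, the choice to process circles in order of x, and the sample data are all assumptions. Each circle is tried on the central axis (y = 0 here) and at every height where it would just touch an already placed neighbour; the free candidate closest to the axis wins.

```python
import math


def beeswarm_y(xs, radii, eps=1e-9):
    """Return a y for each circle so that none overlap; x values are never changed."""
    ys = [0.0] * len(xs)
    placed = []  # indices of circles that already have a y
    for i in sorted(range(len(xs)), key=lambda k: xs[k]):
        # Placed circles close enough in x that they could collide with circle i
        near = [j for j in placed if abs(xs[i] - xs[j]) < radii[i] + radii[j]]

        # Candidates: the axis itself, plus each y where circle i would
        # exactly touch a neighbour from above or from below
        candidates = [0.0]
        for j in near:
            reach = math.sqrt((radii[i] + radii[j]) ** 2 - (xs[i] - xs[j]) ** 2)
            candidates += [ys[j] + reach, ys[j] - reach]

        def is_free(y):
            return all(
                math.hypot(xs[i] - xs[j], y - ys[j]) >= radii[i] + radii[j] - eps
                for j in near
            )

        # The topmost touching position is always free, so the filter is never empty
        ys[i] = min((y for y in candidates if is_free(y)), key=abs)
        placed.append(i)
    return ys


# Made-up data: x positions and circle radii (the "z" values as sizes)
xs = [0.0, 0.1, 0.2, 1.0, 1.05]
radii = [0.1, 0.1, 0.15, 0.1, 0.2]
ys = beeswarm_y(xs, radii)
```

The returned list lines up with the input order, so the (x, y, radius) triples can be exported to any plotting tool.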
I created a graph in MATLAB (see figure below) such that around every data point there is a data distribution plotted (grey area plots). The way I did it in MATLAB was to create a set of axes for every distribution curve and then plot the curves without showing those axes at every point of the data curve. I also used a command 'linkaxes' to set figure limits for all the curves at once.
I must say that this is far from an elegant solution, and I had a lot of trouble saving this figure with the correct aspect ratio. All in all, I couldn't find any other useful option in MATLAB.
Is there a more elegant solution for this type of graph in Python? I am not so much interested in how to produce the highlighted areas as in how to place a set of curves (distributions) exactly at the positions of the main data curve points.
Thank you!
For some years I have used MATLAB for my plots (mostly density plots), but now I want to switch to matplotlib. I am having trouble figuring out how to get analogous plots in matplotlib. I have to represent a 2D array. In MATLAB I used the surf function and then changed to view(2) (az=0 and el=90). An example:
surf(X,Y,log10(z),'FaceColor','interp','EdgeColor','none')
view(2)
In matplotlib I have tried some functions, but none gives quite the same result. m3plot is a computationally expensive toolkit and it is not the same as using surf. imshow does not accept log functions in its arguments (as in the example), and log values are mandatory for me. Then there is pcolor, but I cannot find a 'FaceColor'-like option to smooth the edges. I would like to know what the best equivalent in matplotlib is.
Thank you for your time!
Try installing mayavi, which has the surf function (mayavi is a full-blown 3D visualisation library using hardware acceleration).
Finally, the solution that suits me is to use the routine pcolormesh(). Combined with the option shading='gouraud', it interpolates the data and smooths the edges. In addition, it works pretty well with large arrays in comparison with pcolor.
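A minimal sketch of that approach, assuming a made-up Gaussian test field in place of the real data. The log is taken with numpy before plotting, mirroring the log10(z) in the MATLAB call.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt

# Made-up test data on a regular grid
x = np.linspace(-3.0, 3.0, 200)
y = np.linspace(-2.0, 2.0, 150)
X, Y = np.meshgrid(x, y)
z = np.exp(-(X**2 + Y**2)) + 1e-6  # strictly positive so log10 is defined

fig, ax = plt.subplots()
# With shading='gouraud', X, Y and the data array must all have the same shape
mesh = ax.pcolormesh(X, Y, np.log10(z), shading="gouraud", cmap="viridis")
fig.colorbar(mesh, ax=ax, label="log10(z)")
ax.set_xlabel("x")
ax.set_ylabel("y")
# fig.savefig("density.png") or plt.show() to view the result
```

Passing norm=matplotlib.colors.LogNorm() with the raw z values is another route to a logarithmic color scale.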