I'm making a simple word game simulation in Python and need a way to visualise a grid of coordinates. The input would be a simple 2D array with either '' or a character in each spot.
I need each spot in the grid to either be blank or have one letter. The x and y axes need to have arbitrary start points, such as -20. It seems like Matplotlib might do what I want, but having looked through a bunch of Stack Overflow questions and Matplotlib help pages, I can't find what I need.
The question here partly has what I need:
Show the values in the grid using matplotlib
Except I want no colour, the values are single characters, and the axis labels need to allow arbitrary start points.
Does anyone know whether Matplotlib is the right library to do this sort of thing, or if I should try something else? Performance matters but it's not the most important thing. I don't need any interactivity with the display window, it's purely read only.
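One way this could be approached in Matplotlib is sketched below, under some assumptions: each grid cell is one data unit wide, the first row sits at the lowest y value, and the helper name `draw_letter_grid` and its `x_start` / `y_start` parameters are made up for illustration. It draws grid lines without any colour and places each non-empty character at the centre of its cell.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line to open a window
import matplotlib.pyplot as plt


def draw_letter_grid(grid, x_start=0, y_start=0):
    """Draw a 2D list of '' or single characters; axes start at (x_start, y_start).

    Hypothetical helper: cell (row r, column c) covers the unit square whose
    lower-left corner is (x_start + c, y_start + r).
    """
    n_rows = len(grid)
    n_cols = len(grid[0])
    fig, ax = plt.subplots()

    # Axis limits and integer ticks that begin at the arbitrary start points.
    ax.set_xlim(x_start, x_start + n_cols)
    ax.set_ylim(y_start, y_start + n_rows)
    ax.set_xticks(range(x_start, x_start + n_cols + 1))
    ax.set_yticks(range(y_start, y_start + n_rows + 1))
    ax.grid(True)
    ax.set_aspect("equal")

    # One text label per non-empty cell, centred in the cell.
    for r, row in enumerate(grid):
        for c, ch in enumerate(row):
            if ch:
                ax.text(x_start + c + 0.5, y_start + r + 0.5, ch,
                        ha="center", va="center")
    return fig, ax


fig, ax = draw_letter_grid([["a", "", ""], ["", "b", ""]], x_start=-20, y_start=-20)
fig.savefig("letter_grid.png")
```

Since no interactivity is needed, saving to a file (or calling `plt.show()` with an interactive backend) should be enough; for very large grids, creating one text object per cell may become slow.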
Related
I am currently trying to plot figures like this:
where I'm generating some random polytope located inside [-1,1]x[-1,1], then applying some optimization algorithms, and then plotting everything.
The problem is that because the polytopes are randomly generated, they can be very small or not centered, so it would be convenient if I could remove the blank part on the sides.
I know it's possible to do it when saving a plot with something like plt.savefig('image.png', bbox_inches='tight'), but I would like to display it directly without white space (it's in a Jupyter notebook, so it would be more convenient).
I'm using a meshgrid to plot the data, and I have a 'None' at every blank pixel. I guess it could be possible to find some algorithm which finds the smallest square enclosing my polytope, but I don't really want to go this way.
Do you have any ideas on how to do it using matplotlib?
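A possible starting point, assuming the data really is stored on a meshgrid with None (or NaN) in the blank cells: the bounding box of the non-blank cells can be read off with a boolean mask, without any dedicated enclosing-shape algorithm. The function name `tight_limits` and the `pad` parameter are hypothetical.

```python
import numpy as np


def tight_limits(X, Y, Z, pad=0.0):
    """Return ((xmin, xmax), (ymin, ymax)) covering only the non-blank cells of Z.

    X, Y are meshgrid arrays; Z has the same shape, with None or NaN marking
    blank pixels. Hypothetical helper, not part of matplotlib.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)  # None entries are converted to NaN here
    mask = np.isfinite(Z)
    if not mask.any():
        raise ValueError("Z contains no finite values")
    xlim = (X[mask].min() - pad, X[mask].max() + pad)
    ylim = (Y[mask].min() - pad, Y[mask].max() + pad)
    return xlim, ylim
```

The result could then be applied before displaying the figure with `ax.set_xlim(*xlim)` and `ax.set_ylim(*ylim)`.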
I have come across a number of plots (end of page) that are very similar to scatter / swarm plots, which jitter the y-axis in order to avoid overlapping dots / bubbles.
How can I get the y values (ideally in an array) based on a given set of x and z values (dot sizes)?
I found the python circlify library but it's not quite what I am looking for.
Example of what I am trying to create
EDIT: For this project I need to be able to output the x, y and z values so that they can be plotted in the user's tool of choice. Therefore I am more interested in solutions that generate the y-coords rather than the actual plot.
Answer:
What you describe in your text is known as a swarm plot (or beeswarm plot) and there are python implementations of these (esp see seaborn), but also, eg, in R. That is, these plots allow adjustment of the y-position of each data point so they don't overlap, but otherwise are closely packed.
Seaborn swarm plot:
Discussion:
But the plots that you show aren't standard swarm plots (which almost always have the weird-looking "arms"). Instead, they seem to be driven by some type of physics engine that allows for motion along x as well as y, which produces the well-packed structures you see in the plots (eg, like a water drop on a spider's web).
That is, in the plot above, by imagining moving points only along the vertical axis so that it packs better, you can see that, for the most part, you can't really do it. (Honestly, maybe the data shown could be packed a bit better, but not dramatically so -- eg, the first arm from the left couldn't be improved, and if any of them could, it's only by moving one or two points inward). Instead, to get the plot like you show, you'll need some motion in x, like would be given by some type of physics engine, which hopefully is holding x close to its original value, but also allows for some variation. But that's a trade-off that needs to be decided on a data level, not a programming level.
For example, here's a plotting library, RAWGraphs, which produces a compact beeswarm plot like the Politico graphs in the question:
But critically, they give the warning:
"It’s important to keep in mind that a Beeswarm plot uses forces to avoid collision between the single elements of the visual model. While this helps to see all the circles in the visualization, it also creates some cases where circles are not placed in the exact position they should be on the linear scale of the X Axis."
Or, similarly, in notes from this D3 package: "Other implementations use force layout, but the force layout simulation naturally tries to reach its equilibrium by pushing data points along both axes, which can be disruptive to the ordering of the data." And here's a nice demo based on D3 force layout where sliders adjust the relative forces pulling the points to their correct values.
Therefore, this plot is a compromise between a swarm plot and a violin plot (which shows a smoothed average for the distribution envelope). Both of those plots give an honest representation of the data, whereas the closely packed representation in these plots comes at the cost of misrepresenting the x-position of the individual data points. Their advantage seems to be that you can color and click on the individual points (where, if you wanted, you could give the actual x-data, although that's not done in the linked plots).
Seaborn violin plot:
Personally, I'm really hesitant to misrepresent the data in some unknown way (that's the outcome of a physics engine calculation but not obvious to the reader). Maybe a better compromise would be a violin filled with non-circular patches, or something like a Raincloud plot.
I created an Observable notebook to calculate the y values of a beeswarm plot with variable-sized circles. The image below gives an example of the results.
If you need to use the JavaScript code in a script, it should be straightforward to copy and paste the code for the AccurateBeeswarm class.
The algorithm simply places the points one by one, as close as possible to the x=0 line while avoiding overlaps. There are also options to add a little randomness to improve the appearance. x values are never altered; this is the one big advantage of this approach over force-directed algorithms such as the one used by RAWGraphs.
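For readers who want the idea in Python, here is a minimal sketch of a greedy placement of this kind. It is not a port of the AccurateBeeswarm class and leaves out the randomness options; it assumes the baseline is y = 0, that the z values have already been converted to circle radii in the same units as x, and that points are processed in order of increasing x. The function name `beeswarm_y` is hypothetical.

```python
import math


def beeswarm_y(xs, radii):
    """Return y positions so that no circles overlap; x values are never changed.

    Greedy sketch: each circle takes the free position closest to y = 0.
    Runs in roughly O(n^2) per point in the worst case, so it suits modest n.
    """
    placed = []                      # (x, y, r) of circles already positioned
    ys = [0.0] * len(xs)
    order = sorted(range(len(xs)), key=lambda i: xs[i])

    for i in order:
        x, r = xs[i], radii[i]

        # Candidate positions: the baseline, plus the two positions that just
        # touch each already-placed circle that is close enough in x.
        candidates = [0.0]
        for px, py, pr in placed:
            dx = x - px
            min_dist = r + pr
            if abs(dx) < min_dist:
                dy = math.sqrt(min_dist ** 2 - dx ** 2)
                candidates.extend([py + dy, py - dy])

        def is_free(y):
            # Small tolerance so circles that exactly touch count as non-overlapping.
            return all(
                (x - px) ** 2 + (y - py) ** 2 >= (r + pr) ** 2 - 1e-9
                for px, py, pr in placed
            )

        # The topmost candidate is always free, so this never runs out of options.
        y = min((c for c in candidates if is_free(c)), key=abs)
        ys[i] = y
        placed.append((x, y, r))

    return ys


xs = [0.0, 0.2, 0.4, 1.5, 1.6]
radii = [0.3, 0.3, 0.2, 0.4, 0.1]
ys = beeswarm_y(xs, radii)
```

The returned list lines up with the input order, so the x, y and radius values can be exported together and plotted in any tool.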
I have to build some rudimentary CAD Tool in Python based on matplotlib for handling the display of the content.
After all the parts have been put together, the whole layout shall be exported as line elements (basically just tuples of the start / end coordinates of the lines, e.g. [x1,y1,x2,y2]) and just points.
So far I have all the basic geometric stuff implemented, but I cannot figure out how to implement text properly. To be able to use different fonts etc., I want to use the text capabilities of matplotlib, but I can't find a way to export the text properly from matplotlib.
Is there a way to get a vectorized output right away? Or at least an array of the plotted text?
After some days of struggling, I found a way to get the outline of the text: https://github.com/rougier/freetype-py , more precisely the example https://github.com/rougier/freetype-py/blob/master/examples/glyph-vector.py
If you just want to get the outline as a vector array, you can delete everything after line 78 and do this:
path = Path(VERTS, CODES)
outline = path.to_polygons()
This will give you an array of polygons, and each polygon is again an array of points (x,y) of the polygon.
Though it was some trouble to get freetype running on Windows, and I still have not figured out how to make it portable, I think I will stick with this solution, because it is fast, reliable and allows one to use all the nice system fonts.
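As a possible alternative that stays inside matplotlib (not the freetype-py approach described above), `matplotlib.textpath.TextPath` builds a `Path` from a string, and `Path.to_polygons()` flattens it into polygons. The sketch below assumes the "DejaVu Sans" font that ships with matplotlib; the helper name `text_outline` is hypothetical.

```python
from matplotlib.font_manager import FontProperties
from matplotlib.textpath import TextPath


def text_outline(text, size=12, family="DejaVu Sans"):
    """Return the glyph outlines of `text` as a list of (N, 2) vertex arrays.

    Hypothetical helper. Coordinates are in points, with the text origin at
    (0, 0); curves are approximated by straight segments.
    """
    prop = FontProperties(family=family)
    path = TextPath((0, 0), text, size=size, prop=prop)
    return path.to_polygons()


polygons = text_outline("CAD")
# Consecutive vertices of each polygon can be exported as [x1, y1, x2, y2] line elements.
segments = [
    [x1, y1, x2, y2]
    for poly in polygons
    for (x1, y1), (x2, y2) in zip(poly[:-1], poly[1:])
]
```

Letters with holes (such as "A" or "D") produce separate polygons for the outer and inner contours, so a consumer that fills shapes would need to handle that; for pure line export it makes no difference.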
I'm trying to create a wordcloud with Matplotlib. Essentially I am able to put text at arbitrary locations in my grid, but I need to work out a way of preventing the words from colliding. In relation to this I am stuck on two questions:
What is the unit of fontsize?
How do I transfer the "fontsize" of the text to units in my figure, so I can mark them as used? That is, how do I know how much space each letter will take up in my grid? Ideally I would not have to mark out a whole rectangle around each word, but only the pixels they actually use as available for other words.
I'm not sure about how to do it with matplotlib but I have used this in the past: http://peekaboo-vision.blogspot.co.uk/2012/11/a-wordcloud-in-python.html
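On the two questions themselves: matplotlib's fontsize is given in points (1 point = 1/72 inch), and the space a rendered text occupies can be queried as a bounding box in display pixels and transformed back into data coordinates. The sketch below shows one way to do that; the helper name `text_extent_in_data_units` is hypothetical, and it only yields the enclosing rectangle of each word, not the individual pixels the glyphs cover.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def text_extent_in_data_units(ax, x, y, s, fontsize):
    """Place text on `ax` and return (width, height) of its bounding box in data units."""
    text = ax.text(x, y, s, fontsize=fontsize)
    ax.figure.canvas.draw()                  # extents are only meaningful after a draw
    bbox_px = text.get_window_extent()       # bounding box in display (pixel) coordinates
    bbox_data = bbox_px.transformed(ax.transData.inverted())
    return bbox_data.width, bbox_data.height


fig, ax = plt.subplots()
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
width, height = text_extent_in_data_units(ax, 10, 10, "word", fontsize=12)
```

The result depends on the figure size, DPI and axis limits at the time of the call, so the limits should be fixed before measuring and the measurement repeated if any of them change.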
I am making a pyplot graph with a set of points using:
plt.plot([range(0,10)], [dictionary[key]],'bo')
This correctly draws the points as I expect, however I also want a line to be drawn between these points. I can't find a way to do this with pyplot, I assume it's trivial.
Can someone explain how I can do this?
Try explicitly specifying the properties you want:
plt.plot(range(10),range(10),marker='o',color='b',linestyle='-')
The compact style is nice for interactive stuff, but I find that using the keyword arguments makes the code more readable and makes it possible to control, in a loop, how the line properties are cycled through when plotting more than one curve on the same graph.
What is dictionary[key] in your code? If it is a scalar then it will make 10 separate lines of length one. I think you may want to really do
plt.plot(np.arange(10),np.ones(10)*dictionary[key],marker='o',color='b',linestyle='-')
or
plt.plot(range(10),[dictionary[key]]*10,marker='o',color='b',linestyle='-')
depending on if you are using numpy or not.
In your case [range(0,10)] is a list of lists. Hence, you are plotting 10 separate single-point series instead of one line. Try
plt.plot(range(0,10), dictionary[key],'bo-')
Yup, just add a "-" to the format string (and drop the extra brackets, so that x and y are flat sequences):
plt.plot(range(0,10), dictionary[key],'bo-')
That will make blue circular points connected by a line.
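To see why the extra brackets matter, the sketch below counts the Line2D objects that `plot` returns for nested and for flat inputs. The `values` list is a made-up stand-in for `dictionary[key]`, assumed here to be a sequence of 10 numbers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

values = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # hypothetical stand-in for dictionary[key]

fig, ax = plt.subplots()

# Nested lists are treated as 2D data: each column becomes its own series,
# so every series holds a single point and there is nothing to connect.
nested = ax.plot([list(range(10))], [values], "bo-")

# Flat sequences give one series with all the points, joined by a line.
flat = ax.plot(range(10), values, "bo-")

print(len(nested), len(flat))
print(len(nested[0].get_xdata()), len(flat[0].get_xdata()))
```

With the nested form, adding "-" to the format string has no visible effect, because each series has only one point.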