I'm using graphviz from python to plot japanese kanjis and their relationships. I'm exporting the output as a svg and using sfdp as a render engine.
The output is roughly square (3000x3000). I was wondering if there was any way to specify a constraint so that the output is more like a vertical rectangle (for instance 5000x1000) than a square.
It seems that the size parameters mostly work on the scale of the resulting image, and not on the positioning of the nodes themselves. But I have a lot of different connexe components so it should be easy to move them around.
Is what I'm trying to do impossible?
Thanks!
sfdp, because it uses a spring model, tends to produce square graphs. However, if you use dot you have the added attributes like rank and rankdir to shape your graph more to your wishes.
You can refer to Node, Edge and Graph Attributes to see which attributes sfdp supports.
Try playing with the ratio attribute:
...
If ratio is numeric, it is taken as the desired aspect ratio. Then, if the actual aspect ratio is less than the desired ratio, the drawing height is scaled up to achieve the desired ratio; if the actual ratio is greater than that desired ratio, the drawing width is scaled up.
So if the original graph was like this:
Adding the graph [ratio="1.5"] attribute will make it look like this:
UPD: also notice how the ratio attribute works in pair with size attribute, maybe the combination will be best for you. Detailed explanation in the ratio docs over the link I've posted.
Related
I would like to know how best to generate a schematic diagram, something like this, from a graph (created using the Python NetworkX library) that contains the latitude and longitude of each node (city) and the lines connecting them in the Indian railway network.
The cities (nodes) should be located reasonably close to their actual position, but not necessarily exactly. I am OK with using the plate carrée projection that simply maps lat/long onto X/Y in the diagram.
The rail lines (edges) can be straight lines or even curves if it fits better.
On the diagram should be displayed the cities (preferably as dots) along with a short (max 4 characters) label for each, the lines connecting them, and a single label for each line (the given example has quite long labels for the lines).
Preferably the amount of manual tweaking of coordinates to get things to fit should be minimised.
Using Graphviz was my first idea. But I don't know how well neato/fdp (required for fixed positioning of nodes) will perform with large numbers of nodes/edges. Also, making Graphviz display labels separately outside the nodes (rather than inline) seems to need a lot of manual positioning of each label, which would be pretty boring. Is there any better way to get this kind of layout?
Doable (https://forum.graphviz.org/t/another-stupid-graphviz-trick-geographic-graphs/256), but does not seem to use many Graphviz features. In addition to tools mentioned in the link, maybe consider pikchr (https://pikchr.org/home/doc/trunk/homepage.md)
I have come across a number of plots (end of page) that are very similar to scatter / swarm plots which jitter the y-axis in order avoid overlapping dots / bubbles.
How can I get the y values (ideally in an array) based on a given set of x and z values (dot sizes)?
I found the python circlify library but it's not quite what I am looking for.
Example of what I am trying to create
EDIT: For this project I need to be able to output the x, y and z values so that they can be plotted in the user's tool of choice. Therefore I am more interested in solutions that generate the y-coords rather than the actual plot.
Answer:
What you describe in your text is known as a swarm plot (or beeswarm plot) and there are python implementations of these (esp see seaborn), but also, eg, in R. That is, these plots allow adjustment of the y-position of each data point so they don't overlap, but otherwise are closely packed.
Seaborn swarm plot:
Discussion:
But the plots that you show aren't standard swarm plots (which almost always have the weird looking "arms"), but instead seem to be driven by some type of physics engine which allows for motion along x as well as y, which produces the well packed structures you see in the plots (eg, like a water drop on a spiders web).
That is, in the plot above, by imagining moving points only along the vertical axis so that it packs better, you can see that, for the most part, you can't really do it. (Honestly, maybe the data shown could be packed a bit better, but not dramatically so -- eg, the first arm from the left couldn't be improved, and if any of them could, it's only by moving one or two points inward). Instead, to get the plot like you show, you'll need some motion in x, like would be given by some type of physics engine, which hopefully is holding x close to its original value, but also allows for some variation. But that's a trade-off that needs to be decided on a data level, not a programming level.
For example, here's a plotting library, RAWGraphs, which produces a compact beeswarm plot like the Politico graphs in the question:
But critically, they give the warning:
"It’s important to keep in mind that a Beeswarm plot uses forces to avoid collision between the single elements of the visual model. While this helps to see all the circles in the visualization, it also creates some cases where circles are not placed in the exact position they should be on the linear scale of the X Axis."
Or, similarly, in notes from this this D3 package: "Other implementations use force layout, but the force layout simulation naturally tries to reach its equilibrium by pushing data points along both axes, which can be disruptive to the ordering of the data." And here's a nice demo based on D3 force layout where sliders adjust the relative forces pulling the points to their correct values.
Therefore, this plot is a compromise between a swarm plot and a violin plot (which shows a smoothed average for the distribution envelope), but both of those plots give an honest representation of the data, and in these plots, these closely packed plots representation comes at a cost of a misrepresentation of the x-position of the individual data points. Their advantage seems to be that you can color and click on the individual points (where, if you wanted you could give the actual x-data, although that's not done in the linked plots).
Seaborn violin plot:
Personally, I'm really hesitant to misrepresent the data in some unknown way (that's the outcome of a physics engine calculation but not obvious to the reader). Maybe a better compromise would be a violin filled with non-circular patches, or something like a Raincloud plot.
I created an Observable notebook to calculate the y values of a beeswarm plot with variable-sized circles. The image below gives an example of the results.
If you need to use the JavaScript code in a script, it should be straightforward to copy and paste the code for the AccurateBeeswarm class.
The algorithm simply places the points one by one, as close as possible to the x=0 line while avoiding overlaps. There are also options to add a little randomness to improve the appearance. x values are never altered; this is the one big advantage of this approach over force-directed algorithms such as the one used by RAWGraphs.
I'm making some photo-editing tools in python using PIL (Python Imaging Library), and I was trying to make a program which converts a photo to its 'painted' version.
I've managed to make a program which converts a photo into its distinct colours, but the problem is that the algorithm I'm using is operating on every pixel, meaning that the resulting image has very jagged differences between colours.
Ideally, I'd like to smoothen out these edges, but I don't know how!
I've checked out this site for some help, but the method there produces quite different results to what I need.
My Starting Image:
My Image with Distinct Colours:
I would like to smoothen the edges in the image above.
Results of using the method which doesn't quite work:
As you can see, using the technique doesn't smoothen the edges into natural-looking curves; instead it creates jagged edges.
I know I should provide sample output, but suprisingly, I haven't actually got it, so I'll describe it as best as I can. Simply put, I want to smoothen the edges between the different colours.
I've seen something called a Gaussian blur, but I'm not quite sure as to how to apply it here as the answers I've seen always mention some sort of threshold, and are usually to do with binary images, so I don't think it can apply here.
Edge enhancement does the opposite of edge smoothing, so this is certainly not the tool you should use.
Unfortunately, there is little that you can do because edge smoothing will indeed smoothen the jaggies, but it will also destroy the true edges, resulting in a blurred image. Edge-preserving smoothing is also a dead-end.
You should have a look at the methods to extract the "cartoon part" of an image. There is a lot of literature on this topic, though often pretty sophisticated.
You can enhance the quality of your "Image with Distinct Colours" by applying a median filter with a radius of 2:
If you want to get "comic-like" dark edges, you can calculate the edges of the original image using a sobel filter, convert the edge map to grayscale, then multiply the resulting edge map with 2, inverse the map and add each non-white pixel of the edge map to the original image. This will result in:
This is of course only a starting point as the result leaves much to be desired, but it should give you a good idea about the basic concept.
Given a contour outlining the edge of the letter S (in comic sans for example), how can I get a series of points along the spine of this letter in order to later represent this shape using lines, cubic spline or other curve-representing technique? I want to process and represent the shape using 30-40 points in Python/OpenCV.
Morphological skeletonization could help with this but the operation always seems to produce erroneous branches. Is there a better way to collapse the contour into just the 'S' shape of the letter?
In the example below you can see the erroneous 'serpent's tongue' like branches that are produced by morphological skeletonization. I don't know if it's fair to say they are erroneous if that's what the algorithm is supposed to be doing, but for me I would not like them to be there.
Below is the comic sans alphabet:
Another problem with skeletonization is that it is computationally expensive, but if you know a way of making it robust to forming 'serpent's tongue' like branches then I will give it a try.
Actually vectorizing fonts isn't trivial problem and quite tricky. To properly vectorize fonts using bezier curve you'll need tracing. There are many library you can use for tracing image, for example Potrace. I'm not knowledgeable using python but based on my experience, I have done similar project using c++ described below:
A. Fit the contour using cubic bezier
This method is quite simple although a lot of work should be done. I believe this also works well if you want to fit skeletons obtained from thinning.
Find contour/edge of the object, you can use OpenCV function findContours()
The entire shape can't be represented using a single cubic bezier, so divide them to several segments using Ramer-Douglas-Peucker (RDP). The important thing in this step, don't delete any points, use RDP only to segment the points. See colored segments on image below.
For each segments, where S is a set of n points S = (s0, s1,...Sn), fit a cubic bezier using Least Square Fitting
Illustration of least square fitting:
B. Resolution Resolution Independent Curve Rendering
This method as described in this paper is quite complex but one of the best algorithms available to display vector fonts:
Find contour (the same with method A)
Use RDP, differently from method A, use RDP to remove points so the contour can be simplified.
Do delaunay triangulation.
Draw bezier curve on the outer edges using method described in the paper
The following simple idea might be usefull.
Calculate Medial axis of the outer contour. This would ensure connectivity of the curves.
Find out the branch points. Depending on its length you can delete them in order to eliminate "serpent's tongue" problem.
Hope it helps.
I am using Graphviz in Python via pydot. The diagram I am making has many clusters of directed graphs. pydot is putting them next to each other horizontally resulting in an image that is very wide. How can I tell it to output images of a maximum width so that I can scroll vertically instead?
There are several things you can do.
You can set the maximum size of your graph, using 'size' (e.g., size = "4, 8" (inches)). This fixes the size of your final layout. Unlike most other node,edge, and graph parameters in the dot language, 'size' has no default. Also, the default orientation is 'portrait', which i believe is what you want (for a graph that is taller vs. wider), but you might want to set this parameter explicitly in case it was set to 'landscape' earlier.
'Size' can be used with the 'ratio' parameter (the layout aspect ratio) to manipulate the configuration. 'Ratio' takes a float (e.g., ratio = "2.0") or 'auto' or 'fill'. (The latter tells graphviz to fill use the entire graph region alloted by 'size'.
The parameters that have the greatest effect on graph configuration are 'nodesep' and 'ranksep'. These are the minimum horizontal distance between adjacent nodes of equal 'rank', and the minimum vertical distance between adjacent ranks of nodes. The default values are 0.25 and 0.75 inches, respectively. To get the configuration you want, you will want to simultaneously increase nodesep and decrease ranksep. Gradual iteration should allow you to quickly converge on a set of values for these two parameters that gives you the configuration you want.
Initialize your graph like this:
graph = pydot.Dot(graph_type='digraph', rankdir='LR')
This will set the graph direction from left to right. In general, use the graphviz documentation to find the right attribute in order to achieve what you want.
I'm not sure if you're able to do this with your data, but if you change the order that the nodes are inserted into the graph it can really affect the generated graph. If you don't want to supply any ordering information to Graphviz and want Graphviz to attempt solving optimal placement of nodes to minimize contention, use Graphviz's neato instead. It uses a spring model to figure out where nodes should be placed.
It looks like you should be able to use neato inside pydot like:
my_graph.write('my_graph.png', prog='neato', format='png')
See pydot's documenation here.