networkx - python - distance between nodes nx.draw_graphviz()

I'm using networkx to make some graphs. I like the output of the fdp layout with Graphviz.
However, I can't seem to get the nodes to space far enough apart to visualize them.
I've tried using scale, K and nodesep in the nx.draw() command, however a lot of the nodes are still jumbled and can't be seen because of overlap. I decreased the node size from 300 (default) to 200, still not too good. Any smaller and the colors I've added are not easily recognizable.
There are approx. 2400 nodes. Does anyone know how to space nodes with nx.draw_graphviz(g, prog="fdp")? Ideally I would like to order the nodes from largest cluster to smallest in a vertical fashion, but can't seem to find a layout.
I tried using prog="dot" and using rankdir="TB", but the nodes are still printed left to right in a jumbled order and are very hard to make out. I either need to increase the spacing of the nodes, or make the image much larger, and I've also tried playing around with the parameters to Matplotlib and the image is the same size every time. All thoughts are appreciated.

Related

Generate schematic (geographic) diagram from graph

I would like to know how best to generate a schematic diagram, something like this, from a graph (created using the Python NetworkX library) that contains the latitude and longitude of each node (city) and the lines connecting them in the Indian railway network.
The cities (nodes) should be located reasonably close to their actual position, but not necessarily exactly. I am OK with using the plate carrée projection that simply maps lat/long onto X/Y in the diagram.
The rail lines (edges) can be straight lines or even curves if it fits better.
On the diagram should be displayed the cities (preferably as dots) along with a short (max 4 characters) label for each, the lines connecting them, and a single label for each line (the given example has quite long labels for the lines).
Preferably the amount of manual tweaking of coordinates to get things to fit should be minimised.
Using Graphviz was my first idea. But I don't know how well neato/fdp (required for fixed positioning of nodes) will perform with large numbers of nodes/edges. Also, making Graphviz display labels separately outside the nodes (rather than inline) seems to need a lot of manual positioning of each label, which would be pretty boring. Is there any better way to get this kind of layout?

Doable (https://forum.graphviz.org/t/another-stupid-graphviz-trick-geographic-graphs/256), but does not seem to use many Graphviz features. In addition to tools mentioned in the link, maybe consider pikchr (https://pikchr.org/home/doc/trunk/homepage.md)
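As a rough sketch of the fixed-position route mentioned in the question, the snippet below writes DOT text in which each node carries a `pos` attribute ending in `!` (which pins it for neato/fdp) and an `xlabel` (which Graphviz places outside the node). The function name `geo_dot`, the `scale` factor, the three-letter labels, the approximate coordinates and the line labels are all made up for illustration; none of them come from the original posts.

```python
def geo_dot(cities, lines, scale=0.5):
    """Return DOT source with nodes pinned at plate carree positions.

    cities: {label: (lat, lon)}; lines: [(label_a, label_b, line_label)].
    scale: inches per degree (a hypothetical tuning knob).
    """
    out = ["graph rail {", "  node [shape=point, width=0.08];"]
    for code, (lat, lon) in cities.items():
        x, y = lon * scale, lat * scale  # plate carree: lon -> x, lat -> y
        # The trailing "!" pins the node; xlabel puts the text outside the dot.
        out.append(f'  {code} [pos="{x:.2f},{y:.2f}!", xlabel="{code}"];')
    for a, b, label in lines:
        out.append(f'  {a} -- {b} [label="{label}"];')
    out.append("}")
    return "\n".join(out)


# Illustrative data only: short labels and approximate lat/lon values.
cities = {"DEL": (28.6, 77.2), "BOM": (19.0, 72.8), "MAA": (13.1, 80.3)}
lines = [("DEL", "BOM", "A"), ("DEL", "MAA", "B")]
dot_text = geo_dot(cities, lines)
print(dot_text)
```

Saved to a file, the text could then be rendered with something like `neato -Tpng rail.dot -o rail.png`. How well this scales to a full network is not something this sketch answers.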

Highlighting many ranges on an axis of a Bokeh plot?

I have a scatter plot of data and would like to highlight certain ranges of the x-axis. When the number of ranges to highlight is relatively small, using BoxAnnotation works well. However, I'm trying to make many adjacent highlights (with different opacities). With many adjacent BoxAnnotations, zoomed out, the boxes slightly overlap, creating lines. Additionally, thousands of BoxAnnotations take a long time to generate, and the plot does not respond smoothly when interacting with it.
To be more specific about my case, I have some temporal data and a predictive model detecting the probability of some event occurring in the data. I want each segment to be highlighted with an opacity given by the probability that an event is occurring at that point in time. However, my current BoxAnnotation approach results in artificial lines from overlap of boxes when zoomed out (they disappear when zooming in on a region), and slow responsiveness of the interactive plot.
Is there a way to accomplish something similar to this without the artifacts and with a smoother experience?
Current method:
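from bokeh.models import BoxAnnotation, ColumnDataSource
from bokeh.plotting import figure, show

# data_frame is assumed to be a pandas DataFrame with 'time', 'intensity' and 'prediction' columns
source = ColumnDataSource(data=data_frame)
figure_ = figure(x_axis_label='Time', y_axis_label='Intensity')
# One BoxAnnotation per pair of adjacent time points
for index in range(data_frame.shape[0] - 1):
    figure_.add_layout(
        BoxAnnotation(left=data_frame['time'].values[index], right=data_frame['time'].values[index + 1],
                      fill_alpha=data_frame['prediction'].values[index], fill_color='red', line_alpha=0)
    )
figure_.circle(x='time', y='intensity', source=source)
show(figure_)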
Example of artificial lines when there are too many small adjacent BoxAnnotations:
When zooming on the x-axis, the lines disappear:

There's probably not any way to salvage this exact approach. The artifacts are due to the functioning of the underlying raster HTML canvas, and there's not anything that can be done about that. And any slowness is due to the fact that this kind of use of BoxAnnotation (with so very many individual instances) is not at all what was envisioned, and it is simply not optimized to show hundreds of instances the way e.g. scatter glyphs are. You are trying to use box annotations to construct a sort of translucent heat map, and that is not a good fit for it, for the reasons above.
You could potentially overcome slowness by using a single rect or vbar glyph that draws all the boxes at once in a vectorized way. But that won't alleviate the compositing issues.
Your best bet is to create a semi-transparent "heatmap" image overlay yourself with a tool or code that can afford better control over the details of rasterization and compositing. I can't really advise you on how to do that in any detail. The Datashader library might be useful for this.
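A minimal sketch of the "single vectorized glyph" idea, under some assumptions: it uses `quad` rather than `rect` or `vbar`, only because `quad` takes left and right edges directly, and the helper names `highlight_columns` and `plot_highlights` are invented here. Unlike BoxAnnotation, a quad needs explicit `bottom` and `top` values, so the data range is used. As the answer says, this may help with speed but is not expected to remove the overlap artifacts.

```python
def highlight_columns(times, predictions):
    """Build left/right/alpha columns, one row per adjacent pair of time points."""
    return {
        "left": list(times[:-1]),
        "right": list(times[1:]),
        "alpha": [float(p) for p in predictions[:-1]],
    }


def plot_highlights(times, intensities, predictions):
    # Imported lazily so the helper above works without Bokeh installed.
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure, show

    fig = figure(x_axis_label="Time", y_axis_label="Intensity")
    boxes = ColumnDataSource(highlight_columns(times, predictions))
    # One glyph call draws every box; alpha is read per row from the source.
    fig.quad(left="left", right="right",
             bottom=min(intensities), top=max(intensities),
             fill_alpha="alpha", fill_color="red", line_alpha=0, source=boxes)
    fig.scatter(times, intensities)
    show(fig)


cols = highlight_columns([0, 1, 2, 3], [0.1, 0.8, 0.3, 0.0])
print(cols)
```

Calling `plot_highlights(times, intensities, predictions)` with the three data columns would open the plot.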

order of networkx nodes - print graphviz layout vertically

I have a graphviz layout I've created. I've also tried to create graphs using differing drawing styles such as random, circular, shell, spectral, spring. I believe graphviz is the most accurate to my data. I created a file containing two columns of strings. These columns are the edges. (Each string has at least one corresponding partner, which is why GraphViz layout I think best represents these data) From that file I created a list of unique strings for the nodes. I then plotted the nodes and added the edges. A version of my script can be found here: (networkx - change node size based on list or dictionary value)
Here is the output using graphviz layout (instead of 100 the sizes were multiplied by 10, some numbers are as high as 15020, and other as small as 10):
Here is the output using random:
Can one conclude that all the edges that should be present are present in the graphviz example? Is it correct to say that smaller nodes "on top of" larger ones are connected? Is it possible to make their edges viewable? Are so many more edges visible in the random example because the random placement of nodes means edges have a much greater 'length' to traverse?
If what I think is correct, and the graphviz is the best drawing option for my data, since there are many overlaps between the nodes and edges (and if those nodes "on top of" the larger node are indeed connected) what I would like to do is sort the plot in a "vertical" fashion. So, the largest nodes with most edges on top, going down to nodes with only 1 edge. I've tried to change the overall figure size, which did not make anything more discernable. For some reason, I got the original window with the plot and a secondary window with a grey blank background.
So, I'm starting to think some of my assumptions are correct. Here is the image as large as I can make it:

What is happening is that networkx puts the nodes over top of the edges. So the edges are drawn underneath the nodes.
I believe the easiest way to still see them is to set alpha=0.5 or something else less than 1 in the draw command to make the nodes partly transparent.

Efficient 2D edge detection in Python

I know that this problem has been solved before, but I've been having great difficulty finding any literature describing the algorithms used to process this sort of data. I'm essentially doing some edge finding on a set of 2D data. I want to be able to find a couple of points on an eye diagram (generally used to qualify high speed communications systems), and as I have had no experience with image processing I am struggling to write efficient methods.
As you can probably see, these diagrams are so called because they resemble the human eye. They can vary a great deal in the thickness, slope, and noise, depending on the signal and the system under test. The measurements that are normally taken are jitter (the horizontal thickness of the crossing region) and eye height (measured at either some specified percentage of the width or the maximum possible point). I know this can best be done with image processing instead of a more linear approach, as my attempts so far take several seconds just to find the left side of the first crossing. Any ideas of how I should go about this in Python? I'm already using NumPy to do some of the processing.
Here's some example data. It is formatted as a 1D array with associated x-axis data. For this particular example, it should be split up every 666 points (2 * int((1.0 / 2.5e9) / 1.2e-12)), since the rate of the signal was 2.5 Gb/s, and the time between points was 1.2 ps.
Thanks!

Have you tried OpenCV (Open Computer Vision)? It's widely used and has a Python binding.
Not to be a PITA, but are you sure you wouldn't be better off with a numerical approach? All the tools I've seen for eye-diagram analysis go the numerical route; I haven't seen a single one that analyzes the image itself.
You say your algorithm is painfully slow on that dataset -- my next question would be why. Are you looking at an oversampled dataset? (I'm guessing you are.) And if so, have you tried decimating the signal first? That would at the very least give you fewer samples for your algorithm to wade through.
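To make the numerical route a little more concrete, here is a small pure-Python sketch that reproduces the segment length from the question, cuts the record into segments, and thins a segment by block averaging. The function names and the synthetic `record` are invented for illustration. Block averaging is a crude stand-in for real decimation, which low-pass filters before downsampling (`scipy.signal.decimate` does this), and it will soften sharp edges.

```python
def points_per_segment(bit_rate_hz, sample_period_s):
    # Two unit intervals per segment, following the formula in the question.
    return 2 * int((1.0 / bit_rate_hz) / sample_period_s)


def split_segments(samples, length):
    """Cut the 1D record into consecutive full-length segments."""
    return [samples[i:i + length]
            for i in range(0, len(samples) - length + 1, length)]


def block_average(samples, factor):
    """Naive decimation: average each run of `factor` samples."""
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


n = points_per_segment(2.5e9, 1.2e-12)  # 666, as stated in the question
record = [float(i % 4) for i in range(2000)]  # synthetic stand-in data
segments = split_segments(record, n)
reduced = block_average(segments[0], 3)
print(n, len(segments), len(reduced))
```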

just going down your route for a moment, if you read those images into memory, as they are, wouldn't it be pretty easy to do two flood fills (starting centre and middle of left edge) that include all "white" data? if the fill routine recorded maximum and minimum height at each column, and maximum horizontal extent, then you have all you need.
in other words, i think you're over-thinking this. edge detection is used in complex "natural" scenes when the edges are unclear. here your edges are so completely obvious that you don't need to enhance them.
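a rough sketch of that flood fill, assuming the image has already been thresholded into rows of 0 ("white", empty) and 1 (trace), and that the start cell is empty. the name `flood_extents` and the toy grid are made up for illustration.

```python
from collections import deque


def flood_extents(grid, start):
    """Flood-fill empty (0) cells from `start`; grid is a list of rows of 0/1.

    Returns ({column: (top_row, bottom_row)}, (leftmost_col, rightmost_col)).
    """
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    per_column = {}
    while queue:
        r, c = queue.popleft()
        # Track the smallest and largest row reached in this column.
        lo, hi = per_column.get(c, (r, r))
        per_column[c] = (min(lo, r), max(hi, r))
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and not grid[nr][nc]:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return per_column, (min(per_column), max(per_column))


# Toy "eye": 1 is trace, 0 is the open region.
eye = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
columns, (left, right) = flood_extents(eye, (2, 3))
top, bottom = columns[3]
print("eye height at column 3:", bottom - top + 1)
print("horizontal extent:", left, "to", right)
```

the per-column pairs give the eye opening at any horizontal position, and a second fill started at the middle of the left edge would give the extent of the crossing region in the same way.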

Placing nodes vertically in Graphviz using pydot

I am using Graphviz in Python via pydot. The diagram I am making has many clusters of directed graphs. pydot is putting them next to each other horizontally resulting in an image that is very wide. How can I tell it to output images of a maximum width so that I can scroll vertically instead?

There are several things you can do.
You can set the maximum size of your graph, using 'size' (e.g., size = "4, 8" (inches)). This fixes the size of your final layout. Unlike most other node, edge, and graph parameters in the dot language, 'size' has no default. Also, the default orientation is 'portrait', which I believe is what you want (for a graph that is taller than it is wide), but you might want to set this parameter explicitly in case it was set to 'landscape' earlier.
'Size' can be used with the 'ratio' parameter (the layout aspect ratio) to manipulate the configuration. 'Ratio' takes a float (e.g., ratio = "2.0") or 'auto' or 'fill'. (The latter tells graphviz to use the entire graph region allotted by 'size'.)
The parameters that have the greatest effect on graph configuration are 'nodesep' and 'ranksep'. These are the minimum horizontal distance between adjacent nodes of equal 'rank', and the minimum vertical distance between adjacent ranks of nodes. The default values are 0.25 and 0.5 inches, respectively. To get the configuration you want, you will want to simultaneously increase nodesep and decrease ranksep. Gradual iteration should allow you to quickly converge on a set of values for these two parameters that gives you the configuration you want.
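As an illustration of where these attributes go, the sketch below builds DOT source by hand with graph-level 'size', 'ratio', 'nodesep' and 'ranksep' set. The helper name `tall_digraph`, the toy edges and the specific values are assumptions chosen only as a starting point for iteration. The same attribute names can also be passed as keyword arguments when constructing a pydot.Dot.

```python
def tall_digraph(edges, **graph_attrs):
    """Return DOT source for a digraph with the given graph-level attributes."""
    lines = ["digraph G {"]
    for name, value in graph_attrs.items():
        lines.append(f'  {name}="{value}";')
    for tail, head in edges:
        lines.append(f'  "{tail}" -> "{head}";')
    lines.append("}")
    return "\n".join(lines)


source = tall_digraph(
    [("a", "b"), ("a", "c"), ("c", "d")],
    size="4,8",    # maximum width,height in inches
    ratio="fill",  # stretch the layout to fill the region given by size
    nodesep=0.5,   # illustrative value, to be tuned
    ranksep=0.3,   # illustrative value, to be tuned
)
print(source)
```

The printed text can be saved to a file and laid out with dot, then adjusted a step at a time until the result is tall rather than wide.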

Initialize your graph like this:
graph = pydot.Dot(graph_type='digraph', rankdir='LR')
This will set the graph direction from left to right. In general, use the graphviz documentation to find the right attribute in order to achieve what you want.

I'm not sure if you're able to do this with your data, but if you change the order that the nodes are inserted into the graph it can really affect the generated graph. If you don't want to supply any ordering information to Graphviz and want Graphviz to attempt solving optimal placement of nodes to minimize contention, use Graphviz's neato instead. It uses a spring model to figure out where nodes should be placed.
It looks like you should be able to use neato inside pydot like:
my_graph.write('my_graph.png', prog='neato', format='png')
See pydot's documentation.
