Group vertices in clusters using NetworkX - python

I am trying to draw some graphs, and I need to group nodes that share a common characteristic into clusters.
I am using NetworkX, and I need to do something similar to the graph from this tutorial (slide 44, left figure).
I want to draw a delimiting line around each cluster. My current code looks like this:
vec = self.colors
colors = (linspace(0, 1, len(set(vec))) * 20 + 10)
nx.draw_circular(g, node_color=array([colors[x] for x in vec]))
show()
I would like to find an example that shows how I can use networkx to cluster the graph.

I'm not positive what your question is. I think you're asking "how do I get networkx to put some nodes close together?"
Before I launch into the answer, the drawing documentation for networkx is here: http://networkx.lanl.gov/reference/drawing.html
So that figure you're asking about has 4 different communities that are clustered based on having lots of edges within each community and not many outside.
If you don't want to put much effort into it, spring_layout is often good for putting tightly knit communities together. The basic algorithm of spring_layout acts as if the edges are springs (and nodes repel). So lots of edges keeps nodes close together. Note that it initializes the positions randomly, so each time you'll get a different output.
The easiest way to do this is just
nx.draw_spring(G)
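If the changing output is a problem, one option is to compute the positions yourself and pass a seed. This is only a minimal sketch: it assumes a NetworkX version whose spring_layout accepts the seed argument, and the example graph and seed value are arbitrary.

```python
import networkx as nx

# Arbitrary example graph; substitute your own.
G = nx.path_graph(6)

# The same seed gives the same random starting positions,
# so both calls produce the same layout.
pos_a = nx.spring_layout(G, seed=42)
pos_b = nx.spring_layout(G, seed=42)

# nx.draw_networkx(G, pos=pos_a) would then draw the same picture on every run.
```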
But maybe you want more. If you want to, you can fix every single node's position. Define a dict, usually named pos.
pos = {}
for node in G.nodes():
    pos[node] = (xcoord, ycoord)
where xcoord and ycoord are the coordinates you want the node to be at.
Then just do
nx.draw_networkx(G, pos=pos)
That's often a lot of effort. So sometimes you just tell it that a few of them have to be in particular places, and let networkx do the rest.
Define fixedpos for a few nodes and then run spring_layout, telling it which nodes are fixed and giving it fixedpos as the initial positions. It will then hold those nodes fixed and fit everything else around them.
Here is some code which generates a network that has 4 completely connected parts and a few other edges between them. (actually it generates a complete network and then deletes all but a few edges between these parts). Then it draws it with a simple spring layout. Then it fixes 4 of them to be at the corners of a square and places the other nodes around those fixed positions.
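import networkx as nx
import random
import pylab as py

G = nx.complete_graph(20)
# Nodes 0-4, 5-9, 10-14 and 15-19 form the four groups.
# Iterate over a copy of the edge list: removing edges while
# iterating over the live G.edges() view is not safe.
for u, v in list(G.edges()):
    if u // 5 != v // 5:  # the edge joins two different groups
        if random.random() < 0.95:
            G.remove_edge(u, v)

# First drawing: plain spring layout.
nx.draw_spring(G)
py.show()

# Second drawing: pin one node of each group to a corner of the unit square.
fixedpos = {1: (0, 0), 6: (1, 1), 11: (1, 0), 16: (0, 1)}
pos = nx.spring_layout(G, fixed=list(fixedpos), pos=fixedpos)
nx.draw_networkx(G, pos=pos)
py.show()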
You can also specify weights to the edges, pass the weights to spring_layout and larger weights will tell it to keep the corresponding nodes closer together. So once you've identified your communities, increase weights within the communities/clusters if necessary to keep them close together.
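As a rough sketch of the weighting idea: the helper below is hypothetical (the name community_weighted_layout, the community_of mapping and the weight values 5.0 and 1.0 are all assumptions chosen for illustration). It writes a "weight" attribute onto every edge and then asks spring_layout to use it.

```python
import networkx as nx

def community_weighted_layout(G, community_of, inside=5.0, between=1.0, seed=0):
    """Return spring-layout positions with heavier weights inside communities.

    community_of maps each node to a community label (hypothetical input).
    Note that this overwrites the "weight" attribute on the edges of G.
    """
    for u, v in G.edges():
        same = community_of[u] == community_of[v]
        G[u][v]["weight"] = inside if same else between
    # Larger weights pull the two endpoints more strongly together.
    return nx.spring_layout(G, weight="weight", seed=seed)

# Example: nodes 0-2 are community 0, nodes 3-5 are community 1.
G = nx.complete_graph(6)
community_of = {n: n // 3 for n in G}
pos = community_weighted_layout(G, community_of)
```

How large the within-community weight needs to be depends on the graph, so expect to try a few values.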
Note that you can also specify what color to make each node, so it is straightforward to specify the color for each community/cluster.
If you then want to draw curves around each of those clusters, you'll have to do that through matplotlib.
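One simple way to do that with matplotlib is to put a circle patch around each cluster. This is only a sketch under assumptions: the function name draw_with_cluster_outlines, the clusters argument (a list of node lists) and the pad value are made up for illustration, and a circle around the centroid is just one possible outline.

```python
import matplotlib.pyplot as plt
import networkx as nx
from matplotlib.patches import Circle

def draw_with_cluster_outlines(G, pos, clusters, ax=None, pad=0.15):
    """Draw G, then a dashed circle around each cluster of nodes.

    clusters is a list of node collections; pos maps node -> (x, y).
    Returns the list of Circle patches that were added.
    """
    if ax is None:
        ax = plt.gca()
    nx.draw_networkx(G, pos=pos, ax=ax)
    outlines = []
    for nodes in clusters:
        xs = [pos[n][0] for n in nodes]
        ys = [pos[n][1] for n in nodes]
        # Centre the circle on the centroid of the cluster.
        cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
        # Radius reaches the farthest node, plus some padding.
        radius = max(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
                     for x, y in zip(xs, ys)) + pad
        circle = Circle((cx, cy), radius, fill=False, linestyle="--")
        ax.add_patch(circle)
        outlines.append(circle)
    ax.set_aspect("equal")  # keep the circles round
    return outlines
```

Call it with the pos dict you got from spring_layout and then call plt.show(). Clusters that sit close together will get overlapping circles, so this works best after the layout has already separated them.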

Related

How to connect nodes in a networkx graph?

Left is input, right is desired output:
Input: I am given some n. I generate n points uniformly at random from [0, 1]. So, the points are tuples (x, y).
I then add this list of nodes into NetworkX graph object. Now, I'd like to connect the edges as shown in the right. That is, the graph is connected (you can get from anywhere to anywhere using some number of edges) but not necessarily an Erdos Renyi graph.
I'm not sure what the term is for this kind of graph (a graph with no overlapping edges?), but is it possible to generate edges for such a graph using NetworkX?
Networks derived from points in Euclidean space are typically called geometric graphs. Graphs with no overlapping edges are called planar graphs. As you have drawn all your edges as straight lines, I assume that you are particularly interested in planar, straight-line graphs (PSLGs).
There are several generators for geometric graphs in networkx. However, I am unsure whether any of them would necessarily honor the planarity constraint (it feels like you could coerce geographical_threshold_graph into doing that if you picked the threshold parameter intelligently, but I don't have a solution off the top of my head).
Personally, I would start with my random points, and then get the edges by computing a Delaunay triangulation, implemented in scipy.spatial. I would then subsample the edges (depending on the task) and create my graph objects in networkx/igraph/graph-tool.
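A minimal sketch of the Delaunay approach, assuming numpy and scipy are available. The function name delaunay_graph and the "pos" node attribute are illustrative choices, not anything prescribed by networkx or scipy.

```python
import networkx as nx
import numpy as np
from scipy.spatial import Delaunay

def delaunay_graph(points):
    """Build a graph whose edges are the Delaunay triangulation of 2D points."""
    points = np.asarray(points)
    tri = Delaunay(points)
    G = nx.Graph()
    for i, (x, y) in enumerate(points):
        G.add_node(i, pos=(float(x), float(y)))
    # Each simplex is a triangle given by three point indices.
    for a, b, c in tri.simplices:
        a, b, c = int(a), int(b), int(c)
        G.add_edges_from([(a, b), (b, c), (a, c)])
    return G

# n random points in the unit square (n and the seed are arbitrary).
points = np.random.default_rng(0).random((30, 2))
G = delaunay_graph(points)
```

To draw it, pass nx.get_node_attributes(G, "pos") as the pos argument. If you then thin out the edges, removing arbitrary ones can disconnect the graph; keeping a spanning tree and dropping only edges outside it is one way to stay connected.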

networkX.draw() not producing edges

I'm not sure why my network graph doesn't include edges.
I'm creating a network from a pandas dataframe that looks like the following:
I created the network as follows:
G = nx.from_pandas_edgelist(network_df,
                            edge_attr='weight',
                            source='Source',
                            target='Target',
                            create_using=nx.Graph())
but nx.draw(G) produces a graph without edges.
I tried using nx.DiGraph() but the result is the same.
Any help is greatly appreciated.
That central "blob" in your plot is a lot of nodes connected together, which probably do have edges, but the edges are obscured by the dense mass of nodes. On the periphery there are a few nodes joined together by edges, but because of the layout algorithm the pairs (or somewhat larger clusters) are again so close together that the edges are hidden behind the nodes. The isolated nodes are isolated.
It's probably best to try another layout. The default is spring_layout. Here's another that will probably show it better:
pos = nx.circular_layout(G)
nx.draw(G, pos)
As a general rule, networkx was not designed for the purpose of graph visualization. So you may need to look at other tools like graphviz.

Find Subgraphs inside a Connected Component from a NetworkX Graph

I have built a NetworkX Graph containing 50000 nodes and about 100 million edges. I have a list of all connected components of this graph, obtained with nx.connected_components(G). This gives me clusters of nodes such that each node has a path to every other node in that cluster. Now, within each of these connected components, I want to find subgraphs/sub-clusters such that these subgraphs are connected to each other by exactly one edge. Is there a method in NetworkX that I can use directly, or any other way to get this done? Sorry, I am very new to graph theory so I need a little direction.
If I understand you correctly, then for each subgraph, you want to find all graph cuts of size 1, i.e. you want to find all edges, that if taken away partition the graph into two subgraphs. These edges are called bridges and there are efficient algorithms to find them. The implementation in networkx is accessible via networkx.algorithms.bridges.bridges.
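A small sketch of how that could be used: find the bridges, remove them from a copy, and read off what is left as the sub-clusters. The helper name split_at_bridges and the toy graph are made up for illustration; nx.bridges is the same function exposed at the top level of the package.

```python
import networkx as nx

def split_at_bridges(G):
    """Return (bridges, parts): the bridge edges of G and the node sets
    that remain connected once those bridges are removed."""
    bridges = list(nx.bridges(G))
    H = G.copy()  # leave the original graph untouched
    H.remove_edges_from(bridges)
    parts = [set(c) for c in nx.connected_components(H)]
    return bridges, parts

# Toy example: two triangles joined by the single edge (3, 4).
G = nx.Graph([(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)])
bridges, parts = split_at_bridges(G)
```

Here the only bridge is the edge between 3 and 4, and the two remaining parts are the triangles. Note that a node hanging off the graph by a single edge also ends up as its own one-node part.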
What you want is called a minimum spanning tree. Using networkx you can do it like this:
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_edges_from([(1,2), (1,3), (2,3), (4,5), (4,6), (5,6)])
nx.draw(nx.minimum_spanning_tree(G), with_labels=True)
plt.show()
However, judging by this benchmark, I doubt that networkx can cope with that many edges. I have tested the connected components algorithm in igraph, and it worked for me as well (and, of course, much faster), so you might also want to look at igraph-based solutions.

Plot directional edges as non-overlapping lines

I have a networkx directional graph. I'd like to plot it so that nodes that interact (A->B, B->A) have two edges displayed and the colors correspond to the relative weights.
Currently I have a simple three-node graph. Naturally the flow is "order created" -> "order closed" but on not-so-rare occasions the flow can be reversed!
Since this is going on a dashboard, I need to avoid the "download it and use another software package" to visualize it. Ideally I'd just use igraph (love it in R), but it's not supported in my environment (Mode Analytics).
Is the best solution to just create a hack where I split every node to a pair: "node-sender" and "node-receiver"? At least that would have the bonus of seeing self-edges too. Or maybe switch to plotly's chord diagram?
Edit: What I'm looking for
Ideally I could just use something like the qgraph package in R.
edges <- data.frame(from = from_node, to = to_node, thickness = weights)
qgraph(edges, esize=10, gray=TRUE)

order of networkx nodes - print graphviz layout vertically

I have a graphviz layout I've created. I've also tried to create graphs using differing drawing styles such as random, circular, shell, spectral, spring. I believe graphviz is the most accurate to my data. I created a file containing two columns of strings. These columns are the edges. (Each string has at least one corresponding partner, which is why GraphViz layout I think best represents these data) From that file I created a list of unique strings for the nodes. I then plotted the nodes and added the edges. A version of my script can be found here: (networkx - change node size based on list or dictionary value)
Here is the output using graphviz layout (instead of 100 the sizes were multiplied by 10, some numbers are as high as 15020, and other as small as 10):
Here is the output using random:
Can one conclude that all the edges that should be present are present in the graphviz example? Is it correct to say that smaller nodes "on top of" larger ones are connected? Is it possible to make their edges visible? Are so many more edges visible in the random example because the nodes are placed randomly, so the edges have a much greater 'length' to traverse?
If what I think is correct, and graphviz is the best drawing option for my data, since there are many overlaps between the nodes and edges (and if those nodes "on top of" the larger node are indeed connected), what I would like to do is sort the plot in a "vertical" fashion: the largest nodes with the most edges on top, going down to nodes with only 1 edge. I've tried to change the overall figure size, which did not make anything more discernible. For some reason, I got the original window with the plot and a secondary window with a grey blank background.
So, I'm starting to think some of my assumptions are correct. Here is the image as large as I can make it:
What is happening is that networkx puts the nodes over top of the edges. So the edges are drawn underneath the nodes.
I believe the easiest way to still see them is to set alpha=0.5 or something else less than 1 in the draw command to make the nodes partly transparent.
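A minimal sketch of that, drawing nodes and edges in separate calls so that only the nodes become translucent (the function name draw_translucent and its default values are made up for illustration):

```python
import matplotlib.pyplot as plt
import networkx as nx

def draw_translucent(G, pos, node_size=300, alpha=0.5, ax=None):
    """Draw G with partly transparent nodes so edges underneath show through."""
    if ax is None:
        ax = plt.gca()
    # alpha < 1 makes the node markers see-through.
    nodes = nx.draw_networkx_nodes(G, pos, node_size=node_size,
                                   alpha=alpha, ax=ax)
    # Edges are drawn fully opaque.
    nx.draw_networkx_edges(G, pos, ax=ax)
    return nodes
```

Passing alpha=0.5 straight to nx.draw(G, pos, alpha=0.5) is shorter, but then the same transparency is applied to the edges as well.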
