I have an undirected multigraph (derived from an ontology) and I want to remove the edges that create cycles, but not all edges: the components of the multigraph have to remain connected. Is there a good way of doing this with the networkx package?
There may not be a unique way to do that for your graph. But maybe finding a spanning tree will solve your problem?
https://networkx.github.io/documentation/latest/reference/generated/networkx.algorithms.mst.minimum_spanning_tree.html
So I ended up with
import networkx as nx

def as_spanning_trees(G):
    """
    For a given graph with multiple subgraphs, find the connected
    components and build a spanning tree for each.
    Returns a new Graph with components as spanning trees (i.e. without cycles).

    Parameters
    ----------
    G : networkx.Graph
    """
    G2 = nx.Graph()
    # Find the connected components of the graph as subgraph views.
    # (nx.connected_component_subgraphs was removed in NetworkX 2.4.)
    graphs = (G.subgraph(c) for c in nx.connected_components(G))
    # For each component, extract the spanning tree, which removes the cycles
    for g in graphs:
        T = nx.minimum_spanning_tree(g)
        G2.add_edges_from(T.edges())
        G2.add_nodes_from(T.nodes())
    return G2
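As a side note, nx.minimum_spanning_tree is documented to return a spanning forest when the input graph is disconnected, so the per-component loop may not be strictly needed. Below is a minimal sketch of that shortcut on a made-up toy multigraph (the node numbers and edges are assumptions for illustration only):

```python
import networkx as nx

# Toy multigraph (made-up data): two cyclic components plus one isolated node.
G = nx.MultiGraph()
G.add_edges_from([(1, 2), (1, 2), (2, 3), (3, 1)])  # parallel edge + triangle
G.add_edges_from([(4, 5), (5, 6), (6, 4)])          # second triangle
G.add_node(7)                                       # isolated node

# On a disconnected graph this yields a spanning forest:
# one tree per connected component, isolated nodes are kept.
forest = nx.minimum_spanning_tree(G)

# Collapse to a simple Graph if the multigraph type is no longer needed.
simple_forest = nx.Graph(forest)

print(nx.is_forest(simple_forest))
print(simple_forest.number_of_edges())
```

Without edge weights every edge counts as weight 1, so which of several parallel or cyclic edges survives is arbitrary. If some edges matter more than others, setting a weight attribute before calling the function is one way to steer the choice.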
I am new to building directed graphs with networkx and I'm trying to work out how to compare two graphs. More specifically, how to tell if a smaller graph is a subgraph (unsure of the exact terminology) of a larger graph
As an example, assume I have the following directed graph:
I would like to be able to check whether a series of smaller graphs are sub-graphs of this initial graph. Returning a True value if they are (graph B), and False if they are not (graph C):
Graph B = Sub-graph of Graph A
Graph C != Sub-graph of Graph A
Example Code
import networkx as nx
A = nx.DiGraph()
A.add_edges_from([('A','B'),('B','C'),('C','A')])
nx.draw_networkx(A)
B = nx.DiGraph()
B.add_edges_from([('A','B')])
nx.draw_networkx(B)
C = nx.DiGraph()
C.add_edges_from([('A','B'),('A','C')])
nx.draw_networkx(C)
I've had a look through the documentation and cannot seem to find what I need. An alternative I have been considering is to represent the nodes as a sequence of strings and then search for each substring in the main graph's string sequence. However, I can't imagine this is the most efficient, effective or stable way to solve the problem.
You are looking for subgraph isomorphism.
nx.isomorphism.DiGraphMatcher(A, B).subgraph_is_isomorphic()
# True
nx.isomorphism.DiGraphMatcher(A, C).subgraph_is_isomorphic()
# False
Note that the operation can be slow for large graphs, as the problem is NP-complete.
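One caveat: the matcher tests structure only and ignores node labels, and it compares against node-induced subgraphs. If the labels themselves must match (the edge 'A' -> 'B' in the small graph has to be the edge 'A' -> 'B' in the large one), a plain containment check may be closer to what is wanted. A minimal sketch, where is_labeled_subgraph is a hypothetical helper name and graph D is made up for illustration:

```python
import networkx as nx

def is_labeled_subgraph(big, small):
    """Hypothetical helper: True if every node and edge of `small`
    also exists, with the same labels and direction, in `big`."""
    return (all(big.has_node(n) for n in small.nodes)
            and all(big.has_edge(u, v) for u, v in small.edges))

A = nx.DiGraph([('A', 'B'), ('B', 'C'), ('C', 'A')])
B = nx.DiGraph([('A', 'B')])
C = nx.DiGraph([('A', 'B'), ('A', 'C')])
D = nx.DiGraph([('X', 'Y')])  # same shape as B, different labels

print(is_labeled_subgraph(A, B))
print(is_labeled_subgraph(A, C))
print(is_labeled_subgraph(A, D))
# The matcher, by contrast, accepts D because only the structure is compared.
print(nx.isomorphism.DiGraphMatcher(A, D).subgraph_is_isomorphic())
```

This check runs in time linear in the size of the small graph, so it also sidesteps the NP-complete search when labels are fixed.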
I am working with networks undergoing a number of disrupting events, each of which causes a number of nodes to fail. The network therefore goes from the state in the image on the left to the one on the right:
My question: how can I find the disconnected subgraphs, even if they contain only 1 node? My purpose is to count them and render as failed, as in my study this is what applies to them. By semi-isolated nodes, I mean groups of isolated nodes, but connected to each other.
I know I can find isolated nodes like this:
def find_isolated_nodes(graph):
    """Return a list of isolated nodes (nodes with no neighbours)."""
    isolated = []
    for node in graph:
        if not graph[node]:
            # append, not +=, so that string or integer nodes are added whole
            isolated.append(node)
    return isolated
but how would you amend these lines to make them find groups of isolated nodes as well, like those highlighted in the right hand side picture?
MY THEORETICAL ATTEMPT
It looks like this problem is addressed by the Flood Fill algorithm, which is explained here. However, I wonder how it could be possible to simply count the number of nodes in the giant component(s) and then subtract it from the number of nodes that appear still active at stage 2. How would you implement this?
If I understand correctly, you are looking for "isolated" nodes, meaning the nodes not in the largest component of the graph. As you mentioned, one method to identify the "isolated" nodes is to find all the nodes NOT in the largest component. To do so, you can just use networkx.connected_components, to get a list of the components and sort them by size:
components = list(nx.connected_components(G)) # list because it returns a generator
components.sort(key=len, reverse=True)
Then you can find the largest component, and get a count of the "isolated" nodes:
largest = components.pop(0)
num_isolated = G.order() - len(largest)
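To also see which nodes and how many separate groups fall outside the largest component, something along these lines might work. It is only a sketch on a made-up graph; the names rest, failed_nodes and singletons are illustrative, not part of networkx:

```python
import networkx as nx

# Made-up example: one large component, one pair, one single node.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3)])  # largest component
G.add_edge(4, 5)                            # small connected group
G.add_node(6)                               # fully isolated node

components = sorted(nx.connected_components(G), key=len, reverse=True)
largest, rest = components[0], components[1:]

failed_nodes = set().union(*rest)    # every node outside the largest component
num_failed_groups = len(rest)        # groups, counting single nodes as groups
singletons = list(nx.isolates(G))    # nodes with no neighbours at all

print(failed_nodes, num_failed_groups, singletons)
```

If two components tie for the largest size, the sort keeps only one of them as "largest", so a size threshold may be a safer criterion in that case.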
I put this all together in an example where I draw an Erdős-Rényi random graph, coloring isolated nodes blue:
# Load modules and create a random graph
import networkx as nx, matplotlib.pyplot as plt
G = nx.gnp_random_graph(10, 0.15)
# Identify the largest component and the "isolated" nodes
components = list(nx.connected_components(G)) # list because it returns a generator
components.sort(key=len, reverse=True)
largest = components.pop(0)
isolated = set( g for cc in components for g in cc )
# Draw the graph
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos=pos, nodelist=largest, node_color='r')
nx.draw_networkx_nodes(G, pos=pos, nodelist=isolated, node_color='b')
nx.draw_networkx_edges(G, pos=pos)
plt.show()
I have a graph consisting of about 200 nodes out of which I am removing nodes on each iteration.
It is possible to visualize the graph with the nodes removed, but the location of the nodes does not change while doing so. Ideally, I'd like to take away nodes and see if the remaining nodes move closer together and if clusters form as more and more nodes are removed.
I'm using NetworkX for this. I have tried to recompute the layout on every iteration, but there seems to be some randomness in how it is created, so I get a very different picture on each iteration.
Is there a way to achieve what I want?
You can use draw_networkx for this:
import networkx as nx
import matplotlib.pyplot as plt
nodes = [i for i in range(10)]
edges = [(i, i+1) for i in range(5)]
G = nx.Graph()
G.add_nodes_from(nodes)
G.add_edges_from(edges)
positions = {}
for node in nodes:
    positions[node] = (node, node)
nx.draw_networkx(G, pos=positions)
I generate a graph of 10 nodes with some edges, and then define a dict where the keys are the nodes (0 to 9 here) and the values are the coordinates in (x, y) format. In this example I arranged the nodes along a diagonal line.
Then, at the next iteration, just remove the nodes you do not need and pass the same dict. It will skip over the missing nodes and just plot the ones still present in the graph:
G.remove_nodes_from([5,6])
nx.draw_networkx(G, pos=positions)
You should see the nodes 5 and 6 missing.
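If the remaining nodes should drift closer together rather than stay frozen, one option that might work is to compute a force-directed layout once and then re-run it after each removal, starting from the previous positions. This is a sketch under assumptions: the path graph, the seed value and the iteration count are arbitrary choices for illustration.

```python
import networkx as nx

G = nx.path_graph(10)

# Initial layout; a fixed seed makes the starting picture reproducible.
pos = nx.spring_layout(G, seed=42)

# Remove some nodes, then relax the layout from the old positions
# so the surviving nodes move only a little instead of jumping around.
G.remove_nodes_from([5, 6])
start = {n: pos[n] for n in G.nodes}  # keep positions of surviving nodes only
pos = nx.spring_layout(G, pos=start, iterations=20, seed=42)

# The result can be passed to the drawing call as before:
# nx.draw_networkx(G, pos=pos)
```

A smaller iterations value keeps the picture closer to the previous frame; a larger one lets clusters pull together more strongly.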
.draw_networkx relies on matplotlib, so you can do many of the things allowed by that library. More info here.
Hope it helps!
I have a csr matrix from which I extracted data, rows, and columns.
I want to create a bipartite graph using NetworkX, and I tried several solutions without success (for example: Plot bipartite graph using networkx in Python). The reason it doesn't work, in my opinion, is a matter of labeling: my two sets and the nodes inside them have no string names.
For example in a 10x10 matrix, the rows/cols indexes represent the name of the nodes of the two sets, while the intersection of these nodes is the weighted link between those nodes.
In my case, then, if I have (0,0)=0.5 it doesn't mean that it is a self-loop; instead, the link with weight 0.5 connects the "node 0" of the first set with the "node 0" of the second one.
import networkx as nx
from networkx.algorithms import bipartite
import matplotlib.pyplot as plt
def function(foo, n_row, n_col):
n_row=10
n_col=10
After the creation of the matrix, I obtain my data
weights = weights.tocsr()
wcoo = weights.tocoo()
m_data = wcoo.data
m_rows = wcoo.row
m_cols = wcoo.col
g = nx.Graph()
# TRIAL 1
g.add_nodes_from(m_cols, bipartite=0)
g.add_nodes_from(m_rows, bipartite=1)
bi_m = bipartite.matrix.biadjacency_matrix(g, m_data)
# TRIAL 2
g.add_weighted_edges_from(zip(m_cols, m_rows, m_data))
nx.draw(g, node_size=500)
plt.show()
I expected a bipartite graph with two sets of 10 nodes each and a certain number of weighted links between them (with no links within the same set).
Instead, I obtained an ordinary undirected graph with 10 nodes in total.
At the same time, I'd like to optimize my code as much as I can to reduce computation time without hurting readability.
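A likely cause is that row index 0 and column index 0 are the same Python object, so NetworkX merges them into one node. One possible way around this is to give the two sets distinct labels, for instance tuples. The sketch below uses small made-up lists in place of wcoo.row, wcoo.col and wcoo.data; the ("row", i) / ("col", j) naming is an assumption, not a NetworkX convention.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Stand-ins for wcoo.row, wcoo.col, wcoo.data (made-up values).
m_rows = [0, 0, 1, 2]
m_cols = [0, 1, 1, 0]
m_data = [0.5, 0.2, 0.7, 0.1]
n_row, n_col = 3, 2

# Distinct labels keep "row 0" and "column 0" apart.
row_nodes = [("row", i) for i in range(n_row)]
col_nodes = [("col", j) for j in range(n_col)]

g = nx.Graph()
g.add_nodes_from(row_nodes, bipartite=0)
g.add_nodes_from(col_nodes, bipartite=1)
g.add_weighted_edges_from(
    (("row", int(r)), ("col", int(c)), float(w))
    for r, c, w in zip(m_rows, m_cols, m_data)
)

# Two-column layout with the row set on one side.
pos = nx.bipartite_layout(g, row_nodes)
# nx.draw(g, pos=pos, node_size=500, with_labels=True)
```

bipartite.from_biadjacency_matrix is also worth a look: it builds the graph directly from a SciPy sparse matrix and, as far as I know, numbers the column nodes after the row nodes so they do not collide.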
I need to compute the density of a subgraph made of vertices belonging to the same attribute "group".
i.e., let g be an igraph graph;
g.vs.select(group = 1)
gives me an object with all vertices belonging to group 1
Is there any way to compute density on the graph which is formed by these vertices and the connections between them?
In a fashion similar to this maybe?
g2.vs(g2.vs.select(group = i)).density()
Try this:
g.vs.select(group=1).subgraph().density()