I have a CSR matrix from which I extracted the data, rows, and columns.
I want to create a bipartite graph using NetworkX, and I tried several solutions without success (for example: Plot bipartite graph using networkx in Python). The reason it doesn't work, in my opinion, is a matter of labeling: my two sets and the nodes inside them have no string names.
For example, in a 10x10 matrix, the row/column indices are the names of the nodes in the two sets, while each entry is the weighted link between the corresponding pair of nodes.
In my case, then, (0,0)=0.5 does not mean a self-loop; instead, a link with weight 0.5 connects "node 0" of the first set with "node 0" of the second one.
import networkx as nx
from networkx.algorithms import bipartite
import matplotlib.pyplot as plt
def function(foo, n_row, n_col):
n_row=10
n_col=10
After the creation of the matrix, I obtain my data
weights = weights.tocsr()
wcoo = weights.tocoo()
m_data = wcoo.data
m_rows = wcoo.row
m_cols = wcoo.col
g = nx.Graph()
# TRIAL 1
g.add_nodes_from(m_cols, bipartite=0)
g.add_nodes_from(m_rows, bipartite=1)
bi_m = bipartite.matrix.biadjacency_matrix(g, m_data)
# TRIAL 2
g.add_weighted_edges_from(zip(m_cols, m_rows, m_data))
nx.draw(g, node_size=500)
plt.show()
I expected a bipartite graph with two sets of 10 nodes each and a certain number of weighted links between them (with no links inside the same set).
Instead, I obtained an ordinary undirected graph with 10 nodes in total.
I'd also like to optimize my code as much as I can to reduce the computation time without hurting readability.
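One possible way around the labeling problem described above is to give the two node sets distinct labels, so that row 0 and column 0 become different nodes. The following is a minimal sketch under that assumption; the helper name `bipartite_from_triplets` and the `("r", i)` / `("c", j)` tuple labels are hypothetical choices for illustration, and plain lists stand in for the `m_rows`, `m_cols`, and `m_data` arrays.

```python
import networkx as nx
from networkx.algorithms import bipartite


def bipartite_from_triplets(rows, cols, data, n_row, n_col):
    """Build a weighted bipartite graph from COO-style (row, col, value) triplets."""
    g = nx.Graph()
    # Tuple labels keep "row 0" and "column 0" apart (labels are an assumption).
    row_nodes = [("r", i) for i in range(n_row)]
    col_nodes = [("c", j) for j in range(n_col)]
    g.add_nodes_from(row_nodes, bipartite=0)
    g.add_nodes_from(col_nodes, bipartite=1)
    g.add_weighted_edges_from(
        (("r", int(i)), ("c", int(j)), float(w)) for i, j, w in zip(rows, cols, data)
    )
    return g, row_nodes, col_nodes


# Small stand-in for the extracted m_rows, m_cols, m_data.
g, row_nodes, col_nodes = bipartite_from_triplets(
    rows=[0, 0, 2], cols=[0, 1, 2], data=[0.5, 1.0, 2.0], n_row=3, n_col=3
)
print(g.number_of_nodes(), bipartite.is_bipartite(g))
```

For drawing, passing `pos=nx.bipartite_layout(g, row_nodes)` to `nx.draw` should place the two sets in separate columns.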
Related
I have a graph consisting of about 200 nodes out of which I am removing nodes on each iteration.
It is possible to visualize the graph with the nodes removed, but the location of the nodes does not change while doing so. Ideally, I'd like to take away nodes and see if the remaining nodes move closer together and if clusters form as more and more nodes are removed.
I'm using networkX for this. I have tried to recompute the graph on every iteration but there seems to be some randomness in how the graph is created. I am therefore getting a very different graph on each iteration.
Is there a way to achieve what I want?
You can use draw_networkx for this:
import networkx as nx
import matplotlib.pyplot as plt
nodes = [i for i in range(10)]
edges = [(i, i+1) for i in range(5)]
G = nx.Graph()
G.add_nodes_from(nodes)
G.add_edges_from(edges)
positions = {}
for node in nodes:
    positions[node] = (node, node)
nx.draw_networkx(G, pos=positions)
I generate a graph of 10 nodes with some edges, and then define a dict where the keys are the nodes (0 to 9 here) and the values are the coordinates in (x,y) format. In this example I arranged the nodes along a line.
Then, at the next iteration, just remove the nodes you do not need and pass the same dict. It will skip over the missing nodes and just plot the ones still present in the graph:
G.remove_nodes_from([5,6])
nx.draw_networkx(G, pos=positions)
You should see the nodes 5 and 6 missing.
draw_networkx relies on matplotlib, so you can do many of the things that library allows; see the NetworkX drawing documentation for more.
Hope it helps!
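If the remaining nodes should be allowed to drift after each removal, rather than stay exactly in place, one option is to recompute a force-directed layout that starts from the previous positions. This is a sketch, not a definitive recipe: it assumes `nx.spring_layout` with its `pos` and `seed` parameters, and uses `nx.path_graph(10)` as a stand-in graph.

```python
import networkx as nx

G = nx.path_graph(10)

# A fixed seed makes the initial layout reproducible between runs.
pos = nx.spring_layout(G, seed=42)

G.remove_nodes_from([5, 6])

# Warm start: reuse the old coordinates of the nodes that are still present,
# so the new layout evolves from the previous one instead of starting at random.
pos = nx.spring_layout(G, pos={n: pos[n] for n in G}, seed=42)
```

Passing the updated `pos` to `nx.draw_networkx(G, pos=pos)` on each iteration should then show the surviving nodes shifting gradually.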
I am trying to have nodes connect to a main node with different distances.
What I have so far:
import networkx as nx
G = nx.empty_graph(3, create_using=None)
G.add_edge(0,1)
G.add_edge(0,2)
[Image: graph with equal distance to a main node]
However, as can be seen from the image, the nodes on either side are at equal distance from the main node. Is there a way to make their distances to the main node different?
There are two parts to your question:
Part 1 - Distance between nodes:
In network theory, the distance between nodes is represented by the weight of the edge between them. So you can add all your edges with weights to your network as follows:
G = nx.Graph()
G.add_weighted_edges_from([(0,1,4.0),(0,2,5.0)])
You can randomize the weights on the edges above for random distance between nodes.
Part 2 - Network Visualization:
I understand that you're more concerned with how the network graph is shown. If you use nx.draw_random(G), you get randomized distances between your nodes. I suggest saving a picture when you get the desired figure, as the layout is randomized every time you run it.
Hope it helps... :)
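If the drawn distance should match the edge weight exactly, another option is to compute the positions by hand and pass them to the drawing function. The sketch below assumes a star-shaped graph around one main node; `star_positions` is a hypothetical helper, not part of NetworkX.

```python
import math
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 4.0), (0, 2, 5.0)])


def star_positions(G, center):
    """Place each neighbor of `center` at a distance equal to its edge weight."""
    pos = {center: (0.0, 0.0)}
    neighbors = list(G[center])
    for k, n in enumerate(neighbors):
        # Spread the neighbors evenly around the center.
        angle = 2 * math.pi * k / len(neighbors)
        r = G[center][n]["weight"]
        pos[n] = (r * math.cos(angle), r * math.sin(angle))
    return pos


pos = star_positions(G, 0)
```

The result can be drawn with `nx.draw_networkx(G, pos=pos)`; an equal aspect ratio on the matplotlib axes keeps the distances visually faithful.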
I have 1000 different names, each constituting a node. Each name can be connected with 0..1000 other names an unlimited amount of times. I would like to graph it in such a way that the distance between two nodes is inversely proportional to the number of times they are connected.
Example:
'node1' : ['node2','node2','node2','node2','node2','node3']
'node2' : ['node1','node1','node1','node1','node1']
'node3' : ['node1']
node1 and node2 should huddle together and node3 should be further away.
Is that possible? Currently I'm graphing using the following code:
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_nodes_from(grapharr.keys())
for k in grapharr:
    for j in grapharr[k]:
        G.add_edge(k,j)
nx.draw_networkx(G, **options)
grapharr is a dict structure where the keys are nodes and the values are arrays containing the connections for the particular node.
It is impossible in the general case. Look at this graph:
Imagine that the central node has a thousand connections to each of the other nodes, but the 'square' nodes have only one connection between them. How will you draw them?
Anyway, you can set the connectivity level as edge weight and use force-directed layouts that will try to create the best layout (but not 100% optimal, of course). In networkx, there are:
spring_layout
draw_spring
graphviz_layout with prog='neato' parameter
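As a rough illustration of the edge-weight idea, the repeated connections can be counted and stored as a `weight` attribute, which `spring_layout` then treats as attraction strength. This is a sketch under assumptions: mentions in both directions are summed into one undirected weight, self-references are skipped, and the example dict mirrors the one in the question.

```python
from collections import Counter
import networkx as nx

grapharr = {
    "node1": ["node2"] * 5 + ["node3"],
    "node2": ["node1"] * 5,
    "node3": ["node1"],
}

# Count how often each unordered pair of names is connected.
counts = Counter()
for source, targets in grapharr.items():
    for target in targets:
        if source != target:
            counts[tuple(sorted((source, target)))] += 1

G = nx.Graph()
G.add_nodes_from(grapharr)
G.add_weighted_edges_from((u, v, c) for (u, v), c in counts.items())

# Heavier edges pull their endpoints together more strongly; the result is
# a best-effort layout, not an exact inverse-proportional distance.
pos = nx.spring_layout(G, weight="weight", seed=0)
```

The layout can then be passed on with `nx.draw_networkx(G, pos=pos)`.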
I have a network. How can I generate a random network using networkx while ensuring that each node retains the same degree as in the original network? My first thought is to get the adjacency matrix and randomly shuffle each row of the matrix, but this is somewhat complex, e.g. I need to avoid self-connections (which do not occur in the original network) and re-label the nodes. Thanks!
I believe what you're looking for is expected_degree_graph. It generates a random graph based on a sequence of expected degrees, where each degree in the list corresponds to a node. It also even includes an option to disallow self-loops!
You can get a list of degrees using networkx.degree. Here's an example of how you would use them together in networkx 2.0+ (degree is slightly different in 1.0):
import networkx as nx
from networkx.generators.degree_seq import expected_degree_graph
N,P = 3, 0.5
G = nx.generators.random_graphs.gnp_random_graph(N, P)
G2 = expected_degree_graph([deg for (_, deg) in G.degree()], selfloops=False)
Note that you're not guaranteed to have the exact degrees for each node using expected_degree_graph; as the name implies, it's probabilistic given the expected value for each of the degrees. If you want something a little more concrete you can use configuration_model; however, it does not protect against parallel edges or self-loops, so you'd need to prune those out and replace the edges yourself.
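A different technique from the two named above, offered here only as a sketch, is degree-preserving rewiring with `nx.double_edge_swap`. Each swap removes edges u-v and x-y and adds u-x and v-y, so every node keeps its exact degree and no self-loops or parallel edges are introduced. The graph size, `nswap`, `max_tries`, and seeds below are arbitrary illustrative values.

```python
import networkx as nx

# Stand-in for the original network.
G = nx.gnp_random_graph(20, 0.3, seed=1)

# Rewire a copy in place; the original graph is left untouched.
R = G.copy()
nx.double_edge_swap(R, nswap=20, max_tries=2000, seed=1)

same_degrees = dict(G.degree()) == dict(R.degree())
print(same_degrees)
```

More swaps give a more thoroughly randomized graph; how many are enough depends on the size of the network.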
The goal is as follows: I have the set of nodes, size N, for which I would like to
generate (as an iterable) a set of all possible directed graphs (presumably using networkx.DiGraph() and numpy.arange(0,N))(edit: to clarify — this is a very large number of graphs for even small N, so some kind of iterable structure is needed which allows the set to be constructed on the fly)
filter this set of graphs based on their meeting some criteria (e.g., the existence of a complete path from node 0 to node N-1)
store parameter values associated with each edge
compute functions across the filtered sets of graphs, taking into account the edge-associated parameter values in those functions without needing to have all the graphs in memory at the same time (as that would be really taxing without parallelization, and I don't know how to handle parallelization yet)
It seems like there has been plenty of discussion of dealing efficiently with graphs with many nodes and edges (i.e., large graphs) but very little discussion of how to effectively handle many graphs at once (large sets of graphs) where each graph can be expected to not overtax the system.
Edit: Ideally this would also work well with pandas and numpy arrays, but I'm pretty sure that any solution to the above can also be made to work with these since fundamentally networkX is working over dictionaries of dictionaries.
You write that you'd like to generate all possible directed graphs with N nodes. Unfortunately, this isn't feasible even for small N.
Given a set of N nodes, the number of possible undirected edges is N(N-1)/2. We can define a graph by selecting a subset of these edges. There are 2^(N*(N-1)/2) possible subsets, meaning that there are exactly that many possible undirected graphs on N nodes.
Suppose N=10. There are roughly 3.5 * 10^13 possible graphs on these nodes. If you could handle one million graphs per second, you'd need roughly 3.5 * 10^7 seconds to handle all of the graphs. This is on the order of a year.
And that's just undirected graphs. There are more directed graphs. For N nodes, there are 2^(N*(N-1)) directed graphs. Here's a table that demonstrates how fast this blows up. |V| is the number of nodes:
|V| Number of Digraphs
=== ==================
1 1
2 4
3 64
4 4096
5 1048576
6 1073741824
7 4398046511104
8 72057594037927936
9 4722366482869645213696
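The counts above follow directly from the two formulas, so they can be checked with a few lines of Python. This is only an arithmetic sketch; the function names are made up for illustration.

```python
def count_graphs(n):
    """Number of undirected graphs on n labeled nodes: 2^(n(n-1)/2)."""
    return 2 ** (n * (n - 1) // 2)


def count_digraphs(n):
    """Number of directed graphs on n labeled nodes: 2^(n(n-1))."""
    return 2 ** (n * (n - 1))


for n in range(1, 10):
    print(n, count_digraphs(n))

# Time to process every undirected graph on 10 nodes at one million per second.
seconds = count_graphs(10) / 1e6
years = seconds / (365 * 24 * 3600)
print(f"{seconds:.2e} seconds, about {years:.1f} years")
```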
If you'd really like to do this, here are python generators which will lazily enumerate graphs on a node set:
from itertools import chain, combinations
import networkx as nx

def power_set(iterable):
    """Return an iterator over the power set of the finite iterable."""
    s = list(iterable)
    return chain.from_iterable(combinations(s, n) for n in range(len(s) + 1))

def enumerate_graphs(nodes):
    # create a list of possible edges
    possible_edges = list(combinations(nodes, 2))
    # create a graph for each possible set of edges
    for edge_set in power_set(possible_edges):
        g = nx.Graph()
        g.add_nodes_from(nodes)
        g.add_edges_from(edge_set)
        yield g

def enumerate_digraphs(nodes):
    # create a list of possible edges
    possible_edges = list(combinations(nodes, 2))
    # generate each edge set
    for edge_set in power_set(possible_edges):
        # for each set of `M` edges there are `2^M` ways to orient them;
        # a pair of nodes is never connected in both directions here
        for swaps in power_set(range(len(edge_set))):
            directed_edge_set = list(edge_set)
            # reverse the edges selected by this subset of indices
            for swap in swaps:
                u, v = directed_edge_set[swap]
                directed_edge_set[swap] = v, u
            g = nx.DiGraph()
            g.add_nodes_from(nodes)
            g.add_edges_from(directed_edge_set)
            yield g
We can then plot all of the directed graphs thusly:
import matplotlib.pyplot as plt

nodes = ("apples", "pears", "oranges")
digraphs = list(enumerate_digraphs(nodes))
# compute one layout and reuse it so every subplot is comparable
layout = nx.random_layout(digraphs[0])
plt.figure(figsize=(20,30))
for i,g in enumerate(digraphs):
    plt.subplot(6,5,i+1)
    nx.draw_networkx(g, pos=layout)
    plt.gca().get_xaxis().set_visible(False)
    plt.gca().get_yaxis().set_visible(False)
plt.show()
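Note that enumerate_digraphs orients each unordered pair of nodes in at most one direction, so it yields 27 graphs for three nodes rather than the 64 given by 2^(N*(N-1)). A sketch that covers every digraph, including those with edges in both directions, can take the power set of the ordered pairs instead. The lazy filter for "a path from node 0 to node N-1 exists" uses `nx.has_path`; the names `enumerate_all_digraphs` and `with_path` are hypothetical.

```python
from itertools import chain, combinations, permutations
import networkx as nx


def power_set(iterable):
    """Return an iterator over the power set of the finite iterable."""
    s = list(iterable)
    return chain.from_iterable(combinations(s, n) for n in range(len(s) + 1))


def enumerate_all_digraphs(nodes):
    """Lazily yield every directed graph (no self-loops) on the given nodes."""
    nodes = list(nodes)
    # Ordered pairs, so (u, v) and (v, u) are chosen independently.
    possible_edges = list(permutations(nodes, 2))
    for edge_set in power_set(possible_edges):
        g = nx.DiGraph()
        g.add_nodes_from(nodes)
        g.add_edges_from(edge_set)
        yield g


def with_path(graphs, source, target):
    """Lazily keep only the graphs that contain a path from source to target."""
    return (g for g in graphs if nx.has_path(g, source, target))


n = 3
total = sum(1 for _ in enumerate_all_digraphs(range(n)))
reachable = sum(1 for _ in with_path(enumerate_all_digraphs(range(n)), 0, n - 1))
print(total, reachable)
```

Because both functions are generators, only one graph is held in memory at a time; the counts still grow as 2^(N*(N-1)), so this remains practical only for very small N.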