I want to plot a network in Python using a co-occurrence matrix as input, such that nodes with a non-zero co-occurrence count are connected and the weight of each edge is proportional to the number of co-occurrences between its two nodes.
Is there a Python library that facilitates this using a co-occurrence matrix as input?
You might find NetworkX to be a useful tool for that. You can easily feed it the input nodes and edges in several ways.
If you want to build your network from a co-occurrence matrix, you can use NetworkX's from_numpy_array, which creates a graph from a NumPy array that is interpreted as an adjacency matrix. (Older answers refer to from_numpy_matrix, which did the same for np.matrix input but was removed in NetworkX 3.0.)
Here's a simple toy example from the documentation:
import numpy as np
import networkx as nx
A = np.array([[1, 1], [2, 1]])
G = nx.from_numpy_array(A)
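To tie this back to the question, here is a minimal sketch of turning a co-occurrence matrix into a weighted graph and using the counts as edge widths. The matrix values, the labels, and the helper name draw_cooccurrence are made up for illustration, and the sketch assumes a NetworkX version that provides from_numpy_array.

```python
import numpy as np
import networkx as nx

# Hypothetical symmetric co-occurrence counts for three items (illustrative data).
labels = ["apple", "bread", "cheese"]
cooc = np.array([[5, 3, 0],
                 [3, 4, 1],
                 [0, 1, 2]])

# The diagonal holds self co-occurrence counts; zero it to avoid self-loops.
adj = cooc.copy()
np.fill_diagonal(adj, 0)

# Non-zero entries become edges, with the count stored in the "weight" attribute.
G = nx.from_numpy_array(adj)
G = nx.relabel_nodes(G, dict(enumerate(labels)))


def draw_cooccurrence(graph):
    """Draw the graph with edge widths taken from the co-occurrence counts."""
    import matplotlib.pyplot as plt  # imported here so building the graph does not need it

    pos = nx.spring_layout(graph, seed=0)
    widths = [data["weight"] for _, _, data in graph.edges(data=True)]
    nx.draw(graph, pos, with_labels=True, width=widths)
    plt.show()
```

Calling draw_cooccurrence(G) should show the apple-bread edge drawn thicker than the bread-cheese edge. For large counts you would probably want to rescale the widths first.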
It is indeed possible to do something like that with networkx.
Check this: https://stackoverflow.com/a/25651827/4288795
With it you can generate graphs like this:
You can export the information to a graphml file format and use yEd Graph Editor to navigate and format the contents of your networkx graph.
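As a rough sketch of the export step, NetworkX's write_graphml writes a graph, including edge attributes such as weights, to a GraphML file that yEd can open. The file name and the toy edges below are placeholders, not anything from the original post.

```python
import os
import tempfile
import networkx as nx

# Toy weighted graph (illustrative data).
G = nx.Graph()
G.add_edge("a", "b", weight=3)
G.add_edge("b", "c", weight=1)

# Placeholder output location; any writable path works.
path = os.path.join(tempfile.gettempdir(), "cooccurrence.graphml")
nx.write_graphml(G, path)

# Reading the file back is a quick way to check what was written.
H = nx.read_graphml(path)
```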
I want to create an adjacency matrix in Python and then study networks from there. I want to use SymPy because I have already written code to study eigenvalues and eigenvectors with it. However, I ran into a problem plotting the matrix with matplotlib. So far, this is what I did:
# Create a simple adjacency matrix
import sympy as sym
B = sym.Matrix([[0,1,0],[0,0,1],[1,0,0]])
import matplotlib.pyplot as plt
# Define vertices
a = B.row(0)
b = B.row(1)
c = B.row(2)
for i in B:
    for j in B:
        if B.row(i)[j] == B.row(j)[i] and B.row(i)[j] == 1:
            plt.plot(B.row(i)[j], B.row(j)[i])
plt.show()
This returns an empty plot. I can more or less see that the logic in the for loop is faulty, but I can't quite put my finger on it, much less fix it. I tried another approach where I define the vertices as random points in xyz space whose only meaning is their connection to one another, but I couldn't figure out how to connect such points using a for loop.
Basically my question is how do I plot an adjacency matrix built through sympy if it is at all possible.
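One possible approach, sketched here under assumptions rather than taken from the thread, is to convert the SymPy matrix to a NumPy array and let NetworkX build and draw the graph. As far as I can tell, the loop in the question iterates over the matrix entries (0 and 1) rather than over row and column indices, which would explain why nothing is drawn. The helper name draw_graph is hypothetical.

```python
import numpy as np
import sympy as sym
import networkx as nx

B = sym.Matrix([[0, 1, 0], [0, 0, 1], [1, 0, 0]])

# Convert the SymPy entries to plain Python ints before handing them to NumPy.
A = np.array([[int(x) for x in row] for row in B.tolist()])

# B is not symmetric, so treat it as the adjacency matrix of a directed graph.
G = nx.from_numpy_array(A, create_using=nx.DiGraph)


def draw_graph(graph):
    """Draw the graph with node labels; layout positions are chosen automatically."""
    import matplotlib.pyplot as plt  # imported here so building the graph does not need it

    nx.draw(graph, with_labels=True)
    plt.show()
```

Calling draw_graph(G) should display the three-node directed cycle 0 -> 1 -> 2 -> 0.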
I am converting my networkx graph using the following code.
nx.drawing.nx_pydot.write_dot(G,path)
It creates a correct dot file, which I can visualize later using the Graphviz interface. However, instead of writing every parallel line (arc, edge, whatever you call it) between two nodes, it writes a single one (or two if there is also an edge in the opposite direction). I want all of the lines to be preserved in the dot format. How can I do that?
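The graph was not being created correctly in the first place. I changed the line G=nx.DiGraph(directed=True) to G=nx.MultiDiGraph(directed=True). A DiGraph keeps at most one edge per ordered pair of nodes, so the parallel edges were lost before the export; a MultiDiGraph keeps them. Now I do not have that problem.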
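A small sketch of the difference, using made-up edges: adding the same edge twice to a DiGraph leaves a single edge, while a MultiDiGraph keeps both copies.

```python
import networkx as nx

# Illustrative edge list with a repeated edge a -> b and one reverse edge b -> a.
edges = [("a", "b"), ("a", "b"), ("b", "a")]

simple = nx.DiGraph()
simple.add_edges_from(edges)   # the duplicate a -> b collapses into one edge

multi = nx.MultiDiGraph()
multi.add_edges_from(edges)    # both a -> b edges are kept

print(simple.number_of_edges(), multi.number_of_edges())

# Exporting `multi` with nx.drawing.nx_pydot.write_dot(multi, path) needs pydot
# installed, so that call is left out of this sketch.
```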
I want to generate a random graph using the edge distribution from the original graph.
Is there a way in networkx module to do that?
I'm assuming you mean same degree distribution.
The command nx.configuration_model(degree_list) will do it.
So in your case, given an existing graph G:
H = nx.configuration_model([G.degree(node) for node in G])
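A slightly fuller sketch, assuming the built-in karate club graph as a stand-in for the original graph. Note that configuration_model returns a MultiGraph that may contain self-loops and parallel edges; removing them gives a simple graph whose degrees can differ slightly from the original.

```python
import networkx as nx

# Stand-in for the original graph (assumption: any undirected graph works here).
G = nx.karate_club_graph()
degree_sequence = [d for _, d in G.degree()]

# Random multigraph with the same degree sequence; seed fixed for reproducibility.
H = nx.configuration_model(degree_sequence, seed=42)

# Optional: collapse parallel edges and drop self-loops to get a simple graph.
H_simple = nx.Graph(H)
H_simple.remove_edges_from(list(nx.selfloop_edges(H_simple)))
```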
I have built a NetworkX graph containing 50000 nodes and about 100 million edges. I have a list of all connected components of this graph from the nx.connected_components(G) method. This gives me clusters of nodes such that each node has a path to every other node in its cluster. Now, within each of these connected components, I want to find subgraphs/sub-clusters such that these subgraphs are connected to each other by exactly one edge. Is there a method in NetworkX that I can use directly, or any other way to get this done? Sorry, I am very new to graph theory, so I need a little direction.
If I understand you correctly, then for each subgraph you want to find all graph cuts of size 1, i.e. all edges whose removal partitions the graph into two subgraphs. These edges are called bridges, and there are efficient algorithms to find them. The implementation in networkx is accessible via networkx.algorithms.bridges.bridges (also exposed as nx.bridges).
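As a minimal sketch on a toy graph (two triangles joined by a single edge, chosen for illustration), the bridges can be listed and then removed so that the remaining connected components are the sub-clusters:

```python
import networkx as nx

# Two triangles, {1, 2, 3} and {4, 5, 6}, joined by the single edge 3-4.
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)])

# Edges whose removal disconnects the graph.
bridge_edges = list(nx.bridges(G))

# Removing the bridges leaves the sub-clusters as connected components.
H = G.copy()
H.remove_edges_from(bridge_edges)
sub_clusters = [set(c) for c in nx.connected_components(H)]
```

On a graph with 100 million edges the copy alone is expensive, so in practice you might remove the bridges in place instead.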
What you want is called a minimum spanning tree. Using networkx you can do it like this:
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_edges_from([(1,2), (1,3), (2,3), (4,5), (4,6), (5,6)])
nx.draw(nx.minimum_spanning_tree(G), with_labels=True)
plt.show()
However, I doubt that networkx can handle that many edges, judging by this benchmark. I have tested the connected components algorithm in igraph and it worked for me as well (and, of course, much faster), so you might also want to look for igraph-based solutions.
I would like to create a networkx graph that looks more or less like this, but I haven't been able to find a way for it to display the way I need. The large nodes and edges display fine, but I haven't been able to find how to add the small nodes.
networkx.draw() has an optional argument node_size:
node_size (scalar or array, optional (default=300)) – Size of nodes. If an array is specified it must be the same length as nodelist.
If you want to draw nodes with various sizes, pass an array of sizes, one per node. You can build it with a list comprehension.
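For instance, here is a sketch that sizes nodes by degree, so hubs are drawn large and leaves small. The star graph, the size values 600 and 100, and the helper name draw_sized are arbitrary choices for illustration.

```python
import networkx as nx

# Star graph: node 0 is the hub, nodes 1..4 are leaves (illustrative data).
G = nx.star_graph(4)

# Fix the node order so each size lines up with the right node.
nodelist = list(G.nodes())
sizes = [600 if G.degree(n) > 1 else 100 for n in nodelist]


def draw_sized(graph, nodelist, sizes):
    """Draw the graph with one size per node, in nodelist order."""
    import matplotlib.pyplot as plt  # imported here so computing sizes does not need it

    nx.draw(graph, nodelist=nodelist, node_size=sizes, with_labels=True)
    plt.show()
```

Calling draw_sized(G, nodelist, sizes) should show one large hub surrounded by four small nodes.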
P.S. I don't recommend using the basic networkx drawing functionality. There are many visualization libraries more powerful than networkx, and even the networkx docs say as much. You can use Gephi, Graphviz (with various libraries), or Cytoscape for really HUGE graphs.