Graphviz: avoid overlapping edges and nodes - python

I know this is a recurring question, but I can't find an answer that works in my case.
I have a graph with 365 nodes that I want to plot cleanly.
For now, the result is almost satisfying, but not perfect: some edges cross each other, and some nodes overlap. I would like to avoid that, and I'm fairly sure it is possible, since there is a lot of free space available for a better arrangement. I use the graphviz_layout function with NetworkX. Do you have any ideas?
Here is a working example:
import networkx as nx
import matplotlib.pyplot as plt
import re
import pygraphviz
from networkx.drawing.nx_agraph import graphviz_layout
G = nx.Graph()
G.add_node(1)
inp = "BDDDDDDDDDCCDDCh14CCCh13Ch10Ce24CCh9DCCCCh28CCh8DDDDDDDCCCCCh43Ch42CCCCh41DDDh59Ch58CCCh57CCh40CCDCCh72CCCCCCh39CDDCCh85Ch84h38Ch37DDDCh96Ch95h94CCCh7CDDDDDCDCDh115CDCCh118CCh113Dh125Ch111CDe131CCCh130h110Ch109DDCCCCh140CCh139CCCCh108CCCh107DDh159CCCh158CCCh6DDDDDDCh174CCCh173DCDCDDCCCh186CCh185CCh183CCh181Ch172Ch171DDDCCCh206CCh205CCh204Ch170CCCCh169CCCh5DDDCCCh230DDe237CCCh236CCCh235Ch229CCCCh228CCCCCCCh4CCCh3DDDCDDCCh270Ch269DDh277h276CCh267CCCh266DDCCh288CCCh287CCCCCh265DCCCCh302CCh2CCCCCh1DCCDDDh322Ch321DDCe329CCCCCCCCh327CCh326CCCCCh320CCDCCCh350CCCh317DCe361Ch359Ch"
chains = re.findall("([a-zA-Z]+)", inp)
bp = [int(value) for value in re.findall("([0-9]+)", inp)]
while G.number_of_nodes() < len(chains[0]):
    G.add_node(G.number_of_nodes() + 1)
    G.add_edge(G.number_of_nodes(), G.number_of_nodes() - 1)
for number, sequence in zip(bp, chains[1:]):
    G.add_node(G.number_of_nodes() + 1)
    G.add_edge(number, G.number_of_nodes())
    for res in sequence[:-1]:
        G.add_node(G.number_of_nodes() + 1)
        G.add_edge(G.number_of_nodes(), G.number_of_nodes() - 1)
pos = nx.nx_agraph.graphviz_layout(G)
fig = plt.figure(1)
nx.draw(G, pos)
fig.set_size_inches(50, 50)
plt.savefig('tree.png', bbox_inches='tight')
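A possible direction, offered as a sketch rather than a confirmed fix: the construction above appears to produce a tree (a main chain with side chains attached), and a tree can always be drawn without edge crossings. graphviz_layout uses the neato program by default; its prog argument lets you try a hierarchical (dot) or radial (twopi) Graphviz layout instead. The helper names build_graph and tree_layout are made up for this illustration, the short input string is a toy stand-in for the real one, and the spring_layout fallback is only there so the snippet still runs where pygraphviz or Graphviz is not installed.

```python
import re
import networkx as nx


def build_graph(inp):
    """Rebuild the graph the same way as the snippet above (hypothetical helper)."""
    chains = re.findall("([a-zA-Z]+)", inp)
    bp = [int(value) for value in re.findall("([0-9]+)", inp)]
    G = nx.Graph()
    G.add_node(1)
    # Main chain: one node per letter of the first chain.
    while G.number_of_nodes() < len(chains[0]):
        G.add_node(G.number_of_nodes() + 1)
        G.add_edge(G.number_of_nodes(), G.number_of_nodes() - 1)
    # Side chains: each one starts at the node given by its branch point.
    for number, sequence in zip(bp, chains[1:]):
        G.add_node(G.number_of_nodes() + 1)
        G.add_edge(number, G.number_of_nodes())
        for _ in sequence[:-1]:
            G.add_node(G.number_of_nodes() + 1)
            G.add_edge(G.number_of_nodes(), G.number_of_nodes() - 1)
    return G


def tree_layout(G, prog="twopi"):
    """Try a Graphviz program; fall back to spring_layout if it is unavailable."""
    try:
        from networkx.drawing.nx_agraph import graphviz_layout
        return graphviz_layout(G, prog=prog)
    except (ImportError, OSError, ValueError):
        # Assumption: pygraphviz or the Graphviz binary is missing here.
        return nx.spring_layout(G, seed=0)


G = build_graph("ABCd2EF")  # toy input: chain of 4 nodes, side chain at node 2
is_tree = nx.is_tree(G)     # worth checking on the real input as well
pos = tree_layout(G)
```

If nx.is_tree(G) holds for the full input, trying prog="dot" or prog="twopi" and comparing the results with the neato drawing would be a reasonable next step.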

Related

Bipartite graph in NetworkX for LARGE amount of nodes

I am trying to draw a bipartite graph for a set of nodes. For a small number of nodes it looks perfectly fine:
Image for around 30 nodes
Unfortunately, this isn't the case with more nodes, as in this one:
Image for more nodes
My code for determining the position of each node looks something like this:
pos = {}
pos[SOURCE_STRING] = (0, width/2)
row = 0
for arr in left_side.keys():
    pos[str(arr).replace(" ", "")] = (NODE_SIZE, row)
    row += NODE_SIZE
row = 0
for arr in right_side.keys():
    pos[str(arr).replace(" ", "")] = (2*NODE_SIZE, row)
    row += NODE_SIZE
pos[SINK_STRING] = (3*NODE_SIZE, width/2)
return pos
And then I feed it to the DiGraph class:
G = nx.DiGraph()
G.add_nodes_from(nodes)
G.add_edges_from(edges, len=1)
nx.draw(G, pos=pos, node_shape="s", with_labels=True, node_size=NODE_SIZE)
This doesn't make much sense to me: the nodes should all be the same distance from each other, since NODE_SIZE is constant and doesn't change for the rest of the program.
Following this thread, Bipartite graph in NetworkX, didn't help me either.
Can something be done about this?
Edit (following Paul Brodersen's advice to use netgraph):
I used this documentation: netgraph doc
I still got much the same results, such as:
netgraph try
I tried different edges and positions, and also played with the node size, with no success.
Code:
netgraph.Graph(edges, node_layout='bipartite', node_labels=True)
plt.show()
In your netgraph call, you are not changing the node size.
My suggestion with 30 nodes:
import numpy as np
import matplotlib.pyplot as plt
from netgraph import Graph
edges = np.vstack([np.random.randint(0, 15, 60),
                   np.random.randint(16, 30, 60)]).T
Graph(edges, node_layout='bipartite', node_size=0.5, node_labels=True, node_label_offset=0.1, edge_width=0.1)
plt.show()
With 100 nodes:
import numpy as np
import matplotlib.pyplot as plt
from netgraph import Graph
edges = np.vstack([np.random.randint(0, 50, 200),
                   np.random.randint(51, 100, 200)]).T
Graph(edges, node_layout='bipartite', node_size=0.5, node_labels=True, node_label_offset=0.1, edge_width=0.1)
plt.show()
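For the NetworkX version in the question, one likely cause is a unit mismatch: node_size in nx.draw is passed to matplotlib's scatter, where the marker size is an area in points squared, while the positions are in data coordinates. Spacing rows by NODE_SIZE in data units therefore does not guarantee any separation on screen. A rough sketch of the arithmetic, assuming a square marker whose side is about sqrt(node_size) points and 72 points per inch (the helper name and the padding factor are made up for illustration):

```python
import math


def min_figure_height_inches(n_rows, node_size, padding=1.2):
    """Rough figure height needed so that n_rows stacked markers do not touch.

    node_size is the marker area in points**2, as used by nx.draw / scatter.
    padding > 1 leaves some free space between neighbouring markers.
    """
    side_points = math.sqrt(node_size)   # approximate marker side in points
    return n_rows * side_points * padding / 72.0  # 72 points per inch


# 100 rows of markers with node_size=400 (about 20 points per side)
height = min_figure_height_inches(100, 400, padding=1.0)
```

The axes occupy only part of the figure, so the real figure should be somewhat taller than this estimate, for example plt.figure(figsize=(10, height * 1.2)). The alternative is to reduce node_size until the rows fit.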

Networkx viz - How do I sort nodes before plotting them in circular layout?

I am trying to create a times-table (cardioid) graph using networkx in Python. This is my code:
import matplotlib.pyplot as plt
import networkx as nx
G = nx.Graph()
n = 10
for i in range(1, n):
    if i*2 < n:
        G.add_node(i, weight=i)
        G.add_node(i*2, weight=i*2)
        G.add_edge(i, i*2)
    else:
        G.add_node(i, weight=i)
        G.add_node(i*2-n, weight=i*2-n)
        G.add_edge(i, i*2 - n)
plt.figure(figsize=(10,6))
nx.draw_networkx(G, pos=nx.circular_layout(G), node_size=1000)
But then the nodes do not come out in order around the circle.
I want the nodes to be placed in sorted order (0, 1, 2, ...) around the circle. How do I achieve that?
You can achieve it by using
nx.draw_networkx(G, pos=nx.circular_layout(sorted(G.nodes())), node_size=1000)
to get the nodes sorted in an anticlockwise manner.
Otherwise, for a clockwise ordering you can use
nx.draw_networkx(G, pos=nx.circular_layout(sorted(G.nodes(), reverse=True)), node_size=1000)
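A small self-contained check of this idea, as a sketch: circular_layout accepts a list of nodes and places them in the given order, so passing the sorted node list should yield equal angular steps between consecutive labels. The helper name sorted_circular_layout is made up for this example, and the graph is built with a modulo instead of the if/else from the question, which gives the same edges.

```python
import math
import networkx as nx


def sorted_circular_layout(G, clockwise=False):
    """Circular layout with nodes placed in sorted label order (hypothetical helper)."""
    return nx.circular_layout(sorted(G.nodes(), reverse=clockwise))


n = 10
G = nx.Graph()
for i in range(1, n):
    G.add_edge(i, (2 * i) % n)  # same edges as the if/else in the question

pos = sorted_circular_layout(G)

# Angle of each node around the centre, taken in sorted label order.
angles = [math.atan2(pos[v][1], pos[v][0]) for v in sorted(G.nodes())]
# Counterclockwise step from each node to the next label.
steps = [(b - a) % (2 * math.pi) for a, b in zip(angles, angles[1:])]
```

Every entry of steps should be close to 2*pi/n, which means that consecutive labels sit next to each other going counterclockwise.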

How can I visualize a dense graph clearly with the NetworkX Python package?

I have a large, dense directed graph in python, made with the NetworkX package. How can I improve the clarity of the graph image?
The following image shows my graph.
I can recommend several ways to improve your graph visualization, depending on the size of the graph.
If you want to visualize a large graph (>1000 nodes), you can read some tricks in another answer of mine. In your case I recommend exporting the graph to a large vector image:
import networkx as nx
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(40, 40))
G = nx.fast_gnp_random_graph(300, 0.02, seed=1337)
nx.draw(G, node_size=30)
plt.axis('equal')
plt.show()
fig.savefig('waka.svg')
If you have a relatively small graph (<1000 nodes), you can play with graph layouts.
The most suitable layout for your kind of graph is the default spring_layout. It has a k argument that sets the optimal distance between nodes. Here is an example:
Default k value
import networkx as nx
import random
random.seed(1234)
G = nx.fast_gnp_random_graph(30, 0.4, seed=1337)
for i in range(20):
    G.add_edge(i + 40, random.randint(1, 30))
    G.add_edge(i + 40, random.randint(1, 30))
pos = nx.spring_layout(G, seed=4321)
nx.draw(G, pos=pos, node_size=30, node_color='red')
Enlarged k value
import networkx as nx
import random
random.seed(1234)
G = nx.fast_gnp_random_graph(30, 0.4, seed=1337)
for i in range(20):
    G.add_edge(i + 40, random.randint(1, 30))
    G.add_edge(i + 40, random.randint(1, 30))
pos = nx.spring_layout(G, seed=4321, k=2)
nx.draw(G, pos=pos, node_size=30, node_color='red')
It is less readable if you need to analyse edges with high precision, but it is better if you care more about the nodes.
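Since spring_layout documents its default k as 1/sqrt(n), where n is the number of nodes, it can be easier to reason about k as a multiple of that default rather than as an absolute number. A minimal sketch, where the helper name and the factor value are arbitrary choices for illustration:

```python
import math
import networkx as nx


def spring_layout_scaled(G, factor=3.0, seed=4321):
    """spring_layout with k set to `factor` times the documented default 1/sqrt(n)."""
    n = G.number_of_nodes()
    k = factor / math.sqrt(n)
    return nx.spring_layout(G, k=k, seed=seed)


G = nx.fast_gnp_random_graph(30, 0.4, seed=1337)
pos = spring_layout_scaled(G, factor=3.0)
```

A factor of 1.0 reproduces the default spacing; trying a few larger values and comparing the drawings is usually quicker than guessing an absolute k.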

merge/average neighbors closer than a given radius

I have a couple hundred coordinates in 3D space. I need to merge the points that are closer than a given radius and replace them with the average of the neighbors.
It sounds like a pretty standard problem but I haven't been able to find a solution so far. The dataset is small enough to be able to compute pairwise distances for all the points.
Don't know, maybe some kind of graph analysis / connected components labelling on the sparse distance matrix?
I don't really need the averaging part, just the clustering (is clustering the correct term here?)
A toy dataset could be coords = np.random.random(size=(100,2))
Here's what I tried so far using scipy.cluster.hierarchy. It seems to work fine, but I'm open to more suggestions (DBSCAN maybe?)
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import fclusterdata
from scipy.spatial.distance import pdist
np.random.seed(0)
fig = plt.figure(figsize=(10,5))
gs = mpl.gridspec.GridSpec(1,2)
gs.update(wspace=0.01, hspace= 0.05)
coords = np.random.randint(30, size=(200,2))
img = np.zeros((30,30))
img[tuple(coords.T)] = 1  # index with a tuple of (row, column) arrays
ax = plt.subplot(gs[0])
ax.imshow(img, cmap="nipy_spectral")
clusters = fclusterdata(coords, 2, criterion="distance", metric="euclidean")
print(len(np.unique(clusters)))
img[tuple(coords.T)] = clusters  # index with a tuple of (row, column) arrays
ax = plt.subplot(gs[1])
ax.imshow(img, cmap="nipy_spectral")
plt.show()
Here is a method that uses KDTree to query neighbors and networkx module to gather connected components.
from scipy import spatial
import networkx as nx
cutoff = 2
components = nx.connected_components(
    nx.from_edgelist(
        (i, j) for i, js in enumerate(
            spatial.KDTree(coords).query_ball_point(coords, cutoff)
        )
        for j in js
    )
)
clusters = {j: i for i, js in enumerate(components) for j in js}
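The question also asks for each group of merged points to be replaced by its average. A minimal sketch of that extra step, built on the same KDTree and connected-components approach (the function name merge_close_points is made up for this example):

```python
import numpy as np
import networkx as nx
from scipy import spatial


def merge_close_points(coords, cutoff):
    """Replace each connected group of points within `cutoff` by its centroid."""
    coords = np.asarray(coords, dtype=float)
    # For every point, the indices of all points within `cutoff` (itself included).
    neighbors = spatial.KDTree(coords).query_ball_point(coords, cutoff)
    graph = nx.from_edgelist(
        (i, j) for i, js in enumerate(neighbors) for j in js
    )
    # One centroid per connected component.
    return np.array([
        coords[list(component)].mean(axis=0)
        for component in nx.connected_components(graph)
    ])


merged = merge_close_points([[0, 0, 0], [1, 0, 0], [10, 10, 10]], cutoff=2)
```

Note that connected components merge transitively: a chain of points, each within the cutoff of the next, ends up in one group even if its two ends are farther apart than the cutoff.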

Save a graph in the graph6 format in Python using networkx

I am trying to save a graph in graph6 format in Python with networkx. The obvious way does not seem to work:
import networkx as nx
g = nx.Graph()
g.add_edge('1','2')
nx.write_graph6(g,"anedge.g6")
g1 = nx.read_graph6("anedge.g6")
import matplotlib.pyplot as plt
nx.draw(g1)
plt.savefig("anedge.pdf")
This makes a pdf file showing a graph with two isolated vertices instead of two connected vertices.
If you use g.add_edge(0, 1) instead of g.add_edge('1','2'), it should work:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edge(0, 1)
nx.write_graph6(g, 'anedge.g6')
g1 = nx.read_graph6('anedge.g6')
nx.draw(g1)
plt.savefig("anedge.pdf")
This actually exposes a bug in the networkx graph6 generator when the nodes are not consecutive integers starting from zero. The bug fix is here: https://github.com/networkx/networkx/pull/2739
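On a version without that fix, a possible workaround is to relabel the nodes to consecutive integers before writing. The sketch below assumes that relabeling is acceptable for your use case; it round-trips through the in-memory functions to_graph6_bytes and from_graph6_bytes so that no file is needed, and the attribute name "original" is an arbitrary choice.

```python
import networkx as nx

g = nx.Graph()
g.add_edge('1', '2')

# Relabel nodes to 0..n-1; the old labels are kept in the "original" attribute.
h = nx.convert_node_labels_to_integers(g, first_label=0, label_attribute="original")

# Encode without the ">>graph6<<" header and strip the trailing newline.
data = nx.to_graph6_bytes(h, header=False).strip()
g1 = nx.from_graph6_bytes(data)
```

The same relabeled graph h can be passed to nx.write_graph6 if a file on disk is needed.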
