Simplify networkx node labels - python

%matplotlib inline
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_node('abc#gmail.com')
nx.draw(G, with_labels=True)
plt.show()
The output figure is
What I want is
I have thousands of email records from person#email.com to another#email.com in a csv file. I use G.add_node(email_address) and G.add_edge(from, to) to build G. I want to keep the whole email address in graph G but display it as a simplified string.

networkx has a method called relabel_nodes that takes a graph (G), a mapping (the relabeling rules) and returns a new graph (new_G) with the nodes relabeled.
That said, in your case:
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_node('abc#gmail.com')
mapping = {
    'abc#gmail.com': 'abc'
}
relabeled_G = nx.relabel_nodes(G, mapping)
nx.draw(relabeled_G, with_labels=True)
plt.show()
That way you keep G intact and have simplified labels in relabeled_G.
You can optionally modify the labels in place, without creating a new copy, in which case you'd simply call G = nx.relabel_nodes(G, mapping, copy=False). Note that this replaces the full addresses in G itself.
If you don't know the email addresses beforehand, you can pass relabel_nodes a function, like so:
G = nx.relabel_nodes(G, lambda email: email.split("#")[0], copy=False)
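If the full address has to stay as the node key, another option is to leave G untouched and pass a display-only mapping through the labels argument of nx.draw. The following is a minimal sketch under that assumption; the second address and the output file name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
G.add_edge("abc#gmail.com", "xyz#example.org")  # second address is hypothetical

# Display-only labels: full address -> the part before the separator.
# G itself keeps the full addresses as node keys.
labels = {node: node.split("#")[0] for node in G.nodes}

fig, ax = plt.subplots()
nx.draw(G, ax=ax, labels=labels, with_labels=True)
fig.savefig("short_labels.png")  # hypothetical output file name
```

Because only the drawing sees the short names, lookups such as G["abc#gmail.com"] keep working afterwards.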

Related

Bipartite graph in NetworkX for LARGE amount of nodes

I am trying to draw a bipartite graph of certain nodes. For small numbers of nodes it looks perfectly fine:
Image for around 30 nodes
Unfortunately, this isn't the case for more nodes like this one:
Image for more nodes
My code for determining the position of each node looks something like this:
pos = {}
pos[SOURCE_STRING] = (0, width/2)
row = 0
for arr in left_side.keys():
    pos[str(arr).replace(" ", "")] = (NODE_SIZE, row)
    row += NODE_SIZE
row = 0
for arr in right_side.keys():
    pos[str(arr).replace(" ", "")] = (2*NODE_SIZE, row)
    row += NODE_SIZE
pos[SINK_STRING] = (3*NODE_SIZE, width/2)
return pos
And then I feed it to the DiGraph class:
G = nx.DiGraph()
G.add_nodes_from(nodes)
G.add_edges_from(edges, len=1)
nx.draw(G, pos=pos ,node_shape = "s", with_labels = True,node_size=NODE_SIZE)
This doesn't make much sense to me: the nodes should all be the same distance from each other, since NODE_SIZE is a constant that doesn't change for the rest of the program.
Following this thread didn't help either:
Bipartite graph in NetworkX
Can something be done about this?
Edit (following Paul Brodersen's advice to use netgraph):
I used this documentation: netgraph doc
I still got much the same results, such as:
netgraph try
I tried different edges and positions, and also played with the node size, with no success.
Code:
netgraph.Graph(edges, node_layout='bipartite', node_labels=True)
plt.show()
In your netgraph call, you are not changing the node size.
My suggestion with 30 nodes:
import numpy as np
import matplotlib.pyplot as plt
from netgraph import Graph
edges = np.vstack([np.random.randint(0, 15, 60),
                   np.random.randint(16, 30, 60)]).T
Graph(edges, node_layout='bipartite', node_size=0.5, node_labels=True, node_label_offset=0.1, edge_width=0.1)
plt.show()
With 100 nodes:
import numpy as np
import matplotlib.pyplot as plt
from netgraph import Graph
edges = np.vstack([np.random.randint(0, 50, 200),
                   np.random.randint(51, 100, 200)]).T
Graph(edges, node_layout='bipartite', node_size=0.5, node_labels=True, node_label_offset=0.1, edge_width=0.1)
plt.show()
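If you would rather stay with plain networkx, one likely cause of the crowding in the question is a unit mismatch: node_size in nx.draw is a marker area in points squared, while pos is in data coordinates, so using the same NODE_SIZE for both does not guarantee even spacing once the axes are autoscaled. The following is a rough sketch of one workaround, spacing the rows by one data unit and growing the figure height with the number of rows. The node names, the 0.3 inch per row factor, and the file name are assumptions, not values from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical node sets standing in for left_side and right_side.
left = [f"L{i}" for i in range(40)]
right = [f"R{i}" for i in range(40)]

G = nx.DiGraph()
G.add_nodes_from(left)
G.add_nodes_from(right)
G.add_edges_from(zip(left, right))

def column_positions(nodes, x):
    """Place nodes in one column at horizontal position x, one row per node."""
    return {node: (x, -row) for row, node in enumerate(nodes)}

pos = {**column_positions(left, 0.0), **column_positions(right, 1.0)}

# Scale the figure height with the row count so rows keep a fixed
# physical distance (about 0.3 inch per row, an assumed value).
rows = max(len(left), len(right))
fig, ax = plt.subplots(figsize=(6, 0.3 * rows))
nx.draw(G, pos=pos, ax=ax, node_shape="s", node_size=150,
        with_labels=True, font_size=6)
fig.savefig("bipartite.png")  # hypothetical output file name
```

With this layout the marker size stays fixed in points while the canvas grows, so adding rows does not squeeze the nodes together.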

Plot distribution of node attributes networkx

The nodes in a directed graph have Name, Age and Height as attributes. I want to plot the distribution of the three attributes. Is that possible?
I know that it is possible to get attributes this way:
name = nx.get_node_attributes(G, "Name")
age = nx.get_node_attributes(G, "Age")
height = nx.get_node_attributes(G, "Height")
But I don't really get how I can use those instead of G in the function below:
import networkx as nx
import matplotlib.pyplot as plt
def plot_degree_dist(G):
    degrees = [G.degree(n) for n in G.nodes()]
    plt.hist(degrees)
    plt.show()
plot_degree_dist(nx.gnp_random_graph(100, 0.5, directed=True))
Or is there some better way to plot the distribution of node attributes?
Seems like a perfectly reasonable way to me. I'm not aware of any more convenient method. To be more generalizable, add an argument to your function that takes the name of the attribute you'd like to plot.
Just know that nx.get_node_attributes() returns a dictionary keyed to the nodes. Since we're just plotting the distribution, we're only interested in the values and not the keys.
Here's a self-contained example following your lead:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
def plot_attribute_dist(G, attribute):
    # get_node_attributes returns {node: value}; only the values are plotted
    values = list(nx.get_node_attributes(G, attribute).values())
    plt.hist(values)
    plt.show()
attribute_name = 'Name'
G = nx.gnp_random_graph(100, 0.5, directed=True)
rng = np.random.default_rng(seed=42)
# Attach a random value to every node so there is something to plot
for node, data in G.nodes(data=True):
    data[attribute_name] = rng.normal()
plot_attribute_dist(G, attribute_name)
which outputs
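
For a non-numeric attribute such as Name, a histogram over numeric bins is less meaningful, and counting how often each value occurs may be more useful. The following is a small sketch of that idea using collections.Counter and a bar chart. The node data, the helper name attribute_counts, and the file name are all made up for illustration.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical graph with made-up attribute values.
G = nx.DiGraph()
G.add_node(1, Name="Ann", Age=31, Height=1.70)
G.add_node(2, Name="Bob", Age=45, Height=1.82)
G.add_node(3, Name="Ann", Age=28, Height=1.65)
G.add_edge(1, 2)

def attribute_counts(G, attribute):
    """Count how many nodes carry each value of the given attribute."""
    return Counter(nx.get_node_attributes(G, attribute).values())

counts = attribute_counts(G, "Name")

fig, ax = plt.subplots()
ax.bar(list(counts.keys()), list(counts.values()))
ax.set_xlabel("Name")
ax.set_ylabel("number of nodes")
fig.savefig("name_counts.png")  # hypothetical output file name
```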

KeyError: 'color' in networkX

I am following a tutorial on NetworkX: https://www.datacamp.com/community/tutorials/networkx-python-graph-tutorial
This is the code:
import itertools
import copy
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew/e570c38bcc72a8d102422f2af836513b/raw/89c76b2563dbc0e88384719a35cba0dfc04cd522/edgelist_sleeping_giant.csv')
nodelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew/f989e10af17fb4c85b11409fea47895b/raw/a3a8da0fa5b094f1ca9d82e1642b384889ae16e8/nodelist_sleeping_giant.csv')
g = nx.Graph()
## Add edges and edge attributes
for i, elrow in edgelist.iterrows():
    g.add_edge(elrow[0], elrow[1], attr_dict=elrow[2:].to_dict())
## Add nodes and node attributes
for i, nlrow in nodelist.iterrows():
    g.nodes[nlrow['id']].update(nlrow[1:].to_dict())
##Visualization
# Define node positions data structure (dict) for plotting
node_positions = {node[0]: (node[1]['X'], -node[1]['Y']) for node in g.nodes(data=True)}
# Define data structure of edge colors for plotting
edge_colors = [e[2]["color"] for e in g.edges(data=True)]
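The last line gives me a KeyError: 'color', although the column in the data provided is called color, so it has nothing to do with case sensitivity.
You are missing the "attr_dict" key. Because the attributes were passed as attr_dict=..., the whole dictionary is stored under a single edge attribute named "attr_dict", and the "color" key is nested inside it.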
edge_colors = [e[2]["attr_dict"]["color"] for e in g.edges(data=True)]
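An alternative is to avoid the nesting when the graph is built. In NetworkX 2.x, add_edge takes edge attributes as keyword arguments, so unpacking the dictionary with ** makes each column its own attribute. The following is a minimal sketch of that approach. The small DataFrame is a made-up stand-in for the tutorial's edge list, with hypothetical rows and only some of its columns.

```python
import networkx as nx
import pandas as pd

# Hypothetical stand-in for the tutorial's edge list (made-up rows).
edgelist = pd.DataFrame({
    "node1": ["a", "b"],
    "node2": ["b", "c"],
    "color": ["red", "blue"],
    "distance": [0.3, 0.7],
})

g = nx.Graph()
for _, elrow in edgelist.iterrows():
    # ** unpacks the dict so each remaining column becomes its own edge attribute.
    g.add_edge(elrow["node1"], elrow["node2"], **elrow.iloc[2:].to_dict())

# The attributes are now top-level keys of each edge's data dict.
edge_colors = [data["color"] for _, _, data in g.edges(data=True)]
```

With the attributes stored this way, the tutorial's original e[2]["color"] lookup works unchanged.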

Save a graph in the graph6 format in Python using networkx

I try to save a graph in graph6 format in Python with networkx. The obvious way does not seem to work:
import networkx as nx
g = nx.Graph()
g.add_edge('1','2')
nx.write_graph6(g,"anedge.g6")
g1 = nx.read_graph6("anedge.g6")
import matplotlib.pyplot as plt
nx.draw(g1)
plt.savefig("anedge.pdf")
This produces a PDF file showing a graph with two isolated vertices instead of two connected vertices.
It should work if you use g.add_edge(0, 1) instead of g.add_edge('1','2'):
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edge(0, 1)
nx.write_graph6(g, 'anedge.g6')
g1 = nx.read_graph6('anedge.g6')
nx.draw(g1)
plt.savefig("anedge.pdf")
This is actually exposing a bug in the networkx graph6 generator when the nodes are not consecutively ordered from zero. Bug fix is here https://github.com/networkx/networkx/pull/2739
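If the original node names matter, one way to sidestep the issue on any version is to relabel the nodes to consecutive integers before writing and to map them back after reading, since graph6 stores only the structure. The following is a sketch under that assumption. The attribute name "original" and the temporary file location are arbitrary choices for illustration.

```python
import os
import tempfile

import networkx as nx

g = nx.Graph()
g.add_edge('1', '2')

# Relabel nodes to 0..n-1 and remember each old label in a node attribute.
# The attribute name "original" is an arbitrary choice.
h = nx.convert_node_labels_to_integers(g, label_attribute="original")
restore = nx.get_node_attributes(h, "original")  # {integer label: old label}

# Write and read back through a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "anedge.g6")
nx.write_graph6(h, path)
g1 = nx.read_graph6(path)

# graph6 does not keep labels, so restore them from the saved mapping.
g2 = nx.relabel_nodes(g1, restore)
```

The mapping has to be kept alongside the file (for example in a separate text file) if the labels need to survive between sessions.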

NetworkX and Matplotlib - Misplaced Text Labels

The following code tries to place an extra label on each node, in addition to the one that NetworkX/Matplotlib draws by default. The node positions are obtained through a call to "nx.spring_layout(g)".
The problem is that the labels drawn with Matplotlib end up misplaced, as can be seen in the attached graph.
Should I be doing something differently?
import logging
import networkx as nx
import matplotlib.pyplot as plt
__log = logging.getLogger(__name__)
g = nx.Graph()
nodes = ['shield', 'pcb-top', 'pcb-config', 'chassis']
for k in nodes:
    g.add_node(k)
plt.figure(figsize=(8, 11), dpi=150)
nx.draw(g, with_labels=True)
node_cfg = nx.spring_layout(g)
for k, node in node_cfg.items():
    __log.debug('node = %s #(%.6f, %.6f)', k, node[0], node[1])
    plt.text(node[0], node[1], k, bbox={'color': 'grey'})
plt.savefig('test.png')
Use the same position information for the network drawing as for the labels.
node_cfg = nx.spring_layout(g)
plt.figure(figsize=(8, 11), dpi=150)
nx.draw(g, pos=node_cfg, with_labels=True)
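The labels were misplaced because nx.draw computes its own layout when no pos is given, so the nodes and the text were placed from two different layouts. Put together, the fix might look like the sketch below, which computes the layout once and reuses it for both. The seed value is an assumption added for reproducibility, and the built-in labels are switched off here only to avoid drawing each name twice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import networkx as nx

g = nx.Graph()
nodes = ['shield', 'pcb-top', 'pcb-config', 'chassis']
g.add_nodes_from(nodes)

# Compute the layout once; the seed (an assumed value) makes it reproducible.
pos = nx.spring_layout(g, seed=42)

fig, ax = plt.subplots(figsize=(8, 11), dpi=150)
# Draw the nodes at exactly the positions used for the text below.
nx.draw(g, pos=pos, ax=ax, with_labels=False)
for name, (x, y) in pos.items():
    ax.text(x, y, name, bbox={'color': 'grey'})
fig.savefig('test.png')
```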
