Node size dependent on the node degree on NetworkX - python

I imported my Facebook data onto my computer in the form of a .json file. The data is in the format:
{"nodes":[{"name":"Alan"},{"name":"Bob"}],"links":[{"source":0,"target":1}]}
Then, I use this function:
import json
import networkx as nx

def parse_graph(filename):
    """
    Returns networkx graph object of facebook
    social network in json format
    """
    G = nx.Graph()
    # The with block closes the file even if json.load raises
    with open(filename) as json_data:
        data = json.load(json_data)
    # The nodes represent the names of the respective people
    # See networkx documentation for information on add_* functions
    G.add_nodes_from([n['name'] for n in data['nodes']])
    G.add_edges_from([(data['nodes'][e['source']]['name'], data['nodes'][e['target']]['name']) for e in data['links']])
    return G
to enable this .json file to be used as a graph in NetworkX. The only method I know for finding the degree of the nodes is:
degree = nx.degree(p)
Where p is the graph of all my friends. Now, I want to plot the graph such that the size of the node is the same as the degree of that node. How do I do this?
Using:
nx.draw(G,node_size=degree)
didn't work and I can't think of another method.

Update for those using networkx 2.x
The API has changed from v1.x to v2.x. networkx.degree no longer returns a dict but a DegreeView Object as per the documentation.
There is a guide for migrating from 1.x to 2.x here.
In this case it basically boils down to using dict(g.degree) instead of d = nx.degree(g).
The updated code looks like this:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edges_from([(1,2), (2,3), (2,4), (3,4)])
d = dict(g.degree)
nx.draw(g, nodelist=d.keys(), node_size=[v * 100 for v in d.values()])
plt.show()
In networkx 1.x, nx.degree(p) returns a dict, while the node_size keyword argument needs a scalar or an array of sizes. You can use the dict that nx.degree returns like this:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edges_from([(1,2), (2,3), (2,4), (3,4)])
d = nx.degree(g)
nx.draw(g, nodelist=d.keys(), node_size=[v * 100 for v in d.values()])
plt.show()

@miles82 provided a great answer. However, if you've already added the nodes to your graph using something like G.add_nodes_from(nodes), then I found that d = nx.degree(G) may not return the degrees in the same order as your nodes.
Building off the previous answer, you can modify the solution slightly to ensure the degrees are in the correct order:
d = nx.degree(G)
d = [(d[node]+1) * 20 for node in G.nodes()]
Note the d[node]+1, which ensures that nodes of degree zero get a non-zero size and are still drawn on the chart.
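A minimal sketch of the same idea that keeps nodes and sizes aligned by walking G.nodes() once. The helper name degree_node_sizes and its scale and minimum parameters are made up for this illustration; it relies only on G.degree(n) returning an integer for a single node, which holds in both networkx 1.x and 2.x.

```python
import networkx as nx

def degree_node_sizes(G, scale=100, minimum=20):
    """Return (nodelist, sizes) where sizes[i] belongs to nodelist[i]."""
    nodelist = list(G.nodes())
    # minimum keeps isolated (degree zero) nodes visible
    sizes = [G.degree(n) * scale + minimum for n in nodelist]
    return nodelist, sizes

G = nx.Graph()
G.add_nodes_from(["Alan", "Bob", "Carol"])  # Carol stays isolated
G.add_edge("Alan", "Bob")

nodelist, sizes = degree_node_sizes(G)
# With matplotlib installed, the result can be passed straight to draw:
# nx.draw(G, nodelist=nodelist, node_size=sizes)
```

Passing nodelist together with node_size is what guarantees that each size ends up on the intended node.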

Another method, if you still get 'DiDegreeView' object has no attribute 'keys':
1) First get the degree of each node as a list of tuples.
2) Build a node list from the first value of each tuple and a degree list from the second value.
3) Finally, draw the network with the node list and the degree list you've created.
Here's the code:
list_degree=list(G.degree()) #this will return a list of tuples each tuple is(node,deg)
nodes , degree = map(list, zip(*list_degree)) #build a node list and corresponding degree list
plt.figure(figsize=(20,10))
nx.draw(G, nodelist=nodes, node_size=[(v * 5)+1 for v in degree])
plt.show() #plotting the graph

Related

How to get Minimum Spanning Tree Matrix in python

Initially, I have a 2D array. Using this array I have created a graph with weights on its edges. Now I am trying to use this graph to build a minimum spanning tree matrix, but I can't get the result I want. I am using the following code to make the graph.
G = nx.from_numpy_matrix(ED_Matrix, create_using=nx.DiGraph)
layout = nx.spring_layout(G)
sizes = len(ED_Matrix)
nx.draw(G, layout, with_labels=True, node_size=sizes)
labels = nx.get_edge_attributes(G, "weight")
output = nx.draw_networkx_edge_labels(G, pos=layout, edge_labels=labels)
plt.show()
And it gives output like this
Now I am using the MST code below to get its MST matrix, but it gives an error like this.
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
Tcsr = minimum_spanning_tree(G)
Tcsr.toarray().astype(int)
According to the example in the SciPy docs, the tree should be constructed from the adjacency matrix of G, not from G itself.
You might want to replace G with nx.adjacency_matrix(G), csr_matrix(nx.adjacency_matrix(G)), or ED_Matrix itself when computing Tcsr:
Tcsr = minimum_spanning_tree(nx.adjacency_matrix(G)) #or
Tcsr = minimum_spanning_tree(csr_matrix(nx.adjacency_matrix(G))) #or
Tcsr = minimum_spanning_tree(ED_Matrix)
Tcsr is a sparse matrix which is later converted to numpy array.
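As a small self-contained illustration of the matrix route, the sketch below runs SciPy's minimum_spanning_tree on a hand-written 4-node weight matrix. The matrix is a stand-in for ED_Matrix, not the asker's data, and it assumes that a zero entry means "no edge".

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical weight matrix standing in for ED_Matrix;
# a zero entry is treated as "no edge".
weights = np.array([
    [0, 8, 0, 3],
    [0, 0, 2, 5],
    [0, 0, 0, 6],
    [0, 0, 0, 0],
])

tree = minimum_spanning_tree(csr_matrix(weights))  # sparse result
mst_matrix = tree.toarray().astype(int)            # dense MST matrix
total_weight = int(mst_matrix.sum())
```

The non-zero entries of mst_matrix are the edges kept in the tree; a spanning tree of a connected graph with n nodes always has n - 1 of them.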

NetworkX Minimum Spanning Tree has different cluster arrangement with the same data?

I have a large dataset which compares products with a relatedness measure which looks like this:
product1 product2 relatedness
0101 0102 0.047619
0101 0103 0.023810
0101 0104 0.095238
0101 0105 0.214286
0101 0106 0.047619
... ... ...
I used the following code to feed the data into the NetworkX graphing tool and produce an MST diagram:
import networkx as nx
import matplotlib.pyplot as plt
products = (data['product1'])
products = list(dict.fromkeys(products))
products = sorted(products)
G = nx.Graph()
G.add_nodes_from(products)
print(G.number_of_nodes())
print(G.nodes())
row = 0
for c in data['product1']:
    p = data['product2'][row]
    w = data['relatedness'][row]
    if w > 0:
        G.add_edge(c,p, weight=w, with_labels=True)
    row = row + 1
nx.draw(nx.minimum_spanning_tree(G), with_labels=True)
plt.show()
The resulting diagram looks like this: https://i.imgur.com/pBbcPGc.jpg
However, when I re-run the code, with the same data and no modifications, the arrangement of the clusters appears to change, so it then looks different, example here: https://i.imgur.com/4phvFGz.jpg, second example here: https://i.imgur.com/f2YepVx.jpg. The clusters, edges, and weights do not appear to be changing, but the arrangement of them on the graph space is changing each time.
What causes the arrangement of the nodes to change each time without any changes to the code or data? How can I re-write this code to produce a network diagram with approximately the same arrangement of nodes and edges for the same data each time?
The nx.draw method uses spring_layout by default (link to the doc). This layout implements the Fruchterman-Reingold force-directed algorithm, which starts from random initial positions. That randomness is the effect you see across repeated runs.
If you want to "fix" the positions, then you should explicitly call the spring_layout function and specify the initial positions in the pos argument.
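Another option, not mentioned in the original answer, is the seed parameter that spring_layout accepts in recent networkx 2.x and later releases. The sketch below uses a small path graph as a placeholder for the spanning tree and checks that two calls with the same seed give identical positions.

```python
import networkx as nx

G = nx.path_graph(5)  # placeholder graph for illustration

# Same seed, same random starting positions, same final layout
pos_first = nx.spring_layout(G, seed=42)
pos_second = nx.spring_layout(G, seed=42)

same_layout = all((pos_first[n] == pos_second[n]).all() for n in G.nodes())

# With matplotlib installed:
# nx.draw(G, pos=pos_first, with_labels=True)
```

The value 42 is arbitrary; any fixed integer gives a reproducible drawing for the same graph and networkx version.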
For clarity, assign G = nx.minimum_spanning_tree(G). Then
nx.draw(G, with_labels=True)
is equivalent to
pos = nx.spring_layout(G)
nx.draw(G, pos=pos, with_labels=True)
If you don't want pos to be recalculated randomly every time you run your script, one way to keep it stable is to compute it once, store it in a file, and load it from that file on later runs. You can put this script before nx.draw(G, pos=pos, with_labels=True):
import os, json

def store(pos):
    # convert the numpy arrays in pos to lists so json can serialise them
    return {k: v.tolist() for k, v in pos.items()}

def retrieve(pos):
    # json turns keys into strings; convert them back (assumes numeric node labels)
    return {float(k): v for k, v in pos.items()}

if 'pos.txt' in os.listdir():
    with open('pos.txt') as json_file:
        pos = retrieve(json.load(json_file)) #retrieving dictionary from file
    print('retrieve', pos)
else:
    pos = nx.spring_layout(G) #calculates pos
    print('store', pos)
    with open('pos.txt', 'w') as outfile:
        json.dump(store(pos), outfile, indent=4) #records pos dictionary into file
This is an ugly solution because it depends entirely on the data types used in the pos dictionary. It worked for me, but you might need to define your own conversions in store and retrieve.
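One way to make the key conversion explicit is to pass the node type in when loading. The following is only a sketch: pos_to_json, pos_from_json and the key_type parameter are names invented here, and it round-trips through a string rather than a file to stay self-contained.

```python
import json
import networkx as nx

def pos_to_json(pos):
    """Serialise a layout dict; node keys become strings in JSON."""
    return json.dumps({str(node): [float(c) for c in xy] for node, xy in pos.items()})

def pos_from_json(text, key_type=str):
    """Rebuild the layout dict, converting keys back with key_type."""
    return {key_type(node): tuple(xy) for node, xy in json.loads(text).items()}

G = nx.path_graph(3)                 # nodes are the ints 0, 1, 2
pos = nx.spring_layout(G, seed=1)
restored = pos_from_json(pos_to_json(pos), key_type=int)
```

For string node labels such as '0101', the default key_type=str leaves the keys untouched, which avoids the float conversion problem.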

Add and delete a random edge in networkx

I'm using NetworkX in python. Given any undirected and unweighted graph, I want to loop through all the nodes. With each node, I want to add a random edge and/or delete an existing random edge for that node with probability p. Is there a simple way to do this? Thanks a lot!
Create a new random edge in networkx
Let's set up a test graph:
import networkx as nx
import random
import matplotlib.pyplot as plt
graph = nx.Graph()
graph.add_edges_from([(1,3), (3,5), (2,4)])
nx.draw(graph, with_labels=True)
plt.show()
Now we can pick a random edge from the list of non-edges of the graph. It is not yet entirely clear which probability you mean. Since you added a comment stating that you want to use random.choice, I'll stick to that.
def random_edge(graph, del_orig=True):
    '''
    Create a new random edge and delete one of the current edges if del_orig is True.
    :param graph: networkx graph
    :param del_orig: bool
    :return: networkx graph
    '''
    edges = list(graph.edges)
    nonedges = list(nx.non_edges(graph))
    # random edge choice
    chosen_edge = random.choice(edges)
    # candidate non-edges that start at the same node as the chosen edge
    chosen_nonedge = random.choice([x for x in nonedges if chosen_edge[0] == x[0]])
    if del_orig:
        # delete chosen edge
        graph.remove_edge(chosen_edge[0], chosen_edge[1])
    # add new edge
    graph.add_edge(chosen_nonedge[0], chosen_nonedge[1])
    return graph
Usage example:
new_graph = random_edge(graph, del_orig=True)
nx.draw(new_graph, with_labels=True)
plt.show()
We can still add a probability distribution over the edges in random.choice if you need to (using numpy.random.choice() for instance).
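A hedged sketch of what such a weighted pick could look like with NumPy. The weighting rule (sum of endpoint degrees) is purely hypothetical and only there to have non-uniform probabilities; the index is drawn first because the edge list is a list of tuples rather than a 1-D array.

```python
import networkx as nx
import numpy as np

graph = nx.Graph()
graph.add_edges_from([(1, 3), (3, 5), (2, 4)])
edges = list(graph.edges)

# Hypothetical weighting: favour edges whose endpoints have high degree
weights = np.array([graph.degree(u) + graph.degree(v) for u, v in edges], dtype=float)
probabilities = weights / weights.sum()

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
index = rng.choice(len(edges), p=probabilities)
chosen_edge = edges[index]
```

Any other non-negative weights work the same way as long as they are normalised to sum to one.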
Given a node i, to add edges without duplication you need to know (1) which edges from i already exist, and then compute (2) the set of candidate edges from i that don't exist yet. For removals, you already defined a method in the comment, which is based simply on (1).
Here is a function that performs one round of randomised addition and removal, based on list comprehensions:
def add_and_remove_edges(G, p_new_connection, p_remove_connection):
    '''
    for each node,
      add a new connection to random other node, with prob p_new_connection,
      remove a connection, with prob p_remove_connection
    operates on G in-place
    '''
    new_edges = []
    rem_edges = []
    for node in G.nodes():
        # find the other nodes this one is connected to
        connected = [to for (fr, to) in G.edges(node)]
        # and find the remainder of nodes, which are candidates for new edges
        # (the node itself is excluded so that no self-loop is created)
        unconnected = [n for n in G.nodes() if n not in connected and n != node]
        # probabilistically add a random edge
        if len(unconnected): # only try if new edge is possible
            if random.random() < p_new_connection:
                new = random.choice(unconnected)
                G.add_edge(node, new)
                print("\tnew edge:\t {} -- {}".format(node, new))
                new_edges.append( (node, new) )
                # book-keeping, in case both add and remove done in same cycle
                unconnected.remove(new)
                connected.append(new)
        # probabilistically remove a random edge
        if len(connected): # only try if an edge exists to remove
            if random.random() < p_remove_connection:
                remove = random.choice(connected)
                G.remove_edge(node, remove)
                print("\tedge removed:\t {} -- {}".format(node, remove))
                rem_edges.append( (node, remove) )
                # book-keeping, in case lists are important later?
                connected.remove(remove)
                unconnected.append(remove)
    return rem_edges, new_edges
To see this function in action:
import networkx as nx
import random
import matplotlib.pyplot as plt
p_new_connection = 0.1
p_remove_connection = 0.1
G = nx.karate_club_graph() # sample graph (undirected, unweighted)
# show original
plt.figure(1); plt.clf()
fig, ax = plt.subplots(2,1, num=1, sharex=True, sharey=True)
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos=pos, ax=ax[0])
# now apply one round of changes
rem_edges, new_edges = add_and_remove_edges(G, p_new_connection, p_remove_connection)
# and draw new version and highlight changes
nx.draw_networkx(G, pos=pos, ax=ax[1])
nx.draw_networkx_edges(G, pos=pos, ax=ax[1], edgelist=new_edges,
                       edge_color='b', width=4)
# note: to highlight edges that were removed, add them back in;
# This is obviously just for display!
G.add_edges_from(rem_edges)
nx.draw_networkx_edges(G, pos=pos, ax=ax[1], edgelist=rem_edges,
                       edge_color='r', style='dashed', width=4)
G.remove_edges_from(rem_edges)
plt.show()
And you should see something like this.
Note that you could also do something similar with the adjacency matrix,
A = nx.adjacency_matrix(G).todense() (it's a numpy matrix so operations like A[i,:].nonzero() would be relevant). This might be more efficient if you have extremely large networks.
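A rough sketch of the adjacency-matrix variant. It uses nx.to_numpy_array (available in networkx 2.x and later) instead of the .todense() call above, and it assumes the node labels are 0..n-1 as in the karate club graph, so that row indices and node labels coincide.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
nodes = list(G.nodes())                  # fixes the row/column order
A = nx.to_numpy_array(G, nodelist=nodes)

i = 0                                    # row of the node under inspection
connected = A[i].nonzero()[0]            # indices of existing neighbours
unconnected = np.flatnonzero(A[i] == 0)  # candidates for a new edge
unconnected = unconnected[unconnected != i]  # exclude the node itself
```

For graphs with other labels, nodes[j] maps a row index j back to the node it represents.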

How to display bipartite graphs with python networkX package?

How does one display a bipartite graph in the python networkX package, with the nodes from one class in a column on the left and those from the other class on the right?
I can create a graph and display it like this
B = nx.Graph()
B.add_nodes_from([1,2,3,4], bipartite=0) # Add the node attribute "bipartite"
B.add_nodes_from(['a','b','c'], bipartite=1)
B.add_edges_from([(1,'a'), (1,'b'), (2,'b'), (2,'c'), (3,'c'), (4,'a')])
nx.draw(B)
plt.show()
But I want nodes 1,2,3,4 on the left in a column and the nodes 'a','b','c' in a column on the right, with edges going between them.
You need to set the positions for each node by yourself:
B = nx.Graph()
B.add_nodes_from([1,2,3,4], bipartite=0) # Add the node attribute "bipartite"
B.add_nodes_from(['a','b','c'], bipartite=1)
B.add_edges_from([(1,'a'), (1,'b'), (2,'b'), (2,'c'), (3,'c'), (4,'a')])
# Separate by group
l, r = nx.bipartite.sets(B)
pos = {}
# Update position for node from each group
pos.update((node, (1, index)) for index, node in enumerate(l))
pos.update((node, (2, index)) for index, node in enumerate(r))
nx.draw(B, pos=pos)
plt.show()
Building on @Rikka's answer and newer versions of NetworkX, the following automates (and improves) the positioning of the bipartite network. I've also added labels and different colors to the different partitions of the network.
B = networkx.Graph()
B.add_nodes_from([1,2,3,4], bipartite=0) # Add the node attribute "bipartite"
B.add_nodes_from(['abc','bcd','cef'], bipartite=1)
B.add_edges_from([(1,'abc'), (1,'bcd'), (2,'bcd'), (2,'cef'), (3,'cef'), (4,'abc')])
top = networkx.bipartite.sets(B)[0]
pos = networkx.bipartite_layout(B, top)
networkx.draw(B, pos=pos, with_labels=True, node_color=['green','green','green','green','blue','blue','blue'])
plt.show()
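As far as I know, bipartite.sets can raise an error on disconnected graphs in recent networkx versions, because the two sides are then ambiguous. A possible workaround, sketched below, reads the sides from the "bipartite" node attribute that was set when the nodes were added; the variable names left and colors are just illustrative.

```python
import networkx as nx

B = nx.Graph()
B.add_nodes_from([1, 2, 3, 4], bipartite=0)
B.add_nodes_from(['a', 'b', 'c'], bipartite=1)
B.add_edges_from([(1, 'a'), (2, 'b')])   # disconnected on purpose

# Take one side from the node attribute instead of bipartite.sets
left = [n for n, d in B.nodes(data=True) if d['bipartite'] == 0]
pos = nx.bipartite_layout(B, left)

# One colour per node, in the order of B.nodes()
colors = ['green' if B.nodes[n]['bipartite'] == 0 else 'blue' for n in B.nodes()]

# With matplotlib installed:
# nx.draw(B, pos=pos, with_labels=True, node_color=colors)
```

Deriving the colour list from the attribute also avoids hard-coding one entry per node.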
To answer my own question, based on @Rikka above: here is code to determine the positions for nodes in an arbitrary multipartite graph, given names for the parts.
def position_MultiPartiteGraph( Graph, Parts ):
    # Graph is a networkX Graph object, where the nodes have attribute 'agentType' with part name as a value
    # Parts is a list of names for the parts (to be shown as columns)
    # returns a dictionary with keys being networkX nodes, values being (x, y) coordinates for plotting
    xPos = {}
    yPos = {}
    for index1, agentType in enumerate(Parts):
        xPos[agentType] = index1
        yPos[agentType] = 0
    pos = {}
    for node, attrDict in Graph.nodes(data=True):
        agentType = attrDict['agentType']
        # print ('node: %s\tagentType: %s' % (node, agentType))
        # print ('\t(x,y): (%d,%d)' % (xPos[agentType], yPos[agentType]))
        pos[node] = (xPos[agentType], yPos[agentType])
        yPos[agentType] += 1
    return pos
Now, suppose I define a tripartite graph like this (weights are irrelevant for this example):
TG = nx.Graph()
TG.add_nodes_from([1,2,3,4], agentType='world') # Add the node attribute "agentType"
TG.add_nodes_from(['a','b','c'], agentType='sender')
TG.add_nodes_from(['A','B','C'], agentType='receiver')
# This is just an easier way to add (and to automatically generate) weighted edges
myEdges = [(1,'a',0.75),
           (1,'b',0.25),
           (2,'b',0.5),
           (2,'c',0.5),
           (3,'c',1.0),
           (4,'a',1.0),
           ('a','C',0.10),
           ('a','A',0.80),
           ('c','A',1.0),
           ('b','C',1.0)]
# add_weighted_edges_from stores the third tuple element as the 'weight' attribute
TG.add_weighted_edges_from(myEdges)
Then here is how to use it:
nx.draw(TG,pos=position_MultiPartiteGraph(TG, ['world', 'sender', 'receiver']))
plt.show()
I'm not sure how to show the output, but it works for me! Hurray! Thanks @Rikka!

Matrix object has no attribute nodes networkX python

I'm trying to represent some numbers as edges of a graph with connected components. For this, I've been using python's networkX module.
My graph is G, and has nodes and edges initialised as follows:
G = nx.Graph()
for (x,y) in my_set:
    G.add_edge(x,y)
print(G.nodes()) #This prints all the nodes
print(G.edges()) #Prints all the edges as tuples
adj_matrix = nx.to_numpy_matrix(G)
Once I add the following line,
pos = nx.spring_layout(adj_matrix)
I get the abovementioned error.
If it might be useful, all the nodes are numbered in 9-15 digits. There are 412 nodes and 422 edges.
Detailed error:
  File "pyjson.py", line 89, in <module>
    mainevent()
  File "pyjson.py", line 60, in mainevent
    pos = nx.spring_layout(adj_matrix)
  File "/usr/local/lib/python2.7/dist-packages/networkx/drawing/layout.py", line 244, in fruchterman_reingold_layout
    A=nx.to_numpy_matrix(G,weight=weight)
  File "/usr/local/lib/python2.7/dist-packages/networkx/convert_matrix.py", line 128, in to_numpy_matrix
    nodelist = G.nodes()
AttributeError: 'matrix' object has no attribute 'nodes'
Edit: Solved below. Useful information: pos creates a dict with coordinates for each node. Doing nx.draw(G,pos) creates a pylab figure. But it doesn't display it, because pylab doesn't display automatically.
(some of this answer addresses some things in your comments. Can you add those to your question so that later users get some more context)
pos creates a dict with coordinates for each node. Doing nx.draw(G,pos) creates a pylab figure. But it doesn't display it, because pylab doesn't display automatically.
import networkx as nx
import pylab as py
G = nx.Graph()
for (x,y) in my_set:
    G.add_edge(x,y)
print(G.nodes()) #This prints all the nodes
print(G.edges()) #Prints all the edges as tuples
pos = nx.spring_layout(G)
nx.draw(G,pos)
py.show() # or py.savefig('graph.pdf') if you want to create a pdf,
          # similarly for png or other file types
The final py.show() will display it. py.savefig('filename.extension') will save as any of a number of filetypes based on what you use for extension.
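If only an adjacency matrix is at hand, one possible route is to turn it back into a graph before asking for a layout. The sketch below assumes networkx 2.x or later, where nx.to_numpy_array and nx.from_numpy_array are available; the small example graph and the seed value are placeholders. Note that the rebuilt graph labels its nodes 0..n-1 rather than with the original labels.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(10, 20), (20, 30), (30, 10), (30, 40)])

adj = nx.to_numpy_array(G)      # plain numpy adjacency matrix
H = nx.from_numpy_array(adj)    # back to a graph; nodes are 0..n-1

# spring_layout needs a graph, so pass H rather than adj
pos = nx.spring_layout(H, seed=7)
```

When the original graph object is still available, passing it directly to spring_layout is simpler and keeps the original node labels.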
spring_layout takes a networkx graph as its first parameter, not a numpy array. What it returns are the positions of the nodes according to the Fruchterman-Reingold force-directed algorithm.
So you need to pass the result to draw, for example:
import networkx as nx
%matplotlib inline
G=nx.lollipop_graph(14, 3)
nx.draw(G,nx.spring_layout(G))
yields:
