Python: how to optimize the count of all possible shortest paths? - python

In a 3x3 network I want to be able to determine all the shortest paths between any two nodes. Then, for each node in the network, I want to compute how many shortest paths pass through one specific node.
This requires using the nx.all_shortest_paths(G,source,target) function, which returns a generator. This differs from using nx.all_pairs_shortest_path(G), as suggested here. The difference is that the former computes all the shortest paths between any two nodes, while the latter computes only one shortest path for each pair of nodes.
Given that I need to consider all shortest paths, I have come up with the following script. This is how I generate the network I am working with:
import networkx as nx
N = 3
G = nx.grid_2d_graph(N, N)
pos = dict((n, n) for n in G.nodes())
labels = dict(((i, j), i + (N - 1 - j) * N) for i, j in G.nodes())
nx.relabel_nodes(G, labels, False)  # relabel in place (copy=False)
# On Python 3, dict views have no .sort() method, so build sorted lists instead
inds = sorted(labels.keys())
vals = sorted(labels.values())
pos2 = dict(zip(vals, inds))
nx.draw_networkx(G, pos=pos2, with_labels=False, node_size=15)
And this is how I print all the shortest paths between any two nodes:
for n in G.nodes():
    for j in G.nodes():
        if n != j:  # Self-loops are excluded
            gener = nx.all_shortest_paths(G, source=n, target=j)
            print('From node ' + str(n) + ' to ' + str(j))
            for p in gener:
                print(p)
            print('------')
The result is a path from node x to node y which only includes the nodes along the way. An excerpt of what I get is:
From node 0 to 2 #Only one path exists
[0, 1, 2] #Node 1 is passed through while going from node 0 to node 2
------
From node 0 to 4 #Two paths exist
[0, 1, 4] #Node 1 is passed through while going from node 0 to node 4
[0, 3, 4] #Node 3 is passed through while going from node 0 to node 4
------
...continues until all pairs of nodes are covered...
My question: how could I amend the last code block to make sure that I know how many shortest paths, in total, pass through each node? According to the excerpt outcome I've provided, node 1 is passed through 2 times, while node 3 is passed through 1 time (starting and ending node are excluded). This calculation needs to be carried out to the end to figure out the final number of paths through each node.

I would suggest making a dict that maps each node to 0:
counts = {}
for n in G.nodes():
    counts[n] = 0
and then, for each path you find (you are already finding and printing them all), iterate through the interior vertices of the path, incrementing the corresponding values in your dict. Slicing with p[1:-1] skips the start and end nodes, which the question excludes:
# ...
for p in gener:
    print(p)
    for v in p[1:-1]:
        counts[v] += 1
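Putting the two pieces together, a minimal self-contained sketch might look like the following. The function name count_paths_through is made up for illustration, and the grid is relabelled with nx.convert_node_labels_to_integers, which numbers the nodes row by row and therefore differs from the labelling used in the question.

```python
import networkx as nx
from itertools import permutations


def count_paths_through(G):
    """Count, for every node, the shortest paths that pass through it.

    Start and end nodes of a path are not counted. Every ordered pair
    (source, target) is visited, as in the loops from the question.
    """
    counts = {n: 0 for n in G.nodes()}
    for s, t in permutations(G.nodes(), 2):
        for path in nx.all_shortest_paths(G, source=s, target=t):
            for v in path[1:-1]:  # interior nodes only
                counts[v] += 1
    return counts


# Assumed labelling: 0..8 row by row, so node 4 is the centre of the grid.
G = nx.convert_node_labels_to_integers(nx.grid_2d_graph(3, 3))
counts = count_paths_through(G)
print(counts)
```

On this 3x3 grid the corner nodes end up with 8, the edge midpoints with 30 and the centre with 60. Because each pair is visited in both directions, these are twice the counts over unordered pairs.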

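What you seek to compute is closely related to the unnormalized betweenness centrality. Note that betweenness weights each pair of nodes by the fraction of its shortest paths that pass through the node, so it is not identical to a raw path count when a pair has several shortest paths.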
From Wikipedia:
The betweenness centrality is an indicator of a node's centrality in a network. It is equal to the number of shortest paths from all vertices to all others that pass through that node.
More generally, I suggest you have a look at all the standard measures of centrality already in Networkx.
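As a rough illustration, NetworkX exposes this measure as nx.betweenness_centrality. The sketch below assumes the same row-by-row integer labelling of the 3x3 grid (node 4 is the centre), which is not the labelling used in the question. For an undirected graph with normalized=False, each unordered pair is counted once, and a pair with several shortest paths contributes only the fraction of those paths that pass through the node.

```python
import networkx as nx

# Assumed labelling: 0..8 row by row, node 4 is the centre of the grid.
G = nx.convert_node_labels_to_integers(nx.grid_2d_graph(3, 3))

# Unnormalized betweenness: sum over node pairs of the fraction of
# shortest paths between the pair that pass through each node.
bc = nx.betweenness_centrality(G, normalized=False)

for node in sorted(bc):
    print(node, round(bc[node], 3))
```

The values are fractional (the centre comes out at 32/3), so they rank nodes in a similar way to the raw path counts without being equal to them.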

Related

Dijkstra algorithm to select randomly an adjacent node with same minimum weight

I have implemented Dijkstra's algorithm but I have a problem. It always prints the same minimum path while there may be other paths with the same weight.
How could I change my algorithm so that it randomly selects the neighbors with the same weight?
My algorithm is below:
import sys

def dijkstra_algorithm(graph, start_node):
    unvisited_nodes = list(graph.get_nodes())
    # We'll use this dict to save the cost of visiting each node and update it as we move along the graph
    shortest_path = {}
    # We'll use this dict to save the shortest known path to a node found so far
    previous_nodes = {}
    # We'll use max_value to initialize the "infinity" value of the unvisited nodes
    max_value = sys.maxsize
    for node in unvisited_nodes:
        shortest_path[node] = max_value
    # However, we initialize the starting node's value with 0
    shortest_path[start_node] = 0
    # The algorithm executes until we visit all nodes
    while unvisited_nodes:
        # The code block below finds the node with the lowest score
        current_min_node = None
        for node in unvisited_nodes:  # Iterate over the nodes
            if current_min_node is None:
                current_min_node = node
            elif shortest_path[node] < shortest_path[current_min_node]:
                current_min_node = node
        # The code block below retrieves the current node's neighbors and updates their distances
        neighbors = graph.get_outgoing_edges(current_min_node)
        for neighbor in neighbors:
            tentative_value = shortest_path[current_min_node] + graph.value(current_min_node, neighbor)
            if tentative_value < shortest_path[neighbor]:
                shortest_path[neighbor] = tentative_value
                # We also update the best path to the current node
                previous_nodes[neighbor] = current_min_node
        # After visiting its neighbors, we mark the node as "visited"
        unvisited_nodes.remove(current_min_node)
    return previous_nodes, shortest_path
# The code block below finds all the min nodes
# and randomly chooses one for traversal
# (it replaces the min-node search inside the while loop and needs "import random")
min_nodes = []
for node in unvisited_nodes:  # Iterate over the nodes
    if len(min_nodes) == 0:
        min_nodes.append(node)
    elif shortest_path[node] < shortest_path[min_nodes[0]]:
        min_nodes = [node]
    elif shortest_path[node] == shortest_path[min_nodes[0]]:
        # this is the case where 2 nodes have the same cost
        # we are going to take all of them
        # and at the end choose one randomly
        min_nodes.append(node)
current_min_node = random.choice(min_nodes)
What the code does is as follows:
Instead of taking the first smallest element, it creates a list of all the smallest elements.
At the end it chooses one of the smallest elements randomly.
This preserves the Dijkstra invariant while varying which of the equally cheap paths is found.
Probably just try something like this:
random.shuffle(neighbors)
for neighbor in neighbors:
    ...
which should visit the neighbors in random order. This assumes neighbors is a list; random.shuffle works in place, so if it is a tuple or a generator, call list() on it first.
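For readers who want to try the tie-breaking idea in isolation, here is a minimal sketch. It does not use the graph class from the question: the graph is assumed to be a plain dict mapping each node to a dict of {neighbor: weight}, and the function name dijkstra_random_ties and the rng parameter are made up for this example.

```python
import random


def dijkstra_random_ties(graph, start, rng=random):
    """Dijkstra with random tie-breaking among equally close unvisited nodes.

    `graph` is assumed to be a dict: node -> {neighbor: non-negative weight}.
    """
    dist = {node: float("inf") for node in graph}
    dist[start] = 0
    previous = {}
    unvisited = list(graph)
    while unvisited:
        # Collect every unvisited node that shares the minimum distance...
        best = min(dist[node] for node in unvisited)
        candidates = [node for node in unvisited if dist[node] == best]
        # ...and pick one of them at random.
        current = rng.choice(candidates)
        for neighbor, weight in graph[current].items():
            tentative = dist[current] + weight
            if tentative < dist[neighbor]:
                dist[neighbor] = tentative
                previous[neighbor] = current
        unvisited.remove(current)
    return previous, dist


# Two equally cheap routes from A to D: via B or via C.
graph = {
    "A": {"B": 1, "C": 1},
    "B": {"D": 1},
    "C": {"D": 1},
    "D": {},
}
previous, dist = dijkstra_random_ties(graph, "A", random.Random(0))
print(dist["D"], previous["D"])
```

The distances are the same on every run; only the predecessor recorded for D changes, depending on whether B or C happens to be visited first. The choice is random but not necessarily uniform over all cheapest paths.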

Scale-free network using preferential attachment algorithm

I'm having trouble understanding what this piece of code does. Please could someone step by step go through the code and explain how it works and what it's doing?
def scale_free(n, m):
    if m < 1 or m >= n:
        raise nx.NetworkXError("Preferential attachment algorithm must have m >= 1"
                               " and m < n, m = %d, n = %d" % (m, n))
    # Add m initial nodes (m0 in barabasi-speak)
    G = nx.empty_graph(m)
    # Target nodes for new edges
    targets = list(range(m))
    # List of existing nodes, with nodes repeated once for each adjacent edge
    repeated_nodes = []
    # Start adding the other n-m nodes. The first node is m.
    source = m
    while source < n:
        # Add edges to m nodes from the source.
        G.add_edges_from(zip([source]*m, targets))
        # Add one node to the list for each new edge just created.
        repeated_nodes.extend(targets)
        # And the new node "source" has m edges to add to the list.
        repeated_nodes.extend([source]*m)
        # Now choose m unique nodes from the existing nodes
        # Pick uniformly from repeated_nodes (preferential attachment)
        targets = _random_subset(repeated_nodes, m)
        source += 1
    return G
So the first part of this makes sure that m is at least 1 and n>m.
def scale_free(n, m):
    if m < 1 or m >= n:
        raise nx.NetworkXError("Preferential attachment algorithm must have m >= 1"
                               " and m < n, m = %d, n = %d" % (m, n))
Then it creates a graph with no edges and the first m nodes 0, 1, ..., m-1.
This looks a bit different from the standard Barabási-Albert graph, which starts from a connected graph rather than one without any edges.
    # Add m initial nodes (m0 in barabasi-speak)
    G = nx.empty_graph(m)
Now it's going to start adding new nodes one at a time and connecting them to existing nodes based on various rules. It first creates a list of "targets" that has all of the nodes in the edgeless graph.
    # Target nodes for new edges
    targets = list(range(m))
    # List of existing nodes, with nodes repeated once for each adjacent edge
    repeated_nodes = []
    # Start adding the other n-m nodes. The first node is m.
    source = m
Now it's going to add each node one at a time. When it does that, it will add the new node with edges to m of the previously existing nodes. Those m previous nodes have been stored in a list called targets.
    while source < n:
Here it creates those edges:
        # Add edges to m nodes from the source.
        G.add_edges_from(zip([source]*m, targets))
Now it's going to decide who will get those edges when the next node is added. It's supposed to choose them with probability proportional to their degree. The way it does that is by keeping a list repeated_nodes in which each node appears once per edge. It then chooses a random set of m nodes from that list to be the new targets. Depending on how _random_subset is defined, it might or might not be able to choose the same node several times to be a target in the same step.
        # Add one node to the list for each new edge just created.
        repeated_nodes.extend(targets)
        # And the new node "source" has m edges to add to the list.
        repeated_nodes.extend([source]*m)
        # Now choose m unique nodes from the existing nodes
        # Pick uniformly from repeated_nodes (preferential attachment)
        targets = _random_subset(repeated_nodes, m)
        source += 1
    return G

Find a minimum in a list of shortest paths

I have my graph object, and I'm trying to find the minimum for a group of nodes.
Ex. Nodes:
input_nodes=[123,45]
graph_nodes=[10, 76,123,45,98,456]
I run an algorithm which calculates the shortest path between every node in the graph and every node in the input.
I have a dictionary with all shortest paths between nodes:
{10:{123:0.56, 45:0.2}, 76:{123:0, 45:0.23}......
and so on for every node in the graph.
How do I get only the minimum weight that is different from zero?
Like this:
Minimum path node 10 has with node 45,
Minimum path node 76 has with node 45,
......
Thanks
I assume there might be ties in the weights, i.e. one node can be equally "close" to more than one other node. The following code will include them all in a dict. The structure of the result is essentially the same as the input d, but only the minimum weight items are retained.
Update: if a node has no neighbors or only neighbors with weight 0, this node will map to an empty dict in the result.
d = {10: {123: 0.56, 45: 0.2}, 76: {123: 0, 45: 0.23}, 19: {17: 0}, 20: {}}
def closest(ns):
    # Smallest non-zero weight, or None if there is no such weight
    m = min((v for v in ns.values() if v != 0), default=None)
    return {k: v for k, v in ns.items() if v == m}
print({k: closest(v) for k, v in d.items()})
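If only a single closest node per graph node is wanted, in the "Minimum path node 10 has with node 45" form from the question, a variation along these lines could work. The function name closest_nonzero is made up for the example, and ties are resolved by whichever node min() encounters first.

```python
def closest_nonzero(distances):
    """Map each node to (closest other node, weight), ignoring zero weights.

    Nodes with no non-zero weight are left out of the result.
    """
    result = {}
    for node, weights in distances.items():
        nonzero = {k: v for k, v in weights.items() if v != 0}
        if nonzero:
            target = min(nonzero, key=nonzero.get)
            result[node] = (target, nonzero[target])
    return result


d = {10: {123: 0.56, 45: 0.2}, 76: {123: 0, 45: 0.23}, 19: {17: 0}, 20: {}}
nearest = closest_nonzero(d)
for node, (target, weight) in nearest.items():
    print("Minimum path node %s has with node %s (%s)" % (node, target, weight))
```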

How to traverse tree with specific properties

I have a tree as shown below.
Red means it has a certain property, unfilled means it doesn't have it. I want to minimise the Red checks.
If Red, then all Ancestors are also Red (and should not be checked again).
If Not Red, then all Descendants are Not Red.
The depth of the tree is d.
The width of the tree is n.
Note that child nodes have values larger than their parent's.
Example: In the tree below,
Node '0' has children [1, 2, 3],
Node '1' has children [2, 3],
Node '2' has children [3] and
Node '3' has children [] (No children).
Thus children can be constructed as:
if vertex.depth > 0:
    vertex.children = [Vertex(parent=vertex, val=child_val, depth=vertex.depth-1, n=n) for child_val in range(vertex.val+1, n)]
else:
    vertex.children = []
Here is an example tree:
I am trying to count the number of Red nodes. Both the depth and the width of the tree will be large. So I want to do a sort of Depth-First-Search and additionally use the properties 1 and 2 from above.
How can I design an algorithm to traverse that tree?
PS: I tagged this [python] but any outline of an algorithm would do.
Update & Background
I want to minimise the property checks.
The property check is checking the connectedness of a bipartite graph constructed from my tree's path.
Example:
The bottom-left node in the example tree has path = [0, 1].
Let the bipartite graph have sets R and C with size r and c. (Note, that the width of the tree is n=r*c).
From the path I get to the edges of the graph by starting with a full graph and removing edges (x, y) for all values in the path as such: x, y = divmod(value, c).
The two rules for the property check come from the connectedness of the graph:
- If the graph is connected with edges [a, b, c] removed, then it must also be connected with [a, b] removed (rule 1).
- If the graph is disconnected with edges [a, b, c] removed, then it must also be disconnected with additional edge d removed [a, b, c, d] (rule 2).
Update 2
So what I really want to do is check all combinations of picking d elements out of [0..n]. The tree structure somewhat helps but even if I got an optimal tree traversal algorithm, I still would be checking too many combinations. (I noticed that just now.)
Let me explain. Assume I have checked [4, 5] (so 4 and 5 are removed from the bipartite graph as explained above, but that is irrelevant here). If this comes out as "Red", my tree will prevent me from checking [4] only. That is good. However, I should also mark off [5] from checking.
How can I change the structure of my tree (to a graph, maybe?) to further minimise my number of checks?
Use a variant of the deletion–contraction algorithm for evaluating the Tutte polynomial (which, evaluated at (1,2), gives the number of connected spanning subgraphs) on the complete bipartite graph K_{r,c}.
In a sentence, the idea is to order the edges arbitrarily, enumerate spanning trees, and count, for each spanning tree, how many spanning subgraphs of size r + c + k have that minimum spanning tree. The enumeration of spanning trees is performed recursively. If the graph G has exactly one vertex, the number of associated spanning subgraphs is the number of self-loops on that vertex choose k. Otherwise, find the minimum edge that isn't a self-loop in G and make two recursive calls. The first is on the graph G/e where e is contracted. The second is on the graph G-e where e is deleted, but only if G-e is connected.
Python is close enough to pseudocode.
class counter(object):
    def __init__(self, ival = 0):
        self.count = ival
    def count_up(self):
        self.count += 1
        return self.count

def old_walk_fun(ilist, func=None):
    def old_walk_fun_helper(ilist, func=None, count=0):
        tlist = []
        if(isinstance(ilist, list) and ilist):
            for q in ilist:
                tlist += old_walk_fun_helper(q, func, count+1)
        else:
            tlist = func(ilist)
        return [tlist] if(count != 0) else tlist
    if(func != None and hasattr(func, '__call__')):
        return old_walk_fun_helper(ilist, func)
    else:
        return []

def walk_fun(ilist, func=None):
    def walk_fun_helper(ilist, func=None, count=0):
        tlist = []
        if(isinstance(ilist, list) and ilist):
            if(ilist[0] == "Red"): # Only evaluate sub-branches if current level is Red
                for q in ilist:
                    tlist += walk_fun_helper(q, func, count+1)
        else:
            tlist = func(ilist)
        return [tlist] if(count != 0) else tlist
    if(func != None and hasattr(func, '__call__')):
        return walk_fun_helper(ilist, func)
    else:
        return []

# Crude tree structure, first element is always its colour; following elements are its children
tree_list = \
    ["Red",
        ["Red",
            ["Red",
                []
            ],
            ["White",
                []
            ],
            ["White",
                []
            ]
        ],
        ["White",
            ["White",
                []
            ],
            ["White",
                []
            ]
        ],
        ["Red",
            []
        ]
    ]

red_counter = counter()
eval_counter = counter()
old_walk_fun(tree_list, lambda x: (red_counter.count_up(), eval_counter.count_up()) if(x == "Red") else eval_counter.count_up())
print("Unconditionally walking")
print("Reds found: %d" % red_counter.count)
print("Evaluations made: %d" % eval_counter.count)
print("")

red_counter = counter()
eval_counter = counter()
walk_fun(tree_list, lambda x: (red_counter.count_up(), eval_counter.count_up()) if(x == "Red") else eval_counter.count_up())
print("Selectively walking")
print("Reds found: %d" % red_counter.count)
print("Evaluations made: %d" % eval_counter.count)
print("")
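The same pruning idea can be sketched directly on paths instead of a pre-coloured list. The version below is only an illustration under stated assumptions: it enumerates increasing paths over the values 0..n-1 up to a given depth, and is_red stands in for the expensive property check (the name and the toy predicate are made up). A branch is abandoned as soon as a path is Not Red, which is the second rule from the question.

```python
def count_red(n, depth, is_red):
    """Depth-first count of Red paths, skipping descendants of Not Red paths.

    Returns (number of Red paths, number of calls made to is_red).
    """
    checks = 0

    def visit(path, start):
        nonlocal checks
        total = 0
        for value in range(start, n):
            candidate = path + [value]
            checks += 1
            if not is_red(candidate):
                continue  # rule 2: no descendant of this path can be Red
            total += 1
            if len(candidate) < depth:
                total += visit(candidate, value + 1)
        return total

    return visit([], 0), checks


# Toy stand-in for the real check: a path is "Red" while its sum stays small.
# Adding more values can only increase the sum, so the property is monotone.
reds, checks = count_red(4, 3, lambda path: sum(path) <= 2)
print(reds, checks)
```

This only exploits the parent/child relation, so it does not address the extra sharing asked about in "Update 2" (marking [5] once [4, 5] is known to be Red); that would need a cache keyed on sets rather than on tree paths.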
How hard are you working on making the test for connectedness fast?
To test a graph for connectedness I would pick edges in a random order and use union-find to merge vertices when I see an edge that connects them. I could terminate early if the graph was connected, and I have a sort of certificate of connectedness - the edges which connected two previously unconnected sets of vertices.
As you work down the tree/follow a path on the bipartite graph, you are removing edges from the graph. If the edge you remove is not in the certificate of connectedness, then the graph must still be connected - this looks like a quick check to me. If it is in the certificate of connectedness you could back up to the state of union/find as of just before that edge was added and then try adding new edges, rather than repeating the complete connectedness test.
Depending on exactly how you define a path, you may be able to say that extensions of that path will never include edges using a subset of vertices - such as vertices which are in the interior of the path so far. If edges originating from those untouchable vertices are sufficient to make the graph connected, then no extension of the path can ever make it unconnected. Then at the very least you just have to count the number of distinct paths. If the original graph is regular I would hope to find some dynamic programming recursion that lets you count them without explicitly enumerating them.
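A minimal sketch of the union-find check with a certificate could look like this. The function names are made up for the example, vertices are assumed to be numbered 0..num_vertices-1, and the edges are processed in the order given (shuffle them beforehand if a random order is wanted).

```python
def find(parent, x):
    """Return the representative of x, compressing the path as it goes."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def connectivity_certificate(num_vertices, edges):
    """Return (connected, certificate) for an undirected graph.

    The certificate is the list of edges that each merged two previously
    separate components. If the graph is connected, these edges form a
    spanning tree, and removing any edge outside it keeps the graph connected.
    """
    parent = list(range(num_vertices))
    certificate = []
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            certificate.append((u, v))
            if len(certificate) == num_vertices - 1:
                return True, certificate  # early exit: already connected
    return len(certificate) == num_vertices - 1, certificate


# Complete bipartite graph K_{2,2}: R = {0, 1}, C = {2, 3}.
edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
connected, certificate = connectivity_certificate(4, edges)
print(connected, certificate)
```

Rolling back to an earlier union-find state, as suggested above, would require a version without path compression or with an undo log; that part is not shown here.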

python, igraph coping with vertex renumbering

I am implementing an algorithm for finding a dense subgraph in a directed graph using python+igraph. The main loop maintains two subgraphs S and T, which are initially identical, and removes nodes (and incident edges) according to a count of the indegree (or outdegree) of those nodes with respect to the other graph. The problem I have is that igraph renumbers the vertices, so when I delete some from T, the remaining nodes no longer correspond to the same ones in S.
Here is the main part of the loop that is key.
def directed(S):
    T = S.copy()
    c = 2
    while(S.vcount() > 0 and T.vcount() > 0):
        if (S.vcount()/T.vcount() > c):
            AS = S.vs.select(lambda vertex: T.outdegree(vertex) < 1.01*E(S,T)/S.vcount())
            S.delete_vertices(AS)
        else:
            BT = T.vs.select(lambda vertex: S.indegree(vertex) < 1.01*E(S,T)/T.vcount())
            T.delete_vertices(BT)
This doesn't work because of the effect of deleting vertices on the vertex ids. Is there a standard workaround for this problem?
One possibility is to assign unique names to the vertices in the name vertex attribute. These are kept intact when vertices are removed (unlike vertex IDs), and you can use them to refer to vertices in functions like indegree or outdegree. E.g.:
>>> g = Graph.Ring(4)
>>> g.vs["name"] = ["A", "B", "C", "D"]
>>> g.degree("C")
2
>>> g.delete_vertices(["B"])
>>> g.degree("C")
1
Note that I have removed vertex B so vertex C also gained a new ID, but the name is still the same.
In your case, the row with the select condition could probably be re-written like this:
AS = S.vs.select(lambda vertex: T.outdegree(vertex["name"]) < 1.01 * E(S,T)/S.vcount())
Of course this assumes that initially the vertex names are the same in S and T.
