Calculate the longest path between two nodes NetworkX - python

I'm trying to make a Gantt chart using NetworkX. All the nodes in the network are "tasks" that need to be performed to complete the project. With NetworkX it is easy to calculate the total time of the project, but to make the Gantt chart I need the latest start of each node.
NetworkX includes one function (dag_longest_path_length), but this calculates the longest path in the whole network. Another function (astar_path_length) gives the shortest path between a source and a node, but no function is available that gives the longest path, or the latest start in my case. (If a node has two predecessors, the shortest path takes the fastest route, but in reality the node also has to wait for the second predecessor before it can start.)
One option I was thinking of is to evaluate the preceding attached nodes and select the longest path. Unfortunately I did not succeed.
import matplotlib.pyplot as plt
import networkx as nx
# df is a pandas DataFrame with the columns blockT, Task, blockS, Succ and duration
start_time=[]
time=0
DD=nx.DiGraph()
for i in range(df.shape[0]):
    DD.add_edge(str(df.at[i,'blockT'])+'_'+df.at[i,'Task'], str(df.at[i,'blockS'])+'_'+df.at[i,'Succ'], weight=df.at[i,'duration'])
fig, ax = plt.subplots()
labels=[]
for i in range(df.shape[0]):
    labels.append(str(df.at[i,'blockT'])+'_'+df.at[i,'Task'])
    print(nx.astar_path_length(DD, '0_START', str(df.at[i,'blockT'])+'_'+df.at[i,'Task']))
    # bar start = shortest path length from START (this is the part that should be the longest path)
    ax.broken_barh([(nx.astar_path_length(DD, '0_START', str(df.at[i,'blockT'])+'_'+df.at[i,'Task'], heuristic=None, weight='weight'), df.at[i,'duration'])], (i-0.4,0.8), facecolors='blue')

Here is some code that I use. I agree it really should be part of NetworkX, because it comes up pretty often for me. graph must be a DiGraph (and acyclic), s is the source node, and dist is a dict keyed by node whose values are the weighted longest-path distances from s.
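def single_source_longest_dag_path_length(graph, s):
    # the source must not have any predecessors
    assert(graph.in_degree(s) == 0)
    # nodes that cannot be reached from s keep a distance of -inf
    dist = dict.fromkeys(graph.nodes, -float('inf'))
    dist[s] = 0
    topo_order = nx.topological_sort(graph)
    for n in topo_order:
        # relax every outgoing edge, keeping the larger distance
        for succ in graph.successors(n):
            if dist[succ] < dist[n] + graph.edges[n,succ]['weight']:
                dist[succ] = dist[n] + graph.edges[n,succ]['weight']
    return dist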
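As a rough illustration of how this could feed a Gantt chart, here is a minimal sketch on a made-up project. The task graph, the node names and the helper name longest_path_lengths are assumptions for the example. Edge weights follow the question's convention: each edge carries the duration of the task it leaves.

```python
import networkx as nx


def longest_path_lengths(graph, source):
    """Longest weighted distance from source to every node of a DAG."""
    dist = dict.fromkeys(graph.nodes, float("-inf"))
    dist[source] = 0
    for node in nx.topological_sort(graph):
        for succ in graph.successors(node):
            candidate = dist[node] + graph.edges[node, succ]["weight"]
            if candidate > dist[succ]:
                dist[succ] = candidate
    return dist


# Hypothetical project: A takes 3, B takes 5, C takes 2; C needs both A and B.
tasks = nx.DiGraph()
tasks.add_weighted_edges_from([
    ("START", "A", 0),
    ("START", "B", 0),
    ("A", "C", 3),
    ("B", "C", 5),
    ("C", "END", 2),
])

start_times = longest_path_lengths(tasks, "START")
for task, start in start_times.items():
    print(task, start)
```

In this sketch C starts only after the slower predecessor B has finished, which is the behaviour the shortest-path functions cannot give.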

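Looks like you are using DAGs.
Your problem is rather rare, so there is no built-in function for it in networkx. You should do it manually. Note that the line below enumerates every simple path and picks the one with the most nodes, so it ignores edge weights and can be slow on large graphs:
max(nx.all_simple_paths(DAG, source, target), key=len)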
Here is the full testing code:
import networkx as nx
# Create random DAG
G = nx.gnp_random_graph(50,0.3,directed=True)
DAG = nx.DiGraph([(u,v) for (u,v) in G.edges() if u<v])
# Get the longest path (most nodes) from node 1 to node 10
max(nx.all_simple_paths(DAG, 1, 10), key=len)

Related

how to create a route knowing starting point, ending point and distance to travel

I am trying to create a walking path on a map using Python. I need to set not only the start point and end point, but also the distance to travel, so I cannot just create the shortest path from point to point.
I started with osmnx and networkx. I created different paths, but I cannot check their distance. I cannot find anything about that in the documentation.
The idea is to make a Telegram bot that creates a walking path, so the point is to walk for 5 km, for example. The bot was easy, but I have no idea how to create a route based on the distance I want to travel (with start and end points).
I'm not clear what the exact question is here, but as for:
"I started with osmnx and networkx. I created different paths, but I cannot check their distance. I cannot find anything about that in the documentation."
This functionality is explained in both the OSMnx usage examples/documentation and the NetworkX documentation:
import networkx as nx
import osmnx as ox
ox.config(use_cache=True, log_console=True)
# get a graph, an origin, and a destination
G = ox.graph_from_place('Piedmont, CA, USA', network_type='drive')
orig, dest = list(G)[0], list(G)[-10]
# calculate the shortest path from origin to destination
path = nx.shortest_path(G, orig, dest, weight='length')
# the length of each edge traversed along the path
lengths = ox.utils_graph.get_route_edge_attributes(G, path, 'length')
# the total length of the path
path_length = sum(lengths)
# or just directly calculate the shortest path's length from origin to destination
path_length = nx.shortest_path_length(G, orig, dest, weight='length')
If you want to find the shortest path of at least length L, you could use OSMnx's k_shortest_paths function and then iterate through the paths until you find one with length >= L (determining each path's length as demonstrated in the code snippet above).
L = 3400
paths = ox.k_shortest_paths(G, orig, dest, 1000, 'length')
for i, path in enumerate(paths):
    length = sum(ox.utils_graph.get_route_edge_attributes(G, path, 'length'))
    if length >= L:
        break
i, length, path
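The same idea can be sketched with NetworkX alone, using shortest_simple_paths, which yields simple paths in order of increasing length. This is only an illustration on a tiny hand-made graph: the node names, the edge lengths and the helper name first_path_at_least are assumptions, and as far as I know shortest_simple_paths does not accept the MultiDiGraph that OSMnx returns, so a plain Graph is used here.

```python
import networkx as nx


def first_path_at_least(G, source, target, min_length, weight="length"):
    """Return the shortest simple path whose total weight is >= min_length."""
    for path in nx.shortest_simple_paths(G, source, target, weight=weight):
        total = nx.path_weight(G, path, weight=weight)
        if total >= min_length:
            return path, total
    return None, None  # no simple path is long enough


# Hypothetical street network with edge lengths in km
G = nx.Graph()
G.add_weighted_edges_from(
    [("a", "b", 1.0), ("b", "d", 1.0), ("a", "c", 2.0),
     ("c", "d", 2.5), ("b", "c", 0.5)],
    weight="length",
)

path, total = first_path_at_least(G, "a", "d", min_length=3.0)
print(path, total)
```

Because paths are generated shortest first, the first one that reaches the threshold is also the shortest one that does, but on a real street network the number of paths to scan can be very large.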

Problem with appending a graph object to lists for networkx in Python

I am trying to remove nodes at random from graphs using the networkx package. The first block describes the graph construction and the second block gives me the node lists that I have to remove from my graph H (20%, 50% and 70% removals). I want 3 versions of the base graph H in the end, in a list or any data structure. The code in block 3 gives me objects of type "None". The last block shows that it works for a single case.
I am guessing that the problem is in the append function, which somehow returns objects of type "None". I also feel that the base graph H might be getting altered after every iteration. Is there any way around this? Any help would be appreciated :)
import networkx as nx
import numpy as np
import random
# node removals from Graphs at random
# network construction
H = nx.Graph()
H.add_nodes_from([1,2,3,4,5,6,7,8,9,10])
H.add_edges_from([[1,2],[2,4],[5,6],[7,10],[1,5],[3,6]])
nx.info(H)
nodes_list = list(H.nodes)
# list of nodes to be removed
perc = [.20,.50,.70] # percentage of nodes to be removed
random_sample_list = []
for p in perc:
    interior_list = []
    random.seed(2) # for replicability
    sample = round(p*10)
    random_sample = random.sample(nodes_list, sample)
    interior_list.append(random_sample)
    random_sample_list.append(random_sample)
# applying the list of nodes to be removed to create a list of graphs - not working
graph_list = []
for i in range(len(random_sample_list)):
    H1 = H.copy()
    graph_list.append(H1.remove_nodes_from(random_sample_list[i]))
# list access - works
H.remove_nodes_from(random_sample_list[1])
nx.info(H)
The final output should look like:
[Graph with 20% removed nodes, Graph with 50% removed nodes, Graph with 70% removed nodes] - e.g. a list
The function remove_nodes_from modifies the graph in place and returns None, not the modified graph. Consequently, you need to remove the nodes from the copy first and then append the copy itself to the list:
graph_list = []
for i in range(len(random_sample_list)):
    H1 = H.copy()
    H1.remove_nodes_from(random_sample_list[i])
    graph_list.append(H1)
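Put together, a self-contained version might look like the sketch below. The helper name with_nodes_removed and the use of a local random.Random instance are assumptions made for the example; the graph is the one from the question.

```python
import random

import networkx as nx

H = nx.Graph()
H.add_nodes_from(range(1, 11))
H.add_edges_from([(1, 2), (2, 4), (5, 6), (7, 10), (1, 5), (3, 6)])


def with_nodes_removed(graph, fraction, seed=2):
    """Return a copy of graph with a random fraction of its nodes removed."""
    rng = random.Random(seed)  # local generator, for replicability
    k = round(fraction * graph.number_of_nodes())
    to_remove = rng.sample(list(graph.nodes), k)
    pruned = graph.copy()                # leave the base graph untouched
    pruned.remove_nodes_from(to_remove)  # modifies pruned in place, returns None
    return pruned


graph_list = [with_nodes_removed(H, p) for p in (0.20, 0.50, 0.70)]
print([g.number_of_nodes() for g in graph_list])
print(H.number_of_nodes())
```

Working on a copy inside the helper also addresses the worry that the base graph H is altered between iterations.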

Improving BFS performance with some kind of memoization

I'm trying to build an algorithm that finds the distances from one vertex to the others in a graph.
Let's take a really simple example where my network looks like this:
network = [[0,1,2],[2,3,4],[4,5,6],[6,7]]
I wrote BFS code that is supposed to find the length of the paths from a specified source to the graph's other vertices:
from itertools import chain
import numpy as np
n = 8
graph = {}
for i in range(0, n):
    graph[i] = []
# every vertex is adjacent to all other members of its group
for communes in network:
    for vertice in communes:
        work = communes.copy()
        work.remove(vertice)
        graph[vertice].append(work)
# flatten the per-group neighbour lists into one list per vertex
for k, v in graph.items():
    graph[k] = list(chain(*v))
def bsf3(graph, s):
    matrix = np.zeros([n,n])
    dist = {}
    visited = []
    queue = [s]
    dist[s] = 0
    visited.append(s)
    matrix[s][s] = 0
    while queue:
        v = queue.pop(0)
        for neighbour in graph[v]:
            if neighbour in visited:
                pass
            else:
                matrix[s][neighbour] = matrix[s][v] + 1
                queue.append(neighbour)
                visited.append(neighbour)
    return matrix
bsf3(graph,2)
First I create the graph (a dictionary) and then use the function to find the distances.
What I'm concerned about is that this approach doesn't scale to larger networks (say, with 1000 people). I'm thinking about using some kind of memoization (which is why I made a matrix instead of a list). The idea is that when the algorithm calculates the path from, say, 0 to 3 (which it already does), it should also keep track of other routes along the way, in such a way that matrix[1][3] = 1 etc.
Then, when I call the function as bsf3(graph, 1), it would not calculate everything from scratch but could read some values from the matrix.
Thanks in advance!
I know this does not fully answer your question, but here is another approach you can try.
In networking, each node has its own routing table. For every node in the network, the table stores which neighbouring node you have to go to next in order to reach it. Example of a routing table for node D:
A -> B
B -> B
C -> E
D -> D
E -> E
You need to run BFS from each node to build all the routing tables, which takes O(|V|*(|V|+|E|)) time. The space complexity is quadratic, but you have to check all possible paths anyway.
Once you have built all this information, you can simply start from a node, look up your destination node in its table and find the next node to go to. This gives a better time complexity per query (if you use the right data structure for the table).
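A minimal sketch of the "one BFS per node, then look things up" idea is shown below. It stores distances rather than next hops, since distances are what the question asks for. The function name bfs_distances and the dict-of-dicts table are assumptions for the example; the network is the one from the question.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distance from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])        # deque gives O(1) pops from the left
    while queue:
        v = queue.popleft()
        for neighbour in graph[v]:
            if neighbour not in dist:   # dict lookup doubles as the visited check
                dist[neighbour] = dist[v] + 1
                queue.append(neighbour)
    return dist


network = [[0, 1, 2], [2, 3, 4], [4, 5, 6], [6, 7]]

# Every vertex is adjacent to the other members of each group it belongs to.
graph = {}
for group in network:
    for v in group:
        graph.setdefault(v, set()).update(u for u in group if u != v)

# One BFS per vertex, computed once; later queries are plain lookups.
all_dist = {v: bfs_distances(graph, v) for v in graph}
print(all_dist[2][7])
print(all_dist[0][7])
```

Replacing the visited list and queue.pop(0) with a dict and a deque removes the linear-time membership tests and pops, which is likely where most of the slowdown on larger networks comes from.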

How to find all connected subgraph of a graph in networkx?

I'm developing a Python application, and I want to list all possible connected subgraphs of any size, starting from every node, using NetworkX.
I tried using combinations() from the itertools library to find all possible combinations of nodes, but it is far too slow because it also considers sets of nodes that are not connected:
# start at 1: nx.is_connected raises an exception for the empty (0-node) subgraph
for r in range(1, NumberOfNodes):
    for SG in (G.subgraph(s) for s in combinations(G, r)):
        if nx.is_connected(SG):
            nx.draw(SG, with_labels=True)
            plt.show()
The output is correct, but I need a faster way to do this: for a graph of 50 nodes with 8 as LenghtTupleToFind there are more than 500 million combinations of nodes (n! / (r! (n-r)!)), and only a small fraction of them are connected subgraphs, which are the ones I am interested in. Is there a function that can do this?
Sorry for my English, and thank you in advance.
EDIT:
As an example:
so, the results i would like to have:
[0]
[0,1]
[0,2]
[0,3]
[0,1,4]
[0,2,5]
[0,2,5,4]
[0,1,4,5]
[0,1,2,4,5]
[0,1,2,3]
[0,1,2,3,5]
[0,1,2,3,4]
[0,1,2,3,4,5]
[0,3,2]
[0,3,1]
[0,3,2]
[0,1,4,2]
and every other combination that generates a connected graph.
I had the same requirements and ended up using this code, which is very close to what you were doing. This code yields exactly the output you asked for.
import networkx as nx
import itertools
G = you_graph
all_connected_subgraphs = []
# here we ask for all connected subgraphs that have at least 2 nodes AND have less nodes than the input graph
for nb_nodes in range(2, G.number_of_nodes()):
    for SG in (G.subgraph(selected_nodes) for selected_nodes in itertools.combinations(G, nb_nodes)):
        if nx.is_connected(SG):
            print(SG.nodes)
            all_connected_subgraphs.append(SG)
I have modified Charly Empereur-mot's answer by using ego graph to make it faster:
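import networkx as nx
import itertools
G = you_graph.copy()
nb_nodes = 3  # assumed example value (not defined in the original snippet): the subgraph size to look for
all_connected_subgraphs = []
# here we ask for all connected subgraphs that have nb_nodes
for n in you_graph.nodes():
    # a connected subgraph with nb_nodes nodes that contains n lies within radius nb_nodes-1 of n
    egoG = nx.ego_graph(G, n, radius=nb_nodes-1)
    # the ego graph contains n itself, so exclude it before combining
    others = [m for m in egoG if m != n]
    for SG in (G.subgraph(sn + (n,)) for sn in itertools.combinations(others, nb_nodes-1)):
        if nx.is_connected(SG):
            # store a copy: subgraph views change when G is modified below
            all_connected_subgraphs.append(SG.copy())
    # n is done; removing it avoids finding the same subgraph again from another node
    G.remove_node(n)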
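Another way to avoid testing disconnected combinations at all is to grow node sets one neighbour at a time, so that every candidate is connected by construction. The sketch below is one possible take on that idea, not a NetworkX function; the name connected_node_sets and the max_size parameter are assumptions for the example.

```python
import networkx as nx


def connected_node_sets(G, max_size):
    """Yield every node set of size 1..max_size that induces a connected subgraph."""
    current = {frozenset([n]) for n in G}
    for size in range(1, max_size + 1):
        yield from current
        if size == max_size:
            break
        grown = set()
        for nodes in current:
            # neighbours of the set that are not already part of it
            frontier = set().union(*(G[n] for n in nodes)) - nodes
            for nb in frontier:
                grown.add(nodes | {nb})  # the set of sets removes duplicates
        current = grown


# Small check on a path graph 0 - 1 - 2 - 3
path = nx.path_graph(4)
found = sorted(sorted(s) for s in connected_node_sets(path, 3))
for nodes in found:
    print(nodes)
```

Each set can be turned into a graph with G.subgraph(nodes) when needed. The number of connected subgraphs can itself grow exponentially, so this only helps when most combinations are disconnected, as in the question.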
You might want to look into the connected_components function. It returns all the sets of connected nodes, which you can then filter by size and by node.
You can find all the connected components in O(n + m) time for n nodes and m edges, with O(n) memory. Keep a seen boolean array, and run Depth First Search (DFS) or Breadth First Search (BFS) to find the connected components.
In my code, I used DFS to find the connected components.
seen = [False] * num_nodes
def dfs(node):
    # add node to the component currently being built, then visit its unseen neighbours
    component.append(node)
    seen[node] = True
    for neigh in G.neighbors(node):
        if not seen[neigh]:
            dfs(neigh)
all_subgraphs = []
# Assuming nodes are numbered 0, 1, ..., num_nodes - 1
for node in range(num_nodes):
    if seen[node]:
        continue  # already part of an earlier component
    component = []
    dfs(node)
    # Here `component` contains nodes in a connected component of G
    plot_graph(component) # or do anything
    all_subgraphs.append(component)
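For comparison, the same grouping is available from NetworkX directly. The snippet below is a small sketch on a made-up graph, showing connected_components together with the filtering by size and by node mentioned above.

```python
import networkx as nx

# Hypothetical graph with three separate pieces
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (3, 4)])
G.add_node(5)

# Each component is a set of nodes; sort for a stable printout.
components = sorted(sorted(c) for c in nx.connected_components(G))
print(components)

# Filter by size ...
large = [c for c in components if len(c) >= 2]
# ... or ask for the component that contains a given node.
containing_3 = nx.node_connected_component(G, 3)
print(large, containing_3)
```

Note that this gives the maximal connected pieces of the graph, not every connected subgraph inside them.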

Algorithm Is Node A Connected to Node B in Graph

I am looking for an algorithm to check for any valid connection (shortest or longest) between two arbitrary nodes on a graph.
My graph is fixed to a grid with logical (x, y) coordinates with north/south/east/west connections, but nodes can be removed randomly so you can't assume that taking the edge with coords closest to the target is always going to get you there.
The code is in python. The data structure is each node (object) has a list of connected nodes. The list elements are object refs, so we can then search that node's list of connected nodes recursively, like this:
for pnode in self.connected_nodes:
    for cnode in pnode.connected_nodes:
        ...etc
I've included a diagram showing how the nodes map to x,y coords and how they are connected north/east/south/west. Sometimes there are missing nodes (e.g. between J and K), and sometimes there are missing edges (e.g. between G and H). The presence of nodes and edges is in flux (although when we run the algorithm, it works on a fixed snapshot in time), and can only be determined by checking each node's list of connected nodes.
The algorithm needs to yield a simple true/false for whether there is a valid connection between two nodes. Recursing through every list of connected nodes explodes the number of operations required: if the node is n edges away, it requires at most 4^n operations. My understanding is that something like Dijkstra's algorithm finds the shortest path based on edge weights, but would it still work if there is no connection at all?
For some background, I am using this to model 2D destructible objects. Each node represents a chunk of the material, and if one or more nodes do not have a connection to the rest of the material then they should separate off. In the diagram, D, H and R should break off from the main body as they are not connected to it.
UPDATE:
Although many of the posted answers might well work, DFS is quick, easy and very appropriate. I'm not keen on the idea of sticking extra edges with high weights between nodes in order to use Dijkstra, because the nodes themselves might disappear as well as the edges. The SCC method seems more appropriate for distinguishing between strongly and weakly connected graph sections, which in my graph would work if there was a single edge between G and H.
Here is my experiment code for DFS search, which creates the same graph as shown in the diagram.
class node(object):
    def __init__(self, id):
        self.connected_nodes = []
        self.id = id
    def dfs_is_connected(self, node):
        # Initialise our stack and our discovered list
        stack = []
        discovered = []
        # Declare operations count to track how many iterations it took
        op_count = 0
        # Push this node to the stack, for our starting point
        stack.append(self)
        # Keep iterating while the stack isn't empty
        while stack:
            # Pop top element off the stack
            current_node = stack.pop()
            # Is this the droid/node you are looking for?
            if current_node.id == node.id:
                # Stop!
                return True, op_count
            # Check if current node has not been discovered
            if current_node not in discovered:
                # Increment op count
                op_count += 1
                # Put this node in the discovered list
                discovered.append(current_node)
                # Iterate through all connected nodes of the current node
                for connected_node in current_node.connected_nodes:
                    # Push this connected node into the stack
                    stack.append(connected_node)
        # Couldn't find the node, return false. Sorry bud
        return False, op_count
if __name__ == "__main__":
    # Initialise all nodes
    a = node('a')
    b = node('b')
    c = node('c')
    d = node('d')
    e = node('e')
    f = node('f')
    g = node('g')
    h = node('h')
    j = node('j')
    k = node('k')
    l = node('l')
    m = node('m')
    n = node('n')
    p = node('p')
    q = node('q')
    r = node('r')
    s = node('s')
    # Connect up nodes
    a.connected_nodes.extend([b, e])
    b.connected_nodes.extend([a, f, c])
    c.connected_nodes.extend([b, g])
    d.connected_nodes.extend([r])
    e.connected_nodes.extend([a, f, j])
    f.connected_nodes.extend([e, b, g])
    g.connected_nodes.extend([c, f, k])
    h.connected_nodes.extend([r])
    j.connected_nodes.extend([e, l])
    k.connected_nodes.extend([g, n])
    l.connected_nodes.extend([j, m, s])
    m.connected_nodes.extend([l, p, n])
    n.connected_nodes.extend([k, m, q])
    p.connected_nodes.extend([s, m, q])
    q.connected_nodes.extend([p, n])
    r.connected_nodes.extend([h, d])
    s.connected_nodes.extend([l, p])
    # Check if a is connected to q, a to d, and p to h
    print(a.dfs_is_connected(q))
    print(a.dfs_is_connected(d))
    print(p.dfs_is_connected(h))
To find this out, you just need to run a simple DFS or BFS from one of the nodes. It will find all the nodes reachable within that connected component of the graph, so you only need to note whether you came across the other node during the run.
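A minimal BFS version of that check could look like the sketch below. It uses a plain adjacency dict instead of node objects; the function name is_reachable and the dict layout are assumptions for the example, while the connections are the ones from the question's code.

```python
from collections import deque


def is_reachable(adjacency, start, goal):
    """Return True if goal can be reached from start by following edges."""
    if start == goal:
        return True
    seen = {start}              # a set keeps membership tests O(1)
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == goal:
                return True
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False                # the whole component was explored without finding goal


adjacency = {
    "a": ["b", "e"], "b": ["a", "f", "c"], "c": ["b", "g"], "d": ["r"],
    "e": ["a", "f", "j"], "f": ["e", "b", "g"], "g": ["c", "f", "k"],
    "h": ["r"], "j": ["e", "l"], "k": ["g", "n"], "l": ["j", "m", "s"],
    "m": ["l", "p", "n"], "n": ["k", "m", "q"], "p": ["s", "m", "q"],
    "q": ["p", "n"], "r": ["h", "d"], "s": ["l", "p"],
}

print(is_reachable(adjacency, "a", "q"))
print(is_reachable(adjacency, "a", "d"))
print(is_reachable(adjacency, "h", "d"))
```

Each node is visited at most once, so the cost is linear in the number of nodes and edges rather than 4^n.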
There is a way to use Dijkstra to find the path. If there is an edge between two nodes, give it a weight of 1; if there is no edge, give it a weight of sys.maxint. Then, when the minimum path is calculated, if its length is larger than the number of nodes, there is no path between them.
Another approach is to first find the strongly connected components of the graph. If the nodes are in the same strongly connected component, use Dijkstra to find the path; otherwise there is no path that connects them.
You could take a look at the A* path-finding algorithm, which uses heuristics to make it more efficient than Dijkstra's. If there isn't anything in your problem that a heuristic can exploit, you might be better off using Dijkstra's algorithm. You would need positive weights, though; if your graph does not have them, you could simply give each edge a weight of 1.
Looking at the pseudocode on Wikipedia, A* moves from one node to another by getting the neighbours of the current node. Dijkstra's algorithm keeps an adjacency list so that it knows which nodes are connected to each other.
Thus, if you were to start from node H, you could only go to R and D. Since these nodes are not connected to the others, the algorithm will not go through the other nodes.
You can find the strongly connected components (SCC) of your graph and then check whether the nodes of interest are in the same component. In your example, H-R-D will be the first component and the rest the second, so for H and R the result will be true, but for H and A it will be false.
See SCC algorithm here: https://class.coursera.org/algo-004/lecture/53.
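Because the edges in the question's graph go both ways, plain connected components are enough for this check. The sketch below labels every component in one pass, which also shows which chunks should break off from the main body. The function name find_components is an assumption for the example; the connections are taken from the question's code.

```python
def find_components(adjacency):
    """Group the nodes of an undirected graph into connected components."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue                    # already assigned to a component
        component = set()
        stack = [start]
        seen.add(start)
        while stack:                    # iterative DFS from start
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


adjacency = {
    "a": ["b", "e"], "b": ["a", "f", "c"], "c": ["b", "g"], "d": ["r"],
    "e": ["a", "f", "j"], "f": ["e", "b", "g"], "g": ["c", "f", "k"],
    "h": ["r"], "j": ["e", "l"], "k": ["g", "n"], "l": ["j", "m", "s"],
    "m": ["l", "p", "n"], "n": ["k", "m", "q"], "p": ["s", "m", "q"],
    "q": ["p", "n"], "r": ["h", "d"], "s": ["l", "p"],
}

components = sorted(find_components(adjacency), key=len, reverse=True)
main_body, *detached = components
print(sorted(main_body))
print([sorted(c) for c in detached])
```

Two nodes are connected exactly when they end up in the same set, so after one pass any number of true/false queries can be answered by lookup.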
