Generate Complete DFS Paths using networkx - python

I am trying to generate the complete list of paths rather than the optimized DFS edge list. This is better explained with the example below.
import networkx as nx
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3)])
G.add_edges_from([(0, 1), (1, 2), (2, 4)])
G.add_edges_from([(0, 5), (5, 6)])
The above code creates a graph with the paths 0=>1=>2=>3, 0=>1=>2=>4 and 0=>5=>6.
All I want is to extract all paths from 0.
I tried:
>>> list(nx.dfs_edges(G, 0))
[(0, 1), (1, 2), (2, 3), (2, 4), (0, 5), (5, 6)]
All I want is:
[(0, 1, 2, 3), (0, 1, 2, 4), (0, 5, 6)]
Is there any pre-existing networkx method that can be used? If not, is there a way to write an efficient method that does the job?
Note: My problem is limited to the given example. No more corner cases possible.
Note 2: For simplicity the data here is hand-written; in my case the edge list comes from a dataset. The assumption is: given a graph and a node (say 0), can we generate all paths?

Give this a try:
import networkx as nx
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3)])
G.add_edges_from([(0, 1), (1, 2), (2, 4)])
G.add_edges_from([(0, 5), (5, 6)])
pathes = []
path = [0]
for edge in nx.dfs_edges(G, 0):
    if edge[0] == path[-1]:
        # the edge continues the current path
        path.append(edge[1])
    else:
        # DFS backtracked, so the current path is complete
        pathes.append(path)
        # walk back up the path to find where the new edge branches off
        search_index = 2
        while search_index <= len(path):
            if edge[0] == path[-search_index]:
                path = path[:-search_index + 1] + [edge[1]]
                break
            search_index += 1
        else:
            raise Exception("Wrong path structure?", path, edge)
# append last path
pathes.append(path)
print(pathes)
# [[0, 1, 2, 3], [0, 1, 2, 4], [0, 5, 6]]
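As far as I know there is no single networkx call that returns exactly this list, but here is an alternative sketch (my addition, not part of the original answer) that builds the DFS tree from the source and reads off the root-to-leaf paths. It assumes, as in the example, that the graph seen from node 0 is a tree:

import networkx as nx

G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3), (2, 4), (0, 5), (5, 6)])

tree = nx.dfs_tree(G, 0)  # directed tree rooted at the source node
leaves = [n for n in tree if tree.out_degree(n) == 0]
paths = [tuple(nx.shortest_path(tree, 0, leaf)) for leaf in leaves]
print(paths)  # [(0, 1, 2, 3), (0, 1, 2, 4), (0, 5, 6)] (leaf order may vary)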

Related

Adding multiple directed edges in networkx

I know this should be very basic, but I have no clue how to do this using networkx. What I am trying to do is create a MultiDiGraph with 20 nodes. There should be 2 edges connecting each pair of nodes, one going away from the node and the other going towards it. I am unable to create those edges. Any help would be greatly appreciated. It should look something like the picture I have attached.
You could create a graph, and then convert it to a directed graph. In this way you get edges in both directions:
import networkx as nx
g = nx.Graph()
g.add_edges_from([(0, 1), (1, 2), (1, 3)])
g = g.to_directed()
>>> g.edges
OutEdgeView([(0, 1), (1, 0), (1, 2), (1, 3), (2, 1), (3, 1)])
If you want to generate a complete directed graph with n nodes:
import networkx as nx
g = nx.complete_graph(4).to_directed()
>>> g.edges
OutEdgeView([(0, 1), (0, 2), (0, 3), (1, 0), (1, 2), (1, 3), (2, 0), (2, 1), (2, 3), (3, 0), (3, 1), (3, 2)])
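Since the question specifically asks for a MultiDiGraph with 20 nodes, a minimal sketch of the same idea (my addition, not part of the original answer) could be:

import networkx as nx

# Complete graph on 20 nodes, converted so every pair gets one edge in each
# direction, then stored as a MultiDiGraph as the question requests.
g = nx.MultiDiGraph(nx.complete_graph(20).to_directed())
print(g.number_of_nodes())  # 20
print(g.number_of_edges())  # 380 = 20 * 19 ordered pairs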

Can anyone explain this implementation of depth first search?

So I am learning about search algorithms at the minute, and would appreciate it if someone could explain how this implementation of depth-first search works. I understand how depth-first search works as an algorithm, but I am struggling to grasp how it has been implemented here.
Thanks for your patience and understanding. Below is the code:
map = {(0, 0): [(1, 0), (0, 1)],
(0, 1): [(1, 1), (0, 2)],
(0, 2): [(1, 2), (0, 3)],
(0, 3): [(1, 3), (0, 4)],
(0, 4): [(1, 4), (0, 5)],
(0, 5): [(1, 5)],
(1, 0): [(2, 0), (1, 1)],
(1, 1): [(2, 1), (1, 2)],
(1, 2): [(2, 2), (1, 3)],
(1, 3): [(2, 3), (1, 4)],
(1, 4): [(2, 4), (1, 5)],
(1, 5): [(2, 5)],
(2, 0): [(3, 0), (2, 1)],
(2, 1): [(3, 1), (2, 2)],
(2, 2): [(3, 2), (2, 3)],
(2, 3): [(3, 3), (2, 4)],
(2, 4): [(3, 4), (2, 5)],
(2, 5): [(3, 5)],
(3, 0): [(4, 0), (3, 1)],
(3, 1): [(4, 1), (3, 2)],
(3, 2): [(4, 2), (3, 3)],
(3, 3): [(4, 3), (3, 4)],
(3, 4): [(4, 4), (3, 5)],
(3, 5): [(4, 5)],
(4, 0): [(5, 0), (4, 1)],
(4, 1): [(5, 1), (4, 2)],
(4, 2): [(5, 2), (4, 3)],
(4, 3): [(5, 3), (4, 4)],
(4, 4): [(5, 4), (4, 5)],
(4, 5): [(5, 5)],
(5, 0): [(5, 1)],
(5, 1): [(5, 2)],
(5, 2): [(5, 3)],
(5, 3): [(5, 4)],
(5, 4): [(5, 5)],
(5, 5): []}
visited = []
path = []
routes = []

def goal_test(node):
    if node == (5, 5):
        return True
    else:
        return False

found = False

def dfs(visited, graph, node):
    global routes
    visited = visited + [node]
    if goal_test(node):
        routes = routes + [visited]
    else:
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

dfs(visited, map, (0, 0))
print(len(routes))
for route in routes:
    print(route)
This implementation employs several bad practices:
map is a built-in Python function, so it is a bad idea to create a variable with that name and shadow it.
visited should not need to be initialised in the global scope: the caller has no interest in this as it only plays a role in the DFS algorithm itself
routes should not have to be initialised to an empty list either, and it is bad that dfs mutates this global variable. Instead dfs should return that information to the caller. This makes one dfs call self-contained, as it returns the possible routes from the current node to the target. It is up to the caller to extend the routes in this returned collection with an additional node.
The body of goal_test should be written as return node == (5, 5). The if ... else is just translating a boolean value to the same boolean value.
The function goal_test seems overkill when you can just pass an argument to the dfs function that represents the target node. This also makes it more generic, as you don't need to hard-code the target location inside a function.
path and found are initialised but never used.
dfs would run into a stack overflow if the graph had cycles. That does not happen with the given graph, because it is acyclic, but it would be better if the function could also be relied on with cyclic graphs.
dfs will visit the same cell multiple times, as it can be reached via different paths (for instance (2,2)), and from there it will perform the same DFS search it already did before. This could be made slightly more efficient by storing the result obtained from a previous visit to that cell, i.e. we could use memoization. The gain is small here, as most time is spent on creating and copying paths. The gain would be significant if the function only counted the number of paths instead of building them (see the counting sketch after the implementation below).
Here is an implementation that deals with the points mentioned above. It uses a wrapper function to hide the use of memoization from the caller, and to reduce the number of arguments that need to be passed to dfs:
def search(graph, source, target):
    # Use memoization to avoid repeating a DFS from the same node.
    # Also used to mark a node as visited, to avoid running into cycles.
    memo = dict()  # holds routes that were already collected

    def dfs(node):
        if node not in memo:  # not been here before
            if node == target:
                memo[node] = [[target]]
            else:
                # Mark with None that this node is on the current path
                # ...avoiding infinite recursion on a cycle
                memo[node] = None  # temporary value while not yet back from recursion
                memo[node] = [
                    [node] + route
                    for neighbour in graph[node]
                    for route in dfs(neighbour) or []  # `or []`: a node still on the current path yields no routes
                    if route
                ]
        return memo[node]

    return dfs(source)

graph = {(0, 0): [(1, 0), (0, 1)],
         # ...etc ...
         }

routes = search(graph, (0, 0), (5, 5))
print(len(routes))
for route in routes:
    print(route)
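To illustrate the memoization remark in the list above, here is a small counting sketch (my addition, not part of the original answer): when you only count routes instead of building them, each node's result is a single number that is computed once and reused, so the memo pays off.

def count_routes(graph, source, target):
    # Assumes an acyclic graph, like the grid above; counts routes without building them.
    memo = {}
    def dfs(node):
        if node == target:
            return 1
        if node not in memo:
            memo[node] = sum(dfs(neighbour) for neighbour in graph[node])
        return memo[node]
    return dfs(source)

# On the 6x6 grid graph defined in the question this gives 252:
# print(count_routes(graph, (0, 0), (5, 5)))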

Cycle through graph back to starting node

I'm having an issue working out how to cycle back to the start node in my graph. Currently, from the graph I create, I can start at a node and follow the edges until there are no unvisited connected nodes. However, I can't work out how to make it cycle through and finish on the start node, if that is possible.
This is an example of the graph with its connections.
Node & Connection(S) [(0, 4), (1, 5), (1, 8), (3, 1), (4, 0), (4, 3), (5, 0),
(5, 3), (5, 7), (6, 0), (6, 4), (7, 0), (8, 5), (8, 6), (8, 7)]
This is my code to cycle through the graph and follow its edges.
def pathSearch(graph, start, path=[]):
    path = path + [start]
    for node in graph[start]:
        if not node in path:
            path = pathSearch(graph, node, path)
    return path

print('Path ', pathSearch(g, 0))
This is what I get as an output starting from node 0:
pathSearch [0, 4, 3, 1, 5, 7, 8, 6]
This is right, but why isn't it doing a full cycle back to the start node?
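The behaviour comes from the `if not node in path` check: the start node is the first entry in path, so it is always skipped and can never be appended a second time. A minimal sketch (my addition, assuming the graph is stored as a dict of adjacency lists built from the edge list above) that appends the start node again only when the last visited node links back to it:

def pathSearch(graph, start, path=[]):
    path = path + [start]
    for node in graph[start]:
        if node not in path:  # the start node is always in path, so it is always skipped here
            path = pathSearch(graph, node, path)
    return path

# Hypothetical adjacency built from the edge list above
g = {0: [4], 1: [5, 8], 3: [1], 4: [0, 3], 5: [0, 3, 7],
     6: [0, 4], 7: [0], 8: [5, 6, 7]}

path = pathSearch(g, 0)
if path[0] in g.get(path[-1], []):  # close the walk only if the last node has an edge back to the start
    path = path + [path[0]]
print(path)  # [0, 4, 3, 1, 5, 7, 8, 6, 0] with this adjacency order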

Remove duplicate unordered tuples from list

In a list of tuples, I want to have just one copy of a tuple where it may be (x, y) or (y, x).
So, in:
# pairs = list(itertools.product(range(3), range(3)))
pairs = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
the result should be:
result = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)] # updated pairs
This list of tuples is generated using itertools.product() but I want to remove the duplicates.
My working solution:
pairs = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
result = []
for pair in pairs:
    a, b = pair
    # reordering in increasing order
    temp = (a, b) if a < b else (b, a)
    result.append(temp)
print(list(set(result)))  # I could use sorted() but the order doesn't matter
How can this be improved?
You could use combinations_with_replacement
The code for combinations_with_replacement() can be also expressed as a subsequence of product() after filtering entries where the elements are not in sorted order (according to their position in the input pool)
import itertools
pairs = list(itertools.combinations_with_replacement(range(3), 2))
print(pairs)
>>> [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
Edit: I just realized your solution matches mine. What you are doing is just fine. If you need to do this for a very large list, then there are some other options you may want to look into, like a key-value store.
If you need to remove duplicates more programmatically, you can use a function like this:
def set_reduce(pairs):
    new_pairs = set([])
    for x, y in pairs:
        if x < y:
            new_pairs.add((x, y))
        else:
            new_pairs.add((y, x))
    return new_pairs
running this results in
>>> set_reduce(pairs)
set([(0, 1), (1, 2), (0, 0), (0, 2), (2, 2), (1, 1)])
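The same canonicalise-then-deduplicate idea can be written more compactly as a set comprehension (my rewording, not part of the original answer):

pairs = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
unique = {tuple(sorted(p)) for p in pairs}  # sort each pair into a canonical order, let the set deduplicate
print(unique)  # same six pairs as above; set iteration order is arbitrary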
This is one solution which relies on sparse matrices. This works for the following reasons:
An entry in a matrix cannot contain two values. Therefore, uniqueness is guaranteed.
Selecting the upper triangle ensures that (0, 1) is preferred above (1, 0), and inclusion of both is not possible.
import numpy as np
from scipy.sparse import csr_matrix, triu
lst = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1),
       (1, 2), (2, 0), (2, 1), (2, 2)]
# get row coords & col coords
d1, d2 = list(zip(*lst))
# set up sparse matrix inputs
row, col, data = np.array(d1), np.array(d2), np.array([1]*len(lst))
# get upper triangle of matrix including diagonal
m = triu(csr_matrix((data, (row, col))), 0)
# output coordinates
result = list(zip(*(m.row, m.col)))
# [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

fix an ordering of the edges in an undirected networkx graph

In an undirected NetworkX graph, edges are represented as Python tuples. Their ordering depends on the way you ask for them. Here is a small example of what I'm referring to:
import networkx as nx
g = nx.complete_bipartite_graph(2,2)
print(g.edges())
print(g.edges(2))
The output is
[(0, 2), (0, 3), (1, 2), (1, 3)]
[(2, 0), (2, 1)]
Is there a way (not involving manual sorting) to avoid having different representation for an edge?
I'm not sure what you want, since in the title you ask for ordered edges but in your example you ask for ordered nodes within the edges. In my examples I show ordering for both. Note that I use a list comprehension to create new lists of edges; the original list of edges (some_edges) is unchanged.
First how to sort individual tuples of nodes in a list of edges. That is, edges are in the same order but the nodes in them are sorted.
import networkx as nx
g = nx.Graph()
g.add_edges_from([
    (5, 2),
    (2, 1),
    (3, 2),
    (4, 2)
])
some_edges = g.edges(2)
print("Not sorted: ", some_edges)
print("SORTED")
# sort nodes in edges
some_edges_1 = [tuple(sorted(edge)) for edge in some_edges]
print("Sorted nodes:", some_edges_1)
And now how to sort edges in a list of edges.
# sort edges in list of edges
some_edges_2 = sorted(some_edges_1)
print("Sorted edges:", some_edges_2)
Output for both blocks of code above:
Not sorted: [(2, 1), (2, 3), (2, 4), (2, 5)]
SORTED
Sorted nodes: [(1, 2), (2, 3), (2, 4), (2, 5)]
Sorted edges: [(1, 2), (2, 3), (2, 4), (2, 5)]
Here's also an example of reverse sorting where you can actually see the difference between sorting individual edges and sorting a list of edges.
print("Not sorted: ", some_edges)
print("SORTED REVERSE")
# sort nodes in edges
some_edges_1 = [tuple(sorted(edge, reverse=True)) for edge in some_edges]
print("Sorted nodes:", some_edges_1)
# sort edges in list of edges
some_edges_2 = sorted(some_edges_1, reverse=True)
print("Sorted edges:", some_edges_2)
Output:
Not sorted: [(2, 1), (2, 3), (2, 4), (2, 5)]
SORTED REVERSE
Sorted nodes: [(2, 1), (3, 2), (4, 2), (5, 2)]
Sorted edges: [(5, 2), (4, 2), (3, 2), (2, 1)]
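If all you need is for an edge to compare equal no matter which way round it comes back, another option (my addition, not part of the original answer) is to use frozensets rather than sorted tuples:

import networkx as nx

g = nx.complete_bipartite_graph(2, 2)
edges_all = {frozenset(e) for e in g.edges()}
edges_of_2 = {frozenset(e) for e in g.edges(2)}
print(edges_of_2 <= edges_all)  # True: (0, 2) and (2, 0) map to the same frozenset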
