Generate all possible paths from a list of tuples - python

Given a list of tuples, I need to find all unique paths from this list :
Input: [('a','b'),('b','c'),('c','d'),('g','i'),('d','e'),('e','f'),('f','g'),('c','g')]
Output: [['a','b','c','d','e','f','g'],['a','b','c','g','i']]
(the 2 possible unique paths)
Two tuples can connect if the second element of the tuple matches with the first element of the other tuple i.e: One tuple is (_,a) and other tuple is like (a,_).
This issue has already been raised there: Getting Unique Paths from list of tuple but the solution is implemented in haskell (and I know nothing about this language).
But do you know if there's an efficient way to do this in Python?
I know the library itertools has many efficient built in functions for stuff like that, but I'm not too familiar with this.

You want to find all simple paths in your graph.
Python has an amazing library for graph processing: networkx. You can solve your problem with just a few lines of code:
import networkx as nx
a = [('a','b'),('b','c'),('c','d'),('g','i'),('d','e'),('e','f'),('f','g'),('c','g')]
# Create graph
G = nx.Graph()
# Fill graph with data
G.add_edges_from(a)
# Get all simple paths from node 'a' to node 'i'
list(nx.all_simple_paths(G, 'a', 'i'))
will return you:
[['a', 'b', 'c', 'g', 'i'], ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'i']]
If you want ALL possible paths, just replace the last line with this:
for start in G.nodes:
    for end in G.nodes:
        if start != end:
            print(list(nx.all_simple_paths(G, start, end)))

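You can build a dict that maps each parent to a list of connected children, so that you can recursively yield the paths from each parent node. Building the dict takes O(n) time for n edges; the cost of the enumeration itself grows with the number and length of the paths it yields, and the recursion assumes the graph has no cycles: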
def get_paths(parent, mapping):
    if parent not in mapping:
        yield [parent]
        return
    for child in mapping[parent]:
        for path in get_paths(child, mapping):
            yield [parent, *path]

edges = [('a','b'),('b','c'),('c','d'),('g','i'),('d','e'),('e','f'),('f','g'),('c','g')]
parents = set()
children = set()
mapping = {}
for a, b in edges:
    mapping.setdefault(a, []).append(b)
    parents.add(a)
    children.add(b)
print([path for parent in parents - children for path in get_paths(parent, mapping)])
This outputs:
[['a', 'b', 'c', 'd', 'e', 'f', 'g', 'i'], ['a', 'b', 'c', 'g', 'i']]
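The recursive generator above never terminates if the edges contain a cycle. As a minimal sketch (the names `build_mapping` and `get_paths_safe` are made up for this illustration and are not part of the answer), the same idea can carry the current path along and skip any child that is already on it:

```python
def build_mapping(edges):
    """Map each node to the list of nodes it points to."""
    mapping = {}
    for a, b in edges:
        mapping.setdefault(a, []).append(b)
    return mapping


def get_paths_safe(node, mapping, on_path=()):
    """Yield maximal paths from node, never revisiting a node already on the path."""
    on_path = on_path + (node,)
    # Only follow children that are not already on the current path.
    next_nodes = [c for c in mapping.get(node, []) if c not in on_path]
    if not next_nodes:
        yield list(on_path)
        return
    for child in next_nodes:
        yield from get_paths_safe(child, mapping, on_path)


edges = [('a','b'),('b','c'),('c','d'),('g','i'),('d','e'),('e','f'),('f','g'),('c','g')]
paths = list(get_paths_safe('a', build_mapping(edges)))
print(paths)

# A cyclic input (hypothetical) now terminates instead of recursing forever.
cyclic_paths = list(get_paths_safe('a', build_mapping([('a', 'b'), ('b', 'c'), ('c', 'a')])))
print(cyclic_paths)
```

The membership test on a tuple is linear in the path length, which is fine for short paths; a set could be carried alongside the tuple if paths get long.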

You can use recursion with a generator:
d = [('a','b'),('b','c'),('c','d'),('g','i'),('d','e'),('e','f'),('f','g'),('c','g')]
def get_paths(start, c = []):
    r = [b for a, b in d if a == start]
    if r:
        for i in r:
            yield from get_paths(i, c+[i])
    else:
        yield c
print(list(get_paths('a', ['a'])))
Output:
[['a', 'b', 'c', 'd', 'e', 'f', 'g', 'i'], ['a', 'b', 'c', 'g', 'i']]

Related

Separate a List in 2 or more lists

If I have the following list of lists for example:
[['A', 'B', 'C'], ['A', 'D', 'E', 'B', 'C']]
How could I get a list whose sublists have at most 3 elements each? Sublists with 3 or fewer elements should stay as they are; longer ones should be split into overlapping windows of 3 elements, like the following:
[['A', 'B', 'C'], ['A', 'D', 'E'], ['D', 'E', 'B'], ['E', 'B', 'C']]
Could you help me with this ? I've been trying for a long time without success, kinda new to Python.
Edit:
Well, I resolved this in this way:
def separate_in_three(lista):
    paths = []
    for path in lista:
        if len(path) <= 3:
            paths.append(path)
        else:
            for node in range(len(path)-1):
                paths.append(path[:3])
                path.pop(0)
                if(len(path) == 3):
                    paths.append(path)
                    break
    return paths
This seems to solve my problem. Could I use a list comprehension instead, and would it be much more efficient than the way I did it?
Thanks for the help btw !
You can use a list comprehension like the one below.
l = [['A', 'B', 'C'], ['A', 'D', 'E', 'B', 'C','Z']]
[l[0]] + [l[1][i: i+len(l[0])] for i in range(1 + len(l[1]) - len(l[0]))]
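The comprehension above is tied to a two-element outer list. A minimal sketch of a more general version follows; the function name `split_into_windows` and the `size` parameter are assumptions made for this example. It slices instead of popping, so the input lists are left unmodified:

```python
def split_into_windows(paths, size=3):
    """Keep short lists as they are; replace longer ones by overlapping windows."""
    result = []
    for path in paths:
        if len(path) <= size:
            result.append(list(path))
        else:
            # One window per start index, each exactly `size` elements long.
            result.extend(path[i:i + size] for i in range(len(path) - size + 1))
    return result


data = [['A', 'B', 'C'], ['A', 'D', 'E', 'B', 'C']]
windows = split_into_windows(data)
print(windows)
```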

Solving a "colored Quxes" coding challenge with recursion

I am trying to solve some of the coding challenges that I find online, but I got stuck on the problem below. I tried to solve it using recursion, but I feel I am missing a very important concept in recursion. My code works for all of the examples below except the last one, where it breaks down.
Can someone point out the mistake I made in this recursive code, or guide me through solving the issue?
I know why my code breaks, but I don't know how to get around Python's "pass by object reference" behavior, which I think is creating the bigger problem for me.
The coding question is:
On a mysterious island there are creatures known as Quxes which come in three colors: red, green, and blue. One power of the Qux is that if two of them are standing next to each other, they can transform into a single creature of the third color.
Given N Quxes standing in a line, determine the smallest number of them remaining after any possible sequence of such transformations.
For example, given the input ['R', 'G', 'B', 'G', 'B'], it is possible to end up with a single Qux through the following steps:
Arrangement | Change
----------------------------------------
['R', 'G', 'B', 'G', 'B'] | (R, G) -> B
['B', 'B', 'G', 'B'] | (B, G) -> R
['B', 'R', 'B'] | (R, B) -> G
['B', 'G'] | (B, G) -> R
['R'] |
________________________________________
My code is:
class fusionCreatures(object):
    """Regular Numbers Gen.
    """
    def __init__(self , value=[]):
        self.value = value
        self.ans = len(self.value)
    def fusion(self, fus_arr, i):
        color = ['R','G','B']
        color.remove(fus_arr[i])
        color.remove(fus_arr[i+1])
        fus_arr.pop(i)
        fus_arr.pop(i)
        fus_arr.insert(i, color[0])
        return fus_arr
    def fusionCreatures1(self, arr=None):
        # this method is to find the smallest number of creature in a row after fusion
        if arr == None:
            arr = self.value
        for i in range (0,len(arr)-1):
            #print(arr)
            if len(arr) == 2 and i >= 1 or len(arr)<2:
                break
            if arr[i] != arr[i+ 1]:
                arr1 = self.fusion(arr, i)
                testlen = self.fusionCreatures1(arr)
                if len(arr) < self.ans:
                    self.ans = len(arr)
        return self.ans
Testing array (all of them work except the last one):
t1 = fusionCreatures(['R','G','B','G','B'])
t2 = fusionCreatures(['R','G','B','R','G','B'])
t3 = fusionCreatures(['R','R','G','B','G','B'])
t4 = fusionCreatures(['G','R','B','R','G'])
t5 = fusionCreatures(['G','R','B','R','G','R','G'])
t6 = fusionCreatures(['R','R','R','R','R'])
t7 = fusionCreatures(['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'])
print(t1.fusionCreatures1())
print(t2.fusionCreatures1())
print(t3.fusionCreatures1())
print(t4.fusionCreatures1())
print(t5.fusionCreatures1())
print(t6.fusionCreatures1())
print(t7.fusionCreatures1())
I'll start by mentioning that there is a deductive approach that works in O(n) and is detailed in this blog post. It boils down to checking the parity of the counts of the three types of elements in the list to determine which of a few fixed outcomes occurs.
You mention that you'd prefer to use a recursive approach, which is O(n!). This is a good start because it can be used as a tool for helping arrive at the O(n) solution and is a common recursive pattern to be familiar with.
Because we can't know whether a given fusion between two Quxes will ultimately lead to an optimal global solution we're forced to try every possibility. We do this by walking over the list and looking for potential fusions. When we find one, perform the transformation in a new list and call fuse_quxes on it. Along the way, we keep track of the smallest length achieved.
Here's one approach:
from itertools import permutations

def fuse_quxes(quxes, choices="RGB"):
    # map each ordered pair of distinct colors to a one-element list holding the third color
    fusion = {x[:-1]: [x[-1]] for x in permutations(choices)}

    def walk(quxes):
        best = len(quxes)
        for i in range(1, len(quxes)):
            if quxes[i-1] != quxes[i]:
                sub = quxes[:i-1] + fusion[quxes[i-1], quxes[i]] + quxes[i+1:]
                best = min(walk(sub), best)
        return best

    return walk(quxes)
This is pretty much the direction your provided code is moving towards, but the implementation seems unclear. Unfortunately, I don't see any single or quick fix. Here are a few general issues:
Putting the fusionCreatures1 function into a class allows it to mutate external state, namely self.value and self.ans. self.value in particular is poorly named and difficult to keep track of. It seems like the intent is to use it as a reference copy to reset arr to its default value, but arr = self.value means that when fus_arr is mutated in fusion(), self.value is as well. Everything is pretty much a reference to one underlying list.
Adding slices to these copies at least makes the program easier to reason about, for example, arr = self.value[:] and fus_arr = fus_arr[:] in the fusion() function. In short, try to write pure functions.
self.ans is also unclear and unnecessary; better to keep the result value relegated to a local variable within the recursive function.
It seems unnecessary to put a stateless function into a class unless it's a purely static method and the class is acting as a namespace.
Another cause of cognitive overload is branching statements like if and break. We want to minimize the frequency and nesting of these. Here is fusionCreatures1 in pseudocode, with annotations for mutations and complex interactions:
def fusionCreatures1():
    if ...
        read mutated global state
    for i in len(arr):
        if complex length and index checks:
            break
        if arr[i] != arr[i+ 1]:
            impure_func_that_changes_arr_length(arr)
            recurse()
            if new best compared to global state:
                mutate global state
You'll probably agree that it's pretty difficult to mentally step through a run of this function.
In fusionCreatures1(), two variables are unused:
arr1 = self.fusion(arr, i)
testlen = self.fusionCreatures1(arr)
The assignment arr1 = self.fusion(arr, i) (along with the return fus_arr) seems to indicate a lack of understanding that self.fusion is really an in-place function that mutates its argument array. So calling it means arr1 is arr and we have another aliased variable to reason about.
Beyond this, neither arr1 nor testlen is used in the program, so the intent is unclear.
A good linter will pick up these unused variables and identify most of the other complexity issues I've mentioned.
Mutating a list while looping over it is usually disastrous. self.fusion(arr, i) mutates arr inside a loop, making it very difficult to reason about its length and causing an index error when the range(len(arr)) no longer matches the actual len(arr) in the function body (or at least necessitating an in-body precondition). Making self.fusion(arr, i) pure using a slice, as mentioned above, fixes this problem but reveals that there is no recursive base case, resulting in a stack overflow error.
Avoid variable names like arr, arr1, value unless the context is obvious. Again, these obfuscate intent and make the program difficult to understand.
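To make the aliasing point concrete, here is a small sketch contrasting an in-place fusion with a pure one. The helper names `fuse_in_place` and `fuse_pure` are invented for this illustration, and both assume the two neighbours at positions i and i + 1 have different colors:

```python
def fuse_in_place(creatures, i):
    """Mutates its argument, like the original fusion() method."""
    # Assumes creatures[i] != creatures[i + 1], so exactly one color remains.
    third = ({'R', 'G', 'B'} - {creatures[i], creatures[i + 1]}).pop()
    creatures[i:i + 2] = [third]
    return creatures


def fuse_pure(creatures, i):
    """Returns a new list and leaves the argument untouched."""
    third = ({'R', 'G', 'B'} - {creatures[i], creatures[i + 1]}).pop()
    return creatures[:i] + [third] + creatures[i + 2:]


original = ['R', 'G', 'B']
alias = original              # same list object, not a copy
fuse_in_place(alias, 0)
print(original)               # the "reference copy" changed as well

original2 = ['R', 'G', 'B']
fused = fuse_pure(original2, 0)
print(original2, fused)       # the input is unchanged, the result is a new list
```

With the pure version, each recursive call can safely keep iterating over its own list, because nothing it passes down the call stack can change underneath it.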
Some minor style suggestions:
Use snake_case per PEP-8. Class names should be TitleCased to differentiate them from functions. No need to inherit from object--that's implicit.
Use consistent spacing around functions and operators: range (0,len(arr)-1): is clearer as range(len(arr) - 1):, for example. Use vertical whitespace around blocks.
Use lists instead of typing out t1, t2, ... t7.
Function names should be verbs, not nouns. A class like fusionCreatures with a method called fusionCreatures1 is unclear. Something like QuxesSolver.minimize(creatures) makes the intent a bit more obvious.
As for the solution I provided above, there are other tricks worth considering to speed it up. One is memoization, which can help avoid duplicate work (any given list will always produce the same minimized length, so we just store this computation in a dict and spit it back out if we ever see it again). If we hit a length of 1, that's the best we can do globally, so we can skip the rest of the search.
Here's a full runner, including the linear solution translated to Python (again, defer to the blog post to read about how it works):
from collections import defaultdict
from itertools import permutations
from random import choice, randint

def fuse_quxes_linear(quxes, choices="RGB"):
    counts = defaultdict(int)

    for e in quxes:
        counts[e] += 1

    if not quxes or any(x == len(quxes) for x in counts.values()):
        return len(quxes)
    elif len(set(counts[x] % 2 for x in choices)) == 1:
        return 2

    return 1

def fuse_quxes(quxes, choices="RGB"):
    fusion = {x[:-1]: [x[-1]] for x in permutations(choices)}

    def walk(quxes):
        best = len(quxes)

        for i in range(1, len(quxes)):
            if quxes[i-1] != quxes[i]:
                sub = quxes[:i-1] + fusion[quxes[i-1], quxes[i]] + quxes[i+1:]
                best = min(walk(sub), best)

        return best

    return walk(quxes)

if __name__ == "__main__":
    tests = [
        ['R','G','B','G','B'],
        ['R','G','B','R','G','B'],
        ['R','R','G','B','G','B'],
        ['G','R','B','R','G'],
        ['G','R','B','R','G','R','G'],
        ['R','R','R','R','R'],
        ['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']
    ]

    for test in tests:
        print(test, "=>", fuse_quxes(test))
        assert fuse_quxes_linear(test) == fuse_quxes(test)

    for i in range(100):
        test = [choice("RGB") for x in range(randint(0, 10))]
        assert fuse_quxes_linear(test) == fuse_quxes(test)
Output:
['R', 'G', 'B', 'G', 'B'] => 1
['R', 'G', 'B', 'R', 'G', 'B'] => 2
['R', 'R', 'G', 'B', 'G', 'B'] => 2
['G', 'R', 'B', 'R', 'G'] => 1
['G', 'R', 'B', 'R', 'G', 'R', 'G'] => 2
['R', 'R', 'R', 'R', 'R'] => 5
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'] => 2
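The memoization and early-exit ideas mentioned earlier are not part of the runner above. A minimal sketch of how they might look, assuming states are converted to tuples so they can be cached with `functools.lru_cache` (the name `fuse_quxes_memo` is made up for this example):

```python
from functools import lru_cache


def fuse_quxes_memo(quxes, choices="RGB"):
    colors = set(choices)

    @lru_cache(maxsize=None)
    def walk(state):
        best = len(state)
        for i in range(1, len(state)):
            if best == 1:
                break  # a single Qux is the global optimum, stop searching
            if state[i - 1] != state[i]:
                # The fused color is the one that is neither neighbour.
                third = (colors - {state[i - 1], state[i]}).pop()
                sub = state[:i - 1] + (third,) + state[i + 1:]
                best = min(best, walk(sub))
        return best

    # Tuples are hashable, so each distinct arrangement is solved only once.
    return walk(tuple(quxes))


print(fuse_quxes_memo(['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']))
```

The cache lives inside each call to `fuse_quxes_memo`, so results are not shared between separate top-level calls; moving `walk` to module level would change that.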
Here is my suggestion.
First, instead of "R", "G" and "B" I use integer values 0, 1, and 2. This allows nice and easy fusion between a and b, as long as they are different, by simply doing 3 - a - b.
Then my recursion code is:
def fuse_quxes(l):
    n = len(l)
    for i in range(n - 1):
        if l[i] == l[i + 1]:
            continue
        else:
            newn = fuse_quxes(l[:i] + [3 - l[i] - l[i + 1]] + l[i+2:])
            if newn < n:
                n = newn
    return n
Run this with
IN[5]: fuse_quxes([0, 0, 0, 1, 1, 1, 2, 2, 2])
Out[5]: 2
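The integer encoding works because 0 + 1 + 2 == 3, so subtracting two distinct codes from 3 leaves the third one. A small sketch of converting between letters and codes; the names `COLORS`, `encode` and `fuse_pair` are assumptions made for this illustration:

```python
COLORS = "RGB"  # 'R' -> 0, 'G' -> 1, 'B' -> 2


def encode(quxes):
    """Translate color letters into the integer codes 0, 1 and 2."""
    return [COLORS.index(q) for q in quxes]


def fuse_pair(a, b):
    """Return the code of the third color; only meaningful when a != b."""
    return 3 - a - b


encoded = encode(['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'])
third_of_rg = COLORS[fuse_pair(COLORS.index('R'), COLORS.index('G'))]
print(encoded, third_of_rg)
```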
Here is my attempt at the problem.
The explanation is in the comments.
inputs = [['R','G','B','G','B'],
          ['R','G','B','R','G','B'],
          ['R','R','G','B','G','B'],
          ['G','R','B','R','G'],
          ['G','R','B','R','G','R','G'],
          ['R','R','R','R','R'],
          ['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'],]

def fuse_quxes(inp):
    RGB_set = {"R", "G", "B"}
    merge_index = -1
    ## pair each qux with the next one in line and loop through all pairs
    for i, (q1, q2) in enumerate(zip(inp[:-1], inp[1:])):
        merged = RGB_set-{q1,q2}
        ## if more than one item remains in merged after removing q1 and q2, the quxes can't fuse
        if(len(merged))==1:
            merged = merged.pop()
            merge_index=i
            merged_color = merged
            ## loop through the pairs until the result of the fuse differs from the qux on
            ## either its left or its right side
            if (i>0 and merged!=inp[i-1]) or ((i+2)<len(inp) and merged!=inp[i+2]):
                break
    print(inp)
    ## merge the two quxes whose result differs from its left or right neighbour, else do any
    ## possible merge
    if merge_index>=0:
        del inp[merge_index]
        inp[merge_index] = merged_color
        return fuse_quxes(inp)
    else:
        ## if no merge can be made, stop the recursion
        print("Result", len(inp))
        print("_______________________")
        return len(inp)

[fuse_quxes(inp) for inp in inputs]
output
['R', 'G', 'B', 'G', 'B']
['R', 'R', 'G', 'B']
['R', 'B', 'B']
['G', 'B']
['R']
Result 1
_______________________
['R', 'G', 'B', 'R', 'G', 'B']
['R', 'G', 'B', 'R', 'R']
['R', 'G', 'G', 'R']
['B', 'G', 'R']
['B', 'B']
Result 2
_______________________
['R', 'R', 'G', 'B', 'G', 'B']
['R', 'B', 'B', 'G', 'B']
['G', 'B', 'G', 'B']
['R', 'G', 'B']
['R', 'R']
Result 2
_______________________
['G', 'R', 'B', 'R', 'G']
['G', 'G', 'R', 'G']
['G', 'B', 'G']
['R', 'G']
['B']
Result 1
_______________________
['G', 'R', 'B', 'R', 'G', 'R', 'G']
['G', 'G', 'R', 'G', 'R', 'G']
['G', 'B', 'G', 'R', 'G']
['R', 'G', 'R', 'G']
['B', 'R', 'G']
['B', 'B']
Result 2
_______________________
['R', 'R', 'R', 'R', 'R']
Result 5
_______________________
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']
['R', 'R', 'B', 'G', 'G', 'B', 'B', 'B']
['R', 'G', 'G', 'G', 'B', 'B', 'B']
['B', 'G', 'G', 'B', 'B', 'B']
['R', 'G', 'B', 'B', 'B']
['R', 'R', 'B', 'B']
['R', 'G', 'B']
['R', 'R']
Result 2
_______________________
[1, 2, 2, 1, 2, 5, 2]

Get level of items in a nested list

Problem:
I have some linked data and I want to build a structure like the one in this picture:
and get the level of each item, because in the future I will make some calculations starting at the lowest level of my tree structure.
Expected Result:
I need to get a structure that gives me items per level :
level 0: A
level 1: A = B, C,D
level 2: D = E, F, G
level 3: E = H,I , J, K
what I have tried so far:
I've tried this recursive code to simulate the behavior, but I'm unable to get the level of the items.
dict_item = {"A": ["B","C","D"], "D": ["E","F","G"], "E":["H","I","J"]}
def build_bom(product):
    if not dict_item.get(product):
        return product
    else :
        return [build_bom(x) for x in dict_item.get(product)]
print(build_bom("A"))
My output is a nested list like this :
['B', 'C', [['H', 'I', 'J'], 'F', 'G']]
My Question:
I'm not sure if this is the best approach to handle my problem.
And how to get the desired output?
here is the desired output :
[ {"parent_E": ["H", "I", "J"]},
  {"parent_D": ["E", "F", "G"]},
  {"parent_A": ["D", "C", "B"]},
]
A list of dictionaries ( where keys are parents and values are children), the first element in the list is the lowest level of my structure and the last is the highest element.
PS: This is a simulation, but in the future I will have to work on large datasets with this code.
Any Help will be appreciated
This is how I will approach this problem. First, I'll generate the tree from your dict_item object.
dict_item = {"A": ["B","C","D"], "D": ["E","F","G"], "E":["H","I","J"]}
def build_tree(x):
    if x in dict_item:
        return {x: [build_tree(v) for v in dict_item[x]]}
    else:
        return x
tree = build_tree("A")
print(tree)
>>> {'A': ['B', 'C', {'D': [{'E': ['H', 'I', 'J']}, 'F', 'G']}]}
Then, do a breadth-first search on the tree. Each time we hit an element that has children, we append it to a list:
results = []
queue = [tree]
while queue:
    x = queue.pop(0)
    if isinstance(x, dict):
        parent, children = list(x.items())[0]
        results.append({'parent_' + parent: dict_item[parent]})
        for child in children:
            queue.append(child)
print(results)
>>> [{'parent_A': ['B', 'C', 'D']}, {'parent_D': ['E', 'F', 'G']}, {'parent_E': ['H', 'I', 'J']}]
Then all we need to do now, is to reverse the list:
print(list(reversed(results)))
>>> [{'parent_E': ['H', 'I', 'J']}, {'parent_D': ['E', 'F', 'G']}, {'parent_A': ['B', 'C', 'D']}]
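Since the question also asks for the level of each item, here is a hedged sketch that records depths during a breadth-first walk directly over dict_item, without building the nested tree first. The function name `item_levels` is an assumption; the sketch also assumes every item has a single parent and that every parent is reachable from the root:

```python
from collections import deque

dict_item = {"A": ["B", "C", "D"], "D": ["E", "F", "G"], "E": ["H", "I", "J"]}


def item_levels(root, children_of):
    """Return a dict mapping each item to its depth below root (root is level 0)."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        item = queue.popleft()  # O(1), unlike list.pop(0)
        for child in children_of.get(item, []):
            levels[child] = levels[item] + 1
            queue.append(child)
    return levels


levels = item_levels("A", dict_item)
# Order the parents from the deepest level up to the root.
parents_deepest_first = sorted(dict_item, key=lambda p: levels[p], reverse=True)
result = [{"parent_" + p: dict_item[p]} for p in parents_deepest_first]
print(levels)
print(result)
```

Using `collections.deque` keeps each dequeue constant-time, which may matter for the large datasets mentioned in the question.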

Handle Transitivity in Python

I have set of pairwise relationship something like this
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
The number of such relationships is too big to check them individually. Each tuple indicates that both values are the same. I would like to apply transitivity and find the common groups. The output would look like the following:
[('a','b','c','g'), ('d','e','m','z','p'), ('t','k','n','l','j')]
I tried the following code, but it has a bug:
common_cols = []
common_group_count = 0
for (c1, c2) in col_combi:
    found = False
    for i in range(len(common_cols)):
        if (c1 in common_cols[i]):
            common_cols[i].append(c2)
            found = True
            break
        elif (c2 in common_cols[i]):
            common_cols[i].append(c1)
            found = True
            break
    if not found:
        common_cols.append([c1,c2])
Output of above code is following
[['a', 'b', 'c', 'g'], ['d', 'e', 'm', 'z', 'p'], ['l', 'j', 'k'], ['t', 'k', 'n']]
I know why this code is not working. So I would like to know how can I perform this task.
Thanks in advance
You can approach this as a graph problem using the NetworkX library:
import networkx
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
             ('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
             ('j','k')]
g = networkx.Graph(col_combi)
# connected_component_subgraphs() was removed in NetworkX 2.4;
# connected_components() yields each group as a set of nodes
for component in networkx.connected_components(g):
    print(sorted(component))
Output:
['a', 'b', 'c', 'g']
['d', 'e', 'm', 'p', 'z']
['j', 'k', 'l', 'n', 't']
You can implement a solution using sets and union/intersection operations.
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
             ('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
             ('j','k')]
from itertools import combinations
sets = [set(x) for x in col_combi]
stable = False
while not stable:  # loop until no further reduction is found
    stable = True
    # iterate over pairs of distinct sets
    for s,t in combinations(sets, 2):
        if s & t:  # do the sets intersect ?
            s |= t  # move items from t to s
            t ^= t  # empty t
            stable = False
    # remove empty sets
    sets = list(filter(None, sets))  # added list() for python 3
print(sets)
Output (the order of elements within each set may vary):
[{'a', 'c', 'b', 'g'}, {'p', 'e', 'd', 'z', 'm'}, {'t', 'k', 'j', 'l', 'n'}]
Note: doc for itertools.combinations
Here is a solution using itertools that you can take a look at.
lst =[]
import itertools
for a, b in itertools.combinations(col_combi, 2):
    for i in a:
        if i in b:
            lst.append(set(a+b))
for indi,i in enumerate(lst):
    for j in lst:
        if i == j:
            continue
        if i & j:
            lst[indi] = i|j
            lst.remove(j)
print(lst)
Output of this is:
[{'a', 'c', 'b', 'g'}, {'k', 'j', 'l', 'n'}, {'e', 'd', 'm', 'p', 'z'}]
Of course this can be made more efficient. I will try to update soon.
The code after elif assumes the relationship is symmetric, i.e. that (c1, c2) means the same as (c2, c1).
Your algorithm fails if the pairs are not provided in a specific order, because a pair that links two existing groups is only added to the first group that matches; the groups themselves are never merged.
Example:
(a, b) (c, d) (b, c)
will end up with two sets
a, b, c
and
c, d
The problem is about partitioning a set using an equivalence relation. Understanding the set theory background helps in identifying a library that can solve the problem. See https://en.m.wikipedia.org/wiki/Equivalence_relation .
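One standard way to partition items under an equivalence relation without a third-party library is a disjoint-set (union-find) structure. The following is a minimal sketch, not any of the answers' methods; the function name `group_equivalent` and the `reordered` input are made up for this illustration:

```python
def group_equivalent(pairs):
    """Group items connected by pairs, directly or through other items."""
    parent = {}

    def find(x):
        # Follow parent links to the representative of x's group.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for item in parent:
        groups.setdefault(find(item), []).append(item)
    return [sorted(group) for group in groups.values()]


col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
             ('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
             ('j','k')]
groups = group_equivalent(col_combi)
print(groups)

# The result does not depend on the order in which the pairs arrive.
reordered = group_equivalent([('a', 'b'), ('c', 'd'), ('b', 'c')])
print(reordered)
```

Because two groups are merged whenever a pair links them, a late pair such as ('j','k') joins groups that were built separately.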

How does the stack flow/appending works in a recursive call (graph search)

I'm trying to understand recursion; the graph is only here to support my question.
I have this graph:
And I'm using this function to find all the possible paths from one vertex to another.
def find_all_path(self, start_vertex, end_vertex, path=[]):
    graph = self.__graph_dict
    path = path + [start_vertex]
    if end_vertex == start_vertex:
        return [path]
    paths = []
    for neighbour in graph[start_vertex]:
        if neighbour not in path:
            extended_paths = self.find_all_path(neighbour, end_vertex, path)
            for p in extended_paths:
                paths.append(p)
    return paths
Which would give, from a to d:
[['a', 'b', 'd'], ['a', 'b', 'c', 'e', 'd'], ['a', 'b', 'c', 'e', 'f', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'e', 'd'], ['a', 'c', 'e', 'f', 'd']]
1. Is paths passed by reference?
Meaning that paths is changing throughout each stack frame even though the frames are not related.
2. How does paths get all the paths appended to it?
For example, a has b and c as neighbours. Let's say it goes through b first, path = [a,b], and then it calls the function again with (b,d,[a,b]) as parameters. It goes on until it reaches the base case where d == d.
At this point, it returns [a,b,d] and so forth until... what? The stack is empty, how does that work?
Obviously, that's the part I don't get: since paths is returned to the top, how come all the other paths can be appended to it?
Here's me trying to understand this stack appending process:
I think my confusion is related to the flow ("how does the computing process work?"). At first, I thought the longest path would be appended as the first element in paths, meaning the process wouldn't end until the longest path is found, but that's not the case, since [a,b,d] is first.
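One way to see the flow is to log every call as it is entered and left. The sketch below is a standalone rewrite of the method for illustration only. The original graph picture is not available, so the `graph` dict is a guess chosen to reproduce the six paths listed in the question, and the `depth` and `log` parameters are additions. It shows that each call creates its own local `paths` list and hands it back to its caller, which copies the items into its own list:

```python
# Hypothetical adjacency lists, picked so the result matches the question's output.
graph = {
    'a': ['b', 'c'],
    'b': ['d', 'c'],
    'c': ['b', 'e'],
    'd': [],
    'e': ['d', 'f'],
    'f': ['d'],
}


def find_all_paths(graph, start, end, path=(), depth=0, log=None):
    path = path + (start,)  # builds a new tuple; the caller's path is untouched
    if log is not None:
        log.append("  " * depth + "enter " + "".join(path))
    if start == end:
        return [list(path)]  # base case: one finished path
    paths = []  # a fresh list that belongs to this call only
    for neighbour in graph[start]:
        if neighbour not in path:
            # The child call returns its own list; its items are copied into ours.
            paths.extend(find_all_paths(graph, neighbour, end, path, depth + 1, log))
    if log is not None:
        log.append("  " * depth + "leave " + "".join(path)
                   + " with " + str(len(paths)) + " path(s)")
    return paths


trace = []
all_paths = find_all_paths(graph, 'a', 'd', log=trace)
print("\n".join(trace))
print(all_paths)
```

In the trace, the call for a stays open the whole time: it first receives the three paths found through b, then the three found through c, and only then returns. That is why [a, b, d] comes first: results are collected in the order the neighbours are visited, not by path length.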
