I am trying to solve some of the coding challenges that I find online. However I was stopped by the below problem. I tried to solve it using recursion but I feel I am missing a very important concept in recursion. My code works for all of the below examples except the last one it will break down.
Can someone point to me the mistake that I made in this recursion code? Or maybe guide me through solving the issue?
I know why my code breaks but I don't know how to get around the "pass by object reference" in Python which I think creating the bigger problem for me.
The coding question is:
On a mysterious island there are creatures known as Quxes which come in three colors: red, green, and blue. One power of the Qux is that if two of them are standing next to each other, they can transform into a single creature of the third color.
Given N Quxes standing in a line, determine the smallest number of them remaining after any possible sequence of such transformations.
For example, given the input ['R', 'G', 'B', 'G', 'B'], it is possible to end up with a single Qux through the following steps:
Arrangement | Change
----------------------------------------
['R', 'G', 'B', 'G', 'B'] | (R, G) -> B
['B', 'B', 'G', 'B'] | (B, G) -> R
['B', 'R', 'B'] | (R, B) -> G
['B', 'G'] | (B, G) -> R
['R'] |
________________________________________
My code is:
class fusionCreatures(object):
"""Regular Numbers Gen.
"""
def __init__(self , value=[]):
self.value = value
self.ans = len(self.value)
def fusion(self, fus_arr, i):
color = ['R','G','B']
color.remove(fus_arr[i])
color.remove(fus_arr[i+1])
fus_arr.pop(i)
fus_arr.pop(i)
fus_arr.insert(i, color[0])
return fus_arr
def fusionCreatures1(self, arr=None):
# this method is to find the smallest number of creature in a row after fusion
if arr == None:
arr = self.value
for i in range (0,len(arr)-1):
#print(arr)
if len(arr) == 2 and i >= 1 or len(arr)<2:
break
if arr[i] != arr[i+ 1]:
arr1 = self.fusion(arr, i)
testlen = self.fusionCreatures1(arr)
if len(arr) < self.ans:
self.ans = len(arr)
return self.ans
Testing array (all of them work except the last one):
t1 = fusionCreatures(['R','G','B','G','B'])
t2 = fusionCreatures(['R','G','B','R','G','B'])
t3 = fusionCreatures(['R','R','G','B','G','B'])
t4 = fusionCreatures(['G','R','B','R','G'])
t5 = fusionCreatures(['G','R','B','R','G','R','G'])
t6 = fusionCreatures(['R','R','R','R','R'])
t7 = fusionCreatures(['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'])
print(t1.fusionCreatures1())
print(t2.fusionCreatures1())
print(t3.fusionCreatures1())
print(t4.fusionCreatures1())
print(t5.fusionCreatures1())
print(t6.fusionCreatures1())
print(t7.fusionCreatures1())
I'll start by mentioning that there is a deductive approach that works in O(n) and is detailed in this blog post. It boils down to checking the parity of the counts of the three types of elements in the list to determine which of a few fixed outcomes occurs.
You mention that you'd prefer to use a recursive approach, which is O(n!). This is a good start because it can be used as a tool for helping arrive at the O(n) solution and is a common recursive pattern to be familiar with.
Because we can't know whether a given fusion between two Quxes will ultimately lead to an optimal global solution we're forced to try every possibility. We do this by walking over the list and looking for potential fusions. When we find one, perform the transformation in a new list and call fuse_quxes on it. Along the way, we keep track of the smallest length achieved.
Here's one approach:
def fuse_quxes(quxes, choices="RGB"):
fusion = {x[:-1]: [x[-1]] for x in permutations(choices)}
def walk(quxes):
best = len(quxes)
for i in range(1, len(quxes)):
if quxes[i-1] != quxes[i]:
sub = quxes[:i-1] + fusion[quxes[i-1], quxes[i]] + quxes[i+1:]
best = min(walk(sub), best)
return best
return walk(quxes)
This is pretty much the direction your provided code is moving towards, but the implementation seems unclear. Unfortunately, I don't see any single or quick fix. Here are a few general issues:
Putting the fusionCreatures1 function into a class allows it to mutate external state, namely self.value and self.ans. self.value in particular is poorly named and difficult to keep track of. It seems like the intent is to use it as a reference copy to reset arr to its default value, but arr = self.value means that when fus_arr is mutated in fusion(), self.value is as well. Everything is pretty much a reference to one underlying list.
Adding slices to these copies at least makes the program easier to reason about, for example, arr = self.value[:] and fus_arr = fus_arr[:] in the fusion() function. In short, try to write pure functions.
self.ans is also unclear and unnecessary; better to keep the result value relegated to a local variable within the recursive function.
It seems unnecessary to put a stateless function into a class unless it's a purely static method and the class is acting as a namespace.
Another cause of cognitive overload are branching statements like if and break. We want to minimize the frequency and nesting of these. Here is fusionCreatures1 in pseudocode, with annotations for mutations and complex interactions:
def fusionCreatures1():
if ...
read mutated global state
for i in len(arr):
if complex length and index checks:
break
if arr[i] != arr[i+ 1]:
impure_func_that_changes_arr_length(arr)
recurse()
if new best compared to global state:
mutate global state
You'll probably agree that it's pretty difficult to mentally step through a run of this function.
In fusionCreatures1(), two variables are unused:
arr1 = self.fusion(arr, i)
testlen = self.fusionCreatures1(arr)
The assignment arr1 = self.fusion(arr, i) (along with the return fus_arr) seems to indicate a lack of understanding that self.fusion is really an in-place function that mutates its argument array. So calling it means arr1 is arr and we have another aliased variable to reason about.
Beyond this, neither arr1 or testlen are used in the program, so the intent is unclear.
A good linter will pick up these unused variables and identify most of the other complexity issues I've mentioned.
Mutating a list while looping over it is usually disastrous. self.fusion(arr, i) mutates arr inside a loop, making it very difficult to reason about its length and causing an index error when the range(len(arr)) no longer matches the actual len(arr) in the function body (or at least necessitating an in-body precondition). Making self.fusion(arr, i) pure using a slice, as mentioned above, fixes this problem but reveals that there is no recursive base case, resulting in a stack overflow error.
Avoid variable names like arr, arr1, value unless the context is obvious. Again, these obfuscate intent and make the program difficult to understand.
Some minor style suggestions:
Use snake_case per PEP-8. Class names should be TitleCased to differentiate them from functions. No need to inherit from object--that's implicit.
Use consistent spacing around functions and operators: range (0,len(arr)-1): is clearer as range(len(arr) - 1):, for example. Use vertical whitespace around blocks.
Use lists instead of typing out t1, t2, ... t7.
Function names should be verbs, not nouns. A class like fusionCreatures with a method called fusionCreatures1 is unclear. Something like QuxesSolver.minimize(creatures) makes the intent a bit more obvious.
As for the solution I provided above, there are other tricks worth considering to speed it up. One is memoization, which can help avoid duplicate work (any given list will always produce the same minimized length, so we just store this computation in a dict and spit it back out if we ever see it again). If we hit a length of 1, that's the best we can do globally, so we can skip the rest of the search.
Here's a full runner, including the linear solution translated to Python (again, defer to the blog post to read about how it works):
from collections import defaultdict
from itertools import permutations
from random import choice, randint
def fuse_quxes_linear(quxes, choices="RGB"):
counts = defaultdict(int)
for e in quxes:
counts[e] += 1
if not quxes or any(x == len(quxes) for x in counts.values()):
return len(quxes)
elif len(set(counts[x] % 2 for x in choices)) == 1:
return 2
return 1
def fuse_quxes(quxes, choices="RGB"):
fusion = {x[:-1]: [x[-1]] for x in permutations(choices)}
def walk(quxes):
best = len(quxes)
for i in range(1, len(quxes)):
if quxes[i-1] != quxes[i]:
sub = quxes[:i-1] + fusion[quxes[i-1], quxes[i]] + quxes[i+1:]
best = min(walk(sub), best)
return best
return walk(quxes)
if __name__ == "__main__":
tests = [
['R','G','B','G','B'],
['R','G','B','R','G','B'],
['R','R','G','B','G','B'],
['G','R','B','R','G'],
['G','R','B','R','G','R','G'],
['R','R','R','R','R'],
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']
]
for test in tests:
print(test, "=>", fuse_quxes(test))
assert fuse_quxes_linear(test) == fuse_quxes(test)
for i in range(100):
test = [choice("RGB") for x in range(randint(0, 10))]
assert fuse_quxes_linear(test) == fuse_quxes(test)
Output:
['R', 'G', 'B', 'G', 'B'] => 1
['R', 'G', 'B', 'R', 'G', 'B'] => 2
['R', 'R', 'G', 'B', 'G', 'B'] => 2
['G', 'R', 'B', 'R', 'G'] => 1
['G', 'R', 'B', 'R', 'G', 'R', 'G'] => 2
['R', 'R', 'R', 'R', 'R'] => 5
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'] => 2
Here is my suggestion.
First, instead of "R", "G" and "B" I use integer values 0, 1, and 2. This allows nice and easy fusion between a and b, as long as they are different, by simply doing 3 - a - b.
Then my recursion code is:
def fuse_quxes(l):
n = len(l)
for i in range(n - 1):
if l[i] == l[i + 1]:
continue
else:
newn = fuse_quxes(l[:i] + [3 - l[i] - l[i + 1]] + l[i+2:])
if newn < n:
n = newn
return n
Run this with
IN[5]: fuse_quxes([0, 0, 0, 1, 1, 1, 2, 2, 2])
Out[5]: 2
Here is my attempt of the problem
please find the description in comment
inputs = [['R','G','B','G','B'],
['R','G','B','R','G','B'],
['R','R','G','B','G','B'],
['G','R','B','R','G'],
['G','R','B','R','G','R','G'],
['R','R','R','R','R'],
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'],]
def fuse_quxes(inp):
RGB_set = {"R", "G", "B"}
merge_index = -1
## pair qux with next in line and loop through all pairs
for i, (q1, q2) in enumerate(zip(inp[:-1], inp[1:])):
merged = RGB_set-{q1,q2}
## If more than item remained in merged after removing q1 and q2 qux can't fuse
if(len(merged))==1:
merged = merged.pop()
merge_index=i
merged_color = merged
## loop through the pair until result of fuse is different from qux in either right
## or left side
if (i>0 and merged!=inp[i-1]) or ((i+2)<len(inp) and merged!=inp[i+2]):
break
print(inp)
## merge two qux which results to qux differnt from either its right or left else do any
## possible merge
if merge_index>=0:
del inp[merge_index]
inp[merge_index] = merged_color
return fuse_quxes(inp)
else:
## if merge can't be made break the recurssion
print("Result", len(inp))
print("_______________________")
return len(inp)
[fuse_quxes(inp) for inp in inputs]
output
['R', 'G', 'B', 'G', 'B']
['R', 'R', 'G', 'B']
['R', 'B', 'B']
['G', 'B']
['R']
Result 1
_______________________
['R', 'G', 'B', 'R', 'G', 'B']
['R', 'G', 'B', 'R', 'R']
['R', 'G', 'G', 'R']
['B', 'G', 'R']
['B', 'B']
Result 2
_______________________
['R', 'R', 'G', 'B', 'G', 'B']
['R', 'B', 'B', 'G', 'B']
['G', 'B', 'G', 'B']
['R', 'G', 'B']
['R', 'R']
Result 2
_______________________
['G', 'R', 'B', 'R', 'G']
['G', 'G', 'R', 'G']
['G', 'B', 'G']
['R', 'G']
['B']
Result 1
_______________________
['G', 'R', 'B', 'R', 'G', 'R', 'G']
['G', 'G', 'R', 'G', 'R', 'G']
['G', 'B', 'G', 'R', 'G']
['R', 'G', 'R', 'G']
['B', 'R', 'G']
['B', 'B']
Result 2
_______________________
['R', 'R', 'R', 'R', 'R']
Result 5
_______________________
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']
['R', 'R', 'B', 'G', 'G', 'B', 'B', 'B']
['R', 'G', 'G', 'G', 'B', 'B', 'B']
['B', 'G', 'G', 'B', 'B', 'B']
['R', 'G', 'B', 'B', 'B']
['R', 'R', 'B', 'B']
['R', 'G', 'B']
['R', 'R']
Result 2
_______________________
[1, 2, 2, 1, 2, 5, 2]
Problem:
I have some linked data and I want to build a structure like this one on this picture :
and get the level of each item because in the future I will make some calculations by staring at the lowest level of my tree structure.
Expected Result:
I need to get a structure that gives me items per level :
level 0: A
level 1: A = B, C,D
level 2: D = E, F, G
level 3: E = H,I , J, K
what I have tried so far:
I've tried this recursive code to simulate the behavior but I'm unable to get items the level of items.
dict_item = {"A": ["B","C","D"], "D": ["E","F","G"], "E":["H","I","J"]}
def build_bom(product):
if not dict_item.get(product):
return product
else :
return [build_bom(x) for x in dict_item.get(product)]
print(build_bom("A"))
My output is a nested list like this :
['B', 'C', [['H', 'I', 'J'], 'F', 'G']]
My Question:
I'm not sure if this is the best approach to handle my problem.
And how to get the desired output?
here is the desired output :
[ {"parent_E":["H", "I", "J"]},
{"parent_D": ["E", "F", "G"]},
{"parent_A"} :["D","C","B"]},
]
A list of dictionaries ( where keys are parents and values are children), the first element in the list is the lowest level of my structure and the last is the highest element.
PS: This is a simulation but in future, I will have to works on large datasets with this code.
Any Help will be appreciated
This is how I will approach this problem. First, I'll generate the tree from your dict_item object.
dict_item = {"A": ["B","C","D"], "D": ["E","F","G"], "E":["H","I","J"]}
def build_tree(x):
if x in dict_item:
return {x: [build_tree(v) for v in dict_item[x]]}
else:
return x
tree = build_tree("A")
print(tree)
>>> {'A': ['B', 'C', {'D': [{'E': ['H', 'I', 'J']}, 'F', 'G']}]}
Then, do a breadth-first search on the tree. Each time we hit an element that has children, we append it to a list:
results = []
queue = [tree]
while queue:
x = queue.pop(0)
if isinstance(x, dict):
parent, children = list(x.items())[0]
results.append({'parent_' + parent: dict_item[parent]})
for child in children:
queue.append(child)
print(results)
>>> [{'parent_A': ['B', 'C', 'D']}, {'parent_D': ['E', 'F', 'G']}, {'parent_E': ['H', 'I', 'J']}]
Then all we need to do now, is to reverse the list:
print list(reversed(results))
>>> [{'parent_E': ['H', 'I', 'J']}, {'parent_D': ['E', 'F', 'G']}, {'parent_A': ['B', 'C', 'D']}]
I have set of pairwise relationship something like this
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
Number of such relationship is big enough to check it individually. These tuple indicates that both values are same. I would like to apply transitivity and find out common groups. Output would be like following:
[('a','b','c','g'), ('d','e','m','z','p'), ('t','k','n','l','j')]
I tried following code but it has bug,
common_cols = []
common_group_count = 0
for (c1, c2) in col_combi:
found = False
for i in range(len(common_cols)):
if (c1 in common_cols[i]):
common_cols[i].append(c2)
found = True
break
elif (c2 in common_cols[i]):
common_cols[i].append(c1)
found = True
break
if not found:
common_cols.append([c1,c2])
Output of above code is following
[['a', 'b', 'c', 'g'], ['d', 'e', 'm', 'z', 'p'], ['l', 'j', 'k'], ['t', 'k', 'n']]
I know why this code is not working. So I would like to know how can I perform this task.
Thanks in advance
You can approach this as a graph problem using the NetworkX library:
import networkx
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
g = networkx.Graph(col_combi)
for subgraph in networkx.connected_component_subgraphs(g):
print subgraph.nodes()
Output:
['m', 'z', 'e', 'd', 'p']
['t', 'k', 'j', 'l', 'n']
['a', 'c', 'b', 'g']
You can implement a solution using sets and union/intersection operations.
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
from itertools import combinations
sets = [set(x) for x in col_combi]
stable = False
while not stable: # loop until no further reduction is found
stable = True
# iterate over pairs of distinct sets
for s,t in combinations(sets, 2):
if s & t: # do the sets intersect ?
s |= t # move items from t to s
t ^= t # empty t
stable = False
# remove empty sets
sets = list(filter(None, sets)) # added list() for python 3
print sets
Output:
[set(['a', 'c', 'b', 'g']), set(['p', 'e', 'd', 'z', 'm']), set(['t', 'k', 'j', 'l', 'n'])]
Note: doc for itertools.combinations
A solution with itertools, you can take a look.
lst =[]
import itertools
for a, b in itertools.combinations(col_combi, 2):
for i in a:
if i in b:
lst.append(set(a+b))
for indi,i in enumerate(lst):
for j in lst:
if i == j:
continue
if i & j:
lst[indi] = i|j
lst.remove(j)
print lst
Output of this is:
[set(['a', 'c', 'b', 'g']), set(['k', 'j', 'l', 'n']), set(['e', 'd', 'm', 'p', 'z'])]
Of course this can be made more efficient. I will try to update soon.
From the code after elif you assume the relationship is reflexive.
Your algorithm fails if the pairs are not provided in a specific order.
Example:
(b, c) (a, b) (c, d)
will end up with two sets
b, c, d
and
a, b
The problem is about partitioning a set using an equivalence relation. Understanding the set theory background helps identifying a library that can solve the problem. See https://en.m.wikipedia.org/wiki/Equivalence_relation .