I'm learning Python and I'm stuck on what seems to be a simple task.
I want to find all possible combinations of numbers that sum to a given number.
For example: 4 -> [1,1,1,1] [1,1,2] [2,2] [1,3]
I chose an approach that generates all possible subsets (2^n) and then yields only those whose sum equals the number. I'm having trouble with the condition. Code:
def allSum(number):
    #mask = [0] * number
    for i in xrange(2**number):
        subSet = []
        for j in xrange(number):
            #if :
            subSet.append(j)
        if sum(subSet) == number:
            yield subSet

for i in allSum(4):
    print i
BTW, is this a good approach?
Here's some code I saw a few years ago that does the trick:
>>> def partitions(n):
        if n:
            for subpart in partitions(n-1):
                yield [1] + subpart
                if subpart and (len(subpart) < 2 or subpart[1] > subpart[0]):
                    yield [subpart[0] + 1] + subpart[1:]
        else:
            yield []

>>> print(list(partitions(4)))
[[1, 1, 1, 1], [1, 1, 2], [2, 2], [1, 3], [4]]
Additional References:
http://mathworld.wolfram.com/Partition.html
http://en.wikipedia.org/wiki/Partition_(number_theory)
http://www.site.uottawa.ca/~ivan/F49-int-part.pdf
Here is an alternate approach that starts from a list of all 1s and recursively collapses it by adding subsequent elements together. This should be more efficient than generating all possible subsets:
def allSum(number):
    def _collapse(lst):
        yield lst
        while len(lst) > 1:
            lst = lst[:-2] + [lst[-2] + lst[-1]]
            for prefix in _collapse(lst[:-1]):
                if not prefix or prefix[-1] <= lst[-1]:
                    yield prefix + [lst[-1]]
    return list(_collapse([1] * number))
>>> allSum(4)
[[1, 1, 1, 1], [1, 1, 2], [2, 2], [1, 3], [4]]
>>> allSum(5)
[[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 2, 2], [1, 1, 3], [2, 3], [1, 4], [5]]
You can strip off the last value if you don't want the trivial case. If you will only be looping over the results, remove the list call and return the generator directly.
This is equivalent to the problem described in this question and can use a similar solution.
To elaborate:
def allSum(number):
    for solution in possibilites(range(1, number+1), number):
        expanded = []
        for value, qty in zip(range(1, number+1), solution):
            expanded.extend([value]*qty)
        yield expanded
That translates this question into that question and back again.
That solution doesn't work, right? It will never add a number to a subset more than once, so you will never get, for example, [1,1,2]. It will never skip a number, either, so you will never get, for example, [1,3].
So the problem with your solution is twofold. One, you are not actually generating all possible subsets of the range 1..number. Two, the set of all subsets excludes things that you should be including, because it does not allow a number to appear more than once.
This kind of problem can be generalized as a search problem. Imagine that the numbers you want to try are nodes on a tree, and then you can use depth-first search to find all paths through the tree that represent a solution. It's an infinitely large tree, but luckily, you never need to search all of it.
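The depth-first search described above can be sketched as follows. This is only one possible reading of the idea, not the answerer's code: the function name partitions_dfs and its parameters are made up for illustration, and the sketch assumes the parts should be listed in non-decreasing order so that each combination appears once. A branch is abandoned as soon as the remaining target cannot be reached, which is why the infinite tree never has to be searched in full.

```python
def partitions_dfs(target, smallest=1, path=None):
    """Yield every non-decreasing list of positive integers that sums to target."""
    if path is None:
        path = []
    if target == 0:
        # The current path is a complete solution; yield a copy of it.
        yield list(path)
        return
    # Only try values >= the previous part (avoids reordered duplicates)
    # and <= the remaining target (prunes branches that overshoot).
    for value in range(smallest, target + 1):
        path.append(value)
        yield from partitions_dfs(target - value, value, path)
        path.pop()  # backtrack before trying the next value


print(list(partitions_dfs(4)))
# [[1, 1, 1, 1], [1, 1, 2], [1, 3], [2, 2], [4]]
```

Drop the final [4] if the trivial one-element combination is not wanted.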
Related
I am solving a recursion problem, where I am given an array of integers and asked to return the powerset of it.
e.g. powerset of [1,2,3] is [[],[1],[2],[3],[1,2],[1,3],[2,3],[1,2,3]]
Here is the recursive code that does it:
def powerset(array, idx = None):
    if idx is None:
        idx = len(array) - 1
    if idx <0:
        return [[]]
    ele = array[idx]
    subset = powerset(array,idx-1)
    for i in range(len(subset)):
        currentSubset = subset[i]
        subset.append(currentSubset + [ele])
    return subset
While I do understand what is happening for the most part, my questions are:
when we get to the base case idx<0, this means the idx pointer points outside of the array, and we don't want to call array[idx], but my question is- do we return the empty set "[[]]" just as a filler, so that the top recursive call on the recursion stack gets executed next? Otherwise what does this do?
This might be a tallish order, but can someone explain with respect to the example [1,2,3] how the recursive call runs?
Here is my understanding;
We start with the pointer idx pointing at 3, so ele=3, we then initialise a subset called subset that holds the powerset of [1,2]
Here is where I am confused, and struggling to see how the code pans out... Do we now go to the next batch of code which is the for loop? Or do we calculate the powerset of [1,2]?
Following Chepner's suggestion:
def powerset(array):
    return _powerset(array,len(array)-1)

def _powerset(array,index):
    if index <0:
        return [[]]
    ele = array[index]
    subset = _powerset(array,index-1)
    for i in range(len(subset)):
        subset.append(subset[i] + [ele])
    return subset
Adding print to the start and end of a recursive call is a useful way to visualize how it works.
def powerset(array, idx=None, indent=0):
    trace = f"{' '*indent}powerset({array}, {idx})"
    print(f"{trace}...")
    if idx is None:
        idx = len(array)-1
    if idx < 0:
        print(f"{trace} -> [[]]")
        return [[]]
    ele = array[idx]
    subset = powerset(array, idx-1, indent+1)
    for i in range(len(subset)):
        subset.append(subset[i] + [ele])
    print(f"{trace} -> {subset}")
    return subset
prints:
powerset([1, 2, 3], None)...
 powerset([1, 2, 3], 1)...
  powerset([1, 2, 3], 0)...
   powerset([1, 2, 3], -1)...
   powerset([1, 2, 3], -1) -> [[]]
  powerset([1, 2, 3], 0) -> [[], [1]]
 powerset([1, 2, 3], 1) -> [[], [1], [2], [1, 2]]
powerset([1, 2, 3], None) -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
Note that 0 and None need to be handled differently, which is why you need to use idx is None instead of not idx!
(edit) Per notes in the comments, one way to avoid the idx = None confusion (other than to wrap it in another function layer) is to rework the recursion slightly so that it doesn't need the idx variable in the first place. Rather than passing the full array and a variable indicating what part to iterate through, pass the subset of the array that you want to compute the powerset for. That makes each recursive call (including the very first one) operate according to the exact same "contract" -- compute the powerset of this list.
def powerset(array, indent=0):
    trace = f"{' '*indent}powerset({array})"
    print(f"{trace}...")
    if array:
        p = powerset(array[:-1], indent+1)
        p.extend([s + [array[-1]] for s in p])
    else:
        p = [[]]
    print(f"{trace} -> {p}")
    return p
powerset([1, 2, 3])...
 powerset([1, 2])...
  powerset([1])...
   powerset([])...
   powerset([]) -> [[]]
  powerset([1]) -> [[], [1]]
 powerset([1, 2]) -> [[], [1], [2], [1, 2]]
powerset([1, 2, 3]) -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
Note that the sequence of recursive calls is computing the exact same result at each step, but the input is simpler -- rather than watching idx go from None to 1 to 0 to -1 and having to reason about what that means, we see array steadily shrink towards the base case, and then each layer of the stack adding the last element of array to each of the previous call's subsets.
As written, the function returns the power set of array[:idx+1] if it's passed an explicit index.
The test idx < 0 is really testing idx == -1, as other negative values should never occur. In this case, the function should compute the power set of the empty set, which is a set that contains one element: the empty set. The set containing the empty set is represented by [[]].
My impression is that you're trying to think of the recursive function as a loop written in a confusing way. You should instead just think of it as a function that computes something – in this case, a power set – and happens to use itself as a library function (in a way that always terminates).
Given a nonempty set S, which contains an element x, and a way to compute the power set of the smaller set S\{x}, how do you get the power set of S? Answer: for each set in the power set of S\{x}, return two sets: that set, and that set with x added to it. That's what this code does.
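The rule in the last paragraph can also be applied bottom-up, without recursion, which may make it easier to see that nothing more is going on. The sketch below is an illustration only (the name powerset_from_rule is not from the question): it starts from the power set of the empty set and, for each element x, keeps every existing subset and adds a copy of it with x appended.

```python
def powerset_from_rule(items):
    # Power set of the empty set: one element, the empty set itself.
    subsets = [[]]
    for x in items:
        # For each subset found so far, keep it and also add "that subset plus x".
        subsets += [s + [x] for s in subsets]
    return subsets


print(powerset_from_rule([1, 2, 3]))
# [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```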
I am working on the following question:
Given a set of numbers that might contain duplicates, find all of its distinct subsets.
You can use the following as an example :
Example 1:
Input: [1, 3, 3]
Output: [], [1], [3], [1,3], [3,3], [1,3,3]
Example 2:
Input: [1, 5, 3, 3]
Output: [], [1], [5], [3], [1,5], [1,3], [5,3], [1,5,3], [3,3],
[1,3,3], [3,3,5], [1,5,3,3]
My approach is
class Solution:
    def distinct_subset(self, nums):
        n = len(nums)
        previousEnd = 0
        output = []
        for i in range(n):
            # judge if the current element is equal to the previous element
            # if so, only update the elements generated in the previous iteration
            if i > 0 and nums[i] == nums[i-1]:
                previousStart = previousEnd + 1
            else:
                previousStart = 0
            perviousEnd = len(output)
            # create a temp array to store the output from the previous iteration
            temp = list(output[previousStart:previousEnd])
            # add current element to all the array generated by the previous iteration
            output += [j + [nums[i]] for j in temp]
        return output

def main():
    print("Here is the list of subsets: " + str(Solution().distinct_subset([1, 3, 3])))
    print("Here is the list of subsets: " + str(Solution().distinct_subset([1, 5, 3, 3])))

main()
However, my approach will only return []:
Here is the list of subsets: []
Here is the list of subsets: []
Process finished with exit code 0
I am not sure where I went wrong. The algorithm is supposed to update the output in each iteration, but it fails to do so.
Please feel free to share your ideas. Thanks in advance for your help.
Yes, I ran your code, and the function always returns an empty list no matter what you pass in: output starts out empty, so there is never anything to extend and it stays blank.
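For comparison, here is one way the original iterative idea might be repaired. It is a sketch, not the asker's intended solution, and it rests on three assumptions: the output must be seeded with the empty subset (otherwise there is nothing to extend), the misspelled perviousEnd must be the same variable as previousEnd, and the input must be sorted so that equal values sit next to each other. The names distinct_subsets, previous_end and start are chosen here for illustration.

```python
def distinct_subsets(nums):
    nums = sorted(nums)   # group equal values together
    output = [[]]         # seed with the empty subset
    previous_end = 0      # length of output before the previous round
    for i, value in enumerate(nums):
        # For a repeated value, extend only the subsets added in the
        # previous round; otherwise extend every subset found so far.
        start = previous_end if i > 0 and value == nums[i - 1] else 0
        previous_end = len(output)
        output += [subset + [value] for subset in output[start:previous_end]]
    return output


print(distinct_subsets([1, 3, 3]))
# [[], [1], [3], [1, 3], [3, 3], [1, 3, 3]]
```

Because the input is sorted first, [1, 5, 3, 3] is processed as [1, 3, 3, 5], so the subsets come out in sorted element order rather than in the order shown in Example 2.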
Forgive me, but I had to look up what 'all distinct subsets' meant, and I stumbled across this code, but it seems to do exactly what you are asking.
# Python3 program to find all subsets of
# given set. Any repeated subset is
# considered only once in the output
def printPowerSet(arr, n):
    # Function to find all subsets of given set.
    # Any repeated subset is considered only
    # once in the output
    _list = []
    # Run counter i from 000..0 to 111..1
    for i in range(2**n):
        subset = ""
        # consider each element in the set
        for j in range(n):
            # Check if jth bit in the i is set.
            # If the bit is set, we consider
            # jth element from set
            if (i & (1 << j)) != 0:
                subset += str(arr[j]) + "|"
        # if subset is encountered for the first time
        # If we use set<string>, we can directly insert
        if subset not in _list and len(subset) > 0:
            _list.append(subset)
    # consider every subset
    for subset in _list:
        # split the subset and print its elements
        arr = subset.split('|')
        for string in arr:
            print(string, end = " ")
        print()

# Driver Code
if __name__ == '__main__':
    arr = [10, 12, 12, 17]
    n = len(arr)
    printPowerSet(arr, n)
However, as you can see, the above code does not use a class, just a single function. If that works for you, great. If you are required to use a class, let me know, since the code will obviously need to change.
I assume the below is what you are looking for:
[1, 3, 3] to [1,3]
[1, 5, 3, 3] to [1,5,3]
The set(list) function will do that for you easily; however, it doesn't handle compound data structures well.
The code below works for compound data that is one level deep:
[[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]
to:
[[1, 1], [0, 1], [0, 0], [1, 0]]
code:
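def get_unique(items):
    # Yield each element the first time it is seen, preserving order.
    # (The parameter is no longer called "list", which shadowed the builtin.)
    temp = []
    for i in items:
        if i not in temp:
            temp.append(i)
            yield i

data = [[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]
print(*get_unique(data))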
I've trimmed the above code to give you your desired outputs, still not in a class though, is this okay?...
def distinct_subset(user_input):
    n = len(user_input)
    output = []
    for i in range(2 ** n):
        subset = ""
        for j in range(n):
            if (i & (1 << j)) != 0:
                subset += str(user_input[j]) + ", "
        if subset[:-2] not in output and len(subset) > 0:
            output.append(subset[:-2])
    return output

def main():
    print("Here is the list of subsets: " + str(distinct_subset([1, 3, 3])))
    print("Here is the list of subsets: " + str(distinct_subset([1, 5, 3, 3])))

main()
You're looking for distinct combinations of the powerset of your list.
Using itertools to generate the combinations and a set to eliminate duplicates, you could write the function like this:
from itertools import combinations

def uniqueSubsets(A):
    A = sorted(A)
    return [*map(list,{subset for size in range(len(A)+1)
                              for subset in combinations(A,size)})]

print(uniqueSubsets([1,3,3]))
# [[1, 3], [3, 3], [1], [3], [], [1, 3, 3]]
print(uniqueSubsets([1,5,3,3]))
# [1, 3] [3, 3] [1] [3] [3, 3, 5] [1, 3, 5] [1, 5] [5] [] [1, 3, 3, 5] [1, 3, 3] [3, 5]
If you have a lot of duplicates, it may be more efficient to filter them out as you go. Here is a recursive generator function that short-circuits the expansion when a combination has already been seen. It generates combinations by removing one element at a time (starting from the full size) and recursing to get shorter combinations.
def uniqueSubsets(A,seen=None):
    if seen is None: seen,A = set(),sorted(A)
    for i in range(len(A)):                    # for each position in the list
        subset = (*A[:i],*A[i+1:])             # combination without that position
        if subset in seen: continue            # skip combinations already seen
        seen.add(subset)
        yield from uniqueSubsets(subset,seen)  # get shorter combinations
    yield list(A)

print(*uniqueSubsets([1,3,3]))
# [] [3] [3, 3] [1] [1, 3] [1, 3, 3]
print(*uniqueSubsets([1,5,3,3]))
# [] [5] [3] [3, 5] [3, 3] [3, 3, 5] [1] [1, 5] [1, 3] [1, 3, 5] [1, 3, 3] [1, 3, 3, 5]
In both cases we are sorting the list in order to ensure that the combinations will always present the values in the same order for the set() to recognize them. (otherwise lists such as [3,3,1,3] could still produce duplicates)
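A small illustration of why the sort matters, using only itertools.combinations and a set. The helper name distinct_pairs is made up for this example. Without sorting, the same pair of values can show up as two different tuples, and the set cannot tell that they are the same combination.

```python
from itertools import combinations


def distinct_pairs(values):
    # Collect every 2-element combination in a set to drop exact repeats.
    return set(combinations(values, 2))


unsorted_pairs = distinct_pairs([3, 1, 3])
sorted_pairs = distinct_pairs(sorted([3, 1, 3]))

# Unsorted input yields both (3, 1) and (1, 3); sorted input yields only (1, 3).
print(len(unsorted_pairs), len(sorted_pairs))  # 3 2
```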
I have a list/array of integers. Call a subarray a peak if it goes up and then goes down. For example:
[5,5,4,5,4]
contains
[4,5,4]
which is a peak.
Also consider
[6,5,4,4,4,4,4,5,6,7,7,7,7,7,6]
which contains
[6,7,7,7,7,7,6]
which is a peak.
The problem
Given an input list, I would like to find all the peaks of minimal length contained in it and report them. In the example above, [5,6,7,7,7,7,7,6] is also a peak, but it remains a peak if we remove its first element, so we don't report it.
So for input list:
L = [5,5,5,5,4,5,4,5,6,7,8,8,8,8,8,9,9,8]
we would return
[4,5,4] and [8,9,9,8] only.
I am having problems devising a nice algorithm for this. Any help would be hugely appreciated.
Using itertools
Here is a short solution using itertools.groupby to detect peaks. The groups identifying peaks are then unpacked to yield the actual sequence.
from itertools import groupby, islice
l = [1, 2, 1, 2, 2, 0, 0]
fst, mid, nxt = groupby(l), islice(groupby(l), 1, None), islice(groupby(l), 2, None)
peaks = [[f[0], *m[1], n[0]] for f, m, n in zip(fst, mid, nxt) if f[0] < m[0] > n[0]]
print(peaks)
Output
[[1, 2, 1], [1, 2, 2, 0]]
Using a loop (faster)
The above solution is elegant but since three instances of groupby are created, the list is traversed three times.
Here is a solution using a single traversal.
def peaks(lst):
    first = 0
    last = 1
    while last < len(lst) - 1:
        if lst[first] < lst[last] == lst[last+1]:
            last += 1
        elif lst[first] < lst[last] > lst[last+1]:
            yield lst[first:last+2]
            first = last + 1
            last += 2
        else:
            first = last
            last += 1

l = [1, 2, 1, 2, 2, 0, 0]
print(list(peaks(l)))
Output
[[1, 2, 1], [1, 2, 2, 0]]
Notes on benchmark
Upon benchmarking with timeit, I noticed an increase in performance of about 20% for the solution using a loop. For short lists the overhead of groupby could bring that number up to 40%. The benchmark was done on Python 3.6.
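A comparison along these lines could be set up roughly as follows. This is a sketch of a possible harness, not the benchmark that was actually run: the function names peaks_groupby and peaks_loop, the input size and the repeat count are all assumptions, and the timings will differ by machine and Python version.

```python
import timeit
from itertools import groupby, islice


def peaks_groupby(lst):
    # Three independent groupby iterators, offset by 0, 1 and 2 groups.
    fst = groupby(lst)
    mid = islice(groupby(lst), 1, None)
    nxt = islice(groupby(lst), 2, None)
    return [[f[0], *m[1], n[0]]
            for f, m, n in zip(fst, mid, nxt) if f[0] < m[0] > n[0]]


def peaks_loop(lst):
    # Single traversal with two indices, as in the loop solution above.
    first, last = 0, 1
    while last < len(lst) - 1:
        if lst[first] < lst[last] == lst[last + 1]:
            last += 1
        elif lst[first] < lst[last] > lst[last + 1]:
            yield lst[first:last + 2]
            first = last + 1
            last += 2
        else:
            first = last
            last += 1


sample = [1, 2, 1, 2, 2, 0, 0]
data = sample * 1000  # hypothetical input size

t_groupby = timeit.timeit(lambda: peaks_groupby(data), number=20)
t_loop = timeit.timeit(lambda: list(peaks_loop(data)), number=20)
print(f"groupby: {t_groupby:.4f}s  loop: {t_loop:.4f}s")
```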
I am trying to get all possible permutations of the characters in a list. I need the function to return all the permutations as a list of lists, where each element is one permutation. I can't figure out what's wrong with my code. I tried playing around with the lists, but nothing helps. I'm trying to get this done without importing anything.
the code:
def permutation(lst1, num_of_perms):
    if num_of_perms == len(lst1) - 1:
        print(lst1)
    for i in range(num_of_perms, len(lst1)):
        # "removes" the first component of the list and returns all
        # permutations where it is the first letter
        lst1[i], lst1[num_of_perms] = lst1[num_of_perms], lst1[i]
        # swaps two components of the list each time.
        permutation(lst1, num_of_perms + 1)
        lst1[i], lst1[num_of_perms] = lst1[num_of_perms], lst1[i]
        # swaps back before the next loop
Also, I am open to any tips on how to improve my coding style.
There is a difference between returning a value and printing a value, although the difference may be tricky to see if you are only running the function from the interactive interpreter, as it always prints the return value of a function to standard output.
The simplest fix would be to make permutation a generator function, as it simply involves replacing print with yield. You need to yield a copy of the list in the base case (otherwise, when you finally iterate the return value, you get references to whatever lst1 refers to at that time, rather than what it referred to when you used yield). You also need to explicitly yield the values from the recursive call.
def permutation(lst1, num_of_perms):
    if num_of_perms == len(lst1) - 1:
        yield lst1[:]
    for i in range(num_of_perms, len(lst1)):
        # "removes" the first component of the list and returns all
        # permutations where it is the first letter
        lst1[i], lst1[num_of_perms] = lst1[num_of_perms], lst1[i]
        # swaps two components of the list each time.
        yield from permutation(lst1, num_of_perms + 1)
        lst1[i], lst1[num_of_perms] = lst1[num_of_perms], lst1[i]
        # swaps back before the next loop
With those changes, you can make a list out of the generator itself:
>>> list(permutation([1,2,3],0))
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 2, 1], [3, 1, 2]]
or iterate over it one permutation at a time
>>> for i, p in enumerate(permutation([1,2,3], 0)):
...     print("{}) {}".format(i, p))
...
0) [1, 2, 3]
1) [1, 3, 2]
2) [2, 1, 3]
3) [2, 3, 1]
4) [3, 2, 1]
5) [3, 1, 2]
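To see why the copy in the base case matters, here is a deliberately broken variant for comparison. The name permutation_no_copy is made up for this illustration; it is the same generator except that it yields lst itself instead of lst[:]. Every yielded item is then the same list object, and by the time list() has finished, all the swaps have been undone.

```python
def permutation_no_copy(lst, start):
    if start == len(lst) - 1:
        yield lst  # no copy: the very same list object is yielded every time
    for i in range(start, len(lst)):
        lst[i], lst[start] = lst[start], lst[i]
        yield from permutation_no_copy(lst, start + 1)
        lst[i], lst[start] = lst[start], lst[i]  # swap back


results = list(permutation_no_copy([1, 2, 3], 0))
all_same_object = all(r is results[0] for r in results)

print(results)
# [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]
print(all_same_object)
# True
```

Iterating and printing one permutation at a time would still look correct with this variant, because each value is printed before the next swap happens; the problem only shows up once the results are collected.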
I want to split a list into a nested list. The list I have is this:
[1,2,1,3,2]
Now, I want the output to be like this:
[[1,2],[2,1],[1,3],[3,2]]
Is there any way to produce the output shown above?
You can use zip
lst = [1,2,1,3,2]
res = [list(pair) for pair in zip(lst, lst[1:])]
print(res) # -> [[1, 2], [2, 1], [1, 3], [3, 2]]
Note: the first instance of lst in zip does not have to be sliced, since it is the shorter of the two iterables that dictates the number of tuples that will be generated.
As @Jean-FrancoisFabre said in the comments, if the original list is big, you might want to go with a generator instead of a hard slice.
import itertools
res = [list(pair) for pair in zip(lst, itertools.islice(lst, 1, None))]
The benefit of this approach (or the drawback of the previous one) is that the second list used in zip (lst[1:]) is not created in memory, but you will need to import itertools for it to work.
You're looking for bi-grams. Here is a generic function to generate n-grams from a sequence.
def ngrams(seq, n):
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

a = [1,2,1,3,2]
print(ngrams(a, 2))
# [[1, 2], [2, 1], [1, 3], [3, 2]]