Shortest subset for given sum and fastest solution in Python

This is a variant of the "given sum" problem, and I am trying to write a Python solution that solves it in O(log n) time. For a given natural number N (N >= 1), find the smallest number of items p_1..p_n whose sum equals N, where the items are produced by the following rule:
p_1 is exactly 1
each p_i is either p_(i-1) * 2 or p_(i-1) + 1
Accordingly, p_2 is always 2, but p_3 can be either 3 or 4.
For the input N = 18, the candidate sequences are [1, 2, 4, 5, 6] and [1, 2, 3, 4, 8].
And the answer is 5.
This is the code I have written so far, but it is slow and freezes on moderately "big" (N >= 1000) values:
possible = None
def solution(N):
    global possible
    possible = list()
    tea(1, [1], N)
    sizes = [len(p) for p in possible]
    return min(sizes)
def tea(n, l, target):
    global possible
    if (sum(l) > target):
        return
    elif (sum(l) == target):
        possible.append(l)
    i = n * 2
    tea(i, l + [i], target)
    i = n + 1
    tea(i, l + [i], target)
print(solution(18))
# should print 5
print(solution(220))
# should print 11
print(solution(221))
# no such solution? print -1
How to solve it in a more efficient way?
Fastest solutions are most crucial but a more pythonic code is appreciated as well.

Use breadth-first search to reduce wasted effort. The code below could be optimized more.
def solution(n):
    q = [(1,)]
    visited = set()
    for seq in q:
        s = sum(seq)
        if s == n:
            return seq
        elif s > n:
            continue
        key = (seq[-1], s)
        if key in visited:
            continue
        visited.add(key)
        q.append(seq + (seq[-1] * 2,))
        q.append(seq + (seq[-1] + 1,))
    return None
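The breadth-first idea can also be written so that it returns only the count, and -1 when no sequence exists. The following is a minimal sketch under the assumption that a search state is fully described by the pair (last item, running sum); `shortest_count` is a hypothetical name, not part of the question's code.

```python
from collections import deque

def shortest_count(target):
    """Length of the shortest valid sequence summing to target, or -1."""
    start = (1, 1)               # (last item, running sum)
    queue = deque([(start, 1)])  # each entry: (state, sequence length)
    seen = {start}
    while queue:
        (last, total), length = queue.popleft()
        if total == target:
            return length        # BFS reaches the shortest length first
        for nxt in (last * 2, last + 1):
            state = (nxt, total + nxt)
            # Skip states that overshoot the target or were already queued.
            if state[1] <= target and state not in seen:
                seen.add(state)
                queue.append((state, length + 1))
    return -1

print(shortest_count(18))  # 5
```

Storing only the state instead of the whole sequence keeps memory proportional to the number of distinct (last, sum) pairs.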

You are looking for the shortest solution, so once you have found a solution there is no need to look for longer ones.
You can change the code so that it stops exploring sequences that are already at least as long as one you have found
(note the added if condition):
def tea(n, l, target):
    global possible
    if (sum(l) > target):
        return
    elif (sum(l) == target):
        possible.append(l)
    # we want to keep looking for new solutions only if l is shorter!
    if possible and (len(l) >= max(len(i) for i in possible)):
        return
    i = n * 2
    tea(i, l + [i], target)
    i = n + 1
    tea(i, l + [i], target)
In addition, it seems you want the function to return -1 when there is no solution. Currently your code raises an error in that case, so I would change the solution() function to this:
possible = []
def solution(N):
    global possible
    possible = []  # reset, so results from an earlier call do not leak into this one
    tea(1, [1], N)
    sizes = [len(p) for p in possible]  # or: sizes = list(map(len, possible))
    if sizes:
        return min(sizes)
    return -1
As for "more pythonic" code, I would write it this way:
def solution(N):
    possibles = []
    tea(1, [1], N, possibles)
    if not possibles:
        return -1
    else:
        return min(map(len, possibles))
def tea(n, l, target, possibles):  # maybe a better name than "tea"
    if (sum(l) > target):
        return
    elif (sum(l) == target):
        possibles.append(l)
        return
    if possibles and (len(l) >= max(len(i) for i in possibles)):
        return
    tea(n * 2, l + [n * 2], target, possibles)
    tea(n + 1, l + [n + 1], target, possibles)

Related

Why is my code O(N^2) if there is only one for loop

I am trying to solve problem 4.1 on Codility.com. I need a solution which runs in O(N) time. My code solves the problem, but it runs in O(N^2) time, according to Codility's performance tests. Unfortunately I can't see why. There is only one for loop, which should scale linearly with n.
The challenge is to write a solution which tests whether an array ('A') contains all integers between 1 and X. It should return the index of the element which is the last element in 1 to X that appears in the array, or it should return -1 if not every integer between 1 and X is an element of the array. For example, the solution to X = 4 and A = [1, 3, 4, 2, 2] is 3 since 2 is the last element in 1 to 4 which appears in the array and it first appears in position 3. The solution to X = 5 and A = [1, 2, 4, 2, 3] is -1 because 5 never appears. My solution is below.
def Solution(X, A):
    N = len(A)
    count = [0] * (X + 1)
    # Solution for single-element arrays
    if N == 1:
        if A[0] == 1:
            return 0
        elif A[0] != 1:
            return -1
    # Solution for multi-element arrays
    elif N != 1:
        for i in range(0, N + 1, 1):
            if count[A[i]] == 0:
                count[A[i]] = count[A[i]] + 1
            else:
                pass
            if count == [0] + [1] * (X):
                return i
            elif count != [0] + [1] * (X) and i == N - 1:
                return -1
Would anyone know why it runs in O(N^2) time? Codility's performance tests confirm this, but as far as I can see this should run in O(kN) time since there is only one for loop. Any help is appreciated.
Something like this would work in O(n), you would have to adjust it to the exact question given:
def solution(x, a):
    b = []
    for v in a:
        # the conditions
        if (type(v) == int) and (v < x) and (v > 1):
            b.append(v)
        else:
            return -1
    return b
# some examples
# this returns -1
a = [1,2,3,4,5.5,6]
x = solution(5,a)
print(x)
# this returns the list [2, 3, 4]
a = [2,3,4]
x = solution(5,a)
print(x)
The reason is that you can exit the loop at the first failure with the return -1 statement.
This should be O(N): a single instantiation of a list of length X, one for-loop over A which does O(1) retrieval of single list items, and one single final search of the list for None values.
def solution(X, A):
    pos = [None] * X
    for i, n in enumerate(A):
        if 1 <= n <= X:
            if pos[n-1] is None:
                pos[n-1] = i
    # the answer is the latest first-occurrence index, not the one tied to the last element of A
    return max(pos) if None not in pos else -1
print(solution(4, [1, 3, 4, 2, 2]))
print(solution(5, [1, 2, 4, 2, 3]))
print(solution(1000, reversed(range(1, 1001))))
prints:
3
-1
999
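As for why the original loop is quadratic: the test `count == [0] + [1] * (X)` builds and compares a list of length X + 1 on every iteration, so each pass of the single for loop costs O(X) rather than O(1). A sketch of one way to avoid that, assuming a simple counter of values still missing (`earliest_full_cover` and `remaining` are illustrative names, not from the question):

```python
def earliest_full_cover(X, A):
    """Index at which every integer 1..X has appeared in A, or -1."""
    seen = [False] * (X + 1)
    remaining = X                    # how many of 1..X are still missing
    for i, value in enumerate(A):
        if 1 <= value <= X and not seen[value]:
            seen[value] = True
            remaining -= 1
            if remaining == 0:       # O(1) check instead of a list comparison
                return i
    return -1

print(earliest_full_cover(4, [1, 3, 4, 2, 2]))  # 3
print(earliest_full_cover(5, [1, 2, 4, 2, 3]))  # -1
```

This variant also returns as soon as the last missing value is seen, without scanning the rest of the array.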

Can I return the base case as answer?

def encryption(n):
    if len(n) == 2:
        return n
    result = []
    for i in range(len(n) - 1):
        total = n[i] + n[i + 1]
        right_most_digit = total % 10
        result.append(right_most_digit)
    if len(result) > 2:
        result = encryption(result)
    return "".join([str(x) for x in result])
if __name__ == '__main__':
    numbers = [1, 5, 7, 9]
    print(encryption(numbers))
I need help with this code.
The problem is to add the two adjacent numbers and keep the right_most_digit, repeat the process to a point where only two numbers left, and return a string.
For example, 1+5, 5+7, 7+9 will be 6,2,6, then 6+2, 2+6 will be 8,8, then return 88 as a string.
With the least changes:
def encryption(n):
    if len(n) == 2:
        return n
    result = []
    for i in range(len(n) - 1):
        total = n[i] + n[i + 1]
        right_most_digit = total % 10
        result.append(right_most_digit)
    if len(result) > 2:
        result = encryption(result)
    return result
if __name__ == '__main__':
    numbers = [1, 5, 7, 9]
    print("".join([str(x) for x in encryption(numbers)]))
If you also want to strip any leading 0, use:
print("".join([str(x) for x in encryption(numbers)]).lstrip("0"))
Since you asked about a different approach, you might consider not doing this recursively. You can simply keep processing the list in a while loop until the list is of length 2 or less. Then convert to a string:
def encryption(n):
    while len(n) > 2:
        n = [(a + b) % 10 for a, b in zip(n, n[1:])]
    return "".join([str(x) for x in n])
numbers = [1, 5, 7, 9]
print(encryption(numbers))
# 88
This is a little easier to reason about, and it is also easy to debug: stick a print(n) inside the loop to see the progression.

Find n integers in list that after multiplying equal to m

I need to print the indexes of n elements of a list whose product equals a given integer. It is guaranteed that such a combination exists in the list. For example, for the following input (first line: number of elements in the array, target product, number of elements in the wanted sublist; second line: the array itself):
7 60 4
30 1 1 3 10 6 4
I should get in any order
1 2 4 5
Because 1*1*10*6==60. If there are more than 1 solution I need to print any of them.
My solution works but is pretty slow. How can I make it faster?
from itertools import chain, combinations
arr = list(map(int, input().split()))
numbers = list(map(int, input().split()))
s = sorted(numbers)
def filtered_sublists(input_list, length):
    return (
        l for l in all_sublists(input_list)
        if len(l) == length
    )
def all_sublists(l):
    return chain(*(combinations(l, i) for i in range(len(l) + 1)))
def multiply(arr):
    result = 1
    for x in arr:
        result = result * x
    return result
def get_indexes(data):
    indexes = []
    for i in range(len(data)):
        if arr[1] == multiply(data[i]):
            for el in data[i]:
                if numbers.index(el) in indexes:
                    all_ind = [i for i, x in enumerate(numbers) if x == el]
                    for ind in all_ind:
                        if ind not in indexes:
                            indexes.append(ind)
                            break
                else:
                    indexes.append(numbers.index(el))
            break
    return indexes
sublists = list(filtered_sublists(numbers, arr[2]))
print(*get_indexes(sublists))
The key is to avoid testing every combination.
def combo(l, n=4, target=60, current_indices=[], current_mul=1):
    if current_mul > target and target > 0:
        return
    elif len(current_indices) == n and current_mul == target:
        yield current_indices
        return
    for i, val in enumerate(l):
        if (not current_indices) or (i > current_indices[-1] and val * current_mul <= target):
            yield from combo(l, n, target, current_indices + [i], val * current_mul)
l = [30,1,1,3,10,6,4]
for indices in combo(l, n=4, target=60):
    print(*indices)
Prints:
1 2 4 5
More test cases:
l = [1,1,1,2,3,3,9]
for indices in combo(l, n=4, target=9):
    print(*indices)
Prints:
0 1 2 6
0 1 4 5
0 2 4 5
1 2 4 5
We can use a memoized recursion for an O(n * k * num_factors) solution, where num_factors depends on how many factors of the target product we can create. The recurrence should be fairly clear from the code. (Zeros aren't handled, but extra handling for them should be simple to add.)
Pythonesque JavaScript code:
function f(A, prod, k, i=0, map={}){
  if (i == A.length || k == 0)
    return []
  if (map[[prod, k]])
    return map[[prod, k]]
  if (prod == A[i] && k == 1)
    return [i]
  if (prod % A[i] == 0){
    const factors = f(A, prod / A[i], k - 1, i + 1, map)
    if (factors.length){
      map[[prod, k]] = [i].concat(factors)
      return map[[prod, k]]
    }
  }
  return f(A, prod, k, i + 1, map)
}
var A = [30, 1, 1, 3, 10, 6, 4]
console.log(JSON.stringify(f(A, 60, 4)))
console.log(JSON.stringify(f(A, 60, 3)))
console.log(JSON.stringify(f(A, 60, 1)))
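Since the question is in Python, here is a rough Python sketch of the same memoized-recursion idea. It is not a line-by-line translation: it assumes the memo key is (position, remaining product, items left) and uses `functools.lru_cache` for the cache. `find_factor_indices` is a hypothetical name, and, like the JavaScript above, it does not handle a target product of zero.

```python
from functools import lru_cache

def find_factor_indices(values, product, count):
    """Return `count` indices whose values multiply to `product`, or None."""
    @lru_cache(maxsize=None)
    def search(i, remaining, k):
        if k == 0:
            return () if remaining == 1 else None
        if i == len(values):
            return None
        v = values[i]
        # Take values[i] only if it divides what is left (zeros are skipped).
        if v != 0 and remaining % v == 0:
            rest = search(i + 1, remaining // v, k - 1)
            if rest is not None:
                return (i,) + rest
        # Otherwise try the same subproblem without values[i].
        return search(i + 1, remaining, k)

    result = search(0, product, count)
    return None if result is None else list(result)

print(find_factor_indices([30, 1, 1, 3, 10, 6, 4], 60, 4))  # [1, 2, 4, 5]
```

Because failures are cached as well as successes, a subproblem that cannot be solved is not explored twice.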
You could start from the target product and recursively divide by factors in the remaining list until you get down to 1 and after using the specified number of factors. This has the advantage of quickly eliminating whole branches of recursion under numbers that are not a factor of the target product.
Handling zero values in the list and a target product of zero requires a couple of special conditions at the start and while traversing factors.
For example:
def findFactors(product, count, factors, offset=0):
    if product == 0: return sorted((factors.index(0)+i)%len(factors) for i in range(count))
    if not count: return [] if product == 1 else None
    if not factors: return None
    for i,factor in enumerate(factors,1):
        if factor == 0 or product%factor != 0: continue
        subProd = findFactors(product//factor,count-1,factors[i:],i+offset)
        if subProd is not None: return [i+offset-1]+subProd
r = findFactors(60, 4, [30,1,1,3,10,6,4])
print(r) # [1, 2, 4, 5]
r = findFactors(60, 4, [30,1,1,0,3,10,6,4])
print(r) # [1, 2, 5, 6]
r = findFactors(0, 4, [30,1,1,3,10,6,0,4])
print(r) # [0, 1, 6, 7]

Python query in list without for loop

I want to find a pair of numbers in a Python list that adds up to a given sum.
The list is sorted
Only consecutive pairs need to be checked
Avoid using a for loop
I used a for loop to get the job done and it's working fine. I want to learn other, more optimized ways to get the same result.
Can I get the same result without using a for loop?
How could I use binary search in this situation?
This is my code:
def query_sum(list, find_sum):
    """
    This function will find sum of two pairs in list
    and return True if sum exist in list
    :param list:
    :param find_sum:
    :return:
    """
    previous = 0
    for number in list:
        sum_value = previous + number
        if sum_value == find_sum:
            print("Yes sum exist with pair {} {}".format(previous, number))
            return True
        previous = number
x = [1, 2, 3, 4, 5]
y = [1, 2, 4, 8, 16]
query_sum(x, 7)
query_sum(y, 3)
This is the result:
Yes sum exist with pair 3 4
Yes sum exist with pair 1 2
You can indeed use binary search if your list is sorted (and you are only looking at sums of successive elements), since the sums will be monotonically increasing as well. In a list of N elements, there are N-1 successive pairs. You can copy and paste any properly implemented binary search algorithm you find online and replace the criteria with the sum of successive elements. For example:
def query_sum(seq, target):
    def bsearch(l, r):
        if r >= l:
            mid = l + (r - l) // 2
            s = sum(seq[mid:mid + 2])
            if s == target:
                return mid
            elif s > target:
                return bsearch(l, mid - 1)
            else:
                return bsearch(mid + 1, r)
        else:
            return -1
    # there are only N-1 successive pairs, so the last valid start index is len(seq) - 2
    i = bsearch(0, len(seq) - 2)
    if i < 0:
        return False
    print("Sum {} exists with pair {} {}".format(target, *seq[i:i + 2]))
    return True
You could use the built-in bisect module, but then you would have to pre-compute all of the sums. The search above is much cheaper, since it only has to compute about log2(N) sums.
Also, this solution avoids looping using recursion, but you might be better off writing a loop like while r >= l: around the logic instead of using recursion:
def query_sum(seq, target):
    def bsearch(l, r):
        while r >= l:
            mid = l + (r - l) // 2
            s = sum(seq[mid:mid + 2])
            if s == target:
                return mid
            elif s > target:
                r = mid - 1
            else:
                l = mid + 1
        return -1
    # there are only N-1 successive pairs, so the last valid start index is len(seq) - 2
    i = bsearch(0, len(seq) - 2)
    if i < 0:
        return False
    print("Yes sum exist with pair {} {}".format(*seq[i:i + 2]))
    return True
A simpler one:
def query_sum(seq, target):
    def search(seq, index, target):
        # stop before the last element, which has no successor to pair with
        if index < len(seq) - 1:
            if sum(seq[index:index+2]) == target:
                return index
            else:
                return search(seq, index+1, target)
        else:
            return -1
    return search(seq, 0, target)
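For comparison, the bisect alternative mentioned in the binary-search answer could look roughly like the sketch below. It assumes the list is sorted, so the sums of successive pairs are non-decreasing, and it pre-computes all N-1 sums before searching (which is the extra cost that answer points out). `query_sum_bisect` is a hypothetical name.

```python
from bisect import bisect_left

def query_sum_bisect(seq, target):
    """Return the index i with seq[i] + seq[i+1] == target, or -1."""
    # Pre-compute all N-1 adjacent sums; non-decreasing when seq is sorted.
    sums = [a + b for a, b in zip(seq, seq[1:])]
    i = bisect_left(sums, target)  # leftmost position where target could sit
    return i if i < len(sums) and sums[i] == target else -1

print(query_sum_bisect([1, 2, 3, 4, 5], 7))   # 2
print(query_sum_bisect([1, 2, 4, 8, 16], 3))  # 0
```

The list comprehension is still a linear pass, so this trades the O(log N) work of the hand-written search for simpler code.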

Largest subarray with sum equal to 0

This is a typical interview question. Given an array that contains both positive and negative elements without 0, find the largest subarray whose sum equals 0. I tried to solve this. This is what I came up with.
def sub_array_sum(array,k=0):
    start_index = -1
    hash_sum = {}
    current_sum = 0
    keys = set()
    best_index_hash = {}
    for i in array:
        start_index += 1
        current_sum += i
        if current_sum in hash_sum:
            hash_sum[current_sum].append(start_index)
            keys.add(current_sum)
        else:
            if current_sum == 0:
                best_index_hash[start_index] = [(0,start_index)]
            else:
                hash_sum[current_sum] = [start_index]
    if keys:
        for k_1 in keys:
            best_start = hash_sum.get(k_1)[0]
            best_end_list = hash_sum.get(k_1)[1:]
            for best_end in best_end_list:
                if abs(best_start-best_end) in best_index_hash:
                    best_index_hash[abs(best_start-best_end)].append((best_start+1,best_end))
                else:
                    best_index_hash[abs(best_start-best_end)] = [(best_start+1,best_end)]
    if best_index_hash:
        (bs,be) = best_index_hash[max(best_index_hash.keys(),key=int)].pop()
        return array[bs:be+1]
    else:
        print("No sub array with sum equal to 0")
def Main():
    a = [6,-2,8,5,4,-9,8,-2,1,2]
    b = [-8,8]
    c = [-7,8,-1]
    d = [2200,300,-6,6,5,-9]
    e = [-9,9,-6,-3]
    print(sub_array_sum(a))
    print(sub_array_sum(b))
    print(sub_array_sum(c))
    print(sub_array_sum(d))
    print(sub_array_sum(e))
if __name__ == '__main__':
    Main()
I am not sure whether this handles all the edge cases; if someone can comment on that, it would be excellent. I also want to extend this to sums equal to any K, not just 0. How should I go about it? Any pointers for optimizing this further would also be helpful.
You have given a nice, linear-time solution (better than the two other answers at this time, which are quadratic-time), based on the idea that whenever sum(i .. j) = 0, it must be that sum(0 .. i-1) = sum(0 .. j) and vice versa. Essentially you compute the prefix sums sum(0 .. i) for all i, building up a hashtable hash_sum in which hash_sum[x] is a list of all positions i having sum(0 .. i) = x. Then you go through this hashtable, one sum at a time, looking for any sum that was made by more than one prefix. Among all such made-more-than-once sums, you choose the one that was made by a pair of prefixes that are furthest apart -- this is the longest.
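To make the prefix-sum argument concrete, here is a small sketch that groups positions by prefix sum. `positions_by_prefix_sum` is a hypothetical helper, not taken from the question's code, and the `-1` entry standing for the empty prefix is an assumption that lets a subarray starting at index 0 be treated like any other.

```python
from collections import defaultdict
from itertools import accumulate

def positions_by_prefix_sum(values):
    """Map each prefix sum to the list of positions where it occurs."""
    positions = defaultdict(list)
    positions[0].append(-1)  # the empty prefix sums to 0 "before" index 0
    for index, total in enumerate(accumulate(values)):
        positions[total].append(index)
    return dict(positions)

print(positions_by_prefix_sum([-7, 8, -1]))
# {0: [-1, 2], -7: [0], 1: [1]}
```

Here the sum 0 occurs at positions -1 and 2, so the elements at indices 0 through 2 sum to zero; any sum listed at two or more positions marks a zero-sum subarray between them.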
Since you already noticed the key insight needed to make this algorithm linear-time, I'm a bit puzzled as to why you build up so much unnecessary stuff in best_index_hash in your second loop. For a given sum x, the furthest-apart pair of prefixes that make that sum will always be the smallest and largest entries in hash_sum[x], which will necessarily be the first and last entries (because that's the order they were appended), so there's no need to loop over the elements in between. In fact you don't even need a second loop at all: you can keep a running maximum during your first loop, by treating start_index as the rightmost endpoint.
To handle an arbitrary difference k: Instead of finding the leftmost occurrence of a prefix summing to current_sum, we need to find the leftmost occurrence of a prefix summing to current_sum - k. But that's just first_with_sum[current_sum - k].
The following code isn't tested, but should work:
def sub_array_sum(array, k=0):
    start_index = -1
    first_with_sum = {}
    first_with_sum[0] = -1
    best_start = -1
    best_len = 0
    current_sum = 0
    for i in array:
        start_index += 1
        current_sum += i
        if current_sum - k in first_with_sum:
            if start_index - first_with_sum[current_sum - k] > best_len:
                best_start = first_with_sum[current_sum - k] + 1
                best_len = start_index - first_with_sum[current_sum - k]
        # record only the first (leftmost) position of each prefix sum
        if current_sum not in first_with_sum:
            first_with_sum[current_sum] = start_index
    if best_len > 0:
        # the slice end is exclusive, so no -1 here
        return array[best_start:best_start+best_len]
    else:
        print("No subarray found")
Setting first_with_sum[0] = -1 at the start means that we don't have to treat a range beginning at index 0 as a special case. Note that this algorithm doesn't improve on the asymptotic time or space complexity of your original one, but it's simpler to implement and will use a small amount less space on any input that contains a zero-sum subarray.
Here's my own answer, just for fun.
The number of subsequences is quadratic, and the time to sum a subsequence is linear, so the most naive solution would be cubic.
This approach is just an exhaustive search over the subsequences, but a little trickery avoids the linear summing factor, so it's only quadratic.
from collections import namedtuple
from itertools import chain
class Element(namedtuple('Element', ('index', 'value'))):
    """
    An element in the input sequence. ``index`` is the position
    of the element, and ``value`` is the element itself.
    """
    pass
class Node(namedtuple('Node', ('a', 'b', 'sum'))):
    r"""
    A node in the search graph, which looks like this:
        0     1     2     3
         \   / \   / \   /
          0-1   1-2   2-3
             \ /   \ /
             0-2   1-3
                \ /
                0-3
    ``a`` is the start Element, ``b`` is the end Element, and
    ``sum`` is the sum of elements ``a`` through ``b``.
    """
    @classmethod
    def from_element(cls, e):
        """Construct a Node from a single Element."""
        return Node(a=e, b=e, sum=e.value)
    def __add__(self, other):
        """The combining operation depicted by the graph above."""
        assert self.a.index == other.a.index - 1
        assert self.b.index == other.b.index - 1
        return Node(a=self.a, b=other.b, sum=(self.sum + other.b.value))
    def __len__(self):
        """The number of elements represented by this node."""
        return self.b.index - self.a.index + 1
def get_longest_k_sum_subsequence(ints, k):
    """The longest subsequence of ``ints`` that sums to ``k``."""
    n = get_longest_node(n for n in generate_nodes(ints) if n.sum == k)
    if n:
        return ints[n.a.index:(n.b.index + 1)]
    if k == 0:
        return []
def get_longest_zero_sum_subsequence(ints):
    """The longest subsequence of ``ints`` that sums to zero."""
    return get_longest_k_sum_subsequence(ints, k=0)
def generate_nodes(ints):
    """Generates all Nodes in the graph."""
    nodes = [Node.from_element(Element(i, v)) for i, v in enumerate(ints)]
    while len(nodes) > 0:
        for n in nodes:
            yield n
        nodes = [x + y for x, y in zip(nodes, nodes[1:])]
def get_longest_node(nodes):
    """The longest Node in ``nodes``, or None if there are no Nodes."""
    return max(chain([()], nodes), key=len) or None
if __name__ == '__main__':
    def f(*ints):
        return get_longest_zero_sum_subsequence(list(ints))
    assert f() == []
    assert f(1) == []
    assert f(0) == [0]
    assert f(0, 0) == [0, 0]
    assert f(-1, 1) == [-1, 1]
    assert f(-1, 2, 1) == []
    assert f(1, -1, 1, -1) == [1, -1, 1, -1]
    assert f(1, -1, 8) == [1, -1]
    assert f(0, 1, -1, 8) == [0, 1, -1]
    assert f(5, 6, -2, 1, 1, 7, -2, 2, 8) == [-2, 1, 1]
    assert f(5, 6, -2, 2, 7, -2, 1, 1, 8) == [-2, 1, 1]
I agree with sundar nataraj when he says that this must be posted to the code review forum.
For fun though I looked at your code. Though I am able to understand your approach, I fail to understand the need to use Counter.
best_index_hash[start_index] = [(0,start_index)] - Here best_index_hash is of the type Counter. Why are you assigning a list to it?
for key_1, value_1 in best_index_hash.most_common(1) - You are trying to get the largest subsequence, and for that you are using most_common as the answer. This is not intuitive semantically.
I am tempted to post a solution but I will wait for you to edit the code snippet and improve it.
Addendum
For fun, I had a go at this puzzle and I present my effort below. I make no guarantees of correctness/completeness.
from collections import defaultdict
def max_sub_array_sum(a, s):
    if a:
        span = defaultdict(lambda : (0,0))
        current_total = 0
        for i in xrange(len(a)):
            current_total = a[i]
            for j in xrange (i + 1, len(a)):
                current_total += a[j]
                x,y = span[current_total]
                if j - i > y - x:
                    span[current_total] = i,j
        if s in span:
            i, j = span[s]
            print "sum=%d,span_length=%d,indices=(%d,%d),sequence=%s" %\
                (s, j-i + 1, i, j, str(a[i:j + 1]))
            return
    print "Could not find a subsequence of sum %d in sequence %s" % \
        (s, str(a))
max_sub_array_sum(range(-6, -1), 0)
max_sub_array_sum(None, 0)
max_sub_array_sum([], 0)
max_sub_array_sum(range(6), 15)
max_sub_array_sum(range(6), 14)
max_sub_array_sum(range(6), 13)
max_sub_array_sum(range(6), 0)
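Here's a solution taken from LeetCode. Note that it returns the number of subarrays whose sum equals k, not the longest one:
def sub_array_sum(nums, k=0):
    # "total" and "seen" avoid shadowing the built-ins sum and map
    count, total = 0, 0
    seen = dict()
    seen[0] = 1
    for i in range(len(nums)):
        total += nums[i]
        if (total - k) in seen:
            count += seen[total - k]
        seen[total] = seen.get(total, 0) + 1
    return count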
