Reducing an N-Sum to a Two Sum - python

I recently came across a cool algorithm to reduce any problem of the sort "Find N numbers in an array that sum to a target" to a Two Sum problem. However, I am having a hard time understanding one line of the code.
def findNsum(nums, target, N, result, results):
    if len(nums) < N or N < 2 or target < nums[0]*N or target > nums[-1]*N:  # early termination
        return
    if N == 2:  # two pointers solve sorted 2-sum problem
        l, r = 0, len(nums)-1
        while l < r:
            s = nums[l] + nums[r]
            if s == target:
                results.append(result + [nums[l], nums[r]])
                l += 1
                while l < r and nums[l] == nums[l-1]:
                    l += 1
            elif s < target:
                l += 1
            else:
                r -= 1
    else:  # recursively reduce N
        for i in range(len(nums)-N+1):
            if i == 0 or (i > 0 and nums[i-1] != nums[i]):
                findNsum(nums[i+1:], target-nums[i], N-1, result+[nums[i]], results)

results = []
findNsum(sorted(nums), 0, 3, [], results)
return results
The condition:
if i == 0 or (i > 0 and nums[i-1] != nums[i]):
Does not make sense to me. Why do I have to check if nums[i-1] != nums[i]? If I try it out with, say, nums = [-1, 0, 1, 2, 2, -1, -4], I get [[-4, 2, 2], [-1, -1, 2], [-1, 0, 1]] with the condition. If I take it out I get [[-4, 2, 2], [-1, -1, 2], [-1, 0, 1], [-1, 0, 1]]. Can anyone make sense of this?
Cheers!

The condition nums[i-1] != nums[i] is there to avoid creating duplicate solutions when picking the first element, which you can see in your output from the second run. This problem wants all unique solutions, not all possible solutions, hence we want to drop the second [-1, 0, 1].
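To see where the duplicate comes from, look at the sorted input (a small illustration, not part of the original post): the two adjacent -1s mean that index 1 and index 2 would both start a branch with the same first element over overlapping suffixes, so without the check both branches find [-1, 0, 1].

nums = sorted([-1, 0, 1, 2, 2, -1, -4])   # [-4, -1, -1, 0, 1, 2, 2]
# First element taken at index 1 vs index 2: same value, overlapping suffixes,
# so the same triple [-1, 0, 1] is found twice unless index 2 is skipped.
print(nums[1], nums[2:])   # -1 [-1, 0, 1, 2, 2]
print(nums[2], nums[3:])   # -1 [0, 1, 2, 2]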

Related

Prime factor visitation. Flipping states based on prime factors

So I have a list of 1s and 0s:
[1, 1, 0, 0, 1, 1, 0, 1, 1, 1]
and a list of numbers: [3, 4, 15]
I have to find all the prime factors of those numbers, and flip the states in the first list at the positions that are multiples of those prime factors.
So for the above example:
numbers[0] = 3, prime factors are just 3
So after the states were changed, the array looks like:
[1, 1, 1, 0, 1, 0, 0, 1, 0, 1], so every position where (i + 1) % 3 == 0 was flipped
numbers[1] = 4, prime factors are just 2
So after the states were changed, the array looks like:
[1, 0, 1, 1, 1, 1, 0, 0, 0, 0]
numbers[2] = 15, prime factors are 3 and 5
So after the states were changed (first for 3, then for 5), the array looks like:
[1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
[1, 0, 0, 1, 0, 0, 0, 0, 1, 1]
Here's what I have so far:
from collections import Counter

def prime_factors(num):
    i = 2
    factors = []
    while i * i <= num:
        if num % i:
            i += 1
        else:
            num //= i
            factors.append(i)
    if num > 1:
        factors.append(num)
    return list(set(factors))

def flip(states, numbers):
    factors = []
    for num in numbers:
        factors.extend(prime_factors(num))
    factors = Counter(factors)
    for factor, count in factors.items():
        if count % 2:
            for i in range(len(states)):
                if (i + 1) % factor == 0:
                    states[i] = 1 if states[i] == 0 else 0
    return states
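As a quick sanity check (illustrative only, using the example lists from above), it produces the expected final state:

states = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1]
numbers = [3, 4, 15]
print(flip(states, numbers))   # [1, 0, 0, 1, 0, 0, 0, 0, 1, 1]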
This works fine, but for large lists, it TLEs.
How do I fix this to make it faster?
for factor, count in factors.items():
    if count % 2:
        for i in range(len(states)):
            if (i + 1) % factor == 0:
                states[i] = 1 if states[i] == 0 else 0
In the second for-loop, you don't need to start at 0 in range(len(states)). You can start with factor-1.
In states[i] = 1 if states[i] == 0 else 0,
you can replace this line with the XOR operator:
states[i] = states[i] ^ 1.
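Combining both suggestions, the inner loop could look like this (a sketch, assuming factor is the loop variable from the code above):

for i in range(factor - 1, len(states), factor):
    states[i] ^= 1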
Try to optimize the prime number generator with the sieve of Eratosthenes. It should work within a couple of seconds for numbers less than 10^7.
import math

def sieve_primes(number):
    a = [n == 2 or (n >= 2 and n % 2 == 1) for n in range(number + 1)]
    for x in range(3, int(math.sqrt(number)) + 1):
        if a[x]:
            for xx in range(x * x, number + 1, x):
                a[xx] = False
    primes = []
    for i in range(len(a)):
        if a[i]:
            primes.append(i)
    return primes
Also, you could avoid multiple calls to the prime generator by finding the biggest number that you'll need to factorize, generating primes up to it, and storing them for later lookup.
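For instance, a rough sketch of that idea (the names here are illustrative, not from the original post): sieve once up to the largest input, then factor each number by trial division over the precomputed primes.

numbers = [3, 4, 15]
primes = sieve_primes(max(numbers))

def unique_prime_factors(num, primes):
    # Trial division over the precomputed primes; any leftover > 1 is itself prime.
    found = []
    for p in primes:
        if p * p > num:
            break
        if num % p == 0:
            found.append(p)
            while num % p == 0:
                num //= p
    if num > 1:
        found.append(num)
    return found

print(unique_prime_factors(15, primes))   # [3, 5]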
This part:
for i in range(len(states)):
    if ((i + 1) % factor == 0):
        states[i] = 1 if states[i] == 0 else 0
seems too elaborate. How about (untested!):
for i in range(factor - 1, len(states), factor):
    states[i] = 1 - states[i]
? That is, jump directly to only the indices i such that i + 1 is divisible by factor.
Making factoring very fast
Here's suitable sieve code as mentioned in a comment. If the maximum possible number "is large", of course this is an absurd approach. But you haven't told us. If it's up to a few million, this goes fast.
# Return list `sf` such that sf[i] is the smallest prime
# factor of `i`, for 2 <= i <= maxn.
def sieve(maxn):
    from math import isqrt
    sf = list(range(maxn + 1))
    for i in range(4, len(sf), 2):
        sf[i] = 2
    for p in range(3, isqrt(maxn) + 1, 2):
        if sf[p] == p:
            for i in range(p * p, len(sf), p + p):
                if sf[i] == i:
                    sf[i] = p
    return sf
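For a tiny bound (purely illustrative), the table looks like this:

print(sieve(10))   # [0, 1, 2, 3, 2, 5, 2, 7, 2, 3, 2]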
The whole thing
Here's a whole program. Code for sieve() was already given. Once that's called, it never needs to be called again.
It's often the case that successfully completing timed "programming challenges" crucially relies on the stated input constraints. You were asked several times to tell us what they were, but to no avail. My educated guess is that they put a relatively low limit on the maximum number that needs to be factored. The program here exploits that. But if they didn't put limits on it, there's scant hope for "a fast" solution, because efficient factoring of truly large integers remains a difficult, open research problem.
The method here tries to balance the time and space needed for preprocessing against the time needed to factor. It may or may not be "the best" tradeoff, depending on the still-unknown-to-us input constraints. For example, if the maximum number is "quite" small, you could fully compute - and store - the unique prime factors for every possible number in advance (although a sieve method would remain the fastest easy way to do that).
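Before the full program, here is a hedged sketch of that last alternative (not part of the answer's program below): one sieve-like pass can record the unique prime factors of every number up to a small bound.

def all_unique_prime_factors(maxn):
    # factors[n] collects the distinct primes dividing n, for 0 <= n <= maxn.
    factors = [[] for _ in range(maxn + 1)]
    for p in range(2, maxn + 1):
        if not factors[p]:          # nothing recorded yet, so p is prime
            for multiple in range(p, maxn + 1, p):
                factors[multiple].append(p)
    return factors

# all_unique_prime_factors(15)[12] == [2, 3]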
# Return list of unique prime factors of n.
def upf(n, sf):
    result = []
    while n > 1:
        p = sf[n]
        result.append(p)
        n //= p
        while n % p == 0:
            n //= p
    return result

MAXN = 1_000_000
sf = sieve(MAXN)

def crunch(nums, bits, sf):
    from collections import defaultdict
    pcount = defaultdict(int)
    for num in nums:
        for p in upf(num, sf):
            pcount[p] += 1
    for p, count in pcount.items():
        if count & 1:
            for i in range(p - 1, len(bits), p):
                bits[i] = 1 - bits[i]

nums = [3, 4, 15]
bits = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1]
expected = [1, 0, 0, 1, 0, 0, 0, 0, 1, 1]
crunch(nums, bits, sf)
assert bits == expected

Max interval intersection point

I am trying to implement the logic in Python. Given a set of intervals, find the interval which has the maximum number of intersections. If the input is (1,6) (2,3) (4,11), then (1,6) should be returned. This has been answered in the question linked below, but I have been unable to implement it in Python.
given-a-set-of-intervals-find-the-interval-which-has-the-maximum-number-of-inte.
So far I am using the logic below. Any help will be greatly appreciated.
def interval_intersection(intervals):
    if len(intervals) == 1:
        return intervals
    intervals.sort(key=lambda x: x[0])
    res = [intervals[0]]
    for i in range(1, len(intervals)):
        if intervals[i][0] > res[-1][1]:
            res.append(intervals[i])
        else:
            res[-1] = [min(res[-1][0], intervals[i][0]), max(res[-1][1], intervals[i][1])]
    return res
Examples:
[[1,5],[5,10],[5,5]]
The answer should be [5,5].
In case of a tie, return the interval with the fewest elements. Here [5,5] covers only one element in its range (5), hence the answer.
[[1,2],[3,5]]
No intersection, so return -1.
This is a fairly straightforward implementation of David Eisenstat's algorithm. The only subtleties are:
I assume that all intervals are closed on both ends, which means that sorting events should put starts before ends if they're simultaneous. If you want intervals that are fully open, or open on the right side, this order needs to be reversed.
The returned interval has the most intersections, with ties broken first by smallest length, then by earliest start.
from typing import Dict, Sequence, Union

def interval_solve(intervals: Sequence[Sequence[int]]) -> Union[Sequence[int], int]:
    start_type = -1  # Assumes all intervals are closed
    end_type = 1
    events = [(s, start_type, i) for i, (s, e) in enumerate(intervals)]
    events.extend((e, end_type, i) for i, (s, e) in enumerate(intervals))
    events.sort()
    inter_count: Dict[int, int] = {}
    start_count = 0
    stop_count = 0
    for event_time, event_type, event_id in events:
        if event_type == start_type:
            start_count += 1
            inter_count[event_id] = -(stop_count + 1)
        else:
            stop_count += 1
            inter_count[event_id] += start_count
    # Find max by most intersections, then by shortest interval, then by earliest start
    answer = max(range(len(intervals)),
                 key=lambda j: (inter_count[j], intervals[j][0] - intervals[j][1]))
    if inter_count[answer] == 0:
        return -1
    return intervals[answer]
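Run against the examples from the question (outputs shown as comments):

print(interval_solve([[1, 5], [5, 10], [5, 5]]))   # [5, 5]
print(interval_solve([[1, 2], [3, 5]]))            # -1
print(interval_solve([(1, 6), (2, 3), (4, 11)]))   # (1, 6)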
The actual idea is pretty simple: we sort the interval endpoints, storing each one with its interval's index and a flag indicating whether it is a start or an end event.
Then we traverse the sorted events while counting the start and end events seen so far. For any interval i, the overlap count is simply the number of start events seen before its end, minus 1 (so as not to count its own start), minus the number of end events seen before its start.
Finally, we can just check which one has the minimum length in case of multiple solutions.
# https://stackoverflow.com/questions/69426852/max-interval-intersection-point
def max_interval_count(intervals):
    interval_sorted = []
    for idx, interval in enumerate(intervals):
        s, e = interval
        interval_sorted.append([s, idx, 0])  # 0 for start
        interval_sorted.append([e, idx, 1])  # 1 for end
    interval_sorted.sort(key=lambda x: x[0])
    print(interval_sorted)
    number_of_starts = 0
    number_of_ends = 0
    overlap_count = {}
    for event in interval_sorted:
        _, idx, start_end = event
        if start_end == 0:  # start event
            # subtract all the end events before it
            overlap_count[idx] = -number_of_ends
            number_of_starts += 1
        else:  # end event
            overlap_count[idx] += (number_of_starts - 1)  # -1 as we should not include the start from the same interval
            number_of_ends += 1
    print(overlap_count)
    ans_idx = -1
    max_over_count = 0
    min_len_interval = 99999999999
    for idx, overl_cnt in overlap_count.items():
        if overl_cnt > max_over_count:
            ans_idx = idx
            max_over_count = overl_cnt
        elif overl_cnt == max_over_count and overl_cnt > 0 and (intervals[idx][1] - intervals[idx][0] + 1) < min_len_interval:
            min_len_interval = (intervals[idx][1] - intervals[idx][0] + 1)
            ans_idx = idx
    if ans_idx == -1:
        return ans_idx
    return intervals[ans_idx]

if __name__ == "__main__":
    test_1 = [[1, 5], [5, 10], [5, 5]]
    test_2 = [[1, 2], [3, 5]]
    test_3 = [(1, 6), (2, 3), (4, 11)]
    ans = max_interval_count(test_1)
    print(ans)
    print("---------")
    ans = max_interval_count(test_2)
    print(ans)
    print("---------")
    ans = max_interval_count(test_3)
    print(ans)
    print("---------")
Output:
[[1, 0, 0], [5, 0, 1], [5, 1, 0], [5, 2, 0], [5, 2, 1], [10, 1, 1]]
{0: 0, 1: 1, 2: 1}
[5, 5]
---------
[[1, 0, 0], [2, 0, 1], [3, 1, 0], [5, 1, 1]]
{0: 0, 1: 0}
-1
---------
[[1, 0, 0], [2, 1, 0], [3, 1, 1], [4, 2, 0], [6, 0, 1], [11, 2, 1]]
{0: 2, 1: 1, 2: 1}
(1, 6)
---------

Finding Maximum non-negative Subarray in python

I've tried to find the sub-array(s) of a given array whose elements have a larger sum than any other sub-array.
The function below takes a as input and needs to return the output. There can be more than one such subarray, since their maximum sums can be equal. The code did not seem to be working as expected.
def max_sum_subarray(a):
    N, sub_sum, max_sum, subArrays = len(a), 0, 0, {}
    p, q = 0, 0  # starting and ending indices of a max sub arr
    for i in range(N):
        q = i
        sub_sum += a[i]
        if a[i] < 0:
            q -= 1
            if sub_sum >= max_sum:
                if sub_sum > max_sum:
                    subArrays.clear()
                    subArrays[sub_sum] = [(p, q)]
                else:
                    subArrays[sub_sum].append((p, q))
            sub_sum = 0
            p = i + 1
    if sub_sum >= max_sum:
        if sub_sum > max_sum:
            subArrays.clear()
            subArrays[sub_sum] = [(p, q)]
        else:
            subArrays[sub_sum].append((p, q))
    return(subArrays[p:q+1])
When I tried to run it for the input
a = [1, 2, 5, -7, 2, 5]
the expected output is [1, 2, 5], but it gave [2, 5] instead. Can anyone please post the solution in Python?
It seems like you're making this harder than necessary. You can just keep track of the max array seen so far and the current one you're pushing into -- you don't really need to care about anything else. When you hit a negative (or the end of the array), decide if the current one should be the new max:
def maxSub(a):
    max_so_far = []
    max_sum = 0
    cur = []
    for n in a:
        if n >= 0:
            cur.append(n)
        else:
            cur_sum = sum(cur)
            if cur_sum > max_sum:
                max_sum = cur_sum
                max_so_far = cur
            cur = []
    return max([max_so_far, cur], key=sum)

a = [1, 2, 5, -7, 2, 5]
maxSub(a)
# [1, 2, 5]
Of course itertools.groupby makes this a one-liner:
from itertools import groupby

a = [1, 2, 5, -7, 2, 5]
max([list(g) for k, g in groupby(a, key=lambda x: x > 0) if k], key=sum)
For the following conditions:
NOTE 1: If there is a tie, then compare the segments' lengths and
return the segment which has the maximum length.
NOTE 2: If there is still a tie, then return the segment with the minimum
starting index.
Here is my working code in Python:
def check(max_arr, curr):
    if sum(curr) > sum(max_arr):
        max_arr = curr
    elif sum(curr) == sum(max_arr):
        if len(curr) > len(max_arr):
            max_arr = curr
        elif len(curr) == len(max_arr):
            if max_arr and (curr[0] > max_arr[0]):
                max_arr = curr
    return max_arr

def maxset(A):
    curr = []
    max_arr = []
    for i in A:
        if i >= 0:
            curr.append(i)
        else:
            max_arr = check(max_arr, curr)
            curr = []
    max_arr = check(max_arr, curr)
    return max_arr
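A quick illustrative check (these inputs are mine, not from the post):

print(maxset([1, 2, 5, -7, 2, 5]))   # [1, 2, 5]
print(maxset([0, 0, -1, 0]))         # [0, 0]  (tie on sum, longer segment wins)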

Counting consecutive numbers in all columns of a 2D array

I have a 2d array, X, that looks like this
[0, 0, 0, 2, 1]
[1, 2, 1, 0, 1]
[2, 2, 1, 0, 0]
[0, 0, 1, 2, 0]
I'm trying to iterate through the entire 2D array to try and count all the instances where there are 2 consecutive elements in a column. E.g. X above would return 4 (X[1][1] == X[2][1] && X[1][2] == X[2][2] && X[2][2] == X[3][2] and so on)
I'm finding this very hard to visualize. So far I have:
def get_opposite(number):
    if number == 2: return 1
    if number == 1: return 2

def counter(X, number):
    count = 0
    for i in range(len(X)):
        for j in range(len(X[i])-1):
            if X[i][j] == X[i][j+1] and X[i][j] != 0 and X[i][j] != get_opposite(number):
                count += 1
    return count
I keep either getting vastly incorrect results or an IndexError. It should be fairly straightforward, but I'm not sure what I'm doing wrong.
If you compare the example you give in the text with your actual code, you'll notice your code is comparing with the value to the right, not with the value below it. You need to apply +1 to the first index, not the second. This also means the range of your loops has to be adapted accordingly.
Secondly, you don't need the first function. The equality comparison is enough.
Also, I removed the second argument of the function, as it serves no role:
def counter(X):
    count = 0
    for i in range(len(X)-1):
        for j in range(len(X[i])):
            if X[i][j] == X[i+1][j] and X[i][j] != 0:
                count += 1
    return count
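Checking it against the grid from the question (result shown as a comment):

X = [[0, 0, 0, 2, 1],
     [1, 2, 1, 0, 1],
     [2, 2, 1, 0, 0],
     [0, 0, 1, 2, 0]]
print(counter(X))   # 4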

finding contiguous Subset with Largest Sum

def max_sublist(x):
    max1 = 0
    max2 = 0
    result = []
    for i in x:
        max2 = max(0, max2 + i)
        max1 = max(max1, max2)
    print(result)
I want to add the elements up to the element which gave the max sum. How do I add only those elements to the result?
For example, if x = [4, -1, 5, 6, -13, 2]
then result should be [4, -1, 5, 6].
This is a classic problem in optimization, and it's called the maximum subarray problem. Here's one possible dynamic programming solution in O(n), using Kadane's algorithm:
def max_val_contiguous_subsequence_idxs(seq):
    i = thisSum = maxSum = 0
    startIdx, endIdx = 0, -1
    for j in range(len(seq)):
        thisSum += seq[j]
        if thisSum > maxSum:
            maxSum = thisSum
            startIdx = i
            endIdx = j
        elif thisSum < 0:
            thisSum = 0
            i = j + 1
    return (maxSum, startIdx, endIdx)
The above will return in a single pass a tuple with the maximum sum, the starting index and the end index of the subsequence. For example, using the sample input in the question:
lst = [4, -1, 5, 6, -13, 2]
maxSum, startIdx, endIdx = max_val_contiguous_subsequence_idxs(lst)
maxSum
=> 14
lst[startIdx:endIdx+1]
=> [4, -1, 5, 6]
Notice that the implementations shown on the Wikipedia page (which look a lot like the solution you were aiming for) only give the maximum sum; unlike my solution, they don't tell you how to find the subsequence indexes in the array.
