Find the maximum element by summing overlapping intervals - python

Say we are given the total size of the interval space. Say we are also given an array of tuples giving us the start and end indices of the interval to sum over along with a value. After completing all the sums, we would like to return the maximum element. How would I go about solving this efficiently?
Input format: n = interval space, intervals = array of tuples that contain start index, end index, and value to add to each element
Eg:
Input: n = 5, intervals = [(1,2,100),(2,5,100),(3,4,100)]
Output: 200
so array is initially [0,0,0,0,0]
At each iteration the following modifications will be made:
1) [100,100,0,0,0]
2) [100,200,100,100,100]
3) [100,200,200,200,100]
Thus the answer is 200.
All I've figured out so far is the brute force solution of splicing the array and adding a value to the spliced portion. How can I do better? Any help is appreciated!

One way is to separate your intervals into a beginning and an end, and specify how much is added or subtracted to the total based on whether you are in that interval or not. Once you sort the intervals based on their location on the number line, you traverse it, adding or subtracting the values based on whether you enter or leave the interval. Here is some code to do so:
from collections import defaultdict
def find_max_val(intervals):
    # Each interval adds its value at start and removes it just past end.
    operations = []
    for i in intervals:
        operations.append([i[0],i[2]])
        operations.append([i[1]+1,-i[2]])
    # Merge operations that fall on the same position.
    unique_ops = defaultdict(int)
    for operation in operations:
        unique_ops[operation[0]] += operation[1]
    sorted_keys = sorted(unique_ops.keys())
    # Sweep left to right, tracking the running total and its maximum.
    curr_val = unique_ops[sorted_keys[0]]
    max_val = curr_val
    for key in sorted_keys[1:]:
        curr_val += unique_ops[key]
        max_val = max(max_val, curr_val)
    return max_val
intervals = [(1,2,100),(2,5,100),(3,4,100)]
print(find_max_val(intervals))
# Output: 200
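When n is known and not too large, the same enter/leave idea can also be written as a difference array followed by a running sum. The sketch below is only one possible way to do it; the function name max_after_updates is made up for illustration, and it assumes 1-based, inclusive start and end indices as in the question's example, with n at least 1.

```python
from itertools import accumulate

def max_after_updates(n, intervals):
    """Return the largest element after applying every (start, end, value) update.

    Assumes 1-based, inclusive start/end indices, as in the question's example.
    """
    # diff[i] holds the change in value when moving from position i-1 to i.
    diff = [0] * (n + 1)
    for start, end, value in intervals:
        diff[start - 1] += value   # value starts applying at index start-1
        diff[end] -= value         # and stops applying after index end-1
    # The running sum of the first n differences reconstructs the array.
    return max(accumulate(diff[:n]))

print(max_after_updates(5, [(1, 2, 100), (2, 5, 100), (3, 4, 100)]))  # 200
```

This does one pass over the intervals and one pass over the n positions, instead of touching every element of every interval.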

Here is the code for 3 intervals.
n = int(input())
x = [0]*n
for _ in range(3):
    s = int(input()) #start
    e = int(input()) #end
    v = int(input()) #value
    #add value to every position from s to e (1-based, inclusive)
    for i in range(s-1, e):
        x[i] += v
print(max(x))

You can use list comprehension to do a lot of the work.
n=5
intervals = [(1,2,100),(2,5,100),(3,4,100)]
intlst = [[r[2] if i>=r[0]-1 and i<=r[1]-1 else 0 for i in range(n)] for r in intervals]
lst = [0]*n #[0,0,0,0,0]
for ls in intlst:
    lst = [lst[i]+ls[i] for i in range(n)]
print(lst)
print(max(lst))
Output
[100, 200, 200, 200, 100]
200

Related

Modify the elements of a list inside a for loop (Python equivalent of Matlab code with a nested loop)

I have the following Matlab code (adapted from Programming and Numerical Methods in MATLAB by Otto & Denier, page 75)
clear all
p = input('Enter the power you require: ');
points = p+2;
n = 1:points;
for N = n
    sums(N) = 0;
    for j = 1:N
        sums(N) = sums(N)+j^p;
    end
end
The output for 3 as the given value of p is the following list
>> sums
sums =
1 9 36 100 225
I have written the following Python code (maybe not the most 'Pythonic way') trying to follow as much as possible Matlab instructions.
p = int(input('Enter the power you require: '))
points = p+2
n = range(points)
for N in range(1, len(n)+1):
    sums = [0]*N
    for index, item in list(enumerate(sums)):
        sums[index] = item+index**p
Nevertheless, the output is not the same list. I have tried to replace the inner loop with
    for j in range(1,N+1):
        sums[N] = sums[N]+j**p
but this results in an index error. Thanks in advance for any suggestions.
This might be due to the index difference: Python indexing starts at 0, while Matlab's starts at 1. Also, sums = [0]*N creates a new list of length N on every pass, so it has to be moved outside of the loop.
points = p+2
sums = [0]*points
for N in range(0, points):
    for index in range(0, N+1):
        sums[N] = sums[N] + (index+1)**p
sums(N) = 0; does not create an array of all zeros; it sets element N of the existing array to 0, and creates additional elements in the array if it is not at least of length N.
Because N grows by one each iteration, you could initialize an empty list before the loop, and append(0) inside the loop:
sums = []
for N in range(1, len(n)+1):
    sums.append(0)
I don't particularly like the use of enumerate here either; I would write the inner loop so that it accumulates into the element that was just appended:
    for index in range(N):
        sums[N-1] += (index + 1)**p
(Notice the +1 on the index that was missing in the code in the OP!)
Finally, n is just confusing here. I would:
for N in range(1, points + 1):
    …
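Putting the suggestions from both answers together, a complete translation might look like the sketch below. The wrapper function power_sums is only there to make the snippet easy to run; it is not part of the original Matlab code.

```python
def power_sums(p):
    """Return [1**p, 1**p + 2**p, ...] with p + 2 entries, like the MATLAB loop."""
    points = p + 2
    sums = []
    for N in range(1, points + 1):
        sums.append(0)                 # same role as sums(N) = 0 in MATLAB
        for j in range(1, N + 1):
            sums[N - 1] += j ** p      # shift by one: Python lists start at 0
    return sums

print(power_sums(3))  # [1, 9, 36, 100, 225]
```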

How to retrieve subset in partitioning algorithm?

I have an array and I would like to split it into two parts such that their sums are equal. For example, [10, 30, 20, 40] can be split into [10, 40] and [20, 30]; both have a sum of 50. This is essentially the partition problem, but I'd like to retrieve the subsets, not just identify whether the array is partitionable. So, I went ahead and did the following:
Update: updated script to handle duplicates
from collections import Counter
def is_partitionable(a):
    possible_sums = [a[0]]
    corresponding_subsets = [[a[0]]]
    target_value = sum(a)/2
    if a[0] == target_value:
        print("yes",[a[0]],a[1:])
        return
    for x in a[1:]:
        temp_possible_sums = []
        for (ind, t) in enumerate(possible_sums):
            cursum = t + x
            if cursum < target_value:
                corresponding_subsets.append(corresponding_subsets[ind] + [x])
                temp_possible_sums.append(cursum)
            if cursum == target_value:
                one_subset = corresponding_subsets[ind] + [x]
                another_subset = list((Counter(a) - Counter(one_subset)).elements())
                print("yes", one_subset,another_subset)
                return
        possible_sums.extend(temp_possible_sums)
    print("no")
    return
is_partitionable(list(map(int, input().split())))
Sample Input & Output:
>>> is_partitionable([10,30,20,40])
yes [10, 40] [30, 20]
>>> is_partitionable([10,30,20,20])
yes [10, 30] [20, 20]
>>> is_partitionable([10,30,20,10])
no
I'm essentially storing the corresponding values that were added to get a value in corresponding_subsets. But, as the size of a increases, it's obvious that the corresponding_subsets would have way too many sub-lists (equal to the number of elements in possible_sums). Is there a better/more efficient way to do this?
Though it is still a hard problem, you could try the following. I assume that there are n elements stored in an array named arr (with 1-based indexing). We make two teams, A and B, and want to distribute the elements of arr between them such that the sum of the elements in both teams is equal. Each element of arr goes either to team A or to team B. If the ith element goes to team A we count it as -arr[i], and if it goes to team B we count it as arr[i]. Thus, after assigning each element to a team, if the total sum is 0 our job is done.
We will create n sets (they do not store duplicates). I will work with the example arr = {10,20,30,40}. Follow these steps:
set_1 = {10,-10} # -10 if it goes to Team A and 10 if it goes to B
set_2 = {30,-10,10,-30} # four options as we add -20 and 20
set_3 = {60,0,20,-40,40,-20,-60} # note we don't need to store duplicates
set_4 = {100,20,40,-40,60,-20,-80,80,0,-60,-100} # there is a zero, which means our task is possible
Now all you have to do is backtrack from the 0 in the last set to see whether the ith element was added as arr[i] or as -arr[i], i.e. whether it went to Team B or Team A.
EDIT
The backtracking routine. We have n sets, from set_1 to set_n. We make two lists: list_A for the elements that belong to team A, and list_B for those in team B. We start from set_n, using a variable current_set that initially has the value n. We also start at the element 0 in the last set, using a variable current_element that initially has the value 0. Follow the approach in the code below. (I assume all sets 1 to n have been formed. For ease I have stored them as a list of lists, but you should use a set data structure.) The code also assumes that a 0 is present in the last set, i.e. that our task is possible.
sets = [ [0], # this dummy set is important, it is set_0,
         # because initially we add -arr[1] or arr[1] to 0
         [10,-10],
         [30,-10,10,-30],
         [60,0,20,-40,40,-20,-60],
         [100,20,40,-40,60,-20,-80,80,0,-60,-100]]
# my array is 1 based so ignore the zero
arr = [0,10,20,30,40]
list_A = []
list_B = []
current_element = 0
current_set = 4 # Total number of sets in this case is n=4
while current_set >= 1:
    print(current_set, current_element)
    for element in sets[current_set-1]:
        if element + arr[current_set] == current_element:
            list_B.append(arr[current_set])
            current_element = element
            current_set -= 1
            break
        elif element - arr[current_set] == current_element:
            list_A.append(arr[current_set])
            current_element = element
            current_set -= 1
            break
print(list_A, list_B)
This is my implementation of @sasha's algorithm for checking feasibility.
def my_part(my_list):
    remaining = list(my_list)  # copy so the caller's list is not emptied
    item = remaining.pop()
    temp = {item, -item}
    while remaining:
        new_player = remaining.pop()
        # Rebuild the balances from scratch each round; keeping old ones
        # could leave a stale 0 behind and report a false 'YES'.
        balance = set()
        for items in temp:
            balance.add(items + new_player)
            balance.add(items - new_player)
        temp = balance
    if 0 in temp:
        return 'YES'
    else:
        return 'NO'
I am working on the backtracking too.
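For reference, the set construction and the backtracking described above can be combined into one function. This is only a sketch of the signed-sum idea: the name partition and the 0-based indexing are my own choices, and the sets can grow large when the values are large.

```python
def partition(arr):
    """Split arr into two lists with equal sums, or return None if impossible."""
    # levels[k] is the set of signed sums reachable using the first k elements.
    levels = [{0}]
    for value in arr:
        levels.append({s + sign * value for s in levels[-1] for sign in (1, -1)})
    if 0 not in levels[-1]:
        return None

    team_a, team_b = [], []
    current = 0
    # Walk back from the 0 in the last set, undoing one element per step.
    for k in range(len(arr), 0, -1):
        value = arr[k - 1]
        if current - value in levels[k - 1]:
            team_b.append(value)     # value was added with a plus sign
            current -= value
        else:
            team_a.append(value)     # value was added with a minus sign
            current += value
    return team_a, team_b

print(partition([10, 20, 30, 40]))  # ([30, 20], [40, 10])
print(partition([10, 30, 20, 10]))  # None
```

Looking up the predecessor with a set membership test avoids the inner loop over every element of the previous set.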

Python Radix Sort

I'm trying to implement Radix sort in python.
My current program is not working correctly in that a list like [41,51,2,3,123] will be sorted correctly to [2,3,41,51,123], but something like [52,41,51,42,23] will become [23,41,42,52,51] (52 and 51 are in the wrong place).
I think I know why this is happening, because when I compare the digits in the tens place, I don't compare units as well (same for higher powers of 10).
How do I fix this issue so that my program runs in the fastest way possible? Thanks!
def radixsort(aList):
    BASEMOD = 10
    terminateLoop = False
    temp = 0
    power = 0
    newList = []
    while not terminateLoop:
        terminateLoop = True
        tempnums = [[] for x in range(BASEMOD)]
        for x in aList:
            temp = int(x / (BASEMOD ** power))
            tempnums[temp % BASEMOD].append(x)
            if terminateLoop:
                terminateLoop = False
        for y in tempnums:
            for x in range(len(y)):
                if int(y[x] / (BASEMOD ** (power+1))) == 0:
                    newList.append(y[x])
                    aList.remove(y[x])
        power += 1
    return newList
print(radixsort([1,4,1,5,5,6,12,52,1,5,51,2,21,415,12,51,2,51,2]))
Currently, your sort does nothing to reorder values based on anything but their highest digit. You get 41 and 42 right only by chance (since they are in the correct relative order in the initial list).
You should always build a new list on each cycle of the sort.
def radix_sort(nums, base=10):
    result_list = []
    power = 0
    while nums:
        bins = [[] for _ in range(base)]
        for x in nums:
            bins[x // base**power % base].append(x)
        nums = []
        for bin in bins:
            for x in bin:
                if x < base**(power+1):
                    result_list.append(x)
                else:
                    nums.append(x)
        power += 1
    return result_list
Note that radix sort is not necessarily faster than a comparison-based sort. It only has a lower complexity if the number of items to be sorted is larger than the range of the item's values. Its complexity is O(len(nums) * log(max(nums))) rather than O(len(nums) * log(len(nums))).
Radix sort orders the elements by grouping them on one digit at a time, starting from the least significant digit. Take [2,3,41,51,123]: first we group the elements by their units digit.
[[],[41,51],[2],[3,123],[],[],[],[],[],[]]
Then we read the bins back in order, from bin 0 to bin 9 (or from 9 to 0 for a descending sort). The new array will be
[41,51,2,3,123]
Then we group by the tens digit. Here 2 and 3 are treated as 02 and 03.
[[2,3],[],[123],[],[41],[51],[],[],[],[]]
Now the new array will be
[2,3,123,41,51]
Lastly we group by the hundreds digit. This time 2, 3, 41 and 51 are treated as 002, 003, 041 and 051.
[[2,3,41,51],[123],[],[],[],[],[],[],[],[]]
Finally we end up with [2,3,41,51,123].
def radixsort(A):
    if not isinstance(A,list):
        raise TypeError('A must be a list')
    n=len(A)
    maxelement=max(A)
    digits=len(str(maxelement)) # how many digits in the maxelement
    bins=[[] for _ in range(10)] # 10 independent bins; [l]*10 would alias one list
    for i in range(digits):
        for j in range(n): # within this we traverse the unsorted array
            e=(A[j]//10**i)%10 # integer arithmetic avoids float rounding
            bins[e].append(A[j]) # adds item to the end
        k=0 # used for the index of the re-sorted array A
        for x in range(10): # we traverse the bins and rebuild the array
            for item in bins[x]:
                A[k]=item
                k=k+1
            bins[x]=[] # empty the bin for the next pass
    return A # A is sorted in place; returned for convenience
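To check the walk-through above, here is a small sketch that yields the list after every pass. The generator name radix_passes is made up for this example, and it assumes non-negative integers.

```python
def radix_passes(nums, base=10):
    """Yield the list after each least-significant-digit pass (non-negative ints)."""
    nums = list(nums)
    power = 0
    while nums and base ** power <= max(nums):
        bins = [[] for _ in range(base)]
        for x in nums:
            bins[x // base ** power % base].append(x)
        # Concatenating the bins in order keeps the ordering from earlier passes.
        nums = [x for b in bins for x in b]
        yield nums
        power += 1

for step in radix_passes([2, 3, 41, 51, 123]):
    print(step)
# [41, 51, 2, 3, 123]
# [2, 3, 123, 41, 51]
# [2, 3, 41, 51, 123]
```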

Minimum of 4th Element in NxNx4 list (Python)

Hi I've been reading up on finding the minimum of a multidimensional list, but if I have an N x N x 4 list, how do I get the minimum between every single 4th element? All other examples have been for a small example list using real indices. I suppose I'll be needing to define indices in terms of N....
[[[0,1,2,3],[0,1,2,3],...N],[[0,1,2,3],[0,1,2,3],...N].....N]
And then there's retrieving their indices.
I don't know what to try.
If anyone's interested in the actual piece of code:
relative = [[[[100] for k in range(5)] for j in range(N)] for i in range(N)]
What the following does is fill in the 4th element with times satisfying the mathematical equations. The 0th, 1st, 2nd and 3rd elements of relative hold positions and velocities. The 4th spot is for the time taken for the ith and jth particles to collide (redundant entries such as i-i or j-i are filled with the value 100, because it's big enough for the min function not to retrieve it). I need the shortest collision time (hence the 4th element comparisons).
def time(relative):
    i = 0
    t = 0
    while i<N:
        j = i+1
        while j<N and i<N:
            rv = relative[i][j][0]*relative[i][j][2]+relative[i][j][1]*relative[i][j][3] #Dot product of r and v
            if rv<0:
                rsquared = (relative[i][j][0])**2+(relative[i][j][1])**2
                vsquared = (relative[i][j][2])**2+(relative[i][j][3])**2
                det = (rv)**2-vsquared*(rsquared-diameter**2)
                if det<0:
                    t = 100 #For negative times, assign an arbitrarily large number to make sure min() won't pick it up.
                elif det == 0:
                    t = -rv/vsquared
                elif det>0:
                    t1 = (-rv+sqrt((rv)**2-vsquared*(rsquared-diameter**2)))/(vsquared)
                    t2 = (-rv-sqrt((rv)**2-vsquared*(rsquared-diameter**2)))/(vsquared)
                    if t1-t2>0:
                        t = t2
                    elif t1-t2<0:
                        t = t1
            elif rv>=0:
                t = 100
            relative[i][j][4]=t #Put the times inside the relative list for element ij.
            j = j+1
        i = i+1
    return relative
I've tried:
t_fin = min(relative[i in range(0,N-1)][j in range(0,N-1)][4])
This runs, but it always returns 100 even though I've checked that it isn't the smallest element.
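In your attempt, i in range(0,N-1) is a boolean test, not a loop, so it indexes a single entry instead of scanning the whole list. If you want the min of the 4th element of an NxNx4 list, use the following (with x[4] instead of x[3] if each cell has five entries, as in your posted code):
min(x[3] for lev1 in relative for x in lev1)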
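The question also asks for the indices of the minimum. One way to get them, sketched below, is to take the min over (time, i, j) tuples. The function name earliest_collision and the tiny sample data are hypothetical, and the sketch assumes every cell is a flat list of numbers with the time stored at index 4.

```python
def earliest_collision(relative, slot=4):
    """Return (time, i, j) for the smallest relative[i][j][slot]."""
    return min(
        (cell[slot], i, j)
        for i, row in enumerate(relative)
        for j, cell in enumerate(row)
    )

# Tiny hypothetical 2 x 2 example; 100 is the question's "no collision" filler.
relative = [
    [[0, 0, 0, 0, 100], [1, 0, -1, 0, 0.5]],
    [[0, 0, 0, 0, 100], [0, 0, 0, 0, 100]],
]
print(earliest_collision(relative))  # (0.5, 0, 1)
```

Because tuples compare element by element, ties on the time are broken by the smallest i and then the smallest j.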

How to find the largest contiguous, overlapping region in a set of sorted arrays

Given a tuple of ordered 1D-arrays (arr1, arr2, arr3, ), which would be the best way to get a tuple of min/max indices ((min1, max1), (min2, max2), (min3, max3), ) so that the arrays span the largest common range?
What I mean is that
min(arr1[min1], arr2[min2], arr3[min3]) > max(arr1[min1-1], arr2[min2-1], arr3[min3-1])
and
max(arr1[min1], arr2[min2], arr3[min3]) < min(arr1[min1+1], arr2[min2+1], arr3[min3+1])
and the same for the upper bounds.
An example:
Given arange(12) and arange(3, 8), I want to get ((3,8), (0,6)), with the goal that arange(12)[3:8] == arange(3,8)[0:6].
EDIT Note that the arrays can be float or integer.
Sorry if this is confusing; I cannot find easier words right now. Any help is greatly appreciated!
EDIT2 / answer: I just realized that I was terrible at formulating my question. I ended up solving what I wanted like this:
import numpy as np
mins = [np.min(t) for t in arrays]
maxs = [np.max(t) for t in arrays]
lower_bound = np.max(mins)
upper_bound = np.min(maxs)
lower_row = [np.searchsorted(arr, lower_bound, side='left') for arr in arrays]
upper_row = [np.searchsorted(arr, upper_bound, side='right') for arr in arrays]
result = list(zip(lower_row, upper_row))
However, both answers seem to be valid for the question I asked, so I'm unsure to select only one of them as 'correct' - what should I do?
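For plain Python lists, the same idea can be sketched with the standard bisect module in place of np.searchsorted. The function name common_range_slices is made up for this example, and it assumes every input is sorted and non-empty.

```python
from bisect import bisect_left, bisect_right

def common_range_slices(arrays):
    """Return one (start, stop) slice per sorted array covering the shared value range."""
    lower_bound = max(arr[0] for arr in arrays)    # largest minimum
    upper_bound = min(arr[-1] for arr in arrays)   # smallest maximum
    return [
        (bisect_left(arr, lower_bound), bisect_right(arr, upper_bound))
        for arr in arrays
    ]

print(common_range_slices([list(range(12)), list(range(3, 8))]))
# [(3, 8), (0, 5)]
```

The slice (0, 5) selects the same five elements as the (0, 6) given in the example above, since the second array only has five entries.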
I'm sure there are different ways to do this. I would use a merge algorithm to walk through the two arrays, keeping track of overlap regions. If you're not familiar with the idea, take a look at merge sort; hopefully between that and the code it's clear how this works.
def find_overlap(a, b):
    i = 0
    j = 0
    len_a = len(a)
    len_b = len(b)
    in_overlap = False
    best_count = 0
    best_start = (-1, -1)
    best_end = (-1, -1)
    while i < len_a and j < len_b:
        if a[i] == b[j]:
            if in_overlap:
                # Keep track of the length of the overlapping region
                count += 1
            else:
                # This is a new overlapping region, set count to 1 and record start
                in_overlap = True
                count = 1
                start = (i, j)
            # Step indices
            i += 1
            j += 1
            end = (i, j)
            if count > best_count:
                # Is this the longest overlapping region so far?
                best_count = count
                best_start = start
                best_end = end
        # If not in an overlapping region, only step one index
        elif a[i] < b[j]:
            in_overlap = False
            i += 1
        elif b[j] < a[i]:
            in_overlap = False
            j += 1
        else:
            # Only reachable for values that cannot be ordered, such as NaN
            raise ValueError("values are not comparable")
    # End of loop
    return best_start, best_end
Note that end here is returned in python convention so that if a=[0, 1, 2] and b=[0, 1, 4], start=(0, 0) and end=(2, 2).
I think you're looking for a solution to a special case of the longest common substring problem. While that problem is solvable using suffix trees or dynamic programming, your special case of sorted "strings" is easier to solve.
Here's code that I think will give you the values you want. Its single argument is a sequence of sorted sequences. Its return value is a list containing a 2-tuple for each of the inner sequences. The tuple values are slice indexes to the longest common substring between the sequences. Note that if there is no common substring, the tuples will all be (0,0), which will result in empty slices (which I think is correct, since the empty slices will all be equal to each other!).
The code:
def longest_common_substring_sorted(sequences):
    l = len(sequences)
    current_indexes = [0]*l
    current_substring_length = 0
    current_substring_starts = [0]*l
    longest_substring_length = 0
    longest_substring_starts = current_substring_starts
    while all(index < len(sequence) for index, sequence
              in zip(current_indexes, sequences)):
        m = min(sequence[index] for index, sequence
                in zip(current_indexes, sequences))
        common = True
        for i in range(l):
            if sequences[i][current_indexes[i]] == m:
                current_indexes[i] += 1
            else:
                common = False
        if common:
            current_substring_length += 1
        else:
            if current_substring_length > longest_substring_length:
                longest_substring_length = current_substring_length
                longest_substring_starts = current_substring_starts
            current_substring_length = 0
            current_substring_starts = list(current_indexes)
    if current_substring_length > longest_substring_length:
        longest_substring_length = current_substring_length
        longest_substring_starts = current_substring_starts
    return [(i, i+longest_substring_length)
            for i in longest_substring_starts]
Test output:
>>> a=[1,2,3,4,5,6]
>>> b=[1,2,3,5,6,7]
>>> c=[3,4,5,6,7,8]
>>> longest_common_substring_sorted((a,b,c))
[(4, 6), (3, 5), (2, 4)]
I'm sorry about not having commented the code very well. The algorithm is somewhat similar to the merge step of a mergesort. Basically, it keeps track of an index into each of the sequences. As it iterates, it increments all of the indexes that correspond to values that are equal to the smallest value. If the current values in all of the lists are equal (to the smallest value, and thus to each other), it knows that it is within a substring common to all of them. When a substring ends, it is checked against the longest substring found so far.
