How to reduce time complexity of this program

The entry Y[i][j] stores the sum of the subarray X[i..j]. The triple loop below is O(n^3); can I get a better time complexity?
def func(X, n):
    Y = [[0 for i in range(n)] for j in range(n)]
    for i in range(n):
        for j in range(i, n):
            for k in range(i, j+1):
                Y[i][j] += X[k]
    return Y

if __name__ == "__main__":
    n = 500
    X = list(range(n))
    for i in range(30, 50):
        print(X[i], end=" ")
    print()
    print(func(X, n)[30][49])

You could use a prefix sum array.
The idea is that you have an array where the entry ps[i] denotes the sum of all elements arr[0..i]. You can calculate it in linear time:
ps = [0] * len(arr)
ps[0] = arr[0]
for i in range(1, len(arr)):
    ps[i] = ps[i - 1] + arr[i]
Can you guess how to retrieve a sum Y(i, j) in constant time?
Solution: Y(i, j) = ps[j] - ps[i - 1] for i >= 1, and Y(0, j) = ps[j]. You take the sum of everything from the start of the array up to j and subtract the part that you don't want (everything from the start up to i-1).
Note: Be wary of edge cases like i=0, j=0, j<i, etc. In particular, for i = 0 the index i - 1 is -1, which Python treats as the last element rather than raising an error.
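One way to sidestep the i = 0 edge case is to shift the prefix array by one so that it starts with a 0. The sketch below assumes that convention; the names build_prefix and range_sum are made up for illustration and are not part of the answer above.

```python
def build_prefix(arr):
    """Return ps with ps[t] = arr[0] + ... + arr[t-1], so ps[0] = 0."""
    ps = [0] * (len(arr) + 1)
    for t, value in enumerate(arr):
        ps[t + 1] = ps[t] + value
    return ps


def range_sum(ps, i, j):
    """Sum of arr[i..j] inclusive; an empty range (j < i) sums to 0."""
    if j < i:
        return 0
    return ps[j + 1] - ps[i]


X = list(range(500))
ps = build_prefix(X)          # O(n) once
print(range_sum(ps, 30, 49))  # O(1) per query, prints 790
```

With this layout the full table Y from the question is never materialized; each entry is computed on demand from two lookups.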

Related

how to write a python code to display magic square matrix according to user input?

This is a program to print a matrix in which the sums of each row, column and diagonal are equal.
I have working code, but my program gives the same output each time I run it. I need a program that prints a different matrix for the same input.
def matrix(n):
    m = [[0 for x in range(n)] for y in range(n)]
    i = n // 2
    j = n - 1
    num = 1
    while num <= (n * n):
        if i == -1 and j == n:
            j = n - 2
            i = 0
        else:
            if j == n:
                j = 0
            if i < 0:
                i = n - 1
        if m[int(i)][int(j)]:
            j = j - 2
            i = i + 1
            continue
        else:
            m[int(i)][int(j)] = num
            num = num + 1
        j = j + 1
        i = i - 1
    print("Sum of eggs in each row or column and diagonal : ", int(n*(n*n+1)/2), "\n")
    for i in range(0, n):
        for j in range(0, n):
            print('%2d ' % (m[i][j]), end='')
            if j == n - 1:
                print()

n = int(input("Number of rows of the matrix : "))
matrix(n)

I am unsure whether this is what you are looking for, but one solution is to add the same random number to every value of the matrix, as this doesn't break the property that the sums of each row, column and diagonal are equal.
Here is how you could do it:
import random

add = random.randint(0, 50)
m = [[v+add for v in row] for row in m]
Moreover, you can rotate and add two magic squares without losing their property. Therefore, you can rotate the magic square you have and add it to the original. This can add some nonlinearity to the results.
def rotate(m):  # 90 degrees counter clockwise
    return [[m[j][i] for j in range(len(m))] for i in range(len(m[0])-1, -1, -1)]

# add the matrix with its rotated version
m = list(map(lambda e: [sum(x) for x in zip(*e)], zip(m, rotate(m))))
I hope this helps!
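To see that both tricks keep the equal-sum property, here is a small self-contained check. It is only a sketch: is_magic and randomize are hypothetical helper names, and the 3x3 square in base is a hand-written example rather than output of matrix(n). Note that the result no longer uses the numbers 1 to n*n, only the equal sums survive.

```python
import random


def is_magic(m):
    """True if all rows, columns and both diagonals have the same sum."""
    n = len(m)
    target = sum(m[0])
    rows = all(sum(row) == target for row in m)
    cols = all(sum(m[i][j] for i in range(n)) == target for j in range(n))
    diag = sum(m[i][i] for i in range(n)) == target
    anti = sum(m[i][n - 1 - i] for i in range(n)) == target
    return rows and cols and diag and anti


def rotate(m):
    """Rotate 90 degrees counter clockwise (same as in the answer)."""
    return [[m[j][i] for j in range(len(m))] for i in range(len(m[0]) - 1, -1, -1)]


def randomize(m):
    """Shift every entry by one random constant, then add the rotated original."""
    add = random.randint(0, 50)
    shifted = [[v + add for v in row] for row in m]
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(shifted, rotate(m))]


base = [[2, 7, 6],
        [9, 5, 1],
        [4, 3, 8]]
result = randomize(base)
print(result, is_magic(result))
```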

Is it possible to process string from starting for DP solution

I was trying out the longest palindromic subsequence problem from LeetCode.
One of the discussed solutions is as follows:
class Solution:
    def longestPalindromeSubseq(self, s: str) -> int:
        n = len(s)
        dp = [[0] * n for _ in range(n)]
        for i in range(n - 1, -1, -1):
            dp[i][i] = 1
            for j in range(i+1, n):
                if s[i] == s[j]:
                    dp[i][j] = dp[i + 1][j - 1] + 2
                else:
                    dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
        return dp[0][n - 1]
So it starts from the end of the string.
I was wondering whether it is possible to begin from the start of the string instead, that is, whether the loops could look something like this:
for i in range(0, n):
    for j in range(i+1, n):
        # ...
But dp[i + 1] won't have been calculated yet in any given iteration of i, and we need dp[i + 1] to evaluate
dp[i][j] = dp[i + 1][j - 1] + 2 and
dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
Is it possible to change these two updates to dp (and hence come up with a new recurrence relation) so that we can begin from the start of the string, or is starting from the end of the string the only way? (I was not able to come up with any recurrence or index adjustment that makes it possible, so I have started to believe that it is indeed not possible. But I wanted to be sure.)

The first hint that you can do this from the beginning is symmetry: if this logic gets the answer for a string such as 'baabbcc', the same logic also works for the reversed string ('ccbbaab').
The more robust reasoning can be derived from what dp[i][j] represents: the length of the longest palindromic subsequence between i and j inclusive. We calculate this dp array using two pointers, say i and j.
We iterate over all possible values of i and j. If s[i] == s[j], the answer for i to j equals the answer for i+1 to j-1 plus 2, because we can take the answer for i+1 to j-1 and add s[i] and s[j] to its beginning and end. I hope this is clear from the code you provided.
What that means is that to calculate dp[i][j], you need dp[i+1][j-1].
The code you have provided does this by starting the i pointer from the end and, for every i, looping j from i+1 to n-1. This means that i+1 is reached before i and j-1 is reached before j.
However, you can achieve the same effect starting from the beginning. This time, move the j pointer forward from the beginning and, for every j, move the i pointer backward from i = j-1 down to i = 0. This ensures that j-1 is reached before j and i+1 is reached before i, which is what we're looking for.
The final code would look something like this (which I've submitted and gotten accepted):
class Solution:
    def longestPalindromeSubseq(self, s: str) -> int:
        n = len(s)
        dp = [[0] * n for _ in range(n)]
        for j in range(0, n):
            dp[j][j] = 1
            for i in range(j-1, -1, -1):
                if s[i] == s[j]:
                    dp[i][j] = dp[i + 1][j - 1] + 2
                else:
                    dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
        return dp[0][n - 1]
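As a quick sanity check, the two loop orders can be written as plain functions and compared on a few strings. This is a sketch outside the LeetCode class wrapper; the function names and the empty-string guard are additions for illustration.

```python
def lps_i_from_end(s):
    """Outer loop over i from the end, inner loop over j forward."""
    n = len(s)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        dp[i][i] = 1
        for j in range(i + 1, n):
            if s[i] == s[j]:
                dp[i][j] = dp[i + 1][j - 1] + 2
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
    return dp[0][n - 1]


def lps_j_from_start(s):
    """Outer loop over j from the start, inner loop over i backward."""
    n = len(s)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for j in range(n):
        dp[j][j] = 1
        for i in range(j - 1, -1, -1):
            if s[i] == s[j]:
                dp[i][j] = dp[i + 1][j - 1] + 2
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
    return dp[0][n - 1]


for text in ("bbbab", "cbbd", "baabbcc", "ccbbaab"):
    print(text, lps_i_from_end(text), lps_j_from_start(text))
```

Both orders fill the same upper triangle of dp; they only differ in which already-filled neighbours they approach a cell from.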

How to calculate the worst case complexity for the function?

def functionX(L):
    """ L is a non-empty list of length len(L) = n. """
    i = 1
    while i < len(L) - 1:
        j = i - 1
        while j <= i + 1:
            L[j] = L[j] + L[i]
            j = j + 1
        i = i + 1
For the j loop, why do we have 3 iterations, each with 3 steps, instead of i iterations? I have a hard time figuring it out.

Is it clearer with for loops than while loops?
def functionY(L):
    N = len(L)
    for i in range(1,N-1):
        for j in range(i-1,i+2):
            L[j] = L[j] + L[i]
How about pseudo-code?
for i in range(N):      # drop the -1s on both ends; O(n-2) = O(n)
    for j in range(3):  # (i-1) to (i+2) covers 3 elements
        do something
This makes it pretty clear that Tony's answer is correct, we're in the class O(n). Specifically, the line L[j] = L[j] + L[i] will be accessed 3n-6 times. This is in the complexity class O(3n) = O(n). If you're looking at array accesses as your atomic operation, then we have O(3*(3n-6)) = O(n), still. The complexity class would not change if the line read L[j] += L[i], though the total number of array accesses would go down.

You have about n iterations of the outer loop (exactly n - 2, since i runs from 1 to n - 2), and in every outer-loop iteration there are 3 iterations of the inner loop, because for a given i the variable j takes the values i - 1, i and i + 1. Therefore the complexity is O(3 * n) = O(n).
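If the counting argument feels abstract, it can be checked empirically by instrumenting the loop. The sketch below copies functionX and adds a step counter; count_inner_steps is a made-up name for illustration.

```python
def count_inner_steps(n):
    """Run the loops of functionX on a list of length n and count
    how many times the line L[j] = L[j] + L[i] executes."""
    L = [1] * n
    steps = 0
    i = 1
    while i < len(L) - 1:
        j = i - 1
        while j <= i + 1:
            L[j] = L[j] + L[i]
            steps += 1
            j = j + 1
        i = i + 1
    return steps


for n in (3, 10, 100):
    # the second and third columns should match: 3 * (n - 2)
    print(n, count_inner_steps(n), 3 * (n - 2))
```

The inner loop bounds depend on i, but the number of values between i - 1 and i + 1 is always 3, which is why the count grows linearly rather than quadratically.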

Find subset with K elements that are closest to each other

Given an array of integers size N, how can you efficiently find a subset of size K with elements that are closest to each other?
Let the closeness for a subset (x1, x2, x3, ..., xK) be defined as the sum of absolute differences over all pairs:
$$\sum_{1 \le i < j \le K} |x_i - x_j|$$
Constraints:
2 <= N <= 10^5
2 <= K <= N
The array may contain duplicates and is not guaranteed to be sorted.
My brute force solution is very slow for large N, and it doesn't check if there's more than 1 solution:
import sys

N = input()
K = input()
assert 2 <= N <= 10**5
assert 2 <= K <= N
a = []
for i in xrange(0, N):
    a.append(input())
a.sort()
minimum = sys.maxint
startindex = 0
for i in xrange(0,N-K+1):
    last = i + K
    tmp = 0
    for j in xrange(i, last):
        for l in xrange(j+1, last):
            tmp += abs(a[j]-a[l])
            if(tmp > minimum):
                break
    if(tmp < minimum):
        minimum = tmp
        startindex = i #end index = startindex + K?
Examples:
N = 7
K = 3
array = [10,100,300,200,1000,20,30]
result = [10,20,30]
N = 10
K = 4
array = [1,2,3,4,10,20,30,40,100,200]
result = [1,2,3,4]

Your current solution is O(NK^2) (assuming K > log N). With some analysis, I believe you can reduce this to O(NK).
The closest set of size K will consist of elements that are adjacent in the sorted list. You essentially have to first sort the array, so the subsequent analysis will assume that each sequence of K numbers is sorted, which allows the double sum to be simplified.
Assuming that the array is sorted such that x[j] >= x[i] when j > i, we can rewrite your closeness metric to eliminate the absolute value:
$$\sum_{1 \le i < j \le K} (x[j] - x[i])$$
Next we rewrite your notation into a double summation with simple bounds:
$$\sum_{i=1}^{K-1} \sum_{j=i+1}^{K} (x[j] - x[i])$$
Notice that we can rewrite the inner distance between x[i] and x[j] as a third summation:
$$\sum_{i=1}^{K-1} \sum_{j=i+1}^{K} \sum_{l=i}^{j-1} d[l]$$
where I've used d[l] to simplify the notation going forward:
$$d[l] = x[l+1] - x[l]$$
Notice that d[l] is the distance between each adjacent element in the list. Look at the structure of the inner two summations for a fixed i:
j=i+1          d[i]
j=i+2          d[i] + d[i+1]
j=i+3          d[i] + d[i+1] + d[i+2]
...
j=K=i+(K-i)    d[i] + d[i+1] + d[i+2] + ... + d[K-1]
Notice the triangular structure of the inner two summations. This allows us to rewrite the inner two summations as a single summation in terms of the distances of adjacent terms:
total: (K-i)*d[i] + (K-i-1)*d[i+1] + ... + 2*d[K-2] + 1*d[K-1]
which reduces the total sum to:
$$\sum_{i=1}^{K-1} \sum_{l=i}^{K-1} (K-l)\, d[l]$$
Now we can look at the structure of this double summation:
i=1     (K-1)*d[1] + (K-2)*d[2] + (K-3)*d[3] + ... + 2*d[K-2] + d[K-1]
i=2                  (K-2)*d[2] + (K-3)*d[3] + ... + 2*d[K-2] + d[K-1]
i=3                               (K-3)*d[3] + ... + 2*d[K-2] + d[K-1]
...
i=K-2                                                2*d[K-2] + d[K-1]
i=K-1                                                           d[K-1]
Again, notice the triangular pattern. The total sum then becomes:
1*(K-1)*d[1] + 2*(K-2)*d[2] + 3*(K-3)*d[3] + ... + (K-2)*2*d[K-2] + (K-1)*1*d[K-1]
Or, written as a single summation:
$$\sum_{l=1}^{K-1} l\,(K-l)\, d[l]$$
This compact single summation of adjacent differences is the basis for a more efficient algorithm:
Sort the array, order O(N log N)
Compute the differences of each adjacent element, order O(N)
Iterate over each of the N-K+1 windows of K-1 consecutive differences and calculate the above sum, order O(NK)
Note that the second and third step could be combined, although with Python your mileage may vary.
The code:
def closeness(diff,K):
    acc = 0.0
    for (i,v) in enumerate(diff):
        acc += (i+1)*(K-(i+1))*v
    return acc

def closest(a,K):
    a.sort()
    N = len(a)
    diff = [ a[i+1] - a[i] for i in xrange(N-1) ]
    min_ind = 0
    min_val = closeness(diff[0:K-1],K)
    for ind in xrange(1,N-K+1):
        cl = closeness(diff[ind:ind+K-1],K)
        if cl < min_val:
            min_ind = ind
            min_val = cl
    return a[min_ind:min_ind+K]
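The weighted-gap formula is easy to get wrong by one, so it may help to check it against the brute-force pairwise sum. The following Python 3 sketch does that; pairwise_closeness, weighted_closeness and closest_window are illustrative names and not part of the answer's code, and the search is still O(NK) after sorting.

```python
from itertools import combinations


def pairwise_closeness(window):
    """Brute force: sum of |a - b| over all pairs in the window."""
    return sum(abs(a - b) for a, b in combinations(window, 2))


def weighted_closeness(sorted_window):
    """Same value via adjacent gaps: the l-th gap (1-based) gets weight l * (K - l).
    Assumes the window is sorted in ascending order."""
    k = len(sorted_window)
    gaps = [sorted_window[t + 1] - sorted_window[t] for t in range(k - 1)]
    return sum((t + 1) * (k - (t + 1)) * g for t, g in enumerate(gaps))


def closest_window(values, k):
    """Return the k contiguous elements of the sorted values with the smallest closeness."""
    a = sorted(values)
    start = min(range(len(a) - k + 1),
                key=lambda s: weighted_closeness(a[s:s + k]))
    return a[start:start + k]


print(closest_window([10, 100, 300, 200, 1000, 20, 30], 3))
print(closest_window([1, 2, 3, 4, 10, 20, 30, 40, 100, 200], 4))
```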

itertools to the rescue?
from itertools import combinations

def closest_elements(iterable, K):
    N = set(iterable)
    assert(2 <= K <= len(N) <= 10**5)
    combs = lambda it, k: combinations(it, k)
    _abs = lambda it: abs(it[0] - it[1])
    d = {}
    v = 0
    for x in combs(N, K):
        for y in combs(x, 2):
            v += _abs(y)
        d[x] = v
        v = 0
    return min(d, key=d.get)
>>> a = [10,100,300,200,1000,20,30]
>>> b = [1,2,3,4,10,20,30,40,100,200]
>>> print closest_elements(a, 3); closest_elements(b, 4)
(10, 20, 30) (1, 2, 3, 4)

This procedure can be done in O(N*K) if A is sorted. If A is not sorted, then the time will be bounded by the sorting procedure.
This is based on 2 facts (relevant only when A is ordered):
The closest subsets will always be contiguous in the sorted array.
When calculating the closeness of K consecutive elements, the sum of distances can be calculated as the sum of the gaps between each two consecutive elements, each multiplied by (K-i)*i, where i = 1,...,K-1 is the position of the gap within the window.
When iterating through the sorted array, recomputing the entire sum for every window is redundant: each window's closeness can be derived from the previous one by removing the contribution of the element that leaves and adding the contribution of the element that enters. The pseudo-code below does not apply this shortcut; it recomputes each window's sum in O(K), which gives the O(N*K) bound.
Here's the pseudo-code
List<pair> FindClosestSubsets(int[] A, int K)
{
    List<pair> minList = new List<pair>();
    int minVal = infinity;
    int tempSum;
    int N = A.length;
    for (int i = K - 1; i < N; i++)
    {
        // window is A[i-K+1 .. i]; p is the 1-based position of the gap A[j] - A[j-1] inside it
        tempSum = 0;
        for (int j = i - K + 2; j <= i; j++)
        {
            int p = j - (i - K + 1);
            tempSum += (K - p) * p * (A[j] - A[j - 1]);
        }
        if (tempSum < minVal)
        {
            minVal = tempSum;
            minList.clear();
            minList.add(new pair(i - K + 1, i));
        }
        else if (tempSum == minVal)
            minList.add(new pair(i - K + 1, i));
    }
    return minList;
}
This function returns a list of index pairs (the starting and ending index of each solution), one per optimal window. The question implied that you want all solutions with the minimal value.

Try the following:
N = input()
K = input()
assert 2 <= N <= 10**5
assert 2 <= K <= N
a = some_unsorted_list
a.sort()
cur_diff = sum([abs(a[i] - a[i + 1]) for i in range(K - 1)])
min_diff = cur_diff
min_last_idx = K - 1
for last_idx in range(K, N):
    # drop the gap that leaves on the left, add the gap that enters on the right
    cur_diff = cur_diff - \
        abs(a[last_idx - K] - a[last_idx - K + 1]) + \
        abs(a[last_idx] - a[last_idx - 1])
    if min_diff > cur_diff:
        min_diff = cur_diff
        min_last_idx = last_idx
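From min_last_idx you can calculate min_first_idx (min_last_idx - K + 1). I use range to preserve the order of the indices; in Python 2.7 it builds a full list, so it takes linearly more RAM than xrange. Note that this minimizes the sum of adjacent gaps in the window, which for sorted input is just a[last] - a[first]. That is cheaper to maintain than the pairwise sum in the question, but it is a different quantity and the two can pick different windows.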

After sorting, we can be sure that, if x1, x2, ... xk are the solution, then x1, x2, ... xk are contiguous elements, right?
So,
take the intervals between numbers
sum these intervals to get the intervals between k numbers
Choose the smallest of them

My initial solution was to look through every K-element window, multiply each element by a weight m, and take the sum over that window, where m is initialized to -(K-1) and incremented by 2 at each step; the answer is the window with the minimum sum. So for a window of size 3, m starts at -2 and the weights over the window are -2, 0, 2.
This works because each element in the K window contributes a fixed weight to the sum. For example, if the elements are [10 20 30], the sum is (30-10) + (30-20) + (20-10). Expanding the expression gives 2*30 + 0*20 + (-2)*10.
Each window can be evaluated in O(K) time, so the entire operation is O(NK). However, it turns out that this solution is not optimal, and there are certain edge cases where this algorithm fails. I have yet to figure out those cases, but I'm sharing the solution anyway in case anyone can get something useful from it.
for(i = 0; i <= n - k; ++i)
{
    diff = 0;
    l = -(k-1);
    for(j = i; j < i + k; ++j)
    {
        diff += a[j]*l;
        if(min < diff)
            break;
        l += 2;
    }
    if(j == i + k && diff > 0)
        min = diff;
}
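For comparison, here is a Python sketch of the same element-weight idea with three deliberate differences from the loop above, all of which are assumptions on my part rather than a diagnosis of the failing edge cases: the input is sorted first, there is no early exit on partial sums, and the best value is initialized from the first window instead of relying on diff > 0. The function names are hypothetical.

```python
def window_closeness(sorted_window):
    """Pairwise distance sum of a sorted window using per-element weights
    -(K-1), -(K-3), ..., K-3, K-1."""
    k = len(sorted_window)
    return sum((2 * idx - (k - 1)) * x for idx, x in enumerate(sorted_window))


def closest_by_weights(values, k):
    """Scan all K-element windows of the sorted values; O(N log N + N*K)."""
    a = sorted(values)
    best_start = 0
    best_sum = window_closeness(a[:k])
    for start in range(1, len(a) - k + 1):
        current = window_closeness(a[start:start + k])
        if current < best_sum:
            best_start, best_sum = start, current
    return a[best_start:best_start + k]


print(window_closeness([10, 20, 30]))  # 2*30 + 0*20 - 2*10 = 40
print(closest_by_weights([10, 100, 300, 200, 1000, 20, 30], 3))
```

Because partial weighted sums start out negative and are not monotone, comparing them against the running minimum mid-window is not safe in general, which is why the sketch always finishes the window.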

You can do this in O(n log n) time with a sliding window approach (O(n) if the array is already sorted).
First, suppose we've precomputed, at every index i in our array, the sum of distances from A[i] to the previous k-1 elements. The formula for that would be
(A[i] - A[i-1]) + (A[i] - A[i-2]) + ... + (A[i] - A[i-k+1]).
If i is less than k-1, we just compute the sum to the array boundary.
Suppose we also precompute, at every index i in our array, the sum of distances from A[i] to the next k-1 elements. Then we could solve the whole problem with a single pass of a sliding window.
If our sliding window is on [L, L+k-1] with closeness sum S, then the closeness sum for the interval [L+1, L+k] is just S - dist_sum_to_next[L] + dist_sum_to_prev[L+k]. The only changes in the sum of pairwise distances are removing all terms involving A[L] when it leaves our window, and adding all terms involving A[L+k] as it enters our window.
The only remaining part is how to compute, at a position i, the sum of distances between A[i] and the previous k-1 elements (the other computation is totally symmetric). If we know the distance sum at i-1, this is easy: subtract the distance from A[i-1] to A[i-k], and add in the extra distance from A[i-1] to A[i] k-1 times:
dist_sum_to_prev[i] = (dist_sum_to_prev[i - 1] - (A[i - 1] - A[i - k]))
                      + (A[i] - A[i - 1]) * (k - 1)
Python code:
import math
from typing import List

def closest_subset(nums: List[int], k: int) -> List[int]:
    """Given a list of n (poss. unsorted and non-unique) integers nums,
    returns a (sorted) list of size k that minimizes the sum of pairwise
    distances between all elements in the list.
    Runs in O(n lg n) time, uses O(n) auxiliary space.
    """
    n = len(nums)
    assert 2 <= k <= n
    nums.sort()
    # Sum of pairwise distances to the next (at most) k-1 elements
    dist_sum_to_next = [0] * n
    # Sum of pairwise distances to the last (at most) k-1 elements
    dist_sum_to_prev = [0] * n
    for i in range(1, n):
        if i >= k:
            dist_sum_to_prev[i] = ((dist_sum_to_prev[i - 1] -
                                    (nums[i - 1] - nums[i - k]))
                                   + (nums[i] - nums[i - 1]) * (k - 1))
        else:
            dist_sum_to_prev[i] = (dist_sum_to_prev[i - 1]
                                   + (nums[i] - nums[i - 1]) * i)
    for i in reversed(range(n - 1)):
        if i < n - k:
            dist_sum_to_next[i] = ((dist_sum_to_next[i + 1]
                                    - (nums[i + k] - nums[i + 1]))
                                   + (nums[i + 1] - nums[i]) * (k - 1))
        else:
            dist_sum_to_next[i] = (dist_sum_to_next[i + 1]
                                   + (nums[i + 1] - nums[i]) * (n-i-1))
    best_sum = math.inf
    curr_sum = 0
    answer_right_bound = 0
    for i in range(n):
        # A[i] enters the window; A[i - k] leaves it once the window is full
        curr_sum += dist_sum_to_prev[i]
        if i >= k:
            curr_sum -= dist_sum_to_next[i - k]
        if curr_sum < best_sum and i >= k - 1:
            best_sum = curr_sum
            answer_right_bound = i
    return nums[answer_right_bound - k + 1:answer_right_bound + 1]

Checker board algorithm implementation

def Checker(n):
    p = [[7,3,5,6,1],[2,6,7,0,2],[3,5,7,8,2],[7,6,1,1,4],[6,7,4,7,8]] #profit of each cell
    cost = [[0 for j in range(n)] for i in range(n)]
    w = [[0 for j in range(n)] for i in range(n)] #w[i, j] store the column number (j) of the previous square from which we moved to the current square at [i,j]
    for j in range(1,n):
        cost[1][j] = 0
    for i in range(2,n):
        for j in range(1,n):
            max = cost[i-1][j] + p[i-1][j]
            w[i][j] = j
            if (j > 1 and cost[i-1][j-1] + p[i-1][j-1] > max):
                max = cost[i-1][j-1] + p[i-1][j-1]
                w[i][j] = j-1
            if (j < n and cost[i-1][j+1] + p[i-1][j+1] > max):
                max = cost[i-1][j+1] + p[i-1][j+1]
                w[i][j] = j+1
            cost[i][j] = max
            print cost[i][j]
    maxd = cost[1][1]
    maxj = 1
    for j in range(2,n):
        if cost[1][j] >maxd:
            maxd = cost[1][j]
            maxj = j
    print "Maximum profit is: ",maxd
    printsquares(w,n,maxj)

def printsquares(w,i,j):
    if i == -1:
        return
    print "Square at row %d and column %d"%(i,j)
    printsquares(w,i-1,w[i][j])

if __name__ == '__main__':
    print "5*5 checker board problem"
    n = 5
    Checker(n)
The above program is an implementation of the checkerboard algorithm in Python.
When I run the above code, the following error is shown:
if (j < n and cost[i-1][j+1] + p[i-1][j+1] > max): IndexError: list index out of range
What am I doing wrong, and can anyone propose a solution for it?

To avoid the IndexError exceptions,
if (j < n and cost[i-1][j+1] + p[i-1][j+1] > max):
    max = cost[i-1][j+1] + p[i-1][j+1]
should be written:
if (j < n-1 and cost[i-1][j+1] + p[i-1][j+1] > max):
    max = cost[i-1][j+1] + p[i-1][j+1]
and
printsquares(w,i-1,w[i][j])
becomes
printsquares(w,i-1,w[i-1][j])
But as others have said, I'm not certain that the algorithm is correctly implemented.

You're pretty much trying to do this:
L = range(5)
print L[4+1]
At cost[i-1][j+1], with j at its largest value of n-1 = 4.
There is no 6th element. There are only five. Hence the IndexError.
As for a solution, you probably only want n and not n+1 if you want the last element. However, I'm not 100% sure.

It looks like you are trying to adapt an algorithm from a language where lists start from 1, whereas they start from 0 in Python. As far as I checked, you never access cost[i][0].
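Putting the answers together, a zero-indexed version with explicit bounds checks could look like the Python 3 sketch below. It rests on assumptions the question does not state: a path starts anywhere in row 0, moves to the same or an adjacent column in the next row, and collects the profit of every cell it visits. The name max_profit_path and the return shape are made up for illustration.

```python
def max_profit_path(p):
    """Return (best_profit, path) where path is a list of (row, col) from row 0 down."""
    n = len(p)
    # cost[i][j]: best profit of any path ending at (i, j), including p[i][j]
    cost = [p[0][:]] + [[0] * n for _ in range(n - 1)]
    # prev[i][j]: column in row i-1 that the best path to (i, j) came from
    prev = [[-1] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(n):
            best_j = j
            for cand in (j - 1, j + 1):
                # 0 <= cand < n keeps both neighbours inside the board
                if 0 <= cand < n and cost[i - 1][cand] > cost[i - 1][best_j]:
                    best_j = cand
            cost[i][j] = cost[i - 1][best_j] + p[i][j]
            prev[i][j] = best_j
    j = max(range(n), key=lambda c: cost[n - 1][c])
    best = cost[n - 1][j]
    path = []
    for i in range(n - 1, -1, -1):
        path.append((i, j))
        j = prev[i][j]
    path.reverse()
    return best, path


board = [[7, 3, 5, 6, 1], [2, 6, 7, 0, 2], [3, 5, 7, 8, 2],
         [7, 6, 1, 1, 4], [6, 7, 4, 7, 8]]
best, path = max_profit_path(board)
print("Maximum profit is:", best)
for row, col in path:
    print("Square at row %d and column %d" % (row, col))
```

The path is rebuilt iteratively from the last row upward, which avoids the recursive printsquares call and its off-by-one on the starting row.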
