I know that the time complexity should be O(N). However, when I'm testing it empirically, I get weird results. Can somebody please explain what's going on?
def insertPivot(array, start, end):
    pivot = end
    i = start
    j = end - 1
    while i < j:
        while array[i] < array[pivot] and i < j:
            i += 1
        while array[j] > array[pivot] and j > i:
            j -= 1
        array[i], array[j] = array[j], array[i]
    if array[i] > array[pivot]:
        array[i], array[pivot] = array[pivot], array[i]
        pivot = i
    return pivot
def quickselect(array, k):
    start = 0
    end = len(array) - 1
    pivot = insertPivot(array, start, end)
    while pivot != k - 1:
        if pivot < k - 1:
            start, end = pivot, end
        else:
            start, end = start, pivot - 1
        pivot = insertPivot(array, start, end)
    return array[k - 1]
And here's how I'm getting my measurements
import random
import timeit
import numpy as np

av_times = dict()
for n in [10, 100, 500, 1000, 5000, 10000]:
    times = list()
    array = list(range(n))
    for _ in range(10):
        random.shuffle(array)
        k = random.randint(0, n)
        times.append(
            timeit.timeit(lambda: quickselect(array, k), number=10)
        )
    av_times[n] = sum(times) / len(times)

xx, yy = zip(*av_times.items())
xx, yy = np.log(xx), np.log(yy)
m, b = np.polyfit(xx, yy, 1)
The slope coefficient m is 1.5, which suggests the time complexity is O(N*sqrt(N)) rather than the expected O(N).
insertPivot is indeed of O(N) complexity, since you increase i and decrease j until j is no longer greater than i. However, insertPivot is embedded in a while loop inside quickselect, so whatever the number of iterations of that loop, it multiplies the complexity of insertPivot: an O(N) routine is executed on each step of the loop. If pivot < k - 1, you move the left boundary of your interval up to pivot; otherwise you move the right boundary down to pivot - 1. So on each step the interval shrinks by the difference between the pivot and its left or right edge. Whatever function approximates the number of loop steps is the factor your N gets multiplied by in the linear complexity, and that product is the actual complexity.
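One way to check this reasoning empirically without timer noise (my own sketch, not part of the original answer) is to count the work directly: the variant below replicates the code above but tallies every pointer move insertPivot makes. Fitting log(steps) against log(n) should reproduce the same exponent as the timing experiment.

import random

def quickselect_steps(array, k):
    """Same logic as the quickselect above, but returns the total number
    of i/j pointer moves made across all insertPivot calls."""
    steps = 0

    def insert_pivot(start, end):
        nonlocal steps
        pivot, i, j = end, start, end - 1
        while i < j:
            while array[i] < array[pivot] and i < j:
                i += 1
                steps += 1
            while array[j] > array[pivot] and j > i:
                j -= 1
                steps += 1
            array[i], array[j] = array[j], array[i]
        if array[i] > array[pivot]:
            array[i], array[pivot] = array[pivot], array[i]
            pivot = i
        return pivot

    start, end = 0, len(array) - 1
    pivot = insert_pivot(start, end)
    while pivot != k - 1:
        if pivot < k - 1:
            start = pivot
        else:
            end = pivot - 1
        pivot = insert_pivot(start, end)
    return steps

for n in [1000, 2000, 4000, 8000, 16000]:
    arr = list(range(n))
    random.shuffle(arr)
    print(n, quickselect_steps(arr, random.randint(1, n)))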
I have an algorithm that looks for the good pairs in a list of numbers. A good pair is a pair of indices i, j with i < j and arr[i] < arr[j]. It currently has a complexity of O(n^2), but I want to make it O(n log n) based on divide and conquer. How can I go about doing that?
Here's the algorithm:
def goodPairs(nums):
    count = 0
    for i in range(0, len(nums)):
        for j in range(i+1, len(nums)):
            if i < j and nums[i] < nums[j]:
                count += 1
            j += 1
        j += 1
    return count
Here's my attempt at making it but it just returns 0:
def goodPairs(arr):
    count = 0
    if len(arr) > 1:
        # Finding the mid of the array
        mid = len(arr)//2
        # Dividing the array elements
        left_side = arr[:mid]
        # into 2 halves
        right_side = arr[mid:]
        # Sorting the first half
        goodPairs(left_side)
        # Sorting the second half
        goodPairs(right_side)
        for i in left_side:
            for j in right_side:
                if i < j:
                    count += 1
    return count
The current previously accepted answer by Fire Assassin doesn't really answer the question, which asks for better complexity. It's still quadratic, and about as fast as a much simpler quadratic solution. Benchmark with 2000 shuffled ints:
387.5 ms original
108.3 ms pythonic
104.6 ms divide_and_conquer_quadratic
4.1 ms divide_and_conquer_nlogn
4.6 ms divide_and_conquer_nlogn_2
Code (Try it online!):
def original(nums):
    count = 0
    for i in range(0, len(nums)):
        for j in range(i+1, len(nums)):
            if i < j and nums[i] < nums[j]:
                count += 1
            j += 1
        j += 1
    return count

def pythonic(nums):
    count = 0
    for i, a in enumerate(nums, 1):
        for b in nums[i:]:
            if a < b:
                count += 1
    return count

def divide_and_conquer_quadratic(arr):
    count = 0
    left_count = 0
    right_count = 0
    if len(arr) > 1:
        mid = len(arr) // 2
        left_side = arr[:mid]
        right_side = arr[mid:]
        left_count = divide_and_conquer_quadratic(left_side)
        right_count = divide_and_conquer_quadratic(right_side)
        for i in left_side:
            for j in right_side:
                if i < j:
                    count += 1
    return count + left_count + right_count

def divide_and_conquer_nlogn(arr):
    mid = len(arr) // 2
    if not mid:
        return 0
    left = arr[:mid]
    right = arr[mid:]
    count = divide_and_conquer_nlogn(left)
    count += divide_and_conquer_nlogn(right)
    i = 0
    for r in right:
        while i < mid and left[i] < r:
            i += 1
        count += i
    arr[:] = left + right
    arr.sort()  # linear, as Timsort takes advantage of the two sorted runs
    return count

def divide_and_conquer_nlogn_2(arr):
    mid = len(arr) // 2
    if not mid:
        return 0
    left = arr[:mid]
    right = arr[mid:]
    count = divide_and_conquer_nlogn_2(left)
    count += divide_and_conquer_nlogn_2(right)
    i = 0
    arr.clear()
    append = arr.append
    for r in right:
        while i < mid and left[i] < r:
            append(left[i])
            i += 1
        append(r)
        count += i
    arr += left[i:]
    return count

from timeit import timeit
from random import shuffle

arr = list(range(2000))
shuffle(arr)

funcs = [
    original,
    pythonic,
    divide_and_conquer_quadratic,
    divide_and_conquer_nlogn,
    divide_and_conquer_nlogn_2,
]

for func in funcs:
    print(func(arr[:]))

for _ in range(3):
    print()
    for func in funcs:
        arr2 = arr[:]
        t = timeit(lambda: func(arr2), number=1)
        print('%5.1f ms ' % (t * 1e3), func.__name__)
One of the most well-known divide-and-conquer algorithms is merge sort. And merge sort is actually a really good foundation for this algorithm.
The idea is that when comparing two numbers from two different 'partitions', you already have a lot of information about the remaining part of these partitions, as they're sorted in every iteration.
Let's take an example!
Consider the following partitions, which have already been sorted individually and had their "good pairs" counted.
Partition x: [1, 3, 6, 9].
Partition y: [4, 5, 7, 8].
It is important to note that the numbers from partition x are located further to the left in the original list than those of partition y. In particular, the index i of every element in x must be smaller than the index j of every element in y.
We will start off by comparing 1 and 4. Obviously 1 is smaller than 4. But since 4 is the smallest element in partition y, 1 must also be smaller than the rest of the elements in y. Consequently, we can conclude that there are 4 additional good pairs, since the index of 1 is also smaller than the indices of the remaining elements of y.
The exact same thing happens with 3, and we can add 4 new good pairs to the sum.
For 6 we conclude that there are two new good pairs, since the comparisons between 6 and 4 and between 6 and 5 did not yield good pairs.
You might now see how these additional good pairs are counted: if the element from x is less than the element from y, add the number of elements remaining in y to the sum. Rinse and repeat.
Since merge sort is an O(n log n) algorithm, and the additional counting work per comparison is constant, we can conclude that this algorithm is also O(n log n).
I will leave the actual programming as an exercise for you.
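To make the exercise concrete, here is a minimal sketch of the counting merge described above (my own illustration, not part of the original answer): during the merge step, whenever the element taken from the left half is smaller than the current element of the right half, every element still remaining in the right half forms a good pair with it.

def count_good_pairs(nums):
    """Counts pairs i < j with nums[i] < nums[j] via a counting merge sort."""
    def sort_and_count(arr):
        if len(arr) <= 1:
            return arr, 0
        mid = len(arr) // 2
        left, left_pairs = sort_and_count(arr[:mid])
        right, right_pairs = sort_and_count(arr[mid:])
        merged, count = [], left_pairs + right_pairs
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] < right[j]:
                # left[i] is smaller than right[j] and, since right is
                # sorted, smaller than everything after it as well.
                count += len(right) - j
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged += left[i:] + right[j:]
        return merged, count
    return sort_and_count(nums)[1]

It agrees with the brute force on small inputs, e.g. count_good_pairs([3, 1, 2]) == 1.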
@niklasaa has added an explanation for the merge sort analogy, but your implementation still has an issue. You are partitioning the array and calculating the result for either half, but:
1. You haven't actually sorted either half. So when you're comparing their elements, your two pointer approach isn't correct.
2. You haven't used their results in the final computation. That's why you're getting an incorrect answer.
For point #1, you should look at merge sort, especially the merge() function. That logic is what will give you the correct pair count without O(N^2) iteration.
For point #2, store the result for either half first:
# Count the pairs in the first half
leftCount = goodPairs(left_side)
# Count the pairs in the second half
rightCount = goodPairs(right_side)
While returning the final count, add these two results as well.
return count + leftCount + rightCount
Like @Abhinav Mathur stated, you have most of the code down; your problem is with these lines:
# Sorting the first half
goodPairs(left_side)
# Sorting the second half
goodPairs(right_side)
You want to store these in variables that should be declared before the if statement. Here's an updated version of your code:
def goodPairs(arr):
    count = 0
    left_count = 0
    right_count = 0
    if len(arr) > 1:
        mid = len(arr) // 2
        left_side = arr[:mid]
        right_side = arr[mid:]
        left_count = goodPairs(left_side)
        right_count = goodPairs(right_side)
        for i in left_side:
            for j in right_side:
                if i < j:
                    count += 1
    return count + left_count + right_count
Recursion can be difficult at times. Look into the ideas behind merge sort and quicksort to get a better sense of how divide-and-conquer algorithms work.
I need to find the kth number in the input list. Please tell me what's wrong
import random

def partition(arr, start, end, pivot):
    pivot_locate = arr.index(pivot)
    arr[pivot_locate], arr[start] = arr[start], arr[pivot_locate]
    L = start; R = end
    i = L+1; j = L+1
    for k in range(j, R+1):  # k = 1~R
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
        j = k
    arr[L], arr[i-1] = arr[i-1], arr[L]
    return arr
def RSelect(arr, start, end, i):
    if start == end: return arr[start]
    if start < end:
        pivot = random.choice(arr)
        while arr.index(pivot) < start or arr.index(pivot) > end:
            pivot = random.choice(arr)
        arr_new = partition(arr, start, end, pivot)
        pLoc = arr_new.index(pivot)
        if pLoc == i: return pivot
        elif pLoc > i: return RSelect(arr_new, start, pLoc-1, i)
        else: return RSelect(arr_new, pLoc+1, end, i)
T = int(input())
for j in range(T):
    N, k = map(int, input().split())
    my_list = list(map(int, input().split()))
    k = len(my_list) - k
    anw = RSelect(my_list, 0, len(my_list)-1, k)
    print(anw)
Some of the test cases work fine, but some produce incorrect answers, and I don't know what the problem is. I am taking a course on the probabilistic selection algorithm.
There are some issues when the input list contains a lot of duplicate values. These issues all relate to how you use index(). You cannot assume that index() will return the index of the searched value within the range [start, end]: even if you know the value occurs within that range, it could also occur outside of it. And since index() returns the index of the first occurrence, you get undesired results:
In partition, arr.index(pivot) could return an index that is less than start, which obviously will lead to an undesired swap of the value at start to outside the range.
while arr.index(pivot) < start could be true, even if the value is also present in the subrange that is under consideration. In case the range consists of only repetitions of this value and no other, this will make that while loop an infinite loop.
A similar problem occurs with arr_new.index(pivot). This can lead to a recursive call where the range is greater than the current range, leading to a potential stack overflow.
Some other remarks:
arr_new = partition() is a bit misleading, as it gives the impression you get a new list, but actually arr has been mutated and it is the list that partition also returns. So to avoid misinterpretation, it is better to just continue with arr and not introduce a new variable for the same list. Instead of returning arr, it would be more useful if partition would return the index of where the pivot value ended up. This way you don't have to perform an index call any more.
partition has to search for the given pivot value. It can be relieved of that scan by passing it the index of the pivot. You should really aim to avoid using index() at all, as it leads to a worse average time complexity.
Here is the proposed code with those points taken into account:
def partition(arr, start, end, pivot_locate):
    # The function now receives the pivot's index and derives the value from it
    pivot = arr[pivot_locate]
    arr[pivot_locate], arr[start] = arr[start], arr[pivot_locate]
    L = start; R = end
    i = L+1; j = L+1
    for k in range(j, R+1):
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
        j = k
    arr[L], arr[i-1] = arr[i-1], arr[L]
    return i-1  # return the new index of the pivot

def RSelect(arr, start, end, i):
    if start >= end:
        return arr[start]
    if start < end:
        # Select a random index within the range
        pLoc = random.randint(start, end)
        # Call partition with that index and get the pivot's new index back
        pLoc = partition(arr, start, end, pLoc)
        if pLoc == i:
            return arr[pLoc]
        elif pLoc > i:
            return RSelect(arr, start, pLoc-1, i)
        else:
            return RSelect(arr, pLoc+1, end, i)
Suppose I have a lattice
a = np.array([[1, 1, 1, 1],
              [2, 2, 2, 2],
              [3, 3, 3, 3],
              [4, 4, 4, 4]])
I'd like to make a function func(lattice, start, end) that takes in 3 inputs, where start and end are the indices of the rows over which the function sums the elements. For example, func(a,1,3) sums all the elements of those rows, so func(a,1,3) = 2+2+2+2+3+3+3+3+4+4+4+4 = 36.
Now I know this can be done easily with slicing and np.sum(). But crucially, what I want func to do is to also have the ability to wrap around. Namely, func(a,2,4) should return 3+3+3+3+4+4+4+4+1+1+1+1.
A couple more examples would be:
func(a,3,4) = 4+4+4+4+1+1+1+1
func(a,3,5) = 4+4+4+4+1+1+1+1+2+2+2+2
func(a,0,1) = 1+1+1+1+2+2+2+2
In my situation I'm never gonna get to a point where it'll sum the whole thing again i.e.
func(a,3,6) = sum of all elements
Update:
For my algorithm
for i in range(MC_STEPS_NODE):
    sweep(lattice, prob, start_index_master, end_index_master,
          rows_for_master)
    # calculate the energy
    Ene = subhamiltonian(lattice, start_index_master, end_index_master)
    # calculate the magnetisation
    Mag = mag(lattice, start_index_master, end_index_master)
    E1 += Ene
    M1 += Mag
    E2 += Ene*Ene
    M2 += Mag*Mag
    if i % sites_for_master == 0:
        comm.Send([lattice[start_index_master:start_index_master+1], L, MPI.INT],
                  dest=(rank-1) % size, tag=4)
        comm.Recv([lattice[end_index_master:end_index_master+1], L, MPI.INT],
                  source=(rank+1) % size, tag=4)
        start_index_master = (start_index_master + 1)
        end_index_master = (end_index_master + 1)
        if start_index_master > 100:
            start_index_master = start_index_master % L
        if end_index_master > 100:
            end_index_master = end_index_master % L
The function I want is the mag() function, which calculates the magnetisation of a sublattice, which is just the sum of all its elements. Imagine an LxL lattice split into two sublattices, one belonging to the master and the other to the worker. Each sweep sweeps the corresponding sublattice of lattice, with start_index_master and end_index_master determining the start and end rows of the sublattice. Every time i % sites_for_master == 0, the indices move down by one, and are eventually taken mod 100 to prevent memory overflow in mpi4py. So you can imagine that if the sublattice is at the centre of the main lattice, then start_index_master < end_index_master. Eventually the sublattice keeps moving down to the point where it wraps around, with the bottom index exceeding L while still start_index_master < end_index_master; in that case, for a lattice with L = 10, an index of 10 refers to the first row ([0]) of the main lattice.
Energy function:
def subhamiltonian(lattice: np.ndarray, col_len_start: int,
                   col_len_end: int) -> float:
    energy = 0
    for i in range(col_len_start, col_len_end+1):
        for j in range(len(lattice)):
            spin = lattice[i%L, j]
            nb_sum = lattice[(i%L+1) % L, j] + lattice[i%L, (j+1) % L] + \
                     lattice[(i%L-1) % L, j] + lattice[i%L, (j-1) % L]
            energy += -nb_sum*spin
    return energy/4.
This is my function for computing the energy of the sublattice.
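For completeness, the mag() referenced above can be written in the same wrap-around style as subhamiltonian. This is only a sketch of what I assume mag() looks like (it relies on the same module-level lattice size L and numpy imported as np):

# Sketch of the mag() referenced above: the magnetisation is just the sum
# of all spins in the sublattice rows, with row indices wrapped modulo L.
# Assumes the same module-level L as subhamiltonian above.
def mag(lattice: np.ndarray, col_len_start: int, col_len_end: int) -> float:
    total = 0
    for i in range(col_len_start, col_len_end + 1):
        for j in range(len(lattice)):
            total += lattice[i % L, j]
    return float(total)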
You could use np.arange to create the indexes to be summed.
>>> def func(lattice, start, end):
... rows = lattice.shape[0]
... return lattice[np.arange(start, end+1) % rows].sum()
...
>>> func(a,3,4)
20
>>> func(a,3,5)
28
>>> func(a,0,1)
12
You can check if the stop index wraps around and, if it does, add the sum from the beginning of the array to the result. This is efficient because it relies on slice indexing and only does extra work if necessary.
def func(a, start, stop):
    stop += 1
    result = np.sum(a[start:stop])
    if stop > len(a):
        result += np.sum(a[:stop % len(a)])
    return result
The above version works for stop - start < len(a), i.e. no more than one full wrap-around. For an arbitrary number of wrap-arounds (i.e. arbitrary values for start and stop), the following version can be used:
def multi_wraps(a, start, stop):
    result = 0
    # Adjust both indices in case the start index wrapped around.
    stop -= (start // len(a)) * len(a)
    start %= len(a)
    stop += 1  # Include the element pointed to by the stop index.
    n_wraps = (stop - start) // len(a)
    if n_wraps > 0:
        result += n_wraps * a.sum()
        stop = start + (stop - start) % len(a)
    result += np.sum(a[start:stop])
    if stop > len(a):
        result += np.sum(a[:stop % len(a)])
    return result
In case n_wraps > 0, some parts of the array will be summed twice, which is unnecessarily inefficient, so we can instead compute the sums of the various array parts only as necessary. The following version sums every array element at most once:
def multi_wraps_efficient(a, start, stop):
    # Adjust both indices in case the start index wrapped around.
    stop -= (start // len(a)) * len(a)
    start %= len(a)
    stop += 1  # Include the element pointed to by the stop index.
    n_wraps = (stop - start) // len(a)
    # Eliminate the wraps since they will be accounted for separately.
    stop = start + (stop - start) % len(a)
    tail_sum = a[start:stop].sum()
    if stop > len(a):
        head_sum = a[:stop % len(a)].sum()
        if n_wraps > 0:
            remaining_sum = a[stop % len(a):start].sum()
    elif n_wraps > 0:
        head_sum = a[:start].sum()
        remaining_sum = a[stop:].sum()
    result = tail_sum
    if stop > len(a):
        result += head_sum
    if n_wraps > 0:
        result += n_wraps * (head_sum + tail_sum + remaining_sum)
    return result
The following plot shows a performance comparison between using index arrays and the two multi-wrap methods presented above. The tests were run on a (1_000, 1_000) lattice. One can observe that for the multi_wraps method there is an increase in runtime when going from 1 to 2 wrap-arounds, since it unnecessarily sums the array twice. The multi_wraps_efficient method has the same performance regardless of the number of wrap-arounds, since it sums every array element no more than once.
The performance plot was generated using the perfplot package:
import perfplot

perfplot.show(
    setup=lambda n: (np.ones(shape=(1_000, 1_000), dtype=int), 400, n*1_000 + 200),
    kernels=[
        lambda x: index_arrays(*x),
        lambda x: multi_wraps(*x),
        lambda x: multi_wraps_efficient(*x),
    ],
    labels=['index_arrays', 'multi_wraps', 'multi_wraps_efficient'],
    n_range=range(1, 11),
    xlabel="Number of wrap-arounds",
    equality_check=lambda x, y: x == y,
)
I wrote both a brute-force and a divide-and-conquer implementation of the Max Subarray problem in Python. Tests are run by drawing a random sample of integers.
When the length of the input array is large, the assert in __main__ fails because the recursive algorithm does not return the correct answer. However, the two algorithms DO agree when the array is less than 10 elements long (this is approximate, and the actual size of the failed input varies on each execution). The issue does not seem to be related to even or odd array lengths, but it does appear to be related to how the array is indexed.
Sorry if I'm missing something stupid, but why does the recursive algorithm stop returning the correct output when the input array starts getting larger?
# Subarray solutions are represented by an array in the form
# [lower_bound, higher_bound, sum]
from sys import maxsize
import random
import time

# Brute force implementation (THETA(n^2))
def bf_max_subarray(A):
    biggest = -maxsize - 1
    left = 0
    right = 0
    for i in range(0, len(A)):
        sum = 0
        for j in range(i, len(A)):
            sum += A[j]
            if sum > biggest:
                biggest = sum
                left = i
                right = j
    return [left, right, biggest]
# Part of divide-and-conquer solution
def cross_subarray(A, l, m, r):
    lsum = -maxsize - 1
    rsum = -maxsize - 1
    lbound = 0
    rbound = 0
    tempsum = 0
    for i in range(m, l-1, -1):
        tempsum += A[i]
        if tempsum > lsum:
            lsum = tempsum
            lbound = i
    tempsum = 0
    for j in range(m+1, r+1):
        tempsum += A[j]
        if tempsum > rsum:
            rsum = tempsum
            rbound = j
    return [lbound, rbound, lsum + rsum]
# Recursive solution
def rec_max_subarray(A, l, r):
    # Base case: array of one element
    if (l == r):
        return [l, r, A[l]]
    else:
        m = (l+r)//2
        left = rec_max_subarray(A, l, m)
        right = rec_max_subarray(A, m+1, r)
        cross = cross_subarray(A, l, m, r)
        # Returns the array representing the subarray with the maximum sum.
        return max([left, right, cross], key=lambda i: i[2])
if __name__ == "__main__":
    for i in range(1, 101):
        A = random.sample(range(-i*2, i), i)
        start = time.clock()
        bf = bf_max_subarray(A)
        bf_time = time.clock() - start
        start = time.clock()
        dc = rec_max_subarray(A, 0, len(A)-1)
        dc_time = time.clock() - start
        assert dc == bf  # Make sure the algorithms agree.
The subarray with the maximum sum is represented by an array of the form [left_bound, right_bound, sum].
But thanks to return max([left, right, cross], key=lambda i: i[2]), rec_max_subarray returns the correct maximum sum for A while risking returning indices that do not match the indices returned by bf_max_subarray. My error was assuming that the boundaries of a subarray with the maximum sum would be unique.
The solution is either to fix the criteria by which a subarray is selected, or simply to assert the equality of the sums using assert dc[2] == bf[2].
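A quick way to convince yourself (my own check, not part of the original post): rerun the loop comparing only the sums; this passes even where the full-triple comparison can fail.

import random

for _ in range(100):
    A = random.sample(range(-40, 20), 20)
    bf = bf_max_subarray(A)
    dc = rec_max_subarray(A, 0, len(A) - 1)
    assert dc[2] == bf[2]  # the maximum sums always agree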
Given an array of integers size N, how can you efficiently find a subset of size K with elements that are closest to each other?
Let the closeness for a subset (x1, x2, ..., xk) be defined as the sum of pairwise distances, i.e. the sum of |x[i] - x[j]| over all pairs 1 <= i < j <= k.

Constraints:

2 <= N <= 10^5
2 <= K <= N

The array may contain duplicates and is not guaranteed to be sorted.
My brute force solution is very slow for large N, and it doesn't check whether there's more than one solution:
import sys

N = input()
K = input()
assert 2 <= N <= 10**5
assert 2 <= K <= N
a = []
for i in xrange(0, N):
    a.append(input())
a.sort()
minimum = sys.maxint
startindex = 0
for i in xrange(0, N-K+1):
    last = i + K
    tmp = 0
    for j in xrange(i, last):
        for l in xrange(j+1, last):
            tmp += abs(a[j]-a[l])
            if tmp > minimum:
                break
    if tmp < minimum:
        minimum = tmp
        startindex = i  # end index = startindex + K?
Examples:
N = 7
K = 3
array = [10,100,300,200,1000,20,30]
result = [10,20,30]
N = 10
K = 4
array = [1,2,3,4,10,20,30,40,100,200]
result = [1,2,3,4]
Your current solution is O(NK^2) (assuming K > log N). With some analysis, I believe you can reduce this to O(NK).
The closest set of size K will consist of elements that are adjacent in the sorted list. You essentially have to sort the array first, so the subsequent analysis assumes that each sequence of K numbers is sorted, which allows the double sum to be simplified.

Assuming that the array is sorted such that x[j] >= x[i] when j > i, we can rewrite your closeness metric to eliminate the absolute value:

C = sum over all pairs i < j of (x[j] - x[i])

Next we rewrite your notation into a double summation with simple bounds:

C = sum(i = 1..K-1) sum(j = i+1..K) (x[j] - x[i])

Notice that we can rewrite the inner distance between x[i] and x[j] as a third summation:

x[j] - x[i] = sum(l = i..j-1) d[l]

where I've used d[l] to simplify the notation going forward:

d[l] = x[l+1] - x[l]
Notice that d[l] is the distance between each adjacent element in the list. Look at the structure of the inner two summations for a fixed i:
j=i+1    d[i]
j=i+2    d[i] + d[i+1]
j=i+3    d[i] + d[i+1] + d[i+2]
...
j=K      d[i] + d[i+1] + d[i+2] + ... + d[K-1]
Notice the triangular structure of the inner two summations. This allows us to rewrite the inner two summations as a single summation in terms of the distances of adjacent terms:
total: (K-i)*d[i] + (K-i-1)*d[i+1] + ... + 2*d[K-2] + 1*d[K-1]
which reduces the total sum to:

C = sum(i = 1..K-1) sum(l = i..K-1) (K-l) * d[l]
Now we can look at the structure of this double summation:
i=1 (K-1)*d[1] + (K-2)*d[2] + (K-3)*d[3] + ... + 2*d[K-2] + d[K-1]
i=2 (K-2)*d[2] + (K-3)*d[3] + ... + 2*d[K-2] + d[K-1]
i=3 (K-3)*d[3] + ... + 2*d[K-2] + d[K-1]
...
i=K-2 2*d[K-2] + d[K-1]
i=K-1 d[K-1]
Again, notice the triangular pattern. The total sum then becomes:
1*(K-1)*d[1] + 2*(K-2)*d[2] + 3*(K-3)*d[3] + ... + (K-2)*2*d[K-2] + (K-1)*1*d[K-1]

Or, written as a single summation:

C = sum(l = 1..K-1) l * (K-l) * d[l]
This compact single summation of adjacent differences is the basis for a more efficient algorithm:
Sort the array, order O(N log N)
Compute the differences of each adjacent element, order O(N)
Iterate over each of the N-K+1 windows of K-1 consecutive differences and calculate the above sum, order O(NK)
Note that the second and third step could be combined, although with Python your mileage may vary.
The code:
def closeness(diff, K):
    acc = 0.0
    for (i, v) in enumerate(diff):
        acc += (i+1)*(K-(i+1))*v
    return acc

def closest(a, K):
    a.sort()
    N = len(a)
    diff = [a[i+1] - a[i] for i in xrange(N-1)]
    min_ind = 0
    min_val = closeness(diff[0:K-1], K)
    for ind in xrange(1, N-K+1):
        cl = closeness(diff[ind:ind+K-1], K)
        if cl < min_val:
            min_ind = ind
            min_val = cl
    return a[min_ind:min_ind+K]
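Checking it against the examples from the question (my own quick check, Python 2 to match the xrange above):

>>> closest([10, 100, 300, 200, 1000, 20, 30], 3)
[10, 20, 30]
>>> closest([1, 2, 3, 4, 10, 20, 30, 40, 100, 200], 4)
[1, 2, 3, 4]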
itertools to the rescue?
from itertools import combinations

def closest_elements(iterable, K):
    N = set(iterable)
    assert(2 <= K <= len(N) <= 10**5)
    combs = lambda it, k: combinations(it, k)
    _abs = lambda it: abs(it[0] - it[1])
    d = {}
    v = 0
    for x in combs(N, K):
        for y in combs(x, 2):
            v += _abs(y)
        d[x] = v
        v = 0
    return min(d, key=d.get)
>>> a = [10,100,300,200,1000,20,30]
>>> b = [1,2,3,4,10,20,30,40,100,200]
>>> print closest_elements(a, 3); closest_elements(b, 4)
(10, 20, 30) (1, 2, 3, 4)
This procedure can be done in O(N*K) if A is sorted. If A is not sorted, then the time will be bounded by the sorting procedure.
This is based on 2 facts (relevant only when A is ordered):
The closest subsets always consist of subsequent elements
When calculating the closeness of K subsequent elements, the sum of distances can be calculated as the sum of each two subsequent elements times (K-i)*i, where i is 1,...,K-1.
When iterating through the sorted array, it is redundant to recompute the entire sum: we can instead remove K times the distance between the previous two smallest elements, and add K times the distance between the two new largest elements. This fact is used to calculate the closeness of a subset in O(1) from the closeness of the previous subset.
Here's the pseudo-code
List<pair> FindClosestSubsets(int[] A, int K)
{
    List<pair> minList = new List<pair>();
    int minVal = infinity;
    int tempSum;
    int N = A.length;
    for (int i = K - 1; i < N; i++)
    {
        // window is A[i-K+1 .. i]; weight gap p (p = 1..K-1) by p*(K-p)
        tempSum = 0;
        for (int p = 1; p <= K - 1; p++)
            tempSum += p * (K - p) * (A[i - K + 1 + p] - A[i - K + p]);
        if (tempSum < minVal)
        {
            minVal = tempSum;
            minList.clear();
            minList.add(new pair(i - K + 1, i));
        }
        else if (tempSum == minVal)
            minList.add(new pair(i - K + 1, i));
    }
    return minList;
}
This function will return a list of index pairs representing the optimal solutions (the starting and ending index of each solution); it was implied in the question that you want to return all solutions with the minimal value.
try the following:
N = input()
K = input()
assert 2 <= N <= 10**5
assert 2 <= K <= N
a = some_unsorted_list
a.sort()

cur_diff = sum([abs(a[i] - a[i + 1]) for i in range(K - 1)])
min_diff = cur_diff
min_last_idx = K - 1
for last_idx in range(K, N):
    cur_diff = cur_diff \
               - abs(a[last_idx - K] - a[last_idx - K + 1]) \
               + abs(a[last_idx] - a[last_idx - 1])
    if min_diff > cur_diff:
        min_diff = cur_diff
        min_last_idx = last_idx
From the min_last_idx, you can calculate the min_first_idx. I use range to preserve the order of indices. If this is Python 2.7, it will take linearly more RAM. This is the same algorithm that you use, but slightly more efficient (a smaller constant in the complexity), as it does less work than summing everything.
After sorting, we can be sure that, if x1, x2, ... xk are the solution, then x1, x2, ... xk are contiguous elements, right?
So,
take the intervals between numbers
sum these intervals to get the intervals between k numbers
Choose the smallest of them (a sketch follows below)
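A literal sketch of this recipe (my own code, not the answerer's): compute the adjacent gaps, then slide a window of K-1 gaps and keep the window whose gap sum is smallest. Note that the unweighted gap sum telescopes to the window's span a[i+K-1] - a[i]; to optimize the pairwise closeness metric exactly, each gap should instead be weighted by l*(K-l), as derived in the earlier answer.

def smallest_window(a, K):
    a = sorted(a)
    # intervals between adjacent numbers
    gaps = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    # sum of K-1 consecutive gaps == span of the K-element window
    best = min(range(len(a) - K + 1),
               key=lambda i: sum(gaps[i:i + K - 1]))
    return a[best:best + K]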
My initial solution was to look at each K-element window, multiply each element by m, and take the sum over that range, where m is initialized to -(K-1) and incremented by 2 at each step; then take the minimum such sum over the entire list. So for a window of size 3, m starts at -2 and the values over the range are -2, 0, 2. This is because I observed that each element in the K-window adds a certain weight to the sum. For example, if the elements are [10, 20, 30], the sum is (30-10) + (30-20) + (20-10); breaking down the expression we have 2*30 + 0*20 + (-2)*10. Each window can be evaluated in O(K) time, so the entire operation is O(NK). However, it turns out that this solution is not optimal, and there are certain edge cases where this algorithm fails. I have yet to figure out those cases, but I'm sharing the solution anyway in case anyone can figure out something useful from it.
for (i = 0; i <= n - k; ++i)
{
    diff = 0;
    l = -(k-1);
    for (j = i; j < i + k; ++j)
    {
        diff += a[j]*l;
        if (min < diff)
            break;
        l += 2;
    }
    if (j == i + k && diff > 0)
        min = diff;
}
You can do this in O(n log n) time with a sliding-window approach (O(n) if the array is already sorted).
First, suppose we've precomputed, at every index i in our array, the sum of distances from A[i] to the previous k-1 elements. The formula for that would be
(A[i] - A[i-1]) + (A[i] - A[i-2]) + ... + (A[i] - A[i-k+1]).
If i is less than k-1, we just compute the sum to the array boundary.
Suppose we also precompute, at every index i in our array, the sum of distances from A[i] to the next k-1 elements. Then we could solve the whole problem with a single pass of a sliding window.
If our sliding window is on [L, L+k-1] with closeness sum S, then the closeness sum for the interval [L+1, L+k] is just S - dist_sum_to_next[L] + dist_sum_to_prev[L+k]. The only changes in the sum of pairwise distances are removing all terms involving A[L] when it leaves our window, and adding all terms involving A[L+k] as it enters our window.
The only remaining part is how to compute, at a position i, the sum of distances between A[i] and the previous k-1 elements (the other computation is totally symmetric). If we know the distance sum at i-1, this is easy: subtract the distance from A[i-1] to A[i-k], and add the extra distance (A[i] - A[i-1]) a total of k-1 times:

dist_sum_to_prev[i] = dist_sum_to_prev[i-1]
                      - (A[i-1] - A[i-k])
                      + (A[i] - A[i-1]) * (k-1)
Python code:
import math
from typing import List

def closest_subset(nums: List[int], k: int) -> List[int]:
    """Given a list of n (possibly unsorted and non-unique) integers nums,
    returns a (sorted) list of size k that minimizes the sum of pairwise
    distances between all elements in the list.

    Runs in O(n lg n) time, uses O(n) auxiliary space.
    """
    n = len(nums)
    assert 2 <= k <= n
    nums.sort()

    # Sum of pairwise distances to the next (at most) k-1 elements
    dist_sum_to_next = [0] * n
    # Sum of pairwise distances to the last (at most) k-1 elements
    dist_sum_to_prev = [0] * n

    for i in range(1, n):
        if i >= k:
            dist_sum_to_prev[i] = ((dist_sum_to_prev[i - 1] -
                                    (nums[i - 1] - nums[i - k]))
                                   + (nums[i] - nums[i - 1]) * (k - 1))
        else:
            dist_sum_to_prev[i] = (dist_sum_to_prev[i - 1]
                                   + (nums[i] - nums[i - 1]) * i)

    for i in reversed(range(n - 1)):
        if i < n - k:
            dist_sum_to_next[i] = ((dist_sum_to_next[i + 1]
                                    - (nums[i + k] - nums[i + 1]))
                                   + (nums[i + 1] - nums[i]) * (k - 1))
        else:
            dist_sum_to_next[i] = (dist_sum_to_next[i + 1]
                                   + (nums[i + 1] - nums[i]) * (n - i - 1))

    best_sum = math.inf
    curr_sum = 0
    answer_right_bound = 0
    for i in range(n):
        curr_sum += dist_sum_to_prev[i]
        if i >= k:
            curr_sum -= dist_sum_to_next[i - k]
        if curr_sum < best_sum and i >= k - 1:
            best_sum = curr_sum
            answer_right_bound = i
    return nums[answer_right_bound - k + 1:answer_right_bound + 1]
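Checking against the question's examples (my own quick check):

>>> closest_subset([10, 100, 300, 200, 1000, 20, 30], 3)
[10, 20, 30]
>>> closest_subset([1, 2, 3, 4, 10, 20, 30, 40, 100, 200], 4)
[1, 2, 3, 4]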