Find the maximum result after collapsing an array with subtractions - python

Given an array of integers, I need to reduce it to a single number by repeatedly replacing any two numbers with their difference, to produce the maximum possible result.
Example 1 - For the array [0, -1, -1, -1], performing (0-(-1)), then (1-(-1)), then (2-(-1)) gives 3 as the maximum possible output.
Example 2 - For [3, 2, 1, 1], we can get a maximum output of 5: first (1-1), then (0-2), then (3-(-2)).
Can someone tell me how to solve this question?

The goal is to find the maximum result after iteratively subtracting two numbers in an array.
The problem is mainly at the algorithmic level.
It is clear that the final result is bounded by the sum of the absolute values:
Bound = sum_i abs(a[i])
According to the rule x - y, the signs may change frequently. The algorithm must focus on maximizing the sum of absolute remaining values.
The key point is to consider what happens when we decide to pair two numbers. If the signs of the numbers differ, the result has the maximum possible absolute value, and we can choose its sign. For example, from the pair (-5, 10) we can get 15 or -15. When pairing numbers of different signs, we just have to select the final sign so that it differs from the sign of another number, for example the neighbouring one.
The consequence is that if the signs are not all equal, we can arrange for the result to equal the bound. For example:
1 2 -3 4 5 -6 7 -> pair 2 and -3
1 -5 4 5 -6 7 -> pair 1 and -5
-6 4 5 -6 7
-10 5 -6 7
15 -6 7
-21 7
28 = Bound
If all the signs are equal, the bound cannot be attained. However, the loss can be minimized by using the element with the lowest absolute value in the first operation. For example:
3 4 1 2 2 -> pair 1 and 2
3 4 -1 2 -> pair -1 and 2
3 4 -3
3 -7
10 = Bound - 2*1
The same procedure can be used if all signs are negative. The point is that after this first "operation", we get a new bound, which can be attained thanks to the "sign rule".
In this situation (all signs equal), the result is equal to the bound minus twice the minimum absolute value.
Of course, one has to treat the n = 1 case separately: result = a[0].
From that, writing a program is rather easy.
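A minimal Python sketch of this rule (my own code, not Damien's). One refinement of mine: an array containing a zero can always attain the bound, because pairing with a zero lets us choose either sign, so zeros are not counted as "all signs equal" here.
def max_collapse_value(nums):
    if len(nums) == 1:
        return nums[0]                          # n = 1 treated separately
    abs_sum = sum(abs(x) for x in nums)         # the bound
    same_sign = all(x > 0 for x in nums) or all(x < 0 for x in nums)
    if same_sign:
        return abs_sum - 2 * min(abs(x) for x in nums)
    return abs_sum

print(max_collapse_value([0, -1, -1, -1]))   # 3
print(max_collapse_value([3, 2, 1, 1]))      # 5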

Consider an "anchor" an element we chose to string an arbitrary number of subtractions to.
Then clearly:
(Anchor1 - neg1 - neg2 - neg3...) - (Anchor2 - pos1 - pos2 - pos3...)
is optimal if
Anchor1 = max(array) and Anchor2 = min(array)
JavaScript code demonstrating the method by comparing it with brute force as well as Damien's great method (might not correctly handle arrays with all identical elements):
function bruteForce(A){
  if (A.length == 1)
    return A[0]

  let best = -Infinity

  for (let i=1; i<A.length; i++){
    for (let j=0; j<i; j++){
      let A1 = A.slice()
      let A2 = A.slice()
      A1[i] -= A1[j]
      A1.splice(j, 1)
      A2[j] -= A2[i]
      A2.splice(i, 1)
      best = Math.max(
        best, bruteForce(A1), bruteForce(A2))
    }
  }

  return best
}

function f(A){
  let max = [-Infinity, -1]
  let min = [Infinity, -1]

  A.map(function(x, i){
    if (x > max[0])
      max = [x, i]
    if (x < min[0])
      min = [x, i]
  })

  let restPos = 0
  let restNeg = 0

  A.map(function(x, i){
    if (i != max[1] && i != min[1]){
      if (x >= 0)
        restPos += x
      else
        restNeg += x
    }
  })

  return (max[0] - restNeg) - (min[0] - restPos)
}

function damien(A){
  let absSum = 0
  let minAbs = Infinity
  let allSignsEqual = true
  let firstSign = A[0] > 0

  A.map(function(x, i){
    absSum += Math.abs(x)
    minAbs = Math.min(minAbs, Math.abs(x))
    if ((x > 0) != firstSign)
      allSignsEqual = false
  })

  return allSignsEqual ? absSum - 2*minAbs : absSum
}

var n = 6
var m = 10

for (let i=0; i<100; i++){
  let A = []
  for (let j=0; j<n; j++)
    A.push(~~(Math.random()*m) * [1,-1][~~(Math.random()*2)]) // random magnitude with a random sign
  let a = bruteForce(A)
  let b = f(A)
  let c = damien(A)
  if (a != b || a != c)
    console.log(`Mismatch! ${a}, ${b}, ${c} :: ${A}`)
}
console.log("Done")

The other answers are fine, but here's another way to think about it:
If you expand the result into individual terms, you want all the positive numbers to end up as additive terms, and all the negative numbers to end up as subtractive terms.
If you have both signs available, then this is easy:
Subtract all but one of the positive numbers from a negative number
Subtract all of the negative numbers from the remaining positive number
If all your numbers have the same sign, then pick the one with the smallest absolute value and treat it as having the opposite sign in the above procedure. That works out to:
If you have only negative numbers, then subtract them all from the least negative one; or
If you have only positive numbers, then subtract all but one from the smallest, and then subtract the result from the remaining one.
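A rough Python sketch of this recipe (my own code, not the answerer's, only checked against the two examples from the question). It returns the final value together with the list of operations, each pair (a, b) meaning a - b; lumping zeros in with the positives is my assumption.
def collapse_max(nums):
    ops = []
    pos = sorted(x for x in nums if x >= 0)
    neg = sorted(x for x in nums if x < 0)
    if len(nums) == 1:
        return nums[0], ops
    if pos and neg:
        sink = neg.pop()                   # a negative that absorbs the extra positives
        keep = pos.pop()                   # the positive everything else is subtracted from
        for p in pos:                      # subtract all but one positive from the negative
            ops.append((sink, p))
            sink -= p
        neg.append(sink)
        for m in neg:                      # subtract every negative from the kept positive
            ops.append((keep, m))
            keep -= m
        return keep, ops
    if neg:                                # only negatives: subtract them all from the least negative
        keep = neg.pop()
        for m in neg:
            ops.append((keep, m))
            keep -= m
        return keep, ops
    small = pos.pop(0)                     # only positives: fold all but one into the smallest...
    for p in pos[:-1]:
        ops.append((small, p))
        small -= p
    last = pos[-1]
    ops.append((last, small))              # ...then subtract the result from the remaining one
    return last - small, ops

print(collapse_max([0, -1, -1, -1]))   # (3, [(0, -1), (1, -1), (2, -1)])
print(collapse_max([3, 2, 1, 1]))      # (5, [(1, 1), (0, 2), (3, -2)])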


Find minimum steps required to reach n

I am trying to solve a dynamic programming problem, which is as follows, but I am unable to solve it.
You are given a primitive calculator that can perform the following three operations with the current number x: multiply x by 2, multiply x by 3, or add 1 to x. Your goal: given a positive integer n, find the minimum number of operations needed to obtain the number n starting from the number 1.
I found a solution on Stack Overflow itself, but I am unable to understand what's going on.
I have heard that every DP problem can be solved by creating a matrix, which is what I was trying to do, but I don't know where I am going wrong. The table below shows the number of steps required to reach n from 1; initially I take the values as infinity.
i / j          0  1         2         3         4         5
plus 1         0  1         2         3         4         5
multiply by 2  0  infinity  2         infinity  3         infinity
multiply by 3  0  infinity  infinity  2         infinity  infinity
I am trying to solve this problem in Python.
Can someone please help me?
I found a solution, which is as follows, but I am not able to understand exactly what is going on:
import math

target = int(input())

def optVal(target, cache):
    result = [1] * cache[-1]  # 1
    for i in range(1, cache[-1]):  # 2
        result[-i] = target  # 3
        if cache[target-1] == cache[target] - 1:  # 4
            target -= 1
        elif target % 2 == 0 and (cache[target // 2] == cache[target] - 1):  # 5
            target //= 2
        else:  # 6 # target % 3 == 0 and (cache[target // 3] == cache[target] - 1):
            target //= 3
    return result

cache = [0] + [math.inf] * target  # 1
for i in range(1, len(cache)):  # 2
    temp1 = math.inf
    temp2 = math.inf
    temp3 = math.inf
    temp1 = cache[i - 1] + 1
    if i % 2 == 0:
        temp2 = cache[i // 2] + 1
    if i % 3 == 0:
        temp3 = cache[i // 3] + 1
    cache[i] = min(temp1, temp2, temp3)

print('Minimum operation: ', cache[target] - 1)
finalLst = optVal(target, cache)
print(' '.join([str(x) for x in finalLst]))
Input:
5
Output:
3
1 2 4 5
This algorithm is split in two parts. The first is in the main, the second is in the optVal function.
The first part builds the cache list, where cache[i] holds the minimum number of steps necessary to arrive from 0 to i applying, at each step, one of the three possible operations: +1, *2 or *3. This list is the 1-dimensional case of the matrix you read about.
When cache[i] is calculated, all indices lower than i have already been calculated. One can get to i in three possible ways, so at most three possible sources of i, i.e., elements of cache, need to be examined: i-1, i//2 and i//3 (i//2 only if i is even, and i//3 only if i is divisible by 3). These elements of cache are compared, and the content of the winner, incremented by 1 (because of the extra step to get to i), is stored in cache[i]. This process is bootstrapped by putting a 0 in cache[0]. In the end, cache[target] will contain the minimum number of steps to get to target starting from 0 (which is 1 more than the number of steps starting from 1, as the problem is stated; note that from 0 the only applicable operation is +1).
Now, if I had written the code, I probably would have stored the "parent" or the "winning operation" of each cache[i] together with the number of steps to get there. (By the way, those math.inf are not really needed, because there is always a finite number of steps to reach i, thanks to the +1 operation.) The author's approach is instead to infer this information from the contents of the (at most 3) possible parents of each cache[i]. In both cases, the chain of "ancestors" has to be reconstructed backwards, starting from cache[target], and this is what happens in optVal().
In optVal() the target is changed at each iteration (a bit confusingly), because at each iteration the info you have is the minimum number of steps needed to reach a certain target number. Knowing that, you look at the 1, 2 or 3 possible parents to check which one contains exactly that number of steps minus 1. The one that passes the test is the actual parent, and so you can continue building the chain backwards replacing target with the parent.
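For illustration, here is a small sketch (mine, not the answerer's code) of that "store the parent" variant: it keeps, for every value i, both the step count and the value it came from, so the sequence can be read off directly instead of being re-derived from the step counts.
import math

def min_steps_with_parents(target):
    steps = [math.inf] * (target + 1)
    parent = [None] * (target + 1)
    steps[1] = 0                                         # we start from 1, at zero steps
    for i in range(2, target + 1):
        candidates = [(steps[i - 1], i - 1)]             # "+1" came from i-1
        if i % 2 == 0:
            candidates.append((steps[i // 2], i // 2))   # "*2" came from i//2
        if i % 3 == 0:
            candidates.append((steps[i // 3], i // 3))   # "*3" came from i//3
        best_steps, best_parent = min(candidates)
        steps[i] = best_steps + 1
        parent[i] = best_parent
    sequence = []                                        # walk the parent chain backwards
    i = target
    while i is not None:
        sequence.append(i)
        i = parent[i]
    return steps[target], sequence[::-1]

print(min_steps_with_parents(5))   # (3, [1, 2, 4, 5])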
To solve this DP, you have to construct a table of the minimum number of steps required to reach n if one, two, or all three of the operations were available. You build it left to right, top to bottom, i.e. from 1 to n and from "add 1" down to "multiply by 3"; as you go down, more operations become available (a small Python sketch of this table construction follows the final table below).
A cell's value depends only on the value directly above it (if available) and on at most three values to its left. For example, the cell (n = 6, mul 3) depends only on (n = 6, mul 2) and on (n = 2, mul 3), (n = 3, mul 3), (n = 5, mul 3). You compare these values after applying the corresponding operation and keep the smallest, i.e. you compare (n = 2, mul 3) + 1, (n = 3, mul 3) + 1, (n = 5, mul 3) + 1 and (n = 6, mul 2), and store whichever is smallest.
Since n = 1 is given, the first column has all values equal to zero.
For n = 2, the values depend on those for n = 1: you can "add 1" or "multiply by 2" (one step either way), so this column has all values equal to 0 + 1 = 1.
For n = 3, the values depend on n = 1 (because 3/3 = 1) and on n = 2. If you can only "add 1" or "multiply by 2", you add 1 to n = 2, for a total of 1 + 1 = 2 steps. But if you can also multiply by 3, you need only 0 + 1 = 1 step; since 1 < 2, you store 1. So the entries for n = 3 are 2, 2, 1.
For n = 4, the values depend on n = 3 (add 1) and n = 2 (mul 2), so they are 3, 2, 2.
For n = 5, the values depend on n = 4 (add 1), so they are 4, 3, 3.
So the minimum number of steps to reach n = 5 is 3.
Final table:
       1  2  3  4  5
add 1  0  1  2  3  4
mul 2  0  1  2  2  3
mul 3  0  1  1  2  3
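A small Python sketch (mine, not the answerer's code) that builds this table row by row, exactly as described above: row 0 allows only "add 1", row 1 also allows "multiply by 2", and row 2 also allows "multiply by 3".
import math

def steps_table(n):
    rows = 3                                             # one row per operation level
    table = [[math.inf] * (n + 1) for _ in range(rows)]
    for r in range(rows):
        table[r][1] = 0                                  # reaching 1 costs nothing
        for j in range(2, n + 1):
            best = table[r][j - 1] + 1                   # "add 1" from the cell to the left
            if r >= 1 and j % 2 == 0:
                best = min(best, table[r][j // 2] + 1)   # "multiply by 2"
            if r >= 2 and j % 3 == 0:
                best = min(best, table[r][j // 3] + 1)   # "multiply by 3"
            if r > 0:
                best = min(best, table[r - 1][j])        # the cell directly above
            table[r][j] = best
    return table

table = steps_table(5)
for row in table:
    print(row[1:])   # [0, 1, 2, 3, 4], [0, 1, 2, 2, 3], [0, 1, 1, 2, 3]
print(table[-1][5])  # 3, matching the table above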
#include <bits/stdc++.h>
using namespace std;

int rec(vector<int> &dp, int n)
{
    if(n==1) return 0;
    if(dp[n]!=INT_MAX) return dp[n];
    return dp[n]=min({1+rec(dp,n-1),(n%2==0)?1+rec(dp,n/2):INT_MAX,(n%3==0)?1+rec(dp,n/3):INT_MAX});
}

string genseq(vector<int> &dp, int n){
    string res="";
    while(n>1)
    {
        res=to_string(n)+" "+res;
        if(dp[n-1]==(dp[n]-1)) n--;
        else if(n%2==0&&( dp[n/2]==dp[n]-1)) n/=2;
        else if(n%3==0&&( dp[n/3]==dp[n]-1)) n/=3;
    }
    return "1 "+res;
}

int main()
{
    int n;
    cin>>n;
    vector<int> dp(n+1,INT_MAX);
    dp[0]=0;
    dp[1]=0;
    std::cout << rec(dp,n) << std::endl;
    std::cout << genseq(dp,n) << std::endl;
    return 0;
}

Unable to understand the logic behind the solution [FrogRiverOne]

I'm unable to understand the logic behind the solution for Codility's FrogRiverOne: https://codility.com/demo/take-sample-test/frog_river_one
Task description
A small frog wants to get to the other side of a river. The frog is initially located on one bank of the river (position 0) and wants to get to the opposite bank (position X+1). Leaves fall from a tree onto the surface of the river.
You are given an array A consisting of N integers representing the falling leaves. A[K] represents the position where one leaf falls at time K, measured in seconds.
The goal is to find the earliest time when the frog can jump to the other side of the river. The frog can cross only when leaves appear at every position across the river from 1 to X (that is, we want to find the earliest moment when all the positions from 1 to X are covered by leaves). You may assume that the speed of the current in the river is negligibly small, i.e. the leaves do not change their positions once they fall in the river.
For example, you are given integer X = 5 and array A such that:
A[0] = 1
A[1] = 3
A[2] = 1
A[3] = 4
A[4] = 2
A[5] = 3
A[6] = 5
A[7] = 4
In second 6, a leaf falls into position 5. This is the earliest time when leaves appear in every position across the river.
Write a function:
def solution(X, A)
that, given a non-empty array A consisting of N integers and integer X, returns the earliest time when the frog can jump to the other side of the river.
If the frog is never able to jump to the other side of the river, the function should return −1.
For example, given X=5 and array A such that:
A[0] = 1
A[1] = 3
A[2] = 1
A[3] = 4
A[4] = 2
A[5] = 3
A[6] = 5
A[7] = 4
the function should return 6, as explained above.
Assume that:
N and X are integers within the range [1..100,000];
each element of array A is an integer within the range [1..X].
Complexity:
expected worst-case time complexity is O(N);
expected worst-case space complexity is O(X) (not counting the storage required for input arguments).
Solution:
Input arguments to the function: (2, [2, 2, 2, 2, 2]) and (5, [1, 3, 1, 4, 2, 3, 5, 4])
def solution(X,A):
    covered = 0
    covered_a = [-1]*X
    for index,element in enumerate(A):
        if covered_a[element-1] == -1:
            covered_a[element-1] = element
            covered += 1
            if covered == X:
                return index
    return -1
I want to understand the logic behind creating a boolean array and subtracting 1 element-wise from the input array A.
It's because you want to "flag" all the numbers you've seen so far. So it begins with all of them on 'False' because you haven't seen any of them yet.
I'm going to give you my Java solution to this question which scored 100%. The main strategy is to use java.util.Set to store all required integers for a full jump and a second java.util.Set to keep storing current leaves and to keep checking if the first set fully exists in the second set.
package com.codility.lesson04.countingelements;

import java.util.HashSet;
import java.util.Set;

public class FrogRiverOne {
  public int solution(int X, int[] A) {
    Set<Integer> requiredLeavesSet = new HashSet<Integer>();
    for(int i=1; i<=X; i++) {
      requiredLeavesSet.add(i);
    }

    Set<Integer> currentLeavesSet = new HashSet<Integer>();
    for(int p=0; p<A.length; p++) {
      currentLeavesSet.add(A[p]);
      //keep adding to current leaves set until it is at least the same size as required leaves set
      if(currentLeavesSet.size() < requiredLeavesSet.size()) continue;
      if(currentLeavesSet.containsAll(requiredLeavesSet)) {
        return p;
      }
    }

    return -1;
  }
}
You can find the code and unit tests for this problem here and an entire list of Codility solutions with explanations of the strategies here.
This is the best solution that I came up with and is very easy to understand. It gives O(n) time complexity.
def solution(X, A):
    positions = set()
    seconds = 0
    for i in range(0, len(A)):
        if A[i] not in positions and A[i] <= X:
            positions.add(A[i])
            seconds = i
            if len(positions) == X:
                return seconds
    return -1

Shuffling a list with maximum distance travelled [duplicate]

I have tried to ask this question before, but have never been able to word it correctly. I hope I have it right this time:
I have a list of unique elements. I want to shuffle this list to produce a new list. However, I would like to constrain the shuffle, such that each element's new position is at most d away from its original position in the list.
So for example:
L = [1,2,3,4]
d = 2
answer = magicFunction(L, d)
Now, one possible outcome could be:
>>> print(answer)
[3,1,2,4]
Notice that 3 has moved two indices, 1 and 2 have moved one index, and 4 has not moved at all. Thus, this is a valid shuffle, per my previous definition. The following snippet of code can be used to validate this:
old = {e:i for i,e in enumerate(L)}
new = {e:i for i,e in enumerate(answer)}
valid = all(abs(i-new[e])<=d for e,i in old.items())
Now, I could easily just generate all possible permutations of L, filter for the valid ones, and pick one at random. But that doesn't seem very elegant. Does anyone have any other ideas about how to accomplish this?
This is going to be long and dry.
I have a solution that produces a uniform distribution. It requires O(len(L) * d**d) time and space for precomputation, then performs shuffles in O(len(L)*d) time [1]. If a uniform distribution is not required, the precomputation is unnecessary, and the shuffle time can be reduced to O(len(L)) due to faster random choices; I have not implemented the non-uniform distribution. Both steps of this algorithm are substantially faster than brute force, but they're still not as good as I'd like them to be. Also, while the concept should work, I have not tested my implementation as thoroughly as I'd like.
Suppose we iterate over L from the front, choosing a position for each element as we come to it. Define the lag as the distance between the next element to place and the first unfilled position. Every time we place an element, the lag grows by at most one, since the index of the next element is now one higher, but the index of the first unfilled position cannot become lower.
Whenever the lag is d, we are forced to place the next element in the first unfilled position, even though there may be other empty spots within a distance of d. If we do so, the lag cannot grow beyond d, we will always have a spot to put each element, and we will generate a valid shuffle of the list. Thus, we have a general idea of how to generate shuffles; however, if we make our choices uniformly at random, the overall distribution will not be uniform. For example, with len(L) == 3 and d == 1, there are 3 possible shuffles (one for each position of the middle element), but if we choose the position of the first element uniformly, one shuffle becomes twice as likely as either of the others.
If we want a uniform distribution over valid shuffles, we need to make a weighted random choice for the position of each element, where the weight of a position is based on the number of possible shuffles if we choose that position. Done naively, this would require us to generate all possible shuffles to count them, which would take O(d**len(L)) time. However, the number of possible shuffles remaining after any step of the algorithm depends only on which spots we've filled, not what order they were filled in. For any pattern of filled or unfilled spots, the number of possible shuffles is the sum of the number of possible shuffles for each possible placement of the next element. At any step, there are at most d possible positions to place the next element, and there are O(d**d) possible patterns of unfilled spots (since any spot further than d behind the current element must be full, and any spot d or further ahead must be empty). We can use this to generate a Markov chain of size O(len(L) * d**d), taking O(len(L) * d**d) time to do so, and then use this Markov chain to perform shuffles in O(len(L)*d) time.
Example code (currently not quite O(len(L)*d) due to inefficient Markov chain representation):
import random

# states are (k, filled_spots) tuples, where k is the index of the next
# element to place, and filled_spots is a tuple of booleans
# of length 2*d, representing whether each index from k-d to
# k+d-1 has an element in it. We pretend indices outside the array are
# full, for ease of representation.

def _successors(n, d, state):
    '''Yield all legal next filled_spots and the move that takes you there.
    Doesn't handle k=n.'''
    k, filled_spots = state
    next_k = k+1
    # If k+d is a valid index, this represents the empty spot there.
    possible_next_spot = (False,) if k + d < n else (True,)
    if not filled_spots[0]:
        # Must use that position.
        yield k-d, filled_spots[1:] + possible_next_spot
    else:
        # Can fill any empty spot within a distance d.
        shifted_filled_spots = list(filled_spots[1:] + possible_next_spot)
        for i, filled in enumerate(shifted_filled_spots):
            if not filled:
                successor_state = shifted_filled_spots[:]
                successor_state[i] = True
                yield next_k-d+i, tuple(successor_state)
                # next_k instead of k in that index computation, because
                # i is indexing relative to shifted_filled_spots instead
                # of filled_spots

def _markov_chain(n, d):
    '''Precompute a table of weights for generating shuffles.
    _markov_chain(n, d) produces a table that can be fed to
    _distance_limited_shuffle to permute lists of length n in such a way that
    no list element moves a distance of more than d from its initial spot,
    and all permutations satisfying this condition are equally likely.
    This is expensive.
    '''
    if d >= n - 1:
        # We don't need the table, and generating a table for d >= n
        # complicates the indexing a bit. It's too complicated already.
        return None
    table = {}
    termination_state = (n, (d*2 * (True,)))
    table[termination_state] = 1
    def possible_shuffles(state):
        try:
            return table[state]
        except KeyError:
            k, _ = state
            count = table[state] = sum(
                possible_shuffles((k+1, next_filled_spots))
                for (_, next_filled_spots) in _successors(n, d, state)
            )
            return count
    initial_state = (0, (d*(True,) + d*(False,)))
    possible_shuffles(initial_state)
    return table

def _distance_limited_shuffle(l, d, table):
    # Generate an index into the set of all permutations, then use the
    # markov chain to efficiently find which permutation we picked.
    n = len(l)
    if d >= n - 1:
        random.shuffle(l)
        return
    permutation = [None]*n
    state = (0, (d*(True,) + d*(False,)))
    permutations_to_skip = random.randrange(table[state])
    for i, item in enumerate(l):
        for placement_index, new_filled_spots in _successors(n, d, state):
            new_state = (i+1, new_filled_spots)
            if table[new_state] <= permutations_to_skip:
                permutations_to_skip -= table[new_state]
            else:
                state = new_state
                permutation[placement_index] = item
                break
    return permutation

class Shuffler(object):
    def __init__(self, n, d):
        self.n = n
        self.d = d
        self.table = _markov_chain(n, d)

    def shuffled(self, l):
        if len(l) != self.n:
            raise ValueError('Wrong input size')
        return _distance_limited_shuffle(l, self.d, self.table)

    __call__ = shuffled
[1] We could use a tree-based weighted random choice algorithm to improve the shuffle time to O(len(L)*log(d)), but since the table becomes so huge for even moderately large d, this doesn't seem worthwhile. Also, the factors of d**d in the bounds are overestimates, but the actual factors are still at least exponential in d.
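A quick usage sketch of the Shuffler class above (my addition, not part of the original answer): build one table for lists of length 8 with d = 2, then reuse it for several shuffles.
shuffler = Shuffler(8, 2)
for _ in range(3):
    print(shuffler.shuffled(list(range(8))))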
In short, the list that should be shuffled gets ordered by the sum of index and a random number.
import random
xs = range(20) # list that should be shuffled
d = 5 # distance
[x for i,x in sorted(enumerate(xs), key= lambda (i,x): i+(d+1)*random.random())]
Out:
[1, 4, 3, 0, 2, 6, 7, 5, 8, 9, 10, 11, 12, 14, 13, 15, 19, 16, 18, 17]
That's basically it. But this looks a little bit overwhelming, therefore...
The algorithm in more detail
To understand this better, consider this alternative implementation of an ordinary, random shuffle:
import random
sorted(range(10), key = lambda x: random.random())
Out:
[2, 6, 5, 0, 9, 1, 3, 8, 7, 4]
In order to constrain the distance, we have to implement an alternative sort key function that depends on the index of an element. The function sort_criterion is responsible for that.
import random

def exclusive_uniform(a, b):
    "returns a random value in the interval [a, b)"
    return a+(b-a)*random.random()

def distance_constrained_shuffle(sequence, distance,
                                 randmoveforward=exclusive_uniform):
    def sort_criterion(enumerate_tuple):
        """
        returns the index plus a random offset,
        such that the result can overtake at most 'distance' elements
        """
        indx, value = enumerate_tuple
        return indx + randmoveforward(0, distance+1)

    # get enumerated, shuffled list
    enumerated_result = sorted(enumerate(sequence), key=sort_criterion)
    # remove enumeration
    result = [x for i, x in enumerated_result]
    return result
With the argument randmoveforward you can pass a random number generator with a different probability density function (pdf) to modify the distance distribution.
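For example (my addition, not from the original answer), passing a generator biased towards larger offsets changes how far elements tend to move:
def biased_forward(a, b):
    "returns a value in [a, b), biased towards b"
    return a + (b - a) * random.random() ** 0.5

biased_result = distance_constrained_shuffle(range(20), 5,
                                             randmoveforward=biased_forward)
print(biased_result)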
The remainder is testing and evaluation of the distance distribution.
Test function
Here is an implementation of the test function. The validate function is actually taken from the OP, but I removed the creation of one of the dictionaries for performance reasons.
def test(num_cases = 10, distance = 3, sequence = range(1000)):
    def validate(d, lst, answer):
        #old = {e:i for i,e in enumerate(lst)}
        new = {e:i for i,e in enumerate(answer)}
        return all(abs(i-new[e])<=d for i,e in enumerate(lst))
        #return all(abs(i-new[e])<=d for e,i in old.iteritems())

    for _ in range(num_cases):
        result = distance_constrained_shuffle(sequence, distance)
        if not validate(distance, sequence, result):
            print "Constraint violated. ", result
            break
    else:
        print "No constraint violations"

test()
Out:
No constraint violations
Distance distribution
I am not sure whether there is a way to make the distance uniform distributed, but here is a function to validate the distribution.
def distance_distribution(maxdistance = 3, sequence = range(3000)):
    from collections import Counter

    def count_distances(lst, answer):
        new = {e:i for i,e in enumerate(answer)}
        return Counter(i-new[e] for i,e in enumerate(lst))

    answer = distance_constrained_shuffle(sequence, maxdistance)
    counter = count_distances(sequence, answer)
    sequence_length = float(len(sequence))
    distances = range(-maxdistance, maxdistance+1)
    return distances, [counter[d]/sequence_length for d in distances]

distance_distribution()
Out:
([-3, -2, -1, 0, 1, 2, 3],
[0.01,
0.076,
0.22166666666666668,
0.379,
0.22933333333333333,
0.07766666666666666,
0.006333333333333333])
Or for a case with greater maximum distance:
distance_distribution(maxdistance=9, sequence=range(100*1000))
This is a very difficult problem, but it turns out there is a solution in the academic literature, in an influential paper by Mark Jerrum, Alistair Sinclair, and Eric Vigoda, A Polynomial-Time Approximation Algorithm for the Permanent of a Matrix with Nonnegative Entries, Journal of the ACM, Vol. 51, No. 4, July 2004, pp. 671–697. http://www.cc.gatech.edu/~vigoda/Permanent.pdf.
Here is the general idea: first write down two copies of the numbers in the array that you want to permute. Say
1 1
2 2
3 3
4 4
Now connect a node on the left to a node on the right if mapping from the number on the left to the position on the right is allowed by the restrictions in place. So if d=1 then 1 on the left connects to 1 and 2 on the right, 2 on the left connects to 1, 2, 3 on the right, 3 on the left connects to 2, 3, 4 on the right, and 4 on the left connects to 3, 4 on the right.
1 - 1
  X
2 - 2
  X
3 - 3
  X
4 - 4
The resulting graph is bipartite. A valid permutation corresponds to a perfect matching in the bipartite graph. A perfect matching, if it exists, can be found in O(VE) time (or somewhat better, with more advanced algorithms).
Now the problem becomes one of generating a uniformly distributed random perfect matching. I believe that can be done, approximately anyway. Uniformity of the distribution is the really hard part.
What does this have to do with permanents? Consider a matrix representation of our bipartite graph, where a 1 means an edge and a 0 means no edge:
1 1 0 0
1 1 1 0
0 1 1 1
0 0 1 1
The permanent of the matrix is like the determinant, except there are no negative signs in the definition. So we take exactly one element from each row and column, multiply them together, and add up over all choices of row and column. The terms of the permanent correspond to permutations; the term is 0 if any factor is 0, in other words if the permutation is not valid according to the matrix/bipartite graph representation; the term is 1 if all factors are 1, in other words if the permutation is valid according to the restrictions. In summary, the permanent of the matrix counts all permutations satisfying the restriction represented by the matrix/bipartite graph.
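As a small illustration (my own helper, not from the paper or the answer), the 0-1 matrix above can be generated directly from n and d, since M[i][j] = 1 exactly when abs(i - j) <= d:
def restriction_matrix(n, d):
    return [[1 if abs(i - j) <= d else 0 for j in range(n)] for i in range(n)]

for row in restriction_matrix(4, 1):
    print(row)
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [0, 1, 1, 1]
# [0, 0, 1, 1]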
It turns out that unlike calculating determinants, which can be accomplished in O(n^3) time, calculating permanents is #P-complete, so finding an exact answer is not feasible in general. However, if we can estimate the number of valid permutations, we can estimate the permanent. Jerrum et al. approached the problem of counting valid permutations by generating valid permutations uniformly (within a certain error, which can be controlled); an estimate of the value of the permanent can be obtained by a fairly elaborate procedure (section 5 of the paper referenced), but we don't need that to answer the question at hand.
The running time of Jerrum's algorithm to calculate the permanent is O(n^11) (ignoring logarithmic factors). I can't immediately tell from the paper the running time of the part of the algorithm that uniformly generates bipartite matchings, but it appears to be over O(n^9). However, another paper reduces the running time for the permanent to O(n^7): http://www.cc.gatech.edu/fac/vigoda/FasterPermanent_SODA.pdf; in that paper they claim that it is now possible to get a good estimate of a permanent of a 100x100 0-1 matrix. So it should be possible to (almost) uniformly generate restricted permutations for lists of 100 elements.
There may be further improvements, but I got tired of looking.
If you want an implementation, I would start with the O(n^11) version in Jerrum's paper, and then take a look at the improvements if the original algorithm is not fast enough.
There is pseudo-code in Jerrum's paper, but I haven't tried it so I can't say how far the pseudo-code is from an actual implementation. My feeling is it isn't too far. Maybe I'll give it a try if there's interest.
I am not sure how good it is, but maybe something like:
create a list of the same length as the initial list L; each element of this list should be a list of the initial indices allowed to be moved here; for instance [[0,1,2],[0,1,2,3],[0,1,2,3],[1,2,3]] if I understand your example correctly;
take the smallest sublist (or any of the smallest sublists if several share the same length);
pick a random element in it with random.choice; this element is the index of the element in the initial list to be mapped to the current location (use another list for building your new list);
remove the randomly chosen element from all sublists.
For instance:
L = [ "A", "B", "C", "D" ]
i = [[0,1,2],[0,1,2,3],[0,1,2,3],[1,2,3]]
# I take [0,1,2] and pick randomly 1 inside
# I remove the value '1' from all sublists and since
# the first sublist has already been handled I set it to None
# (and my result will look as [ "B", None, None, None ]
i = [None,[0,2,3],[0,2,3],[2,3]]
# I take the last sublist and pick randomly 3 inside
# result will be ["B", None, None, "D" ]
i = [None,[0,2], [0,2], None]
etc.
I haven't tried it however. Regards.
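Here is a rough Python sketch of this procedure (mine; the answerer did not post code). It can dead-end and return None, in which case a caller could simply retry.
import random

def magic_function(L, d):
    n = len(L)
    # allowed[j]: indices of L whose element may still be placed at position j
    allowed = [set(range(max(0, j - d), min(n, j + d + 1))) for j in range(n)]
    result = [None] * n
    remaining = set(range(n))
    while remaining:
        j = min(remaining, key=lambda p: len(allowed[p]))  # smallest sublist first
        if not allowed[j]:
            return None                                    # no candidate left: dead end
        i = random.choice(sorted(allowed[j]))              # pick a source index at random
        result[j] = L[i]
        remaining.discard(j)
        for p in remaining:                                # remove i from every other sublist
            allowed[p].discard(i)
    return result

print(magic_function(["A", "B", "C", "D"], 2))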
My idea is to generate permutations by moving at most d steps by generating d random permutations which move at most 1 step and chaining them together.
We can generate permutations which move at most 1 step quickly by the following recursive procedure: consider a permutation of {1,2,3,...,n}. The last item, n, can move either 0 or 1 place. If it moves 0 places, n is fixed, and we have reduced the problem to generating a permutation of {1,2,...,n-1} in which every item moves at most one place.
On the other hand, if n moves 1 place, it must occupy position n-1. Then n-1 must occupy position n (if any smaller number occupies position n, it will have moved by more than 1 place). In other words, we must have a swap of n and n-1, and after swapping we have reduced the problem to finding such a permutation of the remainder of the array {1,...,n-2}.
Such permutations can be constructed in O(n) time, clearly.
Those two choices should be selected with weighted probabilities. Since I don't know the weights (though I have a theory, see below) maybe the choice should be 50-50 ... but see below.
A more accurate estimate of the weights might be as follows: note that the number of such permutations follows a recursion that is the same as the Fibonacci sequence: f(n) = f(n-1) + f(n-2). We have f(1) = 1 and f(2) = 2 ({1,2} goes to {1,2} or {2,1}), so the numbers really are the Fibonacci numbers. So my guess for the probability of choosing n fixed vs. swapping n and n-1 would be f(n-1)/f(n) vs. f(n-2)/f(n). Since the ratio of consecutive Fibonacci numbers quickly approaches the Golden Ratio, a reasonable approximation to the probabilities is to leave n fixed 61% of the time and swap n and n-1 39% of the time.
To construct permutations where items move at most d places, we just repeat the process d times. The running time is O(nd).
Here is an outline of an algorithm.
arr = {1,2,...,n};
for (i = 0; i < d; i++) {
  j = n-1;
  while (j > 0) {
    u = random uniform in interval (0,1)
    if (u < 0.61) { // related to golden ratio phi; more decimals may help
      j -= 1;
    } else {
      swap items at positions j and j-1 of arr // 0-based indexing
      j -= 2;
    }
  }
}
Since each pass moves items at most 1 place from their start, d passes will move items at most d places. The only question is the uniform distribution of the permutations. It would probably be a long proof, if it's even true, so I suggest assembling empirical evidence for various n's and d's. Probably, to prove the statement, we would have to switch from the golden ratio approximation to the exact ratio f(n-1)/f(n) in place of 0.61.
There might even be some weird reason why some permutations might be missed by this procedure, but I'm pretty sure that doesn't happen. Just in case, though, it would be helpful to have a complete inventory of such permutations for some values of n and d to check the correctness of my proposed algorithm.
Update
I found an off-by-one error in my "pseudocode", and I corrected it. Then I implemented in Java to get a sense of the distribution. Code is below. The distribution is far from uniform, I think because there are many ways of getting restricted permutations with short max distances (move forward, move back vs. move back, move forward, for example) but few ways of getting long distances (move forward, move forward). I can't think of a way to fix the uniformity issue with this method.
import java.util.Random;
import java.util.Map;
import java.util.TreeMap;

class RestrictedPermutations {
  private static Random rng = new Random();

  public static void rPermute(Integer[] a, int d) {
    for (int i = 0; i < d; i++) {
      int j = a.length-1;
      while (j > 0) {
        double u = rng.nextDouble();
        if (u < 0.61) { // related to golden ratio phi; more decimals may help
          j -= 1;
        } else {
          int t = a[j];
          a[j] = a[j-1];
          a[j-1] = t;
          j -= 2;
        }
      }
    }
  }

  public static void main(String[] args) {
    int numTests = Integer.parseInt(args[0]);
    int d = 2;
    Map<String,Integer> count = new TreeMap<String,Integer>();
    for (int t = 0; t < numTests; t++) {
      Integer[] a = {1,2,3,4,5};
      rPermute(a,d);
      // convert a to String for storage in Map
      String s = "(";
      for (int i = 0; i < a.length-1; i++) {
        s += a[i] + ",";
      }
      s += a[a.length-1] + ")";
      int c = count.containsKey(s) ? count.get(s) : 0;
      count.put(s,c+1);
    }
    for (String k : count.keySet()) {
      System.out.println(k + ": " + count.get(k));
    }
  }
}
Here are two sketches in Python; one swap-based, the other non-swap-based. In the first, the idea is to keep track of where the indexes have moved and test if the next swap would be valid. An additional variable is added for the number of swaps to make.
from random import randint

def swap(a,b,L):
    L[a], L[b] = L[b], L[a]

def magicFunction(L,d,numSwaps):
    n = len(L)
    new = list(range(0,n))
    for i in xrange(0,numSwaps):
        x = randint(0,n-1)
        y = randint(max(0,x - d),min(n - 1,x + d))
        while abs(new[x] - y) > d or abs(new[y] - x) > d:
            y = randint(max(0,x - d),min(n - 1,x + d))
        swap(x,y,new)
        swap(x,y,L)
    return L

print(magicFunction([1,2,3,4],2,3))           # [2, 1, 4, 3]
print(magicFunction([1,2,3,4,5,6,7,8,9],2,4)) # [2, 3, 1, 5, 4, 6, 8, 7, 9]
Using print(collections.Counter(tuple(magicFunction([0, 1, 2], 1, 1)) for i in xrange(1000))) we find that the identity permutation comes up heavy with this code (the reason why is left as an exercise for the reader).
Alternatively, we can think about it as looking for a permutation matrix with interval restrictions, where M(i,j) may be 1 only if abs(i - j) <= d. We can construct a one-off random path by picking a random j for each row from those still available. The x's in the following example represent matrix cells that would invalidate the solution (a northwest-to-southeast diagonal would represent the identity permutation), and restrictions represents how many i's are still available for each j. (Adapted from my previous version to choose both the next i and the next j randomly, inspired by user2357112's answer):
n = 5, d = 2
Start:
0 0 0 x x
0 0 0 0 x
0 0 0 0 0
x 0 0 0 0
x x 0 0 0
restrictions = [3,4,5,4,3] # how many i's are still available for each j
1.
0 0 1 x x # random choice
0 0 0 0 x
0 0 0 0 0
x 0 0 0 0
x x 0 0 0
restrictions = [2,3,0,4,3] # update restrictions in the neighborhood of (i ± d)
2.
0 0 1 x x
0 0 0 0 x
0 0 0 0 0
x 0 0 0 0
x x 0 1 0 # random choice
restrictions = [2,3,0,0,2] # update restrictions in the neighborhood of (i ± d)
3.
0 0 1 x x
0 0 0 0 x
0 1 0 0 0 # random choice
x 0 0 0 0
x x 0 1 0
restrictions = [1,0,0,0,2] # update restrictions in the neighborhood of (i ± d)
only one choice for j = 0 so it must be chosen
4.
0 0 1 x x
1 0 0 0 x # dictated choice
0 1 0 0 0
x 0 0 0 0
x x 0 1 0
restrictions = [0,0,0,0,2] # update restrictions in the neighborhood of (i ± d)
Solution:
0 0 1 x x
1 0 0 0 x
0 1 0 0 0
x 0 0 0 1 # dictated choice
x x 0 1 0
[2,0,1,4,3]
Python code (adapted from my previous version to choose both the next i and the next j randomly, inspired by user2357112's answer):
from random import randint,choice
import collections

def magicFunction(L,d):
    n = len(L)
    restrictions = [None] * n
    restrict = -1
    solution = [None] * n
    for i in xrange(0,n):
        restrictions[i] = abs(max(0,i - d) - min(n - 1,i + d)) + 1

    while True:
        availableIs = filter(lambda x: solution[x] == None,[i for i in xrange(n)]) if restrict == -1 else filter(lambda x: solution[x] == None,[j for j in xrange(max(0,restrict - d),min(n,restrict + d + 1))])

        if not availableIs:
            L = [L[i] for i in solution]
            return L

        i = choice(availableIs)
        availableJs = filter(lambda x: restrictions[x] != 0,[j for j in xrange(max(0,i - d),min(n,i + d + 1))])
        nextJ = restrict if restrict != -1 else choice(availableJs)
        restrict = -1
        solution[i] = nextJ
        restrictions[nextJ] = 0

        for j in xrange(max(0,i - d),min(n,i + d + 1)):
            if j == nextJ or restrictions[j] == 0:
                continue
            restrictions[j] = restrictions[j] - 1
            if restrictions[j] == 1:
                restrict = j

print(collections.Counter(tuple(magicFunction([0, 1, 2], 1)) for i in xrange(1000)))
Using print(collections.Counter(tuple(magicFunction([0, 1, 2], 1)) for i in xrange(1000))) we find that the identity permutation comes up light with this code (why is left as an exercise for the reader).
Here's an adaptation of @גלעד ברקן's code that takes only one pass through the list (in random order) and swaps only once (using a random choice of possible positions):
from random import choice, shuffle

def magicFunction(L, d):
    n = len(L)
    swapped = [0] * n             # 0: position not swapped, 1: position was swapped
    positions = list(xrange(0,n)) # list of positions: 0..n-1
    shuffle(positions)            # randomize positions
    for x in positions:
        if swapped[x]:            # only swap an item once
            continue
        # find all possible positions to swap
        possible = [i for i in xrange(max(0, x - d), min(n, x + d)) if not swapped[i]]
        if not possible:
            continue
        y = choice(possible)      # choose another possible position at random
        if x != y:
            L[y], L[x] = L[x], L[y]      # swap with that position
            swapped[x] = swapped[y] = 1  # mark both positions as swapped
    return L
Here is a refinement of the above code that simply finds all possible adjacent positions and chooses one:
from random import choice

def magicFunction(L, d):
    n = len(L)
    positions = list(xrange(0, n))  # list of positions: 0..n-1
    for x in xrange(0, n):
        # find all possible positions to swap
        possible = [i for i in xrange(max(0, x - d), min(n, x + d)) if abs(positions[i] - x) <= d]
        if not possible:
            continue
        y = choice(possible)        # choose another possible position at random
        if x != y:
            L[y], L[x] = L[x], L[y] # swap with that position
            positions[x] = y
            positions[y] = x
    return L

MaxDoubleSliceSum Algorithm

I'm trying to solve the problem of finding the MaxDoubleSliceSum value. Simply put, it's the maximum sum of any slice minus one element within that slice (you have to drop one element, and the first and last elements of the slice are also excluded). So, technically, the first and last elements of the array cannot be included in any slice sum.
Here's the full description:
A non-empty zero-indexed array A consisting of N integers is given.
A triplet (X, Y, Z), such that 0 ≤ X < Y < Z < N, is called a double slice.
The sum of double slice (X, Y, Z) is the total of A[X + 1] + A[X + 2] + ... + A[Y − 1] + A[Y + 1] + A[Y + 2] + ... + A[Z − 1].
For example, array A such that:
A[0] = 3
A[1] = 2
A[2] = 6
A[3] = -1
A[4] = 4
A[5] = 5
A[6] = -1
A[7] = 2
contains the following example double slices:
double slice (0, 3, 6), sum is 2 + 6 + 4 + 5 = 17,
double slice (0, 3, 7), sum is 2 + 6 + 4 + 5 − 1 = 16,
double slice (3, 4, 5), sum is 0.
The goal is to find the maximal sum of any double slice.
Write a function:
def solution(A)
that, given a non-empty zero-indexed array A consisting of N integers, returns the maximal sum of any double slice.
For example, given:
A[0] = 3
A[1] = 2
A[2] = 6
A[3] = -1
A[4] = 4
A[5] = 5
A[6] = -1
A[7] = 2
the function should return 17, because no double slice of array A has a sum of greater than 17.
Assume that:
N is an integer within the range [3..100,000];
each element of array A is an integer within the range [−10,000..10,000].
Complexity:
expected worst-case time complexity is O(N);
expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.
Here's my try:
def solution(A):
    if len(A) <= 3:
        return 0
    max_slice = 0
    minimum = A[1]              # assume the first element is the minimum
    max_end = -A[1]             # and drop it from the slice
    for i in xrange(1, len(A)-1):
        if A[i] < minimum:      # a new minimum found
            max_end += minimum  # put back the false minimum
            minimum = A[i]      # assign the new minimum to minimum
            max_end -= minimum  # drop the new minimum out of the slice
        max_end = max(0, max_end + A[i])
        max_slice = max(max_slice, max_end)
    return max_slice
What makes me think that this may be approaching the correct solution, but that some corner cases aren't covered, is that 9 out of 14 test cases pass correctly (https://codility.com/demo/results/demoAW7WPN-PCV/).
I know that this can be solved by applying Kadane's algorithm forward and backward, but I'd really appreciate it if someone could point out what's missing here.
Python solution O(N)
This should be solved using Kadane’s algorithm from two directions.
ref:
Python Codility Solution
C++ solution - YouTube tutorial
JAVA solution
def compute_sum(start, end, step, A):
    res_arr = [0]
    res = 0
    for i in range(start, end, step):
        res = res + A[i]
        if res < 0:
            res_arr.append(0)
            res = 0
            continue
        res_arr.append(res)
    return res_arr

def solution(A):
    if len(A) < 3:
        return 0
    arr = []
    left_arr = compute_sum(1, len(A)-1, 1, A)
    right_arr = compute_sum(len(A)-2, 0, -1, A)
    k = 0
    for i in range(len(left_arr)-2, -1, -1):
        arr.append(left_arr[i] + right_arr[k])
        k = k + 1
    return max(arr)
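As a quick sanity check (my addition), running this on the example array from the problem statement and on a tricky case mentioned elsewhere in this thread:
print(solution([3, 2, 6, -1, 4, 5, -1, 2]))        # 17
print(solution([-20, -10, 10, -70, 20, 30, -30]))  # 60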
This is just how I'd write the algorithm.
Assume a start index of X = 0, then iteratively sum the squares to the right.
Keep track of the index of the lowest int as you count, and subtract that lowest int from the sum when you use it. This effectively lets you place your Y.
Keep track of the max sum, and the X, Y, Z values for that sum.
If the sum ever turns negative, save the max sum as your result, so long as it is greater than the previous result.
Choose a new X: you should start looking after Y, and subtract one from whatever index you find. Then repeat the previous steps, and do this until you have reached the end of the list.
How might this be an improvement?
Potential problem case for your code: [7, 2, 4, -18, -14, 20, 22]
-18 and -14 separate the array into two segments. The sum of the first segment is 7+2+4 = 13; the sum of the second segment is just 20. The above algorithm handles this case; yours might as well, but I'm bad at Python (sorry), so I can't tell.
EDIT (error and solution): It appears my original answer brings nothing new to what I thought was the problem, but I checked the errors and found that the actual error occurs here: [-20, -10, 10, -70, 20, 30, -30] will not be handled correctly. It will exclude the positive 10, so it returns 50 instead of 60.
It appears the asker's code doesn't correctly identify the new starting position (my method for this is shown in case 4); it's important that you restart the iterations at Y instead of Z, because Y effectively deletes the lowest number, which is possibly the Z that fails the test.

Improving runtime on Euler #10

So I was attacking a Project Euler problem that seemed pretty simple on a small scale, but as soon as I bump it up to the number I'm supposed to use, the code takes forever to run. This is the question:
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
I did it in Python. I could wait a few hours for the code to run, but I'd rather find a more efficient way to go about this. Here's my code in Python:
x = 1;
total = 0;
while x <= 2000000:
    y = 1;
    z = 0;
    while x >= y:
        if x % y == 0:
            z += 1;
        y += 1;
    if z == 2:
        total += x
    x += 1;
print total;
As mentioned in the comments, implementing the Sieve of Eratosthenes would be a far better choice. It takes up O(n) extra space, which is an array of length ~2 million in this case. It also runs in roughly O(n log log n) time, which is astronomically faster than your implementation, which runs in O(n²).
I originally wrote this in JavaScript, so bear with my Python:
max = 2000000 # we only need to check the first 2 million numbers
numbers = []
sum = 0

for i in range(2, max): # 0 and 1 are not primes
    numbers.append(i)   # fill our blank list

for p in range(2, max):
    if numbers[p - 2] != -1: # if p (our array starts at 2, not 0) is not -1
        # it is prime, so add it to our sum
        sum += numbers[p - 2]
        # now, we need to mark every multiple of p as composite, starting at 2p
        c = 2 * p
        while c < max:
            # we'll mark composite numbers as -1
            numbers[c - 2] = -1
            # increment the count to 3p, 4p, 5p, ... np
            c += p

print(sum)
The only confusing part here might be why I used numbers[p - 2]. That's because I skipped 0 and 1, meaning 2 is at index 0. In other words, everything's shifted to the side by 2 indices.
Clearly the long pole in this tent is computing the list of primes in the first place. For an artificial situation like this you could get someone else's list (say, this one), parse it, and add up the numbers in seconds.
But that's unsporting, in my view. In that case, try the Sieve of Atkin as noted in this SO answer.
