How to insert in C++ like in Python? - python

I want to implement insert in C++ like this:
# Python code
insertIndexes = [1, 1, 2, 2, 3, 3, 5]
arr = []
toInsertValue = 0
for i in insertIndexes:
    arr.insert(i, toInsertValue)
    toInsertValue += 1
print(arr)  # [0, 1, 3, 5, 4, 6, 2]
but I find that I have to know the vector size if I want to use insert in C++:
// !!C++ wrong code!!
// vec is not initialized correctly
vector<int> vec;
int insertIndexes[] = {1, 1, 2, 2, 3, 3, 5};
int toInsertValue = 0;
for (int i = 0; i < sizeof(insertIndexes)/sizeof(insertIndexes[0]); i++) {
    vec.insert(vec.begin() + insertIndexes[i], toInsertValue);
    toInsertValue += 1;
}

In Python, inserting at an index beyond the end of the list is very forgiving: the implementation checks whether the insert position is greater than or equal to len(list), and if so the new item is simply appended. With C++'s std::vector this is not so; you have to make that check yourself.
auto offset = 0;
for (auto x : indexes) {
    if (x < vec.size())  // Is the selected index in range?
        vec.insert(vec.begin() + x, offset++);
    else
        vec.insert(vec.end(), offset++);
}
Full example:
std::vector<int> indexes = {1, 1, 2, 2, 3, 3, 5};
std::vector<int> vec;
auto offset = 0;
for (auto x : indexes) {
    auto iter = (x < int(vec.size())) ? vec.begin() + x : vec.end();
    vec.insert(iter, offset++);
}
std::copy(vec.begin(), vec.end(), std::ostream_iterator<int>(std::cout, " "));
Output (as seen live on Coliru):
0 1 3 5 4 6 2

When you define a vector without a specific size, it will be empty, and any indexing into it (with or without iterators) will be out of bounds, leading to undefined behavior.
In your loop you need to check that indexes[i] will not be out of bounds; if it would be, either resize the vector appropriately or use push_back to append the value offset to the vector.

Related

Maximum sum of subsequence of length L with a restriction

Given an array of positive integers, how do I find a subsequence of length L with maximum sum in which the distance between any two neighboring elements does not exceed K?
I have the following solution but don't know how to take the length L into account.
1 <= N <= 100000, 1 <= L <= 200, 1 <= K <= N
f[i] contains the max sum of a subsequence that ends at i.
for i in range(K, N):
    f[i] = INT_MIN
    for j in range(1, K+1):
        f[i] = max(f[i], f[i-j] + a[i])
return max(f)
(edit: slightly simplified non-recursive solution)
You can do it like this: at each step, consider whether the current item should be included or excluded.
def f(maxK, K, N, L, S):
    if L == 0 or not N or K == 0:
        return S
    # either the element is included
    included = f(maxK, maxK, N[1:], L-1, S + N[0])
    # or excluded
    excluded = f(maxK, K-1, N[1:], L, S)
    return max(included, excluded)

assert f(2, 2, [10, 1, 1, 1, 1, 10], 3, 0) == 12
assert f(3, 3, [8, 3, 7, 6, 2, 1, 9, 2, 5, 4], 4, 0) == 30
If N is very long you can consider changing to a table version; you could also change the input to tuples and use memoization.
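To make the memoization suggestion concrete, here is a sketch (my own illustration, not part of the original answer): the list argument is replaced by an index so that the cached arguments stay hashable. The name f_memo is mine.
from functools import lru_cache

# Sketch of the memoization idea (not the answer's code): pass an index
# instead of slicing N, so the cached arguments are hashable.
def f_memo(maxK, N, L):
    N = tuple(N)

    @lru_cache(maxsize=None)
    def go(K, i, L):
        if L == 0 or i == len(N) or K == 0:
            return 0
        included = N[i] + go(maxK, i + 1, L - 1)  # take N[i], reset the distance budget
        excluded = go(K - 1, i + 1, L)            # skip N[i], one fewer step allowed
        return max(included, excluded)

    return go(maxK, 0, L)

assert f_memo(2, [10, 1, 1, 1, 1, 10], 3) == 12
assert f_memo(3, [8, 3, 7, 6, 2, 1, 9, 2, 5, 4], 4) == 30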
Since the OP later added the information that N can be 100,000, we can't really use recursive solutions like this. So here is a solution that runs in O(nKL), with the same memory requirement:
import numpy as np

def f(n, K, L):
    t = np.zeros((len(n), L+1))
    for l in range(1, L+1):
        for i in range(len(n)):
            t[i, l] = n[i] + max((t[i-k, l-1] for k in range(1, K+1) if i-k >= 0), default=0)
    return np.max(t)

assert f([10, 1, 1, 1, 1, 10], 2, 3) == 12
assert f([8, 3, 7, 6, 2, 1, 9], 3, 4) == 30
Explanation of the non-recursive solution: each cell t[i, l] of the table holds the value of the max subsequence with exactly l elements that uses the element at position i and only elements at position i or lower, where neighboring elements are at most K apart.
Subsequences of length 1 (those in t[i, 1]) consist of only one element, n[i].
Longer subsequences are n[i] plus a subsequence of l-1 elements that ends at most K rows earlier; we pick the one with the maximal value. By iterating in this order, we ensure that that value has already been calculated.
Further improvements in memory are possible by considering that you only ever look at most K steps back.
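To illustrate that remark, here is a sketch of a reduced-memory variant (my own illustration, not the answer's code): since row i only reads rows i-K .. i-1, keeping just the last K rows in a deque gives O(KL) memory instead of O(nL). The function name is mine.
from collections import deque

def max_subseq_sum_small_memory(values, K, L):
    # Sketch: keep only the last K DP rows. Assumes L >= 1 and the
    # positive-integer values from the original problem.
    NEG = float("-inf")              # marks "no subsequence of this length ends here"
    last_rows = deque(maxlen=K)      # rows i-K .. i-1 of the DP table
    best = NEG
    for v in values:
        row = [NEG] * (L + 1)
        row[1] = v                   # a length-1 subsequence is just this element
        for l in range(2, L + 1):
            prev = max((r[l - 1] for r in last_rows), default=NEG)
            if prev != NEG:
                row[l] = v + prev    # extend the best (l-1)-subsequence ending within K steps
        best = max(best, row[L])
        last_rows.append(row)
    return best

assert max_subseq_sum_small_memory([10, 1, 1, 1, 1, 10], 2, 3) == 12
assert max_subseq_sum_small_memory([8, 3, 7, 6, 2, 1, 9, 2, 5, 4], 3, 4) == 30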
Here is a bottom-up (i.e. no recursion) dynamic programming solution in Python. It takes memory O(l * n) and time O(l * n * k).
def max_subseq_sum(k, l, values):
    # table[i][j] will be the highest value from a sequence of length j
    # ending at position i
    table = []
    for i in range(len(values)):
        # We have no sum from 0, and i from len 1.
        table.append([0, values[i]])
        # By length of previous subsequence
        for subseq_len in range(1, l):
            # We look back up to k for the best.
            prev_val = None
            for last_i in range(i-k, i):
                # We don't look back if the sequence was not that long.
                if subseq_len <= last_i+1:
                    # Is this better?
                    this_val = table[last_i][subseq_len]
                    if prev_val is None or prev_val < this_val:
                        prev_val = this_val
            # Do we have a best to offer?
            if prev_val is not None:
                table[i].append(prev_val + values[i])
    # Now we look for the best entry of length l.
    best_val = None
    for row in table:
        # If the row has entries for 0...l it will have len > l.
        if l < len(row):
            if best_val is None or best_val < row[l]:
                best_val = row[l]
    return best_val

print(max_subseq_sum(2, 3, [10, 1, 1, 1, 1, 10]))
print(max_subseq_sum(3, 4, [8, 3, 7, 6, 2, 1, 9, 2, 5, 4]))
If I wanted to be slightly clever I could make this memory O(n) pretty easily by calculating one layer at a time and throwing away the previous one. It takes a lot of cleverness to reduce the running time to O(l*n*log(k)), but that is doable. (Use a priority queue for your best value in the last k. It is O(log(k)) to update it for each element, but it naturally grows. Every k values you throw it away and rebuild it, for an O(k) cost incurred O(n/k) times, for a total O(n) rebuild cost.)
And here is the clever version. Memory O(n). Time O(n*l*log(k)) worst case, and average case is O(n*l). You hit the worst case when it is sorted in ascending order.
import heapq

def max_subseq_sum(k, l, values):
    count = 0
    prev_best = [0 for _ in values]
    # i represents how many in prev subsequences
    # It ranges from 0..(l-1).
    for i in range(l):
        # We are building subsequences of length i+1.
        # We will have no way to find one that ends
        # before the i'th element at position i-1
        best = [None for _ in range(i)]
        # Our heap will be (-sum, index). It is a min_heap so the
        # minimum element has the largest sum. We track the index
        # so that we know when it is in the last k.
        min_heap = [(-prev_best[i-1], i-1)]
        for j in range(i, len(values)):
            # Remove best elements that are more than k back.
            while min_heap[0][-1] < j-k:
                heapq.heappop(min_heap)
            # We append this value + (best prev sum) using -(-..) = +.
            best.append(values[j] - min_heap[0][0])
            heapq.heappush(min_heap, (-prev_best[j], j))
            # And now keep min_heap from growing too big.
            if 2*k < len(min_heap):
                # Filter out elements too far back.
                min_heap = [_ for _ in min_heap if j - k < _[1]]
                # And make into a heap again.
                heapq.heapify(min_heap)
        # And now finish this layer.
        prev_best = best
    return max(prev_best)
Extending the code for itertools.combinations shown in the docs, I built a version that takes an argument for the maximum index distance (K) between two values. It only needed an additional check of indices[i] - indices[i-1] < K in the iteration:
def combinations_with_max_dist(iterable, r, K):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r and indices[i] - indices[i-1] < K:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
Using this you can brute-force over all combinations with regard to K, and then find the one that has the maximum value sum:
def find_subseq(a, L, K):
    return max((sum(values), values) for values in combinations_with_max_dist(a, L, K))
Results:
print(*find_subseq([10, 1, 1, 1, 1, 10], L=3, K=2))
# 12 (10, 1, 1)
print(*find_subseq([8, 3, 7, 6, 2, 1, 9, 2, 5, 4], L=4, K=3))
# 30 (8, 7, 6, 9)
Not sure about the performance if your value lists become very long though...
Algorithm
Basic idea:
Iterate over the input array, choosing each index in turn as the first taken element.
Then recurse on each first taken element, marking that index as firstIdx.
The next possible index is in the range [firstIdx + 1, firstIdx + K], both inclusive.
Loop over that range, calling each index recursively with L - 1 as the new L.
Optionally, cache the max sum for each (firstIdx, L) pair for reuse.
This is probably necessary for large input.
Constraints:
array length <= 1 << 17 // 131072
K <= 1 << 6 // 64
L <= 1 << 8 // 256
Complexity:
Time: O(n * L * K)
Since each (firstIdx, L) pair is only calculated once, and each calculation contains an iteration over K.
Space: O(n * L)
For the cache, and the method stack of the recursive calls.
Tips:
Depth of recursion is related to L, not array length.
The defined constraints are not the actual limits; they could be larger, though I didn't test how large they can be.
Basically:
Both the array length and K could be of any size as long as there is enough memory, since they are handled via iteration.
L is handled via recursion, thus it does have a limit.
Code - in Java
SubSumLimitedDistance.java:
import java.util.HashMap;
import java.util.Map;

public class SubSumLimitedDistance {
    public static final long NOT_ENOUGH_ELE = -1; // sum that indicates not enough elements, should be < 0,
    public static final int MAX_ARR_LEN = 1 << 17; // max length of input array,
    public static final int MAX_K = 1 << 6; // max K, should not be too large, otherwise slow,
    public static final int MAX_L = 1 << 8; // max L, should not be too large, otherwise stack overflow,

    /**
     * Find max sum of subsequence.
     *
     * @param arr
     * @param K
     * @param L
     * @return max sum,
     */
    public static long find(int[] arr, int K, int L) {
        if (K < 1 || K > MAX_K)
            throw new IllegalArgumentException("K should be between [1, " + MAX_K + "], but get: " + K);
        if (L < 0 || L > MAX_L)
            throw new IllegalArgumentException("L should be between [0, " + MAX_L + "], but get: " + L);
        if (arr.length > MAX_ARR_LEN)
            throw new IllegalArgumentException("input array length should <= " + MAX_ARR_LEN + ", but get: " + arr.length);

        Map<Integer, Map<Integer, Long>> cache = new HashMap<>(); // cache,

        long maxSum = NOT_ENOUGH_ELE;
        for (int i = 0; i < arr.length; i++) {
            long sum = findTakeFirst(arr, K, L, i, cache);
            if (sum == NOT_ENOUGH_ELE) break; // not enough elements,
            if (sum > maxSum) maxSum = sum; // larger found,
        }
        return maxSum;
    }

    /**
     * Find max sum of subsequence, with index of first taken element specified,
     *
     * @param arr
     * @param K
     * @param L
     * @param firstIdx index of first taken element,
     * @param cache
     * @return max sum,
     */
    private static long findTakeFirst(int[] arr, int K, int L, int firstIdx, Map<Integer, Map<Integer, Long>> cache) {
        // System.out.printf("findTakeFirst(): K = %d, L = %d, firstIdx = %d\n", K, L, firstIdx);
        if (L == 0) return 0; // done,
        if (firstIdx + L > arr.length) return NOT_ENOUGH_ELE; // not enough elements,

        // check cache,
        Map<Integer, Long> map = cache.get(firstIdx);
        Long cachedResult;
        if (map != null && (cachedResult = map.get(L)) != null) {
            // System.out.printf("hit cache, cached result = %d\n", cachedResult);
            return cachedResult;
        }

        // cache does not exist, calculate,
        long maxRemainSum = NOT_ENOUGH_ELE;
        for (int i = firstIdx + 1; i <= firstIdx + K; i++) {
            long remainSum = findTakeFirst(arr, K, L - 1, i, cache);
            if (remainSum == NOT_ENOUGH_ELE) break; // not enough elements,
            if (remainSum > maxRemainSum) maxRemainSum = remainSum;
        }

        if ((map = cache.get(firstIdx)) == null) cache.put(firstIdx, map = new HashMap<>());

        if (maxRemainSum == NOT_ENOUGH_ELE) { // not enough elements,
            map.put(L, NOT_ENOUGH_ELE); // cache - as not enough elements,
            return NOT_ENOUGH_ELE;
        }

        long maxSum = arr[firstIdx] + maxRemainSum; // max sum,
        map.put(L, maxSum); // cache - max sum,
        return maxSum;
    }
}
SubSumLimitedDistanceTest.java:
(test case, via TestNG)
import org.testng.Assert;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

import java.util.concurrent.ThreadLocalRandom;

public class SubSumLimitedDistanceTest {
    private int[] arr;
    private int K;
    private int L;
    private int maxSum;

    private int[] arr2;
    private int K2;
    private int L2;
    private int maxSum2;

    private int[] arrMax;
    private int KMax;
    private int KMaxLargest;
    private int LMax;
    private int LMaxLargest;

    @BeforeClass
    private void setUp() {
        // init - arr,
        arr = new int[]{10, 1, 1, 1, 1, 10};
        K = 2;
        L = 3;
        maxSum = 12;

        // init - arr2,
        arr2 = new int[]{8, 3, 7, 6, 2, 1, 9, 2, 5, 4};
        K2 = 3;
        L2 = 4;
        maxSum2 = 30;

        // init - arrMax,
        arrMax = new int[SubSumLimitedDistance.MAX_ARR_LEN];
        ThreadLocalRandom rd = ThreadLocalRandom.current();
        long maxLongEle = Long.MAX_VALUE / SubSumLimitedDistance.MAX_ARR_LEN;
        int maxEle = maxLongEle > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) maxLongEle;
        for (int i = 0; i < arrMax.length; i++) {
            arrMax[i] = rd.nextInt(maxEle);
        }
        KMax = 5;
        LMax = 10;

        KMaxLargest = SubSumLimitedDistance.MAX_K;
        LMaxLargest = SubSumLimitedDistance.MAX_L;
    }

    @Test
    public void test() {
        Assert.assertEquals(SubSumLimitedDistance.find(arr, K, L), maxSum);
        Assert.assertEquals(SubSumLimitedDistance.find(arr2, K2, L2), maxSum2);
    }

    @Test(timeOut = 6000)
    public void test_veryLargeArray() {
        run_printDuring(arrMax, KMax, LMax);
    }

    @Test(timeOut = 60000) // takes seconds,
    public void test_veryLargeArrayL() {
        run_printDuring(arrMax, KMax, LMaxLargest);
    }

    @Test(timeOut = 60000) // takes seconds,
    public void test_veryLargeArrayK() {
        run_printDuring(arrMax, KMaxLargest, LMax);
    }

    // run find once, and print the duration,
    private void run_printDuring(int[] arr, int K, int L) {
        long startTime = System.currentTimeMillis();
        long sum = SubSumLimitedDistance.find(arr, K, L);
        long during = System.currentTimeMillis() - startTime; // duration in milliseconds,
        System.out.printf("arr length = %5d, K = %3d, L = %4d, max sum = %15d, running time = %.3f seconds\n", arr.length, K, L, sum, during / 1000.0);
    }

    @Test
    public void test_corner_notEnoughEle() {
        Assert.assertEquals(SubSumLimitedDistance.find(new int[]{1}, 2, 3), SubSumLimitedDistance.NOT_ENOUGH_ELE); // not enough elements,
        Assert.assertEquals(SubSumLimitedDistance.find(new int[]{0}, 1, 3), SubSumLimitedDistance.NOT_ENOUGH_ELE); // not enough elements,
    }

    @Test
    public void test_corner_ZeroL() {
        Assert.assertEquals(SubSumLimitedDistance.find(new int[]{1, 2, 3}, 2, 0), 0); // L = 0,
        Assert.assertEquals(SubSumLimitedDistance.find(new int[]{0}, 1, 0), 0); // L = 0,
    }

    @Test(expectedExceptions = IllegalArgumentException.class)
    public void test_invalid_K() {
        // SubSumLimitedDistance.find(new int[]{1, 2, 3}, 0, 2); // K = 0,
        // SubSumLimitedDistance.find(new int[]{1, 2, 3}, -1, 2); // K = -1,
        SubSumLimitedDistance.find(new int[]{1, 2, 3}, SubSumLimitedDistance.MAX_K + 1, 2); // K = MAX_K + 1,
    }

    @Test(expectedExceptions = IllegalArgumentException.class)
    public void test_invalid_L() {
        // SubSumLimitedDistance.find(new int[]{1, 2, 3}, 2, -1); // L = -1,
        SubSumLimitedDistance.find(new int[]{1, 2, 3}, 2, SubSumLimitedDistance.MAX_L + 1); // L = MAX_L + 1,
    }

    @Test(expectedExceptions = IllegalArgumentException.class)
    public void test_invalid_tooLong() {
        SubSumLimitedDistance.find(new int[SubSumLimitedDistance.MAX_ARR_LEN + 1], 2, 3); // input array too long,
    }
}
Output of test case for large input:
arr length = 131072, K = 5, L = 10, max sum = 20779205738, running time = 0.303 seconds
arr length = 131072, K = 64, L = 10, max sum = 21393422854, running time = 1.917 seconds
arr length = 131072, K = 5, L = 256, max sum = 461698553839, running time = 9.474 seconds

Get correct FileLengthFrames with CoreAudio

I'm working on converting my Python code to Objective-C to run on iOS devices. The code reads an audio file. In Python I'm using AudioSegment to read the file; the result is two separate channels as arrays.
For example:
Left channel [-1,-2,-3,-4,-5,-6,-7,-8,-9,-10] //length = 10
Right channel [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] //length = 10
So the total length from Python is 20.
Here is how I get the audio output in Objective-C:
float *audioTotal = malloc(fileLengthInFrames * sizeof(float));
SInt16 *inputFrames = (SInt16*)bufferList->mBuffers[0].mData;
for(int i = 0; i < fileLengthInFrames; ++i) {
    audioTotal[i] = (float)inputFrames[i];
    printf("%f ", audioTotal[i]);
}
And the output is:
[-1, 1, -2, 2, -3, 3, -4, 4, -5, 5] // length = 10
So the output from Objective-C has the left and right channels interleaved, so I have to separate them in code:
if (clientFormat.mChannelsPerFrame > 1) {
    int indexLeft = 0;
    int indexRight = 0;
    float *leftAudio = malloc(fileLengthInFrames * sizeof(float));
    float *rightAudio = malloc(fileLengthInFrames * sizeof(float));
    for(int i = 0; i < fileLengthInFrames; i++) {
        if (i%2 == 0) {
            leftAudio[indexLeft] = audioTotal[i];
            printf("%f ", leftAudio[indexLeft]);
            indexLeft ++;
        } else {
            rightAudio[indexRight] = audioTotal[i];
            printf("%f ", rightAudio[indexRight]);
            indexRight ++;
        }
    }
}
And now I have the two separated channels from Objective-C:
Left channel [-1,-2,-3,-4,-5] //length = 5
Right channel [ 1, 2, 3, 4, 5] //length = 5
So the total length I got from Objective-C is 10, compared with 20 in Python.
Where is the rest of my data? Did I miss some steps, or is something configured wrongly?
Thanks for the help.
When you have interleaved samples and you "separate them by code", you're forgetting to multiply by channelsPerBuffer (which seems to be interleaved-savvy?), so for stereo you're missing out on half of the samples. Try changing the for loop to
for(int i = 0; i < fileLengthInFrames*channelsPerBuffer; i++) {
    // display left and right samples here ...
}
The length of audioTotal should also be fileLengthInFrames*channelsPerBuffer.
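To illustrate the point in the question's original Python terms (a sketch with made-up data, not CoreAudio code): a stereo buffer of N frames holds N * channels samples, so de-interleaving has to walk all of them.
# Illustration only (plain Python, hypothetical data): a stereo buffer of
# `frames` frames holds frames * channels samples, interleaved L R L R ...
frames = 5
channels = 2
interleaved = [-1, 1, -2, 2, -3, 3, -4, 4, -5, 5]   # len == frames * channels

left = interleaved[0::channels]    # [-1, -2, -3, -4, -5]
right = interleaved[1::channels]   # [ 1,  2,  3,  4,  5]
assert len(interleaved) == frames * channels == len(left) + len(right)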
p.s. why recalculate fileLengthInFrames if client and file sample rates are the same?

Shift array elements in C++ without loop

Is there a way to shift array elements in C++ without using any loop, like the Python code below, which rotates the elements of the list just by manipulating list indices?
def rotate(lst, n):
    n = n % len(lst)
    return lst[n:] + lst[:n]

>>> rotate([1,2,3,4,5], 1)  # rotate forward
[2, 3, 4, 5, 1]
C++ standard algorithms also work with arrays, so you can just use std::rotate or std::rotate_copy.
The functions' interfaces are a bit more complex than rotation in your Python example, though. You have to provide, as a second argument, an iterator to the element which will become the first element in the resulting array.
For an array { 1, 2, 3, 4, 5 } and a forward rotation by one element, that would be the second element (the "2"). You get an iterator to that element by adding 1 to an iterator to the array's first element, for example array.begin() + 1, assuming that you use std::array, or array + 1 if it's a raw array.
#include <iostream>
#include <algorithm>
#include <array>

int main()
{
    std::array<int, 5> array = { 1, 2, 3, 4, 5 };

    std::rotate(
        array.begin(),
        array.begin() + 1,
        array.end()
    );

    for (auto&& element : array)
    {
        std::cout << element << "\n";
    }
}
If you want an interface like in your Python code, then you can wrap std::rotate in a function of your own and provide an int parameter. This is also a nice opportunity to make the whole thing more reusable by creating a generic function which can be used with any suitable container:
#include <iostream>
#include <algorithm>
#include <array>
#include <vector>
#include <list>

template <class Container>
void rotate(Container& container, int n)
{
    using std::begin;
    using std::end;

    auto new_begin = begin(container);
    std::advance(new_begin, n);

    std::rotate(
        begin(container),
        new_begin,
        end(container)
    );
}

int main()
{
    std::array<int, 5> array = { 1, 2, 3, 4, 5 };
    rotate(array, 1);

    std::vector<int> vector = { 1, 2, 3, 4, 5 };
    rotate(vector, 3);

    std::list<int> list = { 1, 2, 3, 4, 5 };
    rotate(list, 2);

    int raw_array[] = { 1, 2, 3, 4, 5 };
    rotate(raw_array, 3);

    // test output goes here...
}
Note how std::begin and std::end make sure that raw arrays (where an iterator is just a pointer, as in array + N) and container classes (with their c.begin() + N syntax) are both supported, and std::advance makes the function work for containers with non-random-access iterators like std::list (where you must increment an iterator repeatedly to advance it by more than one element).
By the way, if you want to support n arguments greater than or equal to the container's size, then you can use the C++17 function std::size or just create your own. And perhaps use assert to catch accidental negative arguments:
assert(n >= 0);
using std::size;
n = n % size(container);

Integer list to ranges

I need to convert a list of ints to a string containing all the ranges in the list.
So for example, the output should be as follows:
getIntRangesFromList([1,3,7,2,11,8,9,11,12,15]) -> "1-3,7-9,11-12,15"
So the input is not sorted and there can be duplicate values. The lists range in size from one element to 4k elements. The minimum and maximum values are 1 and 4094.
This is part of a performance critical piece of code. I have been trying to optimize this, but I can't find a way to get this faster. This is my current code:
def _getIntRangesFromList(list):
    if (list==[]):
        return ''
    list.sort()
    ranges = [[list[0],list[0]]] # ranges contains the start and end values of each range found
    for val in list:
        r = ranges[-1]
        if val==r[1]+1:
            r[1] = val
        elif val>r[1]+1:
            ranges.append([val,val])
    return ",".join(["-".join([str(y) for y in x]) if x[0]!=x[1] else str(x[0]) for x in ranges])
Any idea on how to get this faster?
This could be a task for the itertools module.
import itertools

list_num = [1, 2, 3, 7, 8, 9, 11, 12, 15]
groups = (list(x) for _, x in
          itertools.groupby(list_num, lambda x, c=itertools.count(): x - next(c)))
print(', '.join('-'.join(map(str, (item[0], item[-1])[:len(item)])) for item in groups))
This will give you 1-3, 7-9, 11-12, 15.
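Note that list_num above is already sorted and has no duplicates; for the input in the question you would need to normalize it first. A sketch (my own addition, reusing the same groupby expression):
import itertools

data = [1, 3, 7, 2, 11, 8, 9, 11, 12, 15]   # the question's unsorted input with a duplicate
list_num = sorted(set(data))                # groupby needs sorted, unique values

groups = (list(x) for _, x in
          itertools.groupby(list_num, lambda x, c=itertools.count(): x - next(c)))
print(','.join('-'.join(map(str, (item[0], item[-1])[:len(item)])) for item in groups))
# 1-3,7-9,11-12,15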
To understand what's going on you might want to check the content of groups.
import itertools

list_num = [1, 2, 3, 7, 8, 9, 11, 12, 15]
groups = (list(x) for _, x in
          itertools.groupby(list_num, lambda x, c=itertools.count(): x - next(c)))
for element in groups:
    print('element={}'.format(element))
This will give you the following output.
element=[1, 2, 3]
element=[7, 8, 9]
element=[11, 12]
element=[15]
The basic idea is to have a counter running parallel to the numbers. groupby will create individual groups for numbers with the same numerical distance to the current value of the counter.
I don't know whether this is faster on your version of Python. You'll have to check this yourself. In my setting it's slower with this data set, but faster with a bigger number of elements.
The fastest one I could come up with, which tests about 10% faster than your solution on my machine (according to timeit):
def _ranges(l):
    if l:
        l.sort()
        return ''.join([(str(l[i]) + ('-' if l[i] + 1 == l[i + 1] else ','))
                        for i in range(0, len(l) - 1) if l[i - 1] + 2 != l[i + 1]] +
                       [str(l[-1])])
    else:
        return ''
The above code assumes that the values in the list are unique. If they aren't, it's easy to fix but there's a subtle hack which will no longer work and the end result will be slightly slower.
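For instance, the easy fix could be to drop the duplicates up front (my own sketch, not the answer's code):
def _ranges_with_duplicates(l):
    # Sketch of the "easy fix" mentioned above: remove duplicates first,
    # then reuse _ranges on the unique, sorted values.
    return _ranges(sorted(set(l)))

assert _ranges_with_duplicates([1, 3, 7, 2, 11, 8, 9, 11, 12, 15]) == '1-3,7-9,11-12,15'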
I actually timed _ranges(u[:]) because of the sort; u is 600 randomly selected integers from range(1000) comprising 235 subsequences; 83 are singletons and 152 contain at least two numbers. If the list is sorted, quite a lot of time is saved.
def _to_range(l, start, stop, idx, result):
    if idx == len(l):
        result.append((start, stop))
        return result
    if l[idx] - stop > 1:
        result.append((start, stop))
        return _to_range(l, l[idx], l[idx], idx + 1, result)
    return _to_range(l, start, l[idx], idx + 1, result)

def get_range(l):
    if not l:
        return []
    return _to_range(l, start=l[0], stop=l[0], idx=0, result=[])
l = [1, 2, 3, 7, 8, 9, 11, 12, 15]
result = get_range(l)
print(result)
>>> [(1, 3), (7, 9), (11, 12), (15, 15)]
# I think it's better to fetch the data as it is and, if needed, change it with:
print(','.join('-'.join([str(start), str(stop)]) for start, stop in result))
>>> 1-3,7-9,11-12,15-15
If you don't care about keeping the raw data, you can just append str(start) + '-' + str(stop) inside the _to_range function, so there is no need for the extra '-'.join step later.
I'll concentrate on performance, since that is your main issue. I'll give two solutions:
1) If the integers are bounded between A and B, you can create an array of booleans (or a bit array, to extend the range you can store) with B - A + 2 elements, e.g. A = 0 and B = 1,000,000. This runs in O(B - A) and is a good solution if B - A is less than the number of values (I'll write it in C#, sorry XD):
public string getIntRangesFromList(int[] numbers)
{
    //You can change these 2 constants
    const int A = 0;
    const int B = 1000000;
    //Create an array with all its values false by default
    //The last value always stays false on purpose; the array stores 1 value more than needed for the 2nd cycle
    bool[] apparitions = new bool[B - A + 2];
    int minNumber = B + 1;
    int maxNumber = A - 1;
    int pos;
    for (int i = 0; i < numbers.Length; i++)
    {
        pos = numbers[i] - A;
        apparitions[pos] = true;
        if (minNumber > pos)
        {
            minNumber = pos;
        }
        if (maxNumber < pos)
        {
            maxNumber = pos;
        }
    }
    //I will keep the concatenation simple, but you can make it faster to improve performance
    string result = "";
    bool isInRange = false;
    bool isFirstRange = true;
    int firstPosOfRange = 0; //Irrelevant what its initial value is
    for (int i = minNumber; i <= maxNumber + 1; i++)
    {
        if (!isInRange)
        {
            if (apparitions[i])
            {
                if (!isFirstRange)
                {
                    result += ",";
                }
                else
                {
                    isFirstRange = false;
                }
                result += (i + A);
                isInRange = true;
                firstPosOfRange = i;
            }
        }
        else
        {
            if (!apparitions[i])
            {
                if (i > firstPosOfRange + 1)
                {
                    result += "-" + (i + A - 1);
                }
                isInRange = false;
            }
        }
    }
    return result;
}
2) O(N * log N)
public string getIntRangesFromList2(int[] numbers)
{
    string result = "";
    if (numbers.Length > 0)
    {
        Array.Sort(numbers); //sorting in place, making the algorithm complexity O(N * log N)
        result += numbers[0];
        int countNumbersInRange = 1;
        for (int i = 1; i < numbers.Length; i++)
        {
            if (numbers[i] == numbers[i - 1])
            {
                continue; //skip duplicate values
            }
            if (numbers[i] != numbers[i - 1] + 1)
            {
                if (countNumbersInRange > 1)
                {
                    result += "-" + numbers[i - 1];
                }
                result += "," + numbers[i];
                countNumbersInRange = 1;
            }
            else
            {
                countNumbersInRange++;
            }
        }
        if (countNumbersInRange > 1)
        {
            result += "-" + numbers[numbers.Length - 1]; //close the final range
        }
    }
    return result;
}

How to implement/construct the following permutation, given two n-tuples, efficiently?

I am studying queuing theory in which I am frequently presented with the following situation.
Let x, y both be n-tuples of nonnegative integers (depicting the lengths of the n queues). In addition, x and y each have a distinguished queue called their "prime queue". For example,
x = [3, 6, 1, 9, 5, 2] with x' = 1
y = [6, 1, 5, 9, 5, 5] with y' = 5
(In accordance with Python terminology I am counting the queues 0-5.)
How can I implement/construct the following permutation f on {0,1,...,5} efficiently?
first set f(x') = y'. So here f(1) = 5.
then set f(i) = i for any i such that x[i] == y[i]. Clearly there is no need to consider the indices x' and y'. So here f(3) = 3 (both length 9) and f(4) = 4 (both length 5).
there are now equally sized sets of queues unpaired in x and in y. So here in x this is {0,2,5} and in y this is {0,1,2}.
rank these from 1 to s, where s is the common size of the sets, by length, with 1 == lowest rank == shortest queue and s == highest rank == longest queue. So here, s = 3, and in x rank(0) = 1, rank(2) = 3 and rank(5) = 2, and in y rank(0) = 1, rank(1) = 3, rank(2) = 2. If there is a tie, give the queue with the larger index the higher rank.
pair these s queues off by rank. So here f(0) = 0, f(2) = 1, f(5) = 2.
This should give the permutation [0, 5, 1, 3, 4, 2].
My current solution tracks the indices and loops over x and y multiple times, and is terribly inefficient. (I'm looking at roughly n >= 1,000,000 in my application.)
Any help would be most appreciated.
Since you must do the ranking, you can't get a linear-time solution in general and will need to sort. So it looks pretty straightforward: you do step 1 in O(1) and step 2 in O(n) by just going over the n-tuples. At the same time, you can construct copies of x and y containing only the queues that are left over for step 3, but store not just the value: use a tuple of the value and its index in the original.
In your example, x-with-tuples-left would be [[3,0],[1,2],[2,5]] and y-with-tuples-left would be [[6,0],[1,1],[5,2]].
Then just sort both x-with-tuples-left and y-with-tuples-left (that is O(n log n)), and read the permutation off the second elements of the corresponding tuples.
In your example, sorted x-with-... would be [[1,2],[2,5],[3,0]] and sorted y-with-... would be [[1,1],[5,2],[6,0]]. Now you can read off step 5 nicely from the second elements: f(2)=1, f(5)=2, f(0)=0.
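Since the question uses Python terminology, here is a sketch of that sorting approach in Python (my own illustration; the name build_permutation is mine, not part of the answer):
def build_permutation(x, y, xprime, yprime):
    n = len(x)
    f = [None] * n
    f[xprime] = yprime                               # step 1
    x_left, y_left = [], []
    for i in range(n):
        if i != xprime and i != yprime and x[i] == y[i]:
            f[i] = i                                 # step 2
        else:                                        # step 3: collect unpaired queues
            if i != xprime:
                x_left.append((x[i], i))             # (length, index)
            if i != yprime:
                y_left.append((y[i], i))
    x_left.sort()                                    # step 4: rank by length;
    y_left.sort()                                    # ties broken by index, as required
    for (_, i), (_, j) in zip(x_left, y_left):       # step 5: pair off by rank
        f[i] = j
    return f

x = [3, 6, 1, 9, 5, 2]
y = [6, 1, 5, 9, 5, 5]
print(build_permutation(x, y, 1, 5))   # [0, 5, 1, 3, 4, 2]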
EDIT: Including an O(n+L) solution in JavaScript (L here being the maximum queue length):
function qperm (x, y, xprime, yprime) {
    var i;
    var n = x.length;
    var qperm = new Array(n);
    var countsx = [], countsy = []; // same as new Array()

    qperm[xprime] = yprime; // doing 1.
    for (i = 0; i < n; ++i) {
        if (x[i] == y[i] && i != xprime && i != yprime) { // doing 2.
            qperm[i] = i;
        } else { // preparing for 4. below
            if (i != xprime) {
                if (countsx[x[i]]) countsx[x[i]]++; else countsx[x[i]] = 1;
            }
            if (i != yprime) {
                if (countsy[y[i]]) countsy[y[i]]++; else countsy[y[i]] = 1;
            }
        }
    }

    // finishing countsx and countsy: turn the counts into starting offsets
    var count, sum;
    for (i = 0, count = 0; i < countsx.length; ++i) {
        if (countsx[i]) {
            sum = count + countsx[i];
            countsx[i] = count;
            count = sum;
        }
    }
    for (i = 0, count = 0; i < countsy.length; ++i) {
        if (countsy[i]) {
            sum = count + countsy[i];
            countsy[i] = count;
            count = sum;
        }
    }

    var yranked = new Array(count);
    for (i = 0; i < n; ++i) {
        if (i != yprime && (x[i] != y[i] || i == xprime)) { // doing 4. for y
            yranked[countsy[y[i]]] = i; // store the index (not the value), so qperm maps indices to indices
            countsy[y[i]]++;
        }
    }
    for (i = 0; i < n; ++i) {
        if (i != xprime && (x[i] != y[i] || i == yprime)) { // doing 4. for x and 5. at the same time
            // this was here but was not right: qperm[x[i]] = yranked[countsx[x[i]]];
            qperm[i] = yranked[countsx[x[i]]];
            // this was here but was not right: countsy[y[i]]++;
            countsx[x[i]]++;
        }
    }
    return qperm;
}
Hopefully it's correct ;-)
