Consider we have 2 arrays of size N, with their values in the range [0, N-1]. For example:
a = np.array([0, 1, 2, 0])
b = np.array([2, 0, 3, 3])
I need to produce a new array c which contains exactly N/2 elements from a and b respectively, i.e. the values must be taken evenly/equally from both parent arrays.
(For odd length, this would be (N-1)/2 and (N+1)/2. Can also ignore odd length case, not important).
Taking equal number of elements from two arrays is pretty trivial, but there is an additional constraint: c should have as many unique numbers as possible / as few duplicates as possible.
For example, a solution to a and b above is:
c = np.array([b[0], a[1], b[2], a[3]])
>>> c
array([2, 1, 3, 0])
Note that the position/order is preserved. Each element of a and b that we took to form c is in same position. If element i in c is from a, c[i] == a[i], same for b.
A straightforward solution for this is simply a sort of path traversal, easy enough to implement recursively:
def traverse(i, a, b, path, n_a, n_b, best, best_path):
    if n_a == 0 and n_b == 0:
        score = len(set(path))
        return (score, path.copy()) if score > best else (best, best_path)
    if n_a > 0:
        path.append(a[i])
        best, best_path = traverse(i + 1, a, b, path, n_a - 1, n_b, best, best_path)
        path.pop()
    if n_b > 0:
        path.append(b[i])
        best, best_path = traverse(i + 1, a, b, path, n_a, n_b - 1, best, best_path)
        path.pop()
    return best, best_path
Here n_a and n_b are how many values we will take from a and b respectively. Both are 2 in this example, since we want to take 4 items evenly.
>>> score, best_path = traverse(0, a, b, [], 2, 2, 0, None)
>>> score, best_path
(4, [2, 1, 3, 0])
Is there a way to implement the above in a more vectorized/efficient manner, possibly through numpy?
The algorithm is slow mainly because it runs in exponential time. There is no straightforward way to vectorize it using only NumPy because of the recursion. Even if it were possible, the huge number of combinations would make most NumPy implementations inefficient (because of the large temporary arrays involved). Additionally, there is AFAIK no vectorized operation to count the unique values of many rows efficiently (the usual way is np.unique, which is not efficient in this case and cannot be used without a loop). As a result, there are two possible strategies to speed this up:
trying to find an algorithm with a reasonable complexity (eg. <= O(n^4));
using compilation methods, micro-optimizations and tricks to write a faster brute-force implementation.
Since finding a correct sub-exponential algorithm turns out not to be easy, I chose the second approach (though the first would be the better one).
The idea is to:
remove the recursion by generating all possible solutions using a loop iterating on integer;
write a fast way to count unique items of an array;
use the Numba JIT compiler to optimize the code, which is only efficient once compiled.
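The first step in the list above can be sketched in plain Python before any compilation is involved. This is only an illustration of the bitmask encoding, not the optimized version: the name best_mix is hypothetical, and the convention "bit i set means take a[i]" is an assumption chosen for the example.

```python
def best_mix(a, b, n_a):
    """Brute force over every bitmask with exactly n_a set bits.

    Bit i set   -> take a[i]
    Bit i clear -> take b[i]
    """
    n = len(a)
    best_score, best_path = -1, None
    for mask in range(1 << n):
        # Keep only the masks that take exactly n_a values from a
        if bin(mask).count("1") != n_a:
            continue
        path = [a[i] if (mask >> i) & 1 else b[i] for i in range(n)]
        score = len(set(path))
        if score > best_score:
            best_score, best_path = score, path
    return best_score, best_path


a = [0, 1, 2, 0]
b = [2, 0, 3, 3]
score, path = best_mix(a, b, n_a=2)
print(score, path)
```

The loop still visits all 2^N masks, so this has the same exponential cost as the recursion. It only removes the call overhead and makes the search easy to compile.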
Here is the final code:
import numpy as np
import numba as nb

# Naive way to count unique items.
# This is a slow fallback implementation.
@nb.njit
def naive_count_unique(arr):
    count = 0
    for i in range(len(arr)):
        val = arr[i]
        found = False
        for j in range(i):
            if arr[j] == val:
                found = True
                break
        if not found:
            count += 1
    return count

# Optimized way to count unique items on small arrays.
# Count items 2 by 2.
# Fast on small arrays.
@nb.njit
def optim_count_unique(arr):
    count = 0
    for i in range(0, len(arr), 2):
        if arr[i] == arr[i+1]:
            tmp = 1
            for j in range(i):
                if arr[j] == arr[i]: tmp = 0
            count += tmp
        else:
            val1, val2 = arr[i], arr[i+1]
            tmp1, tmp2 = 1, 1
            for j in range(i):
                val = arr[j]
                if val == val1: tmp1 = 0
                if val == val2: tmp2 = 0
            count += tmp1 + tmp2
    return count

@nb.njit
def count_unique(arr):
    if len(arr) % 2 == 0:
        return optim_count_unique(arr)
    else:
        # Odd case: not optimized yet
        return naive_count_unique(arr)

# Count the number of bits in a 32-bit integer
# See https://stackoverflow.com/questions/71097470/msb-lsb-popcount-in-numba
@nb.njit('int_(uint32)', inline='always')
def popcount(v):
    v = v - ((v >> 1) & 0x55555555)
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)
    c = np.uint32((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24
    return c

# Count the number of bits in a 64-bit integer
@nb.njit(inline='always')
def bit_count(n):
    if n < (1 << 30):
        return popcount(np.uint32(n))
    else:
        return popcount(np.uint32(n)) + popcount(np.uint32(n >> 32))

# Mutate `out` so as not to create an expensive new temporary array.
# A set bit in `n` selects the value from `a`, a cleared bit selects it from `b`.
@nb.njit
def int_to_path(n, out, a, b):
    for i in range(len(out)):
        out[i] = a[i] if ((n >> i) & 1) else b[i]

@nb.njit(['(int32[:], int32[:], int64, int64)', '(int64[:], int64[:], int64, int64)'])
def traverse_fast(a, b, n_a, n_b):
    # This assertion is needed because the paths are encoded using 64-bit integers.
    # This should not be a problem in practice since the number of solutions to
    # test would be impracticably huge for this algorithm anyway.
    assert n_a + n_b < 62
    max_iter = 1 << (n_a + n_b)
    path = np.empty(n_a + n_b, dtype=a.dtype)
    score, best_score, best_i = 0, 0, 0

    # Iterate over all cases (more than the set of possible solutions)
    for i in range(max_iter):
        # Filter the possible solutions: set bits select from `a`
        # (see int_to_path), so exactly n_a bits must be set.
        if bit_count(i) != n_a:
            continue

        # Analyse the score of the solution
        int_to_path(i, path, a, b)
        score = count_unique(path)

        # Store it if it is better than the previous one
        if score > best_score:
            best_score = score
            best_i = i

    int_to_path(best_i, path, a, b)
    return best_score, path
This implementation is about 30 times faster on arrays of size 8 on my machine. One could use several cores to speed this up even further. However, I think it is better to focus on finding a sub-exponential implementation to avoid wasting more computing resources. Note that the returned path can differ from the one found by the initial function, but the score is the same on random arrays. This can help others test their implementations on larger arrays without waiting for a long time.
Test this heavily.
import numpy as np
from numpy.random import default_rng
rand = default_rng(seed=1)
n = 16
a = rand.integers(low=0, high=n, size=n)
b = rand.integers(low=0, high=n, size=n)
uniques = np.setxor1d(a, b)
print(a)
print(b)
print(uniques)
def limited_uniques(arr: np.ndarray) -> np.ndarray:
    choose = np.zeros(shape=n, dtype=bool)
    _, idx, _ = np.intersect1d(arr, uniques, return_indices=True)
    idx = idx[:n//2]
    choose[idx] = True
    n_missing = n//2 - len(idx)
    counts = choose.cumsum()
    diffs = np.arange(n) - counts
    at = np.searchsorted(diffs, n_missing)
    choose[:at] = True
    return arr[choose]
a_half = limited_uniques(a)
uniques = np.union1d(uniques, np.setdiff1d(a, a_half))
interleaved = np.empty_like(a)
interleaved[0::2] = a_half
interleaved[1::2] = limited_uniques(b)
print(interleaved)
[ 7 8 12 15 0 2 13 15 3 4 13 6 4 13 4 6]
[10 8 1 0 13 12 13 8 13 5 7 12 1 4 1 7]
[ 1 2 3 5 6 10 15]
[ 7 10 8 8 12 1 15 0 0 13 2 12 3 5 6 4]
Problem Statement
Given two integer arrays A and B of size N and M respectively. You begin with a score of 0. You want to perform exactly K operations. On the iᵗʰ operation (1-indexed), you will:
Choose one integer x from either the start or the end of any one array, A or B. Remove it from that array
Add x to score.
Return the maximum score after performing K operations.
Example
Input: A = [3,1,2], B = [2,8,1,9] and K=5
Output: 24
Explanation: An optimal solution is as follows:
Choose from end of B, add 9 to score. Remove 9 from B
Choose from start of A, add 3 to score. Remove 3 from A
Choose from start of B, add 2 to score. Remove 2 from B
Choose from start of B, add 8 to score. Remove 8 from B
Choose from end of A, add 2 to score. Remove 2 from A
The total score is 9+3+2+8+2 = 24
Constraints
1 ≤ N ≤ 6000
1 ≤ M ≤ 6000
1 ≤ A[i] ≤ 10^9
1 ≤ B[i] ≤ 10^9
1 ≤ K ≤ N+M
My Approach
A greedy approach (always choosing the maximum end across both arrays) fails here, because it runs into a conflict when the maximum end of both arrays is the same. This suggests that we have to look at all possible combinations. There will be overlapping sub-problems, hence DP!
Here is the python reprex code for the same.
A = [3,1,2]
N = len(A)
B = [2,8,1,9]
M = len(B)
K = 5
memo = {}
def solve(i,j, AL, BL):
    if (i,j,AL,BL) in memo:
        return memo[(i,j,AL,BL)]
    AR = (N-1)-(i-AL)
    BR = (M-1)-(j-BL)
    if AL>AR or BL>BR or i+j==K:
        return 0
    op1 = A[AL] + solve(i+1,j,AL+1,BL)
    op2 = B[BL] + solve(i,j+1,AL,BL+1)
    op3 = A[AR] + solve(i+1,j,AL,BL)
    op4 = B[BR] + solve(i,j+1,AL,BL)
    memo[(i,j,AL,BL)] = max(op1,op2,op3,op4)
    return memo[(i,j,AL,BL)]
print(solve(0,0,0,0))
In brief,
i indicates that we have performed i operations from A
j indicates that we have performed j operations from B
Total operation is thus i+j
AL is the index to the left of which all integers of A have been used. Similarly, AR is the index to the right of which all integers of A have been used.
BL is the index to the left of which all integers of B have been used. Similarly, BR is the index to the right of which all integers of B have been used.
We try out all possible combinations and choose the maximum among them at each step, memoizing the answers.
Doubt
The code worked fine for several test cases but failed for a few. The message was Wrong Answer, meaning there was no Time Limit Exceeded, Memory Limit Exceeded, Syntax Error or Runtime Error, so there must be a logical error.
Can anyone help identify such test cases, and explain the reason why this approach fails in some cases?
Examples where the posted code gives the wrong answer:
Example 1.
A = [1, 1, 1]
N = len(A)
B = [1, 1]
M = len(B)
K = 5
print(solve(0,0,0,0)) # Output: 4 (which is incorrect)
# Correct answer is 5
Example 2.
A = [1, 1]
B = [1]
N = len(A)
M = len(B)
K = 3
print(solve(0,0,0,0)) # Output: 2 (which is incorrect)
# Correct answer is 3
Alternative Code
def solve(A, B, k):
    def solve_(a_left, a_right, b_left, b_right, remaining_ops, sum_):
        '''
            a_left - left pointer into A
            a_right - right pointer into A
            b_left - left pointer into B
            b_right - right pointer into B
            remaining_ops - remaining operations
            sum_ - sum from previous operations
        '''
        if remaining_ops == 0:
            return sum_      # out of operations
        if a_left > a_right and b_left > b_right:
            return sum_      # both A and B are empty
        if (a_left, a_right, b_left, b_right) in cache:
            return cache[(a_left, a_right, b_left, b_right)]
        max_ = sum_  # init to current sum
        if a_left <= a_right: # A not empty
            max_ = max(max_,
                       solve_(a_left + 1, a_right, b_left, b_right, remaining_ops - 1, sum_ + A[a_left]), # Draw from left of A
                       solve_(a_left, a_right - 1, b_left, b_right, remaining_ops - 1, sum_ + A[a_right])) # Draw from right of A
        if b_left <= b_right: # B not empty
            max_ = max(max_,
                       solve_(a_left, a_right, b_left + 1, b_right, remaining_ops - 1, sum_ + B[b_left]), # Draw from left of B
                       solve_(a_left, a_right, b_left, b_right - 1, remaining_ops - 1, sum_ + B[b_right])) # Draw from right of B
        cache[(a_left, a_right, b_left, b_right)] = max_ # update cache
        return cache[(a_left, a_right, b_left, b_right)]
    cache = {}
    return solve_(0, len(A) - 1, 0, len(B) - 1, k, 0)
Tests
print(solve([3,1,2], [2,8,1,9], 5)) # Output 24
print(solve([1, 1, 1], [1, 1, 1], 5)) # Output 5
The approach fails because the recursive function stops computing further sub-problems when either "AL exceeds AR" or "BL exceeds BR".
We should stop computing and return 0 only when both of them are true. If either "AL exceeds AR" or "BL exceeds BR" is false, one array still has elements left, so the sub-problem can still be solved.
Moreover, one quick optimization: when N+M==K, we get the maximum score by choosing all elements from both arrays.
Here is the corrected code!
A = [3,1,2]
B = [2,8,1,9]
K = 5
N, M = len(A), len(B)
memo = {}
def solve(i,j, AL, BL):
    if (i,j,AL,BL) in memo:
        return memo[(i,j,AL,BL)]
    AR = (N-1)-(i-AL)
    BR = (M-1)-(j-BL)
    if i+j==K or (AL>AR and BL>BR):
        return 0
    ans = -float('inf')
    if AL<=AR:
        ans = max(A[AL]+solve(i+1,j,AL+1,BL),A[AR]+solve(i+1,j,AL,BL),ans)
    if BL<=BR:
        ans = max(B[BL]+solve(i,j+1,AL,BL+1),B[BR]+solve(i,j+1,AL,BL),ans)
    memo[(i,j,AL,BL)] = ans
    return memo[(i,j,AL,BL)]
if N+M==K:
    print(sum(A)+sum(B))
else:
    print(solve(0,0,0,0))
[This answer was written with the help of DarryIG's answer. It is posted to provide code similar to the code in the question body; DarryIG's answer uses a different function prototype.]
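With N and M up to 6000, the memoized recursion above may still visit far too many states. One alternative, offered here only as a sketch and not taken from either answer, relies on the observation that the order of operations does not change the score: only how many elements are taken from each end of each array matters. Taking t elements from the ends of an array leaves a contiguous window of len(arr) - t elements behind, so the best sum for t is the total minus the smallest such window. The names best_from_ends and max_score are hypothetical, and the sketch assumes K ≤ N+M as in the constraints.

```python
def best_from_ends(arr):
    """best[t] = maximum sum of t elements taken from the two ends of arr."""
    n = len(arr)
    total = sum(arr)
    best = [0] * (n + 1)
    best[n] = total
    for t in range(n):
        keep = n - t  # a contiguous window of `keep` elements stays behind
        window = sum(arr[:keep])
        smallest = window
        # Slide the kept window over every possible start position
        for start in range(1, n - keep + 1):
            window += arr[start + keep - 1] - arr[start - 1]
            smallest = min(smallest, window)
        best[t] = total - smallest
    return best


def max_score(A, B, K):
    """Try every split of the K operations between A and B."""
    best_a = best_from_ends(A)
    best_b = best_from_ends(B)
    lo = max(0, K - len(B))  # at least this many must come from A
    hi = min(K, len(A))      # at most this many can come from A
    return max(best_a[i] + best_b[K - i] for i in range(lo, hi + 1))


print(max_score([3, 1, 2], [2, 8, 1, 9], 5))  # 24
print(max_score([1, 1, 1], [1, 1], 5))        # 5
print(max_score([1, 1], [1], 3))              # 3
```

This does roughly N² + M² sliding-window steps plus one pass over the splits, instead of memoizing four-dimensional states.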
Although there are lots of questions that have already been asked and answered regarding heap implementation in python, I was unable to find any practical clarifications about indexes. So, allow me to ask one more heap related question.
I'm trying to write a code that transforms a list of values into a min-heap and saves swapped indexes. Here is what I have so far:
def mins(a, i, res):
    n = len(a)-1
    left = 2 * i + 1
    right = 2 * i + 2
    if not (i >= n//2 and i <= n):
        if (a[i] > a[left] or a[i] > a[right]):
            if a[left] < a[right]:
                res.append([i, left])
                a[i], a[left] = a[left], a[i]
                mins(a, left, res)
            else:
                res.append([i, right])
                a[i], a[right] = a[right], a[i]
                mins(a, right, res)
def heapify(a, res):
    n = len(a)
    for i in range(n//2, -1, -1):
        mins(a, i, res)
    return res
a = [7, 6, 5, 4, 3, 2]
res = heapify(a, [])
print(a)
print(res)
Expected output:
a = [2, 3, 4, 5, 6, 7]
res = [[2, 5], [1, 4], [0, 2], [2, 5]]
What I get:
a = [3, 4, 5, 6, 7, 2]
res = [[1, 4], [0, 1], [1, 3]]
It's clear that there is something wrong with indexation in the above script. Probably something very obvious, but I just don't see it. Help out!
You have some mistakes in your code:
In heapify, the last node that has a child (the first one the backward loop needs to visit) is at index (n - 2)//2, so use that as the start value of the range.
In mins the condition not (i >= n//2 and i <= n) does not make a distinction between the case where the node has just one child or two. And i==n//2 should really be allowed when n is odd. Because then it has a left child. It is so much easier to just compare the value of left and right with n. It is also confusing that in heapify you define n as len(a), while in mins you define it as one less. This is really good for confusing the reader of your code!
To avoid code duplication (the two blocks where you swap), introduce a new variable that is set to either left or right depending on which one has the smaller value.
Here is a correction:
def mins(a, i, res):
    n = len(a)
    left = 2 * i + 1
    right = 2 * i + 2
    if left >= n:
        return
    child = left
    if right < n and a[right] < a[left]:
        child = right
    if a[child] < a[i]:  # need to swap
        res.append([i, child])
        a[i], a[child] = a[child], a[i]
        mins(a, child, res)
def heapify(a, res):
    n = len(a)
    for i in range((n - 2)//2, -1, -1):
        mins(a, i, res)
    return res
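To check the correction, it can be run on the list from the question together with a small heap-property test. The helper is_min_heap is a hypothetical addition for this check; mins and heapify are repeated from the correction so the snippet runs on its own.

```python
def mins(a, i, res):
    n = len(a)
    left, right = 2 * i + 1, 2 * i + 2
    if left >= n:
        return  # node i is a leaf
    child = left
    if right < n and a[right] < a[left]:
        child = right  # pick the smaller of the two children
    if a[child] < a[i]:
        res.append([i, child])
        a[i], a[child] = a[child], a[i]
        mins(a, child, res)


def heapify(a, res):
    for i in range((len(a) - 2) // 2, -1, -1):
        mins(a, i, res)
    return res


def is_min_heap(a):
    # Every node must be >= its parent at index (i - 1) // 2
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


a = [7, 6, 5, 4, 3, 2]
res = heapify(a, [])
print(a)               # [2, 3, 5, 4, 6, 7]
print(res)             # [[2, 5], [1, 4], [0, 2], [2, 5]]
print(is_min_heap(a))  # True
```

The swaps match the expected res from the question. The resulting list is a valid min-heap but not fully sorted, which is normal: the heap property only orders each parent relative to its children.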
I want to generate a list of unique numbers from 0 to 2 million, excluding several numbers. The best solution I came up with is this
excludez = [34, 394849, 2233, 22345, 95995, 2920]
random.sample([i for i in range(0,2000000) if i not in excludez ], 64)
This generates 64 random ints from 0 to 2 million, excluding values in the list excludez.
This builds the full candidate list with a list comprehension, so I am wondering if there is a faster solution. I am open to using any library, especially numpy.
Edit:
The generated samples should contain unique numbers.
Edit 2:
I tested all the solutions using
print(timeit(lambda: solnX(), number=256))
And then did 3 samples of that code.
Here are the average results:
Original:
135.838 seconds
@inspectorG4dget
0.02750687366665261 seconds
@jdehesa 1st solution
150.08836392466674 seconds
(surprising, since it was a numpy solution)
@jdehesa 2nd solution
0.022973252333334433 seconds
@Andrej Kesely
0.016359308333373217 seconds
@Divakar
39.05853628633334 seconds
I timed in google colab, here's a link to the notebook.
I rearranged the code a bit so that all solutions had a level playing field.
https://colab.research.google.com/drive/1ITYNrSTEVR_M5QZhqaSDmM8Q06IHsE73
Here's one with masking -
def random_uniq(excludez, maxnum, num_samples):
    m = np.ones(maxnum, dtype=bool)
    m[excludez] = 0
    c = np.count_nonzero(m)
    idx = np.random.choice(c,num_samples,replace=False)
    m2 = np.ones(c, dtype=bool)
    m2[idx] = 0
    mc = m.copy()
    m[m] = m2
    out = np.flatnonzero(m!=mc)
    return out
excludez = [34, 394849, 2233, 22345, 95995, 2920]
out = random_uniq(excludez, maxnum=2000000, num_samples=64)
In [85]: excludez = set([34, 394849, 2233, 22345, 95995, 2920]) # faster lookups
In [86]: answer = set() # since you don't really care about order
In [87]: while len(answer) < 64:
    ...:     r = random.randrange(0,2000000)
    ...:     if r not in excludez and r not in answer: answer.add(r)
    ...:
This is one method to do it with NumPy:
import numpy as np
np.random.seed(0)
excludez = np.sort([2, 3, 6, 7, 13])
n = 15
size = 5
# Get unique integers in a reduced range
r = np.random.choice(n - len(excludez), size, replace=False)
# Shift values accordingly so excluded values are avoided
shift = np.arange(len(excludez) + 1)
r += shift[np.searchsorted(excludez - shift[:-1], r, 'right')]
print(r)
# [ 4 12 8 14 1]
Here is the same algorithm with plain Python:
import random
import bisect
random.seed(0)
excludez = [2, 3, 6, 7, 13]
n = 15
size = 5
shift = range(len(excludez) + 1)
search = [exc - i for i, exc in enumerate(excludez)]
r = random.sample(range(n - len(excludez)), size)
r = [v + shift[bisect.bisect_right(search, v)] for v in r]
print(r)
# [10, 14, 0, 4, 8]
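The same shifting idea can be wrapped in a reusable function. This is a sketch under the assumption that every excluded value lies in range(n); the name sample_excluding and its parameters are hypothetical. Because shift[k] equals k, the lookup table from the snippet above reduces to adding the bisect result directly.

```python
import bisect
import random


def sample_excluding(n, excluded, size):
    """Return `size` unique ints from range(n), none of them in `excluded`."""
    excluded = sorted(set(excluded))
    # search[k] = how many allowed values lie below excluded[k]
    search = [exc - k for k, exc in enumerate(excluded)]
    # Sample from the reduced range, then skip past the excluded values
    reduced = random.sample(range(n - len(excluded)), size)
    return [v + bisect.bisect_right(search, v) for v in reduced]


excludez = [34, 394849, 2233, 22345, 95995, 2920]
out = sample_excluding(2_000_000, excludez, 64)
print(len(out), len(set(out)), set(out) & set(excludez))
```

Since random.sample draws without replacement and the shift maps the reduced range one-to-one onto the allowed values, the result is unique by construction and no retry loop is needed.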
One possible solution; method2 might contain duplicates, method3 does not:
from timeit import timeit
import random
excludez = [34, 394849, 2233, 22345, 95995, 2920]
def method1():
    return random.sample([i for i in range(0,2000000) if i not in excludez ], 64)
def method2():
    out = []
    while len(out) < 64:
        i = int(random.random() * 2000000)
        if i in excludez:
            continue
        out.append(i)
    return out
def method3():
    out = []
    while len(out) < 64:
        i = int(random.random() * 2000000)
        if i in excludez or i in out:
            continue
        out.append(i)
    return out
print(timeit(lambda: method1(), number=10))
print(timeit(lambda: method2(), number=10))
print(timeit(lambda: method3(), number=10))
Prints:
1.865599181000107
0.0002175730000999465
0.00039564000007885625
EDIT: Added int()
def fun_lst(lst, a, b):
    if min(lst)<b and max(lst)>a:
        return True
    return False
How do I check whether all the values in the list are bigger than a and smaller than b? I tried the above, but for fun_lst([-1, 3.5, 6], -2.4, 0) the function returns True when it is supposed to return False.
Do this:
def fun_lst(lst, a, b):
    if min(lst) > a and max(lst) < b:
        return True
    return False
print(fun_lst([-1, 3.5, 6], -2.4, 0))
Output:
False
Doing min(lst) > a ensures every element is bigger than a.
Doing max(lst) < b ensures every element is smaller than b.
Alternate solution:
def fun_lst(lst, a, b):
    return all(a < elem < b for elem in lst)
You can try this one liner if you like
all([num < b and num > a for num in lst])
This code checks each item in the list. If it finds an item that is not greater than a and less than b, it returns False; otherwise it returns True.
def fun_lst(lst, a, b):
    for item in lst:
        if not a < item < b:
            return False
    return True
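One difference between the min/max version and the all()/loop versions shows up on an empty list. The short comparison below uses two hypothetical names, in_range_minmax and in_range_all, for the two styles shown in the answers.

```python
def in_range_minmax(lst, a, b):
    # min() and max() need at least one element
    return min(lst) > a and max(lst) < b


def in_range_all(lst, a, b):
    # all() over an empty iterable is True (vacuously)
    return all(a < x < b for x in lst)


print(in_range_all([], 0, 1))
try:
    in_range_minmax([], 0, 1)
except ValueError as exc:
    print("min/max version raises ValueError:", exc)
```

Which behaviour is preferable depends on the caller; if empty input is possible, it is worth deciding explicitly whether it should count as "all in range".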
myList = [1,2,3,4,5,6,7,8,9]
lower_limit = 3
upper_limit = 8
bool_output = all([i > lower_limit and i < upper_limit for i in myList])
print(bool_output)
False
myList = [1,2,3,4,5,6,7,8,9]
lower_limit = 0
upper_limit = 10
bool_output = all([i > lower_limit and i < upper_limit for i in myList])
print(bool_output)
True
you should try this:
def fun_lst(lst, a, b):
    return all(n > a and n < b for n in lst)
If you are able to use NumPy, try this:
In [1]: import numpy as np
In [3]: np.array(b)
Out[3]: array([ 3, 1, 4, 66, 8, 3, 4, 56])
In [17]: b[(2<b) & (b<5)]
Out[17]: array([3, 4, 3, 4])
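The session above filters the array rather than answering the yes/no question. To get a single boolean with NumPy, the same mask can be reduced with np.all. This is a small sketch; the name fun_lst_np is hypothetical.

```python
import numpy as np


def fun_lst_np(lst, a, b):
    arr = np.asarray(lst)
    # Element-wise mask, then reduce it to one Python bool
    return bool(np.all((arr > a) & (arr < b)))


print(fun_lst_np([-1, 3.5, 6], -2.4, 0))  # False
print(fun_lst_np([3, 4, 3, 4], 2, 5))     # True
```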
A different method:
def fun_lst(lis, x, y):
    checks = [x < i < y for i in lis]
    return False not in checks
A simpler one (both the minimum and the maximum have to be checked):
def fun_lst(lis, x, y):
    return x < min(lis) and max(lis) < y
lambda version:
fun_lst = lambda lis, x, y: x < min(lis) and max(lis) < y
Outputs:
fun_lst([-1, 3.5, 6], -2.4, 0) #Output: False