Codility - Tape Equilibrium training using Python

A non-empty zero-indexed array A consisting of N integers is given. Array A represents numbers on a tape. Any integer P, such that 0 < P < N, splits this tape into two non-empty parts: A[0], A[1], ..., A[P − 1] and A[P], A[P + 1], ..., A[N − 1]. The difference between the two parts is the value of:
|(A[0] + A[1] + ... + A[P − 1]) − (A[P] + A[P + 1] + ... + A[N − 1])|
In other words, it is the absolute difference between the sum of the first part and the sum of the second part.
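For example, for A = [3, 1, 2, 4, 3] the total is 13 and the four possible splits give differences |3 − 10| = 7, |4 − 9| = 5, |6 − 7| = 1 and |10 − 3| = 7, so the minimum achievable difference is 1.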
def solution(A):
    N = len(A)
    my_list = []
    for i in range(1, N):
        first_tape = sum(A[:i - 1]) + A[i]
        second_tape = sum(A[i - 1:]) + A[i]
        difference = abs(first_tape - second_tape)
        my_list.append(difference)
    print(min(my_list))
    return min(my_list)
My solution gets 100% on Correctness but 0% on Performance.
I think it is supposed to be O(N) but my time complexity is O(N*N).
Can anyone please give me some advice?

You can change your code to something like below to get O(N) complexity. Since the sum of the right part is s - left_sum, the difference |left_sum - (s - left_sum)| equals |2*left_sum - s|, so one running sum is enough:
def solution(A):
    s = sum(A)
    m = float('inf')
    left_sum = 0
    for i in A[:-1]:
        left_sum += i
        m = min(abs(s - 2*left_sum), m)
    return m
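For instance, with the example array from the problem statement above, solution([3, 1, 2, 4, 3]) returns 1.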

Functional approach, as @darkvalance wrote, but with comments:
from itertools import accumulate

def solution(A):
    array_sum = sum(A)  # saving the sum of all elements to keep O(n) complexity
    # accumulate returns the accumulated sums,
    # e.g. for input [3, 1, 2, 4] it returns [3, 4, 6, 10]
    # we are passing a copy of the array without the last element;
    # including the last element doesn't make sense, because
    # list(accumulate(A))[-1] == array_sum
    accumulated_list = accumulate(A[:-1])
    return min([abs(2*x - array_sum) for x in accumulated_list])

To answer your question: it is O(N*N) because the sum() function is O(N) and you are calling it inside a for loop over N elements, which is also O(N).
So the resulting time complexity of the algorithm is O(N*N).

My Java code, O(N):
class Solution {
    public int solution(int[] arr) {
        int sum = 0;
        for (int i = 0; i < arr.length; i++) {
            sum = sum + arr[i];
        }
        // Integer.MAX_VALUE is a safer sentinel than the original magic
        // constant 100000, which could be smaller than the actual minimum
        int minSum = Integer.MAX_VALUE;
        int tempSum = 0;
        int previousSum = 0;
        for (int i = 0; i < arr.length - 1; i++) {
            previousSum = previousSum + arr[i];
            tempSum = Math.abs(previousSum - (sum - previousSum));
            if (minSum > tempSum) {
                minSum = tempSum;
            }
        }
        return minSum;
    }
}

My Python code, O(N):
def solution(A):
    # write your code in Python 3.6
    mini = float('inf')
    check = A[0]
    total = sum(A) - check
    for i in range(1, len(A)):
        diff = abs(check - total)
        total -= A[i]
        check += A[i]
        if diff < mini:
            mini = diff
    return mini

Functional approach, O(N). accumulate provides the cumulative running sums of a list. We can compute the difference between the two parts from the total sum of the list and the cumulative sum at each split point.
from itertools import accumulate

def solution(A):
    s = sum(A)
    l = list(accumulate(A[:-1]))
    return min([abs(2*x - s) for x in l])

def solution(A):
    res = []
    left_sum = 0
    right_sum = sum(A)
    for i in range(0, len(A) - 1):
        left_sum += A[i]
        right_sum = right_sum - A[i]
        res.append(abs(right_sum - left_sum))
    return min(res)

Currently you are calculating the sum again and again for first_tape and second_tape. What you need to do is store the total sum once, and then calculate each part's sum by difference. So, e.g., if your array is [1, 2, 3, 4], the total sum would be 10. Let's assume your first_tape is of size 1, or in other words your first_tape is [1], so the sum of the first tape would be 1. Then the sum of the remaining second tape would be
`total sum - first_tape sum`
and the difference would be
first_tape sum - (total sum - first_tape sum)
You can calculate the first_tape sum within the same loop by doing something like:
previous_sum += i  (where i is the current array element)
So the solution is of order N.
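A minimal sketch of that idea (variable names are mine, not from the answer above):
def solution(A):
    total = sum(A)              # computed once, O(N)
    first = 0
    best = float('inf')
    for x in A[:-1]:            # every split point except past the last element
        first += x              # running sum of the first tape
        second = total - first  # second tape's sum by difference
        best = min(best, abs(first - second))
    return best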

Here I also add a check for when an element of the array is 0:
def MinimialDiff(A):
    if len(A) == 2:
        return abs(A[0] - A[1])
    tot_sum = sum(A)
    min_value = float('inf')
    left_sum = 0
    for x in range(0, len(A) - 1):
        # a 0 element leaves the difference unchanged, so that split is
        # redundant -- except at x == 0, where there is no previous split yet
        # (the original skipped it too, missing e.g. A = [0, 1, -1])
        if x > 0 and A[x] == 0:
            continue
        left_sum += A[x]
        temp = abs(2*left_sum - tot_sum)
        min_value = min(min_value, temp)
    return min_value

This is my original solution for the Tape Equilibrium problem:
def solution(A):
    import numpy as np
    # Check if the supplied array is empty or single element
    if len(A) < 2:
        return -1
    # Otherwise, create two NumPy arrays of (non-linear) accumulated sums:
    # All but last, start to end
    Array_Sum_Accumulated = np.array(list(np.cumsum(A[0:-1:1]))[::1])
    # All but first, end to start
    Array_Sum_Acc_Reversed = np.array(list(np.cumsum(A[-1:0:-1]))[::-1])
    # Array of absolute differences using fast (precompiled) and simple NumPy magic
    Array_Sum_Difference = abs(Array_Sum_Accumulated - Array_Sum_Acc_Reversed)
    # For debugging only
    if len(A) <= 20:
        print("%s\n%s\n%s" % (Array_Sum_Accumulated, Array_Sum_Acc_Reversed, Array_Sum_Difference))
    return min(Array_Sum_Difference)
Unfortunately, Codility does not permit import of the NumPy module for this particular lesson. Hence, this is the solution, importing the IterTools module instead of NumPy, that (finally) yielded a 100% result across the board:
def solution(A):
    from itertools import accumulate
    # Check if the supplied array is empty or single element
    if len(A) < 2:
        return -1
    # If only two elements, return their absolute difference
    if len(A) == 2:
        return abs(A[0] - A[1])
    # Otherwise, create two lists of (non-linear) accumulated sums:
    # All but last, start to end
    Array_Sum_Accumulated = list(accumulate(A[0:-1:1]))[::1]
    # All but first, end to start
    Array_Sum_Acc_Reversed = list(accumulate(A[-1:0:-1]))[::-1]
    # List of absolute differences using the slower (interpreted) loop
    Array_Sum_Difference = []
    for i in range(0, len(Array_Sum_Accumulated)):
        Array_Sum_Difference.append(abs(Array_Sum_Accumulated[i] - Array_Sum_Acc_Reversed[i]))
    # For debugging only
    if len(A) <= 20:
        print("%s\n%s\n%s" % (Array_Sum_Accumulated, Array_Sum_Acc_Reversed, Array_Sum_Difference))
    return min(Array_Sum_Difference)
Thanks to darkvalance for his IterTools solution, and to TenaciousRaptor for the (very enlightening) clarification of the logic used.
Thanks also to Jun Jang for attempting the Two Split Tapes solution, which shows that the non-linear accumulation can provide multiple 'pairs of tapes', because the same minimum absolute difference can appear at multiple equilibrium points on 'the tape'.
The IterTools solution provided by darkvalance not only provides exactly the same results, it looks extremely Pythonic and outperforms the three-NumPy-arrays solution in more than 97% of tests (after 100,000 tests of arrays of 100,000 elements).
Congratulations. I hope that one day my code will look something like yours.

This code snippet is also a possible solution:
def solution(A):
    # write your code in Python 3.6
    d = []
    for i in range(1, len(A)):
        d.append(abs(sum(A[:i]) - sum(A[i:])))
    return min(d)  # note: the original's list(set(d))[0] returns an arbitrary value, not the minimum

Related

Taking equal number of elements from two arrays, such that the taken values have as few duplicates as possible

Consider we have 2 arrays of size N, with their values in the range [0, N-1]. For example:
a = np.array([0, 1, 2, 0])
b = np.array([2, 0, 3, 3])
I need to produce a new array c which contains exactly N/2 elements from a and b respectively, i.e. the values must be taken evenly/equally from both parent arrays.
(For odd length, this would be (N-1)/2 and (N+1)/2. Can also ignore odd length case, not important).
Taking equal number of elements from two arrays is pretty trivial, but there is an additional constraint: c should have as many unique numbers as possible / as few duplicates as possible.
For example, a solution to a and b above is:
c = np.array([b[0], a[1], b[2], a[3]])
>>> c
array([2, 1, 3, 0])
Note that the position/order is preserved: each element of a and b that we took to form c stays in its original position. If element i in c is from a, then c[i] == a[i]; same for b.
A straightforward solution for this is simply a sort of path traversal, easy enough to implement recursively:
def traverse(i, a, b, path, n_a, n_b, best, best_path):
    if n_a == 0 and n_b == 0:
        score = len(set(path))
        return (score, path.copy()) if score > best else (best, best_path)
    if n_a > 0:
        path.append(a[i])
        best, best_path = traverse(i + 1, a, b, path, n_a - 1, n_b, best, best_path)
        path.pop()
    if n_b > 0:
        path.append(b[i])
        best, best_path = traverse(i + 1, a, b, path, n_a, n_b - 1, best, best_path)
        path.pop()
    return best, best_path
Here n_a and n_b are how many values we will take from a and b respectively, it's 2 and 2 as we want to evenly take 4 items.
>>> score, best_path = traverse(0, a, b, [], 2, 2, 0, None)
>>> score, best_path
(4, [2, 1, 3, 0])
Is there a way to implement the above in a more vectorized/efficient manner, possibly through numpy?
The algorithm is slow mainly because it runs in exponential time. There is no straightforward way to vectorize this algorithm using only Numpy because of the recursion. Even if it were possible, the huge number of combinations would cause most Numpy implementations to be inefficient (due to the large Numpy arrays to compute). Additionally, there is AFAIK no vectorized operation to count the number of unique values of many rows efficiently (the usual way is to use np.unique, which is not efficient in this case and cannot be used without a loop). As a result, there are two possible strategies to speed this up:
trying to find an algorithm with a reasonable complexity (e.g. <= O(n^4));
using compilation methods, micro-optimizations and tricks to write a faster brute-force implementation.
Since finding a correct sub-exponential algorithm turns out not to be easy, I chose the second approach (though the first would be the better one).
The idea is to:
remove the recursion by generating all possible solutions using a loop iterating over integers;
write a fast way to count the unique items of an array;
use the Numba JIT compiler to optimize code that is only efficient once compiled.
Here is the final code:
import numpy as np
import numba as nb

# Naive way to count unique items.
# This is a slow fallback implementation.
@nb.njit
def naive_count_unique(arr):
    count = 0
    for i in range(len(arr)):
        val = arr[i]
        found = False
        for j in range(i):
            if arr[j] == val:
                found = True
                break
        if not found:
            count += 1
    return count

# Optimized way to count unique items on small arrays.
# Counts items 2 by 2.
# Fast on small arrays.
@nb.njit
def optim_count_unique(arr):
    count = 0
    for i in range(0, len(arr), 2):
        if arr[i] == arr[i+1]:
            tmp = 1
            for j in range(i):
                if arr[j] == arr[i]: tmp = 0
            count += tmp
        else:
            val1, val2 = arr[i], arr[i+1]
            tmp1, tmp2 = 1, 1
            for j in range(i):
                val = arr[j]
                if val == val1: tmp1 = 0
                if val == val2: tmp2 = 0
            count += tmp1 + tmp2
    return count

@nb.njit
def count_unique(arr):
    if len(arr) % 2 == 0:
        return optim_count_unique(arr)
    else:
        # Odd case: not optimized yet
        return naive_count_unique(arr)

# Count the number of bits in a 32-bit integer
# See https://stackoverflow.com/questions/71097470/msb-lsb-popcount-in-numba
@nb.njit('int_(uint32)', inline='always')
def popcount(v):
    v = v - ((v >> 1) & 0x55555555)
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)
    c = np.uint32((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24
    return c

# Count the number of bits in a 64-bit integer
@nb.njit(inline='always')
def bit_count(n):
    if n < (1 << 30):
        return popcount(np.uint32(n))
    else:
        return popcount(np.uint32(n)) + popcount(np.uint32(n >> 32))

# Mutate `out` so as not to create an expensive new temporary array
@nb.njit
def int_to_path(n, out, a, b):
    for i in range(len(out)):
        out[i] = a[i] if ((n >> i) & 1) else b[i]

@nb.njit(['(int32[:], int32[:], int64, int64)', '(int64[:], int64[:], int64, int64)'])
def traverse_fast(a, b, n_a, n_b):
    # This assertion is needed because the paths are encoded using 64 bits.
    # This should not be a problem in practice since the number of solutions
    # to test would be impracticably huge for this algorithm anyway.
    assert n_a + n_b < 62
    max_iter = 1 << (n_a + n_b)
    path = np.empty(n_a + n_b, dtype=a.dtype)
    score, best_score, best_i = 0, 0, 0
    # Iterate over all cases (more than the set of possible solutions)
    for i in range(max_iter):
        # Filter the possible solutions
        if bit_count(i) != n_b:
            continue
        # Analyse the score of the solution
        int_to_path(i, path, a, b)
        score = count_unique(path)
        # Store it if it is better than the previous one
        if score > best_score:
            best_score = score
            best_i = i
    int_to_path(best_i, path, a, b)
    return best_score, path
This implementation is about 30 times faster on arrays of size 8 on my machine. One could use several cores to speed this up even further. However, I think it is better to focus on finding a sub-exponential implementation so as to avoid wasting more computing resources. Note that the path is different from the one of the initial function, but the score is the same on random arrays. This can help others to test their implementation on larger arrays without waiting for a long time.
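For reference, a small usage sketch of my own (assuming the example arrays from the question; per the Numba signatures above, traverse_fast expects int32 or int64 NumPy arrays):
a = np.array([0, 1, 2, 0], dtype=np.int64)
b = np.array([2, 0, 3, 3], dtype=np.int64)
score, path = traverse_fast(a, b, 2, 2)  # take 2 elements from each array
print(score, path)                       # expected score: 4, e.g. a path like [2 1 3 0]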
Test this heavily.
import numpy as np
from numpy.random import default_rng  # public API (the original imported from numpy.random._generator)

rand = default_rng(seed=1)
n = 16
a = rand.integers(low=0, high=n, size=n)
b = rand.integers(low=0, high=n, size=n)
uniques = np.setxor1d(a, b)
print(a)
print(b)
print(uniques)

def limited_uniques(arr: np.ndarray) -> np.ndarray:
    choose = np.zeros(shape=n, dtype=bool)
    _, idx, _ = np.intersect1d(arr, uniques, return_indices=True)
    idx = idx[:n//2]
    choose[idx] = True
    n_missing = n//2 - len(idx)
    counts = choose.cumsum()
    diffs = np.arange(n) - counts
    at = np.searchsorted(diffs, n_missing)
    choose[:at] = True
    return arr[choose]

a_half = limited_uniques(a)
uniques = np.union1d(uniques, np.setdiff1d(a, a_half))
interleaved = np.empty_like(a)
interleaved[0::2] = a_half
interleaved[1::2] = limited_uniques(b)
print(interleaved)
[ 7 8 12 15 0 2 13 15 3 4 13 6 4 13 4 6]
[10 8 1 0 13 12 13 8 13 5 7 12 1 4 1 7]
[ 1 2 3 5 6 10 15]
[ 7 10 8 8 12 1 15 0 0 13 2 12 3 5 6 4]

Maximum Score from Two Arrays | Which Test Case is this approach missing?

Problem Statement
Given two integer arrays A and B of size N and M respectively. You begin with a score of 0. You want to perform exactly K operations. On the iᵗʰ operation (1-indexed), you will:
Choose one integer x from either the start or the end of any one array, A or B. Remove it from that array.
Add x to score.
Return the maximum score after performing K operations.
Example
Input: A = [3,1,2], B = [2,8,1,9] and K=5
Output: 24
Explanation: An optimal solution is as follows:
Choose from end of B, add 9 to score. Remove 9 from B
Choose from start of A, add 3 to score. Remove 3 from A
Choose from start of B, add 2 to score. Remove 2 from B
Choose from start of B, add 8 to score. Remove 8 from B
Choose from end of A, add 2 to score. Remove 2 from A
The total score is 9+3+2+8+2 = 24
Constraints
1 ≤ N ≤ 6000
1 ≤ M ≤ 6000
1 ≤ A[i] ≤ 10^9
1 ≤ B[i] ≤ 10^9
1 ≤ K ≤ N+M
My Approach
Since the greedy approach [choosing the maximum end from both arrays] fails here [it produces a conflict when the maximum ends of both arrays are the same], it suggests we have to look at all possible combinations. There will be overlapping sub-problems, hence DP!
Here is the Python reprex code for the same.
A = [3,1,2]
N = len(A)
B = [2,8,1,9]
M = len(B)
K = 5
memo = {}

def solve(i, j, AL, BL):
    if (i, j, AL, BL) in memo:
        return memo[(i, j, AL, BL)]
    AR = (N-1) - (i-AL)
    BR = (M-1) - (j-BL)
    if AL > AR or BL > BR or i+j == K:
        return 0
    op1 = A[AL] + solve(i+1, j, AL+1, BL)
    op2 = B[BL] + solve(i, j+1, AL, BL+1)
    op3 = A[AR] + solve(i+1, j, AL, BL)
    op4 = B[BR] + solve(i, j+1, AL, BL)
    memo[(i, j, AL, BL)] = max(op1, op2, op3, op4)
    return memo[(i, j, AL, BL)]

print(solve(0, 0, 0, 0))
In brief,
i indicates that we have performed i operations from A
j indicates that we have performed j operations from B
The total number of operations is thus i+j
AL is the index to the left of which all integers of A have been used. Similarly, AR is the index to the right of which all integers of A have been used.
BL is the index to the left of which all integers of B have been used. Similarly, BR is the index to the right of which all integers of B have been used.
We are trying out all possible combinations, choosing the maximum of them at each step, and memoizing our answers.
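As a quick sanity check of that encoding (an example of my own, not from the question): with N = 3, after i = 2 operations on A of which AL = 1 came from the left, one came from the right, so AR = (N-1) - (i-AL) = 2 - 1 = 1, and only A[1] is still available.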
Doubt
The code worked fine for several test cases, but it also failed for a few. The verdict was Wrong Answer, meaning there was no Time Limit Exceeded, Memory Limit Exceeded, Syntax Error or Runtime Error. This means there is some logical error only.
Can anyone help in identifying those test cases? And also in understanding the intuition/reason behind why this approach fails in some cases?
Examples where the posted code gives the wrong answer:
Example 1.
A = [1, 1, 1]
N = len(A)
B = [1, 1]
M = len(B)
K = 5
print(solve(0, 0, 0, 0))  # Output: 4 (which is incorrect)
# Correct answer is 5
Example 2.
A = [1, 1]
B = [1]
N = len(A)
M = len(B)
K = 3
print(solve(0, 0, 0, 0))  # Output: 2 (which is incorrect)
# Correct answer is 3
Alternative Code
def solve(A, B, k):
    def solve_(a_left, a_right, b_left, b_right, remaining_ops, sum_):
        '''
        a_left - left pointer into A
        a_right - right pointer into A
        b_left - left pointer into B
        b_right - right pointer into B
        remaining_ops - remaining operations
        sum_ - sum from previous operations
        '''
        if remaining_ops == 0:
            return sum_  # out of operations
        if a_left > a_right and b_left > b_right:
            return sum_  # both arrays are empty
        if (a_left, a_right, b_left, b_right) in cache:
            return cache[(a_left, a_right, b_left, b_right)]
        max_ = sum_  # init to current sum
        if a_left <= a_right:  # A not empty
            max_ = max(max_,
                       solve_(a_left + 1, a_right, b_left, b_right, remaining_ops - 1, sum_ + A[a_left]),   # draw from left of A
                       solve_(a_left, a_right - 1, b_left, b_right, remaining_ops - 1, sum_ + A[a_right]))  # draw from right of A
        if b_left <= b_right:  # B not empty
            max_ = max(max_,
                       solve_(a_left, a_right, b_left + 1, b_right, remaining_ops - 1, sum_ + B[b_left]),   # draw from left of B
                       solve_(a_left, a_right, b_left, b_right - 1, remaining_ops - 1, sum_ + B[b_right]))  # draw from right of B
        cache[(a_left, a_right, b_left, b_right)] = max_  # update cache
        return cache[(a_left, a_right, b_left, b_right)]

    cache = {}
    return solve_(0, len(A) - 1, 0, len(B) - 1, k, 0)
Tests
print(solve([3,1,2], [2,8,1,9], 5))   # Output 24
print(solve([1, 1, 1], [1, 1, 1], 5)) # Output 5
The approach is failing because the recursive function stops computing further sub-problems when either "AL exceeds AR" or "BL exceeds BR".
We should stop computing and return 0 only when both of them are true. If either "AL exceeds AR" or "BL exceeds BR" evaluates to false, it means we can still solve that sub-problem.
Moreover, one quick optimization: when N+M == K, we can get the maximum score by simply choosing all elements from both arrays.
Here is the correct code!
A = [3,1,2]
B = [2,8,1,9]
K = 5
N, M = len(A), len(B)
memo = {}

def solve(i, j, AL, BL):
    if (i, j, AL, BL) in memo:
        return memo[(i, j, AL, BL)]
    AR = (N-1) - (i-AL)
    BR = (M-1) - (j-BL)
    if i+j == K or (AL > AR and BL > BR):
        return 0
    ans = -float('inf')
    if AL <= AR:
        ans = max(A[AL] + solve(i+1, j, AL+1, BL), A[AR] + solve(i+1, j, AL, BL), ans)
    if BL <= BR:
        ans = max(B[BL] + solve(i, j+1, AL, BL+1), B[BR] + solve(i, j+1, AL, BL), ans)
    memo[(i, j, AL, BL)] = ans
    return memo[(i, j, AL, BL)]

if N+M == K:
    print(sum(A) + sum(B))
else:
    print(solve(0, 0, 0, 0))
[This answer was written with help from DarryIG's answer. The reason for publishing a separate answer is to keep the code similar to the code in the question body; DarryIG's answer used a different prototype for the function.]

How to make nested list behave like numpy array?

I'm trying to implement an algorithm to count subsets with a given sum in Python, which is:
import numpy as np

maxN = 20
maxSum = 1000
minSum = 1000
base = 1000
dp = np.zeros((maxN, maxSum + minSum))
v = np.zeros((maxN, maxSum + minSum))

# Function to return the required count
def findCnt(arr, i, required_sum, n):
    # Base case
    if i == n:
        if required_sum == 0:
            return 1
        else:
            return 0
    # If the state has been solved before,
    # return the value of the state
    if v[i][required_sum + base]:
        return dp[i][required_sum + base]
    # Setting the state as solved
    v[i][required_sum + base] = 1
    # Recurrence relation
    dp[i][required_sum + base] = findCnt(arr, i + 1, required_sum, n) + findCnt(arr, i + 1, required_sum - arr[i], n)
    return dp[i][required_sum + base]

arr = [2, 2, 2, 4]
n = len(arr)
k = 4
print(findCnt(arr, 0, k, n))
And it gives the expected result, but I was asked not to use numpy, so I replaced the numpy arrays with nested lists like this:
#dp = np.zeros((maxN, maxSum + minSum)) replaced by
dp = [[0]*(maxSum + minSum)]*maxN
#v = np.zeros((maxN, maxSum + minSum)) replaced by
v = [[0]*(maxSum + minSum)]*maxN
but now the program always gives me 0 in the output. I think this is because of some behavior difference between numpy arrays and nested lists, but I don't know how to fix it.
EDIT :
thanks to @venky__, who provided this solution in the comments:
[[0 for i in range( maxSum + minSum)] for i in range(maxN)]
and it worked, but I still don't understand the difference between it and what I was doing before. I tried:
print( [[0 for i in range( maxSum + minSum)] for i in range(maxN)] == [[0]*(maxSum + minSum)]*maxN )
And the result is True, so how was this able to fix the problem?
It turns out that I was using nested lists the wrong way to represent 2D arrays: Python was not creating separate objects, so the same sub-list indexes were referring to the same objects. For a better explanation, please read this.
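A quick demonstration of the aliasing (a minimal sketch of my own):
rows = [[0] * 3] * 2                 # both rows are the SAME inner list object
rows[0][0] = 1
print(rows)                          # [[1, 0, 0], [1, 0, 0]] -- the write appears in "both" rows
safe = [[0] * 3 for _ in range(2)]   # a fresh inner list per row
safe[0][0] = 1
print(safe)                          # [[1, 0, 0], [0, 0, 0]]
The two constructions compare equal because == compares values, not identity: rows[0] is rows[1] is True, while safe[0] is safe[1] is False.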

How can I accelerate the array assignment in python?

I am trying to do an array assignment in Python, but it is very slow. Is there any way to accelerate it?
simi_matrix_img = np.zeros((len(annot), len(annot)), dtype='float16')
for i in range(len(annot)):
    for j in range(i + 1):
        score = 0
        times = 0
        if i != j:
            x_idx = [p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5]
            y_idx = [p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5]
            for idx in itertools.product(x_idx, y_idx):
                score += simi_matrix_word[idx]
                times += 1
                simi_matrix_img[i, j] = score/times
        else:
            simi_matrix_img[i, j] = 1.0
"annot" is a numpy array. Is there any way to accelerate it?
I think the indent for this line is wrong:
simi_matrix_img[i, j] = score/times
You want to perform that assignment after all the product iterations. But since it's the last assignment that takes effect, the results will be the same.
Here's a partial reworking of your code:
def foo1(annot, simi_matrix_word):
    N = annot.shape[0]
    simi_matrix_img = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1):
            if i != j:
                x_idx = np.nonzero(annot[i])[0]
                y_idx = np.nonzero(annot[j])[0]
                idx = np.ix_(x_idx, y_idx)
                # print(idx, simi_matrix_word[idx])
                score = simi_matrix_word[idx].mean()
                simi_matrix_img[i, j] = score
            else:
                simi_matrix_img[i, j] = 1.0
    return simi_matrix_img
For a small test case, it returns the same thing:
annot = np.array([[1,0,1],[0,1,1]])
simi_matrix_word = np.arange(12, dtype=float).reshape(3,4)
[[ 1.  0.]
 [ 7.  1.]]
That gets rid of all the inner iterations. The next step would be to reduce the outer iterations. For example, start with np.eye(N), and just iterate on the lower triangular indices:
In [169]: np.eye(2)
Out[169]:
array([[ 1., 0.],
[ 0., 1.]])
In [170]: np.tril_indices(2,-1)
Out[170]: (array([1]), array([0]))
Note that for a 2 row annot, we are only calculating one score, at [1,0].
Replacing nonzero with boolean indexing:
def foo3(annot, simi_matrix_word):
    N = annot.shape[0]
    A = annot.astype(bool)
    simi_matrix_img = np.eye(N, dtype=float)
    for i, j in zip(*np.tril_indices(N, -1)):
        score = simi_matrix_word[A[i], :][:, A[j]]
        simi_matrix_img[i, j] = score.mean()
    return simi_matrix_img
or this might speed up the indexing a bit:
def foo4(annot, simi_matrix_word):
    N = annot.shape[0]
    A = annot.astype(bool)
    simi_matrix_img = np.eye(N, dtype=float)
    for i in range(1, N):
        x = simi_matrix_word[A[i], :]
        for j in range(i):
            score = x[:, A[j]]
            simi_matrix_img[i, j] = score.mean()
    return simi_matrix_img
Since the number of nonzero values for each row of annot can differ, the number of terms that are summed for each score also differs. That strongly suggests that further vectorization is impossible.
(1) You could use generators instead of list comprehensions where possible. For example:
x_idx = (p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5)
y_idx = (p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5)
With this, you iterate only once over those items (in for idx in itertools.product(x_idx, y_idx)), as opposed to twice (once when constructing the list, then again in said for loop).
(2) What Python are you using? If <3, I have a hunch that a significant part of the problem is you're using range(), which can be expensive in connection with really large ranges (as I'm assuming you're using here). In Python 2.7, range() actually constructs lists (not so in Python 3), which can be an expensive operation. Try achieving the same result using a simple while loop. For example, instead of for i in range(len(annot)), do:
i = 0
while i < len(annot):
    ... do stuff with i ...
    i += 1
(3) Why call len(annot) so many times? It doesn't seem like you're mutating annot. Although len(annot) is a fast O(1) operation, you could store the length in a variable, e.g. annot_len = len(annot), and then just reference that. It wouldn't shave off much though, I'm afraid.

List comprehension to return the sum of n/2...?

Basically, how do I write the same function as a list comprehension?
def blah(n):
    if n <= 1:
        return 1
    return n + blah(n/2)

print blah(32)
I don't really need this for anything other than proving to myself that a custom step for any range in a list comprehension is actually possible.
import math

def lcsum(n):
    return sum([n >> i for i in range(int(math.log(n, 2)) + 1)])
You'd need to generate the sequence of halved numbers:
def halved(n):
    while n:
        yield n
        n >>= 1
Then turn that into a list:
list(halved(32))
or just directly sum it:
sum(halved(32))
You'd have to use math.log() to turn that into a range()-suitable value:
import math
sum(n >> i for i in range(int(math.log(n, 2)) + 1))
I would write it like this, if you really wanted some kind of list comprehension in there:
import math

def sumOfNHalf(n):
    return sum([2**x for x in range(0, int(math.log(n, 2) + 1))])
