I have a coding challenge next week as the first round interview. The HR said they will use Codility as the coding challenge platform. I have been practicing using the Codility Lessons.
My issue is that I often get a very high score on Correctness, but my Performance score, which measure time complexity, is horrible (I often get 0%).
Here's the question:
https://app.codility.com/programmers/lessons/5-prefix_sums/passing_cars/
My code is:
def solution(A):
N = len(A)
my_list = []
count = 0
for i in range(N):
if A[i] == 1:
continue
else:
my_list = A[i + 1:]
count = count + sum(my_list)
print(count)
return count
It is supposed to be O(N) but mine is O(N**2).
How can someone approach this question to solve it under the O(N) time complexity?
In general, when you look at an algorithm question, how do you come up with an approach?
You should not sum the entire array each time you find a zero. That makes it O(n^2). Instead note that every zero found will give a +1 for each following one:
def solution(A):
zeros = 0
passing = 0
for i in A:
if i == 0:
zeros += 1
else:
passing += zeros
return passing
You may check all codility solutions as well as passingcars example.
Don’t forget the 1000000000 limit.
def solution(a):
pc=0
fz=0
for e in a:
if pc>1000000000:
return -1
if e==0:
fz+=1
else:
pc+=fz
return pc
Regarding the above answer, it is correct but missing checking the cases where passing exceeds 1000000000.
Also, I found a smarter and simple way to count the pairs of cars that could be passed where you just count all existing ones inside the array from the beginning and inside the loop when you find any zero, you say Ok we could possibly pair all the ones with this zero (so we increase the count) and as we are already looping through the whole array, if we find one then we can simply remove that one from ones since we will never need it again.
It takes O(N) as a time complexity since you just need to loop once in the array.
def solution(A):
ones = A.count(1)
c = 0
for i in range(0, len(A)):
if A[i] == 0:
c += ones
else: ones -= 1
if c > 1000000000:
return -1
return c
In a loop, find indices of zeros in list and nb of ones in front of first zero. In another loop, find the next zero, reduce nOnesInFront by the index difference between current and previous zero
def solution(A):
count = 0; zeroIndices = []
nOnesInFront = 0; foundZero = False
for i in range(len(A)):
if A[i] == 0:
foundZero = True
zeroIndices.append(i)
elif foundZero: nOnesInFront += 1;
if nOnesInFront == 0: return 0 #no ones in front of a zero
if not zeroIndices: return 0 #no zeros
iPrev = zeroIndices[0]
count = nOnesInFront
for i in zeroIndices[1:]:
nOnesInFront -= (i-iPrev) - 1 #decrease nb of ones by the differnce between current and previous zero index
iPrev = i
count += nOnesInFront
if count > 1000000000: return -1
else: return count
Here are some tests cases to verify the solution:
print(solution([0, 1, 0, 1, 1])) # 5
print(solution([0, 1, 1, 0, 1])) # 4
print(solution([0])) # 0
print(solution([1])) # 0
print(solution([1, 0])) # 0
print(solution([0, 1])) # 1
print(solution([1, 0, 0])) # 0
print(solution([1, 0, 1])) # 1
print(solution([0, 0, 1])) # 2
Related
I just recently started learning Python. Tried to solve a problem where you are given an array of integers and you need to find three of them with the sum closest to target.
My idea was to sort the array and then create two pointers - one at the start of the array, moving right, and one at the end of the array, moving left. And the third pointer is supposed to move along the entire array, while the program calculates and uses the sum of all three of them. But it doesn't work. It ignores some of the indexes.
I feel like if I don't understand this I'll never understand loops in general.
nums = [0,1,1,1]
target = 100
nums.sort()
pointer_one = 0
pointer_two = len(nums) - 1
result = nums[0] + nums[1] + nums[2]
while pointer_one < pointer_two:
for i in nums:
if i == pointer_one or i == pointer_two:
pass
else:
sum_num = nums[i] + nums[pointer_one] + nums[pointer_two]
how_close = abs(target - sum_num)
if how_close < abs(target - result):
result = sum_num
pointer_one = pointer_one + 1
pointer_two = pointer_two - 1
print("Result: ", result)`
When you begin my advice is to use the print() to better understand your code:
iterate over items:
for i in nums:
print(i)
0 1 1 1
iterate over indexes:
for i in range(len(nums)):
print(i)
0 1 2 3
Regards
Your for loop iterates over the items of the list, but you use it as an index not the actual value.
The standard for loop with an index would in your case look as:
for i in range(len(nums)):
#rest of your code
Look at the docs for examples of both forms of for loops:
https://docs.python.org/3/tutorial/controlflow.html#for-statements
I have an algorithm that looks for the good pairs in a list of numbers. A good pair is being considered as index i being less than j and arr[i] < arr[j]. It currently has a complexity of O(n^2) but I want to make it O(nlogn) based on divide and conquering. How can I go about doing that?
Here's the algorithm:
def goodPairs(nums):
count = 0
for i in range(0,len(nums)):
for j in range(i+1,len(nums)):
if i < j and nums[i] < nums[j]:
count += 1
j += 1
j += 1
return count
Here's my attempt at making it but it just returns 0:
def goodPairs(arr):
count = 0
if len(arr) > 1:
# Finding the mid of the array
mid = len(arr)//2
# Dividing the array elements
left_side = arr[:mid]
# into 2 halves
right_side = arr[mid:]
# Sorting the first half
goodPairs(left_side)
# Sorting the second half
goodPairs(right_side)
for i in left_side:
for j in right_side:
if i < j:
count += 1
return count
The current previously accepted answer by Fire Assassin doesn't really answer the question, which asks for better complexity. It's still quadratic, and about as fast as a much simpler quadratic solution. Benchmark with 2000 shuffled ints:
387.5 ms original
108.3 ms pythonic
104.6 ms divide_and_conquer_quadratic
4.1 ms divide_and_conquer_nlogn
4.6 ms divide_and_conquer_nlogn_2
Code (Try it online!):
def original(nums):
count = 0
for i in range(0,len(nums)):
for j in range(i+1,len(nums)):
if i < j and nums[i] < nums[j]:
count += 1
j += 1
j += 1
return count
def pythonic(nums):
count = 0
for i, a in enumerate(nums, 1):
for b in nums[i:]:
if a < b:
count += 1
return count
def divide_and_conquer_quadratic(arr):
count = 0
left_count = 0
right_count = 0
if len(arr) > 1:
mid = len(arr) // 2
left_side = arr[:mid]
right_side = arr[mid:]
left_count = divide_and_conquer_quadratic(left_side)
right_count = divide_and_conquer_quadratic(right_side)
for i in left_side:
for j in right_side:
if i < j:
count += 1
return count + left_count + right_count
def divide_and_conquer_nlogn(arr):
mid = len(arr) // 2
if not mid:
return 0
left = arr[:mid]
right = arr[mid:]
count = divide_and_conquer_nlogn(left)
count += divide_and_conquer_nlogn(right)
i = 0
for r in right:
while i < mid and left[i] < r:
i += 1
count += i
arr[:] = left + right
arr.sort() # linear, as Timsort takes advantage of the two sorted runs
return count
def divide_and_conquer_nlogn_2(arr):
mid = len(arr) // 2
if not mid:
return 0
left = arr[:mid]
right = arr[mid:]
count = divide_and_conquer_nlogn_2(left)
count += divide_and_conquer_nlogn_2(right)
i = 0
arr.clear()
append = arr.append
for r in right:
while i < mid and left[i] < r:
append(left[i])
i += 1
append(r)
count += i
arr += left[i:]
return count
from timeit import timeit
from random import shuffle
arr = list(range(2000))
shuffle(arr)
funcs = [
original,
pythonic,
divide_and_conquer_quadratic,
divide_and_conquer_nlogn,
divide_and_conquer_nlogn_2,
]
for func in funcs:
print(func(arr[:]))
for _ in range(3):
print()
for func in funcs:
arr2 = arr[:]
t = timeit(lambda: func(arr2), number=1)
print('%5.1f ms ' % (t * 1e3), func.__name__)
One of the most well-known divide-and-conquer algorithms is merge sort. And merge sort is actually a really good foundation for this algorithm.
The idea is that when comparing two numbers from two different 'partitions', you already have a lot of information about the remaining part of these partitions, as they're sorted in every iteration.
Let's take an example!
Consider the following partitions, which has already been sorted individually and "good pairs" have been counted.
Partition x: [1, 3, 6, 9].
Partition y: [4, 5, 7, 8].
It is important to note that the numbers from partition x is located further to the left in the original list than partition y. In particular, for every element in x, it's corresponding index i must be smaller than some index j for every element in y.
We will start of by comparing 1 and 4. Obviously 1 is smaller than 4. But since 4 is the smallest element in partition y, 1 must also be smaller than the rest of the elements in y. Consequently, we can conclude that there is 4 additional good pairs, since the index of 1 is also smaller than the index of the remaining elements of y.
The exact same thing happens with 3, and we can add 4 new good pairs to the sum.
For 6 we will conclude that there is two new good pairs. The comparison between 6 and 4 did not yield a good pair and likewise for 6 and 5.
You might now notice how these additional good pairs would be counted? Basically if the element from x is less than the element from y, add the number of elements remaining in y to the sum. Rince and repeat.
Since merge sort is an O(n log n) algorithm, and the additional work in this algorithm is constant, we can conclude that this algorithm is also an O(n log n) algorithm.
I will leave the actual programming as an exercise for you.
#niklasaa has added an explanation for the merge sort analogy, but your implementation still has an issue.
You are partitioning the array and calculating the result for either half, but
You haven't actually sorted either half. So when you're comparing their elements, your two pointer approach isn't correct.
You haven't used their results in the final computation. That's why you're getting an incorrect answer.
For point #1, you should look at merge sort, especially the merge() function. That logic is what will give you the correct pair count without having O(N^2) iteration.
For point #2, store the result for either half first:
# Sorting the first half
leftCount = goodPairs(left_side)
# Sorting the second half
rightCount = goodPairs(right_side)
While returning the final count, add these two results as well.
return count + leftCount + rightCount
Like #Abhinav Mathur stated, you have most of the code down, your problem is with these lines:
# Sorting the first half
goodPairs(left_side)
# Sorting the second half
goodPairs(right_side)
You want to store these in variables that should be declared before the if statement. Here's an updated version of your code:
def goodPairs(arr):
count = 0
left_count = 0
right_count = 0
if len(arr) > 1:
mid = len(arr) // 2
left_side = arr[:mid]
right_side = arr[mid:]
left_count = goodPairs(left_side)
right_count = goodPairs(right_side)
for i in left_side:
for j in right_side:
if i < j:
count += 1
return count + left_count + right_count
Recursion can be difficult at times, look into the idea of merge sort and quick sort to get better ideas on how the divide and conquer algorithms work.
I'm trying to solve the FibFrog Codility problem and I came up with the following approach:
If len(A) is 0 we know we can reach the other side in one jump.
If len(A) + 1 is a fibonacci number, we can also reach it in one jump.
Else, we loop through A, and for the positions we can reach, we check if we can either reach them directly from -1 using a fibonacci number (idx + 1 in fibonaccis) or if we can reach them by first jumping to another position (reachables) and then jumping to the current position. In either case, we also check if we can go from the current position to the end of the river - if we can, then we can return because we found the minimum number of steps required.
Finally, if unreachable is True once this loop completes, this means we can't reach any position using a Fibonacci number, so we return -1.
I'm getting 83% correctness and 0% performance with this approach.
I understand the solution is O(n^2), assuming the array consists of only 1, the nested loop for v in reachables: would run n times - however I'm not sure how else I can compute this, since for each of the positions I need to check whether we can reach it from the start of the array, or from any previous positions using a fibonacci number.
def solution(A):
if len(A) == 0: return 1
fibonaccis = fibonacci(len(A) + 3)
if len(A) + 1 in fibonaccis: return 1
leaves = [0] * len(A)
unreachable = True
reachables = []
for idx, val in enumerate(A):
if val == 1:
if idx + 1 in fibonaccis:
unreachable = False
leaves[idx] = 1
if len(A) - idx in fibonaccis:
return 2
reachables.append(idx)
elif len(reachables) > 0:
for v in reachables:
if idx - v in fibonaccis:
leaves[idx] = leaves[v] + 1
if len(A) - idx in fibonaccis:
return leaves[v] + 2
reachables.append(idx)
break
if unreachable: return -1
if len(A) - reachables[-1] in fibonaccis:
return leaves[reachables[-1]] + 1
def fibonacci(N):
arr = [0] * N
arr[1] = 1
for i in range(2, N):
arr[i] = arr[i-1] + arr[i-2]
return arr
Some suggestions for improving performance of your algorithm -
If len(A) = 100000, you are calculating 100003 fibonacci numbers, while we only need fibonacci numbers which are less than 100k, which would be <30 of them.
Your solution is O(n^4), since each X in reachables or Y in fibonaccis operation is O(N) where N is len(A). (and length of fibonaccis being N because of above issue)
Since you are doing a lot of item in list operations on fibonaccis and reachables, consider making it a set or a dictionary for faster(O(1) instead of O(n)) lookup.
Even with the above changes, the algorithm would be O(N^2) because of nested looping across A and reachables, so you need to come up with a better approach.
With your existing implementation, you need to traverse through all the paths and then in the end you will get the smallest number of jumps.
Instead of this approach, if you start at 0, and then keep a count of the number of jumps you have taken so far, and maintain how far(and to which numbers) you can reach after each jump then you can easily find the minimum jumps required to reach the end. (this will also save on redundant work you would have to do in case you have all 1s in A.
e.g. for
A = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
fibonacci = set(1, 2, 3, 5)
At first jump, we can reach following 1-based indexes -
reachable = [1, 2, 3, 5]
jumps = 1
After second jump
reachables = [2, 3, 4, 5, 6, 7, 8]
jumps = 2
After third jump
reachables = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
jumps = 3
so you have reached the end(10) after 3 jumps.
Please check out #nialloc's answer here - https://stackoverflow.com/a/64558623/8677071 which seems to be doing something similar.
Check out also my solution, which scores 100% on Codility tests and is easy to comprehend.
The idea is to track all possible positions of the frog after k jumps. If possible position == n, return k.
def fib_up_to(n):
numbers = [1]
i = 1
while True:
new_num = (numbers[-2] + numbers[-1]) if i > 1 else 2
if new_num > n:
break
numbers.append(new_num)
i += 1
return numbers
def solution(A):
n = len(A)
if n == 0:
return 1
numbers = fib_up_to(n+1)
possible_positions = set([-1])
for k in range(1, n+1):
positions_after_k = set()
for pos in possible_positions:
for jump in numbers:
if pos + jump == n:
return k
if pos + jump < n and A[pos + jump]:
positions_after_k.add(pos + jump)
possible_positions = positions_after_k
return -1
I'm playing with the Codality Demo Task. It's asking to design a function which determines the lowest missing integer greater than zero in an array.
I wrote a function that works, but Codility tests it as 88% (80% correctness). I can't think of instances where it would fail.
def solution(A):
#If there are negative values, set any negative values to zero
if any(n < 0 for n in A):
A = [(i > 0) * i for i in A]
count = 0
else: count = 1
#Get rid of repeating values
A = set(A)
#At this point, we may have only had negative #'s or the same # repeated.
#If negagive #'s, or repeated zero, then answer is 1
#If repeated 1's answer is 2
#If any other repeated #'s answer is 1
if (len(A) == 1):
if (list(A)[0] == 1):
return 2
else:
return 1
#Sort the array
A = sorted(A)
for j in range(len(A)):
#Test to see if it's greater than zero or >= to count. If so, it exists and is not the lowest integer.
#This fails if the first # is zero and the second number is NOT 1
if (A[j] <= count or A[j] == 0): #If the number is lt or equal to the count or zero then continue the count
count = count + 1
elif (j == 1 and A[j] > 1): return 1
else:
return count
return count
UPDATE:
I got this to 88% with the fixes above. It still fails with some input. I wish Codility would give the inputs that fail. Maybe it does with a full subscription. I'm just playing with the test.
UPDATE 2: Got this to 100% with Tranbi's suggestion.
def solution(A):
#Get rid of all zero and negative #'s
A = [i for i in A if i > 0]
#At this point, if there were only zero, negative, or combination of both, the answer is 1
if (len(A) == 0): return 1
count = 1
#Get rid of repeating values
A = set(A)
#At this point, we may have only had the same # repeated.
#If repeated 1's answer is 2
#If any other repeated #'s only, then answer is 1
if (len(A) == 1):
if (list(A)[0] == 1):
return 2
else:
return 1
#Sort the array
A = sorted(A)
for j in range(len(A)):
#Test to see if it's >= to count. If so, it exists and is not the lowest integer.
if (A[j] <= count): #If the number is lt or equal to the count then continue the count
count = count + 1
else:
return count
return count
Besides that bug for len=1, you also fail for example solution([0, 5]), which returns 2.
Anyway... Since you're willing to create a set, why not just make this really simple?
def solution(A):
A = set(A)
count = 1
while count in A:
count += 1
return count
I don't think this is true:
#At this point, we may have only had negative #'s or the same # repeated. If so, then the answer is 1+ the integer.
if (len(A) == 1):
return list(A)[0]+1
If A = [2] you should return 1 not 3.
Your code is quite confusing though. I think you could replace
if any(n < 0 for n in A):
A = [(i > 0) * i for i in A]
with
A = [i for i in A if i > 0]
What's the point of keeping 0 values?
I don't think this is needed:
if (len(A) == 1):
if (list(A)[0] == 1):
return 2
else:
return 1
It's already taken into account afterwards :)
Finally got a 100% score.
def solution(A):
# 1 isn't there
if 1 not in A:
return 1
# it's easier to sort
A.sort()
# positive "hole" in the array
prev=A[0]
for e in A[1:]:
if e>prev+1>0:
return prev+1
prev=e
# no positive "hole"
# 1 is in the middle
return A[-1]+1
Here's the problem: I try to randomize n times a choice between two elements (let's say [0,1] -> 0 or 1), and my final list will have n/2 [0] + n/2 [1]. I tend to have this kind of result: [0 1 0 0 0 1 0 1 1 1 1 1 1 0 0, until n]: the problem is that I don't want to have serially 4 or 5 times the same number so often. I know that I could use a quasi randomisation procedure, but I don't know how to do so (I'm using Python).
To guarantee that there will be the same number of zeros and ones you can generate a list containing n/2 zeros and n/2 ones and shuffle it with random.shuffle.
For small n, if you aren't happy that the result passes your acceptance criteria (e.g. not too many consecutive equal numbers), shuffle again. Be aware that doing this reduces the randomness of the result, not increases it.
For larger n it will take too long to find a result that passes your criteria using this method (because most results will fail). Instead you could generate elements one at a time with these rules:
If you already generated 4 ones in a row the next number must be zero and vice versa.
Otherwise, if you need to generate x more ones and y more zeros, the chance of the next number being one is x/(x+y).
You can use random.shuffle to randomize a list.
import random
n = 100
seq = [0]*(n/2) + [1]*(n-n/2)
random.shuffle(seq)
Now you can run through the list and whenever you see a run that's too long, swap an element to break up the sequence. I don't have any code for that part yet.
Having 6 1's in a row isn't particularly improbable -- are you sure you're not getting what you want?
There's a simple Python interface for a uniformly distributed random number, is that what you're looking for?
Here's my take on it. The first two functions are the actual implementation and the last function is for testing it.
The key is the first function which looks at the last N elements of the list where N+1 is the limit of how many times you want a number to appear in a row. It counts the number of ones that occur and then returns 1 with (1 - N/n) probability where n is the amount of ones already present. Note that this probability is 0 in the case of N consecutive ones and 1 in the case of N consecutive zeros.
Like a true random selection, there is no guarantee that the ratio of ones and zeros will be the 1 but averaged out over thousands of runs, it does produce as many ones as zeros.
For longer lists, this will be better than repeatedly calling shuffle and checking that it satisfies your requirements.
import random
def next_value(selected):
# Mathematically, this isn't necessary but it accounts for
# potential problems with floating point numbers.
if selected.count(0) == 0:
return 0
elif selected.count(1) == 0:
return 1
N = len(selected)
selector = float(selected.count(1)) / N
if random.uniform(0, 1) > selector:
return 1
else:
return 0
def get_sequence(N, max_run):
lim = min(N, max_run - 1)
seq = [random.choice((1, 0)) for _ in xrange(lim)]
for _ in xrange(N - lim):
seq.append(next_value(seq[-max_run+1:]))
return seq
def test(N, max_run, test_count):
ones = 0.0
zeros = 0.0
for _ in xrange(test_count):
seq = get_sequence(N, max_run)
# Keep track of how many ones and zeros we're generating
zeros += seq.count(0)
ones += seq.count(1)
# Make sure that the max_run isn't violated.
counts = [0, 0]
for i in seq:
counts[i] += 1
counts[not i] = 0
if max_run in counts:
print seq
return
# Print the ratio of zeros to ones. This should be around 1.
print zeros/ones
test(200, 5, 10000)
Probably not the smartest way, but it works for "no sequential runs", while not generating the same number of 0s and 1s. See below for version that fits all requirements.
from random import choice
CHOICES = (1, 0)
def quasirandom(n, longest=3):
serial = 0
latest = 0
result = []
rappend = result.append
for i in xrange(n):
val = choice(CHOICES)
if latest == val:
serial += 1
else:
serial = 0
if serial >= longest:
val = CHOICES[val]
rappend(val)
latest = val
return result
print quasirandom(10)
print quasirandom(100)
This one below corrects the filtering shuffle idea and works correctly AFAICT, with the caveat that the very last numbers might form a run. Pass debug=True to check that the requirements are met.
from random import random
from itertools import groupby # For testing the result
try: xrange
except: xrange = range
def generate_quasirandom(values, n, longest=3, debug=False):
# Sanity check
if len(values) < 2 or longest < 1:
raise ValueError
# Create a list with n * [val]
source = []
sourcelen = len(values) * n
for val in values:
source += [val] * n
# For breaking runs
serial = 0
latest = None
for i in xrange(sourcelen):
# Pick something from source[:i]
j = int(random() * (sourcelen - i)) + i
if source[j] == latest:
serial += 1
if serial >= longest:
serial = 0
guard = 0
# We got a serial run, break it
while source[j] == latest:
j = int(random() * (sourcelen - i)) + i
guard += 1
# We just hit an infinit loop: there is no way to avoid a serial run
if guard > 10:
print("Unable to avoid serial run, disabling asserts.")
debug = False
break
else:
serial = 0
latest = source[j]
# Move the picked value to source[i:]
source[i], source[j] = source[j], source[i]
# More sanity checks
check_quasirandom(source, values, n, longest, debug)
return source
def check_quasirandom(shuffled, values, n, longest, debug):
counts = []
# We skip the last entries because breaking runs in them get too hairy
for val, count in groupby(shuffled):
counts.append(len(list(count)))
highest = max(counts)
print('Longest run: %d\nMax run lenght:%d' % (highest, longest))
# Invariants
assert len(shuffled) == len(values) * n
for val in values:
assert shuffled.count(val) == n
if debug:
# Only checked if we were able to avoid a sequential run >= longest
assert highest <= longest
for x in xrange(10, 1000):
generate_quasirandom((0, 1, 2, 3), 1000, x//10, debug=True)