Python / smallest positive integer - python

I took the following Codility demo task:
Write a function:
def solution(A)
that, given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.
For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.
Given A = [1, 2, 3], the function should return 4.
Given A = [−1, −3], the function should return 1.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [1..100,000];
each element of array A is an integer within the range [−1,000,000..1,000,000].
My solution:
def solution(A):
    # write your code in Python 3.6
    l = len(A)
    B = []
    result = 0
    n = 0
    for i in range(l):
        if A[i] >=1:
            B.append(A[i])
    if B ==[]:
        return(1)
    else:
        B.sort()
        B = list(dict.fromkeys(B))
        n = len(B)
        for j in range(n-1):
            if B[j+1]>B[j]+1:
                result = (B[j]+1)
        if result != 0:
            return(result)
        else:
            return(B[n-1]+1)
Although I get the correct output for every input I tried, my score was just 22%. Could somebody please point out where I am going wrong?

Python solution with O(N) time complexity and O(N) space complexity:
def solution(A):
    arr = [0] * 1000001
    for a in A:
        if a>0:
            arr[a] = 1
    for i in range(1, 1000000+1):
        if arr[i] == 0:
            return i
My main idea was to:
Create zero-initialized "buckets" for all the positive possibilities.
Iterate over A. Whenever you meet a positive number, mark its bucket as visited (1).
Iterate over the "buckets" and return the first zero "bucket".
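The three steps above can also be sketched with far fewer buckets. This is only an illustration, assuming (as the task guarantees) that A has N elements: the answer can never exceed N + 1, so N + 1 buckets are enough. The name smallest_missing_positive is made up for this sketch.

```python
def smallest_missing_positive(values):
    """Return the smallest positive integer that does not occur in values."""
    n = len(values)
    # Buckets for 0 .. n + 1; index 0 is unused so that seen[v] maps to value v.
    seen = [False] * (n + 2)
    for v in values:
        # Values outside 1 .. n + 1 cannot change the answer, so skip them.
        if 1 <= v <= n + 1:
            seen[v] = True
    # At most n of the n + 1 candidate buckets are marked, so this always returns.
    for candidate in range(1, n + 2):
        if not seen[candidate]:
            return candidate


print(smallest_missing_positive([1, 3, 6, 4, 1, 2]))
```

Sizing the buckets by len(A) instead of by the largest allowed value keeps the memory proportional to the input.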

def solution(A):
    s = set(A)
    for x in range(1,100002):
        if x not in s:
            return x
And got 100%.

def solution(A):
    # write your code in Python 3.6
    i = 1
    B = set(A)
    while True:
        if i not in B:
            return i
        i += 1

My JavaScript solution. The idea is to sort the array and compare adjacent elements. Because of the sort, the complexity is O(N log N).
function solution(A) {
    // write your code in JavaScript (Node.js 8.9.4)
    A.sort((a, b) => a - b);
    const last = A[A.length - 1];
    if (last < 1) return 1; // no positive values at all
    if (A.find(v => v > 0) > 1) return 1; // smallest positive value is above 1
    for (let i = 0; i < A.length - 1; ++i) {
        if (A[i] > 0 && (A[i + 1] - A[i]) > 1) {
            return A[i] + 1;
        }
    }
    return last + 1;
}

In Codility your solution must handle inputs beyond the sample cases and also perform well. I did it this way:
from collections import Counter
def maior_menos_zero(A):
    if A < 0:
        return 1
    else:
        return 1 if A != 1 else 2
def solution(A):
    if len(A) > 1:
        copia = set(A.copy())
        b = max(A)
        c = Counter(A)
        if len(c) == 1:
            return maior_menos_zero(A[0])
        elif 1 not in copia:
            return 1
        else:
            for x in range(1,b+2):
                if x not in copia:
                    return x
    else:
        return maior_menos_zero(A[0])
Got 100%. If A has length 1, the function maior_menos_zero is called. If len(A) > 1 but all of its elements are the same (detected with Counter), maior_menos_zero is called as well. Otherwise, if 1 is not in the array, 1 is the smallest missing positive integer. If 1 is in the array, we loop X over range(1, max(A)+2) and check whether X is in A; to save time, we return at the first X that is not in A, which is the smallest missing positive integer.

My solution (100% acceptance):
def solution(nums):
    nums_set = set()
    for el in nums:
        if el > 0 and el not in nums_set:
            nums_set.add(el)
    sorted_set = sorted(nums_set)
    if len(sorted_set) == 0:
        return 1
    if sorted_set[0] != 1:
        return 1
    for i in range(0, len(sorted_set) - 1, 1):
        diff = sorted_set[i + 1] - sorted_set[i]
        if diff >= 2:
            return sorted_set[i] + 1
    return sorted_set[-1] + 1

I tried the following and got a 100% score:
def solution(A):
    A_set = set(A)
    # range(start, stop): the start must come first, otherwise the range is empty
    for x in range(1, 10**5 + 1):
        if x not in A_set:
            return x
    else:
        return 10**5 + 1

This solution is an easy approach!
def solution(A):
    A.sort()
    maxval = A[-1]
    nextmaxval = A[-2]
    if maxval < 0:
        while maxval<= 0:
            maxval += 1
        return maxval
    else:
        if nextmaxval + 1 in A:
            return maxval +1
        else:
            return nextmaxval + 1

This is my solution
def solution(A):
# write your code in Python 3.8.10
new = set(A)
max_ = abs(max(A)) #use the absolute here for negative maximum value
for num in range(1,max_+2):
if num not in new:
return num

Try this. I am assuming the list is not sorted; if it is already sorted, you can remove the line number_list = sorted(number_list) to make it a little faster.
def get_smallest_positive_integer(number_list):
if all(number < 0 for number in number_list) or 1 not in number_list:
#checks if numbers in list are all negative integers or if 1 is not in list
return 1
else:
try:
#get the smallest number in missing integers
number_list = sorted(number_list) # remove if list is already sorted by default
return min(x for x in range(number_list[0], number_list[-1] + 1) if x not in number_list and x != 0)
except:
#if there is no missing number in list get largest number + 1
return max(number_list) + 1
print(get_smallest_positive_integer(number_list))
input:
number_list = [1,2,3]
output:
>>4
input:
number_list = [-1,-2,-3]
output:
>>1
input:
number_list = [2]
output:
>>1
input:
number_list = [12,1,23,3,4,5,61,7,8,9,11]
output:
>>2
input:
number_list = [-1,3,2,1]
output:
>>4

I think this should be as easy as starting at 1 and checking which number first fails to appear.
def solution(A):
    i = 1
    while i in A:
        i += 1
    return i
You can also consider putting A's elements into a set (for faster lookups), but I'm not sure that it's worth it in this case.
Update:
I've been doing some tests with the numbers OP gave (numbers from negative million to positive million and 100000 elements).
100000 elements:
Linear Search: 0.003s
Set Search: 0.017s
1000000 elements (extra test):
Linear Search: 0.8s
Set Search: 2.58s
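A likely explanation for these timings: with values drawn at random from such a wide range, 1 is usually missing, so the list version stops after a single scan while the set version still has to build the whole set. The sketch below is only a way to check this on your own machine; the function names and the "dense" input (1..2000, where the list version has to rescan many times) are assumptions added for illustration, and no timings are claimed here.

```python
import random
import timeit


def solution_list(A):
    i = 1
    while i in A:      # every test scans the list from the start
        i += 1
    return i


def solution_set(A):
    seen = set(A)      # one pass over A to build the set
    i = 1
    while i in seen:   # average O(1) per test
        i += 1
    return i


if __name__ == "__main__":
    random.seed(0)
    random_data = [random.randint(-1_000_000, 1_000_000) for _ in range(100_000)]
    dense_data = list(range(1, 2_001))  # the answer is 2001, found only after many lookups
    for label, sample in (("random", random_data), ("dense", dense_data)):
        for fn in (solution_list, solution_set):
            seconds = timeit.timeit(lambda: fn(sample), number=5)
            print(f"{label:>6} {fn.__name__:<14} {seconds:.4f}s")
```

Both functions return the same answer; only the cost of each membership test differs.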

Related

Codewars Python Two sum not working for some test cases

This Python code is not working for some test cases on code wars two sum. Here is the link to the problem:
https://www.codewars.com/kata/52c31f8e6605bcc646000082/train/python
def two_sum(nums, target):
nums.sort()
l = 0
r = len(nums)-1
while l < r:
sum = nums[l] + nums[r]
if sum == target:
return [l, r]
if sum > target:
r -= 1
if sum < target:
l += 1
return []
Any help is much appreciated! :)
The solution you are looking for will be:
def two_sum(nums, target):
    indices = {}
    for index, num in enumerate(nums):
        remainder = target - num
        if remainder in indices:
            return indices[remainder], index
        indices[num] = index
    return 0, 0
Right off the bat, I can also tell you that sorting nums before doing anything else is bad because the original indices can get mixed up.
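If you do want to keep the sorted two-pointer idea, one option is to sort (index, value) pairs instead of the bare values, so the original positions survive the sort. This is a sketch rather than the kata's reference solution; the name two_sum_sorted is hypothetical.

```python
def two_sum_sorted(nums, target):
    """Two-pointer search that still reports indices into the original list."""
    # Pair each value with its original index, then sort by value only.
    indexed = sorted(enumerate(nums), key=lambda pair: pair[1])
    left, right = 0, len(indexed) - 1
    while left < right:
        total = indexed[left][1] + indexed[right][1]
        if total == target:
            # Report the original positions, smallest first.
            return sorted((indexed[left][0], indexed[right][0]))
        if total < target:
            left += 1
        else:
            right -= 1
    return []


print(two_sum_sorted([3, 1, 2], 4))
```

This costs O(n log n) because of the sort, so the dictionary version above is still the faster choice.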
def two_sum(numbers, target):
    for n1 in enumerate(numbers):
        for n2 in enumerate(numbers):
            if n1[0] != n2[0]:
                if (n1[1] + n2[1]) == target:
                    return [n1[0], n2[0]]

Design O(log n) algorithm for finding 3 distinct elements in a list

The question is:
Design an O(log n) algorithm whose input is a sorted list A. The algorithm should return true if A contains at least 3 distinct elements. Otherwise, the algorithm should return false.
Since it has to be O(log n), I tried to use binary search, and this is the code I wrote:
def hasThreeDistinctElements(A):
if len(A) < 3:
return False
minInd = 0
maxInd = len(A)-1
midInd = (maxInd+minInd)//2
count = 1
while minInd < maxInd:
if A[minInd] == A[midInd]:
minInd = midInd
if A[maxInd] == A[midInd]:
maxInd = midInd
else:
count += 1
maxInd -= 1
else:
count += 1
minInd += 1
midInd = (maxInd+minInd)//2
return count >= 3
is there a better way to do this?
Thanks
from bisect import bisect
def hasThreeDistinctElements(A):
return A[:1] < A[-1:] > [A[bisect(A, A[0])]]
The first comparison safely(*) checks whether there are two different values at all. If so, we check whether the first value larger than A[0] is also smaller than A[-1].
(*): Doesn't crash if A is empty.
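For readers who find the chained comparison dense, the same logic can be unpacked into explicit steps. This is a sketch of the idea described above, not a replacement for it; the name has_three_distinct is made up here.

```python
from bisect import bisect_right


def has_three_distinct(A):
    """A is sorted. Return True if it holds at least three distinct values."""
    # Fewer than two distinct values: empty list, or first equals last.
    if not A or A[0] == A[-1]:
        return False
    # Index of the first element strictly greater than A[0].
    # It is a valid index because A[-1] > A[0].
    i = bisect_right(A, A[0])
    # A third value exists if that element is still smaller than the last one.
    return A[i] < A[-1]


print(has_three_distinct([1, 1, 2, 3, 3]))
```

Only one binary search is needed, so the running time stays O(log n).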
Or without bisect, binary-searching for a third value in A[1:-1]. The invariant is that if there is any, it must be in A[lo : hi+1]:
def hasThreeDistinctElements(A):
lo, hi = 1, len(A) - 2
while lo <= hi:
mid = (lo + hi) // 2
if A[mid] == A[0]:
lo = mid + 1
elif A[mid] == A[-1]:
hi = mid - 1
else:
return True
return False
In order to really be O(log N), the updates to the bounding indices minInd and maxInd should only ever be
maxInd = midInd [- 1]
minInd = midInd [+ 1]
to halve the search space. Since there are paths through your loop body that only do
minInd += 1
maxInd -= 1
respectively, I am not sure that you can't create data for which your function is linear. The following is a bit simpler and guaranteed to be O(log N):
def x(A):
if len(A) < 3:
return False
minInd, maxInd = 0, len(A)-1
mn, mx = A[minInd], A[maxInd]
while minInd < maxInd:
midInd = (minInd + maxInd) // 2
if mn != A[midInd] != mx:
return True
if A[midInd] == mn:
minInd = midInd + 1 # minInd == midInd might occur
else:
maxInd = midInd # while maxInd != midInd is safe
return False
BTW, if you can use the standard library, it is as easy as:
from bisect import bisect_right
def x(A):
return A and (i := bisect_right(A, A[0])) < len(A) and A[i] < A[-1]
Yes, there is a better approach.
As the list is sorted, you can use binary search with slight custom modifications as follows:
list = [1, 1, 1, 2, 2]
uniqueElementSet = set([])
def binary_search(minIndex, maxIndex, n):
if(len(uniqueElementSet)>=3):
return
#Checking the bounds for index:
if(minIndex<0 or minIndex>=n or maxIndex<0 or maxIndex>=n):
return
if(minIndex > maxIndex):
return
if(minIndex == maxIndex):
uniqueElementSet.add(list[minIndex])
return
if(list[minIndex] == list[maxIndex]):
uniqueElementSet.add(list[minIndex])
return
uniqueElementSet.add(list[minIndex])
uniqueElementSet.add(list[maxIndex])
midIndex = (minIndex + maxIndex)//2
binary_search(minIndex+1, midIndex, n)
binary_search(midIndex+1, maxIndex-1, n)
return
binary_search(0, len(list)-1, len(list))
print(True if len(uniqueElementSet)>=3 else False)
Since we divide the array into 2 parts at each step of the recursion, it will require at most about log(n) steps to check whether it contains 3 unique elements.
Time Complexity = O(log(n)).

Python query in list without for loop

I want to check whether a pair of numbers in a Python list adds up to a given sum.
The list is sorted
I only need to check consecutive pairs
I want to avoid using a for loop
I used a for loop to get the job done and it's working fine. I want to learn other, more optimized ways to get the same result.
Can I get the same result in other ways, without using a for loop?
How could I use binary search in this situation?
This is my code:
def query_sum(list, find_sum):
"""
This function will find sum of two pairs in list
and return True if sum exist in list
:param list:
:param find_sum:
:return:
"""
previous = 0
for number in list:
sum_value = previous + number
if sum_value == find_sum:
print("Yes sum exist with pair {} {}".format(previous, number))
return True
previous = number
x = [1, 2, 3, 4, 5]
y = [1, 2, 4, 8, 16]
query_sum(x, 7)
query_sum(y, 3)
this is the result.
Yes sum exist with pair 3 4
Yes sum exist with pair 1 2
You can indeed use binary search if your list is sorted (and you are only looking at sums of successive elements), since the sums will be monotonically increasing as well. In a list of N elements, there are N-1 successive pairs. You can copy and paste any properly implemented binary search algorithm you find online and replace the criteria with the sum of successive elements. For example:
def query_sum(seq, target):
    def bsearch(l, r):
        if r >= l:
            mid = l + (r - l) // 2
            s = sum(seq[mid:mid + 2])
            if s == target:
                return mid
            elif s > target:
                return bsearch(l, mid - 1)
            else:
                return bsearch(mid + 1, r)
        else:
            return -1
    # N elements give N-1 pairs, so the last valid pair starts at index N-2
    i = bsearch(0, len(seq) - 2)
    if i < 0:
        return False
    print("Sum {} exists with pair {} {}".format(target, *seq[i:i + 2]))
    return True
You could use the built-in bisect module, but then you would have to pre-compute the sums. This is a much cheaper method since you only have to compute log2(N) sums.
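As a side note, bisect can be used without pre-computing anything if it is handed a small wrapper that computes each pair sum on demand. The sketch below assumes the list is sorted in non-decreasing order; the class PairSums and the function query_sum_bisect are hypothetical names introduced only for this illustration.

```python
from bisect import bisect_left


class PairSums:
    """Read-only view where view[i] == seq[i] + seq[i + 1], computed lazily."""

    def __init__(self, seq):
        self.seq = seq

    def __len__(self):
        # N elements give N - 1 consecutive pairs (never a negative length).
        return max(len(self.seq) - 1, 0)

    def __getitem__(self, i):
        return self.seq[i] + self.seq[i + 1]


def query_sum_bisect(seq, target):
    """Return the consecutive pair that adds up to target, or None."""
    sums = PairSums(seq)
    i = bisect_left(sums, target)  # only about log2(N) sums are evaluated
    if i < len(sums) and sums[i] == target:
        return seq[i], seq[i + 1]
    return None


print(query_sum_bisect([1, 2, 3, 4, 5], 7))
```

The wrapper only needs __len__ and __getitem__, which is what bisect_left uses on its first argument.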
Also, this solution avoids looping using recursion, but you might be better off writing a loop like while r >= l: around the logic instead of using recursion:
def query_sum(seq, target):
    def bsearch(l, r):
        while r >= l:
            mid = l + (r - l) // 2
            s = sum(seq[mid:mid + 2])
            if s == target:
                return mid
            elif s > target:
                r = mid - 1
            else:
                l = mid + 1
        return -1
    # N elements give N-1 pairs, so the last valid pair starts at index N-2
    i = bsearch(0, len(seq) - 2)
    if i < 0:
        return False
    print("Yes sum exist with pair {} {}".format(*seq[i:i + 2]))
    return True
# simpler one:
def query_sum(seq, target):
def search(seq, index, target):
if index < len(seq):
if sum(seq[index:index+2]) == target:
return index
else:
return search(seq, index+1, target)
else:
return -1
return search(seq, 0, target)

Python given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A in O(n) time complexity

For example:
input: A = [ 6 4 3 -5 0 2 -7 1 ]
output: 5
Since 5 is the smallest positive integer that does not occur in the array.
I have written two solutions to that problem. The first one works, but I don't want to use any external libraries, and its complexity is O(n log n). The second solution, which I need your help to optimize, fails when the input is a chaotic sequence of length 10005 (with minus).
Solution 1:
from itertools import count, filterfalse
def minpositive(a):
return(next(filterfalse(set(a).__contains__, count(1))))
Solution 2:
def minpositive(a):
count = 0
b = list(set([i for i in a if i>0]))
if min(b, default = 0) > 1 or min(b, default = 0) == 0 :
min_val = 1
else:
min_val = min([b[i-1]+1 for i, x in enumerate(b) if x - b[i - 1] >1], default=b[-1]+1)
return min_val
Note: This was a demo test in codility, solution 1 got 100% and
solution 2 got 77 %.
Error in "solution2" was due to:
Performance tests ->
medium chaotic sequences length=10005 (with minus) got 3 expected
10000
Performance tests -> large chaotic + many -1, 1, 2, 3 (with
minus) got 5 expected 10000
Testing for the presence of a number in a set is fast in Python so you could try something like this:
def minpositive(a):
A = set(a)
ans = 1
while ans in A:
ans += 1
return ans
Fast for large arrays.
def minpositive(arr):
if 1 not in arr: # protection from error if ( max(arr) < 0 )
return 1
else:
maxArr = max(arr) # find max element in 'arr'
c1 = set(range(2, maxArr+2)) # create array from 2 to max
c2 = c1 - set(arr) # find all positive elements outside the array
return min(c2)
I have an easy solution. No need to sort.
def solution(A):
s = set(A)
m = max(A) + 2
for N in range(1, m):
if N not in s:
return N
return 1
Note: It is 100% total score (Correctness & Performance)
def minpositive(A):
"""Given an list A of N integers,
returns the smallest positive integer (greater than 0)
that does not occur in A in O(n) time complexity
Args:
A: list of integers
Returns:
integer: smallest positive integer
e.g:
A = [1,2,3]
smallest_positive_int = 4
"""
len_nrs_list = len(A)
N = set(range(1, len_nrs_list+2))
return min(N-set(A)) #gets the min value using the N integers
This solution passes the performance test with a score of 100%
def solution(A):
n = sorted(i for i in set(A) if i > 0) # Remove duplicates and negative numbers
if not n:
return 1
ln = len(n)
for i in range(1, ln + 1):
if i != n[i - 1]:
return i
return ln + 1
def solution(A):
    B = sorted(set(A))  # a set has no guaranteed order, so sort after deduplicating
    m = 1
    for x in B:
        if x == m:
            m+=1
    return m
Continuing on from Niroj Shrestha and najeeb-jebreel, added an initial portion to avoid iteration in case of a complete set. Especially important if the array is very large.
def smallest_positive_int(A):
sorted_A = sorted(A)
last_in_sorted_A = sorted_A[-1]
#check if straight continuous list
if len(sorted_A) == last_in_sorted_A:
return last_in_sorted_A + 1
else:
#incomplete list, iterate to find the smallest missing number
sol=1
for x in sorted_A:
if x == sol:
sol += 1
else:
break
return sol
A = [1,2,7,4,5,6]
print(smallest_positive_int(A))
This question doesn't really need another answer, but there is a solution that has not been proposed yet, that I believe to be faster than what's been presented so far.
As others have pointed out, we know the answer lies in the range [1, len(A)+1], inclusive. We can turn that range into a set and take the minimum element of its set difference with A. That's a good O(N) solution, since set lookups and insertions take O(1) time on average.
However, we don't need to use a Python set to store [1, len(A)+1], because we're starting with a dense set. We can use an array instead, which will replace set hashing by list indexing and give us another O(N) solution with a lower constant.
def minpositive(a):
# the "set" of possible answer - values_found[i-1] will tell us whether i is in a
values_found = [False] * (len(a)+1)
# note any values in a in the range [1, len(a)+1] as found
for i in a:
if i > 0 and i <= len(a)+1:
values_found[i-1] = True
# extract the smallest value not found
for i, found in enumerate(values_found):
if not found:
return i+1
We know the final for loop always finds a value that was not marked, because it has one more element than a, so at least one of its cells was not set to True.
def check_min(a):
x= max(a)
if x-1 in a:
return x+1
elif x <= 0:
return 1
else:
return x-1
Correct me if i'm wrong but this works for me.
def solution(A):
clone = 1
A.sort()
for itr in range(max(A) + 2):
if itr not in A and itr >= 1:
clone = itr
break
return clone
print(solution([2,1,4,7]))
#returns 3
def solution(A):
n = 1
for i in A:
if n in A:
n = n+1
else:
return n
return n
def not_in_A(a):
a=sorted(a)
if max(a)<1:
return(1)
for i in range(0,len(a)-1):
if a[i+1]-a[i]>1:
out=a[i]+1
if out==0 or out<1:
continue
return(out)
return(max(a)+1)
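Mark every value that occurs, then return the first positive value that was never marked:
nums = [ 6, 4, 3, -5, 0, 2, -7, 1 ]
def check_min(nums):
    marks = [-1] * (len(nums) + 2)      # slots for 0 .. len(nums) + 1
    for idx, num in enumerate(nums):
        if 0 < num <= len(nums) + 1:    # larger values cannot affect the answer
            marks[num] = idx
    for candidate in range(1, len(marks)):
        if marks[candidate] == -1:
            return candidate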
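I just modified the answer by #najeeb-jebreel so that the function stops as soon as the answer is known.
def solution(A):
    sorted_values = sorted(set(A))  # sort after deduplicating; a set is unordered
    sol = 1
    for x in sorted_values:
        if x == sol:
            sol += 1
        elif x > sol:               # skip values below 1, stop at the first gap
            break
    return sol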
I reduced the length of set before comparing
a=[1,222,3,4,24,5,6,7,8,9,10,15,2,3,3,11,-1]
#a=[1,2,3,6,3]
def sol(a_array):
a_set=set()
b_set=set()
cnt=1
for i in a_array:
#In order to get the greater performance
#Checking if element is greater than length+1
#then it can't be output( our result in solution)
if i<=len(a) and i >=1:
a_set.add(i) # Adding array element in set
b_set.add(cnt) # Adding iterator in set
cnt=cnt+1
b_set=b_set.difference(a_set)
if((len(b_set)) > 1):
return(min(b_set))
else:
return max(a_set)+1
sol(a)
def solution(A):
nw_A = sorted(set(A))
if all(i < 0 for i in nw_A):
return 1
else:
ans = 1
while ans in nw_A:
ans += 1
if ans not in nw_A:
return ans
For better performance, if importing the numpy package is an option:
def solution(A):
import numpy as np
nw_A = np.unique(np.array(A))
if np.all((nw_A < 0)):
return 1
else:
ans = 1
while ans in nw_A:
ans += 1
if ans not in nw_A:
return ans
def solution(A):
# write your code in Python 3.6
min_num = float("inf")
set_A = set(A)
# finding the smallest number
for num in set_A:
if num < min_num:
min_num = num
# print(min_num)
#if negative make positive
if min_num < 0 or min_num == 0:
min_num = 1
# print(min_num)
# if in set add 1 until not
while min_num in set_A:
min_num += 1
return min_num
Not sure why this is not 100% in correctness. It is 100% performance
def solution(A):
arr = set(A)
N = set(range(1, 100001))
while N in arr:
N += 1
return min(N - arr)
solution([1, 2, 6, 4])
#returns 3

What is the minimum number of swaps required to bubble sort an array?

I'm trying to solve the Hackerrank problem New Year Chaos:
Further explanation can be found on the page. For example, denoting the 'swapped' queue as q, if q = [2, 1, 5, 3, 4], then the required number of swaps is 3:
According to the first answer of https://www.quora.com/How-can-I-efficiently-compute-the-number-of-swaps-required-by-slow-sorting-methods-like-insertion-sort-and-bubble-sort-to-sort-a-given-array, the number of swaps required by bubble sort is equal to the number of inversions in the array. I tried to test this with the following Hackerrank submission:
#!/bin/python
import sys
T = int(raw_input().strip())
for a0 in xrange(T):
n = int(raw_input().strip())
q = map(int,raw_input().strip().split(' '))
# your code goes here
diff = [x - y for x, y in zip(q, range(1,n+1))]
if any([abs(el) > 2 for el in diff]):
print "Too chaotic"
else:
all_pairs = [(q[i], q[j]) for i in range(n) for j in range(i+1, n)]
inversions = [pair[0] > pair[1] for pair in all_pairs]
print inversions.count(True)
Here is also a version of the code to run locally:
n = 5
q = [2, 1, 5, 3, 4]
diff = [x - y for x, y in zip(q, range(1,n+1))]
if any([abs(el) > 2 for el in diff]):
print "Too chaotic"
else:
all_pairs = [(q[i], q[j]) for i in range(n) for j in range(i+1, n)]
inversion_or_not = [pair[0] > pair[1] for pair in all_pairs]
print inversion_or_not.count(True)
For the given test case, the script correctly prints the number 3. However, for all the other 'hidden' test cases, it gives the wrong answer:
I've also tried a submission which implements bubble sort:
#!/bin/python
import sys
def swaps_bubble_sort(q):
q = list(q) # Make a shallow copy
swaps = 0
swapped = True
while swapped:
swapped = False
for i in range(n-1):
if q[i] > q[i+1]:
q[i], q[i+1] = q[i+1], q[i]
swaps += 1
swapped = True
return swaps
T = int(raw_input().strip())
for a0 in xrange(T):
n = int(raw_input().strip())
q = map(int,raw_input().strip().split(' '))
# your code goes here
diff = [x - y for x, y in zip(q, range(1,n+1))]
if any([abs(el) > 2 for el in diff]):
print "Too chaotic"
else:
print swaps_bubble_sort(q)
but with the same (failed) result. Is the minimum number of swaps not equal to the number of inversions or that attained by bubble sort?
You just have to count the number of necessary swaps in bubble sort. Here is my code that got accepted.
T = input()
for test in range(T):
n = input()
l = map(int, raw_input().split())
for i,x in enumerate(l):
if x-(i+1) > 2:
print "Too chaotic"
break
else:
counter = 0
while 1:
flag = True
for i in range(len(l)-1):
if l[i] > l[i+1]:
l[i],l[i+1] = l[i+1],l[i]
counter += 1
flag = False
if flag:
break
print counter
In your first code, the approach is O(n^2), which is not appropriate for n = 10^5. In this line
all_pairs = [(q[i], q[j]) for i in range(n) for j in range(i+1, n)]
you are trying to store roughly 5 * 10^9 tuples in RAM.
The problem with your second code is that you use the abs of the elements of diff to check that the array is not chaotic. However, a person can end up at the back of the line just by being bribed, which doesn't violate the rules. So you only have to make sure that nobody has moved forward by more than two positions, not the other way around.
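Putting both points together, the inversion count can be computed without materializing every pair. The sketch below is one possible approach, not the accepted submission: it applies the forward-only check and then counts inversions with a merge sort in O(n log n). The names count_inversions and minimum_bribes are chosen for this illustration, and returning None stands in for printing "Too chaotic".

```python
def count_inversions(values):
    """Return (sorted copy, number of inversions) using merge sort."""
    n = len(values)
    if n <= 1:
        return list(values), 0
    mid = n // 2
    left, inv_left = count_inversions(values[:mid])
    right, inv_right = count_inversions(values[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def minimum_bribes(q):
    """Return the number of bribes, or None if the queue is too chaotic."""
    # Only moving forward is limited: nobody may be more than 2 places ahead.
    for position, person in enumerate(q, start=1):
        if person - position > 2:
            return None
    return count_inversions(q)[1]


print(minimum_bribes([2, 1, 5, 3, 4]))
```

The chaotic check deliberately ignores people who ended up further back, since being bribed repeatedly is allowed.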
Swift 4 version:
func minimumBribes(queue: [Int]) -> Int? {
for (index, value) in queue.enumerated() {
if value - (index + 1) > 2 { // `+ 1` needed because index starts from `0`, not from `1`.
return nil
}
}
var counter = 0
var queue = queue // Just a mutable copy of input value.
while true {
var isSorted = true
for i in 0 ..< queue.count - 1 {
if queue[i] > queue[i + 1] {
queue.swapAt(i, i + 1)
counter += 1
isSorted = false
}
}
if isSorted {
break
}
}
return counter
}
// Complete the minimumBribes function below.
func minimumBribes(q: [Int]) -> Void {
if let value = minimumBribes(queue: q) {
print("\(value)")
} else {
print("Too chaotic")
}
}
clean python solution:
def minimumBribes(q):
    b = 0
    for i, x in enumerate(q):
        if x - i > 3:
            print('Too chaotic')
            return
        for y in q[max(0, x - 2):i]:
            if y > x:
                b += 1
    print(b)
