Smallest Missing Number - python

I'd like some insight into the following code. The problem says: devise an algorithm that finds the smallest missing number in an array.
My Approach:
def small(arr: list) -> int:
    s = {*arr}
    i = 1
    while True:
        if i not in s:
            return i
        i += 1
Easy, right?
The problem is that this uses O(n) extra space, because I create that extra set.
Better Approach:
# Linear time routine for partitioning step of Quicksort
def partition(arr):
    pIndex = 0
    # each time we find a positive number, `pIndex` is incremented, and
    # that element would be placed before the pivot
    for i in range(len(arr)):
        if arr[i] > 0:  # pivot is 0
            arr[i], arr[pIndex] = arr[pIndex], arr[i]
            pIndex += 1
    # return index of the first non-positive number
    return pIndex

# Function to find the smallest missing positive number from an unsorted array
def findSmallestMissing(arr, n):
    # Case 1. The missing number is in range 1 to `n`
    # do for each array element
    for i in range(n):
        # get the value of the current element
        val = abs(arr[i])
        # make element at index `val-1` negative if it is positive
        if val - 1 < n and arr[val - 1] >= 0:
            arr[val - 1] = -arr[val - 1]
    # check for missing numbers from 1 to `n`
    for i in range(n):
        if arr[i] > 0:
            return i + 1
    # Case 2. If numbers from 1 to `n` are present in the array,
    # then the missing number is `n+1` e.g. [1, 2, 3, 4] -> 5
    return n + 1

if __name__ == '__main__':
    arr = [1, 4, 2, -1, 6, 5]
    k = partition(arr)
    print("The smallest positive missing number is",
          findSmallestMissing(arr, k))
I don't understand why we need
if val - 1 < n and arr[val - 1] >= 0:
    arr[val - 1] = -arr[val - 1]

findSmallestMissing is really very similar to your set-based solution. The difference is that it uses the sign-bit in the input array as that set. That sign has nothing to do with the value that is stored at the same spot, but with the index. If the sign bit is set it means: I have encountered a value that is this index (after translating a value to an index by subtracting one).
The code you asked about:
if val - 1 < n and arr[val - 1] >= 0:
    arr[val - 1] = -arr[val - 1]
First of all, the subtraction of one is there to convert a 1-based value to a 0-based index.
Since we use the array also as a set, this code first checks that the index is in range (of the "set") and then it checks that this index is not yet in the set, i.e. the sign bit at that index is still 0. If so, we add the index to the set, i.e. we set the sign bit at that index to 1.
All in all, one may argue that we cheat here: it is as if we didn't allocate extra memory for maintaining a set, but in reality we used an unused bit. In theory this means we still use extra memory, while in practice that memory was already allocated, but not used. The algorithm assumes that none of the values it looks at are negative, which is why partition is called first to move the positive values to the front.
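To make the "sign as a set" idea concrete, here is a minimal sketch of a variant that does not need the partition step. It is not the code from the question: the function name smallest_missing_positive is made up for illustration, and instead of partitioning it first overwrites every value outside 1..n with n + 1 so that all entries are positive before the signs are used as markers. Like the original, it modifies the input list.

```python
def smallest_missing_positive(arr):
    """Return the smallest positive integer not in arr (arr is modified)."""
    n = len(arr)

    # Values outside 1..n can never be the answer's predecessor, so replace
    # them with n + 1. After this pass every entry is positive.
    for i in range(n):
        if arr[i] <= 0 or arr[i] > n:
            arr[i] = n + 1

    # Use the sign at index val - 1 as the "val is present" flag.
    for i in range(n):
        val = abs(arr[i])  # abs() because this slot may already be flagged
        if val <= n and arr[val - 1] > 0:
            arr[val - 1] = -arr[val - 1]

    # The first index whose sign was never flipped is the missing value - 1.
    for i in range(n):
        if arr[i] > 0:
            return i + 1
    return n + 1


print(smallest_missing_positive([1, 4, 2, -1, 6, 5]))
```

Pass a copy (for example list(arr)) if the caller still needs the original values.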

Related

Recycled array solution to Find minimum in Rotated sorted Array

I am working on a hard but stupid bisect search problem and debugging for hours.
Find Minimum in Rotated Sorted Array II
Hard
Suppose an array sorted in ascending order is rotated at some pivot unknown to you beforehand.
(i.e., [0,1,2,4,5,6,7] might become [4,5,6,7,0,1,2]).
Find the minimum element.
The array may contain duplicates.
Example 1:
Input: [1,3,5]
Output: 1
Example 2:
Input: [2,2,2,0,1]
Output: 0
Note:
This is a follow up problem to Find Minimum in Rotated Sorted Array.
Would allowing duplicates affect the run-time complexity? How and why?
The widely accepted answer takes O(n) time in the worst case:
class SolutionK:
    def findMin(self, nums):
        lo, hi = 0, len(nums)-1
        while lo < hi:
            mid = (hi + lo) // 2
            if nums[mid] > nums[hi]:
                lo = mid + 1
            elif nums[mid] < nums[hi]:
                hi = mid
            else:
                hi -= 1
        return nums[lo]
# why not min(nums) or brute force
I think the problem might be solved by a recycled array.
Since there are duplicates, we can find the rightmost maximum; the element at the next index (max index + 1) is then the minimum.
#the mid
lo = 0
hi = len(nums)
mid = (lo+hi) // 2
mid = mid % len(nums)
and the terminating condition
if nums[mid-1] <= nums[mid] > nums[mid+1]: return mid as the peak.
Unfortunately I cannot design the decreasing conditions.
Could you please give some hints?
You can indeed use bisection. In case the array consists of only unique numbers and has been rotated either the leftmost or the rightmost value will be out of order with respect to the middle point. That is array[0] <= array[len(array) // 2] <= array[-1] will be False. On the other hand this condition may hold if:
the array is not rotated at all,
or there are duplicates such as [1, 1, 1, 1, 2] => (rotate left 1) [1, 1, 1, 2, 1].
So we can separately check the left and right part of the condition (array[0] and array[-1] respectively) and in case one of them is invalidated check the corresponding sub-array (the left and right sub-array respectively). In case neither condition is invalidated we need to check both sides and compare.
The following is an example implementation (it only uses min where fewer than three elements are involved, i.e. a simple comparison could be made as well):
def minimum(array):
    if len(array) <= 2:
        return min(array)
    midpoint = len(array) // 2
    if array[0] > array[midpoint]:
        return minimum(array[:midpoint+1])
    elif array[midpoint] > array[-1]:
        return minimum(array[midpoint+1:])
    else:  # Possibly dealing with duplicates.
        return min(minimum(array[:midpoint]),
                   minimum(array[midpoint:]))

from collections import deque
from random import randint, choices

for test in range(1000):
    l = randint(10, 100)
    array = deque(sorted(choices(list(range(l // 2)), k=l)))
    array.rotate(randint(-len(array), len(array)))
    array = list(array)
    assert min(array) == minimum(array)
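The slices in minimum copy part of the list on every call. As a sketch of the same case analysis without copying, the variant below passes index bounds instead. The name minimum_in_range and its inclusive lo/hi parameters are assumptions made for this illustration, and it expects a non-empty list.

```python
def minimum_in_range(array, lo, hi):
    """Minimum of array[lo..hi] (inclusive) for a rotated sorted array."""
    if hi - lo <= 1:
        return min(array[lo], array[hi])
    mid = (lo + hi) // 2
    if array[lo] > array[mid]:
        # The drop happens somewhere in lo..mid, so the minimum is there.
        return minimum_in_range(array, lo, mid)
    if array[mid] > array[hi]:
        # The drop happens after mid, so the minimum is in mid+1..hi.
        return minimum_in_range(array, mid + 1, hi)
    # array[lo] <= array[mid] <= array[hi]: possibly duplicates, check both.
    return min(minimum_in_range(array, lo, mid),
               minimum_in_range(array, mid + 1, hi))


nums = [2, 2, 2, 0, 1]
print(minimum_in_range(nums, 0, len(nums) - 1))
```

When many duplicates are present both halves still have to be searched, so the worst case remains linear.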

Cannot fix "list index out of range"

I'm trying to write a simple code in python to find the first missing positive integer. My algorithm is to create an array full of zeros with the size of maximum positive integer in the input array+1 (for example if the maximum number is 7, the size of 0's array would be 8). Then I trace the input array and whenever I find a positive number I change the index value+1 in the second array to 1. This is my code:
def minPositive(a):
    max_a = max(a)
    b = [0]*(max_a+1)  # This is the second array initialized to zero
    for i in range(len(a)):
        if a[i] > 0:
            b[a[i]+1] = 1
    for j in range(len(b)):
        if j != 0:
            if b[j] == 0:
                return j
But when I code this I face "List index out of range". I traced my program several times but I cannot find the error.
Python indexes from 0, so a list of length n has no element at index n. Likewise, a list with n+1 elements has no element at index n+1.
One option is to use each positive value in a itself as the index (rather than the value plus 1) and set that position in b to 1. You could rewrite the marking step of your function like this (simplified a bit):
def minPositive(a):
    # b[n] is 1 if the positive value n occurs in a, else 0
    b = [1 if n in a and n > 0 else 0 for n in range(max(a) + 1)]
    return b  # the marker list; the search for the first 0 still has to follow
Or you could just make your list b one element longer.
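Putting the two halves together, a minimal sketch of the whole function could look as follows. The name min_positive and the handling of edge cases (an empty list, no positive values, or all of 1..max present) are assumptions added here, since the question does not say what should happen in those cases.

```python
def min_positive(a):
    """Return the smallest positive integer that does not occur in a."""
    largest = max(a, default=0)
    if largest < 1:
        return 1  # no positive values at all

    # seen[v] is True when the positive value v occurs in a.
    # Indexes run from 0 to largest, so seen[largest] is valid.
    seen = [False] * (largest + 1)
    for value in a:
        if value > 0:
            seen[value] = True

    for candidate in range(1, largest + 1):
        if not seen[candidate]:
            return candidate
    return largest + 1  # every value from 1 to largest is present


print(min_positive([3, 4, -1, 1]))
```

Note that the helper list grows with the largest value, not with the length of the input.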

Why is this going out of range?

I am getting an IndexError: list index out of range error. I'm not sure why. Any advice?
The code is trying to check whether a list of numbers is an arithmetic progression; in this case each number is 2 more than the previous one.
def is_arith_progession(num_list):
    delta = num_list[1] - num_list[0]
    for num in num_list:
        if not (num_list[num + 1] - num_list[num] == delta):
            return False
        else:
            return True

print(is_arith_progession([2, 4, 6, 8, 10]))
You are trying to access index 5 of num_list in the second iteration of the for loop. In that iteration num is 4, so the program crashes when it tries to evaluate num_list[num + 1].
The num variable holds the actual element of the list. It is not the index of the element.
To iterate over indices, you may try for num in range(len(num_list) - 1), which should solve the issue. (Note the -1 in the parentheses.)
This:
for num in num_list:
    if not (num_list[num + 1] - num_list[num] == delta):
        return False
almost certainly doesn't do what you think it does. When you define for num in num_list:, this means that num is an item from the list num_list. num is NOT an index. So, if your list is [2, 4, 6, 8, 10], you go out of bounds when num is 4 (i.e. the second item in your list), because your input list is only length 5 and you try to access index num+1, which is 5 (indexes are 0 based, so 5 is out of bounds)
You probably want something like this:
# Start at index 1: the pair (index 0, index 1) defines delta, so it always matches
for index in range(1, len(num_list)-1):
    if not (num_list[index + 1] - num_list[index] == delta):
        return False
or the more pythonic (note there are no indices):
# Again start at index 1; zip handles the end of the list nicely so we don't go out of bounds
for num, next_num in zip(num_list[1:], num_list[2:]):
    if not (next_num - num == delta):
        return False
You are iterating over the values, not the indexes of the array. So, num_list[num] can be out of range. Since you refer to the i+1 element, iterate up to i < n-1
for i, _ in enumerate(num_list[:-1]):
    if num_list[i+1] - num_list[i]...
Two things:
num is an element of num_list, not an index. Iterating over indexes would be for num in range(len(num_list)):. As written, you're effectively evaluating num_list[num_list[i]].
Even if it were an index, for the last index num you would be calling num_list[num+1], which is out of bounds because num is already the last index.
Do for INDEX in range(len(num_list)-1): and if not (num_list[INDEX + 1] - num_list[INDEX] == delta):. That should do it.
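Combining the suggestions above, one possible complete version pairs each element with its successor using zip, so no index arithmetic is needed. This is a sketch rather than the asker's final code; treating lists with fewer than two elements as progressions is an assumption made here.

```python
def is_arith_progression(num_list):
    """Return True if consecutive elements all differ by the same amount."""
    if len(num_list) < 2:
        return True  # assumption: nothing to compare, so accept
    delta = num_list[1] - num_list[0]
    # zip stops at the shorter argument, so the last element is never
    # paired with a non-existent successor.
    return all(b - a == delta for a, b in zip(num_list, num_list[1:]))


print(is_arith_progression([2, 4, 6, 8, 10]))
```

Because all() only returns True after every pair has been checked, the function cannot return early after the first matching pair.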

Solving the "firstDuplicate" question in Python

I'm trying to solve the following challenge from codesignal.com:
Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index. In other words, if there are more than 1 duplicated numbers, return the number for which the second occurrence has a smaller index than the second occurrence of the other number does. If there are no such elements, return -1.
Example
For a = [2, 1, 3, 5, 3, 2], the output should be
firstDuplicate(a) = 3.
There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than the second occurrence of 2 does, so the answer is 3.
For a = [2, 4, 3, 5, 1], the output should be
firstDuplicate(a) = -1.
The execution time limit is 4 seconds.
The guaranteed constraints were:
1 ≤ a.length ≤ 10^5, and
1 ≤ a[i] ≤ a.length
So my code was:
def firstDuplicate(a):
    b = a
    if len(list(set(a))) == len(a):
        return -1
    n = 0
    answer = -1
    starting_distance = float("inf")
    while n!=len(a):
        value = a[n]
        if a.count(value) > 1:
            place_of_first_number = a.index(value)
            a[place_of_first_number] = 'string'
            place_of_second_number = a.index(value)
            if place_of_second_number < starting_distance:
                starting_distance = place_of_second_number
                answer = value
            a=b
        n+=1
        if n == len(a)-1:
            return answer
    return answer
Out of the 22 tests the site had, I passed all of them up to #21, because the test list was large and the execution time exceeded 4 seconds. What are some tips for reducing the execution time, while keeping the code more or less the same?
As @erip has pointed out in the comments, you can iterate through the list and add items to a set. If an item is already in the set, it is the duplicate whose second occurrence has the lowest index, so you can simply return it; return -1 if you get to the end of the loop without finding a duplicate:
def firstDuplicate(a):
    seen = set()
    for i in a:
        if i in seen:
            return i
        seen.add(i)
    return -1
Create a new set and check whether each element is already in it; if it is, return that element:
def firstDuplicate(a):
    dup = set()
    for i in range(len(a)):
        if a[i] in dup:
            return a[i]
        else:
            dup.add(a[i])
    return -1
This is just an idea, I didn't verify it but it should work. It seems there's no memory limit but just a time limit. Therefore trading space for time is probably a practical way to do this. The computational complexity is O(n). This algorithm also depends on the condition that the numbers range from 1 to len(a).
def first_duplicate(a):
    len_a = len(a)
    b = [len_a + 1] * len_a
    for i, n in enumerate(a):
        n0 = n - 1
        if b[n0] == len_a + 1:
            b[n0] = len_a
        elif b[n0] == len_a:
            b[n0] = i
    min_i = len_a
    min_n = -1
    for n0, i in enumerate(b):
        if i < min_i:
            min_i = i
            min_n = n0 + 1
    return min_n
Update:
This solution is not as fast as the set() solution by @blhsing. However, the comparison might turn out differently if this were implemented in C; it's kind of unfair, since set() is a built-in that is implemented in C, like other core functions of CPython.
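Because every value is guaranteed to lie between 1 and len(a), the list itself can also serve as the "seen" set by flipping the sign at index value - 1. The sketch below is not one of the posted answers; the name first_duplicate_in_place is made up here, and the function modifies its argument, so it is only suitable when that is acceptable.

```python
def first_duplicate_in_place(a):
    """Return the value whose second occurrence comes first, or -1.

    Assumes 1 <= a[i] <= len(a). The list is modified: a negative sign at
    index v - 1 records that the value v has been seen.
    """
    for value in a:
        index = abs(value) - 1  # abs() because this slot may be flagged
        if a[index] < 0:
            return abs(value)   # v was seen before: first repeated value
        a[index] = -a[index]
    return -1


print(first_duplicate_in_place([2, 1, 3, 5, 3, 2]))
```

This uses no extra container, at the cost of destroying the signs of the input.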

Find Triplets smaller than a given number

I am trying to solve a problem where:
Given an array of n integers nums and a target, find the number of
index triplets i, j, k with 0 <= i < j < k < n that satisfy the
condition nums[i] + nums[j] + nums[k] < target.
For example, given nums = [-2, 0, 1, 3], and target = 2.
Return 2. Because there are two triplets which sums are less than 2:
[-2, 0, 1] [-2, 0, 3]
My algorithm: Remove a single element (number_1) from the list, set the new target to target - number_1, and search for pairs such that number_2 + number_3 < target - number_1. Problem solved.
The problem link is https://leetcode.com/problems/3sum-smaller/description/ .
My solution is:
def threeSumSmaller(nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: int
    """
    nums = sorted(nums)
    smaller = 0
    for i in range(len(nums)):
        # Create temp array excluding a number
        if i!=len(nums)-1:
            temp = nums[:i] + nums[i+1:]
        else:
            temp = nums[:len(nums)-1]
        # Sort the temp array and set new target to target - the excluded number
        l, r = 0, len(temp) -1
        t = target - nums[i]
        while(l<r):
            if temp[l] + temp[r] >= t:
                r = r - 1
            else:
                smaller += 1
                l = l + 1
    return smaller
My solution fails:
Input:
[1,1,-2]
1
Output:
3
Expected:
1
I don't understand where the error is, since my solution passes more than 30 test cases.
Thanks for your help.
One main point is that when you sort the elements in the first line, you also lose the indexes. This means that, despite having found a triplet, you'll never be sure whether your (i, j, k) will satisfy condition 1, because those (i, j, k) do not come from the original list, but from the new one.
Additionally: every time you pluck an element from the middle of the array, the remaining part of the array is also iterated (although in an irregular way, it still starts from the first of the remaining elements in temp). This should not be the case! I'm expanding details:
The example iterates 3 times over the list (which is, again, sorted and thus you lose the true i, j, and k indexes):
First iteration (i = 0, temp = [1, -2], t = 0).
When you sum temp[l] + temp[r] (l, r are 0, 1) it will be -1.
It satisfies being lower than t. smaller will increase.
The second iteration will be like the first, but with i = 1.
Again it will increase.
The third one will increase as well, because t = 3 and the sum will be 2 now.
So you'll count the triplet three times (although only one triplet can be formed when the indexes are kept in order), because you are iterating through the permutations of the indexes instead of their combinations. So there are two things you did not take care of:
Preserving indexes while sorting.
Ensuring you iterate the indexes in a forward-fashion only.
Try something like this instead:
def find(elements, upper_bound):
    result = 0
    for i in range(0, len(elements) - 2):
        upper_bound2 = upper_bound - elements[i]
        for j in range(i+1, len(elements) - 1):
            upper_bound3 = upper_bound2 - elements[j]
            for k in range(j+1, len(elements)):
                upper_bound4 = upper_bound3 - elements[k]
                if upper_bound4 > 0:
                    result += 1
    return result
Seems like you're counting the same triplet more than once...
In the first iteration of the loop, you omit the first 1 in the list, and then increase smaller by 1. Then you omit the second 1 in the list and increase smaller again by 1. And finally you omit the third element in the list, -2, and of course increase smaller by 1, because -- well -- in all these three cases you were in fact considering the same triplet {1,1,-2}.
p.s. It seems like you care more about correctness than performance. In that case, consider maintaining a set of the solution triplets, to ensure you're not counting the same triplet twice.
There are already good answers. Apart from those, if you want to check your algorithm's result, you can use the built-in itertools module:
import itertools
def find_(vector_,target):
    result=[]
    for i in itertools.combinations(vector_, r=3):
        if sum(i)<target:
            result.append(i)
    return result
Usage:
print(find_([-2, 0, 1, 3],2))
output:
[(-2, 0, 1), (-2, 0, 3)]
If you only want the count:
print(len(find_([-2, 0, 1, 3],2)))
output:
2
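For larger inputs, the usual way to avoid counting a triplet more than once is to fix the smallest position i and let two pointers move only through the positions to its right. The sketch below is not taken from any of the answers; the name three_sum_smaller is chosen for illustration. It sorts a copy first, which is safe when only the count is needed, because each group of three positions is still counted exactly once.

```python
def three_sum_smaller(nums, target):
    """Count triplets of distinct positions whose sum is below target."""
    nums = sorted(nums)  # sorted copy; the caller's list is untouched
    count = 0
    for i in range(len(nums) - 2):
        lo, hi = i + 1, len(nums) - 1  # only look to the right of i
        while lo < hi:
            if nums[i] + nums[lo] + nums[hi] < target:
                # Every position in lo+1..hi also works as the third
                # element together with i and lo.
                count += hi - lo
                lo += 1
            else:
                hi -= 1
    return count


print(three_sum_smaller([1, 1, -2], 1))
```

The brute-force version with itertools.combinations remains a convenient cross-check on small inputs.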
