Thought process behind the Two Sum solution (LeetCode)?

class Solution:
    def twoSum(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        h = {}
        for i, num in enumerate(nums):
            n = target - num
            if n not in h:
                h[num] = i
            else:
                return [h[n], i]
E.g. nums = [2,7,11,15], target = 9 --> Answer: [0, 1]
Solution is posted above. I know there are plenty of explanations of this solution already. I understand what each line of the code does and how it is used to arrive at the answer. My dilemma is: how does one conceptualize the answer from scratch? Why would one think that n = target - num is essential?

Basically, my thought process for solving the question would go like this:
(Paraphrasing the question for readers' understanding): Return the indices of two integers in a given list whose sum is equal to a given target.
i. Try brute force first, which basically means adding every unique pair of elements in the list until we get a pair whose sum is equal to the target.
Time Complexity: O(N^2) in the worst case. Ex: nums_list = [1,2,3,4] and target = 7
ii. Now, can we reduce the time complexity to O(N)? In layman's terms, that means iterating over the list with only a single loop.
Iterating only once means we need to store some information about the visited elements that helps us find the pair as we go through the list.
Take this small example: given that our target = 10 and our element pair is [A,B], if A = 3, we know for sure that B = 10 - 3 = 7. So, when the current element is A, we can check whether B (the difference between the target and A) is among the visited elements.
This gives us the idea: keep track of all visited elements, and if the difference between the target value and the current element is already a visited element, then voilà!
We have the pair, i.e. [Current element, Target - Current element (if available in the visited elements)]
Just renaming the variables in the given code to improve readability:
class Solution:
    def twoSum(self, list_nums, target):
        """
        :type list_nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        visited_elements = {}
        for index, num in enumerate(list_nums):
            difference = target - num
            if difference not in visited_elements:
                visited_elements[num] = index
            else:
                return [visited_elements[difference], index]
So, that's my thought process behind "n = target - num" :)
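To make the jump from step i to step ii concrete, here is a minimal sketch that puts the brute-force version next to the dictionary version. The function names two_sum_brute_force and two_sum_one_pass are made up for this illustration, and returning an empty list when no pair exists is an assumption (the LeetCode problem guarantees exactly one answer).

```python
def two_sum_brute_force(nums, target):
    # Step i: try every unique pair (i < j); O(N^2) additions in the worst case.
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []  # assumption: signal "no pair" with an empty list


def two_sum_one_pass(nums, target):
    # Step ii: remember each visited value together with its index.
    visited = {}
    for index, num in enumerate(nums):
        difference = target - num  # the partner that would complete the pair
        if difference in visited:
            return [visited[difference], index]
        visited[num] = index
    return []  # assumption: signal "no pair" with an empty list


print(two_sum_brute_force([2, 7, 11, 15], 9))
print(two_sum_one_pass([2, 7, 11, 15], 9))
```

The inner loop of the brute-force version is exactly a search for target - nums[i] among the other elements; the dictionary replaces that search with a constant-time lookup, which is where n = target - num comes from.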

Related

Easy algorithm-Leet code- Maximum sub array

Struggling to wrap my head around this.
Maximum Subarray
Easy
Given an integer array nums, find the contiguous subarray (containing at least one number) which has the largest sum and return its sum.
A subarray is a contiguous part of an array.
Example 1:
Input: nums = [-2,1,-3,4,-1,2,1,-5,4]
Output: 6
Explanation: [4,-1,2,1] has the largest sum = 6.
Example 2:
Input: nums = [1]
Output: 1
Example 3:
Input: nums = [5,4,-1,7,8]
Output: 23
class Solution(object):
    def maxSubArray(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        subarray1=[]
        subarray2=[]
        for n in nums:
            subarray1.append(sum(nums[nums.index(n):]))
            nums2=nums[::-1]
            subarray2.append(sum(nums2[nums.index(n):]))
        para1=subarray1.index(max(subarray1))
        para2=len(nums)-subarray2.index(max(subarray2))
        ans=sum(nums[para1:para2])
        if sum(nums)>ans :
            ans=sum(nums)
        if len(nums)==2 and sum(nums)< nums[0] or nums[1] :
            ans=max(nums)
        return ans
I don't understand the iterative logic, and the answers from videos are coming up wrong.
My logic is to create an array summing the input array from both sides and use the index of the max values in those 2 arrays to figure out the bounds of the maximum-sum subarray.
My answer is supposedly wrong when copied onto LeetCode https://leetcode.com/problems/maximum-subarray/
Been trying for hours, and it's marked as easy. I'm sure there is an easy iterative way of doing it, but everything I've found so far is wrong.
There is a standard logic to many of these problems. Assume you know which subarray of nums[:n - 1] has the largest total. Then what is the subarray with the largest total you can find in nums[:n]?
There are two possibilities:
The new answer doesn't contain nums[n-1]. In that case, it has to be the same as the old answer.
The new answer does contain nums[n-1].
So. . .
The actual algorithm is that you iteratively go through the array, repeatedly adding a new element to the array, and keeping track of two answers:
What is the subarray with the largest total
What is the subarray with the largest total containing the last element.
(This answer may be the same as the previous.)
When you then add a new element to the end of the array:
The subarray with the largest total is either (a) the previous largest total or (b) the previous largest total containing the last element plus the new last element or (c) just the last element. Pick the one with the largest total.
The subarray with the largest total containing the last element is the larger of (b) or (c) above.
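The bookkeeping described above can be sketched as follows. This is only an illustration: the function name max_subarray_with_bounds and the choice to return (sum, start, end) with inclusive indices are assumptions made here, not part of the answer. It assumes nums is non-empty.

```python
def max_subarray_with_bounds(nums):
    # best_* describes the best subarray seen so far;
    # ending_* describes the best subarray that ends at the current element.
    best_sum = ending_sum = nums[0]
    best_start = best_end = ending_start = 0
    for i in range(1, len(nums)):
        if ending_sum > 0:
            # case (b): extend the previous best subarray that ends at i - 1
            ending_sum += nums[i]
        else:
            # case (c): start fresh with just the new element
            ending_sum = nums[i]
            ending_start = i
        if ending_sum > best_sum:
            # otherwise case (a): the previous overall best stays
            best_sum, best_start, best_end = ending_sum, ending_start, i
    return best_sum, best_start, best_end


total, start, end = max_subarray_with_bounds([-2, 1, -3, 4, -1, 2, 1, -5, 4])
print(total, [-2, 1, -3, 4, -1, 2, 1, -5, 4][start:end + 1])
```

Tracking the start index alongside the running sum is what lets the sketch report the subarray itself rather than only its total.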
from typing import List  # LeetCode provides List; the import makes the snippet run standalone
class Solution:
    def maxSubArray(self, nums: List[int]) -> int:
        for i in range(1, len(nums)):
            if nums[i-1] > 0:
                nums[i] += nums[i-1]
        return max(nums)
This is a two-pass solution with O(n) time complexity and constant extra space (it overwrites the input list).
How it works:
We add each element to its predecessor, provided the predecessor is greater than 0 (greater or equal would also do). The idea is this: if negative numbers have managed to take the running sum below 0, we discard the prefix and don't care about it anymore. But if some positive value remains, we add it, since it's better than nothing.
At the end we look for the max value.
To make it one pass, you could just have a best value that at each iteration takes the max. Then you wouldn't need to loop over the array at the end again to take the max.
This is by the way Kadane's algorithm, if you are interested to further read about it.
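A minimal sketch of that one-pass variant, assuming a non-empty list. The name max_subarray_one_pass is made up here, and unlike the version above it leaves the input list unmodified.

```python
def max_subarray_one_pass(nums):
    best = current = nums[0]
    for value in nums[1:]:
        # Keep the running prefix only while it is positive.
        current = current + value if current > 0 else value
        # Update the best value inside the loop instead of calling max() afterwards.
        best = max(best, current)
    return best


print(max_subarray_one_pass([-2, 1, -3, 4, -1, 2, 1, -5, 4]))
```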
You can use Kadane's algorithm to solve this problem in O(n) time and constant extra space. It is a simple dynamic programming algorithm:
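from typing import List
class Solution:
    def maxSubArray(self, nums: List[int]) -> int:
        max_sum = -10**4  # lower bound taken from the LeetCode constraint nums[i] >= -10^4
        current_sum = 0
        for n in nums:
            # Either start a new subarray at n or extend the current one.
            current_sum = n if n > current_sum+n else current_sum+n
            if current_sum > max_sum:
                max_sum = current_sum
        return max_sum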
Here's my solution, although it exceeds time limit when the input list has a lot of elements. My idea is to try the sum of every sublist and update the max sum accordingly. There's a faster, but more complex approach by using "divide and conquer" method: https://leetcode.com/problems/maximum-subarray/discuss/1849465/Divide-and-Conquer-Approach-with-Python
My solution (works in 200/209 cases because of Time Limit Exceeded):
from typing import List
class Solution:
    def maxSubArray(self, nums: List[int]) -> int:
        max_sum = - 10 ** 4
        for i in range(len(nums)):
            for j in range(i + 1, len(nums) + 1):
                s = sum(nums[i:j])
                if max_sum < s:
                    max_sum = s
        return max_sum
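One reason the version above is slow is that sum(nums[i:j]) is itself a loop, so the whole thing is roughly O(n^3). A hedged sketch of a cheaper brute force keeps a running total instead, which brings it down to O(n^2). The name max_subarray_quadratic is made up here, and whether this passes the time limit is not something I have checked; the O(n) approaches are still the better choice.

```python
def max_subarray_quadratic(nums):
    max_sum = nums[0]  # assumes a non-empty list
    for i in range(len(nums)):
        running = 0
        for j in range(i, len(nums)):
            running += nums[j]  # sum of nums[i:j + 1], updated in O(1)
            if running > max_sum:
                max_sum = running
    return max_sum


print(max_subarray_quadratic([5, 4, -1, 7, 8]))
```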

Find missing elements in a list created from a sequence of consecutive integers with duplicates in O(n)

This is a Find All Numbers Disappeared in an Array problem from LeetCode:
Given an array of integers where 1 ≤ a[i] ≤ n (n = size of array),
some elements appear twice and others appear once.
Find all the elements of [1, n] inclusive that do not appear in this array.
Could you do it without extra space and in O(n) runtime? You may
assume the returned list does not count as extra space.
Example:
Input:
[4,3,2,7,8,2,3,1]
Output:
[5,6]
My code is below. I think it's O(N), but the interviewer disagrees.
def findDisappearedNumbers(self, nums: List[int]) -> List[int]:
    results_list=[]
    for i in range(1,len(nums)+1):
        if i not in nums:
            results_list.append(i)
    return results_list
You can implement an algorithm where you loop through the list and, for each value v, make the element at index v - 1 negative to mark that v is present. Afterwards, every index i whose element is still positive corresponds to a missing value i + 1, which you add to your list of missing items. It doesn't take any additional space and uses at most 3 for loops (not nested), which makes the complexity O(3*n), which is basically O(n). This site explains it much better and also provides the source code.
edit- I have added the code in case someone wants it:
#The input list and the output list
input = [4, 5, 3, 3, 1, 7, 10, 4, 5, 3]
missing_elements = []
#Loop through each element i and set input[abs(i) - 1] to -input[abs(i) - 1]. abs() is necessary
#because i itself may already have been negated by an earlier iteration
for i in input:
    if(input[abs(i) - 1] > 0):
        input[abs(i) - 1] = -input[abs(i) - 1]
#Loop through the list again and append i + 1 for each value that is still positive
for i in range(0, len(input)):
    if input[i] > 0:
        missing_elements.append(i + 1)
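For reference, the same marking idea wrapped in a function, as a sketch. The name find_disappeared_numbers is made up here, and the copy on the first line is an assumption added so the caller's list is not modified; it costs O(n) extra space, so drop it if the "no extra space" requirement matters.

```python
def find_disappeared_numbers(nums):
    nums = list(nums)  # assumption: work on a copy; remove this line to mark in place
    for value in nums:
        idx = abs(value) - 1  # abs() because value may already have been negated
        if nums[idx] > 0:
            nums[idx] = -nums[idx]  # mark "idx + 1 is present"
    # Indices that were never marked correspond to the missing values.
    return [i + 1 for i, value in enumerate(nums) if value > 0]


print(find_disappeared_numbers([4, 3, 2, 7, 8, 2, 3, 1]))
```

By contrast, the check "i not in nums" in the question scans the list each time it runs, so doing it inside a loop over 1..n gives O(n^2) in the worst case.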
For me using loops is not the best way to do it because loops increase the complexity of the given problem. You can try doing it with sets.
def findMissingNums(input_arr):
    max_num = max(input_arr) # get max number from input list/array
    input_set = set(input_arr) # convert input array into a set
    set_num = set(range(1, max_num + 1)) # create a set of all num from 1 to n (n is the max from the input array)
    missing_nums = list(set_num - input_set) # take difference of both sets and convert to list/array
    return missing_nums
input_arr = [4,3,2,7,8,2,3,1] # 1 <= input_arr[i] <= n
print(findMissingNums(input_arr)) # outputs [5 , 6]
Use a hash table, or dictionary in Python:
def findDisappearedNumbers(self, nums):
    hash_table={}
    for i in range(1,len(nums)+1):
        hash_table[i] = False
    for num in nums:
        hash_table[num] = True
    for i in range(1,len(nums)+1):
        if not hash_table[i]:
            print("missing..",i)
Try the following:
a=[4,3,2,7,8,2,3,1]
b=[x for x in range(1,len(a)+1)]
c,d=set(a),set(b)
print(list(d-c))

recursion vs iteration time complexity

Could anyone explain exactly what's happening under the hood to make the recursive approach in the following problem much faster and efficient in terms of time complexity?
The problem: Write a program that would take an array of integers as input and return the largest three numbers sorted in an array, without sorting the original (input) array.
For example:
Input: [22, 5, 3, 1, 8, 2]
Output: [5, 8, 22]
Even though we can simply sort the original array and return the last three elements, that would take at least O(nlog(n)) time as the fastest sorting algorithm would do just that. So the challenge is to perform better and complete the task in O(n) time.
So I was able to come up with a recursive solution:
def findThreeLargestNumbers(array, largest=[]):
    if len(largest) == 3:
        return largest
    max = array[0]
    for i in array:
        if i > max:
            max = i
    array.remove(max)
    largest.insert(0, max)
    return findThreeLargestNumbers(array, largest)
In which I kept finding the largest number, removing it from the original array, appending it to my empty array, and recursively calling the function again until there are three elements in my array.
However, when I looked at the suggested iterative method, I composed this code:
def findThreeLargestNumbers(array):
    sortedLargest = [None, None, None]
    for num in array:
        check(num, sortedLargest)
    return sortedLargest
def check(num, sortedLargest):
    for i in reversed(range(len(sortedLargest))):
        if sortedLargest[i] is None:
            sortedLargest[i] = num
            return
        if num > sortedLargest[i]:
            shift(sortedLargest, i, num)
            return
def shift(array, idx, element):
    if idx == 0:
        array[0] = element
        return array
    array[0] = array[1]
    array[idx-1] = array[idx]
    array[idx] = element
    return array
Both codes passed successfully all the tests and I was convinced that the iterative approach is faster (even though not as clean..). However, I imported the time module and put the codes to the test by providing an array of one million random integers and calculating how long each solution would take to return back the sorted array of the largest three numbers.
The recursive approach was much faster (about 9 times faster) than the iterative approach!
Why is that? The recursive approach traverses the huge array three times and, on top of that, removes an element each time (which takes O(n) time, as the elements after it need to be shifted in memory), whereas the iterative approach traverses the input array only once, and although it does some operations at every iteration, they are on a negligible array of size 3 that should take hardly any time at all!
I really want to be able to judge and pick the most efficient algorithm for any given problem so any explanation would tremendously help.
Advice for optimization.
Avoid function calls. Avoid creating temporary garbage. Avoid extra comparisons. Have logic that looks at elements as little as possible. Walk through how your code works by hand and look at how many steps it takes.
Your recursive code makes only 3 function calls, and as pointed out elsewhere does an average of about 1.5 comparisons per element per call. (1 while looking for the max, 0.5 while figuring out where to remove the element.)
Your iterative code makes lots of comparisons per element, calls excess functions, and makes calls to things like reversed and range that create/destroy junk.
Now compare with this iterative solution:
def find_largest(array, limit=3):
    if len(array) <= limit:
        # Special logic not needed.
        return sorted(array)
    else:
        # Initialize the answer to values that will be replaced.
        min_val = min(array[0:limit])
        answer = [min_val for _ in range(limit)]
        # Now scan for the largest.
        for i in array:
            if answer[0] < i:
                # Sift elements down until we find the right spot.
                j = 1
                while j < limit and answer[j] < i:
                    answer[j-1] = answer[j]
                    j = j+1
                # Now insert.
                answer[j-1] = i
        return answer
There are no function calls. It is possible that you can make up to 6 comparisons per element (verify that answer[0] < i, verify that (j=1) < 3, verify that answer[1] < i, verify that (j=2) < 3, verify that answer[2] < i, then find that (j=3) < 3 is not true). You will hit that worst case if array is sorted. But most of the time you only do the first comparison then move to the next element. No muss, no fuss.
How does it benchmark?
Note that if you wanted the smallest 100 elements, then you'd find it worthwhile to use a smarter data structure such as a heap to avoid the bubble sort.
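As a rough sketch of the heap route, Python's standard library already has heapq.nlargest, which keeps a small heap of the best candidates while scanning the input once. The wrapper name largest_with_heap is made up for this illustration.

```python
import heapq


def largest_with_heap(array, limit=3):
    # heapq.nlargest returns the `limit` largest items in descending order
    # and does not modify `array`; reverse them to get ascending order.
    return heapq.nlargest(limit, array)[::-1]


print(largest_with_heap([22, 5, 3, 1, 8, 2]))
```

For a limit of 3 the hand-written loop above may well be faster in practice; the heap is expected to pay off as the limit grows, since each update costs about log(limit) steps instead of up to limit shifts.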
I am not really comfortable with Python, but I have a different approach to the problem, for what it's worth.
As far as I can see, all solutions posted are O(NM), where N is the length of the array and M the length of the largest-elements array.
Because of your specific situation where N >> M you could say it's O(N), but the longer the inputs get, the more it will behave like O(NM).
I agree with @zvone that you seem to have more steps in the iterative solution, which sounds like a valid explanation for your different computing speeds.
Back to my proposal: it implements a binary search, O(N*logM), with recursion:
import math
def binarySearch(arr, target, origin = 0):
    """
    Recursive binary search
    Args:
        arr (list): List of numbers to search in
        target (int): Number to search with
    Returns:
        int: index + 1 from immediate lower element to target in arr or -1 if already present or lower than the lowest in arr
    """
    half = math.floor((len(arr) - 1) / 2)
    if target > arr[-1]:
        return origin + len(arr)
    if len(arr) == 1 or target < arr[0]:
        return -1
    if arr[half] < target and arr[half+1] > target:
        return origin + half + 1
    if arr[half] == target or arr[half+1] == target:
        return -1
    if arr[half] < target:
        return binarySearch(arr[half:], target, origin + half)
    if arr[half] > target:
        return binarySearch(arr[:half + 1], target, origin)
def findLargestNumbers(array, limit = 3, result = []):
    """
    Recursive linear search of the largest values in an array
    Args:
        array (list): Array of numbers to search in
        limit (int): Length of array returned. Default: 3
    Returns:
        list: Array of max values with length as limit
    """
    if len(result) == 0:
        result = [float('-inf')] * limit
    if len(array) < 1:
        return result
    val = array[-1]
    foundIndex = binarySearch(result, val)
    if foundIndex != -1:
        result.insert(foundIndex, val)
        return findLargestNumbers(array[:-1],limit, result[1:])
    return findLargestNumbers(array[:-1], limit,result)
It is quite flexible and might be inspiration for a more elaborate answer.
The recursive solution
The recursive function goes through the list 3 times to find the largest number and removes the largest number from the list 3 times.
for i in array:
    if i > max:
        ...
and
array.remove(max)
So, you have 3×N comparisons, plus 3 removals. I guess the removal is optimized in C, but there are again about 3×(N/2) comparisons to find the item to be removed.
So, a total of approximately 4.5 × N comparisons.
The other solution
The other solution goes through the list only once, but each time it compares to the three elements in sortedLargest:
for i in reversed(range(len(sortedLargest))):
    ...
and almost each time it sorts the sortedLargest with these three assignments:
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
So, you are N times:
calling check
creating and reversing a range(3)
accessing sortedLargest[i]
comparing num > sortedLargest[i]
calling shift
comparing idx == 0
and about 2×N/3 times doing:
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
and N/3 times array[0] = element
It is difficult to count, but that is much more than 4.5×N comparisons.

Find duplicates in a array/list of integers

Given an array/list of integers, output the duplicates.
Also, what I am really looking for: what solutions have best time performance? Best space performance? Is it possible to have both best time and best space performance? Just curious. Thank you!
For example: given the list [4,1,7,9,4,5,2,7,6,5,3,6,7], the answer would be [4,7,6,5] (the order of the output does not matter).
I wrote up my solution in python.
Here's one solution I wrote using a hash and binary search.
def binarySearch(array, number):
    start = 0
    end = len(array)
    mid = (end + start) // 2
    while (end > start):
        mid = start + (end - start) // 2
        if array[mid] == number:
            return (mid, True)
        elif number > array[mid]:
            if start == mid:
                return (mid + 1, False)
            start = mid
        else:
            end = mid
    return (mid, False)
def findDuplicatesWithHash(array):
    duplicatesHash = {}
    duplicates = []
    for number in array:
        try:
            index,found = binarySearch(duplicates, number)
            if duplicatesHash[number] == 0 and not found:
                duplicates.insert(index, number)
        except KeyError as error:
            duplicatesHash[number] = 0
    duplicatesSorted = sorted(duplicates, key=lambda tup: tup)
    return duplicatesSorted
There are multiple solutions to finding duplicates. Given that this question is completely generic, one can assume that for a list of n values, the number of distinct duplicated values lies in the range [0, n/2].
What are the possible methods you can think of?
Hash Table approach:
While traversing the list, store each value in the hash table if it doesn't exist there yet. If the value already exists, you have a duplicate.
Algorithm FindDuplicates(list)
    hash_table <- HashTable()
    duplicates <- List()
    for value in list:
        if value in hash_table:
            duplicates.add(value)
        else:
            hash_table.add(value, true)
Time: O(n) to traverse through all values
Space: O(n) to save all possible values in the hash table.
Sort Array
Sort the array and traverse neighbour values.
Algorithm FindDuplicates(list)
    list.sort()
    duplicates <- Set()
    for i <- [1, len(list)-1]:
        if list[i] = list[i-1]:
            duplicates.add(list[i])
Time: O(n.logn) + O(n) = O(n.logn) to sort and traverse all values
Space: O(1) as no extra space created to produce duplicates
Check for every value
For every value check if the value exists in the array.
Algorithm Search(i, list):
    for j <- [0, len(list)-1] - [i]:
        if list[j] = list[i]:
            return true
    return false
Algorithm FindDuplicates(list)
    duplicates <- Set()
    for i <- [1, len(list)-1]:
        if Search(i, list):
            duplicates.add(list[i])
Time: O(n^2), as the number of comparisons is up to n*(n-1)
Space: O(1) as no extra space created to produce duplicates
Note: space for the duplicates array cannot be included in the space complexity equations as that is the result we want.
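The first two approaches might look like this in Python. This is a sketch, not the only way to write them: the function names are made up here, and both collect duplicates in a set (so a value that appears three times is reported once), which differs slightly from the list used in the hash table pseudocode.

```python
def find_duplicates_hash(values):
    seen = set()        # plays the role of the hash table
    duplicates = set()
    for value in values:
        if value in seen:
            duplicates.add(value)
        else:
            seen.add(value)
    return duplicates


def find_duplicates_sorted(values):
    ordered = sorted(values)  # sorted() copies; values.sort() would sort in place
    duplicates = set()
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:  # equal neighbours mean a duplicate
            duplicates.add(ordered[i])
    return duplicates


data = [4, 1, 7, 9, 4, 5, 2, 7, 6, 5, 3, 6, 7]
print(find_duplicates_hash(data))
print(find_duplicates_sorted(data))
```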
Can you think of some more?
One way to get the duplicate:
l = [4,1,7,9,4,5,2,7,6,5,3,6]
import collections
print([item for item, count in collections.Counter(l).items() if count > 1])
Finding duplicates is very similar to sorting. That is, each element needs to be directly or indirectly compared to all other elements to find if there are duplicates. One could modify quicksort to output elements that have an adjacent matching element, with O(n) space complexity and O(n*log(n)) average time complexity.

Better algorithm (than using a dict) for enumerating pairs with a given sum.

Given a number, I have to find out all possible index-pairs in a given array whose sum equals that number. I am currently using the following algo:
def myfunc(array,num):
    dic = {}
    for x in range(len(array)): # if 6 is the current key,
        if num-array[x] in dic: #look at whether num-x is there in dic
            for y in dic[num-array[x]]: #if yes, print all key-pair values
                print((x,y), end=" ")
        if array[x] in dic: #check whether the current keyed value exists
            dic[array[x]].append(x) #if so, append the index to the list of indexes for that keyed value
        else:
            dic[array[x]] = [x] #else create a new array
Will this run in O(N) time? If not, then what should be done to make it so? And in any case, will it be possible to make it run in O(N) time without using any auxiliary data structure?
Will this run in O(N) time?
Yes and no. The complexity is actually O(N + M) where M is the output size.
Unfortunately, the output size is O(N^2) in the worst case. For example, the array [3,3,3,3,3,...,3] with number == 6 results in a quadratic number of pairs that need to be produced.
However, asymptotically speaking, it cannot be done better than this, because it is linear in the input size plus the output size.
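A small sketch that makes the output-size argument tangible. The function name index_pairs is made up here; it collects the pairs into a list instead of printing them, but otherwise follows the same dictionary idea as the question.

```python
from collections import defaultdict


def index_pairs(array, num):
    seen = defaultdict(list)  # value -> indices where it has appeared so far
    pairs = []
    for x, value in enumerate(array):
        # Every earlier index holding num - value forms a pair with x.
        for y in seen.get(num - value, ()):
            pairs.append((y, x))
        seen[value].append(x)
    return pairs


print(index_pairs([1, 5, 3, 3], 6))
print(len(index_pairs([3] * 5, 6)))
```

With N copies of 3 and a target of 6, every pair of indices qualifies, so the result has N*(N-1)/2 entries and no algorithm that lists them all can avoid quadratic work on that input.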
Very, very simple solution that actually does run in O(N) time by using array references. If you want to enumerate all the output pairs, then of course (as amit notes) it must take O(N^2) in the worst case.
from collections import defaultdict
def findpairs(arr, target):
    flip = defaultdict(list)
    for i, j in enumerate(arr):
        flip[j].append(i)
    for i, j in enumerate(arr):
        if target-j in flip:
            yield i, flip[target-j]
Postprocessing to get all of the output values (and filter out (i,i) answers):
def allpairs(arr, target):
    for i, js in findpairs(arr, target):
        for j in js:
            if i < j: yield (i, j)
This might help - Optimal Algorithm needed for finding pairs divisible by a given integer k
(With a slight modification: there we are looking for all pairs whose sum is divisible by a given number, not necessarily equal to it.)
