Python: Most efficient way to compare two lists of integers


I'm trying to compare two lists of integers, each the same size, in Python 2.6. I need to compare the first item in List 1 with the first item in List 2, the second item in List 1 with the second item in List 2, and so on, and return a result only if ALL of the item pairs satisfy the same comparison. It should behave as follows:
list1 = [1,1,1,1]
list2 = [2,1,2,3]
compare(list1,list2)
# returns a "list 1 is <= list 2" response.
list1 = [4,1,4,3]
list2 = [2,1,2,3]
compare(list1,list2)
# returns a "list 1 is >= list 2" response.
list1 = [3,2,3,2]
list2 = [1,4,1,4]
compare(list1,list2)
# returns None: some items in list1 > list2, and some items in list2 > list1.
I figured I could write the code like the following block, but I don't know if it's the most efficient. My program is going to be calling this method a LOT so I want to streamline this as much as possible.
def compare(list1,list2):
    gt_found = 0
    lt_found = 0
    for x in range(len(list1)):
        if list1[x] > list2[x]:
            gt_found += 1
        elif list1[x] < list2[x]:
            lt_found += 1
        if gt_found > 0 and lt_found > 0:
            return None #(some items >, some items <)
    if gt_found > 0:
        return 1 #(list1 >= list2)
    if lt_found > 0:
        return -1 #(list1 <= list2)
    return 0 #(list1 == list2)
Is it already as good as it's going to get (big-O of n), or is there a faster way to go about it (or a way that uses system functions instead)?
CLARIFICATION: I expect the case that returns 'None' to happen the most often, so it is important.

You can consider a numpy-based vectorized comparison.
import numpy as np
a = [1,1,1,2]
b = [2,2,4,3]
all_larger = np.all(np.asarray(b) > np.asarray(a)) # true if b > a holds elementwise
print all_larger
True
Clearly, you can engineer the thing to have your answer.
all_larger = lambda b,a : np.all(np.asarray(b) > np.asarray(a))
if all_larger(b,a):
    print "b > a"
elif all_larger(a,b):
    print "a > b"
else:
    print "nothing!"
Every type of comparison such as <, >, <=, >=, can be done.

Are you familiar with the wonderful zip function?
import itertools
def compare(xs, ys):
    all_less = True
    all_greater = True
    for x, y in itertools.izip(xs, ys):
        if not all_less and not all_greater:
            return None
        if x > y:
            all_less = False
        elif x < y:
            all_greater = False
    if all_less:
        return "list 1 is <= list 2"
    elif all_greater:
        return "list 1 is >= list 2"
    return None # all_greater might be set False on final iteration
izip takes two lists (xs and ys in this case, but call them whatever you want) and creates an iterator over a sequence of tuples.
list(izip([1,2,3,4], [4,3,2,1])) == [(1,4), (2,3), (3,2), (4,1)]
This way you can iterate through both lists simultaneously and compare each value in tandem. The time complexity should be O(n), where n is the size of your lists.
It will return early in cases where neither the >= nor the <= condition can be met.
Update
As James Matta points out, itertools.izip performs better than the standard zip in Python 2. This isn't true in Python 3, where the standard zip works the way izip does in older versions.
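For readers on Python 3, the early-exit idea above can be sketched with the built-in zip. This is only an illustration, not the code from the answer: the 1 / -1 / 0 / None return convention is borrowed from the question, and the names seen_greater and seen_less are made up here.

```python
def compare(xs, ys):
    """Return 0 if equal, 1 if xs >= ys elementwise, -1 if xs <= ys, else None."""
    seen_greater = False  # some x was strictly greater than its partner
    seen_less = False     # some x was strictly less than its partner
    for x, y in zip(xs, ys):
        if x > y:
            seen_greater = True
        elif x < y:
            seen_less = True
        # As soon as both directions have been seen, no ordering can hold.
        if seen_greater and seen_less:
            return None
    if seen_greater:
        return 1
    if seen_less:
        return -1
    return 0
```

Since the None case is expected to be the most common one, checking for it inside the loop lets most calls stop after only a few elements.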

For anyone interested in the performance of the two methods, I named the iterative method 'tortoise' and the numpy method 'hare', and tested it with the code below.
At first, the 'tortoise' won [.009s [T] vs .033s [H]], but I checked it and found that asarray() was being called more often than it needed to be. With that fix, the 'hare' won again, [.009s [T] vs .006s [H]].
The data is here: http://tny.cz/64d6e5dc
It consists of 28 lines of about 950 elements in length. Four of the lines collectively >= all the others.
It might be interesting to see how the performance works on larger data sets.
import itertools, operator, math
import cProfile
import numpy as np
data = #SEE PASTEBIN
def tortoise(xs, ys):
    all_less = True
    all_greater = True
    for x, y in zip(xs, ys):
        if not all_less and not all_greater:
            return None
        if x > y:
            all_less = False
        elif x < y:
            all_greater = False
    if all_greater and all_less:
        return 0
    if all_greater:
        return 1
    if all_less:
        return -1
    return None # all_greater might be set False on final iteration
hare = lambda b,a : np.all(b >= a)
def find_uniques_tortoise():
    include_list = range(len(data))
    current_list_index = 0
    while current_list_index < len(data):
        if current_list_index not in include_list:
            current_list_index += 1
            continue
        for x in range(current_list_index+1,len(data)):
            if x not in include_list:
                continue
            result = tortoise(data[current_list_index], data[x])
            if result is None: #no comparison
                continue
            elif result == 1 or result == 0: # this one beats the other one
                include_list.remove(x)
                continue
            elif result == -1: #the other one beats this one
                include_list.remove(current_list_index)
                break
        current_list_index +=1
    return include_list
def find_uniques_hare():
    include_list = range(len(data))
    current_list_index = 0
    #do all asarray()s beforehand for max efficiency
    for x in range(len(data)):
        data[x] = np.asarray(data[x])
    while current_list_index < len(data):
        if current_list_index not in include_list:
            current_list_index += 1
            continue
        for x in range(current_list_index+1,len(data)):
            if x not in include_list:
                continue
            if hare(data[current_list_index], data[x]): # this one beats the other one, or it's a tie
                include_list.remove(x)
                # print x
                continue
            elif hare(data[x], data[current_list_index]): #the other one beats this one
                include_list.remove(current_list_index)
                # print current_list_index
                break
            else: #no comparison
                continue
        current_list_index +=1
    return include_list
cProfile.run('find_uniques_tortoise()')
cProfile.run('find_uniques_hare()')
print find_uniques_tortoise()
print
print find_uniques_hare()

Related

How to check elements in a list WITHOUT using for loops?

Apologies if the title of the question is phrased badly. I am currently trying to make a function that takes in a list of integers from 1 to n, where n is the length of the list. The function should return the first value that is repeated in the list. Duplicates are NOT always next to one another. If one or more integers is less than 1 or if it is not a list, the function should return -1. If there are no duplicates, return 0.
This is my current code:
def find_duplicates(ls):
    if type(ls) != list:
        return -1
    non_dupe = []
    i = 0
    while i < len(ls):
        if ls[i] < 1:
            return -1
            break
        if ls.count(i) > 1:
            return i
            break
        else:
            non_dupe.append(i)
            i += 1
    if len(non_dupe) == len(ls):
        return 0
While this code works for a majority of test cases, it doesn't seem to pass
print(find_duplicates([1, 2, 2, 0]))
as it returns 2 instead of the expected -1. I am relatively new to Python and I can't seem to be able to fix this error. I've tried searching for ways to counter this problem but I am not allowed to use for loops to check through a list. Any help is greatly appreciated.
EDIT: I am not allowed to use any of the following but anything else is accepted.
for loops
min() / max()
enumerate() / zip ()
sort()
negative indexing e.g ls[-1]
list slicing

Your code returns a duplicate prematurely: while traversing the list, the function first finds 2 as a duplicate, returns it, and halts immediately. But it has not yet seen the 0 at the end.
So you need to let the function scan the list all the way to the end, looking for a number less than 1. If such a number is found along the way, you can halt the function. If none is seen before the end, let it return the duplicate value:
def find_duplicates(ls):
    if not isinstance(ls, list): # check whether ls is a list
        return -1
    dup = 0
    seen = [] # list of numbers seen so far
    i = 0 # index
    while i < len(ls):
        if ls[i] < 1: # if a number less than 1 is found, return -1
            return -1
        if ls[i] in seen and dup == 0:
            dup = ls[i]
        seen.append(ls[i])
        i += 1
    return dup
print(find_duplicates([1, 2, 2, 0])) # -1
print(find_duplicates([1, 1, 2, 2, 3])) # 1
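The same idea can be sketched with a set instead of a list for the values seen so far, which makes each membership test O(1) on average. This is a variation written for illustration, not the answerer's code; the name find_first_duplicate is made up, and it assumes that sets are allowed, since they are not on the question's list of banned features.

```python
def find_first_duplicate(ls):
    """Return -1 for invalid input, 0 if nothing repeats, else the first repeated value."""
    if not isinstance(ls, list):
        return -1
    seen = set()   # values encountered so far
    first_dup = 0  # 0 means "no duplicate found yet"
    i = 0
    while i < len(ls):  # while loop only, since for loops are not allowed
        value = ls[i]
        if value < 1:
            return -1   # an invalid value anywhere overrides any duplicate
        if first_dup == 0 and value in seen:
            first_dup = value
        seen.add(value)
        i += 1
    return first_dup
```

The scan still has to reach the end of the list before returning a duplicate, because a value below 1 later in the list must win.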

The problem is that you leave the while loop as soon as a duplicate is found, so the function returns the first duplicate before it has checked the rest of the list.
Try this:
def find_duplicates(ls):
    if type(ls) is not list:
        return -1
    duplicated = 0
    i = 0
    while i < len(ls):
        if ls[i] < 1:
            return -1
        if ls.count(ls[i]) > 1 and duplicated == 0:
            duplicated = ls[i]
        i += 1
    return duplicated

Your test case returns 2 because the duplicated 2 sits at a lower index than the 0.
I would suggest sorting the list before moving on:
def find_duplicates(ls):
    if type(ls) != list:
        return -1
    sorted_list = sorted(ls) #Assign sorted `ls` to another variable, while keeping the order of `ls` intact
    non_dupe = []
    i = 0
    while i < len(sorted_list):
        if sorted_list[i] < 1:
            return -1
        if sorted_list.count(sorted_list[i]) > 1:
            return sorted_list[i]
        else:
            non_dupe.append(sorted_list[i])
        i += 1
    if len(non_dupe) == len(ls):
        return 0
Another method I would recommend is using set - a built-in data type of Python. Maybe you should consider trying this approach later on when all test cases are passed. Have a look at this Tutorial for set usage: https://www.w3schools.com/python/python_sets.asp.

You were very close. Try this:
def find_duplicates(ls):
    if type(ls) != list:
        return -1
    non_dupe = []
    i = 0
    while i < len(ls):
        if ls[i] < 1:
            return -1
        elif ls[i] in non_dupe:
            return ls[i]
        else:
            non_dupe.append(ls[i])
        i += 1
    return 0

my_list = [1,2,2,0]
result = list(set(filter(lambda x: my_list.count(x) > 1 , my_list)))
# Result => [2]
I hope this solves your problem

Python code for printing out the second largest number given a list [duplicate]

I'm learning Python, and the simple ways of handling lists are presented as an advantage. Sometimes they are, but look at this:
>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> numbers.remove(max(numbers))
>>> max(numbers)
74
A very easy, quick way of obtaining the second largest number from a list. Except that the easy list processing helps write a program that runs through the list twice over, to find the largest and then the 2nd largest. It's also destructive - I need two copies of the data if I wanted to keep the original. We need:
>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> if numbers[0]>numbers[1]:
...     m, m2 = numbers[0], numbers[1]
... else:
...     m, m2 = numbers[1], numbers[0]
...
>>> for x in numbers[2:]:
...     if x>m2:
...         if x>m:
...             m2, m = m, x
...         else:
...             m2 = x
...
>>> m2
74
Which runs through the list just once, but isn't terse and clear like the previous solution.
So: is there a way, in cases like this, to have both? The clarity of the first version, but the single run through of the second?

You could use the heapq module:
>>> el = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> import heapq
>>> heapq.nlargest(2, el)
[90.8, 74]
And go from there...
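One possible way to "go from there" is to wrap the call in a small function. This is only a sketch: the function name and the distinct flag are made up for illustration, and the flag exists because "second largest" can mean either the runner-up item or the runner-up distinct value.

```python
import heapq

def second_largest(numbers, distinct=False):
    """Return the second largest item, or None if there is no such item."""
    # With distinct=True, repeated values count once, so [10, 7, 10] gives 7.
    candidates = set(numbers) if distinct else numbers
    top_two = heapq.nlargest(2, candidates)
    return top_two[1] if len(top_two) == 2 else None
```

heapq.nlargest makes a single pass over the data and does not modify the original list.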

Since @OscarLopez and I have different opinions on what the second largest means, I'll post the code according to my interpretation and in line with the first algorithm provided by the questioner.
def second_largest(numbers):
    count = 0
    m1 = m2 = float('-inf')
    for x in numbers:
        count += 1
        if x > m2:
            if x >= m1:
                m1, m2 = x, m1
            else:
                m2 = x
    return m2 if count >= 2 else None
(Note: Negative infinity is used here instead of None since None has different sorting behavior in Python 2 and 3 – see Python - Find second smallest number; a check for the number of elements in numbers makes sure that negative infinity won't be returned when the actual answer is undefined.)
If the maximum occurs multiple times, it may be the second largest as well. Another thing about this approach is that it works correctly if there are fewer than two elements; in that case there is no second largest, and None is returned.
Running the same tests:
second_largest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7])
=> 74
second_largest([1,1,1,1,1,2])
=> 1
second_largest([2,2,2,2,2,1])
=> 2
second_largest([10,7,10])
=> 10
second_largest([1,1,1,1,1,1])
=> 1
second_largest([1])
=> None
second_largest([])
=> None
Update
I restructured the conditionals to drastically improve performance; by almost 100% in my testing on random numbers. The reason for this is that in the original version, the elif was always evaluated in the likely event that the next number is not the largest in the list. In other words, for practically every number in the list, two comparisons were made, whereas one comparison mostly suffices: if the number is not larger than the second largest, it's not larger than the largest either.
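To make the comparison-count argument concrete, here is a small sketch that counts comparisons for both orderings. The earlier version of the answer is not shown in this thread, so the "flat" variant below is an assumption about what an if/elif layout would look like; both function names are made up.

```python
def count_flat(numbers):
    """if/elif layout (assumed): test against the largest first, then the second largest."""
    m1 = m2 = float('-inf')
    comparisons = 0
    for x in numbers:
        comparisons += 1          # x >= m1
        if x >= m1:
            m1, m2 = x, m1
        else:
            comparisons += 1      # x > m2, evaluated for almost every item
            if x > m2:
                m2 = x
    return m2, comparisons

def count_nested(numbers):
    """Nested layout: test against the second largest first."""
    m1 = m2 = float('-inf')
    comparisons = 0
    for x in numbers:
        comparisons += 1          # x > m2, usually False, so usually the only test
        if x > m2:
            comparisons += 1      # x >= m1
            if x >= m1:
                m1, m2 = x, m1
            else:
                m2 = x
    return m2, comparisons
```

Both variants return the same second largest value; on typical input the nested one needs close to one comparison per item instead of close to two.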

You could always use sorted
>>> sorted(numbers)[-2]
74

Try the solution below, it's O(n) and it will store and return the second greatest number in the second variable. UPDATE: I've adjusted the code to work with Python 3, because now arithmetic comparisons against None are invalid.
Notice that if all elements in numbers are equal, or if numbers is empty or if it contains a single element, the variable second will end up with a value of None - this is correct, as in those cases there isn't a "second greatest" element.
Beware: this finds the "second maximum" value, if there's more than one value that is "first maximum", they will all be treated as the same maximum - in my definition, in a list such as this: [10, 7, 10] the correct answer is 7.
def second_largest(numbers):
    minimum = float('-inf')
    first, second = minimum, minimum
    for n in numbers:
        if n > first:
            first, second = n, first
        elif first > n > second:
            second = n
    return second if second != minimum else None
Here are some tests:
second_largest([20, 67, 3, 2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7])
=> 74
second_largest([1, 1, 1, 1, 1, 2])
=> 1
second_largest([2, 2, 2, 2, 2, 1])
=> 1
second_largest([10, 7, 10])
=> 7
second_largest( [1, 3, 10, 16])
=> 10
second_largest([1, 1, 1, 1, 1, 1])
=> None
second_largest([1])
=> None
second_largest([])
=> None

Why complicate things? It's very simple and straightforward:
Convert the list to a set, which removes duplicates
Sort the set back into a list, which gives the values in ascending order (a set by itself has no guaranteed order)
Here is the code:
mlist = [2, 3, 6, 6, 5]
mlist = sorted(set(mlist))
print mlist[-2]

You can find the 2nd largest by any of the following ways:
Option 1:
numbers = set(numbers)
numbers.remove(max(numbers))
max(numbers)
Option 2:
sorted(set(numbers))[-2]

The quickselect algorithm, O(n) cousin to quicksort, will do what you want. Quickselect has average performance O(n). Worst case performance is O(n^2) just like quicksort but that's rare, and modifications to quickselect reduce the worst case performance to O(n).
The idea of quickselect is to use the same pivot, lower, higher idea of quicksort, but to then ignore the lower part and to further order just the higher part.
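A minimal sketch of that idea follows, assuming a random pivot and list-building partitions rather than the in-place partitioning a tuned implementation would use. The function name kth_largest and its k parameter are illustrative; k=2 gives the second largest, counting repeated values separately.

```python
import random

def kth_largest(numbers, k):
    """Return the k-th largest item (k=1 is the maximum) using quickselect."""
    if not 1 <= k <= len(numbers):
        raise ValueError("k must be between 1 and len(numbers)")
    items = list(numbers)  # work on a copy so the caller's list is untouched
    while True:
        pivot = random.choice(items)
        higher = [x for x in items if x > pivot]
        equal = [x for x in items if x == pivot]
        if k <= len(higher):
            # The answer is among the items above the pivot; drop the rest.
            items = higher
        elif k <= len(higher) + len(equal):
            return pivot
        else:
            # Skip everything at or above the pivot and keep searching below it.
            k -= len(higher) + len(equal)
            items = [x for x in items if x < pivot]
```

Each round discards part of the list, which is where the average O(n) behaviour comes from; an unlucky run of pivots gives the O(n^2) worst case mentioned above.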

This is one simple way:
def find_second_largest(arr):
    first, second = 0, 0  # assumes all numbers are positive
    for number in arr:
        if number > first:
            second = first
            first = number
        elif number > second and number < first:
            second = number
    return second

If you do not mind using numpy (import numpy as np):
np.partition(numbers, -2)[-2]
gives you the 2nd largest element of the list with a guaranteed worst-case O(n) running time.
The partition(a, kth) function returns an array where the kth element is the same as it would be in a sorted array, all elements before it are smaller or equal, and all elements after it are larger or equal.

There are some good answers here for type([]); in case someone needs the same thing for a type({}), here it is:
def secondLargest(D):
    def second_largest(L):
        if(len(L)<2):
            raise Exception("Second_Of_One")
        KFL=None #KeyForLargest
        KFS=None #KeyForSecondLargest
        n = 0
        for k in L:
            if(KFL == None or k>=L[KFL]):
                KFS = KFL
                KFL = n
            elif(KFS == None or k>=L[KFS]):
                KFS = n
            n+=1
        return (KFS)
    KFL=None #KeyForLargest
    KFS=None #KeyForSecondLargest
    if(len(D)<2):
        raise Exception("Second_Of_One")
    if(type(D)!=type({})):
        if(type(D)==type([])):
            return(second_largest(D))
        else:
            raise Exception("TypeError")
    else:
        for k in D:
            if(KFL == None or D[k]>=D[KFL]):
                KFS = KFL
                KFL = k
            elif(KFS == None or D[k] >= D[KFS]):
                KFS = k
    return(KFS)
a = {'one':1 , 'two': 2 , 'thirty':30}
b = [30,1,2]
print(a[secondLargest(a)])
print(b[secondLargest(b)])
Just for fun I tried to make it user friendly xD

>>> l = [19, 1, 2, 3, 4, 20, 20]
>>> sorted(set(l))[-2]
19

O(n): The time complexity of a loop is O(n) if the loop variable is incremented or decremented by a constant amount. For example, the following loop has O(n) time complexity.
// Here c is a positive integer constant
for (int i = 1; i <= n; i += c) {
    // some O(1) expressions
}
To find the second largest number, I first find the largest number and then take the maximum of the remaining values that differ from it:
x = [1,2,3]
A = list(map(int, x))
y = max(A)
k1 = list()
for values in range(len(A)):
    if y != A[values]:
        k1.append(A[values])
z = max(k1)
print z

Objective: To find the second largest number from input.
Input : 5
2 3 6 6 5
Output: 5
n = int(raw_input())
arr = map(int, raw_input().split())
print sorted(list(set(arr)))[-2]

def SecondLargest(x):
    largest = max(x[0],x[1])
    largest2 = min(x[0],x[1])
    for item in x:
        if item > largest:
            largest2 = largest
            largest = item
        elif largest2 < item and item < largest:
            largest2 = item
    return largest2
SecondLargest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7])

list_nums = [1, 2, 6, 6, 5]
minimum = float('-inf')
max, min = minimum, minimum
for num in list_nums:
    if num > max:
        max, min = num, max
    elif max > num > min:
        min = num
print(min if min != minimum else None)
Output
5

Initialize with -inf. This code generalizes for all cases to find the second largest element.
max1= float("-inf")
max2=max1
for item in arr:
    if max1<item:
        max2,max1=max1,item
    elif item>max2 and item!=max1:
        max2=item
print(max2)

Using reduce from functools should be a linear-time functional-style alternative:
from functools import reduce
def update_largest_two(largest_two, x):
    m1, m2 = largest_two
    return (m1, m2) if m2 >= x else (m1, x) if m1 >= x else (x, m1)
def second_largest(numbers):
    if len(numbers) < 2:
        return None
    largest_two = sorted(numbers[:2], reverse=True)
    rest = numbers[2:]
    m1, m2 = reduce(update_largest_two, rest, largest_two)
    return m2
... or in a very concise style:
from functools import reduce
def second_largest(n):
    update_largest_two = lambda a, x: a if a[1] >= x else (a[0], x) if a[0] >= x else (x, a[0])
    return None if len(n) < 2 else (reduce(update_largest_two, n[2:], sorted(n[:2], reverse=True)))[1]

This can be done with [N + log(N) - 2] comparisons, which is slightly better than the loose upper bound of 2N (which can be thought of as O(N) too).
The trick is to use binary recursive calls and the "tennis tournament" algorithm. The winner (the largest number) will emerge after all the 'matches' (which takes N-1 comparisons), but if we record the 'players' of all the matches, and among them group all the players that the winner has beaten, the second largest number will be the largest number in this group, i.e. the 'losers' group.
The size of this 'losers' group is log(N), and again, we can invoke the binary recursive calls to find the largest among the losers, which will take [log(N) - 1] comparisons. Actually, we can just linearly scan the losers group to get the answer too; the cost is the same.
Below is a sample python code:
def largest(L):
    global pairs
    if len(L) == 1:
        return L[0]
    else:
        left = largest(L[:len(L)//2])
        right = largest(L[len(L)//2:])
        pairs.append((left, right))
        return max(left, right)
def second_largest(L):
    global pairs
    biggest = largest(L)
    second_L = [min(item) for item in pairs if biggest in item]
    return biggest, largest(second_L)
if __name__ == "__main__":
    pairs = []
    # test array
    L = [2,-2,10,5,4,3,1,2,90,-98,53,45,23,56,432]
    if len(L) == 0:
        first, second = None, None
    elif len(L) == 1:
        first, second = L[0], None
    else:
        first, second = second_largest(L)
    print('The largest number is: ' + str(first))
    print('The 2nd largest number is: ' + str(second))

You can also try this:
>>> list=[20, 20, 19, 4, 3, 2, 1,100,200,100]
>>> sorted(set(list), key=int, reverse=True)[1]
100

A simple way :
n=int(input())
arr = set(map(int, input().split()))
arr.remove(max(arr))
print (max(arr))

Use the default sort() method to get the second largest number in the list.
sort() is a built-in list method, so you do not need to import any module for it.
lis = [11,52,63,85,14]
lis.sort()
print(lis[len(lis)-2])

Just to make the accepted answer more general, the following is the extension to get the kth largest value:
def kth_largest(numbers, k):
    largest_ladder = [float('-inf')] * k
    count = 0
    for x in numbers:
        count += 1
        ladder_pos = 1
        for v in largest_ladder:
            if x > v:
                ladder_pos += 1
            else:
                break
        if ladder_pos > 1:
            largest_ladder = largest_ladder[1:ladder_pos] + [x] + largest_ladder[ladder_pos:]
    return largest_ladder[0] if count >= k else None

def secondlarget(passinput):
    passinputMax = max(passinput) #find the maximum element from the array
    newpassinput = [i for i in passinput if i != passinputMax] #drop every occurrence of the maximum
    #print (newpassinput)
    if len(newpassinput) > 0:
        return max(newpassinput) #return the second largest
    return 0
if __name__ == '__main__':
    n = int(input().strip()) # lets say 5
    passinput = list(map(int, input().rstrip().split())) # 1 2 2 3 3
    result = secondlarget(passinput) #2
    print (result) #2

if __name__ == '__main__':
    n = int(input())
    arr = list(map(float, input().split()))
    high = max(arr)
    secondhigh = min(arr)
    for x in arr:
        if x < high and x > secondhigh:
            secondhigh = x
    print(secondhigh)
The code above reads the list elements from user input. The code below uses the list from the question:
#list
numbers = [20, 67, 3 ,2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7]
#find the highest element in the list
high = max(numbers)
#find the lowest element in the list
secondhigh = min(numbers)
for x in numbers:
    '''
    find the second highest element in the list,
    it works even when there are duplicates highest element in the list.
    It runs through the entire list finding the next lowest element
    which is less then highest element but greater than lowest element in
    the list set initially. And assign that value to secondhigh variable, so
    now this variable will have next lowest element in the list. And by end
    of loop it will have the second highest element in the list
    '''
    if (x<high and x>secondhigh):
        secondhigh=x
print(secondhigh)

Find the maximum by comparing each value to max_item. In the first if, every time max_item changes, its previous value is handed to second_max. The second if keeps the two coupled by ensuring that max_item never falls below second_max.
def secondmax(self, list):
    max_item = list[0]
    second_max = list[1]
    for item in list:
        if item > max_item:
            second_max = max_item
            max_item = item
        if max_item < second_max:
            max_item = second_max
    return second_max

The trick is to compare each new value against both the current maximum and the current runner-up: the second largest is either the previous maximum (when a new maximum appears) or a value that falls between the runner-up and the maximum.
def secondLargest(lista):
    max_number = 0
    prev_number = 0
    for i in range(0, len(lista)):
        if lista[i] > max_number:
            prev_number = max_number
            max_number = lista[i]
        elif lista[i] > prev_number and lista[i] < max_number:
            prev_number = lista[i]
    return prev_number

Most of the previous answers are correct, but here is another way!
Our strategy is to loop with two variables, first_highest and second_highest. We loop through the numbers, and if the current number is greater than first_highest, we set second_highest to the old first_highest and then set first_highest to the current number. Otherwise, if the current number is greater than second_highest, we set second_highest to the current number.
#!/usr/bin/env python3
import sys
def find_second_highest(numbers):
    min_integer = -sys.maxsize -1
    first_highest= second_highest = min_integer
    for current_number in numbers:
        if current_number == first_highest and min_integer != second_highest:
            first_highest=current_number
        elif current_number > first_highest:
            second_highest = first_highest
            first_highest = current_number
        elif current_number > second_highest:
            second_highest = current_number
    return second_highest
print(find_second_highest([80,90,100]))
print(find_second_highest([80,80]))
print(find_second_highest([2,3,6,6,5]))

Best solution that my friend Dhanush Kumar came up with:
def second_max(loop):
    glo_max = loop[0]
    sec_max = float("-inf")
    for i in loop:
        if i > glo_max:
            sec_max = glo_max
            glo_max=i
        elif sec_max < i < glo_max:
            sec_max = i
    return sec_max
#print(second_max([-1,-3,-4,-5,-7]))
assert second_max([-1,-3,-4,-5,-7])==-3
assert second_max([5,3,5,1,2]) == 3
assert second_max([1,2,3,4,5,7]) ==5
assert second_max([-3,1,2,5,-2,3,4]) == 4
assert second_max([-3,-2,5,-1,0]) == 0
assert second_max([0,0,0,1,0]) == 0

Below code will find the max and the second max numbers without the use of max function. I assume that the input will be numeric and the numbers are separated by single space.
myList = input().split()
myList = list(map(eval,myList))
m1 = myList[0]
m2 = myList[0]
for x in myList:
    if x > m1:
        m2 = m1
        m1 = x
    elif x > m2:
        m2 = x
print ('Max Number: ',m1)
print ('2nd Max Number: ',m2)

Here I tried to come up with an answer.
2nd(Second) maximum element in a list using single loop and without using any inbuilt function.
def secondLargest(lst):
    mx = 0
    num = 0
    sec = 0
    for i in lst:
        if i > mx:
            sec = mx
            mx = i
        else:
            if i > num and num >= sec:
                sec = i
        num = i
    return sec

Python given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A in O(n) time complexity

For example:
input: A = [ 6 4 3 -5 0 2 -7 1 ]
output: 5
Since 5 is the smallest positive integer that does not occur in the array.
I have written two solutions to that problem. The first one is good, but I don't want to use any external libraries, and its complexity is O(n log n). The second solution, which I need your help to optimize, gives an error when the input is a chaotic sequence of length 10005 (with minus).
Solution 1:
from itertools import count, filterfalse
def minpositive(a):
    return(next(filterfalse(set(a).__contains__, count(1))))
Solution 2:
def minpositive(a):
    count = 0
    b = list(set([i for i in a if i>0]))
    if min(b, default = 0) > 1 or min(b, default = 0) == 0 :
        min_val = 1
    else:
        min_val = min([b[i-1]+1 for i, x in enumerate(b) if x - b[i - 1] >1], default=b[-1]+1)
    return min_val
Note: This was a demo test in Codility; solution 1 got 100% and solution 2 got 77%.
The errors in solution 2 were:
Performance tests -> medium chaotic sequences length=10005 (with minus): got 3, expected 10000
Performance tests -> large chaotic + many -1, 1, 2, 3 (with minus): got 5, expected 10000

Testing for the presence of a number in a set is fast in Python so you could try something like this:
def minpositive(a):
    A = set(a)
    ans = 1
    while ans in A:
        ans += 1
    return ans

Fast for large arrays.
def minpositive(arr):
    if 1 not in arr: # protection from error if ( max(arr) < 0 )
        return 1
    else:
        maxArr = max(arr) # find max element in 'arr'
        c1 = set(range(2, maxArr+2)) # create array from 2 to max
        c2 = c1 - set(arr) # find all positive elements outside the array
        return min(c2)

I have an easy solution. No need to sort.
def solution(A):
    s = set(A)
    m = max(A) + 2
    for N in range(1, m):
        if N not in s:
            return N
    return 1
Note: It is 100% total score (Correctness & Performance)

def minpositive(A):
    """Given a list A of N integers,
    returns the smallest positive integer (greater than 0)
    that does not occur in A in O(n) time complexity
    Args:
        A: list of integers
    Returns:
        integer: smallest positive integer
    e.g:
        A = [1,2,3]
        smallest_positive_int = 4
    """
    len_nrs_list = len(A)
    N = set(range(1, len_nrs_list+2))
    return min(N-set(A)) #gets the min value using the N integers

This solution passes the performance test with a score of 100%
def solution(A):
    n = sorted(i for i in set(A) if i > 0) # Remove duplicates and negative numbers
    if not n:
        return 1
    ln = len(n)
    for i in range(1, ln + 1):
        if i != n[i - 1]:
            return i
    return ln + 1

def solution(A):
    B = sorted(set(A))  # sort after deduplicating; a set has no guaranteed order
    m = 1
    for x in B:
        if x == m:
            m+=1
    return m

Continuing on from Niroj Shrestha and najeeb-jebreel, added an initial portion to avoid iteration in case of a complete set. Especially important if the array is very large.
def smallest_positive_int(A):
    # keep only distinct positive values so the shortcut below is valid
    sorted_A = sorted(set(x for x in A if x > 0))
    if not sorted_A:
        return 1
    last_in_sorted_A = sorted_A[-1]
    #check if straight continuous list
    if len(sorted_A) == last_in_sorted_A:
        return last_in_sorted_A + 1
    else:
        #incomplete list, iterate to find the smallest missing number
        sol=1
        for x in sorted_A:
            if x == sol:
                sol += 1
            else:
                break
        return sol
A = [1,2,7,4,5,6]
print(smallest_positive_int(A))

This question doesn't really need another answer, but there is a solution that has not been proposed yet, that I believe to be faster than what's been presented so far.
As others have pointed out, we know the answer lies in the range [1, len(A)+1], inclusive. We can turn that range into a set and take the minimum element of its set difference with A. That's a good O(N) solution, since set membership tests are O(1) on average.
However, we don't need to use a Python set to store [1, len(A)+1], because we're starting with a dense set. We can use an array instead, which will replace set hashing by list indexing and give us another O(N) solution with a lower constant.
def minpositive(a):
    # the "set" of possible answer - values_found[i-1] will tell us whether i is in a
    values_found = [False] * (len(a)+1)
    # note any values in a in the range [1, len(a)+1] as found
    for i in a:
        if i > 0 and i <= len(a)+1:
            values_found[i-1] = True
    # extract the smallest value not found
    for i, found in enumerate(values_found):
        if not found:
            return i+1
We know the final for loop always finds a value that was not marked, because it has one more element than a, so at least one of its cells was not set to True.

def check_min(a):
    x= max(a)
    if x-1 in a:
        return x+1
    elif x <= 0:
        return 1
    else:
        return x-1
Correct me if i'm wrong but this works for me.

def solution(A):
    clone = 1
    A.sort()
    for itr in range(max(A) + 2):
        if itr not in A and itr >= 1:
            clone = itr
            break
    return clone
print(solution([2,1,4,7]))
#returns 3

def solution(A):
    n = 1
    for i in A:
        if n in A:
            n = n+1
        else:
            return n
    return n

def not_in_A(a):
    a=sorted(a)
    if max(a)<1:
        return(1)
    for i in range(0,len(a)-1):
        if a[i+1]-a[i]>1:
            out=a[i]+1
            if out==0 or out<1:
                continue
            return(out)
    return(max(a)+1)

Mark the values that occur, then return the first one that was not marked.
nums = [ 6, 4, 3, -5, 0, 2, -7, 1 ]
def check_min(nums):
    # the answer always lies in [1, len(nums) + 1]; index 0 is unused
    marks = [-1] * (len(nums) + 2)
    for idx, num in enumerate(nums):
        if 1 <= num <= len(nums) + 1:
            marks[num] = idx
    for idx in range(1, len(marks)):
        if marks[idx] == -1:
            return idx

I just modified the answer by @najeeb-jebreel and now the function gives an optimal solution.
def solution(A):
    sorted_set = sorted(set(A))  # deduplicate first, then sort
    sol = 1
    for x in sorted_set:
        if x == sol:
            sol += 1
        elif x > sol:  # skip values below 1; stop at the first gap
            break
    return sol

I reduced the length of set before comparing
a=[1,222,3,4,24,5,6,7,8,9,10,15,2,3,3,11,-1]
#a=[1,2,3,6,3]
def sol(a_array):
    a_set=set()
    b_set=set()
    cnt=1
    for i in a_array:
        #In order to get the greater performance
        #Checking if element is greater than length+1
        #then it can't be output( our result in solution)
        if i<=len(a_array) and i >=1:
            a_set.add(i) # Adding array element in set
        b_set.add(cnt) # Adding iterator in set
        cnt=cnt+1
    b_set=b_set.difference(a_set)
    if((len(b_set)) > 0):
        return(min(b_set))
    else:
        return max(a_set)+1
sol(a)

def solution(A):
    nw_A = sorted(set(A))
    if all(i < 0 for i in nw_A):
        return 1
    else:
        ans = 1
        while ans in nw_A:
            ans += 1
        return ans
For better performance, if importing the numpy package is an option:
def solution(A):
    import numpy as np
    nw_A = np.unique(np.array(A))
    if np.all((nw_A < 0)):
        return 1
    else:
        ans = 1
        while ans in nw_A:
            ans += 1
        return ans

def solution(A):
    # write your code in Python 3.6
    min_num = float("inf")
    set_A = set(A)
    # finding the smallest number
    for num in set_A:
        if num < min_num:
            min_num = num
            # print(min_num)
    #if negative make positive
    if min_num < 0 or min_num == 0:
        min_num = 1
        # print(min_num)
    # if in set add 1 until not
    while min_num in set_A:
        min_num += 1
    return min_num
Not sure why this is not 100% in correctness. It is 100% performance

def solution(A):
    arr = set(A)
    N = set(range(1, 100001))
    # the set difference already leaves only the missing values
    return min(N - arr)
solution([1, 2, 6, 4])
#returns 3

Python: check if a list can be sorted by swapping two elements, only one swap is allowed

In the code below, I'm trying to count the positions where an element is greater than the element to its right, but this doesn't seem to work for my problem. Any hints on how to write the remaining logic? I'm stuck here.
swap.py
def swap(lst):
    count = 0
    for k in range(0, len(lst)-1):
        if lst[k] > lst[k+1]:
            count += 1
    if int(count) == 2:
        print "Swapped"
    elif int(count) == 0:
        print True
    else:
        print False
if __name__ == "__main__":
    swap([1,2,3,4,0])
    swap([6,4,2,5])
    swap([6,4,2,8])
    swap([1,4,5])
My expected output from program -
[1,4,5] will return True
[6,4,2,8] will return Swapped
[6,4,2,5] will return False

from itertools import combinations
def is_swappable(lst):
    s = sorted(lst)
    for i, j in combinations(range(len(lst)), 2):
        l = lst[:]
        l[i], l[j] = l[j], l[i]
        if l == s:
            return True
    return False
Here's a pretty naive solution: it tries swapping every pair of positions in the list and checks whether the result is the sorted list. Note that a list that is already sorted returns False unless it contains duplicate values, because a swap is always performed.

I did not understand the "swapped" condition, but the following code snippet will tell you whether an array can be sorted in one swap or not.
The code is written in Python 3 and its complexity is O(n log n).
def checkOneSwap(arr):
    N = len(arr)
    if N <= 2:
        # zero, one or two elements can always be sorted with at most one swap
        print(True)
        return
    arr2 = sorted(arr)
    # count the positions where arr differs from its sorted copy
    counter = 0
    for i in range(N):
        if arr[i] != arr2[i]:
            counter += 1
    if counter == 0 or counter == 2:
        print(True)
    else:
        print(False)
checkOneSwap([1,2,3,4,0]) # False: more than one swap is needed to sort this array
checkOneSwap([6,4,2,5]) # False: sorted is [2,4,5,6]; swap(6,2) and then swap(6,5), so 2 swaps are required
checkOneSwap([6,4,2,8]) # True: sorted is [2,4,6,8]; swap(6,2) is the one swap required
checkOneSwap([1,4,5]) # True: already sorted, counter = 0

This can be done with a for loop, but I prefer a list comprehension. Zip the unsorted list with its sorted copy and collect the mismatches. If there are more than 2 mismatches, the list can't be sorted in one swap.
def is_sortable_by_one_swap(unsorted_list):
    mismatches = [x for x in zip(unsorted_list, sorted(unsorted_list)) if x[0] != x[1]]
    return len(mismatches) <= 2
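If the three outcomes from the question are needed (True for an already sorted list, "Swapped" when one swap sorts it, False otherwise), the mismatch count can be mapped onto them directly. This is a minimal sketch: the function name `classify_swap` is hypothetical, and it assumes "Swapped" means that exactly one swap is both needed and sufficient.

```python
def classify_swap(lst):
    """Return True if lst is sorted, "Swapped" if one swap sorts it, else False."""
    target = sorted(lst)
    # indices where the list disagrees with its sorted copy
    mismatches = [i for i, (x, y) in enumerate(zip(lst, target)) if x != y]
    if not mismatches:
        return True
    if len(mismatches) == 2:
        # exactly two positions are wrong, so exchanging them sorts the list
        return "Swapped"
    return False


for sample in ([1, 4, 5], [6, 4, 2, 8], [6, 4, 2, 5], [1, 2, 3, 4, 0]):
    print(sample, classify_swap(sample))
```

For the four inputs from the question this gives True, "Swapped", False and False respectively.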

Test for consecutive numbers in list

I have a list that contains only integers, and I want to check if all the numbers in the list are consecutive (the order of the numbers does not matter).
If there are repeated elements, the function should return False.
Here is my attempt to solve this:
def isconsecutive(lst):
    """
    Returns True if all numbers in lst can be ordered consecutively, and False otherwise
    """
    if len(set(lst)) == len(lst) and max(lst) - min(lst) == len(lst) - 1:
        return True
    else:
        return False
For example:
l = [-2,-3,-1,0,1,3,2,5,4]
print(isconsecutive(l))
True
Is this the best way to do this?

Here is another solution:
def is_consecutive(l):
    setl = set(l)
    return len(l) == len(setl) and setl == set(range(min(l), max(l)+1))
However, your solution is probably better as you don't store the whole range in memory.
Note that you can always simplify
if boolean_expression:
    return True
else:
    return False
by
return boolean_expression
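Applied to the function from the question, that simplification could look like the sketch below. The guard for an empty list is an added assumption (without it, `max([])` raises ValueError); returning False there is a choice, not something the question specifies.

```python
def isconsecutive(lst):
    """Return True if the integers in lst form a run of consecutive values with no repeats."""
    if not lst:
        # assumption: an empty list is treated as not consecutive
        return False
    return len(set(lst)) == len(lst) and max(lst) - min(lst) == len(lst) - 1


print(isconsecutive([-2, -3, -1, 0, 1, 3, 2, 5, 4]))  # True
print(isconsecutive([1, 2, 2, 3]))                     # False
print(isconsecutive([1, 2, 4]))                        # False
```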

In terms of how many times each element is inspected, a better approach is to find the min and max and short-circuit on the first duplicate, all in one pass. Depending on the inputs, though, it will probably still be beaten by the speed of the builtin functions:
def mn_mx(l):
    mn, mx = float("inf"), float("-inf")
    seen = set()
    for ele in l:
        # if we already saw the ele, end the function
        if ele in seen:
            return False, False
        if ele < mn:
            mn = ele
        if ele > mx:
            mx = ele
        seen.add(ele)
    return mn, mx
def isconsecutive(lst):
    """
    Returns True if all numbers in lst can be ordered consecutively, and False otherwise
    """
    mn, mx = mn_mx(lst)
    # could check either, if mn is False we found a dupe
    if mn is False:
        return False
    # if we get here there are no dupes
    return mx - mn == len(lst) - 1
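Which version wins is easy to measure. The sketch below is only an illustration: the function names, the list size of 10000 and the repeat count of 50 are arbitrary choices, and the timings depend on the machine and on where (or whether) a duplicate occurs, so no expected numbers are given.

```python
import random
import timeit


def builtin_version(lst):
    # three passes in C: set(), max(), min()
    return len(set(lst)) == len(lst) and max(lst) - min(lst) == len(lst) - 1


def single_pass_version(lst):
    # one pass in Python, stopping at the first duplicate
    mn, mx = float("inf"), float("-inf")
    seen = set()
    for ele in lst:
        if ele in seen:
            return False
        seen.add(ele)
        if ele < mn:
            mn = ele
        if ele > mx:
            mx = ele
    return mx - mn == len(lst) - 1


consecutive = list(range(10000))
random.shuffle(consecutive)
early_dupe = [7, 7] + consecutive  # duplicate found after two elements

for name, data in (("consecutive", consecutive), ("early duplicate", early_dupe)):
    # both versions must agree before their speed is compared
    assert builtin_version(data) == single_pass_version(data)
    t_builtin = timeit.timeit(lambda: builtin_version(data), number=50)
    t_single = timeit.timeit(lambda: single_pass_version(data), number=50)
    print(name, "builtin:", round(t_builtin, 4), "single pass:", round(t_single, 4))
```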
