I am trying to write a function that counts how many elements in a list belong to each class and then runs an if / elif chain on those counts:
k_nearest_samples_class = training_data[sorted_indices[:k]][:, -1]
# print(k_nearest_samples_class)
# counting number of occurrences of either 0's or 1's with below 2 lines
class_0_count = len(k_nearest_samples_class[k_nearest_samples_class == 0])
class_1_count = len(k_nearest_samples_class[k_nearest_samples_class == 1])
class_2_count = len(k_nearest_samples_class[k_nearest_samples_class == 2])
# Combining > & = sign so even in tie-up cases the instance will be classified to malignant - assumed it
# would be okay in order to reduce false positives
if class_0_count >= class_1_count and class_2_count:
    print("0", class_0_count)
    return 0
elif class_1_count >= class_0_count and class_2_count:
    print("1", class_1_count)
    return 1
else:
    print("2", class_2_count)
    return 2
Giving input one by one like:
[0.0]
[1.0]
[2.0]
Currently, my if / elif chain gives illogical results.
This line:
if class_0_count >= class_1_count and class_2_count:
is equivalent to:
if class_0_count >= class_1_count and class_2_count > 0:
you need to change it to:
if class_0_count >= class_1_count and class_0_count >= class_2_count:
or you can compare with the maximum value of the two:
if class_0_count >= max(class_1_count, class_2_count):
Expanding on @MoeA's answer above:
An alternative, Pythonic way to perform this test is to use the all() function:
if all([class_0_count >= class_1_count, class_0_count >= class_2_count]):
    ...
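The difference between the original condition and the corrected one can be checked directly. The following is a minimal sketch: `pick_class` is a hypothetical helper name that does not appear in the question, and it assumes that ties should go to the lower class label, as the >= comparisons in the question suggest.

```python
def pick_class(class_0_count, class_1_count, class_2_count):
    """Return the label (0, 1 or 2) with the largest count; ties go to the lower label."""
    counts = [class_0_count, class_1_count, class_2_count]
    # list.index returns the first position of the maximum, so ties favour the lower label
    return counts.index(max(counts))


# `x >= y and z` parses as `(x >= y) and z`, so z is only tested for truthiness.
broken = 3 >= 1 and 5       # evaluates to 5 (truthy); 3 is never compared with 5
fixed = 3 >= 1 and 3 >= 5   # evaluates to False

print(broken, fixed)
print(pick_class(3, 1, 5))
```

Picking the index of the maximum also avoids writing one branch per class, which may help if more classes are added later.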
I'm playing with the Codility Demo Task. It asks you to design a function that determines the lowest missing integer greater than zero in an array.
I wrote a function that works, but Codility tests it as 88% (80% correctness). I can't think of instances where it would fail.
def solution(A):
    #If there are negative values, set any negative values to zero
    if any(n < 0 for n in A):
        A = [(i > 0) * i for i in A]
        count = 0
    else: count = 1
    #Get rid of repeating values
    A = set(A)
    #At this point, we may have only had negative #'s or the same # repeated.
    #If negative #'s, or repeated zero, then answer is 1
    #If repeated 1's answer is 2
    #If any other repeated #'s answer is 1
    if (len(A) == 1):
        if (list(A)[0] == 1):
            return 2
        else:
            return 1
    #Sort the array
    A = sorted(A)
    for j in range(len(A)):
        #Test to see if it's greater than zero or >= to count. If so, it exists and is not the lowest integer.
        #This fails if the first # is zero and the second number is NOT 1
        if (A[j] <= count or A[j] == 0): #If the number is lt or equal to the count or zero then continue the count
            count = count + 1
        elif (j == 1 and A[j] > 1): return 1
        else:
            return count
    return count
UPDATE:
I got this to 88% with the fixes above. It still fails with some input. I wish Codility would give the inputs that fail. Maybe it does with a full subscription. I'm just playing with the test.
UPDATE 2: Got this to 100% with Tranbi's suggestion.
def solution(A):
    #Get rid of all zero and negative #'s
    A = [i for i in A if i > 0]
    #At this point, if there were only zero, negative, or combination of both, the answer is 1
    if (len(A) == 0): return 1
    count = 1
    #Get rid of repeating values
    A = set(A)
    #At this point, we may have only had the same # repeated.
    #If repeated 1's answer is 2
    #If any other repeated #'s only, then answer is 1
    if (len(A) == 1):
        if (list(A)[0] == 1):
            return 2
        else:
            return 1
    #Sort the array
    A = sorted(A)
    for j in range(len(A)):
        #Test to see if it's >= to count. If so, it exists and is not the lowest integer.
        if (A[j] <= count): #If the number is lt or equal to the count then continue the count
            count = count + 1
        else:
            return count
    return count
Besides that bug for len=1, you also fail for example solution([0, 5]), which returns 2.
Anyway... Since you're willing to create a set, why not just make this really simple?
def solution(A):
    A = set(A)
    count = 1
    while count in A:
        count += 1
    return count
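As a quick sanity check, the set-based idea can be run against the edge cases raised in this thread. This is only a sketch; the name `smallest_missing_positive` and the sample inputs are chosen here for illustration and are not Codility's hidden tests.

```python
def smallest_missing_positive(A):
    """Return the smallest integer greater than zero that does not occur in A."""
    seen = set(A)           # membership tests on a set are O(1) on average
    candidate = 1
    while candidate in seen:
        candidate += 1
    return candidate


samples = ([1, 3, 6, 4, 1, 2], [1, 2, 3], [-1, -3], [0, 5], [2], [1, 1, 1])
for sample in samples:
    print(sample, smallest_missing_positive(sample))
```

The loop runs at most len(A) + 1 times, so no sorting or special-casing of negatives, zeros, or repeated values is needed.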
I don't think this is true:
#At this point, we may have only had negative #'s or the same # repeated. If so, then the answer is 1+ the integer.
if (len(A) == 1):
    return list(A)[0]+1
If A = [2] you should return 1 not 3.
Your code is quite confusing though. I think you could replace
if any(n < 0 for n in A):
    A = [(i > 0) * i for i in A]
with
A = [i for i in A if i > 0]
What's the point of keeping 0 values?
I don't think this is needed:
if (len(A) == 1):
    if (list(A)[0] == 1):
        return 2
    else:
        return 1
It's already taken into account afterwards :)
Finally got a 100% score.
def solution(A):
    # 1 isn't there
    if 1 not in A:
        return 1
    # it's easier to sort
    A.sort()
    # positive "hole" in the array
    prev=A[0]
    for e in A[1:]:
        if e>prev+1>0:
            return prev+1
        prev=e
    # no positive "hole"
    # 1 is in the middle
    return A[-1]+1
Apologies if the title of the question is phrased badly. I am currently trying to make a function that takes in a list of integers from 1 to n, where n is the length of the list. The function should return the first value that is repeated in the list. Duplicates are NOT always next to one another. If one or more integers is less than 1 or if it is not a list, the function should return -1. If there are no duplicates, return 0.
This is my current code:
def find_duplicates(ls):
    if type(ls) != list:
        return -1
    non_dupe = []
    i = 0
    while i < len(ls):
        if ls[i] < 1:
            return -1
            break
        if ls.count(i) > 1:
            return i
            break
        else:
            non_dupe.append(i)
        i += 1
    if len(non_dupe) == len(ls):
        return 0
While this code works for a majority of test cases, it doesn't seem to pass
print(find_duplicates([1, 2, 2, 0]))
as it returns 2 instead of the expected -1. I am relatively new to Python and I can't seem to be able to fix this error. I've tried searching for ways to counter this problem but I am not allowed to use for loops to check through a list. Any help is greatly appreciated.
EDIT: I am not allowed to use any of the following but anything else is accepted.
for loops
min() / max()
enumerate() / zip ()
sort()
negative indexing e.g ls[-1]
list slicing
Your code returns a duplicate prematurely: while traversing the list, the function first finds 2 as a duplicate, returns it, and halts immediately. But it has not yet seen the 0 at the end.
So you need to let the function scan the list all the way to the end, looking for a value less than 1. If such a value is found along the way, you can halt the function. If none is found by the end, let it return the duplicate value:
def find_duplicates(ls):
    if not isinstance(ls, list): # check whether ls is a list
        return -1
    dup = 0
    seen = [] # list of numbers seen so far
    i = 0 # index
    while i < len(ls):
        if ls[i] < 1: # if a value less than 1 is found, return -1
            return -1
        if ls[i] in seen and dup == 0:
            dup = ls[i]
        seen.append(ls[i])
        i += 1
    return dup
print(find_duplicates([1, 2, 2, 0])) # -1
print(find_duplicates([1, 1, 2, 2, 3])) # 1
The problem is that you return from the while loop as soon as you find a duplicate, so the function reports the duplicate before it has checked the rest of the list.
Try this:
def find_duplicates(ls):
    if type(ls) is not list:
        return -1
    duplicated = 0
    i = 0
    while i < len(ls):
        if ls[i] < 1:
            return -1
        if ls.count(ls[i]) > 1 and duplicated == 0:
            duplicated = ls[i]
        i += 1
    return duplicated
Your test case returns 2 because 2 sits at a lower index than 0.
I would suggest sorting a copy of the list first, so that an invalid value is detected before any duplicate is reported:
def find_duplicates(ls):
    if type(ls) != list:
        return -1
    sorted_list = sorted(ls) #Assign sorted `ls` to another variable, while keeping the order of `ls` intact
    if sorted_list and sorted_list[0] < 1: #The smallest value comes first after sorting
        return -1
    i = 0
    while i < len(ls):
        if ls.count(ls[i]) > 1: #Count the value ls[i], not the index i
            return ls[i]
        i += 1
    return 0
Another method I would recommend is using set - a built-in data type of Python. Maybe you should consider trying this approach later on when all test cases are passed. Have a look at this Tutorial for set usage: https://www.w3schools.com/python/python_sets.asp.
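The set-based approach mentioned above could look roughly like the following. This is a sketch under the question's stated restrictions (only a while loop, no slicing, no sort()); it assumes that "first repeated value" means the value whose second occurrence appears earliest, and the names `seen` and `first_dup` are illustrative.

```python
def find_duplicates(ls):
    """Return -1 for invalid input, the first repeated value, or 0 if nothing repeats."""
    if not isinstance(ls, list):
        return -1
    seen = set()       # values encountered so far; lookups are O(1) on average
    first_dup = 0      # 0 means "no duplicate found yet"
    i = 0
    while i < len(ls):
        value = ls[i]
        if value < 1:
            # An invalid value anywhere in the list overrides any duplicate found earlier
            return -1
        if first_dup == 0 and value in seen:
            first_dup = value
        seen.add(value)
        i += 1
    return first_dup


print(find_duplicates([1, 2, 2, 0]))
print(find_duplicates([1, 1, 2, 2, 3]))
```

Compared with calling ls.count() inside the loop, which rescans the whole list on every iteration, this makes a single pass.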
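You were very close. Try this:
def find_duplicates(ls):
    if type(ls) != list:
        return -1
    non_dupe = []
    dupe = 0
    i = 0
    while i < len(ls):
        if ls[i] < 1:
            return -1
        elif ls[i] in non_dupe:
            if dupe == 0:
                dupe = ls[i] # remember the first duplicate, but keep scanning for values below 1
        else:
            non_dupe.append(ls[i]) # store the value, not the index
        i += 1
    return dupe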
my_list = [1,2,2,0]
result = list(set(filter(lambda x: my_list.count(x) > 1 , my_list)))
# Result => [2]
I hope this solves your problem
The question is:
Design an O(log n) algorithm whose input is a sorted list A. The algorithm should return true if A contains at least 3 distinct elements. Otherwise, the algorithm should return false.
As it has to be O(log n), I tried to use binary search, and this is the code I wrote:
def hasThreeDistinctElements(A):
    if len(A) < 3:
        return False
    minInd = 0
    maxInd = len(A)-1
    midInd = (maxInd+minInd)//2
    count = 1
    while minInd < maxInd:
        if A[minInd] == A[midInd]:
            minInd = midInd
            if A[maxInd] == A[midInd]:
                maxInd = midInd
            else:
                count += 1
                maxInd -= 1
        else:
            count += 1
            minInd += 1
        midInd = (maxInd+minInd)//2
    return count >= 3
is there a better way to do this?
Thanks
from bisect import bisect
def hasThreeDistinctElements(A):
    return A[:1] < A[-1:] > [A[bisect(A, A[0])]]
The first comparison safely(*) checks whether there are two different values at all. If so, we check whether the first value larger than A[0] is also smaller than A[-1].
(*): Doesn't crash if A is empty.
Or without bisect, binary-searching for a third value in A[1:-1]. The invariant is that if there is any, it must be in A[lo : hi+1]:
def hasThreeDistinctElements(A):
    lo, hi = 1, len(A) - 2
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == A[0]:
            lo = mid + 1
        elif A[mid] == A[-1]:
            hi = mid - 1
        else:
            return True
    return False
In order to really be O(log N), the updates to the bounding indices minInd and maxInd should only ever be
maxInd = midInd [- 1]
minInd = midInd [+ 1]
to halve the search space. Since there are paths through your loop body that only do
minInd += 1
maxInd -= 1
respectively, I am not sure that you can't create data for which your function is linear. The following is a bit simpler and guaranteed to be O(log N):
def x(A):
    if len(A) < 3:
        return False
    minInd, maxInd = 0, len(A)-1
    mn, mx = A[minInd], A[maxInd]
    while minInd < maxInd:
        midInd = (minInd + maxInd) // 2
        if mn != A[midInd] != mx:
            return True
        if A[midInd] == mn:
            minInd = midInd + 1 # minInd == midInd might occur
        else:
            maxInd = midInd # while maxInd != midInd is safe
    return False
BTW, if you can use the standard library, it is as easy as:
from bisect import bisect_right
def x(A):
    return A and (i := bisect_right(A, A[0])) < len(A) and A[i] < A[-1]
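One way to gain confidence in a binary-search answer like this is to compare it against the obvious linear check on random sorted lists. The following is a sketch; `has_three_distinct` is an illustrative name, and it returns a strict bool for empty input, which is a small deviation from the one-liner above.

```python
import random
from bisect import bisect_right


def has_three_distinct(A):
    """Return True if the sorted list A contains at least 3 distinct values."""
    if not A:
        return False
    # Index of the first element strictly greater than A[0], found in O(log n)
    i = bisect_right(A, A[0])
    # A third value exists if that element is also strictly smaller than the last one
    return i < len(A) and A[i] < A[-1]


random.seed(0)
for _ in range(1000):
    A = sorted(random.randint(0, 3) for _ in range(random.randint(0, 8)))
    # len(set(A)) >= 3 is the O(n) reference answer
    assert has_three_distinct(A) == (len(set(A)) >= 3)

print(has_three_distinct([1, 1, 2, 2, 3]), has_three_distinct([1, 1, 1, 2, 2]))
```

The same harness can be pointed at any of the other implementations in this thread by swapping the function under test.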
Yes, there is a better approach.
As the list is sorted, you can use binary search with slight custom modifications as follows:
list = [1, 1, 1, 2, 2]
uniqueElementSet = set([])
def binary_search(minIndex, maxIndex, n):
    if(len(uniqueElementSet)>=3):
        return
    #Checking the bounds for index:
    if(minIndex<0 or minIndex>=n or maxIndex<0 or maxIndex>=n):
        return
    if(minIndex > maxIndex):
        return
    if(minIndex == maxIndex):
        uniqueElementSet.add(list[minIndex])
        return
    if(list[minIndex] == list[maxIndex]):
        uniqueElementSet.add(list[minIndex])
        return
    uniqueElementSet.add(list[minIndex])
    uniqueElementSet.add(list[maxIndex])
    midIndex = (minIndex + maxIndex)//2
    binary_search(minIndex+1, midIndex, n)
    binary_search(midIndex+1, maxIndex-1, n)
    return
binary_search(0, len(list)-1, len(list))
print(True if len(uniqueElementSet)>=3 else False)
Since we split the array into two parts at each level of the recursion, it needs at most about log(n) steps to check whether the list contains 3 unique elements.
Time Complexity = O(log(n)).
I am trying to understand why I get an unexpected result from the following if:
def print_if_neg (a,b):
    if a < 0 != b < 0:
        print "Only One Neg"
    else:
        print "0 or 2"
print_if_neg(1,1)
print_if_neg(-1,1)
print_if_neg (1,-1)
print_if_neg(-1,-1)
I get 3 times 0 or 2 and then last one Only One Neg.
What is the order of this complicated condition?
I've tried this:
if (a < 0) != (b < 0):
and it's ok but I'm trying to understand why above doesn't work.
You need parentheses, because without them Python treats the expression as one chained comparison:
def print_if_neg (a,b):
    if (a < 0) != (b < 0):
        print "Only One Neg"
    else:
        print "0 or 2"
As CoryKramer pointed out, the way Python parses the comparison operators is making the difference.
Your code is equivalent to this:
def print_if_neg (a,b):
    if a < 0 and 0 != b and b < 0:
        print "Only One Neg"
    else:
        print "0 or 2"
Because <, != and the other comparison operators all have the same precedence and are chained by language definition; != does not bind more tightly than <.
So, use () to force the grouping that you need:
def print_if_neg (a,b):
    if (a < 0) != (b < 0):
        print "Only One Neg"
    else:
        print "0 or 2"
Also, FYI you are coding the xor operator.
Because comparison operators chain, you need to place the two conditions in parentheses to get the result you expect. Otherwise the comparisons are evaluated as one chain, which includes the check 0 != b, and that is not what you want.
def print_if_neg (a,b):
    if (a < 0) != (b < 0):
        print ("Only One Neg")
    else:
        print ("0 or 2")
print_if_neg(1,1)
print_if_neg(-1,1)
print_if_neg (1,-1)
print_if_neg(-1,-1)
Note that all comparison operators have the same precedence and comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y AND y <= z
This is because the condition a < 0 != b < 0 means a < 0 AND 0 != b AND b < 0. First of all, when a >= 0 the first condition evaluates to False, so nothing else gets evaluated. Then, if a < 0 but b = 1, the last condition in the chain is False. Therefore your chained condition is False in 3 out of the 4 cases.
This is well explained in section 6.10 of the Python documentation.
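The chaining rule can be verified on the four inputs from the question. The sketch below uses Python 3 syntax, and the three function names are made up for illustration.

```python
def chained(a, b):
    # What the question's condition actually computes
    return a < 0 != b < 0


def expanded(a, b):
    # The same chain written out explicitly
    return (a < 0) and (0 != b) and (b < 0)


def exactly_one_negative(a, b):
    # The intended test: the two sign checks differ
    return (a < 0) != (b < 0)


cases = [(1, 1), (-1, 1), (1, -1), (-1, -1)]
for a, b in cases:
    print((a, b), chained(a, b), expanded(a, b), exactly_one_negative(a, b))
```

The first two columns agree on every input and are True only for (-1, -1), which matches the output reported in the question, while the parenthesised version is True only for the two mixed-sign pairs.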
From this, you could make it more readable imo:
from operator import xor
def print_if_neg (a, b):
    if xor(a < 0, b < 0):
        print "Only One Neg"
    else:
        print "0 or 2"
I'm trying to compare two lists of integers, each the same size, in Python 2.6. I need to compare the first item in List 1 with the first item in List 2, the second item in List 1 with the second item in List 2, and so on, and return a result if ALL of the list items follow the same comparison criteria. It should behave as follows:
list1 = [1,1,1,1]
list2 = [2,1,2,3]
compare(list1,list2)
# returns a "list 1 is <= list 2" response.
list1 = [4,1,4,3]
list2 = [2,1,2,3]
compare(list1,list2)
# returns a "list 1 is >= list 2" response.
list1 = [3,2,3,2]
list2 = [1,4,1,4]
compare(list1,list2)
# returns None— some items in list1 > list2, and some items in list2 > list1.
I figured I could write the code like the following block, but I don't know if it's the most efficient. My program is going to be calling this method a LOT so I want to streamline this as much as possible.
def compare(list1,list2):
    gt_found = 0
    lt_found = 0
    for x in range(len(list1)):
        if list1[x] > list2[x]:
            gt_found += 1
        elif list1[x] < list2[x]:
            lt_found += 1
        if gt_found > 0 and lt_found > 0:
            return None #(some items >, some items <)
    if gt_found > 0:
        return 1 #(list1 >= list2)
    if lt_found > 0:
        return -1 #(list1 <= list2)
    return 0 #(list1 == list2)
Is it already as good as it's going to get (big-O of n), or is there a faster way to go about it (or a way that uses system functions instead)?
CLARIFICATION: I expect the case that returns 'None' to happen the most often, so it is important.
You can consider a numpy-based vectorized comparison.
import numpy as np
a = [1,1,1,2]
b = [2,2,4,3]
all_larger = np.all(np.asarray(b) > np.asarray(a)) # true if b > a holds elementwise
print all_larger
True
Clearly, you can engineer the thing to have your answer.
all_larger = lambda b,a : np.all(np.asarray(b) > np.asarray(a))
if all_larger(b,a):
    print "b > a"
elif all_larger(a,b):
    print "a > b"
else:
    print "nothing!"
Every type of comparison such as <, >, <=, >=, can be done.
Are you familiar with the wonderful zip function?
import itertools
def compare(xs, ys):
    all_less = True
    all_greater = True
    for x, y in itertools.izip(xs, ys):
        if not all_less and not all_greater:
            return None
        if x > y:
            all_less = False
        elif x < y:
            all_greater = False
    if all_less:
        return "list 1 is <= list 2"
    elif all_greater:
        return "list 1 is >= list 2"
    return None # all_greater might be set False on final iteration
izip takes two lists (xs and ys in this case, but call them whatever you want) and creates an iterator over a sequence of tuples.
list(itertools.izip([1,2,3,4], [4,3,2,1])) == [(1,4), (2,3), (3,2), (4,1)]
This way you can iterate through both lists simultaneously and compare each value in tandem. The time complexity should be O(n), where n is the size of your lists.
It will return early in cases where neither the >= nor the <= condition is met.
Update
As James Matta points out, itertools.izip performs better than the standard zip in Python 2. This isn't true in Python 3, where the standard zip works the way izip does in older versions.
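For readers on Python 3, the same early-exit idea can be written with the built-in zip. This is a sketch that uses the question's numeric return convention (1, -1, 0, or None) rather than the strings above; the flag names `all_le` and `all_ge` are illustrative.

```python
def compare(xs, ys):
    """Return -1 if xs <= ys elementwise, 1 if xs >= ys, 0 if equal, None if mixed."""
    all_le = True   # every pair seen so far satisfies x <= y
    all_ge = True   # every pair seen so far satisfies x >= y
    for x, y in zip(xs, ys):   # zip is lazy in Python 3, like izip in Python 2
        if x > y:
            all_le = False
        elif x < y:
            all_ge = False
        if not all_le and not all_ge:
            return None        # mixed result: stop at the first contradiction
    if all_le and all_ge:
        return 0
    return -1 if all_le else 1


print(compare([1, 1, 1, 1], [2, 1, 2, 3]))
print(compare([4, 1, 4, 3], [2, 1, 2, 3]))
print(compare([3, 2, 3, 2], [1, 4, 1, 4]))
```

Checking the flags at the end of each iteration, rather than at the start of the next one, means no trailing fallback return is needed after the loop.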
For anyone interested in the performance of the two methods, I named the iterative method 'tortoise' and the numpy method 'hare', and tested it with the code below.
At first, the 'tortoise' won [.009s [T] vs .033s [H]], but I checked it and found that asarray() was being called more often than it needed to be. With that fix, the 'hare' won again, [.009s [T] vs .006s [H]].
The data is here: http://tny.cz/64d6e5dc
It consists of 28 lines of about 950 elements in length. Four of the lines collectively >= all the others.
It might be interesting to see how the performance works on larger data sets.
import itertools, operator, math
import cProfile
import numpy as np
data = #SEE PASTEBIN
def tortoise(xs, ys):
    all_less = True
    all_greater = True
    for x, y in zip(xs, ys):
        if not all_less and not all_greater:
            return None
        if x > y:
            all_less = False
        elif x < y:
            all_greater = False
    if all_greater and all_less:
        return 0
    if all_greater:
        return 1
    if all_less:
        return -1
    return None # all_greater might be set False on final iteration
hare = lambda b,a : np.all(b >= a)
def find_uniques_tortoise():
    include_list = range(len(data))
    current_list_index = 0
    while current_list_index < len(data):
        if current_list_index not in include_list:
            current_list_index += 1
            continue
        for x in range(current_list_index+1,len(data)):
            if x not in include_list:
                continue
            result = tortoise(data[current_list_index], data[x])
            if result is None: #no comparison
                continue
            elif result == 1 or result == 0: # this one beats the other one
                include_list.remove(x)
                continue
            elif result == -1: #the other one beats this one
                include_list.remove(current_list_index)
                break
        current_list_index +=1
    return include_list
def find_uniques_hare():
    include_list = range(len(data))
    current_list_index = 0
    #do all asarray()s beforehand for max efficiency
    for x in range(len(data)):
        data[x] = np.asarray(data[x])
    while current_list_index < len(data):
        if current_list_index not in include_list:
            current_list_index += 1
            continue
        for x in range(current_list_index+1,len(data)):
            if x not in include_list:
                continue
            if hare(data[current_list_index], data[x]): # this one beats the other one, or it's a tie
                include_list.remove(x)
                # print x
                continue
            elif hare(data[x], data[current_list_index]): #the other one beats this one
                include_list.remove(current_list_index)
                # print current_list_index
                break
            else: #no comparison
                continue
        current_list_index +=1
    return include_list
cProfile.run('find_uniques_tortoise()')
cProfile.run('find_uniques_hare()')
print find_uniques_tortoise()
print
print find_uniques_hare()