Time complexity of two algorithms for same problem - python

I was trying to solve a problem on Hackerearth. The problem statement is as follows:
Sussutu is a world-renowned magician. And recently, he was blessed with the power to remove EXACTLY ONE element from an array.
Given, an array A (index starting from 0) with N elements. Now, Sussutu CAN remove only that element which makes the sum of ALL the remaining elements exactly divisible by 7.
Throughout his life, Sussutu was so busy with magic that he could never get along with maths. Your task is to help Sussutu find the first array index of the smallest element he CAN remove.
Input:
The first line contains a single integer N.
Next line contains N space separated integers Ak , 0 < k < N.
Output:
Print a single line containing one integer, the first array index of the smallest element he CAN remove, and -1 if there is no such element that he can remove!
Constraints:
1 < N < 10^5
0 < Ak < 10^9
Here is the algorithm that I tried, but it exceeded the time limit on some test cases:
n = int(input())
A = list(map(int, input().split(' ')))
temp = sorted(A)
for i in range(n):
    temp[i] = 0
    s = sum(temp)
    temp = sorted(A)
    if s % 7 == 0:
        flag = True
        break
    flag = False
if flag == True:
    print(A.index(temp[i]))
else:
    print(-1)
Another code which worked fine is given below :
n = int(input())
A = list(map(int, input().split(' ')))
S = sum(A)
t = []
for a in A:
    if (S - a) % 7 == 0:
        t.append(a)
if len(t) == 0:
    print(-1)
else:
    print(A.index(min(t)))
Can anyone help me understand why the 1st code exceeded the time limit and why the 2nd code did not??

In the first algorithm, the sort itself is O(n log n) and it runs inside the loop, so the complexity of the first one's loop is O(n) * O(n log n) = O(n² log n). In the second one, you simply loop through the input array three times, so its complexity is O(n), far lower. For very large inputs, the first one will then time out while the second one may not.

Time complexity of the first algorithm is O(n^2 logn) because you are sorting the array in each iteration, while time complexity of the second is O(n).

FWIW: you could avoid some work by:
V = [M] * 7 # where max(A) < M
I = [None] * 7
s = 0
i = 0
for v in A:
    m = v % 7
    s += m
    if v < V[m]: V[m] = v ; I[m] = i
    i += 1
s = s % 7
if I[s] == None:
    print("No answer!!!")
else:
    print("i=%d v=%d" % (I[s], V[s]))
which does the job in a single pass. (Your code has one pass across A "hiding" in the sum(A).)

You simply need to remove an element that has the same value modulo 7 as the sum of the list:
import random
n = 10
A = [ random.randrange(n) for _ in range(n)]
modulo7 = sum(A)%7
index = next((i for i,a in enumerate(A) if a%7==modulo7),-1)
print(A,"sum:",sum(A))
if index < 0:
    print("No eligible element to remove")
else:
    print("removing",A[index],"at index",index,"makes the sum",sum(A)-A[index])
output:
[4, 8, 4, 1, 8, 9, 6, 9, 4, 4] sum: 57
removing 8 at index 1 makes the sum 49
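Note that the snippet above stops at the first eligible element, whereas the problem statement asks for the first index of the smallest eligible one. A minimal sketch of that variant, assuming the input has already been parsed into a list of integers (the function name first_index_of_smallest_removable is made up for illustration):

```python
def first_index_of_smallest_removable(A):
    """Return the first index of the smallest element whose removal
    leaves a sum divisible by 7, or -1 if no such element exists."""
    target = sum(A) % 7  # removable elements satisfy a % 7 == target
    best_index = -1
    for i, a in enumerate(A):
        # The strict '<' keeps the earliest index when values tie.
        if a % 7 == target and (best_index == -1 or a < A[best_index]):
            best_index = i
    return best_index


print(first_index_of_smallest_removable([4, 8, 4, 1, 8, 9, 6, 9, 4, 4]))
```

For the sample list above (sum 57, so the target remainder is 1) the eligible values are 8, 1 and 8, and the smallest of them sits at index 3.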

Related

loop through the list to get Maximum sum for a given range in python

I am a novice in Python. I have code that loops through a list to find the maximum sum of k consecutive numbers. It works fine, but I want to make it shorter/more efficient. 'k' may vary.
numb = [100,33,22,200,333,1000,22]
m=0
k=2
sum1=0
temp=[]
for j in range(len(numb)-(k-1)):
    for i in range(m,k):
        temp.append(numb[i])
    if sum1 < sum(temp):
        sum1 = sum(temp)
    temp=[]
    m+=1
    k+=1
print(sum1)
Ans: 1533 when k = 3
Ans: 1333 when k = 2
You can start by adding up the first k numbers. That is your starting sum and your current max. Then run a sliding window along the list, adding the next number and removing the one that goes out of the window.
def sum_k(x, k):
    m = s = sum(x[:k])
    for i, a in enumerate(x[k:]):
        b = x[i] # number to remove
        s += a - b
        m = max(m, s)
    return m
numb = [100, 33, 22, 200, 333, 1000, 22]
print(sum_k(numb, 2), sum_k(numb, 3))
This runs in linear time, which is optimal since you need to at least look at every element in your input.
The index, i, in the loop runs from zero to n-k-1, so although we enumerate over x[k:] the indices we pick are from x[0:], so when we pick b we are picking the number that goes out of the window. Meanwhile, a is the new number that comes in.
This is the simplified code you want; it runs in O(n) time and is based on the sliding window algorithm.
maxSum is a function that takes two arguments (an array of numbers and k) and returns the maximum sum of any k consecutive values.
def maxSum(arr, k):
    # Edge case
    if len(arr) <= k:
        return sum(arr)
    sums = sum(arr[:k]) # sum the first k values in arr.
    start = 0 # index of the oldest element currently included in sums
    maximum = sums
    for val in arr[k:]:
        sums = (sums - arr[start]) + val
        # here we first subtracted the start value and then added current value.
        # Eg. From [1,2,3,4] with k=3, sums was 1+2+3, and now it is ( 1+2+3(previous) - 1(start) ) + 4(current)
        # Now check for maximum.
        if sums > maximum:
            maximum = sums
        # now increase start by 1 to make pointer to value '2' and so on.
        start += 1
    # return maximum
    return maximum
arr = [100,33,22,200,333,1000,22]
k = 2
print("For k=2: ", maxSum(arr, k))
k = 3
print("For k=3: ", maxSum(arr, k))
Output:
For k=2: 1333
For k=3: 1533

Grab 'n' numbers from a given list of numbers with minimum difference between them

I put up a similar question a few hours ago, albeit with a few mistakes, and my poor understanding, admittedly
So the question is, from a given list of indefinite numbers, I'm supposed to take an input from the user, say 3, and grab 3 numbers wherein the numbers have the least difference between them.
def findMinDiff(arr):
    # Initialize difference as infinite
    diff = 10**20
    n = len(arr)
    # Find the min diff by comparing difference
    # of all possible pairs in given array
    for i in range(n-1):
        for j in range(i+1,n):
            if abs(arr[i]-arr[j]) < diff:
                diff = abs(arr[i] - arr[j])
    # Return min diff
    return diff
def findDiffArray(arr):
    diff = 10**20
    arr_diff = []
    n = len(arr)
    for i in range(n-1):
        arr_diff.append(abs(arr[i]-arr[i+1]))
    return arr_diff
def choosingElements(arr, arr_diff):
arr_choose = []
least_index = 0
least = arr_diff[0]
least_index_array = []
flag = 0
flag2 = 0
for z in range(0,3):
for i in range(0,len(arr_diff)-1):
if arr_diff[i] < least:
if flag > 0:
if i == least_index:
continue
least = arr_diff[i]
least_index = i
least_index_array.append(i)
arr_choose.append(arr[i])
flag += 1
arr_choose.append(arr[i+1])
flag += 1
print("least index is", least_index)
return arr_choose
# Driver code
arr = [1, 5, 3, 19, 18, 25]
arr_diff = findDiffArray(arr)
arr_diff2 = arr_diff.copy()
item_number = int(input("Enter the number of gifts"))
arr_choose = choosingElements(arr, arr_diff2)
print("Minimum difference is " + str(findMinDiff(arr)))
print("Difference array")
print(*arr_diff, sep = "\n")
print("Numbers with least difference for specified items are", arr_choose)
This is how much I've tried, and I've thought to find the difference between numbers, and keep picking ones with the least difference between them, and I realised that my approach is probably wrong.
Can anybody kindly help me out? Thanks!
Now, I'm sure the time complexity on this isn't great, and it might be hard to understand, but how about this:
arr = [1, 18, 5, 19, 25, 3]
# calculates length of the overall path
def calc_path_difference(arr, i1, i2, i3):
    return abs(arr[i1] - arr[i2]) + abs(arr[i2] - arr[i3])
# returns dictionary with differences to other numbers in arr from each number
def differences_dict(arr):
    return {
        current: [
            abs(number - current) if abs(number - current) != 0 else float("inf")
            for number in arr
        ]
        for current in arr
    }
differences = differences_dict(arr)
# Just to give some starting point, take the first three elements of arr
current_path = [calc_path_difference(arr, 0, 1, 2), 0, 1, 2]
# Loop 1
for i, num in enumerate(arr):
    # Save some time by skipping numbers whose path
    # already exceeds the min path we currently have
    if not min(differences[num]) < current_path[0]:
        continue
    # Loop 2
    for j, num2 in enumerate(arr):
        # So you can't get 2 of the same index
        if j == i:
            continue
        # some code for making indices i and j of differences
        # infinite so they can't be the smallest, but not sure if
        # this is needed without more tests
        # diff_arr_copy = differences[num2].copy()
        # diff_arr_copy[i], diff_arr_copy[j] = float("inf"), float("inf")
        # Get index of number in arr with smallest difference to num2
        min_index = differences[num2].index(min(differences[num2]))
        # So you can't get 2 of the same index again
        if min_index == i or min_index == j:
            continue
        # Total of current path
        path_total = calc_path_difference(arr, i, j, min_index)
        # Change current path if this one is shorter
        if path_total < current_path[0]:
            current_path = [path_total, i, j, min_index]
Does this work for you? I played around with the order of the elements in the array and it seemed to give the correct output each time but I would have liked to have another example to test it on.
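A different technique from the answer above, offered only as a sketch: if "least difference between them" is read as the smallest spread (largest minus smallest) among the chosen numbers, then sorting first and sliding a window over the sorted values may be enough. That reading of the question is an assumption, and the name closest_group is made up for illustration.

```python
def closest_group(numbers, count):
    """Return `count` numbers with the smallest spread (max - min).

    Assumes "least difference" means the smallest spread of the group.
    """
    if not 1 <= count <= len(numbers):
        raise ValueError("count must be between 1 and len(numbers)")
    ordered = sorted(numbers)
    # In sorted order the tightest group is always a run of neighbours,
    # so only len(ordered) - count + 1 windows need to be compared.
    best_start = min(
        range(len(ordered) - count + 1),
        key=lambda start: ordered[start + count - 1] - ordered[start],
    )
    return ordered[best_start:best_start + count]


print(closest_group([1, 5, 3, 19, 18, 25], 3))
```

The sort dominates the cost, so this runs in O(n log n) for any group size, and the result comes back in ascending order rather than in input order.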

Python code for printing out the second largest number given a list [duplicate]

I'm learning Python, and the simple way it handles lists is presented as an advantage. Sometimes it is, but look at this:
>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> numbers.remove(max(numbers))
>>> max(numbers)
74
A very easy, quick way of obtaining the second largest number from a list. Except that the easy list processing helps write a program that runs through the list twice over, to find the largest and then the 2nd largest. It's also destructive - I need two copies of the data if I wanted to keep the original. We need:
>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> if numbers[0]>numbers[1]:
...     m, m2 = numbers[0], numbers[1]
... else:
...     m, m2 = numbers[1], numbers[0]
...
>>> for x in numbers[2:]:
...     if x>m2:
...         if x>m:
...             m2, m = m, x
...         else:
...             m2 = x
...
>>> m2
74
Which runs through the list just once, but isn't terse and clear like the previous solution.
So: is there a way, in cases like this, to have both? The clarity of the first version, but the single run through of the second?
You could use the heapq module:
>>> el = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> import heapq
>>> heapq.nlargest(2, el)
[90.8, 74]
And go from there...
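One way to "go from there", sketched under the assumption that duplicates of the maximum count as the second largest (so [10, 7, 10] gives 10). The wrapper name second_largest_heapq is made up for illustration; heapq.nlargest itself is part of the standard library and does not modify its input.

```python
import heapq


def second_largest_heapq(numbers):
    """Second largest by position (a repeated maximum counts),
    or None when there are fewer than two elements."""
    top_two = heapq.nlargest(2, numbers)  # descending order, at most 2 items
    return top_two[1] if len(top_two) == 2 else None


print(second_largest_heapq([20, 67, 3, 2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7]))
print(second_largest_heapq([10, 7, 10]))
print(second_largest_heapq([1]))
```

This keeps the call short, makes a single pass over the data, and leaves the original list untouched.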
Since @OscarLopez and I have different opinions on what the second largest means, I'll post the code according to my interpretation and in line with the first algorithm provided by the questioner.
def second_largest(numbers):
    count = 0
    m1 = m2 = float('-inf')
    for x in numbers:
        count += 1
        if x > m2:
            if x >= m1:
                m1, m2 = x, m1
            else:
                m2 = x
    return m2 if count >= 2 else None
(Note: Negative infinity is used here instead of None since None has different sorting behavior in Python 2 and 3 (see "Python - Find second smallest number"); a check for the number of elements in numbers makes sure that negative infinity won't be returned when the actual answer is undefined.)
If the maximum occurs multiple times, it may be the second largest as well. Another thing about this approach is that it works correctly if there are fewer than two elements; then there is no second largest.
Running the same tests:
second_largest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7])
=> 74
second_largest([1,1,1,1,1,2])
=> 1
second_largest([2,2,2,2,2,1])
=> 2
second_largest([10,7,10])
=> 10
second_largest([1,1,1,1,1,1])
=> 1
second_largest([1])
=> None
second_largest([])
=> None
Update
I restructured the conditionals to drastically improve performance, by almost 100% in my testing on random numbers. The reason for this is that in the original version, the elif was always evaluated in the likely event that the next number is not the largest in the list. In other words, for practically every number in the list, two comparisons were made, whereas one comparison mostly suffices: if the number is not larger than the second largest, it's not larger than the largest either.
You could always use sorted
>>> sorted(numbers)[-2]
74
Try the solution below, it's O(n) and it will store and return the second greatest number in the second variable. UPDATE: I've adjusted the code to work with Python 3, because now arithmetic comparisons against None are invalid.
Notice that if all elements in numbers are equal, or if numbers is empty or if it contains a single element, the variable second will end up with a value of None - this is correct, as in those cases there isn't a "second greatest" element.
Beware: this finds the "second maximum" value, if there's more than one value that is "first maximum", they will all be treated as the same maximum - in my definition, in a list such as this: [10, 7, 10] the correct answer is 7.
def second_largest(numbers):
    minimum = float('-inf')
    first, second = minimum, minimum
    for n in numbers:
        if n > first:
            first, second = n, first
        elif first > n > second:
            second = n
    return second if second != minimum else None
Here are some tests:
second_largest([20, 67, 3, 2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7])
=> 74
second_largest([1, 1, 1, 1, 1, 2])
=> 1
second_largest([2, 2, 2, 2, 2, 1])
=> 1
second_largest([10, 7, 10])
=> 7
second_largest( [1, 3, 10, 16])
=> 10
second_largest([1, 1, 1, 1, 1, 1])
=> None
second_largest([1])
=> None
second_largest([])
=> None
Why complicate the scenario? It's very simple and straightforward:
Convert the list to a set, which removes duplicates.
Sort the set back into a list, which gives the values in ascending order (list(set(...)) alone does not guarantee any ordering).
Here is the code:
mlist = [2, 3, 6, 6, 5]
mlist = sorted(set(mlist))
print(mlist[-2])
You can find the 2nd largest by any of the following ways:
Option 1:
numbers = set(numbers)
numbers.remove(max(numbers))
max(numbers)
Option 2:
sorted(set(numbers))[-2]
The quickselect algorithm, O(n) cousin to quicksort, will do what you want. Quickselect has average performance O(n). Worst case performance is O(n^2) just like quicksort but that's rare, and modifications to quickselect reduce the worst case performance to O(n).
The idea of quickselect is to use the same pivot, lower, higher idea of quicksort, but to then ignore the lower part and to further order just the higher part.
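A minimal sketch of that idea, not a tuned implementation: it picks a random pivot, builds new lists instead of partitioning in place, and counts duplicates by position (so the second largest of [10, 7, 10] is 10). The function name quickselect_kth_largest is made up for illustration.

```python
import random


def quickselect_kth_largest(values, k):
    """Return the k-th largest element (k=1 is the maximum).

    Average O(n) time; the input list is not modified.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)
    while True:
        pivot = random.choice(items)
        higher = [v for v in items if v > pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if k <= len(higher):
            # The answer is among the values larger than the pivot.
            items = higher
        elif k <= len(higher) + equal_count:
            # The pivot itself occupies the k-th position.
            return pivot
        else:
            # Skip everything >= pivot and continue in the lower part.
            k -= len(higher) + equal_count
            items = [v for v in items if v < pivot]


print(quickselect_kth_largest([20, 67, 3, 2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7], 2))
```

Each round either returns or continues with a strictly smaller list, which is where the average linear running time comes from.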
This is one simple way:
def find_second_largest(arr):
    # Starting from 0 assumes the values are positive; for arbitrary
    # numbers, start from float('-inf') instead.
    first, second = 0, 0
    for number in arr:
        if number > first:
            second = first
            first = number
        elif number > second and number < first:
            second = number
    return second
If you do not mind using numpy (import numpy as np):
np.partition(numbers, -2)[-2]
gives you the 2nd largest element of the list with a guaranteed worst-case O(n) running time.
The partition(a, kth) function returns a copy of the array in which the kth element is where it would be in a sorted array, all elements before it are smaller or equal, and all elements after it are greater or equal.
there are some good answers here for type([]), in case someone needed the same thing on a type({}) here it is,
def secondLargest(D):
    def second_largest(L):
        if(len(L)<2):
            raise Exception("Second_Of_One")
        KFL=None #KeyForLargest
        KFS=None #KeyForSecondLargest
        n = 0
        for k in L:
            if(KFL == None or k>=L[KFL]):
                KFS = KFL
                KFL = n
            elif(KFS == None or k>=L[KFS]):
                KFS = n
            n+=1
        return (KFS)
    KFL=None #KeyForLargest
    KFS=None #KeyForSecondLargest
    if(len(D)<2):
        raise Exception("Second_Of_One")
    if(type(D)!=type({})):
        if(type(D)==type([])):
            return(second_largest(D))
        else:
            raise Exception("TypeError")
    else:
        for k in D:
            if(KFL == None or D[k]>=D[KFL]):
                KFS = KFL
                KFL = k
            elif(KFS == None or D[k] >= D[KFS]):
                KFS = k
    return(KFS)
a = {'one':1 , 'two': 2 , 'thirty':30}
b = [30,1,2]
print(a[secondLargest(a)])
print(b[secondLargest(b)])
Just for fun I tried to make it user friendly xD
>>> l = [19, 1, 2, 3, 4, 20, 20]
>>> sorted(set(l))[-2]
19
O(n): The time complexity of a loop is considered O(n) if the loop variable is incremented/decremented by a constant amount. For example, the following loop has O(n) time complexity.
// Here c is a positive integer constant
for (int i = 1; i <= n; i += c) {
    // some O(1) expressions
}
To find the second largest number, I used the method below: find the largest number first, then collect every value that differs from it and take the maximum of those.
x = [1,2,3]
A = list(map(int, x))
y = max(A)
k1 = list()
for values in range(len(A)):
    if y != A[values]:
        k1.append(A[values])
z = max(k1)
print(z)
Objective: To find the second largest number from input.
Input : 5
2 3 6 6 5
Output: 5
n = int(input())
arr = map(int, input().split())
print(sorted(list(set(arr)))[-2])
def SecondLargest(x):
    largest = max(x[0],x[1])
    largest2 = min(x[0],x[1])
    for item in x:
        if item > largest:
            largest2 = largest
            largest = item
        elif largest2 < item and item < largest:
            largest2 = item
    return largest2
SecondLargest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7])
list_nums = [1, 2, 6, 6, 5]
minimum = float('-inf')
max, min = minimum, minimum
for num in list_nums:
    if num > max:
        max, min = num, max
    elif max > num > min:
        min = num
print(min if min != minimum else None)
Output
5
Initialize with -inf. This code generalizes for all cases to find the second largest element.
max1= float("-inf")
max2=max1
for item in arr:
    if max1<item:
        max2,max1=max1,item
    elif item>max2 and item!=max1:
        max2=item
print(max2)
Using reduce from functools should be a linear-time functional-style alternative:
from functools import reduce
def update_largest_two(largest_two, x):
    m1, m2 = largest_two
    return (m1, m2) if m2 >= x else (m1, x) if m1 >= x else (x, m1)
def second_largest(numbers):
    if len(numbers) < 2:
        return None
    largest_two = sorted(numbers[:2], reverse=True)
    rest = numbers[2:]
    m1, m2 = reduce(update_largest_two, rest, largest_two)
    return m2
... or in a very concise style:
from functools import reduce
def second_largest(n):
    update_largest_two = lambda a, x: a if a[1] >= x else (a[0], x) if a[0] >= x else (x, a[0])
    return None if len(n) < 2 else (reduce(update_largest_two, n[2:], sorted(n[:2], reverse=True)))[1]
This can be done with [N + log(N) - 2] comparisons, which is slightly better than the loose upper bound of 2N (both are O(N)).
The trick is to use binary recursive calls and the "tennis tournament" algorithm. The winner (the largest number) will emerge after all the 'matches' (takes N-1 comparisons), but if we record the 'players' of all the matches, and among them, group all the players that the winner has beaten, the second largest number will be the largest number in this group, i.e. the 'losers' group.
The size of this 'losers' group is log(N), and again, we can invoke the binary recursive calls to find the largest among the losers, which will take [log(N) - 1] comparisons. Actually, we can just linearly scan the losers group to get the answer too; the time budget is the same.
Below is sample Python code:
def largest(L):
    global pairs
    if len(L) == 1:
        return L[0]
    else:
        left = largest(L[:len(L)//2])
        right = largest(L[len(L)//2:])
        pairs.append((left, right))
        return max(left, right)
def second_largest(L):
    global pairs
    biggest = largest(L)
    second_L = [min(item) for item in pairs if biggest in item]
    return biggest, largest(second_L)
if __name__ == "__main__":
    pairs = []
    # test array
    L = [2,-2,10,5,4,3,1,2,90,-98,53,45,23,56,432]
    if len(L) == 0:
        first, second = None, None
    elif len(L) == 1:
        first, second = L[0], None
    else:
        first, second = second_largest(L)
    print('The largest number is: ' + str(first))
    print('The 2nd largest number is: ' + str(second))
You can also try this:
>>> list=[20, 20, 19, 4, 3, 2, 1,100,200,100]
>>> sorted(set(list), key=int, reverse=True)[1]
100
A simple way :
n=int(input())
arr = set(map(int, input().split()))
arr.remove(max(arr))
print (max(arr))
Use the default sort() method to get the second largest number in the list.
sort() is a built-in list method, so you do not need to import any module for it.
lis = [11,52,63,85,14]
lis.sort()
print(lis[len(lis)-2])
Just to make the accepted answer more general, the following is the extension to get the kth largest value:
def kth_largest(numbers, k):
    largest_ladder = [float('-inf')] * k
    count = 0
    for x in numbers:
        count += 1
        ladder_pos = 1
        for v in largest_ladder:
            if x > v:
                ladder_pos += 1
            else:
                break
        if ladder_pos > 1:
            largest_ladder = largest_ladder[1:ladder_pos] + [x] + largest_ladder[ladder_pos:]
    return largest_ladder[0] if count >= k else None
def secondlarget(passinput):
    passinputMax = max(passinput) #find the maximum element from the array
    newpassinput = [i for i in passinput if i != passinputMax] #Find the second largest element in the array
    #print (newpassinput)
    if len(newpassinput) > 0:
        return max(newpassinput) #return the second largest
    return 0
if __name__ == '__main__':
    n = int(input().strip()) # lets say 5
    passinput = list(map(int, input().rstrip().split())) # 1 2 2 3 3
    result = secondlarget(passinput) #2
    print (result) #2
if __name__ == '__main__':
    n = int(input())
    arr = list(map(float, input().split()))
    high = max(arr)
    secondhigh = min(arr)
    for x in arr:
        if x < high and x > secondhigh:
            secondhigh = x
    print(secondhigh)
The above code is when we are setting the elements value in the list
as per user requirements. And below code is as per the question asked
#list
numbers = [20, 67, 3 ,2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7]
#find the highest element in the list
high = max(numbers)
#find the lowest element in the list
secondhigh = min(numbers)
for x in numbers:
    '''
    find the second highest element in the list,
    it works even when there are duplicates highest element in the list.
    It runs through the entire list finding the next lowest element
    which is less then highest element but greater than lowest element in
    the list set initially. And assign that value to secondhigh variable, so
    now this variable will have next lowest element in the list. And by end
    of loop it will have the second highest element in the list
    '''
    if (x<high and x>secondhigh):
        secondhigh=x
print(secondhigh)
Max out the value by comparing each one to the max_item. In the first if, every time the value of max_item changes it gives its previous value to second_max. To tightly couple the two second if ensures the boundary
def secondmax(self, list):
max_item = list[0]
second_max = list[1]
for item in list:
if item > max_item:
second_max = max_item
max_item = item
if max_item < second_max:
max_item = second_max
return second_max
you have to compare in between new values, that's the trick, think always in the previous (the 2nd largest) should be between the max and the previous max before, that's the one!!!!
def secondLargest(lista):
    max_number = 0
    prev_number = 0
    for i in range(0, len(lista)):
        if lista[i] > max_number:
            prev_number = max_number
            max_number = lista[i]
        elif lista[i] > prev_number and lista[i] < max_number:
            prev_number = lista[i]
    return prev_number
Most of the previous answers are correct, but here is another way!
Our strategy is to loop with two variables, first_highest and second_highest. We loop through the numbers, and if the current number is greater than first_highest, we set second_highest to first_highest and then first_highest to the current number. Otherwise, if the current number is greater than second_highest, we set second_highest to the current number.
#!/usr/bin/env python3
import sys
def find_second_highest(numbers):
    min_integer = -sys.maxsize -1
    first_highest= second_highest = min_integer
    for current_number in numbers:
        if current_number == first_highest and min_integer != second_highest:
            first_highest=current_number
        elif current_number > first_highest:
            second_highest = first_highest
            first_highest = current_number
        elif current_number > second_highest:
            second_highest = current_number
    return second_highest
print(find_second_highest([80,90,100]))
print(find_second_highest([80,80]))
print(find_second_highest([2,3,6,6,5]))
Best solution that my friend Dhanush Kumar came up with:
def second_max(loop):
    glo_max = loop[0]
    sec_max = float("-inf")
    for i in loop:
        if i > glo_max:
            sec_max = glo_max
            glo_max=i
        elif sec_max < i < glo_max:
            sec_max = i
    return sec_max
#print(second_max([-1,-3,-4,-5,-7]))
assert second_max([-1,-3,-4,-5,-7])==-3
assert second_max([5,3,5,1,2]) == 3
assert second_max([1,2,3,4,5,7]) ==5
assert second_max([-3,1,2,5,-2,3,4]) == 4
assert second_max([-3,-2,5,-1,0]) == 0
assert second_max([0,0,0,1,0]) == 0
Below code will find the max and the second max numbers without the use of max function. I assume that the input will be numeric and the numbers are separated by single space.
myList = input().split()
myList = list(map(eval,myList))
m1 = myList[0]
m2 = myList[0]
for x in myList:
    if x > m1:
        m2 = m1
        m1 = x
    elif x > m2:
        m2 = x
print ('Max Number: ',m1)
print ('2nd Max Number: ',m2)
Here I tried to come up with an answer.
2nd(Second) maximum element in a list using single loop and without using any inbuilt function.
def secondLargest(lst):
mx = 0
num = 0
sec = 0
for i in lst:
if i > mx:
sec = mx
mx = i
else:
if i > num and num >= sec:
sec = i
num = i
return sec

Faster Python technique to count triples from a list of numbers that are multiples of each other

Suppose we have a list of numbers, l. I need to COUNT all tuples of length 3 from l, (l_i,l_j,l_k) such that l_i evenly divides l_j, and l_j evenly divides l_k. With the stipulation that the indices i,j,k have the relationship i<j<k
I.e.;
If l=[1,2,3,4,5,6], then the tuples would be [1,2,6], [1,3,6],[1,2,4], so the COUNT would be 3.
If l=[1,1,1], then the only tuple would be [1,1,1], so the COUNT would be 1.
Here's what I've done so far, using list comprehensions:
def myCOUNT(l):
    newlist=[[x,y,z] for x in l for y in l for z in l if (z%y==0 and y%x==0 and l.index(x)<l.index(y) and l.index(y)<l.index(z))]
    return len(newlist)
>>> l=[1,2,3,4,5,6]
>>> myCOUNT(l)
3
This works, but as l gets longer (and it can be as large as 2000 elements long), the time it takes increases too much. Is there a faster/better way to do this?
We can count the number of triples with a given number in the middle by counting how many factors of that number are to its left, counting how many multiples of that number are to its right, and multiplying. Doing this for any given middle element is O(n) for a length-n list, and doing it for all n possible middle elements is O(n^2).
def num_triples(l):
    total = 0
    for mid_i, mid in enumerate(l):
        num_left = sum(1 for x in l[:mid_i] if mid % x == 0)
        num_right = sum(1 for x in l[mid_i+1:] if x % mid == 0)
        total += num_left * num_right
    return total
Incidentally, the code in your question doesn't actually work. It's fallen into the common newbie trap of calling index instead of using enumerate to get iteration indices. More than just being inefficient, this is actually wrong when the input has duplicate elements, causing your myCOUNT to return 0 instead of 1 on the [1, 1, 1] example input.
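The index pitfall described above can be seen in isolation with a short illustrative snippet (the variable names are made up for this example):

```python
values = [1, 1, 1]

# list.index returns the position of the FIRST matching value,
# so every element of this list reports index 0.
positions_via_index = [values.index(v) for v in values]

# enumerate yields the actual position of each element.
positions_via_enumerate = [i for i, _ in enumerate(values)]

print(positions_via_index)
print(positions_via_enumerate)
```

Because every lookup returns 0 here, a condition such as l.index(x) < l.index(y) can never hold for equal values, which is why the [1, 1, 1] triple is never counted.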
Finding all tuples in O(n^2)
Your algorithm iterates over all possible combinations, which makes it O(n^3).
Instead, you should precompute the division-tree of your list of numbers and recover triples from the paths down the tree.
Division tree
A division tree is a graph whose nodes are numbers and whose children are the multiples of each number.
For example, given the list [1, 2, 3, 4], the division tree looks like this.
   1
  /|\
 2 | 3
  \|
   4
Computing the division tree requires comparing each number against all others, making its creation O(n^2).
Here is a basic implementation of a division-tree that can be used for your problem.
class DivisionTree:
    def __init__(self, values):
        values = sorted(values)
        # For a division-tree to be connected, we need 1 to be its head
        # Thus we artificially add it and note whether it was actually in our numbers
        if 1 in values:
            self._has_one = True
            values = values[1:]
        else:
            self._has_one = False
        self._graph = {1: []}
        for v in values:
            self.insert(v)
    def __iter__(self):
        """Iterate over all values of the division-tree"""
        yield from self._graph
    def insert(self, n):
        """Insert value in division tree, adding it as child of each divisor"""
        self._graph[n] = []
        for x in self:
            if n != x and n % x == 0:
                self._graph[x].append(n)
    def paths(self, depth, _from=1):
        """Return a generator of all paths of *depth* down the division-tree"""
        if _from == 1:
            for x in self._graph[_from]:
                yield from self.paths(depth , _from=x)
        if depth == 1:
            yield [_from]
            return
        if _from != 1 or self._has_one:
            for x in self._graph[_from]:
                for p in self.paths(depth - 1, _from=x):
                    yield [_from, *p]
Usage
Once we built a DivisionTree, it suffices to iterate over all paths down the graph and select only those which have length 3.
Example
l = [1,2,3,4,5,6]
dt = DivisionTree(l)
for p in dt.paths(3):
print(p)
Output
[1, 2, 4]
[1, 2, 6]
[1, 3, 6]
This solution assumes that the list of numbers is initially sorted, as in your example. The output could, however, be filtered with regard to the condition on indices i < j < k to provide a more general solution.
Time complexity
Generating the division-tree is O(n^2).
In turn, there can be up to n! different paths, although stopping the iteration whenever we go deeper than 3 prevents traversing them all. This makes us iterate over the following paths:
the paths corresponding to three-tuples, say there are m of them;
the paths corresponding to two-tuples, there are O(n^2) of them;
the paths corresponding to one-tuples, there are O(n) of them.
Thus this overall yields an O(n^2 + m) algorithm.
I suppose this solution without list comprehension will be faster (you can see analogue with list comprehension further):
a = [1, 2, 3, 4, 5, 6]
def count(a):
    result = 0
    length = len(a)
    for i in range(length):
        for j in range(i + 1, length):
            for k in range(j + 1, length):
                if a[j] % a[i] == 0 and a[k] % a[j] == 0:
                    result += 1
    return result
print(count(a))
Output:
3
In your solution index method is too expensive (requires O(n) operations). Also you don't need to iterate over full list for each x, y and z (x = a[i], y = a[j], z = a[k]). Notice how I use indexes in my loops for y and z because I know that a.index(x) < a.index(y) < a.index(z) is always satisfied.
You can write it as one liner too:
def count(a):
    length = len(a)
    return sum(1 for i in range(length)
               for j in range(i + 1, length)
               for k in range(j + 1, length)
               if a[j] % a[i] == 0 and a[k] % a[j] == 0)
P.S.
Please, don't use l letter for variables names because it's very similar to 1:)
There is a way to do this with itertools combinations:
from itertools import combinations
l=[1,2,3,4,5,6]
>>> [(x,y,z) for x,y,z in combinations(l,3) if z%y==0 and y%x==0]
[(1, 2, 4), (1, 2, 6), (1, 3, 6)]
Since combinations generates the tuples in list order, you do not then need to check the index of z.
Then your myCOUNT function becomes:
def cnt(li):
    return sum(1 for x,y,z in combinations(li,3) if z%y==0 and y%x==0)
>>> cnt([1,1,1])
1
>>> cnt([1,2,3,4,5,6])
3
This is a known problem.
Here are some timings for the solutions here:
from itertools import combinations
class DivisionTree:
    def __init__(self, values):
        # For a division-tree to be connected, we need 1 to be its head
        # Thus we artificially add it and note whether it was actually in our numbers
        if 1 in values:
            self._has_one = True
            values = values[1:]
        else:
            self._has_one = False
        self._graph = {1: []}
        for v in values:
            self.insert(v)
    def __iter__(self):
        """Iterate over all values of the division-tree"""
        yield from self._graph
    def insert(self, n):
        """Insert value in division tree, adding it as child of each divisor"""
        self._graph[n] = []
        for x in self:
            if n != x and n % x == 0:
                self._graph[x].append(n)
    def paths(self, depth, _from=1):
        """Return a generator of all paths of *depth* down the division-tree"""
        if _from == 1:
            for x in self._graph[_from]:
                yield from self.paths(depth , _from=x)
        if depth == 1:
            yield [_from]
            return
        if _from != 1 or self._has_one:
            for x in self._graph[_from]:
                for p in self.paths(depth - 1, _from=x):
                    yield [_from, *p]
def f1(li):
    return sum(1 for x,y,z in combinations(li,3) if z%y==0 and y%x==0)
def f2(l):
    newlist=[[x,y,z] for x in l for y in l for z in l if (z%y==0 and y%x==0 and l.index(x)<l.index(y) and l.index(y)<l.index(z))]
    return len(newlist)
def f3(a):
    result = 0
    length = len(a)
    for i in range(length):
        for j in range(i + 1, length):
            for k in range(j + 1, length):
                if a[j] % a[i] == 0 and a[k] % a[j] == 0:
                    result += 1
    return result
def f4(l):
    dt = DivisionTree(l)
    return sum(1 for _ in dt.paths(3))
def f5(l):
    total = 0
    for mid_i, mid in enumerate(l):
        num_left = sum(1 for x in l[:mid_i] if mid % x == 0)
        num_right = sum(1 for x in l[mid_i+1:] if x % mid == 0)
        total += num_left * num_right
    return total
if __name__=='__main__':
    import timeit
    tl=list(range(3,155))
    funcs=(f1,f2,f3,f4,f5)
    td={f.__name__:f(tl) for f in funcs}
    print(td)
    for case, x in (('small',50),('medium',500),('large',5000)):
        li=list(range(2,x))
        print('{}: {} elements'.format(case,x))
        for f in funcs:
            print(" {:^10s}{:.4f} secs".format(f.__name__, timeit.timeit("f(li)", setup="from __main__ import f, li", number=1)))
And the results:
{'f1': 463, 'f2': 463, 'f3': 463, 'f4': 463, 'f5': 463}
small: 50 elements
f1 0.0010 secs
f2 0.0056 secs
f3 0.0018 secs
f4 0.0003 secs
f5 0.0002 secs
medium: 500 elements
f1 1.1702 secs
f2 5.3396 secs
f3 1.8519 secs
f4 0.0156 secs
f5 0.0110 secs
large: 5000 elements
f1 1527.4956 secs
f2 6245.9930 secs
f3 2074.2257 secs
f4 1.3492 secs
f5 1.2993 secs
You can see that f1,f2,f3 are clearly O(n^3) or worse and f4,f5 are O(n^2). f2 took more than 90 minutes for what f4 and f5 did in 1.3 seconds.
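As a rough check of those growth rates, the exponent k in t ~ n^k can be estimated from two measurements. The sketch below uses the medium and large timings reported above; the helper name growth_exponent is made up for illustration, and single-run timings are noisy, so treat the output as approximate:

```python
import math

def growth_exponent(n_small, t_small, n_large, t_large):
    """Estimate k in t ~ n**k from two (size, seconds) measurements."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)

# Timings copied from the medium (500) and large (5000) runs above.
timings = {
    "f1": (1.1702, 1527.4956),
    "f2": (5.3396, 6245.9930),
    "f3": (1.8519, 2074.2257),
    "f4": (0.0156, 1.3492),
    "f5": (0.0110, 1.2993),
}

exponents = {
    name: growth_exponent(500, t_medium, 5000, t_large)
    for name, (t_medium, t_large) in timings.items()
}
for name, k in exponents.items():
    print(name, round(k, 2))
```

The estimates come out close to 3 for f1, f2 and f3 and close to 2 for f4 and f5, which matches the reading above.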
Solution in O(M*log(M)) for a sorted list containing positive numbers
As user2357112 has answered, we can count the number of triplets in O(n^2) by calculating, for every number, how many factors and how many multiples it has in the list. However, if instead of comparing every pair we walk over each number's multiples up to the largest number and check whether they are in the list, the running time becomes O(N+M*log(N)), where N is the length of the list and M is the largest number in it.
Code:
def countTriples(myList):
    counts = {} #Contains the number of appearances of every number.
    factors = {} #Contains the number of factors of every number.
    multiples = {} #Contains the number of multiples of every number.
    for i in myList: #Initializing the dictionaries.
        counts[i] = 0
        factors[i] = 0
        multiples[i] = 0
    maxNum = max(myList) #The maximum number in the list.
    #First, we count the number of appearances of every number.
    for i in myList:
        counts[i] += 1
    #Then, for every number in the list, we check whether its multiples are in the list.
    for i in counts:
        for j in range(2*i, maxNum+1, i):
            if j in counts: #dict.has_key() no longer exists in Python 3.
                factors[j] += counts[i]
                multiples[i] += counts[j]
    #Finally, we count the number of triples.
    #Integer division (//) keeps the result an int in Python 3; every product below is exactly divisible.
    ans = 0
    for i in counts:
        ans += counts[i]*factors[i]*multiples[i] #Counting triplets with three numbers.
        ans += counts[i]*(counts[i]-1)*factors[i]//2 #Counting triplets with two larger and one smaller number.
        ans += counts[i]*(counts[i]-1)*multiples[i]//2 #Counting triplets with two smaller numbers and one larger number.
        ans += counts[i]*(counts[i]-1)*(counts[i]-2)//6 #Counting triplets with three copies of the same number.
    return ans
While this solution will work quickly for lists containing many small numbers, it will not work for lists containing large numbers:
countTriples(list(range(1,1000000))) #Took 80 seconds on my computer
countTriples([1,2,1000000000000]) #Will take a very long time
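To see why the second call is slow even though the list has only three items, one can count how many times the inner multiples loop would run without actually running it. This is a small sketch; the helper name inner_loop_steps is made up for illustration and mirrors the range(2*i, maxNum+1, i) loop in countTriples:

```python
def inner_loop_steps(values):
    """Count how many times the multiples loop in countTriples would run."""
    max_num = max(values)
    # One pass per distinct value v, visiting 2v, 3v, ... up to max_num.
    # len() of a range is computed directly, so nothing is iterated here.
    return sum(len(range(2 * v, max_num + 1, v)) for v in set(values))

dense_steps = inner_loop_steps(list(range(1, 1001)))   # many small values
sparse_steps = inner_loop_steps([1, 2, 1000000000000]) # three values, one huge
print(dense_steps)
print(sparse_steps)  # 1499999999998
```

The step count is driven by the largest value, not by the length of the list, which is the M in the complexity estimate.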
Fast solution with unknown efficiency for unsorted lists
Another method to count the number of multiples and factors of every number in the list would be to use a binary tree data structure, with leaves corresponding to numbers. The data structure supports three operations:
1) Add a number to every position which is a multiple of a number.
2) Add a number to every position which is specified in a set.
3) Get the value of a position.
We use lazy propagation, and propagate the updates from the root to lower nodes only during queries.
To find the number of factors of every item in the list, we iterate over the list, query the number of factors of the current item from the data structure, and add 1 to every position which is a multiple of the item.
To find the number of multiples of every item, we first find for every item in the list all its factors using the algorithm described in the previous solution.
We then iterate over the list in the reverse order. For every item, we query the number of its multiples from the data structure, and add 1 to its factors in the data structure.
Finally, for every item, we add the product of its number of factors and its number of multiples to the answer.
Code:
'''A tree that supports two operations:
addOrder(num) - If given a number, adds 1 to all the values which are multiples of the given number. If given a tuple, adds 1 to all the values in the tuple.
getValue(num) - returns the value of the number.
Uses lazy evaluation to speed up the algorithm.
'''
class fen:
    '''Initiates the tree from either a list, or a segment of the list designated by s and e'''
    def __init__(this, l, s = 0, e = -1):
        if(e == -1): e = len(l)-1
        this.x1 = l[s]
        this.x2 = l[e]
        this.val = 0
        this.orders = {}
        if(s != e):
            #Integer division, so that the indices stay ints in Python 3.
            this.s1 = fen(l, s, (s+e)//2)
            this.s2 = fen(l, (s+e)//2+1, e)
        else:
            this.s1 = None
            this.s2 = None
    '''Testing if a multiple of the number appears in the range of this node.'''
    def _numGood(this, num):
        if(this.x2-this.x1+1 >= num): return True
        m1 = this.x1%num
        m2 = this.x2%num
        return m1 == 0 or m1 > m2
    '''Testing if a member of the group appears in the range of this node.'''
    def _groupGood(this, group):
        low = 0
        high = len(group)
        if(this.x1 <= group[0] <= this.x2): return True
        while(low != high-1):
            mid = (low+high)//2
            if(group[mid] < this.x1): low = mid
            elif(group[mid] > this.x2): high = mid
            else: return True
        return False
    def _isGood(this, val):
        if(type(val) == tuple):
            return this._groupGood(val)
        return this._numGood(val)
    '''Adds an order to this node.'''
    def addOrder(this, num, count = 1):
        if(not this._isGood(num)): return
        if(this.x1 == this.x2): this.val += count
        else :this.orders[num] = this.orders.get(num, 0)+count
    '''Pushes the orders to lower nodes.'''
    def _pushOrders(this):
        if(this.x1 == this.x2): return
        for i in this.orders:
            this.s1.addOrder(i, this.orders[i])
            this.s2.addOrder(i, this.orders[i])
        this.orders = {}
    def getValue(this, num):
        this._pushOrders()
        if(num < this.x1 or num > this.x2):
            return 0
        if(this.x1 == this.x2):
            return this.val
        return this.s1.getValue(num)+this.s2.getValue(num)
def countTriples2(myList):
    factors = [0 for i in myList]
    multiples = [0 for i in myList]
    numSet = set((abs(i) for i in myList))
    sortedList = sorted(list(numSet))
    #Calculating factors.
    tree = fen(sortedList)
    for i in range(len(myList)):
        factors[i] = tree.getValue(abs(myList[i]))
        tree.addOrder(abs(myList[i]))
    #Calculating the divisors of every number in the group.
    mxNum = max(numSet)
    divisors = {i:[] for i in numSet}
    for i in sortedList:
        for j in range(i, mxNum+1, i):
            if(j in numSet):
                divisors[j].append(i)
    divisors = {i:tuple(divisors[i]) for i in divisors}
    #Calculating the number of multiples to the right of every number.
    tree = fen(sortedList)
    for i in range(len(myList)-1, -1, -1):
        multiples[i] = tree.getValue(abs(myList[i]))
        tree.addOrder(divisors[abs(myList[i])])
    ans = 0
    for i in range(len(myList)):
        ans += factors[i]*multiples[i]
    return ans
This solution worked for a list containing the numbers 1..10000 in six seconds on my computer, and for a list containing the numbers 1..100000 in 87 seconds.
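For comparison, the same bookkeeping (factors to the left of each item times multiples to its right) can be sketched without the tree by enumerating each item's divisors directly. This is a swapped-in technique, not the code above: the names divisors and count_triples_by_divisors are made up for illustration, it assumes positive integers, and it costs roughly O(n*sqrt(M)) for largest value M, so it is only a simple cross-check rather than a faster replacement.

```python
from collections import Counter

def divisors(n):
    """All positive divisors of n, found by trial division up to sqrt(n)."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large

def count_triples_by_divisors(values):
    """Count i < j < k with values[j] % values[i] == 0 and values[k] % values[j] == 0."""
    n = len(values)
    # left[i]: how many earlier items divide values[i].
    left = [0] * n
    seen = Counter()
    for i, v in enumerate(values):
        left[i] = sum(seen[d] for d in divisors(v))
        seen[v] += 1
    # right[i]: how many later items are multiples of values[i].
    right = [0] * n
    divisor_hits = Counter()
    for i in range(n - 1, -1, -1):
        v = values[i]
        right[i] = divisor_hits[v]
        for d in divisors(v):
            divisor_hits[d] += 1
    return sum(l * r for l, r in zip(left, right))

print(count_triples_by_divisors([1, 2, 3, 4, 5, 6]))  # 3
```

On list(range(3, 155)) this should give 463, the same count reported for f1 to f5 in the timing answer.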

Python 3: List of over 100 indices cycles back around after index 47. why? how do I stop this?

So this is a function that gets the nth prime number. I know it's been done before and that my method may not be very efficient (I'm a new coder, with only minor dabbling in the past).
Anyway, the code below works and returns the prime number at the supplied index.
i.e.:
ind = 4
final = [1,2,3,5,7,11]
return final[ind-1]
returns: 5
But final[51-1] returns what's in final[3-1]. It seems that after index 47 it loops back around and starts over. I've printed the whole list contained in final, and it prints every prime, even those past 47. I'm not sure what's going on. Is there some limit to lists in Python?
Here is the code:
def nthPrime(ind): #gets nth prime number. IE: 5th prime == 11. works based off very in-efficient version of Sieve of Eratosthenes. but in increments of 200
    p = {}
    T = 2
    incST = 2
    incEND = incST + 200
    final=[1]
    while len(final) < ind:
        for i in range(incST,incEND):
            p[i] = True
        while T <= math.sqrt(incEND):
            l = 0
            while l <= incEND:
                p[T**2 + (T*l)] = False
                l+=1
                if T**2+(T*l) > incEND:
                    break
            for k,v in p.items():
                if p[k] == True and k > T:
                    T = int(k)
                    break
        for k in p:
            if p[k] == True:
                final.append(k)
        incST = incEND + 1
        incEND = incST + 200
    '''
    currently function works perfectly for
    any index under 48.
    at index 48 and above it seems to start
    back at index 1.
    IE: final[51]
    ^would actually return final[4]
    '''
    return final[ind-1]
You need to count how many primes you have in your list, but you append to final inside the while loop, and p still holds every number from the earlier passes. So each pass appends all the numbers found so far again, which is why the list starts at 2 again after 199.
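A stripped-down sketch of that effect, using made-up segments of three numbers and no sieving at all (flags plays the role of p):

```python
flags = {}   # plays the role of p in the question
final = []

for start, stop in [(0, 3), (3, 6)]:   # two passes over made-up segments
    for i in range(start, stop):
        flags[i] = True
    # flags still contains the keys from the previous pass,
    # so they are appended a second time here.
    for k in flags:
        if flags[k]:
            final.append(k)

print(final)  # [0, 1, 2, 0, 1, 2, 3, 4, 5]
```

The second pass re-appends 0, 1, 2 before it gets to the new keys, just as the question's list re-appends 2 through 199.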
Also, relying on the iteration order of a dictionary is dangerous. You should sort the keys when iterating.
My solution only counts the primes to know when to end the loop, and composes the list only at the end, omitting 1 and shifting the index by 1.
I also sort the dictionary when iterating over it to make sure:
import math
def nthPrime(ind): #gets nth prime number. IE: 5th prime == 11. works based off very in-efficient version of Sieve of Eratosthenes. but in increments of 200
    p = {}
    T = 2
    incST = 2
    incEND = incST + 200
    lenfinal = 1
    while lenfinal < ind:
        for i in range(incST,incEND):
            p[i] = True
        while T <= math.sqrt(incEND):
            l = 0
            while l <= incEND:
                p[T**2 + (T*l)] = False
                l+=1
                if T**2+(T*l) > incEND:
                    break
            for k,v in sorted(p.items()):
                if v and k > T:
                    T = int(k)
                    break
        incST = incEND + 1
        incEND = incST + 200
        # compute length, no need to order or to actually create the list
        lenfinal = sum(1 for k,v in p.items() if v)
    # now compose the list
    final = [k for k,v in sorted(p.items()) if v]
    return final[ind-2]
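On the point about dictionary order: a dict iterates in insertion order (guaranteed by the language since Python 3.7), which matches sorted key order only if the keys happen to be inserted in ascending order. A minimal illustration with made-up keys:

```python
p = {}
for key in (5, 2, 9, 3):   # made-up keys, inserted out of order
    p[key] = True

insertion_order = list(p)
sorted_order = sorted(p)
print(insertion_order)  # [5, 2, 9, 3]
print(sorted_order)     # [2, 3, 5, 9]
```

Sorting before iterating, as in the code above, removes the dependence on how the keys were inserted.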
A more efficient way to do this would be a recursive function:
I'll put some explanation in the code.
def nthPrime(ind):
    first_prime=1 #first prime number
    number = 1 # all numbers that we will check, this will be incremented
    prime_numbers = [first_prime] # The list of prime numbers we will find
    def findPrimeInPosition(ind, number):
        if ind > len(prime_numbers): # This recursive function will exit if find a sufficient number of primes
            number+=1 # incrementing to check the next number
            is_prime = True # Assuming number is a prime
            for p in prime_numbers[1:]: # Check if it is a prime
                if not number % p:
                    is_prime = False
            if is_prime:
                prime_numbers.append(number) # Add to the list of primes
            findPrimeInPosition(ind, number)
        return prime_numbers[-1] # Get the last element found
    return findPrimeInPosition(ind, number)
Example of usage:
print(nthPrime(47))
>> 199
print(nthPrime(48))
>> 211
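One caveat with the recursive version: each call checks a single candidate number, so the recursion depth grows with the size of the prime, and CPython's default recursion limit (typically 1000) would be reached for larger indices. The same trial-division idea can be written as a loop. This is a minimal sketch that keeps this thread's convention of counting 1 as the first entry; the name nth_prime_iterative is made up for illustration:

```python
def nth_prime_iterative(ind):
    """Return the ind-th entry of 1, 2, 3, 5, 7, ... (1 counted first, as in this thread)."""
    found = [1]        # same starting list as the recursive version
    candidate = 1
    while len(found) < ind:
        candidate += 1
        # Skip the leading 1, since every number is divisible by it.
        if all(candidate % p for p in found[1:]):
            found.append(candidate)
    return found[ind - 1]

print(nth_prime_iterative(47))  # 199
print(nth_prime_iterative(48))  # 211
```

Because there is no recursion, the index can be pushed well past the point where the recursive version would raise RecursionError.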
This isn't a Python issue; the problem is in how you compute the results. final[51] really does return the value stored at that position. To see this, do the following:
# Modify your last line: replace
# return final[ind-1]
# with
# return final
# Call your method
o_final = nthPrime(100)
for k in range(len(o_final)):
    print(k, o_final[k])
You will then see that the values repeat, and that only at position 93 does the list reach the next new value and keep increasing.
