I have two sets of numbers, each in a list in my Python script. For each number in the first list, I need to see if any of the numbers in the second are larger than it. I only need the number of times that an n2 was larger than an n1. (For example, if numset1 is [7,2] and numset2 is [6,9], I just need 3)
Right now I'm doing this - going through each n1 and checking if each n2 is larger than it:
possibilities = [(n1<n2) for n1 in numset1 for n2 in numset2]
numPossibilities = sum(possibilities)
Currently this is the slowest portion of my script, particularly when dealing with larger datasets (numset1 and numset2 containing thousands of numbers). I'm sure there's some way to make this more efficient, I'm just not sure how.
Sort numset2 and then iterate over numset1 but use a binary search on numset2, for example using the bisect module: http://docs.python.org/2/library/bisect.html
import bisect
# your code here
numset2.sort()
L = len(numset2)
numPossibilities = sum([bisect.bisect_right(numset2,n1) < L for n1 in numset1])
Also note that your original code does not compute what you have asked for in your second sentence - for each element in numset1, it sums how many elements in numset2 are greater than this element, not whether there is an element that matches the criterion.
To match your original code, do:
numPossibilities = sum([L - bisect.bisect_right(numset2,n1) for n1 in numset1])
Your problem is that you have to iterate over every combination of (n1, n2), and there are len(numset1) * len(numset2) combinations, which gets very large even when numset1 and numset2 are only moderately large.
Put a different way, the running time of your algorithm is O(n^2) (if len(numset1) is about equal to len(numset2)). Let's make that runtime faster. :-)
This becomes a lot easier to do if we sort the lists. So let's sort numset1 and numset2.
>>> numset1.sort()
>>> numset2.sort()
Now, compare the smallest element of numset1 (call it n1) and the smallest element of numset2 (call it n2). If n1 is smaller, then we know that there are len(numset2) elements in numset2 larger than it. If n2 is smaller, we know that no elements in numset1 are smaller than it.
Now, we don't want to actually delete elements from the beginning of the list, because that's an O(n) operation on a Python list. So instead, let's keep track of where we are in each list and iterate through.
n1_idx, n2_idx, accumulator = 0, 0, 0
while n1_idx < len(numset1) and n2_idx < len(numset2):
    if numset1[n1_idx] < numset2[n2_idx]:
        accumulator += len(numset2) - n2_idx
        n1_idx += 1
    else:
        n2_idx += 1
At the end of this operation, we've spent O(nlog(n)) time sorting the lists and O(n) time doing our iteration, so our overall runtime complexity is O(nlog(n)).
And at that point, accumulator holds the number of (n1, n2) pairs where n1 < n2.
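Wrapped up as a function (the name get_pair_count is mine) and checked against the example from the question, the same idea looks like this:

def get_pair_count(numset1, numset2):
    # Count (n1, n2) pairs with n1 < n2 by walking both sorted lists once.
    numset1 = sorted(numset1)
    numset2 = sorted(numset2)
    n1_idx, n2_idx, accumulator = 0, 0, 0
    while n1_idx < len(numset1) and n2_idx < len(numset2):
        if numset1[n1_idx] < numset2[n2_idx]:
            # every remaining element of numset2 is larger than this n1
            accumulator += len(numset2) - n2_idx
            n1_idx += 1
        else:
            n2_idx += 1
    return accumulator

print(get_pair_count([7, 2], [6, 9]))  # 3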
Here is what should be a pretty efficient implementation:
def get_possibilities(numset1, numset2):
    sortset1 = sorted(numset1)
    sortset2 = sorted(numset2)
    total = 0
    i2 = 0
    for i1, n1 in enumerate(sortset1, 1):
        while sortset2[i2] <= n1:
            # sortset2[i2] is greater than exactly the i1 - 1 earlier elements of sortset1
            total += i1 - 1
            i2 += 1
            if i2 >= len(sortset2):
                # reached the end of sortset2, so just return total now
                return total
    # all remaining elements of sortset2 are greater than all elements of sortset1
    total += (len(sortset2) - i2) * len(sortset1)
    return total
This iterates over each set only once, which it accomplishes by sorting the sets before running. This allows some improved logic, because if the element in sortset2 at index i2 is greater than an element of sortset1 at index i1, then it is also greater than all elements in sortset1 at earlier indices.
I was coding a function in Python to find elements of a sorted list that exist in another sorted list and print out the results:
# assume that both lists are sorted
def compare_sorted_lists(list1, list2):
res = []
a = 0
b = 0
while a < len(list1) and b < len(list2):
if list1[a] == list2[b]:
res.append(list1[a])
a += 1
elif list1[a] < list2[b]:
a += 1
else:
b += 1
return res
I want to figure out the time complexity of comparing elements with this method.
Assuming that:
list1 has length A and the maximum number of digits/letters in a list1 element is X
list2 has length B and the maximum number of digits/letters in a list2 element is Y
For these lists I have O(A+B) time complexity when traversing them with pointers, but how would comparing elements affect the time complexity for this function (specifically, worst-case time complexity)?
Edit: 12 March 2021 16:30 - rephrased question
The comparison between two elements is constant time, so this does not affect the complexity of your whole algorithm, which you correctly identified as O(A+B).
As user1717828 pointed out, the loop runs at most A+B times; however, comparing two elements is not a constant-time operation. If the numbers are fixed-precision numbers, then yes, it is; but Python integers are unbounded, so the cost of comparing them grows linearly with the number of digits. Therefore the time complexity of the algorithm you gave is
O((A+B) * max{X,Y})
You can actually do better than that under specific circumstances. E.g. if A << B, then the following code has O(A*log(B)*max{X,Y}) time complexity.
for a in A:
split B from the middle and keep searching a in B in one of the blocks. Continue
until you find a, or not.
because the inner loop keeps dividing the list B in two, which can last for at most log_2(B) + 1 steps.
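For reference, here is a minimal runnable sketch of that idea using the standard bisect module (the function name compare_sorted_lists_bisect is mine); it performs O(A * log(B)) comparisons instead of O(A + B):

import bisect

def compare_sorted_lists_bisect(list1, list2):
    # Binary-search each element of list1 in the (sorted) list2.
    res = []
    for a in list1:
        idx = bisect.bisect_left(list2, a)
        if idx < len(list2) and list2[idx] == a:
            res.append(a)
    return res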
Could anyone explain exactly what's happening under the hood to make the recursive approach in the following problem much faster and efficient in terms of time complexity?
The problem: Write a program that would take an array of integers as input and return the largest three numbers sorted in an array, without sorting the original (input) array.
For example:
Input: [22, 5, 3, 1, 8, 2]
Output: [5, 8, 22]
Even though we could simply sort the original array and return the last three elements, that would take at least O(n log(n)) time, since that is the best a comparison-based sorting algorithm can do. So the challenge is to perform better and complete the task in O(n) time.
So I was able to come up with a recursive solution:
def findThreeLargestNumbers(array, largest=[]):
if len(largest) == 3:
return largest
max = array[0]
for i in array:
if i > max:
max = i
array.remove(max)
largest.insert(0, max)
return findThreeLargestNumbers(array, largest)
In which I kept finding the largest number, removing it from the original array, appending it to my empty array, and recursively calling the function again until there are three elements in my array.
However, when I looked at the suggested iterative method, I composed this code:
def findThreeLargestNumbers(array):
sortedLargest = [None, None, None]
for num in array:
check(num, sortedLargest)
return sortedLargest
def check(num, sortedLargest):
for i in reversed(range(len(sortedLargest))):
if sortedLargest[i] is None:
sortedLargest[i] = num
return
if num > sortedLargest[i]:
shift(sortedLargest, i, num)
return
def shift(array, idx, element):
if idx == 0:
array[0] = element
return array
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
return array
Both solutions successfully passed all the tests, and I was convinced that the iterative approach is faster (even though not as clean). So I imported the time module and put both to the test, providing an array of one million random integers and measuring how long each solution takes to return the sorted array of the three largest numbers.
The recursive approach was much faster (about nine times faster) than the iterative approach!
Why is that? The recursive approach traverses the huge array three times and, on top of that, removes an element each time (which takes O(n) time, since all the elements after it have to be shifted in memory), whereas the iterative approach traverses the input array only once, doing some work at every iteration, but only on a tiny array of size 3, which should take hardly any time at all!
I really want to be able to judge and pick the most efficient algorithm for any given problem so any explanation would tremendously help.
Advice for optimization.
Avoid function calls. Avoid creating temporary garbage. Avoid extra comparisons. Have logic that looks at elements as little as possible. Walk through how your code works by hand and look at how many steps it takes.
Your recursive code makes only 3 function calls, and as pointed out elsewhere does an average of about 1.5 comparisons per element (1 while looking for the max, about 0.5 while figuring out which element to remove).
Your iterative code makes lots of comparisons per element, calls extra functions for each one, and creates and destroys throwaway objects such as the reversed(range(...)) iterator on every call to check.
Now compare with this iterative solution:
def find_largest(array, limit=3):
if len(array) <= limit:
# Special logic not needed.
return sorted(array)
else:
# Initialize the answer to values that will be replaced.
min_val = min(array[0:limit])
answer = [min_val for _ in range(limit)]
# Now scan the array, keeping the largest values seen so far in answer.
for i in array:
if answer[0] < i:
# Sift elements down until we find the right spot.
j = 1
while j < limit and answer[j] < i:
answer[j-1] = answer[j]
j = j+1
# Now insert.
answer[j-1] = i
return answer
There are no function calls. It is possible that you can make up to 6 comparisons per element (verify that answer[0] < i, verify that (j=1) < 3, verify that answer[1] < i, verify that (j=2) < 3, verify that answer[2] < i, then find that (j=3) < 3 is not true). You will hit that worst case if array is sorted. But most of the time you only do the first comparison then move to the next element. No muss, no fuss.
How does it benchmark?
Note that if you wanted the largest 100 elements, then you'd find it worthwhile to use a smarter data structure such as a heap to avoid the bubble-sort-like sifting.
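As an illustration, the standard library already ships a heap-based helper for exactly this; a minimal sketch:

import heapq
import random

data = [random.randint(0, 10**6) for _ in range(10**6)]

# Keeps a bounded heap internally; returns the 100 largest, in descending order.
top100 = heapq.nlargest(100, data)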
I am not really comfortable with Python, but I have a different approach to the problem, for what it's worth.
As far as I saw, all solutions posted are O(NM) where N is the length of the array and M the length of the largest elements array.
Because of your specific situation, where N >> M, you could say it's O(N), but the longer the inputs, the more the M factor matters.
I agree with @zvone that you have more steps in the iterative solution, which sounds like a valid explanation for the difference in running times.
Back to my proposal: it implements a binary search, O(N*log(M)), with recursion:
import math
def binarySearch(arr, target, origin = 0):
"""
Recursive binary search
Args:
arr (list): List of numbers to search in
target (int): Number to search with
Returns:
int: index + 1 of the element in arr immediately below target, or -1 if target is already present or is lower than the lowest element in arr
"""
half = math.floor((len(arr) - 1) / 2)
if target > arr[-1]:
return origin + len(arr)
if len(arr) == 1 or target < arr[0]:
return -1
if arr[half] < target and arr[half+1] > target:
return origin + half + 1
if arr[half] == target or arr[half+1] == target:
return -1
if arr[half] < target:
return binarySearch(arr[half:], target, origin + half)
if arr[half] > target:
return binarySearch(arr[:half + 1], target, origin)
def findLargestNumbers(array, limit = 3, result = []):
"""
Recursive linear search of the largest values in an array
Args:
array (list): Array of numbers to search in
limit (int): Length of array returned. Default: 3
Returns:
list: Array of max values with length as limit
"""
if len(result) == 0:
result = [float('-inf')] * limit
if len(array) < 1:
return result
val = array[-1]
foundIndex = binarySearch(result, val)
if foundIndex != -1:
result.insert(foundIndex, val)
return findLargestNumbers(array[:-1], limit, result[1:])
return findLargestNumbers(array[:-1], limit, result)
It is quite flexible and might be inspiration for a more elaborated answer.
The recursive solution
The recursive function goes through the list 3 times to find the largest number, and removes the largest number from the list 3 times.
for i in array:
if i > max:
...
and
array.remove(max)
So, you have 3×N comparisons, plus 3 removals. I guess the removal is optimized in C, but there are again about 3×(N/2) comparisons to find the item to be removed.
So, a total of approximately 4.5 × N comparisons.
The other solution
The other solution goes through the list only once, but each time it compares to the three elements in sortedLargest:
for i in reversed(range(len(sortedLargest))):
...
and much of the time it shifts sortedLargest with these three assignments:
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
So, you are N times:
calling check
creating and reversing a range(3)
accessing sortedLargest[i]
comparing num > sortedLargest[i]
calling shift
comparing idx == 0
and about 2×N/3 times doing:
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
and N/3 times array[0] = element
It is difficult to count exactly, but that is much more work than the roughly 4.5×N comparisons above.
I'm trying to write the fastest algorithm possible to return the number of "magic triples" (i.e. x, y, z where z is a multiple of y and y is a multiple of x) in a list of 3-2000 integers.
(Note: I believe the list was expected to be sorted and unique but one of the test examples given was [1,1,1] with the expected result of 1 - that is a mistake in the challenge itself though because the definition of a magic triple was explicitly noted as x < y < z, which [1,1,1] isn't. In any case, I was trying to optimise an algorithm for sorted lists of unique integers.)
I haven't been able to work out a solution that doesn't involve three nested loops and therefore being O(n^3). I've seen one online that is O(n^2), but I can't get my head around what it's doing, so it doesn't feel right to submit it.
My code is:
def solution(l):
if len(l) < 3:
return 0
elif l == [1,1,1]:
return 1
else:
halfway = int(l[-1]/2)
quarterway = int(halfway/2)
quarterIndex = 0
halfIndex = 0
for i in range(len(l)):
if l[i] >= quarterway:
quarterIndex = i
break
for i in range(len(l)):
if l[i] >= halfway:
halfIndex = i
break
triples = 0
for i in l[:quarterIndex+1]:
for j in l[:halfIndex+1]:
if j != i and j % i == 0:
multiple = 2
while (j * multiple) <= l[-1]:
if j * multiple in l:
triples += 1
multiple += 1
return triples
I've spent quite a lot of time going through examples manually and removing loops through unnecessary sections of the lists, but this still completes a list of 2,000 integers in about a second, whereas the O(n^2) solution I found completes the same list in 0.6 seconds - it seems like such a small difference, but obviously it means mine takes about 60% longer.
Am I missing a really obvious way of removing one of the loops?
Also, I saw mention of making a directed graph and I see the promise in that. I can make the list of first nodes from the original list with a built-in function, so in principle I presume that means I can make the overall graph with two for loops and then return the length of the third node list, but I hit a wall with that too. I just can't seem to make progress without that third loop!!
from array import array
def num_triples(l):
n = len(l)
pairs = set()
lower_counts = array("I", (0 for _ in range(n)))
upper_counts = lower_counts[:]
for i in range(n - 1):
lower = l[i]
for j in range(i + 1, n):
upper = l[j]
if upper % lower == 0:
lower_counts[i] += 1
upper_counts[j] += 1
return sum(nx * nz for nz, nx in zip(lower_counts, upper_counts))
Here, lower_counts[i] is the number of pairs of which the ith number is the y, and z is the other number in the pair (i.e. the number of different z values for this y).
Similarly, upper_counts[i] is the number of pairs of which the ith number is the y, and x is the other number in the pair (i.e. the number of different x values for this y).
So the number of triples in which the ith number is the y value is just the product of those two numbers.
The use of an array here for storing the counts is for scalability of access time. Tests show that up to n=2000 it makes negligible difference in practice, and even up to n=20000 it only made about a 1% difference to the run time (compared to using a list), but it could in principle be the fastest growing term for very large n.
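As a quick sanity check, running it on the small example list used later in this thread gives the expected count:

print(num_triples([1, 2, 3, 4, 5, 6]))  # 3: (1,2,4), (1,2,6), (1,3,6)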
How about using itertools.combinations instead of nested for loops? Combined with list comprehension, it's cleaner and much faster. Let's say l = [your list of integers] and let's assume it's already sorted.
from itertools import combinations
def div(i,j,k): # this function has the logic
return l[k]%l[j]==l[j]%l[i]==0
r = sum([div(i,j,k) for i,j,k in combinations(range(len(l)),3) if i<j<k])
@alaniwi provided a very smart iterative solution.
Here is a recursive solution.
def find_magicals(lst, nplet):
"""Find the number of magical n-plets in a given lst"""
res = 0
for i, base in enumerate(lst):
# find all the multiples of current base
multiples = [num for num in lst[i + 1:] if not num % base]
res += len(multiples) if nplet <= 2 else find_magicals(multiples, nplet - 1)
return res
def solution(lst):
return find_magicals(lst, 3)
The problem can be divided into: select any number in the original list as the base (i.e., x), then count how many duplets (pairs) we can find among the numbers bigger than the base. Since the method for finding all duplets is the same as for finding triplets, we can solve the problem recursively.
From my testing, this recursive solution is comparable to, if not more performant than, the iterative solution.
This answer was the first suggestion by @alaniwi and is the one I've found to be the fastest (at 0.59 seconds for a 2,000 integer list).
def solution(l):
n = len(l)
lower_counts = dict((val, 0) for val in l)
upper_counts = lower_counts.copy()
for i in range(n - 1):
lower = l[i]
for j in range(i + 1, n):
upper = l[j]
if upper % lower == 0:
lower_counts[lower] += 1
upper_counts[upper] += 1
return sum((lower_counts[y] * upper_counts[y] for y in l))
I think I've managed to get my head around it. What it is essentially doing is comparing each number in the list with every other number to see if the larger is divisible by the smaller, and building two dictionaries:
One with, for each number, how many larger numbers in the list are divisible by it (lower_counts),
One with, for each number, how many smaller numbers in the list it is divisible by (upper_counts).
You then multiply the two values for each key and sum the products, because the product for a key is the number of triples in which that key is the middle number y; a 0 in either dictionary means that number can never be the y of a triple.
Example:
l = [1,2,3,4,5,6]
lower_counts = {1:5, 2:2, 3:1, 4:0, 5:0, 6:0}
upper_counts = {1:0, 2:1, 3:1, 4:2, 5:1, 6:3}
triple_tuple = ([1,2,4], [1,2,6], [1,3,6])
I have this script below that finds the smallest and largest values in an integer array. The goal is to complete this task in less than 1.5 x N comparisons, where N is the length of the input array, n_list. I want to ask a couple of questions.
1: (Inside the for loop) Is comparing the variables smallest or largest to n considered an array comparison? In the script below, I am counting it as one. If it is, why is this the case? The definition I was able to find said that an array comparison is between two arrays, and that's not really what I'm doing, IMO.
2: If I am not counting correctly, what am I doing wrong?
3: What would be a better approach to this problem?
Thanks so much, hope you're having a good day/night :)
def findExtremes(n_list):
smallest = n_list[0]
largest = n_list[0]
counter = 2 # See above
for n in n_list:
if n > largest:
largest = n
counter += 1
continue
elif n < smallest:
smallest = n
counter += 1
continue
else:
counter += 1
continue
return(counter)
You're only counting comparisons that succeed.
When the first if succeeds, you've done one comparison, and you correctly do counter += 1.
But if you get into the elif, you've done two comparisons: n > largest and n < smallest, so you need to do counter += 2.
And if that comparison fails, you've still already done two comparisons, so you need to do counter += 2 in the else block as well.
You don't need to initialize counter = 2 at the beginning, you should set it to 0. You'll count the two comparisons with the first element of the list in the loop.
Actually, you might want to just skip the first element, since the result for it is known. You can do:
for n in n_list[1:]:
to skip over it. If you're supposed to count those (unnecessary) comparisons on the first element, then it makes sense to initialize counter = 2.
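Putting those corrections together, the counting might look like this (a sketch that keeps the shape of your function; returning the extremes along with the count is my addition):

def findExtremes(n_list):
    smallest = n_list[0]
    largest = n_list[0]
    counter = 0
    for n in n_list[1:]:        # the first element is already both extremes
        if n > largest:
            largest = n
            counter += 1        # one comparison made: n > largest
        elif n < smallest:
            smallest = n
            counter += 2        # two comparisons: n > largest failed, n < smallest succeeded
        else:
            counter += 2        # two comparisons, both failed
    return smallest, largest, counter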
Your question about "array comparisons" doesn't seem to be relevant at all. There's nothing about comparing arrays in this problem, you're just comparing array elements to other elements of the same array.
Your algorithm performs anywhere from N+1 to 2*N comparisons. The best case is when the array is sorted from smallest to largest -- the test that updates largest succeeds for each element, so it never has to update smallest. The worst case is when it's sorted in the reverse order or all the numbers are the same: all the largest tests fail, so it has to test each element to see if it's the new smallest. On average with random data it tends to be close to the worst case, about 1.95*N.
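As for question 3: the standard trick for getting below 2*N is to process the elements in pairs. Compare the two elements of each pair against each other first, then compare only the larger one against largest and only the smaller one against smallest; that is 3 comparisons per 2 elements, about 1.5*N overall. A sketch (the function name and the returned tuple are my own choices):

def find_extremes_pairwise(n_list):
    # Seed with the first element (odd length) or the first pair (even length).
    if len(n_list) % 2:
        smallest = largest = n_list[0]
        start = 1
    else:
        smallest, largest = min(n_list[:2]), max(n_list[:2])
        start = 2
    for i in range(start, len(n_list) - 1, 2):
        a, b = n_list[i], n_list[i + 1]
        if a > b:                 # 1 comparison per pair
            a, b = b, a
        if a < smallest:          # only the smaller of the pair can be a new minimum
            smallest = a
        if b > largest:           # only the larger of the pair can be a new maximum
            largest = b
    return smallest, largest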
Given a list of numbers, to find the maximum sum of non-adjacent elements with time complexity O(n) and space complexity O(1), I could use this:
sum1 = list[0]  # best sum so far that includes the current element
sum2 = 0        # best sum so far that excludes the current element
for i in range(1, len(list)):
    num = sum1
    sum1 = sum2 + list[i]
    sum2 = max(num, sum2)
print(max(sum2, sum1))
This code will work only for k = 1 (at least one element between the summed numbers). How could I improve it, using dynamic programming, so that k can be changed? Here k is the minimum number of elements required between the summed numbers.
for example:
list = [5,6,4,1,2] k=1
answer = 11 # 5+4+2
list = [5,6,4,1,2] k=2
answer = 8 # 6+2
list = [5,3,4,10,2] k=1
answer = 15 # 5+10
It's possible to solve this with space O(k) and time O(nk). If k is a constant, this fits the requirements in your question.
The algorithm loops from position k + 1 to n. (If the array is shorter than that, it can obviously be solved in O(k).) At each step, it maintains an array best of length k + 1, where the jth entry of best is the best solution found so far whose last used element is at least j positions to the left of the current position.
Initializing best is done by setting, for its entry j, the largest non-negative entry in the array in positions 1, ..., k + 1 - j. So, for example, best[1] is the largest non-negative entry in positions 1, ..., k, and best[k + 1] is 0.
When at position i of the array, element i is used or not. If it is used, the relevant best until now is best[1], so write u = max(best[1] + a[i], best[1]). If element i is not used, then each "at least" part shifts one, so for j = 2, ..., k + 1, best[j] = max(best[j], best[j - 1]). Finally, set best[1] = u.
At the termination of the algorithm, the solution is the largest item in best.
EDIT:
I had misunderstood the question. If you need to have at least k elements in between, then the following is an O(n^2) solution.
If the numbers are non-negative, then the DP recurrence relation is:
DP[i] = max(DP[j] + A[i])   for all j such that 0 <= j < i - k
      = A[i]                otherwise
If there are negative numbers in the array as well, then we can use the idea from Kadane's algorithm:
DP[i] = max(DP[j] + A[i])   for all j such that 0 <= j < i - k and DP[j] + A[i] > 0
      = max(0, A[i])        otherwise
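A minimal sketch of the first (non-negative) recurrence above, translated directly into Python (the function name is mine; it is O(n^2) and assumes a non-empty list of non-negative numbers):

def max_sum_k_apart(a, k):
    n = len(a)
    dp = [0] * n
    for i in range(n):
        dp[i] = a[i]
        for j in range(0, i - k):   # j is at least k + 1 positions before i
            dp[i] = max(dp[i], dp[j] + a[i])
    return max(dp)

print(max_sum_k_apart([5, 6, 4, 1, 2], 1))  # 11
print(max_sum_k_apart([5, 6, 4, 1, 2], 2))  # 8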
Here's a quick implementation of the algorithm described by Ami Tavory (as far as I understand it). It should work for any sequence, though if your list is all negative, the maximum sum will be 0 (the sum of an empty subsequence).
import collections
def max_sum_separated_by_k(iterable, k):
best = collections.deque([0]*(k+1), k+1)
for item in iterable:
best.appendleft(max(item + best[-1], best[0]))
return best[0]
This uses O(k) space and O(N) time. All of the deque operations, including appending a value to one end (and implicitly removing one from the other end so the length limit is maintained) and reading from the ends, are O(1).
If you want the algorithm to return the maximum subsequence (rather than only its sum), you can change the initialization of the deque to start with empty lists rather than 0, and then use appendleft(max([item] + best[-1], best[0], key=sum)) in the body of the loop. That will be quite a bit less efficient, though, since it adds O(N) operations all over the place.
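A sketch of that variant (the function name is mine; it returns the chosen elements in reverse order of position, so reverse or sum them as needed):

import collections

def max_subseq_separated_by_k(iterable, k):
    # Same deque idea, but each slot stores the best subsequence itself.
    best = collections.deque([[] for _ in range(k + 1)], k + 1)
    for item in iterable:
        best.appendleft(max([item] + best[-1], best[0], key=sum))
    return best[0]

print(max_subseq_separated_by_k([5, 6, 4, 1, 2], 1))  # [2, 4, 5] -> sum 11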
Not sure about the complexity, but coding efficiency landed me with
max(sum(l[i::j]) for j in range(k + 1, len(l) + 1) for i in range(len(l)))
(I've replaced the list variable with l so as not to shadow the built-in name.)