Nested Loop or 'in', which is faster? - python

I am a python novice and was studying some basic coding challenges, and I was hoping someone could explain which of the following snippets of code would run faster. The point is to see if there are pairs of integers within the list that add up to 100:
list = [1,2,3,99,5]
for i in list:
    for j in list:
        if i + j == 100:
            return True
or:
list = [1,2,3,99,5]
for i in list:
    diff = 100 - i
    if diff in list:
        return True

Benchmark
This homemade, randomized benchmark demonstrates that the solution using in is significantly faster in most cases. I did not investigate further, but I did encounter some runs where the solution with the nested for loops was slightly faster when toying with the sample size.
import time, random

def time_it(f, rep=100000):
    sample = [[random.randint(0, 100) for _ in range(20)] for _ in range(rep // 100)]
    start = time.time()
    for i in range(rep):
        f(sample[i % len(sample)])
    return time.time() - start

def nested_for(lst):
    for i in lst:
        for j in lst:
            if i + j == 100:
                return True

def nested_in(lst):
    for i in lst:
        diff = 100 - i
        if diff in lst:
            return True

print('for:', time_it(nested_for))
print('in:', time_it(nested_in))
Output
for: 0.7093353271484375
in: 0.24253296852111816
Removing the assignment to j on every iteration is probably what removes a big overhead in the solution using in.
Improvement
Note, though, that both solutions are O(n^2). You can achieve O(n) by using a set: since a set hashes its items, lookup is O(1).
def contains_diff(lst):
    elements = set(lst)
    return any(100 - i in elements for i in elements)

print(contains_diff([1, 2, 3, 99]))  # True
print(contains_diff([1, 2, 3]))      # False
Interestingly enough, if you benchmark the above it will generally be slower than the in solution. This is because the probability of in finding a sum of 100 quickly in a randomized list is relatively high. If you let the difference you are looking for grow, the overhead of building a set is quickly repaid by the speed of set lookup.
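To illustrate that last point, here is a rough, self-contained sketch (not part of the benchmark above; the function names and the target value are made up) in which the target sum is practically never present, so neither approach can exit early:
import random, time

TARGET = 10**9  # a sum that is (practically) never present in the sample lists

def pairs_with_list(lst, target=TARGET):
    for i in lst:
        if target - i in lst:   # O(n) membership test on a list
            return True
    return False

def pairs_with_set(lst, target=TARGET):
    elements = set(lst)         # built once, then O(1) membership tests
    return any(target - i in elements for i in lst)

sample = [[random.randint(0, 100) for _ in range(100)] for _ in range(200)]
for f in (pairs_with_list, pairs_with_set):
    start = time.time()
    for lst in sample:
        f(lst)
    print(f.__name__, time.time() - start)
On such inputs the set-based version should come out ahead, since the list version scans every list to the end for every element.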
Sidenote
As a sidenote, you should not use list as a variable name, as it shadows the built-in list.
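For a minimal illustration of what goes wrong (not from the original post):
list = [1, 2, 3, 99, 5]
try:
    list((1, 2))          # the built-in list() constructor is now shadowed
except TypeError as e:
    print(e)              # 'list' object is not callable
del list                  # removes the shadowing name, restoring the built-in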

Related

Why is any() so much faster than in?

https://repl.it/#ArmanTavakoli/List-Comprehension-vs-Any
Why is my any check so much faster than my in check when they are essentially doing the same thing?
from timeit import default_timer as timer
import random

input = [random.randint(0, 100) for x in range(0, 1000000)]

def any_check(input):
    return any(i == 1 for i in input)

def list_comprehension(input):
    return 1 in [num for num in input]

first_start = timer()
any_check(input)
first_end = timer()
print('any_check', first_end - first_start)

second_start = timer()
list_comprehension(input)
second_end = timer()
print('list_comprehension', second_end - second_start)
Results of running the functions 3 times each.
# Calculated with 3 runs each
# Ratio is list_comprehension:any_check
# 10,000 - Mean Ratio: 17.87
# Example run;
# any_check 1.5022000297904015e-05
# list_comprehension 0.00038980199315119535
# 100,000 - Mean Ratio: 140.76
# any_check 2.020499960053712e-05
# list_comprehension 0.0035961729954578914
# 1,000,000 - Mean Ratio: 3379.81
# any_check 2.2904998331796378e-05
# list_comprehension 0.08528400499926647
As several people pointed out in the comments, the reason your function doing an in test is slower than the version using any is that it also includes an unnecessary list comprehension, which has to iterate over the whole input before the in operator can even begin looking for a match. When run on lists, both in and any can short-circuit, quitting early if a matching value is found near the start of the search. But the list comprehension in your second function always iterates over the whole input, even if there was a 1 right at the start.
If you replaced 1 in [num for num in input] with 1 in input, you'd see performance as good as or better than your function using any. Performance would be fairly similar if input were a list, but could be much faster for other container types (such as sets and ranges).
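A quick sketch of that replacement (in_check is just an illustrative name, not from the original post); with the same kind of million-element list as above it should time in the same ballpark as any_check:
from timeit import default_timer as timer
import random

data = [random.randint(0, 100) for _ in range(1000000)]

def in_check(seq):
    return 1 in seq  # short-circuits at the first matching element

start = timer()
in_check(data)
print('in_check', timer() - start)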

Why does my prime number sieve return the same result slower than the brute force method for finding primes in Python 2.7?

I am fairly new to Python and I have been trying to find a fast way to find primes up to a given number.
When I use the Sieve of Eratosthenes with the following code:
#Finding primes till 40000.
import time
start = time.time()

def prime_eratosthenes(n):
    list = []
    prime_list = []
    for i in range(2, n+1):
        if i not in list:
            prime_list.append(i)
            for j in range(i*i, n+1, i):
                list.append(j)
    return prime_list

lists = prime_eratosthenes(40000)
print lists
end = time.time()
runtime = end - start
print "runtime =", runtime
Along with the list containing the primes, I get a line like the one below as output:
runtime = 20.4290001392
Depending upon the RAM being used etc., I consistently get a value within a range of +-0.5.
However when I try to find the primes till 40000 using a brute force method as in the following code:
import time
start = time.time()

prime_lists = []
for i in range(1, 40000+1):
    for j in range(2, i):
        if i % j == 0:
            break
    else:
        prime_lists.append(i)

print prime_lists
end = time.time()
runtime = end - start
print "runtime =", runtime
This time, along with the list of primes, I get a smaller value for runtime:
runtime = 16.0729999542
The value only varies within a range of +-0.5.
Clearly, the sieve is slower than the brute force method.
I also observed that the difference between the two runtimes only grows as the value n up to which primes are to be found increases.
Can anyone give a logical explanation for the above mentioned behavior? I expected the sieve to function more efficiently than the brute force method but it seems to work vice-versa here.
While appending to a list is not the best way to implement this algorithm (the original algorithm uses fixed-size arrays), it is amortized constant time. I think the bigger issue is the if i not in list check, which is linear time. The best change you can make for larger inputs is to have the outer for loop only check up to sqrt(n), which saves a lot of computation.
A better approach is to keep a boolean array which keeps track of striking off numbers, like what is seen in the Wikipedia article for the Sieve. This way, skipping numbers is constant time since it's an array access.
For example:
def sieve(n):
    nums = [0] * n
    for i in range(2, int(n**0.5)+1):
        if nums[i] == 0:
            for j in range(i*i, n, i):
                nums[j] = 1
    return [i for i in range(2, n) if nums[i] == 0]
So, to answer your question: your two for loops make the algorithm do potentially O(n^2) work, while being smart about the outer for loop makes the new algorithm take at most O(n sqrt(n)) time (in practice, for reasonably sized n, the runtime is closer to O(n)).

Python: How can I make my implementation of bubble sort more time efficient?

Here is my code - a bubble sort algorithm for sorting list elements in ascending order:
foo = [7, 0, 3, 4, -1]
cnt = 0
for i in foo:
    for i in range(len(foo)-1):
        if foo[cnt] > foo[cnt + 1]:
            temp = foo[cnt]
            foo[cnt] = foo[cnt + 1]
            foo[cnt + 1] = temp
        cnt = cnt + 1
    cnt = 0
I've been revising my code, but it is still too inefficient for an online judge. Some help would be greatly appreciated!
Early Exit BubbleSort
The first loop has no bearing on what happens inside.
The second loop does all the heavy lifting. You can get rid of cnt by using enumerate.
To swap elements, use the pythonic swap: a, b = b, a.
As suggested in the comments, make use of an early exit. If no swaps are made at any point in the inner loop, the list is already sorted and no further iteration is necessary. This is the intuition behind the changed flag.
By definition, after the ith iteration of the outer loop, the last i elements will have been sorted, so you can further reduce the constant factor associated with the algorithm.
foo = [7, 0, 3, 4, -1]

for i in range(len(foo)):
    changed = False
    for j, x in enumerate(foo[:-i-1]):
        if x > foo[j + 1]:
            foo[j], foo[j + 1] = foo[j + 1], foo[j]
            changed = True
    if not changed:
        break

print(foo)
[-1, 0, 3, 4, 7]
Note that none of these optimisations changes the asymptotic (big-O) complexity of bubble sort, which remains O(n^2); they only reduce the associated constant factors.
One easy optimization is to start the second loop from index i+1:
for i in range(0, len(foo)):
    for j in range(i+1, len(foo)):
        if foo[i] > foo[j]:
            temp = foo[i]
            foo[i] = foo[j]
            foo[j] = temp
Since you have already sorted everything up to index i, there is no need to iterate over it again. This can save you more than 50% of the comparisons - in this case it's 10 versus 25 in your original algorithm.
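As a quick sanity check of that count for a five-element list (this just counts loop iterations; it is not an excerpt from the answer):
n = 5
comparisons_full = sum(1 for i in range(n) for j in range(n))          # 25
comparisons_half = sum(1 for i in range(n) for j in range(i + 1, n))   # 10
print(comparisons_full, comparisons_half)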
You need to understand big-O notation in order to understand how efficient your algorithm is in terms of its usage of computational resources, independent of computer architecture or clock rate. It basically helps you analyze the worst-case running time or memory usage of your algorithm as the size of the input increases.
In summary, the running time of your algorithm will fall into one of these categories (from fastest to slowest):
O(1): Constant time. Pronounced "Oh of 1". The fastest time.
O(lg n): Logarithmic time. Pronounced "Oh of log n". Faster than linear time. Traditionally, it is the fastest time bound for search.
O(n): Linear time. Pronounced "Oh of n", where n is the size of your input (e.g. the size of an array). Usually what you get when you need to examine every single element of your input.
O(n lg n): The fastest time we can currently achieve when sorting a list of elements by comparison.
O(n**2): Quadratic time. Pronounced "Oh of n squared". Often this is the bound when we have nested loops.
O(2**n): Really, REALLY big! A constant raised to the power n grows faster than n raised to any fixed power, so such an algorithm is slower than any polynomial-time one.
In your case, you are using nested loops, which is O(n^2). The code I have written uses a single while loop and has a growth complexity of O(n), which is faster than O(n^2). I haven't really tried it on a very large array, but in your case it seems to work. Try it and let me know if it works as expected.
k = [7, 0, 3, 4, -1]
n = len(k)
i = 0
count = 0
swapped = False  # initialise so the elif branches below never see an undefined name
while count < n**2:  # assuming we wouldn't go through the loop more than n squared times
    if i == n - 1:
        i = 0
        count += 1
        swapped = False
    elif k[i] > k[i+1]:
        temp = k[i]
        k[i] = k[i+1]
        k[i+1] = temp
        i += 1
        swapped = True
    elif swapped == False:
        i += 1
    elif swapped == True and i < n - 1:
        i += 1
Note: In the example list (k), we only need to loop through the list three times in order for it to be sorted in ascending order. So if you change the while loop to this line of code while count < 4:, it would still work.

Is there a way to avoid this memory error?

I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.
from itertools import combinations
import time

def findanums(n):
    l = []
    for i in range(1, n + 1):
        s = []
        for j in range(1, i):
            if i % j == 0:
                s.append(j)
        if sum(s) > i:
            l.append(i)
    return l

start = time.time()  # start time

limit = 28123

anums = findanums(limit + 1)  # abundant numbers (1..limit)
print "done finding abundants", time.time() - start

pairs = combinations(anums, 2)
print "done finding combinations", time.time() - start

sums = map(lambda x: x[0]+x[1], pairs)
print "done finding all possible sums", time.time() - start

print "start main loop"
answer = 0
for i in range(1, limit+1):
    if i not in sums:
        answer += i
print "ANSWER:", answer
When I run this I run into a MemoryError.
The traceback:
File "test.py", line 20, in <module>
sums = map(lambda x: x[0]+x[1], pairs)
I've tried to prevent it by disabling garbage collection, based on what I found on Google, but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it, and I'm at a loss at this point.
SIDE NOTE: I'm using PyPy 2.0 Beta2 (Python 2.7.4) because it is so much faster than any other Python implementation I've used, and SciPy/NumPy are over my head as I'm still just beginning to program and I don't have an engineering or strong math background.
As Kevin mentioned in the comments, your algorithm might be wrong, but in any case your code is not optimized.
When working with very big lists, it is common to use generators; there is a famous, great Stack Overflow answer explaining the concepts of yield and generators - What does the "yield" keyword do in Python?
The problem is that your pairs = combinations(anums, 2) doesn't give you a generator but a large object, which is much more memory-consuming.
I changed your code to use this function; since you iterate over the collection only once, you can lazily evaluate the values:
import itertools

def generator_sol(anums1, s):
    for comb in itertools.combinations(anums1, s):
        yield comb
Now, instead of pairs = combinations(anums, 2), which generates a large object, use:
pairs = generator_sol(anums, 2)
Then, instead of using the lambda I would use another generator:
sums_sol = (x[0]+x[1] for x in pairs)
Another tip: have a look at xrange, which is more suitable here; it doesn't generate a list but an xrange object, which is more efficient in many cases (such as this one).
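For example, in Python 2 (the version used in the question), range() builds the whole list up front while xrange() produces values lazily; a minimal illustration:
big = 28123
print type(range(10))       # <type 'list'>: fully materialised
print type(xrange(big))     # <type 'xrange'>: lazy, one value at a time

total = 0
for i in xrange(1, big + 1):  # iterates without building a 28123-element list
    total += i
print total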
Let me suggest you use generators. Try changing this:
sums = map(lambda x: x[0]+x[1], pairs)
to
sums = (a+b for (a,b) in pairs)
Ofiris's solution is also OK, but it implies that itertools.combinations returns a list, which is wrong. If you are going to keep solving Project Euler problems, have a look at the itertools documentation.
The issue is that anums is big - about 28000 elements long - so pairs must take 28000*28000*8 bytes = 6 GB. If you used numpy you could cast anums as a numpy.int16 array, in which case the resulting pairs would be 1.5 GB - more manageable:
import numpy as np

# cast
anums = np.array([anums], dtype=np.int16)

# compute the sum of all the pairs via an outer product
pairs = (anums + anums.T).ravel()

Determining index of maximum value in a list - Optimization

I have written a few lines of code to solve this problem, but the profiler says that it is very time-consuming (using the kernprof line-by-line profiler).
Here is the code:
comp = [1, 2, 3]  # comp is a list that always has 3 elements; the values 1, 2, 3 are just for illustration
m = max(comp)
max_where = [i for i, j in enumerate(comp) if j == m]
if 0 in max_where:
    some action1
if 1 in max_where:
    some action2
if 2 in max_where:
    some action3
The profiler says that most of the time is consumed in the max_where calculation. I have also tried to split this calculation into an if-tree to avoid some unnecessary operations, but the results were not satisfactory.
Please, am I doing it wrong, or is it just Python?
If it's always three elements, why not simply do:
comp = [1, 2, 3]
m = max(comp)
if comp[0] == m:
    some action
if comp[1] == m:
    some action
if comp[2] == m:
    some action
If you're doing this many times, and if you have all the lists available at the same time, then you could make use of numpy.argmax to get the indices for all the lists.
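A small sketch of the numpy.argmax suggestion, assuming the lists can be stacked into one 2-D array (each row being one three-element comp; the data here is made up):
import numpy as np

rows = np.array([[1, 2, 3],
                 [9, 4, 1],
                 [5, 5, 7]])
max_indices = np.argmax(rows, axis=1)  # index of the maximum in every row at once
print(max_indices)                     # [2 0 2]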
You say that this is a time-consuming operation, but I sincerely doubt this actually affects your program. Have you actually found that this is causing some problem due to slow execution in your code? If not, there is no point optimizing.
This said, there is a small optimization I can think of - which is to use a set rather than a list comprehension for max_where. This will make your three membership tests faster.
max_where = {i for i, j in enumerate(comp) if j == m}
That said, with only three items/checks, the construction of the set may well take more time than it saves.
In general, with a list of three items, this operation is going to take negligible amounts of time. On my system, it takes half a microsecond to perform this operation.
In short: Don't bother. Unless this is a proven bottleneck in your program that needs to be sped up, your current code is fine.
Expanding upon Tobias' answer, using a for loop:
comp = [1, 2, 3]
m = max(comp)
for index in range(len(comp)):
    if comp[index] == m:
        # some action
Since indexing starts at 0, you do not need to do len(comp) + 1.
I prefer using indexing in a for loop instead of the actual element, because it speeds things up considerably.
Sometimes in a process, you may need the index of a specific element. Then, using l.index(obj) will waste time (even if only insignificant amounts; for longer processes, this adds up).
This also assumes that every process (for the comp[index]) is very similar: same process but with different variables. This wouldn't work if you have significantly different processes for each index.
However, by using for index in range(len(l)):, you already have the index, and the item can easily be accessed as l[index].
Oddly, it seems that Tobias' implementation is faster (I thought otherwise):
comp = [1, 2, 3]
m = max(comp)

from timeit import timeit

def test1():
    if comp[0] == m: return m
    if comp[1] == m: return m
    if comp[2] == m: return m

def test2():
    for index in range(len(comp)):
        if comp[index] == m: return m

print 'test1:', timeit(test1, number = 1000)
print 'test2:', timeit(test2, number = 1000)
Returns:
test1: 0.00121262329299
test2: 0.00469034990534
My implementation may be faster for longer lists (not sure, though). However, writing Tobias' version out for a long list is tedious (repeating if comp[n] == m for every index).
How about this:
sample = [3, 1, 2]
dic = {0: func_a, 1: func_b, 2: func_c}
x = max(sample)
y = sample.index(x)
dic[y]()  # look up and call the function mapped to the index of the maximum
As was mentioned (and rightfully downvoted), this does not work when multiple function calls are needed, i.e. when the maximum occurs at more than one index.
However this does:
sample = [3, 1, 3]
dic = {0: "func_a", 1: "func_b", 2: "func_c"}
max_val = max(sample)
max_indices = [index for index, elem in enumerate(sample) if elem == max_val]
for key in max_indices:
    dic[key]
This is quite similar to the other solutions above. I know some time has passed, but it wasn't right the way it was. :)
Cheers!
