Why is any() so much faster than in? - python

https://repl.it/#ArmanTavakoli/List-Comprehension-vs-Any
Why is my any check so much faster than my in check when they are essentially doing the same thing?
from timeit import default_timer as timer
import random
input = [random.randint(0, 100) for x in range(0, 1000000)]
def any_check(input):
    return any(i == 1 for i in input)

def list_comprehension(input):
    return 1 in [num for num in input]
first_start = timer()
any_check(input)
first_end = timer()
print('any_check', first_end - first_start)
second_start = timer()
list_comprehension(input)
second_end = timer()
print('list_comprehension', second_end - second_start)
Results of running the functions 3 times each.
# Calculated with 3 runs each
# Ratio is list_comprehension:any_check
# 10,000 - Mean Ratio: 17.87
# Example run;
# any_check 1.5022000297904015e-05
# list_comprehension 0.00038980199315119535
# 100,000 - Mean Ratio: 140.76
# any_check 2.020499960053712e-05
# list_comprehension 0.0035961729954578914
# 1,000,000 - Mean Ratio: 3379.81
# any_check 2.2904998331796378e-05
# list_comprehension 0.08528400499926647

As several people pointed out in the comments, the reason your function with the in test is slower than the version using any is that it also includes an unnecessary list comprehension, which must iterate over the whole input before the in operator can even begin looking for a match. When run on lists, both in and any short-circuit, quitting as soon as a matching value is found. But the list comprehension in your second function always iterates over the entire input, even if there is a 1 right at the start.
If you replaced 1 in [num for num in input] with 1 in input, you'd see performance as good as or better than your function using any: fairly similar if input is a list, but potentially much faster for other container types (such as sets and ranges).
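To see the short-circuiting at work, here is a rough sketch (function names and sizes are arbitrary choices; absolute timings will vary by machine):

```python
from timeit import timeit
import random

data = [random.randint(0, 100) for _ in range(1_000_000)]

def direct_in(seq):
    # short-circuits: stops scanning as soon as a 1 is found
    return 1 in seq

def via_comprehension(seq):
    # copies the entire list before the in test can even start
    return 1 in [num for num in seq]

print('direct_in:        ', timeit(lambda: direct_in(data), number=10))
print('via_comprehension:', timeit(lambda: via_comprehension(data), number=10))
```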

Related

Nested Loop or 'in', which is faster?

I am a python novice and was studying some basic coding challenges and was hoping to someone could explain which of the following snippets of code would run faster. The point is to see if there are pairs of integers within the list that add up to 100:
list = [1,2,3,99,5]
for i in list:
    for j in list:
        if i + j == 100:
            return True
or:
list = [1,2,3,99,5]
for i in list:
    diff = 100 - i
    if diff in list:
        return True
Benchmark
This homemade, randomized benchmark demonstrates that the solution using in is significantly faster in most cases. I did not investigate why, but I did encounter some runs where the nested-for-loop solution was slightly faster when toying with the sample size.
import time, random
def time_it(f, rep=100000):
    sample = [[random.randint(0, 100) for _ in range(20)] for _ in range(rep // 100)]
    start = time.time()
    for i in range(rep):
        f(sample[i % len(sample)])
    return (time.time() - start)

def nested_for(lst):
    for i in lst:
        for j in lst:
            if i + j == 100:
                return True

def nested_in(lst):
    for i in lst:
        diff = 100 - i
        if diff in lst:
            return True

print('for:', time_it(nested_for))
print('in:', time_it(nested_in))
Output
for: 0.7093353271484375
in: 0.24253296852111816
Not rebinding j on every inner iteration is probably what removes a big chunk of overhead in the in solution.
Improvement
Note, though, that both solutions are O(n^2). You can achieve O(n) by using a set: since a set hashes its items, lookup is O(1) on average.
def contains_diff(lst):
    elements = set(lst)
    return any(100 - i in elements for i in elements)

print(contains_diff([1, 2, 3, 99]))  # True
print(contains_diff([1, 2, 3]))  # False
Interestingly enough, if you benchmark the above, it will generally be slower than the in solution. This is because the probability of in quickly finding a pair summing to 100 in a randomized list is relatively high. If you make the target difference harder to find, the overhead of building the set is rapidly repaid by the speed of set lookups.
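To make that trade-off concrete, here is a rough sketch (the target value and sample size are arbitrary choices): with a target that cannot be found early, the set version pulls well ahead of the list scan.

```python
import random
import time

def nested_in(lst, target=100):
    for i in lst:
        if target - i in lst:  # O(n) membership test on a list
            return True
    return False

def with_set(lst, target=100):
    elements = set(lst)  # one O(n) pass; lookups are then O(1) on average
    return any(target - i in elements for i in elements)

sample = [random.randint(0, 10**6) for _ in range(5000)]
for f in (nested_in, with_set):
    start = time.perf_counter()
    f(sample, target=3 * 10**6)  # impossible target forces a full scan
    print(f.__name__, time.perf_counter() - start)
```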
Sidenote
As a sidenote, you should not be using list as a variable name, since it shadows the builtin list.

Is setting a list to 10,000 blank values then filling, or appending to an empty list faster?

If I have a list 10,000 entries in length, is it faster to:
Make a blank list then append to it.
Or
Make a list filled with 10,000 blank entries and set each entry to the data.
Example Code
# first case
a = []
for i in range(10000):
    a.append(input())

# second case
a = [0]*10000
for i in range(10000):
    a[i] = input()
the timeit module is great for testing this sort of thing:
# first case
def test1():
    a = []
    for i in range(10000):
        a.append(1)

# second case
def test2():
    a = [0]*10000
    for i in range(10000):
        a[i] = 1

# list comprehension
def test3():
    a = [1 for _ in range(10000)]

import timeit
n = 10000
print("appending:    ", timeit.timeit(test1, number=n))
print("assigning:    ", timeit.timeit(test2, number=n))
print("comprehension:", timeit.timeit(test3, number=n))
output:
appending: 13.14265166100813
assigning: 8.314113713015104
comprehension: 6.283505174011225
as requested, I replaced timeit.timeit(...) with sum(timeit.repeat(..., repeat=7))/7 to get an average time and got this result:
appending: 12.813485399578765
assigning: 8.514678678861985
comprehension: 6.271697525575291
which is not drastically different from my original results.
I thought I'd try to answer using CS principles, without timing it empirically, because simply timing it really doesn't give you the gist of why one is better, or whether it is true for other values of N.
Python lists are implemented as dynamic arrays. This means that appending requires periodic resizing (amortized O(1) per append thanks to over-allocation), whereas the blank/reassign option is a single allocation followed by 10,000 O(1) assignments. Therefore, in the limit (e.g., for 10K, 100K, 1M, etc.), I would expect the second option to stay consistently faster, since it avoids both the occasional resizes and the per-call overhead of append.
For further reading see: How is Python's List Implemented?
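The over-allocation can be observed directly with sys.getsizeof (a sketch; the exact sizes and growth points are CPython-specific and vary by version):

```python
import sys

a = []
last = sys.getsizeof(a)
growths = []
for i in range(1, 101):
    a.append(None)
    size = sys.getsizeof(a)
    if size != last:  # a reallocation happened here
        growths.append((i, size))
        last = size
print(growths)  # only a handful of resizes for 100 appends
```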
They are both comparable in speed. The list comprehension suggested by @Tadhg is about twice as fast.
These are the timing results:
first case: 10.3030366897583
second case: 9.829667568206787
list comprehension: 5.473726511001587
And this is the source code I used:
from time import time
from random import random

iterations = 10000000

# first case
start = time()
a = []
for i in range(iterations):
    a.append(random())
print(time() - start)

# second case
start = time()
a = [0]*iterations
for i in range(iterations):
    a[i] = random()
print(time() - start)

# list comprehension
start = time()
a = [random() for _ in range(iterations)]
print(time() - start)

Is there a way to avoid this memory error?

I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.
from itertools import combinations
import time
def findanums(n):
    l = []
    for i in range(1, n + 1):
        s = []
        for j in range(1, i):
            if i % j == 0:
                s.append(j)
        if sum(s) > i:
            l.append(i)
    return l
start = time.time() #start time
limit = 28123
anums = findanums(limit + 1) #abundant numbers (1..limit)
print "done finding abundants", time.time() - start
pairs = combinations(anums, 2)
print "done finding combinations", time.time() - start
sums = map(lambda x: x[0]+x[1], pairs)
print "done finding all possible sums", time.time() - start
print "start main loop"
answer = 0
for i in range(1,limit+1):
    if i not in sums:
        answer += i
print "ANSWER:",answer
When I run this I run into a MemoryError.
The traceback:
File "test.py", line 20, in <module>
sums = map(lambda x: x[0]+x[1], pairs)
I've tried to prevent it by disabling garbage collection from what I've been able to get from Google but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it and I'm at a loss at this point.
SIDE NOTE: I'm using PyPy 2.0 Beta2(Python 2.7.4) because it is so much faster than any other python implementation I've used, and Scipy/Numpy are over my head as I'm still just beginning to program and I don't have an engineering or strong math background.
As Kevin mentioned in the comments, your algorithm might be wrong, but in any case your code is not optimized.
When using very big lists, it is common to use generators, there is a famous, great Stackoverflow answer explaining the concepts of yield and generator - What does the "yield" keyword do in Python?
The problem is not really pairs = combinations(anums, 2) itself - combinations returns a lazy iterator - but that the map() call that follows (in Python 2) materializes every pair sum into one huge list, which is much more memory-consuming. Since you iterate over the collection only once, you can lazily evaluate the values all the way through. I changed your code to wrap the combinations step in an explicit generator function:
def generator_sol(anums1, s):
    for comb in itertools.combinations(anums1, s):
        yield comb
Now, instead of pairs = combinations(anums, 2), use:
pairs = generator_sol(anums, 2)
Then, instead of using the lambda I would use another generator:
sums_sol = (x[0]+x[1] for x in pairs)
Another tip: you had better look at xrange, which is more suitable here; it doesn't generate a list but an xrange object, which is more efficient in many cases (such as this one).
Let me suggest you to use generators. Try changing this:
sums = map(lambda x: x[0]+x[1], pairs)
to
sums = (a+b for (a,b) in pairs)
Ofiris' solution is also OK, but it implies that itertools.combinations returns a list, which is wrong: it returns a lazy iterator. If you are going to keep solving Project Euler problems, have a look at the itertools documentation.
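For this particular problem, one common pattern is to store the reachable sums in a set, so each of the final "i not in sums" tests is O(1) instead of a scan. The sketch below is not the asker's code: it also swaps the trial-division loop for a divisor sieve, and it includes a+a pairs, which combinations(anums, 2) silently skips.

```python
def abundant_numbers(limit):
    # proper-divisor sums via a sieve instead of per-number trial division
    divsum = [0] * (limit + 1)
    for d in range(1, limit // 2 + 1):
        for multiple in range(2 * d, limit + 1, d):
            divsum[multiple] += d
    return [n for n in range(1, limit + 1) if divsum[n] > n]

limit = 28123
anums = abundant_numbers(limit)

sums = set()
for i, a in enumerate(anums):
    if 2 * a > limit:
        break  # anums is sorted, so every later pair is too big
    for b in anums[i:]:  # start at i, not i + 1: a number may be a + a
        if a + b > limit:
            break
        sums.add(a + b)

answer = sum(n for n in range(1, limit + 1) if n not in sums)
print(answer)
```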
The issue is that anums is big - about 28000 elements long - so the materialized list of pair sums needs about 28000*28000*8 bytes = 6GB. If you used numpy you could cast anums as a numpy.int16 array, in which case the result pairs would be 1.5GB - more manageable:
import numpy as np
# cast (beware: pair sums can reach ~56000, which overflows int16;
# int32 is safer at twice the memory)
anums = np.array([anums], dtype=np.int16)
# compute the sum of all the pairs via an outer product
pairs = (anums + anums.T).ravel()

Determining index of maximum value in a list - Optimalization

I have written few lines of code to solve this problem, but profiler says, that it is very time-consuming. (using kernprof line-by-line profiler)
Here is the code:
comp = [1, 2, 3]  # comp is a list with always 3 elements; values 1, 2, 3 are just for illustration
m = max(comp)
max_where = [i for i, j in enumerate(comp) if j == m]
if 0 in max_where:
    some action1
if 1 in max_where:
    some action2
if 2 in max_where:
    some action3
Profiler says that most time is consumed in max_where calculation. I have also tried to split this calculation into if-tree to avoid some unnecessary operations, but results were not satisfactory.
Please, am I doing it wrong or is it just python?
If it's always three elements, why not simply do:
comp = [1, 2, 3]
m = max(comp)
if comp[0] == m:
    some action
if comp[1] == m:
    some action
if comp[2] == m:
    some action
If you're doing this many times, and if you have all the lists available at the same time, then you could make use of numpy.argmax to get the indices for all the lists.
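For instance (a sketch assuming numpy is available; rows here stands in for several of the 3-element lists stacked together):

```python
import numpy as np

rows = np.array([
    [1, 2, 3],
    [9, 4, 1],
    [5, 7, 7],
])
winners = np.argmax(rows, axis=1)  # index of the max in each row
print(winners)  # note: on ties, argmax reports only the first (leftmost) maximum
```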
You say that this is a time-consuming operation, but I sincerely doubt this actually affects your program. Have you actually found that this is causing some problem due to slow execution in your code? If not, there is no point optimizing.
This said, there is a small optimization I can think of - which is to use a set rather than a list comprehension for max_where. This will make your three membership tests faster.
max_where = {i for i, j in enumerate(comp) if j == m}
That said, with only three items/checks, the construction of the set may well take more time than it saves.
In general, with a list of three items, this operation is going to take negligible amounts of time. On my system, it takes half a microsecond to perform this operation.
In short: Don't bother. Unless this is a proven bottleneck in your program that needs to be sped up, your current code is fine.
Expanding upon Tobias' answer, using a for loop:
comp = [1, 2, 3]
m = max(comp)
for index in range(len(comp)):
    if comp[index] == m:
        # some action
Since indexing starts at 0, you do not need to do len(comp) + 1.
I prefer using the index in a for loop instead of the actual element, because you sometimes need the index of a specific element partway through a process, and calling l.index(obj) then wastes time (even if only insignificant amounts; for longer processes this adds up).
This also assumes that the action for each comp[index] is essentially the same process with different variables; it wouldn't work if you have significantly different processes for each index.
However, by using for index in range(len(l)):, you already have the index, and the item can easily be accessed with l[index].
Oddly, it seems that Tobias' implementation is faster (I thought otherwise):
comp = [1, 2, 3]
m = max(comp)

from timeit import timeit

def test1():
    if comp[0] == m: return m
    if comp[1] == m: return m
    if comp[2] == m: return m

def test2():
    for index in range(len(comp)):
        if comp[index] == m: return m

print 'test1:', timeit(test1, number = 1000)
print 'test2:', timeit(test2, number = 1000)
Returns:
test1: 0.00121262329299
test2: 0.00469034990534
My implementation may be faster for longer lists (not sure, though). However, writing the code for that is tedious (for a long list using repeated if comp[n] == m).
How about this:
sample = [3,1,2]
dic = {0: func_a, 1: func_b, 2: func_c}
x = max(sample)
y = sample.index(x)
dic[y]()  # call the selected function
As mentioned and rightfully downvoted this does not work for multiple function calls.
However this does:
sample = [3,1,3]
dic = {0: func_a, 1: func_b, 2: func_c}
max_val = max(sample)
max_indices = [index for index, elem in enumerate(sample) if elem == max_val]
for key in max_indices:
    dic[key]()  # call every function whose index holds the maximum
This is quite similar to other solutions above, and I know some time has passed, but it wasn't right as it was. :)
Cheers!

Generate 4000 unique pseudo-random cartesian coordinates FASTER?

The range for x and y is from 0 to 99.
I am currently doing it like this:
excludeFromTrainingSet = []
while len(excludeFromTrainingSet) < 4000:
    tempX = random.randint(0, 99)
    tempY = random.randint(0, 99)
    if [tempX, tempY] not in excludeFromTrainingSet:
        excludeFromTrainingSet.append([tempX, tempY])
But it takes ages and I really need to speed this up.
Any ideas?
Vincent Savard has an answer that's almost twice as fast as the first solution offered here.
Here's my take on it. It requires tuples instead of lists for hashability:
def method2(size):
    ret = set()
    while len(ret) < size:
        ret.add((random.randint(0, 99), random.randint(0, 99)))
    return ret
Just make sure that the limit is sane, as other answerers have pointed out. For sane input, this is algorithmically better - O(n) as opposed to O(n^2) - because of the set instead of the list. Also, Python is much more efficient at loading locals than globals, so always put this stuff in a function.
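The locals-vs-globals point shows up directly in the bytecode (a sketch; the opcode names are CPython-specific):

```python
import dis

x = 0

def read_local():
    x = 0
    return x  # compiled to LOAD_FAST: an array index into the frame

def read_global():
    return x  # compiled to LOAD_GLOBAL: a dict lookup at runtime

dis.dis(read_local)
dis.dis(read_global)
```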
EDIT: Actually, I'm not sure that they're O(n) and O(n^2) respectively because of the probabilistic component but the estimations are correct if n is taken as the number of unique elements that they see. They'll both be slower as they approach the total number of available spaces. If you want an amount of points which approaches the total number available, then you might be better off using:
import random
import itertools
def method2(size, min_, max_):
    range_ = range(min_, max_)
    points = itertools.product(range_, range_)
    return random.sample(list(points), size)
This will be a memory hog but is sure to be faster as the density of points increases because it avoids looking at the same point more than once. Another option worth profiling (probably better than last one) would be
def method3(size, min_, max_):
    range_ = range(min_, max_)
    points = list(itertools.product(range_, range_))
    N = (max_ - min_)**2
    L = N - size
    i = 1
    while i <= L:
        del points[random.randint(0, N - i)]
        i += 1
    return points
My suggestion :
def method2(size):
    randints = range(0, 100)
    excludeFromTrainingSet = set()
    while len(excludeFromTrainingSet) < size:
        excludeFromTrainingSet.add((random.choice(randints), random.choice(randints)))
    return excludeFromTrainingSet
Instead of generating 2 random numbers every time, you first build the list of numbers from 0 to 99, then you choose 2 and add the pair to the set. As others pointed out, there are only 10,000 possibilities, so you can't loop until you get 40,000, but you get the point.
I'm sure someone is going to come in here with a usage of numpy, but how about using a set and tuple?
E.g.:
excludeFromTrainingSet = set()
while len(excludeFromTrainingSet) < 40000:
    temp = (random.randint(0, 99), random.randint(0, 99))
    if temp not in excludeFromTrainingSet:
        excludeFromTrainingSet.add(temp)
EDIT: Isn't this an infinite loop since there are only 100^2 = 10000 POSSIBLE results, and you're waiting until you get 40000?
Make a list of all possible (x,y) values (note xrange(100), since both coordinates run from 0 to 99 inclusive):
allpairs = list((x,y) for x in xrange(100) for y in xrange(100))
# or with Py2.6 or later:
from itertools import product
allpairs = list(product(xrange(100), xrange(100)))
# or even taking DRY to the extreme
allpairs = list(product(*[xrange(100)]*2))
Shuffle the list:
from random import shuffle
shuffle(allpairs)
Read off the first 'n' values:
n = 4000
trainingset = allpairs[:n]
This runs pretty snappily on my laptop.
You could make a lookup table of random values... make a random index into that lookup table, and then step through it with a static increment counter...
Generating 40 thousand numbers inevitably takes a while. But you are performing an O(n) linear search on excludeFromTrainingSet, which takes quite a while, especially later in the process. Use a set instead. You could also generate a number of coordinate sets ahead of time (e.g. overnight) and pickle them, so you don't have to generate new data for each test run (I don't know what you're doing, so this might or might not help). Using tuples, as someone noted, is not only the semantically correct choice, it might also help with performance (tuple creation is faster than list creation). Edit: Silly me - using tuples is actually required when using sets, since set members must be hashable and lists are not.
But in your case, your loop isn't terminating because 0..99 is 100 numbers and two-tuples of them have only 100^2 = 10000 unique combinations. Fix that, then apply the above.
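The pickling idea from the paragraph above can be sketched like this (the file name and the make_coords helper are made up for illustration):

```python
import os
import pickle
import random
import tempfile

def make_coords(n, hi=99):
    # set of unique (x, y) tuples; tuples are hashable, lists are not
    pts = set()
    while len(pts) < n:
        pts.add((random.randint(0, hi), random.randint(0, hi)))
    return pts

coords = make_coords(4000)

# generate once, reuse across test runs
path = os.path.join(tempfile.gettempdir(), 'coords.pkl')
with open(path, 'wb') as f:
    pickle.dump(coords, f)
with open(path, 'rb') as f:
    loaded = pickle.load(f)
```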
Taking Vince Savard's code:
>>> from random import choice
>>> def method2(size):
...     randints = range(0, 100)
...     excludeFromTrainingSet = set()
...     while True:
...         x = size - len(excludeFromTrainingSet)
...         if not x:
...             break
...         # set.update with a generator of tuples; set.add would
...         # store the generator object itself, not the points
...         excludeFromTrainingSet.update(
...             (choice(randints), choice(randints)) for _ in range(x))
...     return excludeFromTrainingSet
...
>>> s = method2(4000)
>>> len(s)
4000
This is not a great algorithm because it has to deal with collisions, but the tuple-generation makes it tolerable. This runs in about a second on my laptop.
## for py 3.0+
## generate 4000 points in 2D
##
import random

maxn = 10000
goodguys = 0
excluded = [0 for excl in range(0, maxn)]
for ntimes in range(0, maxn):
    alea = random.randint(0, maxn - 1)
    excluded[alea] += 1
    if excluded[alea] > 1:
        continue
    goodguys += 1
    if goodguys > 4000:
        break
    two_num = divmod(alea, 100)  ## unfold into the 2 coordinates
    print(two_num)
