Using Python's timeit module, I want to measure how long it takes to print the following. How can I do that?
import timeit
x = [x for x in range(10000)]
timeit.timeit("print x[9999]")
d=[{i:i} for i in x]
timeit.timeit("print d[9999]")
NameError: global name 'x' is not defined
NameError: global name 'd' is not defined
Per the docs:
To give the timeit module access to functions you define, you can pass a setup parameter which contains an import statement
In your case, that would be e.g.:
timeit.timeit('print d[9999]',
              setup='from __main__ import d')
Here is an example of how you can do it:
import timeit
x = [x for x in range(10000)]
d = [{i: i} for i in x]
for i in [x, d]:
    t = timeit.timeit(stmt="print(i[9999])", number=100, globals=globals())
    print(f"took: {t:.4f}")
Output:
took: 0.0776
took: 0.0788
Note that I added number=100, so each test runs 100 times. By default, the statement runs 1,000,000 times.
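Since print itself is likely to dominate such a measurement, it may be more informative to time only the lookup and to repeat the measurement a few times. A minimal sketch, assuming Python 3.5+ (for the globals argument); the helper name best_time and the repeat counts are illustrative choices, not part of the answer above:

```python
import timeit

x = list(range(10000))
d = [{i: i} for i in x]

def best_time(stmt, repeat=5, number=1000):
    """Return the fastest total time (seconds) over `repeat` runs of `number` loops each."""
    runs = timeit.repeat(stmt=stmt, repeat=repeat, number=number, globals=globals())
    return min(runs)

t_list = best_time("x[9999]")
t_dict = best_time("d[9999]")
print(f"x[9999]: {t_list:.6f} s per 1000 lookups")
print(f"d[9999]: {t_dict:.6f} s per 1000 lookups")
```

Taking the minimum of several runs is a common way to reduce noise from other processes on the machine.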
I was convinced that using a lambda function would save computation time, but it's not that clear-cut. Look at this example:
import numpy as np
import timeit
def f_with_lambda():
    a = np.array(range(5))
    b = np.array(range(5))
    A, B = np.meshgrid(a, b)
    rst = list(map(lambda x, y: x + y, A, B))
    return np.array(rst)

def f_with_for():
    a = range(5)
    b = np.array(range(5))
    rst = [b + x for x in a]
    return np.array(rst)

lambda_rst = f_with_lambda()
for_rst = f_with_for()

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
The result is simple:
- with the lambda function: 0.3514268280014221 s
- with the for loop: 0.10633227700236603 s
How can I write my lambda version so that it is competitive? I noticed that calling list() to get the results out of the map object costs time. Is there another way to proceed? The meshgrid call is certainly not the best choice either...
Every tip is welcome!
Considering the remark about the list:
import numpy as np
import timeit
def f_with_lambda():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(map(lambda x, y: x + y, A, B))

def f_with_for():
    return np.array([np.array(range(150)) + x for x in range(150)])

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
That changes a lot. This time (lambda vs for):
for 5:
0.30227499100146815 vs 0.2510572589999356 (quite similar)
for 150:
0.6687559890015109 vs 20.31807473200024 ( :) :) :) ) !! Great job! Thank you!
Memory allocation takes time (it may call into the OS, and that call may be delayed).
In the lambda version, you allocate a, b, the meshgrid outputs, rst (both the list and the array versions) and the returned array.
In the for version, you allocate b, rst and the returned array. a is a range object, which is lazy, so creating it costs almost nothing.
This is why your function using lambda is slower.
Also, don't wrap the result of NumPy array operations in a list just to cast it back to a NumPy array.
Just removing the list() call makes it faster (from 0.9 to 0.4).
def f_with_lambda():
    a = np.array(range(SIZE))
    b = np.array(range(SIZE))
    A, B = np.meshgrid(a, b)
    rst = map(lambda x, y: x + y, A, B)
    return np.array(rst)
See https://stackoverflow.com/a/46470401/9453926 for speed comparison.
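One caveat, assuming Python 3: map returns a lazy iterator, and np.array does not consume it. Instead it wraps the iterator object itself in a zero-dimensional object array, so a version without list() may be fast simply because the additions are never performed. A small check (the names lazy and eager are illustrative):

```python
import numpy as np

A, B = np.meshgrid(range(5), range(5))

# The map object is stored as a single Python object; its rows are never added.
lazy = np.array(map(lambda x, y: x + y, A, B))
# Materializing the iterator first forces the row-wise additions to happen.
eager = np.array(list(map(lambda x, y: x + y, A, B)))

print(lazy.shape, lazy.dtype)   # shape () with dtype object
print(eager.shape)              # shape (5, 5)
```

It is therefore worth comparing shapes and contents of the results before comparing timings.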
I compacted the code:
import numpy as np
import timeit
def f_with_lambda():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(list(map(lambda x, y: x + y, A, B)))

def f_with_for():
    return np.array([np.array(range(150)) + x for x in range(150)])

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
This time, for a 5x5, the result is
Lambda vs for
0.38113487999726203 vs 0.24913009200099623
and with 150 it's better:
2.680842614001449 vs 20.176408246999927
But I found no way to integrate the meshgrid call into the lambda function, and the list conversion before building the array is unfortunate as well.
I took time to integrate the last remark from politinsa:
import numpy as np
import timeit
def f_with_lambda():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(list(map(lambda x, y: x + y, A, B)))

def f_with_for():
    return np.array([np.array(range(150)) + x for x in range(150)])

def f_with_lambda_nolist():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(map(lambda x, y: x + y, A, B))

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
    print(timeit.timeit("f_with_lambda_nolist()", setup="from __main__ import f_with_lambda_nolist", number=10000))
results are:
2.4421722999977646 s
18.75847979998798 s
0.6800016999914078 s -> list conversion has (as explained) a real impact on memory allocation
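If the goal is the table of sums itself rather than a comparison of lambda and for, NumPy broadcasting avoids the Python-level loop, meshgrid and the list conversion altogether. A sketch, assuming the intended result is rst[i, j] == i + j as in the functions above; f_with_broadcast is a hypothetical name, and the loop count is kept small only so the example runs quickly:

```python
import numpy as np
import timeit

def f_with_for(n=150):
    return np.array([np.arange(n) + x for x in range(n)])

def f_with_broadcast(n=150):
    a = np.arange(n)
    # A column vector plus a row vector broadcasts to an (n, n) array of i + j.
    return a[:, None] + a

# Check that both variants produce the same array before timing them.
assert np.array_equal(f_with_for(), f_with_broadcast())

print(timeit.timeit(f_with_for, number=100))
print(timeit.timeit(f_with_broadcast, number=100))
```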
Assume that you have a list with an arbitrary number of items, and you wish to count the items that match a specific condition. I thought of two ways to do this in a sensible manner, but I am not sure which one is best (more pythonic), or whether there is perhaps a better option (without sacrificing too much readability).
import numpy.random as nprnd
import timeit
my = nprnd.randint(1000, size=1000000)
def with_len(my_list):
    much = len([t for t in my_list if t >= 500])

def with_sum(my_list):
    many = sum(1 for t in my_list if t >= 500)
t1 = timeit.Timer('with_len(my)', 'from __main__ import with_len, my')
t2 = timeit.Timer('with_sum(my)', 'from __main__ import with_sum, my')
print("with len:",t1.timeit(1000)/1000)
print("with sum:",t2.timeit(1000)/1000)
Performance is almost identical between these two cases. However, which of these is more pythonic? Or is there a better alternative?
For those who are curious, I tested the proposed solutions (from comments and answers) and these are the results:
import numpy as np
import timeit
import functools
my = np.random.randint(1000, size=100000)
def with_len(my_list):
    return len([t for t in my_list if t >= 500])

def with_sum(my_list):
    return sum(1 for t in my_list if t >= 500)

def with_sum_alt(my_list):
    return sum(t >= 500 for t in my_list)

def with_lambda(my_list):
    return functools.reduce(lambda a, b: a + (1 if b >= 500 else 0), my_list, 0)

def with_np(my_list):
    return len(np.where(my_list >= 500)[0])
t1 = timeit.Timer('with_len(my)', 'from __main__ import with_len, my')
t2 = timeit.Timer('with_sum(my)', 'from __main__ import with_sum, my')
t3 = timeit.Timer('with_sum_alt(my)', 'from __main__ import with_sum_alt, my')
t4 = timeit.Timer('with_lambda(my)', 'from __main__ import with_lambda, my')
t5 = timeit.Timer('with_np(my)', 'from __main__ import with_np, my')
print("with len:", t1.timeit(1000)/1000)
print("with sum:", t2.timeit(1000)/1000)
print("with sum_alt:", t3.timeit(1000)/1000)
print("with lambda:", t4.timeit(1000)/1000)
print("with np:", t5.timeit(1000)/1000)
Python 2.7
('with len:', 0.02201753337348283)
('with sum:', 0.022727363518455238)
('with sum_alt:', 0.2370256687439941) # <-- very slow!
('with lambda:', 0.026367264818657078)
('with np:', 0.0005811764306089913) # <-- very fast!
Python 3.6
with len: 0.017649643657480736
with sum: 0.0182978007766851
with sum_alt: 0.19659815740239048
with lambda: 0.02691670741400111
with np: 0.000534095418615152
The 2nd one, with_sum is more pythonic in the sense that it uses much less memory as it doesn't build the whole list because the generator expression is fed to sum().
I'm with @Chris_Rands. But as far as performance is concerned, there is a faster way using numpy:
import numpy as np
def with_np(my_list):
    return len(np.where(my_list >= 500)[0])
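A possible variant of the NumPy approach, sketched here as an assumption rather than something benchmarked in this thread: np.count_nonzero counts the True entries of the boolean mask directly, without building the index array that np.where returns. The name with_count_nonzero is illustrative.

```python
import numpy as np

def with_np(my_array):
    return len(np.where(my_array >= 500)[0])

def with_count_nonzero(my_array):
    # The comparison yields a boolean array; count_nonzero counts its True entries.
    return int(np.count_nonzero(my_array >= 500))

my = np.random.randint(1000, size=100000)
# All three ways of counting should agree on the same data.
assert with_count_nonzero(my) == with_np(my) == sum(1 for t in my if t >= 500)
```

Both functions expect a NumPy array; on a plain Python list the comparison my_list >= 500 would not be element-wise.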
Say I need to collect millions of strings in an iterable that I can later randomly index by position.
I need to populate the iterable one item at a time, sequentially, for millions of entries.
Given the above, which method could in principle be more efficient:
Populating a list:
while <condition>:
    if <condition>:
        my_list[count] = value
        count += 1
Populating a dictionary:
while <condition>:
    if <condition>:
        my_dict[count] = value
        count += 1
(The above is pseudocode; everything would be initialized before running the snippets.)
I am specifically interested in the CPython implementation for Python 3.4.
Lists are definitely faster, if you use them in the right way.
In [19]: %%timeit l = []
....: for i in range(1000000): l.append(str(i))
....:
1 loops, best of 3: 182 ms per loop
In [20]: %%timeit d = {}
....: for i in range(1000000): d[i] = str(i)
....:
1 loops, best of 3: 207 ms per loop
In [21]: %timeit [str(i) for i in range(1000000)]
10 loops, best of 3: 158 ms per loop
Pushing the Python loop down to the C level with a comprehension buys you quite a bit of time. It also makes more sense to prefer a list for keys that are a prefix of the integers. Pre-allocating saves even more time:
>>> %%timeit
... l = [None] * 1000000
... for i in xrange(1000000): l[i] = str(i)
...
10 loops, best of 3: 147 ms per loop
For completeness, a dict comprehension does not speed things up:
In [22]: %timeit {i: str(i) for i in range(1000000)}
1 loops, best of 3: 213 ms per loop
With larger strings, I see very similar differences in performance (try str(i) * 10). This is CPython 2.7.6 on an x86-64.
I don't understand why you want to create an empty list or dict and then populate it. Why not create a new list or dictionary directly from the generation process?
results = list(a_generator)
# Or if you really want to use a dict for some reason:
results = dict(enumerate(a_generator))
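To make the suggestion concrete, here is a small sketch; generate_strings is a hypothetical stand-in for whatever produces the values, and the even-number filter only plays the role of the question's condition:

```python
def generate_strings(n):
    """Hypothetical producer: yields the values one at a time."""
    for i in range(n):
        if i % 2 == 0:  # stand-in for the question's <condition>
            yield str(i)

results = list(generate_strings(10))             # indexable by position
as_dict = dict(enumerate(generate_strings(10)))  # same data keyed by position

print(results[3])  # '6'
print(as_dict[3])  # '6'
```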
You can get even better times by using the map function:
>>> def test1():
...     l = []
...     for i in range(10 ** 6):
...         l.append(str(i))
>>> def test2():
...     d = {}
...     for i in range(10 ** 6):
...         d[i] = str(i)
>>> def test3():
...     [str(i) for i in range(10 ** 6)]
>>> def test4():
...     {i: str(i) for i in range(10 ** 6)}
>>> def test5():
...     list(map(str, range(10 ** 6)))
>>> def test6():
...     r = range(10 ** 6)
...     dict(zip(r, map(str, r)))
>>> timeit.Timer('test1()', 'from __main__ import test1').timeit(100)
30.628035710889932
>>> timeit.Timer('test2()', 'from __main__ import test2').timeit(100)
31.093550469839613
>>> timeit.Timer('test3()', 'from __main__ import test3').timeit(100)
25.778271498509355
>>> timeit.Timer('test4()', 'from __main__ import test4').timeit(100)
30.10892986559668
>>> timeit.Timer('test5()', 'from __main__ import test5').timeit(100)
20.633583353028826
>>> timeit.Timer('test6()', 'from __main__ import test6').timeit(100)
28.660790917067914
I am investigating how fast a factorial can be computed, but so far I have only tried two approaches:
import timeit
def fact(N):
    B = N
    while N > 1:
        B = B * (N-1)
        N = N-1
    return B

def fact1(N):
    B = 1
    for i in range(1, N+1):
        B = B * i
    return B
print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
Here is the output,
0.540276050568 120
0.654400110245 120
From the above code I observed that the while loop takes less time than the for loop.
My question is: what is the best way to compute the factorial in Python?
If you're looking for the best, why not use the one provided in the math module?
>>> import math
>>> math.factorial
<built-in function factorial>
>>> math.factorial(10)
3628800
And a comparison of timings on my machine:
>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
0.840167045593 120
>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
1.04350399971 120
>>> print timeit.timeit('factorial(5)', setup="from math import factorial")
0.149857997894
We see that the builtin is significantly better than either of the pure python variants you proposed.
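Before comparing speed it is worth checking that the variants agree. Note that fact as posted returns 0 for N = 0, because B starts at N. A Python 3 sketch with a corrected loop, checked against math.factorial; the names fact_while and fact_for and the loop count are illustrative:

```python
import math
import timeit

def fact_while(n):
    result = 1  # starting at 1 makes 0! and 1! come out as 1
    while n > 1:
        result *= n
        n -= 1
    return result

def fact_for(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Correctness first: all three must agree, including the n = 0 edge case.
for n in range(20):
    assert fact_while(n) == fact_for(n) == math.factorial(n)

for func in (fact_while, fact_for, math.factorial):
    t = timeit.timeit(lambda: func(5), number=100000)
    print(f"{func.__name__}: {t:.4f} s")
```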
TLDR; microbenchmarks aren't very useful
For CPython, try this:
>>> from math import factorial
>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
1.38128209114 120
>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
1.46199703217 120
>>> print timeit.timeit('factorial(5)', setup="from math import factorial"), factorial(5)
0.397044181824 120
But under PyPy, the while version is faster than the one from math:
>>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
0.170556783676 120
>>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
0.319650173187 120
>>>> print timeit.timeit('factorial(5)', setup="from math import factorial"), factorial(5)
0.210616111755 120
So it depends on the implementation. Now try bigger numbers
>>>> print timeit.timeit('fact(50)', setup="from __main__ import fact"), fact(50)
7.71517109871 30414093201713378043612608166064768844377641568960512000000000000
>>>> print timeit.timeit('fact1(50)', setup="from __main__ import fact1"), fact1(50)
6.58060312271 30414093201713378043612608166064768844377641568960512000000000000
>>>> print timeit.timeit('factorial(50)', setup="from math import factorial"), factorial(50)
6.53072690964 30414093201713378043612608166064768844377641568960512000000000000
The while version is in last place, and the version using for is about the same as the one from the math module.
Otherwise, if you're looking for a Python implementation (this is my favourite):
from functools import reduce  # a builtin on Python 2; this import is required on Python 3
from operator import mul
def factorial(n):
    return reduce(mul, range(1, n + 1), 1)
Usage:
>>> factorial(0)
1
>>> factorial(1)
1
>>> factorial(2)
2
>>> factorial(3)
6
>>> factorial(4)
24
>>> factorial(5)
120
>>> factorial(10)
3628800
Performance: (On my desktop:)
$ python -m timeit -c -s "fact = lambda n: reduce(lambda a, x: a * x, range(1, (n + 1)), 1)" "fact(10)"
1000000 loops, best of 3: 1.98 usec per loop
I have tried with reduce(lambda x, y: x*y, range(1, 5))
>>>timeit("import math; math.factorial(4)")
1.0205099133840179
>>>timeit("reduce(lambda x, y: x*y, range(1, 5))")
1.4047879075160665
>>>timeit("from operator import mul;reduce(mul, range(1, 5))")
2.530837320051319
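In the measurements above, the import statements are part of the timed code, so they are re-executed on every loop; that overhead is a plausible reason the operator.mul variant comes out slowest. A sketch, assuming Python 3 (where reduce lives in functools), that moves the imports into setup so that only the computation is timed; the dictionary names and loop count are illustrative:

```python
import timeit

# Imports run once in setup instead of once per timed loop.
setup = "from functools import reduce; from operator import mul; import math"

statements = {
    "math.factorial": "math.factorial(4)",
    "reduce + lambda": "reduce(lambda x, y: x * y, range(1, 5))",
    "reduce + mul": "reduce(mul, range(1, 5))",
}

timings = {name: timeit.timeit(stmt, setup=setup, number=100000)
           for name, stmt in statements.items()}

for name, t in timings.items():
    print(f"{name}: {t:.4f} s")
```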
In coding competitions we encounter inputs like:
2 3
4 5
So we do the following:
m, n = [int(x) for x in raw_input().split(' ')]
Is there a faster way of doing the same thing?
For all practical purposes, that's about as fast as you can get. On some machines, you may see a speedup on the order of a couple of percent if you go with map instead of a list comprehension, but that's not guaranteed.
Here's some quick timings on my machine:
from itertools import imap
#map
>>> timeit.timeit('x,y = map(int,line.split(" "))','from __main__ import line')
4.7857139110565186
>>> timeit.timeit('x,y = map(int,line.split())','from __main__ import line')
4.5680718421936035
#list comprehension
>>> timeit.timeit('x,y = [int(x) for x in line.split(" ")]','from __main__ import line')
4.3816750049591064
>>> timeit.timeit('x,y = [int(x) for x in line.split()]','from __main__ import line')
4.3246541023254395
#itertools.imap
>>> timeit.timeit('x,y = imap(int,line.split(" "))','from __main__ import line,imap')
4.431504011154175
>>> timeit.timeit('x,y = imap(int,line.split())','from __main__ import line,imap')
4.3257410526275635
#generator expression
>>> timeit.timeit('x,y = (int(x) for x in line.split(" "))','from __main__ import line')
4.897794961929321
>>> timeit.timeit('x,y = (int(x) for x in line.split())','from __main__ import line')
4.732620000839233
Surprisingly, split() seems to perform better than split(" ").
If you're guaranteed to have ascii input of numbers between 0 and 9, you can do a good bit better using ord:
>>>timeit.timeit('x,y = [ord(x)-48 for x in line.split(" ")]','from __main__ import line')
1.377655029296875
>>> timeit.timeit('x,y = [ord(x)-48 for x in line.split()]','from __main__ import line')
1.3243558406829834
But that imposes a severe restriction on your inputs.
One other idea you could try (I have no idea what the performance implications would be) is to read your lines from sys.stdin:
import sys
for line in sys.stdin:
    x, y = [ord(x) - 48 for x in line.split()]
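A sketch of that idea in Python 3 terms, assuming whitespace-separated integers with two per line; read_pairs is a hypothetical helper and nothing here has been benchmarked. Reading the whole stream once and splitting on any whitespace may reduce per-line overhead, and using int also lifts the single-digit restriction of the ord trick:

```python
import io
import sys

def read_pairs(stream=sys.stdin):
    """Parse all whitespace-separated integers from `stream` into (m, n) pairs."""
    numbers = list(map(int, stream.read().split()))
    # Pair up consecutive values: (numbers[0], numbers[1]), (numbers[2], numbers[3]), ...
    return list(zip(numbers[0::2], numbers[1::2]))

# Demonstration with an in-memory stream instead of real standard input.
print(read_pairs(io.StringIO("2 3\n4 5\n")))  # [(2, 3), (4, 5)]
```

In a real submission the call would simply be read_pairs(), which reads from standard input.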
Use map(), it's faster than list comprehensions when used with built-in functions:
m, n = map(int, raw_input().split())