In coding competitions we encounter inputs like:
2 3
4 5
So we do the following:
m, n = [int(x) for x in raw_input().split(' ')]
Is there a faster way of doing the same thing?
For all practical purposes, that's about as fast as you can get. On some machines you may see a speedup on the order of a couple percent by using map instead of a list comprehension, but that's not guaranteed.
Here's some quick timings on my machine:
from itertools import imap
#map
>>> timeit.timeit('x,y = map(int,line.split(" "))','from __main__ import line')
4.7857139110565186
>>> timeit.timeit('x,y = map(int,line.split())','from __main__ import line')
4.5680718421936035
#list comprehension
>>> timeit.timeit('x,y = [int(x) for x in line.split(" ")]','from __main__ import line')
4.3816750049591064
>>> timeit.timeit('x,y = [int(x) for x in line.split()]','from __main__ import line')
4.3246541023254395
#itertools.imap
>>> timeit.timeit('x,y = imap(int,line.split(" "))','from __main__ import line,imap')
4.431504011154175
>>> timeit.timeit('x,y = imap(int,line.split())','from __main__ import line,imap')
4.3257410526275635
#generator expression
>>> timeit.timeit('x,y = (int(x) for x in line.split(" "))','from __main__ import line')
4.897794961929321
>>> timeit.timeit('x,y = (int(x) for x in line.split())','from __main__ import line')
4.732620000839233
Surprisingly, split() seems to perform better than split(" ").
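That's likely because the no-argument form takes a specialized whitespace fast path in CPython. It's also the more robust choice, since it collapses runs of whitespace instead of producing empty tokens:
>>> "2  3".split(" ")   # the double space yields an empty token
['2', '', '3']
>>> "2  3".split()      # no-arg split collapses whitespace runs
['2', '3']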
If you're guaranteed to have ASCII input of single-digit numbers (0-9), you can do a good bit better using ord:
>>> timeit.timeit('x,y = [ord(x)-48 for x in line.split(" ")]','from __main__ import line')
1.377655029296875
>>> timeit.timeit('x,y = [ord(x)-48 for x in line.split()]','from __main__ import line')
1.3243558406829834
But that imposes a severe restriction on your inputs.
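Concretely: ord() only accepts a single character, so the trick fails the moment an input token has two digits:
>>> [ord(x) - 48 for x in '2 3'.split()]
[2, 3]
>>> [ord(x) - 48 for x in '12 3'.split()]
Traceback (most recent call last):
  ...
TypeError: ord() expected a character, but string of length 2 found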
One other idea you could try (I have no idea what the performance implications would be): read your lines from sys.stdin:
import sys

for line in sys.stdin:
    x, y = [ord(x) - 48 for x in line.split()]
Use map(); it's faster than list comprehensions when used with built-in functions:
m, n = map(int, raw_input().split())
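That answer is Python 2. A minimal Python 3 sketch of the same idea; for many lines of input, sys.stdin.readline is usually a bit faster than input() because it skips the prompt machinery:
import sys

m, n = map(int, input().split())               # Python 3 equivalent
m, n = map(int, sys.stdin.readline().split())  # often a bit faster for bulk input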
I was convinced that using a lambda function saves computation time, but it's not that clear. Look at this example:
import numpy as np
import timeit

def f_with_lambda():
    a = np.array(range(5))
    b = np.array(range(5))
    A, B = np.meshgrid(a, b)
    rst = list(map(lambda x, y: x + y, A, B))
    return np.array(rst)

def f_with_for():
    a = range(5)
    b = np.array(range(5))
    rst = [b + x for x in a]
    return np.array(rst)

lambda_rst = f_with_lambda()
for_rst = f_with_for()

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
The result is simple:
- with lambda: 0.3514268280014221 s
- with for loop: 0.10633227700236603 s
How do I write my lambda version to be competitive? I noticed that calling list() to pull the results out of the map object costs time. Is there another way to proceed? The meshgrid function is certainly not optimal either...
Every tip is welcome!
Considering the remark about the list:
import numpy as np
import timeit

def f_with_lambda():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(map(lambda x, y: x + y, A, B))

def f_with_for():
    return np.array([np.array(range(150)) + x for x in range(150)])

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
That changes a lot of things. This time (lambda vs for):
for size 5:
0.30227499100146815 vs 0.2510572589999356 (quite similar)
for size 150:
0.6687559890015109 vs 20.31807473200024 ( :) :) :) ) !! Great job, thank you!
Memory allocation takes time (it may require an OS call, which can be delayed).
In the lambda version, you allocated a, b, the meshgrid arrays A and B, and rst (both the list and array versions), plus the returned array.
In the for version, you allocated only b and rst, plus the returned array; a is a lazy range object, so creating it costs essentially nothing.
This is why your function using lambda is slower.
Also, don't wrap the result of numpy-array operations in list() just to cast it back to an np.array. Simply removing the list() makes it faster (from 0.9 s to 0.4 s):
def f_with_lambda():
    a = np.array(range(SIZE))  # SIZE is a placeholder for the problem size
    b = np.array(range(SIZE))
    A, B = np.meshgrid(a, b)
    rst = map(lambda x, y: x + y, A, B)  # no list() here
    return np.array(rst)
See https://stackoverflow.com/a/46470401/9453926 for speed comparison.
I compacted the code:
import numpy as np
import timeit

def f_with_lambda():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(list(map(lambda x, y: x + y, A, B)))

def f_with_for():
    return np.array([np.array(range(150)) + x for x in range(150)])

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
This time, for a 5x5 grid, the result is (lambda vs for):
0.38113487999726203 vs 0.24913009200099623
and with 150 it's better:
2.680842614001449 vs 20.176408246999927
But I found no way to fold the meshgrid call into the lambda itself, and the list conversion before np.array is unfortunate as well.
I took the time to integrate the last remark from politinsa:
import numpy as np
import timeit

def f_with_lambda():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(list(map(lambda x, y: x + y, A, B)))

def f_with_for():
    return np.array([np.array(range(150)) + x for x in range(150)])

def f_with_lambda_nolist():
    A, B = np.meshgrid(range(150), range(150))
    return np.array(map(lambda x, y: x + y, A, B))

if __name__ == '__main__':
    print(timeit.timeit("f_with_lambda()", setup="from __main__ import f_with_lambda", number=10000))
    print(timeit.timeit("f_with_for()", setup="from __main__ import f_with_for", number=10000))
    print(timeit.timeit("f_with_lambda_nolist()", setup="from __main__ import f_with_lambda_nolist", number=10000))
results are:
2.4421722999977646 s
18.75847979998798 s
0.6800016999914078 s -> the list conversion has (as explained) a real impact. (Caveat: on Python 3, np.array(map(...)) merely wraps the lazy map object in a 0-d object array without ever calling the lambda, so this last timing does not measure the actual computation.)
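For completeness, the usual fastest route here is to drop both map and the Python-level loop and let numpy broadcast directly. A minimal sketch (f_vectorized is an illustrative name, not from the thread):
import numpy as np

def f_vectorized(size=150):
    # v[:, None] + v[None, :] builds the same size-by-size sum table
    # as meshgrid + addition, with no Python-level loop at all
    v = np.arange(size)
    return v[:, None] + v[None, :]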
I want to take the logarithm multiple times. We know this:
import numpy as np
np.log(x)
Now the second logarithm would be
np.log(np.log(x))
What if one wants to take n logs? Surely it would not be pythonic to repeat the call n times as above.
As per @eugenhu's suggestion, one way is to use a generic function which applies f iteratively:
import numpy as np

def repeater(f, n):
    def fn(i):
        result = i
        for _ in range(n):
            result = f(result)
        return result
    return fn

repeater(np.log, 5)(x)
You could use the following little trick:
>>> from functools import reduce
>>>
>>> k = 4
>>> x = 1e12
>>>
>>> y = np.array(x)
>>> reduce(np.log, (k+1) * (y,))[()]
0.1820258315495139
and back:
>>> reduce(np.exp, (k+1) * (y,))[()]
999999999999.9813
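A quick illustration of why reduce works with a unary ufunc at all: at every step reduce calls np.log(acc, y), and a ufunc's second positional argument is its out parameter, so each step writes log(acc) back into y:
>>> y = np.array(2.0)
>>> np.log(y, y)  # computes log(y) and stores it into y
array(0.6931471805599453)
>>> y
array(0.6931471805599453)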
On my machine this is slightly faster than @jp_data_analysis' approach:
>>> def f_pp(ufunc, x, k):
... y = np.array(x)
... return reduce(ufunc, (k+1) * (y,))[()]
...
>>> x = 1e12
>>> k = 5
>>>
>>> from timeit import repeat
>>> kwds = dict(globals=globals(), number=100000)
>>>
>>> repeat('repeater(np.log, 5)(x)', **kwds)
[0.5353733809897676, 0.5327484680456109, 0.5363518510130234]
>>> repeat('f_pp(np.log, x, 5)', **kwds)
[0.4512511100037955, 0.4380568229826167, 0.45331112697022036]
To be fair, their approach is more flexible. Mine uses quite specific properties of unary ufuncs and numpy arrays.
Larger k is also possible. For that we need to make sure that x is complex, because np.log will not switch to complex automatically.
>>> x = 1e12+0j
>>> k = 50
>>>
>>> f_pp(np.log, x, 50)
(0.3181323483680859+1.3372351153002153j)
>>> f_pp(np.exp, _, 50)
(1000000007040.9696+6522.577629950761j)
# not that bad, all things considered ...
>>>
>>> repeat('f_pp(np.log, x, 50)', **kwds)
[4.272890724008903, 4.266964592039585, 4.270542044949252]
>>> repeat('repeater(np.log, 50)(x)', **kwds)
[5.799160094989929, 5.796761817007791, 5.80835147597827]
From this post, you can compose functions:
Code
import itertools as it
import functools as ft
import numpy as np

def compose(f, g):
    return lambda x: f(g(x))

identity = lambda x: x
Demo
ft.reduce(compose, it.repeat(np.log, times=2), identity)(10)
# 0.83403244524795594
ft.reduce(compose, it.repeat(np.log, times=3), identity)(10)
# -0.18148297420509205
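So an n-fold log is one reduce away. A small usage sketch with the helpers above (log5 is just an illustrative name):
log5 = ft.reduce(compose, it.repeat(np.log, times=5), identity)
log5(1e12)  # applies np.log five times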
Assume that you have a list with an arbitrary number of items, and you wish to get the number of items that match a specific condition. I thought of two sensible ways to do this, but I am not sure which one is best (more pythonic), or whether there is perhaps a better option (without sacrificing too much readability).
import numpy.random as nprnd
import timeit

my = nprnd.randint(1000, size=1000000)

def with_len(my_list):
    much = len([t for t in my_list if t >= 500])

def with_sum(my_list):
    many = sum(1 for t in my_list if t >= 500)

t1 = timeit.Timer('with_len(my)', 'from __main__ import with_len, my')
t2 = timeit.Timer('with_sum(my)', 'from __main__ import with_sum, my')
print("with len:", t1.timeit(1000)/1000)
print("with sum:", t2.timeit(1000)/1000)
Performance is almost identical between these two cases. However, which of these is more pythonic? Or is there a better alternative?
For those who are curious, I tested the proposed solutions (from comments and answers) and these are the results:
import numpy as np
import timeit
import functools

my = np.random.randint(1000, size=100000)

def with_len(my_list):
    return len([t for t in my_list if t >= 500])

def with_sum(my_list):
    return sum(1 for t in my_list if t >= 500)

def with_sum_alt(my_list):
    return sum(t >= 500 for t in my_list)

def with_lambda(my_list):
    return functools.reduce(lambda a, b: a + (1 if b >= 500 else 0), my_list, 0)

def with_np(my_list):
    return len(np.where(my_list >= 500)[0])

t1 = timeit.Timer('with_len(my)', 'from __main__ import with_len, my')
t2 = timeit.Timer('with_sum(my)', 'from __main__ import with_sum, my')
t3 = timeit.Timer('with_sum_alt(my)', 'from __main__ import with_sum_alt, my')
t4 = timeit.Timer('with_lambda(my)', 'from __main__ import with_lambda, my')
t5 = timeit.Timer('with_np(my)', 'from __main__ import with_np, my')

print("with len:", t1.timeit(1000)/1000)
print("with sum:", t2.timeit(1000)/1000)
print("with sum_alt:", t3.timeit(1000)/1000)
print("with lambda:", t4.timeit(1000)/1000)
print("with np:", t5.timeit(1000)/1000)
Python 2.7
('with len:', 0.02201753337348283)
('with sum:', 0.022727363518455238)
('with sum_alt:', 0.2370256687439941) # <-- very slow!
('with lambda:', 0.026367264818657078)
('with np:', 0.0005811764306089913) # <-- very fast!
Python 3.6
with len: 0.017649643657480736
with sum: 0.0182978007766851
with sum_alt: 0.19659815740239048
with lambda: 0.02691670741400111
with np: 0.000534095418615152
The second one, with_sum, is more pythonic in the sense that it uses much less memory: it never builds the whole list, because the generator expression is fed straight to sum().
I'm with @Chris_Rands. But as far as performance is concerned, there is a faster way using numpy:
import numpy as np

def with_np(my_list):
    return len(np.where(my_list >= 500)[0])
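If you're in numpy anyway, there are even more direct spellings of the same count. A sketch (the function names are mine, and I haven't benchmarked them against with_np here):
import numpy as np

def with_count_nonzero(my_list):
    return np.count_nonzero(my_list >= 500)

def with_bool_sum(my_list):
    # a boolean array sums as 0s and 1s
    return (my_list >= 500).sum()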
Using the Python timeit module, I want to check how much time it takes to print the following. How do I do so?
import timeit
x = [x for x in range(10000)]
timeit.timeit("print x[9999]")
d=[{i:i} for i in x]
timeit.timeit("print d[9999]")
NameError: global name 'x' is not defined
NameError: global name 'd' is not defined
Per the docs:
To give the timeit module access to functions you define, you can pass a setup parameter which contains an import statement
In your case, that would be e.g.:
timeit.timeit('print d[9999]',
setup='from __main__ import d')
Here is an example of how you can do it:
import timeit

x = [x for x in range(10000)]
d = [{i: i} for i in x]

for i in [x, d]:
    t = timeit.timeit(stmt="print(i[9999])", number=100, globals=globals())
    print(f"took: {t:.4f}")
Output:
took: 0.0776
took: 0.0788
Please notice I added number=100, so each test runs 100 times. By default it runs 1,000,000 times.
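If you want several independent measurements rather than one number, timeit.repeat takes the same arguments plus a repeat count, e.g.:
import timeit

# three runs of 100 executions each; the minimum is usually the stable figure
print(timeit.repeat("print(d[9999])", setup="from __main__ import d",
                    number=100, repeat=3))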
I am researching the speed of computing factorials, but I am using only these two approaches:
import timeit

def fact(N):
    B = N
    while N > 1:
        B = B * (N - 1)
        N = N - 1
    return B

def fact1(N):
    B = 1
    for i in range(1, N + 1):
        B = B * i
    return B

print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
Here is the output,
0.540276050568 120
0.654400110245 120
From the above code I have observed that while takes less time than for.
My question is: is this the best way to find the factorial in Python?
If you're looking for the best, why not use the one provided in the math module?
>>> import math
>>> math.factorial
<built-in function factorial>
>>> math.factorial(10)
3628800
And a comparison of timings on my machine:
>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
0.840167045593 120
>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
1.04350399971 120
>>> print timeit.timeit('factorial(5)', setup="from math import factorial")
0.149857997894
We see that the builtin is significantly better than either of the pure python variants you proposed.
TL;DR: microbenchmarks aren't very useful.
For Cpython, try this:
>>> from math import factorial
>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
1.38128209114 120
>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
1.46199703217 120
>>> print timeit.timeit('factorial(5)', setup="from math import factorial"), factorial(5)
0.397044181824 120
But under PyPy, the while version is faster than the one from math:
>>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
0.170556783676 120
>>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
0.319650173187 120
>>>> print timeit.timeit('factorial(5)', setup="from math import factorial"), factorial(5)
0.210616111755 120
So it depends on the implementation. Now try bigger numbers:
>>>> print timeit.timeit('fact(50)', setup="from __main__ import fact"), fact(50)
7.71517109871 30414093201713378043612608166064768844377641568960512000000000000
>>>> print timeit.timeit('fact1(50)', setup="from __main__ import fact1"), fact1(50)
6.58060312271 30414093201713378043612608166064768844377641568960512000000000000
>>>> print timeit.timeit('factorial(50)', setup="from math import factorial"), factorial(50)
6.53072690964 30414093201713378043612608166064768844377641568960512000000000000
Now while is in last place, and the version using for is about the same as the one from the math module.
Otherwise, if you're looking for a Python implementation (this is my favourite):
from operator import mul

def factorial(n):
    return reduce(mul, range(1, n + 1), 1)
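Note that reduce is a builtin only on Python 2; on Python 3 the same implementation needs one extra import:
from functools import reduce
from operator import mul

def factorial(n):
    return reduce(mul, range(1, n + 1), 1)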
Usage:
>>> factorial(0)
1
>>> factorial(1)
1
>>> factorial(2)
2
>>> factorial(3)
6
>>> factorial(4)
24
>>> factorial(5)
120
>>> factorial(10)
3628800
Performance (on my desktop):
$ python -m timeit -c -s "fact = lambda n: reduce(lambda a, x: a * x, range(1, (n + 1)), 1)" "fact(10)"
1000000 loops, best of 3: 1.98 usec per loop
I have tried with reduce(lambda x, y: x*y, range(1, 5)):
>>> timeit("import math; math.factorial(4)")
1.0205099133840179
>>> timeit("reduce(lambda x, y: x*y, range(1, 5))")
1.4047879075160665
>>> timeit("from operator import mul; reduce(mul, range(1, 5))")
2.530837320051319
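One caveat about those numbers: the import statements sit inside the timed code, so every iteration pays the import-lookup cost. A fairer sketch puts them in setup (absolute numbers will of course differ per machine):
from timeit import timeit

print(timeit("math.factorial(4)", setup="import math"))
print(timeit("reduce(mul, range(1, 5))",
             setup="from functools import reduce; from operator import mul"))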