What is the most efficient way to concatenate two numbers into one number in Python?
The numbers are always between 0 and 255. I have tested a few approaches that concatenate them as strings and cast the result back to int, but they are too costly time-wise for my code.
Example:
a = 152
c = 255
d = concat(a,c)
Expected result:
d = 152255
If the numbers are bounded, just multiply and add:
>>> a = 152
>>> c = 255
>>> d = a*1000+c
>>> d
152255
>>>
This is pretty fast:
from math import log

def concat(a, b):
    return 10**int(log(b, 10) + 1) * a + b
It uses the logarithm to find how many digits the second number has, i.e. how many times the first number must be multiplied by 10 for the addition to behave like a concatenation. (Note that log(b, 10) is undefined for b == 0, so that case would need separate handling.)
In [1]: from math import log
In [2]: a = 152
In [3]: b = 255
In [4]: def concat(a, b):
...: return 10**int(log(b, 10)+1)*a+b
...:
In [5]: concat(a, b)
Out[5]: 152255
In [6]: %timeit concat(a, b)
1000000 loops, best of 3: 1.18 us per loop
Yeah, there you go:
a = 152
b = 255

def concat(a, b):
    # smallest power of 10 greater than b, i.e. shift a left by the number of digits in b
    n = next(x for x in range(10) if 10**x > b)  # handles b of up to 9 digits
    return a * 10**n + b

print(concat(a, b))  # -> 152255
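For reference, a minimal sketch of the string-based approach mentioned in the question (fine as a correctness baseline, but it is exactly the string round-trip that the arithmetic versions above avoid):

def concat_str(a, b):
    # build a string and parse it back; simple, but slower than multiply-and-add
    return int(str(a) + str(b))

assert concat_str(152, 255) == 152255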
Hi, I'm trying to write a function that subtracts the second number from the first, then adds the third, then subtracts the fourth, i.e. x1-x2+x3-x4+x5-x6...
So far I can only combine two variables, x and y. I was thinking of doing
>>> reduce(lambda x, y: (x - y) + x, [2, 5, 8, 10])
but I'm still not getting it; it's pretty simple stuff, I'm just confused...
In this particular case it would be easier to use sums of slices:
>>> a = [2, 5, 8, 10]
>>> sum(a[::2]) - sum(a[1::2])
-5
Use a multiplier that you flip between +1 and -1 after each addition.
result = 0
mult = 1
for i in [2, 5, 8, 10]:
    result += i * mult
    mult *= -1
You can keep track of the position (and thus whether to add or subtract) with enumerate, using the fact that (-1)**(2n) is +1 and (-1)**(2n+1) is -1. Use this as a factor and sum all the terms.
>>> sum(e * (-1)**i for i, e in enumerate([2,5,8,10]))
-5
If you really want to use reduce, for some reason, you could do something like this:
class plusminus(object):
    def __init__(self):
        self._plus = True

    def __call__(self, a, b):
        self._plus ^= True
        if self._plus:
            return a + b
        return a - b

reduce(plusminus(), [2, 5, 8, 10])  # output: -5
Or just using sum and a generator:
In [18]: xs
Out[18]: [1, 2, 3, 4, 5]
In [19]: def plusminus(iterable):
....:     for i, x in enumerate(iterable):
....:         if i % 2 == 0:
....:             yield x
....:         else:
....:             yield -x
....:
In [20]: sum(plusminus(xs))
Out[20]: 3
Which could also be expressed (with the required imports) as:
import itertools
import operator

sum(map(lambda x: operator.mul(*x), zip(xs, itertools.cycle([1, -1]))))
Say I need to collect millions of strings in an iterable that I can later randomly index by position.
I need to populate the iterable one item at a time, sequentially, for millions of entries.
Given the above, which method could in principle be more efficient:
Populating a list:
while <condition>:
    if <condition>:
        my_list[count] = value
        count += 1
Populating a dictionary:
while <condition>:
    if <condition>:
        my_dict[count] = value
        count += 1
(the above is pseudocode; everything would be initialized before running the snippets).
I am specifically interested in the CPython implementation for Python 3.4.
Lists are definitely faster, if you use them in the right way.
In [19]: %%timeit l = []
....: for i in range(1000000): l.append(str(i))
....:
1 loops, best of 3: 182 ms per loop
In [20]: %%timeit d = {}
....: for i in range(1000000): d[i] = str(i)
....:
1 loops, best of 3: 207 ms per loop
In [21]: %timeit [str(i) for i in range(1000000)]
10 loops, best of 3: 158 ms per loop
Pushing the Python loop down to the C level with a comprehension buys you quite a bit of time. It also makes more sense to prefer a list when the keys are just the consecutive integers 0, 1, 2, .... Pre-allocating saves even more time:
>>> %%timeit
... l = [None] * 1000000
... for i in xrange(1000000): l[i] = str(i)
...
10 loops, best of 3: 147 ms per loop
For completeness, a dict comprehension does not speed things up:
In [22]: %timeit {i: str(i) for i in range(1000000)}
1 loops, best of 3: 213 ms per loop
With larger strings, I see very similar differences in performance (try str(i) * 10). This is CPython 2.7.6 on an x86-64.
I don't understand why you want to create an empty list or dict and then populate it. Why not create a new list or dictionary directly from the generation process?
results = list(a_generator)
# Or if you really want to use a dict for some reason:
results = dict(enumerate(a_generator))
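For instance, with a hypothetical generator standing in for whatever produces the strings (a_generator is just an illustrative name):

a_generator = (str(i) for i in range(1000000))  # stand-in for the real generation process
results = list(a_generator)
results[123456]  # random access by position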
You can get even better times by using the map function:
>>> import timeit
>>> def test1():
...     l = []
...     for i in range(10 ** 6):
...         l.append(str(i))
...
>>> def test2():
...     d = {}
...     for i in range(10 ** 6):
...         d[i] = str(i)
...
>>> def test3():
...     [str(i) for i in range(10 ** 6)]
...
>>> def test4():
...     {i: str(i) for i in range(10 ** 6)}
...
>>> def test5():
...     list(map(str, range(10 ** 6)))
...
>>> def test6():
...     r = range(10 ** 6)
...     dict(zip(r, map(str, r)))
...
>>> timeit.Timer('test1()', 'from __main__ import test1').timeit(100)
30.628035710889932
>>> timeit.Timer('test2()', 'from __main__ import test2').timeit(100)
31.093550469839613
>>> timeit.Timer('test3()', 'from __main__ import test3').timeit(100)
25.778271498509355
>>> timeit.Timer('test4()', 'from __main__ import test4').timeit(100)
30.10892986559668
>>> timeit.Timer('test5()', 'from __main__ import test5').timeit(100)
20.633583353028826
>>> timeit.Timer('test6()', 'from __main__ import test6').timeit(100)
28.660790917067914
I'm stuck on higher-order functions in Python. I need to write a function repeat that applies a function f n times to a given argument x.
For example, repeat(f, 3, x) is f(f(f(x))).
This is what I have:
def repeat(f, n, x):
    if n == 0:
        return f(x)
    else:
        return repeat(f, n-1, x)
When I run the following assertion:
plus = lambda x,y: repeat(lambda z:z+1,x,y)
assert plus(2,2) == 4
It gives me an AssertionError. I read 'How to repeat a function n times', but I need to do it this way and I can't figure it out...
You have two problems:
You are recursing the wrong number of times (if n == 1, the function should be called once); and
You aren't calling f on the returned value from the recursive call, so the function is only ever applied once.
Try:
def repeat(f, n, x):
    if n == 1:  # note 1, not 0
        return f(x)
    else:
        return f(repeat(f, n-1, x))  # call f with the returned value
or, alternatively:
def repeat(f, n, x):
    if n == 0:
        return x  # note x, not f(x)
    else:
        return f(repeat(f, n-1, x))  # call f with the returned value
(thanks to #Kevin for the latter, which supports n == 0).
Example:
>>> repeat(lambda z: z + 1, 2, 2)
4
>>> assert repeat(lambda z: z * 2, 4, 3) == 3 * 2 * 2 * 2 * 2
>>>
You've got a very simple error there: in the else block you are just passing x along without doing anything to it. Also, you are applying f when n == 0; don't do that.
def repeat(f, n, x):
    """
    >>> repeat(lambda x: x+1, 2, 0)
    2
    """
    return repeat(f, n-1, f(x)) if n > 0 else x
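With any of the corrected versions above, the assertion from the question now passes:

plus = lambda x, y: repeat(lambda z: z + 1, x, y)
assert plus(2, 2) == 4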
This is more a question of elegance and performance than of "how to do it at all", so I'll just show the code:
def iterate_adjacencies(gen, fill=0, size=2, do_fill_left=True,
                        do_fill_right=False):
    """ Iterates over a 'window' of `size` adjacent elements in the supplied
    `gen` generator, using `fill` to pad the left edge if `do_fill_left` is
    True (default), and to pad the right edge (i.e. the last item yielded is
    the last element followed by `size-1` `fill` elements) if `do_fill_right`
    is True. """
    fill_size = size - 1
    prev = [fill] * fill_size
    i = 1
    for item in gen:  # iterate over the supplied iterable
        if not do_fill_left and i < size:
            i += 1
        else:
            yield prev + [item]
        prev = prev[1:] + [item]
    if do_fill_right:
        for i in range(fill_size):
            yield prev + [fill]
            prev = prev[1:] + [fill]
and then ask: is there already a function for that? And, if not, can you do the same thing in a better (i.e. neater and/or faster) way?
Edit:
With ideas from the answers of #agf, #FogleBird and #senderle, a resulting somewhat-neat-looking piece of code is:
from itertools import chain, repeat, islice

def window(seq, size=2, fill=0, fill_left=True, fill_right=False):
    """ Returns a sliding window (of width n) over data from the iterable:
    s -> (s0, s1, ..., s[n-1]), (s1, s2, ..., sn), ...
    """
    ssize = size - 1
    it = chain(
        repeat(fill, ssize * fill_left),
        iter(seq),
        repeat(fill, ssize * fill_right))
    result = tuple(islice(it, size))
    if len(result) == size:  # `<=` if okay to return seq if len(seq) < size
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
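For example, with the defaults and size=3 (so fill-left only):

>>> list(window(range(5), size=3))
[(0, 0, 0), (0, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, 4)]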
This page shows how to implement a sliding window with itertools. http://docs.python.org/release/2.3.5/lib/itertools-example.html
from itertools import islice

def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    " s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ... "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
Example output:
>>> list(window(range(10)))
[(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
You'd need to change it to fill left and right if you need that.
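Alternatively, instead of modifying the recipe, you can pad the input iterable before handing it to window (this is essentially what the combined version in the question's edit does); a minimal sketch:

from itertools import chain, repeat

n, fill = 2, 0
padded = chain(repeat(fill, n - 1), range(10), repeat(fill, n - 1))  # pad both edges
windows = list(window(padded, n))  # first and last windows now contain the fill value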
This is my version that fills, keeping the signature the same. I have previously seen the itertools recipe, but did not look at it before writing this.
from itertools import chain
from collections import deque
def ia(gen, fill=0, size=2, fill_left=True, fill_right=False):
    gen, ssize = iter(gen), size - 1
    deq = deque(chain([fill] * ssize * fill_left,
                      (next(gen) for _ in xrange((not fill_left) * ssize))),
                maxlen=size)
    for item in chain(gen, [fill] * ssize * fill_right):
        deq.append(item)
        yield deq
Edit: I also didn't see your comments on your question before posting this.
Edit 2: Fixed. I had tried to do it with one chain but this design needs two.
Edit 3: As #senderle noted, only use this as a generator; don't wrap it with list or accumulate the output, as it yields the same mutable item repeatedly.
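If you do need to accumulate the results, copy each window as it is produced, e.g.:

snapshots = [tuple(w) for w in ia(xrange(6), fill=0, size=3)]  # each window copied into an independent tuple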
Ok, after coming to my senses, here's a non-ridiculous version of window_iter_fill. My previous version (visible in edits) was terrible because I forgot to use izip. Not sure what I was thinking. Using izip, this works, and, in fact, is the fastest option for small inputs!
from itertools import chain, islice, izip, repeat, tee
from collections import deque  # used by window_iter_deque below

def window_iter_fill(gen, size=2, fill=None):
    gens = (chain(repeat(fill, size - i - 1), gen, repeat(fill, i))
            for i, gen in enumerate(tee(gen, size)))
    return izip(*gens)
This one is also fine for tuple-yielding, but not quite as fast.
def window_iter_deque(it, size=2, fill=None, fill_left=False, fill_right=False):
    lfill = repeat(fill, size - 1 if fill_left else 0)
    rfill = repeat(fill, size - 1 if fill_right else 0)
    it = chain(lfill, it, rfill)
    d = deque(islice(it, 0, size - 1), maxlen=size)
    for item in it:
        d.append(item)
        yield tuple(d)
HoverHell's newest solution is still the best tuple-yielding solution for large inputs.
Some timings:
Arguments: [xrange(1000), 5, 'x', True, True]
==============================================================================
window HoverHell's frankeniter : 0.2670ms [1.91x]
window_itertools from old itertools docs : 0.2811ms [2.02x]
window_iter_fill extended `pairwise` with izip : 0.1394ms [1.00x]
window_iter_deque deque-based, copying : 0.4910ms [3.52x]
ia_with_copy deque-based, copying v2 : 0.4892ms [3.51x]
ia deque-based, no copy : 0.2224ms [1.60x]
==============================================================================
Scaling behavior:
Arguments: [xrange(10000), 50, 'x', True, True]
==============================================================================
window HoverHell's frankeniter : 9.4897ms [4.61x]
window_itertools from old itertools docs : 9.4406ms [4.59x]
window_iter_fill extended `pairwise` with izip : 11.5223ms [5.60x]
window_iter_deque deque-based, copying : 12.7657ms [6.21x]
ia_with_copy deque-based, copying v2 : 13.0213ms [6.33x]
ia deque-based, no copy : 2.0566ms [1.00x]
==============================================================================
The deque-yielding solution by agf is super fast for large inputs -- seemingly O(n) instead of O(n*m) like the others, where n is the length of the iterable and m is the size of the window -- because it doesn't have to iterate over the contents of every window to build a tuple. But I still think it makes more sense to yield a tuple in the general case, because the calling function is probably just going to iterate over the deque anyway; it's just a shift of the computational burden. The asymptotic behavior of the larger program should remain the same.
Still, in some special cases, the deque-yielding version will probably be faster.
Some more timings based on HoverHell's test structure.
>>> import testmodule
>>> kwa = dict(gen=xrange(1000), size=4, fill=-1, fill_left=True, fill_right=True)
>>> %timeit -n 1000 [a + b + c + d for a, b, c, d in testmodule.window(**kwa)]
1000 loops, best of 3: 462 us per loop
>>> %timeit -n 1000 [a + b + c + d for a, b, c, d in testmodule.ia(**kwa)]
1000 loops, best of 3: 463 us per loop
>>> %timeit -n 1000 [a + b + c + d for a, b, c, d in testmodule.window_iter_fill(**kwa)]
1000 loops, best of 3: 251 us per loop
>>> %timeit -n 1000 [sum(x) for x in testmodule.window(**kwa)]
1000 loops, best of 3: 525 us per loop
>>> %timeit -n 1000 [sum(x) for x in testmodule.ia(**kwa)]
1000 loops, best of 3: 462 us per loop
>>> %timeit -n 1000 [sum(x) for x in testmodule.window_iter_fill(**kwa)]
1000 loops, best of 3: 333 us per loop
Overall, once you use izip, window_iter_fill is quite fast, as it turns out -- especially for small windows.
Resulting function (from the edit of the question): a 'frankeniter' built with ideas from the answers of #agf, #FogleBird and #senderle; a somewhat-neat-looking piece of code:
from itertools import chain, repeat, islice
def window(seq, size=2, fill=0, fill_left=True, fill_right=False):
    """ Returns a sliding window (of width n) over data from the iterable:
    s -> (s0, s1, ..., s[n-1]), (s1, s2, ..., sn), ...
    """
    ssize = size - 1
    it = chain(
        repeat(fill, ssize * fill_left),
        iter(seq),
        repeat(fill, ssize * fill_right))
    result = tuple(islice(it, size))
    if len(result) == size:  # `<=` if okay to return seq if len(seq) < size
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
and, for some performance information regarding deque/tuple:
In [32]: kwa = dict(gen=xrange(1000), size=4, fill=-1, fill_left=True, fill_right=True)
In [33]: %timeit -n 10000 [a+b+c+d for a,b,c,d in tmpf5.ia(**kwa)]
10000 loops, best of 3: 358 us per loop
In [34]: %timeit -n 10000 [a+b+c+d for a,b,c,d in tmpf5.window(**kwa)]
10000 loops, best of 3: 368 us per loop
In [36]: %timeit -n 10000 [sum(x) for x in tmpf5.ia(**kwa)]
10000 loops, best of 3: 340 us per loop
In [37]: %timeit -n 10000 [sum(x) for x in tmpf5.window(**kwa)]
10000 loops, best of 3: 432 us per loop
But anyway, if the data is numeric, NumPy is likely preferable.
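A minimal sketch of what that could look like, assuming NumPy 1.20+ (which provides numpy.lib.stride_tricks.sliding_window_view); the padding mirrors fill_left=True, fill_right=True above:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

size, fill = 4, -1
data = np.arange(1000)
padded = np.concatenate([np.full(size - 1, fill), data, np.full(size - 1, fill)])  # pad both edges
windows = sliding_window_view(padded, size)  # shape (len(padded) - size + 1, size); views, not copies
row_sums = windows.sum(axis=1)               # e.g. the per-window sum benchmark, vectorized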
I'm surprised nobody took a simple coroutine approach.
from collections import deque
def window(n, initial_data=None):
    if initial_data:
        win = deque(initial_data, n)
    else:
        win = deque(((yield) for _ in range(n)), n)
    while 1:
        side, val = (yield win)
        if side == 'left':
            win.appendleft(val)
        else:
            win.append(val)
win = window(4)
win.next()
print(win.send(('left', 1)))
print(win.send(('left', 2)))
print(win.send(('left', 3)))
print(win.send(('left', 4)))
print(win.send(('right', 5)))
## -- Results of print statements --
deque([1, None, None, None], maxlen=4)
deque([2, 1, None, None], maxlen=4)
deque([3, 2, 1, None], maxlen=4)
deque([4, 3, 2, 1], maxlen=4)
deque([3, 2, 1, 5], maxlen=4)