Fast way needed to count the number of differing indices between two strings - Python

I want to count the number of indices at which two strings differ.
Things that are fixed:
The strings only contain 0 or 1 at any index, i.e. they are binary representations of numbers.
Both strings are the same length.
For this problem I wrote the function below in Python:
def foo(a, b):
    result = 0
    for x, y in zip(a, b):
        if x != y:
            result += 1
    return result
The problem is that these strings are huge, so the function above takes too much time. What can I do to make it much faster?
This is how I did the same thing in C++. It's quite fast now, but I still can't understand how to do the packing into short integers suggested by @Yves Daoust:
size_t diff(long long int n1, long long int n2)
{
    long long int c = n1 ^ n2;
    // size the bitset to the full width of the operands, not sizeof(int)
    bitset<sizeof(long long int) * CHAR_BIT> bits(c);
    string s = bits.to_string();
    return std::count(s.begin(), s.end(), '1');
}

I'll walk through the options here, but basically you are calculating the Hamming distance between two numbers. There are dedicated libraries that can make this really, really fast, but let's focus on the pure Python options first.
Your approach, zipping
In Python 2, zip() produces one big list first, then lets you loop over it. You could use itertools.izip() instead and feed a generator expression to sum():
from itertools import izip

def foo(a, b):
    return sum(x != y for x, y in izip(a, b))
This produces only one pair at a time, avoiding having to create a large list of tuples first.
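Note that this targets Python 2. In Python 3, itertools.izip is gone, but the built-in zip() is already lazy, so the equivalent (my addition, not part of the original answer) is simply:
def foo(a, b):
    # Python 3: zip() yields pairs lazily, so no intermediate list is built
    return sum(x != y for x, y in zip(a, b))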
The Python boolean type is a subclass of int, where True == 1 and False == 0, letting you sum them:
>>> True + True
2
Using integers instead
However, you probably want to rethink your input data. It's much more efficient to use integers to represent your binary data; integers can be operated on directly. Doing the conversion inline, then counting the number of 1s in the XOR result:
def foo(a, b):
    return format(int(a, 2) ^ int(b, 2), 'b').count('1')
but not having to convert a and b to integers in the first place would be much more efficient.
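For instance (a small sketch from me, with the inputs hard-coded for illustration): if the data can be parsed into integers once up front, each comparison reduces to a single XOR plus a bit count.
def hamming_int(a, b):
    # a and b are plain Python ints; bin(...).count('1') is equivalent to
    # format(a ^ b, 'b').count('1')
    return bin(a ^ b).count('1')

n1 = int("0100010010", 2)  # parse each binary string exactly once
n2 = int("0011100010", 2)
print(hamming_int(n1, n2))  # -> 5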
Time comparisons:
>>> from itertools import izip
>>> import timeit
>>> s1 = "0100010010"
>>> s2 = "0011100010"
>>> def foo_zipped(a, b): return sum(x != y for x, y in izip(a, b))
...
>>> def foo_xor(a, b): return format(int(a, 2) ^ int(b, 2), 'b').count('1')
...
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_zipped as f')
1.7872788906097412
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_xor as f')
1.3399651050567627
>>> s1 = s1 * 1000
>>> s2 = s2 * 1000
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_zipped as f', number=1000)
1.0649528503417969
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_xor as f', number=1000)
0.0779869556427002
The XOR approach is faster by orders of magnitude if the inputs get larger, and this is with converting the inputs to int first.
Dedicated libraries for bitcounting
The bit counting (format(integer, 'b').count('1')) is pretty fast, but can be made faster still if you install the gmpy extension library (a Python wrapper around the GMP library) and use the gmpy.popcount() function:
def foo(a, b):
return gmpy.popcount(int(a, 2) ^ int(b, 2))
gmpy.popcount() is about 20 times faster on my machine than the str.count() method. Again, not having to convert a and b to integers to begin with would remove another bottleneck, but even with that conversion the per-call performance is almost doubled:
>>> import gmpy
>>> def foo_xor_gmpy(a, b): return gmpy.popcount(int(a, 2) ^ int(b, 2))
...
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_xor as f', number=10000)
0.7225301265716553
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_xor_gmpy as f', number=10000)
0.47731995582580566
To illustrate the difference when a and b are integers to begin with:
>>> si1, si2 = int(s1, 2), int(s2, 2)
>>> def foo_xor_int(a, b): return format(a ^ b, 'b').count('1')
...
>>> def foo_xor_gmpy_int(a, b): return gmpy.popcount(a ^ b)
...
>>> timeit.timeit('f(si1, si2)', 'from __main__ import si1, si2, foo_xor_int as f', number=100000)
3.0529568195343018
>>> timeit.timeit('f(si1, si2)', 'from __main__ import si1, si2, foo_xor_gmpy_int as f', number=100000)
0.15820622444152832
Dedicated libraries for hamming distances
The gmpy library actually includes a gmpy.hamdist() function, which calculates this exact number (the number of 1 bits in the XOR result of the integers) directly:
def foo_gmpy_hamdist(a, b):
    return gmpy.hamdist(int(a, 2), int(b, 2))
which'll blow your socks off entirely if you used integers to begin with:
def foo_gmpy_hamdist_int(a, b):
    return gmpy.hamdist(a, b)
Comparisons:
>>> def foo_gmpy_hamdist(a, b):
... return gmpy.hamdist(int(a, 2), int(b, 2))
...
>>> def foo_gmpy_hamdist_int(a, b):
... return gmpy.hamdist(a, b)
...
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_xor as f', number=100000)
7.479684114456177
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_gmpy_hamdist as f', number=100000)
4.340585947036743
>>> timeit.timeit('f(si1, si2)', 'from __main__ import si1, si2, foo_gmpy_hamdist_int as f', number=100000)
0.22896099090576172
That's 100,000 calculations of the Hamming distance between two 3k+ digit numbers.
Another package that can calculate the distance is Distance, which supports calculating the Hamming distance between strings directly.
Make sure you use the --with-c switch to have it compile the C optimisations; when installing with pip use bin/pip install Distance --install-option --with-c for example.
Benchmarking this against the XOR-with-bitcount approach again:
>>> import distance
>>> def foo_distance_hamming(a, b):
... return distance.hamming(a, b)
...
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_xor as f', number=100000)
7.229060173034668
>>> timeit.timeit('f(s1, s2)', 'from __main__ import s1, s2, foo_distance_hamming as f', number=100000)
0.7701470851898193
It uses the naive approach (zip over both input strings and count the number of differences), but since it does this in C it is still plenty fast, about 10 times as fast as the XOR approach here. The gmpy.hamdist() function still beats it when you use integers, however.

Not tested, but how would this perform:
sum(x != y for x, y in zip(a, b))

If the strings represent binary numbers, you can convert to integers and use bitwise operators:
def foo(s1, s2):
    # return sum(map(int, format(int(s1, 2) ^ int(s2, 2), 'b')))  # one-liner
    a = int(s1, 2)            # convert string to integer
    b = int(s2, 2)
    c = a ^ b                 # use xor to get differences
    s = format(c, 'b')        # convert back to a string of zeroes and ones
    return sum(map(int, s))   # sum all ones (count of differences)

s1 = "0100010010"
s2 = "0011100010"
#      12345   (the five differing positions)
assert foo(s1, s2) == 5

Pack your strings as short integers (16 bits). After XORing, pass each short to a precomputed lookup table of 65536 entries that gives the number of 1 bits per short.
If pre-packing is not an option, switch to C++ with inline AVX2 intrinsics. They will allow you to load 32 characters in a single instruction, perform the comparisons, then pack the 32 results to 32 bits (if I am right).
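A minimal Python sketch of the lookup-table idea (the helper names are mine, not from this answer): precompute a 65536-entry popcount table, pack each bit string into 16-bit words, then XOR word by word and sum the table lookups.
# table of bit counts for every possible 16-bit value
POPCOUNT16 = [bin(i).count('1') for i in range(1 << 16)]

def pack16(bits):
    # pack a '0'/'1' string into 16-bit words, zero-padding the last word
    return [int(bits[i:i + 16].ljust(16, '0'), 2) for i in range(0, len(bits), 16)]

def hamming_packed(words1, words2):
    # words1 and words2 are equal-length lists of 16-bit integers
    return sum(POPCOUNT16[a ^ b] for a, b in zip(words1, words2))

print(hamming_packed(pack16("0100010010"), pack16("0011100010")))  # -> 5
In pure Python this mostly pays off when the packing is done once and the packed words are reused for many comparisons; for one-off comparisons, the int-XOR approaches above are simpler.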

Related

Decimal representation in python

For some inputs I am getting a result like 1.0, but I need it printed as 1.000000000000. How should I modify the output of the hypot calculation?
small = 1000000000
from math import hypot

def pairwise(iterable):
    a, b = iter(iterable), iter(iterable)
    next(b, None)
    return zip(a, b)

n = int(input())
l1 = []
l2 = []
for i in range(0, n):
    c, d = list(map(int, input().split()))
    l1.append(c)
    l2.append(d)
a = tuple(l1)
b = tuple(l2)
dist = [float(hypot(p2[0] - p1[0], p2[1] - p1[1])) for p1, p2 in pairwise(tuple(zip(a, b)))]
for x in dist:
    if x < small:
        small = x
print(small)
You can use the built-in format() function like so:
format(math.pi, '.12g') # Give 12 significant digits of pi
format(math.pi, '.2f') # Give 2 digits of pi after decimal point
You can change your print to
print(f'{small:.12f}')
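As a quick check of the formatting (my example, not part of the original answer):
import math

small = 1.0
print(f'{small:.12f}')          # 1.000000000000
print(format(math.pi, '.12f'))  # 3.141592653590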

multiple logarithm in numpy

I want to take the logarithm multiple times. We know this:
import numpy as np
np.log(x)
now the second logarithm would be
np.log(np.log(x))
What if one wants to take n logs? Surely it would not be Pythonic to repeat this n times as above.
As per @eugenhu's suggestion, one way is to use a generic function which applies f iteratively:
import numpy as np

def repeater(f, n):
    def fn(i):
        result = i
        for _ in range(n):
            result = f(result)
        return result
    return fn

repeater(np.log, 5)(x)
You could use the following little trick:
>>> from functools import reduce
>>>
>>> k = 4
>>> x = 1e12
>>>
>>> y = np.array(x)
>>> reduce(np.log, (k+1) * (y,))[()]
0.1820258315495139
and back:
>>> reduce(np.exp, (k+1) * (y,))[()]
999999999999.9813
On my machine this is slightly faster than @jp_data_analysis's approach:
>>> def f_pp(ufunc, x, k):
... y = np.array(x)
... return reduce(ufunc, (k+1) * (y,))[()]
...
>>> x = 1e12
>>> k = 5
>>>
>>> from timeit import repeat
>>> kwds = dict(globals=globals(), number=100000)
>>>
>>> repeat('repeater(np.log, 5)(x)', **kwds)
[0.5353733809897676, 0.5327484680456109, 0.5363518510130234]
>>> repeat('f_pp(np.log, x, 5)', **kwds)
[0.4512511100037955, 0.4380568229826167, 0.45331112697022036]
To be fair, their approach is more flexible; mine relies on quite specific properties of unary ufuncs and numpy arrays: the second positional argument of a ufunc is its out parameter, so each reduction step effectively computes np.log(y, out=y) in place on the 0-d array, and the trailing [()] unwraps that 0-d array back to a scalar.
Larger k is also possible. For that we need to make sure that x is complex, because np.log will not switch to complex output automatically (the log of a negative real number is just nan):
>>> x = 1e12+0j
>>> k = 50
>>>
>>> f_pp(np.log, x, 50)
(0.3181323483680859+1.3372351153002153j)
>>> f_pp(np.exp, _, 50)
(1000000007040.9696+6522.577629950761j)
# not that bad, all things considered ...
>>>
>>> repeat('f_pp(np.log, x, 50)', **kwds)
[4.272890724008903, 4.266964592039585, 4.270542044949252]
>>> repeat('repeater(np.log, 50)(x)', **kwds)
[5.799160094989929, 5.796761817007791, 5.80835147597827]
From this post, you can compose functions:
Code
import itertools as it
import functools as ft
import numpy as np

def compose(f, g):
    return lambda x: f(g(x))

identity = lambda x: x
Demo
ft.reduce(compose, it.repeat(np.log, times=2), identity)(10)
# 0.83403244524795594
ft.reduce(compose, it.repeat(np.log, times=3), identity)(10)
# -0.18148297420509205

Get the size of a conditional subset of a list

Assume that you have a list with an arbitrary number of items, and you wish to get the number of items that match a specific condition. I thought of two sensible ways to do this, but I am not sure which one is best (more Pythonic), or if there is perhaps a better option (without sacrificing too much readability).
import numpy.random as nprnd
import timeit

my = nprnd.randint(1000, size=1000000)

def with_len(my_list):
    much = len([t for t in my_list if t >= 500])

def with_sum(my_list):
    many = sum(1 for t in my_list if t >= 500)

t1 = timeit.Timer('with_len(my)', 'from __main__ import with_len, my')
t2 = timeit.Timer('with_sum(my)', 'from __main__ import with_sum, my')
print("with len:", t1.timeit(1000)/1000)
print("with sum:", t2.timeit(1000)/1000)
Performance is almost identical between these two cases. However, which of these is more pythonic? Or is there a better alternative?
For those who are curious, I tested the proposed solutions (from comments and answers) and these are the results:
import numpy as np
import timeit
import functools

my = np.random.randint(1000, size=100000)

def with_len(my_list):
    return len([t for t in my_list if t >= 500])

def with_sum(my_list):
    return sum(1 for t in my_list if t >= 500)

def with_sum_alt(my_list):
    return sum(t >= 500 for t in my_list)

def with_lambda(my_list):
    return functools.reduce(lambda a, b: a + (1 if b >= 500 else 0), my_list, 0)

def with_np(my_list):
    return len(np.where(my_list >= 500)[0])

t1 = timeit.Timer('with_len(my)', 'from __main__ import with_len, my')
t2 = timeit.Timer('with_sum(my)', 'from __main__ import with_sum, my')
t3 = timeit.Timer('with_sum_alt(my)', 'from __main__ import with_sum_alt, my')
t4 = timeit.Timer('with_lambda(my)', 'from __main__ import with_lambda, my')
t5 = timeit.Timer('with_np(my)', 'from __main__ import with_np, my')

print("with len:", t1.timeit(1000)/1000)
print("with sum:", t2.timeit(1000)/1000)
print("with sum_alt:", t3.timeit(1000)/1000)
print("with lambda:", t4.timeit(1000)/1000)
print("with np:", t5.timeit(1000)/1000)
Python 2.7
('with len:', 0.02201753337348283)
('with sum:', 0.022727363518455238)
('with sum_alt:', 0.2370256687439941) # <-- very slow!
('with lambda:', 0.026367264818657078)
('with np:', 0.0005811764306089913) # <-- very fast!
Python 3.6
with len: 0.017649643657480736
with sum: 0.0182978007766851
with sum_alt: 0.19659815740239048
with lambda: 0.02691670741400111
with np: 0.000534095418615152
The second one, with_sum, is more Pythonic in the sense that it uses much less memory: it doesn't build the whole list, because the generator expression is fed directly to sum().
I'm with @Chris_Rands. But as far as performance is concerned, there is a faster way using numpy:
import numpy as np

def with_np(my_list):
    return len(np.where(my_list >= 500)[0])
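A slightly more direct numpy idiom (my addition, not from the original answer) counts the True entries of the boolean mask without materialising the index array that np.where() returns:
import numpy as np

def with_count_nonzero(arr):
    # count True values in the boolean mask directly
    return np.count_nonzero(arr >= 500)

my = np.random.randint(1000, size=100000)
assert with_count_nonzero(my) == len(np.where(my >= 500)[0])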

What is the best way to compute a factorial in Python?

I am researching the speed of computing factorials, but I am only using two approaches:
import timeit

def fact(N):
    B = N
    while N > 1:
        B = B * (N-1)
        N = N-1
    return B

def fact1(N):
    B = 1
    for i in range(1, N+1):
        B = B * i
    return B

print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
Here is the output,
0.540276050568 120
0.654400110245 120
From the above code I have observed that while takes less time than for.
My question is: is this the best way to find the factorial in Python?
If you're looking for the best, why not use the one provided in the math module?
>>> import math
>>> math.factorial
<built-in function factorial>
>>> math.factorial(10)
3628800
And a comparison of timings on my machine:
>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
0.840167045593 120
>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
1.04350399971 120
>>> print timeit.timeit('factorial(5)', setup="from math import factorial")
0.149857997894
We see that the builtin is significantly better than either of the pure python variants you proposed.
TL;DR: microbenchmarks aren't very useful.
For CPython, try this:
>>> from math import factorial
>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
1.38128209114 120
>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
1.46199703217 120
>>> print timeit.timeit('factorial(5)', setup="from math import factorial"), factorial(5)
0.397044181824 120
But under PyPy, the while version is faster than the one from math:
>>>> print timeit.timeit('fact(5)', setup="from __main__ import fact"), fact(5)
0.170556783676 120
>>>> print timeit.timeit('fact1(5)', setup="from __main__ import fact1"), fact1(5)
0.319650173187 120
>>>> print timeit.timeit('factorial(5)', setup="from math import factorial"), factorial(5)
0.210616111755 120
So it depends on the implementation. Now try bigger numbers:
>>>> print timeit.timeit('fact(50)', setup="from __main__ import fact"), fact(50)
7.71517109871 30414093201713378043612608166064768844377641568960512000000000000
>>>> print timeit.timeit('fact1(50)', setup="from __main__ import fact1"), fact1(50)
6.58060312271 30414093201713378043612608166064768844377641568960512000000000000
>>>> print timeit.timeit('factorial(50)', setup="from math import factorial"), factorial(50)
6.53072690964 30414093201713378043612608166064768844377641568960512000000000000
The while version is in last place, and the version using for is about the same as the one from the math module.
Otherwise, if you're looking for a Python implementation (this is my favourite):
from operator import mul

def factorial(n):
    # in Python 3, reduce must be imported: from functools import reduce
    return reduce(mul, range(1, (n + 1)), 1)
Usage:
>>> factorial(0)
1
>>> factorial(1)
1
>>> factorial(2)
2
>>> factorial(3)
6
>>> factorial(4)
24
>>> factorial(5)
120
>>> factorial(10)
3628800
Performance (on my desktop):
$ python -m timeit -c -s "fact = lambda n: reduce(lambda a, x: a * x, range(1, (n + 1)), 1)" "fact(10)"
1000000 loops, best of 3: 1.98 usec per loop
I have tried with reduce(lambda x, y: x*y, range(1, 5))
>>> timeit("import math; math.factorial(4)")
1.0205099133840179
>>> timeit("reduce(lambda x, y: x*y, range(1, 5))")
1.4047879075160665
>>> timeit("from operator import mul; reduce(mul, range(1, 5))")
2.530837320051319

Return at least X results from split

split has a maxsplit parameter, which is useful when you want at most X results. Is there something similar for returning at least X results, padding the rest with None? I'd like to be able to write
a, b, c = 'foo,bar'.magic_split(',', 3)
and have a=foo, b=bar and c=None.
Any ideas how to write such a function?
Update: I ended up with a solution that combines two of the answers:
>>> def just(n, iterable, fill=None):
... return (list(iterable) + [fill] * n)[:n]
...
>>> just(3, 'foo,bar'.split(','))
['foo', 'bar', None]
One way would be:
from itertools import chain
from itertools import repeat
from itertools import islice

def magic_split(seq, sep, n, def_value=None):
    return list(islice(chain(seq.split(sep), repeat(def_value)), n))
You could just return the return value of islice if you don't need the list.
If you don't want the values to be cut off when n is less than the number of split elements in seq, the modification is trivial:
def magic_split(seq, sep, n, def_value=None):
    elems = seq.split(sep)
    if len(elems) >= n:
        return elems
    return list(islice(chain(elems, repeat(def_value)), n))
There is no such parameter to str.split(). A hack to achieve this would be
a, b, c = ('foo,bar'.split(',', 2) + [None] * 3)[:3]
Not sure if I recommend this code, though.
I would use a more general function for that:
def fill(iterable, n):
    tmp = tuple(iterable)
    return tmp + (None,) * (n - len(tmp))
Then:
a, b, c = fill('foo,bar'.split(','), 3)
Since you ask for a string method, you can start by deriving from str:
>>> class magicstr(str):
        def magic_split(self, sep=None, mlen=0):
            parts = self.split(sep)
            return parts + [None] * (mlen - len(parts))

>>> test = magicstr("hello there, ok?")
>>> test.magic_split(",", 3)
['hello there', ' ok?', None]
