I'm interested in reordering the bits within a number, and since I want to do it several trillion times, I want to do it fast.
Here are the details: given a number num and an order matrix order.
order contains up to ~6000 lines of permutations of the numbers 0..31.
These are the positions to which the bits change.
Simplified example: binary(num) = 1001, order[1]=[0,1,3,2], reordered number for order[1] would be 1010 (binary).
Now I want to know whether my input number num is the smallest of these (~6000) reordered numbers. I'm searching for all 32-bit numbers which fulfill this criterion.
My current approach is too slow, so I'm looking for a speedup.
Minimal reproducible example:
num = 1753251840
order = [[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
[ 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, 19, 18, 17, 16, 23, 22, 21, 20, 27, 26, 25, 24, 31, 30, 29, 28],
[15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16],
[31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
[ 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23, 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31],
[21, 20, 23, 22, 29, 28, 31, 30, 17, 16, 19, 18, 25, 24, 27, 26, 5, 4, 7, 6, 13, 12, 15, 14, 1, 0, 3, 2, 9, 8, 11, 10]]
patterns = set()
bits = format(num, '032b')
for perm in order:
    bitsn = [bits[perm[i]] for i in range(32)]
    patterns.add(int(''.join(bitsn), 2))
print(min(patterns) == num)
Where can I start to improve this?
Extracting bits via strings is generally very inefficient (whatever the language), and the same applies to parsing them back. Moreover, for such a fast low-level operation you need a JIT or a compiled language, as the comments already pointed out.
Here is a prototype using Numba's JIT (it assumes all numbers are unsigned):
import numpy as np
from numba import njit

npOrder = np.array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
                    [ 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, 19, 18, 17, 16, 23, 22, 21, 20, 27, 26, 25, 24, 31, 30, 29, 28],
                    [15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16],
                    [31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
                    [ 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23, 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31],
                    [21, 20, 23, 22, 29, 28, 31, 30, 17, 16, 19, 18, 25, 24, 27, 26, 5, 4, 7, 6, 13, 12, 15, 14, 1, 0, 3, 2, 9, 8, 11, 10]], dtype=np.uint32)

@njit
def extractBits(num):
    # bits[i] holds bit i of num, where i = 0 is the least significant bit
    bits = np.empty(32, dtype=np.int32)
    for i in range(32):
        bits[i] = (num >> i) & 0x01
    return bits

@njit
def permuteAndMerge(bits, perm):
    bitsnFinal = 0
    for i in range(32):
        # Position k of the MSB-first bit string is bit 31-k of the integer,
        # for both the source position perm[i] and the destination position i
        bitsnFinal |= bits[31-perm[i]] << (31-i)
    return bitsnFinal

@njit
def computeOptimized(num):
    bits = extractBits(num)
    permCount = npOrder.shape[0]
    patterns = np.empty(permCount, dtype=np.uint32)
    for i in range(permCount):
        patterns[i] = permuteAndMerge(bits, npOrder[i])
    # The array can be converted to a set if needed here with: set(patterns)
    return min(patterns) == num
This code is about 25 times faster than the original one on my machine (measured over 5,000,000 runs).
You can also use Numba to accelerate and parallelize the loop that calls computeOptimized, which gives a significant additional speed-up.
Note that this can be much faster again in C or C++ using low-level processor instructions (available, for example, on many x86_64 processors). With that and parallelism, throughput should be on the order of a billion permutations per second.
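Independently of the language, one common way to cut the per-permutation work is a byte-wise lookup table. This is a swapped-in technique, not something measured or used in the answer above, and the names build_tables and permute_with_tables are made up for this sketch. It assumes 32-bit unsigned inputs and the same MSB-first position convention as the question. Each permutation gets four 256-entry tables, so permuting one number costs four lookups and three ORs instead of 32 shift-and-mask steps:

```python
def build_tables(perm):
    """Precompute four 256-entry tables (one per input byte) for one permutation."""
    tables = [[0] * 256 for _ in range(4)]
    for i in range(32):
        src = 31 - perm[i]  # input bit (LSB = 0) that feeds output position i
        dst = 31 - i        # output bit (LSB = 0) for position i
        byte_index, bit_in_byte = divmod(src, 8)
        table = tables[byte_index]
        for value in range(256):
            if (value >> bit_in_byte) & 1:
                table[value] |= 1 << dst
    return tables


def permute_with_tables(num, tables):
    """Apply the permutation to a 32-bit unsigned integer via table lookups."""
    return (tables[0][num & 0xFF]
            | tables[1][(num >> 8) & 0xFF]
            | tables[2][(num >> 16) & 0xFF]
            | tables[3][(num >> 24) & 0xFF])


# Hypothetical example permutation: reverse the bits inside every nibble.
nibble_reverse = [4 * (i // 4) + 3 - i % 4 for i in range(32)]
tables = build_tables(nibble_reverse)
print(permute_with_tables(1753251840, tables))
```

The tables are built once per permutation, so the cost is memory: about 6000 × 4 × 256 entries for the full order matrix. Whether that pays off in practice would have to be measured.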
A couple of possible speed-ups, staying with Python and the current algorithm:
Bail out as soon as you find a pattern less than num; once one like that is found, the condition cannot possibly be true. (You also don't need to store patterns; at most a flag whether an equal one was found, if that's not guaranteed by the problem.)
bitsn could be a generator expression, and doesn't need to be in a variable; you'll have to measure whether that's faster.
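The bail-out idea above could look roughly like the sketch below. The function names permute_bits and is_minimal are made up for illustration, and the code uses integer shifts instead of strings, which is an extra change beyond what the list suggests. It also assumes the identity permutation is part of order; if it is not, "no smaller pattern found" is not the same as min(patterns) == num and an extra "equal pattern seen" flag would be needed.

```python
def permute_bits(num, perm, width=32):
    """Return num with its bits reordered by perm (positions counted MSB-first)."""
    result = 0
    for i in range(width):
        # Position k of the MSB-first string is bit (width - 1 - k) of the integer.
        bit = (num >> (width - 1 - perm[i])) & 1
        result |= bit << (width - 1 - i)
    return result


def is_minimal(num, order, width=32):
    """True if no permutation in order produces a value smaller than num."""
    for perm in order:
        if permute_bits(num, perm, width) < num:
            return False  # bail out early: num cannot be the minimum
    return True


# The simplified 4-bit example from the question: 1001 with [0, 1, 3, 2].
print(format(permute_bits(0b1001, [0, 1, 3, 2], width=4), '04b'))  # 1010
```

No set of patterns is stored, so memory use stays constant no matter how many permutations there are.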
More fundamental improvements:
If you want to find all the numbers (rather than just test a particular one), it feels like there ought to be a faster algorithm by considering what the bits mean. A couple of hours of thinking could potentially let you process just the 6000 lists, rather than all 2³² integers.
As others have written, if you're after pure speed, Python is not the ideal language. That depends on the balance of how much time you want to spend on programming vs. on running the program.
Side note:
Are the 32-bit integers signed or unsigned?
I have written this code:
rand_map, lst = [2, 2, 6, 6, 8, 11, 4], []
for i in range(len(rand_map)):
    num = rand_map[i]
    lst.append(num)
    for j in range(i+1, len(rand_map)):
        assembly = num + rand_map[j]
        num += rand_map[j]
        lst.append(assembly)
print(sorted(lst))
Which gives this output:
[2, 2, 4, 4, 6, 6, 8, 8, 10, 11, 12, 14, 14, 15, 16, 19, 20, 22, 23, 24, 25, 29, 31, 33, 35, 35, 37, 39]
I've been trying to rewrite this code using list comprehensions, but I don't know how. I have tried multiple ways (standard and itertools) but I just can't get it right. I'll be very grateful for your help!
I came up with a couple of approaches for this problem:
Approach 1 - Vanilla list comprehension
In this approach, we iterate two variables, i and j and calculate the sum of the elements between these two indexes.
Code:
>>> rand_map = [2, 2, 6, 6, 8, 11, 4]
>>> sorted([sum(rand_map[i:i+j+1]) for i in range(len(rand_map)) for j in range(len(rand_map)-i)])
[2, 2, 4, 4, 6, 6, 8, 8, 10, 11, 12, 14, 14, 15, 16, 19, 20, 22, 23, 24, 25, 29, 31, 33, 35, 35, 37, 39]
Approach 2 - Itertools
In this approach, we use an itertools recipe to iterate n-wise through the rand_map list and calculate the sums accordingly. This works in approximately the same way as the first approach, but is a bit tidier.
Code:
from itertools import islice
def n_wise(iterable, n):
    return zip(*(islice(iterable, i, None) for i in range(n)))
print(sorted([sum(x) for n in range(len(rand_map)) for x in n_wise(rand_map, n+1)]))
Output:
[2, 2, 4, 4, 6, 6, 8, 8, 10, 11, 12, 14, 14, 15, 16, 19, 20, 22, 23, 24, 25, 29, 31, 33, 35, 35, 37, 39]
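Both approaches above recompute every sum from scratch. A third option, sketched here as a possible alternative rather than a recommendation, mirrors the running total (num += rand_map[j]) of the original loop with itertools.accumulate:

```python
from itertools import accumulate

rand_map = [2, 2, 6, 6, 8, 11, 4]

# For each start index i, accumulate() yields the running sums of rand_map[i:],
# which are the same values the original nested loop appends to lst.
sums = [total for i in range(len(rand_map)) for total in accumulate(rand_map[i:])]
print(sorted(sums))
```

Each inner pass adds one element per step instead of re-summing a slice, which may matter for longer lists.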
I have two arrays and I want to check how many integers appear in both of them. The problem I'm having is that it only counts values that match at the same position. Both arrays have 15 numbers in them.
Example:
import numpy as np
a = np.array([1, 4, 5, 7, 9, 14, 15, 17, 18, 19, 21, 22, 23, 25, 26])
b = np.array([8, 28, 12, 3, 24, 16, 23, 19, 14, 2, 11, 29, 27, 6, 13])
print(np.count_nonzero(a==b))
This prints 0 even though there are clearly integers that appear in both. How can I make this print how many integers have the same value?
You want to use np.intersect1d, if I am understanding you correctly:
In [12]: import numpy as np
In [13]: a = np.array([1, 4, 5, 7, 9, 14, 15, 17, 18, 19, 21, 22, 23, 25, 26])
...: b = np.array([8, 28, 12, 3, 24, 16, 23, 19, 14, 2, 11, 29, 27, 6, 13])
...:
In [14]: np.intersect1d(a, b)
Out[14]: array([14, 19, 23])
You can perform broadcasted comparison between b and a, and then just tally up the matches:
(b == a[:, None]).sum()
3
This checks out since you have [14, 19, 23] as the common elements.
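The two answers above agree here because neither array contains duplicates. In general they count different things: np.intersect1d returns each common value once, while the broadcasted comparison counts every matching pair. A plain-Python sketch of the difference follows; lists stand in for the arrays, and the function names are made up for illustration:

```python
from collections import Counter

a = [1, 4, 5, 7, 9, 14, 15, 17, 18, 19, 21, 22, 23, 25, 26]
b = [8, 28, 12, 3, 24, 16, 23, 19, 14, 2, 11, 29, 27, 6, 13]


def count_common_values(xs, ys):
    """Number of distinct values that occur in both sequences."""
    return len(set(xs) & set(ys))


def count_matching_pairs(xs, ys):
    """Number of (i, j) pairs with xs[i] == ys[j]; duplicates count repeatedly."""
    ys_counts = Counter(ys)
    return sum(ys_counts[x] for x in xs)


print(count_common_values(a, b))   # 3
print(count_matching_pairs(a, b))  # 3

# With duplicates the two counts diverge.
print(count_common_values([1, 1, 2], [1, 3]))   # 1
print(count_matching_pairs([1, 1, 2], [1, 3]))  # 2
```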
I've been trying to wrap my head around the best way to split up a list of numbers that is sorted but has gaps between runs of consecutive values. Example:
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 29, 30, 31, 32, 33, 35, 36, 44, 45, 46, 47]
I'd like the output to be this:
sliced_data = [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],[29, 30, 31, 32, 33],[35, 36],[44, 45, 46, 47]]
I've been trying a while loop that runs until the list is empty, but that isn't working too well.
Edit:
for each_half_hour in half_hour_blocks:
    if next_number != each_half_hour:
        skippers.append(half_hour_blocks[:next_number])
        del half_hour_blocks[:next_number]
    next_number = each_half_hour + 1
>>> data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 29, 30, 31, 32, 33, 35, 36, 44, 45, 46, 47]
>>> from itertools import groupby, count
>>> [list(g) for k,g in groupby(data, key=lambda i, c=count():i-next(c))]
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [29, 30, 31, 32, 33], [35, 36], [44, 45, 46, 47]]
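The one-liner above works because, inside a run of consecutive integers, the value minus its position in the list stays constant, so groupby sees one key per run. A more explicit sketch of the same idea, using enumerate instead of the count() default argument (the function name split_runs is made up for illustration):

```python
from itertools import groupby


def split_runs(values):
    """Split a sorted list of integers into runs of consecutive values."""
    runs = []
    # value - index is constant within a run and jumps at every gap.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in group])
    return runs


data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
        29, 30, 31, 32, 33, 35, 36, 44, 45, 46, 47]
print(split_runs(data))
```

For this data the keys are 0, 9, 10 and 17, which gives the four groups shown above.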
I don't see why a while-loop wouldn't work here, unless you're going for something more efficient or succinct.
Something like:
slice = [data.pop(0)]
sliced_data = []
while data:
    if data[0] == slice[-1] + 1:
        slice.append(data.pop(0))
    else:
        sliced_data.append(slice)
        slice = [data.pop(0)]
sliced_data.append(slice)
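Note that the loop above empties data as it goes, and each data.pop(0) shifts the rest of the list. If the input should stay intact, a variant along these lines might do; it is a sketch, and the function name split_consecutive is made up for illustration:

```python
def split_consecutive(values):
    """Split a sorted list into runs of consecutive integers without modifying it."""
    runs = []
    for value in values:
        if runs and value == runs[-1][-1] + 1:
            runs[-1].append(value)  # extends the current run
        else:
            runs.append([value])    # a gap (or the first value) starts a new run
    return runs


data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
        29, 30, 31, 32, 33, 35, 36, 44, 45, 46, 47]
sliced_data = split_consecutive(data)
print(sliced_data)
```

It also handles an empty list, which the pop-based version does not.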