Applying rectangular step function using scipy - python

Is there a rectangular filter in scipy? I want to replicate @Mike T's solution https://gis.stackexchange.com/questions/9431/what-raster-smoothing-generalization-tools-are-available/162852 but instead of applying a Gaussian blur, I want to apply a rectangular filter (i.e. essentially a step function).

You can simply use scipy.ndimage.convolve and define the weights (filter) yourself. There is also a specialized and optimized rectangular filter function: uniform_filter.
For example, to apply a width-3 mean filter (multiply the result by the number of elements in the window, i.e. size**ndim for a scalar size, if you want a sum filter instead):
>>> import numpy as np
>>> from scipy.ndimage import uniform_filter
>>> uniform_filter([1,4,2,56,2,3,6,1,3,1,3], size=3)
array([ 2, 2, 20, 19, 20, 3, 3, 3, 1, 2, 2])
This can also be applied to multidimensional arrays:
>>> uniform_filter(np.random.randint(0, 20, (10, 10)), size=3) # 3x3 filter
array([[ 6, 7, 10, 9, 9, 7, 5, 7, 9, 12],
[ 6, 7, 9, 9, 7, 6, 5, 5, 7, 9],
[ 5, 8, 8, 9, 7, 6, 4, 4, 6, 8],
[ 9, 10, 9, 10, 8, 6, 4, 4, 8, 11],
[10, 12, 9, 10, 9, 10, 8, 8, 9, 10],
[12, 12, 9, 10, 10, 10, 9, 9, 9, 9],
[12, 11, 9, 8, 7, 8, 7, 8, 6, 5],
[11, 10, 9, 9, 9, 9, 6, 8, 8, 9],
[12, 9, 7, 7, 9, 8, 6, 6, 6, 8],
[12, 9, 8, 9, 12, 10, 7, 5, 6, 9]])
>>> uniform_filter(np.random.randint(0, 20, (10, 10)), size=(5, 3)) # 5x3 filter
array([[ 7, 7, 8, 10, 11, 10, 11, 11, 12, 12],
[ 7, 7, 7, 9, 10, 11, 11, 10, 9, 9],
[ 7, 6, 6, 8, 8, 9, 9, 9, 8, 8],
[ 6, 6, 6, 8, 8, 9, 8, 7, 7, 8],
[ 8, 8, 7, 9, 8, 11, 8, 8, 6, 6],
[ 7, 8, 7, 9, 8, 11, 7, 7, 5, 7],
[ 9, 8, 8, 9, 9, 9, 7, 7, 7, 8],
[ 8, 8, 8, 9, 10, 10, 8, 7, 6, 6],
[ 9, 7, 6, 8, 9, 10, 8, 7, 6, 6],
[ 9, 6, 6, 7, 10, 8, 8, 6, 6, 6]])
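To make the link between the two approaches concrete, here is a minimal sketch comparing uniform_filter with scipy.ndimage.convolve using hand-built box weights. The variable names and the random test image are illustrative assumptions. The input is cast to float because integer input yields integer (truncated) output, as in the examples above.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

# Hypothetical test image; float dtype avoids integer truncation of the means.
rng = np.random.default_rng(0)
image = rng.integers(0, 20, size=(10, 10)).astype(float)

size = 3

# Rectangular (box) kernel whose weights sum to 1, i.e. a mean filter.
kernel = np.ones((size, size)) / size**2

# Both functions use mode="reflect" at the borders by default.
mean_by_convolve = convolve(image, kernel)
mean_by_uniform = uniform_filter(image, size=size)

# A sum filter is the mean filter scaled by the number of window elements.
box_sum = mean_by_uniform * size**image.ndim

print(np.allclose(mean_by_convolve, mean_by_uniform))
```

uniform_filter is usually the better choice for large windows, since it works one axis at a time; convolve is the more general tool when the weights are not all equal.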

Related

Implement a simple search algorithm

So I am trying to work with the following problem (explanation below):
I have an array = [4,7,9,10,9,8,8,9,9,9,7,7,6,5]. These numbers are the 'number of service agents'. The average service level of this array is 0.86 (calculated by an additional function). However, my goal is to reduce the number of agents in this array (one at a time) until I have an improved array with an average service level closer to 0.8. One condition, however, is that the number of agents must always stay above a certain threshold value.
I have implemented the following code that controls this search algorithm:
import math
import numpy as np
# rate_of_service and sum_agents are defined elsewhere (not shown in the question)
def optimization_of_agents(list_of_new_agents, vector_of_lambdas):
    search_round = 0
    continue_search = True
    while continue_search:
        index = 0
        service_level_score = 0
        for i in range(len(list_of_new_agents)):
            # only reduce positions that are still above the lambda/mu threshold
            if list_of_new_agents[i] > vector_of_lambdas[i]/rate_of_service:
                temp_array = list_of_new_agents.copy()
                temp_array[i] = temp_array[i]-1
                temp_service_level = calculate_service_level_optimized_vector(temp_array, vector_of_lambdas)
                if temp_service_level > 0.8 and temp_service_level > service_level_score:
                    service_level_score = temp_service_level
                    index = i
                    print("Search iteration: ", i)
                    print(temp_array)
                    print(service_level_score)
        if service_level_score <= 0.8:
            continue_search = False
        else:
            list_of_new_agents[index] = list_of_new_agents[index]-1
            search_round = search_round + 1
            print("\n")
            print("Search After Round: ", search_round)
            print("Improved Vector Of Service Agents: ", list_of_new_agents)
            print("Average Service Level: ", service_level_score)
            print("\n")
def calculate_service_level(agents, new_lambda):
    N = ((new_lambda/rate_of_service)**agents) / math.factorial(agents)
    P = 1 - (new_lambda/rate_of_service)/agents
    K = sum_agents(agents, new_lambda)
    probability_delay = N / ((P*K)+N)
    service_level = 1-(probability_delay*(np.exp(-(rate_of_service*(agents-(new_lambda/rate_of_service))*0.5))))
    return service_level
def calculate_service_level_optimized_vector(list_of_new_agents, new_vector_lambdas_list):
    service_level_list = []
    for agents, lambdas in zip(list_of_new_agents, new_vector_lambdas_list):
        new_service_level = calculate_service_level(agents, lambdas)
        service_level_list.append(new_service_level)
    average_array = np.array(service_level_list)
    average = np.mean(average_array)
    #if average > 0.8:
    #print(list_of_new_agents)
    #print(average)
    return average
which gives me the following output:
Search iteration: 0
[3, 7, 9, 10, 9, 8, 8, 9, 9, 9, 7, 7, 6, 5]
0.8485658691850712
Search iteration: 1
[4, 6, 9, 10, 9, 8, 8, 9, 9, 9, 7, 7, 6, 5]
0.854410495947378
Search iteration: 2
[4, 7, 8, 10, 9, 8, 8, 9, 9, 9, 7, 7, 6, 5]
0.8561148487593274
Search iteration: 3
[4, 7, 9, 9, 9, 8, 8, 9, 9, 9, 7, 7, 6, 5]
0.8568206633443681
Search iteration: 11
[4, 7, 9, 10, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8571998848973724
Search After Round: 1
Improved Vector Of Service Agents: [4, 7, 9, 10, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
Average Service Level: 0.8571998848973724
Search iteration: 0
[3, 7, 9, 10, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8394202679038497
Search iteration: 1
[4, 6, 9, 10, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8452648946661565
Search iteration: 2
[4, 7, 8, 10, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.846969247478106
Search iteration: 3
[4, 7, 9, 9, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8476750620631467
Search After Round: 2
Improved Vector Of Service Agents: [4, 7, 9, 9, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
Average Service Level: 0.8476750620631467
Search iteration: 0
[3, 7, 9, 9, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8298954450696241
Search iteration: 1
[4, 6, 9, 9, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8357400718319309
Search iteration: 2
[4, 7, 8, 9, 9, 8, 8, 9, 9, 9, 7, 6, 6, 5]
0.8374444246438804
Search iteration: 8
[4, 7, 9, 9, 9, 8, 8, 9, 8, 9, 7, 6, 6, 5]
0.8375936706851099
Search iteration: 9
[4, 7, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 6, 5]
0.8380728605009203
Search After Round: 3
Improved Vector Of Service Agents: [4, 7, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 6, 5]
Average Service Level: 0.8380728605009203
Search iteration: 0
[3, 7, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 6, 5]
0.8202932435073976
Search iteration: 1
[4, 6, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 6, 5]
0.8261378702697044
Search iteration: 2
[4, 7, 8, 9, 9, 8, 8, 9, 9, 8, 7, 6, 6, 5]
0.8278422230816539
Search iteration: 8
[4, 7, 9, 9, 9, 8, 8, 9, 8, 8, 7, 6, 6, 5]
0.8279914691228836
Search iteration: 12
[4, 7, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 5, 5]
0.8280279465736201
Search After Round: 4
Improved Vector Of Service Agents: [4, 7, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 5, 5]
Average Service Level: 0.8280279465736201
Search iteration: 0
[3, 7, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 5, 5]
0.8102483295800973
Search iteration: 1
[4, 6, 9, 9, 9, 8, 8, 9, 9, 8, 7, 6, 5, 5]
0.8160929563424041
Search iteration: 2
[4, 7, 8, 9, 9, 8, 8, 9, 9, 8, 7, 6, 5, 5]
0.8177973091543536
Search iteration: 8
[4, 7, 9, 9, 9, 8, 8, 9, 8, 8, 7, 6, 5, 5]
0.8179465551955832
Search After Round: 5
Improved Vector Of Service Agents: [4, 7, 9, 9, 9, 8, 8, 9, 8, 8, 7, 6, 5, 5]
Average Service Level: 0.8179465551955832
Search iteration: 0
[3, 7, 9, 9, 9, 8, 8, 9, 8, 8, 7, 6, 5, 5]
0.8001669382020605
Search iteration: 1
[4, 6, 9, 9, 9, 8, 8, 9, 8, 8, 7, 6, 5, 5]
0.8060115649643675
Search iteration: 2
[4, 7, 8, 9, 9, 8, 8, 9, 8, 8, 7, 6, 5, 5]
0.807715917776317
Search After Round: 6
Improved Vector Of Service Agents: [4, 7, 8, 9, 9, 8, 8, 9, 8, 8, 7, 6, 5, 5]
Average Service Level: 0.807715917776317
Essentially, my code starts from the first element and changes this value until two conditions are no longer satisfied: 1) average service level of new array is < 0.8 and 2) the new individual element is less than a lambda/mu threshold.
However, I feel this is not great because after changing the first element from 3 to 2 agents, the service level boundary of 0.8 is already reached. This does not seem efficient as it does not allow me to try and reduce the agents in any other portion of the array. So, what I would like to do, is once this level is reached with the new array, go back to the original array I had, and start from the second element and make changes as in the above algorithm (whilst keeping the first element of the original array at 3). But I am unsure how to code this addition.
Hopefully you guys have some tips!
Many thanks, Steve.

How to get 8 maximum / highest values in each row of 2D array with their indices

I'm working on a Python script. I'm trying to achieve a list of 8 highest values and their respective indices of every single row of my 2D array in Python. The shape of my array has 4148 rows and 167 columns. What I essentially want is that for every row it should give me the 8 highest values (in descending order) present in that row with their indices.
I'm relatively new to Python, and I've tried to implement this below, however it gives me the overall 8 maximum values and their indices in the whole array.
a = predicting[:]
indices = np.argpartition(a.flatten(), -8)[-8:]
np.vstack(np.unravel_index(indices, a.shape)).T
You can refer to my example. You will get a 2D array containing the indices of the 8 largest numbers in each row, and a 2D array containing the 8 largest values themselves:
a = np.random.randint(0,12,size=(12,12))
indice = np.argsort(-a)
indice = indice[:,:8]
b = np.sort(-a.copy())*-1
maximum_8 = b[:,:8]
One output:
a
array([[ 1, 10, 11, 8, 8, 2, 4, 11, 2, 5, 3, 6],
[11, 2, 2, 7, 3, 3, 9, 0, 0, 0, 10, 4],
[ 8, 10, 8, 10, 5, 9, 6, 7, 3, 5, 2, 8],
[ 4, 8, 8, 2, 6, 2, 0, 7, 1, 10, 10, 6],
[ 9, 1, 5, 0, 6, 4, 3, 6, 7, 0, 7, 7],
[ 3, 7, 8, 0, 11, 10, 10, 8, 2, 7, 2, 7],
[ 6, 7, 5, 11, 6, 5, 4, 3, 0, 0, 8, 2],
[ 7, 11, 7, 9, 11, 11, 8, 11, 4, 11, 6, 11],
[11, 3, 9, 7, 11, 8, 11, 3, 8, 9, 0, 3],
[ 4, 7, 6, 9, 11, 3, 8, 0, 5, 11, 6, 5],
[ 9, 11, 8, 2, 5, 4, 4, 4, 9, 4, 7, 9],
[ 5, 5, 3, 6, 4, 8, 4, 9, 4, 1, 8, 9]])
indice
array([[ 2, 7, 1, 3, 4, 11, 9, 6],
[ 0, 10, 6, 3, 11, 4, 5, 1],
[ 1, 3, 5, 0, 2, 11, 7, 6],
[ 9, 10, 1, 2, 7, 4, 11, 0],
[ 0, 8, 10, 11, 4, 7, 2, 5],
[ 4, 5, 6, 2, 7, 1, 9, 11],
[ 3, 10, 1, 0, 4, 2, 5, 6],
[ 1, 4, 5, 7, 9, 11, 3, 6],
[ 0, 4, 6, 2, 9, 5, 8, 3],
[ 4, 9, 3, 6, 1, 2, 10, 8],
[ 1, 0, 8, 11, 2, 10, 4, 5],
[ 7, 11, 5, 10, 3, 0, 1, 4]], dtype=int64)
maximum_8
array([[11, 11, 10, 8, 8, 6, 5, 4],
[11, 10, 9, 7, 4, 3, 3, 2],
[10, 10, 9, 8, 8, 8, 7, 6],
[10, 10, 8, 8, 7, 6, 6, 4],
[ 9, 7, 7, 7, 6, 6, 5, 4],
[11, 10, 10, 8, 8, 7, 7, 7],
[11, 8, 7, 6, 6, 5, 5, 4],
[11, 11, 11, 11, 11, 11, 9, 8],
[11, 11, 11, 9, 9, 8, 8, 7],
[11, 11, 9, 8, 7, 6, 6, 5],
[11, 9, 9, 9, 8, 7, 5, 4],
[ 9, 9, 8, 8, 6, 5, 5, 4]])
Another possible solution:
b = np.flip(np.argsort(a, axis=1)[:,-8:], axis=1)
v = np.take_along_axis(a, b, axis=1)
v, b # v = values; b = indices
If you can put the row data in a dictionary, swap each key and value into a tuple, collect the tuples in a list, and sort that list in reverse. Then take the first 8 entries of the list for each row.
# put the row data in a dictionary called dictionaryRow, swap key and value, put into a list, sort in reverse
x = sorted([(value, key) for key, value in dictionaryRow.items()], reverse=True)
# take the top 8 (value, key) pairs from the list
x[:8]
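The attempt in the question uses np.argpartition on the flattened array, which is why it returns the top 8 of the whole array. A minimal sketch of the same idea applied per row follows; the helper name top_k_per_row and the random stand-in for the predicting array are assumptions for illustration. argpartition only selects the k largest entries, so a second, small sort puts them in descending order.

```python
import numpy as np

def top_k_per_row(a, k):
    """Return (values, indices) of the k largest entries of each row, descending."""
    # Column indices of the k largest entries per row, in arbitrary order.
    part = np.argpartition(a, -k, axis=1)[:, -k:]
    part_vals = np.take_along_axis(a, part, axis=1)
    # Sort only those k candidates, then reverse to get descending order.
    order = np.argsort(part_vals, axis=1)[:, ::-1]
    idx = np.take_along_axis(part, order, axis=1)
    vals = np.take_along_axis(a, idx, axis=1)
    return vals, idx

# Hypothetical stand-in with the shape mentioned in the question.
rng = np.random.default_rng(0)
predicting = rng.random((4148, 167))
values, indices = top_k_per_row(predicting, 8)
print(values.shape, indices.shape)
```

For 167 columns the difference from a full argsort is small; the partition-based version mainly pays off when rows are much longer than k.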

Translate array into x and y direction - Python

We have the following two-dimensional array with x and y coordinates:
x = np.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]])
We flatten it: x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
and our goal is to apply translations in the x direction and the y direction.
We are dealing with a 4x4 array (lattice), and the first transformation is a shift by 1 in the x direction:
so from '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]' we get '[1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12]'.
The next transformation is two shifts in x:
from '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]' we get '[2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13]'.
We want to get this (flattened) array:
y = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12],
[2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13],
[3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14],
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3],
[5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0],
[6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1],
[7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2],
[8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7],
[9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4],
[10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5],
[11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6],
[12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8],
[14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9],
[15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10]])
I tried using:
y = np.roll(np.roll(x, -1), -1)
You can combine two rolls inside a single vstack: roll along axis=0 for the y shift and along axis=1 for the x shift (here arr is the original 4x4 array, before flattening).
np.vstack([np.roll(np.roll(arr, -i, axis=0), -j, axis=1).flatten()
           for i in range(arr.shape[0])
           for j in range(arr.shape[1])])
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[ 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12],
[ 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13],
[ 3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14],
[ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3],
[ 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0],
[ 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1],
[ 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2],
[ 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7],
[ 9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4],
[10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5],
[11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6],
[12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8],
[14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9],
[15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10]])
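The same table can also be built without any Python-level loop by using modular index arithmetic and broadcasting. This is a sketch of an alternative technique, not the answer's method; the names lattice, shift, site, rows, cols and translations are illustrative.

```python
import numpy as np

n = 4
lattice = np.arange(n * n).reshape(n, n)
shift = np.arange(n)   # possible shifts along one direction
site = np.arange(n)    # row or column positions on the lattice

# Axes of the broadcast result: (y shift, x shift, row, column).
rows = (site[None, None, :, None] + shift[:, None, None, None]) % n
cols = (site[None, None, None, :] + shift[None, :, None, None]) % n

# translations[dy * n + dx] is the flattened lattice shifted by dy in y and dx in x.
translations = lattice[rows, cols].reshape(n * n, n * n)
print(translations[1])
```

The modulo implements the periodic wrap-around that np.roll provides, so row dy * n + dx of the result matches rolling by -dy along axis 0 and -dx along axis 1.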

What should be the terminating condition for this implementation of the 15-puzzle problem?

I am trying to implement a solution for outputting the sequence of moves for a 15-puzzle problem in Python. This is part of an optional assignment for a MOOC. The problem statement is given at this link.
I have a version of the program (given below) which performs valid transitions.
I am first identifying the neighbors of the empty cell (represented by 0) and putting them in a list. Then, I am randomly choosing one of the neighbors from the list to perform swaps with the empty cell. All the swaps are accumulated in a different list to record the sequence of moves to solve the puzzle. This is then outputted at the end of the program.
However, the random selection of numbers to swap with the empty cell just goes on forever. To avoid an "infinite" (very long) run of the loop, I have limited the number of swaps to 30 for now.
from random import randint
def find_idx_of_empty_cell(p):
    for i in range(len(p)):
        if p[i] == 0:
            return i
def pick_random_neighbour_idx(neighbours_idx_list):
    rand_i = randint(0, len(neighbours_idx_list)-1)
    return neighbours_idx_list[rand_i]
def perform__neighbour_transposition(p, tar_idx, src_idx):
    temp = p[tar_idx]
    p[tar_idx] = p[src_idx]
    p[src_idx] = temp
def solve_15_puzzle(p):
    standard_perm = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0]
    neighbours_idx_list = []
    moves_sequence = []
    empty_cell_idx = find_idx_of_empty_cell(p)
    previous_empty_cell_idx = empty_cell_idx
    while (not(p == standard_perm) and len(moves_sequence) < 30):
        if not (empty_cell_idx in [0,4,8,12]):
            neighbours_idx_list.append(empty_cell_idx - 1)
        if not (empty_cell_idx in [3,7,11,15]):
            neighbours_idx_list.append(empty_cell_idx + 1)
        if not (empty_cell_idx in [0,1,2,3]):
            neighbours_idx_list.append(empty_cell_idx - 4)
        if not (empty_cell_idx in [12,13,14,15]):
            neighbours_idx_list.append(empty_cell_idx + 4)
        if previous_empty_cell_idx in neighbours_idx_list:
            neighbours_idx_list.remove(previous_empty_cell_idx)
        chosen_neighbour_idx = pick_random_neighbour_idx(neighbours_idx_list)
        moves_sequence.append(p[chosen_neighbour_idx])
        perform__neighbour_transposition(p, empty_cell_idx, chosen_neighbour_idx)
        previous_empty_cell_idx = empty_cell_idx
        empty_cell_idx = chosen_neighbour_idx
        neighbours_idx_list = []
    if (p == standard_perm):
        print("Solution: ", moves_sequence)
For the below invocation of the method, the expected output is [15, 14, 10, 13, 9, 10, 14, 15].
solve_15_puzzle([1, 2, 3, 4, 5, 6, 7, 8, 13, 9, 11, 12, 10, 14, 15, 0])
The 15-tile puzzle is harder than it may seem at first sight.
Computing the best (shortest) solution is a difficult problem, and it has been proved that finding the optimal solution as N increases is NP-hard.
Finding a (non-optimal) solution is much easier. A very simple algorithm that can be made to work is, for example:
1. Define a "distance" of the current position as the sum of the manhattan distances of every tile from the position you want it to be in.
2. Start from the given position and make some random moves.
3. If the distance after the moves improves or stays the same, keep the changes; otherwise undo them and return to the starting point.
This kind of algorithm could be described as a multi-step stochastic hill-climbing approach and is able to solve the 15 puzzle (just make sure to allow enough random moves to be able to escape a local minimum).
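The steps above can be sketched as follows. This is a minimal illustration and not the implementation whose output is shown in this answer: the names GOAL, manhattan, neighbours and solve, and the parameters burst, max_tries and seed, are assumptions, and the empty cell is encoded as 0 as in the question. The search stops when the distance reaches 0 or the try budget runs out, and it may return None for hard or unsolvable positions.

```python
import random

GOAL = list(range(1, 16)) + [0]  # 0 marks the empty cell

def manhattan(p):
    """Sum of the manhattan distances of all tiles from their goal cells."""
    total = 0
    for idx, tile in enumerate(p):
        if tile == 0:
            continue  # the empty cell does not count
        goal_idx = tile - 1
        total += abs(idx // 4 - goal_idx // 4) + abs(idx % 4 - goal_idx % 4)
    return total

def neighbours(empty):
    """Board indices that can be swapped with the empty cell."""
    r, c = divmod(empty, 4)
    out = []
    if c > 0:
        out.append(empty - 1)
    if c < 3:
        out.append(empty + 1)
    if r > 0:
        out.append(empty - 4)
    if r < 3:
        out.append(empty + 4)
    return out

def solve(p, burst=20, max_tries=200_000, seed=0):
    """Multi-step stochastic hill climbing; returns the moved tiles or None."""
    rng = random.Random(seed)
    p = list(p)
    moves = []
    best = manhattan(p)
    for _ in range(max_tries):
        if best == 0:
            return moves
        trial = list(p)
        trial_moves = []
        empty = trial.index(0)
        # Make a short burst of random moves on a copy of the position.
        for _ in range(rng.randint(1, burst)):
            nxt = rng.choice(neighbours(empty))
            trial_moves.append(trial[nxt])
            trial[empty], trial[nxt] = trial[nxt], trial[empty]
            empty = nxt
        # Keep the burst only if the distance did not get worse.
        d = manhattan(trial)
        if d <= best:
            p, best = trial, d
            moves.extend(trial_moves)
    return moves if best == 0 else None

if __name__ == "__main__":
    start = [1, 2, 3, 4, 5, 6, 7, 8, 13, 9, 11, 12, 10, 14, 15, 0]
    result = solve(start)
    print("no solution found" if result is None else len(result))
```

Accepting bursts that leave the distance unchanged lets the search drift across plateaus, which is what allows it to escape local minima.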
Python is probably not the best language to attack this problem, but if you use the PyPy implementation you can get solutions in a reasonable time.
My implementation finds a solution for a puzzle that has been mixed up with 1000 random moves in seconds, for example:
(1, 5, 43, [9, [4, 10, 14, 11, 15, 3, 8, 1, 13, None, 9, 7, 12, 2, 5, 6]])
(4, 17, 41, [9, [4, 10, 14, 11, 15, 3, 8, 1, 12, None, 6, 2, 5, 13, 9, 7]])
(7, 19, 39, [11, [4, 10, 14, 11, 15, 3, 1, 2, 12, 6, 8, None, 5, 13, 9, 7]])
(9, 54, 36, [5, [4, 14, 3, 11, 15, None, 10, 2, 12, 6, 1, 8, 5, 13, 9, 7]])
(11, 60, 34, [10, [4, 14, 3, 11, 15, 10, 1, 2, 12, 6, None, 8, 5, 13, 9, 7]])
(12, 93, 33, [14, [4, 14, 11, 2, 15, 10, 3, 8, 12, 6, 1, 7, 5, 13, None, 9]])
(38, 123, 31, [11, [4, 14, 11, 2, 6, 10, 3, 8, 15, 12, 1, None, 5, 13, 9, 7]])
(40, 126, 30, [13, [15, 6, 4, 2, 12, 10, 11, 3, 5, 14, 1, 8, 13, None, 9, 7]])
(44, 172, 28, [10, [15, 4, 2, 3, 12, 6, 11, 8, 5, 10, None, 14, 13, 9, 1, 7]])
(48, 199, 23, [11, [15, 6, 4, 3, 5, 12, 2, 8, 13, 10, 11, None, 9, 1, 7, 14]])
(61, 232, 22, [0, [None, 15, 4, 3, 5, 6, 2, 8, 1, 12, 10, 14, 13, 9, 11, 7]])
(80, 276, 20, [10, [5, 15, 4, 3, 1, 6, 2, 8, 13, 10, None, 7, 9, 12, 14, 11]])
(105, 291, 19, [4, [9, 1, 2, 4, None, 6, 8, 7, 5, 15, 3, 11, 13, 12, 14, 10]])
(112, 313, 17, [9, [1, 6, 2, 4, 9, 8, 3, 7, 5, None, 14, 11, 13, 15, 12, 10]])
(113, 328, 16, [15, [1, 6, 2, 4, 9, 8, 3, 7, 5, 15, 11, 10, 13, 12, 14, None]])
(136, 359, 15, [4, [1, 6, 2, 4, None, 8, 3, 7, 9, 5, 11, 10, 13, 15, 12, 14]])
(141, 374, 12, [15, [1, 2, 3, 4, 8, 6, 7, 10, 9, 5, 12, 11, 13, 15, 14, None]])
(1311, 385, 11, [14, [1, 2, 3, 4, 8, 5, 7, 10, 9, 6, 11, 12, 13, 15, None, 14]])
(1329, 400, 10, [13, [1, 2, 3, 4, 6, 8, 7, 10, 9, 5, 11, 12, 13, None, 15, 14]])
(1602, 431, 9, [4, [1, 2, 3, 4, None, 6, 8, 7, 9, 5, 11, 10, 13, 15, 14, 12]])
(1707, 446, 8, [5, [1, 2, 3, 4, 6, None, 7, 8, 9, 5, 15, 12, 13, 10, 14, 11]])
(1711, 475, 7, [12, [1, 2, 3, 4, 6, 5, 7, 8, 9, 10, 15, 12, None, 13, 14, 11]])
(1747, 502, 6, [8, [1, 2, 3, 4, 6, 5, 7, 8, None, 9, 10, 12, 13, 14, 15, 11]])
(1824, 519, 5, [14, [1, 2, 3, 4, 9, 6, 7, 8, 5, 10, 15, 12, 13, 14, None, 11]])
(1871, 540, 4, [10, [1, 2, 3, 4, 9, 6, 7, 8, 5, 10, None, 12, 13, 14, 15, 11]])
(28203, 555, 3, [9, [1, 2, 3, 4, 5, 6, 7, 8, 9, None, 10, 12, 13, 14, 11, 15]])
(28399, 560, 2, [10, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 12, 13, 14, 11, 15]])
(28425, 581, 1, [11, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, None, 13, 14, 15, 12]])
(28483, 582, 0, [15, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, None]])
The last line means that after 28,483 experiments it found the target position after 582 moves. Note that 582 is certainly very far from optimal, as it is known that no position in the classic version of the 15 puzzle requires more than 80 moves.
The number after the number of moves is the "manhattan distance"; for example, the fourth-last row is the position:
1 2 3 4
5 6 7 8
9 _ 10 12
13 14 11 15
where the sum of manhattan distances from the solution is 3.

Performance issues in Python for small loops over simple array look-ups

Why are simple loops and/or simple array look-ups so slow in Python?
Specifically, Python (using pypy) is approximately 9 times slower than C++ (with -O2) in the following example. What is the technical reason that explains the performance penalty? Is it the implementation of Python loops in machine code? differences in the optimizations used by the compilers? the memory management? or something else?
The Python code:
# File: timing.py
import sys
T = [ \
[ 0, 6, 7, 7, 7, 8, 6, 7, 8,19,20, 7, 7, 7, 7, 7, 7,21,22,19,20,21,22,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,], \
[ 9, 7,10, 7, 7, 7, 1, 7,23, 7, 7, 1,23, 7, 7, 7, 7, 7, 7, 9,10,30,31, 7, 9,10,30,31, 7,23, 7, 7,23, 7, 7,30,31,30,31,], \
[ 7, 7, 2,11,12, 7, 7, 7, 7, 7, 7,11,12,24,25,26,27, 7, 7, 7, 7, 7, 7, 7,24,25,26,27,32, 7, 7, 7,32,37,38, 7, 7,37,38,], \
[13, 7,14, 7, 7, 7, 3, 7,28, 7, 7, 3,28, 7, 7, 7, 7, 7, 7,13,14,33,34, 7,13,14,33,34, 7,28, 7, 7,28, 7, 7,33,34,33,34,], \
[15, 7,16, 7, 7, 7, 7, 7, 4, 7, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[17, 7,18, 7, 7, 7, 7, 7, 5, 7, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[19, 7,20, 7, 7, 7, 6, 7,29, 7, 7, 6,29, 7, 7, 7, 7, 7, 7,19,20,35,36, 7,19,20,35,36, 7,29, 7, 7,29, 7, 7,35,36,35,36,], \
[ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[21, 7,22, 7, 7, 7, 7, 7, 8, 7, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 9, 1, 7, 7, 7,23, 1, 7,23, 9,10, 7, 7, 7, 7, 7, 7,30,31, 9,10,30,31,23, 7, 7, 7, 7, 7,23,30,31, 7, 7, 7,30,31, 7, 7,], \
[ 7, 7,10, 1,23, 7, 7, 7, 7, 7, 7, 1,23, 9,10,30,31, 7, 7, 7, 7, 7, 7, 7, 9,10,30,31,23, 7, 7, 7,23,30,31, 7, 7,30,31,], \
[24, 7,25, 7, 7, 7,11, 7,32, 7, 7,11,32, 7, 7, 7, 7, 7, 7,24,25,37,38, 7,24,25,37,38, 7,32, 7, 7,32, 7, 7,37,38,37,38,], \
[26, 7,27, 7, 7, 7, 7, 7,12, 7, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[13, 3, 7, 7, 7,28, 3, 7,28,13,14, 7, 7, 7, 7, 7, 7,33,34,13,14,33,34,28, 7, 7, 7, 7, 7,28,33,34, 7, 7, 7,33,34, 7, 7,], \
[ 7, 7,14, 3,28, 7, 7, 7, 7, 7, 7, 3,28,13,14,33,34, 7, 7, 7, 7, 7, 7, 7,13,14,33,34,28, 7, 7, 7,28,33,34, 7, 7,33,34,], \
[15, 7, 7, 7, 7, 4, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,16, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[17, 7, 7, 7, 7, 5, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,18, 7, 5, 7, 7, 7, 7, 7, 7, 7, 5, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[19, 6, 7, 7, 7,29, 6, 7,29,19,20, 7, 7, 7, 7, 7, 7,35,36,19,20,35,36,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,], \
[ 7, 7,20, 6,29, 7, 7, 7, 7, 7, 7, 6,29,19,20,35,36, 7, 7, 7, 7, 7, 7, 7,19,20,35,36,29, 7, 7, 7,29,35,36, 7, 7,35,36,], \
[21, 7, 7, 7, 7, 8, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,22, 7, 8, 7, 7, 7, 7, 7, 7, 7, 8, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[30, 7,31, 7, 7, 7, 7, 7,23, 7, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[24,11, 7, 7, 7,32,11, 7,32,24,25, 7, 7, 7, 7, 7, 7,37,38,24,25,37,38,32, 7, 7, 7, 7, 7,32,37,38, 7, 7, 7,37,38, 7, 7,], \
[ 7, 7,25,11,32, 7, 7, 7, 7, 7, 7,11,32,24,25,37,38, 7, 7, 7, 7, 7, 7, 7,24,25,37,38,32, 7, 7, 7,32,37,38, 7, 7,37,38,], \
[26, 7, 7, 7, 7,12, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,27, 7,12, 7, 7, 7, 7, 7, 7, 7,12, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[33, 7,34, 7, 7, 7, 7, 7,28, 7, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[35, 7,36, 7, 7, 7, 7, 7,29, 7, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[30, 7, 7, 7, 7,23, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,31, 7,23, 7, 7, 7, 7, 7, 7, 7,23, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[37, 7,38, 7, 7, 7, 7, 7,32, 7, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[33, 7, 7, 7, 7,28, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,34, 7,28, 7, 7, 7, 7, 7, 7, 7,28, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[35, 7, 7, 7, 7,29, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,36, 7,29, 7, 7, 7, 7, 7, 7, 7,29, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[37, 7, 7, 7, 7,32, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
[ 7, 7,38, 7,32, 7, 7, 7, 7, 7, 7, 7,32, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], \
]
M = range(39)
idempotents = [0,2,6,7,8,9,11,12,14,16,17,19,21,25,27]
omega = [0,7,2,7,7,7,6,7,8,9,7,11,12,7,14,7,16,17,7,19,7,21,7,7,7,25,7,27,7,7,7,7,7,7,7,7,7,7,7]
def check():
    for e in idempotents:
        for x in M:
            ex = T[e][x]
            for s in M:
                es = T[e][s]
                for f in idempotents:
                    exf = T[ex][f]
                    esf = T[es][f]
                    for y in M:
                        exfy = omega[T[exf][y]]
                        for t in M:
                            tesf = omega[T[t][esf]]
                            if T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf]:
                                return 0
    return 1
sys.exit(check())
The C++ code (requires C++11 because of the new array initialization and iteration syntax):
// File: timing.cc
// Compile via 'g++ -std=c++11 -O2 timing.cc'
// Run via 'time ./a.out'
#include <vector>
#include <cstddef>
int main(int, char **) {
const size_t N = 39;
typedef unsigned element_t;
const std::vector<std::vector<element_t>> T{{
{{ 0, 6, 7, 7, 7, 8, 6, 7, 8,19,20, 7, 7, 7, 7, 7, 7,21,22,19,20,21,22,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,}},
{{ 9, 7,10, 7, 7, 7, 1, 7,23, 7, 7, 1,23, 7, 7, 7, 7, 7, 7, 9,10,30,31, 7, 9,10,30,31, 7,23, 7, 7,23, 7, 7,30,31,30,31,}},
{{ 7, 7, 2,11,12, 7, 7, 7, 7, 7, 7,11,12,24,25,26,27, 7, 7, 7, 7, 7, 7, 7,24,25,26,27,32, 7, 7, 7,32,37,38, 7, 7,37,38,}},
{{13, 7,14, 7, 7, 7, 3, 7,28, 7, 7, 3,28, 7, 7, 7, 7, 7, 7,13,14,33,34, 7,13,14,33,34, 7,28, 7, 7,28, 7, 7,33,34,33,34,}},
{{15, 7,16, 7, 7, 7, 7, 7, 4, 7, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{17, 7,18, 7, 7, 7, 7, 7, 5, 7, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{19, 7,20, 7, 7, 7, 6, 7,29, 7, 7, 6,29, 7, 7, 7, 7, 7, 7,19,20,35,36, 7,19,20,35,36, 7,29, 7, 7,29, 7, 7,35,36,35,36,}},
{{ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{21, 7,22, 7, 7, 7, 7, 7, 8, 7, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 9, 1, 7, 7, 7,23, 1, 7,23, 9,10, 7, 7, 7, 7, 7, 7,30,31, 9,10,30,31,23, 7, 7, 7, 7, 7,23,30,31, 7, 7, 7,30,31, 7, 7,}},
{{ 7, 7,10, 1,23, 7, 7, 7, 7, 7, 7, 1,23, 9,10,30,31, 7, 7, 7, 7, 7, 7, 7, 9,10,30,31,23, 7, 7, 7,23,30,31, 7, 7,30,31,}},
{{24, 7,25, 7, 7, 7,11, 7,32, 7, 7,11,32, 7, 7, 7, 7, 7, 7,24,25,37,38, 7,24,25,37,38, 7,32, 7, 7,32, 7, 7,37,38,37,38,}},
{{26, 7,27, 7, 7, 7, 7, 7,12, 7, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{13, 3, 7, 7, 7,28, 3, 7,28,13,14, 7, 7, 7, 7, 7, 7,33,34,13,14,33,34,28, 7, 7, 7, 7, 7,28,33,34, 7, 7, 7,33,34, 7, 7,}},
{{ 7, 7,14, 3,28, 7, 7, 7, 7, 7, 7, 3,28,13,14,33,34, 7, 7, 7, 7, 7, 7, 7,13,14,33,34,28, 7, 7, 7,28,33,34, 7, 7,33,34,}},
{{15, 7, 7, 7, 7, 4, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,16, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{17, 7, 7, 7, 7, 5, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,18, 7, 5, 7, 7, 7, 7, 7, 7, 7, 5, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{19, 6, 7, 7, 7,29, 6, 7,29,19,20, 7, 7, 7, 7, 7, 7,35,36,19,20,35,36,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,}},
{{ 7, 7,20, 6,29, 7, 7, 7, 7, 7, 7, 6,29,19,20,35,36, 7, 7, 7, 7, 7, 7, 7,19,20,35,36,29, 7, 7, 7,29,35,36, 7, 7,35,36,}},
{{21, 7, 7, 7, 7, 8, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,22, 7, 8, 7, 7, 7, 7, 7, 7, 7, 8, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{30, 7,31, 7, 7, 7, 7, 7,23, 7, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{24,11, 7, 7, 7,32,11, 7,32,24,25, 7, 7, 7, 7, 7, 7,37,38,24,25,37,38,32, 7, 7, 7, 7, 7,32,37,38, 7, 7, 7,37,38, 7, 7,}},
{{ 7, 7,25,11,32, 7, 7, 7, 7, 7, 7,11,32,24,25,37,38, 7, 7, 7, 7, 7, 7, 7,24,25,37,38,32, 7, 7, 7,32,37,38, 7, 7,37,38,}},
{{26, 7, 7, 7, 7,12, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,27, 7,12, 7, 7, 7, 7, 7, 7, 7,12, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{33, 7,34, 7, 7, 7, 7, 7,28, 7, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{35, 7,36, 7, 7, 7, 7, 7,29, 7, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{30, 7, 7, 7, 7,23, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,31, 7,23, 7, 7, 7, 7, 7, 7, 7,23, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{37, 7,38, 7, 7, 7, 7, 7,32, 7, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{33, 7, 7, 7, 7,28, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,34, 7,28, 7, 7, 7, 7, 7, 7, 7,28, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{35, 7, 7, 7, 7,29, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,36, 7,29, 7, 7, 7, 7, 7, 7, 7,29, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{37, 7, 7, 7, 7,32, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
{{ 7, 7,38, 7,32, 7, 7, 7, 7, 7, 7, 7,32, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}},
}};
const std::vector<element_t> idempotents{{0,2,6,7,8,9,11,12,14,16,17,19,21,25,27}};
const std::vector<element_t> omega{{0,7,2,7,7,7,6,7,8,9,7,11,12,7,14,7,16,17,7,19,7,21,7,7,7,25,7,27,7,7,7,7,7,7,7,7,7,7,7}};
element_t ex, es, exf, esf, exfy, tesf;
for(auto e: idempotents) {
for(size_t x = 0; x < N; ++x) {
ex = T[e][x];
for(size_t s = 0; s < N; ++s) {
es = T[e][s];
for(auto f: idempotents) {
exf = T[ex][f];
esf = T[es][f];
for(size_t y = 0; y < N; ++y) {
exfy = omega[T[exf][y]];
for(size_t t = 0; t < N; ++t) {
tesf = omega[T[t][esf]];
if(T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf])
return 0;
}}}}}}
return 1;
}
(Don't ask for details about what the code does. Roughly speaking, a decision procedure in the context of algebraic formal language theory is implemented; the code verifies an identity on the monoid given by the multiplication table T. In particular, the code is not a contrived example but a real-world application. Of course, one can argue about "applications" in the context of formal language theory.)
Timing results
With the code as above the user CPU running times on my machine are as follows:
time pypy timing.py outputs 0m9.329s
time python timing.py outputs 2m18.389s
g++ -std=c++11 -O2 timing.cc && time ./a.out outputs 0m1.064s
Edit
For a fairer comparison I did some optimizations that g++ seems to incorporate automatically. I re-ordered the loops and moved the variable assignments as far outward as possible (as suggested by the comments). This yields a speedup factor for pypy of about 2.5 and for python of about 2.
Also for fairness, I used the dynamic-sized std::vector instead of the constant-sized std::array.
I deleted the asides as to why numpy is slower (the comments indicate that I did not use it correctly) and why executing the loops in the main part of the script is slower (it is known that Python is faster with local variables than with global variables; the comments give references for this).
I am aware of the fact that C++ and Python have different scopes. I also know that C++ is compiled, whereas Python is normally interpreted (which is why I used pypy). I want to know the ultimate technical reason as to why pypy is so much slower on this specific piece of code. (Using numpy and numba one may be able to attain near-native performance here, but this is not in the spirit of my question, because it shifts virtually all the computation back to C code.) I clarified my question accordingly.
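The re-ordered version mentioned in the edit is not shown, so the following is only a sketch of what moving look-ups as far outward as possible could look like. The function names check_naive and check_hoisted, and passing T, omega and idempotents as parameters, are illustrative assumptions rather than the code that was actually timed. Both functions test the same identity; the second one fetches each row of T once per loop level and computes the tesf values once per (s, f) pair instead of once per (s, f, y) triple.

```python
import random

def check_naive(T, omega, idempotents):
    """Same loop structure as check() in the question, with explicit arguments."""
    M = range(len(T))
    for e in idempotents:
        for x in M:
            ex = T[e][x]
            for s in M:
                es = T[e][s]
                for f in idempotents:
                    exf = T[ex][f]
                    esf = T[es][f]
                    for y in M:
                        exfy = omega[T[exf][y]]
                        for t in M:
                            tesf = omega[T[t][esf]]
                            if T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf]:
                                return 0
    return 1

def check_hoisted(T, omega, idempotents):
    """Same test with loop-invariant look-ups moved outward."""
    M = range(len(T))
    for e in idempotents:
        row_e = T[e]
        for x in M:
            row_ex = T[row_e[x]]
            for s in M:
                row_es = T[row_e[s]]
                for f in idempotents:
                    exf = row_ex[f]
                    esf = row_es[f]
                    # tesf depends only on t and esf, not on y.
                    tesf_values = [omega[T[t][esf]] for t in M]
                    for y in M:
                        row_exfy = T[omega[T[exf][y]]]
                        row_a = T[row_exfy[exf]]
                        row_b = T[row_exfy[esf]]
                        for tesf in tesf_values:
                            if row_a[tesf] != row_b[tesf]:
                                return 0
    return 1

if __name__ == "__main__":
    # Hypothetical small table, only to show that both variants agree.
    random.seed(0)
    n = 5
    T = [[random.randrange(n) for _ in range(n)] for _ in range(n)]
    omega = [random.randrange(n) for _ in range(n)]
    print(check_naive(T, omega, [0, 2]) == check_hoisted(T, omega, [0, 2]))
```

Wrapping the loops in a function also matters in Python, since local variable access is faster than global access, as the edit notes.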
Short answer to numpy and python loops
If you are using numpy correctly, you push everything back to C level again.
Long answer
How can one do that?
I'll show here a little bit what I meant with my comment and how one avoids unnecessary loops with numpy. Let's have a look at the original code assuming you have put all lists into np.arrays with np.asarray(list).
for e in idempotents:
for x in M:
ex = T[e][x]
This translates directly to:
T[idempotents]
Why is that?
Numpy arrays can use arrays of indices for indexing. E.g.
T[0] returns all columns (actually all following dimensions) of matrix T. So T[0]==T[0,:] for 2d arrays. Since you are looping over all idempotents as indices and then over all elements in the columns T[e][x], T[idempotents] is identical to these two loops.
For details about it see here.
ex, es
Next is
for e in idempotents:
for x in M:
ex = T[e][x]
for s in M:
es = T[e][s]
Since there is no point in redoing the entire loop again, this translates to
es=ex
Because we are using Python, the matrix es is not even copied, just referenced.
exf, esf
I am skipping now some of the for loops in the snippets.
for f in idempotents:
exf = T[ex][f]
esf = T[es][f]
Now you are accessing the outermost index again with the vector of idempotents, so we can do exactly the same with numpy:
T[T[idempotents]]
print(T[T[idempotents]].shape)
>> (15, 39, 39)
Now, we have an array of dimension (15, 39, 39), because for each element of the 2d array T[idempotents] you return the element of T. This is basically the third loop.
exf = T[T[idempotents]]
esf = exf
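To check the indexing claims so far on something runnable, here is a small sketch. The random table T is a hypothetical stand-in for the real multiplication table, so only shapes and index relationships are meaningful, and the names ex, TT and exf follow the discussion above.

```python
import numpy as np

N = 39
rng = np.random.default_rng(0)
T = rng.integers(0, N, size=(N, N))  # hypothetical stand-in for the table
idempotents = np.array([0, 2, 6, 7, 8, 9, 11, 12, 14, 16, 17, 19, 21, 25, 27])

# Indexing with an array of row indices replaces the loops over e and x.
ex = T[idempotents]            # ex[i, x] == T[e_i][x]

# Indexing T with a 2D index array appends T's column axis.
TT = T[ex]                     # TT[i, x, c] == T[T[e_i][x]][c]

# Selecting the idempotent columns replaces the loop over f.
exf = TT[:, :, idempotents]    # exf[i, x, j] == T[T[e_i][x]][f_j]

print(ex.shape, TT.shape, exf.shape)

# Spot check against the plain nested look-ups.
i, x, j = 2, 5, 3
print(exf[i, x, j] == T[T[idempotents[i]][x]][idempotents[j]])
```

Each indexing step runs in compiled code, which is the sense in which the loops are pushed back to C level.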
From here on it gets more complicated, and I will skip the rest. It will be along the following lines:
Ti = T[idempotents] # T[e][x] == T[e][s] by loop definition
TTi = T[Ti] # T[T[e][x]]
TTi.shape = -1 , 39 # bring first index back into shape
exf = TTi[:, idempotents] # T[T[e][x]][f]
esf = exf # T[T[e][s]][f] == T[T[e][x]][f] by loop definition
Texf = T[exf].ravel()
exfy = omega[Texf]
TTexf = T.T[exf].ravel() # tesf = omega[T[t][esf]] # since I cannot index fast along t I use the transpose of T
tesf = omega[TTexf]
and so on...
This is because Python interprets its input at runtime, whereas the C++ compiler is able to perform major optimizations ahead of time, particularly on for-loops over fixed-size arrays.
Some compilers might even compute the entire output of the nested for-loops at compile time.
