I have an array Pe of shape (5100, 5100). I am trying to find the neighbor indices using the code below, but for this shape of Pe the computational time is 100 seconds. Is there a more time-efficient way to do it?
import time

def get_neighbor_indices(position, dimensions):
    '''
    dimensions is a shape of np.array
    '''
    i, j = position
    indices = [(i+1, j), (i-1, j), (i, j+1), (i, j-1)]
    return [
        (i, j) for i, j in indices
        if 0 <= i < dimensions[0]
        and 0 <= j < dimensions[1]
    ]

def iterate_array(init_i, init_j, arr, condition_func):
    '''
    arr is an instance of np.array
    condition_func is a function (value) => boolean
    '''
    indices_to_check = [(init_i, init_j)]
    checked_indices = set()
    result = []
    t0 = None
    t1 = None
    timestamps = []
    while indices_to_check:
        pos = indices_to_check.pop()
        if pos in checked_indices:
            continue
        item = arr[pos]
        checked_indices.add(pos)
        if condition_func(item):
            result.append(item)
            t1 = time.time()
            if t0 is None:
                t0 = t1
            timestamps.append(t1 - t0)
            indices_to_check.extend(
                get_neighbor_indices(pos, arr.shape)
            )
    return result, timestamps

Visited_Elements, timestamps = iterate_array(0, 0, Pe, lambda x: x < Pin0)
With scipy, and a slight change in the way the filter condition is described, this can be made rather fast:
import numpy as np
from scipy.ndimage import label
import time
def collect_array(init_i, init_j, arr, condition_arr):
    t0 = time.time()
    if not condition_arr[init_i, init_j]:
        return [], time.time() - t0
    islands = label(condition_arr)[0]
    mask = islands != islands[init_i, init_j]
    mx = np.ma.masked_array(arr, mask)  # use the passed-in array, not the global Pe
    return mx.compressed(), time.time() - t0
# Trying it out
Pe = np.random.rand(5100, 5100)
Pin0 = 0.7
Visited_Elements, timestamp = collect_array(0,0, Pe, Pe < Pin0)
print(Visited_Elements,timestamp)
The core of the code is the label function and the fact that the condition function is replaced by a boolean array.
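To see what label does, here is a minimal example of my own (4-connectivity by default): each True region gets its own integer id, and the start pixel's id selects the island to keep.

import numpy as np
from scipy.ndimage import label

cond = np.array([[1, 1, 0],
                 [0, 0, 0],
                 [0, 1, 1]], dtype=bool)
islands, n_islands = label(cond)  # n_islands == 2
print(islands)
# [[1 1 0]
#  [0 0 0]
#  [0 2 2]]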
I have 3 numpy matrices:
One contains pixel positions in X (x_pos), another pixel positions in Y (y_pos), and the last one contains pixel values (p_value).
I would like to use these 3 matrices to build a result image.
With loops I have this result:
# Resulting image
res = np.zeros((128, 128, 3), dtype=np.uint8)
for i in range(x_pos.shape[0]):
    for j in range(x_pos.shape[1]):
        # Get coordinates
        x = x_pos[i][j]
        y = y_pos[i][j]
        res[y, x] = p_value[i][j]
With large matrices (2048*2048) this code already takes a lot of time. Is it possible to optimize this code without using a nested loop?
I should specify that the positions in the x_pos and y_pos matrices are not necessarily contiguous; there may be holes or duplicate values.
It should be possible using np.meshgrid:
i = np.arange(0, x_pos.shape[0])
j = np.arange(0, x_pos.shape[1])
i_1, j_1 = np.meshgrid(i, j, indexing='ij')
res[y_pos.ravel(), x_pos.ravel()] = p_value[i_1.ravel(), j_1.ravel()]
First use consistent numpy 2d array indexing:
x = x_pos[i,j]
y = y_pos[i,j]
res[y,x] = p_value[i,j]
Now instead of scalar i,j use arrays that broadcast against each other:
i = np.arange(n)[:, None]; j = np.arange(m)
You didn't provide a [mcve] so I won't try to demonstrate that this works.
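A minimal sketch of that broadcasting approach (my own illustration; the array names follow the question):

import numpy as np

x_pos = np.array([[0, 2], [1, 1]])  # column indices
y_pos = np.array([[1, 0], [2, 0]])  # row indices
p_value = np.array([[10, 20], [30, 40]], dtype=np.uint8)

res = np.zeros((3, 3), dtype=np.uint8)
i = np.arange(x_pos.shape[0])[:, None]  # column vector, broadcasts over j
j = np.arange(x_pos.shape[1])
res[y_pos[i, j], x_pos[i, j]] = p_value[i, j]
print(res)
# [[ 0 40 20]
#  [10  0  0]
#  [ 0 30  0]]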
Thanks to @hpaulj's and @ai2ys' answers the problem is solved.
Here is a comparison of the results in terms of execution speed:
import numpy as np
import time

m_size = 4096
m_x = np.random.randint(0, m_size, (m_size, m_size), dtype=np.uint16)
m_y = np.random.randint(0, m_size, (m_size, m_size), dtype=np.uint16)
p_value = np.ones((m_size, m_size), dtype=np.uint8)

# Meshgrid method:
out = np.zeros((m_size, m_size), dtype=np.uint8)
start = time.time()
i = np.arange(0, m_x.shape[0])
j = np.arange(0, m_x.shape[1])
i_1, j_1 = np.meshgrid(i, j, indexing='ij')
out[m_x.ravel(), m_y.ravel()] = p_value[i_1.ravel(), j_1.ravel()]
end = time.time()
print("Meshgrid: {} s".format(end - start))

# No for loop method:
out = np.zeros((m_size, m_size), dtype=np.uint8)
start = time.time()
i = np.arange(m_x.shape[0])[:, None]  # column vector so i,j broadcast to the full 2D grid
j = np.arange(m_y.shape[1])
x = m_x[i, j]
y = m_y[i, j]
out[x, y] = p_value[i, j]
end = time.time()
print("No loop: {} s".format(end - start))

# For loop method:
out = np.zeros((m_size, m_size), dtype=np.uint8)
start = time.time()
for i in range(m_x.shape[0]):
    for j in range(m_y.shape[1]):
        x = m_x[i, j]
        y = m_y[i, j]
        out[x, y] = p_value[i, j]
end = time.time()
print("Nested loop: {} s".format(end - start))
#Output:
Meshgrid: 0.4837045669555664 s
No loop: 0.3600656986236572 s
Nested loop: 13.10097336769104 s
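One caveat worth adding (my own note, not from the answers above): when the coordinate matrices contain duplicates, fancy-index assignment writes each target once and in practice keeps the last value in memory order; if duplicates should be accumulated instead, np.add.at is the unbuffered alternative:

import numpy as np

idx = np.array([1, 1, 3])
vals = np.array([10, 20, 5])

out = np.zeros(5, dtype=np.int64)
out[idx] = vals            # buffered: index 1 keeps a single value
print(out)                 # [ 0 20  0  5  0]

out = np.zeros(5, dtype=np.int64)
np.add.at(out, idx, vals)  # unbuffered: duplicates accumulate, 10 + 20
print(out)                 # [ 0 30  0  5  0]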
Trying to accelerate a DP algorithm on python, numba seemed like an appropriate candidate.
I'm doing a subtraction of a 2D array with a 1D array which delivers a 3D array. I'm then using .argmin() along the 3rd dimension to obtain a 2D array. This works just fine with numpy, but doesn't with numba.
Toy code reproducing the issue :
from numba import jit
import numpy as np

inflow = np.arange(1, 0, -0.01)                      # Dim [T]
actions = np.arange(0, 1, 0.05)                      # Dim [M]
start_lvl = np.random.rand(500).reshape(-1, 1) * 49  # Dim [N x 1]
disc_lvl = np.arange(0, 1000)                        # Dim [O]

@jit(nopython=True)
def my_func(disc_lvl, actions, start_lvl, inflow):
    for i in range(0, 100):
        # Calculate new level at time i
        new_lvl = start_lvl + inflow[i] + actions      # Dim [N x M]
        # For each new_level element, find closest discretized level
        diff = (disc_lvl - new_lvl[:, :, np.newaxis])  # Dim [N x M x O]
        idx_lvl = abs(diff).argmin(axis=2)             # Dim [N x M]
    return True

# function works fine without numba
success = my_func(disc_lvl, actions, start_lvl, inflow)
Why does the code above not run? It does when taking out @jit(nopython=True).
Is there a workaround to make this calculation work with numba?
I've tried variants with numpy repeats & expand_dims, as well as defining explicitly the input types of the jit function without success.
There are a few things you need to change to make it work:
Adding a dimension with arr[:, :, None]: to Numba this looks like a generic getitem, so prefer reshape.
Use np.abs instead of the built-in abs.
argmin with the axis keyword argument is not implemented in Numba. Prefer explicit loops, which Numba is designed to optimize.
With all this fixed you can run the jitted function:
from numba import jit
import numpy as np

inflow = np.arange(1, 0, -0.01)                      # Dim [T]
actions = np.arange(0, 1, 0.05)                      # Dim [M]
start_lvl = np.random.rand(500).reshape(-1, 1) * 49  # Dim [N x 1]
disc_lvl = np.arange(0, 1000)                        # Dim [O]

@jit(nopython=True)
def my_func(disc_lvl, actions, start_lvl, inflow):
    for t in range(0, 100):  # renamed from i so the argmin loops below don't shadow it
        # Calculate new level at time t
        new_lvl = start_lvl + inflow[t] + actions  # Dim [N x M]
        # For each new_level element, find closest discretized level
        new_lvl_3d = new_lvl.reshape(*new_lvl.shape, 1)
        diff = np.abs(disc_lvl - new_lvl_3d)  # Dim [N x M x O]
        idx_lvl = np.empty(new_lvl.shape)
        for i in range(diff.shape[0]):
            for j in range(diff.shape[1]):
                idx_lvl[i, j] = diff[i, j, :].argmin()
    return True

# function now also works with numba
success = my_func(disc_lvl, actions, start_lvl, inflow)
Find below the corrected code of my first post, which you can execute with and without the jitted mode of the numba library (by removing the line that starts with @jit). I've observed a speed-up of a factor of 2 for this example.
from numba import jit
import numpy as np
import datetime as dt

inflow = np.arange(1, 0, -0.01)                      # Dim [T]
nbTime = np.shape(inflow)[0]
actions = np.arange(0, 1, 0.01)                      # Dim [M]
start_lvl = np.random.rand(500).reshape(-1, 1) * 49  # Dim [N x 1]
disc_lvl = np.arange(0, 1000)                        # Dim [O]

@jit(nopython=True)
def my_func(nbTime, disc_lvl, actions, start_lvl, inflow):
    # Initialize result
    res = np.empty((nbTime, np.shape(start_lvl)[0], np.shape(actions)[0]))
    for t in range(0, nbTime):
        # Calculate new level at time t
        new_lvl = start_lvl + inflow[t] + actions  # Dim [N x M]
        print(t)
        # For each new_level element, find closest discretized level
        new_lvl_3d = new_lvl.reshape(*new_lvl.shape, 1)
        diff = np.abs(disc_lvl - new_lvl_3d)  # Dim [N x M x O]
        idx_lvl = np.empty(new_lvl.shape)
        for i in range(diff.shape[0]):
            for j in range(diff.shape[1]):
                idx_lvl[i, j] = diff[i, j, :].argmin()
        res[t, :, :] = idx_lvl
    return res

# Call function and print running time
start_time = dt.datetime.now()
result = my_func(nbTime, disc_lvl, actions, start_lvl, inflow)
print('Execution time :', (dt.datetime.now() - start_time))
I already asked a similar question which got answered, but now this is more in detail:
I need a really fast way to get all the important component stats of two arrays, where one array is labeled by opencv2 and gives the component areas for both arrays. The stats for all components, masked on the two arrays, should then be saved to a dictionary. My approach works but it is much too slow. Is there something to avoid the loop, or a better approach than ndimage.labeled_comprehension?
from scipy import ndimage
import numpy as np
import cv2

def calculateMeanMaxMin(val):
    return np.array([np.mean(val), np.max(val), np.min(val)])

def getTheStatsForComponents(array1, array2):
    ret, thresholded = cv2.threshold(array2, 120, 255, cv2.THRESH_BINARY)
    thresholded = thresholded.astype(np.uint8)
    numLabels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresholded, 8, cv2.CV_8UC1)
    allComponentStats = []
    meanmaxminArray2 = ndimage.labeled_comprehension(array2, labels, np.arange(1, numLabels+1), calculateMeanMaxMin, np.ndarray, 0)
    meanmaxminArray1 = ndimage.labeled_comprehension(array1, labels, np.arange(1, numLabels+1), calculateMeanMaxMin, np.ndarray, 0)
    for position, label in enumerate(range(1, numLabels)):
        currentLabel = np.uint8(labels == label)
        contour, _ = cv2.findContours(currentLabel, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
        (side1, side2) = cv2.minAreaRect(contour[0])[1]
        componentStat = stats[label]
        allstats = {'position': centroids[label, :], 'area': componentStat[4], 'height': componentStat[3],
                    'width': componentStat[2], 'meanArray1': meanmaxminArray1[position][0], 'maxArray1': meanmaxminArray1[position][1],
                    'minArray1': meanmaxminArray1[position][2], 'meanArray2': meanmaxminArray2[position][0], 'maxArray2': meanmaxminArray2[position][1],
                    'minArray2': meanmaxminArray2[position][2]}
        if side1 >= side2 and side1 > 0:
            allstats['elongation'] = np.float32(side2 / side1)
        elif side2 > side1 and side2 > 0:
            allstats['elongation'] = np.float32(side1 / side2)
        else:
            allstats['elongation'] = np.float32(0)
        allComponentStats.append(allstats)
    return allComponentStats
EDIT
The two arrays are 2d arrays:
array1= np.random.choice(255,(512,512)).astype(np.uint8)
array2= np.random.choice(255,(512,512)).astype(np.uint8)
EDIT2
A small example of two arrays and the label array with two components (1 and 2, and background 0). Calculate the min, max, and mean with ndimage.labeled_comprehension.
from scipy import ndimage
import numpy as np
labelArray = np.array([[0,1,1,1],[2,2,1,1],[2,2,0,1]])
data = np.array([[0.1,0.2,0.99,0.2],[0.34,0.43,0.87,0.33],[0.22,0.53,0.1,0.456]])
data2 = np.array([[0.1,0.2,0.99,0.2],[0.1,0.2,0.99,0.2],[0.1,0.2,0.99,0.2]])
numLabels = 2
minimumDataForAllLabels = ndimage.labeled_comprehension(data, labelArray, np.arange(1, numLabels+1), np.min, np.ndarray, 0)
minimumData2ForallLabels = ndimage.labeled_comprehension(data2, labelArray, np.arange(1, numLabels+1), np.min, np.ndarray, 0)
print(minimumDataForAllLabels)
print(minimumData2ForallLabels)
print(bin_and_do_simple_stats(labelArray.flatten(),data.flatten()))
Output:
[0.2 0.22] ##minimum of component 1 and 2 from data
[0.2 0.1] ##minimum of component 1 and 2 from data2
[0.1 0.2 0.22] ##minimum output of bin_and_do_simple_stats from data
labeled_comprehension is definitely slow.
At least the simple stats can be done much faster based on the linked post. For simplicity I'm only doing one data array, but as the procedure returns sort indices it can be easily extended to multiple arrays:
import numpy as np
from scipy import sparse

try:
    from stb_pthr import sort_to_bins as _stb_pthr
    HAVE_PYTHRAN = True
except ImportError:
    HAVE_PYTHRAN = False

# fallback if pythran not available
def sort_to_bins_sparse(idx, data, mx=-1):
    if mx == -1:
        mx = idx.max() + 1
    aux = sparse.csr_matrix((data, idx, np.arange(len(idx)+1)), (len(idx), mx)).tocsc()
    return aux.data, aux.indices, aux.indptr

def sort_to_bins_pythran(idx, data, mx=-1):
    indices, indptr = _stb_pthr(idx, mx)
    return data[indices], indices, indptr

# pick best available
sort_to_bins = sort_to_bins_pythran if HAVE_PYTHRAN else sort_to_bins_sparse

# example data
idx = np.random.randint(0, 10, (100000,))
data = np.random.random(100000)

# if possible compare the two methods
if HAVE_PYTHRAN:
    dsp, isp, psp = sort_to_bins_sparse(idx, data)
    dph, iph, pph = sort_to_bins_pythran(idx, data)
    assert (dsp == dph).all()
    assert (isp == iph).all()
    assert (psp == pph).all()

# example how to do simple vectorized calculations
def simple_stats(data, iptr):
    mn = np.minimum.reduceat(data, iptr[:-1])
    mean = np.add.reduceat(data, iptr[:-1]) / np.diff(iptr)
    return mn, mean

def bin_and_do_simple_stats(idx, data, mx=-1):
    data, indices, indptr = sort_to_bins(idx, data, mx)
    return simple_stats(data, indptr)

print("minima: {}\n mean values: {}".format(*bin_and_do_simple_stats(idx, data)))
If you have pythran (not required but a bit faster), save the following as stb_pthr.py and compile it (pythran stb_pthr.py):
import numpy as np

#pythran export sort_to_bins(int[:], int)

def sort_to_bins(idx, mx):
    if mx == -1:
        mx = idx.max() + 1
    cnts = np.zeros(mx + 2, int)
    for i in range(idx.size):
        cnts[idx[i] + 2] += 1
    for i in range(2, cnts.size):
        cnts[i] += cnts[i-1]
    res = np.empty_like(idx)
    for i in range(idx.size):
        res[cnts[idx[i] + 1]] = i
        cnts[idx[i] + 1] += 1
    return res, cnts[:-1]
I am trying to perform image correlation to find which frame out of a set of 20 frames (the set is stored in a 3D array, x) matches best with a given frame (stored as a 2D array, y). This step has to be performed 1000 times.
I tried to vectorize the code to make it run faster, but somehow the vectorized version takes twice as long. I am probably doing something wrong in the vectorization process which is making it slower.
Here is the code
import numpy as np
import time

def corr2(a, b):
    # Getting shapes and preallocating the auxiliary variables
    k = np.shape(a)
    # Calculating mean values
    AM = np.mean(a)
    BM = np.mean(b)
    # Calculate vectors
    c_vect = (a - AM) * (b - BM)
    d_vect = (a - AM)**2
    e_vect = (b - BM)**2
    # Formula itself
    r_out = np.sum(c_vect) / float(np.sqrt(np.sum(d_vect) * np.sum(e_vect)))
    return r_out
def ZZ_1X_v1_MCC(R, RefImage):
    Cor = np.zeros(R.shape[2])
    for t in range(R.shape[2]):
        Cor[t] = corr2(RefImage, R[:, :, t])  # Correlation
    # Report
    max_correlationvalue_intermediate = np.amax(Cor)
    max_correlatedframe_intermediate = np.argmax(Cor)
    max_correlatedframeandvalue = [max_correlatedframe_intermediate, max_correlationvalue_intermediate]
    return max_correlatedframeandvalue
def ZZ_1X_v1_MCC_vectorized(R, RefImage):
    R_shape = np.asarray(np.shape(R))
    R_flattened = R.swapaxes(0, 2).reshape(R_shape[2], R_shape[0]*R_shape[1])
    AA = np.transpose(R_flattened)
    RefImageflattened = RefImage.transpose().ravel()
    # Calculating mean-subtracted values
    AAM = AA - np.mean(AA, axis=0)
    BM = RefImageflattened - np.mean(RefImageflattened)
    # Calculate vectors
    DD_vect = AAM**2
    E_vect = BM**2
    EE_vect = np.transpose(np.tile(np.transpose(E_vect), (R_shape[2], 1)))
    CC_vect = AAM * np.transpose(np.tile(BM, (R_shape[2], 1)))
    # Formula itself
    Cor = np.sum(CC_vect, axis=0) / np.sqrt((np.sum(DD_vect, axis=0) * np.sum(EE_vect, axis=0)).astype(float))
    # Report
    max_correlationvalue_intermediate = np.amax(Cor)
    max_correlatedframe_intermediate = np.argmax(Cor)
    max_correlatedframeandvalue = [max_correlatedframe_intermediate, max_correlationvalue_intermediate]
    return max_correlatedframeandvalue
x = np.arange(400000).reshape((20, 200, 100)).swapaxes(0, 2)  # 3D array with 20 frames
y = np.transpose(np.arange(20000).reshape((200, 100)))        # 2D array with 1 frame

# using the for loop
tic = time.time()
for i in range(500):
    [a, b] = ZZ_1X_v1_MCC(x, y)
print(time.time() - tic)

# using vectorization
tic = time.time()
for i in range(500):
    [a, b] = ZZ_1X_v1_MCC_vectorized(x, y)
print(time.time() - tic)
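An aside (my own observation, not part of the original post): corr2 is just the Pearson correlation coefficient of the flattened arrays, so np.corrcoef can serve as a reference when checking either variant:

import numpy as np

a = np.random.rand(200, 100)
b = np.random.rand(200, 100)
assert np.isclose(corr2(a, b), np.corrcoef(a.ravel(), b.ravel())[0, 1])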
I am trying to apply graph theory methods to an image processing problem. I want to generate an adjacency matrix from an array containing the points I want to graph. I want to generate a complete graph of the points in the array. If I have N points in the array that I need to graph, I will need an NxN matrix. The weights should be the distances between the points, so this is the code that I have:
''' vertexarray is an array where the points that are to be
included in the complete graph are True and all others False.'''
import numpy as np

def array_to_complete_graph(vertexarray):
    vertcoords = np.transpose(np.where(vertexarray == True))
    cg_array = np.eye(len(vertcoords))
    for idx, vals in enumerate(vertcoords):
        x_val_1, y_val_1 = vals
        for jdx, wals in enumerate(vertcoords):
            x_diff = wals[0] - vals[0]
            y_diff = wals[1] - vals[1]
            cg_array[idx, jdx] = np.sqrt(x_diff**2 + y_diff**2)
    return cg_array
This works, of course, but my question is: can this same array be generated without the nested for loops?
Use the function scipy.spatial.distance.cdist():
import numpy as np
from scipy.spatial.distance import cdist

def array_to_complete_graph(vertexarray):
    vertcoords = np.transpose(np.where(vertexarray == True))
    cg_array = np.eye(len(vertcoords))
    for idx, vals in enumerate(vertcoords):
        x_val_1, y_val_1 = vals
        for jdx, wals in enumerate(vertcoords):
            x_diff = wals[0] - vals[0]
            y_diff = wals[1] - vals[1]
            cg_array[idx, jdx] = np.sqrt(x_diff**2 + y_diff**2)
    return cg_array

arr = np.random.rand(10, 20) > 0.75

y, x = np.where(arr)
p = np.c_[x, y]
dist = cdist(p, p)
np.allclose(array_to_complete_graph(arr), dist)  # True
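A usage note (my own addition): since the matrix is symmetric with a zero diagonal, scipy.spatial.distance.pdist combined with squareform computes the same result while evaluating each pair only once:

from scipy.spatial.distance import pdist, squareform

dist_sq = squareform(pdist(p))  # same matrix as cdist(p, p)
assert np.allclose(dist_sq, dist)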