I'm working on an image processing program with OpenCV and numpy. For most pixel operations, I'm able to avoid nested for loops by using np.vectorize(), but one of the functions I need to implement requires as a parameter the 'distance from center', or basically the coordinates of the point being processed.
Pseudo-example:
myArr = [[0, 1, 2],
         [3, 4, 5]]

def myFunc(val, row, col):
    return [row, col]

f = np.vectorize(myFunc)
myResult = f(myArr, row, col)
I obviously can't get the row and column indices from inside the vectorized function, so is there another numpy function I could use in this situation, or do I have to use for loops? Is there a way to do it with OpenCV?
The function I need to put each pixel through is:
f(i, j) = 1 / (1 + d(i, j) / L), where d(i, j) is the Euclidean distance of the pixel from the center of the image.
You can get an array of distances from the center using the following lines (just one example; there are many ways to do this):
import numpy as np
myArr = np.array([[0,1,2], [3,4,5]])
nx, ny = myArr.shape
x = np.arange(nx) - (nx-1)/2.  # x and y are offsets from the center, in index units (normalizing the array length to 1 is the other common choice)
y = np.arange(ny) - (ny-1)/2.
X, Y = np.meshgrid(x, y, indexing='ij')  # 'ij' keeps d the same shape as myArr
d = np.sqrt(X**2 + Y**2)
# d =
# [[ 1.11803399  0.5         1.11803399]
#  [ 1.11803399  0.5         1.11803399]]
Then you can calculate f(i, j) by:
f = 1/(1 + d/L)
As an aside, your heavy use of np.vectorize() is a bit dubious. Are you sure it's doing what you want, and did you note the statement from the documentation:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
It's generally better to just write your code in vectorized form (like my line for f above, which works whether L is an array or a scalar) rather than relying on numpy.vectorize(); writing vectorized code and calling numpy.vectorize() are not the same thing.
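Putting the pieces together for an image-sized array (a minimal sketch; the shape and the value of L are placeholders, not from the question):

import numpy as np

h, w = 480, 640          # placeholder image shape (rows, cols)
L = 100.0                # placeholder scale constant from the question's formula

x = np.arange(h) - (h - 1) / 2.   # row offsets from the center
y = np.arange(w) - (w - 1) / 2.   # column offsets from the center
X, Y = np.meshgrid(x, y, indexing='ij')
d = np.sqrt(X**2 + Y**2)          # per-pixel distance from the center

f = 1.0 / (1.0 + d / L)           # same shape as the image, ready for elementwise use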
np.vectorize doesn't accelerate the code; you can vectorize it this way:
# This computes the distance between every point of MyArray and the center
dist_vector = np.sqrt(np.sum(np.power(center - MyArray, 2), axis=1))
# F will contain the target value for each point
F = 1. / (1 + 1. * dist_vector / L)
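Note that this assumes MyArray is an (N, 2) array of pixel coordinates and center is the center coordinate. A minimal sketch of what those inputs could look like (the shapes and values here are my assumption, not from the question):

import numpy as np

h, w = 480, 640                                          # placeholder image shape
rows, cols = np.mgrid[0:h, 0:w]
MyArray = np.column_stack([rows.ravel(), cols.ravel()])  # (N, 2) pixel coordinates
center = np.array([(h - 1) / 2.0, (w - 1) / 2.0])        # center coordinate
L = 100.0                                                # placeholder scale constant

With those shapes, dist_vector and F come out as flat (N,) arrays that can be reshaped back to (h, w) if needed.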
I have 1,000,000 agents, each associated with (x, y) coordinates. I am trying to find agents close to each other (radius = 1.5). I tried to implement this using PyTorch:
X = torch.DoubleTensor(1000000,2).uniform_(0,10000)
torch.cdist(X,X,p=2)
However, with this the session crashes. I am running this on Google Colab. The same happened when I tried constructing the graph using radius_neighbors_graph from the scikit-learn package. It would be of great help if someone suggested a memory-efficient way to implement this.
It's unlikely that you'll be able to compute a 1M x 1M matrix in its entirety without thinking it through very carefully. You probably want something along the lines of scipy.spatial.KDTree. Once you've constructed a tree, you can pass the coordinates of an agent to the query_ball_point method to get its neighbors within a certain radius. To get all the neighbors at once, you can compute something like the sparse_distance_matrix of the tree with itself at an appropriate threshold.
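A minimal sketch of that idea (scaled down here; with the full 1,000,000 points the tree queries are still feasible, though the sparse matrix can grow if the radius captures many pairs):

import numpy as np
from scipy.spatial import KDTree

X = np.random.uniform(0, 10000, size=(100_000, 2))  # smaller stand-in for the 1M agents

tree = KDTree(X)

# Neighbors of a single agent within radius 1.5:
neighbors_of_first = tree.query_ball_point(X[0], r=1.5)

# All pairwise distances below the threshold, kept as a sparse matrix:
pairs = tree.sparse_distance_matrix(tree, max_distance=1.5)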
Alternatively, you can look into any number of efficient clustering algorithms.
I found three solutions:
Solution 1
import torch
x = torch.randn(3000000, 2).cuda()
y = x
# Turn our Tensors into KeOps symbolic variables:
from pykeops.torch import LazyTensor
x_i = LazyTensor( x[:,None,:] )
y_j = LazyTensor( y[None,:,:] )
# We can now perform large-scale computations, without memory overflows:
D_ij = ((x_i - y_j)**2).sum(dim=2)
D_ij.argKmin(20,dim=1)
Solution 2
M = 3000000
import numpy as np
from pykeops.numpy import LazyTensor as LazyTensor_np
x = np.random.rand(M, 2)
y = x
x_i = LazyTensor_np(
    x[:, None, :]
)  # (M, 1, 2) KeOps LazyTensor, wrapped around the numpy array x
y_j = LazyTensor_np(
    y[None, :, :]
)  # (1, N, 2) KeOps LazyTensor, wrapped around the numpy array y
D_ij = ((x_i - y_j) ** 2).sum(-1) # **Symbolic** (M, N) matrix of squared distances
s_i = D_ij.argKmin(20, dim=1).ravel()  # genuine flattened (M*20,) array of integer indices
Solution 3
from sklearn.neighbors import NearestNeighbors
import numpy as np
M = 3000000
x = np.random.rand(M, 2)
nbrs = NearestNeighbors(n_neighbors=20, algorithm='ball_tree').fit(x)
distances, indices = nbrs.kneighbors(x)
Although the execution time of all three solutions is about the same (roughly a minute), the memory requirements are approximately 2 GB, 1 GB, and 1.3 GB, respectively. It would be great to hear ideas that lower the execution time.
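If the goal is specifically all neighbors within radius 1.5 (as in the original question) rather than the 20 nearest, the scikit-learn estimator from Solution 3 also exposes a radius query; a sketch reusing nbrs and x from above:

# For each point, distances and indices of all neighbors within radius 1.5
distances, indices = nbrs.radius_neighbors(x, radius=1.5)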
I'm facing a problem with vectorizing a function so that it applies efficiently on a numpy array.
My program's inputs:
A pos_part 2D array with Nb_particles rows and 3 columns (basically x, y, z coordinates; only z is relevant for the part that bothers me). Nb_particles can be up to several hundred thousand.
A prop_part 1D array with Nb_particles values. This part I have covered; it is created with some nice numpy functions, and I just put a basic distribution here that resembles the real values.
A z_distances 1D array, a simple np.arange between z=0 and z=z_max.
Then comes the calculation that takes time, because I can't find a way to do it properly with numpy array operations alone. What I want to do is:
For each distance z_i in z_distances, sum all values from prop_part whose corresponding particle coordinate satisfies z_particle < z_i. This returns a 1D array the same length as z_distances.
My ideas so far:
Version 0: a for loop, with enumerate and np.where to retrieve the indices of the values I need to sum. Obviously quite slow.
Version 1: a mask on a new array (a combination of z coordinates and particle properties), then a sum over the masked array. Seems better than v0.
Version 2: another mask plus np.vectorize, though I understand that's not efficient since vectorize is basically a for loop. Still seems better than v0.
Version 3: I'm trying to use a mask inside a function that I can apply directly to z_distances, but it's not working so far.
So, here I am. Maybe there is something to be done with a sort and a cumulative sum, but I don't know how to do it, so any help would be greatly appreciated. Please find the code below to make things clearer.
Thanks in advance.
import numpy as np
import time
import matplotlib.pyplot as plt
# Creation of particles' positions
Nb_part = 150_000
pos_part = 10*np.random.rand(Nb_part,3)
pos_part[:,0] = pos_part[:,1] = 0
# useful property creation
beta = 1/1.5
prop_part = (1/beta)*np.exp(-pos_part[:,2]/beta)
z_distances = np.arange(0,10,0.1)
#my version 0
t0=time.time()
result = np.empty(len(z_distances))
for index_dist, val_dist in enumerate(z_distances):
    positions = np.where(pos_part[:,2]<val_dist)[0]
    result[index_dist] = sum(prop_part[i] for i in positions)
print("v0 :",time.time()-t0)
#A graph to help understand
plt.figure()
plt.plot(z_distances,result, c="red")
plt.ylabel("Sum of particles' usefull property for particles with z-pos<d")
plt.xlabel("d")
#version 1 ??
t1=time.time()
combi = np.column_stack((pos_part[:,2],prop_part))
result2 = np.empty(len(z_distances))
for index_dist, val_dist in enumerate(z_distances):
    mask = (combi[:,0]<val_dist)
    result2[index_dist] = sum(combi[:,1][mask])
print("v1 :",time.time()-t1)
plt.plot(z_distances,result2, c="blue")
#version 2
t2=time.time()
def themask(a):
    mask = (combi[:,0]<a)
    return sum(combi[:,1][mask])
thefunc = np.vectorize(themask)
result3 = thefunc(z_distances)
print("v2 :",time.time()-t2)
plt.plot(z_distances,result3, c="green")
### This does not work so far
# version 3
# =============================
# t3=time.time()
# def thesum(a):
#     mask = combi[combi[:,0]<a]
#     return sum(mask[:,1])
# result4 = thesum(z_distances)
# print("v3 :",time.time()-t3)
# =============================
You can get a lot more performance by writing your first version completely in numpy. Replace Python's sum with np.sum. Instead of the for i in positions generator expression, simply pass the positions mask you are creating anyway.
Indeed, the np.where is not necessary and my best version looks like:
#my version 0
t0=time.time()
result = np.empty(len(z_distances))
for index_dist, val_dist in enumerate(z_distances):
    positions = pos_part[:, 2] < val_dist
    result[index_dist] = np.sum(prop_part[positions])
print("v0 :",time.time()-t0)
# out: v0 : 0.06322097778320312
If z_distances is very long, you can get a bit more speed by using numba.
Running calc for the first time usually incurs some compilation overhead, which we can get rid of by first running the function on a small subset of z_distances.
The below code achieves roughly a factor of two speedup over pure numpy on my laptop.
import numba as nb
@nb.njit(parallel=True)
def calc(result, z_distances):
    n = z_distances.shape[0]
    for ii in nb.prange(n):
        pos = pos_part[:, 2] < z_distances[ii]
        result[ii] = np.sum(prop_part[pos])
    return result
result4 = np.zeros_like(result)
# _t = time.time()
# calc(result4, z_distances[:10])
# print(time.time()-_t)
t3 = time.time()
result4 = calc(result4, z_distances)
print("v3 :", time.time()-t3)
plt.plot(z_distances, result4)
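The sort plus cumulative-sum idea mentioned in the question also works and removes the loop over z_distances entirely; a sketch (my own addition, reusing the arrays defined in the question's code):

# Sort particles by z once and accumulate their properties
order = np.argsort(pos_part[:, 2])
z_sorted = pos_part[order, 2]
prop_cumsum = np.cumsum(prop_part[order])

# For each distance d, count particles with z < d, then read off the partial sum
counts = np.searchsorted(z_sorted, z_distances, side='left')
result5 = np.where(counts > 0, prop_cumsum[counts - 1], 0.0)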
I have two arrays of different sizes containing point coordinates as shapely.geometry.Point objects.
Eg:
[Point(X Y), Point(X Y)...]
[Point(X Y), Point(X Y)...]
I would like to create a "cross product" of these two arrays with a distance function. Distance function is from shapely.geometry, which is a simple geometry vector distance calculation. I am tryibg to create distance matrix between M:N points:
Right now I have this function:
import numpy as np
import pandas as pd
import geopandas as gpd

source = gpd.read_file(source)
near = gpd.read_file(near)
source_list = source.geometry.values.tolist()
near_list = near.geometry.values.tolist()

array = np.empty((len(source.ID_SOURCE), len(near.ID_NEAR)))
for index_source, item_source in enumerate(source_list):
    for index_near, item_near in enumerate(near_list):
        array[index_source, index_near] = item_source.distance(item_near)

df_matrix = pd.DataFrame(array, index=source.ID_SOURCE, columns=near.ID_NEAR)
This does the job fine, but it is slow: 4000 x 4000 points takes around 100 seconds (I have datasets which are way bigger, so speed is the main issue). I would like to avoid this double loop if possible. I tried to do it in a pandas dataframe (which has terrible speed):
for index_source, item_source in source.iterrows():
    for index_near, item_near in near.iterrows():
        df_matrix.at[index_source, index_near] = item_source.geometry.distance(item_near.geometry)
A bit faster is (but still 4x slower than numpy):
for index_source, item_source in enumerate(source_list):
    for index_near, item_near in enumerate(near_list):
        df_matrix.at[index_source, index_near] = item_source.distance(item_near)
Is there a faster way to do this? I guess there is, but I have no idea how to proceed. I might be able to chunk the dataframe into smaller pieces, send each chunk to a different core, and concatenate the results - that's the last resort. If it can be done with numpy and some indexing magic alone, I can send it to the GPU and be done with it in no time. But the double for loop is a no-go right now. Also, I would like to avoid any library other than Pandas/Numpy. I could use SAGA processing and its Point Distances module (http://www.saga-gis.org/saga_tool_doc/2.2.2/shapes_points_3.html), which is pretty damn fast, but I am looking for a Python-only solution.
If you can get the coordinates in separate vectors, I would try this:
import numpy as np
x = np.asarray([5.6, 2.1, 6.9, 3.1]) # Replace with data
y = np.asarray([7.2, 8.3, 0.5, 4.5]) # Replace with data
x_i = x[:, np.newaxis]
x_j = x[np.newaxis, :]
y_i = y[:, np.newaxis]
y_j = y[np.newaxis, :]
d = (x_i-x_j)**2+(y_i-y_j)**2
np.sqrt(d, out=d)
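For the question's two GeoDataFrames of points, the coordinate vectors can be pulled out with geopandas' .x/.y accessors and the same broadcasting applied across the two sets; a sketch reusing the names from the question (this assumes all geometries are points):

import numpy as np
import pandas as pd

x_s = source.geometry.x.to_numpy()
y_s = source.geometry.y.to_numpy()
x_n = near.geometry.x.to_numpy()
y_n = near.geometry.y.to_numpy()

d = np.sqrt((x_s[:, np.newaxis] - x_n[np.newaxis, :])**2
            + (y_s[:, np.newaxis] - y_n[np.newaxis, :])**2)
df_matrix = pd.DataFrame(d, index=source.ID_SOURCE, columns=near.ID_NEAR)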
I'm trying to convert this version of interp2 from Matlab to Python.
In Matlab it is used as
Vq = interp2(V,k)
which performs interpolation over a matrix V in which each original interval is recursively subdivided k times, adding a total of 2^k - 1 elements to each interval.
However, I haven't found a Python alternative to this function. I tried scipy.interpolate.interp2d, but it only works with three arrays (x, y, and z), not with a single matrix plus a refinement factor.
I found this alternative in a forum; it looks like an email transcription, but I'll paste the answer here anyway.
import numpy as np
def interp2d_interleave(z, n):
    '''performs linear interpolation on a grid

    all points are interpolated in one step, not recursively

    Parameters
    ----------
    z : 2d array (M,N)
    n : int
        number of points interpolated

    Returns
    -------
    zi : 2d array ((M-1)*n+M, (N-1)*n+N)
        original and linear interpolated values
    '''
    frac = np.atleast_2d(np.arange(0, n+1) / (1.0+n)).T
    zi1 = np.kron(z[:, :-1], np.ones(len(frac))) + np.kron(np.diff(z), frac.T)
    zi1 = np.hstack((zi1, z[:, -1:]))
    zi2 = np.kron(zi1.T[:, :-1], np.ones(len(frac))) + np.kron(np.diff(zi1.T), frac.T)
    zi2 = np.hstack((zi2, zi1.T[:, -1:]))
    return zi2.T

def interp2d_interleave_recursive(z, n):
    '''interpolates by recursively interleaving n times
    '''
    zi = z.copy()
    for ii in range(1, n+1):
        zi = interp2d_interleave(zi, 1)
    return zi
This should be used as follows:
xyz = np.zeros((2, 2))
xyz = interp2d_interleave_recursive(xyz, 1)
And the result would be a 3x3 array of zeros.
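As a quick sanity check on a non-constant input (my own example, not part of the original forum answer), one level of interleaving fills in the bilinear midpoints:

z = np.array([[0., 2.],
              [4., 6.]])
zi = interp2d_interleave_recursive(z, 1)
# zi:
# [[0. 1. 2.]
#  [2. 3. 4.]
#  [4. 5. 6.]]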
I have a simple 2x2 transformation matrix, s, which encodes some linear transformation of coordinates such that X' = sX.
I have generated a set of uniformly distributed coordinates on a grid using the np.meshgrid() function, and at the moment I traverse each coordinate and apply the transformation coordinate by coordinate. Unfortunately, this is very slow for large arrays. Are there any fast ways of doing this? Thanks!
import numpy as np
image_dimension = 1024
image_index = np.arange(0,image_dimension,1)
xx, yy = np.meshgrid(image_index,image_index)
# Pre-calculated Transformation Matrix.
s = np.array([[ -2.45963439e+04, -2.54997726e-01], [ 3.55680731e-02, -2.48005486e+04]])
xx_f = xx.flatten()
yy_f = yy.flatten()
for x_t in range(0, image_dimension*image_dimension):
    # Get the current (x,y) coordinate.
    x_y_in = np.matrix([[xx_f[x_t]],[yy_f[x_t]]])
    # Perform the transformation with x.
    optout = s * x_y_in
    # Store the new coordinate.
    xx_f[x_t] = np.array(optout)[0][0]
    yy_f[x_t] = np.array(optout)[1][0]
# Reshape Output
xx_t = xx_f.reshape((image_dimension, image_dimension))
yy_t = yy_f.reshape((image_dimension, image_dimension))
You can use the numpy dot function to get the dot product of your matrices:
xx_tn,yy_tn = np.dot(s,[xx.flatten(),yy.flatten()])
xx_t = xx_tn.reshape((image_dimension, image_dimension))
yy_t = yy_tn.reshape((image_dimension, image_dimension))
This is much faster.
Loops are slow in Python. It is better to use vectorization.
In a nutshell, the idea is to let numpy do the loops in C, which is much faster.
You can express your problem as matrix multiplications X' = sX, where you put all the points in X and transform them all with just one call to numpy's dot product:
xy = np.vstack([xx.ravel(), yy.ravel()])
xy_t = np.dot(s, xy)
xx_t, yy_t = xy_t.reshape((2, image_dimension, image_dimension))
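As a quick check (my own snippet, not from either answer), the vectorized product matches the original loop on a small grid:

import numpy as np

s = np.array([[-2.45963439e+04, -2.54997726e-01],
              [ 3.55680731e-02, -2.48005486e+04]])
n = 8
xx, yy = np.meshgrid(np.arange(n), np.arange(n))

# Loop version, on float copies so the results are not truncated to integers
xx_f = xx.ravel().astype(float)
yy_f = yy.ravel().astype(float)
for k in range(n * n):
    out = s @ np.array([xx_f[k], yy_f[k]])
    xx_f[k], yy_f[k] = out

# Vectorized version
xy_t = s @ np.vstack([xx.ravel(), yy.ravel()])

assert np.allclose(xy_t[0], xx_f) and np.allclose(xy_t[1], yy_f)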