I have been searching for a python alternative to MATLAB's inpolygon() and I have come across contains_points as a good option.
However, the docs are a little bare with no indication of what type of data contains_points expects:
contains_points(points, transform=None, radius=0.0)
Returns a bool array which is True if the path contains the corresponding point.
If transform is not None, the path will be transformed before performing the test.
radius allows the path to be made slightly larger or smaller.
I have the polygon stored as an n*2 numpy array (where n is quite large, ~500). As far as I can see, I need to construct a Path object from this data, which seems to work OK:
poly_path = Path(poly_points)
At the moment I also have the points I wish to test stored as another n*2 numpy array (catalog_points).
Perhaps my problem lies here? As when I run:
in_poly = poly_path.contains_points(catalog_points)
I get back an ndarray containing False for every value no matter the set of points I use (I have tested this on arrays of points well within the polygon).
Often in these situations, I find the source to be illuminating...
We can see that the source for path.contains_point accepts a container that has at least 2 elements. The source for contains_points is a bit harder to follow, since it calls through to a C function, Py_points_in_path. It seems that this function accepts an iterable that yields elements of length 2:
>>> from matplotlib import path
>>> p = path.Path([(0,0), (0, 1), (1, 1), (1, 0)]) # square with legs length 1 and bottom left corner at the origin
>>> p.contains_points([(.5, .5)])
array([ True], dtype=bool)
Of course, we could use a numpy array of points as well:
>>> points = np.array([.5, .5]).reshape(1, 2)
>>> points
array([[ 0.5, 0.5]])
>>> p.contains_points(points)
array([ True], dtype=bool)
And just to check that we aren't always just getting True:
>>> points = np.array([.5, .5, 1, 1.5]).reshape(2, 2)
>>> points
array([[ 0.5, 0.5],
[ 1. , 1.5]])
>>> p.contains_points(points)
array([ True, False], dtype=bool)
Make sure that the vertices are ordered as intended. Below, the vertices are ordered in a way that makes the resulting path a pair of triangles rather than a rectangle, so contains_points only returns True for points inside either of the triangles.
>>> p = path.Path(np.array([bfp1, bfp2, bfp4, bfp3]))
>>> p
Path([[ 5.53147871 0.78330843]
[ 1.78330843 5.46852129]
[ 0.53147871 -3.21669157]
[-3.21669157 1.46852129]], None)
>>> IsPointInside = np.array([[1, 2], [1, 9]])
>>> IsPointInside
array([[1, 2],
[1, 9]])
>>> p.contains_points(IsPointInside)
array([False, False], dtype=bool)
>>>
The output for the first point would have been True if bfp3 and bfp4 were swapped.
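For illustration, here is a minimal sketch using the vertex values printed above, with the vertices reordered so that they trace the quadrilateral's perimeter; the first test point now registers as inside:
>>> p = path.Path(np.array([[ 5.53147871,  0.78330843],
...                         [ 1.78330843,  5.46852129],
...                         [-3.21669157,  1.46852129],
...                         [ 0.53147871, -3.21669157]]))
>>> p.contains_points(IsPointInside)
array([ True, False], dtype=bool)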
I wrote this function to return a boolean array, like MATLAB's inpolygon function. Note that it only finds points strictly inside the given polygon; you can't detect points on the edge of the polygon with this function.
import numpy as np
from matplotlib import path
def inpolygon(xq, yq, xv, yv):
    shape = xq.shape
    xq = xq.reshape(-1)
    yq = yq.reshape(-1)
    xv = xv.reshape(-1)
    yv = yv.reshape(-1)
    q = [(xq[i], yq[i]) for i in range(xq.shape[0])]
    p = path.Path([(xv[i], yv[i]) for i in range(xv.shape[0])])
    return p.contains_points(q).reshape(shape)
You can call the function as:
xv = np.array([0.5,0.2,1.0,0,0.8,0.5])
yv = np.array([1.0,0.1,0.7,0.7,0.1,1])
xq = np.array([0.1,0.5,0.9,0.2,0.4,0.5,0.5,0.9,0.6,0.8,0.7,0.2])
yq = np.array([0.4,0.6,0.9,0.7,0.3,0.8,0.2,0.4,0.4,0.6,0.2,0.6])
print(inpolygon(xq, yq, xv, yv))
As in the MATLAB documentation, this function returns in, indicating whether the query points specified by xq and yq are inside or on the edge of the polygon area defined by xv and yv (except that, as noted above, this Python version does not detect points on the edge).
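If you do need points on the edge to count as inside, one possible workaround (a sketch, not an exact equivalent of MATLAB's behaviour) is to pass a small positive radius to contains_points, which pads the path slightly:
import numpy as np
from matplotlib import path
def inpolygon_with_edges(xq, yq, xv, yv, tol=1e-9):
    # Sketch: pad the polygon outline by a tiny radius so that points lying
    # exactly on an edge test as inside. Note that whether a positive or
    # negative radius enlarges the path depends on the winding direction of
    # the vertices, so you may need -tol for the opposite orientation.
    p = path.Path(np.column_stack((xv.ravel(), yv.ravel())))
    q = np.column_stack((xq.ravel(), yq.ravel()))
    return p.contains_points(q, radius=tol).reshape(xq.shape)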
I am on my way to understand a vectorized approach to calculating (and plotting) Julia sets. On the web, I found the following code (annotations are mainly mine, based on my growing understanding of the ideas behind the code):
import numpy as np
import matplotlib.pyplot as plt
c = -0.74543+0.11301j # Example value for this picture (Julia set)
n = 512 # Maximum number of iterations
x = np.linspace(-1.5, 1.5, 2000).reshape((1, 2000)) # 1 row, 2000 columns
y = np.linspace(-1.2, 1.2, 1600).reshape((1600, 1)) # 1600 rows, 1 column
z = x + 1j*y # z is an array with 1600 * 2000 complex entries
c = np.full(z.shape, c) # c is a complex number matrix to be added for the iteration
diverge = np.zeros(z.shape) # 1600 * 2000 zeroes (0s), contains divergent iteration counts
m = np.full(z.shape, True) # 1600 * 2000 True, used as a kind of mask (convergent values)
for i in range(0,n):         # Do at most n iterations
    z[m] = z[m]**2 + c[m]    # Matrix op: Complex iteration for fixed c (Julia set perspective)
    m[np.abs(z) > 2] = False # threshold for convergence of absolute(z) is 2
    diverge[m] = i
plt.imshow(diverge, cmap='magma') # Color map "magma" applied to the iterations for each point
plt.show() # Display image plotted
I don't understand the mechanics of the line
diverge[m] = i
I gather that m is a 1600*2000 element array of Booleans. It seems that m is used as a kind of mask to let stand only those values in diverge[] for which the corresponding element in m is True. Yet I would like to understand this concept in greater detail. The syntax diverge[m] = i seems to imply that an array is used as some sort of generalized "index" to another array (diverge), and I could use some help understanding this concept. (The code runs as expected, I just have problems understanding the working of it.)
Thank you.
Yes, you can use an array to index another, in many, many ways; it is a complex matter. Even though I flatter myself that I understand numpy quite well by now, I still sometimes encounter array indexing that makes me scratch my head a little before I understand it.
But this case is not a very complex one:
M=np.array([[1,2,3],
[4,5,6],
[7,8,9]])
msk=np.array([[True, False, True],
[True, True, True],
[False, True, False]])
M[msk]
Returns array([1, 3, 4, 5, 6, 8]). You can, I am sure, easily understand the logic.
But more importantly, such an indexing expression is an l-value. That means M[msk] can appear on the left side of =, and then the selected values of M are modified.
So, that means that
M[msk]=0
M
shows
array([[0, 2, 0],
[0, 0, 0],
[7, 0, 9]])
Likewise
M=np.array([[1,2,3],
[4,5,6],
[7,8,9]])
A=np.array([[2,2,4],
[4,6,6],
[8,8,8]])
msk=np.array([[True, False, True],
[True, True, True],
[False, True, False]])
M[msk] = M[msk]+A[msk]
M
Result is
array([[ 3, 2, 7],
[ 8, 11, 12],
[ 7, 16, 9]])
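One detail worth knowing: on the right-hand side of =, M[msk] produces a flattened copy of the selected values, so modifying that copy does not touch M; only direct assignment through M[msk] = ... writes back. A quick illustration:
V = M[msk]   # V is a new 1-D array holding copies of the selected values
V[:] = 99    # modifies the copy only; M is unchanged
M[msk] = 99  # direct masked assignment, by contrast, does write into M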
So, back to your case:
z[m] = z[m]**2 + c[m] # Matrix op: Complex iteration for fixed c (Julia set perspective)
is essentially just an optimisation. You could have written simply z = z**2 + c, but what would be the point of computing that even where divergence has already occurred? So it computes z = z**2 + c only where no divergence has happened yet.
m[np.abs(z) > 2] = False # threshold for convergence of absolute(z) is 2
np.abs(z) > 2 is a 2D array of True/False values. m is set to False for every "pixel" for which |z| > 2; other values of m remain unchanged, so they stay False if they were already False. Note that this line is slightly over-complicated: because of the previous line, z no longer changes once |z| > 2, so in reality there is no pixel where np.abs(z) <= 2 and yet m is already False. So
m = np.abs(z) <= 2
would have worked as well. And it would not have been slower, since the original version computes np.abs(z) > 2 anyway. In fact, it is faster, since we spare the fancy-indexing assignment: on my computer this version runs 1.3 seconds faster than the original (on a 12-second computation, so roughly 10%).
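For concreteness, a minimal sketch of the iteration loop with that simplification applied (same variables as in the original code):
for i in range(0, n):
    z[m] = z[m]**2 + c[m]  # iterate only the not-yet-diverged pixels
    m = np.abs(z) <= 2     # recompute the "still converging" mask directly
    diverge[m] = i         # record the latest iteration reached before divergence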
But the original version has the merit of making the next line easier to understand, because it makes one point clear: m starts with all True values, then some values turn False as the algorithm runs, and none ever become True again.
diverge[m] = i
m is the mask of pixels that have not yet diverged (it starts all True, and as the iterations proceed, more and more values of m turn False).
So this line updates diverge to i everywhere that divergence has not yet occurred (the variable name is not the most apt).
So a pixel whose z value becomes > 2 at iteration 50, and whose m value therefore became False at iteration 50, would have been updated to 0, then 1, then 2, ..., then 48, then 49 by this line, but not to 50, 51, ...
So at the end, what stays in diverge is the last i for which m was still True, that is, the last i for which the algorithm was still converging; or, shifted by one unit, the first i at which it diverges.
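A tiny 1D toy run makes this "last i for which m was still True" behaviour visible:
>>> import numpy as np
>>> diverge = np.zeros(3)
>>> m = np.array([True, True, True])
>>> for i in range(5):
...     if i == 2: m[0] = False  # pretend pixel 0 diverges at iteration 2
...     if i == 4: m[1] = False  # pretend pixel 1 diverges at iteration 4
...     diverge[m] = i
...
>>> diverge  # pixel 0 froze at 1, pixel 1 froze at 3, pixel 2 kept updating
array([ 1.,  3.,  4.])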
I'm trying to use scipy's LinearNDInterpolatorExtrapolate.
The following minimal code should be as trivial as possible, yet it returns an error
from scipy.interpolate import NearestNDInterpolator
points = [[0,0,0], [1,0,0], [1,1,0],[0,1,0],[.5,.5,1]]
values = [1,2,3,4,5]
interpolator = NearestNDInterpolator(points,values)
interpolator([.5,.5,.8])
returns
TypeError: only integer scalar arrays can be converted to a scalar index
The error seems to come from line 81 of scipy/interpolate/ndgriddata.py. Unfortunately I could not chase the error further, as I don't understand what tree.query returns.
Is this a bug or I'm doing something wrong?
In your case, it seems to be a problem with the value types. Because the first values of points and values are Python integers, the rest are interpreted as integers as well.
The following fixes your code and returns the correct answer, which is [5]:
import numpy as np
from scipy.interpolate import NearestNDInterpolator
points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],[0, 1, 0],[.5, .5, 1]])
values = np.array([1, 2, 3, 4, 5])
interpolator = NearestNDInterpolator(points, values)
interpolator(np.array([[.5, .5, .8]]))
# array([5])
Notice two things:
I imported numpy and used np.array. This is the preferable way to work with scipy, because np.array, albeit static, is much faster compared to Python's list and provides a spectrum of mathematical operations.
When calling interpolator, I used [[...]] instead of your [...]. Why? It highlights the fact that NearestNDInterpolator can interpolate values at multiple points.
Pass your input as arrays:
interpolator = NearestNDInterpolator(np.array(points), np.array(values))
You can even pass many points:
interpolator([np.array([.5, .5, .8]), np.array([1, 1, 2])])
# array([5, 5])
Just pass the query point as a tuple of x-values, rather than a list:
from scipy.interpolate import NearestNDInterpolator
points = [[0,0,0], [1,0,0], [1,1,0],[0,1,0],[.5,.5,1]]
values = [1,2,3,4,5]
interpolator = NearestNDInterpolator(points,values)
interpolator((.5,.5,.8))
# 5
If you want to stick to passing lists, you can unpack the list contents using * as
interpolator(*[.5,.5,.8])
For interpolating at more than one point, you can map the interpolator onto your list of points (tuples):
answer = list(map(interpolator, [(.5,.5,.8), (.05, 1.6, 2.9)]))
# [5, 5]
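Rather than mapping the interpolator over Python tuples one point at a time, note that (as the first answer shows) it also accepts an (n, d) array of query points in a single vectorized call; a sketch consolidating the approaches above:
import numpy as np
from scipy.interpolate import NearestNDInterpolator
points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [.5, .5, 1]])
values = np.array([1, 2, 3, 4, 5])
interpolator = NearestNDInterpolator(points, values)
queries = np.array([[.5, .5, .8], [.05, 1.6, 2.9]])  # one row per query point
interpolator(queries)
# array([5, 5])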
This is the Python Code:
import numpy as np
def find_nearest_vector(array, value):
    idx = np.array([np.linalg.norm(x+y) for (x,y) in array-value]).argmin()
    return array[idx]
A = np.random.random((10,2))*100
""" A = array([[ 34.19762933, 43.14534123],
[ 48.79558706, 47.79243283],
[ 38.42774411, 84.87155478],
[ 63.64371943, 50.7722317 ],
[ 73.56362857, 27.87895698],
[ 96.67790593, 77.76150486],
[ 68.86202147, 21.38735169],
[ 5.21796467, 59.17051276],
[ 82.92389467, 99.90387851],
[ 6.76626539, 30.50661753]])"""
pt = [6, 30]
print(find_nearest_vector(A, pt))
#array([ 6.76626539, 30.50661753])
Can somebody explain to me, step by step, how find_nearest_vector() gets the nearest vector? Can someone show me a trace of this function? Thank you.
From Wikipedia, the L2 (Euclidean) norm of a vector x = (x1, ..., xn) is defined as ||x||_2 = sqrt(x1^2 + ... + xn^2).
np.linalg.norm simply implements this formula in numpy, but only works for two points at a time. Additionally, it appears your implementation is incorrect; as @unutbu pointed out, it only happens to work by chance in some cases.
If you want to vectorize this, I'd recommend implementing the L2 norm yourself with vectorised numpy.
This works when pt is a 1D array:
>>> pt = np.array(pt)
>>> A[((A - pt[ None, :]) ** 2).sum(1).argmin()]
array([ 6.76626539, 30.50661753])
Note, the closest point will have the smallest L2 norm as well as the smallest squared L2 norm, so this is, in a sense, even more efficient than np.linalg.norm which additionally computes the square root.
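For reference, newer versions of NumPy let np.linalg.norm operate along an axis, so an equivalent one-liner (computing the true distances rather than their squares) would be:
>>> A[np.linalg.norm(A - pt, axis=1).argmin()]
array([ 6.76626539, 30.50661753])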
I am trying to get the x and y coordinates of a given value in a numpy image array.
I can do it by running through the rows and columns manually with a for statement, but this seems rather slow and I am positive there is a better way to do this.
I was trying to modify a solution I found in this post: Finding the (x,y) indexes of specific (R,G,B) color values from images stored in NumPy ndarrays.
a = image
c = intensity_value
y_locs = np.where(np.all(a == c, axis=0))
x_locs = np.where(np.all(a == c, axis=1))
return np.int64(x_locs), np.int64(y_locs)
I use np.int64 to convert the values back to int64.
I was also looking at numpy.where documentation
I don't quite understand the problem. The axis parameter in all() should run over the colour channels (axis 2 or -1) rather than the x and y indices. Then where() will give you the coordinates of the matching values in the image:
>>> # set up data
>>> image = np.zeros((5, 4, 3), dtype=int)
>>> image[2, 1, :] = [7, 6, 5]
>>> # find indices
>>> np.where(np.all(image == [7, 6, 5], axis=-1))
(array([2]), array([1]))
>>>
This is really just repeating the answer you linked to, but it is a bit too long for a comment. Maybe you could explain a bit more why you need to modify the previous answer? It doesn't seem like you do need to.
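One small addition: if you want the matches as (row, column) pairs rather than two separate index arrays, np.argwhere does the zipping for you:
>>> np.argwhere(np.all(image == [7, 6, 5], axis=-1))
array([[2, 1]])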
I have two sets of 2D points. I want to see whether set B is included completely or partially in the convex hull of set A, according to Euclidean coordinates.
To explain inclusion, the following example might help.
Let's consider the following sets:
A={(5,5),(10,10),(5,10),(0,5)}
B={(3,3),(5,8)} partially included in convex hull of A
C={(1,5),(5,8)} fully included in convex hull of A
D={(1,1),(3,3)} is not included in convex hull of A
Thanks a lot
One way to find the convex hull of a set of points in Python is to use the Delaunay triangulation function in scipy.spatial. Given a set of points, it returns an object which has a convex_hull attribute; this is an array consisting of pairs of indices into the original set of points, which correspond to edges of the polygon. Annoyingly these are not ordered, so the polygon containing these points needs to be reconstructed (e.g. as follows):
import numpy as np
import matplotlib.nxutils
import scipy.spatial
def find_convex_hull(points):
    triangulation = scipy.spatial.Delaunay(points)
    unordered = list(triangulation.convex_hull)
    ordered = list(unordered.pop(0))
    while len(unordered) > 0:
        next_seg = next(i for i, seg in enumerate(unordered) if ordered[-1] in seg)
        ordered += [point for point in unordered.pop(next_seg) if point != ordered[-1]]
    return points[ordered]
As suggested by #user1443118, the points_inside_poly function in matplotlib.nxutils can then be used to test if points lie in the resulting polygon, which corresponds to the convex hull. This leads to the following function for calculating the degree of intersection.
def inclusion(points_a, points_b):
    ch_a = find_convex_hull(points_a)
    return (1.0 * matplotlib.nxutils.points_inside_poly(points_b, ch_a)).mean()
So given some sets of points (with properties as in your original example), illustrated below:
A = np.random.randn(100, 2)
B = np.array([2,0]) + 0.5 * np.random.randn(100, 2)
C = 0.5 * np.random.randn(100, 2)
D = np.array([5,0]) + 0.5 * np.random.randn(100, 2)
The degree of inclusion can be calculated as follows:
>>> inclusion(A, B)
0.44
>>> inclusion(A, C)
1.0
>>> inclusion(A, D)
0.0
Finally, however, it's worth noting that the points_inside_poly function does not always count points on the polygon boundary as being inside (see here for an explanation of why the underlying function behaves this way). For this reason, set C in your original example would only be partially included, as the point (1,5) lies on the convex hull of A and is not counted.
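As an aside, on more recent SciPy versions the whole test can be done without reconstructing the hull polygon at all: scipy.spatial.Delaunay.find_simplex returns -1 for points that fall outside the triangulation's convex hull (points on the hull should land in a simplex, though exact boundary cases are subject to floating-point tolerance). A minimal sketch of that approach:
import scipy.spatial
def inclusion_via_simplices(points_a, points_b):
    # Points inside (or on) the convex hull of points_a fall in some simplex,
    # for which find_simplex returns a nonnegative index; outside points get -1.
    tri = scipy.spatial.Delaunay(points_a)
    return (tri.find_simplex(points_b) >= 0).mean()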
Matplotlib has point-in-polygon functions that are pretty fast. This is taken straight from the matplotlib documentation for nxutils:
In [25]: import numpy as np
In [26]: import matplotlib.nxutils as nx
In [27]: verts = np.array([ [0,0], [0, 1], [1, 1], [1,0]], float)
In [28]: nx.pnpoly( 0.5, 0.5, verts)
Out[28]: 1
In [29]: nx.pnpoly( 0.5, 1.5, verts)
Out[29]: 0
In [30]: points = np.random.rand(10,2)*2
In [31]: points
Out[31]:
array([[ 1.03597426, 0.61029911],
[ 1.94061056, 0.65233947],
[ 1.08593748, 1.16010789],
[ 0.9255139 , 1.79098751],
[ 1.54564936, 1.15604046],
[ 1.71514397, 1.26147554],
[ 1.19133536, 0.56787764],
[ 0.40939549, 0.35190339],
[ 1.8944715 , 0.61785408],
[ 0.03128518, 0.48144145]])
In [32]: nx.points_inside_poly(points, verts)
Out[32]: array([False, False, False, False, False, False, False, True, False, True], dtype=bool)
After that, it's just a matter of testing each point in the set and figuring out whether both, one, or neither are inside the vertices.
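To connect this back to the original question's terminology, here is a short sketch of that final step (note that matplotlib.nxutils was removed in later matplotlib releases; matplotlib.path.Path.contains_points is the modern replacement):
import numpy as np
from matplotlib.path import Path
def classify_inclusion(hull_verts, points):
    # Classify a point set against a polygon, as in the question's examples
    inside = Path(hull_verts).contains_points(points)
    if inside.all():
        return "fully included"
    elif inside.any():
        return "partially included"
    return "not included"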