Why is my shapely polygon generated from a mask invalid?

I am trying to make a shapely Polygon from a binary mask, but I always end up with an invalid Polygon. How can I make a valid polygon from an arbitrary binary mask? Below is an example using a circular mask. I suspect that it is because the points I get from the mask contour are out of order, which is apparent when I plot the points (see images below).
import matplotlib.pyplot as plt
import numpy as np
from shapely.geometry import Point, Polygon
from scipy.ndimage import binary_erosion
from skimage import draw
def get_circular_se(radius=2):
    """Return a (2*radius + 1)-wide circular structuring element."""
    N = (radius * 2) + 1
    se = np.zeros(shape=[N, N])
    for i in range(N):
        for j in range(N):
            # integer division keeps the centre index identical on Python 2 and 3
            se[i, j] = (i - N // 2)**2 + (j - N // 2)**2 <= radius**2
    se = np.array(se, dtype="uint8")
    return se
#generates a circular mask
side_len = 512
rad = 100
mask = np.zeros(shape=(side_len, side_len))
rr, cc = draw.disk((side_len/2, side_len/2), radius=rad, shape=mask.shape)  # draw.circle was removed in scikit-image 0.19
mask[rr, cc] = 1
#makes a polygon from the mask perimeter
se = get_circular_se(radius=1)
contour = mask - binary_erosion(mask, structure=se)
pixels_mask = np.array(np.where(contour==1)[::-1]).T
polygon = Polygon(pixels_mask)
print(polygon.is_valid)
# >> False
#plots the results
fig, ax = plt.subplots()
ax.imshow(mask,cmap='Greys_r')
ax.plot(pixels_mask[:,0],pixels_mask[:,1],'b-',lw=0.5)
plt.tight_layout()
plt.show()

In fact I already found a solution that worked for me, but maybe someone has a better one. The problem was indeed that my points were out of order. Input coordinate order is crucial for making valid polygons. So, one just has to put the points in the right order first. Below is an example solution using a nearest neighbor approach with a KDTree, which I've already posted elsewhere for related problems.
from sklearn.neighbors import KDTree
def polygonize_by_nearest_neighbor(pp):
    """Takes a set of xy coordinates pp Numpy array(n,2) and reorders the array to make
    a polygon using a nearest neighbor approach.
    """
    # start with first index
    pp_new = np.zeros_like(pp)
    pp_new[0] = pp[0]
    p_current_idx = 0
    tree = KDTree(pp)
    for i in range(len(pp) - 1):
        nearest_dist, nearest_idx = tree.query([pp[p_current_idx]], k=4)  # k1 = identity
        nearest_idx = nearest_idx[0]
        # finds next nearest point along the contour and adds it
        for min_idx in nearest_idx[1:]:  # skip the first point (will be zero for same pixel)
            if not pp[min_idx].tolist() in pp_new.tolist():  # make sure it's not already in the list
                pp_new[i + 1] = pp[min_idx]
                p_current_idx = min_idx
                break
    pp_new[-1] = pp[0]
    return pp_new
pixels_mask_ordered = polygonize_by_nearest_neighbor(pixels_mask)
polygon = Polygon(pixels_mask_ordered)
print(polygon.is_valid)
# >> True
#plots the results
fig, ax = plt.subplots()
ax.imshow(mask,cmap='Greys_r')
ax.plot(pixels_mask_ordered[:,0],pixels_mask_ordered[:,1],'b-',lw=2)
plt.tight_layout()
plt.show()
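For outlines that are star-shaped around their centroid (a circle, or any convex blob), a cheaper way to order the points may be to sort them by angle around the centroid. The following is only a minimal sketch under that assumption, not a general replacement for the nearest-neighbor ordering above. The helper names `order_by_angle` and `shoelace_area` are made up for this illustration, and it uses only the standard library so the ordering logic is easy to check.

```python
import math
import random

def order_by_angle(points):
    """Sort (x, y) points counterclockwise around their centroid.

    Only valid when the outline is star-shaped with respect to the centroid.
    """
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def shoelace_area(points):
    """Signed area of the polygon whose vertices are given in order."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        area += x0 * y1 - x1 * y0
    return area / 2.0

# boundary of a circle of radius 100, deliberately shuffled
random.seed(0)
radius = 100.0
outline = [(256 + radius * math.cos(2 * math.pi * k / 360),
            256 + radius * math.sin(2 * math.pi * k / 360)) for k in range(360)]
shuffled = outline[:]
random.shuffle(shuffled)

ordered = order_by_angle(shuffled)
print(shoelace_area(shuffled))  # essentially arbitrary: the edges cross each other
print(shoelace_area(ordered))   # close to pi * radius**2
```

If the ordered area matches the expected area, the ring no longer crosses itself, and passing `ordered` to shapely's Polygon should then give a valid polygon for shapes of this kind. For concave masks the angular sort can fail, and a neighbor-following approach is the safer choice.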

Related

Generate a new set of points along a line

I have a Python project where I need to redraw a line many times with the points in random places but keeping the line's shape and point count roughly the same. The final output will be using polygonal points and not Bezier paths (though I wouldn't be opposed to using Bezier as an intermediary step).
This animation is demonstrating how the points could move along the line to different positions while maintaining the general shape.
I also have a working example below where I'm moving along the line and picking random new points between existing points (the red line, below). It works okay, but I'd love to hear some other approaches I might take if someone knows of a better one?
Though this code is using matplotlib to demonstrate the line, the final program will not.
import numpy as np
from matplotlib import pyplot as plt
import random
from random import (randint,uniform)
def move_along_line(p1, p2, scalar):
    distX = p2[0] - p1[0]
    distY = p2[1] - p1[1]
    modX = (distX * scalar) + p1[0]
    modY = (distY * scalar) + p1[1]
    return [modX, modY]
x_coords = [213.5500031,234.3809357,255.211853,276.0427856,296.8737183,317.7046204,340.1997681,364.3751221,388.5505066,414.8896484,444.5192261,478.5549622,514.5779419,545.4779053,570.3830566,588.0241699,598.2469482,599.772583,596.758728,593.7449341,590.7310791,593.373291,610.0373535,642.1326294,677.4451904,710.0697021,737.6887817,764.4020386,791.1152954,817.8284912,844.541687,871.2550049,897.9682007,924.6813965,951.3945923,978.1078491,1009.909546,1042.689941,1068.179199,1089.543091]
y_coords = [487.3099976,456.8832703,426.4565125,396.0297852,365.6030273,335.1763,306.0349426,278.1913452,250.3477478,224.7166748,203.0908051,191.2358704,197.6810608,217.504303,244.4946136,276.7698364,312.0551453,348.6885986,385.4395447,422.1904297,458.9414063,495.5985413,527.0128479,537.1477661,527.6642456,510.959259,486.6988525,461.2799683,435.8611145,410.4422913,385.023468,359.6045532,334.18573,308.7669067,283.3480835,257.929184,239.4429474,253.6099091,280.1803284,310.158783]
plt.plot(x_coords,y_coords,color='b')
plt.scatter(x_coords,y_coords,s=2)
new_line_x = []
new_line_y = []
for tgt in range(len(x_coords)-1):
    #tgt = randint(0, len(x_coords)-1)
    next_pt = tgt+1
    new_pt = move_along_line([x_coords[tgt],y_coords[tgt]], [x_coords[next_pt],y_coords[next_pt]], uniform(0, 1))
    new_line_x.append(new_pt[0])
    new_line_y.append(new_pt[1])
plt.plot(new_line_x,new_line_y,color='r')
plt.scatter(new_line_x,new_line_y,s=10)
ax = plt.gca()
ax.set_aspect('equal')
plt.show()
Thank you very much!

I'm not sure this is the optimal way to do it, but essentially you want to follow these steps:
Calculate the length of each segment between consecutive points, and keep a running tally so that every point has its cumulative distance from the start of the path. The last tally is the total length of the path.
Generate a new set of random distances along the path: start with 0, then for each additional point draw a random value between 0 and 1 and multiply it by the total length of the path.
Sort these distances from smallest to largest.
For each random distance, loop over the tallied distances to find the index i where the random distance is greater than distance i and less than distance i+1, then interpolate new x and y values between points i and i+1.
from matplotlib import pyplot as plt
from scipy.interpolate import interp1d
import numpy
import random
import math
x_coords = [195.21,212.53,237.39,270.91,314.21,368.43,434.69,514.1,607.8,692.69,746.98,773.8,776.25,757.45,720.52,668.55,604.68,545.37,505.79,487.05,490.27,516.58,567.09,642.93,745.2,851.5,939.53,1010.54,1065.8,1106.58,1134.15,1149.75,1154.68]
y_coords = [195.34,272.27,356.59,438.98,510.14,560.76,581.52,563.13,496.27,404.39,318.83,242.15,176.92,125.69,91.02,75.48,81.62,113.49,168.57,239.59,319.29,400.38,475.6,537.67,579.32,586.78,558.32,504.7,436.69,365.05,300.55,253.95,236.03]
n_points = 100
x_coords = numpy.array(x_coords)
x_min = x_coords.min()
x_max = x_coords.max()
x_range = x_max - x_min
distances = []
tallied_distances = [0]
tallied_distance = 0
for i in range(0, len(x_coords) -1):
    xi = x_coords[i]
    xf = x_coords[i + 1]
    yi= y_coords[i]
    yf = y_coords[i+1]
    d = math.sqrt((xf-xi)**2 + (yf-yi)**2)
    tallied_distance += d
    tallied_distances.append(tallied_distance)
random_distances_along_line = [0]
for i in range(0, n_points-2):
    random_distances_along_line.append(random.random()*tallied_distance)
random_distances_along_line.sort()
new_x_points = [x_coords[0]]
new_y_points = [y_coords[0]]
for i in range(0, len(random_distances_along_line)):
    dt = random_distances_along_line[i]
    for j in range(0, len(tallied_distances)-1):
        di = tallied_distances[j]
        df = tallied_distances[j+1]
        if di < dt and dt < df:
            difference = dt - di
            xi = x_coords[j]
            xf = x_coords[j+1]
            yi = y_coords[j]
            yf = y_coords[j+1]
            xt = xi+(xf-xi)*difference/(df-di)
            yt = yi+(yf-yi)*difference/(df-di)
            new_x_points.append(xt)
            new_y_points.append(yt)
new_x_points.append(x_coords[len(x_coords)-1])
new_y_points.append(y_coords[len(y_coords)-1])
plt.plot(new_x_points, new_y_points)
plt.scatter(new_x_points, new_y_points,s=2)
ax = plt.gca()
ax.set_aspect('equal')
plt.show()
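The same steps can be written more compactly by looking up the containing segment with a binary search instead of an inner loop. This is a sketch of that variation rather than the code used for the plot above. The function name `resample_polyline` and its `rng` parameter are assumptions made for the example, and the tiny L-shaped path is made-up test data.

```python
import bisect
import math
import random

def resample_polyline(xs, ys, n_points, rng=random):
    """Return n_points along the polyline: both endpoints plus random interior points."""
    # cumulative arc length at every original vertex
    cum = [0.0]
    for i in range(len(xs) - 1):
        cum.append(cum[-1] + math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]))
    total = cum[-1]

    # sorted random arc-length positions, with the endpoints pinned
    targets = sorted(rng.uniform(0.0, total) for _ in range(n_points - 2))
    targets = [0.0] + targets + [total]

    new_x, new_y = [], []
    for d in targets:
        # index of the segment [cum[j], cum[j + 1]] that contains d
        j = min(bisect.bisect_right(cum, d) - 1, len(xs) - 2)
        seg = cum[j + 1] - cum[j]
        t = (d - cum[j]) / seg if seg > 0 else 0.0
        new_x.append(xs[j] + t * (xs[j + 1] - xs[j]))
        new_y.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return new_x, new_y

random.seed(1)
xs = [0.0, 3.0, 3.0]
ys = [0.0, 0.0, 4.0]
px, py = resample_polyline(xs, ys, 10)
print(len(px), (px[0], py[0]), (px[-1], py[-1]))
```

Because the random positions are drawn uniformly in arc length, long segments receive proportionally more points than short ones, which is what keeps the overall shape.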

What is the best way to calculate radial average of the image with python?

I have a square image, for example this one:
and I would like to calculate the 1D average of the image for each radius from the position (0,0). I have written some code to do so, but first of all it is very slow even for small images, and secondly I see that there are also some problems with the idea behind it. The code is here:
import matplotlib.pyplot as plt
import numpy as np
import collections
from skimage import data
image = data.coins()
image = image[:,0:303]
print(image.shape)
projection = {}
total_count = {}
for x_i,x in enumerate(image):
    for y_i,y in enumerate(x):
        if round(np.sqrt(x_i**2+y_i**2),1) not in projection:
            projection[round(np.sqrt(x_i**2+y_i**2),1)] = y
            total_count[round(np.sqrt(x_i**2+y_i**2),1)] = 1
        elif np.sqrt(round(np.sqrt(x_i**2+y_i**2),1)) in projection:
            projection[round(np.sqrt(x_i**2+y_i**2),1)] += y
            total_count[round(np.sqrt(x_i ** 2 + y_i ** 2), 1)] += 1
od = collections.OrderedDict(sorted(projection.items()))
x, y = [],[]
for k, v in od.items():
    x.append(k)
    y.append(v/total_count[k])
plt.plot(x,y)
plt.xlabel('Radius from (0,0)')
plt.ylabel('Averaged pixel value')
plt.show()
The result of the code looks like this:
Does anyone have a clue how to improve my script? I also don't know why in some cases there are spikes with a very small average value. I would really appreciate some hints. Thanks!

You may filter the image by the radius by creating a matrix of radii R
and calculating
image[(R >= r-.5) & (R < r+.5)].mean()
where r is the radius you are interested in.
import numpy as np
import matplotlib.pyplot as plt
from skimage import data
# get some image
image = data.coins()
image = image[:,0:303]
# create array of radii
x,y = np.meshgrid(np.arange(image.shape[1]),np.arange(image.shape[0]))
R = np.sqrt(x**2+y**2)
# calculate the mean
f = lambda r : image[(R >= r-.5) & (R < r+.5)].mean()
r = np.linspace(1,302,num=302)
mean = np.vectorize(f)(r)
# plot it
fig,ax=plt.subplots()
ax.plot(r,mean)
plt.show()

I think the problem with your spikes is the rounding of the Euclidean distance. Anyway, for raster images it would be more appropriate to use the Manhattan or Chebyshev metric to group intensities. In my implementation, I created coordinate matrices, which are rearranged into an array of pixel coordinates. The actual distances are calculated using the cdist function from scipy.spatial.distance. The inverse indices of the unique distance values are used to index the image and calculate the average intensities.
import matplotlib.pyplot as plt
import numpy as np
from skimage import data
from scipy.spatial import distance
image = data.coins()
image = image[:,0:303]
print(image.shape)
r, c = np.mgrid[0:image.shape[0], 0:image.shape[1]]
# coordinates of origin
O = [[0, 0]]
# 2D array of pixel coordinates
D = np.vstack((r.ravel(), c.ravel())).T
metric = 'cityblock' # or 'chebyshev'
# calculate distances
dst = distance.cdist(O, D, metric)
# group same distances
dst_u, indices, total_count = np.unique(dst, return_inverse=True,
                                        return_counts=True)
# summed intensities for each unique distance
f_image = image.flatten()
proj_sum = [sum(f_image[indices == ix]) for ix, d in enumerate(dst_u)]
# calculate averaged pixel values
projection = np.divide(proj_sum, total_count)
plt.plot(projection)
plt.xlabel('Distance[{}] from {}'.format(metric, O[0]))
plt.ylabel('Averaged pixel value')
plt.show()
Here is the result for the Manhattan metric,
and here for the Chebyshev metric.

I have also found another, very elegant way of computing the radial average, posted by @Bi Rico here:
def radial_profile(data, center):
    y, x = np.indices((data.shape))
    r = np.sqrt((x - center[0])**2 + (y - center[1])**2)
    r = r.astype(int)  # np.int was removed in NumPy 1.24; the builtin int behaves the same
    tbin = np.bincount(r.ravel(), data.ravel())
    nr = np.bincount(r.ravel())
    radialprofile = tbin / nr
    return radialprofile
It works quite well and, most importantly, it is much more efficient than the previously posted suggestions.
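To see what the two bincount calls are doing, it may help to spell the same binning out in plain Python: one running sum and one running count per truncated radius, then a division. This is only an illustrative sketch (the name `radial_profile_py` and the 5x5 test image are made up here), and it will be far slower than the NumPy version on real images.

```python
import math

def radial_profile_py(image, center):
    """Mean pixel value per integer radius; image is a list of rows, center is (x, y)."""
    sums, counts = {}, {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = int(math.hypot(x - center[0], y - center[1]))  # truncate, like astype(int)
            sums[r] = sums.get(r, 0.0) + value
            counts[r] = counts.get(r, 0) + 1
    # radii that never occur are skipped here instead of being reported as empty bins
    return [sums[r] / counts[r] for r in sorted(sums)]

# 5x5 test image whose value equals the truncated distance from the centre pixel
image = [[int(math.hypot(x - 2, y - 2)) for x in range(5)] for y in range(5)]
print(radial_profile_py(image, (2, 2)))  # [0.0, 1.0, 2.0]
```

Every pixel lands in exactly one integer bin, so no bin can end up with a mismatched sum and count.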

Colormap by vector direction in python using quiver

I am trying to colormap vectors by their direction using quiver, in python 2.7. I read in my data from a text file, get the angle of each vector and normalize so that everything falls between [0,1]. However, when I go to plot the color it comes out that the same color indicates two different directions.
Also, it might be relevant that I'm not plotting the data on a mesh but as points with velocity vectors. Here's my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as col
import sys
data = np.loadtxt("" + str(sys.argv[1]) + "")
x_dat = data[:,0]
y_dat = data[:,1]
vx_dat = data[:,2]
vy_dat = data[:,3]
rad = np.arctan(vy_dat/vx_dat) * 2
theta = np.degrees(rad)
for i in range(len(theta)):
    if theta[i] < 0:
        theta[i] += 360
    theta[i] /= 360
I realize I don't need to convert to degrees. Then I normalize my vectors:
N = np.array([])
for i in range(len(vx_dat)):
    N = np.append(N,np.sqrt(vx_dat[i]**2 + vy_dat[i]**2))
    vx_dat[i] = vx_dat[i]/N[i]
    vy_dat[i] = vy_dat[i]/N[i]
And finally, I plot it:
q = plt.quiver(x_dat, y_dat, vx_dat, vy_dat, theta, units='dots', angles='xy', cmap = 'Blues')
Where 'theta' should map the color for each vector based on direction. However here's what I get out (I zoomed in so it'd be easier to see):
How can I fix this so that each direction gets a unique color?

As suggested, use the np.arctan2(V,U) function to calculate your colour value. To return a unique colour per direction, use a different colour map: 'Blues' can only return different shades of blue. A cyclic colour map like 'hsv' is more suitable. Try the following:
q = plt.quiver(x_dat, y_dat, vx_dat, vy_dat, np.arctan2(vy_dat, vx_dat), units='dots', angles='xy', cmap = 'hsv')
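The reason arctan produces duplicate colours is that it only sees the ratio vy/vx, and two opposite vectors have the same ratio. A small standard-library sketch of the difference follows. The helper `direction_to_fraction` is a made-up name for this illustration, and it also shows one possible way to normalise the angle to a fraction of a full turn if you want values between 0 and 1.

```python
import math

def direction_to_fraction(vx, vy):
    """Direction of (vx, vy) as a fraction of a full turn: 0 along +x, 0.25 along +y."""
    angle = math.atan2(vy, vx)  # quadrant-aware, in (-pi, pi]
    return (angle % (2 * math.pi)) / (2 * math.pi)

# two opposite vectors have the same vy/vx ratio, so atan cannot tell them apart
print(math.atan(1.0 / 1.0) == math.atan(-1.0 / -1.0))  # True
print(direction_to_fraction(1.0, 1.0))    # about 0.125
print(direction_to_fraction(-1.0, -1.0))  # about 0.625
```

With a cyclic colour map the two ends of the range share a colour, which is what you want for angles that wrap around.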

Method to uniformly randomly populate a disk with points in python

I have an application that requires a disk populated with 'n' points in a quasi-random fashion. I want the points to be somewhat random, but still have a more or less regular density over the disk.
My current method is to place a point, check if it's inside the disk, and then check if it is also far enough away from all other points already kept. My code is below:
import os
import random
import math
# ------------------------------------------------ #
# geometric constants
center_x = -1188.2
center_y = -576.9
center_z = -3638.3
disk_distance = 2.0*5465.6
disk_diam = 5465.6
# ------------------------------------------------ #
pts_per_disk = 256
closeness_criteria = 200.0
min_closeness_criteria = disk_diam/closeness_criteria
disk_center = [(center_x-disk_distance),center_y,center_z]
pts_in_disk = []
while len(pts_in_disk) < (pts_per_disk):
    potential_pt_x = disk_center[0]
    potential_pt_dy = random.uniform(-disk_diam/2.0, disk_diam/2.0)
    potential_pt_y = disk_center[1]+potential_pt_dy
    potential_pt_dz = random.uniform(-disk_diam/2.0, disk_diam/2.0)
    potential_pt_z = disk_center[2]+potential_pt_dz
    potential_pt_rad = math.sqrt((potential_pt_dy)**2+(potential_pt_dz)**2)
    if potential_pt_rad < (disk_diam/2.0):
        far_enough_away = True
        for pt in pts_in_disk:
            if math.sqrt((potential_pt_x - pt[0])**2+(potential_pt_y - pt[1])**2+(potential_pt_z - pt[2])**2) > min_closeness_criteria:
                pass
            else:
                far_enough_away = False
                break
        if far_enough_away:
            pts_in_disk.append([potential_pt_x,potential_pt_y,potential_pt_z])
outfile_name = "pt_locs_x_lo_"+str(pts_per_disk)+"_pts.txt"
outfile = open(outfile_name,'w')
for pt in pts_in_disk:
    outfile.write(" ".join([("%.5f" % (pt[0]/1000.0)),("%.5f" % (pt[1]/1000.0)),("%.5f" % (pt[2]/1000.0))])+'\n')
outfile.close()
In order to get the most even point density, what I do is basically iteratively run this script using another script, with the 'closeness' criteria reduced for each successive iteration. At some point, the script can not finish, and I just use the points of the last successful iteration.
So my question is rather broad: is there a better way to do this? My method is ok for now, but my gut says that there is a better way to generate such a field of points.
An illustration of the output is graphed below, one with a high closeness criteria, and another with a 'lowest found' closeness criteria (what I want).

A simple solution based on Disk Point Picking from MathWorld:
import numpy as np
import matplotlib.pyplot as plt
n = 1000
r = np.random.uniform(low=0, high=1, size=n) # radius
theta = np.random.uniform(low=0, high=2*np.pi, size=n) # angle
x = np.sqrt(r) * np.cos(theta)
y = np.sqrt(r) * np.sin(theta)
# for plotting circle line:
a = np.linspace(0, 2*np.pi, 500)
cx,cy = np.cos(a), np.sin(a)
fg, ax = plt.subplots(1, 1)
ax.plot(cx, cy,'-', alpha=.5) # draw unit circle line
ax.plot(x, y, '.') # plot random points
ax.axis('equal')
ax.grid(True)
fg.canvas.draw()
plt.show()
It gives.
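The square root on the radius is the important part: the area inside radius r grows with r**2, so drawing r itself uniformly would crowd the points toward the centre. A quick way to convince yourself is to count how many points fall inside the inner disk that has half the area. The sketch below uses only the standard library; the function name `random_point_in_disk` and the `rng` parameter are assumptions for the example.

```python
import math
import random

def random_point_in_disk(radius=1.0, rng=random):
    """One point drawn uniformly (by area) from a disk centred on the origin."""
    r = radius * math.sqrt(rng.random())  # sqrt compensates for area growing with r**2
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)

random.seed(42)
pts = [random_point_in_disk() for _ in range(10000)]
# the inner disk of radius 1/sqrt(2) has half the area, so it should hold about half the points
inner = sum(1 for x, y in pts if x * x + y * y <= 0.5)
print(inner / len(pts))
```

Note that this gives statistically uniform points, which still show random clumps and gaps; it does not enforce a minimum spacing between points.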
Alternatively, you also could create a regular grid and distort it randomly:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as tri
n = 20
tt = np.linspace(-1, 1, n)
xx, yy = np.meshgrid(tt, tt) # create unit square grid
s_x, s_y = xx.ravel(), yy.ravel()
ii = np.argwhere(s_x**2 + s_y**2 <= 1).ravel() # mask off unwanted points
x, y = s_x[ii], s_y[ii]
triang = tri.Triangulation(x, y) # create triangular grid
# distort the grid
g = .5 # distortion factor
rx = x + np.random.uniform(low=-g/n, high=g/n, size=x.shape)
ry = y + np.random.uniform(low=-g/n, high=g/n, size=y.shape)
rtri = tri.Triangulation(rx, ry, triang.triangles) # distorted grid
# for circle:
a = np.linspace(0, 2*np.pi, 500)
cx,cy = np.cos(a), np.sin(a)
fg, ax = plt.subplots(1, 1)
ax.plot(cx, cy,'k-', alpha=.2) # circle line
ax.triplot(triang, "g-", alpha=.4)
ax.triplot(rtri, 'b-', alpha=.5)
ax.axis('equal')
ax.grid(True)
fg.canvas.draw()
plt.show()
It gives
The triangles are just there for visualization. The obvious disadvantage is that depending on your choice of grid, either in the middle or on the borders (as shown here), there will be more or less large "holes" due to the grid discretization.

If you have a defined area like a disc (circle) that you wish to generate random points within you are better off using an equation for a circle and limiting on the radius:
x^2 + y^2 = r^2 (0 < r < R)
or parametrized to two variables
cos(a) = x/r
sin(a) = y/r
sin^2(a) + cos^2(a) = 1
To generate something like the pseudo-random distribution with low density you should take the following approach:
For randomly distributed ranges of r and a choose n points.
This allows you to generate your distribution to roughly meet your density criteria.
To understand why this works, imagine your circle first divided into small rings of width dr, and then divided into pie slices of angle da. Your randomness now has equal probability over the whole boxed area around the circle. If you divide the areas of allowed randomness throughout your circle, you will get a more even distribution over the circle as a whole, with small random variation within the individual areas, giving you the pseudo-random look and feel you are after.
Now your job is just to generate n points for each given area. You will want n to depend on r, as the area of each division changes as you move outward from the center. You can make it proportional to the exact change in area that each ring brings:
for the n-th to n+1-th ring:
d(Area,n,n-1) = Area(n) - Area(n-1)
The area of any given ring is:
Area(n) = pi*(dr*n)^2 - pi*(dr*(n-1))^2
So the difference becomes:
d(Area,n,n-1) = [pi*(dr*n)^2 - pi*(dr*(n-1))^2] - [pi*(dr*(n-1))^2 - pi*(dr*(n-2))^2]
d(Area,n,n-1) = pi*[(dr*n)^2 - 2*(dr*(n-1))^2 + (dr*(n-2))^2]
You could expound this to gain some insight on how much n should increase but it may be faster to just guess at some percentage increase (30%) or something.
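Expanding the last expression is quick: n^2 - 2*(n-1)^2 + (n-2)^2 = 2, so each ring is larger than the previous one by the constant 2*pi*dr^2, and the area of the n-th ring is pi*dr^2*(2n-1). That suggests increasing the point count per ring by a fixed amount rather than by a percentage. The following sketch checks this numerically; the helper name `ring_area` and the values of R and n_rings are only illustrative.

```python
import math

def ring_area(n, dr):
    """Area of the n-th ring (n = 1 is the innermost disk of radius dr)."""
    return math.pi * (dr * n) ** 2 - math.pi * (dr * (n - 1)) ** 2

R, n_rings = 10.0, 10
dr = R / n_rings
areas = [ring_area(n, dr) for n in range(1, n_rings + 1)]

# closed form: each ring has area pi * dr**2 * (2n - 1) ...
assert all(math.isclose(a, math.pi * dr**2 * (2 * n - 1))
           for n, a in enumerate(areas, start=1))
# ... so consecutive rings differ by the constant 2 * pi * dr**2
print([round(b - a, 6) for a, b in zip(areas, areas[1:])])
# and the rings add up to the whole disk
print(math.isclose(sum(areas), math.pi * R**2))
```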
The example I have provided is a small subset and decreasing da and dr will dramatically improve your results.
Here is some rough code for generating such points:
import random
import math
R = 10.
n_rings = 10.
n_angles = 10.
dr = R/n_rings
da = 2*math.pi/n_angles
base_points_per_division = 3
increase_per_level = 1.1
points = []
ring = 0
while ring < n_rings:
    angle = 0
    while angle < n_angles:
        for i in range(int(base_points_per_division)):
            # random angle and radius inside the current (ring, angle) division
            ra = angle*da + da*random.random()
            rr = ring*dr + dr*random.random()
            x = rr*math.cos(ra)
            y = rr*math.sin(ra)
            points.append((x,y))
        angle += 1
    base_points_per_division = base_points_per_division*increase_per_level
    ring += 1
I tested it with the parameters:
n_rings = 20
n_angles = 20
base_points = .9
increase_per_level = 1.1
And got the following results:
It looks more dense than your provided image, but I imagine further tweaking of those variables could be beneficial.
You can add an additional part to scale the density properly by calculating the number of points per ring.
points_per_ring = density*math.pi*(dr**2)*(2*n+1)
points_per_division = points_per_ring/n_angles
This will provide an even better scaled distribution.
density = .03
points = []
ring = 0
while ring < n_rings:
    angle = 0
    base_points_per_division = density*math.pi*(dr**2)*(2*ring+1)/n_angles
    while angle < n_angles:
        for i in range(int(base_points_per_division)):
            ra = angle*da + min(da,da*random.random())
            rr = ring*dr + dr*random.random()
            x = rr*math.cos(ra)
            y = rr*math.sin(ra)
            points.append((x,y))
        angle += 1
    ring += 1
Giving better results using the following parameters
R = 1.
n_rings = 10.
n_angles = 10.
density = 10/(dr*da) # ~ ten points per unit area
With a graph...
And for fun you can graph the divisions to see how well it matches your distribution, and adjust.

Depending on how random the points need to be, it may be simple enough to just make a grid of points within the disk, and then displace each point by some small but random amount.
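A minimal sketch of that idea, using only the standard library: lay an n x n grid over the bounding square, displace every grid point by a random fraction of the grid spacing, and keep the points that end up inside the disk. The function name `jittered_grid_in_disk` and its parameters (`n`, `jitter`, `rng`) are assumptions made for this example, and the number of points you get back is only approximately controllable through `n`.

```python
import random

def jittered_grid_in_disk(radius=1.0, n=20, jitter=0.5, rng=random):
    """Grid points over the bounding square, each displaced randomly,
    keeping only those that end up inside the disk."""
    step = 2.0 * radius / (n - 1)
    points = []
    for i in range(n):
        for j in range(n):
            # jitter is the maximum displacement as a fraction of the grid spacing
            x = -radius + i * step + rng.uniform(-jitter, jitter) * step
            y = -radius + j * step + rng.uniform(-jitter, jitter) * step
            if x * x + y * y <= radius * radius:
                points.append((x, y))
    return points

random.seed(0)
pts = jittered_grid_in_disk()
print(len(pts))
```

A jitter of 0 gives back the plain grid, while values near 0.5 hide the grid structure at the cost of letting neighbouring points get close to each other.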

It may be that you want more randomness, but if you just want to fill your disc with an even-looking distribution of points that aren't on an obvious grid, you could try a spiral with a random phase.
import math
import random
import pylab
n = 300
alpha = math.pi * (3 - math.sqrt(5)) # the "golden angle"
phase = random.random() * 2 * math.pi
points = []
for k in range(n):
    theta = k * alpha + phase
    r = math.sqrt(float(k)/n)
    points.append((r * math.cos(theta), r * math.sin(theta)))
pylab.scatter(*zip(*points))
pylab.show()

Probability theory ensures that the rejection method is an appropriate method to generate uniformly distributed points within the disk D(0,r), centered at the origin and of radius r. Namely, one generates points within the square [-r,r] x [-r,r] until a point falls within the disk:
do{
    generate P in [-r,r]x[-r,r];
}while(P[0]**2+P[1]**2>r**2);
return P;
unif_rnd_disk is a generator function implementing this rejection method:
import matplotlib.pyplot as plt
import numpy as np
import itertools
def unif_rnd_disk(r=1.0):
    pt=np.zeros(2)
    while True:
        yield pt
        while True:
            pt=-r+2*r*np.random.random(2)
            # compare squared distance with r**2 so the test is also right for r != 1
            if (pt[0]**2+pt[1]**2<=r**2):
                break
G=unif_rnd_disk()# generator of points in disk D(0,r=1)
X,Y=zip(*[pt for pt in itertools.islice(G, 1, 1000)])
plt.scatter(X, Y, color='r', s=3)
plt.axis('equal')
If we want to generate points in a disk centered at C(a,b), we have to apply a translation to the points in the disk D(0,r):
C=[2.0, -3.5]
plt.scatter(C[0]+np.array(X), C[1]+np.array(Y), color='r', s=3)
plt.axis('equal')

Colorize Voronoi Diagram

I'm trying to colorize a Voronoi Diagram created using scipy.spatial.Voronoi. Here's my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
# make up data points
points = np.random.rand(15,2)
# compute Voronoi tesselation
vor = Voronoi(points)
# plot
voronoi_plot_2d(vor)
# colorize
for region in vor.regions:
    if not -1 in region:
        polygon = [vor.vertices[i] for i in region]
        plt.fill(*zip(*polygon))
plt.show()
The resulting image:
As you can see some of the Voronoi regions at the border of the image are not colored. That is because some indices to the Voronoi vertices for these regions are set to -1, i.e., for those vertices outside the Voronoi diagram. According to the docs:
regions: (list of list of ints, shape (nregions, *)) Indices of the Voronoi vertices forming each Voronoi region. -1 indicates vertex outside the Voronoi diagram.
In order to colorize these regions as well, I've tried to just remove these "outside" vertices from the polygon, but that didn't work. I think, I need to fill in some points at the border of the image region, but I can't seem to figure out how to achieve this reasonably.
Can anyone help?

The Voronoi data structure contains all the necessary information to construct positions for the "points at infinity". Qhull also reports them simply as -1 indices, so Scipy doesn't compute them for you.
https://gist.github.com/pv/8036995
http://nbviewer.ipython.org/gist/pv/8037100
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi
def voronoi_finite_polygons_2d(vor, radius=None):
    """
    Reconstruct infinite voronoi regions in a 2D diagram to finite
    regions.
    Parameters
    ----------
    vor : Voronoi
        Input diagram
    radius : float, optional
        Distance to 'points at infinity'.
    Returns
    -------
    regions : list of tuples
        Indices of vertices in each revised Voronoi regions.
    vertices : list of tuples
        Coordinates for revised Voronoi vertices. Same as coordinates
        of input vertices, with 'points at infinity' appended to the
        end.
    """
    if vor.points.shape[1] != 2:
        raise ValueError("Requires 2D input")
    new_regions = []
    new_vertices = vor.vertices.tolist()
    center = vor.points.mean(axis=0)
    if radius is None:
        radius = np.ptp(vor.points)  # the ndarray.ptp() method was removed in NumPy 2.0
    # Construct a map containing all ridges for a given point
    all_ridges = {}
    for (p1, p2), (v1, v2) in zip(vor.ridge_points, vor.ridge_vertices):
        all_ridges.setdefault(p1, []).append((p2, v1, v2))
        all_ridges.setdefault(p2, []).append((p1, v1, v2))
    # Reconstruct infinite regions
    for p1, region in enumerate(vor.point_region):
        vertices = vor.regions[region]
        if all(v >= 0 for v in vertices):
            # finite region
            new_regions.append(vertices)
            continue
        # reconstruct a non-finite region
        ridges = all_ridges[p1]
        new_region = [v for v in vertices if v >= 0]
        for p2, v1, v2 in ridges:
            if v2 < 0:
                v1, v2 = v2, v1
            if v1 >= 0:
                # finite ridge: already in the region
                continue
            # Compute the missing endpoint of an infinite ridge
            t = vor.points[p2] - vor.points[p1] # tangent
            t /= np.linalg.norm(t)
            n = np.array([-t[1], t[0]]) # normal
            midpoint = vor.points[[p1, p2]].mean(axis=0)
            direction = np.sign(np.dot(midpoint - center, n)) * n
            far_point = vor.vertices[v2] + direction * radius
            new_region.append(len(new_vertices))
            new_vertices.append(far_point.tolist())
        # sort region counterclockwise
        vs = np.asarray([new_vertices[v] for v in new_region])
        c = vs.mean(axis=0)
        angles = np.arctan2(vs[:,1] - c[1], vs[:,0] - c[0])
        new_region = np.array(new_region)[np.argsort(angles)]
        # finish
        new_regions.append(new_region.tolist())
    return new_regions, np.asarray(new_vertices)
# make up data points
np.random.seed(1234)
points = np.random.rand(15, 2)
# compute Voronoi tesselation
vor = Voronoi(points)
# plot
regions, vertices = voronoi_finite_polygons_2d(vor)
print("--")
print(regions)
print("--")
print(vertices)
# colorize
for region in regions:
    polygon = vertices[region]
    plt.fill(*zip(*polygon), alpha=0.4)
plt.plot(points[:,0], points[:,1], 'ko')
plt.xlim(vor.min_bound[0] - 0.1, vor.max_bound[0] + 0.1)
plt.ylim(vor.min_bound[1] - 0.1, vor.max_bound[1] + 0.1)
plt.show()
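The key step in the function above is the construction of the artificial endpoint for a ridge that runs off to infinity. Pulled out on its own and written without NumPy, it looks roughly like the sketch below. The function name `far_point_for_ridge` and the example coordinates are made up for illustration; the logic mirrors the tangent, normal, and sign computation in the code above.

```python
import math

def far_point_for_ridge(p1, p2, finite_vertex, center, radius):
    """Extend an infinite Voronoi ridge between input points p1 and p2.

    finite_vertex is the ridge's one known endpoint, center is the mean of all
    input points, and radius is how far to push the artificial endpoint.
    """
    # unit tangent from p1 to p2; the ridge itself runs along the normal
    tx, ty = p2[0] - p1[0], p2[1] - p1[1]
    norm = math.hypot(tx, ty)
    tx, ty = tx / norm, ty / norm
    nx, ny = -ty, tx
    # pick the normal direction that points away from the centre of the data
    mx, my = (p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0
    dot = (mx - center[0]) * nx + (my - center[1]) * ny
    sign = (dot > 0) - (dot < 0)
    return (finite_vertex[0] + sign * nx * radius,
            finite_vertex[1] + sign * ny * radius)

# ridge between (0, 0) and (2, 0); the data centre lies below, so the ridge extends upward
print(far_point_for_ridge((0, 0), (2, 0), (1, 1), (1, -5), 10))  # (1.0, 11.0)
```

The ridge between two input points is perpendicular to the line joining them, which is why the normal of the tangent gives its direction; the sign test only decides which of the two normals leads away from the rest of the points.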

I have a much simpler solution to this problem: add 4 distant dummy points to your point list before calling the Voronoi algorithm.
Based on your code, I added two lines.
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
# make up data points
points = np.random.rand(15,2)
# add 4 distant dummy points
points = np.append(points, [[999,999], [-999,999], [999,-999], [-999,-999]], axis = 0)
# compute Voronoi tesselation
vor = Voronoi(points)
# plot
voronoi_plot_2d(vor)
# colorize
for region in vor.regions:
    if not -1 in region:
        polygon = [vor.vertices[i] for i in region]
        plt.fill(*zip(*polygon))
# fix the range of axes
plt.xlim([0,1]), plt.ylim([0,1])
plt.show()
Then the resulting figure just looks like the following.

I don't think there is enough information in the data available in the vor structure to figure this out without doing at least some of the Voronoi computation again. Since that's the case, here are the relevant portions of the original voronoi_plot_2d function. You should be able to use them to extract the points that intersect with vor.max_bound or vor.min_bound (the top right and bottom left corners of the diagram), in order to figure out the other coordinates for your polygons.
for simplex in vor.ridge_vertices:
    simplex = np.asarray(simplex)
    if np.all(simplex >= 0):
        ax.plot(vor.vertices[simplex,0], vor.vertices[simplex,1], 'k-')
ptp_bound = vor.points.ptp(axis=0)
center = vor.points.mean(axis=0)
for pointidx, simplex in zip(vor.ridge_points, vor.ridge_vertices):
    simplex = np.asarray(simplex)
    if np.any(simplex < 0):
        i = simplex[simplex >= 0][0] # finite end Voronoi vertex
        t = vor.points[pointidx[1]] - vor.points[pointidx[0]] # tangent
        t /= np.linalg.norm(t)
        n = np.array([-t[1], t[0]]) # normal
        midpoint = vor.points[pointidx].mean(axis=0)
        direction = np.sign(np.dot(midpoint - center, n)) * n
        far_point = vor.vertices[i] + direction * ptp_bound.max()
        ax.plot([vor.vertices[i,0], far_point[0]],
                [vor.vertices[i,1], far_point[1]], 'k--')
