I would like a solution to automatically center a basemap plot on my coordinate data.
I've got things to automatically center, but the resulting area is much larger than the area actually used by my data. I would like the plot to be bounded by the plot coordinates, rather than an area drawn from the lat/lon boundaries.
I am using John Cook's code for calculating the distance between two points on an (assumed perfect) sphere.
First Try
Here is the script I started with. It made the width and height too small for the data area, and put the center latitude (lat0) too far south.
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
import sys
import csv
import spheredistance as sd
print '\n'
if len(sys.argv) < 3:
    print >>sys.stderr,'Usage:',sys.argv[0],'<datafile> <#rows to skip>'
    sys.exit(1)
print '\n'
dataFile = sys.argv[1]
dataStream = open(dataFile, 'rb')
dataReader = csv.reader(dataStream,delimiter='\t')
numRows = sys.argv[2]
dataValues = []
dataLat = []
dataLon = []
print 'Plotting Data From: '+dataFile
dataReader.next()
for row in dataReader:
    dataValues.append(row[0])
    dataLat.append(float(row[1]))
    dataLon.append(float(row[2]))
# center and set extent of map
earthRadius = 6378100 #meters
factor = 1.00
lat0new = ((max(dataLat)-min(dataLat))/2)+min(dataLat)
lon0new = ((max(dataLon)-min(dataLon))/2)+min(dataLon)
mapH = sd.distance_on_unit_sphere(max(dataLat),lon0new,
min(dataLat),lon0new)*earthRadius*factor
mapW = sd.distance_on_unit_sphere(lat0new,max(dataLon),
lat0new,min(dataLon))*earthRadius*factor
# setup stereographic basemap.
# lat_ts is latitude of true scale.
# lon_0,lat_0 is central point.
m = Basemap(width=mapW,height=mapH,
resolution='l',projection='stere',\
lat_0=lat0new,lon_0=lon0new)
#m.shadedrelief()
m.drawcoastlines(linewidth=0.2)
m.fillcontinents(color='white', lake_color='aqua')
#plot data points (omitted due to ownership)
#x, y = m(dataLon,dataLat)
#m.scatter(x,y,2,marker='o',color='k')
# draw parallels and meridians.
m.drawparallels(np.arange(-80.,81.,20.), labels=[1,0,0,0], fontsize=10)
m.drawmeridians(np.arange(-180.,181.,20.), labels=[0,0,0,1], fontsize=10)
m.drawmapboundary(fill_color='aqua')
plt.title("Example")
plt.show()
After generating some random data, it was obvious that the bounds that I chose did not work with this projection (red lines). Using map.drawgreatcircle(), I first visualized where I wanted the bounds while zoomed over the projection of random data.
I corrected the longitude by using the longitudinal difference at the southernmost latitude (blue horizontal line).
I determined the latitudinal range using the Pythagorean theorem to solve for the vertical distance, knowing the distance between the northernmost longitudinal bounds and the distance from a northern corner to the central southernmost point (blue triangle).
def centerMap(lats,lons,scale):
    # Assumes -90 < Lat < 90 and -180 < Lon < 180, and
    # latitude and longitude are in decimal degrees
    earthRadius = 6378100.0 # earth's radius in meters
    northLat = max(lats)
    southLat = min(lats)
    eastLon = max(lons)  # largest longitude is the eastern bound
    westLon = min(lons)  # smallest longitude is the western bound
    # average between max and min longitude
    lon0 = ((eastLon-westLon)/2.0)+westLon
    # a = the height of the map (solved for below)
    # b = half the distance along the northern edge
    # c = distance from a northern corner to the central southernmost point
    b = sd.spheredist(northLat,westLon,northLat,eastLon)*earthRadius/2
    c = sd.spheredist(northLat,eastLon,southLat,lon0)*earthRadius
    # use the Pythagorean theorem to determine the height of the plot
    mapH = pow(pow(c,2)-pow(b,2),1./2)
    arcCenter = (mapH/2)/earthRadius
    lat0 = sd.secondlat(southLat,arcCenter)
    # distance between max E and W longitude at the southernmost latitude
    mapW = sd.spheredist(southLat,westLon,southLat,eastLon)*earthRadius
    return lat0,lon0,mapW*scale,mapH*scale
lat0center,lon0center,mapWidth,mapHeight = centerMap(dataLat,dataLon,1.1)
The lat0 (or latitudinal center) in this case is therefore the point halfway up the height of this triangle. I solved for it using John Cook's method, rearranged to find an unknown coordinate given the first coordinate (the median longitude at the southern boundary) and the arc length (half of the total height).
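import math

def secondlat(lat1, arc):
    # lat1 in degrees, arc in radians; returns the latitude reached by
    # moving the given arc length due north of lat1
    degrees_to_radians = math.pi/180.0
    lat2 = (arc-((90-lat1)*degrees_to_radians))*(1./degrees_to_radians)+90
    return lat2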
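For readers who do not have the spheredistance module, here is a minimal standard-library sketch of the same centering idea. It is not the original module: `central_angle` is a hypothetical stand-in for sd.spheredist (implemented here with the haversine formula, which is an assumption about what that function returns), and `second_lat` uses the fact that the expression in secondlat simplifies algebraically to lat1 plus the arc converted to degrees.

```python
import math

EARTH_RADIUS_M = 6378100.0  # same radius as used in the post

def central_angle(lat1, lon1, lat2, lon2):
    """Central angle in radians between two points given in degrees.
    Hypothetical stand-in for sd.spheredist (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * math.asin(min(1.0, math.sqrt(h)))

def second_lat(lat1, arc):
    """Latitude (degrees) reached by moving `arc` radians due north of lat1."""
    return lat1 + math.degrees(arc)

def center_map(lats, lons, scale=1.0):
    """Return (lat0, lon0, width_m, height_m) following the triangle construction."""
    north, south = max(lats), min(lats)
    east, west = max(lons), min(lons)
    lon0 = (east + west) / 2.0
    # b: half the northern edge, c: northern corner to southern midpoint
    b = central_angle(north, west, north, east) * EARTH_RADIUS_M / 2
    c = central_angle(north, east, south, lon0) * EARTH_RADIUS_M
    map_h = math.sqrt(c * c - b * b)  # planar Pythagorean approximation
    lat0 = second_lat(south, (map_h / 2) / EARTH_RADIUS_M)
    map_w = central_angle(south, west, south, east) * EARTH_RADIUS_M
    return lat0, lon0, map_w * scale, map_h * scale
```

The Pythagorean step treats the triangle as flat, so this is only a reasonable approximation for regional maps, not for extents that span a large fraction of the globe.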
Update:
The above function, as well as the distance calculation between two coordinates, can be done with higher accuracy using the pyproj Geod class methods geod.fwd() and geod.inv(). I found this in Erik Westra's Python for Geospatial Development, which is an excellent resource.
Update:
I have now verified that this also works for Lambert Conformal Conic (lcc) projections.
Related
I have a bunch of shapes (e.g. shapely LineStrings or Polygons) in a geopandas GeoDataFrame.
The shapes specify coordinates in a local 200x200 meters grid, i.e. all coordinates are between (0, 0) and (200, 200).
I now would like to "place" these lines globally.
For this, I want to specify a GPS Point (with a given lat/lon) as a reference.
My first (naive) approach would be to use geographiclib, take all shapes' coords (in local X/Y) and apply the following transformation and "recreate" the shape:
# Convert coordinates to GPS location
from shapely.geometry import LineString
from geographiclib.geodesic import Geodesic
geod = Geodesic.WGS84 # the base geodesic (i.e. the world)
origin = (48.853772345870176, 2.350983211585546) # this is somewhere in Paris, for example
def local_to_latlong(x, y, orientation=0, scale=1):
    """ Two step process.
    - First walk x meters to east from origin.
    - Then, from that point, walk y meters north from origin.
    Optional:
    - orientation allows to "spin" the coordinates
    - scale allows to grow/shrink the distances
    """
    go_X = geod.Direct(*origin, orientation + 90, x * scale) # x is East-coordinate
    go_Y = geod.Direct(go_X["lat2"], go_X["lon2"], orientation + 0, y * scale) # y is North-coordinate
    return go_Y["lat2"], go_Y["lon2"]
original_line = LineString([(0,0), (100,100), (200,100)])
global_line = LineString([local_to_latlong(x, y) for y, x in original_line.coords])
However, I hope that this is not the smartest way to do it, and that there are smarter ways out there...
I would like to apply such a transformation to any shape within a GeoDataFrame. Ideally, it would work using a "to_crs", but I am not sure how to transform the shapes so they are "in reference to an origin" and which crs to use.
Given that your origin is in EPSG:4326, you can estimate its UTM zone.
With this you can get the UTM coordinates of the origin.
Translate your custom 200x200 metre zone into coordinates of the UTM zone.
Finally, use to_crs() to transform into EPSG:4326.
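To make the first step less of a black box, here is a rough sketch of the arithmetic behind picking a UTM zone. geopandas does this for you through estimate_utm_crs(), so the helper below (the name `utm_epsg` is made up for illustration) is only meant to show the idea. It assumes the standard 6-degree zones and ignores the special zone shapes around Norway and Svalbard.

```python
import math

def utm_epsg(lon, lat):
    """EPSG code of the standard WGS 84 / UTM zone containing (lon, lat) in degrees.
    Hypothetical helper; ignores the Norway and Svalbard zone exceptions."""
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # 326xx codes are northern-hemisphere zones, 327xx southern
    return (32600 if lat >= 0 else 32700) + zone
```

For the Paris origin used here, `utm_epsg(2.350983211585546, 48.853772345870176)` gives 32631, i.e. UTM zone 31N.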
import shapely.geometry
import geopandas as gpd
import pandas as pd
import numpy as np
# generate some polygons (squares), where grid is 200*200
gdf = gpd.GeoDataFrame(
geometry=pd.DataFrame(
np.repeat(np.sort(np.random.randint(0, 200, [20, 2]), axis=1), 2, axis=1)
).apply(lambda d: shapely.geometry.box(*d), axis=1)
)
# change to linestrings, clearer when we plot
gdf["geometry"] = gdf["geometry"].exterior
origin = (2.350983211585546, 48.853772345870176) # this is somewhere in Paris, for example
# work out utm crs of point. utm is in metres
gdf_o = gpd.GeoDataFrame(geometry=[shapely.geometry.Point(origin)], crs="EPSG:4326")
crs = gdf_o.estimate_utm_crs()
# where is origin in utm zone
xo,yo = gdf_o.to_crs(crs).loc[0,"geometry"].xy
# translate custom zone to co-ordinates of utm zone
# assume point is center of 200x200 grid (hence subtract 100)
gdf_gps = gdf["geometry"].translate(xoff=xo[0]-100, yoff=yo[0]-100).set_crs(crs).to_crs("epsg:4326")
# plot on map to show it has worked...
m = gdf_gps.explore()
m = gdf_o.explore(m=m, color="red", marker_kwds={"radius":20})
m
I have a set of points in an example ASCII file showing a 2D image.
I would like to estimate the total area that these points are filling. There are some places inside this plane that are not filled by any point because these regions have been masked out. What I guess might be practical for estimating the area would be applying a concave hull or alpha shapes.
I tried this approach to find an appropriate alpha value, and consequently estimate the area.
from shapely.ops import cascaded_union, polygonize
import shapely.geometry as geometry
from scipy.spatial import Delaunay
import numpy as np
import pylab as pl
from descartes import PolygonPatch
from matplotlib.collections import LineCollection
def plot_polygon(polygon):
    fig = pl.figure(figsize=(10,10))
    ax = fig.add_subplot(111)
    margin = .3
    x_min, y_min, x_max, y_max = polygon.bounds
    ax.set_xlim([x_min-margin, x_max+margin])
    ax.set_ylim([y_min-margin, y_max+margin])
    patch = PolygonPatch(polygon, fc='#999999',
                         ec='#000000', fill=True,
                         zorder=-1)
    ax.add_patch(patch)
    return fig
def alpha_shape(points, alpha):
    if len(points) < 4:
        # When you have a triangle, there is no sense
        # in computing an alpha shape.
        return geometry.MultiPoint(list(points)).convex_hull
    def add_edge(edges, edge_points, coords, i, j):
        """
        Add a line between the i-th and j-th points,
        if not in the list already
        """
        if (i, j) in edges or (j, i) in edges:
            # already added
            return
        edges.add( (i, j) )
        edge_points.append(coords[ [i, j] ])
    coords = np.array([point.coords[0]
                       for point in points])
    tri = Delaunay(coords)
    edges = set()
    edge_points = []
    # loop over triangles:
    # ia, ib, ic = indices of corner points of the
    # triangle
    for ia, ib, ic in tri.vertices:
        pa = coords[ia]
        pb = coords[ib]
        pc = coords[ic]
        # Lengths of sides of triangle
        a = np.sqrt((pa[0]-pb[0])**2 + (pa[1]-pb[1])**2)
        b = np.sqrt((pb[0]-pc[0])**2 + (pb[1]-pc[1])**2)
        c = np.sqrt((pc[0]-pa[0])**2 + (pc[1]-pa[1])**2)
        # Semiperimeter of triangle
        s = (a + b + c)/2.0
        # Area of triangle by Heron's formula
        area = np.sqrt(s*(s-a)*(s-b)*(s-c))
        circum_r = a*b*c/(4.0*area)
        # Here's the radius filter.
        #print circum_r
        if circum_r < 1.0/alpha:
            add_edge(edges, edge_points, coords, ia, ib)
            add_edge(edges, edge_points, coords, ib, ic)
            add_edge(edges, edge_points, coords, ic, ia)
    m = geometry.MultiLineString(edge_points)
    triangles = list(polygonize(m))
    return cascaded_union(triangles), edge_points
points=[]
with open("test.asc") as f:
    for line in f:
        coords=map(float,line.split(" "))
        points.append(geometry.shape(geometry.Point(coords[0],coords[1])))
        print geometry.Point(coords[0],coords[1])
x = [p.x for p in points]
y = [p.y for p in points]
pl.figure(figsize=(10,10))
point_collection = geometry.MultiPoint(list(points))
point_collection.envelope
convex_hull_polygon = point_collection.convex_hull
_ = plot_polygon(convex_hull_polygon)
_ = pl.plot(x,y,'o', color='#f16824')
concave_hull, edge_points = alpha_shape(points, alpha=0.001)
lines = LineCollection(edge_points)
_ = plot_polygon(concave_hull)
_ = pl.plot(x,y,'o', color='#f16824')
I get this result, but I would like this method to detect the hole in the middle.
Update
This is what my real data looks like:
My question is: what is the best way to estimate the area of the aforementioned shape? I cannot figure out why this code doesn't work properly. Any help will be appreciated.
Okay, here's the idea. A Delaunay triangulation is going to generate triangles which are indiscriminately large. It's also going to be problematic because only triangles will be generated.
Therefore, we'll generate what you might call a "fuzzy Delaunay triangulation". We'll put all the points into a kd-tree and, for each point p, look at its k nearest neighbors. The kd-tree makes this fast.
For each of those k neighbors, find the distance to the focal point p. Use this distance to generate a weighting. We want nearby points to be favored over more distant points, so an exponential function exp(-alpha*dist) is appropriate here. Use the weighted distances to build a probability density function describing the probability of drawing each point.
Now, draw from that distribution a large number of times. Nearby points will be chosen often while farther away points will be chosen less often. For each point drawn, make a note of how many times it was drawn for the focal point. The result is a weighted graph where each edge in the graph connects nearby points and is weighted by how often the pairs were chosen.
Now, cull all edges from the graph whose weights are too small. These are the points which are probably not connected. The result looks like this:
Now, let's throw all of the remaining edges into shapely. We can then convert the edges into very small polygons by buffering them. Like so:
Differencing the polygons with a large polygon covering the entire region will yield polygons for the triangulation. THIS MAY TAKE A WHILE. The result looks like this:
Finally, cull off all of the polygons which are too large:
#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
import random
import scipy
import scipy.spatial
import networkx as nx
import shapely
import shapely.geometry
import matplotlib
dat = np.loadtxt('test.asc')
xycoors = dat[:,0:2]
xcoors = xycoors[:,0] #Convenience alias
ycoors = xycoors[:,1] #Convenience alias
npts = len(dat[:,0]) #Number of points
dist = scipy.spatial.distance.euclidean
def GetGraph(xycoors, alpha=0.0035):
    kdt = scipy.spatial.KDTree(xycoors) #Build kd-tree for quick neighbor lookups
    G = nx.Graph()
    npts = np.max(xycoors.shape)
    for x in range(npts):
        G.add_node(x)
        dist, idx = kdt.query(xycoors[x,:], k=10) #Get distances to nearest neighbours; the first hit is the central point itself
        dist = dist[1:] #Drop central point
        idx = idx[1:] #Drop central point
        pq = np.exp(-alpha*dist) #Exponential weighting of nearby points
        pq = pq/np.sum(pq) #Convert to a PDF
        choices = np.random.choice(idx, p=pq, size=50) #Choose neighbors based on PDF
        for c in choices: #Insert neighbors into graph
            if G.has_edge(x, c): #Already seen neighbor
                G[x][c]['weight'] += 1 #Strengthen connection
            else:
                G.add_edge(x, c, weight=1) #New neighbor; build connection
    return G
def PruneGraph(G,cutoff):
    newg = G.copy()
    bad_edges = set()
    for x in newg:
        for k,v in newg[x].items():
            if v['weight']<cutoff:
                bad_edges.add((x,k))
    for b in bad_edges:
        try:
            newg.remove_edge(*b)
        except nx.exception.NetworkXError:
            pass
    return newg
def PlotGraph(xycoors,G,cutoff=6):
    xcoors = xycoors[:,0]
    ycoors = xycoors[:,1]
    G = PruneGraph(G,cutoff)
    plt.plot(xcoors, ycoors, "o")
    for x in range(npts):
        for k,v in G[x].items():
            plt.plot((xcoors[x],xcoors[k]),(ycoors[x],ycoors[k]), 'k-', lw=1)
    plt.show()
def GetPolys(xycoors,G):
    #Get lines connecting all points in the graph
    xcoors = xycoors[:,0]
    ycoors = xycoors[:,1]
    lines = []
    for x in range(npts):
        for k,v in G[x].items():
            lines.append(((xcoors[x],ycoors[x]),(xcoors[k],ycoors[k])))
    #Get bounds of region
    xmin = np.min(xycoors[:,0])
    xmax = np.max(xycoors[:,0])
    ymin = np.min(xycoors[:,1])
    ymax = np.max(xycoors[:,1])
    mls = shapely.geometry.MultiLineString(lines) #Bundle the lines
    mlsb = mls.buffer(2) #Turn lines into narrow polygons
    bbox = shapely.geometry.box(xmin,ymin,xmax,ymax) #Generate background polygon
    polys = bbox.difference(mlsb) #Subtract to generate polygons
    return polys
def PlotPolys(polys,area_cutoff):
    fig, ax = plt.subplots(figsize=(8, 8))
    for polygon in polys:
        if polygon.area<area_cutoff:
            mpl_poly = matplotlib.patches.Polygon(np.array(polygon.exterior), alpha=0.4, facecolor=np.random.rand(3,1))
            ax.add_patch(mpl_poly)
    ax.autoscale()
    fig.show()
#Functional stuff starts here
G = GetGraph(xycoors, alpha=0.0035)
#Choose a value that rips off an appropriate amount of the left side of this histogram
weights = sorted([v['weight'] for x in G for k,v in G[x].items()])
plt.hist(weights, bins=20);plt.show()
PlotGraph(xycoors,G,cutoff=6) #Plot the graph to ensure our cut-offs were okay. May take a while
prunedg = PruneGraph(G,cutoff=6) #Prune the graph
polys = GetPolys(xycoors,prunedg) #Get polygons from graph
areas = sorted(p.area for p in polys)
plt.plot(areas)
plt.hist(areas,bins=20);plt.show()
area_cutoff = 150000
PlotPolys(polys,area_cutoff=area_cutoff)
good_polys = ([p for p in polys if p.area<area_cutoff])
total_area = sum([p.area for p in good_polys])
Here's a thought: use k-means clustering.
You can accomplish this in Python as follows:
from sklearn.cluster import KMeans
import numpy as np
import matplotlib.pyplot as plt
dat = np.loadtxt('test.asc')
xycoors = dat[:,0:2]
fit = KMeans(n_clusters=2).fit(xycoors)
plt.scatter(dat[:,0],dat[:,1], c=fit.labels_)
plt.axes().set_aspect('equal', 'datalim')
plt.gray()
plt.show()
Using your data, this gives the following result:
Now, you can take the convex hull of the top cluster and the bottom cluster and calculate the areas of each separately. Adding the areas then becomes an estimator of the area of their union, but, cunningly, avoids the hole in the middle.
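As a rough illustration of that last step, here is a standard-library sketch that groups points by cluster label, takes the convex hull of each group and sums the hull areas. The function names are made up for this example; in practice you would feed it `xycoors` and `fit.labels_` from the code above, or use scipy/shapely for the hulls. The hull uses Andrew's monotone chain algorithm, which is my choice here rather than anything sklearn provides.

```python
def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_area(points):
    """Area enclosed by the convex hull of the points."""
    hull = convex_hull(points)
    n = len(hull)
    twice = sum(hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
                for i in range(n))
    return abs(twice) / 2.0

def clustered_area(points, labels):
    """Sum of per-cluster convex hull areas."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append((p[0], p[1]))
    return sum(hull_area(g) for g in groups.values())
```

With two 2x2 squares separated by a gap, the per-cluster sum is 8, while the hull of all the points together would also count the gap between them.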
To fine-tune your results, you can play with the number of clusters and the number of different starts to the algorithm (the algorithm is randomized and is typically run more than once).
You asked, for instance, if two clusters will always leave the hole in the middle. I've used the following code to experiment with that. I generate a uniform distribution of points and then chop out a randomly sized and orientated ellipse to simulate a hole.
#!/usr/bin/env python3
import sklearn
import sklearn.cluster
import numpy as np
import matplotlib.pyplot as plt
PWIDTH = 6
PHEIGHT = 6
def GetPoints(num):
    return np.random.rand(num,2)*300-150 #Centered about zero
def MakeHole(pts): #Chop out a randomly orientated and sized ellipse
    a = np.random.uniform(10,150) #Semi-major axis
    b = np.random.uniform(10,150) #Semi-minor axis
    h = np.random.uniform(-150,150) #X-center
    k = np.random.uniform(-150,150) #Y-center
    A = np.random.uniform(0,2*np.pi) #Angle of rotation
    surviving_points = []
    for pt in range(pts.shape[0]):
        x = pts[pt,0]
        y = pts[pt,1]
        if ((x-h)*np.cos(A)+(y-k)*np.sin(A))**2/a/a+((x-h)*np.sin(A)-(y-k)*np.cos(A))**2/b/b>1:
            surviving_points.append(pt)
    return pts[surviving_points,:]
def ShowManyClusters(pts,fitter,clusters,title):
    colors = np.array([x for x in 'bgrcmykbgrcmykbgrcmykbgrcmyk'])
    fig,axs = plt.subplots(PWIDTH,PHEIGHT)
    axs = axs.ravel()
    for i in range(PWIDTH*PHEIGHT):
        lbls = fitter(pts[i],clusters)
        axs[i].scatter(pts[i][:,0],pts[i][:,1], c=colors[lbls])
        axs[i].get_xaxis().set_ticks([])
        axs[i].get_yaxis().set_ticks([])
    plt.suptitle(title)
    #plt.show()
    plt.savefig('/z/'+title+'.png')
fitters = {
'SpectralClustering': lambda x,clusters: sklearn.cluster.SpectralClustering(n_clusters=clusters,affinity='nearest_neighbors').fit(x).labels_,
'KMeans': lambda x,clusters: sklearn.cluster.KMeans(n_clusters=clusters).fit(x).labels_,
'AffinityPropagation': lambda x,clusters: sklearn.cluster.AffinityPropagation().fit(x).labels_,
}
np.random.seed(1)
pts = []
for i in range(PWIDTH*PHEIGHT):
    temp = GetPoints(300)
    temp = MakeHole(temp)
    pts.append(temp)
for name,fitter in fitters.items():
    for clusters in [2,3]:
        np.random.seed(1)
        ShowManyClusters(pts,fitter,clusters,"{0}: {1} clusters".format(name,clusters))
Consider the results for K-Means:
At least to my eye, it seems as though using two clusters performs worst when the "hole" separates the data into two separate blobs. (In this case that occurs when the ellipse is orientated such that it overlaps two edges of the rectangular region containing the sample points.) Using three clusters resolves most of these difficulties.
You'll also notice that K-means produces some counter-intuitive results on the 1st Column, 3rd Row as well as on the 3rd Column, 4th Row. Reviewing sklearn's menagerie of clustering methods here shows the following comparison image:
From this image, it seems as though SpectralClustering produces results that align with what we want. Trying this on the same data above fixes the problems mentioned (see 1st Column, 3rd Row and 3rd Column, 4th Row).
The foregoing suggests that Spectral clustering with three clusters should be adequate for most situations of this sort.
Although you seem intent on doing a concave shape, here is an alternate route that is hella fast and that I think would give you a pretty stable reading:
Create a function which takes the radius of influence as an argument (int radiusOfInfluence). Inside the function, run a voxel filter with that as the radius. Then simply multiply the area of that circle (pi*radiusOfInfluence^2) by the number of remaining points in the cloud. This should give you a relatively robust estimation of area and would be very resilient to holes and weird edges.
Some things to consider:
-This will give you a positive overshoot of area due to over-reaching edges by exactly one radius. A modification to adjust for this could be to run a statistical outlier removal filter (in inverse mode) to acquire statistical edge points. Then an assumption can be made that approximately half of each edge point is lying outside the shape, subtract half the number of points found from your total count prior to multiplying into area.
-The radius of influence largely determines this function's hole detection as a larger one will allow single points to cover larger areas, but also by tuning the std cutoff on the stat outlier filter, you can more aggressively detect interior holes and adjust your area that way.
It really begs the question of what you are after, as this is more of a shot accuracy/ shot grouping type assessment assuming a reasonably distributed set of samples. Your method kinda is making the assumption that your outer edge points are the absolute limits of what is possible (which may be a fair assumption depending on the situation)
EDIT-----------------------
I do not have time to write out example code, but I can further explain to aid in understanding.
At the core of this is the voxel filter. Very simply, it sets a seed point in x,y coordinates and then creates a grid over the whole space which has units (grid spacing) on both axes of a user specified filter radius. Inside each grid box, it will average all points to a single point. This is very important for this concept because it almost entirely eliminates the issue of overlap.
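Here is a bare-bones sketch of that idea in plain Python, in case it helps. It is not taken from any point cloud library: the function names are made up, the grid is seeded at the origin (an assumption), and the radius of influence defaults to the grid spacing but can be set separately.

```python
import math
from collections import defaultdict

def voxel_filter(points, cell):
    """Average all points that fall into the same cell of a square grid.
    The grid is anchored at (0, 0), which is an assumption of this sketch."""
    buckets = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        buckets[key].append((x, y))
    return [
        (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for pts in buckets.values()
    ]

def estimate_area(points, cell, radius_of_influence=None):
    """Number of surviving points times the area of one circle of influence."""
    r = cell if radius_of_influence is None else radius_of_influence
    return len(voxel_filter(points, cell)) * math.pi * r * r
```

With the radius equal to the grid spacing, neighbouring circles overlap, so the number comes out high; setting radius_of_influence to cell/sqrt(pi) makes each circle's area equal to one grid cell, which is one way to tune that knob.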
The second part (the inverse stat outlier removal) is just a bit of cleverness to tighten your edge fit. Basically, stat outlier is built to remove noise by looking at the distance from each point to its (k) nearest neighbors. After generating the average distance to k nearest neighbors for each point, it sets up a histogram, and a user-defined parameter acts as a binary threshold for keeping or removing points. When inverted and set to a reasonable cutoff (~0.75 std should work), it will instead delete all the points that are in the bulk of the object (i.e. only leaving edge points). The reason this is important is that technically these points are over-reaching the boundary of your object by 1 radius. Although some will be on acute and some on obtuse edge angles (i.e. more than or less than half a circle of overfill), taking off 1/2 of a circle area per point should, over the whole object, give you a pretty sound improvement on edge fit.
Keep in mind though that at the end of the day, this is just going to give you a number. As far as stress testing, I suggest creating contrived point clouds of known area and or creating a graphical output that shows where you are dropping circles and half circles (oriented towards the interior of the object if you are fancy).
The knobs you will want to turn to improve this method are:
Voxel filter radius, area of influence per point (could actually be controlled separately from vox filter radius, though they should remain pretty close to one another), std cutoff.
Hope this helped to clarify, cheers!
Edit:
I have noticed that you have your own code to compute the alpha shape,
and the areas of Delaunay triangles are just there, so computing the area of the shape is even easier...
Just add up the areas of the triangles that are going to be added to the alpha-shape polygon.
If you want to detect holes... add a secondary threshold to avoid adding triangles with an area greater than the threshold. For this example, a value of max_area = 99999 will remove the hole.
The only problem is the way you create the graphic output, because you will not see the hole.
def alpha_shape(points, alpha, max_area):
    if len(points) < 4:
        # When you have a triangle, there is no sense
        # in computing an alpha shape.
        # Return three values, matching the normal return below.
        return geometry.MultiPoint(list(points)).convex_hull, [], 0
    def add_edge(edges, edge_points, coords, i, j):
        """
        Add a line between the i-th and j-th points,
        if not in the list already
        """
        if (i, j) in edges or (j, i) in edges:
            # already added
            return
        edges.add( (i, j) )
        edge_points.append(coords[ [i, j] ])
    coords = np.array([point.coords[0]
                       for point in points])
    tri = Delaunay(coords)
    total_area = 0
    edges = set()
    edge_points = []
    # loop over triangles:
    # ia, ib, ic = indices of corner points of the
    # triangle
    for ia, ib, ic in tri.vertices:
        pa = coords[ia]
        pb = coords[ib]
        pc = coords[ic]
        # Lengths of sides of triangle
        a = np.sqrt((pa[0]-pb[0])**2 + (pa[1]-pb[1])**2)
        b = np.sqrt((pb[0]-pc[0])**2 + (pb[1]-pc[1])**2)
        c = np.sqrt((pc[0]-pa[0])**2 + (pc[1]-pa[1])**2)
        # Semiperimeter of triangle
        s = (a + b + c)/2.0
        # Area of triangle by Heron's formula
        area = np.sqrt(s*(s-a)*(s-b)*(s-c))
        circum_r = a*b*c/(4.0*area)
        # Here's the radius filter.
        # print("radius", circum_r)
        if circum_r < 1.0/alpha and area < max_area:
            add_edge(edges, edge_points, coords, ia, ib)
            add_edge(edges, edge_points, coords, ib, ic)
            add_edge(edges, edge_points, coords, ic, ia)
            total_area += area
    m = geometry.MultiLineString(edge_points)
    triangles = list(polygonize(m))
    return cascaded_union(triangles), edge_points, total_area
Old answer:
To compute the area of an irregular simple polygon, you can use the Shoelace formula, and the CCW coordinates of the boundary as input.
If you want to detect holes inside of your cloud, you have to remove the Delaunay triangles with a circumradius greater than a secondary threshold.
The idea is: compute the Delaunay triangulation and filter with your current alpha shape. Then, compute the circumradius of every triangle and remove those triangles with a circumradius much bigger than the average circumradius.
To compute the area of an irregular polygon with holes, use the Shoelace formula for each hole boundary. Input the external boundary in CCW (positive) order to obtain the area. Then input the boundary of each hole in CW (negative) order, to obtain a (negative) value for area.
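A minimal sketch of the Shoelace formula with holes, using plain Python lists of (x, y) tuples. The function names are just for illustration.

```python
def signed_area(ring):
    """Shoelace formula: positive for CCW vertex order, negative for CW."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return total / 2.0

def area_with_holes(exterior_ccw, holes_cw=()):
    """Exterior given CCW (positive area) plus holes given CW (negative areas)."""
    return signed_area(exterior_ccw) + sum(signed_area(h) for h in holes_cw)
```

For a 4x4 square given CCW with a 1x1 hole given CW, the exterior contributes 16 and the hole contributes -1, so the result is 15.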
OBJECTIVE
Load a GIS shapefile (county boundaries) into Basemap
Use Basemap to plot county boundaries
Determine whether or not a location falls within the boundaries
Assign a weight to a point, depending on which boundary it falls into
Use DBSCAN to discover cluster centroids based on coordinates and weight
APPROACH
Using this tutorial on Basemap, upload a shapefile for mapping.
#First, we have to import our datasets.
#These datasets include store locations, existing distribution locations, county borders, and real estate by county
walmartStores = pd.read_csv("data/walmart-stores.csv",header=0, encoding='latin1')
propertyValues = pd.read_csv("data/property values.csv")
shp = fiona.open('data/boundaries/Counties.shp')
#We need to create a workable array with Walmart Stores
longitude = walmartStores.longitude
latitude = walmartStores.latitude
stores = np.column_stack((longitude, latitude))
#We also need to load the shape file for county boundaries
extra = 0.1
bds = shp.bounds
shp.close()
#We need to assign the lower-left bound and upper-right bound
ll = (bds[0], bds[1])
ur = (bds[2], bds[3])
#concatenate the lower left and upper right into a variable called coordinates
coords = list(chain(ll, ur))
print(coords)
#define variables for the width and the height of the map
w, h = coords[2] - coords[0], coords[3] - coords[1]
where print(coords) gives [105571.4206781257, 4480951.235680977, 779932.0626624253, 4985476.422250552]
All is well thus far, however I run into a problem below:
m = Basemap(
#set projection to 'tmerc' to minimize map distortion
projection='tmerc',
#set longitude as average of lower, upper longitude bounds
lon_0 = np.average([bds[0],bds[2]]),
#set latitude as average of lower,upper latitude bounds
lat_0 = np.average([bds[1],bds[3]]),
#string describing ellipsoid (‘GRS80’ or ‘WGS84’, for example).
#Not sure what this does...
ellps = 'WGS84',
#set the map boundaries. Note that we use the extra variable to provide a 10% buffer around the map
llcrnrlon=coords[0] - extra * w,
llcrnrlat=coords[1] - extra + 0.01 * h,
urcrnrlon=coords[2] + extra * w,
urcrnrlat=coords[3] + extra + 0.01 * h,
#provide latitude of 'true scale.'
#check the Basemap API
lat_ts=0,
#resolution of boundary database to use. Can be c (crude), l (low), i (intermediate), h (high), f (full) or None.
resolution='i',
#don't show the axis ticks automatically
suppress_ticks = False)
m.readshapefile(
#provide the path to the shapefile, but leave off the .shp extension
'data/boundaries/Counties.shp',
#name your map something useful (I named this 'srilanka')
'nyCounties',
#set the default shape boundary coloring (default is black) and the zorder (layer order)
color='none',
zorder=2)
Error: lat_0 must be between -90.000000 and 90.000000
QUESTIONS
lat_0 and lon_0 aren't between -90 and 90. However, lon_0 doesn't throw an error. Why is this the case?
I've looked online for others facing a similar issue and have come up empty-handed. Is there something unique with my notebook? (NOTE: conda list shows basemap 1.0.7, so I know that it's installed and running)
Thanks!
Latitude can only be between -90 and 90 - anything else doesn't make sense. The North pole is +90 and the South pole is -90, with the equator at 0. There are no other acceptable values!
Longitude can only be between -180 and 180. 0 is at the Prime Meridian, with values running to -180 going westward and to +180 going eastward.
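A quick way to check this before building the map is to test the shapefile bounds against those ranges. The helper below is only a sketch (the name is made up). Note that the bounds printed in the question, around 105571 and 4480951, are far outside both ranges, which suggests (my assumption, I have not seen the file) that the shapefile stores projected coordinates such as metres rather than degrees.

```python
def looks_like_lonlat(bounds):
    """True if (minx, miny, maxx, maxy) fits inside valid lon/lat ranges in degrees."""
    minx, miny, maxx, maxy = bounds
    return (-180.0 <= minx <= maxx <= 180.0) and (-90.0 <= miny <= maxy <= 90.0)

# Bounds as printed in the question
question_bounds = (105571.4206781257, 4480951.235680977,
                   779932.0626624253, 4985476.422250552)
print(looks_like_lonlat(question_bounds))
```

If the check fails, the coordinates need to be converted to longitude/latitude in degrees before they are averaged into lon_0 and lat_0.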
This demo program (intended to be run in an IPython notebook; you need matplotlib, mpl_toolkits.basemap, pyproj, and shapely) is supposed to plot increasingly large circles on the surface of the Earth. It works correctly as long as the circle does not cross over one of the poles. If that happens, the result is complete nonsense when plotted on a map (see below cell 2)
If I plot them "in a void" instead of on a map (see below cell 3) the results are correct in the sense that, if you removed the horizontal line going from +180 to -180 longitude, the rest of the curve would indeed delimit the boundary between the interior and exterior of the desired circle. However, they are wrong in that the polygon is invalid (.is_valid is False), and much more importantly, the nonzero-winding-number interior of the polygon does not enclose the correct region of the map.
I believe this is happening because shapely.ops.transform is blind to the coordinate singularity at +180==-180 longitude. The question is, how do I detect the problem and repair the polygon, so that it does enclose the correct region of the map? In this case, an appropriate fixup would be to replace the horizontal segment from (X,+180) -- (X,-180) with three lines, (X,+180) -- (+90,+180) -- (+90,-180) -- (X,-180); but note that if the circle had gone over the south pole, the fixup lines would need to go south instead. And if the circle had gone over both poles, we'd have a valid polygon again but its interior would be the complement of what it should be. I need to detect all of these cases and handle them correctly. Also, I do not know how to "edit" a shapely geometry object.
Downloadable notebook: https://gist.github.com/zackw/e48cb1580ff37acfee4d0a7b1d43a037
## cell 1
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import pyproj
from shapely.geometry import Point, Polygon, MultiPolygon
from shapely.ops import transform as sh_transform
from functools import partial
wgs84_globe = pyproj.Proj(proj='latlong', ellps='WGS84')
def disk_on_globe(lat, lon, radius):
    aeqd = pyproj.Proj(proj='aeqd', ellps='WGS84', datum='WGS84',
                       lat_0=lat, lon_0=lon)
    return sh_transform(
        partial(pyproj.transform, aeqd, wgs84_globe),
        Point(0, 0).buffer(radius)
    )
## cell 2
def plot_poly_on_map(map_, pol):
    if isinstance(pol, Polygon):
        map_.plot(*(pol.exterior.xy), '-', latlon=True)
    else:
        assert isinstance(pol, MultiPolygon)
        for p in pol:
            map_.plot(*(p.exterior.xy), '-', latlon=True)
plt.figure(figsize=(14, 12))
map_ = Basemap(projection='cyl', resolution='c')
map_.drawcoastlines(linewidth=0.25)
for rad in range(1,10):
    plot_poly_on_map(
        map_,
        disk_on_globe(40.439, -79.976, rad * 1000 * 1000)
    )
plt.show()
## cell 3
def plot_poly_in_void(pol):
    if isinstance(pol, Polygon):
        plt.plot(*(pol.exterior.xy), '-')
    else:
        assert isinstance(pol, MultiPolygon)
        for p in pol:
            # plt.plot has no latlon keyword; that belongs to Basemap.plot
            plt.plot(*(p.exterior.xy), '-')
plt.figure()
for rad in range(1,10):
    plot_poly_in_void(
        disk_on_globe(40.439, -79.976, rad * 1000 * 1000)
    )
plt.show()
(The sunlit region shown at http://www.die.net/earth/rectangular.html is an example of what a circle that crosses a pole should look like when projected onto an equirectangular map, as long as it's not an equinox today.)
Manually fixing up the projected polygon turns out not to be that bad.
There are two steps: first, find all segments of the polygon that cross the coordinate singularity at longitude ±180, and replace them with excursions to either the north or south pole, whichever is nearest; second, if the resulting polygon doesn't contain the origin point, invert it. Note that both steps must be carried out whether or not shapely thinks the projected polygon is "invalid"; depending on where the starting point is, it may cross one or both poles without being invalid.
This probably isn't the most efficient way to do it, but it works.
import numpy as np
import pyproj
from shapely.geometry import Point, Polygon, box as Box
from shapely.ops import transform as sh_transform
from functools import partial
wgs84_globe = pyproj.Proj(proj='latlong', ellps='WGS84')
def disk_on_globe(lat, lon, radius):
    """Generate a shapely.Polygon object representing a disk on the
    surface of the Earth, containing all points within RADIUS meters
    of latitude/longitude LAT/LON."""
    aeqd = pyproj.Proj(proj='aeqd', ellps='WGS84', datum='WGS84',
                       lat_0=lat, lon_0=lon)
    disk = sh_transform(
        partial(pyproj.transform, aeqd, wgs84_globe),
        Point(0, 0).buffer(radius)
    )
    # Fix up segments that cross the coordinate singularity at longitude ±180.
    # We do this unconditionally because it may or may not create a non-simple
    # polygon, depending on where the initial point was.
    boundary = np.array(disk.boundary)
    i = 0
    while i < boundary.shape[0] - 1:
        if abs(boundary[i+1,0] - boundary[i,0]) > 180:
            # Both endpoints of a crossing segment should be in the same hemisphere.
            assert (boundary[i,1] > 0) == (boundary[i+1,1] > 0)
            vsign = -1 if boundary[i,1] < 0 else 1
            hsign = -1 if boundary[i,0] < 0 else 1
            boundary = np.insert(boundary, i+1, [
                [hsign*179, boundary[i,1]],
                [hsign*179, vsign*89],
                [-hsign*179, vsign*89],
                [-hsign*179, boundary[i+1,1]]
            ], axis=0)
            # Skip the four inserted points and land on the original i+1.
            i += 5
        else:
            i += 1
    disk = Polygon(boundary)
    # If the fixed-up polygon doesn't contain the origin point, invert it.
    if not disk.contains(Point(lon, lat)):
        disk = Box(-180, -90, 180, 90).difference(disk)
    assert disk.is_valid
    assert disk.boundary.is_simple
    assert disk.contains(Point(lon, lat))
    return disk
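To see what the two steps do without pyproj or shapely installed, here is a small pure-Python mirror of the loop above, applied to a toy ring around the north pole. It is only a sketch: fix_ring and contains are hypothetical names, the ring is hand-made rather than a projected disk, and the containment check is a plain ray-casting test in the lon/lat plane rather than shapely's contains.

```python
def fix_ring(ring):
    """Replace each segment that jumps across longitude +/-180 with an
    excursion towards the nearer pole, as in the numpy loop above.
    `ring` is a closed list of (lon, lat) pairs in degrees."""
    out = [ring[0]]
    for (lon0, lat0), (lon1, lat1) in zip(ring, ring[1:]):
        if abs(lon1 - lon0) > 180:
            vsign = -1 if lat0 < 0 else 1
            hsign = -1 if lon0 < 0 else 1
            out += [(hsign * 179, lat0), (hsign * 179, vsign * 89),
                    (-hsign * 179, vsign * 89), (-hsign * 179, lat1)]
        out.append((lon1, lat1))
    return out


def contains(ring, lon, lat):
    """Ray-casting point-in-polygon test in the lon/lat plane."""
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        if (y0 > lat) != (y1 > lat):
            x_cross = x0 + (lat - y0) * (x1 - x0) / float(y1 - y0)
            if x_cross > lon:
                inside = not inside
    return inside


ring = [(-135, 80), (-45, 80), (45, 80), (135, 80), (-135, 80)]
fixed = fix_ring(ring)
print(fixed[4:8])               # the four inserted excursion points
print(contains(ring, 0, 85))    # False: the raw ring is degenerate
print(contains(fixed, 0, 85))   # True: the fixed ring covers the polar cap
```

If the fixed ring still did not contain the center, the second step (taking the difference from the whole-map box) would apply.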
The other problem -- mpl_toolkits.basemap.Basemap.plot producing garbage -- is not corrected by fixing up the polygon as above. However, if you manually project the polygon into map coordinates and then draw it using a descartes.PolygonPatch, that works, as long as the projection has a rectangular boundary, and that's enough of a workaround for me. (I think it would work for any projection if one added a lot of extra points along all straight lines at the map boundary.)
%matplotlib inline
from matplotlib import pyplot as plt
from mpl_toolkits.basemap import Basemap
from descartes import PolygonPatch
plt.figure(figsize=(14, 12))
map_ = Basemap(projection='cea', resolution='c')
map_.drawcoastlines(linewidth=0.25)
for rad in range(3,19,2):
    plt.gca().add_patch(PolygonPatch(
        sh_transform(map_,
                     disk_on_globe(40.439, -79.976, rad * 1000 * 1000)),
        alpha=0.1))
plt.show()
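The parenthetical idea above, adding extra points along the straight fixup lines so that they follow a curved map boundary after projection, could look roughly like this. It is an untested-against-Basemap sketch on plain (lon, lat) tuples; densify and max_step are hypothetical names, and the interpolation is linear in degrees, which is only appropriate for the artificial edges along the map boundary.

```python
import math


def densify(ring, max_step=1.0):
    """Insert intermediate vertices so that no segment spans more than
    max_step degrees in either longitude or latitude."""
    out = [ring[0]]
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        span = max(abs(x1 - x0), abs(y1 - y0))
        n = max(1, int(math.ceil(span / float(max_step))))
        for k in range(1, n + 1):
            f = k / float(n)
            out.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return out


# The long edge along latitude 89 becomes a chain of short segments.
edge = [(179, 89), (-179, 89)]
dense = densify(edge, max_step=2.0)
print(len(dense))  # 180
```

The densified coordinates would then be passed to Polygon before projecting into map coordinates.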
Some satellite-based earth observation products provide latitude/longitude information, while others provide the X/Y coordinates within a given grid projection (and some provide both, see example).
My approach in the second case is to set up a Basemap map with the same parameters (projection, ellipsoid, origin of map) as given by the data provider, so that the given X/Y values equal the Basemap coordinates. However, if I do so, the geolocation does not agree with other data sets, including the Basemap coastline.
I have experienced this with three different data sets from different trustworthy sources. For the minimal example I use Landsat data provided by the U.S. Geological Survey, which includes both the X/Y coordinates on a South Polar Stereographic grid and the corresponding lat/lon coordinates for all four corners of the image.
From a Landsat metafile we get (ID: LC82171052016079LGN00):
CORNER_UL_LAT_PRODUCT = -66.61490 CORNER_UL_LON_PRODUCT = -61.31816
CORNER_UR_LAT_PRODUCT = -68.74325 CORNER_UR_LON_PRODUCT = -58.04533
CORNER_LL_LAT_PRODUCT = -67.68721 CORNER_LL_LON_PRODUCT = -67.01109
CORNER_LR_LAT_PRODUCT = -69.94052 CORNER_LR_LON_PRODUCT = -64.18581
CORNER_UL_PROJECTION_X_PRODUCT = -2259300.000
CORNER_UL_PROJECTION_Y_PRODUCT = 1236000.000
CORNER_UR_PROJECTION_X_PRODUCT = -1981500.000
CORNER_UR_PROJECTION_Y_PRODUCT = 1236000.000
CORNER_LL_PROJECTION_X_PRODUCT = -2259300.000
CORNER_LL_PROJECTION_Y_PRODUCT = 958500.000
CORNER_LR_PROJECTION_X_PRODUCT = -1981500.000
CORNER_LR_PROJECTION_Y_PRODUCT = 958500.000
...
GROUP = PROJECTION_PARAMETERS
MAP_PROJECTION = "PS"
DATUM = "WGS84"
ELLIPSOID = "WGS84"
VERTICAL_LON_FROM_POLE = 0.00000
TRUE_SCALE_LAT = -71.00000
FALSE_EASTING = 0
FALSE_NORTHING = 0
GRID_CELL_SIZE_PANCHROMATIC = 15.00
GRID_CELL_SIZE_REFLECTIVE = 30.00
GRID_CELL_SIZE_THERMAL = 30.00
ORIENTATION = "NORTH_UP"
RESAMPLING_OPTION = "CUBIC_CONVOLUTION"
END_GROUP = PROJECTION_PARAMETERS
By using Basemap with the right map projection we should be able to derive the corner lat/lon values from the X/Y values:
import numpy as np
from mpl_toolkits.basemap import Basemap
m=Basemap(resolution='h',projection='spstere', ellps='WGS84', boundinglat=-60,lon_0=180, lat_ts=-71)
x_crn=np.array([-2259300,-1981500,-2259300,-1981500])# upper left, upper right, lower left, lower right
y_crn=np.array([1236000, 1236000, 958500, 958500])# upper left, upper right, lower left, lower right
x0, y0= m(0, -90)
#Basemap coordinates at the south pole
#note that (0,0) of the Basemap is in a corner of the map,
#while other data sets use the south pole.
#This is easy to take into account:
lon_crn, lat_crn = m(x0-x_crn, y0-y_crn, inverse=True)
print('lon_crn: '+str(lon_crn))
print('lat_crn: '+str(lat_crn))
Which returns:
lon_crn: [-61.31816102 -58.04532791 -67.01108782 -64.1858106 ]
lat_crn: [-67.23548626 -69.3099076 -68.28071626 -70.47651326]
As you can see, the longitudes agree with those from the metafile to the given precision, but the latitudes are too low.
I can approximate the latitudes by:
lat_crn=(lat_crn+90.)*1.0275-90.
But this is really not satisfying.
This is how the image is located if using the X/Y corner coordinates from the metafile (in red the Basemap drawcoastlines()):
and this is how it looks using the corner lat/lon:
In this case I can simply use the lat/lon coordinates, but as mentioned before there are datasets (like this) which are provided with X/Y coordinates only, which makes it very important to be able to rely on the Basemap projection. I know that there are other modules to re-project the data as a potential workaround, but it should work without other modules, and a re-projection could introduce errors itself.
As this problem appears with different data sets, I am inclined to believe that it is a bug in the Basemap module, but I might also be making the same mistake again and again or have wrong expectations.
I did some experimentation and it seems like changing lat_ts has no effect with projection='spstere'. In fact, it seems as if the latitude of true scale is implicitly fixed at lat_ts=-90., regardless of what value you assign.
I had more success using projection='stere' instead, so that you would construct the Basemap in your example as follows:
m=Basemap(width=5400000., height=5400000., projection='stere',
          ellps='WGS84', lon_0=180., lat_0=-90., lat_ts=-71.)
You may prefer to set the latitude and longitude of the corners instead of the width and height of the plot for your application.
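The observation about lat_ts can be cross-checked without Basemap by inverting the polar stereographic projection by hand. The sketch below uses the ellipsoidal polar stereographic formulas as I recall them from Snyder's "Map Projections: A Working Manual", so treat it as an assumption to verify rather than a reference implementation. The function name is hypothetical, the X/Y origin is assumed to be at the pole, and lon_0=0 is taken from VERTICAL_LON_FROM_POLE in the metafile.

```python
import math

A = 6378137.0                 # WGS84 semi-major axis in metres
F = 1.0 / 298.257223563       # WGS84 flattening
E = math.sqrt(F * (2.0 - F))  # first eccentricity


def _t(phi):
    """Snyder's auxiliary quantity t for a positive latitude phi (radians)."""
    es = E * math.sin(phi)
    return math.tan(math.pi / 4 - phi / 2) / ((1 - es) / (1 + es)) ** (E / 2)


def south_polar_stereo_inverse(x, y, lat_ts=-71.0, lon_0=0.0):
    """Hypothetical helper: X/Y in metres (origin at the south pole)
    to (lon, lat) in degrees."""
    rho = math.hypot(x, y)
    phi_c = math.radians(abs(lat_ts))
    if abs(phi_c - math.pi / 2) < 1e-12:
        # True scale at the pole itself (scale factor 1 at the pole).
        scale = 2 * A / math.sqrt((1 + E) ** (1 + E) * (1 - E) ** (1 - E))
    else:
        m_c = math.cos(phi_c) / math.sqrt(1 - (E * math.sin(phi_c)) ** 2)
        scale = A * m_c / _t(phi_c)
    t = rho / scale
    phi = math.pi / 2 - 2 * math.atan(t)
    for _ in range(10):  # fixed-point iteration for the geodetic latitude
        es = E * math.sin(phi)
        phi = math.pi / 2 - 2 * math.atan(t * ((1 - es) / (1 + es)) ** (E / 2))
    lon = lon_0 + math.degrees(math.atan2(x, y))
    return lon, -math.degrees(phi)


# Upper-left corner from the metafile.
print(south_polar_stereo_inverse(-2259300.0, 1236000.0))
# Same point with true scale forced to the pole.
print(south_polar_stereo_inverse(-2259300.0, 1236000.0, lat_ts=-90.0))
```

With lat_ts=-71 the latitude should land close to the metafile value of -66.61490, and with lat_ts=-90 close to the -67.2355 that spstere returned in the question, which is consistent with spstere ignoring lat_ts.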