The situation is as follows:
Each supplier has some service areas, which the user has defined using Google Maps (polygons).
I need to store this data in the DB and run simple (but fast) queries over it.
Queries should look like: "List all suppliers with a service area containing x,y" or "Which polygons (service areas) contain x,y?"
So far I've found GeoDjango, which looks like a very complex solution to this problem. To use it, I need quite a complex setup, and I couldn't find any recent (and good) tutorial.
I came up with this solution:
Store every polygon as JSON in the database
Apply a method to determine whether some x,y belongs to any polygon
The problem with this solution is quite obvious: queries may take too long to execute, considering I need to evaluate every polygon.
Finally: I'm looking for another solution to this problem, and I hope to find something that doesn't require setting up GeoDjango on my currently running server.
Determining whether some point is inside a polygon is not a problem (I found several examples); the problem is that retrieving every single polygon from the DB and evaluating it does not scale. To solve that, I need to store the polygons in such a way that I can query them fast.
My approach:
Find the centroid of the polygon (C++ code).
Store it in the database.
Find the longest distance from a vertex to the centroid (Pythagoras).
Store it as the radius.
Search the database using the centroid and radius as a bounding box.
If there are one or more results, run point-in-polygon on the resulting polygons.
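The steps above can be sketched in plain Python. This is only an illustration under a few assumptions: the rows list stands in for the database table, the function names are hypothetical, and the "centroid" is the simple average of the vertices (any centre works, as long as the radius is the largest vertex distance from it). The exact test is a standard ray-casting check.

```python
import math

def centroid_and_radius(vertices):
    """Return (cx, cy, r): vertex-average centre and the largest vertex distance from it."""
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    r = max(math.hypot(x - cx, y - cy) for x, y in vertices)
    return cx, cy, r

def point_in_polygon(px, py, vertices):
    """Ray casting: toggle 'inside' for every edge crossed by a ray going right from (px, py)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge spans the ray's y coordinate
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def find_containing(px, py, rows):
    """rows: (name, vertices, (cx, cy, r)) tuples, standing in for database records."""
    hits = []
    for name, vertices, (cx, cy, r) in rows:
        # Cheap bounding-box prefilter; in SQL this would be a WHERE clause on cx, cy, r.
        if abs(px - cx) > r or abs(py - cy) > r:
            continue
        if point_in_polygon(px, py, vertices):  # exact test on the few candidates
            hits.append(name)
    return hits

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
triangle = [(10, 10), (14, 10), (12, 14)]
rows = [("square", square, centroid_and_radius(square)),
        ("triangle", triangle, centroid_and_radius(triangle))]
```

Calling find_containing(1, 1, rows) only runs the exact test on the square, because the triangle is rejected by the bounding-box check.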
This solution enables you to store polygons outside of GeoDjango to dramatically speed up point in polygon queries.
In my case, I needed to find whether the coordinates of my numpy arrays were inside a polygon stored in my GeoDjango db (land/water masking). This required iterating over every coordinate combination in my arrays to test if it was inside or outside the polygon. As my arrays are large, this was taking a very long time using GeoDjango.
Using django's GEOSGeometry.contains my command looked something like this:
import numpy as np
from django.contrib.gis.geos import Point
my_polygon = model.geometry # get model multipolygon field
lon_lat = zip(longitude.flat, latitude.flat) # zip coordinate arrays into (lon, lat) tuples
mask = np.array([my_polygon.contains(Point(l)) for l in lon_lat]) # boolean mask
This was taking 20 s or more on large arrays. I tried different ways of applying the geometry.contains() function over the array (e.g. np.vectorize) but this did not lead to any improvements. I then realised it was the Django contains lookup which was taking too long. I also converted the geometry to a shapely polygon and tested shapely's polygon.contains function - no difference or worse.
The solution lay in bypassing GeoDjango by using the Polygon package's isInside method. First I created a function to build a Polygon object from my GEOS MultiPolygon.
from Polygon import Polygon

def multipolygon_to_polygon(multipolygon):
    """
    Convert a GEOS MultiPolygon to a python Polygon
    """
    polygon = multipolygon[0] # select first polygon object
    nrings = polygon.num_interior_rings # get number of interior rings (holes)
    poly = Polygon()
    poly.addContour(polygon[0].coords) # add exterior ring coordinates tuple
    # Add interior rings as holes
    if nrings > 0:
        for i in range(nrings):
            print("Adding ring %s" % str(i+1))
            hole = True
            poly.addContour(polygon[i+1].coords, hole)
    return poly
Applying this to my problem:
my_polygon = model.geometry # get model multipolygon field
polygon = multipolygon_to_polygon(my_polygon) # convert to python Polygon
lat_lon = zip(bands['latitude'].flat, bands['longitude'].flat) # (lat, lon) tuples
land_mask = np.array([not polygon.isInside(ll[1], ll[0]) for ll in lat_lon]) # isInside takes x (lon), then y (lat)
This resulted in a roughly 20X improvement in speed. Hope this helps someone.
Python 2.7.
I have multiple standard-shaped bricks in an IFC file, of type IfcBuildingElementProxy. While I have already managed to extract their positions from the IFC file, I now struggle to get the geometry (length, height, width) from the file. I know that there are 2 ways to get the geometry:
Parse through the representation attributes of the bricks and try to write code that calculates the geometry. This method is really exhausting, as IFC files tend to work with a lot of references. I won't go down this path.
Get the geometry using an engine like ifcopenshell and opencascade. I know how to cast the bricks into a TopoDS object, but struggle to find the right methods to get the geometry.
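import ifcopenshell
import ifcopenshell.geom
settings = ifcopenshell.geom.settings()
# ifc_file is the already opened IFC file
bricklist = ifc_file.by_type('IfcBuildingElementProxy')
for brick in bricklist:
    shape = ifcopenshell.geom.create_shape(settings, brick).geometry
    # shape.methodtogetXYZgeometrics???  <- which method returns length, height, width?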
Use
from ifcopenshell import geom
settings = geom.settings()
settings.set(settings.USE_WORLD_COORDS, True) # translates and rotates the points to their world coordinates
...
shape = geom.create_shape(settings, brick)
points = shape.geometry.verts # the xyz points, as a flat sequence (x0, y0, z0, x1, y1, z1, ...)
triangles = shape.geometry.faces # indexes into the points; 3 indexes form one triangle
Note that you can also use the element's 4x3 rotation/translation matrix and do the points translation yourself if you do not use the USE_WORLD_COORDS setting. This matrix is given by
shape.transformation.matrix.data
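Once the flat list of xyz points is available, the brick dimensions can be read off its bounding box. The sketch below is only an illustration: the function name is hypothetical, the sample vertices are made up, and it assumes the brick's edges are aligned with the coordinate axes (for a rotated brick, the untransformed local coordinates would be the better input).

```python
def bounding_box_dimensions(verts):
    """verts: flat sequence (x0, y0, z0, x1, y1, z1, ...), as in shape.geometry.verts."""
    xs, ys, zs = verts[0::3], verts[1::3], verts[2::3]
    # Extent along each axis = largest coordinate minus smallest coordinate.
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))

# Made-up example: the 8 corners of a 2.0 x 1.0 x 0.5 box with one corner at (1, 1, 0).
sample_verts = (
    1.0, 1.0, 0.0,   3.0, 1.0, 0.0,   3.0, 2.0, 0.0,   1.0, 2.0, 0.0,
    1.0, 1.0, 0.5,   3.0, 1.0, 0.5,   3.0, 2.0, 0.5,   1.0, 2.0, 0.5,
)
size_x, size_y, size_z = bounding_box_dimensions(sample_verts)
```

Which of the three extents is the length, width or height depends on how the bricks are oriented in the model.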
With the Trimesh module in Python, I am able to get 2D cross-sections from an STL file, with the code shown below.
import trimesh
mesh = trimesh.load_mesh('MyFile.stl')
slicex = mesh.section(plane_origin=mesh.centroid, plane_normal=[0,30,0])
slice_2D, to_3D = slicex.to_planar()
With the 2D path (slice_2D) obtained from the above code, I am able to get the polygons in it as a NumPy array and iterate over them with the code below:
for polygon in slice_2D.polygons_closed:
    trimesh.path.polygons.plot_polygon(polygon, show=True)
The above code SHOWS the polygons on the console. However, I would like to know if there is a way to get the properties of the polygon, for example: No. of edges in the polygon; Perimeter and Area of the polygon; Type of Polygon (triangle or square or rectangle or parallelogram or circle, etc.).
Any help in this regard would be much appreciated!
The property "polygons_closed" returns an array of shapely polygons. So to get, e.g., the area, use:
for polygon in slice_2D.polygons_closed:
    trimesh.path.polygons.plot_polygon(polygon, show=True)
    print(polygon.area)
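Shapely polygons also expose polygon.length (the perimeter) and polygon.exterior.coords (the boundary vertices, with the first point repeated at the end). To show what those numbers mean, here is a plain-Python sketch that derives the edge count, perimeter and area from such a coordinate list; the function name and the sample rectangle are made up for illustration.

```python
import math

def polygon_properties(coords):
    """coords: closed ring, e.g. list(polygon.exterior.coords), first point repeated last."""
    edges = list(zip(coords[:-1], coords[1:]))  # consecutive vertex pairs
    perimeter = sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in edges)
    # Shoelace formula; abs() makes the result independent of vertex order.
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2.0
    return {"edges": len(edges), "perimeter": perimeter, "area": area}

rectangle = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
props = polygon_properties(rectangle)
```

The edge count is only a starting point for naming the shape: a sliced circle arrives as a polygon with many short edges, and collinear vertices can inflate the count.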
We are using the shapely library to check that a random point is not in any of the prohibited areas stored in a shapefile.
import fiona
import shapely.geometry
with fiona.open(path) as source:
    geometry = get_exclusive_item(source[0])
geom = shapely.geometry.shape(geometry['geometry'])
def check(lat, lng):
    point = shapely.geometry.Point(lng, lat)
    return not geom.contains(point)
But the last call, geom.contains(point), takes about a second to complete. Are there any faster libraries for Python, or could we optimize the shapefile somehow to get better speed?
Thanks to @iant for the pointer to spatial indexes.
My shapefile was a single MultiPolygon with a lot of points, which makes .contains() really slow.
I solved the issue by splitting it into smaller shapes and using an Rtree index.
To split the shapefile I used QGIS, as described here - https://gis.stackexchange.com/a/23694/65569
The core idea of how to use Rtree in Python is here - https://gis.stackexchange.com/a/144764/65569
In total this gave me a 1000x speed-up for .contains() lookups!
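The idea behind the index is to look only at the few small shapes whose bounding box covers the point, and to call .contains() on those alone. The sketch below is not the Rtree library: it swaps in a plain-Python uniform grid to show the same principle, and the class name, cell size and sample boxes are all made up for illustration.

```python
from collections import defaultdict

class GridIndex:
    """Uniform grid over bounding boxes: a simple stand-in for an R-tree."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> [(item_id, bounds), ...]

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, item_id, bounds):
        """bounds = (minx, miny, maxx, maxy), e.g. a shapely geometry's .bounds."""
        minx, miny, maxx, maxy = bounds
        c0, r0 = self._cell(minx, miny)
        c1, r1 = self._cell(maxx, maxy)
        # Register the item in every grid cell its bounding box touches.
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                self.cells[(col, row)].append((item_id, bounds))

    def candidates(self, x, y):
        """Ids of items whose bounding box contains (x, y)."""
        bucket = self.cells.get(self._cell(x, y), [])
        return [item_id for item_id, (minx, miny, maxx, maxy) in bucket
                if minx <= x <= maxx and miny <= y <= maxy]

index = GridIndex(cell_size=10.0)
index.insert("area_a", (0.0, 0.0, 15.0, 5.0))
index.insert("area_b", (12.0, 2.0, 30.0, 25.0))
```

Only the ids returned by index.candidates(x, y) then need the exact (and slow) .contains() check.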
My Profile model has this field:
location = models.PointField(geography=True, dim=2, srid=4326)
I'd like to calculate the distance between two of these locations (taking into account that the Earth is a spheroid) using GeoDjango, so that I can store this distance in the database.
How can I calculate this distance with GeoDjango?
What units are the results in?
Is there a 'best' way to store this data? Float? Decimal?
I've reviewed previous, similar questions, and haven't found them useful. No answer gives enough explanation of what's happening or why it works.
I'm using Django 1.8 and the latest versions of required libraries for GeoDjango.
Thanks!
Following Abhyudit Jain's comment, I'm using geopy to calculate distance. I'm adding it as a property as opposed to storing it, as per e4c5's advice:
from django.contrib.gis.measure import Distance, D
from geopy.distance import distance
@property
def distance(self):
    return Distance(m=distance(self.user_a.profile.location, self.user_b.profile.location).meters)
Geopy defaulted to Vincenty’s formulae at the time (the error of up to about 0.5% quoted in its documentation applies to the simpler great-circle model), and it contains a lot of other functionality I'll use in future.
The above returns a GeoDjango Distance object, ready for easy conversion between measurements.
Thanks for the help!
How can I calculate this distance with GeoDjango?
For two objects:
a.location.distance(b.location)
Suppose you have an object a which is an instance of your Profile model and you wish to find the distance to every other profile. You can perform the following query, as described in the GeoDjango reference:
for profile in Profile.objects.distance(a.location):
    print profile.distance
If you only want to compare with objects that are less than 1 km away:
from django.contrib.gis.measure import D
for profile in Profile.objects.filter(location__dwithin=(a.location, D(km=1))).distance(a.location):
    print profile.distance
What units are the results in?
The unit can be whatever you want it to be. What's returned is a Distance object. However, the default is meters, and that's what the print statement above will display.
Is there a 'best' way to store this data? Float? Decimal?
The best way is not to save it. Typically one does not save in a database what can be calculated by a simple query. And the number of records will grow quadratically. For example, if you have N profiles in your database, each one will have a distance to N-1 other profiles, so you end up with N(N-1) records in your 'cache table'.
To get the distance computed in a GeoQuerySet you can combine annotate with the GeoDjango Distance database function (not to be confused with the Distance measure)
from django.contrib.gis.db.models.functions import Distance
queryset = Profile.objects.annotate(
    distance=Distance('location', a.location)
)
The annotated distance will be a Distance measure. Meaning you can do the following:
for profile in queryset:
    print(profile.distance.mi) # or km, etc
To filter for profiles within a certain radius you can add a filter to the QuerySet.
from django.contrib.gis.db.models.functions import Distance as DistanceDBFunction
from django.contrib.gis.measure import Distance as DistanceMeasure
queryset = Profile.objects.annotate(
    distance=DistanceDBFunction('location', a.location)
).filter(
    distance__lt=DistanceMeasure(mi=1)
)
If you do not need the annotated distance, you can simply use the distance lookups.
from django.contrib.gis.measure import Distance
queryset = Profile.objects.filter(
    location__distance_lt=(a.location, Distance(mi=1))
)
Note: Profile.objects.distance(a.location), as used in other answers, has been deprecated since Django 1.9.
I am looking for a minimalistic solution for doing basic geospatial search in Python.
We have a dataset of roughly 10 k locations and we need to find all locations within a radius of N kilometers from a given location. I am not looking for an explicit database with geospatial support. I hope to get by without another external solution. Is there something that would use Python only?
Shapely seems to be a good solution. Its description seems to correspond to what you're looking for :
[Shapely] It lets you do PostGIS-ish stuff outside the context of a database using Python.
It is based on GEOS, which is a widely used C++ library.
Here is a link to the documentation
scipy.spatial has a kd-tree implementation that might be the most popular in Python.
A self-made solution without any geospatial modules (it needs only NumPy) could be something like this:
import numpy as np
points = np.array([[22.22, 33.33],
                   [08.00, 05.00],
                   [03.12, 05.00],
                   [09.00, 08.00],
                   [-02.5, 03.00],
                   [0.00, -01.00],
                   [-10.0,-10.00],
                   [12.00, 12.00],
                   [-4.00, -6.00]])
r = 10.0 # Radius within which the points should lie
xm = 3 # Center x coordinate
ym = 8 # Center y coordinate
points_i = points[((points[:,0] - xm)**2 + (points[:,1] - ym)**2)**(1/2.0) < r]
points_i contains those points which lie within the radius. This solution requires the data to be in a numpy array, which to my knowledge is also a very fast way to go through large data sets, as opposed to for loops. I guess this solution is pretty minimalistic. The plot below shows the outcome with the data given in the code.
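The code above measures straight-line distance in whatever units the coordinates use. If the locations are latitude/longitude pairs and the radius is in kilometers, as in the question, the distance has to be computed on the sphere instead. Here is a standard-library sketch using the haversine formula; it assumes a spherical Earth with a mean radius of 6371 km, and the function names and sample cities are only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_radius(center, locations, radius_km):
    """Names of all (name, lat, lon) locations within radius_km of center=(lat, lon)."""
    lat0, lon0 = center
    return [name for name, lat, lon in locations
            if haversine_km(lat0, lon0, lat, lon) <= radius_km]

locations = [("Paris", 48.8566, 2.3522),
             ("Versailles", 48.8049, 2.1204),
             ("London", 51.5074, -0.1278)]
nearby = within_radius((48.8566, 2.3522), locations, 50.0)
```

For roughly 10 k locations, a linear scan like this is usually fast enough; the same formula can be vectorised with NumPy in the style of the answer above.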