I have a large list of longitude latitude points and want to find the nearest rectangle (so which rectangle contains the point) in a given raster of geographic coordinates.
However, for the raster I only have the centroids of each rectangle (polygon) in the raster. I know though that the rectangles have a size of 250m x 250m.
Just checking for absolute distance or geographic distance to the centers does not work, as the rectangles are not necessarily aligned. I am happy to get ideas.
I think you could generate your raster of geographic coordinates that represent raster cells following this approach: https://gis.stackexchange.com/questions/177061/ascii-file-with-latitude-longitude-and-data-to-geotiff-using-python
And then, if you created a shapefile of your latitude and longitude points, you could get the raster cell ID for each point using this approach:
from osgeo import gdal, ogr
import pandas as pd

def GetRasterValueAtPoints(rasterfile, shapefile, fieldname):
    '''
    __author__ = "Marc Weber <weber.marc@epa.gov>"
    Original code attribution: https://gis.stackexchange.com/a/46898/2856
    returns raster values at points in a point shapefile
    assumes same projection in shapefile and raster file
    Arguments
    ---------
    rasterfile : a raster file with full pathname and extension
    shapefile : a shapefile with full pathname and extension
    fieldname : field name in the shapefile to identify values
    '''
    src_ds = gdal.Open(rasterfile)
    no_data = src_ds.GetRasterBand(1).GetNoDataValue()
    gt = src_ds.GetGeoTransform()
    rb = src_ds.GetRasterBand(1)
    rows = []
    ds = ogr.Open(shapefile)
    lyr = ds.GetLayer()
    for feat in lyr:
        geom = feat.GetGeometryRef()
        name = feat.GetField(fieldname)
        mx, my = geom.GetX(), geom.GetY()  # coord in map units
        # Convert from map to pixel coordinates.
        # Only works for geotransforms with no rotation.
        px = int((mx - gt[0]) / gt[1])  # x pixel
        py = int((my - gt[3]) / gt[5])  # y pixel
        # ReadAsArray returns a 1x1 array; pull out the scalar.
        intval = rb.ReadAsArray(px, py, 1, 1)[0, 0]
        if intval == no_data:
            intval = -9999
        rows.append({fieldname: name, "RasterVal": float(intval)})
    return pd.DataFrame(rows, columns=[fieldname, "RasterVal"])
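The map-to-pixel step in the function above is the part that answers "which rectangle contains the point", so here is a minimal sketch of it on its own, without GDAL. The function name `map_to_pixel` and the 250 m geotransform values are my own assumptions for illustration, and the sketch assumes a north-up grid in a projected (metre-based) coordinate system.

```python
import math

def map_to_pixel(gt, mx, my):
    """Convert map coordinates to (column, row) for a north-up geotransform.

    gt uses the GDAL ordering:
    (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height).
    """
    if gt[2] != 0 or gt[4] != 0:
        raise ValueError("rotated geotransforms are not handled by this sketch")
    # floor() rather than int(): int() truncates toward zero, so a point just
    # outside the left or top edge would wrongly land in column/row 0.
    col = math.floor((mx - gt[0]) / gt[1])
    row = math.floor((my - gt[3]) / gt[5])
    return col, row

# Hypothetical grid: upper-left corner at (500000, 4000000), 250 m x 250 m cells.
gt = (500000.0, 250.0, 0.0, 4000000.0, 0.0, -250.0)
cell = map_to_pixel(gt, 500300.0, 3999400.0)   # a point 300 m east, 600 m south
outside = map_to_pixel(gt, 499900.0, 3999900.0)  # a point 100 m west of the grid
```

A negative column or row (as in `outside`) means the point lies off the grid, which is worth checking before reading from the raster.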
Related
I need to rotate a polygon (which I am provided as a KML file) on a map, so that its shape remains the same.
Here is the function I use on each coordinate in turn to perform the rotation:
import numpy as np

def rotate(points, origin, angle):
    # Convert to radians:
    rotation_angle = np.deg2rad(angle)
    # Build transformer:
    R = np.array([[np.cos(rotation_angle), -np.sin(rotation_angle)],
                  [np.sin(rotation_angle), np.cos(rotation_angle)]])
    # Convert list to at least 2 dimension arrays...
    o = np.atleast_2d(origin)
    p = np.atleast_2d(points)
    # Return transformed points:
    return np.squeeze((R @ (p.T - o.T) + o.T).T)
When I perform this on all the coordinates in a rectangle and rotate 90 degrees anti-clockwise, I get a parallelogram. I think this is because a degree of longitude and a degree of latitude do not span the same distance on the ground.
Here is an example of what I am seeing:
The green rectangle is the original shape, the red parallelogram is the rotated shape, and the origin is shown for ease of understanding.
I have spent too long trying to make this work and there is obviously a simple way to achieve this which both I and trusty Google are missing!
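One way to test that hypothesis is to rotate in a locally scaled frame instead of in raw degrees. The sketch below is only an approximation under stated assumptions: it assumes the polygon is small enough for a local equirectangular approximation, and it scales longitude offsets by the cosine of the origin's latitude before rotating. The function name `rotate_lonlat` is mine, not from any library; for large shapes, projecting to a metric coordinate system first would probably be the safer route.

```python
import math

def rotate_lonlat(points, origin, angle_deg):
    """Rotate (lon, lat) points anticlockwise about origin, keeping shape locally.

    Assumption: a small area, so longitude offsets can be scaled by
    cos(latitude of origin) to make both axes comparable.
    """
    lon0, lat0 = origin
    k = math.cos(math.radians(lat0))  # length of 1 deg of longitude relative to latitude
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    rotated = []
    for lon, lat in points:
        x = (lon - lon0) * k  # east offset, expressed in "latitude degrees"
        y = lat - lat0        # north offset
        xr = cos_a * x - sin_a * y
        yr = sin_a * x + cos_a * y
        # Undo the scaling on the east-west axis when converting back.
        rotated.append((lon0 + xr / k, lat0 + yr))
    return rotated

# At 60 degrees north a degree of longitude is half as long as a degree of
# latitude, so a point 0.2 deg east should end up 0.1 deg north after 90 deg.
result = rotate_lonlat([(0.2, 60.0)], (0.0, 60.0), 90.0)
```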
I am trying to reproject polar data into Cartesian data that matches along latitude / longitude lines. The code that I have thus far is as follows:
latitude = 35.6655197143554
longitude = -78.48975372314453
# Convert to Cartesian
x = ranges * np.sin(np.deg2rad(azimuths))[:,None]
y = ranges * np.cos(np.deg2rad(azimuths))[:,None]
# Setup a projection
dataproj = Proj(f"+proj=stere +lat_0={latitude} +lat_ts={latitude} +lon_0={longitude} +ellps=WGS84 +units=m")
lons,lats = dataproj(x,y,inverse=True)
...
...
# Plot
im = ax.pcolormesh(lons,lats,data,cmap=cmap_data,norm=norm_cmap)
where the data is a [720,1832] array. The output from plotting looks like below:
Notice how the individual colored pixels move across the latitude and longitude lines. How might I add and/or change the code I have thus far to make the data aligned along lat/lons?
I have some polygons (Canadian provinces), read in with GeoPandas, and want to use these to create a mask to apply to gridded data on a 2-d latitude-longitude grid (read from a netcdf file using iris). An end goal would be to only have data for a given province remaining, with the rest of the data masked out. So the mask would be 1's for grid boxes within the province, and 0's or NaN's for grid boxes outside the province.
The polygons can be obtained from the shapefile here:
https://www.dropbox.com/s/o5elu01fetwnobx/CAN_adm1.shp?dl=0
The netcdf file I am using can be downloaded here:
https://www.dropbox.com/s/kxb2v2rq17m7lp7/t2m.20090815.nc?dl=0
I imagine there are two approaches here but I am struggling with both:
1) Use the polygon to create a mask on the latitude-longitude grid so that this can be applied to lots of datafiles outside of python (preferred)
2) Use the polygon to mask the data that have been read in and extract only the data inside the province of interest, to work with interactively.
My code so far:
import iris
import geopandas as gpd
#read the shapefile and extract the polygon for a single province
#(province names stored as variable 'NAME_1')
Canada=gpd.read_file('CAN_adm1.shp')
BritishColumbia=Canada[Canada['NAME_1'] == 'British Columbia']
#get the latitude-longitude grid from netcdf file
cubelist=iris.load('t2m.20090815.nc')
cube=cubelist[0]
lats=cube.coord('latitude').points
lons=cube.coord('longitude').points
#create 2d grid from lats and lons (may not be necessary?)
[lon2d,lat2d]=np.meshgrid(lons,lats)
#HELP!
Thanks very much for any help or advice.
UPDATE: Following the great solution from @DPeterK below, my original data can be masked, giving the following:
It looks like you have started well! Geometries loaded from shapefiles expose various geospatial comparison methods, and in this case you need the contains method. You can use this to test each point in your cube's horizontal grid for being contained within your British Columbia geometry. (Note that this is not a fast operation!) You can use this comparison to build up a 2D mask array, which could be applied to your cube's data or used in other ways.
I've written a Python function to do the above – it takes a cube and a geometry and produces a mask for the (specified) horizontal coordinates of the cube, and applies the mask to the cube's data. The function is below:
import numpy as np
import iris.util
from shapely.geometry import Point

def geom_to_masked_cube(cube, geometry, x_coord, y_coord,
                        mask_excludes=False):
    """
    Convert a shapefile geometry into a mask for a cube's data.

    Args:

    * cube:
        The cube to mask.
    * geometry:
        A geometry from a shapefile to define a mask.
    * x_coord: (str or coord)
        A reference to a coord describing the cube's x-axis.
    * y_coord: (str or coord)
        A reference to a coord describing the cube's y-axis.

    Kwargs:

    * mask_excludes: (bool, default False)
        If False, the mask will exclude the area of the geometry from the
        cube's data. If True, the mask will include *only* the area of the
        geometry in the cube's data.

    .. note::
        This function does *not* preserve lazy cube data.

    """
    # Get horizontal coords for masking purposes.
    lats = cube.coord(y_coord).points
    lons = cube.coord(x_coord).points
    lon2d, lat2d = np.meshgrid(lons, lats)

    # Reshape to 1D for easier iteration.
    lon2 = lon2d.reshape(-1)
    lat2 = lat2d.reshape(-1)

    mask = []
    # Iterate through all horizontal points in cube, and
    # check for containment within the specified geometry.
    for lat, lon in zip(lat2, lon2):
        this_point = Point(lon, lat)
        res = geometry.contains(this_point)
        mask.append(res.values[0])

    mask = np.array(mask).reshape(lon2d.shape)
    if mask_excludes:
        # Invert the mask if we want to include the geometry's area.
        mask = ~mask
    # Make sure the mask is the same shape as the cube.
    dim_map = (cube.coord_dims(y_coord)[0],
               cube.coord_dims(x_coord)[0])
    cube_mask = iris.util.broadcast_to_shape(mask, cube.shape, dim_map)

    # Apply the mask to the cube's data.
    data = cube.data
    masked_data = np.ma.masked_array(data, cube_mask)
    cube.data = masked_data
    return cube
If you just need the 2D mask you could return that before the above function applies it to the cube.
To use this function in your original code, add the following at the end of your code:
geometry = BritishColumbia.geometry
masked_cube = geom_to_masked_cube(cube, geometry,
'longitude', 'latitude',
mask_excludes=True)
If this doesn't mask anything, it might well mean that your cube and geometry are defined on different extents. For example, if your cube's longitude coordinate runs from 0° to 360° while the geometry's longitude values run from -180° to 180°, the containment test will never return True for a geometry at negative longitudes, such as British Columbia. You can fix this by changing the extents of your cube with the following:
cube = cube.intersection(longitude=(-180, 180))
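To see what that change of extent does to individual values, here is a small sketch of the underlying arithmetic in plain Python. The helper `wrap_longitude` is a name I made up for illustration; it is not part of iris, and `cube.intersection` does more than this (it also reorders the data along the longitude axis).

```python
def wrap_longitude(lon):
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return ((lon + 180.0) % 360.0) - 180.0

# Hypothetical grid longitudes on a 0-360 convention.
grid_lons = [0.0, 90.0, 235.0, 359.5]
wrapped = [wrap_longitude(lon) for lon in grid_lons]
```

Values already inside -180° to 180° are left unchanged, so the helper is safe to apply to either convention.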
I found an alternative solution to the excellent one posted by @DPeterK above, which yields the same result. It uses matplotlib.path to test if points are contained within the exterior coordinates described by the geometries loaded from a shape file. I am posting this because this method is ~10 times faster than that given by @DPeterK (2:23 minutes vs 25:56 minutes). I'm not sure what is preferable: an elegant solution, or a speedy, brute force solution. Perhaps one can have both?!
One complication with this method is that some geometries are MultiPolygons - i.e. the shape consists of several smaller polygons (in this case, the province of British Columbia includes islands off of the west coast, which can't be described by the coordinates of the mainland British Columbia Polygon). The MultiPolygon has no exterior coordinates but the individual polygons do, so these each need to be treated individually. I found that the neatest solution to this was to use a function copied from GitHub (https://gist.github.com/mhweber/cf36bb4e09df9deee5eb54dc6be74d26), which 'explodes' MultiPolygons into a list of individual polygons that can then be treated separately.
The working code is outlined below, with my documentation. Apologies that it is not the most elegant code - I am relatively new to Python and I'm sure there are lots of unnecessary loops/neater ways to do things!
import numpy as np
import iris
import geopandas as gpd
from shapely.geometry import Point
import matplotlib.path as mpltPath
from shapely.geometry.polygon import Polygon
from shapely.geometry.multipolygon import MultiPolygon
#-----
#FIRST, read in the target data and latitude-longitude grid from netcdf file
cubelist=iris.load('t2m.20090815.minus180_180.nc')
cube=cubelist[0]
lats=cube.coord('latitude').points
lons=cube.coord('longitude').points
#create 2d grid from lats and lons
[lon2d,lat2d]=np.meshgrid(lons,lats)
#create a list of coordinates of all points within grid
points=[]
for latit in range(0,241):
    for lonit in range(0,480):
        point=(lon2d[latit,lonit],lat2d[latit,lonit])
        points.append(point)
#turn into np array for later
points=np.array(points)
#get the cube data - useful for later
fld=np.squeeze(cube.data)
#create a mask array of zeros, same shape as fld, to be modified by
#the code below
mask=np.zeros_like(fld)
#NOW, read the shapefile and extract the polygon for a single province
#(province names stored as variable 'NAME_1')
Canada=gpd.read_file('/Users/ianashpole/Computing/getting_province_outlines/CAN_adm_shp/CAN_adm1.shp')
BritishColumbia=Canada[Canada['NAME_1'] == 'British Columbia']
#BritishColumbia.geometry.type reveals this to be a 'MultiPolygon'
#i.e. several (in this case, thousands...) of individual polygons.
#I ultimately want to get the exterior coordinates of the BritishColumbia
#polygon, but a MultiPolygon is a list of polygons and therefore has no
#exterior coordinates. There are probably many ways to progress from here,
#but the method I have stumbled upon is to 'explode' the multipolygon into
#its individual polygons and treat each individually. The function below
#to 'explode' the MultiPolygon was found here:
#https://gist.github.com/mhweber/cf36bb4e09df9deee5eb54dc6be74d26
#---define function to explode MultiPolygons
def explode_polygon(indata):
    indf = indata
    outdf = gpd.GeoDataFrame(columns=indf.columns)
    for idx, row in indf.iterrows():
        if type(row.geometry) == Polygon:
            #note: now redundant, but function originally worked on
            #a shapefile which could have combinations of individual polygons
            #and MultiPolygons
            outdf = outdf.append(row,ignore_index=True)
        if type(row.geometry) == MultiPolygon:
            multdf = gpd.GeoDataFrame(columns=indf.columns)
            #.geoms gives access to the individual polygons of the MultiPolygon
            recs = len(row.geometry.geoms)
            multdf = multdf.append([row]*recs,ignore_index=True)
            for geom in range(recs):
                multdf.loc[geom,'geometry'] = row.geometry.geoms[geom]
            outdf = outdf.append(multdf,ignore_index=True)
    return outdf
#-------
#Explode the BritishColumbia MultiPolygon into its constituents
EBritishColumbia=explode_polygon(BritishColumbia)
#Loop over each individual polygon and get external coordinates
for index,row in EBritishColumbia.iterrows():
    print('working on polygon', index)
    mypolygon=[]
    for pt in list(row['geometry'].exterior.coords):
        print(index,', ',pt)
        mypolygon.append(pt)
    #See if any of the original grid points read from the netcdf file earlier
    #lie within the exterior coordinates of this polygon
    #path.contains_points returns a boolean array (true/false), in the
    #shape of 'points'
    path=mpltPath.Path(mypolygon)
    inside=path.contains_points(points)
    #find the results in the array that were inside the polygon ('True')
    #and flag them in the mask. First, must reshape the result of the search
    #('inside') so that it matches the mask & original data
    #reshape the result to the main grid array
    inside=np.array(inside).reshape(lon2d.shape)
    i=np.where(inside == True)
    mask[i]=1
print('finished checking for points inside all polygons')
#mask now contains 0's for points that are not within British Columbia, and
#1's for points that are. FINALLY, use this to mask the original data
#(stored as 'fld')
i=np.where(mask == 0)
fld[i]=np.nan
#Done.
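For anyone who wants to check the logic of the loop above without installing geopandas or matplotlib, here is a small pure-Python sketch of the same idea: test every grid point against each exploded polygon's exterior ring separately and combine the results. The function names and the two toy rings are invented for illustration, and the ray-casting test stands in for `path.contains_points`. Like the code above, it looks only at exterior rings, so holes inside a polygon are ignored.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the ring of (x, y) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def mask_from_rings(lons, lats, rings):
    """Rows-by-columns list of 0/1: 1 where a grid point falls in any ring."""
    return [[int(any(point_in_ring(lon, lat, ring) for ring in rings))
             for lon in lons]
            for lat in lats]

# Two hypothetical "exploded" parts: a mainland square and a small island.
mainland = [(0, 0), (4, 0), (4, 4), (0, 4)]
island = [(6, 1), (7, 1), (7, 2), (6, 2)]
mask = mask_from_rings([1, 5, 6.5], [1.5, 3], [mainland, island])
```

Each row of `mask` corresponds to one latitude, matching the (lat, lon) layout produced by `np.meshgrid(lons, lats)` in the code above.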
import netCDF4
import numpy as np
nc_data = netCDF4.Dataset(out_nc, 'w', format='NETCDF4')
nc_data.description = 'Test'
# dimensions
nc_data.createDimension('lat', 720)
nc_data.createDimension('lon', 1440)
fl_res = 0.25
latitudes = np.arange(90.0 - fl_res/2.0, -90.0, -fl_res)
longitudes = np.arange(-180.0 + fl_res/2.0, 180.0, fl_res)
I am creating a 0.25 degree resolution netCDF file. Do the latitudes and longitudes that I am creating represent the corner of each grid cell or the center? Is there any way I can choose what they represent?
Coordinates (should) represent the centers of each grid cell.
There are some special cases where a variable, oftentimes wind velocity, is solved on the edges of the grid cell. In that case, the variable is attached to a set of lat/lon pairs that has one extra pair either in the x- or y-dimension. Though in post-processing, these edge-based variables are usually interpolated to the centers of each grid cell to follow the set of centered-coordinates, like the ones you've defined above.
EDIT: sources
CF conventions - cell boundaries: essentially states that lat/lon are at the center of the grid cell, bounded by the vertices of the grid cell. For most products, only one set of lat/lon coordinates is provided, and it can be assumed to refer to the centers of the grid cells (Ch. 4: "If bounds are not provided, an application might reasonably assume the gridpoints to be at the centers of the cells, but we do not require that in this standard.")
NOAA ioSSTv2 - an example of a NOAA product stating that "The latitude/longitude values in the netCDF coordinate variables are the centers of the grid cells."
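On the "can I choose" part: as far as I understand the CF conventions, you can make the meaning explicit by writing a bounds variable next to each coordinate and pointing to it with the coordinate's `bounds` attribute. Below is a sketch, in plain Python, of how the centers from the question relate to such cell edges. The function name is my own, and actually writing the result into the netCDF file is left out.

```python
def cell_centers_and_bounds(start, stop, step):
    """Centers and (first edge, second edge) pairs, in grid order, for a
    regular 1-D grid whose outer edges run from start to stop."""
    size = abs(step)
    n = round(abs(stop - start) / size)
    sign = 1.0 if stop > start else -1.0
    centers = [start + sign * size * (i + 0.5) for i in range(n)]
    bounds = [(start + sign * size * i, start + sign * size * (i + 1))
              for i in range(n)]
    return centers, bounds

# Same 0.25 degree grid as in the question: latitude runs north to south.
lat_centers, lat_bounds = cell_centers_and_bounds(90.0, -90.0, 0.25)
lon_centers, lon_bounds = cell_centers_and_bounds(-180.0, 180.0, 0.25)
```

The centers match the `np.arange` calls in the question (first latitude 89.875, first longitude -179.875), while the bounds give the edges of each 0.25° cell.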
Using GDAL in Python, how do you get the latitude and longitude of a GeoTIFF file?
GeoTIFFs do not appear to store any coordinate information. Instead, they store the XY origin coordinates. However, the XY coordinates do not provide the latitude and longitude of the top left corner and bottom left corner.
It appears I will need to do some math to solve this problem, but I don't have a clue on where to start.
What procedure is required to have this performed?
I know that the GetGeoTransform() method is important for this, however, I don't know what to do with it from there.
To get the coordinates of the corners of your geotiff do the following:
from osgeo import gdal
ds = gdal.Open('path/to/file')
width = ds.RasterXSize
height = ds.RasterYSize
gt = ds.GetGeoTransform()
minx = gt[0]
miny = gt[3] + width*gt[4] + height*gt[5]
maxx = gt[0] + width*gt[1] + height*gt[2]
maxy = gt[3]
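The same arithmetic can be wrapped in a small helper that returns all four corners. This is only a sketch: the function name `corner_coordinates` is mine, and the geotransform and raster size are taken from the gdalinfo listing further down in this answer (origin 440720, 3751320; 60 m pixels; 512 x 512), so the results can be compared with its "Corner Coordinates" section.

```python
def corner_coordinates(gt, width, height):
    """Map coordinates of the four raster corners from a GDAL-style geotransform."""
    def pixel_to_map(col, row):
        # Standard affine geotransform: works with or without rotation terms.
        x = gt[0] + col * gt[1] + row * gt[2]
        y = gt[3] + col * gt[4] + row * gt[5]
        return x, y

    return {
        "upper_left": pixel_to_map(0, 0),
        "lower_left": pixel_to_map(0, height),
        "upper_right": pixel_to_map(width, 0),
        "lower_right": pixel_to_map(width, height),
    }

# Values copied from the gdalinfo example output in this answer.
gt = (440720.0, 60.0, 0.0, 3751320.0, 0.0, -60.0)
corners = corner_coordinates(gt, 512, 512)
```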
However, these might not be in latitude/longitude format. As Justin noted, your geotiff will be stored with some kind of coordinate system. If you don't know what coordinate system it is, you can find out by running gdalinfo:
gdalinfo ~/somedir/somefile.tif
Which outputs:
Driver: GTiff/GeoTIFF
Size is 512, 512
Coordinate System is:
PROJCS["NAD27 / UTM zone 11N",
GEOGCS["NAD27",
DATUM["North_American_Datum_1927",
SPHEROID["Clarke 1866",6378206.4,294.978698213901]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-117],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",0],
UNIT["metre",1]]
Origin = (440720.000000,3751320.000000)
Pixel Size = (60.000000,-60.000000)
Corner Coordinates:
Upper Left ( 440720.000, 3751320.000) (117d38'28.21"W, 33d54'8.47"N)
Lower Left ( 440720.000, 3720600.000) (117d38'20.79"W, 33d37'31.04"N)
Upper Right ( 471440.000, 3751320.000) (117d18'32.07"W, 33d54'13.08"N)
Lower Right ( 471440.000, 3720600.000) (117d18'28.50"W, 33d37'35.61"N)
Center ( 456080.000, 3735960.000) (117d28'27.39"W, 33d45'52.46"N)
Band 1 Block=512x16 Type=Byte, ColorInterp=Gray
This output may be all you need. If you want to do this programmatically in Python, however, this is how you get the same info.
If the coordinate system is a PROJCS like the example above, you are dealing with a projected coordinate system. A projected coordinate system is a representation of the spheroidal earth's surface, but flattened and distorted onto a plane. If you want the latitude and longitude, you need to convert the coordinates to the geographic coordinate system that you want.
Sadly, not all latitude/longitude pairs are created equal, being based upon different spheroidal models of the earth. In this example, I am converting to WGS84, the geographic coordinate system favoured in GPSs and used by all the popular web mapping sites. The coordinate system is defined by a well defined string. A catalogue of them is available from spatial ref, see for example WGS84.
from osgeo import osr, gdal
# get the existing coordinate system
ds = gdal.Open('path/to/file')
old_cs= osr.SpatialReference()
old_cs.ImportFromWkt(ds.GetProjectionRef())
# create the new coordinate system
wgs84_wkt = """
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.01745329251994328,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]]"""
new_cs = osr.SpatialReference()
new_cs.ImportFromWkt(wgs84_wkt)
# create a transform object to convert between coordinate systems
transform = osr.CoordinateTransformation(old_cs,new_cs)
#get the point to transform, the lower-left corner (minx, miny) in this case
width = ds.RasterXSize
height = ds.RasterYSize
gt = ds.GetGeoTransform()
minx = gt[0]
miny = gt[3] + width*gt[4] + height*gt[5]
#get the coordinates in lat long
latlong = transform.TransformPoint(minx,miny)
Hopefully this will do what you want.
I don't know if this is a full answer, but this site says:
The x/y map dimensions are called easting and northing. For datasets in a geographic coordinate system these would hold the longitude and latitude. For projected coordinate systems they would normally be the easting and northing in the projected coordinate system. For ungeoreferenced images the easting and northing would just be the pixel/line offsets of each pixel (as implied by a unity geotransform).
so they may actually be longitude and latitude.