I want to select grid cells from ERA5 gridded data (surface level only) that lie inside geographical masks for northern and southern Switzerland (plus the radar buffer), in order to calculate regional means.
The four masks are given as polygons/multipolygons in shapefiles, and so far, for two of the masks, I was able to use salem's roi to get what I want:
radar_north = salem.read_shapefile('radar_north140.shp')
file_radar_north = file.salem.roi(shape=radar_north)
file_radar_north.cape.mean(dim='time').salem.quick_map()
However, for the radar_south and alpensuedseite shapefiles the code didn't work from the start (wrong selection or no data shown), and now nothing works anymore. I don't know why, as I haven't changed anything between the first attempt and the second.
If someone sees the issue or knows a different (perhaps quicker) way to mask the ERA5 data, I would be grateful! (I had no success with the answers to similar questions here.)
Best
Lena
This could work if you are working with NetCDF files:
import geopandas as gpd
import xarray as xr
import rioxarray
from shapely.geometry import mapping
# load shapefile with geopandas
radar_north = gpd.read_file('radar_north140.shp')
# load ERA5 netcdf with xarray
era = xr.open_dataset('ERA5.nc')
# add projection system to nc
era = era.rio.write_crs("EPSG:4326", inplace=True)
# mask ERA5 data with shapefile
era_radar_north = era.rio.clip(radar_north.geometry.apply(mapping), radar_north.crs)
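From there, the regional means mentioned in the question could presumably be computed directly on the clipped dataset. A minimal sketch, assuming the variable is called cape (as in the question) and the ERA5 dimensions are named latitude/longitude (the names sometimes differ between files):
# mean over the masked grid cells, one value per time step
cape_regional_mean = era_radar_north.cape.mean(dim=["latitude", "longitude"])
# or a time mean for a quick map, as in the question
era_radar_north.cape.mean(dim="time").plot()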
I have a set of latitude and longitude points, and I would like to find out whether each coordinate lies within a building or not (i.e. whether it is an indoor point). The points could be anywhere within the US. I would like to find a shapefile that could help me solve this problem.
I tried downloading shapefiles from www.census.gov/, but there is no documentation on the granularity of the shapefiles it contains, so I am not able to pinpoint the right set of shapefiles to use.
import geopandas as gpd
from shapely.geometry import shape
from shapely.geometry import Point, Polygon
import pandas as pd
# fp is the path to the downloaded shapefile
p1 = Point(32.999231, -96.773696)
data3 = gpd.read_file(fp)
data3['geometry'].intersects(p1)
data3['geometry'].contains(p1)
The final output should be Indoor/Outdoor. I need some info on the shapefiles I can use for this kind of problem.
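For reference, my understanding is that Shapely points are (x, y), i.e. (longitude, latitude), so the kind of check I have in mind would look roughly like this (fp being the path to whatever shapefile turns out to be suitable):
import geopandas as gpd
from shapely.geometry import Point
# note the (longitude, latitude) order for Shapely points
p1 = Point(-96.773696, 32.999231)
data3 = gpd.read_file(fp)  # fp: path to the candidate shapefile
is_indoor = data3.geometry.contains(p1).any()  # True if any polygon contains the point
print("Indoor" if is_indoor else "Outdoor")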
I have a 7-channel satellite image (basically, seven .tif files, one for each band), and a .csv file with the coordinates of points of interest that lie in the region covered by the satellite image. I want to cut small portions of the image around each coordinate point. How could I do that?
As I don't have fully working code right now, the size of those small portions doesn't really matter. For the sake of this question, let's say I want them to be 15x15 pixels. So for the moment, my objective is to obtain a lot of 15x15x7 arrays, one for every coordinate point in the .csv file (the "7" in "15x15x7" is because the image has 7 channels). And that is what I am stuck with.
Just to give some background in case it's relevant: I will use those vectors later to train a CNN model in keras.
This is what I have done so far (I am using a Jupyter notebook in an Anaconda environment):
Imported gdal, numpy, matplotlib, and geopandas, among other libraries.
Opened the .tif files using gdal and converted them into arrays.
Opened the .csv file using pandas.
Created a numpy array called "imagen" of shape (7931, 7901, 7) that will host the 7 bands of the satellite image (as numbers). At this point I just need to know which rows and columns of the array "imagen" correspond to each coordinate point. In other words, I need to convert every coordinate point into a pair of numbers (row, column). And that is what I am stuck with.
After that, I think that the "cutting part" will be easy.
#I import the libraries
from osgeo import gdal, gdal_array
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas
from geopandas import GeoDataFrame
from shapely.geometry import Point
#I access the satellite images (I just show one here to keep it short)
b1 = r"E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif"
band1 = gdal.Open(b1, gdal.GA_ReadOnly)
#I open the .csv file
file_svc = r"C:\Users\Administrador\Desktop\DeepLearningInternship\Crop Yield Prediction\Crop Type Classification model - CNN\First\T28_Pringles4.csv"
df = pd.read_csv(file_svc)
print(df.head())
That prints something like this:
Lat1 Long1 CropingState
-37.75737 -61.14537 Barbecho
-37.78152 -61.15872 Verdeo invierno
-37.78248 -61.17755 Barbecho
-37.78018 -61.17357 Campo natural
-37.78850 -61.18501 Campo natural
#I create the array "imagen" (I only show one channel here to keep it short)
imagen = np.zeros((7931, 7901, 7), dtype=np.float32)
imagen[:, :, 0] = band1.ReadAsArray().astype(np.float32)
#And then I can plot it:
plt.imshow(imagen[:, :, 0], cmap='hot')
plt.show()
Which plots something like this:
(https://github.com/jamesluc007/DeepLearningInternship/blob/master/Crop%20Yield%20Prediction/Crop%20Type%20Classification%20model%20-%20CNN/First/red_band.png)
I want to transform those (-37, -61) coordinates into something like (2230, 1750) row/column indices, but I haven't figured out how yet. Any clues?
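From what I have read, the conversion might go through the raster's geotransform. Here is a sketch of what I am imagining (untested; it assumes the CSV coordinates are WGS84 lat/lon, the raster is "north up" with no rotation terms, and GDAL 3+ axis order for TransformPoint):
from osgeo import gdal, osr

ds = gdal.Open(b1, gdal.GA_ReadOnly)
gt = ds.GetGeoTransform()              # (x_origin, pixel_width, 0, y_origin, 0, pixel_height)

src = osr.SpatialReference()
src.ImportFromEPSG(4326)               # CRS of the CSV points (assumed WGS84)
dst = osr.SpatialReference()
dst.ImportFromWkt(ds.GetProjection())  # CRS of the raster (Landsat is usually UTM)
transform = osr.CoordinateTransformation(src, dst)

def latlon_to_rowcol(lat, lon):
    # with GDAL 3+ and EPSG:4326, TransformPoint expects (lat, lon); older GDAL expects (lon, lat)
    x, y, _ = transform.TransformPoint(lat, lon)
    col = int((x - gt[0]) / gt[1])
    row = int((y - gt[3]) / gt[5])     # gt[5] is negative for north-up rasters
    return row, col

row, col = latlon_to_rowcol(-37.75737, -61.14537)     # first point of the CSV
patch = imagen[row - 7:row + 8, col - 7:col + 8, :]   # a 15x15x7 block around it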
When dealing with coordinate systems in Python with fiona and osgeo, there seem to be a lot of ways to define a coordinate system by importing/exporting different CRS formats, for example:
FIONA:
from fiona.crs import from_epsg,from_string,to_string
# Import crs from different formats:
wgs = from_epsg(4326)
wgs = from_string("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs ")
# Export crs as proj4 string
wgs_proj4_string = to_string(wgs)
OSGEO:
from osgeo import osr
srs = osr.SpatialReference()
srs.ImportFromESRI(['GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]'])
srs.ImportFromProj4("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
srs.ImportFromEPSG(4326)
#the import options are very rich
# Export to different formats
srs.ExportToProj4()
srs.ExportToWkt()
srs.ExportToXML()
#... many export options!
However, I noticed that both libraries allow a CRS to be defined easily by its EPSG code, but both lack the inverse function (exporting a CRS as an EPSG code).
The closest I can get to the EPSG code is:
srs.AutoIdentifyEPSG()
epsg = srs.GetAuthorityCode(None)
but it doesn't seem very reliable, and other proposed solutions seem to involve a lot of tweaking or at least a dependency on a web service.
QUESTIONS:
Can somebody show me a simple, straightforward way to export a CRS as an EPSG code in Python? Something like a to_epsg() in fiona or an ExportToEPSG() in osgeo?
Can someone explain the theoretical background for this shortage of EPSG export options, especially compared to the ease of importing by EPSG code? Isn't the whole point of EPSG codes to make coordinate systems easy to identify and use for people without advanced geospatial expertise? Shouldn't a code serve as an ID for a CRS and therefore be easily retrievable?
You could try pyproj's CRS class: https://pyproj4.github.io/pyproj/stable/examples.html#converting-crs-to-a-different-format
from pyproj import CRS
from fiona.crs import to_string, from_epsg
fiona_crs = from_epsg(28356)
proj4_crs = CRS.from_proj4(to_string(fiona_crs))
srid = proj4_crs.to_epsg()
Although, for some reason, this doesn't work for EPSG 4326 for me (to_epsg returns None in that case); I'm not sure why.
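A possible follow-up, based on pyproj's documented min_confidence parameter of to_epsg (behaviour may vary between pyproj versions): PROJ4 strings drop datum/authority details, so the EPSG match can fall below the default confidence threshold of 70.
from pyproj import CRS
crs = CRS.from_proj4("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
print(crs.to_epsg())                   # may be None with the default confidence (70)
print(crs.to_epsg(min_confidence=20))  # a more lenient match, which may return 4326
# building the CRS directly from the EPSG code round-trips cleanly
print(CRS.from_epsg(4326).to_epsg())   # 4326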
How can I check if a geopoint is within the area of a given shapefile?
I managed to load a shapefile in Python, but can't get any further.
Another option is to use Shapely (a Python library based on GEOS, the engine for PostGIS) and Fiona (which is basically for reading/writing files):
import fiona
import shapely.geometry

with fiona.open("path/to/shapefile.shp") as fiona_collection:
    # In this case, we'll assume the shapefile only has one record/layer (e.g., the shapefile
    # is just for the borders of a single country, etc.).
    shapefile_record = next(iter(fiona_collection))

    # Use Shapely to create the polygon
    shape = shapely.geometry.shape(shapefile_record['geometry'])

    point = shapely.geometry.Point(32.398516, -39.754028)  # longitude, latitude

    # Alternative: if point.within(shape)
    if shape.contains(point):
        print("Found shape for point.")
Note that doing point-in-polygon tests can be expensive if the polygon is large/complicated (e.g., shapefiles for some countries with extremely irregular coastlines). In some cases it can help to use bounding boxes to quickly rule things out before doing the more intensive test:
minx, miny, maxx, maxy = shape.bounds
bounding_box = shapely.geometry.box(minx, miny, maxx, maxy)
if bounding_box.contains(point):
...
Lastly, keep in mind that it takes some time to load and parse large/irregular shapefiles (unfortunately, those types of polygons are often expensive to keep in memory, too).
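If you need to test many points against the same polygon, Shapely's prepared geometries are another way to cut the cost of repeated contains() calls. A small sketch (the example coordinates are made up):
from shapely.prepared import prep

prepared_shape = prep(shape)  # "shape" as built in the snippet above
points = [shapely.geometry.Point(lon, lat) for lon, lat in [(32.40, -39.75), (33.00, -40.10)]]
inside = [p for p in points if prepared_shape.contains(p)]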
This is an adaptation of yosukesabai's answer.
I wanted to ensure that the point I was searching for was in the same projection system as the shapefile, so I've added code for that.
I couldn't understand why he was doing a contains test on ply = feat_in.GetGeometryRef() (in my testing things seemed to work just as well without it), so I removed that.
I've also improved the commenting to better explain what's going on (as I understand it).
#!/usr/bin/python
from osgeo import ogr, osr
import sys

drv = ogr.GetDriverByName('ESRI Shapefile')  #We will load a shapefile
ds_in = drv.Open("MN.shp")                   #Get the contents of the shapefile
lyr_in = ds_in.GetLayer(0)                   #Get the shapefile's first layer

#Put the title of the field you are interested in here
idx_reg = lyr_in.GetLayerDefn().GetFieldIndex("P_Loc_Nm")

#If the latitude/longitude we're going to use is not in the projection
#of the shapefile, then we will get erroneous results.
#The following assumes that the latitude/longitude is in WGS84.
#This is identified by the number "4326", as in "EPSG:4326".
#We will create a transformation between this and the shapefile's
#projection, whatever it may be.
geo_ref = lyr_in.GetSpatialRef()
point_ref = osr.SpatialReference()
point_ref.ImportFromEPSG(4326)
ctran = osr.CoordinateTransformation(point_ref, geo_ref)

def check(lon, lat):
    #Transform the incoming longitude/latitude to the shapefile's projection
    [lon, lat, z] = ctran.TransformPoint(lon, lat)
    #Create a point
    pt = ogr.Geometry(ogr.wkbPoint)
    pt.SetPoint_2D(0, lon, lat)
    #Set up a spatial filter such that the only features we see when we
    #loop through "lyr_in" are those which overlap the point defined above
    lyr_in.SetSpatialFilter(pt)
    #Loop through the overlapping features and display the field of interest
    for feat_in in lyr_in:
        print(lon, lat, feat_in.GetFieldAsString(idx_reg))

#Take command-line input and do all this
check(float(sys.argv[1]), float(sys.argv[2]))
#check(-95,47)
Several other sites were helpful regarding the projection check.
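One caveat worth adding (an assumption about newer GDAL versions, not something from the original answer): with GDAL 3+, EPSG:4326 defaults to latitude/longitude axis order, which silently swaps the coordinates passed to TransformPoint. Forcing the traditional lon/lat order keeps check() working as written:
#add this right after point_ref.ImportFromEPSG(4326) when running under GDAL 3+
point_ref.SetAxisMappingStrategy(osr.OAMS_TRADITIONAL_GIS_ORDER)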
Here is a simple solution based on pyshp and shapely.
Let's assume that your shapefile only contains one polygon (you can easily adapt it for multiple polygons; see the sketch after the code):
import shapefile
from shapely.geometry import shape, Point

# read your shapefile
r = shapefile.Reader("your_shapefile.shp")

# get the shapes
shapes = r.shapes()

# build a shapely polygon from your shape
polygon = shape(shapes[0])

def check(lon, lat):
    # build a shapely point from your geopoint
    point = Point(lon, lat)

    # the contains function does exactly what you want
    return polygon.contains(point)
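For the multi-polygon case mentioned above, the adaptation might look like this (a sketch; check_any is just a name I made up):
polygons = [shape(s) for s in r.shapes()]

def check_any(lon, lat):
    point = Point(lon, lat)
    # return the index of the first polygon containing the point, or None
    for i, poly in enumerate(polygons):
        if poly.contains(point):
            return i
    return None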
I did almost exactly what you are doing yesterday, using GDAL's ogr with the Python bindings. It looked like this:
from osgeo import ogr

# load the shapefile as a layer
drv = ogr.GetDriverByName('ESRI Shapefile')
ds_in = drv.Open("./shp_reg/satreg_etx12_wgs84.shp")
lyr_in = ds_in.GetLayer(0)

# field index for which I want the data extracted
# ("satreg2" was what I was looking for)
idx_reg = lyr_in.GetLayerDefn().GetFieldIndex("satreg2")

def check(lon, lat):
    # create point geometry
    pt = ogr.Geometry(ogr.wkbPoint)
    pt.SetPoint_2D(0, lon, lat)
    lyr_in.SetSpatialFilter(pt)

    # go over all the polygons in the layer and see if one includes the point
    for feat_in in lyr_in:
        # the spatial filter roughly subsets features, instead of going over everything
        ply = feat_in.GetGeometryRef()
        # test
        if ply.Contains(pt):
            # TODO do what you need to do here
            print(lon, lat, feat_in.GetFieldAsString(idx_reg))
Check out http://geospatialpython.com/2011/01/point-in-polygon.html and http://geospatialpython.com/2011/08/point-in-polygon-2-on-line.html
One way to do this is to read the ESRI Shapefile using the OGR library and then use the GEOS geometry library (http://trac.osgeo.org/geos/) to do the point-in-polygon test. This requires some C/C++ programming.
There is also a Python interface to GEOS at http://sgillies.net/blog/14/python-geos-module/ (which I have never used). Maybe that is what you want?
Another solution is to use the GeoTools library (http://geotools.org/), which is in Java.
I also have my own Java software to do this (which you can download from http://www.mapyrus.org, plus jts.jar from http://www.vividsolutions.com/products.asp). You need only a text command file called inside.mapyrus containing the following lines to check whether a point lies inside the first polygon in the ESRI Shapefile:
dataset "shapefile", "us_states.shp"
fetch
print contains(GEOMETRY, -120, 46)
And run with:
java -cp mapyrus.jar:jts-1.8.jar org.mapyrus.Mapyrus inside.mapyrus
It will print a 1 if the point is inside, 0 otherwise.
You might also get some good answers if you post this question on
https://gis.stackexchange.com/
If you want to find out which polygon (from a shapefile full of them) contains a given point, and you have a bunch of points to check, the fastest way is PostGIS. I actually implemented a fiona-based version using the answers here, but it was painfully slow even with multiprocessing and a bounding-box pre-check: 400 minutes of processing for 50k points. Using PostGIS, the same job took less than 10 seconds. Spatial (GiST) indexes are efficient!
shp2pgsql -s 4326 shapes.shp > shapes.sql
That generates an SQL file with the information from the shapefile. Create a database with PostGIS support, run that SQL, and create a GiST index on the geom column. Then, to find the name of the polygon containing a point:
sql="SELECT name FROM shapes WHERE ST_Contains(geom,ST_SetSRID(ST_MakePoint(%s,%s),4326));"
cur.execute(sql,(x,y))
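For completeness, a sketch of the surrounding psycopg2 setup (the database name, table name, and the geom/name columns are assumptions based on the shp2pgsql command above):
import psycopg2

conn = psycopg2.connect(dbname="geodb", user="postgres")
cur = conn.cursor()

# one-time setup after loading shapes.sql:
# cur.execute("CREATE INDEX shapes_geom_idx ON shapes USING GIST (geom);")
# conn.commit()

x, y = -96.77, 32.99  # longitude, latitude (made-up example point)
sql = "SELECT name FROM shapes WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s,%s), 4326));"
cur.execute(sql, (x, y))
row = cur.fetchone()
print(row[0] if row else "no polygon contains this point")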