get difference of two shape files - python

I have two shapefiles that contain polygons, and I'm trying to compute the difference between them.
I tried the following code, but it does not work the way I expected.
Of the two shapefiles, the blue one is the buffer shapefile. I need to remove the area that intersects the blue buffer, i.e. get the geometric difference, the same as the QGIS Difference function.
import fiona
from shapely.geometry import shape, mapping, Polygon
green = fiona.open(
    "/home/gulve/manual_geo_ingestion/probe-data/images/r/shape_out/dissolved.shp")
blue = fiona.open(
    "/home/gulve/manual_geo_ingestion/probe-data/images/g/shape/shape.shp")
print([not shape(i['geometry']).difference(shape(j['geometry'])).is_empty for i, j in zip(list(blue), list(green))])
schema = {'geometry': 'Polygon',
          'properties': {}}
crs = {'init': u'epsg:3857'}
with fiona.open(
        '/home/gulve/manual_geo_ingestion/probe-data/images/r/shape_out/diff.shp', 'w',
        driver='ESRI Shapefile', crs=crs, schema=schema
) as write_shape:
    for geom in [shape(i['geometry']).difference(shape(j['geometry'])) for i, j in zip(list(blue), list(green))]:
        if not geom.empty:
            write_shape.write({'geometry': mapping((shape(geom))), 'properties': {}})
Expected output:

After importing the shapefiles into PostgreSQL (with PostGIS), run this query:
CREATE TABLE not_intersects AS
SELECT * FROM shape
WHERE id NOT IN (SELECT DISTINCT shape.id
                 FROM buffer, shape
                 WHERE ST_Intersects(buffer.geom, shape.geom));
This query creates a third table (called not_intersects here) containing the polygons from shape that do not intersect any polygon in buffer.
The result is represented in yellow.

I was able to solve this with shapely functions.
Here's my code:
import fiona
from shapely.geometry import shape, mapping
from shapely.ops import unary_union
buffered_shape = fiona.open(
    "dissolved.shp", 'r', encoding='UTF-8')
color_shape = fiona.open(
    "shape.shp", 'r', encoding='UTF-8')
print([not shape(i['geometry']).difference(shape(j['geometry'])).is_empty for i, j in
       zip(list(color_shape), list(buffered_shape))])
outmulti = []
for pol in color_shape:
    green = shape(pol['geometry'])
    for pol2 in buffered_shape:
        red = shape(pol2['geometry'])
        if red.intersects(green):
            # If they intersect, create a new polygon that is
            # essentially pol minus the intersection
            intersect = green.intersection(red)
            nonoverlap = green.symmetric_difference(intersect)
            outmulti.append(nonoverlap)
        else:
            outmulti.append(green)
finalpol = unary_union(outmulti)
schema = {'geometry': 'MultiPolygon',
          'properties': {}}
crs = {'init': u'epsg:4326'}
with fiona.open(
        'shape_out/diff.shp', 'w',
        driver='ESRI Shapefile', crs=crs, schema=schema
) as write_shape:
    # unary_union may return a single Polygon or a MultiPolygon;
    # .geoms gives the parts of a multi-part geometry
    for geom in getattr(finalpol, 'geoms', [finalpol]):
        if not geom.is_empty:
            write_shape.write({'geometry': mapping(geom), 'properties': {}})
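A shorter way to express the same idea, offered as a minimal sketch rather than a drop-in replacement: merge all buffer polygons with unary_union and subtract the result from each polygon with difference. The box geometries below are made-up stand-ins for the features read from the two shapefiles, and the helper name subtract_buffers is hypothetical.

```python
from shapely.geometry import box
from shapely.ops import unary_union


def subtract_buffers(polygons, buffers):
    """Return each polygon minus the union of all buffers, skipping empty results."""
    buffer_union = unary_union(buffers)
    results = []
    for poly in polygons:
        remainder = poly.difference(buffer_union)
        if not remainder.is_empty:
            results.append(remainder)
    return results


# Made-up stand-ins for the geometries read from shape.shp and dissolved.shp
polygons = [box(0, 0, 4, 4), box(10, 10, 12, 12), box(5, 0, 6, 1)]
buffers = [box(2, 0, 6, 4)]

remainders = subtract_buffers(polygons, buffers)
for geom in remainders:
    print(geom.geom_type, geom.area)
```

Because the subtraction is done against the merged buffer, each polygon is processed once even when several buffer features touch it, and polygons that lie entirely inside the buffer are dropped.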

Related

Extract raster value using multipolygon type of shape using python

I am trying to extract raster values using a multipolygon shapefile in Python. I found an answer here, but the calculation there is not valid for multipolygons. Could someone please guide me on how to correct the code so that the calculation uses the MultiPolygon rather than the Polygon geometry type?
My code is below:
import rasterio
from rasterio.mask import mask
import geopandas as gpd
import numpy as np
from rasterio import Affine  # can also be imported with: from affine import Affine
from shapely.geometry import mapping
shapefile = gpd.read_file(r'/Users..../polygon_sector.shp')
geoms = shapefile.geometry.values
geometry = geoms[0] # shapely geometry
# transform to GeoJSON format
geoms = [mapping(geoms[0])]
# extract the raster values within the polygon
with rasterio.open("/Users/.../map_reclass.tif") as src:
    out_image, out_transform = mask(src, geoms, crop=True)
    # no data values of the original raster
    no_data = src.nodata
print(no_data)
# extract the values of the masked array
data = out_image[0,:,:]
# extract the row, columns of the valid values
row, col = np.where(data != no_data)
rou = np.extract(data != no_data, data)
T1 = out_transform * Affine.translation(0.5, 0.5) # reference the pixel centre
# apply the affine transform to (col, row) to get map coordinates
rc2xy = lambda r, c: T1 * (c, r)
d = gpd.GeoDataFrame({'col':col,'row':row,'ROU':rou})
# coordinate transformation
d['x'] = d.apply(lambda row: rc2xy(row.row,row.col)[0], axis=1)
d['y'] = d.apply(lambda row: rc2xy(row.row,row.col)[1], axis=1)
# geometry
from shapely.geometry import Point
d['geometry'] = d.apply(lambda row: Point(row['x'], row['y']), axis=1)
# save to a shapefile
d.to_file(r'/Users/y.../result_full.shp', driver='ESRI Shapefile')
I tried to assign the other geometry (a multipolygon), but I did it wrong: when I print the geometry it is still POLYGON, not MULTIPOLYGON. As far as I understand, the geometry type should come from shapely.
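One possible direction, sketched under assumptions: rasterio.mask.mask takes a list of GeoJSON-like geometries, so every feature in the shapefile can be passed instead of only geoms[0], and a MultiPolygon keeps its type when converted with mapping. The geometries below are made-up stand-ins for shapefile.geometry; the call to mask is left as a comment because it needs the raster file.

```python
from shapely.geometry import MultiPolygon, box, mapping

# Made-up stand-ins for the values of shapefile.geometry
geometries = [
    box(0, 0, 1, 1),
    MultiPolygon([box(2, 2, 3, 3), box(4, 4, 5, 5)]),
]

# Convert every feature, not only the first one, to a GeoJSON-like dict
geoms = [mapping(geom) for geom in geometries]
print([g["type"] for g in geoms])

# With the raster open as src, the whole list would then be passed on:
# out_image, out_transform = mask(src, geoms, crop=True)
```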

Convert .geojson to .wkt | extract 'coordinates'

Goal: Ultimately, to convert .geojson to .wkt. Here, I want to extract all coordinates, each as a list.
In my.geojson, there are n many: {"type":"Polygon","coordinates":...
Update: I've successfully extracted the first set of coordinates. However, this file has two sets of coordinates.
Every .geojson file has at least one set of coordinates, but may have more.
How can I dynamically extract the key-values of all the coordinates?
Code:
from pathlib import Path
import os
import geojson
import json
from shapely.geometry import shape
ROOT = Path('path/')
all_files = os.listdir(ROOT)
geojson_files = list(filter(lambda f: f.endswith('.geojson'), all_files))
for gjf in geojson_files:
    with open(f'{str(ROOT)}/{gjf}') as f:
        gj = geojson.load(f)
    o = dict(coordinates = gj['features'][0]['geometry']['coordinates'], type = "Polygon")
    geom = shape(o)
    wkt = geom.wkt
Desired Output:
One .wkt file containing all coordinates in the GeoJSON file

To convert a series of geometries in GeoJSON files to WKT, use the shape() function to turn each GeoJSON geometry into a shapely object, which can then be formatted as WKT and/or projected to a different coordinate reference system.
If you want to access the coordinates of a polygon once it's a shapely object, use x, y = geo.exterior.xy.
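As a small illustration of those two steps (a sketch with a made-up triangle, not data from my.geojson): shape() builds the shapely object from a GeoJSON-style dict, .wkt formats it as text, and .exterior.xy returns the outer ring as separate x and y sequences.

```python
from shapely.geometry import shape

# Made-up GeoJSON geometry; the ring is closed (first point == last point)
geometry = {
    "type": "Polygon",
    "coordinates": [[[0, 0], [4, 0], [4, 3], [0, 0]]],
}

geo = shape(geometry)
x, y = geo.exterior.xy  # coordinates of the outer ring
print(geo.wkt)
print(list(x), list(y))
```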
If you just want to convert a series of GeoJSON files into one .wkt file per GeoJSON file, then try this:
from pathlib import Path
import json
from shapely.geometry import shape
ROOT = Path('path')
for f in ROOT.glob('*.geojson'):
    with open(f) as fin, open(f.with_suffix(".wkt"), "w") as fout:
        features = json.load(fin)["features"]
        for feature in features:
            geo = shape(feature["geometry"])
            # format geometry coordinates as WKT
            wkt = geo.wkt
            print(wkt)
            fout.write(wkt + "\n")
This output uses your example my.geojson file.
Output:
POLYGON ((19372 2373, 19322 2423, ...
POLYGON ((28108 25855, 27755 26057, ...
If you need to convert the coordinates to EPSG:4326 (WGS-84) (e.g. 23.314208, 37.768469), you can use pyproj.
Here is the full code to convert a collection of GeoJSON files to a new GeoJSON file in WGS-84:
from pathlib import Path
import json
import geojson
from shapely.geometry import shape, Point
from shapely.ops import transform
from pyproj import Transformer
ROOT = Path('wkt/')
out_features = []
# assume we're converting from 3857 to 4326
# and center point is at lon=23, lat=37
# (note: the transformer below is built from a local azimuthal
# equidistant projection around that point)
c = Point(23.676757000000002, 37.9914205)
local_azimuthal_projection = f"+proj=aeqd +R=6371000 +units=m +lat_0={c.y} +lon_0={c.x}"
aeqd_to_wgs84 = Transformer.from_proj(local_azimuthal_projection,
                                      '+proj=longlat +datum=WGS84 +no_defs')
for f in ROOT.glob('*.geojson'):
    with open(f) as fin:
        features = json.load(fin)["features"]
    for feature in features:
        geo = shape(feature["geometry"])
        poly_wgs84 = transform(aeqd_to_wgs84.transform, geo)
        # collect results in a separate list so the loop does not
        # append to the list it is iterating over
        out_features.append(geojson.Feature(geometry=poly_wgs84))
# Output new GeoJSON file
with open("out.geojson", "w") as fp:
    fc = geojson.FeatureCollection(out_features)
    fp.write(geojson.dumps(fc))
Assuming the conversion is from EPSG:3857 to EPSG:4326 and the center point is at lon=23, lat=37, the output GeoJSON file will look like this:
{"features": [{"type": "Polygon", "geometry": {"coordinates": [[[23.897879, 38.012554], ...
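If the source coordinates really are in EPSG:3857 (Web Mercator), the conversion to longitude/latitude can also be written out by hand. The function below is a pure-Python sketch of the standard spherical inverse Mercator formula, shown only to make the arithmetic visible; the function name is hypothetical, and in practice a library such as pyproj would normally do this.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def web_mercator_to_lonlat(x, y):
    """Convert Web Mercator metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# The origin of the projection maps to lon=0, lat=0
print(web_mercator_to_lonlat(0.0, 0.0))
# Half the circumference to the east maps to lon=180
print(web_mercator_to_lonlat(math.pi * EARTH_RADIUS_M, 0.0))
```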

Plotly is not rendering Choropleth Mapbox Polygons

I have been trying to render geoJSON in Plotly by converting shapefiles from https://geoportal.statistics.gov.uk/datasets/local-authority-districts-december-2019-boundaries-uk-bfc.
The Python Plotly docs for plotly.graph_objects.Choroplethmapbox mention that each feature in the geoJSON needs an id field. I have tried both creating an artificial id and using Plotly's featureidkey parameter, but neither works. When I use the id key, I have checked that it is in the correct location, and I have tried it both as int64 and as a string.
Sometimes the base mapbox layer renders but without any polygons; at other times the code runs and then hangs.
I have also tried reducing the size of the .shp file using mapshaper's various algorithms then saving that to geoJSON format and skipping the conversion step in Python from .shp to geoJSON but again to no avail. Also changing the tolerance in the shapely manipulation does not seem to change the output.
What I am expecting is a map projection with a mapbox base layer with the local authority district polygons on top and filled. The below link shows the polygons and was created on mapshaper.org:
Polygons of Local Authority District
My mapbox access token is valid.
This is an example of trying to render the Local Authority Boundaries polygons by adding in an id field and converting the .shp file to geoJSON and then creating the trace:
import geopandas as gpd
import numpy as np
from shapely.geometry import LineString, MultiLineString
import plotly.graph_objs as go
# load in shp files
lad_shp = gpd.read_file('zip://../../data/external/Local_Authority_Districts_(December_2019)_Boundaries_UK_BFC-shp.zip', encoding='utf-8')
# using empet code to convert .shp to geoJSON
def shapefile_to_geojson(gdf, index_list, tolerance=0.025):
    # gdf - geopandas dataframe containing the geometry column and values to be mapped to a colorscale
    # index_list - a sublist of list(gdf.index) or gdf.index for all data
    # tolerance - float parameter to set the Polygon/MultiPolygon degree of simplification
    # returns a geojson type dict
    geo_names = list(gdf[f'lad19nm']) # name of authorities
    geojson = {'type': 'FeatureCollection', 'features': []}
    for index in index_list:
        geo = gdf['geometry'][index].simplify(tolerance)
        if isinstance(geo.boundary, LineString):
            gtype = 'Polygon'
            bcoords = np.dstack(geo.boundary.coords.xy).tolist()
        elif isinstance(geo.boundary, MultiLineString):
            gtype = 'MultiPolygon'
            bcoords = []
            for b in geo.boundary:
                x, y = b.coords.xy
                coords = np.dstack((x,y)).tolist()
                bcoords.append(coords)
        else: pass
        feature = {'type': 'Feature',
                   'id' : index,
                   'properties': {'name': geo_names[index]},
                   'geometry': {'type': gtype,
                                'coordinates': bcoords},
                   }
        geojson['features'].append(feature)
    return geojson
geojsdata = shapefile_to_geojson(lad_shp, list(lad_shp.index))
# length to generate synthetic data for z attribute
L = len(geojsdata['features'])
# check id key is there
geojsdata['features'][0].keys()
>> dict_keys(['type', 'id', 'properties', 'geometry'])
# example of authority name
geojsdata['features'][0]['properties']['name']
>> 'Hartlepool'
# check id
k=5
geojsdata['features'][k]['id']
>> '5'
trace = go.Choroplethmapbox(z=np.random.randint(10, 75, size=L), # synthetic data
                            locations=[geojsdata['features'][k]['id'] for k in range(L)],
                            colorscale='Viridis',
                            colorbar=dict(thickness=20, ticklen=3),
                            geojson=geojsdata,
                            text=regions,
                            marker_line_width=0.1, marker_opacity=0.7)
layout = go.Layout(title_text='UK LAD Choropleth Demo',
                   title_x=0.5,
                   width=750,
                   height=700,
                   mapbox=dict(center=dict(lat=54, lon=-2),
                               accesstoken=mapbox_access_token,
                               zoom=3))
fig = go.Figure(data=[trace], layout=layout)
fig.show()
The geoJSON output from the above shapefile_to_geojson function can be found here: https://www.dropbox.com/s/vuf3jtrr2boq5eg/lad19-geo.json?dl=0
Does anyone have any idea what could be causing the issue? I'm assuming the .shp files are good as they are rendered fine on mapshaper.org and QGis. Any help would be greatly appreciated.
Thank you.

Simply changing the projection system corrected the error, since the mapbox layer expects longitude/latitude coordinates (WGS 84, EPSG:4326). Do this before converting to geoJSON:
lad_shp = lad_shp.to_crs(epsg=4326)
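A quick way to spot this kind of problem is to check whether the geoJSON coordinates look like longitude/latitude degrees at all. The helpers below are a hypothetical sketch using only the standard library; they walk the nested coordinate lists and report whether every position falls inside the lon/lat range. The two feature collections are made up: one uses projected coordinates in metres, the other uses degrees.

```python
def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate lists."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def looks_like_lonlat(geojson):
    """True if every position is within -180..180 (x) and -90..90 (y)."""
    return all(
        -180 <= x <= 180 and -90 <= y <= 90
        for feature in geojson["features"]
        for x, y in iter_positions(feature["geometry"]["coordinates"])
    )


def make_collection(ring):
    """Wrap one made-up polygon ring in a minimal FeatureCollection."""
    geometry = {"type": "Polygon", "coordinates": [ring]}
    feature = {"type": "Feature", "id": 0, "properties": {}, "geometry": geometry}
    return {"type": "FeatureCollection", "features": [feature]}


projected = make_collection([[447000, 537000], [448000, 537000], [448000, 538000], [447000, 537000]])
degrees = make_collection([[-1.3, 54.7], [-1.2, 54.7], [-1.2, 54.8], [-1.3, 54.7]])

print(looks_like_lonlat(projected))
print(looks_like_lonlat(degrees))
```

Passing the check does not prove the CRS is right, but failing it is a strong hint that to_crs(epsg=4326) is still needed.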

Intersection between 2 Geodataframe

I am doing some work with the GeoPandas library. I have a shapefile with polygons and data in an Excel sheet that I convert into points. I want to intersect the two GeoDataFrames and export the result to a file. I also use the same projection (WGS84) for both so that I can compare them.
There should be at least some points that intersect the polygons.
My intersection GeoSeries does not give me any points that fall inside the polygon, but I don't see why...
I checked that the unit of the shapefile was really kilometres and not something else. I am not proficient with GeoPlot, so I can't really check what the GeoDataFrame looks like.
import pandas as pd
import geopandas
from geopandas import GeoDataFrame
from shapely.geometry import Point
df = pd.read_excel(io = 'C:\\Users\\peilj\\meteo_sites.xlsx')
# Converting pandas DataFrame into a GeoDataFrame with CRS projection
geometry = [Point(xy) for xy in zip(df.geoBreite, df.geoLaenge)]
df = df.drop(['geoBreite', 'geoLaenge'], axis=1)
crs = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
gdf = GeoDataFrame(df, crs=crs, geometry=geometry)
# Reading shapefile and creating buffer
gdfBuffer = geopandas.read_file(filename = 'C:\\Users\\peilj\\lkr_vallanUTM.shp')
gdfBuffer = gdfBuffer.buffer(100) # When the unit is kilometer
# Converting positions long/lat into shapely object
gdfBuffer = gdfBuffer.to_crs("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
# Intersection coordinates from polygon buffer and points of stations
gdf['intersection'] = gdf.geometry.intersects(gdfBuffer)
# Problem: DOES NOT FIND ANY POINTS INSIDE STATIONS !!!!!!!
# Giving CRS projection to the intersect GeoDataFrame
gdf_final = gdf.to_crs("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
gdf_final['intersection'] = gdf_final['intersection'].astype(int) # Shapefile does not accept bool
# Exporting to a file
gdf_final.to_file(driver='ESRI Shapefile', filename=r'C:\\GIS\\dwd_stationen.shp')
The files needed:
https://drive.google.com/open?id=11x55aNxPOdJVKDzRWLqrI3S_ExwbqCE9

Two things:
First, you need to swap geoBreite and geoLaenge when creating the points:
geometry = [Point(xy) for xy in zip(df.geoLaenge, df.geoBreite)]
This is because shapely uses x, y order (longitude, latitude), not lat, lon.
Second, to check the intersection, you could do as follows:
gdf['inside'] = gdf['geometry'].apply(lambda shp: shp.intersects(gdfBuffer.dissolve('LAND').iloc[0]['geometry']))
which detects six stations inside the shapefile:
gdf['inside'].sum()
outputs:
6
So, along with some other minor fixes, we get:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
df = pd.read_excel(r'C:\Users\peilj\meteo_sites.xlsx')
# x = longitude (geoLaenge), y = latitude (geoBreite)
geometry = [Point(xy) for xy in zip(df.geoLaenge, df.geoBreite)]
crs = {'init': 'epsg:4326'}
gdf = gpd.GeoDataFrame(df, crs=crs, geometry=geometry)
gdfBuffer = gpd.read_file(filename = r'C:\Users\peilj\lkr_vallanUTM.shp')
# buffer in the shapefile's projected CRS, then reproject to WGS84
gdfBuffer['geometry'] = gdfBuffer['geometry'].buffer(100)
gdfBuffer = gdfBuffer.to_crs(crs)
gdf['inside'] = gdf['geometry'].apply(lambda shp: shp.intersects(gdfBuffer.dissolve('LAND').iloc[0]['geometry']))
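The effect of the coordinate order can be seen in a small sketch. The rectangle and the station position below are made-up values (roughly a lon/lat box around Germany and a point near Berlin), not data from the files in the question.

```python
from shapely.geometry import Point, box

# Made-up area of interest in lon/lat order: box(minx, miny, maxx, maxy)
area = box(5.0, 47.0, 15.0, 55.0)

lon, lat = 13.4, 52.5  # hypothetical station position

correct = Point(lon, lat)  # x = longitude, y = latitude
swapped = Point(lat, lon)  # latitude passed as x by mistake

print(correct.intersects(area))
print(swapped.intersects(area))
```

With the swapped order the point lands far outside the rectangle, which is the same symptom as an intersection test that never finds any stations.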

Combining shapefiles in Python / GeoPandas

I have three polygon shapefiles which overlap each other. Let's call them:
file_one.shp (polygon Name is 1)
file_two.shp (polygon Name is 2)
file_three.shp (polygon Name is 3)
I want to combine them and keep the values like this.
How can I achieve the result (As shown in the figure) in Python, please?
Thanks!

If you simply want to create one shapefile from the files you've mentioned, you can try the following code (I assume that the shapefiles have the same columns).
import pandas as pd
import geopandas as gpd
gdf1 = gpd.read_file('file_one.shp')
gdf2 = gpd.read_file('file_two.shp')
gdf3 = gpd.read_file('file_three.shp')
gdf = gpd.GeoDataFrame(pd.concat([gdf1, gdf2, gdf3]))

First, let's generate some data for demonstration:
import geopandas as gpd
from shapely.geometry import Point
shp1 = gpd.GeoDataFrame({'geometry': [Point(1, 1).buffer(3)], 'name': ['Shape 1']})
shp2 = gpd.GeoDataFrame({'geometry': [Point(1, 1).buffer(2)], 'name': ['Shape 2']})
shp3 = gpd.GeoDataFrame({'geometry': [Point(1, 1).buffer(1)], 'name': ['Shape 3']})
Now take the symmetric difference for all but the smallest shape, which can be left as is:
diffs = []
gdfs = [shp1, shp2, shp3]
for idx, gdf in enumerate(gdfs):
    if idx < 2:
        diffs.append(gdf.symmetric_difference(gdfs[idx+1]).iloc[0])
diffs.append(shp3.iloc[0].geometry)
There you go, now you have the desired shapes as a list in diffs. If you would like to combine them to one GeoDataFrame, just do as follows:
all_shapes = gpd.GeoDataFrame(geometry=diffs)
