I have the GeoJSON
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "Polygon",
"coordinates": [
[[13.65374516425911, 52.38533382814119], [13.65239769133293, 52.38675829106993], [13.64970274383571, 52.38675829106993], [13.64835527090953, 52.38533382814119], [13.64970274383571, 52.38390931824483], [13.65239769133293, 52.38390931824483], [13.65374516425911, 52.38533382814119]]
]
}
}
]
}
which http://geojson.io displays as
I would like to calculate its area (87106.33m^2) with Python. How do I do that?
What I tried
# core modules
from functools import partial
# 3rd party modules
from shapely.geometry import Polygon
from shapely.ops import transform
import pyproj
l = [[13.65374516425911, 52.38533382814119, 0.0], [13.65239769133293, 52.38675829106993, 0.0], [13.64970274383571, 52.38675829106993, 0.0], [13.64835527090953, 52.38533382814119, 0.0], [13.64970274383571, 52.38390931824483, 0.0], [13.65239769133293, 52.38390931824483, 0.0], [13.65374516425911, 52.38533382814119, 0.0]]
polygon = Polygon(l)
print(polygon.area)
proj = partial(pyproj.transform, pyproj.Proj(init='epsg:4326'),
pyproj.Proj(init='epsg:3857'))
print(transform(proj, polygon).area)
It gives 1.1516745933889345e-05 and 233827.03300877335. I expected the first value to be meaningless (it is in squared degrees), but how do I fix the second one? (I have no idea how to set the pyproj.Proj init parameter.)
I guess epsg:4326 makes sense as it is WGS84 (source), but for epsg:3857 I'm uncertain.
Better results
The following is a lot closer:
# core modules
from functools import partial
# 3rd party modules
import pyproj
from shapely.geometry import Polygon
import shapely.ops as ops
l = [[13.65374516425911, 52.38533382814119, 0],
[13.65239769133293, 52.38675829106993, 0],
[13.64970274383571, 52.38675829106993, 0],
[13.64835527090953, 52.38533382814119, 0],
[13.64970274383571, 52.38390931824483, 0],
[13.65239769133293, 52.38390931824483, 0],
[13.65374516425911, 52.38533382814119, 0]]
polygon = Polygon(l)
print(polygon.area)
geom_area = ops.transform(
partial(
pyproj.transform,
pyproj.Proj(init='EPSG:4326'),
pyproj.Proj(
proj='aea',
lat1=polygon.bounds[1],
lat2=polygon.bounds[3])),
polygon)
print(geom_area.area)
It gives 87254.7m^2, which is still about 148m^2 off from what geojson.io says. Why is that the case?
It looks like geojson.io is not calculating the area after projecting the spherical coordinates onto a plane like you are, but rather using a specific algorithm for calculating the area of a polygon on the surface of a sphere, directly from the WGS84 coordinates. If you want to recreate it you can find the source code here.
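A minimal pure-Python sketch of that kind of spherical calculation is given here. It is modelled on the ring-area formula commonly used in JavaScript GeoJSON tooling; that geojson.io uses exactly this formula is an assumption, and the names `ring_area_m2` and `WGS84_RADIUS_M` are made up for illustration. It handles a single outer ring only; holes would have to be subtracted separately.

```python
import math

WGS84_RADIUS_M = 6378137.0  # equatorial radius; treating the Earth as a sphere is an approximation


def ring_area_m2(ring):
    """Approximate area of a closed (lon, lat) ring on a sphere, in square metres."""
    n = len(ring)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        # Each term pairs the latitude of one vertex with the longitude
        # difference between its two neighbours (indices wrap around).
        lon_prev = math.radians(ring[i][0])
        lat_mid = math.radians(ring[(i + 1) % n][1])
        lon_next = math.radians(ring[(i + 2) % n][0])
        total += (lon_next - lon_prev) * math.sin(lat_mid)
    # The sign depends on the winding order, so take the absolute value.
    return abs(total) * WGS84_RADIUS_M ** 2 / 2.0


ring = [
    [13.65374516425911, 52.38533382814119],
    [13.65239769133293, 52.38675829106993],
    [13.64970274383571, 52.38675829106993],
    [13.64835527090953, 52.38533382814119],
    [13.64970274383571, 52.38390931824483],
    [13.65239769133293, 52.38390931824483],
    [13.65374516425911, 52.38533382814119],
]
area = ring_area_m2(ring)
print(round(area, 2))
```

For the polygon in the question this should land very close to the figure geojson.io reports, without any projection step.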
If you are happy to keep projecting the coordinates onto a flat system to calculate the area, because the accuracy is good enough for your use case, then you might try a projection designed for Germany instead (EPSG:5243 in the example below). E.g.:
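import json
from osgeo import ogr
from osgeo import osr
# geoJSON is assumed to be the FeatureCollection from the question, parsed into a dict
source = osr.SpatialReference()
source.ImportFromEPSG(4326)
target = osr.SpatialReference()
target.ImportFromEPSG(5243)
# With GDAL 3 or later, EPSG:4326 defaults to (lat, lon) axis order; calling
# source.SetAxisMappingStrategy(osr.OAMS_TRADITIONAL_GIS_ORDER) keeps GeoJSON's (lon, lat)
transform = osr.CoordinateTransformation(source, target)
# json.dumps produces valid JSON text; str() on a dict gives a single-quoted Python repr
poly = ogr.CreateGeometryFromJson(json.dumps(geoJSON['features'][0]['geometry']))
poly.Transform(transform)
poly.GetArea()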
which returns 87127.2534625642
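As for the EPSG:3857 attempt in the question: Web Mercator stretches lengths by roughly 1/cos(latitude), so areas come out inflated by about 1/cos(latitude) squared. The following is only a rough sanity check under a spherical model, using the polygon's central latitude; the variable names are made up for illustration.

```python
import math

# Area reported by the EPSG:3857 attempt in the question (square metres on the Web Mercator plane)
area_web_mercator = 233827.03300877335
# Central latitude of the polygon, taken from its coordinates
central_lat_deg = 52.38533382814119

# Linear scale factor is about 1 / cos(latitude), so the area factor is its square.
area_scale = 1.0 / math.cos(math.radians(central_lat_deg)) ** 2
approx_true_area = area_web_mercator / area_scale

print(round(area_scale, 3))
print(round(approx_true_area))
```

The corrected value should sit close to the geojson.io figure, which suggests the 233827 result is a property of the projection rather than a mistake in the code. For a polygon spanning a wide range of latitudes a single scale factor would not be good enough.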
Related
I am trying to write Python code that takes GeoJSON containing a polygon as input and then outputs a polygon that is scaled by an exact value in meters. For example, let's say that my polygon is a square representing a house and I want to get another square that is exactly 20m bigger on each side. This would be simple with a square, but it is quite hard with complex shapes.
Here is what I use now (found here: Scale GeoJSON to find latitude and longitude points nearby):
import json
from shapely import affinity
from shapely.geometry import shape, Point, mapping
def handler(event, context):
    with open('polygon.geojson', encoding='utf-8') as f:
        js = json.load(f)
    polygon = shape(js['features'][0]['geometry'])
    polygon_nearby = affinity.scale(polygon, xfact=1.1, yfact=1.1)
    print(json.dumps({"type": "FeatureCollection", "features": [{"type": "Feature", 'properties': {}, 'geometry': mapping(polygon)}]}))
    print(json.dumps({"type": "FeatureCollection", "features": [{"type": "Feature", 'properties': {}, 'geometry': mapping(polygon_nearby)}]}))
if __name__ == '__main__':
    handler(None, None)
However, as you can see, this scales by a relative value (10%), and I would like to change that to an absolute value in meters.
Thank you in advance for any suggestions.
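One possible approach, sketched under assumptions rather than taken from the thread, is to project the polygon into a metric coordinate system, buffer it there by the desired number of metres, and project it back. The helper `buffer_in_meters`, the choice of a local azimuthal equidistant projection, and the sample footprint are all illustrative; a national or UTM CRS would work as well.

```python
from pyproj import CRS, Transformer
from shapely.geometry import Polygon
from shapely.ops import transform


def buffer_in_meters(geom, distance_m):
    """Buffer a lon/lat (EPSG:4326) geometry by a distance given in metres."""
    centre = geom.centroid
    # Local azimuthal equidistant projection centred on the geometry (units: metres).
    local_crs = CRS.from_proj4(
        f"+proj=aeqd +lat_0={centre.y} +lon_0={centre.x} +datum=WGS84 +units=m"
    )
    to_local = Transformer.from_crs("EPSG:4326", local_crs, always_xy=True)
    to_wgs84 = Transformer.from_crs(local_crs, "EPSG:4326", always_xy=True)

    projected = transform(to_local.transform, geom)
    # join_style=2 is a mitre join, which keeps square corners square.
    grown = projected.buffer(distance_m, join_style=2)
    return transform(to_wgs84.transform, grown)


# Hypothetical house footprint as (lon, lat) pairs, not taken from the question.
house = Polygon([(13.6500, 52.3850), (13.6502, 52.3850),
                 (13.6502, 52.3851), (13.6500, 52.3851)])
bigger = buffer_in_meters(house, 20)
print(bigger.wkt)
```

The result can be passed to `mapping()` and dumped as GeoJSON exactly like `polygon_nearby` in the code above. With the default round joins (leaving out `join_style`) the corners would be rounded instead.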
I have a script that creates a lot of Polygons using Shapely and then exports them as .geojson files. See toy example below
from shapely.geometry import Polygon
import geopandas
roi = Polygon([(0,0), (0,1), (1,0), (1,1)])
rois = [roi, roi]
geopandas.GeoSeries(rois).to_file("detection_data.geojson", driver='GeoJSON')
However, I also have a list of numbers, where each number is associated with one polygon. Is there a way to export this with the GeoJSON file inside properties?
For example, if I have a list:
detection_prob = [0.8, 0.9]
In the .geojson file I would like the properties section for the first polygon to read
"properties":{"detection_prob":0.8}
and for the second polygon
"properties":{"detection_prob":0.9}
etc etc etc... in the outputted GeoJSON file.
If you call to_file on a dataframe instead of a series, you can add extra attributes as columns:
import geopandas as gpd
import shapely.geometry as g
df = gpd.GeoDataFrame({
'geometry': [g.Point(0, 0), g.Point(1,1)],
'name': ['foo', 'bar']
})
df.to_file('out.json', driver='GeoJSON')
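Applied to the data in the question, the same idea might look like the sketch below. The polygon vertices are reordered so the toy square is valid, and `to_json()` is used here only so that the result can be inspected in memory.

```python
import json

import geopandas as gpd
from shapely.geometry import Polygon

roi = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
rois = [roi, roi]
detection_prob = [0.8, 0.9]

# Every non-geometry column ends up in the "properties" of its feature.
gdf = gpd.GeoDataFrame({"detection_prob": detection_prob}, geometry=rois)

# to_json() returns the GeoJSON text; gdf.to_file("detection_data.geojson", driver="GeoJSON")
# would write the same features to disk.
feature_collection = json.loads(gdf.to_json())
for feature in feature_collection["features"]:
    print(feature["properties"])
```

Further per-polygon values can be added as extra columns in the same dictionary.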
I wrote code that creates a visualization of all forest fires that happened during 2021. The CSV file containing the data is around 1.5 GB. The program looks correct to me, but when I run it, it gets stuck without displaying any visualization or error message. The last time I tried, it ran for almost half a day until Python crashed.
I don't know whether I have an infinite loop, whether the file is too big, or whether I am missing something else.
Can anyone provide feedback, please?
Here is my code:
import csv
from datetime import datetime
from plotly.graph_objs import Scattergeo , Layout
from plotly import offline
filename='fire_nrt_J1V-C2_252284.csv'
with open(filename) as f:
    reader=csv.reader(f)
    header_row=next(reader)
    lats, lons, brights, dates=[],[],[],[]
    for row in reader:
        date=datetime.strptime(row[5], '%Y-%m-%d')
        # csv.reader yields strings, so convert the numeric columns
        lat=float(row[0])
        lon=float(row[1])
        bright=float(row[2])
        lats.append(lat)
        lons.append(lon)
        brights.append(bright)
        dates.append(date)
data=[{
'type':'scattergeo',
'lon':lons,
'lat':lats,
'text':dates,
'marker':{
'size':[5*bright for bright in brights],
'color': brights,
'colorscale':'Reds',
'colorbar': {'title':'Fire brightness'},
}
}]
my_layout=Layout(title="Forestfires during the year 2021")
fig={'data':data,'layout':my_layout}
offline.plot(fig, filename='global_fires_2021.html')
I found the data you describe here: https://wifire-data.sdsc.edu/dataset/viirs-i-band-375-m-active-fire-data/resource/3ce73b20-f584-44f7-996b-2f319c480294
Plotly uses resources for every point plotted on a scatter trace, so there is a limit before you run out of resources.
There are other approaches to plotting larger numbers of points:
https://plotly.com/python/mapbox-density-heatmaps/ has fewer limits, but is still limited on very large data sets.
https://plotly.com/python/datashader/ can work with very large data sets because it generates an image. It is more challenging to work with (installing it and navigating the API).
data sourcing
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
df = pd.read_csv("https://firms.modaps.eosdis.nasa.gov/data/active_fire/noaa-20-viirs-c2/csv/J1_VIIRS_C2_Global_7d.csv")
df
scatter_geo
limited to random sample of 1000 rows
px.scatter_geo(
df.sample(1000),
lat="latitude",
lon="longitude",
color="bright_ti4",
# size="size",
hover_data=["acq_date"],
color_continuous_scale="reds",
)
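`df.sample(1000)` needs the whole table in memory first. For a 1.5 GB local file, one alternative (a swapped-in technique, not something used above) is reservoir sampling while streaming the CSV, so that only the sampled rows are ever kept. This is a sketch; the helper name `sample_csv_rows` and the column names in the stand-in data are assumptions.

```python
import csv
import io
import random


def sample_csv_rows(lines, k, seed=None):
    """Return k randomly chosen data rows (as dicts) while reading the input only once."""
    rng = random.Random(seed)
    reader = csv.DictReader(lines)
    reservoir = []
    for index, row in enumerate(reader):
        if index < k:
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                reservoir[slot] = row
    return reservoir


# Tiny in-memory stand-in for the real file; with a file on disk,
# pass open(filename, newline="") instead.
fake_csv = io.StringIO(
    "latitude,longitude,bright_ti4\n"
    + "".join(f"{i % 90},{i % 180},{300 + i % 50}\n" for i in range(10_000))
)
rows = sample_csv_rows(fake_csv, k=1000, seed=0)
lats = [float(r["latitude"]) for r in rows]
lons = [float(r["longitude"]) for r in rows]
brights = [float(r["bright_ti4"]) for r in rows]
print(len(rows))
```

The resulting lists can be handed to a scatter plot in place of the full columns.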
density mapbox
px.density_mapbox(
df.sample(5000),
lat="latitude",
lon="longitude",
z="bright_ti4",
radius=3,
color_continuous_scale="reds",
zoom=1,
mapbox_style="carto-positron",
)
datashader Mapbox
all data
some libraries are more difficult to install and use
need to deal with this issue https://community.plotly.com/t/datashader-image-distorted-when-passed-to-mapbox/39375/2
import datashader as ds, colorcet
from pyproj import Transformer
t3857_to_4326 = Transformer.from_crs(3857, 4326, always_xy=True)
# project CRS to ensure image overlays appropriately back over mapbox
# https://community.plotly.com/t/datashader-image-distorted-when-passed-to-mapbox/39375/2
df.loc[:, "longitude_3857"], df.loc[:, "latitude_3857"] = ds.utils.lnglat_to_meters(
df.longitude, df.latitude
)
RESOLUTION=1000
cvs = ds.Canvas(plot_width=RESOLUTION, plot_height=RESOLUTION)
agg = cvs.points(df, x="longitude_3857", y="latitude_3857")
img = ds.tf.shade(agg, cmap=colorcet.fire).to_pil()
fig = go.Figure(go.Scattermapbox())
fig.update_layout(
mapbox={
"style": "carto-positron",
"layers": [
{
"sourcetype": "image",
"source": img,
# Sets the coordinates array contains [longitude, latitude] pairs for the image corners listed in
# clockwise order: top left, top right, bottom right, bottom left.
"coordinates": [
t3857_to_4326.transform(
agg.coords["longitude_3857"].values[a],
agg.coords["latitude_3857"].values[b],
)
for a, b in [(0, -1), (-1, -1), (-1, 0), (0, 0)]
],
}
],
},
margin={"l": 0, "r": 0, "t": 0, "b": 0},
)
In the code below I want to calculate the distance from a point to the nearest edge of a polygon. The coordinates are given in the results section below, and the code shows how I compute the distance.
At run time, for the given point and geometry, PostGIS returns a distance of 4.32797817574802, while GeoPandas gives 3.8954865274727614e-05.
Please let me know how to find the distance from a point to the nearest edge of a polygon.
code
from shapely import wkt
poly = wkt.loads(fieldCoordinatesAsTextInWKTInEPSG25832)
pt = wkt.loads(centerPointointAsTextInWKTInEPSG25832)
print(poly.distance(pt))
results:
queryPostgreSQLForDistancesFromPointsToPolygon:4.32797817574802#result from postgis using st_distance operator
centerPointointAsTextInWKTInEPSG4326:POINT(6.7419520458647835 51.08427961641239)
centerPointointAsTextInWKTInEPSG25832:POINT(341849.5 5661622.5)
centerPointointAsTextInWKTInEPSG4326:POINT(6.7419520458647835 51.08427961641239)
fieldCoordinatesAsTextInWKTInEPSG25832:POLYGON ((5622486.93624152 1003060.89945681,5622079.52632924 1003170.95198635,5622126.00418918 1003781.73122161,5622444.73987453 1003694.55868486,5622486.93624152 1003060.89945681))
fieldCoordinatesAsTextInWKTInEPSG4326:POLYGON((6.741879696309871 51.08423775429969,6.742907378503366 51.08158745820981,6.746964018740842 51.08233499299334,6.746152690693346 51.08440763989611,6.741879696309871 51.08423775429969))
poly.distance(pt):3.8954865274727614e-05#result from geopandas
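Your code works. With the EPSG:25832 strings as posted, the polygon (x around 5,622,000, y around 1,003,000) is nowhere near the point (x around 341,850, y around 5,661,600): the distance comes out at approx 7000 km, roughly Belgium to Ethiopia.
Are you sure your data is correct? I have built a Plotly graph to show where the buffered polygon, the polygon centroid and the point are located in the EPSG:4326 CRS.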
from shapely import wkt
import geopandas as gpd
import plotly.express as px
import json
# queryPostgreSQLForDistancesFromPointsToPolygon:4.32797817574802#result from postgis using st_distance operator
centerPointointAsTextInWKTInEPSG4326 = "POINT(6.7419520458647835 51.08427961641239)"
centerPointointAsTextInWKTInEPSG25832 = "POINT(341849.5 5661622.5)"
centerPointointAsTextInWKTInEPSG4326 = "POINT(6.7419520458647835 51.08427961641239)"
fieldCoordinatesAsTextInWKTInEPSG25832 = "POLYGON ((5622486.93624152 1003060.89945681,5622079.52632924 1003170.95198635,5622126.00418918 1003781.73122161,5622444.73987453 1003694.55868486,5622486.93624152 1003060.89945681))"
fieldCoordinatesAsTextInWKTInEPSG4326 = "POLYGON((6.741879696309871 51.08423775429969,6.742907378503366 51.08158745820981,6.746964018740842 51.08233499299334,6.746152690693346 51.08440763989611,6.741879696309871 51.08423775429969))"
# poly.distance(pt):3.8954865274727614e-05#result from geopandas
poly = wkt.loads(fieldCoordinatesAsTextInWKTInEPSG25832)
pt = wkt.loads(centerPointointAsTextInWKTInEPSG25832)
print(poly.distance(pt)/10**3)
# let's visualize it....
gpoly = (
gpd.GeoDataFrame(geometry=[poly], crs="EPSG:25832")
.buffer(10 ** 6)
.to_crs("EPSG:4326")
)
gpoly.plot()
gpt = gpd.GeoDataFrame(geometry=[pt, poly.centroid], crs="EPSG:25832").to_crs(
"EPSG:4326"
)
px.scatter_mapbox(
gpt.assign(dist=poly.distance(pt)/10**3),
lat=gpt.geometry.y,
lon=gpt.geometry.x,
hover_data={"dist":":.0f"},
).update_layout(
mapbox={
"style": "carto-positron",
"zoom": 4,
"layers": [
{
"source": json.loads(gpoly.to_json()),
"below": "traces",
"type": "fill",
"color": "red",
}
],
}
)
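One way to cross-check the numbers, assuming the EPSG:4326 strings from the question are the trustworthy ones, is to reproject those into EPSG:25832 with pyproj and measure the distance there. The tiny 3.9e-05 figure is what you get when the distance is computed on the EPSG:4326 geometries, where the unit is degrees. This is a sketch, not the asker's pipeline.

```python
from pyproj import Transformer
from shapely import wkt
from shapely.ops import transform

point_4326 = wkt.loads("POINT(6.7419520458647835 51.08427961641239)")
field_4326 = wkt.loads(
    "POLYGON((6.741879696309871 51.08423775429969,"
    "6.742907378503366 51.08158745820981,"
    "6.746964018740842 51.08233499299334,"
    "6.746152690693346 51.08440763989611,"
    "6.741879696309871 51.08423775429969))"
)

# always_xy=True keeps (lon, lat) input and (easting, northing) output order.
to_utm32 = Transformer.from_crs("EPSG:4326", "EPSG:25832", always_xy=True)
point_25832 = transform(to_utm32.transform, point_4326)
field_25832 = transform(to_utm32.transform, field_4326)

distance_deg = field_4326.distance(point_4326)    # in degrees, not useful as a length
distance_m = field_25832.distance(point_25832)    # in metres
print(distance_deg)
print(round(distance_m, 2))
```

The metric result should be a few metres, in line with the PostGIS figure, which would point at the posted EPSG:25832 polygon string as the odd one out.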
I'm having difficulty loading the following JSON containing GIS data (https://data.cityofnewyork.us/resource/5rqd-h5ci.json) into a GeoDataFrame.
The following code fails when I try to set the geometry.
import requests
import geopandas as gpd
data = requests.get("https://data.cityofnewyork.us/resource/5rqd-h5ci.json")
gdf = gpd.GeoDataFrame(data.json())
gdf = gdf.set_geometry('the_geom')
gdf.head()
For people who are using web mapping libraries...
If the GeoJSON is wrapped in a FeatureCollection, as it often is when exported to a GeoJSON string by web mapping libraries (in my case, Leaflet), then all you need to do is pass the list at features to from_features() like so:
import json
import geopandas as gpd
study_area = json.loads("""
{"type": "FeatureCollection", "features": [{"type": "Feature", "properties": {}, "geometry": {"type": "Polygon", "coordinates": [[[36.394272, -18.626726], [36.394272, -18.558391], [36.489716, -18.558391], [36.489716, -18.626726], [36.394272, -18.626726]]]}}]}
""")
gdf = gpd.GeoDataFrame.from_features(study_area["features"])
print(gdf.head())
Output:
geometry
0 POLYGON ((36.394272 -18.626726, 36.394272 -18....
Easy peasy.
Setting the geometry fails because the geopandas.GeoDataFrame constructor doesn't appear to be built to handle JSON objects as python data structures. It therefore complains about the argument not being a valid geometry object. You have to parse it into something that geopandas.GeoDataFrame can understand, like a shapely.geometry.shape. Here's what ran without error on my side, Python 3.5.4:
#!/usr/bin/env python3
import requests
import geopandas as gpd
from shapely.geometry import shape
r = requests.get("https://data.cityofnewyork.us/resource/5rqd-h5ci.json")
r.raise_for_status()
data = r.json()
for d in data:
    d['the_geom'] = shape(d['the_geom'])
gdf = gpd.GeoDataFrame(data).set_geometry('the_geom')
gdf.head()
A disclaimer: I know absolutely nothing about Geo anything. I didn't even know these libraries and this kind of data existed until I installed geopandas to tackle this bounty and read a little bit of online documentation.
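The same conversion can be tried offline. The sketch below uses two made-up records shaped like the API response (an assumption about its layout: each record carries a GeoJSON geometry dict under `the_geom`), so no network access is needed.

```python
import geopandas as gpd
from shapely.geometry import shape

# Made-up records; only the 'the_geom' layout mirrors the real data.
records = [
    {"name": "a", "the_geom": {"type": "Polygon", "coordinates": [
        [[-74.0, 40.7], [-73.9, 40.7], [-73.9, 40.8], [-74.0, 40.7]]]}},
    {"name": "b", "the_geom": {"type": "MultiPolygon", "coordinates": [
        [[[-73.8, 40.6], [-73.7, 40.6], [-73.7, 40.7], [-73.8, 40.6]]]]}},
]

# shape() turns a GeoJSON-like dict into the matching shapely geometry object.
for record in records:
    record["the_geom"] = shape(record["the_geom"])

gdf = gpd.GeoDataFrame(records).set_geometry("the_geom")
print(gdf.geometry.geom_type.tolist())
```

Once the column holds shapely objects, `set_geometry` accepts it and the other keys stay available as ordinary columns.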
Combining the above answers, this worked for me.
import pandas as pd
import geopandas as gpd
from shapely.geometry import shape
nta = pd.read_json( r'https://data.cityofnewyork.us/resource/93vf-i5bz.json' )
nta['the_geom'] = nta['the_geom'].apply(shape)
nta_geo = gpd.GeoDataFrame(nta).set_geometry('the_geom')
A more idiomatic way that uses regular dataframe functions inherited from pandas, and the native GeoDataFrame.from_features:
gdf = gpd.GeoDataFrame(data.json())
# features column does not need to be stored, this is just for illustration
gdf['features'] = gdf['the_geom'].apply(lambda x: {'geometry': x, 'properties': {}})
gdf2 = gpd.GeoDataFrame.from_features(gdf['features'])
gdf = gdf.set_geometry(gdf2.geometry)
gdf.head()