Convert geopandas dataframe to GEE feature collection using python - python

Given a geopandas dataframe (e.g. df, which contains a geometry field), is the following the simplest way to convert it into an ee.FeatureCollection?
import ee
features = []
for index, row in df.iterrows():
    g = ee.Geometry.Point([row['geometry'].x, row['geometry'].y])
    # Define feature with a geometry and 'name' field from the dataframe
    feature = ee.Feature(g, {'name': ee.String(row['name'])})
    features.append(feature)
fc = ee.FeatureCollection(features)

If you want to convert a GeoDataFrame of points (GeoPandas) to an ee.FeatureCollection, you can use this function:
import ee
import numpy as np
from functools import reduce

def make_points(gdf):
    features = []
    for geom in gdf.geometry:
        # coordinate arrays of the point (one value each)
        x, y = geom.coords.xy
        cords = np.dstack((x, y)).tolist()
        # flatten the nested list [[[x, y]]] down to [x, y]
        double_list = reduce(lambda a, b: a + b, cords)
        single_list = reduce(lambda a, b: a + b, double_list)
        point = ee.Geometry.Point(single_list)
        features.append(ee.Feature(point))
    return ee.FeatureCollection(features)

points_features_collections = make_points(points_gdf)
I based this function on this reference.

You can build a FeatureCollection from a json object. So if your geometry data file type is GeoJson you can do the following:
# import libraries
import ee
import json
# initialize earth engine client
ee.Initialize()
# load your geometry data (which should be in a GeoJSON file)
with open("my_geometry_data.geojson") as f:
    geojson = json.load(f)
# construct a FeatureCollection object from the json object
fc = ee.FeatureCollection(geojson)
If your geometry data is in a different format (shapefile, GeoPackage), you can first save it as GeoJSON and then build a FeatureCollection object.
Finally, if you don't want to write any conversion code and just want to convert your geopandas.GeoDataFrame directly to an ee.FeatureCollection, you can use the Python package geemap.
geemap has several functions for converting geometry data to a FeatureCollection, and vice versa. You can see examples here. In your case, you need to use the geopandas_to_ee function, so your code would look like this:
# import libraries
import ee
import geemap
import geopandas as gpd
# initialize earth engine client
ee.Initialize()
# load your geometry data using GeoPandas (which can be stored in different formats)
gdf = gpd.read_file("my_geometry_file.geojson")
# convert to FeatureCollection using one line of code
fc = geemap.geopandas_to_ee(gdf)
Note that under the hood, geemap converts the GeoDataFrame to GeoJSON and then follows the first approach I mentioned above.
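As a rough illustration of the GeoJSON route described above, the sketch below builds a GeoJSON FeatureCollection dictionary from plain point records using only the standard library. The function name points_to_geojson and the sample records are hypothetical. The resulting dictionary has the structure that would be passed to ee.FeatureCollection(...) after ee.Initialize(); neither call is made here, so the snippet runs without Earth Engine credentials.

```python
import json

def points_to_geojson(records):
    """Build a GeoJSON FeatureCollection dict from (lon, lat, name) records."""
    features = []
    for lon, lat, name in records:
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical sample data standing in for the rows of a GeoDataFrame
sample_records = [(23.68, 37.99, "site_a"), (-2.74, 52.08, "site_b")]
geojson_fc = points_to_geojson(sample_records)
print(json.dumps(geojson_fc)[:60])
```

With a real GeoDataFrame, a comparable dictionary should be available from json.loads(gdf.to_json()), so the hand-written loop is only needed when the data is not already in GeoPandas.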

Related

Convert .geojson to .wkt | extract 'coordinates'

Goal: Ultimately, to convert .geojson to .wkt. Here, I want to extract every set of coordinates, each as a list.
In my.geojson, there are n entries of the form: {"type":"Polygon","coordinates":...
Update: I've successfully extracted the first set of coordinates. However, this file has two.
Every .geojson file has at least one set of coordinates, but may have more.
How can I dynamically extract the coordinates of every feature?
Code:
from pathlib import Path
import os
import geojson
import json
from shapely.geometry import shape
ROOT = Path('path/')
all_files = os.listdir(ROOT)
geojson_files = list(filter(lambda f: f.endswith('.geojson'), all_files))
for gjf in geojson_files:
    with open(f'{str(ROOT)}/{gjf}') as f:
        gj = geojson.load(f)
    o = dict(coordinates = gj['features'][0]['geometry']['coordinates'], type = "Polygon")
    geom = shape(o)
    wkt = geom.wkt
Desired Output:
One .wkt file containing all the coordinates in the GeoJSON file.
To convert a series of geometries in GeoJSON files to WKT, use the shape() function: it converts a GeoJSON geometry to a shapely object, which can then be formatted as WKT and/or projected to a different coordinate reference system.
If you want to access the coordinates of a polygon once it's a shapely object, use x, y = geo.exterior.xy.
If you just want to convert a series of GeoJSON files into one .wkt file per GeoJSON file, then try this:
from pathlib import Path
import json
from shapely.geometry import shape
ROOT = Path('path')
for f in ROOT.glob('*.geojson'):
    with open(f) as fin, open(f.with_suffix(".wkt"), "w") as fout:
        features = json.load(fin)["features"]
        for feature in features:
            geo = shape(feature["geometry"])
            # format geometry coordinates as WKT
            wkt = geo.wkt
            print(wkt)
            fout.write(wkt + "\n")
This output uses your example my.geojson file.
Output:
POLYGON ((19372 2373, 19322 2423, ...
POLYGON ((28108 25855, 27755 26057, ...
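To make the per-feature loop concrete without any third-party packages, here is a minimal sketch that formats every Polygon feature in a FeatureCollection as WKT by hand. It is not how shapely does it internally; the helper names polygon_to_wkt and feature_collection_to_wkt and the sample collection are made up for illustration, and number formatting may differ from shapely's output for floating-point coordinates.

```python
def polygon_to_wkt(coordinates):
    """Format GeoJSON Polygon coordinates (a list of rings) as WKT."""
    rings = []
    for ring in coordinates:
        # Ignore an optional third (elevation) value in each position
        points = ", ".join(f"{x} {y}" for x, y, *_ in ring)
        rings.append(f"({points})")
    return "POLYGON (" + ", ".join(rings) + ")"

def feature_collection_to_wkt(collection):
    """Return one WKT string per Polygon feature in a FeatureCollection."""
    return [
        polygon_to_wkt(feature["geometry"]["coordinates"])
        for feature in collection["features"]
        if feature["geometry"]["type"] == "Polygon"
    ]

# Made-up collection with two polygon features
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {}, "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [4, 0], [4, 4], [0, 0]]]}},
        {"type": "Feature", "properties": {}, "geometry": {
            "type": "Polygon",
            "coordinates": [[[10, 10], [12, 10], [12, 12], [10, 10]]]}},
    ],
}
wkt_lines = feature_collection_to_wkt(sample)
for line in wkt_lines:
    print(line)
```

The key point is the same as in the shapely version: iterate over every entry of "features" rather than indexing only the first one.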
If you need to convert the coordinates to EPSG:4326 (WGS-84) (e.g. 23.314208, 37.768469), you can use pyproj.
Here is the full code to convert a collection of GeoJSON files to a new GeoJSON file in WGS-84.
from pathlib import Path
import json
import geojson
from shapely.geometry import shape, Point
from shapely.ops import transform
from pyproj import Transformer
ROOT = Path('wkt/')
# converted features are collected here (kept separate from the input list)
out_features = []
# assume we're converting from 3857 to 4326
# and center point is at lon=23, lat=37
c = Point(23.676757000000002, 37.9914205)
local_azimuthal_projection = f"+proj=aeqd +R=6371000 +units=m +lat_0={c.y} +lon_0={c.x}"
aeqd_to_wgs84 = Transformer.from_proj(local_azimuthal_projection,
                                      '+proj=longlat +datum=WGS84 +no_defs')
for f in ROOT.glob('*.geojson'):
    with open(f) as fin:
        features = json.load(fin)["features"]
    for feature in features:
        geo = shape(feature["geometry"])
        poly_wgs84 = transform(aeqd_to_wgs84.transform, geo)
        out_features.append(geojson.Feature(geometry=poly_wgs84))
# Output new GeoJSON file
with open("out.geojson", "w") as fp:
    fc = geojson.FeatureCollection(out_features)
    fp.write(geojson.dumps(fc))
Assuming the conversion is from EPSG:3857 to EPSG:4326 and the center point is at lon=23, lat=37, the output GeoJSON file will look like this:
{"features": [{"type": "Polygon", "geometry": {"coordinates": [[[23.897879, 38.012554], ...

How to join point with polygon in geopandas

I have a polygon defined as a sequence of lat-long pairs (lat1-long1, lat2-long2, ...) and a point given as lat-long.
I have used the GeoPandas library to check whether the point lies within the polygon.
Sample polygon data, saved in a csv file:
POLYGON((28.56056 77.36535,28.564635293716776
77.3675137204626,28.56871055311656 77.36967760850214,28.572785778190855 77.3718416641586,28.576860968931193 77.37400588747194,28.580936125329096 77.3761702784821,28.585011247376094 77.37833483722912,28.58908633506372 77.38049956375293,28.593161388383457 77.38266445809356,28.59723640732686 77.38482952029099,28.60131139188541 77.38699475038526,28.605386342050664 77.38916014841635,28.60946125781409 77.39132571442434,28.613536139167238 77.39349144844923,28.61761098610158 77.39565735053108,28.62168579860863 77.39782342070995,28.62576057667991 77.39998965902589,28.62983532030691 77.402156065519,28.633910029481108 77.40432264022931,28.637984704194054 77.40648938319696,28.642059344437207 77.408656294462,28.64068221074683 77.41187044231611,28.63920739580329 77.41502778244606,28.63763670052024 77.41812446187686,28.635972042808007 77.42115670220443,28.634215455216115 77.42412080422613,28.63236908243526 77.42701315247152,28.630435178662026 77.42983021962735,28.628416104829583 77.43256857085188,28.626314325707924 77.43522486797251,28.624132406877322 77.437795873562,28.621873011578572 77.44027845488824,28.619538897444272 77.4426695877325,28.617132913115164 77.44496636007166,28.614657994745563 77.44716597562005,28.612117162402576 77.44926575722634,28.609513516363293 77.45126315012166,28.606850233314923 77.45315572501488,28.604130562462267 77.45494118103147,28.60135782154758 77.45661734849246,28.598535392787774 77.45818219153013,28.595666718733966 77.45963381053753,28.592755298058414 77.46097044444889,28.589804681274302 77.46219047284835,28.586818466393503 77.46329241790465,28.583800294527727 77.46427494612952,28.58075384543836 77.46513686995802,28.57768283304089 77.46587714914885,28.574591000868892 77.4664948920035,28.571482117503592 77.46698935640259,28.568359971974488 77.46735995065883,28.565228369136484 77.46760623418534,28.56209112502966 77.4677279179792,28.558952062226695 77.4677248649196,28.55581500517431 77.46759708988064,28.552683775533943 
77.46734475965891,28.552683775533943 77.46734475965891,28.553079397193876 77.4622453846313,28.553474828308865 77.45714597129259,28.55387006887434 77.4520465196603,28.554265118885752 77.44694702975198,28.554659978338513 77.4418475015852,28.555054647228083 77.43674793517746,28.555449125549913 77.43164833054634,28.555843413299442 77.42654868770937,28.55623751047213 77.42144900668411,28.556631417063407 77.41634928748812,28.55702513306874 77.41124953013893,28.55741865848359 77.40614973465412,28.557811993303396 77.40104990105122,28.55820513752363 77.39595002934782,28.558598091139757 77.39085011956145,28.558990854147225 77.38575017170969,28.559383426541523 77.3806501858101,28.559775808318093 77.37555016188024,28.560167999472434 77.37045009993768,28.56056 77.36535))
The second dataset holds LAT and LONG, 28.56282 and 77.36824 respectively, and is also saved in a csv file.
I have used the Python code below to join both datasets on the condition that the point lies within the polygon:
import pandas as pd
import shapely.geometry
from shapely.geometry import Point
import geopandas as gpd
site_df = pd.read_csv (r'lat_long_file.csv') # load lat and long file
site_df['geometry'] = pd.DataFrame(site_df).apply(lambda x: Point(x.LAT,x.LONG), axis='columns') # convert lat and long to point
gdf = gpd.GeoDataFrame(site_df, geometry = site_df.geometry,crs='EPSG:4326') #creating geo pandas data frame for point
from shapely import wkt
polygon_df = pd.read_csv (r'polygon_csv_file') #reading polygon sample raw string file
polygon_df['geometry'] = pd.DataFrame(polygon_df).apply(lambda row: shapely.wkt.loads(row.polygon), axis='columns') # converting the polygon string to a geometry
gd_polygon = gpd.GeoDataFrame(polygon_df, geometry = polygon_df.geometry,crs='EPSG:4326') #create geopandas dataframe
import shapely.speedups
shapely.speedups.enable() # this makes some spatial queries run faster
join_data = gpd.sjoin(gdf, gd_polygon, how="inner", op="within")  # actual join condition
But that query does not return anything, even though the point lies within the polygon, as we can see in the diagram below.
The green location marker is the lat-long point, which lies within the polygon.
I would check the axis order: WKT is usually interpreted as longitude first, latitude second, while the point you construct uses latitude, longitude order.
You can try removing the CRS identifier to see if it changes the result.
Also see
https://gis.stackexchange.com/questions/376751/shapely-flips-lat-long-coordinate
and
https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
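The effect of an axis-order mismatch can be shown with a small standard-library sketch. It uses a hand-written ray-casting test instead of shapely, and a hypothetical rectangle whose vertices are given as (lon, lat); only the point's coordinates come from the question.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the closed ring of (x, y) vertices?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangle, vertices as (lon, lat); first vertex repeated to close it
ring = [(77.36, 28.55), (77.47, 28.55), (77.47, 28.65), (77.36, 28.65), (77.36, 28.55)]
lat, lon = 28.56282, 77.36824  # the point from the question

same_order = point_in_polygon(lon, lat, ring)   # x=lon, y=lat, matching the ring
swapped = point_in_polygon(lat, lon, ring)      # axes swapped relative to the ring
print(same_order, swapped)  # True False
```

Whichever order is chosen, the polygon and the point have to use the same one. Note that the polygon in the question appears to list latitude first as well, so it is worth checking both inputs rather than only the point.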
Your sample data is unusable as it's an image, so I have:
sourced a polygon (a county boundary in the UK)
constructed a geopandas data frame of a point that is within this county
used plotly to show the data visually
used your code fragment gpd.sjoin(gdf, gd_polygon, how="inner", op="within") to do the spatial join, and it correctly joins the point to the polygon
import requests, json
import geopandas as gpd
import plotly.express as px
import shapely.geometry
# fmt: off
# get a polygon and construct a point
res = requests.get("https://opendata.arcgis.com/datasets/69dc11c7386943b4ad8893c45648b1e1_0.geojson")
gd_polygon = gpd.GeoDataFrame.from_features(res.json()).loc[lambda d: d["LAD20NM"].str.contains("Hereford")]
gdf = gpd.GeoDataFrame(geometry=gd_polygon.loc[:,["LONG","LAT"]].apply(shapely.geometry.Point, axis=1)).reset_index(drop=True)
# fmt: on
# plot to show point is within polygon
px.scatter_mapbox(gd_polygon, lon="LONG", lat="LAT").update_traces(
    name="gd_polygon"
).add_traces(
    px.scatter_mapbox(gdf, lat=gdf.geometry.y, lon=gdf.geometry.x)
    .update_traces(name="gdf", marker_color="red")
    .data
).update_traces(
    showlegend=True
).update_layout(
    mapbox={
        "style": "carto-positron",
        "layers": [
            {"source": json.loads(gd_polygon.geometry.to_json()), "type": "line"}
        ],
    }
).show()
# spatial join, all good :-)
gpd.sjoin(gdf, gd_polygon, how="inner", op="within")
Output
The spatial join has worked; the point is within the polygon.
|   | geometry | index_right | OBJECTID | LAD20CD | LAD20NM | LAD20NMW | BNG_E | BNG_N | LONG | LAT | Shape__Area | Shape__Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | POINT (-2.73931 52.081539) | 18 | 19 | E06000019 | Herefordshire, County of |  | 349434 | 242834 | -2.73931 | 52.0815 | 2.18054e+09 | 285427 |

How do I test if Point is in Polygon/Multipolygon with geopandas in Python?

I have the polygon data for the US states from the ArcGIS website, and I also have an Excel file with the coordinates of cities. I have converted the coordinates to geometry data (Points).
Now I want to test whether the points are in the USA.
Both are dtype: geometry. I thought this would make them easy to compare, but when I run my code I get False for every point, even for points that are in the USA.
The code is:
import geopandas as gp
import pandas as pd
import xlsxwriter
import xlrd
from shapely.geometry import Point, Polygon
df1 = pd.read_excel('PATH')
gdf = gp.GeoDataFrame(df1, geometry= gp.points_from_xy(df1.longitude, df1.latitude))
US = gp.read_file('PATH')
print(gdf['geometry'].contains(US['geometry']))
Does anybody know what I am doing wrong?
contains in GeoPandas currently works on a pairwise basis (1-to-1, row by row), not 1-to-many. For this purpose, use sjoin.
points_within = gp.sjoin(gdf, US, op='within')
That will return only those points within the US. Alternatively, you can filter polygons which contain points.
polygons_contains = gp.sjoin(US, gdf, op='contains')
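The difference between a pairwise test and a one-to-many test can be sketched without GeoPandas. The example below uses axis-aligned boxes as hypothetical stand-ins for the state polygons and a hand-written box_contains helper; it only illustrates the two evaluation patterns, not the GeoPandas implementation.

```python
def box_contains(box, point):
    """True if point (x, y) lies inside box (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = box
    x, y = point
    return minx <= x <= maxx and miny <= y <= maxy

# Hypothetical stand-ins: two "state" boxes and three city points
boxes = [(0, 0, 10, 10), (10, 0, 20, 10)]
points = [(15, 5), (5, 5), (30, 30)]

# Pairwise: row i is compared only with row i, as described for contains
pairwise = [box_contains(b, p) for b, p in zip(boxes, points)]

# One-to-many: each point is checked against every box, as a spatial join does
one_to_many = [any(box_contains(b, p) for b in boxes) for p in points]

print(pairwise)     # [False, False]
print(one_to_many)  # [True, True, False]
```

In the pairwise case the first two points are reported as outside even though each lies in the other row's box, which mirrors the all-False result in the question. The question's call also asks whether each point contains a polygon, which can never be true.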

Choropleth map is not showing color variation in the output

I am not getting any color variation in my output map, even though the geo_data and data parameters of the choropleth method are set to the GeoJSON file and the data frame respectively.
I have set the "key_on" and "columns" parameters correctly, and I have removed all the NULL values from the data frame.
import pandas as pd
from pandas import read_csv
import folium
import os
import webbrowser
crimes = read_csv('Dataframe.csv',error_bad_lines=False)
vis = os.path.join('Community_Areas.geojson')
m = folium.Map(location = [41.878113, -87.629799], zoom_start = 10, tiles = "cartodbpositron")
m.choropleth(geo_data=vis, data = crimes, columns = ['Community Area', 'count'], fill_color = 'YlGn', key_on = 'feature.properties.area_numbe')
folium.LayerControl().add_to(m)
m.save('map.html')
webbrowser.open(filepath)
I expected a colored choropleth map, but the actual output was completely grey. I will add the code, data, output in the link below.
Code link: https://github.com/rahul0070/Stackoverflow_question_data
Data
,Community Area,count
0,25.0,92679
1,8.0,48751
2,43.0,47731
3,23.0,45943
4,29.0,44819
5,28.0,42243
6,71.0,40624
7,67.0,40157
8,24.0,39680
9,32.0,38513
10,49.0,37227
11,68.0,37023
12,69.0,35874
13,66.0,33877
14,44.0,32256
15,6.0,31043
16,26.0,30565
17,27.0,28113
18,61.0,27362
19,22.0,27329
20,46.0,26897
21,19.0,26198
22,30.0,24362
23,53.0,22645
24,42.0,21284
25,7.0,21047
26,1.0,19932
27,3.0,19799
28,15.0,17820
29,38.0,17660
30,2.0,17213
31,73.0,17071
32,16.0,15926
33,40.0,14943
34,58.0,14143
35,31.0,13934
36,63.0,13203
37,70.0,12980
38,35.0,12965
39,14.0,12714
40,77.0,12612
41,21.0,12587
42,75.0,11156
43,65.0,10812
44,51.0,10348
45,56.0,10176
46,4.0,9984
47,33.0,8987
48,60.0,8982
49,76.0,8938
50,20.0,8922
51,17.0,8536
52,41.0,8076
53,48.0,7823
54,5.0,7761
55,45.0,7485
56,39.0,7466
57,52.0,7150
58,54.0,6777
59,10.0,6372
60,11.0,6072
61,34.0,6053
62,62.0,5736
63,50.0,5730
64,59.0,5695
65,57.0,5244
66,64.0,5194
67,72.0,4962
68,37.0,4908
69,13.0,4570
70,36.0,3359
71,74.0,3145
72,55.0,3109
73,18.0,3109
74,12.0,2454
75,47.0,2144
76,9.0,1386
After reading the data, I found that you did not convert the Community Area column to string type. The GeoJSON file stores the key as a string, so the data type of the key and of the column must match.
Here is a full solution to your question:
# importing libraries
import pandas as pd
from pandas import read_csv
import folium
import os
import webbrowser
# read the data
crimes = read_csv('Dataframe.csv',error_bad_lines=False)
# convert float to int then to string
crimes['Community Area'] = crimes['Community Area'].astype('int').astype('str')
# choropleth map
vis = 'Community_Areas.geojson'
m = folium.Map(location = [41.878113, -87.629799], zoom_start = 10, tiles = "cartodbpositron")
m.choropleth(geo_data=vis, data = crimes, columns = ['Community Area', 'count'], fill_color = 'YlGn', key_on = 'feature.properties.area_num_1')
folium.LayerControl().add_to(m)
m.save('map.html')
webbrowser.open('map.html')
Please comment if you have difficulty understanding any part of the code. For the key_on parameter you can also try feature.properties.area_numbe.
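The type mismatch described above can be reproduced with plain Python values. In this sketch the GeoJSON keys are assumed to be integer-like strings such as "25" (the exact form in Community_Areas.geojson is an assumption), while the CSV column holds floats such as 25.0.

```python
# Keys as they are assumed to appear in the GeoJSON properties (strings)
geojson_keys = {"25", "8", "43"}

# Values as read from the CSV column 'Community Area' (floats)
community_areas = [25.0, 8.0, 43.0]

# Converting floats straight to strings keeps the trailing '.0'
as_read = [str(v) for v in community_areas]         # '25.0', '8.0', '43.0'
# Going through int first gives strings that can match the GeoJSON keys
converted = [str(int(v)) for v in community_areas]  # '25', '8', '43'

matches_before = sum(k in geojson_keys for k in as_read)
matches_after = sum(k in geojson_keys for k in converted)
print(matches_before, matches_after)  # 0 3
```

This is the same conversion as astype('int').astype('str') in the solution: without it no row finds its polygon, and every area is drawn in the same grey.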

Add Points to Geopandas Object

My objective is to create some kind of GeoJSON object and add several Point objects to it in a for loop.
What am I missing here?
from geojson import Feature
import pandas as pd
import geopandas as gpd
# Point((-115.81, 37.24))
# Create a Dataframe with **Schools Centroids**
myManipulationObj = pd.DataFrame
for schoolNumber in listOfResults:
    myManipulationObj.append(centroids[schoolNumber])
# GDF should be a Beautiful collection (geoDataFrame) of Points
gdf = gpd.GeoDataFrame(myManipulationObj, geometry='Coordinates')
After that, I want to use geopandas write() to create a .geojson file.
Any help?
(solved)
I solved the problem by:
creating a Python list (listOfPoints),
using each Point object as the geometry parameter of a Feature object,
using the list of Features (with Points) to create a FeatureCollection.
I'm leaving this here for future reference in case someone needs it :D
from geojson import Feature, FeatureCollection
# Used to get the Index of Schools from the M Model Optimized
listOfResults = []
for e in range(numSchools):
    tempObj = m.getVarByName(str(e))
    # If This School is on the Results Optimized
    if tempObj.x != 0:
        listOfResults.append(int(tempObj.varName))
# Select, from the List Of Results, A set of Centroid Points
listOfPoints = []
for schoolNumber in listOfResults:
    # Attention to the Feature(geometry) from geopandas
    listOfPoints.append(Feature(geometry=centroids[schoolNumber]))
# Creating a FeatureCollection with the Features (Points) manipulated above
resultCentroids = FeatureCollection(listOfPoints)
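The remaining step from the question, writing the collection to a .geojson file, might look roughly like the sketch below. It uses only the standard json module and plain dictionaries instead of the geojson package; the centroids mapping, the selected indices, and the output file name are hypothetical placeholders.

```python
import json
import os
import tempfile

# Hypothetical centroids keyed by school index, as (lon, lat) pairs
centroids = {0: (-115.81, 37.24), 1: (-115.60, 37.10), 2: (-115.95, 37.40)}
list_of_results = [0, 2]  # indices assumed to be selected by the optimisation

features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": list(centroids[i])},
        "properties": {"school": i},
    }
    for i in list_of_results
]
collection = {"type": "FeatureCollection", "features": features}

# Write to a temporary .geojson file and read it back to check the round trip
out_path = os.path.join(tempfile.mkdtemp(), "result_centroids.geojson")
with open(out_path, "w") as fp:
    json.dump(collection, fp)
with open(out_path) as fp:
    loaded = json.load(fp)
print(len(loaded["features"]))  # 2
```

If the points are already in a GeoDataFrame, gdf.to_file(path, driver="GeoJSON") is the GeoPandas way to produce the same kind of file.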
