Flipping longitude and latitude coordinates in GeoPandas - python

I'm working with datasets where latitudes and longitudes are sometimes mislabeled, and I need to swap them. The best solution I could come up with is to extract the x and y coordinates using gdf.geometry.x and gdf.geometry.y, create a new geometry column, and reconstruct the GeoDataFrame using the new geometry column. In code form:
import geopandas as gpd
from shapely.geometry import Point
# Build (y, x) tuples so that each coordinate pair is swapped
gdf['coordinates'] = list(zip(gdf.geometry.y, gdf.geometry.x))
gdf['coordinates'] = gdf['coordinates'].apply(Point)
gdf = gpd.GeoDataFrame(gdf, geometry='coordinates', crs=4326)
This is pretty ugly, requires creating a new column, and isn't efficient for large datasets. Is there an easier way to flip the longitude and latitude coordinates of a GeoSeries/GeoDataFrame?

You can create the geometry column directly:
df['geometry'] = df.apply(lambda row: Point(row['y'], row['x']), axis=1)
df = gpd.GeoDataFrame(df, crs=4326)

The following works for both Point and Polygon geometries:
import shapely.ops
gpd.GeoSeries(gdf['coordinates']).map(lambda geom: shapely.ops.transform(lambda x, y: (y, x), geom))
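As a self-contained illustration of the same idea, the sketch below wraps the swap in a small helper and applies it to a Point and a Polygon. The name flip_xy and the sample coordinates are hypothetical, and the helper assumes 2D geometries (a geometry with a z coordinate would need a three-argument function).

```python
from shapely.geometry import Point, Polygon
from shapely.ops import transform


def flip_xy(geom):
    """Return a copy of a 2D geometry with x and y swapped (hypothetical helper)."""
    return transform(lambda x, y: (y, x), geom)


# A point stored as (lat, lon) by mistake
point = Point(1.35, 103.85)
flipped_point = flip_xy(point)

# A triangle whose vertices were also stored as (lat, lon)
polygon = Polygon([(1.0, 103.0), (1.0, 104.0), (2.0, 104.0)])
flipped_polygon = flip_xy(polygon)
```

On a GeoDataFrame, the same helper could probably be applied with something like gdf['geometry'] = gdf.geometry.map(flip_xy), assuming the geometry column is named geometry.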

Related

How to access/get/extract the values of a POLYGON in a geometry column of a GeoPandas DataFrame?

How can I get the coordinate values of a POLYGON in a GeoPandas DataFrame?
shapefile = gpd.read_file("CAMPOS_PRODUCAO_SIRGASPolygon.shp")
I can easily do it for the centroid of my POLYGON with:
print(shapefile.centroid.iloc[0].x)
-39.853276865819765
print(type(shapefile.centroid.iloc[0].x))
<class 'float'>
I want a list or NumPy array with all the lat and lon point values contained in the POLYGON. So how can I convert a POLYGON to a NumPy array?
If anyone else has this problem, here is one solution that works for me:
def coord_lister(geom):
    coords = list(geom.exterior.coords)
    return coords
coordinates_list = your_geopandas_df.geometry.apply(coord_lister)
Update:
For example, this gives you the coordinates of the polygon for the first entry in the shapefile:
list(shapefile["geometry"][0].exterior.coords)
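Since the question asks for a NumPy array, here is a minimal sketch of that last step. The square polygon is made up for illustration; with real data the polygon would come from the geometry column.

```python
import numpy as np
from shapely.geometry import Polygon

# A small square used only for illustration
square = Polygon([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])

# exterior.coords yields (x, y) tuples; the ring is closed, so the
# first vertex is repeated at the end
coords = np.array(list(square.exterior.coords))

xs = coords[:, 0]  # x values (longitude for lon/lat data)
ys = coords[:, 1]  # y values (latitude for lon/lat data)
```

Note that exterior only covers the outer ring. Holes are available through the interiors attribute, and a MultiPolygon has no exterior of its own, so its parts would need to be handled one by one via .geoms.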

Convert Column to Polygon in Python to perform Point in Polygon

I have written code to perform a point-in-polygon test in Python. The program uses a shapefile that I read in as the polygons.
I now have a dataframe with a column containing the polygon coordinates, e.g. [[28.050815,-26.242253],[28.050085,-26.25938],[28.011934,-26.25888],[28.020216,-26.230127],[28.049828,-26.230704],[28.050815,-26.242253]].
I want to transform this column into a polygon in order to perform the point-in-polygon test, but all the examples use geometry = [Point(xy) for xy in zip(dataPoints['Long'], dataPoints['Lat'])], whereas my coordinates are already zipped.
How would I go about achieving this?
Thanks
Taking your example above, you could do the following:
list_coords = [[28.050815,-26.242253],[28.050085,-26.25938],[28.011934,-26.25888],[28.020216,-26.230127],[28.049828,-26.230704],[28.050815,-26.242253]]
from shapely.geometry import Point, Polygon
# Create a list of point objects using list comprehension
point_list = [Point(x,y) for [x,y] in list_coords]
# Create a polygon object from the list of Point objects
polygon_feature = Polygon([[poly.x, poly.y] for poly in point_list])
And if you would like to apply it to a dataframe you could do the following:
import pandas as pd
import geopandas as gpd
df = pd.DataFrame({'coords': [list_coords]})
def get_polygon(list_coords):
    point_list = [Point(x,y) for [x,y] in list_coords]
    polygon_feature = Polygon([[poly.x, poly.y] for poly in point_list])
    return polygon_feature
df['geom'] = df['coords'].apply(get_polygon)
However, there might be geopandas built-in functions in order to avoid "reinventing the wheel", so let's see if anyone else has a suggestion :)
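One possible shortcut, offered as a sketch and not as the definitive approach: Shapely's Polygon constructor accepts a sequence of coordinate pairs directly, so the intermediate Point objects may not be needed. The two test points below are made up for illustration.

```python
from shapely.geometry import Point, Polygon

list_coords = [[28.050815, -26.242253], [28.050085, -26.25938],
               [28.011934, -26.25888], [28.020216, -26.230127],
               [28.049828, -26.230704], [28.050815, -26.242253]]

# Build the polygon straight from the list of [x, y] pairs
polygon = Polygon(list_coords)

# Point-in-polygon checks with two illustrative points
inside = polygon.contains(Point(28.035, -26.245))   # roughly in the middle
outside = polygon.contains(Point(0.0, 0.0))         # far away
```

With a dataframe, the same constructor could be applied per row, e.g. df['geom'] = df['coords'].apply(Polygon), assuming every entry in coords is a list of pairs like the one above.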

Define points within a polygon

I have a list of customer latitudes and longitudes, and I want to determine which customers are within a given polygon.
However, my results say that none of them are in the polygon, which is not correct.
Could you please help? Thanks!
from shapely.geometry import Polygon
from shapely.geometry import Point
import pandas as pd
import geopandas as gpd
df=pd.read_csv("C:\\Users\\n.nguyen.2\\Documents\\order from May 1.csv")
geometry=[Point(xy) for xy in zip(df['customer_lat'],df['customer_lng'])]
crs={'init':'epsg:4326'}
gdf=gpd.GeoDataFrame(df,crs=crs,geometry=geometry)
gdf.head()
polygon= Polygon ([(103.85362669999994, 1.4090082), (103.8477709, 1.4051988), (103.84821190000002, 1.4029509), (103.84933950000004, 1.4012179), (103.85182859999998, 1.4001453), (103.85393150000004, 1.3986867), (103.85745050000001, 1.3962412), (103.85809410000002, 1.3925516), (103.85843750000004, 1.3901491), (103.8583946, 1.3870601), (103.8585663, 1.3838853), (103.8582659, 1.3812682), (103.85822299999997, 1.3792946), (103.85843750000004, 1.3777931), (103.85882370000002, 1.3748757), (103.86015410000005, 1.3719582), (103.8607978, 1.3700276), (103.86092659999998, 1.368097), (103.86036880000006, 1.3657372), (103.8593174, 1.3633562), (103.85852339999995, 1.3607605), (103.85745050000001, 1.3581005), (103.8571071, 1.355655), (103.85736459999998, 1.3520941), (103.85873790000007, 1.3483615), (103.86187100000006, 1.3456583), (103.86488409999993, 1.340689), (103.87096889999998, 1.3378933), (103.87519599999996, 1.3373354), (103.88178349999998, 1.3408963), (103.88508790000004, 1.3433418), (103.89186870000005, 1.3436426), (103.89742610000008, 1.342355), (103.91813279999997, 1.3805388), (103.91824964404806, 1.3813377489306), (103.91433759243228, 1.38607494841128), (103.91607279999994, 1.3895484), (103.91942029999996, 1.3940104), (103.92903330000001, 1.4009604), (103.9342689, 1.402076), (103.93289559999994, 1.4075675), (103.92534249999994, 1.4146035), (103.92517090000003, 1.4211246), (103.90972139999997, 1.4238704), (103.89942169999993, 1.4202666), (103.89744760000008, 1.4224117), (103.89315599999998, 1.425758), (103.88740540000003, 1.4285896), (103.88148309999995, 1.4328798), (103.87478829999998, 1.4331372), (103.85918850000007, 1.4249644), (103.85401679999995, 1.4114284), (103.85362669999994, 1.4090082)])
gdf['answer']=gdf['geometry'].within(polygon)
writer = pd.ExcelWriter("C:\\Users\\n.nguyen.2\\Documents\\order may define1.xlsx")
gdf.to_excel(writer, 'Sheet1', index=False)
writer.save()
The results are all false.
Adding my comments as an answer for future reference.
You have switched longitude and latitude in the order of coordinates. Compare the coordinates of your polygon with those of your points: the points you generated are (lat, lon), while your polygon is (lon, lat), so the points are not within the polygon. Use
geometry=[Point(xy) for xy in zip(df['customer_lng'],df['customer_lat'])]
instead and it will work.
To make your life easier, GeoPandas has a helper function for creating points from coordinate columns, points_from_xy() (http://geopandas.org/gallery/create_geopandas_from_pandas.html?highlight=points_from_xy).
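A minimal sketch of how points_from_xy() could be used for this check. The two customer rows and the rectangular polygon are made-up sample data, and the column names follow the question.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Made-up sample data using the question's column names
df = pd.DataFrame({
    "customer_lat": [1.35, 1.35],
    "customer_lng": [103.85, 104.50],
})

# x is longitude, y is latitude
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["customer_lng"], df["customer_lat"]),
    crs="EPSG:4326",
)

# A simple rectangle in (lon, lat) order, standing in for the real polygon
area = Polygon([(103.8, 1.3), (103.9, 1.3), (103.9, 1.4), (103.8, 1.4)])

gdf["answer"] = gdf.geometry.within(area)
```

Passing longitude first keeps the points in the same (x, y) order as the polygon, which is the mismatch that caused the all-False result.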

using python to project lat lon geometry to utm

I have a dataframe with earthquake data called eq that has columns listing latitude and longitude. Using GeoPandas, I created a point column with the following:
from geopandas import GeoSeries, GeoDataFrame
from shapely.geometry import Point
s = GeoSeries([Point(x,y) for x, y in zip(eq['longitude'], eq['latitude'])])
eq['geometry'] = s
eq.crs = {'init': 'epsg:4326', 'no_defs': True}
eq
Now I have a geometry column with lat lon coordinates but I want to change the projection to UTM. Can anyone help with the transformation?
Latitude/longitude aren't really a projection, but sort of a default "unprojection". See this page for more details, but it probably means your data uses WGS84 or epsg:4326.
Let's build a dataset and, before we do any reprojection, we'll define the crs as epsg:4326
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point
df = pd.DataFrame({'id': [1, 2, 3], 'population' : [2, 3, 10], 'longitude': [-80.2, -80.11, -81.0], 'latitude': [11.1, 11.1345, 11.2]})
s = gpd.GeoSeries([Point(x,y) for x, y in zip(df['longitude'], df['latitude'])])
geo_df = gpd.GeoDataFrame(df[['id', 'population']], geometry=s)
# Define crs for our geodataframe:
geo_df.crs = {'init': 'epsg:4326'}
I'm not sure what you mean by "UTM projection". According to the Wikipedia page, there are 60 UTM zones, so the right projection depends on the area of the world. You can find the appropriate EPSG code online; for example, EPSG:32633 is UTM zone 33N.
How do you do the reprojection? You can easily get this info from the GeoPandas docs on projections. It's just one line (note that EPSG:3395, used here, is World Mercator and not a UTM zone; substitute the UTM code for your area):
geo_df = geo_df.to_crs({'init': 'epsg:3395'})
and the geometry isn't coded as latitude/longitude anymore:
   id  population                                      geometry
0   1           2   POINT (-8927823.161620541 1235228.11420853)
1   2           3   POINT (-8917804.407449147 1239116.84994171)
2   3          10  POINT (-9016878.754255159 1246501.097746004)
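To pick a UTM code for a given location, the usual zone arithmetic can be sketched as follows. The helper name utm_epsg_from_lonlat is hypothetical, and the sketch ignores the special zone exceptions around Norway and Svalbard.

```python
def utm_epsg_from_lonlat(lon, lat):
    """Return the EPSG code of the WGS84 UTM zone containing (lon, lat).

    Hypothetical helper: zones are 6 degrees wide, numbered 1 to 60
    starting at longitude -180. EPSG codes are 326xx for the northern
    hemisphere and 327xx for the southern hemisphere.
    """
    zone = min(int((lon + 180) // 6) + 1, 60)
    base = 32600 if lat >= 0 else 32700
    return base + zone


# First point of the example dataset above
epsg = utm_epsg_from_lonlat(-80.2, 11.1)
```

The result could then be passed to geo_df.to_crs(epsg=epsg). Recent GeoPandas versions also provide estimate_utm_crs(), which may be a simpler option if it is available in your version.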

Coordinates from UTM to Latitude and Longitude in pandas

I have a DataFrame with coordinate columns COORDENADA_X and COORDENADA_Y, and I want to convert them from UTM (WGS84, EPSG:32615) to longitude and latitude and then add the results as new columns in my DataFrame.
For the conversion I am using the following code, but I think there should be a better way that avoids converting the coordinate columns to lists and creating a new DataFrame.
import pyproj as pp
from mpl_toolkits.basemap import Basemap
import pandas as pd
cx =dfb.COORDENADA_X.tolist()
cy =dfb.COORDENADA_Y.tolist()
utm15_wgs84 = pp.Proj(init='epsg:32615')
for ix, iy in zip(cx, cy):
    lon, lat = utm15_wgs84(ix, iy, inverse=True)
    print(lon, lat)
Any suggestion for doing this?
Use the apply method of the pandas DataFrame. For example:
dfb[['wgs_x', 'wgs_y']] = dfb.apply(lambda row:utm15_wgs84(row['COORDENADA_X'], row['COORDENADA_Y'], inverse=True), axis=1).apply(pd.Series)
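As an alternative to the row-wise apply, pyproj's Transformer can convert whole arrays in one call, which is likely faster on large frames. This is a sketch with made-up sample coordinates; the output column names lon and lat are assumptions, not from the question.

```python
import pandas as pd
from pyproj import Transformer

# Made-up UTM zone 15N coordinates using the question's column names
dfb = pd.DataFrame({
    "COORDENADA_X": [500000.0, 510000.0],
    "COORDENADA_Y": [2000000.0, 2010000.0],
})

# always_xy=True keeps the (x, y) = (lon, lat) axis order on output
transformer = Transformer.from_crs("EPSG:32615", "EPSG:4326", always_xy=True)

lon, lat = transformer.transform(
    dfb["COORDENADA_X"].to_numpy(),
    dfb["COORDENADA_Y"].to_numpy(),
)
dfb["lon"] = lon
dfb["lat"] = lat
```

The Proj(init='epsg:32615') form used in the question is deprecated in recent pyproj versions, which is one more reason to consider Transformer.from_crs here.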
