How to resolve tick problems in Geopandas overlay? - python

I am trying to overlay a polygon and lines in Geopandas, but I am getting tick plot problems.
ValueError: cannot convert float NaN to integer
import geopandas as gpd
from geopandas.tools import overlay
zip1 = "zip://data/mmcovidshp.zip"
mmcovid = gpd.read_file(zip1)
zip2 = "zip://data/roads_MM.zip"
mmroads = gpd.read_file(zip2)
overlay_intersection = overlay(mmcovid, mmroads,
how='intersection')
overlay_intersection.plot(figsize=(6, 8))
Data: https://drive.google.com/drive/folders/1Xxo1Ep6Dgau5ThmNetuqzehpSh9sgpfP?usp=sharing

It is not clear what you are trying to do.
overlay_intersection is empty because overlay tries to preserve the geometry type of the left GeoDataFrame. The left GeoDataFrame contains polygons, and the intersection of a polygon and a linestring is a linestring, so every result is dropped and the output is empty. Plotting that empty GeoDataFrame is presumably what raises the error. You can control this behaviour with the keep_geom_type keyword; keep_geom_type=False returns everything.
The simple solution here is to swap the order of the two GeoDataFrames.
overlay_intersection = overlay(mmroads, mmcovid,
how='intersection')
That produces a non-empty GeoDataFrame. See more at https://geopandas.readthedocs.io/en/latest/docs/user_guide/set_operations.html?highlight=overlay.
If you are trying to simply clip mmroads to mmcovid's shape, use geopandas.clip. https://geopandas.readthedocs.io/en/latest/gallery/plot_clip.html
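To see the effect of keep_geom_type and of swapping the argument order, here is a minimal sketch on toy data. The names zones and roads are made-up stand-ins for mmcovid and mmroads, and it assumes a geopandas version recent enough to support the keep_geom_type keyword.

```python
import geopandas as gpd
from shapely.geometry import LineString, Polygon

# Toy stand-ins for mmcovid (polygons) and mmroads (lines); not the real data.
zones = gpd.GeoDataFrame(
    {"zone": ["a"]},
    geometry=[Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])],
)
roads = gpd.GeoDataFrame(
    {"road": ["r1"]},
    geometry=[LineString([(-1, 1), (3, 1)])],
)

# Polygons on the left: the line-shaped result is filtered out.
dropped = gpd.overlay(zones, roads, how="intersection", keep_geom_type=True)

# Same call, but keep whatever geometry type the intersection produces.
kept = gpd.overlay(zones, roads, how="intersection", keep_geom_type=False)

# Lines on the left: the result type already matches, so nothing is dropped.
swapped = gpd.overlay(roads, zones, how="intersection")

print(len(dropped), len(kept), len(swapped))
```

Checking overlay_intersection.empty before calling plot() is a cheap way to catch this situation early.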

Related

splitting circle into two using LineString Python Shapely

I have created a circle using Shapely and would like to split it into two using LineString.
I created the circle as follows
from functools import partial
import pyproj
from shapely import geometry
from shapely.geometry import Point, Polygon, shape, MultiPoint, LineString, mapping
from shapely.ops import transform, split
radius = 92600
lon = 54.08
lat = 17.05
local_azimuthal_projection = "+proj=aeqd +R=6371000 +units=m +lat_0={} +lon_0={}".format(
lat, lon
)
wgs84_to_aeqd = partial(
pyproj.transform,
pyproj.Proj("+proj=longlat +datum=WGS84 +no_defs"),
pyproj.Proj(local_azimuthal_projection),
)
aeqd_to_wgs84 = partial(
pyproj.transform,
pyproj.Proj(local_azimuthal_projection),
pyproj.Proj("+proj=longlat +datum=WGS84 +no_defs"),
)
center = Point(float(lon), float(lat))
point_transformed = transform(wgs84_to_aeqd, center)
buffer = point_transformed.buffer(radius)
# Get the polygon with lat lon coordinates
circle_poly = transform(aeqd_to_wgs84, buffer)
For the line Splitter I have the following:
splitter = LineString([Point(54.79,16.90), Point(53.56,16.65)])
Now I want to see the two split shapes so I used split function.
result = split(circle_poly, splitter)
However, this only returns the same circle, not two shapes.
At the end I would like to use one section of the split to form another shape.
To split a circle or any other polygon, you can use a spatial difference operation with another polygon. The difference operation cannot do this with a line:
"Shapely can not represent the difference between an object and a lower dimensional object (such as the difference between a polygon and a line or point) as a single object."
See document:
In your case, you can build two polygons that share the line as a common edge.
Make sure that the two polygons together are big enough to cover the whole polygon you are splitting, then use those polygons to do the job.
If crude results from the difference operation are acceptable, you can turn the line into a slim polygon with a buffer operation and use that instead.
For the second approach, here is the relevant code:
the_line = LineString([(54.95105, 17.048144), (53.40473921, 17.577181)])
buff_line = the_line.buffer(0.000001) #is polygon
# the `difference` operation between 2 polygons
multi_pgon = circle_poly.difference(buff_line)
The result is a multi-polygon geometry object.
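The first approach (two polygons sharing the line as a common edge) could look roughly like the sketch below. It uses a unit circle and a horizontal cutting line in planar coordinates purely for illustration; upper and lower are hypothetical cover polygons, not derived from the data in the question.

```python
from shapely.geometry import Point, Polygon

# Stand-in circle in planar units (not the projected circle from the question).
circle = Point(0, 0).buffer(1.0)

# The cutting line runs along y = 0. Build two large polygons that share it
# as a common edge and together cover the whole circle.
upper = Polygon([(-2, 0), (2, 0), (2, 2), (-2, 2)])
lower = Polygon([(-2, 0), (2, 0), (2, -2), (-2, -2)])

# Removing one cover polygon leaves the part of the circle on the other side.
top_half = circle.difference(lower)
bottom_half = circle.difference(upper)

print(top_half.geom_type, bottom_half.geom_type)
print(round(top_half.area, 4), round(bottom_half.area, 4))
```

Unlike the buffer trick, this leaves no thin gap between the two pieces, at the cost of having to construct the cover polygons from the line yourself.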

Matching Geopandas Dissolve with ArcGIS Dissolve on set of Polylines

I am trying to replicate the output from ArcGIS Dissolve on a set of stream flow lines using geopandas. Essentially the df/stream_0 layer is a stream network extracted from a DEM using pysheds. That output has some randomly overlapping reaches which I am trying to remove. Running Dissolve through ArcGIS Pro does this well, but I would prefer not to have to deal with ArcGIS/ArcPy to resolve this.
Stream Network
ArcGIS Dissolve Setting
#streams_0.geojson = df.shp = streams_0.shp from Dissolve Setting image
#~~~~~~~~~~~~~~~~~~~~
import geopandas as gpd
df = gpd.read_file('streams_0.geojson')
df.head()
Out[3]:
geometry
0 LINESTRING (400017.781 3000019.250, 400017.781...
1 LINESTRING (400027.781 3000039.250, 400027.781...
2 LINESTRING (400027.781 3000039.250, 400037.781...
3 LINESTRING (400027.781 3000029.250, 400037.781...
4 LINESTRING (400047.781 3000079.250, 400047.781...
I have tried using gpd.dissolve() using a filler column with no luck.
df['dissolvefield'] = 1;
df2 = df.dissolve(by='dissolvefield')
df3 = gpd.geoseries.GeoSeries([geom for geom in df2.geometry.iloc[0].geoms])
Similarly tried to use unary_union in shapely with no luck.
import fiona
shape1 = fiona.open("df.shp")
first = shape1.next()
from shapely.geometry import shape
shp_geom = shape(first['geometry'])
from shapely.ops import unary_union
shape2 = unary_union(shp_geom)
This seems like it should have an easy solution, so I am wondering why I am running into so many issues. My GeoDataFrame only consists of the line geometry, so there is not necessarily another attribute I can aggregate on. I am essentially just trying to keep the geometry of the lines unchanged, but remove any overlapping features that may be there. I don't want to split the lines, and I don't want to aggregate them into multipart features.
I use unary_union, but there is no need to read the data in as a Shapely feature first.
After reading the file into a GeoDataFrame (you can do this straight from the *.shp file):
df = gpd.read_file('streams_0.geojson')
Try plotting it to check that the output is correct:
df.plot()
Then use unary_union like this, and plot again:
shape2 = df.unary_union
shape2
The last step (if necessary) is to convert the result back to a GeoDataFrame:
# split the merged MultiLineString into its individual LineString parts
segments = list(shape2.geoms)
# set back as geopandas, reusing the CRS of the original GeoDataFrame
gdf = gpd.GeoDataFrame({'index': list(range(len(segments)))},
geometry=segments, crs=df.crs)
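A small sketch of what unary_union does to overlapping lines, on two made-up segments rather than the stream network. Note that the union also splits lines wherever they meet; the linemerge call afterwards is my own addition, not part of the answer above, and is only an assumption about how the pieces might be stitched back together.

```python
import geopandas as gpd
from shapely.geometry import LineString
from shapely.ops import linemerge, unary_union

# Two made-up reaches that overlap between x = 1 and x = 2.
df = gpd.GeoDataFrame(
    geometry=[LineString([(0, 0), (2, 0)]), LineString([(1, 0), (3, 0)])]
)

# The union removes the duplicated stretch but cuts the lines at shared points.
noded = unary_union(list(df.geometry))

# Optionally stitch touching pieces back into longer lines.
merged = linemerge(noded)

# linemerge may return a single LineString or a MultiLineString.
parts = list(merged.geoms) if hasattr(merged, "geoms") else [merged]
gdf = gpd.GeoDataFrame(geometry=parts, crs=df.crs)

print(df.length.sum(), noded.length, gdf.length.sum())
```

Comparing the total length before and after is a quick check that the overlapping stretch was removed exactly once.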

Convert Column to Polygon in Python to perform Point in Polygon

I have written code to perform a point-in-polygon test in Python; the program uses a shapefile that I read in as the polygons.
I now have a dataframe I read in with a column containing the Polygon e.g [[28.050815,-26.242253],[28.050085,-26.25938],[28.011934,-26.25888],[28.020216,-26.230127],[28.049828,-26.230704],[28.050815,-26.242253]].
I want to transform this column into polygons in order to perform the point-in-polygon test, but all the examples use geometry = [Point(xy) for xy in zip(dataPoints['Long'], dataPoints['Lat'])], and my coordinates are already paired.
How would I go about achieving this?
Thanks
Taking your example above, you could do the following:
list_coords = [[28.050815,-26.242253],[28.050085,-26.25938],[28.011934,-26.25888],[28.020216,-26.230127],[28.049828,-26.230704],[28.050815,-26.242253]]
from shapely.geometry import Point, Polygon
# Create a list of point objects using list comprehension
point_list = [Point(x,y) for [x,y] in list_coords]
# Create a polygon object from the list of Point objects
polygon_feature = Polygon([[poly.x, poly.y] for poly in point_list])
And if you would like to apply it to a dataframe you could do the following:
import pandas as pd
import geopandas as gpd
df = pd.DataFrame({'coords': [list_coords]})
def get_polygon(list_coords):
    point_list = [Point(x,y) for [x,y] in list_coords]
    polygon_feature = Polygon([[poly.x, poly.y] for poly in point_list])
    return polygon_feature
df['geom'] = df['coords'].apply(get_polygon)
However, there might be geopandas built-in functions in order to avoid "reinventing the wheel", so let's see if anyone else has a suggestion :)
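On the built-in route: as far as I know, Shapely's Polygon accepts a list of [x, y] pairs directly, so the intermediate Point objects can be skipped. A minimal sketch, assuming the column holds real Python lists (not strings) and using a made-up test point:

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, Polygon

list_coords = [[28.050815, -26.242253], [28.050085, -26.25938],
               [28.011934, -26.25888], [28.020216, -26.230127],
               [28.049828, -26.230704], [28.050815, -26.242253]]

df = pd.DataFrame({"coords": [list_coords]})

# Build one Polygon per row straight from the coordinate pairs.
polygons = [Polygon(coords) for coords in df["coords"]]
gdf = gpd.GeoDataFrame(df, geometry=polygons)

# Hypothetical point roughly in the middle of the polygon.
test_point = Point(28.035, -26.245)
print(gdf.contains(test_point))
```

If the column was read from a CSV, the lists may arrive as strings and would need parsing (for example with json.loads) before this works.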

Define points within a polygon

I have a list of customers' latitudes and longitudes and I want to determine which ones are within a given polygon.
But the result says that none of them are in that polygon, which is not correct.
Could you please help? Thanks!
from shapely.geometry import Polygon
from shapely.geometry import Point
import pandas as pd
import geopandas as gpd
df=pd.read_csv("C:\\Users\\n.nguyen.2\\Documents\\order from May 1.csv")
geometry=[Point(xy) for xy in zip(df['customer_lat'],df['customer_lng'])]
crs={'init':'epsg:4326'}
gdf=gpd.GeoDataFrame(df,crs=crs,geometry=geometry)
gdf.head()
polygon= Polygon ([(103.85362669999994, 1.4090082), (103.8477709, 1.4051988), (103.84821190000002, 1.4029509), (103.84933950000004, 1.4012179), (103.85182859999998, 1.4001453), (103.85393150000004, 1.3986867), (103.85745050000001, 1.3962412), (103.85809410000002, 1.3925516), (103.85843750000004, 1.3901491), (103.8583946, 1.3870601), (103.8585663, 1.3838853), (103.8582659, 1.3812682), (103.85822299999997, 1.3792946), (103.85843750000004, 1.3777931), (103.85882370000002, 1.3748757), (103.86015410000005, 1.3719582), (103.8607978, 1.3700276), (103.86092659999998, 1.368097), (103.86036880000006, 1.3657372), (103.8593174, 1.3633562), (103.85852339999995, 1.3607605), (103.85745050000001, 1.3581005), (103.8571071, 1.355655), (103.85736459999998, 1.3520941), (103.85873790000007, 1.3483615), (103.86187100000006, 1.3456583), (103.86488409999993, 1.340689), (103.87096889999998, 1.3378933), (103.87519599999996, 1.3373354), (103.88178349999998, 1.3408963), (103.88508790000004, 1.3433418), (103.89186870000005, 1.3436426), (103.89742610000008, 1.342355), (103.91813279999997, 1.3805388), (103.91824964404806, 1.3813377489306), (103.91433759243228, 1.38607494841128), (103.91607279999994, 1.3895484), (103.91942029999996, 1.3940104), (103.92903330000001, 1.4009604), (103.9342689, 1.402076), (103.93289559999994, 1.4075675), (103.92534249999994, 1.4146035), (103.92517090000003, 1.4211246), (103.90972139999997, 1.4238704), (103.89942169999993, 1.4202666), (103.89744760000008, 1.4224117), (103.89315599999998, 1.425758), (103.88740540000003, 1.4285896), (103.88148309999995, 1.4328798), (103.87478829999998, 1.4331372), (103.85918850000007, 1.4249644), (103.85401679999995, 1.4114284), (103.85362669999994, 1.4090082)])
gdf['answer']=gdf['geometry'].within(polygon)
writer = pd.ExcelWriter("C:\\Users\\n.nguyen.2\\Documents\\order may define1.xlsx")
gdf.to_excel(writer, 'Sheet1', index=False)
writer.save()
The results are all false.
Adding my comments as an answer for future reference.
You have switched longitude and latitude in the order of the coordinates. Look at the coordinates of your polygon and those of your points. The points you have generated are (Lat, Lon), while your polygon is (Lon, Lat), so these points are not within this polygon. Do
geometry=[Point(xy) for xy in zip(df['customer_lng'],df['customer_lat'])]
instead and it will work.
To make your life easier, geopandas has a helper function, points_from_xy(), for creating points from x and y coordinate columns (http://geopandas.org/gallery/create_geopandas_from_pandas.html?highlight=points_from_xy)
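A short sketch of points_from_xy with the (Lon, Lat) order, using two made-up customers and a simplified rectangle in place of the real polygon. The column names follow the question; everything else is illustrative.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Two made-up customers; the first should fall inside the rectangle below.
df = pd.DataFrame({
    "customer_lat": [1.39, 1.20],
    "customer_lng": [103.88, 103.70],
})

# x is longitude, y is latitude.
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["customer_lng"], df["customer_lat"]),
    crs="EPSG:4326",
)

# Simplified stand-in for the polygon in the question, also in (Lon, Lat).
area = Polygon([(103.85, 1.34), (103.93, 1.34), (103.93, 1.43), (103.85, 1.43)])
gdf["answer"] = gdf.within(area)

# The swapped order from the question puts every point far outside.
swapped = gpd.GeoSeries(gpd.points_from_xy(df["customer_lat"], df["customer_lng"]))
print(gdf["answer"].tolist(), swapped.within(area).any())
```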

raise ValueError when producing a shape file with geopandas

I have just recently started to work with shapefiles. I have a shapefile in which each object is a polygon. I want to produce a new shapefile in which the geometry of each polygon is replaced by its centroid. Here is my code.
import geopandas as gp
from shapely.wkt import loads as load_wkt
fname = '../data_raw/bg501c_starazagora.shp'
outfile = 'try.shp'
shp = gp.GeoDataFrame.from_file(fname)
centroids = list()
index = list()
df = gp.GeoDataFrame()
for i,r in shp.iterrows():
    index.append(i)
    centroid = load_wkt(str(r['geometry'])).centroid.wkt
    centroids.append(centroid)
df['geometry'] = centroids
df['INDEX'] = index
gp.GeoDataFrame.to_file(df,outfile)
When I run the script I end up with:
ValueError: Geometry column cannot contain mutiple geometry types when writing to file.
I cannot understand what is wrong. Any help?
The issue is that you're populating the geometry field with a string representation of the geometry rather than a shapely geometry object.
No need to convert to wkt. Your loop could instead be:
for i,r in shp.iterrows():
    index.append(i)
    centroid = r['geometry'].centroid
    centroids.append(centroid)
However, there's no need to loop through the geodataframe at all. You could create a new one of shapefile centroids as follows:
df=gp.GeoDataFrame(data=shp, geometry=shp['geometry'].centroid)
df.to_file(outfile)
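For a quick sanity check before writing anything to disk, here is a sketch on two made-up rectangles instead of the real shapefile. It copies the frame and swaps the geometry column, which is one way (an assumption on my part, not the only one) to keep the original attributes next to the centroids.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Made-up polygons standing in for the shapefile contents.
shp = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(4, 0), (6, 0), (6, 4), (4, 4)]),
    ],
)

# Keep the attribute columns and replace each polygon by its centroid.
centroids = shp.copy()
centroids["geometry"] = shp.geometry.centroid

# Every geometry should now be a Point, so a single type goes to the file.
print(centroids.geom_type.unique())
```

Once geom_type shows only Point, centroids.to_file(outfile) should no longer complain about mixed geometry types.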
