How can I extract data points contained in matplotlib Path vertices?

I define the vertices of a polygon and their types as follows:
verts=[(x1,y1),(x2,y2),(x3,y3),(x4,y4),(0,0)]
codes=[Path.MOVETO,Path.LINETO,Path.LINETO,Path.LINETO,Path.CLOSEPOLY]
path=Path(verts,codes)
patch=patches.PathPatch(path)
Now I would like to extract the indices of the data points contained inside the vertices so that I can manipulate them. I tried the following:
datapts = np.column_stack((x_data,y_data))
inds, = Path(verts).contains_points(datapts)
but of course the verts themselves are not data and so this doesn't work. Help is appreciated.

3 minutes after I posted the above I found the answer:
inds = path.contains_points(datapts)
This can be used as follows:
ax.plot(x_data[inds], y_data[inds])
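For reference, here is a minimal self-contained sketch of this approach. The unit-square polygon and the four sample points are made up for illustration. Note that contains_points returns a boolean mask rather than integer indices; np.flatnonzero converts the mask if actual indices are needed.

```python
import numpy as np
from matplotlib.path import Path

# Hypothetical unit-square polygon; the last vertex is a placeholder for CLOSEPOLY
verts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
codes = [Path.MOVETO, Path.LINETO, Path.LINETO, Path.LINETO, Path.CLOSEPOLY]
path = Path(verts, codes)

# Made-up sample data: two points inside the square, two outside
x_data = np.array([0.5, 1.5, 0.25, -0.1])
y_data = np.array([0.5, 0.5, 0.75, 0.2])
datapts = np.column_stack((x_data, y_data))

mask = path.contains_points(datapts)  # boolean array, one entry per point
inds = np.flatnonzero(mask)           # integer indices of the points inside

print(mask)
print(inds)
```

Both mask and inds can be used to index x_data and y_data, so the ax.plot call above works with either.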

Related

Adding icon for node shape using networkx and pyvis (python)

I am new to networkx and pyvis and am making a small network to display the different shapes possible for each node. I managed to use all the shapes except for icons. I searched a lot but couldn't find anything useful, and the available examples did not work with my code. I would appreciate it if anyone could help me figure this out.
Here is my code:
import networkx as nx
import xlrd #used to access the external excel file
import pyvis
from pyvis.network import Network
import pandas as pd
import textwrap
df = pd.read_csv(r"Visualizer\Data\EECS2311\shapes.csv", encoding='cp1252')  # raw string so the backslashes are not read as escapes
G=nx.Graph()
nodes = []
p1 = df['person1']
p2 = df['person2']
p3 = df['person3']
p4 = df['person4']
p5 = df['person5']
p6 = df['person6']
p7 = df['person7']
p8 = df['person8']
p9 = df['person9']
p10 = df['person10']
p11 = df['person11']
p12 = df['person12']
p13 = df['person13']
p14 = df['person14']
data = zip(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14)
for e in data:
    person1 = e[0]
    G.add_node(person1, shape="ellipse")
    person2 = e[1]
    G.add_node(person2, shape="circle")
    person3 = e[2]
    G.add_node(person3, shape="database")
    person4 = e[3]
    G.add_node(person4, shape="box")
    person5 = e[4]
    G.add_node(person5, shape="text")
    person6 = e[5]
    G.add_node(person6, shape="image", image="https://image.shutterstock.com/image-vector/hello-funny-person-simple-cartoon-260nw-1311467669.jpg")
    person7 = e[6]
    G.add_node(person7, shape="circularImage", image="https://image.shutterstock.com/image-vector/hello-funny-person-simple-cartoon-260nw-1311467669.jpg")
    person8 = e[7]
    G.add_node(person8, shape="diamond")
    person9 = e[8]
    G.add_node(person9, shape="dot")
    person10 = e[9]
    G.add_node(person10, shape="star")
    person11 = e[10]
    G.add_node(person11, shape="triangle")
    person12 = e[11]
    G.add_node(person12, shape="triangleDown")
    person13 = e[12]
    G.add_node(person13, shape="square")
    person14 = e[13]
    G.add_node(person14, shape="icon", icon="https://image.shutterstock.com/image-vector/hello-funny-person-simple-cartoon-260nw-1311467669.jpg")
    nodes.append((person1, person2))
    nodes.append((person2, person3))
    nodes.append((person3, person4))
    nodes.append((person4, person5))
    nodes.append((person5, person6))
    nodes.append((person6, person7))
    nodes.append((person7, person8))
    nodes.append((person8, person9))
    nodes.append((person9, person10))
    nodes.append((person10, person11))
    nodes.append((person11, person12))
    nodes.append((person12, person13))
    nodes.append((person13, person14))
options = {
    "layout": {
        "hierarchical": {
            "enabled": True,
            "levelSeparation": 300,
            "nodeSpacing": 165,
            "treeSpacing": 305,
            "direction": "LR"
        }
    },
    "physics": {
        "hierarchicalRepulsion": {
            "centralGravity": 0,
            "nodeDistance": 110,
        },
        "minVelocity": 0.75,
        "solver": "hierarchicalRepulsion"
    }
}
G.add_edges_from(nodes)
G2 = Network(height="800px", width="100%", bgcolor="#222222", font_color="white", select_menu=True, filter_menu=True, directed=True)
G2.from_nx(G)
G2.options = options
neighbor_map = G2.get_adj_list()
for node in G2.nodes:
    node["value"] = len(neighbor_map[node["id"]])
    #to wrap long labels:
    id_string = node["label"]
    width = 20
    wrapped_strings = textwrap.wrap(id_string, width)
    wrapped_id = ""
    for line in wrapped_strings:
        wrapped_id = textwrap.fill(id_string, width)
    node["label"] = wrapped_id
#G2.show_buttons()
G2.show("shapes.html")
and here is my .csv file:
person1,person2,person3,person4,person5,person6,person7,person8,person9,person10,person11,person12,person13,person14
ellipse, circle, database,box,text,image, circularImage,diamond,dot,star,triangle,triangleDown,square,icon
"ellipse shape displays label inside the shape. To use this simply set shape =""ellipse""","circle shape displays label inside the shape. To use this simply set shape =""circle""","database shape displays label inside the shape. To use this simply set shape =""database""","box shape displays label inside the shape. To use this simply set shape =""box""","only displays text. To use this simply set shape =""text""","image displays a image with label outside. To use set shape=""image"", image=""url"". Note: requires link to image","circularImage displays a circular image with label outside. To use set shape="" circularImage"", image=""url"". Note: requires link to image","diamond shape displays label outside the shape. To use this simply set shape =""diamond""","dot shape displays label outside the shape. To use this simply set shape =""dot""","star shape displays label outside the shape. To use this simply set shape =""star""","triangle shape displays label outside the shape. To use this simply set shape =""triangle""","triangleDown shape displays label outside the shape. To use this simply set shape =""triangleDown""","square shape displays label outside the shape. To use this simply set shape =""square""","icon displays a circular image with label outside. To use set shape="" icon"", image=""url"". Note: requires link to image"
ps. forgive the heading for the csv file :)

This doesn't answer your question, I just want to help you shrink your code so you can debug it more easily.
Use the DataFrame directly
You're doing a ton of extra work to get at your data, assigning to temporary variables, then zipping them together. They are already together! To loop over the things in row 0 of the DataFrame try this:
for item in df.loc[0]:
    print(item)
There's also a function in NetworkX, nx.from_pandas_edgelist() (named nx.from_pandas_dataframe() in NetworkX 1.x), that will create a network directly from a DataFrame... but you can only add edge attributes with that, not node attributes.
Then again...
Maybe don't even use a DataFrame
Pandas is a convenient way to load CSVs, but your data isn't all that well-suited to this data structure. A dict would be better. It's a kind of mapping, in your case from node names to a node attribute.
Fortunately, there's a fairly easy way to get a dict from your DataFrame:
df.T.to_dict()[0]
This 'transposes' the DataFrame (turns the rows into columns) then turns the result into a dict. The [0] then selects the entry for row 0 of the original data, which maps each column name (person1, person2, ...) to its shape.
This way you can avoid needing to repeat all your data (the mapping from person number to symbol) in your code.
Then again...
Maybe don't even use a dictionary
Any time you are mapping from a continuous set of numbers to some other objects (like person1, person2, etc) you might as well just use a list. Everything is indexed by position, which is basically what you have already. So you could just store your data like ['ellipse', 'circle', 'dot'] etc.
Then again...
Maybe don't even store the data
It turns out all these symbols are already defined in matplotlib. Have a look at:
from matplotlib.lines import Line2D
Line2D.markers
It's a dictionary of all the markers! If you want to try all of them, then you can just use these, no need to define anything.
Use zip to add your edges
zip is great for combining two or more lists, or combining a list with itself but with some offset. You can step over the nodes and make edges like so:
nodes = list(G.nodes)
for u, v in zip(nodes, nodes[1:]):
    G.add_edge(u, v)
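To see which pairs that loop produces, here is a small sketch that uses a plain list in place of list(G.nodes). The node names are made up for illustration.

```python
# Hypothetical node names standing in for list(G.nodes)
nodes = ["ellipse", "circle", "database", "box"]

# Pair every node with its successor. zip stops at the shorter sequence,
# so the last node does not start a pair of its own.
edges = list(zip(nodes, nodes[1:]))
print(edges)
```

The result has one pair fewer than there are nodes, which is exactly the chain of edges that the long run of nodes.append(...) calls in the question builds by hand.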
General advice
Try to avoid using tools like pandas just to load data. In my experience, it often introduces a bunch of complexity you don't need.
Get something small and simple working before making it more complex, e.g. with URLs of images.
You can store dictionaries easily as JSON text files. Check out the json module.
Again, sorry for not directly answering your question. But I feel like all this should help get your code down to something that is much easier to debug.
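As a rough sketch of the JSON suggestion: the mapping and the file name below are invented for illustration, and the file is written to a temporary directory so the example does not touch your project.

```python
import json
import os
import tempfile

# Hypothetical mapping from node name to shape
shapes = {"person1": "ellipse", "person2": "circle", "person3": "database"}

# Write the mapping to a JSON text file
path = os.path.join(tempfile.mkdtemp(), "shapes.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(shapes, f, indent=2)

# Read it back; the result is an ordinary dict again
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded)
```

The file is plain text, so it can be edited by hand, and no pandas import is needed to load it.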

When I convert the obtained road network data into GeoDataFrames, why do list elements appear in the highway column?

import osmnx as ox
G = ox.graph_from_place('成都市',network_type='all')
G = ox.project_graph(G)
ox.plot_graph(G)
# Convert the graph into node and edge GeoDataFrames
gdf_nodes, gdf_edges = ox.graph_to_gdfs(G)
gdf_edges
This gives the following result (screenshot: https://i.stack.imgur.com/5d5J1.png).
Some entries in the highway column are lists, for example [motorway_link, trunk].

By default, OSMnx simplifies the graph unless you use simplify=False. See documentation for details:
Some of the resulting consolidated edges may comprise multiple OSM ways, and if so, their multiple attribute values are stored as a list.
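If a single value per edge is needed, one option is to collapse the lists yourself. The sketch below is not an OSMnx function: first_value is a hypothetical helper, and keeping only the first entry is an arbitrary choice made for illustration.

```python
def first_value(value):
    """Return the first item of a list-valued attribute, or the value itself."""
    if isinstance(value, list):
        return value[0] if value else None
    return value

# Sample values as they might appear in the highway column
highway = ["residential", ["motorway_link", "trunk"], "primary"]
cleaned = [first_value(v) for v in highway]
print(cleaned)
```

With the GeoDataFrame from the question, the same helper could be applied as gdf_edges['highway'].apply(first_value).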

Error when trying to make a GeoDataFrame of network nodes

I need to make a GeoDataFrame of some nodes on a road network (which was extracted from OpenStreetMap using OSMnx). In the code below, graph_proj is the graph whose nodes I'm working with, the points are start_point and end_point:
import osmnx as ox
import geopandas as gpd
nodes_proj, edges_proj = ox.graph_to_gdfs(graph_proj, nodes=True, edges=True)
# Finding the nodes on the graph nearest to the points
start_node = ox.nearest_nodes(graph_proj, start_point.geometry.x, start_point.geometry.y, return_dist=False)
end_node = ox.nearest_nodes(graph_proj, end_point.geometry.x, end_point.geometry.y, return_dist=False)
start_closest = nodes_proj.loc[start_node]
end_closest = nodes_proj.loc[end_node]
# Create a GeoDataBase from the start and end nodes
od_nodes = gpd.GeoDataFrame([start_closest, end_closest], geometry='geometry', crs=nodes_proj.crs)
During the last step ("# Create a GeoDataBase...", etc.), an error is thrown. Apparently, it has something to do with a 3-dimensional array being passed to the GeoDataFrame function. Am I right that the way I pass in the locations([start_closest, end_closest]) results in a 3D array? (The error message reads, 'Must pass 2-d input. shape=(2, 1, 7)') I tried transposing this array, but then GeoPandas could not locate the 'geometry' column. How do I go about passing in this argument in a way that it will be accepted?

OK, so I was able to get around this by writing each node to its own GeoDataFrame and then merging the two GeoDataFrames, like this:
import pandas as pd
od_nodes1 = gpd.GeoDataFrame(start_closest, geometry='geometry', crs=nodes_proj.crs)
od_nodes2 = gpd.GeoDataFrame(end_closest, geometry='geometry', crs=nodes_proj.crs)
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
od_nodes = pd.concat([od_nodes1, od_nodes2])
Surely, though, there must be a more elegant way of writing more than one feature into a GeoDataFrame?

Convert Column to Polygon in Python to perform Point in Polygon

I have written code to perform a point-in-polygon test in Python; the program uses a shapefile that I read in as the polygons.
I now have a dataframe with a column containing the polygon coordinates, e.g. [[28.050815,-26.242253],[28.050085,-26.25938],[28.011934,-26.25888],[28.020216,-26.230127],[28.049828,-26.230704],[28.050815,-26.242253]].
I want to transform this column into a polygon in order to perform the point-in-polygon test, but all the examples use geometry = [Point(xy) for xy in zip(dataPoints['Long'], dataPoints['Lat'])], whereas my coordinates are already zipped.
How would I go about achieving this?
Thanks

Taking your example above, you could do the following:
list_coords = [[28.050815,-26.242253],[28.050085,-26.25938],[28.011934,-26.25888],[28.020216,-26.230127],[28.049828,-26.230704],[28.050815,-26.242253]]
from shapely.geometry import Point, Polygon
# Create a list of point objects using list comprehension
point_list = [Point(x,y) for [x,y] in list_coords]
# Create a polygon object from the list of Point objects
polygon_feature = Polygon([[poly.x, poly.y] for poly in point_list])
And if you would like to apply it to a dataframe you could do the following:
import pandas as pd
import geopandas as gpd
df = pd.DataFrame({'coords': [list_coords]})
def get_polygon(list_coords):
    point_list = [Point(x,y) for [x,y] in list_coords]
    polygon_feature = Polygon([[poly.x, poly.y] for poly in point_list])
    return polygon_feature
df['geom'] = df['coords'].apply(get_polygon)
However, there might be geopandas built-in functions in order to avoid "reinventing the wheel", so let's see if anyone else has a suggestion :)

Plot missing points for complicated 3D list of points - Python

Hi I have a 3D list (I realise this may not be the best representation of my data so any advice here is appreciated) as such:
y_data = [
[[a,0],[b,1],[c,None],[d,6],[e,7]],
[[a,5],[b,2],[c,1],[d,None],[e,1]],
[[a,3],[b,None],[c,4],[d,9],[e,None]],
]
The y-axis data is such that each sublist is a list of values for one hour. The hours are the x-axis data. Each sublist of this has the following format:
[label,value]
So essentially:
line a is [0,5,3] on the y-axis
line b is [1,2,None] on the y-axis etc.
My x-data is:
x_data = [0,1,2,3,4]
Now when I plot this list directly i.e.
for i in range(0,5):
    ax.plot(x_data, [row[i][1] for row in y_data], label=y_data[0][i][0])
I get a line graph; however, where the value is None the point is not drawn and the line is not connected.
What I would like is a graph that plots my data in its current format but ignores missing points and draws a line between the point before the missing data and the point after it (i.e. interpolating the missing point).
I tried doing it like this https://stackoverflow.com/a/14399830/1800665 but I couldn't work out how to do this for a 3D list.
Thanks for any help!

The general approach that you linked to will work fine here; it looks like the question you're asking is how to apply that approach to your data. I'd like to suggest that by factoring out the data you're plotting, you'll see more clearly how to do it.
import numpy as np
y_data = [
[[a,0],[b,1],[c,None],[d,6],[e,7]],
[[a,5],[b,2],[c,1],[d,None],[e,1]],
[[a,3],[b,None],[c,4],[d,9],[e,None]],
]
x_data = [0, 1, 2, 3, 4]
for i in range(5):
    xv = []
    yv = []
    for j, v in enumerate(row[i][1] for row in y_data):
        if v is not None:
            xv.append(j)
            yv.append(v)
    ax.plot(xv, yv, label=y_data[0][i][0])
Here instead of using a mask like in the linked question/answer, I've explicitly built up the lists of valid data points that are to be plotted.
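To see what this loop passes to ax.plot, here is the same filtering without matplotlib. The labels are written as strings because the question leaves a, b, ... undefined, and series is a name introduced only for this sketch.

```python
y_data = [
    [["a", 0], ["b", 1], ["c", None], ["d", 6], ["e", 7]],
    [["a", 5], ["b", 2], ["c", 1], ["d", None], ["e", 1]],
    [["a", 3], ["b", None], ["c", 4], ["d", 9], ["e", None]],
]

series = {}
for i in range(len(y_data[0])):
    label = y_data[0][i][0]
    # Keep (hour, value) pairs only where a value exists
    points = [(j, row[i][1]) for j, row in enumerate(y_data) if row[i][1] is not None]
    xv = [x for x, _ in points]
    yv = [y for _, y in points]
    series[label] = (xv, yv)

print(series["d"])
```

Line d, for example, keeps only the points (0, 6) and (2, 9), so the plotted line runs straight across the missing value at hour 1.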
