How can I use matplotlib or pyqtgraph to draw a plot like this:
Line AB is a two-way street. The green part represents the direction from point A to point B, the red part represents B to A, and the width of each part represents the traffic volume. Widths are measured in points and must not change at different zoom levels or DPI settings.
This is only an example; in fact I have hundreds of streets. This kind of plot is very common in traffic software. I tried to use matplotlib's patheffects, but the result is frustrating:
from matplotlib import pyplot as plt
import matplotlib.patheffects as path_effects
x=[0,1,2,3]
y=[1,0,0,-1]
ab_width=20
ba_width=30
fig, axes= plt.subplots(1,1)
center_line, = axes.plot(x,y,color='k',linewidth=2)
center_line.set_path_effects(
[path_effects.SimpleLineShadow(offset=(0, -ab_width/2),shadow_color='g', alpha=1, linewidth=ab_width),
path_effects.SimpleLineShadow(offset=(0, ba_width/2), shadow_color='r', alpha=1, linewidth=ba_width),
path_effects.SimpleLineShadow(offset=(0, -ab_width), shadow_color='k', alpha=1, linewidth=2),
path_effects.SimpleLineShadow(offset=(0, ba_width), shadow_color='k', alpha=1, linewidth=2),
path_effects.Normal()])
axes.set_xlim(-1,4)
axes.set_ylim(-1.5,1.5)
One idea that came to me is to treat each part of the line as a standalone line and recalculate its position when the zoom level changes, but that is too complicated and slow.
Is there an easy way to use matplotlib or pyqtgraph to draw what I want? Any suggestions will be appreciated!
If you can have each independent line, this can be done easily with the fill_between function.
from matplotlib import pyplot as plt
import numpy as np
x=np.array([0,1,2,3])
y=np.array([1,0,0,-1])
y1width=-1
y2width=3
y1 = y + y1width
y2 = y + y2width
fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot(x,y, 'k', x,y1, 'k',x,y2, 'k',linewidth=2)
ax.fill_between(x, y1, y, color='g')
ax.fill_between(x, y2, y, color='r')
plt.xlim(-1,4)
plt.ylim(-3,6)
plt.show()
Here I considered the center line as the reference (thus the negative y1width), but could be done differently. The result is then:
If the lines are 'complicated', eventually intersecting at some point, then the keyword argument interpolate=True must be used to fill the crossover regions properly. Another interesting argument probably useful for your use case is where, to condition the region, for instance, where=y1 < 0. For more information you can check out the documentation.
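As a rough illustration of those two keyword arguments, the sketch below fills between two curves that cross, using where to pick the side and interpolate=True to close the fill at the crossing rather than at the nearest sample. The data is made up for the example, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: y1 crosses y2 between two sample points.
x = np.linspace(0, 3, 7)
y1 = np.sin(2 * x)
y2 = np.zeros_like(x)

fig, ax = plt.subplots()
ax.plot(x, y1, "k", x, y2, "k", linewidth=2)
# Green where y1 is above y2, red where it is below.
# interpolate=True extends each fill to the actual crossing point.
ax.fill_between(x, y1, y2, where=y1 >= y2, interpolate=True, color="g")
ax.fill_between(x, y1, y2, where=y1 < y2, interpolate=True, color="r")
fig.canvas.draw()
```

Without interpolate=True the coloured regions would stop at the last sample on each side of the crossing, leaving a small unfilled gap.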
One way of solving your issue is using filled polygons, some linear algebra and some calculus. The main idea is to draw a polygon along your x and y coordinates and along shifted coordinates to close and fill the polygon.
These are my results:
And here is the code:
from __future__ import division
import numpy
from matplotlib import pyplot, patches
def road(x, y, w, scale=0.005, **kwargs):
    # Makes sure input coordinates are arrays.
    x, y = numpy.asarray(x, dtype=float), numpy.asarray(y, dtype=float)
    # Calculate derivative (central differences inside, one-sided at the ends).
    dx = x[2:] - x[:-2]
    dy = y[2:] - y[:-2]
    dy_dx = numpy.concatenate([
        [(y[1] - y[0]) / (x[1] - x[0])],
        dy / dx,
        [(y[-1] - y[-2]) / (x[-1] - x[-2])]
    ])
    # Offsets the input coordinates according to the local derivative.
    offset = -dy_dx + 1j
    offset = w * scale * offset / abs(offset)
    # Polygon outline: along the offset curve, then back along the original one.
    # column_stack gives an (N, 2) array, which also works on Python 3
    # (a bare zip object is not accepted by patches.Polygon there).
    AB = numpy.column_stack([
        numpy.concatenate([x + offset.real, x[::-1]]),
        numpy.concatenate([y + offset.imag, y[::-1]]),
    ])
    p = patches.Polygon(AB, **kwargs)
    # Returns polygon.
    return p
if __name__ == '__main__':
    # Some plot initializations
    pyplot.close('all')
    pyplot.ion()
    # This is the list of coordinates of each point
    x = [0, 1, 2, 3, 4]
    y = [1, 0, 0, -1, 0]
    # Creates figure and axes.
    fig, ax = pyplot.subplots(1,1)
    ax.axis('equal')
    center_line, = ax.plot(x, y, color='k', linewidth=2)
    AB = road(x, y, 20, color='g')
    BA = road(x, y, -30, color='r')
    ax.add_patch(AB)
    ax.add_patch(BA)
The first step in calculating how to offset each data point is by calculating the discrete derivative dy / dx. I like to use complex notation to handle vectors in Python, i.e. A = 1 - 1j. This makes life easier for some mathematical operations.
The next step is to remember that the derivative gives the tangent to the curve and from linear algebra that the normal to the tangent is n=-dy_dx + 1j, using complex notation.
The final step in determining the offset coordinates is to ensure that the normal vector has unity size n_norm = n / abs(n) and multiply by the desired width of the polygon.
Now that we have all the coordinates for the points in the polygon, the rest is quite straightforward. Use patches.Polygon and add them to the plot.
This code allows you also to define if you want the patch on top of your route or below it. Just give a positive or negative value for the width. If you want to change the width of the polygon depending on your zoom level and/or resolution, you adjust the scale parameter. It also gives you freedom to add additional parameters to the patches such as fill patterns, transparency, etc.
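The normal-vector step described above can also be sketched without dividing by dx, which would otherwise fail on vertical segments. This is only an illustrative variant, not the code from the answer: offset_polyline is a hypothetical helper name, and width is in data units rather than points.

```python
import numpy as np

def offset_polyline(x, y, width):
    """Shift a polyline sideways by `width` (data units) along its local normal.

    Hypothetical helper for illustration: it works with the tangent vector
    (tx, ty) directly instead of the slope dy/dx.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Discrete tangent: central differences inside, one-sided at the ends.
    tx = np.gradient(x)
    ty = np.gradient(y)
    length = np.hypot(tx, ty)
    # Rotate the unit tangent by +90 degrees to get the unit normal.
    nx = -ty / length
    ny = tx / length
    return x + width * nx, y + width * ny

# A horizontal street shifted by one unit ends up one unit higher.
xo, yo = offset_polyline([0, 1, 2], [0, 0, 0], 1.0)
```

A positive width shifts the line to the left of the direction of travel and a negative width to the right, which mirrors the sign convention of the w argument in road.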
Related
A use case: given a signed distance field phi, the contour phi = 0 marks the surface of a geometry, and regions inside the geometry have phi < 0. In some cases, one wants to focus on values inside the geometry and plot only those regions, i.e., the regions where phi < 0.
Note: directly masking the array phi causes a zig-zag boundary near the contour line phi = 0, i.e., a poor visualization.
I was able to write the following code with the answer here: Fill OUTSIDE of polygon | Mask array where indicies are beyond a circular boundary? The function mask_outside_polygon below is from that post. My idea is to extract the coordinates of the contour line and use them to create a polygon mask.
The code works well when the contour line does not intersect the boundary of the figure. There is no zig-zag boundary, so it's a good visualization.
But when the contour intersects the figure boundary, the contour line is fragmented into pieces and the simple code no longer works. I wonder whether there is an existing feature for masking the figure, or a simpler method I can use. Thanks!
import numpy as np
import matplotlib.pyplot as plt
def mask_outside_polygon(poly_verts, ax=None):
    """
    Plots a mask on the specified axis ("ax", defaults to plt.gca()) such that
    all areas outside of the polygon specified by "poly_verts" are masked.
    "poly_verts" must be a list of tuples of the vertices in the polygon in
    counter-clockwise order.
    Returns the matplotlib.patches.PathPatch instance plotted on the figure.
    """
    import matplotlib.patches as mpatches
    import matplotlib.path as mpath
    if ax is None:
        ax = plt.gca()
    # Get current plot limits
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    # Vertices of the plot boundaries in clockwise order
    # (xlim and ylim are 2-tuples, so the valid indices are 0 and 1)
    bound_verts = [(xlim[0], ylim[0]), (xlim[0], ylim[1]),
                   (xlim[1], ylim[1]), (xlim[1], ylim[0]),
                   (xlim[0], ylim[0])]
    # A series of codes (1 and 2) to tell matplotlib whether to draw a line or
    # move the "pen" (So that there's no connecting line)
    bound_codes = [mpath.Path.MOVETO] + (len(bound_verts) - 1) * [mpath.Path.LINETO]
    poly_codes = [mpath.Path.MOVETO] + (len(poly_verts) - 1) * [mpath.Path.LINETO]
    # Plot the masking patch
    path = mpath.Path(bound_verts + poly_verts, bound_codes + poly_codes)
    patch = mpatches.PathPatch(path, facecolor='white', edgecolor='none')
    patch = ax.add_patch(patch)
    # Reset the plot limits to their original extents
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    return patch
def main():
    x = np.linspace(-1.2, 1.2, 101)
    y = np.linspace(-1.2, 1.2, 101)
    xx, yy = np.meshgrid(x, y)
    rr = np.sqrt(xx**2 + yy**2)
    psi = xx*xx - yy*yy
    plt.contourf(xx,yy,psi)
    if 0: # change to 1 to see the working result
        # radius 1.0 stays inside the [-1.2, 1.2] domain
        cs = plt.contour(xx,yy,rr,levels=[1.0]) # works
    else:
        # radius 1.3 crosses the figure boundary
        cs = plt.contour(xx,yy,rr,levels=[1.3]) # does not work
    path = cs.collections[0].get_paths()[0]
    poly_verts = path.vertices
    mask_outside_polygon(poly_verts.tolist()[::-1])
    plt.show()
if __name__ == '__main__':
    main()
I'm trying to adapt the following resources to this question:
Python conversion between coordinates
https://matplotlib.org/gallery/pie_and_polar_charts/polar_scatter.html
I can't seem to get the coordinates to transfer the dendrogram shape over to polar coordinates.
Does anyone know how to do this? I know there is an implementation in networkx but that requires building a graph and then using pygraphviz backend to get the positions.
Is there a way to convert dendrogram cartesian coordinates to polar coordinates with matplotlib and numpy?
import requests
from ast import literal_eval
import matplotlib.pyplot as plt
import numpy as np
def read_url(url):
    r = requests.get(url)
    return r.text
def cartesian_to_polar(x, y):
    rho = np.sqrt(x**2 + y**2)
    phi = np.arctan2(y, x)
    return(rho, phi)
def plot_dendrogram(icoord,dcoord,figsize, polar=False):
    if polar:
        icoord, dcoord = cartesian_to_polar(icoord, dcoord)
    with plt.style.context("seaborn-white"):
        fig = plt.figure(figsize=figsize)
        ax = fig.add_subplot(111, polar=polar)
        for xs, ys in zip(icoord, dcoord):
            ax.plot(xs,ys, color="black")
        ax.set_title(f"Polar= {polar}", fontsize=15)
# Load the dendrogram data
string_data = read_url("https://pastebin.com/raw/f953qgdr").replace("\r","").replace("\n","").replace("\u200b\u200b","")
# Convert it to a dictionary (a subset of the output from scipy.hierarchy.dendrogram)
dendrogram_data = literal_eval(string_data)
icoord = np.asarray(dendrogram_data["icoord"], dtype=float)
dcoord = np.asarray(dendrogram_data["dcoord"], dtype=float)
# Plot the cartesian version
plot_dendrogram(icoord,dcoord, figsize=(8,3), polar=False)
# Plot the polar version
plot_dendrogram(icoord,dcoord, figsize=(5,5), polar=True)
I just tried this and it's closer but still not correct:
import matplotlib.transforms as mtransforms
with plt.style.context("seaborn-white"):
    fig, ax = plt.subplots(figsize=(5,5))
    for xs, ys in zip(icoord, dcoord):
        # trans_offset is only defined further down, so it cannot be used here
        ax.plot(xs,ys, color="black")
    ax_polar = plt.subplot(111, projection='polar')
    trans_offset = mtransforms.offset_copy(ax_polar.transData, fig=fig)
    for xs, ys in zip(icoord, dcoord):
        ax_polar.plot(xs,ys, color="black",transform=trans_offset)
You can make the "root" of the tree start in the middle and have the leaves outside. You also have to add more points to the "bar" part for it to look nice and round.
We note that each element of icoord and dcoord (I will call this seg) has four points:
seg[1] seg[2]
+-------------+
| |
+ seg[0] + seg[3]
The vertical bars are fine as straight lines between the two points, but we need more points between seg[1] and seg[2] (the horizontal bar, which will need to become an arc).
This function will add more points in those positions and can be called on both xs and ys in the plotting function:
def smoothsegment(seg, Nsmooth=100):
    return np.concatenate([[seg[0]], np.linspace(seg[1], seg[2], Nsmooth), [seg[3]]])
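To see what this does to a single link, here is a small usage sketch with invented numbers: the two end points are kept as they are, and only the bar between seg[1] and seg[2] is resampled. Nsmooth=5 is chosen just to keep the output short.

```python
import numpy as np

def smoothsegment(seg, Nsmooth=100):
    # Keep the two leaf points, resample only the bar between seg[1] and seg[2].
    return np.concatenate([[seg[0]], np.linspace(seg[1], seg[2], Nsmooth), [seg[3]]])

# One U-shaped link (made-up values): angles play the role of icoord,
# radii the role of dcoord.
angles = np.array([1.0, 1.0, 2.0, 2.0])
radii = np.array([0.0, 3.0, 3.0, 0.0])

smooth_angles = smoothsegment(angles, Nsmooth=5)
smooth_radii = smoothsegment(radii, Nsmooth=5)
print(smooth_angles)
print(smooth_radii)
```

The resampled bar has many angles at one constant radius, which is what makes it render as an arc on polar axes instead of a straight chord.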
Now we must modify the plotting function to calculate the radial coordinates. Some experimentation has led to the log formula I am using, based on the other answer which also uses log scale. I've left a gap open on the right for the radial labels and done a very rudimentary mapping of the "icoord" coordinates to the radial ones so that the labels correspond to the ones in the rectangular plot. I don't know exactly how to handle the radial dimension. The numbers are correct for the log, but we probably want to map them as well.
def plot_dendrogram(icoord,dcoord,figsize, polar=False):
    if polar:
        dcoord = -np.log(dcoord+1)
        # avoid a wedge over the radial labels
        gap = 0.1
        imax = icoord.max()
        imin = icoord.min()
        # numpy is imported as np in this script
        icoord = ((icoord - imin)/(imax - imin)*(1-gap) + gap/2)*2*np.pi
    with plt.style.context("seaborn-white"):
        fig = plt.figure(figsize=figsize)
        ax = fig.add_subplot(111, polar=polar)
        for xs, ys in zip(icoord, dcoord):
            if polar:
                xs = smoothsegment(xs)
                ys = smoothsegment(ys)
            ax.plot(xs,ys, color="black")
        ax.set_title(f"Polar= {polar}", fontsize=15)
        if polar:
            ax.spines['polar'].set_visible(False)
            ax.set_rlabel_position(0)
            Nxticks = 10
            xticks = np.linspace(gap/2, 1-gap/2, Nxticks)
            ax.set_xticks(xticks*np.pi*2)
            ax.set_xticklabels(np.round(np.linspace(imin, imax, Nxticks)).astype(int))
Which results in the following figure:
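The angular mapping used in plot_dendrogram can be checked in isolation. The sketch below wraps the same formula in a hypothetical helper, to_angle, and applies it to three invented leaf positions: the smallest value lands at gap/2 of a full turn and the largest at 1 - gap/2, which leaves the wedge free for the radial labels.

```python
import numpy as np

def to_angle(icoord, gap=0.1):
    """Map icoord values linearly onto angles, leaving a `gap` fraction of the circle empty.

    Hypothetical helper; the formula is the one used in plot_dendrogram.
    """
    icoord = np.asarray(icoord, dtype=float)
    imin, imax = icoord.min(), icoord.max()
    return ((icoord - imin) / (imax - imin) * (1 - gap) + gap / 2) * 2 * np.pi

# Invented leaf positions, only for illustration.
theta = to_angle([5.0, 15.0, 25.0], gap=0.1)
print(theta / (2 * np.pi))  # fractions of a full turn
```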
First, I think you might benefit from this question.
Then, let's break down the objective: it is not very clear to me what you want to do, but I assume you want something that looks like this
(source, page 14)
To render something like this, you need to be able to render horizontal lines that appear as semicircles in polar coordinates. Then, it's a matter of mapping your horizontal lines to the polar plot.
First, note that your coordinates are not normalized in these lines:
if polar:
    icoord, dcoord = cartesian_to_polar(icoord, dcoord)
You might normalize them by simply remapping icoord to [0, 2π).
Now, let's try plotting something simpler, instead of your complex plot:
icoord, dcoord = np.meshgrid(np.r_[1:10], np.r_[1:4])
# Plot the cartesian version
plot_dendrogram(icoord, dcoord, figsize=(8, 3), polar=False)
# Plot the polar version
plot_dendrogram(icoord, dcoord, figsize=(5, 5), polar=True)
Result is the following:
As you can see, the polar code does not map horizontal lines to semicircles, so that is not going to work. Let's try plt.polar instead:
plt.polar(icoord.T, dcoord.T)
produces
which is more like what we need. We need to fix the angles first, and then account for the fact that the Y coordinate goes inward (while you probably want it going from the center to the border). It boils down to this
nic = (icoord.T - icoord.min()) / (icoord.max() - icoord.min())
plt.polar(2 * np.pi * nic, -dcoord.T)
which produces the following
which is similar to what you need. Note that straight lines remain straight and are not replaced with arcs, so you might want to resample them in your for loop.
Also, you might benefit from a single color and a log scale to make reading easier:
plt.subplots(figsize=(10, 10))
ico = (icoord.T - icoord.min()) / (icoord.max() - icoord.min())
plt.polar(2 * np.pi * ico, -np.log(dcoord.T), 'b')
I am trying to hatch only the regions where I have statistically significant results. How can I do this using Basemap and pcolormesh?
plt.figure(figsize=(12,12))
lons = iris_cube.coord('longitude').points
lats = iris_cube.coord('latitude').points
m = Basemap(llcrnrlon=lons[0], llcrnrlat=lats[0], urcrnrlon=lons[-1], urcrnrlat=lats[-1], resolution='l')
lon, lat = np.meshgrid(lons, lats)
plt.subplot(111)
cs = m.pcolormesh(lon, lat, significant_data, cmap=cmap, norm=norm, hatch='/')
It seems pcolormesh does not support hatching (see https://github.com/matplotlib/matplotlib/issues/3058). Instead, the advice is to use pcolor, which starting from this example would look like,
import matplotlib.pyplot as plt
import numpy as np
dx, dy = 0.15, 0.05
y, x = np.mgrid[slice(-3, 3 + dy, dy),
slice(-3, 3 + dx, dx)]
z = (1 - x / 2. + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
z = z[:-1, :-1]
zm = np.ma.masked_less(z, 0.3)
cm = plt.pcolormesh(x, y, z)
plt.pcolor(x, y, zm, hatch='/', alpha=0.)
plt.colorbar(cm)
plt.show()
where a mask array is used to get the values of z greater than 0.3 and these are hatched using pcolor.
To avoid plotting another colour over the top (so you get only hatching) I've set alpha to 0. in pcolor which feels a bit like a hack. The alternative is to use patch and assign to the areas you want. See this example Python: Leave Numpy NaN values from matplotlib heatmap and its legend. This may be more tricky for basemaps, etc than just choosing areas with pcolor.
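To make the masking step concrete, here is a tiny sketch on an invented 2x2 array: np.ma.masked_less hides every value below the threshold, so only the remaining cells are drawn (and hatched) by pcolor.

```python
import numpy as np

# Invented values, only to show which cells survive the mask.
z = np.array([[0.1, 0.5],
              [0.2, 0.9]])
zm = np.ma.masked_less(z, 0.3)

print(zm.mask)          # True marks the cells that are hidden
print(zm.compressed())  # the values that remain visible
```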
I have a simple solution for this problem, using only pcolormesh and not pcolor: Plot the color mesh, then hatch the entire plot, and then plot the original mesh again, this time by masking statistically significant cells, so that the only hatching visible is those on significant cells. Alternatively, you can put a marker on every cell (looks good too), instead of hatching the entire figure.
(I use cartopy instead of basemap, but this shouldn't matter.)
Step 1: Plot your field (z) normally, using pcolormesh.
mesh = plt.pcolormesh(x,y,z)
where x/y can be lons/lats.
Step 2: Hatch the entire plot. For this, use fill_between:
hatch = plt.fill_between([xmin,xmax],y1,y2,hatch='///////',color="none",edgecolor='black')
Check the documentation of fill_between to set xmin, xmax, y1 and y2. You simply define two horizontal lines beyond the bounds of your plot and hatch the area in between. Use more or fewer / characters to set the hatch density.
To adjust the hatch thickness, use the lines below:
import matplotlib as mpl
mpl.rcParams['hatch.linewidth'] = 0.3
As an alternative to hatching everything, you can plot all your x-y points (or, lon-lat couples) as markers. A simple solution is putting a dot (x also looks good).
hatch = plt.plot(x,y,'.',color='black',markersize=1.5)
One of the above will be the basis of your 'hatch'. This is how it should look after Step 2:
Step 3: On top of these two, plot your color mesh once again with pcolormesh, this time masking cells containing statistically significant values. This way, the markers on your 'insignificant' cells become invisible again, while significant markers stay visible.
Assuming you have an identically sized array containing the t statistic for each cell (t_z), you can mask significant values using numpy's ma module.
z_masked = numpy.ma.masked_where(t_z >= your_threshold, z)
Then, plot the color mesh, using the masked array.
mesh_masked = plt.pcolormesh(x,y,z_masked)
Use zorder to make sure the layers are in correct order. This is how it should look after Step 3:
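Put together, the three steps might look like the sketch below. It is only an outline under stated assumptions: the field z is random, t_z is a stand-in for a real t statistic, the threshold is arbitrary, and the Agg backend is used so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(6)              # 6 cell edges -> 5 columns
y = np.arange(5)              # 5 cell edges -> 4 rows
z = rng.normal(size=(4, 5))   # invented field
t_z = np.abs(z)               # stand-in for a t statistic (assumption)
threshold = 1.0               # arbitrary significance threshold (assumption)
vmin, vmax = z.min(), z.max()

fig, ax = plt.subplots()
# Step 1: the plain colour mesh at the bottom.
ax.pcolormesh(x, y, z, vmin=vmin, vmax=vmax, zorder=1)
# Step 2: hatch the whole plotting area.
ax.fill_between([x[0], x[-1]], y[0], y[-1], hatch="///",
                facecolor="none", edgecolor="black", linewidth=0, zorder=2)
# Step 3: redraw the mesh on top with the significant cells masked out,
# so the hatching stays visible only there.
z_masked = np.ma.masked_where(t_z >= threshold, z)
ax.pcolormesh(x, y, z_masked, vmin=vmin, vmax=vmax, zorder=3)
fig.canvas.draw()
```

Passing the same vmin and vmax to both meshes keeps their colours identical; otherwise the masked mesh would be autoscaled on fewer values and the two layers could disagree.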
I know that matplotlib and scipy can do bicubic interpolation:
http://matplotlib.org/examples/pylab_examples/image_interp.html
http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp2d.html
I also know that it is possible to draw a map of the world with matplotlib:
http://matplotlib.org/basemap/users/geography.html
http://matplotlib.org/basemap/users/examples.html
http://matplotlib.org/basemap/api/basemap_api.html
But can I do a bicubic interpolation based on 4 data points and only color the land mass?
For example using these for 4 data points (longitude and latitude) and colors:
Lagos: 6.453056, 3.395833; red HSV 0 100 100 (or z = 0)
Cairo: 30.05, 31.233333; green HSV 90 100 100 (or z = 90)
Johannesburg: -26.204444, 28.045556; cyan HSV 180 100 100 (or z = 180)
Mogadishu: 2.033333, 45.35; purple HSV 270 100 100 (or z = 270)
I am thinking that it must be possible to do the bicubic interpolation across the range of latitudes and longitudes and then add oceans, lakes and rivers on top of that layer? I can do this with drawmapboundary. Actually there is an option maskoceans for this:
http://matplotlib.org/basemap/api/basemap_api.html#mpl_toolkits.basemap.maskoceans
I can interpolate the data like this:
xnew, ynew = np.mgrid[-1:1:70j, -1:1:70j]
tck = interpolate.bisplrep(x, y, z, s=0)
znew = interpolate.bisplev(xnew[:,0], ynew[0,:], tck)
Or with scipy.interpolate.interp2d:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp2d.html
Here it is explained how to convert to map projection coordinates:
http://matplotlib.org/basemap/users/mapcoords.html
But I need to figure out how to do this for a calculated surface instead of individual points. Actually there is an example of such a topographic map using external data, which I should be able to replicate:
http://matplotlib.org/basemap/users/examples.html
P.S. I am not looking for a complete solution. I would much prefer to solve this myself. Rather I am looking for suggestions and hints. I have been using gnuplot for more than 10 years and only switched to matplotlib within the past few weeks, so please don't assume I know even the simplest things about matplotlib.
I think this is what you are looking for (roughly). Note the crucial things are masking the data array before you plot the pcolor and passing in the hsv colormap (Docs: cmap parameter for pcolormesh and available colormaps).
I've kept the code for plotting the maps quite close to the examples so it should be easy to follow. I've kept your interpolation code for the same reason. Note that the interpolation is linear rather than cubic - kx=ky=1 - because you don't give enough points to do cubic interpolation (you'd need at least 16 - scipy will complain with less saying that "m must be >= (kx+1)(ky+1)", although the constraint is not mentioned in the documentation).
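The arithmetic behind that constraint is simple enough to spell out. The helper below is hypothetical and only encodes the inequality quoted from the error message, m >= (kx+1)(ky+1); it does not call scipy.

```python
def min_points_for_degrees(kx, ky):
    """Smallest number of data points m satisfying m >= (kx+1)(ky+1)."""
    return (kx + 1) * (ky + 1)

n_points = 4  # the four cities in the question
for name, k in [("linear", 1), ("cubic", 3)]:
    needed = min_points_for_degrees(k, k)
    status = "ok" if n_points >= needed else "too few points"
    print(f"{name}: needs {needed}, have {n_points} -> {status}")
```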
I've also extended the range of your meshgrid and kept in lat / lon for x and y throughout.
Code
from mpl_toolkits.basemap import Basemap,maskoceans
import matplotlib.pyplot as plt
import numpy as np
from scipy import interpolate
# set up orthographic map projection with
# perspective of satellite looking down at 0N, 20E (Africa in main focus)
# use low resolution coastlines.
map = Basemap(projection='ortho',lat_0=0,lon_0=20,resolution='l')
# draw coastlines, country boundaries
map.drawcoastlines(linewidth=0.25)
map.drawcountries(linewidth=0.25)
# Optionally (commented line below) give the map a fill colour - e.g. a blue sea
#map.drawmapboundary(fill_color='aqua')
# draw lat/lon grid lines every 30 degrees.
map.drawmeridians(np.arange(0,360,30))
map.drawparallels(np.arange(-90,90,30))
data = {'Lagos': (6.453056, 3.395833,0),
'Cairo': (30.05, 31.233333,90),
'Johannesburg': (-26.204444, 28.045556,180),
'Mogadishu': (2.033333, 45.35, 270)}
x,y,z = zip(*data.values())
xnew, ynew = np.mgrid[-30:60:0.1, -50:50:0.1]
tck = interpolate.bisplrep(x, y, z, s=0,kx=1,ky=1)
znew = interpolate.bisplev(xnew[:,0], ynew[0,:], tck)
znew = maskoceans(xnew, ynew, znew)
col_plot = map.pcolormesh(xnew, ynew, znew, latlon=True, cmap='hsv')
plt.show()
Output
Observe that doing the opposite, that is, putting a raster on the sea and laying a mask over the continents, is as easy as pie: simply use map.fillcontinents(). So the basic idea of this solution is to modify the fillcontinents function so that it lays polygons over the oceans.
The steps are:
Create a large circle-like polygon that covers the entire globe.
Create a polygon for each shape in the map.coastpolygons array.
Cut the shape of the landmass polygon away from the circle using shapely and its difference method.
Add the remaining polygons, which have the shape of the oceans, on the top, with a high zorder.
The code:
from mpl_toolkits.basemap import Basemap
import numpy as np
from scipy import interpolate
from shapely.geometry import Polygon
from descartes.patch import PolygonPatch
def my_circle_polygon(center, r, resolution=50):
    # tuple parameters in the signature are Python 2 only, so unpack here
    x0, y0 = center
    circle = []
    for theta in np.linspace(0, 2*np.pi, resolution):
        x = r * np.cos(theta) + x0
        y = r * np.sin(theta) + y0
        circle.append((x, y))
    return Polygon(circle[:-1])
def filloceans(the_map, color='0.8', ax=None):
    # get current axes instance (if none specified).
    if not ax:
        ax = the_map._check_ax()
    # creates a circle that covers the world
    r = 0.5*(the_map.xmax - the_map.xmin) # + 50000 # adds a little bit of margin
    x0 = 0.5*(the_map.xmax + the_map.xmin)
    y0 = 0.5*(the_map.ymax + the_map.ymin)
    oceans = my_circle_polygon((x0, y0), r, resolution=100)
    # for each coastline polygon, gouge it out of the circle
    for x, y in the_map.coastpolygons:
        xa = np.array(x, np.float32)
        ya = np.array(y, np.float32)
        xy = list(zip(xa.tolist(), ya.tolist()))
        continent = Polygon(xy)
        ## catches error when difference with lakes
        try:
            oceans = oceans.difference(continent)
        except Exception:
            patch = PolygonPatch(continent, color="white", zorder=150)
            ax.add_patch(patch)
    # a MultiPolygon exposes its parts through .geoms; a single Polygon has none
    for ocean in getattr(oceans, "geoms", [oceans]):
        sea_patch = PolygonPatch(ocean, color=color, zorder=100)
        ax.add_patch(sea_patch)
########### DATA
x = [3.395833, 31.233333, 28.045556, 45.35 ]
y = [6.453056, 30.05, -26.204444, 2.033333]
z = [0, 90, 180, 270]
# set up orthographic map projection
map = Basemap(projection='ortho', lat_0=0, lon_0=20, resolution='l')
## Plot the cities on the map
map.plot(x,y,".", latlon=1)
# create a interpolated mesh and set it on the map
interpol_func = interpolate.interp2d(x, y, z, kind='linear')
newx = np.linspace( min(x), max(x) )
newy = np.linspace( min(y), max(y) )
X,Y = np.meshgrid(newx, newy)
Z = interpol_func(newx, newy)
map.pcolormesh( X, Y, Z, latlon=1, zorder=3)
filloceans(map, color="blue")
Voilà:
I have a complicated curve defined as a set of points in a table like so (the full table is here):
# x y
1.0577 12.0914
1.0501 11.9946
1.0465 11.9338
...
If I plot this table with the commands:
plt.plot(x_data, y_data, c='b',lw=1.)
plt.scatter(x_data, y_data, marker='o', color='k', s=10, lw=0.2)
I get the following:
where I've added the red points and segments manually. What I need is a way to calculate those segments for each of those points, that is: a way to find the minimum distance from a given point in this 2D space to the interpolated curve.
I can't use the distance to the data points themselves (the black dots that generate the blue curve) since they are not located at equal intervals, sometimes they are close and sometimes they are far apart and this deeply affects my results further down the line.
Since this is not a well behaved curve I'm not really sure what I could do. I've tried interpolating it with a UnivariateSpline but it returns a very poor fit:
# Sort data according to x.
temp_data = sorted(zip(x_data, y_data))
# Unpack sorted data.
x_sorted, y_sorted = zip(*temp_data)
# Generate univariate spline.
s = UnivariateSpline(x_sorted, y_sorted, k=5)
xspl = np.linspace(0.8, 1.1, 100)
yspl = s(xspl)
# Plot.
plt.scatter(xspl, yspl, marker='o', color='r', s=10, lw=0.2)
I also tried increasing the number of interpolating points but got a mess:
# Sort data according to x.
temp_data = sorted(zip(x_data, y_data))
# Unpack sorted data.
x_sorted, y_sorted = zip(*temp_data)
t = np.linspace(0, 1, len(x_sorted))
t2 = np.linspace(0, 1, 100)
# One-dimensional linear interpolation.
x2 = np.interp(t2, t, x_sorted)
y2 = np.interp(t2, t, y_sorted)
plt.scatter(x2, y2, marker='o', color='r', s=10, lw=0.2)
Any ideas/pointers will be greatly appreciated.
If you're open to using a library for this, have a look at shapely: https://github.com/Toblerity/Shapely
As a quick example (points.txt contains the data you linked to in your question):
import shapely.geometry as geom
import numpy as np
coords = np.loadtxt('points.txt')
line = geom.LineString(coords)
point = geom.Point(0.8, 10.5)
# Note that "line.distance(point)" would be identical
print(point.distance(line))
As an interactive example (this also draws the line segments you wanted):
import numpy as np
import shapely.geometry as geom
import matplotlib.pyplot as plt
class NearestPoint(object):
    def __init__(self, line, ax):
        self.line = line
        self.ax = ax
        ax.figure.canvas.mpl_connect('button_press_event', self)
    def __call__(self, event):
        x, y = event.xdata, event.ydata
        point = geom.Point(x, y)
        distance = self.line.distance(point)
        self.draw_segment(point)
        print('Distance to line:', distance)
    def draw_segment(self, point):
        # project() gives the distance along the line to the nearest point,
        # interpolate() turns that distance back into a point on the line
        point_on_line = self.line.interpolate(self.line.project(point))
        self.ax.plot([point.x, point_on_line.x], [point.y, point_on_line.y],
                     color='red', marker='o', scalex=False, scaley=False)
        self.ax.figure.canvas.draw()
if __name__ == '__main__':
    coords = np.loadtxt('points.txt')
    line = geom.LineString(coords)
    fig, ax = plt.subplots()
    ax.plot(*coords.T)
    ax.axis('equal')
    NearestPoint(line, ax)
    plt.show()
Note that I've added ax.axis('equal'). shapely operates in the coordinate system that the data is in. Without the equal axis plot, the view will be distorted, and while shapely will still find the nearest point, it won't look quite right in the display:
The curve is by nature parametric, i.e. for each x there isn't necessarily a unique y and vice versa. So you shouldn't interpolate a function of the form y(x) or x(y). Instead, you should do two interpolations, x(t) and y(t), where t is, say, the index of the corresponding point.
Then you use scipy.optimize.fminbound to find the optimal t such that (x(t) - x0)^2 + (y(t) - y0)^2 is the smallest, where (x0, y0) are the red dots in your first figure. For fminbound, you could specify the min/max bounds for t to be 0 and len(x_data) - 1.
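A minimal sketch of that idea follows, assuming piecewise-linear interpolation with np.interp for x(t) and y(t) (a spline could be swapped in) and a made-up three-point curve. closest_point_on_curve is a hypothetical name. Note that fminbound only finds a local minimum, so for a curve that folds back on itself it may be necessary to search several sub-intervals of t and keep the best result.

```python
import numpy as np
from scipy.optimize import fminbound

def closest_point_on_curve(x_data, y_data, x0, y0):
    """Return (t, distance) of the curve point nearest to (x0, y0).

    Hypothetical helper: t is the (fractional) index along the data points.
    """
    x_data = np.asarray(x_data, dtype=float)
    y_data = np.asarray(y_data, dtype=float)
    t_data = np.arange(len(x_data), dtype=float)

    def squared_distance(t):
        # Two separate interpolations, x(t) and y(t).
        xt = np.interp(t, t_data, x_data)
        yt = np.interp(t, t_data, y_data)
        return (xt - x0) ** 2 + (yt - y0) ** 2

    t_best = fminbound(squared_distance, t_data[0], t_data[-1])
    return t_best, np.sqrt(squared_distance(t_best))

# Made-up curve: a straight line along y = 0; the query point sits above it.
t_best, dist = closest_point_on_curve([0, 1, 2], [0, 0, 0], 0.5, 1.0)
print(t_best, dist)
```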
You could try calculating the distance from the point to the line through each consecutive pair of points on the curve and taking the minimum. This will introduce a small error relative to the curve as drawn, but it should be very small, as the points are relatively close together.
http://en.wikipedia.org/wiki/Distance_from_a_point_to_a_line
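As a sketch of that approach, the pure-Python code below walks over consecutive pairs of points and keeps the smallest distance. One deliberate difference from the linked formula: it measures the distance to each finite segment rather than to the infinite line, by clamping the projection to the segment ends. The function names and the test curve are invented for the example.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all given as (x, y) tuples)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the projection along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, points):
    """Minimum distance from p to the polyline through consecutive points."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(points, points[1:]))

# Invented L-shaped curve, only for illustration.
curve = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
print(distance_to_polyline((1.0, 1.0), curve))
print(distance_to_polyline((3.0, 3.0), curve))
```

The points must be in curve order, not sorted by x, since the segments are built from neighbouring entries of the list.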
You can easily use the package trjtrypy in PyPI: https://pypi.org/project/trjtrypy/
All needed computations and visualizations are available in this package. You can get your answer within a line of code like:
to get the minimum distance use: trjtrypy.basedists.distance(points, curve)
to visualize the curve and points use: trjtrypy.visualizations.draw_landmarks_trajectory(points, curve)