Matching two vector paths - python

I have several vector paths and a query path, and I am trying to find the path that is most similar to the query path. I can access the length (perimeter) of each path, as well as the width and height of its bounding box. I am using Python with the PyX library for rendering SVG paths and calculating their bounding boxes. The pseudocode looks like this:
THRESHOLD = ...  # some value
qpath = ...      # my query path
similar_paths = []
for path in path_list:
    # compare bounding-box width/height and perimeter against qpath, within THRESHOLD
    if comparable_width and comparable_height and comparable_perimeters:
        similar_paths.append(path)
But it does not seem to give good results. Any ideas on how to improve them?

Let's use a simple PyX graph to generate some paths:
The paths could also come from an SVG file read in parsed mode.
Once you have PyX paths, you can use PyX features to get further information about them. In the following simple version, I calculate a few points along each path and then sum up their distances. (I do this using the method names ending in _pt, which work in PostScript points; it is a little faster than using PyX units. I also converted all paths to normpaths explicitly in the beginning. While this is not necessary, it helps reduce some internal function calls.)
Here is the full code (including the graph to generate the sample paths):
import math
from pyx import *

# create some data (and draw it)
g = graph.graphxy(width=10, x=graph.axis.lin(min=0, max=2*math.pi))
qpi = g.plot(graph.data.function("y(x)=sin(x)"))
opi1 = g.plot(graph.data.function("y(x)=sin(x)+0.1*sin(10*x)"))
opi2 = g.plot(graph.data.function("y(x)=sin(x)+0.2*sin(20*x)", points=1000))
g.writePDFfile()

# get the corresponding PyX paths
qpath = qpi.path.normpath()
opath1 = opi1.path.normpath()
opath2 = opi2.path.normpath()

# now analyse it
POINTS = 10
qpath_arclen_pt = qpath.arclen_pt()
qpath_points_pt = qpath.at_pt([qpath_arclen_pt*i/(POINTS-1) for i in range(POINTS)])
for opath in [opath1, opath2]:
    opath_arclen_pt = opath.arclen_pt()
    opath_points_pt = opath.at_pt([opath_arclen_pt*i/(POINTS-1) for i in range(POINTS)])
    print(sum(math.sqrt((qpoint_x_pt-opoint_x_pt)**2 + (qpoint_y_pt-opoint_y_pt)**2)
              for (qpoint_x_pt, qpoint_y_pt), (opoint_x_pt, opoint_y_pt) in zip(qpath_points_pt, opath_points_pt)))
The program just prints out:
25.381154890630064
56.44386644062556
which indicates that the dashed line is closer to the solid one than the dotted one.
You may also compare tangents, curvatures, the arc length itself, etc.; there are plenty of options depending on your needs.
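For example, here is a minimal sketch of an arc-length pre-filter, reusing the arclen_pt() call from the code above (the 10% relative threshold is just an assumed value):

# Minimal sketch: treat two paths as candidates if their arc lengths differ
# by less than a relative threshold (the 10% default is an assumption)
def arclen_similar(qpath, opath, threshold=0.1):
    qlen = qpath.arclen_pt()
    olen = opath.arclen_pt()
    return abs(qlen - olen) / qlen <= threshold

print([arclen_similar(qpath, opath) for opath in [opath1, opath2]])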

Related

How can you export images from an .emd file with hyperspy?

Given a HAADF-STEM Spectrum Image (SI) as an .emd (Velox) file, I want to extract all the individual HAADF images from the stack.
I assume there is an easy way with hyperspy, but I am unable to identify it.
My code so far:
import hyperspy.api as hs
path = r'C:\Users\SI_File.emd'
s = hs.load(path)
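In case it helps, a hedged sketch of one possible approach; the list return value, the title-based filter, and the PNG export are assumptions about this particular Velox file rather than a confirmed recipe:

import hyperspy.api as hs

path = r'C:\Users\SI_File.emd'
# Velox .emd files typically load as a list of signals (assumption for this file)
signals = hs.load(path)
# keep only signals whose metadata title mentions HAADF
# (assumes the Velox metadata labels the detector in the title)
haadf = [s for s in signals if 'HAADF' in s.metadata.General.title]
for i, s in enumerate(haadf):
    s.save(f'haadf_{i:03d}.png')  # hyperspy can write 2D signals via its image IO plugin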

Getting path length in mm SvgPathTools

I have an SVG and I need to get the length of its paths in millimeters, to be used in a real-life application. With the Python svgpathtools library, the length function works well, but I cannot find anywhere in the svgpathtools docs which unit is used when calculating the length of the arc (or of the curves included in it).
Below is the code I currently have:
from svgpathtools import svg2paths2

paths, attributes, svg_attributes = svg2paths2('sample.svg')
for path in paths:
    for curve in path:
        print(curve.length())  # length() is a method; the output below comes from calling it
This is the output I receive from running the above code
10.895300154213514
15.142629055866722
23.730517433325748
23.73105785794831
21.77125455329005
11.279624921558746
11.279680843104023
21.771279269128605
23.731580549550927
23.731627915439564
21.77125455329005
11.279624921558746
11.279680843104023
21.774144213511132
23.729677579725156
23.731627915439564
21.795916354994993
10.986505331490429
10.604205974729867
46.43923310506517
45.3986468453268
10.324124024685046
8.071888088998016
0.003605551275512736
26.124050994562996
45.737106670828
53.94962464402549
17.012395710264993
51.95279012002064
16.17007686336362
20.841277156788127
78.72783591297403
17.23728179298918
55.8107820599928
48.672712650608986
0.0010000000000331966
27.058952638511517
68.02833313606227
18.822022537493844
45.413567277673636
45.42345206832466
19.62489493121194
44.61710566995185
22.870883374333967
29.565863874964815
34.33246812172493
45.73710667082802
37.11313707897945
2.842170943040401e-14
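On the unit question itself: svgpathtools works directly on the path coordinates, so the lengths are in SVG user units, and converting to millimeters requires the document's own scale. A minimal sketch, assuming sample.svg declares a physical width in mm together with a matching viewBox (both assumptions about the file):

import re
from svgpathtools import svg2paths2

paths, attributes, svg_attributes = svg2paths2('sample.svg')
# assumption: the SVG carries e.g. width="210mm" and viewBox="0 0 210 297"
mm_width = float(re.sub(r'[a-z%]+$', '', svg_attributes['width']))
user_width = float(svg_attributes['viewBox'].split()[2])
scale = mm_width / user_width  # millimeters per user unit
for path in paths:
    print(path.length() * scale)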

Reading channel locations in MNE Python

I am new to MNE-Python and I am working with .set files from EEGLAB (MATLAB) for source estimation analysis. The data were recorded from 66 channels (64 EEG and 2 EOG) with EasyCaps, using the 10-20 international system. In MATLAB, EEG.chanlocs correctly shows the coordinates of each electrode (labels, type, theta, radius, X, Y, Z, sph_theta, sph_phi, sph_radius, urchan, ref). But it seems that I cannot read these locations in MNE-Python.
import mne

# The .set files are imported OK
data_path = r"D:\EEGdata"
fname = data_path + r'\ppt10.set'
mydata = mne.io.read_epochs_eeglab(fname)

# The data look OK, and channel labels are correctly displayed
mydata
mydata.plot()
mydata.ch_names

# But the channel locations are not found
mydata.plot_sensors()  # RuntimeError: No valid channel positions found
Any suggestions on how to read the channel locations from the .set files? Or, alternatively, on how to manually create the locations based on the coordinates from EEG.chanlocs?
I have also tried to use the default 10-20 montage, selecting only the channels I used, but I cannot make it work.
# Create a montage based on the standard 10-20, which includes 94 electrode labels in upper case
montage = mne.channels.make_standard_montage('standard_1020')
[ch_name.upper() for ch_name in mydata.ch_names]  # correctly converts the channel labels to upper case
mydata.ch_names = [ch_name.upper() for ch_name in mydata.ch_names]  # doesn't work:
# File "<ipython-input-62-69a7053dc310>", line 1, in <module>
#   mydata.ch_names = [ch_name.upper() for ch_name in mydata.ch_names]
# AttributeError: can't set attribute
montage = mne.channels.make_standard_montage('standard_1020', mydata.ch_names)  # also fails
I also thought I could use a conversion tool to convert the .set files into .fif files. I have checked the online documentation, but I cannot find such a tool. Any ideas?
I had a similar problem that I fixed by adding a call to mydata.set_montage(montage) before running mydata.plot_sensors(). You don't need to convert the channel names to upper case, as they are case-insensitive in MNE.
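A minimal sketch of that fix, reusing the standard_1020 montage from the question (the on_missing='ignore' argument is an assumption, to skip channels, such as the EOG pair, that the standard montage does not define):

montage = mne.channels.make_standard_montage('standard_1020')
# skip channels the montage does not know about (e.g., the two EOG channels)
mydata.set_montage(montage, on_missing='ignore')
mydata.plot_sensors()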

Is there any way to use arithmetic ops on FITS files in Python?

I'm fairly new to Python, and I have been trying to recreate a working IDL program in Python, but I'm stuck and keep getting errors. I haven't been able to find a solution yet.
The program requires 4 FITS files in total (img, plus the correction images dark, flat1, and flat2). The operations are as follows:
flat12 = (flat1 + flat2)/2
img1 = (img - dark)/flat12
The files have dimensions (1024, 1024, 1). I have resized them to (1024, 1024) to even be able to use the im_show() function.
I have also tried using cv2.add(), but I get this:
TypeError: Expected Ptr<cv::UMat> for argument 'src1'
Is there any workaround for this? Thanks in advance.
To read your FITS files, use astropy.io.fits: http://docs.astropy.org/en/latest/io/fits/index.html
This will give you Numpy arrays (and FITS headers if needed; there are different ways to do this, as explained in the documentation), so you could do something like:
>>> from astropy.io import fits
>>> img = fits.getdata('image.fits', ext=0) # extension number depends on your FITS files
>>> dark = fits.getdata('dark.fits') # by default it reads the first "data" extension
>>> darksub = img - dark
>>> fits.writeto('out.fits', darksub) # save output
If your data has an extra dimension, as shown with the (1024,1024,1) shape, and if you want to remove that axis, you can use the normal Numpy array slicing syntax: darksub = img[0] - dark[0].
Otherwise, the example above will produce and save a (1024,1024,1) image.
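Putting the pieces together, a hedged sketch of the full correction from the question (the file names are assumptions, and np.squeeze drops the singleton axis wherever it sits):

import numpy as np
from astropy.io import fits

# np.squeeze turns (1024, 1024, 1) into (1024, 1024) regardless of axis order
img = np.squeeze(fits.getdata('image.fits'))
dark = np.squeeze(fits.getdata('dark.fits'))
flat1 = np.squeeze(fits.getdata('flat1.fits'))
flat2 = np.squeeze(fits.getdata('flat2.fits'))

# the arithmetic from the question, done directly on the Numpy arrays
flat12 = (flat1 + flat2) / 2
img1 = (img - dark) / flat12
fits.writeto('img1.fits', img1, overwrite=True)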

Check if a geopoint with latitude and longitude is within a shapefile

How can I check if a geopoint is within the area of a given shapefile?
I managed to load a shapefile in Python, but can't get any further.
Another option is to use Shapely (a Python library based on GEOS, the engine for PostGIS) and Fiona (which is basically for reading/writing files):
import fiona
import shapely.geometry

with fiona.open("path/to/shapefile.shp") as fiona_collection:
    # In this case, we'll assume the shapefile only has one record/layer (e.g., the shapefile
    # is just for the borders of a single country, etc.).
    shapefile_record = next(iter(fiona_collection))

    # Use Shapely to create the polygon
    shape = shapely.geometry.shape(shapefile_record['geometry'])

    point = shapely.geometry.Point(32.398516, -39.754028)  # longitude, latitude

    # Alternative: if point.within(shape)
    if shape.contains(point):
        print("Found shape for point.")
Note that doing point-in-polygon tests can be expensive if the polygon is large/complicated (e.g., shapefiles for some countries with extremely irregular coastlines). In some cases it can help to use bounding boxes to quickly rule things out before doing the more intensive test:
minx, miny, maxx, maxy = shape.bounds
bounding_box = shapely.geometry.box(minx, miny, maxx, maxy)
if bounding_box.contains(point):
    ...
Lastly, keep in mind that it takes some time to load and parse large/irregular shapefiles (unfortunately, those types of polygons are often expensive to keep in memory, too).
This is an adaptation of yosukesabai's answer.
I wanted to ensure that the point I was searching for was in the same projection system as the shapefile, so I've added code for that.
I couldn't understand why he was doing a contains test on ply = feat_in.GetGeometryRef() (in my testing things seemed to work just as well without it), so I removed that.
I've also improved the commenting to better explain what's going on (as I understand it).
#!/usr/bin/python
import sys
from osgeo import ogr, osr

drv = ogr.GetDriverByName('ESRI Shapefile')  # We will load a shape file
ds_in = drv.Open("MN.shp")  # Get the contents of the shape file
lyr_in = ds_in.GetLayer(0)  # Get the shape file's first layer

# Put the title of the field you are interested in here
idx_reg = lyr_in.GetLayerDefn().GetFieldIndex("P_Loc_Nm")

# If the latitude/longitude we're going to use is not in the projection
# of the shapefile, then we will get erroneous results.
# The following assumes that the latitude/longitude is in WGS84,
# which is identified by the number "4326", as in "EPSG:4326".
# We will create a transformation between this and the shapefile's
# projection, whatever it may be.
geo_ref = lyr_in.GetSpatialRef()
point_ref = osr.SpatialReference()
point_ref.ImportFromEPSG(4326)
ctran = osr.CoordinateTransformation(point_ref, geo_ref)

def check(lon, lat):
    # Transform incoming longitude/latitude to the shapefile's projection
    [lon, lat, z] = ctran.TransformPoint(lon, lat)
    # Create a point
    pt = ogr.Geometry(ogr.wkbPoint)
    pt.SetPoint_2D(0, lon, lat)
    # Set up a spatial filter such that the only features we see when we
    # loop through "lyr_in" are those which overlap the point defined above
    lyr_in.SetSpatialFilter(pt)
    # Loop through the overlapped features and display the field of interest
    for feat_in in lyr_in:
        print(lon, lat, feat_in.GetFieldAsString(idx_reg))

# Take command-line input and do all this
check(float(sys.argv[1]), float(sys.argv[2]))
# check(-95, 47)
Several other sites were helpful regarding the projection check (EPSG:4326).
Here is a simple solution based on pyshp and shapely.
Let's assume that your shapefile only contains one polygon (but you can easily adapt for multiple polygons):
import shapefile
from shapely.geometry import shape, Point

# read your shapefile
r = shapefile.Reader("your_shapefile.shp")

# get the shapes
shapes = r.shapes()

# build a shapely polygon from your shape
polygon = shape(shapes[0])

def check(lon, lat):
    # build a shapely point from your geopoint
    point = Point(lon, lat)

    # the contains function does exactly what you want
    return polygon.contains(point)
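A quick usage example (the coordinates are just an assumed test point):

print(check(-95.0, 47.0))  # True if the assumed point falls inside the polygon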
I did almost exactly what you are doing yesterday, using GDAL's ogr with the Python bindings. It looked like this:
from osgeo import ogr

# load the shape file as a layer
drv = ogr.GetDriverByName('ESRI Shapefile')
ds_in = drv.Open("./shp_reg/satreg_etx12_wgs84.shp")
lyr_in = ds_in.GetLayer(0)

# field index for which I want the data extracted
# ("satreg2" was what I was looking for)
idx_reg = lyr_in.GetLayerDefn().GetFieldIndex("satreg2")

def check(lon, lat):
    # create point geometry
    pt = ogr.Geometry(ogr.wkbPoint)
    pt.SetPoint_2D(0, lon, lat)
    # the spatial filter roughly subsets features, instead of going over everything
    lyr_in.SetSpatialFilter(pt)
    # go over the remaining polygons in the layer and see if one contains the point
    for feat_in in lyr_in:
        ply = feat_in.GetGeometryRef()
        # test
        if ply.Contains(pt):
            # TODO do what you need to do here
            print(lon, lat, feat_in.GetFieldAsString(idx_reg))
Check out http://geospatialpython.com/2011/01/point-in-polygon.html and http://geospatialpython.com/2011/08/point-in-polygon-2-on-line.html
One way to do this is to read the ESRI Shapefile using the OGR library, and then use the GEOS geometry library (http://trac.osgeo.org/geos/) to do the point-in-polygon test. This requires some C/C++ programming.
There is also a Python interface to GEOS at http://sgillies.net/blog/14/python-geos-module/ (which I have never used). Maybe that is what you want?
Another solution is to use the http://geotools.org/ library. That is in Java.
I also have my own Java software to do this (which you can download from http://www.mapyrus.org, plus jts.jar from http://www.vividsolutions.com/products.asp). You need only a text command file inside.mapyrus containing the following lines to check whether a point lies inside the first polygon in the ESRI Shapefile:
dataset "shapefile", "us_states.shp"
fetch
print contains(GEOMETRY, -120, 46)
And run with:
java -cp mapyrus.jar:jts-1.8.jar org.mapyrus.Mapyrus inside.mapyrus
It will print a 1 if the point is inside, 0 otherwise.
You might also get some good answers if you post this question on
https://gis.stackexchange.com/
If you want to find out which polygon (from a shapefile full of them) contains a given point (and you have a bunch of points as well), the fastest way is to use PostGIS. I actually implemented a Fiona-based version using the answers here, but it was painfully slow, even with multiprocessing and a bounding-box pre-check: 400 minutes of processing for 50k points. Using PostGIS, it took less than 10 seconds. GiST indexes are efficient!
shp2pgsql -s 4326 shapes.shp > shapes.sql
That will generate an SQL file with the information from the shapefile. Create a database with PostGIS support, run that SQL, and create a GiST index on the geom column. Then, to find the name of the polygon containing a point:
sql="SELECT name FROM shapes WHERE ST_Contains(geom,ST_SetSRID(ST_MakePoint(%s,%s),4326));"
cur.execute(sql,(x,y))
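For completeness, a hedged sketch of the surrounding psycopg2 boilerplate (the connection parameters and test coordinates are assumptions):

import psycopg2

# placeholder connection details for a local PostGIS database
conn = psycopg2.connect(dbname='gisdb', user='postgres')
cur = conn.cursor()
x, y = -95.0, 47.0  # assumed test point (longitude, latitude)
sql = "SELECT name FROM shapes WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s,%s), 4326));"
cur.execute(sql, (x, y))
print(cur.fetchall())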
