TLDR: I need to construct a python object for fast interior point testing, similar to a SciPy ConvexHull or DelaunayTriangulation. The catch is that I know ahead of time the order in which the triangulation of the points must be constructed: (6 points, 8 triangular faces, with a specific ordering of each face). In effect, I already know what the convex hull should be, but I need it in a form that I can use with existing (and optimised!) libraries (eg Scipy spatial). How can I do this?
Context:
I need to construct a triangular prism (imagine a Toblerone bar - 2 end faces, 6 side faces, all triangular) in order to do some interior point testing. As I will have many such prisms lying adjacent to each other (adjacent on their side faces, imagine many Toblerone bars stood on their ends and next to each other), I need to be careful to ensure that no region in space is contained by two adjacent prisms. The cross section of the prism will not generally be uniform, hence the possibility of overlap between adjacent prisms, as illustrated by this diagram of the approximately planar face between two adjacent prisms:
____
|\ /|
| \/ |
| /\ |
|/__\|
Note the two different diagonals constructed along the face - this is the problem. One prism may split the face into two triangles using the \ diagonal, and the neighbouring prism may instead use the /. In order to ensure no overlap between adjacent prisms, I need to explicitly control the order in which the triangles are formed so that they always use the same diagonal. This I can do: for each prism that I need to construct, I know ahead of time in what order the triangular faces should be constructed. Here's an illustration of two adjacent prisms, with the correct shared diagonal between them: neighbouring prisms, shared diagonal
My issue is with performing fast interior point testing with these prisms. Previously, I was using the approach linked in this answer: Delaunay(prism_points).find_simplex(test_points) >= 0. It's quick because it is using highly optimised library code, but I have no control over the construction of the triangulation, so there could be overlap.
If I construct the hulls as explicit np.array objects (vertices, faces) then I can use my own code to do the tests (there are numerous possible approaches, I'm projecting rays and testing for intersection with each triangular face). The problem is that this is around ~100x slower than the find_simplex() approach mentioned earlier. Whilst I'm sure I could get the code a bit quicker, it is worth pointing out this code is already fairly optimised from another use case with Cython - I am not sure if I can find all the extra speed I need here. As for the inevitable "do you really need the speed question", please take my word for it. This is turning a 5 minute job into many hours.
What I need is to construct an object I can use with external optimised libraries, whilst retaining control of the triangular faces. Adding extra Cython to my code is of course an option, but which such highly optimised code already out there, using that would be vastly preferable.
Thanks to anyone that can help.
Half a solution... Not an exact solution to the original question, but a different way of achieving the same outcome. Any triangular prism can be split into exactly three tetrahedra (see http://www.alecjacobson.com/weblog/?p=1888). This is a specific case of the fact that any polyhedron may be split into tetrahedra by connecting all faces to one vertex, if the faces does not already include it.
Knowing exactly how I would like the face triangles of my prism to be arranged, I can work out what three tetrahedra would reproduce the same configuration of triangles (with extra faces of course being added inside the original prism itself). I then form Delaunay triangulations around each of these three tetrahedra (ie, collections of 4 points) in turn and perform the original interior point tests: if it matches on any then I have a positive result for the whole prism. The key point is that by only giving four points to the Delaunay constructor at a time, I know exactly what triangulation it will return as there is only one way of forming such a tetrahedra (assuming no geometric degeneracy).
It's a bit longwinded, and involves 3x as many tests as I would like, but it's a start. If anyone in the future does know how I could do this better please do let me know.
Related
I'm a wee bit stuck.
I have a 3D point cloud (an array of (n,3) vertices), in which I am trying to generate a 3D triangular mesh from. So far I have had no luck.
The format my data comes in:
(x,y) values in regularly spaced (z) intervals. Think of the data as closed loop planar contours stored slice by slice in the z direction.
The vertices in my data must be absolute positions for the mesh triangles (i.e. I don't want them to be smoothed out such that the volume begins to change shape, but linear interpolation between the layers is fine).
Illustration:
Z=2. : ..x-------x... <- Contour 2
Z=1.5: ...\......|... <- Join the two contours into a mesh.
Z=1. : .....x----x... <- Contour 1
Repeat for n slices, end up with an enclosed 3D triangular mesh.
Things I have tried:
Using Open3D:
The rolling ball (pivot) method can only get 75% of the mesh completed and leaves large areas incomplete (despite a range of ball sizes). It has particular problems at the top and bottom slices where there tends to be large gaps in the middle (i.e. a flat face).
The Poisson reconstruction method smooths out the volume too much and I no longer have an accurate representation of the volume. This occurs at all depths from 3-12.
CGAL:
I cannot get this to work for the life of me. SWIG is not very good, the implementation of CGAL using SWIG is also not very good.
There are two PyBind implementations of CGAL however they have not incorporated the 3D triangulation libraries from CGAL.
Explored other modules like PyMesh, TriMesh, TetGen, Scikit-Geometry, Shapely etc. etc. I may have missed the answer somewhere along the line.
Given that my data is a list of closed-loop planar contours, it seems as though there must be some simple solution to just "joining" adjacent slice contours into one big 3d mesh. Kind of like you would in blender.
There are non-python solutions (like MeshLab) that may well solve these problems, but I require a python solution. Does anyone have any ideas? I've had a bit of a look into VTK and ITK but haven't found exactly what I'm looking for as of yet.
I'm also starting to consider that maybe I can interpolate intermediate contours between slices, and fill the contours on the top and bottom with vertices to make the data a bit more "pivot ball" method friendly.
Thank you in advance for any help, it is appreciated.
If there is a good way of doing this that isn't coded yet, I promise to code it and make it available for people in my situation :)
Actually there are two ways of having meshlab functionality in python:
The first is MeshLabXML (https://github.com/3DLIRIOUS/MeshLabXML ) a third party, is a Python scripting interface to meshlab scripting interface
the second is PyMeshLab (https://github.com/cnr-isti-vclab/PyMeshLab ) an ongoing effort done by the MeshLab authors, (currently in alpha stage) to have a direct Python bindings to all the meshlab filters
There is a very neat paper titled "Technical Note: an algorithm and software for conversion of radiotherapy contour‐sequence data to ready‐to‐print 3D structures" in the Journal of Medical Physics that describes this problem quite nicely. No python packages are required, however it is more easily implemented with numpy. No need for any 3D packages.
A useful excerpt is provided:
...
The number of slices (2D contours) constituting the specified structure is determined.
The number of points in each slice is determined.
Cartesian coordinates of each of the points in each slice are extracted and stored within dedicated data structures...
Numbers of points in each slice (curve) are re‐arranged in such a way, that the starting points (points with indices 0) are the closest points between the subsequent slices. Renumeration starts at point 0, slice 0 (slice with the lowest z coordinate).
Orientation (i.e., the direction determined by the increasing indices of points with relation to the interior/exterior of the curve) of each curve is determined. If differences between slices are found, numbering of points in non‐matching curves (and thus, orientation) is reversed.
The lateral surface of the considered structure is discretized. Points at the neighboring layers are arranged into threes, constituting triangular facets for the STL file. For each triangle the closest points with the subsequent indices from each layer are connected.
Lower and upper base surfaces of the considered structure are discretized. The program iterates over every subsequent three points on the curve and checks if they belong to a convex part of the edge. If yes, they are connected into a facet, and the middle point is removed from further iterations.
So basically it's a problem of aligning datasets in each slice to the nearest value of each slice. Then aligning the orientation of each contour. Then joining the points between two layers based on distance.
The paper also provides code to do this (for a DICOM file), however I re-wrote it myself and it works a charm.
I hope this helps others! Make sure you credit the author's in any work you do that uses this.
A recent feature of pymadcad can do things like this, not sure through if it fits your exact expectation in term of "pivot ball" or such things, checkout the doc for blending
Starting from a list of outlines, it can generate blended surfaces to join them:
For your purpose, I guess the best is one of:
blendpair(line1, line2)
junction(*lines)
I split de world in X random polygons.
Then I am given a coordinate C1, for instance (-21.45, 7.10), and I want to attribute the right polygon to this coordinate.
The first solution is to apply my ‘point_in_polygon’ algorithm (given a set of coordinates that defines a polygon and a coordinate that defines a point, tell me if the point is inside or not) on each polygon until I find the right one.
But that is very expensive if I have a lot of points to put in a lot of polygons.
An improvement on that relies on the following idea:
To optimise the search, I create a grid (a collection) with a step n, k where I already attribute each pair of coordinates such that:
for i=-180 to 180 step n
for j = -90 to 90 step k
grid.add(i,j)
Then I create a dictionary, and for each pair in the collection I find the corresponding polygon
For each g in grid
For each p in polygons
If point_in_polygon(g,p) == True
my_dict(g) = p
Then, when I receive C1, I look for the closest coordinate in my grid, let’s say g1.
Thanks to my_dict, I can get quickly p1 = my_dict(g1)
Then I compute point_in_polygon(C1, p1) which is likely to be true. If it’s not, I find the closest g which is assigned to a different polygon, and I redo a test. Etc. until I have found the right polygon.
Now, the question is: what is the optimal n, k to create the grid?
So that I can find the right polygon in the minimum number of steps.
I don’t want it too low, because the search of the closest g which is assigned to a different polygon might be expensive.
I don’t want it too high as well, because then I might be missing some polygons and then the search never converges.
My intuition is that the smallest polygon is going to give the steps.
I am not sure if this is a programming problem, a maths problem, or just something I can find empirically, that's why I ask it here.
Any inputs appreciated!
Let me suggest a slight modification to your grid. Currently, you store for each cell the polygon that the cell's center belongs to. Instead, store all the polygons that overlap the cell. Then, whenever you see that a cell has only a single overlapping polygon, you don't need to do any inclusion testing. The grid can be built by methods of conservative rasterization (note that the referenced article is not focused on conservative but rather general rasterization).
The efficiency of your grid correlates with the ratio of single-polygon cells and total cells (because this is the probability of not having to perform polygon-inclusion tests). The storage itself is pretty cheap. You can use a dense array and get constant access to the cells. Hence, from a theoretical point of view, you should have as many cells as possible (because as you have more cells, the single-polygon cell ratio increases). In practice, you might find that cache and other memory effects might make large grids impractical. However, there is no good way to know other than test. So, just try with a couple of sizes on a few different machines and try to find a good fit.
If I had to guess, I would say that your cells should be square and have an area of about 1% - 5% of the average polygon area. Also, more compact polygons can be handled more efficiently than many long and thin polygons.
Pick any point and draw a line straight down from that point. The first polygon edge you hit tells you what polygon the point is in.
So, if you don't want to do polygon tests, then instead of dividing the space into a regular grid, first cut it into strips with vertical cuts that go through all polygon intersections.
Now, within each strip none of the polygon edges cross or end, so you can make an ordered list of all those edges from bottom to top.
If you want to find the polygon that contains a point, then, do a binary search using the x coordinate to find the proper strip. Then in the list of edges that span the strip, you can do a binary search using the y coordinate to find the closest one underneath the point, and that tells you what polygon the point is in.
Google 'trapezoidal decomposition' to find lots of information about similar techniques.
I need help in python coding an algorithm capable of detecting the corners of a image. I have a thresholded image so far and I was using cornerHarris from opencv to detect all the corners. My problem is filtrating all those points to output only the ones I desired. Maybe I can do a loop to achieve this?
In my case, I want the two lowest corners and the two highest corners points. My main interest is to obtain the pixel coordinates of this corners. You can see an example of a image I'm processing here:
In this image I draw the corners points I'm interested in.
There are several ways to solve this problem. In real-world applications it's rare (that is, actually never occurs) that you need to solve a problem once for a single image. If you have additional images it would be nice to see how much the object of interest varies.
One method to find corners is the convex hull. This method is more generally used to find a convex shape encompassing scattered points, but it's worth knowing about and implementing.
https://en.wikipedia.org/wiki/Convex_hull
What's handy about the convex hull is that the concept of a "corner" (a vertex on the convex hull polygon) is easy to grasp and doesn't rely on parameter settings. You don't have to consider whether a corner is sharp enough, strong enough, pointy enough, unique in its neighborhood, etc.--the convex hull will simply make sense to you.
You should be able to write a functional version of a convex hull "gift wrapping" algorithm in a reasonable period of time.
https://en.wikipedia.org/wiki/Gift_wrapping_algorithm
There are many ways to compute the convex hull, but don't get lost in all the different methods. Choose one that makes sense to you and implement it. The fastest known method may still be Seidel, but don't even think about running down that rabbit hole. Simple is good.
Before you compute the convex hull, you'll need to reduce your white shape to edge points; otherwise the hull algorithm will check far too many points. Reducing the number of points to be considered can be done using edge-finding on the connected component (the white "blob"), edge-finding without first segmenting foreground from background, or any of various simple kernels (e.g. Sobel).
Although the algorithm is called the "convex" hull, your shape doesn't have to be convex, especially if you're only interested in the top and bottom vertices/corners as shown in your sample image.
Corner finders can be a bit disappointing, frankly, especially since the name implies, "Hey, it'll just find corners all the time." There are some good ones out there, but you could spend a lot of time investigating all the alternatives. Even then you'll likely have to set thresholds, consider whether your application will yield the occasional weird result given the shape and scale of corners, and so on.
Although you mention wanting to find only the top and bottom points, if you wanted to find those two odd triangular outcroppings on the left side the corner-finding gets a little more complicated; using the convex hull keeps this very simple.
Although you want to find a robust solution to corner detection, preferably using a known algorithm for which performance can be understood easily, you also want to avoid overgeneralizing. In any case, review some list of corner detectors and see what strikes your fancy. If you see a promising algorithm that looks easy-ish to implement, why not try implementing it?
https://en.wikipedia.org/wiki/Corner_detection
It is pretty easy to split a rectangle/square into smaller regions and enforce a maximum area of each sub-region. You can just divide the region into regions with sides length sqrt(max_area) and treat the leftovers with some care.
With a quadrilateral however I am stumped. Let's assume I don't know the angle of any of the corners. Let's also assume that all four points are on the same plane. Also, I don't need for the the small regions to be all the same size. The only requirement I have is that the area of each individual region is less than the max area.
Is there a particular data structure I could use to make this easier?
Is there an algorithm I'm just not finding?
Could I use quadtrees to do this? I'm not incredibly versed in trees but I do know how to implement the structure.
I have GIS work in mind when I'm doing this, but I am fairly confident that that will have no impact on the algorithm to split the quad.
You could recursively split the quad in half on the long sides until the resulting area is small enough.
If your quadrilateral is convex, then in fact you can split it into two equal-area pieces which at the same time have equal perimeters! This is called a fair partitioning, and is described at The Open Problems Project (it is open for larger number of pieces, but solved for two pieces).
For nonconvex quadrilaterals, it is not difficult to find a line to partition it into
two equal pieces.
I believe this will work: Pass a line through the one
reflex vertex, and spin it about that vertex until it partitions the area equally.
The same method works for convex polygons, if your only goal is to partition the area into
two equal halves.
The generic problem (for arbitrary polygons) goes under the name of
"ham-sandwich sectioning of polygons." In fact, I wrote a paper with that exact title.
I have a large collection of rectangles, all of the same size. I am generating random points that should not fall in these rectangles, so what I wish to do is test if the generated point lies in one of the rectangles, and if it does, generate a new point.
Using R-trees seem to work, but they are really meant for rectangles and not points. I could use a modified version of a R-tree algorithm which works with points too, but I'd rather not reinvent the wheel, if there is already some better solution. I'm not very familiar with data-structures, so maybe there already exists some structure that works for my problem?
In summary, basically what I'm asking is if anyone knows of a good algorithm, that works in Python, that can be used to check if a point lies in any rectangle in a given set of rectangles.
edit: This is in 2D and the rectangles are not rotated.
This Reddit thread addresses your problem:
I have a set of rectangles, and need to determine whether a point is contained within any of them. What are some good data structures to do this, with fast lookup being important?
If your universe is integer, or if the level of precision is well known and is not too high, you can use abelsson's suggestion from the thread, using O(1) lookup using coloring:
As usual you can trade space for
time.. here is a O(1) lookup with very
low constant. init: Create a bitmap
large enough to envelop all rectangles
with sufficient precision, initialize
it to black. Color all pixels
containing any rectangle white. O(1)
lookup: is the point (x,y) white? If
so, a rectangle was hit.
I recommend you go to that post and fully read ModernRonin's answer which is the most accepted one. I pasted it here:
First, the micro problem. You have an
arbitrarily rotated rectangle, and a
point. Is the point inside the
rectangle?
There are many ways to do this. But
the best, I think, is using the 2d
vector cross product. First, make sure
the points of the rectangle are stored
in clockwise order. Then do the vector
cross product with 1) the vector
formed by the two points of the side
and 2) a vector from the first point
of the side to the test point. Check
the sign of the result - positive is
inside (to the right of) the side,
negative is outside. If it's inside
all four sides, it's inside the
rectangle. Or equivalently, if it's
outside any of the sides, it's outside
the rectangle. More explanation here.
This method will take 3 subtracts per
vector * times 2 vectors per side,
plus one cross product per side which
is three multiplies and two adds. 11
flops per side, 44 flops per
rectangle.
If you don't like the cross product,
then you could do something like:
figure out the inscribed and
circumscribed circles for each
rectangle, check if the point inside
the inscribed one. If so, it's in the
rectangle as well. If not, check if
it's outside the circumscribed
rectangle. If so, it's outside the
rectangle as well. If it falls between
the two circles, you're f****d and you
have to check it the hard way.
Finding if a point is inside a circle
in 2d takes two subtractions and two
squarings (= multiplies), and then you
compare distance squared to avoid
having to do a square root. That's 4
flops, times two circles is 8 flops -
but sometimes you still won't know.
Also this assumes that you don't pay
any CPU time to compute the
circumscribed or inscribed circles,
which may or may not be true depending
on how much pre-computation you're
willing to do on your rectangle set.
In any event, it's probably not a
great idea to test the point against
every rectangle, especially if you
have a hundred million of them.
Which brings us to the macro problem.
How to avoid testing the point against
every single rectangle in the set? In
2D, this is probably a quad-tree
problem. In 3d, what generic_handle
said - an octree. Off the top of my
head, I would probably implement it as
a B+ tree. It's tempting to use d = 5,
so that each node can have up to 4
children, since that maps so nicely
onto the quad-tree abstraction. But if
the set of rectangles is too big to
fit into main memory (not very likely
these days), then having nodes the
same size as disk blocks is probably
the way to go.
Watch out for annoying degenerate
cases, like some data set that has ten
thousand nearly identical rectangles
with centers at the same exact point.
:P
Why is this problem important? It's
useful in computer graphics, to check
if a ray intersects a polygon. I.e.,
did that sniper rifle shot you just
made hit the person you were shooting
at? It's also used in real-time map
software, like say GPS units. GPS
tells you the coordinates you're at,
but the map software has to find where
that point is in a huge amount of map
data, and do it several times per
second.
Again, credit to ModernRonin...
For rectangles that are aligned with the axes, you only need two points (four numbers) to identify the rectangle - conventionally, bottom-left and top-right corners. To establish whether a given point (Xtest, Ytest) overlaps with a rectangle (XBL, YBL, XTR, YTR) by testing both:
Xtest >= XBL && Xtest <= XTR
Ytest >= YBL && Ytest <= YTR
Clearly, for a large enough set of points to test, this could be fairly time consuming. The question, then, is how to optimize the testing.
Clearly, one optimization is to establish the minimum and maximum X and Y values for the box surrounding all the rectangles (the bounding box): a swift test on this shows whether there is any need to look further.
Xtest >= Xmin && Xtest <= Xmax
Ytest >= Ymin && Ytest <= Ymax
Depending on how much of the total surface area is covered with rectangles, you might be able to find non-overlapping sub-areas that contain rectangles, and you could then avoid searching those sub-areas that cannot contain a rectangle overlapping the point, again saving comparisons during the search at the cost of pre-computation of suitable data structures. If the set of rectangles is sparse enough, there may be no overlapping, in which case this degenerates into the brute-force search. Equally, if the set of rectangles is so dense that there are no sub-ranges in the bounding box that can be split up without breaking rectangles.
However, you could also arbitrarily break up the bounding area into, say, quarters (half in each direction). You would then use a list of boxes which would include more boxes than in the original set (two or four boxes for each box that overlapped one of the arbitrary boundaries). The advantage of this is that you could then eliminate three of the four quarters from the search, reducing the amount of searching to be done in total - at the expense of auxilliary storage.
So, there are space-time trade-offs, as ever. And pre-computation versus search trade-offs. If you are unlucky, the pre-computation achieves nothing (for example, there are two boxes only, and they don't overlap on either axis). On the other hand, it could achieve considerable search-time benefit.
I suggest you take a look at BSP trees (and possible quadtrees or octrees, links available on that page as well). They are used to partition the whole space recursively and allow you to quickly check for a point which rectangles you need to check at all.
At minimum you just have one huge partition and need to check all rectangles, at maximum your partitions get so small, that they get down to the size of single rectangles. Of course the more fine-grained the partition, the longer you need to walk down the tree in order to find the rectangles you want to check.
However, you can freely decide how many rectangles are suitable to be checked for a point and then create the corresponding structure.
Pay attention to overlapping rectangles though. As the BSP tree needs to be precomputed anyways, you may as well remove overlaps during that time, so you can get clear partitions.
Your R-tree approach is the best approach I know of (that's the approach I would choose over quadtrees, B+ trees, or BSP trees, as R-trees seem convenient to build in your case). Caveat: I'm no expert, even though I remember a few things from my senior year university class of algorithmic!
Why not try this. It seems rather light on both computation and memory.
Consider the projections of all the rectangles onto the base line of your space. Denote that set of line intervals as
{[Rl1, Rr1], [Rl2, Rr2],..., [Rln, Rrn]}, ordered by increasing left coordinates.
Now suppose your point is (x, y), start a search at the left of this set until you reach a line interval that contains the point x.
If none does, your point (x,y) is outside all rectangles.
If some do, say [Rlk, Rrk], ..., [Rlh, Rrh], (k <= h) then just check whether y is within the vertical extent of any of these rectangles.
Done.
Good luck.
John Doner