I´m currently working on a project to measure the surface of plant leaves. Until now I´ve successfully implemented an RCNN model to segment individual leaves and also generated a depth map using stereo computer vision which allows me to calculate distances between any two points.
Now I´m stuck trying to connect everything together in order to calculate the area of a leaf/polygon.
**I got original RGB images, Binary masks containing leaves, and also the depth information of every pixel.
Can someone please point me in the right direction?**
I reckon the right way would be to use Delauney triangulation on the polygons in the binary masks and then calculate the surface using the distance between the 3 points of each triangle. I haven't been able to find something quite similar to my problem which is implemented in python.
Thanks so much for your help in advance. I´ll upload a picture of an RGB image with the masks plotted.
leaf instance segmentation
Count the pixels inside the outlines (by polygon filling) or use the shoelace formula.
Related
I want to separate a noiseless 1-bit (black and white) image with white circles based on the concave part of the outline.
Please refer to the picture below.
This is the white object to separate:
The target result is:
Here is my implementation with the watershed algorithm:
The above result is not what I want.
If the size of the separated objects is similar, my algorithm is fine, but if the size difference is large, a problem occurs as shown in the picture above.
I would like to implement an opencv algorithm that can segment a region like the second picture.
However, the input photo is not necessarily a perfect circle.
It can be oval like the picture below:
Or it can be squished:
However, I would like to separate it based on the concave part of the outline anyway.
I think it can be implemented by using the distanceTransform function well, but I'm not sure how to approach it.
Please let me know which way to refer.
Thank you.
Here is an algorithm which should give you a good start.
Compute all contours.
For each contour compute the convexity defects. If there is no defect the contour is an isolated circle and you can segment it out.
After you handled all the isolated circles, you can work out the remaining contours by counting the convexity defects: the number of circles N for each contour is the number of convexity defects divided by 2.
Use a clustering algorithm (https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html should do well given the shapes you have) and cluster the "white" points using N as the number of clusters to be found.
If you want to find the minimal openings, you can use a medial axis based approach.
Pseudo code:
compute contours of bitmap
compute medial-axis of bitmap
for each point on medial-axis:
get minimal distance d from medial axis algorithm
for each local minimum of distance d:
get two points on bitmap contours with minimal distance that are at least d apart from each other
use these points for deviding line
If you need a working implementation in python, please let me know. I would use skimage lib. For other languages you might have to implement medial-axis on your own. But that shouldn't be a big deal.
I am working on an application using an IFM 3D camera to identify parts prior to a robot pickup. Currently I am able to find the centroid of these objects using contours from a depth image and from there calculate the center point of these objects in pixel space.
My next task is to then transform the 2D centroid coordinates to a 3D point in 'real' space. I am able to train the robot such that it's coordinate frame is either at the center of the image or at the traditional (0,0) point of an image (top left).
The 3D camera I am using provides both an intrinsic and extrinsic matrix. I know I need to use some combination of these matrices to project my centroid into three space but the following questions remain:
My current understanding from googling is the intrinsic matrix is used to fix lens distortion (barrel and pinhole warping, etc.) whereas the extrinsic matrix is used to project points into the real world. Is this simplified assumption correct?
How can a camera supply a single extrinsic matrix? I know traditionally these matrices are found using the checkerboard corners method but are these not dependent on the height of the camera?
Is the solution as simple as taking the 3x4 extrinsic matrix and multiplying it by a 3x1 matrix [x, y, 1] and if so, will the returned values be relative to the camera center or the traditional (0,0) point of an image.
Thanks in advance for any insight! Also if it's any consolation I am doing everything in python and openCV.
No. I suggest you read the basics in Multiple View Geometry of Hartley and Zisserman, freely available in the web. Dependent on the camera model, the intrinsics contain different parameters. For the pinhole camera model, these are the focal length and the principal point.
The only reason why you maybe could directly transform your 2D centroid to 3D is that you use a 3D camera. Read the manual of the camera, it should be explained how the relation between 2D and 3D coordinates is given for your specific model.
If you have only image data, you can only compute a 3D point from at least two views.
No, of course not. Please don't be lazy and start reading the basics about camera projection instead of asking for others to explain the common basics that are written down everywhere in the web and literature.
I am working on a project to figure out the difference between two objects and tag them with the proper model code.
I need help with a suggestion on how can we tackle such problem with image processing using OpenCV, following are the images
Till now I tried calculating black pixel difference between two images after doing binary threshold and also calculated a number of holes present on the gasket.
I also tried using feature points but it didn't worked well
what else can be done to improve the detection?
Thank you
The holes are excellent features that can be robustly detected by blob analysis.
In the first place, locate the large circle and determine its center and radius. The radius might be a first discriminant feature.
Next, establish the configuration of the screw holes around the center. You can use the distance to the center, the number of holes and the angles they define around the center.
If this is still not enough, you can register the gaskets and compare them to the models by matching the screw holes, adjusting the rotation, then comparing pixel-wise with a similarity measure such as SAD or SSD.
I'm doing image processing and mathematical morphology using scipy.ndimage and really enjoy it. Our work involves simulating charges moving through various films, and we're trying to use image analysis tools to estimate why different morphologies work better than others.
I quickly was able to use ndimage.label and distance_transform_edt to find the connected components and get sizing on them. I also implemented a breadth-first search to find minimal paths between the components and the edges, which represent electrodes.
Now, I'd like to determine "bottleneck" or "narrow channel" regions. I'm not even sure if I'm searching for the right keywords, since my expertise isn't really in image processing. I've given two examples below.. I want to find features like the red circles and count them and determine their size distributions. (Consider that charges will move more easily through wider bottlenecks.)
The problem is that I can't label these, since they're not independent components. The distance transforms give me small numbers at the edges.. I want something like the smallest distance through these bottlenecks.
Any advice where to look or general strategies?
One could use the medial axis transform to calculate the radius of a ball fit at each point in the bacl set to obtain the nooks in the image. In the following example we use the watershed of the distance function weighted by the distance function itself to obtain contours which separate minimas(the white components in the image). This thus gives a path weighted by the maximum value of the distance function separating 2 white components. I have done this in matlab but i think its easy to replicate the same in Scikit image tool box.
Image1:
Filling the holes since they aren't paths:
Distance function: (heat map)
Watershed of distance function (paths):
Watershed weighted by Distance function (final paths):
Image 2:
Distance function:
Watershed of distance function (paths):
Watershed weighted by Distance function (final paths):
Thus as demonstrated we have calculated technical a skeleton by zone of influence(SKIZ) using the watershed of the distance function(cityblock used here). One has to also note that the holes on the borders are not filled since the imfill ignores holes on borders. If its to be filled one can add a frame around so that one can use imfill to fill these later.
I'm currently involved in a hardware project where I am mapping triangular shaped LED to traditional bitmap images. I'd like to overlay a triangle vector onto an image and get the average pixel data within the bounds of that vector. However, I'm unfamiliar with the math needed to calculate this. Does anyone have an algorithm or a link that could send me in the right direction? (I tagged this as Python, which is preferred, but I'd be happy with the general algorithm!)
I've created a basic image of what I'm trying to capture here: http://imgur.com/Isjip.gif
Will this work: http://www.blackpawn.com/texts/pointinpoly/default.html ?
You can do line rasterization on the lineparts to determine for each pixel at each horizontal scanline lie within your triangle. Sum and divide their RGB values to get the average.