Zero keypoints detection using ORB descriptor - python

I am trying to calculate ORB (Oriented FAST and Rotated BRIEF) features for a database of images. The nexr task is to use a Bag Of Words approach in order to calculate the final features of images. My problem is that in some cases I get 0 keypoints from images of the database (either in ORB or in BRISK implementation). My code is from here.
img = cv2.imread('D:/_DATABASES/clothes_second/striped_141.descr',0)
orb = cv2.ORB()
kp = orb.detect(img,None)
kp, des = orb.compute(img, kp)
img2 = cv2.drawKeypoints(img,kp,color=(0,255,0), flags=0)
plt.imshow(img2),plt.show()
What could be done here, at least orb find one keypoint? How is it possible to use dense sampling for those cases?

You can use a dense feature detector, like the one implemented in C++: http://docs.opencv.org/modules/features2d/doc/common_interfaces_of_feature_detectors.html#densefeaturedetector
The thing is, I'm not sure if that has been ported to python yet. But, since the algorithm is not so hard, you could implement it yourself. Here is the implementation in C++:
void DenseFeatureDetector::detectImpl( const Mat& image, vector<KeyPoint>& keypoints, const Mat& mask ) const
{
float curScale = static_cast<float>(initFeatureScale);
int curStep = initXyStep;
int curBound = initImgBound;
for( int curLevel = 0; curLevel < featureScaleLevels; curLevel++ )
{
for( int x = curBound; x < image.cols - curBound; x += curStep )
{
for( int y = curBound; y < image.rows - curBound; y += curStep )
{
keypoints.push_back( KeyPoint(static_cast<float>(x), static_cast<float>(y), curScale) );
}
}
curScale = static_cast<float>(curScale * featureScaleMul);
if( varyXyStepWithScale ) curStep = static_cast<int>( curStep * featureScaleMul + 0.5f );
if( varyImgBoundWithScale ) curBound = static_cast<int>( curBound * featureScaleMul + 0.5f );
}
KeyPointsFilter::runByPixelsMask( keypoints, mask );
}
However, as you will notice, this implementation does not deal with the angle of the keypoints. That can be a problem if your images have rotation.

Related

How to get all pixels in mask in C++

In python, we can use such code to fetch all pixels under mask:
src_img = cv2.imread("xxx")
mask = src_img > 50
fetch = src_img[mask]
what we get is a ndarray including all pixels matching condition mask. How to implement the same function using C++opencv ?
I've found that copyTo can select pixels under specified mask, but it can only copy those pixels to another Mat instead of what python did.
This is not that straightforward in C++ (as expected). That operation breaks down in further, smaller operations. One way to achieve a std::vector with the same pixel values above your threshold is this, I'm using this test image:
// Read the input image:
std::string imageName = "D://opencvImages//grayDog.png";
cv::Mat inputImage = cv::imread( imageName );
// Convert BGR to Gray:
cv::Mat grayImage;
cv::cvtColor( inputImage, grayImage, cv::COLOR_RGB2GRAY );
cv::Mat mask;
int thresholdValue = 50;
cv::threshold( grayImage, mask, thresholdValue, 255, cv::THRESH_BINARY );
The above bit just creates a cv::Mat where each pixel above the threshold is drawn with a value of 255, 0 otherwise. It is (one possible) equivalent of mask = src_img > 50. Now, let's mask the original grayscale image with this mask. Think about an element-wise multiplication between the two cv::Mats. One possible way is this:
// Create grayscale mask:
cv::Mat output;
grayImage.copyTo( output, mask );
Now we have the original pixel values and everything else is zero. Convenient, because we can find now the locations of the non-zero pixels:
// Locate the non-zero pixel values:
std::vector< cv::Point > pixelLocations;
cv::findNonZero( output, pixelLocations );
Alright, we have a std::vector of cv::Points that locate each non-zero pixel. We can use this info to index the original grayscale pixels in the original matrix:
// Extract each pixel value using its location:
std::vector< int > pixelValues;
int totalPoints = (int)pixelLocations.size();
for( int i = 0; i < totalPoints; i++ ){
// Get pixel location:
cv::Point currentPoint = pixelLocations[i];
// Get pixel value:
int currentPixel = (int)grayImage.at<uchar>( currentPoint );
pixelValues.push_back( currentPixel );
// Print info:
std::cout<<"i: "<<i<<" currentPoint: "<<currentPoint<<" pixelValue: "<<currentPixel<<std::endl;
}
You end up with pixelValues, which is a std::vector containing a list of all the pixels that are above your threshold.
Why do you hate writing loop?
I think this is the easiest way:
cv::Mat Img = ... //Where, this Img is 8UC1
// * In this sample, extract the pixel positions
std::vector< cv::Point > ResultData;
const unsigned char Thresh = 50;
for( int y=0; y<Img.rows; ++y )
{
const unsigned char *p = Img.ptr<unsigned char>(y);
for( int x=0; x<Img.cols; ++x, ++p )
{
if( *p > Thresh )
{//Here, pick up this pixel's info you want.
ResultData.emplace_back( x,y );
}
}
}
Because I received a nervous complaint, I add an example of collecting values.
In the following example, a mask image Mask is input to the process.
cv::Mat Img = ... //Where, this Img is 8UC1
cv::Mat Mask = ...; //Same size as Img, 8UC1
std::vector< unsigned char > ResultData; //collect pixel values
for( int y=0; y<Img.rows; ++y )
{
const unsigned char *p = Img.ptr<unsigned char>(y);
const unsigned char *m = Mask.ptr<unsigned char>(y);
for( int x=0; x<Img.cols; ++x, ++p, ++m )
{
if( *m ){ ResultData.push_back( *p ); }
}
}

Reusing models from grabcut in OpenCV

I used the interactive grabcut.py from the OpenCV samples to segment an image and saved the foreground and background models. Then I used these models to segment more images of the same kind, as I don't want to retrain the model each time.
After running the grabcut algorithm, the mask is all zeros (all background) and therefore it doesn't segment anything.
from matplotlib import pyplot as plt
import numpy as np
import cv2
img = cv2.imread('usimg1.jpg')
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.load('bgdmodel.npy')
fgdModel = np.load('fgdmodel.npy')
cv2.grabCut(img, mask, None, bgdModel, fgdModel, 5, cv2.GC_EVAL)
mask = np.where((mask==2) | (mask==0), 0, 1).astype('uint8')
img = img * mask[:, :, np.newaxis]
plt.imshow(img)
plt.show()
I tried to initialize the algorithm with a mask or a rectangle but this produces an error because the models are not empty (which is what I actually want).
How do I have to pass the pre-trained models to the algorithm, such that they are not retrained from scratch each time I'm segmenting an image?
EDIT
After rayryeng's comment I implemented following code:
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 0, cv2.GC_INIT_WITH_RECT)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 2, cv2.GC_EVAL)
It seems to work but the first call now changes my model. In the source code it calls learnGMMs without checking whether a pretrained model is provided.
You have the correct line of thinking where you use cv2.GC_EVAL so that you only need to perform the segmentation without having to compute the models again.
Unfortunately even when you use this flag, this is a limitation with the OpenCV source itself. If you look at the actual C++ implementation when you encounter the GC_EVAL condition, it does this towards the end of the cv::grabcut method. Note that the Python cv2.grabCut method is a wrapper for cv::grabcut:
if( mode == GC_EVAL )
checkMask( img, mask );
const double gamma = 50;
const double lambda = 9*gamma;
const double beta = calcBeta( img );
Mat leftW, upleftW, upW, uprightW;
calcNWeights( img, leftW, upleftW, upW, uprightW, beta, gamma );
for( int i = 0; i < iterCount; i++ )
{
GCGraph<double> graph;
assignGMMsComponents( img, mask, bgdGMM, fgdGMM, compIdxs );
learnGMMs( img, mask, compIdxs, bgdGMM, fgdGMM );
constructGCGraph(img, mask, bgdGMM, fgdGMM, lambda, leftW, upleftW, upW, uprightW, graph );
estimateSegmentation( graph, mask );
}
You'll see that GC_EVAL is only encountered once in the code and that's to check the validity of the inputs. The culprit is the learnGMMs function. Even though you specified the trained models, these get reset because the call to learnGMMs ignores the GC_EVAL flag, so this gets called regardless of whatever flag you specify as the input.
Inspired by this post: OpenCV - GrabCut with custom foreground/background models, what you can do is you'll have to modify the OpenCV source yourself and inside the loop you can place an if statement to check for the GC_EVAL flag prior to calling learnGMMs:
if( mode == GC_EVAL )
checkMask( img, mask );
const double gamma = 50;
const double lambda = 9*gamma;
const double beta = calcBeta( img );
Mat leftW, upleftW, upW, uprightW;
calcNWeights( img, leftW, upleftW, upW, uprightW, beta, gamma );
for( int i = 0; i < iterCount; i++ )
{
GCGraph<double> graph;
assignGMMsComponents( img, mask, bgdGMM, fgdGMM, compIdxs );
if (mode != GC_EVAL) // New
learnGMMs( img, mask, compIdxs, bgdGMM, fgdGMM );
constructGCGraph(img, mask, bgdGMM, fgdGMM, lambda, leftW, upleftW, upW, uprightW, graph );
estimateSegmentation( graph, mask );
}
This should be able to use the pre-trained models without having to learn them all over again at each iteration. Once you make the change, you'll have to recompile the source again and that should hopefully be able to use your pre-trained models without clearing them when you use the cv2.GC_EVAL flag.
For the future, I have opened up a issue on the official repo for OpenCV. Hopefully they'll fix this when they have the time: https://github.com/opencv/opencv/issues/9191

To find the number of circles in an image using OpenCV

I have an image as below :
Can anyone tell me how to detect the number of circles in it.I'm using Hough circle transform to achieve this and this is my code:
# import the necessary packages
import numpy as np
import sys
import cv2
# load the image, clone it for output, and then convert it to grayscale
image = cv2.imread(str(sys.argv[1]))
output = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# detect circles in the image
circles = cv2.HoughCircles(gray, cv2.cv.CV_HOUGH_GRADIENT, 1.2, 5)
no_of_circles = 0
# ensure at least some circles were found
if circles is not None:
# convert the (x, y) coordinates and radius of the circles to integers
circles = np.round(circles[0, :]).astype("int")
no_of_circles = len(circles)
# loop over the (x, y) coordinates and radius of the circles
for (x, y, r) in circles:
# draw the circle in the output image, then draw a rectangle
# corresponding to the center of the circle
cv2.circle(output, (x, y), r, (0, 255, 0), 4)
cv2.rectangle(output, (x - 5, y - 5), (x + 5, y + 5), (0, 128, 255), -1)
# show the output image
cv2.imshow("output", np.hstack([image, output]))
print 'no of circles',no_of_circles
I'm getting wrong answers for this code.Can anyone tell me where I went wrong?
i tried a tricky way to detect all circles.
i found HoughCircles parameters manually
HoughCircles( src_gray, circles, HOUGH_GRADIENT, 1, 50, 40, 46, 0, 0 );
the tricky part is
flip( src, flipped, 1 );
hconcat( src,flipped, flipped );
hconcat( flipped, src, src );
flip( src, flipped, 0 );
vconcat( src,flipped, flipped );
vconcat( flipped, src, src );
flip( src, src, -1 );
will create a model like below before detection.
the result is like this
the c++ code can be easily converted to python
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
using namespace std;
using namespace cv;
int main(int argc, char** argv)
{
Mat src, src_gray, flipped, display;
if (argc < 2)
{
std::cerr<<"No input image specified\n";
return -1;
}
// Read the image
src = imread( argv[1], 1 );
if( src.empty() )
{
std::cerr<<"Invalid input image\n";
return -1;
}
flip( src, flipped, 1 );
hconcat( src,flipped, flipped );
hconcat( flipped, src, src );
flip( src, flipped, 0 );
vconcat( src,flipped, flipped );
vconcat( flipped, src, src );
flip( src, src, -1 );
// Convert it to gray
cvtColor( src, src_gray, COLOR_BGR2GRAY );
// Reduce the noise so we avoid false circle detection
GaussianBlur( src_gray, src_gray, Size(9, 9), 2, 2 );
// will hold the results of the detection
std::vector<Vec3f> circles;
// runs the actual detection
HoughCircles( src_gray, circles, HOUGH_GRADIENT, 1, 50, 40, 46, 0, 0 );
// clone the colour, input image for displaying purposes
display = src.clone();
Rect rect_src(display.cols / 3, display.rows / 3, display.cols / 3, display.rows / 3 );
rectangle( display, rect_src, Scalar(255,0,0) );
for( size_t i = 0; i < circles.size(); i++ )
{
Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
int radius = cvRound(circles[i][2]);
Rect r = Rect( center.x-radius, center.y-radius, radius * 2, radius * 2 );
Rect intersection_rect = r & rect_src;
if( intersection_rect.width * intersection_rect.height > r.width * r.height / 3 )
{
// circle center
circle( display, center, 3, Scalar(0,255,0), -1, 8, 0 );
// circle outline
circle( display, center, radius, Scalar(0,0,255), 3, 8, 0 );
}
}
// shows the results
imshow( "results", display(rect_src));
// get user key
waitKey();
return 0;
}
This SO post describes detection of semi-circles, and may be a good start for you:
Detect semi-circle in opencv
If you get stuck in OpenCV, try coding up the solution yourself. Writing a Hough circle finder parameterized for your particular application is relatively straightforward. If you write application-specific Hough algorithms a few times, you should be able to write a reasonable solution in less time than it takes to sort through a bunch of google results, decipher someone else's code, and so on.
You definitely don't need Canny edge detection for an image like this, but it won't hurt.
Other libraries (esp. commercial ones) will allow you to set more parameters for Hough circle finding. I would've expected some overload of the HoughCircle function to allow a struct of search parameters to be passed in, including the minimum percentage of circle completeness (arc length) allowed.
Although it's good to learn both RANSAC and Hough techniques--and, over time, more exotic techniques--I wouldn't necessarily recommend using RANSAC when you have circles defined so nicely and crisply. Without offering specific evidence, I'll just claim that fiddling with RANSAC parameters may be less intuitive than fiddling with Hough parameters.
HoughCircles needs some parameter tuning to work properly.
It could be that in your case the default values of Param1 and Param2 (set to 100) are not good.
You can fine tune your detection with HoughCircle, by computing the ultimate eroded. It will give you the number of circles in your image.
If there are only circles and background on the input you can count the number of connected components and ignore the component associated with background. This will be the simplest and most robust solution

How to adapt or resize a rectangle inside an object without including (or with a few numbers) of background pixels?

After I applied thresholding and finding the contours of the object, I used the following code to get the straight rectangle around the object (or the rotated rectangle inputting its instruction):
img = cv2.imread('image.png')
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,cv2.THRESH_BINARY)
# find contours
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
# straight rectangle
x,y,w,h = cv2.boundingRect(cnt)
img= cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
see the image
Then I have calculated the number of object and background pixels inside the straight rectangle using the following code:
# rectangle area (total number of object and background pixels inside the rectangle)
area_rect = w*h
# white or object pixels (inside the rectangle)
obj = cv2.countNonZero(imgray)
# background pixels (inside the rectangle)
bac = area_rect - obj
Now I want to adapt the rectangle of the object as a function of the relationship between the background pixel and those of the object, ie to have a rectangle occupying the larger part of the object without or with less background pixel, for example:
How do I create this?
This problem can be stated as the find the largest rectangle inscribed in a non-convex polygon.
An approximate solution can be found at this link.
This problem can be formulated also as: for each angle, find the largest rectangle containing only zeros in a matrix, explored in this SO question.
My solution is based on this answer. This will find only axis aligned rectangles, so you can easily rotate the image by a given angle and apply this solution for every angle.
My solution is C++, but you can easily port it to Python, since I'm using mostly OpenCV function, or adjust the solution in the above mentioned answer accounting for rotation.
Here we are:
#include <opencv2\opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;
// https://stackoverflow.com/a/30418912/5008845
Rect findMinRect(const Mat1b& src)
{
Mat1f W(src.rows, src.cols, float(0));
Mat1f H(src.rows, src.cols, float(0));
Rect maxRect(0,0,0,0);
float maxArea = 0.f;
for (int r = 0; r < src.rows; ++r)
{
for (int c = 0; c < src.cols; ++c)
{
if (src(r, c) == 0)
{
H(r, c) = 1.f + ((r>0) ? H(r-1, c) : 0);
W(r, c) = 1.f + ((c>0) ? W(r, c-1) : 0);
}
float minw = W(r,c);
for (int h = 0; h < H(r, c); ++h)
{
minw = min(minw, W(r-h, c));
float area = (h+1) * minw;
if (area > maxArea)
{
maxArea = area;
maxRect = Rect(Point(c - minw + 1, r - h), Point(c+1, r+1));
}
}
}
}
return maxRect;
}
RotatedRect largestRectInNonConvexPoly(const Mat1b& src)
{
// Create a matrix big enough to not lose points during rotation
vector<Point> ptz;
findNonZero(src, ptz);
Rect bbox = boundingRect(ptz);
int maxdim = max(bbox.width, bbox.height);
Mat1b work(2*maxdim, 2*maxdim, uchar(0));
src(bbox).copyTo(work(Rect(maxdim - bbox.width/2, maxdim - bbox.height / 2, bbox.width, bbox.height)));
// Store best data
Rect bestRect;
int bestAngle = 0;
// For each angle
for (int angle = 0; angle < 90; angle += 1)
{
cout << angle << endl;
// Rotate the image
Mat R = getRotationMatrix2D(Point(maxdim,maxdim), angle, 1);
Mat1b rotated;
warpAffine(work, rotated, R, work.size());
// Keep the crop with the polygon
vector<Point> pts;
findNonZero(rotated, pts);
Rect box = boundingRect(pts);
Mat1b crop = rotated(box).clone();
// Invert colors
crop = ~crop;
// Solve the problem: "Find largest rectangle containing only zeros in an binary matrix"
// https://stackoverflow.com/questions/2478447/find-largest-rectangle-containing-only-zeros-in-an-n%C3%97n-binary-matrix
Rect r = findMinRect(crop);
// If best, save result
if (r.area() > bestRect.area())
{
bestRect = r + box.tl(); // Correct the crop displacement
bestAngle = angle;
}
}
// Apply the inverse rotation
Mat Rinv = getRotationMatrix2D(Point(maxdim, maxdim), -bestAngle, 1);
vector<Point> rectPoints{bestRect.tl(), Point(bestRect.x + bestRect.width, bestRect.y), bestRect.br(), Point(bestRect.x, bestRect.y + bestRect.height)};
vector<Point> rotatedRectPoints;
transform(rectPoints, rotatedRectPoints, Rinv);
// Apply the reverse translations
for (int i = 0; i < rotatedRectPoints.size(); ++i)
{
rotatedRectPoints[i] += bbox.tl() - Point(maxdim - bbox.width / 2, maxdim - bbox.height / 2);
}
// Get the rotated rect
RotatedRect rrect = minAreaRect(rotatedRectPoints);
return rrect;
}
int main()
{
Mat1b img = imread("path_to_image", IMREAD_GRAYSCALE);
// Compute largest rect inside polygon
RotatedRect r = largestRectInNonConvexPoly(img);
// Show
Mat3b res;
cvtColor(img, res, COLOR_GRAY2BGR);
Point2f points[4];
r.points(points);
for (int i = 0; i < 4; ++i)
{
line(res, points[i], points[(i + 1) % 4], Scalar(0, 0, 255), 2);
}
imshow("Result", res);
waitKey();
return 0;
}
The result image is:
NOTE
I'd like to point out that this code is not optimized, so it can probably perform better. For an approximized solution, see here, and the papers reported there.
This answer to a related question put me in the right direction.
There's now a python library calculating the maximum drawable rectangle inside a polygon.
Library: maxrect
Install through pip:
pip install git+https://${GITHUB_TOKEN}#github.com/planetlabs/maxrect.git
Usage:
from maxrect import get_intersection, get_maximal_rectangle, rect2poly
# For a given convex polygon
coordinates1 = [ [x0, y0], [x1, y1], ... [xn, yn] ]
coordinates2 = [ [x0, y0], [x1, y1], ... [xn, yn] ]
# find the intersection of the polygons
_, coordinates = get_intersection([coordinates1, coordinates2])
# get the maximally inscribed rectangle
ll, ur = get_maximal_rectangle(coordinates)
# casting the rectangle to a GeoJSON-friendly closed polygon
rect2poly(ll, ur)
Source: https://pypi.org/project/maxrect/
here is a python code I wrote with rotation included. I tried to speed it up, but I guess it can be improved.
For future googlers,
Since your provided sample solution allows background pixels to be within the rectangle, I suppose you wish to find the the smallest rectangle that covers perhaps 80% of the white pixels.
This can be done using a similar method of finding the error ellipse given a set of data (in this case, the data is all the white pixels, and the error ellipse needs to be modified to be a rectangle)
The following links would hence be helpful
How to get the best fit bounding box from covariance matrix and mean position?
http://www.visiondummy.com/2014/04/draw-error-ellipse-representing-covariance-matrix/

How do I crop to largest interior bounding box in OpenCV?

I have some images on a black background where the images don't have square edges (see bottom right of image below). I would like to crop them down the largest rectangular image (red border). I know I will potentially lose from of the original image. Is it possible to do this in OpenCV with Python. I know there are are functions to crop to a bounding box of a contour but that would still leave me with black background in places.
ok, I've played with an idea and tested it (it's c++ but you'll probably be able to convert that to python):
assumption: background is black and the interior has no black boundary parts
you can find the external contour with findContours
use min/max x/y point positions from that contour until the rectangle that is built by those points contains no points that lie outside of the contour
I can't guarantee that this method always finds the "best" interior box, but I use a heuristic to choose whether the rectangle is reduced at top/bottom/left/right side.
Code can certainly be optimized, too ;)
using this as a testimage, I got that result (non-red region is the found interior rectangle):
regard that there is one pixel at top right that shouldnt containt to the rectangle, maybe thats from extrascting/drawing the contour wrong?!?
and here's code:
cv::Mat input = cv::imread("LenaWithBG.png");
cv::Mat gray;
cv::cvtColor(input,gray,CV_BGR2GRAY);
cv::imshow("gray", gray);
// extract all the black background (and some interior parts maybe)
cv::Mat mask = gray>0;
cv::imshow("mask", mask);
// now extract the outer contour
std::vector<std::vector<cv::Point> > contours;
std::vector<cv::Vec4i> hierarchy;
cv::findContours(mask,contours,hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, cv::Point(0,0));
std::cout << "found contours: " << contours.size() << std::endl;
cv::Mat contourImage = cv::Mat::zeros( input.size(), CV_8UC3 );;
//find contour with max elements
// remark: in theory there should be only one single outer contour surrounded by black regions!!
unsigned int maxSize = 0;
unsigned int id = 0;
for(unsigned int i=0; i<contours.size(); ++i)
{
if(contours.at(i).size() > maxSize)
{
maxSize = contours.at(i).size();
id = i;
}
}
std::cout << "chosen id: " << id << std::endl;
std::cout << "max size: " << maxSize << std::endl;
/// Draw filled contour to obtain a mask with interior parts
cv::Mat contourMask = cv::Mat::zeros( input.size(), CV_8UC1 );
cv::drawContours( contourMask, contours, id, cv::Scalar(255), -1, 8, hierarchy, 0, cv::Point() );
cv::imshow("contour mask", contourMask);
// sort contour in x/y directions to easily find min/max and next
std::vector<cv::Point> cSortedX = contours.at(id);
std::sort(cSortedX.begin(), cSortedX.end(), sortX);
std::vector<cv::Point> cSortedY = contours.at(id);
std::sort(cSortedY.begin(), cSortedY.end(), sortY);
unsigned int minXId = 0;
unsigned int maxXId = cSortedX.size()-1;
unsigned int minYId = 0;
unsigned int maxYId = cSortedY.size()-1;
cv::Rect interiorBB;
while( (minXId<maxXId)&&(minYId<maxYId) )
{
cv::Point min(cSortedX[minXId].x, cSortedY[minYId].y);
cv::Point max(cSortedX[maxXId].x, cSortedY[maxYId].y);
interiorBB = cv::Rect(min.x,min.y, max.x-min.x, max.y-min.y);
// out-codes: if one of them is set, the rectangle size has to be reduced at that border
int ocTop = 0;
int ocBottom = 0;
int ocLeft = 0;
int ocRight = 0;
bool finished = checkInteriorExterior(contourMask, interiorBB, ocTop, ocBottom,ocLeft, ocRight);
if(finished)
{
break;
}
// reduce rectangle at border if necessary
if(ocLeft)++minXId;
if(ocRight) --maxXId;
if(ocTop) ++minYId;
if(ocBottom)--maxYId;
}
std::cout << "done! : " << interiorBB << std::endl;
cv::Mat mask2 = cv::Mat::zeros(input.rows, input.cols, CV_8UC1);
cv::rectangle(mask2,interiorBB, cv::Scalar(255),-1);
cv::Mat maskedImage;
input.copyTo(maskedImage);
for(unsigned int y=0; y<maskedImage.rows; ++y)
for(unsigned int x=0; x<maskedImage.cols; ++x)
{
maskedImage.at<cv::Vec3b>(y,x)[2] = 255;
}
input.copyTo(maskedImage,mask2);
cv::imshow("masked image", maskedImage);
cv::imwrite("interiorBoundingBoxResult.png", maskedImage);
with reduction function:
bool checkInteriorExterior(const cv::Mat&mask, const cv::Rect&interiorBB, int&top, int&bottom, int&left, int&right)
{
// return true if the rectangle is fine as it is!
bool returnVal = true;
cv::Mat sub = mask(interiorBB);
unsigned int x=0;
unsigned int y=0;
// count how many exterior pixels are at the
unsigned int cTop=0; // top row
unsigned int cBottom=0; // bottom row
unsigned int cLeft=0; // left column
unsigned int cRight=0; // right column
// and choose that side for reduction where mose exterior pixels occured (that's the heuristic)
for(y=0, x=0 ; x<sub.cols; ++x)
{
// if there is an exterior part in the interior we have to move the top side of the rect a bit to the bottom
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cTop;
}
}
for(y=sub.rows-1, x=0; x<sub.cols; ++x)
{
// if there is an exterior part in the interior we have to move the bottom side of the rect a bit to the top
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cBottom;
}
}
for(y=0, x=0 ; y<sub.rows; ++y)
{
// if there is an exterior part in the interior
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cLeft;
}
}
for(x=sub.cols-1, y=0; y<sub.rows; ++y)
{
// if there is an exterior part in the interior
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cRight;
}
}
// that part is ugly and maybe not correct, didn't check whether all possible combinations are handled. Check that one please. The idea is to set `top = 1` iff it's better to reduce the rect at the top than anywhere else.
if(cTop > cBottom)
{
if(cTop > cLeft)
if(cTop > cRight)
top = 1;
}
else
if(cBottom > cLeft)
if(cBottom > cRight)
bottom = 1;
if(cLeft >= cRight)
{
if(cLeft >= cBottom)
if(cLeft >= cTop)
left = 1;
}
else
if(cRight >= cTop)
if(cRight >= cBottom)
right = 1;
return returnVal;
}
bool sortX(cv::Point a, cv::Point b)
{
bool ret = false;
if(a.x == a.x)
if(b.x==b.x)
ret = a.x < b.x;
return ret;
}
bool sortY(cv::Point a, cv::Point b)
{
bool ret = false;
if(a.y == a.y)
if(b.y == b.y)
ret = a.y < b.y;
return ret;
}
A solution inspired by #micka answer, in python.
This is not a clever solution, and could be optimized, but it worked (slowly) in my case.
I modified you image to add a square, like in your example: see
At the end, this code crops the white rectangle in this
Hope you will find it helpful!
import cv2
# Import your picture
input_picture = cv2.imread("LenaWithBG.png")
# Color it in gray
gray = cv2.cvtColor(input_picture, cv2.COLOR_BGR2GRAY)
# Create our mask by selecting the non-zero values of the picture
ret, mask = cv2.threshold(gray,0,255,cv2.THRESH_BINARY)
# Select the contour
mask , cont, _ = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# if your mask is incurved or if you want better results,
# you may want to use cv2.CHAIN_APPROX_NONE instead of cv2.CHAIN_APPROX_SIMPLE,
# but the rectangle search will be longer
cv2.drawContours(gray, cont, -1, (255,0,0), 1)
cv2.imshow("Your picture with contour", gray)
cv2.waitKey(0)
# Get all the points of the contour
contour = cont[0].reshape(len(cont[0]),2)
# we assume a rectangle with at least two points on the contour gives a 'good enough' result
# get all possible rectangles based on this hypothesis
rect = []
for i in range(len(contour)):
x1, y1 = contour[i]
for j in range(len(contour)):
x2, y2 = contour[j]
area = abs(y2-y1)*abs(x2-x1)
rect.append(((x1,y1), (x2,y2), area))
# the first rect of all_rect has the biggest area, so it's the best solution if he fits in the picture
all_rect = sorted(rect, key = lambda x : x[2], reverse = True)
# we take the largest rectangle we've got, based on the value of the rectangle area
# only if the border of the rectangle is not in the black part
# if the list is not empty
if all_rect:
best_rect_found = False
index_rect = 0
nb_rect = len(all_rect)
# we check if the rectangle is a good solution
while not best_rect_found and index_rect < nb_rect:
rect = all_rect[index_rect]
(x1, y1) = rect[0]
(x2, y2) = rect[1]
valid_rect = True
# we search a black area in the perimeter of the rectangle (vertical borders)
x = min(x1, x2)
while x <max(x1,x2)+1 and valid_rect:
if mask[y1,x] == 0 or mask[y2,x] == 0:
# if we find a black pixel, that means a part of the rectangle is black
# so we don't keep this rectangle
valid_rect = False
x+=1
y = min(y1, y2)
while y <max(y1,y2)+1 and valid_rect:
if mask[y,x1] == 0 or mask[y,x2] == 0:
valid_rect = False
y+=1
if valid_rect:
best_rect_found = True
index_rect+=1
if best_rect_found:
cv2.rectangle(gray, (x1,y1), (x2,y2), (255,0,0), 1)
cv2.imshow("Is that rectangle ok?",gray)
cv2.waitKey(0)
# Finally, we crop the picture and store it
result = input_picture[min(y1, y2):max(y1, y2), min(x1,x2):max(x1,x2)]
cv2.imwrite("Lena_cropped.png",result)
else:
print("No rectangle fitting into the area")
else:
print("No rectangle found")
If your mask is incurved or simply if you want better results, you may want to use cv2.CHAIN_APPROX_NONE instead of cv2.CHAIN_APPROX_SIMPLE, but the rectangle search will take more time (because it's a quadratic solution in the best case).
In ImageMagick 6.9.10-30 (or 7.0.8.30) or higher, you can use the -trim function with a new define.
Input:
convert image.png -fuzz 5% -define trim:percent-background=0% -trim +repage result.png
Or for the image presented below:
Input:
convert image2.png -bordercolor black -border 1 -define trim:percent-background=0% -trim +repage result2.png

Categories

Resources