find specific points on an image - python

I am trying to make a program to solve a puzzle an. My attempts work good with sample puzzles I made to test. Now I am trying to make one for a actual puzzle.
The puzzle pieces of this new puzzle don't really have a proper shape.
I managed to make the image into black and white and finally in to array of 1s and 0s where 1s indicate the piece and 0s background. I want to find a way to identify the coordinates of the 4 corners, peaks and depths of these pieces.
I tried to count the numbers of 0s near the 1s to see the maximum curves in the border. But the shapes are not smooth enough for it to work.
counter = np.zeros((lenX,lenY),dtype=int)
for i in range(lenX):
for j in range(lenY):
if img[i,j]==1:
counter[i,j] = count_white(img,i,j,lenX,lenY)
print(counter)
tpath = os.getcwd()+"/test.jpg"
print(cv2.imwrite(tpath, Image))
print("saved at : ",tpath)
np.savetxt("test.csv", counter, delimiter=",")
def count_white(img,x,y,lenX,lenY):
X = [x-1,x,x+1,x+1,x+1,x,x-1,x-1]
Y = [y-1,y-1,y-1,y,y+1,y+1,y+1,y]
count = 0
for i in range(len(X)):
if X[i] < lenX and Y[i] < lenY:
if img[X[i],Y[i]] == 0:
count=count+1
return count
Any suggestion, references or ideas?

Sorry for the C++ code but it works for your case:
cv::Mat gray = cv::imread("Sq01a.png", cv::IMREAD_GRAYSCALE);
gray = 255 - gray;
cv::Mat bin;
cv::threshold(gray, bin, 1, 255, cv::THRESH_BINARY);
cv::Mat bigBin(2 * bin.rows, 2 * bin.cols, CV_8UC1, cv::Scalar(0));
bin.copyTo(cv::Mat(bigBin, cv::Rect(bin.cols / 2, bin.rows / 2, bin.cols, bin.rows)));
std::vector<std::vector<cv::Point> > contours;
cv::findContours(bigBin, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_NONE);
if (contours.size() > 0)
{
std::vector<cv::Point> tmp = contours[0];
const cv::Point* elementPoints[1] = { &tmp[0] };
int numberOfPoints = (int)tmp.size();
cv::fillPoly(bigBin, elementPoints, &numberOfPoints, 1, cv::Scalar(255, 255, 255), 8);
}
int maxCorners = 20;
double qualityLevel = 0.01;
double minDistance = bigBin.cols / 8;
int blockSize = 5;
bool useHarrisDetector = true;
double k = 0.04;
std::vector<cv::Point2f> corners;
cv::goodFeaturesToTrack(bigBin, corners, maxCorners, qualityLevel, minDistance, cv::noArray(), blockSize, useHarrisDetector, k);
std::vector<cv::Point2f> resCorners;
std::vector<cv::Point2f> imgCorners = { cv::Point2f(0, 0), cv::Point2f(bigBin.cols, 0), cv::Point2f(bigBin.cols, bigBin.rows), cv::Point2f(0, bigBin.rows) };
for (auto imgCorn : imgCorners)
{
size_t best_i = corners.size();
float min_dist = bigBin.cols * bigBin.rows;
for (size_t i = 0; i < corners.size(); ++i)
{
float dist = cv::norm(imgCorn - corners[i]);
if (dist < min_dist)
{
best_i = i;
min_dist = dist;
}
}
if (best_i != corners.size())
{
resCorners.push_back(corners[best_i]);
}
}
cv::Mat bigColor;
cv::cvtColor(bigBin, bigColor, cv::COLOR_GRAY2BGR);
for (auto corner : corners)
{
cv::circle(bigColor, corner, 10, cv::Scalar(0, 0, 255.), 1);
}
for (auto corner : resCorners)
{
cv::circle(bigColor, corner, 5, cv::Scalar(0, 255, 0), 2);
}
cv::imshow("gray", gray);
cv::imshow("bigColor", bigColor);
cv::waitKey(0);
Here red circles - corners from Harris and green - the nearest to the image corners. It's OK?

Related

How to get all pixels in mask in C++

In python, we can use such code to fetch all pixels under mask:
src_img = cv2.imread("xxx")
mask = src_img > 50
fetch = src_img[mask]
what we get is a ndarray including all pixels matching condition mask. How to implement the same function using C++opencv ?
I've found that copyTo can select pixels under specified mask, but it can only copy those pixels to another Mat instead of what python did.
This is not that straightforward in C++ (as expected). That operation breaks down in further, smaller operations. One way to achieve a std::vector with the same pixel values above your threshold is this, I'm using this test image:
// Read the input image:
std::string imageName = "D://opencvImages//grayDog.png";
cv::Mat inputImage = cv::imread( imageName );
// Convert BGR to Gray:
cv::Mat grayImage;
cv::cvtColor( inputImage, grayImage, cv::COLOR_RGB2GRAY );
cv::Mat mask;
int thresholdValue = 50;
cv::threshold( grayImage, mask, thresholdValue, 255, cv::THRESH_BINARY );
The above bit just creates a cv::Mat where each pixel above the threshold is drawn with a value of 255, 0 otherwise. It is (one possible) equivalent of mask = src_img > 50. Now, let's mask the original grayscale image with this mask. Think about an element-wise multiplication between the two cv::Mats. One possible way is this:
// Create grayscale mask:
cv::Mat output;
grayImage.copyTo( output, mask );
Now we have the original pixel values and everything else is zero. Convenient, because we can find now the locations of the non-zero pixels:
// Locate the non-zero pixel values:
std::vector< cv::Point > pixelLocations;
cv::findNonZero( output, pixelLocations );
Alright, we have a std::vector of cv::Points that locate each non-zero pixel. We can use this info to index the original grayscale pixels in the original matrix:
// Extract each pixel value using its location:
std::vector< int > pixelValues;
int totalPoints = (int)pixelLocations.size();
for( int i = 0; i < totalPoints; i++ ){
// Get pixel location:
cv::Point currentPoint = pixelLocations[i];
// Get pixel value:
int currentPixel = (int)grayImage.at<uchar>( currentPoint );
pixelValues.push_back( currentPixel );
// Print info:
std::cout<<"i: "<<i<<" currentPoint: "<<currentPoint<<" pixelValue: "<<currentPixel<<std::endl;
}
You end up with pixelValues, which is a std::vector containing a list of all the pixels that are above your threshold.
Why do you hate writing loop?
I think this is the easiest way:
cv::Mat Img = ... //Where, this Img is 8UC1
// * In this sample, extract the pixel positions
std::vector< cv::Point > ResultData;
const unsigned char Thresh = 50;
for( int y=0; y<Img.rows; ++y )
{
const unsigned char *p = Img.ptr<unsigned char>(y);
for( int x=0; x<Img.cols; ++x, ++p )
{
if( *p > Thresh )
{//Here, pick up this pixel's info you want.
ResultData.emplace_back( x,y );
}
}
}
Because I received a nervous complaint, I add an example of collecting values.
In the following example, a mask image Mask is input to the process.
cv::Mat Img = ... //Where, this Img is 8UC1
cv::Mat Mask = ...; //Same size as Img, 8UC1
std::vector< unsigned char > ResultData; //collect pixel values
for( int y=0; y<Img.rows; ++y )
{
const unsigned char *p = Img.ptr<unsigned char>(y);
const unsigned char *m = Mask.ptr<unsigned char>(y);
for( int x=0; x<Img.cols; ++x, ++p, ++m )
{
if( *m ){ ResultData.push_back( *p ); }
}
}

Want to detect blur from image, but couldn't get it right

I am actually want to convert this blur detection into C++. As a beginner in OpenCV, I am actually following this for conversion, But maybe I am getting it wrong. Here is my approach. I have to use DFT instead of FFT in C++.
(h, w) = image.shape
(cX, cY) = (int(w / 2.0), int(h / 2.0))
# compute the FFT to find the frequency transform, then shift
# the zero frequency component (i.e., DC component located at
# the top-left corner) to the center where it will be more
# easy to analyze
fft = np.fft.fft2(image)
fftShift = np.fft.fftshift(fft)
I converted this by
Mat I = imread( samples::findFile( filename ), IMREAD_GRAYSCALE);
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero values
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
dft(complexI, complexI, DFT_COMPLEX_OUTPUT); // this way the result may fit in the source matrix
#For DFT shift as python code
// compute the magnitude and switch to logarithmic scale
// => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
split(complexI, planes); // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
// crop the spectrum, if it has an odd number of rows or columns
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
// rearrange the quadrants of Fourier image so that the origin is at the image center
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
Then, in the next part
# zero-out the center of the FFT shift (i.e., remove low
# frequencies), apply the inverse shift such that the DC
# component once again becomes the top-left, and then apply
# the inverse FFT
fftShift[cY - size:cY + size, cX - size:cX + size] = 0
fftShift = np.fft.ifftshift(fftShift)
recon = np.fft.ifft2(fftShift)
I converted this in this way
// construct a Mat object to zero out of the center, here size = 60
Mat H;
Mat H(complexI.size(), CV_32F, Scalar(1));
float D = 0, D0 = 60;
for (int u = 0; u < H.rows; u++)
{
for (int v = 0; v < H.cols; v++)
{
D = sqrt((u - scr.rows / 2)*(u - scr.rows / 2) + (v - scr.cols / 2)*(v - scr.cols / 2));
if (D < D0)
{
H.at<float>(u, v) = 0;
}
}
}
Mat planesH[] = { Mat_<float>(H.clone()), Mat_<float>(H.clone()) };
Mat planes_dft[] = { complexI, Mat::zeros(complexI.size(), CV_32F) };
split(complexI, planes_dft);
Mat planes_out[] = { Mat::zeros(complexI.size(), CV_32F), Mat::zeros(complexI.size(), CV_32F) };
planes_out[0] = planesH[0].mul(planes_dft[0]);
planes_out[1] = planesH[1].mul(planes_dft[1]);
merge(planes_out, 2, complexIH);
#for Dft shift
Mat p0(complexIH, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat p1(complexIH, Rect(cx, 0, cx, cy)); // Top-Right
Mat p2(complexIH, Rect(0, cy, cx, cy)); // Bottom-Left
Mat p3(complexIH, Rect(cx, cy, cx, cy)); // Bottom-Right
p0.copyTo(tmp);
p3.copyTo(p0);
tmp.copyTo(p3);
p1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
p2.copyTo(p1);
tmp.copyTo(p2);
Mat recon;
dft(complexIH, recon, DFT_INVERSE);
Then the tutorial stated
# compute the magnitude spectrum of the reconstructed image,
# then compute the mean of the magnitude values
magnitude = 20 * np.log(np.abs(recon))
mean = np.mean(magnitude)
# the image will be considered "blurry" if the mean value of the
# magnitudes is less than the threshold value
return (mean, mean <= thresh)
And I converted this in this way
Mat planes2[] = {Mat_<float>(complexIH), Mat::zeros(complexIH.size(), CV_32F)};
// compute the magnitude and switch to logarithmic scale
// => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
split(recon, planes2); // planes2[0] = Re(DFT(I), planes2[1] = Im(DFT(I))
magnitude(planes2[0], planes2[1], planes2[0]);// planes2[0] = magnitude
Mat output = planes2[0];
output += Scalar::all(1); // switch to logarithmic scale
log(output, output);
float avg = mean(magI)[0];
I know it is a mess. I want to get the blur value like the tutorial says.
I think this comes close to the original Python code
#include <iostream>
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main(int argc, char **argv) {
if (argc <= 1) {
fprintf(stderr, "Error: missing image file\n");
return 1;
}
string image_file = argv[1];
cout << "Processing " << image_file << std::endl;
Mat frame = imread(image_file, IMREAD_GRAYSCALE);
// Go float
Mat fImage;
frame.convertTo(fImage, CV_32F);
// FFT
cout << "Direct transform...\n";
Mat fourierTransform;
dft(fImage, fourierTransform, DFT_SCALE|DFT_COMPLEX_OUTPUT);
int Wd = frame.cols;
int Ht = frame.rows;
int cx = Wd/2;
int cy = Ht/2;
int Sw = 60;
int Sh = 60;
//center low frequencies in the middle
//by shuffling the quadrants.
Mat q0(fourierTransform, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(fourierTransform, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(fourierTransform, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(fourierTransform, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
// Block the low frequencies
fourierTransform(Rect(cx-Sw,cy-Sh,2*Sw,2*Sh)).setTo(0);
//shuffle the quadrants to their original position
Mat orgFFT;
fourierTransform.copyTo(orgFFT);
Mat p0(orgFFT, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat p1(orgFFT, Rect(cx, 0, cx, cy)); // Top-Right
Mat p2(orgFFT, Rect(0, cy, cx, cy)); // Bottom-Left
Mat p3(orgFFT, Rect(cx, cy, cx, cy)); // Bottom-Right
p0.copyTo(tmp);
p3.copyTo(p0);
tmp.copyTo(p3);
p1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
p2.copyTo(p1);
tmp.copyTo(p2);
// IFFT
cout << "Inverse transform...\n";
Mat invFFT;
Mat logFFT;
double minVal,maxVal;
dft(orgFFT, invFFT, DFT_INVERSE|DFT_REAL_OUTPUT);
//img_fft = 20*numpy.log(numpy.abs(img_fft))
invFFT = cv::abs(invFFT);
cv::minMaxLoc(invFFT,&minVal,&maxVal,NULL,NULL);
//check for impossible values
if(maxVal<=0.0){
cerr << "No information, complete black image!\n";
return 1;
}
cv::log(invFFT,logFFT);
logFFT *= 20;
//result = numpy.mean(img_fft)
cv::Scalar result= cv::mean(logFFT);
cout << "Result : "<< result.val[0] << endl;
// Back to 8-bits
Mat finalImage;
logFFT.convertTo(finalImage, CV_8U);
// show if you like
imshow("Input", frame);
imshow("Result", finalImage);
cv::waitKey();
return 0;
}

how to crop the blank part from an image (document form) in python?

enter image description hereI have a scanned copy of a document as an image submitted by the user, it covers only 40% of the paper's height. I want to crop only that part, how to achieve this.
It is not necessary that the required form will always be on the top of the paper, it can be anywhere and the rest is blank white paper, how to crop that part?
The scanned copy I have got using scanner made in python only, so it has little black dots in the page.
You can consider the steps below to crop the blank or the none blank part:
cv::namedWindow("result", cv::WINDOW_FREERATIO);
cv::Mat img = cv::imread(R"(xbNQF.png)"); // read the image
// main code starts from here
cv::Mat gray; // convert the image to gray and put the result in gray mat
cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY); // img -> gray
// threshold the gray image to remove the noise and put the result again in gray image
// it will convert all the background to black and all the text and fields to white
cv::threshold(gray, gray, 150, 255, cv::THRESH_BINARY_INV);
// now enlage the text or the inpout text fields
cv::dilate(gray, gray, cv::getStructuringElement(cv::MORPH_RECT, cv::Size(15, 3)));
// now clean the image, remove unwanted small pixels
cv::erode(gray, gray, cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3)));
// find all non zero to get the max y
cv::Mat idx; // the findNonZero() function will put the result in this mat
cv::findNonZero(gray, idx); // pass the mat idx to the function
// now iterate throgh the idx to find the max y
double maxY = 0; // this will keep the max y, init value is 0
for (int i=0; i<idx.rows; ++i) {
cv::Point pnt = idx.at<cv::Point>(i);
if (pnt.y > maxY) { // if this Y is greater than the last Y, copy it to the last Y
maxY = pnt.y; // this
}
}
// crop the none blank (upper) part
// NOTE: from this point you can also crop the blank part
// (0,0) means start form left-top, (gray.cols, int(maxY+5)) means
// whidth the same as the original image, and the height is
// the maxY + 5, 5 here means let give some margin the cropped image
// if you don't want, then you can delete it.
cv::Mat submat = img(cv::Rect(0, 0, gray.cols, int(maxY+5)));
cv::imshow("result", submat);
cv::waitKey();
And it is the result:
Hope it helps!
Update:
If you interest to all min, max of the (x, y), then search like this:
double maxX = 0, minX = std::numeric_limits<double>::max();
double maxY = 0, minY = std::numeric_limits<double>::max();
for (int i=0; i<idx.rows; ++i) {
cv::Point pnt = idx.at<cv::Point>(i);
if (pnt.x > maxX) {
maxX = pnt.x;
}
if (pnt.x < minX) {
minX = pnt.x;
}
if (pnt.y > maxY) {
maxY = pnt.y;
}
if (pnt.y < minY) {
minY = pnt.y;
}
}
So then you can crop anywhere of the image once you have these points.

How to adapt or resize a rectangle inside an object without including (or with a few numbers) of background pixels?

After I applied thresholding and finding the contours of the object, I used the following code to get the straight rectangle around the object (or the rotated rectangle inputting its instruction):
img = cv2.imread('image.png')
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,cv2.THRESH_BINARY)
# find contours
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
# straight rectangle
x,y,w,h = cv2.boundingRect(cnt)
img= cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
see the image
Then I have calculated the number of object and background pixels inside the straight rectangle using the following code:
# rectangle area (total number of object and background pixels inside the rectangle)
area_rect = w*h
# white or object pixels (inside the rectangle)
obj = cv2.countNonZero(imgray)
# background pixels (inside the rectangle)
bac = area_rect - obj
Now I want to adapt the rectangle of the object as a function of the relationship between the background pixel and those of the object, ie to have a rectangle occupying the larger part of the object without or with less background pixel, for example:
How do I create this?
This problem can be stated as the find the largest rectangle inscribed in a non-convex polygon.
An approximate solution can be found at this link.
This problem can be formulated also as: for each angle, find the largest rectangle containing only zeros in a matrix, explored in this SO question.
My solution is based on this answer. This will find only axis aligned rectangles, so you can easily rotate the image by a given angle and apply this solution for every angle.
My solution is C++, but you can easily port it to Python, since I'm using mostly OpenCV function, or adjust the solution in the above mentioned answer accounting for rotation.
Here we are:
#include <opencv2\opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;
// https://stackoverflow.com/a/30418912/5008845
Rect findMinRect(const Mat1b& src)
{
Mat1f W(src.rows, src.cols, float(0));
Mat1f H(src.rows, src.cols, float(0));
Rect maxRect(0,0,0,0);
float maxArea = 0.f;
for (int r = 0; r < src.rows; ++r)
{
for (int c = 0; c < src.cols; ++c)
{
if (src(r, c) == 0)
{
H(r, c) = 1.f + ((r>0) ? H(r-1, c) : 0);
W(r, c) = 1.f + ((c>0) ? W(r, c-1) : 0);
}
float minw = W(r,c);
for (int h = 0; h < H(r, c); ++h)
{
minw = min(minw, W(r-h, c));
float area = (h+1) * minw;
if (area > maxArea)
{
maxArea = area;
maxRect = Rect(Point(c - minw + 1, r - h), Point(c+1, r+1));
}
}
}
}
return maxRect;
}
RotatedRect largestRectInNonConvexPoly(const Mat1b& src)
{
// Create a matrix big enough to not lose points during rotation
vector<Point> ptz;
findNonZero(src, ptz);
Rect bbox = boundingRect(ptz);
int maxdim = max(bbox.width, bbox.height);
Mat1b work(2*maxdim, 2*maxdim, uchar(0));
src(bbox).copyTo(work(Rect(maxdim - bbox.width/2, maxdim - bbox.height / 2, bbox.width, bbox.height)));
// Store best data
Rect bestRect;
int bestAngle = 0;
// For each angle
for (int angle = 0; angle < 90; angle += 1)
{
cout << angle << endl;
// Rotate the image
Mat R = getRotationMatrix2D(Point(maxdim,maxdim), angle, 1);
Mat1b rotated;
warpAffine(work, rotated, R, work.size());
// Keep the crop with the polygon
vector<Point> pts;
findNonZero(rotated, pts);
Rect box = boundingRect(pts);
Mat1b crop = rotated(box).clone();
// Invert colors
crop = ~crop;
// Solve the problem: "Find largest rectangle containing only zeros in an binary matrix"
// https://stackoverflow.com/questions/2478447/find-largest-rectangle-containing-only-zeros-in-an-n%C3%97n-binary-matrix
Rect r = findMinRect(crop);
// If best, save result
if (r.area() > bestRect.area())
{
bestRect = r + box.tl(); // Correct the crop displacement
bestAngle = angle;
}
}
// Apply the inverse rotation
Mat Rinv = getRotationMatrix2D(Point(maxdim, maxdim), -bestAngle, 1);
vector<Point> rectPoints{bestRect.tl(), Point(bestRect.x + bestRect.width, bestRect.y), bestRect.br(), Point(bestRect.x, bestRect.y + bestRect.height)};
vector<Point> rotatedRectPoints;
transform(rectPoints, rotatedRectPoints, Rinv);
// Apply the reverse translations
for (int i = 0; i < rotatedRectPoints.size(); ++i)
{
rotatedRectPoints[i] += bbox.tl() - Point(maxdim - bbox.width / 2, maxdim - bbox.height / 2);
}
// Get the rotated rect
RotatedRect rrect = minAreaRect(rotatedRectPoints);
return rrect;
}
int main()
{
Mat1b img = imread("path_to_image", IMREAD_GRAYSCALE);
// Compute largest rect inside polygon
RotatedRect r = largestRectInNonConvexPoly(img);
// Show
Mat3b res;
cvtColor(img, res, COLOR_GRAY2BGR);
Point2f points[4];
r.points(points);
for (int i = 0; i < 4; ++i)
{
line(res, points[i], points[(i + 1) % 4], Scalar(0, 0, 255), 2);
}
imshow("Result", res);
waitKey();
return 0;
}
The result image is:
NOTE
I'd like to point out that this code is not optimized, so it can probably perform better. For an approximized solution, see here, and the papers reported there.
This answer to a related question put me in the right direction.
There's now a python library calculating the maximum drawable rectangle inside a polygon.
Library: maxrect
Install through pip:
pip install git+https://${GITHUB_TOKEN}#github.com/planetlabs/maxrect.git
Usage:
from maxrect import get_intersection, get_maximal_rectangle, rect2poly
# For a given convex polygon
coordinates1 = [ [x0, y0], [x1, y1], ... [xn, yn] ]
coordinates2 = [ [x0, y0], [x1, y1], ... [xn, yn] ]
# find the intersection of the polygons
_, coordinates = get_intersection([coordinates1, coordinates2])
# get the maximally inscribed rectangle
ll, ur = get_maximal_rectangle(coordinates)
# casting the rectangle to a GeoJSON-friendly closed polygon
rect2poly(ll, ur)
Source: https://pypi.org/project/maxrect/
here is a python code I wrote with rotation included. I tried to speed it up, but I guess it can be improved.
For future googlers,
Since your provided sample solution allows background pixels to be within the rectangle, I suppose you wish to find the the smallest rectangle that covers perhaps 80% of the white pixels.
This can be done using a similar method of finding the error ellipse given a set of data (in this case, the data is all the white pixels, and the error ellipse needs to be modified to be a rectangle)
The following links would hence be helpful
How to get the best fit bounding box from covariance matrix and mean position?
http://www.visiondummy.com/2014/04/draw-error-ellipse-representing-covariance-matrix/

How do I crop to largest interior bounding box in OpenCV?

I have some images on a black background where the images don't have square edges (see bottom right of image below). I would like to crop them down the largest rectangular image (red border). I know I will potentially lose from of the original image. Is it possible to do this in OpenCV with Python. I know there are are functions to crop to a bounding box of a contour but that would still leave me with black background in places.
ok, I've played with an idea and tested it (it's c++ but you'll probably be able to convert that to python):
assumption: background is black and the interior has no black boundary parts
you can find the external contour with findContours
use min/max x/y point positions from that contour until the rectangle that is built by those points contains no points that lie outside of the contour
I can't guarantee that this method always finds the "best" interior box, but I use a heuristic to choose whether the rectangle is reduced at top/bottom/left/right side.
Code can certainly be optimized, too ;)
using this as a testimage, I got that result (non-red region is the found interior rectangle):
regard that there is one pixel at top right that shouldnt containt to the rectangle, maybe thats from extrascting/drawing the contour wrong?!?
and here's code:
cv::Mat input = cv::imread("LenaWithBG.png");
cv::Mat gray;
cv::cvtColor(input,gray,CV_BGR2GRAY);
cv::imshow("gray", gray);
// extract all the black background (and some interior parts maybe)
cv::Mat mask = gray>0;
cv::imshow("mask", mask);
// now extract the outer contour
std::vector<std::vector<cv::Point> > contours;
std::vector<cv::Vec4i> hierarchy;
cv::findContours(mask,contours,hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, cv::Point(0,0));
std::cout << "found contours: " << contours.size() << std::endl;
cv::Mat contourImage = cv::Mat::zeros( input.size(), CV_8UC3 );;
//find contour with max elements
// remark: in theory there should be only one single outer contour surrounded by black regions!!
unsigned int maxSize = 0;
unsigned int id = 0;
for(unsigned int i=0; i<contours.size(); ++i)
{
if(contours.at(i).size() > maxSize)
{
maxSize = contours.at(i).size();
id = i;
}
}
std::cout << "chosen id: " << id << std::endl;
std::cout << "max size: " << maxSize << std::endl;
/// Draw filled contour to obtain a mask with interior parts
cv::Mat contourMask = cv::Mat::zeros( input.size(), CV_8UC1 );
cv::drawContours( contourMask, contours, id, cv::Scalar(255), -1, 8, hierarchy, 0, cv::Point() );
cv::imshow("contour mask", contourMask);
// sort contour in x/y directions to easily find min/max and next
std::vector<cv::Point> cSortedX = contours.at(id);
std::sort(cSortedX.begin(), cSortedX.end(), sortX);
std::vector<cv::Point> cSortedY = contours.at(id);
std::sort(cSortedY.begin(), cSortedY.end(), sortY);
unsigned int minXId = 0;
unsigned int maxXId = cSortedX.size()-1;
unsigned int minYId = 0;
unsigned int maxYId = cSortedY.size()-1;
cv::Rect interiorBB;
while( (minXId<maxXId)&&(minYId<maxYId) )
{
cv::Point min(cSortedX[minXId].x, cSortedY[minYId].y);
cv::Point max(cSortedX[maxXId].x, cSortedY[maxYId].y);
interiorBB = cv::Rect(min.x,min.y, max.x-min.x, max.y-min.y);
// out-codes: if one of them is set, the rectangle size has to be reduced at that border
int ocTop = 0;
int ocBottom = 0;
int ocLeft = 0;
int ocRight = 0;
bool finished = checkInteriorExterior(contourMask, interiorBB, ocTop, ocBottom,ocLeft, ocRight);
if(finished)
{
break;
}
// reduce rectangle at border if necessary
if(ocLeft)++minXId;
if(ocRight) --maxXId;
if(ocTop) ++minYId;
if(ocBottom)--maxYId;
}
std::cout << "done! : " << interiorBB << std::endl;
cv::Mat mask2 = cv::Mat::zeros(input.rows, input.cols, CV_8UC1);
cv::rectangle(mask2,interiorBB, cv::Scalar(255),-1);
cv::Mat maskedImage;
input.copyTo(maskedImage);
for(unsigned int y=0; y<maskedImage.rows; ++y)
for(unsigned int x=0; x<maskedImage.cols; ++x)
{
maskedImage.at<cv::Vec3b>(y,x)[2] = 255;
}
input.copyTo(maskedImage,mask2);
cv::imshow("masked image", maskedImage);
cv::imwrite("interiorBoundingBoxResult.png", maskedImage);
with reduction function:
bool checkInteriorExterior(const cv::Mat&mask, const cv::Rect&interiorBB, int&top, int&bottom, int&left, int&right)
{
// return true if the rectangle is fine as it is!
bool returnVal = true;
cv::Mat sub = mask(interiorBB);
unsigned int x=0;
unsigned int y=0;
// count how many exterior pixels are at the
unsigned int cTop=0; // top row
unsigned int cBottom=0; // bottom row
unsigned int cLeft=0; // left column
unsigned int cRight=0; // right column
// and choose that side for reduction where mose exterior pixels occured (that's the heuristic)
for(y=0, x=0 ; x<sub.cols; ++x)
{
// if there is an exterior part in the interior we have to move the top side of the rect a bit to the bottom
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cTop;
}
}
for(y=sub.rows-1, x=0; x<sub.cols; ++x)
{
// if there is an exterior part in the interior we have to move the bottom side of the rect a bit to the top
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cBottom;
}
}
for(y=0, x=0 ; y<sub.rows; ++y)
{
// if there is an exterior part in the interior
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cLeft;
}
}
for(x=sub.cols-1, y=0; y<sub.rows; ++y)
{
// if there is an exterior part in the interior
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cRight;
}
}
// that part is ugly and maybe not correct, didn't check whether all possible combinations are handled. Check that one please. The idea is to set `top = 1` iff it's better to reduce the rect at the top than anywhere else.
if(cTop > cBottom)
{
if(cTop > cLeft)
if(cTop > cRight)
top = 1;
}
else
if(cBottom > cLeft)
if(cBottom > cRight)
bottom = 1;
if(cLeft >= cRight)
{
if(cLeft >= cBottom)
if(cLeft >= cTop)
left = 1;
}
else
if(cRight >= cTop)
if(cRight >= cBottom)
right = 1;
return returnVal;
}
bool sortX(cv::Point a, cv::Point b)
{
bool ret = false;
if(a.x == a.x)
if(b.x==b.x)
ret = a.x < b.x;
return ret;
}
bool sortY(cv::Point a, cv::Point b)
{
bool ret = false;
if(a.y == a.y)
if(b.y == b.y)
ret = a.y < b.y;
return ret;
}
A solution inspired by #micka answer, in python.
This is not a clever solution, and could be optimized, but it worked (slowly) in my case.
I modified you image to add a square, like in your example: see
At the end, this code crops the white rectangle in this
Hope you will find it helpful!
import cv2
# Import your picture
input_picture = cv2.imread("LenaWithBG.png")
# Color it in gray
gray = cv2.cvtColor(input_picture, cv2.COLOR_BGR2GRAY)
# Create our mask by selecting the non-zero values of the picture
ret, mask = cv2.threshold(gray,0,255,cv2.THRESH_BINARY)
# Select the contour
mask , cont, _ = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# if your mask is incurved or if you want better results,
# you may want to use cv2.CHAIN_APPROX_NONE instead of cv2.CHAIN_APPROX_SIMPLE,
# but the rectangle search will be longer
cv2.drawContours(gray, cont, -1, (255,0,0), 1)
cv2.imshow("Your picture with contour", gray)
cv2.waitKey(0)
# Get all the points of the contour
contour = cont[0].reshape(len(cont[0]),2)
# we assume a rectangle with at least two points on the contour gives a 'good enough' result
# get all possible rectangles based on this hypothesis
rect = []
for i in range(len(contour)):
x1, y1 = contour[i]
for j in range(len(contour)):
x2, y2 = contour[j]
area = abs(y2-y1)*abs(x2-x1)
rect.append(((x1,y1), (x2,y2), area))
# the first rect of all_rect has the biggest area, so it's the best solution if he fits in the picture
all_rect = sorted(rect, key = lambda x : x[2], reverse = True)
# we take the largest rectangle we've got, based on the value of the rectangle area
# only if the border of the rectangle is not in the black part
# if the list is not empty
if all_rect:
best_rect_found = False
index_rect = 0
nb_rect = len(all_rect)
# we check if the rectangle is a good solution
while not best_rect_found and index_rect < nb_rect:
rect = all_rect[index_rect]
(x1, y1) = rect[0]
(x2, y2) = rect[1]
valid_rect = True
# we search a black area in the perimeter of the rectangle (vertical borders)
x = min(x1, x2)
while x <max(x1,x2)+1 and valid_rect:
if mask[y1,x] == 0 or mask[y2,x] == 0:
# if we find a black pixel, that means a part of the rectangle is black
# so we don't keep this rectangle
valid_rect = False
x+=1
y = min(y1, y2)
while y <max(y1,y2)+1 and valid_rect:
if mask[y,x1] == 0 or mask[y,x2] == 0:
valid_rect = False
y+=1
if valid_rect:
best_rect_found = True
index_rect+=1
if best_rect_found:
cv2.rectangle(gray, (x1,y1), (x2,y2), (255,0,0), 1)
cv2.imshow("Is that rectangle ok?",gray)
cv2.waitKey(0)
# Finally, we crop the picture and store it
result = input_picture[min(y1, y2):max(y1, y2), min(x1,x2):max(x1,x2)]
cv2.imwrite("Lena_cropped.png",result)
else:
print("No rectangle fitting into the area")
else:
print("No rectangle found")
If your mask is incurved or simply if you want better results, you may want to use cv2.CHAIN_APPROX_NONE instead of cv2.CHAIN_APPROX_SIMPLE, but the rectangle search will take more time (because it's a quadratic solution in the best case).
In ImageMagick 6.9.10-30 (or 7.0.8.30) or higher, you can use the -trim function with a new define.
Input:
convert image.png -fuzz 5% -define trim:percent-background=0% -trim +repage result.png
Or for the image presented below:
Input:
convert image2.png -bordercolor black -border 1 -define trim:percent-background=0% -trim +repage result2.png

Categories

Resources