I'm trying to detect angle difference between two circular objects, which be shown as 2 image below.
I'm thinking about rotate one of image with some small angle. Every time one image rotated, SSIM between rotated image and the another image will be calculated. The angle with maximum SSIM will be the angle difference.
But, finding the extremes is never an easy problem. So my question is: Are there another algorithms (opencv) can be used is this case?
IMAGE #1
IMAGE #2
EDIT:
Thanks #Micka, I just do the same way he suggest and remove black region like #Yves Daoust said to improve processing time. Here is my final result:
ORIGINAL IMAGE
ROTATED + SHIFTED IMAGE
Here's a way to do it:
detect circles (for the example I assume circle is in the image center and radius is 50% of the image width)
unroll circle images by polar coordinates
make sure that the second image is fully visible in the first image, without a "circle end overflow"
simple template matching
Result for the following code:
min: 9.54111e+07
pos: [0, 2470]
angle-right: 317.571
angle-left: -42.4286
I think this should work quite well in general.
int main()
{
// load images
cv::Mat image1 = cv::imread("C:/data/StackOverflow/circleAngle/circleAngle1.jpg");
cv::Mat image2 = cv::imread("C:/data/StackOverflow/circleAngle/circleAngle2.jpg");
// generate circle information. Here I assume image center and image is filled by the circles.
// use houghCircles or a RANSAC based circle detection instead, if necessary
cv::Point2f center1 = cv::Point2f(image1.cols/2.0f, image1.rows/2.0f);
cv::Point2f center2 = cv::Point2f(image2.cols / 2.0f, image2.rows / 2.0f);
float radius1 = image1.cols / 2.0f;
float radius2 = image2.cols / 2.0f;
cv::Mat unrolled1, unrolled2;
// define a size for the unrolling. Best might be to choose the arc-length of the circle. The smaller you choose this, the less resolution is available (the more pixel information of the circle is lost during warping)
cv::Size unrolledSize(radius1, image1.cols * 2);
// unroll the circles by warpPolar
cv::warpPolar(image1, unrolled1, unrolledSize, center1, radius1, cv::WARP_POLAR_LINEAR);
cv::warpPolar(image2, unrolled2, unrolledSize, center2, radius2, cv::WARP_POLAR_LINEAR);
// double the first image (720° of the circle), so that the second image is fully included without a "circle end overflow"
cv::Mat doubleImg1;
cv::vconcat(unrolled1, unrolled1, doubleImg1);
// the height of the unrolled image is exactly 360° of the circle
double degreesPerPixel = 360.0 / unrolledSize.height;
// template matching. Maybe correlation could be the better matching metric
cv::Mat matchingResult;
cv::matchTemplate(doubleImg1, unrolled2, matchingResult, cv::TemplateMatchModes::TM_SQDIFF);
double minVal; double maxVal; cv::Point minLoc; cv::Point maxLoc;
cv::Point matchLoc;
cv::minMaxLoc(matchingResult, &minVal, &maxVal, &minLoc, &maxLoc, cv::Mat());
std::cout << "min: " << minVal << std::endl;
std::cout << "pos: " << minLoc << std::endl;
// angles in clockwise direction:
std::cout << "angle-right: " << minLoc.y * degreesPerPixel << std::endl;
std::cout << "angle-left: " << minLoc.y * degreesPerPixel -360.0 << std::endl;
double foundAngle = minLoc.y * degreesPerPixel;
// visualizations:
// display the matched position
cv::Rect pos = cv::Rect(minLoc, cv::Size(unrolled2.cols, unrolled2.rows));
cv::rectangle(doubleImg1, pos, cv::Scalar(0, 255, 0), 4);
// resize because the images are too big
cv::Mat resizedResult;
cv::resize(doubleImg1, resizedResult, cv::Size(), 0.2, 0.2);
cv::resize(unrolled1, unrolled1, cv::Size(), 0.2, 0.2);
cv::resize(unrolled2, unrolled2, cv::Size(), 0.2, 0.2);
double startAngleUpright = 0;
cv::ellipse(image1, center1, cv::Size(100, 100), 0, startAngleUpright, startAngleUpright + foundAngle, cv::Scalar::all(255), -1, 0);
cv::resize(image1, image1, cv::Size(), 0.5, 0.5);
cv::imshow("image1", image1);
cv::imshow("unrolled1", unrolled1);
cv::imshow("unrolled2", unrolled2);
cv::imshow("resized", resizedResult);
cv::waitKey(0);
}
This is how the intermediate images and results look like:
unrolled image 1 / unrolled 2 / unrolled 1 (720°) / best match of unrolled 2 in unrolled 1 (720°):
Here's the same idea but the correlation is done with a convolution (FFT) instead of matchTemplate. FFTs can be faster if there's much data.
Load inputs:
im1 = cv.imread("circle1.jpg", cv.IMREAD_GRAYSCALE)
im2 = cv.imread("circle2.jpg", cv.IMREAD_GRAYSCALE)
height, width = im1.shape
Polar transform (log polar as an exercise to the reader) with some arbitrary parameters that affect "resolution":
maxradius = width // 2
stripwidth = maxradius
stripheight = int(maxradius * 2 * pi) # approximately square at the radius
#stripheight = 360
def polar(im):
return cv.warpPolar(im, center=(width/2, height/2),
dsize=(stripwidth, stripheight), maxRadius=maxradius,
flags=cv.WARP_POLAR_LOG*0 + cv.INTER_LINEAR)
strip1 = polar(im1)
strip2 = polar(im2)
Convolution:
f1 = np.fft.fft2(strip1[::-1, ::-1])
f2 = np.fft.fft2(strip2)
conv = np.fft.ifft2(f1 * f2)
minmaxloc:
conv = np.real(conv) # or np.abs, can't decide
(i,j) = np.unravel_index(conv.argmax(), conv.shape)
i,j = (i+1) % stripheight, (j+1) % stripwidth
and what's that as an angle:
print("degrees:", i / stripheight * 360)
# 42.401091405184175
https://gist.github.com/crackwitz/3da91f43324b0c53504d587a394d4c71
Related
I am creating program that helps processing microstructure images. One of the function is detecting circles with the same radius. User draws one circle, my program spots others. I've already implemented distance transform method
I am trying to create method that uses HoughCircles. However, I am confused with its parameters.
My code:
def find_circles_with_radius_haugh(path, radius):
img = cv2.imread(path)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
circles = cv2.HoughCircles(img_gray, cv2.HOUGH_GRADIENT, int(radius),
1.5,
param1=80, param2=40,
minRadius=int(radius * 0.9),
maxRadius=int(radius * 1.1))
res = list()
if circles is not None:
for i in circles[0, :]:
res.append((i[0], i[1], i[2]))
return res
Original picture:
My result of detecting circles with radius 57 pixels (+- 10%):
Please help me with better processing images like that.
I might try findContours method, but I don't know any filters that will make borders on this picture clearer.
I tried a little.
My idea is simply using filter2D instead of Hough-Transform.
Because detection target is the circles has specific radius, if edge of circles detected clearly, the center of the circles will be able to found by convoluting circular mask to the edge image.
I checked the filter2D(=convolution) result with following code (C++).
int main()
{
//This source image "MicroSpheres.png" was copied from this question
cv::Mat Src = cv::imread( "MicroSpheres.png", cv::IMREAD_GRAYSCALE );
if( Src.empty() )return 0;
//Test with 50% Scale
cv::resize( Src, Src, cv::Size(0,0), 0.5, 0.5, cv::INTER_AREA );
cv::imshow( "Src", Src );
const int Radius = cvRound(57 * 0.5); //So, Radius is also 50% scale
//Trying to detect edge of circles
cv::Mat EdgeImg;
{
cv::Mat Test;
cv::medianBlur( Src, Test, 5 );
cv::morphologyEx( Test, Test, cv::MORPH_GRADIENT, cv::Mat() );
cv::imshow( "Test", Test );
cv::adaptiveThreshold( Test, EdgeImg, 255, cv::ADAPTIVE_THRESH_GAUSSIAN_C, cv::THRESH_BINARY, (Test.rows/6)|0x01, -6 );
cv::imshow( "EdgeImg", EdgeImg );
}
cv::Mat BufferFor_imwrite = EdgeImg.clone();
//filter2D
cv::Mat FilterResult;
{
const int FilterRadius = Radius + 2;
const int FilterSize = FilterRadius*2 + 1;
cv::Mat Filter = cv::Mat::zeros( FilterSize,FilterSize, CV_32F );
cv::circle( Filter, cv::Point(FilterRadius,FilterRadius), Radius/2, cv::Scalar(-1), -1 );
cv::circle( Filter, cv::Point(FilterRadius,FilterRadius), Radius, cv::Scalar(1), 3 );
cv::filter2D( EdgeImg, FilterResult, CV_32F, Filter );
}
{//Very lazy check of the filter2D result.
double Min, Max;
cv::minMaxLoc( FilterResult, &Min, &Max );
double scale = 255 / (Max-Min);
cv::Mat Show;
FilterResult.convertTo( Show, CV_8U, scale, -Min*scale );
cv::imshow( "Filter2D_Result", Show );
cv::vconcat( BufferFor_imwrite, Show, BufferFor_imwrite );
//(Estimating center of circles based onthe filter2D result.)
// Here, just only simple thresholding is implemented.
// At least non-maximum suppression must be done, I think.
cv::Mat Centers;
cv::threshold( FilterResult, Centers, (Max+Min)*0.6, 255, cv::THRESH_BINARY );
Centers.convertTo( Centers, CV_8U );
Show = Src * 0.5;
Show.setTo( cv::Scalar(255), Centers );
cv::imshow( "Centers", Show );
cv::vconcat( BufferFor_imwrite, Show, BufferFor_imwrite );
}
if( cv::waitKey() == 's' ){ cv::imwrite( "Result.png", BufferFor_imwrite ); }
return 0;
}
The following image is result. 3 images are concatenated vertically.
edge detection result
filter2D result
Circle center estimation result (very lazy. just binarized the filter2D result and overlapped it onto source image.)
I can't say this is perfect, but it looks like that the result roughly indicates some centers.
Rewrote #fana code in Python
import cv2
import numpy as np
img = cv2.imread('spheres1.bmp')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, (0, 0), gray, 0.5, 0.5, cv2.INTER_AREA)
cv2.imwrite("resized.png", gray)
radius = round(57 * 0.5)
test = cv2.medianBlur(gray, 5)
struct_elem = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
# might be better to use "I" matrix
# struct_elem = np.ones((3,3), np.uint8)
test = cv2.morphologyEx(test, cv2.MORPH_GRADIENT, kernel=struct_elem)
cv2.imwrite("MorphologyEx.png", test)
edge_img = cv2.adaptiveThreshold(test, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, int(len(test) / 6) | 0x01, -6)
cv2.imwrite("EdgeImg.png", edge_img );
buffer_for_imwrite = edge_img.copy()
filter_radius = radius + 2
filter_size = filter_radius * 2 + 1
img_filter = np.zeros((filter_size, filter_size))
cv2.circle(img_filter, (filter_radius, filter_radius), int(radius / 2), -1, -1)
cv2.circle(img_filter, (filter_radius, filter_radius), radius, 1, 3)
# second circle better to generate with smaller width like this:
# cv2.circle(img_filter, (filter_radius, filter_radius), radius, 1, 2)
cv2.imwrite("Filter.png", img_filter)
filter_result = cv2.filter2D(edge_img, cv2.CV_32F, img_filter)
cv2.imwrite("FilterResult.png", filter_result)
min_val, max_val, _, _ = cv2.minMaxLoc(filter_result)
scale = 255 / (max_val - min_val)
show = np.uint8(filter_result * scale - min_val * scale)
cv2.imwrite("Filter2D_Result.png", show)
_, centers = cv2.threshold(filter_result, (max_val + min_val) * 0.6, 255, cv2.THRESH_BINARY)
centers = np.uint8(centers)
show = gray * 0.5
show[np.where(centers == 255)] = 255
cv2.imwrite("Centers.png", show)
I am actually want to convert this blur detection into C++. As a beginner in OpenCV, I am actually following this for conversion, But maybe I am getting it wrong. Here is my approach. I have to use DFT instead of FFT in C++.
(h, w) = image.shape
(cX, cY) = (int(w / 2.0), int(h / 2.0))
# compute the FFT to find the frequency transform, then shift
# the zero frequency component (i.e., DC component located at
# the top-left corner) to the center where it will be more
# easy to analyze
fft = np.fft.fft2(image)
fftShift = np.fft.fftshift(fft)
I converted this by
Mat I = imread( samples::findFile( filename ), IMREAD_GRAYSCALE);
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero values
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
dft(complexI, complexI, DFT_COMPLEX_OUTPUT); // this way the result may fit in the source matrix
#For DFT shift as python code
// compute the magnitude and switch to logarithmic scale
// => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
split(complexI, planes); // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
// crop the spectrum, if it has an odd number of rows or columns
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
// rearrange the quadrants of Fourier image so that the origin is at the image center
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
Then, in the next part
# zero-out the center of the FFT shift (i.e., remove low
# frequencies), apply the inverse shift such that the DC
# component once again becomes the top-left, and then apply
# the inverse FFT
fftShift[cY - size:cY + size, cX - size:cX + size] = 0
fftShift = np.fft.ifftshift(fftShift)
recon = np.fft.ifft2(fftShift)
I converted this in this way
// construct a Mat object to zero out of the center, here size = 60
Mat H;
Mat H(complexI.size(), CV_32F, Scalar(1));
float D = 0, D0 = 60;
for (int u = 0; u < H.rows; u++)
{
for (int v = 0; v < H.cols; v++)
{
D = sqrt((u - scr.rows / 2)*(u - scr.rows / 2) + (v - scr.cols / 2)*(v - scr.cols / 2));
if (D < D0)
{
H.at<float>(u, v) = 0;
}
}
}
Mat planesH[] = { Mat_<float>(H.clone()), Mat_<float>(H.clone()) };
Mat planes_dft[] = { complexI, Mat::zeros(complexI.size(), CV_32F) };
split(complexI, planes_dft);
Mat planes_out[] = { Mat::zeros(complexI.size(), CV_32F), Mat::zeros(complexI.size(), CV_32F) };
planes_out[0] = planesH[0].mul(planes_dft[0]);
planes_out[1] = planesH[1].mul(planes_dft[1]);
merge(planes_out, 2, complexIH);
#for Dft shift
Mat p0(complexIH, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat p1(complexIH, Rect(cx, 0, cx, cy)); // Top-Right
Mat p2(complexIH, Rect(0, cy, cx, cy)); // Bottom-Left
Mat p3(complexIH, Rect(cx, cy, cx, cy)); // Bottom-Right
p0.copyTo(tmp);
p3.copyTo(p0);
tmp.copyTo(p3);
p1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
p2.copyTo(p1);
tmp.copyTo(p2);
Mat recon;
dft(complexIH, recon, DFT_INVERSE);
Then the tutorial stated
# compute the magnitude spectrum of the reconstructed image,
# then compute the mean of the magnitude values
magnitude = 20 * np.log(np.abs(recon))
mean = np.mean(magnitude)
# the image will be considered "blurry" if the mean value of the
# magnitudes is less than the threshold value
return (mean, mean <= thresh)
And I converted this in this way
Mat planes2[] = {Mat_<float>(complexIH), Mat::zeros(complexIH.size(), CV_32F)};
// compute the magnitude and switch to logarithmic scale
// => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
split(recon, planes2); // planes2[0] = Re(DFT(I), planes2[1] = Im(DFT(I))
magnitude(planes2[0], planes2[1], planes2[0]);// planes2[0] = magnitude
Mat output = planes2[0];
output += Scalar::all(1); // switch to logarithmic scale
log(output, output);
float avg = mean(magI)[0];
I know it is a mess. I want to get the blur value like the tutorial says.
I think this comes close to the original Python code
#include <iostream>
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main(int argc, char **argv) {
if (argc <= 1) {
fprintf(stderr, "Error: missing image file\n");
return 1;
}
string image_file = argv[1];
cout << "Processing " << image_file << std::endl;
Mat frame = imread(image_file, IMREAD_GRAYSCALE);
// Go float
Mat fImage;
frame.convertTo(fImage, CV_32F);
// FFT
cout << "Direct transform...\n";
Mat fourierTransform;
dft(fImage, fourierTransform, DFT_SCALE|DFT_COMPLEX_OUTPUT);
int Wd = frame.cols;
int Ht = frame.rows;
int cx = Wd/2;
int cy = Ht/2;
int Sw = 60;
int Sh = 60;
//center low frequencies in the middle
//by shuffling the quadrants.
Mat q0(fourierTransform, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(fourierTransform, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(fourierTransform, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(fourierTransform, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
// Block the low frequencies
fourierTransform(Rect(cx-Sw,cy-Sh,2*Sw,2*Sh)).setTo(0);
//shuffle the quadrants to their original position
Mat orgFFT;
fourierTransform.copyTo(orgFFT);
Mat p0(orgFFT, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat p1(orgFFT, Rect(cx, 0, cx, cy)); // Top-Right
Mat p2(orgFFT, Rect(0, cy, cx, cy)); // Bottom-Left
Mat p3(orgFFT, Rect(cx, cy, cx, cy)); // Bottom-Right
p0.copyTo(tmp);
p3.copyTo(p0);
tmp.copyTo(p3);
p1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
p2.copyTo(p1);
tmp.copyTo(p2);
// IFFT
cout << "Inverse transform...\n";
Mat invFFT;
Mat logFFT;
double minVal,maxVal;
dft(orgFFT, invFFT, DFT_INVERSE|DFT_REAL_OUTPUT);
//img_fft = 20*numpy.log(numpy.abs(img_fft))
invFFT = cv::abs(invFFT);
cv::minMaxLoc(invFFT,&minVal,&maxVal,NULL,NULL);
//check for impossible values
if(maxVal<=0.0){
cerr << "No information, complete black image!\n";
return 1;
}
cv::log(invFFT,logFFT);
logFFT *= 20;
//result = numpy.mean(img_fft)
cv::Scalar result= cv::mean(logFFT);
cout << "Result : "<< result.val[0] << endl;
// Back to 8-bits
Mat finalImage;
logFFT.convertTo(finalImage, CV_8U);
// show if you like
imshow("Input", frame);
imshow("Result", finalImage);
cv::waitKey();
return 0;
}
I want to rotate an image at several angles sequentially. I do that using cv2.getRotationMatrix2D and cv2.warpAffine. Having a pair of pixels coordinates [x,y], where x=cols, y=rows (in this case) I want to find their new coordinates in the rotated images.
I used the following slightly changed code courtesy of http://www.pyimagesearch.com/2017/01/02/rotate-images-correctly-with-opencv-and-python/ and the explanation from Affine Transformation to try to map the points in the rotated image : http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/warp_affine/warp_affine.html.
The problem is my mapping or my rotation is wrong because the transformed calculated coordinates are wrong. (I tried to compute the corners manually for simple verification)
CODE:
def rotate_bound(image, angle):
# grab the dimensions of the image and then determine the
# center
(h, w) = image.shape[:2]
(cX, cY) = ((w-1) // 2.0, (h-1)// 2.0)
# grab the rotation matrix (applying the negative of the
# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
print nW, nH
# adjust the rotation matrix to take into account translation
M[0, 2] += ((nW-1) / 2.0) - cX
M[1, 2] += ((nH-1) / 2.0) - cY
# perform the actual rotation and return the image
return M, cv2.warpAffine(image, M, (nW, nH))
#function that calculates the updated locations of the coordinates
#after rotation
def rotated_coord(points,M):
points = np.array(points)
ones = np.ones(shape=(len(points),1))
points_ones = np.concatenate((points,ones), axis=1)
transformed_pts = M.dot(points_ones.T).T
return transformed_pts
#READ IMAGE & CALL FCT
img = cv2.imread("Lenna.png")
points = np.array([[511, 511]])
#rotate by 90 angle for example
M, rotated = rotate_bound(img, 90)
#find out the new locations
transformed_pts = rotated_coord(points,M)
If I have for example the coordinates [511,511] I will obtain [-0.5, 511.50] ([col, row]) when I expect to obtain [0,511].
If I use instead the w // 2 a black border will be added on the image and my rotated updated coordinates will be off again.
Question: How can I find the correct location of a pair of pixels coordinates in a rotated image (by a certain angle) using Python ?
For this case of image rotation, where the image size changes after rotation and also the reference point, the transformation matrix has to be modified. The new with and height can be calculated using the following relations:
new.width = h*\sin(\theta) + w*\cos(\theta)
new.height = h*\cos(\theta) + w*\sin(\theta)
Since the image size changes, because of the black border that you might see, the coordinates of the rotation point (centre of the image) change too. Then it has to be taken into account in the transformation matrix.
I explain an example in my blog image rotation bounding box opencv
def rotate_box(bb, cx, cy, h, w):
new_bb = list(bb)
for i,coord in enumerate(bb):
# opencv calculates standard transformation matrix
M = cv2.getRotationMatrix2D((cx, cy), theta, 1.0)
# Grab the rotation components of the matrix)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# adjust the rotation matrix to take into account translation
M[0, 2] += (nW / 2) - cx
M[1, 2] += (nH / 2) - cy
# Prepare the vector to be transformed
v = [coord[0],coord[1],1]
# Perform the actual rotation and return the image
calculated = np.dot(M,v)
new_bb[i] = (calculated[0],calculated[1])
return new_bb
## Calculate the new bounding box coordinates
new_bb = {}
for i in bb1:
new_bb[i] = rotate_box(bb1[i], cx, cy, heigth, width)
The corresponding C++ code of the above mentioned Python code of # cristianpb, if someone is looking for a C++ code as like me:
// send the original angle i.e. don't transform it in radian
cv::Point2f rotatePointUsingTransformationMat(const cv::Point2f& inPoint, const cv::Point2f& center, const double& rotAngle)
{
cv::Mat rot = cv::getRotationMatrix2D(center, rotAngle, 1.0);
float cos = rot.at<double>(0,0);
float sin = rot.at<double>(0,1);
int newWidth = int( ((center.y*2)*sin) + ((center.x*2)*cos) );
int newHeight = int( ((center.y*2)*cos) + ((center.x*2)*sin) );
rot.at<double>(0,2) += newWidth/2.0 - center.x;
rot.at<double>(1,2) += newHeight/2.0 - center.y;
int v[3] = {static_cast<int>(inPoint.x),static_cast<int>(inPoint.y),1};
int mat3[2][1] = {{0},{0}};
for(int i=0; i<rot.rows; i++)
{
for(int j=0; j<= 0; j++)
{
int sum=0;
for(int k=0; k<3; k++)
{
sum = sum + rot.at<double>(i,k) * v[k];
}
mat3[i][j] = sum;
}
}
return Point2f(mat3[0][0],mat3[1][0]);
}
After I applied thresholding and finding the contours of the object, I used the following code to get the straight rectangle around the object (or the rotated rectangle inputting its instruction):
img = cv2.imread('image.png')
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,cv2.THRESH_BINARY)
# find contours
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
# straight rectangle
x,y,w,h = cv2.boundingRect(cnt)
img= cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
see the image
Then I have calculated the number of object and background pixels inside the straight rectangle using the following code:
# rectangle area (total number of object and background pixels inside the rectangle)
area_rect = w*h
# white or object pixels (inside the rectangle)
obj = cv2.countNonZero(imgray)
# background pixels (inside the rectangle)
bac = area_rect - obj
Now I want to adapt the rectangle of the object as a function of the relationship between the background pixel and those of the object, ie to have a rectangle occupying the larger part of the object without or with less background pixel, for example:
How do I create this?
This problem can be stated as the find the largest rectangle inscribed in a non-convex polygon.
An approximate solution can be found at this link.
This problem can be formulated also as: for each angle, find the largest rectangle containing only zeros in a matrix, explored in this SO question.
My solution is based on this answer. This will find only axis aligned rectangles, so you can easily rotate the image by a given angle and apply this solution for every angle.
My solution is C++, but you can easily port it to Python, since I'm using mostly OpenCV function, or adjust the solution in the above mentioned answer accounting for rotation.
Here we are:
#include <opencv2\opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;
// https://stackoverflow.com/a/30418912/5008845
Rect findMinRect(const Mat1b& src)
{
Mat1f W(src.rows, src.cols, float(0));
Mat1f H(src.rows, src.cols, float(0));
Rect maxRect(0,0,0,0);
float maxArea = 0.f;
for (int r = 0; r < src.rows; ++r)
{
for (int c = 0; c < src.cols; ++c)
{
if (src(r, c) == 0)
{
H(r, c) = 1.f + ((r>0) ? H(r-1, c) : 0);
W(r, c) = 1.f + ((c>0) ? W(r, c-1) : 0);
}
float minw = W(r,c);
for (int h = 0; h < H(r, c); ++h)
{
minw = min(minw, W(r-h, c));
float area = (h+1) * minw;
if (area > maxArea)
{
maxArea = area;
maxRect = Rect(Point(c - minw + 1, r - h), Point(c+1, r+1));
}
}
}
}
return maxRect;
}
RotatedRect largestRectInNonConvexPoly(const Mat1b& src)
{
// Create a matrix big enough to not lose points during rotation
vector<Point> ptz;
findNonZero(src, ptz);
Rect bbox = boundingRect(ptz);
int maxdim = max(bbox.width, bbox.height);
Mat1b work(2*maxdim, 2*maxdim, uchar(0));
src(bbox).copyTo(work(Rect(maxdim - bbox.width/2, maxdim - bbox.height / 2, bbox.width, bbox.height)));
// Store best data
Rect bestRect;
int bestAngle = 0;
// For each angle
for (int angle = 0; angle < 90; angle += 1)
{
cout << angle << endl;
// Rotate the image
Mat R = getRotationMatrix2D(Point(maxdim,maxdim), angle, 1);
Mat1b rotated;
warpAffine(work, rotated, R, work.size());
// Keep the crop with the polygon
vector<Point> pts;
findNonZero(rotated, pts);
Rect box = boundingRect(pts);
Mat1b crop = rotated(box).clone();
// Invert colors
crop = ~crop;
// Solve the problem: "Find largest rectangle containing only zeros in an binary matrix"
// https://stackoverflow.com/questions/2478447/find-largest-rectangle-containing-only-zeros-in-an-n%C3%97n-binary-matrix
Rect r = findMinRect(crop);
// If best, save result
if (r.area() > bestRect.area())
{
bestRect = r + box.tl(); // Correct the crop displacement
bestAngle = angle;
}
}
// Apply the inverse rotation
Mat Rinv = getRotationMatrix2D(Point(maxdim, maxdim), -bestAngle, 1);
vector<Point> rectPoints{bestRect.tl(), Point(bestRect.x + bestRect.width, bestRect.y), bestRect.br(), Point(bestRect.x, bestRect.y + bestRect.height)};
vector<Point> rotatedRectPoints;
transform(rectPoints, rotatedRectPoints, Rinv);
// Apply the reverse translations
for (int i = 0; i < rotatedRectPoints.size(); ++i)
{
rotatedRectPoints[i] += bbox.tl() - Point(maxdim - bbox.width / 2, maxdim - bbox.height / 2);
}
// Get the rotated rect
RotatedRect rrect = minAreaRect(rotatedRectPoints);
return rrect;
}
int main()
{
Mat1b img = imread("path_to_image", IMREAD_GRAYSCALE);
// Compute largest rect inside polygon
RotatedRect r = largestRectInNonConvexPoly(img);
// Show
Mat3b res;
cvtColor(img, res, COLOR_GRAY2BGR);
Point2f points[4];
r.points(points);
for (int i = 0; i < 4; ++i)
{
line(res, points[i], points[(i + 1) % 4], Scalar(0, 0, 255), 2);
}
imshow("Result", res);
waitKey();
return 0;
}
The result image is:
NOTE
I'd like to point out that this code is not optimized, so it can probably perform better. For an approximized solution, see here, and the papers reported there.
This answer to a related question put me in the right direction.
There's now a python library calculating the maximum drawable rectangle inside a polygon.
Library: maxrect
Install through pip:
pip install git+https://${GITHUB_TOKEN}#github.com/planetlabs/maxrect.git
Usage:
from maxrect import get_intersection, get_maximal_rectangle, rect2poly
# For a given convex polygon
coordinates1 = [ [x0, y0], [x1, y1], ... [xn, yn] ]
coordinates2 = [ [x0, y0], [x1, y1], ... [xn, yn] ]
# find the intersection of the polygons
_, coordinates = get_intersection([coordinates1, coordinates2])
# get the maximally inscribed rectangle
ll, ur = get_maximal_rectangle(coordinates)
# casting the rectangle to a GeoJSON-friendly closed polygon
rect2poly(ll, ur)
Source: https://pypi.org/project/maxrect/
here is a python code I wrote with rotation included. I tried to speed it up, but I guess it can be improved.
For future googlers,
Since your provided sample solution allows background pixels to be within the rectangle, I suppose you wish to find the the smallest rectangle that covers perhaps 80% of the white pixels.
This can be done using a similar method of finding the error ellipse given a set of data (in this case, the data is all the white pixels, and the error ellipse needs to be modified to be a rectangle)
The following links would hence be helpful
How to get the best fit bounding box from covariance matrix and mean position?
http://www.visiondummy.com/2014/04/draw-error-ellipse-representing-covariance-matrix/
I have some images on a black background where the images don't have square edges (see bottom right of image below). I would like to crop them down the largest rectangular image (red border). I know I will potentially lose from of the original image. Is it possible to do this in OpenCV with Python. I know there are are functions to crop to a bounding box of a contour but that would still leave me with black background in places.
ok, I've played with an idea and tested it (it's c++ but you'll probably be able to convert that to python):
assumption: background is black and the interior has no black boundary parts
you can find the external contour with findContours
use min/max x/y point positions from that contour until the rectangle that is built by those points contains no points that lie outside of the contour
I can't guarantee that this method always finds the "best" interior box, but I use a heuristic to choose whether the rectangle is reduced at top/bottom/left/right side.
Code can certainly be optimized, too ;)
using this as a testimage, I got that result (non-red region is the found interior rectangle):
regard that there is one pixel at top right that shouldnt containt to the rectangle, maybe thats from extrascting/drawing the contour wrong?!?
and here's code:
cv::Mat input = cv::imread("LenaWithBG.png");
cv::Mat gray;
cv::cvtColor(input,gray,CV_BGR2GRAY);
cv::imshow("gray", gray);
// extract all the black background (and some interior parts maybe)
cv::Mat mask = gray>0;
cv::imshow("mask", mask);
// now extract the outer contour
std::vector<std::vector<cv::Point> > contours;
std::vector<cv::Vec4i> hierarchy;
cv::findContours(mask,contours,hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, cv::Point(0,0));
std::cout << "found contours: " << contours.size() << std::endl;
cv::Mat contourImage = cv::Mat::zeros( input.size(), CV_8UC3 );;
//find contour with max elements
// remark: in theory there should be only one single outer contour surrounded by black regions!!
unsigned int maxSize = 0;
unsigned int id = 0;
for(unsigned int i=0; i<contours.size(); ++i)
{
if(contours.at(i).size() > maxSize)
{
maxSize = contours.at(i).size();
id = i;
}
}
std::cout << "chosen id: " << id << std::endl;
std::cout << "max size: " << maxSize << std::endl;
/// Draw filled contour to obtain a mask with interior parts
cv::Mat contourMask = cv::Mat::zeros( input.size(), CV_8UC1 );
cv::drawContours( contourMask, contours, id, cv::Scalar(255), -1, 8, hierarchy, 0, cv::Point() );
cv::imshow("contour mask", contourMask);
// sort contour in x/y directions to easily find min/max and next
std::vector<cv::Point> cSortedX = contours.at(id);
std::sort(cSortedX.begin(), cSortedX.end(), sortX);
std::vector<cv::Point> cSortedY = contours.at(id);
std::sort(cSortedY.begin(), cSortedY.end(), sortY);
unsigned int minXId = 0;
unsigned int maxXId = cSortedX.size()-1;
unsigned int minYId = 0;
unsigned int maxYId = cSortedY.size()-1;
cv::Rect interiorBB;
while( (minXId<maxXId)&&(minYId<maxYId) )
{
cv::Point min(cSortedX[minXId].x, cSortedY[minYId].y);
cv::Point max(cSortedX[maxXId].x, cSortedY[maxYId].y);
interiorBB = cv::Rect(min.x,min.y, max.x-min.x, max.y-min.y);
// out-codes: if one of them is set, the rectangle size has to be reduced at that border
int ocTop = 0;
int ocBottom = 0;
int ocLeft = 0;
int ocRight = 0;
bool finished = checkInteriorExterior(contourMask, interiorBB, ocTop, ocBottom,ocLeft, ocRight);
if(finished)
{
break;
}
// reduce rectangle at border if necessary
if(ocLeft)++minXId;
if(ocRight) --maxXId;
if(ocTop) ++minYId;
if(ocBottom)--maxYId;
}
std::cout << "done! : " << interiorBB << std::endl;
cv::Mat mask2 = cv::Mat::zeros(input.rows, input.cols, CV_8UC1);
cv::rectangle(mask2,interiorBB, cv::Scalar(255),-1);
cv::Mat maskedImage;
input.copyTo(maskedImage);
for(unsigned int y=0; y<maskedImage.rows; ++y)
for(unsigned int x=0; x<maskedImage.cols; ++x)
{
maskedImage.at<cv::Vec3b>(y,x)[2] = 255;
}
input.copyTo(maskedImage,mask2);
cv::imshow("masked image", maskedImage);
cv::imwrite("interiorBoundingBoxResult.png", maskedImage);
with reduction function:
bool checkInteriorExterior(const cv::Mat&mask, const cv::Rect&interiorBB, int&top, int&bottom, int&left, int&right)
{
// return true if the rectangle is fine as it is!
bool returnVal = true;
cv::Mat sub = mask(interiorBB);
unsigned int x=0;
unsigned int y=0;
// count how many exterior pixels are at the
unsigned int cTop=0; // top row
unsigned int cBottom=0; // bottom row
unsigned int cLeft=0; // left column
unsigned int cRight=0; // right column
// and choose that side for reduction where mose exterior pixels occured (that's the heuristic)
for(y=0, x=0 ; x<sub.cols; ++x)
{
// if there is an exterior part in the interior we have to move the top side of the rect a bit to the bottom
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cTop;
}
}
for(y=sub.rows-1, x=0; x<sub.cols; ++x)
{
// if there is an exterior part in the interior we have to move the bottom side of the rect a bit to the top
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cBottom;
}
}
for(y=0, x=0 ; y<sub.rows; ++y)
{
// if there is an exterior part in the interior
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cLeft;
}
}
for(x=sub.cols-1, y=0; y<sub.rows; ++y)
{
// if there is an exterior part in the interior
if(sub.at<unsigned char>(y,x) == 0)
{
returnVal = false;
++cRight;
}
}
// that part is ugly and maybe not correct, didn't check whether all possible combinations are handled. Check that one please. The idea is to set `top = 1` iff it's better to reduce the rect at the top than anywhere else.
if(cTop > cBottom)
{
if(cTop > cLeft)
if(cTop > cRight)
top = 1;
}
else
if(cBottom > cLeft)
if(cBottom > cRight)
bottom = 1;
if(cLeft >= cRight)
{
if(cLeft >= cBottom)
if(cLeft >= cTop)
left = 1;
}
else
if(cRight >= cTop)
if(cRight >= cBottom)
right = 1;
return returnVal;
}
bool sortX(cv::Point a, cv::Point b)
{
bool ret = false;
if(a.x == a.x)
if(b.x==b.x)
ret = a.x < b.x;
return ret;
}
bool sortY(cv::Point a, cv::Point b)
{
bool ret = false;
if(a.y == a.y)
if(b.y == b.y)
ret = a.y < b.y;
return ret;
}
A solution inspired by #micka answer, in python.
This is not a clever solution, and could be optimized, but it worked (slowly) in my case.
I modified you image to add a square, like in your example: see
At the end, this code crops the white rectangle in this
Hope you will find it helpful!
import cv2
# Import your picture
input_picture = cv2.imread("LenaWithBG.png")
# Color it in gray
gray = cv2.cvtColor(input_picture, cv2.COLOR_BGR2GRAY)
# Create our mask by selecting the non-zero values of the picture
ret, mask = cv2.threshold(gray,0,255,cv2.THRESH_BINARY)
# Select the contour
mask , cont, _ = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# if your mask is incurved or if you want better results,
# you may want to use cv2.CHAIN_APPROX_NONE instead of cv2.CHAIN_APPROX_SIMPLE,
# but the rectangle search will be longer
cv2.drawContours(gray, cont, -1, (255,0,0), 1)
cv2.imshow("Your picture with contour", gray)
cv2.waitKey(0)
# Get all the points of the contour
contour = cont[0].reshape(len(cont[0]),2)
# we assume a rectangle with at least two points on the contour gives a 'good enough' result
# get all possible rectangles based on this hypothesis
rect = []
for i in range(len(contour)):
x1, y1 = contour[i]
for j in range(len(contour)):
x2, y2 = contour[j]
area = abs(y2-y1)*abs(x2-x1)
rect.append(((x1,y1), (x2,y2), area))
# the first rect of all_rect has the biggest area, so it's the best solution if he fits in the picture
all_rect = sorted(rect, key = lambda x : x[2], reverse = True)
# we take the largest rectangle we've got, based on the value of the rectangle area
# only if the border of the rectangle is not in the black part
# if the list is not empty
if all_rect:
best_rect_found = False
index_rect = 0
nb_rect = len(all_rect)
# we check if the rectangle is a good solution
while not best_rect_found and index_rect < nb_rect:
rect = all_rect[index_rect]
(x1, y1) = rect[0]
(x2, y2) = rect[1]
valid_rect = True
# we search a black area in the perimeter of the rectangle (vertical borders)
x = min(x1, x2)
while x <max(x1,x2)+1 and valid_rect:
if mask[y1,x] == 0 or mask[y2,x] == 0:
# if we find a black pixel, that means a part of the rectangle is black
# so we don't keep this rectangle
valid_rect = False
x+=1
y = min(y1, y2)
while y <max(y1,y2)+1 and valid_rect:
if mask[y,x1] == 0 or mask[y,x2] == 0:
valid_rect = False
y+=1
if valid_rect:
best_rect_found = True
index_rect+=1
if best_rect_found:
cv2.rectangle(gray, (x1,y1), (x2,y2), (255,0,0), 1)
cv2.imshow("Is that rectangle ok?",gray)
cv2.waitKey(0)
# Finally, we crop the picture and store it
result = input_picture[min(y1, y2):max(y1, y2), min(x1,x2):max(x1,x2)]
cv2.imwrite("Lena_cropped.png",result)
else:
print("No rectangle fitting into the area")
else:
print("No rectangle found")
If your mask is incurved or simply if you want better results, you may want to use cv2.CHAIN_APPROX_NONE instead of cv2.CHAIN_APPROX_SIMPLE, but the rectangle search will take more time (because it's a quadratic solution in the best case).
In ImageMagick 6.9.10-30 (or 7.0.8.30) or higher, you can use the -trim function with a new define.
Input:
convert image.png -fuzz 5% -define trim:percent-background=0% -trim +repage result.png
Or for the image presented below:
Input:
convert image2.png -bordercolor black -border 1 -define trim:percent-background=0% -trim +repage result2.png