I have a few images of pages on a table. I would like to crop the pages out of the image. Generally, the page will be the biggest rectangle in the image; however, all four sides of the rectangle might not be visible in some cases.
I am doing the following but not getting desired results:
import cv2
import numpy as np

im = cv2.imread('images/img5.jpg')
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)

# Simple global threshold at 127
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# OpenCV 3 returns (image, contours, hierarchy); OpenCV 4 returns (contours, hierarchy)
_, contours, _ = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

# Pick the contour with the largest area and draw its bounding box
areas = [cv2.contourArea(c) for c in contours]
max_index = np.argmax(areas)
cnt = contours[max_index]
x, y, w, h = cv2.boundingRect(cnt)
cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow("Show", im)
cv2.imwrite("images/img5_rect.jpg", im)
cv2.waitKey(0)
Below are a few examples:
1st Example: I can find the rectangle in this image; however, I would like the remaining part of the wood to be cropped out as well.
2nd Example: Not finding the correct dimensions of the rectangle in this image.
3rd Example: Not able to find the correct dimensions in this image either.
4th Example: Same with this as well.
As I have previously done something similar, I have experimented with Hough transforms, but they were much harder to get right for my case than using contours. I have the following suggestions to help you get started:
Generally, paper (at the edges, at least) is white, so you may have better luck converting to a colorspace like YUV, which better separates luminosity:
image_yuv = cv2.cvtColor(image, cv2.COLOR_BGR2YUV)
image_y = image_yuv[:, :, 0].copy()  # keep only the luminance (Y) channel
The text on the paper is a problem. Use a blurring effect to (hopefully) remove this high-frequency noise. You may also use morphological operations like dilation:
image_blurred = cv2.GaussianBlur(image_y, (3, 3), 0)
You may try applying a Canny edge detector rather than a simple threshold. It is not strictly necessary, but it may help:
edges = cv2.Canny(image_blurred, 100, 300, apertureSize=3)
Then find the contours. In my case I only used the extreme outer contours. You may use the CHAIN_APPROX_SIMPLE flag to compress the contours:
contours, hierarchy = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
Now you should have a bunch of contours. Time to find the right ones. For each contour cnt, first find the convex hull, then use approxPolyDP to simplify the contour as much as possible:
hull = cv2.convexHull(cnt)
simplified_cnt = cv2.approxPolyDP(hull, 0.001 * cv2.arcLength(hull, True), True)
Now we should use this simplified contour to find the enclosing quadrilateral. You may experiment with lots of rules of your own. The simplest method is to pick the four longest segments of the contour and then create the enclosing quadrilateral by intersecting these four lines. Depending on your case, you can also select these lines based on the contrast along the line, the angle they make, and similar properties.
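As a rough illustration of the segment-picking step, here is a hedged sketch; the helper names are my own, not from my original code, and the degenerate-case handling is minimal:

import numpy as np

def line_intersection(p1, p2, p3, p4):
    # Intersect the infinite lines through (p1, p2) and (p3, p4).
    a1, b1 = p2[1] - p1[1], p1[0] - p2[0]
    c1 = a1 * p1[0] + b1 * p1[1]
    a2, b2 = p4[1] - p3[1], p3[0] - p4[0]
    c2 = a2 * p3[0] + b2 * p3[1]
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None  # (nearly) parallel lines
    return [(b2 * c1 - b1 * c2) / det, (a1 * c2 - a2 * c1) / det]

def enclosing_quadrilateral(simplified_cnt):
    pts = simplified_cnt.reshape(-1, 2).astype(np.float64)
    n = len(pts)
    # Length and starting index of every polygon segment (wrapping around).
    segments = [(np.linalg.norm(pts[(i + 1) % n] - pts[i]), i) for i in range(n)]
    # Take the four longest segments, restored to contour order.
    longest = sorted(sorted(segments, reverse=True)[:4], key=lambda s: s[1])
    corners = []
    for k in range(4):
        i = longest[k][1]
        j = longest[(k + 1) % 4][1]
        p = line_intersection(pts[i], pts[(i + 1) % n], pts[j], pts[(j + 1) % n])
        if p is not None:
            corners.append(p)
    return np.array(corners, dtype=np.float32)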
Now you have a bunch of quadrilaterals, and you can use a two-step method to find the required one. First, remove the ones that are probably wrong, for example quadrilaterals with one angle of more than 175 degrees; then pick the one with the biggest area as the final result. The orange contour shows one of the results I got at this point:
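For the angle filter and the area pick, a small hypothetical helper (quads is assumed to be a list of 4x2 corner arrays, e.g. from the sketch above):

import cv2
import numpy as np

def max_interior_angle(quad):
    # Angle at each corner, formed by the two adjacent corners.
    angles = []
    for i in range(4):
        a, b, c = quad[i - 1], quad[i], quad[(i + 1) % 4]
        v1, v2 = a - b, c - b
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        angles.append(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))
    return max(angles)

candidates = [q for q in quads if max_interior_angle(q) < 175]
best = max(candidates, key=lambda q: cv2.contourArea(q.astype(np.float32)))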
The final step, after finding (hopefully) the right quadrilateral, is transforming it back to a rectangle. For this you can use findHomography to come up with a transformation matrix:
(H, mask) = cv2.findHomography(
    cnt.astype('single'),
    np.array([[[0., 0.]], [[2150., 0.]], [[2150., 2800.]], [[0., 2800.]]],
             dtype=np.single))
The numbers assume projecting onto letter-size paper; you may come up with better and more clever numbers to use. You also need to reorder the contour points to match the order of the coordinates of the letter paper. Then you call warpPerspective to create the final image:
final_image = cv2.warpPerspective(image,H,(2150, 2800))
This warping should result in something like the following (from my results before):
I hope this helps you to find an appropriate approach in your case.
That's a pretty complicated task which cannot be solved by simply searching for contours. The Economist cover, for example, only shows one edge of the magazine, which splits the image in half. How should your computer know which half is the magazine and which is the table? So you have to add much more intelligence to your program.
You might look for lines in your image, with a Hough transform for example. Then find sets of more or less parallel or orthogonal lines, lines of a certain length, and so on.
Find prints by checking for typical print colours, or colours that you usually don't find on a table. Search for the high-contrast frequencies created by printed text.
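If you go the line route, a hedged starting point could look like this (gray is assumed to be a grayscale image; all parameters are illustrative):

import cv2
import numpy as np

edges = cv2.Canny(gray, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)
if lines is not None:
    # Bucket the line angles so near-parallel / near-orthogonal sets stand out.
    angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180
              for x1, y1, x2, y2 in lines[:, 0]]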
Imagine how you as a human recognize a printed paper...
All in all, this is too broad a question for StackOverflow. Try to break it down into smaller sub-problems, try to solve them, and if you hit a wall, come back here.
Related
I want to separate a noiseless 1-bit (black and white) image with white circles based on the concave part of the outline.
Please refer to the picture below.
This is the white object to separate:
The target result is:
Here is my implementation with the watershed algorithm:
The above result is not what I want.
If the size of the separated objects is similar, my algorithm is fine, but if the size difference is large, a problem occurs as shown in the picture above.
I would like to implement an opencv algorithm that can segment a region like the second picture.
However, the input photo is not necessarily a perfect circle.
It can be oval like the picture below:
Or it can be squished:
However, I would like to separate it based on the concave part of the outline anyway.
I think it could be implemented with a good use of the distanceTransform function, but I'm not sure how to approach it.
Please point me in the right direction.
Thank you.
Here is an algorithm which should give you a good start.
Compute all contours.
For each contour compute the convexity defects. If there is no defect the contour is an isolated circle and you can segment it out.
After you have handled all the isolated circles, you can work out the remaining contours by counting the convexity defects: the number of circles N for each contour is the number of convexity defects divided by 2.
Use a clustering algorithm (https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html should do well given the shapes you have) and cluster the "white" points using N as the number of clusters to be found.
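A hedged sketch of this recipe (bw is assumed to be a binary uint8 image with white blobs on black; the defect-depth threshold is an illustrative guess, and the counting rule is taken from the steps above):

import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

# OpenCV 4 return signature; OpenCV 3 returns (image, contours, hierarchy).
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    hull_idx = cv2.convexHull(cnt, returnPoints=False)
    defects = cv2.convexityDefects(cnt, hull_idx)
    # Keep only deep defects; depth is fixed-point, 1/256 of a pixel.
    n_defects = 0 if defects is None else int(np.sum(defects[:, 0, 3] > 2560))
    if n_defects == 0:
        continue  # isolated circle: segment it out as-is
    n_circles = max(1, n_defects // 2)
    # Cluster the white pixels of this blob into n_circles components.
    mask = np.zeros(bw.shape, np.uint8)
    cv2.drawContours(mask, [cnt], -1, 255, -1)
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(np.float64)
    gmm = GaussianMixture(n_components=n_circles, random_state=0).fit(pts)
    labels = gmm.predict(pts)  # one cluster id per white pixel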
If you want to find the minimal openings, you can use a medial-axis-based approach.
Pseudo code:
compute contours of bitmap
compute medial axis of bitmap
for each point on the medial axis:
    get the minimal distance d to the contour from the medial-axis transform
for each local minimum of distance d:
    get the two points on the bitmap contour with minimal distance that are at least d apart from each other
    use these two points for the dividing line
If you need a working implementation in Python, please let me know; I would use the skimage lib. For other languages you might have to implement the medial axis on your own, but that shouldn't be a big deal.
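As a starting point, here is a hedged skimage sketch of the first steps (bw is assumed to be a boolean image, True on the object; the naive 8-neighbourhood minimum test is my own shortcut):

import numpy as np
from skimage.morphology import medial_axis

skel, dist = medial_axis(bw, return_distance=True)
# Distance to the boundary along the skeleton; local minima mark the necks.
skel_dist = np.where(skel, dist, np.inf)
ys, xs = np.nonzero(skel)
necks = []
for y, x in zip(ys, xs):
    neigh = skel_dist[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
    if np.isfinite(skel_dist[y, x]) and skel_dist[y, x] == neigh.min():
        # Candidate neck: the dividing line here is roughly 2 * d long.
        necks.append((x, y, dist[y, x]))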
I'm facing some general problems regarding the edge detection in an image (the image should be irrelevant for my question).
I want the Canny edge detector to ignore a certain pixel value. For example, it should only look for edges if the gray value is not 0; otherwise "false edges" will be detected.
I usually use the cv2.canny function, which works quite fast and well. The problem is that it is not customizable. So I took this code for a custom Canny edge detector (https://rosettacode.org/wiki/Canny_edge_detector#Python) in order to customize it. It works, but it calculates the edges far too slowly (it takes several minutes, whereas the cv2.canny function takes a fraction of a second).
This is my first problem.
Is there another way to make the cv2.canny function "ignore" pixels of a certain value? Imagine that somewhere in the picture there is an area filled with black (see the image below). I don't want the edge detector to detect the edge of this black area.
Once I have some clear edges detected in my image, I want to create masks based on those edges. I couldn't find any examples for this online, so if anyone knows where to find a good tutorial on how to create masks from edges, it would be great if you could help me out.
Thanks in advance
Here's an approach:
Calculate your Canny as usual using the fast OpenCV function.
Now locate all the black pixels in the image - you can do that with _, thr = cv2.threshold(im, 1, 255, cv2.THRESH_BINARY) - and dilate those areas by 1 pixel with morphology, to allow for the edges being offset a little, as they often are.
Multiply the normal Canny image by the mask you created, so that anything it found in the black areas gets multiplied by zero, i.e. lost.
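A minimal sketch of those three steps (im is assumed to be a grayscale uint8 image; the Canny thresholds are illustrative):

import cv2
import numpy as np

edges = cv2.Canny(im, 100, 200)
# Mask of valid pixels: everything that is not pure black.
_, valid = cv2.threshold(im, 1, 255, cv2.THRESH_BINARY)
# Dilating the black areas by a pixel is the same as eroding the valid mask.
valid = cv2.erode(valid, np.ones((3, 3), np.uint8), iterations=1)
# "Multiply" via bitwise AND: edges inside the black areas are lost.
edges_masked = cv2.bitwise_and(edges, valid)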
I am working on an application where I need feature like Cam Scanner where document is to be detected in an image. For that I am using Canny Edge detection followed by Hough Transform.
The results look promising but the text in the document is creating issues as explained via images below:
Original Image
After canny edge detection
After hough transform
My issue lies in the third image: the text in the original image near the bottom has forced the Hough transform to detect a horizontal line (2nd cluster from the bottom).
I know I can take the largest quadrilateral, and that would work fine in most cases, but I still want to know other ways in which I can ignore the effect of the text on the edges during this processing.
Any help would be appreciated.
I solved the issue of the text with the help of a median filter of size 15 (square) on an image of 500x700.
The median filter doesn't affect the boundaries of the paper, but it can eliminate the text completely.
Using it, I was able to get much cleaner boundaries.
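In code that boils down to something like this (gray is assumed to be the grayscale page image; the Canny thresholds are illustrative, while the kernel size 15 is the one that worked at 500x700):

import cv2

smoothed = cv2.medianBlur(gray, 15)   # removes the text, keeps the page edges
edges = cv2.Canny(smoothed, 50, 150)  # illustrative thresholds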
Another approach you could try is to use thresholding to find the paper boundaries. This would create a binary image. You can then examine the blobs of white pixels and see if any are large enough to be the paper and have the right dimensions. If one fits the criteria, you can find the min/max points of this blob to represent the paper.
There are several ways to do the thresholding, including iterative, otsu, and adaptive.
Also, for best results you may have to dilate the binary image to close the black lines in the table as shown in your example.
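A hedged sketch of this approach using Otsu thresholding (gray is assumed to be a grayscale uint8 image; the size and aspect-ratio criteria are illustrative):

import cv2

_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Dilate to close the dark lines of the table, as mentioned above.
binary = cv2.dilate(binary, None, iterations=2)
# OpenCV 4 return signature; OpenCV 3 returns (image, contours, hierarchy).
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    # Large enough to be the paper, with a plausible aspect ratio?
    if w * h > 0.2 * gray.size and 0.5 < w / h < 2.0:
        print("candidate paper blob:", (x, y, w, h))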
I'm a newbie in computer vision. My goal is to distinguish individual cells on a set of pictures like this: Example
Basically, I blur the whole image, find the regional maxima on it, and use them as seeds in the watershed algorithm on the distance transform of the thresholded blurred image. In fact, I'm following the tutorial which you can find here:
github/luispedro/python-image-tutorial
(sorry, can't post more than 2 links).
My problem is that some cells in my set have a very distinguishable dark nucleus (which you can see in the example), and my algorithm produces results like this, which are clearly wrong.
Of course it's possible to fix it by increasing the strength of the Gaussian blur, but that merges some other cells together, which is even worse.
What can be done to solve this problem? What are the other possibilities if watershed just isn't suitable for this case (keeping in mind that my set is pretty small and learning seems impossible)?
The watershed tends to over-segment if you don't use a watershed with markers.
Usually we start with the DNA/DAPI segmentation, which is easy; it provides the number of cells and the inner markers for the watershed.
If you blur the images, you smooth all the patterns. You should instead use an alternating sequential filter (opening / closing) in order to simplify each zone, and then try an ultimate erosion in order to find the inner seeds for your watershed.
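A hedged sketch of a marker-based pipeline along these lines (gray is assumed to be a grayscale uint8 image with cells brighter than the background; the distance-transform peak threshold stands in for the ultimate erosion, and all kernel sizes are illustrative):

import cv2
import numpy as np

# Alternating sequential filter: opening then closing with growing kernels.
smoothed = gray.copy()
for k in (3, 5, 7):
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (k, k))
    smoothed = cv2.morphologyEx(smoothed, cv2.MORPH_OPEN, kernel)
    smoothed = cv2.morphologyEx(smoothed, cv2.MORPH_CLOSE, kernel)

_, binary = cv2.threshold(smoothed, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Inner seeds: strong peaks of the distance transform, a rough
# approximation of the ultimate erosion mentioned above.
dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.6 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)
sure_bg = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=3)
unknown = cv2.subtract(sure_bg, sure_fg)

_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1        # background becomes 1, seeds become 2..n+1
markers[unknown == 255] = 0  # 0 marks the region watershed must resolve
markers = cv2.watershed(cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR), markers)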
Given an image of a connect-4 board I'd like to recognize and output the board's state (a 6 by 7 matrix). The first approach I tried was based on finding the circles and then looking for a grid pattern in their centroids.
This is the open-cv function I'm using:
circles = cv2.HoughCircles(bw_im,
                           cv2.cv.CV_HOUGH_GRADIENT,  # cv2.HOUGH_GRADIENT in OpenCV 3+
                           dp=DP,
                           minDist=MIN_DIST,
                           minRadius=MIN_RADIUS,
                           maxRadius=MAX_RADIUS)
I added non-maximum suppression, but the results are not great.
Is there a better way than dealing with Hough circles directly? Perhaps there is some sort of filled-circularity morphological operation that I don't know of.
Here's an example input image:
You can assume that the input image has been cropped and has similar margins as above (I have another piece of code that takes care of this).
If Hough isn't a requirement, I'd suggest implementing a ray-casting algorithm as described here: https://en.wikipedia.org/wiki/Point_in_polygon
The general steps are:
Create a mask for the red circles
Run ray-casting on x columns spaced y apart to determine the number and positions of the red pieces
Repeat steps 1 & 2 for yellow
Since you're working in RGB, the color contrast should be enough to give you good results.
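A hedged sketch of steps 1 and 2 (img is assumed to be the BGR board image; the HSV ranges and the column spacing are illustrative guesses):

import cv2
import numpy as np

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Red wraps around hue 0, so combine two ranges.
red = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255)) | \
      cv2.inRange(hsv, (170, 120, 70), (180, 255, 255))

def scan_columns(mask, spacing=40):
    # Cast a vertical ray every `spacing` pixels and record where it
    # enters and leaves the mask, as in point-in-polygon ray casting.
    hits = []
    for x in range(spacing // 2, mask.shape[1], spacing):
        col = mask[:, x] > 0
        crossings = np.flatnonzero(np.diff(col.astype(np.int8)))
        for y0, y1 in zip(crossings[::2], crossings[1::2]):
            hits.append((x, (y0 + y1) // 2))  # center of each run
    return hits

red_hits = scan_columns(red)
# Repeat with a yellow mask, e.g. cv2.inRange(hsv, (20, 120, 70), (35, 255, 255))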
Assuming your grid maintains its position, the easiest way would be to set up a fixed region of interest for every slot and measure its hue values every time you change something.
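A minimal sketch of that fixed-ROI idea (img is assumed to be the cropped BGR board; the 6x7 grid geometry and the hue cut-offs are illustrative assumptions):

import cv2
import numpy as np

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
rows, cols = 6, 7
h, w = img.shape[:2]
board = np.zeros((rows, cols), np.int8)  # 0 empty, 1 red, 2 yellow
for r in range(rows):
    for c in range(cols):
        # Small patch at the center of each slot.
        y, x = int((r + 0.5) * h / rows), int((c + 0.5) * w / cols)
        patch = hsv[y - 4:y + 5, x - 4:x + 5].reshape(-1, 3)
        hue, sat = np.median(patch[:, 0]), np.median(patch[:, 1])
        # Note: naive hue handling near the red wrap-around at 0/180.
        if sat > 100:  # saturated enough to be a piece, not an empty slot
            board[r, c] = 1 if (hue < 15 or hue > 165) else 2
print(board)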