I'm trying to extract the rotated bounding box of contours robustly. I would like to take an image, find the largest contour, get its rotated bounding box, rotate the image to make the bounding box vertical, and crop to size.
For a demonstration, here is the original image, linked in the code below. I would like to end up with that shoe rotated to vertical and cropped to size. The following code from this answer seems to work on simple images such as lines drawn with OpenCV, but not on photos.
It ends up with this, which is rotated and cropped incorrectly:
EDIT: After changing the threshold type to cv2.THRESH_BINARY_INV, it is now rotated correctly but cropped incorrectly:
import cv2
import matplotlib.pyplot as plt
import numpy as np
import urllib.request
plot = lambda x: plt.imshow(x, cmap='gray').figure
url = 'https://i.imgur.com/4E8ILuI.jpg'
img_path = 'shoe.jpg'
urllib.request.urlretrieve(url, img_path)
img = cv2.imread(img_path, 0)
plot(img)
threshold_value, thresholded_img = cv2.threshold(
    img, 250, 255, cv2.THRESH_BINARY)
_, contours, _ = cv2.findContours(thresholded_img, 1, 1)
contours.sort(key=cv2.contourArea, reverse=True)
shoe_contour = contours[0][:, 0, :]
min_area_rect = cv2.minAreaRect(shoe_contour)
def crop_minAreaRect(img, rect):
    # rotate img
    angle = rect[2]
    rows, cols = img.shape[0], img.shape[1]
    M = cv2.getRotationMatrix2D((cols / 2, rows / 2), angle, 1)
    img_rot = cv2.warpAffine(img, M, (cols, rows))
    # rotate bounding box
    rect0 = (rect[0], rect[1], 0.0)
    box = cv2.boxPoints(rect)
    pts = np.int0(cv2.transform(np.array([box]), M))[0]
    pts[pts < 0] = 0
    # crop
    img_crop = img_rot[pts[1][1]:pts[0][1],
                       pts[1][0]:pts[2][0]]
    return img_crop
cropped = crop_minAreaRect(thresholded_img, min_area_rect)
plot(cropped)
How can I get the correct cropping?
After some research, this is what I get:
This is how I get it:
pad the original image on each side (500 pixels in my case)
find the four corner points of the shoe (the four points should form a polygon enclosing the shoe, but they do not need to form an exact rectangle)
use the code here to crop the shoe:
img = cv2.imread("padded_shoe.jpg")
# four corner points for padded shoe
cnt = np.array([
    [[313, 794]],
    [[727, 384]],
    [[1604, 1022]],
    [[1304, 1444]]
])
print("shape of cnt: {}".format(cnt.shape))
rect = cv2.minAreaRect(cnt)
print("rect: {}".format(rect))
box = cv2.boxPoints(rect)
box = np.int0(box)
width = int(rect[1][0])
height = int(rect[1][1])
src_pts = box.astype("float32")
dst_pts = np.array([[0, height-1],
                    [0, 0],
                    [width-1, 0],
                    [width-1, height-1]], dtype="float32")
M = cv2.getPerspectiveTransform(src_pts, dst_pts)
warped = cv2.warpPerspective(img, M, (width, height))
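For completeness, the padding step and a rough automatic alternative to hand-picking the corner points might look like the sketch below; it assumes a near-white background and OpenCV 4's two-value findContours return, and the file names are just placeholders:
import cv2
import numpy as np

# Pad the original shoe image on each side (500 px, as in step 1 above)
img = cv2.imread('shoe.jpg')
padded = cv2.copyMakeBorder(img, 500, 500, 500, 500,
                            cv2.BORDER_CONSTANT, value=(255, 255, 255))
cv2.imwrite('padded_shoe.jpg', padded)

# Instead of typing the corner points in by hand, threshold the near-white
# background away and take the largest contour; minAreaRect accepts a full
# contour just as well as four corner points.
gray = cv2.cvtColor(padded, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnt = max(contours, key=cv2.contourArea)
rect = cv2.minAreaRect(cnt)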
Cheers, hope it helps.
I want to crop the images to their correct frame. I have about 10,000 hand X-ray images to preprocess, and this is what I have done so far:
Apply Gaussian Blur and Threshold (Binary + Otsu) on the image.
Apply dilation to get a single object (in this case a hand).
Used cv2.findContours() to draw outline along the edges around the hand.
Used cv2.boundingRect() to find the right frame, and then cv2.minAreaRect() and cv2.boxPoints() to get the right points for the bounding box.
Used cv2.warpPerspective to adjust image according to height and width.
The code below describes the above:
import os
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load image, create mask, grayscale, Gaussian blur, Otsu's threshold
img_path = "sample_image.png"
image = cv2.imread(img_path)
original = image.copy()
blank = np.zeros(image.shape[:2], dtype = np.uint8)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (33,33), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Merge text into a single contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations = 3)
# Find contours
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key = lambda x: cv2.boundingRect(x)[0])
for c in cnts:
    # Filter using contour area and aspect ratio (x1 = width, y1 = height)
    x, y, x1, y1 = cv2.boundingRect(c)
    if (x1 > 500) and (y1 > 700):
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        box = np.int0(box)
        width = int(rect[1][0])
        height = int(rect[1][1])
        src_pts = box.astype("float32")
        dst_pts = np.array([[0, height-1], [0, 0],
                            [width-1, 0], [width-1, height-1]], dtype="float32")
        M = cv2.getPerspectiveTransform(src_pts, dst_pts)
        warped = cv2.warpPerspective(image, M, (width, height))
        plt.imshow(warped)
If you have a look at some of the images in the folder, those are the inputs. When I run these images through the code above, I get an output like this. Some of them are cropped nicely (straightened); however, some of them are cropped with 90-degree rotations. Is there a way to counter the 'rotated 90 degrees' output problem?
Here are some images:
Image Inputs: Four X-ray examples
Image Outputs: Returns images that are 90 degrees rotated
Image Outputs wanted: Straightened images (I just used Photoshop to straighten them. I don't want to do this for 10,000 images...)
UPDATE:
I edited the code according to the suggestions mentioned below. After running some samples, it now returns images that are slanted 90 degrees to the right.
Input images:
Output images:
I doubt it's because of the quality of the images. Maybe it has to do with OpenCV's minAreaRect(), or boxPoints()?
FINAL UPDATE:
According to #Prashant Maurya, the code was updated with a function added to detect whether the position of the hand is left or right. And then mapping src_pts to right dst_pts. Full code is shown below.
Hi, there are two changes which will correct the output:
The width and height taken in the code are in the wrong order (i.e. width: 1470 and height: 1118); just switch the values.
Map src_pts to the right dst_pts; the current code maps the top-left corner to the bottom-left, which is why the image is being rotated.
A function was added to detect whether the image is tilted right or left, and to rotate it accordingly.
Full code with changes is:
import os
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load image, create mask, grayscale, Gaussian blur, Otsu's threshold
img_path = "xray1.png"
image = cv2.imread(img_path)
cv2.imshow("image original", image)
cv2.waitKey(10000)
original = image.copy()
blank = np.zeros(image.shape[:2], dtype = np.uint8)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (33,33), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Merge text into a single contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations = 3)
# Find contours
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key = lambda x: cv2.boundingRect(x)[0])
def get_tilt(box):
    tilt = "Left"
    x_list = [coord[0] for coord in box]
    y_list = [coord[1] for coord in box]
    print(x_list)
    print(y_list)
    x_list = sorted(x_list)
    y_list = sorted(y_list)
    print(x_list)
    print(y_list)
    for coord in box:
        if coord[0] == x_list[0]:
            index = y_list.index(coord[1])
            print("Index: ", index)
            if index == 1:
                tilt = "Left"
            else:
                tilt = "Right"
    return tilt
for c in cnts:
    # Filter using contour area and aspect ratio (x1 = width, y1 = height)
    x, y, x1, y1 = cv2.boundingRect(c)
    if (x1 > 500) and (y1 > 700):
        rect = cv2.minAreaRect(c)
        print("rect", rect)
        box = cv2.boxPoints(rect)
        box = np.int0(box)
        # print("rect:", box)
        tilt = get_tilt(box)
        src_pts = box.astype("float32")
        if tilt == "Left":
            width = int(rect[1][1])
            height = int(rect[1][0])
            dst_pts = np.array([[0, 0], [width-1, 0],
                                [width-1, height-1], [0, height-1]], dtype="float32")
        else:
            width = int(rect[1][0])
            height = int(rect[1][1])
            dst_pts = np.array([[0, height-1], [0, 0],
                                [width-1, 0], [width-1, height-1]], dtype="float32")
        print("Src pts:", src_pts)
        print("Dst pts:", dst_pts)
        M = cv2.getPerspectiveTransform(src_pts, dst_pts)
        warped = cv2.warpPerspective(image, M, (width, height))
        print("Showing image ..")
        # plt.imshow(warped)
        cv2.imshow("image crop", warped)
        cv2.waitKey(10000)
I'm trying to extract the detected circles in one image using the circular Hough transform. My idea is to get every circle, or separate each one, then get its color histogram features and send these features to a classifier such as an SVM, ANN, KNN, etc.
This is my input image:
I'm getting the circles this way:
import numpy as np
import cv2
import matplotlib.pyplot as plt
cv2.__version__
#read image
file = "lemon.png"
image = cv2.imread(file)
#BGR to RGB
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
#convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
circles = cv2.HoughCircles(gray,
                           cv2.HOUGH_GRADIENT,
                           15,
                           41,
                           param1=31,
                           param2=31,
                           minRadius=0,
                           maxRadius=33)
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
    # draw the outer circle
    cv2.circle(image,(i[0],i[1]),i[2],(0,255,0),2)
    # draw the center of the circle
    cv2.circle(image,(i[0],i[1]),2,(0,0,255),3)
print("Number of circles: "+ str(len(circles[0,:])))
plt.imshow(image, cmap='gray', vmin=0, vmax=255)
plt.show()
Output:
The next step is to try to extract those circles, but I don't have an idea of how to do it.
Well guys, I would like to see your suggestions; I will appreciate any idea.
Thanks so much.
You can create a binary mask for every circle you detect. Use this mask to extract only the ROIs from the input image. Additionally, you can crop these ROIs and store them in a list to pass them to your classifier.
Here's the code:
import numpy as np
import cv2
# image path
path = "C://opencvImages//"
file = path + "LLfN7.png"
image = cv2.imread(file)
# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
circles = cv2.HoughCircles(gray,
                           cv2.HOUGH_GRADIENT,
                           15,
                           41,
                           param1=31,
                           param2=31,
                           minRadius=0,
                           maxRadius=33)
# Here are your circles:
circles = np.uint16(np.around(circles))
# Get input size:
dimensions = image.shape
# height, width
height = image.shape[0]
width = image.shape[1]
# Prepare a list to store each ROI:
lemonROIs = []
The idea is that you process one circle at a time. Get the current circle, create a mask, mask the original input, crop the ROI and store it inside the list:
for i in circles[0, :]:
    # Prepare a black canvas:
    canvas = np.zeros((height, width))
    # Draw the outer circle:
    color = (255, 255, 255)
    thickness = -1
    centerX = i[0]
    centerY = i[1]
    radius = i[2]
    cv2.circle(canvas, (centerX, centerY), radius, color, thickness)
    # Create a copy of the input and mask input:
    imageCopy = image.copy()
    imageCopy[canvas == 0] = (0, 0, 0)
    # Crop the roi:
    x = centerX - radius
    y = centerY - radius
    h = 2 * radius
    w = 2 * radius
    croppedImg = imageCopy[y:y + h, x:x + w]
    # Store the ROI:
    lemonROIs.append(croppedImg)
For each circle you get a cropped ROI:
You can pass that info to your classifier.
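If it helps, here is a minimal sketch of turning each cropped ROI into a color histogram feature vector with cv2.calcHist; the scikit-learn SVC at the end is only an assumption about which classifier you might use, and `labels` is a hypothetical array of class labels you would have to supply:
import cv2
import numpy as np

def roi_to_histogram_features(roi, bins=8):
    # 3D BGR histogram, normalized and flattened into a fixed-length vector
    hist = cv2.calcHist([roi], [0, 1, 2], None, [bins, bins, bins],
                        [0, 256, 0, 256, 0, 256])
    hist = cv2.normalize(hist, hist).flatten()
    return hist

# one feature vector per cropped circle (lemonROIs comes from the loop above)
features = np.array([roi_to_histogram_features(roi) for roi in lemonROIs])

# feeding the features to a classifier (assumes scikit-learn and a `labels` array):
# from sklearn.svm import SVC
# clf = SVC().fit(features, labels)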
I have code that computes the orientation of a figure. Based on this orientation the figure is then rotated until it is straightened out. This all works fine. What I am struggling with is getting the center of the rotated figure to the center of the whole image. So the center point of the figure should match the center point of the whole image.
Input image:
code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
path = "inputImage.png"
image=cv2.imread(path)
gray=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh=cv2.threshold(gray,0,255,cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
contours,hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)
cnt1 = contours[0]
cnt=cv2.convexHull(contours[0])
angle = cv2.minAreaRect(cnt)[-1]
print("Actual angle is:"+str(angle))
rect = cv2.minAreaRect(cnt)
p=np.array(rect[1])
if p[0] < p[1]:
    print("Angle along the longer side:"+str(rect[-1] + 180))
    act_angle = rect[-1]+180
else:
    print("Angle along the longer side:"+str(rect[-1] + 90))
    act_angle = rect[-1]+90
#act_angle gives the angle of the minAreaRect with the vertical
if act_angle < 90:
    angle = (90 + angle)
    print("angle less than -45")
# otherwise, just take the inverse of the angle to make
# it positive
else:
    angle = act_angle-180
    print("greater than 90")
# rotate the image to deskew it
(h, w) = image.shape[:2]
print(h,w)
center = (w // 2, h // 2)
print(center)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h),flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)
plt.imshow(rotated)
cv2.imwrite("rotated.png", rotated)
With output:
As you can see the white figure is slightly placed to left, I want it to be perfectly centered.
Does anyone know how this can be done?
EDIT: I have tried @joe's suggestion and subtracted the centroid coordinates from the center of the image (obtained by dividing the width and height of the picture by 2). From this I got an offset, which has to be added to the array that describes the image. But I don't know how to add the offset to the array. How would this work with the x and y coordinates?
The code:
img = cv2.imread("inputImage")
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray_image,127,255,0)
height, width = gray_image.shape
print(img.shape)
wi=(width/2)
he=(height/2)
print(wi,he)
M = cv2.moments(thresh)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])
offsetX = (wi-cX)
offsetY = (he-cY)
print(offsetX,offsetY)
print(cX,cY)
Here is one way in Python/OpenCV.
Get the bounding box for the white region from the contours. Compute the offset for the recentered region. Use numpy slicing to copy that to the center of a black background the size of the input.
Input:
import cv2
import numpy as np
# read image as grayscale
img = cv2.imread('white_shape.png', cv2.IMREAD_GRAYSCALE)
# get shape
hh, ww = img.shape
# get contours (presumably just one around the nonzero pixels)
contours = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
for cntr in contours:
    x,y,w,h = cv2.boundingRect(cntr)
# recenter
startx = (ww - w)//2
starty = (hh - h)//2
result = np.zeros_like(img)
result[starty:starty+h,startx:startx+w] = img[y:y+h,x:x+w]
# view result
cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save recentered image
cv2.imwrite('white_shape_centered.png',result)
One approach is to obtain the bounding box coordinates of the binary object then crop the ROI using Numpy slicing. From here we calculate the new shifted coordinates then paste the ROI onto a new blank mask.
Code
import cv2
import numpy as np
# Load image as grayscale and obtain bounding box coordinates
image = cv2.imread('1.png', 0)
height, width = image.shape
x,y,w,h = cv2.boundingRect(image)
# Create new blank image and shift ROI to new coordinates
mask = np.zeros(image.shape, dtype=np.uint8)
ROI = image[y:y+h, x:x+w]
x = width//2 - ROI.shape[1]//2
y = height//2 - ROI.shape[0]//2
mask[y:y+h, x:x+w] = ROI
cv2.imshow('ROI', ROI)
cv2.imshow('mask', mask)
cv2.waitKey()
@NawinNarain, from the point where you found the relative shifts w.r.t. the centroid of the image, it is very straightforward: you make an affine matrix with these translations and apply cv2.warpAffine() to your image. That's it.
T = np.float32([[1, 0, shift_x], [0, 1, shift_y]])
We then use warpAffine() to transform the image using the matrix, T
centered_image = cv2.warpAffine(image, T, (orig_width, orig_height))
This will transform your image so that the centroid is at the center. Hope this helps. The complete center image function will look like this:
def center_image(image):
    height, width = image.shape
    print(image.shape)
    wi = (width/2)
    he = (height/2)
    print(wi, he)
    ret, thresh = cv2.threshold(image, 95, 255, 0)
    M = cv2.moments(thresh)
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])
    offsetX = (wi-cX)
    offsetY = (he-cY)
    T = np.float32([[1, 0, offsetX], [0, 1, offsetY]])
    centered_image = cv2.warpAffine(image, T, (width, height))
    return centered_image
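A quick usage sketch (the file name is a placeholder): the function expects a single-channel grayscale image, since it unpacks image.shape into exactly two values.
import cv2
import numpy as np

# load as grayscale so image.shape is (height, width)
image = cv2.imread("inputImage.png", cv2.IMREAD_GRAYSCALE)
centered = center_image(image)
cv2.imwrite("centered.png", centered)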
I am working with an image with distorted/rotated texts. I need to rotate these text blobs back to the horizontal level (0 degrees) before I can run OCR on them. I managed to fix the rotation issue but now I need to find a way to copy the contents of the original contour to the rotated matrix.
Here are a few things I've done to extract and fix the rotation issue:
Find contour
Heavy dilation and remove non-text lines
Find the contour angle and do angle correction in the polar space.
I have tried using affine transformation to rotate the rectangle text blobs but it ended up cropping out some of the texts because some of the text blobs are irregular. Result here
Blue dots in the contours are centroids, the numbers are contour angles. How can I copy the content of unrotated contour, rotate them and copy to a new image?
Code
def getContourCenter(contour):
    M = cv2.moments(contour)
    if M["m00"] != 0:
        cx = int(M['m10']/M['m00'])
        cy = int(M['m01']/M['m00'])
    else:
        return 0, 0
    return int(cx), int(cy)
def rotateContour(contour, center: tuple, angle: float):

    def cart2pol(x, y):
        theta = np.arctan2(y, x)
        rho = np.hypot(x, y)
        return theta, rho

    def pol2cart(theta, rho):
        x = rho * np.cos(theta)
        y = rho * np.sin(theta)
        return x, y

    # Translating the contour by subtracting the center with all the points
    norm = contour - [center[0], center[1]]

    # Convert the points to polar co-ordinates, add the rotation, and convert it back to Cartesian co-ordinates.
    coordinates = norm[:, 0, :]
    xs, ys = coordinates[:, 0], coordinates[:, 1]
    thetas, rhos = cart2pol(xs, ys)

    thetas = np.rad2deg(thetas)
    thetas = (thetas + angle) % 360
    thetas = np.deg2rad(thetas)

    # Convert the new polar coordinates to cartesian co-ordinates
    xs, ys = pol2cart(thetas, rhos)
    norm[:, 0, 0] = xs
    norm[:, 0, 1] = ys

    rotated = norm + [center[0], center[1]]
    rotated = rotated.astype(np.int32)
    return rotated
def straightenText(image, vis):
    # create a new mat
    mask = 0*np.ones([image.shape[0], image.shape[1], 3], dtype=np.uint8)

    # invert pixel index arrangement and dilate aggressively
    dilate = cv2.dilate(~image, ImageUtils.box(33, 1))

    # find contours
    _, contours, hierarchy = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

    for contour in contours:
        [x, y, w, h] = cv2.boundingRect(contour)
        if w > h:
            # find contour angle and centers
            (x, y), (w, h), angle = cv2.minAreaRect(contour)
            cx, cy = getContourCenter(contour)

            # fix angle returned
            if w < h:
                angle = 90 + angle

            # fix contour angle
            rotatedContour = rotateContour(contour, (cx, cy), 0-angle)

            cv2.drawContours(vis, contour, -1, (0, 255, 0), 2)
            cv2.drawContours(mask, rotatedContour, -1, (255, 0, 0), 2)
            cv2.circle(vis, (cx, cy), 2, (0, 0, 255), 2, 8)  # centroid
            cv2.putText(vis, str(round(angle, 2)), (cx, cy), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 0, 0), 2)
Here is one way, which is the simplest way I can think of to do it in Python/OpenCV, though it may not be optimal in speed.
Create a white empty image for the desired output (so we have black text on a white background, in case you need to do OCR).
Get the rotated bounding rectangle of your contour in your input.
Get the normal bounding rectangle of your contour in the output.
Get the 4 bounding box corners for each.
Compute an affine transform matrix between the two sets of 4 corner points.
Warp the (whole) input image to the same size (non-optimal).
Use the output bounding box dimensions and upper-left corner with numpy slicing to transfer the region in the warped image to the same region in the white output image.
Repeat for each text contour, using the resulting image in place of the original white image as the new destination image.
So here is a simulation to show you how.
Source Text Image:
Source Text Image with Red Rotated Rectangle:
Desired Bounding Rectangle in White Destination Image:
Text Transferred To White Image into Desired Rectangle Region:
Code:
import cv2
import numpy as np
# Read source text image.
src = cv2.imread('text_on_white.png')
hs, ws, cs = src.shape
# Read same text image with red rotated bounding box drawn.
src2 = cv2.imread('text2_on_white.png')
# Read white image showing desired output bounding box.
src2 = cv2.imread('text2_on_white.png')
# create white destination image
dst = np.full((hs,ws,cs), (255,255,255), dtype=np.uint8)
# define coordinates of bounding box in src
src_pts = np.float32([[51,123], [298,102], [300,135], [54,157]])
# size and placement of text in dst is (i.e. bounding box):
xd = 50
yd = 200
wd = 249
hd = 123
dst_pts = np.float32([[50,200], [298,200], [298,234], [50,234]])
# get rigid affine transform (no skew)
# use estimateRigidTransform rather than getAffineTransform so can use all 4 points
matrix = cv2.estimateRigidTransform(src_pts, dst_pts, 0)
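# (added note, not from the original answer) if you are on OpenCV 4.x, where
# estimateRigidTransform is no longer available, cv2.estimateAffinePartial2D
# should give a comparable 2x3 matrix (it also returns an inlier mask):
# matrix, _ = cv2.estimateAffinePartial2D(src_pts, dst_pts)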
# warp the source image
src_warped = cv2.warpAffine(src, matrix, (ws,hs), flags=cv2.INTER_AREA, borderValue=(255,255,255))
# do numpy slicing on warped source and place in white destination
dst[yd:yd+hd, xd:xd+wd] = src_warped[yd:yd+hd, xd:xd+wd]
# show results
cv2.imshow('SRC', src)
cv2.imshow('SRC2', src2)
cv2.imshow('SRC_WARPED', src_warped)
cv2.imshow('DST', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('text_on_white_transferred.png', dst)
To extract ONLY the content of a single contour, and not its larger bounding box, you can create a mask by drawing a filled contour and then applying that to the original image. In your case you would need something like this:
# prepare the target image
resX,resY = image.shape[1],image.shape[0]
target = np.zeros((resY, resX , 3), dtype=np.uint8)
target.fill(255) # make it entirely white
# find the contours
allContours,hierarchy = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# then perform rotation, etc, per contour
for contour in allContours:
    # create empty mask
    mask = np.zeros((resY, resX , 1), dtype=np.uint8)
    # draw the contour filled into the mask
    cv2.drawContours(mask, [contour], -1, (255), thickness=cv2.FILLED)
    # copy the relevant part into a new image
    # (you might want to use bounding box here for more efficiency)
    single = cv2.bitwise_and(image, image, mask=mask)
    # then apply your rotation operations both on the mask and the result
    single = doContourSpecificOperation(single)
    mask = doContourSpecificOperation(mask)
    # then, put the result into your target image (which was originally white)
    target = cv2.bitwise_and(target, single, mask=mask)
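doContourSpecificOperation is just a placeholder; as a rough sketch for your rotation case, it could be a per-contour rotation built from the centroid and angle you already compute (cx, cy and angle below are assumed to come from your own code), applied identically to both single and mask:
import cv2

def makeRotator(center, angle):
    # builds a one-argument operation that rotates an image around `center`,
    # so the same operation can be applied to both `single` and `mask` above
    def doContourSpecificOperation(img):
        h, w = img.shape[:2]
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        return cv2.warpAffine(img, M, (w, h))
    return doContourSpecificOperation

# inside the loop, for example:
# doContourSpecificOperation = makeRotator((cx, cy), -angle)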
I need to detect the corner of a paper in a given image. It will always be a cropped part of the whole picture containing only one of the corners. My idea was to transform the image by blurring and Canny edge detection to get outlines, and then apply HoughLines to get the coordinates of the corner.
However, I have trouble actually detecting anything consistently and precisely with Hough lines, and I'm running out of ideas about what the cause could be.
I've tried thresholding instead of Canny, but it won't work due to the many variations in the applicable images. I've downscaled the whole image to make it easier to see just the edges of the paper, but still no improvement. Increasing the line thresholds makes lines from the paper content disappear, but at the same time the edge lines disappear from time to time.
Input
Edges
Results
Code to reproduce
import cv2
import numpy as np
img = cv2.imread('inv_0001-01.1_0_corner.jpg')
resized = cv2.resize(img, (250,250), interpolation = cv2.INTER_AREA)
gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)
edges = cv2.Canny(blur_gray,50,150,apertureSize = 3)
cv2.imshow('edges', edges)
cv2.waitKey()
min_line_length = 50
max_line_gap = 20
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 5, np.array([]), min_line_length, max_line_gap)
for line in lines:
    for x1,y1,x2,y2 in line:
        cv2.line(resized,(x1,y1),(x2,y2),(255,0,0),5)
cv2.imshow('hough', resized)
cv2.waitKey()
My desired result would be the coordinates of the paper corner in the given image, but in this post I'm rather looking for some help in understanding how to use HoughLines for such tasks.
This answer explains how to find the corner. Finding the corner requires a two part solution. First, the image needs to be segmented in to two regions: paper and background. Second, you can look for corners in the segmented image.
After you find the edges, floodfill the image to segment the paper from the background (this is the floodfill image):
mask = np.zeros((h+2, w+2), np.uint8)
# Floodfill from point (0, 0)
cv2.floodFill(edges, mask, (0,0), 123);
Now that you have segmented the image, get rid of the text on the paper using a mask (this is the image titled 'Masking'):
bg = np.zeros_like(edges)
bg[edges == 123] = 255
After you get the mask, apply the Canny edge filter again to get the outline of the paper (HoughLines needs an outline, not a mask... this is the 'Edges after masking' image):
bg = cv2.blur(bg, (3,3))
edges = cv2.Canny(bg,50,150,apertureSize = 3)
Now you can run your HoughLines algorithm on the cleaner image. I used a different HoughLines algorithm than you did, but yours should work too. Here is the full code that I used:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Create a multi plot
f, axarr = plt.subplots(2,3, sharex=True)
img = cv2.imread('/home/stephen/Desktop/IRcCAWL.png')
resized = cv2.resize(img, (250,250), interpolation = cv2.INTER_AREA)
# Show source image
axarr[0,0].imshow(resized)
gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)
edges = cv2.Canny(blur_gray,50,150,apertureSize = 3)
# Show first edges image
axarr[0,1].imshow(edges)
h, w = edges.shape[:2]
mask = np.zeros((h+2, w+2), np.uint8)
# Floodfill from point (0, 0)
cv2.floodFill(edges, mask, (0,0), 123);
# Show the flood fill image
axarr[0,2].imshow(edges)
floodfill = edges.copy()
bg = np.zeros_like(edges)
bg[edges == 123] = 255
# Show the masked image
axarr[1,0].imshow(bg)
bg = cv2.blur(bg, (3,3))
edges = cv2.Canny(bg,50,150,apertureSize = 3)
# Show the edges after masking
axarr[1,1].imshow(edges)
min_line_length = 50
max_line_gap = 20
def intersection(line1, line2):
    """Finds the intersection of two lines given in Hesse normal form.

    Returns closest integer pixel locations.
    See https://stackoverflow.com/a/383527/5087436
    """
    rho1, theta1 = line1[0]
    rho2, theta2 = line2[0]
    A = np.array([
        [np.cos(theta1), np.sin(theta1)],
        [np.cos(theta2), np.sin(theta2)]
    ])
    b = np.array([[rho1], [rho2]])
    x0, y0 = np.linalg.solve(A, b)
    x0, y0 = int(np.round(x0)), int(np.round(y0))
    return [[x0, y0]]
import math
lines = cv2.HoughLines(edges, 1, np.pi / 180, 100, None, 0, 0)
# Draw the lines
if lines is not None:
    for i in range(0, len(lines)):
        rho = lines[i][0][0]
        theta = lines[i][0][1]
        a = math.cos(theta)
        b = math.sin(theta)
        x0 = a * rho
        y0 = b * rho
        pt1 = (int(x0 + 1000*(-b)), int(y0 + 1000*(a)))
        pt2 = (int(x0 - 1000*(-b)), int(y0 - 1000*(a)))
        cv2.line(resized, pt1, pt2, (123,234,123), 2, cv2.LINE_AA)
xy = tuple(intersection(lines[0], lines[1])[0])
resized = cv2.circle(resized, xy, 5, 255, 2)
# Show the image with the corner
axarr[1,2].imshow(resized)
# Add titles
axarr[0,0].set_title('Source Image')
axarr[0,1].set_title('Edges')
axarr[0,2].set_title('Floodfill')
axarr[1,0].set_title('Masking')
axarr[1,1].set_title('Edges after masking')
axarr[1,2].set_title('Hough Lines')
# Clean up
axarr[0,0].axis('off')
axarr[0,1].axis('off')
axarr[1,0].axis('off')
axarr[1,1].axis('off')
axarr[1,2].axis('off')
axarr[0,2].axis('off')
plt.show()