I’m trying to find the corners of a polygon segmentation that was made with Yolov8, as in this image:
chessboard segmentation
This is my code:
model_trained = YOLO("runs/segment/yolov8n-seg_chessboard/weights/best.pt")
results = model_trained.predict(source="1.jpgresized.jpg", line_thickness=2, save_txt=True, save=True)
masks = results[0].masks # Masks object
masks.segments[0] # a numpy array of
I'm not able to figure out how to get the four corners of the segmentation out of this array.
Note that this could involve quite a lot of fine-tuning for your particular case. The idea here is to pass the segmentation mask to goodFeaturesToTrack, which finds strong corners in it, and then pick the 4 best candidates. Finally, they are plotted on the original image.
from ultralytics.yolo.engine.model import YOLO
import cv2
def on_predict_batch_end(predictor):
    # results -> List[batch_size]
    path, im, im0s, vid_cap, s = predictor.batch
    predictor.results = zip(predictor.results, im0s)
model = YOLO("yolov8n-seg.pt")
model.add_callback("on_predict_batch_end", on_predict_batch_end)
results = model.predict(source="0", show=True, stream=True, classes=67)
for i, (result, im0) in enumerate(results):
    Masks = result.masks
    if Masks is not None:
        for mask in Masks.masks:
            x = mask.cpu().numpy()
            corners = cv2.goodFeaturesToTrack(x, 4, 0.5, 50)  # get 4 corners
            for corner in corners:  # plot the corners on the original image
                x, y = corner.ravel()
                cv2.circle(im0, (int(x), int(y)), 5, (0, 0, 255), -1)
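If the polygon from masks.segments[0] is already available, a cheaper alternative to corner detection is to pick its four extreme points. The sketch below is not part of the answer above; it assumes the array holds one x, y pair per row and that the board is not rotated by much, and the polygon values and the name four_corners are made up for illustration.

```python
import numpy as np

def four_corners(polygon):
    """Pick four extreme points from an (N, 2) array of x, y points.

    Returns them ordered top-left, top-right, bottom-right, bottom-left.
    """
    pts = np.asarray(polygon, dtype=np.float32)
    s = pts[:, 0] + pts[:, 1]  # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 0] - pts[:, 1]  # x - y: largest at top-right, smallest at bottom-left
    return np.array([
        pts[np.argmin(s)],
        pts[np.argmax(d)],
        pts[np.argmax(s)],
        pts[np.argmin(d)],
    ])

# Hypothetical polygon standing in for masks.segments[0]
demo_polygon = np.array([[10, 12], [50, 8], [90, 11], [95, 60],
                         [92, 100], [40, 104], [8, 98], [6, 50]])
demo_corners = four_corners(demo_polygon)
print(demo_corners)
```

This heuristic tends to break down when the board is rotated close to 45 degrees, in which case fitting a quadrilateral to the contour is probably the safer route.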
I've tried using the findChessboardCorners function in OpenCV (Python), but it's not working.
These are the images I'm trying to get it to detect.
I want it to be able to detect where the squares are and whether a piece is on each one.
So far I've tried:
import cv2 as cv
import numpy as np
def rescaleFrame(frame, scale=0.75):
    # rescale image
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dimensions = (width, height)
    return cv.resize(frame, dimensions, interpolation=cv.INTER_AREA)
img = cv.imread("board2.jpg")
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
ret, corners = cv.findChessboardCorners(gray, (8, 8), None)
if ret == True:
    # Draw and display the corners
    img = cv.drawChessboardCorners(img, (8, 8), corners, ret)
I expected it to work the way this tutorial shows.
The function findChessboardCorners is used to calibrate cameras using a black-and-white chessboard pattern. As far as I know, it is not designed to detect the corners of a chess board with chess pieces on it.
This site shows an example of calibration "chess boards." And this site shows how these calibration chess boards are used, this example uses the ROS Library.
You can still use OpenCV but will need to try other functions. Assuming you took the photos yourself, you've also made the problem harder on yourself by using a background that has a lot of lines and corners, meaning you'll have to differentiate between those corners and corners on the board. You can also see that the top corners of the board behind the rooks are occluded. If you can retake the photos, I would take a top-down photo and do it on a blank surface that contrasts with the chessboard.
One example of corner detection in OpenCV is Harris corner detection. I wrote up a short example for you. You'll need to play around with this and other corner detection methods to see what works best. I found that adding a Sobel filter to strengthen the lines in your image gave much better results. But it's still going to detect corners in the background and corners on the pieces. You'll need to figure out how to filter those out.
import cv2 as cv
from matplotlib import pyplot as plt
import numpy as np
def sobel(src_image, kernel_size):
    grad_x = cv.Sobel(src_image, cv.CV_16S, 1, 0, ksize=kernel_size, scale=1,
                      delta=0, borderType=cv.BORDER_DEFAULT)
    grad_y = cv.Sobel(src_image, cv.CV_16S, 0, 1, ksize=kernel_size, scale=1,
                      delta=0, borderType=cv.BORDER_DEFAULT)
    abs_grad_x = cv.convertScaleAbs(grad_x)
    abs_grad_y = cv.convertScaleAbs(grad_y)
    grad = cv.addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0)
    return grad
def process_image(src_image_path):
    # load the image
    src_image = cv.imread(src_image_path)
    # convert to RGB (otherwise when you display this image the colors will look incorrect)
    src_image = cv.cvtColor(src_image, cv.COLOR_BGR2RGB)
    # convert to grayscale before attempting corner detection
    # (the image is RGB at this point, hence RGB2GRAY rather than BGR2GRAY)
    src_gray = cv.cvtColor(src_image, cv.COLOR_RGB2GRAY)
    # standard technique to eliminate noise
    blur_image = cv.blur(src_gray, (3, 3))
    # strengthen the appearance of lines in the image
    sobel_image = sobel(blur_image, 3)
    # detect corners
    corners = cv.cornerHarris(sobel_image, 2, 3, 0.04)
    # for visualization to make corners easier to see
    corners = cv.dilate(corners, None)
    # overlay on a copy of the source image
    dest_image = np.copy(src_image)
    # mark strong responses in red; the 0.01 factor is an assumed starting value to tune
    dest_image[corners > 0.01 * corners.max()] = [255, 0, 0]
    return dest_image
src_image_path = "board1.jpg"
dest_image = process_image(src_image_path)
When I apply an inverse polar transformation to my image, the result falls outside the output image. There are also some weird white patterns at the top. I tried making the output image larger, but the circle is on the left side, so it didn't help.
I am trying to bend a line into a circle using the warpPolar function. To do that, I first flip the line and give it a black area as shown in the image, then call cv2.warpPolar with the WARP_INVERSE_MAP flag.
My question is: how can I draw the full circle and get its bounding box?
line = np.ones(shape=(20,475),dtype=np.uint8)*255
flipped = cv2.rotate(line,cv2.ROTATE_90_CLOCKWISE)
h,w = flipped.shape
radius = int(h / (2*np.pi))
new_image = np.zeros(shape=(h,radius+w),dtype=np.uint8)
h2,w2 = new_image.shape
new_image[: ,w2-w:w2] = flipped
h,w = new_image.shape
center = (w/2,h)
output= cv2.warpPolar(new_image,center=center,maxRadius=radius,dsize=(1500,1500),flags=cv2.WARP_INVERSE_MAP + cv2.WARP_POLAR_LINEAR)
Note: I did not get the same result as you showed above when I ran the same code. You may have left out some lines.
If I have understood your problem correctly, you are trying to get this result (if I am wrong, I will update the answer accordingly):
The only thing you are missing is a suitable center and radius. You are doing an inverse transform here, so the input is an image you created yourself, not the output of warpPolar. Since you set the output size to (1500, 1500), you need to update the center and radius accordingly. Here is my code, which gives this result:
import cv2
import numpy as np
line = np.ones(shape=(20,475),dtype=np.uint8)*255
flipped = cv2.rotate(line,cv2.ROTATE_90_CLOCKWISE)
h,w = flipped.shape
radius = int(h / (2*np.pi))
new_image = np.zeros(shape=(h,radius+w),dtype=np.uint8)
h2,w2 = new_image.shape
new_image[: ,w2-w:w2] = flipped
h,w = new_image.shape
center = (750,750)
maxRadius = 750
# pass the maxRadius defined above so the radius matches the 1500x1500 output
output = cv2.warpPolar(new_image, center=center, maxRadius=maxRadius, dsize=(1500, 1500), flags=cv2.WARP_INVERSE_MAP + cv2.WARP_POLAR_LINEAR)
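The question also asks for the bounding box of the circle. Once the circle is fully inside the output, one way to get it is from the non-zero pixels. This is a sketch with a synthetic ring standing in for the warped output; the function name bounding_box and the ring dimensions are made up for illustration.

```python
import numpy as np

def bounding_box(image):
    """Return (x, y, width, height) of the non-zero pixels, or None if there are none."""
    ys, xs = np.nonzero(image)
    if xs.size == 0:
        return None
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    return int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1)

# Synthetic stand-in for the warped output: a ring centred at (750, 750)
yy, xx = np.mgrid[0:1500, 0:1500]
dist = np.hypot(xx - 750, yy - 750)
demo_output = ((dist >= 60) & (dist <= 75)).astype(np.uint8) * 255
demo_box = bounding_box(demo_output)
print(demo_box)
```

If you prefer to stay in OpenCV, cv2.boundingRect(cv2.findNonZero(output)) should give an equivalent rectangle for a single-channel image.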
I have some hundreds of images (scanned documents), most of them are skewed. I wanted to de-skew them using Python.
Here is the code I used:
import numpy as np
import cv2
from skimage.transform import radon
filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
    I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)
This code works well for most documents, but some angles are confused: 180 and 0, and likewise 90 and 270, are often detected as the same angle (i.e. it cannot tell 180 from 0, or 90 from 270). So I get a lot of upside-down documents.
Here is an example:
The resulting image is the same as the input image.
Is there a way to detect whether an image is upside down using OpenCV and Python?
PS: I tried to check the orientation using EXIF data, but it didn't lead to any solution.
It is possible to detect the orientation using Tesseract (pytesseract for Python), but it is only possible when the image contains a lot of characters.
For anyone who may need this:
import cv2
import pytesseract
If the document contains enough characters, it is possible for Tesseract to detect the orientation. However, when the image has few lines, the orientation angle suggested by Tesseract is usually wrong. So this can not be a 100% solution.
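Because the suggested angle is unreliable for sparse pages, it helps to read the confidence together with the angle. The sketch below only parses OSD text; the sample text uses made-up values in the layout I recall from Tesseract's orientation output, and the field names are an assumption on my part. In practice the text would come from something like pytesseract.image_to_osd(image).

```python
import re

def parse_osd_rotation(osd_text):
    """Extract the 'Rotate' angle and the orientation confidence from OSD text.

    Returns (rotate_degrees, confidence), or None if either field is missing.
    """
    rotate = re.search(r"Rotate:\s*(\d+)", osd_text)
    confidence = re.search(r"Orientation confidence:\s*([\d.]+)", osd_text)
    if rotate is None or confidence is None:
        return None
    return int(rotate.group(1)), float(confidence.group(1))

# Illustrative OSD text with made-up values
sample_osd = """Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 2.64
Script: Latin
Script confidence: 1.11
"""
demo_rotation = parse_osd_rotation(sample_osd)
print(demo_rotation)
```

A caller could then ignore the suggested rotation whenever the confidence falls below a threshold chosen from a few known-good scans.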
Python3/OpenCV4 script to align scanned documents.
Rotate the document and sum the rows. When the document has 0 and 180 degrees of rotation, there will be a lot of black pixels in the image:
Use a score-keeping method. Score each image for its likeness to a zebra pattern. The image with the best score has the correct rotation. The image you linked to was off by 0.5 degrees. I omitted some functions for readability; the full code can be found here.
# Rotate the image around in a circle
angle = 0
while angle <= 360:
    # Rotate the source image
    img = rotate(src, angle)
    # Crop the center 1/3rd of the image (roi is filled with text)
    h, w = img.shape
    buffer = min(h, w) - int(min(h, w) / 1.15)
    roi = img[int(h/2 - buffer):int(h/2 + buffer), int(w/2 - buffer):int(w/2 + buffer)]
    # Create background to draw transform on
    bg = np.zeros((buffer*2, buffer*2), np.uint8)
    # Compute the sums of the rows
    row_sums = sum_rows(roi)
    # High score --> Zebra stripes
    score = np.count_nonzero(row_sums)
    # Image has best rotation
    if score <= min(scores):
        # Save the rotated image
        print('found optimal rotation')
        best_rotation = img.copy()
    k = display_data(roi, row_sums, buffer)
    if k == 27: break
    # Increment angle and try again
    angle += .75
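The loop above relies on helpers that are not shown, so here is a small sketch of how the row-sum score might behave. It is only my guess at what sum_rows and the score do, using a synthetic page where text pixels are non-zero; stripe_score is a hypothetical name.

```python
import numpy as np

def stripe_score(binary_image):
    """Count the rows that contain at least one non-zero (text) pixel."""
    row_sums = binary_image.sum(axis=1)
    return int(np.count_nonzero(row_sums))

# Synthetic page: white "text lines" 4 px tall every 10 px on a black background
aligned = np.zeros((100, 100), dtype=np.uint8)
for top in range(5, 100, 10):
    aligned[top:top + 4, 10:90] = 255
rotated = np.rot90(aligned)  # the same page turned by 90 degrees

aligned_score = stripe_score(aligned)
rotated_score = stripe_score(rotated)
print(aligned_score, rotated_score)
```

When the lines are horizontal, the gaps between them leave many rows empty, so the count is lowest at the correct rotation (and again at 180 degrees, which is why a separate upside-down check is still needed).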
How to tell if the document is upside down? Fill in the area from the top of the document to the first non-black pixel in the image. Measure the area in yellow. The image that has the smallest area will be the one that is right-side-up:
# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
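The helper area_to_top_of_text is not shown either. A rough sketch of the idea, under my own assumptions about what it measures (for each column, the number of empty pixels above the first text pixel), could look like this; area_above_text and the synthetic page are made up for illustration.

```python
import numpy as np

def area_above_text(binary_image):
    """Sum, over all columns, the number of empty pixels above the first text pixel."""
    has_text = binary_image > 0
    first_row = np.argmax(has_text, axis=0)  # first text row per column (0 if none)
    # Columns without any text count as completely empty
    first_row[~has_text.any(axis=0)] = binary_image.shape[0]
    return int(first_row.sum())

# Synthetic page: a full-width first line and a short last line
page = np.zeros((60, 100), dtype=np.uint8)
page[10:14, 10:90] = 255  # full line near the top
page[40:44, 10:40] = 255  # short final line
flipped = np.rot90(page, 2)  # rotate by 180 degrees

upright_area = area_above_text(page)
flipped_area = area_above_text(flipped)
print(upright_area, flipped_area)
```

The short last line of a paragraph is what makes the two areas differ here, so pages that end on a full line may not separate as cleanly.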
Assuming you did run the angle-correction already on the image, you can try the following to find out if it is flipped:
1. Project the corrected image onto the y-axis, so that you get a 'peak' for each line. Important: there are almost always two sub-peaks!
2. Smooth this projection by convolving it with a Gaussian to get rid of fine structure, noise, etc.
3. For each peak, check whether the stronger sub-peak is at the top or at the bottom.
4. Calculate the fraction of peaks that have sub-peaks on the bottom side. This scalar value gives you the confidence that the image is oriented correctly.
The peak finding in step 3 is done by finding sections with above-average values. The sub-peaks are then found via argmax.
Here's a figure to illustrate the approach, showing a few lines of your example image:
Blue: original projection
Orange: smoothed projection
Horizontal line: average of the smoothed projection for the whole image.
Here's some code that does this:
import cv2
import numpy as np
# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]
# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)
# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]
# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
    line = smooth[start:end]
    if np.argmax(line) < len(line)/2:
        lower_peaks += 1
print(lower_peaks / len(line_starts))
This prints 0.125 for the given image, so it is not oriented correctly and must be flipped.
Note that this approach might break badly if there are images or anything not organized in lines in the image (maybe math or pictures). Another problem would be too few lines, resulting in bad statistics.
Also different fonts might result in different distributions. You can try this on a few images and see if the approach works. I don't have enough data.
You can use the Alyn module. To install it:
pip install alyn
Then use it to deskew images (taken from the homepage):
from alyn import Deskew
d = Deskew(
display_image='preview the image on screen',
output_file='path_for_deskewed image',
Note that Alyn is only for deskewing text.
I have an image here. The region within the yellow lines is my region of interest, as shown in this image, and extracting it is one of my objectives. Here are my planned steps:
1. Denoise, color filtering, masking and Canny edge detection (DONE)
2. Coordinates of the edges (DONE)
3. Select coordinates of certain vertices, for example
4. Draw polygon with those vertices' coordinates
Here's the code:
import cv2
import numpy as np
from matplotlib import pyplot as plt
frame = cv2.imread('realtest.jpg')
denoisedFrame = cv2.fastNlMeansDenoisingColored(frame, None, 10, 10, 7, 21)
HSVframe = cv2.cvtColor(denoisedFrame, cv2.COLOR_BGR2HSV)
lower_yellowColor = np.array([15,105,105])
upper_yellowColor = np.array([25,255,255])
whiteMask = cv2.inRange(HSVframe, lower_yellowColor, upper_yellowColor)
maskedFrame = cv2.bitwise_and(denoisedFrame, denoisedFrame, mask=whiteMask)
grayFrame = cv2.cvtColor(maskedFrame, cv2.COLOR_BGR2GRAY)
gaussBlurFrame = cv2.GaussianBlur(grayFrame, (5,5), 0)
edgedFrame = cv2.Canny(grayFrame, 100, 200)
#Coordinates of each white pixels that make up the edges
ans = []
for y in range(0, edgedFrame.shape[0]):
    for x in range(0, edgedFrame.shape[1]):
        if edgedFrame[y, x] != 0:
            ans = ans + [[x, y]]
ans = np.array(ans)
#print(ans[0:100, :])
cv2.imshow("edged", edgedFrame)
As you can see, I have completed step (2) and obtained the coordinates of every white pixel that makes up the edges. However, I am stuck on step (3). I tried the code here, but I get an error that says 'ValueError: too many values to unpack (expected 2)'.
Please help me find good vertices for constructing a polygon that is as close to the yellow lines as possible.
I have split the answer into two parts
Part 1: Finding the good vertices to construct a polygon
The required vertices around an image containing edges can be found with OpenCV's built-in cv2.findContours() function. In OpenCV 3.x it returns the image with contours, the vertices of the contours and the hierarchy of the contours; in OpenCV 4.x it returns only the contours and the hierarchy.
One can find the vertices of contours in two ways:
1. cv2.CHAIN_APPROX_NONE stores ALL the coordinates (boundary points) of each contour.
2. cv2.CHAIN_APPROX_SIMPLE stores ONLY the most necessary coordinates of each contour: the points that best represent it, rather than every boundary point.
In your case, option 2 is suitable. After finding the edges you can do the following:
# [-2:] keeps (contours, hierarchy) whether findContours returns 2 or 3 values
contours, hier = cv2.findContours(edgedFrame, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
contours contains the vertices of every contour in the image edgedFrame.
Part 2: Constructing the polygon
OpenCV has a built-in function for this as well: cv2.convexHull(). After finding those points, you can draw them using cv2.drawContours().
for cnt in contours:
    hull = cv2.convexHull(cnt)
    cv2.drawContours(frame, [hull], -1, (0, 255, 0), 2)
cv2.imshow("Polygon", frame)
You can obtain a better approximation of the desired edge by doing some more pre-processing while creating the mask
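As an aside, the nested pixel loop in the question can be replaced by a vectorized NumPy call, which is usually much faster on full-size frames. This is a minimal sketch on a tiny synthetic edge map; edge_coordinates is a hypothetical helper name.

```python
import numpy as np

def edge_coordinates(edge_image):
    """Return an (N, 2) array of [x, y] for every non-zero pixel, in row-major order."""
    ys, xs = np.nonzero(edge_image)
    return np.column_stack((xs, ys))

# Tiny synthetic edge map standing in for edgedFrame
demo_edges = np.zeros((4, 5), dtype=np.uint8)
demo_edges[1, 2] = 255
demo_edges[3, 0] = 255
demo_coords = edge_coordinates(demo_edges)
print(demo_coords)
```

The points come out in the same order as the original loop (row by row, left to right), so the result should be a drop-in replacement for ans.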
I am trying to crop a face using the facial landmarks identified by dlib. The right eyebrow is causing problems - the crop goes flat across rather than follow the eyebrow arc.
What am I doing wrong here?
from imutils import face_utils
import imutils
import numpy as np
import collections
import dlib
import cv2
def face_remap(shape):
    remapped_image = shape.copy()
    # left eye brow
    remapped_image[17] = shape[26]
    remapped_image[18] = shape[25]
    remapped_image[19] = shape[24]
    remapped_image[20] = shape[23]
    remapped_image[21] = shape[22]
    # right eye brow
    remapped_image[22] = shape[21]
    remapped_image[23] = shape[20]
    remapped_image[24] = shape[19]
    remapped_image[25] = shape[18]
    remapped_image[26] = shape[17]
    # neatening
    remapped_image[27] = shape[0]
    return remapped_image
# load the input image, resize it, and convert it to grayscale
image = cv2.imread("images/faceCM1.jpg")
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
out_face = np.zeros_like(image)
# initialize dlib's face detector (HOG-based) and then create the facial landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(SHAPE_PREDICTOR)
# detect faces in the grayscale image
rects = detector(gray, 1)
# loop over the face detections
for (i, rect) in enumerate(rects):
    # Determine the facial landmarks for the face region, then convert the facial landmark (x, y)-coordinates to a NumPy array
    shape = predictor(gray, rect)
    shape = face_utils.shape_to_np(shape)
    # initialize mask array
    remapped_shape = np.zeros_like(shape)
    feature_mask = np.zeros((image.shape[0], image.shape[1]))
    # we extract the face
    remapped_shape = face_remap(shape)
    cv2.fillConvexPoly(feature_mask, remapped_shape[0:27], 1)
    # np.bool was removed in NumPy 1.24; the builtin bool is equivalent
    feature_mask = feature_mask.astype(bool)
    out_face[feature_mask] = image[feature_mask]
cv2.imshow("mask_inv", out_face)
cv2.imwrite("out_face.png", out_face)
sample image of cropped face showing the issue
Using the convex hull formed by the 68 landmarks didn't exactly achieve the desired output, so I had the following approach to this problem using scikit-image instead of OpenCV
1. Load image and predict 68 landmarks
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
img = dlib.load_rgb_image('mean.jpg')
rect = detector(img)[0]
sp = predictor(img, rect)
landmarks = np.array([[p.x, p.y] for p in sp.parts()])
2. Select the landmarks that represent the shape of the face
(I had to reverse the order of the eyebrows landmarks because the 68 landmarks aren't ordered to describe the face outline)
outline = landmarks[[*range(17), *range(26,16,-1)]]
3. Draw a polygon using these landmarks using scikit-image
Y, X = skimage.draw.polygon(outline[:,1], outline[:,0])
4. Create a canvas with zeros and use the polygon as mask to original image
cropped_img = np.zeros(img.shape, dtype=np.uint8)
cropped_img[Y, X] = img[Y, X]
For the sake of completeness, I provide below a solution using scipy.spatial.ConvexHull, if this option is still preferred
vertices = ConvexHull(landmarks).vertices
Y, X = skimage.draw.polygon(landmarks[vertices, 1], landmarks[vertices, 0])
cropped_img = np.zeros(img.shape, dtype=np.uint8)
cropped_img[Y, X] = img[Y, X]
It's because the face shape you are providing is not convex.
fillConvexPoly works correctly on convex shapes only. In this case there is a concave corner (at point #27), hence the messed-up result.
To fix this, modify the function as follows:
def face_remap(shape):
    remapped_image = cv2.convexHull(shape)
    return remapped_image
This would give you a result that looks like this.
Now you may write some more code to remove the triangular section on the forehead (if you want it that way).