Camera calibration, reverse projection of pixel to direction - python

I am using OpenCV to estimate a webcam's intrinsic matrix from a series of chessboard images, as detailed in this tutorial, and to reverse-project a pixel to a direction (in terms of azimuth/elevation angles).
The final goal is to let the user select a point on the image, estimate the direction of this point relative to the center of the webcam, and use this as the direction of arrival (DOA) for a beam-forming algorithm.
So once I have estimated the intrinsic matrix, I reverse-project the user-selected pixel (see code below) and display it as azimuth/elevation angles.
result = [0, 0, 0]  # reverse-projected point, in homogeneous coordinates
while True:
    _, img = cap.read()
    if flag:  # the user has clicked somewhere
        # back-project the pixel through the inverse intrinsic matrix
        result = np.dot(np.linalg.inv(mtx), [mouse_x, mouse_y, 1])
        result = np.arctan(result)  # convert to angles (radians)
        flag = False
    cv2.putText(img, '({},{})'.format(mouse_x, mouse_y), (20, 440), cv2.FONT_HERSHEY_SIMPLEX,
                0.5, (0, 255, 0), 2, cv2.LINE_AA)
    cv2.putText(img, '({:.2f},{:.2f})'.format(180/np.pi*result[0], 180/np.pi*result[1]), (20, 460),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2, cv2.LINE_AA)
    cv2.imshow('image', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
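As a cross-check of the back-projection above, here is a minimal sketch that converts a pixel to azimuth/elevation from the intrinsic matrix alone. It assumes an ideal pinhole model with no lens distortion. The function name `pixel_to_angles`, the angle conventions (azimuth positive to the right, elevation positive upward) and the example matrix `K` are assumptions for illustration, not values from the calibration in this post.

```python
import numpy as np

def pixel_to_angles(K, u, v):
    """Back-project pixel (u, v) to (azimuth, elevation) in degrees.

    Assumes an undistorted pinhole camera with x right, y down, z forward.
    """
    # Direction of the viewing ray in camera coordinates (up to scale).
    x, y, z = np.linalg.inv(K) @ np.array([u, v, 1.0])
    azimuth = np.arctan2(x, z)                   # positive to the right
    elevation = np.arctan2(-y, np.hypot(x, z))   # positive upward (image y points down)
    return np.degrees(azimuth), np.degrees(elevation)

# Hypothetical intrinsics: fx = fy = 600 px, principal point at (330, 250).
K = np.array([[600.0, 0.0, 330.0],
              [0.0, 600.0, 250.0],
              [0.0, 0.0, 1.0]])

az0, el0 = pixel_to_angles(K, 330, 250)   # the principal point itself
az1, el1 = pixel_to_angles(K, 930, 250)   # 600 px (one focal length) to its right
print(az0, el0, az1, el1)
```

With this model the zero-angle pixel is the principal point (cx, cy) taken from the matrix, which need not coincide with the geometric image center. Taking arctan of each normalized coordinate separately, as in the snippet above, is a different convention; the two agree (up to sign) along the image axes.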
My problem is that I'm not sure whether my results are coherent. The major incoherence is that the point of the image corresponding to the {0,0} angle is noticeably off the image center, as seen below (the camera image has been replaced by a black background for privacy reasons):
I don't really see a simple yet efficient way of measuring the accuracy (the only method I could think of was to use a servo motor with a laser on it, just under the camera and point it to the computed direction).
Here is the intrinsic matrix after calibration with 15 images :
I get an RMS error of around 0.44, which seems satisfactory.
My calibration code :
import time
import cv2
import numpy as np

nCalFrames = 12  # number of frames for calibration
nFrames = 0
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)  # termination criteria
objp = np.zeros((9*7, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:7].T.reshape(-1, 2)
objpoints = []  # 3D points in real-world space
imgpoints = []  # 2D points in the image plane
cap = cv2.VideoCapture(0)
previousTime = 0
gray = 0
while True:
    # Capture frame-by-frame
    _, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, (9, 7), None)
    # If found, add object points, image points (after refining them)
    if ret:
        corners2 = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        if time.time() - previousTime > 2:  # keep at most one sample every 2 s
            previousTime = time.time()
            objpoints.append(objp)
            imgpoints.append(corners2)
            img = cv2.bitwise_not(img)  # flash the frame as visual feedback
            nFrames = nFrames + 1
        # Draw and display the corners
        img = cv2.drawChessboardCorners(img, (9, 7), corners, ret)
    cv2.putText(img, '{}/{}'.format(nFrames, nCalFrames), (20, 460), cv2.FONT_HERSHEY_SIMPLEX,
                2, (0, 255, 0), 2, cv2.LINE_AA)
    cv2.putText(img, 'press \'q\' to exit...', (255, 15), cv2.FONT_HERSHEY_SIMPLEX,
                0.5, (0, 0, 255), 1, cv2.LINE_AA)
    # Display the resulting frame
    cv2.imshow('Webcam Calibration', img)
    if nFrames == nCalFrames:
        break
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
RMS_error, mtx, disto_coef, _, _ = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
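For context on the RMS figure: the first value returned by calibrateCamera is a reprojection error in pixels. The sketch below shows the usual definition of such an error (root-mean-square distance between detected corners and corners reprojected through the estimated model). It illustrates the definition only and is not OpenCV's internal code; the function name and the sample points are hypothetical.

```python
import numpy as np

def rms_reprojection_error(detected, reprojected):
    """Root-mean-square distance (in pixels) between two (N, 2) point sets."""
    detected = np.asarray(detected, dtype=float).reshape(-1, 2)
    reprojected = np.asarray(reprojected, dtype=float).reshape(-1, 2)
    squared_dist = np.sum((detected - reprojected) ** 2, axis=1)
    return float(np.sqrt(squared_dist.mean()))

# Hypothetical corners: every reprojected point is off by (0.3, 0.4) px.
detected = np.array([[100.0, 100.0], [150.0, 100.0], [100.0, 150.0]])
reprojected = detected + np.array([0.3, 0.4])
err = rms_reprojection_error(detected, reprojected)
print(err)
```

A low value only says that the model fits the collected corners well; it does not by itself guarantee accurate angles in image regions the chessboard never covered.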
EDIT: another test method would be to use a whiteboard with points at known angles and estimate the error by comparing with experimental results, but I don't know how to set up such a system.

Regarding your first concern, it is normal for the principal point to be off the image center. The estimated point, which is the point of zero elevation and azimuth, is the one that minimizes the radial distortion coefficients, and for a low-end wide-angle lens (e.g., that of a typical webcam) it can easily be off by a noticeable amount.
Your calibration should be OK up to the call to calibrateCamera. However, in your code snippet it seems you're ignoring the distortion coefficients. What is missing is initUndistortRectifyMap, which also lets you re-center the principal point if that matters.
h, w = img.shape[:2]
# compute new camera matrix with central principal point
new_mtx,roi = cv2.getOptimalNewCameraMatrix(mtx,disto_coef,(w,h),1,(w,h))
# compute undistort maps
mapx,mapy = cv2.initUndistortRectifyMap(mtx,disto_coef,None,new_mtx,(w,h),5)
It essentially makes focal length equal in both dimensions and centers the principal point (see OpenCV python documentation for parameters).
Then, after each
_, img = cap.read()
you must undistort the image before rendering:
# apply the remap
img = cv2.remap(img,mapx,mapy,cv2.INTER_LINEAR)
# crop the image
x,y,w,h = roi
img = img[y:y+h, x:x+w]
Here, I set the background to green to emphasize the barrel distortion. The output could look something like this (camera image replaced by a checkerboard for privacy reasons):
If you do all of this, your calibration target is accurate, and your calibration samples fill the entire image area, you can be quite confident in the computation. However, to validate the measured azimuth and elevation against the undistorted image's pixel readings, I'd suggest using a tape measure from the lens's first principal point and a calibration plate placed perpendicular to the optical axis, right in front of the camera. From those measurements you can compute the expected angles and compare.
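The tape-measure check can be sketched as follows. This is a minimal illustration that assumes the plate is exactly perpendicular to the optical axis and that offsets are measured from the point where the axis hits the plate. The function name, the sign conventions (right and up positive) and the numbers are hypothetical.

```python
import math

def expected_angles(dx, dy, distance):
    """Expected (azimuth, elevation) in degrees of a mark on the plate.

    dx, dy: offsets of the mark from the optical axis (right / up positive).
    distance: distance from the lens to the plate, in the same unit.
    """
    azimuth = math.degrees(math.atan2(dx, distance))
    elevation = math.degrees(math.atan2(dy, math.hypot(dx, distance)))
    return azimuth, elevation

# Hypothetical setup: plate 1.0 m away, mark 1.0 m to the right at axis height.
az, el = expected_angles(1.0, 0.0, 1.0)
print(az, el)
```

Comparing these expected values with the angles computed from the clicked pixel gives a direct estimate of the angular error.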
Hope this helps.


Matching a real world picture with a 3D model inaccuracy problem

Suppose that I have a fixed physical camera and static scene. I have a point cloud scan of the physical world, so I can use basic surfaces and cubes to perform a simple reconstruction of the real world.
Simple reconstruction in unity
Pointcloud scan
The next step is to calculate the real-world camera pose using a checkerboard and PnP. After the calculation, I used the resulting tvec and rvec with projectPoints to draw a virtual cube in world units. It shows up perfectly, which indicates that the camera pose is valid within the OpenCV framework.
Verify camera pose after PnP
However, when I put the resulting camera transformation in Unity, the camera translation seems to be off by half a meter compared to the physical world estimate. Ideally what I would like to achieve is a pixel-perfect alignment between a real world image and a unity camera view image (which is a digital twin of the physical world).
Tvec Rvec
Thank you for your insights in advance!
Below is code for calculatePnP
import numpy as np
import cv2
import glob
import os
from scipy.spatial.transform import Rotation
# Used to draw standard axis
def draw(img, corners, imgPoints):
    corner = tuple(corners[0].ravel())
    print("int(corner[0])):\n {0}".format(int(corner[0])))
    print("int(corner[1])):\n {0}".format(int(corner[1])))
    print("int(tuple(imgPoints[0].ravel())[0]):\n {0}".format(int(tuple(imgPoints[0].ravel())[0])))
    print("int(tuple(imgPoints[0].ravel())[1]):\n {0}".format(int(tuple(imgPoints[0].ravel())[1])))
    img = cv2.line(img, (int(corner[0]), int(corner[1])), (int(tuple(imgPoints[0].ravel())[0]),int(tuple(imgPoints[0].ravel())[1])), (255,0,0), 5)
    img = cv2.line(img, (int(corner[0]), int(corner[1])), (int(tuple(imgPoints[1].ravel())[0]),int(tuple(imgPoints[1].ravel())[1])), (0,255,0), 5)
    img = cv2.line(img, (int(corner[0]), int(corner[1])), (int(tuple(imgPoints[2].ravel())[0]),int(tuple(imgPoints[2].ravel())[1])), (0,0,255), 5)
    return img
# used to draw a standard cube (1,1,1)
# opencv official
def drawCube(img, corners, imgpts):
    imgpts = np.int32(imgpts).reshape(-1,2)
    # draw ground floor in green
    img = cv2.drawContours(img, [imgpts[:4]],-1,(0,255,0),-3)
    # draw pillars in blue color
    for i,j in zip(range(4),range(4,8)):
        # convert to plain Python ints, which cv2.line accepts in all versions
        img = cv2.line(img, tuple(map(int, imgpts[i])), tuple(map(int, imgpts[j])),(255),3)
    # draw top layer in red color
    img = cv2.drawContours(img, [imgpts[4:]],-1,(0,0,255),3)
    return img
# Load the camera calibration data
with np.load('opencvcalib.npz') as calibData:
    mtx, dist, rvecs, tvecs = [calibData[i] for i in ('mtx', 'dist', 'rvecs', 'tvecs')]
print("Previously calibrated dist:\n {0}".format(dist))
print("mtx:\n {0}".format(mtx))
# Define the chess board rows and columns
rows = 9
cols = 6
# Set the termination criteria for the corner sub-pixel algorithm
criteria = (cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS, 30, 0.001)
# Prepare the object points: (0,0,0), (1,0,0), (2,0,0), ..., (8,5,0). They are the same for all images
objectPoints = np.zeros((rows * cols,3), np.float32)
objectPoints[:, :2] = np.mgrid[0:rows, 0:cols].T.reshape(-1, 2)
print("objpts before divide:\n {0}".format(objectPoints))
#scale object points to real world, 1unit = 0.034m, should divide by 29.41176470588235
objectPoints = objectPoints / 29.41176470588235
print("divided objpts:\n {0}".format(objectPoints))
# Create the axis points, unit is meters, here shortest X axis is 10cm.
axisPoints = np.float32([[0.1, 0, 0], [0, 0.2, 0], [0, 0, -0.3]]).reshape(-1, 3)
#this unit is per checkerboard square
#axisPoints = np.float32([[1, 0, 0], [0, 2, 0], [0, 0, -5]]).reshape(-1, 3)
# Load the image
img = cv2.imread("ext3.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Find the chess board corners, and visualize
ret, corners = cv2.findChessboardCorners(gray, (rows, cols), None)
imgchkbrd = cv2.drawChessboardCorners(img, (rows, cols), corners, ret)
cv2.imwrite('corners.jpg', imgchkbrd)
# Make sure the chess board pattern was found in the image
if ret:
    # Refine the corner position
    corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
    # Find the rotation and translation vectors
    val, rvecs, tvecs, inliers = cv2.solvePnPRansac(objectPoints, corners, mtx, dist)
    #success, rvecs, tvecs = cv2.solvePnP(objectPoints, corners, mtx, dist,flags=cv2.SOLVEPNP_ITERATIVE)
    #print("objpts:\n {0}".format(objectPoints))
    print("corners:\n {0}".format(corners))
    print("Rotation Vector:\n {0}".format(rvecs))
    print("Translation Vector:\n {0}".format(tvecs))
    # cv2.Rodrigues returns a (matrix, jacobian) tuple, so unpack it
    Rt, _ = cv2.Rodrigues(rvecs)
    print("Rt:\n {0}".format(Rt))
    R = Rt.transpose()
    # position of the camera expressed in the global frame: -R^T * t
    # (matrix product; `*` would multiply element-wise)
    pos = -R @ tvecs
    roll = np.arctan2(-R[2][1], R[2][2])
    pitch = np.arcsin(R[2][0])
    yaw = np.arctan2(-R[1][0], R[0][0])
    print("pos of camera:\n {0}".format(pos))
    r = Rotation.from_rotvec(rvecs.T)
    quaternion = r.as_quat()
    print("Quaternion1:\n {0}".format(quaternion))
    RotationMatrix, _ = cv2.Rodrigues(rvecs)
    print("RotationMatrix:\n {0}".format(RotationMatrix))
    # Project the 3D axis points to the image plane
    axisImgPoints, jac = cv2.projectPoints(axisPoints, rvecs, tvecs, mtx, dist)
    # Draw the axis lines
    img = draw(img, corners, axisImgPoints)
    # Render a cube with 0.034 m edges (one checkerboard square)
    CubeAxis = np.float32([[0,0,0], [0,0.034,0], [0.034,0.034,0], [0.034,0,0],
                           [0,0,-0.034],[0,0.034,-0.034],[0.034,0.034,-0.034],[0.034,0,-0.034] ])
    axisImgPoints, jac = cv2.projectPoints(CubeAxis, rvecs, tvecs, mtx, dist)
    img2 = drawCube(img, corners, axisImgPoints)
    # Display the image
    cv2.imshow('chess board', img2)
    cv2.waitKey(0)  # keep the window open until a key is pressed
    cv2.imwrite('checkerboardpnp3.png', img2)
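The pose returned by solvePnP maps world (checkerboard) coordinates into the camera frame, so the camera position in the world is -R^T t, as the comment in the code above notes. Below is a minimal NumPy-only sketch of that inversion. The helper names `rodrigues` and `camera_pose_in_world` and the sample pose are hypothetical, and the Rodrigues formula is written out by hand only to keep the example free of OpenCV.

```python
import numpy as np

def rodrigues(rvec):
    """Rotation matrix for an axis-angle vector (Rodrigues' formula)."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    # Skew-symmetric cross-product matrix of the unit axis k.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def camera_pose_in_world(rvec, tvec):
    """Invert a world-to-camera pose: returns (R_wc, C), C = camera position."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3)
    R_wc = R.T        # camera-to-world rotation
    C = -R_wc @ t     # camera centre in world (checkerboard) coordinates
    return R_wc, C

# Hypothetical pose: 90 degree rotation about z, translation (0, 0, 2).
rvec = np.array([0.0, 0.0, np.pi / 2])
tvec = np.array([0.0, 0.0, 2.0])
R_wc, C = camera_pose_in_world(rvec, tvec)
print(C)
```

A quick sanity check is that the camera centre maps to the origin of the camera frame, i.e. R @ C + t is zero. Note also that Unity uses a left-handed, y-up frame while OpenCV's camera frame is right-handed with y down, so an additional axis flip is typically needed when transferring the pose; that step depends on how the checkerboard frame is placed in the scene and is not shown here.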

Low contrast stops HoughCircles from detection(?)

I am trying to build a script capable of counting how many euros (for now, coins only) are in a picture. To accomplish this, I plan to first locate the coins and then compare their relative sizes to determine the value of each one, as I've seen done elsewhere. My difficulty lies in the first step, the preprocessing of the image.
Note that this problem arises only when the contrast between the background and certain coins is very low.
I've tried various preprocessing methods combined with different detection methods, such as connectedComponentsWithStats(), findContours() and SimpleBlobDetector, but the most successful combination I've achieved is:
import numpy as np
import cv2
import os
path = 'GenericImages/TP2/'
path_coins_highlighted = 'GenericImages/Highlights'
path_gaussian_blurs = 'GenericImages/Gaussian_Blurs'
dirs = os.listdir(path)
i = 0
for file in dirs:
    path2img = os.path.join(path, file)
    img = cv2.imread(path2img)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # clahe = cv2.createCLAHE(clipLimit=40, tileGridSize=(8, 8))
    # equalized = clahe.apply(gray)
    gray_blur = cv2.GaussianBlur(gray, (15, 15), 0)
    # gray_blur = cv2.bilateralFilter(gray, 9, 65, 9)
    circles = cv2.HoughCircles(gray_blur, cv2.HOUGH_GRADIENT, 1, 15, param1=50, param2=30, minRadius=0, maxRadius=0)
    if circles is not None:  # HoughCircles returns None when nothing is found
        circles = np.uint16(np.around(circles))
        for x in circles[0, :]:
            cv2.circle(img, (x[0], x[1]), x[2], (0, 255, 0), 2)  # coin outline
            cv2.circle(img, (x[0], x[1]), 2, (0, 0, 255), 3)  # coin center
    cv2.imshow('Gray', gray)
    cv2.imshow('Gaussian Blur', gray_blur)
    path_save_gaussian_blur = os.path.join(path_gaussian_blurs, str(i) + '_gaussian_blur.jpg')
    cv2.imwrite(path_save_gaussian_blur, gray_blur)
    # cv2.imshow('equalized', equalized)
    cv2.imshow('Highlights', img)
    path_save_highlights = os.path.join(path_coins_highlighted, str(i) + '_highlight.jpg')
    cv2.imwrite(path_save_highlights, img)
    i += 1
The problem lies in the consistency of the detection. I believe that when it fails, it does so because there is little to no contrast between the background and the coins that HoughCircles misses. The set of images below shows the cases in which the algorithm fails.
SET 0:
I've tried tweaking with equalization and a bilateral filter with different parameters in order to remove noise but keep the transition zones (contours of the coin) but I haven't found significant improvements.
I would appreciate some direction or ideas of what I should be looking for to solve this issue.
The lighting is non-uniform and your images are small and heavily compressed. These are the two factors that hinder a good detection. It might be difficult to control lighting but at least make sure you use lossless image formats (such as png) to avoid compression artifacts.
Anyway, your non-uniform lighting makes this a good case for a lighting normalization method called Gain Division. The idea is that you try to build a model of the background and then weight each input pixel by that model. The output gain should be relatively constant across most of the image. This is very useful because if we eliminate the non-uniform lighting we can create a foreground mask for the coins, and then we simply approximate circles to the coins' contours.
Let's give it a try:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "FHlbm.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Get local maximum:
kernelSize = 30
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(inputImage, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Perform gain division
gainDivision = np.where(localMax == 0, 0, (inputImage / localMax))
# Clip the values to [0,255]
gainDivision = np.clip((255 * gainDivision), 0, 255)
# Convert the mat type from float to uint8:
gainDivision = gainDivision.astype("uint8")
cv2.imshow("Gain Division", gainDivision)
Which yields:
This is the result of applying gain division to the first image. Note that now the background is almost uniform. This is excellent, because we can apply a simple auto threshold to create a binary mask containing just the foreground objects, like this:
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(gainDivision, cv2.COLOR_BGR2GRAY)
# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
This is the binary image:
Now, we have a problem here. The compression artifacts make this mask noisy. We could apply a little bit of morphology to improve the binary blobs, but your image is really small, so I have skipped this step. If you have access to larger, lossless images, you might want to include a cleaning step.
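Going back to the gain division step for a moment: the expression np.where(localMax == 0, 0, inputImage / localMax) evaluates the division for every pixel before the mask is applied, so NumPy can emit divide-by-zero warnings. A hedged alternative is to divide only where the background model is non-zero. In the sketch below, `gain_divide` and the tiny arrays are illustrative only, and `background` stands in for `localMax`.

```python
import numpy as np

def gain_divide(image, background):
    """Divide an 8-bit image by its background model, rescaled to [0, 255]."""
    image = image.astype(np.float64)
    background = background.astype(np.float64)
    ratio = np.zeros_like(image)
    # Divide only where the background is non-zero; other pixels stay 0.
    np.divide(image, background, out=ratio, where=background != 0)
    return np.clip(255.0 * ratio, 0, 255).astype(np.uint8)

# Hypothetical 1x4 strip: two background pixels under different lighting,
# one object pixel at half the local background level, one all-black pixel.
image = np.array([[200, 100, 50, 0]], dtype=np.uint8)
background = np.array([[200, 100, 100, 0]], dtype=np.uint8)
normalized = gain_divide(image, background)
print(normalized)
```

Both background pixels end up at the same level regardless of the original lighting, which is what makes a single global threshold workable afterwards.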
For now I'll simply try to compute the Minimum Enclosing Circle of each blob larger than a threshold, and I should get a detection a little bit more robust than Hough's. Let's see:
# Find the circle blobs on the binary mask:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contoursPoly = [None] * len(contours)
# Store the circles here:
detectedCircles = []
# Alright, just look for the outer bounding boxes:
for i, c in enumerate(contours):
    # Get blob area:
    blobArea = cv2.contourArea(c)
    # Set min area:
    minArea = 100
    # Process only big blobs:
    if blobArea > minArea:
        # Approximate the contour to a circle:
        (x, y), radius = cv2.minEnclosingCircle(c)
        # Compute the center and radius:
        center = (int(x), int(y))
        radius = int(radius)
        # Draw the circles:
        cv2.circle(inputImageCopy, center, radius, (0, 0, 255), 1)
        cv2.line(inputImageCopy, center, center, (0, 255, 0), 2)
        # Store the center and radius:
        detectedCircles.append([center, radius])
cv2.imshow("Circles", inputImageCopy)
Let's see the results drawn onto a deep copy of the original image:
Not bad. All the circles' data (center and radius) is stored in the detectedCircles list. We can print the info like this:
# Check out the detected circles:
for i in range(len(detectedCircles)):
    center, r = detectedCircles[i]
    print("Circle #: "+str(i)+" x: "+str(center[0])+" y: "+str(center[1])+" r: "+str(r))
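Since the stated goal is to compare the coins' relative sizes, a natural next step is to rank the detections by radius. This is only a sketch of that idea: the list below is hypothetical data in the same [center, radius] layout as detectedCircles, and mapping ratios to coin values is left out because it depends on the actual coin dimensions and image scale.

```python
# Hypothetical detections in the same [center, radius] layout as detectedCircles.
detected_circles = [[(40, 52), 20], [(120, 60), 25], [(200, 58), 16]]

def relative_sizes(circles):
    """Return (center, radius, radius / largest radius), largest first."""
    if not circles:
        return []
    largest = max(radius for _, radius in circles)
    ranked = sorted(circles, key=lambda c: c[1], reverse=True)
    return [(center, radius, radius / largest) for center, radius in ranked]

sizes = relative_sizes(detected_circles)
for center, radius, ratio in sizes:
    print(center, radius, round(ratio, 2))
```

Ratios are scale-independent only if all coins lie in the same plane and the camera looks at them roughly head-on.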

Problem to remove distortion (after camera-calibration) from image

The image is a single frame of a 90-minute video.
If one draws a line from the top left corner (see, left-distorted) to the right (see, right-distorted), the distortion is visible by looking at the top-line (see, middle-distorted).
The white line between the points to the left and the right should be straight but has a slight U-shape compared to the straight line in red.
Performing a camera calibration with open-cv (reference) using a sample of entire frames from calibration videos (for some examples see, indoor, outdoor-1, and outdoor-2) results in this undistorted image.
Here, the line from the top left corner (see, left-undistorted) to the right (see, right-undistorted) overcorrects the distortion as the difference between the red-line and the white has become an inverse U shape (see, middle-undistorted).
I took the recording with an iPhone 11, and I am using python 3.8.8 and open-cv 4.5.3.
I followed the advice I could find on StackOverflow (and in the most popular search results), but no checkerboard variation (size, camera angle, distance, and setting) corrects the distortion. I fail to understand why.
Based on the conversation below, the calibration footage needs to meet two conditions:
the focus should be similar to that of the reference scenario (in my case, objects are far away),
the board should cover a minimum of, say, 20% of the image.
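The coverage condition can be checked per frame from the detected corners. The sketch below estimates the fraction of the image covered by the quadrilateral spanned by the four outermost inner corners. It assumes the corners are ordered row by row, as the object points in this post are generated. The function names and the synthetic grid are hypothetical, and the result slightly underestimates the board area because the outer ring of squares is not included.

```python
import numpy as np

def polygon_area(points):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def board_coverage(corners, pattern_size, image_size):
    """Fraction of the image covered by the detected inner-corner grid."""
    cols, rows = pattern_size
    pts = np.asarray(corners, dtype=float).reshape(-1, 2)
    # First row start/end, last row end/start: the four outermost corners in order.
    quad = pts[[0, cols - 1, rows * cols - 1, (rows - 1) * cols]]
    width, height = image_size
    return polygon_area(quad) / (width * height)

# Synthetic 9x6 grid with 50 px spacing, offset by (100, 100), in a 1000x500 image.
grid = np.mgrid[0:9, 0:6].T.reshape(-1, 2) * 50.0 + 100.0
coverage = board_coverage(grid, (9, 6), (1000, 500))
print(coverage)
```

Frames whose coverage falls below the chosen threshold could then be skipped before calibration.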
Here is the video that I used (focus is fixed on objects far away) to extract a sample of 50 frames to calculate the camera calibration:
objpoints = []  # 3D points in real-world space
imgpoints = []  # 2D points in the image plane
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
objp = np.zeros((CHECKERBOARD_HEIGHT * CHECKERBOARD_WIDTH, 3), np.float32)
objp[:, :2] = np.mgrid[0:CHECKERBOARD_WIDTH, 0:CHECKERBOARD_HEIGHT].T.reshape(-1, 2)
objp = objp * SQUARE_SIZE
# for loop going over the sample of images (the body below runs once per image)
img = cv2.imread('path to image')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, corners = cv2.findChessboardCorners(gray, (CHECKERBOARD_WIDTH, CHECKERBOARD_HEIGHT), None)
if ret:
    # refining pixel coordinates for given 2d points.
    corners2 = cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
    objpoints.append(objp)
    imgpoints.append(corners2)
# Calculate camera calibration
img = cv2.imread('path to reference image')
h,w = img.shape[:2]
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, (w,h), None, None)
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))
# Undistort
mapx,mapy = cv2.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (w,h), 5)
dst = cv2.remap(img,mapx,mapy,cv2.INTER_LINEAR)
cv2.imwrite('path to output image', dst)
The result is unfortunately not correct as shown by the deviation of the red and white line at the top of the pitch.
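To put a number on the deviation between the white line and the straight red line, one option is to sample points along the white line and measure their largest perpendicular distance from the chord joining the end points. This is a minimal sketch with hypothetical sample points. The sign of the result tells on which side of the chord the line bends, so a U shape and an inverse U shape give opposite signs.

```python
import numpy as np

def max_signed_deviation(points):
    """Largest signed perpendicular distance (px) of the points from the
    straight line joining the first and the last point."""
    pts = np.asarray(points, dtype=float)
    start, end = pts[0], pts[-1]
    chord = end - start
    rel = pts - start
    # 2D cross product = signed parallelogram area; divide by the chord length.
    cross = chord[0] * rel[:, 1] - chord[1] * rel[:, 0]
    signed = cross / np.linalg.norm(chord)
    return float(signed[np.argmax(np.abs(signed))])

# Hypothetical samples: the line sags by 4 px before undistortion
# and bulges by 3 px in the other direction afterwards.
before = [(0, 0), (50, 4), (100, 0)]
after = [(0, 0), (50, -3), (100, 0)]
dev_before = max_signed_deviation(before)
dev_after = max_signed_deviation(after)
print(dev_before, dev_after)
```

Tracking this value across calibration attempts gives an objective way to tell whether a new set of coefficients under- or over-corrects.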

Cv2 findChessboardCorners fails to find corners

I am using cv2.findChessboardCorners for camera calibration in a vision application. My call to the function looks like this:
def auto_detect_checkerboard(self, image):
    # any flags after CALIB_CB_ADAPTIVE_THRESH were cut off in the post
    retval, corners = cv2.findChessboardCorners(image, (7, 7), flags=cv2.CALIB_CB_ADAPTIVE_THRESH)
    if retval:
        return corners[0][0], corners[0][1]
    else:
        print("No Checkerboard Found")
        assert False
But it seems to fail to find any corners on all images I have tried with it so far. The most trivial example I have used is
Is there an issue with my use of the function? Or is there an issue with the image that I need to deal with in preprocessing?
So far I have tried converting to grayscale, and applying a Gaussian filter, neither of which seem to have made a difference.
My approach to the problem is to perform color segmentation to get a binary mask, then use the binary mask to remove the background so that the board is visible and free of artifacts, and finally output the chessboard features accurately.
Performing color segmentation: We convert the loaded image to the HSV format, define lower/upper ranges, and perform color segmentation using cv2.inRange to obtain a binary mask.
Extracting the chessboard: After obtaining the binary mask, we use it to remove the background and separate the chessboard from the rest of the image using cv2.bitwise_and. The bitwise AND operation is highly useful for defining an ROI in HSV images.
Displaying chessboard features: After extracting the chessboard from the image, we set patternSize to (7, 7) and flags to adaptive_thresh + fast_check + normalize_image, inspired by the source.
Color-segmentation to get the binary mask.
lwr = np.array([0, 0, 143])
upr = np.array([179, 61, 252])
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
msk = cv2.inRange(hsv, lwr, upr)
Removing background using mask
krn = cv2.getStructuringElement(cv2.MORPH_RECT, (50, 30))
dlt = cv2.dilate(msk, krn, iterations=5)
res = 255 - cv2.bitwise_and(dlt, msk)
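Displaying Chess-board features
res = np.uint8(res)
ret, corners = cv2.findChessboardCorners(res, (7, 7),
                                         flags=cv2.CALIB_CB_ADAPTIVE_THRESH +
                                               cv2.CALIB_CB_FAST_CHECK +
                                               cv2.CALIB_CB_NORMALIZE_IMAGE)
if ret:
    fnl = cv2.drawChessboardCorners(img, (7, 7), corners, ret)
    cv2.imshow("fnl", fnl)
    cv2.waitKey(0)
else:
    print("No Checkerboard Found")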
Code:
import cv2
import numpy as np
# Load the image
img = cv2.imread("kFM1C.jpg")
# Color-segmentation to get binary mask
lwr = np.array([0, 0, 143])
upr = np.array([179, 61, 252])
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
msk = cv2.inRange(hsv, lwr, upr)
# Extract chess-board
krn = cv2.getStructuringElement(cv2.MORPH_RECT, (50, 30))
dlt = cv2.dilate(msk, krn, iterations=5)
res = 255 - cv2.bitwise_and(dlt, msk)
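# Displaying chess-board features
res = np.uint8(res)
ret, corners = cv2.findChessboardCorners(res, (7, 7),
                                         flags=cv2.CALIB_CB_ADAPTIVE_THRESH +
                                               cv2.CALIB_CB_FAST_CHECK +
                                               cv2.CALIB_CB_NORMALIZE_IMAGE)
if ret:
    fnl = cv2.drawChessboardCorners(img, (7, 7), corners, ret)
    cv2.imshow("fnl", fnl)
    cv2.waitKey(0)
else:
    print("No Checkerboard Found")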
To find lower and upper boundaries of the mask, you may find useful: HSV-Threshold-script
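To make the color-segmentation step more concrete, the per-pixel test that cv2.inRange performs can be written in plain NumPy: a pixel is kept when every channel lies within the inclusive lower/upper bounds. The function `in_range` and the three sample pixels below are hypothetical; the bounds are the ones used in this answer.

```python
import numpy as np

def in_range(hsv, lower, upper):
    """255 where every channel lies within [lower, upper] (inclusive), else 0."""
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

lwr = np.array([0, 0, 143])
upr = np.array([179, 61, 252])

# Hypothetical 1x3 HSV image: a bright low-saturation pixel (kept),
# a dark pixel (value too low) and a strongly saturated pixel (rejected).
hsv = np.array([[[10, 20, 200], [10, 20, 100], [10, 200, 200]]], dtype=np.uint8)
msk = in_range(hsv, lwr, upr)
print(msk)
```

Seen this way, the chosen bounds select bright, weakly saturated pixels of any hue, which is what the light parts of a printed board look like in HSV.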
In my environment (opencv-python, OpenCV 4.5.4), just converting the image to grayscale makes detection work without additional adjustment (at least, it detected all but the lower-left corners). After downsampling with resize(), all corners are detected.
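img_captured = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)
# img_captured = cv2.resize(img_captured, (350, 350))
GRID = (7, 7)
found, corners = cv2.findChessboardCorners(img_captured, GRID)
# BGR copy to draw on (assumed; the original line was lost in the post)
img_captured_corners = cv2.cvtColor(img_captured, cv2.COLOR_GRAY2BGR)
cv2.drawChessboardCorners(img_captured_corners, GRID, corners, found)
cv2.imshow('img_captured_corners', img_captured_corners)
cv2.waitKey(0)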
findChessboardCorners no resize
There is also findChessboardCornersSB. In my experience, it generally works better than the plain version. However, I don't know how the two methods compare in benchmarks.

Getting a centre of an irregular shape

I have an irregular shape like the one below:
I need to get the centre of that white area. I tried using contours in OpenCV, like this:
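ret, thresh = cv.threshold(img, 127, 255, cv.THRESH_BINARY_INV)
# OpenCV 4 returns (contours, hierarchy); the last argument was cut off in the post
contours, _ = cv.findContours(thresh.copy(), cv.RETR_EXTERNAL,
                              cv.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
x, y, w, h = cv.boundingRect(cnt)
res_img = cv.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)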
But those contours don't give me very good results, as you can see from the original image and the two small black points below the picture. Can someone point me to a good solution for getting the centre of an irregular shape like the one above?
As I suggested, perform an erosion followed by a dilation (an opening operation) on the binary image, then compute the image moments and use them to calculate the centroid. These are the steps:
Get a binary image from the input via Otsu's thresholding
Apply an opening (erosion followed by dilation) to remove small noise
Compute the image moments using cv2.moments
Compute the blob's centroid using the previous information
Let's see the code:
import cv2
import numpy as np
# Set image path
path = "C:/opencvImages/"
fileName = "pn43H.png"
# Read Input image
inputImage = cv2.imread(path+fileName)
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu (inverted binary output):
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
This is the binary image you get. Notice the small noise:
The opening operation will get rid of the small blobs. A rectangular structuring element will suffice; let's use 3 iterations:
# Apply an erosion + dilation to get rid of small noise:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 3
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform opening:
openingImage = cv2.morphologyEx(binaryImage, cv2.MORPH_OPEN, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
This is the filtered image:
Now, compute the image moments and then the blob's centroid:
# Calculate the moments
imageMoments = cv2.moments(openingImage)
# Compute centroid
cx = int(imageMoments['m10']/imageMoments['m00'])
cy = int(imageMoments['m01']/imageMoments['m00'])
# Print the point:
print("Cx: "+str(cx))
print("Cy: "+str(cy))
Additionally, let's draw this point onto the binary image to check out the results:
# Draw centroid onto BGR image:
bgrImage = cv2.cvtColor(binaryImage, cv2.COLOR_GRAY2BGR)
bgrImage = cv2.line(bgrImage, (cx,cy), (cx,cy), (0,255,0), 10)
This is the result:
One can think of the centroid calculated using the image moments as the "mass" center of the object in relation to the pixel intensity. Depending on the actual shape of the object it may not even be inside the object.
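The point about the centroid possibly falling outside the object is easy to reproduce. The sketch below computes the centroid from the raw moments m00, m10 and m01 in plain NumPy, treating the mask as binary, and applies it to a hypothetical square ring whose centroid lands in the empty middle. The function name and the test mask are illustrative only.

```python
import numpy as np

def centroid_from_moments(mask):
    """Centroid (cx, cy) of the non-zero pixels via raw moments m00, m10, m01."""
    ys, xs = np.nonzero(mask)
    m00 = xs.size              # area: number of foreground pixels
    if m00 == 0:
        raise ValueError("mask has no foreground pixels")
    m10 = xs.sum()             # sum of x coordinates
    m01 = ys.sum()             # sum of y coordinates
    return m10 / m00, m01 / m00

# Hypothetical 7x7 mask: a one-pixel-wide square ring with an empty interior.
mask = np.zeros((7, 7), dtype=np.uint8)
mask[1:6, 1:6] = 255
mask[2:5, 2:5] = 0
cx, cy = centroid_from_moments(mask)
print(cx, cy, mask[int(cy), int(cx)])
```

For shapes like this, a point guaranteed to lie inside the blob has to come from a different criterion than the moments.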
An alternative would be calculating the center of the bounding circle:
thresh = cv2.morphologyEx(thresh, cv2.MORPH_DILATE, np.ones((3, 3)))
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# keep only blobs large enough to be the shape of interest
contours = [c for c in contours if cv2.contourArea(c) > 100]
(x, y), r = cv2.minEnclosingCircle(contours[0])
output = thresh.copy()
# mark the center, label it, and draw the enclosing circle
cv2.circle(output, (int(x), int(y)), 3, (0, 0, 0), -1)
cv2.putText(output, f"{(int(x), int(y))}", (int(x-50), int(y-10)), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 0), 1)
cv2.circle(output, (int(x), int(y)), int(r), (255, 0, 0), 2)
The output of that code looks like this:
You may want to try the connectedComponentsWithStats function. It returns centroids, areas and bounding box parameters. Also, blur and morphology (dilate/erode) help a lot with noise, as noted above. If you're generous enough with the erosion, you'll get almost no stray pixels after processing.

