OpenCV, Python: Perspective warping problem in aerial image stitching - python
Currently, I'm working on image stitching of aerial footage. I'm using the dataset obtained from OrchardDataset. First of all, thanks to some great answers on Stack Overflow, especially the answers from @alkasm (Here and Here). But I'm having an issue, as you can see below in the Gap within the stitched image section.
I used H21, H31, H41, etc. to warp the images. The stitched image using H21 is excellent, but when I warp img3 onto the current stitched image using H31, the result shows terrible alignment between img3 and the current stitched image. The more images I warp, the bigger the gap gets, and the images end up badly misaligned.
Does the brilliant Stack Overflow community have any ideas on how I can solve this problem?
These are the steps I use to stitch the images:
Extract a frame every second from the footage and undistort each frame to remove the fish-eye effect using the provided camera calibration matrix.
Compute the SIFT feature descriptors. Set up the matcher using a FLANN kd-tree and find matches between the images. Find the homographies (H21, H32, H43, etc., where H21 refers to the homography which warps img2 into the coordinates of img1).
Compose each homography with the previous homographies to get the net homography, using the method suggested Here (compute H31, H41, H51, etc.); a minimal sketch of this composition follows the list.
Warp the images using the answer provided Here.
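For reference, here is a minimal sketch of the homography chaining described in the composition step above (the function and variable names are illustrative, not taken from my script): each net homography is the product of the previous net homography and the next pairwise homography, renormalized so the bottom-right entry stays 1.

import numpy as np

def compose_homographies(pairwise_Hs):
    """Chain pairwise homographies [H21, H32, H43, ...] into net
    homographies to the first frame [H21, H31, H41, ...]."""
    net_Hs = []
    H_net = np.eye(3)
    for H_pair in pairwise_Hs:
        # e.g. H31 = H21 @ H32: first map img3 -> img2, then img2 -> img1
        H_net = H_net @ H_pair
        H_net = H_net / H_net[2, 2]   # renormalize to limit numeric drift
        net_Hs.append(H_net.copy())
    return net_Hs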
Gap within the stitched image:
I'm using the first 10 images obtained from OrchardDataSet.
Stitched Image with Gaps
Here's a portion of my script:
main.py
ref_img is the first frame (img1). AdjHomoSet contains the images to be warped (img2, img3, img4, etc.). AccHomoSet contains the net homographies (H31, H41, H51, etc.)
temp_mosaic = ref_img
h, w = temp_mosaic.shape[:2]
# Warp the images
for x in range(1, (len(AccHomoSet)+1)):
    query_img = AdjHomoSet['H%d%d'%(x+1,(x))][1]
    M_homo = AccHomoSet['H%d1'%(x+1)]
    M_homo_inv = np.linalg.inv(M_homo)

    (shifted_transf, dst_padded) = warpPerspectivePadded(query_img,
                                                         temp_mosaic,
                                                         M_homo_inv)
    dst_pad_h, dst_pad_w = dst_padded.shape[:2]
    next_img_warp = cv2.warpPerspective(query_img, shifted_transf,
                                        (dst_pad_w, dst_pad_h),
                                        flags=cv2.INTER_NEAREST)

    # Put the base image on an enlarged palette
    enlarged_base_img = np.zeros((dst_pad_h, dst_pad_w, 3), np.uint8)

    # Create masked composite
    (ret, data_map) = cv2.threshold(cv2.cvtColor(next_img_warp,
                                                 cv2.COLOR_BGR2GRAY),
                                    0, 255, cv2.THRESH_BINARY)

    # Add base image
    enlarged_base_img = cv2.add(enlarged_base_img, dst_padded,
                                mask=np.bitwise_not(data_map),
                                dtype=cv2.CV_8U)
    final_img = cv2.add(enlarged_base_img, next_img_warp,
                        dtype=cv2.CV_8U)
    temp_mosaic = final_img
warpPerspectivePadded.py
def warpPerspectivePadded(image, temp_mosaic, homography):
    src_h, src_w = image.shape[:2]
    lin_homg_pts = np.array([[0, src_w, src_w, 0],
                             [0, 0, src_h, src_h],
                             [1, 1, 1, 1]])

    trans_lin_homg_pts = homography.dot(lin_homg_pts)
    trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

    minX = np.floor(np.min(trans_lin_homg_pts[0])).astype(int)
    minY = np.floor(np.min(trans_lin_homg_pts[1])).astype(int)
    maxX = np.ceil(np.max(trans_lin_homg_pts[0])).astype(int)
    maxY = np.ceil(np.max(trans_lin_homg_pts[1])).astype(int)

    # add translation to the transformation matrix to shift to positive values
    anchorX, anchorY = 0, 0
    transl_transf = np.eye(3,3)
    if minX < 0:
        anchorX = -minX
        transl_transf[0,2] += anchorX
    if minY < 0:
        anchorY = -minY
        transl_transf[1,2] += anchorY
    shifted_transf = transl_transf.dot(homography)
    shifted_transf /= shifted_transf[2,2]

    # create padded destination image
    temp_mosaic_h, temp_mosaic_w = temp_mosaic.shape[:2]

    pad_widths = [anchorY, max(maxY, temp_mosaic_h) - temp_mosaic_h,
                  anchorX, max(maxX, temp_mosaic_w) - temp_mosaic_w]

    dst_padded = cv2.copyMakeBorder(temp_mosaic, pad_widths[0],
                                    pad_widths[1], pad_widths[2],
                                    pad_widths[3],
                                    cv2.BORDER_CONSTANT)

    return (shifted_transf, dst_padded)
Updates:
Well, here's my code for image stitching. This solution is not perfect, but I hope it will be helpful to someone else. It is good enough for generating a panorama view, and SIFT+FLANN did the best on the dataset: Stitched image of the dataset with Straightline flight pattern. However, the inter-frame alignment drifts badly and visible skew appears when stitching the dataset with the lawnmower flight pattern (Stitched image of the dataset with lawnmower flight pattern), so this solution is definitely not ideal for an orthomosaic.
imageStitcher.py
import cv2
import numpy as np
import glob
import os
import time
#import math
from colorama import Style, Back
import xlsxwriter as xls
"""
Important Parameter
-------------------
detector_type (string): type of determine, "sift" or "orb"
Defaults to "sift".
matcher_type (string): type of determine, "flann" or "bf"
Defaults to "flann".
resize_ratio (int) = number needed to decrease the input images size
output_height_times (int): determines the output height based on input image height.
Defaults to 2.
output_width_times (int): determines the output width based on input image width.
Defaults to 4.
"""
detector_type = "sift"
matcher_type = "flann"
resize_ratio = 3
output_height_times = 20
output_width_times = 15
gms = False
visualize = True
image_dir = "image/Input"
key_frame = "image/Input/frame1.jpg"
output_dir = "image/Input"
class ImageStitching:
    def __init__(self, first_image,
                 output_height_times = output_height_times,
                 output_width_times = output_width_times,
                 detector_type = detector_type,
                 matcher_type = matcher_type):
        """This class processes every frame and generates the panorama

        Args:
            first_image (image for the first frame): first image to initialize the output size
            output_height_times (int, optional): determines the output height based on input image height. Defaults to 2.
            output_width_times (int, optional): determines the output width based on input image width. Defaults to 4.
            detector_type (str, optional): the detector for feature detection. It can be "sift" or "orb". Defaults to "sift".
        """
        self.detector_type = detector_type
        self.matcher_type = matcher_type
        if detector_type == "sift":
            # SIFT feature detector
            self.detector = cv2.xfeatures2d.SIFT_create(nOctaveLayers = 3,
                                                        contrastThreshold = 0.04,
                                                        edgeThreshold = 10,
                                                        sigma = 1.6)
            if matcher_type == "flann":
                # FLANN: the randomized kd-trees algorithm
                FLANN_INDEX_KDTREE = 1
                flann_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
                search_params = dict(checks = 200)
                self.matcher = cv2.FlannBasedMatcher(flann_params, search_params)
            else:
                # Brute-force matcher
                self.matcher = cv2.BFMatcher()
        elif detector_type == "orb":
            # ORB feature detector
            self.detector = cv2.ORB_create()
            self.detector.setFastThreshold(0)
            if matcher_type == "flann":
                FLANN_INDEX_LSH = 6
                flann_params = dict(algorithm = FLANN_INDEX_LSH,
                                    table_number = 6,      # 12
                                    key_size = 12,         # 20
                                    multi_probe_level = 1) # 2
                search_params = dict(checks = 200)
                self.matcher = cv2.FlannBasedMatcher(flann_params, search_params)
            else:
                # Brute-force Hamming matcher
                self.matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

        self.record = []
        self.visualize = visualize
        self.output_img = np.zeros(shape=(int(output_height_times * first_image.shape[0]),
                                          int(output_width_times * first_image.shape[1]),
                                          first_image.shape[2]))

        self.process_first_frame(first_image)

        # output image offset
        self.w_offset = int(self.output_img.shape[0]/2 - first_image.shape[0]/2)
        self.h_offset = int(self.output_img.shape[1]/2 - first_image.shape[1]/2)

        self.output_img[self.w_offset:self.w_offset+first_image.shape[0],
                        self.h_offset:self.h_offset+first_image.shape[1], :] = first_image

        a = self.output_img
        heightM, widthM = a.shape[:2]
        a = cv2.resize(a, (int(widthM / 4),
                           int(heightM / 4)),
                       interpolation=cv2.INTER_AREA)
        # cv2.imshow('output', a)

        self.H_old = np.eye(3)
        self.H_old[0, 2] = self.h_offset
        self.H_old[1, 2] = self.w_offset

    def process_first_frame(self, first_image):
        """processes the first frame for feature detection and description

        Args:
            first_image (cv2 image/np array): first image for feature detection
        """
        self.base_frame_rgb = first_image
        base_frame_gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)
        base_frame = cv2.GaussianBlur(base_frame_gray, (5,5), 0)
        self.base_features, self.base_desc = self.detector.detectAndCompute(base_frame, None)

    def process_adj_frame(self, next_frame_rgb):
        """gets an image and processes that image for mosaicing

        Args:
            next_frame_rgb (np array): input of current frame for the mosaicing
        """
        self.next_frame_rgb = next_frame_rgb
        next_frame_gray = cv2.cvtColor(next_frame_rgb, cv2.COLOR_BGR2GRAY)
        next_frame = cv2.GaussianBlur(next_frame_gray, (5,5), 0)
        self.next_features, self.next_desc = self.detector.detectAndCompute(next_frame, None)

        self.matchingNhomography(self.next_desc, self.base_desc)

        if len(self.matches) < 4:
            return

        print ("\n")
        self.warp(self.next_frame_rgb, self.H)

        # For record purposes: saved to a spreadsheet later
        self.record.append([len(self.base_features), len(self.next_features),
                            self.no_match_lr, self.no_GMSmatches, self.inlier,
                            self.inlierRatio, self.reproError])

        # loop preparation
        self.H_old = self.H
        self.base_features = self.next_features
        self.base_desc = self.next_desc
        self.base_frame_rgb = self.next_frame_rgb
    def matchingNhomography(self, next_desc, base_desc):
        """matches the descriptors

        Args:
            next_desc (np array): current frame descriptor
            base_desc (np array): previous frame descriptor

        Returns:
            array: an array of matches between descriptors
        """
        # matching
        if self.detector_type == "sift":
            pair_matches = self.matcher.knnMatch(next_desc, trainDescriptors = base_desc,
                                                 k = 2)
            """
            Store all the good matches as per Lowe's ratio test.
            Lowe's ratio test refers to the paper "Distinctive
            Image Features from Scale-Invariant Keypoints" by
            David G. Lowe.
            """
            lowe_ratio = 0.8
            matches = []
            for m, n in pair_matches:
                if m.distance < n.distance * lowe_ratio:
                    matches.append(m)

            self.no_match_lr = len(matches)
            # Rate of matches (Lowe's ratio test)
            rate = float(len(matches) / ((len(self.base_features) + len(self.next_features))/2))
            print (f"Rate of matches (Lowe's ratio test): {Back.RED}%f{Style.RESET_ALL}" % rate)

        elif self.detector_type == "orb":
            if self.matcher_type == "flann":
                matches = self.matcher.match(next_desc, base_desc)
                '''
                lowe_ratio = 0.8
                matches = []
                for m, n in pair_matches:
                    if m.distance < n.distance * lowe_ratio:
                        matches.append(m)
                '''
                self.no_match_lr = len(matches)
                # Rate of matches (Lowe's ratio test)
                rate = float(len(matches) / (len(base_desc) + len(next_desc)))
                print (f"Rate of matches (Lowe's ratio test): {Back.RED}%f{Style.RESET_ALL}" % rate)
            else:
                pair_matches = self.matcher.match(next_desc, base_desc)
                # Rate of matches (before Lowe's ratio test)
                self.no_match_lr = len(pair_matches)
                rate = float(len(pair_matches) / (len(base_desc) + len(next_desc)))
                print (f"Rate of matches: {Back.RED}%f{Style.RESET_ALL}" % rate)

        # Sort them in the order of their distance.
        matches = sorted(matches, key=lambda x: x.distance)

        # OPTIONAL: used to remove the unmatched pairs
        matches = cv2.xfeatures2d.matchGMS(self.next_frame_rgb.shape[:2],
                                           self.base_frame_rgb.shape[:2],
                                           self.next_features,
                                           self.base_features, matches,
                                           withScale = False, withRotation = False,
                                           thresholdFactor = 6.0) if gms else matches

        self.no_GMSmatches = len(matches) if gms else 0
        # Rate of matches (GMS)
        rate = float(self.no_GMSmatches / (len(base_desc) + len(next_desc)))
        print (f"Rate of matches (GMS): {Back.CYAN}%f{Style.RESET_ALL}" % rate)

        # OPTIONAL: keep at most the 20 best matches
        # matches = matches[:min(len(matches), 20)]

        # Visualize the matches.
        if self.visualize:
            match_img = cv2.drawMatches(self.next_frame_rgb, self.next_features, self.base_frame_rgb,
                                        self.base_features, matches, None,
                                        flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
            cv2.imshow('matches', match_img)

        self.H, self.status, self.reproError = self.findHomography(self.next_features, self.base_features, matches)
        print ('inlier/matched = %d / %d' % (np.sum(self.status), len(self.status)))

        self.inlier = np.sum(self.status)
        self.inlierRatio = float(np.sum(self.status)) / float(len(self.status))
        print ('inlierRatio = ', self.inlierRatio)

        # len(status) - np.sum(status) = number of detected outliers

        '''
        TODO -
        To minimize or get rid of the cumulative homography error, use block bundle adjustment,
        as suggested in "Multi View Image Stitching of Planar Surfaces on Mobile Devices".
        Using 3-dimensional multiplication to find the cumulative homography is very sensitive
        to homography error.
        '''

        # 3-dimensional multiplication to find the cumulative homography to the reference keyframe
        self.H = np.matmul(self.H_old, self.H)
        self.H = self.H/self.H[2,2]

        self.matches = matches
        return matches
    @staticmethod
    def findHomography(base_features, next_features, matches):
        """gets the matched keypoints and calculates the homography between the two images

        Args:
            base_features (np array): keypoints of image 1
            next_features (np array): keypoints of image 2
            matches (np array): matches between keypoints in image 1 and image 2

        Returns:
            np array of shape [3,3]: homography matrix
        """
        kp1 = []
        kp2 = []
        for match in matches:
            kp1.append(base_features[match.queryIdx])
            kp2.append(next_features[match.trainIdx])
        p1_array = np.array([k.pt for k in kp1])
        p2_array = np.array([k.pt for k in kp2])

        homography, status = cv2.findHomography(p1_array, p2_array, method = cv2.RANSAC,
                                                ransacReprojThreshold = 5.0,
                                                mask = None,
                                                maxIters = 2000,
                                                confidence = 0.995)

        #### Finding the euclidean distance error ####
        list1 = np.array(p2_array)
        list2 = np.array(p1_array)
        list2 = np.reshape(list2, (len(list2), 2))
        ones = np.ones(len(list1))
        TestPoints = np.transpose(np.reshape(list1, (len(list1), 2)))
        print ("Length:", np.shape(TestPoints), np.shape(ones))
        TestPointsHom = np.vstack((TestPoints, ones))
        print ("Homogenous Points:", np.shape(TestPointsHom))

        # projecting the points in the test image to the mosaic image using the homography matrix
        projectedPointsH = np.matmul(homography, TestPointsHom)
        projectedPointsNH = np.transpose(np.array([np.true_divide(projectedPointsH[0,:], projectedPointsH[2,:]),
                                                   np.true_divide(projectedPointsH[1,:], projectedPointsH[2,:])]))

        print ("list2 shape:", np.shape(list2))
        print ("NH Points shape:", np.shape(projectedPointsNH))
        print ("Raw Error Vector:", np.shape(np.linalg.norm(projectedPointsNH-list2, axis=1)))
        Error = int(np.sum(np.linalg.norm(projectedPointsNH-list2, axis=1)))
        print ("Total Error:", Error)
        AvgError = np.divide(np.array(Error), np.array(len(list1)))
        print ("Average Error:", AvgError)
        ##################

        return homography, status, AvgError
    def warp(self, next_frame_rgb, H):
        """warps the current frame based on the calculated homography H

        Args:
            next_frame_rgb (np array): current frame
            H (np array of shape [3,3]): homography matrix

        Returns:
            np array: image output of mosaicing
        """
        warped_img = cv2.warpPerspective(
            next_frame_rgb, H, (self.output_img.shape[1], self.output_img.shape[0]),
            flags=cv2.INTER_LINEAR)

        transformed_corners = self.get_transformed_corners(next_frame_rgb, H)
        warped_img = self.draw_border(warped_img, transformed_corners)

        self.output_img[warped_img > 0] = warped_img[warped_img > 0]
        output_temp = np.copy(self.output_img)
        output_temp = self.draw_border(output_temp, transformed_corners, color=(0, 0, 255))

        # Visualize the stitched result
        if self.visualize:
            output_temp_copy = output_temp/255.
            output_temp_copy = cv2.normalize(output_temp_copy, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)  # convert float64 to uint8
            size = 720
            heightM, widthM = output_temp_copy.shape[:2]
            ratio = size / float(heightM)
            output_temp_copy = cv2.resize(output_temp_copy, (int(ratio * widthM), size), interpolation=cv2.INTER_AREA)
            cv2.imshow('output', output_temp_copy)

        return self.output_img

    @staticmethod
    def get_transformed_corners(next_frame_rgb, H):
        """finds the corners of the current frame after warping

        Args:
            next_frame_rgb (np array): current frame
            H (np array of shape [3,3]): homography matrix

        Returns:
            [np array]: a list of 4 corner points after warping
        """
        corner_0 = np.array([0, 0])
        corner_1 = np.array([next_frame_rgb.shape[1], 0])
        corner_2 = np.array([next_frame_rgb.shape[1], next_frame_rgb.shape[0]])
        corner_3 = np.array([0, next_frame_rgb.shape[0]])

        corners = np.array([[corner_0, corner_1, corner_2, corner_3]], dtype=np.float32)
        transformed_corners = cv2.perspectiveTransform(corners, H)

        transformed_corners = np.array(transformed_corners, dtype=np.int32)

        # output_temp = np.copy(output_img)
        # mask = np.zeros(shape=(output_temp.shape[0], output_temp.shape[1], 1))
        # cv2.fillPoly(mask, transformed_corners, color=(1, 0, 0))
        # cv2.imshow('mask', mask)

        return transformed_corners

    def draw_border(self, image, corners, color=(0, 0, 0)):
        """This function draws a rectangular border

        Args:
            image ([type]): current mosaiced output
            corners (np array): list of corner points
            color (tuple, optional): color of the border lines. Defaults to (0, 0, 0).

        Returns:
            np array: the output image with border
        """
        for i in range(corners.shape[1]-1, -1, -1):
            cv2.line(image, tuple(corners[0, i, :]), tuple(
                corners[0, i-1, :]), thickness=5, color=color)

        return image
    @staticmethod
    def stitchedimg_crop(stitched_img):
        """This function crops the black edges

        Args:
            stitched_img (np array): stitched image with black edges

        Returns:
            np array: the output image with no black edges
        """
        stitched_img = cv2.normalize(stitched_img, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)  # convert float64 to uint8

        # Crop black edges
        stitched_img_gray = cv2.cvtColor(stitched_img, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(stitched_img_gray, 1, 255, cv2.THRESH_BINARY)
        dino, contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        print ("Cropping black edge of stitched image ...")
        print ("Found %d contours...\n" % (len(contours)))

        max_area = 0
        best_rect = (0,0,0,0)

        for cnt in contours:
            x,y,w,h = cv2.boundingRect(cnt)

            deltaHeight = h-y
            deltaWidth = w-x
            if deltaHeight < 0 or deltaWidth < 0:
                deltaHeight = h+y
                deltaWidth = w+x

            area = deltaHeight * deltaWidth

            if (area > max_area and deltaHeight > 0 and deltaWidth > 0):
                max_area = area
                best_rect = (x,y,w,h)

        if (max_area > 0):
            final_img_crop = stitched_img[best_rect[1]:best_rect[1]+best_rect[3],
                                          best_rect[0]:best_rect[0]+best_rect[2]]

        return final_img_crop
def main():
    images = sorted(glob.glob(image_dir + "/*.jpg"),
                    key=lambda x: int(os.path.splitext(os.path.basename(x))[0][5:]))

    # read the first frame
    first_frame = cv2.imread(key_frame)
    heightM, widthM = first_frame.shape[:2]
    first_frame = cv2.resize(first_frame, (int(widthM / resize_ratio),
                                           int(heightM / resize_ratio)),
                             interpolation=cv2.INTER_AREA)

    image_stitching = ImageStitching(first_frame)
    round = 2

    for next_img_path in images[1:]:
        print (f'Reading {Back.YELLOW}%s{Style.RESET_ALL}...' % next_img_path)
        next_frame_rgb = cv2.imread(next_img_path)
        heightM, widthM = next_frame_rgb.shape[:2]
        next_frame_rgb = cv2.resize(next_frame_rgb, (int(widthM / resize_ratio),
                                                     int(heightM / resize_ratio)),
                                    interpolation=cv2.INTER_AREA)
        print ("Stitching %d / %d of image ..." % (round, len(images)))

        # process each frame
        image_stitching.process_adj_frame(next_frame_rgb)

        round += 1
        if round > len(images):
            print ("Please press 'q' to continue the process ...")
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # cv2.imwrite('mosaic.jpg', image_stitching.output_img)

    final_img_crop = image_stitching.stitchedimg_crop(image_stitching.output_img)
    print ("Image stitching done ...")
    cv2.imwrite("%s/Normal.JPG" % output_dir, final_img_crop)

    # Save important results into a spreadsheet
    tuplelist = tuple(image_stitching.record)
    workbook = xls.Workbook('Normal.xlsx')
    worksheet = workbook.add_worksheet("Normal")

    row = 0
    col = 0
    worksheet.write(row, col, 'number_pairs')
    worksheet.write(row, col + 1, 'basefeature')
    worksheet.write(row, col + 2, 'nextfeature')
    worksheet.write(row, col + 3, 'no_match_lr')
    worksheet.write(row, col + 4, 'match_rate')
    worksheet.write(row, col + 5, 'no_GMSmatches (OFF)')
    worksheet.write(row, col + 6, 'gms_match_rate')
    worksheet.write(row, col + 7, 'inlier')
    worksheet.write(row, col + 8, 'inlierratio')
    worksheet.write(row, col + 9, 'reproerror')
    row += 1
    number = 1

    # Iterate over the data and write it out row by row.
    for basefeature, nextfeature, no_match_lr, no_GMSmatches, inlier, inlierratio, reproerror in (tuplelist):
        worksheet.write(row, col, number)
        worksheet.write(row, col + 1, basefeature)
        worksheet.write(row, col + 2, nextfeature)
        worksheet.write(row, col + 3, no_match_lr)
        match_rate = no_match_lr / ((basefeature+nextfeature)/2)
        worksheet.write(row, col + 4, match_rate)
        worksheet.write(row, col + 5, no_GMSmatches)
        gms_match_rate = no_GMSmatches / ((basefeature+nextfeature)/2)
        worksheet.write(row, col + 6, gms_match_rate)
        worksheet.write(row, col + 7, inlier)
        worksheet.write(row, col + 8, inlierratio)
        worksheet.write(row, col + 9, reproerror)
        number += 1
        row += 1

    workbook.close()
""""""""""""""""""""""""""""""""""""""""""""" Main """""""""""""""""""""""""""""""""""""""
if __name__ == "__main__":
    program_start = time.process_time()
    main()
    program_end = time.process_time()
    print (f'Program elapsed time: {Back.GREEN}%s s{Style.RESET_ALL}\n' % str(program_end-program_start))
Eventually I changed the way of warping the images to the approach used in Jahaniam's Real Time Video Mosaic. It places the reference image at the middle of a blank canvas of preset size, computes the subsequent homographies, and warps the adjacent images onto the reference image.
Example of stitched image
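To illustrate that idea, here is a minimal sketch (the canvas multipliers and function name are placeholders, not the exact values from my script): the reference frame is pasted into the centre of a large blank canvas, and the translation that puts it there becomes the initial homography that all subsequent homographies are chained onto before warping.

import numpy as np

def init_canvas(first_frame, height_times=3, width_times=3):
    """Place the first frame at the centre of a large blank canvas and
    return the canvas plus the translation homography to that position."""
    h, w = first_frame.shape[:2]
    canvas = np.zeros((h * height_times, w * width_times, 3), dtype=first_frame.dtype)

    # offsets that centre the reference frame on the canvas
    y_off = canvas.shape[0] // 2 - h // 2
    x_off = canvas.shape[1] // 2 - w // 2
    canvas[y_off:y_off + h, x_off:x_off + w] = first_frame

    # pure translation homography; each later frame is warped with
    # H = H_translation @ H_cumulative before being pasted onto the canvas
    H_translation = np.array([[1.0, 0.0, x_off],
                              [0.0, 1.0, y_off],
                              [0.0, 0.0, 1.0]])
    return canvas, H_translation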