I'm a dental student and currently trying to write a script for analyzing and extracting handwritten digits from dental records. I already have a rough version of the script finished but my recognition rate is pretty low. A big problem with analyzing the data is a grid that proves difficult to remove.
Scanned form that I want to analyse (white fields are for anonymity):
Empty form:
I've tried different solutions for this problem (Erosion/Dilation, HoughLineTransform and susbtraction of the Lines).
Using featurematching and substracting with an empty template currently give me the best results.
Results:
Eroding and dilating this image gives even better results
Results:
![][4]
But this needs a new calibration nearly every time i try it.
Do you know of a more elegant solution to my problem.
Could SURF matching give better results?
Thank you very much!
Here's my code so far:
GOOD_MATCH_PERCENT = 0.15
def match_img_to_template(input_img, template_img, MAX_FEATURES, GOOD_MATCH_PERCENT):
# blurring of the input image
template_img = cv2.GaussianBlur(template_img, (3, 3), cv2.BORDER_DEFAULT)
# equalizing the histogramm of the input image
img_preprocessed = cv2.equalizeHist(input_img)
# ORB Detector
orb = cv2.ORB_create(MAX_FEATURES)
kp1, des1 = orb.detectAndCompute(img_preprocessed, None)
kp2, des2 = orb.detectAndCompute(template_img, None)
# Brute Force Matching
matcher= cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
matches = matcher.match(des1, des2, None)
matches.sort(key=lambda x:x.distance, reverse=False)
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
# Remove not so good matches
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
# Extract location of good matches
points1 = np.zeros((len(matches), 2), dtype=np.float32)
points2 = np.zeros((len(matches), 2), dtype=np.float32)
for i, match in enumerate(matches):
points1[i, :] = kp1[match.queryIdx].pt
points2[i, :] = kp2[match.trainIdx].pt
# Find homography
h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
# Use homography
height, width = template_img.shape
input_warped = cv2.warpPerspective(input_img, h, (width, height))
ret1, input_warped_thresh = cv2.threshold(input_warped,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
diff = cv2.absdiff(template_img, input_warped_thresh)
ret, diff = cv2.threshold(diff, 20, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C + cv2.THRESH_BINARY)
diff = cv2.equalizeHist(diff)
# Create kernels
kernel1 = np.ones((3,3),np.uint8)
kernel2 = np.ones((6,6), np.uint8)
# erode dilate to remove the grid
diff_erode = cv2.erode(diff,kernel1)
diff_dilated = cv2.dilate(diff_erode,kernel2)
# invert diff_dilate
diff_dilated_inv = cv2.bitwise_not(diff_dilated)
return diff_dilated_inv
Related
I've did quite a search about image stitching on python and most are for panoramic images, warping and rotating the images to combine them into one.
What I'm trying to do is using computer images, so they are digital and can be template matched without a problem, it will always be 2D without need of warping.
Basically here I have pieces of a map that is zoomed in and I want to make a massive image of this small pictures, here we have all the images used: https://imgur.com/a/HZIeT3z
import os
import numpy as np
import cv2
def stitchImagesWithoutWarp(img1, img2):
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1,des2)
matches = sorted(matches, key = lambda x:x.distance)
good_matches = matches[:10]
src_pts = np.float32([kp1[m.queryIdx].pt for m in good_matches]).reshape(-1,1,2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good_matches]).reshape(-1,1,2)
start = (abs(int(dst_pts[0][0][0]-src_pts[0][0][0])), abs(int(dst_pts[0][0][1]-src_pts[0][0][1])))
h1, w1 = img1.shape[:2]
h2, w2 = img2.shape[:2]
vis = np.zeros((start[1]+h1,start[0]+w1,3), np.uint8)
vis[start[1]:start[1]+h1, start[0]:start[0]+w1, :3] = img1
vis[:h2, :w2, :3] = img2
return vis
imgList = []
for it in os.scandir("images"):
imgList.append(cv2.imread(it.path))
vis = stitchImagesWithoutWarp(imgList[0],imgList[1])
for index in range(2,len(imgList)):
cv2.imshow("result", vis)
cv2.waitKey()
vis = stitchImagesWithoutWarp(vis,imgList[index])
By running this code I can successfully stitch the first four images together, such as this:
But once I stitch the fifth image it seems to have wrong match and incorrectly, but I always get the best match by distance on NORM_HAMMING, this is the result:
The thing is, this is the first image, in this order, that the best match point (var start) is negative in the x axis, here is the matching points in the imgur order:
(7, 422)
(786, 54)
(394, 462)
(-350, 383)
I attempted switching the top image, doing specific code for negative match but I've believe I was deviating the performance.
Also noting from the docs the first image should be the query and the second supposed to be the target, but I couldn't get it to work by inverting the vis variable in function param.
The main issue here was when recognized points weren't on the screen (negative values), it needs offsets to adjust, I also incremented a little bit to the code and verified if the matches were legit, as if all the calculated displacement were in average the around the matched first pick in brute force.
with the average of 2MB for each image, without preprocessing the images/downscaling/compressing, after stitching 9 images together I got the average of 1050ms in my PC, as for other algorithms tested (that warped the image) took around 2-3seconds for stitching 2 of those images.
here is the final result:
import os
import numpy as np
import cv2
def averageTuple(tupleList):
avgX, avgY = 0,0
for tuple in tupleList:
avgX += tuple[0]
avgY += tuple[1]
return (int(avgX/len(tupleList)),int(avgY/len(tupleList)))
def tupleInRange(t1, t2, dif=3):
if t1[0] + dif > t2[0] and t1[0] - dif < t2[0]:
if t1[1] + dif > t2[1] and t1[1] - dif < t2[1]:
return True
return False
def rgbToRGBA(img):
b_channel, g_channel, r_channel = cv2.split(img)
alpha_channel = np.ones(b_channel.shape, dtype=b_channel.dtype) * 255
return cv2.merge((b_channel, g_channel, r_channel, alpha_channel))
def cropAlpha(img,extraRange=0.05):
y, x = img[:, :, 3].nonzero() # get the nonzero alpha coordinates
minx = int(np.min(x)*(1-extraRange))
miny = int(np.min(y)*(1-extraRange))
maxx = int(np.max(x)*(1+extraRange))
maxy = int(np.max(y)*(1+extraRange))
return img[miny:maxy, minx:maxx]
def stitchImagesWithoutWarp(img1, img2):
if len(cv2.split(img1)) != 4:
img1 = rgbToRGBA(img1)
if len(cv2.split(img2)) != 4:
img2 = rgbToRGBA(img2)
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1,des2)
matches = sorted(matches, key = lambda x:x.distance)
good_matches = matches[:10]
src_pts = np.float32([kp1[m.queryIdx].pt for m in good_matches]).reshape(-1,1,2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good_matches]).reshape(-1,1,2)
pointsList = []
for index in range(0,len(src_pts)):
curPoint = (int(dst_pts[index][0][0]-src_pts[index][0][0])), (int(dst_pts[index][0][1]-src_pts[index][0][1]))
pointsList.append(curPoint)
start = pointsList[0]
avgTuple = averageTuple(pointsList)
if not tupleInRange(start, avgTuple): return img1
h1, w1 = img1.shape[:2]
h2, w2 = img2.shape[:2]
ax = abs(start[0])
ay = abs(start[1])
vis = np.zeros((ay+h1,ax+w1,4), np.uint8)
ofst2 = (ax if start[0]<0 else 0, ay if start[1]<0 else 0)
ofst1 = (0 if start[0]<0 else ax, 0 if start[1]<0 else ay)
vis[ofst1[1]:ofst1[1]+h1, ofst1[0]:ofst1[0]+w1, :4] = img1
vis[ofst2[1]:ofst2[1]+h2, ofst2[0]:ofst2[0]+w2, :4] = img2
return cropAlpha(vis)
imgList = []
for it in os.scandir("images"):
imgList.append(cv2.imread(it.path))
vis = stitchImagesWithoutWarp(imgList[0],imgList[1])
for index in range(2,len(imgList)):
vis = stitchImagesWithoutWarp(vis,imgList[index])
cv2.imwrite("output.png", cropAlpha(vis,0))
here is the output image (compressed in JPEG for stackoverflow):
I was using the code given on this site to align a photograph of my certificate to a 'template' i created by converting my pdf certificate to png and deleting my name, candidate id and certification date. Sadly it key point matches the wrong areas (click here for image), specifically matching my name, canditate ID and the date to areas in the certificate when it shouldn't. Oddly enough it works fine when I key point match the photograph with itself (i.e.: without deleting my name, date etc.)
I have tried using different matchers including sift and the brute force matcher but still all result in the same problem. Does anyone know why this may be happening and is there anything I can try to overcome this?
Thanks :)
Here is the code that I am using:
# import the necessary packages
#import numpy as np
import imutils
import cv2
def align_images(image, template, maxFeatures=500, keepPercent=0.2,
debug=True):
# convert both the input image and template to grayscale
imageGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
templateGray = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
# use ORB to detect keypoints and extract (binary) local
# invariant features
orb = cv2.ORB_create(maxFeatures)
(kpsA, descsA) = orb.detectAndCompute(imageGray, None)
(kpsB, descsB) = orb.detectAndCompute(templateGray, None)
# match the features
method = cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING
matcher = cv2.DescriptorMatcher_create(method)
matches = matcher.match(descsA, descsB, None)
# sort the matches by their distance (the smaller the distance,
# the "more similar" the features are)
matches = sorted(matches, key=lambda x:x.distance)
# keep only the top matches
keep = int(len(matches) * keepPercent)
matches = matches[:keep]
# Extract location of good matches
points1 = np.zeros((len(matches), 2), dtype=np.float32)
points2 = np.zeros((len(matches), 2), dtype=np.float32)
matches = cv2.drawMatches(image, kpsA,template,kpsB,matches,None)
matches = imutils.resize(matches, width = 1000)
cv2.imshow("Matched Keypoints",matches)
cv2.waitKey(0)
for i, match in enumerate(matches):
points1[i, :] = kpsA[match.queryIdx].pt
points2[i, :] = kpsB[match.trainIdx].pt
# Find homography
H, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
# use the homography matrix to align the images
(h, w) = template.shape[:2]
aligned = cv2.warpPerspective(image, H, (w, h))
# return the aligned image
return aligned
image = cv2.imread(r"C:\Users\Soffie\Documents\GrayceAssignments\certificate_scanner\certif_image2.jpg")
template = cv2.imread(r"C:\Users\Soffie\Documents\GrayceAssignments\certificate_scanner\certificate_template.png")
aligned = align_images(image, template)
aligned = imutils.resize(aligned,width =700)
template = imutils.resize(template,width = 700)
stacked = np.hstack([aligned,template])
overlay = template.copy()
output = aligned.copy()
cv2.addWeighted(overlay,0.5,output,0.5,0,output)
cv2.imshow("Image Alignment Stacked",stacked)
cv2.imshow("Image Alignment Overlay", output)
cv2.waitKey(0)
I am trying to use feature matching to determine if two images are similar. I have tried SIFT, SURF, and ORB with similar results from each. I am first trying to display the matches between these two images. Image1 Image2 I have tried both brute force matching and knn matching but with both implementations, I get around 10 matches with little to no accuracy. Are the two images too different in terms of scale, transformation, and perspective to generate accurate matches? What parameters could I modify to improve performance? The matching often produces a lot of matches but only a few pass ratio test.
import cv2
import matplotlib.pyplot as plt
import numpy as np
model = cv2.imread("path")
modelg = cv2.cvtColor(model,cv2.COLOR_BGR2GRAY)
modelg = cv2.GaussianBlur(modelg,(7,7),0)
height, width = modelg.shape[:2]
print(str(height)+ " " + str(width))
frame = cv2.imread("Path")
plt.imshow(frame)
plt.show()
#frame = cv2.imread("Path")
frame = cv2.resize(frame,(width,height))
frame = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(frame,(7,7),0)
#blur = cv2.blur(frame,(7,7))
plt.imshow(blur,cmap='gray')
plt.show()
plt.imshow(modelg,cmap='gray')
plt.show()
# Initiate SIFT detector
surf = cv2.xfeatures2d.SURF_create()
# find the keypoints and descriptors with SIFT
orb = cv2.ORB_create()
# find the keypoints and descriptors with ORB
kp1, des1 = surf.detectAndCompute(modelg,None)
kp2, des2 = surf.detectAndCompute(blur,None)
#kp1, des1 = surf.detectAndCompute(,None)
#kp2, des2 = surf.detectAndCompute(blur,None)
# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1,des2,k=2)
print(len(matches))
# Apply ratio test
good = []
for m,n in matches:
if m.distance < 0.75*n.distance:
good.append([m])
print(len(good))
# cv.drawMatchesKnn expects list of lists as matches.
img3 = cv2.drawMatchesKnn(modelg,kp1,blur,kp2,good,None,flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
plt.imshow(img3),plt.show()
I am trying to implement a Python (3.7) OpenCV (3.4.3) ORB image alignment. I normally do most of my processing with ImageMagick. But I need to do some image alignment and am trying to use Python OpenCV ORB. My script is based upon one from Satya Mallick's Learn OpenCV tutorial at https://www.learnopencv.com/image-alignment-feature-based-using-opencv-c-python/.
However, I am trying to modify it to use a rigid alignment rather than a perspective homology and to filter the points using a mask to limit the difference in y values, since the images are nearly aligned already.
The mask approach was taken from a FLANN alignment code in the last example at https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_matcher/py_matcher.html.
My script works fine, if I remove the matchesMask, which should provide the point filtering. (I have two other working scripts. One is similar, but just filters the points and ignores the mask. The other is based upon the ECC algorithm.)
However, I would like to understand why my code below is not working.
Perhaps the structure of my mask is incorrect in current versions of Python Opencv?
The error that I get is:
Traceback (most recent call last):
File "warp_orb_rigid2_filter.py", line 92, in <module>
imReg, m = alignImages(im, imReference)
File "warp_orb_rigid2_filter.py", line 62, in alignImages
imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None, **draw_params)
SystemError: <built-in function drawMatches> returned NULL without setting an error
Here is my code. The first arrow shows where the mask is created. The second arrow shows the line I have to remove to get the script to work. But then it ignores my filtering of points.
#!/bin/python3.7
import cv2
import numpy as np
MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.15
def alignImages(im1, im2):
# Convert images to grayscale
im1Gray = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
im2Gray = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)
# Detect ORB features and compute descriptors.
orb = cv2.ORB_create(MAX_FEATURES)
keypoints1, descriptors1 = orb.detectAndCompute(im1Gray, None)
keypoints2, descriptors2 = orb.detectAndCompute(im2Gray, None)
# Match features.
matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
matches = matcher.match(descriptors1, descriptors2, None)
# Sort matches by score
matches.sort(key=lambda x: x.distance, reverse=False)
# Remove not so good matches
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
# Extract location of good matches and filter by diffy
points1 = np.zeros((len(matches), 2), dtype=np.float32)
points2 = np.zeros((len(matches), 2), dtype=np.float32)
for i, match in enumerate(matches):
points1[i, :] = keypoints1[match.queryIdx].pt
points2[i, :] = keypoints2[match.trainIdx].pt
# initialize empty arrays for newpoints1 and newpoints2 and mask
newpoints1 = np.empty(shape=[0, 2])
newpoints2 = np.empty(shape=[0, 2])
matches_Mask = [0] * len(matches)
# filter points by using mask
for i in range(len(matches)):
pt1 = points1[i]
pt2 = points2[i]
pt1x, pt1y = zip(*[pt1])
pt2x, pt2y = zip(*[pt2])
diffy = np.float32( np.float32(pt2y) - np.float32(pt1y) )
print(diffy)
if abs(diffy) < 10.0:
newpoints1 = np.append(newpoints1, [pt1], axis=0)
newpoints2 = np.append(newpoints2, [pt2], axis=0)
matches_Mask[i]=[1,0] #<--- mask created
print(matches_Mask)
draw_params = dict(matchColor = (255,0,),
singlePointColor = (255,255,0),
matchesMask = matches_Mask, #<---- remove mask here
flags = 0)
# Draw top matches
imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None, **draw_params)
cv2.imwrite("/Users/fred/desktop/lena_matches.png", imMatches)
# Find Affine Transformation
# true means full affine, false means rigid (SRT)
m = cv2.estimateRigidTransform(newpoints1,newpoints2,False)
# Use affine transform to warp im1 to match im2
height, width, channels = im2.shape
im1Reg = cv2.warpAffine(im1, m, (width, height))
return im1Reg, m
if __name__ == '__main__':
# Read reference image
refFilename = "/Users/fred/desktop/lena.png"
print("Reading reference image : ", refFilename)
imReference = cv2.imread(refFilename, cv2.IMREAD_COLOR)
# Read image to be aligned
imFilename = "/Users/fred/desktop/lena_r1.png"
print("Reading image to align : ", imFilename);
im = cv2.imread(imFilename, cv2.IMREAD_COLOR)
print("Aligning images ...")
# Registered image will be stored in imReg.
# The estimated transform will be stored in m.
imReg, m = alignImages(im, imReference)
# Write aligned image to disk.
outFilename = "/Users/fred/desktop/lena_r1_aligned.jpg"
print("Saving aligned image : ", outFilename);
cv2.imwrite(outFilename, imReg)
# Print estimated homography
print("Estimated Affine Transform : \n", m)
Here are my two images: lena and lena rotated by 1 degree. Note that these are not my actual images. These image have no diffy values > 10, but my actual images do.
I am trying to align and warp the rotated image to match the original lena image.
The way you are creating the mask is incorrect. It only needs to be a list with single numbers, with each number telling you whether you want to use that particular feature match.
Therefore, replace this line:
matches_Mask = [[0,0] for i in range(len(matches))]
With this:
matches_Mask = [0] * len(matches)
... so:
# matches_Mask = [[0,0] for i in range(len(matches))]
matches_Mask = [0] * len(matches)
This creates a list of 0s that is as long as the number of matches. Finally, you need to change writing to the mask with a single value:
if abs(diffy) < 10.0:
#matches_Mask[i]=[1,0] #<--- mask created
matches_Mask[i] = 1
I finally get this:
Estimated Affine Transform :
[[ 1.00001187 0.01598318 -5.05963793]
[-0.01598318 1.00001187 -0.86121051]]
Take note that the format of the mask is different depending on what matcher you use. In this case, you use brute force matching so the mask needs to be in the format that I just described.
If you used FLANN's knnMatch for example, then it will be a nested list of lists, with each element being a list that is k long. For example, if you had k=3 and five keypoints, it will be a list of five elements long, with each element being a three element list. Each element in the sub-list delineates what match you want to use for drawing.
So I'm trying to overlay a thermal image with an rgb image using SIFT to match features and homography so that I can overlay them later on. The code I have works with about 50% of the thermal/rgb sets I have but many sets, such as this one, give horrible results. I think the homography is fine but doesn't work because the matches are way off. I'll attach some code, any advice on how to tune this would be great as I've spent a long time already trying to get this working on my own. Thanks!
MIN_MATCH_COUNT = 10
sift = cv2.xfeatures2d.SIFT_create(sigma=1.6, contrastThreshold=0.04,edgeThreshold = 15)
kp1, des1 = sift.detectAndCompute(rgb, None)
kp2, des2 = sift.detectAndCompute(thermal, None)
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
good = []
for m, n in matches:
if m.distance < 0.8 * n.distance:
good.append(m)
good = sorted(good, key=lambda x: x.distance)
img3 = cv2.drawMatches(rgb, kp1, thermal, kp2, good, None, flags=2)
which gives the following
Then I do homography with RANSAC on the found matches
if len(good) > MIN_MATCH_COUNT:
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0, maxIters=1000)
matchesMask = mask.ravel().tolist()
h, w, c = rgb.shape
pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
dst = cv2.perspectiveTransform(pts, M)
thermal = cv2.polylines(thermal, [np.int32(dst)], True, 255, 3, cv2.LINE_AA) # draw lines around as a box
draw_params = dict(matchColor=(0, 255, 0), # draw matches in green color
singlePointColor=None,
matchesMask=matchesMask, # draw only inliers
flags=2)
img3 = cv2.drawMatches(rgb, kp1, thermal, kp2, good, None, **draw_params)
resulting in this
Like I said, I think that this is failing because the BFMatcher is not finding correct matches but I am not sure why. Again any and all help is very appreciated! I've tried using an orb detector, converting the rgb image to grayscale, and pre-sclaing images to similar sizes and still get bad results.
Here is an example of a working rgb-thermal pair to demonstrate what I am trying to do.
The problem with your image is that it's so simple compared to natural images (no color, no major differences in texture, etc.), you cannot reliably use SIFT and other techniques made with normal photos in mind. Most of your wrong matches are actually good matches, since the matches look locally similar to each other (after obtaining the descriptor).
My suggestion is to look at alternatives that match images using structural information, or add information to the image (e.g. using a height rainbow colormap since your images can be seen as bumpmaps; using the distance transform + colormap might work too, or using both mentioned + edge detection as the 3 channels for a very weird but heterogeneous color image) and see if SIFT behaves differently.