I am trying to implement KAZE and A-KAZE using Python and OpenCV for Feature Detection and Description on an aerial image.
What is the code?
Also, what descriptor should go along with it for Feature Matching?
KAZE, like earlier state-of-the-art methods such as SIFT and SURF, is a Local Feature Descriptor, and in some respects it performs better than SIFT in both detection and description. A-KAZE, on the other hand, is a Local Binary Descriptor and gives excellent results in terms of speed and performance compared with Local Feature Descriptors (SIFT, SURF, and KAZE) and with other Local Binary Descriptors (ORB and BRISK).
To answer your question: both of them can be used for Feature Matching. Note, however, that the A-KAZE descriptor does not work well on very small patches (e.g., 32x32 images): to avoid returning keypoints without descriptors, A-KAZE normally removes those keypoints.
Therefore, the choice between KAZE and A-KAZE depends on the context of your application. As a starting point, though, A-KAZE generally performs better than KAZE.
In this example, I will show you Feature Detection and Matching with A-KAZE through the FLANN algorithm using Python and OpenCV.
First, load the input image and the image that will be used for training.
In this example, we are using these images:
image1:
image2:
# Imports
import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np
# Open and convert the input and training-set image from BGR to GRAYSCALE
image1 = cv.imread(filename = 'image1.jpg',
flags = cv.IMREAD_GRAYSCALE)
image2 = cv.imread(filename = 'image2.jpg',
flags = cv.IMREAD_GRAYSCALE)
Note that when reading the images we pass flags = cv.IMREAD_GRAYSCALE, because by default OpenCV loads images in BGR color mode. The descriptors work on single-channel images, so the images are read directly as grayscale.
Now we will use the A-KAZE algorithm:
# Initiate A-KAZE descriptor
AKAZE = cv.AKAZE_create()
# Find the keypoints and compute the descriptors for input and training-set image
keypoints1, descriptors1 = AKAZE.detectAndCompute(image1, None)
keypoints2, descriptors2 = AKAZE.detectAndCompute(image2, None)
The features detected by the A-KAZE algorithm can be combined to find objects or patterns that are similar between different images.
Now we will use the FLANN algorithm:
# FLANN parameters
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE,
trees = 5)
search_params = dict(checks = 50)
# Convert to float32
descriptors1 = np.float32(descriptors1)
descriptors2 = np.float32(descriptors2)
# Create FLANN object
FLANN = cv.FlannBasedMatcher(indexParams = index_params,
searchParams = search_params)
# Matching descriptor vectors using FLANN Matcher
matches = FLANN.knnMatch(queryDescriptors = descriptors1,
trainDescriptors = descriptors2,
k = 2)
# Lowe's ratio test
ratio_thresh = 0.7
# "Good" matches
good_matches = []
# Filter matches
for m, n in matches:
    if m.distance < ratio_thresh * n.distance:
        good_matches.append(m)
# Draw only "good" matches
output = cv.drawMatches(img1 = image1,
keypoints1 = keypoints1,
img2 = image2,
keypoints2 = keypoints2,
matches1to2 = good_matches,
outImg = None,
flags = cv.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
plt.imshow(output)
plt.show()
And the output will be:
To perform the same example with the KAZE descriptor, just initialize this descriptor, changing:
AKAZE = cv.AKAZE_create()
To:
KAZE = cv.KAZE_create()
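One caveat about the ratio-test loop above: it unpacks every entry of matches into two neighbours, but knnMatch can return fewer than two neighbours for a query (for example, when the second image yields fewer than two descriptors), and then the loop raises a ValueError. A minimal defensive sketch follows. The Match tuple is a hypothetical stand-in for cv.DMatch so that the snippet runs without images, and ratio_test is an illustrative name, not an OpenCV function.

```python
from collections import namedtuple

# Hypothetical stand-in for cv.DMatch (only the fields used here).
Match = namedtuple("Match", ["queryIdx", "trainIdx", "distance"])

def ratio_test(knn_matches, ratio=0.7):
    """Keep the best match of each pair that passes Lowe's ratio test."""
    good = []
    for pair in knn_matches:
        if len(pair) < 2:      # fewer than two neighbours: test cannot be applied
            continue
        m, n = pair[0], pair[1]
        if m.distance < ratio * n.distance:
            good.append(m)
    return good

knn = [
    (Match(0, 5, 10.0), Match(0, 9, 40.0)),  # clearly better first neighbour: kept
    (Match(1, 2, 30.0), Match(1, 7, 32.0)),  # ambiguous: rejected
    (Match(2, 4, 12.0),),                    # only one neighbour: skipped
]
good = ratio_test(knn)
print([m.queryIdx for m in good])
```

The same function should work unchanged on the list returned by FLANN.knnMatch, since it only reads the distance attribute.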
To learn more about Detection, Description, and Feature Matching techniques, Local Feature Descriptors, Local Binary Descriptors, and algorithms for Feature Matching, I recommend the following repositories on GitHub:
https://github.com/whoisraibolt/Feature-Detection-and-Description
https://github.com/whoisraibolt/Feature-Detection-and-Matching
Related
I understand that this is a popular question on Stack Overflow; however, I have not managed to find the best solution yet.
Background
I am trying to classify an image. I currently have 10,000 unique images that a given image can match with. For each image in my database, I only have a single image for training. So I have a DB of 10,000 images and there are also 10,000 possible output classes, i.e. there are 10,000 unique objects and I have a single image of each.
The goal is to match an input image to the 'best' matching image in the DB.
I am currently using Python with OpenCV and SIFT to identify keypoints / descriptors, then applying the standard matching methods to see which image in the DB the input image best matches.
Code
I am using the following code to iterate over my database of images, find all the keypoints / descriptors, and save those descriptors to a file. This is to save time later on.
for i in tqdm(range(labels.shape[0])): #Use the length of the DB
    # Read img from DB
    img_path = 'data/'+labels['Image_Name'][i]
    img = cv2.imread(img_path)
    # Resize to ensure all images are equal for ROI
    dim = (734,1024)
    img = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
    #Grayscale
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    #Roi
    img = img[150:630, 20:700]
    # Sift
    sift = cv2.xfeatures2d.SIFT_create()
    keypoints_1, descriptors_1 = sift.detectAndCompute(img,None)
    # Save descriptors
    path = 'data/'+labels['Image_Name'][i].replace(".jpeg", "_descriptors.csv")
    savetxt(path, descriptors_1, delimiter=',')
Then when I am ready to classify an image, I can then read in all of the descriptors. This has proven to be 30% quicker.
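As an aside on the caching step: text CSV has to be parsed number by number on every load, whereas NumPy's binary .npy format stores the array with its dtype and shape. The sketch below shows the round trip with np.save and np.load. The random array is only a stand-in for real SIFT output, and the file name is hypothetical; whether this is noticeably faster for a given database is something to measure rather than assume.

```python
import os
import tempfile
import numpy as np

# Stand-in for SIFT output: 500 descriptors of length 128 (random, illustration only).
rng = np.random.default_rng(0)
descriptors_1 = rng.random((500, 128)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_descriptors.npy")  # hypothetical file name
    np.save(path, descriptors_1)   # binary format, keeps dtype and shape
    loaded = np.load(path)         # no text parsing needed

print(loaded.dtype, loaded.shape)
```

Because the dtype is preserved, the later np.float32 conversion before FLANN matching would no longer be needed for arrays stored this way.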
# Array to store all of the descriptors from SIFT
descriptors = []
for i in tqdm(range(labels.shape[0])): #Use the length of the DB
    # Read in the descriptor file
    path = 'data/'+labels['Image_Name'][i].replace(".jpeg", "_descriptors.csv")
    descriptor = loadtxt(path, delimiter=',')
    # Add to array
    descriptors.append(descriptor)
Finally, I just need to read in an image, apply the sift method and then find the best match.
# Calculate similarity
img = cv2.imread(PATH)
# Resize
dim = (734,1024)
img = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
#Grayscale
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
#Roi
img = img[150:630, 20:700]
# Sift
sift = cv2.xfeatures2d.SIFT_create()
keypoints_1, descriptors_1 = sift.detectAndCompute(img,None)
# Use Flann (Faster)
index_params = dict(algorithm=1, trees=5)  # 1 = FLANN_INDEX_KDTREE; 0 would select a linear (brute-force) index
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)
# Store results
scoresdf = pd.DataFrame(columns=["index","score"])
#Find best matches in DB
for i in tqdm(range(labels.shape[0])):
    # load in data
    path = 'data/'+labels['Image_Name'][i].replace(".jpeg", "_descriptors.csv")
    # Get descriptors for both images to compare
    descriptors_2 = descriptors[i]
    descriptors_2 = np.float32(descriptors_2)
    # Find matches
    matches = flann.knnMatch(descriptors_1, descriptors_2, k=2)
    # select the lowest amount of keypoints
    number_keypoints = 0
    if len(descriptors_1) <= len(descriptors_2):
        number_keypoints = len(descriptors_1)
    else:
        number_keypoints = len(descriptors_2)
    # Find 'good' matches LOWE
    good_points = []
    ratio = 0.6
    for m, n in matches:
        if m.distance < ratio*n.distance:
            good_points.append(m)
    # Get similarity score
    score = len(good_points) / number_keypoints * 100
    scoresdf.loc[len(scoresdf)] = [i, score]
This all works but it does take some time and I would like to find a match much quicker.
Solutions?
I have read about the bag of words (BOW) method. However, I do not know if this will work given there are 10,000 classes. Would I need to set K=10000?
Given that each descriptor is an array, is there a way to reduce my search space? Can I find the X closest arrays (descriptors) to the descriptor of my input image?
Any help would be greatly appreciated :)
Edit
Could a Bag of Words (BOW) method be used to create X clusters, so that when I read in a new image I first find out which cluster it belongs to and then run SIFT matching only on the images in that cluster to find the exact match? I am struggling to find many code examples for this.
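One way the "find the closest descriptors" idea could be sketched, without a separate matcher call per database image, is to pool all database descriptors into one array, remember which image each row came from, and let every query descriptor vote for the image that owns its nearest neighbour. The snippet below uses brute-force NumPy distances on small synthetic data purely for clarity; on a real database an approximate index (for instance a single FLANN index built over the pooled array) would take the place of the distance matrix. All names and data here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "database": 5 images, 40 descriptors each, 128 dimensions (SIFT-sized).
db = [rng.random((40, 128)).astype(np.float32) for _ in range(5)]

# Pool every descriptor into one matrix and record the owning image of each row.
pooled = np.vstack(db)
owner = np.concatenate([np.full(len(d), i) for i, d in enumerate(db)])

# Synthetic query: a noisy copy of 30 descriptors taken from image 3.
query = db[3][:30] + rng.normal(0, 0.01, (30, 128)).astype(np.float32)

# Squared Euclidean distance from each query row to each pooled row.
d2 = ((query ** 2).sum(1)[:, None]
      - 2.0 * query @ pooled.T
      + (pooled ** 2).sum(1)[None, :])

nearest = d2.argmin(axis=1)                       # closest pooled descriptor per query row
votes = np.bincount(owner[nearest], minlength=len(db))
best_image = int(votes.argmax())
print(votes, best_image)
```

The top few images by vote count could then be re-checked with the full ratio-test matching, so the expensive step runs on a handful of candidates instead of all 10,000.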
I'm working on a project to automatically rotate microscope image stacks of a fluid experiment so that they are lined up with images of the CAD template for the microfluidic chip. I am using the OpenCV package in Python for image processing. Having the correct rotational orientation is necessary so that the images can be masked properly for analysis. Our chips have markers filled with fluorescent dye that are visible in every frame. The template and a sample image look like the following (the template can be scaled to arbitrary size, but the relevant region of the images is typically ~100x100 pixels or so):
I have not been able to rotationally align the image to the CAD template. Typically, the misalignment between the CAD template and the images is less than a few degrees, which is still sufficient to interfere with analysis, so I need to be able to measure the rotational difference even if it is relatively small.
Following examples online I am using the following procedure:
Scale up the image to approximately the same size as the template using cubic interpolation (~800 x 800)
Threshold both images using Otsu's method
Find keypoints and extract descriptors using a built-in method (I've tried ORB, AKAZE, and BRIEF).
Match descriptors using a brute-force matcher with Hamming distance.
Take the best matches and use them to compute a partial affine transformation matrix
Use that matrix to infer a rotational shift, warping the one image to the other as a check.
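As a side note on the last step: a partial affine matrix of the kind returned by cv2.estimateAffinePartial2D has the form [[s*cos(a), -s*sin(a), tx], [s*sin(a), s*cos(a), ty]], so the rotation and scale can be read from its first column. The sketch below is independent of the matcher and uses a synthetic matrix; rotation_and_scale is an illustrative helper name, not an OpenCV function.

```python
import math

def rotation_and_scale(M):
    """Return (angle in degrees, scale) from a 2x3 similarity matrix
    [[s*cos(a), -s*sin(a), tx], [s*sin(a), s*cos(a), ty]]."""
    a, c = M[0][0], M[1][0]
    angle = math.degrees(math.atan2(c, a))
    scale = math.hypot(a, c)
    return angle, scale

# Synthetic matrix: 2.5 degree rotation, scale 1.1, arbitrary translation.
theta = math.radians(2.5)
s = 1.1
M = [[s * math.cos(theta), -s * math.sin(theta), 4.0],
     [s * math.sin(theta),  s * math.cos(theta), -7.0]]
angle, scale = rotation_and_scale(M)
print(round(angle, 3), round(scale, 3))
```

The function indexes M with M[0][0] and M[1][0], so it accepts a NumPy array as well as nested lists. Keep in mind that image coordinates have the y axis pointing down, so the visual sense of a positive angle is flipped compared with the usual mathematical convention.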
Here's a sample of my code (borrowed in part from here):
import numpy as np
import cv2
import matplotlib.pyplot as plt
MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.5
def alignImages(im1, im2,returnpoints=False):
    # Detect ORB features and compute descriptors.
    size1 = int(0.1*(np.mean(np.shape(im1))))
    size2 = int(0.1*(np.mean(np.shape(im2))))
    orb1 = cv2.ORB_create(MAX_FEATURES,edgeThreshold=size1,patchSize=size1)
    orb2 = cv2.ORB_create(MAX_FEATURES,edgeThreshold=size2,patchSize=size2)
    keypoints1, descriptors1 = orb1.detectAndCompute(im1, None)
    keypoints2, descriptors2 = orb2.detectAndCompute(im2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING,crossCheck=True)
    matches = matcher.match(descriptors1,descriptors2)
    # Sort matches by score
    matches.sort(key=lambda x: x.distance, reverse=False)
    # Remove not so good matches
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]
    # Draw top matches
    imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None)
    cv2.imwrite("matches.jpg", imMatches)
    # Extract location of good matches
    points1 = np.zeros((len(matches), 2), dtype=np.float32)
    points2 = np.zeros((len(matches), 2), dtype=np.float32)
    for i, match in enumerate(matches):
        points1[i, :] = keypoints1[match.queryIdx].pt
        points2[i, :] = keypoints2[match.trainIdx].pt
    # Estimate a partial affine transform (rotation, uniform scale, translation)
    M, inliers = cv2.estimateAffinePartial2D(points1,points2)
    height, width = im2.shape
    im1Reg = cv2.warpAffine(im1,M,(width,height))
    return im1Reg, M
if __name__ == "__main__":
    # cv2.imread returns BGR, so convert with COLOR_BGR2GRAY
    test_template = cv2.cvtColor(cv2.imread("test_CAD_cropped.png"),cv2.COLOR_BGR2GRAY)
    test_image = cv2.cvtColor(cv2.imread("test_CAD_cropped.png"),cv2.COLOR_BGR2GRAY)
    fx = fy = 88/923
    test_image_big = cv2.resize(test_image,(0,0),fx=1/fx,fy=1/fy,interpolation=cv2.INTER_CUBIC)
    ret, imRef_t = cv2.threshold(test_template,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
    ret, test_big_t = cv2.threshold(test_image_big,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
    imReg, M = alignImages(test_big_t,imRef_t)
    fig, ax = plt.subplots(nrows=2,ncols=2,figsize=(8,8))
    ax[1,0].imshow(imReg)
    ax[1,0].set_title("Warped Image")
    ax[0,0].imshow(imRef_t)
    ax[0,0].set_title("Template")
    ax[0,1].imshow(test_big_t)
    ax[0,1].set_title("Thresholded Image")
    ax[1,1].imshow(imRef_t - imReg)
    ax[1,1].set_title("Diff")
    plt.show()
In this example, I get the following bad transformation because there are only 3 matching keypoints and they are all incorrect:
I find that regardless of my keypoint/descriptor parameters I tend to get too few "good" features. Is there anything I can do to pre-process my images better to get good features more reliably, or is there a better method to align my images to this template that doesn't involve keypoint matching? The specific application of this experiment means that I can't use the patented keypoint extractor/descriptors like SURF and SIFT.
A good method to align two images based on rotation, translation and scaling only is the Fourier Mellin transform.
Here is an example using the implementation in DIPlib (disclosure: I'm an author):
import diplib as dip
# load data
image = dip.ImageRead('image.png')
template = dip.ImageRead('template.png')
template = template.TensorElement(0) # this one is RGB, take any one channel
# pad the two images with zeros so they have equal sizes
sz = [max(image.Size(0), template.Size(0)), max(image.Size(1), template.Size(1))]
image = image.Pad(sz)
template = template.Pad(sz)
# match
res = dip.FourierMellinMatch2D(template, image)
# display
dip.JoinChannels((template,res,res)).Show()
However, there are many other approaches. A key thing here is that both the template and the image are quite simple, and very similar. This makes registration very easy.
For example, assuming you have the proper scaling of the template (this should not be a problem I presume), all you need to do is find the rotation and the translation. You can brute-force the rotations, simply rotating the image over a set of small angles, and matching each of the results with the template (cross-correlation). The one with the best match (largest cross-correlation value) has the appropriate rotation. If you need to have a very precise rotation estimation, you can do a second set of angles close to the best choice in the first set.
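The brute-force search described above can be sketched in plain NumPy. This is only an illustration under simplifying assumptions: the template is a synthetic L-shaped marker, rotation uses a crude nearest-neighbour resampler (rotate_nn, a hypothetical helper written here), and the cross-correlation is the circular one computed through the FFT. A real pipeline would use a proper interpolating rotation and would likely pad the images.

```python
import numpy as np

def rotate_nn(img, angle_deg):
    """Rotate a 2-D array about its centre (nearest neighbour, zero fill)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every output pixel, look up its source pixel.
    src_x = np.cos(t) * (xs - cx) + np.sin(t) * (ys - cy) + cx
    src_y = -np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy
    xi, yi = np.rint(src_x).astype(int), np.rint(src_y).astype(int)
    ok = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(img)
    out[ok] = img[yi[ok], xi[ok]]
    return out

def xcorr_peak(a, b):
    """Largest value of the circular cross-correlation of a and b (via FFT)."""
    spec = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    return np.fft.ifft2(spec).real.max()

# Synthetic template: an L-shaped marker made of two thin bars.
template = np.zeros((128, 128), dtype=float)
template[30:33, 20:110] = 1.0
template[30:110, 20:23] = 1.0

image = rotate_nn(template, 3.0)       # "measured" image, rotated by 3 degrees

angles = np.arange(-5.0, 5.5, 1.0)     # coarse search grid, in degrees
scores = [xcorr_peak(rotate_nn(image, a), template) for a in angles]
best_angle = float(angles[int(np.argmax(scores))])
print(best_angle)
```

With this setup the best score should land at or next to -3 degrees, the angle that undoes the synthetic rotation. A second, finer grid around best_angle would implement the refinement step mentioned above.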
Cross-correlation is cheap and easy to compute, and leads to high-precision translation estimates (the Fourier Mellin method makes extensive use of it). Rather than just taking the pixel with the largest value in the cross-correlation output, you can fit a parabola to the few pixels around it and use the location of the maximum of the fitted parabola. This leads to sub-pixel estimates of translation.
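The parabola fit can be written in closed form for three equally spaced samples. The sketch below shows the one-dimensional case on synthetic values; for a 2-D cross-correlation output the same formula would be applied along each axis through the peak pixel. The function name is illustrative.

```python
def parabolic_peak_offset(left, centre, right):
    """Sub-sample offset of the maximum of a parabola through three
    equally spaced samples at positions -1, 0 and +1."""
    denom = left - 2.0 * centre + right
    if denom == 0.0:          # flat neighbourhood: no refinement possible
        return 0.0
    return 0.5 * (left - right) / denom

# Samples of y = -(x - 0.3)**2 at x = -1, 0, 1; the true peak is at x = 0.3.
samples = [-(x - 0.3) ** 2 for x in (-1, 0, 1)]
offset = parabolic_peak_offset(*samples)
print(offset)
```

The returned offset is added to the integer position of the largest sample to get the sub-pixel peak location.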
I've been trying to match a scanned form with its empty template. The goal is to rotate and scale it to match the template.
Source (left), template (right)
Match (left), Homography warp (right)
The template does not contain any very specific logo, fixation cross or rectangular frame that would conveniently help me with feature or pattern matching. Even worse, the scanned form can be skewed or altered, and it contains handwritten signatures and stamps.
My approach, after unsuccessfully testing ORB feature matching, was to concentrate on the shape of the form (lines and columns).
The pictures I provide here are obtained by reconstituting lines after a segment detection (LSD) with a certain minimum size. Most of what remains for source and template is the document layout itself.
In the following script (that should work out of the box along with pictures), I attempt to do ORB feature matching, but fail to make it work because it is concentrating on edges and not on the document layout.
import cv2 # using opencv-python v3.4
import numpy as np
from imutils import resize
# aligning image using ORB descriptors, then homography warp
def align_images(im1, im2,MAX_MATCHES=5000,GOOD_MATCH_PERCENT = 0.15):
    # Detect ORB features and compute descriptors.
    orb = cv2.ORB_create(MAX_MATCHES)
    keypoints1, descriptors1 = orb.detectAndCompute(im1, None)
    keypoints2, descriptors2 = orb.detectAndCompute(im2, None)
    # Match features.
    matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
    matches = matcher.match(descriptors1, descriptors2, None)
    # Sort matches by score
    matches.sort(key=lambda x: x.distance, reverse=False)
    # Remove not so good matches
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]
    # Draw top matches
    imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None)
    # Extract location of good matches
    points1 = np.zeros((len(matches), 2), dtype=np.float32)
    points2 = np.zeros((len(matches), 2), dtype=np.float32)
    for i, match in enumerate(matches):
        points1[i, :] = keypoints1[match.queryIdx].pt
        points2[i, :] = keypoints2[match.trainIdx].pt
    # Find homography
    h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
    # Use homography
    if len(im2.shape) == 2:
        height, width = im2.shape
    else:
        height, width, channels = im2.shape
    im1Reg = cv2.warpPerspective(im1, h, (width, height))
    return im1Reg, h, imMatches
template_fn = './stack/template.jpg'
image_fn = './stack/image.jpg'
im = cv2.imread(image_fn, cv2.IMREAD_GRAYSCALE)
template = cv2.imread(template_fn, cv2.IMREAD_GRAYSCALE)
# align images
imReg, h, matches = align_images(template,im)
# display output
cv2.imshow('im',im)
cv2.imshow('template',template)
cv2.imshow('matches',matches)
cv2.imshow('result',imReg)
cv2.waitKey(0)
cv2.destroyAllWindows()
Is there any way to make the pattern matching algorithm work on the image on the left (source)? (another idea was to leave only lines intersections)
Alternatively, I have been trying to do scale- and rotation-invariant pattern matching with for loops, keeping the maximum correlation, but it is far too resource-consuming and not very reliable.
I'm therefore looking for hints in the right direction using opencv.
SOLUTION
The issue was about reducing the image to what really matters: the layout.
Also, ORB was not appropriate since it is not as robust (rotation and size invariant) as SIFT and AKAZE are.
I proceeded as follows:
convert the images to black and white
use line segment detection and filter lines shorter than 1/60th of the width
reconstruct the image from segments (line width does not have a big impact)
(optional: resize the pictures to speed up the rest)
apply a Gaussian blur to the line reconstruction, sized at 1/25th of the width
detect and match features using SIFT (patented) or AKAZE (free) algorithm
find a homography and warp the source picture to match the template
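The line-filtering step in the list above can be sketched without committing to a particular detector. The snippet assumes the segments are available as rows of (x1, y1, x2, y2), which is an assumption about the detector's output format; the 1/60 fraction is the one quoted above, and keep_long_segments is an illustrative name.

```python
import numpy as np

def keep_long_segments(segments, image_width, fraction=1.0 / 60.0):
    """Drop segments shorter than `fraction` of the image width.
    `segments` is assumed to hold x1, y1, x2, y2 per segment."""
    segments = np.asarray(segments, dtype=float).reshape(-1, 4)
    lengths = np.hypot(segments[:, 2] - segments[:, 0],
                       segments[:, 3] - segments[:, 1])
    return segments[lengths >= fraction * image_width]

segments = [[0, 0, 100, 0],     # length 100: kept
            [10, 10, 13, 14],   # length 5: dropped for a 600 px wide image
            [5, 50, 5, 70]]     # length 20: kept
kept = keep_long_segments(segments, image_width=600)
print(len(kept))
```

The reshape(-1, 4) call also accepts arrays of shape (N, 1, 4), which is the layout some detectors return. The kept segments would then be drawn onto a blank image to rebuild the layout before blurring and feature detection.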
Matches for AKAZE
Matches for SIFT
I noted:
the layout of the template has to match, otherwise it will only stick to what it recognizes
line detection works better at higher resolution; downsizing afterwards is possible since a Gaussian blur is applied anyway
SIFT produces more features and seems more reliable than AKAZE
I found an example in C++:
http://docs.opencv.org/3.0-beta/doc/tutorials/features2d/akaze_matching/akaze_matching.html
But there isn't any example in Python showing how to use this feature detector. I also couldn't find anything more about AKAZE in the documentation: there is ORB, SIFT, SURF, etc., but not what I'm looking for.
http://docs.opencv.org/3.1.0/db/d27/tutorial_py_table_of_contents_feature2d.html#gsc.tab=0
Could someone share, or show me where I can find, information on how to match images in Python with AKAZE?
I am not sure where to find it. The way I made it work was through this function, which uses the Brute Force matcher:
def kaze_match(im1_path, im2_path):
    # load the image and convert it to grayscale
    im1 = cv2.imread(im1_path)
    im2 = cv2.imread(im2_path)
    gray1 = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
    gray2 = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)
    # initialize the AKAZE descriptor, then detect keypoints and extract
    # local invariant descriptors from the image
    detector = cv2.AKAZE_create()
    (kps1, descs1) = detector.detectAndCompute(gray1, None)
    (kps2, descs2) = detector.detectAndCompute(gray2, None)
    print("keypoints: {}, descriptors: {}".format(len(kps1), descs1.shape))
    print("keypoints: {}, descriptors: {}".format(len(kps2), descs2.shape))
    # Match the features
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = bf.knnMatch(descs1,descs2, k=2)
    # Apply ratio test
    good = []
    for m,n in matches:
        if m.distance < 0.9*n.distance:
            good.append([m])
    # cv2.drawMatchesKnn expects list of lists as matches.
    im3 = cv2.drawMatchesKnn(im1, kps1, im2, kps2, good[1:20], None, flags=2)
    cv2.imshow("AKAZE matching", im3)
    cv2.waitKey(0)
Remember that the feature vectors are binary vectors. Therefore, the similarity is based on the Hamming distance, rather than the commonly used L2 norm or Euclidean distance if you will.
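To make the Hamming distance concrete: a binary descriptor is stored as a row of packed uint8 bytes, and the distance between two descriptors is the number of bit positions in which they differ. A minimal NumPy sketch with two made-up two-byte descriptors (real AKAZE descriptors are longer):

```python
import numpy as np

def hamming_distance(d1, d2):
    """Number of differing bits between two packed uint8 descriptors."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

a = np.array([0b11110000, 0b00000000], dtype=np.uint8)
b = np.array([0b00001111, 0b00000001], dtype=np.uint8)
print(hamming_distance(a, b))   # 8 bits differ in the first byte, 1 in the second
```

This is the quantity that cv2.BFMatcher(cv2.NORM_HAMMING) compares when it ranks candidate matches.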
I searched for the same tutorial and found out the tutorial is given in 3 alternate languages C++, Python & Java. There are 3 hyperlinks for them before the start of code area.
Try this [ https://docs.opencv.org/3.4/db/d70/tutorial_akaze_matching.html ]
I have to stitch two or more images together using python and openCV.
I found this code for finding keypoints and matches, but I don't know how to continue.
Help me please!
import numpy as np
import cv2
MIN_MATCH_COUNT = 10
img1 = cv2.imread('a.jpg',0) # queryImage
img2 = cv2.imread('b.jpg',0) # trainImage
# Initiate SIFT detector
sift = cv2.SIFT()
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)
FLANN_INDEX_KDTREE = 1  # the KD-tree index is 1 in OpenCV's FLANN; 0 selects a linear search
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks = 50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1,des2,k=2)
# store all the good matches as per Lowe's ratio test.
good = []
for m,n in matches:
    if m.distance < 0.7*n.distance:
        good.append(m)
Your question is not very clear, but I assume what you mean is that you have a bunch of images and you want to have opencv find the corresponding landmarks and then warp/scale each picture so that they can form one big image.
A method without using the stitcher class, basically looping over pictures and determining the best fitting one each iteration, is documented in this github code
One approach to image stitching consists of the following steps.
Firstly, as you've already figured out, you need a feature point detector and some way to find correspondences between feature points in both images. It's typically a good idea to eliminate a lot of correspondences because they will likely contain a lot of noise. A super simple way to eliminate a lot of noise is to look for symmetry in the matches.
This is roughly what your code does up to this point.
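The symmetry check mentioned above keeps a pair only when each descriptor is the other's nearest neighbour. The sketch below spells that out with NumPy on tiny made-up 2-D "descriptors"; mutual_matches is an illustrative name. In OpenCV, cv2.BFMatcher with crossCheck=True applies the same idea.

```python
import numpy as np

def mutual_matches(des1, des2):
    """Return (i, j) pairs where des2[j] is the nearest neighbour of des1[i]
    and des1[i] is also the nearest neighbour of des2[j]."""
    d = np.linalg.norm(des1[:, None, :] - des2[None, :, :], axis=2)
    fwd = d.argmin(axis=1)      # best match in des2 for every row of des1
    bwd = d.argmin(axis=0)      # best match in des1 for every row of des2
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]

des1 = np.array([[0.0, 0.0], [10.0, 0.0], [4.9, 0.0]])
des2 = np.array([[0.1, 0.0], [10.2, 0.0], [5.0, 5.0]])
pairs = mutual_matches(des1, des2)
print(pairs)
```

Here the third row of des1 picks des2[0] as its nearest neighbour, but des2[0] prefers des1[0], so that one-sided match is discarded.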
Next, to stitch images together, you need to warp one of the images to match the perspective of the other image. This is done by estimating the homography using the correspondences. Because your correspondences will still likely contain a lot of noise, we typically use RANSAC to robustly estimate the homography.
A quick google search provides many examples of this being implemented.