I am trying to align an RGB image with an IR image (single channel).
The goal is to create a 4 channel image R,G,B,IR.
In order to do this, I am using cv2.findTransformECC as described in this very neat guide. The code is unchanged for now, except for line 13, where the motion model is set to Euclidean because I want to handle rotations in the future. I am using Python.
In order to verify the workings of the software, I used the images from the guide. It worked well so I wanted to correlate satellite images from multiple spectra as described above. Unfortunately, I ran into problems here.
Sometimes the algorithm converges (after ages), sometimes it immediately fails because it can't converge, and other times it "finds" a solution that is clearly wrong. Attached you find two images that, from a human perspective, are easy to match, but the algorithm fails on. The images are not rotated in any way; they are just not the exact same image (check the borders), so a translational motion is expected. The images show Lake Neusiedlersee in Austria; the source is Sentinelhub.
Edit: With "sometimes" I refer to using different images from Sentinel. A given pair of images consistently produces the same outcome.
I know that ECC is not feature-based which might pose a problem here.
I have also read that it is somewhat dependent on the initial warp matrix.
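For reference, the core ECC call in that kind of setup looks roughly like the sketch below; the initial warp matrix is the third argument, so a rough translation estimate could be passed in instead of the identity. File names and the termination criteria are placeholders, not the actual values from my setup.

import cv2
import numpy as np

ir = cv2.imread('ir.png', cv2.IMREAD_GRAYSCALE)                   # placeholder file names
gray = cv2.cvtColor(cv2.imread('rgb.png'), cv2.COLOR_BGR2GRAY)

warp = np.eye(2, 3, dtype=np.float32)                             # identity = no prior knowledge
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 5000, 1e-6)
cc, warp = cv2.findTransformECC(gray, ir, warp, cv2.MOTION_EUCLIDEAN, criteria)

# map the IR channel into the RGB frame
aligned_ir = cv2.warpAffine(ir, warp, (gray.shape[1], gray.shape[0]),
                            flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)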
My questions are:
Am I using cv2.findTransformECC wrong?
Is there a better way to do this?
Should I try to "Monte-Carlo" the initial matrices until it converges? (This feels wrong)
Do you suggest using a feature-based algorithm?
If so, is there one available or would I have to implement this myself?
Thanks for the help!
Do you suggest using a feature-based algorithm?
Sure.
There are many feature detection algorithms.
I generally choose SIFT because it provides good matching results and its runtime is reasonably fast.
import cv2 as cv
import numpy as np

# read the images
ir = cv.imread('ir.jpg', cv.IMREAD_GRAYSCALE)
rgb = cv.imread('rgb.jpg', cv.IMREAD_COLOR)

descriptor = cv.SIFT.create()
matcher = cv.FlannBasedMatcher()

# get features from images
kps_ir, desc_ir = descriptor.detectAndCompute(ir, mask=None)
gray = cv.cvtColor(rgb, cv.COLOR_BGR2GRAY)
kps_color, desc_color = descriptor.detectAndCompute(gray, mask=None)

# find the corresponding point pairs
if (desc_ir is not None and desc_color is not None and len(desc_ir) >= 2 and len(desc_color) >= 2):
    rawMatch = matcher.knnMatch(desc_color, desc_ir, k=2)
matches = []
# ensure the distance is within a certain ratio of each other (i.e. Lowe's ratio test)
ratio = 0.75
for m in rawMatch:
    if len(m) == 2 and m[0].distance < m[1].distance * ratio:
        matches.append((m[0].trainIdx, m[0].queryIdx))

# convert keypoints to points
pts_ir, pts_color = [], []
for id_ir, id_color in matches:
    pts_ir.append(kps_ir[id_ir].pt)
    pts_color.append(kps_color[id_color].pt)
pts_ir = np.array(pts_ir, dtype=np.float32)
pts_color = np.array(pts_color, dtype=np.float32)

# compute homography
if len(matches) > 4:
    H, status = cv.findHomography(pts_ir, pts_color, cv.RANSAC)
    warped = cv.warpPerspective(ir, H, (rgb.shape[1], rgb.shape[0]))
    warped = cv.cvtColor(warped, cv.COLOR_GRAY2BGR)

# visualize the result
winname = 'result'
cv.namedWindow(winname, cv.WINDOW_KEEPRATIO)
alpha = 5
# res = cv.addWeighted(rgb, 0.5, warped, 0.5, 0)
res = None

def onChange(alpha):
    global rgb, warped, res, winname
    res = cv.addWeighted(rgb, alpha/10, warped, 1 - alpha/10, 0)
    cv.imshow(winname, res)

onChange(alpha)
cv.createTrackbar('alpha', winname, alpha, 10, onChange)
cv.imshow(winname, res)
cv.waitKey()
cv.destroyWindow(winname)
Result (alpha=8)
Edit: It seems like SIFT is not the best option as it fails for some other examples. Example images are in another question.
In this case, I suggest using SURF.
It is a patented algorithm, so it does not ship with the latest OpenCV pip packages.
You can install a previous version of OpenCV or build it from source with the non-free modules enabled.
descriptor = cv.xfeatures2d.SURF_create()
Result (alpha=8)
Edit2: It is now clear that the key to this task is choosing the correct feature descriptor. As a final note, I also suggest choosing the appropriate motion model: an affine transform fits better than a homography in this case.
H, _ = cv.estimateAffine2D(pts_ir, pts_color)
H = np.vstack((H, [0, 0, 1]))
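For completeness, a minimal usage sketch (reusing the ir, rgb and H variables from the code above): the stacked 3x3 matrix can be used with warpPerspective, or the 2x3 affine part can be passed directly to warpAffine.

# either warp with the stacked 3x3 matrix...
warped = cv.warpPerspective(ir, H, (rgb.shape[1], rgb.shape[0]))
# ...or use the 2x3 affine part directly
warped = cv.warpAffine(ir, H[:2], (rgb.shape[1], rgb.shape[0]))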
Affine transform result:
Related
My task is to detect an object in a given image using OpenCV (I do not care whether it is the Python or C++ implementation). The object, shown below in three examples, is a black rectangle with five white rectangles within it. All dimensions are known.
However, the rotation, scale, distance, perspective, lighting conditions, camera focus/lens, and background of the image are not known. The edge of the black rectangle is not guaranteed to be fully visible; however, there will never be anything in front of the five white rectangles - they will always be fully visible. The end goal is to be able to detect the presence of this object within an image, and to rotate, scale, and crop to show the object with the perspective removed. I am fairly confident that I can adjust the image to crop to just the object, given its four corners. However, I am not so confident that I can reliably find those four corners. In ambiguous cases, not finding the object is preferred to misidentifying some other feature of the image as the object.
Using OpenCV I have come up with the following methods, however I feel I might be missing something obvious. Are there any more methods available, or is one of these the optimal solution?
Edge based outline
My first idea was to look for the outside edge of the object.
Using Canny edge detection (after scaling to a known size, converting to grayscale, and Gaussian blurring), I find the contour which best matches the outer shape of the object.
This deals with perspective, colour, and size issues, but fails when there is a complicated background, for example, or if there is something of similar shape to the object elsewhere in the image. Maybe this could be improved by a better set of rules for finding the correct contour - perhaps involving the five white rectangles as well as the outer edge.
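For illustration, a minimal sketch of this edge-based idea; the Canny thresholds, the approximation epsilon and the area cutoff are placeholder values, not tuned for these images.

import cv2

img = cv2.imread('orig1.png')                     # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)

contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
candidates = []
for c in contours:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    # keep large four-sided contours as potential outer rectangles
    if len(approx) == 4 and cv2.contourArea(approx) > 1000:
        candidates.append(approx)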
Feature detection
The next idea was to match against a known template using feature detection.
Using ORB feature detection, descriptor matching and homography (from this tutorial) fails, I believe because the features it detects are very similar to other features within the object (lots of corners which are precisely one-quarter white and three-quarters black). However, I do like the idea of matching to a known template - this idea makes sense to me. I suppose, though, that because the object is quite basic geometrically, it's likely to find a lot of false positives in the feature matching step.
Parallel Lines
Using HoughLines or HoughLinesP, looking for evenly spaced parallel lines. I have just started down this road, so I need to investigate the best methods for thresholding etc. While it looks messy for images with complex backgrounds, I think it may work well, as I can rely on the fact that the white rectangles within the black object should always be high contrast, giving a good indication of where the lines are.
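A minimal sketch of this direction; all parameters are placeholders that would need tuning.

import cv2
import numpy as np

gray = cv2.imread('orig1.png', cv2.IMREAD_GRAYSCALE)   # placeholder file name
edges = cv2.Canny(gray, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=50, maxLineGap=10)
if lines is not None:
    # group the detected segments by angle, then look for sets of roughly
    # evenly spaced parallel lines among the groups
    angles = [np.arctan2(y2 - y1, x2 - x1) for x1, y1, x2, y2 in lines[:, 0]]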
'Barcode Scan'
My final idea is to scan the image line by line, looking for the white-to-black pattern.
I have not started this method, but the idea is to take a strip of the image (at some angle), convert it to HSV colour space, and look for the regular black-to-white pattern appearing five times sequentially in the Value channel. This idea sounds promising to me, as I believe it should ignore many of the unknown variables.
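A minimal sketch of the scan idea, assuming a horizontal scan line and a fixed binarization threshold (both are simplifications; the file name and threshold are placeholders).

import cv2
import numpy as np

img = cv2.imread('orig1.png')                          # placeholder file name
value = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:, :, 2]  # Value channel

row = value[img.shape[0] // 2]                         # one scan line through the middle
binary = (row > 128).astype(np.uint8)                  # 1 = white, 0 = black

# collapse the scan line into runs and count the white runs;
# a line crossing the object should contain five of them
runs = []
for v in binary:
    if runs and runs[-1][0] == v:
        runs[-1][1] += 1
    else:
        runs.append([v, 1])
white_runs = sum(1 for val, length in runs if val == 1)
print(white_runs)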
Thoughts
I have looked at a number of OpenCV tutorials, as well as SO questions such as this one, however because my object is quite geometrically simple I am having issues implementing the ideas given.
I feel like this is an achievable task, however my struggle is knowing which method to pursue further. I have experimented with the first two ideas quite a bit, and while I haven't achieved anything very reliable, maybe there is something I am missing. Is there a standard way of achieving this task which I have not thought of, or is one of my suggested methods the most sensible?
EDIT: Once the corners are found using one of the above methods (or some other method), I am thinking of using Hu Moments or OpenCV's matchShapes() function to remove any false positives.
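A minimal sketch of that check, assuming a candidate contour cand and a reference contour ref_cnt extracted from a clean template image (both names are hypothetical, as is the cutoff value):

# lower scores mean more similar shapes
score = cv2.matchShapes(cand, ref_cnt, cv2.CONTOURS_MATCH_I1, 0.0)
if score > 0.1:              # placeholder cutoff
    pass                     # reject the candidate as a likely false positive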
EDIT2: Added some more input image examples as requested by @Timo
Orig1
Orig2
Orig3
Extra image 1
Extra image 2
Extra image 3
Extra image 4
I had some time to look into the problem and made a little Python script. I'm detecting the white rectangles inside your shape. Paste the code into a .py file and copy all input images into an input subfolder. The final result for each image is just a dummy at the moment and the script isn't complete yet. I'll try to continue it in the next couple of days. The script will create a debug subfolder where it'll save some images that show the current detection state.
import numpy as np
import cv2
import os

INPUT_DIR = 'input'
DEBUG_DIR = 'debug'
OUTPUT_DIR = 'output'
IMG_TARGET_SIZE = 1000

# each algorithm must return a rotated rect and a confidence value [0..1]: (((x, y), (w, h), angle), confidence)

def main():
    # a list of all used algorithms
    algorithms = [rectangle_detection]

    # load and prepare images
    files = list(os.listdir(INPUT_DIR))
    images = [cv2.imread(os.path.join(INPUT_DIR, f), cv2.IMREAD_GRAYSCALE) for f in files]
    images = [scale_image(img) for img in images]

    for img, filename in zip(images, files):
        results = [alg(img, filename) for alg in algorithms]
        roi, confidence = merge_results(results)

        display = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
        display = cv2.drawContours(display, [cv2.boxPoints(roi).astype('int32')], -1, (0, 230, 0))

        cv2.imshow('img', display)
        cv2.waitKey()

def merge_results(results):
    '''Merges all results into a single result.'''
    return max(results, key=lambda x: x[1])

def scale_image(img):
    '''Scales the image so that the biggest side is IMG_TARGET_SIZE.'''
    scale = IMG_TARGET_SIZE / np.max(img.shape)
    return cv2.resize(img, (0, 0), fx=scale, fy=scale)

def rectangle_detection(img, filename):
    debug_img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    _, binarized = cv2.threshold(img, 50, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(binarized, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    # detect all rectangles
    rois = []
    for contour in contours:
        if len(contour) < 4:
            continue
        cont_area = cv2.contourArea(contour)
        if not 1000 < cont_area < 15000:  # roughly filter by the area of the detected rectangles
            continue
        cont_perimeter = cv2.arcLength(contour, True)
        (x, y), (w, h), angle = rect = cv2.minAreaRect(contour)
        rect_area = w * h
        if cont_area / rect_area < 0.8:  # check the 'rectangularity'
            continue
        rois.append(rect)

    # save intermediate results in the debug folder
    rois_img = cv2.drawContours(debug_img, contours, -1, (0, 0, 230))
    rois_img = cv2.drawContours(rois_img, [cv2.boxPoints(rect).astype('int32') for rect in rois], -1, (0, 230, 0))
    save_dbg_img(rois_img, 'rectangle_detection', filename, 1)

    # todo: detect pattern
    return rois[0], 1.0  # dummy values

def save_dbg_img(img, folder, filename, index=0):
    '''Writes the given image to DEBUG_DIR/folder/filename_index.png.'''
    folder = os.path.join(DEBUG_DIR, folder)
    if not os.path.exists(folder):
        os.makedirs(folder)
    cv2.imwrite(os.path.join(folder, '{}_{:02}.png'.format(os.path.splitext(filename)[0], index)), img)

if __name__ == "__main__":
    main()
Here is an example image of the current WIP
The next step is to detect the pattern / relation between multiple rectangles. I'll update this answer when I make progress.
I'm working on a project to automatically rotate microscope image stacks of a fluid experiment so that they are lined up with images of the CAD template for the microfluidic chip. I am using the OpenCV package in Python for image processing. Having the correct rotational orientation is necessary so that the images can be masked properly for analysis. Our chips have markers filled with fluorescent dye that are visible in every frame. The template and a sample image look like the following (the template can be scaled to arbitrary size, but the relevant region of the images is typically ~100x100 pixels or so):
I have not been able to rotationally align the image to the CAD template. Typically, the misalignment between the CAD template and the images is less than a few degrees, which is still sufficient to interfere with analysis, so I need to be able to measure the rotational difference even if it is relatively small.
Following examples online I am using the following procedure:
Scale up the image to approximately the same size as the template using cubic interpolation (~800 x 800)
Threshold both images using Otsu's method
Find keypoints and extract descriptors using a built-in method (I've tried ORB, AKAZE, and BRIEF).
Match descriptors using a brute-force matcher with Hamming distance.
Take the best matches and use them to compute a partial affine transformation matrix
Use that matrix to infer a rotational shift, warping the one image to the other as a check.
Here's a sample of my code (borrowed in part from here):
import numpy as np
import cv2
import matplotlib.pyplot as plt

MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.5

def alignImages(im1, im2, returnpoints=False):
    # Detect ORB features and compute descriptors.
    size1 = int(0.1 * (np.mean(np.shape(im1))))
    size2 = int(0.1 * (np.mean(np.shape(im2))))
    orb1 = cv2.ORB_create(MAX_FEATURES, edgeThreshold=size1, patchSize=size1)
    orb2 = cv2.ORB_create(MAX_FEATURES, edgeThreshold=size2, patchSize=size2)
    keypoints1, descriptors1 = orb1.detectAndCompute(im1, None)
    keypoints2, descriptors2 = orb2.detectAndCompute(im2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(descriptors1, descriptors2)

    # Sort matches by score
    matches.sort(key=lambda x: x.distance, reverse=False)

    # Remove not so good matches
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]

    # Draw top matches
    imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None)
    cv2.imwrite("matches.jpg", imMatches)

    # Extract location of good matches
    points1 = np.zeros((len(matches), 2), dtype=np.float32)
    points2 = np.zeros((len(matches), 2), dtype=np.float32)
    for i, match in enumerate(matches):
        points1[i, :] = keypoints1[match.queryIdx].pt
        points2[i, :] = keypoints2[match.trainIdx].pt

    # Find the partial affine transformation
    M, inliers = cv2.estimateAffinePartial2D(points1, points2)
    height, width = im2.shape
    im1Reg = cv2.warpAffine(im1, M, (width, height))
    return im1Reg, M

if __name__ == "__main__":
    test_template = cv2.cvtColor(cv2.imread("test_CAD_cropped.png"), cv2.COLOR_RGB2GRAY)
    test_image = cv2.cvtColor(cv2.imread("test_CAD_cropped.png"), cv2.COLOR_RGB2GRAY)
    fx = fy = 88/923
    test_image_big = cv2.resize(test_image, (0, 0), fx=1/fx, fy=1/fy, interpolation=cv2.INTER_CUBIC)
    ret, imRef_t = cv2.threshold(test_template, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    ret, test_big_t = cv2.threshold(test_image_big, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    imReg, M = alignImages(test_big_t, imRef_t)

    fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
    ax[1, 0].imshow(imReg)
    ax[1, 0].set_title("Warped Image")
    ax[0, 0].imshow(imRef_t)
    ax[0, 0].set_title("Template")
    ax[0, 1].imshow(test_big_t)
    ax[0, 1].set_title("Thresholded Image")
    ax[1, 1].imshow(imRef_t - imReg)
    ax[1, 1].set_title("Diff")
    plt.show()
In this example, I get the following bad transformation because there are only 3 matching keypoints and they are all incorrect:
I find that regardless of my keypoint/descriptor parameters I tend to get too few "good" features. Is there anything I can do to pre-process my images better to get good features more reliably, or is there a better method to align my images to this template that doesn't involve keypoint matching? The specific application of this experiment means that I can't use the patented keypoint extractor/descriptors like SURF and SIFT.
A good method to align two images based on rotation, translation and scaling only is the Fourier Mellin transform.
Here is an example using the implementation in DIPlib (disclosure: I'm an author):
import diplib as dip
# load data
image = dip.ImageRead('image.png')
template = dip.ImageRead('template.png')
template = template.TensorElement(0) # this one is RGB, take any one channel
# pad the two images with zeros so they have equal sizes
sz = [max(image.Size(0), template.Size(0)), max(image.Size(1), template.Size(1))]
image = image.Pad(sz)
template = template.Pad(sz)
# match
res = dip.FourierMellinMatch2D(template, image)
# display
dip.JoinChannels((template,res,res)).Show()
However, there are many other approaches. A key thing here is that both the template and the image are quite simple, and very similar. This makes registration very easy.
For example, assuming you have the proper scaling of the template (this should not be a problem I presume), all you need to do is find the rotation and the translation. You can brute-force the rotations, simply rotating the image over a set of small angles, and matching each of the results with the template (cross-correlation). The one with the best match (largest cross-correlation value) has the appropriate rotation. If you need to have a very precise rotation estimation, you can do a second set of angles close to the best choice in the first set.
Cross-correlation is cheap and easy to compute, and it leads to high-precision translation estimates (the Fourier Mellin method makes extensive use of it). Don't just take the pixel with the largest value in the cross-correlation output; you can fit a parabola to the few pixels around it and use the location of the maximum of the fitted parabola. This gives sub-pixel estimates of the translation.
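A minimal sketch of that brute-force idea in OpenCV/NumPy terms, using phase correlation as a stand-in for the cross-correlation step; the angle range, step size and file names are assumptions, not values from the question.

import cv2
import numpy as np

template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
image = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
image = cv2.resize(image, (template.shape[1], template.shape[0]))  # assume same scale

h, w = image.shape
best = (0.0, (0.0, 0.0), -1.0)                 # (angle, shift, correlation response)
for angle in np.arange(-5.0, 5.0, 0.25):       # coarse sweep over small angles
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h))
    shift, response = cv2.phaseCorrelate(template, rotated)
    if response > best[2]:
        best = (angle, shift, response)
print('estimated rotation and translation:', best[0], best[1])

A second, finer sweep around the best angle (plus the parabola fit described above) can then refine the estimate.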
I've succeeded with the method below, but I'm sure there must be more time-efficient alternatives that provide an exact angle of rotation instead of the approximation this method gives. I'll be pleased to hear your feedback.
The procedure is based on the following steps:
Import a template image (i.e. with orientation at 0º)
Create a discrete array of the same image, each one rotated by 360º/rotate_steps relative to its nearest neighbour (i.e. 30 to 50 rotated images)
# python 3 / opencv 3
import cv2
import numpy as np

# Settings:
rotate_steps = 36
step_angle = round((360 / rotate_steps), 0)  # one image at each 10º

# Rotation function
def rotate_image(image, angle):
    # ../..
    return rotated_image

# Importing a sample image and creating an n-dimensional array to store the rotated images in:
image = cv2.imread('sample_image.png')
image_array = np.zeros((image.shape[1], image.shape[0], rotate_steps), dtype='uint8')

# Rotating the sample image and saving it into the array as a new channel:
angles = []
rotation_angle = 0
channel = 0
while rotation_angle <= (360 - step_angle):
    angles.append(rotation_angle)
    image_array[:, :, channel] = rotate_image(image.copy(), rotation_angle)
    # ../..
So I get:
angles = [0, 10.0, 20.0, 30.0, .../..., 340.0, 350.0]
image_array = [image_1, image_2, image_3, ...] where image_i is a different channel on a numpy array.
Retrieve the 'test_image' whose angle relative to the sample image I want to find, using the rotated images previously stored in the array
Run a series of cv2.matchTemplate() and cv2.minMaxLoc() calls to find which rotated image's angle best matches the 'test_image'
for i in range(len(angles)):
    res = cv2.matchTemplate(test_image, image_array[:, :, i], cv2.TM_CCOEFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    # ../..
And finally I pick the discretized angle matching the sample image as the one corresponding to the template image with 'max_val' highest value.
This has proved to work well, keeping in mind that the resulting precision is an approximation - higher or lower depending on the number of rotated template images - and that the runtime grows as the number of rotated templates increases...
I'm sure there must be smarter alternatives based on different methods, such as generating a kind of "orientation vector" for an image and then comparing just the resulting number with a previously known one from a sample template...
Your feedback will be highly appreciated.
I think your problem doesn't have an easy solution. It's in fact a registration problem, warping (in this case, rotating) an image to fit another. And it's a known difficult problem, as segmentation is.
I heard image processing researchers say that "he who masters segmentation and registration masters image processing", which might be a little bit of a hyperbole, but it gives the general idea.
Anyway, your technique is how I would have gone about it. Looking on ResearchGate, https://www.researchgate.net/post/How_can_one_determine_the_rotation_angle_between_two_images, lots of answers also go your way. The alternative would be using feature matching, but I'm not sure it would be faster than your solution.
Maybe you can have a look at OpenCV registration methods http://docs.opencv.org/trunk/db/d61/group__reg.html (the method in this link uses pixel matching and not feature matching, maybe it's faster)
How can I optimise the SIFT feature matching for many pictures using FLANN?
I have a working example taken from the Python OpenCV docs. However this is comparing one image with another and it's slow. I need it to search for features matching in a series of images (a few thousands) and I need it to be faster.
My current idea:
Run through all the images and save the features. How?
Compare an image from a camera with this above base, and find the correct one. How?
Give me the result, matching image or something.
http://docs.opencv.org/trunk/doc/py_tutorials/py_feature2d/py_feature_homography/py_feature_homography.html
import sys # For debugging only
import numpy as np
import cv2
from matplotlib import pyplot as plt

MIN_MATCH_COUNT = 10

img1 = cv2.imread('image.jpg',0) # queryImage
img2 = cv2.imread('target.jpg',0) # trainImage

# Initiate SIFT detector
sift = cv2.SIFT()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks = 50)

flann = cv2.FlannBasedMatcher(index_params, search_params)

matches = flann.knnMatch(des1,des2,k=2)

# store all the good matches as per Lowe's ratio test.
good = []
for m,n in matches:
    if m.distance < 0.7*n.distance:
        good.append(m)

if len(good) > MIN_MATCH_COUNT:
    src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
    dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC,5.0)
    matchesMask = mask.ravel().tolist()

    h,w = img1.shape
    pts = np.float32([ [0,0],[0,h-1],[w-1,h-1],[w-1,0] ]).reshape(-1,1,2)
    dst = cv2.perspectiveTransform(pts,M)

    img2 = cv2.polylines(img2,[np.int32(dst)],True,255,3, cv2.LINE_AA)
else:
    print "Not enough matches are found - %d/%d" % (len(good),MIN_MATCH_COUNT)
    matchesMask = None

draw_params = dict(matchColor = (0,255,0), # draw matches in green color
                   singlePointColor = None,
                   matchesMask = matchesMask, # draw only inliers
                   flags = 2)

img3 = cv2.drawMatches(img1,kp1,img2,kp2,good,None,**draw_params)

plt.imshow(img3, 'gray'),plt.show()
UPDATE
After trying out many things I might have come closer to the solution now. I hope it's possible to build the index and then search in it like this:
flann_params = dict(algorithm=1, trees=4)
flann = cv2.flann_Index(npArray, flann_params)
idx, dist = flann.knnSearch(queryDes, 1, params={})
However I still haven't managed to build an accepted npArray to the flann_Index parameter.
loop through all images as image:
    npArray.append(sift.detectAndCompute(image, None))
npArray = np.array(npArray)
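One common way to build such an index is to stack only the descriptors into a single float32 array and keep a separate array recording which image each row came from. A minimal sketch under that assumption (images and query_image are hypothetical variables, and cv2.SIFT_create() stands in for the older cv2.SIFT()):

import cv2
import numpy as np

sift = cv2.SIFT_create()

all_desc = []
owner = []                                    # owner[i] = index of the image row i came from
for img_idx, image in enumerate(images):      # `images` is the list of database images
    kps, desc = sift.detectAndCompute(image, None)
    if desc is None:
        continue
    all_desc.append(desc)
    owner.extend([img_idx] * len(desc))

db = np.vstack(all_desc).astype(np.float32)
owner = np.array(owner)

FLANN_INDEX_KDTREE = 1
flann = cv2.flann_Index(db, dict(algorithm=FLANN_INDEX_KDTREE, trees=4))

# query: let each query descriptor vote for the database image its nearest neighbour belongs to
kps_q, desc_q = sift.detectAndCompute(query_image, None)
idx, dist = flann.knnSearch(desc_q.astype(np.float32), 1, params={})
votes = np.bincount(owner[idx.ravel()], minlength=len(images))
best_image = int(np.argmax(votes))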
I never solved this in Python; however, I switched to C++, where you get more OpenCV examples and don't have to use a wrapper with less documentation.
An example on the issue I had with matching in multiple files can be found here: https://github.com/Itseez/opencv/blob/2.4/samples/cpp/matching_to_many_images.cpp
Along with the reply of @stanleyxu2005, I'd like to add some tips on how to do the whole matching itself, since I'm currently working on such a thing.
I strongly recommend creating some custom class that wraps around the cv::Mat but also stores various other essential pieces of data. In my case I have an ImageContainer that stores the original image (which I will use for the final stitching), the processed one (grayscaled, undistorted etc.), its keypoints, and the descriptors for those. By doing so you can access all the matching-relevant information in a pretty well organized way. You can either implement the keypoint extraction and descriptor generation in it, or do that outside the class and just store the results in the container.
Store all image containers in some kind of a structure (vector is usually a good choice) for easy access.
I also created a class called ImageMultiMatchContainer, which stores a pointer to a given query image (all images are query images), a vector of pointers to all train images (for a single query image of the image set, all others are train images) that were matched to it, and also a vector of the match vectors for each of those matches.
Here I stumbled across a storage issue: first, you have to skip matching an image with itself because it is pointless, and second, you have the problem of comparing two images twice, which generates considerable overhead if you have a lot of images. The second problem is due to the fact that we iterate through all images (query images) and compare them to the rest in the set (train images). At some point we have image X (query) matched with image Y (train), but later we also have image Y (now query) matched with image X (now train). As you can see, this is also pointless since it's basically matching the same pair of images twice.
This can be solved (I'm currently working on this) by creating a class (MatchContainer) that stores a pointer to each of the two images in a matched pair and also the match vector. You store these in a central location (in my case this is my matcher class), and for each image as query image you check the list of matched images of the train image. If it's empty, you create a new MatchContainer and add it to the rest of the MatchContainers. If it's not, you look in it and see whether the current query image is present there (comparing pointers is a fast operation). If it is, you just pass the pointer to that MatchContainer's vector item that stores the matches for those two images. If it is not, you proceed as if it were empty and create a new MatchContainer etc.
MatchContainers should be stored in a data structure with small access times, since you will be looking at them a lot, and iterating from start to end costs too much time. I'm considering using a map, but maybe a tree of some sort can offer some advantages as well.
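The bookkeeping behind matching each unordered pair only once is compact; here is a minimal illustration in Python (the answer itself is about C++, and descriptors is a hypothetical list holding one descriptor array per image):

import cv2
from itertools import combinations

bf = cv2.BFMatcher(cv2.NORM_L2)
pair_matches = {}                               # (i, j) with i < j  ->  list of matches
for i, j in combinations(range(len(descriptors)), 2):
    pair_matches[(i, j)] = bf.match(descriptors[i], descriptors[j])
# the matches stored for (i, j) also answer queries for (j, i),
# and no image is ever matched against itself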
The homography estimation is a very tricky part. Here I recommend you look at bundle block adjustment. I saw that the stitcher class in OpenCV has a BundleBase-class but haven't tested it yet to see what's in it.
A general recommendation is to look at the stitching process in OpenCV and read the source code. The stitching pipeline is a straightforward set of processes, and you just have to see how exactly you can implement the individual steps.
Here are several pieces of my advice:
You should reduce the amount of point data by using proper techniques.
Calculating the reference image repeatedly is a waste. You should persist all calculated references (see the sketch after this list).
Do not do the calculation on a mobile device. You'd better upload the calculated reference of a captured image to a powerful server and do the searching there.
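A minimal sketch of persisting the reference descriptors so they are computed only once; the file names are placeholders, and only the descriptor array is saved since it is a plain NumPy array.

import cv2
import numpy as np

sift = cv2.SIFT_create()

# compute once and save
ref = cv2.imread('reference.jpg', cv2.IMREAD_GRAYSCALE)
_, ref_desc = sift.detectAndCompute(ref, None)
np.save('reference_desc.npy', ref_desc)

# later (or on the server): load instead of recomputing
ref_desc = np.load('reference_desc.npy')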
This is a very interesting topic. My ears are opening too.
Is there any good algorithm for detecting particles on a changing background intensity?
For example, if I have the following image:
Is there a way to count the small white particles, even with the clearly different background that appears towards the lower left?
To be a little more clear, I would like to label the image and count the particles with an algorithm that finds these particles to be significant:
I have tried many things with the PIL, cv, scipy, numpy, etc. modules.
I got some hints from this very similar SO question, and it appears at first glance that you could take a simple threshold like so:
im = mahotas.imread('particles.jpg')
T = mahotas.thresholding.otsu(im)
labeled, nr_objects = ndimage.label(im>T)
print nr_objects
pylab.imshow(labeled)
but because of the changing background you get this:
I have also tried other ideas, such as a technique I found for measuring paws, which I implemented in this way:
import numpy as np
import scipy
import pylab
import pymorph
import mahotas
from scipy import ndimage
import cv

def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """
    # define an 8-connected neighborhood
    neighborhood = ndimage.morphology.generate_binary_structure(2,2)

    # apply the local maximum filter; all pixels of maximal value
    # in their neighborhood are set to 1
    local_max = ndimage.filters.maximum_filter(image, footprint=neighborhood)==image
    # local_max is a mask that contains the peaks we are
    # looking for, but also the background.
    # In order to isolate the peaks we must remove the background from the mask.

    # we create the mask of the background
    background = (image==0)

    # a little technicality: we must erode the background in order to
    # successfully subtract it from local_max, otherwise a line will
    # appear along the background border (artifact of the local maximum filter)
    eroded_background = ndimage.morphology.binary_erosion(background, structure=neighborhood, border_value=1)

    # we obtain the final mask, containing only peaks,
    # by removing the background from the local_max mask
    detected_peaks = local_max - eroded_background

    return detected_peaks

im = mahotas.imread('particles.jpg')
imf = ndimage.gaussian_filter(im, 3)
#rmax = pymorph.regmax(imf)
detected_peaks = detect_peaks(imf)
pylab.imshow(pymorph.overlay(im, detected_peaks))
pylab.show()
but this gives no luck either, showing this result:
Using the regional max function, I get images which almost appear to give correct particle identification, but there are either too many or too few particles in the wrong spots, depending on my Gaussian filtering (the images shown use Gaussian filters of 2, 3, and 4):
Also, it would need to work on images similar to this as well:
This is the same type of image above, just at a much higher density of particles.
EDIT (solved): I was able to get a decent working solution to this problem using the following code:
import cv2
import pylab
from scipy import ndimage
im = cv2.imread('particles.jpg')
pylab.figure(0)
pylab.imshow(im)
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5,5), 0)
maxValue = 255
adaptiveMethod = cv2.ADAPTIVE_THRESH_GAUSSIAN_C  # alternative: cv2.ADAPTIVE_THRESH_MEAN_C
thresholdType = cv2.THRESH_BINARY  # alternative: cv2.THRESH_BINARY_INV
blockSize = 5  # odd number like 3,5,7,9,11
C = -3  # constant to be subtracted
im_thresholded = cv2.adaptiveThreshold(gray, maxValue, adaptiveMethod, thresholdType, blockSize, C)
labelarray, particle_count = ndimage.measurements.label(im_thresholded)
print particle_count
pylab.figure(1)
pylab.imshow(im_thresholded)
pylab.show()
This will show the images like this:
(which is the given image)
and
(which is the counted particles)
and calculate the particle count as 60.
I solved the "variable brightness in background" issue by using a tuned difference threshold, with a technique called Adaptive Contrast. It works by performing a linear combination (a difference, in this case) of a grayscale image with a blurred version of itself, then applying a threshold to it.
Convolve the image with a suitable statistical operator.
Subtract the original from the convolved image, correcting intensity scale/gamma if necessary.
Threshold the difference image with a constant.
(original paper)
I did this very successfully with scipy.ndimage, in the floating-point domain (way better results than integer image processing), like this:
original_grayscale = numpy.asarray(some_PIL_image.convert('L'), dtype=float)
blurred_grayscale = scipy.ndimage.filters.gaussian_filter(original_grayscale, blur_parameter)
difference_image = original_grayscale - (multiplier * blurred_grayscale);
image_to_be_labeled = ((difference_image > threshold) * 255).astype('uint8') # not sure if it is necessary
labelarray, particle_count = scipy.ndimage.measurements.label(image_to_be_labeled)
Hope this helps!!
I cannot really give a definite answer, but here are a few pointers:
The function mahotas.morph.regmax might be better than the maximum filter, as it removes pseudo-maxima. Perhaps combine this with a global threshold, with a local threshold (such as the mean over a window), or both (a minimal sketch of the first combination is at the end of this answer).
If you have several images and the same uneven background, then maybe you can compute an average background and normalize against that, or use empty images as your estimate of background. This would be the case if you have a microscope, and like every microscope I've seen, the illumination is uneven.
Something like:
average = average_of_many(images)
# smooth it
average = mahotas.gaussian_filter(average,24)
Now you preprocess your images, like:
preproc = image/average
or something like that.
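A minimal sketch of the first pointer (regional maxima combined with a global threshold); the smoothing sigma and the use of Otsu for the threshold are assumptions, and mahotas.regmax is used here for the regional maxima.

import mahotas
import numpy as np
from scipy import ndimage

im = mahotas.imread('particles.jpg')
imf = ndimage.gaussian_filter(im, 3).astype(np.uint8)

T = mahotas.thresholding.otsu(imf)
peaks = mahotas.regmax(imf) & (imf > T)   # regional maxima that are also above the global threshold
labeled, n = ndimage.label(peaks)
print(n)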