Here is a cropped example (about 11x9 pixels) of the kind of image I will be applying the algorithm to (the full images are actually 28x28, but stored in memory flattened as 784-element arrays):
Basically, I want to be able to recognize when this shape appears (the red lines emphasize the separation between pixels, while the surrounding black border outlines the image against the white background of StackOverflow):
The orientation doesn't matter: the shape must be detected in any of its possible representations, that is, rotations by multiples of 90° and reflections along the horizontal and vertical axes (so a 45° rotation or a diagonal symmetry, for example, shouldn't be considered; only 90°, 180°, and 270° rotations and axis-aligned mirrorings).
There are two solutions to be found in the first image I presented, though only one needs to be found (ignore the gray blur surrounding the white region):
Take this other sample (which also demonstrates that the white figures inside the images aren't always fully surrounded by black pixels):
The function should return True because the shape is present:
Now, there is obviously a simple solution to this:
Use a variable such as pattern = [[1,0,0,0],[1,1,1,1]], produce its variations, and then slide all of the variations over the image until an exact match is found, at which point the search stops and returns True.
In the worst case, however, this would take up to 8*(28-2)*(28-4)*(2*4), which is approximately 40,000 operations for a single image, which seems excessive (if I did my quick calculation right).
I'm guessing one way of improving this naive approach would be to first scan the image for the very first white pixel, and only start looking for the pattern from 4 rows and 4 columns before that point, but even that doesn't seem good enough.
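For reference, here is a minimal sketch of this naive approach (assuming, as described above, that the image arrives as a flattened 784-element NumPy array):

import numpy as np

def contains_pattern(flat_image):
    img = flat_image.reshape(28, 28)
    base = np.array([[1, 0, 0, 0],
                     [1, 1, 1, 1]])
    # the 8 variations: 4 rotations of the pattern and of its mirror image
    variations = [np.rot90(p, k) for p in (base, np.fliplr(base))
                  for k in range(4)]
    for pat in variations:
        ph, pw = pat.shape
        for i in range(28 - ph + 1):
            for j in range(28 - pw + 1):
                if np.array_equal(img[i:i+ph, j:j+pw], pat):
                    return True
    return False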
Any ideas? Maybe this kind of function has already been implemented in some library? I'm looking for an implementation or an algorithm that beats my naive approach.
As a side note: while it would be kind of a hack, I'm guessing this is the kind of problem that could be offloaded to the GPU, but I don't have much experience with that. While a GPU solution isn't what I'm primarily looking for, feel free to add a GPU-related note to your answer.
EDIT:
I ended up making an implementation of the accepted answer. You can see my code in this Gist.
If you have too many operations, think about how to do fewer of them.
For this problem I'd use image integrals.
If you convolve a summing kernel over the image (a very fast operation in the FFT domain, or with just conv2/imfilter), you know that only the locations where the integral equals 5 (the number of white pixels in your pattern) are possible matching places. Checking those (even for your 4 rotations) should be computationally very fast. There cannot be more than 50 locations in your example image that fit this pattern.
My Python is not too fluent, but here is a proof of concept for your first image in MATLAB; I am sure that translating this code will not be a problem.
% get the same image you have (imgur upscaled it and made it RGB)
I=rgb2gray(imread('https://i.stack.imgur.com/l3u4A.png'));
I=imresize(I,[9 11]);
I=double(I>50);
% Integral filter definition (with your desired size)
h=ones(3,4);
% horizontal and vertical filter (because your filter is not square)
Ifiltv=imfilter(I,h);
Ifilth=imfilter(I,h');
% find the locations where integral is exactly the value you want
[xh,yh]=find(Ifilth==5);
[xv,yv]=find(Ifiltv==5);
% this is just plotting, for completeness
figure()
imshow(I,[]);
hold on
plot(yh,xh,'r.');
plot(yv,xv,'r.');
This results in 14 locations to check. My standard computer takes 230ns on average to compute both image integrals, which I would call fast.
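For anyone working in Python, here is a rough sketch of the same candidate-filtering step (my own translation, assuming SciPy and an already binarized 2-D NumPy array; the plotting part is omitted):

import numpy as np
from scipy.ndimage import convolve

def candidate_locations(img):
    """img: 2-D binary (0/1) array. Returns the coordinates where the
    local 3x4 (or 4x3) sum equals 5, the only places the pattern can match."""
    h = np.ones((3, 4))                           # summing (integral) kernel
    horiz = convolve(img, h, mode='constant')     # horizontal orientation
    vert = convolve(img, h.T, mode='constant')    # vertical orientation
    return np.argwhere(horiz == 5), np.argwhere(vert == 5)

Each returned coordinate is a cheap candidate that can then be verified with an exact comparison against the pattern variations.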
Also, GPU computing is not a hack :D. It's the way to go for a big bunch of problems because of the enormous computing power GPUs have. E.g. convolutions on GPUs are incredibly fast.
The operation you are implementing is an operator in Mathematical Morphology called hit and miss.
It can be implemented very efficiently as a composition of two erosions. If the shape you’re detecting can be decomposed into a few simple geometrical shapes (rectangles in particular are quick to compute), then the operator can be even more efficient.
You’ll find very efficient erosions in most image processing libraries; for example, try OpenCV. OpenCV also has a hit and miss operator; here is a tutorial on how to use it.
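As an illustration of the OpenCV route (this snippet is my own sketch, not from the tutorial; the tiny test image and the kernel layout are assumptions):

import cv2
import numpy as np

# Kernel convention for cv2.MORPH_HITMISS:
#   1 = pixel must be foreground, -1 = must be background, 0 = don't care
kernel = np.array([[1, -1, -1, -1],
                   [1,  1,  1,  1]], dtype="int")

# Small binary test image (0/255) containing the shape exactly once
img = np.zeros((6, 6), dtype=np.uint8)
img[1, 1] = 255
img[2, 1:5] = 255

hits = cv2.morphologyEx(img, cv2.MORPH_HITMISS, kernel)
print(cv2.countNonZero(hits) > 0)  # True if the template matched anywhere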
As an example for what output to expect, I generated a simple test image (left), applied a hit and miss operator with a template that matches at exactly one place in the image (middle), and again with a template that does not match anywhere (right):
I did this in MATLAB, not Python, because I have it open and it's easiest for me to use. This is the code:
se = [1,1,1,1 % Defines the template
0,0,0,1];
img = [0,0,0,0,0,0 % Defines the test image
0,1,1,1,1,0
0,0,0,0,1,0
0,0,0,0,0,0
0,0,0,0,0,0
0,0,0,0,0,0];
img = dip_image(img,'bin');
res1 = hitmiss(img,se);
res2 = hitmiss(img,rot90(se,2));
% Quick-and-dirty display
h = dipshow([img,res1,res2]);
diptruesize(h,'tight',3000)
hold on
plot([5.5,5.5],[-0.5,5.5],'r-')
plot([11.5,11.5],[-0.5,5.5],'r-')
The code above uses the hit and miss operator as I implemented it in DIPimage. The same implementation is available in DIPlib's Python bindings as dip.HitAndMiss() (install with pip install diplib):
import diplib as dip
# ... (set up the binary image img and the template se as in the example above)
res = dip.HitAndMiss(img, se)
Related
I have a set of images that look like this:
Using Python, I need a way to find a contour around the yellow shape that ignores the isolated points and is not too complex. Something looking a bit like this:
I tried some methods such as the find_contours function from skimage, which gives this after keeping only the biggest contour:
which is not what I am looking for. I also tried active contours (snakes), which had the problem of paying too much attention to isolated pixels. Is there a particular method that would help me in this situation?
Thank you
Assuming the yellow blob is slightly different across your images, I recommend you look into either Morphological Operations or Contour Approximation.
I've never used scikit-image, but it appears to include morphological functionality.
You can take a look at this OpenCV tutorial for a quick guideline of the different operations.
But I think all you need is the "Opening" operation to preprocess your yellow shape, making it smoother and removing the random speckles.
Another approach is to approximate the contour you've extracted to make it smoother. For scikit-image, that is the measure.approximate_polygon function. Here is another OpenCV tutorial for reference on how Contour Approximation works (the same algorithm as in scikit-image).
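A minimal scikit-image sketch of both suggestions (the toy mask and all parameter values are my own assumptions, to keep the snippet self-contained):

import numpy as np
from skimage.morphology import binary_opening, disk
from skimage import measure

# Toy stand-in for the thresholded yellow shape: a blob plus speckles
mask = np.zeros((60, 60), dtype=bool)
mask[20:45, 15:45] = True            # the main blob
mask[5, 5] = mask[50, 55] = True     # isolated pixels

# 1) Opening (erosion then dilation) removes the speckles and smooths
#    the outline; disk(2) is an assumed radius, tune it to the speckle size
cleaned = binary_opening(mask, disk(2))

# 2) Extract the biggest contour and simplify it; a larger tolerance
#    gives fewer vertices (2.0 is a guessed starting value)
contours = measure.find_contours(cleaned.astype(float), 0.5)
largest = max(contours, key=len)
polygon = measure.approximate_polygon(largest, tolerance=2.0)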
I've been trying to write a fast(ish) image matching program in Python which doesn't match rotated or scale-deformed images.
The goal is to be able to find small sections of an image that are similar to other images in color features, but dissimilar if rotated or warped.
I found out about perceptual image hashing, and I've had a look at the ImageHash module for Python and at SSIM. However, most of the things I've looked at don't treat color as a major factor, i.e. they average the color and only work in one channel, and phash in particular doesn't care whether images are rotated.
I would like an algorithm which would match images that at a distance would appear the same (but which would not necessarily need to be the same image).
Can anyone suggest how I would structure and write such an algorithm in Python, or suggest a function which would be able to compare images in this manner?
I found a couple of ways to do this.
I ended up using a Mean Squared Error function that I wrote myself:
def mse(reference, query):
    # reference and query are NumPy arrays of the same shape
    return ((reference.astype("double") - query.astype("double")) ** 2).mean()
Then, upon later tinkering, I found a function that seemed to do something similar (compare image similarity, bit by bit), but a good amount faster:
import numpy as np

def linalg_norm(reference, query):
    # Euclidean norm of the pixel-wise difference between the two arrays
    return np.linalg.norm(reference - query)
I have no theoretical knowledge of what the second function does; practically, however, it doesn't matter. I am not averse to learning how it works.
I have written a program in Python which automatically reads score sheets like this one:
At the moment I am using the following basic strategy:
Deskew the image using ImageMagick
Read into Python using PIL, converting the image to B&W
Calculate the sums of pixels in the rows and the columns (see the sketch after this list)
Find peaks in these sums
Check the intersections implied by these peaks for fill.
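A minimal sketch of steps 3 and 4 (assuming SciPy and a binarized NumPy array bw where 1 marks a dark pixel; the height and distance values are guesses):

import numpy as np
from scipy.signal import find_peaks

def grid_positions(bw):
    row_sums = bw.sum(axis=1)   # one value per image row
    col_sums = bw.sum(axis=0)   # one value per image column
    # height/distance are assumed tuning knobs, the very thing that
    # makes this approach sensitive to scanning settings
    rows, _ = find_peaks(row_sums, height=row_sums.max() * 0.5, distance=5)
    cols, _ = find_peaks(col_sums, height=col_sums.max() * 0.5, distance=5)
    return rows, cols           # candidate grid lines; intersections follow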
The result of running the program is shown in this image:
You can see the peak plots below and to the right of the image shown in the top left. The lines in the top left image are the positions of the columns and the red dots show the identified scores. The histogram bottom right shows the fill levels of each circle, and the classification line.
The problem with this method is that it requires careful tuning and is sensitive to differences in scanning settings. Is there a more robust way of recognising the grid, one which requires less a priori information (at the moment I am using knowledge about how many dots there are) and is more robust to people drawing other shapes on the sheets? I believe it may be possible using a 2D Fourier transform, but I'm not sure how.
I am using the EPD, so I have quite a few libraries at my disposal.
First of all, I find your initial method quite sound and I would have probably tried the same way (I especially appreciate the row/column projection followed by histogramming, which is an underrated method that is usually quite efficient in real applications).
However, since you want to go for a more robust processing pipeline, here is a proposal that can probably be fully automated (which also removes the need for the deskewing step via ImageMagick):
Feature extraction: extract the circles via a generalized Hough transform (a sketch follows below). As suggested in other answers, you can use OpenCV's Python wrapper for that. The detector may miss some circles, but this is not important.
Apply a robust alignment detector using the circle centers. You can use the Desolneux parameter-less detector described here. Don't be afraid of the math: the procedure is quite simple to implement (and you can find example implementations online).
Get rid of diagonal lines by a selection on the orientation.
Find the intersections of the lines to get the dots. You can use these coordinates for deskewing by assuming ideal fixed positions for these intersections.
This pipeline may be a bit CPU-intensive (especially step 2, which involves some kind of greedy search), but it should be quite robust and automatic.
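For step 1, a sketch of the circle extraction with OpenCV's Hough transform (the filename and all parameter values are assumptions to be tuned):

import cv2
import numpy as np

img = cv2.imread('sheet.png', cv2.IMREAD_GRAYSCALE)  # assumed filename
img = cv2.medianBlur(img, 5)                         # suppress scan noise
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=1, minDist=10,
                           param1=100, param2=20, minRadius=3, maxRadius=12)
# circles has shape (1, N, 3): x, y, radius per detection
centers = circles[0, :, :2] if circles is not None else np.empty((0, 2))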
The correct way to do this is to use connected component analysis on the image to segment it into "objects". Then you can use higher-level algorithms (e.g. a Hough transform on the component centroids) to detect the grid, and determine for each cell whether it's on or off by looking at the number of active pixels it contains.
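A minimal sketch of that idea with SciPy (the toy array is a stand-in for the thresholded sheet):

import numpy as np
from scipy import ndimage

bw = np.zeros((20, 20), dtype=bool)    # stand-in for the thresholded image
bw[2:5, 2:5] = bw[2:5, 10:13] = True   # two filled "cells"

labels, n = ndimage.label(bw)          # one integer label per object
centroids = ndimage.center_of_mass(bw, labels, range(1, n + 1))
sizes = ndimage.sum(bw, labels, range(1, n + 1))  # active pixels per object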
Using Python, what would be the best algorithm or strategy to detect the presence of colored bands as in the image?
The image is scanned and cropped, but the crop is not precise, so I cannot rely on fixed Cartesian coordinates to determine whether the lines are present.
The strips may or may not be present.
You have a number of options at your disposal:
If the strips are going to be the same size, and their orientation is known, then you can use cross-correlation (with working Python source; a sketch follows below). Your template image could be a single strip, or a multi-strip pattern if you know the number of strips and their spacing.
More generally, you could go with morphological image processing and look for rectangles. You'd first have to threshold your image (using Otsu's method or some empirically determined threshold) and then perform contour detection. Here's an example that does something similar, but for ellipses; it's trivial to modify it to look for rectangles. This time the source is in C, but it uses OpenCV like the first example, so it should be trivial to port.
There are other approaches such as edge detection and Fourier analysis, but I really think that the first two are going to be more than enough for you.
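For the first option, a minimal sketch of template matching via normalized cross-correlation with OpenCV (the filenames and the 0.8 threshold are assumptions):

import cv2

img = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)        # assumed filename
template = cv2.imread('strip.png', cv2.IMREAD_GRAYSCALE)  # single-strip template

res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
strips_present = bool((res >= 0.8).any())  # any strong correlation peak?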
Possible Duplicate:
Image comparison algorithm
So basically I need to write a program that checks whether 2 images are the same or not. Consider the following 2 images:
http://i221.photobucket.com/albums/dd298/ramdeen32/starry_night.jpg
http://i221.photobucket.com/albums/dd298/ramdeen32/starry_night2.jpg
Well, they are both the same image, but how do I check whether they are the same? I am limited to the media functions only. All I can think of right now is scaling both to the same width and height and comparing the RGB values for each pixel, but wouldn't the colors be different?
I'm completely lost on this one; any help is appreciated.
*Note: this has to be in Python and use the media library.
Wow - that is a massive question, and one that has a vast number of possible solutions. I'm afraid I'm not a Python expert, but I thought your question was interesting, so I wanted to propose a method that I would implement if I were posed with this problem.
Obviously, the two images you posted are actually very different, so you will need to consider 'how much difference counts as the same', especially when working with images and considering different image formats, compression, etc.
Anyway, for a solution that allows for a given difference in colour values (but not for pixels to be in the wrong places), I would do something like the following (a sketch follows the list):
Pick two images.
Rescale the larger image to the exact same height and width as the first (even distorting the image if necessary).
Possibly grayscale the images to make the next steps simpler without losing much effectiveness. Running edge detection here could work too.
Go through each pixel in both images and store the difference in each of the RGB channels, or just the difference in grayscale intensity. You would end up with an array the size of the image recording the difference between pixel intensities in the two images.
Now, I don't know the exact values, but you would probably then find that if you iterate over the array you could see whether the difference between each pixel in the two images is the same (or nearly the same) across all of the pixels. Perhaps iterate over the array once to find the average difference between the pixel intensities in the two images, then iterate over the image again to see if 90% of the differences fall within a certain threshold (5% difference?).
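A minimal sketch of the steps above, using PIL and NumPy in place of the media library (the threshold values are guesses):

import numpy as np
from PIL import Image

def roughly_equal(path_a, path_b, max_diff=25, fraction=0.9):
    """Grayscale both images, force them to a common size, and check
    whether most pixel differences stay below a threshold."""
    a = Image.open(path_a).convert('L')
    b = Image.open(path_b).convert('L').resize(a.size)
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    return (diff <= max_diff).mean() >= fraction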
Just an idea. Of course, there might be some nice functions that I'm not aware of to make this easy, but I wouldn't hold my breath!
ImageMagick has Python bindings and a comparison function. It should do most of the work for you, but I've never used it in Python.
I think step 2 of John Wordsworth's answer may be one of the hardest; here you are dealing with a stretched copy of the image, but do you also allow rotated, cropped, or otherwise distorted images? If so, you are going to need a feature matching algorithm, such as the ones used in Hugin or other panorama creation software. This will find matching features, distort the image to fit, and then you can do the other stages of comparison. Ideally you want to recognise Van Gogh's painting from photos, even photos on mugs! It's easy for a human to do this; for a computer it needs rather more complex maths.