I have a bunch of SEM images that I'm trying to segment to search for features. This is one of those things that's obvious to a person looking at it but less so to a computer (at least one coded by me!). I was hoping that someone might give me a way to think about these kinds of problems and this one in particular.
So here's an easy example that works more or less whatever I do:
I do some basic trimming to get rid of the metadata bar (I actually read the metadata first, but that's beside the point). Then if I, for example, do:
import skimage.filters as filters
threshold = filters.threshold_isodata(image)
seg_image = image > threshold
I get a true/false array which can be viewed as an image and looks like this:
(I put a little black strip at the bottom so you can flick back and forth with pleasing effect).
However, the same code on this image:
leads to this result:
It should be clear to any human looking at it that there are a lot of false positives: the background is overexposed and noisy, and that noise gets caught by the threshold.
I've tried various combinations of:
import numpy as np
import skimage.segmentation as seg
import skimage.restoration as res
import skimage.filters as filters
import skimage.morphology as morph

seg_image = np.copy(image)
seg_image = morph.opening(seg_image, morph.disk(2))    # remove small bright specks
seg_image = res.denoise_bilateral(seg_image)           # edge-preserving smoothing
seed = np.copy(seg_image)
seed[1:-1, 1:-1] = seg_image.min()
seg_image = seg_image - morph.reconstruction(seed, seg_image)  # subtract the reconstructed background
As well as a few other filters (in different orders and including or excluding at random). I don't, in general, do all of those things because it's a disaster.
My logic, such as it is, was:
opening is a way to get rid of small bright spots; a small disk should remove these little noise pixels in the bulk of the image.
denoise_bilateral followed similar logic.
the reconstruction is meant to remove the background and leave only the foreground. Unfortunately it tended to keep the noisy pixels as well.
I'm continuing to fiddle around, but I'd love to hear how people think about these kinds of "gray on gray" images; rather than a magic parameter set, I really want to understand a better approach.
Update: Canny edge detection seems to work quite uncannily.
Good image:
Bad image:
Now the challenge is to count the stuff inside these edges. The hole-filling approach from here (using from scipy import ndimage as ndi, then ndi.binary_fill_holes(seg_image)) fails because it fills the wrong areas, but this still seems like a promising direction.
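A minimal sketch of that direction, assuming image is the trimmed grayscale frame: close the Canny edges so the contours are watertight before filling, then label and count the filled blobs. The sigma, disk radius and min_size values are guesses to tune per image, not values from the question.

from scipy import ndimage as ndi
from skimage import feature, measure, morphology

edges = feature.canny(image, sigma=2)                          # binary edge map
closed = morphology.binary_closing(edges, morphology.disk(3))  # seal small gaps in the contours
filled = ndi.binary_fill_holes(closed)                         # fill the enclosed regions
clean = morphology.remove_small_objects(filled, min_size=50)   # drop speckles
labels, num = measure.label(clean, return_num=True)
print("candidate features:", num)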
I am working on a python program to detect if there is something missing in a picture as compared to another picture. Basically, it should take a picture of a place, save it, and then, whenever we run the program again, take a new picture of the same place from the camera and compare it to the one we saved earlier, to detect whether an object is missing from the picture.
Requirement:
For example, we take a reference photo of the floor with two apples and one orange lying on the ground, then save that picture for future reference.
After that, we take one apple from the floor, so now there is only one apple and one orange on the floor. When we run the program, it should take a photo using a camera and compare it to the one we took earlier. It should detect that a certain area of the picture (where the missing apple was) differs in the new picture and alert the user (print a message). That's the whole idea.
Here is what I achieved so far:
I used ImageChops from the Pillow library. The problem with it is that it detects even very small changes that can occur due to noise, but I want it to detect only major changes, like an object missing from or added to the picture. I would also like to control it with a threshold. I searched the internet but couldn't find anything useful. Here is the code I used:
from PIL import Image, ImageChops
import cv2
import os
import sys
img1 = Image.open('1.jpg')
img2 = Image.open('2.jpg')
diff = ImageChops.difference(img1, img2)
if diff.getbbox():
    print("Difference found in bounding box {}".format(diff.getbbox()))
    diff.show()
else:
    print('No difference found')
I would really appreciate it if someone can point me in the right direction. Thanks
The general problem you're describing is called "feature matching". OpenCV has lots of tooling for this exact purpose: a simple approach is this one https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html, which compares ORB features (basically, generalized corners) between two images. These kinds of features are more robust to lighting and noise than just comparing pixel values directly.
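A minimal sketch of that ORB plus brute-force matching approach, following the linked tutorial (the file names and the distance cutoff below are placeholders, not part of the tutorial):

import cv2

img1 = cv2.imread('1.jpg', cv2.IMREAD_GRAYSCALE)   # reference photo
img2 = cv2.imread('2.jpg', cv2.IMREAD_GRAYSCALE)   # current photo

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance, because ORB descriptors are binary strings
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Reference keypoints with no close match in the new photo point at regions
# where something has changed; the cutoff of 40 is a guess to tune.
good = [m for m in matches if m.distance < 40]
print("{} matches, {} of them good".format(len(matches), len(good)))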
I have tried 3 algorithms:
Comparison with compare_ssim (structural similarity).
Difference detection with PIL (ImageChops.difference).
Image subtraction.
The first algorithm:
from skimage.metrics import structural_similarity as compare_ssim  # compare_ssim in older scikit-image

(score, diff) = compare_ssim(img1, img2, full=True)
diff = (diff * 255).astype("uint8")
The second algorithm:
from PIL import Image, ImageChops

img1 = Image.open("canny1.jpg")
img2 = Image.open("canny2.jpg")
diff = ImageChops.difference(img1, img2)
if diff.getbbox():
    diff.show()
The third algorithm:
image3 = cv2.subtract(image1, image2)
The problem is that these algorithms are very sensitive: if the images have different noise, they consider the two images totally different. Any ideas for fixing that?
These pictures are different in many ways (deformation, lighting, colors, shape) and simple image processing just cannot handle all of this.
I would recommend a higher level method that tries to extract the geometry and color of those tubes, in the form of a simple geometric graph. Then compare the graphs rather than the images.
I acknowledge that this is easier said than done, and will only work with this particular kind of scene.
It is very difficult to help since we don't really know which parameters you can control: can you keep your camera fixed? Will it always be just tubes? What about the tube colors?
Nevertheless, I think what you are looking for is a framework for image registration, and I propose that you use SimpleElastix. It is mainly used for medical images, so you might have to get familiar with the library SimpleITK. What's interesting is that you have a lot of parameters to control the registration. I think you will have to look into the documentation to find out how to control a specific image frequency, the one that creates the waves and deforms the images. In the example below I did not configure it to allow enough local distortion; you'll have to find the best trade-off, but I think it should be flexible enough.
Anyway, you can get the result shown below with the following code. I don't know if it helps; I hope so:
import cv2
import numpy as np
import matplotlib.pyplot as plt
import SimpleITK as sitk
fixedImage = sitk.ReadImage('1.jpg', sitk.sitkFloat32)
movingImage = sitk.ReadImage('2.jpg', sitk.sitkFloat32)
elastixImageFilter = sitk.ElastixImageFilter()

# Affine pre-registration stage, followed by a b-spline (non-rigid) stage
affine_registration_parameters = sitk.GetDefaultParameterMap('affine')
affine_registration_parameters["NumberOfResolutions"] = ['6']
affine_registration_parameters["WriteResultImage"] = ['false']
affine_registration_parameters["MaximumNumberOfSamplingAttempts"] = ['4']

parameterMapVector = sitk.VectorOfParameterMap()
parameterMapVector.append(affine_registration_parameters)
parameterMapVector.append(sitk.GetDefaultParameterMap("bspline"))

elastixImageFilter.SetFixedImage(fixedImage)
elastixImageFilter.SetMovingImage(movingImage)
elastixImageFilter.SetParameterMap(parameterMapVector)
elastixImageFilter.Execute()

registeredImage = elastixImageFilter.GetResultImage()
transformParameterMap = elastixImageFilter.GetTransformParameterMap()

# Difference between the registered second image and the first
resultImage = sitk.Subtract(registeredImage, fixedImage)
resultImageNp = np.sqrt(sitk.GetArrayFromImage(resultImage) ** 2)
cv2.imwrite('gray_1.png', sitk.GetArrayFromImage(fixedImage))
cv2.imwrite('gray_2.png', sitk.GetArrayFromImage(movingImage))
cv2.imwrite('gray_2r.png', sitk.GetArrayFromImage(registeredImage))
cv2.imwrite('gray_diff.png', resultImageNp)
Your first image resized to 256x256:
Your second image:
Your second image registered with the first one:
Here is the difference between the first and second image which could show what's different:
This is one of the classical problems of image processing, and one that does not have a universally valid answer. The possible answers depend highly on what type of images you have and what information you want to extract from them and from the differences between them.
You can reduce noise by two means:
a) Take several images of the same object, such that the object does not change. You can stack (average) the images, and noise is reduced by the square root of the number of images.
b) Run a blur filter over the image. The more you blur, the more noise is averaged; noise is reduced by roughly the square root of the number of pixels you average over, but so is detail in the image.
In both cases (a) and (b), you run the difference analysis after applying the noise reduction, as in the sketch below.
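A rough sketch of both steps with OpenCV and NumPy; the file names, the number of frames and the thresholds are placeholders to adapt to your setup:

import cv2
import numpy as np

# (a) stack several frames of the unchanged scene: noise drops roughly
#     with the square root of the number of frames averaged
frames = [cv2.imread('ref_%d.png' % i, cv2.IMREAD_GRAYSCALE).astype(np.float32)
          for i in range(8)]
reference = np.mean(frames, axis=0)

# (b) blur before differencing: more blur averages away more noise,
#     but also more detail
current = cv2.imread('current.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
diff = cv2.absdiff(cv2.GaussianBlur(reference, (5, 5), 0),
                   cv2.GaussianBlur(current, (5, 5), 0))

changed = diff > 25          # threshold to tune for your camera and lighting
print("changed pixels:", int(changed.sum()))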
Probably not applicable to you, as you likely cannot obtain them, but it also helps if you can get hold of flat fields, which capture the inhomogeneity of illumination and the pixel sensitivity of your camera and allow correcting the images before any processing. The same goes for dark fields, which estimate the influence of the camera's read-out noise and allow correcting for it.
There is also a third, more high-level option: run your object analysis first, at a detailed-enough level, and compare the results.
Here is a cropped example (about 11x9 pixels) of the kind of images (which ultimately are actually all of size 28x28, but stored in memory flattened as a 784-components array) I will be trying to apply the algorithm on:
Basically, I want to be able to recognize when this shape appears (red lines are used to put emphasis on the separation of the pixels, while the surrounding black border is used to better outline the image against the white background of StackOverflow):
Its orientation doesn't matter: it must be detected in any of its possible representations (rotations and reflections along the horizontal and vertical axes), so a 45° rotation or a diagonal reflection should not be considered; only 90°, 180°, and 270° rotations and horizontal/vertical flips.
There are two solutions to be found on that image that I first presented, though only one needs to be found (ignore the gray blur surrounding the white region):
Take this other sample (which also demonstrates that the white figures inside the images aren't always fully surrounded by black pixels):
The function should return True because the shape is present:
Now, there is obviously a simple solution to this:
Use a variable such as pattern = [[1,0,0,0],[1,1,1,1]], produce its variations, and then slide all of the variations along the image until an exact match is found, at which point the whole thing just stops and returns True.
This would, however, in the worst-case scenario, take up to 8*(28-2)*(28-4)*(2*4), which is approximately 40000 operations for a single image, which seems a bit like overkill (if I did my quick calculations right).
I'm guessing one way of making this naive approach better would be to first of all scan the image until I find the very first white pixel, and then start looking for the pattern 4 rows and 4 columns earlier than that point, but even that doesn't seem good enough.
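For reference, this is roughly what that naive approach looks like as NumPy code (has_pattern is just an illustrative name, and the image is assumed to already be a 0/1 array):

import numpy as np

def has_pattern(img, pattern):
    # generate all 8 orientations: 4 rotations of the pattern and of its mirror image
    variants = [np.rot90(p, k) for p in (pattern, np.fliplr(pattern)) for k in range(4)]
    for var in variants:
        ph, pw = var.shape
        for r in range(img.shape[0] - ph + 1):
            for c in range(img.shape[1] - pw + 1):
                if np.array_equal(img[r:r + ph, c:c + pw], var):
                    return True
    return False

pattern = np.array([[1, 0, 0, 0],
                    [1, 1, 1, 1]])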
Any ideas? Maybe this kind of function has already been implemented in some library? I'm looking for an implementation or an algorithm that beats my naive approach.
As a side note, while kind of a hack, I'm guessing this is the kind of problem that can be offloaded to the GPU but I do not have much experience with that. While it wouldn't be what I'm looking for primarily, if you provide an answer, feel free to add a GPU-related note.
EDIT:
I ended up making an implementation of the accepted answer. You can see my code in this Gist.
If you have too many operations, think about how to do fewer of them.
For this problem I'd use image integrals.
If you convolve a summing kernel over the image (a very fast operation, in the FFT domain or with conv2/imfilter), you know that only locations where the integral equals 5 (in your case) are possible pattern-matching places. Checking those (even for all your rotations) should be computationally very fast; there cannot be more than 50 such locations in your example image.
My Python is not too fluent, but here is a proof of concept for your first image in MATLAB; I am sure that translating this code will not be a problem.
% get the same image you have (imgur upscaled it and made it RGB)
I=rgb2gray(imread('https://i.stack.imgur.com/l3u4A.png'));
I=imresize(I,[9 11]);
I=double(I>50);
% Integral filter definition (with your desired size)
h=ones(3,4);
% horizontal and vertical filter (because your filter is not square)
Ifiltv=imfilter(I,h);
Ifilth=imfilter(I,h');
% find the locations where integral is exactly the value you want
[xh,yh]=find(Ifilth==5);
[xv,yv]=find(Ifiltv==5);
% this is just plotting, for completeness
figure()
imshow(I,[]);
hold on
plot(yh,xh,'r.');
plot(yv,xv,'r.');
This results in 14 locations to check. My standard computer takes 230 ns on average to compute both image integrals, which I would call fast.
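Translated to Python, the same proof of concept could look roughly like this; here the summing box is taken as the pattern's 2x4 bounding box rather than the 3x4 kernel above, and img is assumed to be the binary image as a NumPy array:

import numpy as np
from scipy import ndimage as ndi

img = np.asarray(img, dtype=np.uint8)        # the binary image as integers
box = np.ones((2, 4), dtype=np.uint8)        # bounding box of the pattern (5 white pixels)
sum_h = ndi.convolve(img, box, mode='constant', cval=0)
sum_v = ndi.convolve(img, box.T, mode='constant', cval=0)

# only windows whose integral is exactly 5 can contain the pattern;
# those few candidates are then verified with an exact comparison
cand_h = np.argwhere(sum_h == 5)
cand_v = np.argwhere(sum_v == 5)
print(len(cand_h) + len(cand_v), "candidate windows to check")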
Also, GPU computing is not a hack :D. It's the way to go for a whole class of problems because of the enormous computing power GPUs have; convolutions on GPUs, for example, are incredibly fast.
The operation you are implementing is an operator in Mathematical Morphology called hit and miss.
It can be implemented very efficiently as a composition of two erosions. If the shape you’re detecting can be decomposed into a few simple geometrical shapes (especially rectangles are quick to compute) then the operator can be even more efficient.
You'll find very efficient erosions in most image processing libraries; for example, try OpenCV. OpenCV also has a hit-and-miss operator; here is a tutorial on how to use it.
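For reference, a minimal OpenCV sketch of the hit-and-miss idea applied to the 2x4 pattern from the question; the tiny test image below is made up, and in the kernel a 1 means the pixel must be foreground, -1 must be background, and 0 is don't care:

import cv2
import numpy as np

# tiny test image containing the pattern once (foreground = 255)
img = np.array([[0, 0, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 1, 1, 1, 1, 0],
                [0, 0, 0, 0, 0, 0]], dtype=np.uint8) * 255

# hit-and-miss kernel encoding [[1,0,0,0],[1,1,1,1]]
kernel = np.array([[1, -1, -1, -1],
                   [1,  1,  1,  1]], dtype="int")

hit = cv2.morphologyEx(img, cv2.MORPH_HITMISS, kernel)
print(np.argwhere(hit > 0))      # coordinates (near the kernel centre) of the matches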
As an example for what output to expect, I generated a simple test image (left), applied a hit and miss operator with a template that matches at exactly one place in the image (middle), and again with a template that does not match anywhere (right):
I did this in MATLAB, not Python, because I have it open and it's easiest for me to use. This is the code:
se = [1,1,1,1 % Defines the template
0,0,0,1];
img = [0,0,0,0,0,0 % Defines the test image
0,1,1,1,1,0
0,0,0,0,1,0
0,0,0,0,0,0
0,0,0,0,0,0
0,0,0,0,0,0];
img = dip_image(img,'bin');
res1 = hitmiss(img,se);
res2 = hitmiss(img,rot90(se,2));
% Quick-and-dirty display
h = dipshow([img,res1,res2]);
diptruesize(h,'tight',3000)
hold on
plot([5.5,5.5],[-0.5,5.5],'r-')
plot([11.5,11.5],[-0.5,5.5],'r-')
The code above uses the hit-and-miss operator as I implemented it in DIPimage. The same implementation is available in DIPlib's Python bindings as dip.HitAndMiss() (install with pip install diplib):
import diplib as dip
# ...
res = dip.HitAndMiss(img, se)
I am working on a project that requires detecting lines on a plate of sand. The lines are hand-drawn by the user, so they are not exactly "straight" (see photo), and because of the sand the lines are quite hard to distinguish.
I tried cv2.HoughLines from OpenCV but didn't get good results. Any suggestions on the detection method? Suggestions for improving the clarity of the lines are also welcome; I am thinking of putting a few LED lights around the plate.
Thanks
The detecting method depends a lot on how much generality you require: is the exposure and contrast going to change from one image to another? Is the typical width of the lines going to change? In the following I assume that such parameters do not vary much for your application; please correct me if I'm wrong.
I'll be using scikit-image, a common image processing package for Python. If you're not familiar with this package, documentation can be found on http://scikit-image.org/, and the package is bundled with all installations of Scientific Python. However, the algorithms that I use are also available in other tools, like opencv.
My solution is written below. Basically, the principle is
first, denoise the image. Life is usually simpler after a denoising step. Here I use a total variation filter, since it results in a piecewise-constant image that will be easier to threshold. I enhance dark regions using a morphological erosion (on the gray-level image).
then apply an adaptive threshold that varies locally in space, since the contrast varies through the image. This operation results in a binary image.
erode the binary image to break spurious links between regions, and keep only large regions.
compute a measure of the elongation of the regions to keep only the most elongated ones. Here I use the ratio of the eigenvalues of the inertia tensor.
The parameters that are most difficult to tune are the block size for the adaptive thresholding and the minimum size of regions to keep. I also tried a Canny filter on the denoised image (skimage.filters.canny), and the results were quite good, but the edges were not always closed; you might still want to try such an edge-detection method.
The result is shown below:
# Import modules
import numpy as np
from skimage import io, measure, morphology, restoration, filters
from skimage import img_as_float
import matplotlib.pyplot as plt
# Open the image
im = io.imread('sand_lines.png')
im = img_as_float(im)
# Denoising
tv = restoration.denoise_tv_chambolle(im, weight=0.4)
ero = morphology.erosion(tv, morphology.disk(5))
# Threshold the image
binary = ero > filters.threshold_local(ero, 181)  # threshold_adaptive in older scikit-image
# Clean the binary image
binary = morphology.binary_dilation(binary, morphology.disk(8))
clean = morphology.remove_small_objects(np.logical_not(binary), 4000)
labels = measure.label(clean, background=0)  # background is label 0, matching the leading 0 below
# Keep only elongated regions
props = measure.regionprops(labels)
eigvals = np.array([prop.inertia_tensor_eigvals for prop in props])
eigvals_ratio = eigvals[:, 1] / eigvals[:, 0]
eigvals_ratio = np.concatenate(([0], eigvals_ratio))
color_regions = eigvals_ratio[labels]
# Plot the result
plt.figure()
plt.imshow(color_regions, cmap='nipy_spectral')  # 'spectral' in older matplotlib
I am trying to convert a color image to pure black and white. I looked around for some code to do this and settled on:
im = Image.open("mat.jpg")
gray = im.convert('L')
bw = gray.point(lambda x: 0 if x<128 else 255, '1')
bw.save("result_bw.jpg")
However, the result still has grays!
So, I tried to do it myself:
floskel = Image.open("result_bw.jpg")
flopix = floskel.load()
for i in range(0, floskel.size[0]):
    for j in range(0, floskel.size[1]):
        print(flopix[i, j])
        if flopix[i, j] > 100:
            flopix[i, j] = 255
        else:
            flopix[i, j] = 0
But, STILL, there are grays in the image.
Am I doing something wrong?
As sebdelsol mentioned, it's much better to use im.convert('1') directly on the colour source image. The standard PIL "dither" is Floyd-Steinberg error diffusion, which is generally pretty good (depending on the image), but there are a variety of other options, e.g. random dither and ordered dither, although you'd have to code them yourself, so they'd be quite a bit slower.
The conversion in the code in the OP is just simple thresholding, which generally loses a lot of detail, although it's easy to write. But I guess in this case you were just trying to confirm your theory about grey pixels being present in the final image. As sebdelsol said, it just looks like there are grey pixels, due to "noise": regions containing a lot of black and white pixels mixed together, which you should be able to verify if you zoom into the image.
FWIW, if you do want to do your own pixel-by-pixel processing of whole images it's more efficient to get a list of pixels using im.getdata() and put them back into an image with im.putdata(), rather than doing that flopix[i,j] stuff. Of course, if you don't need to know coordinates, algorithms that use im.point() are usually pretty quick.
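As a small illustration of that, the thresholding from the question could be written with getdata()/putdata() or point() roughly like this (a sketch, not drop-in code for your files):

from PIL import Image

im = Image.open("mat.jpg").convert('L')

# threshold via getdata()/putdata() instead of indexing pixels one by one
pixels = [255 if p >= 128 else 0 for p in im.getdata()]
bw = Image.new('L', im.size)
bw.putdata(pixels)
bw.save("result_bw.png")

# the same threshold with point(), usually faster still
bw2 = im.point(lambda p: 255 if p >= 128 else 0).convert('1')
bw2.save("result_bw2.png")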
Finally, JPEG isn't really suitable for B&W images, it was designed for images with (mostly) continuous tone. Try saving as PNG; the resulting files will probably be a lot smaller than the equivalent JPEGs. It's possible to reduce JPEG file size by saving with low quality settings, but the results generally don't look very good.
You'd be better off using convert to produce a mode '1' image directly. It is faster and better, since it uses dithering by default.
bw = im.convert('1')
The greys you see probably appear in the parts of the image with noise near the 128 level, which produces high-frequency black-and-white patterns that look grey.