Recognizing black and white images with OpenCV - python

I have this set of images:
The leftmost one is the reference image.
I want a value that tells me how close each of the other images is to the leftmost one.
I experimented with matchShapes(), calling it for each contour and averaging the values, but I didn't get a useful result (the rightmost image got too high a value, for example).
I would also like the matching to work only in the correct orientation.

If they're purely black and white images it would probably be easier to just AND the two pictures together and sum up the total pixels left in the result.
Something like this:
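import cv2
import numpy as np

# two 100x100 test images, each with one diagonal line
x = np.zeros((100, 100), np.uint8)
y = np.zeros((100, 100), np.uint8)
for i in range(25, 75):
    x[i][i] = 255
    y[i][100 - i] = 255
cv2.imshow('x', x)
cv2.imshow('y', y)
cv2.waitKey(0)  # keep the windows open until a key is pressed

# pixels that are white in both images
z = cv2.bitwise_and(x, y)

# count the white pixels that survived the AND
score = 0  # named score to avoid shadowing the built-in sum()
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        if z[i][j] == 255:
            score += 1
print(f"Similarity Score: {score}")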
There probably exists some better library to perform this all in one line but if performance isn't much of a concern perhaps this could work.
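The pixel loop above can probably be collapsed into a single NumPy call. This is a minimal sketch, assuming both images are same-sized arrays in which any non-zero pixel counts as white; the helper name `overlap_score` is made up for illustration.

```python
import numpy as np

def overlap_score(a, b):
    """Count pixels that are non-zero in both images (hypothetical helper)."""
    return int(np.count_nonzero(np.logical_and(a, b)))

# same test images as above: two opposite diagonals
x = np.zeros((100, 100), np.uint8)
y = np.zeros((100, 100), np.uint8)
for i in range(25, 75):
    x[i, i] = 255
    y[i, 100 - i] = 255

score = overlap_score(x, y)
print(f"Similarity Score: {score}")
```

The two diagonals only cross at one pixel, so the score is small, while comparing an image with itself returns its full white-pixel count.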

The difficult part was rejecting images that were too different. With the methods proposed here, I always got very close values for images that I considered too different to be a match.
In the end, I used a multistep process.
First I got the contours of the test image like so:
testContours, _ = cv.findContours(testImage, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
Then, if the test image and the original image do not have the same number of contours, I abort.
If they have the same number of contours, I calculate the average of the shape distances between corresponding contours:
distances = []
sd = cv.createShapeContextDistanceExtractor()
for i in range(len(testContours)):
    d2 = sd.computeDistance(testContours[i], originalContours[i])
    distances.append(d2)  # collect each distance so the average below is defined
value = sum(distances) / len(distances)
Then I count the white pixels left after AND-ing the two images and divide by the number of white pixels in the original image (in case the contours match but are not placed correctly):
exactly_placed_ratio = cv.countNonZero(cv.bitwise_and(testImage, originalImage)) / cv.countNonZero(originalImage)
In the end I have two values: the first tells me whether the shapes are close enough, and the second whether they are in the right position relative to the whole image.
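How the two values might be combined into a yes/no decision can be sketched as follows. This uses NumPy only; `placement_ratio`, `is_match` and both threshold values are illustrative assumptions, not part of the answer, and the shape distance is passed in as a plain number.

```python
import numpy as np

def placement_ratio(test, reference):
    """Fraction of the reference's white pixels that are also white in the test image."""
    ref_white = np.count_nonzero(reference)
    if ref_white == 0:
        return 0.0
    return np.count_nonzero(np.logical_and(test, reference)) / ref_white

def is_match(shape_distance, ratio, max_distance=0.5, min_ratio=0.8):
    # both thresholds are illustrative guesses and would need tuning on real data
    return shape_distance <= max_distance and ratio >= min_ratio

reference = np.zeros((10, 10), np.uint8)
reference[2:6, 2:6] = 255
shifted = np.roll(reference, 2, axis=1)  # same shape, moved 2 pixels to the right

ratio_same = placement_ratio(reference, reference)
ratio_shifted = placement_ratio(shifted, reference)
print(ratio_same, ratio_shifted)
```

The shifted copy has an identical shape but a lower placement ratio, which is the case the second value is meant to catch.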


How to delete or clear contours from image?

I'm working with license plates. I apply a series of filters to them, such as:
The problem is that after doing this, some contours remain at the borders, like in this image. How can I clear them, or just make them black (masked)? I used this code, but sometimes it fails.
# invert image and detect contours
inverted = cv2.bitwise_not(image_binary_and_dilated)
contours, hierarchy = cv2.findContours(inverted,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
# get the biggest contour
biggest_index = -1
biggest_area = -1
i = 0
for c in contours:
    area = cv2.contourArea(c)
    if area > biggest_area:
        biggest_area = area
        biggest_index = i
    i = i+1
print("biggest area: " + str(biggest_area) + " index: " + str(biggest_index))
cv2.drawContours(image_binary_and_dilated, contours, biggest_index, [0,0,255])
center, size, angle = cv2.minAreaRect(contours[biggest_index])
rot_mat = cv2.getRotationMatrix2D(center, angle, 1.)
dst = cv2.warpAffine(inverted, rot_mat, (int(size[0]), int(size[1])))
mask = dst * 0
x1 = max([int(center[0] - size[0] / 2)+1, 0])
y1 = max([int(center[1] - size[1] / 2)+1, 0])
x2 = int(center[0] + size[0] / 2)-1
y2 = int(center[1] + size[1] / 2)-1
point1 = (x1, y1)
point2 = (x2, y2)
cv2.rectangle(dst, point1, point2, [0,0,0])
cv2.rectangle(mask, point1, point2, [255,255,255], cv2.FILLED)
masked = cv2.bitwise_and(dst, mask)
Some results (images omitted). The original plates were: Good results 1 to 4 and Bad results 1 and 2.
The binary plates are: Images 1 to 4, Image 5 (Bad result 1) and Image 6 (Bad result 2).
How can I fix this code? I just want to avoid those bad results, or at least improve them.
What you are asking starts to become complicated, and I believe there is no longer a right or wrong answer, just different ways to do this. Almost all of them will yield both positive and negative results, most likely in different ratios. Getting a 100% positive result is quite a challenging task, and I do not believe my answer reaches it. Yet it can be the basis for more sophisticated work towards that goal.
So, I want to make a different proposal here.
I am not 100% sure why you are doing all those steps, and I believe some of them could be unnecessary.
Let's start from the problem: you want to remove the white parts on the borders (which are not letters or numbers).
So we need an idea of how to distinguish them from the characters in order to tackle them correctly.
If we just find contours and warp, it is likely to work on some images and not on others, because they do not all look the same. Finding a general solution that works for many images is the hardest part of the problem.
What is the difference between the characteristics of the characters and those of the borders (and other small points)?
After thinking about it, I would say: the shapes! That is, if you imagine a bounding box around a letter or number, it looks like a rectangle whose size is related to the image size. The borders, in contrast, are usually very long and narrow, and random points are too small to be considered a letter or number.
Therefore, my bet would be on segmentation, separating the features by their shape. We take the binary image, remove some parts using its projections onto the axes (as you correctly asked in your previous question, and which I believe we should use), and get an image where each letter is separated from the white borders.
Then we can segment the image and check the shape of each segmented object: if we think it is a letter, we keep it; otherwise we discard it.
I wrote the code below as an example on your data. Some of the parameters are tuned on this set of images, so they may have to be relaxed for a larger dataset.
import cv2
import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage as ndimage
# %matplotlib inline  (only needed inside a Jupyter notebook)

# do this for all the images
num_images = 6
for k in range(num_images):
    # read the image
    binary_image = cv2.imread("binary_image/img{}.png".format(k), cv2.IMREAD_GRAYSCALE)
    rows, cols = binary_image.shape
    # just for visualization purposes, I create another image with the same shape, to show what I am doing
    new_intermediate_image = binary_image.copy()
    # here we will copy only the cleaned parts
    new_cleaned_image = np.zeros(binary_image.shape, np.uint8)

    # mean intensity of each column
    h_projection = binary_image.sum(axis=0) / rows
    threshold_h = (np.max(h_projection) - np.min(h_projection)) / 10
    print("we will use threshold {} for horizontal".format(threshold_h))
    # select the black areas
    black_areas_horizontal = np.where(h_projection < threshold_h)
    for j in black_areas_horizontal:
        new_intermediate_image[:, j] = 0

    # mean intensity of each row
    v_projection = binary_image.sum(axis=1) / cols
    threshold_v = (np.max(v_projection) - np.min(v_projection)) / 10
    print("we will use threshold {} for vertical".format(threshold_v))
    black_areas_vertical = np.where(v_projection < threshold_v)
    for j in black_areas_vertical:
        new_intermediate_image[j, :] = 0

    # define the features we are looking for
    # these parameters can also be tuned
    min_width = cols / 14
    max_width = cols / 2
    min_height = rows / 5
    max_height = rows
    print("we look for feature with width in [{},{}] and height in [{},{}]".format(min_width, max_width, min_height, max_height))

    # segment the image
    labeled_array, num_features = ndimage.label(new_intermediate_image)
    # loop over all features found; labels start at 1 because 0 is the background
    for i, (slice_y, slice_x) in enumerate(ndimage.find_objects(labeled_array), start=1):
        # size of the bounding box around the feature
        height = slice_y.stop - slice_y.start
        width = slice_x.stop - slice_x.start
        # check the shape, if the bounding box is what we expect, copy it to the new image
        if min_height < height < max_height and min_width < width < max_width:
            new_cleaned_image[labeled_array == i] = 255
    # print all images on a grid
That produces the following output (in the grid, the left images are the inputs, the central ones are the images after masking based on the histogram projections, and the right ones are the cleaned images):
As said above, this method does not yield 100% positive results. The last picture has lower quality and some parts are disconnected, so they are lost in the process. I personally believe this is a price worth paying for cleaner images; if you have a lot of images it won't be a problem, and you can discard images of that kind. Overall, I think this method returns quite clean images, where everything that is not a letter or number is correctly removed.
The image is clean: nothing other than letters or numbers is kept.
The parameters can be tuned, and should be consistent across images.
In case of problems, adding some prints or debugging to the loop that chooses which features to keep should make it easier to see where the problem is and correct it.
It may fail in cases where letters and numbers touch the white borders, which seems quite possible. This is handled by the black areas created using the projections, but I am not confident it will work 100% of the time.
Some small parts of the numbers can be lost during the process, as in the last picture.
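To see why the projection step helps when a character touches a border, here is a small NumPy-only sketch on a synthetic plate. The image, the bridge, and the simplified segmentation (runs of occupied columns instead of full connected-component labeling) are assumptions made for illustration; the thresholds follow the same rules as the code above.

```python
import numpy as np

# synthetic 20x40 binary "plate" (made-up data)
plate = np.zeros((20, 40), np.uint8)
plate[:, 0] = 255            # border line along the left edge
plate[5:16, 10:17] = 255     # character-sized block
plate[10, 1:10] = 255        # thin bridge joining border and character

# mean intensity per column, thresholded at a tenth of its range
h_projection = plate.sum(axis=0) / plate.shape[0]
threshold_h = (h_projection.max() - h_projection.min()) / 10
cut = plate.copy()
cut[:, h_projection < threshold_h] = 0   # sparse columns (the bridge) go black

# simplified segmentation: split the occupied columns into runs
occupied = cut.any(axis=0)
padded = np.concatenate(([0], occupied.astype(np.int8), [0]))
edges = np.flatnonzero(np.diff(padded))
runs = list(zip(edges[::2], edges[1::2]))   # (start, stop) column pairs

# keep only runs whose width is plausible for a character
min_width, max_width = plate.shape[1] / 14, plate.shape[1] / 2
kept = [(int(s), int(e)) for s, e in runs if min_width < e - s < max_width]
print(kept)
```

Once the bridge is cut, the border becomes a separate one-column run that fails the width test, while the character block passes it.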

How to remove a rough line artifact from image after binarization

I am stuck on a problem where I want to separate an object from the background (a semi-transparent white sheet with a backlight). A fixed rough line is introduced in the background and merges with the object. My current algorithm is: take the image from the camera, smooth it with a Gaussian blur, extract the Value component from HSV, and apply local binarization using the Wolf method to get a binarized image. Then, using the OpenCV connected component algorithm, I remove some small artifacts that are not connected to the object, as seen here. Now only this line artifact remains, merged with the object, but I want only the object, as seen in this image. Please note that there are 2 lines in the binary image, so I think using 8-connectivity logic to detect lines that do not form a loop is not possible; I also tried it. Here is the code for that:
size = np.size(thresh_img)
skel = np.zeros(thresh_img.shape,np.uint8)
element = cv2.getStructuringElement(cv2.MORPH_RECT,(3,3))
done = False
while( not done):
    eroded = cv2.erode(thresh_img,element)
    temp = cv2.dilate(eroded,element)
    temp = cv2.subtract(thresh_img,temp)
    skel = cv2.bitwise_or(skel,temp)
    thresh_img = eroded.copy()
    zeros = size - cv2.countNonZero(thresh_img)
    if zeros==size:
        done = True
# set max pixel value to 1
s = np.uint8(skel > 0)
count = 0
i = 0
while count != np.sum(s):
    # non-zero pixel count
    count = np.sum(s)
    # examine 3x3 neighborhood of each pixel
    filt = cv2.boxFilter(s, -1, (3, 3), normalize=False)
    # if the center pixel of 3x3 neighborhood is zero, we are not interested in it
    s = s*filt
    # now we have pixels where the center pixel of 3x3 neighborhood is non-zero
    # if a pixels' 8-connectivity is less than 2 we can remove it
    # threshold is 3 here because the boxfilter also counted the center pixel
    s[s < 1] = 0
    # set max pixel value to 1
    s[s > 0] = 1
    i = i + 1
Any help in the form of code would be highly appreciated thanks.
Since you are already using connectedComponents, the best way is to exclude not only the components that are small, but also the ones that touch the borders of the image.
You can find out which ones to discard using connectedComponentsWithStats(), which also gives you the bounding box of each component.
Alternatively, and very similarly, you can switch from connectedComponents() to findContours(), which gives you the contours of the components directly, so you can discard the ones at the border and the small ones to retrieve the part you are interested in.

How to separate the photoshopped part from the rest of the image if the original image is given?

I'm working on a project to detect spliced (photoshopped) images and want 128x128 patches at the boundaries of the forged regions. I have the authentic background image and the forged one.
If I simply take the difference in pixel values and apply a threshold to get a binary image, I get a lot of noise (small black patches in the white part and vice versa), which cv2.medianBlur() does not remove effectively.
I'm assuming that this is because of the different compression factors of the images before and after splicing. Also, some pixels in the spliced part are similar to the corresponding pixels in the authentic image.
So I replaced the normal cv2.threshold() with my own function, which adds the values of the 4-connected neighbours of each pixel and compares the sum with a threshold value.
This is my threshold function:
def threshold(image, thresh):
    # convert to int so that summing five uint8 values cannot overflow
    b, g, r = (c.astype(int) for c in cv2.split(image))
    res = np.zeros(b.shape, np.uint8)
    # Not considering boundary pixels for the binary image
    for i in range(1, b.shape[0]-1):
        for j in range(1, b.shape[1]-1):
            sumb = b[i][j] + b[i+1][j] + b[i-1][j] + b[i][j+1] + b[i][j-1]
            sumg = g[i][j] + g[i+1][j] + g[i-1][j] + g[i][j+1] + g[i][j-1]
            sumr = r[i][j] + r[i+1][j] + r[i-1][j] + r[i][j+1] + r[i][j-1]
            res[i][j] = 255 if sumb<=5*thresh or sumg<=5*thresh or sumr<=5*thresh else 0
    return res
This does give better results but not as good as expected.
For example, this is an authentic image:
This is the spliced image:
This is the thresholded image (I found that thresh=2 was the optimal value):
I tried to remove small components by removing components with few white pixels using connectedComponentsWithStats().
These are the borders after removing small connected components:
while the expected image is:
I could increase the minimum number of pixels required for each component but there are images in my dataset where the forged part is small.
How can I get better results than this?
Also, is it possible to optimize my threshold function? Right now it takes at least 2 seconds to process one image!
[I don't have OpenCV installed on my computer right now, so I looked at your images in MATLAB instead. The Python code below is not tested.]
Because your images are identical except where purposeful changes were made (there is no scaling or translations to take into account), one can simply subtract the two images and look at the difference:
res = cv2.absdiff(authentic_image, spliced_image)  # the two input images; variable names are placeholders
If you display this (with some contrast stretch) you'll see:
As you can see, at least one of the channels has a strong difference in the "spliced" region. Outside of it there are some very light dots, caused by the lossy compression.
Let's take the maximum over R, G, B for each pixel:
res = np.amax(res, axis=2) # (I think OpenCV stores the channels in the 3rd dimension?)
I found that most of the compression artefacts are below 15, so let's threshold there:
res = np.uint8(res > 15) * 255
Finally, apply your cv2.medianBlur() to remove the last bits of noise (it needs a uint8 image, hence the conversion above). You could also try applying GaussianBlur() before the threshold.
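The difference, channel maximum and threshold steps can be tried without OpenCV. Below is a NumPy-only sketch on synthetic data; the helper name `changed_mask`, the simulated compression noise of at most 5 grey levels, and the pasted patch are all assumptions, and 15 is the noise level quoted above.

```python
import numpy as np

def changed_mask(authentic, spliced, noise_level=15):
    """Boolean mask of pixels whose largest per-channel difference exceeds noise_level."""
    # int16 avoids uint8 wrap-around when subtracting
    diff = np.abs(authentic.astype(np.int16) - spliced.astype(np.int16))
    return diff.max(axis=2) > noise_level

rng = np.random.default_rng(0)
authentic = rng.integers(50, 200, size=(64, 64, 3), dtype=np.uint8)
# simulate mild compression noise everywhere (made-up model)
noise = rng.integers(-5, 6, size=authentic.shape)
spliced = (authentic.astype(np.int16) + noise).astype(np.uint8)
# paste a "forged" 20x20 patch
spliced[10:30, 10:30] = 255

mask = changed_mask(authentic, spliced)
print(mask.sum())
```

Because the whole image is processed with array operations rather than per-pixel Python loops, this style should also be much faster than the loop-based threshold function in the question.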

How to remove moving objects to obtain background only?

I'm fairly new to Python and don't have much knowledge about it. I need help converting this pseudocode into Python. It is written to obtain the background by removing moving objects from the images. Regarding the pseudocode, I don't understand lines 3, 4 and 5, so maybe once it's converted into Python I can understand it better. In lines 3 and 4, I don't understand what the & does, and in the last line, I don't understand how it is even computing an image.
Any help will be appreciated.
The code is provided below:
Mat sequence[3];// the sequence of images to loop through
Mat output, x = 0, y = 0; // looping through the sequence
matchTemplate(sequence[i], sequence[i+1], output, CV_TM_CCOEFF_NORMED)
mask = 1 & (output>0.9) // get correlated part amongst the images
x += sequence[i] & mask + sequence[i+1] & mask; // accumulate background infer
y += 2*mask; // keep count
end of loop;
Mat bg = x.mul(1.0/y); // average background
Sample images to try are also provided below:
I'm not very familiar with OpenCV, so I hope you'll excuse me if I don't provide a code snippet you can just copy and paste. But if I understand the pseudocode correctly, it is doing this:
sequence = list of images
x will hold the sum of backgrounds
y will hold the number of frames used to build x
for each index i in sequence:
    c = matrix of correlation coefficients between (sequence[i], sequence[i+1]) from matchTemplate
    mask = pixels that are highly correlated (90%+)
    x += actual pixels from sequence[i] & mask and sequence[i+1] & mask that are considered background
    y += 2 for every pixel in mask
bg = average of background images x / number of frames y
So what's happening is, for every pair of images, it marks the pixels that are the same in both images. The assumption is that background doesn't change between adjacent frames and foreground does. Whether pixels are "the same" is judged on the basis of correlation being >90%. Then it takes all the marked pixels, and averages them.
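The accumulate-and-average idea can be sketched in NumPy. This is not a translation of the pseudocode: it swaps the matchTemplate correlation test for a simple per-pixel absolute difference, assumes grayscale frames of equal size, and the function name and tolerance `tol` are made up for illustration.

```python
import numpy as np

def estimate_background(frames, tol=10):
    """Average the pixels that agree between adjacent frames."""
    frames = [f.astype(np.float64) for f in frames]
    total = np.zeros_like(frames[0])   # plays the role of x
    count = np.zeros_like(frames[0])   # plays the role of y
    for a, b in zip(frames, frames[1:]):
        mask = np.abs(a - b) <= tol    # pixels considered unchanged
        total += (a + b) * mask        # keep both frames' values where they agree
        count += 2 * mask
    # pixels that never agreed in any pair are left at 0
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

# synthetic sequence: a bright 2x2 object moving across a flat background
bg = np.full((8, 8), 100, np.uint8)
frames = []
for col in (0, 2, 4, 6):
    f = bg.copy()
    f[2:4, col:col + 2] = 255
    frames.append(f)

result = estimate_background(frames)
print(result[2, 2])
```

Multiplying by the boolean mask is what the `&` in the pseudocode appears to stand for: it zeroes out pixels that are not considered background before they are added to the running sum.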
As one of the commenters mentioned, the mean of the images does remove the foreground, but the entire image becomes a little faded. Here is the code that does that:
import skimage.io as io  # assumed: the module name was lost from this line; skimage.io provides imread
import numpy as np
import matplotlib.pyplot as plt

cim1 = io.imread('')
cim2 = io.imread('')
cim3 = io.imread('')
x, y, z = cim1.shape
newimage = np.copy(cim1)
# average the three images pixel by pixel, one channel at a time
for row in range(x):
    for col in range(y):
        r = np.mean([cim1[row][col][0], cim2[row][col][0], cim3[row][col][0]]).astype(int)
        g = np.mean([cim1[row][col][1], cim2[row][col][1], cim3[row][col][1]]).astype(int)
        b = np.mean([cim1[row][col][2], cim2[row][col][2], cim3[row][col][2]]).astype(int)
        newimage[row][col] = [r, g, b]
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(newimage)
The output image I get from this:
A better approach to this problem is to take the median of the three images. The more images you feed the algorithm, the better the background. Here is the snippet I tried (just replacing mean with median). If you have more images you can get a much more accurate result.
x, y, z = cim1.shape
newimage = np.copy(cim1)
# take the per-pixel median of the three images, one channel at a time
for row in range(x):
    for col in range(y):
        r = np.median([cim1[row][col][0], cim2[row][col][0], cim3[row][col][0]]).astype(int)
        g = np.median([cim1[row][col][1], cim2[row][col][1], cim3[row][col][1]]).astype(int)
        b = np.median([cim1[row][col][2], cim2[row][col][2], cim3[row][col][2]]).astype(int)
        newimage[row][col] = [r, g, b]
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(newimage)
The final output:
If you have more images, you can remove the foreground completely. I hope this gives you an idea you can build upon.
My code assumes all your images have the same dimensions. The solution will be a bit more complicated if you captured the images from different viewpoints. In that case you may have to use a template matching algorithm (your pseudocode seems to be doing something similar) to extract the common canvas from your images.
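The per-pixel loops can likely be replaced by stacking the frames and taking the median along the new axis. A minimal sketch, assuming all frames have the same shape and dtype; the function name `median_background` and the synthetic frames are illustrative.

```python
import numpy as np

def median_background(images):
    """Per-pixel, per-channel median over a list of equally sized images."""
    stack = np.stack(images, axis=0)          # shape: (n_images, rows, cols, channels)
    return np.median(stack, axis=0).astype(images[0].dtype)

# synthetic frames: a red 2x2 object at a different place in each frame
background = np.full((6, 6, 3), 120, np.uint8)
frames = []
for col in (0, 2, 4):
    frame = background.copy()
    frame[1:3, col:col + 2] = (255, 0, 0)
    frames.append(frame)

recovered = median_background(frames)
print(np.array_equal(recovered, background))
```

Since every pixel is covered by the object in at most one of the three frames, the median picks the background value everywhere. This works for any number of frames and avoids the slow Python loops.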

Counting particles using image processing in python

Is there any good algorithm for detecting particles on a changing background intensity?
For example, if I have the following image:
Is there a way to count the small white particles, even with the clearly different background that appears towards the lower left?
To be a little more clear, I would like to label the image and count the particles with an algorithm that finds these particles to be significant:
I have tried many things with the PIL, cv, scipy, numpy, etc. modules.
I got some hints from this very similar SO question, and it appears at first glance that you could take a simple threshold like so:
import mahotas
from scipy import ndimage

im = mahotas.imread('particles.jpg')
T = mahotas.thresholding.otsu(im)
labeled, nr_objects = ndimage.label(im > T)
print(nr_objects)
but because of the changing background you get this:
I have also tried other ideas, such as a technique I found for measuring paws, which I implemented in this way:
import numpy as np
import scipy
import pylab
import pymorph
import mahotas
from scipy import ndimage
import cv

def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """
    # define an 8-connected neighborhood
    neighborhood = ndimage.generate_binary_structure(2, 2)
    # apply the local maximum filter; all pixels of maximal value
    # in their neighborhood are set to 1
    local_max = ndimage.maximum_filter(image, footprint=neighborhood) == image
    # local_max is a mask that contains the peaks we are
    # looking for, but also the background.
    # In order to isolate the peaks we must remove the background from the mask.
    # we create the mask of the background
    background = (image == 0)
    # a little technicality: we must erode the background in order to
    # successfully subtract it from local_max, otherwise a line will
    # appear along the background border (artifact of the local maximum filter)
    eroded_background = ndimage.binary_erosion(background, structure=neighborhood, border_value=1)
    # we obtain the final mask, containing only peaks,
    # by removing the background from the local_max mask
    # (boolean arrays cannot be subtracted, so mask the background out instead)
    detected_peaks = local_max & ~eroded_background
    return detected_peaks

im = mahotas.imread('particles.jpg')
imf = ndimage.gaussian_filter(im, 3)
#rmax = pymorph.regmax(imf)
detected_peaks = detect_peaks(imf)
pylab.imshow(pymorph.overlay(im, detected_peaks))
but this gives no luck either, showing this result:
Using the regional max function, I get images which almost appear to be giving correct particle identification, but there are either too many, or too few particles in the wrong spots depending on my gaussian filtering (images have gaussian filter of 2,3, & 4):
Also, it would need to work on images similar to this as well:
This is the same type of image above, just at a much higher density of particles.
EDIT: Solution: I was able to get a decent working solution to this problem using the following code:
import cv2
import pylab
from scipy import ndimage

im = cv2.imread('particles.jpg')
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5,5), 0)
maxValue = 255
# the two settings below were missing from the listing; these values are an assumption
adaptiveMethod = cv2.ADAPTIVE_THRESH_GAUSSIAN_C
thresholdType = cv2.THRESH_BINARY
blockSize = 5 #odd number like 3,5,7,9,11
C = -3 # constant to be subtracted
im_thresholded = cv2.adaptiveThreshold(gray, maxValue, adaptiveMethod, thresholdType, blockSize, C)
labelarray, particle_count = ndimage.label(im_thresholded)
print(particle_count)
This will show the images like this:
(which is the given image)
(which is the counted particles)
and calculate the particle count as 60.
I solved the "variable brightness in background" problem by using a tuned difference threshold with a technique called Adaptive Contrast. It works by computing a linear combination (a difference, in this case) of a grayscale image with a blurred version of itself, then applying a threshold to it.
1. Convolve the image with a suitable statistical operator.
2. Subtract the convolved image from the original, correcting intensity scale/gamma if necessary.
3. Threshold the difference image with a constant.
(original paper)
I did this very successfully with scipy.ndimage, in the floating-point domain (way better results than integer image processing), like this:
import numpy
import scipy.ndimage

original_grayscale = numpy.asarray(some_PIL_image.convert('L'), dtype=float)
blurred_grayscale = scipy.ndimage.gaussian_filter(original_grayscale, blur_parameter)
difference_image = original_grayscale - (multiplier * blurred_grayscale)
image_to_be_labeled = ((difference_image > threshold) * 255).astype('uint8') # not sure if it is necessary
labelarray, particle_count = scipy.ndimage.label(image_to_be_labeled)
Hope this helps!!
I cannot really give a definite answer, but here are a few pointers:
The function mahotas.morph.regmax might be better than the maximum filter as it removes pseudo-maxima. Perhaps combine this with a global threshold, with a local threshold (such as the mean over a window) or both.
If you have several images and the same uneven background, then maybe you can compute an average background and normalize against that, or use empty images as your estimate of background. This would be the case if you have a microscope, and like every microscope I've seen, the illumination is uneven.
Something like:
average = average_of_many(images)
# smooth it
average = mahotas.gaussian_filter(average,24)
Now you preprocess your images, like:
preproc = image/average
or something like that.
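A small NumPy sketch of why dividing by a background estimate helps. The illumination gradient, the particle positions, the 1.25 threshold and the helper name `normalize_by_background` are all made up for illustration, and the background is assumed to be known exactly here, whereas in practice it would come from averaging many images or from an empty frame.

```python
import numpy as np

def normalize_by_background(image, background, eps=1e-6):
    """Divide out the illumination so a flat background maps to about 1.0."""
    return image.astype(np.float64) / np.maximum(background.astype(np.float64), eps)

# uneven illumination: a left-to-right brightness gradient
illumination = np.linspace(50, 200, 32)[np.newaxis, :].repeat(32, axis=0)
image = illumination.copy()
image[10, 5] *= 1.5      # particle in the dark region
image[20, 28] *= 1.5     # particle in the bright region

flat = normalize_by_background(image, illumination)
particles = flat > 1.25  # one global threshold now works everywhere
print(np.count_nonzero(particles))
```

On the raw image the particle in the dark region is dimmer than the plain background in the bright region, so no single global threshold separates both particles; after normalization both stand out by the same factor.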

