How to remove moving objects to obtain background only? - python

I'm fairly new to Python and don't know much about it. I need help converting this pseudocode, which obtains the background by removing moving objects from a sequence of images, into Python. I don't understand lines 3, 4 and 5 of the pseudocode, so maybe I'll understand it better once it's converted. In lines 3 and 4, I don't understand what the & does, and in the last line, I don't understand how it computes an image at all.
Any help will be appreciated.
The code is provided below:
Mat sequence[3];// the sequence of images to loop through
Mat output, x = 0, y = 0; // looping through the sequence
matchTemplate(sequence[i], sequence[i+1], output, CV_TM_CCOEFF_NORMED)
mask = 1 & (output>0.9) // get correlated part amongst the images
x += sequence[i] & mask + sequence[i+1] & mask; // accumulate background infer
y += 2*mask; // keep count
end of loop;
Mat bg = x.mul(1.0/y); // average background
Sample images to try are also provided below:
image1
image2
image3

I'm not very familiar with OpenCV, so I hope you'll excuse me if I don't provide a code snippet you can just copy and paste. But if I understand the pseudocode correctly, it is doing this:
sequence = list of images
x will hold sum of backgrounds
y will hold the number of frames used to build x
for each index i in sequence:
    c = matrix of correlation coefficients between (sequence[i], sequence[i+1]) from matchTemplate
    mask = pixels that are highly correlated (90%+)
    x += actual pixels from sequence[i] & mask and sequence[i+1] & mask that are considered background
    y += 2 for every pixel in mask
bg = average of background images x / number of frames y
So what's happening is, for every pair of images, it marks the pixels that are the same in both images. The assumption is that background doesn't change between adjacent frames and foreground does. Whether pixels are "the same" is judged on the basis of correlation being >90%. Then it takes all the marked pixels, and averages them.
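To make the idea concrete, here is a minimal NumPy sketch of the same accumulate-and-average scheme. It is not a translation of the matchTemplate call: as an assumption, "the same" is judged by a per-pixel absolute difference below a tolerance instead of a correlation score. The function name estimate_background and the max_diff parameter are made up for illustration.

```python
import numpy as np

def estimate_background(frames, max_diff=10):
    """Average only the pixels that agree between adjacent frames.

    frames: list of same-sized colour images, each of shape (H, W, 3).
    max_diff: hypothetical tolerance for calling two pixels "the same".
    """
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    x = np.zeros_like(frames[0])        # running sum of background pixels
    y = np.zeros(frames[0].shape[:2])   # number of samples summed per pixel
    for a, b in zip(frames, frames[1:]):
        # mask is True where the two frames agree in every colour channel
        mask = np.abs(a - b).max(axis=2) <= max_diff
        x += (a + b) * mask[..., None]  # plays the role of "& mask"
        y += 2 * mask                   # two samples were added per masked pixel
    y = np.maximum(y, 1)                # pixels that never agreed stay black
    return (x / y[..., None]).astype(np.uint8)
```

Multiplying by the boolean mask zeroes out the pixels that changed between the two frames, which is what the & in the pseudocode is meant to do. Dividing the sum x by the count y then gives a per-pixel average over only the frames in which that pixel looked like background.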

As one of the commenters mentioned, taking the mean of the images does remove the foreground, but the whole image becomes a little faded. Here is the code that does that:
import skimage.io as io
import numpy as np
import matplotlib.pyplot as plt
cim1 = io.imread('https://i.stack.imgur.com/P44wT.jpg')
cim2 = io.imread('https://i.stack.imgur.com/wU4Yt.jpg')
cim3 = io.imread('https://i.stack.imgur.com/yUbB6.jpg')
x,y,z = cim1.shape
newimage = np.copy(cim1)
# average the three images pixel by pixel, one colour channel at a time
for row in range(x):
    for col in range(y):
        r = np.mean([cim1[row][col][0],cim2[row][col][0],cim3[row][col][0]]).astype(int)
        g = np.mean([cim1[row][col][1],cim2[row][col][1],cim3[row][col][1]]).astype(int)
        b = np.mean([cim1[row][col][2],cim2[row][col][2],cim3[row][col][2]]).astype(int)
        newimage[row][col] = [r,g,b]
fig, ax = plt.subplots(figsize=(10,10))
ax.axis('off')
ax.imshow(newimage)
The output image I get from this:
A better approach is to take the per-pixel median of the three images. The more images you feed the algorithm, the better the background estimate. Here is the snippet I tried (just replacing mean with median):
x,y,z = cim1.shape
newimage = np.copy(cim1)
# take the median of the three images pixel by pixel, one colour channel at a time
for row in range(x):
    for col in range(y):
        r = np.median([cim1[row][col][0],cim2[row][col][0],cim3[row][col][0]]).astype(int)
        g = np.median([cim1[row][col][1],cim2[row][col][1],cim3[row][col][1]]).astype(int)
        b = np.median([cim1[row][col][2],cim2[row][col][2],cim3[row][col][2]]).astype(int)
        newimage[row][col] = [r,g,b]
fig, ax = plt.subplots(figsize=(10,10))
ax.axis('off')
ax.imshow(newimage)
The final output:
If you had more images, you could remove the foreground completely. I hope this gives you an idea to build upon.
My code assumes all your images have the same dimensions. The solution will be a bit more complicated if you captured the images from different views. In that case you may have to use a template matching algorithm (your pseudocode seems to be doing something similar) to extract the common canvas from your images.
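The per-pixel median can also be computed without explicit Python loops by stacking the frames and reducing along the new axis. This is a small sketch under the same assumption that all frames share one shape; the function name median_background is just for illustration.

```python
import numpy as np

def median_background(frames):
    """Per-pixel, per-channel median of a list of same-sized images."""
    stack = np.stack(frames, axis=0)    # shape (N, H, W, C)
    # the median along axis 0 picks, for each pixel, the middle value over time
    return np.median(stack, axis=0).astype(stack.dtype)
```

With the three arrays from the snippet above, this would be called as median_background([cim1, cim2, cim3]). It should be much faster than the nested loops on full-size images.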

Related

Recognizing black and white images with OpenCV

I have this set of images :
The leftmost one is the reference image.
I want a value that tells me how close each of the other images is to the leftmost one.
I experimented with matchShapes(), calling it for each contour and averaging the values, but I didn't get a useful result (the rightmost image got far too high a value, for example).
I would also want the matching to work only in the correct orientation.
If they're purely black and white images it would probably be easier to just AND the two pictures together and sum up the total pixels left in the result.
Something like this:
import cv2
import numpy as np
# two 100x100 test images, each containing one diagonal line
x = np.zeros((100,100), dtype=np.uint8)
y = np.zeros((100,100), dtype=np.uint8)
for i in range(25,75):
    x[i][i] = 255
    y[i][100-i] = 255
cv2.imshow('x', x)
cv2.imshow('y', y)
# keep only the pixels that are white in both images
z = cv2.bitwise_and(x,y)
score = 0
for i in range(0,z.shape[0]):
    for j in range(0,z.shape[1]):
        if z[i][j] == 255:
            score += 1
print(f"Similarity Score: {score}")
cv2.imshow('z',z)
cv2.waitKey(0)
There is probably a library call that does all of this in one line, but if performance isn't much of a concern, this could work.
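For what it's worth, the AND-and-count step can be written without the nested loops using plain NumPy. This is only a sketch: overlap_score is a made-up name, and it treats any non-zero pixel as white.

```python
import numpy as np

def overlap_score(img_a, img_b):
    """Count the pixels that are non-zero (white) in both images."""
    both_white = (np.asarray(img_a) > 0) & (np.asarray(img_b) > 0)
    return int(np.count_nonzero(both_white))

# same test pattern as above: two diagonals on a 100x100 canvas
x = np.zeros((100, 100), dtype=np.uint8)
y = np.zeros((100, 100), dtype=np.uint8)
for i in range(25, 75):
    x[i, i] = 255
    y[i, 100 - i] = 255
score = overlap_score(x, y)  # the diagonals cross at a single pixel, so score is 1
```

A raw count depends on the image size, so in practice you would probably divide it by the number of white pixels in the reference image to get a ratio.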
It was difficult to reject images that were too different. With the methods proposed here, I always got very close values for images that I thought were too different to correspond.
In the end, I did a multistep process:
First I got the contour of the test image like so :
testContours, _ = cv.findContours(testImage, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
Then, if the contour count of the test image is not the same as that of the original image, I abort.
If they have the same contour count, I calculate the average of the shape distances between corresponding contours:
distances = []
sd = cv.createShapeContextDistanceExtractor()
for i in range(len(testContours)):
    d2 = sd.computeDistance(testContours[i], originalContours[i])
    distances.append(d2)
value = sum(distances) / len(distances)
Then I count the white pixels left after AND-ing the two images and divide by the number of white pixels in the original image (in case the contours match but are not placed correctly):
exactly_placed_ratio = cv.countNonZero(cv.bitwise_and(testImage, originalImage)) / cv.countNonZero(originalImage)
In the end I have two values, I can use the first one to check if the shapes are close enough, and the second one to check if they are in the right position relative to the whole image.

I am trying to crop an image to remove extra space in python

This may sound confusing, but I will demonstrate what my goal is. I want to crop the extra space around an image using Python. In the second image, the outer border has been cropped away until only the colored bars in the center remain. I am not sure if this is possible. Sorry for the vague question; it is hard to explain what I am trying to do.
(Before) Image Before Cropping ----------> (After) What I am trying to achieve.
Notice that the extra black space around the colored bars in the center has been cut out in the after image.
I want to cut the extra space out without manually typing in where to crop. How could this be done?
Sure!
With a bit of Pillow and Numpy magic, you can do something like this – not sure if this is optimal, since I whipped it up in about 15 minutes and I'm not the Numpyiest of Numpyistas:
from PIL import Image
import numpy as np
def get_first_last(mask, axis: int):
    """ Find the first index and one past the last index of non-zero values along an axis in `mask` """
    mask_axis = mask.any(axis=axis)  # True where the row/column contains content
    a = np.argmax(mask_axis)  # first True
    b = len(mask_axis) - np.argmax(mask_axis[::-1])  # one past the last True
    return int(a), int(b)
def crop_borders(img, crop_color):
    np_img = np.array(img)
    mask = (np_img != crop_color).any(axis=-1)  # True where a pixel differs from crop_color in any channel
    x0, x1 = get_first_last(mask, 0)  # find boundaries along x axis
    y0, y1 = get_first_last(mask, 1)  # find boundaries along y axis
    return img.crop((x0, y0, x1, y1))
def main():
    img = Image.open("0d34A.png").convert("RGB")
    img = crop_borders(img, crop_color=(0, 0, 0))
    img.save("0d34A_cropped.png")
if __name__ == "__main__":
    main()
If you need a different condition (e.g. all pixels dark enough), you can change how mask is defined.
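As an illustration of such a condition, the sketch below treats every pixel whose channels are all at or below a brightness threshold as border. The function name content_box and the max_border_value parameter are hypothetical, and the right threshold will depend on your images.

```python
import numpy as np

def content_box(np_img, max_border_value=30):
    """Return (x0, y0, x1, y1) around all pixels brighter than the threshold.

    np_img: array of shape (H, W, 3). Returns None if no pixel is bright enough.
    """
    # a pixel counts as content if any of its channels exceeds the threshold
    mask = (np_img > max_border_value).any(axis=-1)
    cols = np.flatnonzero(mask.any(axis=0))  # columns that contain content
    rows = np.flatnonzero(mask.any(axis=1))  # rows that contain content
    if cols.size == 0:
        return None
    # the right and lower bounds are exclusive, as Pillow's crop() expects
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1
```

The returned tuple has the same left, upper, right, lower layout that Image.crop takes, so it could be passed straight to img.crop once the image has been converted with np.array(img).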
As I understand it, your problem is how to identify the extra space, rather than which library/framework/tool to use to edit images. If so, I solved a similar problem about 4 years ago. I'm sorry, I don't have sample code to show (I left that organisation), but the logic was as follows:
Decide a colour for background. Never use this colour in any of the bars in the graph/image.
Read image data as RGB matrices (3 x 2D-array).
Repeat following on first row, last row, first column, last column:
Simultaneously iterate all 3 matrices
If all values in the row or column from index=0 to index=len(row or column) are equal to corresponding background colour RGB component, then this is extra space. Remove this row or column from all RGB matrices.
Write the remaining RGB matrices as image.
Following are some helpful links in this regard:
Read image as RGB matrix
Iterating through RGB matrices
Write RGB matrix as image
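The steps above can be sketched roughly as follows. This is not the original code, which is no longer available; it assumes the image is already loaded as a NumPy array of shape (H, W, 3), and strip_background is an illustrative name.

```python
import numpy as np

def strip_background(rgb, background):
    """Repeatedly drop outer rows/columns made up entirely of the background colour."""
    bg = np.asarray(background)
    # first row, then last row
    while rgb.shape[0] > 0 and (rgb[0] == bg).all():
        rgb = rgb[1:]
    while rgb.shape[0] > 0 and (rgb[-1] == bg).all():
        rgb = rgb[:-1]
    # first column, then last column
    while rgb.shape[1] > 0 and (rgb[:, 0] == bg).all():
        rgb = rgb[:, 1:]
    while rgb.shape[1] > 0 and (rgb[:, -1] == bg).all():
        rgb = rgb[:, :-1]
    return rgb
```

Comparing a whole row against the background colour checks all three channels at once, which corresponds to iterating the three matrices simultaneously.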

How to delete or clear contours from image?

I'm working with license plates. I apply a series of filters to each image, such as:
Grayscale
Blur
Threshold
Binary
The problem is that after doing this, some contours remain at the borders, as in this image. How can I clear them, or just make them black (masked)? I used this code, but sometimes it fails.
# invert image and detect contours
inverted = cv2.bitwise_not(image_binary_and_dilated)
contours, hierarchy = cv2.findContours(inverted,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
# get the biggest contour
biggest_index = -1
biggest_area = -1
i = 0
for c in contours:
    area = cv2.contourArea(c)
    if area > biggest_area:
        biggest_area = area
        biggest_index = i
    i = i+1
print("biggest area: " + str(biggest_area) + " index: " + str(biggest_index))
cv2.drawContours(image_binary_and_dilated, contours, biggest_index, [0,0,255])
center, size, angle = cv2.minAreaRect(contours[biggest_index])
rot_mat = cv2.getRotationMatrix2D(center, angle, 1.)
#cv2.warpPerspective()
print(size)
dst = cv2.warpAffine(inverted, rot_mat, (int(size[0]), int(size[1])))
mask = dst * 0
x1 = max([int(center[0] - size[0] / 2)+1, 0])
y1 = max([int(center[1] - size[1] / 2)+1, 0])
x2 = int(center[0] + size[0] / 2)-1
y2 = int(center[1] + size[1] / 2)-1
point1 = (x1, y1)
point2 = (x2, y2)
print(point1)
print(point2)
cv2.rectangle(dst, point1, point2, [0,0,0])
cv2.rectangle(mask, point1, point2, [255,255,255], cv2.FILLED)
masked = cv2.bitwise_and(dst, mask)
#cv2_imshow(imgg)
cv2_imshow(dst)
cv2_imshow(masked)
#cv2_imshow(mask)
Some results:
The original plates were:
Good result 1
Good result 2
Good result 3
Good result 4
Bad result 1
Bad result 2
Binary plates are:
Image 1
Image 2
Image 3
Image 4
Image 5 - Bad result 1
Image 6 - Bad result 2
How can I fix this code? I just want to avoid those bad results, or at least improve them.
INTRODUCTION
What you are asking starts to become complicated, and I believe there is no longer a right or wrong answer, just different ways to do this. Almost all of them will yield both good and bad results, most likely in different ratios. Getting 100% good results is quite a challenging task, and I do not believe my answer achieves it. Yet it can be the basis for more sophisticated work towards that goal.
MY PROPOSAL
So, I want to make a different proposal here.
I am not 100% sure why you are doing all the steps, and I believe some of them could be unnecessary.
Let's start from the problem: you want to remove the white parts on the borders (which are not numbers).
So, we need an idea about how to distinguish them from the letters, in order to correctly tackle them.
If we just try to contour and warp, it is likely to work on some images and not on others, because not all of them look the same. Finding a general solution that works for many images is the hardest part of the problem.
What are the differences between the characteristics of the numbers and those of the borders (and other small points)?
After thinking about it, I would say: the shapes! That is, if you imagine a bounding box around a letter/number, it looks like a rectangle whose size is related to the image size. The borders, by contrast, are usually very long and narrow, or too small to be considered a letter/number (random points).
Therefore, my bet is on segmentation, separating the features by their shape. We take the binary image, remove some parts using the projections onto its axes (as you correctly asked about in the previous question, and which I believe we should use), and get an image where each letter is separated from the white borders.
Then we can segment and check the shape of each segmented object, and if we think these are letters, we keep them, otherwise we discard them.
THE CODE
I wrote the code below as an example on your data. Some of the parameters are tuned on this set of images, so they may have to be relaxed for a larger dataset.
import cv2
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
import scipy.ndimage as ndimage
# do this for all the images
num_images = 6
plt.figure(figsize=(16,16))
for k in range(num_images):
    # read the image
    binary_image = cv2.imread("binary_image/img{}.png".format(k), cv2.IMREAD_GRAYSCALE)
    # just for visualization purposes, I create another image with the same shape, to show what I am doing
    new_intermediate_image = np.zeros((binary_image.shape), np.uint8)
    new_intermediate_image += binary_image
    # here we will copy only the cleaned parts
    new_cleaned_image = np.zeros((binary_image.shape), np.uint8)
    ### THIS CODE COMES FROM THE PREVIOUS ANSWER:
    # https://stackoverflow.com/questions/62127537/how-to-clean-binary-image-using-horizontal-projection?noredirect=1&lq=1
    (rows,cols)=binary_image.shape
    h_projection = np.array([ x/rows for x in binary_image.sum(axis=0)])
    threshold_h = (np.max(h_projection) - np.min(h_projection)) / 10
    print("we will use threshold {} for horizontal".format(threshold_h))
    # select the black areas
    black_areas_horizontal = np.where(h_projection < threshold_h)
    for j in black_areas_horizontal:
        new_intermediate_image[:, j] = 0
    v_projection = np.array([ x/cols for x in binary_image.sum(axis=1)])
    threshold_v = (np.max(v_projection) - np.min(v_projection)) / 10
    print("we will use threshold {} for vertical".format(threshold_v))
    black_areas_vertical = np.where(v_projection < threshold_v)
    for j in black_areas_vertical:
        new_intermediate_image[j, :] = 0
    ### UNTIL HERE
    # define the features we are looking for
    # these parameters can also be tuned
    min_width = binary_image.shape[1] / 14
    max_width = binary_image.shape[1] / 2
    min_height = binary_image.shape[0] / 5
    max_height = binary_image.shape[0]
    print("we look for feature with width in [{},{}] and height in [{},{}]".format(min_width, max_width, min_height, max_height))
    # segment the image
    labeled_array, num_features = ndimage.label(new_intermediate_image)
    # bounding boxes of all features; entry i-1 belongs to label i
    feature_slices = ndimage.find_objects(labeled_array)
    # loop over all features found (label 0 is the background, so start at 1)
    for i in range(1, num_features + 1):
        # get a bounding box around them
        slice_x, slice_y = feature_slices[i - 1]
        roi = labeled_array[slice_x, slice_y]
        # check the shape, if the bounding box is what we expect, copy it to the new image
        if roi.shape[0] > min_height and \
           roi.shape[0] < max_height and \
           roi.shape[1] > min_width and \
           roi.shape[1] < max_width:
            new_cleaned_image += (labeled_array == i)
    # print all images on a grid
    plt.subplot(num_images,3,1+(k*3))
    plt.imshow(binary_image)
    plt.subplot(num_images,3,2+(k*3))
    plt.imshow(new_intermediate_image)
    plt.subplot(num_images,3,3+(k*3))
    plt.imshow(new_cleaned_image)
That produces the following output (in the grid, the left column shows the input images, the middle column the images after masking based on the histogram projections, and the right column the cleaned images):
CONCLUSIONS:
As said above, this method does not yield 100% good results. The last picture has lower quality and some parts are disconnected, so they are lost in the process. I personally believe this is a price worth paying for cleaner images; if you have a lot of images, it won't be a problem, and you can discard images of that kind. Overall, I think this method returns quite clean images, where all parts that are not letters or numbers are correctly removed.
ADVANTAGES
the image is clean; nothing other than letters or numbers is kept
the parameters can be tuned, and should be consistent across images
if there is a problem, adding some prints or debugging the loop that chooses which features to keep should make it easier to understand where the problem is and correct it
LIMITATIONS
it may fail in some cases where letters and numbers touch the white borders, which seems quite possible. This is handled by the black_areas created from the projections, but I am not confident it will work 100% of the time.
some small parts of the numbers can be lost during the process, as in the last picture.

Transform HSV mask into a set of points

I created an HSV mask from the image. The result looks like the following:
I would like to represent this mask as a set of points. My original idea was to use skimage's skeletonize to create a line and then use a sliding window to calculate local means for point creation.
However, skeletonize takes too long: 0.4 s per frame, which is too slow for video processing.
Do you want the points of all True elements of the mask, or do you just want a skeleton? If the former:
import skimage as ski
from skimage import io
import numpy as np
mask = ski.io.imread('./mask.png')[:,:,0]/255
mask = mask.astype('bool')
s0,s1 = mask.shape # dimensions of mask
a0,a1 = np.arange(s0),np.arange(s1) # make two 1d coordinate arrays
coords = np.array(np.meshgrid(a0,a1)).T # cartesian product into a coordinate matrix
coords = coords[mask] # mask out the points of interest
If the latter, you can get the start and end points (from left to right) of the object in the mask in a fast way with something like
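# pair every pixel with its left neighbour; a start is a True pixel whose left neighbour is False
start_mat = np.stack((np.roll(mask,1,axis=1),mask),-1)
start_mask = np.fromiter(map(lambda p: np.all(p==np.array([False,True])),start_mat[mask]),dtype=bool)
starts = coords[start_mask]
# pair every pixel with its right neighbour; an end is a True pixel whose right neighbour is False
end_mat = np.stack((np.roll(mask,-1,axis=1),mask),-1)
end_mask = np.fromiter(map(lambda p: np.all(p==np.array([False,True])),end_mat[mask]),dtype=bool)
ends = coords[end_mask]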
This will give you a rough outline of the object. Outline points will be missing anywhere that the slope of the figure is 0. You may have to think of a vertical difference scheme for those areas. The same idea would work with np.roll(...,axis=0). You could just concatenate the unique points from rolling over rows to the points from rolling over columns to get the full outline.
Averaging the correct pairs to get the skeleton isn't so easy.
Here's a resultant outline. You can definitely make this faster than 0.4s:
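If speed matters, the same start/end detection can probably be done with whole-array boolean operations instead of a Python-level map. This is a sketch of that idea, with horizontal_edges as an illustrative name. It pads the mask with False rather than using np.roll, so pixels on the image border do not wrap around to the opposite side.

```python
import numpy as np

def horizontal_edges(mask):
    """Return (starts, ends): (row, col) points where each horizontal run begins and ends."""
    mask = np.asarray(mask, dtype=bool)
    # pad one False column on each side so border pixels have a neighbour
    padded = np.pad(mask, ((0, 0), (1, 1)), constant_values=False)
    left = padded[:, :-2]    # value of the pixel to the left of each position
    right = padded[:, 2:]    # value of the pixel to the right of each position
    starts = np.argwhere(mask & ~left)   # True pixel with a False left neighbour
    ends = np.argwhere(mask & ~right)    # True pixel with a False right neighbour
    return starts, ends
```

Applying the same function to mask.T would give the vertical transitions, which fills in the places where the outline runs horizontally. The coordinates then come back in (col, row) order.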
Couldn't a simple for loop work?
Scan across each line of your bitmap, looking for...
The X position where black meets white = new start point.
Then, in the same scanned line, look for the next X position where white meets black = new end point.
Either put dots at the start/end points for an "outline" effect, or put a dot at the center of each run for a "center" effect: dot.x = (start_point + end_point) / 2
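A plain-Python sketch of that scan, for the "center" variant, might look like this. The function name scan_row_centers is made up, and mask is assumed to be a 2-D array or list of lists in which truthy values are white.

```python
def scan_row_centers(mask):
    """Return an (x, y) point at the center of every horizontal white run."""
    centers = []
    for y, row in enumerate(mask):
        start = None                      # x where the current white run began
        for x, value in enumerate(row):
            if value and start is None:
                start = x                 # black meets white: new start point
            elif not value and start is not None:
                end = x - 1               # white meets black: the run ended one pixel earlier
                centers.append(((start + end) / 2, y))
                start = None
        if start is not None:             # the run touches the right edge of the image
            centers.append(((start + len(row) - 1) / 2, y))
    return centers
```

This is easy to follow but slow on large frames, since every pixel is visited in Python. For video, a vectorized version of the same scan is likely to be needed.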

Get the (x,y) coordinate values from an image array's RGB value using numpy

I am new to python so I really need help with this one.
I have an image greyscaled and thresholded so that the only colors present are black and white.
I'm not sure how to go about writing an algorithm that will give me a list of coordinates (x,y) on the image array that correspond to the white pixels only.
Any help is appreciated!
Surely you must already have the image data in the form of a list of intensity values? If you're using Anaconda, you can use the PIL Image module and call getdata() to obtain this intensity information. Some people advise to use NumPy methods, or others, instead, which may improve performance. If you want to look into that then go for it, my answer can apply to any of them.
If you already have a function to convert a greyscale image to B&W, then you should have the intensity information for that output image: a list of 0's and 1's, running from the top left corner to the bottom right. If you have that, you already have your location data; it just isn't in (x,y) form. To convert it, use something like this:
data = image.getdata()
height = image.getHeight()
width = image.getWidth()
pixelList = []
for i in range(height):
    for j in range(width):
        stride = (width*i) + j
        pixelList.append((j, i, data[stride]))
Here data is a list of 0's and 1's (B&W), and I assume you have written getWidth() and getHeight(). Don't just copy what I've written; understand what the loops are doing. The result is a list, pixelList, of tuples, each containing location and intensity information in the form (x, y, intensity). That may be a messy form for what you are doing, but that's the idea. Instead of making a list of tuples, it would be much cleaner and more accessible to pass the three values (x, y, intensity) to a Pixel object or something similar. Then you can get any of those values from anywhere. I would encourage you to do that, for better organization and so you can write the code on your own.
In either case, having the intensity and location stored together makes sorting out the white pixels very easy. Here it is using the list of tuples:
whites = []
for pixel in pixelList:
    if pixel[2] == 1:
        whites.append(pixel[0:2])
Then you have a list of white pixel coordinates.
You can use PIL and np.where to get the results efficiently and concisely:
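from PIL import Image
import numpy as np
# convert to RGB so that every pixel is a 3-tuple, even for greyscale input
img = Image.open('/your_pic.png').convert('RGB')
pixel_mat = np.array(img.getdata())
width = img.size[0]
# flat indices of all pixels with at least one non-zero channel
pixel_ind = np.where((pixel_mat[:, :3] > 0).any(axis=1))[0]
coordinate = np.concatenate(
    [
        (pixel_ind % width).reshape(-1, 1),
        (pixel_ind // width).reshape(-1, 1),
    ],
    axis=1,
)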
Pick the required pixels and get their flat indices, then calculate the coordinates from those indices. Because it avoids explicit Python loops, this approach is likely to be faster.
PIL is only used to get the pixel matrix and image width, you can use any library you are familiar with to replace it.
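If the thresholded image is already a 2-D NumPy array, the index arithmetic can be skipped altogether, because np.nonzero returns row and column indices directly. This is a minimal sketch: white_pixel_coords is an illustrative name, and it assumes white is any non-zero value.

```python
import numpy as np

def white_pixel_coords(binary):
    """Return an (N, 2) array of (x, y) coordinates of the non-zero pixels."""
    ys, xs = np.nonzero(np.asarray(binary))  # row indices first, then column indices
    return np.column_stack((xs, ys))         # swap to (x, y) order
```

NumPy indexes arrays as [row, column], i.e. [y, x], which is why the two index arrays are swapped before they are stacked.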
