Create mask OpenCV python - python

I'm trying to create a mask. I have a database of images similar to this one.
import cv2
import numpy as np
img = cv2.imread('sample1.png', cv2.IMREAD_UNCHANGED)
gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
#img_ = cv2.threshold(gray,100,225,cv2.THRESH_BINARY)
edges = cv2.Canny(gray, 250, 250)
How can I remove the inner border and fill it with white?
Result I Want

Well, there are many ways to do that. All of them need some tuning, depending on your image.
There is, for example, a floodfill function in opencv.
But the easiest is probably to use some mathematical morphology and then connected components, because with connected components it is easier to adjust the result if needed.
We can start by having a binary version of your edges (called binedge below).
Once this is done, since there are "holes" in it, we need to fill those holes, so that the edge strictly separates inside from outside. This can be done by a dilation with a small kernel ker
fatedge=cv2.dilate(binedge, ker)
Then, we want to find the inside. That is not easy, because the inside might consist of many parts. So the easiest way is probably to find the outside and invert it. Though there could also be several outside parts, if the character touches the border in different places.
So, let's start by finding all connected black parts of this picture (for example with cv2.connectedComponents applied to fatedge==0).
comp here is an image whose values are the indices of the connected components. Shown here with random colors for each index.
Let's assume that the outside is connected and that (0,0) is in it (this is almost always the case, and it is here. If not, you'll have to find more complex criteria, such as "the biggest component", or even merge different parts). The component we are interested in is the one that contains (0,0), that is, the pixels of comp that have the same value as comp[0,0]. And in fact, what we are interested in is the opposite of that: what is inside. We compute the outside only because it is easier. Inside is what is not outside, that is, the pixels that are != comp[0,0].
Last stage (not really necessary from an aesthetic point of view, but strictly speaking it is needed): since we dilated the edges at the beginning, this picture is a few pixels bigger than it should be. We can erode it back now that we have what we want
output=cv2.erode(filled, ker)*255
So, all together
import cv2
import numpy as np
img = cv2.imread('Downloads/93Lwd.png', cv2.IMREAD_UNCHANGED)
gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
#img_ = cv2.threshold(gray,100,225,cv2.THRESH_BINARY)
edges = cv2.Canny(gray, 250, 250)
# Binarize edges
binedge = (edges > 0).astype(np.uint8)
# Removing edges too close from left and right borders
margin = 10 # assumed width in pixels (the original value is not shown); tune for your images
binedge[:, :margin] = 0
binedge[:, -margin:] = 0
# Fatten them so that there is no hole
ker = np.ones((5, 5), np.uint8) # assumed kernel size; it must be large enough to close the gaps
fatedge=cv2.dilate(binedge, ker)
# Find connected black areas
n, comp = cv2.connectedComponents((fatedge==0).astype(np.uint8))
# comp is an image whose each value is the index of the connected component
# Assuming that point (0,0) is in the border, not inside the character, border is where comp is == comp[0,0]
# So character is where it is not
filled = (comp != comp[0,0]).astype(np.uint8)
# Or, variant from new image: considering "outside" any part that touches one of the left, right, or top border
# Note: that is redundant with previous 0ing of left and right borders
# Set of all components touching left, right or top border
listOutside = set(comp[:,0]) | set(comp[:,-1]) | set(comp[0,:])
if 0 in listOutside: listOutside.remove(0) # 0 are the lines, that is what is False in fatedge==0
filled=(~np.isin(comp, list(listOutside))).astype(np.uint8) # isin need array or list, not set
# Just to be extra accurate, since we had dilated edges, we can now erode result
output=cv2.erode(filled, ker)*255


Difference between images of different sizes

My problem is as follows. I have an image img0 (array shape (A,B,3)) and then a face img1 cut out from the middle of that image (by an algorithm I don't have access to: my input is only the whole image, and the face cut out from it), now an array shaped (C,D,3) where C<A and D<B. Now, I want to perform operations on the face (e.g., colour it differently) and then stick it back inside the original background (which is not coloured differently) -- these operations will not affect the shape of img1 array containing the face alone, it will remain (C,D,3). Something like img0-img1 doesn't work because of the shape mismatch.
I guess an approach like finding the starting coordinate of the face in img0 would work in the case that the face cut out is rectangular (which is possible for me to use, though not ideal), since it is guaranteed that the face is exactly identical in img1 and img0. That means, to get the background, we only need to find the starting coordinate of the img1 array in img0, cut out the subsequent elements (that correspond to img1) from img0, and we're left with the background. After I've done whatever I want to the face, I can use the new (C,D,3) array in place of the previous img1 part of the whole image (img0).
Is there a way to do this in Python? i.e., compute the difference between two images of different sizes, where one image is a 'subimage' of the other? Or, failing that, if we can find the starting coordinate of the rectangular portion of an image (img0) which corresponds to a rectangular cutout available to us (img1)?
"Or, failing that, if we can find the starting coordinate of the rectangular portion of an image (img0) which corresponds to a rectangular cutout available to us (img1)?"
One easy way to do that would be to cross-correlate your zero-mean cut-out with the zero-mean original image. As you have no noise added to the image, any maximum of the cross-correlation is a possible candidate.
(i) If you don't use faces but, e.g., uniform blocks, there will be multiple maxima and you don't have a unique solution.
(ii) It is not exactly an elegant solution to your problem.
I modified the code example from [1] to make it clearer:
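from scipy import signal, datasets
import numpy as np
# scipy.misc.face is no longer available in recent SciPy releases;
# scipy.datasets.face replaces it (it downloads the image via the pooch package)
face = datasets.face(gray=True)
face = face - np.mean(face) # zero-mean original image
face_cutout = np.copy(face[300:365, 670:750])
face_cutout = face_cutout - np.mean(face_cutout) # zero-mean cut-out
corr = signal.correlate2d(face, face_cutout, mode='valid')
y, x = np.unravel_index(np.argmax(corr), corr.shape) # find the match
print(f'x: {x} y: {y}')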
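Since the cut-out is guaranteed to be pixel-identical to the region it came from, an exact search is also possible. The following is a minimal sketch of that idea (not the cross-correlation above) using NumPy's sliding_window_view; the helper name find_subimage and the random test image are made up for illustration. It compares every window at once, so it is only practical for small images.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_subimage(img0, img1):
    """Return (row, col) of the top-left corner of img1 inside img0, or None."""
    # Every window of img1's shape inside img0, as a view (no copy)
    windows = sliding_window_view(img0, img1.shape)
    # Compare each window with img1 and reduce over the window axes
    window_axes = tuple(range(img0.ndim, windows.ndim))
    hits = (windows == img1).all(axis=window_axes)
    found = np.argwhere(hits)
    return tuple(int(v) for v in found[0][:2]) if len(found) else None

rng = np.random.default_rng(0)
img0 = rng.integers(0, 256, size=(40, 50, 3), dtype=np.uint8)
img1 = img0[7:17, 21:33].copy()           # the "face" cut-out
y, x = find_subimage(img0, img1)

edited = 255 - img1                       # any operation that keeps the shape
result = img0.copy()
result[y:y + img1.shape[0], x:x + img1.shape[1]] = edited

# Boolean mask of the background (everything outside the cut-out)
background = np.ones(img0.shape[:2], dtype=bool)
background[y:y + img1.shape[0], x:x + img1.shape[1]] = False
print(y, x)
```

Once the corner is known, the "difference" is just the background mask, and the edited face can be written back with plain slicing.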

Comparing and plotting regions of the same color over a dataset of a few hundred images

A chem student asked me for help with plotting image segmentation:
A stationary camera takes a picture of the experimental setup every second over a period of a few minutes, yielding about 300 images.
The relevant parts in the setup are two adjacent layers of differently-colored foams observed from the side, a 2-color sandwich shrinking from both sides, basically, except one of the foams evaporates a bit faster.
I'd like to segment each of the images in the way that would let me plot both foam regions' "width" against time.
Here is a "diagram" :)
I want to go from here --> To here
Ideally, given a few hundred of such shots, in which only the widths change, I get an array of scalars back that I can plot. (Going to look like a harmonic series on either side of the x-axis)
I have a bit of python and matlab experience, but have never used OpenCV or Image Processing toolbox in matlab, or actually never dealt with any computer vision in general. Could you guys throw like a roadmap of what packages/functions to use or steps one should take and i'll take it from there?
I'm not sure how to address these things:
-selecting at which slice along the length of the slice the algorithm measures the width(i.e. if the foams are a bit uneven), although this can be ignored.
-which library to use to segment regions of the image based on their color, (some k-means shenanigans probably), and selectively store the spatial parameters of the resulting segments?
-how to iterate that above over a number of files.
Thank you kindly in advance!
This assumes the layers have different intensities after converting to grayscale (if not, just convert to another color space like HSV or LAB, then use one of the components)
img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
First, threshold your grayscale input into a few bands
ret,thresh1 = cv2.threshold(img,128,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img,27,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img,77,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img,97,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img,227,255,cv2.THRESH_TOZERO_INV)
The threshold values should be tested against your actual data. Here I'm just giving an example.
Clean up the segmented image using a median filter with a kernel size of 9 or larger. I do expect some noise. You can also use an ROI here to help remove part of the noise. But personally I'm lazy, I just wrote a program to handle all cases and angles
threshholed_images_aftersmoothing = cv2.medianBlur(threshholed_images,9)
Each band will correspond to one color (layer). Now you should have N segmented images from one source, where N is the number of layers you wish to track.
Second, use the OpenCV function boundingRect to find the location and width/height of each layer, i.e. call boundingRect on each of the smoothed, thresholded images.
C++: Rect boundingRect(InputArray points)
Python: cv2.boundingRect(points) → retval
Last, the rect has x, y, width and height properties (in Python, cv2.boundingRect returns them as the tuple (x, y, w, h)). You can use a simple sorting order to sort from top to bottom layer based on rect attribute x. Run through the whole video to obtain the x (layer id), height vs time graph.
Rect API
Public Attributes
_Tp height // this is what you are looking for
_Tp width
_Tp x // this tells you the position of the band
_Tp y
By plotting the corresponding heights (|AB| or |CD|) over time, you can obtain the graph you need.
A more robust way is to use a Kalman filter to track the position and height, as I would expect some bubbles to occur and interfere with the height of the layers.
To be honest, I didn't expect a chem student to be good at this. Haha, good luck.
If anything goes wrong, you can find me here, or email me if I'm not watching Stack Overflow.
You can select a region of interest straight down the middle of the foams, a few pixels wide. If you stack these regions for each image it will show the shrink over time.
If for example you use 3 pixel width for the roi, the result of 300 images will be a 900 pixel wide image, where the left is the start of the experiment and the right is the end. The following image can help you understand:
Though I have not fully tested it, this code should work. Note that there must only be images in the folder you reference.
import cv2
import numpy as np
import os
# path to folder that holds the images
path = '.'
# dimensions of roi
x = 0
y = 0
w = 3
h = 100
# store references to all images
all_images = os.listdir(path)
# sort images (assumes the file names sort in chronological order)
all_images.sort()
# create empty result array
result = np.empty([h,0,3],dtype=np.uint8)
for image in all_images:
    # load image
    img = cv2.imread(os.path.join(path, image))
    # get the region of interest
    roi = img[y:y+h,x:x+w]
    # add the roi to previous results
    result = np.hstack((result,roi))
# optional: save result as image
# cv2.imwrite('result.png',result)
# display result - can also plot with matplotlib
cv2.imshow('Result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Update after question edit:
If the foams have different colors, you can easily separate them by color by converting the image to HSV and using inRange. This creates a mask (a 2D array with one value per pixel: 255 where the pixel is in range, 0 elsewhere) that you can use to calculate the average height and extract the parameters and area of the image.
You can find a script to help you find the HSV colors for separation on this GitHub
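Once you have such a mask for one foam, turning it into a number per frame is short. A minimal sketch in plain NumPy, assuming the mask already exists (for example from inRange) and that the layer is measured along the vertical direction; the helper name layer_thickness and the synthetic masks are illustrative only.

```python
import numpy as np

def layer_thickness(mask, x0, x1):
    """Mean number of mask pixels per column within the strip x0 <= x < x1."""
    strip = mask[:, x0:x1] > 0
    return float(strip.sum(axis=0).mean())

# Synthetic masks: one layer that loses two rows per frame
frames = []
for t in range(5):
    m = np.zeros((50, 30), np.uint8)
    m[10:40 - 2 * t, :] = 255
    frames.append(m)

# Measure in a 3-pixel-wide strip down the middle of the image
thickness = [layer_thickness(m, 13, 16) for m in frames]
print(thickness)
```

Averaging over a few columns rather than a single one makes the measurement less sensitive to an uneven foam edge; the resulting list can be passed straight to a plotting function.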

Transform HSV mask into a set of points

I created a HSV mask from the image. The result is like the following:
I hope that this mask can be represented by a set of points. My original idea was to use Skimage Skeletonize to create a line and then use the sliding window to calculate the local mean for point creation.
However, skeletonize takes too long. It requires 0.4s for each frame. This is not a good idea for video processing.
Do you want the points of all True elements of the mask, or do you just want a skeleton? If the former..
import skimage as ski
from skimage import io
import numpy as np
mask = io.imread('./mask.png')[:,:,0]/255
mask = mask.astype('bool')
s0,s1 = mask.shape # dimensions of mask
a0,a1 = np.arange(s0),np.arange(s1) # make two 1d coordinate arrays
coords = np.array(np.meshgrid(a0,a1)).T # cartesian product into a coordinate matrix
coords = coords[mask] # mask out the points of interest
If the latter, you can get the start and end points (from left to right) of the object in the mask in a fast way with something like
# a mask pixel is a start point if its left neighbour is not in the mask
# (vectorized; np.alltrue no longer exists in NumPy 2.0)
start_mask = ~np.roll(mask,1,axis=1)[mask]
starts = coords[start_mask]
# a mask pixel is an end point if its right neighbour is not in the mask
end_mask = ~np.roll(mask,-1,axis=1)[mask]
ends = coords[end_mask]
This will give you a rough outline of the object. Outline points will be missing anywhere that the slope of the figure is 0. You may have to think of a vertical difference scheme for those areas. The same idea would work with np.roll(...,axis=0). You could just concatenate the unique points from rolling over rows to the points from rolling over columns to get the full outline.
Averaging the correct pairs to get the skeleton isn't so easy.
Here's a resultant outline. You can definitely make this faster than 0.4s:
Couldn't a simple for loop work?
Scan each horizontal line of your bitmap looking for...
The X position where black meets white = new start point.
Then, in the same scanned line, look for the X position where white meets black = new end point.
Either put dots at the start/end points for an "outline" effect, or else put a dot at the centre for a "center" effect, with dot.x = (start_point + end_point) / 2
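The row scan does not need an explicit loop in NumPy. A minimal sketch, assuming each row contains at most one run of mask pixels (with several runs it returns the first start and the last end); the function name row_midpoints and the tiny test mask are made up for illustration.

```python
import numpy as np

def row_midpoints(mask):
    """For each row containing mask pixels, return (row, start, end, centre)."""
    rows = np.flatnonzero(mask.any(axis=1))
    sub = mask[rows]
    start = sub.argmax(axis=1)                              # first True column
    end = mask.shape[1] - 1 - sub[:, ::-1].argmax(axis=1)   # last True column
    centre = (start + end) / 2
    return np.column_stack((rows, start, end, centre))

mask = np.zeros((5, 10), dtype=bool)
mask[1, 2:7] = True    # run from column 2 to column 6
mask[3, 4] = True      # single pixel at column 4
points = row_midpoints(mask)
print(points)
```

Each output row holds the row index, the start column, the end column and the centre column, so the outline and the centre line come out of the same pass.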

Detecting the upper side of a dice

Is it possible to detect the upper side of a dice? While this will be an easy task if you look from the top, from many perspectives multiple sides are visible.
Here is an example of a dice, feel free to take your own pictures:
You usually want to know the score you have achieved. It is easy for me to extract ALL dots, but how to only extract those on the top? In this special case, the top side is the largest, but this might not always be true. I am looking for something which evaluates the distortion of the top square (or circle in this case, which I can extract) in relation to the perspective given by the grid in the bottom.
Example program with some results is given below.
import numpy as np
import cv2
img = cv2.imread('dice.jpg')
# Colour range to be extracted
lower_blue = np.array([0,0,0])
upper_blue = np.array([24,24,24])
# Threshold the BGR image
dots = cv2.inRange(img, lower_blue, upper_blue)
# Colour range to be extracted
lower_blue = np.array([0,0,0])
upper_blue = np.array([226,122,154])
# Threshold the BGR image
upper_side_shape = cv2.inRange(img, lower_blue, upper_blue)
cv2.imshow('Upper side shape',upper_side_shape)
cv2.waitKey(0) # keep the window open until a key is pressed
Some resulting images:
The best solution is dot size, which I mentioned in the comment. You find the largest dot, consider it as max, and then create a tolerance level.
But what if all dots are nearly equal (viewing it from the edge at angle that makes things equidistant), or even too small? The best solution for that is creating a boundary to capture the dots. This requires the analysis of the dice's edge (edge detection basically), but once you define the boundary, you're solid.
All you need is to capture the edges of the dice from the perspective you're seeing.
Here's a visual example:
Since you have a virtual boundary set, you'll simply measure dots above a specific point on the y-axis.
The dot size is a good heuristic, but I would also add the dot roundness: if you compute the second-order image moments of the binarized dots, the more similar the x and y moments are, the rounder the figure. This will of course fail, like the size, for a side view, but then what does "top side" really mean if you can't sense gravity?
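As a rough illustration of the roundness idea, here is a sketch in plain NumPy. It uses a rotation-independent variant: instead of comparing the x and y moments directly, it compares the two eigenvalues of the matrix of second-order central moments. The function name roundness and the synthetic blobs are assumptions.

```python
import numpy as np

def roundness(blob):
    """Ratio of the smaller to the larger principal second moment (1.0 = round)."""
    ys, xs = np.nonzero(blob)
    # Central second-order moments are the entries of the coordinate covariance
    cov = np.cov(np.vstack((xs, ys)), bias=True)
    small, large = np.linalg.eigvalsh(cov)   # eigenvalues in ascending order
    return small / large

yy, xx = np.mgrid[:61, :61]
disc = (xx - 30) ** 2 + (yy - 30) ** 2 <= 20 ** 2
ellipse = ((xx - 30) / 20) ** 2 + ((yy - 30) / 8) ** 2 <= 1

r_disc = roundness(disc)
r_ellipse = roundness(ellipse)
print(r_disc, r_ellipse)
```

A dot seen head-on scores close to 1, while a dot on a strongly tilted face is squashed into an ellipse and scores lower, so the face whose dots score highest is a candidate for the top.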
why try to chop up the image at all? based on what numbers you see on the side you can infer what number is on the top. your side numbers can be used as a check to validate your guess.
note that you'll have to be careful about handedness (see:

Remove spurious small islands of noise in an image - Python OpenCV

I am trying to get rid of background noise from some of my images. This is the unfiltered image.
To filter, I used this code to generate a mask of what should remain in the image:
element = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
mask = cv2.erode(mask, element, iterations = 1)
mask = cv2.dilate(mask, element, iterations = 1)
mask = cv2.erode(mask, element)
With this code and when I mask out the unwanted pixels from the original image, what I get is:
As you can see, all the tiny dots in the middle area are gone, but a lot of those coming from the denser area are also gone. To reduce the filtering, I tried changing the second parameter of getStructuringElement() to be (1,1) but doing this gives me the first image as if nothing has been filtered.
Is there any way where I can apply some filter that is between these 2 extremes?
In addition, can anyone explain to me what exactly does getStructuringElement() do? What is a "structuring element"? What does it do and how does its size (the second parameter) affect the level of filtering?
A lot of your questions stem from the fact that you're not sure how morphological image processing works, but we can put your doubts to rest. You can interpret the structuring element as the "base shape" to compare to. 1 in the structuring element corresponds to a pixel that you want to look at in this shape and 0 is one you want to ignore. There are different shapes, such as rectangular (as you have figured out with MORPH_RECT), ellipse, circular, etc.
As such, cv2.getStructuringElement returns a structuring element for you. The first parameter specifies the type you want and the second parameter specifies the size you want. In your case, you want a 2 x 2 "rectangle"... which is really a square, but that's fine.
In a more bastardized sense, you use the structuring element and scan from left to right and top to bottom of your image and you grab pixel neighbourhoods. Each pixel neighbourhood has its centre exactly at the pixel of interest that you're looking at. The size of each pixel neighbourhood is the same size as the structuring element.
For an erosion, you examine all of the image pixels in a pixel neighbourhood that lie under a structuring element pixel that is 1. If every one of those image pixels is non-zero, then the output pixel in the corresponding centre position with respect to the input is 1. If at least one of them is zero, then the output is 0.
In terms of the rectangular structuring element, you need to make sure that every pixel in the structuring element is touching a non-zero pixel in your image for a pixel neighbourhood. If it isn't, then the output is 0, else 1. This effectively eliminates small spurious areas of noise and also decreases the area of objects slightly.
The size factors in where the larger the rectangle, the more shrinking is performed. The size of the structuring element is a baseline where any objects that are smaller than this rectangular structuring element, you can consider them as being filtered and not appearing in the output. Basically, choosing a 1 x 1 rectangular structuring element is the same as the input image itself because that structuring element fits all pixels inside it as the pixel is the smallest representation of information possible in an image.
Dilation is the opposite of erosion. If there is at least one non-zero pixel that touches a pixel in the structuring element that is 1, then the output is 1, else the output is 0. You can think of this as slightly enlarging object areas and making small islands bigger.
The implications with size here is that the larger the structuring element, the larger the areas of the objects will be and the larger the isolated islands become.
What you're doing is an erosion first followed by a dilation. This is what is known as an opening operation. The purpose of this operation is to remove small islands of noise while (trying to) maintain the areas of the larger objects in your image. The erosion removes those islands while the dilation grows back the larger objects to their original sizes.
You follow this with an erosion again for some reason, which I can't quite understand, but that's ok.
What I would personally do is perform a closing operation first, which is a dilation followed by an erosion. Closing helps group areas that are close together into a single object. As you can see, there are some larger areas that are close to each other that should probably be joined before we do anything else. As such, I would do a closing first, then do an opening after so that we can remove the isolated noisy areas. Take note that I'm going to make the closing structuring element size larger as I want to make sure I get nearby pixels, and the opening structuring element size smaller so that I don't mistakenly remove any of the larger areas.
Once you do this, I would mask out any extra information with the original image so that you leave the larger areas intact while the small islands go away.
Instead of chaining an erosion followed by a dilation, or a dilation followed by an erosion, use cv2.morphologyEx, where you can specify MORPH_OPEN and MORPH_CLOSE as the flags.
As such, I would personally do this, assuming your image is called spots.png:
import cv2
import numpy as np
img = cv2.imread('spots.png')
img_bw = 255*(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) > 5).astype('uint8')
se1 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
se2 = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
mask = cv2.morphologyEx(img_bw, cv2.MORPH_CLOSE, se1)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, se2)
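mask = np.dstack([mask, mask, mask]) // 255 # integer division keeps the mask uint8, with values 0 or 1
out = img * mask
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('output.png', out)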
The above code is pretty self-explanatory. First, I read in the image and then I convert the image to grayscale and threshold with an intensity of 5 to create a mask of what is considered object pixels. This is a rather clean image and so anything larger than 5 seems to have worked. For the morphology routines, I need to convert the image to uint8 and scale the mask to 255. Next, we create two structuring elements - one that is a 5 x 5 rectangle for the closing operation and another that is 2 x 2 for the opening operation. I run cv2.morphologyEx twice, for the closing and opening operations respectively, on the thresholded image.
Once I do that, I stack the mask so that it becomes a 3D matrix and divide by 255 so that it becomes a mask of [0,1] and then we multiply this mask with the original image so that we can grab the original pixels of the image back and maintaining what is considered a true object from the mask output.
The rest is just for illustration. I show the image in a window, and I also save the image to a file called output.png, and its purpose is to show you what the image looks like in this post.
I get this:
Bear in mind that it isn't perfect, but it's much better than how you had it before. You'll have to play around with the structuring element sizes to get something that you consider as a good output, but this is certainly enough to get you started.
C++ version
There have been some requests to translate the code I wrote above into the C++ version using OpenCV. I have finally gotten around to writing a C++ version of the code and this has been tested on OpenCV 3.1.0. The code for this is below. As you can see, the code is very similar to that seen in the Python version. However, I used cv::Mat::setTo on a copy of the original image and set whatever was not part of the final mask to 0. This is the same thing as performing an element-wise multiplication in Python.
#include <opencv2/opencv.hpp>
using namespace cv;
int main(int argc, char *argv[])
{
    // Read in the image
    Mat img = imread("spots.png", CV_LOAD_IMAGE_COLOR);
    // Convert to black and white
    Mat img_bw;
    cvtColor(img, img_bw, COLOR_BGR2GRAY);
    img_bw = img_bw > 5;
    // Define the structuring elements
    Mat se1 = getStructuringElement(MORPH_RECT, Size(5, 5));
    Mat se2 = getStructuringElement(MORPH_RECT, Size(2, 2));
    // Perform closing then opening
    Mat mask;
    morphologyEx(img_bw, mask, MORPH_CLOSE, se1);
    morphologyEx(mask, mask, MORPH_OPEN, se2);
    // Filter the output
    Mat out = img.clone();
    out.setTo(Scalar(0), mask == 0);
    // Show image and save
    namedWindow("Output", WINDOW_NORMAL);
    imshow("Output", out);
    imwrite("output.png", out);
    // Keep the window open until a key is pressed
    waitKey(0);
    return 0;
}
The results should be the same as what you get in the Python version.
One can also remove small pixel clusters using the remove_small_objects function in skimage:
import matplotlib.pyplot as plt
from skimage import morphology
import numpy as np
import skimage
# read the image, grayscale it, binarize it, then remove small pixel clusters
im = plt.imread('spots.png')
grayscale = skimage.color.rgb2gray(im)
binarized = np.where(grayscale>0.1, 1, 0)
processed = morphology.remove_small_objects(binarized.astype(bool), min_size=2, connectivity=2).astype(int)
# black out pixels
mask_x, mask_y = np.where(processed == 0)
im[mask_x, mask_y, :3] = 0
# plot the result
plt.imshow(im)
plt.show()
This displays:
To retain only larger clusters, try increasing min_size (smallest size of retained clusters) and decreasing connectivity (size of pixel neighborhood when forming clusters). Using just those two parameters, one can retain only pixel clusters of an appropriate size.

