How to remove whitespace from an image in OpenCV? - python

I have the following image which has text and a lot of white space underneath the text. I would like to crop the white space such that it looks like the second image.
Cropped Image
Here is what I've done
>>> img = cv2.imread("pg13_gau.jpg.png")
>>> gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
>>> edged = cv2.Canny(gray, 30,300)
>>> (img,cnts, _) = cv2.findContours(edged.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
>>> cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:10]

As many have alluded in the comments, the best way is to invert the image so the black text becomes white, find all the non-zero points in the image then determine what the minimum spanning bounding box would be. You can use this bounding box to finally crop your image. Finding the contours is very expensive and it isn't needed here - especially since your text is axis-aligned. You can use a combination of cv2.findNonZero and cv2.boundingRect to do what you need.
Therefore, something like this would work:
import numpy as np
import cv2
img = cv2.imread('ws.png') # Read in the image and convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255*(gray < 128).astype(np.uint8) # To invert the text to white
coords = cv2.findNonZero(gray) # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords) # Find minimum spanning bounding box
rect = img[y:y+h, x:x+w] # Crop the image - note we do this on the original image
cv2.imshow("Cropped", rect) # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite("rect.png", rect) # Save the image
The code above exactly lays out what I talked about in the beginning. We read in the image, but we also convert to grayscale as your image is in colour for some reason. The tricky part is the third line of code where I threshold below the intensity of 128 so that the dark text becomes white. This however produces a binary image, so I convert to uint8, then scale by 255. This essentially inverts the text.
Next, given this image we find all of the non-zero coordinates with cv2.findNonZero and we finally put this into cv2.boundingRect which will give you the top-left corner of the bounding box as well as the width and height. We can finally use this to crop the image. Note we do this on the original image and not the inverted one. We use simply NumPy array indexing to do the cropping for us.
Finally, we show the image to show that it works and we save it to disk.
I now get this image:
For the second image, a good thing to do is to remove some of the right border and bottom border. We can do that by cropping the image down to that first. Next, this image contains some very small noisy pixels. I would recommend doing a morphological opening with a very small kernel, then redo the logic we talked about above.
Therefore:
import numpy as np
import cv2
img = cv2.imread('pg13_gau_preview.png') # Read in the image and convert to grayscale
img = img[:-20,:-20] # Perform pre-cropping
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255*(gray < 128).astype(np.uint8) # To invert the text to white
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones((2, 2), dtype=np.uint8)) # Perform noise filtering
coords = cv2.findNonZero(gray) # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords) # Find minimum spanning bounding box
rect = img[y:y+h, x:x+w] # Crop the image - note we do this on the original image
cv2.imshow("Cropped", rect) # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite("rect.png", rect) # Save the image
Note: Output image removed due to privacy

Opencv reads the image as a numpy array and it's much simpler to use numpy directly (scikit-image does the same). One possible way of doing it is to read the image as grayscale or convert to it and do the row-wise and column-wise operations as shown in the code snippet below. This will remove the columns and rows when all pixels are of pixel_value (white in this case).
def crop_image(filename, pixel_value=255):
gray = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
crop_rows = gray[~np.all(gray == pixel_value, axis=1), :]
cropped_image = crop_rows[:, ~np.all(crop_rows == pixel_value, axis=0)]
return cropped_image
and the output:

This would also work:
from PIL import Image, ImageChops
img = Image.open("pUq4x.png")
pixels = img.load()
print (f"original: {img.size[0]} x {img.size[1]}")
xlist = []
ylist = []
for y in range(0, img.size[1]):
for x in range(0, img.size[0]):
if pixels[x, y] != (255, 255, 255, 255):
xlist.append(x)
ylist.append(y)
left = min(xlist)
right = max(xlist)
top = min(ylist)
bottom = max(ylist)
img = img.crop((left-10, top-10, right+10, bottom+10))
img.show()

Related

how do I get the color that surrounds the Contour from outside, and how to fill the inside of a contour with a color?

I wrote a program that takes a picture as an input and detects the written text
I need to edit my program to add two more functionalities
First
since I already was able to get the Contour coordinates(done), I want to know how do I get the color of the background that surrounds the contour line of each character ( at multiple points because I want to calculate the average of RGB color values surrounding that character )
I just don't know if there is already a function to do what I want to do or if there is a specific approach that I should follow
second,
I tried to fill the inside of the Contour but it did not work, can someone help me to know why
I did try both thickness=-1 and thickness=cv2.FILLE in my cv2.drawContours, nothing worked, clearly only the control line is blue but what is inside is not blue
I know that the code is not perfect and there are so many comments but it is because I'm Kim trying new stuff
thank you in advance to everyone who's going to help
import cv2
import pytesseract
import numpy as np
from PIL import ImageGrab
import pyautogui
def Contour(img):
Contours,hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
for cnt in Contours:
area = cv2.contourArea(cnt)
# print(area)
# if area<150: could be useded to be cpecific
# we need a copy of the image so we dont draw on the original or thickness=-1
cv2.drawContours(imgcpy, cnt, -1, (255,0,0), thickness=-1)
#pytesseract.pytesseract.tesseract_cmd= 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
cap= cv2
img = cv2.imread('D:\\opencv\\R\\img\\lec\\3.png')
imgcpy = img.copy()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# for japanese txt = pytesseract.image_to_string(gray, lang='jpn')
txt = pytesseract.image_to_string(gray)
imgblur = cv2.GaussianBlur(img, (7, 7), 0)
imgcanny = cv2.Canny(img, 200, 200)
#imgDialation = cv2.dilate(imgcanny, kernel, iterations=1)
#imgeroded = cv2.erode(imgDialation, kernel, iterations=1)
imBlank = np.zeros_like(img)
# waste of time only the same picture work -_-
# imstack = np.hstack((gray,gray,gray))
Contour(imgcanny)
cv2.imshow('Contour', imgcpy)
#img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(txt)
print(Contour(imgcanny))
#cv2.imshow('test', img)
cv2.waitKey()
The "first" issue is solved by
getting a mask, either
using connectedComponents or
by filling a mask using drawContours into np.zeros((height, width), dtype=np.uint8)
dilate the mask, then
subtract the mask from the dilated version (literal subtraction will work, but so do bitwise operations)
The result is a mask that contains pixels just outside of the contour.
Use this mask to get those pixel values:
# have `mask` somehow, which should be dtype uint8
dilated = cv.dilate(mask)
surrounding = dilated - mask
values = img[surrounding != 0] # numpy indexing using a boolean array
meancolor = values.mean(axis=(0,1)) # sum/average over both axes
The "second" issue is that you pass a single contour to drawContours, which expects a list of contours.
Change your code to pass [cnt] instead of just cnt:
cv2.drawContours(imgcpy, [cnt], -1, (255,0,0), thickness=-1)
If you only pass cnt, it will interpret the list of points (a contour is a list of points) as a list of contours, hence it will interpret each point as a contour... and that's just a point, not a contour.

Simple Image Segmentation OpenCV-Watershed

I'm still pretty new within the image-segmentation / OpenCV scene and hope you can help me out.
Currently, I'm trying to calculate the percentage of the 2 liquids within this photo
It should be something like this (as an example)
I thought opencv watershed could help me but I'm unable to get it right. I'm trying to set the markers manually but I get the following error: (-215:Assertion failed) src.type() == CV_8UC3 && dst.type() == CV_32SC1 in function 'cv::watershed'
(probably I got my markers all wrong)
If anyone can help me (maybe there is a better way to do this), I would greatly appreciate it
This is the code I use:
import cv2
import numpy as np
img = cv2.imread('image001.jpg')
# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("gray", gray)
# read image
#img = cv2.imread('jeep.jpg')
hh, ww = img.shape[:2]
hh2 = hh // 2
ww2 = ww // 2
# define circles
radius = hh2
yc = hh2
xc = ww2
# draw filled circle in white on black background as mask
mask = np.zeros_like(gray)
mask = cv2.circle(mask, (xc,yc), radius, (255,255,255), -1)
# apply mask to image
result = cv2.bitwise_and(gray, mask)
cv2.imshow("Output", result)
ret, thresh = cv2.threshold(result,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2.imshow("ret 1", thresh)
markers = cv2.circle(thresh, (xc,50), 5, 1, -1)
markers = cv2.circle(thresh, (xc,yc+50), 5, 2, -1)
markers = cv2.circle(thresh, (15,15), 5, 3, -1)
cv2.imshow("marker 1", markers)
markers = cv2.watershed(img, markers)
img[markers == -1] = [0,0,255]
cv2.imshow("watershed", markers)
cv2.waitKey(0)
First of all, you obtain an exception because OpenCV's watershed() function expects markers array to be made of 32-bit integers. Converting it forth and back will remove the errors:
markers = markers.astype(np.int32)
markers = cv2.watershed(img, markers)
markers = markers.astype(np.uint8)
However, if you execute your code now you will see that the result isn't very good, the watershed algorithm will merge the liquid areas with unwanted regions. To make your code work like this, you should give one marker for every feature in your image. This will be very impractical.
Let's first extract the region of the image which interest us, i.e. the two liquid circle. You already tried to do a masking operation, I improved it in the code below by detecting automatically the circle using OpenCV's HoughCircles() function. Then the watershed algorithm will need an image with one marker for each region. We will fill the marker image with zeros, place one marker in each liquid area and one marker for the background. Putting all of this together we obtain the code and result below.
Considering the quality of your image (reflects, compression artefacts, etc), I personally think that the result is quite good and I am not sure that another method will give you better than that. On the other side, if you can get better quality images, segmentation methods based on colour space may be more appropriate (as pointed out by Christoph Rackwitz).
import cv2
import numpy as np
img = cv2.imread('image001.jpg')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Detect circle
circles = cv2.HoughCircles(img_gray, cv2.HOUGH_GRADIENT, 1.3, 100)
# only one circle is detected
circle = circles[0,0,:]
center_x, center_y, radius = circles[0,0,0], circles[0,0,1], circles[0,0,2]
img_circle = img.copy()
cv2.circle(img_circle, (center_x, center_y), int(radius), (0, 255, 0), 3)
cv2.imshow("circle", img_circle)
# build mask for this circle
mask = np.zeros(img_gray.shape, np.uint8)
cv2.circle(mask, (center_x, center_y), int(radius), 255, -1)
img_masked = img.copy()
img_masked[mask == 0] = 0
cv2.imshow("image-masked", img_masked)
# create markers
markers = np.zeros(img_gray.shape, np.int32)
markers[10, 10] = 1 # background marker
markers[int(center_y - radius*0.9), int(center_x)] = 100 # top liquid
markers[int(center_y + radius*0.9), int(center_x)] = 200 # bottom liquid
# do watershed
markers = cv2.watershed(img_masked, markers)
cv2.imshow("watershed", markers.astype(np.uint8))
cv2.waitKey(0)

cropping x-ray image to remove background

I have many x-ray scans and need to crop the scanned object from its background noise.
The files are in .png format and I am planning to use OpenCV Python for this task. I have seen some works with FindContours() but unsure that thresholding will work for this case.
Before Image:
After/Cropped Image:
Any suggested solution/code is appreciated.
Here is one way to do that in Python/OpenCV. It assumes you have the same excess border in all your images so that one can sort contours by area and skip the largest contour to get the second largest one.
Input:
import cv2
import numpy as np
# load image
img = cv2.imread("table_xray.jpg")
hh, ww = img.shape[:2]
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# median filter
filt = cv2.medianBlur(gray, 15)
# threshold the filtered image and invert
thresh = cv2.threshold(filt, 64, 255, cv2.THRESH_BINARY)[1]
thresh = 255 - thresh
# find contours and store index with area in list
cntrs_info = []
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
index=0
for cntr in contours:
area = cv2.contourArea(cntr)
print(index, area)
cntrs_info.append((index,area))
index = index + 1
# sort contours by area
def takeSecond(elem):
return elem[1]
cntrs_info.sort(key=takeSecond, reverse=True)
# get bounding box of second largest contour skipping large border
index_second = cntrs_info[1][0]
x,y,w,h = cv2.boundingRect(contours[index_second])
print(index_second,x,y,w,h)
# crop input image
results = img[y:y+h,x:x+w]
# write result to disk
cv2.imwrite("table_xray_thresholded.png", thresh)
cv2.imwrite("table_xray_extracted.png", results)
cv2.imshow("THRESH", thresh)
cv2.imshow("RESULTS", results)
cv2.waitKey(0)
cv2.destroyAllWindows()
Filtered and Thresholded Image:
Cropped Result:
This is another possible solution. It uses the K-Channel of your input image, once converted to the CMYK color-space. The K (or Key) channel has most of the information of the black color, so it should be useful for segmenting the input image. After that, you can apply a heavy morphological chain to produce a good mask of the object. After that, cropping the object is very straightforward. Let's see the code:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
inputImage = cv2.imread(imagePath+"jU6QA.jpg")
# Convert to float and divide by 255:
imgFloat = inputImage.astype(np.float) / 255.
# Calculate channel K:
kChannel = 1 - np.max(imgFloat, axis=2)
# Convert back to uint 8:
kChannel = (255*kChannel).astype(np.uint8)
The first bit of the program converts your image to the CMYK color-space and extracts the K channel. OpenCV has no direct conversion to this color-space, so a manual conversion is necessary. We need to be careful with the data types because there are float operations involved. The resulting image is this:
Pixels with black information are assigned an intensity close to 255. Now, let's threshold this image to get a binary mask. The threshold level is fixed:
# Threshold the image with a fixed thresh level
thresholdLevel = 200
_, binaryImage = cv2.threshold(kChannel, thresholdLevel, 255, cv2.THRESH_BINARY)
This produces the following binary image:
Alright. We need to isolate the object, however we have both the lines of the background and the "frame" around the image. Let's get rid of the lines first. We will apply a morphological Erosion. Then, we will remove the frame Flood-Filling with black color at two locations: upper left and bottom right of the image. After that, we will apply a Dilation to restore the object's original size. I wrapped these OpenCV functions inside custom functions that save me the typing of a couple of lines - These helper functions are presented at the end of the post. This is the approach:
# Perform Small Erosion:
binaryImage = morphoOperation(binaryImage, 3, 5, "Erode")
# Flood-Fill at two locations: Top left corner and bottom right:
(imageHeight, imageWidth) = binaryImage.shape[:2]
floodPositions = [(0, 0),(imageWidth-1, imageHeight-1)]
binaryImage = floodFill(binaryImage, floodPositions, 0)
# Perform Small Dilate:
binaryImage = morphoOperation(binaryImage, 3, 5, "Dilate")
This is the result:
Nice. We can improve the mask by applying a second morphological chain, this time with more iterations. Let's apply a Dilation to try and join the "holes" of the object, followed with a Erosion to, once again, restore the object's original size:
# Perform Big Dilate:
binaryImage = morphoOperation(binaryImage, 3, 10, "Dilate")
# Perform Big Erode:
binaryImage = morphoOperation(binaryImage, 3, 10, "Erode")
This yields the following result:
The gaps inside the object have been filled. Now, let's retrieve the contours on this mask to find the object's contour. I've additionally included an area filter. The mask is pretty clean by this point, so maybe this filter is not too necessary. Once the contour is located, we can crop the object from the original image:
# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# BGR image for drawing results:
binaryBGR = cv2.cvtColor(binaryImage, cv2.COLOR_GRAY2BGR)
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
# Get blob area:
currentArea = cv2.contourArea(c)
# Set a min area value:
minArea = 10000
if minArea < currentArea:
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Set bounding rect:
color = (0, 255, 0)
cv2.rectangle( binaryBGR, (int(rectX), int(rectY)),
(int(rectX + rectWidth), int(rectY + rectHeight)), color, 5 )
cv2.imshow("Rects", binaryBGR)
# Crop original input:
currentCrop = inputImage[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
cv2.imshow("Cropped", currentCrop)
cv2.waitKey(0)
The last step produces the following two images. The first is the object enclosed by a rectangle, the second one is the actual crop:
I also tested the algorithm with your second image, these are the final results:
Wow. Somebody brought a gun to the airport? That's not OK. These are the helper functions used earlier. This first function performs the morphological operations:
def morphoOperation(binaryImage, kernelSize, opIterations, opString):
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform Operation:
if opString == "Dilate":
op = cv2.MORPH_DILATE
else:
if opString == "Erode":
op = cv2.MORPH_ERODE
outImage = cv2.morphologyEx(binaryImage, op, morphKernel, None, None, opIterations,
cv2.BORDER_REFLECT101)
return outImage
The second function performs Flood-Filling given a list of seed-points:
def floodFill(binaryImage, positions, color):
# Loop thru the positions list of tuples:
for p in range(len(positions)):
currentSeed = positions[p]
x = int(currentSeed[0])
y = int(currentSeed[1])
# Apply flood-fill:
cv2.floodFill(binaryImage, mask=None, seedPoint=(x, y), newVal=(color))
return binaryImage

How to use OpenCV to crop an image based on a certain criteria?

I would like to crop the images like the one below using python's OpenCV library. The area of interest is inside the squiggly lines on the top and bottom, and the lines on the side. The problem is that every image is slightly different. This means that I need some automated way of cropping for the area of interest. I guess the top and the sides would be easy since you could just crop it by 10 pixels or so. But how can I crop out the bottom half of the image where the line is not straight? I have included this example image. The image that follows highlights in pink the area of the image that I am interested in keeping.
Here is one way using Python/OpenCV.
Read input
Get center point (assume it is inside the desired region)
Convert image to grayscale
Floodfill the gray image and set background to black
Get the largest contour and its bounding box
Draw the largest contour as filled on black background as mask
Apply the mask to the input image
Crop the masked input image
Input:
import cv2
import numpy as np
# load image and get dimensions
img = cv2.imread("odd_region.png")
hh, ww, cc = img.shape
# compute center of image (as integer)
wc = ww//2
hc = hh//2
# create grayscale copy of input as basis of mask
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# create zeros mask 2 pixels larger in each dimension
zeros = np.zeros([hh + 2, ww + 2], np.uint8)
# do floodfill at center of image as seed point
ffimg = cv2.floodFill(gray, zeros, (wc,hc), (255), (0), (0), flags=8)[1]
# set rest of ffimg to black
ffimg[ffimg!=255] = 0
# get contours, find largest and its bounding box
contours = cv2.findContours(ffimg, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
area_thresh = 0
for cntr in contours:
area = cv2.contourArea(cntr)
if area > area_thresh:
area = area_thresh
outer_contour = cntr
x,y,w,h = cv2.boundingRect(outer_contour)
# draw the filled contour on a black image
mask = np.full([hh,ww,cc], (0,0,0), np.uint8)
cv2.drawContours(mask,[outer_contour],0,(255,255,255),thickness=cv2.FILLED)
# mask the input
masked_img = img.copy()
masked_img[mask == 0] = 0
#masked_img[mask != 0] = img[mask != 0]
# crop the bounding box region of the masked img
result = masked_img[y:y+h, x:x+w]
# draw the contour outline on a copy of result
result_outline = result.copy()
cv2.drawContours(result_outline,[outer_contour],0,(0,0,255),thickness=1,offset=(-x,-y))
# display it
cv2.imshow("img", img)
cv2.imshow("ffimg", ffimg)
cv2.imshow("mask", mask)
cv2.imshow("masked_img", masked_img)
cv2.imshow("result", result)
cv2.imshow("result_outline", result_outline)
cv2.waitKey(0)
cv2.destroyAllWindows()
# write result to disk
cv2.imwrite("odd_region_cropped.png", result)
cv2.imwrite("odd_region_cropped_outline.png", result_outline)
Result:
Result With Contour Drawn:

OpenCV - Apply mask to a color image

How can I apply mask to a color image in latest python binding (cv2)? In previous python binding the simplest way was to use cv.Copy e.g.
cv.Copy(dst, src, mask)
But this function is not available in cv2 binding. Is there any workaround without using boilerplate code?
Here, you could use cv2.bitwise_and function if you already have the mask image.
For check the below code:
img = cv2.imread('lena.jpg')
mask = cv2.imread('mask.png',0)
res = cv2.bitwise_and(img,img,mask = mask)
The output will be as follows for a lena image, and for rectangular mask.
Well, here is a solution if you want the background to be other than a solid black color. We only need to invert the mask and apply it in a background image of the same size and then combine both background and foreground. A pro of this solution is that the background could be anything (even other image).
This example is modified from Hough Circle Transform. First image is the OpenCV logo, second the original mask, third the background + foreground combined.
# http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghcircles/py_houghcircles.html
import cv2
import numpy as np
# load the image
img = cv2.imread('E:\\FOTOS\\opencv\\opencv_logo.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# detect circles
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_RGB2GRAY), 5)
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 20, param1=50, param2=50, minRadius=0, maxRadius=0)
circles = np.uint16(np.around(circles))
# draw mask
mask = np.full((img.shape[0], img.shape[1]), 0, dtype=np.uint8) # mask is only
for i in circles[0, :]:
cv2.circle(mask, (i[0], i[1]), i[2], (255, 255, 255), -1)
# get first masked value (foreground)
fg = cv2.bitwise_or(img, img, mask=mask)
# get second masked value (background) mask must be inverted
mask = cv2.bitwise_not(mask)
background = np.full(img.shape, 255, dtype=np.uint8)
bk = cv2.bitwise_or(background, background, mask=mask)
# combine foreground+background
final = cv2.bitwise_or(fg, bk)
Note: It is better to use the opencv methods because they are optimized.
import cv2 as cv
im_color = cv.imread("lena.png", cv.IMREAD_COLOR)
im_gray = cv.cvtColor(im_color, cv.COLOR_BGR2GRAY)
At this point you have a color and a gray image. We are dealing with 8-bit, uint8 images here. That means the images can have pixel values in the range of [0, 255] and the values have to be integers.
Let's do a binary thresholding operation. It creates a black and white masked image. The black regions have value 0 and the white regions 255
_, mask = cv.threshold(im_gray, thresh=180, maxval=255, type=cv.THRESH_BINARY)
im_thresh_gray = cv.bitwise_and(im_gray, mask)
The mask can be seen below on the left. The image on its right is the result of applying bitwise_and operation between the gray image and the mask. What happened is, the spatial locations where the mask had a pixel value zero (black), became pixel value zero in the result image. The locations where the mask had pixel value 255 (white), the resulting image retained its original gray value.
To apply this mask to our original color image, we need to convert the mask into a 3 channel image as the original color image is a 3 channel image.
mask3 = cv.cvtColor(mask, cv.COLOR_GRAY2BGR) # 3 channel mask
Then, we can apply this 3 channel mask to our color image using the same bitwise_and function.
im_thresh_color = cv.bitwise_and(im_color, mask3)
mask3 from the code is the image below on the left, and im_thresh_color is on its right.
You can plot the results and see for yourself.
cv.imshow("original image", im_color)
cv.imshow("binary mask", mask)
cv.imshow("3 channel mask", mask3)
cv.imshow("im_thresh_gray", im_thresh_gray)
cv.imshow("im_thresh_color", im_thresh_color)
cv.waitKey(0)
The original image is lenacolor.png that I found here.
Answer given by Abid Rahman K is not completely correct. I also tried it and found very helpful but got stuck.
This is how I copy image with a given mask.
x, y = np.where(mask!=0)
pts = zip(x, y)
# Assuming dst and src are of same sizes
for pt in pts:
dst[pt] = src[pt]
This is a bit slow but gives correct results.
EDIT:
Pythonic way.
idx = (mask!=0)
dst[idx] = src[idx]
The other methods described assume a binary mask. If you want to use a real-valued single-channel grayscale image as a mask (e.g. from an alpha channel), you can expand it to three channels and then use it for interpolation:
assert len(mask.shape) == 2 and issubclass(mask.dtype.type, np.floating)
assert len(foreground_rgb.shape) == 3
assert len(background_rgb.shape) == 3
alpha3 = np.stack([mask]*3, axis=2)
blended = alpha3 * foreground_rgb + (1. - alpha3) * background_rgb
Note that mask needs to be in range 0..1 for the operation to succeed. It is also assumed that 1.0 encodes keeping the foreground only, while 0.0 means keeping only the background.
If the mask may have the shape (h, w, 1), this helps:
alpha3 = np.squeeze(np.stack([np.atleast_3d(mask)]*3, axis=2))
Here np.atleast_3d(mask) makes the mask (h, w, 1) if it is (h, w) and np.squeeze(...) reshapes the result from (h, w, 3, 1) to (h, w, 3).

Categories

Resources