Related
I am trying to create a GAN model which will remove watermark. After doing some homework, I got to this Google AI Blog which makes things worse. Thus I need to create a dataset from these websites Shutterstock, Adobe Stock, Fotolia and Canstock and manymore.
So, when I try to do same image using reverse image search. I founded out that the resolutions, images are changed which makes it more worse.
Thus, I'm only left to create a custom dataset doing the same watermark like from these websites and that's why I need to create same watermark like them on images from unsplash and so..
Can anyone please help me create same watermark which we can get from Shutterstock and Adobe Stock. It'd be a great help.
Note: I have gone through this link for watermark using Imagemagick but I need it in python. If someone can show me a way of doing the same in python. That'd be a great help.
EDIT1: If you look at this Example of Shutterstock. Zoom in and you will find that not only lines but text and rounded symbols are curved and also name and rounded symbol with different opacity. So, that's what I want to replicate.
Here is one way to do that in Python/OpenCV.
Read the input
Create an image of the text
Rotate the text image
Tile out the rotated text image to the size of the input
Blend the tiled, rotated text image with the input image
Save the output
Input:
import cv2
import numpy as np
import math
text = "WATERMARK"
thickness = 2
scale = 0.75
pad = 5
angle = -45
blend = 0.25
def rotate_bound(image, angle):
# function to rotate an image
# from https://github.com/PyImageSearch/imutils/blob/master/imutils/convenience.py
# grab the dimensions of the image and then determine the center
(h, w) = image.shape[:2]
(cX, cY) = (w / 2, h / 2)
# grab the rotation matrix (applying the negative of the
# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# adjust the rotation matrix to take into account translation
M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
# perform the actual rotation and return the image
return cv2.warpAffine(image, M, (nW, nH))
# read image
photo = cv2.imread('lena.jpg')
ph, pw = photo.shape[:2]
# determine size for text image
(wd, ht), baseLine = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, scale, thickness)
print (wd, ht, baseLine)
# add text to black background image padded all around
pad2 = 2 * pad
text_img = np.zeros((ht+pad2,wd+pad2,3), dtype=np.uint8)
text_img = cv2.putText(text_img, text, (pad,ht+pad), cv2.FONT_HERSHEY_SIMPLEX, scale, (255,255,255), thickness)
# rotate text image
text_rot = rotate_bound(text_img, angle)
th, tw = text_rot.shape[:2]
# tile the rotated text image to the size of the input
xrepeats = math.ceil(pw/tw)
yrepeats = math.ceil(ph/th)
print(yrepeats,xrepeats)
tiled_text = np.tile(text_rot, (yrepeats,xrepeats,1))[0:ph, 0:pw]
# combine the text with the image
result = cv2.addWeighted(photo, 1, tiled_text, blend, 0)
# save results
cv2.imwrite("text_img.png", text_img)
cv2.imwrite("text_img_rot.png", text_rot)
cv2.imwrite("lena_tiled_rotated_text_img.jpg", result)
# show the results
cv2.imshow("text_img", text_img)
cv2.imshow("text_rot", text_rot)
cv2.imshow("tiled_text", tiled_text)
cv2.imshow("result", result)
cv2.waitKey(0)
Text Image:
Rotated Text Image:
Result:
Here is another variation in Python/OpenCV that does outline font for the watermark. I have made the font size larger so that the outline is more visible.
import cv2
import numpy as np
import math
text = "WATERMARK"
thickness = 2
scale = 1.5
pad = 5
angle = -45
blend = 0.4
# function to rotate an image
def rotate_bound(image, angle):
# from https://github.com/PyImageSearch/imutils/blob/master/imutils/convenience.py
# grab the dimensions of the image and then determine the center
(h, w) = image.shape[:2]
(cX, cY) = (w / 2, h / 2)
# grab the rotation matrix (applying the negative of the
# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# adjust the rotation matrix to take into account translation
M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
# perform the actual rotation and return the image
return cv2.warpAffine(image, M, (nW, nH))
# read image
photo = cv2.imread('lena.jpg')
ph, pw = photo.shape[:2]
# determine size for text image
(wd, ht), baseLine = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, scale, thickness)
print (wd, ht, baseLine)
# add text to black background image padded all around
# write thicker white text and then write over that with thinner gray text to make outline text
pad2 = 2 * pad
text_img = np.zeros((ht+pad2,wd+pad2,3), dtype=np.uint8)
text_img = cv2.putText(text_img, text, (pad,ht+pad), cv2.FONT_HERSHEY_SIMPLEX, scale, (256,256,256), thickness+3)
text_img = cv2.putText(text_img, text, (pad,ht+pad), cv2.FONT_HERSHEY_SIMPLEX, scale, (128,128,128), thickness)
# rotate text image
text_rot = rotate_bound(text_img, angle)
th, tw = text_rot.shape[:2]
# tile the rotated text image to the size of the input
xrepeats = math.ceil(pw/tw)
yrepeats = math.ceil(ph/th)
print(yrepeats,xrepeats)
tiled_text = np.tile(text_rot, (yrepeats,xrepeats,1))[0:ph, 0:pw]
# combine the text with the image
#result = cv2.addWeighted(photo, 1, tiled_text, blend, 0)
mask = blend * cv2.threshold(tiled_text, 0, 255, cv2.THRESH_BINARY)[1]
result = (mask * tiled_text.astype(np.float64) + (255-mask)*photo.astype(np.float64))/255
result = result.clip(0,255).astype(np.uint8)
# save results
cv2.imwrite("text_img.png", text_img)
cv2.imwrite("text_img_rot.png", text_rot)
cv2.imwrite("lena_tiled_rotated_text_img2.jpg", result)
# show the results
cv2.imshow("text_img", text_img)
cv2.imshow("text_rot", text_rot)
cv2.imshow("tiled_text", tiled_text)
cv2.imshow("result", result)
cv2.waitKey(0)
Result:
I was wondering, given the type of interpolation that is used for image resizes using cv2.resize. How can I find out exactly where a particular pixel maps too? For example, if I'm increasing the size of an image using Linear_interpolation and I take coordinates (785, 251) for a particular pixel, regardless of whether or not the aspect ratio changes between the source image and resized image, how could I find out exactly to what coordinates the pixel in the source image with coordinates == (785, 251) maps in the resized version? I've looked over the internet for a solution but all solutions seem to be indirect methods of finding out where a pixel maps that don't actually work for different aspect ratio's:
https://answers.opencv.org/question/209827/resize-and-remap/
After resizing an image with cv2, how to get the new bounding box coordinate
Is there a way through cv2 to access the way pixels are mapped maybe and through reversing the script finding out the new coordinates?
The reason why I would like this is that I want to be able to create bounding boxes that give me back the same information regardless of the change in aspect ratio of a given image. Every method I've used so far doesn't give me back the same information. I figure that if I can figure out where the particular pixel coordinates of x,y top left and bottom right maps I can recreate an accurate bounding box regardless of aspect ratio changes.
Scaling the coordinates works when the center coordinate is (0, 0).
You may compute x_scaled and y_scaled as follows:
Subtract x_original_center and y_original_center from x_original and y_original.
After subtraction, (0, 0) is the "new center".
Scale the "zero centered" coordinates by scale_x and scale_y.
Convert the "scaled zero centered" coordinates to "top left (0, 0)" by adding x_scaled_center and y_scaled_center.
Computing the center accurately:
The Python conversion is:
(0, 0) is the top left, and (cols-1, rows-1) is the bottom right coordinate.
The accurate center coordinate is:
x_original_center = (original_rows-1)/2
y_original_center = (original_cols-1)/2
Python code (assume img is the original image):
resized_img = cv2.resize(img, [int(cols*scale_x), int(rows*scale_y)])
rows, cols = img.shape[0:2]
resized_rows, resized_cols = resized_img.shape[0:2]
x_original_center = (cols-1) / 2
y_original_center = (rows-1) / 2
x_scaled_center = (resized_cols-1) / 2
y_scaled_center = (resized_rows-1) / 2
# Subtract the center, scale, and add the "scaled center".
x_scaled = (x_original - x_original_center)*scale_x + x_scaled_center
y_scaled = (y_original - y_original_center)*scale_y + y_scaled_center
Testing
The following code sample draws crosses at few original and scaled coordinates:
import cv2
def draw_cross(im, x, y, use_color=False):
""" Draw a cross with center (x,y) - cross is two rows and two columns """
x = int(round(x - 0.5))
y = int(round(y - 0.5))
if use_color:
im[y-4:y+6, x] = [0, 0, 255]
im[y-4:y+6, x+1] = [255, 0, 0]
im[y, x-4:x+6] = [0, 0, 255]
im[y+1, x-4:x+6] = [255, 0, 0]
else:
im[y-4:y+6, x] = 0
im[y-4:y+6, x+1] = 255
im[y, x-4:x+6] = 0
im[y+1, x-4:x+6] = 255
img = cv2.imread('graf.png') # http://man.hubwiz.com/docset/OpenCV.docset/Contents/Resources/Documents/db/d70/tutorial_akaze_matching.html
rows, cols = img.shape[0:2] # cols = 320, rows = 256
# 3 points for testing:
x0_original, y0_original = cols//2-0.5, rows//2-0.5 # 159.5, 127.5
x1_original, y1_original = cols//5-0.5, rows//4-0.5 # 63.5, 63.5
x2_original, y2_original = (cols//5)*3+20-0.5, (rows//4)*3+30-0.5 # 211.5, 221.5
draw_cross(img, x0_original, y0_original) # Center of cross (159.5, 127.5)
draw_cross(img, x1_original, y1_original)
draw_cross(img, x2_original, y2_original)
scale_x = 2.5
scale_y = 2
resized_img = cv2.resize(img, [int(cols*scale_x), int(rows*scale_y)], interpolation=cv2.INTER_NEAREST)
resized_rows, resized_cols = resized_img.shape[0:2] # cols = 800, rows = 512
# Compute center column and center row
x_original_center = (cols-1) / 2 # 159.5
y_original_center = (rows-1) / 2 # 127.5
# Compute center of resized image
x_scaled_center = (resized_cols-1) / 2 # 399.5
y_scaled_center = (resized_rows-1) / 2 # 255.5
# Compute the destination coordinates after resize
x0_scaled = (x0_original - x_original_center)*scale_x + x_scaled_center # 399.5
y0_scaled = (y0_original - y_original_center)*scale_y + y_scaled_center # 255.5
x1_scaled = (x1_original - x_original_center)*scale_x + x_scaled_center # 159.5
y1_scaled = (y1_original - y_original_center)*scale_y + y_scaled_center # 127.5
x2_scaled = (x2_original - x_original_center)*scale_x + x_scaled_center # 529.5
y2_scaled = (y2_original - y_original_center)*scale_y + y_scaled_center # 443.5
# Draw crosses on resized image
draw_cross(resized_img, x0_scaled, y0_scaled, True)
draw_cross(resized_img, x1_scaled, y1_scaled, True)
draw_cross(resized_img, x2_scaled, y2_scaled, True)
cv2.imshow('img', img)
cv2.imshow('resized_img', resized_img)
cv2.waitKey()
cv2.destroyAllWindows()
Original image:
Resized image:
Making sure the crosses are aligned:
Note:
In my answer I was using the naming conventions of Miki's comment.
I have Lego cubes forming 4x4 shape, and I'm trying to infer the status of a zone inside the image:
empty/full and the color whether if yellow or Blue.
to simplify my work I have added red marker to define the border of the shape since the camera is shaking sometimes.
Here is a clear image of the shape I'm trying to detect taken by my phone camera
( EDIT : Note that this image is not my input image, it is used just to demonstrate the required shape clearly ).
The shape from the side camera that I'm supposed to use looks like this:
(EDIT : Now this is my input image)
to focus my work on the working zone I have created a mask:
what I have tried so far is to locate the red markers by color (simple threshold without HSV color-space) as following:
import numpy as np
import matplotlib.pyplot as plt
import cv2
img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)
masked[masked[...,1]>25] = 0
masked[masked[...,2]>25] = 0
masked = masked[..., 0]
masked = cv2.medianBlur(masked,5)
plt.imshow(masked, cmap='gray')
plt.show()
and I have spotted the markers so far:
But I'm still confused:
how to detect the external borders of the desired zone, and the internal borders (each Lego cube(Yellow-Blue-Green) borders) inside the red markers precisely?.
thanks in advance for your kind advice.
I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the lego bricks through a "bird's eye" perspective. Now, the idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, as you know each brick's dimensions (and they are constant) you can trace a grid and extract each cell of the grid, You can compute some HSV-based masks to estimate the dominant color on each grid, and that way you know if the space is occupied by a yellow or blue brick, of it is empty.
These are the steps:
Get an HSV mask of the red markers
Use each marker to estimate the center rectangle through each marker's coordinates
Crop the center rectangle
Divide the rectangle into cells - this is the grid
Run a series of HSV-based maks on each cell and compute the dominant color
Label each cell with the dominant color
Let's see the code:
# Importing cv2 and numpy:
import numpy as np
import cv2
# image path
path = "D://opencvImages//"
fileName = "Bg9iB.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Store a deep copy for results:
inputCopy = inputImage.copy()
# Convert the image to HSV:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
# The HSV mask values (Red):
lowerValues = np.array([127, 0, 95])
upperValues = np.array([179, 255, 255])
# Create the HSV mask
mask = cv2.inRange(hsvImage, lowerValues, upperValues)
The first part is very straightforward. You set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:
We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat big structuring element and 10 iterations. We want those markers as clearly defined as possible:
# Set kernel (structuring element) size:
kernelSize = 5
# Set operation iterations:
opIterations = 10
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
Which yields:
Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea being that, while we will detect every contour, we are not sure of their order. We can sort this list later and get each bounding box from left to right, top to bottom to better estimate the central rectangle. Let's detect contours:
# Create a deep copy, convert it to BGR for results:
maskCopy = mask.copy()
maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# Bounding Rects are stored here:
boundRectsList = []
# Process each contour 1-1:
for i, c in enumerate(contours):
# Approximate the contour to a polygon:
contoursPoly = cv2.approxPolyDP(c, 3, True)
# Convert the polygon to a bounding rectangle:
boundRect = cv2.boundingRect(contoursPoly)
# Get the bounding rect's data:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Estimate the bounding rect area:
rectArea = rectWidth * rectHeight
# Set a min area threshold
minArea = 100
# Filter blobs by area:
if rectArea > minArea:
#Store the rect:
boundRectsList.append(boundRect)
I also created a deep copy of the mask image for further use. Mainly to create this image, which is the result of the contour detection and bounding box approximation:
Notice that I have included a minimum area condition. I want to ignore noise below a certain threshold defined by minArea. Alright, now we have the bounding boxes in the boundRectsList variable. Let's sort this boxes using the Y coordinate:
# Sort the list based on ascending y values:
boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
The list is now sorted and we can enumerate the boxes from left to right, top to bottom. Like this: First "row" -> 0, 1, Second "Row" -> 2, 3. Now, we can define the big, central, rectangle using this info. I call these "inner points". Notice the rectangle is defined as function of all the bounding boxes. For example, its top left starting point is defined by bounding box 0's bottom right ending point (both x and y). Its width is defined by bounding box 1's bottom left x coordinate, height is defined by bounding box 2's rightmost y coordinate. I'm gonna loop through each bounding box and extract their relevant dimensions to construct the center rectangle in the following way: (top left x, top left y, width, height). There's more than one way yo achieve this. I prefer to use a dictionary to get the relevant data. Let's see:
# Rectangle dictionary:
# Each entry is an index of the currentRect list
# 0 - X, 1 - Y, 2 - Width, 3 - Height
# Additionally: -1 is 0 (no dimension):
pointsDictionary = {0: (2, 3),
1: (-1, 3),
2: (2, -1),
3: (-1, -1)}
# Store center rectangle coordinates here:
centerRectangle = [None]*4
# Process the sorted rects:
rectCounter = 0
for i in range(len(boundRectsSorted)):
# Get sorted rect:
currentRect = boundRectsSorted[i]
# Get the bounding rect's data:
rectX = currentRect[0]
rectY = currentRect[1]
rectWidth = currentRect[2]
rectHeight = currentRect[3]
# Draw sorted rect:
cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth),
int(rectY + rectHeight)), (0, 255, 0), 5)
# Get the inner points:
currentInnerPoint = pointsDictionary[i]
borderPoint = [None]*2
# Check coordinates:
for p in range(2):
# Check for '0' index:
idx = currentInnerPoint[p]
if idx == -1:
borderPoint[p] = 0
else:
borderPoint[p] = currentRect[idx]
# Draw the border points:
color = (0, 0, 255)
thickness = -1
centerX = rectX + borderPoint[0]
centerY = rectY + borderPoint[1]
radius = 50
cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)
# Mark the circle
org = (centerX - 20, centerY + 20)
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(maskCopy, str(rectCounter), org, font,
2, (0, 0, 0), 5, cv2.LINE_8)
# Show the circle:
cv2.imshow("Sorted Rects", maskCopy)
cv2.waitKey(0)
# Store the coordinates into list
if rectCounter == 0:
centerRectangle[0] = centerX
centerRectangle[1] = centerY
else:
if rectCounter == 1:
centerRectangle[2] = centerX - centerRectangle[0]
else:
if rectCounter == 2:
centerRectangle[3] = centerY - centerRectangle[1]
# Increase rectCounter:
rectCounter += 1
This image shows each inner point with a red circle. Each circle is enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list:
If you join each inner point you get the center rectangle we have been looking for:
# Check out the big rectangle at the center:
bigRectX = centerRectangle[0]
bigRectY = centerRectangle[1]
bigRectWidth = centerRectangle[2]
bigRectHeight = centerRectangle[3]
# Draw the big rectangle:
cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)), (int(bigRectX + bigRectWidth),
int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
cv2.imshow("Big Rectangle", maskCopy)
cv2.waitKey(0)
Check it out:
Now, just crop this portion of the original image:
# Crop the center portion:
centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]
# Store a deep copy for results:
centerPortionCopy = centerPortion.copy()
This is the central portion of the image:
Cool, now let's create the grid. You know that there must be 4 bricks per width and 4 bricks per height. We can divide the image using this info. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center, for additional processing. These are stored in a list too. Let's see the procedure:
# Dive the image into a grid:
verticalCells = 4
horizontalCells = 4
# Cell dimensions
cellWidth = bigRectWidth / verticalCells
cellHeight = bigRectHeight / horizontalCells
# Store the cells here:
cellList = []
# Store cell centers here:
cellCenters = []
# Loop thru vertical dimension:
for j in range(verticalCells):
# Cell starting y position:
yo = j * cellHeight
# Loop thru horizontal dimension:
for i in range(horizontalCells):
# Cell starting x position:
xo = i * cellWidth
# Cell Dimensions:
cX = int(xo)
cY = int(yo)
cWidth = int(cellWidth)
cHeight = int(cellHeight)
# Crop current cell:
currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]
# into the cell list:
cellList.append(currentCell)
# Store cell center:
cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))
# Draw Cell
cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)
cv2.imshow("Grid", centerPortionCopy)
cv2.waitKey(0)
This is the grid:
Let's now process each cell individually. Of course, you can process each cell on the last loop, but I'm not currently looking for optimization, clarity is my priority. We need to generate a series of HSV masks with the target colors: yellow, blue and green (empty). I prefer to, again, implement a dictionary with the target colors. I'll generate a mask for each color and I'll count the number of white pixels using cv2.countNonZero. Again, I set a minimum threshold. This time of 10. With this info I can determine which mask generated the maximum number of white pixels, thus, giving me the dominant color:
# HSV dictionary - color ranges and color name:
colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
1: ([20, 64, 21], [30, 255, 255], "yellow"),
2: ([55, 64, 21], [92, 255, 255], "green")}
# Cell counter:
cellCounter = 0
for c in range(len(cellList)):
# Get current Cell:
currentCell = cellList[c]
# Convert to HSV:
hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)
# Some additional info:
(h, w) = currentCell.shape[:2]
# Process masks:
maxCount = 10
cellColor = "None"
for m in range(len(colorDictionary)):
# Get current lower and upper range values:
currentLowRange = np.array(colorDictionary[m][0])
currentUppRange = np.array(colorDictionary[m][1])
# Create the HSV mask
mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)
# Get max number of target pixels
targetPixelCount = cv2.countNonZero(mask)
if targetPixelCount > maxCount:
maxCount = targetPixelCount
# Get color name from dictionary:
cellColor = colorDictionary[m][2]
# Get cell center, add an x offset:
textX = int(cellCenters[cellCounter][0]) - 100
textY = int(cellCenters[cellCounter][1])
# Draw text on cell's center:
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(centerPortion, cellColor, (textX, textY), font,
2, (0, 0, 255), 5, cv2.LINE_8)
# Increase cellCounter:
cellCounter += 1
cv2.imshow("centerPortion", centerPortion)
cv2.waitKey(0)
This is the result:
From here it is easy to identify the empty spaces on the grid. What I didn't cover was the perspective rectification of your distorted image, but there's plenty of info on how to do that. Hope this helps you out!
Edit:
If you want to apply this approach to your distorted image you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:
You probably will have to tweak some values because some of the distortion still remains, even after rectification.
I have an image such as this one, which is only black and white:
I would like to obtain only the flooded area of the image with the border using cv2.floodfill, like so (pardon my Paint skills):
Here's my current code:
# Copy the image.
im_floodfill = cv2.resize(actual_map_image, (500, 500)).copy()
# Floodfill from point (X, Y)
cv2.floodFill(im_floodfill, None, (X, Y), (255, 255, 255))
# Display images.
cv2.imshow("Floodfilled Image", im_floodfill)
cv2.waitKey(0)
The output I get is equal to the original image. How can I get only the flooded area with borders?
EDIT: I want to floodfill from any white point inside the "arena", like the red dot (X,Y) in the image. I wish to have only the outer border of the small circles inside the arena and the inner border of the outside walls.
EDIT2: I'm halfway there with this:
# Resize for test purposes
actual_map_image = cv2.resize(actual_map_image, (1000, 1000))
actual_map_image = cv2.cvtColor(actual_map_image, cv2.COLOR_BGR2GRAY)
h, w = actual_map_image.shape[:2]
flood_mask = np.zeros((h+2, w+2), dtype=np.uint8)
connectivity = 8
flood_fill_flags = (connectivity | cv2.FLOODFILL_FIXED_RANGE | cv2.FLOODFILL_MASK_ONLY | 255 << 8)
# Copy the image.
im_floodfill = actual_map_image.copy()
# Floodfill from point inside arena, not inside a black dot
cv2.floodFill(im_floodfill, flood_mask, (h/2 + 20, w/2 + 20), 255, None, None, flood_fill_flags)
borders = []
for i in range(len(actual_map_image)):
borders.append([B-A for A,B in zip(actual_map_image[i], flood_mask[i])])
borders = np.asarray(borders)
borders = cv2.bitwise_not(borders)
# Display images.
cv2.imshow("Original Image", cv2.resize(actual_map_image, (500, 500)))
cv2.imshow("Floodfilled Image", cv2.resize(flood_mask, (500, 500)))
cv2.imshow("Borders", cv2.resize(borders, (500, 500)))
cv2.waitKey(0)
I get this:
However, I feel like this is the wrong way of getting the borders, and they are incomplete.
I think the easiest, and fastest, way to do this is to flood-fill the arena with mid-grey. Then extract just the grey pixels and find their edges. That looks like this, but bear in mind more than half the lines are comments and debug statements :-)
#!/usr/bin/env python3
import cv2
# Load image as greyscale to use 1/3 of the memory and processing time
im = cv2.imread('arena.png', cv2.IMREAD_GRAYSCALE)
# Floodfill arena area with value 128, i.e. mid-grey
floodval = 128
cv2.floodFill(im, None, (150,370), floodval)
# DEBUG cv2.imwrite('result-1.png', im)
# Extract filled area alone
arena = ((im==floodval) * 255).astype(np.uint8)
# DEBUG cv2.imwrite('result-2.png', arena)
# Find edges and save
edges = cv2.Canny(arena,100,200)
# DEBUG cv2.imwrite('result-3.png',edges)
Here are the 3 steps of debug output showing you the sequence of processing:
result-1.png looks like this:
result-2.png looks like this:
result-3.png looks like this:
By the way, you don't have to write any Python code to do this, as you can just do it in the Terminal with ImageMagick which is included in most Linux distros and is available for macOS and Windows. The method used here corresponds exactly to the method I used in Python above:
magick arena.png -colorspace gray \
-fill gray -draw "color 370,150 floodfill" \
-fill white +opaque gray -canny 0x1+10%+30% result.png
How about dilating and xor
kernel = np.ones((3,3), np.uint8)
dilated = cv2.dilate(actual_map_image, kernel, iterations = 1)
borders = cv2.bitwise_xor(dilated, actual_map_image)
That will give you only the borders, I'm not clear if you want the circle borders only or also the interior borders, you should be able to remove borders you don't want based on size.
You can remove the exterior border with a size threshold, define a function like this:
def size_threshold(bw, minimum, maximum):
retval, labels, stats, centroids = cv.connectedComponentsWithStats(bw)
for val in np.where((stats[:, 4] < minimum) + (stats[:, 4] > maximum))[0]:
labels[labels==val] = 0
return (labels > 0).astype(np.uint8) * 255
result = size_threshold(borders, 0, 500)
Replace 500 with the a number larger than borders you want to keep and smaller than the border you want to lose.
I had to create my own Flood Fill implementation to get what I wanted. I based myself on this one.
def fill(data, start_coords, fill_value, border_value, connectivity=8):
"""
Flood fill algorithm
Parameters
----------
data : (M, N) ndarray of uint8 type
Image with flood to be filled. Modified inplace.
start_coords : tuple
Length-2 tuple of ints defining (row, col) start coordinates.
fill_value : int
Value the flooded area will take after the fill.
border_value: int
Value of the color to paint the borders of the filled area with.
connectivity: 4 or 8
Connectivity which we use for the flood fill algorithm (4-way or 8-way).
Returns
-------
filled_data: ndarray
The data with the filled area.
borders: ndarray
The borders of the filled area painted with border_value color.
"""
assert connectivity in [4,8]
filled_data = data.copy()
xsize, ysize = filled_data.shape
orig_value = filled_data[start_coords[0], start_coords[1]]
stack = set(((start_coords[0], start_coords[1]),))
if fill_value == orig_value:
raise ValueError("Filling region with same value already present is unsupported. Did you already fill this region?")
border_points = []
while stack:
x, y = stack.pop()
if filled_data[x, y] == orig_value:
filled_data[x, y] = fill_value
if x > 0:
stack.add((x - 1, y))
if x < (xsize - 1):
stack.add((x + 1, y))
if y > 0:
stack.add((x, y - 1))
if y < (ysize - 1):
stack.add((x, y + 1))
if connectivity == 8:
if x > 0 and y > 0:
stack.add((x - 1, y - 1))
if x > 0 and y < (ysize - 1):
stack.add((x - 1, y + 1))
if x < (xsize - 1) and y > 0:
stack.add((x + 1, y - 1))
if x < (xsize - 1) and y < (ysize - 1):
stack.add((x + 1, y + 1))
else:
if filled_data[x, y] != fill_value:
border_points.append([x,y])
# Fill all image with white
borders = filled_data.copy()
borders.fill(255)
# Paint borders
for x,y in border_points:
borders[x, y] = border_value
return filled_data, borders
The only thing I did was adding the else condition. If the point does not have a value equal to orig_value or fill_value, then it is a border, so I append it to a list that contains the points of all borders. Then I only paint the borders.
I was able to get the following images with this code:
# Resize for test purposes
actual_map_image = cv2.resize(actual_map_image, (500, 500))
actual_map_image = cv2.cvtColor(actual_map_image, cv2.COLOR_BGR2GRAY)
h, w = actual_map_image.shape[:2]
filled_data, borders = fill(actual_map_image, [h/2 + 20, w/2 + 20], 127, 0, connectivity=8)
cv2.imshow("Original Image", actual_map_image)
cv2.imshow("Filled Image", filled_data)
cv2.imshow("Borders", borders)
The one on the right was what I was aiming for. Thank you all!
I have an image (front facing man) with 4 different colors (background, hair, skin-tone, and cloth). I used k-mean with k=4, and image is segmented. Now what I want to do is extract out the hair out of the image.
I used canny edge detection to detect edge, which helped to detect the point in hair area(Pointed out by red dot). Now, I want to extract hair area, as the member of k-mean pointed out by the red dot. Is it possible?
Or is there any other way to extract out hair area from image of a person?
Code done till now is:
import cv2
import numpy as np
image1 = cv2.imread('Test1.jpg')
#Resizing Image for fixed width
def image_resize(image1, width = None, height = None, inter =
cv2.INTER_AREA):
# initialize the dimensions of the image to be resized and
# grab the image size
dim = None
(h, w) = image1.shape[:2]
# if both the width and height are None, then return the
# original image
if width is None and height is None:
return image1
# check to see if the width is None
if width is None:
# calculate the ratio of the height and construct the
# dimensions
r = height / float(h)
dim = (int(w * r), height)
# otherwise, the height is None
else:
# calculate the ratio of the width and construct the
# dimensions
r = width / float(w)
dim = (width, int(h * r))
# resize the image
resized = cv2.resize(image1, dim, interpolation = inter)
# return the resized image
return resized
img1 = image_resize(image1, width = 500)
cv2.imshow("Resized", img1)
cv2.waitKey(0)
#Detecting Edge of image
canny = cv2.Canny(img1, 100, 150)
cv2.imshow("Edge", canny)
cv2.waitKey(0)
coords = np.nonzero(canny)
topmost_y = np.min(coords[0])
#Blurring effect
img2 = cv2.medianBlur(img1, 5)
cv2.imshow("Blurred", img2)
cv2.waitKey(0)
#K-mean approach
Z = img2.reshape((-1,3))
Z = np.float32(Z)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K=4
ret, label1, center1 = cv2.kmeans(Z, K, None,
criteria, 10,
cv2.KMEANS_RANDOM_CENTERS)
center1 = np.uint8(center1)
res1 = center1[label1.flatten()]
output1 = res1.reshape((img2.shape))
cv2.circle(output1, (250, topmost_y + 20), 5, (0,0,255), -1)
cv2.imshow("k = 4", output1)
cv2.waitKey(0)
cv2.destroyAllWindows()
Images:
,
,
,
,
Given the code you already have you can get the xy coordinates of the cluster to which the hair belongs with just a few extra lines. You can also create an image that shows only the hair's cluster:
# find the index of the cluster of the hair
mask = label1.reshape(output1.shape[:-1])
khair = mask[(topmost_y + 20, 250)]
# get a mask that's True at all of the indices of hair's group
hairmask = mask==khair
# get the hair's cluster's xy coordinates
xyhair = hairmask.nonzero()
# plot an image with only the hair's cluster on a white background
cv2.imwrite("khair.jpg", np.where(hairmask[..., None], img1, [255,255,255]))
Here's what the hair's cluster looks like:
Once you have the hair's cluster, you can then find the blob that represents "just the hair". Here's how you'd do that:
import scipy.ndimage as snd
# label all connected blobs in hairmask
bloblab = snd.label(hairmask, structure=np.ones((3,3)))[0]
# create a mask for only the hair
haironlymask = bloblab == bloblab[topmost_y + 20, 250]
# get an image with just the hair and then crop it
justhair = np.where(haironlymask[..., None], img1, [255,255,255])
nz = haironlymask.nonzero()
justhair = justhair[nz[0].min():nz[0].max(), nz[1].min():nz[1].max()]
# save the image of just the hair on a white background
cv2.imwrite("justhair.jpg", justhair)
and here's the image of your hair by itself:
Now that you have one point in this hair region, propagate this point to all the other points.
The pseudo code would be:
set = red point
while set of hair doesn't change:
add all points (i-1, j) (i+1, j) (i, j-1) (i, j+1) to the set
intersect the set with the mask of brown points
At the end, you will have a mask with the hair.
You can do that easily in numpy by starting with a Boolean image with just one True element at the red dot and then use |= and &= operators. I suspect OpenCV also has this kind of morphological dilation operator.