I have an image (a front-facing man) with 4 different colors (background, hair, skin tone, and cloth). I used k-means with k=4, and the image is segmented. Now what I want to do is extract the hair from the image.
I used Canny edge detection to detect edges, which helped me find a point inside the hair area (marked by the red dot). Now I want to extract the hair area, i.e. the k-means cluster that the red dot belongs to. Is that possible?
Or is there any other way to extract the hair area from the image of a person?
The code I have so far is:
import cv2
import numpy as np
image1 = cv2.imread('Test1.jpg')
#Resizing Image for fixed width
def image_resize(image1, width = None, height = None, inter = cv2.INTER_AREA):
    # initialize the dimensions of the image to be resized and
    # grab the image size
    dim = None
    (h, w) = image1.shape[:2]
    # if both the width and height are None, then return the
    # original image
    if width is None and height is None:
        return image1
    # check to see if the width is None
    if width is None:
        # calculate the ratio of the height and construct the
        # dimensions
        r = height / float(h)
        dim = (int(w * r), height)
    # otherwise, the height is None
    else:
        # calculate the ratio of the width and construct the
        # dimensions
        r = width / float(w)
        dim = (width, int(h * r))
    # resize the image
    resized = cv2.resize(image1, dim, interpolation = inter)
    # return the resized image
    return resized
img1 = image_resize(image1, width = 500)
cv2.imshow("Resized", img1)
cv2.waitKey(0)
#Detecting Edge of image
canny = cv2.Canny(img1, 100, 150)
cv2.imshow("Edge", canny)
cv2.waitKey(0)
coords = np.nonzero(canny)
topmost_y = np.min(coords[0])
#Blurring effect
img2 = cv2.medianBlur(img1, 5)
cv2.imshow("Blurred", img2)
cv2.waitKey(0)
#K-mean approach
Z = img2.reshape((-1,3))
Z = np.float32(Z)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K=4
ret, label1, center1 = cv2.kmeans(Z, K, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
center1 = np.uint8(center1)
res1 = center1[label1.flatten()]
output1 = res1.reshape((img2.shape))
cv2.circle(output1, (250, topmost_y + 20), 5, (0,0,255), -1)
cv2.imshow("k = 4", output1)
cv2.waitKey(0)
cv2.destroyAllWindows()
Images:
Given the code you already have, you can get the xy coordinates of the cluster to which the hair belongs with just a few extra lines. You can also create an image that shows only the hair's cluster:
# find the index of the cluster of the hair
mask = label1.reshape(output1.shape[:-1])
khair = mask[(topmost_y + 20, 250)]
# get a mask that's True at all of the indices of hair's group
hairmask = mask==khair
# get the hair's cluster's xy coordinates
xyhair = hairmask.nonzero()
# plot an image with only the hair's cluster on a white background
cv2.imwrite("khair.jpg", np.where(hairmask[..., None], img1, [255,255,255]))
Here's what the hair's cluster looks like:
Once you have the hair's cluster, you can then find the blob that represents "just the hair". Here's how you'd do that:
import scipy.ndimage as snd
# label all connected blobs in hairmask
bloblab = snd.label(hairmask, structure=np.ones((3,3)))[0]
# create a mask for only the hair
haironlymask = bloblab == bloblab[topmost_y + 20, 250]
# get an image with just the hair and then crop it
justhair = np.where(haironlymask[..., None], img1, [255,255,255])
nz = haironlymask.nonzero()
justhair = justhair[nz[0].min():nz[0].max(), nz[1].min():nz[1].max()]
# save the image of just the hair on a white background
cv2.imwrite("justhair.jpg", justhair)
and here's the image of your hair by itself:
Now that you have one point in this hair region, propagate this point to all the other points.
The pseudo code would be:
set = red point
while the set doesn't change:
    add all points (i-1, j), (i+1, j), (i, j-1), (i, j+1) to the set
    intersect the set with the mask of brown points
At the end, you will have a mask with the hair.
You can do that easily in numpy by starting with a Boolean image with just one True element at the red dot and then use |= and &= operators. I suspect OpenCV also has this kind of morphological dilation operator.
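Here is a minimal sketch of that idea in numpy/SciPy, assuming the hairmask cluster mask and the red-dot coordinates (topmost_y + 20, 250) from the earlier code:
import numpy as np
import scipy.ndimage as snd

# seed: a boolean image that is True only at the red dot
seed = np.zeros_like(hairmask, dtype=bool)
seed[topmost_y + 20, 250] = True

grown = seed
while True:
    # dilate by one pixel (4-neighbourhood), then keep only pixels of the hair cluster
    nxt = snd.binary_dilation(grown) & hairmask
    if np.array_equal(nxt, grown):   # stop once the region no longer changes
        break
    grown = nxt
# 'grown' is now a boolean mask of the connected hair region around the red dot.
# scipy.ndimage.binary_propagation(seed, mask=hairmask) does the same in one call.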
Related
I have the function below:
import os
import cv2 as cv

def alphaMerge(small_foreground, background, top, left):
    result = background.copy()
    fg_b, fg_g, fg_r, fg_a = cv.split(small_foreground)
    print(fg_b, fg_g, fg_r, fg_a)
    fg_a = fg_a / 255.0
    label_rgb = cv.merge([fg_b * fg_a, fg_g * fg_a, fg_r * fg_a])
    height, width = small_foreground.shape[0], small_foreground.shape[1]
    part_of_bg = result[top:top + height, left:left + width, :]
    bg_b, bg_g, bg_r = cv.split(part_of_bg)
    part_of_bg = cv.merge([bg_b * (1 - fg_a), bg_g * (1 - fg_a), bg_r * (1 - fg_a)])
    cv.add(label_rgb, part_of_bg, part_of_bg)
    result[top:top + height, left:left + width, :] = part_of_bg
    return result

if __name__ == '__main__':
    folder_dir = r"C:\photo_datasets\products_small"
    logo = cv.imread(r"C:\Users\PiotrSnella\photo_datasets\discount.png", cv.IMREAD_UNCHANGED)
    for images in os.listdir(folder_dir):
        input_path = os.path.join(folder_dir, images)
        image_size = os.stat(input_path).st_size
        if image_size < 8388608:
            img = cv.imread(input_path, cv.IMREAD_UNCHANGED)
            height, width, channels = img.shape
            if height > 500 and width > 500:
                result = alphaMerge(logo, img, 0, 0)
                cv.imwrite(r'C:\photo_datasets\products_small_output_cv\{}.png'.format(images), result)
I want to combine two pictures: one is the logo, which I would like to apply to the full dataset from the folder products_small. I'm getting an error on the line part_of_bg = cv.merge([bg_b * (1 - fg_a), bg_g * (1 - fg_a), bg_r * (1 - fg_a)]):
ValueError: operands could not be broadcast together with shapes (720,540) (766,827)
I tried other combining options and still get the error about mismatched shapes. Could the photos be the problem, or is it something in the code?
Thank you for your help guys :)
Here is one way to do that in Python/OpenCV. I will place a 20% resized logo onto the pants image at coordinates 660,660 on the right side pocket.
Read the background image (pants)
Read the foreground image (logo) unchanged to preserve the alpha channel
Resize the foreground (logo) to 20%
Create a transparent image the size of the background image
Insert the resized foreground (logo) into the transparent image at the desired location
Extract the alpha channel from the inserted, resized foreground image
Extract the base BGR channels from the inserted, resized foreground image
Blend the background image and the base BGR image using the alpha channel as a mask using np.where(). Note all images must be the same dimensions and 3 channels
Save the result
Background Image:
Foreground Image:
import cv2
import numpy as np
# read background image
bimg = cv2.imread('pants.jpg')
hh, ww = bimg.shape[:2]
# read foreground image
fimg = cv2.imread('flashsale.png', cv2.IMREAD_UNCHANGED)
# resize foreground image
fimg_small = cv2.resize(fimg, (0,0), fx=0.2, fy=0.2)
ht, wd = fimg_small.shape[:2]
# create transparent image
fimg_new = np.full((hh,ww,4), (0,0,0,0), dtype=np.uint8)
# insert resized image into transparent image at desired coordinates
fimg_new[660:660+ht, 660:660+wd] = fimg_small
# extract alpha channel from foreground image as mask and make 3 channels
alpha = fimg_new[:,:,3]
alpha = cv2.merge([alpha,alpha,alpha])
# extract bgr channels from foreground image
base = fimg_new[:,:,0:3]
# blend the two images using the alpha channel as controlling mask
result = np.where(alpha==(0,0,0), bimg, base)
# save result
cv2.imwrite("pants_flashsale.png", result)
# show result
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Result:
This just requires some multiplication and subtraction.
Your overlay has an actual alpha channel, not just a boolean mask. You should use it. It makes edges look better than just a hard boolean mask.
I see one issue with your overlay: it doesn't have any "shadow" to give the white text contrast against a potentially white background.
When you resize RGBA data, it's not trivial. You'd better export the graphic from your vector graphics program in the desired resolution in the first place. Resizing after the fact requires operations to make sure partially transparent pixels (neither 100% opaque nor 100% transparent) are calculated properly so undefined "background" from the fully transparent areas of the overlay image is not mixed into those partially transparent pixels.
base = cv.imread("U3hRd.jpg")
overlay = cv.imread("OBxGQ.png", cv.IMREAD_UNCHANGED)
(bheight, bwidth) = base.shape[:2]
(oheight, owidth) = overlay.shape[:2]
print("base:", bheight, bwidth)
print("overlay:", oheight, owidth)
# place overlay in center
#ox = (bwidth - owidth) // 2
#oy = (bheight - oheight) // 2
# place overlay in top left
ox = 0
oy = 0
overlay_color = overlay[:,:,:3]
overlay_alpha = overlay[:,:,3] * np.float32(1/255)
# "unsqueeze" (insert 1-sized color dimension) so numpy broadcasting works
overlay_alpha = np.expand_dims(overlay_alpha, axis=2)
composite = base.copy()
base_roi = base[oy:oy+oheight, ox:ox+owidth]
composite_roi = composite[oy:oy+oheight, ox:ox+owidth]
composite_roi[:,:] = overlay_color * overlay_alpha + base_roi * (1 - overlay_alpha)
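If you do need to resize the RGBA overlay in code rather than re-exporting it at the right size, here is a minimal sketch of the premultiplied-alpha approach described above (the file name and scale factor are assumptions):
import cv2
import numpy as np

rgba = cv2.imread("overlay.png", cv2.IMREAD_UNCHANGED).astype(np.float32)  # assumed file
alpha = rgba[:, :, 3] / 255.0
premult = rgba[:, :, :3] * alpha[:, :, None]   # premultiply colour by alpha

scale = 0.5  # assumed scale factor
premult_small = cv2.resize(premult, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
alpha_small = cv2.resize(alpha, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_AREA)

# un-premultiply, guarding against division by zero in fully transparent areas
colour_small = premult_small / np.maximum(alpha_small[:, :, None], 1e-6)
rgba_small = np.dstack([colour_small, alpha_small * 255.0]).round().astype(np.uint8)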
This is what you wanted in the top left corner. Notice that the logo with its white foreground doesn't work well against the background of pant.jpg.
Just 17 lines of code, compared to the other answers:
import cv2
import numpy as np
img1 = cv2.imread('pant.jpg')
overlay_img1 = np.ones(img1.shape,np.uint8)*255
img2 = cv2.imread('logo3.png')
rows,cols,channels = img2.shape
overlay_img1[0:rows, 0:cols ] = img2
img2gray = cv2.cvtColor(overlay_img1,cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray,220,255,cv2.THRESH_BINARY_INV)
mask_inv = cv2.bitwise_not(mask)
temp1 = cv2.bitwise_and(img1,img1,mask = mask_inv)
temp2 = cv2.bitwise_and(overlay_img1,overlay_img1, mask = mask)
result = cv2.add(temp1,temp2)
cv2.imshow("Result",result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Logo resized (320x296):
Here is my code:
import cv2
import json

# JSON file in which the EasyOCR annotations have been saved.
img = cv2.imread('dummy.jpg')
img1 = img.copy()
# rotated because the annotations follow the vertical alignment of the image, so I have matched the orientation
img1 = cv2.rotate(img1, rotateCode=cv2.ROTATE_90_CLOCKWISE)
rects = []
with open('dummy.json') as jsn:
    jsn_dict = json.load(jsn)
    for k in jsn_dict['textAnnotations']:
        vertices = k['boundingPoly']['vertices']
        cv2.rectangle(img1, list(vertices[2].values()), list(vertices[0].values()), [0, 255, 0], 10)
        # I want to put the predicted text on top of the bounding boxes vertically because my image is rotated anti-clockwise
        cv2.putText(img1, k['description'], list(vertices[0].values()), cv2.FONT_HERSHEY_SIMPLEX, 5, [0, 255, 0], 5)
With the code mentioned above I am labelling the recognized text. The first step is that I put the image into the OCR model, and it returns three values for every detected text: the vertices of the bounding box, the recognized text, and a confidence percentage. My problem is that my image was rotated by its Exif orientation value, while cv2 reads it as if it had zero rotation, so my text is printed horizontally. I want to print the text on the image vertically. I have tried many times but could not resolve the problem. I hope I have explained it well.
Try this one
import cv2

def transparentOverlay(src, overlay, pos=(0, 0), scale=1):
    """
    :param src: Input Color Background Image
    :param overlay: transparent Image (BGRA)
    :param pos: position where the image to be blit.
    :param scale : scale factor of transparent image.
    :return: Resultant Image
    """
    overlay = cv2.resize(overlay, (0, 0), fx=scale, fy=scale)
    h, w, _ = overlay.shape  # Size of foreground
    rows, cols, _ = src.shape  # Size of background Image
    y, x = pos[0], pos[1]  # Position of foreground/overlay image
    # loop over all pixels and apply the blending equation
    for i in range(h):
        for j in range(w):
            if x + i >= rows or y + j >= cols:
                continue
            alpha = float(overlay[i][j][3] / 255.0)  # read the alpha channel
            src[x + i][y + j] = alpha * overlay[i][j][:3] + (1 - alpha) * src[x + i][y + j]
    return src

def addImageWatermark(LogoImage, MainImage, opacity, pos=(10, 100)):
    opacity = opacity / 100
    OriImg = cv2.imread(MainImage, -1)
    waterImg = cv2.imread(LogoImage, -1)
    tempImg = OriImg.copy()
    print(tempImg.shape)
    overlay = transparentOverlay(tempImg, waterImg, pos)
    output = OriImg.copy()
    # apply the overlay
    cv2.addWeighted(overlay, opacity, output, 1 - opacity, 0, output)
    cv2.imshow('Life2Coding', output)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == '__main__':
    addImageWatermark('./logo.png', './hanif.jpg', 100, (10, 100))
Rotate your image 90º clockwise, add the text, and rotate the image back to the original.
# Rotate 90º clockwise
img_rot = cv2.rotate(img1 , cv2.ROTATE_90_CLOCKWISE)
# Add your text here, adjusting x and y coordinates to the new orientation.
# The new adjusted coordinates will be:
# (x2, y2) = (original_height - y, x)
# [...]
# Rotate back
img1 = cv2.rotate(img_rot, cv2.ROTATE_90_COUNTERCLOCKWISE)
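A hedged sketch of the full sequence, where the anchor point and label text are hypothetical placeholders rather than values from your JSON:
import cv2

img_rot = cv2.rotate(img1, cv2.ROTATE_90_CLOCKWISE)
orig_h = img1.shape[0]
x, y = 100, 400                    # hypothetical point in the original (unrotated) image
x2, y2 = orig_h - y, x             # the same point after the 90-degree clockwise rotation
cv2.putText(img_rot, "label", (x2, y2), cv2.FONT_HERSHEY_SIMPLEX, 5, (0, 255, 0), 5)
img1 = cv2.rotate(img_rot, cv2.ROTATE_90_COUNTERCLOCKWISE)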
Hello, I want to reflect an object in an image, as in this image: [1]
[1]: https://i.stack.imgur.com/N9J3I.jpg How can I get this kind of result?
It is possible that OpenCV does not have a good solution for this; take a closer look at Pillow.
from PIL import Image, ImageFilter

def drop_shadow(image, iterations=3, border=8, offset=(5,5), background_colour=0xffffff, shadow_colour=0x444444):
    shadow_width = image.size[0] + abs(offset[0]) + 2 * border
    shadow_height = image.size[1] + abs(offset[1]) + 2 * border
    shadow = Image.new(image.mode, (shadow_width, shadow_height), background_colour)
    shadow_left = border + max(offset[0], 0)
    shadow_top = border + max(offset[1], 0)
    shadow.paste(shadow_colour, [shadow_left, shadow_top, shadow_left + image.size[0], shadow_top + image.size[1]])
    for i in range(iterations):
        shadow = shadow.filter(ImageFilter.BLUR)
    img_left = border - min(offset[0], 0)
    img_top = border - min(offset[1], 0)
    shadow.paste(image, (img_left, img_top))
    return shadow

drop_shadow(Image.open('boobs.jpg')).save('shadowed_boobs.png')
Here is one way to do the reflection in Python/OpenCV.
Flip the image vertically. Then make a vertical ramp (gradient) image and put it into the alpha channel of the flipped image. Finally, concatenate the original and flipped images.
Input:
import cv2
import numpy as np
# set top and bottom opacity percentages
top = 85
btm = 15
# load image
img = cv2.imread('bear2.png')
hh, ww = img.shape[:2]
# flip the input
flip = np.flip(img, axis=0)
# add opaque alpha channel to input
img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
# make vertical gradient that is bright at top and dark at bottom as alpha channel for the flipped image
gtop = 255*top//100
gbtm = 255*btm//100
grady = np.linspace(gbtm, gtop, hh, dtype=np.uint8)
gradx = np.linspace(1, 1, ww, dtype=np.uint8)
grad = np.outer(grady, gradx)
grad = np.flip(grad, axis=0)
# alternate method
#grad = np.linspace(0, 255, hh, dtype=np.uint8)
#grad = np.tile(grad, (ww,1))
#grad = np.transpose(grad)
#grad = np.flip(grad, axis=0)
# put the gradient into the alpha channel of the flipped image
flip = cv2.cvtColor(flip, cv2.COLOR_BGR2BGRA)
flip[:,:,3] = grad
# concatenate the original and the flipped versions
result = np.vstack((img, flip))
# save output
cv2.imwrite('bear2_vertical_gradient.png', grad)
cv2.imwrite('bear2_reflection.png', result)
# Display various images to see the steps
cv2.imshow('flip',flip)
cv2.imshow('grad',grad)
cv2.imshow('result',result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Ramped (Gradient) Image:
Result:
I have Lego cubes forming a 4x4 shape, and I'm trying to infer the status of each zone inside the image:
empty/full, and the color, whether it is yellow or blue.
To simplify my work I have added red markers to define the borders of the shape, since the camera shakes sometimes.
Here is a clear image of the shape I'm trying to detect, taken by my phone camera
(EDIT: note that this image is not my input image, it is used just to demonstrate the required shape clearly).
The shape from the side camera that I'm supposed to use looks like this:
(EDIT: now this is my input image)
To focus my work on the working zone I have created a mask:
What I have tried so far is to locate the red markers by color (a simple threshold, without HSV color space), as follows:
import numpy as np
import matplotlib.pyplot as plt
import cv2
img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)
masked[masked[...,1]>25] = 0
masked[masked[...,2]>25] = 0
masked = masked[..., 0]
masked = cv2.medianBlur(masked,5)
plt.imshow(masked, cmap='gray')
plt.show()
and I have spotted the markers so far:
But I'm still confused:
how do I precisely detect the external borders of the desired zone, and the internal borders (the borders of each Lego cube: yellow, blue, green) inside the red markers?
Thanks in advance for your kind advice.
I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the Lego bricks from a "bird's eye" perspective. The idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, since you know each brick's dimensions (and they are constant), you can trace a grid and extract each cell of the grid. You can compute some HSV-based masks to estimate the dominant color in each cell, and that way you know if the space is occupied by a yellow or blue brick, or if it is empty.
These are the steps:
Get an HSV mask of the red markers
Use each marker to estimate the center rectangle through each marker's coordinates
Crop the center rectangle
Divide the rectangle into cells - this is the grid
Run a series of HSV-based masks on each cell and compute the dominant color
Label each cell with the dominant color
Let's see the code:
# Importing cv2 and numpy:
import numpy as np
import cv2
# image path
path = "D://opencvImages//"
fileName = "Bg9iB.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Store a deep copy for results:
inputCopy = inputImage.copy()
# Convert the image to HSV:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
# The HSV mask values (Red):
lowerValues = np.array([127, 0, 95])
upperValues = np.array([179, 255, 255])
# Create the HSV mask
mask = cv2.inRange(hsvImage, lowerValues, upperValues)
The first part is very straightforward. You set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:
We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat big structuring element and 10 iterations. We want those markers as clearly defined as possible:
# Set kernel (structuring element) size:
kernelSize = 5
# Set operation iterations:
opIterations = 10
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
Which yields:
Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea being that, while we will detect every contour, we are not sure of their order. We can sort this list later and get each bounding box from left to right, top to bottom to better estimate the central rectangle. Let's detect contours:
# Create a deep copy, convert it to BGR for results:
maskCopy = mask.copy()
maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# Bounding Rects are stored here:
boundRectsList = []
# Process each contour 1-1:
for i, c in enumerate(contours):
    # Approximate the contour to a polygon:
    contoursPoly = cv2.approxPolyDP(c, 3, True)
    # Convert the polygon to a bounding rectangle:
    boundRect = cv2.boundingRect(contoursPoly)
    # Get the bounding rect's data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight
    # Set a min area threshold
    minArea = 100
    # Filter blobs by area:
    if rectArea > minArea:
        # Store the rect:
        boundRectsList.append(boundRect)
I also created a deep copy of the mask image for further use. Mainly to create this image, which is the result of the contour detection and bounding box approximation:
Notice that I have included a minimum area condition. I want to ignore noise below a certain threshold defined by minArea. Alright, now we have the bounding boxes in the boundRectsList variable. Let's sort these boxes using the Y coordinate:
# Sort the list based on ascending y values:
boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
The list is now sorted and we can enumerate the boxes from left to right, top to bottom. Like this: first "row" -> 0, 1, second "row" -> 2, 3. Now, we can define the big, central rectangle using this info. I call these "inner points". Notice the rectangle is defined as a function of all the bounding boxes. For example, its top left starting point is defined by bounding box 0's bottom right ending point (both x and y). Its width is defined by bounding box 1's bottom left x coordinate, and its height is defined by bounding box 2's rightmost y coordinate. I'm gonna loop through each bounding box and extract their relevant dimensions to construct the center rectangle in the following way: (top left x, top left y, width, height). There's more than one way to achieve this. I prefer to use a dictionary to get the relevant data. Let's see:
# Rectangle dictionary:
# Each entry is an index of the currentRect list
# 0 - X, 1 - Y, 2 - Width, 3 - Height
# Additionally: -1 is 0 (no dimension):
pointsDictionary = {0: (2, 3),
                    1: (-1, 3),
                    2: (2, -1),
                    3: (-1, -1)}
# Store center rectangle coordinates here:
centerRectangle = [None]*4
# Process the sorted rects:
rectCounter = 0
for i in range(len(boundRectsSorted)):
    # Get sorted rect:
    currentRect = boundRectsSorted[i]
    # Get the bounding rect's data:
    rectX = currentRect[0]
    rectY = currentRect[1]
    rectWidth = currentRect[2]
    rectHeight = currentRect[3]
    # Draw sorted rect:
    cv2.rectangle(maskCopy, (int(rectX), int(rectY)),
                  (int(rectX + rectWidth), int(rectY + rectHeight)), (0, 255, 0), 5)
    # Get the inner points:
    currentInnerPoint = pointsDictionary[i]
    borderPoint = [None] * 2
    # Check coordinates:
    for p in range(2):
        # Check for '0' index:
        idx = currentInnerPoint[p]
        if idx == -1:
            borderPoint[p] = 0
        else:
            borderPoint[p] = currentRect[idx]
    # Draw the border points:
    color = (0, 0, 255)
    thickness = -1
    centerX = rectX + borderPoint[0]
    centerY = rectY + borderPoint[1]
    radius = 50
    cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)
    # Mark the circle
    org = (centerX - 20, centerY + 20)
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(maskCopy, str(rectCounter), org, font, 2, (0, 0, 0), 5, cv2.LINE_8)
    # Show the circle:
    cv2.imshow("Sorted Rects", maskCopy)
    cv2.waitKey(0)
    # Store the coordinates into list
    if rectCounter == 0:
        centerRectangle[0] = centerX
        centerRectangle[1] = centerY
    else:
        if rectCounter == 1:
            centerRectangle[2] = centerX - centerRectangle[0]
        else:
            if rectCounter == 2:
                centerRectangle[3] = centerY - centerRectangle[1]
    # Increase rectCounter:
    rectCounter += 1
This image shows each inner point with a red circle. Each circle is enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list:
If you join each inner point you get the center rectangle we have been looking for:
# Check out the big rectangle at the center:
bigRectX = centerRectangle[0]
bigRectY = centerRectangle[1]
bigRectWidth = centerRectangle[2]
bigRectHeight = centerRectangle[3]
# Draw the big rectangle:
cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)),
              (int(bigRectX + bigRectWidth), int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
cv2.imshow("Big Rectangle", maskCopy)
cv2.waitKey(0)
Check it out:
Now, just crop this portion of the original image:
# Crop the center portion:
centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]
# Store a deep copy for results:
centerPortionCopy = centerPortion.copy()
This is the central portion of the image:
Cool, now let's create the grid. You know that there must be 4 bricks across the width and 4 bricks across the height. We can divide the image using this info. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center, for additional processing. These are stored in a list too. Let's see the procedure:
# Divide the image into a grid:
verticalCells = 4
horizontalCells = 4
# Cell dimensions
cellWidth = bigRectWidth / verticalCells
cellHeight = bigRectHeight / horizontalCells
# Store the cells here:
cellList = []
# Store cell centers here:
cellCenters = []
# Loop thru vertical dimension:
for j in range(verticalCells):
    # Cell starting y position:
    yo = j * cellHeight
    # Loop thru horizontal dimension:
    for i in range(horizontalCells):
        # Cell starting x position:
        xo = i * cellWidth
        # Cell Dimensions:
        cX = int(xo)
        cY = int(yo)
        cWidth = int(cellWidth)
        cHeight = int(cellHeight)
        # Crop current cell:
        currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]
        # into the cell list:
        cellList.append(currentCell)
        # Store cell center:
        cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))
        # Draw Cell
        cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)

cv2.imshow("Grid", centerPortionCopy)
cv2.waitKey(0)
This is the grid:
Let's now process each cell individually. Of course, you could process each cell in the previous loop, but I'm not currently looking for optimization; clarity is my priority. We need to generate a series of HSV masks with the target colors: yellow, blue and green (empty). I prefer to, again, implement a dictionary with the target colors. I'll generate a mask for each color and count the number of white pixels using cv2.countNonZero. Again, I set a minimum threshold, this time of 10 pixels. With this info I can determine which mask generated the maximum number of white pixels, thus giving me the dominant color:
# HSV dictionary - color ranges and color name:
colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
                   1: ([20, 64, 21], [30, 255, 255], "yellow"),
                   2: ([55, 64, 21], [92, 255, 255], "green")}

# Cell counter:
cellCounter = 0

for c in range(len(cellList)):
    # Get current Cell:
    currentCell = cellList[c]
    # Convert to HSV:
    hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)
    # Some additional info:
    (h, w) = currentCell.shape[:2]
    # Process masks:
    maxCount = 10
    cellColor = "None"
    for m in range(len(colorDictionary)):
        # Get current lower and upper range values:
        currentLowRange = np.array(colorDictionary[m][0])
        currentUppRange = np.array(colorDictionary[m][1])
        # Create the HSV mask
        mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)
        # Get max number of target pixels
        targetPixelCount = cv2.countNonZero(mask)
        if targetPixelCount > maxCount:
            maxCount = targetPixelCount
            # Get color name from dictionary:
            cellColor = colorDictionary[m][2]
    # Get cell center, add an x offset:
    textX = int(cellCenters[cellCounter][0]) - 100
    textY = int(cellCenters[cellCounter][1])
    # Draw text on cell's center:
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(centerPortion, cellColor, (textX, textY), font, 2, (0, 0, 255), 5, cv2.LINE_8)
    # Increase cellCounter:
    cellCounter += 1

cv2.imshow("centerPortion", centerPortion)
cv2.waitKey(0)
This is the result:
From here it is easy to identify the empty spaces on the grid. What I didn't cover was the perspective rectification of your distorted image, but there's plenty of info on how to do that. Hope this helps you out!
Edit:
If you want to apply this approach to your distorted image you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:
You probably will have to tweak some values because some of the distortion still remains, even after rectification.
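As a hedged sketch of the perspective part of that rectification step: the corner coordinates below are assumed placeholders (in practice you would take them from the four red markers, e.g. from the earlier contour step), and this does not undo the fish-eye distortion, which needs camera calibration and cv2.undistort. It reuses inputImage from the code above.
import cv2
import numpy as np

# Assumed corners of the board in the distorted image, ordered
# top-left, top-right, bottom-right, bottom-left:
srcPoints = np.float32([[120, 95], [510, 80], [560, 420], [90, 440]])
# Flat, square target of an arbitrary size:
dstPoints = np.float32([[0, 0], [400, 0], [400, 400], [0, 400]])

# Estimate the homography and warp to a "bird's eye" view:
H = cv2.getPerspectiveTransform(srcPoints, dstPoints)
rectified = cv2.warpPerspective(inputImage, H, (400, 400))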
I have used OpenCV for cloud detection. It can detect white clouds, but when the clouds are dark it does not detect them.
In my detection, the white mask is for the clouds and the black mask is for the sky.
Here is what it looks like:
This is the original image:
This is the masked image:
Here are the painted clouds, which are supposed to be masked in white, but here they are considered sky and masked in black:
They are supposed to be all white, since there is no sky, just dark clouds.
How can I solve this problem? Thanks.
Here is the code used:
import cv2
import numpy as np
from datetime import datetime

def preprocess(image):
    """ Preprocess 'image' to help separate cloud and sky. """
    B, G, R = cv2.split(image)  # extract the colour channels
    # construct a ratio between the blue and red channel sum and difference
    BR_sum = B + R
    BR_diff = B - R
    # handle X/0 and 0/0 errors, and remove NaNs (not a number)
    with np.errstate(divide='ignore', invalid='ignore'):
        BR_ratio = BR_diff / BR_sum
        BR_ratio = np.nan_to_num(BR_ratio)
    # normalize to 0-255 range and convert to 8-bit unsigned integers
    return cv2.normalize(BR_ratio, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
def makebinary(imagepath, radiusMask=None):
    startTime = datetime.now()
    # read in the image and shrink for faster processing
    image = cv2.imread(imagepath)
    scale = 0.2
    smaller = cv2.resize(image, (0,0), fx=scale, fy=scale)
    center = [dim / 2 for dim in smaller.shape[:2]]
    preprocessed = preprocess(smaller.astype(float))
    if radiusMask:
        # apply a circular mask to get only the pixels of interest
        from cmask import cmask
        mask = cmask(center, scale * radiusMask, smaller).astype(bool)
    else:
        mask = np.ones(preprocessed.shape).astype(bool)
    masked = preprocessed[mask]
    # use Otsu's method to separate clouds and sky
    threshold, result = cv2.threshold(masked, 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # invert the result so clouds are white (255) and sky is black (0)
    inverted = cv2.bitwise_not(result)
Here is the cmask function:
import numpy as np

def cmask(center, radius, image):
    c_y, c_x = center
    num_y, num_x = image.shape[:2]
    image_mask = np.ones((num_y, num_x))
    y, x = np.ogrid[-c_y:num_y-c_y, -c_x:num_x-c_x]
    # remove components outside the circle
    image_mask[(x*x + y*y) > radius*radius] = 0
    return image_mask