First of all, I am not asking anyone to do my homework. I would like an explanation of the parts of the following question that I am struggling to understand.
I just finished my image processing test, but there was one question I could not solve because I was confused.
The question is:
Write the code to detect the red eye in a given image in RGB color space using the following formula for HSL color space:
LS_ratio = L / S
eye_pixel = (L >= 64) and (S >= 100) and (LS_ratio > 0.5) and (LS_ratio < 1.5) and ((H <= 7) or (H >= 162))
Please note that in the above formula, H, S and L represent a single pixel value for the image in HSL color space and the value of ‘eye_pixel’ will be either True or False depending on the values of H, S and L (i.e. it will be either a red eye color pixel or not).
Your task is to write the code to check all pixels in the image. Store the result as a numpy array and display the resulting image.
My code is:
from __future__ import print_function
import numpy as np
import argparse
import cv2
#argument parser
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required = True, help = "Path to the image")
args = vars(ap.parse_args())
#load the image
image = cv2.imread(args["image"])
#Convert image to HLS
hls = cv2.cvtColor(image, cv2.COLOR_BGR2HLS)
#Split HLS Channels
H = hls[:, :, 0]
S = hls[:, :, 1]
L = hls[:, :, 2]
LS_ratio = L / S
#eye_pixel = (L >= 64) and (S >= 100) and (LS_ratio > 0.5) and (LS_ratio < 1.5) and ((H <= 7) or (H >= 162))
#if HSL pixel
#eye pixel either red or not
#show the image
#cv2.imshow("Image", np.hstack([image, red_eye]))
print("Lightness is: {}".format(L))
print("Saturation is: {}".format(S))
print("Hue is: {}".format(H))
#print("LS ratio: {}", LS_ratio)
Suppose that the image is:
I am genuinely confused about what needs to be done. I would highly appreciate it if anyone could explain what I should do.
Thank you.
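All you need to do is implement the formula in terms of the entire H, L, and S images. Note that OpenCV's HLS conversion stores the channels in the order H, L, S, so L is channel 1 and S is channel 2: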
#Convert image to HLS
hls = cv2.cvtColor(image, cv2.COLOR_BGR2HLS)
#Split HLS Channels
H = hls[:, :, 0]
L = hls[:, :, 1]
S = hls[:, :, 2]
LS_ratio = L/(S + 1e-6)
redeye = ((L>=64) * (S>=100) * np.logical_or(H<=7, H>=162) * (LS_ratio>0.5) * (LS_ratio<1.5)).astype(bool)
Here redeye is a bool array with the same height and width as your original image, where each pixel is True or False depending on whether it is a red-eye pixel. If I display the image:
redeye = cv2.cvtColor(redeye.astype(np.uint8)*255, cv2.COLOR_GRAY2BGR)
cv2.imshow('image-redeye', np.hstack([image, redeye]))
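To check the formula without an image file, here is a minimal NumPy-only sketch of the same mask. The function name `redeye_mask` and the tiny hand-made H, L, S arrays are made up for illustration; the thresholds are the ones from the question.

```python
import numpy as np

def redeye_mask(H, L, S):
    """Apply the red-eye formula to whole H, L, S channel arrays at once."""
    # Work in float so that the division does not wrap around or truncate
    H = H.astype(np.float64)
    L = L.astype(np.float64)
    S = S.astype(np.float64)
    ls_ratio = L / (S + 1e-6)  # small epsilon avoids division by zero
    return ((L >= 64) & (S >= 100)
            & (ls_ratio > 0.5) & (ls_ratio < 1.5)
            & ((H <= 7) | (H >= 162)))

# Hypothetical 2x2 test channels (values chosen by hand)
H = np.array([[0, 90], [170, 5]], dtype=np.uint8)
L = np.array([[120, 120], [120, 30]], dtype=np.uint8)
S = np.array([[150, 150], [150, 150]], dtype=np.uint8)

mask = redeye_mask(H, L, S)
print(mask)
```

Using `&` and `|` on boolean arrays is equivalent to the multiplication and `np.logical_or` used above; plain `and`/`or` would raise an error on arrays.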
I have this image:
And for the beard, I have this mask:
I want to cut the beard out using the mask with a transparent background like this:
I followed this SO post's attempt. Here it is:
for img in input_images:
    gaberiel = Image.open(path + '/gaberiel-images/' + img)
    beard_mask = imread(path + '/gaberiel-masks/' + 'beard_binary_' + img[:-4] + '.png', cv2.IMREAD_GRAYSCALE)
    gaberiel_x, gaberiel_y = gaberiel.size
    beard_mask_x, beard_mask_y, _ = beard_mask.shape
    x_beard_mask= min(gaberiel_x, beard_mask_x)
    x_half_beard_mask = beard_mask.shape[0] // 2
    mask_beard = beard_mask[x_half_beard_mask - x_beard_mask // 2: x_half_beard_mask + x_beard_mask // 2 + 1, :gaberiel_y]
    gaberiel_width_half = gaberiel.size[1] // 2
    gaberiel_to_mask = gaberiel[:, gaberiel_width_half - x_half_beard_mask:gaberiel_width_half + x_half_beard_mask]
    masked = cv2.bitwise_and(gaberiel_to_mask, gaberiel_to_mask, mask=mask_beard)
    tmp = cv2.cvtColor(masked, cv2.COLOR_BGR2GRAY)
    _, alpha = cv2.threshold(tmp, 0, 255, cv2.THRESH_BINARY)
    b, g, r = cv2.split(masked)
    rgba = [b, g, r, alpha]
    masked_tr = cv2.merge(rgba, 4)
But this is the error I am getting:
gaberiel_to_mask = gaberiel[:, gaberiel_width_half - x_half_beard_mask:gaberiel_width_half + x_half_beard_mask]
TypeError: 'PngImageFile' object is not subscriptable
I think that my attempt is overall bad. Is there a way I can simplify this process?
Make sure your mask is the same size as your image and then use a function like the one below:
def apply_mask(image, mask):
    # Convert to numpy arrays
    image = np.array(image)
    mask = np.array(mask)
    # Stack the single-channel mask into 3 channels to match the image
    mask = np.stack((mask,)*3, axis=-1)
    # Multiply arrays
    resultant = image*mask
    return resultant

image = ...
mask = ...
resultant = apply_mask(image, mask)
For the code to work, the image array must be in the range 0-255 and the mask array must be binary (either 0 or 1). In the mask array, the beard area must have a pixel value of 1 and the rest 0, so that multiplying the image by the mask gives the desired image as shown above.
Save image as a png with transparent background:
import matplotlib.pyplot as plt
plt.savefig('image.png', transparent=True)
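As an alternative sketch (not the method used above), the mask can be attached directly as an alpha channel so that everything outside the beard becomes transparent. The helper name `cut_out_with_alpha` and the small synthetic arrays are assumptions for illustration; the mask is assumed to be a 2D array with non-zero values on the beard.

```python
import numpy as np

def cut_out_with_alpha(image, mask):
    """Return an RGBA array: original colors, alpha 255 where mask is non-zero."""
    image = np.asarray(image, dtype=np.uint8)
    mask = np.asarray(mask)
    if image.shape[:2] != mask.shape[:2]:
        raise ValueError("image and mask must have the same height and width")
    # Fully opaque inside the mask, fully transparent outside
    alpha = np.where(mask > 0, 255, 0).astype(np.uint8)
    return np.dstack([image[..., :3], alpha])

# Hypothetical 2x2 image and mask
image = np.full((2, 2, 3), 50, dtype=np.uint8)
mask = np.array([[0, 1], [255, 0]], dtype=np.uint8)

rgba = cut_out_with_alpha(image, mask)
print(rgba.shape)
```

The resulting array can then be written as a PNG, for example with `PIL.Image.fromarray(rgba).save(...)`, which keeps the alpha channel.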
In the image I linked below, I need to get all the yellow/green pixels in this rotated rectangle and get rid of the blue background, so that the rectangle's axes are aligned with the x and y axes.
I'm using numpy but don't have a clue what I should do.
I uploaded the array in this drive in case anyone would like to work with the actual array
Thanks for the help in advance.
I used the same image as user2640045, but a different approach.
import numpy as np
import cv2
# load and convert image to grayscale
img = cv2.imread('image.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# binarize image
threshold, binarized_img = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# find the largest contour
contours, hierarchy = cv2.findContours(binarized_img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
c = max(contours, key = cv2.contourArea)
# get size of the rotated rectangle
center, size, angle = cv2.minAreaRect(c)
# get size of the image
h, w, *_ = img.shape
# create a rotation matrix and rotate the image
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated_img = cv2.warpAffine(img, M, (w, h))
# crop the image
pad_x = int((w - size[0]) / 2)
pad_y = int((h - size[1]) / 2)
cropped_img = rotated_img[pad_y : pad_y + int(size[1]), pad_x : pad_x + int(size[0]), :]
I realize there is an allow_pickle=False option in numpy's load method, but I didn't feel comfortable unpickling and using data from the internet, so I used the small image. After removing the coordinate system and other decorations I had
I define two helper methods. One, taken from another Stack Overflow thread (see link below), rotates the image later on. The other returns a mask that is one at a specified color and zero otherwise.
import numpy as np
import matplotlib.pyplot as plt
import sympy
import cv2
import functools
color = arr[150,50]

def similar_to_boundary_color(arr, color=tuple(color)):
    mask = functools.reduce(np.logical_and, [np.isclose(arr[:,:,i], color[i]) for i in range(4)])
    return mask

def rotate_image(image, angle):
    image_center = tuple(np.array(image.shape[1::-1]) / 2)
    rot_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)
    result = cv2.warpAffine(image, rot_mat, image.shape[1::-1], flags=cv2.INTER_LINEAR)
    return result
Next I calculate the angle to rotate by. I do that by finding the lowest non-background pixel at columns 50 and 300. I picked those since they are far enough from the boundary not to be affected by missing corners etc.
i,j = np.where(~similar_to_boundary_color(arr))
slope = (max(i[j == 50])-max(i[j == 300]))/(50-300)
angle = np.arctan(slope)
arr = rotate_image(arr, np.rad2deg(angle))
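To make the slope-to-angle step easier to verify, here is a small self-contained sketch on a synthetic mask. The helper name `edge_angle_degrees`, the mask, and the column choices 10 and 50 are assumptions for illustration, not the actual image.

```python
import numpy as np

def edge_angle_degrees(foreground, col_a, col_b):
    """Angle of the lower edge, from the lowest foreground pixel in two columns."""
    rows, cols = np.where(foreground)
    bottom_a = float(rows[cols == col_a].max())  # lowest foreground row in col_a
    bottom_b = float(rows[cols == col_b].max())  # lowest foreground row in col_b
    slope = (bottom_a - bottom_b) / (col_a - col_b)
    return np.rad2deg(np.arctan(slope))

# Synthetic foreground whose lower edge drops one row every two columns
foreground = np.zeros((100, 100), dtype=bool)
for j in range(100):
    foreground[: 21 + j // 2, j] = True  # lowest foreground row is 20 + j // 2

angle = edge_angle_degrees(foreground, 10, 50)
print(angle)
```

With a lower edge that drops by 20 rows over 40 columns, the slope is 0.5, so the angle is arctan(0.5), roughly 26.6 degrees.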
One way of doing the cropping is the following. You calculate the midpoint in height and width. Then you take two slices around the midpoint, say 20 pixels wide in one direction and extending from the midpoint outwards in the other. The biggest/smallest index where the pixel is white/background colored is a reasonable point to cut.
i,j = np.where(~(~similar_to_boundary_color(arr) & ~similar_to_boundary_color(arr, (0,0,0,0))))
imid, jmid = np.array(arr.shape)[:2]/2
imin = max(i[(i < imid) & (jmid - 10 < j) & (j < jmid + 10)])
imax = min(i[(i > imid) & (jmid - 10 < j) & (j < jmid + 10)])
jmax = min(j[(j > jmid) & (imid - 10 < i) & (i < imid + 10)])
jmin = max(j[(j < jmid) & (imid - 10 < i) & (i < imid + 10)])
arr = arr[imin:imax,jmin:jmax]
and the result is:
So far, I have divided an image into blocks of a specific size, and each block is filled with the mean color of the original block. Now I have to merge these blocks based on their similarity, where each block holds a single value (its mean color). For this, I have been trying to merge pixels within an image based on their RGB values, but I have not found anything that helps. Please help me solve this problem. What I've done so far:
x and y are the block sizes. Here x=y=16.
Input: Original image
Output: Processed image
I have not implemented anything beyond this since I don't know how to proceed. Now I have to group the pixels in the processed image based on their similarity.
data = np.zeros( (256,256,3), dtype=np.uint8 )
for q in range(len(l)):
for w in range(len(l)):
img = smp.toimage( data )
data1 = np.asarray( img, dtype="int32" )
cv2.imwrite(os.path.join('G:/AI package/datasets/_normalized',filename),data1)
You have used quite a lot of code to get the first step done. However, the same output can be achieved much more compactly with numpy functions:
import cv2
import numpy as np
def get_mean_color(box):
    # Mean of each channel over the block, truncated to int
    return int(np.mean(box[:, :, 0])), int(np.mean(box[:, :, 1])), int(np.mean(box[:, :, 2]))

def get_super_square_pixels(img, super_pix_width):
    height, width, ch = img.shape
    if height % super_pix_width != 0:
        raise Exception("height must be multiple of super pixel width")
    if width % super_pix_width != 0:
        raise Exception("width must be multiple of super pixel width")
    output_img = np.zeros(img.shape, np.uint8)
    # Integer division and range() so the loops also work on Python 3
    for i in range(height // super_pix_width):
        for j in range(width // super_pix_width):
            src_box = img[i * super_pix_width:(i + 1) * super_pix_width, j * super_pix_width:(j + 1) * super_pix_width]
            mean_val = get_mean_color(src_box)
            output_img[i * super_pix_width:(i + 1) * super_pix_width, j * super_pix_width:(j + 1) * super_pix_width] = mean_val
    return output_img

img = cv2.imread("/path/to/your/img.jpg")
out = get_super_square_pixels(img, 16)
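For reference, the block averaging can also be written without explicit loops. The following is only a sketch using a NumPy reshape; `block_mean` is a hypothetical name, and it assumes the image height and width are multiples of the block size.

```python
import numpy as np

def block_mean(img, block):
    """Replace every block x block tile with its per-channel mean color."""
    h, w, ch = img.shape
    if h % block or w % block:
        raise ValueError("height and width must be multiples of the block size")
    # Split both spatial axes into (tile index, offset inside tile)
    tiles = img.reshape(h // block, block, w // block, block, ch)
    means = tiles.mean(axis=(1, 3))  # one mean color per tile
    # Blow each mean back up to the tile size; astype truncates like int()
    full = np.repeat(np.repeat(means, block, axis=0), block, axis=1)
    return full.astype(np.uint8)

# Small synthetic 4x4 image, averaged over 2x2 blocks
img = np.arange(48, dtype=np.uint8).reshape(4, 4, 3)
out = block_mean(img, 2)
print(out[:, :, 0])
```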
My code may not be optimal but it works just fine.
import cv2
import numpy as np
import scipy.misc as smp
import os
res = np.zeros( (256,256,3), dtype=np.uint8 )
low=np.array([l[0] - thresh, l[1] - thresh, l[2] - thresh])
high=np.array([l[0] + thresh, l[1] + thresh, l[2] + thresh])
res = cv2.bitwise_and(image, image, mask = mask1)
while(b<256 and d<256):
while((k!=black).all() and b<256 and d<256):
image= cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
img = smp.toimage( image )
data1 = np.asarray( img, dtype="int32" )
cv2.imwrite(os.path.join('G:/AI package/datasets/btob/',filename),data1)
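The low/high bounds in the fragment above suggest a per-channel threshold test around a reference color. A NumPy-only sketch of that idea is shown below; `similar_mask`, the sample image, and the threshold value of 10 are assumptions for illustration and are not taken from the code above.

```python
import numpy as np

def similar_mask(image, color, thresh):
    """True where every channel is within +/- thresh of the reference color."""
    # Signed integers avoid uint8 wrap-around when subtracting
    diff = np.abs(image.astype(np.int32) - np.asarray(color, dtype=np.int32))
    return np.all(diff <= thresh, axis=-1)

# Hypothetical 2x2 image of block colors
image = np.array([[[100, 100, 100], [105, 98, 102]],
                  [[140, 100, 100], [10, 200, 30]]], dtype=np.uint8)

mask = similar_mask(image, image[0, 0], thresh=10)
print(mask)
```

Blocks where the mask is True could then be assigned a common color, which is one simple way to merge similar blocks.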
Edit: Quick summary so far: I use the watershed algorithm, but I probably have a problem with the threshold. It did not detect the brighter circles.
New: A fast radial symmetry transform approach, which did not quite work either (Edit 6).
I want to detect circles of different sizes. The use case is to detect coins in an image and extract each one, i.e. to get every coin as its own image file.
For this I used the Hough Circle Transform of open-cv:
import sys
import cv2 as cv
import numpy as np
def main(argv):
    ## [load]
    default_file = "data/newcommon_1euro.jpg"
    filename = argv[0] if len(argv) > 0 else default_file
    # Loads an image
    src = cv.imread(filename, cv.IMREAD_COLOR)
    # Check if image is loaded fine
    if src is None:
        print ('Error opening image!')
        print ('Usage: [image_name -- default ' + default_file + '] \n')
        return -1
    ## [load]
    ## [convert_to_gray]
    # Convert it to gray
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
    ## [convert_to_gray]
    ## [reduce_noise]
    # Reduce the noise to avoid false circle detection
    gray = cv.medianBlur(gray, 5)
    ## [reduce_noise]
    ## [houghcircles]
    rows = gray.shape[0]
    circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 8,
                              param1=100, param2=30,
                              minRadius=0, maxRadius=120)
    ## [houghcircles]
    ## [draw]
    if circles is not None:
        circles = np.uint16(np.around(circles))
        for i in circles[0, :]:
            center = (i[0], i[1])
            # circle center
            cv.circle(src, center, 1, (0, 100, 100), 3)
            # circle outline
            radius = i[2]
            cv.circle(src, center, radius, (255, 0, 255), 3)
    ## [draw]
    ## [display]
    cv.imshow("detected circles", src)
    cv.waitKey(0)
    ## [display]
    return 0

if __name__ == "__main__":
    main(sys.argv[1:])
I tried all parameters (rows, param1, param2, minRadius, and maxRadius) to optimize the results. This worked very well for one specific image but other images with different sized coins didn't work.
circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 16,
param1=100, param2=30,
minRadius=0, maxRadius=120)
With the same parameters:
Changed to rows/8
I also tried two other approaches of this thread: writing robust (color and size invariant) circle detection with opencv (based on Hough transform or other features)
The approach of fireant leads to this result:
The approach of fraxel didn't work either.
For the first approach: This happens with all different sizes and also the min and max radius.
How could I change the code, so that the coin size is not important or that it finds the parameters itself?
Thank you in advance for any help!
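For the extraction step itself (one image per coin), here is a rough NumPy-only sketch of what could be done with each detected (x, y, r) triple. The helper name `crop_circle` and the synthetic image are assumptions for illustration; detection is not part of this sketch.

```python
import numpy as np

def crop_circle(image, x, y, r):
    """Crop the bounding box of a circle and blank out everything outside it."""
    x, y, r = int(x), int(y), int(r)  # plain ints avoid uint16 underflow
    h, w = image.shape[:2]
    # Clip the bounding box to the image borders
    x0, x1 = max(x - r, 0), min(x + r + 1, w)
    y0, y1 = max(y - r, 0), min(y + r + 1, h)
    crop = image[y0:y1, x0:x1].copy()
    # Pixel coordinates of the crop in the original image
    yy, xx = np.ogrid[y0:y1, x0:x1]
    outside = (xx - x) ** 2 + (yy - y) ** 2 > r ** 2
    crop[outside] = 0
    return crop

# Synthetic uniform image standing in for a photo
image = np.full((20, 30, 3), 200, dtype=np.uint8)
coin = crop_circle(image, 10, 10, 5)
edge_coin = crop_circle(image, 2, 2, 5)
print(coin.shape, edge_coin.shape)
```

Each returned crop could then be saved to its own file, e.g. with `cv.imwrite`.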
I tried the watershed algorithm of Open-cv, as suggested by Alexander Reynolds:
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('data/P1190263.jpg')
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
ret, thresh = cv.threshold(gray,0,255,cv.THRESH_BINARY_INV+cv.THRESH_OTSU)
# noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv.morphologyEx(thresh,cv.MORPH_OPEN,kernel, iterations = 2)
# sure background area
sure_bg = cv.dilate(opening,kernel,iterations=3)
# Finding sure foreground area
dist_transform = cv.distanceTransform(opening,cv.DIST_L2,5)
ret, sure_fg = cv.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
markers = cv.watershed(img,markers)
img[markers == -1] = [255,0,0]
cv.imshow("detected circles", img)
It works very well on the test image of the open-cv website:
But it performs very badly on my own images:
I can't really think of a good reason why it is not working on my images.
Edit 2:
As suggested, I looked at the intermediate images. The thresh image does not look good in my opinion. Next, there is no difference between opening and dist_transform. The corresponding sure_fg shows the detected images.
Edit 3:
I tried all distanceTypes and maskSizes I could find, but the results were much the same.
Edit 4:
Furthermore, I tried to change the (first) threshold function. I used different threshold values instead of the OTSU Function. The best one was with 160, but it was far from good:
In the tutorial it looks like this:
It seems like the coins are somehow too bright to be detected by this algorithm, but I don't know how to improve it.
Edit 5:
Changing the overall contrast and brightness of the image (with cv.convertScaleAbs) didn't improve the results. Increasing the contrast should, however, increase the "difference" between foreground and background, at least on the normal image. But it even got worse. The corresponding threshold image didn't improve (it didn't get more white pixels).
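A possible explanation for the Edit 5 observation, offered as a sketch rather than a diagnosis: Otsu's method picks its threshold from the histogram itself, so a linear contrast/brightness change that does not clip should leave the resulting binary mask unchanged. The NumPy-only example below uses a hand-written `otsu_threshold` helper and synthetic data (both are illustrative assumptions, not the images from this post).

```python
import numpy as np

def otsu_threshold(gray):
    """Threshold t maximizing between-class variance (class 0: pixels <= t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of class 0 for each t
    mu = np.cumsum(prob * np.arange(256))   # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Synthetic bimodal "image": 500 dark and 500 bright pixels
rng = np.random.default_rng(0)
dark = rng.integers(20, 40, size=500)
bright = rng.integers(80, 100, size=500)
gray = np.concatenate([dark, bright]).astype(np.uint8).reshape(20, 50)

# Linear contrast/brightness change; the maximum is 2 * 99 + 10 = 208, so no clipping
stretched = (2 * gray.astype(np.int32) + 10).astype(np.uint8)

mask_a = gray > otsu_threshold(gray)
mask_b = stretched > otsu_threshold(stretched)
print(np.array_equal(mask_a, mask_b))
```

Note that cv.convertScaleAbs saturates at 255, so with strong scaling the bright pixels clip and the masks can differ; the sketch only covers the non-clipping case.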
Edit 6: I tried another approach, the fast radial symmetry transform:
import cv2
import numpy as np
def gradx(img):
    img = img.astype('int')
    rows, cols = img.shape
    # Use hstack to add back in the columns that were dropped as zeros
    return np.hstack((np.zeros((rows, 1)), (img[:, 2:] - img[:, :-2]) / 2.0, np.zeros((rows, 1))))

def grady(img):
    img = img.astype('int')
    rows, cols = img.shape
    # Use vstack to add back the rows that were dropped as zeros
    return np.vstack((np.zeros((1, cols)), (img[2:, :] - img[:-2, :]) / 2.0, np.zeros((1, cols))))

# Performs fast radial symmetry transform
# img: input image, grayscale
# radii: integer value for radius size in pixels (n in the original paper); also used to size gaussian kernel
# alpha: Strictness of symmetry transform (higher=more strict; 2 is good place to start)
# beta: gradient threshold parameter, float in [0,1]
# stdFactor: Standard deviation factor for gaussian kernel
# mode: BRIGHT, DARK, or BOTH
def frst(img, radii, alpha, beta, stdFactor, mode='BOTH'):
    mode = mode.upper()
    assert mode in ['BRIGHT', 'DARK', 'BOTH']
    dark = (mode == 'DARK' or mode == 'BOTH')
    bright = (mode == 'BRIGHT' or mode == 'BOTH')
    workingDims = tuple((e + 2 * radii) for e in img.shape)
    # Set up output and M and O working matrices
    output = np.zeros(img.shape, np.uint8)
    O_n = np.zeros(workingDims, np.int16)
    M_n = np.zeros(workingDims, np.int16)
    # Calculate gradients
    gx = gradx(img)
    gy = grady(img)
    # Find gradient vector magnitude
    gnorms = np.sqrt(np.add(np.multiply(gx, gx), np.multiply(gy, gy)))
    # Use beta to set threshold - speeds up transform significantly
    gthresh = np.amax(gnorms) * beta
    # Find x/y distance to affected pixels
    # (the tail of the next two lines is an assumed completion: scale the unit
    # gradient by radii and round to integer offsets so they can be used as indices)
    gpx = np.multiply(np.divide(gx, gnorms, out=np.zeros(gx.shape), where=gnorms != 0),
                      radii).round().astype(int)
    gpy = np.multiply(np.divide(gy, gnorms, out=np.zeros(gy.shape), where=gnorms != 0),
                      radii).round().astype(int)
    # Iterate over all pixels (w/ gradient above threshold)
    for coords, gnorm in np.ndenumerate(gnorms):
        if gnorm > gthresh:
            i, j = coords
            # Positively affected pixel
            if bright:
                ppve = (i + gpx[i, j], j + gpy[i, j])
                O_n[ppve] += 1
                M_n[ppve] += gnorm
            # Negatively affected pixel
            if dark:
                pnve = (i - gpx[i, j], j - gpy[i, j])
                O_n[pnve] -= 1
                M_n[pnve] -= gnorm
    # Abs and normalize O matrix
    O_n = np.abs(O_n)
    O_n = O_n / float(np.amax(O_n))
    # Normalize M matrix
    M_max = float(np.amax(np.abs(M_n)))
    M_n = M_n / M_max
    # Elementwise multiplication
    F_n = np.multiply(np.power(O_n, alpha), M_n)
    # Gaussian blur
    kSize = int(np.ceil(radii / 2))
    kSize = kSize + 1 if kSize % 2 == 0 else kSize
    S = cv2.GaussianBlur(F_n, (kSize, kSize), int(radii * stdFactor))
    return S

img = cv2.imread('data/P1190263.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
result = frst(gray, 60, 2, 0, 1, mode='BOTH')
cv2.imshow("detected circles", result)
I only get this nearly black output (it has some very dark grey shadows). I don't know what to change and would be thankful for help!
Say you want to scale a transparent image but do not yet know the color(s) of the background you will composite it onto later. Unfortunately PIL seems to incorporate the color values of fully transparent pixels leading to bad results. Is there a way to tell PIL-resize to ignore fully transparent pixels?
import PIL.Image
filename = "trans.png"
size = (25,25)
im = PIL.Image.open(filename)
print(im.mode)  # RGBA
im = im.resize(size, PIL.Image.LINEAR)  # the same with CUBIC, ANTIALIAS, transform
# does not use alpha
im.save("resizelinear_"+filename)
# PIL scaled image has dark border
original image with (0,0,0,0) (black but fully transparent) background (left)
output image with black halo (middle)
proper output scaled with gimp (right)
edit: It looks like to achieve what I am looking for I would have to modify the sampling of the resize function itself such that it would ignore pixels with full transparency.
edit2: I have found a very ugly solution. It sets the color values of fully transparent pixels to the average of the surrounding non fully transparent pixels to minimize impact of fully transparent pixel colors while resizing. It is slow in the simple form but I will post it if there is no other solution. Might be possible to make it faster by using a dilate operation to only process the necessary pixels.
edit3: premultiplied alpha is the way to go - see Mark's answer
It appears that PIL doesn't do alpha pre-multiplication before resizing, which is necessary to get the proper results. Fortunately it's easy to do by brute force. You must then do the reverse to the resized result.
def premultiply(im):
    pixels = im.load()
    for y in range(im.size[1]):
        for x in range(im.size[0]):
            r, g, b, a = pixels[x, y]
            if a != 255:
                r = r * a // 255
                g = g * a // 255
                b = b * a // 255
                pixels[x, y] = (r, g, b, a)

def unmultiply(im):
    pixels = im.load()
    for y in range(im.size[1]):
        for x in range(im.size[0]):
            r, g, b, a = pixels[x, y]
            if a != 255 and a != 0:
                r = 255 if r >= a else 255 * r // a
                g = 255 if g >= a else 255 * g // a
                b = 255 if b >= a else 255 * b // a
                pixels[x, y] = (r, g, b, a)
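The per-pixel loops above can be slow on large images. As a sketch, the same arithmetic can be done on a whole RGBA array with NumPy; the names `premultiply_np` and `unmultiply_np` and the sample pixels are made up for illustration, and the resize step between the two calls is left out.

```python
import numpy as np

def premultiply_np(rgba):
    """Scale R, G, B by alpha / 255 using integer arithmetic."""
    out = rgba.astype(np.uint16)  # 255 * 255 still fits in uint16
    alpha = out[..., 3:4]
    out[..., :3] = out[..., :3] * alpha // 255
    return out.astype(np.uint8)

def unmultiply_np(rgba):
    """Undo premultiply_np; fully transparent pixels are left untouched."""
    out = rgba.astype(np.uint16)
    alpha = out[..., 3:4]
    safe = np.maximum(alpha, 1)  # avoid division by zero where alpha == 0
    rgb = np.minimum(255, out[..., :3] * 255 // safe)
    out[..., :3] = np.where(alpha == 0, out[..., :3], rgb)
    return out.astype(np.uint8)

# Three hypothetical pixels: half transparent, fully transparent, opaque
pixels = np.array([[[200, 100, 50, 128],
                    [10, 20, 30, 0],
                    [255, 0, 7, 255]]], dtype=np.uint8)

pre = premultiply_np(pixels)
back = unmultiply_np(pre)
print(pre[0, 0], back[0, 0])
```

Because of integer rounding, the round trip recovers the colors only approximately (200 comes back as 199 here).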
You can resample each band individually:
bands = im.split()
bands = [b.resize(size, Image.LINEAR) for b in bands]
im = Image.merge('RGBA', bands)
Maybe by avoiding high transparency values like so (need numpy)
import numpy as np
# ...
bands = list(im.split())
a = np.asarray(bands[-1])
a.flags.writeable = True
a[a != 0] = 1
bands[-1] = Image.fromarray(a)
bands = [b.resize(size, Image.LINEAR) for b in bands]
a = np.asarray(bands[-1])
a.flags.writeable = True
a[a != 0] = 255
bands[-1] = Image.fromarray(a)
im = Image.merge('RGBA', bands)
Maybe you can fill the whole image with the color you want, and only create the shape in the alpha channel?
Sorry for answering my own question, but this is the only working solution that I know of. It sets the color values of fully transparent pixels to the average of the surrounding pixels that are not fully transparent, to minimize the impact of fully transparent pixel colors while resizing. There are special cases where the proper result will not be achieved.
It is very ugly and slow. I'd be happy to accept your answer if you can come up with something better.
# might be possible to speed this up by only processing necessary pixels
# using scipy dilate, numpy where
import PIL.Image
filename = "trans.png"
size = (25,25)
import numpy as np
im = PIL.Image.open(filename)
npImRgba = np.asarray(im, dtype=np.uint8)
npImRgba2 = np.asarray(im, dtype=np.uint8)
npImRgba2.flags.writeable = True
lenY = npImRgba.shape[0]
lenX = npImRgba.shape[1]
for y in range(npImRgba.shape[0]):
    for x in range(npImRgba.shape[1]):
        if npImRgba[y, x, 3] != 0:  # only change completely transparent pixels
            continue
        colSum = np.zeros((3), dtype=np.uint16)
        i = 0
        for oy in [-1, 0, 1]:
            for ox in [-1, 0, 1]:
                if not oy and not ox:
                    continue
                iy = y + oy
                if iy < 0:
                    continue
                if iy >= lenY:
                    continue
                ix = x + ox
                if ix < 0:
                    continue
                if ix >= lenX:
                    continue
                col = npImRgba[iy, ix]
                if not col[3]:
                    continue
                colSum += col[:3]
                i += 1
        npImRgba2[y, x, :3] = colSum / i
im = PIL.Image.fromarray(npImRgba2)
im = im.transform(size, PIL.Image.EXTENT, (0,0) + im.size, PIL.Image.LINEAR)
im.save("slime_"+filename)
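The speed-up hinted at in the comments could look roughly like the following NumPy-only sketch, which sums the eight shifted neighbors instead of looping over pixels. `fill_transparent_rgb` and the two-pixel test image are assumptions for illustration; transparent pixels without any non-transparent neighbor are left unchanged here.

```python
import numpy as np

def fill_transparent_rgb(rgba):
    """Give fully transparent pixels the mean color of their visible neighbors."""
    h, w, _ = rgba.shape
    visible = rgba[..., 3] != 0
    # Colors of visible pixels only; transparent pixels contribute zero
    rgb = rgba[..., :3].astype(np.float64) * visible[..., None]
    pad_rgb = np.pad(rgb, ((1, 1), (1, 1), (0, 0)))
    pad_cnt = np.pad(visible.astype(np.float64), 1)
    total = np.zeros_like(rgb)
    count = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            if dy == 1 and dx == 1:
                continue  # skip the pixel itself
            total += pad_rgb[dy:dy + h, dx:dx + w]
            count += pad_cnt[dy:dy + h, dx:dx + w]
    out = rgba.copy()
    target = (~visible) & (count > 0)
    out[target, :3] = (total[target] / count[target][:, None]).astype(np.uint8)
    return out

# One opaque red pixel next to one fully transparent black pixel
rgba = np.array([[[255, 0, 0, 255], [0, 0, 0, 0]]], dtype=np.uint8)
filled = fill_transparent_rgb(rgba)
print(filled[0, 1])
```

The alpha channel is not modified, so the filled pixels stay invisible and only influence the interpolation during resizing.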