I'm currently trying to crop faces from an image.
I want the code to work no matter how many faces are in the image.
An example of the input image:
I'd like to crop the faces so I can run my facial keypoint detection algorithm (built previously) on them.
The end result will look something like this:
My code is currently:
# Load in color image for face detection
image = cv2.imread('images/obamas4.jpg')
# Convert the image to RGB colorspace
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Make a copy of the original image to draw face detections on
image_copy = np.copy(image)
# Convert the image to gray
gray_image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# Detect faces in the image using a pre-trained face detector
faces = face_cascade.detectMultiScale(gray_image, 1.25, 6)
# Print number of faces found
print('Number of faces detected:', len(faces))
# Get the bounding box for each detected face
for f in faces:
    x, y, w, h = [v for v in f]
    cv2.rectangle(image_copy, (x, y), (x+w, y+h), (255, 0, 0), 3)
    # Define the region of interest in the image
    face_crop = gray_image[y:y+h, x:x+w]
# Display the image with the bounding boxes
fig = plt.figure(figsize=(9, 9))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])
ax1.set_title("Obamas with Face Detection")
ax1.imshow(image_copy)
# Display the face crops
fig = plt.figure(figsize=(9, 9))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])
ax1.set_title("Obamas Face Crops")
ax1.imshow(face_crop, cmap='gray')
The output looks like this:
Right now it only outputs the last face detected in the image. I'm certain I'm missing something simple like a for loop.
I'd like to be able to run my facial keypoint detection algorithm on all of the gray cropped face images.
Thank you for your help!
The problem is that in your code, face_crop only ever holds the last face detected: each iteration of the loop overwrites it. So make it a list and append all the faces to it, then use a for loop to display them all, like this:
face_crop = []
for f in faces:
    x, y, w, h = [v for v in f]
    cv2.rectangle(image_copy, (x, y), (x+w, y+h), (255, 0, 0), 3)
    # Define the region of interest and keep every crop
    face_crop.append(gray_image[y:y+h, x:x+w])

for face in face_crop:
    cv2.imshow('face', face)
    cv2.waitKey(0)
I used cv2.imshow() to display the images; you can modify this to use plt.imshow() instead.
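For example, a minimal matplotlib sketch that shows every crop (reusing the face_crop list above; cmap='gray' because the crops come from gray_image):

import matplotlib.pyplot as plt

# One subplot per detected face
fig = plt.figure(figsize=(9, 9))
for i, face in enumerate(face_crop):
    ax = fig.add_subplot(1, len(face_crop), i + 1)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.imshow(face, cmap='gray')
plt.show()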
Hope this helps!
I am making an OpenCV face recognizer that draws a bounding box around the faces it detects in an image it has read. I am using a cascade classifier (haarcascades).
It shows the picture in full color (not grayscale) and will not draw the bounding boxes. I'm not sure what I did wrong; I am new to this.
Here is the code:
import cv2
# Load image
image = cv2.imread("/home/tyler/Downloads/PythonProjects/VN5anAL3_400x400.jpg")
# Convert image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Define the cascade classifier
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# Detect faces
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))
# Draw bounding boxes
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
# Show the image
cv2.imshow("faces", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
You are almost there, but you are drawing the bounding boxes on the grayscale image instead of the original image. Instead of using the gray variable, use the image variable in the following line to draw the bounding boxes:
cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
Also, you may want to check if the cascade classifier is loaded correctly by adding the following line before the detection:
if face_cascade.empty():
    raise Exception("Failed to load cascade classifier.")
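It is also worth confirming that the detector is finding anything at all; if faces comes back empty, no rectangles will be drawn even though the drawing code is correct. A quick check right after detectMultiScale:

# If this prints 0, the problem is detection, not drawing
print("Detected {} face(s)".format(len(faces)))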
I successfully displayed the video feed and am trying to change the background color of the area outside the ROI from black to blue, but the screen still shows a black background. Please help me solve the problem. Any help would be greatly appreciated.
Original code
import numpy as np
from cv2 import cv2
'''
ML object detection algo (haarcascade) used to identify objects.
the XML file consists of trained Haar Cascade models.
'''
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
#initialize video from the webcam
video = cv2.VideoCapture(1)
while True:
    # ret tells if the camera works properly; frame is an actual frame from the video feed
    ret, frame = video.read()
    # print(cv2.VideoCapture(0).isOpened())
    # make sure the port is working and the frame was read
    if frame is not None and video.isOpened():
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Detect the faces within the subregions of the image in scales
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=6)
        # Draw a circular mask around each face
        for (x, y, w, h) in faces:
            # Use the coordinates to find the center of the face and, from that point, draw a circle of radius w/2 or h/2
            center_coordinates = x + w // 2, y + h // 2
            radius = w // 2  # or h // 2, or anything based on your requirements
            # background color (black)
            mask = np.zeros(frame.shape[:2], dtype="uint8")
            # Draw the desired region to crop out in white
            cv2.circle(mask, center_coordinates, radius, (255, 255, 255), -1)
            masked = cv2.bitwise_and(frame, frame, mask=mask)
            cv2.imshow('mask applied', masked)
    if cv2.waitKey(30) & 0xff == 27:
        break
video.release()
cv2.destroyAllWindows()
The above code detects and displays the face inside the circular mask on a black background. But as mentioned above, the background color outside the circular ROI should be blue.
I tried replacing mask=np.zeros(frame.shape[:2], dtype="uint8") with the code below, and it fails. frame.shape[0:2] doesn't even include the channel dimension, and I can't figure out how to change the color in the first place.
mask=np.ones(frame.shape[0:2], dtype="uint8")
mask[:,:,0]=255
mask[:,:,1]=0
mask[:,:,2]=0
I also tried creating a circularly masked image and then placing it on another image, only to find it results in the same problem.
import numpy as np
from cv2 import cv2
'''
ML object detection algo (haarcascade) used to identify objects.
the XML file consists of trained Haar Cascade models.
'''
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
#initialize video from the webcam
video = cv2.VideoCapture(1)
print(cv2.VideoCapture(1).isOpened())
while True:
    # ret tells if the camera works properly; frame is an actual frame from the video feed
    ret, frame = video.read()
    # print(cv2.VideoCapture(0).isOpened())
    # make sure the port is working and the frame was read
    if frame is not None and video.isOpened():
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Detect the faces within the subregions of the image in scales
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=6)
        # Draw a circular mask around each face
        for (x, y, w, h) in faces:
            # Use the coordinates to find the center of the face and, from that point, draw a circle of radius w/2 or h/2
            center_coordinates = x + w // 2, y + h // 2
            radius = w // 2  # or h // 2, or anything based on your requirements
            # background color (black)
            mask = np.zeros(frame.shape[:2], dtype="uint8")
            # create blue colored background
            color = np.full_like(frame, (255, 0, 0))
            # Draw the desired region to crop out in white
            roi = cv2.circle(mask, center_coordinates, radius, (255, 255, 255), -1)
            masked = cv2.bitwise_and(frame, frame, mask=mask)
            mask_blue = cv2.bitwise_and(color, color, mask=mask - roi)
            # combine the two masked images
            result = cv2.add(masked, mask_blue)
            cv2.imshow('result', result)
    if cv2.waitKey(30) & 0xff == 27:
        break
video.release()
cv2.destroyAllWindows()
I have changed your code as follows according to your requirement. Here I have added one extra line:
masked[np.where((masked==[0,0,0]).all(axis=2))]=[255,0,0]
where you can change the pixel values of the black region to any specific color.
import numpy as np
import cv2
'''
ML object detection algo (haarcascade) used to identify objects.
the XML file consists of trained Haar Cascade models.
'''
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades +'haarcascade_frontalface_default.xml')
#initialize video from the webcam
video = cv2.VideoCapture(0)
print(cv2.VideoCapture(0).isOpened())
while True:
    # ret tells if the camera works properly; frame is an actual frame from the video feed
    ret, frame = video.read()
    # print(cv2.VideoCapture(0).isOpened())
    # make sure the port is working and the frame was read
    if frame is not None and video.isOpened():
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Detect the faces within the subregions of the image in scales
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=6)
        # Draw a circular mask around each face
        for (x, y, w, h) in faces:
            # Use the coordinates to find the center of the face and, from that point, draw a circle of radius w/2 or h/2
            center_coordinates = x + w // 2, y + h // 2
            radius = w // 2  # or h // 2, or anything based on your requirements
            # background color (black)
            mask = np.zeros(frame.shape[:2], dtype="uint8")
            # Draw the desired region to crop out in white
            roi = cv2.circle(mask, center_coordinates, radius, (255, 255, 255), -1)
            masked = cv2.bitwise_and(frame, frame, mask=mask)
            # turn the black background pixels blue (BGR)
            masked[np.where((masked == [0, 0, 0]).all(axis=2))] = [255, 0, 0]
            cv2.imshow('result', masked)
    if cv2.waitKey(30) & 0xff == 27:
        break
video.release()
cv2.destroyAllWindows()
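One caveat with matching [0,0,0]: any genuinely black pixels inside the face circle get recolored too. A small variant that colors by the mask itself avoids that (same variable names as in the loop above):

# Everything outside the white circle in `mask` becomes blue (BGR)
masked[mask == 0] = (255, 0, 0)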
I need some help with a project. My intention is to crop videos of sonographies with OpenCV and python in order to process them further. The features I am looking for are:
Loop through all available videos in a folder
find the contours and crop
export each video with one fixed size and resolution
Now I am a bit stuck on the contour finding and cropping part. I would like OpenCV to automatically recognize a bounding box around the shape of the sonography, knowing that all videos have the particular conus shape. Besides, it would be great if the non-relevant clutter could be removed. Can you help me? Attached you can find one original frame of the videos and the desired result.
import cv2
import numpy as np
cap = cv2.VideoCapture('video.mjpg')
# (x, y, w, h) = cv2.boundingRect(c)
# cv2.rectangle(frame, (x,y), (x+w, y+h), (0, 255, 0), 20)
# roi = frame[y:y+h, x:x+w]
while True:
    ret, frame = cap.read()
    # (height, width) = frame.shape[:2]
    sky = frame[0:100, 0:200]
    cv2.imshow('Video', sky)
    if cv2.waitKey(1) == 27:
        exit(0)
For the first frame of the video, you can use this to detect the bounding box of the image; then you can crop it or do whatever you want :)
import sys
import cv2
import numpy as np
# Load our image
dir = sys.path[0]
org = cv2.imread(dir+'/im.png')
im=org.copy()
H,W=im.shape[:2]
# Convert image to grayscale
im=cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
# remove noise
im=cv2.GaussianBlur(im,(21,21),21)
im=cv2.erode(im,np.ones((5,5)))
# remove horizontal lines
im=cv2.GaussianBlur(im,(5,0),21)
blr=im.copy()
# make binary image
im=cv2.threshold(im,5,255,cv2.THRESH_BINARY)[1]
# draw black border around image to better detect blobs:
cv2.rectangle(im,(0,0),(W,H),0,thickness=W//25)
bw=im.copy()
# Invert the black and white colors
im=~im
# Find contours and sort them by width
cnts, _ = cv2.findContours(im, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cnts.sort(key=lambda x: cv2.boundingRect(x)[2],reverse=True)
# Change the type and channels of image copies
im=cv2.cvtColor(im,cv2.COLOR_GRAY2BGR)
bw=cv2.cvtColor(bw,cv2.COLOR_GRAY2BGR)
blr=cv2.cvtColor(blr,cv2.COLOR_GRAY2BGR)
# Find the second biggest blob
x, y, w, h = cv2.boundingRect(cnts[1])
cv2.rectangle(org, (x, y), (x+w, y+h), (128, 0, 255), 10)
cv2.rectangle(im, (x, y), (x+w, y+h), (128, 255, 0), 10)
print(x,y,w,h)
# Save final result
top=np.hstack((blr,bw))
btm=np.hstack((im,org))
cv2.imwrite(dir+'/img_.png',np.vstack((top,btm)))
Bounding-Box area:
133 25 736 635
Cut and save the final image:
org = cv2.imread(dir+'/im.png')
cv2.imwrite(dir+'/img_.png',org[y:y+h,x:x+w])
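To extend this to whole videos, the same box can be applied to every frame and written out with cv2.VideoWriter. A rough sketch, assuming the box (x, y, w, h) from the first frame holds for the whole clip (the output filename and codec here are placeholders):

import cv2

cap = cv2.VideoCapture('video.mjpg')
fps = cap.get(cv2.CAP_PROP_FPS) or 25  # fall back if FPS is unreported
out = cv2.VideoWriter('cropped.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
while True:
    ret, frame = cap.read()
    if not ret:
        break
    out.write(frame[y:y+h, x:x+w])  # crop every frame with the detected box
cap.release()
out.release()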
I have the following image, which has text and a lot of white space underneath it. I would like to crop the white space so that it looks like the second image.
Cropped Image
Here is what I've done
>>> img = cv2.imread("pg13_gau.jpg.png")
>>> gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
>>> edged = cv2.Canny(gray, 30,300)
>>> (img,cnts, _) = cv2.findContours(edged.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
>>> cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:10]
As many have alluded to in the comments, the best way is to invert the image so the black text becomes white, find all the non-zero points in the image, then determine what the minimum spanning bounding box would be. You can use this bounding box to finally crop your image. Finding the contours is very expensive and isn't needed here, especially since your text is axis-aligned. You can use a combination of cv2.findNonZero and cv2.boundingRect to do what you need.
Therefore, something like this would work:
import numpy as np
import cv2
img = cv2.imread('ws.png') # Read in the image and convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255*(gray < 128).astype(np.uint8) # To invert the text to white
coords = cv2.findNonZero(gray) # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords) # Find minimum spanning bounding box
rect = img[y:y+h, x:x+w] # Crop the image - note we do this on the original image
cv2.imshow("Cropped", rect) # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite("rect.png", rect) # Save the image
The code above exactly lays out what I talked about in the beginning. We read in the image, but we also convert to grayscale as your image is in colour for some reason. The tricky part is the third line of code where I threshold below the intensity of 128 so that the dark text becomes white. This however produces a binary image, so I convert to uint8, then scale by 255. This essentially inverts the text.
Next, given this image we find all of the non-zero coordinates with cv2.findNonZero and we finally put this into cv2.boundingRect which will give you the top-left corner of the bounding box as well as the width and height. We can finally use this to crop the image. Note we do this on the original image and not the inverted one. We use simply NumPy array indexing to do the cropping for us.
Finally, we show the image to show that it works and we save it to disk.
I now get this image:
For the second image, a good thing to do is to remove some of the right border and bottom border. We can do that by cropping the image down to that first. Next, this image contains some very small noisy pixels. I would recommend doing a morphological opening with a very small kernel, then redo the logic we talked about above.
Therefore:
import numpy as np
import cv2
img = cv2.imread('pg13_gau_preview.png') # Read in the image and convert to grayscale
img = img[:-20,:-20] # Perform pre-cropping
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255*(gray < 128).astype(np.uint8) # To invert the text to white
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones((2, 2), dtype=np.uint8)) # Perform noise filtering
coords = cv2.findNonZero(gray) # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords) # Find minimum spanning bounding box
rect = img[y:y+h, x:x+w] # Crop the image - note we do this on the original image
cv2.imshow("Cropped", rect) # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite("rect.png", rect) # Save the image
Note: Output image removed due to privacy
OpenCV reads the image as a numpy array, and it's much simpler to use numpy directly (scikit-image does the same). One possible way is to read the image as grayscale (or convert it), then do the row-wise and column-wise operations shown in the code snippet below. This removes the columns and rows where all pixels equal pixel_value (white in this case).
import cv2
import numpy as np

def crop_image(filename, pixel_value=255):
    gray = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
    # keep rows that contain at least one pixel different from pixel_value
    crop_rows = gray[~np.all(gray == pixel_value, axis=1), :]
    # then keep columns the same way
    cropped_image = crop_rows[:, ~np.all(crop_rows == pixel_value, axis=0)]
    return cropped_image
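A minimal usage sketch, with the filename taken from the question:

cropped = crop_image("pg13_gau.jpg.png")
cv2.imwrite("cropped.png", cropped)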
and the output:
This would also work:
from PIL import Image, ImageChops

img = Image.open("pUq4x.png")
pixels = img.load()
print(f"original: {img.size[0]} x {img.size[1]}")

xlist = []
ylist = []
for y in range(0, img.size[1]):
    for x in range(0, img.size[0]):
        if pixels[x, y] != (255, 255, 255, 255):
            xlist.append(x)
            ylist.append(y)

left = min(xlist)
right = max(xlist)
top = min(ylist)
bottom = max(ylist)
img = img.crop((left - 10, top - 10, right + 10, bottom + 10))
img.show()
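Since ImageChops is already imported, the explicit pixel loop can also be replaced by a difference against a solid-white image. A sketch, assuming an RGBA image on a pure-white background:

from PIL import Image, ImageChops

img = Image.open("pUq4x.png")
# Non-white pixels become non-zero in the difference, so getbbox()
# returns the bounding box of the actual content
bg = Image.new(img.mode, img.size, (255, 255, 255, 255))
bbox = ImageChops.difference(img, bg).getbbox()
if bbox:
    img.crop(bbox).show()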
I am trying out code for face and eye detection in OpenCV using Python. The code works well for an image of size 2848 x 4272, and even when I resize it by a factor of 0.5. But whenever I resize it by other factors, such as 0.2 or 0.4, it gives me ambiguous results for the eyes (such as regions of the forehead or nose), so I can't get code that generalises to all image sizes. Is there a way to get correct detections at any image size? It is very difficult to process such big images. The code is as follows:
import numpy as np
import cv2
import cv2.cv as cv

# attaching the haar cascade files
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')

# reading the image
img11 = cv2.imread('IMG_0347.JPG')
if img11 is not None:
    # resizing the image
    w, h, c = img11.shape
    print "dimension"
    print w, h
    img = cv2.resize(img11, None, fx=0.4, fy=0.3, interpolation=cv2.INTER_LINEAR)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # converting into grayscale
    gray = cv2.equalizeHist(gray)
    #cv2.imshow('histo',gray)
    w, h, c = img.shape  # the dimensions of the image, i.e. width, height and number of channels
    # creating a white background of the same dimensions as the input image for pasting the eyes detected by 'haarcascade_eye.xml'
    im = np.zeros((w, h, c), np.uint8)
    im[:] = [255, 255, 255]
    # creating a white background of the same dimensions for pasting the masked eyes
    im_mask = np.zeros((w, h, c), np.uint8)
    im_mask[:] = [255, 255, 255]
    # faces gives the top-left coordinates of each detected face plus the width and height of the rectangle
    faces = face_cascade.detectMultiScale(gray, 1.5, 5)
    # taking the face as the ROI
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 1)  # draws the rectangle around the detected face
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]
        #cv2.imshow('image1',img)       # shows the original image with the face detected
        #cv2.imshow('image1',roi_color) # shows only the detected face (colored)
        # searching for eyes in the detected face, i.e. in roi_gray
        eyes = eye_cascade.detectMultiScale(roi_gray)
        #print eyes  # prints the top-left coordinates of the detected eyes plus width and height
        if eyes.any():
            for (ex, ey, ew, eh) in eyes:
                cv2.rectangle(roi_color, (ex, ey), (ex+ew, ey+eh), (0, 255, 0), 1)  # draws a rectangle around the detected eyes
                eye_mask = roi_color[ey+1:ey+eh, ex+1:ex+ew]  # the masked portion of the detected eye extracted from roi_color
                im_mask[ey+1+y:ey+eh+y, ex+x+1:ex+ew+x] = eye_mask  # pasting the eye_mask onto the white background im_mask
        else:
            print("eyes could not be detected")
    cv2.imshow('image', im_mask)  # shows the im_mask white background with the masked eyes pasted on it
It is logical that as the image gets smaller and smaller, it becomes harder to differentiate an eye from a nose, for example. So unless you understand fundamentally what your image analysis functions are looking for (I don't) it's hard to know the best way to downsize your images while retaining the type of information that the analysis needs.
Having said that, I believe cv2.INTER_AREA is used for shrinking images more commonly than cv2.INTER_LINEAR etc.
Try this instead of the resize you have:
img = cv2.resize(img11, None, fx=0.4, fy=0.3, interpolation=cv2.INTER_AREA)
Also, aren't you making it harder to identify eyes by changing the aspect ratio of your images (fx != fy)? If you don't have a special reason for that, you can just choose the target size explicitly with the second positional argument, dsize. For example:
effective_but_smaller_size = (640, 480) # or whatever you find works
img = cv2.resize(img11, effective_but_smaller_size, interpolation=cv2.INTER_AREA)
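If the goal is just to shrink without distortion, you can also derive the target size from the original dimensions. A sketch with an assumed scale factor:

scale = 0.4  # assumed shrink factor; tune for your detector
h, w = img11.shape[:2]  # shape is (rows, cols), i.e. (height, width)
img = cv2.resize(img11, (int(w * scale), int(h * scale)), interpolation=cv2.INTER_AREA)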