I have installed MediaPipe using Python (3.7.0) on Windows 11.
I have been able to successfully get MediaPipe to generate landmarks (for face and body) for an image, a video, and a webcam stream.
I would now like to get MediaPipe to draw only body-specific landmarks (i.e. exclude facial landmarks).
I understand that I may use OpenCV (or cvzone) to accomplish this goal; however, I am looking to achieve my objective using MediaPipe (i.e. using the draw_landmarks function in the MediaPipe library).
The specific bit of code I am trying (which produces errors) is the following:
# Initialize a list to store the detected landmarks.
landmarks = []
# Iterate over the Mediapipe detected landmarks.
for landmark in results.pose_landmarks.landmark:
    # Append the Mediapipe landmark into the list.
    landmarks.append((int(landmark.x * width), int(landmark.y * height),
                      (landmark.z * width)))
# Create index list for specific landmarks
body_landmark_indices = [11,12,13,14,15,16,23,24,25,26,27,28,29,30,31,32]
landmark_list_body = []
# Create a list which only has the required landmarks
for index in body_landmark_indices:
    landmark_list_body.append(landmarks[index - 1])
Executing the above, I get the error `'list' object has no attribute 'pose_landmarks'`.
I also tried replacing landmark_list=landmark_list_body.pose_landmarks with landmark_list=landmark_list_body, but that still produces errors.
I am now very tired and out of ideas. Is there a capeless hero out there?
You can try the following approach:
import cv2
import mediapipe as mp
import numpy as np
from mediapipe.python.solutions.pose import PoseLandmark
from mediapipe.python.solutions.drawing_utils import DrawingSpec
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_pose = mp.solutions.pose
custom_style = mp_drawing_styles.get_default_pose_landmarks_style()
custom_connections = list(mp_pose.POSE_CONNECTIONS)
# list of landmarks to exclude from the drawing
excluded_landmarks = [
    PoseLandmark.MOUTH_RIGHT ]
for landmark in excluded_landmarks:
    # we change the way the excluded landmarks are drawn
    custom_style[landmark] = DrawingSpec(color=(255,255,0), thickness=None)
    # we remove all connections which contain these landmarks
    custom_connections = [connection_tuple for connection_tuple in custom_connections
                          if landmark.value not in connection_tuple]
IMAGE_FILES = ["test.jpg"]
BG_COLOR = (192, 192, 192)
with mp_pose.Pose(
        min_detection_confidence=0.5) as pose:
    for idx, file in enumerate(IMAGE_FILES):
        image = cv2.imread(file)
        image_height, image_width, _ = image.shape
        results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
        annotated_image = image.copy()
        mp_drawing.draw_landmarks(
            annotated_image,
            results.pose_landmarks,
            connections = custom_connections, # passing the modified connections list
            landmark_drawing_spec=custom_style) # and drawing style
        cv2.imshow('landmarks', annotated_image)
        cv2.waitKey(0) # keep the window open until a key is pressed
It modifies the DrawingSpec and POSE_CONNECTIONS to "hide" a subset of landmarks.
However, due to the way the draw_landmarks() function is implemented in Mediapipe, it is also required to add a condition in drawing_utils.py (located in site-packages/mediapipe/python/solutions):
if drawing_spec.thickness == None: continue
Add it before line 190 (# White circle border). The result should look like this:
drawing_spec = landmark_drawing_spec[idx] if isinstance(
    landmark_drawing_spec, Mapping) else landmark_drawing_spec
if drawing_spec.thickness == None: continue
# White circle border
circle_border_radius = max(drawing_spec.circle_radius + 1,
                           int(drawing_spec.circle_radius * 1.2))
This change is required in order to completely eliminate the white border that is drawn around landmarks regardless of their drawing specification.
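The connection-filtering step can be tried out without MediaPipe installed. Here is a minimal sketch in plain Python, assuming connections are (start, end) index pairs as in POSE_CONNECTIONS and that pose landmarks 0 to 10 are the facial ones (nose, eyes, ears, mouth). sample_connections is a small made-up subset and drop_connections is a hypothetical helper, not part of MediaPipe:

```python
def drop_connections(connections, excluded):
    """Keep only the connections that touch none of the excluded landmark indices."""
    excluded = set(excluded)
    return [pair for pair in connections if excluded.isdisjoint(pair)]


# Made-up subset of (start, end) pairs, for illustration only.
sample_connections = [(9, 10), (11, 12), (11, 13), (0, 1), (12, 14)]
face_indices = range(0, 11)  # assumption: indices 0-10 are the facial pose landmarks

body_connections = drop_connections(sample_connections, face_indices)
print(body_connections)  # [(11, 12), (11, 13), (12, 14)]
```

With the real library you would pass list(mp_pose.POSE_CONNECTIONS) instead of the sample list.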
Hope it helps.
I've tried using the findChessboardCorners function in OpenCV (Python), but it's not working.
These are the images I'm trying to get it to detect.
I want it to be able to detect where the squares are and whether a piece is on each one.
So far I've tried:
import cv2 as cv
import numpy as np
def rescaleFrame(frame, scale=0.75):
    # rescale image
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dimensions = (width,height)
    return cv.resize(frame, dimensions, interpolation=cv.INTER_AREA)
img = cv.imread("board2.jpg")
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
ret, corners = cv.findChessboardCorners(gray, (8,8),None)
if ret == True:
    # Draw and display the corners
    img = cv.drawChessboardCorners(img, (8,8), corners,ret)
I was expecting it to work the way this tutorial shows.
The function findChessboardCorners is used to calibrate cameras using a black-and-white chessboard pattern. As far as I know, it is not designed to detect the corners of a chess board with chess pieces on it.
This site shows an example of calibration "chess boards." And this site shows how these calibration chess boards are used, this example uses the ROS Library.
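One related detail, in case you do test with a proper calibration board: as far as I know, the patternSize argument of findChessboardCorners counts inner corners (points where four squares meet), not squares. A board of 8x8 squares therefore has a 7x7 grid of inner corners. A tiny sketch of that arithmetic, where inner_corner_pattern is a made-up helper name and not an OpenCV function:

```python
def inner_corner_pattern(squares_per_row, squares_per_col):
    """Return the (columns, rows) count of inner corners for a board of squares."""
    if squares_per_row < 2 or squares_per_col < 2:
        raise ValueError("a board needs at least 2x2 squares to have an inner corner")
    return (squares_per_row - 1, squares_per_col - 1)


print(inner_corner_pattern(8, 8))  # (7, 7)
```

So for a standard chess board the call would use (7, 7) rather than (8, 8), although that alone will not make the function work on a board with pieces on it.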
You can still use OpenCV but will need to try other functions. Assuming you took the photos yourself, you've also made the problem harder on yourself by using a background that has a lot of lines and corners, meaning you'll have to differentiate between those corners and corners on the board. You can also see that the top corners of the board behind the rooks are occluded. If you can retake the photos, I would take a top-down photo and do it on a blank surface that contrasts with the chessboard.
One example of corner detection in OpenCV is Harris corner detection. I wrote up a short example for you. You'll need to play around with this and other corner detection methods to see what works best. I found that adding a Sobel filter to strengthen the lines in your image gave much better results. But it's still going to detect corners in the background and corners on the pieces. You'll need to figure out how to filter those out.
import cv2 as cv
from matplotlib import pyplot as plt
import numpy as np
def sobel(src_image, kernel_size):
    grad_x = cv.Sobel(src_image, cv.CV_16S, 1, 0, ksize=kernel_size, scale=1,
                      delta=0, borderType=cv.BORDER_DEFAULT)
    grad_y = cv.Sobel(src_image, cv.CV_16S, 0, 1, ksize=kernel_size, scale=1,
                      delta=0, borderType=cv.BORDER_DEFAULT)
    abs_grad_x = cv.convertScaleAbs(grad_x)
    abs_grad_y = cv.convertScaleAbs(grad_y)
    grad = cv.addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0)
    return grad
def process_image(src_image_path):
    # load the image
    src_image = cv.imread(src_image_path)
    # convert to RGB (otherwise when you display this image the colors will look incorrect)
    src_image = cv.cvtColor(src_image, cv.COLOR_BGR2RGB)
    # convert to grayscale before attempting corner detection (the image is RGB at this point)
    src_gray = cv.cvtColor(src_image, cv.COLOR_RGB2GRAY)
    # standard technique to eliminate noise
    blur_image = cv.blur(src_gray,(3,3))
    # strengthen the appearance of lines in the image
    sobel_image = sobel(blur_image, 3)
    # detect corners
    corners = cv.cornerHarris(sobel_image, 2, 3, 0.04)
    # for visualization to make corners easier to see
    corners = cv.dilate(corners, None)
    # overlay on a copy of the source image
    dest_image = np.copy(src_image)
    # mark strong responses; the 0.01 factor is an assumed threshold, tune it for your images
    dest_image[corners > 0.01 * corners.max()] = [0, 0, 255]
    return dest_image
src_image_path = "board1.jpg"
dest_image = process_image(src_image_path)
I work at a studio that does school photos and we are trying to make a script to eliminate the job of cropping each photo to a template. The photos we work with are fairly uniform but they vary in resolution and head position a bit. I took up the mantle of trying to write the script with my fairly limited Python knowledge and through a lot of trial and error and online resources I think I have got most of the way there.
At the moment I am trying to figure out the best way to crop the image from the NumPy array with the head where I want it, and I just can't find a good, flexible solution. The head needs to be positioned slightly differently for pose 1 and pose 2, so it needs to be easy to change on the fly (I'm probably going to implement some sort of simple GUI to input stuff like that, but for now I can just change the code).
I also need to be able to change the output resolution of the photos so they are all uniform (2000x2500). Does anyone have any ideas?
This is my current code; it just saves the detected face square:
import cv2
import os.path
import glob
# Cascade path
cascPath = 'haarcascade_frontalface_default.xml'
# Create the haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)
# Check for the output folder and create it if it's not there
if not os.path.exists('output'):
    os.makedirs('output')
# Read images
images = glob.glob('*.jpg')
for c, i in enumerate(images):
    image = cv2.imread(i, 1)
    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Find face(s) using cascade
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1, # size of groups
        minNeighbors=5, # How many groups around are detected as face for it to be valid
        minSize=(500, 500) # Min size in pixels for face
    )
    # Outputs number of faces found in image
    print('Found {0} faces!'.format(len(faces)))
    # Crop to the detected face rectangle
    for (x, y, w, h) in faces:
        imgCrop = image[y:y+h,x:x+w]
    if len(faces) > 0:
        # Saves images to the output folder with the original name
        cv2.imwrite('output/'+ i, imgCrop)
I can crop using it like this:
# Crop Padding
left = 300
right = 300
top = 400
bottom = 1000
for (x, y, w, h) in faces:
    imgCrop = image[y-top:y+h+bottom, x-left:x+w+right]
But that outputs fairly random resolutions, and the result changes with the resolution of the input image.
To set a new resolution, you can use cv2.resize. Resizing resamples the pixels, so pick an interpolation method explicitly.
cv2.imwrite expects BGR channel order, so if the crop comes from an RGB source you need to swap the channels before saving.
crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR) # Only needed when the crop is RGB; cv2.imwrite expects BGR
cv2.imwrite("image-1.png", crop)
One approach is using Python's face-recognition library.
The approach uses two sample images for training,
then predicts the faces in the next image based on the training images.
For instance, the following are the training images:
We want to predict the faces in the image below:
When we get the facial encodings of the training images and apply them to the next image:
import face_recognition
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw
# Load a sample picture and learn how to recognize it.
first_image = face_recognition.load_image_file("images/ex.jpg")
first_face_encoding = face_recognition.face_encodings(first_image)[0]
# Load a second sample picture and learn how to recognize it.
second_image = face_recognition.load_image_file("images/index.jpg")
sec_face_encoding = face_recognition.face_encodings(second_image)[0]
# Create arrays of known face encodings and their names
known_face_encodings = [
    first_face_encoding,
    sec_face_encoding
]
print('Learned encoding for', len(known_face_encodings), 'images.')
# Load an image with an unknown face
unknown_image = face_recognition.load_image_file("images/babes.jpg")
# Find all the faces and face encodings in the unknown image
face_locations = face_recognition.face_locations(unknown_image)
face_encodings = face_recognition.face_encodings(unknown_image, face_locations)
# Convert the image to a PIL-format image so that we can draw on top of it with the Pillow library
# See http://pillow.readthedocs.io/ for more about PIL/Pillow
pil_image = Image.fromarray(unknown_image)
# Create a Pillow ImageDraw Draw instance to draw with
draw = ImageDraw.Draw(pil_image)
# Loop through each face found in the unknown image
for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
    matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
    face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
    best_match_index = np.argmin(face_distances)
    draw.rectangle(((left, top), (right, bottom)), outline=(0, 0, 255), width=5)
# Remove the drawing library from memory as per the Pillow docs
del draw
# Display the resulting image
plt.imshow(pil_image)
plt.show()
The output will be:
The above is my suggestion. Resizing the crop to a new resolution resamples its pixels, so you need to choose an interpolation method.
For instance: after finding the face locations, select the coordinates in the original image.
# Add after draw.rectangle function.
crop = unknown_image[top:bottom, left:right]
Set the new resolution to 2000 x 2500 with cv2.INTER_LANCZOS4 interpolation.
Possible question: why cv2.INTER_LANCZOS4?
Of course, you can select whatever you like, but in this post cv2.INTER_LANCZOS4 was suggested.
crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
Save the image
crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR) # face_recognition loads RGB; cv2.imwrite expects BGR
cv2.imwrite("image-1.png", crop)
The outputs are around 4.3 MB each, so I can't display them here.
In the final result, we can clearly see and identify the faces. The library finds the faces in the image precisely.
Here is what you can do:
Either you can use training images from your own set, or you can use the example above.
Apply the face-recognition function to each image, using the trained face locations, and save the results in the directory.
Here is how I got it to crop how I wanted. This is added right below the "Outputs number of faces" print:
# Get the face position into variables (might not be needed, but I did it)
for (x, y, w, h) in faces:
    xdis = x
    ydis = y
    w = w
    h = h
# Get the scale value by dividing the wanted head height by the detected head height
ws = 600/w
hs = 600/h
# Scale the image to get the head to the right size; uses bilinear interpolation by default
scale = cv2.resize(image,(0,0),fx=ws,fy=hs) # fx scales the width, fy scales the height
# Calculate the head position for the given values
sxdis = int(xdis*ws) # apply the scale to the x distance and turn it into an integer
sydis = int(ydis*hs) # apply the scale to the y distance and turn it into an integer
sycent = sydis+300 # add half the head height to get the center
ystart = sycent-700 # subtract where you want the head center to be in pixels, this is for the vertical
yend = ystart+2500 # add whatever you want the vertical resolution to be
xcent = sxdis+300 # add half the head width to get the center
xstart = xcent-1000 # subtract where you want the head center to be in pixels, this is for the horizontal
xend = xstart+2000 # add whatever you want the horizontal resolution to be
# Crop the image
cropped = scale[ystart:yend, xstart:xend]
It's a mess, but it works exactly how I wanted it to work.
I ended up going with OpenCV instead of switching to face_recognition because of speed, but I might switch over if I can get multithreading to work with face_recognition.
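The same arithmetic can be packed into one function, so the head size and head position are easy to change per pose. This is only a sketch: crop_window and its parameter names are made up for illustration, and the defaults (600 px head, center at (1000, 700), 2000x2500 output) are the numbers used above.

```python
def crop_window(face, head_size=600, head_center=(1000, 700), out_size=(2000, 2500)):
    """Return (scale, (x0, y0, x1, y1)) for a face box given as (x, y, w, h).

    scale is the resize factor that makes the detected head head_size pixels tall;
    the box is the crop window in the coordinates of the resized image.
    """
    x, y, w, h = face
    scale = head_size / h
    # Face center in the resized image.
    cx = (x + w / 2) * scale
    cy = (y + h / 2) * scale
    # Shift so the face center lands on head_center inside the output frame.
    x0 = int(round(cx - head_center[0]))
    y0 = int(round(cy - head_center[1]))
    return scale, (x0, y0, x0 + out_size[0], y0 + out_size[1])


scale, box = crop_window((900, 500, 300, 300))
print(scale, box)  # 2.0 (1100, 600, 3100, 3100)
```

If x0 or y0 comes out negative, or the box runs past the resized image, a plain NumPy slice will wrap around or come back smaller than 2000x2500, so the image would need padding (or the box clamping) before cropping.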
I'm constructing a dataset with more than one image for each person for the Python face_recognition package. It will add a classifier on top of the built-in model. See also this example: face_recognition_knn.py. Here is my code:
# import the necessary packages
from imutils import paths
import face_recognition
import pickle
import cv2
import os
# grab the paths to the input images in our dataset
print("[INFO] quantifying faces...")
imagePaths = list(paths.list_images('dataset'))
# initialize the list of known encodings and known names
knownEncodings = []
knownNames = []
# loop over the image paths
for (i, imagePath) in enumerate(imagePaths):
    # extract the person name from the image path
    print(f"[INFO] processing image {i+1}/{len(imagePaths)} -> {imagePath}")
    name = imagePath.split(os.path.sep)[-2]
    # load the input image and convert it from BGR (OpenCV ordering)
    # to dlib ordering (RGB)
    image = cv2.imread(imagePath)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # detect the (x, y)-coordinates of the bounding boxes
    # corresponding to each face in the input image
    boxes = face_recognition.face_locations(rgb,
        model='hog') #can be cnn
    # compute the facial embedding for the face
    encodings = face_recognition.face_encodings(rgb, boxes)
    # loop over the encodings
    for encoding in encodings:
        # add each encoding + name to our set of known names and
        # encodings
        knownEncodings.append(encoding)
        knownNames.append(name)
# dump the facial encodings + names to disk
print("[INFO] serializing encodings...")
data = {"encodings": knownEncodings, "names": knownNames}
f = open('encodings.pickle', "wb")
f.write(pickle.dumps(data))
f.close()
Then, I try to identify these people with this code:
import face_recognition
import pickle
import cv2
import numpy as np
import requests
from datetime import datetime
# load the known faces and embeddings
print("[INFO] loading encodings...")
data = pickle.loads(open("encodings.pickle", "rb").read())
def processa_imagem(url):
    # load the input image, convert it from BGR to RGB, and return the output file with confidence
    image = cv2.imread(url)
    if image is None:
        print(f'Image not found: {url}')
        return None
    #image = np.array(image, dtype=np.uint8)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # detect the (x, y)-coordinates of the bounding boxes corresponding
    # to each face in the input image, then compute the facial embeddings
    # for each face
    print("[INFO] recognizing faces...")
    boxes = face_recognition.face_locations(rgb,
        model='hog')
    encodings = face_recognition.face_encodings(rgb, boxes)
    # initialize the list of names for each face detected
    names = []
    # loop over the facial embeddings
    for encoding in encodings:
        # attempt to match each face in the input image to our known
        # encodings
        ## ATTENTION! the ideal is face_recognition.api.batch_face_locations but i dont have a GPU
        matches = face_recognition.face_distance(data["encodings"],
            encoding)
        name = "unknown"
        # check to see if we have found a match
        if max(matches) > 0.7:
            # find the indexes of all matched faces then initialize a
            # dictionary to count the total number of times each face
            # was matched
            matchedIdxs = [i for (i, b) in enumerate(matches) if b]
            counts = {}
            # loop over the matched indexes and maintain a count for
            # each recognized face
            for i in matchedIdxs:
                name = data["names"][i]
                counts[name] = counts.get(name, 0) + 1
            # determine the recognized face with the largest number of
            # votes (note: in the event of an unlikely tie Python will
            # select first entry in the dictionary)
            name = max(counts, key=counts.get)
        # update the list of names
        names.append(name)
    # loop over the recognized faces
    for ((top, right, bottom, left), name) in zip(boxes, names):
        # draw the predicted face name on the image
        cv2.rectangle(image, (left, top), (right, bottom), (255, 0, 0), 2)
        y = top - 15 if top - 15 > 15 else top + 15
        cv2.putText(image, name, (left, y), cv2.FONT_HERSHEY_SIMPLEX,
            0.75, (255, 0, 0), 2)
    now = datetime.now()
    current_time = now.strftime("%H%M%S%f")
    #file_path = f'static/face-{current_time}.jpg'
    file_path = f'face-{current_time}.jpg'
    cv2.imwrite(file_path, image)
    return (file_path, ', '.join(names))
In my dataset, I've added, on average, about 10 photos of each individual. The script uses face_recognition.face_distance, and it works well to recognize someone who is in the dataset.
The problem is when I use it with someone who is NOT in the dataset. For these people, I sometimes still get false-positive results with a confidence of about 0.90 or higher.
Some of the pictures in the dataset are low quality. Maybe that's the reason? Should I change my approach, using fewer but more detailed photos (2 or 3) and maybe encoding them with jitters?
Thanks for any input!
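One thing that may be worth checking before changing the dataset: face_recognition.face_distance returns a distance per known encoding, where lower means more similar, and compare_faces treats a distance of at most 0.6 as a match by default. A test such as max(matches) > 0.7 therefore fires when the least similar known face is far away, which is also true for strangers. A minimal sketch of distance-based voting in plain Python, where pick_name and the sample numbers are made up for illustration:

```python
from collections import Counter


def pick_name(distances, names, tolerance=0.6):
    """Vote among known faces whose distance is within tolerance; else 'unknown'."""
    votes = Counter(name for dist, name in zip(distances, names) if dist <= tolerance)
    if not votes:
        return "unknown"
    return votes.most_common(1)[0][0]


# Hypothetical distances for one detected face against four known encodings.
print(pick_name([0.35, 0.42, 0.71, 0.90], ["ana", "ana", "bob", "bob"]))  # ana
# Nobody is close enough, so the face is reported as unknown.
print(pick_name([0.75, 0.90], ["ana", "bob"]))  # unknown
```

Lowering the tolerance (for example to 0.5) trades more "unknown" answers for fewer false positives. The right value is an assumption to validate on your own photos.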
I am trying to crop a face using the facial landmarks identified by dlib. The right eyebrow is causing problems - the crop goes flat across rather than follow the eyebrow arc.
What am I doing wrong here?
from imutils import face_utils
import imutils
import numpy as np
import collections
import dlib
import cv2
def face_remap(shape):
    remapped_image = shape.copy()
    # left eye brow
    remapped_image[17] = shape[26]
    remapped_image[18] = shape[25]
    remapped_image[19] = shape[24]
    remapped_image[20] = shape[23]
    remapped_image[21] = shape[22]
    # right eye brow
    remapped_image[22] = shape[21]
    remapped_image[23] = shape[20]
    remapped_image[24] = shape[19]
    remapped_image[25] = shape[18]
    remapped_image[26] = shape[17]
    # neatening
    remapped_image[27] = shape[0]
    return remapped_image
# load the input image, resize it, and convert it to grayscale
image = cv2.imread("images/faceCM1.jpg")
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
out_face = np.zeros_like(image)
# initialize dlib's face detector (HOG-based) and then create the facial landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(SHAPE_PREDICTOR)
# detect faces in the grayscale image
rects = detector(gray, 1)
# loop over the face detections
for (i, rect) in enumerate(rects):
    # Determine the facial landmarks for the face region, then convert the facial landmark (x, y)-coordinates to a NumPy array
    shape = predictor(gray, rect)
    shape = face_utils.shape_to_np(shape)
    # initialize mask array
    remapped_shape = np.zeros_like(shape)
    feature_mask = np.zeros((image.shape[0], image.shape[1]))
    # we extract the face
    remapped_shape = face_remap(shape)
    cv2.fillConvexPoly(feature_mask, remapped_shape[0:27], 1)
    feature_mask = feature_mask.astype(bool)
    out_face[feature_mask] = image[feature_mask]
    cv2.imshow("mask_inv", out_face)
    cv2.imwrite("out_face.png", out_face)
sample image of cropped face showing the issue
Using the convex hull formed by the 68 landmarks didn't exactly achieve the desired output, so I took the following approach to this problem, using scikit-image instead of OpenCV:
1. Load image and predict 68 landmarks
import dlib
import numpy as np
import skimage.draw
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
img = dlib.load_rgb_image('mean.jpg')
rect = detector(img)[0]
sp = predictor(img, rect)
landmarks = np.array([[p.x, p.y] for p in sp.parts()])
2. Select the landmarks that represent the shape of the face
(I had to reverse the order of the eyebrows landmarks because the 68 landmarks aren't ordered to describe the face outline)
outline = landmarks[[*range(17), *range(26,16,-1)]]
3. Draw a polygon using these landmarks using scikit-image
Y, X = skimage.draw.polygon(outline[:,1], outline[:,0])
4. Create a canvas with zeros and use the polygon as mask to original image
cropped_img = np.zeros(img.shape, dtype=np.uint8)
cropped_img[Y, X] = img[Y, X]
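To see why the eyebrow indices in step 2 are reversed, it can help to print the index order on its own. In the 68-point model, indices 0 to 16 run along the jaw and 17 to 26 cover the two eyebrows, so walking the jaw forward and then the eyebrows backward traces one closed loop. A small sketch in plain Python (outline_idx is just an illustrative name):

```python
# Jaw line 0..16, then eyebrows walked backwards 26..17 to close the loop.
outline_idx = [*range(17), *range(26, 16, -1)]

print(len(outline_idx))    # 27
print(outline_idx[16:20])  # [16, 26, 25, 24]
print(outline_idx[-1])     # 17
```

The loop ends at landmark 17, next to where the jaw started at landmark 0, which is what a polygon fill needs.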
For the sake of completeness, I provide below a solution using scipy.spatial.ConvexHull, if this option is still preferred:
from scipy.spatial import ConvexHull
vertices = ConvexHull(landmarks).vertices
Y, X = skimage.draw.polygon(landmarks[vertices, 1], landmarks[vertices, 0])
cropped_img = np.zeros(img.shape, dtype=np.uint8)
cropped_img[Y, X] = img[Y, X]
It's because the face shape you are providing is not convex.
fillConvexPoly works correctly on convex shapes only. In this case there is a concave corner (at point #27), and hence the results are messed up.
To fix this, modify the function as follows:
def face_remap(shape):
    remapped_image = cv2.convexHull(shape)
    return remapped_image
This would give you a result which looks like this.
Now you may write some more code to remove the triangular section on the forehead (if you want it that way).
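If you want to check whether a given outline is convex before handing it to fillConvexPoly, one common test is that the cross products of consecutive edges all have the same sign. A minimal sketch in plain Python, assuming a simple (non-self-intersecting) polygon given as (x, y) tuples. is_convex and the two sample polygons are made up for illustration:

```python
def is_convex(points):
    """True if consecutive edges of a simple polygon always turn the same way."""
    n = len(points)
    sign = 0
    for i in range(n):
        ax, ay = points[i]
        bx, by = points[(i + 1) % n]
        cx, cy = points[(i + 2) % n]
        # z-component of the cross product of edge a->b with edge b->c
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False
    return True


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
dented = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]  # (2, 1) is a concave corner
print(is_convex(square), is_convex(dented))  # True False
```

For a non-convex outline, cv2.fillPoly is the usual alternative to fillConvexPoly if you would rather keep the eyebrow arc than take the hull.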
I am working on a SURF implementation in OpenCV using Python which will detect a template in a given image. I have modified the code so that it captures video from the connected webcam, converts it into images, and then applies SURF to them. The modified code follows.
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
ret ,img = cap.read()
# Convert them to grayscale
imgg =cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# SURF extraction
surf = cv2.SURF()
kp, descritors = surf.detect(imgg,None,useProvidedKeypoints = False)
# Setting up samples and responses for kNN
samples = np.array(descritors)
responses = np.arange(len(kp),dtype = np.float32)
# kNN training
knn = cv2.KNearest()
# Now loading a template image and searching for similar keypoints
template = cv2.imread('template.png')
templateg= cv2.cvtColor(template,cv2.COLOR_BGR2GRAY)
keys,desc = surf.detect(templateg,None,useProvidedKeypoints = False)
for h,des in enumerate(desc):
    des = np.array(des,np.float32).reshape((1,128))
    retval, results, neigh_resp, dists = knn.find_nearest(des,1)
    res,dist = int(results[0][0]),dists[0][0]
    if dist<0.1: # draw matched keypoints in red color
        color = (0,0,255)
    else: # draw unmatched in blue color
        print dist
        color = (255,0,0)
    #Draw matched key points on original image
    x,y = kp[res].pt
    center = (int(x),int(y))
    #Draw matched key points on template image
    x,y = keys[h].pt
    center = (int(x),int(y))
But I get the following error:
TypeError: data type = 17 is not supported
Does anybody have any idea about this?
