I am trying to implement a HOG descriptor with OpenCV to detect pedestrians in a video. I am currently using the pre-trained model file hogcascade_pedestrians.xml that ships with OpenCV. Unfortunately, the documentation on this part is very poor, even though the HOG descriptor is very effective for human detection. I have been writing pedestrian-detection code in Python, and this is as far as I have got:
import cv2
import numpy as np
import imutils

VidCap = cv2.VideoCapture('pedestrians.mp4')
HOGCascade = cv2.HOGDescriptor('hogcascade_pedestrians.xml')

while True:
    _, image = VidCap.read()
    image = imutils.resize(image, width=700)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=15.0, tileGridSize=(8, 8))
    gray = clahe.apply(gray)
    winStride = (8, 8)
    padding = (16, 16)
    scale = 1.05
    meanshift = -1
    (rects, weights) = HOGCascade.detectMultiScale(gray, winStride=winStride,
                                                   padding=padding,
                                                   scale=scale,
                                                   useMeanshiftGrouping=meanshift)
    for (x, y, w, h) in rects:
        cv2.rectangle(image, (x, y), (x+w, y+h), (0, 200, 255), 2)
    cv2.imshow('Image', image)
    if cv2.waitKey(5) == 27:
        break
I presumed the code would look much like the code written for Haar cascades, but I tried that and got errors. Does anyone have any idea how to implement the HOG descriptor in OpenCV with Python?
I have read the following question, but I got nothing from the second answer.
My problem is that I can't work out how to write the code, as the documentation about this part is very poor.
Note: I am using OpenCV 3.1.0-dev with Python 2.7.11
HOGCascade = cv2.HOGDescriptor()
If you want to use this .xml file, you have a lot of preparation work to do.
When you finally get a usable descriptor, you should replace the cv2.HOGDescriptor_getDefaultPeopleDetector() in
Hello there people of the internet,
The code in question uses Python 3.8.5 and OpenCV 4 (I do not know how to check the exact version, but I know it's OpenCV 4). My team and I are attempting to take a live video feed from a USB webcam and determine the distance between the camera and the object in the feed. We had some success reading the distance from still images taken with the same camera and processed with the imutils library, but now we want to calculate that data live.
Our code is below.
from imutils import paths
import numpy as np
import imutils
import cv2
import time
import os

def find_marker(image):
    # convert the image to grayscale, blur it, then detect edges
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    edged = cv2.Canny(gray, 35, 125)
    # find the contours in the edged image and keep the largest one;
    # we'll assume that this is our piece of paper in the image
    cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    c = max(cnts, key = cv2.contourArea)
    # compute the bounding box of the paper region and return it
    return cv2.minAreaRect(c)

def distance_to_camera(knownWidth, focalLength, perWidth):
    # compute and return the distance from the marker to the camera
    return (knownWidth * focalLength) / perWidth

# initialize the known distance from the camera to the object (2 feet, in inches)
KNOWN_DISTANCE = 24.0
# initialize the known object width, which in this case (the piece of paper) is 12 inches
KNOWN_WIDTH = 12.0
# load the first image that contains an object that is known to be 2 feet
# from our camera, then find the paper marker in the image and
# initialize the focal length
rootimage = cv2.imread("/Volumes/404/final_rov_code/Python/images/2ft.jpg")
marker1 = find_marker(rootimage)
marker2 = marker1[0][1] - marker1[1][1]
focalLength = (marker2 * KNOWN_DISTANCE) / KNOWN_WIDTH
image = cv2.VideoCapture(0)
# loop over the frames
while True:
    # load the image, find the marker in the image, then compute the
    # distance to the marker from the camera
    frame, ret = image.read()
    marker = find_marker(ret)
    inches = distance_to_camera(KNOWN_WIDTH, focalLength, marker[1][0])
    # draw a bounding box around the marker and display it
    box = cv2.cv.BoxPoints(marker) if imutils.is_cv2() else cv2.boxPoints(marker)
    box = np.int0(box)
    cv2.drawContours(frame, [box], -1, (0, 255, 0), 2)
    cv2.putText(ret, "%.2fin" % inches,
        (ret.shape[1] - 200, ret.shape[0] - 20), cv2.FONT_HERSHEY_SIMPLEX,
        2.0, (0, 255, 0), 3)
    cv2.imshow("image", ret)
    # if cv2.waitKey(33) == ord('q'):
    #     os.system('pause')
I understand that the example should be as minimal as possible, but we have no idea what could be causing the program to hang upon reading the first frame of the video feed. Could it be that the processing is taking too many resources from the single thread? (We're all newbies to the advanced sides of OpenCV and Python 3.)
There are no other errors that we are aware of at the moment, so there are no leads in the terminal as to where it could be coming from.
Thank you in advance.
Your problem is most likely the missing cv2.waitKey() call at the end of your while loop. cv2.imshow() only hands the image to the window; the window is actually redrawn while cv2.waitKey() processes GUI events, so if the loop never calls it, the display just doesn't update. Check out this other StackOverflow question for more details.
In addition, you have your ret and frame variables mixed up: read() returns the success flag first and the image second, so ret should be the first name and frame the second. Right now, the drawContours() call isn't going to do anything useful because you're passing it a boolean instead of an image.
Making those changes fixed this for me using Python 3.9 and OpenCV 4.5.
I am trying to use the Haar cascade in OpenCV 4.0 to detect faces for emotion, gender and age estimation. Sometimes the detectMultiScale() function returns an empty tuple, which raises an error in the later parts of recognition.
I tried looping until a face is detected, but it seems that once the face is not detected, it is never detected again (in the same captured frame); I keep getting empty tuples back. The weird thing is that sometimes the program works flawlessly.
The detection model is being loaded correctly, since cv2.CascadeClassifier.empty(face_cascade) returns False.
There seems to be no problem with the captured frame, since I can display it properly.
After searching, I found that detectMultiScale() does, in fact, return an empty tuple when no faces are detected.
Python OpenCV face detection code sometimes raises `'tuple' object has no attribute 'shape'`
face_cascade = cv2.CascadeClassifier(
    'C:\\Users\\kj\\Desktop\\jeffery 1\\trained_models\\detection_models\\haarcascade_frontalface_alt.xml')
retval = cv2.CascadeClassifier.empty(face_cascade)
returns False
def video_cap(out_queue):
    video_capture = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    #video_capture.set(3, 768)
    #video_capture.set(4, 1024)
    while True:
        ret, bgr_image = video_capture.read()
        if video_capture.isOpened() == False :
        gray_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
        rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
        faces = detect_faces(face_detection, gray_image)
        ret_list = [gray_image, rgb_image, faces]
The video_cap function runs in its own thread.
def detect_faces(detection_model, gray_image_array):
    faces1 = detection_model.detectMultiScale(gray_image_array, scaleFactor=2, minNeighbors=10, minSize=(64, 64))
    while len(faces1) == 0:
        faces1 = detection_model.detectMultiScale(gray_image_array, scaleFactor=2, minNeighbors=10, minSize=(64, 64))
    return faces1
I get the output:
It goes on until I terminate the program.
How do I fix the problem?
This is a snippet of the code I used. I removed the optional arguments from the detectMultiScale() call and it ran fine.
Also, make sure you have the correct path to the XML files.
import cv2

classifier = cv2.CascadeClassifier("../../../l-admin/anaconda3/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml")
img = cv2.imread('../Tolulope/Adetula Tolulope (2).jpg')
face = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = classifier.detectMultiScale(face)
print(type(faces), faces)
# draw every detection first, then write the image once
# (cv2.imwrite returns a bool, so its result must not replace img)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 3)
cv2.imwrite("facesa.png", img)
On a secondary note, the reason mine worked might be that my camera could locate my face thanks to the lighting. So I suggest you try it with a still picture first before using video.
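Independently of the parameters, the retry loop in the question cannot terminate on a frame that has no detectable face, because it passes the same image to the detector every time. A minimal sketch of the alternative is to detect once per frame and let the caller skip frames with no face. Here detection_model is assumed to be a cv2.CascadeClassifier, process_frame is a hypothetical helper name, and scaleFactor=1.1 / minNeighbors=5 are common starting values that I am assuming, not values from the question (whose scaleFactor=2 halves the search scale at every step and can easily step over a face).

```python
def detect_faces(detection_model, gray_image):
    """Run the detector once and always return a list of (x, y, w, h) tuples."""
    faces = detection_model.detectMultiScale(
        gray_image, scaleFactor=1.1, minNeighbors=5, minSize=(64, 64)
    )
    # detectMultiScale gives an empty tuple when nothing is found and an
    # N x 4 array otherwise; normalise both cases to a plain list.
    return [tuple(int(v) for v in face) for face in faces]


def process_frame(detection_model, gray_image):
    """Return the detected faces, or None so the caller can skip this frame."""
    faces = detect_faces(detection_model, gray_image)
    if not faces:
        return None  # no face in this frame; grab the next one instead
    return faces
```

In the capture loop, a None result would simply mean `continue` to the next video_capture.read() rather than retrying the same frame.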
I had a similar issue when I used a JPG image. For me the main problem was always the image format: when I used a PNG, it immediately returned the tuple with correct values.
import cv2

classifier = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
# read the image
img = cv2.imread('i.png')
# show the image
#cv2.imshow('shaswat face detection ',img)
# convert the image to grayscale
grayscaled_img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# cv2.imshow('shaswat face detection ',grayscaled_img)
# detect faces in the image;
# returns one (x, y, w, h) rectangle per detected face
faces = classifier.detectMultiScale(grayscaled_img)
print(faces)
#cv2.rectangle(img , face_coordinates[0] , face_coordinates[1] , (255,0,0) , 10)
The output shows:
[[ 87 114 361 361]]
This is my first time posting here; I have searched a lot about this and sadly haven't found help.
I have a facial recognition algorithm, and I am currently trying to improve its efficiency.
After some study, I concluded that dlib.get_frontal_face_detector() was the function slowing down my code.
My approach, then, was to remove the background of my images and extract only the differences between two images. After that, I have pieces of the image, cropped from the full original, and these pieces containing only the differences are much smaller (e.g. full image: 1520 x 2592 pixels; cropped image with face: 150 x 200 pixels).
OBS: The detection camera is very far from the people being recognized, so the faces are tiny in the images and most of the frame is useless; that is why I decided to remove the useless parts.
BUT, here things got crazy:
When I pass the smaller, cropped face to dlib.get_frontal_face_detector(), it SOMETIMES doesn't detect it (there are times when it does, which is weird). It seems strange because when I pass the full image to the detector, at the same resolution and only with a larger shape, it detects the same face!
I think I am missing some theoretical information here...
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
new_image = []
# [-2:] works with both the 3-value (OpenCV 3) and 2-value (OpenCV 4) return
contours, hierarchy = cv2.findContours(im_bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for contour in contours:
    [x, y, w, h] = cv2.boundingRect(contour)
    #print(x, y, w, h)
    if h < 20 or w < 20:
        continue  # discard regions too small to contain a face
    new_image.append(img2_n[y*inv_coef:y*inv_coef+h*inv_coef, x*inv_coef:x*inv_coef+w*inv_coef])
for images in new_image:
    rgb_img = cv2.cvtColor(images, cv2.COLOR_BGR2RGB)
    dets = detector(rgb_img, 1)
Above is the code that extracts the smaller images and runs the detector on them. Below is the simple code where I pass the full image to the same detector.
full_img = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
dets = detector(full_img, 1)
Could anyone who understands a bit more about dlib, OpenCV and Python help me?
OBS¹: The images that are not being detected are larger than the threshold defined in the for loop to throw away pieces that are too small.
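One thing that may be worth testing (an assumption on my part, not a confirmed diagnosis): a sliding-window detector such as dlib's HOG-based one tends to need some context around the face, so a crop whose borders hug the face can fail where the full image succeeds. A minimal sketch of growing each bounding box by a margin before cropping follows; padded_box and the 50% margin are hypothetical choices for illustration.

```python
def padded_box(x, y, w, h, img_w, img_h, margin=0.5):
    """Grow (x, y, w, h) by `margin` of its size on each side, clipped to the image.

    Returns (x0, y0, x1, y1), ready to use as img[y0:y1, x0:x1].
    """
    pad_x = int(round(w * margin))
    pad_y = int(round(h * margin))
    x0 = max(0, x - pad_x)
    y0 = max(0, y - pad_y)
    x1 = min(img_w, x + w + pad_x)
    y1 = min(img_h, y + h + pad_y)
    return x0, y0, x1, y1
```

The crop would then be taken as `img2_n[y0:y1, x0:x1]`. The second argument of `detector(rgb_img, 1)` is the number of times dlib upsamples the image before searching, so raising it on the small crops is another cheap experiment.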
I am new to OpenCV and am currently working on 'Diabetic Retinopathy Detection' (a Kaggle competition launched 3 years ago; more details here: https://www.kaggle.com/c/diabetic-retinopathy-detection/data). Currently, I am trying to achieve image-processing results similar to those depicted in the image below (source: http://blog.kaggle.com/2015/09/09/diabetic-retinopathy-winners-interview-1st-place-ben-graham/):
So far I have tried different approaches, including histogram equalization and Contrast Limited Adaptive Histogram Equalization (CLAHE). CLAHE gives the best results so far, but nothing comparable to the images above. I got some ideas from here: (How to remove the local average color from an image with OpenCV) but couldn't reproduce the results. If someone can guide me on how it can be done with OpenCV or any other Python vision library, that would be great. Sample images can be downloaded from the Kaggle site (link above). Thanks.
Here is my code so far:
import cv2 as cv

def equalize_hist(input_path):
    img = cv.imread(input_path)
    # equalize all three colour channels (B, G and R)
    for c in range(3):
        img[:,:,c] = cv.equalizeHist(img[:,:,c])
    cv.imshow('Histogram equalized', img)
    cv.waitKey(0)

def clahe_rgb(input_path):
    bgr = cv.imread(input_path)
    lab = cv.cvtColor(bgr, cv.COLOR_BGR2LAB)
    # cv.split may return a tuple, so make it a list before assigning
    lab_planes = list(cv.split(lab))
    gridsize = 5
    clahe = cv.createCLAHE(clipLimit=2.0,tileGridSize=(gridsize,gridsize))
    # apply CLAHE to the lightness channel only
    lab_planes[0] = clahe.apply(lab_planes[0])
    lab = cv.merge(lab_planes)
    bgr2 = cv.cvtColor(lab, cv.COLOR_LAB2BGR)
    cv.imshow('CLAHE RGB', bgr2)
    cv.waitKey(0)

def clahe_greyscale(input_path):
    img = cv.imread(input_path)
    gray_image = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    clahe = cv.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    cl1 = clahe.apply(gray_image)
    cv.imshow('CLAHE Grayscale', cl1)
    cv.waitKey(0)
The code you show is doing a local histogram equalization, whereas the highlighted text you posted talks about removing the average color from each pixel.
Removing the average color could be done like this:
# Blur the image
blurred = cv2.blur(img, ksize=(15, 15))
# Take the difference with the original image
# Weight with a factor of 4x to increase contrast
dst = cv2.addWeighted(img, 4, blurred, -4, 128)
You can adjust the kernel size of the blur code (above it's 15) to find something that works for your use case.
You may need to downscale the images to a common size before doing this, to get comparable results (as also noted in the blog post you cite).
I am working on a human detection program using OpenCV and Python. I saw this very good example and ran it on the sample images it came with. It can detect people regardless of which way they are facing, and it copes decently with overlapping people and with motion blur.
However, when I ran it on some images of my own (mostly knee-up, waist-up, and chest-up photos of people), I found that the software mostly fails to detect them.
You can get the photos from this link. This is the code I am using:
# import the necessary packages
from __future__ import print_function
from imutils.object_detection import non_max_suppression
from imutils import paths
import numpy as np
import argparse
import imutils
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True, help="path to images directory")
args = vars(ap.parse_args())

# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# loop over the image paths
imagePaths = list(paths.list_images(args["images"]))
for imagePath in imagePaths:
    # load the image and resize it to (1) reduce detection time
    # and (2) improve detection accuracy
    image = cv2.imread(imagePath)
    image = imutils.resize(image, width=min(400, image.shape[1]))
    orig = image.copy()
    # detect people in the image
    (rects, weights) = hog.detectMultiScale(image, winStride=(4, 4),
        padding=(8, 8), scale=1.05)
    # draw the original bounding boxes
    for (x, y, w, h) in rects:
        cv2.rectangle(orig, (x, y), (x + w, y + h), (0, 0, 255), 2)
    # apply non-maxima suppression to the bounding boxes using a
    # fairly large overlap threshold to try to maintain overlapping
    # boxes that are still people
    rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in rects])
    pick = non_max_suppression(rects, probs=None, overlapThresh=0.65)
    # draw the final bounding boxes
    for (xA, yA, xB, yB) in pick:
        cv2.rectangle(image, (xA, yA), (xB, yB), (0, 255, 0), 2)
    # show some information on the number of bounding boxes
    filename = imagePath[imagePath.rfind("/") + 1:]
    print("[INFO] {}: {} original boxes, {} after suppression".format(
        filename, len(rects), len(pick)))
    # show the output images
    cv2.imshow("Before NMS", orig)
    cv2.imshow("After NMS", image)
    cv2.waitKey(0)
It is pretty straightforward. It goes through the images, finds the people in each one, then draws bounding rectangles. Where rectangles overlap heavily, non-maxima suppression collapses them into a single box, to prevent false positives and to avoid reporting more than one detection for a single person.
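To make the suppression step concrete, here is a minimal greedy sketch using intersection-over-union. It is not the imutils implementation (which orders and scores boxes differently); the function name nms and the largest-box-first ordering are assumptions chosen for illustration.

```python
import numpy as np


def nms(boxes, overlap_thresh=0.65):
    """Greedy non-maxima suppression on [x1, y1, x2, y2] boxes.

    Keeps the largest remaining box, drops every other box whose IoU with it
    exceeds overlap_thresh, and repeats until no boxes are left.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    if len(boxes) == 0:
        return np.empty((0, 4), dtype=int)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(areas)[::-1]  # no scores available, so largest first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # width and height of the intersection with each remaining box
        iw = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]))
        ih = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]))
        inter = iw * ih
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= overlap_thresh]
    return boxes[keep].astype(int)
```

Two nearly identical boxes around one person collapse to a single box, while a box elsewhere in the image is left alone. With a higher threshold such as 0.65, people standing close together are more likely to keep separate boxes.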
However, as I mentioned above, the code fails to recognize people when their legs or feet are out of frame.
Is there a way to make OpenCV recognize people when only part of their body (knee up, waist up, chest up) is present in a video? In my use cases, I don't think it is critical to look for arms and legs; as long as the torso and head are present, I should be able to detect the person.
I found the Haar upper-body cascade. Though it may not always work (I'll post a new question regarding this), it's a good start.
Here's the code:
import numpy as np
import cv2

img = cv2.imread('path/to/img.jpg', 0)  # 0 = load as grayscale
upperBody_cascade = cv2.CascadeClassifier('../path/to/haarcascade_upperbody.xml')
arrUpperBody = upperBody_cascade.detectMultiScale(img)
# detectMultiScale returns an empty tuple when nothing is found
if len(arrUpperBody) > 0:
    for (x, y, w, h) in arrUpperBody:
        print('body found')
But it's not as refined as the solution I lifted off of pyimagesearch.