I am working on a human detection program using OpenCV and Python. I found this very good example and ran it on the samples it provided. It can detect people regardless of which way they are facing, and it handles overlapping detections and motion blur reasonably well.
However, when I ran it on some of my own images (mostly knee-up, waist-up, and chest-up photos of people), I found that the software doesn't really detect people.
You can get the photos from this link. This is the code I am using:
# import the necessary packages
from __future__ import print_function
from imutils.object_detection import non_max_suppression
from imutils import paths
import numpy as np
import argparse
import imutils
import cv2
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True, help="path to images directory")
args = vars(ap.parse_args())
# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
# loop over the image paths
imagePaths = list(paths.list_images(args["images"]))
for imagePath in imagePaths:
    # load the image and resize it to (1) reduce detection time
    # and (2) improve detection accuracy
    image = cv2.imread(imagePath)
    image = imutils.resize(image, width=min(400, image.shape[1]))
    orig = image.copy()

    # detect people in the image
    (rects, weights) = hog.detectMultiScale(image, winStride=(4, 4),
        padding=(8, 8), scale=1.05)

    # draw the original bounding boxes
    for (x, y, w, h) in rects:
        cv2.rectangle(orig, (x, y), (x + w, y + h), (0, 0, 255), 2)

    # apply non-maxima suppression to the bounding boxes using a
    # fairly large overlap threshold to try to maintain overlapping
    # boxes that are still people
    rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in rects])
    pick = non_max_suppression(rects, probs=None, overlapThresh=0.65)

    # draw the final bounding boxes
    for (xA, yA, xB, yB) in pick:
        cv2.rectangle(image, (xA, yA), (xB, yB), (0, 255, 0), 2)

    # show some information on the number of bounding boxes
    filename = imagePath[imagePath.rfind("/") + 1:]
    print("[INFO] {}: {} original boxes, {} after suppression".format(
        filename, len(rects), len(pick)))

    # show the output images
    cv2.imshow("Before NMS", orig)
    cv2.imshow("After NMS", image)
    cv2.waitKey(0)
It is pretty straightforward. It goes through the images, finds the people in each one, then draws bounding rectangles. Overlapping rectangles are merged to prevent false positives and to avoid counting a single person as multiple detections.
However, as I mentioned above, the code fails to recognize people when their legs and feet aren't visible in the frame.
Is there a way to make OpenCV recognize people who only have part of their body (knee up, waist up, chest up) visible in a video? In my use case I don't think it is critical to detect arms and legs; as long as the torso and head are present, that should be enough.
I found the Haar upper-body cascade. Though it may not always work (I'll post a new question about that), it's a good start.
Here's the code:
import numpy as np
import cv2

img = cv2.imread('path/to/img.jpg', 0)

upperBody_cascade = cv2.CascadeClassifier('../path/to/haarcascade_upperbody.xml')

arrUpperBody = upperBody_cascade.detectMultiScale(img)
# detectMultiScale returns an empty tuple when nothing is found
if len(arrUpperBody) > 0:
    for (x, y, w, h) in arrUpperBody:
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
    print('body found')

cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
But it's not as refined as the solution I lifted off of pyimagesearch.
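One way to make the cascade output a bit more refined is to run its detections through the same non_max_suppression step used in the HOG pipeline above. This is only a sketch of that idea; the file paths and the detectMultiScale parameters are placeholder assumptions to tune on your own images.
import cv2
import numpy as np
from imutils.object_detection import non_max_suppression

img = cv2.imread('path/to/img.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cascade = cv2.CascadeClassifier('path/to/haarcascade_upperbody.xml')
detections = cascade.detectMultiScale(gray, scaleFactor=1.05, minNeighbors=5,
                                      minSize=(60, 60))

# convert (x, y, w, h) boxes to (x1, y1, x2, y2) and suppress overlaps
boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in detections])
pick = non_max_suppression(boxes, probs=None, overlapThresh=0.65)

for (x1, y1, x2, y2) in pick:
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.imshow('upper bodies', img)
cv2.waitKey(0)
cv2.destroyAllWindows()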
Related
I am trying to detect as many circles as possible in my images using the following code:
maxRadius = int(1.2*(width/16)/2)
minRadius = int(0.9*(width/16)/2)

gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)

circles = cv.HoughCircles(image=gray,
                          method=cv.HOUGH_GRADIENT,
                          dp=1.2,
                          minDist=2*minRadius,
                          param1=70,
                          param2=0.9,
                          minRadius=minRadius,
                          maxRadius=maxRadius
                          )
Although it works for some of the images, there are a few for which it doesn't.
Below we can see that for two different images that represent the same kind of experiment, my algorithm yields very different results.
How can I fix this? Should I apply some sort of filter on the images first to enhance the contrast?
EDIT: added original image:
This solution may or may not work on other images, but it does work on the one you posted. You may need to find the "sweet spot" for the adaptiveThreshold and HoughCircles parameters so that it works on other images as well.
import numpy as np
import cv2
import matplotlib.pyplot as plt

rgb = cv2.imread('/path/to/your/image/cells_0001.jpeg')
gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
imh, imw = gray.shape

th = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 11, 2)

maxRadius = int(1.2*(imw/16)/2)
minRadius = int(0.9*(imw/16)/2)

circles = cv2.HoughCircles(image=th,
                           method=cv2.HOUGH_GRADIENT,
                           dp=1.2,
                           minDist=2*minRadius,
                           param1=70,
                           param2=25,
                           minRadius=minRadius,
                           maxRadius=maxRadius
                           )

out_img = rgb.copy()
if circles is not None:  # HoughCircles returns None when no circles are found
    for (x, y, r) in circles[0]:
        # draw the circle in the output image (cv2.circle needs integer coordinates)
        cv2.circle(out_img, (int(x), int(y)), int(r), (0, 255, 0), 1)
plt.imshow(out_img)
I work at a studio that does school photos, and we are trying to make a script to eliminate the job of cropping each photo to a template. The photos we work with are fairly uniform, but they vary a bit in resolution and head position. I took up the mantle of writing the script with my fairly limited Python knowledge, and through a lot of trial and error and online resources I think I have got most of the way there.
At the moment I am trying to figure out the best way to crop the image from the NumPy array with the head where I want it, and I just can't find a good, flexible solution. The head needs to be positioned slightly differently for pose 1 and pose 2, so it needs to be easy to change on the fly (I'll probably implement some sort of simple GUI to input things like that, but for now I can just change the code).
I also need to be able to change the output resolution of the photos so they are all uniform (2000x2500). Anyone have any ideas?
At the moment this is my current code, it just saves the detected face square:
import cv2
import os.path
import glob

# Cascade path
cascPath = 'haarcascade_frontalface_default.xml'

# Create the haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)

# Check for the output folder and create it if it's not there
if not os.path.exists('output'):
    os.makedirs('output')

# Read images
images = glob.glob('*.jpg')

for c, i in enumerate(images):
    image = cv2.imread(i, 1)

    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Find face(s) using cascade
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,    # size of groups
        minNeighbors=5,     # how many neighbouring detections are needed for a face to be valid
        minSize=(500, 500)  # min size in pixels for a face
    )

    # Output the number of faces found in the image
    print('Found {0} faces!'.format(len(faces)))

    # Crop the detected face square
    for (x, y, w, h) in faces:
        imgCrop = image[y:y+h, x:x+w]

    if len(faces) > 0:
        # Save the image to the output folder with its original name
        cv2.imwrite('output/' + i, imgCrop)
I can crop using it like this:
# Crop Padding
left = 300
right = 300
top = 400
bottom = 1000
for (x, y, w, h) in faces:
    imgCrop = image[y-top:y+h+bottom, x-left:x+w+right]
but that outputs fairly arbitrary resolutions, which change based on the input image resolution.
TL;DR
To set a new output resolution you can use cv2.resize. Resampling loses or interpolates pixels, so choose an appropriate interpolation method.
The crop comes from an image loaded by face_recognition, which is in RGB order, while cv2.imwrite expects BGR, so convert before saving.
crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR channel order
cv2.imwrite("image-1.png", crop)
Suggestion:
One approach is to use Python's face-recognition library.
The approach uses two sample images for training.
It then predicts the faces in the next image based on the training images.
For instance, the following are the training images:
We want to predict the faces in the image below:
When we get the facial encodings of the training images and apply them to the next image:
import face_recognition
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw
# Load a sample picture and learn how to recognize it.
first_image = face_recognition.load_image_file("images/ex.jpg")
first_face_encoding = face_recognition.face_encodings(first_image)[0]
# Load a second sample picture and learn how to recognize it.
second_image = face_recognition.load_image_file("images/index.jpg")
sec_face_encoding = face_recognition.face_encodings(second_image)[0]
# Create arrays of known face encodings and their names
known_face_encodings = [
first_face_encoding,
sec_face_encoding
]
print('Learned encoding for', len(known_face_encodings), 'images.')
# Load an image with an unknown face
unknown_image = face_recognition.load_image_file("images/babes.jpg")
# Find all the faces and face encodings in the unknown image
face_locations = face_recognition.face_locations(unknown_image)
face_encodings = face_recognition.face_encodings(unknown_image, face_locations)
# Convert the image to a PIL-format image so that we can draw on top of it with the Pillow library
# See http://pillow.readthedocs.io/ for more about PIL/Pillow
pil_image = Image.fromarray(unknown_image)
# Create a Pillow ImageDraw Draw instance to draw with
draw = ImageDraw.Draw(pil_image)
# Loop through each face found in the unknown image
for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
    matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
    face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
    best_match_index = np.argmin(face_distances)
    draw.rectangle(((left, top), (right, bottom)), outline=(0, 0, 255), width=5)
# Remove the drawing library from memory as per the Pillow docs
del draw
# Display the resulting image
plt.imshow(pil_image)
plt.show()
The output will be:
The above is my suggestion. When you resample the crop to a new resolution there is pixel loss, so you need to choose an interpolation method.
For instance, after finding the face locations, select the coordinates in the original image:
# Add after the draw.rectangle call.
crop = unknown_image[top:bottom, left:right]
Set the new resolution to 2000 x 2500 and interpolate with cv2.INTER_LANCZOS4.
Possible question: why cv2.INTER_LANCZOS4?
Of course, you can select whatever you like, but in this post cv2.INTER_LANCZOS4 was suggested.
crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
Save the image:
crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR channel order
cv2.imwrite("image-1.png", crop)
The outputs are around 4.3 MB, so I can't display them here.
From the final result, we can clearly see and identify the faces. The library precisely finds the faces in the image.
Here is what you can do:
Either use training images from your own set, or use the example above.
Apply the face-recognition functions to each image, crop and resize using the detected face locations, and save the results to the output directory (see the sketch below).
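A minimal sketch of that loop, combining the face_recognition locations with the resize-and-save steps above, might look like the following; the folder names are placeholder assumptions, and the 2000x2500 output size comes from the question.
import os
import glob
import cv2
import face_recognition

os.makedirs("output", exist_ok=True)

for path in glob.glob("images/*.jpg"):
    image = face_recognition.load_image_file(path)  # RGB array
    locations = face_recognition.face_locations(image)

    for n, (top, right, bottom, left) in enumerate(locations):
        crop = image[top:bottom, left:right]
        crop = cv2.resize(crop, dsize=(2000, 2500),
                          interpolation=cv2.INTER_LANCZOS4)
        crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR)  # imwrite expects BGR
        name = os.path.splitext(os.path.basename(path))[0]
        cv2.imwrite(os.path.join("output", "{}_{}.png".format(name, n)), crop)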
Here is how I got it to crop the way I wanted; this is added right below the "output number of faces" print:
# Get the face position and output the values into variables, might not be needed but I did it
for (x, y, w, h) in faces:
    xdis = x
    ydis = y
    w = w
    h = h

# Get the scale value by dividing the wanted head height by the detected head height
ws = 600 / w
hs = 600 / h

# Scale the image to get the head to the right size, uses bilinear interpolation by default
scale = cv2.resize(image, (0, 0), fx=hs, fy=ws)

# Calculate the head position for the given values
sxdis = int(xdis * ws)  # applying the scale to the x distance and turning it into an integer
sydis = int(ydis * hs)  # applying the scale to the y distance and turning it into an integer

sycent = sydis + 300    # adding half the head height to get the center
ystart = sycent - 700   # subtract where you want the head center to be in pixels, this is for the vertical
yend = ystart + 2500    # add whatever you want the vertical resolution to be

xcent = sxdis + 300     # adding half the head width to get the center
xstart = xcent - 1000   # subtract where you want the head center to be in pixels, this is for the horizontal
xend = xstart + 2000    # add whatever you want the horizontal resolution to be

# Crop the image
cropped = scale[ystart:yend, xstart:xend]
It's a mess, but it works exactly how I wanted it to.
I ended up going with OpenCV instead of switching to the face-recognition library because of speed, but I might switch over if I can get multithreading to work with face-recognition.
In the original picture, I would like to detect circular regions (glands). I managed to extract the outlines of the regions, but because of the many smaller objects (nuclei), I cannot go any further.
My original idea was to remove the small objects using the cv2.connectedComponentsWithStats function. Unfortunately, as shown in the picture, the gland regions also contain small objects and are not connected properly. The function also throws out the small regions that outline the glands, leaving some parts out of the contours.
Can someone help me to find a solution to this problem?
Thank you very much in advance
Original picture
The approximate contour of the glands (with a lot of small objects in it)
After cv2.connectedComponentsWithStats
OpenCV
I think you can solve your task by using the Hough transform. Something like this could work for you (you have to adjust the parameters according to your needs):
import sys
import cv2 as cv
import numpy as np


def main(argv):
    filename = argv[0]
    src = cv.imread(filename, cv.IMREAD_COLOR)
    if src is None:
        print('Error opening image!')
        print('Usage: hough_circle.py [image_name]\n')
        return -1

    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
    gray = cv.medianBlur(gray, 5)

    rows = gray.shape[0]
    circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 32,
                              param1=100, param2=30,
                              minRadius=20, maxRadius=200)

    if circles is not None:
        circles = np.uint16(np.around(circles))
        for i in circles[0, :]:
            center = (i[0], i[1])
            # circle center
            cv.circle(src, center, 1, (0, 100, 100), 3)
            # circle outline
            radius = i[2]
            cv.circle(src, center, radius, (255, 0, 255), 2)

    cv.imshow("detected circles", src)
    cv.waitKey(0)
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
Some additional preprocessing might be required to get rid of the noise; morphological transformations, and performing edge detection right before the Hough transform, might help as well.
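For example, a rough preprocessing sketch could look like the following; the file name, blur size, closing kernel, Canny thresholds and Hough parameters are all assumptions to adjust for your images.
import cv2 as cv
import numpy as np

src = cv.imread('glands.png', cv.IMREAD_COLOR)
gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
gray = cv.medianBlur(gray, 7)

# morphological closing to suppress the small nuclei inside the glands
kernel = cv.getStructuringElement(cv.MORPH_ELLIPSE, (15, 15))
closed = cv.morphologyEx(gray, cv.MORPH_CLOSE, kernel)

# edge map as input for the circle detection
edges = cv.Canny(closed, 50, 150)

circles = cv.HoughCircles(edges, cv.HOUGH_GRADIENT, 1, closed.shape[0] / 32,
                          param1=100, param2=30, minRadius=20, maxRadius=200)

if circles is not None:
    for x, y, r in np.uint16(np.around(circles))[0, :]:
        cv.circle(src, (x, y), r, (255, 0, 255), 2)

cv.imshow("circles on preprocessed image", src)
cv.waitKey(0)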
Neural Networks
Another option would be to use a neural network for image segmentation. A quite successful one is Mask RCNN. There is already a working python implementation on GitHub: Mask RCNN - Nucleus.
I'm trying to build a people detection function using OpenCV and Python, but I need some help understanding a few things.
Images come from a mobile phone taped to a window.
Here's the code:
# USAGE
# python detect.py --images images
from __future__ import print_function
from glob import glob
from imutils.object_detection import non_max_suppression
import argparse
import cv2
import numpy as np
import os
# parse arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True, help="path to images directory")
args = vars(ap.parse_args())
# strip last / if present
arg_images = args["images"].rstrip("/")
# check folder exists
if not os.path.isdir(arg_images):
    print(arg_images + " is not a folder, terminate")
    quit()

# load image names
imagePaths = sorted(glob(arg_images + "/*.jpg"))

# check folder is not empty
if len(imagePaths) == 0:
    print(arg_images + " is empty, terminate")
    quit()
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
for imagePath in imagePaths:
    # INTER_NEAREST - a nearest-neighbor interpolation
    # INTER_LINEAR - a bilinear interpolation (used by default)
    # INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
    # INTER_CUBIC - a bicubic interpolation over a 4x4 pixel neighborhood
    # INTER_LANCZOS4 - a Lanczos interpolation over an 8x8 pixel neighborhood
    image = cv2.imread(imagePath)
    image = cv2.resize(image, (min(800, image.shape[1]), min(600, image.shape[0])), interpolation=cv2.INTER_LINEAR)
    #image = cv2.resize(image, (min(1200, image.shape[1]), min(900, image.shape[0])), interpolation=cv2.INTER_LINEAR)

    (rects, weights) = hog.detectMultiScale(image, winStride=(4, 4), padding=(0, 0), scale=1.01)

    foldername = imagePath[0:imagePath.rfind("/")]
    filename = imagePath[imagePath.rfind("/") + 1:]

    if len(rects) == 0:
        print("- " + filename)
    else:
        rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in rects])
        pick = non_max_suppression(rects, probs=None, overlapThresh=0.65)
        for (xA, yA, xB, yB) in pick:
            cv2.rectangle(image, (xA, yA), (xB, yB), (0, 255, 0), 2)
        print("+ " + filename)
        cv2.imwrite(foldername + "/detected/" + filename, image)
When I run this on these images (test.zip) I get very different results depending on the image size:
800x600: the fewest false positives, roughly 40% at night and 10% during the day
1200x900: a lot more false positives
original size: looks more like random guessing than detection
Can I assume this is because detectMultiScale works with a small detection window, and I can change the stride but not the window size?
Also, if you look at IMG_20180329_061603.jpg, it is a false positive that is always detected, but I can't understand why. Night pictures all look the same (to me), yet too many of them get a green box.
Any help with this is welcome. If you need clarifications, please ask.
Thanks
The reason for what you see is that the HOG descriptor used for people detection has a 64x128 window, so you want to tune the scale parameter and the image size to get mostly true positives. You probably mostly care about a person getting near the bike, so people who are far away (and therefore small) are not that important. I would collect a set of "training" images to find the optimal parameter values. Also, you're ignoring the weights returned by detectMultiScale(); you can inspect those values for your true positives and false positives and pick a threshold that separates them.
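As a rough sketch of using those weights, low-confidence boxes can be dropped before NMS; the 0.5 cutoff is only a placeholder to calibrate on your own "training" images, and hog and image are assumed to be set up as in the script above.
import cv2
import numpy as np
from imutils.object_detection import non_max_suppression

# assuming `hog` and `image` are prepared as in the script above
(rects, weights) = hog.detectMultiScale(image, winStride=(4, 4),
                                        padding=(0, 0), scale=1.01)

if len(rects) > 0:
    # keep only detections whose SVM weight exceeds the placeholder threshold
    boxes = np.array([[x, y, x + w, y + h]
                      for (x, y, w, h), score in zip(rects, np.ravel(weights))
                      if score > 0.5])
    pick = non_max_suppression(boxes, probs=None, overlapThresh=0.65)
    for (xA, yA, xB, yB) in pick:
        cv2.rectangle(image, (xA, yA), (xB, yB), (0, 255, 0), 2)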
This is an antibiotic assay on a petri dish:
I'm working on a project automating the reading of antibiotic assays of the petri-dish kind. In these tests, bacteria from a patient are spread over the petri dish and allowed to grow into a "bacterial lawn." Once the bacteria cover the entire surface of the dish, pills, either of different medicines or of the same medicine in different concentrations, are placed onto the dish, and after 24 hours the kill zones, if they exist, are measured. Kill zones are regions where the medicine has killed off the bacteria for some radius around the pill. The radius of the kill zone determines whether the bacteria are susceptible or resistant to the medication at the different concentrations.
I've been using a training set of 20 images in a folder to generalize my results. I've been using the opencv, skimage, numpy, scipy and matplotlib Python libraries to build this program. So far I've been able to accurately identify the petri dish's rim with a Hough circle finder and a morphological gradient transform of the image. I've also been able to use SURF to identify the positions of the pills in the image.
My question:
The issue is that there isn't enough contrast between the kill-zone edge and the bacterial lawn for HoughCircleFinder to find these circular zones. Could anyone suggest a method to accurately identify these kill zones?
import numpy as np
import cv2
from skimage import io,img_as_float
from matplotlib import pyplot as plt
import os
from skimage.util import img_as_ubyte
from skimage.color import rgb2gray
from skimage.filters import sobel
import matplotlib.patches as patches
import matplotlib.cbook as cbook
from skimage.filters.rank import entropy
from skimage.morphology import disk
from skimage import io,exposure
from skimage.segmentation import slic
from skimage import io, segmentation
from skimage.color import label2rgb,rgb2gray
from scipy import signal
import scipy
from skimage.future import graph
from skimage.feature import peak_local_max
def HoughCircleFinder(filtered_image, image):
    output = image.copy()
    gray = img_as_ubyte(filtered_image)
    # gray = cv2.cvtColor(img_as_ubyte(image), cv2.COLOR_BGR2GRAY)
    # detect circles in the image
    circles = cv2.HoughCircles(gray, cv2.cv.CV_HOUGH_GRADIENT, 5, minDist = int(0.45 * max(image.shape)), minRadius = int(0.25 * max(image.shape)), maxRadius = int(0.6 * max(image.shape)))
    # ensure at least some circles were found
    if circles is not None:
        # convert the (x, y) coordinates and radius of the circles to integers
        circles = np.round(circles[0, :]).astype("int")
        # loop over the (x, y) coordinates and radius of the circles
        for (x, y, r) in circles:
            print x, y, r
            # draw the circle in the output image, then draw a rectangle
            # corresponding to the center of the circle
            # filter circles that go out of view of the image
            if not (image.shape[0]*0.25) < x < (image.shape[0]*0.75):
                continue
            elif not (image.shape[0]*0.25) < y < (image.shape[0]*0.75):
                continue
            else:
                # output = CropCircle(output,x,y,r)
                cv2.circle(output, (x, y), r, (0, 255, 0), 4)
                cv2.rectangle(output, (x - 5, y - 5), (x + 5, y + 5), (0, 128, 255), -1)
    return output


def MorphologicalGradient(img):
    kernel = np.ones((7,3),np.uint8)
    # kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
    gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)
    return gradient


def SURF(img): #image1
    featurels = []
    surf = cv2.SURF(7000)
    kp, des = surf.detectAndCompute(img, None)
    for i in kp:
        # print round(i.pt[0])
        featurels.append(i.pt)
    surf.hessianThreshold = 50000
    img2 = cv2.drawKeypoints(img, kp, None, (255,0,0), 4)
    return img2, featurels
# make list of images from file
pic_list = [os.path.join("/Users/sethcommichaux/Desktop/PetriKillZone/PetriDishes/", pic) for pic in os.listdir("/Users/sethcommichaux/Desktop/PetriKillZone/PetriDishes/")]

# for loop for processing images and getting useful data
for image in pic_list[1:]:
    print image
    image1 = cv2.imread(image, 0)  # grayscale image
    image = cv2.imread(image)      # color image
    print "number of pixels in image: ", image.size
    print "image shape (if grayscale will be 2 tuple, if color 3 or more): ", image.shape
    image = HoughCircleFinder(MorphologicalGradient(image1), image)
    print 'image data type: ', image.dtype
    plt.figure()
    io.imshow(image)
    plt.show()
    # part that finds pills
    image, features = SURF(image)
    plt.figure()
    io.imshow(image)
    plt.show()
I have used the SURF keypoints to identify the locations of the pills, but I'm at a loss for how to find the kill zones. Below is code for getting plots and histograms along the x-axis going out from each pill.
for row in features:
    PeakFind(image[round(row[1])])
    print image[round(row[1])]
    plt.figure()
    plt.plot(range(len(image[round(row[1])])), image[round(row[1])])
    plt.show()
    # plt.figure()
    # x = plt.hist(image[round(row[0])])

print features
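As a rough sketch of the radial-profile idea described above, not a tested solution: walk outward from each pill center, average the grayscale intensity on rings of growing radius, and take the radius where the profile falls back toward the bacterial-lawn level as the kill-zone edge. The pill_centers list, the radius range, and the 0.5 drop threshold are placeholder assumptions, and the sketch assumes the kill zone is brighter than the lawn (flip the comparison if it is darker in your images).
import numpy as np
import cv2

def kill_zone_radius(gray, center, r_min=15, r_max=200, drop=0.5):
    h, w = gray.shape
    cx, cy = center
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2)

    # mean intensity on thin rings of increasing radius
    # (assumes each ring lies fully inside the image)
    profile = np.array([gray[(dist >= r) & (dist < r + 1)].mean()
                        for r in range(r_min, r_max)])

    lawn = profile[-20:].mean()   # assume the outermost rings are lawn
    peak = profile[:20].max()     # assume the innermost rings are kill zone
    threshold = lawn + drop * (peak - lawn)
    below = np.where(profile < threshold)[0]
    return r_min + int(below[0]) if len(below) else None

gray = cv2.imread('petri.jpg', cv2.IMREAD_GRAYSCALE)
pill_centers = [(420, 310)]  # placeholder: SURF keypoint positions
for c in pill_centers:
    print('kill zone radius around {}: {}'.format(c, kill_zone_radius(gray, c)))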