I'm looking for ideas to help improve my current approach for real-time object detection using computer vision (specifically the OpenCV library). My goal is to accurately detect a golf ball through image processing in a wide variety of environments and lighting conditions. My detection process works well probably 80% of the time, but I'm hitting edge cases that cause failures I can't ignore. The edge case I'm focusing on right now is extreme shadows being cast by the golf ball. Here is a pair of example images: the coloured image is my source, and the black-and-white image is my post-processed result.
There are a few important variables to consider with my application:
Source coming from a video feed, and being processed in real-time
It can be windy, so camera shake can be an issue
The camera isn't guaranteed to be high quality, so I need to account for extra noise and limited resolution
I won't go into full details of the processing I'm doing to detect moving objects (Kalman filter, background subtraction, ...) as in this specific example I'm failing to detect a stationary object (i.e. the ball has come to rest).
Grab initial frame before any balls are in-frame as my base frame (this will be used for background subtraction)
convert image to greyscale
apply a median blur to eliminate noise, which can otherwise be pretty extreme due to a combination of camera shake and poor camera quality
apply an adaptive threshold to the image. I'm using ADAPTIVE_THRESH_GAUSSIAN_C and have been tuning the block size and C constant values as best I can
apply background subtraction (I'm using the built-in CNT Subtractor)
apply a small dilation kernel to the entire image to try to increase the size of the contours left after the above processing, as they can sometimes become quite small (after the blur filter, for example)
use opencv's "findContours" with RETR_TREE, and CHAIN_APPROX_SIMPLE parameters
walk the contour hierarchy, looking for "filled in" contours. The idea being that the golf balls should mostly be completely filled in, compared to other objects which will have an outline, and I can use the hierarchy to determine which contours are filled in or not (i.e. do they have child contours)
for each filled-in contour, compute the minimum enclosing circle. Compare the area of the enclosing circle to the contour area and filter by an acceptable difference to determine how circular the object is
another pass, filtering by min/max area since I can assume the camera will always be at a similar height, to home in on the object (a rough sketch of the whole pipeline is below)
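For reference, here is a minimal sketch of the pipeline above in Python/OpenCV. The block size, C constant, kernel size, circularity tolerance and area bounds are placeholder values, and the way the adaptive threshold and the foreground mask are combined is just one interpretation; the CNT subtractor lives in the opencv-contrib-python package as cv2.bgsegm.createBackgroundSubtractorCNT.

import cv2
import numpy as np

def find_ball_candidates(frame, subtractor):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.medianBlur(gray, 5)                          # knock down sensor noise
    thresh = cv2.adaptiveThreshold(blurred, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, 21, 4)   # blockSize / C need tuning
    fg = subtractor.apply(frame)                               # CNT background subtraction
    mask = cv2.bitwise_and(thresh, fg)                         # keep thresholded foreground only
    mask = cv2.dilate(mask, np.ones((3, 3), np.uint8))         # grow small blobs back

    contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE,
                                           cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x signature
    candidates = []
    if hierarchy is None:
        return candidates
    for i, cnt in enumerate(contours):
        if hierarchy[0][i][2] != -1:           # has a child contour -> outline, not filled in
            continue
        area = cv2.contourArea(cnt)
        (x, y), r = cv2.minEnclosingCircle(cnt)
        circle_area = np.pi * r * r
        if circle_area == 0 or area / circle_area < 0.7:   # not round enough
            continue
        if not 30 < area < 2000:               # min/max area filter, depends on camera height
            continue
        candidates.append((x, y, r))
    return candidates

# subtractor = cv2.bgsegm.createBackgroundSubtractorCNT()    # requires opencv-contrib-python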
As you can see from the images above, this approach runs into problems when the ball itself has a lot of contrast due to shadows. To me it looks like the adaptive threshold pass is filtering out the darker part of the ball (due to shadow), which creates a non-circular shape. Perhaps I need to dial in the adaptive threshold pass to allow for a bit more contrast, since we can assume shadows are always on the dark side and a ground shadow should be darker than the shaded part of the ball? I'd also like to completely eliminate the leftover outline of the ground shadow if possible. My guess is that the edges of the shadow being slightly lighter is the reason they don't get filtered out by my adaptive threshold pass. Open to any and all suggestions :-)
Related
My goal is to draw a rectangular border around the face by removing the neck area connected to the face region. All positive values here represent skin-colour pixels. So far I have filtered the image down to this binary result using OpenCV and Python. Code so far: skinid.py
Below is the test image.
Noise removal has also been applied to this binary image
Up to this point, I followed the paper "Face segmentation using skin-color map in videophone applications". For most of it, I used custom functions rather than built-in OpenCV functions because I wanted to do it from scratch (although some erosion, opening and closing operations were used to tune it up).
I want to know a way to split the neck from the whole face area and remove it like this, as I am quite new to image processing.
Perform a distance transform (built into OpenCV, or you could write it by hand; it's a pretty fun and easy one to write using the erode function iteratively and adding the result into another matrix each round, slow but conceptually easy). On the binary image you presented above (and, honestly, I think this generalizes across most mug shots), the highest value in the distance transform will be the centre of the face. That pixel is the centre of your box, and its value (after the distance transform) also gives you a pretty solid approximate face size, since it is the pixel distance from the centre of the face to the horizontal edges of the face. Depending on what you are after, you may just be able to multiply that distance by, say, 1.5 or so (figure out a standard face width-to-height ratio to choose your best multiplier), set that as your circle radius (or half side width for a box) and call it a day. Comment if you need anything clarified; I'm pretty confident in this answer and would be happy to write up some quick code (in C++ OpenCV) if it would help.
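Here is a quick sketch of that idea in Python/OpenCV (the input filename and the 1.5 multiplier are just placeholders):

import cv2

# "skin_mask" is assumed to be the binary skin image (white = skin, 8-bit)
skin_mask = cv2.imread('skin_mask.png', cv2.IMREAD_GRAYSCALE)   # placeholder filename
dist = cv2.distanceTransform(skin_mask, cv2.DIST_L2, 5)

# The brightest pixel of the distance map is roughly the face centre,
# and its value is roughly the half-width of the face
_, max_val, _, max_loc = cv2.minMaxLoc(dist)
cx, cy = max_loc
half_w = int(max_val)
half_h = int(max_val * 1.5)        # assumed width-to-height multiplier

x1, y1 = cx - half_w, cy - half_h
x2, y2 = cx + half_w, cy + half_h  # rectangle that should exclude the neck

out = cv2.cvtColor(skin_mask, cv2.COLOR_GRAY2BGR)
cv2.rectangle(out, (x1, y1), (x2, y2), (0, 255, 0), 2)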
(Alt idea.) You could tweak your colour filter a bit to reject darker areas; this will (at least in the image presented) create a nice separation between your face and neck due to the shadowing of the chin. (You may have to dial back your dilate/closing op though.)
I'm facing some general problems regarding the edge detection in an image (the image should be irrelevant for my question).
I want the canny edge detector to ignore a certain pixel value. For example: It should only look for edges if the gray value is not 0. Otherwise there will be "false edges" detected.
I usually use the cv2.Canny function, which works quite fast and well. The problem is, it is not customizable. So I took this code for a custom Canny edge detector (https://rosettacode.org/wiki/Canny_edge_detector#Python) in order to customize it. It works, but it calculates the edges far too slowly (it takes several minutes, whereas the cv2.Canny function takes a fraction of a second).
This is my first problem.
Is there another way to make the cv2.Canny function "ignore" pixels of a certain value? Imagine that somewhere in the picture there is an area filled with black (see the image below). I don't want the edge detector to detect the edge of this black area.
Once I have some clear edges detected in my image, I want to create masks based on those edges. I couldn't find any examples for this online. So if anyone knows where to find a good tutorial on how to create masks from edges it would be great if you could help me out.
Thanks in advance
Here's an approach:
Calculate your Canny as usual using the fast OpenCV function.
Now locate all the black pixels in the image. You can build a mask of the non-black pixels with _,thr = cv2.threshold(im,1,255,cv2.THRESH_BINARY), then grow the black areas by 1 pixel with morphology (i.e. erode the white mask) to allow for edges being offset a little, as they often are.
Multiply the normal Canny image by the mask you created so that anything found in the black areas gets multiplied by zero, i.e. lost.
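In code, that could look something like this (the filename is a placeholder, and bitwise_and is used as the multiply-by-the-mask step; eroding the white mask grows the black regions by a pixel):

import cv2
import numpy as np

# "im" is assumed to be a single-channel image
im = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)        # placeholder filename
edges = cv2.Canny(im, 100, 200)                           # usual fast Canny

# White where the image is not black, zero where it is black
_, keep = cv2.threshold(im, 1, 255, cv2.THRESH_BINARY)
# Grow the black regions by ~1 pixel (erode the white mask) so edges that
# sit just outside the black area are also suppressed
keep = cv2.erode(keep, np.ones((3, 3), np.uint8), iterations=1)

# Zero out any edges that fall in (or right next to) the black regions
edges_masked = cv2.bitwise_and(edges, keep)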
I am trying to obtain the relative depth of pixels in an image. For example, the image in https://www.awn.com/news/nvidia-unveils-quadro-rtx-worlds-first-ray-tracing-gpu . I don't need the precise distance of each pixel, which I believe would be impossible, but I would like to get something like "the green ball is further away than the other balls". Is it possible using OpenCV in Python? The code I have written can identify each ball, but not their relative distance or depth, so it is pretty much useless for my purposes.
That's an ill-posed problem (you cannot measure depth with a single RGB camera) and a topic of recent research. I found this survey paper. Most often a depth image is learned from an RGB image using convolutional neural networks.
However, if you use a lot of prior information about your scene (all objects in the image are circular, and the partially visible circles correspond to the ones in the background), then you might be able to do something with heuristic methods like thresholding, edge detection or Hough transforms, but it won't be easy.
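If you do go the heuristic route, a very rough sketch could look like the following (the filename and Hough parameters are placeholders); under the assumption that all the balls have the same physical size, a smaller detected radius implies a greater distance:

import cv2
import numpy as np

img = cv2.imread('balls.png')                              # placeholder filename
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=30,
                           param1=100, param2=40, minRadius=10, maxRadius=200)
if circles is not None:
    circles = np.round(circles[0]).astype(int)
    # Sort by radius: under the equal-size assumption, smallest radius = furthest away
    for x, y, r in sorted(circles, key=lambda c: c[2]):
        print(f'circle at ({x}, {y}), radius {r} px')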
I am trying to use OpenCV to measure size of filament ( that plastic material used for 3D printing)
What I am trying to do is measure filament size (the plastic material used for 3D printing). The idea is that I use an LED panel to illuminate the filament, then take an image with the camera, preprocess the image, apply edge detection and calculate its size. Most filaments are made of one colour, which is easy to preprocess and gives fine results.
The problem comes with transparent filament. I am not able to get useful results. I would like to ask for a little help, or for someone to push me in the right direction. I have already tried cropping the image to a height a bit larger than the filament and a width of just a few pixels, and calculating the size using the number of pixels in those crops, but this did not work very well. So now I am here, trying to do it with edge detection.
works well for filaments of single colour
not working for transparent filament
The code below works just fine for common filaments; the problem is when I try to use it for transparent filament. I have tried adjusting the thresholds for the Canny function. I have tried different colour spaces. But I am not able to get good results.
Images that may help to understand:
https://imgur.com/gallery/CIv7fxY
import cv2 as cv

image = cv.imread("../images/img_fil_2.PNG")   # load image
gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)   # convert image to grayscale
edges = cv.Canny(gray, 100, 200)               # detect edges of image
You can use the assumption that the images are taken under the same conditions.
Your main problem is that the reflections in the transparent filament are detected as edges. But, since the image is relatively simple, without any other edges, you can simply take the upper and the lower edge, and measure the distance between them.
A simple way of doing this is to take 2 vertical lines (e.g. image sides), find the edges that intersect the line (basically traverse a column in the image and find edge pixels), and connect the highest and the lowest points to form the edges of the filament. This also removes the curvature in the filament, which I assume is not needed for your application.
You might want to use 3 or 4 vertical lines, for robustness.
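A rough sketch of that column-scanning idea (the filename, Canny thresholds and choice of sample columns are placeholders):

import cv2
import numpy as np

image = cv2.imread('../images/img_fil_2.PNG')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)

h, w = edges.shape
widths = []
for x in (int(w * 0.25), int(w * 0.5), int(w * 0.75)):   # 3 sample columns
    ys = np.flatnonzero(edges[:, x])                      # rows containing edge pixels
    if len(ys) >= 2:
        widths.append(ys.max() - ys.min())                # top edge to bottom edge

if widths:
    print('filament width (px):', np.median(widths))      # median for robustness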
I'm currently working on a computer vision project, and got most of my algorithm working. However I'm currently doing background subtraction manually on every image. This is because the most common background subtraction algorithms that I can find make use of thresholding, and my project should deal with backgrounds both brighter and darker than the object I want to extract.
This is the way I am subtracting the background currently (using python and the scikit stack):
val = filters.threshold_otsu(image)
return image > val
Of course, this only works with backgrounds darker than the subject.
I had the idea of determining whether the background is bright and then, depending on that, flipping the sign of the inequality, but I could not find a way to do that.
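Something like the following is roughly what I was imagining (it assumes the image border is mostly background, which I am not sure is safe):

import numpy as np
from skimage import filters

def subtract_background(image):
    val = filters.threshold_otsu(image)
    # Estimate background brightness from a thin border around the image
    border = np.concatenate([image[0, :], image[-1, :], image[:, 0], image[:, -1]])
    if border.mean() < val:          # background is dark -> keep bright pixels
        return image > val
    else:                            # background is bright -> keep dark pixels
        return image < val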
Is there a background subtraction algorithm which is able to handle both bright and dark backgrounds, or is there another way to solve this problem?
There is no fixed method for solving your problem in general. Foreground and background can be defined differently depending on the situation.
That being said, it is not impossible to use some heuristic method to make the algorithm work on your dataset. It will be helpful if you can share some of the images to give us a better understanding of your definition of foreground and background.
Here are some heuristic methods that might help:
Run Otsu thresholding with both THRESH_BINARY and THRESH_BINARY_INV. Then, assuming your foreground is always centered, choose the result where a large portion of the center region is white (sketched below the list).
If the foreground is always larger than the background, or vice versa, compare the area of the white region instead.
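A rough sketch of the first heuristic (the size of the centre window is an arbitrary choice):

import cv2
import numpy as np

# Run Otsu both ways and keep the result whose centre region is mostly white
# (assumes the foreground is roughly centred and the input is grayscale)
def otsu_centered(img_gray):
    _, fg = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, fg_inv = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    h, w = img_gray.shape
    cy, cx = h // 2, w // 2
    centre = (slice(cy - h // 8, cy + h // 8), slice(cx - w // 8, cx + w // 8))

    # Pick whichever result has more white in the central window
    if np.count_nonzero(fg[centre]) >= np.count_nonzero(fg_inv[centre]):
        return fg
    return fg_inv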
There are several automatic thresholding techniques available. One of them is Otsu.
http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html
It is implemented in opencv (https://docs.opencv.org/trunk/d7/d4d/tutorial_py_thresholding.html)
import cv2

img = cv2.imread('noisy2.png', 0)   # read image as grayscale
# Otsu's method picks the threshold automatically (the 0 passed in is ignored)
ret2, th2 = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)