I'm detecting faces with a Haar cascade and tracking them from a webcam feed using OpenCV. I need to save each tracked face, but the problem is when people are moving: the cropped face comes out motion-blurred.
I've tried to mitigate this with OpenCV's DNN face detector and the Laplacian variance, using the following code:
blob = cv2.dnn.blobFromImage(cropped_face, 1.0, (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()
confidence = detections[0, 0, 0, 2]  # confidence of the strongest detection

# variance of the Laplacian as a sharpness measure
blur = cv2.Laplacian(cropped_face, cv2.CV_64F).var()

if confidence >= confidence_threshold and blur >= blur_threshold:
    cv2.imwrite('less_blurry_image.jpg', cropped_face)  # imwrite needs a file extension
Here I tried to save a face only when it is not blurred by motion, by setting blur_threshold to 500 and confidence_threshold to 0.98 (i.e. 98%).
But the problem is that if I change the camera I have to tune the thresholds again by hand, and in most cases a fixed threshold discards most of the faces.
Plus, the blur is hard to detect this way, since the background inside the crop usually stays sharp compared to the blurred face.
So my question is: how can I detect this motion blur on a face? I know I could train an ML model for motion-blur detection on faces, but that would require heavy processing resources for a small task.
Moreover, I would need a huge amount of annotated data for training if I went that route, which is not easy to get for a student like me.
Hence, I am trying to detect this with OpenCV, which will be a lot less resource-intensive than using an ML model.
Is there any less resource-intensive solution for this?
You can probably use a Fourier Transform (FFT) or a Discrete Cosine Transform (DCT) to figure out how blurred your faces are. Blur in images leads to high frequencies disappearing, and only low frequencies remaining.
So you'd take an image of your face, zero-pad it to a size that'll work well for FFT or DCT, and look at how much spectral power you have at the higher frequencies.
You probably don't need FFT - DCT will be enough. The advantage of DCT is that it produces a real-valued result (no imaginary part). Performance-wise, FFT and DCT are really fast for sizes that are powers of 2, as well as for sizes that have only factors 2, 3 and 5 in them (although if you also have 3's and 5's it'll be a bit slower).
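For example, here is a minimal sketch of that idea using OpenCV's DCT; the 64x64 patch size and the "top-left quarter = low frequencies" split are assumptions you would want to tune:
import cv2
import numpy as np

def high_freq_ratio(face_bgr, size=64):
    # Work on a fixed-size grayscale patch so the score is comparable across crops
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (size, size)).astype(np.float32)
    coeffs = np.abs(cv2.dct(gray))               # 2-D DCT, real-valued
    coeffs[0, 0] = 0.0                           # drop the DC term so brightness doesn't dominate
    low = coeffs[:size // 4, :size // 4].sum()   # top-left block = lowest frequencies
    total = coeffs.sum() + 1e-9
    return 1.0 - low / total                     # smaller => less high-frequency energy => blurrier
Because this is a ratio rather than an absolute variance, it is less camera-dependent than a fixed Laplacian cutoff, although you still need to calibrate a threshold (or compare crops of the same tracked face against each other).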
As mentioned by @PlinyTheElder, DCT information can give you a measure of motion blur. I am attaching the code snippet from the repo below:
The code is in C and I am not sure whether there is a Python binding for libjpeg; otherwise you need to create one.
/* Fast blur detection using JPEG DCT coefficients
*
* Based on "Blur Determination in the Compressed Domain Using DCT
* Information" by Xavier Marichal, Wei-Ying Ma, and Hong-Jiang Zhang.
*
* Tweak MIN_DCT_VALUE and MAX_HISTOGRAM_VALUE to adjust
* effectiveness. I reduced these values from those given in the
* paper because I find the original to be less effective on large
* JPEGs.
*
* Copyright 2010 Julian Squires <julian@cipht.net>
*/
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <jpeglib.h>
static int min_dct_value = 1; /* -d= */
static float max_histogram_value = 0.005; /* -h= */
static float weights[] = { /* diagonal weighting */
8,7,6,5,4,3,2,1,
1,8,7,6,5,4,3,2,
2,1,8,7,6,5,4,3,
3,2,1,8,7,6,5,4,
4,3,2,1,8,7,6,5,
5,4,3,2,1,8,7,6,
6,5,4,3,2,1,8,7,
7,6,5,4,3,2,1,8
};
static float total_weight = 344;
static inline void update_histogram(JCOEF *block, int *histogram)
{
    for(int k = 0; k < DCTSIZE2; k++, block++)
        if(abs(*block) > min_dct_value) histogram[k]++;
}

static float compute_blur(int *histogram)
{
    float blur = 0.0;
    for(int k = 0; k < DCTSIZE2; k++)
        if(histogram[k] < max_histogram_value*histogram[0])
            blur += weights[k];
    blur /= total_weight;
    return blur;
}

static int operate_on_image(char *path)
{
    struct jpeg_error_mgr jerr;
    struct jpeg_decompress_struct cinfo;
    jvirt_barray_ptr *coeffp;
    JBLOCKARRAY cs;
    FILE *in;
    int histogram[DCTSIZE2] = {0};

    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);
    if((in = fopen(path, "rb")) == NULL) {
        fprintf(stderr, "%s: Couldn't open.\n", path);
        jpeg_destroy_decompress(&cinfo);
        return 0;
    }
    jpeg_stdio_src(&cinfo, in);
    jpeg_read_header(&cinfo, TRUE);
    // XXX might be a little faster if we ask for grayscale
    coeffp = jpeg_read_coefficients(&cinfo);

    /* Note: only looking at the luma; assuming it's the first component. */
    for(int i = 0; i < cinfo.comp_info[0].height_in_blocks; i++) {
        cs = cinfo.mem->access_virt_barray((j_common_ptr)&cinfo, coeffp[0], i, 1, FALSE);
        for(int j = 0; j < cinfo.comp_info[0].width_in_blocks; j++)
            update_histogram(cs[0][j], histogram);
    }

    printf("%f\n", compute_blur(histogram));

    // output metadata XXX should be in IPTC etc
    // XXX also need to destroy coeffp?
    jpeg_destroy_decompress(&cinfo);
    return 0;
}

int main(int argc, char **argv)
{
    int status, i;

    for(status = 0, i = 1; i < argc; i++) {
        if(argv[i][0] == '-') {
            if(argv[i][1] == 'd')
                sscanf(argv[i], "-d=%d", &min_dct_value);
            else if(argv[i][1] == 'h')
                sscanf(argv[i], "-h=%f", &max_histogram_value);
            continue;
        }
        status |= operate_on_image(argv[i]);
    }
    return status;
}
Compile the code:
gcc -std=c99 blur_detection.c -l jpeg -o blur-detection
Run the code:
./blur-detection <image path>
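If you would rather stay in Python than bind libjpeg, the same paper's idea can be approximated with OpenCV by recomputing the 8x8 DCTs yourself. Note this is only a sketch: it works on decoded pixels rather than the JPEG's quantized coefficients, so the constants (copied from the C snippet above) will almost certainly need retuning:
import cv2
import numpy as np

MIN_DCT_VALUE = 1.0
MAX_HISTOGRAM_VALUE = 0.005
# same diagonal weighting as in the C snippet above
WEIGHTS = np.array([
    [8, 7, 6, 5, 4, 3, 2, 1],
    [1, 8, 7, 6, 5, 4, 3, 2],
    [2, 1, 8, 7, 6, 5, 4, 3],
    [3, 2, 1, 8, 7, 6, 5, 4],
    [4, 3, 2, 1, 8, 7, 6, 5],
    [5, 4, 3, 2, 1, 8, 7, 6],
    [6, 5, 4, 3, 2, 1, 8, 7],
    [7, 6, 5, 4, 3, 2, 1, 8],
], dtype=np.float32).ravel()

def dct_blur(gray):
    # Returns a score in [0, 1]; higher means more high-frequency bins are
    # (almost) empty across the image's 8x8 blocks, i.e. a blurrier image.
    h, w = (gray.shape[0] // 8) * 8, (gray.shape[1] // 8) * 8
    gray = gray[:h, :w].astype(np.float32)
    hist = np.zeros(64)
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            block = cv2.dct(gray[y:y + 8, x:x + 8])
            hist += (np.abs(block) > MIN_DCT_VALUE).ravel()
    return WEIGHTS[hist < MAX_HISTOGRAM_VALUE * hist[0]].sum() / WEIGHTS.sum()

# usage:
# gray = cv2.cvtColor(cropped_face, cv2.COLOR_BGR2GRAY)
# print(dct_blur(gray))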
Related
I created an image similarity model with Turi Create and tested it with the reference data images. The turicreate model returned zero distances for the reference data images, and so did the Core ML model when using this code:
image = tc.image_analysis.resize(reference_data[0]['image'], *reversed(model.input_image_shape))
image = PIL.Image.fromarray(image.pixel_data)
mlmodel.predict({'image': image})
However, when using the model in iOS as a VNCoreMLModel, no reference image test came back with a zero distance, and most of them didn't even have the shortest distance to their own id, e.g. reference image 0 had its shortest distance to reference id 78.
Since the coreml model works in python, I figured it was a preprocessing issue, so I preprocessed the image myself before passing it to the CoreMLModel. Doing this gave me a consistent output of the reference ids matching the reference images for the shortest distance--yay. The distance still isn't zero, so I have attempted to do whatever I can think of to affect the image to get some difference, but I can't get it any closer to zero.
Preprocessing code:
+ (CVPixelBufferRef)pixelBufferForImage:(UIImage *)image sideLength:(CGFloat)sideLength {
    UIGraphicsBeginImageContextWithOptions(CGSizeMake(sideLength, sideLength), YES, image.scale);
    [image drawInRect:CGRectMake(0, 0, sideLength, sideLength)];
    UIImage *resizedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    CFStringRef keys[2] = {kCVPixelBufferCGImageCompatibilityKey, kCVPixelBufferCGBitmapContextCompatibilityKey};
    CFBooleanRef values[2] = {kCFBooleanTrue, kCFBooleanTrue};
    CFDictionaryRef attrs = CFDictionaryCreate(kCFAllocatorDefault, (const void **)keys, (const void **)values, 2, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);

    CVPixelBufferRef buffer;
    int status = CVPixelBufferCreate(kCFAllocatorDefault, (int)(sideLength), (int)(sideLength), kCVPixelFormatType_32ARGB, attrs, &buffer);
    if (status != kCVReturnSuccess) {
        return nil;
    }

    CVPixelBufferLockBaseAddress(buffer, kCVPixelBufferLock_ReadOnly);
    void *data = CVPixelBufferGetBaseAddress(buffer);

    CGColorSpaceRef colorSpace = CGColorSpaceCreateWithName(kCGColorSpaceSRGB);
    CGContextRef context = CGBitmapContextCreate(data, sideLength, sideLength, 8, CVPixelBufferGetBytesPerRow(buffer), colorSpace, kCGImageAlphaNoneSkipFirst);
    CGContextTranslateCTM(context, 0, sideLength);
    CGContextScaleCTM(context, 1.0, -1.0);

    UIGraphicsPushContext(context);
    [resizedImage drawInRect:CGRectMake(0, 0, sideLength, sideLength)];
    UIGraphicsPopContext();

    CVPixelBufferUnlockBaseAddress(buffer, kCVPixelBufferLock_ReadOnly);
    return buffer;
}
The mlmodel takes an RGB image with size: (224, 224)
What else can I do to the image to improve my results?
I was in the same boat as you. Since image preprocessing involves blurring, conversion from RGB to gray and other steps, it would be easier to use an Objective-C++ wrapper. The link below gives a good understanding of how it can be linked using header classes.
https://www.timpoulsen.com/2019/using-opencv-in-an-ios-app.html
Hope it helps!
Image credits: https://medium.com/@borisohayon/ios-opencv-and-swift-1ee3e3a5735b
I am studying C++ Boost.Python, but I have a problem with converting an image data type.
I receive a depth image from an Intel RealSense SR300 camera.
The depth image's data type is cv::Mat (the image format of OpenCV in C++).
I want to feed this cv::Mat image into a tf.placeholder from C++ via Boost.Python.
How can I do that? Please...
The following code could work. I am not sure this is the best approach, but this is how I managed to get TensorFlow working for me.
// "resized" is assumed to be a float (CV_32F) cv::Mat already resized to the network input size
void* object = (void*)resized.ptr();

int64_t dims[] = { batch_size, objectHeight, objectWidth, objectChannels }; // set dims
int nDims = 4;
size_t data_size = batch_size * objectHeight * objectWidth * objectChannels * sizeof(float);

// C API
TF_Tensor* tftensor = TF_NewTensor(TF_DataType::TF_FLOAT, dims, nDims, object, data_size, &deallocator, 0);

// C++ API
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({ batch_size, objectHeight, objectWidth, objectChannels }));
int num_elements = batch_size * objectHeight * objectWidth * objectChannels;  // count of floats, not bytes
std::copy_n((float*)object, num_elements, (input_tensor.flat<float>()).data());
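If the Boost.Python layer is only there to hand the frame over to a Python TensorFlow graph, another option (just a sketch, not the approach above) is to expose the cv::Mat to Python as a NumPy array in the binding (e.g. via boost::python::numpy or pybind11) and feed it through feed_dict. Here the binding is replaced by a dummy array:
import numpy as np
import tensorflow as tf  # TF 1.x graph/session API, as implied by tf.placeholder

# In practice this array would come from the Boost.Python binding that wraps the cv::Mat.
depth = np.random.randint(0, 65535, size=(480, 640), dtype=np.uint16)  # dummy SR300-like frame

batch = depth.astype(np.float32)[np.newaxis, :, :, np.newaxis]  # shape (1, H, W, 1)

x = tf.placeholder(tf.float32, shape=(None, None, None, 1))
y = tf.identity(x)  # stand-in for the real network

with tf.Session() as sess:
    out = sess.run(y, feed_dict={x: batch})
    print(out.shape)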
I am trying to time HoughCircles in Python and C++ to see if C++ gives an edge in processing time (intuitively it should!).
Versions
python: 3.6.4
gcc compiler: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
cmake : 3.5.1
opencv : 3.4.1
I actually installed OpenCV using Anaconda. Surprisingly, the C++ version also worked.
The image I am using is given here:
Python code
import cv2
import time
import sys

def hough_transform(src, dp, minDist, param1=100, param2=100, minRadius=0, maxRadius=0):
    gray = cv2.cvtColor(src, cv2.COLOR_RGB2GRAY)
    start_time = time.time()
    circles = cv2.HoughCircles(gray,
                               cv2.HOUGH_GRADIENT,
                               dp=dp,
                               minDist=minDist,
                               param1=param1,
                               param2=param2,
                               minRadius=minRadius,
                               maxRadius=maxRadius)
    end_time = time.time()
    print("Time taken for hough circle transform is : {}".format(end_time - start_time))
    # if circles is not None:
    #     circles = circles.reshape(circles.shape[1], circles.shape[2])
    # else:
    #     raise ValueError("ERROR!!!!!! circle not detected try tweaking the parameters or the min and max radius")
    #
    # a = input("enter 1 to visualize")
    # if int(a) == 1:
    #     for circle in circles:
    #         center = (circle[0], circle[1])
    #         radius = circle[2]
    #         cv2.circle(src, center, radius, (255, 0, 0), 5)
    #
    #     cv2.namedWindow("Hough circle", cv2.WINDOW_NORMAL)
    #     cv2.imshow("Hough circle", src)
    #     cv2.waitKey(0)
    #     cv2.destroyAllWindows()
    #
    #
    return

if __name__ == "__main__":
    if len(sys.argv) != 2:
        raise ValueError("usage: python hough_circle.py <path to image>")
    image = cv2.imread(sys.argv[1])
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    hough_transform(image, 1.7, 100, 50, 30, 690, 700)
C++ code
#include <iostream>
#include <opencv2/opencv.hpp>
#include <ctime>

using namespace std;
using namespace cv;

void hough_transform(Mat src, double dp, double minDist, double param1=100, double param2=100, int minRadius=0, int maxRadius=0)
{
    Mat gray;
    cvtColor(src, gray, COLOR_RGB2GRAY);
    vector<Vec3f> circles;
    int start_time = clock();
    HoughCircles(gray, circles, HOUGH_GRADIENT, dp, minDist, param1, param2, minRadius, maxRadius);
    int end_time = clock();
    cout << "Time taken hough circle transform: " << (end_time - start_time) / double(CLOCKS_PER_SEC) << endl;
    // cout << "Enter 1 to visualize the image";
    // int vis;
    // cin >> vis;
    // if (vis == 1)
    // {
    //     for( size_t i = 0; i < circles.size(); i++ )
    //     {
    //         Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
    //         int radius = cvRound(circles[i][2]);
    //         circle( src, center, radius, Scalar(255,0,0), 5);
    //     }
    //     namedWindow( "Hough Circle", WINDOW_NORMAL);
    //     imshow( "Hough Circle", src);
    //     waitKey(0);
    //     destroyAllWindows();
    // }
    return;
}

int main(int argc, char** argv)
{
    if( argc != 2 ){
        cout << "Usage hough_circle <path to image.jpg>";
        return -1;
    }
    Mat image;
    image = imread(argv[1]);
    cvtColor(image, image, COLOR_BGR2RGB);
    hough_transform(image, 1.7, 100, 50, 30, 690, 700);
    return 0;
}
I was hoping for the C++ Hough transform to outperform Python, but what happened was actually the opposite.
Python result:
C++ result:
Even though C++ ran the complete program ~2x faster, it is much slower in the Hough transform itself. Why is that? This is very counter-intuitive. What am I missing here?
I wouldn't expect any difference between the two at all, to be honest. The Python library is more than likely a wrapper around the C++ library, meaning that once they get into the core of OpenCV they will have identical performance if compiled with the same optimisation flags.
The only slight slowdown I'd EXPECT is Python getting to that point, and with so little Python code actually there, the difference is unlikely to be measurable. The fact that you're seeing it the other way around doesn't prove anything, I think, as you're performing a single test and getting a difference of 0.2 s, which could trivially be the difference in just the hard disk seeking to the file to process.
I was actually comparing two different times, namely wall time and CPU time.
On Linux, C++'s clock() gives CPU time, while on Windows it gives wall time. So when I changed my Python code to time.clock(), both gave the same results.
As explained by @UKMonkey, the time to compute the Hough transform in Python and C++ did not differ at all. But running the entire program in C++ was almost 2.5 times faster (looped 100 times). Hands down to C++ :P
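For anyone who wants to see the wall-vs-CPU distinction from Python directly, here is a small sketch (note that time.clock() is deprecated since Python 3.3 and removed in 3.8; time.process_time() is the CPU-time replacement):
import time

start_wall = time.perf_counter()  # wall-clock time
start_cpu = time.process_time()   # CPU time of this process, like C's clock() on Linux

total = sum(i * i for i in range(10**6))  # CPU-bound work: advances both clocks
time.sleep(1.0)                           # sleeping burns wall time but almost no CPU time

print("wall: {:.3f}s".format(time.perf_counter() - start_wall))  # roughly 1 s plus the loop
print("cpu : {:.3f}s".format(time.process_time() - start_cpu))   # well under 1 s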
I am using QL in Python and have translated parts of the example file
http://quantlib.org/reference/_fitted_bond_curve_8cpp-example.html#_a25;
which shows how to fit a Nelson-Siegel yield curve to a set of given calibration bonds.
As usual when performing such a non-linear fit, the results depend strongly
on the initial conditions and many (economically meaningless) minima of the
objective function exist. This is why putting constraints on the parameters
is essential for success. To give an example, at times I get negative
tau/lambda parameters and my yield curve diverges.
I did not find how these parameter constraints can be specified in
the NelsonSiegelFitting or the FittedBondDiscountCurve classes. I could
imagine that anyone performing NS fitting in QL will encounter the same
issue.
Thanks to Andres Hernandez for the answer:
Currently it is not possible. However, it is very easy to extend QL to allow it, but I think it needs to be done on the C++ side. So even though you are using QL in Python, can you modify the C++ code and export a new binding? If yes, then you can use the following code; if not, I could check it into the code, but it would take some time for the pull request to be accepted. In case you can touch the code, you can add something like this:
in nonlinearfittingmethods.hpp:
class NelsonSiegelConstrainedFitting
    : public FittedBondDiscountCurve::FittingMethod {
  public:
    NelsonSiegelConstrainedFitting(const Array& lower, const Array& upper,
                                   const Array& weights = Array(),
                                   boost::shared_ptr<OptimizationMethod> optimizationMethod
                                       = boost::shared_ptr<OptimizationMethod>());
    std::auto_ptr<FittedBondDiscountCurve::FittingMethod> clone() const;
  private:
    Size size() const;
    DiscountFactor discountFunction(const Array& x, Time t) const;
    Array lower_, upper_;
};
in nonlinearfittingmethods.cpp:
NelsonSiegelConstrainedFitting::NelsonSiegelConstrainedFitting(
    const Array& lower, const Array& upper, const Array& weights,
    boost::shared_ptr<OptimizationMethod> optimizationMethod)
: FittedBondDiscountCurve::FittingMethod(true, weights, optimizationMethod),
  lower_(lower), upper_(upper) {
    QL_REQUIRE(lower_.size() == 4, "Lower constraint must have 4 elements");
    QL_REQUIRE(upper_.size() == 4, "Upper constraint must have 4 elements");
}

std::auto_ptr<FittedBondDiscountCurve::FittingMethod>
NelsonSiegelConstrainedFitting::clone() const {
    return std::auto_ptr<FittedBondDiscountCurve::FittingMethod>(
        new NelsonSiegelConstrainedFitting(*this));
}

Size NelsonSiegelConstrainedFitting::size() const {
    return 4;
}

DiscountFactor NelsonSiegelConstrainedFitting::discountFunction(const Array& x,
                                                                Time t) const {
    /// extreme values of kappa result in collinear behaviour of x[1] and x[2], so it should
    /// be constrained not only to be positive, but also not very extreme
    Real kappa = lower_[3] + upper_[3]/(1.0 + exp(-x[3]));
    Real x0 = lower_[0] + upper_[0]/(1.0 + exp(-x[0])),
         x1 = lower_[1] + upper_[1]/(1.0 + exp(-x[1])),
         x2 = lower_[2] + upper_[2]/(1.0 + exp(-x[2]));
    Real zeroRate = x0 + (x1 + x2)*
                        (1.0 - std::exp(-kappa*t))/
                        ((kappa + QL_EPSILON)*(t + QL_EPSILON)) -
                    x2*std::exp(-kappa*t);
    DiscountFactor d = std::exp(-zeroRate * t);
    return d;
}
You then need to add it to the swig interface, but it should be trivial to do so.
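Once it is exported through SWIG, the Python side would look much like the stock NelsonSiegelFitting. The sketch below is hypothetical: the binding name and exact constructor depend on how you write the .i file, the bound values are purely illustrative, and helpers is assumed to be the list of ql.BondHelper objects you already build for the calibration bonds:
import QuantLib as ql

# bounds for the four parameters used in discountFunction above (illustrative values)
lower = ql.Array(4)
upper = ql.Array(4)
for i, (lo, up) in enumerate([(0.0, 0.15), (-0.15, 0.30), (-0.30, 0.60), (0.01, 2.0)]):
    lower[i] = lo
    upper[i] = up

fitting = ql.NelsonSiegelConstrainedFitting(lower, upper)  # hypothetical new binding

curve = ql.FittedBondDiscountCurve(2, ql.TARGET(), helpers,      # helpers: your ql.BondHelper list
                                   ql.Actual365Fixed(), fitting)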
What is the simplest way to build an object detector in C++ with Fast/Faster R-CNN and Caffe?
As is known, we can use the following R-CNN (Region-based Convolutional Neural Networks) variants with Caffe:
RCNN: https://github.com/BVLC/caffe/blob/be163be0ea5befada208dbf0db29e6fa5811dc86/python/caffe/detector.py#L174
Fast RCNN: https://github.com/rbgirshick/fast-rcnn/blob/master/tools/demo.py#L89
scores, boxes = im_detect(net, im, obj_proposals), which calls def im_detect(net, im, boxes):
this uses rbgirshick/caffe-fast-rcnn, ROIPooling layers and the bbox_pred output
Faster RCNN: https://github.com/rbgirshick/py-faster-rcnn/blob/master/tools/demo.py#L82
scores, boxes = im_detect(net, im), which calls def im_detect(net, im, boxes=None):
this uses rbgirshick/caffe-fast-rcnn, ROIPooling layers and the bbox_pred output
All of these use Python and Caffe, but how can it be done in C++ with Caffe?
There is only a C++ example for classification (saying what is in the image), but there is none for detection (saying what is in the image and where): https://github.com/BVLC/caffe/tree/master/examples/cpp_classification
Is it enough to simply clone the rbgirshick/py-faster-rcnn repository with rbgirshick/caffe-fast-rcnn, download the pre-trained model via ./data/scripts/fetch_faster_rcnn_models.sh, use this coco/VGG16/faster_rcnn_end2end/test.prototxt and make a small change to the CaffeNet C++ classification example?
And how can I get the output data from the two layers bbox_pred and cls_score?
Will I have both (bbox_pred & cls_score) in one array:
const vector<Blob<float>*>& output_blobs = net_->ForwardPrefilled();
Blob<float>* output_layer = output_blobs[0];
const float* begin = output_layer->cpu_data();
const float* end = begin + output_layer->channels();
std::vector<float> bbox_and_score_array(begin, end);
Or in two arrays?
const vector<Blob<float>*>& output_blobs = net_->ForwardPrefilled();
Blob<float>* bbox_output_layer = output_blobs[0];
const float* begin_b = bbox_output_layer->cpu_data();
const float* end_b = begin_b + bbox_output_layer->channels();
std::vector<float> bbox_array(begin_b, end_b);

Blob<float>* score_output_layer = output_blobs[1];
const float* begin_c = score_output_layer->cpu_data();
const float* end_c = begin_c + score_output_layer->channels();
std::vector<float> score_array(begin_c, end_c);
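For comparison, this is how the Python demos referenced above get at the same two outputs by blob name with pycaffe (just a sketch; the model paths are placeholders and py-faster-rcnn's lib directory is assumed to be on PYTHONPATH):
import cv2
import caffe
from fast_rcnn.test import im_detect  # from py-faster-rcnn/lib

# paths are placeholders for the files fetched by fetch_faster_rcnn_models.sh
net = caffe.Net('test.prototxt', 'VGG16_faster_rcnn_final.caffemodel', caffe.TEST)
im = cv2.imread('image.jpg')

scores, boxes = im_detect(net, im)

# the same data is also available directly on the net, one blob per output layer:
bbox_pred = net.blobs['bbox_pred'].data  # shape (num_proposals, 4 * num_classes)
cls_score = net.blobs['cls_score'].data  # shape (num_proposals, num_classes)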
For those of you who are still looking for it, there is a C++ version of Faster R-CNN with Caffe in this project. You can even find a C++ API to include it in your project. I have successfully tested it.