Is it possible to do face recognition with MediaPipe in Python?

I am trying to build a Python app that can recognize faces. I've been using cv2 + dlib and the face_recognition module, but I have two problems:
a 3-4 second delay
low accuracy
That's why I decided to try another library. After a lot of searching I found MediaPipe, which is very fast (real time), and I found an example for face detection, but I need face recognition, and I could not find any example of face recognition with it.
Is there a solution?

MediaPipe doesn't provide a face recognition method, only a face detector.
The face_recognition library has really good accuracy; its claimed accuracy is 99%+. Your dataset probably isn't good enough.
Solutions:
For better speed, use the "hog" model instead of the "cnn" model. You can set it when you call the face_locations method, as in the following line (a fuller sketch follows after these suggestions).
locations = face_recognition.face_locations(frame, model="hog")
For accuracy, use better dataset images: higher quality, a face looking straight at the camera, and more pictures of the same person (though 1-3 pictures per person is usually enough).
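As a minimal sketch of how these suggestions fit together in a live loop (the webcam capture, the "person.jpg" reference image, and the drawing code are illustrative assumptions, not from the original answer):

import cv2
import face_recognition

# Encode one known face once, up front
known_image = face_recognition.load_image_file("person.jpg")
known_encoding = face_recognition.face_encodings(known_image)[0]

video = cv2.VideoCapture(0)
while True:
    ret, frame = video.read()
    if not ret:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # "hog" runs much faster than "cnn" on a CPU
    locations = face_recognition.face_locations(rgb, model="hog")
    encodings = face_recognition.face_encodings(rgb, locations)
    for (top, right, bottom, left), encoding in zip(locations, encodings):
        match = face_recognition.compare_faces([known_encoding], encoding)[0]
        color = (0, 255, 0) if match else (0, 0, 255)
        cv2.rectangle(frame, (left, top), (right, bottom), color, 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
video.release()
cv2.destroyAllWindows()

Processing every other frame, or downscaling each frame before encoding, is a common further speedup if the delay is still noticeable.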

If you want to increase the accuracy of face recognition, reduce the tolerance value to 0.4 or 0.5, and for face detection use the hog model.
face_recognition.api.compare_faces(known_face_encodings,face_encoding_to_check, tolerance=0.5)
https://face-recognition.readthedocs.io/en/latest/face_recognition.html
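A short, self-contained sketch of where the tolerance parameter fits (the image file names are illustrative); face_distance is also useful when choosing a tolerance, since it returns the raw distances:

import face_recognition

# One reference encoding per known person
known_face_encodings = [
    face_recognition.face_encodings(
        face_recognition.load_image_file(path))[0]
    for path in ("alice.jpg", "bob.jpg")
]

unknown = face_recognition.load_image_file("unknown.jpg")
face_encoding_to_check = face_recognition.face_encodings(unknown)[0]

# Lower tolerance = stricter matching (the default is 0.6)
results = face_recognition.compare_faces(
    known_face_encodings, face_encoding_to_check, tolerance=0.5)

# Raw distances: lower means more similar
distances = face_recognition.face_distance(
    known_face_encodings, face_encoding_to_check)
best = distances.argmin()
if results[best]:
    print(f"Best match: person #{best} (distance {distances[best]:.2f})")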

Related

How to get a human face feature's size in Python using face recognition?

I have a face recognition model in Python which detects a face and distinguishes features like the eyes, nose, and mouth. I want to filter the facial features (face parts) by their sizes, so that I can easily find people with, say, big eyes or small eyes without displaying the whole dataset.
Can anyone suggest what I should use to measure facial feature sizes?
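Not a full answer, but one possible sketch: the face_recognition library's face_landmarks function returns named point lists per facial part, so a feature's size can be approximated from the spread of its points (the file name and the width/height metric are illustrative assumptions):

import face_recognition

image = face_recognition.load_image_file("person.jpg")
for landmarks in face_recognition.face_landmarks(image):
    # Each value is a list of (x, y) points outlining that feature
    for feature in ("left_eye", "right_eye", "nose_tip", "top_lip"):
        xs, ys = zip(*landmarks[feature])
        print(f"{feature}: {max(xs) - min(xs)}x{max(ys) - min(ys)} px")

To compare sizes across people fairly, you would want to normalize these pixel measurements by the overall face size, since faces closer to the camera appear larger.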

Neural Network visualization for detection task

I'm currently working with SqueezeDet for detection purposes. I trained the network on synthetic data and it performs reasonably well.
For my project I would like to be able to visualize which parts of the input were most relevant for the detection process. So in the case of a detected pedestrian, I'd assume that its pixels would be more important than, for example, the surroundings. I tried a couple of different methods, but none of them is fully satisfactory.
I did my own research and couldn't really find any papers that discuss visualization for object detection. So I implemented VisualBackProp; the results, however, don't look all that promising. If instead I compute the relevance, things look slightly better, but still not as expected.
I started thinking that perhaps the issue might be related to the complexity of my outputs, compared with a network that only deals with classification or, as in the VisualBackProp paper, just the prediction of a steering angle.
I was wondering if anyone has an idea of what visualization technique might best suit the detection task.
You could try just augmenting different areas of the image and see how it affects the detection confidence. For example, you could put the area containing the pedestrian on just a black background instead of the natural background to see how much the surroundings actually affect things. You could also add moderate to severe noise to select areas of the image and observe which areas correspond to the biggest change in detection confidence.
More directly, mathematically you seem to be interested in the gradient of detection confidence WRT pixel data. Depending on what deep learning platform you are using, if you run a single training iteration you may be able to obtain the gradients in the data layer (dL/dx) which will directly show these. This will only represent the effect of small changes to the pixel data - if you are aiming for more macroscopic insights than that, I think my first suggestion is probably your only option.
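A rough sketch of the first suggestion (occlusion sensitivity): slide a gray patch across the image, rerun inference, and record how much the confidence of the detection of interest drops. detect_confidence below is a placeholder for whatever your SqueezeDet pipeline exposes, not a real API:

import numpy as np

def occlusion_heatmap(image, detect_confidence, patch=32, stride=16):
    # detect_confidence: callable taking an HxWxC uint8 image and
    # returning a scalar confidence for the detection of interest
    base = detect_confidence(image)
    h, w = image.shape[:2]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heat = np.zeros((rows, cols))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = 127  # gray patch
            # Big drop = region the detector relies on
            heat[i, j] = base - detect_confidence(occluded)
    return heat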

Computer Vision: How to determine what percentage of an image contains a specific texture?

I am building an app to track the progress of deforestation. Over time I would like to take satellite images of a location and see what percentage of each image contains forest.
I have tried Google's Vision API, but it does not have this functionality.
Is this something that can be done in OpenCV or must I do this from scratch with semantic segmentation or something similar?
From what I can see in the documentation, there doesn't seem to be any pattern/texture recognition in the API. My suggestion is to try dominant-color analysis: if your image data has enough differentiable colors, I think you should be able to get an acceptable analysis (a sketch follows below).
P.S.: Having some experience with satellite imagery processing, I can add that the usual way to assess the status of land for plants, forests, and general crop development and health is through color analysis.
Nonetheless, satellite/drone images are mostly multispectral, and several UV bands are used extensively, since biomass behaves very differently with season, health, and development status under different combinations of visible and UV electromagnetic bands.
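As a minimal sketch of the color-analysis idea using OpenCV: threshold the image in HSV space for green hues and report the fraction of pixels that pass. The hue range below is a rough starting point and would need tuning per image source:

import cv2
import numpy as np

image = cv2.imread("satellite.jpg")  # illustrative path
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Rough "vegetation green" range (OpenCV hue is 0-179); tune per source
mask = cv2.inRange(hsv, np.array([35, 40, 40]), np.array([85, 255, 255]))

forest_pct = 100.0 * np.count_nonzero(mask) / mask.size
print(f"Approximate forest cover: {forest_pct:.1f}%")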
Have you tried to look at the satellite image recognition Kaggle competitions? There are a lot of discussions as well as available scripts for tasks similar to yours:
Links: https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection
Example script: https://www.kaggle.com/arpandhatt/satellite-image-classification

dlib face detection failing to catch

I've been exploring dlib's face detector via its Python API. On most images in my data set it seems to perform slightly better than cv2, so I kept playing around with it in multiple-faces-per-picture scenarios.
Going through dlib's Python examples, it seems it would be possible to train on such images, but I am wondering if anyone has a suggestion for how to make sure the two faces on the far left and far right are detected out of the box.
This is the image on which I am having trouble finding all 6 faces (https://images2.onionstatic.com/onionstudios/6215/original/600.jpg).
Dlib has a very precise face detector, but it is bad at detecting non-frontal faces (like the one on the far left) and/or occluded faces (like the one on the far right).
Seeta (https://github.com/seetaface/SeetaFaceEngine) works better with those. But it's less precise.
I also tried retraining dlib's face detector, and obtained much lower precision than stock dlib and lower recall than Seeta, so retraining dlib does not seem like a good idea.
In my experience, dlib does not do very well out of the box with obscured and profile faces. I would recommend training dlib with more data of this kind.
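One out-of-the-box knob worth trying before retraining: dlib's detector takes an upsampling count as its second argument, which enlarges the image before detection and often catches smaller faces (at a speed cost). It won't fix profile views, but it is a cheap first check:

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
image = cv2.imread("group.jpg")  # illustrative path
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Second argument: how many times to upsample before detecting;
# try 1 or 2 if faces are missed at 0
faces = detector(rgb, 1)
print(f"Found {len(faces)} face(s)")
for rect in faces:
    cv2.rectangle(image, (rect.left(), rect.top()),
                  (rect.right(), rect.bottom()), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", image)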

OCR of low-resolution text from screenshots

I'm writing an OCR application to read characters from a screenshot image. Currently, I'm focusing only on digits. I'm partially basing my approach on this blog post: http://blog.damiles.com/2008/11/basic-ocr-in-opencv/.
I can successfully extract each individual character using some clever thresholding. Where things get a bit tricky is matching the characters. Even with fixed font face and size, there are some variables such as background color and kerning that cause the same digit to appear in slightly different shapes. For example, the below image is segmented into 3 parts:
Top: a target digit that I successfully extracted from a screenshot
Middle: the template: a digit from my training set
Bottom: the error (absolute difference) between the top and middle images
The parts have all been scaled (the distance between the two green horizontal lines represents one pixel).
You can see that despite both the top and middle images clearly representing a 2, the error between them is quite high. This causes false positives when matching other digits -- for example, it's not hard to see how a well-placed 7 can match the target digit in the image above better than the middle image can.
Currently, I'm handling this by having a heap of training images for each digit, and matching the target digit against those images, one-by-one. I tried taking the average image of the training set, but that doesn't resolve the problem (false positives on other digits).
I'm a bit reluctant to perform matching using a shifted template (it'd be essentially the same as what I'm doing now). Is there a better way to compare the two images than simple absolute difference? I was thinking of maybe something like the EMD (earth mover's distance, http://en.wikipedia.org/wiki/Earth_mover's_distance) in 2D: basically, I need a comparison method that isn't as sensitive to global shifting and small local changes (pixels next to a white pixel becoming white, or pixels next to a black pixel becoming black), but is sensitive to global changes (black pixels that are nowhere near white pixels becoming black, and vice versa).
Can anybody suggest a more effective matching method than absolute difference?
I'm doing all this in OpenCV using the C-style Python wrappers (import cv).
I would look into using Haar cascades. I've used them for face detection/head tracking, and it seems like you could build up a pretty good set of cascades with enough '2's, '3's, '4's, and so on.
http://alereimondo.no-ip.org/OpenCV/34
http://en.wikipedia.org/wiki/Haar-like_features
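For reference, applying a trained cascade in OpenCV's Python API looks like this; the digit cascade file is hypothetical, since you would have to train one per digit as suggested above:

import cv2

# Hypothetical cascade trained on images of the digit '2'
cascade = cv2.CascadeClassifier("cascade_digit_2.xml")
gray = cv2.imread("screenshot.png", cv2.IMREAD_GRAYSCALE)

# Returns (x, y, w, h) rectangles for each detected instance
hits = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in hits:
    print(f"possible '2' at ({x}, {y}), size {w}x{h}")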
OCR on noisy images is not easy, so simple approaches do not work well.
So I would recommend using HOG to extract features and an SVM to classify. HOG seems to be one of the most powerful ways to describe shapes.
The whole processing pipeline is implemented in OpenCV; however, I do not know the function names in the Python wrappers. You should be able to train with the latest haartraining.cpp; it actually supports more than Haar features now, HOG and LBP as well.
And I think the latest code (from trunk) is much improved over the official release (2.3.1).
HOG usually needs just a fraction of the training data used by other recognition methods; however, if you want to classify shapes that are partially occluded (or missing), you should make sure to include some such shapes in training.
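A compact sketch of the HOG + SVM idea using the modern cv2 API and scikit-learn (rather than the old C-style wrappers from the question); the window and cell sizes are illustrative and should match your glyph size, and scikit-learn's built-in 8x8 digits stand in for real screenshot crops:

import cv2
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# winSize, blockSize, blockStride, cellSize, nbins, sized for 20x20 crops
hog = cv2.HOGDescriptor((20, 20), (10, 10), (5, 5), (5, 5), 9)

def features(img):
    # Upscale a grayscale digit image and compute its HOG descriptor
    img = cv2.resize(img.astype(np.uint8), (20, 20))
    return hog.compute(img).ravel()

digits = datasets.load_digits()  # stand-in data, pixel values 0-16
X = np.array([features(im * 15) for im in digits.images])
X_tr, X_te, y_tr, y_te = train_test_split(X, digits.target, random_state=0)

clf = SVC(kernel="linear").fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))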
I can tell you from my experience and from reading several papers on character classification that a good way to start is by reading about Principal Component Analysis (PCA), Fisher's Linear Discriminant Analysis (LDA), and Support Vector Machines (SVMs). These are classification methods that are extremely useful for OCR, and it turns out that OpenCV already includes excellent implementations of PCA and SVMs. I haven't seen any OpenCV code examples for OCR, but you can use a modified version of face classification to perform character classification. An excellent resource for face recognition code for OpenCV is this website.
Another Python library I recommend is "scikits.learn" (now known as scikit-learn). It is very easy to send cvArrays to scikit-learn and run machine learning algorithms on your data. A basic example of OCR using an SVM is here.
Another, more complicated example using manifold learning for handwritten character recognition is here.
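A minimal modern version of that suggestion: reduce the raw pixels with PCA and classify with an SVM, here using scikit-learn's built-in handwritten-digit set as stand-in data:

from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

digits = datasets.load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target, random_state=0)

# PCA compresses the raw pixels; the SVM classifies in the reduced space
model = make_pipeline(PCA(n_components=30), SVC(kernel="rbf"))
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))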
