How to detect a moving object on a moving conveyor using OpenCV - Python

I'm building a grading system for crabs. In this system, the animals (crabs) are placed on a moving conveyor, and I need to identify dead or alive animals by detecting their motion in images captured by a camera over this conveyor.
The conveyor belt is black.
Since the conveyor is always moving, I can't apply methods that assume a stationary background like here. Does anyone have a suggestion for detecting motion of the animals in this case using OpenCV? I can use more than one camera if necessary. Thanks.

Well, the most obvious answer is:
1) Align pictures of the conveyor taken at different times so that they show the same area.
2) Check which crabs have changed pose, i.e. subtract the aligned images: regions (pixels) that differ mean that motion happened there.
If you use tracking instead, you would have to train a classifier to detect the crabs and then compare the crab regions in the same way, but I think that is too complicated for your particular issue.
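A minimal sketch of that subtraction idea with OpenCV, assuming the belt displacement between consecutive frames (in pixels) is known or estimated beforehand; the shift, threshold, and file name below are hypothetical and would need tuning on real footage:

    import cv2
    import numpy as np

    BELT_SHIFT_PX = 12   # hypothetical: belt travel between consecutive frames
    DIFF_THRESHOLD = 25  # hypothetical: min grey-level change that counts as motion

    def motion_mask(prev_frame, curr_frame):
        """Shift the previous frame by the belt displacement, then subtract."""
        h, w = prev_frame.shape[:2]
        M = np.float32([[1, 0, BELT_SHIFT_PX], [0, 1, 0]])  # pure translation
        aligned = cv2.warpAffine(prev_frame, M, (w, h))     # belt content now lines up
        prev_gray = cv2.cvtColor(aligned, cv2.COLOR_BGR2GRAY)
        curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(prev_gray, curr_gray)
        _, mask = cv2.threshold(diff, DIFF_THRESHOLD, 255, cv2.THRESH_BINARY)
        return mask  # nonzero pixels = motion beyond the belt's own movement

    cap = cv2.VideoCapture("conveyor.mp4")  # hypothetical input video
    ok, prev = cap.read()
    while ok:
        ok, curr = cap.read()
        if not ok:
            break
        mask = motion_mask(prev, curr)
        print("moving pixels:", cv2.countNonZero(mask))  # high near a crab => it moved
        prev = curr
    cap.release()

If the belt speed is not constant, cv2.phaseCorrelate between frame pairs is one way to estimate the shift instead of hard-coding it.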

Well, this is an interesting question. While weighing different solutions to the problem, I learned that crabs are ectothermic animals, i.e. they cannot control their body temperature; it simply matches the temperature of their environment. So using a remote thermometer is out of the question. (But I learned something new, thank you for that :] )
A different, but slightly cruel, method would be to take a shot of a crab on the belt, give it a small electric pulse (very low voltage, just enough to make it react, similar to us getting a static discharge), and take another shot of the crab immediately. Compare the two images to see whether the crab moved. If so, it should be alive; if not, RIP crab.
There are downsides to this solution too:
1) I really do not like the idea of giving electric shocks to crabs, even at low voltage. It sounds very cruel to me, and I am not sure it is legally doable where you live either.
2) It adds another step to the process.
3) I have absolutely no idea what amount of voltage would be used in such a system. Would it pose any danger to the employees around the conveyor belt?
[I hope I don't get stoned for suggesting giving electric shocks to crabs here]

Related

Face recognition - updating the reference face encodings daily

How do I update the face recognition reference image encodings daily or weekly to make face recognition more accurate? As time goes by, there will be changes in people's faces. I am thinking of comparing face_distance with a threshold value like 0.40: if the distance is greater than 0.40, I take the new face encodings. Is that right, or is there a better way to do this?
How about:
1) Make two dataset folders.
2) Name the first one face_recognition and the second one face_recognition_updated.
3) Write a function to update the dataset with the new encodings.
Yes, you are right, but you need to be careful about two things:
1) Make sure the first/initial enrolled image (or the corresponding embedding) is of high quality.
2) Make sure there are no false positives: if some day your software finds a false positive (maybe a look-alike face), then it will enroll the wrong person forever.
A better option is to re-enroll the person every 5 years.
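A hedged sketch of that update rule with the face_recognition library, combining the two-folder idea with the 0.40 threshold from the question; the 0.60 cutoff is the library's usual match threshold, and the averaging strategy is just one reasonable choice, not the only one:

    import face_recognition

    MATCH_THRESHOLD = 0.60   # face_recognition's usual "same person" cutoff
    UPDATE_THRESHOLD = 0.40  # update trigger from the question

    def maybe_update_encoding(reference_encoding, new_image_path):
        """Return an updated encoding only when the same person has drifted."""
        image = face_recognition.load_image_file(new_image_path)
        encodings = face_recognition.face_encodings(image)
        if len(encodings) != 1:
            return reference_encoding  # quality guard: need exactly one clear face
        new_encoding = encodings[0]
        distance = face_recognition.face_distance([reference_encoding],
                                                  new_encoding)[0]
        if distance >= MATCH_THRESHOLD:
            return reference_encoding  # possible look-alike: never enroll it
        if distance > UPDATE_THRESHOLD:
            # Same person, drifting from the original enrollment: blend rather
            # than replace, so one bad frame cannot overwrite the reference.
            return (reference_encoding + new_encoding) / 2
        return reference_encoding      # still close to the original: no update

Keeping the original encoding in the face_recognition folder and writing the blended one to face_recognition_updated preserves a trusted baseline to fall back on.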

AWS live human detection from video

Is there any AWS API for detecting a live human in a video? For example, a person can fake the human detection by just showing an image of another person. Is there a way to overcome this?
The recognition should first verify whether there is a real person in the video or not.
If AWS doesn't cover this, is there any other API or Python library that does?
Depending on the expected attack vector(s), it is likely going to be your responsibility to craft a solution comprised of other identity verification building blocks. Amazon Rekognition offers functionality that can be used for these tasks.
Depending on the expected level of nefariousness of your userbase, a single sample data point (image) may not be sufficient to robustly determine whether the subject is a human at all, let alone a specific one. Without additional data points such as depth sensors or thermal imaging, it is hard to definitively determine whether someone is attempting to spoof the system with a mimicked face.
One method for increasing the robustness of such a system is to craft a multi-factor authentication layer with custom semi-random "tests" for users, including other information tests that would not be spoofed by physical attack vectors. A further way to increase robustness of the optical system would be to record video while asking the person to assume a sequence of poses or tasks (cover mouth, hand over right eye, tongue out) that are easy for a real person to do but not an imitation like this.
Amazon Rekognition supports finding faces within an image, as well as matching a test face against faces in a collection, and can also help estimate meta-concepts like emotion (happy, sad, etc.). Rekognition Video in particular can be used to detect changes while the user moves from one pose to the next, in an attempt to auto-detect bad actors.
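For the building-block part, a minimal boto3 sketch of the two Rekognition calls described above; the region, file name, and collection ID are placeholders, and the collection must have been created and populated beforehand:

    import boto3

    rekognition = boto3.client("rekognition", region_name="us-east-1")  # placeholder region

    with open("frame.jpg", "rb") as f:  # placeholder: one extracted video frame
        image_bytes = f.read()

    # 1) Is there a face at all, and what are its attributes (pose, emotions, ...)?
    detected = rekognition.detect_faces(Image={"Bytes": image_bytes},
                                        Attributes=["ALL"])
    for face in detected["FaceDetails"]:
        print(face["Pose"], face["Emotions"][0])

    # 2) Does the face match an enrolled identity in a collection?
    matches = rekognition.search_faces_by_image(
        CollectionId="my-face-collection",  # placeholder, created beforehand
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=90)
    for match in matches.get("FaceMatches", []):
        print(match["Face"]["FaceId"], match["Similarity"])

Comparing the Pose values across frames while the user performs the requested sequence is one way to implement the challenge-response idea; a single still image on its own gives Rekognition no liveness signal.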

How to count objects in videos

In general, is there any "best practice" for using videos as input to deep learning models? How can we annotate video in the most efficient way?
Also, I have some videos of ducks walking through a passage. I want to count the number of grey ducks and the number of yellow ducks passing through the passage. A duck can pass directly through (the easiest case), stay in the passage for a while before passing through, or go halfway through and turn back (in which case it should not be counted).
I plan to use Mask-RCNN to segment the ducks in each frame, then compare the masks from frame i with the masks from frame i+1 and write rules to count the number of distinct ducks that truly pass through.
This does not seem optimal to me.
Any ideas/help/hints?
I guess it depends on the video, but a good option was to:
1) Annotate some not-too-similar frames with: http://www.robots.ox.ac.uk/~vgg/software/via/
2) Use a model like YOLO or Mask-RCNN to find a bounding box over each object and classify it. Optical flow is also an option instead of deep learning, but I decided against it because several possible outcomes made it, from my point of view, less automatic:
- an object that moves, stops, and starts moving again would require special attention;
- objects with one main colour might be split into two pieces (the middle pixels may be seen as not moving);
- a group of objects passing together will probably be seen as one object.
3) Then, using a tracking algorithm, you can give a specific ID to each object and count when it passes a certain line, as in the sketch below.
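A rough sketch of that counting rule on top of per-frame detections, assuming you already extract one (cx, cy) centroid per duck from the Mask-RCNN masks; the line position, matching distance, and greedy nearest-neighbour matching are simplifications:

    import math

    LINE_X = 400         # hypothetical x-coordinate of the virtual exit line
    MAX_MATCH_DIST = 50  # hypothetical max centroid movement between frames

    tracks = {}      # track_id -> last known centroid
    counted = set()  # track ids that already crossed the line
    next_id = 0
    count = 0

    def update(centroids):
        """Greedily match this frame's centroids to existing tracks."""
        global next_id, count
        new_tracks = {}
        for cx, cy in centroids:
            best = min(tracks.items(),
                       key=lambda kv: math.hypot(kv[1][0] - cx, kv[1][1] - cy),
                       default=None)
            if best and math.hypot(best[1][0] - cx, best[1][1] - cy) < MAX_MATCH_DIST:
                tid, (px, py) = best
                # Count only a genuine left-to-right crossing, once per track;
                # a duck that turns back halfway never satisfies this condition.
                if px < LINE_X <= cx and tid not in counted:
                    counted.add(tid)
                    count += 1
            else:
                tid = next_id  # no close track: treat as a new duck
                next_id += 1
            new_tracks[tid] = (cx, cy)
        tracks.clear()
        tracks.update(new_tracks)

A real implementation would use one-to-one assignment (e.g. scipy.optimize.linear_sum_assignment) instead of this greedy matching, and would keep per-class counts by carrying the grey/yellow label alongside each track.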

Matching a Pattern in a Region in Sikuli is very slow

I am automating a computer game using Sikuli as a hobby project, and hopefully to get good enough to write scripts that help me at my job. In a certain small region (20x20 pixels), one of 15 characters will appear. Right now I have these 15 images defined as variables, and then, in an if/elif chain, I call Region.exists(). If one of my images is present in the region, I assign the appropriate value to a variable.
I do this for two areas of the screen, and then, based on the combination of characters, the script clicks appropriately.
The problem is that running the 15 if statements takes approximately 10 seconds, and I was hoping to do this recognition in closer to 1 second.
These are just text characters, but the OCR feature was not reading them reliably, and I want close to 100% accuracy.
Is this an appropriate approach? Is there a better way you can recommend? I haven't done much coding in the last 3 years, so I am wondering whether OCR has improved and whether Sikuli is still a relevant program. Since this is just a hobby project, I am hoping to stick to free solutions.
Sikuli operates by scanning the screen, or a part of it, and attempting to match a set pattern. Naturally, the smaller the pattern is, the more time it takes to match. There are a few ways to improve the detection time, illustrated in the sketch after this list:
1) Region and Pattern manipulation (bound the region size)
2) Function settings (reduce the minimum wait time)
3) Configuration (amend the scan rate)
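As a concrete illustration of those three knobs in a Sikuli script (the coordinates, image names, and similarity value are placeholders):

    # 1) Bound the search to the exact 20x20 area instead of the whole screen
    charRegion = Region(100, 200, 20, 20)  # placeholder coordinates

    # 3) Raise the scan rate so each observation polls more often (default is 3)
    Settings.WaitScanRate = 10

    chars = {"A": Pattern("charA.png").similar(0.9),   # placeholder images,
             "B": Pattern("charB.png").similar(0.9)}   # one entry per character

    def readChar(region):
        for name, pattern in chars.items():
            # 2) exists() normally waits up to 3 s before giving up;
            #    a 0 timeout checks exactly once and returns immediately
            if region.exists(pattern, 0):
                return name
        return None

    print(readChar(charRegion))

The zero timeout is usually the big win: with the default, every non-matching exists() call blocks for up to three seconds, which is how a chain of failed checks stretches to many seconds.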
I have described the issue in some more detail here.
OCR is still quite unreliable. There are ways to improve it, but if you only have a limited set of characters, I reckon you will be better off using them as patterns. It will be quicker and more reliable.
As for Sikuli itself, the tool is under active development and is still relevant if it helps you solve your problem.

Count the number of people in the video

I am working on an image processing and computer vision project: counting the number of people entering a conference room. This needs to be done in OpenCV with Python.
I have already tried the Haar cascade available in OpenCV for the upper body: Detect upper body portion using OpenCV
However, it does not address the requirement. The link to the videos is as follows:
https://drive.google.com/open?id=0B3LatSCwKo2benZyVXhKLXV6R0U
If you view the sample1 file, at 0:16 a person enters the room; that is always the way people appear. The camera is on top of the door.
Identifying People from this Aerial Video Stream
I think there is a simple way of approaching this problem. Background subtraction methods for detecting moving objects are just what you need because the video you provided seems to only have one moving object at any point: the person walking through the door. Thus, if you follow this tutorial in Python, you should be able to implement a satisfying solution for your problem.
Counting People Entering / Exiting
Now, the first question that pops to mind is: what do I do to keep counting when multiple people walk through the door at separate time intervals (one person walks in 10 seconds into the video and a second person walks in 20 seconds in)? Here's the simplest solution to this consideration that I can think of. Once you've detected a blob via background subtraction, you only have to track it until it leaves the frame. Once it has left, the next blob you detect must be a new person entering the room, so you can keep counting. If you aren't familiar with tracking objects once they have been detected, give this tutorial a read. In this manner, you avoid counting the same blob (i.e., the same person) multiple times.
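A compact sketch of the subtraction-plus-counting idea in OpenCV, following the rule above that the frame must empty out before the next blob counts as a new person; the file name and area threshold are placeholders:

    import cv2

    MIN_BLOB_AREA = 2000  # placeholder: min contour area that counts as a person

    cap = cv2.VideoCapture("sample1.mp4")  # placeholder: video from the question
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

    count = 0
    person_in_frame = False

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        # MOG2 marks shadows as 127; keep only confident foreground (255)
        _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        blob_present = any(cv2.contourArea(c) > MIN_BLOB_AREA for c in contours)
        if blob_present and not person_in_frame:
            count += 1               # a new person has entered the view
            person_in_frame = True
        elif not blob_present:
            person_in_frame = False  # frame is empty again; the next blob is new

    cap.release()
    print("people counted:", count)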
The Difficulties in Processing Complex Dynamic Environments
If you think that there is a high level of traffic through that doorway, then the problem becomes much more difficult. This is because in that case there may not be much stationary background to subtract at any given moment, and further there may be a lot of overlap between detected blobs. There is a lot of active research in the area of autonomous pedestrian tracking and identification - so, in short, it's a difficult question that doesn't have a straightforward easy-to-implement solution. However, if you're interested in reading about some of the potential approaches you could take to solving these more challenging problems in pedestrian detection from an aerial view, I'd recommend reading the answers to this question.
I hope this helps, good luck coding!
