I want to count the number of people going up or down past a reference line, say in the middle of the video. How do I actually implement that using Python and OpenCV? I have seen a lot of videos showing people counters, but none of them explain the method or give instructions on how exactly to do it. I don't need code, please just tell me the method.
By the way, here is something that I tried, but it isn't working:
import cv2
Take a look at the detailed breakdown here, as pointed out in the comments by leaf, but basically you can use OpenCV's built-in methods to perform pedestrian detection. OpenCV ships with a pre-trained HOG + Linear SVM model that can be used to detect pedestrians in both images and video streams.
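For reference, a minimal sketch of that built-in detector might look like this (the video filename, the resize, and the winStride/scale parameters are assumptions, not part of the original answer):

import cv2

# Built-in HOG + Linear SVM pedestrian detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture("people.mp4")  # assumed input file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (640, 360))  # smaller frames detect faster
    # Each rect is (x, y, w, h) for one detected person
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)
    for (x, y, w, h) in rects:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("detections", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()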
To separate the Up and Down counters I would split each frame on the vertical line and run the detection on each half separately. You can count the number of people going in a given direction in a single frame with a simple len(contours) while processing that frame's half for that direction.
To track the total number of people going in a given direction you will need to follow the motion of each contour across the frame and only increment the count when a new contour appears near the entry edge of that direction's half. Of course this can be confused by people sprinting through the frame, moving opposite to the expected direction (running up the down side or vice versa), or entering the frame and then backing out (a rough sketch of the counting part follows below).
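This sketch assumes a binary foreground mask is already available for each frame (from a detector or background subtractor, not shown); the split position and the minimum-area threshold are placeholders:

import cv2

def count_per_half(fg_mask):
    # Split the mask on the vertical line in the middle of the frame
    h, w = fg_mask.shape[:2]
    halves = (fg_mask[:, : w // 2], fg_mask[:, w // 2:])
    counts = []
    for half in halves:
        contours, _ = cv2.findContours(half, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        # Ignore tiny blobs that are unlikely to be people (placeholder threshold)
        people = [c for c in contours if cv2.contourArea(c) > 500]
        counts.append(len(people))
    return counts  # [count in left half, count in right half]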
What am I trying to do?
Read the camera video feed
Detect faces
Save only faces to folder
The problem:
I have done all of the things above, EXCEPT it gives me a bunch of faces of the same person, because it checks every frame it captures and treats the same person as a new face each time.
I want the script to understand that it's the same person and skip it (of course this depends on overall accuracy, but that's okay, as long as it's not giving me 60 files of the same face per second).
So I was thinking of somehow using face_verify within the same library, but I couldn't get it working quickly and decided to ask first instead of wasting time on something that most likely won't work.
Any suggestions? Hopefully I described it well; I also didn't find any duplicates of this question.
Thanks in advance
It depends on what you are using: face-detection or face-recognition.
Face-detection will just detect faces and return bounding boxes for each frame. In this case you can use any simple tracking algorithm (like SORT), which will give a tracking ID to each bounding box in each frame. This way you can ensure you save only 1 face per tracking ID.
In the case of face-recognition, it's even easier, as each detected face has an associated label (similar to a tracking ID in the above approach). So you can simply save face images based on labels (1 image per label).
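In either case the bookkeeping is the same. A minimal sketch, assuming something else (a tracker or a recognizer, not shown here) already gives you a bounding box plus an ID or label per face:

import os
import cv2

saved_ids = set()

def maybe_save_face(frame, bbox, face_id, out_dir="faces"):
    """Save the face crop only the first time a given ID/label is seen."""
    if face_id in saved_ids:
        return
    saved_ids.add(face_id)
    x, y, w, h = bbox
    os.makedirs(out_dir, exist_ok=True)
    cv2.imwrite(os.path.join(out_dir, f"{face_id}.jpg"), frame[y:y + h, x:x + w])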
I've been given a video consisting of frames like this:
and I am supposed to count how many fish passed in front of the screen during the video, using Python. I am not allowed to use OpenCV or any similar library. The only libraries I am allowed to use are PIL and NumPy, so I am forced to use NumPy arrays as the image representation.
So far I am able to generate this one-channel image mask:
which I believe should be enough to detect the fish in one frame. However, now I need to somehow track the fish through the remaining frames, to distinguish the ones that were already on screen from the ones that just moved in.
What would be the procedure for doing that? I'm imagining somehow labelling those areas and then predicting the movement of the fish, or something like that.
I have a camera in a fixed position looking at a target, and I want to detect whether someone walks in front of it. The lighting in the scene can change, so subtracting the new frame from the previous one would detect motion even when none has actually occurred. I have thought of comparing the number of contours between the two frames (obtained by running findContours() on a binary edge image from Canny and taking its size()), since a big change there could denote movement while being less sensitive to lighting changes. I am quite new to OpenCV and my implementations have not been successful so far. Is there a way I could make this work, or will I have to just subtract the frames? I don't need to track the person, just detect whether they are in the scene.
I am a bit rusty but there are various ways to do this.
SIFT and SURF are very expensive operations, so I don't think you would want to use them.
There are several 'background removal' methods:
Average removal: you take the average of N frames and treat it as the background. This is vulnerable to many things: light changes, shadows, a moving object staying in one place for a long time, etc.
Gaussian Mixture Model: a bit more advanced than the previous one. Still vulnerable to a lot of things (a minimal OpenCV sketch of this one is below).
IncPCP (incremental principal component pursuit): I can't remember the algorithm in detail, but the basic idea is to split each frame into a low-rank background and a sparse part, and then extract the moving objects from the sparse matrix.
Optical flow: you find the change across the temporal domain of the video. For example, you compare frame 2 with frame 1 block by block and get the direction of change.
CNN-based methods: I know there are a bunch of them, but I haven't really followed them. You might have to do some research. As far as I know, they are often better than the methods above.
Notice that, for 30 FPS, your code should complete in 33 ms per frame to run in real time. You can find a lot of code available for this task.
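For the Gaussian Mixture Model option mentioned above, a minimal sketch using OpenCV's built-in MOG2 subtractor might look like this (the video path and the motion-pixel threshold are assumptions):

import cv2

cap = cv2.VideoCapture("scene.mp4")  # assumed input file
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)  # 255 = foreground, 127 = shadow
    # Drop shadows, keep only confident foreground pixels
    fg_mask = cv2.threshold(fg_mask, 200, 255, cv2.THRESH_BINARY)[1]
    motion_pixels = cv2.countNonZero(fg_mask)
    if motion_pixels > 5000:  # arbitrary threshold for "someone is in the scene"
        print("Motion detected")
cap.release()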
There are a handful of ways you could do this.
The first that comes to mind is doing a 2D FFT on the incoming images. Color shouldn't affect the FFT too much, but an object moving or entering/exiting the frame will.
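A minimal sketch of that FFT idea, assuming grayscale frames as NumPy arrays (the normalization and the idea of thresholding the returned score are my own additions):

import numpy as np

def spectrum_change(prev_gray, curr_gray):
    # Compare magnitude spectra of two consecutive frames; a large difference
    # suggests something entered or left the scene.
    prev_spec = np.abs(np.fft.fft2(prev_gray.astype(np.float32)))
    curr_spec = np.abs(np.fft.fft2(curr_gray.astype(np.float32)))
    # Normalise so global brightness changes matter less
    prev_spec /= prev_spec.sum()
    curr_spec /= curr_spec.sum()
    return np.abs(curr_spec - prev_spec).sum()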
The second is to use SIFT or SURF to generate a list of features in an image. You can insert these points into a map, sorted however you like, and then do a set_difference between the last image you took and the current one. You could also use the FLANN functionality to compare the generated features.
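A sketch of the feature-based idea using SIFT plus FLANN matching (cv2.SIFT_create needs a reasonably recent OpenCV build; the 0.7 ratio-test threshold is just the usual rule of thumb):

import cv2

def count_good_matches(img1_gray, img2_gray):
    # Few good matches between consecutive frames hints that the content changed.
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1_gray, None)
    kp2, des2 = sift.detectAndCompute(img2_gray, None)
    if des1 is None or des2 is None:
        return 0
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    matches = flann.knnMatch(des1, des2, k=2)
    # Lowe's ratio test to keep only confident matches
    good = 0
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
            good += 1
    return good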
I am trying to write a Python script using OpenCV that will find, track, and output the positions of multiple markers on a person performing an exercise in a video. However, the markers were not properly lit at the time of capture, so they appear the same color as much of the background: an unspectacular, non-reflective grey. This is a problem when it comes to pulling them out. Even after converting the image to HSV, it seems impossible to filter out the surroundings (the subject's clothes, the walls, the ceiling, etc.) without the markers vanishing too. And as far as finding contours goes, there is so much going on in any particular frame that the number of contours found is quite high, and the markers themselves are not necessarily the smallest detected, so I can't just assume min(contours) as many tutorials try to do.
I've tried to isolate the markers using several different methods, mostly involving manipulating the mask/HSV image, but also some others, such as SimpleBlobDetector and finding keypoints. The best method I can think of is using keypoint detection to manually select the points of interest, but even that doesn't always pick up the markers.
I can't share a full-size sample image since it's a person in the videos I'm using, but some notes on the situation:
I can't retake the video to do the markers correctly. This data wasn't originally taken for more than camera alignment, so no one was too concerned about marker illumination. The data sucks, but it's all I have to work with.
Skin is very easy to filter out for the most part, but outlines of the clothes, environment, and skin always remain.
In the image above, the user is holding the exercise bar. There's a marker just under the center of the image, and another further up the arm. The spots towards the right edge are not markers. H(0, 26), S(0, 57), V(0, 255)
Markers really are basically the same color as the wall and ceiling.
TL;DR: I need a way to grab non-reflective markers in a busy environment with as little user input as possible. Data can't simply be re-taken, and methods typical for acquiring motion capture data are not working out here.
I am getting video input from 2 separate cameras with some area of overlap between the output videos. I have tried out some code which combines the video output horizontally. Here is the link for that code:
https://github.com/rajatsaxena/NeuroscienceLab/blob/master/positiontracking/combinevid.py
To explain the problem visually:
The red part shows the overlap region between the two image frames. I need the output to look like the second image, with the first frame in blue and the second frame in green (as shown in the third illustration).
A solution I can think of, but haven't been able to implement, is: using SIFT/SURF, find the maximum-distance keypoints in both frames, then take the first video frame completely, pick only the non-overlapping region from the second video frame, and horizontally combine them to get the stitched output.
Let me know of any other solutions possible as well. Thanks!
I read this post an hour ago and tried a really simple approach. It is not perfect, but in some cases it should work well, for example if you have both cameras placed side by side on one mount.
I took 2 images with a phone, as in the picture (color images). The program selects rectangular regions from both source images, resizes them, and extracts these ROI rectangles. The idea is to find the "best" overlapping Rect regions by normalized correlation.
M1 and M2 are the Mat ROIs to compare:
matchTemplate(M1, M2, res, TM_CCOEFF_NORMED);
After I find the overlapping Rect, I use it to crop the source images and combine them with the hconcat() function.
My code is in C++, but it is really simple to replicate this in Python. It is not the best solution, but it is one of the simplest. If your cameras are fixed in a stable position relative to each other, I think this is a good solution.
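Since the idea is simple to replicate in Python, here is a rough sketch of it; the strip width and the exact cropping are assumptions, and both frames are assumed to have the same height:

import cv2

def stitch_pair(frame1, frame2, strip_width=100):
    # Take a strip from the left edge of the second frame and find where it
    # best matches inside the first frame by normalized correlation.
    template = frame2[:, :strip_width]
    res = cv2.matchTemplate(frame1, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(res)
    x_overlap = max_loc[0]  # where the overlap starts in frame 1
    # Keep frame 1 up to the overlap, then append frame 2 in full
    stitched = cv2.hconcat([frame1[:, :x_overlap], frame2])
    return stitched, max_val

A low max_val suggests the chosen strip doesn't actually overlap, so in practice you would try a few candidate strips and keep the best one, which is the "rectangle candidates" idea described above.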
I held my phone in my hand :)
You can also use this simple approach on video. The speed depends only on the number of rectangle candidates you compare.
You can improve this by being smarter about selecting the regions to compare.
Also, I am thinking about another idea: use optical flow by placing the images captured at the same time into a sequence one after the other. From the possible overlapping regions in one image, extract good features to track and find them in the corresponding region of the second image.
SURF and SIFT are great for this, but this is the simplest idea that comes to mind.
My code is here: Code