Is OpenCV object tracking accurate enough to measure conveyor speed precisely? - python

I am building a conveyor speed tracking system using vision only.
The basic concept of the project is to calculate the conveyor's speed by pointing a camera at the surface of the belt.
[What I Tried]
So far, I have tried using OpenCV's object tracking algorithms to track individual sections of the belt surface, but they don't seem accurate enough to calculate the speed. To smooth out the speed variance, I created many tracker instances and averaged their speeds, but even the average is not consistent. I used MOSSE for its speed; when I tried other algorithms such as KCF or CSRT, the processing time was too slow for real-time tracking. I haven't tried GOTURN yet, as it requires a trained model.
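Roughly, this is the structure of what I have now (a minimal sketch; I'm on opencv-contrib-python, where MOSSE lives under cv2.legacy in OpenCV 4.x, and the patch positions and pixels-per-meter calibration below are placeholders):

```python
import cv2

PIXELS_PER_METER = 1000.0  # hypothetical calibration constant for my camera

cap = cv2.VideoCapture("conveyor.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
ok, frame = cap.read()

# Seed several trackers on distinct patches of the belt surface.
rois = [(100, 200, 60, 60), (300, 200, 60, 60), (500, 200, 60, 60)]
trackers = []
for roi in rois:
    t = cv2.legacy.TrackerMOSSE_create()
    t.init(frame, roi)
    trackers.append([t, roi[0]])  # tracker plus its last known x position

while True:
    ok, frame = cap.read()
    if not ok:
        break
    speeds = []
    for entry in trackers:
        t, prev_x = entry
        found, box = t.update(frame)
        if found:
            dx = box[0] - prev_x                        # pixels moved this frame
            speeds.append(dx * fps / PIXELS_PER_METER)  # meters per second
            entry[1] = box[0]
    if speeds:
        print("average speed: %.3f m/s" % (sum(speeds) / len(speeds)))
```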
Is object tracking in OpenCV simply not accurate enough, or is the problem in my approach?
Also, do you have any suggestions on how to calculate a conveyor's speed precisely using a vision system alone?
Any help is greatly appreciated.

Related

What does "real time" really mean in real-time object detection?

So here is the context.
I created a script in Python using YOLOv4, OpenCV, CUDA and cuDNN for object detection and tracking, to count the objects in a video. I intend to use it in real time, but what does real time really mean? The video I'm using is 1 minute long at 60 FPS originally, but after processing it averages 30 FPS and takes 3 minutes to finish, so comparing both videos side by side, one is clearly faster. 30 FPS is the industry standard for movies and such, and I'm trying to wrap my head around what real time truly means.
Imagine I need to use this information for traffic light management, or to lift a bridge for a passing boat, automatically. It's time sensitive, or the chaos would be visible. In these cases, what does it truly mean to be real time?
First, learn what "real-time" means. Wikipedia: https://en.wikipedia.org/wiki/Real-time_computing
Understand the terms "hard" and "soft" real-time. Understand which aspects of your environment are soft and which require hard real-time.
Understand the response times that your environment requires. Understand the time scales.
This does not involve fuzzy terms like "quick" or "significant" or "accurate". It involves actual quantifiable time spans that depend on your task and its environment, acceptable error rates, and so on.
You did not share any details about your environment. I find it unlikely that you even need 30 fps for any application involving a road intersection.
You only need a frame rate high enough that you don't miss objects of interest, with fine enough data to track multiple objects and maintain their identities without mistaking them for each other.
Example: assume a car moving at 200 km/h. If your camera takes a frame every 1/30 second, the car moves 1.85 meters between frames.
How's your motion blur? What's the camera's exposure time? I'd recommend something on the order of a millisecond or better, giving motion blur of about 0.05 m at that speed.
How's your tracking? Can it deal with objects "jumping" that far between frames? Does it generate object identity information that is usable for matching (association)?
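A quick sanity check of those numbers, as a sketch:

```python
# Back-of-envelope check of the figures in the answer above.
speed_kmh = 200
speed_ms = speed_kmh / 3.6      # ~55.6 m/s
frame_interval = 1 / 30         # seconds between frames at 30 fps
exposure = 1e-3                 # the suggested ~1 ms exposure time

print("movement between frames: %.2f m" % (speed_ms * frame_interval))  # ~1.85 m
print("motion blur per frame:   %.3f m" % (speed_ms * exposure))        # ~0.056 m
```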

Community detection for larger than memory embeddings dataset

I have a dataset of textual embeddings (768 dimensions); the current number of records is ~1 million. I am looking to detect related embeddings through a community detection algorithm. For small data sets, I have been able to use this one:
https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/clustering/fast_clustering.py
It works great, but it doesn't really scale once the data set grows larger than memory.
The key here is that I am able to specify a threshold for community matches. I have found clustering algorithms that scale beyond memory, but they all require a fixed number of clusters ahead of time; I need the system to detect the number of clusters for me.
I'm certain there is a class of algorithms - and hopefully a Python library - that can handle this situation, but I have been unable to locate it. Does anyone know of an algorithm or a solution I could use?
That seems small enough that you could just rent a bigger computer.
Nevertheless, to answer the question: typically the play is to cluster the data into a few chunks (overlapping or not) that fit in memory and then apply a higher-quality in-memory clustering algorithm to each chunk. One typical strategy for cosine similarity is to cluster by SimHashes, but there's a whole literature out there; if you already have a scalable clustering algorithm you like, you can use that.
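A rough sketch of that SimHash chunking, assuming the embeddings are L2-normalized in a NumPy array; the bit count, the stand-in data, and the downstream clustering call are illustrative, not anything from the thread:

```python
import numpy as np

def simhash_buckets(embeddings, n_bits=8, seed=0):
    """Assign each row to one of 2**n_bits buckets via random-hyperplane signs."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((embeddings.shape[1], n_bits))
    bits = (embeddings @ planes > 0).astype(np.int64)  # (N, n_bits) sign pattern
    return bits @ (1 << np.arange(n_bits))             # pack bits into a bucket id

# Stand-in data; in practice you would stream rows from disk in batches.
embeddings = np.random.randn(10_000, 768).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

bucket_ids = simhash_buckets(embeddings)
for b in np.unique(bucket_ids):
    chunk = embeddings[bucket_ids == b]
    # chunk now fits in memory; run the fast_clustering-style community
    # detection (with your similarity threshold) on it here.
```

Note that a pair of near-duplicates can straddle a hyperplane and land in different buckets, which is one reason to consider overlapping chunks (e.g. several independent hash tables).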

Does the quality of the video matter when using Deep SORT and YOLOv3?

I'm using Deep SORT and YOLO to code an object tracker. Everything works fine: objects are detected and tracked by the algorithm. The app has to count vehicles passing through a crossroad.
But I noticed different results when detecting objects on 720p and 480p videos. The tracker seems to work better on the 720p video: almost all vehicles are tracked and keep the same ID throughout the process. With the 480p video, it's not as accurate; vehicle IDs change often, so the count at the end cannot be right.
So I'm wondering whether the video quality has a real impact on Deep SORT, and if it does, how I can modify Deep SORT parameters like iou_distance or matching_threshold to get better results on a 480p video.
So I'm wondering if the video quality has a real impact on deep sort
Your object detection model feeds observations to Deep SORT. The better your detection model, the easier it is for Deep SORT to track. Higher image resolution usually gives better detections, especially when objects are small relative to the camera's total field of view, because at lower resolutions you lose details that would make an object more easily identifiable.
and if it does how can I modify parameters of Deep Sort like iou_distance or matching_threshold to have better result in a 480p video?
As the answer to the previous question is yes, I'll continue with the rest of the question. If your object detection is alright, you can start by lowering
MAX_DIST: 0.2 # The matching threshold. Samples with larger distance are considered an invalid match
so that you minimize ID switches.
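For context, that parameter sits in the tracker's YAML config; the layout below follows the common deep_sort_pytorch deep_sort.yaml, and every value other than the lowered MAX_DIST is shown with that file's defaults just for orientation:

```yaml
DEEPSORT:
  MAX_DIST: 0.15        # lowered from the 0.2 default to make matching stricter
  MIN_CONFIDENCE: 0.3   # discard weak detections before they enter the tracker
  NMS_MAX_OVERLAP: 0.5
  MAX_IOU_DISTANCE: 0.7
  MAX_AGE: 70           # frames a lost track is kept alive before deletion
  N_INIT: 3             # consecutive hits needed to confirm a new track
  NN_BUDGET: 100
```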

Is there any existing library for median filtering with kernel size greater than 5, using OpenCL acceleration, in Python?

Thank you in advance for answering.
I did a lot of Google searching and found scipy.signal.medfilt2d and skimage.filters.median, but neither of these supports OpenCL.
I also saw: https://wiki.tiker.net/PyOpenCL/Examples/MedianFilter
But that code is hardcoded for kernel size 3.
OpenCV has UMat, the transparent API, but it is again limited to kernel size 5.
If you know of any library that can do median filtering with a kernel size on the order of ~51 using OpenCL in Python, that would be great.
If there isn't one, I am sure I am not the first person to need a median filter with a bigger kernel, so I very humbly ask: why hasn't the community solved this yet? Is there a technical challenge I am not aware of? Please educate me.
Thank you again for looking into this.
Regards
HM
Have you tried cv::medianBlur (the 8-bit CPU version) yet? Its moving-histogram implementation makes its complexity O(1) with respect to block radius, so it is very hard to beat at larger radii no matter how many cores you have. The OpenCL version, on the other hand, which is embarrassingly parallel, uses a sorting network for the median; unfortunately, such an algorithm becomes impractical quickly, as the number of comparison operations required grows very fast.
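For reference, a minimal sketch of that suggestion from Python; the image is a random stand-in, and the key point is that for ksize > 5 medianBlur requires (and supports) 8-bit input:

```python
import cv2
import numpy as np

# Random stand-in image; medianBlur only accepts 8-bit data for ksize > 5.
img = np.random.randint(0, 256, (2048, 2048), dtype=np.uint8)

# Odd kernel sizes well beyond 5 work on this CPU path, e.g. the ~51 asked for.
filtered = cv2.medianBlur(img, 51)
```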

signal disaggregation in real time

I have power data streamed in real time from my electric meter, and when I look at the load with my own eyes I can tell which kind of appliance is on.
Currently I'm using a sliding window of ten points and calculating the standard deviation to detect appliances turning on or off. The aim is to know how much each appliance is consuming, via an integral calculation. I need help performing signal disaggregation in real time, so I can calculate the integral for each appliance and avoid the cross-calculated consumption values that can occur, as shown in the image.
Thanks in advance for any help you can provide!
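A minimal sketch of the detection step I described, assuming one meter reading per call; the threshold is a hypothetical placeholder to tune:

```python
import numpy as np
from collections import deque

WINDOW = 10        # ten-point sliding window, as described above
THRESHOLD = 50.0   # watts; hypothetical value, tune for your meter

window = deque(maxlen=WINDOW)

def on_new_sample(power_w):
    """Feed one streamed reading; returns True when an on/off edge is detected."""
    window.append(power_w)
    if len(window) < WINDOW:
        return False
    return float(np.std(window)) > THRESHOLD
```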
If it's just about distinguishing between on and off states, naive Bayes classification might do the job (https://machinelearningmastery.com/naive-bayes-classifier-scratch-python/); there are several interesting links at the end of that page.
If you want to disaggregate various consumers, an artificial neural network might be a possible solution, using TensorFlow: https://www.tensorflow.org/tutorials/
An issue here is generating the labeled training data from scratch.
A fast Fourier analysis is also used, e.g., for detecting hi-fi equipment, as each device has a specific spectrum.
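To illustrate that last idea, a hedged sketch of matching a fixed-length window of samples against pre-recorded per-appliance spectra; the reference recordings and appliance names are hypothetical and would have to be collected first:

```python
import numpy as np

def spectrum(samples):
    """Magnitude spectrum of a window of meter samples, DC offset removed."""
    samples = np.asarray(samples, dtype=float)
    return np.abs(np.fft.rfft(samples - samples.mean()))

def best_match(window_samples, reference_spectra):
    """Return the appliance whose stored spectrum is closest by cosine similarity."""
    s = spectrum(window_samples)
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max(reference_spectra, key=lambda name: cos(s, reference_spectra[name]))

# Hypothetical references, each recorded while a single appliance ran alone.
t = np.linspace(0, 20, 128)
refs = {"fridge": spectrum(np.sin(t)), "kettle": spectrum(np.sign(np.sin(t)))}
print(best_match(np.sin(t) + 0.05 * np.random.randn(128), refs))
```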
