I am currently working on a project that requires me to segment drone imagery into regions of interest and create masks from them.
Below is a hand-made example of what the ideal output should look like:
Original image
Image with water body mask
Image with crops masks
Image with tillage masks
Image with road masks
Image with building masks
I am aware that the best way to do this is semantic segmentation with convolutional neural networks and training data, but the task allocated to me was a more basic separation through colour segmentation. Ideally, I would put a mask of all the green foliage and trees on one layer, the roads on another layer, the water bodies on their own layer, and buildings and other objects on a further layer, based on their colours and contours after cleaning up the noise.
Enhanced contours
I increased the contrast of this image and, after clearing out some noise, used Canny edge detection to produce the enhanced contours above, so that it would be easier to pick out different instances of objects.
What I would like to try is to create a mask based on the contours of an object, then check the colour underneath to assign it to a specific colour layer. Could anyone suggest what sort of algorithm I could use to achieve this?
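Roughly, I am imagining something along the lines of this sketch (the HSV ranges and the input file name are placeholders and would need tuning per scene):

import cv2
import numpy as np

img = cv2.imread('drone.jpg')                     # placeholder file name
edges = cv2.Canny(cv2.GaussianBlur(img, (5, 5), 0), 50, 150)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
layers = {name: np.zeros(img.shape[:2], np.uint8)
          for name in ('foliage', 'water', 'road', 'other')}

for cnt in contours:
    if cv2.contourArea(cnt) < 200:                # skip small noise contours
        continue
    mask = np.zeros(img.shape[:2], np.uint8)
    cv2.drawContours(mask, [cnt], -1, 255, -1)    # fill the contour into a mask
    h, s, v = cv2.mean(hsv, mask=mask)[:3]        # mean colour under the contour
    if 35 <= h <= 85 and s > 40:                  # greenish -> foliage/crops
        layers['foliage'] |= mask
    elif 90 <= h <= 130 and s > 40:               # bluish -> water
        layers['water'] |= mask
    elif s < 40:                                  # low-saturation grey -> road/building
        layers['road'] |= mask
    else:
        layers['other'] |= mask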
Also, I understand that the colours of the water and the roads could be very similar depending on the location, so this is not a very robust solution. Any advice on how I could make a better distinction between the two?
Any insights into this would be much appreciated, thanks!
Related
I am struggling to get my KNN object detection script to tell the difference between two almost identical objects.
The obvious solution, which I am doing now, is to add more reference images.
I was wondering if there is anything else I could do to more easily distinguish between the two objects. They are both beverage containers with the same shape and colours; the first has a slight bluish tint and the other is plain white with a shiny finish.
Right now I feed the image crop of each object to VGG19 and capture the feature map output of the 1x4096 flattened layer, then append the five most dominant colours in RGB format along with the pixel height and width of the crop. This gives me a 4113-element one-dimensional vector on which I can perform the KNN search.
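A sketch of how such a vector could be built (assuming Keras/TensorFlow and scikit-learn; 'fc1' is Keras' name for the first 4096-wide fully connected layer in VGG19, and I use k-means to pick the dominant colours):

import cv2
import numpy as np
from sklearn.cluster import KMeans
from tensorflow.keras.applications import VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input
from tensorflow.keras.models import Model

base = VGG19(weights='imagenet')
feature_model = Model(inputs=base.input, outputs=base.get_layer('fc1').output)

def build_vector(crop_bgr):
    h, w = crop_bgr.shape[:2]
    # CNN features: 1 x 4096
    rgb = cv2.cvtColor(cv2.resize(crop_bgr, (224, 224)), cv2.COLOR_BGR2RGB)
    cnn_feats = feature_model.predict(preprocess_input(rgb[None].astype(np.float32)))[0]
    # Five dominant colours via k-means on the pixels: 5 x 3 = 15 values
    pixels = crop_bgr.reshape(-1, 3).astype(np.float32)
    dominant = KMeans(n_clusters=5, n_init=10).fit(pixels).cluster_centers_.ravel()
    # 4096 + 15 + 2 = 4113-dimensional vector for the KNN search
    return np.concatenate([cnn_feats, dominant, [h, w]])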
I am curious whether things like saturation and contrast adjustments would help the KNN search find the small differences, or if there is something else I could try.
Best regards
Martin
Imagine a factory warehouse with different-sized boxes loaded with products. I want to measure these boxes with a camera. There is no controlled background; the background is the natural factory warehouse. I have code for measuring, but it measures everything, and I want to measure only the boxes.
I have code for measuring objects, but how do I detect only cardboard boxes with OpenCV?
Should I detect them with a colour filter or with YOLO?
Also, the user may want to measure other objects instead of cardboard boxes, such as industrial machines, so maybe I need a more general solution...
The camera faces the boxes head-on, so it sees their width and height (180 degrees).
As you can see, the code measures everything, but I only want cardboard boxes. I have tried filtering colours with hue, saturation, and value, but it didn't work because I'm using an ArUco marker for the perimeter, which is black and white: when I filter out the other colours, the ArUco marker is lost too. There may also be differently coloured boxes.
You can try detecting rectangular or quadrilateral contours in a copy of the black-and-white frame and then correlating those contours back to the original (colour) frame, where you can apply colour filtering to decide whether each candidate is a cardboard box. The advantage of this over deep learning is that DL might take more processing power.
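A rough sketch of that idea (the file name is illustrative, and the HSV range for "cardboard brown" is only a guess that would need tuning):

import cv2
import numpy as np

frame = cv2.imread('warehouse.jpg')               # illustrative file name
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
boxes = []
for cnt in contours:
    approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
    if len(approx) == 4 and cv2.contourArea(approx) > 1000:   # big quadrilateral
        mask = np.zeros(gray.shape, np.uint8)
        cv2.drawContours(mask, [approx], -1, 255, -1)
        h, s, v = cv2.mean(hsv, mask=mask)[:3]
        # brownish, moderately saturated colour ~ cardboard (tune these ranges)
        if 10 <= h <= 30 and s >= 40 and v >= 60:
            boxes.append(approx)

cv2.drawContours(frame, boxes, -1, (0, 255, 0), 2)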
Have you tried any deep learning (DL) methods for cardboard box detection? If not, I recommend a DL method based on YOLOv5, or a classical machine learning approach such as HOG with an SVM. The advantage of DL methods is that you only need to label the boxes and pass the data (images and annotations) to the model, without worrying about any other objects.
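A minimal sketch of the YOLOv5 route, assuming you have already trained it on your own labelled box images ('best.pt' and the image file name are hypothetical):

import cv2
import torch

model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')
frame = cv2.imread('warehouse.jpg')               # placeholder file name
results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
    # each detection should be a cardboard box; pass its bounding box
    # (or the corresponding crop) to the existing measuring code
    cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)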
I tagged the cells using the Labelme software (you can tag any object with it), and then trained a YOLACT model with the images and annotations. Figure 1 shows the results the model predicted on a new image.
I am trying to obtain the relative depth of pixels in an image, for example the image at https://www.awn.com/news/nvidia-unveils-quadro-rtx-worlds-first-ray-tracing-gpu. I don't need the precise distance of each pixel, which I believe would be impossible, but I would like to get something like "the green ball is further away than the other balls". Is this possible using OpenCV in Python? The code I have written can identify each ball, but not their relative distance or depth, so it is pretty much useless for my purposes.
That's an ill-posed problem (you cannot measure depth with a single RGB camera) and a topic of recent research; I found this survey paper. Most often a depth image is learned from an RGB image using convolutional neural networks.
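For example, a pretrained monocular-depth network such as MiDaS gives a relative depth map from a single RGB image; a minimal sketch following its published torch.hub usage (the file name is illustrative):

import cv2
import torch

midas = torch.hub.load('intel-isl/MiDaS', 'MiDaS_small')
midas.eval()
transform = torch.hub.load('intel-isl/MiDaS', 'transforms').small_transform

img = cv2.cvtColor(cv2.imread('balls.jpg'), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    pred = midas(transform(img))
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2],
        mode='bicubic', align_corners=False).squeeze().numpy()
# MiDaS outputs inverse relative depth: larger values mean closer.
# Compare the values at two ball centres to decide which is further away.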
However, if you use a lot of prior information about your scene (all objects are circular within the image, and the partially visible circles correspond to the ones in the background), then you might be able to do something with heuristic methods such as thresholding, edge detection, or Hough transforms, but it won't be easy.
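A rough sketch of such a heuristic (not a general depth solution): detect circles with a Hough transform, then for each overlapping pair assume the circle with less edge support, i.e. more of its outline occluded, is the one behind. All parameters and the file name are illustrative.

import cv2
import numpy as np

img = cv2.imread('balls.jpg')                     # illustrative file name
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)
edges = cv2.Canny(gray, 50, 150)

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=30,
                           param1=150, param2=40, minRadius=10, maxRadius=200)
circles = np.round(circles[0]).astype(int) if circles is not None else []

def edge_support(x, y, r, edges, samples=360):
    # fraction of the circle's outline that lands on detected edges
    ang = np.linspace(0, 2 * np.pi, samples, endpoint=False)
    xs = np.clip((x + r * np.cos(ang)).astype(int), 0, edges.shape[1] - 1)
    ys = np.clip((y + r * np.sin(ang)).astype(int), 0, edges.shape[0] - 1)
    return np.mean(edges[ys, xs] > 0)

support = [edge_support(x, y, r, edges) for x, y, r in circles]
for i, (x1, y1, r1) in enumerate(circles):
    for j, (x2, y2, r2) in enumerate(circles):
        if i < j and np.hypot(x1 - x2, y1 - y2) < r1 + r2:   # circles overlap
            behind = i if support[i] < support[j] else j
            print(f'circle {behind} appears to be behind the other')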
I have two different types of images (which I cannot post due to reputation, so I've linked them):
Image 1 Image 2
I was trying to extract hand features from the images using OpenCV and Python, which kinda looks like this:
import cv2

image = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(image, (5, 5), 0)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
# with THRESH_OTSU the fixed threshold (70) is ignored and Otsu's value is used
retval, thresh1 = cv2.threshold(gray, 70, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
cv2.imshow('image', thresh1)
cv2.waitKey(0)
The result of which looks like this:
Image 1 Image 2
The change in background in the second image is messing with the cv2.threshold() function and it's not getting the skin parts right. Is there a way to do this correctly?
As a follow-up question, what is the best way to extract hand features? I tried a Haar cascade and didn't really get results. Should I train my own cascade? What other options do I have?
It's hard to say based on a sample size of two images, but I would try OpenCV's Integral Channel Features (ChnFtrs), which are like supercharged Haar features that can take cues from colour as well as any other image channels you care to create and provide.
In any case, you are going to have to train your own cascades. Separate cascades for front and profile shots of course.
Take out your thresholding by skin colour, because as you've already noticed, it may throw away some or all of the hands depending on the actual subject's skin colour and lighting. ChnFtrs will do the skin detection for you more robustly than a fixed threshold can. (Though for future reference, all humans are actually orange :))
You could eliminate some false positives by only detecting within a bounding box of where you expect the hands to be.
Try both RGB and YUV channels to see what works best. You could also throw in the results of edge detection (say, Canny, maximised across your 3 colour channels) for good measure. At the end, you could cull channels which are underused to save processing if necessary.
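For example, building those extra channels might look something like this sketch (the file name is illustrative):

import cv2
import numpy as np

img = cv2.imread('hand.jpg')
yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)

# Canny on each colour channel, keeping the strongest response per pixel
edges = np.max([cv2.Canny(img[:, :, c], 50, 150) for c in range(3)], axis=0)

# Stack whichever channels you end up keeping (B, G, R, Y, U, V, edges)
channels = np.dstack([img, yuv, edges])
print(channels.shape)   # (H, W, 7)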
If you have much variation in hand pose, you may need to group similar poses and train a separate ChnFtrs cascade for each group. Individual cascades do not have a branching structure, so they do not cope well when the positive samples are disjoint in parameter space. This is, AFAIK, a bit of an unexplored area.
A correctly trained ChnFtrs cascade (or several) may give you a bounding box for the hands, which will help in extracting hand contours, but it can't exclude invalid contours within the same bounding box. Most other object detection routines will also have this problem.
Another option, which may be better/simpler than ChnFtrs, is LINEMOD (a current favourite of mine). It has the advantage that there's no complex training process, nor any training time needed.
I need to blur faces to protect the privacy of people in street-view images, as Google does in Google Street View. The blur should not make the image aesthetically unpleasant. I read in the paper titled Large-scale Privacy Protection in Google Street View by Google (link) that they do the following to blur the detected faces:
We chose to apply a combination of noise and aggressive Gaussian blur that we alpha-blend smoothly with the background starting at the edge of the box.
Can someone explain how to perform this task? I understand Gaussian blur, but how do I blend it with the background?
Code would be helpful but is not required.
My question is not how to blur a part of image?; it is how to blend the blurred portion with the background so that the blur is not unpleasant. Please refer to the quote I provided from the paper.
I have large images and a lot of them, so an iterative process such as in the possible duplicate would be time consuming.
EDIT
If someone ever wants to do something like this, I wrote a Python implementation. It isn't exactly what I was asking for but it does the job.
Link: pyBlur
I'm reasonably sure the general idea is:
Create a shape for the area you want to blur (say a rectangle).
Extend your shape by X pixels outwards.
Apply a gradient on alpha from 0.0 .. 1.0 (or similar) over the extended area.
Apply blur to the extended area (ignoring alpha).
Now use an alpha-blend to apply the modified image to the original image.
Adding noise in a similar way would make it even less obvious that the region has been blurred (because the blur will of course also smooth away the image's natural noise).
I don't know the exact parameters for how much to grow the shape, what values to use for the alpha gradient, and so on, but that's what I understand from the quoted text. A rough sketch of these steps is below.
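A Python/OpenCV sketch of those steps (not Google's actual implementation; the box coordinates, padding, kernel size, and noise level are made-up parameters):

import cv2
import numpy as np

def soft_blur_box(img, x, y, w, h, pad=15, ksize=31, noise_sigma=8.0):
    H, W = img.shape[:2]
    # 1-2. extend the box outwards by `pad` pixels
    x0, y0 = max(x - pad, 0), max(y - pad, 0)
    x1, y1 = min(x + w + pad, W), min(y + h + pad, H)
    roi = img[y0:y1, x0:x1].astype(np.float32)

    # 4. aggressive blur plus some noise
    blurred = cv2.GaussianBlur(roi, (ksize, ksize), 0)
    blurred += np.random.normal(0, noise_sigma, blurred.shape).astype(np.float32)

    # 3. alpha mask: 1.0 inside the original box, falling to 0.0 at the padded edge
    alpha = np.zeros(roi.shape[:2], np.float32)
    alpha[(y - y0):(y - y0 + h), (x - x0):(x - x0 + w)] = 1.0
    alpha = cv2.GaussianBlur(alpha, (2 * pad + 1, 2 * pad + 1), 0)[..., None]

    # 5. alpha-blend the modified patch back into the original image
    img[y0:y1, x0:x1] = np.clip(alpha * blurred + (1 - alpha) * roi,
                                0, 255).astype(img.dtype)
    return img

# usage: blur a hypothetical face box at (200, 150) sized 80x100
# out = soft_blur_box(cv2.imread('street.jpg'), 200, 150, 80, 100)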