I am writing an app where the user takes a picture of a logo and the app tries to find the matching logo in its database. For this I use the cv2.SIFT() algorithm and, essentially, a modified version of the find_obj.py example (https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/python2/find_obj.py?rev=6080).
The script now gives me values for "matches" and "inliers", and I have trouble working out how to determine which logo fits best. At first I simply took the image with the most matches or inliers, but it was often the wrong choice. Then I used inliers/matches as a confidence value, but it can easily happen that one logo scores 4/5 (0.8) while the right one scores 16/24 (0.66). So I tried to weight in the number of matches, like inliers/matches * matches * 0.3, but of course I have no clue how to weight it properly.
Any advice on what to do?
You could try taking the matches for each candidate and computing the alignment between your image and that reference image. Then warp your image to align with the reference image and compute the error in that space.
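For example, a rough sketch with OpenCV (kp_query, kp_ref and matches stand in for whatever your find_obj.py variant already produces; both images are assumed grayscale):

```python
import cv2
import numpy as np

def alignment_error(query_img, ref_img, kp_query, kp_ref, matches):
    if len(matches) < 4:  # a homography needs at least 4 correspondences
        return np.inf
    src = np.float32([kp_query[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return np.inf
    h, w = ref_img.shape[:2]
    warped = cv2.warpPerspective(query_img, H, (w, h))
    # Mean absolute difference in the reference frame; lower is better.
    # (You may want to mask out the black border the warp introduces.)
    return cv2.absdiff(warped, ref_img).mean()

# Pick the reference logo with the smallest error instead of the most matches:
# best = min(candidates, key=lambda c: alignment_error(photo, c.img, kp, c.kp, c.matches))
```

A geometrically consistent candidate will produce a low error even with few matches, which sidesteps the ratio-weighting problem.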
I'm really curious about this image, and I have little to no information about how it was created, so I'm here to research how to do it.
Can someone tell me where to begin? I only know that this problem might be related to a recursive subdivision task.
I can only see that the image was initially divided into 64 blocks.
There is also some color simplification going on, which I don't understand and am curious how to achieve.
A reference to an algorithm or procedure is enough (Python/C++ only, please).
You could have squares of the average color of that part of the image, check how similar the result is to the original using something like an image similarity measure, and if it's not good enough, subdivide into four squares and make each of them the average color of its part of the image. Repeat this until every square in the image is good enough.
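A rough sketch of that idea, assuming a NumPy image; the error threshold and minimum square size are arbitrary knobs to tune, and mean squared error against the block average stands in for the similarity measure:

```python
import numpy as np

def quadify(img, out, x, y, w, h, max_err=100.0, min_size=4):
    block = img[y:y+h, x:x+w].astype(np.float64)
    avg = block.mean(axis=(0, 1))
    err = ((block - avg) ** 2).mean()
    if err <= max_err or w <= min_size or h <= min_size:
        out[y:y+h, x:x+w] = avg       # good enough: paint the average color
        return
    hw, hh = w // 2, h // 2           # not good enough: split into 4 squares
    quadify(img, out, x,      y,      hw,     hh,     max_err, min_size)
    quadify(img, out, x + hw, y,      w - hw, hh,     max_err, min_size)
    quadify(img, out, x,      y + hh, hw,     h - hh, max_err, min_size)
    quadify(img, out, x + hw, y + hh, w - hw, h - hh, max_err, min_size)

# usage: out = np.zeros_like(img); quadify(img, out, 0, 0, img.shape[1], img.shape[0])
```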
With the help of Google Images I was able to find the name of the person behind the image: Kenny Cason. With some more research I was able to find the answer.
The problem is related to Quad Tree Images:
1. Partition the image into four quadrants.
2. Color each quadrant based on the average color of the pixels in the target image.
3. Compute each quadrant's squared error between the original target image and the generated image.
4. Select the quadrant with the highest error and recur into it.
5. Repeat from step one, using the current highest-error quadrant.
GitHub link.
Kenny Cason's blog.
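To make the loop concrete, here is a compact sketch that uses a max-heap to always split the current highest-error quadrant; the iteration count is an arbitrary knob, and the linked repo is the authoritative implementation:

```python
import heapq
import numpy as np

def quad_error(img, x, y, w, h):
    block = img[y:y+h, x:x+w].astype(np.float64)
    avg = block.mean(axis=(0, 1))
    return ((block - avg) ** 2).sum(), avg

def quads(img, iterations=1024):
    out = np.zeros_like(img)
    heap = []

    def push(x, y, w, h):
        err, avg = quad_error(img, x, y, w, h)
        out[y:y+h, x:x+w] = avg                      # color with the average
        heapq.heappush(heap, (-err, x, y, w, h))     # max-heap via negation

    push(0, 0, img.shape[1], img.shape[0])
    for _ in range(iterations):
        if not heap:
            break
        _, x, y, w, h = heapq.heappop(heap)          # highest-error quadrant
        if w < 2 or h < 2:
            continue
        hw, hh = w // 2, h // 2                      # partition into four
        push(x, y, hw, hh)
        push(x + hw, y, w - hw, hh)
        push(x, y + hh, hw, h - hh)
        push(x + hw, y + hh, w - hw, h - hh)
    return out
```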
I am trying to identify the state of a valve (on or off). My approach is to take two images, one of each state, and compare the current image against those two to see which one it matches.
I have tried comparing with new_image - on_image and new_image - off_image, then counting the number of differing pixels. It works, but I feel that in some cases it might not, and that there must be a better way to do a simple classification like this.
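Roughly what I am doing now, as a simplified sketch (assuming aligned grayscale images of the same size; the difference threshold is arbitrary):

```python
import cv2

def classify_valve(new_img, on_img, off_img, thresh=30):
    # Count pixels that differ by more than `thresh` from each reference.
    _, d_on = cv2.threshold(cv2.absdiff(new_img, on_img), thresh, 255, cv2.THRESH_BINARY)
    _, d_off = cv2.threshold(cv2.absdiff(new_img, off_img), thresh, 255, cv2.THRESH_BINARY)
    return "on" if cv2.countNonZero(d_on) < cv2.countNonZero(d_off) else "off"
```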
Any reference or ideas?
Subtracting pixels might not be very robust if your camera position changes slightly. If you don't shy away from using OpenCV, there is an interesting recipe for finding a predefined object in a picture:
Feature Matching + Homography to find Objects
You could cut out the lever from your image and search for it in every new image. Depending on the coordinates, and especially the rotation, you can set the status of the valve. This might even work in odd cases where someone has half opened (or, for pessimists, half closed) the valve, or where the lever becomes partially covered.
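A minimal sketch of that recipe (lever.png and frame.png are hypothetical file names for the cropped template and the current camera image):

```python
import cv2
import numpy as np

lever = cv2.imread('lever.png', cv2.IMREAD_GRAYSCALE)
scene = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(lever, None)
kp2, des2 = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:30]

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# For a roughly fronto-parallel camera, the lever's rotation can be read
# off the homography: atan2(H[1,0], H[0,0]) approximates the angle.
angle = np.degrees(np.arctan2(H[1, 0], H[0, 0]))
print('lever rotated by about %.1f degrees' % angle)
```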
I am about to start learning CV and ML, and I want to begin by solving a problem. Below I am sharing an image. I want to extract each symbol and its location from the image and create a new image with those extracted symbols arranged in the same pattern as in the source image. After that, I will do a translation job. Which steps should I follow to extract the symbols, look them up in a dataset (in terms of Gardiner's sign list), and place them in the new image?
I know there is some computer vision + machine learning involved in this process, because the symbols will not match the references exactly; these are very old symbols. I don't know where to start or end. I plan to use Python. Also, please share if you know of anyone who has already done this. Thank you.
1. Run Sobel edge detection on the reference images in Gardiner's sign list.
2. Train a CNN on those edge images.
3. Normalize the contrast of the input image (referred to as the source image hereafter).
4. Run Sobel edge detection on the source image.
5. Evaluate windows of varying heights and widths (from largest to smallest) over the source image with the CNN.
6. Select the highest-probability glyph, and output the corresponding glyph from Gardiner's list at that start position with the corresponding height and width.
I do not claim this can be done in six simple steps, but this is the approach I would take.
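To make steps 3-6 concrete, here is a rough sketch; `model` stands in for a hypothetical CNN trained on Sobel edges of the Gardiner glyphs (steps 1-2), and the window sizes and stride are arbitrary:

```python
import cv2
import numpy as np

def preprocess(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)                    # contrast normalization
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)           # Sobel edge detection
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    return cv2.magnitude(gx, gy).astype(np.float32)

def best_glyph(edges, model, sizes=(128, 96, 64, 48, 32), stride=16):
    best = (0.0, None)                               # (probability, (x, y, size, label))
    for size in sizes:                               # largest to smallest
        for y in range(0, edges.shape[0] - size, stride):
            for x in range(0, edges.shape[1] - size, stride):
                patch = cv2.resize(edges[y:y+size, x:x+size], (64, 64))
                label, prob = model.predict(patch)   # hypothetical interface
                if prob > best[0]:
                    best = (prob, (x, y, size, label))
    return best
```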
I have an image captured by an Android camera. Is it possible to calculate the depth of an object in the image? The image contains only an object and the background. Any suggestions, explanations, or links that you think can help me will be appreciated.
OpenCV is the library you need.
I did some depth identification of water levels against a pure white background a few days ago. Generally, if you want to identify depth, you can convert the question into identifying an edge between changing colors. In this case, you can convert the color pictures to grey and identify the changes at the white-black-grey interface. OpenCV is capable of doing the job at high speed.
Hope it helps. Let me know if you need further help.
Edits:
If you want to find the actual depths, you need to project the coordinate system of your pictures onto the real world, or vice versa. To do this, you have to know a fixed location as your reference, and the relationship between pixels and real distances.
What I did was find the fixed location and set it as zero. Afterwards, I measured the length of an object in the picture and also counted the number of pixels it spanned, which gave me the relationship between pixels and real distances.
Note that these procedures may introduce identification errors. I did it very carefully, and the error was acceptable in my case.
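In code, the calibration step is just a ratio; the reference length and pixel counts below are assumed values for illustration:

```python
import cv2

img = cv2.imread('scene.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)       # locate the white/black/grey interfaces

REF_LENGTH_CM = 10.0                   # known real-world length of a reference object
ref_pixels = 250                       # its measured length in the image, in pixels
cm_per_pixel = REF_LENGTH_CM / ref_pixels

object_pixels = 480                    # distance measured between detected edges
print('object length: %.1f cm' % (object_pixels * cm_per_pixel))
```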
With only one image, accurate depth estimation is near impossible. However, there are various methods of estimating depth under certain assumptions, or given the camera calibration matrix. As mentioned by @WenlongLiu, OpenCV is a very good place to start.
Okay, so I am trying to find the homography of a soccer field in a match video. What I have so far is:
1. Read images from a folder containing many cropped images of a template soccer field; these include the center circle, penalty lines, etc.
2. Read the video stream from a file and crop it into many smaller segments.
3. Loop over the frames of the video stream, and inside that, loop over the images read from the folder.
4. For the two images obtained in each iteration, apply a green filter, based on my assumption that the field is green.
5. Use ORB to find keypoints and then find matches.
Now the problem is that, because of the players and some noise from the crowd, I am unable to find proper matches for the homography. Removing them is also a problem, because doing so tends to hide the soccer field lines that I need to compute the homography from.
Any suggestions are greatly appreciated. Below are some sample code and images that I am using.
"Code being used"
Sample images
Output that I am getting
The image on the right of the output is a frame from the video; the one on the left is the same sample image I uploaded, after the filterGreen function, as can be seen from the code.
Finally, what I want is for the image to map properly onto the center circle so I can draw a cube in the center, somewhat similar to "This example". Thanks in advance for helping me out.
An interesting technique to throw at this problem is RASL. It computes homographies that align a stack of related images. It does not require you to specify corresponding points on the images, but operates directly on the image pixels. It is robust against image occlusions (e.g., players moving in the foreground).
I've just released a Python implementation here: https://github.com/welch/rasl
(there are also links there to the original RASL paper, MATLAB implementation, and data).
I am unsure if you'd want to crop the input images to that center circle, or if the entire frames can be aligned. Try both and see.