OpenCV: Recognizing the Orientation of a Cube in a 2D Image - python

I'd like to develop a Rubik's Cube solver (just for fun). For this, the user would take a picture of the cube so that three of its sides are visible, like in this picture:
cubeA.jpg:
It can be safely assumed that the cube covers most of the picture. Now, what I would like to do is detect the exact orientation of the cube. That is, I am interested in detecting these seven points:
cubeB.jpg:
From this, I would be able to split each of the three faces into a 3x3 grid and detect the colors (for which I have already written the code, and it works quite well).
Now, I have made a few attempts to solve this problem using corner detection and contour extraction in OpenCV. None of them have been satisfactory, however. Another way of looking at this problem would be to recognize each of the 27 little, so-called “facelets”, and detect the orientation using those. But I'm not sure if that's the right approach either.
How would you go about solving this problem using OpenCV?

Related

How to close a contour with OpenCV when dilating isn't enough

I have been looking for a way to close contours for shapes that are similar to circles or rounded objects.
I found a very interesting answer here.
I thought my problem was solved until I tested with different images.
This is my mask from OpenCV canny AFTER dilating. As you can see, there is a big gap at the bottom.
If the shape were a circle, I could try to reconstruct the circle, but the shape can vary. It could look like a cucumber or a rugby ball.
I wonder if it's actually possible to connect the contour without drawing a straight line, which would make me lose a lot of information.
Perhaps by approximating the "curviness" or "straightness" from the neighboring points?

How to calculate a facial point grid using OpenCV and Python?

I'm trying to automatically draw a mesh or grid over a face, similar to the image below, to use the result in a blog post that I'm writing. However, my knowledge of computer vision is not enough to recognize which model or algorithm is behind these types of cool visualizations.
Could someone point me to something to read or a starting point?
Using Python, OpenCV and dlib, the closest thing I found is something called Delaunay triangulation, but judging by the results I'm not sure that's exactly what I'm looking for.
In short, what I have so far is:
Detect all faces on image and calculate their landmarks using dlib.get_frontal_face_detector() and dlib.shape_predictor() methods from dlib.
Use the method cv2.Subdiv2D() from OpenCV to compute a 2D subdivision based on my landmarks. In particular, I'm getting the Delaunay subdivision using the getTriangleList() method from the resulting subdivision.
The complete code is available here.
However, the result is not so attractive perhaps because the division is using triangles instead of polygons and I want to check if I can improve it!

How to detect the direction of a finger pointing

I am using Python with OpenCV and pyautogui to play Tetris (via a website) with my hands. To play the game, there are 4 directions (up/down/left/right) that I want to detect and map to pyautogui inputs. I have tried many ways to detect the direction in which the finger is pointing, but nothing works consistently. Any help with the solution I am currently working on, or another idea, would be great.
I am currently trying to use findContours with convexHull to detect the perimeter of the hand, and then find the sides that meet at the smallest angle, which in turn should tell me which way the finger is pointing. But I don't know how to extract that information.
I have tried using a BoundingRect to find which side is longer (w or h) then splitting the rectangle in half and seeing which side has more pixels in it (from the binary of the contour), but it wasn't consistent enough.
I have also tried using HoughLines and HoughLinesP to find a vanishing point, but I couldn't figure out how to get that to work.
Basically I am at a loss.
Edit: I'm trying to avoid MediaPipe so I can show I understand how to use the elements of OpenCV.

Python openCV findContours falsely connects neighbouring objects

I use OpenCV 2.4.8 in Python to segment images and find objects.
I want to use findContours to list the objects and analyse their area, shape and so on. But if I have two objects that are only separated by a thin (1 px wide) diagonal line or even only diagonally touch at the corners, they will be recognised as one object.
This image illustrates the problem:
There are obviously two objects, but they are recognised as one.
In Matlab one can specify a connectivity parameter (neighbourhood of 4 or 8) to solve this problem. Can this also be done in some way using OpenCV? Maybe using the hierarchy of the contours or some other workaround?
I know that I could use morphological erosion or opening to separate the two objects, but this can cause problems in other parts of my image. I already tried this.
If your objects are circular, you can try using circular Hough transform.
If such an image is represented as a single contour, then it is bound to have defects. You can search for convexity defects and proceed from there. But this again depends on the objects in your image.
I'm not sure what kind of objects your image contains, so it's hard to come to a definitive answer.

Robust detection of grid pattern in an image

I have written a program in Python which automatically reads score sheets like this one
At the moment I am using the following basic strategy:
Deskew the image using ImageMagick
Read into Python using PIL, converting the image to B&W
Calculate the sums of pixels in the rows and the columns
Find peaks in these sums
Check the intersections implied by these peaks for fill.
The result of running the program is shown in this image:
You can see the peak plots below and to the right of the image shown in the top left. The lines in the top left image are the positions of the columns and the red dots show the identified scores. The histogram bottom right shows the fill levels of each circle, and the classification line.
The problem with this method is that it requires careful tuning, and is sensitive to differences in scanning settings. Is there a more robust way of recognising the grid, which will require less a-priori information (at the moment I am using knowledge about how many dots there are) and is more robust to people drawing other shapes on the sheets? I believe it may be possible using a 2D Fourier Transform, but I'm not sure how.
I am using the EPD, so I have quite a few libraries at my disposal.
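The projection strategy listed above (sum the pixels along rows and columns, then find peaks) can be sketched with NumPy alone. This is an illustration, not the asker's program: the function name `projection_peaks` and the `min_fraction` threshold are assumptions, and the input is assumed to be an already deskewed binary image in which marked pixels are non-zero.

```python
import numpy as np

def projection_peaks(binary, min_fraction=0.5):
    """Return (row_peaks, col_peaks): indices of local maxima in the projections.

    binary: 2D array, non-zero where there is ink.
    min_fraction: a peak must reach this fraction of the largest sum
                  (an arbitrary tuning value chosen for this sketch).
    """
    def peaks(profile):
        threshold = min_fraction * profile.max()
        found = []
        for i in range(1, len(profile) - 1):
            # Local maximum: strictly above the previous bin, not below the next.
            if (profile[i] >= threshold and profile[i] > profile[i - 1]
                    and profile[i] >= profile[i + 1]):
                found.append(i)
        return found

    ink = (np.asarray(binary) > 0).astype(np.int64)
    return peaks(ink.sum(axis=1)), peaks(ink.sum(axis=0))
```

The single threshold is exactly the kind of tuning the question complains about, which is why this approach is sensitive to scanning settings.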
First of all, I find your initial method quite sound and I would have probably tried the same way (I especially appreciate the row/column projection followed by histogramming, which is an underrated method that is usually quite efficient in real applications).
However, since you want to go for a more robust processing pipeline, here is a proposal that can probably be fully automated (also removing at the same time the deskewing via ImageMagick):
1. Feature extraction: extract the circles via a generalized Hough transform. As suggested in other answers, you can use OpenCV's Python wrapper for that. The detector may miss some circles, but this is not important.
2. Apply a robust alignment detector using the circle centers. You can use Desolneux's parameter-less detector described here. Don't be put off by the math; the procedure is quite simple to implement (and you can find example implementations online).
3. Get rid of diagonal lines by selecting on the orientation.
4. Find the intersections of the lines to get the dots. You can use these coordinates for deskewing by assuming ideal fixed positions for these intersections.
This pipeline may be a bit CPU-intensive (especially step 2, which performs a kind of greedy search), but it should be quite robust and automatic.
The correct way to do this is to use connected component analysis on the image to segment it into "objects". Then you can use higher-level algorithms (e.g. a Hough transform on the components' centroids) to detect the grid, and also determine for each cell whether it's on or off by looking at the number of active pixels it contains.
