How do getPerspectiveTransform and warpPerspective work? [Python]

I couldn't find a clear explanation of how getPerspectiveTransform and warpPerspective work in OpenCV, specifically in Python. My understanding of the methods is:
Given 4 points from a source image and 4 new points, getPerspectiveTransform returns a (3, 3) matrix that somehow crops the image when passed to warpPerspective. I thought that the 4 source points form a polygon on the image, which is cropped out and then fitted between the 4 new points. I also saw that warpPerspective takes the size of the output image as an argument. So I inferred that if the width/height spanned by the new points (imagining them as the corners of a rectangle or quadrilateral) is less than the provided width or height, the remaining area is left blank, that is, black or white. But this wasn't the case: the remaining space is filled with part of the source image, namely the region outside the 4 source points.
I wasn't able to comprehend this behavior...
So am I interpreting the methods incorrectly? If so, please provide the correct interpretation of these methods.
PS. I'm pretty new to OpenCV and it would be great if someone could explain the underlying math used by getPerspectiveTransform and warpPerspective.
Thanks in advance.

These functions are parts of an image processing concept called Geometric transformations.
When taking a picture in real life, there is always some sort of geometric distortion which can be removed using Geometric transformations. It has other applications too, including construction of mosaics, geographical mapping, stereo and video.
Here's an example from this site :
So basically warpPerspective transforms the source image into the desired version of it, and it does the job using a 3×3 transformation matrix computed by getPerspectiveTransform.
See more details here.
Now if you wonder how to find those 4 pairs of points in the source and destination images, you should look into another image processing concept called feature extraction. These methods find distinctive regions of an image, which you can then match to another image of the same object taken from a different view (see SIFT, SURF, ORB, etc.).
An example of matched features:
So warpPerspective won't just crop your image; it transforms the whole image (not just the region specified by the 4 points) based on the transformation matrix. The points are only used to compute the matrix.
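Since the question also asks about the underlying math, here is a minimal numpy sketch of it. This is not OpenCV's actual implementation. It assumes the matrix is normalized so that its bottom-right entry is 1, which leaves 8 unknowns, and each of the 4 point pairs contributes 2 linear equations. The function names `perspective_matrix` and `apply_perspective` and the sample points are made up for illustration.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] = 1) that maps each src point to its dst point."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h00*x + h01*y + h02) / (h20*x + h21*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h10*x + h11*y + h12) / (h20*x + h21*y + 1)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_perspective(H, points):
    """Map (x, y) points through H, including the division by the third coordinate."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical example: a skewed quadrilateral mapped onto a 100x100 square
src = [(10, 10), (90, 20), (80, 80), (20, 90)]
dst = [(0, 0), (100, 0), (100, 100), (0, 100)]
H = perspective_matrix(src, dst)

corners = apply_perspective(H, src)        # lands on dst
outside = apply_perspective(H, [(0, 0)])   # a pixel outside the quadrilateral is mapped too
```

The last line is the behavior described in the question: the matrix is defined for every pixel, not only those inside the 4 points. As far as I know, warpPerspective fills each output pixel by looking up where it comes from in the source, so any output pixel whose source location is still inside the image shows image content. Only pixels that map outside the source image get the border value, which is black by default.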

Related

Image Operations with Python

I hope you're all doing well!
I'm new to image manipulation, so I want to apologize up front for my simple question. I'm currently working on a problem that involves classifying an object called a jet into two known categories. This object is made of sub-objects. My idea is to use these sub-objects to transform each jet into a pixel image, and then apply convolutional neural networks to find the patterns.
Here is an example of the pixel images:
jet's constituents pixel distribution
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is in the vertical direction, as well as make sure that the most intense pixel is at the top. It also would be good to impose that one of the sides (left or right) of the image contains the majority of the intensity and to normalize the intensity of the whole image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here: https://arxiv.org/abs/1407.5675
You can look into OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html.
It supports a lot of image processing functions.
In your case, it would probably be easier to convert the image into a more suitable color space in which one axis stands for intensity (e.g. HSI, HSL, HSV) and then find the indices of the maximum values along this axis (this should return the pixels with the highest intensity in the image).
Generally, in Python, we use the PIL library for basic image manipulation and OpenCV for more advanced tasks.
But, if I understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a numpy.ndarray called img, you can find the maximum value along the desired axis just by writing:
img.max(axis=0)
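To normalize the image so that its maximum is 1, you can use (converting to float first, because in-place division fails on integer arrays):
img = img.astype(float) / img.max()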
To find which part of the image is brighter, you can split the img array into the desired parts and calculate their means:
half = img.shape[1] // 2
left = img[:, :half]
right = img[:, half:]
left_mean = left.mean()
right_mean = right.mean()
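Putting these pieces together for the steps listed in the question, a rough sketch could look like this. It assumes a 2D single-channel array, normalizes the total intensity (the sum) to 1 rather than the maximum, and arbitrarily picks the right side as the one that should hold the majority of the intensity. The function names and the toy image are made up. The actual rotation is left out, since it needs an interpolating routine from an image library.

```python
import numpy as np

def two_brightest(img):
    """Return (row, col) of the brightest and second-brightest pixels of a 2D array."""
    order = np.argsort(img, axis=None)[::-1][:2]     # flat indices, brightest first
    rows, cols = np.unravel_index(order, img.shape)
    return (int(rows[0]), int(cols[0])), (int(rows[1]), int(cols[1]))

def tilt_from_vertical(first, second):
    """Angle in degrees between 'straight up' and the direction second -> first."""
    d_row = first[0] - second[0]
    d_col = first[1] - second[1]
    # Rows grow downwards, so 'up' corresponds to a negative row difference
    return float(np.degrees(np.arctan2(d_col, -d_row)))

def standardize(img):
    """Scale total intensity to 1 and mirror so the right half is the brighter one."""
    img = img.astype(float)
    img = img / img.sum()
    half = img.shape[1] // 2
    left_sum = img[:, :half].sum()
    right_sum = img[:, img.shape[1] - half:].sum()   # middle column ignored for odd widths
    if left_sum > right_sum:
        img = img[:, ::-1]                           # horizontal mirror
    return img

# Toy 5x5 "jet image" with three non-zero pixels
img = np.zeros((5, 5))
img[1, 1], img[3, 1], img[2, 3] = 4, 2, 1

first, second = two_brightest(img)
angle = tilt_from_vertical(first, second)   # 0 means the axis is already vertical, brightest on top
std = standardize(img)
```

If the angle is not zero, you would rotate the image by that amount (checking the sign convention of whichever rotation function you use) before mirroring.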

Find Coordinates of cropped image (JPG) from its original

I have a database of original images, and for each original image there are various cropped versions.
This is an example of what the images look like:
Original
Horizontal Crop
Square Crop
This is a very simple example, but most images are like this; some crops might take a smaller section of the original image than others.
I was looking at OpenCV in python but I'm very new to this kind of image processing.
The idea is to be able to save the cropping information separate from the image to save space and then generate all the cropping and different aspect ratio on the fly with a cache system instead.
The method you are looking for is called "template matching". You can find examples here:
https://docs.opencv.org/trunk/d4/dc6/tutorial_py_template_matching.html
For your problem, given the large images, it might be a good idea to constrain the search space by first downscaling both images by the same factor. The position found at low resolution isn't as precise, but it lets you restrict the full-resolution search to a small region around that point.
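To illustrate what template matching computes, here is a brute-force numpy sketch that slides the crop over the original and keeps the position with the smallest sum of squared differences. It assumes grayscale 2D arrays and a crop that was not rescaled; `locate_crop` and the random test image are made up for the example. In practice cv2.matchTemplate does this search far more efficiently, and JPEG recompression means the best match will be close to zero rather than exactly zero.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def locate_crop(original, crop):
    """Return (top, left) of the window in `original` that best matches `crop`."""
    # All windows of the crop's size: shape (H-h+1, W-w+1, h, w)
    windows = sliding_window_view(original, crop.shape)
    diff = windows.astype(float) - crop
    ssd = (diff ** 2).sum(axis=(2, 3))     # sum of squared differences per position
    row, col = np.unravel_index(np.argmin(ssd), ssd.shape)
    return int(row), int(col)

# Synthetic stand-in for an original image and one of its crops
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(40, 60))
crop = original[12:30, 25:55]

top, left = locate_crop(original, crop)
crop_info = (top, left, crop.shape[0], crop.shape[1])   # what you would store instead of the crop
```

This builds every window in memory at once, so it only suits small images. For large ones, the downscale-then-refine idea above keeps the search affordable.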

How to mosaic/bend/curve image with curvature in python?

I have an image that represents the elevation of some area. But the drone that made it didn't necessarily go in a straight line(although image is always rectangular). I also have gps coordinates generated every 20cm of the way.
How can I "bend" this rectangular image (curve/mosaic) so that it represents the curved path that the drone actually went through? (in python)
I haven't managed to write any code as I have no idea what is the name of this "warping" of the image. Please find the attached image as a wanted end state, and normal horizontal letters as a start state.
There might be a better answer, but I guess you could use the remapping functions of openCV for that.
The process would look like this:
1. From your data, get your warping function. This will be a function that maps (x,y) pixel coordinates in your input image I to (x,y) pixel coordinates in your output image O.
2. Compute the size needed in the output image to host your whole warped image, and create it.
3. Create two maps, mapx and mapy, which give the pixel coordinates in I for every pixel in O (that's, in a sense, the inverse of your warping function).
4. Apply the OpenCV remap function (which is better than simply applying your maps yourself, because it interpolates if the output image is larger than the input).
Depending on your warping function, it might be very simple, or close to impossible to apply this technique.
You can find an example with a super simple warping function here : https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/remap/remap.html
More complex examples can be looked at in OpenCV doc and code when looking at distortion and rectification of camera images.
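As a rough sketch of the map-building step, here is a numpy-only version with a made-up warping function: each column is shifted vertically by a sine-shaped offset. The function name `bend` and its parameters are hypothetical, and nearest-neighbour lookup stands in for the interpolation that cv2.remap would do. With real data, the offset would come from the GPS track instead of a sine.

```python
import numpy as np

def bend(image, amplitude, period):
    """Shift every column of `image` down by a sine-shaped offset (inverse mapping)."""
    h, w = image.shape[:2]
    out_h = h + 2 * amplitude                    # room for the largest shift
    ys, xs = np.mgrid[0:out_h, 0:w]              # coordinates of every output pixel

    # Inverse warp: for each output pixel, which input pixel does it come from?
    offset = amplitude + amplitude * np.sin(2 * np.pi * xs / period)
    map_x = xs
    map_y = np.rint(ys - offset).astype(int)     # nearest neighbour instead of interpolation

    valid = (map_y >= 0) & (map_y < h)           # output pixels that hit the input image
    out = np.zeros((out_h, w) + image.shape[2:], dtype=image.dtype)
    out[valid] = image[map_y[valid], map_x[valid]]
    return out

image = np.arange(1, 21).reshape(4, 5)           # tiny stand-in for the elevation image
bent = bend(image, amplitude=2, period=4)
```

The same map_x and map_y arrays, converted to float32 and without the rounding, are what you would hand to the remap function to get interpolated results.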

Determine The Orientation Of An Image

I am trying to determine the orientation of the following image. Each image is between 140x140 and 150x150 pixels and has no EXIF data. Is there a method to classify each image as 0, 90, 180 or 270 degrees, so that when I get an image in a particular orientation I can match it with my predefined images? I've looked into feature matching with opencv using the following tutorial, and it works correctly: it identifies the images as the same regardless of orientation, but I have no clue how to tell the orientations apart.
"I've looked into feature matching with opencv using the following tutorial, and it works correctly"
So you could establish a valid match between an image of unknown rotation and an image in your database? And the latter one is of a known rotation (i.e. upright)?
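In this case you can compute a transformation matrix:
either a homography, which defines a full planar transformation (use cv::findHomography),
or an affine transform, which expresses translation, rotation and scaling and thus seems best for your needs (use cv::estimateRigidTransform; with fullAffine=false the estimate is restricted to translation, rotation and uniform scaling, while fullAffine=true also allows shear and non-uniform scaling. In OpenCV 4 this function was removed in favor of cv::estimateAffinePartial2D and cv::estimateAffine2D). You can find more about affine transformations here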
If you don't have any known image then this task seems mathematically unsolvable but you could use something like an Artificial-Neural-Network-based heuristic which seems like a very research-intensive project.
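Once you have such an affine matrix, reading off the rotation is a short calculation. The sketch below assumes the 2x3 layout [[s·cos θ, -s·sin θ, tx], [s·sin θ, s·cos θ, ty]] of a rotation with uniform scale, and snaps the angle to the nearest quarter turn. Both function names are made up, and `similarity_matrix` only exists to fake an estimated matrix for the example. Whether a positive angle means clockwise or counter-clockwise on screen depends on the coordinate convention, so check it against one known pair.

```python
import numpy as np

def similarity_matrix(angle_deg, scale=1.0, tx=0.0, ty=0.0):
    """Build a 2x3 rotation + uniform scale + translation matrix (stand-in for an estimate)."""
    t = np.radians(angle_deg)
    return np.array([[scale * np.cos(t), -scale * np.sin(t), tx],
                     [scale * np.sin(t),  scale * np.cos(t), ty]])

def quarter_turn_from_affine(M):
    """Return the rotation encoded in M, snapped to 0, 90, 180 or 270 degrees."""
    angle = float(np.degrees(np.arctan2(M[1, 0], M[0, 0])))
    return (int(round(angle / 90.0)) % 4) * 90

# Slightly noisy estimates, as feature matching would produce
turn_a = quarter_turn_from_affine(similarity_matrix(92, scale=1.02, tx=5, ty=-3))
turn_b = quarter_turn_from_affine(similarity_matrix(-88, scale=0.97))
```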
If you have the random image somewhere (say, you're trying to match a certain image to a list of images you have), you could try taking the difference of your random image and your list of known images four times for each image, rotating the known image each time by 90 deg. Whichever one is closer to zero should be what you want.
If the image sizes of both your new image and the list of images are the same, you might also be able to just compare the keypoint distance differences (if the image is a match but all the keypoints are all rotated a quadrant clockwise from each other, then it's 90 deg off etc).
If you have no idea what that random image is supposed to be, I can't really think of any way to figure that out, unless you know for sure that a blob of light blue is supposed to be the sky. As far as I know, there's got to be something that you know to be up in order to determine what up is.
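The "take the difference four times" idea from above can be sketched in a few lines of numpy. This assumes both images are grayscale, already resized to the same square size, and show the same content. `best_rotation` is a made-up name, and np.rot90 turns counter-clockwise.

```python
import numpy as np

def best_rotation(unknown, reference):
    """Return the counter-clockwise rotation (degrees) of `reference` that best matches `unknown`."""
    errors = {}
    for k in range(4):
        candidate = np.rot90(reference, k)       # rotate by k * 90 degrees
        if candidate.shape != unknown.shape:     # only possible for non-square images
            continue
        errors[90 * k] = np.abs(candidate.astype(float) - unknown).mean()
    return min(errors, key=errors.get)

# Synthetic check: rotate a random "known" image and recover the rotation
rng = np.random.default_rng(1)
reference = rng.integers(0, 256, size=(8, 8))
unknown = np.rot90(reference, 3)

rotation = best_rotation(unknown, reference)
```

With real photos the smallest error will not be exactly zero, so it helps to compare it against the other three to see how clear the decision is.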

Eliminate unwanted keypoints

I would like to eliminate the keypoints detected around the frame of an image (an artwork in a museum gallery). In other words, I want to separate the actual artwork from its frame. Each artwork has a different type of frame.
[Image: keypoints detected using SIFT]
I have already written a Python wrapper for David Lowe's SIFT implementation to detect keypoints as well as to compute descriptors.
However, my question is: what is the best approach to solve this problem? Any of the following, or something else?
Using Hough transformation (using Python Image Library)
Template matching
Your help is highly appreciated
Thanks again
I'd go with Hough transform and try to detect lines which form a quadrilateral.
You might get into trouble if the painting actually does contain a square or something. I'd look for some assumptions like: acceptable aspect ratio, acceptable size. Also find the outermost quadrilateral, and work your way towards the center of the image picking up inner quadrilaterals, if applicable. This would give you the frame and its thickness, so you can disregard any keypoints here or beyond the frame.
P.S. If you got some random replies from me, it's because I accidentally replied to another post in your thread... ^^
For each artwork, do you have a clean, properly framed reference image?
If so another solution to remove the background clutter is:
to use the ratio test algorithm to compute keypoints correspondences between your frame and the reference image,
to perform a geometric consistency check to filter out false matches.
In addition the geometric check will provide you with the homography matrix that you can use to warp your input frame or alternatively to project the corners of the reference images.
That way you will natively obtain the artwork area within your frame.
Here's an example about how you can do that with opensift's match tool - below is an illustration.
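Assuming you already have the homography from the geometric check, the last step (projecting the reference corners and discarding keypoints outside them) could look roughly like this in numpy. The matrix, the reference size and the keypoints below are invented for the example, and both function names are hypothetical. The inside test assumes the projected quadrilateral is convex.

```python
import numpy as np

def project_points(H, points):
    """Map (x, y) points through a 3x3 homography."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def inside_convex_quad(points, quad):
    """Boolean mask of the points lying inside a convex quadrilateral (corners in order)."""
    pts = np.asarray(points, dtype=float)
    quad = np.asarray(quad, dtype=float)
    crosses = []
    for i in range(4):
        a, b = quad[i], quad[(i + 1) % 4]
        edge, rel = b - a, pts - a
        # Sign of the cross product tells on which side of the edge each point lies
        crosses.append(edge[0] * rel[:, 1] - edge[1] * rel[:, 0])
    crosses = np.array(crosses)
    # Inside means the same side of all four edges, whatever the winding order
    return np.all(crosses >= 0, axis=0) | np.all(crosses <= 0, axis=0)

# Made-up homography: the reference artwork appears at half size, shifted by (20, 10)
H = np.array([[0.5, 0.0, 20.0],
              [0.0, 0.5, 10.0],
              [0.0, 0.0, 1.0]])
reference_corners = [(0, 0), (100, 0), (100, 200), (0, 200)]   # 100x200 reference image
artwork_quad = project_points(H, reference_corners)

keypoints = np.array([(30, 50), (5, 5), (69, 109), (80, 50)])  # (x, y) in the input frame
mask = inside_convex_quad(keypoints, artwork_quad)
kept = keypoints[mask]                                         # keypoints on the artwork itself
```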
