I have the following image, from which I have calculated the dimensions using a few preprocessing steps: grayscaling, noise removal, dilation, corner detection (cornerHarris), and then the distance formula to get the dimensions. Complete info regarding it can be seen in the previous question. I know that the dimensions are not correct because I need a reference object to scale the box. But you can see that the image is of a box, so logically the lengths 42, 54 and 49 should be equal; they are not, because the perspective from which the photo was taken makes the far edges appear smaller, so they return a smaller distance. Am I doing it wrong, or are there any better ideas to get those dimensions so that they come out equal? Thanks.
I am trying to find the roughness of shapes in an image. I have found the contours and used simplification.cutil.simplify_coords_vwp(contour, 1000) to calculate a polygon that I want to use as the "smooth" shape (this is similar to the more commonly used Douglas-Peucker algorithm). These shapes have ~13 points each, giving leeway to any bends. This is done for all of the shapes present in the full image.
The images below show the full image I want the roughness of and a zoomed-in image showing what I am trying to calculate. I want to quantify the black inside the lines and the white outside the lines, giving a number for the amount of roughness. I didn't post my code because it would be a lot of extra information; I am only looking for help conceptualizing what modules could be useful here.
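To make the idea concrete, this is roughly the kind of measure I am picturing (only a rough sketch, not code from my project; the helper name and the area-ratio definition are placeholders I made up):

import cv2
import numpy as np
from simplification.cutil import simplify_coords_vwp

def roughness(contour, image_shape, epsilon=1000):
    # Simplify the raw contour into the "smooth" polygon
    pts = contour.reshape(-1, 2).astype(float)
    smooth = np.array(simplify_coords_vwp(pts, epsilon), dtype=np.int32)
    # Fill both the raw contour and the simplified polygon as masks
    raw_mask = np.zeros(image_shape, dtype=np.uint8)
    smooth_mask = np.zeros(image_shape, dtype=np.uint8)
    cv2.drawContours(raw_mask, [contour], -1, 255, thickness=-1)
    cv2.fillPoly(smooth_mask, [smooth], 255)
    # Pixels inside one shape but not the other are the "roughness"
    mismatch = cv2.bitwise_xor(raw_mask, smooth_mask)
    return cv2.countNonZero(mismatch) / max(cv2.countNonZero(smooth_mask), 1)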
I couldn't find a perfect explanation for how getPerspectiveTransform and warpPerspective work in OpenCV, specifically in Python. My understanding of the methods is:
Given 4 points from a source image and 4 new points, getPerspectiveTransform returns a (3, 3) matrix that somehow crops the image when passed to warpPerspective as an argument. I thought that the 4 source points form a quadrilateral on the image, which is then cut out, and this cropped region is fitted between the 4 new points. I also saw that warpPerspective takes the size of the output image. So I inferred that if the max height/width calculated from the new points (imagining the points as the corners of a rectangle or quadrilateral) is less than the provided width or height, the remaining area would be left blank, i.e. black or white. But this wasn't the case: if the width/height calculated from the new points is less than the provided width and height, the remaining space is filled with some part of the source image, essentially the region outside the 4 source points.
I wasn't able to comprehend this behavior...
So am I interpreting the methods incorrectly? If so, please provide the correct interpretation of these methods.
PS. I'm pretty new to OpenCV, and it would be great if someone could explain the underlying math used by getPerspectiveTransform and warpPerspective.
Thanks in advance.
These functions are part of an image-processing concept called geometric transformations.
When taking a picture in real life, there is always some sort of geometric distortion, which can be removed using geometric transformations. They have other applications too, including construction of mosaics, geographical mapping, stereo, and video.
Here's an example from this site:
So basically, warpPerspective transforms the source image into the desired version of it, and it does the job using a 3x3 transformation matrix given by getPerspectiveTransform.
See more details here.
Now if you wonder how to find that pair of 4 points in the source and destination images, you should check another image-processing concept called feature extraction. These are methods that find distinctive regions of an image, which you can match against another image of the same object taken from a different view (check SIFT, SURF, ORB, etc.).
An example of matched features:
So warpPerspective won't just crop your image; it will transform the whole image (not just the region specified by the 4 points) based on the transformation matrix, and those points are only used to find the correct matrix.
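As for the underlying math: getPerspectiveTransform solves for the 3x3 homography H such that, in homogeneous coordinates, [x', y', w']^T = H [x, y, 1]^T, and each source point maps to (x'/w', y'/w'); warpPerspective then applies H to every pixel. A minimal sketch of the usual usage (the file name and coordinates below are just illustrative):

import cv2
import numpy as np

img = cv2.imread("document.jpg")  # any source image

# Four corners of the region of interest in the source image
src = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
# Where those corners should land in the output image
dst = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])

# 3x3 homography mapping src onto dst
M = cv2.getPerspectiveTransform(src, dst)

# The whole image is warped with M; the output size only decides how much
# of the warped plane you see, so anything beyond the dst quadrilateral is
# other (warped) parts of the source, not blank space
warped = cv2.warpPerspective(img, M, (300, 300))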
I hope you're all doing well!
I'm new to image manipulation, so I want to apologize right here for my simple question. I'm currently working on a problem that involves classifying an object called a jet into two known categories. This object is made of sub-objects. My idea is to use these sub-objects to transform each jet into a pixel image, and then apply convolutional neural networks to find the patterns.
Here is an example of the pixel images:
jet's constituents pixel distribution
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is in the vertical direction, as well as make sure that the most intense pixel is at the top. It also would be good to impose that one of the sides (left or right) of the image contains the majority of the intensity and to normalize the intensity of the whole image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here: https://arxiv.org/abs/1407.5675
You can look into OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html.
It supports a lot of image processing functions.
In your case, it would probably be easier to convert the image into a more suitable color space in which one axis stands for intensity (e.g. HSI, HSL, HSV) and then find the indices of the maximum values along that axis (this should return the pixels with the highest intensity in the image).
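For example, a minimal sketch of that idea with OpenCV and NumPy, finding the two brightest pixels via the V channel of HSV (the file name is just a placeholder):

import cv2
import numpy as np

img = cv2.imread("jet_image.png")            # placeholder input
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
value = hsv[:, :, 2]                         # V channel = intensity

# (row, col) coordinates of the two most intense pixels
flat_idx = np.argsort(value, axis=None)[-2:]
brightest_two = np.column_stack(np.unravel_index(flat_idx, value.shape))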
Generally, in Python, we use the PIL library for basic manipulations with images and OpenCV for advanced ones.
But, if I understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a variable of type numpy.array called img, you can find maximum value along the desired axis just by writing:
img.max(axis=0)
To normalize image you can use:
img /= img.max()
To find which part of the image is brighter, you can split the img array into the desired parts and calculate their means:
mid = img.shape[1] // 2      # column index of the vertical midline
left = img[:, :mid, :]       # left half of the image
right = img[:, mid:, :]      # right half of the image
left_mean = left.mean()
right_mean = right.mean()
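If you also want the rotation step you describe (aligning the axis through the two brightest pixels with the vertical, brightest on top), a rough sketch with OpenCV could look like the following; it assumes a single-channel image, and the sign convention may need flipping for your data:

import cv2
import numpy as np

def standardize(img):
    img = img.astype(np.float32)
    # (row, col) of the brightest and second-brightest pixels
    order = np.argsort(img, axis=None)
    r1, c1 = np.unravel_index(order[-1], img.shape)
    r2, c2 = np.unravel_index(order[-2], img.shape)
    # angle of the axis joining them, in image coordinates (y pointing down)
    angle = np.degrees(np.arctan2(r2 - r1, c2 - c1))
    # rotate so that axis becomes vertical with the brightest pixel on top
    center = (img.shape[1] / 2, img.shape[0] / 2)
    M = cv2.getRotationMatrix2D(center, angle - 90, 1.0)
    rotated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    return rotated / rotated.max()   # normalize intensities to [0, 1]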
I have a sample of two-dimensional, black-and-white/binary barcodes that have been photographed.
The (colour) photographs typically suffer from all the usual suspects: blurring, distortion, contrast issues/lighting gradients, and erosion.
I am trying to reconstruct the original barcodes, which were once computer-generated pixel arrays of black/white values.
We should be able to exploit the images' spatial-frequency information to infer the dimensions of each pixel. The hope is to use this to better restore the original by convolving the image with a structuring element derived from the data.
Although this is a very broad topic, I therefore have a very specific question:
What is the best way to establish a structuring element from image data in OpenCV/Python, without using prior knowledge of it?
(Assume for now that the underlying pixel scale is to some good approximation spatially invariant)
Note that I am not trying to execute the whole extraction pipeline: this question is simply about inferring an optimal structuring element from the data.
For example, the spatial kernel could be used as input to an unsharp mask, a la Python unsharp mask
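For concreteness, this is roughly the kind of estimate I have in mind (only a sketch; whether the spectral peak is clean enough on real photographs is exactly what I am unsure about, and the file name is a placeholder):

import cv2
import numpy as np

gray = cv2.imread("barcode_photo.jpg", cv2.IMREAD_GRAYSCALE)

# Average the 2-D power spectrum along the rows and look for the dominant
# horizontal spatial frequency; its period is an estimate of the pixel pitch
spectrum = np.abs(np.fft.fft2(gray - gray.mean())) ** 2
profile = spectrum.mean(axis=0)[1:gray.shape[1] // 2]   # skip the DC term
peak_bin = np.argmax(profile) + 1
pitch = int(round(gray.shape[1] / peak_bin))            # estimated pixel size

# Structuring element matched to that pitch
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (pitch, pitch))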
References:
(1-D ideas) http://answers.opencv.org/question/174384/how-to-reconstruct-damaged-barcode, http://www.windytan.com/2016/02/barcode-recovery-using-priori.html
(Similar idea) Finding CheckerBoard Points in opencv for any random ChessBoard( pattern size not known)
(Sort of but not really, and answer-less) OpenCV find image frequencies
(Broad) https://en.wikipedia.org/wiki/Chessboard_detection
One way of doing this is:
Compute the Scharr gradient magnitude representations in both the x and y direction.
Subtract the y-gradient from the x-gradient. By performing this subtraction we are left with regions of the image that have high horizontal gradients and low vertical gradients.
Blur and threshold the image to filter out the noise.
Apply a closing kernel to the thresholded image to close the gaps between vertical stripes of the barcode.
Perform a series of dilations and erosions.
Find the largest contour in the image, which is now presumably the barcode.
More details and complete code can be found in this PyImageSearch blog post.
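A condensed sketch of those steps (the blur size, threshold and kernel dimensions are assumptions, not values taken from the post):

import cv2
import numpy as np

image = cv2.imread("barcode.jpg")                 # placeholder input
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Scharr gradient magnitudes (ksize=-1 selects the Scharr kernel)
grad_x = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=-1)
grad_y = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=0, dy=1, ksize=-1)

# Keep regions with strong horizontal and weak vertical gradients
gradient = cv2.convertScaleAbs(cv2.subtract(grad_x, grad_y))

# Blur and threshold to filter out noise
blurred = cv2.blur(gradient, (9, 9))
_, thresh = cv2.threshold(blurred, 225, 255, cv2.THRESH_BINARY)

# Close the gaps between the vertical bars, then clean up
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (21, 7))
closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
closed = cv2.erode(closed, None, iterations=4)
closed = cv2.dilate(closed, None, iterations=4)

# The largest contour is presumably the barcode (OpenCV 4 return signature)
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
barcode = max(contours, key=cv2.contourArea)
box = cv2.minAreaRect(barcode)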
I am trying to determine the orientation of the following image. Given an image at random, between 140x140 and 150x150 pixels, with no EXIF data: is there a method to classify each image as 0, 90, 180 or 270 degrees, so that when I get an image of a particular orientation I can match it against my predefined images? I've looked into feature matching with opencv using the following tutorial, and it works correctly. It identifies the images as the same no matter their orientation, but I have no clue how to tell the orientations apart.
I've looked into feature matching with opencv using the following tutorial, and it works correctly
So you could establish a valid match between an image of unknown rotation and an image in your database? And the latter one is of a known rotation (i.e. upright)?
In this case you can compute a transformation matrix:
either a homography which defines a full planar transformation (use cv::findHomography)
or an affine transform, which expresses translation, rotation and scaling and thus seems best for your needs (use cv::estimateRigidTransform with fullAffine=true; in recent OpenCV versions, cv::estimateAffine2D). You can find more about affine transformations here.
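A minimal Python sketch of the affine route, assuming you already have the matched keypoint coordinates as Nx2 float32 arrays (the function and argument names below are placeholders):

import cv2
import numpy as np

def estimate_rotation(pts_unknown, pts_reference):
    # 2x3 affine matrix mapping the unknown-orientation image onto the reference
    M, inliers = cv2.estimateAffine2D(pts_unknown, pts_reference)
    # The 2x2 part encodes rotation and scale; recover the rotation angle
    angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))
    # Snap to the nearest quarter turn for the 0/90/180/270 case
    return int(round(angle / 90.0)) * 90 % 360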
If you don't have any known image, then this task seems mathematically unsolvable, but you could use something like an artificial-neural-network-based heuristic, which sounds like a very research-intensive project.
If you have the random image somewhere (say, you're trying to match a certain image to a list of images you have), you could try taking the difference of your random image and your list of known images four times for each image, rotating the known image each time by 90 deg. Whichever one is closer to zero should be what you want.
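A sketch of that comparison, assuming both images are square, the same size and already grayscale (np.rot90 changes the shape of non-square images, so they would need resizing first):

import numpy as np

def best_rotation(unknown, known):
    # Pixel-wise absolute difference for each quarter-turn of the known image
    diffs = [np.abs(unknown.astype(int) - np.rot90(known, k).astype(int)).sum()
             for k in range(4)]
    # The quarter-turn with the smallest difference is the likely orientation
    return 90 * int(np.argmin(diffs))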
If the image sizes of both your new image and the list of images are the same, you might also be able to just compare the keypoint position differences (if the image is a match but all the keypoints are rotated a quarter turn clockwise relative to each other, then it's 90 degrees off, etc.).
If you have no idea what that random image is supposed to be, I can't really think of any way to figure that out, unless you know for sure that a blob of light blue is supposed to be the sky. As far as I know, there's got to be something that you know to be up in order to determine what up is.