I am using OpenCV's Optical Flow module. I understand the examples in the documentation but those take the entire image and then get the optical flow over the image.
I only want to pass it over some parts of an image. Is it possible to do that? If yes, how do I go about it?
Thanks!
Yes, it's possible. cv2.calcOpticalFlowPyrLK() is the optical flow function you need. Before you call it, you will have to create an image mask. I did a similar project in C++, but I can outline the steps for you:
Create an empty (all-black) single-channel matrix with the same width and height as your images.
Using the points from your ROI, create a shape (I did mine using cv2.fillPoly()) and fill the inside of the shape with white. Your image mask should consist only of black and white.
If you are planning on using corners as features, then call cv2.goodFeaturesToTrack() and pass in the mask you've made as one of its arguments.
If you're using the Feature2D module to detect features, you can use the same mask to only extract the features in that masked area.
By this step, you should now have a collection of features/points that are only within the bounds of the shape! Call the optical flow function and then process the results.
I hope that helps.
I couldn't find a perfect explanation for how getPerspectiveTransform and warpPerspective work in OpenCV, specifically in Python. My understanding of the methods is :
Given 4 points from a source image and 4 new points, getPerspectiveTransform returns a (3, 3) matrix that somehow crops the image when passed to warpPerspective. I thought that the 4 source points form a polygon that is cut out of the image, and that this cropped region is then fitted between the 4 new points. I also saw that warpPerspective takes the size of the output image. So I inferred that if the maximum width/height spanned by the new points (imagining them as corners of a rectangle or quadrilateral) is smaller than the provided width/height, the remaining area is left blank, i.e. black or white. But that wasn't the case: the remaining space is filled with part of the source image, namely the area outside the 4 source points.
I wasn't able to comprehend this behavior...
So am I interpreting the methods incorrectly? If so, please provide the correct interpretation of these methods.
PS. I'm pretty new to OpenCV, and it would be great if someone could explain the underlying math used by getPerspectiveTransform and warpPerspective.
Thanks in advance.
These functions are parts of an image processing concept called Geometric transformations.
When taking a picture in real life, there is always some sort of geometric distortion which can be removed using Geometric transformations. It has other applications too, including construction of mosaics, geographical mapping, stereo and video.
Here's an example from this site :
So basically, warpPerspective transforms the source image into the desired version of it, and it does the job using a 3x3 transformation matrix given by getPerspectiveTransform.
See more details here.
Now, if you wonder how to find those 4 pairs of points in the source and destination images, you should check another image processing concept called feature extraction. These are methods that find distinctive regions of an image, which you can then match to another image of the same object taken from a different view (check SIFT, SURF, ORB, etc.).
An example of matched features:
So warpPerspective won't just crop your image; it will transform the whole image (not just the region specified by the 4 points) based on the transformation matrix, and those points are only used to find the correct matrix.
I hope you're all doing well!
I'm new to image manipulation, so I want to apologize in advance for my simple question. I'm currently working on a problem that involves classifying an object called a jet into two known categories. This object is made of sub-objects. My idea is to use these sub-objects to turn each jet into a pixel image, and then apply convolutional neural networks to find the patterns.
Here is an example of the pixel images:
jet's constituents pixel distribution
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is in the vertical direction, as well as make sure that the most intense pixel is at the top. It also would be good to impose that one of the sides (left or right) of the image contains the majority of the intensity and to normalize the intensity of the whole image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here: https://arxiv.org/abs/1407.5675
You can look into OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html.
It supports a lot of image processing functions.
In your case, it would probably be easier to convert the image into a color space in which one axis represents intensity (e.g. HSI, HSL, HSV) and then find the indices of the maximum values along this axis (this should give you the pixels with the highest intensity in the image).
Generally, in Python, we use the PIL library for basic image manipulation and OpenCV for more advanced operations.
But, if I understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a numpy array (numpy.ndarray) called img, you can find the maximum value along the desired axis just by writing:
img.max(axis=0)
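To normalize the image so that its maximum value is 1, you can use the following (the division returns a float array, so it also works for integer images; divide by img.sum() instead if the total intensity should be 1):
img = img / img.max()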
To find which image part is brighter, you can split an img array into desired parts and calculate their mean:
left = img[:, :img.shape[1] // 2]
right = img[:, img.shape[1] // 2:]
left_mean = left.mean()
right_mean = right.mean()
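Putting these pieces together for the standardization described in the question, a numpy-only sketch could look like this. The function name standardize_jet is made up, the image is assumed to be a 2-D array of non-negative intensities, and the actual rotation is left out: the function only reports the angle you would rotate by (with any rotation routine you like) and uses flips for the "brightest on top" and "heavier side" conditions.

```python
import numpy as np


def standardize_jet(img):
    """Return (standardized image, tilt angle in degrees).

    img: 2-D array of non-negative pixel intensities (assumption).
    """
    img = np.asarray(img, dtype=float)

    # Normalize the total intensity of the image to 1.
    total = img.sum()
    if total > 0:
        img = img / total

    # Row/column of the most intense and second most intense pixels.
    order = np.argsort(img, axis=None)
    r1, c1 = np.unravel_index(order[-1], img.shape)
    r2, c2 = np.unravel_index(order[-2], img.shape)

    # Angle of the line from the brightest to the second brightest pixel,
    # measured from the downward vertical. 0 means the axis is already
    # vertical with the brightest pixel on top.
    angle_deg = float(np.degrees(np.arctan2(c2 - c1, r2 - r1)))

    # Flip vertically if the brightest pixel is below the second brightest.
    if r1 > r2:
        img = img[::-1, :]

    # Flip horizontally so that the right half carries at least as much
    # intensity as the left half (middle column ignored for odd widths).
    half = img.shape[1] // 2
    if img[:, :half].sum() > img[:, img.shape[1] - half:].sum():
        img = img[:, ::-1]

    return img, angle_deg
```

Whether the majority of the intensity should end up on the left or on the right is an arbitrary choice here; what matters for the network is that it is the same for every image.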
I have an image that represents the elevation of some area. The drone that captured it didn't necessarily fly in a straight line (although the image is always rectangular). I also have GPS coordinates recorded every 20 cm along the way.
How can I "bend" this rectangular image (curve/mosaic) in Python so that it represents the curved path that the drone actually followed?
I haven't managed to write any code, as I have no idea what this kind of image "warping" is called. Please see the attached image for the desired end state; the start state is normal horizontal letters.
There might be a better answer, but I guess you could use OpenCV's remapping functions for that.
The process would look like this:
From your data, get your warping function. This is a function that maps (x, y) pixel coordinates in your input image I to (x, y) pixel coordinates in your output image O.
Compute the size the output image needs in order to hold your whole warped image, and create it.
Create two maps, mapx and mapy, which give the pixel coordinates in I for every pixel in O (that is, in a sense, the inverse of your warping function).
Apply OpenCV's remap function (which is better than simply copying pixels according to your maps, because it interpolates, e.g. if the output image is larger than the input).
Depending on your warping function, it might be very simple, or close to impossible to apply this technique.
You can find an example with a super simple warping function here : https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/remap/remap.html
More complex examples can be looked at in OpenCV doc and code when looking at distortion and rectification of camera images.
I've been working on a project to recognize a flag shown to the camera, using OpenCV in Python.
I've already tried SURF, color histogram matching, and template matching, but none of these three always returns the correct answer. What I want to know now is what the best solution to this problem would be.
Example of the template images:
Here is an example of a flag shown to the camera.
What should I use if these are the kind of images I want to recognize?
Update: code using matchTemplate
import operator

import cv2

flags = ["Cambodia.jpg", "Laos.jpg", "Malaysia.jpg", "Myanmar.jpg", "Philippines.jpg",
         "Singapore.jpg", "Thailand.jpg", "Vietnam.jpg", "Indonesia.jpg", "Brunei.jpg"]
method = cv2.TM_CCOEFF_NORMED
while True:
    list_of_pics = []
    for flag in flags:
        template = cv2.imread(flag, 0)  # 0 = load as grayscale
        img = cv2.imread('philippines2.jpg', 0)
        # generate Gaussian pyramid for the template
        G = template.copy()
        gpA = [G]
        for i in range(6):
            G = cv2.pyrDown(G)
            gpA.append(G)
        for x in gpA:
            # apply template matching at this pyramid level
            res = cv2.matchTemplate(img, x, method)
            # note: res[0][0] is only the score at the top-left position of img
            matchVal = res[0][0]
            picDict = {"matchVal": matchVal, "name": flag}
            list_of_pics.append(picDict)
    newlist = sorted(list_of_pics, key=operator.itemgetter('matchVal'), reverse=True)
    # print(newlist)
    matched_image = newlist[0]['name']
    print(matched_image)
    k = cv2.waitKey(10)
    if k == 27:  # Esc
        break
cv2.destroyAllWindows()
I don't think you can get good results from SURF/SIFT because:
SURF/SIFT need keypoints to detect the object, but in your case you have to detect flags, and most flags are largely uniform and do not provide many keypoints.
Your webcam frame contains several other things besides the flag, and those also contribute keypoints.
Solution: I still think you should use OpenCV's matchTemplate(), which you have already tried. The problem in your version is that you didn't take into account that matchTemplate() is not scale or orientation invariant. So the solution is to use a Gaussian pyramid to create different sizes (half, one fourth, double, etc.) of your sample flags. Once you have the same flag in 2-5 different sizes, run matchTemplate() between every size of the flag and the webcam frame.
Strategy:
Receive the webcam frame
Load the image of a flag.
Using Gaussian pyramid, create smaller and bigger images of that flag (you don't need to store them.)
Perform matchTemplate() between the webcam frame and each size of flag.
Result: whichever flag image gives the maximum correlation value is the flag present in your webcam frame.
REMEMBER: matchTemplate is not scale or orientation invariant, so if you rotate the flag or make it larger/smaller in the webcam frame, you won't get good results.
SURF does not work well on images that have no corners (when the gradient mostly goes in one direction, as in a striped flag). A color histogram of the whole object may not work, since both of your examples have similar colors. However, if you apply a histogram to different parts of the image, it will work better.
What you need to do is split your training image into, say, 4 quadrants and create 4 color histograms. The testing stage will integrate these 4 back-projected histograms and check for the right spatial order of responses. A color histogram is quite robust to rotation, scaling and perspective. It changes with illumination, so you need liberal matching thresholds. The spatial information from the 4 quadrants will help to ameliorate this situation.
For the future I recommend studying methods in more detail to understand their applicability rather than trying them randomly.
I would like to eliminate the keypoints detected around the frame of an image (an artwork in a museum gallery). In other words, I want to separate the actual artwork from its frame. Each artwork has a different type of frame.
Keypoints detected using SIFT
I have already written a Python wrapper for David Lowe's SIFT implementation to detect keypoints as well as to compute descriptors.
However, my question is: what is the best approach to solve this problem? Any of the following, or something else?
Using the Hough transform (with the Python Imaging Library)
Template matching
Your help is highly appreciated
Thanks again
I'd go with Hough transform and try to detect lines which form a quadrilateral.
You might get into trouble if the painting actually does contain a square or something. I'd look for some assumptions like: acceptable aspect ratio, acceptable size. Also find the outermost quadrilateral, and work your way towards the center of the image picking up inner quadrilaterals, if applicable. This would give you the frame and its thickness, so you can disregard any keypoints here or beyond the frame.
P.S. If you got some random replies from me, it's because I accidentally replied to another post in your thread... ^^
For each artwork, do you have a clean, properly framed reference image?
If so another solution to remove the background clutter is:
to use the ratio test algorithm to compute keypoints correspondences between your frame and the reference image,
to perform a geometric consistency check to filter out false matches.
In addition the geometric check will provide you with the homography matrix that you can use to warp your input frame or alternatively to project the corners of the reference images.
That way you will natively obtain the artwork area within your frame.
Here's an example about how you can do that with opensift's match tool - below is an illustration.