I am writing a function that applies an affine transformation to an input image. The function first finds the six affine transformation parameters (a 6x1 vector), then applies these parameters to all image coordinates. The new coordinates I obtain have float values. To create a new image from these coordinates, I converted the new coordinates to integers. I made the color assignment for the new image as follows:
Let's say an input image coordinate is (i, j), the digital number value at this pixel is (0, 0, 0), and (i, j) maps to (m, k) in the output image. Then I say the digital value of (m, k) is (0, 0, 0).
I read about forward warping and there is one point I didn't understand. As I said before, I converted the new coordinates from float to integers. Can this be done in forward warping?
Please help me...
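For what it's worth, a minimal sketch of the forward-warping procedure described above (the parameter layout and variable names are my assumptions, not the poster's code) is shown below; rounding the transformed float coordinates to the nearest integer pixel is exactly what a basic forward warp does:

    import numpy as np

    def forward_warp(image, params):
        # params = (a, b, c, d, e, f): the six affine parameters (assumed layout)
        a, b, c, d, e, f = params
        h, w = image.shape[:2]
        out = np.zeros_like(image)
        for i in range(h):
            for j in range(w):
                # apply the affine transform to the source coordinate (i, j)
                m = a * i + b * j + c
                k = d * i + e * j + f
                # round the float target coordinates to the nearest integer pixel
                m, k = int(round(m)), int(round(k))
                # copy the colour if the target falls inside the output image
                if 0 <= m < h and 0 <= k < w:
                    out[m, k] = image[i, j]
        return out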
I'm looking for rectangular areas in a non-negative two-dimensional float array in numpy where the values differ from the value at the center of the area by less than x. The data I'm analysing is the output of a depth estimation function, and I want to pick out areas whose values are close to each other (which could, for example, be part of a wall, or objects that are vertical and facing the camera).
For example, in the image below you can see the output of the depth estimation function, where each pixel represents a distance between 0 and 500 cm. Any area in which the variation of the values is less than some threshold indicates that the object is vertical, and these are the areas I am looking for:
https://drive.google.com/file/d/1Z2Bsi5ZNoo4pFU6N188leq56vGHFfvcd/view?usp=sharing
The code I am working on is related to MiDaS; I have added my own code at the end, and it is available at the following link:
https://colab.research.google.com/drive/1vsfukFqOOZZjTajySM8hL0VNCvOxaa6X?usp=sharing
Now, for example, I'm looking for areas like the piece of paper stuck behind the chair in the picture below:
https://drive.google.com/file/d/1ui99gpU2i0JFumLivLpoEyHn3QTcfD8a/view?usp=sharing
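As a rough sketch of that search (not part of the original post; the window size, step, and tolerance below are arbitrary assumptions), one simple approach is a sliding window that keeps every window whose depth values all stay within a tolerance of the window's center value:

    import numpy as np

    def find_flat_windows(depth, win=40, step=20, tol=10.0):
        # depth: (H, W) float array of distances in cm (e.g. 0-500)
        # returns the top-left corners of win x win windows whose values
        # all lie within tol of the value at the window's center
        h, w = depth.shape
        hits = []
        for r in range(0, h - win + 1, step):
            for c in range(0, w - win + 1, step):
                patch = depth[r:r + win, c:c + win]
                center = patch[win // 2, win // 2]
                if np.all(np.abs(patch - center) < tol):
                    hits.append((r, c))
        return hits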
I'm trying to reconstruct the trajectory of a camera by extracting features from images and computing the camera pose; however, the trajectory is not what I expected. For my visual odometry system I am using an Intel D435i camera, which provides color and depth frames. My code is written in Python and all the important functions used are from the OpenCV library.
I am using a dataset where the camera just moves straight forward, and therefore I expect the rotation to be the identity matrix and the translation to go forward. However, this isn't what I get when I compute them.
This is my approach:
Read the color frames, convert them to grayscale and apply the CLAHE method to reduce lighting contrast
Extract features using SIFT from frame t and frame t+1
Match them using FLANN and apply Lowe's ratio test to filter outliers
Transform the 2D pixel coordinates (u1p, v1p) of the features to 3D world coordinates using the camera intrinsic matrix (K) and the depth s of the pixels (acquired from the depth frames); see the sketch after this list:
Normalization: x_y_norm = K^-1 * [u1p, v1p, 1]
Scaling: x_y_norm_s = s * x_y_norm
Afterwards I try two methods: PNP and essential matrix
A) PNP:
4. Use the function cv2.solvePnPRansac() to get the rotation vector and the translation vector. The inputs of the function are the normalized and scaled keypoints of frame t (object points) and the normalized keypoints of frame t+1 (image points). The rotation matrix is then computed with the function cv2.Rodrigues()
B) Essential matrix:
4. Use cv2.findEssentialMat() with the normalized x-y-coordinates of the keypoints as input. Check if the essential matrix is computed correctly by checking the singular values. Use cv2.recoverPose() to get the rotation and translation (Here I also used the normalized x-y-coordinates as input).
Compute the trajectory as follows (assuming the second camera_pose is [R|t] and the first pose is the identity matrix as stated in the book multiple view geometry):
camera_pose[t+1] = pose[t] * pose[t+1]^-1 (Get pose from initial position by multiplying previous and current pose as stated in Combining relative camera rotations & translations with known pose )
position = pose[t+1] * [0,0,0,1] (Position calculated from the origin)
trajectory[:,t+1] = position[0:3]
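A minimal sketch of the back-projection and pose-chaining steps above (variable names, array shapes, and the distortion argument are my assumptions, not the original code) could look like this; note that in the usual cv2.solvePnPRansac() call the image points are pixel coordinates, with the camera matrix K passed alongside:

    import numpy as np
    import cv2

    K_inv = np.linalg.inv(K)  # K: 3x3 camera intrinsic matrix (assumed given)

    def backproject(u, v, s):
        # normalize a pixel (u, v) with K^-1 and scale it by its depth s
        x_y_norm = K_inv @ np.array([u, v, 1.0])
        return s * x_y_norm  # 3D point in the camera frame of frame t

    # PnP variant: object_pts (N, 3) float32 from backproject() at frame t,
    # image_pts (N, 2) float32 pixel coordinates of the matches at frame t+1
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(object_pts, image_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)

    # chain the relative poses (4x4 homogeneous matrices) to get the trajectory
    rel_pose = np.eye(4)
    rel_pose[:3, :3] = R
    rel_pose[:3, 3] = tvec.ravel()
    cur_pose = prev_pose @ np.linalg.inv(rel_pose)   # pose[t+1] = pose[t] * rel^-1
    position = cur_pose @ np.array([0.0, 0.0, 0.0, 1.0])
    trajectory[:, t + 1] = position[:3]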
I hope someone can help me, since I don't know where my error is; neither the PnP method nor the essential matrix method works.
Edit: For both methods I added a picture of the trajectory (in the Z-X and the X-Y plane) that I get from the first 15 frames of my dataset, and an extract of a text file containing the rotation matrix and the translation vector for some frames. The Z-X plane should show dots forming a line in the positive direction, since my dataset just shows images where the camera moves straight forward; however, a random trajectory is shown. The green dots are old camera poses and the red dot is the current one. I also added an example of the features extracted in the first frame and matched with the second frame; they seem to be alright.
I am new to OpenCV and image-related geometry, but I am now working on some image processing tasks.
Here is what I did:
Given pts_src and pts_dst, I warped the entire image with cv2.findHomography() and cv2.warpPerspective().
cv2.findHomography() provided me with a 3x3 homography matrix.
cv2.warpPerspective() provided me with a warped image, and I can detect some feature points in this warped image.
However, I need to map the feature point coordinates back to the original input image.
Could someone please tell me how to achieve that?
Thanks
I think that the inverse of the homography matrix will be the matrix that maps points from pts_dst back to pts_src.
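A short sketch of that idea, assuming H is the 3x3 matrix returned by cv2.findHomography() and pts is an (N, 2) float array of feature points detected in the warped image (both names are placeholders):

    import numpy as np
    import cv2

    H_inv = np.linalg.inv(H)  # maps warped-image coordinates back to the original

    # cv2.perspectiveTransform expects shape (N, 1, 2) and performs the
    # projective mapping including the division by the third (w) component
    pts_in_original = cv2.perspectiveTransform(
        pts.reshape(-1, 1, 2).astype(np.float64), H_inv
    ).reshape(-1, 2)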
As you said in a comment and as shown here, it seems that your 3rd component is, indeed, a scaling coefficient w′ of the resulting vector:
(x, y) → (x′/w′, y′/w′)
where (x′, y′, w′) = mat ⋅ [x, y, 1]
(...) In case of a 2D vector transformation, the z component is omitted.
So you could rescale it by doing (V[0]/V[2], V[1]/V[2]), just like you mentioned in your comment.
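For example, applying the matrix to a single point and dividing by the third component could look like this (mat, x, y and V are placeholder names):

    import numpy as np

    # mat: the 3x3 matrix, (x, y): the point to transform
    V = mat @ np.array([x, y, 1.0])          # V = (x', y', w')
    x_out, y_out = V[0] / V[2], V[1] / V[2]  # divide by the scaling component w'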
Also, I think this scaling factor has something to do with how the matrix is generated, so please check the details of the matrix as well, as the official documentation of cv2.findHomography() says:
Homography matrix is determined up to a scale. Thus, it is normalized so that h33=1.
Hope this helps!
I've got two image matrices. Both contain radar information about the same area, with different units and values, but the two circles do not line up with each other; one looks as if it is zoomed in. I would like to do some element-wise operations on these matrices.
Let's say there is a value in matrix A located at A[100, 50], while its element-wise associated value sits at B[95, 45]. However, I still don't know which value in one matrix is associated with which value in the other; this is only an example.
Matrix A's shape is (720, 720): a circle with values inside it.
Matrix B's shape is the same, but the circle is a little bit zoomed in, and its edges are not completely round in some areas because they do not fit inside the square bounds of the matrix.
How can I reshape (interpolate?) one matrix onto the other?
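One possible sketch (not from the post above): if you can pick a few corresponding control points in both matrices, for example the circle center and a few points on its rim, you can estimate an affine transform and resample B onto A's grid; the point coordinates below are made-up placeholders:

    import numpy as np
    import cv2

    # corresponding (x, y) = (column, row) control points picked by hand
    pts_a = np.float32([[360, 360], [360, 60], [660, 360]])   # in matrix A
    pts_b = np.float32([[360, 360], [360, 20], [700, 360]])   # same spots in B

    # estimate the scale/rotation/shift that maps B's coordinates onto A's
    M, _ = cv2.estimateAffinePartial2D(pts_b, pts_a)

    # resample B onto A's grid with bilinear interpolation
    B_on_A = cv2.warpAffine(B.astype(np.float32), M,
                            (A.shape[1], A.shape[0]), flags=cv2.INTER_LINEAR)

    # now A[i, j] and B_on_A[i, j] refer to the same physical location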
For a class, I need to transform an RGB image into YIQ.
We have been told that the conversion can be made by:
I started to write messy code with loops to do the matrix multiplication, and then I found the function
skimage.color.rgb2yiq(imRGB)
and when I looked inside to see what it was doing, I saw the following (I'm copying the code so it will be clearer):
    yiq_from_rgb = np.array([[0.299,      0.587,       0.114],
                             [0.59590059, -0.27455667, -0.32134392],
                             [0.21153661, -0.52273617,  0.31119955]])
    return np.dot(arr, yiq_from_rgb.T.copy())
where arr is just the RGB picture as a matrix.
I'm trying to understand why this works. Why do they take the transpose of the matrix (.T)?
And how exactly does the dot product work when the shape of arr is different from that of yiq_from_rgb?
In your reference figure containing the matrix for the conversion, the transformation matrix is on the left of the RGB channels. So, for the first pixel in your RGB image, let's call it (p1r, p1g, p1b) corresponding to R, G, B channels respectively, we need to multiply with the transformation matrix and sum the results like:
y1y = (0.299*p1r + 0.587*p1g + 0.114*p1b)
y1i = (0.596*p1r - 0.275*p1g - 0.321*p1b)
y1q = (0.212*p1r - 0.523*p1g + 0.311*p1b)
where (y1y, y1i, y1q) is the value of the first pixel in the resulting YIQ image, after rounding/taking the integer part. We do the same kind of multiplication for all the pixels in the RGB image and obtain the desired YIQ image.
Now, since they do this whole implementation using np.dot(arr, yiq_from_rgb.T), the transformation matrix needs to be transposed for the weighting to work out correctly. And .copy() is just there to have a dedicated copy of the transposed transformation matrix for the purpose of this conversion.
Also, notice that, contrary to your figure, in np.dot() the RGB array is on the left of the transformation matrix.
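To make the shapes concrete, here is a small sketch (the image contents are arbitrary) showing that np.dot(arr, yiq_from_rgb.T) applied to an (H, W, 3) array is the same as multiplying the matrix on the left of each pixel's column vector:

    import numpy as np

    yiq_from_rgb = np.array([[0.299,      0.587,       0.114],
                             [0.59590059, -0.27455667, -0.32134392],
                             [0.21153661, -0.52273617,  0.31119955]])

    rgb = np.random.rand(4, 5, 3)      # a tiny (H, W, 3) RGB image

    # np.dot contracts the last axis of rgb with the second-to-last axis of
    # yiq_from_rgb.T, i.e. yiq[h, w, k] = sum_c rgb[h, w, c] * yiq_from_rgb[k, c]
    yiq = np.dot(rgb, yiq_from_rgb.T)  # result shape: (4, 5, 3)

    # identical to the per-pixel matrix-vector product from the figure
    pixel = rgb[0, 0]                  # shape (3,)
    assert np.allclose(yiq[0, 0], yiq_from_rgb @ pixel)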