How to interpolate one image matrix into another one? - python

I've got two image matrices. Both contain radar information about the same area, but with different units and values. The circles do not line up with each other; one looks zoomed in relative to the other. I would like to do some element-wise operations on these matrices.
Say a value in matrix A is located at A[100, 50], but its element-wise counterpart is at B[95, 45]. At this point I still don't know which value in one matrix is associated with which value in the other; this is only an example.
Matrix A's shape is (720, 720): a circle, with values inside the circle.
Matrix B's shape is the same, but its circle is a little bit zoomed in, and the edges are not completely round in some areas because the circle doesn't fit inside the square matrix.
How can I reshape (interpolate?) one matrix into the other?
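If the only difference is a scale factor about the centre, one option is to resample one matrix onto the other's grid with an interpolating zoom. Below is a minimal sketch using scipy.ndimage.zoom; the zoom factor, and the idea that it can be estimated from the ratio of the two circles' radii, are assumptions, not something given in the question.

import numpy as np
from scipy.ndimage import zoom

def rescale_to_match(b, scale, out_shape=(720, 720)):
    # Rescale matrix b about its centre by `scale` (hypothetical factor,
    # e.g. radius_of_A / radius_of_B) so its circle lines up with A's.
    zoomed = zoom(b, scale, order=1)                          # bilinear interpolation
    out = np.zeros(out_shape, dtype=zoomed.dtype)
    zy, zx = zoomed.shape
    oy, ox = out_shape
    sy, sx = max((zy - oy) // 2, 0), max((zx - ox) // 2, 0)   # crop offsets
    dy, dx = max((oy - zy) // 2, 0), max((ox - zx) // 2, 0)   # pad offsets
    h, w = min(zy, oy), min(zx, ox)
    out[dy:dy + h, dx:dx + w] = zoomed[sy:sy + h, sx:sx + w]  # centre-crop or centre-pad
    return out

# Hypothetical usage: once B is resampled onto A's grid,
# element-wise operations line up.
# B_aligned = rescale_to_match(B, scale=0.9)
# diff = A - B_aligned

If the misalignment also includes a shift, cv2.warpAffine with a full similarity transform (scale plus translation) would be the more general tool.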

Related

How to find rectangular areas in a two-dimensional array whose distance from the center of the rectangle is less than the specified value of x

I'm looking for rectangular areas in a non-negative two-dimensional float numpy array where the values differ from the value at the centre of the area by less than a specified x. The data I'm analysing is the output of a depth-estimation function, and I want to identify the areas whose values are close to each other (which would indicate, for example, part of a wall, or objects that are vertical and facing the camera).
For example, the image below shows the output of the depth-estimation function, where each pixel represents a distance between 0 and 500 cm. Any area where the variation is less than some value indicates that the object is vertical, and those are the areas I'm looking for:
https://drive.google.com/file/d/1Z2Bsi5ZNoo4pFU6N188leq56vGHFfvcd/view?usp=sharing
The code I am working on is related to MiDaS; I have added my own code at the end, and it is at the following link:
https://colab.research.google.com/drive/1vsfukFqOOZZjTajySM8hL0VNCvOxaa6X?usp=sharing
Now, for example, I'm looking for areas like the paper that is stuck behind the chair in the picture below:
https://drive.google.com/file/d/1ui99gpU2i0JFumLivLpoEyHn3QTcfD8a/view?usp=sharing
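One straightforward way to express this check, as a baseline to build on, is a sliding window: for every window position, accept the window if every value in it lies within x of the window's centre value. The window size, tolerance and variable names below are assumptions for the sketch, not values from the question.

import numpy as np

def find_flat_windows(depth, win=15, tol=10.0):
    # depth : 2-D float array of depth values (e.g. 0-500 cm)
    # win   : odd window size in pixels (hypothetical)
    # tol   : maximum allowed deviation from the centre value (the 'x')
    assert win % 2 == 1, "use an odd window size so the centre pixel is defined"
    h, w = depth.shape
    half = win // 2
    corners = []
    for y in range(h - win + 1):
        for x in range(w - win + 1):
            patch = depth[y:y + win, x:x + win]
            centre = patch[half, half]
            if np.all(np.abs(patch - centre) < tol):
                corners.append((y, x))        # top-left corner of an accepted window
    return corners

The double loop is slow for large frames; numpy.lib.stride_tricks.sliding_window_view, or min/max filters from scipy.ndimage, can vectorise a closely related test (window max minus window min below a threshold) if performance matters.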

Estimating relative rotation between two uncalibrated cameras from the fundamental matrix

I have two images of the same object, taken from different positions with an uncalibrated camera. I want to treat this pair of images as a stereo pair. I can then estimate a fundamental matrix without any knowledge of the scene. I want to know the rotation of the second camera relative to the first one.
pts1 and pts2 are arrays of matching key points obtained from one of the known feature-descriptor methods. RANSAC is used to exclude outliers.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3, 0.99)
p1fNew and p2fNew are the inlier key points.
With the stereoRectifyUncalibrated method, I can obtain two matrices to rectify images into the same plane.
retBool, H1, H2 = cv2.stereoRectifyUncalibrated(p1fNew,p2fNew, F, image_size)
By multiplying these matrices I can estimate the transformation from one image plane to the other:
T = np.linalg.inv(H1) @ H2
How do I decompose this T matrix and extract from it the rotation and the direction of the translation from the first camera centre to the second?
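From F alone, with no calibration at all, the relative rotation is not recoverable; a common workaround is to assume approximate intrinsics (focal length roughly equal to the image width, principal point at the image centre), form an essential matrix from F, and decompose that. Below is a sketch of that route; K, f_guess and the use of cv2.recoverPose are assumptions, not part of the question's pipeline.

import cv2
import numpy as np

w, h = image_size                                   # same size used for rectification
f_guess = float(w)                                  # hypothetical focal length guess
K = np.array([[f_guess, 0.0,     w / 2.0],
              [0.0,     f_guess, h / 2.0],
              [0.0,     0.0,     1.0]])

E = K.T @ F @ K                                     # essential matrix from F and the assumed K
retval, R, t, mask = cv2.recoverPose(E, p1fNew, p2fNew, K)

rvec, _ = cv2.Rodrigues(R)                          # axis-angle representation
angle_deg = np.degrees(np.linalg.norm(rvec))
print("relative rotation:", angle_deg, "deg")
print("translation direction (up to scale):", t.ravel())

Note that t from recoverPose is only a direction: with an uncalibrated, unscaled setup the length of the baseline between the two camera centres cannot be determined.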

Visual Odometry - Problem with Rotation and translation

I'm trying to reconstruct the trajectory of a camera by extracting features from images and computing the camera pose; however, the trajectory is not what I expected. For my visual odometry system I am using an Intel D435i camera, which provides color and depth frames. My code is written in Python and all the important functions used are from the OpenCV library.
I am using a dataset where the camera just moves straight forward, so I expect the rotation to be the identity matrix and the translation to point forward. However, this isn't what I get when I compute them.
This is my approach:
Read the color frames, convert them to grayscale and apply the CLAHE method to reduce light contrast
Extract features using SIFT from the frame t and frame t+1
Match them using FLANN and apply Lowe's ratio test to filter outliers
Transform the 2D pixel coordinates (u1p, v1p) of the features to 3D world coordinates using the camera intrinsic matrix (K) and the depth of the pixels (s), acquired from the depth frames:
Normalization: x_y_norm = (K^-1) * [u1p, v1p, 1]
Scaling: x_y_norm_s = s * x_y_norm
Afterwards I try two methods: PnP and the essential matrix
A) PnP:
4. Use the function cv2.solvePnPRansac() to get the rotation vector and the translation vector. The inputs of the function are the normalized and scaled keypoints of frame t (object points) and the normalized keypoints of frame t+1 (image points). The rotation matrix is then computed with the function cv2.Rodrigues()
B) Essential matrix:
4. Use cv2.findEssentialMat() with the normalized x-y-coordinates of the keypoints as input. Check if the essential matrix is computed correctly by checking the singular values. Use cv2.recoverPose() to get the rotation and translation (Here I also used the normalized x-y-coordinates as input).
Compute the trajectory as follows (assuming the second camera pose is [R|t] and the first pose is the identity matrix, as stated in the book Multiple View Geometry):
camera_pose[t+1] = pose[t] * pose[t+1]^-1 (Get pose from initial position by multiplying previous and current pose as stated in Combining relative camera rotations & translations with known pose )
position = pose[t+1] * [0,0,0,1] (Position calculated from the origin)
trajectory[:,t+1] = position[0:3]
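For concreteness, here is a minimal sketch of how the PnP branch above could be written out; kp1_uv, kp2_uv (matched pixel coordinates in frames t and t+1), depth1 (the depth image of frame t) and pose_t (the previous absolute 4x4 pose) are assumed names, not the asker's actual variables. One detail worth checking against the description above: when K is passed to cv2.solvePnPRansac, the image points are expected in pixel coordinates, not normalized ones.

import cv2
import numpy as np

def backproject(uv, depth, K):
    # Lift 2-D pixel coordinates to 3-D camera coordinates: X = s * K^-1 * [u, v, 1]^T
    ones = np.ones((uv.shape[0], 1))
    rays = (np.linalg.inv(K) @ np.hstack([uv, ones]).T).T    # normalized rays
    s = depth[uv[:, 1].astype(int), uv[:, 0].astype(int)]    # depth per keypoint
    return rays * s[:, None]                                 # scale each ray by its depth

pts3d_t = backproject(kp1_uv, depth1, K)                     # object points in frame t

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts3d_t.astype(np.float32),          # 3-D points from frame t
    kp2_uv.astype(np.float32),           # their 2-D observations in frame t+1 (pixels)
    K, None)                             # no lens distortion assumed
R, _ = cv2.Rodrigues(rvec)               # rotation matrix, frame t -> frame t+1

# chain the relative pose onto the previous absolute pose (4x4 homogeneous matrices)
T_rel = np.eye(4)
T_rel[:3, :3], T_rel[:3, 3] = R, tvec.ravel()
pose_t1 = pose_t @ np.linalg.inv(T_rel)  # corresponds to pose[t] * pose[t+1]^-1 above
position = pose_t1[:3, 3]                # camera position relative to the start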
I hope someone can help me, since I don't know where my error is and neither the PnP method nor the essential-matrix method works.
Edit: For both methods I added a picture of the trajectory (in the Z-X and the X-Y plane) that I get from the first 15 frames of my dataset, and an extract of a text file containing the rotation matrix and the translation vector for some frames. The Z-X plane should show dots forming a line in the positive direction, since in my dataset the camera only moves straight forward; however, a random trajectory is shown. The green dots are old camera poses and the red dot is the current one. I also added an example of the features extracted in the first frame and matched with the second frame; they seem to be alright.

How do I fit rectangles to an image in python and obtain their coordinates

I'm looking for a way to split a number of images into proper rectangles. These rectangles are ideally shaped such that each of them takes on the largest possible size without containing a lot of white.
So let's say that we have the following image
I would like to get an output such as this:
Note the overlapping rectangles, the hole and the non-axis-aligned rectangle; all of these are likely scenarios I have to deal with.
I'm aiming to get the coordinates describing the corner points of the rectangles, so something like:
[[(73,13),(269,13),(269,47),(73,47)],
[(73,13),(73,210),(109,210),(109,13)],
...]
To do this I have already looked at cv2.findContours, but I couldn't get it to work with overlapping rectangles (though I could use the hierarchy model to deal with holes, as those cause the contours to be merged into one).
Note that although not shown holes can be nested.
An algorithm that works roughly as follows should be able to give you the result you seek.
Get all the corner points in the image.
Randomly select 3 points to create a rectangle
Count the ratio of yellow pixels within the rectangle, accept if the ratio satisfies a threshold.
Repeat steps 2 and 3 until:
a) every single combination of points has been tried, or
b) all yellow pixels are accounted for, or
c) n iterations have been reached.
The difficult part of this algorithm lies in step 2: creating a rectangle from 3 points.
If all the rectangles were axis-aligned, you could simply take the minimum x and y as the top-left corner and the maximum x and y as the bottom-right corner of your new rectangle.
But since you have off-axis rectangles, you will need to check whether the two vectors created from the 3 points are at a 90-degree angle to each other before generating the rectangle.
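As a sketch of steps 2 and 3, under the assumption that the three points are candidate corners given as (x, y) pixel coordinates with the second one shared between the two edges, and that the "yellow" pixels are already available as a boolean mask (both assumptions, not given in the question):

import cv2
import numpy as np

def rectangle_from_three_points(p1, p2, p3, angle_tol_deg=5.0):
    # Treat p2 as the shared corner. Return the four corners if the two edge
    # vectors are perpendicular within angle_tol_deg, otherwise None.
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    v1, v2 = p1 - p2, p3 - p2                       # edge vectors from the shared corner
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    if abs(np.degrees(np.arccos(np.clip(cosang, -1, 1))) - 90) > angle_tol_deg:
        return None                                 # not (close to) a right angle
    p4 = p1 + v2                                    # fourth corner completes the rectangle
    return np.array([p1, p2, p3, p4])               # perimeter order p1 -> p2 -> p3 -> p4

def yellow_ratio(yellow_mask, corners):
    # Fraction of yellow pixels inside the rectangle, rasterised with fillPoly.
    poly = np.zeros(yellow_mask.shape, np.uint8)
    cv2.fillPoly(poly, [corners.astype(np.int32)], 1)
    inside = poly.astype(bool)
    return yellow_mask[inside].mean() if inside.any() else 0.0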

find closest clusters of colors in a numpy array from an image file

Current state
I have a numpy array of shape (900, 1800, 3) that has been made from an image file.
That's one array element per pixel: 900 px high, 1800 px wide, and 3 channels (R, G, B) per pixel represented in the array.
There are only a small number (3-20) of unique RGB colors in the images being parsed, so only very few different RGB value combinations are represented in the array.
Goal
Identify the smallest circular areas in the image that contain n unique colors, where n will always be less than or equal to the number of unique colors in the image.
Return the top y (by count or percentage) smallest areas.
A 'result' could simply be the x, y position of the center pixel of an identified circular area and its radius.
I do plan to draw a circle around each area, but this question is about the best approach for first identifying the top smallest areas.
The Catch/Caveat
The images are actually flattened projections of spheres. That means that a pixel at the right edge of the image is actually adjacent to a pixel on the left edge, and similarly for top and bottom pixels. The solution must account for this as it parses pixels to identify the closest pixels of other colors. EDIT: this part may be answered in the comments below.
The Question
My initial approach is to simply parse pixel by pixel and brute-force the problem with hand-rolled x/y coordinate math: take a pixel, work outwards until we hit n colors, score that pixel by how many steps outward it took, then move on to the next pixel. Keep a top-y dict that gets re-evaluated after each pixel, adding any pixels that make the top y and dropping any that get pushed out. Return that dict as the output.
I know that many Python libs like scipy, scikit-image, and maybe others like to work with images as numpy arrays. I'm sure there is a smarter method/approach that leverages a library or some kind of clustering algorithm instead of brute-forcing it, but I'm not familiar enough with the space to know intuitively which methods and libs to consider. The question: what is the pseudocode for a good method/lib to do this the right way?
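One library-based way to phrase the brute-force idea, offered as a sketch rather than a definitive answer: compute a Euclidean distance transform per unique color, so every pixel knows how far it is from the nearest pixel of each color; the radius needed at a pixel to see n colors is then simply the n-th smallest of those distances. The names below are hypothetical, and only the left-right wrap-around is handled (by tiling the image horizontally); the pole rows are left as-is.

import numpy as np
from scipy.ndimage import distance_transform_edt

def smallest_n_color_circles(img, n, top_y=10):
    # img: (900, 1800, 3) uint8 array; n: number of distinct colors required.
    h, w, _ = img.shape
    tiled = np.concatenate([img, img, img], axis=1)        # approximate x wrap-around
    colors = np.unique(img.reshape(-1, 3), axis=0)

    # dists[k] = distance from every pixel to the nearest pixel of color k
    dists = []
    for c in colors:
        mask = np.all(tiled == c, axis=-1)
        dists.append(distance_transform_edt(~mask)[:, w:2 * w])
    dists = np.stack(dists)                                # (n_colors, h, w)

    radius = np.sort(dists, axis=0)[n - 1]                 # radius needed to reach n colors

    flat = np.argsort(radius, axis=None)[:top_y]           # top_y smallest radii
    rows, cols = np.unravel_index(flat, radius.shape)
    return [(int(r), int(c), float(radius[r, c])) for r, c in zip(rows, cols)]

With at most ~20 unique colors this is roughly 20 distance transforms over a 900x5400 array, which scipy handles quickly; the returned (row, col, radius) triples map directly onto the "center pixel plus radius" result described above.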
