Python: Project 2D features onto voxels - python

My goal is to project feature vectors I have on an image pixel level onto voxels (via ray casting/marching). The ideal output would be to cast a ray and get the first intersected voxel as output. I have the camera intrinsics, extrinsics and the voxels. So I should have everything that is needed to do it. Currently my voxels are in a sparse format, i.e. an array of coordinates and feature vectors associated with each coordinate.
But unfortunately I couldn't find a simple way to do it. Is there some performant python library, that should work well for this usecase?
I know of open3d but it only seems to support my usecase for meshes.

Related

How to Transform Centroid from Pixel to Real World Coordinates

I am working on an application using an IFM 3D camera to identify parts prior to a robot pickup. Currently I am able to find the centroid of these objects using contours from a depth image and from there calculate the center point of these objects in pixel space.
My next task is to then transform the 2D centroid coordinates to a 3D point in 'real' space. I am able to train the robot such that it's coordinate frame is either at the center of the image or at the traditional (0,0) point of an image (top left).
The 3D camera I am using provides both an intrinsic and extrinsic matrix. I know I need to use some combination of these matrices to project my centroid into three space but the following questions remain:
My current understanding from googling is the intrinsic matrix is used to fix lens distortion (barrel and pinhole warping, etc.) whereas the extrinsic matrix is used to project points into the real world. Is this simplified assumption correct?
How can a camera supply a single extrinsic matrix? I know traditionally these matrices are found using the checkerboard corners method but are these not dependent on the height of the camera?
Is the solution as simple as taking the 3x4 extrinsic matrix and multiplying it by a 3x1 matrix [x, y, 1] and if so, will the returned values be relative to the camera center or the traditional (0,0) point of an image.
Thanks in advance for any insight! Also if it's any consolation I am doing everything in python and openCV.
No. I suggest you read the basics in Multiple View Geometry of Hartley and Zisserman, freely available in the web. Dependent on the camera model, the intrinsics contain different parameters. For the pinhole camera model, these are the focal length and the principal point.
The only reason why you maybe could directly transform your 2D centroid to 3D is that you use a 3D camera. Read the manual of the camera, it should be explained how the relation between 2D and 3D coordinates is given for your specific model.
If you have only image data, you can only compute a 3D point from at least two views.
No, of course not. Please don't be lazy and start reading the basics about camera projection instead of asking for others to explain the common basics that are written down everywhere in the web and literature.

Is there a way to take a large group of 2D images and turn the into a 3D image?

I am currently working on a summer research project and we have generated 360 slices of a tumor. I now need to compile (if that's the right word) these images into one large 3D image. Is there a way to do this with either a python module or an outside source? I would prefer to use a free software if that is possible.
Perhaps via matplotlib, but anyway may require preprocessing I suppose:
https://www.youtube.com/watch?v=5E5mVVsrwZw
In your case, the z axis (3rd dimension) should be specified by your vector of images. Nonetheless, before proceeding, I suppose you would need to extract the shapes of the object you want to reconstruct. For instance, if i take any image of the many 2D you have, I expect to find RGB value for each pixel, but then, for instance if you want to plot a skull like in the video link, as I understand you would need to extract the borders of your object and from each of its frame (2D shape) and then plot their serie. But anyway, the processing may depend on the encoding of the information you have. Perhaps is sufficient to simply plot the series of images.
Some useful link I found:
https://www.researchgate.net/post/How_to_reconstruct_3D_images_from_two_or_four_2D_images
Python: 3D contour from a 2D image - pylab and contourf

How to generate a 3D image based on ChArUco calibration of two 2D images

I'm currently extracting the calibration parameters of two images that were taken in a stereo vision setup via cv2.aruco.calibrateCameraCharucoExtended(). I'm using the cv2.undistortPoints() & cv2.triangulatePoints() function to convert any two 2D points to a 3D point coordinate, which works perfectly fine.
I'm now looking for a way to convert the 2D images, which can be seen under approach 1, to one 3D image. I need this 3D image because I would like to determine the order of these cups from left to right, to correctly use the triangulatePoints function. If I determine the order of the cups from left to right purely based on the x-coordinates of each of the 2D images, I get different results for each camera (the cup on the front left corner of the table for example is in a different 'order' depending on the camera angle).
Approach 1: Keypoint Feature Matching
I was first thinking about using a keypoint feature extractor like SIFT or SURF, so I therefore tried to do some keypoint extraction and matching. I tried using both the Brute-Force Matching and FLANN based Matcher, but the results are not really good:
Brute-Force
FLANN-based
I also tried to swap the images, but it still gives more or less the same results.
Approach 2: ReprojectImageTo3D()
I looked further into the issue and I think I need the cv2.reprojectImageTo3D() [docs] function. However, to use this function, I first need the Q matrix which needs to be obtained with cv2.stereoRectify [docs]. This stereoRectify function on its turn expects a couple of parameters that I'm able to provide, but there's two I'm confused about:
R – Rotation matrix between the
coordinate systems of the first and
the second cameras.
T – Translation vector between
coordinate systems of the cameras.
I do have the rotation and translation matrices for each camera separately, but not between them? Also, do I really need to do this stereoRectify all over again when I already did a full calibration in ChArUco and already have the camera matrix, distortion coefficients, rotation vectors and translations vectors?
Some extra info that might be useful
I'm using 40 calibration images per camera of the ChArUco board to calibrate. I first extract all corners and markers after which I estimate the calibration parameters with the following code:
(ret, camera_matrix, distortion_coefficients0,
rotation_vectors, translation_vectors,
stdDeviationsIntrinsics, stdDeviationsExtrinsics,
perViewErrors) = cv2.aruco.calibrateCameraCharucoExtended(
charucoCorners=allCorners,
charucoIds=allIds,
board=board,
imageSize=imsize,
cameraMatrix=cameraMatrixInit,
distCoeffs=distCoeffsInit,
flags=flags,
criteria=(cv2.TERM_CRITERIA_EPS & cv2.TERM_CRITERIA_COUNT, 10000, 1e-9))
The board paremeter is created with the following settings:
CHARUCO_BOARD = aruco.CharucoBoard_create(
squaresX=9,
squaresY=6,
squareLength=4.4,
markerLength=3.5,
dictionary=ARUCO_DICT)
Thanks a lot in advance!

Plotting a 3d triangular mesh from point cloud

I have this object/point cloud,rendered with pyopengl and pygame.
My object is a numpy array of the co-ordinates of the point. I wish to generate a 3d triangular mesh of this object, also it would be nice if you could decrease the number of triangles.
I have tried scipy.spatial.Delaunay and it doesnt generate triangles for 3d objects.
Dual Contouring would probably work well here, it's an algorithm that takes voxelized data and turns it into a mesh. I don't understand it trivially enough to outline it here, but basically you'd take your array of points and place them into a 3D grid array where if that grid cell contains a point it's set to equal 1 (full), and if it doesn't it is set to 0 (empty), you would then run the DC algorithm on this grid and it would output a mesh. The nice thing about this algorithm is it supports internal cavities and concave shapes.
Here's some links I found that may help you if you decide to use DC:
Basic Dual Contouring Theory
http://ngildea.blogspot.com/2014/11/implementing-dual-contouring.html
This is the github repo to the source I used when I implemented this algorithm in Unity3D:
https://github.com/nickgildea/DualContouringSample

Python get transformation matrix from two sets of points

I have to images, one simulation, one real data, with bright spots.
Simulation:
Reality:
I can detect the spots just fine and get the coordinates. Now I need to compute transformation matrix (scale, rotation, translation, maybe shear) between the two coordinate systems. If needed, I can pick some (5-10) corresponding points by hand to give to the algorithm
I tried a lot of approaches already, including:
2 implementations of ICP:
https://engineering.purdue.edu/kak/distICP/ICP-2.0.html#ICP
https://github.com/KojiKobayashi/iterative_closest_point_2d
Implementing affine transformations:
https://math.stackexchange.com/questions/222113/given-3-points-of-a-rigid-body-in-space-how-do-i-find-the-corresponding-orienta/222170#222170
Implementations of affine transformations:
Determining a homogeneous affine transformation matrix from six points in 3D using Python
how to perform coordinates affine transformation using python? part 2
Most of them simply fail somehow like this:
The red points are the spots from the simulation transformed into the reality - coordinate system.
The best approach so far is this one how to perform coordinates affine transformation using python? part 2 yielding this:
As you see, the scaling and translating mostly works, but the image still needs to be rotated / mirrored.
Any ideas on how to get a working algorithm? If neccessary, I can provide my current non-working implementations, but they are basically as linked.
I found the error.
I used plt.imshow to display both the simulated and real image and from there, pick the reference points from which to calculate the transformation.
Turns out, due to the usual array-to-image-index-flipping-voodoo (or a bad missunderstanding of the transformation on my side), I need to switch the x and y indices of the reference points from the simulated image.
With this, everything works fine using this how to perform coordinates affine transformation using python? part 2

Categories

Resources