Why does gdal_grid turn image upside-down? - python

I'm trying to use gdal_grid to make an elevation grid from a surface in a geojson. I use this command:
gdal_grid -a linear:radius=0 inputSurface.geojson outputFile.tif
It seems to give the correct pixel values, but if I open the result in Global Mapper or QGIS, the image is mirrored about a horizontal axis, such that the tif sits directly below the surface and upside-down.
What is the reason for this and how do I fix it?
Update
I already tried changing the geotransform, but it hasn't totally fixed my problem.
I looked at the resulting image with gdalinfo and found out that the upper left corner is actually the lower left corner, so I set it using SetGeoTransform. This moved it to the correct location, but it is still upside-down. (This may be dependent on the projection, which might cause problems later.)
I also tried looking at the pixel width in the geotransform as mentioned below:
Xgeo = GT[0] + Xpixel*GT[1] + Yline*GT[2]
Ygeo = GT[3] + Xpixel*GT[4] + Yline*GT[5]
The image returned by gdal_grid has a positive GT[5], but unfortunately changing it to -GT[5] doesn't change anything.
The code I used to change the geotransform:
transform = list(ds.GetGeoTransform())
transform = [upperLeftX, transform[1], 0, upperLeftY, 0, -transform[5]]
ds.SetGeoTransform(transform)

GDAL's georeferencing is commonly specified by two sets of parameters. The first is the spatial reference, which defines the coordinate system (UTM, WGS84, something more localized). The spatial reference for a raster is set using gdal.Dataset.SetProjection(). The second piece of georeferencing is the GeoTransform, which translates (row, column) pixel indices into coordinates in that coordinate system. It is likely the GeoTransform that you need to update to make your image "unflipped".
The GeoTransform is a tuple of 6 values, which relates raster indices to coordinates.
Xgeo = GT[0] + Xpixel*GT[1] + Yline*GT[2]
Ygeo = GT[3] + Xpixel*GT[4] + Yline*GT[5]
Because these are raster images, the (line, pixel) or (row, col) coordinates start from the top left of the image.
[ ]----> column
|
|
v row
This means that GT[1] will be positive when the image is positioned "upright" in the coordinate system. Similarly, and sometimes counter-intuitively, GT[5] will be negative because the y value should decrease for every increasing row in the image. This isn't a requirement, but it is very common.
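For example, a hypothetical north-up raster with 30 m pixels and an upper-left corner at (440720, 3751320) would have a geotransform along these lines (the numbers are purely illustrative):
gt = (440720.0,   # GT[0]: x coordinate of the upper-left corner
      30.0,       # GT[1]: pixel width
      0.0,        # GT[2]: row rotation (0 for north-up)
      3751320.0,  # GT[3]: y coordinate of the upper-left corner
      0.0,        # GT[4]: column rotation (0 for north-up)
      -30.0)      # GT[5]: pixel height, negative so y decreases with each row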
Modifying the GeoTransform
You state that the image is upside down and below where it should be. This isn't guaranteed to be a fix, but it will get you started. It's easier if you have the image in front of you and can experiment or compare coordinates...
from osgeo import gdal
# open dataset as readable/writable
ds = gdal.Open('input.tif', gdal.GA_Update)
# get the GeoTransform as a tuple
gt = ds.GetGeoTransform()
# change gt[5] to its negative, flipping the image
gt_new = (gt[0], gt[1], gt[2], gt[3], gt[4], -1 * gt[5])
# set the new GeoTransform, effectively flipping the image
ds.SetGeoTransform(gt_new)
# delete the dataset reference, flushing the cache of changes
del ds

I ended up having more problems with gdal_grid, where it just crashes at seemingly random places, so I'm using the scipy.interpolate function griddata instead. This uses a meshgrid to get the coordinates in the grid, and I had to tile the computation because of the memory requirements of meshgrid.
import scipy.interpolate as il #for griddata
import numpy as np
# meshgrid of coords in this tile
gridX, gridY = np.meshgrid(xi[c*tcols:(c+1)*tcols], yi[r*trows:(r+1)*trows][::-1])
## Creating the DEM in this tile
zi = il.griddata((coordsT[0], coordsT[1]), coordsT[2], (gridX, gridY),
                 method='linear', fill_value=nodata)  # fill_value to prevent NaN at polygon outline
raster.GetRasterBand(1).WriteArray(zi, c*tcols, nrows - r*trows - rtrows)
The linear interpolation seems to do the same as gdal_grid is supposed to. Getting the orientation right still required making the 5th element of the geotransform negative, as described in the question update.
See description at scipy.interpolate.griddata.
A few things to note (see the sketch after this list):
- The point used in the geotransform should be the upper-left corner
- The resolution in the y-direction should be negative
- In the projections I use, the positive y-direction is up
- In numpy arrays, the positive y-direction is down
- GDAL's WriteArray measures its offsets from the upper-left corner
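As a minimal sketch of how these notes fit together (xi, yi, dem, nodata and the EPSG code are placeholders, not the exact variables from my tiled script):
from osgeo import gdal, osr
import numpy as np

# Placeholders: xi, yi are ascending x/y coordinate vectors, dem is the interpolated
# grid with row 0 as the northernmost (top) row, nodata is a fill value.
ncols, nrows = len(xi), len(yi)
xres = xi[1] - xi[0]
yres = yi[1] - yi[0]

driver = gdal.GetDriverByName('GTiff')
raster = driver.Create('dem.tif', ncols, nrows, 1, gdal.GDT_Float32)

# Upper-left corner of the grid, positive x resolution, negative y resolution
raster.SetGeoTransform((xi[0], xres, 0, yi[-1], 0, -yres))

srs = osr.SpatialReference()
srs.ImportFromEPSG(32632)            # placeholder EPSG code, use your own projection
raster.SetProjection(srs.ExportToWkt())

band = raster.GetRasterBand(1)
band.SetNoDataValue(nodata)
band.WriteArray(dem, 0, 0)           # offsets are measured from the upper-left corner
raster.FlushCache()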
Hope this helps clear up other people's confusion.

I've solved a similar issue by simply re-projecting the results of the gdal_grid. Give this a try (replacing the epsg code with your projection and replacing the input/output filepaths):
gdalwarp -s_srs epsg:4326 -t_srs epsg:4326 gdal_grid_result.tif inverted_output.tif

It does not. It is simply how the tool renders it by default. Try opening it in QGIS and you'll notice it is right side up.

Related

Testing if a point lies within a labeled object with scipy's ndi.label()

Above is an image that has been put through ndi.label() and displayed with matplotlib, with each coloured region representing a different feature. Plotted on top of the image are red points, each representing a pair of coordinates. All coordinates are stored, and ndi.label returns the number of features. Does skimage, scipy or ndimage have a function that will test whether a given set of coordinates lies within a labelled feature?
Initially I intended to use the bounding box (left, right, top, bottom) of each feature, but because the regions are not all rectangular this won't work.
code to generate the image:
from skimage import io
import scipy.ndimage as ndi
import matplotlib.pyplot as plt

image = io.imread("image path")
labelledImage, featureNumber = ndi.label(image)
plt.imshow(labelledImage)
for i in range(len(list)):
    y, x = list[i]
    plt.scatter(y, x, c='r', s=40)
You can use ndi.map_coordinates to find the value at a particular coordinate (or group of coordinates) in an image:
labels_at_coords = ndi.map_coordinates(
    labelledImage, np.transpose(list), order=0
)
Notes:
the coordinates array needs to be of shape (ndim, npoints), instead of the sometimes more intuitive (npoints, ndim), hence the transpose.
ideally, it would be best to rename your points list to something like points_list, so that you don't overwrite the Python built-in function list.
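For instance, a self-contained toy example using a points_list array (the image and coordinates are made up purely for illustration):
import numpy as np
import scipy.ndimage as ndi

# Toy binary image with two separate blobs
image = np.zeros((8, 8), dtype=int)
image[1:3, 1:3] = 1
image[5:7, 4:7] = 1

labelledImage, featureNumber = ndi.label(image)

# Each row is a (row, col) coordinate; transpose to shape (ndim, npoints)
points_list = np.array([[1, 1], [5, 5], [0, 7]])
labels_at_coords = ndi.map_coordinates(labelledImage, np.transpose(points_list), order=0)

print(labels_at_coords)  # a 0 means that point falls outside every labelled feature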

2D X-ray reconstruction from 3D DICOM images

I need to write a python function or class with the following Input/Output
Input :
The position of the X-rays source (still not sure why it's needed)
The position of the board (still not sure why it's needed)
A three dimensional CT-Scan
Output :
A 2D X-ray Scan (simulate an X-Ray Scan which is a scan that goes through the whole body)
A few important remarks to what I'm trying to achieve:
You don’t need additional information from the real world or any advanced knowledge.
You can add any input parameter that you see fit.
If your method produces artifacts, you are expected to fix them.
Please explain every step of your method.
What I've done so far (.py file attached):
I've read the .dicom files, which are located in "Case2" folder.
These .dicom files can be downloaded from my Google Drive:
https://drive.google.com/file/d/1lHoMJgj_8Dt62JaR2mMlK9FDnfkesH5F/view?usp=sharing
I've sorted the files by their position.
Finally, I've created a 3D array and added all the images to it in order to plot the results (you can see them in the attached image), which are slices of the CT scans. (reference: https://pydicom.github.io/pydicom/stable/auto_examples/image_processing/reslice.html#sphx-glr-auto-examples-image-processing-reslice-py)
Here's the full code:
import pydicom as dicom
import os
import matplotlib.pyplot as plt
import sys
import glob
import numpy as np
path = "./Case2"
ct_images = os.listdir(path)
slices = [dicom.read_file(path + '/' + s, force=True) for s in ct_images]
slices[0].ImagePositionPatient[2]
slices = sorted(slices, key = lambda x: x.ImagePositionPatient[2])
#print(slices)
# Read a dicom file with a ctx manager
with dicom.dcmread(path + '/' + ct_images[0]) as ds:
    # plt.imshow(ds.pixel_array, cmap=plt.cm.bone)
    print(ds)
    # plt.show()
fig = plt.figure()
for num, each_slice in enumerate(slices[:12]):
    y = fig.add_subplot(3, 4, num + 1)
    # print(each_slice)
    y.imshow(each_slice.pixel_array)
plt.show()
for i in range(len(ct_images)):
    with dicom.dcmread(path + '/' + ct_images[i], force=True) as ds:
        plt.imshow(ds.pixel_array, cmap=plt.cm.bone)
        plt.show()
# pixel aspects, assuming all slices are the same
ps = slices[0].PixelSpacing
ss = slices[0].SliceThickness
ax_aspect = ps[1]/ps[0]
sag_aspect = ps[1]/ss
cor_aspect = ss/ps[0]
# create 3D array
img_shape = list(slices[0].pixel_array.shape)
img_shape.append(len(slices))
img3d = np.zeros(img_shape)
# fill 3D array with the images from the files
for i, s in enumerate(slices):
    img2d = s.pixel_array
    img3d[:, :, i] = img2d
# plot 3 orthogonal slices
a1 = plt.subplot(2, 2, 1)
plt.imshow(img3d[:, :, img_shape[2]//2])
a1.set_aspect(ax_aspect)
a2 = plt.subplot(2, 2, 2)
plt.imshow(img3d[:, img_shape[1]//2, :])
a2.set_aspect(sag_aspect)
a3 = plt.subplot(2, 2, 3)
plt.imshow(img3d[img_shape[0]//2, :, :].T)
a3.set_aspect(cor_aspect)
plt.show()
The result isn't what I wanted because:
These are slices of the CT scans. I need to simulate an X-ray scan, which is a scan that goes through the whole body.
Would love your help to simulate an X-Ray scan that goes through the body.
I've read that it could be done in the following way: "A normal 2D X-ray image is a sum projection through the volume. Send parallel rays through the volume and add up the densities." But I'm not sure how to accomplish that in code.
References that may help: https://pydicom.github.io/pydicom/stable/index.html
EDIT: as further answers noted, this solution yields a parallel projection, not a perspective projection.
From what I understand of the definition of "a normal 2D X-ray image", this can be done by summing the densities along the projection direction, i.e. accumulating every slice's contribution to each output pixel.
With your 3D volume, this means performing a sum over a given axis, which can be done with ndarray.sum(axis) in numpy.
# plot 3 orthogonal slices
a1 = plt.subplot(2, 2, 1)
plt.imshow(img3d.sum(2), cmap=plt.cm.bone)
a1.set_aspect(ax_aspect)
a2 = plt.subplot(2, 2, 2)
plt.imshow(img3d.sum(1), cmap=plt.cm.bone)
a2.set_aspect(sag_aspect)
a3 = plt.subplot(2, 2, 3)
plt.imshow(img3d.sum(0).T, cmap=plt.cm.bone)
a3.set_aspect(cor_aspect)
plt.show()
This yields the following result:
Which, to me, looks like an X-ray image.
EDIT : the result is a bit too "bright", so you may want to apply gamma correction. With matplotlib, import matplotlib.colors as colors and add a colors.PowerNorm(gamma_value) as the norm parameter in plt.imshow:
plt.imshow(img3d.sum(0).T, norm=colors.PowerNorm(gamma=3), cmap=plt.cm.bone)
Result:
The way I understand the task, you are expected to write a ray tracer that follows the X-rays from the source (that's why you need its position) to the projection plane (that's why you need its position too).
Sum up the values as you go and do a mapping to the allowed grey-values in the end.
Take a look at line drawing algorithms to see how you can do this.
It is really no black magic, I have done this kind of stuff more than 30 years ago. Damn, I'm old...
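As a rough illustration of the idea (not the asker's required algorithm, just one hedged way of sampling along a ray with NumPy/SciPy; the volume, source and detector points below are made-up placeholders):
import numpy as np
from scipy.ndimage import map_coordinates

def ray_sum(volume, start, end, n_samples=512):
    """Sum interpolated voxel values along the straight line from start to end.

    start/end are points in voxel coordinates. This is only a sketch; a real
    implementation would step voxel by voxel (DDA/Bresenham-style) instead of
    taking a fixed number of samples.
    """
    ts = np.linspace(0.0, 1.0, n_samples)
    # Shape (3, n_samples): one column of coordinates per sample point on the ray
    coords = np.array(start)[:, None] * (1 - ts) + np.array(end)[:, None] * ts
    # Trilinear interpolation of the volume at those points; 0 outside the volume
    samples = map_coordinates(volume, coords, order=1, mode='constant', cval=0.0)
    return samples.sum()

# Hypothetical usage: a toy volume, a source outside it, and one detector pixel behind it
volume = np.random.rand(64, 64, 64)
source = np.array([-200.0, 32.0, 32.0])
detector_pixel = np.array([120.0, 30.0, 40.0])
print(ray_sum(volume, source, detector_pixel))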
What you want is a perspective projection instead of a parallel projection. In order to obtain this, you need to know which values to sum for each point on the projection plane. There are multiple considerations to keep in mind:
We are talking about voxels, so you need a method to determine whether a certain point in space belongs to a certain voxel in your volume.
A line between two points is straight, but because voxels are a discrete representation of space, different methods of determining the above can lead to different (mostly minor) results. This difference will ultimately also lead to slightly different images depending on the algorithms used. This is expected.
Let's say you have a CT scan volume comprising 256 slices of 512x512 pixels. This gives you a volume of 512x512x256 voxels. For each of these voxels you need to know its position in x,y,z coordinates. You can do this as follows:
- Use the ImagePositionPatient attribute to find out the x,y,z coordinate of the upper left hand corner pixel in mm for a given slice.
- Use the PixelSpacing attribute to calculate the x,y,z coordinates of the other pixels in your slice. Repeat for all slices (a minimal sketch follows below).
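A minimal sketch of that bookkeeping, assuming axis-aligned slices (i.e. a trivial ImageOrientationPatient of 1,0,0,0,1,0) and reusing the already-sorted slices list from the question:
import numpy as np

def slice_voxel_coords(ds):
    # One (rows, cols, 3) array of patient coordinates in mm for this slice,
    # assuming the column direction is +x and the row direction is +y.
    rows, cols = ds.pixel_array.shape
    x0, y0, z0 = (float(v) for v in ds.ImagePositionPatient)
    row_spacing, col_spacing = (float(v) for v in ds.PixelSpacing)
    jj, ii = np.meshgrid(np.arange(cols), np.arange(rows))
    return np.stack([x0 + jj * col_spacing,
                     y0 + ii * row_spacing,
                     np.full((rows, cols), z0)], axis=-1)

coords_per_slice = [slice_voxel_coords(s) for s in slices]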
Edit: I just found a counterexample against the method below; the rest is still helpful. Will update.
Now to find out for a given point (Xa, Ya, Za) what voxel values need to be summed if the source is at (Xb, Yb, Zb):
Find the voxel that belongs to (Xa, Ya, Za). Keep pixel/voxel data.
Calculate (you can do this with NumPy) the distance between voxel (Xa, Ya, Za) and (Xb, Yb, Zb). There is an optimization possible here :)
For all directly surrounding voxels (that will be 3x3x3-1 = 26 voxels), also calculate this distance. Can also be optimized :)
Take the voxel with the shortest distance as the starting point for the next iteration of the above. Add pixel/voxel data.
Repeat until you are out of bounds of your CT volume.
In order to obtain a projection repeat these steps for all points on your projection plane and visualize the result. Good luck with your assignment! :)

How to rotate a 3D array without rounding, by using Python?

I have a 3D numpy array that I want to rotate by an angle of my choosing. I have tried using the scipy.ndimage.rotate function and it does the job. However, it does a lot of rounding when rotating. This is a problem for me because my 3D array is a representation of an object, and the number in each voxel represents the material that voxel is filled with (which I store in a different file). Therefore, I need a way to rotate the array without approximating or mixing the values; making the object blurry is not a problem.
Here is what I got with the function I used:
The problem you are dealing with is essentially a sampling issue. Your resolution is too low for the data you are dealing with. One possibility to solve this is to increase the resolution of the image you are working with, enforce the color values as you rotate (ie no blending colors at the edges), and create a size/shape template that must be met after the rotation.
Edit: For clarity, it isn't the data that is at too low of a resolution, it's the image in which the data is stored that should be at a high enough resolution. The wikipedia page on multidimensional sampling is good for this topic: https://en.wikipedia.org/wiki/Multidimensional_sampling
I think the way I would approach it, outside of someone knowing an actual package to do this, is to start with the indices and rotate them, then, given they may be floating point, round them. This may not be the best approach, but I think it should work.
Most of this example is loading a 3D dataset I found to use as an example.
import matplotlib.pyplot as plt
import os
import numpy as np
from scipy.ndimage import rotate
def load_example_data():
    # Found data as an example
    from urllib.request import urlopen
    import tarfile
    # Download the archive to disk before extracting it
    with urlopen('http://graphics.stanford.edu/data/voldata/MRbrain.tar.gz') as response, \
            open('MRbrain.tar.gz', 'wb') as f:
        f.write(response.read())
    tar_file = tarfile.open('MRbrain.tar.gz')
    try:
        os.mkdir('mri_data')
    except FileExistsError:
        pass
    tar_file.extractall('mri_data')
    tar_file.close()
    data = np.array([np.fromfile(os.path.join('mri_data', 'MRbrain.%i' % i),
                                 dtype='>u2') for i in range(1, 110)])
    data.shape = (109, 256, 256)
    return data
def rotate_nn(data, angle, axes):
    """
    Rotate `data` by rotating its coordinate grid and doing a
    nearest-neighbour lookup, so no new values are introduced.
    """
    # Create grid of indices
    shape = data.shape
    d1, d2, d3 = np.mgrid[0:shape[0], 0:shape[1], 0:shape[2]]
    # Rotate the indices
    d1r = rotate(d1, angle=angle, axes=axes)
    d2r = rotate(d2, angle=angle, axes=axes)
    d3r = rotate(d3, angle=angle, axes=axes)
    # Round to integer indices and clip to the valid index range
    d1r = np.clip(np.round(d1r), 0, shape[0] - 1).astype(int)
    d2r = np.clip(np.round(d2r), 0, shape[1] - 1).astype(int)
    d3r = np.clip(np.round(d3r), 0, shape[2] - 1).astype(int)
    return data[d1r, d2r, d3r]
data = load_example_data()
# Rotate the coordinate indices
angle = 5
axes = (0, 1)
data_r = rotate_nn(data, angle, axes)
I think the general idea will work. You will have to consider what the axis is to rotate around.
For anyone with this problem stumbling upon this thread: brechmos' comment under the OP put me in the right direction for an actual solution. rotate() by default uses a third-order spline interpolation, which gives nice smooth edges. We want sharp edges though, without numbers in between. Setting order = 0 does exactly this. No need for extra functions or implementing anything yourself, just change a single argument.
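In other words, something like this (a minimal sketch; labels stands in for whatever integer-valued material array you have):
from scipy.ndimage import rotate
import numpy as np

labels = np.random.randint(0, 4, size=(50, 50, 50))  # placeholder material IDs

# order=0 -> nearest-neighbour: every output voxel keeps one of the original
# integer values, with no blending at the edges
rotated = rotate(labels, angle=30, axes=(0, 1), order=0, reshape=True, cval=0)

# Only original values (plus the cval fill) can appear in the result
assert set(np.unique(rotated)) <= set(np.unique(labels)) | {0}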

OpenCV recoverPose camera coordinate system

I'm estimating the translation and rotation of a single camera using the following code.
E, mask = cv2.findEssentialMat(k1, k2,
                               focal=SCALE_FACTOR * 2868,
                               pp=(1920/2 * SCALE_FACTOR, 1080/2 * SCALE_FACTOR),
                               method=cv2.RANSAC,
                               prob=0.999,
                               threshold=1.0)
points, R, t, mask = cv2.recoverPose(E, k1, k2)
where k1 and k2 are my matching set of key points, which are Nx2 matrices where the first column is the x-coordinates and the second column is y-coordinates.
I collect all the translations over several frames and generate a path that the camera traveled like this.
def generate_path(rotations, translations):
    path = []
    current_point = np.array([0, 0, 0])
    for R, t in zip(rotations, translations):
        path.append(current_point)
        # don't care about rotation of a single point
        current_point = current_point + t.reshape((3,))
    return np.array(path)
So, I have a few issues with this.
The OpenCV camera coordinate system suggests that if I want to view the 2D "top down" view of the camera's path, I should plot the translations along the X-Z plane.
plt.plot(path[:,0], path[:,2])
This is completely wrong.
However, if I write this instead
plt.plot(path[:,0], path[:,1])
I get the following (after doing some averaging)
This path is basically perfect.
So, perhaps I am misunderstanding the coordinate system convention used by cv2.recoverPose? Why should the "bird's eye view" of the camera path be along the XY plane and not the XZ plane?
Another, perhaps unrelated issue is that the reported Z-translation appears to decrease linearly, which doesn't really make sense.
I'm pretty sure there's a bug in my code since these issues appear systematic - but I wanted to make sure my understanding of the coordinate system was correct so I can restrict the search space for debugging.
At the very beginning, actually, your method is not producing a real path. The translation t produced by recoverPose() is always a unit vector. Thus, in your 'path', every frame is moving exactly 1 'meter' from the previous frame. The correct method would be: 1) initialize (featureMatch, findEssentialMat, recoverPose), then 2) track (triangulate, featureMatch, solvePnP); a rough sketch follows below. If you would like to dig deeper, finding tutorials on Monocular Visual SLAM would help.
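A very rough outline of that pipeline (the OpenCV calls are real, but the bookkeeping around them is heavily simplified and untested; K is assumed to be your 3x3 intrinsic matrix and the point arrays are Nx2 / Nx3 float arrays):
import cv2
import numpy as np

def initialize(pts1, pts2, K):
    """First two frames: essential matrix, relative pose, and initial 3D points."""
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at the origin
    P2 = K @ np.hstack([R, t])                          # second camera, unit-length baseline
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    pts3d = (pts4d[:3] / pts4d[3]).T                    # Nx3 points, scale fixed by the baseline
    return R, t, pts3d

def track(pts3d, pts2d, K):
    """Later frames: absolute pose from known 3D points and their 2D matches."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec  # translation is now in the same (consistent) scale as pts3d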
Secondly, you might have mixed up the camera coordinate system and the world coordinate system. If you want to plot the trajectory, you would use the world coordinate system rather than the camera coordinate system. Besides, the results of recoverPose() are also in the world coordinate system. And the world coordinate system is: x-axis pointing to the right, y-axis pointing forward, z-axis pointing up. Thus, when you would like to plot the 'bird view', it is correct that you should plot along the X-Y plane.

Healpy coordinate error after interpolation: appearance of bisector

I have a coarse skymap made up of 128 points, from which I would like to make a smooth healpix map (see the attached Figure, LHS).
I load my data, then make new longitude and latitude arrays of the appropriate pixel length for the final map (with e.g. nside=32).
My input data are:
lats = pi/2 + ths # theta from 0, pi, size 8
lons = phs # phi from 0, 2pi, size 16
data = sky_data[0] # shape (8,16)
New lon/lat array size based on number of pixels from nside:
nside = 32
pixIdx = hp.nside2npix(nside) # number of pixels I can get from this nside
pixIdx = np.arange(pixIdx) # pixel index numbers
I then find the new data values for those pixels by interpolation, and then convert back from angles to pixels.
# new lon/lat
new_lats = hp.pix2ang(nside, pixIdx)[0] # thetas I need to populate with interpolated theta values
new_lons = hp.pix2ang(nside, pixIdx)[1] # phis, same
# interpolation
from scipy.interpolate import RectSphereBivariateSpline
lut = RectSphereBivariateSpline(lats, lons, data, pole_values=4e-14)
data_interp = lut.ev(new_lats.ravel(), new_lons.ravel()) #interpolate the data
pix = hp.ang2pix(nside, new_lats, new_lons) # convert latitudes and longitudes back to pixels
Then, I construct a healpy map with the interpolated values:
healpix_map = np.zeros(hp.nside2npix(nside), dtype=np.double) # create empty map
healpix_map[pix] = data_interp # assign pixels to new interpolated values
testmap = hp.mollview(healpix_map)
The result of the map is the upper RHS of the attached Figure.
(Forgive the use of jet -- viridis doesn't have a "white" zero, so using that colormap adds a blue background.)
The map doesn't look right: you can see from the coarse map in the Figure that there should be a "hotspot" on the lower RHS, but here it appears in the upper left.
As a sanity check, I used matplotlib to make a scatter plot of the interpolated points in a mollview projection, Figure 2, where I removed the edges of the markers to make it look like a map ;)
ax = plt.subplot(111, projection='astro mollweide')
ax.grid()
colors = data_interp
sky = plt.scatter(new_lons, new_lats - pi/2, c=colors, edgecolors='none', cmap='jet')
plt.colorbar(sky, orientation = 'horizontal')
You can see that this map, lower RHS of attached Figure, produces exactly what I expect! So the coordinates are ok, and I am completely confused.
Has anyone encountered this before? What can I do? I'd like to use the healpy functions on this and future maps, so just using matplotlib isn't an option.
Thanks!
I figured it out -- I had to add pi/2 to my thetas for the interpolation to work, so in the end I need to apply the following transformation for the image to render correctly:
newnew_lats = pi - new_lats
newnew_lons = pi + new_lons
There still seems to be a bit of an issue with the interpolation, although the seam is not so visible now. I may try a different one to compare.
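Roughly, the corrected conversion back to pixels then looks like this (my own hedged reconstruction of the step, reusing new_lats, new_lons and data_interp from above, with phi wrapped into [0, 2*pi) to stay in range):
import numpy as np
import healpy as hp

newnew_lats = np.pi - new_lats                   # flip the colatitude
newnew_lons = (np.pi + new_lons) % (2 * np.pi)   # shift and wrap the longitude

pix = hp.ang2pix(nside, newnew_lats, newnew_lons)
healpix_map = np.zeros(hp.nside2npix(nside), dtype=np.double)
healpix_map[pix] = data_interp
hp.mollview(healpix_map)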
I'm no expert in healpix (actually I've never used it before - I'm a particle physicist), but as far as I can tell it's just a matter of conventions: in a Mollweide projection, healpy places the north pole (positive latitude) at the bottom of the map, for some reason. I'm not sure why it would do that, or whether this is intentional behavior, but it seems pretty clear that's what is happening. If I mask out everything below the equator, i.e. keep only the positive-latitude points
mask = new_lats - pi/2 > 0
pix = hp.ang2pix(nside, new_lats[mask], new_lons[mask])
healpix_map = np.zeros(hp.nside2npix(nside), dtype=np.double)
healpix_map[pix] = data_interp[mask]
testmap = hp.mollview(healpix_map)
it comes up with a plot with no data above the center line:
At least it's easy enough to fix. mollview admits a rot parameter that will effectively rotate the sphere around the viewing axis before projecting it, and a flip parameter which can be set to 'astro' (default) or 'geo' to set whether east is shown at the left or right. A little experimentation shows that you get the coordinate system you want with
hp.mollview(healpix_map, rot=(180, 0, 180), flip='geo')
In the tuple, the first two elements are longitude and latitude of the point to set in the center of the plot, and the third element is the rotation. All are in degrees. With no mask it gives this:
which I believe is just what you're looking for.
