I have a mosaic tif file I made (gdalinfo output below, with some additional info on the tiles here) and have looked extensively for a function that simply returns the elevation (the z value of this mosaic) for a given lat/long. The functions I've seen want me to input the coordinates in the mosaic's own coordinate system, but I want to use lat/long. Is there something about GetGeoTransform() that I'm missing to achieve this?
This example, for instance (shown below):
from osgeo import gdal
import affine
import numpy as np
def retrieve_pixel_value(geo_coord, data_source):
    """Return floating-point value that corresponds to given point."""
    x, y = geo_coord[0], geo_coord[1]
    forward_transform = \
        affine.Affine.from_gdal(*data_source.GetGeoTransform())
    reverse_transform = ~forward_transform
    px, py = reverse_transform * (x, y)
    px, py = int(px + 0.5), int(py + 0.5)
    pixel_coord = px, py

    data_array = np.array(data_source.GetRasterBand(1).ReadAsArray())
    return data_array[pixel_coord[0]][pixel_coord[1]]
This gives me an out of bounds error, as it's likely expecting x/y coordinates (e.g. retrieve_pixel_value([153.023499, -27.468968], dataset)). I've also tried the following from here:
import rasterio
dat = rasterio.open(fname)
z = dat.read()[0]
def getval(lon, lat):
    idx = dat.index(lon, lat, precision=1E-6)
    return dat.xy(*idx), z[idx]
Is there a simple adjustment I can make so my function can query the mosaic in lat/long coords?
Much appreciated.
Driver: GTiff/GeoTIFF
Files: mosaic.tif
Size is 25000, 29460
Coordinate System is:
PROJCRS["GDA94 / MGA zone 56",
BASEGEOGCRS["GDA94",
DATUM["Geocentric Datum of Australia 1994",
ELLIPSOID["GRS 1980",6378137,298.257222101004,
LENGTHUNIT["metre",1]],
ID["EPSG",6283]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433,
ID["EPSG",9122]]]],
CONVERSION["UTM zone 56S",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",153,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",10000000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]],
ID["EPSG",17056]],
CS[Cartesian,2],
AXIS["easting",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["northing",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
Data axis to CRS axis mapping: 1,2
Origin = (491000.000000000000000,6977000.000000000000000)
Pixel Size = (1.000000000000000,-1.000000000000000)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=BAND
Corner Coordinates:
Upper Left ( 491000.000, 6977000.000) (152d54'32.48"E, 27d19'48.33"S)
Lower Left ( 491000.000, 6947540.000) (152d54'31.69"E, 27d35'45.80"S)
Upper Right ( 516000.000, 6977000.000) (153d 9'42.27"E, 27d19'48.10"S)
Lower Right ( 516000.000, 6947540.000) (153d 9'43.66"E, 27d35'45.57"S)
Center ( 503500.000, 6962270.000) (153d 2' 7.52"E, 27d27'47.16"S)
Band 1 Block=25000x1 Type=Float32, ColorInterp=Gray
NoData Value=-999
Update 1 - I tried the following:
tif = r"mosaic.tif"
dataset = rio.open(tif)
d = dataset.read()[0]
def get_xy_coords(latlng):
transformer = Transformer.from_crs("epsg:4326", dataset.crs)
coords = [transformer.transform(x, y) for x,y in latlng][0]
#idx = dataset.index(coords[1], coords[0])
return coords #.xy(*idx), z[idx]
longx,laty = 153.023499,-27.468968
coords = get_elevation([(laty,longx)])
print(coords[0],coords[1])
print(dataset.width,dataset.height)
(502321.11181384244, 6961618.891167777)
25000 29460
So something is still not right. Maybe I need to subtract the coordinates of the bottom-left corner of the image, e.g.
coords[0]-dataset.bounds.left,coords[1]-dataset.bounds.bottom
where
In [78]: dataset.bounds
Out[78]: BoundingBox(left=491000.0, bottom=6947540.0, right=516000.0, top=6977000.0)
Update 2 - Indeed, subtracting the corners of my bounding box seems to get closer... though I'm sure there is a much nicer way to do this using just the tif metadata.
import matplotlib.pyplot as plt

longx, laty = 152.94646, -27.463175
coords = get_xy_coords([(laty, longx)])
elevation = d[int(coords[1]-dataset.bounds.bottom), int(coords[0]-dataset.bounds.left)]

fig, ax = plt.subplots(figsize=(12,12))
ax.imshow(d, vmin=0, vmax=400, cmap='terrain',
          extent=[dataset.bounds.left, dataset.bounds.right, dataset.bounds.bottom, dataset.bounds.top])
ax.plot(coords[0], coords[1], 'ko')
plt.show()
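Presumably something along these lines, letting dataset.index do the map-to-pixel step on the reprojected coordinates instead of subtracting corner offsets, is the cleaner version I'm after (untested sketch):
from pyproj import Transformer
import rasterio as rio

dataset = rio.open(tif)
band = dataset.read(1)

# lon/lat (WGS84) -> the raster's CRS, then map coordinates -> row/col
transformer = Transformer.from_crs("epsg:4326", dataset.crs, always_xy=True)
x, y = transformer.transform(152.94646, -27.463175)
row, col = dataset.index(x, y)
elevation = band[row, col]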
You basically have two distinct steps:

1. Convert lon/lat coordinates to map coordinates. This is only necessary if your input raster is not already in lon/lat; map coordinates are the coordinates in the projection that the raster itself uses.
2. Convert the map coordinates to pixel coordinates.

There are all kinds of tools you might use to make things simpler (like pyproj, rasterio, etc.), but for such a simple case it's probably nice to start by doing it all in GDAL, which also helps you understand which steps are needed.
Inputs
from osgeo import gdal, osr
raster_file = r'D:\somefile.tif'
lon = 153.023499
lat = -27.468968
lon/lat to map coordinates
# fetch metadata required for transformation
ds = gdal.OpenEx(raster_file)
raster_proj = ds.GetProjection()
gt = ds.GetGeoTransform()
ds = None # close file, could also keep it open till after reading
# coordinate transformation (lon/lat to map)
# define source projection
# this definition ensures the order is always lon/lat compared
# to EPSG:4326 for which it depends on the GDAL version (2 vs 3)
source_srs = osr.SpatialReference()
source_srs.ImportFromWkt(osr.GetUserInputAsWKT("urn:ogc:def:crs:OGC:1.3:CRS84"))
# define target projection based on the file
target_srs = osr.SpatialReference()
target_srs.ImportFromWkt(raster_proj)
# convert
ct = osr.CoordinateTransformation(source_srs, target_srs)
mapx, mapy, *_ = ct.TransformPoint(lon, lat)
You could verify this intermediate result by, for example, adding it as a Point WKT in something like QGIS (using the QuickWKT plugin, and making sure the viewer uses the same projection as the raster).
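For example, a quick way to get that WKT (just a convenience print, not part of the workflow):
# paste the output into the QuickWKT plugin to check the transformed location
print("POINT ({} {})".format(mapx, mapy))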
map coordinates to pixel
# apply affine transformation to get pixel coordinates
gt_inv = gdal.InvGeoTransform(gt) # invert for map -> pixel
px, py = gdal.ApplyGeoTransform(gt_inv, mapx, mapy)
# it will return fractional pixel coordinates, so convert to int
# before using them to read. Round to nearest with +0.5
py = int(py + 0.5)
px = int(px + 0.5)
# read pixel data
ds = gdal.OpenEx(raster_file) # open file again
elevation_value = ds.ReadAsArray(px, py, 1, 1)
ds = None
The elevation_value variable should be the value you're after. I would definitely verify the result independently; try a few points in QGIS or with the gdallocationinfo utility:
gdallocationinfo -l_srs "urn:ogc:def:crs:OGC:1.3:CRS84" filename.tif 153.023499 -27.468968
# Report:
# Location: (4228P,4840L)
# Band 1:
# Value: 1804.51879882812
If you're reading a lot of points, there will be some threshold at which it would be faster to read a large chunk and extract the values from that array, compared to reading every point individually.
edit:
For applying the same workflow on multiple points at once a few things change.
So for example having the inputs:
lats = np.array([-27.468968, -27.468968, -27.468968])
lons = np.array([153.023499, 153.023499, 153.023499])
The coordinate transformation needs to use ct.TransformPoints instead of ct.TransformPoint which also requires the coordinates to be stacked in a single array of shape [n_points, 2]:
coords = np.stack([lons.ravel(), lats.ravel()], axis=1)
mapx, mapy, *_ = np.asarray(ct.TransformPoints(coords)).T
# reshape in case of non-1D inputs
mapx = mapx.reshape(lons.shape)
mapy = mapy.reshape(lons.shape)
Converting from map to pixel coordinates changes because the GDAL method for this only takes a single point. Doing it manually on the arrays would be:
px = gt_inv[0] + mapx * gt_inv[1] + mapy * gt_inv[2]
py = gt_inv[3] + mapx * gt_inv[4] + mapy * gt_inv[5]
And rounding the arrays to integer changes to:
px = (px + 0.5).astype(np.int32)
py = (py + 0.5).astype(np.int32)
If the raster (easily) fits in memory, reading all points would become:
ds = gdal.OpenEx(raster_file)
all_elevation_data = ds.ReadAsArray()
ds = None
elevation_values = all_elevation_data[py, px]
That last step could be optimized by checking highest/lowest pixel coordinates in both dimensions and only read that subset for example, but it would require normalizing the coordinates again to be valid for that subset.
The py and px arrays might also need to be clipped (e.g. with np.clip) if the input coordinates fall outside the raster; in that case the pixel coordinates will be < 0 or >= xsize/ysize.
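A sketch of that subset read combined with the clipping, continuing from the px/py arrays above:
ds = gdal.OpenEx(raster_file)
# guard against points outside the raster (they end up on the edge pixels)
px = np.clip(px, 0, ds.RasterXSize - 1)
py = np.clip(py, 0, ds.RasterYSize - 1)

# read only the window spanned by the requested points
xoff, yoff = int(px.min()), int(py.min())
xsize = int(px.max()) - xoff + 1
ysize = int(py.max()) - yoff + 1
subset = ds.GetRasterBand(1).ReadAsArray(xoff, yoff, xsize, ysize)
ds = None

# pixel coordinates have to be re-normalized to the subset window
elevation_values = subset[py - yoff, px - xoff]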
#!/usr/bin/env python3
import numpy as np
from osgeo import gdal
from osgeo import osr
# Load an array with shape (197, 250, 3)
# The third dimension contains (value, longitude, latitude)
data = np.load("data.npy")
# Copy the data and coordinates
array = data[:,:,0]
lon = data[:,:,1]
lat = data[:,:,2]
nLons = array.shape[1]
nLats = array.shape[0]
# Calculate the geotransform parameters
maxLon, minLon, maxLat, minLat = [lon.max(), lon.min(), lat.max(), lat.min()]
resLon = (maxLon - minLon) / nLons
resLat = (maxLat - minLat) / nLats
# Get the transform
geotransform = (minLon, resLon, 0, maxLat, 0, -resLat)
# Create the output raster
output_raster = gdal.GetDriverByName('GTiff').Create('myRaster.tif', nLons, nLats, 1,
gdal.GDT_Int32)
# Set the geotransform
output_raster.SetGeoTransform(geotransform)
srs = osr.SpatialReference()
# Set to world projection 4326
srs.ImportFromEPSG(4326)
output_raster.SetProjection(srs.ExportToWkt())
output_raster.GetRasterBand(1).WriteArray(array)
output_raster.FlushCache()
The code above is meant to georeference a raster using GDAL, but it returns blank tiff files. I have vetted the data and variables; I suspect, however, that the problem comes from the geotransform variables. The documentation demands the variable to be:
top-left-x, w-e-pixel-resolution, 0,
top-left-y, 0, n-s-pixel-resolution (negative value)
I have used lats and lons, but I'm not sure which one corresponds to x and which to y. It could be something else, but I'm not quite sure.
Overall your approach looks correct to me, though it's hard to tell without seeing the data you're using. Here are some points to consider:
First, there's a difference between the output file being empty and/or being in the wrong location; georeferencing relates only to the latter.
When working interactively, you should also make sure to properly close the Dataset using output_raster = None, which will also trigger flushing for you.
You could start by testing whether GDAL reads back the same data that you intended to write, using something like:
ds = gdal.Open('myRaster.tif')
data_from_disk = ds.ReadAsArray()
ds = None
np.testing.assert_array_equal(data_from_disk, array)
If those are not identical, it could be an issue with the datatype, like writing floats close to 0 as integers, causing them to clip to 0 and giving the appearance of an "empty" file.
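For example, if the values are floats, creating the file with a float datatype instead of gdal.GDT_Int32 (a guess at the likely fix, not something confirmed in the question) avoids small values truncating to zero:
# same Create call as in the question, just with Float32 instead of Int32
output_raster = gdal.GetDriverByName('GTiff').Create('myRaster.tif', nLons, nLats, 1,
                                                     gdal.GDT_Float32)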
Regarding the georeferencing: the projection you use has its coordinates in degrees. If yours are in radians, your output ends up close to Null Island.
Your approach also assumes that the data and lat/lon arrays are on a regular grid (having a constant resolution). That might not be the case (especially if the data comes with a 2D grid of coordinates).
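A quick way to check that assumption on the question's lon/lat arrays (just a sketch):
# the grid is regular only if the spacing is constant along each axis
assert np.allclose(np.diff(lon, axis=1), lon[0, 1] - lon[0, 0])
assert np.allclose(np.diff(lat, axis=0), lat[1, 0] - lat[0, 0])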
Often when coordinate arrays are given, they are defined as valid for the center of the pixel, whereas GDAL's geotransform is defined for the (outer) edge of the pixel. So you might need to account for that by subtracting half the resolution. This also impacts your calculation of the resolution, which for the center definition should probably use / (nLons-1) and / (nLats-1). Or alternatively verify with:
# for a regular grid
resLon = lon[0,1] - lon[0,0]
resLat = lat[1,0] - lat[0,0]
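Accounting for the center-versus-edge definition would then shift the origin by half a pixel. A sketch, assuming north-up data (latitude decreasing along the rows) with pixel-center coordinates:
resLon = lon[0, 1] - lon[0, 0]          # positive, west -> east
resLat = lat[0, 0] - lat[1, 0]          # positive magnitude, north -> south
geotransform = (
    lon[0, 0] - resLon / 2,  # x of the outer edge of the top-left pixel
    resLon, 0,
    lat[0, 0] + resLat / 2,  # y of the outer edge of the top-left pixel
    0, -resLat,
)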
When I run your snippet with some dummy data, it gives me a correct output (ignoring the center/edge issue mentioned above).
lat, lon = np.mgrid[89:-90:-2, -179:180:2]
array = np.sqrt(lon**2 + lat**2).astype(np.int32)
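And for reference, in this setup x corresponds to longitude and y to latitude, so the geotransform from the question maps out as:
geotransform = (
    minLon,   # top-left x: longitude of the west edge
    resLon,   # west-east pixel resolution (degrees per pixel)
    0,        # row rotation (0 for north-up)
    maxLat,   # top-left y: latitude of the north edge
    0,        # column rotation (0 for north-up)
    -resLat,  # north-south pixel resolution, negative for north-up
)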
I would like to create buffered polygons around locations (towns, villages, etc.) in order to use them for searching within a radius.
This is what I would like to achieve (units for illustration):
This is how I do it in pyclipper:
import pyclipper
coordinates = # Array of lat,lng tuples
clipper_offset = pyclipper.PyclipperOffset()
coordinates = pyclipper.scale_to_clipper(coordinates)
clipper_offset.AddPath(coordinates, pyclipper.JT_ROUND,
pyclipper.ET_CLOSEDPOLYGON)
scaled_coordinates = clipper_offset.Execute(1000.0)
scaled_coordinates = pyclipper.scale_from_clipper(scaled_coordinates)
The number 1000.0 is arbitrary, and my question is: how do I calculate the right offset for the Execute method so that the offset polygon approximately represents a 10, 20, or 50 km radius?
Btw, is this the right approach to this problem?
The way that worked for me:
import pyclipper
coordinates = [(198,362),(220,330),(282,372),(260,404)] # Array of lat,lng tuples
clipper_offset = pyclipper.PyclipperOffset()
coordinates_scaled = pyclipper.scale_to_clipper(coordinates)
clipper_offset.AddPath(coordinates_scaled, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
new_coordinates = clipper_offset.Execute(pyclipper.scale_to_clipper(10))
new_coordinates_scaled = pyclipper.scale_from_clipper(new_coordinates)
I mean:
new_coordinates = clipper_offset.Execute(pyclipper.scale_to_clipper(10))
10, or any other value that you need.
It's based on baji's answer, which didn't work for me as written [scale_to_clipper(x) should be pyclipper.scale_to_clipper(x)].
You also need to scale the offset you pass to the clipper Execute function:
scaled_coordinates = clipper_offset.Execute(scale_to_clipper(1000.0))
pyclipper scales the input floats to 64-bit integers:
http://www.angusj.com/delphi/clipper/documentation/Docs/Overview/Rounding.htm
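In other words, the offset distance has to live in the same scaled integer space as the coordinates. A small illustration, assuming pyclipper's default scale factor of 2**31:
import pyclipper

# scale_to_clipper multiplies by the scale factor so float coordinates become
# large integers; an unscaled offset would be interpreted as a tiny fraction
# of a coordinate unit
print(pyclipper.scale_to_clipper(1000.0))           # 2147483648000
print(pyclipper.scale_from_clipper(2147483648000))  # 1000.0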
I am totally new to Python and I am totally lost.
My supervisor helped me to generate a script to see some slices of a 3D velocity model:
import numpy as np
import matplotlib.pyplot as plt
import yt
from yt.units import km
# Import and reshape data
d = np.genfromtxt('velocity_model.txt', delimiter=' ')
nd=22
nx=131
vel = d[:,3].reshape(nd,nx,nx)
lat = d[:,0].reshape(nd,nx,nx)
lon = d[:,1].reshape(nd,nx,nx)
dep = d[:,2].reshape(nd,nx,nx)
# When this is read into YT, depth increases along x axis, longitude increases along y axis and latitude increases along z axis, need to swap x and z and then flip z
dep=dep.swapaxes(0,2) # swap first and third dimensions: gives lon (x), lat (y), depth (z)
vel=vel.swapaxes(0,2) # swap first and third dimensions:
lat=lat.swapaxes(0,2) # swap first and third dimensions:
lon=lon.swapaxes(0,2) # swap first and third dimensions:
dep=dep[:,:,::-1] # reverse z direction
vel=vel[:,:,::-1] # reverse z direction
lat=lat[:,:,::-1] # reverse z direction
lon=lon[:,:,::-1] # reverse z direction
xmin=0
xmax=289
ymin=0
ymax=289
zmin=-100
zmax=5
# Load into yt
data=dict(velocity=(vel,'km/s'),latitude=(lat,'deg'),longitude=(lon,'deg'),depth=(dep,'km'))
bbox = np.array([[xmin,xmax], [ymin,ymax], [zmin,zmax]])
ds=yt.load_uniform_grid(data,vel.shape, length_unit='km', bbox=bbox)
#Off-Axis Slice
for key in ['latitude', 'longitude', 'depth', 'velocity']:
    L = [0, 0, 1]  # cutting plane = z
    slicepos = -50
    c = [(xmax-xmin)/2, (ymax-ymin)/2, slicepos]
    cut = yt.SlicePlot(ds, L, key, origin='native', center=c)  #, width=(200,90,'km'))
    cut.set_log(key, False)
    cut.annotate_text([0.5, 0.9], 'z={:d} km'.format(slicepos), coord_system='axis')
    cut.set_cmap(field='velocity', cmap='jet_r')
    cut.save()
With this script, I would like to fix the colorbar, because it changes for each image, which makes the slices hard to compare.
I tried to add limits like this:
h=colorbar
h.Limits = [5 9]
cut.set_cmap(field='velocity',cmap='jet_r', h)
But that's not the right way. Does someone have an idea? I've seen lots of things, but not for cmap.
You're looking for the set_zlim function:
http://yt-project.org/doc/reference/api/generated/yt.visualization.plot_window.AxisAlignedSlicePlot.set_zlim.html
The set_cmap function just allows you to choose which colormap you want, it does not allow you to set the colormap range. You need to use set_zlim for that. Here's an example, using one of the sample datasets from http://yt-project.org/data:
import yt
ds = yt.load('IsolatedGalaxy/galaxy0030/galaxy0030')
plot = yt.SlicePlot(ds, 2, 'density')
plot.set_cmap('density', 'viridis')
plot.set_zlim('density', 1e-28, 1e-25)
This produces the following image:
This is really a question about the yt visualization library rather than matplotlib per se - I've edited the title and tags to reflect this.
I have never come across yt before, but based on the official documentation for yt.SlicePlot, it seems that cut will either be an AxisAlignedSlicePlot or an OffAxisSlicePlot object. Both of these classes have a .set_zlim() method that seems to do what you want:
AxisAlignedSlicePlot.set_zlim(*args, **kwargs)
    Set the scale of the colormap.
    Parameters:
        field : string
            the field to set a colormap scale; if field == 'all', applies to all plots.
        zmin : float
            the new minimum of the colormap scale. If 'min', will set to the minimum value in the current view.
        zmax : float
            the new maximum of the colormap scale. If 'max', will set to the maximum value in the current view.
    Other Parameters:
        dynamic_range : float (default: None)
            The dynamic range of the image. If zmin == None, will set zmin = zmax / dynamic_range. If zmax == None, will set zmax = zmin * dynamic_range. When dynamic_range is specified, defaults to setting zmin = zmax / dynamic_range.
In other words, you could probably use:
cut.set_zlim(field='velocity', zmin=5, zmax=9)
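Applied to the loop in the question, it would sit right next to the existing set_cmap call (5 and 9 being the limits from the question, in km/s):
cut.set_cmap(field='velocity', cmap='jet_r')
cut.set_zlim('velocity', 5, 9)  # keep the colorbar range fixed for every slice
cut.save()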
I have a data grid where the rows represent theta (0, pi) and the columns represent phi (0, 2*pi) and where f(theta,phi) is the density of dark matter at that location. I wanted to calculate the power spectrum for this and have decided to use healpy.
What I cannot understand is how to format my data for healpy to use. If someone could provide code (in Python, for obvious reasons) or point me to a tutorial, that would be great! I have tried my hand at it with the following code:
import numpy as np
import healpy as hp

# grid dimensions are Nrows*Ncols (subject to change)
theta = np.linspace(0, np.pi, num=grid.shape[0])[:, None]
phi = np.linspace(0, 2*np.pi, num=grid.shape[1])
nside = 512

print("Pixel area: %.2f square degrees" % hp.nside2pixarea(nside, degrees=True))

pix = hp.ang2pix(nside, theta, phi)
healpix_map = np.zeros(hp.nside2npix(nside), dtype=np.double)
healpix_map[pix] = grid
But when I try to execute the code to compute the power spectrum, specifically:
cl = hp.anafast(healpix_map[pix], lmax=1024)
I get this error:
TypeError: bad number of pixels
If anyone could point me to a good tutorial or help edit my code, that would be great.
More specifications:
my data is in a 2d np array and I can change the numRows/numCols if I need to.
Edit:
I have solved this problem by first changing the argument of anafast to healpix_map.
I also improved the spacing by making Nrows*Ncols = 12*nside*nside.
But my power spectrum is still giving errors. If anyone has links to good documentation/tutorials on how to calculate the power spectrum (conditions on the theta/phi args), that would be incredibly helpful.
There you go, hope it's what you're looking for. Feel free to comment with questions :)
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
# Set the number of sources and the coordinates for the input
nsources = int(1.e4)
nside = 16
npix = hp.nside2npix(nside)
# Coordinates and the density field f
thetas = np.random.random(nsources) * np.pi
phis = np.random.random(nsources) * np.pi * 2.
fs = np.random.randn(nsources)
# Go from HEALPix coordinates to indices
indices = hp.ang2pix(nside, thetas, phis)
# Initiate the map and fill it with the values
hpxmap = np.zeros(npix, dtype=float)
for i in range(nsources):
hpxmap[indices[i]] += fs[i]
# Inspect the map
hp.mollview(hpxmap)
Since the map above contains nothing but noise, the power spectrum should just contain shot noise, i.e. be flat.
# Get the power spectrum
Cl = hp.anafast(hpxmap)
plt.figure()
plt.plot(Cl)
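For the regular (theta, phi) grid from the question, the same construction could look roughly like this; the bin-averaging and the nside choice are illustrative assumptions, not tested against the original data:
import numpy as np
import healpy as hp

# grid is the 2D density array from the question (rows = theta, cols = phi)
nside = 64                    # pick nside so that 12 * nside**2 is comparable to grid.size
npix = hp.nside2npix(nside)

theta = np.linspace(0, np.pi, grid.shape[0])
phi = np.linspace(0, 2 * np.pi, grid.shape[1])
tt, pp = np.meshgrid(theta, phi, indexing='ij')

pix = hp.ang2pix(nside, tt.ravel(), pp.ravel())

# average all grid cells that land in the same HEALPix pixel
sums = np.bincount(pix, weights=grid.ravel(), minlength=npix)
counts = np.bincount(pix, minlength=npix)
hpxmap_grid = sums / np.maximum(counts, 1)   # pixels with no samples stay at 0

cl_grid = hp.anafast(hpxmap_grid, lmax=2 * nside)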
There is a faster way to do the map initialization using numpy.add.at, following this answer.
This is several times faster on my machine as compared to the first section of Daniel's excellent answer:
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
# Set the number of sources and the coordinates for the input
nsources = int(1e7)
nside = 64
npix = hp.nside2npix(nside)
# Coordinates and the density field f
thetas = np.random.uniform(0, np.pi, nsources)
phis = np.random.uniform(0, 2*np.pi, nsources)
fs = np.random.randn(nsources)
# Go from HEALPix coordinates to indices
indices = hp.ang2pix(nside, thetas, phis)
# Baseline, from Daniel Lenz's answer:
# time: ~5 s
hpxmap1 = np.zeros(npix, dtype=float)
for i in range(nsources):
hpxmap1[indices[i]] += fs[i]
# Using numpy.add.at
# time: ~0.6 ms
hpxmap2 = np.zeros(npix, dtype=float)
np.add.at(hpxmap2, indices, fs)
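As a quick sanity check, the two approaches accumulate the same values and should produce the same map:
np.testing.assert_allclose(hpxmap1, hpxmap2)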