I am creating a NetCDF4 file which currently has four variables:
1) Land Surface Temperature (3D array - time, latitude, longitude)
2) Longitude (1D - coordinate of each pixel centre)
3) Latitude (1D - coordinate of each pixel centre)
4) Time (time of image acquisition in hours since 1900-01-01 00:00:00)
I am currently using the following code to do this:
import numpy as np
import netCDF4 as nc
from datetime import datetime

#==========================WRITE THE NETCDF FILE==========================#
newfile = nc.Dataset(export_filename, 'w', format = 'NETCDF4_CLASSIC')
#==========================SET FILE DIMENSIONS============================#
newfile.createDimension('lat', ny)
newfile.createDimension('lon', nx)
newfile.createDimension('time', len(filenames))
#==========================SET GLOBAL ATTRIBUTES==========================#
newfile.title = ('Title')
newfile.history = "File created on " + datetime.strftime(datetime.today(), "%c")
newfile.Conventions = 'CF-1.6'
#==========================CREATE DATA VARIABLES==========================#
#--------------------------LST VARIABLE-----------------------------------#
LSTs = newfile.createVariable('LST', np.int16, ('time', 'lat', 'lon'), fill_value = -8000)
LSTs.units = 'Degrees C'
LSTs.add_offset = 273.15
LSTs.scale_factor = 0.01
LSTs.standard_name = 'LST'
LSTs.long_name = 'Land Surface Temperature'
LSTs.grid_mapping = 'latitude_longitude'
LSTs.coordinates = 'lon lat'
LSTs[:] = LSTd[:]
#--------------------------LON AND LAT AND TIME--------------------------#
LONGITUDEs = newfile.createVariable('LONGITUDE', np.float64, ('lon',))
LONGITUDEs.units = 'Decimal Degrees East'
LONGITUDEs.standard_name = 'Longitude'
LONGITUDEs.long_name = 'Longitude'
LONGITUDEs[:] = LONd[:]
LATITUDEs = newfile.createVariable('LATITUDE', np.float64, ('lat',))
LATITUDEs.units = 'Decimal Degrees North'
LATITUDEs.standard_name = 'Latitude'
LATITUDEs.long_name = 'Latitude'
LATITUDEs[:] = LATd[:]
TIMEs = newfile.createVariable('TIME', np.int32, ('time',))
TIMEs.units = 'hours since 1900-01-01 00:00:00'
TIMEs.standard_name = 'Time'
TIMEs.long_name = 'Time of Image Acquisition'
TIMEs.axis = 'T'
TIMEs.calendar = 'gregorian'
TIMEs[:] = time[:]
#--------------------------SAVE THE FILE---------------------------------#
newfile.close()
This code produces a netCDF file in which the land surface temperature variable has 24 bands (one for each hour of the day). The code works as I intended, albeit with one small problem that I wish to address. When I run gdalinfo on the LST variable, I get (this is a reduced version):
Band 1.....
...
NETCDF_DIM_TIME = 1
...
I want this value of 1 to be set to the corresponding value of the 'time' variable (something like 1081451 hours since 1900-01-01 00:00:00), which I have included in my code above. How can this be changed for each band in the file?
UPDATE TO QUESTION: When I run gdalinfo on the file (again, a subset), I get:
NETCDF_DIM_EXTRA={time}
NETCDF_DIM_time_DEF={24,3}
but the 'NETCDF_DIM_time_VALUES' entry is missing; I need to set it to the time variable and then it should work. How do I do this?
At present it is just set to the band number, but I want it to contain the hour of acquisition.
UPDATE 1:
I have tried to specify
LSTs.NETCDF_DIM_Time = time
during the netCDF file creation, and this assigns all of the values from time to NETCDF_DIM_TIME in GDAL, so that each band has 24 time values rather than just one.
UPDATE 2:
With some further digging, I think it is the NETCDF_DIM_time_VALUES metadata that needs to be set to the 'time' variable. I have updated my question to ask how to do this.
The variables associated with the dimensions should have the same names as the dimensions. So in your code above, replace the createVariable line for time with:
TIMEs = newfile.createVariable('time', np.int32, ('time',))
Now gdalinfo knows where to find the data. I ran your code using dummy times [1000000, 1000024] and gdalinfo returns:
Band1...
...
NETCDF_DIM_time=1000000
...
Band2...
...
NETCDF_DIM_time=1000024
...
To answer your title question: you can't assign values to a dimension, but you can have a variable with the same name as the dimension that holds the data/values associated with that dimension. Readers of netCDF files, like GDAL, look for conventions like this to interpret the data. See, for example, the 'Coordinate Systems' section of Unidata's 'Writing NetCDF Files: Best Practices'.
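For reference, here is a minimal, self-contained sketch of that pattern using dummy values (not the poster's data): each coordinate variable carries the same name as its dimension, which is what gdalinfo keys on.
import numpy as np
import netCDF4 as nc

newfile = nc.Dataset('example.nc', 'w', format='NETCDF4_CLASSIC')
newfile.createDimension('time', 24)
newfile.createDimension('lat', 10)
newfile.createDimension('lon', 20)
# Coordinate variables: same names as the dimensions they describe
times = newfile.createVariable('time', np.int32, ('time',))
times.units = 'hours since 1900-01-01 00:00:00'
times.calendar = 'gregorian'
times[:] = 1000000 + np.arange(24)        # dummy acquisition hours
lats = newfile.createVariable('lat', np.float64, ('lat',))
lats.units = 'degrees_north'
lats[:] = np.linspace(50.0, 59.0, 10)     # dummy pixel-centre latitudes
lons = newfile.createVariable('lon', np.float64, ('lon',))
lons.units = 'degrees_east'
lons[:] = np.linspace(-10.0, 9.0, 20)     # dummy pixel-centre longitudes
newfile.close()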
Related
I have 30 .nc files with dimensions time, lat, lon and pres; coordinates time, lat, lon and pres; and 18 different variables. I would like to open them as one dataset.
If I use xarray.open_mfdataset(), it won't work, as the pres dimensions are all different.
Is there any way to open them as one anyway?
This is what I tried, and the error I got:
import xarray as xr
from glob import glob

path = 'Data/*.nc'
ds = xr.open_mfdataset(paths=sorted(glob(path)),
                       chunks={"time_counter": 1, "y": 100, "x": 100},
                       coords='all', combine='by_coords')
ValueError: Resulting object does not have monotonic global indexes along dimension PRES
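One possible direction, sketched below under the assumption that the dimension really is named pres and that every file can reasonably be interpolated onto a shared pressure grid (the levels here are made up), is to regularise pres in a preprocess step so the global index becomes consistent:
import numpy as np
import xarray as xr
from glob import glob

# Hypothetical common pressure levels every file is interpolated onto
common_pres = np.linspace(1000.0, 10.0, 50)

def to_common_pres(ds):
    # Put each dataset on the same pres axis before combining
    return ds.interp(pres=common_pres)

ds = xr.open_mfdataset(sorted(glob('Data/*.nc')),
                       preprocess=to_common_pres,
                       combine='by_coords')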
#!/usr/bin/env python3
import numpy as np
from osgeo import gdal
from osgeo import osr
# Load an array with shape (197, 250, 3)
# Data with dim of 3 contain (value, longitude, latitude)
data = np.load("data.npy")
# Copy the data and coordinates
array = data[:,:,0]
lon = data[:,:,1]
lat = data[:,:,2]
nLons = array.shape[1]
nLats = array.shape[0]
# Calculate the geotransform parameters
maxLon, minLon, maxLat, minLat = [lon.max(), lon.min(), lat.max(), lat.min()]
resLon = (maxLon - minLon) / nLons
resLat = (maxLat - minLat) / nLats
# Get the transform
geotransform = (minLon, resLon, 0, maxLat, 0, -resLat)
# Create the output raster
output_raster = gdal.GetDriverByName('GTiff').Create('myRaster.tif', nLons, nLats, 1,
gdal.GDT_Int32)
# Set the geotransform
output_raster.SetGeoTransform(geotransform)
srs = osr.SpatialReference()
# Set to world projection 4326
srs.ImportFromEPSG(4326)
output_raster.SetProjection(srs.ExportToWkt())
output_raster.GetRasterBand(1).WriteArray(array)
output_raster.FlushCache()
The code above is meant to georeference a raster using GDAL, but it returns blank TIFF files. I have vetted the data and variables; I suspect, however, that the problem comes from the geotransform variables. The documentation requires the tuple to be:
top-left-x, w-e-pixel-resolution, 0,
top-left-y, 0, n-s-pixel-resolution (negative value)
I have used lats and lons, but I'm not sure which one corresponds to x and which to y. It could be something else, but I'm not quite sure.
Overall your approach looks correct to me, though it's hard to tell without seeing the data you're using. Here are some points to consider:
First, there's a difference between the output file being empty and it being in the wrong location; georeferencing relates only to the latter.
When working interactively, you should also make sure to properly close the Dataset using output_raster = None; that will also trigger flushing for you.
You could start by testing whether GDAL reads back the same data that you intended to write, using something like:
ds = gdal.Open('myRaster.tif')
data_from_disk = ds.ReadAsArray()
ds = None
np.testing.assert_array_equal(data_from_disk, array)
If those are not identical, it could be an issue with the datatype, like writing floats close to 0 as integers, causing them to clip to 0 and giving the appearance of an "empty" file.
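If the values are indeed floats, a sketch of avoiding that truncation (only the datatype from the snippet above is changed; everything else stays the same) would be:
# Create the band as Float32 so small fractional values are not truncated to 0
output_raster = gdal.GetDriverByName('GTiff').Create('myRaster.tif', nLons, nLats, 1,
                                                     gdal.GDT_Float32)
output_raster.GetRasterBand(1).WriteArray(array.astype(np.float32))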
Regarding the georeferencing, the projection you use expects the coordinates in degrees. If yours are in radians, your output will end up close to null island.
Your approach also assumes that the data and lat/lon arrays are on a regular grid (having a constant resolution). That might not be the case (especially if the data comes with a 2D grid of coordinates).
Often when coordinate arrays are given, they are defined as valid for the center of the pixel, whereas GDAL's geotransform is defined for the (outer) edge of the pixel. So you might need to account for that by subtracting half the resolution. This also affects your calculation of the resolution, which for the center definition should probably use / (nLons-1) and / (nLats-1). Alternatively, verify with:
# for a regular grid
resLon = lon[0,1] - lon[0,0]
resLat = lat[1,0] - lat[0,0]
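Putting those two points together, a sketch of the center-to-edge adjustment (assuming the lon/lat arrays hold pixel centres on a regular grid) looks like:
# Centre-to-centre spacing: nLons centres span (nLons - 1) intervals
resLon = (maxLon - minLon) / (nLons - 1)
resLat = (maxLat - minLat) / (nLats - 1)
# Shift the origin from the centre of the top-left pixel to its outer corner
geotransform = (minLon - resLon / 2, resLon, 0,
                maxLat + resLat / 2, 0, -resLat)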
When I run your snippet with some dummy data, it gives me a correct output (ignoring the center/edge issue mentioned above):
lat, lon = np.mgrid[89:-90:-2, -179:180:2]
array = np.sqrt(lon**2 + lat**2).astype(np.int32)
I am working with Sentinel-3 SLSTR data, which comes in netCDF format. The file contains 11 bands:
S1-S6 (500 m resolution), plus S7-S9 and F1 and F2 (1000 m resolution). S1-S6 contain radiance values and S7-S9 contain brightness temperature values. Right now, I want to resample my S7-S9 bands to 500 m resolution to match the S1-S6 bands.
I am using xarray to read the netCDF files. There is a function xarray.Dataset.resample(), but the documentation says it resamples to a new temporal resolution.
I also tried to resample using GDAL but couldn't get any result:
import gdal
import xarray as xr
import matplotlib.pyplot as plt
data = xr.open_dataset('S7_BT_in.nc') # one of the files in 1000 m resolution
geo = xr.open_dataset(path+'geodetic_an.nc') # file containing the geodetic values
ds = data['S7_BT_in'] # fetching variable I need to work on
lat = geo['latitude_an'] # fetching latitude values
lon = geo['longitude_an'] # fetching longitude values
#assigning latitude and longitude values to the coordinates of ds
ds = ds.assign_coords(coords = {'Latitude': lat, 'Longitude': lon})
x = gdal.Open('ds') # Opening the netCDF file using gdal
# resampling the data to 500 m resolution
xreproj = gdal.Warp('resampled.nc', x, xRes = 500, yRes = 500)
This is the error I am getting:
SystemError: <built-in function wrapper_GDALWarpDestName> returned NULL without setting an error.
I also tried opening the file directly using gdal, but I still get the same error.
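For what it's worth, the immediate error is that gdal.Open() expects a filename or subdataset string, not the name of an xarray object, so the literal 'ds' above cannot be opened. A sketch of opening the variable as a netCDF subdataset (the variable name here is assumed from the file) would be:
from osgeo import gdal

# Open the brightness-temperature variable via GDAL's NETCDF subdataset syntax
x = gdal.Open('NETCDF:"S7_BT_in.nc":S7_BT_in')
# Warp can only resample if the subdataset carries usable georeferencing;
# xRes/yRes are in the units of the source SRS, not automatically metres
if x is not None:
    xreproj = gdal.Warp('resampled.tif', x, xRes=500, yRes=500)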
I am a MATLAB user trying to use Python more for my computations recently. I am using xarray and would like to change the longitude array of a geophysical field from 0-360 to -180 to 180. But when I do that:
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

df = xr.open_dataset('ecmwf_winds.nc')
u10 = df['u10']
lon = df['longitude']
lon = np.where(lon > 180, lon - 360, lon)
[X, Y] = np.meshgrid(lon, df.latitude)
plt.contourf(X, Y, u10)
the contour plot turns out to be messy, with gaps, which does not make sense. Can anyone please help me with it? I am not sure what I am doing wrong.
Another, much simpler and faster approach that avoids where would be:
df.coords['lon'] = (df.coords['lon'] + 180) % 360 - 180
df = df.sortby(df.lon)
Tip: for quick plotting, you can use xarray's built-in plotting functions, so you won't have to create a meshgrid.
df.u10.plot()
#or
df.u10.plot.contourf()
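Applied to the dataset from the question (assuming the coordinate is named 'longitude' as in the original snippet), that would look like:
import xarray as xr
import matplotlib.pyplot as plt

df = xr.open_dataset('ecmwf_winds.nc')
# Wrap longitudes into [-180, 180) and sort so the axis is monotonic
df.coords['longitude'] = (df.coords['longitude'] + 180) % 360 - 180
df = df.sortby('longitude')
df.u10.plot()
plt.show()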
You need to assign the values as you've done and then also sort the resulting DataArray along the new coordinate values:
lon_name = 'longitude' # whatever name is in the data
# Adjust lon values to make sure they are within (-180, 180)
ds['_longitude_adjusted'] = xr.where(
    ds[lon_name] > 180,
    ds[lon_name] - 360,
    ds[lon_name])
# reassign the new coords as the main lon coords
# and sort the DataArray using the new coordinate values
ds = (
    ds
    .swap_dims({lon_name: '_longitude_adjusted'})
    .sel(**{'_longitude_adjusted': sorted(ds._longitude_adjusted)})
    .drop(lon_name))
ds = ds.rename({'_longitude_adjusted': lon_name})
I have been trying to figure out how to convert a set of AltAz coordinates into equatorial coordinates, and so far all I have been able to find is how to convert equatorial into AltAz (but not the reverse) using the following method:
from astropy.coordinates import SkyCoord, EarthLocation, AltAz
from astropy.time import Time
import astropy.units as u

c = SkyCoord('22h50m0.19315s', '+24d36m05.6984s', frame='icrs')
loc = EarthLocation(lat = 31.7581*u.deg, lon = -95.6386*u.deg, height = 147*u.m)
time = Time('1991-06-06 12:00:00')
cAltAz = c.transform_to(AltAz(obstime = time, location = loc))
However, now I want to rotate the azimuth of cAltAz by some increment and figure out what the corresponding equatorial coordinates of the new point are.
i.e. I want something like this:
newAltAzcoordiantes = SkyCoord(alt = cAltAz.alt.deg, az = cAltAz.az.deg + x*u.deg, obstime = time, frame = 'altaz')
newAltAzcoordiantes.transform_to(ICRS)
The problem I'm having, though, is that it does not seem like I can build a SkyCoord object in the AltAz coordinate system.
I hope that is clear enough; this is my first time posting on Stack Overflow.
I don't know much about astronomy, but it seems like there's plenty of documentation:
transforming between coordinates
full API docs for coordinates
For me, cAltAz.icrs works and returns the original c. I needed to tweak a few things to make it work on newAltAzcoordiantes (define x, stop calling .deg on attributes that already have units, and add the location):
>>> x = 5 # why not?
>>> newAltAzcoordiantes = SkyCoord(alt = cAltAz.alt, az = cAltAz.az + x*u.deg, obstime = time, frame = 'altaz', location = loc)
>>> newAltAzcoordiantes.icrs
<SkyCoord (ICRS): (ra, dec) in deg
(341.79674062, 24.35770826)>