How to write manipulated raster values to ASCII grid with GDAL? - python

I am trying to manipulate raster values in a grid (ASCII Grid) with GDAL, but before I can proceed with that I am having trouble writing the new values back to the file. I get these error messages when slopeband.WriteArray(s) is called:
ERROR 6: slope.asc, band 1: WriteBlock() not supported for this dataset.
ERROR 1: slope.asc, band 1: An error occured while writing a dirty block
I'm sorry if this is very basic, but I'm still new to Python and to GDAL in particular. I'm using GDAL 1.9.0 on Mac OS X 10.6.8 with Python 2.7. Thank you!
import numpy
import gdal
import gdalconst
dgm = gdal.Open("DGM_10_MR.asc", gdalconst.GA_ReadOnly)
driver = dgm.GetDriver()
geotransform = dgm.GetGeoTransform()
band = dgm.GetRasterBand(1)
data = band.ReadAsArray()
cols = dgm.RasterXSize
rows = dgm.RasterYSize
slope = driver.CreateCopy("slope.asc", dgm)
slope = None
dgm = None
slope = gdal.Open("slope.asc", gdalconst.GA_Update)
slope.SetGeoTransform(geotransform)
slopeband = slope.GetRasterBand(1)
s = slopeband.ReadAsArray()
for y in range(rows):
    for x in range(cols):
        s[y, x] = 0.0
slopeband.WriteArray(s)
slopeband.FlushCache()
del s
dgm = None
slope = None
print "done"

Unfortunately, GDAL cannot read and write to the same degree across all file types. Arc ASCII Grid happens to be one of the formats that GDAL cannot write back to. As your error message says, WriteBlock() not supported for this dataset, so you can't update Arc ASCII grids in place.
As an alternative, you could convert your existing ASCII dataset to a different format, one that GDAL supports more fully, such as GeoTIFF. To convert between formats, you could use the gdal_translate command-line program like so:
gdal_translate -of GTiff DGM_10_MR.asc DGM_10_MR.tif
I was able to reproduce your errors on my computer, and simply changing the file type fixes them.
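If you would rather do the whole conversion in Python, here is a minimal sketch of the same idea using the file names from your script (untested here, but GeoTIFF supports update mode, so WriteArray() works there):
import gdal
import gdalconst
# Copy the ASCII grid into a GeoTIFF, which the GTiff driver can update in place.
src = gdal.Open("DGM_10_MR.asc", gdalconst.GA_ReadOnly)
gtiff = gdal.GetDriverByName("GTiff")
dst = gtiff.CreateCopy("slope.tif", src)
dst = None
src = None
# Re-open the copy in update mode and write the manipulated values.
slope = gdal.Open("slope.tif", gdalconst.GA_Update)
band = slope.GetRasterBand(1)
s = band.ReadAsArray()
s[:, :] = 0.0  # whatever manipulation you need
band.WriteArray(s)
band.FlushCache()
slope = None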

Related

Trying to crop image and save dicom with pydicom

I am trying to load a chest x-ray DICOM file that has JPEG2000 compression, extract the pixel array, crop it, and then save it as a new DICOM file. I tried this on Windows 10 and macOS machines but get similar errors. I am running Python 3.6.13, GDCM 2.8.0, OpenJPEG 2.3.1, and Pillow 8.1.2 in a conda environment (OpenJPEG and GDCM were installed before Pillow and pydicom).
My initial code:
file_list = [f.path for f in os.scandir(basepath)]
ds = pydicom.dcmread(file_list[0])
arr = ds.pixel_array
arr = arr[500:1500,500:1500]
ds.Rows = arr.shape[0]
ds.Columns = arr.shape[1]
ds.PixelData = arr.tobytes()
outputpath = os.path.join(basepath, "test.dcm")
ds.save_as(outputpath)
Subsequent error: ValueError: With tag (7fe0, 0010) got exception: (7FE0,0010) Pixel Data has an undefined length indicating that it's compressed, but the data isn't encapsulated as required. See pydicom.encaps.encapsulate() for more information
I then tried modifying the ds.PixelData line to ds.PixelData = pydicom.encaps.encapsulate([arr.tobytes()]), which creates the .dcm without error, but when I open the .dcm to view it, it doesn't show any image (all black).
My next attempt was to see if I needed to somehow compress back to JPEG2000, so I attempted:
arr = Image.fromarray(arr)
output = io.BytesIO()
arr.save(output, format='JPEG2000')
but then I get the error OSError: encoder jpeg2k not available. I also tried format='JPEG', but then it tells me OSError: cannot write mode I;16 as JPEG ...
Any help much appreciated!
I was able to get this working by using the imagecodecs library and its jpeg2k_encode function. One potential pitfall is that you need to .copy() the array to meet the function's C-contiguous requirement, which you can confirm by checking arr_crop.flags if you need to. Here is the updated code that worked best for me:
import os
import numpy as np
import matplotlib.pyplot as plt
import pydicom
from pydicom.encaps import encapsulate
from pydicom.uid import JPEG2000
from imagecodecs import jpeg2k_encode
file_list = [f.path for f in os.scandir(basepath)]
ds = pydicom.dcmread(file_list[0])
arr = ds.pixel_array
# Need to .copy() to meet jpeg2k_encode's C contiguous requirement
arr_crop = arr[500:1500,500:1500].copy()
# jpeg2k_encode to perform JPEG2000 compression
arr_jpeg2k = jpeg2k_encode(arr_crop)
# convert from bytearray to bytes before saving to PixelData
arr_jpeg2k = bytes(arr_jpeg2k)
ds.Rows = arr_crop.shape[0]
ds.Columns = arr_crop.shape[1]
ds.PixelData = encapsulate([arr_jpeg2k])
outputpath = os.path.join(basepath, "test.dcm")
ds.save_as(outputpath)
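As an optional sanity check (my addition, not part of the original workflow), you can re-read the saved file and confirm that the cropped pixel data decodes to the expected shape:
# Re-open the file just written and decode its encapsulated pixel data.
check = pydicom.dcmread(outputpath)
print(check.pixel_array.shape)  # expect (1000, 1000) for the 500:1500 crop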
I also ended up using the interactivecrop package to fairly quickly get the crop indices I needed (a tip in case future folks try this in Jupyter). In case it's helpful, here's a snippet of the code for that (which is run before the above):
from interactivecrop.interactivecrop import main as crop
file_names = [os.path.split(f)[1].split(".")[0] for f in file_list]
image_list = []
for x in file_list:
    ds = pydicom.dcmread(x)
    arr = ds.pixel_array
    image_list.append(arr)
crop(image_list, file_names, optimize=True)
# After cropping all images, will print a dictionary
# copied and pasted this dictionary to a new cell as crop_dict
# use the below function to convert the output to actual indices
def convert_crop_to_index(fname, crop_dict):
    x = [crop_dict[fname][1], crop_dict[fname][1] + crop_dict[fname][3]]
    y = [crop_dict[fname][0], crop_dict[fname][0] + crop_dict[fname][2]]
    return x, y
arr_crop = arr[x[0]:x[1], y[0]:y[1]].copy()
I was never able to quite figure out why ds.decompress() and saving the decompressed DICOM was generating an all-black image. I feel like that should have been the easiest method, but the above ended up working for me, so I'm happy I was able to figure it out.

How to transform .grib file into a GeoTIFF with correct projection using Python, GDAL, ArcPy

I am trying to transform a .grib file into a GeoTIFF to be used in a GIS (ArcGIS in particular), but I am having trouble getting the image to project properly. I have been able to create a GeoTIFF, using GDAL in Python, that shows the data but does not show up in the correct location when brought into ArcGIS. The resulting image is below.
The data I am working with can be downloaded from: https://gimms.gsfc.nasa.gov/SMOS/SMAP/L05/
I am trying to project the data into WGS84 Web Mercator (Auxiliary Sphere), EPSG: 3857
Note: I have tried bringing in the data via ArcMap by creating a Raster Mosaic which should be able to work with .grib data, but I didn't have any luck.
Update: I have also tried using the Project Raster tool, but ArcGIS does not like the default projection that comes from the .grib file and gives an error.
The code I'm using:
import gdal
src_filename = r"C:\att\project\arcshare\public\disaster_response\nrt_products\smap\20150402_20150404_anom1.grib"
dst_filename = r"C:\att\project\arcshare\public\disaster_response\nrt_products\smap\smap_py_test1.tif"
#Open existing dataset
src_ds = gdal.Open(src_filename)
#Open output format driver, see gdal_translate --formats for list
format = "GTiff"
driver = gdal.GetDriverByName( format )
#Output to new format
dst_file = driver.CreateCopy( dst_filename, src_ds, 0 )
#Properly close the datasets to flush to disk
dst_ds = None
src_ds = None
I am not very well versed in using GDAL or GDAL in Python, so any help or tips would be greatly appreciated.
Try using gdal.Translate (in Python) or gdal_translate (from command line). Here are two examples of how I have used each approach in the past:
Option 1: Python approach
from osgeo import gdal
# Open existing dataset
src_ds = gdal.Open(src_filename)
# Ensure number of bands in GeoTiff will be same as in GRIB file.
bands = [] # Set up array for gdal.Translate().
if src_ds is not None:
    bandNum = src_ds.RasterCount  # Get band count
    for i in range(bandNum+1):  # Update array based on band count
        if i == 0:  # gdal starts band counts at 1, not 0 like the Python for loop does.
            pass
        else:
            bands.append(i)
# Open output format driver
out_form= "GTiff"
# Output to new format using gdal.Translate. See https://gdal.org/python/ for osgeo.gdal.Translate options.
dst_ds = gdal.Translate(dst_filename, src_ds, format=out_form, bandList=bands)
# Properly close the datasets to flush to disk
dst_ds = None
src_ds = None
Option 2: Command line gdal_translate (called from Python) approach
import os
# Open output format driver, see gdal_translate --formats for list
out_form = "GTiff"
# Pull out specific band of interest
band=3
# Convert from GRIB to GeoTIFF using system gdal_translate
src_ds = src_filename
dst_ds = dst_filename
os.system("gdal_translate -b %s -of %s %s %s" %(str(band), out_form, src_ds, dst_ds))
I've had trouble in the past creating a multi-band GeoTiff using option 2, so I recommend using option 1 when possible.
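Neither option above changes the projection, so if you also need the result in EPSG:3857 for ArcGIS, gdal.Warp can reproject during the conversion. This is only a sketch (not tested against this particular SMAP GRIB) and it assumes GDAL can read a source SRS from the file; it reuses src_filename and dst_filename from the question:
from osgeo import gdal
# Sketch: convert GRIB -> GeoTIFF and reproject to Web Mercator in one step.
# If GDAL cannot determine the source SRS from the GRIB, pass srcSRS explicitly as well.
dst_ds = gdal.Warp(dst_filename, src_filename, format="GTiff", dstSRS="EPSG:3857")
dst_ds = None  # flush to disk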
Something like this should transform your native coordinates into your desired projection. This is not tested yet (and it could be latitude instead of latitudes, depending on how the coordinates are named in your file).
from cfgrib import xarray_store
from pyproj import Proj, transform
grib_data = xarray_store.open_dataset('your_grib_file.grib')
lat = grib_data.latitudes.values
lon = grib_data.longitudes.values
lon_transformed, lat_transformed = transform(Proj(init='init_projection'),
                                             Proj(init='target_projection'),
                                             lon, lat)

VTK: how to read grid cells' length, width and height?

I have a huge grid in *.pvd format. I would like to make sure certain cell-size specifications were respected when the grid was built. To do so, I need a cell data array with (dx, dy, dz) for each cell.
I first tried to check this in ParaView with very little success. I then resolved to export the mesh in various formats (vtk, vtu, ex2) and import it into Python using the vtk module, as in the code below. Unfortunately, the size of the mesh forbids it and I get various error messages stating "Unable to allocate n cells of size x".
import vtk
reader = vtk.vtkXMLUnstructuredGridReader()
reader.SetFileName("my_mesh.vtu")
reader.Update()
Finally, in Paraview there is a python-shell that allows me to open the grid file in either pvd or vtk format:
>>> from paraview.simple import *
>>> my_vtk = OpenDataFile("my_mesh.vtk")
>>> print dir(my_vtk)
Despite browsing the methods and attributes of this reader object, I remain clueless about where to fetch any geometry information about the grid. I also browsed the simple module documentation, but I can't really wrap my head around it.
So how can one retrieve information regarding the geometry of cells from a paraview.servermanager.LegacyVTKReader object?
Any clue about how to achieve this through the ParaView GUI, or any kludge to load the VTK object into Python's vtk module despite the memory issue, is also very welcome. Sorry for such a hazy question, but I don't really know where to get started...
You can use GetClientSideObject() (see here) to get a VTK object in the ParaView Python shell. After that you can use all the regular VTK Python functions. For example, you can write the following in the ParaView Python shell:
>>> from paraview.simple import *
>>> currentSelection = GetActiveSource()
>>> readerObj = currentSelection.GetClientSideObject()
>>> unstructgrid = readerObj.GetOutput()
>>> firstCell = unstructgrid.GetCell(0)
>>> cellPoints = firstCell.GetPoints()
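From there, one way to get a cell's extent along each axis (my own continuation of the example, using the standard vtkCell API rather than anything from the original answer) is to take the cell's axis-aligned bounding box:
>>> xmin, xmax, ymin, ymax, zmin, zmax = firstCell.GetBounds()
>>> dx, dy, dz = xmax - xmin, ymax - ymin, zmax - zmin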
Alternatively, you can use the Programmable Filter in ParaView. This gives you access to the full VTK Python module and even NumPy or other modules. You can enter the following script in the script window of the programmable filter:
import vtk as v
import numpy as np
inp = self.GetUnstructuredGridInput()
cells = inp.GetCells()
cells.InitTraversal()
cellPtIds = v.vtkIdList()
lenArr = v.vtkDoubleArray()
lenArr.SetNumberOfComponents(3)
lenArr.SetName('CellSize')
while cells.GetNextCell(cellPtIds):
    pts = []
    for i in range(cellPtIds.GetNumberOfIds()):
        ptCoords = inp.GetPoint(cellPtIds.GetId(i))
        pts.append(ptCoords)
    pts = np.array(pts)
    dx = np.max(pts[:,0]) - np.min(pts[:,0])
    dy = np.max(pts[:,1]) - np.min(pts[:,1])
    dz = np.max(pts[:,2]) - np.min(pts[:,2])
    lenArr.InsertNextTuple3(dx, dy, dz)
out = self.GetUnstructuredGridOutput()
out.ShallowCopy( inp )
out.GetCellData().AddArray( lenArr )
In ParaView, when you select the 'ProgrammableFilter1' item in your pipeline, a new cell data array called CellSize will be available from the drop-down. You can modify the script above to save the data to a file and analyze it externally.
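If you want that last step as code, here is a minimal sketch (my addition; the output path is just an example) that could be appended to the Programmable Filter script above:
# Dump the per-cell sizes computed above to a CSV file for external analysis.
import csv
with open('/tmp/cell_sizes.csv', 'w') as fh:
    writer = csv.writer(fh)
    writer.writerow(['dx', 'dy', 'dz'])
    for i in range(lenArr.GetNumberOfTuples()):
        writer.writerow(lenArr.GetTuple3(i))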
This information is visible in the Information Tab.

Read/Open a modis aqua .hdf file and display/plot the output in gdal and matplotlib

I have tried and searched for how to solve this, but I still can't find a way to read and plot a given MODIS Aqua .hdf file with GDAL and matplotlib. Any help is much appreciated. By the way, I am using Python 2.7.5 on Windows 7. The filename is A2014037040000.L2_LAC.SeAHABS.hdf. Among the geophysical datasets of the HDF file, I will only be using chlor_a.
Update:
Here is the link of the sample file.
A2014037040500.L2_LAC.SeAHABS.hdf
The trick with HDFs is that most of the time you need a specific subdataset. If you use GDAL, you need to open the HDF pointing directly at that subdataset:
import gdal
import matplotlib.pyplot as plt
ds = gdal.Open('HDF4_SDS:UNKNOWN:"MOD021KM.A2013048.0750.hdf":6')
data = ds.ReadAsArray()
ds = None
fig, ax = plt.subplots(figsize=(6,6))
ax.imshow(data[0,:,:], cmap=plt.cm.Greys, vmin=1000, vmax=6000)
You can also open the 'main' HDF file and inspect the subdatasets, and go from there:
# open the main HDF
ds = gdal.Open('MOD021KM.A2013048.0750.hdf')
# get the path for a specific subdataset
subds = [sd for sd, descr in ds.GetSubDatasets() if descr.endswith('EV_250_Aggr1km_RefSB (16-bit unsigned integer)')][0]
# open and read it like normal
dssub = gdal.Open(subds)
data = dssub.ReadAsArray()
dssub = None
ds = None
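For the file in the question, the same pattern should work for chlor_a. This is only a sketch; it assumes the chlor_a subdataset can be identified by its name or description string (check the output of ds.GetSubDatasets() for the exact text):
import gdal
import matplotlib.pyplot as plt
# Open the main HDF and pick out the chlor_a subdataset by name/description.
ds = gdal.Open('A2014037040000.L2_LAC.SeAHABS.hdf')
chl_path = [sd for sd, descr in ds.GetSubDatasets()
            if 'chlor_a' in sd or 'chlor_a' in descr][0]
chl = gdal.Open(chl_path).ReadAsArray()
ds = None
fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(chl, cmap=plt.cm.Greys)
plt.show()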
You should try setting the datatype for the MODIS dataset. I guess it is 16-bit unsigned:
ds= gdal.Open(hdfpath)
data = ds.GetRasterBand(N).ReadAsArray().astype(numpy.uint16)
N is the band number for your data of interest. You can try opening the file with QGIS or ENVI to see the structure of the HDF.
Remember that the bands start at 1, not 0. The first band is 1.
Hope it helps.

Importing variables from Netcdf into Python

I am very new to Python, and I have managed to read some variables from a NetCDF file into Python and plot them, but the size of the variables isn't correct.
My dataset is 144 x 90 (lon x lat) but when I call in the variables, it seems to miss a large section of data.
Do I need to specify the size of the dataset I'm reading in? Is that what I'm doing wrong here?
Here is the code I am using:
import netCDF4
from netCDF4 import Dataset
from pylab import *
ncfile = Dataset('DEC3499.aijE03Ccek11p5A.nc','r')
temp = ncfile.variables['tsurf']
prec = ncfile.variables['prec']
subplot(2,1,1)
pcolor(temp)
subplot(2,1,2)
pcolor(prec)
savefig('DEC3499.png',optimize=True,quality=85)
quit()
Just to clarify, here is an image showing the output. There should be data right to the far right hand side of the box.
(http://img163.imageshack.us/img163/6900/screenshot20130520at112.png)
I figured it out.
For those interested, I just needed to amend the following lines to pull in the variables properly:
temp = ncfile.variables['tsurf'][:,:]
prec = ncfile.variables['prec'][:,:]
Thanks!
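For anyone wondering why the slice matters: indexing a netCDF4 Variable returns a plain NumPy array, whereas the bare Variable object is not something pcolor handles reliably. A quick check (my addition, assuming the same file and variable names as above):
temp_var = ncfile.variables['tsurf']
print type(temp_var)        # a netCDF4 Variable, not a NumPy array
print temp_var[:, :].shape  # slicing returns the full grid, e.g. (90, 144)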
