I am having problems writing data into a .tif file with the gdal module in Python.
I want to extract data (a NumPy array) from a tif file and modify some of its values before saving it into a new file that still works normally. I use the following script:
from osgeo import gdal
import numpy as np

tif = gdal.Open('data/pre_heilj_mean90_15.tif') #original tif file
imwidth = tif.RasterXSize
imheight = tif.RasterYSize
data = tif.ReadAsArray()
data[100][100] = 100 #modify value
data = data.astype(np.float32)
driver = gdal.GetDriverByName("GTiff")
dataset = driver.Create('data/res.tif', imwidth, imheight, 1, gdal.GDT_Float32)
dataset.SetSpatialRef(tif.GetSpatialRef())
dataset.SetGeoTransform(tif.GetGeoTransform())
dataset.SetProjection(tif.GetProjection())
dataset.GetRasterBand(1).WriteArray(data)
dataset.FlushCache()
dataset=None
data=None
tif=None
I am certain that the data in the original tif file is 2-D and of type float32.
However, the new tif file (res.tif) is all black in ArcMap:
res.tif
Here is how the original tif file shows in ArcMap:
original tif file
The sizes of the two files also differ a lot: the original is 5287 KB and the new one is 4633 KB.
I want to know what goes wrong. (Please forgive my poor English.)
You probably forgot to write the nodata value to the metadata of the output file. The fact that it looks "black" is probably just due to stretching; if you stretch the output similarly (min = ~406), it should look similar.
For example get the nodata value with:
nodata_value = tif.GetRasterBand(1).GetNoDataValue()
Then write/assign it with:
dataset.GetRasterBand(1).SetNoDataValue(nodata_value)
Keep in mind that this is a property of a band, so multiple bands in a single file can potentially have different nodata values.
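To illustrate why an undeclared nodata value can distort the display, here is a minimal NumPy sketch of the range a simple min-max stretch would use. The pixel values and the nodata value of -9999 are made up for illustration (only the minimum of about 406 comes from the answer above), and stretch_range is a hypothetical helper, not a GDAL or ArcMap function.

```python
import numpy as np

# Hypothetical values: valid data starts around 406, nodata is -9999.
nodata_value = -9999.0
data = np.array([[406.0, 450.0, 500.0],
                 [nodata_value, 480.0, 520.0]], dtype=np.float32)

def stretch_range(arr, nodata=None):
    """Return the (min, max) that a simple min-max stretch would use."""
    if nodata is not None:
        arr = arr[arr != nodata]  # skip nodata pixels, as a viewer would
    return float(arr.min()), float(arr.max())

range_without_nodata = stretch_range(data)             # nodata not declared
range_with_nodata = stretch_range(data, nodata_value)  # nodata declared

print(range_without_nodata)  # (-9999.0, 520.0)
print(range_with_nodata)     # (406.0, 520.0)
```

When the nodata value is included, the valid pixels occupy only a small fraction of the stretched range, so the image can look nearly uniform until the stretch is adjusted or the nodata value is declared.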
I have a problem with exporting a GeoTIFF file by using python and gdal.
What I want to do is to convert a NumPy array into a GeoTIFF file.
There are reference GeoTIFF files, so I want to make sure that the produced GeoTIFF file has proper geometric coordinates.
The problem is that the tiff file seems to be produced, but the values it contains are not good.
I tried to view the file by using QGIS, but the appearance of the image was completely black. In addition to the problem with the appearance of the image, the values are also changed from the original NumPy array. For example, the maximum value of the NumPy array is 149, but QGIS says that there is no such value in the file.
What is the cause of this problem, and how can I fix it?
The code is here.
#Check the metadata of the reference file.
with rio.open('/content/drive/My Drive/Colab Notebooks/satellite_data/MCD12Q1/MCD12Q1.A2019001.IGBP.Buryat.geotiff.tif') as filename : filename.bounds
filename.meta["transform"]
#Output=> {'count': 1,'crs': CRS.from_epsg(4326),'driver': 'GTiff','dtype': 'uint8','height': 1920, 'nodata': None, 'transform': Affine(0.00416666, 0.0, 97.99955519999997, 0.0, -0.00416666, 58.0000512),
'width': 4800}
srs = osr.SpatialReference()
srs.ImportFromEPSG(4326)
#Set some parameters
xsize=filename.meta["width"]
ysize=filename.meta["height"]
band=1
dtype = gdal.GDT_UInt16
output = gdal.GetDriverByName('GTiff').Create('/content/drive/My Drive/Colab Notebooks/outputs/output.tif', xsize, ysize, band, dtype)
output.SetGeoTransform((filename.meta["transform"][0],filename.meta["transform"][1],filename.meta["transform"][2],filename.meta["transform"][3],filename.meta["transform"][4],filename.meta["transform"][5]))
srs = osr.SpatialReference()
srs.ImportFromEPSG(4326)
output.SetProjection(srs.ExportToWkt())
output.GetRasterBand(1).WriteArray(NumOfFiresExpanded["2019"]) # NumOfFiresExpanded["2019"] is the target NumPy array.
output.FlushCache()
output = None
I have an example here which takes numpy arrays and builds a GeoTiff.
How do I write/create a GeoTIFF RGB image file in python?
Use gdalinfo -stats <geotiff> to view the GeoTiff and make sure the range is good. Viewing 16-bit imagery can be non-deterministic depending on the viewer. QGIS is a good test viewer.
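One more thing that may be worth checking in the question's code: rasterio's Affine stores its coefficients in the order (a, b, c, d, e, f), while GDAL's SetGeoTransform expects (c, a, b, f, d, e), that is, with the origin first. Passing the first six Affine values in their stored order therefore shifts the fields. The sketch below reorders them by hand; affine_to_gdal is a hypothetical helper name, and the numbers are the ones printed in the question. The affine package also offers Affine.to_gdal() for the same purpose. This may not be the only cause of the display problem.

```python
def affine_to_gdal(a, b, c, d, e, f):
    """Reorder Affine coefficients (a, b, c, d, e, f) into GDAL order.

    GDAL order is (x_origin, pixel_width, row_rotation,
                   y_origin, column_rotation, pixel_height).
    """
    return (c, a, b, f, d, e)

# Coefficients of the reference file's transform, as shown in the question.
coeffs = (0.00416666, 0.0, 97.99955519999997, 0.0, -0.00416666, 58.0000512)
geotransform = affine_to_gdal(*coeffs)
print(geotransform)
```

The resulting tuple is what would be passed to output.SetGeoTransform(...) in place of the six values in their original order.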
I have a large 40 MB (about 173,397 lines) .dat file filled with binary data (random symbols). It is an astronomical photograph. I need to read and display it with Python. I am using a binary file because I will need to extract pixel values from specific regions of the image, but for now I just need to load it into Python, with something like the READU procedure in IDL. I tried numpy and matplotlib but nothing worked. Suggestions?
You need to know the data type and dimensions of the binary file. For example, if the file contains float data, use numpy.fromfile like:
import numpy as np
data = np.fromfile(filename, dtype=float)
Then reshape the array to the dimensions of the image, dims, using numpy.reshape (the equivalent of REFORM in IDL):
im = np.reshape(data, dims)
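The two steps above can be tried end to end with a small round trip. This is only a sketch under stated assumptions: the 2 x 3 dimensions, the float64 data type, and the temporary file are made up for illustration, and a real file needs its actual dtype, byte order, and dimensions.

```python
import os
import tempfile
import numpy as np

# Hypothetical image: 2 rows x 3 columns of 64-bit floats.
dims = (2, 3)
original = np.arange(6, dtype=np.float64)

# Write raw bytes with no header, similar to a plain binary .dat file.
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as tmp:
    filename = tmp.name
original.tofile(filename)

# Read the flat array back, then give it the image shape.
data = np.fromfile(filename, dtype=np.float64)
im = np.reshape(data, dims)
os.remove(filename)

print(im.shape)  # (2, 3)
```

NumPy reshapes in row-major order by default, so dims is given as (rows, columns); dimensions copied from IDL code may need to be reversed.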
I want to read some DICOM files, so I'm testing pydicom for my work, which I think is considerably useful.
Now I want to load existing DICOM files, replace the pixel data array with another pixel array (e.g. pre-processing or literally another DICOM pixel array) and most of all, I want to read it again with any DICOM viewer application.
For this test, I used the tutorial code below. This code loads a test data file. The size of image is 64*64. The code below does sub-sampling from the original data. After that, the size of image is 8*8, and the result is saved to after.dcm.
But when I read the file using a DICOM viewer app (I used 'Dicompass'), the size of DICOM image is still 64*64. What is it that I'm missing?
I referred to the pydicom documentation (http://pydicom.readthedocs.io/en/stable/getting_started.html, https://pydicom.github.io/pydicom/stable/index.html) to solve my problem.
# authors : Guillaume Lemaitre <g.lemaitre58@gmail.com>
# license : MIT
import pydicom
from pydicom.data import get_testdata_files
print(__doc__)
# FIXME: add a full-sized MR image in the testing data
filename = get_testdata_files('MR_small.dcm')[0]
ds = pydicom.dcmread(filename)
# get the pixel information into a numpy array
data = ds.pixel_array
print(data)
print('The image has {} x {} voxels'.format(data.shape[0],
data.shape[1]))
data_downsampling = data[::8, ::8]
print('The downsampled image has {} x {} voxels'.format(
data_downsampling.shape[0], data_downsampling.shape[1]))
# copy the data back to the original data set
ds.PixelData = data_downsampling.tobytes()
# update the information regarding the shape of the data array
ds.Rows, ds.Columns = data_downsampling.shape
# print the image information given in the dataset
print('The information of the data set after downsampling: \n')
print(ds)
print(ds.pixel_array)
print(len(ds.PixelData))
ds.save_as("after.dcm")
The code looks OK, but you are not overwriting the original file.
You load the file with:
filename = get_testdata_files('MR_small.dcm')[0]
ds = pydicom.dcmread(filename)
where the original file name is "MR_small.dcm".
Then you save the file with:
ds.save_as("after.dcm")
where the destination file name is different. That means the original file is still unchanged.
You should either load "after.dcm" in your DICOM viewer to test
OR
overwrite the original file (pydicom.filewriter.dcmwrite) when saving it.
This is not part of your problem, but if you are creating a copy of the original image with changed pixel data, it is recommended that you also modify the instance-specific information in the dataset, such as InstanceNumber (0020,0013) and SOPInstanceUID (0008,0018).
I'm having a problem with FITS file manipulation in the astropy package, and I need some help.
I essentially want to take an image I have in FITS format and create two new files with the same dimensions: one to hold correction factors, and one to hold the corrected image produced from the correction factors and the original image.
Starting with this:
from astropy.io import fits
# Compute the size of the images (you can also do this manually rather than calling these keywords from the header):
#URL: /Users/UCL_Astronomy/Documents/UCL/PHASG199/M33_UVOT_sum/UVOTIMSUM/M33_sum_epoch1_um2_norm.img
nxpix_um2_ext1 = fits.open('...')[1]['NAXIS1']
nypix_um2_ext1 = fits.open('...')[1]['NAXIS2']
#nxpix_um2_ext1 = 4071 #hima_sk_um2[1].header['NAXIS1'] # IDL: nxpix_uw1_ext1 = sxpar(hima_sk_uw1_ext1,'NAXIS1')
#nypix_um2_ext1 = 4321 #hima_sk_um2[1].header['NAXIS2'] # IDL: nypix_uw1_ext1 = sxpar(hima_sk_uw1_ext1,'NAXIS2')
# Make a new image file with the same dimensions (and headers, etc) to save the correction factors:
coicorr_um2_ext1 = ??[nxpix_um2_ext1,nypix_um2_ext1]
# Make a new image file with the same dimensions (and headers, etc) to save the corrected image:
ima_sk_coicorr_um2_ext1 = ??[nxpix_um2_ext1,nypix_um2_ext1]
Can anyone give me the obvious knowledge I am missing to do this...the last two lines are just there to outline what is missing. I have included ?? to perhaps signal I need something else there perhaps fits.writeto() or something similar...
The astropy documentation takes you through this task step by step: create an array with size (NAXIS1,NAXIS2), put the data in the primary HDU, make an HDUList and write it to disk:
import numpy as np
from astropy.io import fits
data = np.zeros((NAXIS2,NAXIS1))
hdu = fits.PrimaryHDU(data)
hdulist = fits.HDUList([hdu])
hdulist.writeto('new.fits')
I think @VinceP's answer is correct, but I'll add some more information because I think you are not using the capabilities of astropy well here.
First of all, Python is zero-based, so the primary extension has the number 0. Maybe you got that wrong, maybe you didn't, but it's uncommon to access the second HDU so I thought I'd better mention it.
hdu_num = 0 # Or use = 1 if you really want the second hdu.
First you do not need to open the same file twice, you can open it once and close it after extracting the relevant values:
with fits.open('...') as hdus:
    nxpix_um2_ext1 = hdus[hdu_num].header['NAXIS1']
    nypix_um2_ext1 = hdus[hdu_num].header['NAXIS2']
# Continue without indentation and the file will be closed again.
or if you want to keep the whole header (for saving it later) and the data you can use:
with fits.open('...') as hdus:
hdr = hdus[hdu_num].header
data = hdus[hdu_num].data # I'll also take the data for comparison.
I'll continue with the second approach because I think it's a lot cleaner and you'll have all the data and header values ready.
new_data = np.zeros((hdr['NAXIS2'], hdr['NAXIS1']))
Please note that Python interprets the axes differently from IRAF (and I think IDL, but I'm not sure), so you need NAXIS2 as the first element and NAXIS1 as the second.
So do a quick check that the shapes are the same:
print(new_data.shape)
print(data.shape)
If they are not equal I got confused about the axis in Python (again) but I don't think so. But instead of creating a new array based on the header values you can also create a new array by just using the old shape:
new_data_2 = np.zeros(data.shape)
That will ensure the dimensions and shape are identical. Now you have an empty image. If you would rather have a copy, you can, but do not need to, explicitly copy the data (unless you opened the file explicitly in write/append/update mode, in which case you should always copy it, but that's not the default).
new_data = data # or = data.copy() for explicitly copying.
Do your operations on it and if you want to save it again you can use what @VinceP suggested:
hdu = fits.PrimaryHDU(new_data, header=hdr) # This ensures the same header is written to the new file
hdulist = fits.HDUList([hdu])
hdulist.writeto('new.fits')
Please note that you don't have to alter the shape-related header keywords even if you changed the data's shape, because astropy will update these during writeto (by default).
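The axis ordering mentioned above can be checked with NumPy alone. This is a small sketch, assuming the NAXIS1 and NAXIS2 values from the commented-out lines in the question; the pixel position is arbitrary.

```python
import numpy as np

# Values taken from the commented-out lines in the question.
NAXIS1 = 4071  # number of pixels along x (columns)
NAXIS2 = 4321  # number of pixels along y (rows)

# NumPy shapes are (rows, columns), so NAXIS2 comes first.
new_data = np.zeros((NAXIS2, NAXIS1))
print(new_data.shape)  # (4321, 4071)

# A pixel at zero-based x=10, y=20 is addressed as [y, x].
new_data[20, 10] = 1.0
```

Indexing as [y, x] is the practical consequence of the reversed axis order: the first index walks along NAXIS2 and the second along NAXIS1.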
I am looking to store pixel values from satellite imagery into an array. I've been using
np.empty((image_width, image_length))
and it worked for smaller subsets of an image, but when using it on the entire image (3858 x 3743) the code terminates very quickly and all I get is an array of zeros.
I load the image values into the array using a loop and opening the image with gdal
img = gdal.Open(os.path.join(fn + "\{0}".format(fname))).ReadAsArray()
but when I include print img_array I end up with just zeros.
I have tried almost every single dtype that I could find in the numpy documentation but keep getting the same result.
Is numpy unable to load this many values or is there a way to optimize the array?
I am working with 8-bit tiff images that contain NDVI (decimal) values.
Thanks
I'm not certain what type of images you are trying to read, but in the case of RADARSAT-2 images you can do the following:
dataset = gdal.Open("RADARSAT_2_CALIB:SIGMA0:" + inpath + "product.xml")
S_HH = dataset.GetRasterBand(1).ReadAsArray()
S_VV = dataset.GetRasterBand(2).ReadAsArray()
# gets the intensity (Intensity = re**2+imag**2), and amplitude = sqrt(Intensity)
self.image_HH_I = numpy.real(S_HH)**2+numpy.imag(S_HH)**2
self.image_VV_I = numpy.real(S_VV)**2+numpy.imag(S_VV)**2
But that is specifically for that type of image (in this case each image contains several bands, so I need to read in each band separately with GetRasterBand(i) and then call ReadAsArray()). If there is a specific GDAL driver for the type of image you want to read, life gets very easy.
If you give some more info on the type of images you want to read, I can maybe help more specifically.
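The intensity and amplitude computation from the snippet above can be tried without GDAL. This is a minimal sketch: the complex array stands in for one band of calibrated data and its values are made up for illustration.

```python
import numpy as np

# Hypothetical complex samples standing in for one band read with ReadAsArray().
S_HH = np.array([[3 + 4j, 1 - 1j],
                 [0 + 2j, -5 + 0j]], dtype=np.complex64)

# Intensity = re**2 + imag**2, amplitude = sqrt(intensity).
intensity = np.real(S_HH) ** 2 + np.imag(S_HH) ** 2
amplitude = np.sqrt(intensity)

print(intensity)
print(amplitude[0, 0])  # 5.0
```

The same intensity can also be obtained as np.abs(S_HH) ** 2, since the absolute value of a complex number is its amplitude.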
Edit: did you try something like this? (I'm not sure whether it will work on TIFF, or how long the header is, hence the something:)
A=open(filename,"rb")
B=numpy.fromfile(A,dtype='uint8')[something:].reshape(3858,3743)
C=B*1.0
A.close()
Edit: The problem was solved by using 64-bit Python instead of 32-bit; the 32-bit version ran into memory errors at 2 GB.