Saving masked image as FITS - python

I've constructed an image from some FITS files, and I want to save the resultant masked image as another FITS file. Here's my code:
import numpy as np
from astropy.io import fits
import matplotlib.pyplot as plt
#from astropy.nddata import CCDData
from ccdproc import CCDData
hdulist1 = fits.open('wise_neowise_w1-MJpersr.fits')
hdulist2 = fits.open('wise_neowise_w2-MJpersr.fits')
data1_raw = hdulist1[0].data
data2_raw = hdulist2[0].data
# Hide negative values in order to take logs
# Where {condition}==True, return data_raw, else return np.nan
data1 = np.where(data1_raw >= 0, data1_raw, np.nan)
data2 = np.where(data2_raw >= 0, data2_raw, np.nan)
# Calculation and image subtraction
w1mag = -2.5 * (np.log10(data1) - 9.0)
w2mag = -2.5 * (np.log10(data2) - 9.0)
color = w1mag - w2mag
## Find upper and lower 5th %ile of pixels
mask_percent = 5
masked_value_lower = np.nanpercentile(color, mask_percent)
masked_value_upper = np.nanpercentile(color, (100 - mask_percent))
## Mask out the upper and lower 5% of pixels
## Need to hide values outside the range [lower, upper]
color_masked = np.ma.masked_outside(color, masked_value_lower, masked_value_upper)
color_masked = np.ma.masked_invalid(color_masked)
plt.imshow(color)
plt.savefig('color.png')
plt.imshow(color_masked)
plt.savefig('color_masked.png')
fits.writeto('color.fits', color, overwrite = True)
ccd = CCDData(color_masked, unit = 'adu')
ccd.write('color_masked.fits', overwrite = True)
When I use matplotlib.pyplot to imshow the images color and color_masked, they look as I expect:
However, my two output files are identical: color_masked.fits == color.fits. I think I'm not quite understanding the masking process properly. Can anyone see where I've gone wrong?
astropy.io.fits only handles normal arrays, which means it just ignores/discards the mask of your MaskedArray.
Depending on your use-case you have different options:
Saving the file so other FITS programs recognize the mask
I actually don't think that's possible. But some programs like DS9 can handle NaNs, so you could just set the masked values to NaN for the purpose of displaying them:
data_naned = np.where(color_masked.mask, np.nan, color_masked)
fits.writeto(filename, data_naned, overwrite=True)
They do still show up as "bright white spots" but they don't affect the color-scale.
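A minimal sketch of the NaN conversion above, using only NumPy and a small synthetic array in place of the real color image (the values are made up for illustration). np.ma.filled is just another way to spell the np.where step; the actual FITS write is left as a comment.

```python
import numpy as np

# Synthetic stand-in for the `color` image (values are illustrative only)
color = np.arange(20, dtype=float).reshape(4, 5)

# Mask everything outside the central percentile range, as in the question
lower = np.nanpercentile(color, 5)
upper = np.nanpercentile(color, 95)
color_masked = np.ma.masked_outside(color, lower, upper)

# Replace masked pixels by NaN so a plain ndarray can be handed to the FITS writer
data_naned = np.ma.filled(color_masked, np.nan)

# fits.writeto('color_masked.fits', data_naned, overwrite=True) would now
# store NaN wherever the mask was set
print(np.isnan(data_naned).sum(), "pixels set to NaN")
```

Note that this only works for floating-point data, since integer arrays cannot hold NaN.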
If you want to take this a step further you could replace the masked pixels using a convolution filter before writing them to a file. Not sure if there's one in astropy that only replaces masked pixels though.
Saving the mask as extension so you can read them back
You could use astropy.nddata.CCDData (available since astropy 2.0) to save it as FITS file with mask:
from astropy.nddata import CCDData
ccd = CCDData(color_masked, unit='adu')
ccd.write('color_masked.fits', overwrite=True)
Then the mask will be saved in an extension called 'MASK' and it can be read using CCDData as well:
ccd2 = CCDData.read('color_masked.fits')
The CCDData behaves like a masked array in normal NumPy operations but you could also convert it to a masked-array by hand:
import numpy as np
marr = np.asanyarray(ccd2)


How to extract pixel value of every band in a rasterstack using a polygon?

I am trying to extract the pixel values of a raster from a polygon that has several features. My raster is a rasterstack with 4 bands. I have found how to do it for a whole raster, but I need the info PER BAND. Any hints?
from rasterstats import zonal_stats
import os
import numpy as np
import statistics
shapefile = 'Class1.shp'
geotiff = 'tile_105.tif'
# calculate all zonal stats
stats = zonal_stats(shapefile, geotiff)
# extract the mean and store it
single_mean = [f['mean'] for f in stats]
val_list = []
# only store the positive values.
for val in single_mean:
    if val is not None:
        val_list.append(float(val))
all_mean = statistics.mean(val_list)
Use rasterio to read a multi-band raster.
import rasterio
dataset = rasterio.open('path_to_your_multiband.tif')
You may also want to define the extent of your AOI by defining the cropping window ((row_start, row_stop), (col_start, col_stop))
window = ((20, 50), (10, 40))
import numpy as np
bands = [1, 2, 3, 4]  # assumed: the 4 bands from the question; rasterio band indices start at 1
with rasterio.open('path_to_your_multiband.tif') as src:
    # Create zero array (you may want to set dtype too)
    array = np.zeros((window[0][1] - window[0][0],
                      window[1][1] - window[1][0],
                      len(bands)))
    # Fill the array
    for i, band in enumerate(bands):
        array[:,:,i] = src.read(band, window=window)
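Once the bands are in one array, the per-band statistics inside a polygon reduce to boolean indexing. Here is a rough sketch with NumPy only; the stack and the polygon mask are synthetic stand-ins (in practice the mask could come from rasterizing the polygon, e.g. with rasterio.features.geometry_mask, and rasterstats' zonal_stats also takes a band argument if you would rather stay with that library).

```python
import numpy as np

# Hypothetical 4-band stack with shape (bands, rows, cols)
rng = np.random.default_rng(0)
stack = rng.random((4, 30, 30))

# Hypothetical polygon footprint as a boolean mask (True = inside the polygon)
inside = np.zeros((30, 30), dtype=bool)
inside[5:15, 10:20] = True

# Boolean indexing on the two spatial axes keeps the band axis:
# the result has shape (bands, n_pixels_inside)
pixels_per_band = stack[:, inside]

# One mean per band
band_means = pixels_per_band.mean(axis=1)
for band_number, mean in enumerate(band_means, start=1):
    print(f"band {band_number}: mean = {mean:.4f}")
```

With several polygon features you would build one mask per feature and repeat the last two steps.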

Save 3D array into a stack of 2D images in Python

I made a 3D array, which consists of numbers(0~4). What I want is to save 3D array as a stack of 2D images(if possible, save *.tiff file). What am I supposed to do?
import numpy as np
a = np.random.randint(0,5, size=(100,100,100))
a = a.astype('int8')
Actually, I made it. This is my code.
With this code, I don't need to stack a series of 2D image(array).
Make a 3D array, and save it. That is just what I did for this.
import numpy as np
from skimage.external import tifffile as tif
a = np.random.randint(0,5, size=(100,100,100))
a = a.astype('int8')
tif.imsave('a.tif', a, bigtiff=True)
This should work. I haven't tested it but I have separated color images into RGB slices using this method and it should work pretty much the same way here, assuming you don't want to do anything with those pixel values first. (They will be very close to the same color in an image).
import imageio
import numpy as np
a = np.random.randint(0,5, size=(100,100,100))
a = a.astype('int8')
for i in range(100):
    newimage = a[:, :, i]
    imageio.imwrite("path/to/image%d.tiff" %i, newimage)
What exactly do you mean by "stack"? As you refer to tiff as output format, I assume here you want your data in one file as a multiframe-tiff.
This can easily be done with imageio's mimwrite() function:
# import numpy as np
# a = np.random.randint(0,5, size=(100,100,100))
# a = a.astype('int8')
import imageio
imageio.mimwrite("image.tiff", a)
Note that this function expects the frame index as the first axis of the array, followed by y and x. See also its documentation.
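If your frames are stacked along the last axis instead, you can reorder the axes first. A small sketch with NumPy only (the array shape is just an example):

```python
import numpy as np

# Hypothetical volume whose slices are stacked along the LAST axis: (y, x, frame)
volume = np.random.randint(0, 5, size=(60, 80, 10)).astype('int8')

# Move the frame axis to the front: (frame, y, x)
frames_first = np.moveaxis(volume, -1, 0)

print(frames_first.shape)
# imageio.mimwrite("image.tiff", frames_first) would then write 10 frames
```

np.moveaxis returns a view, so no data is copied at this step.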
However, if I'm wrong and you want to have n (e.g. 100) separate tif-files, you can also use the normal imwrite() function in a loop:
n = len(a)
for i in range(n):
    imageio.imwrite(f'image_{i:03}.tiff', a[i])

Splicing image array (FITS file) using coordinates from header

I am trying to splice a fits array based on the latitudes provided from the Header. However, I cannot seem to do so with my knowledge of Python and the documentation of astropy. The code I have is something like this:
from astropy.io import fits
import numpy as np
Wise1 = fits.open('Image1.fits')
im1 = Wise1[0].data
im1 = np.where(im1 > *latitude1, 0, im1)
newhdu = fits.PrimaryHDU(im1)
newhdulist = fits.HDUList([newhdu])
Here latitude1 would be a value in degrees, recognized after being called from the header. So there are two things I need to accomplish:
How to call the header to recognize Galactic Latitudes?
Splice the array in such a way that it only contains values for the range of latitudes, with everything else being 0.
I think by "splice" you mean "cut out" or "crop", based on the example you've shown.
astropy.nddata has a routine for world-coordinate-system-based (i.e., lat/lon or ra/dec) cutouts.
However, in the simple case you're dealing with, you just need the coordinates of each pixel. Do this by making a WCS:
from astropy import wcs
w = wcs.WCS(Wise1[0].header)
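yy, xx = np.indices(im1.shape)  # np.indices returns (row, column), i.e. (y, x)
lon, lat = w.wcs_pix2world(xx, yy, 0)
newim = im1[lat > my_lowest_latitude]  # note: boolean indexing returns a flattened 1D array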
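To keep the 2D shape and set everything outside a latitude range to 0 (the second point in the question), np.where on the per-pixel latitude array should do it. A sketch with synthetic arrays; in practice image would be the FITS data and lat the array returned by the WCS call, and the latitude limits here are made up:

```python
import numpy as np

# Synthetic stand-ins for the FITS data and the per-pixel latitude array
image = np.ones((4, 6))
lat = np.tile(np.linspace(-3.0, 2.0, 6), (4, 1))  # latitude of each pixel, in degrees

lat_min, lat_max = -1.5, 1.5  # hypothetical latitude range in degrees

# Keep pixels inside the range, zero everything else; shape is preserved
in_range = (lat >= lat_min) & (lat <= lat_max)
cropped = np.where(in_range, image, 0)

print(cropped)
```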
But if you want to preserve the header information, you're much better off using the cutout tool, since you then do not have to manually manage this.
from astropy.nddata import Cutout2D
from astropy import coordinates
from astropy import units as u
# example coordinate - you'll have to figure one out that's in your map
center = coordinates.SkyCoord(mylon*u.deg, mylat*u.deg, frame='fk5')
# then make an array cutout
co = Cutout2D(im1, center, size=[0.1,0.2]*u.arcmin, wcs=w)
# create a new FITS HDU
hdu = fits.PrimaryHDU(data=co.data, header=co.wcs.to_header())
# write to disk (the output filename is a placeholder)
hdu.writeto('cutout.fits', overwrite=True)
An example use case is in the astropy documentation.

filling gaps on an image using numpy and scipy

The image (test.tif) is attached.
The np.nan values are the whitest regions.
How can I fill those regions using a gap-filling algorithm that uses values from the neighbours?
from scipy import ndimage
# note: ndimage.imread was removed in SciPy 1.2; imageio.imread is the usual replacement
data = ndimage.imread('test.tif')
As others have suggested, scipy.interpolate can be used. However, it requires fairly extensive index manipulation to get this to work.
Complete example:
from pylab import *
import numpy
import scipy.ndimage
import scipy.interpolate
import pdb
data = scipy.ndimage.imread('data.png')
# a boolean array of (height, width) which is False where there are missing values and True where there are valid (non-missing) values
mask = ~( (data[:,:,0] == 255) & (data[:,:,1] == 255) & (data[:,:,2] == 255) )
# array of (number of points, 2) containing the x,y coordinates of the valid values only
xx, yy = numpy.meshgrid(numpy.arange(data.shape[1]), numpy.arange(data.shape[0]))
xym = numpy.vstack( (numpy.ravel(xx[mask]), numpy.ravel(yy[mask])) ).T
# the valid values in the first, second, third color channel, as 1D arrays (in the same order as their coordinates in xym)
data0 = numpy.ravel( data[:,:,0][mask] )
data1 = numpy.ravel( data[:,:,1][mask] )
data2 = numpy.ravel( data[:,:,2][mask] )
# three separate interpolators for the separate color channels
interp0 = scipy.interpolate.NearestNDInterpolator( xym, data0 )
interp1 = scipy.interpolate.NearestNDInterpolator( xym, data1 )
interp2 = scipy.interpolate.NearestNDInterpolator( xym, data2 )
# interpolate the whole image, one color channel at a time
result0 = interp0(numpy.ravel(xx), numpy.ravel(yy)).reshape( xx.shape )
result1 = interp1(numpy.ravel(xx), numpy.ravel(yy)).reshape( xx.shape )
result2 = interp2(numpy.ravel(xx), numpy.ravel(yy)).reshape( xx.shape )
# combine them into an output image
result = numpy.dstack( (result0, result1, result2) )
This passes to the interpolator all values we have, not just the ones next to the missing values (which may be somewhat inefficient). It also interpolates every point in the output, not just the missing values (which is extremely inefficient). A better way is to interpolate just the missing values, and then patch them into the original image. This is just a quick working example to get started :)
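The "interpolate just the missing values, then patch them in" idea could look roughly like this. It is only a sketch on a synthetic single-channel image with NaN gaps (not the RGB image from the example above), and it still fits the interpolator on all valid pixels rather than only those next to the gaps.

```python
import numpy as np
from scipy.interpolate import NearestNDInterpolator

# Synthetic single-channel image with a block of missing values
image = np.arange(36, dtype=float).reshape(6, 6)
image[2:4, 2:4] = np.nan

missing = np.isnan(image)

# Fit on the valid pixels only; coordinates are (row, col) pairs
valid_coords = np.argwhere(~missing)
interp = NearestNDInterpolator(valid_coords, image[~missing])

# Evaluate only at the missing pixels and patch them into a copy
filled = image.copy()
filled[missing] = interp(np.argwhere(missing))

print(filled)
```

For a color image you would repeat this per channel, as in the example above.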
I think viena's question is more related to an inpainting problem.
Here are some ideas:
In order to fill the gaps in B/W images you can use some filling algorithm like scipy.ndimage.morphology.binary_fill_holes. But you have a gray level image, so you can't use it.
I assume you don't want to use a complex inpainting algorithm. My first suggestion is: don't try to use the nearest gray value (you don't know the real value of the NaN pixels). Using the nearest value makes for a dirty algorithm. Instead, I would suggest filling the gaps with some other value (e.g. the mean of the column). You can do it without much coding by using scikit-learn:
>>> import numpy as np
>>> from sklearn.impute import SimpleImputer  # replaces the removed sklearn.preprocessing.Imputer
>>> imp = SimpleImputer(strategy="mean")
>>> a = np.random.random((5,5))
>>> a[(1,4,0,3),(2,4,2,0)] = np.nan
>>> a
array([[ 0.77473361, 0.62987193, nan, 0.11367791, 0.17633671],
[ 0.68555944, 0.54680378, nan, 0.64186838, 0.15563309],
[ 0.37784422, 0.59678177, 0.08103329, 0.60760487, 0.65288022],
[ nan, 0.54097945, 0.30680838, 0.82303869, 0.22784574],
[ 0.21223024, 0.06426663, 0.34254093, 0.22115931, nan]])
>>> a = imp.fit_transform(a)
>>> a
array([[ 0.77473361, 0.62987193, 0.24346087, 0.11367791, 0.17633671],
[ 0.68555944, 0.54680378, 0.24346087, 0.64186838, 0.15563309],
[ 0.37784422, 0.59678177, 0.08103329, 0.60760487, 0.65288022],
[ 0.51259188, 0.54097945, 0.30680838, 0.82303869, 0.22784574],
[ 0.21223024, 0.06426663, 0.34254093, 0.22115931, 0.30317394]])
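If you would rather not pull in scikit-learn for this, the same column-mean fill can be sketched with NumPy alone (the small array below is made up for illustration):

```python
import numpy as np

a = np.array([[1.0, 2.0, np.nan],
              [3.0, np.nan, 6.0],
              [5.0, 4.0, 9.0]])

# Mean of each column, ignoring NaN
col_means = np.nanmean(a, axis=0)

# Broadcast the column means into the gaps, leave valid values untouched
filled = np.where(np.isnan(a), col_means, a)

print(filled)
```

Using axis=1 (with col_means[:, np.newaxis] for broadcasting) would give the row mean instead.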
The dirty solution that uses the Nearest values can be this:
1) Find the perimeter points of the NaN regions
2) Compute all the distances between the NaN points and the perimeter
3) Replace the NaNs with the nearest point's gray value
If you want values from the nearest neighbors, you could use the NearestNDInterpolator from scipy.interpolate. There are other interpolators you can consider as well.
You can locate the X,Y index values for the NaN values with:
import numpy as np
nan_locs = np.where(np.isnan(data))
There are some other options for the interpolation as well. One option is to replace NaN values with the results of a median filter (but your areas are kind of large for this). Another option might be grayscale dilation. The correct interpolation depends on your end domain.
If you haven't used a SciPy ND interpolator before: you need to provide X, Y, and value data to fit the interpolator, and then the X and Y data for the points to interpolate at. You can do this using the where example above as a template.
OpenCV has some image in-painting algorithms that you could use. You just need to provide a binary mask which indicates which pixels should be in-painted.
import cv2
import numpy as np
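from scipy import ndimage
data = ndimage.imread("test.tif")
# cv2.inpaint expects an 8-bit, single-channel mask; non-zero marks pixels to fill
mask = np.isnan(data).astype(np.uint8)
inpainted_img = cv2.inpaint(data, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)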

How to create a grid from LiDAR points (X,Y,Z) with GDAL python?

I'm really new to Python programming, and I was wondering whether you can create a regular grid with 0.5 m by 0.5 m resolution from LiDAR points.
My data are in LAS format (read with from liblas import file as lasfile) and have the following format: X,Y,Z, where X and Y are coordinates.
The points are randomly positioned, so some pixels are empty (NaN value) and some pixels contain more than one point. Where there is more than one point, I want the mean value. In the end I need to save the data in TIF or ASCII format.
I am studying the osgeo module and GDAL, but honestly I don't know whether the osgeo module is the best solution.
I would be really glad for some code that I can study and implement.
Thanks in advance for the help, I really need it.
I don't know the best way to get a grid with these parameters.
It's a bit late but maybe this answer will be useful for others, if not for you...
I have done this with Numpy and Pandas, and it's pretty fast. I was using TLS data and could do this with several million data points without any trouble on a decent 2009-vintage laptop. The key is 'binning' by rounding the data, and then using Pandas' GroupBy methods to do the aggregating and calculate the means.
If you need to round to a power of 10 you can use np.round, otherwise you can round to an arbitrary value by making a function to do so, which I have done by modifying this SO answer.
import numpy as np
import pandas as pd
# make rounding function:
def round_to_val(a, round_val):
    return np.round( np.array(a, dtype=float) / round_val) * round_val
# load data: an array of shape (n_points, 3) holding x, y, z
data = np.load( 'shape of ndata, 3')
n_d = data.shape[0]
# round the data
d_round = np.empty( [n_d, 5] )
d_round[:,0] = data[:,0]
d_round[:,1] = data[:,1]
d_round[:,2] = data[:,2]
del data # free up some RAM
d_round[:,3] = round_to_val( d_round[:,0], 0.5)
d_round[:,4] = round_to_val( d_round[:,1], 0.5)
# sorting data
ind = np.lexsort( (d_round[:,4], d_round[:,3]) )
d_sort = d_round[ind]
# making dataframes and grouping stuff
df_cols = ['x', 'y', 'z', 'x_round', 'y_round']
df = pd.DataFrame( d_sort)
df.columns = df_cols
df_round = df[['x_round', 'y_round', 'z']]
group_xy = df_round.groupby(['x_round', 'y_round'])
# calculating the mean, write to csv, which saves the file with:
# [x_round, y_round, z_mean] columns. You can exit Python and then start up
# later to clear memory if that's an issue.
group_mean = group_xy.mean()
group_mean.to_csv('your_binned_data.csv')
# Restarting...
import numpy as np
from scipy.interpolate import griddata
binned_data = np.loadtxt('your_binned_data.csv', skiprows=1, delimiter=',')
x_bins = binned_data[:,0]
y_bins = binned_data[:,1]
z_vals = binned_data[:,2]
pts = np.array( [x_bins, y_bins])
pts = pts.T
# make grid (with borders rounded to 0.5...)
xmin, xmax = 637000, 640000.5
ymin, ymax = 6067000, 6070000.5
grid_x, grid_y = np.mgrid[xmin:xmax:0.5, ymin:ymax:0.5]
# interpolate onto grid
data_grid = griddata(pts, z_vals, (grid_x, grid_y), method='cubic')
# save to ascii
np.savetxt('data_grid.txt', data_grid)
When I've done this, I have saved the output as a .npy and converted to a tiff with the Image library, and then georeferenced in ArcMap. There is probably a way to do that with osgeo but I haven't used it.
Hope this helps someone at least...
You can use the histogram function in Numpy to do binning, for instance:
import numpy as np
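points = np.random.random(1000)  # positions (illustrative)
data = np.random.random(1000)    # values to average at each position (illustrative)
# create 10 bins from 0 to 1 (11 edges)
bins = np.linspace(0, 1, 11)
means = (np.histogram(points, bins, weights=data)[0] /
         np.histogram(points, bins)[0])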
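The same idea carries over to two dimensions with np.histogram2d, which gets you the mean Z per 0.5 m cell directly. A rough sketch on a synthetic point cloud (the coordinates and extents are made up; replace them with the X, Y, Z arrays read from your LAS file):

```python
import numpy as np

# Synthetic point cloud standing in for the LAS data (x, y, z)
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, 5000)
y = rng.uniform(0.0, 5.0, 5000)
z = rng.uniform(100.0, 110.0, 5000)

# Cell edges for a 0.5 m grid covering the (assumed) extent
cell = 0.5
x_edges = np.arange(0.0, 10.0 + cell, cell)
y_edges = np.arange(0.0, 5.0 + cell, cell)

# Sum of z per cell and number of points per cell
z_sum, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=z)
counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])

# Mean per cell; cells without points become NaN
with np.errstate(invalid='ignore', divide='ignore'):
    z_mean = z_sum / counts
z_mean[counts == 0] = np.nan

print(z_mean.shape)
```

The first axis of z_mean runs along x, so you will probably want to transpose (and possibly flip) it before writing it out as an image.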
Try LAStools, particularly lasgrid or las2dem.

