working with netCDF on python with matplotlib - python

So I am pretty new to programming and am currently doing some research on working with netCDF .nc files in Python. I have the following code, which I am sure will not work. The objective is to plot a simple line graph of the 10m u-component of wind over time.
The problem, I think, is that the 10m u-component of wind has a 4D shape of (time=840, expver=2, latitude=19, longitude=27), while time has only one dimension (time=840).
Any replies or ideas will be much appreciated! The code is as shown:
from netCDF4 import Dataset
import matplotlib.pyplot as plt
import numpy as np
nc = Dataset(r'C:\WAIG\Python\ERA5_py01\Downloaded_nc_file/test_era5.nc','r')
for i in nc.variables:
    print(i)
lat = nc.variables['latitude'][:]
lon = nc.variables['longitude'][:]
time = nc.variables['time'][:]
u = nc.variables['u10'][:]
plt.plot(np.squeeze(u), np.squeeze(time))
plt.show()

Right, your winds represent the 10m wind at every location of the model grid (lat, lon) and are also dimensioned on expver (not sure what that last one is). You need to select a subset of u. For example, let's pick expver index 1 and lat/lon indexes of lat=8, lon=12 (not sure where those are going to be):
expver_index = 1
lat_index = 8
lon_index = 12
# ':' below means "full slice"--take everything along that dimension
# time goes first so that it ends up on the x-axis
plt.plot(time, u[:, expver_index, lat_index, lon_index])
plt.title(f'U at latitude {lat[lat_index]}, longitude {lon[lon_index]}')
plt.show()
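To make the shapes concrete, here is a minimal sketch with synthetic data standing in for the file. The random values and the chosen indexes are assumptions; only the dimension sizes come from the question. It shows that fixing one index along expver, latitude and longitude leaves a 1-D series with the same length as time, which is what plt.plot needs:

```python
import numpy as np

# Synthetic stand-ins for the arrays read from the file (values are made up)
rng = np.random.default_rng(0)
time = np.arange(840)                      # shape (840,)
u = rng.normal(size=(840, 2, 19, 27))      # (time, expver, latitude, longitude)

# Fix one index on every non-time dimension -> 1-D series over time
u_point = u[:, 1, 8, 12]

# Alternative: average over the whole grid for one expver -> also 1-D
u_area_mean = u[:, 1, :, :].mean(axis=(1, 2))

print(u.shape, u_point.shape, u_area_mean.shape)
```

Either 1-D array can then be passed to plt.plot together with time.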

Have you tried using xarray?
I think it will be easier for you to read the netCDF file with it and plot the data using matplotlib.
This is straightforward:
import xarray as xr
ds = xr.open_dataset(r'C:\WAIG\Python\ERA5_py01\Downloaded_nc_file/test_era5.nc')
This line plots the time series of the horizontal mean of u10:
ds['u10'].mean(['longitude','latitude']).plot()
You can also select by value or by index along a specific dimension using the sel and isel methods.
This line selects index 10 along latitude and index 5 along longitude and plots the result. In this case I am interested in specific indexes for latitude and longitude, not in the real units.
ds['u10'].isel(latitude=10,longitude=5).plot()
This line selects the grid point nearest to the given latitude and longitude values and plots it. In this case, I am interested in the values of latitude and longitude in the real units.
ds['u10'].sel(latitude=-15,longitude=40,method='nearest').plot()
See the xarray documentation to learn more.
I hope this solution works better for your case and also introduces you to this great tool. Please let me know if you still need some help with this.

Related

How to average a 3-D Array Into 2-D Array Python

I would like to take a temperature variable from a netcdf file in python and average over all of the satellite's scans.
The temperature variable is given as:
tdisk = file.variables['tdisk'][:,:,:] # Disk Temp(nscans,nlons,nlats)
The shape of the tdisk array is 68,52,46. The satellite makes 68 scans per day. The longitude and latitude variables are given as:
lats = file.variables['latitude'][:,:] # Latitude(nlons,nlats)
lons = file.variables['longitude'][:,:] # Longitude(nlons,nlats)
Both have a size of 52,46. I would like to average all of the temperature scans together to get a daily mean, so that the temperature array has a size of 52,46. I've seen ways to stack the arrays and concatenate them, but I would like a mean value. Eventually I am looking to make a contour plot with (x=longitude, y=latitude, and z=temp).
Is this possible to do? Thanks for any help.
If you are using Xarray, you can do this using DataArray.mean:
import xarray as xr
# open netcdf file
ds = xr.open_dataset('file.nc')
# take the mean of the tdisk variable
da = ds['tdisk'].mean(dim='nscans')
# make a contour plot
da.plot.contour(x='longitude', y='latitude')
Based on the question you seem to want to calculate a temporal mean, not a daily mean as you seem to only have one day. So the following will probably work:
ds.mean("time")
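If you are reading the file with netCDF4 as in the question rather than with xarray, the same average can be taken directly on the array, since the scans are the first axis. A minimal sketch with a synthetic array in place of file.variables['tdisk'] (the values and the masked entry are made up for illustration):

```python
import numpy as np

# Synthetic stand-in for tdisk with shape (nscans, nlons, nlats)
rng = np.random.default_rng(0)
tdisk = np.ma.array(rng.uniform(150.0, 250.0, size=(68, 52, 46)))
tdisk[0, 0, 0] = np.ma.masked  # pretend one scan has a missing value here

# Average over the scan axis (axis 0); masked entries are skipped
tdisk_mean = tdisk.mean(axis=0)

print(tdisk_mean.shape)  # (52, 46)
```

The result has the same shape as the 2-D lats and lons arrays, so it should be usable as the z argument of a matplotlib contour call such as plt.contourf(lons, lats, tdisk_mean).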

Interpolation gridded data to geographical point location

I am a big fan of MetPy and had a look at their interpolation functions (https://unidata.github.io/MetPy/latest/api/generated/metpy.interpolate.html) but could not find what I was looking for.
I am looking for a function to interpolate a gridded 2D (lon and lat) or 3D (lon, lat and vertical levels) climate data field to a specific geographic location (lat/lon).
The function would take 5 arguments: a 2D/3D data variable and associated latitude and longitude variables, as well as the two desired latitude and longitude coordinate values. Returned is either a single value (for 2D field) or a vertical profile (for 3D field).
I am basically looking for an equivalent to the old Basemap function bm.interp(). Cartopy does not have an equivalent. The CDO (Climate Data Operators) operator 'remapbil,lon=/lat=' does the same thing but works directly on netCDF files from the command line; I'm looking for a Python solution.
I think such a function would be a useful addition to the MetPy library as it allows for comparing gridded data (e.g., model or satellite data) with point observations such as from weather stations or radiosonde profiles (treated as just a vertical profile here).
Can you point me in the right direction?
I think what you're looking for already exists in scipy.interpolate (scipy is one of MetPy's dependencies). Here we can use interpn to interpolate linearly in n dimensions:
import numpy as np
from scipy.interpolate import interpn
# Array of synthetic grid to interpolate--ordered z,y,x
a = np.arange(24).reshape(2, 3, 4)
# Locations of grid points along each dimension
z = np.array([1.5, 2.5])
y = np.array([-1., 0., 1.])
x = np.array([-3.5, -1, 1, 3.5])
interpn((z, y, x), a, (2., 0.5, 2.))
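The same call also covers the vertical-profile case from the question: pass one (z, y, x) row per level while keeping the horizontal location fixed. A minimal sketch that reuses the synthetic grid above (the target location y=0.5, x=2.0 is just an example value, not something from the question):

```python
import numpy as np
from scipy.interpolate import interpn

# Same synthetic grid as above, ordered z, y, x
a = np.arange(24).reshape(2, 3, 4)
z = np.array([1.5, 2.5])
y = np.array([-1., 0., 1.])
x = np.array([-3.5, -1, 1, 3.5])

# One query point per vertical level, all at the same horizontal location
target_y, target_x = 0.5, 2.0
points = np.column_stack([z, np.full_like(z, target_y), np.full_like(z, target_x)])

profile = interpn((z, y, x), a, points)  # one interpolated value per level
print(profile)
```

With real data, y and x would be the 1-D latitude and longitude coordinate arrays, which interpn expects to be strictly ascending or descending.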
This can be done easily with my nctoolkit package (https://nctoolkit.readthedocs.io/en/latest/). It uses CDO as a backend, and defaults to bilinear interpolation. The following would regrid a .nc file to a single grid point and then convert it to an xarray dataset.
import nctoolkit as nc
import pandas as pd
data = nc.open_data("example.nc")
grid = pd.DataFrame({"lon":[0], "lat":[50]})
data.regrid(grid)
ds = data.to_xarray()
To add one more solution, if you're already using multidimensional netCDF files and want a Python solution: check out xarray's interpolation tools. They support multidimensional, label-based interpolation with usage similar to xarray's indexing interface. These are built on top of the same scipy.interpolate mentioned in the other answer, and xarray is also a MetPy dependency.

Python Interpolating Time

Novice in python so I hope I am asking correctly.
I have a huge set of data that I would like to interpolate to every one second and fill in the gaps with the appropriate latitude and longitude provided:
Lat Long Time
-87.10 30.42 16:38:49
.
.
.
-87.09 30.40 16:39:22
.
.
.
-87.08 30.40 16:39:30
So I would like to generate a new latitude and longitude every second.
I have already plotted the corresponding latitude and longitude and would like to fill in the gaps with the interpolated data, possibly as points.
If linear interpolation is good enough for you, you can use the numpy.interp function with the time array that you want to use for interpolation and the time and longitude/latitude data points that are read from the input file (the time must be increasing so you may need to sort the data).
To read the data for the file you can use the numpy.loadtxt function adding a converter to transform the time to an increasing number:
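import numpy as np
from datetime import datetime
# matplotlib.dates.strpdate2num is no longer available in recent Matplotlib releases
def to_seconds(s):
    # 'HH:MM:SS' (bytes on older NumPy) -> seconds since midnight
    t = datetime.strptime(s.decode() if isinstance(s, bytes) else s, '%H:%M:%S')
    return t.hour * 3600 + t.minute * 60 + t.second
lon, lat, time = np.loadtxt('data.txt', skiprows=1, converters={2: to_seconds}, unpack=True)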
Then you can interpolate the longitude and latitude values using the interp function. The last argument to the linspace function gives the number of points in the time interpolated data.
interp_time = np.linspace(time[0], time[-1], 100)
interp_lon = np.interp(interp_time, time, lon)
interp_lat = np.interp(interp_time, time, lat)
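The question asks for one point per second rather than a fixed number of points. Assuming the time column has already been converted to seconds, np.arange with a step of 1 gives that spacing. A minimal sketch using only the three rows visible in the question (treating them as consecutive samples is an assumption, since the rows in between are elided):

```python
import numpy as np

# The three timestamps shown in the question, as seconds since midnight
time = np.array([16 * 3600 + 38 * 60 + 49,
                 16 * 3600 + 39 * 60 + 22,
                 16 * 3600 + 39 * 60 + 30], dtype=float)
lon = np.array([-87.10, -87.09, -87.08])
lat = np.array([30.42, 30.40, 30.40])

# One sample per second, including both end points
interp_time = np.arange(time[0], time[-1] + 1, 1.0)
interp_lon = np.interp(interp_time, time, lon)
interp_lat = np.interp(interp_time, time, lat)

print(len(interp_time))  # 42
```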
For something more complicated than linear interpolation, there are several facilities in scipy.interpolate.

Using pcolormesh for plotting an orbit data

I am trying to map a dataset with associated latitude and longitude. The details of the data I am using are given below:
Variable Type Data/Info
-------------------------------
lat ndarray 1826x960, type `float64`
lon ndarray 1826x960, type `float64`
data ndarray 1826x960, type `float64`
I have created then a basemap:
m = Basemap(projection='cyl', llcrnrlon=-180, urcrnrlon=180, llcrnrlat=-40, urcrnrlat=40, resolution='c')
Now, on the basemap created, I'd plot the above mentioned dataset using pcolormesh:
m.drawcoastlines()
m.drawcountries()
x,y = m(lon,lat)
m.pcolormesh(x,y,data)
m.colorbar()
plt.show()
This gives the following figure:
[Figure: Temp Brightness plot]
But if I make a similar plot with a dataset (size 2691x960, same for lon and lat) covering the whole longitude range (-180 to 180), I get a 'strange bar':
[Figure: strange bar]
I am pretty sure that the strange bar occurs due to overlapping of the dataset. The same plot was made in MATLAB and works fine.
Please tell me what the problem is, what can be done to remove the bar, and what other methods there are for plotting this kind of data in Python.
I think that you are running into a problem that I ran into a little while ago. The problem here is that, when basemap tries to create the polygons, it uses an interpolation method that does not appear to handle the prime meridian correctly. Pixels that actually cross the prime meridian get interpolated into a polygon that extends around the globe.
The solution that I have used is to split the file into two masked arrays (or just mask the original array two different ways at different times), one with the eastern hemisphere masked, and one with the western hemisphere masked, then map them both to the same axes object.
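The masking step described above could look roughly like the following sketch. The small synthetic lon and data arrays are placeholders for the real swath, and the Basemap calls are left as comments because they depend on the map object m and the projected x, y from the question:

```python
import numpy as np

# Synthetic swath: 2-D longitude and data arrays of the same shape (made up)
lon = np.linspace(-180.0, 180.0, 8 * 6).reshape(8, 6)
data = np.arange(lon.size, dtype=float).reshape(lon.shape)

# Mask one hemisphere at a time
west_only = np.ma.masked_where(lon >= 0, data)   # eastern hemisphere hidden
east_only = np.ma.masked_where(lon < 0, data)    # western hemisphere hidden

# Each masked array would then be drawn on the same map, e.g.
# m.pcolormesh(x, y, west_only)
# m.pcolormesh(x, y, east_only)
print(west_only.count(), east_only.count())
```

Every pixel is visible in exactly one of the two arrays, so drawing both should reproduce the full field.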
edit: Another solution may be to have your longitude bounds go from -179.99 to 179.99 or something similar.
I haven't worked with anything that gave me this problem, but it looks like a solution to a similar-sounding problem was offered here using the mpl_toolkits.basemap.addcyclic function.
From the docs:
arrout, lonsout = addcyclic(arrin, lonsin) adds cyclic (wraparound) point in longitude to arrin and lonsin, assumes longitude is the right-most dimension of arrin.

Interpolation over an irregular grid

So, I have three numpy arrays which store latitude, longitude, and some property value on a grid -- that is, I have LAT(y,x), LON(y,x), and, say temperature T(y,x), for some limits of x and y. The grid isn't necessarily regular -- in fact, it's tripolar.
I then want to interpolate these property (temperature) values onto a bunch of different lat/lon points (stored as lat1(t), lon1(t), for about 10,000 t...) which do not fall on the actual grid points. I've tried matplotlib.mlab.griddata, but that takes far too long (it's not really designed for what I'm doing, after all). I've also tried scipy.interpolate.interp2d, but I get a MemoryError (my grids are about 400x400).
Is there any sort of slick, preferably fast way of doing this? I can't help but think the answer is something obvious... Thanks!!
Try the combination of inverse-distance weighting and scipy.spatial.KDTree described in the SO question inverse-distance-weighted-idw-interpolation-with-python. Kd-trees work nicely in 2d, 3d, ...; inverse-distance weighting is smooth and local, and k, the number of nearest neighbours, can be varied to trade off speed against accuracy.
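As a rough illustration of that combination (not the code from the linked answer), the sketch below builds a KDTree on the flattened source points and takes an inverse-distance-weighted average of the k nearest neighbours. The function name idw_interpolate, its defaults, and the synthetic points are all assumptions made for this example:

```python
import numpy as np
from scipy.spatial import KDTree

def idw_interpolate(src_xy, src_values, dst_xy, k=8, power=2.0, eps=1e-12):
    """Inverse-distance-weighted average of the k nearest source points (k >= 2)."""
    tree = KDTree(src_xy)
    dist, idx = tree.query(dst_xy, k=k)            # both have shape (n_dst, k)
    # eps avoids division by zero when a target coincides with a source point
    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights * src_values[idx]).sum(axis=1) / weights.sum(axis=1)

# Synthetic scattered points standing in for flattened LON, LAT and T arrays
rng = np.random.default_rng(0)
src_xy = rng.uniform(0.0, 10.0, size=(400, 2))
src_values = src_xy[:, 0] + 2.0 * src_xy[:, 1]

dst_xy = rng.uniform(1.0, 9.0, size=(1000, 2))
estimates = idw_interpolate(src_xy, src_values, dst_xy)
print(estimates.shape)  # (1000,)
```

For the arrays in the question, src_xy would be something like np.column_stack([LON.ravel(), LAT.ravel()]). Treating lon/lat as planar coordinates is itself an approximation, so on a global grid it may be worth converting to 3-D Cartesian coordinates before building the tree.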
There is a nice inverse distance example by Roger Veciana i Rovira, along with some code using GDAL to write to GeoTIFF if you're into that.
This is of course interpolation onto a regular grid, so it assumes you first project your data to a pixel grid with pyproj or something similar, all the while being careful about which projection is used for your data.
A copy of his algorithm and example script:
from math import pow
from math import sqrt
import numpy as np
import matplotlib.pyplot as plt

def pointValue(x,y,power,smoothing,xv,yv,values):
    nominator=0
    denominator=0
    for i in range(0,len(values)):
        dist = sqrt((x-xv[i])*(x-xv[i])+(y-yv[i])*(y-yv[i])+smoothing*smoothing)
        #If the point is really close to one of the data points, return the data point value to avoid singularities
        if(dist<0.0000000001):
            return values[i]
        nominator=nominator+(values[i]/pow(dist,power))
        denominator=denominator+(1/pow(dist,power))
    #Return NODATA if the denominator is zero
    if denominator > 0:
        value = nominator/denominator
    else:
        value = -9999
    return value

def invDist(xv,yv,values,xsize=100,ysize=100,power=2,smoothing=0):
    valuesGrid = np.zeros((ysize,xsize))
    for x in range(0,xsize):
        for y in range(0,ysize):
            valuesGrid[y][x] = pointValue(x,y,power,smoothing,xv,yv,values)
    return valuesGrid

if __name__ == "__main__":
    power=1
    smoothing=20
    #Creating some data, with each coordinate and the values stored in separate lists
    xv = [10,60,40,70,10,50,20,70,30,60]
    yv = [10,20,30,30,40,50,60,70,80,90]
    values = [1,2,2,3,4,6,7,7,8,10]
    #Creating the output grid (100x100, in the example)
    ti = np.linspace(0, 100, 100)
    XI, YI = np.meshgrid(ti, ti)
    #Creating the interpolation function and populating the output matrix value
    ZI = invDist(xv,yv,values,100,100,power,smoothing)
    # Plotting the result
    plt.subplot(1, 1, 1)
    plt.pcolor(XI, YI, ZI)
    plt.scatter(xv, yv, 100, values)
    plt.title('Inv dist interpolation - power: ' + str(power) + ' smoothing: ' + str(smoothing))
    plt.xlim(0, 100)
    plt.ylim(0, 100)
    plt.colorbar()
    plt.show()
There are a bunch of options here; which one is best will depend on your data.
However, I don't know of an out-of-the-box solution for you.
You say your input data is tripolar. There are three main cases for how this data could be structured:
1. Sampled from a 3d grid in tripolar space, projected back to 2d LAT, LON data.
2. Sampled from a 2d grid in tripolar space, projected into 2d LAT, LON data.
3. Unstructured data in tripolar space, projected into 2d LAT, LON data.
The easiest of these is 2. Instead of interpolating in LAT LON space, "just" transform your point back into the source space and interpolate there.
Another option that works for 1 and 2 is to search for the cells that map from tripolar space to cover your sample point. (You can use a BSP or grid-type structure to speed up this search.) Pick one of the cells and interpolate inside it.
Finally, there's a heap of unstructured interpolation options, but they tend to be slow.
A personal favourite of mine is to use a linear interpolation of the nearest N points; finding those N points can again be done with gridding or a BSP. Another good option is to Delaunay triangulate the unstructured points and interpolate on the resulting triangular mesh.
Personally if my mesh was case 1, I'd use an unstructured strategy as I'd be worried about having to handle searching through cells with overlapping projections. Choosing the "right" cell would be difficult.
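For the Delaunay option, one readily available implementation is scipy.interpolate.LinearNDInterpolator, which triangulates the scattered points and interpolates linearly inside each triangle. This is a minimal sketch on synthetic points (a made-up planar field, so the result is easy to check), not a treatment of the tripolar projection issues discussed above:

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# Synthetic scattered points standing in for flattened LON/LAT/T arrays
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 10.0, size=(500, 2))
values = 3.0 * points[:, 0] - points[:, 1]       # a plane, so the result is easy to check

# Triangulates the points once (Delaunay) and interpolates linearly per triangle
interp = LinearNDInterpolator(points, values)

targets = rng.uniform(2.0, 8.0, size=(1000, 2))  # well inside the point cloud
estimates = interp(targets)                      # NaN outside the convex hull
print(estimates.shape)
```

The triangulation is built once, so evaluating many target points afterwards only needs the lookup and the per-triangle interpolation.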
I suggest taking a look at the interpolation features of GRASS, an open source GIS package (http://grass.ibiblio.org/gdp/html_grass62/v.surf.bspline.html). It's not in Python, but you can reimplement it or interface with the C code.
Am I right in thinking your data grids look something like this (red is the old data, blue is the new interpolated data)?
[Image: http://www.geekops.co.uk/photos/0000-00-02%20%28Forum%20images%29/DataSeparation.png]
This might be a slightly brute-force-ish approach, but what about rendering your existing data as a bitmap (opengl will do simple interpolation of colours for you with the right options configured and you could render the data as triangles which should be fairly fast). You could then sample pixels at the locations of the new points.
Alternatively, you could sort your first set of points spatially and then find the closest old points surrounding your new point and interpolate based on the distances to those points.
There is a FORTRAN library called BIVAR, which is very suitable for this problem. With a few modifications you can make it usable in python using f2py.
From the description:
BIVAR is a FORTRAN90 library which interpolates scattered bivariate data, by Hiroshi Akima.
BIVAR accepts a set of (X,Y) data points scattered in 2D, with associated Z data values, and is able to construct a smooth interpolation function Z(X,Y), which agrees with the given data, and can be evaluated at other points in the plane.
