Using NASA's SRTM data, I've generated a global elevation heatmap.
The problem, however, is that the continents tend to blend in with the ocean because of the range of elevation values. Is it possible to change the colorbar's scale so that the edges of the continents are more distinct from the ocean? I've tried different cmaps, but they all seem to suffer from the same problem.
Here is my code. I'm initializing a giant array (with 0s) to hold global elevation data, and then populating it file by file from the SRTM dataset. Each file is 1 degree latitude by 1 degree longitude.
Another question I had was regarding the map itself. For some reason, the Appalachian Mountains seem to have disappeared entirely.
import os
import numpy as np
from .srtm_map import MapGenerator
from ..utils.hgt_parser import HGTParser
from tqdm import tqdm
import cv2
import matplotlib.pyplot as plt
import richdem as rd
class GlobalMapGenerator():
def __init__(self):
self.gen = MapGenerator()
self.base_dir = "data/elevation/"
self.hgt_files = os.listdir(self.base_dir)
self.global_elevation_data = None
    def shrink(self, data, rows, cols):
        # Block-sum downsampling; integer division so reshape receives integer sizes
        return data.reshape(rows, data.shape[0] // rows, cols, data.shape[1] // cols).sum(axis=1).sum(axis=2)
def GenerateGlobalElevationMap(self, stride):
res = 1201//stride
max_N = 59
max_W = 180
max_S = 56
max_E = 179
# N59 --> N00
# S01 --> S56
# E000 --> E179
# W180 --> W001
# Initialize array global elevation
self.global_elevation_data = np.zeros(( res*(max_S+max_N+1), res*(max_E+max_W+1) ))
print("Output Image Shape:", self.global_elevation_data.shape)
for hgt_file in tqdm(self.hgt_files):
lat_letter = hgt_file[0]
lon_letter = hgt_file[3]
lat = int(hgt_file[1:3])
lon = int(hgt_file[4:7])
if lat_letter == "S":
# Shift south down by max_N, but south starts at S01 so we translate up by 1 too
lat_trans = max_N + lat - 1
else:
# Bigger N lat means further up. E.g. N59 is at index 0 and is higher than N00
lat_trans = max_N - lat
if lon_letter == "E":
# Shift east right by max_W
lon_trans = max_W + lon
else:
# Bigger W lon means further left. E.g. W180 is at index 0 and is more left than W001
lon_trans = max_W - lon
# load in data from file as resized
data = cv2.resize(HGTParser(os.path.join(self.base_dir, hgt_file)), (res, res))
# generate bounds (x/y --> lon.lat for data from this file for the giant array)
lat_bounds = [res*lat_trans, res*(lat_trans+1)]
lon_bounds = [res*lon_trans, res*(lon_trans+1)]
try:
self.global_elevation_data[ lat_bounds[0]:lat_bounds[1], lon_bounds[0]:lon_bounds[1] ] = data
except:
print("REFERENCE ERROR: " + hgt_file)
print("lat: ", lat_bounds)
print("lon: ", lon_bounds)
# generate figure
plt.figure(figsize=(20,20))
plt.imshow(self.global_elevation_data, cmap="rainbow")
plt.title("Global Elevation Heatmap")
plt.colorbar()
        np.save("figures/GlobalElevationMap.npy", self.global_elevation_data)
        plt.savefig("figures/GlobalElevationMap.png")
        plt.show()
def GenerateGlobalSlopeMap(self, stride):
pass
Use a TwoSlopeNorm (docs) for your norm, like the example here.
From the example:
Sometimes we want to have a different colormap on either side of a conceptual center point, and we want those two colormaps to have different linear scales. An example is a topographic map where the land and ocean have a center at zero, but land typically has a greater elevation range than the water has depth range, and they are often represented by a different colormap.
If you set the midpoint at sea level (0), then you can have two very different scalings based on ocean elevation vs land elevation.
Example code (taken from the example linked above):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cbook as cbook
from matplotlib import cm
dem = cbook.get_sample_data('topobathy.npz', np_load=True)
topo = dem['topo']
longitude = dem['longitude']
latitude = dem['latitude']
fig, ax = plt.subplots()
# make a colormap that has land and ocean clearly delineated and of the
# same length (256 + 256)
colors_undersea = plt.cm.terrain(np.linspace(0, 0.17, 256))
colors_land = plt.cm.terrain(np.linspace(0.25, 1, 256))
all_colors = np.vstack((colors_undersea, colors_land))
terrain_map = colors.LinearSegmentedColormap.from_list(
'terrain_map', all_colors)
# make the norm: Note the center is offset so that the land has more
# dynamic range:
divnorm = colors.TwoSlopeNorm(vmin=-500., vcenter=0, vmax=4000)
pcm = ax.pcolormesh(longitude, latitude, topo, rasterized=True, norm=divnorm,
cmap=terrain_map, shading='auto')
# Simple geographic plot, set aspect ratio because distance between lines of
# longitude depends on latitude.
ax.set_aspect(1 / np.cos(np.deg2rad(49)))
ax.set_title('TwoSlopeNorm(x)')
cb = fig.colorbar(pcm, shrink=0.6)
cb.set_ticks([-500, 0, 1000, 2000, 3000, 4000])
plt.show()
See how it scales numbers with this simple usage (from docs):
>>> import matplotlib.colors as mcolors
>>> offset = mcolors.TwoSlopeNorm(vmin=-4000., vcenter=0., vmax=10000)
>>> data = [-4000., -2000., 0., 2500., 5000., 7500., 10000.]
>>> offset(data)
array([0., 0.25, 0.5, 0.625, 0.75, 0.875, 1.0])
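Applied to the imshow call in your GenerateGlobalElevationMap, that could look like the sketch below; the vmin/vmax values are only placeholders, so pick them from your data's actual range, with vcenter=0 at sea level:
import matplotlib.colors as colors

# Placeholder limits (lowest depression to highest peak); adjust to your data
divnorm = colors.TwoSlopeNorm(vmin=-500., vcenter=0., vmax=8800.)

plt.figure(figsize=(20, 20))
plt.imshow(self.global_elevation_data, cmap="terrain", norm=divnorm)
plt.title("Global Elevation Heatmap")
plt.colorbar()
plt.show()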
Related
I want to draw a volume in x1,x2,x3-space. The volume is an isocurve found by the marching cubes algorithm in skimage. The function generating the volume is pdf_grid = f(x1,x2,x3) and
I want to draw the volume where pdf = 60% max(pdf).
My issue is that the marching cubes algorithm generates vertices and faces, but how do I map those back to the x1, x2, x3-space?
My (rather limited) understanding of marching cubes is that "vertices" refer to the indices in the volume (pdf_grid in my case). If "vertices" contained only the exact indices in the grid this would have been easy, but "vertices" contains floats and not integers. It seems like marching cubes does some interpolation between grid points (according to https://www.cs.carleton.edu/cs_comps/0405/shape/marching_cubes.html), so the question is how to recover exactly the values of x1, x2, x3.
import numpy as np
import scipy.stats
import scipy.optimize
import matplotlib.pyplot as plt
#Make some random data
cov = np.array([[1, .2, -.5],
[.2, 1.2, .1],
[-.5, .1, .8]])
dist = scipy.stats.multivariate_normal(mean = [1., 3., 2], cov = cov)
N = 500
x_samples = dist.rvs(size=N).T
#Create the kernel density estimator - approximation of a pdf
kernel = scipy.stats.gaussian_kde(x_samples)
x_mean = x_samples.mean(axis=1)
#Find the mode
res = scipy.optimize.minimize(lambda x: -kernel.logpdf(x),
x_mean #x0, initial guess
)
x_mode = res["x"]
num_el = 50 #number of elements in the grid
x_min = np.min(x_samples, axis = 1)
x_max = np.max(x_samples, axis = 1)
x1g, x2g, x3g = np.mgrid[x_min[0]:x_max[0]:num_el*1j,
x_min[1]:x_max[1]:num_el*1j,
x_min[2]:x_max[2]:num_el*1j
]
pdf_grid = np.zeros(x1g.shape) #implicit function/grid for the marching cubes
for a in range(x1g.shape[0]):
for b in range(x1g.shape[1]):
for c in range(x1g.shape[2]):
pdf_grid[a,b,c] = kernel(np.array([x1g[a,b,c],
x2g[a,b,c],
x3g[a,b,c]]
))
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
from skimage import measure
iso_level = .6 #draw a volume which contains pdf_val(mode)*60%
verts, faces, normals, values = measure.marching_cubes(pdf_grid, kernel(x_mode)*iso_level)
#How to convert the figure back to x1,x2,x3 space? I just draw the output as it was done in the skimage example here https://scikit-image.org/docs/0.16.x/auto_examples/edges/plot_marching_cubes.html#sphx-glr-auto-examples-edges-plot-marching-cubes-py so you can see the volume
# Fancy indexing: `verts[faces]` to generate a collection of triangles
mesh = Poly3DCollection(verts[faces],
alpha = .5,
label = f"KDE = {iso_level}"+r"$x_{mode}$",
linewidth = .1)
mesh.set_edgecolor('k')
fig, ax = plt.subplots(subplot_kw=dict(projection='3d'))
c1 = ax.add_collection3d(mesh)
c1._facecolors2d=c1._facecolor3d
c1._edgecolors2d=c1._edgecolor3d
#Plot the samples. Marching cubes volume does not capture these samples
pdf_val = kernel(x_samples) #get density value for each point (for color-coding)
x1, x2, x3 = x_samples
scatter_plot = ax.scatter(x1, x2, x3, c=pdf_val, alpha = .2, label = r" samples")
ax.scatter(x_mode[0], x_mode[1], x_mode[2], c = "r", alpha = .2, label = r"$x_{mode}$")
ax.set_xlabel(r"$x_1$")
ax.set_ylabel(r"$x_2$")
ax.set_zlabel(r"$x_3$")
# ax.set_box_aspect([np.ptp(i) for i in x_samples])  # equal aspect ratio
cbar = fig.colorbar(scatter_plot, ax=ax)
cbar.set_label(r"$KDE(w) \approx pdf(w)$")
ax.legend()
#Make the axis limit so that the volume and samples are shown.
ax.set_xlim(- 5, np.max(verts, axis=0)[0] + 3)
ax.set_ylim(- 5, np.max(verts, axis=0)[1] + 3)
ax.set_zlim(- 5, np.max(verts, axis=0)[2] + 3)
This is probably way too late of an answer to help OP, but in case anyone else comes across this post looking for a solution to this problem, the issue stems from the marching cubes algorithm outputting the relevant vertices in array space. This space is defined by the number of elements per dimension of the mesh grid and the marching cubes algorithm does indeed do some interpolation in this space (explaining the presence of floats).
Anyway, to transform the vertices back into x1, x2, x3 space you just need to scale and shift them by the appropriate quantities: the scale is the grid spacing (the range in each dimension divided by the number of grid elements minus one, since np.mgrid with a complex step includes both endpoints), and the shift is the minimum value in each dimension. So, using the variables defined in the OP, the following will provide the actual location of the vertices:
verts_actual = verts * ((x_max - x_min) / (np.array(pdf_grid.shape) - 1)) + x_min
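To use this in the plotting code from the question, pass the rescaled vertices to Poly3DCollection instead of the raw ones; a minimal sketch reusing the variables already defined there:
# Same mesh as in the question, but built from the rescaled vertices
mesh = Poly3DCollection(verts_actual[faces], alpha=.5, linewidth=.1)
mesh.set_edgecolor('k')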
I have an HDF4 file whose StructMetadata.0 contains the following attributes:
UpperLeftPointMtrs = (-20015109.354000,1111950.519667)
LowerRightMtrs = (-18903158.834333,0.000000)
These are X and Y distances in meters of the MODIS Tile for L3 Gridded product (Sinusoidal Projection). I want to extract/create the coordinates of all the pixels (240 x 240) in this tile given the pixel resolution is 5km. How can I achieve this in Python?
HDF-EOS provides this script, which shows how to access and visualize an LP DAAC MCD19A2 v6 HDF-EOS2 Sinusoidal Grid file in Python.
"""
Copyright (C) 2014-2019 The HDF Group
Copyright (C) 2014 John Evans
This example code illustrates how to access and visualize an LP DAAC MCD19A2
v6 HDF-EOS2 Sinusoidal Grid file in Python.
If you have any questions, suggestions, or comments on this example, please use
the HDF-EOS Forum (http://hdfeos.org/forums). If you would like to see an
example of any other NASA HDF/HDF-EOS data product that is not listed in the
HDF-EOS Comprehensive Examples page (http://hdfeos.org/zoo), feel free to
contact us at eoshelp@hdfgroup.org or post it at the HDF-EOS Forum
(http://hdfeos.org/forums).
Usage: save this script and run
$python MCD19A2.A2010010.h25v06.006.2018047103710.hdf.py
Tested under: Python 3.7.3 :: Anaconda custom (64-bit)
Last updated: 2019-09-20
"""
import os
import re
import pyproj
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from pyhdf.SD import SD, SDC
from mpl_toolkits.basemap import Basemap
FILE_NAME = 'MCD19A2.A2010010.h25v06.006.2018047103710.hdf'
DATAFIELD_NAME = 'Optical_Depth_055'
hdf = SD(FILE_NAME, SDC.READ)
# Read dataset.
data3D = hdf.select(DATAFIELD_NAME)
data = data3D[1,:,:].astype(np.double)
# Read attributes.
attrs = data3D.attributes(full=1)
lna=attrs["long_name"]
long_name = lna[0]
vra=attrs["valid_range"]
valid_range = vra[0]
fva=attrs["_FillValue"]
_FillValue = fva[0]
sfa=attrs["scale_factor"]
scale_factor = sfa[0]
ua=attrs["unit"]
units = ua[0]
aoa=attrs["add_offset"]
add_offset = aoa[0]
# Apply the attributes to the data.
invalid = np.logical_or(data < valid_range[0], data > valid_range[1])
invalid = np.logical_or(invalid, data == _FillValue)
data[invalid] = np.nan
data = (data - add_offset) * scale_factor
data = np.ma.masked_array(data, np.isnan(data))
# Construct the grid. The needed information is in a global attribute
# called 'StructMetadata.0'. Use regular expressions to tease out the
# extents of the grid.
fattrs = hdf.attributes(full=1)
ga = fattrs["StructMetadata.0"]
gridmeta = ga[0]
ul_regex = re.compile(r'''UpperLeftPointMtrs=\(
(?P<upper_left_x>[+-]?\d+\.\d+)
,
(?P<upper_left_y>[+-]?\d+\.\d+)
\)''', re.VERBOSE)
match = ul_regex.search(gridmeta)
x0 = float(match.group('upper_left_x'))
y0 = float(match.group('upper_left_y'))
lr_regex = re.compile(r'''LowerRightMtrs=\(
(?P<lower_right_x>[+-]?\d+\.\d+)
,
(?P<lower_right_y>[+-]?\d+\.\d+)
\)''', re.VERBOSE)
match = lr_regex.search(gridmeta)
x1 = float(match.group('lower_right_x'))
y1 = float(match.group('lower_right_y'))
nx, ny = data.shape
x = np.linspace(x0, x1, nx)
y = np.linspace(y0, y1, ny)
xv, yv = np.meshgrid(x, y)
sinu = pyproj.Proj("+proj=sinu +R=6371007.181 +nadgrids=#null +wktext")
wgs84 = pyproj.Proj("+init=EPSG:4326")
lon, lat= pyproj.transform(sinu, wgs84, xv, yv)
# There is a wrap-around issue to deal with, as some of the grid extends
# eastward over the international dateline. Adjust the longitude to avoid
# a smearing effect.
lon[lon < 0] += 360
m = Basemap(projection='cyl', resolution='l',
llcrnrlat=np.min(lat), urcrnrlat = np.max(lat),
llcrnrlon=np.min(lon), urcrnrlon = np.max(lon))
m.drawcoastlines(linewidth=0.5)
m.drawparallels(np.arange(np.floor(np.min(lat)), np.ceil(np.max(lat)), 5),
labels=[1, 0, 0, 0])
m.drawmeridians(np.arange(np.floor(np.min(lon)), np.ceil(np.max(lon)), 5),
labels=[0, 0, 0, 1])
# Subset data if you don't see any plot due to limited memory.
# m.pcolormesh(lon[::2,::2], lat[::2,::2], data[::2,::2], latlon=True)
m.pcolormesh(lon, lat, data, latlon=True)
cb = m.colorbar()
cb.set_label(units)
basename = os.path.basename(FILE_NAME)
plt.title('{0}\n{1}'.format(basename, long_name))
fig = plt.gcf()
pngfile = "{0}.py.png".format(basename)
fig.savefig(pngfile)
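Separately from the HDF-EOS example above, if all you need are the coordinates of the 240 x 240 pixels asked about in the question, here is a shorter sketch that builds pixel-centre coordinates directly from the two corner values (assuming those values are the outer edges of the tile, and a pyproj version new enough to provide the Transformer API):
import numpy as np
from pyproj import Transformer

ulx, uly = -20015109.354000, 1111950.519667   # UpperLeftPointMtrs
lrx, lry = -18903158.834333, 0.000000         # LowerRightMtrs
nx = ny = 240

# Grid spacing implied by the corners, then pixel-centre coordinates
# (half a pixel in from each edge)
dx = (lrx - ulx) / nx
dy = (uly - lry) / ny
x = ulx + dx * (np.arange(nx) + 0.5)
y = uly - dy * (np.arange(ny) + 0.5)
xv, yv = np.meshgrid(x, y)

# MODIS sinusoidal (sphere) -> WGS84 longitude/latitude
sinu = "+proj=sinu +R=6371007.181 +units=m +no_defs"
transformer = Transformer.from_crs(sinu, "EPSG:4326", always_xy=True)
lon, lat = transformer.transform(xv, yv)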
I have figured out a method to cluster dispersed point data into a structured 2-d array (like a rasterize function), and I hope there are better ways to achieve that target.
My work
1. Intro
Each of the 1000 data points has three properties (lon, lat, emission), representing one factory located at (x, y) that emits a certain amount of CO2 into the atmosphere
Grid network: a predefined 2-d array with shape 20x20
http://i4.tietuku.com/02fbaf32d2f09fff.png
The code is reproduced here:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

#### define the map area
xc1,xc2,yc1,yc2 = 113.49805889531724,115.5030664238035,37.39995194888143,38.789235929357105
map = Basemap(llcrnrlon=xc1,llcrnrlat=yc1,urcrnrlon=xc2,urcrnrlat=yc2)
#### reading the point data and scatter plot by their position
df = pd.read_csv("xxxxx.csv")
px,py = map(df.lon, df.lat)
map.scatter(px, py, color = "red", s= 5,zorder =3)
#### predefine the grid networks
lon_grid,lat_grid = np.linspace(xc1,xc2,21), np.linspace(yc1,yc2,21)
lon_x,lat_y = np.meshgrid(lon_grid,lat_grid)
grids = np.zeros(20*20).reshape(20,20)
plt.pcolormesh(lon_x,lat_y,grids,cmap = 'gray', facecolor = 'none',edgecolor = 'k',zorder=3)
2. My target
Find the nearest grid point for each factory
Add the factory's emission data to that grid point
3. Algorithm realization
3.1 Raster grid
Note: the 20x20 grid points distributed in this area are represented by blue dots.
http://i4.tietuku.com/8548554587b0cb3a.png
3.2 KD-tree
Find the nearest blue dot for each red point
sh = (20*20,2)
grids = np.zeros(20*20*2).reshape(*sh)
sh_emission = (20*20)
grids_em = np.zeros(20*20).reshape(sh_emission)
k = 0
for j in range(0,yy.shape[0],1):
for i in range(0,xx.shape[0],1):
grids[k] = np.array([lon_grid[i],lat_grid[j]])
k+=1
T = KDTree(grids)
x_delta = (lon_grid[2] - lon_grid[1])
y_delta = (lat_grid[2] - lat_grid[1])
R = np.sqrt(x_delta**2 + y_delta**2)
for i in range(0,len(df.lon),1):
idx = T.query_ball_point([df.lon.iloc[i],df.lat.iloc[i]], r=R)
    # sometimes more than one blue dot is found, so calculate the distances
    # between the factory (red point) and all of the listed blue dots
    if len(idx) > 1:
        distance = []
        for k in range(0, len(idx), 1):
            distance.append(np.sqrt((df.lon.iloc[i] - grids[idx[k]][0])**2 + (df.lat.iloc[i] - grids[idx[k]][1])**2))
        pos_index = distance.index(min(distance))
        pos = idx[pos_index]
    # Only one point found
    else:
        pos = idx[0]
grids_em[pos] += df.so2[i]
4. Result
co2 = grids_em.reshape(20,20)
plt.pcolormesh(lon_x,lat_y,co2,cmap =plt.cm.Spectral_r,zorder=3)
http://i4.tietuku.com/6ded65c4ac301294.png
5. My question
Can someone point out some drawbacks or errors of this method?
Are there algorithms more aligned with my target?
Thanks a lot!
There are many for-loops in your code; that's not the NumPy way.
Make some sample data first:
import numpy as np
import pandas as pd
from scipy.spatial import KDTree
import pylab as pl
xc1, xc2, yc1, yc2 = 113.49805889531724, 115.5030664238035, 37.39995194888143, 38.789235929357105
N = 1000
GSIZE = 20
x, y = np.random.multivariate_normal([(xc1 + xc2)*0.5, (yc1 + yc2)*0.5], [[0.1, 0.02], [0.02, 0.1]], size=N).T
value = np.ones(N)
df_points = pd.DataFrame({"x":x, "y":y, "v":value})
For equally spaced grids you can use hist2d():
pl.hist2d(df_points.x, df_points.y, weights=df_points.v, bins=20, cmap="viridis");
Here is the output:
Here is the code to use KdTree:
X, Y = np.mgrid[x.min():x.max():GSIZE*1j, y.min():y.max():GSIZE*1j]
grid = np.c_[X.ravel(), Y.ravel()]
points = np.c_[df_points.x, df_points.y]
tree = KDTree(grid)
dist, indices = tree.query(points)
grid_values = df_points.groupby(indices).v.sum()
df_grid = pd.DataFrame(grid, columns=["x", "y"])
df_grid["v"] = grid_values
fig, ax = pl.subplots(figsize=(10, 8))
ax.plot(df_points.x, df_points.y, "kx", alpha=0.2)
mapper = ax.scatter(df_grid.x, df_grid.y, c=df_grid.v,
cmap="viridis",
linewidths=0,
s=100, marker="o")
pl.colorbar(mapper, ax=ax);
The output is:
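If you need the summed values back on the 20 x 20 grid (for a pcolormesh like the one in the question), here is a small sketch using the variables defined above:
# df_grid rows follow the order of grid = np.c_[X.ravel(), Y.ravel()],
# so a plain reshape recovers the (GSIZE, GSIZE) layout; fill empty cells with 0
V = df_grid["v"].fillna(0).to_numpy().reshape(GSIZE, GSIZE)

fig, ax = pl.subplots()
pcm = ax.pcolormesh(X, Y, V, cmap="viridis", shading="auto")
pl.colorbar(pcm, ax=ax)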
I'm looking to compute poleward heat fluxes at a level in the atmosphere, i.e. the mean of (u't'). I'm aware of the covariance function in NumPy, but cannot seem to implement it. Here is my code below.
from netCDF4 import Dataset
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
myfile = '/home/ubuntu/Fluxes_Test/out.nc'
Import = Dataset(myfile, mode='r')
lon = Import.variables['lon'][:] # Longitude
lat = Import.variables['lat'][:] # Latitude
time = Import.variables['time'][:] # Time
lev = Import.variables['lev'][:] # Level
wind = Import.variables['ua'][:]
temp = Import.variables['ta'][:]
lon = lon-180 # to shift co-ordinates to -180 to 180.
variable1 = np.squeeze(wind,temp, axis=0)
variable2 = np.cov(variable1)
m = Basemap(resolution='l')
lons, lats = np.meshgrid(lon,lat)
X, Y = m(lons, lats)
cs = m.pcolor(X,Y, variable2)
plt.show()
The shapes of the variables wind and temp, whose covariance (the flux) I am trying to compute, are both (3960, 64, 128), so 3960 pieces of data on a 64x128 grid (with coordinates).
I tried squeezing both variables to produce an array of shape (3960, 3960, 64, 128) so cov could work on the first two series of data (the two 3960s) of wind and temp, but this didn't work.
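For what it's worth, the eddy flux u't' at each grid point can be obtained by forming anomalies from the time mean and averaging their product over the time axis, rather than calling np.cov on the full 3-D arrays; a minimal sketch, assuming wind and temp really are shaped (time, lat, lon) as described:
# Deviations from the time mean at each grid point
u_anom = wind - wind.mean(axis=0)
t_anom = temp - temp.mean(axis=0)

# Mean of the product over time = covariance of u and T per grid point, shape (64, 128)
heat_flux = (u_anom * t_anom).mean(axis=0)

cs = m.pcolor(X, Y, heat_flux)
plt.colorbar(cs)
plt.show()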
I have the following code that reads data from a CSV file and creates a 2D histogram:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
#Read in CSV data
filename = 'Complete_Storms_All_US_Only.csv'
df = pd.read_csv(filename)
min_85 = df.min85
min_37 = df.min37
verification = df.one_min_15
#Numbers
x = min_85
y = min_37
H = verification
#Estimate the 2D histogram
nbins = 33
H, xedges, yedges = np.histogram2d(x,y,bins=nbins)
#Rotate and flip H
H = np.rot90(H)
H = np.flipud(H)
#Mask zeros
Hmasked = np.ma.masked_where(H==0,H)
#Calculate Averages
avgarr = np.zeros((nbins, nbins))
xbins = np.digitize(x, xedges[1:-1])
ybins = np.digitize(y, yedges[1:-1])
for xb, yb, v in zip(xbins, ybins, verification):
avgarr[yb, xb] += v
divisor = H.copy()
divisor[divisor==0.0] = np.nan
avgarr /= divisor
binavg = np.around((avgarr * 100), decimals=1)
binper = np.ma.array(binavg, mask=np.isnan(binavg))
#Plot 2D histogram using pcolor
fig1 = plt.figure()
plt.pcolormesh(xedges,yedges,binper)
plt.title('1 minute at +/- 0.15 degrees')
plt.xlabel('min 85 GHz PCT (K)')
plt.ylabel('min 37 GHz PCT (K)')
cbar = plt.colorbar()
cbar.ax.set_ylabel('Probability of CG Lightning (%)')
plt.show()
Each pixel in the histogram contains the probability of lightning for a given range of temperatures at two different frequencies on the x and y axes (min_85 on the x axis and min_37 on the y axis). I am trying to reference the probability of lightning from the histogram based on a wide range of temperatures that vary on an individual basis for any given storm. Each storm has a min_85 and min_37 that corresponds to a probability from the 2D histogram.

I know there is a brute-force method where you can create a ridiculous number of if statements, one for each pixel, but this is tedious and inefficient when trying to incorporate multiple 2D histograms. Is there a more efficient way to reference the probability from the histogram based on the given min_85 and min_37? I have a separate file with the min_85 and min_37 data for a large number of storms; I just need to assign the corresponding probability of lightning from the histogram to each one.
It sounds like all you need to do is turn the min_85 and min_37 values into indices. Something like this will work:
# min85data and min37data from your file
dx = xedges[1] - xedges[0]
dy = yedges[1] - yedges[0]
min85inds = np.floor((min85data - xedges[0]) / dx).astype(int)
min37inds = np.floor((min37data - yedges[0]) / dy).astype(int)
# Pretend you didn't do all that flipping of H, or make a copy of it first
hvals = h_orig[min85inds, min37inds]
But do make sure that the resulting indices are valid before you extract them.
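If you would rather look the values up in binper (which was filled as avgarr[yb, xb], i.e. indexed [y, x]), you can reuse np.digitize the same way the question code builds its bins; a sketch assuming min85data and min37data hold the per-storm values:
# digitize against the interior edges yields indices 0..nbins-1,
# with out-of-range values clamped to the first/last bin
x_inds = np.digitize(min85data, xedges[1:-1])
y_inds = np.digitize(min37data, yedges[1:-1])
probs = binper[y_inds, x_inds]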