Modelling Ocean surface waves - python

Context: I am trying to model ocean surface waves. I have experimented with a linear superposition of sinusoids, and with the fact that phase speed (following a wave's crest) is twice as fast as group speed (following a group of waves). More recently, I used such a random superposition of waves to model the wiggling lines formed by the refraction (the bending of a ray's trajectory at the interface between air and water) of light, called caustics... So far so good...
Observing real-life ocean waves taught me that while a single wave is well approximated by a sinusoidal waveform, each wave is qualitatively a bit different: typical for these surface gravity waves are the sharp crests and flat troughs. As a matter of fact, modelling ocean waves is on the one hand very useful (ocean dynamics and its impact on climate, modelling tides, tsunamis, diffraction in a bay to predict coastline evolution, ...) but quite demanding despite a well-known mathematical model. Starting from the Navier-Stokes equations for an incompressible fluid (water) in a gravitational field leads, under certain simplifying conditions, to Luke's variational principle. Further simplifications lead to the approximate solution given by Stokes, which gives the following shape as a sum of harmonics:
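(Reconstructed from the implementation below: the deep-water Stokes expansion to third order in the steepness k*a.)
eta(x) = a * [ (1 - (k*a)^2/16) * cos(k*x) + (k*a)/2 * cos(2*k*x) + (3/8)*(k*a)^2 * cos(3*k*x) ]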
This seems good enough for the moment; I will capture this shape in this notebook and notably check whether it applies to a random mixture of such waves... However, a simple implementation such as
import numpy as np
import matplotlib.pyplot as plt
# fig_width and phi (the golden ratio) are figure-layout constants defined elsewhere in the notebook

def stokes(pos, a=.3/2/np.pi, k=2*np.pi):
    # k*a is a dimensionless expansion parameter
    elevation = (1 - 1/16*(k*a)**2) * np.cos(k*pos)
    elevation += 1/2*(k*a)**1 * np.cos(2*k*pos)
    elevation += 3/8*(k*a)**2 * np.cos(3*k*pos)
    return a * elevation

N_pos = 501
pos = np.linspace(0, 1, N_pos, endpoint=True)

fig, ax = plt.subplots(figsize=(fig_width, fig_width/phi**3))
ax.plot(pos, stokes(pos))
ax.set_xlim(np.min(pos), np.max(pos));

fig, ax = plt.subplots(figsize=(fig_width, fig_width/phi**3))
for a in np.linspace(.001, .2, 10, endpoint=True):
    elevation = stokes(pos, a=a)
    ax.plot(pos, elevation)
ax.set_xlim(np.min(pos), np.max(pos));
This gives something similar to the figure shown in the full implementation (which gives more details).
My present implementation seems not to match what is displayed on the Wikipedia page, and I do not spot the bug I may have introduced... Does anyone spot the bug?

Related

How to calculate Ocean heat content?

I have subsurface temperature data down to 300 m ocean depth (with irregular depth levels), and I want to calculate the ocean heat content (OHC) for 0-300 m in Python. The cell area is computed with the CDO tool.
The formula is:
OHC = ρ (sea water density) * c_p (specific heat capacity) * the integral of temperature over this depth range.
I have written the following code:
# OHC calculation
def ocean_heat(Temperature, cell_area):
    density = 1026  # kg/m^3
    c_p = 3990      # J/(kg K)
    heat = Temperature.sum(dim=['depth', 'lon', 'lat']) * density * c_p * cell_area
    return heat
But the depth levels are not evenly spaced, so I think the temperature needs to be weighted by layer thickness. Could anyone explain the proper procedure to compute OHC? If there are other sources or modules, please let me know.
Thank you.
If your dataset is a NetCDF file I suggest taking a look at the Xarray package. It is used to work with labeled multidimensional arrays. It is very popular in Earth Science.
Here is an example from Pangeo using Xarray to calculate ocean heat content:
https://gallery.pangeo.io/repos/NCAR/notebook-gallery/notebooks/Run-Anywhere/Ocean-Heat-Content/OHC_tutorial.html
The first part is about speeding up the computation with Dask. Task 8 is where they start calculating ocean heat content.
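To deal with the irregular depth spacing mentioned in the question, a minimal sketch of a depth-weighted version in Xarray could look like the following (the function and variable names are hypothetical; it assumes a DataArray with a 'depth' coordinate in metres and approximates each layer thickness from the spacing of the depth levels):
import xarray as xr

def ocean_heat_content(temperature, cell_area, max_depth=300.0):
    # Sketch of a depth-weighted OHC per grid cell, following the formula in the question.
    rho = 1026.0   # sea-water density, kg/m^3
    c_p = 3990.0   # specific heat capacity, J/(kg K)

    depth = temperature['depth']
    # Approximate layer thicknesses from the irregular depth levels;
    # the first layer is assumed to span from the surface down to the first level.
    dz = xr.concat([depth.isel(depth=[0]), depth.diff('depth')], dim='depth')

    # Weight the temperature by layer thickness, then integrate over 0-300 m.
    column = (temperature * dz).sel(depth=slice(0, max_depth)).sum(dim='depth')
    return rho * c_p * column * cell_area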

pymc3 likelihood math with non-theano function

I'm new to Bayesian inference. I'm trying to adapt a grid-search code I wrote to a Bayesian Markov chain Monte Carlo approach, and I'm using PyMC3. My problem is that the code has to call a function that can't be rewritten in Theano syntax. (The function relies on a piece of Fortran code in an f2py wrapper.)
Here's the code I'm working with:
with pm.Model() as model:
    # Independent parameters
    x = pm.Normal('x', sx, sd=incx*float(nrangex)/2.0)
    y = pm.Normal('y', sy, sd=incy*float(nrangey)/2.0)
    depth = pm.Normal('depth', sdepth, sd=incdepth*float(nrangedepth)/2.0)
    length = pm.Normal('length', slength, sd=inclength*float(nrangelength)/2.0)
    width = pm.Normal('width', swidth, sd=incwidth*float(nrangewidth)/2.0)
    strike = pm.Normal('strike', sstrike, sd=incstrike*float(nrangestrike)/2.0)
    dip = pm.Normal('dip', sdip, sd=incdip*float(nrangedip)/2.0)
    rake = pm.Normal('rake', srake, sd=incrake*float(nrangerake)/2.0)

    # Model (this is the part that doesn't work)
    los_disp = zne2los(getdisp(lon2km(dsidata['lon'], x), lat2km(dsidata['lat'], y),
                               depth, length, width, strike, dip, rake))

    # Likelihood
    odisp = pm.Normal('los_disp', mu=los_disp, sd=0.5, observed=dsidata['disp'])

    # Sampling
    trace = pm.sample(100)
What this code is trying to do is invert for earthquake source parameters from ground displacement data. The dsidata data frame contains ground displacement data as a function of latitude and longitude. I'm trying to solve for the eight earthquake source parameters that produce the best fit to this ground surface displacement.
The getdisp function simply cannot be rewritten for Theano because it calls a piece of Fortran that forward-models ground surface displacement from earthquake source parameters. Is there a way to compile non-Theano code into a Theano form? Is there another way to do this?
Since I'm new to this, and I can't find many great examples to look at, there may well be other errors in the code. Are there other mistakes I'm making?
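One commonly suggested workaround (not from the original post, and sketched here under the assumption that getdisp, zne2los, lon2km and lat2km accept and return plain NumPy values) is to wrap the external call in a Theano Op with as_op. Such an Op has no gradient, so a gradient-free sampler like Metropolis is needed:
import numpy as np
import pymc3 as pm
import theano.tensor as tt
from theano.compile.ops import as_op

# Wrap the non-Theano forward model: eight scalar inputs, one vector output.
@as_op(itypes=[tt.dscalar] * 8, otypes=[tt.dvector])
def forward_model(x, y, depth, length, width, strike, dip, rake):
    # getdisp / zne2los / lon2km / lat2km are the poster's (f2py-backed) functions;
    # inside the Op they receive ordinary NumPy floats.
    disp = getdisp(lon2km(dsidata['lon'], x), lat2km(dsidata['lat'], y),
                   depth, length, width, strike, dip, rake)
    return np.asarray(zne2los(disp), dtype=np.float64)

with pm.Model() as model:
    # ... same priors on x, y, depth, length, width, strike, dip, rake as above ...
    los_disp = forward_model(x, y, depth, length, width, strike, dip, rake)
    odisp = pm.Normal('los_disp', mu=los_disp, sd=0.5, observed=dsidata['disp'])
    trace = pm.sample(1000, step=pm.Metropolis())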

curve fitting and parameter estimation in Python

I am currently using Python to compare two different datasets (xDAT and yDAT), each composed of 240 distance measurements taken over a certain amount of time. However, dataset xDAT is offset by a non-linear amount. This non-linear amount is equal to the width of a time-dependent, dynamic medium, which I call level-A. More specifically, xDAT measures from the origin to the top of level-A, whereas yDAT measures from the origin to the bottom of level-A. See the following diagram:
In order to compare both curves, I must first apply a correction to xDAT to make up for its offset (the width of level-A).
So far, I have played around with different degrees of numpy.polyfit, e.g.:
coefs = np.polynomial.polynomial.polyfit(xDAT, yDAT, 5)
polyEST = []
for i in range(len(xDAT)):
    polyEST.append(coefs[0] + coefs[1]*xDAT[i] + coefs[2]*pow(xDAT[i], 2)
                   + coefs[3]*pow(xDAT[i], 3) + coefs[4]*pow(xDAT[i], 4)
                   + coefs[5]*pow(xDAT[i], 5))
The problem with using this method, is that when I plot polyEST (which is the corrected version of xDAT), the plot still does not match the trend of yDAT and remains offset. Please see the figure below, where xDAT= blue, corrected xDAT=red, and yDAT=green:
Ideally, the corrected xDAT should still remain noisier than the yDAT, but the general oscillation and trend of the curves should match.
I would greatly appreciate help on implementing a different curve-fitting and parameter estimation technique in order to correct for the non-linear offset caused by level-A.
Thank you.
The answer depends on what level-A is. If it is independent, your first line should be something like
coefs = np.polynomial.polynomial.polyfit(np.arange(xDAT.size), yDAT - xDAT, 5)
This will give a polyfit of an independent A as drawn, and then the corrected x should be
xDAT + np.polynomial.polynomial.polyval(np.arange(xDAT.size), coefs)
If A is dependent on the variables (as it looks to be), you don't want to polyfit, as that only regresses the real part of the oscillation (the "spring" part of a spring-damper system), which is why your corrected xDAT is in phase with xDAT instead of yDAT. To regress something like that you'll need to use Fourier transforms (which is not my specialty).
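Here is a minimal runnable sketch of the independent-offset correction described above, using synthetic stand-in data (the signal and offset shapes are assumptions for illustration only):
import numpy as np

t = np.arange(240)                                      # sample index (240 measurements)
yDAT = np.sin(0.1 * t)                                  # stand-in for the reference curve
offset = 0.002 * t**1.5 + 0.5                           # stand-in for the non-linear level-A width
xDAT = yDAT - offset + 0.05 * np.random.randn(t.size)   # xDAT is offset and noisier

# Fit the offset (yDAT - xDAT) against the sample index, not against xDAT itself,
# then add the fitted offset back onto xDAT.
coefs = np.polynomial.polynomial.polyfit(t, yDAT - xDAT, 5)
corrected_xDAT = xDAT + np.polynomial.polynomial.polyval(t, coefs)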

Modelling Finite Difference for Temperature Distribution on a house-shaped domain

Currently I am doing research to re-model a temperature distribution (with Dirichlet boundary conditions) in several house-shaped domains (first I'll model the steady state and will then continue to the unsteady state with a time variable).
There are three different roof geometries to make, here's the picture of the domains to make it easier to interpret:
https://www.dropbox.com/sh/92lmwv67k8chzi1/AABmozTBmQqeTz_fGigBsvsaa?dl=0
The boundary conditions are that the temperature at (the bottom of) the roof is 30 °C, the temperature at (the inside of) the walls is 25 °C, and the temperature at (the top of) the ground is 25 °C. These are the conditions for the steady-state model.
So far, I am done with the rectangular domain: I derived the matrix by hand and constructed it in a rather brute-force Python/NumPy/Matplotlib code. Here's the code:
import numpy as np
import matplotlib.pyplot as plt

floor = 25
rightwall = 25
leftwall = 25
ceiling = 30
width = 19
height = 20
numofnodes = width * height
# all the variables above depend on user input

# Assemble the 5-point finite-difference (Laplacian) matrix K
K = [[0 for i in range(numofnodes)] for j in range(numofnodes)]
for i in range(numofnodes):
    for j in range(numofnodes):
        if i == j:
            K[i][j] = -4
for i in range(numofnodes - 1):
    K[i][i + 1] = 1
for i in range(1, numofnodes):
    K[i][i - 1] = 1
for i in range(numofnodes - width):
    K[i][i + width] = 1
for i in range(width, numofnodes):
    K[i][i - width] = 1
# remove the spurious couplings across the flattened grid's row boundaries
for i in range(1, height):
    K[width * i][(width * i) - 1] = 0
for i in range(height):
    K[(width * i) - 1][(width * i)] = 0

# Dirichlet boundary contributions, moved to the right-hand side
D1 = np.zeros([numofnodes, 1])
for i in range(width):
    D1[i] = leftwall
D2 = np.zeros([numofnodes, 1])
for i in range(height):
    D2[i * width] = floor
D3 = np.zeros([numofnodes, 1])
for i in range(1, height + 1):
    D3[(i * width) - 1] = ceiling
D4 = np.zeros([numofnodes, 1])
for i in range(numofnodes - width, numofnodes):
    D4[i] = rightwall
Dnew = np.negative(D1 + D2 + D3 + D4)

# Solve the linear system K T = Dnew and reshape onto the grid
Kinv = np.linalg.inv(K)
T = np.dot(Kinv, Dnew)
Tnew = np.reshape(T, (height, width))
Tmatrix = np.transpose(Tnew)

fig = plt.figure(figsize=(6, 4))
ax = fig.add_subplot(1, 1, 1)
ax.set_title('temp distribution')
plt.imshow(Tmatrix, origin='lower', interpolation='bilinear')
ax.set_aspect('auto')
plt.clim(25, 30)
plt.colorbar(orientation='vertical')
plt.grid()
plt.show()
My apologies for the brutality of the code; it was intended to be as purely algorithmic as possible.
Now, I'm having a hard time constructing the "roof" shape, even for the simplest one (the rightmost in the domains picture).
These are the pictures of my goals (copied from the MATLAB-based-literature) :
https://www.dropbox.com/s/lxan6i5q0hr3ax3/goals.PNG?dl=0
Can anybody tell me how to implement the steady-state finite difference on the roof side (an arbitrary or irregular domain) and join it with the rectangular-domain code, like the goal picture above? Or point me to some helpful literature or worked examples?
Apologies if this question violates the terms of Stack Overflow; this is my first time here, and thank you for your help.
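One general way to handle an irregular (house-shaped) domain, not taken from an answer to this post, is to mark the interior nodes with a boolean mask and assemble the 5-point Laplacian only over those nodes, moving the known Dirichlet boundary values to the right-hand side. A minimal sketch under that assumption (the mask geometry and boundary split are only illustrative):
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical house shape: a 19-wide rectangle topped by a triangular roof.
width, height = 19, 30
interior = np.zeros((height, width), dtype=bool)
interior[1:20, 1:-1] = True                       # rectangular (wall) part
for r in range(20, height):                       # triangular roof part
    shrink = r - 19
    if 1 + shrink < width - 1 - shrink:
        interior[r, 1 + shrink:width - 1 - shrink] = True

# Dirichlet values on the surrounding boundary cells:
# walls/floor at 25 C, roof region at 30 C (illustrative split).
boundary = np.full((height, width), 25.0)
boundary[20:, :] = 30.0

# Number the interior unknowns and assemble the 5-point Laplacian over them.
idx = -np.ones((height, width), dtype=int)
idx[interior] = np.arange(interior.sum())
n = interior.sum()
A = np.zeros((n, n))
rhs = np.zeros(n)
for r in range(height):
    for c in range(width):
        if not interior[r, c]:
            continue
        k = idx[r, c]
        A[k, k] = -4.0
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if interior[rr, cc]:
                A[k, idx[rr, cc]] = 1.0
            else:
                rhs[k] -= boundary[rr, cc]        # known boundary value goes to the RHS

T = np.full((height, width), np.nan)
T[interior] = np.linalg.solve(A, rhs)

plt.imshow(T, origin='lower', interpolation='bilinear')
plt.colorbar(orientation='vertical')
plt.show()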

Interpolation and Extrapolation of Randomly Scattered data to Uniform Grid in 3D

I have a 256 x 256 x 32 grid of regularly spaced points ranging over x, y, and z and with an associated variable "a". I also have a group of randomly scattered points in a more confined x, y, z space, with an associated variable "b". What I essentially want to do is interpolate and extrapolate my random data to a regularly spaced grid that matches the "a" cube, as shown below:
I have used scipy's griddata so far to achieve the interpolation, which seems to work fine, but it cannot handle the extrapolation (as far as I know) and the output sharply truncates to NaN values. While researching this problem I came across a couple of people using griddata a second time with 'nearest' as the interpolation method to fill in the NaN values. I tried this but the results don't seem reliable. More appropriate-looking results are obtained if I use a fill_value with 'linear' mode, but at the moment it's more of a fudge because fill_value has to be a constant.
I noticed that MATLAB has a ScatteredInterpolant class which seems to do what I want, but I am unable to find an equivalent class in Python, nor figure out how to implement such a routine efficiently in 3D. Any help is greatly appreciated.
The code I am using for the interpolation is below:
import numpy as np
from scipy.interpolate import griddata

x, y, z, b = np.loadtxt(scatteredfile, unpack=True)

# Create cube to match aCube dimensions
xi = np.linspace(-xmax_aCube, xmax_aCube, 256)
yi = np.linspace(-ymax_aCube, ymax_aCube, 256)
zi = np.linspace(zmin_aCube, zmax_aCube, 32)

# Interpolate scattered points
X, Y, Z = np.meshgrid(xi, yi, zi)
bCube = griddata((x, y, z), b, (X, Y, Z), method='linear')
This discussion applies in any dimensionality. For your 3D case, let's talk about computational geometry first, to understand why part of the region gives NaN from griddata.
The scattered points in your volume make up a convex hull: a geometric shape with the following properties:
- The surface is always convex (as the name suggests)
- The volume of the shape is the lowest possible without violating convexity
- The surface (in 3D) is triangulated and closed
Less formally, the convex hull (which you can compute easily with scipy) is like stretching a balloon over a frame, where the frame corners are the outermost points of your scattered cluster.
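As a side note (not part of the original answer), the hull and an inside/outside test for the regular grid points can be computed with scipy.spatial, reusing the names from the question's code:
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

pts = np.column_stack([x, y, z])
hull = ConvexHull(pts)                           # triangulated hull surface

# A grid point is inside the hull iff it falls inside some Delaunay simplex.
grid_pts = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])
inside = (Delaunay(pts).find_simplex(grid_pts) >= 0).reshape(X.shape)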
At regular grid locations inside the balloon you're surrounded by known points, so you can interpolate to these locations. Outside it, you have to extrapolate.
Extrapolation is hard. There's no general rule for how to do it... it's problem-specific. In that region, algorithms like griddata choose to return NaN - this is the safest way of informing the scientist that s/he must choose a sensible way of extrapolating.
Let's go through some ways of doing that.
1. [WORST] Botch it
Assign some scalar value outside the hull. In the scipy docs you'll see this is done with:
s = np.mean(b)
bCube = griddata((x, y, z), b, (X, Y, Z), method='linear', fill_value=s)
Cons: This produces a sharp discontinuity in the interpolated field at the hull boundary, heavily biases the mean scalar field value and doesn't respect the functional form of the data.
2. [NEXT WORST] "Blended botching it"
Assume that at the corners of your domain, you apply some value. This might be the average value of the scalar field associated with your scattered points.
Sorry, this is pseudocode as I don't use numpy at all, but it'll probably be fairly clear
# With a unit cube, and a selected scalar value
x, y, z, b = np.loadtxt(scatteredfile, unpack=True)
s = np.mean(b)
x = np.append(x, [0, 0, 0, 0, 1, 1, 1, 1])
y = np.append(y, [0, 0, 1, 1, 0, 0, 1, 1])
z = np.append(z, [0, 1, 0, 1, 0, 1, 0, 1])
b = np.append(b, [s] * 8)
# drop in the rest of your code
Cons: This produces a sharp discontinuity in gradient of the interpolated field at the hull boundary, fairly heavily biases the mean scalar field value and doesn't respect the functional form of the data.
3. [STILL PRETTY BAD] Nearest neighbour
For each of the regular NaN points, find the nearest non-NaN point and assign that value. This is effective and stable, but crude, because your field can end up with patterned features (like stripes or beams radiating out from the hull), which are often visually unappealing or, worse, unacceptable in terms of data smoothness.
Depending on the density of data, you could use the nearest scattered datapoint instead of the nearest non-NaN regular point. This can be done simply by (again, pseudocode):
bCube = griddata((x, y, z), b, (X, Y, Z), method='linear')    # NaN outside the hull by default
bCubeNearest = griddata((x, y, z), b, (X, Y, Z), method='nearest')
indicesMask = np.isnan(bCube)
# Use nearest interpolation outside the hull, keeping linear interpolation inside.
bCube[indicesMask] = bCubeNearest[indicesMask]
Using MATLAB's Delaunay-based approaches will reveal more powerful methods for achieving something similar in a one-liner, but numpy looks a bit limited here.
4. [NOT ALWAYS TERRIBLE] Naturally weighted
Apologies for the poor explanation in this section; I've never written the algorithm, but I'm sure some research on the natural neighbour technique will get you far.
Use a distance weighting function with some parameter D, which might be similar to, or twice (say) the length of your box. You can adjust. For each NaN location, figure out the distance to each of the scattered points.
# Don't do it this way for anything but small matrices - this is O(N*M)
# and it can be done much more effectively (e.g. MATLAB has a quick
# natural weighting option), but for illustrative purposes:
D = 2.0 * (xi.max() - xi.min())                  # tuning parameter, roughly the box length
nan_points = np.argwhere(np.isnan(bCube))        # each of the N NaN grid points
for i, j, k in nan_points:
    # basis function from the inverse distance to each of the M scattered points, normalised on D
    d = np.sqrt((x - X[i, j, k])**2 + (y - Y[i, j, k])**2 + (z - Z[i, j, k])**2)
    w = 1.0 / (1.0 + d / D)                      # [1 x M] vector of weights
    # multiply the weights by the b values, sum, and normalise
    bCube[i, j, k] = np.sum(w * b) / np.sum(w)
You basically want to end up with a function that smoothly goes to the average intensity of B at a distance D away from the hull, but coincides with the hull at the boundary. Away from the boundary it is weighted most strongly on its nearest points.
Pros: nicely stable and reasonably continuous. Because of the weighting, it is more resilient to noise at single data points than nearest neighbour.
5. [HEROIC ROCKSTAR] Functional form assumption
What do you know about the physics? Assume a functional form that represents what you expect the physics to do, then do a least squares (or some equivalent) fit of that form to the scattered data. Use the function to stabilise the extrapolation.
Some good ideas which can help you construct a function:
Do you expect symmetry or periodicity?
Is b a component of a vector field which has some property like zero divergence?
Directionality: do you expect all corners to be the same? Or maybe a linear variation in one direction?
Is field b a snapshot at a point in time - perhaps a smoothed time series of measurements can be used to come up with a basic function?
Is there already a known form like a gaussian or quadratic?
Some examples:
b represents the intensity of a laser beam passing through a volume. You expect the entry side to be nominally identical to the outlet, with the other four boundaries at zero intensity. The intensity will have a concentric Gaussian profile.
b is one component of a velocity field in an incompressible fluid. The fluid must be divergence free, so any field produced in the NaN zone must also be divergence free so you apply this condition.
b represents temperature in a room. You expect higher temperature at the top, because hot air rises.
b represents lift on an aerofoil, tested over three independent variables. You can look up the lift at stall easily, so know exactly what it'll be in some parts of the space.
Pros/Cons: Get this right and it'll be awesome. Get it wrong, especially with nonlinear functional forms, and it will go very wrong and can lead to very unstable results.
Health warning: you can't assume a functional form, get pretty results, then use them to prove that the functional form is correct. That's just bad science. The form needs to be something well behaved and known independently of your data analysis.
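As an illustration of option 5 (not from the original answer), assuming a concentric Gaussian profile purely for the sake of example, one could fit it with scipy.optimize.curve_fit and use the fitted form only in the NaN region:
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical functional form: an isotropic Gaussian centred somewhere in the volume.
def gaussian3d(coords, amp, x0, y0, z0, sigma):
    xc, yc, zc = coords
    r2 = (xc - x0)**2 + (yc - y0)**2 + (zc - z0)**2
    return amp * np.exp(-r2 / (2.0 * sigma**2))

# Least-squares fit of the assumed form to the scattered data (x, y, z, b).
p0 = [b.max(), x.mean(), y.mean(), z.mean(), x.std()]
popt, _ = curve_fit(gaussian3d, (x, y, z), b, p0=p0)

# Use the fitted form to stabilise the extrapolation outside the hull.
mask = np.isnan(bCube)
bCube[mask] = gaussian3d((X[mask], Y[mask], Z[mask]), *popt)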
If your scatter of points conforms fairly well to a cube shape, one approach could be to use griddata to interpolate onto a regular grid of data that fits within your point cloud (therefore avoiding NaNs), and then use this regular grid of values as the input to interpn, which does facilitate linear extrapolation (but requires a regular grid as input).
This way you can use griddata as before for all the points within the convex hull of your scatter of points and you can use interpn to estimate the points that are returned as nans.
This is far from perfect, but I think it comes closer to achieving what you are looking for.
Pros:
- Avoids sharp discontinuities.
- Captures the basic linear trends at the edge of your dataset without having to know the functional form.
- Respects asymmetries in your data (e.g. it doesn't tend to the population mean at large distances, so one side of your dataset can have larger values than the other at large distances).
Cons:
- The effectiveness of this approach will depend a lot on how large a cube you can fit within the convex hull of your initial scatter of points. If your data is spiky/patchy and irregular, then even points on the edge of the convex hull may have been extrapolated significant distances from the edge of the nested cube, incurring errors as the extrapolation won't take into account nearer data points that lie outside the cube.
- The linear extrapolation will be heavily influenced by noise in the data at the edges of the point cloud.
- Computational cost of doing two sets of interpolations.
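A minimal sketch of this two-stage approach, reusing the variable names from the question; the shrink factor and nested-grid sizes are assumptions, chosen so that the nested grid (hopefully) lies inside the convex hull and therefore contains no NaNs:
import numpy as np
from scipy.interpolate import griddata, interpn

def shrunk_axis(v, n, factor=0.8):
    # Axis spanning a box shrunk towards its centre (the factor is a crude assumption).
    centre, half = 0.5 * (v.min() + v.max()), 0.5 * (v.max() - v.min())
    return np.linspace(centre - factor * half, centre + factor * half, n)

# 1. Interpolate the scattered data onto a nested regular grid inside the point cloud.
xs, ys, zs = shrunk_axis(x, 64), shrunk_axis(y, 64), shrunk_axis(z, 16)
Xs, Ys, Zs = np.meshgrid(xs, ys, zs, indexing='ij')
bInner = griddata((x, y, z), b, (Xs, Ys, Zs), method='linear')

# 2. Evaluate on the full 256 x 256 x 32 target grid with interpn;
#    fill_value=None enables linear extrapolation beyond the nested grid.
Xf, Yf, Zf = np.meshgrid(xi, yi, zi, indexing='ij')
targets = np.stack([Xf.ravel(), Yf.ravel(), Zf.ravel()], axis=-1)
bCube = interpn((xs, ys, zs), bInner, targets,
                method='linear', bounds_error=False, fill_value=None).reshape(Xf.shape)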
