Interpolate and Smooth Data Between Two Logarithmic Equations in Python

I have several equations as follows:
windGust8 = -53.3 + (28.3 * log(windSpeed))
windGust7to8 = -70.0 + (30.8 * log(windSpeed))
windGust6to7 = -29.2 + (17.7 * log(windSpeed))
windGust6 = -32.3 + (16.7 * log(windSpeed))
where windSpeed is the wind speed from a model at 850mb.
I then use lapse rates from a model to determine which equation to use as such:
windGustTemp = where(greater(lapseRate,8.00),windGust8,windGustTemp)
windGustTemp = where(logical_and(less_equal(lapseRate,8.00),greater(lapseRate,7.00)),windGust7to8,windGustTemp)
windGustTemp = where(logical_and(less_equal(lapseRate,7.00),greater(lapseRate,6.00)),windGust6to7,windGustTemp)
windGustTemp = where(less_equal(lapseRate,6.00),windGust6,windGustTemp)
return windGustTemp
When I plot these equations as filled contours on a map there are sharp gradients as you would expect. What I would like to do is interpolate between these equations and maybe smooth a little to give a clean look to the graphic. I assume I would use scipy.interpolate.interp1d, but I am not sure how I would apply that to this circumstance. Any help would be much appreciated! Thanks!
EDIT to add output image example
This is an example of the output image generated currently. You can see how abrupt the edges are. I want to interpolate and smooth out this data.
There are data points every 1 km with the model data. For each point it is first determined what the lapse rate is...let's say 7.4. Then based on the logic equations above, it knows to use the equation associated with the lapse rate between 7 and 8. It then finds the model wind speed for that point, plugs it into the equation and gets a number that it plots on the map. This is done for all the model data points and generates the above image.
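One way to remove the hard edges (a sketch under stated assumptions, not code from the question) is to stop switching equations at fixed thresholds and instead interpolate the intercept and slope as smooth functions of lapse rate, then apply a light spatial smoother. The anchor lapse rates below (6.0, 6.5, 7.5, 8.5) are illustrative bin midpoints, windSpeed and lapseRate are assumed to be 2-D NumPy arrays on the same model grid, and log is assumed to be the natural log:
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_wind_gust(windSpeed, lapseRate, sigma=1.0):
    # Anchor lapse rates (illustrative bin midpoints) and the (intercept, slope)
    # pairs of the four gust equations, ordered from low to high lapse rate
    anchors    = np.array([6.0,   6.5,   7.5,   8.5])
    intercepts = np.array([-32.3, -29.2, -70.0, -53.3])
    slopes     = np.array([ 16.7,  17.7,  30.8,  28.3])
    # Interpolate the coefficients at each grid point's lapse rate;
    # lapse rates outside [6.0, 8.5] are clamped to the end-member equations
    a = np.interp(lapseRate, anchors, intercepts)
    b = np.interp(lapseRate, anchors, slopes)
    windGust = a + b * np.log(windSpeed)
    # Optional light spatial smoothing to soften any remaining gradients
    return gaussian_filter(windGust, sigma=sigma)
With this, the gust field varies continuously with lapse rate instead of jumping at 6, 7 and 8, and gaussian_filter only has to clean up small residual gradients.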

Related

Is it possible to generate data with peak and x y location?

I am trying to create a 3D surface plot like the one at the link below:
https://plotly.com/python/3d-surface-plots/
But the problem is that I only have limited data: I only know the peak locations and the peak heights, and the rest of the data is missing. In the example, the z-data needs 25 x 25 values (625 data points) to generate a valid surface plot.
My data looks something like this:
So my question is: is it possible to use some polynomial function, with the peak location values as constraints, to generate the Z-data based on the information I have?
Open to any discussion. Any form of suggestion is appreciated.
Though I don't like this form of interpolation, which is pretty artificial, you can use the following trick:
F(P) = (Σ Fk / d(P, Pk)) / (Σ 1 / d(P, Pk))
P is the point where you interpolate and Pk are the known peak positions. d is the Euclidean distance. (This gives sharp peaks; the squared distance gives smooth ones.)
Unfortunately, far from the peaks this formula tends to the average of the Fk, giving a horizontal surface that lies above some of the Fk, so those peaks point downward. You can work around this by adding fake peaks of negative height around your data set, to lower the average.
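A minimal sketch of that weighting formula (my own example, with made-up peak positions and heights, and a power argument so you can switch between the plain and the squared distance):
import numpy as np

def idw_surface(peaks, nx=25, ny=25, power=1):
    # peaks: list of (x, y, height) triples; power=2 uses the squared distance
    xs, ys, heights = np.array(peaks, dtype=float).T
    X, Y = np.meshgrid(np.linspace(0, 1, nx), np.linspace(0, 1, ny))
    num = np.zeros_like(X)
    den = np.zeros_like(X)
    for xk, yk, fk in zip(xs, ys, heights):
        d = np.hypot(X - xk, Y - yk) ** power
        d = np.maximum(d, 1e-12)   # avoid division by zero exactly at a peak
        num += fk / d
        den += 1.0 / d
    return X, Y, num / den

# Example with three made-up peaks, giving the 25 x 25 z-data a surface plot needs
X, Y, Z = idw_surface([(0.2, 0.3, 5.0), (0.7, 0.6, 8.0), (0.5, 0.9, 3.0)])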

Generating new 2D data using power spectrum density function from spatial frequency domain via ifft?

This is my first post so apologies for any formatting related issues.
So I have a dataset which was obtained from an atomic microscope. The data looks like a 1024x1024 matrix which is composed of different measurements taken from the sample in units of meters, eg.
data = [[1e-07 ... 4e-08][ ... ... ... ][3e-09 ... 12e-06]]
np.shape(data) == (1024, 1024)
From this data, I was hoping to 1) derive some statistics about the real data, and 2) use the power spectral density (PSD) distribution to create a new dataset which is different, but statistically similar to the original data. My plan for 2) was: 2a) take a 2D FFT of the data and calculate the power spectral density; 2b) some method?; 2c) take the 2D IFFT of the modified signal to turn it back into a new sample with the same power spectral density as the original.
Regarding part 2b), this was the closest link I could find to a time-series-based solution; however, I am not sure how to implement it, since I do not know exactly what the phase, frequency, and amplitudes of the FFT data represent in this 2D case. Also, since we are now dealing with a 2D IFFT, I am not sure how to construct the complex matrix, incorporating the random number generation and the amplitude/phase shifts, in a way that translates back into something meaningful.
So basically, I have been having some trouble with my intuition. For this problem we are working with a 2D Fourier transform of spatial data with no temporal component, so I believe that methods applied to time-series data could be applied here as well. Since the FFT of the original data is the 'frequency in the spatial domain', the x-axis of the PSD should be either pixels or meters, but then what is the 'power' on the y-axis describing? I was hoping that someone could help me figure this problem out.
My code is below, hopefully someone could help me solve my problem. Bonus if someone could help me understand what this shifted frequency vs amplitude plot is saying:
Here is the image with the FFT, shifted FFT, and frequency vs. amplitude plots.
Fortunately, the power spectral density function is a bit easier to understand.
Thank you all for your time.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

data = np.genfromtxt('asample3.0_00001-filter.txt')
x = np.arange(0,int(np.size(data,0)),1)
y = np.arange(0,int(np.size(data,1)),1)
z = data
npix = data.shape[0]
#taking the fourier transform
fourier_image = np.fft.fft2(data)
#Get power spectral density
fourier_amplitudes = np.abs(fourier_image)**2
#calculate sampling frequency fs (physical distance between pixels)
fs = 92e-07/npix
freq_shifted = fs/2 * np.linspace(-1,1,npix)
freq = fs/2 * np.linspace(0,1,int(npix/2))
print("Plotting 2d Fourier Transform ...")
fig, axs = plt.subplots(2,2,figsize=(15, 15))
axs[0,0].imshow(10*np.log10(np.abs(fourier_image)))
axs[0,0].set_title('fft')
axs[0,1].imshow(10*np.log10(np.abs(np.fft.fftshift(fourier_image))))
axs[0,1].set_title('shifted fft')
axs[1,0].plot(freq,10*np.log10(np.abs(fourier_amplitudes[:npix//2])))
axs[1,0].set_title('freq vs amplitude')
for ii in list(range(npix//2)):
    axs[1,1].plot(freq_shifted,10*np.log10(np.fft.fftshift(np.abs(fourier_amplitudes[ii]))))
axs[1,1].set_title('shifted freq vs amplitude')
#constructing a wave vector array
## Get frequencies corresponding to signal PSD
kfreq = np.fft.fftfreq(npix) * npix
kfreq2D = np.meshgrid(kfreq, kfreq)
knrm = np.sqrt(kfreq2D[0]**2 + kfreq2D[1]**2)
knrm = knrm.flatten()
fourier_amplitudes = fourier_amplitudes.flatten()
#creating the power spectrum
kbins = np.arange(0.5, npix//2+1, 1.)
kvals = 0.5 * (kbins[1:] + kbins[:-1])
Abins, _, _ = stats.binned_statistic(knrm, fourier_amplitudes,
                                     statistic="mean", bins=kbins)
Abins *= np.pi * (kbins[1:]**2 - kbins[:-1]**2)
print("Plotting power spectrum of surface ...")
fig = plt.figure(figsize=(10, 10))
plt.loglog(fs/kvals, Abins)
plt.xlabel("Spatial Frequency $k$ [meters]")
plt.ylabel("Power per Spatial Frequency $P(k)$")
plt.tight_layout()
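For step 2b), one common trick (a sketch, assuming data is the 1024x1024 array loaded above) is to keep the amplitude spectrum of the original field and take the phases from a white-noise field. Because real white noise has a Hermitian-symmetric FFT, the inverse transform stays real up to rounding and has, on average, the same power spectral density as the original:
import numpy as np

rng = np.random.default_rng()
amplitude = np.abs(np.fft.fft2(data))        # amplitude spectrum of the original
noise = rng.standard_normal(data.shape)      # real white noise, flat spectrum
noise_ft = np.fft.fft2(noise)
noise_ft /= np.abs(noise_ft)                 # keep only the (random) phases
synthetic = np.real(np.fft.ifft2(amplitude * noise_ft))
# Note: the DC term's sign is random here, so the mean of `synthetic` may need
# to be reset to data.mean() afterwards.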

Inverse FFT returns negative values when it should not

I have several points (x, y, z coordinates) in a 3D box with associated masses. I want to draw a histogram of the mass density found in spheres of a given radius R.
I have written some code that, provided I did not make any errors (which I think I may have), works in the following way:
My "real" data is something huge thus I wrote a little code to generate non overlapping points randomly with arbitrary mass in a box.
I compute a 3D histogram (weighted by mass) with a binning about 10 times smaller than the radius of my spheres.
I take the FFT of my histogram, compute the wave-modes (kx, ky and kz) and use them to multiply my histogram in Fourier space by the analytic expression of the 3D top-hat window (sphere filtering) function in Fourier space.
I inverse FFT my newly computed grid.
Thus drawing a 1D-histogram of the values on each bin would give me what I want.
My issue is the following: given what I do, there should not be any negative values in my inverted FFT grid (step 4), but I get some, with values much higher than the numerical error.
If I run my code on a small box (300x300x300 cm3, with the points separated by at least 1 cm) I do not get the issue. I do get it for a 600x600x600 cm3 box, though.
If I set all the masses to 0, thus working on an empty grid, I do get back zeros without any issues.
I give my code here in a full block so that it can easily be copied.
import numpy as np
import matplotlib.pyplot as plt
import random
from numba import njit
# 1. Generate a bunch of points with masses from 1 to 3 separated by a radius of 1 cm
radius = 1
rangeX = (0, 100)
rangeY = (0, 100)
rangeZ = (0, 100)
rangem = (1,3)
qty = 20000 # or however many points you want
# Generate a set of all points within 1 of the origin, to be used as offsets later
deltas = set()
for x in range(-radius, radius+1):
    for y in range(-radius, radius+1):
        for z in range(-radius, radius+1):
            if x*x + y*y + z*z <= radius*radius:
                deltas.add((x,y,z))
X = []
Y = []
Z = []
M = []
excluded = set()
for i in range(qty):
    x = random.randrange(*rangeX)
    y = random.randrange(*rangeY)
    z = random.randrange(*rangeZ)
    m = random.uniform(*rangem)
    if (x,y,z) in excluded: continue
    X.append(x)
    Y.append(y)
    Z.append(z)
    M.append(m)
    excluded.update((x+dx, y+dy, z+dz) for (dx,dy,dz) in deltas)
print("There is ",len(X)," points in the box")
# Compute the 3D histogram
a = np.vstack((X, Y, Z)).T
b = 200
H, edges = np.histogramdd(a, weights=M, bins = b)
# Compute the FFT of the grid
Fh = np.fft.fftn(H, axes=(-3,-2, -1))
# Compute the different wave-modes
kx = 2*np.pi*np.fft.fftfreq(len(edges[0][:-1]))*len(edges[0][:-1])/(np.amax(X)-np.amin(X))
ky = 2*np.pi*np.fft.fftfreq(len(edges[1][:-1]))*len(edges[1][:-1])/(np.amax(Y)-np.amin(Y))
kz = 2*np.pi*np.fft.fftfreq(len(edges[2][:-1]))*len(edges[2][:-1])/(np.amax(Z)-np.amin(Z))
# I create a matrix containing the values of the filter in each point of the grid in Fourier space
R = 5
Kh = np.empty((len(kx),len(ky),len(kz)))
#njit(parallel=True)
def func_njit(kx, ky, kz, Kh):
    for i in range(len(kx)):
        for j in range(len(ky)):
            for k in range(len(kz)):
                if np.sqrt(kx[i]**2+ky[j]**2+kz[k]**2) != 0:
                    Kh[i][j][k] = (np.sin((np.sqrt(kx[i]**2+ky[j]**2+kz[k]**2))*R)-(np.sqrt(kx[i]**2+ky[j]**2+kz[k]**2))*R*np.cos((np.sqrt(kx[i]**2+ky[j]**2+kz[k]**2))*R))*3/((np.sqrt(kx[i]**2+ky[j]**2+kz[k]**2))*R)**3
                else:
                    Kh[i][j][k] = 1
    return Kh
Kh = func_njit(kx, ky, kz, Kh)
# I multiply each point of my grid by the associated value of the filter (multiplication in Fourier space = convolution in real space)
Gh = np.multiply(Fh, Kh)
# I take the inverse FFT of my filtered grid. I take the real part to get back floats but there should only be zeros for the imaginary part.
Density = np.real(np.fft.ifftn(Gh,axes=(-3,-2, -1)))
# Here it shows if there are negative values the magnitude of the error
print(np.min(Density))
D = Density.flatten()
N = np.mean(D)
# I then compute the histogram I want
hist, bins = np.histogram(D/N, bins='auto', density=True)
bin_centers = (bins[1:]+bins[:-1])*0.5
plt.plot(bin_centers, hist)
plt.xlabel('rho/rhom')
plt.ylabel('P(rho)')
plt.show()
Do you know why I'm getting these negative values? Do you think there is a simpler way to proceed?
Sorry if this is a very long post, I tried to make it very clear and will edit it with your comments, thanks a lot!
-EDIT-
A follow-up question on the issue can be found [here].
The filter you create in the frequency domain is only an approximation to the filter you want to create. The problem is that we are dealing with the DFT here, not the continuous-domain FT (with its infinite frequencies). The Fourier transform of a ball is indeed the function you describe, however this function is infinitely large -- it is not band-limited!
By sampling this function only within a window, you are effectively multiplying it with an ideal low-pass filter (the rectangle of the domain). This low-pass filter, in the spatial domain, has negative values. Therefore, the filter you create also has negative values in the spatial domain.
This is a slice through the origin of the inverse transform of Kh (after I applied fftshift to move the origin to the middle of the image, for better display):
As you can tell here, there is some ringing that leads to negative values.
One way to overcome this ringing is to apply a windowing function in the frequency domain. Another option is to generate a ball in the spatial domain, and compute its Fourier transform. This second option would be the simplest to achieve. Do remember that the kernel in the spatial domain must also have the origin at the top-left pixel to obtain a correct FFT.
A windowing function is typically applied in the spatial domain to avoid issues with the image border when computing the FFT. Here, I propose to apply such a window in the frequency domain to avoid similar issues when computing the IFFT. Note, however, that this will always further reduce the bandwidth of the kernel (the windowing function would work as a low-pass filter after all), and therefore yield a smoother transition of foreground to background in the spatial domain (i.e. the spatial domain kernel will not have as sharp a transition as you might like). The best known windowing functions are Hamming and Hann windows, but there are many others worth trying out.
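For the second option, here is a minimal sketch (my own, not tested on your data) of building the ball directly in the spatial domain, with its centre moved to the top-left voxel before the FFT. The grid size and the radius expressed in bins are placeholders you would replace with your own b and R:
import numpy as np

def spherical_kernel_ft(nbins, radius_bins):
    # Centred binary ball on an nbins^3 grid
    ax = np.arange(nbins) - nbins // 2
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing='ij')
    ball = (X**2 + Y**2 + Z**2 <= radius_bins**2).astype(float)
    ball /= ball.sum()          # normalise so the filter averages (Kh = 1 at k = 0)
    # Move the centre to index (0,0,0) so the FFT has the expected phase
    return np.fft.fftn(np.fft.ifftshift(ball))

# usage, e.g.: Kh = spherical_kernel_ft(b, R / bin_width), with bin_width the
# physical size of one histogram bin; then Gh = Fh * Kh as before
Because this spatial-domain kernel has no negative values, multiplying Fh by it and inverse transforming keeps the filtered histogram non-negative up to rounding error.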
Unsolicited advice:
I simplified your code to compute Kh to the following:
kr = np.sqrt(kx[:,None,None]**2 + ky[None,:,None]**2 + kz[None,None,:]**2)
kr *= R
Kh = (np.sin(kr)-kr*np.cos(kr))*3/(kr)**3
Kh[0,0,0] = 1
I find this easier to read than the nested loops. It should also be significantly faster, and avoid the need for njit. Note that you were computing the same distance (what I call kr here) 5 times. Factoring out such computation is not only faster, but yields more readable code.
Just a guess:
Where do you get the idea that the imaginary part MUST be zero? Have you ever tried to take the absolute values (sqrt(re^2 + im^2)) and forget about the phase instead of just taking the real part? Just something that came to my mind.

curve fitting and parameter estimation in Python

I am currently using Python to compare two different datasets (xDAT and yDAT) that are composed of 240 distance measurements taken over a certain amount of time. However, dataset xDAT is offset by a non-linear amount. This non-linear amount is equal to the width of a time-dependent, dynamic medium, which I call level-A. More specifically xDAT measures from the origin to the top of level-A, whereas yDAT measures from the origin to the bottom of level-A. See following diagram:
In order to compare both curves, I must first apply a correction to xDAT to make up for its offset (the width of level-A).
As of yet, I have played around with different degrees of numpy.polyfit. I.E:
coefs = np.polynomial.polynomial.polyfit(xDAT, yDAT, 5)
polyEST=[]
for i in range(0, len(xDAT)):
    polyEST.append(coefs[0] + coefs[1]*xDAT[i] + coefs[2]*pow(xDAT[i],2) + coefs[3]*pow(xDAT[i],3) + coefs[4]*pow(xDAT[i],4) + coefs[5]*pow(xDAT[i],5))
The problem with using this method, is that when I plot polyEST (which is the corrected version of xDAT), the plot still does not match the trend of yDAT and remains offset. Please see the figure below, where xDAT= blue, corrected xDAT=red, and yDAT=green:
Ideally, the corrected xDAT should still remain noisier than the yDAT, but the general oscillation and trend of the curves should match.
I would greatly appreciate help on implementing a different curve-fitting and parameter estimation technique in order to correct for the non-linear offset caused by level-A.
Thank you.
The answer depends on what Level A is. If it is independent, your first line should be something like
coefs = np.polynomial.polynomial.polyfit(np.arange(xDAT.size), yDAT - xDAT, 5)
This will give a polyfit of an independent A as drawn, and then the corrected x should be
xDAT + np.polynomial.polynomial.polyval(np.arange(xDAT.size), coefs)
If A is dependent on the variables (as it looks to be), you don't want to polyfit, as that only regresses the real part of the oscillation (the "spring" part of a spring-damper system), which is why your corrected_xDat is in phase with xDat instead of yDat. To regress something like that you'll need to use Fourier transforms (which is not my specialty).
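For completeness, a self-contained sketch of the independent-offset approach above, with made-up arrays standing in for xDAT and yDAT (the real data would replace the two stand-in lines):
import numpy as np

t = np.arange(240)                            # one index per measurement
yDAT = 10 + np.sin(t / 10.0) + 0.05 * np.random.randn(t.size)            # stand-in data
xDAT = yDAT - (2 + 0.01 * t - 1e-5 * t**2) + 0.2 * np.random.randn(t.size)

# Fit the offset (width of level-A) as a 5th-degree polynomial in the index,
# then add it back onto xDAT
coefs = np.polynomial.polynomial.polyfit(t, yDAT - xDAT, 5)
corrected_xDAT = xDAT + np.polynomial.polynomial.polyval(t, coefs)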

Plotting geostrophic wind plot in matplotlib

I am working on an assignment that is teaching how to plot and label using matplotlib using Python. Science or math is not my background. I have been given the formula for calculating the geostrophic wind and we are to plot it (on the y-axis) versus the latitude on the x-axis.
I know how to plot given an x and a y. Beyond that, the formula is not making sense to me given my lack of background in the area.
The formula is the geostrophic wind formula. Because all I have is an image and I need 10 rep to post an image, I'll just focus on the Greek letters I am given.
For example, I am given
r'$x^{10}$'
r'$R_{final}$'
r'$\alpha^{\eta}$'
The first two are superscript and subscript. That I understand. But how this helps with the formula calculations I do not know.
I am given the values to put into the formula as well. An explanation of the order of operations would help.
g0 = 9.81 m s^-2;
ΔZ = 60 m;
Δn = 2x10^5 m;
and
f = 2Ω sin(φ)
My question is how do I put the values above into the formula and then plot them in matplotlib? is it as easy as x and y?
Example of plotting done so far:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 100, 1)
y1 = 2.0*np.sqrt(x)
y2 = 3.0*x**(1.0/3.0)
plt.plot(x, y1)
plt.plot(x, y2)
Sorry, I'm new to this.
geostrophic wind formula
The physical explanation in jclark754's answer is good. Look at the wiki page on Geostrophic wind, too.
$\Delta n$ is, I assume, your northward distance. I call it dy below, for clarification. Also, it is a question whether you should take g to be negative (z-axis upward). I do so.
For the code, you need to be aware that np.sin expects radians rather than degrees.
And if you work with NumPy arrays rather than lists, you do not need all those list comprehensions and the formulation gets much simpler and closer to the formula:
import matplotlib.pyplot as plt ; plt.style.use('ggplot')
import numpy as np
# define the parameters
g = -9.81 # m/s^2
dZ = 60 # m
dy = 2e5 # m (northward distance, the question's Δn)
omega = 7.2921e-5 # rad/s
phi = np.linspace(10,40) # deg
f = 2 * omega * np.sin(np.radians(phi)) # coriolis frequency, s^-1
# compute geostrophic wind, x-component
u_g = -1. * g/f * dZ/dy
# plot phi vs V_g
fig, ax = plt.subplots()
ax.plot(phi, u_g)
ax.set_xlabel('latitude (degrees)')
ax.set_ylabel('geostrophic wind, x-component (m/s)')
plt.show()
The plot shows the geostrophic wind resulting from a constant geopotential height gradient (dZ/dy = 60 m / 2e5 m) and the Coriolis effect, at different latitudes.
From physical intuition, I find it strange that the velocity increases as you get closer to the equator, even though the Coriolis effect is strongest towards the poles. But then again, the Coriolis effect is not a force but more a balancing effect, obstructing the release of potential energy contained in the pressure gradient force.
So I believe the equation you're trying to show is the geostrophic wind equation:
Is that it? If so, it's one of the simpler equations in meteorology and I'd be happy to explain!
Vg is the geostrophic wind, it's a theoretical wind that results from a balance between the Coriolis effect and the pressure gradient force. It's an idealized wind that doesn't really exist in nature.
g0 and f are gravity and the Coriolis parameter. The Coriolis parameter is a necessary correction needed to account for the Coriolis force.
grad(h) and Z are just the height gradient per degree of latitude. In your case, you're provided with 60 meters as Z and I'm unsure what Δn is for. Maybe your instructor is saying that the change is 60 meters per 2x10^5 meters? I'll assume that's the case.
So just calculating this in wolfram alpha for Denver, Colorado's latitude (40 deg), I get 31.39 meters per second, which is a reasonable number.
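As a quick check of that number in Python (a sketch using only the values quoted in the question):
import numpy as np

g0, dZ, dn, omega = 9.81, 60.0, 2e5, 7.292e-5    # m/s^2, m, m, rad/s
f = 2 * omega * np.sin(np.radians(40.0))         # Coriolis parameter at 40 deg N
Vg = (g0 / f) * (dZ / dn)
print(round(Vg, 2))                              # ~31.39 m/s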
Let's try to plot it:
import matplotlib.pyplot as plt
import numpy as np
# Create a list of latitudes but exclude the equator because sin(0) is 0
lat_list = [i for i in range(-90, 91) if i != 0]
# Create a list of coriolis values
cor_list = [2 * 7.292e-5 * np.sin(i) for i in lat_list]
# Create a list of geostrophic winds
geo_wind = [(9.81 / i) * (60.0 / 200000.0) for i in cor_list]
# Plot the geostrophic winds on a line
# Make a new plot, with lat as x and wind as y. 'r--' is a red dashed line
plt.plot(lat_list, geo_wind, 'r--')
# set the axis range
plt.axis([-90, 90, min(geo_wind), max(geo_wind)])
# show the plot
plt.show()
Would give you the following chart, where latitude is the x-axis and wind speed is the y-axis:
Oddly, the chart (and printing the geo_wind list) shows some wind values exceeding 100 m/s and in some cases over 1000 m/s. I'm unsure why that's the case right now...it's a bit late! So the logic is correct; I would just check how Python is calculating the wind speed...I think it has to do with scientific notation and floating point numbers.
Anyway, I should note that I wrote the above lists as list comprehensions. If that's a bit over your head, it's ok. Check out this link for a good explanation on how they compare to regular lists/for loops. I hope this gets you off to a good start. Happy trails!
