I would like to compute the RMS amplitude of a Gaussian white noise signal.
import matplotlib.pyplot as plt
import numpy as np
mean = 0
std = 1.0
t = 100
def zv(t):
    return np.random.normal(mean, std, size = t)
def rms(x):
    return np.sqrt(np.mean(zv(x)**2))
plt.plot(zv(t))
plt.plot(rms(t))
The plot of zv(t) works - but I don't know why the plot of rms(t) is just empty.
Do you have some comments?
Best Regards
zv(t) returns a one-dimensional array of size t, so its mean is a single value and rms(t) is therefore a scalar. You can verify this by printing the value of rms(t). If you want to plot the RMS along t, you will need to generate multiple Monte Carlo samples. For example,
def zv(t):
    n = 1000
    return np.random.normal(mean, std, size = (n, t))
def rms(x):
    return np.sqrt(np.mean(zv(x)**2, axis = 0))
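To make the point concrete, here is a minimal sketch (not part of the original answer). It checks that the RMS of one realization is a scalar, then computes two array-valued quantities that can actually be plotted: the Monte Carlo RMS across realizations, as in the answer, and a running RMS over the first k samples of a single realization. The running RMS is a different technique, added here only for illustration. The names `rng`, `n_realizations` and `running_rms` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for repeatability
mean, std, t = 0.0, 1.0, 100

# One realization: the RMS collapses the whole array to a single number.
x = rng.normal(mean, std, size=t)
rms_scalar = np.sqrt(np.mean(x**2))
print(np.ndim(rms_scalar))  # 0, so plt.plot(rms_scalar) draws no visible line

# Monte Carlo RMS: one value per time index, averaged over many realizations.
n_realizations = 1000
samples = rng.normal(mean, std, size=(n_realizations, t))
rms_mc = np.sqrt(np.mean(samples**2, axis=0))  # shape (t,)

# Running RMS of the single realization: RMS of the first k samples.
running_rms = np.sqrt(np.cumsum(x**2) / np.arange(1, t + 1))  # shape (t,)
```

Both `rms_mc` and `running_rms` have length t, so either can be passed to plt.plot.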
The Gaussian basis function is given by the following equation.
Essentially, I am creating a data set made up of N = 25 observations x_n in the range [0, 1] and my target values function_s_noise. Since the 24 Gaussian basis functions will be used for a regression model, I created a design matrix phi, and when I plot it I expect this result. However, this is what I get when plotting phi.
I am mostly unsure what the values of mu and s should be for the corresponding basis functions.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
x_n=np.arange(0,1,0.04) #[0,1]
function = np.sin(x_n*np.pi*2)
n = 25 # Number of data points
noise = np.random.normal(0.0, 0.01, n) # Random Gaussian noise
# print(noise)
function_s_noise = function + noise
def gaussian_basis(x, mu, s=7):
    return np.exp(-(x-mu)**2/(2*(s^2)))
M = 24
# Calculate design matrix Phi
phi = np.ones((x_n.shape[0], M))
for m in range(M-1):
    mu = m/M
    phi[:, m+1] = np.vectorize(gaussian_basis)(x_n, mu)
plt.plot(phi)
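Two things in the basis-function code above are likely to flatten the curves. In Python, `^` is bitwise XOR rather than exponentiation, so `s^2` with `s=7` evaluates to 5. A width of that order is also far larger than the [0, 1] input range, so every basis function stays close to 1. The sketch below uses `s**2` with an assumed width of `s=0.1` and assumed centres evenly spaced over [0, 1]. Neither value comes from the question, and both would need tuning for a real regression.

```python
import numpy as np

def gaussian_basis(x, mu, s=0.1):
    # ** is exponentiation; ^ would be bitwise XOR (7 ^ 2 == 5)
    return np.exp(-(x - mu) ** 2 / (2 * s ** 2))

x_n = np.arange(0, 1, 0.04)          # inputs in [0, 1), as in the question
M = 24                               # bias column + 23 Gaussian columns
centres = np.linspace(0, 1, M - 1)   # assumed: centres evenly spaced over [0, 1]

# Column 0 stays at 1 (bias); broadcasting fills the remaining columns at once.
phi = np.ones((x_n.shape[0], M))
phi[:, 1:] = gaussian_basis(x_n[:, None], centres[None, :])
```

Plotting `phi` against `x_n` should then show one bump per centre instead of nearly flat lines.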
I would like to smooth time series data. For this I would like to use Python.
Now I have already found the function scipy.ndimage.gaussian_filter1d.
For this, the array and a sigma value must be passed.
Now to my question:
Is the sigma value equal to the filter length?
I would like to run a filter of length 365 over the data.
Would it then be the correct procedure to set this sigma value to 365 or am I confusing things?
sigma defines how widely your Gaussian filter is spread around its mean; it is not the filter length. You can create a Gaussian filter with a specific size as shown below.
import numpy as np
import matplotlib.pyplot as plt
sigma1 = 3
sigma2 = 50
def gaussian_filter1d(size,sigma):
    filter_range = np.linspace(-int(size/2),int(size/2),size)
    gaussian_filter = [1 / (sigma * np.sqrt(2*np.pi)) * np.exp(-x**2/(2*sigma**2)) for x in filter_range]
    return gaussian_filter
fig,ax = plt.subplots(1,2)
ax[0].plot(gaussian_filter1d(size=365,sigma=sigma1))
ax[0].set_title(f'sigma= {sigma1}')
ax[1].plot(gaussian_filter1d(size=365,sigma=sigma2))
ax[1].set_title(f'sigma= {sigma2}')
plt.show()
Here is the effect of sigma on the Gaussian filter.
Later, you might convolve your signal with your Gaussian filter.
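As for whether sigma equals the filter length in scipy.ndimage.gaussian_filter1d: as far as I understand it, sigma is the standard deviation in samples, and the kernel is cut off at `truncate` standard deviations on each side (default 4.0), which gives a kernel length of `2 * int(truncate * sigma + 0.5) + 1`. The sketch below checks that relation empirically by filtering a unit impulse. `kernel_length` is a hypothetical helper name, not a SciPy function.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def kernel_length(sigma, truncate=4.0):
    # Hypothetical helper: kernel length implied by sigma and truncate.
    radius = int(truncate * sigma + 0.5)
    return 2 * radius + 1

# Measure the kernel support by filtering a unit impulse.
impulse = np.zeros(1001)
impulse[500] = 1.0
response = gaussian_filter1d(impulse, sigma=3, mode="constant")
measured = np.count_nonzero(response)

print(measured, kernel_length(3))

# A kernel 365 samples long therefore needs sigma near 365 / 8, not 365.
sigma_365 = (365 - 1) / 2 / 4.0
print(kernel_length(sigma_365))
```

Keep in mind that a Gaussian kernel of length 365 weights the centre much more heavily than the edges, so it is not equivalent to a 365-sample moving average.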
I want a normal curve to fit the histogram I already have.
navf2 is a list of normalized random numbers and the histogram is based on those, and I want a curve to show the general trend of the histogram.
while len(navf2)<252:
    number=np.random.normal(0,1,None)
    navf2.append(number)
bin_edges=np.arange(70,130,1)
plt.style.use(["dark_background",'ggplot'])
plt.hist(navf2, bins=bin_edges, alpha=1)
plt.ylabel("Frequency of final NAV")
plt.xlabel("Ranges")
ymin=0
ymax=100
plt.ylim([ymin,ymax])
plt.show()
Here you go:
=^..^=
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
# create raw data
data = np.random.uniform(size=252)
# distribution fitting
mu, sigma = norm.fit(data)
# fitting distribution
x = np.linspace(-0.5,1.5,100)
y = norm.pdf(x, loc=mu, scale=sigma)
# plot data
plt.plot(x, y,'r-')
plt.hist(data, density=1, alpha=1)
plt.show()
Output:
Here is another solution using the code from your question. We can achieve the expected result without the scipy library. We have to do three things: compute the mean of the data set, compute its standard deviation, and create a function that generates the normal (Gaussian) curve.
To compute the mean we can use the function from the numpy library, i.e. mu = np.mean(your_data_set_here)
The standard deviation of the set is the square root of the mean of the squared differences between the values and the mean (https://en.wikipedia.org/wiki/Standard_deviation). We can express it in code as follows, using the numpy library again:
data_set = np.asarray(your_data_set_here) # a NumPy array, so that data_set - mu works element-wise
sigma = np.sqrt(1/(len(data_set))*sum((data_set-mu)**2))
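The expression above is the population standard deviation (it divides by N), which is also what `np.std` returns by default (`ddof=0`). A quick check on made-up data, assuming a seeded generator purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)          # made-up data for illustration
data_set = rng.normal(0, 1, size=252)

mu = np.mean(data_set)
sigma_manual = np.sqrt(np.sum((data_set - mu) ** 2) / len(data_set))
sigma_numpy = np.std(data_set)          # ddof=0 by default: divides by N

print(np.isclose(sigma_manual, sigma_numpy))
```

Passing `ddof=1` to `np.std` would give the sample standard deviation (dividing by N - 1) instead.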
Finally, we have to build the function for the normal curve, or Gaussian (https://en.wikipedia.org/wiki/Gaussian_function). It relies on both the mean (mu) and the standard deviation (sigma), so we will use those as parameters in our function:
def Gaussian(x,sigma,mu): # sigma is the standard deviation and mu is the mean
    return ((1/(np.sqrt(2*np.pi)*sigma))*np.exp(-(x-mu)**2/(2*sigma**2)))
Putting it all together looks like this:
import numpy as np
import matplotlib.pyplot as plt
navf2 = []
while len(navf2)<252:
    number=np.random.normal(0,1,None) # values are drawn from N(0, 1), so bin edges from 70 to 130 do not work
    navf2.append(number)
navf2 = np.asarray(navf2) # convert to array for better results
mu = np.mean(navf2) #the avg of all values in navf2
sigma = np.sqrt(1/(len(navf2))*sum((navf2-mu)**2)) # standard deviation of navf2
x_vals = np.arange(min(navf2),max(navf2),0.001) # create a flat range based off data
# to build the curve
gauss = [] #store values for normal curve here
def Gaussian(x,sigma,mu): # defining the normal curve
    return ((1/(np.sqrt(2*np.pi)*sigma))*np.exp(-(x-mu)**2/(2*sigma**2)))
for val in x_vals :
    gauss.append(Gaussian(val,sigma,mu))
plt.style.use(["dark_background",'ggplot'])
plt.hist(navf2, density = 1, alpha=1) # add density = 1 to fix the scaling issues
plt.ylabel("Frequency of final NAV")
plt.xlabel("Ranges")
plt.plot(x_vals,gauss)
plt.show()
Here is a picture of an output:
Hope this helps. I tried to keep it as close to your original code as possible!
I have a set of points (x,y) as two vectors
x,y for example:
from pylab import *
x = sorted(random(30))
y = random(30)
plot(x,y, 'o-')
Now I would like to smooth this data with a Gaussian and evaluate it only at certain (regularly spaced) points on the x-axis. Let's say for:
x_eval = linspace(0,1,11)
I got the tip that this method is called a "Gaussian sum filter", but so far I have not found any implementation in numpy/scipy for that, although it seems like a standard problem at first glance.
As the x values are not equally spaced I can't use the scipy.ndimage.gaussian_filter1d.
Usually this kind of smoothing is done by going through Fourier space and multiplying with the kernel, but I don't really know if this will be possible with irregularly spaced data.
Thanks for any ideas
This will blow up for very large datasets, but the proper calculation you are asking for would be done as follows:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0) # for repeatability
x = np.random.rand(30)
x.sort()
y = np.random.rand(30)
x_eval = np.linspace(0, 1, 11)
sigma = 0.1
delta_x = x_eval[:, None] - x
weights = np.exp(-delta_x*delta_x / (2*sigma*sigma)) / (np.sqrt(2*np.pi) * sigma)
weights /= np.sum(weights, axis=1, keepdims=True)
y_eval = np.dot(weights, y)
plt.plot(x, y, 'bo-')
plt.plot(x_eval, y_eval, 'ro-')
plt.show()
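The memory cost comes from the full `len(x_eval)` by `len(x)` weight matrix. One possible way to limit it, sketched here under the assumption that looping over blocks of evaluation points is acceptable, is to build that matrix a chunk at a time. `gaussian_smooth_chunked` and `chunk_size` are hypothetical names, not part of the answer above.

```python
import numpy as np

def gaussian_smooth_chunked(x, y, x_eval, sigma, chunk_size=1024):
    # Hypothetical helper: same weighted average, one block of x_eval at a time.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    y_eval = np.empty(x_eval.shape[0])
    for start in range(0, x_eval.shape[0], chunk_size):
        stop = start + chunk_size
        delta_x = x_eval[start:stop, None] - x          # shape (chunk, len(x))
        weights = np.exp(-delta_x**2 / (2 * sigma**2))
        # The 1/(sqrt(2*pi)*sigma) prefactor cancels in the normalization.
        weights /= weights.sum(axis=1, keepdims=True)
        y_eval[start:stop] = weights @ y
    return y_eval

rng = np.random.default_rng(0)
x = np.sort(rng.random(30))
y = rng.random(30)
x_eval = np.linspace(0, 1, 11)
y_eval = gaussian_smooth_chunked(x, y, x_eval, sigma=0.1, chunk_size=4)
```

Peak memory then scales with `chunk_size * len(x)` rather than with the full product of the two lengths, at the cost of a Python-level loop.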
I'll preface this answer by saying that this is more of a DSP question than a programming question...
...that being said, there is a simple two-step solution to your problem.
Step 1: Resample the data
So to illustrate this we can create a random data set with unequal sampling:
import numpy as np
x = np.cumsum(np.random.randint(0,100,100))
y = np.random.normal(0,1,size=100)
This gives something like:
We can resample this data using simple linear interpolation:
nx = np.arange(x.max()) # choose new x axis sampling
ny = np.interp(nx,x,y) # generate y values for each x
This converts our data to:
Step 2: Apply filter
At this stage you can use some of the tools available through scipy to apply a Gaussian filter to the data with a given sigma value:
from scipy.ndimage import gaussian_filter1d  # the scipy.ndimage.filters namespace is deprecated
fx = gaussian_filter1d(ny,sigma=100)
Plotting this up against the original data we get:
The choice of the sigma value determines the width of the filter.
Based on @Jaime's answer I wrote a function that implements this with some additional documentation and the ability to discard estimates far from the data points.
I think confidence intervals could be obtained on this estimate by bootstrapping, but I haven't done this yet.
def gaussian_sum_smooth(xdata, ydata, xeval, sigma, null_thresh=0.6):
    """Apply gaussian sum filter to data.

    xdata, ydata : array
        Arrays of x- and y-coordinates of data.
        Must be 1d and have the same length.
    xeval : array
        Array of x-coordinates at which to evaluate the smoothed result
    sigma : float
        Standard deviation of the Gaussian to apply to each data point
        Larger values yield a smoother curve.
    null_thresh : float
        For evaluation points far from data points, the estimate will be
        based on very little data. If the total weight is below this threshold,
        return np.nan at this location. Zero means always return an estimate.
        The default of 0.6 corresponds to approximately one sigma away
        from the nearest datapoint.
    """
    # Distance between every combination of xdata and xeval
    # each row corresponds to a value in xeval
    # each col corresponds to a value in xdata
    delta_x = xeval[:, None] - xdata
    # Calculate weight of every value in delta_x using Gaussian
    # Maximum weight is 1.0 where delta_x is 0
    weights = np.exp(-0.5 * ((delta_x / sigma) ** 2))
    # Multiply each weight by every data point, and sum over data points
    smoothed = np.dot(weights, ydata)
    # Nullify the result when the total weight is below threshold
    # This happens at evaluation points far from any data
    # 1-sigma away from a data point has a weight of ~0.6
    nan_mask = weights.sum(1) < null_thresh
    smoothed[nan_mask] = np.nan
    # Normalize by dividing by the total weight at each evaluation point
    # Nullification above avoids divide by zero warnings here
    smoothed = smoothed / weights.sum(1)
    return smoothed
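The bootstrapping idea mentioned above could be sketched roughly as follows: resample the (x, y) pairs with replacement, smooth each resample, and take percentiles across the resulting curves. This is only one possible approach (a percentile bootstrap), not something the author implemented. `smooth`, `bootstrap_band`, `n_boot`, `level` and `seed` are hypothetical names, and the compact `smooth` helper omits the `null_thresh` handling.

```python
import numpy as np

def smooth(xdata, ydata, xeval, sigma):
    # Compact Gaussian-weighted average (no null_thresh handling).
    weights = np.exp(-0.5 * ((xeval[:, None] - xdata) / sigma) ** 2)
    return weights @ ydata / weights.sum(axis=1)

def bootstrap_band(xdata, ydata, xeval, sigma, n_boot=500, level=0.95, seed=0):
    # Hypothetical helper: percentile bootstrap over resampled (x, y) pairs.
    rng = np.random.default_rng(seed)
    n = len(xdata)
    curves = np.empty((n_boot, len(xeval)))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)     # sample indices with replacement
        curves[i] = smooth(xdata[idx], ydata[idx], xeval, sigma)
    alpha = (1 - level) / 2
    lower, upper = np.quantile(curves, [alpha, 1 - alpha], axis=0)
    return lower, upper

rng = np.random.default_rng(1)
xdata = np.sort(rng.random(30))
ydata = rng.random(30)
xeval = np.linspace(0, 1, 11)
lower, upper = bootstrap_band(xdata, ydata, xeval, sigma=0.1)
```

The band between `lower` and `upper` can be drawn with plt.fill_between(xeval, lower, upper). It reflects sampling variability only and says nothing about bias introduced by the choice of sigma.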
I have two lists (of different lengths) of numbers.
Using Python, I want to calculate histograms with, say, 10 bins.
Then I want to smooth these two histograms with a standard kernel (Gaussian kernel with mean = 0, sigma = 1).
Then I want to calculate the KL distance between these two smoothed histograms.
I found some code for calculating histograms, but I am not sure how to apply the standard kernel for smoothing and then how to calculate the KL distance.
Please help.
For calculating histograms you can use numpy.histogram(), and for Gaussian smoothing scipy.ndimage.gaussian_filter(). Kullback-Leibler divergence code can be found here.
A function that does the required calculation would look something like this:
import numpy as np
from scipy.ndimage import gaussian_filter  # scipy.ndimage.filters is deprecated

def kl(p, q):
    # np.float was removed from NumPy; the builtin float is equivalent
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(np.where(p != 0, p * np.log(p / q), 0))

def smoothed_hist_kl_distance(a, b, nbins=10, sigma=1):
    ahist, bhist = (np.histogram(a, bins=nbins)[0],
                    np.histogram(b, bins=nbins)[0])
    # cast the counts to float so the filter output is not truncated to integers
    asmooth, bsmooth = (gaussian_filter(ahist.astype(float), sigma),
                        gaussian_filter(bhist.astype(float), sigma))
    return kl(asmooth, bsmooth)
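A few details matter for the result to behave like a KL divergence: both histograms should share the same bin edges, the smoothed counts should be normalized to sum to 1, and empty bins in the second histogram need a small floor to avoid dividing by zero. The variant below is a sketch under those assumptions. `smoothed_hist_kl` and `eps` are hypothetical names, and the choice of `eps` is arbitrary.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smoothed_hist_kl(a, b, nbins=10, sigma=1.0, eps=1e-12):
    # Hypothetical variant: shared bin edges, normalized histograms, eps floor.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    edges = np.linspace(lo, hi, nbins + 1)      # same bins for both samples
    p = np.histogram(a, bins=edges)[0].astype(float)
    q = np.histogram(b, bins=edges)[0].astype(float)
    p = gaussian_filter1d(p, sigma) + eps       # eps avoids log(0) and x/0
    q = gaussian_filter1d(q, sigma) + eps
    p /= p.sum()                                # turn counts into probabilities
    q /= q.sum()
    return np.sum(p * np.log(p / q))            # KL(p || q), in nats

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=500)
b = rng.normal(0.5, 1.0, size=300)
print(smoothed_hist_kl(a, a), smoothed_hist_kl(a, b))
```

Note that `sigma` here is measured in bins, not in the units of the data, and that KL divergence is not symmetric, so `smoothed_hist_kl(a, b)` and `smoothed_hist_kl(b, a)` generally differ.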