I am using this function that I found on the web, to add speckle noise to images for research purposes:
def add_speckle(k,theta,img):
    gauss = np.random.gamma(k, theta, img.size)
    gauss = gauss.reshape(img.shape[0], img.shape[1], img.shape[2]).astype('uint8')
    noise = img + img * gauss
    return noise
My issue is that I want to specify the speckle noise I add through a standard deviation (sigma) parameter. The function I found relies on the gamma distribution, i.e. np.random.gamma(), which takes the parameters k and theta (shape and scale), as in the gamma pdf: f(x; k, theta) = x**(k-1) * exp(-x/theta) / (Gamma(k) * theta**k)
As far as I know, the variance of the gamma distribution is: variance = k * theta**2
so the standard deviation, sigma, is: sigma = sqrt(k) * theta
I want to add speckle noise as a function of sigma, so there should be a way to determine k and theta (shape and scale) from the sigma given as input, so that the add_speckle() function would look something like this:
def add_speckle(sigma,img):
Edit, following the answer in the comments:
def add_speckle(sigma,mean,img):
    theta = sigma ** 2 / mean
    k = mean / theta
    gauss = np.random.gamma(k,theta,img.size)
    gauss = gauss.reshape(img.shape[0],img.shape[1],img.shape[2]).astype('uint8')
    noise = img + img * gauss
    return noise
img = cv2.imread('/content/Hand.jpeg')
Thanks for your help, but I really don't understand why the values of k and theta change each time I change the mean while sigma is held constant. I thought they should not change?
As you have noticed, sigma = k ** 0.5 * theta, so there are infinitely many possible parameter pairs for the gamma distribution if only sigma is given (e.g. if sigma is 1, (k, theta) can be (1, 1) or (4, 0.5) and so on).
If you really want to generate the speckle from statistical quantities as input, I suggest adding the mean as a second input so that the required (k, theta) can be calculated.
The first moment (ie. mean) of a gamma distribution is simply k * theta.
def add_speckle(mean, sigma, img):
    # find theta
    theta = sigma ** 2 / mean
    k = mean / theta
    # your code proceeds...
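As a minimal sketch of how these pieces fit together, the snippet below derives (k, theta) from a requested mean and sigma, draws gamma noise, and checks the sample statistics. The helper name gamma_params, the use of np.random.default_rng, and the clipping to an 8-bit range are assumptions made for illustration and are not part of the original code. The sketch also keeps the noise as floats, since casting it to uint8 before multiplying would truncate it to integers and change its statistics.

```python
import numpy as np

def gamma_params(mean, sigma):
    """Return (k, theta) of a gamma distribution with the given mean and std."""
    theta = sigma ** 2 / mean   # variance / mean
    k = mean / theta            # equivalently (mean / sigma) ** 2
    return k, theta

def add_speckle(mean, sigma, img, rng=None):
    """Multiplicative gamma noise, img + img * noise (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    k, theta = gamma_params(mean, sigma)
    noise = rng.gamma(k, theta, size=img.shape)       # float noise, same shape as img
    noisy = img.astype(np.float64) * (1.0 + noise)
    return np.clip(noisy, 0, 255).astype(np.uint8)    # assumes an 8-bit image

rng = np.random.default_rng(0)
k, theta = gamma_params(mean=0.5, sigma=0.2)
sample = rng.gamma(k, theta, size=200_000)
print(k, theta)
print(sample.mean(), sample.std())   # should be close to 0.5 and 0.2

# Random stand-in for a real image (assumption: 3-channel, 8-bit)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
noisy = add_speckle(0.5, 0.2, img, rng)
```

Since k = (mean / sigma) ** 2 and theta = sigma ** 2 / mean, changing the mean while keeping sigma fixed necessarily changes both k and theta.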
I am trying to fit some data using scipy.optimize.curve_fit. I have read the documentation and also this StackOverflow post, but neither seems to answer my question.
I have some simple 2D data which looks approximately like a trig function. I want to fit it with a general trig function, y(t) = A*cos(omega*t + dphi) + C, using scipy.
My approach is as follows:
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
#Load the data
data = np.loadtxt('example_data.txt')
t = data[:,0]
y = data[:,1]
#define the function to fit
def func_cos(t,A,omega,dphi,C):
    # A is the amplitude, omega the frequency, dphi and C the horizontal/vertical shifts
    return A*np.cos(omega*t + dphi) + C
#do a scipy fit
popt, pcov = curve_fit(func_cos, t,y)
#Plot fit data and original data
fig = plt.figure(figsize=(14,10))
ax1 = plt.subplot2grid((1,1), (0,0))
This outputs a plot where blue is the data and orange is the fit. Clearly I am doing something wrong. Any pointers?
If no initial guess p0 is provided for the parameters, a value of 1 is assumed for each of them. From the docs:
p0 : array_like, optional
Initial guess for the parameters (length N). If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised).
Since your data has very large x-values and very small y-values an initial guess of 1 is far from the actual solution and hence the optimizer does not converge. You can help the optimizer by providing suitable initial parameter values that can be guessed / approximated from the data:
Amplitude: A = (y.max() - y.min()) / 2
Offset: C = (y.max() + y.min()) / 2
Frequency: Here we can estimate the number of zero crossings by multiplying consecutive y-values (after subtracting the offset) and counting the products that are smaller than zero. A cosine crosses zero twice per period, so this number divided by the total t-range equals omega / pi, and multiplying by pi gives the angular frequency: y_shifted = y - offset; omega = np.pi * np.sum(y_shifted[:-1] * y_shifted[1:] < 0) / (t.max() - t.min())
Phase shift: can be set to zero, dphi = 0
So in summary, the following initial parameter guess can be used:
offset = (y.max() + y.min()) / 2
y_shifted = y - offset
p0 = (
    (y.max() - y.min()) / 2,  # amplitude A
    np.pi * np.sum(y_shifted[:-1] * y_shifted[1:] < 0) / (t.max() - t.min()),  # frequency omega
    0,  # phase shift dphi
    offset,  # vertical offset C
)
popt, pcov = curve_fit(func_cos, t, y, p0=p0)
Which gives me the following fit function:
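To make the recipe easy to try without the original data file, here is a self-contained sketch on synthetic data. The data set, the "true" parameter values and the noise level are invented for illustration (the real example_data.txt is not available here), so this only demonstrates the initial-guess procedure and is not a reproduction of the fit above.

```python
import numpy as np
from scipy.optimize import curve_fit

def func_cos(t, A, omega, dphi, C):
    return A * np.cos(omega * t + dphi) + C

# Synthetic stand-in for the data file (all values here are assumptions)
rng = np.random.default_rng(1)
t = np.linspace(0.0, 100.0, 400)
true_params = (0.02, 0.5, 0.3, 0.1)   # A, omega, dphi, C
y = func_cos(t, *true_params) + rng.normal(0.0, 0.0002, t.size)

# Initial guess estimated from the data, as described in the answer
offset = (y.max() + y.min()) / 2
y_shifted = y - offset
n_cross = np.sum(y_shifted[:-1] * y_shifted[1:] < 0)   # number of sign changes
p0 = (
    (y.max() - y.min()) / 2,                 # amplitude
    np.pi * n_cross / (t.max() - t.min()),   # angular frequency
    0,                                       # phase shift
    offset,                                  # vertical offset
)

popt, pcov = curve_fit(func_cos, t, y, p0=p0)
rms = np.sqrt(np.mean((func_cos(t, *popt) - y) ** 2))
print(p0)
print(popt, rms)
```

The zero-crossing count is sensitive to noise near the crossings, so for noisier data it may help to smooth y before counting.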
I am trying to calculate the mean and standard deviation of a Burr distribution, but I am not quite sure how to input this. The pdf I am using is: f(x) = (alpha*gamma*lambda**alpha*x**(gamma-1))/(lambda+x**gamma)**(alpha+1) from the IFoA Formulae.
I have calculated the parameters to be: alpha = 2.3361635751273977, lambda = 10.596809948869414 and gamma = 0.5 in order to get mean = 500 and std = 600.
Could someone suggest how I should input the data into scipy.stats.burr or scipy.stats.burr12?
You need burr12 here, not burr. (The difference is in the sign of the power of x that sits inside another power. Confusingly, it's burr12 that is usually called simply Burr outside of SciPy, not the thing that SciPy calls burr.)
The Burr XII PDF is written in SciPy as c*d*x**(c-1)*(1+x**c)**(-d-1) where c, d are positive shape parameters. Your formula
(alpha*gamma*lamda**alpha*x**(gamma-1)) / (lamda+x**gamma)**(alpha+1)
has lambda in place of 1, so there is some scaling involved. SciPy docs say
burr12.pdf(x, c, d, loc, scale) is identically equivalent to burr12.pdf(y, c, d) / scale with y = (x - loc) / scale.
So, in order for lamda+x**gamma to be a constant multiple of 1 + (x/scale)**gamma, we need scale to be lamda**(1/gamma). The exponents correspond to SciPy notation as c = gamma and d = alpha. Let's test this:
from scipy.stats import burr12
alpha = 2.3361635751273977
lamda = 10.596809948869414
gamma = 0.5
scale = lamda**(1/gamma)
c = gamma
d = alpha
print(burr12.mean(c, d, loc=0, scale=scale))
print(burr12.std(c, d, loc=0, scale=scale))
which prints
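As a cross-check of the parameter mapping, the sketch below compares the PDF from the question with burr12.pdf evaluated at c = gamma, d = alpha and scale = lamda**(1/gamma). The helper name ifoa_pdf and the grid of x values are assumptions introduced here for illustration.

```python
import numpy as np
from scipy.stats import burr12

alpha = 2.3361635751273977
lamda = 10.596809948869414   # "lambda" is a reserved word in Python
gamma = 0.5

def ifoa_pdf(x):
    """Burr PDF in the parameterisation quoted in the question."""
    return (alpha * gamma * lamda**alpha * x**(gamma - 1)) / (lamda + x**gamma)**(alpha + 1)

x = np.linspace(1.0, 2000.0, 50)
scipy_pdf = burr12.pdf(x, gamma, alpha, loc=0, scale=lamda**(1 / gamma))

# Largest relative difference between the two expressions on the grid
max_rel_diff = np.max(np.abs(scipy_pdf - ifoa_pdf(x)) / ifoa_pdf(x))
print(max_rel_diff)
```

If the mapping is right, the two expressions should agree up to floating-point rounding.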
I am aware that the following will require patience, and I do appreciate the effort you will be putting in.
I have measured data which represent the derivative of the magnetic moment, dM/dH. A good mathematical model of the M(H) curve is the Langevin function:
M(H) = coth(xi) - 1/xi , xi = cte*Vi³
so the derivative of the magnetic moment can be obtained from the derivative of the Langevin function:
dM/dH = 1/xi² - 1/(sinh²(xi))
For the fitting I used this function as a fitting function :
def langevinDeriv(xx):
    if not hasattr(xx, '__iter__'):
        xx = [ xx ]
    res = np.zeros(len(xx))
    eps = 1e-1
    for i in range(len(xx)):
        x = xx[i]
        if np.fabs(x) < eps:
            # series expansion around x = 0, where the closed form suffers from cancellation
            res[i] = 1./3. - x**2/15. + 2.* x**4 / 189. - x**6/675. + 2.* x**8 / 10395. - 1382. * x**10 / 58046625. + 4. * x**12 / 1403325.
        else:
            res[i] = (1./x**2 - 1./np.sinh(x)**2)
    return res
and minimized the error with a simple Least square function.
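As a sanity check on the formulas above, the following sketch compares the closed-form derivative 1/x**2 - 1/sinh(x)**2 with a central finite difference of coth(x) - 1/x. The function names, the vectorised layout, and the series truncated to three terms are choices made for this illustration and differ from the loop-based function above.

```python
import numpy as np

def langevin(x):
    """L(x) = coth(x) - 1/x, valid for x != 0."""
    return 1.0 / np.tanh(x) - 1.0 / x

def langevin_deriv(x, eps=1e-1):
    """L'(x) = 1/x**2 - 1/sinh(x)**2, with a short series near x = 0."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    small = np.abs(x) < eps
    res = np.empty_like(x)
    xs = x[small]
    res[small] = 1.0 / 3.0 - xs**2 / 15.0 + 2.0 * xs**4 / 189.0   # truncated series
    xl = x[~small]
    res[~small] = 1.0 / xl**2 - 1.0 / np.sinh(xl)**2
    return res

x = np.array([0.5, 1.0, 2.0, 5.0])
h = 1e-5
numeric = (langevin(x + h) - langevin(x - h)) / (2 * h)   # central difference
print(np.max(np.abs(numeric - langevin_deriv(x))))

print(langevin_deriv(0.0))   # series branch at the origin, limit 1/3
```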
Here is what I got (figure: comparison of fit and data).
I would say that the fit is not good, because I actually do not have a single particle diameter but a polydisperse ensemble with different diameters, and therefore a superposition of different Langevin-derivative functions.
My question is: how can I integrate the probability density of the diameter into my fitting function, so that the program fits a probability distribution rather than a single diameter Vi? The probability density function is given here:
So I fiddled around a bit. As mentioned in the comments, the fit will never give great results, as the model does not capture the drop in signal at the ends (nor the step-like behaviour in the graph). The results, however, look much better than with a simple Langevin derivative. I basically sum up functions with different particle volumes, up to a maximum diameter. You can control the maximum diameter and the number of diameters used in the range from 0 to the maximum diameter. The only two fit parameters are the standard deviation and the overall amplitude. You have to be careful with the scaling to get physically meaningful results. I already played a little with n and d_max and found that, in my scaling, (15, 3) is OK. I guess d_max should be sufficiently larger than s, and n reasonably large, so that several values lie near the maximum of the log-normal distribution.
import matplotlib
from matplotlib import pyplot as plt
import numpy as np
from scipy.optimize import curve_fit ,leastsq
def log_gauss(x,s):
if x==0 or s==0:
if abs(exponent)>100:
out=np.exp(exponent)/np.sqrt(2 * np.pi * x**2 * s**2)
return out
def langevin(x,epsilon=1e-4):
if abs(x)<epsilon:
return out
def langevin_d(x,epsilon=1e-4):
if abs(x)<epsilon:
elif abs(x)>100.:
out= 1./x**2
return out
def langevin_d_distributed(h,s,n=25,dMax=10):
pdiaList=[log_gauss(d,s) for d in diaList]
volList=[d**3 for d in diaList]
for v,p in zip(volList,pdiaList):
return dm
def residuals(parameters,dataPoint,n=25,dMax=10):
a,s = abs(parameters)
dist = [y -a*langevin_d_distributed(x,s,n=n,dMax=dMax) for x,y in dataPoint]
return dist
meas_x,meas_y=np.loadtxt('OBaPH.txt', delimiter=',',unpack=True)
langevinDList=[langevin_d(h) for h in hList]
distList_01=[langevin_d_distributed(h,.29) for h in hList]
estimate = [1,0.29]
for nnn,ddd in [(15,3),(15,1.5),(15,10),(5,3),(25,3)]:
bestFitValues[(nnn,ddd)], ier = leastsq(residuals, estimate,args=(dataTupel,nnn,ddd))
print bestFitValues[(nnn,ddd)]
myFit[(nnn,ddd)]= [bestFitValues[(nnn,ddd)][0]*langevin_d_distributed(h,bestFitValues[(nnn,ddd)][1],n=nnn,dMax=ddd) for h in hList]
ax.plot(meas_x,meas_y,linestyle='',marker='o',label='rescaled data')
ax.plot(hList,distList_01,label='log_norm test')
for key,val in myFit.iteritems():
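The idea of summing Langevin derivatives over a log-normal distribution of diameters can be sketched compactly as follows. This is an independent illustration and not a reconstruction of the listing above: the diameter grid, the normalisation of the weights, and the assumption that the ensemble moment is M(h) = sum_i w_i * L(h * v_i) with v_i = d_i**3 are all choices made here. Under that assumption the chain rule gives dM/dh = sum_i w_i * v_i * L'(h * v_i), which the last lines check numerically.

```python
import numpy as np

def log_normal_pdf(d, s):
    """Log-normal density with median 1 and shape parameter s (d > 0)."""
    return np.exp(-np.log(d)**2 / (2 * s**2)) / (d * s * np.sqrt(2 * np.pi))

def langevin(x):
    """L(x) = coth(x) - 1/x, using the limit x/3 very close to zero."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-4
    safe = np.where(small, 1.0, x)   # placeholder avoids division by zero
    return np.where(small, x / 3.0, 1.0 / np.tanh(safe) - 1.0 / safe)

def langevin_d(x):
    """L'(x) = 1/x**2 - 1/sinh(x)**2, using the limit 1/3 very close to zero."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-4
    safe = np.where(small, 1.0, x)
    return np.where(small, 1.0 / 3.0, 1.0 / safe**2 - 1.0 / np.sinh(safe)**2)

def diameter_grid(s, n=15, d_max=3.0):
    """Normalised log-normal weights and volumes on n diameters in (0, d_max]."""
    d = np.linspace(d_max / n, d_max, n)   # assumed grid, skips d = 0
    w = log_normal_pdf(d, s)
    return w / w.sum(), d**3

def moment(h, s, n=15, d_max=3.0):
    w, v = diameter_grid(s, n, d_max)
    return np.sum(w * langevin(h * v))

def langevin_d_distributed(h, s, n=15, d_max=3.0):
    w, v = diameter_grid(s, n, d_max)
    return np.sum(w * v * langevin_d(h * v))

# Consistency check: analytic derivative vs central finite difference
h, s, step = 0.7, 0.29, 1e-5
numeric = (moment(h + step, s) - moment(h - step, s)) / (2 * step)
print(numeric, langevin_d_distributed(h, s))
```

A fit would then minimise the difference between the data and a * langevin_d_distributed(h, s) over the amplitude a and the width s, for example with scipy.optimize.leastsq as in the answer.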
I need to know how to generate 1000 random numbers between 500 and 600 that have a mean of 550 and a standard deviation of 30 in Python.
import pylab
import random
xrandn = pylab.zeros(1000,float)
for j in range(500,601):
    xrandn[j] = pylab.randn()
You are looking for stats.truncnorm:
import scipy.stats as stats
a, b = 500, 600
mu, sigma = 550, 30
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)
values = dist.rvs(1000)
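One point worth checking with this approach: loc and scale describe the untruncated parent normal, so after cutting off the tails the standard deviation of the resulting distribution is smaller than 30, while the mean stays at 550 because the interval is symmetric around it. The short sketch below, with a fixed random_state added only for reproducibility, inspects this.

```python
import scipy.stats as stats

a, b = 500, 600
mu, sigma = 550, 30
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)
values = dist.rvs(1000, random_state=0)

print(values.min() >= a, values.max() <= b)   # every draw lies inside [500, 600]
print(dist.mean())   # 550 by symmetry of the interval
print(dist.std())    # smaller than 30 because the tails are removed
```

If the standard deviation of the truncated sample itself must equal 30, the scale of the parent normal has to be chosen larger than 30.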
There are other choices for your problem too. Wikipedia has a list of continuous distributions with bounded intervals; depending on the distribution, you may be able to get your required characteristics with the right parameters. For example, if you want something like "a bounded Gaussian bell" (not truncated) you can pick the (scaled) beta distribution:
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
def my_distribution(min_val, max_val, mean, std):
    scale = max_val - min_val
    location = min_val
    # Mean and standard deviation of the unscaled beta distribution
    unscaled_mean = (mean - min_val) / scale
    unscaled_var = (std / scale) ** 2
    # Computation of alpha and beta can be derived from mean and variance formulas
    t = unscaled_mean / (1 - unscaled_mean)
    beta = ((t / unscaled_var) - (t * t) - (2 * t) - 1) / ((t * t * t) + (3 * t * t) + (3 * t) + 1)
    alpha = beta * t
    # Not all parameters may produce a valid distribution
    if alpha <= 0 or beta <= 0:
        raise ValueError('Cannot create distribution for the given parameters.')
    # Make scaled beta distribution with computed parameters
    return scipy.stats.beta(alpha, beta, scale=scale, loc=location)
min_val = 1.5
max_val = 35
mean = 9.87
std = 3.1
my_dist = my_distribution(min_val, max_val, mean, std)
# Plot distribution PDF
x = np.linspace(min_val, max_val, 100)
plt.plot(x, my_dist.pdf(x))
# Stats
print('mean:', my_dist.mean(), 'std:', my_dist.std())
# Get a large sample to check bounds
sample = my_dist.rvs(size=100000)
print('min:', sample.min(), 'max:', sample.max())
mean: 9.87 std: 3.100000000000001
min: 1.9290674232087306 max: 25.03903889816994
Probability density function plot:
Note that not every possible combination of bounds, mean and standard deviation will produce a valid distribution in this case, though, and depending on the resulting values of alpha and beta the probability density function may look like an "inverted bell" instead (even though mean and standard deviation would still be correct).
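For reference, the expressions for alpha and beta used in my_distribution follow from the standard mean and variance of a beta distribution on [0, 1]. In the derivation below, m and v are shorthand introduced here for unscaled_mean and unscaled_var.

```latex
\begin{aligned}
m &= \frac{\alpha}{\alpha+\beta}, \qquad
t = \frac{m}{1-m} = \frac{\alpha}{\beta}
\;\Rightarrow\; \alpha = \beta t, \\
v &= \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}
   = \frac{t}{(t+1)^2\bigl((t+1)\beta+1\bigr)}, \\
\beta &= \frac{t/v-(t+1)^2}{(t+1)^3}
   = \frac{t/v - t^2 - 2t - 1}{t^3+3t^2+3t+1}.
\end{aligned}
```

The numerator t/v - (t+1)^2 must be positive for beta (and hence alpha) to be positive, which is why some combinations of bounds, mean and standard deviation are rejected.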
I'm not exactly sure what the OP wanted, but if the goal is just an array xrandn matching the bottom plot, here are the steps:
First, draw samples from a normal (Gaussian) distribution; the easiest way might be to use numpy:
import numpy as np
random_nums = np.random.normal(loc=550, scale=30, size=1000)
And then you keep only the numbers within the desired range with a list comprehension:
random_nums_filtered = [i for i in random_nums if i>500 and i<600]
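Filtering a single batch of 1000 draws leaves fewer than 1000 values, since some draws fall outside the range. If exactly 1000 values are needed, one option is to keep drawing until enough have been accepted. The sketch below shows this; the function name sample_truncated_normal and the batch strategy are assumptions made for illustration.

```python
import numpy as np

def sample_truncated_normal(n, mu, sigma, low, high, rng=None):
    """Draw n normal samples that fall strictly inside (low, high) by rejection."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(0)
    while out.size < n:
        draw = rng.normal(mu, sigma, size=n)
        accepted = draw[(draw > low) & (draw < high)]   # discard out-of-range draws
        out = np.concatenate([out, accepted])
    return out[:n]

xrandn = sample_truncated_normal(1000, 550, 30, 500, 600, np.random.default_rng(0))
print(xrandn.size, xrandn.min(), xrandn.max())
```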
Right now, I am using the Box-Muller method to generate 1024 Gaussian random numbers in Python. I am supposed to plot the power spectrum, and see a Gaussian curve. My code is below:
import numpy as np
import matplotlib.pyplot as plt
def fast_fourier_transform(y):
    '''Return the fast Fourier transform of y.'''
    Y = np.fft.fft(y)
    f = np.fft.fftfreq(len(y),1.0/1024)
    return f,Y
for i in range((2**10)/2):
    u = np.random.random()
    v = np.random.random()
    z1 = np.sqrt(-2.0 * np.log(u)) * np.sin(2.0 * np.pi * v)
    z2 = np.sqrt(-2.0 * np.log(u)) * np.cos(2.0 * np.pi * v)
    x1 = mu + z1 * sigma
    x2 = mu + z2 * sigma
    print u, v, x1, x2
However, when I plot this, I don't get a Gaussian distribution. My question is this: why am I not getting a Gaussian distribution in my Gaussian-generated white noise power spectrum? Am I plotting something wrong? Thank you in advance.
To see the Gaussian curve, you want a histogram rather than a power spectrum. The power spectrum of independent random variables is uniform (flat). The term "white noise" is itself a big hint - white light is comprised of equal amounts of light at all frequencies.
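The difference between the two views can be checked numerically. The sketch below generates 1024 Box-Muller samples in vectorised form, builds a histogram (where the bell shape shows up), and compares the average power in the lower and upper halves of the spectrum (which should be of similar size for white noise). The values mu = 0 and sigma = 1, the seed, and the choice of bands are assumptions, since the question does not show them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
mu, sigma = 0.0, 1.0   # assumed values

# Vectorised Box-Muller: two uniforms give two independent standard normals
u = 1.0 - rng.random(n // 2)   # in (0, 1], so log(u) is finite
v = rng.random(n // 2)
r = np.sqrt(-2.0 * np.log(u))
x = mu + sigma * np.concatenate([r * np.sin(2 * np.pi * v),
                                 r * np.cos(2 * np.pi * v)])

# Histogram of the samples: counts peak near mu and fall off in the tails
counts, edges = np.histogram(x, bins=20)

# Power spectrum: compare the mean power in two frequency bands
power = np.abs(np.fft.rfft(x - x.mean())) ** 2 / n
low_band = power[1:n // 4].mean()
high_band = power[n // 4:].mean()

print(counts)
print(low_band, high_band)
```

Plotting counts against the bin centres gives the Gaussian curve, while plotting power against frequency gives a noisy but flat line.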