How do I do this in torch?
np.random.normal(loc=mean,scale=stdev,size=vsize)
i.e. by providing the mean and stddev (or variance)
There's a similar torch function torch.normal:
torch.normal(mean=mean, std=std)
If mean and std are scalars, this will produce a single scalar value. You can simply expand them to the target shape you want, e.g.:
mean = 2
std = 10
size = (3,3)
r = torch.normal(mean=torch.full(size, mean).float(), std=torch.full(size, std).float())
print(r)
> tensor([[ 2.2263, 1.1374, 4.5766],
[ 3.4727, 2.6712, 2.4878],
[-0.1787, 2.9600, 2.7598]])
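Note: recent PyTorch versions also accept a scalar mean/std together with an explicit size, which avoids building the full tensors. A minimal sketch, assuming a reasonably recent PyTorch release:
import torch
# Scalar mean/std plus an explicit output shape (no need for torch.full):
r = torch.normal(2.0, 10.0, size=(3, 3))
print(r.shape)  # torch.Size([3, 3])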
I'm trying to test the layer normalization function of PyTorch.
But I don't know why b[0] and result (computed below) have different values.
Did I do something wrong?
import numpy as np
import torch
import torch.nn as nn
a = torch.randn(1, 5)
m = nn.LayerNorm(a.size()[1:], elementwise_affine=False)
b = m(a)
Result:
input: a[0] = tensor([-1.3549, 0.3857, 0.1110, -0.8456, 0.1486])
output: b[0] = tensor([-1.5561, 1.0386, 0.6291, -0.7967, 0.6851])
mean = torch.mean(a[0])
var = torch.var(a[0])
result = (a[0]-mean)/(torch.sqrt(var+1e-5))
Result:
result = tensor([-1.3918, 0.9289, 0.5627, -0.7126, 0.6128])
Also, for an n×2 input, the result of PyTorch's layer norm is always [1.0, -1.0] (or [-1.0, 1.0]). I can't understand why. Please let me know if you have any hints.
a = torch.randn(1, 2)
m = nn.LayerNorm(a.size()[1:], elementwise_affine=False)
b = m(a)
Result:
b = tensor([-1.0000, 1.0000])
For calculating the variance, use torch.var(a[0], unbiased=False). Then you will get the same result. By default, PyTorch calculates the unbiased estimate of the variance.
For your 1st question, as @Theodor said, you need to use unbiased=False when calculating the variance.
Only if you want to explore more: as your input size is 5, the unbiased estimate of the variance will be 5/4 = 1.25 times the biased estimate, because the unbiased estimate uses N-1 instead of N in the denominator. As a result, each value of result that you generated is sqrt(4/5) ≈ 0.8944 times the corresponding value of b[0].
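To make this concrete, here is a small sketch of the corrected manual computation; with unbiased=False it should match nn.LayerNorm up to the eps term:
import torch
import torch.nn as nn

a = torch.randn(1, 5)
m = nn.LayerNorm(a.size()[1:], elementwise_affine=False)
b = m(a)

mean = torch.mean(a[0])
var = torch.var(a[0], unbiased=False)  # biased estimator, as layer norm uses
result = (a[0] - mean) / torch.sqrt(var + 1e-5)

print(torch.allclose(b[0], result, atol=1e-4))  # True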
About your 2nd question:
Also, for an n×2 input, the result of PyTorch's layer norm is always [1.0, -1.0]
This is reasonable. Suppose the only two elements are a and b. Then the mean is (a+b)/2 and the variance is ((a-b)^2)/4, so the normalization result is [((a-b)/2) / sqrt(variance), ((b-a)/2) / sqrt(variance)], which is exactly [1, -1] or [-1, 1], depending on whether a > b or a < b.
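A tiny numerical check of this reasoning (arbitrary values chosen for illustration):
import torch

x = torch.tensor([3.0, 7.0])     # a = 3, b = 7
mean = x.mean()                  # (a + b) / 2 = 5
var = x.var(unbiased=False)      # ((a - b)^2) / 4 = 4
print((x - mean) / var.sqrt())   # tensor([-1., 1.])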
I understand that torch.distributions.Normal(loc, scale) is a class corresponding to a univariate normal distribution in PyTorch.
I understand how it works when loc and scale are numbers.
The problem is when the inputs to torch.distributions.Normal are tensors, as opposed to numbers. In that case I do not understand it well. What is the exact interpretation/usage of such tensor arguments?
See, for example, y_dist in the code below. loc and scale are tensors for y_dist. What exactly does this mean? I do not think this converts the univariate distribution to a multivariate one, does it? Does it instead form a group of univariate distributions?
import torch as pt
ptd = pt.distributions
x_dist = ptd.Normal(loc = 2, scale = 3)
x_samples = x_dist.sample()
batch_size = 256
y_dist = ptd.Normal(loc = 0.25 * pt.ones(batch_size, dtype=pt.float32), scale = pt.ones(batch_size, dtype=pt.float32))
As you said, if loc (a.k.a. mu) and scale (a.k.a. sigma) are floats, then it will sample from a normal distribution with loc as the mean and scale as the standard deviation.
Providing tensors instead of floats will just make it sample from more than one normal distribution independently (unlike torch.distributions.MultivariateNormal, of course).
If you look at the source code you will see loc and scale are broadcasted to the same shape on __init__.
Here's an example to show this behavior:
>>> mu = torch.tensor([-10, 10], dtype=torch.float)
>>> sigma = torch.ones(2, 2)
>>> y_dist = Normal(loc=mu, scale=sigma)
Above mu is 1D, while sigma is 2D, yet:
>>> y_dist.loc
tensor([[-10., 10.],
[-10., 10.]])
So it will get two samples from N(-10, 1) and two samples from N(10, 1)
>>> y_dist.sample()
tensor([[ -9.1686, 10.6062],
[-10.0974, 8.5439]])
Similarly:
>>> mu = torch.zeros(2, 2)
>>> sigma = torch.tensor([0.001, 1000], dtype=torch.float)
>>> y_dist = Normal(loc=mu, scale=sigma)
Will broadcast scale to be:
>>> y_dist.scale
tensor([[1.0000e-03, 1.0000e+03],
[1.0000e-03, 1.0000e+03]])
>>> y_dist.sample()
tensor([[-8.0329e-04, 1.4213e+01],
[-1.4907e-03, 3.1190e+02]])
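If you need several independent draws from the same batch of distributions, you can pass a sample shape to sample(); shown below with shapes only, since actual values will vary:
>>> y_dist.sample((5,)).shape
torch.Size([5, 2, 2])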
I'd like to get an N×M matrix where the numbers in each row are random samples generated from different normal distributions (same mean but different standard deviations). The following code works:
import numpy as np
mean = 0.0 # same mean
stds = [1.0, 2.0, 3.0] # different stds
matrix = np.random.random((3,10))
for i, std in enumerate(stds):
    matrix[i] = np.random.normal(mean, std, matrix.shape[1])
However, this code is not quite efficient as there is a for loop involved. Is there a faster way to do this?
np.random.normal() is vectorized; you can switch axes and transpose the result:
np.random.seed(444)
arr = np.random.normal(loc=0., scale=[1., 2., 3.], size=(1000, 3)).T
print(arr.mean(axis=1))
# [-0.06678394 -0.12606733 -0.04992722]
print(arr.std(axis=1))
# [0.99080274 2.03563299 3.01426507]
That is, the scale parameter here is the column-wise standard deviation, hence the need to transpose via .T, since you want each row to come from a different distribution.
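If you'd rather avoid the transpose, you can also broadcast a column of standard deviations directly against the output shape (a sketch of the same idea):
import numpy as np

stds = np.array([1., 2., 3.])
# scale broadcasts from shape (3, 1) to (3, 1000), so each row already has its own std:
arr = np.random.normal(loc=0., scale=stds[:, None], size=(3, 1000))
print(arr.std(axis=1))  # roughly [1., 2., 3.]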
How about this?
rows = 10000
stds = [1, 5, 10]
data = np.random.normal(size=(rows, len(stds)))
scaled = data * stds
print(np.std(scaled, axis=0))
Output:
[ 0.99417905 5.00908719 10.02930637]
This exploits the fact that two normal distributions can be interconverted by linear scaling (in this case, multiplying by the standard deviation). In the output, each column (second axis) contains a normally distributed variable corresponding to a value in stds.
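The same trick extends to a nonzero mean, since the linear transform x*std + mean maps N(0, 1) to N(mean, std^2); a short sketch:
rows = 10000
mean = 5.0
stds = [1, 5, 10]
data = np.random.normal(size=(rows, len(stds)))
scaled = data * stds + mean      # scale, then shift
print(np.mean(scaled, axis=0))   # roughly [5., 5., 5.]
print(np.std(scaled, axis=0))    # roughly [1., 5., 10.]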
I am a newbie to numpy and recently I got very confused about the random.normal method.
I would like to generate a 2-by-2 matrix where the mean is zero, so I wrote the following. However, as you can see, the abs(0 - np.mean(b)) < 0.01 line outputs False. Why? I expect it to output True.
>>> import numpy as np
>>> b = np.random.normal(0.0, 1.0, (2,2))
>>> b
array([[-1.44446094, -0.3655891 ],
[-1.15680584, -0.56890335]])
>>> abs(0 - np.mean(b)) < 0.01
False
If you want a generator whose samples have exactly the requested mean and std, you'll need to fix them manually:
def normal_gen(m, s, shape=(2,2)):
    # Draw a sample, then shift and rescale it so its mean and std are exactly m and s
    b = np.random.normal(0, s, shape)
    b = (b - np.mean(b)) * (s / np.std(b)) + m
    return b
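With that helper, the check from the question passes, because the sample is explicitly recentered (usage sketch):
b = normal_gen(0.0, 1.0)
print(abs(0 - np.mean(b)) < 0.01)   # True: the mean is forced to (numerically) zero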
Sampling from a normal distribution does not guarantee that the mean of your sample is the same as the mean of the normal distribution. If you take an infinite number of samples, it should have the same mean (by the law of large numbers), but obviously you can't really take an infinite number of samples.
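You can watch this effect shrink as the sample grows; for an n-by-n matrix of unit normals, the standard error of the sample mean is 1/n, so larger matrices land much closer to zero (illustrative sketch):
import numpy as np

for n in (2, 20, 200):
    b = np.random.normal(0.0, 1.0, (n, n))
    print(n, abs(np.mean(b)))   # deviation from 0 shrinks roughly like 1/n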
I tried to use numpy.random.multivariate_normal to draw random samples over some 30,000+ variables, but it always consumed all of my memory (32 GB) and then terminated. Actually, the correlation is spherical and every variable is correlated with only about 2,500 other variables. Is there another way to specify the spherical covariance matrix, rather than the full covariance matrix, or any other way to reduce the memory usage?
My code is like this:
cm = []  # covariance matrix
for i in range(width*height):
    cm.append([])
    for j in range(width*height):
        cm[i].append(corr_calc())  # corr is inversely proportional to the distance
mean = [vth]*(width*height)
cache_vth=numpy.random.multivariate_normal(mean,cm)
If your correlation is spherical, that is the same as saying that the value along each dimension is uncorrelated with the other dimensions, and that the variance along every dimension is the same. You don't need to build the covariance matrix at all: drawing one sample from your 30,000-D multivariate normal is the same as drawing 30,000 samples from a 1-D normal. That is, instead of doing:
n = 30000
mu= 0
corr = 1
cm = np.eye(n) * corr
mean = np.ones((n,)) * mu
np.random.multivariate_normal(mean, cm)
which fails when trying to build the cm array, try the following:
>>> n = 30000
>>> mu = 0
>>> corr = 1
>>> np.random.normal(mu, corr, size=n)
array([ 0.88433649, -0.55460098, -0.74259886, ..., 0.66459841,
0.71225572, 1.04012445])
If you want more than one random sample, say 3, try
>>> np.random.normal(mu, corr, size=(3, n))
array([[-0.97458499, 0.05072532, -0.0759601 , ..., -0.31849315,
-2.17552787, -0.36884723],
[ 1.5116701 , 2.53383547, 1.99921923, ..., -1.2769304 ,
0.36912488, 0.3024549 ],
[-1.12615267, 0.78125589, 0.67133243, ..., -0.45441239,
-1.21083007, 1.45696714]])
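If the covariance is diagonal but the variances differ across dimensions, you can still avoid the full n-by-n matrix by scaling a standard normal per dimension. A sketch with hypothetical per-dimension standard deviations, not part of the original answer:
n = 30000
mu = np.zeros(n)
sigma = np.linspace(0.5, 2.0, n)   # hypothetical per-dimension std devs
samples = mu + sigma * np.random.standard_normal((3, n))   # 3 draws, no n x n covariance
print(samples.shape)   # (3, 30000)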