I can generate Gaussian data with the random.gauss(mu, sigma) function, but how can I generate 2D Gaussian data? Is there a function like that?
If you can use numpy, there is numpy.random.multivariate_normal(mean, cov[, size]).
For example, to get 10,000 2D samples:
np.random.multivariate_normal(mean, cov, 10000)
where mean.shape==(2,) and cov.shape==(2,2).
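For instance, a minimal runnable sketch (the mean and covariance values here are arbitrary, purely for illustration):

import numpy as np

mean = np.array([0.0, 0.0])   # centre of the distribution
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])  # any positive semi-definite 2x2 matrix
samples = np.random.multivariate_normal(mean, cov, 10000)
print(samples.shape)          # (10000, 2)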
I'd like to add an approach using exponential functions. This directly generates a 2D array containing a movable, symmetric 2D Gaussian.
I should note that I found this code on the scipy mailing list archives and modified it a little.
import numpy as np

def makeGaussian(size, fwhm=3, center=None):
    """Make a square gaussian kernel.

    size is the length of a side of the square.
    fwhm is full-width-half-maximum, which
    can be thought of as an effective radius.
    """
    x = np.arange(0, size, 1, float)
    y = x[:, np.newaxis]
    if center is None:
        x0 = y0 = size // 2
    else:
        x0 = center[0]
        y0 = center[1]
    return np.exp(-4*np.log(2) * ((x-x0)**2 + (y-y0)**2) / fwhm**2)
For reference and enhancements, it is hosted as a gist here. Pull requests welcome!
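As a quick usage sketch (the plotting part is my own addition, not from the gist):

import matplotlib.pyplot as plt

g = makeGaussian(20, fwhm=5, center=(7, 12))  # 20x20 array, peak at column 7, row 12
plt.imshow(g, interpolation='nearest')
plt.colorbar()
plt.show()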
Since the standard 2D Gaussian distribution is just the product of two 1D Gaussian distributions, if there is no correlation between the two axes (i.e. the covariance matrix is diagonal), just call random.gauss twice.
import random

def gauss_2d(mu, sigma):
    x = random.gauss(mu, sigma)
    y = random.gauss(mu, sigma)
    return (x, y)
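For example, drawing 1000 independent points with the function above (my own usage sketch):

points = [gauss_2d(0, 1) for _ in range(1000)]  # 1000 (x, y) pairs, uncorrelated axes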
import numpy as np

# define normalized 2D gaussian
def gaus2d(x=0, y=0, mx=0, my=0, sx=1, sy=1):
    return 1. / (2. * np.pi * sx * sy) * np.exp(-((x - mx)**2. / (2. * sx**2.) + (y - my)**2. / (2. * sy**2.)))
x = np.linspace(-5, 5)
y = np.linspace(-5, 5)
x, y = np.meshgrid(x, y) # get 2D variables instead of 1D
z = gaus2d(x, y)
Straightforward implementation and example of the 2D Gaussian function. Here sx and sy are the spreads in the x and y directions, and mx and my are the center coordinates.
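To visualize the result, a minimal sketch (my own addition, assuming matplotlib is available):

import matplotlib.pyplot as plt

plt.contourf(x, y, z)  # filled contour plot of the normalized Gaussian computed above
plt.colorbar()
plt.show()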
Numpy has a function to do this. It is documented here. In addition to the method proposed above, it allows drawing samples with arbitrary covariance.
Here is a small example, assuming ipython -pylab is started:
samples = multivariate_normal([-0.5, -0.5], [[1, 0],[0, 1]], 1000)
plot(samples[:, 0], samples[:, 1], '.')
samples = multivariate_normal([0.5, 0.5], [[0.6, 0.5],[0.5, 0.6]], 1000)  # covariance must be positive semi-definite
plot(samples[:, 0], samples[:, 1], '.')
In case someone finds this thread and is looking for something a little more versatile (like I did), I have modified the code from #giessel. The code below will allow for asymmetry and rotation.
import numpy as np

def makeGaussian2(x_center=0, y_center=0, theta=0, sigma_x=10, sigma_y=10, x_size=640, y_size=480):
    # x_center and y_center will be the center of the gaussian, theta will be the rotation angle
    # sigma_x and sigma_y will be the stdevs in the x and y axis before rotation
    # x_size and y_size give the size of the frame

    theta = 2*np.pi*theta/360  # degrees to radians

    x = np.arange(0, x_size, 1, float)
    y = np.arange(0, y_size, 1, float)
    y = y[:, np.newaxis]

    sx = sigma_x
    sy = sigma_y
    x0 = x_center
    y0 = y_center

    # rotation
    a = np.cos(theta)*x - np.sin(theta)*y
    b = np.sin(theta)*x + np.cos(theta)*y
    a0 = np.cos(theta)*x0 - np.sin(theta)*y0
    b0 = np.sin(theta)*x0 + np.cos(theta)*y0

    return np.exp(-(((a-a0)**2)/(2*(sx**2)) + ((b-b0)**2)/(2*(sy**2))))
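A quick usage sketch (my own addition) showing a rotated, elongated Gaussian:

import matplotlib.pyplot as plt

frame = makeGaussian2(x_center=320, y_center=240, theta=30, sigma_x=40, sigma_y=15)
plt.imshow(frame)  # 480x640 frame, elliptical blob centred at (320, 240), rotated 30 degrees
plt.show()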
We can try just using the numpy method np.random.normal to generate a 2D gaussian distribution.
The sample code is np.random.normal(mean, sigma, (num_samples, 2)).
A sample run with mean = 0 and sigma = 20 is shown below:
np.random.normal(0, 20, (10,2))
>>array([[ 11.62158316, 3.30702215],
[-18.49936277, -11.23592946],
[ -7.54555371, 14.42238838],
[-14.61531423, -9.2881661 ],
[-30.36890026, -6.2562164 ],
[-27.77763286, -23.56723819],
[-18.18876597, 41.83504042],
[-23.62068377, 21.10615509],
[ 15.48830184, -15.42140269],
[ 19.91510876, 26.88563983]])
Hence we got 10 samples in a 2D array with mean = 0 and sigma = 20.
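Since loc and scale broadcast against the output shape, the same call can also draw a different sigma per axis; a small sketch (my own addition):

import numpy as np

# x drawn with sigma 20, y with sigma 5; the two columns are independent
samples = np.random.normal([0, 0], [20, 5], (10, 2))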
I have several points on the unit sphere that are distributed according to the algorithm described in https://www.cmu.edu/biolphys/deserno/pdf/sphere_equi.pdf (and implemented in the code below). On each of these points, I have a value that in my particular case represents 1 minus a small error. The errors are in [0, 0.1] if this is important, so my values are in [0.9, 1].
Sadly, computing the errors is a costly process and I cannot do this for as many points as I want. Still, I want my plots to look like I am plotting something "continuous".
So I want to fit an interpolation function to my data, to be able to sample as many points as I want.
After a little bit of research I found scipy.interpolate.SmoothSphereBivariateSpline which seems to do exactly what I want. But I cannot make it work properly.
Question: what can I use to interpolate (spline, linear interpolation, anything would be fine for the moment) my data on the unit sphere? An answer can be either "you misused scipy.interpolation, here is the correct way to do this" or "this other function is better suited to your problem".
Sample code that should be executable with numpy and scipy installed:
import typing as ty
import numpy
import scipy.interpolate
def get_equidistant_points(N: int) -> ty.List[numpy.ndarray]:
    """Generate approximately N points evenly distributed across the 3-d sphere.

    This function tries to find approximately N points (might be a little less
    or more) that are evenly distributed across the 3-dimensional unit sphere.
    The algorithm used is described in
    https://www.cmu.edu/biolphys/deserno/pdf/sphere_equi.pdf.
    """
    # Unit sphere
    r = 1
    points: ty.List[numpy.ndarray] = list()
    a = 4 * numpy.pi * r ** 2 / N
    d = numpy.sqrt(a)
    m_v = int(numpy.round(numpy.pi / d))
    d_v = numpy.pi / m_v
    d_phi = a / d_v
    for m in range(m_v):
        v = numpy.pi * (m + 0.5) / m_v
        m_phi = int(numpy.round(2 * numpy.pi * numpy.sin(v) / d_phi))
        for n in range(m_phi):
            phi = 2 * numpy.pi * n / m_phi
            points.append(
                numpy.array(
                    [
                        numpy.sin(v) * numpy.cos(phi),
                        numpy.sin(v) * numpy.sin(phi),
                        numpy.cos(v),
                    ]
                )
            )
    return points
def cartesian2spherical(x: float, y: float, z: float) -> numpy.ndarray:
    r = numpy.linalg.norm([x, y, z])
    theta = numpy.arccos(z / r)
    phi = numpy.arctan2(y, x)
    return numpy.array([r, theta, phi])
n = 100
points = get_equidistant_points(n)
# Random here, but costly in real life.
errors = numpy.random.rand(len(points)) / 10
# Change everything to spherical to use the interpolator from scipy.
ideal_spherical_points = numpy.array([cartesian2spherical(*point) for point in points])
r_interp = 1 - errors
theta_interp = ideal_spherical_points[:, 1]
phi_interp = ideal_spherical_points[:, 2]
# Change phi coordinate from [-pi, pi] to [0, 2pi] to please scipy.
phi_interp[phi_interp < 0] += 2 * numpy.pi
# Create the interpolator.
interpolator = scipy.interpolate.SmoothSphereBivariateSpline(
    theta_interp, phi_interp, r_interp
)
# Creating the finer theta and phi values for the final plot
theta = numpy.linspace(0, numpy.pi, 100, endpoint=True)
phi = numpy.linspace(0, numpy.pi * 2, 100, endpoint=True)
# Creating the coordinate grid for the unit sphere.
X = numpy.outer(numpy.sin(theta), numpy.cos(phi))
Y = numpy.outer(numpy.sin(theta), numpy.sin(phi))
Z = numpy.outer(numpy.cos(theta), numpy.ones(100))
thetas, phis = numpy.meshgrid(theta, phi)
heatmap = interpolator(thetas, phis)
Issue with the code above:
With the code as-is, I have a
ValueError: The required storage space exceeds the available storage space: nxest or nyest too small, or s too small. The weighted least-squares spline corresponds to the current set of knots.
that is raised when initialising the interpolator instance.
The error above seems to say that I should change the value of s, which is one of the parameters of scipy.interpolate.SmoothSphereBivariateSpline. I tested different values of s ranging from 0.0001 to 100000; the code above always raises either the exception described above or:
ValueError: Error code returned by bispev: 10
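(For reference, s is passed as a keyword argument, and the scipy spline docs suggest a smoothing factor comparable to the number of data points as a starting value. A sketch of one such attempt, with no guarantee it avoids the errors above:)

# m = number of data points; s near m is a common starting guess
m = len(theta_interp)
interpolator = scipy.interpolate.SmoothSphereBivariateSpline(
    theta_interp, phi_interp, r_interp, s=m
)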
Edit: I am including my findings here. They can't really be considered a solution, which is why I am editing rather than posting an answer.
With more research I found this question: Using Radial Basis Functions to Interpolate a Function on a Sphere. The author has exactly the same problem as me and uses a different interpolator: scipy.interpolate.Rbf. I changed the above code by replacing the interpolator and plotting:
# Create the interpolator.
interpolator = scipy.interpolate.Rbf(theta_interp, phi_interp, r_interp)
# Creating the finer theta and phi values for the final plot
plot_points = 100
theta = numpy.linspace(0, numpy.pi, plot_points, endpoint=True)
phi = numpy.linspace(0, numpy.pi * 2, plot_points, endpoint=True)
# Creating the coordinate grid for the unit sphere.
X = numpy.outer(numpy.sin(theta), numpy.cos(phi))
Y = numpy.outer(numpy.sin(theta), numpy.sin(phi))
Z = numpy.outer(numpy.cos(theta), numpy.ones(plot_points))
thetas, phis = numpy.meshgrid(theta, phi)
heatmap = interpolator(thetas, phis)
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm
colormap = cm.inferno
normaliser = mpl.colors.Normalize(vmin=numpy.min(heatmap), vmax=1)
scalar_mappable = cm.ScalarMappable(cmap=colormap, norm=normaliser)
scalar_mappable.set_array([])
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(
    X,
    Y,
    Z,
    facecolors=colormap(normaliser(heatmap)),
    alpha=0.7,
    cmap=colormap,
)
plt.colorbar(scalar_mappable)
plt.show()
This code runs smoothly and gives the following result:
The interpolation seems OK except along one line where it is discontinuous, just like in the question that led me to this class. One of the answers gives the idea of using a different distance, better adapted to spherical coordinates: the Haversine distance.
def haversine(x1, x2):
    theta1, phi1 = x1
    theta2, phi2 = x2
    return 2 * numpy.arcsin(
        numpy.sqrt(
            numpy.sin((theta2 - theta1) / 2) ** 2
            + numpy.cos(theta1) * numpy.cos(theta2) * numpy.sin((phi2 - phi1) / 2) ** 2
        )
    )
# Create the interpolator.
interpolator = scipy.interpolate.Rbf(theta_interp, phi_interp, r_interp, norm=haversine)
which, when executed, gives a warning:
LinAlgWarning: Ill-conditioned matrix (rcond=1.33262e-19): result may not be accurate.
self.nodes = linalg.solve(self.A, self.di)
and a result that is not at all the one expected: the interpolated function takes values as low as -1, which is clearly wrong.
You can use Cartesian coordinates instead of spherical coordinates.
The default norm parameter ('euclidean') used by Rbf is sufficient:
# interpolation
x, y, z = numpy.array(points).T
interpolator = scipy.interpolate.Rbf(x, y, z, r_interp)
# predict
heatmap = interpolator(X, Y, Z)
Here is the result:
ax.plot_surface(
    X, Y, Z,
    rstride=1, cstride=1,
    # or rcount=50, ccount=50,
    facecolors=colormap(normaliser(heatmap)),
    cmap=colormap,
    alpha=0.7, shade=False
)
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
ax.set_zlabel('z axis')
You can also use a cosine distance if you want (norm parameter):
import scipy.spatial.distance

def cosine(XA, XB):
    if XA.ndim == 1:
        XA = numpy.expand_dims(XA, axis=0)
    if XB.ndim == 1:
        XB = numpy.expand_dims(XB, axis=0)
    return scipy.spatial.distance.cosine(XA, XB)
In order to better see the differences, I stacked the two images, subtracted them and inverted the layer.
Suppose I have a 2D Gaussian with pdf

$$p(x) = \frac{1}{2\pi\sqrt{|\Sigma|}}\exp\left(-\tfrac{1}{2}x^\top\Sigma^{-1}x\right).$$

I want to draw an ellipse corresponding to the level-set (contour)

$$\{x : x^\top\Sigma^{-1}x = \gamma\}.$$

Following here I know that I can replace the precision matrix with its eigendecomposition $\Sigma^{-1} = P\Lambda P^\top$ to obtain

$$(P^\top x)^\top\Lambda(P^\top x) = \gamma,$$

where gamma is

$$\gamma = \log\frac{1}{4\pi^2|\Sigma|c^2},$$

with $c$ the density value of the contour. Then to find coordinates of the points on the ellipse I would have to do

$$u = \sqrt{\gamma/\lambda_1}\cos\theta,\qquad v = \sqrt{\gamma/\lambda_2}\sin\theta,\qquad \theta\in[0,2\pi).$$

I tried plotting this but it is not working.
Plotting the Contours
from scipy.stats import multivariate_normal
import numpy as np
from numpy.linalg import eigh
import math
import matplotlib.pyplot as plt
# Target distribution
sx2 = 1.0
sy2 = 2.0
rho = 0.6
Sigma = np.array([[sx2, rho*math.sqrt(sx2)*math.sqrt(sy2)], [rho*math.sqrt(sx2)*math.sqrt(sy2), sy2]])
target = multivariate_normal(mean=np.zeros(2), cov=Sigma)
# Two different contours
xy = target.rvs()
xy2 = target.rvs()
# Values where to plot the density
x, y = np.mgrid[-2:2:0.1, -2:2:0.1]
zz = target.pdf(np.dstack((x, y)))
fig, ax = plt.subplots()
ax.contour(x,y, zz, levels=np.sort([target.pdf(xy), target.pdf(xy2)]))
ax.set_aspect("equal")
plt.show()
The code above shows the contour
Plotting the Ellipse
# Find gamma and perform eigendecomposition
gamma = math.log(1 / (4*(np.pi**2)*sx2*sy2*(1 - rho**2)*(target.pdf(xy)**2)))
eigenvalues, P = eigh(np.linalg.inv(Sigma))
# Compute u and v as per link using thetas from 0 to 2pi
thetas = np.linspace(0, 2*np.pi, 100)
uv = (gamma / np.sqrt(eigenvalues)) * np.hstack((np.cos(thetas).reshape(-1,1), np.sin(thetas).reshape(-1, 1)))
# Plot
plt.scatter(uv[:, 0], uv[:, 1])
However this clearly doesn't work.
You should square sx2 and sy2 in gamma.
gamma should be square-rooted.
Multiply the resulting ellipse points by P^-1 to get them back in the original coordinate system; that's mentioned in the linked post. I don't actually know how to code this, or whether it works, so I leave the coding to you.
gamma = math.log(1 / (4*(np.pi**2)*(sx2**2)*(sy2**2)*(1 - rho**2)*(target.pdf(xy)**2)))
eigenvalues, P = eigh(np.linalg.inv(Sigma))
# Compute u and v as per link using thetas from 0 to 2pi
thetas = np.linspace(0, 2*np.pi, 100)
uv = (np.sqrt(gamma) / np.sqrt(eigenvalues)) * np.hstack((np.cos(thetas).reshape(-1,1), np.sin(thetas).reshape(-1, 1)))
orig_coord = np.linalg.inv(P) * uv  # I don't know how to code this in Python
plt.scatter(orig_coord[:,0], orig_coord[:,1])
plt.show()
My attempt at coding it:
gamma = math.log(1 / (4*(np.pi**2)*(sx2**2)*(sy2**2)*(1 - rho**2)*(target.pdf(xy)**2)))
eigenvalues, P = eigh(np.linalg.inv(Sigma))
# Compute u and v as per link using thetas from 0 to 2pi
thetas = np.linspace(0, 2*np.pi, 100)
uv = (np.sqrt(gamma) / np.sqrt(eigenvalues)) * np.hstack((np.cos(thetas).reshape(-1,1), np.sin(thetas).reshape(-1, 1)))
orig_coord = np.zeros((100, 2))
for i in range(len(uv)):
    orig_coord[i, 0] = np.matmul(np.linalg.inv(P), uv[i, :])[0]
    orig_coord[i, 1] = np.matmul(np.linalg.inv(P), uv[i, :])[1]
# Plot
plt.scatter(orig_coord[:, 0], orig_coord[:, 1])
gamma1 = math.log(1 / (4*(np.pi**2)*(sx2**2)*(sy2**2)*(1 - rho**2)*(target.pdf(xy2)**2)))
uv1 = (np.sqrt(gamma1) / np.sqrt(eigenvalues)) * np.hstack((np.cos(thetas).reshape(-1,1), np.sin(thetas).reshape(-1, 1)))
orig_coord1 = np.zeros((100, 2))
for i in range(len(uv)):
    orig_coord1[i, 0] = np.matmul(np.linalg.inv(P), uv1[i, :])[0]
    orig_coord1[i, 1] = np.matmul(np.linalg.inv(P), uv1[i, :])[1]
plt.scatter(orig_coord1[:, 0], orig_coord1[:, 1])
plt.axis([-2,2,-2,2])
plt.show()
Sometimes the plots don't work and you get a warning about an invalid value encountered in sqrt (gamma comes out negative when the sampled density is high enough), but when it works it looks fine.
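As a side note on the coding question left open above: eigh returns an orthogonal matrix, so np.linalg.inv(P) equals P.T, and the per-row loops can be collapsed into one matrix product. A sketch of the equivalent vectorized step (my own addition, computing exactly what the loops compute):

# inv(P) @ uv[i] for every row i, done in a single product
orig_coord = uv @ np.linalg.inv(P).T  # equivalently uv @ P, since P is orthogonal
plt.scatter(orig_coord[:, 0], orig_coord[:, 1])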
Good day to you, fellow programmer!
Today I would like to do something that I believe is tricky. I have a very large 2D array called tac that basically contains time curve values and a file containing a tuple of coordinates called coor which contains information on where to place these curves in a 3D array. What this set of variables represents is actually a 4D array: the first 3 dimensions represent space dimensions and the fourth is time. The whole thing is stored as is to avoid storing an immense amount of zeros.
I would like to apply, for each time (in other words, each value in the 4th dimension), a gaussian kernel to this set of data. I was able to generate this kernel and perform the convolution quite easily for a fixed standard deviation for the whole array using scipy.ndimage.convolve. The kernel was created using scipy.signal.gaussian. Here is a brief example of the principle, where tac_4d contains the 4D array (it stores a lot of data, I know... but one problem at a time):
def gaussian_kernel_3d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
    kernel_3d = np.expand_dims(kernel_3d, -1)
    return kernel_3d
g = gaussian_kernel_3d(1, .5)
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0)
The trick is now to convolve the array with a kernel whose standard deviation is different for each SPACE coordinate. In other words, I would have a 3D array std containing a standard deviation for each coordinate of the array.
It seems https://github.com/sheliak/varconvolve is the code needed to take care of this problem. However, I don't really understand how to use it and, quite frankly, I would prefer to come up with a genuine solution. Do you guys see a way to solve this problem?
Thanks in advance!
EDIT
Here is what I hope can be considered an MCVE:
import numpy as np
from scipy import signal
from scipy import ndimage as nd
def gaussian_kernel_2d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    return kernel_2d

def gaussian_kernel_3d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
    kernel_3d = np.expand_dims(kernel_3d, -1)
    return kernel_3d
np.random.seed(0)
number_of_tac = 150
time_samples = 915
z, y, x = 100, 150, 100
voxel_number = x * y * z
# TACs in the right order
tac = np.random.uniform(0, 4, time_samples * number_of_tac).reshape(number_of_tac, time_samples)
arr = np.array([0] * (voxel_number - number_of_tac) + [1] * number_of_tac)
np.random.shuffle(arr)
arr = arr.reshape(z, y, x)
coor = np.where(arr != 0) # non-empty voxel
# Algorithm to replace TAC in 3D space
nnz = np.zeros(arr.shape)
nnz[coor] = 1
tac_4d = np.zeros((x, y, z, time_samples))
tac_4d[np.where(nnz == 1)] = tac
# 3D convolution for all time
# TODO: find a way to make standard deviation change for each voxel
g = gaussian_kernel_3d(1, 1) # 3D kernel of std = 1
v = np.random.uniform(0, 1, x * y * z).reshape(z, y, x) # 3D array of std
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0) # convolution
Essentially, you have a 4D dataset, shape (nx, ny, nz, nt) that is sparse in (nx, ny, nz) and dense in the nt axis. If (i, j, k) are coordinates of nonzero points in the sparse dimensions, you want to convolve with a Gaussian 3D kernel that has a sigma that depends on (i, j, k).
For example, if there are nonzero points at [1, 2, 5] and [1, 4, 5] with corresponding sigmas 0.1 and 1.0, then the output at coordinates [1, 3, 5] is affected mostly by the [1, 4, 5] point because that one has the largest point spread.
Your question is ambiguous; it could also mean that point [1, 3, 5] has its own associated sigma, for example 0.5, and pulls data from the two adjacent points with equal weight. I will assume the first definition (sigma values associated with input points, not with output points).
Because the operation is not a true convolution, there is no fast FFT-based method to do the entire thing in one pass. Instead, you have to loop over the sigma values. Fortunately, your example has only 150 nonzero points, so the loop is not too expensive.
Here is an implementation. I keep the data in sparse representation as long as possible.
import scipy.signal
import numpy as np
def kernel3d(mm, sigma):
    """Return (mm, mm, mm) shaped, normalized kernel."""
    g1 = scipy.signal.gaussian(mm, std=sigma)
    g3 = g1.reshape(mm, 1, 1) * g1.reshape(1, mm, 1) * g1.reshape(1, 1, mm)
    return g3 * (1/g3.sum())
np.random.seed(1)
s = 2 # scaling factor (original problem: s=10)
nx, ny, nz, nt, nnz = 10*s, 11*s, 12*s, 91*s, 15*s
# select nnz random voxels to fill with time series data
randint = np.random.randint
tseries = {} # key: (i, j, k) tuple; value: time series data, shape (nt,)
for _ in range(nnz):
    while True:
        ijk = (randint(nx), randint(ny), randint(nz))
        if ijk not in tseries:
            tseries[ijk] = np.random.uniform(0, 1, size=nt)
            break
ijks = np.array(list(tseries.keys())) # shape (nnz, 3)
# sigmas: key: (i, j, k) tuple; value: standard deviation
sigmas = { k: np.random.uniform(0, 2) for k in tseries.keys() }
# output will be stored as dense array, padded to avoid edge issues
# with convolution.
m = 5 # padding size
cag_4dp = np.zeros((nx+2*m, ny+2*m, nz+2*m, nt))
mm = 2*m + 1 # kernel width
for (i, j, k), tdata in tseries.items():
    kernel = kernel3d(mm, sigmas[(i, j, k)]).reshape(mm, mm, mm, 1)
    # convolution of one voxel by kernel is trivial;
    # slice4d_c has shape (mm, mm, mm, nt).
    slice4d_c = kernel * tdata
    cag_4dp[i:i+mm, j:j+mm, k:k+mm, :] += slice4d_c
cag_4d = cag_4dp[m:-m, m:-m, m:-m, :]
#%%
import matplotlib.pyplot as plt
plt.close('all')  # close any stray figures before creating the new one
fig, axs = plt.subplots(2, 2, tight_layout=True)
# find a few planes
#ks = np.where(np.any(cag_4d != 0, axis=(0, 1,3)))[0]
ks = ijks[:4, 2]
for ax, k in zip(axs.ravel(), ks):
    ax.imshow(cag_4d[:, :, k, nt//2].T)
    ax.set_title(f'Voxel [:, :, {k}] at time {nt//2}')
fig.show()
for ijk, sigma in sigmas.items():
    print(f'{ijk}: sigma={sigma:.2f}')
I have an array of data Y such that Y is a function of an independent variable X (another array).
The values in X vary from 0 to 360, with wraparound.
The values in Y vary from -180 to 180, also with wraparound.
(That is, these values are angles in degrees around a circle.)
Does anyone know of any function in Python (in numpy, scipy, etc.) capable of low-pass filtering my Y values as a function of X?
In case this is at all confusing, here's a plot of example data:
Say you start with
import numpy as np
x = np.linspace(0, 360, 360)
y = 5 * np.sin(x / 90. * 3.14) + np.random.randn(360)
plot(x, y, '+');
To perform a circular convolution, you can do the following:
yy = np.concatenate((y, y))
smoothed = np.convolve(np.ones(5) / 5, yy)[5: len(x) + 5]
This takes, at each point, the cyclic average of the previous 5 points (inclusive); the division by 5 normalizes the boxcar kernel so the result is an average rather than a sum. Of course, there are other ways of doing so.
>>> plot(x, smoothed)
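One such alternative (my own addition): scipy.ndimage.uniform_filter1d handles the wraparound directly with mode='wrap', avoiding the manual doubling of the array:

from scipy.ndimage import uniform_filter1d

smoothed2 = uniform_filter1d(y, size=5, mode='wrap')  # cyclic moving average over 5 points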
Here is a solution using pandas to do a moving average. First unwrap the data (need to convert to radians and back), so there are no discontinuities (e.g., jump from 180 to -179). Then compute the moving average and finally convert back to wrapped data if desired. Also, check out this numpy cookbook recipe using np.convolve().
import numpy as np
import pandas as pd
# generate random data
X = pd.Series([(x + 5*np.random.random())%360 for x in range(-100, 600, 15)])
Y = pd.Series([(y + 5*np.random.random())%360 - 180 for y in range(-200, 500, 15)])
# 'unwrap' the angles so there is no wrap around
X1 = pd.Series(np.rad2deg(np.unwrap(np.deg2rad(X))))
Y1 = pd.Series(np.rad2deg(np.unwrap(np.deg2rad(Y))))
# smooth the data with a moving average
# note: pd.rolling_mean from pandas 0.17 was replaced by .rolling().mean() in 0.18+
X2 = X1.rolling(window=3).mean()
Y2 = Y1.rolling(window=3).mean()
# convert back to wrapped data if desired
X3 = X2 % 360
Y3 = (Y2 + 180)%360 - 180
You can use convolve2d from scipy.signal. Here is a function which applies smoothing to a numpy array a. If a has more than one dimension, smoothing is applied to the innermost (fastest) dimension.
import numpy as np
from scipy import signal
def cyclic_moving_av(a, n=3, win_type='boxcar'):
    window = signal.get_window(win_type, n, fftbins=False).reshape((1, n))
    shp_a = a.shape
    b = signal.convolve2d(a.reshape((np.prod(shp_a[:-1], dtype=int), shp_a[-1])),
                          window, boundary='wrap', mode='same')
    return (b / np.sum(window)).reshape(shp_a)
For instance it can be used like
import matplotlib.pyplot as plt
x = np.linspace(0, 360, 360)
y1 = 5 * np.sin(x / 90. * 3.14) + 0.5 * np.random.randn(360)
y2 = 5 * np.cos(0.8 * x / 90. * 3.14) + 0.5 * np.random.randn(360)
y_av = cyclic_moving_av(np.stack((y1, y2)), n=10)  #1
plt.plot(x, y1, '+')
plt.plot(x, y2, '+')
plt.plot(x, y_av[0])
plt.plot(x, y_av[1])
plt.show()
This results in
Line #1 is equivalent to
y_av[0] = cyclic_moving_av(y1, n=10)
y_av[1] = cyclic_moving_av(y2, n=10)
win_type='boxcar' results in averaging over neighbors with equal weights. See signal.get_window for other options.
I have two corresponding 2D arrays, one of velocity, one of intensity. The values of intensity match each of the velocity elements.
I have created another 1D array that goes from min to max velocity in even bin widths.
How would I sum the intensity values from my 2D array which correspond to the velocity bins in my 1D array?
For example: if I have I = 5 corresponding to velocity = 101km/s, then this is added to the bin 100 - 105 km/s.
Here's my input:
rad = np.linspace(0, 3, 100) # polar coordinates
phi = np.linspace(0, np.pi, 100)
r, theta = np.meshgrid(rad, phi) # 2d arrays of r and theta coordinates
V0 = 225 # Velocity function w/ constants.
rpe = 0.149
alpha = 0.003
Vr = V0 * (1 - np.exp(-r / rpe)) * (1 + (alpha * np.abs(r) / rpe)) # returns 100x100 array of Velocities.
Vlos = Vr * np.cos(theta)  # line-of-sight velocity assuming the observer is in the plane of the polar disk
a = (r**2) # intensity as a function of radius
b = (r**2 / 0.23)
I = (3.* np.exp(-1. * a)) - (1.8 * np.exp(-1. * b))
I wish to first create velocity bins from Vmin to Vmax and then sum the intensities over each bin.
My desired out put would be something along the lines of
V_bins = [0, 5, 10,... Vlos.max()]
I_sum = [1.4, 1.1, 1.8, ... 1.2]
plot(V_bins, I_sum)
EDIT: I have come up with a temporary solution, but perhaps there is a more elegant/efficient method of achieving it?
The two arrays Vlos and I are both 100 by 100 matrices.
Vlos = array([[ 0., 8.9, 17.44, ..., 238.5],...,
[-0., -8.9, -17.44, ..., -238.5]])
I = np.random.random((100, 100))
V = np.arange(Vlos.min(), Vlos.max()+5, 5)
bins = np.zeros(len(V))
for i in range(0, len(V)-1):
    for j in range(0, len(Vlos)):         # horizontal coordinate in matrix
        for k in range(0, len(Vlos[0])):  # vertical coordinate
            if Vlos[j, k] >= V[i] and Vlos[j, k] < V[i+1]:
                bins[i] = bins[i] + I[j, k]
The result is plotted below.
The overall shape of the histogram is as expected; however, I don't understand the spike in the curve at V = 0. As far as I can tell this isn't in the data, which leads me to question my method.
Any further help would be appreciated.
import numpy as np

bins = np.arange(100, 120, 5)
velocities = np.array([101, 111, 102, 112])
intensities = np.array([1, 2, 3, 4])
h = np.histogram(velocities, bins, weights=intensities)
print(h)
Output:
(array([4, 0, 6]), array([100, 105, 110, 115]))
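Applied to the arrays from the question (a sketch using the Vlos, I and V names from the edit; np.histogram expects 1-D input, hence the ravel):

I_sum, edges = np.histogram(Vlos.ravel(), bins=V, weights=I.ravel())
V_centers = 0.5 * (edges[:-1] + edges[1:])  # bin centers for plotting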