How to use np.gradient on multidimensional data?

There exist a lot of questions on this, but I can't find one that describes this particular use case.
I have a matrix P, which gives a scalar value in a 3D field of size (dx,dy,dz)=(360, 720, 30) over 41 timesteps. That is,
>>> np.shape(P)
(41, 30, 360, 720)
As is seen, the z-index (we'll call it the "vertical") is the second dimension.
I want to calculate dP/dz for this field.
However, the spacing in z is not uniform: it varies along all three spatial dimensions and also with time (simply imagine that each grid point is allowed to float around at every timestep). That is, there is an associated matrix Z3 giving the vertical coordinate of each grid point in the 3D space, where
>>> np.shape(Z3)
(41, 30, 360, 720)
How do I then use np.gradient() to obtain P'(t, x, y), where P'=dP/dz?
When I try differentiating along axis 1, I get:
>>> np.gradient(P, Z3, axis=1)
*** ValueError: distances must be either scalars or 1d
This error is very opaque, since it contradicts the documentation, which describes the ability to pass N dimensional spacings. However, this does work:
>>> np.gradient(P[0,:,0,0], Z3[0,:,0,0])
This essentially gives me dP/dz in one "vertical column" of the field, i.e. the derivative of P at the origin at time=0, P'(0,0,0). Consider the horrifically slow code:
dpdz = np.zeros(np.shape(P))
for i in range(np.shape(P)[0]):
    for j in range(np.shape(P)[2]):
        for k in range(np.shape(P)[3]):
            dpdz[i,:,j,k] = np.gradient(P[i,:,j,k], Z3[i,:,j,k])
This is exactly the result that I would expect np.gradient(P, Z3, axis=1) to give. Is there a way to make this work?
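As the error message says, np.gradient only accepts a scalar or a 1-D coordinate array per axis, so a full 4-D coordinate array such as Z3 cannot be passed directly. One possible workaround is to apply the same finite-difference formulas that np.gradient uses for non-uniform 1-D spacing, but with array slices along the chosen axis. The helper name gradient_along_axis below is hypothetical, and the sketch assumes the default edge_order=1 behaviour (one-sided differences at the two ends).

```python
import numpy as np

def gradient_along_axis(f, x, axis):
    """Derivative df/dx along `axis`, where x has the same shape as f.

    Hypothetical helper: second-order central differences for non-uniform
    spacing in the interior, first-order one-sided differences at the ends.
    """
    # Bring the differentiation axis to the front so slicing is simple.
    f = np.moveaxis(np.asarray(f, dtype=float), axis, 0)
    x = np.moveaxis(np.asarray(x, dtype=float), axis, 0)

    dx = np.diff(x, axis=0)      # spacing between neighbouring levels
    hs, hd = dx[:-1], dx[1:]     # spacing below / above each interior point

    out = np.empty_like(f)
    out[1:-1] = (-hd / (hs * (hd + hs)) * f[:-2]
                 + (hd - hs) / (hd * hs) * f[1:-1]
                 + hs / (hd * (hd + hs)) * f[2:])
    out[0] = (f[1] - f[0]) / dx[0]
    out[-1] = (f[-1] - f[-2]) / dx[-1]
    return np.moveaxis(out, 0, axis)
```

With the arrays from the question this would be called as dpdz = gradient_along_axis(P, Z3, axis=1). It needs at least two points along the axis and assumes no repeated coordinates in any column.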

Related

SavGol Filter gives same length error on 3-D array

Hi, I'm trying to apply a Savitzky-Golay filter to a 3D array of data that I have (magnetic field data with xyz coordinates). When I run my program, I get the error: TypeError: expected x and y to have same length. My array is 460798 units long, with each unit being a list of coordinates [x y z]. I think it has something to do with the window size parameter. When I set it to three, it works fine, but my data points aren't smoothed. With anything higher than three, it does not work.
I am trying to get the function to smooth the 3-D array.
mag = cdf['Mag'][start_ind:stop_ind]  # mag is a 3-D array with coordinate element [x y z]
mag_smoothed = signal.savgol_filter(x=mag, window_length=5, polyorder=2)
print mag_smoothed[1]
I'm supposed to get a smoothed 3-D array back, I believe.
  File "/Users/sosa/research/Python Files/MagnometerPlot.py", line 33, in plot
    mag_smoothed = signal.savgol_filter(x=mag, window_length=7, polyorder=2,axis=1)
  File "/Users/sosa/anaconda/lib/python2.7/site-packages/scipy/signal/_savitzky_golay.py", line 339, in savgol_filter
    _fit_edges_polyfit(x, window_length, polyorder, deriv, delta, axis, y)
  File "/Users/sosa/anaconda/lib/python2.7/site-packages/scipy/signal/_savitzky_golay.py", line 217, in _fit_edges_polyfit
    polyorder, deriv, delta, y)
  File "/Users/sosa/anaconda/lib/python2.7/site-packages/scipy/signal/_savitzky_golay.py", line 187, in _fit_edge
    xx_edge, polyorder)
  File "/Users/sosa/anaconda/lib/python2.7/site-packages/numpy/lib/polynomial.py", line 559, in polyfit
    raise TypeError("expected x and y to have same length")
TypeError: expected x and y to have same length

do you think if I separate the x,y,z components of the mag list and apply the filter separately to each component, would the filter be replicated?
I think it could be a reasonable approximation, but that's highly subjective and depends on what you're planning to do with your data. If you're trying to do precision measurements this might not be the best way to process your data.
Since I'm not sure whether you're working with volume data or surface data (with z being the magnitude at x, y), I'll use a 3D surface as an example. (Let's say it's a 2D array of magnitudes, arr1.)
What we want to do: Smooth the surface with SG.
What we can do with scipy's SG-Filter: Smooth a 1D line.
But a surface is just a set of lines side by side, so to work around it we might do the following:
1) Smooth every row in arr1 (axis = 0). We put all the smoothed rows into a new array, arr2
2) Now we do the same with every column in arr2 (axis=1) and generate arr3, which is, nominally, the "2D-smoothed" surface.
But it isn't, not quite. For a given data point, the 1D filter calculates a new value by taking into account the point itself and several adjacent values. But in a 2D set, that data point has more adjacent values which the 1D filter doesn't see, because those values are in the wrong row (or column). It would probably arrive at a different value if it could.
The easiest way to convince yourself that the step-wise smoothing isn't perfect is to do it twice, but the second time you reverse the order. First you work along columns, then rows. In a perfect world the final results should agree, whether you started with rows or columns. As it is you'll probably find they're slightly different.
If your data is quite uniform without many 'jagged' peaks or jumps (e.g. noise) you probably wouldn't have any problems. Otherwise you may see more significant differences between the two results.
A quick google search did show up various discussions about 2D-Savitzky-Golay filters, so investigating that might be worthwhile for you.
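The check described above (smooth along one axis and then the other, then repeat in the opposite order and compare) can be sketched as follows. The surface arr1 here is synthetic stand-in data, and smooth_2d is a hypothetical helper name; the script only prints the largest absolute difference between the two orderings so you can judge it for your own data.

```python
import numpy as np
from scipy import signal

# Synthetic noisy surface standing in for arr1 (an assumption, not real data).
rng = np.random.default_rng(1)
yy, xx = np.mgrid[0:40, 0:50]
arr1 = np.sin(xx / 8.0) * np.cos(yy / 6.0) + 0.1 * rng.normal(size=xx.shape)

def smooth_2d(arr, first_axis, window_length=7, polyorder=2):
    """Apply the 1D Savitzky-Golay filter along one axis, then the other."""
    second_axis = 1 - first_axis
    tmp = signal.savgol_filter(arr, window_length, polyorder, axis=first_axis)
    return signal.savgol_filter(tmp, window_length, polyorder, axis=second_axis)

axis0_first = smooth_2d(arr1, first_axis=0)
axis1_first = smooth_2d(arr1, first_axis=1)

max_diff = np.max(np.abs(axis0_first - axis1_first))
print("largest difference between the two orderings:", max_diff)
```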

If your data are organized in columns, you have to use
mag_smoothed = signal.savgol_filter(x=mag, window_length=5, polyorder=2, axis=0)
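A minimal sketch of why axis=0 matters, using made-up data in place of the real magnetometer array: savgol_filter defaults to axis=-1, which for an array of shape (N, 3) means filtering across the three components of each sample rather than along the N samples. Passing axis=0 filters each column along the sample direction, which gives the same result as filtering the x, y and z components separately.

```python
import numpy as np
from scipy import signal

# Made-up stand-in for the magnetometer data: N samples of [x, y, z].
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 500)
mag = np.column_stack([np.sin(t), np.cos(t), 0.1 * t])
mag = mag + 0.05 * rng.normal(size=mag.shape)

# Filter along the sample axis (axis 0), all three components at once.
mag_smoothed = signal.savgol_filter(mag, window_length=7, polyorder=2, axis=0)

# Filtering each component on its own gives the same numbers.
per_component = np.column_stack(
    [signal.savgol_filter(mag[:, i], window_length=7, polyorder=2)
     for i in range(3)]
)
print(np.allclose(mag_smoothed, per_component))
```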

Move a vertex along a plane, given the plane normal

I have a 3D vector and a 3D face normal. How do I go about moving this vector along the face defined by the given normal using Python (with or without numpy)?
Ideally, I'd build a matrix using the face normal with the x and y and multiply it by the original vector or something like that, but I can't get my head around on how to build it. It's been a while since Linear Algebra.
EDIT:
Thanks for pointing out that my question was too broad.
My goal is to get a new point, that is x and y units away from the original point, along the face defined by its normal.
Example: If the point is (0,0,0) and the normal is (0, 0, 1), the result would be (x, y, 0).
Example 2: If the point is (1, 0, 0) and the normal is (0, 1, 0), the result would be (1+x, 0, y).
I'd need to extrapolate that to work with any point, normal, x and y.

The projection of a vector to a plane defined by its normal is:
def projection(vector, normal):
    return vector - vector.dot(normal) * normal
Presumably this means you want something like:
x + projection(y, normal)
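One way to turn "x and y units along the face" into code is to build two perpendicular unit vectors lying in the plane and step along them. This is only a sketch: the function name move_in_plane is hypothetical, and the orientation of the two in-plane axes is an arbitrary convention chosen here (the normal alone does not determine it), so it will not necessarily match every expected axis assignment.

```python
import numpy as np

def move_in_plane(point, normal, x, y):
    """Move `point` by x and y units along two axes lying in the plane.

    Hypothetical helper. The in-plane axes (u, v) are derived from the
    normal using an arbitrary reference direction.
    """
    point = np.asarray(point, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)

    # Reference direction that is not (nearly) parallel to the normal.
    ref = np.array([1.0, 0.0, 0.0])
    if abs(n @ ref) > 0.9:
        ref = np.array([0.0, 1.0, 0.0])

    # u: the reference direction projected into the plane, normalised.
    u = ref - (ref @ n) * n
    u = u / np.linalg.norm(u)
    # v: perpendicular to both the normal and u, so it also lies in the plane.
    v = np.cross(n, u)

    return point + x * u + y * v

print(move_in_plane((0, 0, 0), (0, 0, 1), 2.0, 3.0))
```

Whatever convention is used, the displacement stays perpendicular to the normal and has length sqrt(x**2 + y**2).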

def give_me_a_new_vertex_position_along_normal(old_vertex_position, normal):
    new_vertex_position = old_vertex_position + normal
    return new_vertex_position
There is a difference between affine spaces (your normals) and euclidean/linear spaces (your vertices).
Vectors in linear space have coordinates associated with them, while vectors in affine space do not.
Adding an affine-spaced vector to a linear-spaced vector is called projection and that is what you are looking to do.

Create Numpy Array Representing a Geometric Shape

As the title suggests, how would one create a numpy array of 3D coordinates of a geometric shape?
Currently, I have the easiest shape already figured out:
latva = 6
latvb = 6
latvc = 6
latdiv = 20
latvadiv = latva / latdiv
latvbdiv = latvb / latdiv
latvcdiv = latvc / latdiv
lol = np.zeros((latdiv**3,4),dtype=np.float64)
lol[:,:3] = (np.arange(latdiv**3)[:,None]//(latdiv**2,latdiv,1)*(latvadiv,latvbdiv,latvcdiv)%(latva,latvb,latvc))
creates an array of (8000,4). If you then split the array along the 1,2,3 column (Ignoring the 4th as it's meaningless in this question) and plot it (Personally, I use pyplot) you get a Cube!
Easy enough. Also works for a rectangle.
But I've not the foggiest idea of how to get any further - say plotting a rhombus.
I'm not interested in black magic like spheres, ovals or shapes whose sides do not change following a line. Just things like your standard rhombus/Rhomboid/Parallelepiped/Whatever_you_want_to_call_it.
Any ideas on how to accomplish this?

Because you already have a convenient method to generate points in a square or cube, the simplest way to make a rhombus or parallelogram (in 2D) or a parallelepiped (in 3D) is to apply an affine transform to calculate the new point coordinates.
For example, to make a rhombus, you can build the matrix as a combination of a translation by (-centerX, -centerY), a rotation by Pi/4, a scaling along the axes (if needed) and a translation to the needed position.
AffMatrix = ShiftMatrix * RotateMatrix * ScaleMatrix * BackShiftMatrix
for each point coordinates:
(NewX, NewY) = (AffMatrix) * (X, Y)
A rhomboid will also include a shear transform.
In numpy the matrices are ordinary arrays, so they can be combined and applied with matrix multiplication (the @ operator or np.dot).
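As an illustration of the affine-transform idea, the sketch below generates the cube lattice and multiplies every point by a 3x3 matrix whose columns are the desired edge directions, which turns the cube into a parallelepiped. The variable names and the particular shear values are assumptions chosen for the example.

```python
import numpy as np

n_div = 20          # points per edge, as in the question (latdiv)
edge_length = 6.0   # cube edge length (latva = latvb = latvc)

# Regular lattice of points filling the cube, shape (n_div**3, 3).
axis = np.arange(n_div) * (edge_length / n_div)
cube = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
cube = cube.reshape(-1, 3)

# Columns are the images of the x, y and z unit vectors.
# Here the y edge is tilted towards x, which shears the cube.
edges = np.array([[1.0, 0.5, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Apply the linear map to every point at once.
parallelepiped = cube @ edges.T
print(parallelepiped.shape)
```

A translation can be added afterwards with a plain vector addition, for example parallelepiped + np.array([1.0, 0.0, 0.0]).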

2D Interpolation with periodic boundary conditions

I'm running a simulation on a 2D space with periodic boundary conditions. A continuous function is represented by its values on a grid. I need to be able to evaluate the function and its gradient at any point in the space. Fundamentally, this isn't a hard problem -- or to be precise, it's an almost already solved problem. The function can be interpolated using a cubic spline with scipy.interpolate.RectBivariateSpline. The reason it's almost solved is that RectBivariateSpline cannot handle periodic boundary conditions, nor can anything else in scipy.interpolate, as far as I can figure out from the documentation.
Is there a python package that can do this? If not, can I adapt scipy.interpolate to handle periodic boundary conditions? For instance, would it be enough to put a border of, say, four grid elements around the entire space and explicitly represent the periodic condition on it?
[ADDENDUM] A little more detail, in case it matters: I am simulating the motion of animals in a chemical gradient. The continuous function I mentioned above is the concentration of a chemical that they are attracted to. It changes with time and space according to a straightforward reaction/diffusion equation. Each animal has an x,y position (which cannot be assumed to be at a grid point). They move up the gradient of attractant. I'm using periodic boundary conditions as a simple way of imitating an unbounded space.

It appears that the python function that comes closest is scipy.signal.cspline2d. This is exactly what I want, except that it assumes mirror-symmetric boundary conditions. Thus, it appears that I have three options:
1) Write my own cubic spline interpolation function that works with periodic boundary conditions, perhaps using the cspline2d sources (which are based on functions written in C) as a starting point.
2) The kludge: the effect of data at i on the spline coefficient at j goes as r^|i-j|, with r = -2 + sqrt(3) ~ -0.27. So the effect of the edge is down to r^20 ~ 4 x 10^-12 if I nest the grid within a border of width 20 all the way around that replicates the periodic values, something like this:
    bzs1 = np.array(
        [zs1[i%n,j%n] for i in range(-20, n+20) for j in range(-20, n+20)] )
    bzs1 = bzs1.reshape((n + 40, n + 40))
Then I call cspline2d on the whole array, but use only the middle. This should work, but it's ugly.
3) Use Hermite interpolation instead. In a 2D regular grid, this corresponds to bicubic interpolation. The disadvantage is that the interpolated function has a discontinuous second derivative. The advantages are it is (1) relatively easy to code, and (2) for my application, computationally efficient. At the moment, this is the solution I'm favoring.
I did the math for interpolation with trig functions rather than polynomials, as @mdurant suggested. It turns out to be very similar to the cubic spline, but requires more computation and produces worse results, so I won't be doing that.
EDIT: A colleague told me of a fourth solution:
The GNU Scientific Library (GSL) has interpolation functions that can handle periodic boundary conditions. There are two (at least) python interfaces to GSL: PyGSL and CythonGSL. Unfortunately, GSL interpolation seems to be restricted to one dimension, so it's not a lot of use to me, but there's lots of good stuff in GSL.
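For the border-replication kludge, the periodic border can also be built with np.pad and mode='wrap' instead of a Python-level list comprehension. The sketch below uses a small random array as a stand-in for zs1 and checks that both constructions agree; the border width of 20 is taken from the text above.

```python
import numpy as np

n = 32
border = 20
rng = np.random.default_rng(0)
zs1 = rng.normal(size=(n, n))   # stand-in for the real grid values

# Border built element by element with modular indexing.
bzs1_loop = np.array(
    [zs1[i % n, j % n]
     for i in range(-border, n + border)
     for j in range(-border, n + border)]
).reshape((n + 2 * border, n + 2 * border))

# Same border built in one call.
bzs1 = np.pad(zs1, border, mode="wrap")

print(np.array_equal(bzs1, bzs1_loop))
```

The original grid sits in the middle of the padded array, at bzs1[border:-border, border:-border].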

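Another function that could work is scipy.ndimage.map_coordinates (also reachable under the older name scipy.ndimage.interpolation.map_coordinates).
It does spline interpolation and supports periodic boundary conditions through its mode argument (e.g. mode='wrap').
It does not directly provide derivatives, but you could calculate them numerically.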
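A minimal sketch of this approach, assuming SciPy 1.6 or later (where the mode 'grid-wrap' is available) and coordinates expressed in grid-index units. The helper names periodic_value and periodic_gradient are hypothetical, and the gradient is a plain central difference with a step size picked for illustration.

```python
import numpy as np
from scipy import ndimage

# Sample periodic field on an n x n grid (stand-in data).
n = 16
idx = np.arange(n)
zs = np.sin(2 * np.pi * idx[:, None] / n) * np.cos(2 * np.pi * idx[None, :] / n)

def periodic_value(zs, x, y, order=3):
    """Spline-interpolate zs at (x, y), given in grid-index units."""
    coords = np.array([np.atleast_1d(x), np.atleast_1d(y)], dtype=float)
    return ndimage.map_coordinates(zs, coords, order=order, mode="grid-wrap")

def periodic_gradient(zs, x, y, h=1e-3):
    """Central-difference gradient of the interpolated field."""
    dfdx = (periodic_value(zs, x + h, y) - periodic_value(zs, x - h, y)) / (2 * h)
    dfdy = (periodic_value(zs, x, y + h) - periodic_value(zs, x, y - h)) / (2 * h)
    return dfdx, dfdy

# A point and its periodic image one full domain away.
print(periodic_value(zs, 3.3, 5.7), periodic_value(zs, 3.3 + n, 5.7 - n))
print(periodic_gradient(zs, 3.3, 5.7))
```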

These functions can be found on my GitHub, in master/hmc/lattice.py:
Periodic boundary conditions: the Periodic_Lattice() class is described here in full.
Lattice derivatives: in the repository you will find a Laplacian function, a squared gradient (for the gradient magnitude, take the square root) and an overloaded version of np.ndarray.
Unit tests: the test cases can be found in the same repo in tests/test_lattice.py.

I have been using the following function which augments the input to create data with effective periodic boundary conditions. Augmenting the data has a distinct advantage over modifying an existing algorithm: the augmented data can easily be interpolated using any algorithm. See below for an example.
def augment_with_periodic_bc(points, values, domain):
    """
    Augment the data to create periodic boundary conditions.

    Parameters
    ----------
    points : tuple of ndarray of float, with shapes (m1, ), ..., (mn, )
        The points defining the regular grid in n dimensions.
    values : array_like, shape (m1, ..., mn, ...)
        The data on the regular grid in n dimensions.
    domain : float or None or array_like of shape (n, )
        The size of the domain along each of the n dimensions
        or a uniform domain size along all dimensions if a
        scalar. Using None specifies aperiodic boundary conditions.

    Returns
    -------
    points : tuple of ndarray of float, with shapes (m1, ), ..., (mn, )
        The points defining the regular grid in n dimensions with
        periodic boundary conditions.
    values : array_like, shape (m1, ..., mn, ...)
        The data on the regular grid in n dimensions with periodic
        boundary conditions.
    """
    # Validate the domain argument
    n = len(points)
    if np.ndim(domain) == 0:
        domain = [domain] * n
    if np.shape(domain) != (n,):
        raise ValueError("`domain` must be a scalar or have the same "
                         "length as `points`")
    # Pre- and append repeated points
    points = [x if d is None else np.concatenate([x - d, x, x + d])
              for x, d in zip(points, domain)]
    # Tile the values as necessary
    reps = [1 if d is None else 3 for d in domain]
    values = np.tile(values, reps)
    return points, values
Example
The example below shows interpolation with periodic boundary conditions in one dimension but the function above can be applied in arbitrary dimensions.
import numpy as np
from matplotlib import pyplot as plt, rcParams
from scipy import interpolate

rcParams['figure.dpi'] = 144
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
np.random.seed(0)
x = np.linspace(0, 1, 10, endpoint=False)
y = np.sin(2 * np.pi * x)
ax = axes[0, 0]
ax.plot(x, y, marker='.')
ax.set_title('Points to interpolate')
# Without periodic bc, samples beyond the last grid point come back as NaN
sampled = np.random.uniform(0, 1, 100)
y_sampled = interpolate.interpn([x], y, sampled, bounds_error=False)
valid = ~np.isnan(y_sampled)
ax = axes[0, 1]
ax.scatter(sampled, np.where(valid, y_sampled, 0), marker='.', c=np.where(valid, 'C0', 'C1'))
ax.set_title('interpn w/o periodic bc')
# With the augmented data every sample lies inside the grid
[x], y = augment_with_periodic_bc([x], y, domain=1.0)
y_sampled_bc = interpolate.interpn([x], y, sampled)
ax = axes[1, 0]
ax.scatter(sampled, y_sampled_bc, marker='.')
ax.set_title('interpn w/ periodic bc')
y_sampled_bc_cubic = interpolate.interp1d(x, y, 'cubic')(sampled)
ax = axes[1, 1]
ax.scatter(sampled, y_sampled_bc_cubic, marker='.')
ax.set_title('cubic interp1d w/ periodic bc')
fig.tight_layout()

trouble with performing coordinate map/interpolation with interp2d

I have what is essentially a 4 column lookup table: cols 1, 2 are the respective xi,yj coordinates which map to x'i, y'j coordinates in the respective 3rd and 4th cols.
My goal is to provide a method to enter some (xnew,ynew) position within the range of my look-up values in the 1st and 2nd columns(xi,yj) then map that position to an interpolated (x'i,y'j) from the range of positions in the 3rd and 4th cols of the lut.
I have tried using interp2d, but have not been able to figure out how to enter the arrays in the proper format. For example, I don't understand why scipy.interpolate.interp2d(x'i, y'j, [xi,yj], kind='linear') gives me the following error:
ValueError: Invalid length for input z for non rectangular grid.
This seems so simple, but I have not been able to figure it out. I will gladly provide more information if required.

interp2d requires that the interpolated function be 1D, see the docs:
z : 1-D ndarray The values of the function to interpolate at the data
points. If z is a multi-dimensional array, it is flattened before use.
So when you enter [xi,yj], it gets converted from its (2, n) shape to (2*n,), hence the error.
You can get around this by setting up two different interpolating functions, one for each coordinate. If your lut is a single array of shape (n, 4), you would do something like:
x_interp = scipy.interpolate.interp2d(lut[:, 0], lut[:, 1], lut[:, 2], kind='linear')
y_interp = scipy.interpolate.interp2d(lut[:, 0], lut[:, 1], lut[:, 3], kind='linear')
And you can now do things like:
new_x, new_y = x_interp(x, y), y_interp(x, y)
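Since interp2d has been deprecated and removed in recent SciPy releases, an alternative worth considering for this kind of lookup table is scipy.interpolate.LinearNDInterpolator, which accepts scattered (x, y) points and can interpolate both output columns in one call. The sketch below builds a made-up lookup table from a known affine mapping so the result can be checked by hand; the table contents are an assumption for illustration only.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# Made-up lookup table: columns are x, y, x', y'.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 1.0, size=(200, 2))
xy_mapped = np.column_stack([2.0 * xy[:, 0] + 1.0,
                             xy[:, 0] - 3.0 * xy[:, 1]])
lut = np.hstack([xy, xy_mapped])

# One interpolator for both output coordinates:
# inputs are columns 0-1, outputs are columns 2-3.
mapper = LinearNDInterpolator(lut[:, :2], lut[:, 2:])

# Query points have shape (m, 2); the result has shape (m, 2) as well.
new_xy = mapper(np.array([[0.5, 0.5]]))
print(new_xy)
```

Points outside the convex hull of the table's (x, y) values come back as NaN by default.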
