I have used numpy meshgrids for a long time, and typically find no issues when trying to pass that meshgrid through a function. In my experience it has always been the case that I can define my coordinate space as
x,y,z = numpy.meshgrid(numpy.linspace(-10,10,10),
numpy.linspace(-10,10,10),
numpy.linspace(-10,10,10))
and then can easily compute something like
u,v,w = numpy.sin(x*y)+numpy.cos(z).
My issue has arisen from the need to do a cross product in that calculation. I am defining a field using the meshgrid, and trying to pass the entire meshgrid through the function:
field_equation = lambda x,y,z: sum([parameter*np.cross([wire_x[i],wire_y[i],wire_z[i]],[x,y,z]) for i in range(len(wire))])
Depending on how I try to solve the problem, I get a whole host of problems. The code works fine when passing individual points (x,y,z) through one at a time, but cannot calculate for the entire field. How do I get around this?
By default np.cross expects the vector components on the last axis (size 3 here), so we first need to stack the grids with np.stack([x,y,z], axis=-1) to create a 10*10*10*3 nd-array.
The result is also a 10*10*10*3 array. To be able to unpack it later, it needs the shape 3*10*10*10, so I move the last axis to the front at the end. I use np.moveaxis rather than swapaxes(0,-1), because swapping would also exchange the first and third grid axes, and the components would no longer line up with x, y, z.
In the code below, I also took the liberty of shortening the wire part a little, assuming wire_x, wire_y, wire_z are just the 3 components of wire.
import numpy as np
# test data
x,y,z = np.meshgrid(np.linspace(-10,10,10),
                    np.linspace(-10,10,10),
                    np.linspace(-10,10,10))
wire = [[1,2,3,4], [5,6,7,8], [3,4,5,6]]
parameter = 1
# stack the grids into shape (10,10,10,3), sum the cross products over all wire points,
# then move the component axis to the front so the result unpacks into three grids
field_equation = lambda x,y,z: np.moveaxis(sum([parameter*np.cross(w, np.stack([x,y,z], axis=-1)) for w in zip(*wire)]), -1, 0)
a,b,c = field_equation(x,y,z)
print(a.shape, b.shape, c.shape)
#(10, 10, 10) (10, 10, 10) (10, 10, 10)
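As a side note, here is a minimal sketch of a loop-free variant. It assumes that parameter is the same constant for every wire point, as in the test data above; wire_points is a hypothetical array with one (x, y, z) row per wire point. Because the cross product is linear in its first argument, the wire vectors can be summed before a single np.cross call:

```python
import numpy as np

# Small test grid and hypothetical wire points (one row per point: x, y, z)
x, y, z = np.meshgrid(np.linspace(-10, 10, 5),
                      np.linspace(-10, 10, 5),
                      np.linspace(-10, 10, 5))
wire_points = np.array([[1, 5, 3], [2, 6, 4], [3, 7, 5], [4, 8, 6]])
parameter = 1  # assumed constant for all wire points

grid = np.stack([x, y, z], axis=-1)            # shape (5, 5, 5, 3)

# Reference: one cross product per wire point, summed
looped = sum(parameter * np.cross(p, grid) for p in wire_points)

# Linearity: sum the wire vectors first, then take a single cross product
summed = parameter * np.cross(wire_points.sum(axis=0), grid)

u, v, w = np.moveaxis(summed, -1, 0)           # three (5, 5, 5) component grids
print(np.allclose(looped, summed), u.shape)
```

If parameter depends on the wire point or on the grid position, this shortcut no longer applies and the per-point sum is needed.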
Is there a way to generate a bootstrap sample on an N-dimensional array? I am limited to using numpy==1.19.4
I have already tried using a for loop on the other dimensions to no avail, but the following works for 1-dimensional arrays.
import numpy as np
# Set random state and number of resamples
random_state = 42
np.random.seed(random_state)
n_resamples = 9999
# Generate data
data_1d = np.arange(2, 3, 0.1)
data_nd = np.random.default_rng(42).random((2,3,2))
data = data_1d.copy()
# Resample the data with replacement, computing the test statistic for each set of resamples
bs_samples = [np.std(np.random.choice(data, size=len(data))) for _ in range(n_resamples)]
If I understand your problem correctly, this is the method I usually apply.
Suppose you have this multi-dimensional array:
data_nd = np.random.rand(100, 3, 2)
data_nd.shape #(100, 3, 2)
You can draw bootstrap samples from it in this way:
n_resamples = 99
data_nd[np.random.randint(len(data_nd), size=len(data_nd)*n_resamples)].reshape(n_resamples, *data_nd.shape).shape
What I'm doing is randomly drawing indices (randint) with replacement and then reshaping the result to obtain 99 bootstrapped datasets with the same dimensions as the original one.
Note that with this procedure the "elements" being sampled are the arrays along the first axis, so each sampled element has shape (3,2).
I hope that is clear, but if you have any doubt please let me know.
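To get from the resampled datasets to a bootstrap statistic, one possible sketch is shown below. It assumes the statistic of interest is the standard deviation over the observation axis (as in the 1-dimensional example in the question) and uses np.random.default_rng, which is available in numpy 1.19; the names idx, resamples and bs_std are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)          # seeded generator (NumPy >= 1.17)
data_nd = rng.random((100, 3, 2))        # 100 observations, each of shape (3, 2)
n_resamples = 99

# Draw observation indices with replacement: one row of indices per resample
idx = rng.integers(0, len(data_nd), size=(n_resamples, len(data_nd)))
resamples = data_nd[idx]                 # shape (99, 100, 3, 2)

# Statistic over the observation axis of each resample
bs_std = resamples.std(axis=1)           # shape (99, 3, 2)
print(resamples.shape, bs_std.shape)
```

Indexing with a 2-D index array avoids the separate reshape step, and each entry of bs_std is the bootstrap standard deviation for one position of the (3, 2) element.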
Given an xarray.DataArray with area values ('Af') along date and depth dimensions: xarray.DataArray 'Af' (time: 366, z: 20), how do I get the difference between each area across the depth dimension for each date, with the same length as 'area' and the first difference being equal to the first indexed 'area'?
So something like:
area_1 = Af_1
area_2 = Af_1 - Af_2
...
area_i = Af_i - Af_(i-1)
If it were in numpy, I could have used np.diff(array, prepend=0), but the prepend option is not available in xarray. Is there any method to imitate np.diff(array, prepend=0) in xarray?
I am new to xarray (and coding in general), so any help would be appreciated.
If I understand you correctly, you want to do the following:
Compute the differences of the areas along the z-dimension. This will necessarily result in a coordinate that is shorter than before.
Then you want to prepend the areas at the first depth to get an array of the same length.
You can use xarray's diff method for the first step and concat for the second step.
import numpy as np
import xarray as xr
# Create a dummy dataset
da = xr.DataArray(
data=np.random.rand(3, 4),
dims=("time", "z"),
name="area",
coords={"time": np.arange(3), "z": np.arange(4)},
)
# Compute the differences
differences = da.diff(dim="z", label="upper")
# Concatenate the differences with the areas at the first depth
xr.concat([da.isel(z=0), differences], dim="z")
I think that this is the easiest way, because you can take advantage of xarray's labeled dimensions.
However, you can also apply NumPy functions to an xarray DataArray, so np.diff works as well. You will end up with an unlabeled NumPy array, though, so you have to re-add the dimensions and coordinates.
The easiest way to do so is the copy method using the data argument. It will create an array with the same structure as the original one (same dims and coords) but with different data:
# Compute diffs with numpy and write the results back into a DataArray
da.copy(data=np.diff(da, axis=da.get_axis_num("z"), prepend=0))
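For reference, here is a small NumPy-only sketch of what prepend=0 does along the depth axis. The area values are made up for illustration:

```python
import numpy as np

# Hypothetical area values: 2 dates (rows) by 4 depths (columns)
area = np.array([[5.0, 7.0, 6.0, 9.0],
                 [1.0, 4.0, 4.0, 2.0]])

# Prepending 0 along the depth axis keeps the shape and makes the
# first difference equal to the first area itself
diffs = np.diff(area, axis=1, prepend=0)
print(diffs)

# The running sum along depth recovers the original areas
print(np.allclose(np.cumsum(diffs, axis=1), area))
```

Note that np.diff computes Af_i - Af_(i-1); if the opposite sign is wanted, negate every column except the first.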
So I am a little new to using matrices in Python, and I am looking for the best way to perform the following operation.
Say I have a vector of an arbitrary length, like this:
data = np.array(range(255))
And I want to fit this data inside a matrix with a shape like so:
concept = np.zeros((3, 9, 6))
Now, obviously this will not fit, and results in an error:
ValueError: cannot reshape array of size 255 into shape (3,9,6)
What would be the best way to go about fitting as much of the data vector inside the first matrix with the shape (3, 9, 6) while making sure any "overflow" is stored in a second (or third, fourth, etc.) matrix?
Does this make sense?
Basically, I want to be able to take a vector of any size and produce an arbitrary amount of matrices that have the data shaped according to the 3, 9, 6 dimensions.
Thank you for your help.
import numpy as np

def each_matrix(a, dims):
    size = dims.prod()
    # pad with size-1 zeros so that the floor division below rounds the count up
    padded = np.concatenate([a, np.zeros(size - 1)])
    for i in range(len(padded) // size):
        yield padded[i*size : (i+1)*size].reshape(dims)

for matrix in each_matrix(np.array(range(255)),
                          dims=np.array([3, 9, 6])):
    print(str(matrix) + '\n\n-------\n')
This pads the last matrix with zeros.
Here is a rough solution to your problem.
import numpy as np

def split_padded(a, n):
    # zeros needed to make len(a) a multiple of n (0 if it already is)
    padding = (-len(a)) % n
    numOfsplit = (len(a) + padding) // n
    print(padding, numOfsplit)
    return np.split(np.concatenate((a, np.zeros(padding))), numOfsplit)

data = np.array(range(255))
splitnum = 3*9*6
splitdata = split_padded(data, splitnum)
for mat in splitdata:
    print(mat.reshape(3, 9, 6))
It is very rough and only works for 1D input arrays.
First, padding holds the number of zeros we need to append, then numOfsplit holds the number of matrices we can get out of the input data, and the last line does the splitting.
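A more compact variant of the same pad-then-split idea is sketched below. It assumes a non-empty 1D input and that a single stacked array with a leading block axis is acceptable; to_blocks is a hypothetical helper name.

```python
import numpy as np

def to_blocks(a, shape):
    """Zero-pad the 1D array `a` and stack it into blocks of the given shape."""
    size = int(np.prod(shape))
    padding = (-len(a)) % size                        # zeros needed to fill the last block
    padded = np.pad(a, (0, padding), mode="constant")  # append zeros at the end
    return padded.reshape((-1,) + tuple(shape))       # one block per leading index

blocks = to_blocks(np.arange(255), (3, 9, 6))
print(blocks.shape)
```

Here blocks[0] is the first full matrix and blocks[-1] holds the overflow followed by zeros.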
I have a Numpy array X of n 2x2 matrices, arranged so that X.shape = (2,2,n), that is, to get the first matrix I call X[:,:,0]. I would like to reshape X into an array Y such that I can get the first matrix by calling Y[0] etc., but performing X.reshape(n,2,2) messes up the matrices. How can I get it to preserve the matrices while reshaping the array?
I am essentially trying to do this:
import numpy as np
Y = np.zeros([n,2,2])
for i in range(n):
    Y[i] = X[:,:,i]
but without using the for loop. How can I do this with reshape or a similar function?
(To get an example array X, try X = np.concatenate([np.identity(2)[:,:,None]] * n, axis=2) for some n.)
numpy.moveaxis can be used to take a view of an array with one axis moved to a different position in the shape:
numpy.moveaxis(X, 2, 0)
numpy.moveaxis(a, source, destination) takes a view of array a where the axis originally at position source ends up at position destination, so numpy.moveaxis(X, 2, 0) makes the original axis 2 the new axis 0 in the view.
There's also numpy.transpose, which can be used to perform arbitrary rearrangements of an array's axes in one go if you pass it the optional second argument, and numpy.rollaxis, an older version of moveaxis with a more confusing calling convention.
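A quick check of these options against the explicit loop from the question, as a sketch. The example X below is made up and uses distinct, non-symmetric matrices so that any mix-up shows.

```python
import numpy as np

n = 4
# Example stack of distinct, non-symmetric 2x2 matrices: X[:, :, i] is matrix i
X = np.arange(2 * 2 * n).reshape(2, 2, n)

# Reference result built with the explicit loop from the question
Y_loop = np.zeros((n, 2, 2))
for i in range(n):
    Y_loop[i] = X[:, :, i]

Y_move = np.moveaxis(X, 2, 0)          # axis 2 becomes axis 0
Y_transpose = X.transpose(2, 0, 1)     # same rearrangement, spelled out in full

print(np.array_equal(Y_move, Y_loop), np.array_equal(Y_transpose, Y_loop))
# reshape only regroups the flat data, so it does not match the loop result
print(np.array_equal(X.reshape(n, 2, 2), Y_loop))
```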
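You can use swapaxes, but note that X.swapaxes(0,2) alone also transposes each 2x2 matrix (Y[i] becomes X[:,:,i].T). Swapping twice keeps each matrix intact:
Y = X.swapaxes(0,2).swapaxes(1,2)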
I want to compute the sum product along one dimension of two multidimensional arrays, using Theano.
I'll describe precisely what I want to do using numpy first. numpy.tensordot and numpy.dot seem to always do a matrix product, whereas I'm in essence looking for a batched equivalent of a vector product. Given x and y, I want to compute z like so:
x = np.random.normal(size=(200, 2, 2, 1000))
y = np.random.normal(size=(200, 2, 2))
# this is how I now approach it:
z = np.sum(y[:,:,:,np.newaxis] * x, axis=1)
# z is of shape (200, 2, 1000)
Now I know that numpy.einsum would probably be able to help me here, but again, I want to do this particular computation in Theano, which does not have an einsum equivalent. I will need to use dot, tensordot, or Theano's specialized einsum subset functions batched_dot or batched_tensordot.
The reason I'm looking to change my approach to this is performance; I suspect that using builtin (CUDA) dot products will be faster than relying on broadcasting, element-wise product, and sum.
In Theano, none of the dimensions of three- and four-dimensional tensors are broadcastable by default; you have to mark them explicitly. Once you do, the NumPy broadcasting principles work just fine. One way to do this is to use T.patternbroadcast. To read more about broadcasting, refer to the Theano documentation on broadcasting.
You have three dimensions in one of the tensors. So first you need to append a singleton dimension at the end and then make that dimension broadcastable. These two things can be achieved with a single command - T.shape_padaxis. The entire code is as follows:
import theano
from theano import tensor as T
import numpy as np
X = T.ftensor4('X')
Y = T.ftensor3('Y')
Y_broadcast = T.shape_padaxis(Y, axis=-1) # appending extra dimension and making it
# broadcastable
Z = T.sum((X*Y_broadcast), axis=1) # element-wise multiplication
f = theano.function([X, Y], Z, allow_input_downcast=True)
# Making sure that it works and gives correct results
x = np.random.normal(size=(3, 2, 2, 4))
y = np.random.normal(size=(3, 2, 2))
theano_result = f(x,y)
numpy_result = np.sum(y[:,:,:,np.newaxis] * x, axis=1)
print(np.amax(theano_result - numpy_result)) # prints 2.7e-7 on my system, close enough!
I hope this helps.
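For checking results on the NumPy side, the contraction that both versions compute can also be written with einsum, which the question mentions. This is a NumPy-only sketch for verification, not a Theano solution:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 2, 2, 4))
y = rng.normal(size=(3, 2, 2))

# Broadcast, multiply element-wise, then sum over axis 1 (as in the question)
z_broadcast = np.sum(y[:, :, :, np.newaxis] * x, axis=1)

# Same contraction: z[i, k, l] = sum over j of y[i, j, k] * x[i, j, k, l]
z_einsum = np.einsum('ijk,ijkl->ikl', y, x)

print(z_broadcast.shape, np.allclose(z_broadcast, z_einsum))
```

The index string makes explicit that i and k act as batch axes while j is the summed axis.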