I am writing a python package that performs various complex statistical analysis tasks along an arbitrary axis of an arbitrarily-shaped numpy array.
Currently, so that the array shape and axis can be arbitrary, I just permute the array so the axis of interest is placed on the far RHS, and squash the LHS axes into one. For example, if the array shape is (3,4,5), and we want to perform some operation along axis 1, it is transformed into the shape (15,4), the operation is performed along axis -1, then it is transformed back into the shape (3,4,5) and returned by the function.
I feel this approach may be unnecessarily slow because of all these array manipulations. Is there a way that I can cleanly iterate over all but one dimension of the array? That is, in the above example this would go [0,:,0], [0,:,1], ..., [2,:,3], [2,:,4], but again this should work for arbitrary array shape and axis position.
Maybe np.ndenumerate, np.ndindex, and np.take can be used for this somehow?
Edit: Is there a way to do this with np.nditer? Perhaps this can match the speed of permuting/reshaping.
Turns out just transposing and reshaping is indeed faster. So I guess the answer is... don't do that, it is preferable to permute and reshape as I was already doing.
Here's the code from my project.
# Benchmark
f = lambda x: x # can change this to any arbitrary function
def test1(data, axis=-1):
    # Test the lead flatten approach
    data, shape = lead_flatten(permute(data, axis))
    output = np.empty(data.shape)
    for i in range(data.shape[0]): # iterate along first dimension; each row is an autocor
        output[i,:] = f(data[i,:]) # arbitrary complex equation
    return unpermute(lead_unflatten(output, shape), axis)
def test2(data, axis=-1):
    # Test the new approach
    output = np.empty(data.shape)
    for d,o in zip(iter_1d(data, axis), iter_1d(output, axis)):
        o[...] = f(d)
    return output
# Iterator class
class iter_1d(object):
    def __init__(self, data, axis=-1):
        axis = (axis % data.ndim) # e.g. for 3D array, -1 becomes 2
        self.data = data
        self.axis = axis
    def __iter__(self):
        shape = (s for i,s in enumerate(self.data.shape) if i!=self.axis)
        self.iter = np.ndindex(*shape)
        return self
    def __next__(self):
        idx = next(self.iter) # built-in next() instead of the Python 2 style .next() method
        idx = [*idx]
        idx.insert(self.axis, slice(None))
        return self.data[tuple(idx)] # index with a tuple; a list would be treated as a fancy index
# Permute and reshape functions
def lead_flatten(data, nflat=None):
    shape = list(data.shape)
    if nflat is None:
        nflat = data.ndim-1 # all but last dimension
    if nflat<=0: # just apply singleton dimension
        return data[None,...], shape
    return np.reshape(data, (np.prod(data.shape[:nflat]).astype(int), *data.shape[nflat:]), order='C'), shape # C order, i.e. row major
def lead_unflatten(data, shape, nflat=None):
    if nflat is None:
        nflat = len(shape) - 1 # all but last dimension
    if nflat<=0: # we artificially added a singleton dimension; remove it
        return data[0,...]
    if data.shape[0] != np.prod(shape[:nflat]):
        raise ValueError(f'Number of leading elements {data.shape[0]} does not match leading shape {shape[:nflat]}.')
    if not all(s1==s2 for s1,s2 in zip(data.shape[1:], shape[nflat:])):
        raise ValueError(f'Trailing dimensions on data, {data.shape[1:]}, do not match trailing dimensions on new shape, {shape[nflat:]}.')
    return np.reshape(data, shape, order='C')
def permute(data, source=-1, destination=-1):
    data = np.moveaxis(data, source, destination)
    return data
def unpermute(data, source=-1, destination=-1):
    data = np.moveaxis(data, destination, source)
    return data
And here are results from some %timeit operations.
import numpy as np
a = np.random.rand(10,20,30,40)
%timeit -r10 -n10 test1(a, axis=2) # around 12ms
%timeit -r10 -n10 test2(a, axis=2) # around 22ms
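For reference, the permute-and-reshape pattern benchmarked above can be condensed with np.moveaxis and a single reshape. This is a minimal sketch rather than the project's code: the helper name apply_1d is hypothetical, and it assumes func returns a float-compatible 1-D result of the same length as its input.

```python
import numpy as np

def apply_1d(func, data, axis=-1):
    """Apply a length-preserving 1-D function along one axis (hypothetical helper)."""
    moved = np.moveaxis(data, axis, -1)          # target axis goes last (a view)
    flat = moved.reshape(-1, moved.shape[-1])    # collapse leading axes; may copy
    out = np.empty(flat.shape)                   # float64 output, as in test1
    for i in range(flat.shape[0]):
        out[i] = func(flat[i])
    # Undo the reshape, then move the axis back to where it started
    return np.moveaxis(out.reshape(moved.shape), -1, axis)

a = np.random.rand(3, 4, 5)
centered = apply_1d(lambda v: v - v.mean(), a, axis=1)
```

The only Python-level loop left is over the flattened leading axes, which matches the structure of test1.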
Related
I want to write a function for centering an input data matrix by multiplying it with the centering matrix. The function shall subtract the row-wise mean from the input.
My code:
import numpy as np
def centering(data):
    n = data.shape()[0]
    centeringMatrix = np.identity(n) - 1/n * (np.ones(n) @ np.ones(n).T)
    data = centeringMatrix @ data
data = np.array([[1,2,3], [3,4,5]])
center_with_matrix(data)
But I get a wrong result matrix, it is not centered.
Thanks!
The centering matrix is
np.eye(n) - np.ones((n, n)) / n
Here is a list of issues in your original formulation:
np.ones(n).T is the same as np.ones(n). The transpose of a 1D array is a no-op in numpy. If you want to turn a row vector into a column vector, add the dimension explicitly:
np.ones((n, 1))
OR
np.ones(n)[:, None]
The normal definition is to subtract the column-wise mean, not the row-wise, so you will have to transpose and right-multiply the input to get row-wise operation:
n = data.shape[1]
...
data = (centeringMatrix @ data.T).T
Your function creates a new array for the output but does not currently return anything. You can either return the result, or perform the assignment in-place:
return (centeringMatrix @ data.T).T
OR
data[:] = (centeringMatrix @ data.T).T
OR
np.matmul(centeringMatrix, data.T, out=data.T)
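As a cross-check of the points above, row-wise centering with the centering matrix can be compared against direct mean subtraction. This is a sketch only: the function name center_rows is hypothetical, and it converts the input to float so that integer input such as the example is not truncated.

```python
import numpy as np

def center_rows(data):
    """Subtract each row's mean using the centering matrix (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    m = data.shape[1]                      # number of columns = length of each row
    C = np.eye(m) - np.ones((m, m)) / m    # m x m centering matrix (symmetric)
    return data @ C                        # equals (C @ data.T).T because C is symmetric

data = np.array([[1, 2, 3], [3, 4, 5]])
centered = center_rows(data)
# Reference result computed without the centering matrix
reference = data - data.mean(axis=1, keepdims=True)
```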
I'm trying to implement a Softmax activation that can be applied to arrays of any dimension and softmax can be obtained along a specified axis.
Let's suppose I've an array [[1,2],[3,4]], then if I need the softmax along the rows, I extract each row and apply softmax individually on it through np.apply_along_axis with axis=1. So for the example given above applying softmax to each of [1,2] and [3,4] we get the output as softmax = [[0.26894142, 0.73105858], [0.26894142, 0.73105858]]. So far so good.
Now for the backward pass, let's suppose, I'll have the gradient from the upper layer as upper_grad = [[1,1],[1,1]], so I compute the Jacobian jacobian = [[0.19661193, -0.19661193],[-0.19661193, 0.19661193]] of shape (2,2) for each of the 1D arrays of shape (2,) in softmax and then np.dot it with the corresponding 1D array in upper_grad of shape (2,), so the result of dot product will be an array of shape (2,), the final derivative will be grads = [[0. 0.],[0. 0.]]
I definitely know I'm going wrong somewhere, because while doing gradient checking, I get ~0.90, which is absolutely bonkers. Could someone please help with what is wrong in my approach and how I can resolve it?
import numpy as np
def softmax(arr, axis):
    # implementation of softmax for a 1d array
    def calc_softmax(arr_1d):
        exponentiated = np.exp(arr_1d-np.max(arr_1d))
        sum_val = np.sum(exponentiated)
        return exponentiated/sum_val
    # split the given array of multiple dims into 1d arrays along axis and
    # apply calc_softmax to each of those 1d arrays
    result = np.apply_along_axis(calc_softmax, axis, arr)
    return result
def softmax_backward(arr, axis, upper_grad):
    result = softmax(arr, axis)
    counter = 0
    upper_grad_slices = []
    def get_ug_slices(arr_1d, upper_grad_slices):
        upper_grad_slices.append(arr_1d)
    def backward(arr_1d, upper_grad_slices, counter):
        local_grad = -np.broadcast_to(arr_1d, (arr_1d.size, arr_1d.size)) # local_grad is the jacobian
        np.fill_diagonal(local_grad, 1+np.diagonal(local_grad))
        local_grad*=arr_1d.reshape(arr_1d.size, 1)
        grads = np.dot(local_grad, upper_grad_slices[counter]) # grads is 1d array because (2,2) dot (2,)
        counter+=1 # increment the counter to access the next slice of upper_grad_slices
        return grads
    # since apply_along_axis doesnt give the index of the 1d array,
    # we take the slices of 1d array of upper_grad and store it in a list
    np.apply_along_axis(get_ug_slices, axis, upper_grad, upper_grad_slices)
    # Iterate over each 1d array in result along axis and calculate its local_grad(jacobian)
    # and np.dot it with the corresponding upper_grad slice
    grads = np.apply_along_axis(backward, axis, result, upper_grad_slices, counter)
    return grads
a = np.array([[1,2],[3,4]])
result = softmax(a, 1)
print("Result")
print(result)
upper_grad = np.array([[1,1],[1,1]])
grads = softmax_backward(a, 1, upper_grad)
print("Gradients")
print(grads)
apply_along_axis documentation - https://numpy.org/doc/stable/reference/generated/numpy.apply_along_axis.html
I'm so dumb. I was using the counter to get the next slice of upper_grad, but the counter was only being updated locally inside backward (counter+=1 rebinds the local name, so the value passed in on the next call is still 0). This meant I got the same slice of upper_grad each time, which gave an invalid gradient. I resolved it by calling pop(0) on upper_grad_slices instead.
Updated code
import numpy as np
def softmax(arr, axis):
    # implementation of softmax for a 1d array
    def calc_softmax(arr_1d):
        exponentiated = np.exp(arr_1d-np.max(arr_1d))
        sum_val = np.sum(exponentiated)
        return exponentiated/sum_val
    # split the given array of multiple dims into 1d arrays along axis and
    # apply calc_softmax to each of those 1d arrays
    result = np.apply_along_axis(calc_softmax, axis, arr)
    return result
def softmax_backward(arr, axis, upper_grad):
    result = softmax(arr, axis)
    upper_grad_slices = []
    def get_ug_slices(arr_1d, upper_grad_slices):
        upper_grad_slices.append(arr_1d)
    def backward(arr_1d, upper_grad_slices):
        local_grad = -np.broadcast_to(arr_1d, (arr_1d.size, arr_1d.size)) # local_grad is the jacobian
        np.fill_diagonal(local_grad, 1+np.diagonal(local_grad))
        local_grad*=arr_1d.reshape(arr_1d.size, 1)
        grads = np.dot(local_grad, upper_grad_slices.pop(0)) # grads is 1d array because (2,2) dot (2,)
        return grads
    # since apply_along_axis doesnt give the index of the 1d array,
    # we take the slices of 1d array of upper_grad and store it in a list
    np.apply_along_axis(get_ug_slices, axis, upper_grad, upper_grad_slices)
    # Iterate over each 1d array in result along axis and calculate its local_grad(jacobian)
    # and np.dot it with the corresponding upper_grad slice
    grads = np.apply_along_axis(backward, axis, result, upper_grad_slices)
    return grads
a = np.array([[1,2],[3,4]])
result = softmax(a, 1)
print("Result")
print(result)
upper_grad = np.array([[1,1],[1,1]])
grads = softmax_backward(a, 1, upper_grad)
print("Gradients")
print(grads)
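Because the softmax Jacobian for one slice is J[i, j] = s[i] * (delta_ij - s[j]), its product with the upstream gradient g simplifies to s * (g - sum(s * g)), so the per-slice bookkeeping can be avoided entirely. The following is a sketch under that identity, not the code from the post; the names softmax_vec and softmax_backward_vec are made up for illustration.

```python
import numpy as np

def softmax_vec(arr, axis):
    # Numerically stable softmax along `axis` using keepdims broadcasting
    shifted = arr - np.max(arr, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

def softmax_backward_vec(arr, axis, upper_grad):
    s = softmax_vec(arr, axis)
    # sum_j s_j * g_j for every 1-D slice along `axis`
    inner = np.sum(s * upper_grad, axis=axis, keepdims=True)
    return s * (upper_grad - inner)

a = np.array([[1, 2], [3, 4]])
upper_grad = np.array([[1.0, 0.5], [-2.0, 3.0]])
grads = softmax_backward_vec(a, 1, upper_grad)
```

With an all-ones upper_grad this gives zeros, which agrees with the result quoted in the question: each softmax slice sums to one, so a constant upstream gradient cancels out.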
I am constructing a transition matrix from a n1 x n2 x ... x nN x nN array. For concreteness let N = 3, e.g.,
import numpy as np
# example with N = 3
n1, n2, n3 = 3, 2, 5
dim = (n1, n2, n3)
arr = np.random.random_sample(dim + (n3,))
Here arr contains transition probabilities between 2 states, where the "from"-state is indexed by the first 3 dimensions, and the "to"-state is indexed by the first 2 and the last dimension. I want to construct a transition matrix, which expresses these probabilities raveled into a sparse (n1*n2*n3) x (n1*n2*n3) matrix.
To clarify, let me provide my current approach that does what I want to do. Unfortunately, it's slow and doesn't work when N and n1, n2, ... are large. So I am looking for a more efficient way of doing the same that scales better for larger problems.
My approach
import numpy as np
from scipy import sparse as sparse
## step 1: get the index corresponding to each dimension of the from and to state
# ravel axes 1 to 3 into single axis and make sparse
spmat = sparse.coo_matrix(arr.reshape(np.prod(dim), -1))
data = spmat.data
row = spmat.row
col = spmat.col
# use unravel to get idx for
row_unravel = np.array(np.unravel_index(row, dim))
col_unravel = np.array(np.unravel_index(col, n3))
## step 2: combine "to" index with rows 1 and 2 of "from"-index to get "to"-coordinates in full state space
row_unravel[-1, :] = col_unravel # first 2 dimensions of state do not change
colnew = np.ravel_multi_index(row_unravel, dim) # ravel back to 1d
## step 3: assemble transition matrix
out = sparse.coo_matrix((data, (row, colnew)), shape=(np.prod(dim), np.prod(dim)))
Final thought
I will be running this code many times. Across iterations, the data of arr may change, but the dimensions will stay the same. So one thing I could do is to save and load row and colnew from a file, skipping everything between the definition of data (line 2) and final line of my code. Do you think this would be the best approach?
Edit: One problem I see with this strategy is that if some elements of arr are zero (which is possible) then the size of data will change across iterations.
One approach that beats the one posted in the OP. Not sure if it's the most efficient.
import numpy as np
from scipy import sparse
# get col and row indices
idx = np.arange(np.prod(dim))
row = idx.repeat(dim[-1])
col = idx.reshape(-1, dim[-1]).repeat(dim[-1], axis=0).ravel()
# get the data
data = arr.ravel()
# construct the sparse matrix
out = sparse.coo_matrix((data, (row, col)), shape=(np.prod(dim), np.prod(dim)))
Two things that could be improved:
(1) if arr is sparse, the output matrix out will have zeros coded as nonzero.
(2) The approach relies on the new state being the last dimension of dim. It would be nice to generalize so that the last axis of arr can replace any of the originating axis, not just the last one.
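Point (2) could be approached by building the multi-indices explicitly and replacing the chosen axis of the "from" state with the trailing "to" index. A rough sketch, assuming arr has shape dim + (dim[axis],); the helper name transition_indices is hypothetical, and np.indices makes this memory-hungry for large problems.

```python
import numpy as np

def transition_indices(dim, axis=-1):
    """Row/column indices aligned with arr.ravel() (hypothetical helper)."""
    dim = tuple(dim)
    axis = axis % len(dim)
    n_to = dim[axis]
    # grids has shape (N + 1, *dim, n_to): one index grid per axis of arr
    grids = np.indices(dim + (n_to,))
    from_idx = grids[:-1]            # multi-index of the "from" state
    to_idx = from_idx.copy()
    to_idx[axis] = grids[-1]         # the "to" state differs only along `axis`
    row = np.ravel_multi_index(tuple(from_idx), dim).ravel()
    col = np.ravel_multi_index(tuple(to_idx), dim).ravel()
    return row, col

dim = (3, 2, 5)
row, col = transition_indices(dim)   # axis=-1 reproduces the case in the post
```

Since row and col depend only on dim, they could be computed once and reused across iterations. For point (1), a mask such as keep = data != 0 applied to data, row and col before building the COO matrix would drop explicit zeros.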
I have an index IDX (which may be either list of indices, boolean mask, tuple of slices etc.) indexing some abstract numpy array of known shape shape (possibly big).
I know I can create a dummy array, index it and count the elements:
A = np.zeros(shape)
print(A[IDX].size)
Is there any sensible way I can get the number of indexed elements without creating any (potentially big) array?
I need to tabularize a list of functions at certain points in 3D space. The points are subset of a rectangular grid given as X, Y, Z lists and IDX is indexing their Cartesian product:
XX, YY, ZZ = [A[IDX] for A in np.meshgrid(X, Y, Z)]
The functions accept either X, Y, Z arguments (and return values for their Cartesian product which needs to be indexed) or XX, YY, ZZ.
At the moment I create XX, YY and ZZ arrays whether they are used or not, then I allocate an array for function values:
self.TAB = np.full((len(functions), XX.size),
np.nan)
but I want to create XX, YY and ZZ only if they are necessary. I also want to separate TAB allocation from filling its rows, thus I need to know the number of columns in advance.
Just for fun, let's see if we can make a passable approximation here. Your input can be any of the following:
slice
array-like (including scalars)
integer arrays do fancy indexing
boolean arrays do masking
tuple
If the input isn't explicitly a tuple to begin with, make it one. Now you can iterate along the tuple and match it to the shape. You can't quite zip them together because boolean arrays eat up multiple elements of the shape, and trailing axes are included wholesale.
Something like this should do it:
def pint(x):
    """ Mimic numpy errors """
    if isinstance(x, bool):
        raise TypeError('an integer is required')
    try:
        y = int(x)
    except TypeError:
        raise TypeError('an integer is required')
    else:
        if y < 0:
            raise ValueError('negative dimensions are not allowed')
    return y
def estimate_size(shape, index):
    # Ensure input is a tuple
    if not isinstance(index, tuple):
        index = (index,)
    # Clean out Nones: they don't change size
    index = tuple(i for i in index if i is not None)
    # Check shape and type
    try:
        shape = tuple(shape)
    except TypeError:
        shape = (shape,)
    shape = tuple(pint(s) for s in shape)
    size = 1
    # Check for scalars
    if not shape:
        if index:
            raise IndexError('too many indices for array')
        return size
    # Process index dimensions
    # you could probably use iter(shape) instead of shape[s]
    s = 0
    # fancy indices need to be gathered together and processed as one
    fancy = []
    def get(n):
        nonlocal s
        s += n
        if s > len(shape):
            raise IndexError('too many indices for array')
        return shape[s - n:s]
    for ind in index:
        if isinstance(ind, slice):
            ax, = get(1)
            size *= len(range(*ind.indices(ax)))
        else:
            # at least 1-d array; avoids copy=False, which raises on NumPy 2 when a copy is needed
            ind = np.atleast_1d(ind)
            if ind.dtype == np.bool_:
                # Boolean masking
                ax = get(ind.ndim)
                if ind.shape != ax:
                    k = np.not_equal(ind.shape, ax).argmax()
                    raise IndexError(f'boolean index did not match indexed array along dimension {s - ind.ndim + k}; dimension is {shape[s - ind.ndim + k]} but corresponding boolean dimension is {ind.shape[k]}')
                size *= np.count_nonzero(ind)
            elif np.issubdtype(ind.dtype, np.integer):
                # Fancy indexing
                ax, = get(1)
                if ind.min() < -ax or ind.max() >= ax:
                    k = ind.min() if ind.min() < -ax else ind.max()
                    raise IndexError(f'index {k} is out of bounds for axis {s - 1} with size {ax}')
                fancy.append(ind)
            else:
                raise IndexError('arrays used as indices must be of integer (or boolean) type')
    # Add in trailing dimensions (dtype=int keeps the empty product an integer)
    size *= int(np.prod(shape[s:], dtype=int))
    # Add fancy indices
    if fancy:
        size *= np.broadcast(*fancy).size
    return size
This is only an approximation. You will need to change it any time the API changes, and it already has some incomplete features. Testing, fixing, and expanding is left as an exercise for the reader.
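A different, less manual route (not the approach above) is to index a zero-stride dummy view created with np.broadcast_to, which lets NumPy itself validate the index. This is only a sketch: the helper name indexed_size is hypothetical, and fancy or boolean indexing still allocates a result array, though at one byte per selected element rather than the full array.

```python
import numpy as np

def indexed_size(shape, index):
    """Count the elements selected by `index` (hypothetical helper)."""
    # A 0-d boolean broadcast to `shape`: every element shares one byte of storage
    dummy = np.broadcast_to(np.zeros((), dtype=np.bool_), shape)
    return dummy[index].size

shape = (1000, 1000, 1000)               # a real float64 array would need about 8 GB
n = indexed_size(shape, (slice(0, 10), [1, 2, 3]))
```

Pure slice indices return views, so in that case nothing beyond the single byte is allocated.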
PREREQUISITE
import numpy as np
import pandas as pd
INPUT1:boolean 2d array (a sample array as below)
x = np.array(
[[False,False,False,False,True],
[True,False,False,False,False],
[False,False,True,False,True],
[False,True,True,False,False],
[False,False,False,False,False]])
INPUT2:1D Range values (a sample as below)
y=np.array([1,2,3,4])
EXPECTED OUTPUT:2D ndarray
[[0,0,0,0,1],
[1,0,0,0,2],
[2,0,1,0,1],
[3,1,1,0,2],
[4,2,2,0,3]]
For each True in the 2D ndarray (INPUT1), I want to write the range values (INPUT2) as a vertical vector below that position, efficiently. Are there any useful APIs or solutions for this purpose?
Unfortunately I couldn't come up with an elegant solution, so I came up with multiple inelegant ones. The two main approaches I could think of are
brute-force looping over each True value and assigning slices, and
using a single indexed assignment to replace the necessary values.
It turns out that the time complexity of these approaches is non-trivial, so depending on the size of your array either can be faster.
Using your example input:
import numpy as np
x = np.array(
[[False,False,False,False,True],
[True,False,False,False,False],
[False,False,True,False,True],
[False,True,True,False,False],
[False,False,False,False,False]])
y = np.array([1,2,3,4])
refout = np.array([[0,0,0,0,1],
[1,0,0,0,2],
[2,0,1,0,1],
[3,1,1,0,2],
[4,2,2,0,3]])
# alternative input with arbitrary size:
# N = 100; x = np.random.rand(N,N) < 0.2; y = np.arange(1,N)
def looping_clip(x, y):
    """Loop over Trues, use clipped slices"""
    nmax = x.shape[0]
    n = y.size
    # initialize output
    out = np.zeros_like(x, dtype=y.dtype)
    # loop over True values
    for i,j in zip(*x.nonzero()):
        # truncate right-hand side where necessary
        out[i:i+n, j] = y[:nmax-i]
    return out
def looping_expand(x, y):
    """Loop over Trues, use an expanded buffer"""
    n = y.size
    nmax,mmax = x.shape
    ivals,jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax,0), mmax), dtype=y.dtype)
    # loop over True values
    for i,j in zip(ivals, jvals):
        # slice will always be complete, i.e. of length y.size
        out[i:i+n, j] = y
    return out[:nmax, :].copy() # rather not return a view to an auxiliary array
def index_2d(x, y):
    """Assign directly with 2d indices, use an expanded buffer"""
    n = y.size
    nmax,mmax = x.shape
    ivals,jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax,0), mmax), dtype=y.dtype)
    # now we can safely index for each "(ivals:ivals+n, jvals)" so to speak
    upped_ivals = ivals[:,None] + np.arange(n) # shape (ntrues, n)
    upped_jvals = jvals.repeat(y.size).reshape(-1, n) # shape (ntrues, n)
    out[upped_ivals, upped_jvals] = y # right-hand side of shape (n,) broadcasts
    return out[:nmax, :].copy() # rather not return a view to an auxiliary array
def index_1d(x,y):
    """Assign using linear indices, use an expanded buffer"""
    n = y.size
    nmax,mmax = x.shape
    ivals,jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax,0), mmax), dtype=y.dtype)
    # grab linear indices corresponding to Trues in a buffed-up array
    inds = np.ravel_multi_index((ivals, jvals), out.shape)
    # now all we need to do is start stepping along rows for each item and assign y
    upped_inds = inds[:,None] + mmax*np.arange(n) # shape (ntrues, n)
    out.flat[upped_inds] = y # y of shape (n,) broadcasts to (ntrues, n)
    return out[:nmax, :].copy() # rather not return a view to an auxiliary array
# check that the results are correct
print(all([np.array_equal(refout, looping_clip(x,y)),
           np.array_equal(refout, looping_expand(x,y)),
           np.array_equal(refout, index_2d(x,y)),
           np.array_equal(refout, index_1d(x,y))]))
I tried to document each function, but here's a synopsis:
looping_clip loops over every True value in the input and assigns to a corresponding slice in the output. We take care on the right-hand side to shorten the assigned array for when part of the slice would go beyond the edge of the array along the first dimension.
looping_expand loops over every True value in the input and assigns to a corresponding full slice in the output after allocating a padded output array ensuring that every slice will be full. We do more work when allocating a larger output array, but we don't have to shorten the right-hand side on assignment. We could omit the .copy() call in the last step, but I prefer not to return a nontrivially strided array (i.e. a view to an auxiliary array rather than a proper copy) as this might lead to obscure surprises for the user.
index_2d computes the 2d indices of every value to be assigned to, and assumes that duplicate indices will be handled in order. This is not guaranteed! (More on this a bit later.)
index_1d does the same using linearized indices and indexing into the flatiter of the output.
Here are the timings of the above methods using random arrays (see the commented line near the start):
What we can see is that for small and large arrays the looping versions are faster, but for linear sizes between roughly 10 and 150 the indexing versions are better. The reason I didn't go to higher sizes is that the indexing cases start to use a lot of memory, and I didn't want to have to worry about this messing with timings.
Just to make the above worse, note that the indexing versions assume that duplicate indices in a fancy indexing scenario are handled in order, so when True values are handled which are "lower" in the array, previous values will be overwritten as per your requirements. There's only one problem: this is not guaranteed:
For advanced assignments, there is in general no guarantee for the iteration order. This means that if an element is set more than once, it is not possible to predict the final result.
This doesn't sound very encouraging. While in my experiments it seems that the indices are handled in order (according to C order), this can also be coincidence, or an implementation detail. So if you want to use the indexing versions, make sure that on your specific version and specific dimensions and shapes this still holds true.
We can make the assignment safer by getting rid of duplicate indices ourselves. For this we can make use of this answer by Divakar on a corresponding question:
def index_1d_safe(x,y):
    """Same as index_1d but use Divakar's safe solution for reducing duplicates"""
    n = y.size
    nmax,mmax = x.shape
    ivals,jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax,0), mmax), dtype=y.dtype)
    # grab linear indices corresponding to Trues in a buffed-up array
    inds = np.ravel_multi_index((ivals, jvals), out.shape)
    # now all we need to do is start stepping along rows for each item and assign y
    upped_inds = inds[:,None] + mmax*np.arange(n) # shape (ntrues, n)
    # now comes https://stackoverflow.com/a/44672126
    # need additional step: flatten upped_inds and corresponding y values for selection
    upped_flat_inds = upped_inds.ravel() # shape (ntrues, n) -> (ntrues*n,)
    y_vals = np.broadcast_to(y, upped_inds.shape).ravel() # shape (ntrues, n) -> (ntrues*n,)
    sidx = upped_flat_inds.argsort(kind='mergesort')
    sindex = upped_flat_inds[sidx]
    idx = sidx[np.r_[np.flatnonzero(sindex[1:] != sindex[:-1]), upped_flat_inds.size-1]]
    out.flat[upped_flat_inds[idx]] = y_vals[idx]
    return out[:nmax, :].copy() # rather not return a view to an auxiliary array
This still reproduces your expected output. The problem is that now the function takes much longer to finish:
Bummer. Considering how my indexing versions are only faster for an intermediate array size and how their faster versions are not guaranteed to work, perhaps it's simplest to just use one of the looping versions. This is not to say, of course, that there aren't any optimal vectorized solutions that I missed.
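One vectorized formulation that avoids duplicate indices altogether (offered as a sketch, not something benchmarked in this answer) is to record, for every cell, the row of the most recent True at or above it in the same column, and then look up y by the distance to that row. The function name fill_from_last_true is hypothetical, and y is assumed to be non-empty.

```python
import numpy as np

def fill_from_last_true(x, y):
    """Write y downward from each True; later Trues restart the count (hypothetical helper)."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y)
    n = y.size
    rows = np.arange(x.shape[0])[:, None]                 # column vector of row numbers
    # Row index of the latest True at or above each cell, -1 if there is none yet
    last = np.maximum.accumulate(np.where(x, rows, -1), axis=0)
    offset = rows - last                                   # distance to that True
    valid = (last >= 0) & (offset < n)                     # inside the range of y
    # Clip the lookup index so invalid cells can be indexed safely, then mask them out
    return np.where(valid, y[np.minimum(offset, n - 1)], 0)

x = np.array(
    [[False, False, False, False, True],
     [True, False, False, False, False],
     [False, False, True, False, True],
     [False, True, True, False, False],
     [False, False, False, False, False]])
y = np.array([1, 2, 3, 4])
out = fill_from_last_true(x, y)
```

Each output cell is computed exactly once, so the result does not depend on the iteration order of advanced assignment. Memory use is a few temporaries of the same shape as x.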