Suppose I have a 2D array with shape (3, 3), call it a, and an array of zeros with shape (7, 7, 5, 5), call it b. I want to modify b in the following way:
for p in range(5):
    for q in range(5):
        b[p:p + 3, q:q + 3, p, q] = a
Given:
a = np.array([[4, 2, 2],
[9, 0, 5],
[9, 9, 4]])
b = np.zeros((7, 7, 5, 5), dtype=int)
b would end up something like:
>>> b[:, :, 0, 0]
array([[4, 2, 2, 0, 0, 0, 0],
[9, 0, 5, 0, 0, 0, 0],
[9, 9, 4, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
>>> b[:, :, 0, 1]
array([[0, 4, 2, 2, 0, 0, 0],
[0, 9, 0, 5, 0, 0, 0],
[0, 9, 9, 4, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
One way to think about this is to make a sliding window view of b (6D), slice out the parts you want (3D or 4D), and assign a to them.
However, there is a simpler way to do this altogether. The way a sliding window view works is by creating a dimension that steps along less than the full size of the dimension you are viewing. For example:
>>> x = np.array([1, 2, 3, 4])
>>> x
array([1, 2, 3, 4])
>>> window = np.lib.stride_tricks.as_strided(
...     x, shape=(x.shape[0] - 2, 3),
...     strides=x.strides * 2)
>>> print(window)
[[1 2 3]
 [2 3 4]]
I'm deliberately using np.lib.stride_tricks.as_strided rather than np.lib.stride_tricks.sliding_window_view here because it has a certain flexibility that you need.
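As a quick sanity check, the same window can be built both ways and compared. This is only a sketch: it assumes NumPy 1.20 or newer (where sliding_window_view exists), and the names `window` and `reference` are just illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

x = np.array([1, 2, 3, 4])

# Both axes step by one element: axis 0 picks where the window starts,
# axis 1 walks along the window itself.
window = as_strided(x, shape=(x.shape[0] - 2, 3), strides=x.strides * 2)

# sliding_window_view builds the same overlapping view for this simple case.
reference = sliding_window_view(x, 3)

print(window)
print(np.array_equal(window, reference))
```

For the plain sliding case the two agree; the difference only shows up once you need strides that sliding_window_view will not build for you.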
You can have a stride that is larger than the axis you are viewing, as long as you are careful. Contiguous arrays are more forgiving in this case, but by no means a requirement. An example of this is np.diag. You can implement it something like this:
>>> x = np.arange(12).reshape(3, 4)
>>> x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> diag = np.lib.stride_tricks.as_strided(
...     x, shape=(min(x.shape),),
...     strides=(sum(x.strides),))
>>> diag
array([ 0,  5, 10])
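To convince yourself that the summed stride really walks the main diagonal, you can compare against np.diag. A minimal sketch:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(12).reshape(3, 4)

# One step of (row stride + column stride) moves one row down and one
# column right, i.e. one step along the main diagonal.
diag = as_strided(x, shape=(min(x.shape),), strides=(sum(x.strides),))

print(diag)
print(np.array_equal(diag, np.diag(x)))
```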
The trick is to make a view of only the parts of b you care about in a way that makes the assignment easy. Because of broadcasting rules, you will want the last two dimensions of the view to be a.shape, and the strides to be b.strides[:2], since that's where you want to place a.
The first two dimensions of the view will be responsible for making the copies of a. You want 25 copies, so the shape will be (5, 5). The strides are the trickier part. Let's take a look at a 2D case, just because that's easier to visualize, and then attempt to generalize:
>>> a0 = np.array([1, 2])
>>> b0 = np.zeros((4, 3), dtype=int)
>>> b0[0:2, 0] = b0[1:3, 1] = b0[2:4, 2] = a0
The goal is to make a view that strides along the diagonal of b0 in the first axis. So:
>>> np.lib.stride_tricks.as_strided(
...     b0, shape=(b0.shape[0] - a0.shape[0] + 1, a0.shape[0]),
...     strides=(sum(b0.strides), b0.strides[0]))[:] = a0
>>> b0
array([[1, 0, 0],
[2, 1, 0],
[0, 2, 1],
[0, 0, 2]])
So that's what you do for b, but summing the stride of each leading axis with the stride of its matching trailing axis:
a = np.array([[4, 2, 2],
              [9, 0, 5],
              [9, 9, 4]])
b = np.zeros((7, 7, 5, 5), dtype=int)
vshape = (*np.subtract(b.shape[:a.ndim], a.shape) + 1,
          *a.shape)
vstrides = (*np.add(b.strides[:a.ndim], b.strides[a.ndim:]),
            *b.strides[:a.ndim])
np.lib.stride_tricks.as_strided(b, shape=vshape, strides=vstrides)[:] = a
TL;DR
def emplace_window(a, b):
    vshape = (*np.subtract(b.shape[:a.ndim], a.shape) + 1, *a.shape)
    vstrides = (*np.add(b.strides[:a.ndim], b.strides[a.ndim:]), *b.strides[:a.ndim])
    np.lib.stride_tricks.as_strided(b, shape=vshape, strides=vstrides)[:] = a
I've phrased it this way because it now applies to any number of dimensions. The only expectations are that 2 * a.ndim == b.ndim and that b.shape[a.ndim:] == b.shape[:a.ndim] - a.shape + 1 (element-wise).
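If you want to check the strided version against the original double loop, something like the following should do. It is a sketch, not a drop-in: the shape and strides are restated with plain Python ints instead of np.subtract/np.add (my own choice, purely to keep the arithmetic explicit), and `expected` is a hypothetical name for the loop result.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def emplace_window(a, b):
    n = a.ndim
    # Leading axes of the view: one entry per copy of a.
    # Trailing axes of the view: the footprint of a inside b.
    vshape = tuple(bs - s + 1 for bs, s in zip(b.shape[:n], a.shape)) + a.shape
    # view[p, q, i, j] maps to b[p + i, q + j, p, q].
    vstrides = tuple(lead + trail for lead, trail
                     in zip(b.strides[:n], b.strides[n:])) + b.strides[:n]
    as_strided(b, shape=vshape, strides=vstrides)[:] = a

a = np.array([[4, 2, 2],
              [9, 0, 5],
              [9, 9, 4]])
b = np.zeros((7, 7, 5, 5), dtype=int)
emplace_window(a, b)

# Reference result from the explicit loop in the question.
expected = np.zeros((7, 7, 5, 5), dtype=int)
for p in range(5):
    for q in range(5):
        expected[p:p + 3, q:q + 3, p, q] = a

print(np.array_equal(b, expected))
```

Keep in mind that as_strided does no bounds checking, so the shape expectations above are worth asserting before calling it on real data.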
I have two arrays, a and b.
a has shape (1, 2, 3, 4)
b has shape (4, 3, 2, 1)
I would like to make them both (4, 3, 3, 4) with the new positions filled with 0's.
I can do:
new_shape = (4, 3, 3, 4)
a = np.resize(a, new_shape)
b = np.resize(b, new_shape)
..but this repeats the elements of each to form the new elements, which does not work for me.
Instead I thought I could do:
a = a.resize(new_shape)
b = b.resize(new_shape)
..which according to the documentation pads with 0's.
But it doesn't work for multi-dimensional arrays; it raises this error:
ValueError: resize only works on single-segment arrays
So is there a different way to achieve this, i.e. the same as np.resize but with 0-padding?
NB: I am only looking for pure-numpy solutions.
EDIT: I'm using numpy version 1.20.2
EDIT: I just found out that it works for numbers but not for objects. I forgot to mention that this is an array of objects, not numbers.
The resize method pads with 0s in a flattened sense; the np.resize function pads with repeats.
To illustrate how resize "flattens" before padding:
In [108]: a = np.arange(12).reshape(1,4,3)
In [109]: a
Out[109]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]]])
In [110]: a1 = a.copy()
In [111]: a1.resize((2,4,4))
In [112]: a1
Out[112]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[ 0, 0, 0, 0]],
[[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0]]])
If instead I make a target array of the right shape, and copy, I can maintain the original multidimensional block:
In [114]: res = np.zeros((2,4,4),a.dtype)
In [115]: res[:a.shape[0],:a.shape[1],:a.shape[2]]=a
In [116]: res
Out[116]:
array([[[ 0, 1, 2, 0],
[ 3, 4, 5, 0],
[ 6, 7, 8, 0],
[ 9, 10, 11, 0]],
[[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0]]])
I wrote out the slices explicitly (for clarity). Such a tuple could be created programmatically if needed.
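Building that slice tuple programmatically could look roughly like this. `pad_to_shape` is a name I made up for illustration, and the sketch assumes the target shape has the same number of dimensions as the input and is at least as large along every axis. Since it only uses np.full and slice assignment, it should also work for object arrays.

```python
import numpy as np

def pad_to_shape(arr, new_shape, fill=0):
    """Copy arr into the leading corner of a new array filled with `fill`."""
    if len(new_shape) != arr.ndim or any(n < s for n, s in zip(new_shape, arr.shape)):
        raise ValueError("new_shape must have the same rank and be at least as large")
    out = np.full(new_shape, fill, dtype=arr.dtype)
    # One slice per axis, covering exactly the extent of the original block.
    corner = tuple(slice(0, s) for s in arr.shape)
    out[corner] = arr
    return out

a = np.arange(24).reshape(1, 2, 3, 4)
b = np.arange(24).reshape(4, 3, 2, 1)
a_padded = pad_to_shape(a, (4, 3, 3, 4))
b_padded = pad_to_shape(b, (4, 3, 3, 4))

print(a_padded.shape, b_padded.shape)
```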
I have a numpy matrix b = np.array([[1,0,1,0],[0,0,0,1]]) and I want to multiply it element-wise with a 3-dim array a = np.array([[[1,2,3,4], [5,6,7,8], [9,10,11,12]], [[13,14,15,16], [17,18,19,20], [21,22,23,24]]]) for each index along the second dimension. So, the result I expect should be as follows:
[[[1,0,3,0], [5,0,7,0], [9,0,11,0]], [[0,0,0,16], [0,0,0,20], [0,0,0,24]]]
NumPy does not broadcast when I do a * b. I thought of broadcasting b along its second dimension. I tried np.broadcast_to(b, (2,3,4)) but got an error. I also tried np.broadcast_to(b, (3,2,4)).reshape(2,3,4), but the output is not what I expected.
Use None/newaxis to add a new middle dimension (reshape also does this):
In [36]: b.shape
Out[36]: (2, 4)
In [37]: a.shape
Out[37]: (2, 3, 4)
In [38]: b[:,None,:]*a
Out[38]:
array([[[ 1, 0, 3, 0],
[ 5, 0, 7, 0],
[ 9, 0, 11, 0]],
[[ 0, 0, 0, 16],
[ 0, 0, 0, 20],
[ 0, 0, 0, 24]]])
In [39]: b[:,None,:].shape
Out[39]: (2, 1, 4)
broadcast_to can't add that extra dimension automatically. It follows the same rules as operations like b*a: it can add leading dimensions if needed and stretch size-1 dimensions, but for anything else you have to be explicit.
In [41]: np.broadcast_to(b, (2,3,4))
Traceback (most recent call last):
File "<ipython-input-41-3c3268de7ce1>", line 1, in <module>
np.broadcast_to(b, (2,3,4))
File "<__array_function__ internals>", line 5, in broadcast_to
File "/usr/local/lib/python3.8/dist-packages/numpy/lib/stride_tricks.py", line 411, in broadcast_to
return _broadcast_to(array, shape, subok=subok, readonly=True)
File "/usr/local/lib/python3.8/dist-packages/numpy/lib/stride_tricks.py", line 348, in _broadcast_to
it = np.nditer(
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,4) and requested shape (2,3,4)
In [42]: np.broadcast_to(b[:,None,:], (2,3,4))
Out[42]:
array([[[1, 0, 1, 0],
[1, 0, 1, 0],
[1, 0, 1, 0]],
[[0, 0, 0, 1],
[0, 0, 0, 1],
[0, 0, 0, 1]]])
You need to reshape:
c = b.reshape(2,-1,4)*a
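The indexing, reshape and np.expand_dims spellings all insert the same size-1 middle axis, so they should give identical results. A small sketch to confirm (the variable names are just for illustration):

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)
b = np.array([[1, 0, 1, 0], [0, 0, 0, 1]])

# Three ways to turn b from (2, 4) into (2, 1, 4) before multiplying.
via_none = b[:, None, :] * a
via_expand = np.expand_dims(b, axis=1) * a
via_reshape = b.reshape(2, -1, 4) * a

print(via_none)
print(np.array_equal(via_none, via_expand), np.array_equal(via_none, via_reshape))
```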
I have an N-dimensional array. I want to expand it to an (N+1)-dimensional array by putting the values of the final dimension in the diagonal.
For example, using explicit looping:
In [197]: M = arange(5*3).reshape(5, 3)
In [198]: numpy.dstack([numpy.diag(M[i, :]) for i in range(M.shape[0])]).T
Out[198]:
array([[[ 0, 0, 0],
[ 0, 1, 0],
[ 0, 0, 2]],
[[ 3, 0, 0],
[ 0, 4, 0],
[ 0, 0, 5]],
[[ 6, 0, 0],
[ 0, 7, 0],
[ 0, 0, 8]],
[[ 9, 0, 0],
[ 0, 10, 0],
[ 0, 0, 11]],
[[12, 0, 0],
[ 0, 13, 0],
[ 0, 0, 14]]])
which is a 5×3×3 array.
My actual arrays are large and I would like to avoid explicit looping (hiding the looping in map instead of a list comprehension has no performance gain; it's still a loop). Although numpy.diag works for constructing a regular, 2-D diagonal matrix, it does not extend to higher dimensions (when given a 2-D array, it will extract its diagonal instead). The array returned by numpy.diagflat makes everything into one big diagonal, resulting in a 15×15 array which has far more zeroes and cannot be reshaped into 5×3×3.
Is there a way to efficiently construct an (N+1)-dimensional diagonal array from the values in an N-dimensional array, without calling diag many times?
Use numpy.diagonal to take a view of the relevant diagonals of a properly-shaped N+1-dimensional array, force the view to be writeable with setflags, and write to the view:
expanded = numpy.zeros(M.shape + M.shape[-1:], dtype=M.dtype)
diagonals = numpy.diagonal(expanded, axis1=-2, axis2=-1)
diagonals.setflags(write=True)
diagonals[:] = M
This produces your desired array as expanded.
You can use an almost-impossible-to-guess-if-you-don't-know feature of the ubiquitous np.einsum. When used as follows einsum will return a writable view of the generalized diagonal:
>>> import numpy as np
>>> M = np.arange(5*3).reshape(5, 3)
>>>
>>> out = np.zeros((*M.shape, M.shape[-1]), M.dtype)
>>> np.einsum('...jj->...j', out)[...] = M
>>> out
array([[[ 0, 0, 0],
[ 0, 1, 0],
[ 0, 0, 2]],
[[ 3, 0, 0],
[ 0, 4, 0],
[ 0, 0, 5]],
[[ 6, 0, 0],
[ 0, 7, 0],
[ 0, 0, 8]],
[[ 9, 0, 0],
[ 0, 10, 0],
[ 0, 0, 11]],
[[12, 0, 0],
[ 0, 13, 0],
[ 0, 0, 14]]])
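If you would rather not rely on a writable view at all, plain integer-array indexing on the last two axes gets you the same result. Note this is a different technique from the einsum view (it is an indexed assignment, not a view), and `idx` and `looped` are just illustrative names:

```python
import numpy as np

M = np.arange(5 * 3).reshape(5, 3)

out = np.zeros((*M.shape, M.shape[-1]), M.dtype)
idx = np.arange(M.shape[-1])
# out[..., idx, idx] selects element (j, j) of every trailing square block,
# so the indexed result has the same shape as M.
out[..., idx, idx] = M

# Reference built with the explicit per-row np.diag loop.
looped = np.stack([np.diag(row) for row in M])
print(np.array_equal(out, looped))
```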
A general way to turn the last dimension of an N-D array into a diagonal matrix:
We need to reduce the dimensionality of the array, apply numpy.diag() to each vector, and then rebuild the result to the original dimensionality + 1.
First, reshape the array to 2 dimensions:
M.reshape(-1, M.shape[-1])
then use map to apply np.diag to that, and rebuild the matrix with an additional dimension using the following:
result.reshape([*M.shape, M.shape[-1]])
All of this combined gives the following:
result = np.array(list(map(
    np.diag,
    M.reshape(-1, M.shape[-1])
))).reshape([*M.shape, M.shape[-1]])
An example:
shape = np.arange(2,8)
M = np.arange(shape.prod()).reshape(shape)
print(M.shape) # (2, 3, 4, 5, 6, 7)
result = np.array(list(map(np.diag, M.reshape(-1, M.shape[-1])))).reshape([*M.shape, M.shape[-1]])
print(result.shape) # (2, 3, 4, 5, 6, 7, 7)
and result[0,0,0,0,2,:] contains the following:
array([[14, 0, 0, 0, 0, 0, 0],
[ 0, 15, 0, 0, 0, 0, 0],
[ 0, 0, 16, 0, 0, 0, 0],
[ 0, 0, 0, 17, 0, 0, 0],
[ 0, 0, 0, 0, 18, 0, 0],
[ 0, 0, 0, 0, 0, 19, 0],
[ 0, 0, 0, 0, 0, 0, 20]])
How can I resize a numpy array and fill it with a specific value (if some dimension is extended)?
I found a way to extend my array with np.pad, but I can't shorten it:
>>> import numpy as np
>>> a = np.ndarray((5, 5), dtype=np.uint16)
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]], dtype=uint16)
>>> np.pad(a, ((0, 1), (0,3)), mode='constant', constant_values=9)
array([[0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 9, 9, 9],
[9, 9, 9, 9, 9, 9, 9, 9]], dtype=uint16)
And if I use resize, I can't specify the value that I want to use.
>>> a.fill(5)
>>> a.resize((2, 7))
>>> a
array([[5, 5, 5, 5, 5, 5, 5],
[5, 5, 5, 5, 5, 5, 5]], dtype=uint16)
But I would like:
>>> a
array([[5, 5, 5, 5, 5, 9, 9],
[5, 5, 5, 5, 5, 9, 9]], dtype=uint16)
After some testing I wrote this function, but it only works when x_value changes or when y_value decreases. If I need to increase the y dimension it doesn't work. Why?
VALUE_TO_FILL = 9
def resize(self, x_value, y_value):
    x_diff = self.np_array.shape[0] - x_value
    y_diff = self.np_array.shape[1] - y_value
    self.np_array.resize((x_value, y_value), refcheck=False)
    if x_diff < 0:
        self.np_array[x_diff:, :] = VALUE_TO_FILL
    if y_diff < 0:
        self.np_array[:, y_diff:] = VALUE_TO_FILL
Your array has a fixed size data buffer. You can reshape the array without changing that buffer. You can take a slice (view) without changing the buffer. But you can't add values to the array without changing the buffer.
In general resize returns a new array with a new data buffer.
pad is a complex function to handle general cases. But the simplest approach is to create the empty target array, fill it, and then copy the input into the right place.
Alternatively pad could create the fill arrays and concatenate them with the original. But concatenate also makes the empty return and copies.
A do-it-yourself pad with clipping could be structured as:
n,m = X.shape
R = np.empty((k,l))
R.fill(value)
<calc slices from n,m,k,l>
R[slice1] = X[slice2]
Calculating the slices may require if-else tests or equivalent min/max. You can probably work out those details.
This may be all that is needed
R[:X.shape[0],:X.shape[1]]=X[:R.shape[0],:R.shape[1]]
That's because there's no problem if a slice is larger than the dimension.
In [37]: np.arange(5)[:10]
Out[37]: array([0, 1, 2, 3, 4])
Thus, for example:
In [38]: X=np.ones((3,4),int)
In [39]: R=np.empty((2,5),int)
In [40]: R.fill(9)
In [41]: R[:X.shape[0],:X.shape[1]]=X[:R.shape[0],:R.shape[1]]
In [42]: R
Out[42]:
array([[1, 1, 1, 1, 9],
[1, 1, 1, 1, 9]])
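Wrapped up as a function, that could look something like the sketch below. `resize_with_fill` is a hypothetical helper name, and it assumes the new shape has the same number of dimensions as the input. The min() per axis just makes the overlap explicit; as shown above, oversized slices would be clipped anyway.

```python
import numpy as np

def resize_with_fill(x, new_shape, value):
    """Return a new array of new_shape: x clipped or padded with `value`."""
    out = np.full(new_shape, value, dtype=x.dtype)
    # The region present in both the old and the new array.
    overlap = tuple(slice(0, min(old, new)) for old, new in zip(x.shape, new_shape))
    out[overlap] = x[overlap]
    return out

a = np.full((5, 5), 5, dtype=np.uint16)
r = resize_with_fill(a, (2, 7), 9)
print(r)
```

With the question's numbers, this shortens the first axis to 2 rows and extends the second to 7 columns, filling the new columns with 9.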
To shorten it, you can use negative values in the slice:
>>> import numpy as np
>>> a = np.ndarray((5, 5), dtype=np.uint16)
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]], dtype=uint16)
>>> b = a[0:-1,0:-3]
>>> b
array([[0, 0],
[0, 0],
[0, 0],
[0, 0]], dtype=uint16)