Given an array of shape (8, 3, 4, 4), rearrange it into a new shape (8, 4, 4, 3) by specifying, for each new axis, which old axis it comes from: (0, 2, 3, 1).
Bonus: compute numpy.dot along a non-last axis of that array with a 1-D second argument, e.g. numpy.dot(<array with shape (8, 3, 4, 4)>, [1, 2, 3]), which raises a shape mismatch as written.
NumPy's transpose "reverses or permutes" the axes of an array:
ni = (0, 2, 3, 1)
arr = arr.transpose(ni)
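As a minimal sketch of the above (the array contents and the names `moved`, `weights` and `direct` are made up for illustration), the permutation can be checked element by element, and the bonus numpy.dot works once the length-3 axis is last:

```python
import numpy as np

arr = np.arange(8 * 3 * 4 * 4).reshape(8, 3, 4, 4)
ni = (0, 2, 3, 1)  # new axis k is taken from old axis ni[k]
moved = arr.transpose(ni)
print(moved.shape)  # (8, 4, 4, 3)

# moved[i, j, k, l] is arr[i, l, j, k]
assert moved[1, 2, 3, 0] == arr[1, 0, 2, 3]

# Bonus: np.dot contracts the last axis of its first argument,
# so it works once the length-3 axis has been moved to the end.
weights = np.array([1, 2, 3])
dotted = np.dot(moved, weights)
print(dotted.shape)  # (8, 4, 4)

# np.tensordot can contract axis 1 of the original array directly.
direct = np.tensordot(arr, weights, axes=([1], [0]))
assert np.array_equal(dotted, direct)
```

transpose returns a view where possible, so no data is copied until something forces it.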
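Old solution (note that reshape only changes the shape and keeps the flat element order, so it does not permute the axes the way transpose does):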
ni = (0, 2, 3, 1)
s = arr.shape
arr = arr.reshape(s[ni[0]], s[ni[1]]...)
Maybe this is what you are looking for:
arr = np.array([[[1, 2], [3, 4], [5, 6]]])
s = arr.shape
new_indexes = (1, 0, 2) # permutation
new_arr = arr.reshape(*[s[index] for index in new_indexes])
print(arr.shape) # (1, 3, 2)
print(new_arr.shape) # (3, 1, 2)
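One caveat worth checking before relying on this: reshape to a permuted shape is not the same operation as transpose. The sketch below uses a made-up (2, 3) array to show the difference. In the (1, 3, 2) example above the two happen to agree only because the axis being moved has length 1.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

reshaped = arr.reshape(3, 2)      # same flat order, new shape
transposed = arr.transpose(1, 0)  # axes swapped, data moves with them

print(reshaped.tolist())    # [[0, 1], [2, 3], [4, 5]]
print(transposed.tolist())  # [[0, 3], [1, 4], [2, 5]]
```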
How can I find the longest consecutive zeros in a 3D array along a specific axis?
import numpy as np
a = np.random.randint(2, size=(10, 10, 10))
I want to find the longest sequence of 0 along axis=0 so that I get a 10x10 array.
In one dimension it works with:
import numpy as np
a = np.random.randint(2, size=100)
condition = (a==0)
L = np.diff(np.where(np.concatenate(([condition[0]],
condition[:-1] != condition[1:],
[True])))[0])[::2]
print(np.max(L))
You could use np.cumsum() to sum up the 1s and 0s along a given dimension.
The idea is that, when you have consecutive zeros the value in the cumsum stays the same. So in the end you want to find the most common value in this array, as its count is exactly the length of the longest sequence of zeros (+1).
import numpy as np
from scipy.stats import mode
# 1D, bincount
a = np.random.randint(2, size=10)
# array([1, 1, 1, 1, 0, 0, 0, 0, 1, 1])
# ^ ^ ^ ^
b = np.cumsum(a)
# array([1, 2, 3, 4, 4, 4, 4, 4, 5, 6])
# ^ ^ ^ ^
c = np.bincount(b)
# array([0, 1, 1, 1, 5, 1, 1])
# ^
res = np.max(c) - 1
# 4
bincount unfortunately works only for 1D arrays, so for the multidimensional case, I switch to scipy.stats.mode, which returns just the modal (most common) value and its count.
# 1D, stats.mode
c2 = mode(b)
# ModeResult(mode=array([4]), count=array([5]))
res = c2[1] - 1
# 3D, stats.mode
from scipy.stats import mode
axis = 0
a = np.random.randint(2, size=(10, 10, 10))
res = mode(np.cumsum(a, axis=axis), axis=axis)[1] - 1
# Note the resulting shape is (1, 10, 10)
# You might want to use np.squeeze() / np.max()
# to get rid of the dimension with size 1
# res = res.max(axis=axis)
EDIT
As @clearseplex pointed out, I didn't think of the case when the array starts with 0.
a = np.array([0, 0, 0, 1, 1, 0, 1, 0, 0, 1])
b = np.cumsum(a)
# array([0, 0, 0, 1, 2, 2, 3, 3, 3, 4])
# ^ ^ ^
Here the most common value, 0, occurs 3 times and there are exactly 3 zeros, so subtracting one would give the wrong result. The count should only be reduced by one where the most common value is not 0:
m, res = mode(np.cumsum(a, axis=axis), axis=axis)
res[m != 0] -= 1
# res = res.max(axis=axis)
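To sanity-check edge cases like the leading-zeros one, a slow reference implementation is handy. The following is only a sketch for comparison, not the cumsum/mode method itself; the helper name `longest_zero_run` and the seeded test array are assumptions made for the example.

```python
from itertools import groupby

import numpy as np


def longest_zero_run(v):
    """Length of the longest run of consecutive zeros in a 1-D sequence."""
    return max(
        (sum(1 for _ in grp) for val, grp in groupby(v) if val == 0),
        default=0,
    )


a = np.array([0, 0, 0, 1, 1, 0, 1, 0, 0, 1])
print(longest_zero_run(a))  # 3

# Apply the 1-D reference along axis 0 of a 3-D array.
cube = np.random.default_rng(0).integers(2, size=(10, 10, 10))
expected = np.apply_along_axis(longest_zero_run, 0, cube)
print(expected.shape)  # (10, 10)
```

Comparing `expected` with the squeezed output of the vectorized version on a few random arrays is a quick way to confirm the correction.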
To perform your task, define the following function:
def longestZeroSeqLength(a):
    # Changes in "isZero" for consecutive elements
    chg = np.abs(np.diff(np.equal(a, 0).view(np.int8), prepend=[0], append=[0]))
    # Ranges of "isZero" elements
    rng = np.where(chg == 1)[0]
    if rng.size == 0:
        return 0  # All non-zero elements
    rng = rng.reshape(-1, 2)
    # Compute length of each range and return the biggest
    return np.subtract(rng[:, 1], rng[:, 0]).max()
Then apply it to your array:
result = np.apply_along_axis(longestZeroSeqLength, 0, a)
To test it, I created the following (smaller) array:
siz = (3, 4, 5)
np.random.seed(1)
a = np.random.randint(2, size=siz)
After running my code I got:
array([[1, 0, 2, 2, 0],
[1, 2, 1, 0, 3],
[3, 1, 1, 1, 0],
[1, 1, 2, 2, 3]], dtype=int64)
To see more easily what each slice contains and what each partial
result is, you can run:
for j in range(a.shape[1]):
    for k in range(a.shape[2]):
        b = a[:, j, k]
        res = longestZeroSeqLength(b)
        print(f'{j}, {k}: {b}, {res}')
I am playing around with different indexing methods. I have the following working example:
import numpy as np
x = np.random.rand(321,321)
a = range(0, 300)
b = range(1, 301)
mask = np.zeros(x.shape, dtype=bool)
# a and b are lists
mask[a, b] = True
assert x[a, b].shape == x[mask].shape # passes
assert np.isclose(np.sum(x[mask]), np.sum(x[a, b])) # passes
assert np.allclose(x[mask], x[a, b]) # fails sometimes
When I try it with a different x for a project, the last assertion fails. Here is a failing case:
import numpy as np
x = np.random.rand(431,431)
a = [0, 1, 1, 1, 2, 2, 2, 3]
b = [1, 0, 2, 4, 3, 1, 11, 2]
mask = np.zeros(x.shape, dtype=bool)
# a and b are lists
mask[a, b] = True
assert x[a, b].shape == x[mask].shape # passes
assert np.isclose(np.sum(x[mask]), np.sum(x[a, b])) # passes
assert np.allclose(x[mask], x[a, b]) # fails
Can anyone explain why this error occurs? I assume it's because mask is indexing into x differently from (a,b), but not sure how.
I want to do this because I'd like to easily get x[~mask]
Any insight would be appreciated!
The problem with your example lies in the order of a and b. If you print x[a, b] and x[mask], you will see that the 5th and 6th elements of x[a, b] are swapped relative to x[mask]. When you set mask[a, b] = True, the order of the index pairs does not matter, but when you index x with a and b, it does: NumPy takes each value from a as the row and the value at the same position in b as the column within that row. To illustrate using a 3x8 array:
a = [0, 1, 1, 1, 2, 2, 2]
b = [1, 0, 2, 4, 3, 1, 7]
x = [[1, 2, 3, 4, 5, 6, 7, 8],
[9, 10, 11, 12, 13, 14, 15, 16],
[17, 18, 19, 20, 21, 22, 23, 24]]
x[a, b] -> [2, 9, 11, 13, 20, 18, 24]
x[mask] -> [2, 9, 11, 13, 18, 20, 24]
A good way to fix this would be to first define a and b as a list of tuples, sort them on their "a-value" first and then on their "b-value" and use them from there. That way you can guarantee the order.
x[a, b] selects elements from x in the order given by a and b. x[a[i], b[i]] will come before x[a[i+1], b[i+1]] in the result.
x[mask] selects elements in the order given by iterating over mask in row-major order to find True cells. This is only the same order as x[a, b] if zip(a, b) is already lexicographically sorted.
In your failing example, 2, 3 comes before 2, 1 in a and b, but iterating over mask in row-major order will find the True at 2, 1 before 2, 3. Thus, x[mask] has x[2, 1] before x[2, 3], while x[a, b] has those elements the other way around.
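The sorting fix can be sketched with np.lexsort, which is one way (an assumption of this example, not something the answers above prescribe) to put the index pairs into row-major order without building a list of tuples. The name `order` is illustrative.

```python
import numpy as np

x = np.random.rand(431, 431)
a = np.array([0, 1, 1, 1, 2, 2, 2, 3])
b = np.array([1, 0, 2, 4, 3, 1, 11, 2])
mask = np.zeros(x.shape, dtype=bool)
mask[a, b] = True

# np.lexsort treats the LAST key as the primary one, so (b, a) means
# "sort by a, break ties by b", which is row-major order.
order = np.lexsort((b, a))
print(order.tolist())  # [0, 1, 2, 3, 5, 4, 6, 7]

# With the pairs sorted, integer indexing matches boolean indexing.
assert np.array_equal(x[mask], x[a[order], b[order]])
```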
As @hpaulj mentioned, the order of the arrays is different:
import numpy as np
np.random.seed(42)
x = np.random.rand(431,431)
a = [0, 1, 1, 1, 2, 2, 2, 3]
b = [1, 0, 2, 4, 3, 1, 11, 2]
mask = np.zeros(x.shape, dtype=bool)
# a and b are lists
mask[a, b] = True
print(x[mask])
print(x[a, b])
Output
[0.95071431 0.76151063 0.10112268 0.70096913 0.44076275 0.55964033
0.40873417 0.20015024]
[0.95071431 0.76151063 0.10112268 0.70096913 0.55964033 0.44076275
0.40873417 0.20015024]
The reason is that the mask returns elements in row-major (C-style) order (see docs), and as for multidimensional indexing:
if the index arrays have a matching shape, and there is an index array
for each dimension of the array being indexed, the resultant array has
the same shape as the index arrays, and the values correspond to the
index set for each position in the index arrays.
In your case the order from the multidimensional indexing is:
[(0, 1), (1, 0), (1, 2), (1, 4), (2, 3), (2, 1), (2, 11), (3, 2)]
and from the mask is:
[(0, 1), (1, 0), (1, 2), (1, 4), (2, 1), (2, 3), (2, 11), (3, 2)]
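The order used by the mask can be inspected directly with np.nonzero, which reports the True positions in row-major order. A small sketch with a made-up 4x5 array (the values of x, a and b here are chosen only to keep the output short):

```python
import numpy as np

x = np.arange(20).reshape(4, 5)
a = [2, 0, 2]
b = [3, 1, 1]
mask = np.zeros(x.shape, dtype=bool)
mask[a, b] = True

rows, cols = np.nonzero(mask)
print(list(zip(rows.tolist(), cols.tolist())))  # [(0, 1), (2, 1), (2, 3)]
print(x[mask].tolist())   # [1, 11, 13]
print(x[a, b].tolist())   # [13, 1, 11]

# The complement is what the mask makes easy to get.
print(x[~mask].size)      # 17
```

x[rows, cols] reproduces x[mask] exactly, so the mask order is always available if the two results need to be aligned.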
What would be the best way of broadcasting two arrays together when a simple call to np.broadcast_to() would fail?
Consider the following example:
import numpy as np
arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
arr2 = np.arange(3 * 5).reshape((3, 5))
arr1 + arr2
# ValueError: operands could not be broadcast together with shapes (2,3,4,5,6) (3,5)
arr2_ = np.broadcast_to(arr2, arr1.shape)
# ValueError: operands could not be broadcast together with remapped shapes
arr2_ = arr2.reshape((1, 3, 1, 5, 1))
arr1 + arr2_
# now this works because the singletons trigger the automatic broadcast
This only works if I manually select a shape for which automatic broadcasting is going to work.
What would be the most efficient way of doing this automatically?
Is there an alternative way other than reshape on a cleverly constructed broadcastable shape?
Note the relation to np.squeeze(): this would perform the inverse operation by removing singletons. So what I need is some sort of np.squeeze() inverse.
The official documentation (as of NumPy 1.13.0) suggests that the inverse of np.squeeze() is np.expand_dims(), but this is not nearly as flexible as I'd need it to be; np.expand_dims() is roughly equivalent to np.reshape(array, shape + (1,)) or array[:, None].
This issue is also related to the keepdims keyword accepted by e.g. sum:
import numpy as np
arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
# not using `keepdims`
arr2 = np.sum(arr1, (0, 2, 4))
arr2.shape
# : (3, 5)
arr1 + arr2
# ValueError: operands could not be broadcast together with shapes (2,3,4,5,6) (3,5)
# now using `keepdims`
arr2 = np.sum(arr1, (0, 2, 4), keepdims=True)
arr2.shape
# : (1, 3, 1, 5, 1)
arr1 + arr2
# now this works because it has the correct shape
EDIT: Obviously, in cases where np.newaxis or keepdims mechanisms are an appropriate choice, there would be no need for an unsqueeze() function.
Yet, there are use-cases where none of these is an option.
For example, consider the case of the weighted average as implemented in numpy.average() over an arbitrary number of dimensions specified by axis.
Right now the weights parameter must have the same shape as the input (or be 1-D along a single axis).
However, there is no need to specify the weights over the non-reduced dimensions, as they are just repeated and NumPy's broadcasting mechanism would take care of them.
So if we would like to have such a functionality, we would need to code something like (where some consistency checks are just omitted for simplicity):
def weighted_average(arr, weights=None, axis=None):
    if weights is not None and weights.shape != arr.shape:
        weights = unsqueeze(weights, ...)
        weights = np.zeros_like(arr) + weights
    result = np.sum(arr * weights, axis=axis)
    result /= np.sum(weights, axis=axis)
    return result
or, equivalently:
def weighted_average(arr, weights=None, axis=None):
    if weights is not None and weights.shape != arr.shape:
        weights = unsqueeze(weights, ...)
        weights = np.zeros_like(arr) + weights
    # np.average takes (a, axis, weights), so pass these by keyword
    return np.average(arr, axis=axis, weights=weights)
In either of the two, it is not possible to replace unsqueeze() with weights[:, np.newaxis]-like statements because we do not know beforehand where the new axis will be needed, nor can we use the keepdims feature of sum because the code will fail at arr * weights.
This case could be handled relatively nicely if np.expand_dims() supported an iterable of ints for its axis parameter, but as of NumPy 1.13.0 it does not.
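For readers on a newer release: np.expand_dims() does accept a tuple of axes from NumPy 1.18 onward. The sketch below assumes such a version and reuses the arrays from the question; it is not applicable to 1.13.0.

```python
import numpy as np

arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
arr2 = np.arange(3 * 5).reshape((3, 5))

# Assumes NumPy >= 1.18: `axis` may be a tuple of positions in the
# *expanded* array at which singleton dimensions are inserted.
arr2_ = np.expand_dims(arr2, axis=(0, 2, 4))
print(arr2_.shape)           # (1, 3, 1, 5, 1)
print((arr1 + arr2_).shape)  # (2, 3, 4, 5, 6)
```

This covers the case where the singleton positions are known; it does not infer them from a target shape.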
My way of achieving this is by defining the following unsqueezing() function to handle cases where this can be done automatically and giving a warning when the inputs could be ambiguous (e.g. when some source elements of the source shape may match multiple elements of the target shape):
import warnings
def unsqueezing(
        source_shape,
        target_shape):
    """
Generate a broadcasting-compatible shape.
The resulting shape contains *singletons* (i.e. `1`) for non-matching dims.
Assumes all elements of the source shape are contained in the target shape
(except for singletons) in the correct order.
Warning! The generated shape may not be unique if some of the elements
from the source shape are present multiple times in the target shape.
Args:
source_shape (Sequence): The source shape.
target_shape (Sequence): The target shape.
Returns:
shape (tuple): The broadcast-safe shape.
Raises:
ValueError: if elements of `source_shape` are not in `target_shape`.
Examples:
For non-repeating elements, `unsqueezing()` is always well-defined:
>>> unsqueezing((2, 3), (2, 3, 4))
(2, 3, 1)
>>> unsqueezing((3, 4), (2, 3, 4))
(1, 3, 4)
>>> unsqueezing((3, 5), (2, 3, 4, 5, 6))
(1, 3, 1, 5, 1)
>>> unsqueezing((1, 3, 5, 1), (2, 3, 4, 5, 6))
(1, 3, 1, 5, 1)
If there is nothing to unsqueeze, the `source_shape` is returned:
>>> unsqueezing((1, 3, 1, 5, 1), (2, 3, 4, 5, 6))
(1, 3, 1, 5, 1)
>>> unsqueezing((2, 3), (2, 3))
(2, 3)
If some elements in `source_shape` are repeating in `target_shape`,
a user warning will be issued:
>>> unsqueezing((2, 2), (2, 2, 2, 2, 2))
(2, 2, 1, 1, 1)
>>> unsqueezing((2, 2), (2, 3, 2, 2, 2))
(2, 1, 2, 1, 1)
If some elements of `source_shape` are not present in `target_shape`,
an error is raised.
>>> unsqueezing((2, 3), (2, 2, 2, 2, 2))
Traceback (most recent call last):
...
ValueError: Target shape must contain all source shape elements\
(in correct order). (2, 3) -> (2, 2, 2, 2, 2)
>>> unsqueezing((5, 3), (2, 3, 4, 5, 6))
Traceback (most recent call last):
...
ValueError: Target shape must contain all source shape elements\
(in correct order). (5, 3) -> (2, 3, 4, 5, 6)
"""
    shape = []
    j = 0
    for i, dim in enumerate(target_shape):
        if j < len(source_shape):
            shape.append(dim if dim == source_shape[j] else 1)
            if i + 1 < len(target_shape) and dim == source_shape[j] \
                    and dim != 1 and dim in target_shape[i + 1:]:
                text = ('Multiple positions (e.g. {} and {})'
                        ' for source shape element {}.'.format(
                            i, target_shape[i + 1:].index(dim) + (i + 1), dim))
                warnings.warn(text)
            if dim == source_shape[j] or source_shape[j] == 1:
                j += 1
        else:
            shape.append(1)
    if j < len(source_shape):
        raise ValueError(
            'Target shape must contain all source shape elements'
            ' (in correct order). {} -> {}'.format(source_shape, target_shape))
    return tuple(shape)
This can be used to define unsqueeze() as a more flexible inverse of np.squeeze() compared to np.expand_dims() which can only append one singleton at a time:
def unsqueeze(
        arr,
        axis=None,
        shape=None,
        reverse=False):
    """
Add singletons to the shape of an array to broadcast-match a given shape.
In some sense, this function implements the inverse of `numpy.squeeze()`.
Args:
arr (np.ndarray): The input array.
axis (int|Iterable|None): Axis or axes in which to operate.
If None, a valid set of axes is generated from `shape` when this is
defined and the shape can be matched by `unsqueezing()`.
If int or Iterable, specifies how singletons are added.
This depends on the value of `reverse`.
If `shape` is not None, the `axis` and `shape` parameters must be
consistent.
Values must be in the range [-(ndim+1), ndim+1]
At least one of `axis` and `shape` must be specified.
shape (int|Iterable|None): The target shape.
If None, no safety checks are performed.
If int, this is interpreted as the number of dimensions of the
output array.
If Iterable, the result must be broadcastable to an array with the
specified shape.
If `axis` is not None, the `axis` and `shape` parameters must be
consistent.
At least one of `axis` and `shape` must be specified.
reverse (bool): Interpret `axis` parameter as its complementary.
If True, the dims of the input array are placed at the positions
indicated by `axis`, and singletons are placed everywhere else and
the `axis` length must be equal to the number of dimensions of the
input array; the `shape` parameter cannot be `None`.
If False, the singletons are added at the position(s) specified by
`axis`.
If `axis` is None, `reverse` has no effect.
Returns:
arr (np.ndarray): The reshaped array.
Raises:
ValueError: if the `arr` shape cannot be reshaped correctly.
Examples:
Let's define some input array `arr`:
>>> arr = np.arange(2 * 3 * 4).reshape((2, 3, 4))
>>> arr.shape
(2, 3, 4)
A call to `unsqueeze()` can be reversed by `np.squeeze()`:
>>> arr_ = unsqueeze(arr, (0, 2, 4))
>>> arr_.shape
(1, 2, 1, 3, 1, 4)
>>> arr = np.squeeze(arr_, (0, 2, 4))
>>> arr.shape
(2, 3, 4)
The order of the axes does not matter:
>>> arr_ = unsqueeze(arr, (0, 4, 2))
>>> arr_.shape
(1, 2, 1, 3, 1, 4)
If `shape` is an int, `axis` must be consistent with it:
>>> arr_ = unsqueeze(arr, (0, 2, 4), 6)
>>> arr_.shape
(1, 2, 1, 3, 1, 4)
>>> arr_ = unsqueeze(arr, (0, 2, 4), 7)
Traceback (most recent call last):
...
ValueError: Incompatible `[0, 2, 4]` axis and `7` shape for array of\
shape (2, 3, 4)
It is possible to reverse the meaning of `axis` to add singletons
everywhere except where specified (but requires `shape` to be defined
and the length of `axis` must match the array dims):
>>> arr_ = unsqueeze(arr, (0, 2, 4), 10, True)
>>> arr_.shape
(2, 1, 3, 1, 4, 1, 1, 1, 1, 1)
>>> arr_ = unsqueeze(arr, (0, 2, 4), reverse=True)
Traceback (most recent call last):
...
ValueError: When `reverse` is True, `shape` cannot be None.
>>> arr_ = unsqueeze(arr, (0, 2), 10, True)
Traceback (most recent call last):
...
ValueError: When `reverse` is True, the length of axis (2) must match\
the num of dims of array (3).
Axes values must be valid:
>>> arr_ = unsqueeze(arr, 0)
>>> arr_.shape
(1, 2, 3, 4)
>>> arr_ = unsqueeze(arr, 3)
>>> arr_.shape
(2, 3, 4, 1)
>>> arr_ = unsqueeze(arr, -1)
>>> arr_.shape
(2, 3, 4, 1)
>>> arr_ = unsqueeze(arr, -4)
>>> arr_.shape
(1, 2, 3, 4)
>>> arr_ = unsqueeze(arr, 10)
Traceback (most recent call last):
...
ValueError: Axis (10,) out of range.
If `shape` is specified, `axis` can be omitted (USE WITH CARE!) or its
value is used for additional safety checks:
>>> arr_ = unsqueeze(arr, shape=(2, 3, 4, 5, 6))
>>> arr_.shape
(2, 3, 4, 1, 1)
>>> arr_ = unsqueeze(
... arr, (3, 6, 8), (2, 5, 3, 2, 7, 2, 3, 2, 4, 5, 6), True)
>>> arr_.shape
(1, 1, 1, 2, 1, 1, 3, 1, 4, 1, 1)
>>> arr_ = unsqueeze(
... arr, (3, 7, 8), (2, 5, 3, 2, 7, 2, 3, 2, 4, 5, 6), True)
Traceback (most recent call last):
...
ValueError: New shape [1, 1, 1, 2, 1, 1, 1, 3, 4, 1, 1] cannot be\
broadcasted to shape (2, 5, 3, 2, 7, 2, 3, 2, 4, 5, 6)
>>> arr = unsqueeze(arr, shape=(2, 5, 3, 7, 2, 4, 5, 6))
>>> arr.shape
(2, 1, 3, 1, 1, 4, 1, 1)
>>> arr = np.squeeze(arr)
>>> arr.shape
(2, 3, 4)
>>> arr = unsqueeze(arr, shape=(5, 3, 7, 2, 4, 5, 6))
Traceback (most recent call last):
...
ValueError: Target shape must contain all source shape elements\
(in correct order). (2, 3, 4) -> (5, 3, 7, 2, 4, 5, 6)
The behavior is consistent with other NumPy functions and the
`keepdims` mechanism:
>>> axis = (0, 2, 4)
>>> arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
>>> arr2 = np.sum(arr1, axis, keepdims=True)
>>> arr2.shape
(1, 3, 1, 5, 1)
>>> arr3 = np.sum(arr1, axis)
>>> arr3.shape
(3, 5)
>>> arr3 = unsqueeze(arr3, axis)
>>> arr3.shape
(1, 3, 1, 5, 1)
>>> np.all(arr2 == arr3)
True
"""
    # calculate `new_shape`
    if axis is None and shape is None:
        raise ValueError(
            'At least one of `axis` and `shape` parameters must be specified.')
    elif axis is None and shape is not None:
        new_shape = unsqueezing(arr.shape, shape)
    elif axis is not None:
        if isinstance(axis, int):
            axis = (axis,)
        # calculate the dim of the result
        if shape is not None:
            if isinstance(shape, int):
                ndim = shape
            else:  # shape is a sequence
                ndim = len(shape)
        elif not reverse:
            ndim = len(axis) + arr.ndim
        else:
            raise ValueError('When `reverse` is True, `shape` cannot be None.')
        # check that axis is properly constructed
        if any([ax < -ndim - 1 or ax > ndim + 1 for ax in axis]):
            raise ValueError('Axis {} out of range.'.format(axis))
        # normalize axis using `ndim`
        axis = sorted([ax % ndim for ax in axis])
        # manage reverse mode
        if reverse:
            if len(axis) == arr.ndim:
                axis = [i for i in range(ndim) if i not in axis]
            else:
                raise ValueError(
                    'When `reverse` is True, the length of axis ({})'
                    ' must match the num of dims of array ({}).'.format(
                        len(axis), arr.ndim))
        elif len(axis) + arr.ndim != ndim:
            raise ValueError(
                'Incompatible `{}` axis and `{}` shape'
                ' for array of shape {}'.format(axis, shape, arr.shape))
        # generate the new shape from axis, ndim and shape
        new_shape = []
        i, j = 0, 0
        for l in range(ndim):
            if i < len(axis) and l == axis[i] or j >= arr.ndim:
                new_shape.append(1)
                i += 1
            else:
                new_shape.append(arr.shape[j])
                j += 1
    # check that `new_shape` is consistent with `shape`
    if shape is not None:
        if isinstance(shape, int):
            if len(new_shape) != ndim:
                raise ValueError(
                    'Length of new shape {} does not match '
                    'expected length ({}).'.format(len(new_shape), ndim))
        else:
            if not all([new_dim == 1 or new_dim == dim
                        for new_dim, dim in zip(new_shape, shape)]):
                raise ValueError(
                    'New shape {} cannot be broadcasted to shape {}'.format(
                        new_shape, shape))
    return arr.reshape(new_shape)
Using these, one can write:
import numpy as np
arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
arr2 = np.arange(3 * 5).reshape((3, 5))
arr3 = unsqueeze(arr2, (0, 2, 4))
arr1 + arr3
# now this works because it has the correct shape
arr3 = unsqueeze(arr2, shape=arr1.shape)
arr1 + arr3
# this also works because the shape can be expanded unambiguously
So dynamic broadcast can now happen, and this is consistent with the behavior of keepdims:
import numpy as np
axis = (0, 2, 4)
arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
arr2 = np.sum(arr1, axis, keepdims=True)
arr3 = np.sum(arr1, axis)
arr3 = unsqueeze(arr3, axis)
np.all(arr2 == arr3)
# : True
Effectively, this extends np.expand_dims() to handle more complex scenarios.
Improvements over this code are obviously more than welcome.
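When the positions of the real (non-singleton) axes are known, the core of the reverse mode can be written much more compactly. The following is a minimal sketch under that assumption; `place_axes` is a hypothetical helper introduced here, not part of NumPy or of the code above, and it does none of the shape inference or ambiguity warnings that unsqueezing() provides.

```python
import numpy as np


def place_axes(arr, axes, ndim):
    """Reshape `arr` so that its k-th dim sits at position axes[k] of an
    `ndim`-dimensional array, with singletons everywhere else."""
    axes = tuple(axes)
    if len(axes) != arr.ndim:
        raise ValueError('need one target position per input dimension')
    if any(not 0 <= ax < ndim for ax in axes) or list(axes) != sorted(set(axes)):
        raise ValueError('positions must be distinct, increasing and in range')
    new_shape = [1] * ndim
    for ax, size in zip(axes, arr.shape):
        new_shape[ax] = size
    return arr.reshape(new_shape)


arr1 = np.arange(2 * 3 * 4 * 5 * 6).reshape((2, 3, 4, 5, 6))
arr2 = np.arange(3 * 5).reshape((3, 5))
arr3 = place_axes(arr2, (1, 3), arr1.ndim)
print(arr3.shape)           # (1, 3, 1, 5, 1)
print((arr1 + arr3).shape)  # (2, 3, 4, 5, 6)
```

Requiring increasing positions keeps this a pure reshape; allowing arbitrary orders would need a transpose as well.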
I am trying to take the dot product between three numpy arrays. However, I am struggling with wrapping my head around this.
The problem is as follows:
I have two (4,)-shaped numpy arrays, a and c, as well as a numpy array b with shape (4, 4, 3).
import numpy as np
a = np.array([0, 1, 2, 3])
b = np.array([[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
[[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
[[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]]])
c = np.array([4, 5, 6, 7])
I want to compute the dot product so that the result is a 3-tuple: first dot a with b, then dot the result with c, taking transposes if needed. In other words, I want to compute the product of a, b and c as if b had shape (4, 4), but with a 3-tuple as the result.
I have tried:
Reshaping a and c, and then computing the dot product:
a = np.reshape(a, (4, 1))
c = np.reshape(c, (4, 1))
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
result = np.dot(tmp, c)
Ideally, I should now have:
print(result.shape)
>> (1, 1, 3)
but I get the error
ValueError: shapes (1,4,3) and (4,1) not aligned: 3 (dim 2) != 4 (dim 0)
I have also tried using the tensordot function from numpy, but without luck.
The basic dot(A, B) rule is: the last axis of A is paired with the 2nd-to-last axis of B (or with B's only axis if B is 1-D).
In [965]: a.shape
Out[965]: (4,)
In [966]: b.shape
Out[966]: (4, 4, 3)
a (and c) is 1-D. Its (4,) axis can be dotted with the 2nd-to-last (4) axis of b:
In [967]: np.dot(a,b).shape
Out[967]: (4, 3)
Using c in the same way on the output produces a (3,) array:
In [968]: np.dot(c, np.dot(a,b))
Out[968]: array([360, 360, 360])
This combination may be clearer with the equivalent einsum:
In [971]: np.einsum('i,jik,j->k',a,b,c)
Out[971]: array([360, 360, 360])
But what if we want a to act on the 1st axis of b? With einsum that's easy to do:
In [972]: np.einsum('i,ijk,j->k',a,b,c)
Out[972]: array([440, 440, 440])
To do the same with the dot, we could just switch a and c:
In [973]: np.dot(a, np.dot(c,b))
Out[973]: array([440, 440, 440])
Or transpose axes of b:
In [974]: np.dot(c, np.dot(a,b.transpose(1,0,2)))
Out[974]: array([440, 440, 440])
This transposition question would be clearer if a and c had different lengths, e.g. a (2,) and a (4,) array with a (2,4,3) or (4,2,3) array.
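A sketch of that different-lengths case, with made-up values, shows which pairings dot accepts. Only the shapes matter here; the contents of a, b and c are arbitrary.

```python
import numpy as np

a = np.array([1, 2])                       # shape (2,)
c = np.array([1, 2, 3, 4])                 # shape (4,)
b = np.arange(2 * 4 * 3).reshape(2, 4, 3)  # shape (2, 4, 3)

# einsum names the axes explicitly, so nothing is ambiguous.
ref = np.einsum('i,ijk,j->k', a, b, c)

# dot pairs c (4) with the 2nd-to-last axis of b (4) ...
step = np.dot(c, b)    # shape (2, 3)
# ... and then a (2) with the remaining length-2 axis.
out = np.dot(a, step)  # shape (3,)
assert np.array_equal(out, ref)

# Starting with a fails: its length (2) meets b's 2nd-to-last axis (4).
failed = False
try:
    np.dot(a, b)
except ValueError as err:
    failed = True
    print('np.dot(a, b) failed:', err)
```

With equal lengths both orders run without error, which is why the 360 versus 440 difference is easy to miss.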
In
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
you have a (1,4a) dotted with (4,4a,3). The result is (1,4,3). I added the a to identify when axes were combined.
To apply the (4,1) c, we have to do the same transpose:
In [977]: np.dot(c[:,None].T, np.dot(a[:,None].T, b))
Out[977]: array([[[360, 360, 360]]])
In [978]: _.shape
Out[978]: (1, 1, 3)
np.dot(c[None,:], np.dot(a[None,:], b)) would do the same without the transposes.
I was hoping numpy would automagically distribute over the last axis. That is, that the dot product would run over the two first axes, if that makes sense.
Given the dot rule that I cited at the start this does not make sense. But if we transpose b so the (3) axis is first, it can 'carry that along', using the last and 2nd to the last.
In [986]: b.transpose(2,0,1).shape
Out[986]: (3, 4, 4)
In [987]: np.dot(a, b.transpose(2,0,1)).shape
Out[987]: (3, 4)
In [988]: np.dot(np.dot(a, b.transpose(2,0,1)),c)
Out[988]: array([440, 440, 440])
(4a).(3, 4a, 4c) -> (3, 4c)
(3, 4c). (4c) -> 3
Not automagical but does the job:
np.einsum('i,ijk,j->k',a,b,c)
# array([440, 440, 440])
This computes d of shape (3,) such that d_k = sum_{ij} a_i b_{ijk} c_j.
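Since tensordot was mentioned in the question, here is a sketch of the same contraction written with it. The array b is rebuilt with np.repeat instead of the literal from the question, so that b[i, j, k] == i + 1; the intermediate name `ab` is illustrative.

```python
import numpy as np

a = np.array([0, 1, 2, 3])
c = np.array([4, 5, 6, 7])
b = np.repeat(np.arange(1, 5), 12).reshape(4, 4, 3)  # b[i, j, k] == i + 1

# Contract axis 0 of a with axis 0 of b: the result is indexed by (j, k).
ab = np.tensordot(a, b, axes=(0, 0))   # shape (4, 3)
# Contract c with the remaining j axis.
d = np.tensordot(c, ab, axes=(0, 0))   # shape (3,)
print(d)  # [440 440 440]

assert np.array_equal(d, np.einsum('i,ijk,j->k', a, b, c))
```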
You are multiplying a (1,4,3) array by a (4,1) matrix, which is not possible as written, because the (1,4,3) array holds 3 pages of (1,4) matrices. If you want to multiply each page by c, just multiply each page separately.
a = np.array([0, 1, 2, 3])
b = np.array([[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
[[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
[[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]]])
c = np.array([4, 5, 6, 7])
a = np.reshape(a, (4, 1))
c = np.reshape(c, (4, 1))
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
result = np.dot(tmp[:,:,0], c)
for i in range(1, 3):
    result = np.dstack((result, np.dot(tmp[:, :, i], c)))
print(np.shape(result))
This gives a result of shape (1, 1, 3).