I have a three-dimensional array A with shape (5774, 15, 100) and a 1D array B with shape (5774,). I want to append B to A in order to get another array C with shape (5774, 15, 101).
I am using hstack as
C = np.hstack((A, np.array(B)[:, None]))
I am getting the error below. Any suggestions?
ValueError: could not broadcast input array from shape (5774,15,100) into shape (5774)
You'd need to use np.concatenate along the last axis (np.hstack joins arrays along axis 1, which is not the axis you want here). Before that, you need to use np.broadcast_to to expand that (5774,) shaped array to (5774, 15, 1), because concatenate needs all the arrays to have the same number of dimensions and matching sizes on every axis except the one being joined.
C = np.concatenate((A,
                    np.broadcast_to(np.array(B)[:, None, None], A.shape[:-1] + (1,))),
                   axis=-1)
Checking:
A = np.random.rand(5774, 15, 100)
B = np.random.rand(5774)
C = np.concatenate((A,
                    np.broadcast_to(np.array(B)[:, None, None], A.shape[:-1] + (1,))),
                   axis=-1)
C.shape
Out: (5774, 15, 101)
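The shape check above does not confirm that the values ended up in the right place. As a rough sketch (using small stand-in shapes that are my own assumption, not the real (5774, 15, 100) data), one could also verify that the original block is untouched and that the new last column repeats B[i] along the middle axis:

```python
import numpy as np

# Small stand-in shapes (assumption; the question uses (5774, 15, 100) and (5774,)).
A = np.random.rand(6, 4, 3)
B = np.random.rand(6)

# Give B two trailing length-1 axes, broadcast it to (6, 4, 1),
# then append it along the last axis.
B_col = np.broadcast_to(B[:, None, None], A.shape[:-1] + (1,))
C = np.concatenate((A, B_col), axis=-1)

print(C.shape)                         # (6, 4, 4)
print(np.array_equal(C[..., :-1], A))  # True: the original data is unchanged
# Every row of sub-array i carries B[i] in its last column.
print(all(np.all(C[i, :, -1] == B[i]) for i in range(A.shape[0])))  # True
```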
I currently have an ndarray of shape (27,) where each array entry is an array of shape (121,61). I'd like to reshape the ndarray to a new size of (3267, 61), which is just expanding/flattening the nested arrays into one.
I've tried using the .resize(3267, 61) and .reshape(3267, 61) but when I do, the following error appears:
ValueError: cannot reshape array of size 27 into shape (3267, 61)
ValueError: cannot resize this array: it does not own its data
You can use np.stack() to turn a sequence of arrays into a single ndarray, which can then be reshaped as you need:
>>> a = np.zeros((27,), dtype=object)
>>> for i in range(a.shape[0]):
...     a[i] = np.zeros((121, 61))
>>> b = np.stack(a).reshape((27*121, 61))
>>> b.shape
(3267, 61)
If the array is indeed a 1D object array whose elements are themselves ndarrays (as @CamiloMartínez showed can happen), then use:
b = np.concatenate(a)
If instead the array is just a 3D array (e.g. obtained by putting together a list of arrays and letting numpy optimize it), then use:
b = a.reshape(-1, a.shape[-1])
General case: if you are unsure, the following works in either case. It also works when a is a 2D (or higher-dimensional) array containing arrays (as @drod31 asked in the comments):
b = np.stack(a.ravel())
b = b.reshape(-1, b.shape[-1])
Here is a minimal example:
case 1 (thanks @CamiloMartínez for the setup):
a = np.empty((27,), dtype=object)
for i in range(a.shape[0]):
    a[i] = np.zeros((121, 61))
b = np.concatenate(a)
>>> b.shape
(3267, 61)
case 2 (my initial setup, that missed the actual array of array condition):
a = np.array([np.zeros((121, 61)) for _ in range(27)])
b = a.reshape(-1, a.shape[-1])
>>> b.shape
(3267, 61)
In any case, you'd usually like to express the transformation without explicit hardcoded dimensions, for more general use.
Corner-case example (as per @drod31's question):
a = np.empty((15, 27), dtype=object)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        a[i, j] = np.zeros((121, 61))
>>> a.shape
(15, 27)
>>> a[0,0].shape
(121, 61)
b = np.stack(a.ravel())
b = b.reshape(-1, b.shape[-1])
>>> b.shape
(49005, 61)
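To tie the cases together, here is a minimal sketch of a helper that handles both the object-array case and the plain numeric case. The name flatten_nested is hypothetical (it is not a NumPy function), and the sketch assumes all inner arrays share the same shape:

```python
import numpy as np

def flatten_nested(a):
    """Hypothetical helper: merge an object array of equally shaped 2D arrays,
    or a plain numeric array, into one 2D array with the same last axis."""
    a = np.asarray(a)
    if a.dtype == object:
        # Stack the inner arrays along a new leading axis first.
        a = np.stack(a.ravel())
    # Collapse every axis except the last one.
    return a.reshape(-1, a.shape[-1])

# Object array holding 27 separate (121, 61) arrays.
nested = np.empty((27,), dtype=object)
for i in range(nested.shape[0]):
    nested[i] = np.zeros((121, 61))

# Plain 3D numeric array with the same content.
plain = np.zeros((27, 121, 61))

print(flatten_nested(nested).shape)  # (3267, 61)
print(flatten_nested(plain).shape)   # (3267, 61)
```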
One can use numpy.where for selecting values from two arrays depending on a condition:
import numpy
a = numpy.random.rand(5)
b = numpy.random.rand(5)
c = numpy.where(a > 0.5, a, b) # okay
If the array has more dimensions, however, this does not work anymore:
import numpy
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a[:, 0] > 0.5, a, b) # !
Traceback (most recent call last):
File "p.py", line 10, in <module>
c = numpy.where(a[:, 0] > 0.5, a, b) # okay
File "<__array_function__ internals>", line 6, in where
ValueError: operands could not be broadcast together with shapes (5,) (5,2) (5,2)
I would have expected a numpy array of shape (5,2).
What's the issue here? How to work around it?
Remember that broadcasting in numpy aligns shapes from the right, so while (5,) shaped arrays can broadcast with (2,5) shaped arrays, they can't broadcast with (5,2) shaped arrays. To broadcast with a (5,2) shaped array you need to keep the second dimension so that the shape is (5,1) (a dimension of size 1 can broadcast with anything).
Thus, you need to keep the second dimension when indexing (indexing with a single integer removes that dimension). You can do this by putting the index in a one-element list:
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a[:, [0]] > 0.5, a, b) # works
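If it helps, the shape rules can be checked directly. The sketch below is only an illustration; it assumes NumPy 1.20 or later for np.broadcast_shapes, and it uses a length-1 slice as one more way to keep the axis:

```python
import numpy as np

a = np.random.rand(5, 2)
b = np.random.rand(5, 2)

print(a[:, 0].shape)    # (5,)   an integer index drops the axis
print(a[:, [0]].shape)  # (5, 1) a one-element list keeps it
print(a[:, :1].shape)   # (5, 1) so does a length-1 slice

# np.broadcast_shapes reports the combined shape, or raises ValueError.
print(np.broadcast_shapes((5, 1), (5, 2)))  # (5, 2)
try:
    np.broadcast_shapes((5,), (5, 2))
except ValueError:
    print("shapes (5,) and (5, 2) do not broadcast")

# Each row comes from a where a[row, 0] > 0.5, otherwise from b.
c = np.where(a[:, :1] > 0.5, a, b)
print(c.shape)  # (5, 2)
```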
You can use c = numpy.where(a > 0.5, a, b).
However, if you want to use only the first column of a, then you need to consider the shape of the condition.
Let's first look at the shape of this operation:
(a[:, 0] > 0.5).shape # outputs (5,)
It's one-dimensional, while a and b have shape (5, 2), which is two-dimensional, so the mask can't be broadcast against them.
The solution is to reshape the mask to shape (5, 1).
Your code should look like this:
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where((a[:, 0] > 0.5).reshape(-1, 1), a, b) # works
You can try:
import numpy
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a > 0.5, a, b)
Instead of: c = np.where(a > 0.5, a, b)
you can use: c = np.choose((a > 0.5).astype(int), [b, a])
which works for multidimensional arrays if a and b have the same shape.
Suppose I have a 5x10x3 array, which I interpret as 5 'sub-arrays', each consisting of 10 rows and 3 columns. I also have a separate 1D array of length 5, which I call b.
I am trying to insert a new column into each sub-array, where the column inserted into the ith (i=0,1,2,3,4) sub-array is a 10x1 vector where each element is equal to b[i].
For example:
import numpy as np
np.random.seed(777)
A = np.random.rand(5,10,3)
b = np.array([2,4,6,8,10])
A[0] should look like:
A[1] should look like:
And similarly for the other 'sub-arrays'.
(Notice b[0]=2 and b[1]=4)
What about this?
# Make an array B with the same number of dimensions as A
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0) # shape: (5, 10, 1)
# Concatenate both
np.concatenate([A, B], axis=-1) # shape: (5, 10, 4)
One method would be np.pad:
np.pad(A, ((0,0),(0,0),(0,1)), 'constant', constant_values=[[[],[]],[[],[]],[[],b[:, None,None]]])
# array([[[9.36513084e-01, 5.33199169e-01, 1.66763960e-02, 2.00000000e+00],
# [9.79060284e-02, 2.17614285e-02, 4.72452812e-01, 2.00000000e+00],
# etc.
Or (more typing but probably faster):
i,j,k = A.shape
res = np.empty((i,j,k+1), np.result_type(A, b))
res[...,:-1] = A
res[...,-1] = b[:, None]
Or dstack after broadcast_to:
np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])
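As a sanity check, here is a small sketch comparing two of the approaches above on the example data. The random generator and variable names are my own choices, so the values will differ from the seed-777 output shown earlier:

```python
import numpy as np

rng = np.random.default_rng(777)  # assumption: any random data will do here
A = rng.random((5, 10, 3))
b = np.array([2, 4, 6, 8, 10])

# Variant 1: broadcast b to (5, 10, 1) and concatenate on the last axis.
B = np.broadcast_to(b[:, None, None], A.shape[:2] + (1,))
out_concat = np.concatenate([A, B], axis=-1)

# Variant 2: preallocate the result and fill it in two assignments.
out_fill = np.empty(A.shape[:2] + (A.shape[2] + 1,), np.result_type(A, b))
out_fill[..., :-1] = A
out_fill[..., -1] = b[:, None]

print(out_concat.shape)                      # (5, 10, 4)
print(np.array_equal(out_concat, out_fill))  # True
print(out_concat[1, :, -1])                  # ten entries, all equal to b[1] = 4.0
```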
I have two numpy arrays: one array x with shape (n, a0, a1, ...) and one array k with shape (n, b0, b1, ...). I would like to compute an array of exponentials such that the output has dimension (a0, a1, ..., b0, b1, ...) and
out[i0, i1, ..., j0, j1, ...] == prod(x[:, i0, i1, ...] ** k[:, j0, j1, ...])
If there is only one a_i and one b_j, broadcasting does the trick via
import numpy
x = numpy.random.rand(2, 31)
k = numpy.random.randint(1, 10, size=(2, 101))
out = numpy.prod(x[..., None]**k[:, None], axis=0)
If x has a few dimensions more, more Nones have to be added:
x = numpy.random.rand(2, 31, 32, 33)
k = numpy.random.randint(1, 10, size=(2, 101))
out = numpy.prod(x[..., None]**k[:, None, None, None], axis=0)
If k has a few dimensions more, more Nones have to be added at other places:
x = numpy.random.rand(2, 31)
k = numpy.random.randint(1, 10, size=(2, 51, 51))
out = numpy.prod(x[..., None, None]**k[:, None], axis=0)
How to make the computation of out generic with respect to the dimensionality of the input arrays?
Here's one using reshaping of the two arrays so that they are broadcastable against each other and then performing those operations and prod reduction along the first axis -
k0_shp = [k.shape[0]] + [1]*(x.ndim-1) + list(k.shape[1:])
x0_shp = list(x.shape) + [1]*(k.ndim-1)
out = (x.reshape(x0_shp) ** k.reshape(k0_shp)).prod(0)
Here's another way to reshape both inputs to 3D allowing one singleton dim per input and such that they are broadcastable against each other, perform prod reduction to get 2D array, then reshape back to multi-dim array -
s = x.shape[1:] + k.shape[1:] # output shape
out = (x.reshape(x.shape[0],-1,1)**k.reshape(k.shape[0],1,-1)).prod(0).reshape(s)
It must be noted that reshaping a contiguous array merely creates a view into the input array and as such is virtually free both memory-wise and performance-wise.
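For what it's worth, the second approach can be wrapped in a function and spot-checked against the definition. This is only a sketch; the name prod_of_powers is hypothetical and the test shapes are arbitrary:

```python
import numpy as np

def prod_of_powers(x, k):
    """Hypothetical helper: out[i..., j...] = prod over n of x[n, i...] ** k[n, j...]."""
    n = x.shape[0]
    out_shape = x.shape[1:] + k.shape[1:]
    # Collapse the trailing axes of each input, then broadcast
    # (n, P, 1) against (n, 1, Q) and reduce over the first axis.
    out = (x.reshape(n, -1, 1) ** k.reshape(n, 1, -1)).prod(axis=0)
    return out.reshape(out_shape)

x = np.random.rand(2, 3, 4)
k = np.random.randint(1, 10, size=(2, 5, 6))
out = prod_of_powers(x, k)
print(out.shape)  # (3, 4, 5, 6)

# Spot-check one entry against the definition from the question.
expected = np.prod(x[:, 1, 2] ** k[:, 3, 4])
print(np.isclose(out[1, 2, 3, 4], expected))  # True
```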
Without fully understanding the math of what you're doing, it seems that the number of Nones you need is determined by the number of dimensions of x and k.
Does something like this work?
out = numpy.prod(x[(...,) + (None,)*(k.ndim-1)] ** k[(slice(None),) + (None,)*(x.ndim-1)], axis=0)
Here are the indices separately so they're a bit easier to read:
x[ (...,) + (None,)*(k.ndim-1) ]
k[ (slice(None),) + (None,)*(x.ndim-1) ]
Note that the index is built as a tuple: recent NumPy versions interpret a list index as fancy indexing rather than as a multidimensional index.
Compatibility Note:
A bare ... outside a subscript seems to only be valid in Python 3.x. If you are using 2.7 (I haven't tested lower), substitute Ellipsis instead:
x[ (Ellipsis,) + (None,)*(k.ndim-1) ]
I'm trying to do an integration with numpy:
A = np.trapz(B,C)
but I have some issues with the shapes of B and C.
B is a filled array initialized with the numpy zeros function:
B = np.zeros((N,1))
C is a column extracted from a matrix D, also initialized with numpy:
D = np.zeros((N,2))
C = D[:,0]
The problem is that:
np.shape(B) # (N,1)
np.shape(C) # (N,)
How can I manage this?
Try
B = np.zeros(N)
np.trapz(B, C)
Also, np.trapz accepts multi-dimensional arrays, so arrays of shape (N, 1) are OK; you just need to specify the axis to integrate along.
B = np.zeros((N, 1))
C = D[:, 0]
np.trapz(B, C.reshape(N, 1), axis=0)
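Here is a small sketch of both options with non-trivial data, so the result can be compared with a known integral. The integrand x**2 on [0, 1] and the value of N are my own choices for illustration. As far as I know, np.trapz was renamed to np.trapezoid in NumPy 2.0, so the sketch picks whichever name is available:

```python
import numpy as np

# Use np.trapezoid where it exists (NumPy 2.x), otherwise fall back to np.trapz.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

N = 101
C = np.linspace(0.0, 1.0, N)  # sample points, shape (N,)
B = (C ** 2).reshape(N, 1)    # integrand stored as a column, shape (N, 1)

# Option 1: flatten B so that both arguments are 1D.
area_flat = trapezoid(B.ravel(), C)

# Option 2: keep B as a column and integrate along axis 0.
area_col = trapezoid(B, C.reshape(N, 1), axis=0)

print(area_flat)       # close to 1/3, the exact integral of x**2 on [0, 1]
print(area_col.shape)  # (1,)
```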