import numpy as np
# Create a 2-D array
x=np.arange(10)
print(x)
x.shape=(2,5) #Means 2 rows and 5 elements(columns) in each row
print("Print the newly made 2-D array")
print(x)
print(x[np.array([0,1]),np.array([3,2,4])])
Running the code gives:
Traceback (most recent call last):
print(x[np.array([0,1]),np.array([3,2,4])])
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
In [122]: x= np.arange(10).reshape(2,5)
In [123]: x
Out[123]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [124]: x[np.array([0,1]),:]
Out[124]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [125]: x[:,np.array([3,2,4])]
Out[125]:
array([[3, 2, 4],
[8, 7, 9]])
Indexing with 2 arrays:
In [126]: x[np.array([0,1]),np.array([3,2,4])]
Traceback (most recent call last):
Input In [126] in <cell line: 1>
x[np.array([0,1]),np.array([3,2,4])]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
But if one is (2,1) shape, then:
In [127]: x[np.array([0,1])[:,None],np.array([3,2,4])]
Out[127]:
array([[3, 2, 4],
[8, 7, 9]])
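Note that this is the same as [125], which uses a slice for the rows.
The key issue is broadcasting. It's powerful, but a little hard for beginners to grasp. One way to visualize how the 2 index arrays broadcast is to combine them arithmetically, scaling the row indices by 10 so that each result encodes a (row, column) pair: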
In [128]: np.array([0,1])[:,None]*10 + np.array([3,2,4])
Out[128]:
array([[ 3, 2, 4],
[13, 12, 14]])
If the 2 arrays have the same length, the indices are paired element by element, and we get a "diagonal" of the [125] box:
In [129]: x[np.array([0,1]),np.array([3,2])]
Out[129]: array([3, 7])
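How the two index arrays broadcast can also be checked directly. The following is a minimal sketch that rebuilds x from above; the names rows, cols, block and manual are introduced here only for illustration.

```python
import numpy as np

x = np.arange(10).reshape(2, 5)
rows = np.array([0, 1])
cols = np.array([3, 2, 4])

# A (2,1) index array against a (3,) index array broadcasts to (2,3)
r, c = np.broadcast_arrays(rows[:, None], cols)

# Block selection using the broadcastable pair
block = x[rows[:, None], cols]

# The same elements picked one at a time, for comparison
manual = np.array([[x[i, j] for j in cols] for i in rows])
```

Here r and c both have shape (2, 3), and block matches manual element for element.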
Reference
https://numpy.org/doc/stable/user/basics.indexing.html
I have two arrays:
import numpy as np
a = np.array([[1,2,3], [4,5,6]])
b = np.array([5, 6])
When I try to append b to a as a's last column using the code below:
np.hstack([a, b])
it throws an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<__array_function__ internals>", line 6, in hstack
File "C:\Users\utkal\anaconda3\envs\AIML\lib\site-packages\numpy\core\shape_base.py", line 345, in hstack
return _nx.concatenate(arrs, 1)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
I want the final array as:
1 2 3 5
4 5 6 6
You can do it like this:
np.hstack((a,b.reshape(2,1)))
array([[1, 2, 3, 5],
[4, 5, 6, 6]])
np.hstack is a shortcut for np.concatenate on axis=1 (for inputs with at least 2 dimensions). So it requires that both arrays have the same number of dimensions, with all dimensions matching except along the concatenation axis. In your case, both arrays need to be 2-D and match on dimension 0.
So you need
np.hstack([a, b[:,None]])
Out[602]:
array([[1, 2, 3, 5],
[4, 5, 6, 6]])
Or use np.concatenate
np.concatenate([a, b[:,None]], axis=1)
Out[603]:
array([[1, 2, 3, 5],
[4, 5, 6, 6]])
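Another option, not mentioned in the answers above, is np.column_stack, which treats each 1-D input as a column, so no explicit reshape is needed. A minimal sketch using the arrays from the question (the name stacked is only for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([5, 6])

# column_stack converts the 1-D array b into a (2,1) column before stacking
stacked = np.column_stack((a, b))
print(stacked)
```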
import numpy as np
arr = np.array([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]])
def np2dOperations(arr):
    a = arr[0:1]
    print(a)
    b = arr[1:2]
    print(b)
    c = arr[2:3]
    print(c)
    d = arr[3:4]
    print(d)
    e = np.matmul(a,c)
    print(e, "e")
    f = b*d
    x = e.sum()
    y = np.amax(f)
    print(x)
    print(y)
    print(x-y)
    return x-y
np2dOperations(arr)
my output:
[[1 1 2 2]]
[[1 1 2 2]]
[[3 3 4 4]]
[[3 3 4 4]]
Traceback (most recent call last):
File "/Users/bethanne/Documents/NumPy2DOperations.py", line 24, in <module>
np2dOperations(arr)
File "/Users/bethanne/Documents/NumPy2DOperations.py", line 14, in np2dOperations
e = np.matmul(a,c)
ValueError: shapes (1,4) and (1,4) not aligned: 4 (dim 1) != 1 (dim 0)
I keep getting the following error "ValueError: shapes (1,4) and (1,4) not aligned: 4 (dim 1) != 1 (dim 0)" even though arrays a and c are the same size. The result of x-y should be 16. I tried using np.transpose on array a, but that didn't work either. I am new to programming with numpy and Python, so please explain what I am doing wrong. Thank you!
So you start with a 4x4 array:
In [17]: arr = np.array([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]])
In [18]: arr
Out[18]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
In [19]: arr.shape
Out[19]: (4, 4)
Indexing with a slice retains the 2d shape:
In [20]: a = arr[0:1]
In [21]: a
Out[21]: array([[1, 1, 2, 2]])
In [22]: a.shape
Out[22]: (1, 4)
Indexing with a scalar reduces the dimensions by 1
In [23]: a1 = arr[0]
In [24]: a1
Out[24]: array([1, 1, 2, 2])
In [25]: a1.shape
Out[25]: (4,)
matmul of two 1d arrays is documented; it returns their scalar inner product:
In [26]: np.matmul(arr[0],arr[1])
Out[26]: 10
In [27]: np.matmul(arr[0],arr[2])
Out[27]: 22
matmul of 2d arrays is also documented, and the error states the requirement: the last dimension of the first operand must match the first dimension of the second:
In [28]: np.matmul(arr[0:1],arr[2:3])
Traceback (most recent call last):
File "<ipython-input-28-88ee2e80387e>", line 1, in <module>
np.matmul(arr[0:1],arr[2:3])
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 4)
matmul of a (1,4) with a (4,1) does work, producing the same result as the 1d "dot" - except the result is a (1,1) array:
In [29]: np.matmul(arr[0:1],arr[2:3].T)
Out[29]: array([[22]])
Elementwise multiplication:
In [30]: arr[1]*arr[3]
Out[30]: array([3, 3, 8, 8])
In [31]: arr[1:2]*arr[3:4]
Out[31]: array([[3, 3, 8, 8]])
and so on for your other expressions. In numpy there is a clear distinction between 1d arrays and 2d ones. An (n,)-shaped array is different from a (1,n) or (n,1) one, even though they can be reshaped into each other.
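Putting these pieces together, one possible reading of the function from the question uses scalar indexing so that every row is 1-D. This is only a sketch under that assumption; the name np2d_operations is a renamed stand-in for the original function. Note that this reading yields 14 rather than the 16 the question expects, so the intended computation may be different.

```python
import numpy as np

def np2d_operations(arr):
    # Scalar indexing gives 1-D rows of shape (4,) instead of (1,4) slices
    a, b, c, d = arr[0], arr[1], arr[2], arr[3]
    e = np.matmul(a, c)    # inner product of two 1-D arrays, a scalar
    f = b * d              # elementwise product, shape (4,)
    return e - np.amax(f)  # scalar minus the largest elementwise product

arr = np.array([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]])
result = np2d_operations(arr)
print(result)
```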
I have a 2D array of values, and I want to index it with two lists of indices, x and y. It used to work perfectly; I don't know why it isn't working now. Maybe it's the Python version, I'm not sure.
x = np.squeeze(np.where(data['info'][:,2]==cdp)[0])
y = np.squeeze(np.where((data['Time']>=ub) & (data['Time']<=lb))[0])
s = data['gather'][x,y]
Error:
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (36,) (45,)
I don't know what the problem is. It works when I do it in two stages.
s = data['gather'][:,y]; s = s[x,:]
But I can't do this; I need to do it in one step.
In [92]: data = np.arange(12).reshape(3,4)
In [93]: x,y = np.arange(3), np.arange(4)
In [94]: data[x,y]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-94-8bd18da6c0ef> in <module>
----> 1 data[x,y]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (4,)
When you provide 2 or more arrays as indices, numpy broadcasts them against each other, so understanding broadcasting is important.
In MATLAB, providing two indexing arrays (actually 2d matrices) fetches a block. In numpy, two index arrays that match in shape fetch individual elements, e.g. a diagonal:
In [99]: data[x,x]
Out[99]: array([ 0, 5, 10])
The MATLAB equivalent requires an extra function, sub2ind, which converts subscripts to linear indices.
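The pairing behaviour is not limited to diagonals. As a small sketch (the index arrays i and j and the names paired and looped are made up for illustration), equal-length index arrays select the same elements as a loop over zipped indices:

```python
import numpy as np

data = np.arange(12).reshape(3, 4)
i = np.array([0, 1, 2])   # row index of each wanted element
j = np.array([3, 0, 2])   # column index of each wanted element

# Index arrays of equal shape are paired elementwise: (0,3), (1,0), (2,2)
paired = data[i, j]

# Equivalent explicit loop over the index pairs
looped = np.array([data[r, c] for r, c in zip(i, j)])
```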
Two stage indexing:
In [95]: data[:,y][x,:]
Out[95]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
ix_ is a handy tool for constructing indices for block access:
In [96]: data[np.ix_(x,y)]
Out[96]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Notice what it produces:
In [97]: np.ix_(x,y)
Out[97]:
(array([[0],
[1],
[2]]), array([[0, 1, 2, 3]]))
that's the same as doing:
In [98]: data[x[:,None], y]
Out[98]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
x[:,None] is (3,1), y is (4,); they broadcast to produce a (3,4) selection.
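Applied to a case like the one in the question, the one-step selection might look like the sketch below. The arrays gather, cdp_column and time are small hypothetical stand-ins for the question's data, and np.flatnonzero is used here instead of np.squeeze(np.where(...)[0]) so that the indices stay 1-D even when only one element matches.

```python
import numpy as np

# Hypothetical stand-ins for data['gather'], data['info'][:,2] and data['Time']
gather = np.arange(20).reshape(4, 5)
cdp_column = np.array([7, 3, 7, 7])
time = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

x = np.flatnonzero(cdp_column == 7)                 # matching row indices
y = np.flatnonzero((time >= 0.1) & (time <= 0.3))   # matching column indices

# One-step block selection
s = gather[np.ix_(x, y)]

# The two-stage version from the question, for comparison
two_stage = gather[:, y][x, :]
```

Both s and two_stage hold the same 3x3 block; np.ix_ just builds the (3,1) and (1,3) index arrays in a single expression.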
There are 2 NumPy arrays, and I would like to reshape array1 from shape (12,) to match array2, which has shape (4,):
array1 = np.array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
array1.shape
returns: (12,)
array2 = np.array([ 12, 34, 56, 78])
array2.shape
returns: (4,)
I tried to execute reshape:
array1.reshape(array2.shape)
But there is an error:
ValueError: cannot reshape array of size 12 into shape (4,)
So the expected result is array1 with 4 elements:
np.array([ 1, 2, 3, 4]),
instead of 12.
I'd appreciate any ideas or help.
If I understand your requirements correctly, I think what you're looking for is simple slicing:
In [140]: array2 = np.array([ 12, 34, 56, 78])
In [135]: a_sliced = array1[:array2.shape[0]]
In [136]: a_sliced.shape
Out[136]: (4,)
If array2 is multi-dimensional, then use the approach suggested by Mad Physicist:
sliced_arr = array1[tuple(slice(0, d) for d in array2.shape)]
Alternatively, if you intended to split the array into three equal parts, then use numpy.split() as in:
# split `array1` into 3 portions
In [138]: np.split(array1, 3)
Out[138]: [array([1, 2, 3, 4]), array([5, 6, 7, 8]), array([ 9, 10, 11, 12])]
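The original error arises because reshape must keep the total number of elements, so 12 elements cannot become 4. A short sketch contrasting slicing with a reshape that does preserve the size (the names first_four and rows_of_four are illustrative only):

```python
import numpy as np

array1 = np.arange(1, 13)               # 12 elements: 1..12
array2 = np.array([12, 34, 56, 78])

# Slicing discards elements, so the result can be smaller than the input
first_four = array1[:array2.size]

# reshape keeps all 12 elements; -1 lets NumPy infer the number of rows
rows_of_four = array1.reshape(-1, array2.size)
```

rows_of_four has shape (3, 4), and its rows hold the same values as the three pieces returned by np.split(array1, 3).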
I have scripts with multi-dimensional arrays and instead of for-loops I would like to use a vectorized implementation for my problems (which sometimes contain column operations).
Let's consider a simple example with matrix arr:
> arr = np.arange(12).reshape(3, 4)
> arr
> array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
> arr.shape
> (3, 4)
So we have a matrix arr with 3 rows and 4 columns.
The simplest case in my scripts is adding something to the values in the array. E.g. I'm doing this for single or multiple rows:
> someVector = np.array([1, 2, 3, 4])
> arr[0] += someVector
> arr
> array([[ 1, 3, 5, 7], <--- successfully added someVector
[ 4, 5, 6, 7], to one row
[ 8, 9, 10, 11]])
> arr[0:2] += someVector
> arr
> array([[ 2, 5, 8, 11], <--- added someVector to two
[ 5, 7, 9, 11], <--- rows at once
[ 8, 9, 10, 11]])
This works well. However, sometimes I need to manipulate one or several columns. One column at a time works:
> arr[:, 0] += [1, 2, 3]
> array([[ 3, 5, 8, 11],
[ 7, 7, 9, 11],
[11, 9, 10, 11]])
^
|___ added the values [1, 2, 3] successfully to
this column
But I am struggling to figure out why this does not work for multiple columns at once:
> arr[:, 0:2] += [1, 2, 3]
> ValueError
> Traceback (most recent call last)
> <ipython-input-16-5feef53e53af> in <module>()
> ----> 1 arr[:, 0:2] += [1, 2, 3]
> ValueError: operands could not be broadcast
> together with shapes (3,2) (3,) (3,2)
Isn't this the very same way it works with rows? What am I doing wrong here?
To add a 1D array to multiple columns you need to broadcast the values to a 2D array. Since broadcasting adds new axes on the left (of the shape) by default, broadcasting a row vector to multiple rows happens automatically:
arr[0:2] += someVector
someVector has shape (N,) and is automatically broadcast to shape (1, N). If arr[0:2] has shape (2, N), then the sum is performed element-wise as though both arr[0:2] and someVector were arrays of the same shape, (2, N).
But to broadcast a column vector to multiple columns requires hinting NumPy that you want broadcasting to occur with the axis on the right. In fact, you have to add the new axis on the right explicitly by using someVector[:, np.newaxis] or equivalently someVector[:, None]:
In [41]: arr = np.arange(12).reshape(3, 4)
In [42]: arr[:, 0:2] += np.array([1, 2, 3])[:, None]
In [43]: arr
Out[43]:
array([[ 1, 2, 2, 3],
[ 6, 7, 6, 7],
[11, 12, 10, 11]])
someVector (e.g. np.array([1, 2, 3])) has shape (N,) and someVector[:, None] has shape (N, 1) so now broadcasting happens on the right. If arr[:, 0:2] has shape (N, 2), then the sum is performed element-wise as though both arr[:, 0:2] and someVector[:, None] were arrays of the same shape, (N, 2).
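To see what is actually added to the slice, the broadcast result can be materialized with np.broadcast_to. This is a minimal sketch; some_vector and expanded are names chosen here for illustration.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
some_vector = np.array([1, 2, 3])

# The (3,1) column is stretched across both selected columns
expanded = np.broadcast_to(some_vector[:, None], (3, 2))
print(expanded)

# In-place add; equivalent to adding `expanded` to arr[:, 0:2]
arr[:, 0:2] += some_vector[:, None]
print(arr)
```

some_vector.reshape(-1, 1) produces the same (3, 1) column as some_vector[:, None], if that spelling reads more clearly.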
Very clear explanation by @unutbu.
As a complement, transposition (.T) can often simplify the task by letting you work along the first dimension. Because arr.T is a view of arr, the in-place update modifies arr itself:
In [273]: arr = np.arange(12).reshape(3, 4)
In [274]: arr.T[0:2] += [1, 2, 3]
In [275]: arr
Out[275]:
array([[ 1, 2, 2, 3],
[ 6, 7, 6, 7],
[11, 12, 10, 11]])