I have a 2D array of values, and I want to index it with two lists of indices, x and y. It used to work perfectly; I don't know why it no longer works. Maybe it's the Python version, I'm not sure.
x = np.squeeze(np.where(data['info'][:,2]==cdp)[0])
y = np.squeeze(np.where((data['Time']>=ub) & (data['Time']<=lb))[0])
s = data['gather'][x,y]
Error:
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (36,) (45,)
I don't know what the problem is. It works when I do it in two stages.
s = data['gather'][:,y]; s = s[x,:]
But I can't do this; I need to do it in one step.
In [92]: data = np.arange(12).reshape(3,4)
In [93]: x,y = np.arange(3), np.arange(4)
In [94]: data[x,y]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-94-8bd18da6c0ef> in <module>
----> 1 data[x,y]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (4,)
When you provide 2 or more arrays as indices, numpy broadcasts them against each other. Understanding broadcasting is important.
In MATLAB, providing two indexing arrays (actually 2d matrices) fetches a block. In numpy, two arrays that match in shape fetch individual elements, e.g. a diagonal:
In [99]: data[x,x]
Out[99]: array([ 0, 5, 10])
The MATLAB equivalent requires an extra function (sub2ind, I believe).
Two stage indexing:
In [95]: data[:,y][x,:]
Out[95]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
ix_ is a handy tool for constructing indices for block access:
In [96]: data[np.ix_(x,y)]
Out[96]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Notice what it produces:
In [97]: np.ix_(x,y)
Out[97]:
(array([[0],
[1],
[2]]), array([[0, 1, 2, 3]]))
that's the same as doing:
In [98]: data[x[:,None], y]
Out[98]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
x[:,None] is (3,1), y is (4,); they broadcast to produce a (3,4) selection.
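Applied to the shapes in the question, the same idea can be sketched as follows. The array below is made-up stand-in data (the real data['gather'] is not shown in the question), so its size and the index values are assumptions. The sketch also shows one practical difference between the two approaches: the two-stage form operates on a temporary copy, so it cannot be used to assign back into the original array, while the np.ix_ form can.

```python
import numpy as np

# Hypothetical stand-in for data['gather']; the real array is not shown.
gather = np.arange(60 * 100).reshape(60, 100)
x = np.arange(10, 46)   # 36 row indices, like the (36,) array in the error
y = np.arange(20, 65)   # 45 column indices, like the (45,) array in the error

block = gather[np.ix_(x, y)]      # one-step block selection
two_stage = gather[:, y][x, :]    # the two-stage version from the question
print(block.shape)
print(np.array_equal(block, two_stage))

# Assigning through the two-stage form only changes a temporary copy ...
before = gather.copy()
gather[:, y][x, :] = -1
unchanged = np.array_equal(gather, before)
print(unchanged)

# ... while assigning through np.ix_ writes into gather itself.
gather[np.ix_(x, y)] = -1
print((gather[np.ix_(x, y)] == -1).all())
```

gather[x[:, None], y] would behave the same way as the np.ix_ form here.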
Related
import numpy as np
# Create a 2-D array
x=np.arange(10)
print(x)
x.shape=(2,5) #Means 2 rows and 5 elements(columns) in each row
print("Print the newly made 2-D array")
print(x)
print(x[np.array([0,1]),np.array([3,2,4])])
Running the code gives:
Traceback (most recent call last):
print(x[np.array([0,1]),np.array([3,2,4])])
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
In [122]: x= np.arange(10).reshape(2,5)
In [123]: x
Out[123]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [124]: x[np.array([0,1]),:]
Out[124]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [125]: x[:,np.array([3,2,4])]
Out[125]:
array([[3, 2, 4],
[8, 7, 9]])
Indexing with 2 arrays:
In [126]: x[np.array([0,1]),np.array([3,2,4])]
Traceback (most recent call last):
Input In [126] in <cell line: 1>
x[np.array([0,1]),np.array([3,2,4])]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
But if one is (2,1) shape, then:
In [127]: x[np.array([0,1])[:,None],np.array([3,2,4])]
Out[127]:
array([[3, 2, 4],
[8, 7, 9]])
Note that this is the same as [125], which uses a slice for the rows.
The key issue is broadcasting. It's powerful, but a little hard for beginners to grasp. A way to visualize this broadcasting is to add the 2 arrays:
In [128]: np.array([0,1])[:,None]*10 + np.array([3,2,4])
Out[128]:
array([[ 3, 2, 4],
[13, 12, 14]])
If the 2 arrays have the same length, we get a "diagonal" of the [125] box:
In [129]: x[np.array([0,1]),np.array([3,2])]
Out[129]: array([3, 7])
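One way to check in advance whether two index arrays are compatible is to ask NumPy what shape they would broadcast to. This is a small sketch using np.broadcast_shapes (available in NumPy 1.20 and later); the names rows and cols are only illustrative.

```python
import numpy as np

x = np.arange(10).reshape(2, 5)
rows = np.array([0, 1])
cols = np.array([3, 2, 4])

# (2,) and (3,) cannot be broadcast together, hence the IndexError above.
try:
    np.broadcast_shapes(rows.shape, cols.shape)
    compatible = True
except ValueError:
    compatible = False
print(compatible)

# (2, 1) and (3,) do broadcast, and the result has the broadcast shape.
result_shape = np.broadcast_shapes(rows[:, None].shape, cols.shape)
print(result_shape)
print(x[rows[:, None], cols].shape)
```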
Reference
https://numpy.org/doc/stable/user/basics.indexing.html
I have a 2-D numpy array X with shape (100, 4). I want to find the sum of each row of that
array and store it inside a new numpy array x_new with shape (100,0). What I've done so far
doesn't work. Any suggestions? Below is my approach.
x_new = np.empty([100,0])
for i in range(len(X)):
array = np.append(x_new, sum(X[i]))
Using the sum method on a 2d array:
In [8]: x = np.arange(12).reshape(3,4)
In [9]: x
Out[9]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [10]: x.sum(axis=1)
Out[10]: array([ 6, 22, 38])
In [12]: x.sum(axis=1, keepdims=True)
Out[12]:
array([[ 6],
[22],
[38]])
In [13]: _.shape
Out[13]: (3, 1)
reference: https://numpy.org/doc/stable/reference/generated/numpy.sum.html
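For the shape in the question, a sketch with a made-up (100, 4) array (the real X is not shown). Note that a shape of (100, 0) holds zero elements; the row sums fit in shape (100,) or, with keepdims=True, (100, 1). If a loop is really wanted, preallocating and assigning by index avoids np.append, which returns a new array on each call instead of modifying x_new in place.

```python
import numpy as np

X = np.arange(400).reshape(100, 4)   # made-up data standing in for X

x_new = X.sum(axis=1)                  # shape (100,)
x_col = X.sum(axis=1, keepdims=True)   # shape (100, 1)
print(x_new.shape, x_col.shape)

# Loop version, for comparison only: preallocate, then fill by index.
x_loop = np.empty(len(X), dtype=X.dtype)
for i in range(len(X)):
    x_loop[i] = X[i].sum()
print(np.array_equal(x_loop, x_new))
```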
I have an array I want to use for mapping. Let's call it my_map, type float, shape (m,c).
I have a second array with indexes, let's call it my_indexes, type int, shape (n,c); every value is between 0 and m.
When I index my_map with my_ans = my_map[my_indexes], I get an array of shape (n,c,c), when I was expecting (n,c). What would be the proper way to do it?
Just to be clear, what I am trying to do is something equivalent to:
my_ans = np.empty_like(touch_probability)
for i in range(c):
my_ans[:,i] = my_map[:,i][my_indexes[:,i]]
To illustrate and test your problem, define simple, real arrays:
In [44]: arr = np.arange(12).reshape(3,4)
In [45]: idx = np.array([[0,2,1,0],[2,2,1,0]])
In [46]: arr.shape
Out[46]: (3, 4)
In [47]: idx.shape
Out[47]: (2, 4)
Your desired calculation:
In [48]: res = np.zeros((2,4), int)
In [49]: for i in range(4):
...: res[:,i] = arr[:,i][idx[:,i]] # same as arr[idx[:,i], i]
...:
In [50]: res
Out[50]:
array([[0, 9, 6, 3],
[8, 9, 6, 3]])
Doing the same with one indexing step:
In [51]: arr[idx, np.arange(4)]
Out[51]:
array([[0, 9, 6, 3],
[8, 9, 6, 3]])
This is broadcasting the two indexing arrays against each other, and then picking points:
In [52]: np.broadcast_arrays(idx, np.arange(4))
Out[52]:
[array([[0, 2, 1, 0],
[2, 2, 1, 0]]),
array([[0, 1, 2, 3],
[0, 1, 2, 3]])]
So we are indexing the (m,c) array with two (n,c) arrays.
The following are the same:
arr[idx]
arr[idx, :]
It is using idx to select whole rows from arr, so the result has the shape of idx plus the last dimension of arr. Whereas what you want is just the ith element of row idx[j,i].
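The same per-column lookup can also be written with np.take_along_axis (NumPy 1.15 and later), which some may find easier to read than the explicit np.arange. This is only a sketch on the small test arrays used above, not something taken from the question.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
idx = np.array([[0, 2, 1, 0],
                [2, 2, 1, 0]])

# Explicit pairing of row indices with column numbers.
by_index = arr[idx, np.arange(arr.shape[1])]

# For each column i, pick the rows listed in idx[:, i].
by_take = np.take_along_axis(arr, idx, axis=0)

print(by_take)
print(np.array_equal(by_index, by_take))
```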
I have a couple of ndarrays with same shape, and I would like to get one array (of same shape) with the maximum of the absolute values for each element. So I decided to stack all arrays, and then pick the values along the new stacked axis. But how to do this?
Example
Say we have two 1-D arrays with 4 elements each, so my stacked array looks like
>>> stack
array([[ 4, 1, 2, 3],
[ 0, -5, 6, 7]])
If I would just be interested in the maximum I could just do
>>> numpy.amax(stack, axis=0)
array([4, 1, 6, 7])
But I need to consider negative values as well, so I was going for
>>> ind = numpy.argmax(numpy.absolute(stack), axis=0)
>>> ind
array([0, 1, 1, 1])
So now I have the indices I need, but how do I apply them to the stacked array? If I just index stack by ind, numpy does some broadcasting that I don't need:
>>> stack[ind]
array([[ 4, 1, 2, 3],
[ 0, -5, 6, 7],
[ 0, -5, 6, 7],
[ 0, -5, 6, 7]])
What I want to get is array([4, -5, 6, 7])
Or to ask from a slightly different perspective: How do I get the array numpy.amax(stack, axis=0) based on the indices returned by numpy.argmax(stack, axis=0)?
The stacking operation would be inefficient. We can simply use np.where to do the choosing based on the absolute valued comparisons -
In [198]: a
Out[198]: array([4, 1, 2, 3])
In [199]: b
Out[199]: array([ 0, -5, 6, 7])
In [200]: np.where(np.abs(a) > np.abs(b), a, b)
Out[200]: array([ 4, -5, 6, 7])
This works on generic n-dim arrays without any modification.
If you have a 2D numpy ndarray, indexing with a single 1D array no longer picks single elements; it selects whole rows. So to get what you want and avoid that, you have to supply an index for the second axis too:
>>> stack[ind, np.arange(stack.shape[1])]
array([ 4, -5, 6, 7])
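An alternative that pairs naturally with argmax is np.take_along_axis (NumPy 1.15 and later). The sketch below rebuilds the stack from the question; the index array needs the same number of dimensions as stack, so a leading axis is added to ind and then dropped from the result.

```python
import numpy as np

stack = np.array([[4, 1, 2, 3],
                  [0, -5, 6, 7]])
ind = np.argmax(np.abs(stack), axis=0)   # row holding the largest |value| per column

# ind has shape (4,); take_along_axis wants shape (1, 4) to match stack.ndim.
picked = np.take_along_axis(stack, ind[None, :], axis=0)[0]
print(picked)

# Same result as pairing ind with the column numbers directly.
print(np.array_equal(picked, stack[ind, np.arange(stack.shape[1])]))
```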
For 'normal' Python:
>>> a=[[1,2],[3,4]]
>>> b=[0,1]
>>> [x[y] for x,y in zip(a,b)]
[1, 4]
Perhaps it can be applied to arrays too, I am not familiar enough with Numpy.
Find the arrays of max and min values and combine them using where:
maxs = np.amax(stack, axis=0)
mins = np.amin(stack, axis=0)
max_abs = np.where(np.abs(maxs) > np.abs(mins), maxs, mins)
I have scripts with multi-dimensional arrays and instead of for-loops I would like to use a vectorized implementation for my problems (which sometimes contain column operations).
Let's consider a simple example with matrix arr:
> arr = np.arange(12).reshape(3, 4)
> arr
> ([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
> arr.shape
> (3, 4)
So we have a matrix arr with 3 rows and 4 columns.
The simplest case in my scripts is adding something to the values in the array. E.g. I'm doing this for single or multiple rows:
> someVector = np.array([1, 2, 3, 4])
> arr[0] += someVector
> arr
> array([[ 1, 3, 5, 7], <--- successfully added someVector
[ 4, 5, 6, 7], to one row
[ 8, 9, 10, 11]])
> arr[0:2] += someVector
> arr
> array([[ 2, 5, 8, 11], <--- added someVector to two
[ 5, 7, 9, 11], <--- rows at once
[ 8, 9, 10, 11]])
This works well. However, sometimes I need to manipulate one or several columns. One column at a time works:
> arr[:, 0] += [1, 2, 3]
> array([[ 3, 5, 8, 11],
[ 7, 7, 9, 11],
[11, 9, 10, 11]])
^
|___ added the values [1, 2, 3] successfully to
this column
But I am struggling to figure out why this does not work for multiple columns at once:
> arr[:, 0:2] += [1, 2, 3]
> ValueError
> Traceback (most recent call last)
> <ipython-input-16-5feef53e53af> in <module>()
> ----> 1 arr[:, 0:2] += [1, 2, 3]
> ValueError: operands could not be broadcast
> together with shapes (3,2) (3,) (3,2)
Isn't this the very same way it works with rows? What am I doing wrong here?
To add a 1D array to multiple columns you need to broadcast the values to a 2D array. Since broadcasting adds new axes on the left (of the shape) by default, broadcasting a row vector to multiple rows happens automatically:
arr[0:2] += someVector
someVector has shape (N,) and is automatically broadcast to shape (1, N). If arr[0:2] has shape (2, N), then the sum is performed element-wise as though both arr[0:2] and someVector were arrays of the same shape, (2, N).
But broadcasting a column vector to multiple columns requires hinting to NumPy that you want broadcasting to occur with the axis on the right. In fact, you have to add the new axis on the right explicitly by using someVector[:, np.newaxis] or, equivalently, someVector[:, None]:
In [41]: arr = np.arange(12).reshape(3, 4)
In [42]: arr[:, 0:2] += np.array([1, 2, 3])[:, None]
In [43]: arr
Out[43]:
array([[ 1, 2, 2, 3],
[ 6, 7, 6, 7],
[11, 12, 10, 11]])
someVector (e.g. np.array([1, 2, 3])) has shape (N,) and someVector[:, None] has shape (N, 1) so now broadcasting happens on the right. If arr[:, 0:2] has shape (N, 2), then the sum is performed element-wise as though both arr[:, 0:2] and someVector[:, None] were arrays of the same shape, (N, 2).
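The shape rule can be checked directly. In this sketch col_vec is just an illustrative name for the [1, 2, 3] vector from the question; shapes are compared starting from the rightmost axis, which is why (3, 2) with (3,) fails and (3, 2) with (3, 1) works.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
col_vec = np.array([1, 2, 3])   # illustrative name for the vector to add

print(arr[:, 0:2].shape)        # target slice
print(col_vec.shape)            # 1D vector
print(col_vec[:, None].shape)   # explicit column

# (3, 2) vs (3,): the trailing axes are 2 and 3, which do not match.
try:
    arr[:, 0:2] + col_vec
    failed = False
except ValueError:
    failed = True
print(failed)

# (3, 2) vs (3, 1): the size-1 axis is stretched across both columns.
total = arr[:, 0:2] + col_vec[:, None]
print(total)
```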
Very clear explanation by @unutbu.
As a complement, transposition (.T) can often simplify the task by letting you work along the first dimension:
In [273]: arr = np.arange(12).reshape(3, 4)
In [274]: arr.T[0:2] += [1, 2, 3]
In [275]: arr
Out[275]:
array([[ 1, 2, 2, 3],
[ 6, 7, 6, 7],
[11, 12, 10, 11]])
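The in-place update reaches arr because arr.T is a view of the same memory rather than a copy. A quick way to confirm this, as a sketch:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# The transpose shares its data buffer with arr.
shares = np.shares_memory(arr, arr.T)
print(shares)

# So updating the first two rows of arr.T updates the first two columns of arr.
arr.T[0:2] += [1, 2, 3]
print(arr)
```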