I want to append values to a selection of an array without going through a for loop.
For example, if I want to add 0 values at certain locations of an array:
a=np.array([[1,2,3,4,5],[1,2,3,4,5]])
condition=np.where(a>2)
a[condition]=np.append(a[condition],np.array([0]*len(condition[0])))
-> ValueError: shape mismatch: value array of shape (12,) could not be broadcast to indexing result of shape (6,)
Edit for clarification:
I need to add values (and a dimension if needed) at selected array locations. The loop looks like this:
for t in range(len(ind)):
    c = cols[t]
    r = rows[t]
    if data1[r, c] > 2:
        data2[r, c] = np.append(data2[r, c], t)
Is there any way to remove this loop (~100,000 iterations)? Thanks.
Let's look at the pieces:
In [92]: a=np.array([[1,2,3,4,5],[1,2,3,4,5]])
...: condition=np.where(a>2)
...:
In [93]: a
Out[93]:
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
In [94]: condition
Out[94]:
(array([0, 0, 0, 1, 1, 1], dtype=int32),
array([2, 3, 4, 2, 3, 4], dtype=int32))
In [95]: a[condition]
Out[95]: array([3, 4, 5, 3, 4, 5])
In [96]: np.append(a[condition],np.array([0]*len(condition[0])))
Out[96]: array([3, 4, 5, 3, 4, 5, 0, 0, 0, 0, 0, 0])
You are trying to put 12 values into 6 slots. No can do!
What are you expecting? I don't think I should even speculate. Go ahead, show us the loop.
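For reference, assignment to the selected slots does work when the number of values matches; a minimal sketch (assuming you simply want to overwrite the selected entries, which may or may not be what your loop ultimately needs):
import numpy as np

a = np.array([[1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5]])
condition = np.where(a > 2)

# one value per selected location: 6 values into 6 slots
a[condition] = np.zeros(len(condition[0]), dtype=a.dtype)
print(a)
# [[1 2 0 0 0]
#  [1 2 0 0 0]]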
When I tried using scipy.optimize.linear_sum_assignment as shown below, it gave the assignment vector [0 2 3 1] with a total cost of 15.
However, from the cost matrix c, you can see that for the second task, the 5th agent has a cost of 1, so the expected assignment should be [0 3 None 2 1] (total cost of 9).
Why is linear_sum_assignment not returning the optimal assignments?
from scipy.optimize import linear_sum_assignment
c = [
[1, 5, 9, 5],
[5, 8, 3, 2],
[3, 2, 6, 8],
[7, 3, 5, 4],
[2, 1, 9, 9],
]
results = linear_sum_assignment(c)
print(results[1]) # [0 2 3 1]
linear_sum_assignment returns a tuple of two arrays. These are the row indices and column indices of the assigned values. For your example (with c converted to a numpy array):
In [51]: c
Out[51]:
array([[1, 5, 9, 5],
[5, 8, 3, 2],
[3, 2, 6, 8],
[7, 3, 5, 4],
[2, 1, 9, 9]])
In [52]: row, col = linear_sum_assignment(c)
In [53]: row
Out[53]: array([0, 1, 3, 4])
In [54]: col
Out[54]: array([0, 2, 3, 1])
The corresponding index pairs from row and col give the selected entries. That is, the indices of the selected entries are (0, 0), (1, 2), (3, 3) and (4, 1). It is these pairs that are the "assignments".
The sum associated with this assignment is 9:
In [55]: c[row, col].sum()
Out[55]: 9
In the original version of the question (since edited), it looks like you wanted to know the row index for each column, so you expected [0, 4, 1, 3]. The values that you want are in row, but the order is not what you expect, because the indices in col are not simply [0, 1, 2, 3]. To get the result in the form that you expected, you have to reorder the values in row based on the order of the indices in col. Here are two ways to do that.
First:
In [56]: result = np.zeros(4, dtype=int)
In [57]: result[col] = row
In [58]: result
Out[58]: array([0, 4, 1, 3])
Second:
In [59]: result = row[np.argsort(col)]
In [60]: result
Out[60]: array([0, 4, 1, 3])
Note that the example in the linear_sum_assignment docstring is potentially misleading; because it displays only col_ind in the python session, it gives the impression that col_ind is "the answer". In general, however, the answer involves both of the returned arrays.
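Putting the pieces of the session above together as a runnable script (a minimal sketch):
import numpy as np
from scipy.optimize import linear_sum_assignment

c = np.array([[1, 5, 9, 5],
              [5, 8, 3, 2],
              [3, 2, 6, 8],
              [7, 3, 5, 4],
              [2, 1, 9, 9]])

row, col = linear_sum_assignment(c)
print(row)                   # [0 1 3 4]
print(col)                   # [0 2 3 1]
print(c[row, col].sum())     # 9  (the optimal total cost)

# row index assigned to each column, in column order
print(row[np.argsort(col)])  # [0 4 1 3]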
I've seen variations of this question asked a few times but so far haven't seen any answers that get to the heart of this general case. I have an n-dimensional array of shape [a, b, c, ...]. For some dimension x, I want to look at each sub-array and find the coordinates of the maximum.
For example, say b = 2, and that's the dimension I'm interested in. I want the coordinates of the maximum of [:, 0, :, ...] and [:, 1, :, ...] in the form a_max = [a_max_b0, a_max_b1], c_max = [c_max_b0, c_max_b1], etc.
I've tried to do this by reshaping my input matrix to a 2d array [b, a*c*d*...], using argmax along axis 0, and unraveling the indices, but the output coordinates don't wind up giving the maxima in my dataset. In this case, n = 3 and I'm interested in axis 1.
shape = gains_3d.shape
idx = gains_3d.reshape(shape[1], -1)
idx = idx.argmax(axis = 1)
a1, a2 = np.unravel_index(idx, [shape[0], shape[2]])
Obviously I could use a loop, but that's not very pythonic.
For a concrete example, I randomly generated a 4x2x3 array. I'm interested in axis 1, so the output should be two arrays of length 2.
testarray = np.array([[[0.17028444, 0.38504759, 0.64852725],
[0.8344524 , 0.54964746, 0.86628204]],
[[0.77089997, 0.25876277, 0.45092835],
[0.6119848 , 0.10096425, 0.627054 ]],
[[0.8466859 , 0.82011746, 0.51123959],
[0.26681694, 0.12952723, 0.94956865]],
[[0.28123628, 0.30465068, 0.29498136],
[0.6624998 , 0.42748154, 0.83362323]]])
testarray[:,0,:] is
array([[0.17028444, 0.38504759, 0.64852725],
[0.77089997, 0.25876277, 0.45092835],
[0.8466859 , 0.82011746, 0.51123959],
[0.28123628, 0.30465068, 0.29498136]])
, so the first element of the first output array will be 2, and the first element of the other will be 0, pointing to 0.8466859. The second elements of the two output arrays will be 2 and 2, pointing to 0.94956865 of testarray[:,1,:].
Let's first try to get a clear idea of what you are trying to do:
Sample 3d array:
In [136]: arr = np.random.randint(0,10,(2,3,4))
In [137]: arr
Out[137]:
array([[[1, 7, 6, 2],
        [1, 5, 7, 1],
        [2, 2, 5, *6*]],

       [[*9*, 1, 2, 9],
        [2, *9*, 3, 9],
        [0, 2, 0, 6]]])
(The asterisks mark the maximum for each middle index.)
After fiddling around a bit I came up with this iteration, showing, for each middle index, the coordinates of its maximum and the max value:
In [151]: [(i, np.unravel_index(np.argmax(arr[:,i,:]), (2,4)), np.max(arr[:,i,:])) for i in range(3)]
Out[151]: [(0, (1, 0), 9), (1, (1, 1), 9), (2, (0, 3), 6)]
I can move the unravel outside the iteration:
In [153]: np.unravel_index([np.argmax(arr[:,i,:]) for i in range(3)],(2,4))
Out[153]: (array([1, 1, 0]), array([0, 1, 3]))
Your reshape approach does avoid this loop:
In [154]: arr1 = arr.transpose(1,0,2) # move our axis first
In [155]: arr1 = arr1.reshape(3,-1)
In [156]: arr1
Out[156]:
array([[1, 7, 6, 2, 9, 1, 2, 9],
[1, 5, 7, 1, 2, 9, 3, 9],
[2, 2, 5, 6, 0, 2, 0, 6]])
In [158]: np.argmax(arr1,axis=1)
Out[158]: array([4, 5, 3])
In [159]: np.unravel_index(_,(2,4))
Out[159]: (array([1, 1, 0]), array([0, 1, 3]))
argmax takes only one axis value, whereas you want the equivalent of reducing over all axes but one. np.max accepts a tuple of axes, but argmax does not, so for the indices the transpose and reshape may be the only way.
In [163]: np.max(arr1,axis=1)
Out[163]: array([9, 9, 6])
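Wrapping the transpose-and-reshape approach in a reusable helper (argmax_keep_axis is a name introduced here for illustration, not a NumPy function), using the sample array from the session above:
import numpy as np

def argmax_keep_axis(arr, axis):
    # indices of the maximum over all axes except `axis`
    moved = np.moveaxis(arr, axis, 0)           # bring the kept axis to the front
    flat = moved.reshape(moved.shape[0], -1)    # flatten the remaining axes
    idx = flat.argmax(axis=1)
    return np.unravel_index(idx, moved.shape[1:])

arr = np.array([[[1, 7, 6, 2],
                 [1, 5, 7, 1],
                 [2, 2, 5, 6]],
                [[9, 1, 2, 9],
                 [2, 9, 3, 9],
                 [0, 2, 0, 6]]])

rows, cols = argmax_keep_axis(arr, axis=1)
print(rows, cols)                    # [1 1 0] [0 1 3]
print(np.max(arr, axis=(0, 2)))      # [9 9 6]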
Suppose I have a numpy array
a = np.array([0,2,3,4,5,1,9,0,0,7,9,0,0,0]).reshape(7,2)
I want to find the indices of all occurrences of the minimum element (here 0) in the 2nd column. Using argmin I can only find the index of its first occurrence. How can I do this in Python?
Using np.flatnonzero on a[:, 1] == np.min(a) is the most straightforward way:
In [3]: idxs = np.flatnonzero(a[:, 1]==np.min(a))
In [4]: idxs
Out[4]: array([3, 5, 6])
After you reshaped your array it looks like this:
array([[0, 2],
[3, 4],
[5, 1],
[9, 0],
[0, 7],
[9, 0],
[0, 0]])
You can get all elements that have a given value by using np.where. In your case the following works:
np.where(a.T[-1] == a.min())
# This gives you (array([3, 5, 6]),)
What happens here is that you create a transposed view of the array, which lets you easily access its columns. The term view means that the array a itself is not copied or changed. This leaves you with:
a.T
array([[0, 3, 5, 9, 0, 9, 0],
[2, 4, 1, 0, 7, 0, 0]])
From this you select the last line (i.e. the last column of a) by using the index -1. Now you have the array
array([2, 4, 1, 0, 7, 0, 0])
on which you can call np.where(condition), which gives you all indices for which the condition is true. In your case the condition is
a.T[-1] == a.min()
which selects all entries in that row of the transposed array that equal np.min(a), which, as you said, is 0 in your case.
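Put together as a minimal runnable snippet:
import numpy as np

a = np.array([0, 2, 3, 4, 5, 1, 9, 0, 0, 7, 9, 0, 0, 0]).reshape(7, 2)
idxs = np.where(a.T[-1] == a.min())[0]
print(idxs)   # [3 5 6]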
I have a couple of ndarrays with the same shape, and I would like to get one array (of the same shape) containing, for each element, the value with the largest absolute value. So I decided to stack all arrays and then pick the values along the new stacked axis. But how do I do this?
Example
Say we have two 1-D arrays with 4 elements each, so my stacked array looks like
>>> stack
array([[ 4, 1, 2, 3],
[ 0, -5, 6, 7]])
If I were just interested in the maximum I could simply do
>>> numpy.amax(stack, axis=0)
array([4, 1, 6, 7])
But I need to consider negative values as well, so I was going for
>>> ind = numpy.argmax(numpy.absolute(stack), axis=0)
>>> ind
array([0, 1, 1, 1])
So now I have the indices I need, but how do I apply them to the stacked array? If I just index stack by ind, numpy selects whole rows, which is not what I need:
>>> stack[ind]
array([[ 4, 1, 2, 3],
[ 0, -5, 6, 7],
[ 0, -5, 6, 7],
[ 0, -5, 6, 7]])
What I want to get is array([4, -5, 6, 7])
Or to ask from a slightly different perspective: How do I get the array numpy.amax(stack, axis=0) based on the indices returned by numpy.argmax(stack, axis=0)?
The stacking operation would be inefficient. We can simply use np.where to do the choosing based on the absolute-value comparison:
In [198]: a
Out[198]: array([4, 1, 2, 3])
In [199]: b
Out[199]: array([ 0, -5, 6, 7])
In [200]: np.where(np.abs(a) > np.abs(b), a, b)
Out[200]: array([ 4, -5, 6, 7])
This works on generic n-dim arrays without any modification.
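If you do want to keep the stack-then-argmax route from the question, np.take_along_axis (available since NumPy 1.15) applies the argmax indices back along the stacked axis; a minimal sketch:
import numpy as np

stack = np.array([[4, 1, 2, 3],
                  [0, -5, 6, 7]])

ind = np.argmax(np.abs(stack), axis=0)                             # [0 1 1 1]
result = np.take_along_axis(stack, ind[np.newaxis, :], axis=0)[0]
print(result)                                                      # [ 4 -5  6  7]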
If you have a 2D numpy ndarray, indexing with a 1D array of row indices selects whole rows. To achieve what you want and avoid that, you have to supply a column index for each row index as well:
>>> stack[ind, np.arange(stack.shape[1])]
array([ 4, -5,  6,  7])
For 'normal' Python:
>>> a=[[1,2],[3,4]]
>>> b=[0,1]
>>> [x[y] for x,y in zip(a,b)]
[1, 4]
Perhaps it can be applied to arrays too; I am not familiar enough with Numpy.
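It can: zipping the columns of the stacked array with the chosen indices gives the same result, although it falls back to a Python-level loop; a small sketch:
>>> import numpy as np
>>> stack = np.array([[4, 1, 2, 3], [0, -5, 6, 7]])
>>> ind = [0, 1, 1, 1]
>>> np.array([col[i] for col, i in zip(stack.T, ind)])
array([ 4, -5,  6,  7])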
Find the arrays of maxima and minima and combine them using np.where:
maxs = np.amax(stack, axis=0)
mins = np.amin(stack, axis=0)
max_abs = np.where(np.abs(maxs) > np.abs(mins), maxs, mins)
Let's say I have
arr = np.arange(6)
arr
array([0, 1, 2, 3, 4, 5])
and I decide that I want to treat an array "like a circle": When I run out of material at the end, I want to start at index 0 again. That is, I want a convenient way of selecting x elements, starting at index i.
Now, if x == 6, I can simply do
i = 3
np.hstack((arr[i:], arr[:i]))
Out[9]: array([3, 4, 5, 0, 1, 2])
But is there a convenient way of generally doing this, even if x > 6, without having to manually break the array apart and think through the logic?
For example:
print(roll_array_arround(arr)[2:17])
should return:
array([2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0])
See mode='wrap' in ndarray.take:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html
Taking your hypothetical function:
print(roll_array_arround(arr)[2:17])
If it is implied that it is a true slice of the original array that you are after, that is not going to happen; a wrapped-around array cannot be expressed as a strided view of the original; so if you seek a function that maps an ndarray to an ndarray, this will necessarily involve a copy of your data.
That is, efficiency-wise, you shouldn't expect to find a solution that differs significantly in performance from the expression below.
print(arr.take(np.arange(2,17), mode='wrap'))
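If you want something closer to the hypothetical roll_array_arround from the question, a minimal sketch of a small wrapper around take (the function name and interface here are my own):
import numpy as np

def wrap_slice(arr, start, stop):
    # elements at positions start..stop-1, wrapping past the end of arr
    return arr.take(np.arange(start, stop), mode='wrap')

arr = np.arange(6)
print(wrap_slice(arr, 2, 17))
# [2 3 4 5 0 1 2 3 4 5 0 1 2 3 4]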
The modulus operation seems like the best fit here -
def rolling_array(n, x, i):
    # n is the length of the array (the rolling period)
    # x is the number of elements to take
    # i is the starting index
    return np.mod(np.arange(i, i + x), n)
Sample runs -
In [61]: rolling_array(n=6, x=6, i=3)
Out[61]: array([3, 4, 5, 0, 1, 2])
In [62]: rolling_array(n=6, x=17, i=2)
Out[62]: array([2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0])
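Note that rolling_array returns wrapped indices; they coincide with the values above only because the array is np.arange(6). For a general array you would use them to index into it, roughly like this:
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])
idx = np.mod(np.arange(2, 2 + 17), len(arr))   # 17 wrapped indices starting at 2
print(arr[idx])
# [30 40 50 60 10 20 30 40 50 60 10 20 30 40 50 60 10]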
A solution you could look into is itertools.cycle:
from itertools import cycle
list_to_rotate = np.array([1,2,3,4,5])
rotatable_list = cycle(list_to_rotate)
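On its own, cycle only gives you an infinite iterator; to pull a fixed number of elements starting at a given index you can combine it with itertools.islice, roughly like this:
from itertools import cycle, islice
import numpy as np

arr = np.arange(6)
# 15 elements of the wrapped sequence, starting at index 2
out = np.array(list(islice(cycle(arr), 2, 2 + 15)))
print(out)   # [2 3 4 5 0 1 2 3 4 5 0 1 2 3 4]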
You need to roll your array.
>>> x = np.arange(10)
>>> np.roll(x, 2)
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
See numpy documentation for more details.