I have the following NumPy matrix:
m = np.array([[1, 2, 3, 4],
[10, 5, 3, 4],
[12, 8, 1, 2],
[7, 0, 2, 4]])
Now, I need the indices of the N (say, N=2) lowest values of each row in this matrix. So with the example above, I expect the following output:
[[0, 1],
[2, 3],
[2, 3],
[1, 2]]
where the rows of the output matrix correspond to the respective rows of the original, and the elements of the rows of the output matrix are the indices of the N lowest values in the corresponding original rows (preferably in ascending order by values in the original matrix). How could I do it in NumPy?
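You could use a simple loop (not recommended), or you could use np.argpartition. Note that argpartition only guarantees that the N smallest values come first; it does not guarantee their order among themselves: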
In [13]: np.argpartition(m, 2)[:, :2]
Out[13]:
array([[0, 1],
[2, 3],
[2, 3],
[1, 2]])
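If the indices must also come out in ascending order of value, one option is to sort only the N candidates that argpartition returns. The following is a minimal sketch, assuming NumPy 1.15 or later for np.take_along_axis; the names n, part, vals, order and idx are illustrative and not from the original answer.

```python
import numpy as np

m = np.array([[1, 2, 3, 4],
              [10, 5, 3, 4],
              [12, 8, 1, 2],
              [7, 0, 2, 4]])
n = 2  # number of lowest values wanted per row

# kth = n - 1 puts the n smallest values of each row in the first n
# positions, in no guaranteed order.
part = np.argpartition(m, n - 1, axis=1)[:, :n]

# Look up the values behind those candidate indices and sort just them.
vals = np.take_along_axis(m, part, axis=1)
order = np.argsort(vals, axis=1)
idx = np.take_along_axis(part, order, axis=1)
print(idx)
```

For small N and wide rows this avoids a full sort of every row; for small matrices a plain argsort is simpler.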
You could use np.argsort along each row of your array and then slice the result to keep the N lowest (or highest) values.
np.argsort(m, axis=1)[:, :2]
array([[0, 1],
[2, 3],
[2, 3],
[1, 2]], dtype=int64)
Try this:
import numpy as np
m = np.array([[1, 2, 3, 4],
[10, 5, 3, 4],
[12, 8, 1, 2],
[7, 0, 2, 4]])
for arr in m:
print(arr.argsort()[:2])
I have a matrix represented by a NumPy array. Here is an example of what I am talking about. You can see it has 3 "vectors" inside of it:
x = np.array([[1, 1], [1,2],[2,3]])
[1, 1], [1,2] and [2,3]
The goal is to turn this into a matrix where these vectors are repeated. So the 0th row of said matrix should simply be [1,1] repeated n times, and the 1st row should be [1,2] repeated n times. I believe this would look something like this for n=4:
xresult = np.array([[[1, 1], [1, 1], [1, 1], [1, 1]],
[[1, 2], [1, 2], [1, 2], [1, 2]],
[[2, 3], [2, 3], [2, 3], [2, 3]]])
And therefore
xresult[0,0] = [1,1]
xresult[0,1] = [1,1]
xresult[0,2] = [1,1]
xresult[1,2] = [1,2]
The goal is of course to do this without loops if possible as that is an obvious but perhaps less elegant/performant solution.
Here are some attempts that do not work:
np.tile([x],(2,1))
>>>array([[[1, 1],
[1, 2],
[2, 3],
[1, 1],
[1, 2],
[2, 3]]])
np.tile([x],(2,))
>>>array([[[1, 1, 1, 1],
[1, 2, 1, 2],
[2, 3, 2, 3]]])
np.append(x,x,axis=0)
>>>array([[1, 1],
[1, 2],
[2, 3],
[1, 1],
[1, 2],
[2, 3]])
np.append([x],[x],axis=1)
>>>array([[[1, 1],
[1, 2],
[2, 3],
[1, 1],
[1, 2],
[2, 3]]])
np.array([[x],[x]])
>>>array([[[[1, 1],
[1, 2],
[2, 3]]],
[[[1, 1],
[1, 2],
[2, 3]]]])
(Some of these were just with n=2 as a goal)
It is worth noting that the ultimate end goal is to take x and y (a similarly crafted array of vectors of the same dimension, but not necessarily the same number of vectors):
y = np.array([[99,11], [23,44],[33,44], [2, 1], [9, 9]])
I then want to run the procedure on x so that the number of columns of the result equals the number of vectors in y, and run a similar procedure on y that repeats its vectors along the other axis.
After this transform, y would have the following:
yresult[0,0] = [99,11]
yresult[1,0] = [23,44]
yresult[2,0] = [33,44]
yresult[2,1] = [33,44]
This way I can subtract the two matrices. The goal is to create a matrix where the x vector's index is the row, the y vector's index is the column, and the element is the difference between these two vectors.
ultimateResult[0,1]=[1,1]-[23,44]=[-22,-43]
Perhaps there is a better way to get this.
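One possible approach, sketched here under the assumption that only the pairwise differences are needed, is to let NumPy broadcasting do the repetition implicitly instead of building the repeated arrays. The names diff and n are illustrative; xresult matches the array described above.

```python
import numpy as np

x = np.array([[1, 1], [1, 2], [2, 3]])                        # 3 vectors
y = np.array([[99, 11], [23, 44], [33, 44], [2, 1], [9, 9]])  # 5 vectors

# x[:, None, :] has shape (3, 1, 2) and y[None, :, :] has shape (1, 5, 2).
# Broadcasting stretches both to (3, 5, 2), so diff[i, j] == x[i] - y[j].
diff = x[:, None, :] - y[None, :, :]
print(diff[0, 1])   # x[0] - y[1]

# If the explicitly repeated array is still wanted, np.repeat on a new
# middle axis builds it (here with n = 4 repetitions).
n = 4
xresult = np.repeat(x[:, None, :], n, axis=1)   # shape (3, 4, 2)
```

The broadcast version never materialises the repeated copies of x and y, which matters when both arrays hold many vectors.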
I have a matrix_1 full of numerical values. I'd like to transform it into a matrix_2 holding the values of matrix_1 sorted along each row, and then replace each sorted value in matrix_2 with its original index in matrix_1.
I don't want to use any loops, as the matrices are rather large.
For example: matrix_1=[[2,3,4,1],[6,5,9,7]]
I want to end up with matrix_2=[[(1,4),(1,1),(1,2),(1,3)],
[(2,2),(2,1),(2,4),(2,3)]]
I've tried using np.ndenumerate on the original matrix, but it returns array([numpy.ndenumerate object at 0x1a1a9fce90], dtype=object)
I've now also tried np.argsort(), but it doesn't seem to work, possibly because all of my entries are floats...
You must come from R or another language that starts indexing at 1. In Python, indexes start at 0, so you have to explicitly add 1 to the indexes to make them start at 1.
Use argsort and then reshape (matrix_1 must be a NumPy array here):
m1 = matrix_1.argsort(1) + 1
i = (np.repeat(np.arange(m1.shape[0]), m1.shape[1]) + 1).reshape(m1.shape)
np.concatenate([m1[:, None],i[:, None]], axis=1).swapaxes(2,1)
which outputs
array([[[4, 1],
[1, 1],
[2, 1],
[3, 1]],
[[2, 2],
[1, 2],
[4, 2],
[3, 2]]])
Using np.argsort should do the trick:
matrix_1=np.array([[2,3,4,1],[6,5,9,7]])
matrix_1
array([[2, 3, 4, 1],
[6, 5, 9, 7]])
x = np.argsort(matrix_1,axis=1)
array([[3, 0, 1, 2],
[1, 0, 3, 2]], dtype=int64)
A matrix consisting of floats shouldn't pose a problem.
You can then create the list as:
[[(i+1, v+1) for v in y] for i, y in enumerate(x.tolist())]
[[(1, 4), (1, 1), (1, 2), (1, 3)], [(2, 2), (2, 1), (2, 4), (2, 3)]]
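argsort applied to the flattened array (arr1 is matrix_1 as a NumPy array; this matches the row-wise result here only because every value in the first row is smaller than every value in the second):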
In [110]: np.argsort(arr1.ravel())
Out[110]: array([3, 0, 1, 2, 5, 4, 7, 6])
Turn that into 2d indices:
In [111]: np.unravel_index(_,(2,4))
Out[111]: (array([0, 0, 0, 0, 1, 1, 1, 1]), array([3, 0, 1, 2, 1, 0, 3, 2]))
Combine the arrays into one, and reshape:
In [112]: np.transpose(_)
Out[112]:
array([[0, 3],
[0, 0],
[0, 1],
[0, 2],
[1, 1],
[1, 0],
[1, 3],
[1, 2]])
In [113]: _+1 # tweak values to match yours
Out[113]:
array([[1, 4],
[1, 1],
[1, 2],
[1, 3],
[2, 2],
[2, 1],
[2, 4],
[2, 3]])
In [114]: _.reshape(2,4,2)
Out[114]:
array([[[1, 4],
[1, 1],
[1, 2],
[1, 3]],
[[2, 2],
[2, 1],
[2, 4],
[2, 3]]])
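For matrices where the rows overlap in value, a row-wise variant may be safer than sorting the flattened array. The following is a minimal sketch combining a per-row argsort with broadcast row numbers; the names cols, rows and matrix_2 are illustrative, and float entries are used to show that they pose no problem for argsort.

```python
import numpy as np

matrix_1 = np.array([[2.0, 3.0, 4.0, 1.0],
                     [6.0, 5.0, 9.0, 7.0]])

# Column index of each value once every row is sorted independently.
cols = np.argsort(matrix_1, axis=1)

# Row index of every position, broadcast to the same shape as cols.
rows = np.broadcast_to(np.arange(matrix_1.shape[0])[:, None], cols.shape)

# Pair them up on a new last axis and shift to 1-based (row, column).
matrix_2 = np.stack([rows, cols], axis=-1) + 1
print(matrix_2)
```

The result has shape (2, 4, 2), with matrix_2[r, k] holding the 1-based (row, column) position of the k-th smallest value in row r.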
a = np.zeros((5,4,3))
v = np.ones((5, 4), dtype=int)
data = a[v]
shp = data.shape
This code gives shp==(5,4,4,3)
I don't understand why. How can the output be a larger array than the input? It makes no sense to me, and I would love an explanation.
This is known as advanced indexing. Advanced indexing allows you to select arbitrary elements in the input array based on an N-dimensional index.
Let's use another example to make it clearer:
a = np.random.randint(1, 5, (5,4,3))
v = np.ones((5, 4), dtype=int)
Say in this case a is:
array([[[2, 1, 1],
[3, 4, 4],
[4, 3, 2],
[2, 2, 2]],
[[4, 4, 1],
[3, 3, 4],
[3, 4, 2],
[1, 3, 1]],
[[3, 1, 3],
[4, 3, 1],
[2, 1, 4],
[1, 2, 2]],
...
By indexing with an array of np.ones:
print(v)
array([[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]])
You will simply be indexing a with 1 along the first axis once for every element of v. Put another way, when you do:
a[1]
[[4, 4, 1],
[3, 3, 4],
[3, 4, 2],
[1, 3, 1]]
You're indexing along the first axis, as no indexing is specified along the additional axes. It is the same as doing a[1, ...], i.e. taking a full slice along the remaining axes. Hence, by indexing with a 2D array of ones of shape (5, 4), you get the above 2D array a[1], of shape (4, 3), stacked 5*4=20 times, resulting in an ndarray of shape (5, 4, 4, 3).
Hence, in this case you'd be getting:
array([[[[4, 4, 1],
[3, 3, 4],
[3, 4, 2],
[1, 3, 1]],
[[4, 4, 1],
[3, 3, 4],
[3, 4, 2],
[1, 3, 1]],
...
the value of v is:
[[1 1 1 1]
[1 1 1 1]
[1 1 1 1]
[1 1 1 1]
[1 1 1 1]]
Every single 1 in v selects a complete "row" of a, but that "row" is itself a matrix. So every "row" in v selects a "row" of matrices from a.
(Does this make any sense to you?)
So you get 5 * 4 ones, each of which picks a 4*3 "matrix".
If, instead of zeros, you define a as a = np.arange(5*4*3).reshape((5, 4, 3)),
it might be easier to understand, because you get to see which parts of a are being chosen:
import numpy as np
a = np.arange(5*4*3).reshape((5, 4, 3))
v = np.ones((5,4), dtype=int)
data = a[v]
print(data)
(output is pretty long, I don't want to paste it here)
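The shape rule described above can be checked directly without printing the full output. This is a minimal sketch; the last step, using np.take_along_axis (NumPy 1.15 or later), is an assumption about what the original code may have been aiming for (one element of the last axis per (i, j) position) and is not confirmed by the question.

```python
import numpy as np

a = np.arange(5 * 4 * 3).reshape((5, 4, 3))
v = np.ones((5, 4), dtype=int)

data = a[v]

# An integer index array replaces the first axis of a with its own shape.
assert data.shape == v.shape + a.shape[1:]   # (5, 4) + (4, 3)

# Every (4, 3) block of the result is a copy of a[1].
assert (data == a[1]).all()

# Assumed alternative intent: pick a[i, j, v[i, j]] for every (i, j).
picked = np.take_along_axis(a, v[..., None], axis=2)[..., 0]
print(picked.shape)
```

With v holding only ones, picked is the same as a[:, :, 1].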
If I have a multidimensional array like this:
a = np.array([[9,9,9],[9,0,9],[9,9,9]])
I'd like to get an array of each index in that array, like so:
i = np.array([[0,0],[0,1],[0,2],[1,0],[1,1],...])
One way of doing this that I've found is like this, using np.indices:
i = np.transpose(np.indices(a.shape)).reshape(a.shape[0] * a.shape[1], 2)
But that seems somewhat clumsy, especially given the presence of np.nonzero which almost does what I want.
Is there a built-in numpy function that will produce an array of the indices of every item in a 2D numpy array?
Here is one more concise way (if the order is not important):
In [56]: np.indices(a.shape).T.reshape(a.size, 2)
Out[56]:
array([[0, 0],
[1, 0],
[2, 0],
[0, 1],
[1, 1],
[2, 1],
[0, 2],
[1, 2],
[2, 2]])
If you want it in your intended order you can use dstack:
In [46]: np.dstack(np.indices(a.shape)).reshape(a.size, 2)
Out[46]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
For the first approach if you don't want to use reshape another way is concatenation along the first axis using np.concatenate().
np.concatenate(np.indices(a.shape).T)
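Two further options, sketched here as alternatives rather than as the canonical answer, are np.ndindex and np.argwhere on an all-True mask. Both yield the indices in row-major order; the names i, j and expected are illustrative.

```python
import numpy as np

a = np.array([[9, 9, 9], [9, 0, 9], [9, 9, 9]])

# np.ndindex iterates over every index tuple in row-major (C) order.
# This runs a Python-level loop, so it suits small arrays.
i = np.array(list(np.ndindex(a.shape)))

# np.argwhere on an all-True mask of the same shape gives the same
# indices without an explicit Python loop.
j = np.argwhere(np.ones(a.shape, dtype=bool))

# Both agree with the dstack-based version shown earlier.
expected = np.dstack(np.indices(a.shape)).reshape(a.size, 2)
print(np.array_equal(i, expected), np.array_equal(j, expected))
```

The argwhere form is close to the np.nonzero idea mentioned in the question: it lists the indices of all elements of a mask that are True, and here the mask is True everywhere.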