I have a NumPy array of floats with shape (x, 14), and I would like to append one more value to the end of each row (a different value for each row), so that the result has shape (x, 15).
Assume I already have those values in a list.
How can I do this with NumPy functions?
Define a 2d array and a list:
In [73]: arr = np.arange(12).reshape(4,3)
In [74]: arr
Out[74]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [75]: alist = [10,11,12,13]
Note their shapes:
In [76]: arr.shape
Out[76]: (4, 3)
In [77]: np.array(alist).shape
Out[77]: (4,)
To join alist to arr, it needs the same number of dimensions and the same number of rows. We can get that by adding a dimension with the None idiom:
In [78]: np.array(alist)[:,None].shape
Out[78]: (4, 1)
Now we can concatenate on the 2nd axis:
In [79]: np.concatenate((arr, np.array(alist)[:,None]),axis=1)
Out[79]:
array([[ 0, 1, 2, 10],
[ 3, 4, 5, 11],
[ 6, 7, 8, 12],
[ 9, 10, 11, 13]])
column_stack does the same thing, taking care that each input is at least 2d (I'd suggest reading its code.) In the long run you should be familiar enough with dimensions and shapes to do this with plain concatenate.
In [81]: np.column_stack((arr, alist))
Out[81]:
array([[ 0, 1, 2, 10],
[ 3, 4, 5, 11],
[ 6, 7, 8, 12],
[ 9, 10, 11, 13]])
np.c_ also does this - but note the use of [] instead of (). It's a clever use of indexing notation, convenient, but potentially confusing.
np.c_[arr, alist]
np.r_['-1,2,0', arr, alist] # for more clever obscurity
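Applied to the shapes from the question, the approaches above can be sketched as follows. This is only a sketch: the row count x = 5 and the names data and extra are placeholders chosen for illustration and do not come from the question.

```python
import numpy as np

x = 5                                   # assumed row count, for illustration only
data = np.zeros((x, 14))                # stands in for the (x, 14) float array
extra = [float(i) for i in range(x)]    # one new value per row

# Turn the list into an (x, 1) column, then join along axis 1.
col = np.asarray(extra)[:, None]
out_concat = np.concatenate((data, col), axis=1)

# column_stack and np.c_ promote the 1-D input to a column themselves.
out_stack = np.column_stack((data, extra))
out_c = np.c_[data, extra]

print(out_concat.shape)                 # (5, 15)
print(np.array_equal(out_concat, out_stack), np.array_equal(out_concat, out_c))
```

All three results should be identical; which one to use is mostly a matter of how explicit you want the reshaping to be.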
You can use the numpy.insert function (https://numpy.org/doc/stable/reference/generated/numpy.insert.html). The example below inserts a constant 0 as a new last column:
a = np.array([[1, 1], [2, 2], [3, 3]])
np.insert(a, 2, 0, axis=1)
Output:
array([[1, 1, 0],
[2, 2, 0],
[3, 3, 0]])
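Since the question asks for a different value in each row, here is a small sketch of the same call with a sequence instead of a scalar. The values in new_vals are made up for illustration.

```python
import numpy as np

a = np.array([[1, 1], [2, 2], [3, 3]])
new_vals = [7, 8, 9]                    # hypothetical per-row values

# Inserting at index a.shape[1] places the new column after the last one.
result = np.insert(a, a.shape[1], new_vals, axis=1)
print(result.tolist())                  # [[1, 1, 7], [2, 2, 8], [3, 3, 9]]
```

With a scalar position and a 1-D sequence of values, np.insert treats the sequence as one column, so no explicit reshape is needed here.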
I have a Matrix of indices I e.g.
I = np.array([[1, 0, 2], [2, 1, 0]])
The index at i-th row selects an element from another Matrix M in the i-th row.
So having M e.g.
M = np.array([[6, 7, 8], [9, 10, 11])
M[I] should select:
[[7, 6, 8], [11, 10, 9]]
I could have:
I1 = np.repeat(np.arange(0, I.shape[0]), I.shape[1])
I2 = np.ravel(I)
Result = M[I1, I2].reshape(I.shape)
but this looks very complicated and I am looking for a more elegant solution. Preferably without flattening and reshaping.
In the example I used numpy, but I am actually using jax. So if there is a more efficient solution in jax, feel free to share.
I had to add a ']' to M:
In [108]: I = np.array([[1, 0, 2], [2, 1, 0]])
...: M = np.array([[6, 7, 8], [9, 10, 11]])
...:
...: I,M
Out[108]:
(array([[1, 0, 2],
[2, 1, 0]]),
array([[ 6, 7, 8],
[ 9, 10, 11]]))
Advanced indexing with broadcasting:
In [110]: M[np.arange(2)[:,None],I]
Out[110]:
array([[ 7, 6, 8],
[11, 10, 9]])
The first index has shape (2,1), which broadcasts against the (2,3) shape of I to select a (2,3) block of values.
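The broadcasting of the two index arrays can be checked directly. The sketch below builds the row index from M.shape[0] instead of hard-coding 2; the variable name rows is introduced here for illustration.

```python
import numpy as np

I = np.array([[1, 0, 2], [2, 1, 0]])
M = np.array([[6, 7, 8], [9, 10, 11]])

# Row indices as a column: shape (n_rows, 1).
rows = np.arange(M.shape[0])[:, None]

# The two index arrays broadcast to the shape of I ...
print(np.broadcast(rows, I).shape)      # (2, 3)

# ... so element [r, c] of the result is M[rows[r, 0], I[r, c]] = M[r, I[r, c]].
result = M[rows, I]
print(result.tolist())                  # [[7, 6, 8], [11, 10, 9]]
```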
How about this one-liner? The idea is to enumerate the rows of M together with their row numbers, so that each row can be indexed with the matching row of the index matrix.
import numpy as np
I = np.array([[1, 0, 2], [2, 1, 0]])
M = np.array([[6, 7, 8], [9, 10, 11]])
Result = np.array([row[I[i]] for i, row in enumerate(M)])
print(Result)
Output:
[[ 7 6 8]
[11 10 9]]
np.take_along_axis can also be used here to take values of M using indices I over axis=1:
>>> np.take_along_axis(M, I, axis=1)
array([[ 7, 6, 8],
[11, 10, 9]])
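As a quick sanity check, the approaches suggested so far can be compared on a non-square example. This is a sketch with made-up random data (the seed and the shapes are arbitrary choices), not a benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)                  # fixed seed, for reproducibility only
M = rng.integers(0, 100, size=(4, 6))           # made-up data
I = rng.integers(0, M.shape[1], size=(4, 3))    # column indices, one row per row of M

via_take = np.take_along_axis(M, I, axis=1)
via_broadcast = M[np.arange(M.shape[0])[:, None], I]
via_loop = np.array([row[idx] for row, idx in zip(M, I)])

print(via_take.shape)                           # (4, 3)
print(np.array_equal(via_take, via_broadcast), np.array_equal(via_take, via_loop))
```

Note that I does not need as many columns as M; it only needs the same number of rows.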
I would like to reduce a NumPy matrix using the vector u and the numpy.compress() method, first going across the rows and then columns. Now my code looks like this:
import numpy as np
n = 4  # number of rows/columns
square_matrix = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
u = np.array([1,0,1,0])
v = []
for i in range(n):
    v.append(np.compress(u,square_matrix[i]))
print(v)
I get the following output:
[array([1, 3]), array([5, 7]), array([ 9, 11]), array([13, 15])]
I have two questions:
How can I create a matrix from the output again?
How can I repeat the same process for the columns? My initial idea was to use a transpose of u, something like this:
for j in range((len(v_matrix[0])-1)):
    w.append(np.compress(u.transpose(),v_matrix[:][j]))
compress, like many numpy functions, takes an axis parameter:
In [166]: np.compress(u,square_matrix, axis=1)
Out[166]:
array([[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]])
In [167]: np.compress(u,square_matrix, axis=0)
Out[167]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
Applying the compression sequentially:
In [168]: np.compress(u,np.compress(u,square_matrix, axis=0),axis=1)
Out[168]:
array([[ 1, 3],
[ 9, 11]])
I didn't realize np.compress existed, though from the source file it must have been there from the start. Boolean indexing is the same, and more common.
In [169]: bu = u.astype(bool)
In [170]: square_matrix[bu,:]
Out[170]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
In [171]: square_matrix[:,bu]
Out[171]:
array([[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]])
Boolean indexing is equivalent to indexing with the nonzero result:
In [177]: np.nonzero(u)
Out[177]: (array([0, 2]),)
In [178]: idx = np.nonzero(u)[0]
In [179]: square_matrix[idx,:]
Out[179]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
And that can be applied to both dimensions at once with:
In [180]: square_matrix[idx[:,None],idx]
Out[180]:
array([[ 1, 3],
[ 9, 11]])
Without the reshaping (making the first a column), we get the diagonal:
In [181]: square_matrix[idx,idx]
Out[181]: array([ 1, 11])
and using the ix_ utility:
In [185]: np.ix_(bu,bu)
Out[185]:
(array([[0],
[2]]),
array([[0, 2]]))
In [186]: square_matrix[np.ix_(bu,bu)]
Out[186]:
array([[ 1, 3],
[ 9, 11]])
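The ix_ idiom can be wrapped in a small helper. The function name submatrix is hypothetical and used only for this sketch, which also checks the result against the nested compress calls shown earlier.

```python
import numpy as np

def submatrix(mat, mask):
    """Keep the rows and columns of `mat` where `mask` is nonzero (hypothetical helper)."""
    keep = np.asarray(mask, dtype=bool)
    return mat[np.ix_(keep, keep)]

square_matrix = np.arange(1, 17).reshape(4, 4)
u = np.array([1, 0, 1, 0])

via_ix = submatrix(square_matrix, u)
via_compress = np.compress(u, np.compress(u, square_matrix, axis=0), axis=1)

print(via_ix.tolist())                  # [[1, 3], [9, 11]]
print(np.array_equal(via_ix, via_compress))
```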
How can I now create a matrix from the output again.
You can do the operation in a vectorized manner: just specify the axis keyword for np.compress.
np.compress(u, square_matrix, axis=1)
output:
array([[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]])
How could I repeat the same process for the columns. (My initial idea was to use a transpose of u)
Your idea is close, but transpose the matrix rather than u (u is 1-D, so transposing it has no effect). Transposing the matrix swaps its rows and columns.
np.compress(u, square_matrix.T, axis=1)
output:
array([[ 1, 9],
[ 2, 10],
[ 3, 11],
[ 4, 12]])
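A short check of how the transposed call relates to passing axis=0 directly. This sketch only reuses the question's data; the names rows_kept and cols_of_T are introduced for illustration.

```python
import numpy as np

square_matrix = np.arange(1, 17).reshape(4, 4)
u = np.array([1, 0, 1, 0])

# A 1-D array has a single axis, so its transpose has the same shape.
print(u.T.shape)                        # (4,)

# Selecting rows directly ...
rows_kept = np.compress(u, square_matrix, axis=0)
# ... is the transpose of selecting columns of the transposed matrix.
cols_of_T = np.compress(u, square_matrix.T, axis=1)
print(np.array_equal(rows_kept, cols_of_T.T))
```

So the transposed call gives the selected rows laid out as columns; transpose the result back, or use axis=0, if the original orientation is wanted.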
I would like to know if there is a fast way to add each row of a first array to every row of a second array. Both arrays have the same number of columns. For instance, if array1.shape = (n, c) and array2.shape = (m, c), the resulting array would have array3.shape = (n*m, c).
Look at the example below:
array1 = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
array2 = np.array([[0, 1, 2],
[3, 4, 5]])
The result would be:
array3 = np.array([[0, 2, 4],
[3, 5, 7],
[3, 5, 7],
[6, 8, 10],
[6, 8, 10],
[9, 11, 13]])
The only way I can see to do this is to repeat each row of one array once for every row of the other array, for instance with np.repeat(array1, len(array2), axis=0), and then add array2 to the result. However, this is not very practical if the number of rows is large. The other way would be a for loop, but that is too slow.
Is there a better way to do it?
Thanks in advance.
Extend array1 to 3D so that it broadcasts against the 2D array2, perform the broadcasted addition, and finally reshape to get the desired output:
In [30]: (array1[:,None,:] + array2).reshape(-1,array1.shape[1])
Out[30]:
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
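The intermediate shapes are easier to follow when spelled out. The sketch below splits the one-liner into steps and cross-checks it against the repeat-based idea from the question; the names pairwise and expected are introduced here for illustration.

```python
import numpy as np

array1 = np.arange(9).reshape(3, 3)     # shape (n, c) = (3, 3)
array2 = np.arange(6).reshape(2, 3)     # shape (m, c) = (2, 3)

# (n, 1, c) + (m, c) broadcasts to (n, m, c): entry [i, j] is array1[i] + array2[j].
pairwise = array1[:, None, :] + array2
print(pairwise.shape)                   # (3, 2, 3)

# Merging the first two axes gives the (n*m, c) layout from the question.
array3 = pairwise.reshape(-1, array1.shape[1])

# Cross-check against the repeat-based idea mentioned in the question.
expected = np.repeat(array1, len(array2), axis=0) + np.tile(array2, (len(array1), 1))
print(np.array_equal(array3, expected))
```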
You could try the following list comprehension if you haven't already. It is probably the simplest to read, although for large arrays the broadcasting approach will usually be faster, because the loop here runs in Python.
>>> import numpy as np
>>> array1 = np.array([[0, 1, 2],
... [3, 4, 5],
... [6, 7, 8]])
>>>
>>> array2 = np.array([[0, 1, 2],
... [3, 4, 5]])
>>> array3 = np.array([i+j for i in array1 for j in array2])
>>> array3
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
>>>
If you are looking for a speed-up through threading, you could consider using CUDA or multithreading. This goes a bit beyond the scope of your question, but it gives you an idea of what can be done to speed up matrix operations.
I am looking for a way to apply a function n items at a time along an axis. E.g.
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8]])
If I apply sum across the rows 2 items at a time I get:
array([[ 4, 6],
[ 12, 14]])
This is the sum of the first 2 rows and the sum of the last 2 rows.
NB: I am dealing with a much larger array, and I have to apply the function to n items at a time, where n is decided at runtime.
The data may also extend along a different axis. E.g.
array([[... [ 1, 2, ...],
[ 3, 4, ...],
[ 5, 6, ...],
[ 7, 8, ...],
...], ...])
This is a reduction:
>>> numpy.add.reduceat(a, [0,2])
array([[ 4, 6],
[12, 14]], dtype=int32)
As long as by "larger" you mean longer in the "y" axis, you can extend:
>>> a = numpy.array([[ 1, 2],
... [ 3, 4],
... [ 5, 6],
... [ 7, 8],
... [ 9, 10],
... [11, 12]])
>>> numpy.add.reduceat(a, [0,2,4])
array([[ 4, 6],
[12, 14],
[20, 22]], dtype=int32)
EDIT: actually, this works fine for "larger in both dimensions", too:
>>> a = numpy.arange(24).reshape(6,4)
>>> numpy.add.reduceat(a, [0,2,4])
array([[ 4, 6, 8, 10],
[20, 22, 24, 26],
[36, 38, 40, 42]], dtype=int32)
I will leave it up to you to adapt the indices to your specific case.
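One way to adapt the indices, sketched under the assumption that a new group starts every n rows: build the start offsets with np.arange. The helper name sum_every_n_rows is made up for this illustration.

```python
import numpy as np

def sum_every_n_rows(a, n):
    """Sum consecutive groups of n rows with add.reduceat (hypothetical helper)."""
    starts = np.arange(0, a.shape[0], n)    # 0, n, 2n, ...
    return np.add.reduceat(a, starts, axis=0)

a = np.arange(24).reshape(6, 4)
print(sum_every_n_rows(a, 2).tolist())  # [[4, 6, 8, 10], [20, 22, 24, 26], [36, 38, 40, 42]]
print(sum_every_n_rows(a, 3).tolist())  # [[12, 15, 18, 21], [48, 51, 54, 57]]
```

If the number of rows is not a multiple of n, the last group is simply shorter. The axis argument of reduceat can be changed when the data extends along another axis.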
Reshape so that the first axis is split into two axes, the second of length n, which gives a 3D array, and then sum along that second axis:
a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
It should be pretty efficient, as reshaping a contiguous array just creates a view into the input array.
Sample run -
In [55]: a
Out[55]:
array([[2, 8, 0, 0],
[1, 5, 3, 3],
[6, 1, 4, 7],
[0, 4, 0, 7],
[8, 0, 8, 1],
[8, 3, 3, 8]])
In [56]: n = 2 # Sum every two rows
In [57]: a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
Out[57]:
array([[ 3, 13, 3, 3],
[ 6, 5, 4, 14],
[16, 3, 11, 9]])
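Because the question asks about applying a function in general, the reshape trick can be sketched with the reduction passed in as a parameter. The helper name block_reduce_rows and the divisibility check are assumptions added for this illustration.

```python
import numpy as np

def block_reduce_rows(a, n, func=np.sum):
    """Apply `func` to consecutive groups of n rows (hypothetical helper)."""
    rows, cols = a.shape
    if rows % n != 0:
        raise ValueError("number of rows must be a multiple of n")
    # (rows, cols) -> (rows // n, n, cols); reduce over the middle axis.
    return func(a.reshape(rows // n, n, cols), axis=1)

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
print(block_reduce_rows(a, 2).tolist())               # [[4, 6], [12, 14]]
print(block_reduce_rows(a, 2, func=np.max).tolist())  # [[3, 4], [7, 8]]
```

Any function that accepts an axis argument, such as np.mean or np.min, should work the same way.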
How about something like this?
n = 2
# calculate the cumsum along axis 0 and take one row from every n rows
cumarr = arr.cumsum(axis = 0)[(n-1)::n]
# calculate the difference of the resulting numpy array along axis 0
np.vstack((cumarr[0][None, :], np.diff(cumarr, axis=0)))
# array([[ 4, 6],
# [12, 14]])
I am trying to understand how numpy combines slicing and indexing. I am not sure how to derive the results below by hand, so I would like to understand how numpy processes a combined slice and index, and which of the two is processed first:
>>> import numpy as np
>>> a=np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> i=np.array([[0,1],[2,2]])
>>> a[i,:]
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 8, 9, 10, 11]]])
>>> j=np.array([[2,1],[3,3]])
>>> a[:,j]
array([[[ 2, 1],
[ 3, 3]],
[[ 6, 5],
[ 7, 7]],
[[10, 9],
[11, 11]]])
>>> aj=a[:,j]
>>> aj.shape
(3L, 2L, 2L)
I am a bit confused about how aj's shape becomes (3,2,2) given the above output. Any detailed explanation would be much appreciated, thanks!
Whenever you use an array of indices, the result has the same shape as the indices; for example:
>>> x = np.ones(5)
>>> i = np.array([[0, 1], [1, 0]])
>>> x[i]
array([[ 1., 1.],
[ 1., 1.]])
We've indexed with a 2x2 array, and the result is a 2x2 array.
When combined with a slice, the size of the slice is preserved. For example:
>>> x = np.ones((5, 3))
>>> x[i, :].shape
(2, 2, 3)
Where the first example was a 2x2 array of items, this example is a 2x2 array of (length-3) rows.
The same is true when you switch the order of the slice:
>>> x = np.ones((5, 3))
>>> x[:, i].shape
(5, 2, 2)
This can be thought of as a list of five 2x2 arrays.
Just remember: when a dimension is indexed with a list or array, that dimension is replaced by the shape of the indices, while sliced dimensions keep their size.
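Applied to the arrays from the question, the rule can be checked like this. It is a small sketch; the positions r, p, q are picked arbitrarily for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
j = np.array([[2, 1], [3, 3]])

aj = a[:, j]
# The slice keeps axis 0 (length 3); the index array replaces axis 1 with its own shape (2, 2).
print(aj.shape == a.shape[:1] + j.shape)     # True

# Element-wise, aj[r, p, q] is a[r, j[p, q]].
r, p, q = 1, 0, 1
print(aj[r, p, q], a[r, j[p, q]])            # 5 5
```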
a[:,j][0] is equivalent to a[0,j], i.e. np.array([0, 1, 2, 3])[j], which gives you [[2, 1], [3, 3]]
a[:,j][1] is equivalent to a[1,j], i.e. np.array([4, 5, 6, 7])[j], which gives you [[6, 5], [7, 7]]
a[:,j][2] is equivalent to a[2,j], i.e. np.array([8, 9, 10, 11])[j], which gives you [[10, 9], [11, 11]]