Given an array like below:
np.arange(12).reshape(4,3)
Out[119]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
I want to select a single element from each of the rows using a list of indices [0, 2, 1, 2] to create a 4x1 array of [0, 5, 7, 11].
Is there an easy way to do this indexing? The closest I could find was the gather method in PyTorch.
>>> import torch
>>> import numpy as np
>>> s = np.arange(12).reshape(4,3)
>>> s = torch.tensor(s)
>>> s
tensor([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
>>> idx = torch.tensor([0, 2, 1, 2])
>>> torch.gather(s,-1 ,idx.unsqueeze(-1))
tensor([[ 0],
[ 5],
[ 7],
[11]])
arr[[0,1,2,3], [0,2,1,2]]
or, if you prefer, use np.arange(4) as the first indexing array.
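As a minimal sketch of the indexing above (the names arr, idx, picked and picked_col are only illustrative), the row numbers can be generated with np.arange, and np.take_along_axis offers a NumPy counterpart to torch.gather that keeps the 4x1 shape:

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)
idx = np.array([0, 2, 1, 2])

# Pair every row number with its chosen column index (result is 1-D)
picked = arr[np.arange(arr.shape[0]), idx]

# take_along_axis needs idx with the same number of dimensions as arr,
# so add a trailing axis; the result keeps shape (4, 1)
picked_col = np.take_along_axis(arr, idx[:, None], axis=1)

print(picked)
print(picked_col.shape)
```

The first form is usually enough; the second is handy when the column shape has to be preserved.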
Please try to run the following code.
import numpy as np
x = [[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]]
index_array = [0, 2, 1, 2]
result = []
# Walk the rows and their column indices together
for row, col in zip(x, index_array):
    result.append(row[col])
print(result)
Here is the result.
[0, 5, 7, 11]
Let a = np.arange(1, 4).
To get the 2 dimensional multiplication table for a, I do:
>>> a * a[:, None]
array([[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]])
For 3 dimensions, I can do the following:
>>> a * a[:, None] * a[:, None, None]
array([[[ 1,  2,  3],
        [ 2,  4,  6],
        [ 3,  6,  9]],
       [[ 2,  4,  6],
        [ 4,  8, 12],
        [ 6, 12, 18]],
       [[ 3,  6,  9],
        [ 6, 12, 18],
        [ 9, 18, 27]]])
How could I write a function that takes a numpy array a and a number of dimensions n as input and outputs the n-dimensional multiplication table for a?
This should do what you need:
import itertools
import numpy as np
a = np.arange(1, 4)
n = 3
def f(x, y):
    # Append a trailing axis to x so that it broadcasts against y
    return np.expand_dims(x, len(x.shape)) * y
l = list(itertools.accumulate(np.repeat(np.atleast_2d(a), n, axis=0), f))[-1]
Just change n to whatever number of dimensions you need.
First, use numpy.expand_dims() in a generator expression to add as many axes to the array as each factor needs, then multiply the factors together with math.prod (Python 3.8+). The implementation looks like this:
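from math import prod
import numpy as np
def n_dim_multiplication(arr, num_dims):
    # The idx-th factor gets idx new axes after axis 0, so all factors broadcast together
    gen_arr = (np.expand_dims(arr, axis=tuple(range(1, idx+1))) for idx in range(num_dims))
    return prod(gen_arr)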
Sample run for the 3 dimensional case:
# input array
In [83]: a = np.arange(1, 4)
# desired number of dimensions
In [84]: num_dims = 3
In [85]: n_dim_multiplication(a, num_dims)
Out[85]:
array([[[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9]],
[[ 2, 4, 6],
[ 4, 8, 12],
[ 6, 12, 18]],
[[ 3, 6, 9],
[ 6, 12, 18],
[ 9, 18, 27]]])
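Another way to express the same idea, offered only as a sketch and not taken from the answers above, is to fold np.multiply.outer over n copies of the array with functools.reduce. The function name multiplication_table is hypothetical:

```python
from functools import reduce

import numpy as np

def multiplication_table(a, n):
    """Return the n-dimensional multiplication table of the 1-D array a.

    Assumes n >= 1. Each outer product appends one new axis.
    """
    a = np.asarray(a)
    return reduce(np.multiply.outer, [a] * n)

a = np.arange(1, 4)
table = multiplication_table(a, 3)
print(table.shape)
```

Because every factor is the same array, the order in which axes are added does not change the result.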
I would like to know if there is a fast way to add each row of a first array to every row of a second array. Both arrays have the same number of columns. For instance, if array1.shape = (n, c) and array2.shape = (m, c), the result would have array3.shape = (n*m, c).
Look at the example below:
array1 = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
array2 = np.array([[0, 1, 2],
[3, 4, 5]])
The result would be:
array3 = np.array([[0, 2, 4],
                   [3, 5, 7],
                   [3, 5, 7],
                   [6, 8, 10],
                   [6, 8, 10],
                   [9, 11, 13]])
The only way I can see to do this is to repeat each row of one array as many times as the other array has rows, for instance with np.repeat(array1, len(array2), axis=0), and then add array2 to the result. This is not very practical if the number of rows is large. The other way would be a for loop, but that is too slow.
Is there a better way to do it?
Thanks in advance.
Extend array1 to 3D so that it broadcasts against the 2D array2, perform the broadcasted addition, and then reshape to get the desired output:
In [30]: (array1[:,None,:] + array2).reshape(-1,array1.shape[1])
Out[30]:
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
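To make the shapes in that one-liner explicit, here is a small sketch using the arrays from the question (the intermediate name pairwise is only illustrative):

```python
import numpy as np

array1 = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])  # shape (n, c) = (3, 3)
array2 = np.array([[0, 1, 2], [3, 4, 5]])             # shape (m, c) = (2, 3)

# (3, 1, 3) + (2, 3) broadcasts to (3, 2, 3): one sum per (row1, row2) pair
pairwise = array1[:, None, :] + array2

# Collapse the first two axes into n*m rows
array3 = pairwise.reshape(-1, array1.shape[1])
print(pairwise.shape, array3.shape)
```

Note that the intermediate array still holds n*m*c elements, so memory use grows with the product of the row counts.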
You could try the following list comprehension if you haven't already. It is simple to read, although for large arrays it will generally be slower than a vectorized NumPy expression.
>>> import numpy as np
>>> array1 = np.array([[0, 1, 2],
... [3, 4, 5],
... [6, 7, 8]])
>>>
>>> array2 = np.array([[0, 1, 2],
... [3, 4, 5]])
>>> array3 = np.array([i+j for i in array1 for j in array2])
>>> array3
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
>>>
If you are looking for a speedup through parallelism, you could consider using CUDA or multithreading. This suggestion goes a bit beyond the scope of your question, but it gives you an idea of what can be done to speed up matrix operations.
I have a NumPy array of floats with shape (x, 14), and I would like to append one more value to the end of each row (a different value for each row), so that the result has shape (x, 15).
Assume I already have those values in a list.
How can I do this with NumPy functions?
Define a 2d array and a list:
In [73]: arr = np.arange(12).reshape(4,3)
In [74]: arr
Out[74]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [75]: alist = [10,11,12,13]
Note their shapes:
In [76]: arr.shape
Out[76]: (4, 3)
In [77]: np.array(alist).shape
Out[77]: (4,)
To join alist to arr it needs to have the same number of dimensions, and same number of 'rows'. We can do that by adding a dimension with the None idiom:
In [78]: np.array(alist)[:,None].shape
Out[78]: (4, 1)
Now we can concatenate on the 2nd axis:
In [79]: np.concatenate((arr, np.array(alist)[:,None]),axis=1)
Out[79]:
array([[ 0, 1, 2, 10],
[ 3, 4, 5, 11],
[ 6, 7, 8, 12],
[ 9, 10, 11, 13]])
column_stack does the same thing, taking care that each input is at least 2d (I'd suggest reading its code.) In the long run you should be familiar enough with dimensions and shapes to do this with plain concatenate.
In [81]: np.column_stack((arr, alist))
Out[81]:
array([[ 0, 1, 2, 10],
[ 3, 4, 5, 11],
[ 6, 7, 8, 12],
[ 9, 10, 11, 13]])
np.c_ also does this - but note the use of [] instead of (). It's a clever use of indexing notation, convenient, but potentially confusing.
np.c_[arr, alist]
np.r_['-1,2,0', arr, alist] # for more clever obscurity
You can use the numpy.insert function (https://numpy.org/doc/stable/reference/generated/numpy.insert.html):
a = np.array([[1, 1], [2, 2], [3, 3]])
np.insert(a, 2, 0, axis=1)
Output:
array([[1, 1, 0],
[2, 2, 0],
[3, 3, 0]])
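The example above inserts the same value into every row. A minimal sketch of the per-row case asked about in the question, assuming the new values live in a list called new_col (a hypothetical name), could look like this:

```python
import numpy as np

a = np.array([[1, 1], [2, 2], [3, 3]])
new_col = [10, 11, 12]  # hypothetical values, one per row

# Inserting before index a.shape[1] places the new column after the last one
result = np.insert(a, a.shape[1], new_col, axis=1)
print(result)
```

With a scalar index and a 1-D sequence of values, numpy.insert places one value in each row along axis=1.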
I am looking for a way to apply a function n items at a time along an axis. E.g.
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8]])
If I apply sum across the rows 2 items at a time I get:
array([[ 4, 6],
[ 12, 14]])
Which is the sum of 1st 2 rows and the last 2 rows.
NB: I am dealing with a much larger array, and n, the number of items the function is applied to, is only decided at runtime.
The data extends along different axis. E.g.
array([[... [ 1, 2, ...],
[ 3, 4, ...],
[ 5, 6, ...],
[ 7, 8, ...],
...], ...])
This is a reduction:
numpy.add.reduceat(a, [0,2])
>>> array([[ 4, 6],
[12, 14]], dtype=int32)
As long as by "larger" you mean longer in the "y" axis, you can extend:
a = numpy.array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
numpy.add.reduceat(a, [0,2,4])
>>> array([[ 4, 6],
[12, 14],
[20, 22]], dtype=int32)
EDIT: actually, this works fine for "larger in both dimensions", too:
a = numpy.arange(24).reshape(6,4)
numpy.add.reduceat(a, [0,2,4])
>>> array([[ 4, 6, 8, 10],
[20, 22, 24, 26],
[36, 38, 40, 42]], dtype=int32)
I will leave it up to you to adapt the indices to your specific case.
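One possible way to adapt the indices, shown here as a sketch, is to build the group start positions with np.arange and pass the axis to reduceat. The helper name sum_every_n is hypothetical:

```python
import numpy as np

def sum_every_n(a, n, axis=0):
    """Sum groups of n consecutive items along axis using add.reduceat."""
    # Start index of each group: 0, n, 2n, ...
    starts = np.arange(0, a.shape[axis], n)
    return np.add.reduceat(a, starts, axis=axis)

a = np.arange(24).reshape(6, 4)
out = sum_every_n(a, 2)
print(out)
```

If the axis length is not a multiple of n, the last group is simply shorter, since reduceat sums from each start index up to the next one or the end of the axis.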
Reshape to split the first axis into two axes, the second of length n, which gives a 3D array; then sum along that second axis, like so -
a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
It should be pretty efficient, since reshaping a contiguous array just creates a view into the input array.
Sample run -
In [55]: a
Out[55]:
array([[2, 8, 0, 0],
[1, 5, 3, 3],
[6, 1, 4, 7],
[0, 4, 0, 7],
[8, 0, 8, 1],
[8, 3, 3, 8]])
In [56]: n = 2 # Sum every two rows
In [57]: a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
Out[57]:
array([[ 3, 13, 3, 3],
[ 6, 5, 4, 14],
[16, 3, 11, 9]])
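Since the question mentions data that extends along a different axis, here is a sketch that generalizes the reshape approach to an arbitrary axis. The helper blockwise_sum is hypothetical, and it assumes the axis length is an exact multiple of n:

```python
import numpy as np

def blockwise_sum(a, n, axis=0):
    """Sum every n consecutive items along axis via reshape."""
    a = np.asarray(a)
    axis = axis % a.ndim  # allow negative axes
    length = a.shape[axis]
    if length % n != 0:
        raise ValueError("axis length must be a multiple of n")
    # Split the chosen axis into (length // n, n) and sum over the new length-n axis
    new_shape = a.shape[:axis] + (length // n, n) + a.shape[axis + 1:]
    return a.reshape(new_shape).sum(axis=axis + 1)

a = np.arange(1, 9).reshape(4, 2)
print(blockwise_sum(a, 2))
```

Replacing .sum with another reduction such as .mean or .max follows the same pattern.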
How about something like this?
n = 2
# calculate the cumsum along axis 0 and take one row from every n rows
cumarr = arr.cumsum(axis = 0)[(n-1)::n]
# calculate the difference of the resulting numpy array along axis 0
np.vstack((cumarr[0][None, :], np.diff(cumarr, axis=0)))
# array([[ 4, 6],
# [12, 14]])