How to duplicate rows in a feature matrix array [duplicate] - python

This question already has answers here:
Resizing and stretching a NumPy array
(3 answers)
Closed 2 years ago.
Suppose I have the following feature matrix X (i.e. with 4 rows and 3 features):
X = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
How do I duplicate rows so that, say, the 1st and 2nd rows each appear twice, the 3rd row appears 3 times, and the 4th row is not duplicated? I.e. I want to get this:
array([[ 1,  2,  3],
       [ 1,  2,  3],
       [ 4,  5,  6],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [ 7,  8,  9],
       [ 7,  8,  9],
       [10, 11, 12]])
Is there a way for me to multiply the features matrix X with an array of weights, say something like this:
np.array([2,2,3,1])
Thanks in advance.

You can do it through indexing:
repeats = np.array([2,2,3,1])
indices = np.repeat(range(len(X)), repeats)
X[indices]
Where indices is
array([0, 0, 1, 1, 2, 2, 2, 3])

np.repeat(X, [2, 2, 3, 1], axis=0)
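As a quick check, the two answers above can be run side by side. This is a minimal sketch using the X and repeat counts from the question; the names via_indexing and via_repeat are only illustrative.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
repeats = np.array([2, 2, 3, 1])  # one repeat count per row of X

# Approach 1: build the repeated row indices, then index X with them.
indices = np.repeat(np.arange(len(X)), repeats)
via_indexing = X[indices]

# Approach 2: repeat the rows directly along axis 0.
via_repeat = np.repeat(X, repeats, axis=0)

print(via_repeat.shape)                          # (8, 3), since 2 + 2 + 3 + 1 = 8
print(np.array_equal(via_indexing, via_repeat))  # True
```

Both give the same 8x3 array; np.repeat with axis=0 simply skips the intermediate index array.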

Related

numpy.roll horizontally on a 2D ndarray with different values

Doing np.roll(a, 1, axis = 1) on:
a = np.array([
[6, 3, 9, 2, 3],
[1, 7, 8, 1, 2],
[5, 4, 2, 2, 4],
[3, 9, 7, 6, 5],
])
results in the correct:
array([
[3, 6, 3, 9, 2],
[2, 1, 7, 8, 1],
[4, 5, 4, 2, 2],
[5, 3, 9, 7, 6]
])
The documentation says:
If a tuple, then axis must be a tuple of the same size, and each of the given axes is shifted by the corresponding number.
Now I would like to roll the rows of a by different values, like [1,2,1,3], meaning the first row will be rolled by 1, the second by 2, the third by 1 and the fourth by 3. But np.roll(a, [1,2,1,3], axis=(1,1,1,1)) doesn't seem to do it. What would be the correct interpretation of the sentence in the docs?
By specifying a tuple in np.roll you can roll an array along various axes. For example, np.roll(a, (3,2), axis=(0,1)) will shift each element of a by 3 places along axis 0, and it will also shift each element by 2 places along axis 1. np.roll does not have an option to roll each row by a different amount. You can do it though for example as follows:
import numpy as np
a = np.array([
[6, 3, 9, 2, 3],
[1, 7, 8, 1, 2],
[5, 4, 2, 2, 4],
[3, 9, 7, 6, 5],
])
shifts = np.c_[[1,2,1,3]]
a[np.c_[:a.shape[0]], (np.r_[:a.shape[1]] - shifts) % a.shape[1]]
It gives:
array([[3, 6, 3, 9, 2],
[1, 2, 1, 7, 8],
[4, 5, 4, 2, 2],
[7, 6, 5, 3, 9]])
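The same idea can also be written with np.take_along_axis (available in NumPy 1.15 and later, as far as I know). The sketch below contrasts the tuple form of np.roll with a per-row roll and cross-checks the result against a plain Python loop; the variable names are only illustrative.

```python
import numpy as np

a = np.array([
    [6, 3, 9, 2, 3],
    [1, 7, 8, 1, 2],
    [5, 4, 2, 2, 4],
    [3, 9, 7, 6, 5],
])
shifts = np.array([1, 2, 1, 3])  # right-shift to apply to each row

# Tuple form of np.roll: shift by 1 along axis 0 AND by 2 along axis 1.
# Every row receives the same column shift here.
rolled_both = np.roll(a, (1, 2), axis=(0, 1))

# Per-row roll: for each output column, compute which source column it comes from.
n_cols = a.shape[1]
source_cols = (np.arange(n_cols) - shifts[:, None]) % n_cols  # shape (4, 5)
rolled_rows = np.take_along_axis(a, source_cols, axis=1)

# Cross-check against rolling each row separately in a loop.
expected = np.stack([np.roll(row, s) for row, s in zip(a, shifts)])
print(np.array_equal(rolled_rows, expected))  # True
```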

Numpy/Torch : Selecting elements using indices over a dimension [duplicate]

This question already has answers here:
NumPy selecting specific column index per row by using a list of indexes
(7 answers)
Closed 1 year ago.
Given an array like below:
np.arange(12).reshape(4,3)
Out[119]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
I want to select a single element from each of the rows using a list of indices [0, 2, 1, 2] to create a 4x1 array of [0, 5, 7, 11].
Is there an easy way to do this indexing? The closest I could find was the gather method in PyTorch.
>>> import torch
>>> import numpy as np
>>> s = np.arange(12).reshape(4,3)
>>> s = torch.tensor(s)
>>> s
tensor([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
>>> idx = torch.tensor([0, 2, 1, 2])
>>> torch.gather(s,-1 ,idx.unsqueeze(-1))
tensor([[ 0],
[ 5],
[ 7],
[11]])
arr[[0,1,2,3], [0,2,1,2]]
or if you prefer np.arange(4) for the 1st indexing array.
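For reference, here is a minimal NumPy-only sketch of that indexing, assuming the array is called s as in the question. It also shows np.take_along_axis, which to my understanding plays a role similar to torch.gather and keeps the 4x1 shape.

```python
import numpy as np

s = np.arange(12).reshape(4, 3)
idx = np.array([0, 2, 1, 2])  # column to pick in each row

# Pair row i with column idx[i]; the result is 1-D with shape (4,).
picked = s[np.arange(s.shape[0]), idx]
print(picked)  # [ 0  5  7 11]

# Add a trailing axis if a 4x1 column is wanted.
picked_col = picked[:, None]

# take_along_axis expects the index array to have the same number of
# dimensions as s, much like idx.unsqueeze(-1) in the torch.gather call.
gathered = np.take_along_axis(s, idx[:, None], axis=1)
print(gathered.shape)  # (4, 1)
```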
Please try to run the following code.
import numpy as np
x = [[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]]
index_array = [0, 2, 1, 2]
index = 0
result = []
for item in x:
    # Pick the element of this row at the position given by index_array
    result.append(item[index_array[index]])
    index += 1
print(result)
Here is the result.
[0, 5, 7, 11]

Creating a block array from a given array [duplicate]

This question already has answers here:
Quick way to upsample numpy array by nearest neighbor tiling [duplicate]
(3 answers)
Closed 2 years ago.
I have a 2d array like this
A = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and I want to create an array, where every entry of the one above fills a whole block of the new array. I.e. if I want 2x2 blocks, I want my new array to look like this
B = np.array([[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[..., ..., ...,],
[..., 8, 8, 9, 9],
[..., 8, 8, 9, 9]])
I managed to do this by iterating over the arrays and creating a corresponding block for every entry, but I'm wondering if there's a better way to do this.
A.repeat(2, axis=1).repeat(2, axis=0)
First repeat the elements along the first axis to get:
array([[1, 1, 2, 2, 3, 3],
[4, 4, 5, 5, 6, 6],
[7, 7, 8, 8, 9, 9]])
Then repeat the elements along the zeroth axis to get:
array([[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[4, 4, 5, 5, 6, 6],
[4, 4, 5, 5, 6, 6],
[7, 7, 8, 8, 9, 9],
[7, 7, 8, 8, 9, 9]])
(The order of repetition axes doesn't matter.)
You can change 2s to the desired block size.
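As an illustration of changing the block size, the sketch below uses a hypothetical non-square block of 2 rows by 3 columns and compares the chained repeat with np.kron, which is another way to get the same tiling (an alternative I am adding here, not part of the answer above).

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
block = (2, 3)  # hypothetical block size: 2 rows by 3 columns

# Chained repeat: stretch the rows first, then the columns.
B_repeat = A.repeat(block[0], axis=0).repeat(block[1], axis=1)

# Kronecker product with a block of ones gives the same layout.
B_kron = np.kron(A, np.ones(block, dtype=A.dtype))

print(B_repeat.shape)                    # (6, 9)
print(np.array_equal(B_repeat, B_kron))  # True
```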

Indexing N dimensional matrix [duplicate]

This question already has answers here:
using an numpy array as indices of the 2nd dim of another array? [duplicate]
(2 answers)
Closed 2 years ago.
I have the following array:
import numpy as np
print(A)
array([[ 0, 1, 4, 5, 8, 7],
[ 5, 3, 4, 1, 8, 11],
[ 2, 7, 5, 3, 4, 1],
[ 2, 8, 8, 1, 10, 1],
[ 2, 14, 8, 6, 5, 3]])
And I need the values of A corresponding to these column indices, one per row:
b = np.array([5, 0, 3, 4, 4])
Expected output:
array([ 7, 5, 3, 10, 5])
Thanks in advance.
You can use advanced indexing. You need to define an indexing array across the first axis, so that both indexing arrays are broadcast together and each column index refers to a specific row. In this case you just want an np.arange to index on the rows:
A[np.arange(A.shape[0]), b]
# array([ 7, 5, 3, 10, 5])
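To see why the row index array is needed, the sketch below compares the paired indexing with A[:, b], which selects every listed column for every row. The names paired and cross are only illustrative.

```python
import numpy as np

A = np.array([[ 0,  1, 4, 5,  8,  7],
              [ 5,  3, 4, 1,  8, 11],
              [ 2,  7, 5, 3,  4,  1],
              [ 2,  8, 8, 1, 10,  1],
              [ 2, 14, 8, 6,  5,  3]])
b = np.array([5, 0, 3, 4, 4])

# Two index arrays of equal length are paired element by element:
# (row 0, col 5), (row 1, col 0), and so on.
rows = np.arange(A.shape[0])
paired = A[rows, b]
print(paired)  # [ 7  5  3 10  5]

# A slice on the first axis is not paired: each row gets all columns in b.
cross = A[:, b]
print(cross.shape)  # (5, 5)

# The wanted values sit on the diagonal of that larger result.
print(np.array_equal(np.diagonal(cross), paired))  # True
```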

Sum each row of a numpy array with all rows of second numpy array (python)

I would like to know if there is a fast way to sum each row of a first array with all rows of a second array. In this case both arrays have the same number of columns. For instance, if array1.shape = (n, c) and array2.shape = (m, c), the result would be an array3 with array3.shape = (n*m, c).
Look at the example below:
array1 = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
array2 = np.array([[0, 1, 2],
[3, 4, 5]])
The result would be:
array3 = np.array([[0, 2, 4],
                   [3, 5, 7],
                   [3, 5, 7],
                   [6, 8, 10],
                   [6, 8, 10],
                   [9, 11, 13]])
The only way I can see to do this is to repeat each row of one array as many times as there are rows in the other, for instance with np.repeat(array1, len(array2), axis=0), and then sum the result with array2. However, this is not very practical if the number of rows is large. The other way would be a for loop, but that is too slow.
Any other better way to do it..?
Thanks in advance.
Extend array1 to 3D so that it broadcasts against the 2D array2, perform the broadcasted addition, and then reshape to get the desired output:
In [30]: (array1[:,None,:] + array2).reshape(-1,array1.shape[1])
Out[30]:
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
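The intermediate shapes may make the broadcasting step easier to follow. This is a small sketch with the arrays from the question; the explicit repeat/tile version is included only as a cross-check and is not part of the answer above.

```python
import numpy as np

array1 = np.array([[0, 1, 2],
                   [3, 4, 5],
                   [6, 7, 8]])
array2 = np.array([[0, 1, 2],
                   [3, 4, 5]])

expanded = array1[:, None, :]  # shape (3, 1, 3): one "slot" per row of array1
summed = expanded + array2     # shape (3, 2, 3): every row pair added
array3 = summed.reshape(-1, array1.shape[1])  # shape (6, 3)

# Cross-check: materialize both operands explicitly, then add.
explicit = (np.repeat(array1, len(array2), axis=0)
            + np.tile(array2, (len(array1), 1)))

print(summed.shape)                      # (3, 2, 3)
print(np.array_equal(array3, explicit))  # True
```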
You could try the following inline code if you haven't already. It is simple to read, though for large arrays it will likely be slower than a vectorized NumPy operation, because the loop runs in Python.
>>> import numpy as np
>>> array1 = np.array([[0, 1, 2],
... [3, 4, 5],
... [6, 7, 8]])
>>>
>>> array2 = np.array([[0, 1, 2],
... [3, 4, 5]])
>>> array3 = np.array([i+j for i in array1 for j in array2])
>>> array3
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
>>>
If you are looking for a speed-up through threading, you could consider using CUDA or multithreading. This suggestion goes a bit beyond the scope of your question, but it gives you an idea of what can be done to speed up matrix operations.
