Convert one-dimensional array to two-dimensional array so that each element is a row in the result - python

I want to know how to convert this: array([0, 1, 2, 3, 4, 5]) to this:
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5]])
In short, given a flat array, repeat each element n times so that each element becomes one row of the result, containing n copies of that element.
I can do this:
def repeat(lst, n):
    return [[e]*n for e in lst]
>>> repeat(range(10), 4)
[[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5],
[6, 6, 6, 6],
[7, 7, 7, 7],
[8, 8, 8, 8],
[9, 9, 9, 9]]
How to do this in NumPy?

You can use numpy's repeat like this:
np.repeat(range(10), 4).reshape(10,4)
which gives:
[[0 0 0 0]
[1 1 1 1]
[2 2 2 2]
[3 3 3 3]
[4 4 4 4]
[5 5 5 5]
[6 6 6 6]
[7 7 7 7]
[8 8 8 8]
[9 9 9 9]]
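The reshape above hard-codes the number of rows. A minimal sketch of a variant that works for any input length, assuming a 1-D input `a` and a repeat count `n` (both names are illustrative, not from the answer):

```python
import numpy as np

a = np.arange(6)   # illustrative 1-D input
n = 3              # illustrative repeat count

# Repeat each element n times, then let reshape infer the row count with -1.
out = np.repeat(a, n).reshape(-1, n)

# Equivalent: turn a into a column first, then repeat along axis 1.
out_alt = np.repeat(a[:, None], n, axis=1)

print(out)
```

Both forms should give one row per element of `a`, so the row count never needs to be spelled out.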

You can use tile, which handles the dimensions for you:
a = np.array([0, 1, 2, 3, 4, 5])
N = 4
np.tile(a[:,None], (1, N))
# or
np.tile(a, (N, 1)).T
or broadcast_to:
np.broadcast_to(a, (N, a.shape[0])).T
# or
np.broadcast_to(a[:,None], (a.shape[0], N))
Or multiply by an array of ones:
a[:,None]*np.ones(N, dtype=a.dtype)
output:
array([[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5]])
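One practical difference between these options, shown as a hedged sketch: np.broadcast_to returns a read-only view that shares memory with a, while np.tile returns a new writable array. The variable names `tiled`, `view` and `writable` are introduced here for illustration only.

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4, 5])
N = 4

tiled = np.tile(a[:, None], (1, N))                   # new, writable array
view = np.broadcast_to(a[:, None], (a.shape[0], N))   # read-only view of a

print(np.array_equal(tiled, view))   # the values are the same either way
print(view.flags.writeable)          # the broadcast view is not writable

writable = view.copy()               # independent copy that can be modified
writable[0, 0] = 99                  # does not touch a
```

If the result is only read, the broadcast view avoids allocating the full array; call .copy() when you need to modify it.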

Related

How to generate values from a diagonal to fill matrix

I have the following diagonal matrix
a = array([[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])
And the desired outcome is the following:
array([[1, 3, 4, 5],
[3, 2, 5, 6],
[4, 5, 3, 7],
[5, 6, 7, 4]])
Each off-diagonal element at position (i, j) is the sum of the i-th and j-th diagonal entries; the diagonal itself is unchanged.
Thanks a lot
Try:
>>> np.diag(a) + np.diag(a)[:, None] - a
array([[1, 3, 4, 5],
[3, 2, 5, 6],
[4, 5, 3, 7],
[5, 6, 7, 4]])
Addendum
What if a is a DataFrame?
Then: np.diag(a) + np.diag(a)[:, None] - a is also a DataFrame (with same index and columns as a).
What if a is a numpy array, but I want a DataFrame result?
Then use: pd.DataFrame(...) instead.
You can use:
# get diagonal
diag = np.diag(a)
# outer sum
out = diag+diag[:,None]
# or
# out = np.add.outer(diag, diag)
# reset diagonal
np.fill_diagonal(out, diag)
print(out)
output:
[[1 3 4 5]
[3 2 5 6]
[4 5 3 7]
[5 6 7 4]]
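As a sanity check, here is a small sketch that builds the result with np.add.outer (pairwise sums; this spelling is my choice, not necessarily the answerers') and compares it with the subtraction one-liner. The names `d` and `one_liner` are illustrative.

```python
import numpy as np

a = np.diag([1, 2, 3, 4])      # the diagonal matrix from the question
d = np.diag(a)                 # extract the diagonal as a 1-D array

# Pairwise sums d[i] + d[j]; at this point the diagonal holds 2*d[i].
out = np.add.outer(d, d)
# Put the original diagonal values back.
np.fill_diagonal(out, d)

# The one-liner: subtracting a removes the extra d[i] on the diagonal only.
one_liner = d + d[:, None] - a

print(np.array_equal(out, one_liner))
```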

row-wise Cartesian product between a 1d array and a 2d array

I think I'm missing something obvious. I want to find a cartesian product of arr1 (a 1d numpy array), and the ROWS of arr2 (a 2d numpy array). So, if arr1 has 4 elements and arr2 has shape (5,2), the output should have shape (20,3). (see below)
import numpy as np
arr1 = np.array([1, 4, 7, 3])
arr2 = np.array([[0, 1],
[2, 3],
[4, 5],
[4, 0],
[9, 9]])
The desired output is:
arr3 = np.array([[1, 0, 1],
[1, 2, 3],
[1, 4, 5],
[1, 4, 0],
[1, 9, 9],
[4, 0, 1],
[4, 2, 3],
[4, 4, 5],
[4, 4, 0],
[4, 9, 9],
[7, 0, 1],
[7, 2, 3],
[7, 4, 5],
[7, 4, 0],
[7, 9, 9],
[3, 0, 1],
[3, 2, 3],
[3, 4, 5],
[3, 4, 0],
[3, 9, 9]])
I've been trying to use transpose and reshape with code like np.array(np.meshgrid(arr1,arr2)), but no success yet.
I'm hoping the solution can be generalized because I also need to deal with situations like this: Get all combinations of the ROWS of a 2d (10,2) array and the ROWS of a 2d array (20, 5) to get an output array (200,7).
Here is a vectorized solution that works for your general case as well:
arr1 = np.array([[1, 4],
[7, 3]])
arr2 = np.array([[0, 1],
[2, 3],
[4, 5],
[4, 0],
[9, 9]])
np.hstack((np.repeat(arr1,len(arr2),0),np.stack((arr2,)*len(arr1)).reshape(-1,arr2.shape[1])))
output of shape (2,2)*(5,2)->(10,4):
[[1 4 0 1]
[1 4 2 3]
[1 4 4 5]
[1 4 4 0]
[1 4 9 9]
[7 3 0 1]
[7 3 2 3]
[7 3 4 5]
[7 3 4 0]
[7 3 9 9]]
You can use hstack to add columns to arr2, and vstack to get the final array.
np.vstack(np.apply_along_axis(lambda x: np.hstack([np.repeat(x[0], arr2.shape[0]).reshape(-1, 1),
arr2]),
1,
arr1[:, None]))
I think this should do it:
import numpy as np
arr0 = np.array([1, 4, 7, 3])
arr1 = np.reshape(arr0, (len(arr0),1))
arr2 = np.array([[0, 1],
[2, 3],
[4, 5],
[4, 0],
[9, 9]])
r1,c1 = arr1.shape
r2,c2 = arr2.shape
arrOut = np.zeros((r1,r2,c1+c2), dtype=arr1.dtype)
arrOut[:,:,:c1] = arr1[:,None,:]
arrOut[:,:,c1:] = arr2
arrOut.reshape(-1,c1+c2)
The output is:
array([[1, 0, 1],
[1, 2, 3],
[1, 4, 5],
[1, 4, 0],
[1, 9, 9],
[4, 0, 1],
[4, 2, 3],
[4, 4, 5],
[4, 4, 0],
[4, 9, 9],
[7, 0, 1],
[7, 2, 3],
[7, 4, 5],
[7, 4, 0],
[7, 9, 9],
[3, 0, 1],
[3, 2, 3],
[3, 4, 5],
[3, 4, 0],
[3, 9, 9]])
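For the general case asked about in the question, a possible sketch combines np.repeat and np.tile. The helper name `row_product` is hypothetical, and treating a 1-D input as a single column is my assumption about the intended behaviour.

```python
import numpy as np

def row_product(x, y):
    """Pair every row of x with every row of y (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Assumption: a 1-D input is treated as a column of single-element rows.
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    left = np.repeat(x, len(y), axis=0)   # each row of x repeated len(y) times
    right = np.tile(y, (len(x), 1))       # all of y stacked len(x) times
    return np.hstack((left, right))

arr1 = np.array([1, 4, 7, 3])
arr2 = np.array([[0, 1], [2, 3], [4, 5], [4, 0], [9, 9]])
arr3 = row_product(arr1, arr2)

# Shape check for the (10, 2) x (20, 5) case mentioned in the question.
big = row_product(np.zeros((10, 2)), np.ones((20, 5)))
print(arr3.shape, big.shape)
```

Note that np.hstack promotes both inputs to a common dtype, so mixing integer and float arrays yields a float result.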

numpy select values based on list of indices. Process batch at once [duplicate]

Suppose I have a matrix A with some arbitrary values:
array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
And a matrix B which contains indices of elements in A:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
How do I select values from A pointed by B, i.e.:
A[B] = [[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]]
EDIT: np.take_along_axis is a built-in function for this use case, available since NumPy 1.15. See @hpaulj's answer below for how to use it.
You can use NumPy's advanced indexing -
A[np.arange(A.shape[0])[:,None],B]
One can also use linear indexing -
m,n = A.shape
out = np.take(A,B + n*np.arange(m)[:,None])
Sample run -
In [40]: A
Out[40]:
array([[2, 4, 5, 3],
[1, 6, 8, 9],
[8, 7, 0, 2]])
In [41]: B
Out[41]:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
In [42]: A[np.arange(A.shape[0])[:,None],B]
Out[42]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
In [43]: m,n = A.shape
In [44]: np.take(A,B + n*np.arange(m)[:,None])
Out[44]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
More recent versions have added a take_along_axis function that does the job:
A = np.array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
np.take_along_axis(A, B, 1)
Out[]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
There's also a put_along_axis.
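A brief sketch of how put_along_axis might be used, assuming you want to overwrite the maximum of each row; the names `idx` and `out` and the sentinel value -1 are illustrative choices. The function modifies its first argument in place, so the example works on a copy.

```python
import numpy as np

A = np.array([[2, 4, 5, 3],
              [1, 6, 8, 9],
              [8, 7, 0, 2]])

# Column index of the largest value in each row, kept 2-D with shape (3, 1)
# so that it lines up with A along axis 1.
idx = np.argmax(A, axis=1)[:, None]

out = A.copy()                          # put_along_axis writes in place
np.put_along_axis(out, idx, -1, axis=1)

print(out)
```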
I know this is an old question, but another way of doing it using indices is:
A[np.indices(B.shape)[0], B]
output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
The following is a solution using a for loop:
outlist = []
for i in range(len(B)):
    lst = []
    for j in range(len(B[i])):
        lst.append(A[i][B[i][j]])
    outlist.append(lst)
outarray = np.asarray(outlist)
print(outarray)
The above can also be written more succinctly as a list comprehension:
outlist = [ [A[i][B[i][j]] for j in range(len(B[i]))]
            for i in range(len(B)) ]
outarray = np.asarray(outlist)
print(outarray)
Output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]

Tile rows of a 2D numpy array based on values in separate numpy vector

I have a source array:
a = array([[1, 1, 2, 2],
[3, 4, 5, 6],
[7, 7, 7, 8]])
And a vector that indicates how many times I want to tile each row of the array:
count = array([3, 1, 2])
I want to get:
results = array([[1, 1, 2, 2],
[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 4, 5, 6],
[7, 7, 7, 8],
[7, 7, 7, 8]])
Is there a vectorized/numpy way to achieve this?
Currently I'm using a loop, and it's horribly slow when len(a) is large and/or count contains large values.
numpy.repeat() is what you are after:
Code:
np.repeat(a, count, axis=0)
Test Code:
import numpy as np
a = np.array([[1, 1, 2, 2],
[3, 4, 5, 6],
[7, 7, 7, 8]])
count = np.array([3, 1, 2])
print(np.repeat(a, count, axis=0))
Results:
[[1 1 2 2]
[1 1 2 2]
[1 1 2 2]
[3 4 5 6]
[7 7 7 8]
[7 7 7 8]]
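If you also need to know which source row each output row came from, one possible sketch is to repeat the row indices and index with them; `idx` is a name introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 1, 2, 2],
              [3, 4, 5, 6],
              [7, 7, 7, 8]])
count = np.array([3, 1, 2])

# Repeat the row numbers rather than the rows themselves.
idx = np.repeat(np.arange(len(a)), count)

# Fancy indexing with idx yields the same rows as np.repeat(a, count, axis=0).
results = a[idx]

print(idx)
print(results)
```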

Building a matrix of 'rolled' rows efficiently in Numpy

I'd like to construct a (n,n)-array from a one dimensional array, where each row is shifted (with wrapping) by one relative to the previous one. The following code does this:
import numpy as np
r = np.array([1, 2, 3, 4, 5])
n = len(r)
MM = np.zeros((n, n), dtype=r.dtype)
for k in range(n):
    MM[k, :] = np.roll(r, k)
print(MM)
which results in:
[[1 2 3 4 5]
[5 1 2 3 4]
[4 5 1 2 3]
[3 4 5 1 2]
[2 3 4 5 1]]
Is there a faster way to do this in NumPy for large r, i.e., one that avoids the for-loop?
Take a look at scipy.linalg.circulant (after from scipy.linalg import circulant):
In [255]: r
Out[255]: array([1, 2, 3, 4, 5])
In [256]: circulant(r).T
Out[256]:
array([[1, 2, 3, 4, 5],
[5, 1, 2, 3, 4],
[4, 5, 1, 2, 3],
[3, 4, 5, 1, 2],
[2, 3, 4, 5, 1]])
or scipy.linalg.toeplitz
In [257]: toeplitz(np.roll(r[::-1], 1), r)
Out[257]:
array([[1, 2, 3, 4, 5],
[5, 1, 2, 3, 4],
[4, 5, 1, 2, 3],
[3, 4, 5, 1, 2],
[2, 3, 4, 5, 1]])
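If you would rather not depend on SciPy, a NumPy-only sketch is to build the index grid directly with modular arithmetic. This is an alternative I am suggesting, not something from the answer; `k` and `j` are illustrative names for the row and column indices.

```python
import numpy as np

r = np.array([1, 2, 3, 4, 5])
n = len(r)

k = np.arange(n)[:, None]   # row index, shape (n, 1)
j = np.arange(n)[None, :]   # column index, shape (1, n)

# np.roll(r, k)[j] equals r[(j - k) % n], so index r with that grid.
MM = r[(j - k) % n]

print(MM)
```

Like the loop version, this allocates the full (n, n) array, so memory use still grows quadratically with len(r).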
