how to duplicate each row of a matrix N times Numpy

I have a matrix with these dimensions (150,2) and I want to duplicate each row N times. I show what I mean with an example.
Input:
a = [[2, 3], [5, 6], [7, 9]]
Suppose N = 3. I want this output:
[[2 3]
[2 3]
[2 3]
[5 6]
[5 6]
[5 6]
[7 9]
[7 9]
[7 9]]
Thank you.

Use np.repeat with the parameter axis=0:
import numpy as np
a = np.array([[2, 3],[5, 6],[7, 9]])
print(a)
[[2 3]
[5 6]
[7 9]]
r_a = np.repeat(a, repeats=3, axis=0)
print(r_a)
[[2 3]
[2 3]
[2 3]
[5 6]
[5 6]
[5 6]
[7 9]
[7 9]
[7 9]]
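As a quick check on a matrix of the size mentioned in the question, the sketch below contrasts np.repeat with np.tile, which stacks whole copies of the matrix instead of repeating each row in place. The array contents and the names `repeated` and `tiled` are made up for illustration.

```python
import numpy as np

# Placeholder data with the (150, 2) shape from the question.
a = np.arange(300).reshape(150, 2)
N = 3

# np.repeat with axis=0 repeats each row N times, keeping the copies adjacent.
repeated = np.repeat(a, N, axis=0)

# np.tile stacks N whole copies of the matrix, so row 0 reappears only after row 149.
tiled = np.tile(a, (N, 1))

print(repeated.shape)  # (450, 2)
print(repeated[:4])    # rows 0, 0, 0, 1 of a
print(tiled[:2])       # rows 0, 1 of a
```

Both results have the same shape, so the row order is the thing to check when choosing between them.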

To create an empty multidimensional array in NumPy (e.g. a 2D m*n array to store your matrix) when you don't know in advance how many rows m you will append, and you don't mind the computational cost Stephen Simmons mentioned (namely rebuilding the array at each append), you can set to 0 the size of the dimension along which you want to append: X = np.empty(shape=[0, n]).
You can then use it like this (here the loop ends up appending 10 rows, a count we assume was unknown when the empty matrix was created, and n = 2):
import numpy as np
n = 2
X = np.empty(shape=[0, n])
for i in range(5):
    for j in range(2):
        # np.append returns a new array each time; X is rebound to it
        X = np.append(X, [[i, j]], axis=0)
print(X)
which will give you:
[[ 0. 0.]
[ 0. 1.]
[ 1. 0.]
[ 1. 1.]
[ 2. 0.]
[ 2. 1.]
[ 3. 0.]
[ 3. 1.]
[ 4. 0.]
[ 4. 1.]]
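If the rebuilding cost mentioned above does matter, a common alternative is to collect the rows in a plain Python list and convert once at the end. This is only a sketch of that idea, not part of the answer above; the name `rows` is made up, and dtype=float is chosen to match the float output shown.

```python
import numpy as np

# Collect rows in a list; list.append does not copy the existing rows.
rows = []
for i in range(5):
    for j in range(2):
        rows.append([i, j])

# Build the array once, after the number of rows is known.
X = np.array(rows, dtype=float)
print(X.shape)  # (10, 2)
print(X)
```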

If your input is a vector, use atleast_2d first.
a = np.atleast_2d([2, 3]).repeat(repeats=3, axis=0)
print(a)
# [[2 3]
# [2 3]
# [2 3]]

Related

How can I get a sublist with wraparound in Python

Simple 1D case
I would like to get a substring with wraparound.
str = "=Hello community of Python="
# ^^^^^ ^^^^^^^ I want this wrapped substring
str[-7]
> 'P'
str[5]
> 'o'
str[-7:5]
> ''
Why does this slice of a sequence starting at a negative index and ending in a positive one result in an empty string?
How would I get it to output "Python==Hell"?
Higher dimensional cases
In this simple case I could do some cutting and pasting, but in my actual application I want to get every sub-grid of size 2x2 of a bigger grid - with wraparound.
m = np.mat('''1 2 3;
4 5 6;
7 8 9''')
And I want to get all submatrices centered at some location (x, y), including '9 7; 3 1'. Indexing with m[x-1:y+1] doesn't work for (x,y)=(0,0), nor does (x,y)=(1,0) give 7 8; 1 2
3D example
m3d = np.array(list(range(27))).reshape((3,3,3))
>
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
m3d[-1:1,-1:1,-1:1]
# doesn't give [[[26, 24], [20, 18]], [[8, 6], [2, 0]]]
If need be I could write some code which gets the various sub-matrices and glues them back together, but this approach might get quite cumbersome when I have to apply the same method to 3d arrays.
I was hoping there would be an easy solution. Maybe numpy can help out here?

Using Advanced indexing (see the section starting with "From a 4x3 array the corner elements should be selected using advanced indexing"):
import numpy as np
m = np.mat('''1 2 3;
4 5 6;
7 8 9''')
print(m[np.ix_(range(-1, 1), range(-1, 1))])
print(m[np.ix_(range(-2, 2), range(-2, 2))])
print(m[np.arange(-2, 2)[:, np.newaxis], range(-2, 2)])
Output:
[[9 7]
[3 1]]
[[5 6 4 5]
[8 9 7 8]
[2 3 1 2]
[5 6 4 5]]
[[5 6 4 5]
[8 9 7 8]
[2 3 1 2]
[5 6 4 5]]
Going through all sub-matrices
Since you want to go through all sub-matrices, we can beforehand separately prepare the row ranges and the column ranges, and then use pairs of them to quickly index:
import numpy as np
A = np.mat('''1 2 3;
4 5 6;
7 8 9''')
m, n = A.shape
rowranges = [
    (np.arange(i, i+2) % m)[:, np.newaxis]
    for i in range(m)
]
colranges = [
    np.arange(j, j+2) % n
    for j in range(n)
]
for rowrange in rowranges:
    for colrange in colranges:
        print(A[rowrange, colrange])
Output:
[[1 2]
[4 5]]
[[2 3]
[5 6]]
[[3 1]
[6 4]]
[[4 5]
[7 8]]
[[5 6]
[8 9]]
[[6 4]
[9 7]]
[[7 8]
[1 2]]
[[8 9]
[2 3]]
[[9 7]
[3 1]]
3D case
m3d = np.array(list(range(27))).reshape((3,3,3))
m3d[np.ix_(range(-1,1), range(-1,1), range(-1,1))]
Output:
array([[[26, 24],
[20, 18]],
[[ 8, 6],
[ 2, 0]]])
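The same idea can be wrapped in a small helper that works for any number of dimensions. This is a sketch under assumptions: `wrapped_block` is a hypothetical name, and it uses a plain ndarray with modular index ranges passed to np.ix_.

```python
import numpy as np

def wrapped_block(arr, start, size):
    """Return the block of shape `size` starting at `start`, wrapping around on every axis."""
    # One index range per axis, reduced modulo that axis length so it wraps.
    index_ranges = [
        np.arange(s, s + n) % dim
        for s, n, dim in zip(start, size, arr.shape)
    ]
    return arr[np.ix_(*index_ranges)]

m2d = np.arange(1, 10).reshape(3, 3)
print(wrapped_block(m2d, start=(2, 2), size=(2, 2)))

m3d = np.arange(27).reshape(3, 3, 3)
print(wrapped_block(m3d, start=(-1, -1, -1), size=(2, 2, 2)))
```

Because the indices are taken modulo the axis length, negative and too-large start positions are both handled by the same code path.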

Simply combine the two halves yourself:
>>> str[-7:]+str[:5]
'Python==Hell'
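As for why the original slice is empty: a negative start is converted to len(s) + start before slicing, so the slice runs from a later position to an earlier one. The sketch below shows this and wraps the concatenation in a helper; `wrapped_slice` is a hypothetical name, and the string is called `s` here to avoid shadowing the built-in str.

```python
s = "=Hello community of Python="

# A negative start is first converted to len(s) + start, here 27 - 7 = 20.
# The slice s[20:5] has start > stop, so with the default step of 1 it is empty.
print(len(s))         # 27
print(repr(s[-7:5]))  # ''

def wrapped_slice(seq, start, stop):
    """Slice from a negative start, past the end of seq, up to a non-negative stop."""
    if start < 0 <= stop:
        return seq[start:] + seq[:stop]
    # Otherwise fall back to ordinary slicing.
    return seq[start:stop]

print(wrapped_slice(s, -7, 5))  # Python==Hell
```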

You could repeat your data enough so that you don't need wraparound.
Substrings of length 3:
s = 'Python'
r = 3
s2 = s + s[:r-1]
for i in range(len(s)):
    print(s2[i:i+r])
Output:
Pyt
yth
tho
hon
onP
nPy
Sub-matrices of size 2×2:
import numpy as np
m = np.mat('''1 2 3;
4 5 6;
7 8 9''')
r = 2
m2 = np.tile(m, (2, 2))
for i in range(3):
    for j in range(3):
        print(m2[i:i+r, j:j+r])
Output:
[[1 2]
[4 5]]
[[2 3]
[5 6]]
[[3 1]
[6 4]]
[[4 5]
[7 8]]
[[5 6]
[8 9]]
[[6 4]
[9 7]]
[[7 8]
[1 2]]
[[8 9]
[2 3]]
[[9 7]
[3 1]]
For larger arrays with more dimensions, the simple np.tile adds more than necessary. You really just need to increase the size by + r-1 in each dimension, not by * 2, like I did with the string. Not sure how to do that well with arrays. Plus I think you can also make your negative indexes work, so we just need someone to come along and do that.
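One way to grow the array by only r-1 entries per dimension might be np.pad with mode='wrap'. This is a sketch rather than a tested recipe for every case; it uses a plain ndarray instead of np.mat, and the name `blocks` is made up.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
r = 2

# Append r-1 wrapped entries at the end of every axis (nothing at the start).
m2 = np.pad(m, [(0, r - 1)] * m.ndim, mode='wrap')
print(m2.shape)  # (4, 4)

# Plain slicing on the padded array now yields every wrapped r x r block.
blocks = [m2[i:i + r, j:j + r] for i in range(3) for j in range(3)]
print(blocks[-1])
```

Since the pad widths are built from m.ndim, the same line should carry over to 3D arrays unchanged.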

Batch of multidimensional arrays multiplied by a batch of scalers (without loops)

I have a batch of two data items, each a multidimensional array of shape (3,3,2), as follows:
batch= np.asarray([
[
[[1,2,3],
[3,1,1,],
[4,9,0,]],
[[2,2,2],
[5,6,7],
[3,3,3]]
],
[
[[2,2,2],
[5,6,7],
[3,3,3]],
[[1,2,3],
[3,1,1],
[4,9,0]]
]
])
Correspondingly, I have a batch of two data items, each of shape (1,1,2), as follows:
scalers = np.asarray([
[
[[1]],
[[2]]
],
[
[[0]],
[[3]]
]
])
Each dimension in the batch should be multiplied by its corresponding scaler in the scalers array. For example:
# the first dimension
[[1,2,3],
1 * [3,1,1,],
[4,9,0,]]
# the second dimension
2 * [[2,2,2],
[5,6,7],
[3,3,3]]
.
.
.
# the last dimension
3* [[1,2,3],
[3,1,1],
[4,9,0]]
So the expected output should look like the following:
[
[
[[1 2 3],
[3 1 1 ],
[4 9 0 ]],
[[4 4 4],
[10 12 14],
[6 6 6]]
],
[
[[0 0 0],
[0 0 0],
[0 0 0]],
[[3 6 9],
[9 3 3],
[12 27 0]]
]
]
I was trying to do the following to avoid any loops
batch * scalers
but it does not seem to be correct. How can I get the behavior above?
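For what it is worth, with the arrays exactly as written in the question NumPy reports the shapes (2, 2, 3, 3) and (2, 2, 1, 1), and plain broadcasting then appears to produce the expected output. The sketch below checks this; if the real data is laid out differently (for example with the 2 as the last axis), the scalers would need to be reshaped first, which is an assumption not covered here.

```python
import numpy as np

batch = np.asarray([
    [[[1, 2, 3], [3, 1, 1], [4, 9, 0]],
     [[2, 2, 2], [5, 6, 7], [3, 3, 3]]],
    [[[2, 2, 2], [5, 6, 7], [3, 3, 3]],
     [[1, 2, 3], [3, 1, 1], [4, 9, 0]]],
])
scalers = np.asarray([
    [[[1]], [[2]]],
    [[[0]], [[3]]],
])

print(batch.shape)    # (2, 2, 3, 3)
print(scalers.shape)  # (2, 2, 1, 1)

# The trailing axes of length 1 are stretched over each 3x3 block.
result = batch * scalers
print(result)
```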

How to Broadcast Sum Vector and Tensor?

Suppose we have:
row vector V of shape (F,1), and
4-D tensor T of shape (N, F, X, Y).
As a concrete example, let N, F, X, Y = 2, 3, 2, 2. Let V = [v0, v1,v2].
Then, I want to element-wise add v0 to the inner 2x2 matrix T[0,0], v1 to T[0,1], and v2 to T[0,2]. Similarly, I want to add v0 to T[1,0], v1 to T[1,1], and v2 to T[1,2].
So at the "innermost" level, the addition between the 2x2 matrix and a scalar, e.g. T[0,0] + v0, uses broadcasting to element-wise add v0. Then what I'm trying to do is apply that more generally to each inner 2x2.
I've tried using np.einsum() and np.tensordot(), but I couldn't figure out what each of those functions was actually doing on a more fundamental level, so I wanted to ask for a more step-by-step explanation of how this computation might be done.
Thanks

To multiply: you can simply translate your text into einsum index names, and einsum will take care of the broadcasting:
TV = np.einsum('ijkl,j->ijkl',T,V)
To add: simply add dimensions to V using None so that it lines up with the last two dimensions of T, and broadcasting will take care of the rest (this assumes V is 1-D with shape (F,), as in the example below):
TV = T + V[:,None,None]
Example input/output that shows the desired behavior of your output for adding:
T:
[[[[7 4]
[5 9]]
[[0 3]
[2 6]]
[[7 6]
[1 1]]]
[[[8 0]
[8 7]]
[[2 6]
[9 2]]
[[8 6]
[4 9]]]]
V:
[0 1 2]
TV:
[[[[ 7 4]
[ 5 9]]
[[ 1 4]
[ 3 7]]
[[ 9 8]
[ 3 3]]]
[[[ 8 0]
[ 8 7]]
[[ 3 7]
[10 3]]
[[10 8]
[ 6 11]]]]
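A small self-contained check of both operations, using made-up values for T. The last part is an assumption about the question's (F,1) layout: if V really is a column of shape (F,1), one extra trailing axis is enough to line it up with T.

```python
import numpy as np

N, F, X, Y = 2, 3, 2, 2
T = np.arange(N * F * X * Y).reshape(N, F, X, Y)  # placeholder values
V = np.array([0, 1, 2])                           # shape (F,)

# V[:, None, None] has shape (F, 1, 1), which broadcasts against (N, F, X, Y).
added = T + V[:, None, None]

# einsum version of the multiplication from the answer above.
multiplied = np.einsum('ijkl,j->ijkl', T, V)

# If V is stored as a column of shape (F, 1), add one trailing axis instead.
V_col = V.reshape(F, 1)
added_from_col = T + V_col[:, :, None]

print(added.shape)  # (2, 3, 2, 2)
```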

Why is the shape of multidimensional arrays handled differently in numpy when using axis parameter

I am a rookie in the python language and have a question regarding the shape of arrays.
So far as I understand, if a 3 dimensional numpy array is created like this
temp = numpy.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4], [5, 5, 5]], [[6, 6, 6], [7, 7, 7], [8, 8, 8]]]),
the shape is created like in the following figure:
[figure: shape of 3 dimensional array]
To calculate the sum, median etc. an axis can be defined to calculate the values e.g.
>>> print(numpy.median(temp, axis=0))
[[3. 3. 3.] [4. 4. 4.] [5. 5. 5.]]
>>> print(numpy.median(temp, axis=1))
[[1. 1. 1.] [4. 4. 4.] [7. 7. 7.]]
>>> print(numpy.median(temp, axis=2))
[[0. 1. 2.] [3. 4. 5.] [6. 7. 8.]]
which implies to me a shape like this:
[figure: shape of 3 dimensional array using axis parameter]
Why is the shape handled differently when calculating the sum, median etc. with the axis parameter?

Your numpy array temp = numpy.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4], [5, 5, 5]], [[6, 6, 6], [7, 7, 7], [8, 8, 8]]]) looks actually like this:
axis=2
|
v
[[[0 0 0] <-axis=1
[1 1 1]
[2 2 2]] <- axis=0
[[3 3 3]
[4 4 4]
[5 5 5]]
[[6 6 6]
[7 7 7]
[8 8 8]]]
Therefore, when you take the median over a specific axis, numpy keeps the remaining axes as they are and computes the median along the specified axis. For a better understanding, I am going to use the array suggested in the comments by @hpaulj:
temp:
axis=2
|
v
[[[ 0 1 2 3] <-axis=1
[ 4 5 6 7]
[ 8 9 10 11]] <- axis=0
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
We then have:
numpy.median(temp, axis=0):
#The first element is median of [0,12], second one median of [1,13] and so on.
[[ 6. 7. 8. 9.]
[10. 11. 12. 13.]
[14. 15. 16. 17.]]
np.median(temp, axis=1)
#The first element is median of [0,4,8], second one median of [1,5,9] and so on.
[[ 4. 5. 6. 7.]
[16. 17. 18. 19.]]
np.median(temp, axis=2)
#The first element is median of [0,1,2,3], second one median of [4,5,6,7] and so on.
[[ 1.5 5.5 9.5]
[13.5 17.5 21.5]]
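The rule of thumb that the reduced axis disappears from the result shape can be checked directly. This sketch assumes the suggested array is np.arange(24).reshape(2, 3, 4), which matches the values printed above.

```python
import numpy as np

temp = np.arange(24).reshape(2, 3, 4)
print(temp.shape)  # (2, 3, 4)

# Reducing along an axis removes that axis from the shape.
for axis in range(temp.ndim):
    reduced = np.median(temp, axis=axis)
    print(axis, reduced.shape)

# The first element for axis=0 is the median of temp[:, 0, 0], i.e. of [0, 12].
first = np.median(temp[:, 0, 0])
print(first)
```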

Put a numpy array of lists inside a numpy matrix at corresponding positions

I have the following numpy array containing lists inside
example=np.array(([[1, 2, 3], [4, 5], [6,7]]))
print(example)
[list([1, 2, 3]) list([4, 5]) list([6, 7])]
I would like to put these values at the corresponding values of a numpy array of any suitable size. For example, I have the following matrix:
[[14021982. 14021982. 14021982.]
[14021982. 14021982. 14021982.]
[14021982. 14021982. 14021982.]]
Therefore, I want the output to be
[[1. 2. 3.]
[4. 5. 14021982.]
[6. 7. 14021982.]]
Is there an efficient way to do that in Python, no matter the sizes of the two matrices?
EDIT: I also want to know whether it is possible to do that for a matrix of smaller size:
for example, I want to put the input
print(example)
[list([1, 2, 3]) list([4, 5]) list([6, 7])]
In the following matrix
[[14021982. 14021982.]
[14021982. 14021982.]
[14021982. 14021982.]]
which would result in
[[1. 2.]
[4. 5.]
[6. 7.]]

You can replace values by indices
example = [[1, 2, 3], [4, 5], [6, 7]]
target = np.array([...])
for i, j in enumerate(example):
    size = min(len(j), len(target[i]))
    target[i][0:size] = j[:size]
print(target)
Output
target = np.array([[14021982, 14021982, 14021982],
[14021982, 14021982, 14021982],
[14021982, 14021982, 14021982]])
[[ 1 2 3]
[ 4 5 14021982]
[ 6 7 14021982]]
---------------------------------------------------------
target = np.array([[14021982, 14021982],
[14021982, 14021982],
[14021982, 14021982]])
[[1 2]
[4 5]
[6 7]]
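A runnable version of the loop above, wrapped in a function so both target sizes can be tried. The name `fill_rows` and the use of np.full to build the placeholder matrix are assumptions made for this sketch; rows longer than the target are truncated, as in the answer.

```python
import numpy as np

def fill_rows(rows, target):
    """Copy each ragged row into the matching row of target, truncating if needed."""
    for i, row in enumerate(rows):
        size = min(len(row), target.shape[1])
        target[i, :size] = row[:size]
    return target

example = [[1, 2, 3], [4, 5], [6, 7]]

# Placeholder matrices filled with the value from the question.
wide = fill_rows(example, np.full((3, 3), 14021982.0))
narrow = fill_rows(example, np.full((3, 2), 14021982.0))

print(wide)
print(narrow)
```

Note that the function modifies `target` in place and also returns it, so pass a copy if the original matrix is still needed.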
