I am trying to compute a convolution on a scipy.sparse matrix. Here is the code:
import numpy as np
import scipy.sparse, scipy.signal
M = scipy.sparse.csr_matrix([[0,1,0,0],[1,0,0,1],[1,0,1,0],[0,0,0,0]])
kernel = np.ones((3,3))
kernel[1,1]=0
X = scipy.signal.convolve(M, kernel, mode='same')
Which produces the following error:
ValueError: volume and kernel should have the same dimensionality
Computing scipy.signal.convolve(M.todense(), kernel, mode='same') provides the expected result. However, I would like to keep the computation sparse.
More generally speaking, my goal is to compute the 1-hop neighbourhood sum of the sparse matrix M. If you have any good ideas on how to calculate this on a sparse matrix, I would love to hear them!
EDIT:
I just tried a solution for this specific kernel (sum of neighbours) that is not really faster than the dense version (I didn't try it at very high dimensions, though). Here is the code:
row_ind, col_ind = M.nonzero()
X = scipy.sparse.csr_matrix((M.shape[0]+2, M.shape[1]+2))
for i in [0, 1, 2]:
    for j in [0, 1, 2]:
        if i != 1 or j != 1:
            X += scipy.sparse.csr_matrix((M.data, (row_ind+i, col_ind+j)), (M.shape[0]+2, M.shape[1]+2))
X = X[1:-1, 1:-1]
In [1]: import numpy as np
   ...: from scipy import sparse, signal
In [2]: M = sparse.csr_matrix([[0,1,0,0],[1,0,0,1],[1,0,1,0],[0,0,0,0]])
...: kernel = np.ones((3,3))
...: kernel[1,1]=0
In [3]: X = signal.convolve(M.A, kernel, mode='same')
In [4]: X
Out[4]:
array([[2., 1., 2., 1.],
[2., 4., 3., 1.],
[1., 3., 1., 2.],
[1., 2., 1., 1.]])
Why do posters show runnable code, but not the results? Most of us can't run code like this in our heads.
In [5]: M.A
Out[5]:
array([[0, 1, 0, 0],
[1, 0, 0, 1],
[1, 0, 1, 0],
[0, 0, 0, 0]])
As for your alternative: while the result is a sparse matrix, all of its values are filled in. Even if M is larger and sparser, X will be denser than M.
In [7]: row_ind, col_ind = M.nonzero()
   ...: X = sparse.csr_matrix((M.shape[0]+2, M.shape[1]+2))
   ...: for i in [0, 1, 2]:
   ...:     for j in [0, 1, 2]:
   ...:         if i != 1 or j != 1:
   ...:             X += sparse.csr_matrix((M.data, (row_ind+i, col_ind+j)), (M.shape[0]+2, M.shape[1]+2))
   ...: X = X[1:-1, 1:-1]
In [8]: X
Out[8]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 16 stored elements in Compressed Sparse Row format>
In [9]: X.A
Out[9]:
array([[2., 1., 2., 1.],
[2., 4., 3., 1.],
[1., 3., 1., 2.],
[1., 2., 1., 1.]])
Here's an alternative that builds the coo-style inputs and only makes the matrix at the end. Keep in mind that repeated coordinates are summed. That's handy in FEM stiffness matrix construction, and it fits nicely here as well.
In [10]: row_ind, col_ind = M.nonzero()
    ...: data, row, col = [], [], []
    ...: for i in [0, 1, 2]:
    ...:     for j in [0, 1, 2]:
    ...:         if i != 1 or j != 1:
    ...:             data.extend(M.data)
    ...:             row.extend(row_ind+i)
    ...:             col.extend(col_ind+j)
    ...: X = sparse.csr_matrix((data, (row, col)), (M.shape[0]+2, M.shape[1]+2))
    ...: X = X[1:-1, 1:-1]
In [11]: X
Out[11]:
<4x4 sparse matrix of type '<class 'numpy.int64'>'
with 16 stored elements in Compressed Sparse Row format>
In [12]: X.A
Out[12]:
array([[2, 1, 2, 1],
[2, 4, 3, 1],
[1, 3, 1, 2],
[1, 2, 1, 1]])
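To make the duplicate-summing behaviour concrete, here is a minimal standalone demo (my example, not part of the original answer): two data values aimed at the same (row, col) slot are added together when the matrix is built.

import numpy as np
from scipy import sparse
data = np.array([1, 1, 5])
row = np.array([0, 0, 2])   # coordinate (0, 1) appears twice
col = np.array([1, 1, 3])
A = sparse.csr_matrix((data, (row, col)), shape=(3, 4))
print(A[0, 1])   # 2 -- the two duplicate entries were summed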
===
My approach is noticeably faster (but still well behind the dense convolution). sparse.csr_matrix(...) is pretty slow, so it isn't a good idea to call it repeatedly. And sparse addition isn't very fast either.
In [13]: %%timeit
    ...: row_ind, col_ind = M.nonzero()
    ...: data, row, col = [], [], []
    ...: for i in [0, 1, 2]:
    ...:     for j in [0, 1, 2]:
    ...:         if i != 1 or j != 1:
    ...:             data.extend(M.data)
    ...:             row.extend(row_ind+i)
    ...:             col.extend(col_ind+j)
    ...: X = sparse.csr_matrix((data, (row, col)), (M.shape[0]+2, M.shape[1]+2))
    ...: X = X[1:-1, 1:-1]
793 µs ± 20 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [14]: %%timeit
    ...: row_ind, col_ind = M.nonzero()
    ...: X = sparse.csr_matrix((M.shape[0]+2, M.shape[1]+2))
    ...: for i in [0, 1, 2]:
    ...:     for j in [0, 1, 2]:
    ...:         if i != 1 or j != 1:
    ...:             X += sparse.csr_matrix((M.data, (row_ind+i, col_ind+j)), (M.shape[0]+2, M.shape[1]+2))
    ...: X = X[1:-1, 1:-1]
4.72 ms ± 92.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [15]: timeit X = signal.convolve(M.A, kernel, mode='same')
85.9 µs ± 339 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
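For the original goal of keeping everything sparse, one more idea worth sketching (mine, not from the answer above): convolution with a 3x3 all-ones kernel can be written as B @ M @ B.T, where B is a banded sparse matrix of ones on the diagonals -1, 0, 1 (which gives zero padding at the edges); subtracting M then zeroes the kernel's centre. The whole computation stays sparse, though I haven't benchmarked it here.

import numpy as np
import scipy.sparse

M = scipy.sparse.csr_matrix([[0,1,0,0],[1,0,0,1],[1,0,1,0],[0,0,0,0]])
# banded "shift and sum" matrices; both are symmetric, so .T is optional
Br = scipy.sparse.diags([1, 1, 1], [-1, 0, 1], shape=(M.shape[0], M.shape[0]), format='csr')
Bc = scipy.sparse.diags([1, 1, 1], [-1, 0, 1], shape=(M.shape[1], M.shape[1]), format='csr')
X = Br @ M @ Bc - M   # 3x3 ones kernel with the centre zeroed
print(X.toarray())    # matches the dense convolve result above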
Related
I have been trying to construct the matrix Dij, defined as
I want to evaluate it at the points xi = -cos[π (2i + 1) / (2N)] on the interval [-1, 1], in order to subsequently take derivatives of a function. I am having problems, though, constructing the differentiation matrix Dij.
I have written a python script as:
import numpy as np

N = 100
x = np.linspace(-1,1,N-1)
for i in range(0, N - 1):
    x[i] = -np.cos(np.pi*(2*i + 1)/2*N)

def Dmatrix(x,N):
    m_ij = np.zeros(3)
    for k in range(len(x)):
        for j in range(len(x)):
            for i in range(len(x)):
                m_ij[i,j,k] = -2/N*((k*np.sin(k*np.pi*(2*i + 1)/2*N(np.cos(k*np.pi*(2*j +1))/2*N)/(np.sin(np.pi*(2*i + 1)/2*N)))
    return m_ij

xx = Dmatrix(x,N)
This returns the following error:
IndexError: too many indices for array
Is there a way one could construct this more efficiently and successfully compute it over all k?
The goal will be to multiply this matrix by a function and sum over j to get the first-order derivative of a given function.
m_ij = np.zeros(3) doesn't make a three-dimensional array; it makes an array with one dimension of length 3.
In [1]: import numpy as np
In [2]: m_ij = np.zeros(3)
In [3]: print(m_ij)
[0. 0. 0.]
I suspect you want (as a simple fix)
len_x = len(x)
m_ij = np.zeros((len_x, len_x, len_x))
Look at your x calculation by itself:
In [418]: N = 10
     ...: x = np.linspace(-1,1,N-1)
     ...: y = np.zeros(N)
     ...: for i in range(N):
     ...:     y[i] = -np.cos(np.pi*(2*i + 1)/2*N)
     ...:
In [419]: x
Out[419]: array([-1. , -0.75, -0.5 , -0.25, 0. , 0.25, 0.5 , 0.75, 1. ])
In [420]: y
Out[420]: array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
In [421]: (2*np.arange(N)+1)
Out[421]: array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])
In [422]: (2*np.arange(N)+1)/2*N
Out[422]: array([ 5., 15., 25., 35., 45., 55., 65., 75., 85., 95.])
I separated x and y, because otherwise it doesn't make any sense to create x and then overwrite it.
The y values don't look interesting because, as written, the angles are all odd whole multiples of π, so every cosine is -1.
Note how I use np.arange instead of looping on range.
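Assuming the intended nodes were xi = -cos(π(2i+1)/(2N)), as the question's formula says, the fix is to divide by (2*N) rather than writing /2*N, which parses as (.../2)*N and multiplies by N instead. The whole node computation then collapses to one vectorized line:

import numpy as np
N = 10
# note /(2*N); the original /2*N multiplies by N instead of dividing
x = -np.cos(np.pi*(2*np.arange(N) + 1)/(2*N))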
The formula for Dij given in the question can be implemented as
def D(N):
    from numpy import zeros, pi, sin, cos
    D = zeros((N, N))
    for i in range(N):
        for j in range(N):
            for k in range(N):
                D[i,j] -= k*sin(k*pi*(i+i+1)/2/N)*cos(k*pi*(j+j+1)/2/N)
            D[i,j] /= sin(pi*(i+i+1)/2/N)
    return D*2/N
It could be convenient to vectorize the inner loop.
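For instance, a minimal sketch of that inner-loop vectorization (the name D_kvec is mine, not from the answer; it is the same formula, with the k sum handed to numpy):

import numpy as np

def D_kvec(N):
    # same triple-loop formula, but the innermost k sum is vectorized
    D = np.zeros((N, N))
    k = np.arange(N)
    for i in range(N):
        s_i = np.sin(np.pi*(2*i + 1)/(2*N))
        for j in range(N):
            D[i, j] = -np.sum(k * np.sin(k*np.pi*(2*i + 1)/(2*N))
                                * np.cos(k*np.pi*(2*j + 1)/(2*N))) / s_i
    return D * 2 / N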
On second thought, the whole procedure can be vectorized using np.einsum (at the end there are also some timings; the einsum version is, of course, vastly faster than the triple loop):
In [1]: from numpy import set_printoptions ; set_printoptions(linewidth=120)
In [2]: def D(N):
   ...:     from numpy import zeros, pi, sin, cos
   ...:     D = zeros((N, N))
   ...:     for i in range(N):
   ...:         for j in range(N):
   ...:             for k in range(N):
   ...:                 D[i,j] -= k * sin(k*pi*(2*i+1)/2/N) * cos(k*pi*(2*j+1)/2/N)
   ...:             D[i,j] /= sin(pi*(2*i+1)/2/N)
   ...:     return D*2/N
In [3]: def E(N):
   ...:     from numpy import arange, cos, einsum, outer, pi, sin
   ...:     i = j = k = arange(N)
   ...:     s_i = sin((2*i+1)*pi/2/N)
   ...:     s_ki = sin(outer(k,(2*i+1)*pi/2/N))
   ...:     c_kj = cos(outer(k,(2*j+1)*pi/2/N))
   ...:     return -2/N*einsum('k, ki, kj -> ij', k, s_ki, c_kj) / s_i[:,None]
In [4]: for N in (3,4,5):
   ...:     print(D(N)) ; print(E(N)) ; print('==========')
   ...:
[[-1.73205081e+00 2.30940108e+00 -5.77350269e-01]
[-5.77350269e-01 1.22464680e-16 5.77350269e-01]
[ 5.77350269e-01 -2.30940108e+00 1.73205081e+00]]
[[-1.73205081e+00 2.30940108e+00 -5.77350269e-01]
[-5.77350269e-01 1.22464680e-16 5.77350269e-01]
[ 5.77350269e-01 -2.30940108e+00 1.73205081e+00]]
==========
[[-3.15432203 4.46088499 -1.84775907 0.5411961 ]
[-0.76536686 -0.22417076 1.30656296 -0.31702534]
[ 0.31702534 -1.30656296 0.22417076 0.76536686]
[-0.5411961 1.84775907 -4.46088499 3.15432203]]
[[-3.15432203 4.46088499 -1.84775907 0.5411961 ]
[-0.76536686 -0.22417076 1.30656296 -0.31702534]
[ 0.31702534 -1.30656296 0.22417076 0.76536686]
[-0.5411961 1.84775907 -4.46088499 3.15432203]]
==========
[[-4.97979657e+00 7.20682930e+00 -3.40260323e+00 1.70130162e+00 -5.25731112e-01]
[-1.05146222e+00 -4.49027977e-01 2.10292445e+00 -8.50650808e-01 2.48216561e-01]
[ 3.24919696e-01 -1.37638192e+00 2.44929360e-16 1.37638192e+00 -3.24919696e-01]
[-2.48216561e-01 8.50650808e-01 -2.10292445e+00 4.49027977e-01 1.05146222e+00]
[ 5.25731112e-01 -1.70130162e+00 3.40260323e+00 -7.20682930e+00 4.97979657e+00]]
[[-4.97979657e+00 7.20682930e+00 -3.40260323e+00 1.70130162e+00 -5.25731112e-01]
[-1.05146222e+00 -4.49027977e-01 2.10292445e+00 -8.50650808e-01 2.48216561e-01]
[ 3.24919696e-01 -1.37638192e+00 2.44929360e-16 1.37638192e+00 -3.24919696e-01]
[-2.48216561e-01 8.50650808e-01 -2.10292445e+00 4.49027977e-01 1.05146222e+00]
[ 5.25731112e-01 -1.70130162e+00 3.40260323e+00 -7.20682930e+00 4.97979657e+00]]
==========
In [5]: %timeit D(20)
36 ms ± 277 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %timeit E(20)
146 µs ± 777 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [7]: %timeit D(100)
4.35 s ± 30.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [8]: %timeit E(100)
7.7 ms ± 2.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
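To close the loop on the asker's stated goal (multiply the matrix by function values and sum over j), that is just a matrix-vector product. A hedged sketch, using E from above and assuming the intended nodes xi = -cos(π(2i+1)/(2N)):

import numpy as np
N = 32
x = -np.cos(np.pi*(2*np.arange(N) + 1)/(2*N))
f = np.sin(x)
dfdx = E(N) @ f   # entry i is the sum over j of D[i, j] * f(x_j)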
I tried searching for an answer, but couldn't find what I needed. Apologies if this is a duplicate question.
Suppose I have a 2d-array with shape (n, n*m). What I want to do is an outer sum of this array with its transpose, resulting in an array of shape (n*m, n*m). For example, suppose I have
A = array([[1., 1., 2., 2.],
[1., 1., 2., 2.]])
I want to do an outer sum of A and A.T such that the output is:
>>> array([[2., 2., 3., 3.],
[2., 2., 3., 3.],
[3., 3., 4., 4.],
[3., 3., 4., 4.]])
Note that np.add.outer does not work, because it ravels the inputs into vectors. I can achieve something similar by doing
np.tile(A, (2, 1)) + np.tile(A.T, (1, 2))
but this does not seem reasonable when n and m are reasonably large (n > 100 and m > 1000). Is it possible to write this sum using einsum? I just can't seem to figure out einsum.
To leverage broadcasting, we need to break it down to 3D and then permute axes and add -
n = A.shape[0]
m = A.shape[1]//n
a = A.reshape(n,m,n) # reshape to 3D
out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)
Sample run for verification -
In [359]: # Setup input array
...: np.random.seed(0)
...: n,m = 3,4
...: A = np.random.randint(1,10,(n,n*m))
In [360]: # Original soln
...: out0 = np.tile(A, (m, 1)) + np.tile(A.T, (1, m))
In [361]: # Posted soln
...: n = A.shape[0]
...: m = A.shape[1]//n
...: a = A.reshape(n,m,n)
...: out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)
In [362]: np.allclose(out0, out)
Out[362]: True
Timings with large n,m -
In [363]: # Setup input array
...: np.random.seed(0)
...: n,m = 100,100
...: A = np.random.randint(1,10,(n,n*m))
In [364]: %timeit np.tile(A, (m, 1)) + np.tile(A.T, (1, m))
1 loop, best of 3: 407 ms per loop
In [365]: %%timeit
...: # Posted soln
...: n = A.shape[0]
...: m = A.shape[1]//n
...: a = A.reshape(n,m,n)
...: out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)
1 loop, best of 3: 219 ms per loop
Further performance boost with numexpr
We can leverage multi-core processing with the numexpr module for large data, gaining memory efficiency and hence performance -
import numexpr as ne
n = A.shape[0]
m = A.shape[1]//n
a = A.reshape(n,m,n)
p1 = a[None,:,:,:]
p2 = a.transpose(1,2,0)[:,:,None,:]
out = ne.evaluate('p1+p2').reshape(n*m,-1)
Timings with same large n, m setup -
In [367]: %%timeit
...: # Posted soln
...: n = A.shape[0]
...: m = A.shape[1]//n
...: a = A.reshape(n,m,n)
...: p1 = a[None,:,:,:]
...: p2 = a.transpose(1,2,0)[:,:,None,:]
...: out = ne.evaluate('p1+p2').reshape(n*m,-1)
10 loops, best of 3: 152 ms per loop
One way is
(A.reshape(-1,*A.shape).T+A)[:,0,:]
I think this will take a lot of memory with n > 100 and m > 1000.
But isn't this the same as
np.add.outer(A,A)[:,0,:].reshape(4,-1)
This question already has answers here: NumPy Broadcasting: Calculating sum of squared differences between two arrays (3 answers, closed 4 years ago).
Basically, I have two matrices A and B, and I want C (dimensions marked by the side of the matrices), computed like this:
The formula below is what I do now. I take advantage of some broadcasting, but I am still left with a loop. I am new to Python, so maybe I am wrong, but I have a hunch that this loop can be eliminated. Can anyone share some ideas?
EDIT: 2018-04-27 09:48:28
as requested, an example:
In [5]: A
Out[5]:
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
In [6]: B
Out[6]:
array([[0, 1],
[2, 3],
[4, 5],
[6, 7]])
In [7]: C = np.zeros ((B.shape[0], A.shape[0]))
In [8]: for m in range (B.shape[0]):
   ...:     C[m] = np.sum (np.square (B[m] - A), axis=1).flatten ()
   ...:
In [9]: C
Out[9]:
array([[ 0., 8., 32., 72., 128.],
[ 8., 0., 8., 32., 72.],
[ 32., 8., 0., 8., 32.],
[ 72., 32., 8., 0., 8.]])
This appears to work at the cost of some extra memory:
C = ((B[:, :, None] - A.T)**2).sum(axis=1)
Testing:
import numpy
D = 10
N = 20
M = 30
A = numpy.random.rand(N, D)
B = numpy.random.rand(M, D)
C = numpy.empty((M, N))
Timing:
for m in range(M):
    C[m] = numpy.sum((B[m, :] - A)**2, axis=1)
514 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
C2 = ((B[:, :, None] - A.T)**2).sum(axis=1)
53.6 µs ± 529 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
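A further memory-saving alternative (mine, not from the answer above, and the approach behind the linked duplicate): expand ||b - a||^2 = ||b||^2 - 2 b·a + ||a||^2, which replaces the (M, D, N) intermediate with a single matrix product.

import numpy
D, N, M = 10, 20, 30
A = numpy.random.rand(N, D)
B = numpy.random.rand(M, D)
# ||b||^2 + ||a||^2 - 2 b.a via broadcasting; rounding can leave tiny
# negative values if you later take a square root
C = (B**2).sum(axis=1)[:, None] + (A**2).sum(axis=1)[None, :] - 2 * B @ A.T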
Sometimes it is useful to "clone" a row or column vector to a matrix. By cloning I mean converting a row vector such as
[1, 2, 3]
Into a matrix
[[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]
or a column vector such as
[[1],
[2],
[3]]
into
[[1, 1, 1]
[2, 2, 2]
[3, 3, 3]]
In MATLAB or octave this is done pretty easily:
x = [1, 2, 3]
a = ones(3, 1) * x
a =
1 2 3
1 2 3
1 2 3
b = (x') * ones(1, 3)
b =
1 1 1
2 2 2
3 3 3
I tried to repeat this in numpy, but without success:
In [14]: x = array([1, 2, 3])
In [14]: ones((3, 1)) * x
Out[14]:
array([[ 1., 2., 3.],
[ 1., 2., 3.],
[ 1., 2., 3.]])
# so far so good
In [16]: x.transpose() * ones((1, 3))
Out[16]: array([[ 1., 2., 3.]])
# DAMN
# I end up with
In [17]: (ones((3, 1)) * x).transpose()
Out[17]:
array([[ 1., 1., 1.],
[ 2., 2., 2.],
[ 3., 3., 3.]])
Why wasn't the first method (In [16]) working? Is there a way to achieve this task in python in a more elegant way?
Use numpy.tile:
>>> tile(array([1,2,3]), (3, 1))
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
or for repeating columns:
>>> tile(array([[1,2,3]]).transpose(), (1, 3))
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
Here's an elegant, Pythonic way to do it:
>>> array([[1,2,3],]*3)
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> array([[1,2,3],]*3).transpose()
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
The problem with In [16] seems to be that the transpose has no effect on a 1-D array. You probably want a matrix instead:
>>> x = array([1,2,3])
>>> x
array([1, 2, 3])
>>> x.transpose()
array([1, 2, 3])
>>> matrix([1,2,3])
matrix([[1, 2, 3]])
>>> matrix([1,2,3]).transpose()
matrix([[1],
[2],
[3]])
First note that with numpy's broadcasting operations it's usually not necessary to duplicate rows and columns. See this and this for descriptions.
But to do this, repeat and newaxis are probably the best way
In [12]: x = array([1,2,3])
In [13]: repeat(x[:,newaxis], 3, 1)
Out[13]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [14]: repeat(x[newaxis,:], 3, 0)
Out[14]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
This example is for a row vector, but applying this to a column vector is hopefully obvious. repeat seems to spell this well, but you can also do it via multiplication as in your example
In [15]: x = array([[1, 2, 3]]) # note the double brackets
In [16]: (ones((3,1))*x).transpose()
Out[16]:
array([[ 1., 1., 1.],
[ 2., 2., 2.],
[ 3., 3., 3.]])
Let:
>>> n = 1000
>>> x = np.arange(n)
>>> reps = 10000
Zero-cost allocations
A view does not take any additional memory. Thus, these declarations are instantaneous:
# New axis
x[np.newaxis, ...]
# Broadcast to specific shape
np.broadcast_to(x, (reps, n))
Forced allocation
If you want to force the contents to reside in memory:
>>> %timeit np.array(np.broadcast_to(x, (reps, n)))
10.2 ms ± 62.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit np.repeat(x[np.newaxis, :], reps, axis=0)
9.88 ms ± 52.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit np.tile(x, (reps, 1))
9.97 ms ± 77.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
All three methods are roughly the same speed.
Computation
>>> a = np.arange(reps * n).reshape(reps, n)
>>> x_tiled = np.tile(x, (reps, 1))
>>> %timeit np.broadcast_to(x, (reps, n)) * a
17.1 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit x[np.newaxis, :] * a
17.5 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit x_tiled * a
17.6 ms ± 240 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
All three methods are roughly the same speed.
Conclusion
If you want to replicate before a computation, consider using one of the "zero-cost allocation" methods. You won't suffer the performance penalty of "forced allocation".
I think using broadcasting in numpy is best, and faster.
I did a comparison as follows:
import numpy as np
b = np.random.randn(1000)
In [105]: %timeit c = np.tile(b[:, np.newaxis], (1,100))
1000 loops, best of 3: 354 µs per loop
In [106]: %timeit c = np.repeat(b[:, np.newaxis], 100, axis=1)
1000 loops, best of 3: 347 µs per loop
In [107]: %timeit c = np.array([b,]*100).transpose()
100 loops, best of 3: 5.56 ms per loop
about 15 times faster using broadcasting
One clean solution is to use NumPy's outer-product function with a vector of ones:
np.outer(np.ones(n), x)
gives n repeating rows. Switch the argument order to get repeating columns. To get an equal number of rows and columns you might do
np.outer(np.ones_like(x), x)
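A quick demo of the argument-order point (my example, not from the answer):

import numpy as np
x = np.array([1, 2, 3])
print(np.outer(np.ones(3, dtype=int), x))  # repeating rows:    [[1 2 3], [1 2 3], [1 2 3]]
print(np.outer(x, np.ones(3, dtype=int)))  # repeating columns: [[1 1 1], [2 2 2], [3 3 3]]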
You can use
np.tile(x, 3).reshape((3, 3))
tile will generate the reps of the vector
and reshape will give it the shape you want
Returning to the original question
In MATLAB or octave this is done pretty easily:
x = [1, 2, 3]
a = ones(3, 1) * x
...
In numpy it's pretty much the same (and easy to memorize too):
x = [1, 2, 3]
a = np.tile(x, (3, 1))
If you have a pandas dataframe and want to preserve the dtypes, even the categoricals, this is a fast way to do it:
import numpy as np
import pandas as pd
df = pd.DataFrame({1: [1, 2, 3], 2: [4, 5, 6]})
number_repeats = 50
new_df = df.reindex(np.tile(df.index, number_repeats))
Another solution
>>> x = np.array([1,2,3])
>>> y = x[None, :] * np.ones((3,))[:, None]
>>> y
array([[ 1., 2., 3.],
[ 1., 2., 3.],
[ 1., 2., 3.]])
Why? Sure, repeat and tile are the correct way to do this. But None indexing is a powerful tool that has many times let me quickly vectorize an operation (though it can quickly be very memory expensive!).
An example from my own code:
# trajectory is a sequence of xy coordinates [n_points, 2]
# xy_obstacles is a list of obstacles' xy coordinates [n_obstacles, 2]
# to compute dx, dy distance between every obstacle and every pose in the trajectory
deltas = trajectory[:, None, :2] - xy_obstacles[None, :, :2]
# we can easily convert x-y distance to a norm
distances = np.linalg.norm(deltas, axis=-1)
# distances is now [timesteps, obstacles]. Now we can for example find the closest obstacle at every point in the trajectory by doing
closest_obstacles = np.argmin(distances, axis=1)
# we could also find how safe the trajectory is, by finding the smallest distance over the entire trajectory
danger = np.min(distances)
To answer the actual question, now that nearly a dozen workarounds have been posted: x.transpose() reverses the shape of x. One interesting side effect is that if x.ndim == 1, the transpose does nothing.
This is especially confusing for people coming from MATLAB, where all arrays implicitly have at least two dimensions. The correct way to transpose a 1D numpy array is not x.transpose() or x.T, but rather
x[:, None]
or
x.reshape(-1, 1)
From here, you can multiply by a matrix of ones, or use any of the other suggested approaches, as long as you respect the (subtle) differences between MATLAB and numpy.
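A short demonstration of the difference (my example):

import numpy as np
x = np.array([1, 2, 3])
print(x.T.shape)                      # (3,)  -- transposing a 1-D array is a no-op
print(x[:, None].shape)               # (3, 1) -- a true column vector
print(x[:, None] * np.ones((1, 3)))   # broadcasts to the 3x3 "column clone"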
import numpy as np
x=np.array([1,2,3])
y=np.multiply(np.ones((len(x),len(x))),x).T
print(y)
yields:
[[ 1. 1. 1.]
[ 2. 2. 2.]
[ 3. 3. 3.]]