Copy numpy array from one (2-D) to another (3-D) - python

I am trying to copy one array, say A (2-D), into another array, say B (3-D), with the following shapes:
A is an m * n array and B is an m * n * p array.
I tried the following code, but it is very slow, around 1 s/frame:
for r in range(0, h):
    for c in range(0, w):
        x = random.randint(0, 20)
        B[r, c, x] = A[r, c]
I also read some websites about fancy indexing, but I still don't know how to apply it to my case.

I propose a solution using array indices. M,N,P are each (m,n) index arrays, specifying the m*n elements of B that will receive data from A.
def indexing(A, p):
    m, n = A.shape
    B = np.zeros((m, n, p), dtype=int)
    P = np.random.randint(0, p, (m, n))
    M, N = np.indices(A.shape)
    B[M, N, P] = A
    return B
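As a quick sanity check (a small hypothetical example, not from the original answer): each element of A lands in exactly one slot along the last axis of B, so summing that axis recovers A.
import numpy as np
A = np.random.randint(1, 100, (4, 5))
B = indexing(A, 3)
assert np.array_equal(B.sum(axis=2), A)   # every value of A placed exactly once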
For comparison, here are the original loop and the solution using shuffle:
def looping(A, p):
    m, n = A.shape
    B = np.zeros((m, n, p), dtype=int)
    for r in range(m):
        for c in range(n):
            x = np.random.randint(0, p)
            B[r, c, x] = A[r, c]
    return B

def shuffling(A, p):
    m, n = A.shape
    B = np.zeros((m, n, p), dtype=int)
    B[:, :, 0] = A
    map(np.random.shuffle, B.reshape(m*n, p))
    return B
For m, n, p = 1000, 1000, 20, the timings are:
looping: 1.16 s
shuffling: 10 s
indexing: 271 ms
For small m, n, looping is fastest. My indexing solution takes more time to set up, but the actual assignment is fast. The shuffling solution has as many iterations as the original.
The M, N arrays don't have to be full. They can be column and row arrays, respectively:
M = np.arange(m)[:,None]
N = np.arange(n)[None,:]
or
M,N = np.ogrid[:m,:n]
This shaves off some time, more so for small test cases than for large ones.
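For illustration, a minimal sketch of the open-grid variant (the function name indexing_ogrid is just for this sketch, not from the original answer):
import numpy as np

def indexing_ogrid(A, p):
    m, n = A.shape
    B = np.zeros((m, n, p), dtype=A.dtype)
    M, N = np.ogrid[:m, :n]               # (m,1) column and (1,n) row; they broadcast
    P = np.random.randint(0, p, (m, n))
    B[M, N, P] = A
    return B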
A repeatable version:
def indexing(A, p, B=None):
    m, n = A.shape
    if B is None:
        B = np.zeros((m, n, p), dtype=int)
    P = np.random.randint(0, p, (m, n))
    M, N = np.indices(A.shape)
    B[M, N, P] = A
    return B
indexing(A,p,indexing(A,p))
If A isn't the same size as the first two dimensions of B, the index ranges will have to be changed. A doesn't have to be a 2-D array either:
B[[0,0,2],[1,1,0],[3,4,5]] = [10,11,12]
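For example, a tiny self-contained illustration of that indexed assignment (shapes chosen just for this sketch):
import numpy as np
B = np.zeros((3, 2, 6), dtype=int)
B[[0, 0, 2], [1, 1, 0], [3, 4, 5]] = [10, 11, 12]
# each (row, col, depth) index triple received the corresponding value
assert B[0, 1, 3] == 10 and B[0, 1, 4] == 11 and B[2, 0, 5] == 12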

Assuming that h = m, w = n, and the random index x runs over a third dimension of size p, this should give you the same result as your example:
B[:,:,0]=A
map(np.random.shuffle, B.reshape(h*w,p))
Note also, I'm assuming the answer to NPE's question in comments is 'yes'
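One caveat, not from the original answer: in Python 3, map is lazy, so the shuffles above would never actually run. An explicit loop does the same thing eagerly (B.reshape returns a view here since B is contiguous, so the rows of B are shuffled in place):
for row in B.reshape(h*w, p):
    np.random.shuffle(row)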

Related

How to vectorize with mismatched dimensionality

I have some constraints of the form A_{i,j,k} = r_{i,j} B_{i,j,k}.
A is an n x m x p array, as is B; r is an n x m array.
I would like to vectorize this in Python as efficiently as possible. Right now, I am turning r into an n x m x p array by setting r_{i,j,k} = r_{i,j} for all 1 <= k <= p, and then calling np.multiply on r and B. This seems inefficient. Any ideas welcome, thanks.
import random
import numpy as np

def ndHadamardProduct(r, n, m, p):  # r is an n x m array, p is an int
    rnew = np.zeros((n, m, p))
    B = np.zeros((n, m, p))
    for i in range(n):
        for j in range(m):
            for k in range(p):
                rnew[i, j, k] = r[i, j]
                B[i, j, k] = random.uniform(0, 1)
    return np.multiply(rnew, B)
Add an extra dimension with np.newaxis, and broadcasting takes care of the repetition for you:
import numpy as np
r = np.random.random((3,4))
b = np.random.random((3,4,5))
a = r[:,:,np.newaxis] * b
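As a quick check that the broadcasted product matches the explicit triple loop (reusing r, b, a from above):
a_loop = np.empty_like(b)
for i in range(r.shape[0]):
    for j in range(r.shape[1]):
        for k in range(b.shape[2]):
            a_loop[i, j, k] = r[i, j] * b[i, j, k]
print(np.allclose(a, a_loop))   # True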

Advanced broadcasting in TensorFlow (or Numpy)

In TensorFlow I have a rank-2 tensor M (a matrix) of shape [D, D] and a rank-3 tensor T of shape [D, D, D].
I need to combine them to form a new matrix R as follows: the element R[a, b+c-a] is the sum of all products T[a, b, c]*M[b, c] over b and c for which b+c-a is fixed (and lies between 0 and D-1).
An inefficient way to create R is with nested for loops over the indices, plus a check that b+c-a stays within [0, D-1] (e.g. in numpy):
R = np.zeros([D, D])
for a in range(D):
    for b in range(D):
        for c in range(D):
            if 0 <= b + c - a < D:
                R[a, b + c - a] += T[a, b, c] * M[b, c]
but I would like to use broadcasting and/or other more efficient methods.
How can I achieve this?
You can vectorize that calculation as follows:
import numpy as np

np.random.seed(0)
D = 10
M = np.random.rand(D, D)
T = np.random.rand(D, D, D)

# Original calculation
R = np.zeros([D, D])
for a in range(D):
    for b in range(D):
        for c in range(D):
            if 0 <= b + c - a < D:
                R[a, b + c - a] += T[a, b, c] * M[b, c]

# Vectorized calculation
tm = T * M
a = np.arange(D)[:, np.newaxis, np.newaxis]
b, c = np.ogrid[:D, :D]
col_idx = b + c - a
m = (col_idx >= 0) & (col_idx < D)
row_idx = np.tile(a, [1, D, D])
R2 = np.zeros([D, D])
np.add.at(R2, (row_idx[m], col_idx[m]), tm[m])

# Check result
print(np.allclose(R, R2))
# True
Alternatively, you could consider using Numba to accelerate the loops:
import numpy as np
import numba as nb

@nb.njit
def calculation_nb(T, M, D):
    tm = T * M
    R = np.zeros((D, D), dtype=tm.dtype)
    for a in nb.prange(D):
        for b in range(D):
            for c in range(max(a - b, 0), min(D + a - b, D)):
                R[a, b + c - a] += tm[a, b, c]
    return R

print(np.allclose(R, calculation_nb(T, M, D)))
# True
In a couple of quick tests, even without parallelization, this is considerably faster than the NumPy version.
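If you want to reproduce such a comparison yourself, a rough sketch along these lines works (calculation_np is just a hypothetical wrapper around the vectorized code above; the numbers depend on your machine and on D):
import timeit

def calculation_np(T, M, D):
    tm = T * M
    a = np.arange(D)[:, np.newaxis, np.newaxis]
    b, c = np.ogrid[:D, :D]
    col_idx = b + c - a
    mask = (col_idx >= 0) & (col_idx < D)
    row_idx = np.tile(a, [1, D, D])
    R = np.zeros([D, D])
    np.add.at(R, (row_idx[mask], col_idx[mask]), tm[mask])
    return R

calculation_nb(T, M, D)  # call once so JIT compilation is not included in the timing
print(timeit.timeit(lambda: calculation_np(T, M, D), number=100))
print(timeit.timeit(lambda: calculation_nb(T, M, D), number=100))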

Numpy broadcasting elementwise product on all pairs of rows?

I have a 1d ndarray A of shape (n,) and a 2d ndarray E of shape (n,m). I am trying to perform the following calculation over all pairs of rows, where the circle-dot denotes element-wise multiplication: R = sum over all pairs i < j of (A_i E_i) ⊙ (A_j E_j), with E_i denoting row i of E.
I have written it with a for loop, but this block of code is called thousands of times, and I was hoping there is a way to accomplish this with broadcasting or numpy functions. The following is the for-loop solution I'm trying to rewrite:
def fun(E, A):
    X = E * A[:, np.newaxis]
    R = np.zeros(E.shape[-1])
    for ii in xrange(len(E) - 1):
        for jj in xrange(ii + 1, len(E)):
            R += X[ii] * X[jj]
    return R
Any help would be appreciated.
Current approach, but still not working:
def fun1(E, A):
    X = E * A[:, np.newaxis]
    R = np.zeros(E.shape[-1])
    for ii in xrange(len(E) - 1):
        for jj in xrange(ii + 1, len(E)):
            R += X[ii] * X[jj]
    return R

def fun2(E, A):
    n = E.shape[0]
    m = E.shape[1]
    A_ = np.triu(A[1:] * A[:-1].reshape(-1, 1))
    E_ = E[1:] * E[:-1]
    R = np.sum((A_.reshape(n-1, 1, n-1) * E_.T).transpose(0, 2, 1).reshape(n-1*n-1, m), axis=0)
    return R

A = np.arange(4, 9)
E = np.arange(20).reshape((5, 4))

print fun1(E, A)
print fun2(E, A)
Now, this should work:
def fun3(E, A):
    n, m = E.shape
    n_ = n - 1
    X = E * A[:, np.newaxis]
    a = (X[:-1].reshape(n_, 1, m) * X[1:])
    b = np.tril(np.ones((m, n_, n_))).T
    R = np.sum((a * b).reshape(n_ * n_, m), axis=0)
    return R
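As a quick sanity check against the loop version, reusing the A and E defined above:
print np.allclose(fun1(E, A), fun3(E, A))   # True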
My last function was based only on the given formula; this one is instead based on fun and tested with your added test case.
Hope this works for you!

Refactor matrix permutations in numpy's style

I wrote the following code to do multiplication of matrix permutations, and I was wondering if it can be written in numpy style so that I can get rid of the two for loops:
Z = np.empty([new_d, X.shape[1]])
Z = np.ndarray(shape=(new_d, X.shape[1]))
Z = np.concatenate((X, X**2))
res = []
for i in range(0, d):
    for j in range(i+1, d):
        res.append(np.array(X.T[:, i] * X.T[:, j]))
Z = np.concatenate((Z, res))
where: X shape is (7, 1000), d = 7, new_d = 35.
Any suggestions?
Approach #1
We could use np.triu_indices to get those pair-wise permutation indices and then simply perform elementwise multiplications of the row-indexed arrays -
r,c = np.triu_indices(d,1)
res = X[r]*X[c]
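To also assemble the question's Z in one go with this approach, a short sketch (X here is hypothetical data with the stated (7, 1000) shape):
import numpy as np
X = np.random.rand(7, 1000)
d = X.shape[0]
r, c = np.triu_indices(d, 1)
Z = np.concatenate((X, X**2, X[r] * X[c]))
print(Z.shape)   # (35, 1000), i.e. 2*d + d*(d-1)//2 rows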
Approach #2
For memory efficiency and hence performance, especially on large arrays, we are better off slicing the input array and running a single loop, with each iteration working on chunks of data, like so -
n = d - 1
idx = np.concatenate(( [0], np.arange(n, 0, -1).cumsum() ))
start, stop = idx[:-1], idx[1:]
L = n * (n + 1) // 2
res_out = np.empty((L, X.shape[1]), dtype=X.dtype)
for i, (s0, s1) in enumerate(zip(start, stop)):
    res_out[s0:s1] = X[i] * X[i+1:]
To get Z directly and thus avoid all those concatenations, we could modify the earlier posted approach, like so -
n = d - 1
N = len(X)
idx = 2*N + np.concatenate(( [0], np.arange(n, 0, -1).cumsum() ))
start, stop = idx[:-1], idx[1:]
L = n * (n + 1) // 2
Z_out = np.empty((2*N + L, X.shape[1]), dtype=X.dtype)
Z_out[:N] = X
Z_out[N:2*N] = X**2
for i, (s0, s1) in enumerate(zip(start, stop)):
    Z_out[s0:s1] = X[i] * X[i+1:]
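A quick way to convince yourself that this preallocated version matches the concatenation-based one (reusing the hypothetical X, r, c from the sketch above):
print(np.allclose(Z_out, np.concatenate((X, X**2, X[r] * X[c]))))   # True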

Vectorizing numpy loops

I've been trying to vectorize the loops out of the following code (edited per comments):
M, N, F = 10, 50, 30
ts = np.linspace(0.001, 3, M)
v = np.random.rand(N, 1)
A = np.random.rand(N, N)
D = np.zeros(shape=(N, N, M))
for i, t in enumerate(ts):
    for x in range(0, N):
        for y in range(x, N):
            D[x, y, i] = np.sum(np.exp(-t * v[0:F]) * A[x, 0:F] * A[y, 0:F])
            D[y, x, i] = D[x, y, i]
I've been reading other questions but can't figure out how to apply it here.
Suggestions?
Here's a vectorized approach using a combination of broadcasting and matrix-multiplication with np.dot -
# Get r,c indices corresponding to indices along dim-0,1 for o/p
r,c = np.triu_indices(N)
vals = (A[r,:F] * A[c,:F]).dot(np.exp(v[:F] * (-ts)))
# Initialize o/p array and assign values
out = np.empty(shape=(N,N,M))
out[r,c,:] = vals
out[c,r,:] = vals
