Fancy indexing of a numpy ndarray - python

Suppose I have an array a, created as follows:
import numpy as np
n = 10
d = 5
a = np.zeros(shape = np.repeat(n,d))
I want to obtain the values at the indices (0, ..., :, ..., 0), with the : placed along each dimension in turn. The result should be an (n, d)-shaped array b with b[i, j] = a[0, ..., 0, i, 0, ..., 0], where the i is in the jth dimension.
How can I extract b from a?

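For a vectorized solution, compute the flattened indices and index with them (this assumes every dimension of a has the same length n):
n = len(a)
d = a.ndim
# In C order, a step of 1 along axis j moves n**(d-1-j) elements in the flattened array
idxs = np.multiply.outer(np.arange(n), n**np.arange(d - 1, -1, -1))
out = a.flat[idxs]  # shape (n, d)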

The easiest approach is a for loop:
# get the first slice of `a` along the given dimension `j`
def get_slice(a, j):
    idx = [0] * a.ndim
    idx[j] = slice(None)
    return a[tuple(idx)]
# stack the slices as columns so that out[i, j] has the i in dimension j
out = np.stack([get_slice(a, j) for j in range(a.ndim)], axis=1)
Here out.shape is (10, 5).
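To check either approach, it helps to use an array whose entries are all different rather than all zeros. The following is a minimal sketch, assuming smaller sizes than in the question; the helper name axis_lines is made up for illustration.

```python
import numpy as np

n, d = 4, 3  # smaller than the question's sizes, chosen for illustration
a = np.arange(n**d).reshape((n,) * d)

def axis_lines(a):
    """Hypothetical helper: column j holds a[0, ..., :, ..., 0] with the slice on axis j."""
    cols = []
    for j in range(a.ndim):
        idx = [0] * a.ndim
        idx[j] = slice(None)
        cols.append(a[tuple(idx)])
    return np.stack(cols, axis=1)  # shape (n, d)

b_loop = axis_lines(a)

# Flat-index version: in C order, a step of 1 along axis j
# moves n**(d-1-j) elements in the flattened array.
strides = n ** np.arange(d - 1, -1, -1)
b_flat = a.ravel()[np.multiply.outer(np.arange(n), strides)]

assert b_loop.shape == (n, d)
assert np.array_equal(b_loop, b_flat)
assert b_loop[2, 1] == a[0, 2, 0]
```

Both versions assume that every axis of a has the same length; the loop version also works when the axes differ, as long as the slices are collected in a list instead of being stacked.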

Related

How to sum a single column array with another array (going column by column)?

The code below allows me to add a vector to each row of a given matrix using Numpy:
import numpy as np
m = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 1, 0])
print("Original vector:")
print(v)
print("Original matrix:")
print(m)
result = np.empty_like(m)
for i in range(4):
    result[i, :] = m[i, :] + v
print("\nAfter adding the vector v to each row of the matrix m:")
print(result)
How do I perform a similar addition operation, but going column by column?
I have tried the following:
import numpy as np
array1 = np.array([[5,5,3],[2,2,3]])
print(array1)
addition = np.array([[1],[1]])
print(addition)
for i in range(3):
    array1[:,i] = array1[:,i] + addition
print(array1)
However, I get the following broadcasting error:
ValueError: could not broadcast input array from shape (2,2) into shape (2)
Just match the number of dimensions and NumPy will broadcast the arrays as needed. In the first example, it should be:
result = m + v.reshape((1, -1))
In the second example, the addition is already 2D so it will be just:
array1 + addition
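Alternatively, do the addition in place. Since addition already has shape (2, 1), no extra axis is needed:
array1 += addition
If addition were 1-D, e.g. np.array([1, 1]), you would first add an axis with NumPy's None syntax:
array1 += addition[:, None]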
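The two cases differ only in which axis the vector is aligned with. Here is a short sketch of both, using the arrays from the question; the 1-D vector col is an assumption added for illustration.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 1, 0])

# Row-wise: v has shape (3,), which broadcasts against (4, 3) as-is.
rows = m + v

array1 = np.array([[5, 5, 3], [2, 2, 3]])
col = np.array([1, 1])  # hypothetical 1-D vector, one entry per row

# Column-wise: turn (2,) into (2, 1) so it lines up with the rows of (2, 3).
cols = array1 + col[:, None]

print(rows)
print(cols)
```

Neither case needs an explicit loop; broadcasting aligns shapes from the last axis backwards, which is why only the column case needs the extra axis.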

How to convert a matrix of torch.tensor to a larger tensor?

I have a problem converting a Python matrix (a list of lists) of torch.tensor objects to a single torch.tensor.
For example, M is an (n, m) matrix in which each element M[i][j] is a torch.tensor of the same size (p, q, r, ...). How can I convert the Python list of lists M to a torch.tensor of size (n, m, p, q, r, ...)?
e.g.
M = []
for i in range(5):
    row = []
    for j in range(10):
        row.append(torch.rand(3,4))
    M.append(row)
How can I convert the M above to a torch.tensor of size (5, 10, 3, 4)?
Try torch.stack() to stack a list of tensors on the first dimension.
import torch
M = []
for i in range(5):
    row = []
    for j in range(10):
        row.append(torch.rand(3,4))
    row = torch.stack(row)
    M.append(row)
M = torch.stack(M)
print(M.size())
# torch.Size([5, 10, 3, 4])
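Alternatively, if the elements are NumPy arrays, combine them into one array first and convert the result:
import numpy as np
import torch
ref = np.arange(3*4*5).reshape(3,4,5) # numpy array
values = [ref.copy()+i for i in range(6)] # List of numpy arrays
b = torch.from_numpy(np.array(values)) # torch tensor from the list of numpy arrays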
References
Converting NumPy Array to Torch Tensor

Initialize a numpy sparse matrix efficiently

I have an array with m rows whose values are arrays of column indices, each index bounded by a large number n.
E.g.:
Y = [[1,34,203,2032],...,[2984]]
Now I want an efficient way to initialize a sparse matrix X with dimensions (m, n) and values given by Y (X[i,j] = 1 if j is in Y[i], and 0 otherwise).
Your data are already close to csr format, so I suggest using that:
import numpy as np
from scipy import sparse
from itertools import chain
# create an example
m, n = 20, 10
X = np.random.random((m, n)) < 0.1
Y = [list(np.where(y)[0]) for y in X]
# construct the sparse matrix
indptr = np.fromiter(chain((0,), map(len, Y)), int, len(Y) + 1).cumsum()
indices = np.fromiter(chain.from_iterable(Y), int, indptr[-1])
data = np.ones_like(indices)
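S = sparse.csr_matrix((data, indices, indptr), (m, n))
# or let scipy infer the shape; the number of columns is then
# max(indices) + 1, which can be smaller than n:
# S = sparse.csr_matrix((data, indices, indptr))
# check
assert np.all(S.toarray() == X)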
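To see what the three CSR arrays contain, here is a small worked example that uses only NumPy. The list Y and the sizes are made up for illustration, and the dense reconstruction loop is only a check of the layout, not something to run on large data.

```python
import numpy as np
from itertools import chain

Y = [[1, 3], [], [0, 2, 4]]  # made-up column indices, one list per row
m, n = len(Y), 5

# indptr[i]:indptr[i + 1] is the slice of `indices` that belongs to row i.
indptr = np.concatenate(([0], np.cumsum([len(row) for row in Y])))
indices = np.fromiter(chain.from_iterable(Y), dtype=int, count=int(indptr[-1]))

# Dense reconstruction, only to verify the layout.
X = np.zeros((m, n), dtype=int)
for i in range(m):
    X[i, indices[indptr[i]:indptr[i + 1]]] = 1

print(indptr)   # [0 2 2 5]
print(indices)  # [1 3 0 2 4]
```

The empty second row shows up as two equal consecutive entries in indptr, which is how CSR represents a row without stored values.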

Multi-dimensional gather in Tensorflow

The general solution to this question is being worked on in this GitHub issue, but I was wondering whether there are workarounds using tf.gather (or something else) to achieve array indexing with a multi-index. One solution I came up with was to broadcast-multiply each index in the multi-index by the cumulative product of the tensor shape, which produces indices suitable for indexing the flattened tensor:
import tensorflow as tf
import numpy as np
def __cumprod(l):
    # Get the length and make a copy
    ll = len(l)
    l = [v for v in l]
    # Reverse cumulative product
    for i in range(ll-1):
        l[ll-i-2] *= l[ll-i-1]
    return l
def ravel_multi_index(tensor, multi_idx):
    """
    Returns a tensor suitable for use as the index
    on a gather operation on argument tensor.
    """
    if not isinstance(tensor, (tf.Variable, tf.Tensor)):
        raise TypeError('tensor should be a tf.Variable')
    if not isinstance(multi_idx, list):
        multi_idx = [multi_idx]
    # Shape of the tensor in ints
    shape = [i.value for i in tensor.get_shape()]
    if len(shape) != len(multi_idx):
        raise ValueError("Tensor rank is different "
                         "from the multi_idx length.")
    # Work out the shape of each tensor in the multi_idx
    idx_shape = [tuple(j.value for j in i.get_shape()) for i in multi_idx]
    # Ensure that each multi_idx tensor is length 1
    assert all(len(i) == 1 for i in idx_shape)
    # Create a list of reshaped indices. New shape will be
    # [1, 1, dim[0], 1] for the 3rd index in multi_idx
    # for example.
    reshaped_idx = [tf.reshape(idx, [1 if i !=j else dim[0]
                                     for j in range(len(shape))])
                    for i, (idx, dim)
                    in enumerate(zip(multi_idx, idx_shape))]
    # Figure out the base indices for each dimension
    base = __cumprod(shape)
    # Now multiply base indices by each reshaped index
    # to produce the flat index
    return (sum(b*s for b, s in zip(base[1:], reshaped_idx[:-1]))
            + reshaped_idx[-1])
# Shape and slice starts and sizes
shape = (Z, Y, X) = 4, 5, 6
Z0, Y0, X0 = 1, 1, 1
ZS, YS, XS = 3, 3, 4
# Numpy matrix and index
M = np.random.random(size=shape)
idx = [
    np.arange(Z0, Z0+ZS).reshape(ZS,1,1),
    np.arange(Y0, Y0+YS).reshape(1,YS,1),
    np.arange(X0, X0+XS).reshape(1,1,XS),
]
# Tensorflow matrix and indices
TM = tf.Variable(M)
TF_flat_idx = ravel_multi_index(TM, [
    tf.range(Z0, Z0+ZS),
    tf.range(Y0, Y0+YS),
    tf.range(X0, X0+XS)])
TF_data = tf.gather(tf.reshape(TM,[-1]), TF_flat_idx)
with tf.Session() as S:
    S.run(tf.initialize_all_variables())
    # Obtain data via flat indexing
    data = S.run(TF_data)
# Check that it agrees with data obtained
# by numpy smart indexing (the index must be a tuple, not a list)
assert np.all(data == M[tuple(idx)])
However, this only works on tensors of rank 3, due to a (current) limitation that restricts broadcasts to tensors of rank 3.
At the moment I can only think of doing a chained gather, transpose, gather, transpose, gather, but this is unlikely to be efficient, e.g.:
shape = (8, 9, 10)
A = tf.random_normal(shape)
data = tf.gather(tf.transpose(tf.gather(A, [1, 3]), [1,0,2]), ...)
Any ideas?
It sounds like you want tf.gather_nd.
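For reference, the flat-index idea from the question can be checked in plain NumPy, where np.ravel_multi_index performs the same multiplication by the cumulative product of the shape. This is a NumPy-only sketch using the question's sizes, not TensorFlow code.

```python
import numpy as np

shape = (4, 5, 6)
M = np.arange(np.prod(shape)).reshape(shape)

# Open-mesh index arrays for the block M[1:4, 1:4, 1:5].
idx = np.ix_(np.arange(1, 4), np.arange(1, 4), np.arange(1, 5))

# Element strides for C order: (5*6, 6, 1).
strides = np.cumprod((1,) + shape[:0:-1])[::-1]

# Broadcast-multiply each index by its stride and add them up.
flat_idx = sum(s * i for s, i in zip(strides, idx))

# The same offsets from NumPy's built-in helper, on fully broadcast indices.
flat_ref = np.ravel_multi_index(tuple(np.broadcast_arrays(*idx)), shape)

data = M.ravel()[flat_idx]
assert data.shape == (3, 3, 4)
assert np.array_equal(flat_idx, flat_ref)
assert np.array_equal(data, M[idx])
```

Because the strides are computed from the shape, the same few lines work for any rank in NumPy.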

Python numpy array manipulation

I need to manipulate a NumPy array.
My array has the following format:
x = [1280][720][4]
The array stores image data in the third dimension:
x[0][0] = [Red,Green,Blue,Alpha]
Now I need to convert my array to the following form:
x = [1280][720]
x[0][0] = Red + Green + Blue / 3
My current code is extremely slow, and I want to use NumPy array operations to speed it up:
for a in range(0,719):
    for b in range(0,1279):
        newx[a][b] = x[a][b][0]+x[a][b][1]+x[a][b][2]
x = newx
Also, if possible, I need the code to work for variable array sizes.
Thanks a lot.
Use the numpy.mean function:
import numpy as np
n = 1280
m = 720
# Generate a n * m * 4 matrix with random values
x = np.round(np.random.rand(n, m, 4)*10)
# Calculate the mean over the first 3 values along axis 2 (axes are counted from 0)
xnew = np.mean(x[:, :, 0:3], axis=2)
x[:, :, 0:3] gives you the first 3 values in the 3rd dimension (see the NumPy indexing documentation).
axis=2 specifies the axis along which the mean value is calculated.
Slice the alpha channel out of the array, and then sum the array along the RGB axis and divide by 3:
x = x[:,:,:-1]
x_sum = x.sum(axis=2)
x_div = x_sum / float(3)
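Note that Red + Green + Blue / 3, read literally, divides only the blue channel, because division binds tighter than addition; both answers assume that the average of the three channels is what is wanted. Here is a sketch that works for any height and width, assuming the last axis is RGBA; the function name rgb_mean is made up for illustration.

```python
import numpy as np

def rgb_mean(x):
    """Hypothetical helper: average R, G and B over the last axis, ignoring alpha."""
    return x[..., :3].mean(axis=-1)

x = np.zeros((2, 3, 4))
x[..., 0] = 30.0   # red
x[..., 1] = 60.0   # green
x[..., 2] = 90.0   # blue
x[..., 3] = 255.0  # alpha, ignored

gray = rgb_mean(x)
print(gray.shape)  # (2, 3)
```

Using the ellipsis and axis=-1 keeps the function independent of the image size, which covers the request for variable array sizes.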
