Tensorflow- Cartesian product of two 2-D tensors - python

I have two 2-D tensors and want the Cartesian product of them. By Cartesian, I mean the concatenation of every row of the first tensor with every row of the second tensor. For example:
Input:
[[1,2,3],[4,5,6]]
and
[[7,8],[9,10]]
Output:
[[1,2,3,7,8],
[1,2,3,9,10],
[4,5,6,7,8],
[4,5,6,9,10]]
I've seen this post, but it doesn't work for this case. What is the best way to do it?
Thanks

Here is one way. Repeat a along the second dimension and b along the first, reshape the repeated a back into rows, and then concatenate the two repeated tensors.
a_ = tf.reshape(tf.tile(a, [1, b.shape[0]]), (a.shape[0] * b.shape[0], a.shape[1]))
b_ = tf.tile(b, [a.shape[0], 1])
tf.concat([a_, b_], 1).eval()
#array([[ 1, 2, 3, 7, 8],
# [ 1, 2, 3, 9, 10],
# [ 4, 5, 6, 7, 8],
# [ 4, 5, 6, 9, 10]])
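If you are on TF 2.x with eager execution (an assumption about your setup), the same tile/reshape/concat recipe works without .eval(); a minimal self-contained sketch:
import tensorflow as tf

a = tf.constant([[1, 2, 3], [4, 5, 6]])
b = tf.constant([[7, 8], [9, 10]])
# repeat each row of a once per row of b, and tile b once per row of a
a_ = tf.reshape(tf.tile(a, [1, b.shape[0]]), (a.shape[0] * b.shape[0], a.shape[1]))
b_ = tf.tile(b, [a.shape[0], 1])
print(tf.concat([a_, b_], axis=1).numpy())
# [[ 1  2  3  7  8]
#  [ 1  2  3  9 10]
#  [ 4  5  6  7  8]
#  [ 4  5  6  9 10]]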

Choose rows of a 3d Tensor based on some repeated indices. Tricky Slicing

So this is a tricky bit of Tensor slicing I'm trying.
I have a tensor A which is 3d
>> A.shape
torch.Size([60, 10, 16])
So this tensor is composed of 5 different data samples, where along dim=0 we have cuts at split_ids = [10, 14, 10, 12, 14], i.e. the first 10 elements belong to sample 1, the next 14 belong to sample 2, and so on. I can split the tensor like this:
>> torch.split(A, split_ids, dim=0)
(tensor([[[-0.3888, -...Backward>), tensor([[[ 2.6473e-0...Backward>), tensor([[[ 1.1621, ...Backward>), tensor([[[ 0.1953, -...Backward>), tensor([[[ 8.1993e-0...Backward>))
This is a tuple of 5 tensors, of shapes Size(10, 10, 16), Size(14, 10, 16), and so on for the splits we had.
Now comes the tricky part: I have another index mapping, derived from some previous processing, for each of these individual splits. It's a list of 1d tensors like this:
>> reverse_map
[tensor([1, 2, 2, 1, ...='cuda:0'), tensor([ 7, 7, 9, ...='cuda:0'), tensor([7, 7, 4, 3, ...='cuda:0'), tensor([ 9, 4, 9, ...='cuda:0'), tensor([ 0, 0, 0, ...='cuda:0')]
>> reverse_map[0]
tensor([1, 2, 2, 1, 1, 2, 0, 1, 2, 0, 0, 1, 6, 1, 7, 1, 2, 7, 4, 5, 3, 4, 9, 7,
7, 3, 8, 7, 7, 7], device='cuda:0')
So I basically need to use the counts of these indices to pull rows from the above split tensors, i.e. for the tensor of Size(10, 10, 16) I need to pull [0, 0:3, :]: at index 0 in dim=0 I pull 0:3 in dim=1 because there are 3 0's in the index map. Then at index 1 I need to pull the first 4 vectors since there are 4 1's, and so on.
What's the best way to do this? Does scatter_() help here?
I think a combination of torch.bincount and this answer can give you what you want.
For simplicity, let's focus on the first tensor split from A and the first reverse_map. You can then apply this code to the other splits and reverse_maps. Let source be the first split of shape (10, 10, 16).
Here's how it goes:
import torch

# inputs
source = torch.arange(10 * 10 * 16).view(10, 10, 16)
reverse_map = torch.tensor([1, 2, 2, 1, 1, 2, 0, 1, 2, 0, 0, 1, 6, 1, 7, 1, 2, 7, 4, 5, 3, 4, 9, 7, 7, 3, 8, 7, 7, 7])
# how many columns to pull for each row - use bincount to find out!
lengths = torch.bincount(reverse_map, minlength=source.shape[0])
# use a mask to pull the elements
mask = torch.zeros(source.shape[0], source.shape[1] + 1, dtype=source.dtype, device=source.device)
mask[(torch.arange(source.shape[0]), lengths)] = 1
mask = mask.cumsum(dim=1)[:, :-1] == 0
# expand the mask to dim=2 as well and pull the elements
out = source[mask[..., None].expand(-1, -1, source.shape[2])]
# since you pull a different number of columns per row, you lose the shape of source; a final split recovers it
target = torch.split(out, (lengths * source.shape[2]).cpu().numpy().tolist())
target = [t_.view(-1, source.shape[2]) for t_ in target]
The output target is a list of 2-D tensors, each with a varying number of rows (according to the counts in reverse_map) and source.shape[2] columns.
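To apply this to all five splits, one option (a sketch; the helper name pull_rows is just illustrative, and A, split_ids and reverse_map are assumed to be the objects from the question) is to wrap the steps above in a function and loop:
import torch

def pull_rows(source, reverse_map):
    # counts per index -> how many columns to keep for each row of this split
    lengths = torch.bincount(reverse_map, minlength=source.shape[0])
    mask = torch.zeros(source.shape[0], source.shape[1] + 1,
                       dtype=torch.long, device=source.device)
    mask[(torch.arange(source.shape[0], device=source.device), lengths)] = 1
    mask = mask.cumsum(dim=1)[:, :-1] == 0
    out = source[mask[..., None].expand(-1, -1, source.shape[2])]
    chunks = torch.split(out, (lengths * source.shape[2]).cpu().tolist())
    return [c.view(-1, source.shape[2]) for c in chunks]

splits = torch.split(A, split_ids, dim=0)
targets = [pull_rows(s, r) for s, r in zip(splits, reverse_map)]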

PyTorch slice matrix with vector

Say I have one matrix and one vector as follows:
import torch
x = torch.tensor([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
y = torch.tensor([0, 2, 1])
Is there a way to slice it, like x[y], so the result is:
res = [1, 6, 8]
So basically, for each row, I take the corresponding element of y and use it as the column index into that row of x.
You can specify the corresponding row index as:
import torch
x = torch.tensor([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
y = torch.tensor([0, 2, 1])
x[range(x.shape[0]), y]
tensor([1, 6, 8])
Advanced indexing in PyTorch works just like NumPy's, i.e. the indexing arrays are broadcast together across the axes, so you could do it as in FBruzzesi's answer.
Similarly to np.take_along_axis, in PyTorch you also have torch.gather to take values along a specific axis:
x.gather(1, y.view(-1,1)).view(-1)
# tensor([1, 6, 8])
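On recent PyTorch versions (1.9 or newer, which is an assumption here), torch.take_along_dim mirrors np.take_along_axis directly:
torch.take_along_dim(x, y.view(-1, 1), dim=1).view(-1)
# tensor([1, 6, 8])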

1D vector from lower part of diagonal in matrix [Python]

I am struggling with a pretty easy thing but unfortunately I cannot solve it. I have a 64x64 matrix, as you can see in the image, where the reds are zeros and the greens are the values I am interested in.
I would like to end up with only the lower triangular part under the diagonal (the green values) in one array.
I use Python 2.7
Thank you a lot,
Michael
Assuming you can pull your data into a NumPy array, use the tril_indices function. It looks like your data doesn't include the main diagonal, so you can shift by -1:
import numpy as np

data = np.arange(4096).reshape(64, 64)
inds = np.tril_indices(64, -1)
vals = data[inds]
You can use np.tril_indices, which returns the indices of the lower triangular part of a matrix with a given shape. The indices can then be used to extract values from the matrix; suppose your matrix is called arr:
arr[np.tril_indices(n=64,m=64)]
You can provide an extra offset parameter if you want to exclude the diagonal:
arr[np.tril_indices(n = 64, m = 64, k = -1)]
An example:
arr = np.array([list(range(i, 5+i)) for i in range(5)])
arr
#array([[0, 1, 2, 3, 4],
# [1, 2, 3, 4, 5],
# [2, 3, 4, 5, 6],
# [3, 4, 5, 6, 7],
# [4, 5, 6, 7, 8]])
arr[np.tril_indices(n = 5, m = 5)]
# array([0, 1, 2, 2, 3, 4, 3, 4, 5, 6, 4, 5, 6, 7, 8])
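And with the diagonal excluded via k = -1, on the same 5x5 arr:
arr[np.tril_indices(n = 5, m = 5, k = -1)]
# array([1, 2, 3, 3, 4, 5, 4, 5, 6, 7])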
Twice as fast as the tril_indices approach on this example:
n = arr.shape[0]
np.concatenate([arr[i, :i] for i in range(1, n)])
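As a quick sanity check (using the same 5x5 arr as above), both routes return the same strictly lower-triangular values in the same order:
np.array_equal(arr[np.tril_indices(n=n, m=n, k=-1)],
               np.concatenate([arr[i, :i] for i in range(1, n)]))
# True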

Numpy: vectorized access of several columns at once?

I have scripts with multi-dimensional arrays and instead of for-loops I would like to use a vectorized implementation for my problems (which sometimes contain column operations).
Let's consider a simple example with matrix arr:
> arr = np.arange(12).reshape(3, 4)
> arr
> ([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
> arr.shape
> (3, 4)
So we have a matrix arr with 3 rows and 4 columns.
The simplest case in my scripts is adding something to the values in the array. E.g. I'm doing this for single or multiple rows:
> someVector = np.array([1, 2, 3, 4])
> arr[0] += someVector
> arr
> array([[ 1, 3, 5, 7], <--- successfully added someVector
[ 4, 5, 6, 7], to one row
[ 8, 9, 10, 11]])
> arr[0:2] += someVector
> arr
> array([[ 2, 5, 8, 11], <--- added someVector to two
[ 5, 7, 9, 11], <--- rows at once
[ 8, 9, 10, 11]])
This works well. However, sometimes I need to manipulate one or several columns. One column at a time works:
> arr[:, 0] += [1, 2, 3]
> array([[ 3, 5, 8, 11],
[ 7, 7, 9, 11],
[11, 9, 10, 11]])
^
|___ added the values [1, 2, 3] successfully to
this column
But I am struggling to figure out why this does not work for multiple columns at once:
> arr[:, 0:2] += [1, 2, 3]
> ValueError
> Traceback (most recent call last)
> <ipython-input-16-5feef53e53af> in <module>()
> ----> 1 arr[:, 0:2] += [1, 2, 3]
> ValueError: operands could not be broadcast
> together with shapes (3,2) (3,) (3,2)
Isn't this the very same way it works with rows? What am I doing wrong here?
To add a 1D array to multiple columns you need to broadcast the values to a 2D array. Since broadcasting adds new axes on the left (of the shape) by default, broadcasting a row vector to multiple rows happens automatically:
arr[0:2] += someVector
someVector has shape (N,) and gets automatically broadcasted to shape (1, N). If arr[0:2] has shape (2, N), then the sum is performed element-wise as though both arr[0:2] and someVector were arrays of the same shape, (2, N).
But to broadcast a column vector to multiple columns requires hinting NumPy that you want broadcasting to occur with the axis on the right. In fact, you have to add the new axis on the right explicitly by using someVector[:, np.newaxis] or equivalently someVector[:, None]:
In [41]: arr = np.arange(12).reshape(3, 4)
In [42]: arr[:, 0:2] += np.array([1, 2, 3])[:, None]
In [43]: arr
Out[43]:
array([[ 1,  2,  2,  3],
       [ 6,  7,  6,  7],
       [11, 12, 10, 11]])
someVector (e.g. np.array([1, 2, 3])) has shape (N,) and someVector[:, None] has shape (N, 1) so now broadcasting happens on the right. If arr[:, 0:2] has shape (N, 2), then the sum is performed element-wise as though both arr[:, 0:2] and someVector[:, None] were arrays of the same shape, (N, 2).
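A quick way to see the shapes involved (np.broadcast_shapes needs NumPy 1.20+, which is an assumption about your install):
import numpy as np

v = np.array([1, 2, 3])
v.shape                               # (3,)
v[:, None].shape                      # (3, 1)
np.broadcast_shapes((3, 1), (3, 2))   # (3, 2) -- the column vector stretches across both columns
# np.broadcast_shapes((3,), (3, 2))   # would raise ValueError, the same mismatch as in the question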
Very clear explanation by @unutbu.
As a complement, transposition (.T) can often simplify the task by letting you work on the first dimension:
In [273]: arr = np.arange(12).reshape(3, 4)
In [274]: arr.T[0:2] += [1, 2, 3]
In [275]: arr
Out[275]:
array([[ 1,  2,  2,  3],
       [ 6,  7,  6,  7],
       [11, 12, 10, 11]])
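This works in place because arr.T is a view onto the same memory as arr, so the += through the transpose updates arr itself; a quick check:
arr = np.arange(12).reshape(3, 4)
np.shares_memory(arr, arr.T)   # True -- the transpose is a view, not a copy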

How can I find the dimensions of a matrix in Python?

How can I find the dimensions of a matrix in Python? len(A) returns only one number.
Edit:
close = dataobj.get_data(timestamps, symbols, closefield)
Is (I assume) generating a matrix of integers (less likely strings). I need to find the size of that matrix, so I can run some tests without having to iterate through all of the elements. As far as the data type goes, I assume it's an array of arrays (or list of lists).
For a list of lists, the number of rows is len(A) and the number of columns is len(A[0]), provided that all rows have the same number of columns, i.e. all the inner lists are the same size.
If you are using NumPy arrays, shape can be used.
For example
>>> a = numpy.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
>>> a
array([[[ 1,  2,  3],
        [ 1,  2,  3]],
       [[12,  3,  4],
        [ 2,  1,  3]]])
>>> a.shape
(2, 2, 3)
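For a plain list of lists, note that len(A) and len(A[0]) only describe the shape when every row has the same length; a minimal check (the names here are just illustrative):
A = [[1, 5, 6, 8], [1, 2, 5, 9], [7, 5, 6, 2]]
rows, cols = len(A), len(A[0])
assert all(len(row) == cols for row in A), "ragged rows: not a rectangular matrix"
print(rows, cols)   # 3 4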
As Ayman farhat mentioned, you can use len(matrix) to get the number of rows, and len(matrix[0]) to get the number of columns:
>>> a=[[1,5,6,8],[1,2,5,9],[7,5,6,2]]
>>> len(a)
3
>>> len(a[0])
4
You can also use a library that helps you with matrices, "numpy":
>>> import numpy
>>> numpy.shape(a)
(3,4)
To get just the number of dimensions in NumPy:
len(a.shape)
In the first case:
import numpy as np
a = np.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
print("shape = ",np.shape(a))
print("dimensions = ",len(a.shape))
The output will be:
shape = (2, 2, 3)
dimensions = 3
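Equivalently, the ndim attribute (or np.ndim, which also accepts plain Python lists) gives the same number without going through shape:
print(a.ndim)       # 3
print(np.ndim(a))   # 3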
m = [[1, 1, 1, 0],[0, 5, 0, 1],[2, 1, 3, 10]]
print(len(m),len(m[0]))
Output:
3 4
The correct answer is the following:
import numpy
numpy.shape(a)
Suppose you have an array a. To get the dimensions of an array, you should use shape.
import numpy as np
a = np.array([[3,20,99],[-13,4.5,26],[0,-1,20],[5,78,-19]])
a.shape
The output of this will be
(4,3)
You may use the following to get the height and width of a NumPy array:
height = arr.shape[0]
width = arr.shape[1]
If your array has more dimensions, you can increase the index to access them.
You can simply find a matrix's dimensions by using NumPy:
import numpy as np
x = np.arange(24).reshape((6, 4))
x.ndim
output will be:
2
It means this matrix is a 2 dimensional matrix.
x.shape
Will show you the size of each dimension. The shape for x is equal to:
(6, 4)
A simple way I look at it:
example:
h=np.array([[[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]]])
h.ndim
4
h
array([[[[ 1,  2,  3],
         [ 3,  4,  5]],
        [[ 5,  6,  7],
         [ 7,  8,  9]],
        [[ 9, 10, 11],
         [12, 13, 14]]]])
If you observe closely, the number of opening square brackets at the beginning is the number of dimensions of the array.
In the above array, to access 7, the indexing below is used:
h[0,1,1,0]
However if we change the array to 3 dimensions as below,
h=np.array([[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]])
h.ndim
3
h
array([[[ 1,  2,  3],
        [ 3,  4,  5]],
       [[ 5,  6,  7],
        [ 7,  8,  9]],
       [[ 9, 10, 11],
        [12, 13, 14]]])
To access element 7 in the above array, the index is h[1,1,0]
