construct a pairwise matrix from a vector in tensorflow - python

Suppose that I have a 1*3 vector [[1,3,5]] (or a list like [1,3,5] if you wish). How do I generate a 9*2 matrix: [[1,1],[1,3],[1,5],[3,1],[3,3],[3,5],[5,1],[5,3],[5,5]]?
The elements of the new matrix are the pairwise combinations of the elements of the original matrix.
Also, the original matrix could contain zeros, like this: [[0,1],[0,3],[0,5]].
The implementation should generalise to vectors of any dimensionality.
Many thanks!

You can use tf.meshgrid() and tf.transpose() to generate two matrices. Then reshape and concat them.
import tensorflow as tf
a = tf.constant([[1,3,5]])
A,B=tf.meshgrid(a,tf.transpose(a))
result = tf.concat([tf.reshape(B,(-1,1)),tf.reshape(A,(-1,1))],axis=-1)
with tf.Session() as sess:
    print(sess.run(result))
[[1 1]
[1 3]
[1 5]
[3 1]
[3 3]
[3 5]
[5 1]
[5 3]
[5 5]]
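For readers who want to check the meshgrid idea without a TensorFlow session, here is a minimal NumPy sketch of the same construction. It is only an illustration, assuming a NumPy equivalent is acceptable; the helper name pairwise_pairs is hypothetical and not part of the answer above.

```python
import numpy as np


def pairwise_pairs(values):
    """Return all ordered pairs of the input elements as an (n*n, 2) array."""
    # Flatten so that both [[1, 3, 5]] and [1, 3, 5] are accepted.
    v = np.asarray(values).reshape(-1)
    # With indexing="ij": first[i, j] == v[i] and second[i, j] == v[j].
    first, second = np.meshgrid(v, v, indexing="ij")
    # Flatten both grids in row-major order and pair them column-wise.
    return np.stack([first.reshape(-1), second.reshape(-1)], axis=-1)


pairs = pairwise_pairs([[1, 3, 5]])
print(pairs)
```

Because the input is flattened first, the sketch works for a vector of any length, including one that contains zeros.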

You can use product from itertools:
from itertools import product
import numpy as np
np.array([np.array(item) for item in product([1,3,5],repeat =2 )])
array([[1, 1],
[1, 3],
[1, 5],
[3, 1],
[3, 3],
[3, 5],
[5, 1],
[5, 3],
[5, 5]])

I also came up with an answer, similar to @giser_yugang's, but not using tf.meshgrid and tf.concat.
import tensorflow as tf
inds = tf.constant([1,3,5])
num = tf.shape(inds)[0]
ind_flat_lower = tf.tile(inds,[num])
ind_mat = tf.reshape(ind_flat_lower,[num,num])
ind_flat_upper = tf.reshape(tf.transpose(ind_mat),[-1])
result = tf.transpose(tf.stack([ind_flat_upper,ind_flat_lower]))
with tf.Session() as sess:
    print(sess.run(result))
[[1 1]
[1 3]
[1 5]
[3 1]
[3 3]
[3 5]
[5 1]
[5 3]
[5 5]]
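The tile, reshape, and transpose steps above can be followed more easily in NumPy. The sketch below is an assumption-laden illustration, not the answer's own code: the helper name pairwise_by_tiling is hypothetical, and np.repeat is swapped in for the transpose-and-flatten step, which produces the same sequence.

```python
import numpy as np


def pairwise_by_tiling(values):
    """Build all ordered pairs by tiling and repeating a 1-D vector."""
    v = np.asarray(values).reshape(-1)
    n = v.shape[0]
    # Second column: the whole vector repeated n times, e.g. 1 3 5 1 3 5 1 3 5.
    lower = np.tile(v, n)
    # First column: each element repeated n times, e.g. 1 1 1 3 3 3 5 5 5.
    upper = np.repeat(v, n)
    return np.stack([upper, lower], axis=1)


v = np.array([1, 3, 5])
# Transposing the tiled (n, n) matrix and flattening gives the same
# sequence as np.repeat, which is what the TensorFlow answer relies on.
tiled_matrix = np.tile(v, 3).reshape(3, 3)
upper_via_transpose = tiled_matrix.T.reshape(-1)

result = pairwise_by_tiling(v)
print(result)
```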

Related

How to get indices of np.amin of 3d np array?

Given the following code:
import numpy as np
x = np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]],
[[3, 1],
[1, 5]]])
x_min = np.amin(x, axis=0)
print(x_min)
The output (x_min) is
[[1 1]
[1 4]]
Now I want to get, for each entry of x_min, the index along dimension 0 of x where that minimum occurs. The result should be:
[[0 2]
[2 0]]
Which function can I use to get these indices?
Try np.argmin: np.argmin(x, axis=0)
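A short, self-contained check of that suggestion on the array from the question. The np.take_along_axis step is an addition for illustration only; it shows that the returned indices point back at the minima.

```python
import numpy as np

x = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 8]],
              [[3, 1],
               [1, 5]]])

# Index along axis 0 at which each minimum occurs.
idx = np.argmin(x, axis=0)
print(idx)

# Use the indices to pull the minima back out of x.
# idx needs a leading axis so that it has the same number of dimensions as x.
recovered = np.take_along_axis(x, idx[np.newaxis, ...], axis=0)[0]
print(recovered)
```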

Remove elements in an ndarray based on condition on one dimension

In a Numpy ndarray, how do I remove elements in a dimension based on condition in a different dimension?
I have:
[[[1 3]
[1 4]]
[[2 6]
[2 8]]
[[3 5]
[3 5]]]
I want to remove elements based on the condition x[:,:,1] < 7.
Desired output (x[1,:,:] removed):
[[[1 3]
[1 4]]
[[3 5]
[3 5]]]
EDIT: fixed typo
This may work:
x[np.where(np.all(x[..., 1] < 7, axis=1)), ...]
yields
array([[[[1, 3],
[1, 4]],
[[3, 5],
[3, 5]]]])
You do get an extra dimension, but that's easy to remove:
np.squeeze(x[np.where(np.all(x[..., 1] < 7, axis=1)), ...])
Briefly how it works:
First the condition: x[..., 1] < 7.
Then test if the condition is valid for all elements along the specific axis: np.all(x[..., 1] < 7, axis=1).
Then, use where to grab the indices instead of an array of booleans: np.where(np.all(x[..., 1] < 7, axis=1)).
And insert those indices into the relevant dimension: x[np.where(np.all(x[..., 1] < 7, axis=1)), ...].
To get your desired output, you need to filter x along axis 0. You can do it this way:
m = (x[:,:,1] < 7).all(1)
x_out = x[m,:,:]
Or simply
x_out = x[m]
Out[70]:
array([[[1, 3],
[1, 4]],
[[3, 5],
[3, 5]]])
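Putting the pieces together, here is a runnable sketch with the array from the question. The variable names keep and rows are chosen for illustration. It also shows that taking the first element of the tuple returned by np.where avoids the extra dimension, so no squeeze is needed.

```python
import numpy as np

x = np.array([[[1, 3],
               [1, 4]],
              [[2, 6],
               [2, 8]],
              [[3, 5],
               [3, 5]]])

# True for every block along axis 0 whose second column is entirely below 7.
keep = (x[..., 1] < 7).all(axis=1)

# Boolean mask indexing filters along axis 0.
x_out = x[keep]

# Equivalent integer indexing: np.where returns a tuple, so take element 0.
rows = np.where(keep)[0]
x_out_where = x[rows]

print(x_out)
```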

Tensorflow embedding_lookup on multiple dimension

I would like to select a part of this tensor.
A = tf.constant([[[1,1],[2,2],[3,3]], [[4,4],[5,5],[6,6]]])
The output of A will be
[[[1 1]
[2 2]
[3 3]]
[[4 4]
[5 5]
[6 6]]]
The index I want to select from A is [1, 0]. I mean [2 2] of the first part and [4 4] of the second part of this tensor, so my expected result is
[2 2]
[4 4]
How can I do it with the embedding_lookup function? I have already tried this:
B = tf.nn.embedding_lookup(A, [1, 0])
but the result is not what I expected:
[[[4 4]
[5 5]
[6 6]]
[[1 1]
[2 2]
[3 3]]]
Can anyone help me and explain how to do it?
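Try the following:
import numpy as np
import tensorflow as tf
A = tf.constant([[[1,1],[2,2],[3,3]], [[4,4],[5,5],[6,6]]])
B = [1,0]
# Pair each position along the first axis with the index to pick there: [(0, 1), (1, 0)]
inds = [(a,b) for a,b in zip(np.arange(len(B)), B)]
C = tf.gather_nd(params=A, indices=inds)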
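To see what those index pairs select, the same lookup can be written with NumPy integer array indexing. This is an analogue for illustration, assuming a NumPy copy of A; it is not a replacement for tf.gather_nd inside a TensorFlow graph.

```python
import numpy as np

A = np.array([[[1, 1], [2, 2], [3, 3]],
              [[4, 4], [5, 5], [6, 6]]])
B = [1, 0]

# Pair row i of the first axis with B[i] on the second axis:
# (0, 1) selects [2, 2] and (1, 0) selects [4, 4].
first_axis = np.arange(len(B))
selected = A[first_axis, B]
print(selected)
```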

Scanning over different dimensions of tensors in theano

I'm taking my first steps with Theano and I cannot figure out how to solve this problem, which may actually be very easy.
I have a 3 * 4 * 2 tensor, like the following:
[1 1] | [2 2] | [3 3]
[1 1] | [2 2] | [3 3]
[0 0] | [2 2] | [3 3]
[9 9] | [0 0] | [3 3]
So I have N=3 sequences, each of them of length L=4 with their elements that are vectors of dimension d=2. Actually, the sequences can be of different length but I could think of padding them with [0 0] vectors, as shown above.
What I want to do is scan through the first axis of the tensor and sum up all the vectors in each list up to the first [0 0] vector. That's why I added the [9 9] at the end of the first tensor slice: to check the sum exit condition [1]. I should end up with [[2 2], [6 6], [12 12]]. I tried many ways to solve this problem, which seems to me just a nested looping problem, but I always got some weird errors [2].
Thanks,
Giulio
--
[1]: the actual problem is the training of a recurrent neural network for NLP purposes, with N the dimension of the batch, L the max length of a sentence in the batch and d the dimension of the representation of each word. I omitted the problem so that I could focus on the simplest coding aspect.
[2] I omit the history of my failures, maybe I could add them later.
If your sequences are always zero padded then you can just sum along the axis of interest, since the padding regions will not change the sum. However, if the padding regions may contain non-zero values, there are two approaches:
1. Use scan. This is slow and should be avoided if possible. In this case it can be avoided by using the second approach.
2. Create a binary mask and multiply out the padding region.
Here's some code that illustrates these three approaches (v1 is the plain sum). For the two approaches that allow for non-zero padding regions (v2 and v3) the computation needs an additional input: a vector giving the lengths of the sequences within the batch.
import numpy
import theano
import theano.tensor as tt
def v1():
    # NOTE: [9, 9] element changed to [0, 0]
    # since zero padding must be used for
    # this method
    x_data = [[[1, 1], [1, 1], [0, 0], [0, 0]],
              [[2, 2], [2, 2], [2, 2], [0, 0]],
              [[3, 3], [3, 3], [3, 3], [3, 3]]]
    x = tt.tensor3()
    x.tag.test_value = x_data
    y = x.sum(axis=1)
    f = theano.function([x], outputs=y)
    print(f(x_data))
def v2_step(i_t, s_tm1, x, l):
    # Only add the current element for sequences that are still within their length.
    in_sequence = tt.lt(i_t, l).dimshuffle(0, 'x')
    s_t = s_tm1 + tt.switch(in_sequence, x[i_t], 0)
    return s_t
def v2():
    x_data = [[[1, 1], [1, 1], [0, 0], [9, 9]],
              [[2, 2], [2, 2], [2, 2], [0, 0]],
              [[3, 3], [3, 3], [3, 3], [3, 3]]]
    l_data = [2, 3, 4]
    x = tt.tensor3()
    x.tag.test_value = x_data
    l = tt.lvector()
    l.tag.test_value = l_data
    # Must dimshuffle first because scan can only iterate over first (0'th) axis.
    x_hat = x.dimshuffle(1, 0, 2)
    y, _ = theano.scan(v2_step, sequences=[tt.arange(x_hat.shape[0])],
                       outputs_info=[tt.zeros_like(x_hat[0])],
                       non_sequences=[x_hat, l], strict=True)
    f = theano.function([x, l], outputs=y[-1])
    print(f(x_data, l_data))
def v3():
    x_data = [[[1, 1], [1, 1], [0, 0], [9, 9]],
              [[2, 2], [2, 2], [2, 2], [0, 0]],
              [[3, 3], [3, 3], [3, 3], [3, 3]]]
    l_data = [2, 3, 4]
    x = tt.tensor3()
    x.tag.test_value = x_data
    l = tt.lvector()
    l.tag.test_value = l_data
    # Binary mask: 1 where the time index is below the sequence length, else 0.
    indexes = tt.arange(x.shape[1]).dimshuffle('x', 0)
    mask = tt.lt(indexes, l.dimshuffle(0, 'x')).dimshuffle(0, 1, 'x')
    y = (mask * x).sum(axis=1)
    f = theano.function([x, l], outputs=y)
    print(f(x_data, l_data))
def main():
    theano.config.compute_test_value = 'raise'
    v1()
    v2()
    v3()
main()
In general, if your step function is dependent on the output of a previous step then you need to use scan.
If every step/iteration could, in principle, be executed concurrently (i.e. they don't rely on each other at all) then there is often a much more efficient way to do this without using scan.
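The masking idea does not depend on Theano. As a rough sketch, assuming NumPy is an acceptable stand-in for checking the arithmetic, the same computation looks like this (the helper name masked_sequence_sum is hypothetical):

```python
import numpy as np


def masked_sequence_sum(x_data, lengths):
    """Sum each sequence over time, ignoring positions at or past its length."""
    x = np.asarray(x_data)            # shape (N, L, d)
    lengths = np.asarray(lengths)     # shape (N,)
    # mask[n, t] is True while t is inside sequence n; shape (N, L).
    time_index = np.arange(x.shape[1])[np.newaxis, :]
    mask = time_index < lengths[:, np.newaxis]
    # Add a trailing axis so the mask broadcasts over the d features.
    return (mask[:, :, np.newaxis] * x).sum(axis=1)


x_data = [[[1, 1], [1, 1], [0, 0], [9, 9]],
          [[2, 2], [2, 2], [2, 2], [0, 0]],
          [[3, 3], [3, 3], [3, 3], [3, 3]]]
l_data = [2, 3, 4]

y = masked_sequence_sum(x_data, l_data)
print(y)
```

Here the [9 9] entry in the first sequence is masked out because it lies past that sequence's length of 2.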

How to search in a sorted 2d matrix

I have a list of x, y coordinates:
import numpy as np
a=np.array([[2,1],[1,3],[1,5],[2,3],[3,5]])
that I've sorted with
a=np.sort(a,axis=0)
print(a)
>[[1 3] [1 5] [2 1] [2 3] [3 5]]
I'd like to perform a search:
a.searchsorted([2,1])
>ValueError: object too deep for desired array
Any ideas how to do that?
This may work, if I understood what you are asking:
>>> a = [[1, 3], [1, 5], [2, 1], [2, 3], [3, 5]]
>>> [2, 1] in a
True
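The membership test above scans the whole list. If the position of the row is also needed, a binary search over lexicographically sorted rows is one option. The sketch below uses only the standard library and is an assumption about what the question is after; the helper name find_row is hypothetical. Note that np.sort(a, axis=0) sorts each column independently, so the sketch sorts whole rows as tuples instead.

```python
from bisect import bisect_left

points = [[2, 1], [1, 3], [1, 5], [2, 3], [3, 5]]

# Tuples compare lexicographically, so this sorts whole rows.
sorted_rows = sorted(tuple(p) for p in points)


def find_row(rows, target):
    """Return the index of target in the sorted rows, or -1 if absent."""
    t = tuple(target)
    i = bisect_left(rows, t)
    if i < len(rows) and rows[i] == t:
        return i
    return -1


position = find_row(sorted_rows, [2, 1])
missing = find_row(sorted_rows, [2, 2])
print(sorted_rows, position, missing)
```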
