Indexing with ragged tensors in tensorflow.js

I am trying to index a batch in tensorflow with a ragged tensor.
X = tf.constant([[[1,2,3], [4,5,6], [7,8,9]],
[[9,8,7], [6,5,4], [3,2,1]]])
The first dimension is the batch, the second is the sequence length.
Using gather_nd I can select the individual rows and columns.
tf.gather_nd(X, [[[0, 1], [0, 2], [0, 0]], [[1, 0], [1, 1], [1, 2]]])
But I have to use a ragged tensor as the input for the selection.
For example.
tf.gather_nd(X, [[[0, 1], [0, 2]], [[1, 0], [1, 1], [1, 2]]])
This of course does not work.
Is there a way to make the above code work?

You need to explicitly create a RaggedTensor object as tensorflow does not recognize them automatically:
>>> tf.gather_nd(X, tf.ragged.constant([[[0, 1], [0, 2]], [[1, 0], [1, 1], [1, 2]]], inner_shape=(2,)))
<tf.RaggedTensor [[[4, 5, 6], [7, 8, 9]], [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]>
However, if your ultimate goal is to filter out specific batches, tf.boolean_mask might be more straightforward.
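To make the result of the ragged gather above easier to check by hand, here is a minimal plain-Python sketch of the same selection. It only illustrates what is being computed; it is not how TensorFlow implements tf.gather_nd, and the names ragged_indices and gathered are made up for this example.

```python
# Same data as the question: batch of 2, sequence length 3, 3 features.
X = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]

# Ragged index list: the first row selects 2 entries, the second selects 3.
ragged_indices = [[[0, 1], [0, 2]],
                  [[1, 0], [1, 1], [1, 2]]]

# Each [b, s] pair picks sequence element s from batch b.
gathered = [[X[b][s] for b, s in row] for row in ragged_indices]
print(gathered)
# [[[4, 5, 6], [7, 8, 9]], [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
```

The nested lists have the same contents as the tf.RaggedTensor shown in the answer.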

Related

How to create a multidimensional matrix in Python

I am a beginner in Python.
I want to create the matrix below, how should I create it?
[
[0,1], [0,2], [0,3],
[1,1], [1,2], [1,3],
[2,1], [2,2], [2,3],
[3,1], [3,2], [3,3]
]
I looked into numpy, but maybe I'm not searching the right way, because I didn't find any good approach.

This is almost what numpy.ndindex is doing, except you want one of the values to start with 1. You can fix it by converting to array and adding 1:
np.array(list(np.ndindex(4,3)))+[0,1]
Output:
array([[0, 1],
[0, 2],
[0, 3],
[1, 1],
[1, 2],
[1, 3],
[2, 1],
[2, 2],
[2, 3],
[3, 1],
[3, 2],
[3, 3]])

A rather simple list comprehension will generate this data structure. Numpy not required.
[[x, y] for x in range(4) for y in range(1, 4)]
Result:
[[0, 1], [0, 2], [0, 3],
[1, 1], [1, 2], [1, 3],
[2, 1], [2, 2], [2, 3],
[3, 1], [3, 2], [3, 3]]
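As a possible variation on the list comprehension above, itertools.product from the standard library generates the same pairs without writing the nested loops by hand. This is only a sketch; the variable name matrix is an assumption for illustration.

```python
from itertools import product

# Cartesian product of the first coordinates 0..3 and second coordinates 1..3.
matrix = [list(pair) for pair in product(range(4), range(1, 4))]
print(matrix)
```

product varies the last range fastest, so the rows come out in the same order as in the question.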

Argsort issue in multi-dimensional array in Python

I have arrays I1 (shape=(1, 10, 2)) and I2 (shape=(2,)). I am trying to sort using argsort() but I am getting an error for I2.
import numpy as np
I1=np.array([[[0, 1],
[0, 3],
[1, 2],
[1, 4],
[2, 5],
[3, 4],
[3, 6],
[4, 7],
[5, 4],
[6, 7]]])
I2=np.array([[[0, 1],
[0, 3],
[1, 2],
[1, 4],
[2, 5],
[3, 4],
[3, 6],
[4, 7],
[5, 4],
[6, 7]],
[[0, 1],
[0, 3],
[1, 2],
[1, 4],
[2, 5],
[3, 4],
[3, 6],
[4, 7]]])
order1 = I1[0,:, 1].argsort()
print("order1 =",[order1])
order2 = I2[0,:, 1].argsort()
print("order2 =",[order2])
The error is
in <module>
order2 = I2[0,:, 1].argsort()
IndexError: too many indices for array: array is 1-dimensional, but 3 were indexed

If you print I2, you'll quickly see what is causing the problem:
array([list([[0, 1], [0, 3], [1, 2], [1, 4], [2, 5], [3, 4], [3, 6], [4, 7], [5, 4], [6, 7]]),
list([[0, 1], [0, 3], [1, 2], [1, 4], [2, 5], [3, 4], [3, 6], [4, 7]])],
dtype=object)
I2 is not a three dimensional array, but a one dimensional array of lists (each individual list consists of a list of 2-element lists).
In fact, when you create I2, with a recent NumPy, you should also see a DeprecationWarning:
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
I2=np.array([[[0, 1],
which essentially identifies the same problem. Indeed, it states "from ragged nested sequences". Ragged is key here: your outer input list contains two lists that are not of equal length. As a result, the three-dimensional nested list does not have "rectangular" (box-shaped) dimensions; it is a collection of lists.
If you planned to use your data this way with NumPy, you can't, really: NumPy is meant for (fast) operations on regular arrays, not on ragged arrays.
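One possible workaround, sketched here under the assumption that each block only needs to be sorted on its own, is to keep the ragged data as a plain Python list and convert each block to a regular array separately. The name orders is made up for this example, and kind="stable" is chosen so that ties keep their original order.

```python
import numpy as np

# Ragged data kept as a plain list: the blocks have 10 and 8 rows.
I2 = [
    [[0, 1], [0, 3], [1, 2], [1, 4], [2, 5],
     [3, 4], [3, 6], [4, 7], [5, 4], [6, 7]],
    [[0, 1], [0, 3], [1, 2], [1, 4], [2, 5],
     [3, 4], [3, 6], [4, 7]],
]

# Each block on its own is rectangular, so it converts to a 2-D array
# and can be argsorted on its second column.
orders = [np.asarray(block)[:, 1].argsort(kind="stable") for block in I2]
for order in orders:
    print(order)
```

Each entry of orders has a different length, which is why the result stays a list rather than a single array.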

How can I merge a chain of intersecting 2-D lists (list of lists) into a single 2-D list of lists

I have had to edit this question for the third time to present the data as simply as possible. I suppose the last version was too complex for the pattern to be recognizable. Here is what I have now, which looks more like the first version, for which @Andrej provided a solution that I was unable to adapt to my scenario. I suppose that comes down to his conditions for merging. The original data is 3-D and is given below.
original = [
[[0,1],[2,3],[4,5]],
[[0,1],[4,5]],
[[2,3]],
[[6,7],[8,9],[10,11]],
[[8,9],[6,7]],
[[6,7],[10,11]],
[[16,17],[12,13],[14,15]],
[[12,13]],
[[14,15],[16,17],[18,19]],
[[12,13],[16,17],[20,21]]
]
From the given data, I want to obtain another 3-D merged data:
merged = [
[[0,1],[2,3],[4,5]],
[[6,7],[8,9],[10,11]],
[[12,13],[14,15],[16,17],[18,19],[20,21]]
]
I need to loop over all the 2-D lists and merge those that share at least one 1-D inner list, removing any duplicate 1-D lists. In other words, I want to find the 2-D lists that intersect and merge all of them. In the original data, the first 2-D list intersects the second through [0,1] and [4,5], while the third intersects the first via [2,3]. Together, the three 2-D lists form a connected chain through their shared 1-D lists, and this chain should be merged into the union of all three, i.e. [[0,1],[2,3],[4,5]]. I have tried the sample code below:
import numpy as np
original = [
[[0, 1], [2, 3], [4, 5]],
[[0, 1], [4, 5]],
[[2, 3]],
[[6, 7], [8, 9], [10, 11]],
[[8, 9], [6, 7]],
[[6, 7], [10, 11]],
[[16, 17], [12, 13], [14, 15]],
[[12, 13]],
[[14, 15], [16, 17], [18, 19]],
[[12, 13], [16, 17], [20, 21]]
]
tmp = {}
for subl in original:
    for a, b in subl:
        tmp.setdefault(a, set()).add(b)
merged = []
for k, v in tmp.items():
    merged.append([[k, i] for i in v])
print(merged)
But this does not give the expected merged data shown above. Instead it gives: [[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4]], [[1, 0], [1, 1], [1, 2]], [[2, 0], [2, 1], [2, 2], [2, 3], [2, 4]]]. Any help would be hugely appreciated, please.

Try:
original = [
[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4]],
[[0, 1], [0, 2], [0, 3], [0, 4], [0, 5]],
[[0, 2], [0, 3], [0, 5]],
[[1, 0], [1, 2], [1, 4]],
[[1, 2], [1, 3], [1, 4]],
[[1, 0], [1, 2], [1, 3], [1, 4]],
[[1, 0]],
[[1, 0], [1, 3]],
[[2, 0], [2, 1], [2, 2], [2, 3]],
[[2, 1], [2, 2], [2, 3], [2, 4]],
[[2, 2], [2, 3], [2, 4]],
[[2, 3], [2, 4]],
[[2, 4]],
]
tmp = {}
for subl in original:
    for a, b in subl:
        tmp.setdefault(a, set()).add(b)
out = []
for k, v in tmp.items():
    out.append([[k, i] for i in v])
print(out)
Prints:
[
[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5]],
[[1, 0], [1, 2], [1, 3], [1, 4]],
[[2, 0], [2, 1], [2, 2], [2, 3], [2, 4]],
]
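The answer above groups pairs by their first element, which appears to target an earlier version of the data. For the chain merging described in the latest edit of the question (merge every 2-D list that shares at least one inner pair with another, transitively), a connected-components approach is one option. The following is a minimal sketch using a small union-find; the function name merge_chains and its helper are hypothetical, not taken from the original answer.

```python
def merge_chains(groups):
    """Merge 2-D lists that share at least one inner pair, transitively."""
    parent = list(range(len(groups)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # inner pair -> index of the first group containing it
    for idx, group in enumerate(groups):
        for pair in group:
            key = tuple(pair)
            if key in first_seen:
                ra, rb = find(first_seen[key]), find(idx)
                if ra != rb:
                    # Attach the larger root to the smaller one.
                    parent[max(ra, rb)] = min(ra, rb)
            else:
                first_seen[key] = idx

    # Collect the union of pairs for every connected chain.
    chains = {}
    for idx, group in enumerate(groups):
        chains.setdefault(find(idx), set()).update(tuple(p) for p in group)
    return [sorted(list(p) for p in chains[root]) for root in sorted(chains)]


original = [
    [[0, 1], [2, 3], [4, 5]],
    [[0, 1], [4, 5]],
    [[2, 3]],
    [[6, 7], [8, 9], [10, 11]],
    [[8, 9], [6, 7]],
    [[6, 7], [10, 11]],
    [[16, 17], [12, 13], [14, 15]],
    [[12, 13]],
    [[14, 15], [16, 17], [18, 19]],
    [[12, 13], [16, 17], [20, 21]],
]
merged = merge_chains(original)
print(merged)
```

Inner pairs are converted to tuples so they can be used as dictionary keys and set members. Each merged chain is sorted, so the order of pairs inside a chain does not depend on the input order.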

Creating a 2D matrix of vectors from a n-d array

I have a matrix represented by a np array. Here is an example of what I am talking about. You can see it has 3 "vectors" inside of it
x = np.array([[1, 1], [1,2],[2,3]])
[1, 1], [1,2] and [2,3]
The goal is to turn this into a matrix where these vectors are repeated. So the 0th row of said matrix should simply be [1,1] repeated n times. And the 1st row should be [1,2] repeated n times. I believe this would look somewhat like for n=4
xresult = np.array([[[1, 1], [1, 1], [1, 1], [1, 1]],
[[1, 2], [1, 2], [1, 2], [1, 2]],
[[2, 3], [2, 3], [2, 3], [2, 3]]])
And therefore
xresult[0,0] = [1,1]
xresult[0,1] = [1,1]
xresult[0,2] = [1,1]
xresult[1,2] = [1,2]
The goal is of course to do this without loops if possible as that is an obvious but perhaps less elegant/performant solution.
Here are some attempts that do not work
np.tile([x],(2,1))
>>>array([[[1, 1],
[1, 2],
[2, 3],
[1, 1],
[1, 2],
[2, 3]]])
np.tile([x],(2,))
>>>array([[[1, 1, 1, 1],
[1, 2, 1, 2],
[2, 3, 2, 3]]])
np.append(x,x,axis=0)
>>>array([[1, 1],
[1, 2],
[2, 3],
[1, 1],
[1, 2],
[2, 3]])
np.append([x],[x],axis=1)
>>>array([[[1, 1],
[1, 2],
[2, 3],
[1, 1],
[1, 2],
[2, 3]]])
np.array([[x],[x]])
>>>array([[[[1, 1],
[1, 2],
[2, 3]]],
[[[1, 1],
[1, 2],
[2, 3]]]])
(Some of these were just with n=2 as a goal)
It is worth noting that the ultimate end goal is to take x and y (a similarly crafted array of vectors of the same dimension, but not necessarily the same number of vectors):
y = np.array([[99,11], [23,44],[33,44], [2, 1], [9, 9]])
Then run the procedure on x so that the number of columns in the result equals the number of vectors in y, and run a similar procedure on y that repeats it row-wise.
y after this transform would have the following
yresult[0,0] = [99,11]
yresult[1,0] = [23,44]
yresult[2,0] = [33,44]
yresult[2,1] = [33,44]
This way I can subtract the two matrices. The goal is to create a matrix where x's vector index is the row, y's vector index is the column, and the element is the difference between these two vectors.
ultimateResult[0,1]=[1,1]-[23,44]=[-22,-43]
Perhaps there is a better way to get this.
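One way to get the pairwise differences without loops, and without materializing the repeated matrices at all, is NumPy broadcasting. The sketch below assumes the goal stated above, where the row index refers to x and the column index refers to y. The names xresult and diff are chosen for illustration.

```python
import numpy as np

x = np.array([[1, 1], [1, 2], [2, 3]])
y = np.array([[99, 11], [23, 44], [33, 44], [2, 1], [9, 9]])

# Explicit repetition, if the repeated matrix itself is needed:
# row i holds x[i] repeated n times, giving shape (3, n, 2).
n = 4
xresult = np.repeat(x[:, None, :], n, axis=1)

# Broadcasting: insert a length-1 axis in each operand so that
# diff[i, j] = x[i] - y[j], giving shape (3, 5, 2).
diff = x[:, None, :] - y[None, :, :]

print(xresult[1, 2])  # [1 2]
print(diff[0, 1])     # [-22 -43]
```

The subtraction never builds the repeated copies of x and y; NumPy stretches the length-1 axes virtually while computing the result.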

Problem with tf.SparseTensor and tf.while_loop

I face a problem when I try to change the shape of tf.SparseTensor inside a tf.while_loop. Let's say I have this sparse tensor:
indices = np.array([[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5],
[1, 0], [1, 1], [1, 3], [1, 4], [1, 5],
[2, 1], [2, 2], [2, 3], [2, 4],
[3, 0], [3, 1], [3, 2], [3, 3], [3, 4], [3, 5],
[4, 0], [4, 2], [4, 3], [4, 4], [4, 5]], dtype=np.int64)
values = np.array([7, 6, 7, 4, 5, 4,
6, 7, 4, 3, 4,
3, 3, 1, 1,
1, 2, 2, 3, 3, 4,
1, 1, 2, 3, 3], dtype=np.float64)
dense_shape = np.array([5, 6], dtype=np.int64)
tRatings = tf.SparseTensor(indices, values, dense_shape)
So, I want to take a slice of the first 3 rows. I know that I can use tf.sparse_slice for that purpose, but this is just an example. In my real code, I gather multiple rows from the sparse tensor that are not consecutive. The code I wrote is this:
subTensor = tf.sparse_slice(tRatings, [0, 0], [1, 6])
i = tf.constant(1)
def condition(i, sub):
    return tf.less(i, 3)
def body(i, sub):
    tempUser = tf.sparse_slice(tRatings, [i, 0], [1, 6])
    sub = tf.sparse_concat(axis = 0, sp_inputs = [sub, tempUser])
    return [tf.add(i, 1), sub]
subTensor = tf.while_loop(condition, body, [i, subTensor], shape_invariants=[i.get_shape(), tf.TensorShape([2])])[1]
which doesn't work for some reason when I run it. I get this:
ValueError: Dimensions 1 and 2 are not compatible
According to https://www.tensorflow.org/api_docs/python/tf/while_loop it says that:
The shape_invariants argument allows the caller to specify a less specific shape invariant for each loop variable, which is needed if the shape varies between iterations. The tf.Tensor.set_shape function may also be used in the body function to indicate that the output loop variable has a particular shape. The shape invariant for SparseTensor and IndexedSlices are treated specially as follows:
a) If a loop variable is a SparseTensor, the shape invariant must be TensorShape([r]) where r is the rank of the dense tensor represented by the sparse tensor. It means the shapes of the three tensors of the SparseTensor are ([None], [None, r], [r]). NOTE: The shape invariant here is the shape of the SparseTensor.dense_shape property. It must be the shape of a vector.
What am I missing here?

There are two problems.
First, there is a problem in the TensorFlow code itself. Change this line to:
var.indices.set_shape(tensor_shape.TensorShape([None, shape[0]]))
Second, there is a small problem in your code. You have to use the int64 type for the index variable:
i = tf.constant(1, dtype=tf.int64)
