Convert elements in a tensorflow array using a dictionary - python

I have a TensorFlow tensor and I want to convert each of its elements to another value using a dictionary.
Here is my array:
elems = tf.convert_to_tensor(np.array([1, 2, 3, 4, 5, 6]))
and here is the dictionary:
d = {1:1,2:5,3:7,4:5,5:8,6:2}
After the conversion, the resulting array should be
tf.convert_to_tensor(np.array([1, 5, 7, 5, 8, 2]))
In order to do that, I tried to use tf.map_fn as follows:
import tensorflow as tf
import numpy as np
d = {1:1,2:5,3:7,4:5,5:8,6:2}
elems = tf.convert_to_tensor(np.array([1, 2, 3, 4, 5, 6]))
res = tf.map_fn(lambda x: d[x], elems)
sess=tf.Session()
print(sess.run(res))
When I run the code above, I get the following error:
squares = tf.map_fn(lambda x: d[x], elems) KeyError: <tf.Tensor 'map/while/TensorArrayReadV3:0' shape=() dtype=int64>
What would be the correct way to do that? I was basically trying to follow the usage from here.
P.S. My arrays are actually 3D; I just used 1D as an example, since the code fails in that case as well.

You should use tensorflow.contrib.lookup.HashTable:
import tensorflow as tf
import numpy as np
d = {1:1,2:5,3:7,4:5,5:8,6:2}
keys = list(d.keys())
values = [d[k] for k in keys]
table = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.KeyValueTensorInitializer(keys, values, key_dtype=tf.int64, value_dtype=tf.int64), -1
)
elems = tf.convert_to_tensor(np.array([1, 2, 3, 4, 5, 6]), dtype=tf.int64)
res = tf.map_fn(lambda x: table.lookup(x), elems)
sess=tf.Session()
sess.run(table.init)
print(sess.run(res))
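Note that tf.contrib was removed in TensorFlow 2.x. There the equivalent is tf.lookup.StaticHashTable, and the lookup can be applied to the whole tensor at once, no map_fn needed. A minimal sketch, assuming TF 2.x eager execution:
import numpy as np
import tensorflow as tf
d = {1: 1, 2: 5, 3: 7, 4: 5, 5: 8, 6: 2}
keys = tf.constant(list(d.keys()), dtype=tf.int64)
values = tf.constant(list(d.values()), dtype=tf.int64)
# default_value is returned for keys missing from the table
table = tf.lookup.StaticHashTable(
    tf.lookup.KeyValueTensorInitializer(keys, values), default_value=-1)
elems = tf.convert_to_tensor(np.array([1, 2, 3, 4, 5, 6]), dtype=tf.int64)
print(table.lookup(elems).numpy())  # [1 5 7 5 8 2]
Since lookup works elementwise on tensors of any shape, this also covers the 3D case mentioned in the question.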

Related

How to Do Numpy Like Index Selection in Tensorflow?

Let us use the following code:
#!/usr/bin/env python3
# encoding: utf-8
import numpy as np, tensorflow as tf # tf.__version__==2.7.0
sample_array=np.random.uniform(size=(2**10, 120, 20))
to_select=[5, 6, 9, 4]
sample_tensor=tf.convert_to_tensor(value=sample_array)
sample_array[:, :, to_select] # Works okay
sample_tensor[:, :, to_select] # TypeError. How to do this in tensor?
tf.convert_to_tensor(value=sample_tensor.numpy()[:, :, to_select]) # Ugly workaround
Basically, how to get those elements as a tensor of appropriate dimension, just like numpy? I tried tf.slice and tf.gather, but cannot figure out the proper arguments to pass.
I can convert it to NumPy and back, but I am not sure whether that sacrifices efficiency, or whether it will work as part of a custom training loop.
The simplest solution would be to use tf.concat, although it is probably not so efficient:
import numpy as np
import tensorflow as tf
sample_array = np.random.uniform(size=(2, 2, 20))
to_select = [5, 6, 9, 4]
sample_tensor = tf.convert_to_tensor(value = sample_array)
numpy_way = sample_array[:, :, to_select]
tf_way = tf.concat([tf.expand_dims(sample_array[:, :, to_select[i]], axis=-1) for i in tf.range(len(to_select))], axis=-1)
#tf_way = tf.concat([tf.expand_dims(sample_array[:, :, s], axis=-1) for s in to_select], axis=-1)
print(numpy_way)
print(tf_way)
[[[0.81208086 0.03873406 0.89959868 0.97896671]
  [0.57569184 0.33659472 0.32566287 0.58383079]]

 [[0.59984846 0.43405048 0.42366314 0.25505199]
  [0.16180442 0.5903358 0.21302399 0.86569914]]]
tf.Tensor(
[[[0.81208086 0.03873406 0.89959868 0.97896671]
  [0.57569184 0.33659472 0.32566287 0.58383079]]

 [[0.59984846 0.43405048 0.42366314 0.25505199]
  [0.16180442 0.5903358 0.21302399 0.86569914]]], shape=(2, 2, 4), dtype=float64)
A more complicated, but efficient solution would involve using tf.meshgrid and tf.gather_nd. Check this post or this post and finally this. Here is an example based on your question:
to_select = tf.expand_dims(tf.constant([5, 6, 9, 4]), axis=0)
to_select_shape = tf.shape(to_select)
sample_tensor_shape = tf.shape(sample_tensor)
to_select = tf.expand_dims(tf.reshape(tf.tile(to_select, [1, to_select_shape[1]]), (sample_tensor_shape[0], sample_tensor_shape[0] * to_select_shape[1])), axis=-1)
ij = tf.stack(tf.meshgrid(
    tf.range(sample_tensor_shape[0], dtype=tf.int32),
    tf.range(sample_tensor_shape[1], dtype=tf.int32),
    indexing='ij'), axis=-1)
gather_indices = tf.concat([tf.repeat(ij, repeats=to_select_shape[1], axis=1), to_select], axis=-1)
gather_indices = tf.reshape(gather_indices, (to_select_shape[1], to_select_shape[1], 3))
result = tf.gather_nd(sample_tensor, gather_indices, batch_dims=0)
result = tf.reshape(result, (result.shape[0]//2, result.shape[0]//2, result.shape[1]))
tf.Tensor(
[[[0.81208086 0.03873406 0.89959868 0.97896671]
  [0.57569184 0.33659472 0.32566287 0.58383079]]

 [[0.59984846 0.43405048 0.42366314 0.25505199]
  [0.16180442 0.5903358 0.21302399 0.86569914]]], shape=(2, 2, 4), dtype=float64)
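On the question's note about tf.gather: it does handle this case directly once you pass the axis argument, which is probably the simplest and most efficient option. A short sketch, assuming TF 2.x eager execution:
import numpy as np
import tensorflow as tf
sample_array = np.random.uniform(size=(2, 2, 20))
sample_tensor = tf.convert_to_tensor(value=sample_array)
to_select = [5, 6, 9, 4]
# select the given indices along the last axis, like sample_array[:, :, to_select]
gathered = tf.gather(sample_tensor, to_select, axis=-1)  # shape (2, 2, 4)
np.testing.assert_allclose(gathered.numpy(), sample_array[:, :, to_select])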

How to detach a list of PyTorch tensors to an array

I have a list of PyTorch tensors and I want to convert it to a NumPy array, but it raises this error:
'list' object has no attribute 'cpu'
How can I convert it to an array?
import torch
result = []
for i in range(3):
    x = torch.randn((3, 4, 5))
    result.append(x)
a = result.cpu().detach().numpy()
You can stack them and convert to a NumPy array:
import torch
result = [torch.randn((3, 4, 5)) for i in range(3)]
a = torch.stack(result).cpu().detach().numpy()
In this case, a will have the following shape: [3, 3, 4, 5].
If you want to concatenate them into a [3*3, 4, 5] array instead:
a = torch.cat(result).cpu().detach().numpy()
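Equivalently, you can convert each tensor separately and stack on the NumPy side; a sketch, assuming all tensors in the list share a shape:
import numpy as np
import torch
result = [torch.randn((3, 4, 5)) for _ in range(3)]
# detach each tensor from the autograd graph, move it to the CPU,
# convert to NumPy, then stack into one (3, 3, 4, 5) array
a = np.stack([t.detach().cpu().numpy() for t in result])
print(a.shape)  # (3, 3, 4, 5)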

numpy dot on 1D and 2D array

I am trying to understand what happens in the following python code:
import numpy as np
numberList1 = [1,2,3]
numberList2 = [[4,5,6],[7,8,9]]
result = np.dot(numberList2, numberList1)
# Converting iterator to set
resultSet = set(result)
print(resultSet)
Output:
{32, 50}
I can see that it is multiplying each element in numberList1 by the element in the same position in each inner list of numberList2, i.e. 1*4 + 2*5 + 3*6 = 32 and 1*7 + 2*8 + 3*9 = 50.
But, if I change the arrays to:
numberList1 = [1,1,1]
numberList2 = [[2,2,2],[3,3,3]]
Then the output I see is
{9, 6}
Which is the wrong way around...
and, if I change it to:
numberList1 = [1,1,1]
numberList2 = [[2,2,2],[2,2,2]]
Then the output I see is just
{6}
From the documentation:
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
I am not enough of a mathematician to understand quite what this is telling me, or why the order of the outputs sometimes swaps around.
A set is an unordered data type, and it also removes duplicates. np.dot does not return an iterator (as the comment in your code suggests) but an np.ndarray, which is in the order you expect:
import numpy as np
numberList1 = [1, 2, 3]
numberList2 = [[4, 5, 6], [7, 8, 9]]
result = np.dot(numberList2, numberList1)
print(result)        # [32 50]
print(type(result))  # <class 'numpy.ndarray'>

# numberList1 = [1, 1, 1]
# numberList2 = [[2, 2, 2], [3, 3, 3]]
# -> [6 9]
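To unpack the documentation line quoted in the question: for a 2-D a and a 1-D b, np.dot(a, b)[i] is the sum of a[i, j] * b[j] over j. A small sketch verifying this by hand:
import numpy as np
a = np.array([[4, 5, 6], [7, 8, 9]])  # shape (2, 3)
b = np.array([1, 2, 3])               # shape (3,)
# "sum product over the last axis of a and b":
# np.dot(a, b)[i] == sum(a[i, j] * b[j] for j in range(3))
manual = np.array([sum(a[i, j] * b[j] for j in range(3)) for i in range(2)])
print(np.dot(a, b))  # [32 50]
print(manual)        # [32 50]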

zip-like function in TensorFlow? TensorFlow tensor operation

My question is about tensor operations in TensorFlow.
Let's say:
import tensorflow as tf
import numpy as np
a = tf.Variable(np.random.random([10, 3, 3]))
b = tf.Variable(np.random.random([10, 3, 3]))
def some_function(m, n):
    # just as an example
    return tf.add(m, n)
The following works in TensorFlow, but it requires knowing the first dimension in advance. However, it is very likely that the first dimension of the tensor is None:
c = []
for i in range(10):
    c.append(some_function(a[i], b[i]))
c = tf.stack(c)
So I wonder if there is a zip-like function in TensorFlow. Then we could do:
# TypeError: zip argument #1 must support iteration
c = []
for i, j in zip(a, b):
    c.append(some_function(i, j))
c = tf.stack(c)
Maybe we can use some function like tf.map_fn or tf.scan? But I am not sure. Thank you!
Tensor objects are not iterable in graph mode, which explains why your third code sample fails. So, to answer your question, there is no zip-like function in TensorFlow.
You can indeed use tf.map_fn to apply a function to a sequence of tensors. The problem you pose in your example code can be solved in the following fashion:
def some_function(tensor):
    return tf.reduce_sum(tensor)
c = tf.stack([a, b], axis=1)
d = tf.map_fn(some_function, c, dtype=tf.float64)
This yields a tensor d of shape (10,), where element i is the sum of all entries of a[i] and b[i] (dtype=tf.float64 because a and b were created from np.random.random, which returns float64).
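If you want zip-like semantics without the extra stack, tf.map_fn also accepts a tuple of tensors and passes the matching slices to the function together. A minimal sketch, assuming a and b are float64 as above:
import numpy as np
import tensorflow as tf
a = tf.Variable(np.random.random([10, 3, 3]))
b = tf.Variable(np.random.random([10, 3, 3]))
# map_fn iterates over the first axis of both tensors in lockstep,
# calling the function with the pair (a[i], b[i]) for each i
c = tf.map_fn(lambda pair: tf.add(pair[0], pair[1]), (a, b), dtype=tf.float64)
# c has shape (10, 3, 3), one some_function(a[i], b[i]) result per i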
You can use tf.transpose like this:
>>> a = tf.constant([1, 2, 3])
>>> b = tf.constant([4, 5, 6])
>>> tf.transpose([a, b])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
[2, 5],
[3, 6]], dtype=int32)>
For those of you using JavaScript, this is #bachr's answer in tensorflow.js (node):
const a = tf.tensor([1, 3, 5, 7])
const b = tf.tensor([2, 4, 6, 8])
const zip = tf.transpose(tf.stack([a, b]))
zip.print()
// Tensor
// [[1, 2],
// [3, 4],
// [5, 6],
// [7, 8]]

How to get the count of an element in a tensor in TensorFlow?

I want to get the count of an element in a tensor, for example, t = [1, 2, 0, 0, 0, 0] (t is a tensor). On a Python list I could get the count of zeros, 4, by calling t.count(0), but in TensorFlow I can't find any function to do this. How can I get the count of zeros?
There isn't a built-in count method in TensorFlow right now. But you could do it using the existing ops, like so:
def tf_count(t, val):
    elements_equal_to_value = tf.equal(t, val)
    as_ints = tf.cast(elements_equal_to_value, tf.int32)
    count = tf.reduce_sum(as_ints)
    return count
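A quick usage sketch with the tensor from the question (TF 1.x session style, to match the answer):
import tensorflow as tf
t = tf.constant([1, 2, 0, 0, 0, 0])
count = tf_count(t, 0)
with tf.Session() as sess:
    print(sess.run(count))  # 4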
To count just a specific element you can create a boolean mask, convert it to int and sum it up:
import tensorflow as tf
X = tf.constant([6, 3, 3, 3, 0, 1, 3, 6, 7])
res = tf.reduce_sum(tf.cast(tf.equal(X, 3), tf.int32))
with tf.Session() as sess:
    print(sess.run(res))
You can also count every element in the list/tensor using tf.unique_with_counts:
import tensorflow as tf
X = tf.constant([6, 3, 3, 3, 0, 1, 3, 6, 7])
y, idx, cnts = tf.unique_with_counts(X)
with tf.Session() as sess:
    a, _, b = sess.run([y, idx, cnts])
    print(a)
    print(b)
An addition to Slater's answer above. If you want to get the count of all the elements, you can use one_hot and reduce_sum to avoid any looping within Python. For example, the code snippet below returns a vocab, ordered by occurrences within a word_tensor.
def build_vocab(word_tensor, vocab_size):
    unique, idx = tf.unique(word_tensor)
    counts_one_hot = tf.one_hot(
        idx,
        tf.shape(unique)[0],
        dtype=tf.int32
    )
    counts = tf.reduce_sum(counts_one_hot, 0)
    _, indices = tf.nn.top_k(counts, k=vocab_size)
    return tf.gather(unique, indices)
EDIT: After a little experimentation, I discovered it's pretty easy for the one_hot tensor to blow up beyond TF's maximum tensor size. It's likely more efficient (if a little less elegant) to replace the counts call with something like this:
counts = tf.foldl(
    lambda counts, item: counts + tf.one_hot(
        item, tf.shape(unique)[0], dtype=tf.int32),
    idx,
    initializer=tf.zeros_like(unique, dtype=tf.int32),
    back_prop=False
)
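A small usage sketch for build_vocab with hypothetical values (run eagerly in TF 2.x, or via sess.run in TF 1.x):
import tensorflow as tf
# 3 occurs three times, 2 twice, 7 once
word_tensor = tf.constant([2, 2, 3, 3, 3, 7])
vocab = build_vocab(word_tensor, vocab_size=2)
# vocab -> [3, 2]: the two most frequent values, most frequent first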
