I'm currently working on a quaternionic neural network using TensorFlow (I want to use GPUs). TensorFlow doesn't have support for quaternions, but you can represent them as 4x4 real matrices, so it might be possible to build such a neural network in TensorFlow.
Is there a simple way to add a custom operation or to do a custom operation on tensors?
For example, I can write:
output_activation = tf.nn.softmax(tf.matmul(hidden_activation, Weight_to_output))
...and that's pretty cool! All you have to do is add a loss function and then do backpropagation. However, I want to do the same thing but with quaternions, for example:
output_activation = mySigmoid(myFunction(hidden_activation, Weight_to_output))
However, I need to transform the quaternions to and from tensors to optimize the GPU calculation. So I need to create a function that gets some tensors as parameters and returns the transformed tensors.
I've looked at py_func, but it seems that you can't return tensors.
I tried the following, but it failed:
def layerActivation(inputTensor, WeightTensor):
    newTensor = tf.matmul(inputTensor, WeightTensor)
    return newTensor
...and in main():
x = placeholder ...
W_to_hidden = tf.Variable
test = tf.py_func(layerActivation, [x, W_to_hidden], [tf.float32])
with tf.Session() as sess:
    tf.initialize_all_variables().run()
    king_return = sess.run(test, feed_dict={x: qtrain})
Error : Unimplemented: Unsupported object type Tensor
Ideally I could use this output_activation in the standard backprop algorithm of TensorFlow but I don't know if it's possible.
Depending on the functionality required, you might be able to implement your operation as a composition of existing TensorFlow ops, without needing to use tf.py_func().
For example, the following works and will run on a GPU:
def layer_activation(input_tensor, weight_tensor):
    return tf.matmul(input_tensor, weight_tensor)
# ...
x = tf.placeholder(...)
W_to_hidden = tf.Variable(...)
test = layer_activation(x, W_to_hidden)
# ...
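As a rough sketch of what such a composition could look like for quaternions, the snippet below builds the 4x4 real matrix of a quaternion with tf.unstack and tf.stack and applies it with tf.linalg.matvec. The names quaternion_to_matrix and hamilton_product are made up for this illustration, the (a, b, c, d) component order is an assumption, and the code assumes TensorFlow 2.x eager execution rather than the placeholder/session style used above.

```python
import tensorflow as tf

def quaternion_to_matrix(q):
    """Map quaternions of shape [..., 4] = (a, b, c, d) to real matrices [..., 4, 4]."""
    a, b, c, d = tf.unstack(q, axis=-1)
    rows = [
        tf.stack([a, -b, -c, -d], axis=-1),
        tf.stack([b,  a, -d,  c], axis=-1),
        tf.stack([c,  d,  a, -b], axis=-1),
        tf.stack([d, -c,  b,  a], axis=-1),
    ]
    return tf.stack(rows, axis=-2)

def hamilton_product(p, q):
    # Left-multiplication matrix of p applied to q, treated as a 4-vector.
    return tf.linalg.matvec(quaternion_to_matrix(p), q)

i = tf.constant([0.0, 1.0, 0.0, 0.0])
j = tf.constant([0.0, 0.0, 1.0, 0.0])
product = hamilton_product(i, j)  # i * j should give k = (0, 0, 0, 1)
```

Since only built-in ops are involved, a function like this should be able to run on a GPU and take part in backpropagation without a custom kernel.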
The main reason to use tf.py_func() is if your operations cannot be implemented using TensorFlow operations, and you want to inject some Python code (e.g. using NumPy) that works on the actual values of your tensor.
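For illustration, here is a minimal sketch of injecting NumPy code into a graph. It assumes TensorFlow 2.x, where tf.numpy_function plays the role of tf.py_func, and numpy_sinh is a hypothetical helper. The wrapped function receives NumPy arrays holding the actual values and must return NumPy values, which appears to be why returning a tensor from it, as in the question, fails.

```python
import numpy as np
import tensorflow as tf

def numpy_sinh(x):
    # x arrives as a NumPy array holding concrete values, not a symbolic tensor.
    return np.sinh(x).astype(np.float32)

@tf.function
def graph_fn(x):
    # Wrap the Python function so it can be called from inside a graph.
    y = tf.numpy_function(numpy_sinh, [x], tf.float32)
    y.set_shape(x.shape)  # static shape information is lost across the wrapper
    return y

result = graph_fn(tf.constant([0.0, 1.0]))
```

Code wrapped this way runs in the Python interpreter, and gradients cannot be taken through tf.numpy_function, so it is not a substitute for a real op in a trainable layer.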
However, if your mySigmoid() or myFunction() operations cannot be implemented in terms of existing TensorFlow operations, and you want to implement them on GPU, then—as keveman says—you will need to add a new op.
If you want to run your custom operations on GPUs, you have to provide GPU implementation (kernels) in C++. Look at the documentation here for how to extend TensorFlow with custom operations, and especially the section on GPU support.
I'm trying to write a reinforcement learning agent using TensorFlow. I'm wondering whether the states should be tf.Variables or can be NumPy arrays for backpropagation using GradientTape. I'm not sure if the gradients will be correct if my state/action arrays are NumPy arrays instead of TensorFlow tensors; I do know that the loss function returns a tf.Variable, however. Thanks, I'm still a beginner with TensorFlow, so any explanation or suggestions would help a lot.
In a very simplified form (not word for word), my code looks something like:
with tf.GradientTape() as tape:
    # actions/states are both lists of np arrays
    action = model.call(state)
    states.append(state)
    actions.append(action)
    loss = model.loss(states, actions)  # loss returns tf.variable
grads = tape.gradient(loss, model.variables)
model.optimizer.apply_gradients(zip(grads, model.variables))
Hi Noob :) The optimizer.apply_gradients operation will update only model tf.Variables having non-zero gradients (see input argument model.variables).
Reference: https://www.tensorflow.org/api_docs/python/tf/GradientTape
Trainable variables (created by tf.Variable or
tf.compat.v1.get_variable, where trainable=True is default in both
cases) are automatically watched. Tensors can be manually watched by
invoking the watch method on this context manager.
Edit: if you want to call the model to make predictions given a NumPy array, this is possible. According to the documentation, the input of model.call() should be a tensor object. You can get a tensor from your NumPy array like this:
state # numpy array
tf_state = tf.constant(state)
model.call(tf_state)
Of course, instead of creating a new tf.constant for each iteration of the training loop, you can first initialize a (non-trainable) tf.Variable and then just update its value with that of the NumPy array. Something like the following should work:
tf_state = tf.Variable(np.zeros_like(state), dtype=tf.float32, trainable=False)
for _ in range(n_train_iterations):
    state = get_new_numpy_state()
    tf_state.assign(state)
    model.call(tf_state)
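To make the point about watched variables concrete, here is a small sketch assuming TensorFlow 2.x. The one-layer model and the squared-output loss are stand-ins invented for this example, not the asker's agent. The input starts life as a NumPy array, yet the tape still yields gradients for the model's trainable variables, because those are watched automatically.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the asker's model: a single dense layer.
model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

state = np.ones((1, 3), dtype=np.float32)  # NumPy input, e.g. an environment state

with tf.GradientTape() as tape:
    action = model(tf.constant(state))       # forward pass is recorded on the tape
    loss = tf.reduce_sum(tf.square(action))  # toy loss, for illustration only

# Gradients are taken with respect to the variables, not the NumPy input.
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
```

If gradients with respect to the state itself were needed, the state tensor would have to be watched explicitly with tape.watch(), as the quoted documentation describes.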
I'd like to use pre-trained sentence embeddings in my tensorflow graph execution model. The embeddings are available dynamically from a function call, which takes in an array of sentences and outputs an array of sentence embeddings. This function uses a pre-trained pytorch model so has to remain separate from the tensorflow model I'm training:
def get_pretrained_embeddings(sentences):
    return pretrained_pytorch_model.encode(sentences)
My tensorflow model looks like this:
class SentenceModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
    def call(self, sentences):
        embedding_layer = tf.keras.layers.Embedding(
            10_000,
            256,
            embeddings_initializer=tf.keras.initializers.Constant(get_pretrained_embeddings(sentences)),
            trainable=False,
        )
        sentence_text_embedding = tf.keras.Sequential([
            embedding_layer,
            tf.keras.layers.GlobalAveragePooling1D(),
        ])
        return sentence_text_embedding,
But when I try to train this model using
cached_train = train.shuffle(100_000).batch(1024)
model.fit(cached_train)
my embeddings_initializer call gets the error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
I assume this is because tensorflow is trying to compile the graph using symbolic data. How can I get my external function, which relies on the current training data batch, to work with tensorflow's graph training?
Tensorflow compiles models to an execution graph before performing the actual training process. The obvious side-effect that clues us into this is if we have a regular Python print() statement in e.g. our call() method, it will only get executed once as Tensorflow runs through your code to construct the execution graph, which it will later convert to native code.
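The tracing behaviour can be observed directly with a small sketch like the one below, assuming TensorFlow 2.x. Here trace_log and double are names invented for the example, and a list append stands in for the print() call. The Python side effect happens while the graph is being built, not on each later execution with the same input signature.

```python
import tensorflow as tf

trace_log = []

@tf.function
def double(x):
    trace_log.append("traced")  # plain Python side effect: runs only during tracing
    return x * 2

first = double(tf.constant(1.0))
second = double(tf.constant(2.0))  # same signature, so the existing graph is reused
```

After both calls, trace_log holds a single entry even though the function produced two results.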
The other side effect of this is that you cannot use anything that isn't a tensor of some description when training. By 'tensor' here, all of the following can be considered a tensor:
The input value of your call() method (obviously)
A tf.keras.Sequential
A tf.keras.Model/tf.keras.layers.Layer subclass
A SparseTensor
A tf.constant()
....probably more I haven't listed here.
To this end, you would need to convert your PyTorch model to a Tensorflow one to be able to reference it in a subclass of tf.keras.Model/tf.keras.layers.Layer.
As a side note, if you do find you need to iterate a tensor, you should just be able to iterate it on the 1st dimension (i.e. the batch size) like so:
for part in some_tensor:
    pass
If you want to iterate on some other dimension, I recommend doing a tf.unstack(some_tensor, axis=AXIS_NUMBER_HERE) first and iterate over the result thereof.
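A short sketch of the tf.unstack approach, assuming TensorFlow 2.x and a tensor whose size along the chosen axis is known statically. The tensor contents and the name column_sums are only for illustration.

```python
import tensorflow as tf

some_tensor = tf.reshape(tf.range(6), (2, 3))  # [[0, 1, 2], [3, 4, 5]]

# Split along axis 1 into a Python list of three tensors, each of shape (2,).
columns = tf.unstack(some_tensor, axis=1)

# The list can be iterated like any other Python sequence.
column_sums = [int(tf.reduce_sum(col)) for col in columns]
```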
I am sub-classing tensorflow.keras.Model to implement a certain model. Expected behavior:
Training (fitting) time: returns a list of tensors including the final output and auxiliary output;
Inferring (predicting) time: returns a single output tensor.
And the code is:
class SomeModel(tensorflow.keras.Model):
    # ......
    def call(self, x, training=True):
        # ......
        return [aux1, aux2, net] if training else net
This is how I use it:
model = SomeModel(...)
model.compile(...,
              loss=keras.losses.SparseCategoricalCrossentropy(),
              loss_weights=[0.4, 0.4, 1], ...)
# ......
model.fit(data, [labels, labels, labels])
And got:
AssertionError: in converted code:
ipython-input-33-862e679ab098:140 call *
`return [aux1, aux2, net] if training else net`
...\tensorflow_core\python\autograph\operators\control_flow.py:918 if_stmt
So the problem is that the if statement gets converted into the computation graph, which of course fails. The full stack trace is long and unhelpful, so I have not included it here.
So, is there any way to make TensorFlow generate different graph based on training or not?
Which TensorFlow version are you using? In TensorFlow 2.2 you can override what happens inside .fit, .predict and .evaluate (through the model's train_step, predict_step and test_step methods), which would generate different graphs for these methods (I assume) and potentially work for your use case.
The problem with earlier versions is that subclassed models get created by tracing the call method. This means Python conditionals become TensorFlow conditionals and face several limitations during graph creation and execution.
First, both branches (if-else) have to be defined, and for Python collections (e.g. lists), the branches have to have the same structure (e.g. the same number of elements). You can read about the limitations and effects of AutoGraph here and here.
(Also, a conditional may not get evaluated at every run, if the condition is based on a Python variable and not a tensor.)
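The difference between a Python condition and a tensor condition can be sketched as follows, assuming TensorFlow 2.x. The function forward and its arithmetic are placeholders, not the asker's model. When training is passed as a plain Python bool, the branch is resolved while tracing, so each value gets its own graph and the two branches are free to return different structures.

```python
import tensorflow as tf

@tf.function
def forward(x, training):
    net = x * 2.0
    if training:  # Python bool: decided at trace time, one graph per value
        aux = x + 1.0
        return [aux, net]
    return net

x = tf.constant(3.0)
train_out = forward(x, True)   # list of two tensors
eval_out = forward(x, False)   # single tensor
```

If training were a tf.Tensor instead, the conditional would become part of the graph, and the same-structure restriction described above would apply.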
I'm looking to do SVD for a custom optimizer in Keras (specifically, I want to port the Shampoo optimizer to Keras).
In Tensorflow, I would use tensorflow.python.ops.linalg_ops.svd(), however, there is no function like this in keras.backend.
Can SVD be carried out in a purely Keras setting, or can I somehow use the Tensorflow function directly (and if so, how)?
EDIT: Just for future reference, there actually exists a wrapper function allowing the direct use of a native tf optimizer in Keras:
import keras as ks
from tensorflow.contrib.opt import AdamWOptimizer
tfopt = AdamWOptimizer()
ksopt = ks.optimizers.TFOptimizer(tfopt)
Unfortunately though, it does not seem to work with the Shampoo optimizer specifically.
If you are using Keras with a TensorFlow backend, then the Keras backend is TensorFlow.
This means that when you call a method from keras backend, it actually calls a method of tensorflow.
Therefore you could use both keras backend operations and tensorflow together and interchangeably.
For example, in the given code:
tensor = ...
m = K.mean(tensor)
...
I could change the line K.mean(tensor) to tf.reduce_mean(tensor):
tensor = ...
m = tf.reduce_mean(tensor)
...
So you can just use the tensorflow SVD operation as you would use it if it was a function of keras backend :)
For example if you would like to have
tensor = ...
res = K.some_submodule.svd(tensor)
...
Then you can instead just do
tensor = ...
res = tensorflow.python.ops.linalg_ops.svd(tensor)
...
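As a concrete sketch, the same operation is also exposed under the public name tf.linalg.svd, which is probably preferable to importing from tensorflow.python. The example assumes TensorFlow 2.x and uses a small diagonal matrix chosen only so the result is easy to check. Note the return order (s, u, v), which differs from NumPy's (u, s, vh).

```python
import tensorflow as tf

matrix = tf.constant([[3.0, 0.0],
                      [0.0, 4.0]])

# Singular values come first, then the left and right singular vectors.
s, u, v = tf.linalg.svd(matrix)

# Rebuild the input as u * diag(s) * v^T to check the decomposition.
reconstructed = u @ tf.linalg.diag(s) @ tf.transpose(v)
```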
I just recently started playing around with Keras and got into making custom layers. However, I am rather confused by the many different types of layers with slightly different names but with the same functionality.
For example, there are 3 different forms of the concatenate function from https://keras.io/layers/merge/ and https://www.tensorflow.org/api_docs/python/tf/keras/backend/concatenate
keras.layers.Concatenate(axis=-1)
keras.layers.concatenate(inputs, axis=-1)
tf.keras.backend.concatenate()
I know the 2nd one is used for functional API but what is the difference between the 3? The documentation seems a bit unclear on this.
Also, for the 3rd one, I have seen code that does the following. Why must there be the ._keras_shape line after the concatenation?
# Concatenate the summed atom and bond features
atoms_bonds_features = K.concatenate([atoms, summed_bond_features], axis=-1)
# Compute fingerprint
atoms_bonds_features._keras_shape = (None, max_atoms, num_atom_features + num_bond_features)
Lastly, under keras.layers, there always seems to be 2 duplicates. For example, Add() and add(), and so on.
First, the backend: tf.keras.backend.concatenate()
Backend functions are supposed to be used "inside" layers. You'd only use this in Lambda layers, custom layers, custom loss functions, custom metrics, etc.
It works directly on "tensors".
It's not the right choice unless you're going deep on customizing. (And it was a bad choice in your example code; see details at the end.)
If you dive deep into keras code, you will notice that the Concatenate layer uses this function internally:
import keras.backend as K
class Concatenate(_Merge):
    #blablabla
    def _merge_function(self, inputs):
        return K.concatenate(inputs, axis=self.axis)
    #blablabla
Then, the Layer: keras.layers.Concatenate(axis=-1)
As with any other Keras layer, you instantiate it and call it on tensors.
Pretty straightforward:
#in a functional API model:
inputTensor1 = Input(shape) #or some tensor coming out of any other layer
inputTensor2 = Input(shape2) #or some tensor coming out of any other layer
#first parentheses are creating an instance of the layer
#second parentheses are "calling" the layer on the input tensors
outputTensor = keras.layers.Concatenate(axis=someAxis)([inputTensor1, inputTensor2])
This is not suited for sequential models, unless the previous layer outputs a list (this is possible but not common).
Finally, the concatenate function from the layers module: keras.layers.concatenate(inputs, axis=-1)
This is not a layer. This is a function that will return the tensor produced by an internal Concatenate layer.
The code is simple:
def concatenate(inputs, axis=-1, **kwargs):
    #blablabla
    return Concatenate(axis=axis, **kwargs)(inputs)
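The relationship between the layer class, the lowercase function and a raw tensor op can be checked with a sketch like this one. It assumes TensorFlow 2.x with tf.keras and uses tf.concat as a stand-in for the backend function. The tensors a and b are arbitrary example inputs.

```python
import tensorflow as tf

a = tf.ones((2, 3))
b = tf.zeros((2, 5))

# Layer class: create an instance first, then call it on a list of tensors.
via_layer = tf.keras.layers.Concatenate(axis=-1)([a, b])

# Lowercase function: builds a Concatenate layer internally and calls it.
via_function = tf.keras.layers.concatenate([a, b], axis=-1)

# Raw tensor op, the kind of call a backend function boils down to.
via_tf = tf.concat([a, b], axis=-1)
```

All three produce the same values here. The difference lies in the bookkeeping Keras attaches when a proper layer is used inside a model.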
Older functions
In Keras 1, people had functions that were meant to receive "layers" as input and return an output "layer". Their names were related to the merge word.
But since Keras 2 doesn't mention or document these, I'd probably avoid using them, and if I found old code, I'd probably update it to proper Keras 2 code.
Why the _keras_shape word?
This backend function was not supposed to be used in high-level code. The coder should have used a Concatenate layer.
atoms_bonds_features = Concatenate(axis=-1)([atoms, summed_bond_features])
#just this line is perfect
Keras layers add the _keras_shape property to all their output tensors, and Keras uses this property for inferring the shapes of the entire model.
If you use any backend function "outside" a layer or loss/metric, your output tensor will lack this property, and an error will appear saying that _keras_shape doesn't exist.
The coder is creating a bad workaround by adding the property manually, when it should have been added by a proper Keras layer. (This may work now, but after a Keras update this code may break while proper code will keep working.)
Keras historically supports two different interfaces for its layers, the new functional one and the old one that requires model.add() calls, hence the two different functions.
As for TF: its concatenate() function does not do everything that Keras requires, hence the extra assignment to set ._keras_shape correctly, so that Keras finds the value it expects there.