I am trying to use a third-party library (neural renderer) in my code alongside Tensorflow. This library has a function which I would like to run between Tensorflow executions. However, whenever I feed a placeholder into the function, it fails with:
AttributeError: 'Tensor' object has no attribute 'transpose'
How do I combine a Tensorflow session and a non-Tensorflow function? For example, how do I do something like this:
input = not_tensorflow(foo, bar)
op1 = tf.assign(var, input)
sess.run(op1, feed_dict={foo:stuff})
I thought I would make foo a simple Placeholder, but apparently that doesn't work.
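One route worth trying is tf.py_func, which defers the Python call to session-run time, so the function receives the NumPy array fed for the placeholder rather than the symbolic tensor (which is why .transpose fails). A minimal sketch, assuming TF 1.x; not_tensorflow, bar, var, stuff, and sess stand in for the objects from the snippet above:

import tensorflow as tf  # TF 1.x assumed

foo = tf.placeholder(tf.float32, shape=[None, 3])  # hypothetical shape

# tf.py_func calls the plain-Python function at run time, handing it the
# NumPy value fed for `foo` instead of a symbolic Tensor
result = tf.py_func(lambda f: not_tensorflow(f, bar), [foo], tf.float32)
result.set_shape(var.get_shape())  # py_func outputs have no static shape

op1 = tf.assign(var, result)
sess.run(op1, feed_dict={foo: stuff})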
I've been following Apple's coremltools docs for converting PyTorch segmentation models to CoreML.
While it works fine when loading a remote PyTorch model, I have yet to figure out a working Python script to perform conversions with local/already-downloaded PyTorch models.
The following piece of code throws a TypeError: 'dict' object is not callable
# This works fine:
# model = torch.hub.load('pytorch/vision:v0.6.0', 'deeplabv3_resnet101', pretrained=True).eval()
model = torch.load('local_model_file.pth')

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)
with torch.no_grad():
    output = model(input_batch)['out'][0]  # error here
torch_predictions = output.argmax(0)
There is an SO answer that offers a solution by initialising the model class and loading the state_dict, but I wonder what the concrete solution is when we don't have access to the PyTorch model class?
In your code, model is a state dict: a dictionary from parameter names to parameter tensor values. As the linked answer states, the right way to load a state dict is by (a) creating the model object that the state dict belongs to and then (b) calling nn.Module.load_state_dict to load it. Doing (a) requires access to the model's class definition. If you don't have that access, then unfortunately I don't see any reliable way to load the state dict.
You might be able to guess what the class's __init__ looks like by looking at the parameter names in the state dict (e.g., 'module.stage1.rebnconvin.conv_s1.weight' looks like a convolution). However, even if the guess is correct and the state dict can be loaded, you still need to define the forward method, because the state dict only stores the parameters.
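For completeness, here is the pattern the linked answer describes, as a minimal sketch: it assumes the local file really is a state dict for the same deeplabv3_resnet101 architecture used in the question's torch.hub line.

import torch
from torchvision.models.segmentation import deeplabv3_resnet101

# (a) build an instance of the class the state dict belongs to
# (assumes the checkpoint matches this architecture)
model = deeplabv3_resnet101(pretrained=False)

# (b) load the saved parameters into that instance
state_dict = torch.load('local_model_file.pth', map_location='cpu')
model.load_state_dict(state_dict)
model.eval()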
I'd like to use pre-trained sentence embeddings in my tensorflow graph execution model. The embeddings are available dynamically from a function call, which takes in an array of sentences and outputs an array of sentence embeddings. This function uses a pre-trained pytorch model, so it has to remain separate from the tensorflow model I'm training:
def get_pretrained_embeddings(sentences):
    return pretrained_pytorch_model.encode(sentences)
My tensorflow model looks like this:
class SentenceModel(tf.keras.Model):
    def __init__(self):
        super().__init__()

    def call(self, sentences):
        embedding_layer = tf.keras.layers.Embedding(
            10_000,
            256,
            embeddings_initializer=tf.keras.initializers.Constant(get_pretrained_embeddings(sentences)),
            trainable=False,
        )
        sentence_text_embedding = tf.keras.Sequential([
            embedding_layer,
            tf.keras.layers.GlobalAveragePooling1D(),
        ])
        return sentence_text_embedding
But when I try to train this model using
cached_train = train.shuffle(100_000).batch(1024)
model.fit(cached_train)
my embeddings_initializer call gets the error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
I assume this is because tensorflow is trying to compile the graph using symbolic data. How can I get my external function, which relies on the current training data batch, to work with tensorflow's graph training?
Tensorflow compiles models to an execution graph before performing the actual training process. The obvious side effect that clues us into this: a regular Python print() statement in, say, your call() method only executes once, while Tensorflow runs through your code to construct the execution graph, which it later converts to native code.
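A toy sketch (not from the question) that makes the tracing behaviour visible:

import tensorflow as tf

class Demo(tf.keras.Model):
    def call(self, x):
        print("tracing")       # plain Python: runs once, while the graph is built
        tf.print("executing")  # graph op: runs on every batch
        return x

# during model.fit(), "tracing" appears once per trace,
# while "executing" appears for every batch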
The other side effect is that you cannot use anything that isn't a tensor of some description during training. Here, all of the following count as 'tensors':
The input value of your call() method (obviously)
A tf.keras.Sequential
A tf.keras.Model/tf.keras.layers.Layer subclass
A SparseTensor
A tf.constant()
...and probably more I haven't listed here.
To this end, you would need to convert your PyTorch model to a Tensorflow one to be able to reference it in a subclass of tf.keras.Model/tf.keras.layers.Layer.
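One common route for that conversion is ONNX. A rough sketch, assuming the PyTorch encoder can be traced with token-id inputs (the dummy shape below is made up) and that the onnx and onnx-tf packages are installed:

import torch
import onnx
from onnx_tf.backend import prepare

# export the PyTorch model to ONNX (hypothetical input shape/dtype)
dummy_input = torch.zeros(1, 16, dtype=torch.long)
torch.onnx.export(pretrained_pytorch_model, dummy_input, "encoder.onnx")

# materialise the ONNX graph as a TensorFlow graph on disk
tf_rep = prepare(onnx.load("encoder.onnx"))
tf_rep.export_graph("encoder_tf")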
As a side note, if you do find you need to iterate a tensor, you should just be able to iterate it on the 1st dimension (i.e. the batch size) like so:
for part in some_tensor:
    pass
If you want to iterate on some other dimension, I recommend doing a tf.unstack(some_tensor, axis=AXIS_NUMBER_HERE) first and iterating over the result.
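For example (a toy sketch; tf.unstack returns a plain Python list, so the loop works in both eager and graph mode):

import tensorflow as tf

t = tf.reshape(tf.range(24.), [2, 3, 4])

# iterate along axis 1 rather than the batch axis
for part in tf.unstack(t, axis=1):
    print(part.shape)  # each part is (2, 4)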
I want to train a model with a custom loss function. In order to do that, I need to convert the tensor to a numpy array inside the method below:
def median_loss_estimation(y_true, y_predicted):
    a = y_predicted.numpy()
but I have this error:
AttributeError: 'Tensor' object has no attribute 'numpy'
Why?
How can I convert the tensor to a numpy array?
The answer is: put run_eagerly=True in model.compile!
You're doing the right thing; it's just that Tensorflow 2.1 is currently broken in that respect. This error would normally happen if you ran the code without eager mode enabled, but Tensorflow 2 runs in eager mode by default... or at least it should. The issue is tracked here.
There are at least two solutions to this:
Install the latest nightly build.
Set model.run_eagerly = True.
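Putting that together with the loss from the question, a sketch: the dummy model and data are mine, and in TF versions where compile() doesn't accept run_eagerly you can set the model.run_eagerly property after compiling instead.

import numpy as np
import tensorflow as tf

def median_loss_estimation(y_true, y_predicted):
    a = y_predicted.numpy()  # works once the model runs eagerly
    return tf.reduce_mean(tf.abs(y_true - y_predicted))

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer='adam', loss=median_loss_estimation, run_eagerly=True)
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1)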
So, I'm using a bunch of functions from OpenAI baselines for Reinforcement Learning. In those functions, policy nets are initialised using statements like:
with tf.variable_scope('deepq', reuse=True):
    ...
    return output
The problem is that the pointer to the output of those networks gets returned while still inside the scope, which means that when accessing those functions from another .py file I am still inside those scopes.
Basically I want to run a first function train_policy(output_dir) that trains the net and dumps the checkpoint to disk using tf.Saver().
Next, I run a function run_policy(output_dir) that reinitializes the same tf Graph and loads its pretrained values using the checkpoint dir.
Right now, when I try this, I get a ValueError:
"Variable deepq/... already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?" because at the point of running the second function, I'm still in the scope defined by the first.. I checked the code from OpenAI baselines (very nested code, hard to see everything that's going on), and reuse is already set to True.
So I tried doing something like:
tf.get_default_session().close() followed by:
tf.reset_default_graph()
after the first function call. (I don't need the session to remain active since I'm dumping everything to disk)
But this gives me errors because I'm still inside a nested graph scope and so I can't reset the default graph... (see e.g. here)
Alternatively I tried things like:
tf.get_default_graph().as_graph_def().__exit__()
or
tf.name_scope('deepq').__exit__()
but the __exit__() function needs a whole bunch of args I don't know how to get... (and I can't find good documentation on how to use this function).
My current solution is to run these functions in separate subprocesses in Python (and let the garbage collector do all the work), but this doesn't feel like a satisfactory solution...
Any ideas on how to deal with this? Ideally I'd need something like: tf.clear_all_graphs_and_sessions()
Alright, one solution is indeed to reset the default graph:
I simply wrap every function call in a new default graph object like this:
with tf.Graph().as_default():
    train_policy(output_dir)

with tf.Graph().as_default():
    run_policy(output_dir)
...
This way the default graph simply gets reinitialised empty and you can load whatever is in the checkpoint file. (Inside every function I also close the default session before returning).
You can try to do your work in another default graph:
with tf.get_default_graph().as_default():
    with tf.variable_scope('deepq', reuse=False):
        v = tf.get_variable('v', shape=[])
        print(v.name, v.graph)

        with tf.Graph().as_default():
            v = tf.get_variable('v', shape=[])
            print(v.name, v.graph)
Output:
deepq/v:0 <tensorflow.python.framework.ops.Graph object at 0x7f61adaa6390>
v:0 <tensorflow.python.framework.ops.Graph object at 0x7f61460abbd0>
I'm currently working on a quaternionic Neural Network using Tensorflow (I want to use GPUs). TensorFlow doesn't have support for quaternions, but you can represent them as a 4x4 real matrix, so it might be possible to build such a neural network in TensorFlow.
Is there a simple way to add a custom operation or to do a custom operation on tensors?
For example, I can write:
output_activation = tf.nn.softmax(tf.matmul(hidden_activation, Weight_to_output))
...and that's pretty cool! All you have to do is add a loss function and then do backpropagation. However, I want to do the same thing but with quaternions, for example:
output_activation = mySigmoid(myFunction(hidden_activation, Weight_to_output))
However, I need to transform the quaternions to and from tensors to optimize the GPU calculation. So I need to create a function that gets some tensors as parameters and returns the transformed tensors.
I've looked at py_func, but it seems that you can't return tensors.
I tried the following, but it failed:
def layerActivation(inputTensor, WeightTensor):
    newTensor = tf.matmul(inputTensor, WeightTensor)
    return newTensor
...and in main():
x = tf.placeholder(...)
W_to_hidden = tf.Variable(...)
test = tf.py_func(layerActivation, [x, W_to_hidden], [tf.float32])

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    king_return = sess.run(test, feed_dict={x: qtrain})
Error: Unimplemented: Unsupported object type Tensor
Ideally I could use this output_activation in the standard backprop algorithm of TensorFlow but I don't know if it's possible.
Depending on the functionality required, you might be able to implement your operation as a composition of existing TensorFlow ops, without needing to use tf.py_func().
For example, the following works and will run on a GPU:
def layer_activation(input_tensor, weight_tensor):
    return tf.matmul(input_tensor, weight_tensor)

# ...
x = tf.placeholder(...)
W_to_hidden = tf.Variable(...)
test = layer_activation(x, W_to_hidden)
# ...
The main reason to use tf.py_func() is if your operations cannot be implemented using TensorFlow operations, and you want to inject some Python code (e.g. using NumPy) that works on the actual values of your tensor.
However, if your mySigmoid() or myFunction() operations cannot be implemented in terms of existing TensorFlow operations, and you want to implement them on GPU, then (as keveman says) you will need to add a new op.
If you want to run your custom operations on GPUs, you have to provide a GPU implementation (kernel) in C++. Look at the documentation here for how to extend TensorFlow with custom operations, and especially the section on GPU support.
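To make the composition route concrete for quaternions: the Hamilton product can be phrased as a 4x4 real matmul, which keeps everything inside stock, GPU-capable TensorFlow ops. A sketch (the helper name is mine):

import tensorflow as tf

def quat_to_matrix(q):
    # left-multiplication matrix of quaternion q = (w, x, y, z)
    w, x, y, z = tf.unstack(q)
    return tf.stack([
        tf.stack([w, -x, -y, -z]),
        tf.stack([x,  w, -z,  y]),
        tf.stack([y,  z,  w, -x]),
        tf.stack([z, -y,  x,  w]),
    ])

q = tf.constant([1., 2., 3., 4.])
p = tf.constant([5., 6., 7., 8.])
qp = tf.linalg.matvec(quat_to_matrix(q), p)  # Hamilton product q * p

Because qp is built entirely from differentiable TensorFlow ops, it slots straight into the standard backprop machinery, which is exactly what tf.py_func would have prevented.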