I want to make a custom layer which takes an image as input, transforms it into a string of a certain kind, then performs some complicated calculations on this string and transforms it back into an image. These functions represent images as numpy arrays.
As far as I know, Keras custom layers should use TensorFlow tensors and operations on them. But in this case it seems impossible to write my functions in that format.
So, can I somehow just put my function into the model code?
I want to use a pre-trained PyTorch model in TensorFlow, and I need to convert TensorFlow tensors to PyTorch tensors. But I can't go through a numpy array as an intermediate step, since I get the error "You must feed a value for placeholder tensor".
I need this conversion while building the graph, so the TensorFlow tensor doesn't have a value yet and cannot be converted to numpy!
Any solution for that?
Operations you perform on TensorFlow tensors are "remembered" so that gradients can be calculated and back-propagated. The same is true for PyTorch tensors. All of this is ultimately required to train the model in both frameworks. It is also the reason why you can't convert tensors between the two frameworks: they have different ops and different gradient-calculation systems, and neither can capture an operation that happens outside its framework. For example, you can't (as of Jan 2021) have plain Python for loops in custom loss functions; the logic has to be expressed in the framework's own ops in order to work. Similarly, there is no implementation that converts PyTorch operations to TensorFlow operations.
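To make "remembered" concrete, here is a minimal sketch using the TF 2.x eager API (not the graph mode the question uses, but the principle is the same): the tape only sees TensorFlow ops, and a numpy detour is invisible to it.

    import numpy as np
    import tensorflow as tf

    x = tf.constant(3.0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = x * x                              # a TF op: the tape records it
    print(tape.gradient(y, x))                 # tf.Tensor(6.0, ...)

    with tf.GradientTape() as tape:
        tape.watch(x)
        z = tf.constant(np.square(x.numpy()))  # leaves the framework via numpy
    print(tape.gradient(z, x))                 # None: the tape never saw the op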
This answer shows how it's done when your tensor is well-defined (a concrete tensor, not a placeholder). But there is currently no way to propagate gradients from TensorFlow to PyTorch or vice versa. Maybe some future update to both frameworks will let them inter-operate, but I doubt it. It's best to use them separately.
So, in short: you can't convert placeholder tensors between the two frameworks. You have to stick to one of the libraries, or use concrete tensors with numpy as the mediator to communicate between frameworks.
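A minimal sketch of the "concrete tensors + numpy mediator" route, assuming eager TensorFlow and that no gradients need to cross the boundary:

    import tensorflow as tf
    import torch

    tf_tensor = tf.random.uniform((2, 3))                   # concrete eager tensor, not a placeholder
    torch_tensor = torch.from_numpy(tf_tensor.numpy())      # works only because a value exists
    tf_again = tf.convert_to_tensor(torch_tensor.numpy())   # and back again; no gradients flow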
I am following the tutorial on neural style transfer. The style transfer is done by minimizing a loss function with respect to an image (initialized with the content image). What confuses me is the following piece of code:
preprocessed_input = tf.keras.applications.vgg19.preprocess_input(inputs)
which is part of the call method in the StyleContentModel class. How does TensorFlow know the gradient of this operation? I have checked if this operation has a gradient function using get_gradient_function in the module tensorflow.python.framework.ops, and as far as I can tell it does not.
It is very simple: the function internally uses symbolic tensor operations, and those are differentiable. TensorFlow can compute gradients through any function that is composed of TensorFlow operations; there is no need to manually define a gradient for each function.
You can confirm this by looking at the code of that function here, especially the _preprocess_symbolic_input function here, which uses normal scalar operations and Keras backend functions (which in tf.keras are just TensorFlow functions).
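You can also confirm it empirically by asking for a gradient through the function (a quick sketch, not part of the tutorial):

    import tensorflow as tf

    image = tf.random.uniform((1, 224, 224, 3), maxval=255.0)
    with tf.GradientTape() as tape:
        tape.watch(image)
        preprocessed = tf.keras.applications.vgg19.preprocess_input(image)
        loss = tf.reduce_sum(preprocessed)
    print(tape.gradient(loss, image))  # a real gradient, no custom gradient needed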
This has nothing to do with the model itself. What the function does is put the input images into the format the pre-trained network expects: for VGG it converts RGB to BGR and subtracts the ImageNet channel means, while other models such as MobileNet scale pixels to the range -1 to +1. This kind of preprocessing is a common requirement for models used in transfer learning. If you use the ImageDataGenerator, it has a preprocessing_function parameter which the generator calls to preprocess the images. Make sure that if you preprocess the training images, you do the same for the test and validation images.
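For example, a sketch of wiring the same preprocessing into a generator (the directory path here is a placeholder):

    from tensorflow.keras.applications.vgg19 import preprocess_input
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # the generator applies preprocess_input to every image it yields,
    # so train, validation, and test data all get identical preprocessing
    datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    train_gen = datagen.flow_from_directory("data/train", target_size=(224, 224))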
I have a 3D sparse tensor in TensorFlow which I want to split along the first dimension (axis=0). I was thinking of using the tf.sparse_split operation, but it requires the num_splits argument as a Python integer. If I have num_splits in a scalar placeholder, is there any way to use it?
Why is such a convention followed for this function? I haven't seen it in any other TensorFlow operation.
In the TensorFlow framework, num_splits has to be known at graph-building time, because the graph is meant to be static, at least in the classic graph-based TensorFlow. If you really need parts of your graph to be dynamic, you might have more success using TensorFlow's imperative eager execution.
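A sketch of both cases, using the TF2 name tf.sparse.split (in TF1 it was tf.sparse_split); the tensor itself is illustrative:

    import tensorflow as tf

    st = tf.sparse.SparseTensor(
        indices=[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 0, 1]],
        values=[1.0, 2.0, 3.0, 4.0],
        dense_shape=[4, 3, 3])

    # graph-style: num_split must be a plain Python int, known when the graph is built
    parts = tf.sparse.split(sp_input=st, num_split=2, axis=0)

    # eager-style: the int can be computed from runtime data, since there is no static graph
    n = int(st.dense_shape[0]) // 2
    parts = tf.sparse.split(sp_input=st, num_split=n, axis=0)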
I am working with the DCGAN code. I need to modify the reward given to one of the neural nets by adding a function that takes the output of this net, analyses it, and issues a penalty on it. So my loss function would look like:
self.g_loss = self.g_loss + self.penalty
The problem is:
this penalty function only takes numpy arrays as input (I have no way of modifying this),
the neural network's output is a tf.Tensor,
and since no values have been assigned to the network yet (technically it hasn't even been built), I can run neither .eval() nor sess.run().
So how would I convert a tensorflow tensor into numpy array in this case?
TensorFlow has tf.py_func for wrapping Python functions and passing tensors to them. However, you can't then use this loss function to train the network, because TensorFlow doesn't automatically differentiate numpy code.
Luckily for you, autograd does automatically differentiate numpy code. If you use it inside another tf.py_func call, you can get gradients, which you can then feed back into the TensorFlow graph on the backward pass.
Here's an example of how you can do it all in this gist.
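For reference, the tf.py_func part alone looks roughly like this (a TF1-style sketch; numpy_penalty and the generator output stand in for your actual code):

    import numpy as np
    import tensorflow as tf  # TF1-style API

    def numpy_penalty(arr):                      # stands in for your numpy-only function
        return np.float32(np.abs(arr).mean())

    g_output = tf.random_normal([4, 64, 64, 3])  # stands in for the generator's output
    penalty = tf.py_func(numpy_penalty, [g_output], tf.float32)
    # penalty can be added to g_loss, but by itself it contributes no gradient;
    # the autograd trick in the gist is what supplies the backward pass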
I want to use my own convolution function in TensorFlow. I have implemented it using numpy. How would I convert the code to TensorFlow format (dynamic inputs in a computational graph)?
At present my function takes a 2D numpy array as input and produces a 3D numpy array (height, width and output channels). How can I iterate over all the input images?
What you want does not make sense. Convolution is a mathematical operation which is defined in a particular way. It is easily extended to N dimensions and converted to the discrete case (by summing instead of integrating), which is why TF has conv1d, conv2d, and a general n-dimensional convolution.
So you can't define your own convolution function, just as you can't define your own matrix multiplication: if it does not calculate the values in exactly the same way, it is no longer a convolution.
Now if you want to define your own operation, you should take a look at the official documentation on how to define it. In a few words, you can either:
create the op by composing functions that are already written (see the sketch below),
or write the op in C++.
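If your variant can be expressed through existing ops, the first route is straightforward. A sketch (the names here are illustrative):

    import tensorflow as tf

    def my_conv_op(images, kernel):
        # composed entirely from existing TF functions, so it stays
        # differentiable and accepts dynamic batch sizes
        y = tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding="SAME")
        return tf.nn.relu(y)

    images = tf.random.uniform((8, 32, 32, 1))  # batch, height, width, channels
    kernel = tf.random.uniform((3, 3, 1, 16))   # height, width, in_ch, out_ch
    out = my_conv_op(images, kernel)            # shape (8, 32, 32, 16)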