custom convolution function in tensorflow - python

I want to use my own convolution function in TensorFlow. I have implemented it using numpy. How would I convert the code to TensorFlow format (dynamic inputs in a computational graph)?
At present my function takes a 2D numpy array as input and produces a 3D numpy array (height, width, and output channels). How can I iterate through all the input images?

What you want does not make sense. Convolution is a mathematical operation that is defined in one particular way. It extends easily to N dimensions and to the discrete case (by summing instead of integrating), which is why TF provides conv1d, conv2d, and a general N-dimensional convolution.
So you cannot define your own convolution function, in the same way that you cannot define your own matrix multiplication: if it does not compute the values in exactly the same way, it is no longer a convolution.
Now if you want to define your own operation, take a look at the official documentation on how to do it. In a few words, you can either:
compose the op from already-written TensorFlow functions (see the sketch below), or
implement the op in C++ and register it with TensorFlow.
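As a rough illustration of the first option (TF 1.x graph mode assumed), an existing numpy implementation can be wrapped with tf.py_func so it runs inside the graph. This is a minimal sketch, with my_numpy_conv standing in for the asker's function; note that a py_func op runs on the CPU and has no gradient unless one is registered:

    import numpy as np
    import tensorflow as tf  # TF 1.x assumed

    def my_numpy_conv(image):
        # Dummy stand-in for the asker's numpy convolution: it must take a
        # 2D array and return a 3D array (height, width, output channels).
        return np.stack([image, image], axis=-1).astype(np.float32)

    # 2D input image, fed at run time
    image_ph = tf.placeholder(tf.float32, shape=[None, None])

    # Wrap the numpy function as a graph op
    conv_out = tf.py_func(my_numpy_conv, [image_ph], tf.float32)

    with tf.Session() as sess:
        out = sess.run(conv_out,
                       feed_dict={image_ph: np.ones((4, 4), np.float32)})
        print(out.shape)  # (4, 4, 2)

To iterate over a whole batch of images, the same op can be applied per image with tf.map_fn over the first dimension.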

Related

How to make a Keras custom layer without using TensorFlow tensors?

I want to make a custom layer which takes an image as input, transforms it into a string of a certain kind, then performs some complicated calculations with this string and transforms it back into an image. These functions represent images as numpy arrays.
As far as I know, Keras custom layers should use TensorFlow tensors and operations over them. But in this case it seems impossible to write my functions in that format.
So, can I somehow just put my function into the model code?
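One common workaround (a sketch, not from the original thread) is to wrap the numpy code with tf.py_func inside a Lambda layer; numpy_transform and the input shape below are hypothetical placeholders for the asker's actual pipeline, and no gradient flows through the wrapped code:

    import numpy as np
    import tensorflow as tf  # TF 1.x assumed
    from tensorflow import keras

    def numpy_transform(batch):
        # Hypothetical stand-in for the image -> string -> image pipeline,
        # operating on a numpy array of shape (batch, height, width, channels).
        return batch.astype(np.float32)

    inputs = keras.layers.Input(shape=(64, 64, 3))

    # Wrap the numpy code so it can sit inside the model. tf.py_func has no
    # gradient, so layers before this one would not be trainable, and the
    # output shape has to be declared via the Lambda's output_shape.
    processed = keras.layers.Lambda(
        lambda x: tf.py_func(numpy_transform, [x], tf.float32),
        output_shape=(64, 64, 3))(inputs)

    model = keras.models.Model(inputs, processed)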

Tensorflow - Sparse embedding lookup that remains sparse

I'm implementing a text classifier with a CNN similar to Kim 2014 in TensorFlow. TensorFlow provides tf.nn.embedding_lookup_sparse, which allows you to provide the word IDs as a sparse tensor. This is nice, especially for handling variable-length sequences. However, this function requires a "combination" step after the lookup, such as "mean" or "sum", which coerces the result back into dense tensor space. I don't want to do any combination; I want to keep my vectors in a sparse representation, so I can do other convolutions afterwards. Is this possible in TF?
EDIT: I want to avoid padding the input prior to the embedding lookup. This is because TensorFlow's embedding lookup generates vectors for the pad value, and it's a kludge to try to mask them with zeros (see here).
I think there are two points of confusion in the question. Firstly, the combiner operation happens across the set of embedding IDs in each row of the sparse indices input sp_ids. So if sp_ids has a shape of N x 1, you are "combining" just one embedding vector per row of sp_ids, which simply retrieves that embedding vector (which I think is what you are saying you want).
Secondly, the return value is the embedding vector for each row of input. The embedding vector itself is a dense vector, by the very definition of what an embedding is and of what the TensorFlow embedding operations calculate. So the result will always be dense, and that is what you want: a sparse matrix representation would be horribly inefficient, since the matrix truly is dense (full of dense embeddings), regardless of whether any combiner operation happens.
The research paper you linked does not appear to use any special methodology that would produce sparse embedding vectors, so I don't see a reason to expect or desire sparse outputs here.
Maybe I am incorrect; can you provide more details about why you expect the embedding vectors themselves to be sparse? That would be a highly unusual situation.
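To make the first point concrete, here is a minimal sketch (TF 1.x assumed) where sp_ids has one id per row, so the combiner averages over a single vector and simply returns it:

    import tensorflow as tf

    embeddings = tf.get_variable("emb", shape=[1000, 128])

    # Three rows, one word id per row (shape 3 x 1)
    sp_ids = tf.SparseTensor(
        indices=[[0, 0], [1, 0], [2, 0]],
        values=tf.constant([17, 42, 7], dtype=tf.int64),
        dense_shape=[3, 1])

    # Dense result of shape (3, 128); "mean" over a single embedding is
    # just that embedding itself.
    looked_up = tf.nn.embedding_lookup_sparse(
        embeddings, sp_ids, sp_weights=None, combiner="mean")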

Can I split a tensor dynamically in a variable number of parts with tf.sparse_split?

I have a 3D sparse tensor in TensorFlow which I want to split along the first dimension (axis=0). I was thinking of using the tf.sparse_split operation, but it requires the argument num_split as a Python integer. I wanted to know: if I have num_split in a scalar placeholder, is there any way to use it?
Why has such a convention been followed for this function? I haven't seen it in any other TensorFlow operation.
In the TensorFlow framework, num_split has to be known at graph-building time, because the graph is meant to be static. At least, that is the case with the old graph-based TensorFlow. If you really need parts of your graph to be dynamic, you might have success with TensorFlow's imperative eager execution.
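For reference, a minimal sketch of the static usage (TF 1.x assumed; the exact keyword names have changed between versions, with older releases using split_dim instead of axis):

    import tensorflow as tf

    # A small 3D sparse tensor
    st = tf.SparseTensor(
        indices=[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 0, 1]],
        values=[1.0, 2.0, 3.0, 4.0],
        dense_shape=[4, 3, 3])

    # num_split must be a plain Python int known at graph-building time;
    # a scalar placeholder will not work here.
    parts = tf.sparse_split(sp_input=st, num_split=2, axis=0)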

What is the most efficient way to get the upper triangle of a symmetric matrix in Tensorflow/Keras?

I have an intermediate model that outputs symmetric arrays. Those arrays are then used as input to another model. I'd like to just flatten the arrays and discard the lower triangles, since they're symmetric. Is there a best/most efficient way to do this?
Edit: I want the triangle extraction to be handled similar to any other Keras layer, so that the output of the first model can be input directly to the second model and trained end-to-end.
TensorFlow and Keras use numpy to handle the data. Consider using the numpy functions triu or triu_indices.
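One way to combine that suggestion with the Keras-layer requirement from the edit (a sketch under assumptions, with N standing in for the actual matrix size) is to precompute the upper-triangle indices with numpy and gather them inside a Lambda layer, so the operation stays in the graph and gradients can flow end to end:

    import numpy as np
    import tensorflow as tf
    from tensorflow import keras

    N = 8  # assumed size of the symmetric N x N matrices

    # Constant flat indices of the upper triangle, computed once with numpy
    rows, cols = np.triu_indices(N)
    flat_idx = rows * N + cols

    def upper_triangle(x):
        # x has shape (batch, N, N); flatten each matrix and keep only the
        # upper-triangle entries, giving shape (batch, N * (N + 1) / 2)
        flat = tf.reshape(x, (-1, N * N))
        return tf.gather(flat, flat_idx, axis=1)

    # Usable like any other layer between the two models
    triangle_layer = keras.layers.Lambda(upper_triangle)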

Tensorflow, tf.gradients calculations

I am learning how to use TensorFlow, and at one particular point I am really stuck and cannot make sense of it. Imagine I have a 5-layer network whose output is represented by output. Now suppose I want to find the gradient of output with respect to layer_2. For that purpose, the code I would write in TensorFlow is something like:
gradients_i_want = tf.gradients(output, layer_2)
Theoretically, this gradient should be calculated via the chain rule. I want to ask whether TensorFlow calculates these gradients via the chain rule, or whether it just takes the derivative of output with respect to layer_2 directly.
TensorFlow will create a graph for your model, where each node is an operation (e.g. an addition, a multiplication, or a combination of them). Basic ops have manually defined gradient functions, and those functions are used when applying the chain rule while traversing the graph backwards.
If you write your own custom op, you may need to write the corresponding gradient function as well.
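A tiny sketch (TF 1.x graph mode assumed) where the chain rule can be checked by hand; the names are illustrative, not the asker's actual network:

    import tensorflow as tf

    x = tf.constant(2.0)
    layer_2 = 3.0 * x            # an intermediate tensor, layer_2 = 6
    output = tf.square(layer_2)  # output = layer_2 ** 2

    # d(output)/d(layer_2) = 2 * layer_2 = 12. TensorFlow obtains this by
    # walking backwards through the graph and multiplying the registered
    # gradient of each op, i.e. the chain rule.
    grad = tf.gradients(output, layer_2)

    with tf.Session() as sess:
        print(sess.run(grad))  # [12.0]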
