Accessing Input Layer data in Tensorflow/Keras - python

I am trying to replicate a neural network for depth estimation. The original authors took a pre-trained network and inserted a 'superpixel pooling layer' between the convolutional part and the fully connected layer. In this layer, the convolutional feature maps are upsampled and the features are averaged per superpixel.
My problem is that, in order to achieve this, I need to compute the superpixels for each image. How can I access the data being processed by keras/tensorflow during batch processing so that I can perform SLIC oversegmentation?
I considered splitting the task into pieces, i.e. feeding the images into the convolutional network, processing the outputs separately, and then feeding them into a fully connected layer. However, this would make further end-to-end training of the network impossible.

At the time of writing it seems impossible to actually access the data inside a symbolic tensor. It also seems unlikely that such functionality will be added in the future, since the TensorFlow documentation states:
A Tensor object is a symbolic handle to the result of an operation, but
does not actually hold the values of the operation's output.
Keras allows the creation of custom layers. However, these are limited to the available backend operations. As such, it is simply not possible to access the raw batch data from inside a layer.
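As a concrete illustration of that limitation, here is a minimal sketch of a custom Keras layer (the layer name and the upsampling factor are hypothetical): inside call() only symbolic backend operations are available, so there is no point at which the batch could be handed to scikit-image's SLIC.

from keras import backend as K
from keras.layers import Layer

class SuperpixelPoolingAttempt(Layer):
    # Illustrative skeleton only; the behaviour is a placeholder.
    def call(self, inputs):
        # `inputs` is a symbolic tensor here, so only backend ops are usable:
        upsampled = K.resize_images(inputs, 2, 2, data_format="channels_last")

        # Something like the following is NOT possible in graph mode, which is
        # why SLIC (a NumPy/scikit-image routine) cannot be run per batch here:
        #   batch_as_numpy = inputs.numpy()      # no concrete values exist yet
        #   segments = slic(batch_as_numpy[0])   # would need real pixel data

        return K.mean(upsampled, axis=[1, 2], keepdims=True)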

Related

Can I use a convolution filter instead of a dense layer for classification?

I was reading a decent paper, S-DCNet, and came across a section (page 3, Table 1, classifier) where a convolutional layer is used on the feature map to produce a binary classification output as part of an internal process. Since I am a noob, and when someone talks to me about classification I automatically think of FC layers combined with softmax, I started wondering... Is this a possible thing to do? Can a convolutional layer indeed be used to classify a binary outcome? The whole concept triggered my imagination so much that I insist on getting answers...
Honestly, how does this actually work? What is the difference between using a convolution filter and a fully connected layer for classification purposes?
Edit (an uncertain answer on how it works): I asked a colleague and he told me that using a filter with the same height and width as the feature map at the current stage may lead to a learnable binary output (provided that you also reduce the feature map to a single channel). But I still don't understand the motivation behind such a technique.
Using convolutions as FC layers can be done, for example, with filters of spatial size (1, 1) and a depth equal to the FC input size.
The resulting feature map is of the same spatial size as the input feature map, but each pixel is the output of an "FC" layer whose weights are the weights of the shared 1x1 conv filter.
This kind of thing is used mainly for semantic segmentation, meaning classification per pixel. U-net is a good example if memory serves.
Also see this.
Also note that 1x1 convolutions have other uses as well.
See paperswithcode; some of the networks there probably use this trick.
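As a rough illustration of both variants, the 1x1 convolution described in this answer and the full-size filter from the question's edit (the shapes and filter counts are made up for the example):

import tensorflow as tf

# Hypothetical feature map: spatial size 16x16 with 256 channels.
feature_map = tf.keras.Input(shape=(16, 16, 256))

# (a) A 1x1 convolution acts like a shared fully connected layer applied at
#     every spatial position: each 256-dim pixel vector is mapped to one value
#     with the same weights.
per_pixel_logit = tf.keras.layers.Conv2D(filters=1, kernel_size=1)(feature_map)
# shape: (batch, 16, 16, 1)

# (b) A filter with the same spatial size as the feature map collapses the
#     whole map to a single value, which is mathematically a dense layer over
#     the flattened 16*16*256 features.
single_logit = tf.keras.layers.Conv2D(filters=1, kernel_size=16, padding="valid")(feature_map)
# shape: (batch, 1, 1, 1)

model = tf.keras.Model(feature_map, [per_pixel_logit, single_logit])
model.summary()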

What is the sense behind setting weights for the layers in keras?

I have trouble understanding weight transfer in transfer-learning-like tasks...
I trained two networks and saved the weights using keras with a tensorflow backend (the two networks are in the same model). I would like to use half of the layers from one network and half of the layers from the other network and concatenate them into a new network. Practically, I want to cut the two networks, join them in a new network, and throw away the remaining layers. Since half of the layers are top layers, I couldn't do it with .pop(), so I decided to transfer weights instead.
I tried this by setting corresponding weights from each layer (the ones that I needed) in the old model to corresponding layers in my new model like:
new_model.layers[i].set_weights(model.layers[i].get_weights())
This loads the weights, but it does not seem to work as I expect.
Then I tried get_layer:
new_model.layers[i] = model.get_layer('name').output
This also seems to result in a meaningless weight transfer.
What should I transfer from my old network to the new network to carry the sense of actually taking half of the whole network?
Do the weights (and biases) alone carry all the information? What else should I assign to get theoretically identical layers?
What does get_layer return?
Do get_weights/set_weights do the same thing as load_weights?
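For reference, the layer-by-layer copy described above can be written roughly like this (the models, layer sizes and the layers being copied are made up for the sketch); get_weights()/set_weights() only move the numeric arrays, so the receiving layer must already have exactly matching shapes:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Two hypothetical models whose first layer has the same shape.
old_model = Sequential([Dense(16, activation="relu", input_shape=(8,)),
                        Dense(4, activation="softmax")])
new_model = Sequential([Dense(16, activation="relu", input_shape=(8,)),
                        Dense(1, activation="sigmoid")])

# Copy the weights of the shared (first) layer only: get_weights() returns a
# list of numpy arrays (kernel, bias) and set_weights() expects matching shapes.
new_model.layers[0].set_weights(old_model.layers[0].get_weights())

# Quick check that the arrays were actually copied.
assert all(np.array_equal(a, b)
           for a, b in zip(new_model.layers[0].get_weights(),
                           old_model.layers[0].get_weights()))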

Can CNN autoencoders have different input and output dimensions?

I am working on a problem that requires me to build a deep learning model which, given a certain input image, has to output another image. It is worth noting that the two images are conceptually related, but they do not have the same dimensions.
At first I thought that a classical CNN with a final dense layer whose size is the product of the height and width of the output image would suit this case, but during training it gave strange figures, such as an accuracy of 0.
While looking for answers on the internet I discovered the concept of CNN autoencoders, and I wondered whether this approach could help me solve my problem. In all the examples I saw, however, the input and output of the autoencoder had the same size and dimensions.
At this point I wanted to ask whether there is a type of CNN autoencoder that produces an output image with different dimensions than the input image.
An autoencoder (AE) is an architecture that tries to encode your image into a lower-dimensional representation while simultaneously learning to reconstruct the data from that representation. AEs are therefore unsupervised (no labels needed): the same data is used both as the input and as the target in the loss.
You can try a U-Net based architecture for your use case. A U-Net forwards intermediate data representations to later layers of the network, which should assist with faster learning/mapping of the inputs into a new domain.
You can also experiment with a simple architecture containing a few ResNet blocks without any downsampling layers, which might or might not be enough for your use case.
If you want to dig a little deeper you can look into Disco-GAN and related methods. They explicitly try to map an image into a new domain while maintaining image information.
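A minimal sketch of a convolutional encoder-decoder whose output resolution differs from the input (all sizes, filter counts and the loss are arbitrary choices for illustration): adding one extra upsampling step in the decoder is enough to make the output larger than the input.

import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical sizes: 64x64 RGB in, 128x128 RGB out.
inputs = tf.keras.Input(shape=(64, 64, 3))

# Encoder: downsample to a compact representation.
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)  # 32x32
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)       # 16x16

# Decoder: upsample one extra step so the output is larger than the input.
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)  # 32x32
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)  # 64x64
x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)  # 128x128
outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)              # 128x128x3

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.summary()  # output shape (None, 128, 128, 3) vs. input (None, 64, 64, 3)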

tensorflow save and restore autoencoder

I used tf.layers.dense to build a fully connected autoencoder, and I want to save it and then restore only the encoder to get the embedding output.
How can I use tf.train.Saver to restore only the encoder? I want to use a different batch size with the restored model, so that I can feed a single data point into it.
I have seen many tutorials, but none of them cover this.
Is there any standard solution for this?
Thank you very much
If you don't care about disk space, the easiest way is to save the whole graph (encoder and decoder) and, when using it for prediction, pass the last layer of the encoder as the fetch argument. TensorFlow will only compute up to that point, so there is no computational difference compared to saving only the encoder.
Otherwise you can create two graphs (one for the encoder, one for the decoder) and train them together. But this is a bit more complex.
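A rough TF1-style sketch of the first suggestion (the layer sizes, tensor names and checkpoint path are assumptions): save everything, then at prediction time fetch only the encoder's output tensor; because the placeholder's batch dimension is None, a batch of one works without any changes.

import numpy as np
import tensorflow as tf  # TF1-style graph code (tf.compat.v1 in TF2)

# --- build the autoencoder (shapes are made up for the sketch) ---
x = tf.placeholder(tf.float32, shape=[None, 784], name="x")   # None = any batch size
encoded = tf.layers.dense(x, 32, activation=tf.nn.relu, name="encoder_out")
decoded = tf.layers.dense(encoded, 784, name="decoder_out")

saver = tf.train.Saver()  # saves all variables in the graph (encoder + decoder)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training would happen here ...
    saver.save(sess, "./autoencoder.ckpt")

# --- later: restore everything, but only fetch the encoder output ---
with tf.Session() as sess:
    saver.restore(sess, "./autoencoder.ckpt")
    one_sample = np.random.rand(1, 784).astype(np.float32)    # batch size 1 works
    embedding = sess.run(encoded, feed_dict={x: one_sample})  # decoder is never run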

Train a feed forward neural network indirectly

I am faced with this problem:
I have to build an FFNN that has to approximate an unknown function f: R^2 -> R^2. The only data I have to check the net against is one-dimensional, i.e. it lives in R. However, I know the function g: R^2 -> R that maps the output of the net into the space of my data. So I would use the neural network as a filter against bias in the data. But I am faced with two problems:
Firstly, how can I train my network in this way?
Secondly, I am thinking about adding an extra layer that maps R^2 -> R, letting the net train itself to find the correct mappings, and then removing the extra layer. Would this algorithm be correct? Namely, would the output be the same as the one I am looking for?
Your idea of an additional layer is good, although the problem is that the weights in this layer have to be fixed. So in practice you have to compute the partial derivatives of your R^2 -> R mapping, which can be used as the error to propagate through your network during training. Unfortunately, this may lead to the well-known "vanishing gradient" problem, which stalled the development of neural networks for many years.
In short: you can either manually compute the partial derivatives and, given the expected output in R, feed the computed "backpropagated" errors to the network that learns the R^2 -> R^2 mapping, or, as you said, create the additional layer and train normally, but you will have to keep the top weights constant (which will require some changes in the implementation).
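In a framework with automatic differentiation, the "fixed top layer" amounts to composing the network with g and letting autodiff handle the partial derivatives. A minimal Keras sketch, using a made-up g(y1, y2) = y1 + y2^2 as a stand-in for the known mapping:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Stand-in for the known mapping g: R^2 -> R (the real g would go here).
def g(y):
    return y[:, :1] + tf.square(y[:, 1:2])

inputs = tf.keras.Input(shape=(2,))
h = layers.Dense(32, activation="relu")(inputs)
f_hat = layers.Dense(2, name="f_output")(h)        # the R^2 -> R^2 network we actually want
observed = layers.Lambda(g, name="g_of_f")(f_hat)  # fixed, non-trainable mapping R^2 -> R

model = tf.keras.Model(inputs, observed)
model.compile(optimizer="adam", loss="mse")  # loss is computed in R, against the 1-D data

# Toy data: inputs in R^2, targets in R (the only supervision available).
X = np.random.rand(256, 2).astype(np.float32)
y = np.random.rand(256, 1).astype(np.float32)
model.fit(X, y, epochs=1, verbose=0)

# After training, the R^2 -> R^2 approximation of f is the submodel up to f_output.
f_model = tf.keras.Model(inputs, f_hat)
print(f_model.predict(X[:3]))

Since g has no trainable parameters, the gradients of the R-valued loss flow through it unchanged into the R^2 -> R^2 part, which is exactly the manual partial-derivative scheme described above.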
