I am creating a model somewhat similar to the one mentioned below:
[model architecture diagram]
I am using Keras to create such a model, but I have hit a dead end: I have not been able to find a way to apply SoftMax to the outputs of the LSTM units. So far, all the tutorials and help material only cover outputting a single class for the whole sequence, as in the image-captioning example in this link.
So, is it possible to apply SoftMax to every timestep of the LSTM (where return_sequences=True), or do I have to move to PyTorch?
The answer is: yes, it is possible to apply SoftMax to each timestep of the LSTM output, and no, you do not have to move to PyTorch.
While in Keras 1.X you needed to explicitly wrap the output layer in a TimeDistributed layer, in Keras 2.X you can just write:
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(Dense(number_of_classes, activation='softmax'))
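Since in Keras 2.X a Dense layer applied to a 3D tensor acts on the last axis independently at every timestep, the two forms are equivalent. For reference, a minimal sketch of the explicit TimeDistributed variant (timesteps, features and number_of_classes are placeholder values, not taken from the question):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

timesteps, features, number_of_classes = 10, 8, 5  # placeholder shapes

# return_sequences=True makes the LSTM emit one vector per timestep;
# TimeDistributed then applies the softmax Dense layer to each of them.
model = Sequential([
    LSTM(50, activation='relu', return_sequences=True,
         input_shape=(timesteps, features)),
    TimeDistributed(Dense(number_of_classes, activation='softmax')),
])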
I have written code that computes Choquet pooling in a custom layer in Keras. Below is the Colab link to the notebook:
https://colab.research.google.com/drive/1lCrUb2Jm680JRnACPxWpxkOSkP_DlHGj
As you can see, the code crashes during gradient computation, precisely inside the function custom_grad. This should be impossible, because I'm returning zero gradients with the same shape as the previous layer.
So I have 2 questions:
Is there a way in Keras (or in TensorFlow) to compute the gradient between a layer's input and its output?
If I have passed a tensor with the same shape as the previous layer's output, but filled with 0s, why is the code not working?
Thanks in advance for your attention and your help.
It seems no one was interested in this question, so I'm posting my own solution.
After several trials, I have found a solution. The problem is that, as posted by Mainak431 in this GitHub repo:
link to diff and non-diff ops in tensorflow
There are differentiable TensorFlow operations and non-differentiable operations. In the Colab notebook I used, for example, scatter_nd_update, which is non-differentiable.
So if you want to create your own custom layer in Keras, I suggest taking a look at the lists above, so that you only use operations that allow Keras to auto-differentiate for you.
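To illustrate the point, here is a minimal sketch of a custom layer built only from differentiable ops (tf.sort, softmax, multiply, reduce_sum), so TensorFlow can auto-differentiate through it. This is not the Choquet pooling from the notebook, just an illustration of the principle:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class SortedWeightedPool(Layer):
    # Pools the last axis as a learnable weighted sum of sorted values.
    def build(self, input_shape):
        self.w = self.add_weight(name='w', shape=(input_shape[-1],),
                                 initializer='uniform', trainable=True)
        super().build(input_shape)

    def call(self, x):
        # tf.sort and the element-wise ops below all have registered
        # gradients, unlike e.g. scatter_nd_update.
        sorted_x = tf.sort(x, axis=-1)
        weights = tf.nn.softmax(self.w)
        return tf.reduce_sum(sorted_x * weights, axis=-1)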
Anyway, I'm working on it to share as much as possible about this open research topic. Remember that with neural networks the "LEGO-ing" of operations is nearly limitless, and I know for sure that many of you are interested in adding your own operations (aggregation or something else) to a deep neural network model.
Special thanks to Mainak431, I love you <3
I'm a beginner in the field of deep learning. I'm trying to use Keras for an LSTM in a regression problem. I would like to build an ANN that can exploit the memory cell between one prediction and the next.
In more detail: I have a neural network (Keras) with 2 hidden LSTM layers and 1 output layer for a regression context.
The batch_size is equal to 7, the timestep is equal to 1, and I have 5749 samples.
I only want to understand whether using timestep == 1 is the same thing as using an MLP instead of an LSTM. By time_step I'm referring to the reshape phase for the input of the Sequential model in Keras. The output is a single regression.
I'm not interested in the previous inputs; I'm only interested in the output of the network as information for the next prediction.
Thank you in advance!
You can say so :)
You're right in thinking that you won't have any recurrence anymore.
But internally there will still be more operations than in regular Dense layers, due to the existence of more kernels.
But be careful:
If you use stateful=True, it will still be a recurrent LSTM! (See the sketch after this list.)
If you use initial states properly, you can still make it recurrent.
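Here is a minimal sketch of the stateful case (batch_size=7 and timesteps=1 follow the question; n_features is a placeholder). With stateful=True the hidden and cell states are carried over from one batch to the next, so the network stays recurrent across predictions even with a single timestep:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_features = 10  # placeholder, not taken from the question

model = Sequential([
    # stateful=True keeps the LSTM states between consecutive batches
    LSTM(32, stateful=True, batch_input_shape=(7, 1, n_features)),
    Dense(1),  # single regression output
])
model.compile(optimizer='adam', loss='mse')

# Reset states manually between independent sequences (e.g. per epoch):
# model.reset_states()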
If you're interested in creating custom operations with the memory/state of the cells, you could try creating your own custom recurrent cell, taking the LSTMCell code as a template.
Then you'd use that cell in an RNN(CustomCell, ...) layer.
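As a rough sketch of that route (the state scaling below is an arbitrary placeholder operation, only there to show where your custom logic would go):

from tensorflow.keras.layers import LSTMCell, RNN

class CustomCell(LSTMCell):
    # Wraps LSTMCell and manipulates the carried state between steps.
    def call(self, inputs, states, training=None):
        output, new_states = super().call(inputs, states, training=training)
        new_states = [s * 0.99 for s in new_states]  # placeholder custom op
        return output, new_states

layer = RNN(CustomCell(32), return_sequences=True)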
I'm trying to convert the Keras OCR example into a CoreML model.
I can already train my slightly modified model, and everything looks good in Python. But now I want to convert the model to CoreML to use it in my iOS app.
The problem is that the CoreML file format doesn't support Lambda layers.
I am not an expert in this field, but as far as I understand, the Lambda layer here is used to calculate the loss using ctc_batch_cost().
The layer is created around line 464.
I guess this is used for greater precision over the "built-in" loss functions.
Is there any way the model creation can be rewritten to fit the layer set CoreML supports?
I have no idea which output layer type to use for the model.
Cost functions usually aren't included in the CoreML model, since CoreML only does inference, while cost functions are used for training. So strip out that layer before you export the model and you should be good to go.
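A minimal export sketch, assuming the layer names from the Keras OCR example ('the_input' for the image input, 'softmax' for the per-timestep character probabilities) and that training_model is the model built by that script; the CTC Lambda layer only computes the training loss, so the exported model can stop at the softmax output:

import coremltools
from keras.models import Model

# Rebuild an inference-only model that ends at the softmax layer,
# leaving out the Lambda layer that wraps ctc_batch_cost().
inference_model = Model(
    inputs=training_model.get_layer('the_input').input,
    outputs=training_model.get_layer('softmax').output)

coreml_model = coremltools.converters.keras.convert(inference_model)
coreml_model.save('OCR.mlmodel')

Decoding the CTC output (e.g. best-path decoding) then has to happen in your iOS app.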
I'm trying to replicate the code in this blog article: How convolutional neural networks see the world.
It works well in a CNN where there's no dropout layer, but when there's one (or more) dropout layers, I can't directly use the layer.output line because it expects a learning phase.
When I use the recommended way to extract the output of a layer:
get_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                              [model.layers[layer_index].output])
layer_output = get_layer_output([input_img, 0])[0]  # 0 = test phase
The problem is that I can't pass a placeholder as input_img because it expects "real" data; but if I pass "real" data directly, then the rest of the code doesn't work (creating the loss, the gradients, and iterating requires a placeholder).
Is there a way I can make this work?
I'm using the Tensorflow backend.
EDIT: I solved my issue by calling the K.set_learning_phase() method before doing anything else, such as building my model (I had to start from a fresh environment, and I called the method right after the imports).
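For reference, a minimal sketch of that fix (assuming the TensorFlow backend mentioned above):

from keras import backend as K

# Set the learning phase once, right after the imports and before the
# model is built: 0 = test phase, so dropout is bypassed and no
# learning-phase placeholder needs to be fed later.
K.set_learning_phase(0)

# ... build the model and the K.function calls after this point ...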
I want to use a pretrained neural network and just fine-tune it to my specific needs. I wanted to use Python and the Lasagne framework for this. On:
https://github.com/Lasagne/Recipes/blob/master/examples/ImageNet%20Pretrained%20Network%20%28VGG_S%29.ipynb
I found an example of how to use a pretrained network for specific images. My problem is that I would like to use the network described in the link above as a starting point and add a final layer to it that makes it implement a TWO-CLASS classifier, which is what I need. I therefore want to keep all the layers in the network frozen and allow training ONLY in my last added layer.
Apparently there is a way to mark layers as "nontrainable" in Lasagne, but I have found no examples of how to do this on the web.
Any thoughts on this would be highly appreciated.
Set the learning rate of the layers you want frozen to 0, and set a nonzero learning rate only for the layers you want to fine-tune. There is no online example yet, but you should check this thread: https://groups.google.com/forum/#!topic/lasagne-users/2z-6RrgiHkE
Remove the 'trainable' tag from all parameters of the layers that you want to keep frozen:

def freeze_layer(layer):
    # layer.params maps each parameter to its set of tags;
    # discard() avoids a KeyError when the tag is already absent.
    for tags in layer.params.values():
        tags.discard('trainable')
To freeze all your network up to a certain layer you can simply iterate over its lower layers:
from lasagne.layers import get_all_layers

def freeze_net(net):
    for layer in get_all_layers(net):
        freeze_layer(layer)
Code untested. See this discussion for more info.
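For completeness, a hedged sketch of the overall fine-tuning flow, assuming pretrained_net is the top layer of the loaded VGG_S network from the notebook (e.g. the layer just below the original 1000-way classifier). Everything below it is frozen, so get_all_params(..., trainable=True) returns only the new layer's parameters:

import lasagne
from lasagne.layers import DenseLayer, get_all_params

freeze_net(pretrained_net)  # freeze the pretrained part (see above)

# New two-class softmax output layer; this is the only trainable part.
new_output = DenseLayer(pretrained_net, num_units=2,
                        nonlinearity=lasagne.nonlinearities.softmax)
params = get_all_params(new_output, trainable=True)
# ... build the loss and updates (e.g. lasagne.updates.nesterov_momentum)
#     from params, so training only touches the new layer ...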