I want to implement i-RevNet on the MNIST dataset in Keras and generate the original 28x28 input images from the output of i-RevNet, but I don't have a clue how. The online resources I can find are all based on TensorFlow.
The key references are the paper "i-RevNet: Deep Invertible Networks" (https://arxiv.org/pdf/1802.07088.pdf) and the repository https://github.com/jhjacobsen/pytorch-i-revnet
According to the paper, the critical components of i-RevNets are homeomorphic layers. For the link between topology and neural nets, see "Neural Networks, Manifolds, and Topology" at http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/ (search for 'homeomorphic').
In https://github.com/jhjacobsen/pytorch-i-revnet the homeomorphic layers are implemented in class irevnet_block(nn.Module). Note that there are no operations that discard information, such as max pooling or averaging (with the exception of the output layer); only batch normalization (https://towardsdatascience.com/batch-normalization-in-neural-networks-1ac91516821c) is applied. The ReLUs are piecewise linear but not invertible on their own; the block as a whole stays invertible because they sit in the residual branch of an additive coupling, which is undone by subtraction.
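The question "Where do I call the BatchNormalization function in Keras?" shows how to implement this in Keras: simply stack the layers, leaving out pooling and similar layers. This stack serves as the residual function of a block; the block's forward and inverse methods add and subtract its output, which is what makes the block invertible.
# homeomorphic layer -> no pooling or other information-discarding layers
model.add(Dense(64, kernel_initializer='random_uniform'))  # 'init' was the Keras 1 name of this argument
model.add(Activation('relu'))
model.add(BatchNormalization())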
The rest of the code in https://github.com/jhjacobsen/pytorch-i-revnet/blob/master/models/iRevNet.py, e.g. def inverse(self, x) and def forward(self, x), can be reproduced with the Keras layers in https://keras.io/layers/merge/ . See https://github.com/jhjacobsen/pytorch-i-revnet/blob/master/models/model_utils.py for the merge and split functions: they use torch.cat and torch.split. The Keras counterpart of torch.cat is the Concatenate layer listed at https://keras.io/layers/merge/ ; torch.split has no dedicated layer there, but the tensor can be sliced inside a Lambda layer.
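To make the role of forward and inverse concrete, here is a minimal pure-Python sketch of an additive coupling in the spirit of irevnet_block. It is not the repository's code: residual_fn is a hypothetical stand-in for the convolution / batch normalization / ReLU stack, and plain lists stand in for tensors.

```python
# Sketch of an additive coupling; names and values are illustrative assumptions.

def residual_fn(values):
    """Toy residual function: a scaled ReLU, which is NOT invertible by itself."""
    return [0.5 * max(v, 0.0) for v in values]

def forward(x1, x2):
    """(x1, x2) -> (x2, x1 + F(x2))"""
    fx2 = residual_fn(x2)
    y2 = [a + b for a, b in zip(x1, fx2)]
    return x2, y2

def inverse(y1, y2):
    """(y1, y2) -> (y2 - F(y1), y1); recomputes F and subtracts it again."""
    fx2 = residual_fn(y1)
    x1 = [a - b for a, b in zip(y2, fx2)]
    return x1, y1

x1, x2 = [1.0, -2.0, 3.0], [-4.0, 5.0, 6.0]
y1, y2 = forward(x1, x2)
rec_x1, rec_x2 = inverse(y1, y2)
```

The input is recovered even though residual_fn throws information away, because the inverse only needs to recompute residual_fn on the half that was passed through unchanged. In Keras the same pattern could be built from Add, Subtract and Concatenate layers.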
We learned in class how to build an LSTM model with Keras; however, I'm still confused about how to set up the layers for the model. What are the rules, and what does each step mean?
For instance, for the code below:
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import Dense
numUnits = 50
model = Sequential()
model.add( LSTM(units=numUnits, return_sequences=True,
                input_shape=(X_train.shape[1], 1)) )
model.add( Dropout(0.2) )
model.add( LSTM(units=numUnits) )
model.add( Dropout(0.2) )
model.add( Dense(units=1) )
model.compile( loss='mean_squared_error' )
What does each of these steps mean? Do we need to add dropout after each layer? Does the model always have to end with a Dense layer?
The first layer added to the model is an LSTM (Long Short-Term Memory) layer, a type of recurrent neural network layer that is well suited to processing sequential data. Because a second LSTM layer follows, the first one is created with return_sequences=True so that it passes on its output at every time step rather than only at the last one.
The Dropout layer randomly zeroes a fraction of its inputs during training, which helps to prevent overfitting. The fraction is set by its argument, so Dropout(0.2) drops 20% of the values.
The Dense layer is the most common type of layer in a neural network, and it is typically used to transform the output of the previous layer into a format that suits the task at hand. For example, in a classification task the Dense layer may transform the output of the previous layer into a probability distribution over the classes. Here, Dense(units=1) with a mean squared error loss produces a single regression value.
To answer your questions: Dropout layers are not required after every layer, but they are often placed after recurrent layers such as LSTM or GRU to prevent overfitting. The Dense layer is usually the last layer in a model, but that is not strictly necessary.
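As an illustration of what a dropout step does, here is a small pure-Python sketch of "inverted" dropout. It is a simplified stand-in, not the Keras implementation; the function name and arguments are assumptions made for this example. Like the Keras layer, it rescales the surviving values by 1 / (1 - rate) during training and does nothing at inference time.

```python
import random

def dropout(values, rate, training, rng):
    """Zero each value with probability `rate` while training and scale the
    survivors by 1 / (1 - rate); return the values unchanged otherwise."""
    if not training or rate == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)          # fixed seed so the run is repeatable
activations = [1.0] * 10
train_out = dropout(activations, rate=0.2, training=True, rng=rng)
infer_out = dropout(activations, rate=0.2, training=False, rng=rng)
```

Each entry of train_out is either 0.0 or about 1.25, while infer_out equals the input, which is why dropout affects training only.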
I want to predict the number of stars a product gets on Amazon using Keras. I have seen other ways of doing this, but I have used the Universal Sentence Encoder with one-hot encoding (I followed a YouTube tutorial to embed the reviews). Without an LSTM layer, and using the following layers:
model.add(keras.layers.Dense(units=256, input_shape=(X_train.shape[1],), activation='relu'))
model.add(keras.layers.Dropout(rate=0.5))
model.add(keras.layers.Dense(units=128, activation='relu'))
model.add(keras.layers.Dropout(rate=0.5))
model.add(keras.layers.Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(0.0001), metrics=['accuracy'])
I get an accuracy of around 0.55 and a loss of 1, which isn't great. However, when I reshape my X_train and X_test data into 3D input for an LSTM layer and feed it into a model such as:
model.add(keras.layers.Dense(units=256, input_shape=(512, 1), activation='relu'))
model.add(keras.layers.Dropout(rate=0.5))
model.add(keras.layers.Bidirectional(keras.layers.LSTM(100, dropout=0.2, recurrent_dropout=0.3)))
model.add(keras.layers.Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(0.0001), metrics=['accuracy'])
I get an accuracy of around 0.2, which is even worse, with a loss close to 2.00.
I have no idea whether an LSTM is necessary, as I am new to neural networks, but I have been trying to do this for my project.
So my question is: should I stick with the first model without an LSTM, or is there a way of changing the second network with the LSTM to improve on its accuracy of 0.2 while keeping the embedding method I have used?
Thanks for your time!
The reason to choose an LSTM over plain dense layers is that in language the relationships between words matter for understanding what a sentence means. A model with only dense layers does not capture this well, because it has no connections that can store such information; it predicts from the input as a whole rather than from the relationships between the words. LSTM stands for Long Short-Term Memory: in short, these layers can remember data they have seen earlier in the sequence, which helps them relate different words in the same sentence.
As for how to build your model: first, use the Tokenizer from the TF (Keras) library to build a vocabulary from your data and convert each text into a sequence of integers, then pad the sequences to a common length with pad_sequences. Your data is then ready. The first layer of your network should be an Embedding layer. After it you can use an LSTM (for the reasons above), a Bidirectional LSTM (which learns dependencies both left-to-right and right-to-left and often performs better than a unidirectional LSTM), or Conv1D layers (which model dependencies that fall within the filter length; they have been used successfully, so they are worth trying), followed by a pooling layer (GlobalMaxPooling1D) and then dense layers to produce your predictions.
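The preprocessing steps described above (build a vocabulary, turn texts into integer sequences, pad them) can be sketched in plain Python as follows. The helper functions are hypothetical stand-ins written for this illustration, not the Keras Tokenizer or pad_sequences themselves; they only mimic the general idea of reserving index 0 for padding and padding at the front.

```python
from collections import Counter

def fit_vocab(texts):
    """Map each word to an integer id, most frequent first; 0 is kept for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: idx for idx, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_ids(texts, vocab):
    """Replace every known word by its id; unknown words are skipped."""
    return [[vocab[w] for w in text.lower().split() if w in vocab] for text in texts]

def pad(sequences, maxlen):
    """Left-pad with 0 (or keep only the last maxlen ids) so all rows share one length."""
    return [[0] * (maxlen - len(s)) + s[-maxlen:] for s in sequences]

reviews = ["great product great price", "bad product"]
vocab = fit_vocab(reviews)
padded = pad(texts_to_ids(reviews, vocab), maxlen=5)
```

Every row of padded now has length 5, which is the fixed-length integer input an Embedding layer expects.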
Can you tell me what deploy.prototxt in a Caffe model is for?
A neural network has two phases: a training phase and a test phase. In the training phase we find the weights by means of a training algorithm. In the test phase we use the trained net for a specific task. In the Caffe library, each phase generally has its own architecture. For example, the CaffeNet convolutional network in the training phase is composed of:
data layer: this layer reads training data from the hard disk.
convolutional network: conv layers, relu layers, max-pooling layers, and inner product layers.
loss layer: Softmax with loss. It is needed to calculate the error between the labels and the output of the fc8 layer (see picture below) and then backpropagate the gradient.
In the test phase, by contrast, it is composed only of:
input layer: this layer reads data from memory. It is a mutable pointer in C++.
convolutional network: conv layers, relu layers, max-pooling layers, inner product layers, and softmax layer (named prob below).
Note that in the test phase the loss layer is not necessary. deploy.prototxt is the file that describes this test-phase (deploy) architecture.
The training architecture (left) and the test (deploy) architecture (right), as rendered by Netscope, are shown below.
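The difference between the two architectures can be summarized with a short Python sketch. The layer lists are hypothetical and heavily simplified (a real prototxt has many more layers and fields), and to_deploy is a helper invented for this illustration, not part of Caffe; only the layer type names follow Caffe's naming.

```python
# Simplified, assumed description of a training net as a list of layers.
train_net = [
    {"name": "data", "type": "Data"},            # reads training data from disk
    {"name": "conv1", "type": "Convolution"},
    {"name": "relu1", "type": "ReLU"},
    {"name": "pool1", "type": "Pooling"},
    {"name": "fc8", "type": "InnerProduct"},
    {"name": "loss", "type": "SoftmaxWithLoss"},  # only needed for backpropagation
]

def to_deploy(layers):
    """Swap the disk-backed data layer for an in-memory input and the loss for a softmax."""
    deploy = [{"name": "data", "type": "Input"}]
    deploy += [layer for layer in layers
               if layer["type"] not in ("Data", "SoftmaxWithLoss")]
    deploy.append({"name": "prob", "type": "Softmax"})
    return deploy

deploy_net = to_deploy(train_net)
deploy_types = [layer["type"] for layer in deploy_net]
```

The convolutional body is identical in both lists; only the first and last layers differ.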
I am implementing a machine learning module that should run on a Raspberry Pi that is currently shared among different services.
My idea is to store on the device only the code in charge of retrieving the inputs of the ML module and performing the prediction, together with the file containing the neural network model already fitted with Keras.
In other words, I would like to avoid installing all the Keras/TensorFlow packages and dependencies, since my purpose is only to run predictions with a trained model, not to train a new one.
Is there a way to do that? Are there any lightweight libraries that can load a neural network model (with all its weights and biases) and perform a prediction, given the inputs?
What I am able to do now is load on the Raspberry Pi an ".h5" file containing the model, weights and biases, but I still have to declare the model-building function through Keras.
from tensorflow.keras.models import load_model
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
def NN_model():
    '''
    Definition of the Neural Network model
    '''
    model = Sequential()
    model.add(Dense(7, input_dim=6, kernel_initializer='normal', activation='relu'))
    model.add(Dense(15, kernel_initializer='normal', activation='relu'))
    model.add(Dense(24, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

'''
Load NN model and use it to predict the radiation values
for the next 24 hours, hour by hour
'''
regr = KerasRegressor(build_fn=NN_model, epochs=1000, batch_size=5, verbose=0)
regr.model = load_model('saved_model.h5')
pred = regr.predict(input_row)
Since a fitted neural network is just a matter of weights and biases (and activation functions), I would expect that, once these parameters are determined, I wouldn't need the whole TensorFlow and Keras environment to map the inputs I give to the NN to an output.
What I would like to have is just something like:
import lightweight_module as lm
regression_model = lm.load_model('saved_model.h5')
prediction=regression_model.predict(inputs)
What you can do is prune your neural network while retaining the same accuracy. Pruning removes the connections between neurons that have not learned anything significant. It not only reduces the complexity of your network but also drastically reduces the storage space required and the inference time. I don't know of any such module in Keras (though I think people have made their own versions), but for frameworks like PyTorch and Caffe there are implementations for AlexNet and VGG networks that can reduce the size of the model by as much as 49x. You can find one such implementation here:
https://github.com/felzek/AlexNet-A-Practical-Implementation/blob/master/testModel.py
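The question's observation that a fitted dense network is only weights, biases and activation functions can be illustrated with a dependency-free forward pass. This is a sketch under assumptions: the tiny parameter values are made up, and in practice the real ones would have to be exported once (for instance from model.get_weights() on a machine that has Keras installed) into a plain format such as JSON.

```python
def relu(values):
    return [max(v, 0.0) for v in values]

def dense(inputs, weights, biases, activation=None):
    """weights[i][j] connects input i to output j (the layout of a Keras Dense kernel)."""
    outputs = [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]
    return activation(outputs) if activation else outputs

def predict(inputs, layers):
    """Run the inputs through each (weights, biases, activation) triple in turn."""
    for weights, biases, activation in layers:
        inputs = dense(inputs, weights, biases, activation)
    return inputs

# Toy parameters (hypothetical): 2 inputs -> 2 hidden units (ReLU) -> 1 linear output.
layers = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, 1.0], relu),
    ([[1.0], [-1.0]], [0.5], None),
]
prediction = predict([2.0, 4.0], layers)
clipped = predict([2.0, -4.0], layers)
```

For the model in the question, layers would hold three such triples with shapes 6x7, 7x15 and 15x24. This only covers plain Dense layers; other layer types would need their own forward functions.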
I wanted to add a tanh layer after an Embedding layer with the Keras functional API:
x=layers.Embedding(vocab_size, 8, input_length=max_length)(input)
output=keras.activations.tanh(x)
model = Model(inputs=input, outputs=output)
model.compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(data, labels)
but the system told me I must use Keras layers, not tensors. I searched a lot of Keras tutorials, and the only solution I found is:
model.add(Activation('tanh'))
but that is for the Sequential model, which I don't want to use. Is there a way to solve this with the functional API?
With the functional API it's almost the same as with the Sequential model. Call the Activation layer (from keras.layers) on the tensor:
output = Activation('tanh')(x)