I have a trained network, which consists of the following layers:
{conv1, pool1, conv2, pool2, conv3, pool3, conv4, pool4, fc5, fc6, output}
where fc means fully connected layers and conv means convolutional layers.
I need to do feature extraction for some images. I am using Lasagne and Theano.
I need to save the features from each layer for later analysis. I am a newbie with these libraries, so I tried to find sample code or tutorials on this (with Theano/Lasagne), but I could not work out what I should do by myself.
I would appreciate it if someone could guide me on how to implement feature extraction.
Thank you in advance
Edit: I followed the comments by gntoni; here is my code:
feat_all = []
for layer in layer_list:
    feat = np.zeros_like(lasagne.layers.get_output([self.acnn.cnn[layer]], inputs = img, deterministic=True))
    feat[:] = lasagne.layers.get_output([self.acnn.cnn[layer]], inputs = img, deterministic=True)
    feat_all.append(feat)
For my case, I need to save features from each layer. I want to write a function like the one that we have in Caffe:
self.net.blobs['data'].data[0] = img
self.net.forward(end=layer_list[-1])
feat_all = []
for layer in layer_list:
    feat = np.zeros_like(self.net.blobs[layer].data[0])
    feat[:] = self.net.blobs[layer].data[0]
    feat_all.append(feat)
However, my trained model is written with Lasagne and Theano, so I have to implement this in Lasagne.
After writing the code above (in Lasagne), I am getting an empty output.
I wonder why, and how I can fix it.
Thank you in advance
A Convolutional Neural Network, like yours, consists of two parts:
The first one is the feature extraction part; in your case it consists of the conv-pool layers {conv1, pool1, conv2, pool2, conv3, pool3, conv4, pool4}.
The second is the classification part. In your network: {fc5, fc6, output}.
When training, the first part is trying to obtain the best representation of the input data to be classified by the second part.
So, if after training you disconnect these two parts, the output of the conv4 layer will give you the features you want.
These features can be used with a different classifier. In fact, many people take an already trained network (e.g. AlexNet), remove the last classification layers, and use the features with their own classification system.
Keep in mind that in Lasagne the get_output method returns Theano tensors, and you cannot directly use them to compute the features from a numpy array. However, you can define a Theano function and use it to compute the values. In your case:
layers = [self.acnn.cnn[layer] for layer in layer_list]
feat_fn = theano.function([input_var],
                          lasagne.layers.get_output(layers, deterministic=True))
where input_var is the input tensor to your network. The get_output method can accept multiple layers and Theano functions can have multiple outputs, so you can define a single function to extract all the features. Getting the numerical values is then as simple as:
feat_all = feat_fn(img)
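For completeness, here is a minimal sketch that puts the pieces together, following the question's self.acnn.cnn naming (an assumption on my part); the input variable is recovered from the network's InputLayer:

layers = [self.acnn.cnn[name] for name in layer_list]
# get_all_layers returns layers in topological order; the first one is the InputLayer
input_var = lasagne.layers.get_all_layers(layers[-1])[0].input_var

feat_fn = theano.function([input_var],
                          lasagne.layers.get_output(layers, deterministic=True))

# img should be a numpy array shaped like the network input, e.g. (1, channels, height, width)
feat_all = feat_fn(img.astype(np.float32))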
I'm currently developing a model using Keras + TensorFlow in order to determine the temperature range of a set of proteins. What I first did was create a pre-trained model that converts the proteins into embeddings and then predicts their respective temperatures.
What I want to do now is incorporate this pre-trained model into a new model that uses the given model and its weights as a starting point, then fit it on a new dataset and predict once again. The code for the new top model is:
UPDATED CODE
# Load Pretrained Model
loaded_model = keras.models.load_model('pretrained_model')
#Freeze all model layer weights
loaded_model.trainable = False
input1 = np.expand_dims(x_train['input1'],1)
input2 = np.expand_dims(x_train['input2'], 1)
input3 = x_train['input3']
#Redefine Input Layers for ANN
input1 = Input(shape = (input1.shape[1],), name = "input1")
input2 = Input(shape = (input2.shape[1],), name = "input2")
input3 = Input(shape = (input3.shape[1],), name = "input2")
base_inputs = [input1, input2, input3]
x = loaded_model(base_inputs, training = False)
x = Dense(64, activation = "relu", kernel_regularizer=regularizers.l2(0.01))(x)
output = Dense(1, activation = "sigmoid")(x)
top_model = Model(inputs = base_inputs, outputs = output)
# Compile the Model
top_model.compile(loss='mse', optimizer = Adam(lr = 0.0001), metrics = ['mse'])
This is not working correctly and I'm not sure how to get it up and running. I keep running into this error:
AttributeError: 'Dense' object has no attribute 'shape'
Any thoughts?
Could you please try initializing the inputs using Keras layers?
You have initialized the input shapes using Numpy.
But, if I am right, the Dense layer which you have imported from Keras unfortunately does not support this ('Dense' object has no attribute 'op').
Kindly note that 'Input' is a Keras layer.
Could you try to initialize the Keras inputs as specified in the following link (https://keras.io/guides/functional_api/)?
As an Example,
input1 = keras.Input(shape=(1,))
input2 = keras.Input(shape=(1,))
input3 = keras.Input(shape=(1,))
It totally depends on your machine learning architecture whether to make layers trainable or not. In the case of transfer learning, you can reuse the trained weights from a pre-trained model and train your new network on top of them. In that case, you have to freeze the layers of the pre-trained model, hence trainable = False. These weights then feed the hidden layers you add in your custom architecture.
But from your code snippet, it looks like you are not using any hidden layers such as LSTM or RNN cells for your sequential data. Also, you are trying to feed initialized NumPy inputs to a pre-trained model; I don't think that is the right way to do it. From the code snippet, you are making the base layers non-trainable, but you are then trying to train the model.
Also, if I am right, you want to train on a new set of data using the pre-trained model, right? If so, kindly look at the following link (https://keras.io/guides/transfer_learning/).
Considering your problem, I would suggest that a transfer learning approach is a feasible solution. In transfer learning, you use a model trained on one domain of data to tackle a similar problem on another set of data. How to make layers trainable, how to freeze them, and how to fine-tune are all explained in the following link (https://keras.io/guides/transfer_learning/).
Concerning the AttributeError, it is recommended to have a look at the following link (https://keras.io/guides/functional_api/).
First, you have to initialize the Keras input node with the shape of the data you will feed in to train the model. An example is shown below:
inputs = keras.Input(shape=(784,))
or it can be something like the following, if you are providing image data:
img_inputs = keras.Input(shape=(32, 32, 3))
The Dense layer expects its input to have a specific shape, which you can determine from your data. If you are not sure about it, please analyse the data first; it will give you a lot of information to proceed further.
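Putting this together, here is a hedged sketch of how the top model from the question could be wired once the inputs are defined as Keras Input layers; the input shapes here are assumptions and must match what the pre-trained model expects, while the layer sizes, loss, and learning rate are taken from the question:

from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Load and freeze the pre-trained model (path taken from the question)
loaded_model = keras.models.load_model('pretrained_model')
loaded_model.trainable = False

# Symbolic Keras inputs; shapes are assumptions and must match the base model
input1 = keras.Input(shape=(1,), name="input1")
input2 = keras.Input(shape=(1,), name="input2")
input3 = keras.Input(shape=(x_train['input3'].shape[1],), name="input3")
base_inputs = [input1, input2, input3]

# Run the frozen base model in inference mode, then add a new trainable head
x = loaded_model(base_inputs, training=False)
x = layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01))(x)
output = layers.Dense(1, activation="sigmoid")(x)

top_model = keras.Model(inputs=base_inputs, outputs=output)
top_model.compile(loss='mse',
                  optimizer=keras.optimizers.Adam(learning_rate=0.0001),
                  metrics=['mse'])

The NumPy arrays (x_train['input1'], etc.) are then passed to top_model.fit(...) as training data, rather than being used to define the input layers.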
So I want to predict the number of stars a product gets on Amazon through Keras. I have seen other ways of doing this, but I have used the Universal Sentence Encoder with one-hot encoding (I followed a YouTube tutorial to embed the reviews). Now, without using an LSTM layer and using the following layers:
model.add(keras.layers.Dense(units=256,input_shape=(X_train.shape[1], ),activation='relu'))
model.add(keras.layers.Dropout(rate=0.5))
model.add(keras.layers.Dense(units=128,activation='relu'))
model.add(keras.layers.Dropout(rate=0.5))
model.add(keras.layers.Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(0.0001),metrics= ['accuracy'])
I am able to get an accuracy of around 0.55 and a loss of 1, which isn't great. However when I reshape my X_train and X_test data to be 3D input for an LSTM layer and then put it into a model such as:
model.add(keras.layers.Dense(units=256,input_shape=(512, 1), activation='relu'))
model.add(keras.layers.Dropout(rate=0.5))
model.add(keras.layers.Bidirectional(keras.layers.LSTM(100, dropout=0.2, recurrent_dropout=0.3)))
model.add(keras.layers.Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(0.0001),metrics= ['accuracy'])
I get an accuracy of around 0.2 which is even worse, with a loss of close to 2.00.
I have no idea whether an LSTM is necessary as I am new to neural networks but I have been trying to do this for my project.
So I am asking: should I stick with the first model without an LSTM, or is there a way of changing the second, LSTM-based network to get better accuracy than 0.2 while using the embedding methods that I have used?
Thanks for your time!
The reason to choose an LSTM instead of plain neurons is that in language there are relationships between words, and those relationships are important for understanding what a sentence means. A model with only Dense layers cannot do this well because it has no connections that can store such information; it just predicts by looking at the whole picture and not at the connections the words have between them. Coming to LSTMs, they stand for Long Short-Term Memory: in short, they have the capability to remember data they have seen previously, which helps them create connections between different words in the same sentence.
Coming to how you would go about creating your model: first, you need the Tokenizer from the TF library to create tokens out of your data, then convert your sequences into numbers through it, and then pad your data using pad_sequences. Your data is then ready. In your network, the first layer should be an Embedding layer. After it you can have LSTM layers (I have explained why you should use them), Bidirectional LSTM layers (they learn dependencies from left-to-right and right-to-left, and perform better than a unidirectional LSTM), or Conv1D layers (depending on the filter size, they can model dependencies lying within the filter length; they have been used and work, so you can try them), followed by a pooling layer (GlobalMaxPooling1D) and then Dense layers to get your predictions.
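A rough sketch of that pipeline follows, assuming train_texts is a list of raw review strings; the vocabulary size, sequence length, and layer widths are placeholder choices rather than values from the question, and the Universal Sentence Encoder embeddings from the question are replaced by a trainable Embedding layer:

from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000   # assumed vocabulary size
max_len = 200        # assumed maximum review length

# Tokenize the raw reviews and pad them to a fixed length
tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
tokenizer.fit_on_texts(train_texts)
X_train = pad_sequences(tokenizer.texts_to_sequences(train_texts),
                        maxlen=max_len, padding='post')

model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 128, input_length=max_len),
    keras.layers.Bidirectional(keras.layers.LSTM(64, return_sequences=True)),
    keras.layers.GlobalMaxPooling1D(),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(5, activation='softmax'),
])
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(0.0001),
              metrics=['accuracy'])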
I want to extract 1000 image features using the pretrained Xception model,
but the Xception model's last layer (avg_pool) gives 2048 features.
Can I reduce the number of output features without additional training?
I want the image features before the softmax, not the prediction result.
base_model = xception.Xception(include_top=True, weights='imagenet')
base_model.summary()
self.model = Model(inputs = base_model.input, outputs=base_model.get_layer('avg_pool').output)
This model was trained to produce embeddings in a 2048-dimensional space for the classifier after it. There is no sense in trying to reduce the dimensionality of the embedding space, unless you are combining very complex and inflexible models. If you are just doing simple transfer learning with no memory constraints, just snap your new classifier (extra layers) on top of it, and retrain after freezing (or not) all layers in the original Xception. That should work regardless of the Xception output_shape. See the Keras docs.
That said, if you REALLY need to reduce dimensionality to 1000-d, you will need a method that preserves (or at least tries to preserve) the original topology of the embedding space, otherwise your model will not benefit at all from transfer learning. Take a look at PCA, SVD, or T-SNE.
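If the 1000-dimensional output is really required, a PCA-based sketch could look like the following; the use of scikit-learn here is my assumption, and images stands for a batch of preprocessed inputs of shape (n_samples, 299, 299, 3):

from tensorflow.keras.applications import xception
from tensorflow.keras.models import Model
from sklearn.decomposition import PCA

base_model = xception.Xception(include_top=True, weights='imagenet')
feat_model = Model(inputs=base_model.input,
                   outputs=base_model.get_layer('avg_pool').output)

# Extract the 2048-dimensional avg_pool features
features_2048 = feat_model.predict(images)

# Project down to 1000 dimensions; PCA needs at least 1000 samples to fit
pca = PCA(n_components=1000)
features_1000 = pca.fit_transform(features_2048)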
I am trying to build a Stacked Autoencoder in Keras (tf.keras). By stacked I do not mean deep. All the examples I found for Keras generate e.g. 3 encoder layers and 3 decoder layers, train them, and call it a day. However, it seems the correct way to train a Stacked Autoencoder (SAE) is the one described in this paper: Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
In short, a SAE should be trained layer-wise as shown in the image below. After layer 1 is trained, it is used as input to train layer 2. The reconstruction loss should then be computed against layer 1, not against the input layer.
And here is where my trouble begins. How to tell Keras which layers to use the loss function on?
Here is what I do. Since the Autoencoder module no longer exists in Keras, I build the first autoencoder, and I set its encoder's weights (trainable = False) in the first layer of a second autoencoder with two layers in total. Then, when I train that, it obviously compares the reconstructed layer out_s2 with the input layer in_s, instead of with layer 1 (hid1).
# autoencoder layer 1
in_s = tf.keras.Input(shape=(input_size,))
noise = tf.keras.layers.Dropout(0.1)(in_s)
hid = tf.keras.layers.Dense(nodes[0], activation='relu')(noise)
out_s = tf.keras.layers.Dense(input_size, activation='sigmoid')(hid)
ae_1 = tf.keras.Model(in_s, out_s, name="ae_1")
ae_1.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])
# autoencoder layer 2
hid1 = tf.keras.layers.Dense(nodes[0], activation='relu')(in_s)
noise = tf.keras.layers.Dropout(0.1)(hid1)
hid2 = tf.keras.layers.Dense(nodes[1], activation='relu')(noise)
out_s2 = tf.keras.layers.Dense(nodes[0], activation='sigmoid')(hid2)
ae_2 = tf.keras.Model(in_s, out_s2, name="ae_2")
ae_2.layers[0].set_weights(ae_1.layers[0].get_weights())
ae_2.layers[0].trainable = False
ae_2.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])
The solution should be fairly easy, but I can't see it nor find it online. How do I do that in Keras?
It seems like the question is outdated by looking at the comments. But I'll still answer this as the use-case mentioned in this question is not just specific to autoencoders and might be helpful for some other cases.
So, when you say "train the whole network layer by layer", I would rather interpret it as "train small networks with one single layer in a sequence".
Looking at the code posted in this question, it seems that the OP has already built small networks. But both these networks do not consist of one single layer.
The second autoencoder here takes as input the input of the first autoencoder. But it should actually take as input the output of the first autoencoder.
So you train the first autoencoder and collect its predictions after it is trained. Then you train the second autoencoder, which takes as input the output (predictions) of the first autoencoder.
Now let's focus on this part: "After layer 1 is trained, it's used as input to train layer 2. The reconstruction loss should be compared with the layer 1 and not the input layer."
Since the network takes as input the output of layer 1 (autoencoder 1 in the OP's case), it will be comparing its output with this. The task is achieved.
But to achieve this, you will need to write the model.fit(...) line which is missing in the code provided in the question.
Also, just in case you want the model to calculate the loss against the input layer, you simply set the y parameter in model.fit(...) to the input of autoencoder 1.
In short, you just need to decouple these autoencoders into tiny networks with a single layer each and then train them as you wish. There is no need to use trainable = False now, or use it as you wish.
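To make that concrete, here is a sketch of the decoupled, layer-wise training described above, reusing the variable names from the question's code (x_train, the epoch count, and the batch size are my assumptions):

# 1) Train the first autoencoder on the raw input
ae_1.fit(x_train, x_train, epochs=50, batch_size=128)

# 2) Collect the hidden representation produced by the trained first layer
enc_1 = tf.keras.Model(in_s, hid)        # encoder part of ae_1
hid1_data = enc_1.predict(x_train)

# 3) Build the second autoencoder as an independent network and train it on
#    that representation, so its loss is computed against layer 1, not the input
in_2 = tf.keras.Input(shape=(nodes[0],))
noise_2 = tf.keras.layers.Dropout(0.1)(in_2)
hid_2 = tf.keras.layers.Dense(nodes[1], activation='relu')(noise_2)
out_2 = tf.keras.layers.Dense(nodes[0], activation='sigmoid')(hid_2)
ae_2 = tf.keras.Model(in_2, out_2, name="ae_2")
ae_2.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])
ae_2.fit(hid1_data, hid1_data, epochs=50, batch_size=128)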
I'm trying to replicate the code in the blog article "How convolutional neural networks see the world".
It works well on a CNN with no dropout layer, but when there are one (or more) dropout layers, I can't directly use the layer.output line because it expects a learning phase.
When I use the recommended way to extract the output of a layer:
get_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                              [model.layers[layer_index].output])
layer_output = get_layer_output([input_img, 0])[0]
The problem is that I can't pass a placeholder as input_img because it expects "real" data, but if I pass "real" data directly then the rest of the code doesn't work (creating the loss, the gradients, and iterating needs a placeholder).
Is there a way I can make this work?
I'm using the Tensorflow backend.
EDIT: I solved my issue by calling the K.set_learning_phase() method before doing anything else, including building my model (I had to start from a fresh environment and call the method right after the imports).
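For reference, a minimal sketch of that fix (the model-building code, layer_index, and input_img are placeholders from the question, not shown here):

from keras import backend as K

K.set_learning_phase(0)   # 0 = test mode, so Dropout layers are deactivated

# ... build or load the model only after setting the learning phase ...

# The learning phase no longer needs to be passed into the function
get_layer_output = K.function([model.layers[0].input],
                              [model.layers[layer_index].output])
layer_output = get_layer_output([input_img])[0]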