Is it possible in Keras for the training of some outputs in a multi-output model to start at a later epoch than the others? In my case, one of the outputs takes another output as its input, but at the beginning of training that other output is quite premature, and processing it adds a huge computational burden to the model. The output whose training I would like to postpone is a custom layer that applies image-processing operations to an image generated by another output; since that generated image is essentially meaningless during the first epochs, applying the custom layer to it then seems like a waste of time. Is there a way to do this? Just as we can weight each output's loss, can we give each output's loss a different starting epoch?
1. Build a model that does not contain the later output.
2. Train that model to the degree you want.
3. Build a new model that incorporates the old model into it.
4. Compile the new model with the new loss functions you want.
5. Train that model.
To elaborate on step 3: Keras models can be used like layers in Keras' functional API.
You can build a normal model like so:
from keras.layers import Input, Dense
from keras.models import Model

input = Input((100,))
x = Dense(50)(input)
x = Dense(1, activation='sigmoid')(x)
model = Model(input, x)
However, if you have another standard Keras model, it can be used just like any other layer. For example, if we have a model (created with Sequential(), Model(), or keras.models.load_model()) called model1, we can put it in like this:
input = Input((100,))
x = model1(input)
x = Dense(1, activation='sigmoid')(x)
model = Model(input, x)
This would be the equivalent of putting in each layer in model1 individually.
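Putting the steps together for the question above, here is a minimal sketch of the two-stage setup. The layer sizes, losses, loss weights, and the Dense stand-in for the custom image-processing layer are assumptions for illustration, not taken from the question:
from keras.layers import Input, Dense
from keras.models import Model

# Stage 1: a model with only the "early" output (the generated image)
inp = Input((100,))
h = Dense(256, activation='relu')(inp)
generated = Dense(28 * 28, activation='sigmoid')(h)
stage1 = Model(inp, generated)
stage1.compile(optimizer='adam', loss='mse')
# stage1.fit(x, y_image, epochs=10)  # train until the generated image is usable

# Stage 2: reuse stage1 as a layer and add the postponed output on top of it
inp2 = Input((100,))
img = stage1(inp2)  # stage1's weights keep training unless you set stage1.trainable = False
processed = Dense(28 * 28, activation='sigmoid')(img)  # stand-in for the custom image-processing layer
stage2 = Model(inp2, [img, processed])
stage2.compile(optimizer='adam', loss=['mse', 'mse'], loss_weights=[1.0, 0.5])
# stage2.fit(x, [y_image, y_processed], epochs=...)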
I'm currently developing a model using Keras + TensorFlow to determine the temperature range of a set of proteins. What I did first was create a pre-trained model that converts the proteins into embeddings and then predicts their respective temperature.
What I want to do now is incorporate this pre-trained model into a new model, which should use the given model and its weights as its base, then fit on a new dataset and predict once again. The code for the new top model is:
UPDATED CODE
# Load pretrained model
loaded_model = keras.models.load_model('pretrained_model')
#Freeze all model layer weights
loaded_model.trainable = False
input1 = np.expand_dims(x_train['input1'],1)
input2 = np.expand_dims(x_train['input2'], 1)
input3 = x_train['input3']
#Redefine Input Layers for ANN
input1 = Input(shape = (input1.shape[1],), name = "input1")
input2 = Input(shape = (input2.shape[1],), name = "input2")
input3 = Input(shape = (input3.shape[1],), name = "input3")
base_inputs = [input1, input2, input3]
x = loaded_model(base_inputs, training = False)
x = Dense(64, activation = "relu", kernel_regularizer=regularizers.l2(0.01))(x)
output = Dense(1, activation = "sigmoid")(x)
top_model = Model(inputs = base_inputs, outputs = output)
# Compile the Model
top_model.compile(loss='mse', optimizer = Adam(lr = 0.0001), metrics = ['mse'])
This is not working correctly and I'm not sure how to get it up and running. I'm struggling a bit with this and keep running into the following error:
AttributeError: 'Dense' object has no attribute 'shape'
Any thoughts?
Could you please try initializing the inputs using Keras layers?
You have initialized the inputs using NumPy arrays.
But, if I am right, the Dense layer you have imported from Keras unfortunately does not support this ('Dense' object has no attribute 'op').
Kindly note that 'Input' is a Keras layer.
Could you try initializing the Keras inputs as specified in the following link (https://keras.io/guides/functional_api/)?
As an example,
input1 = keras.Input(shape=(1,))
input2 = keras.Input(shape=(1,))
input3 = keras.Input(shape=(1,))
Whether to make layers trainable or not depends entirely on your machine learning architecture. In the case of transfer learning, you take the trained weights from a pre-trained model and train your new network on top of them. In that case you have to freeze the layers of the pre-trained model, hence trainable = False; those weights are then used in the computations of the hidden layers of your custom architecture.
But from your code snippet, I gather that you are not using any hidden layers such as LSTM, RNN or other cells for your sequential data. Also, you are trying to feed the initialized NumPy arrays to a pre-trained model, and I don't think that is the right way to do it. From the code snippet, you are making the layers non-trainable, but you are then trying to train the model.
Also, if I am right, you want to train on a new set of data using the pre-trained model, correct? If so, kindly look at the following link (https://keras.io/guides/transfer_learning/).
Considering your problem, I would suggest that a transfer learning approach is a feasible solution. In transfer learning, you use a model trained on one set of data to train a similar kind of problem on another set of data. How to freeze layers, make them trainable again, and fine-tune the model is explained in the same link (https://keras.io/guides/transfer_learning/), and is sketched below.
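A rough illustration of that guide's freeze-then-fine-tune pattern (the input shape, layer sizes, learning rates and data names here are placeholders, not taken from your code):
from tensorflow import keras
from tensorflow.keras import layers

base = keras.models.load_model('pretrained_model')
base.trainable = False                      # freeze all pre-trained weights

inputs = keras.Input(shape=(128,))          # placeholder input shape
x = base(inputs, training=False)            # keep BatchNorm/Dropout in inference mode
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer=keras.optimizers.Adam(1e-3), loss='mse')
# model.fit(new_x, new_y, epochs=10)        # only the new head is trained

# Optional fine-tuning: unfreeze the base and retrain with a very low learning rate
base.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5), loss='mse')
# model.fit(new_x, new_y, epochs=5)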
Concerning the AttributeError, it is recommended to have a look at the following link (https://keras.io/guides/functional_api/).
First, you have to initialize the Keras input node with the shape of the data you will feed to the model during training. An example is shown below:
inputs = keras.Input(shape=(784,))
or it can be something like as follows, if you are providing the image data
img_inputs = keras.Input(shape=(32, 32, 3))
The Dense layer expects its input to be of a specific shape, which you can determine from your data. If you are not sure about it, please analyse the data first; it will give you a lot of information on how to proceed. Applied to your snippet, the input definitions could look like the sketch below.
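A possible correction of the top model from the question, using keras.Input instead of NumPy arrays for the graph inputs (the (1,) shapes mirror the expand_dims calls in the question and may need adjusting to your data):
from tensorflow import keras
from tensorflow.keras import layers, regularizers

loaded_model = keras.models.load_model('pretrained_model')
loaded_model.trainable = False

# Keras Input layers define the graph; the NumPy arrays are only passed to fit()
input1 = keras.Input(shape=(1,), name="input1")
input2 = keras.Input(shape=(1,), name="input2")
input3 = keras.Input(shape=(x_train['input3'].shape[1],), name="input3")
base_inputs = [input1, input2, input3]

x = loaded_model(base_inputs, training=False)
x = layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(0.01))(x)
output = layers.Dense(1, activation="sigmoid")(x)
top_model = keras.Model(inputs=base_inputs, outputs=output)
top_model.compile(loss='mse', optimizer=keras.optimizers.Adam(learning_rate=1e-4), metrics=['mse'])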
I am trying to implement a model that uses encodings from multiple pre-trained BERT models on different datasets and gets a combined representation through a fully-connected layer. I want the BERT models to remain fixed and only the fully-connected layers to be trained. Is it possible to achieve this in huggingface-transformers? I don't see any flag that allows me to do that.
PS: I don't want to go by the way of dumping the encoding of inputs for each BERT model and use them as inputs.
A simple solution to this is to just exclude the parameters related to the BERT model while passing to the optimizer.
param_optimizer = list(model.named_parameters())  # (name, parameter) pairs
param_optimizer = [p for n, p in param_optimizer if 'bert' not in n]
optimizer = AdamW(param_optimizer, lr=lr)
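For context, a minimal sketch of that setup with two frozen BERT encoders feeding one trainable fully-connected layer (the checkpoint names, pooling choice and layer sizes are assumptions):
import torch
import torch.nn as nn
from transformers import AdamW, BertModel

class CombinedBert(nn.Module):
    """Two BERT encoders whose pooled outputs feed one fully-connected layer."""
    def __init__(self):
        super().__init__()
        self.bert_a = BertModel.from_pretrained('bert-base-uncased')  # placeholder checkpoints
        self.bert_b = BertModel.from_pretrained('bert-base-cased')
        self.fc = nn.Linear(2 * 768, 768)

    def forward(self, ids_a, mask_a, ids_b, mask_b):
        out_a = self.bert_a(input_ids=ids_a, attention_mask=mask_a).pooler_output
        out_b = self.bert_b(input_ids=ids_b, attention_mask=mask_b).pooler_output
        return self.fc(torch.cat([out_a, out_b], dim=-1))

model = CombinedBert()

# Only non-BERT parameters (here: the fc layer) are passed to the optimizer,
# so the BERT weights are never updated
params = [p for n, p in model.named_parameters() if 'bert' not in n]
optimizer = AdamW(params, lr=2e-5)
Note that gradients are still computed for the excluded parameters; leaving them out of the optimizer only prevents updates. If you also want to skip their backward pass, additionally set requires_grad = False on the BERT parameters.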
I am new to Keras and I'm starting with this code from a TensorFlow tutorial:
# choosing the layers of my model
model = keras.Sequential([                        # the Sequential model from the Keras library
    keras.layers.Flatten(input_shape=(28, 28)),   # the input layer
    keras.layers.Dense(128, activation='relu'),   # the hidden layer
    keras.layers.Dense(10)                        # the output layer; 10 corresponds to the number of classes
])
I wonder what the value 128 is, and how it was calculated?
It's not calculated, it's a hyperparameter (a parameter that isn't estimated by the data, but selected by you prior to running the model). It essentially determines the complexity of the model. The more neurons, the more complex relationships it can model in the data.
128 is a hyperparameter: the number of nodes in your second-to-last layer.
It isn't calculated; you can change it to whatever you want, e.g. try 18, 32, 64, etc. (see the sketch below). The larger you make it, the slower your training will be; however, your model might be more accurate, since there are more nodes to capture the signal in your dataset.
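For instance, a quick sketch of treating the width as a knob to tune (the alternative sizes and the build_model helper are arbitrary choices, not from the tutorial):
from tensorflow import keras

def build_model(hidden_units):
    """Same architecture as above, with the hidden-layer width as a parameter."""
    return keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(hidden_units, activation='relu'),
        keras.layers.Dense(10),
    ])

# Try a few widths and keep whichever scores best on a validation split
for units in (32, 64, 128, 256):
    model = build_model(units)
    # model.compile(...), model.fit(...), then evaluate on validation data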
I have a sequential model with a custom loss function for training. For prediction and validation, however, I want to remove one layer. Is there any way to do this? The easiest thing I could think of would be a custom metric that can get the value of the output of a previous layer without access to the input. Alternatively, I could run prediction and validation on a separate model, but I worry about constructing a separate model because I want the weights to be saved. Any suggestions? I have spent a lot of time on this, and everything I try has run into scope issues. I took a look at this: Keras, How to get the output of each layer? but every answer I see requires me to know the inputs.
You can create separate models. Each model will need to be compiled. My solution was of this form...
from keras.layers import Input, Conv2D
from keras.models import Model

inputs = Input(input_shape)
x = Conv2D(32, [3, 3])(inputs)
# pass the tensor through some more layers
# and finish the first model
model = Model(inputs=inputs, outputs=x)

input_2 = Input(input_shape)
second_output = model(input_2)
# pass the tensor through some more layers
second_model = Model(inputs=input_2, outputs=second_output)

model.compile(...)         # compile each model with its own loss and optimizer
second_model.compile(...)
Now any training done on second_model affects the weights of model, allowing you to train via second_model and predict with model, as in the sketch below.
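Applied to the question, a minimal sketch of the same idea, assuming the layer to drop at prediction time is the last one (the shapes, layer sizes and loss are placeholders):
from keras.layers import Input, Dense
from keras.models import Model

inp = Input((20,))
h = Dense(64, activation='relu')(inp)
core_out = Dense(10, activation='softmax')(h)
extra = Dense(10)(core_out)              # the layer you only want during training

train_model = Model(inp, extra)          # compile this one with the custom loss and fit it
predict_model = Model(inp, core_out)     # shares the same weights, stops before the extra layer

train_model.compile(optimizer='adam', loss='mse')   # your custom loss would go here
# train_model.fit(x, y)
# preds = predict_model.predict(x)       # no recompilation or weight copying needed
# train_model.save('model.h5')           # saving train_model keeps the shared weights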
AFAIK, we still need to create a model and add layers to it when using the functional API. So why do people say that the functional API is used to create non-sequential neural networks?
The thing is that with the Sequential model you are defining a model step by step (sequentially), every time you call the .add() method. With the Functional API (specifically the Model class), on the other hand, you have more freedom, as you can define different layers that receive different inputs and then instantiate a model with the Model constructor using any of those layers (not necessarily in a step-by-step, sequential way).
In other words, when calling model = Sequential() you are at that moment instantiating your model object (to which you then add layers and constraints). In the Functional API, you create layers and then instantiate your model by calling model = Model(inputs=in, outputs=out) with your desired input and output layer(s). As you can see, both approaches can be equivalent; for example, these two are the same:
from keras.models import Sequential, Model
from keras.layers import Input, Dense, Activation
#---Using the Sequential model
model = Sequential() #Object is instantiated here
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
#---Or using the Functional API
a = Input(shape=(784,))
b = Dense(32, activation='relu')(a)
model = Model(inputs=a, outputs=b) #Object is instantiated here
Considering only this, then choosing which way to go depends more on your personal style and coding preferences. Now, there is a major advantage of using the Functional API over the Sequential model, which is that you can share or reuse layers across different models.
When compiling and fitting a model, all its associated layers will be compiled and trained, so any other model that shares those layers will also reflect the changes. This gives you the freedom to do many things, such as obtaining sub-models of your network, redefining them, obtaining their intermediate outputs, and merging them into more complex models, without having to train each of those sub-models again.
To make it clearer, here is an example (based on this Keras Autoencoder Blog post) that illustrates what was discussed in the last paragraph:
from keras.layers import Input, Dense
from keras.models import Model
#Create an autoencoder, along with its encoder and decoder model
input_img = Input(shape=(784,))
encoded = Dense(32, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
#Here we define our autoencoder model: image -> encoding -> decoded image
autoencoder = Model(input_img, decoded)
#Now here is the advantage of the Functional API
#We can reuse those layers to obtain an encoder model (image -> encoding)
#as well as a decoder model (encoding -> image)
#but compile all three by just compiling and fitting the Autoencoder model
encoder = Model(input_img, encoded) #see how the 'encoded' layer is output
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(32,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))
#compile and fit with your data
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(X,Y,...)
After this, you will be able to make predictions with the encoder and decoder models individually (to visualize your encodings, for example), as well as with the autoencoder model as a whole. At this point, the following are equivalent:
#obtain reconstructed representation directly
autoencoder_imgs = autoencoder.predict(x_test)
#obtain reconstructed representation by joining encoder and decoder models
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
Hope this helps. Personally, I always try to use the Functional API instead, regardless of whether I want to reuse or recycle layers, as I find it more explicit; but that is up to you to decide.
Well, "sequential" is not really the most correct term, but it's the name chosen by Keras developers. All models work in sequence, of course.
The difference is:
The Sequential model is a straight line. You keep adding layers, every new layer takes the output of the previous layer. You cannot make creative graphs with branches.
The functional API Model is completely free to have as many ramifications, inputs and outputs as you need.
Example with the Sequential model:
from keras.models import Sequential
from keras.layers import *
#you create a model
model = Sequential()
#and you add layers
model.add(SomeKerasLayer(...))
model.add(AnotherKerasLayer(...))
#as you can see, this model is a straight line, you only add layers "sequentially"
Example with the functional API Model:
Now here we start creating really fancy models.
from keras.models import Model
from keras.layers import *
We start by defining input tensors. And we can have any number of inputs! (The sequential model is limited to one input, that you define in the first layer with input_shape).
input1 = Input(inputShape1)
#We can have more inputs if we want!
input2 = Input(inputShape2)
input3 = Input(inputShape3)
We work by creating layers and "calling layers with input tensors".
When we call a layer with an input tensor, we get an output tensor.
And we can create whatever path we want.
#Example: two separate layers taking two separate inputs:
output1 = SomeLayer(...)(input1)
output2 = AnotherLayer(...)(input2)
We can join two branches with different options, such as add, multiply, concatenate, etc.:
#joining the previous tensors output1 and output2
joined1_2 = Concatenate()([output1,output2])
We can reuse the same layers with different inputs, getting different outputs:
aLayer = AKerasLayer(...) #notice I'm creating this layer but not calling it yet
#calling the same layer with two different inputs
output1 = aLayer(joined1_2)
output2 = aLayer(input3)
And finally, we can define the model with as many inputs and outputs we want:
model = Model([input1,input2,input3],[output1, output2])
Reusing and associating models
Both models, Sequential and functional API, can be used as if they were layers.
You can call a model with an input tensor and get an output tensor, just like you would when creating a functional API model:
input1 = Input(shape)
output1 = anExistingSequentialModel(input1)
output2 = anExistingFunctionalModel(input1)
newModel = Model(input1,[output1,output2])
And you can add models to Sequential models as well (beware of branches: it's best for the added model to have exactly one input and one output, since a Sequential model is a straight line); a concrete version follows the snippet below:
seqModel = Sequential()
seqModel.add(anotherModel)
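For instance, a small concrete version of that (the layer sizes are arbitrary):
from keras.models import Sequential, Model
from keras.layers import Input, Dense

# A single-input, single-output functional model...
inp = Input((10,))
out = Dense(5, activation='relu')(inp)
innerModel = Model(inp, out)

# ...can be added to a Sequential model like any other layer
seqModel = Sequential()
seqModel.add(innerModel)
seqModel.add(Dense(1, activation='sigmoid'))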