I want to create a model which can predict two outputs. I did some research and found that there is a way to do it by creating two branches (one per output) using the functional API in TensorFlow Keras, but I have another approach in mind which looks like this:
i.e. given an input, I first want to predict output1 and then, based on that, predict output2.
How can this be done in TensorFlow?
Please also let me know how the training would work, i.e. how I would pass labels for output1 and output2 and compute the losses.
Thank you
You can do it with the functional API of TensorFlow. In rough pseudocode:
Inputs = your_input
x = hidden_layers()(Inputs)
Output1 = Dense(..., name='Output1')(x)
x = hidden_layers()(Output1)
Output2 = Dense(..., name='Output2')(x)
You can then split it into two models, if that is what you want:
model1 = tf.keras.models.Model(inputs=[Inputs], outputs=[Output1])
model2 = tf.keras.models.Model(inputs=[Inputs], outputs=[Output2])
Or have everything in one model:
model = tf.keras.models.Model(inputs=[Inputs], outputs=[Output2])
Output1_pred = model.get_layer('Output1').output  # works because the layer was named 'Output1' above
UPDATE:
In order to train a model with two outputs, you can split the model into two parts and train each part separately, as follows:
model1 = tf.keras.models.Model(inputs=[Inputs], outputs=[Output1])
model2 = tf.keras.models.Model(inputs=[model1.get_layer('Output1').output], outputs=[Output2])

model1.compile(...)
model1.fit(...)   # train the first part on (X, y1)

for layer in model1.layers:
    layer.trainable = False

model2.compile(...)
model2.fit(...)   # train the second part, feeding model1's predictions (or the true output1 labels) as its inputs
You can actually modify the great answer by @Mohammad to compose a single model with two outputs.
Inputs = your_input
x = hidden_layers()(Inputs)
Output1 = Dense()(x)
x = hidden_layers()(Output1)
Output2 = Dense()(x)
model = tf.keras.models.Model(inputs=[Inputs], outputs=[Output1, Output2])
model.compile(loss=[loss_1, loss_2], loss_weights=[0.5, 0.5], optimizer=sgd, metrics=['accuracy'])
Of course you can change the weights, optimiser and metrics according to your case.
Then the model has to be trained on data like (X, (y1, y2)), where y1 and y2 are the output1 and output2 labels respectively.
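A minimal training/prediction sketch for this joint model, assuming X, y1 and y2 are NumPy arrays and loss_1/loss_2 fit your two outputs:

model.fit(X, [y1, y2], epochs=10, batch_size=32)

# predict() returns one array per output
pred1, pred2 = model.predict(X)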
I am trying to train a multi-input (3) multi-output (4) model using Keras and I need to use a SINGLE loss function that takes in all the output predictions. 2 of these outputs are my true model outputs that I care about and have corresponding labels, while the other 2 outputs are learnable parameters from within my model that I want to use to dynamically update the loss weights for my true model outputs.
I need something like this:
model.compile(optimizer=optimizer, loss=unified_loss)
where the unified loss should have access to all my model outputs and corresponding labels. I am using tf.data.Dataset.from_tensor_slices(...) to train.
The only workaround I have found is to use a custom training loop, which allows this. But, I lose a lot of functionality and callbacks become trickier to implement.
Is there a way to solve this using the regular model.compile(...) and model.fit(...)?
Apart from a custom training loop, which is not preferred, I did try the standard approach of:
model.compile(optimizer=optimizer, loss=[loss1, loss2], loss_weights=[alpha, beta])
where I tried to make alpha and beta learnable parameters but this is not desired because I have a custom equation that is more involved than a simple weighted sum.
Add a layer to your model that stacks the outputs into a single tensor. Have your custom loss parse out each of the four values and run the necessary math on them. During inference, run the model without the extra layer.
The pattern of having a slightly different model for training and inference is a common one.
Here is an example of the basic idea:
import tensorflow as tf

# Inference model: three scalar inputs, four separate scalar outputs.
inp1 = tf.keras.Input((1,))
inp2 = tf.keras.Input((1,))
inp3 = tf.keras.Input((1,))
inputs = tf.keras.layers.Concatenate()([inp1, inp2, inp3])
out1 = tf.keras.layers.Dense(1)(inputs)
out2 = tf.keras.layers.Dense(1)(inputs)
out3 = tf.keras.layers.Dense(1)(inputs)
out4 = tf.keras.layers.Dense(1)(inputs)
model = tf.keras.Model([inp1, inp2, inp3], [out1, out2, out3, out4])

# Quick smoke test with a single dummy sample.
x1 = tf.convert_to_tensor([[1.]])
x2 = tf.convert_to_tensor([[1.]])
x3 = tf.convert_to_tensor([[1.]])
model((x1, x2, x3))
# Training model: same graph, but with the four outputs stacked into one
# tensor of shape (batch, 4, 1) so that a single loss sees all of them.
outs = tf.stack([out1, out2, out3, out4], axis=1)
training_model = tf.keras.Model([inp1, inp2, inp3], outs)
training_model((x1, x2, x3))
def exotic_loss(y_true, y_pred):
    # y_true and y_pred both carry all four values; pull them apart.
    true1, true2, true3, true4 = tf.unstack(y_true, num=4, axis=1)
    pred1, pred2, pred3, pred4 = tf.unstack(y_pred, num=4, axis=1)
    # Placeholder for whatever custom maths combines all outputs and labels.
    return true1 + true2 + true3 + true4 + pred1 + pred2 + pred3 + pred4
training_model.compile(loss=exotic_loss)
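Since the question mentions tf.data, here is a minimal sketch of feeding the training model that way (the toy data below is purely hypothetical; the labels are stacked to match the training model's (batch, 4, 1) output):

import numpy as np

n = 8
x1_data = np.random.rand(n, 1).astype('float32')
x2_data = np.random.rand(n, 1).astype('float32')
x3_data = np.random.rand(n, 1).astype('float32')
y_data = np.random.rand(n, 4, 1).astype('float32')  # all four labels per sample

ds = tf.data.Dataset.from_tensor_slices(((x1_data, x2_data, x3_data), y_data)).batch(4)
training_model.fit(ds, epochs=2)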
I want to create a custom loss function for a Keras deep learning regression model. For the custom loss function, I want to use a feature that is in the dataset but I am not using that particular feature as an input to the model.
My data looks like this:
X | Y | feature
---|-----|--------
x1 | y1 | f1
x2 | y2 | f2
The input to the model is X and I want to predict Y using the model. I want something like the following as the loss function:
def custom_loss(feature):
    def loss(y_true, y_pred):
        return root_mean_square(y_true - y_pred) + std(y_pred - feature)
    return loss
I can't use a wrapper function as above, because the feature values depend on the training and test batches and thus cannot be passed to the custom loss function at model compile time. How can I use the additional feature in the dataset to create a custom loss function?
EDIT:
I did the following based on an answer on this thread. When I make predictions using this model, does it predict 'Y' or a combination of Y and the additional feature? I want to make sure, because model.fit() takes both 'Y' and 'feature' as y during training, but model.predict() only gives one output. If the predictions are a combination of Y and the additional feature, how can I extract only Y?
def custom_loss(data, y_pred):
    y_true = data[:, 0]
    feature = data[:, 1]
    return K.mean(K.square((y_pred - y_true) + K.std(y__pred - feature)))
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(5, input_dim=1, activation="relu"))
    model.add(Dense(1, activation="linear"))
    return model

(train, test) = train_test_split(df, test_size=0.3, random_state=42)
model = create_model()
opt = Adam(learning_rate=1e-2, decay=1e-3/200)
model.compile(loss=custom_loss, optimizer=opt)
model.fit(train["X"], train[["Y", "feature"]], validation_data=(test["X"], test[["Y", "feature"]]), batch_size = 8, epochs=90)
predY = model.predict(test["X"]) # what does the model predict here?
First check the structure of the y you pass to the fit function and see whether it has the same structure as in the answer of the thread you are following; if you do things exactly right, it should solve your problem.
When I make predictions using this model, does it make predictions for 'Y' or a combination of Y and the additional feature?
The model will have exactly the output shape you defined. In your case, because the model output is Dense(1, activation="linear"), it has output shape y_pred.shape == (batch_size, 1), nothing more. You can be sure about that; print it with tf.print(y_pred) inside the loss to see for yourself.
Also, I don't know if it is a typing error, but the last line of your custom_loss function should be:
return K.mean(K.square((y_pred - y_true) + K.std(y_pred - feature)))
instead of
return K.mean(K.square((y_pred - y_true) + K.std(y__pred - feature)))
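To make the label structure explicit, here is a small sketch (assuming train and test are pandas DataFrames with columns X, Y and feature, as in the question):

import numpy as np

# Column 0 is Y and column 1 is the extra feature, matching data[:, 0]
# and data[:, 1] inside custom_loss.
y_fit = np.stack([train["Y"].values, train["feature"].values], axis=1)
model.fit(train["X"].values.reshape(-1, 1), y_fit, batch_size=8, epochs=90)

# predict() returns only the model's single output, i.e. the predicted Y,
# with shape (n_samples, 1); the feature is never predicted.
predY = model.predict(test["X"].values.reshape(-1, 1))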
You can also use .add_loss together with a simple mse loss in the following way:
inputs = Input(size)
output = YourLayers(inputs)
model = Model(inputs, output)
# feature_idx is the column of the feature inside the inputs
model.add_loss(K.std(tf.gather(inputs, [feature_idx], axis=1) - output))
model.compile(loss='mse', optimizer=opt)
BTW, it is strange that your regularizer is a standard deviation (the square root of a variance) while your loss is an mse. Maybe you would prefer them to be on the same squared scale (variance and mse), as people usually do (consider any L2 shrinkage, e.g. Ridge regression).
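A slightly more concrete sketch of this add_loss idea, under the assumption that the feature is passed in as a second input that only the loss term uses (layer sizes and names here are placeholders, not from the question):

from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

x_in = Input(shape=(1,), name='X')           # the real model input
feat_in = Input(shape=(1,), name='feature')  # only used inside the extra loss term
h = Dense(5, activation='relu')(x_in)
y_out = Dense(1, activation='linear')(h)

train_model = Model([x_in, feat_in], y_out)
# extra penalty on top of the mse that compile() adds
train_model.add_loss(K.std(y_out - feat_in))
train_model.compile(loss='mse', optimizer='adam')
# train_model.fit([X, feature], Y, epochs=90, batch_size=8)

# for inference, reuse the same layers but expose only the X input
infer_model = Model(x_in, y_out)
# predY = infer_model.predict(Xnew)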
I got the following data sample:
[1,2,1,4,5],[1,2,1,4,5],[0,2,7,0,1] with a label of [1,0,1]
....
[1,9,1,4,5],[1,5,1,4,5],[0,7,7,0,1] with a label of [0,1,1]
I can't train it on a single series of [1,2,1,4,5] with a label of 1 or 0, because the whole row carries meaningful context, so all 15 input digits should be considered together.
It's not your typical classification, and it doesn't seem like a regression problem either. Also, the data is not related to imagery; it's taken from a scientific domain.
Obviously I am feeding the data as a flat 15-node input to the net:
model = Sequential([
    Dense(units=16, input_shape=scaled_train_samples[0].shape, activation='relu'),
    Dense(units=32, activation='relu'),
    Dense(units=3, activation='???'),
])
Which activation output function would be ideal in such case?
I would recommend having 3 outputs to the network. Since the data can affect the 3 "sub-labels", the network only branches apart on the classification layer. If you want, you can add more layers to each specific branch.
I'm assuming that each "sub-label" is a binary classification, which is why I chose sigmoid (it returns a value between 0 and 1, so a larger number means the network favours class 1 over class 0).
To do this, you would have to change to the Functional API like this:
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

visible = Input(shape=scaled_train_samples[0].shape)
hidden = Dense(16, activation='relu')(visible)
hidden = Dense(32, activation='relu')(hidden)
hidden = Dense(16, activation='relu')(hidden)
out1 = Dense(units=1, activation='sigmoid', name='OUT1')(hidden)
out2 = Dense(units=1, activation='sigmoid', name='OUT2')(hidden)
out3 = Dense(units=1, activation='sigmoid', name='OUT3')(hidden)
finalModel = Model(inputs=visible, outputs=[out1, out2, out3])

optimizer = Adam(learning_rate=.0001)
losses = {
    'OUT1': 'binary_crossentropy',
    'OUT2': 'binary_crossentropy',
    'OUT3': 'binary_crossentropy',
}
finalModel.compile(optimizer=optimizer, loss=losses,
                   metrics={'OUT1': 'accuracy', 'OUT2': 'accuracy', 'OUT3': 'accuracy'})
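Training then needs one label array per named output; a minimal sketch, assuming y1, y2 and y3 are the three binary label columns of your data:

finalModel.fit(scaled_train_samples,
               {'OUT1': y1, 'OUT2': y2, 'OUT3': y3},
               epochs=30, batch_size=10)

# predictions come back as a list of three arrays, one per output
p1, p2, p3 = finalModel.predict(scaled_train_samples)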
So I have created a predictive model using Keras, which has an accuracy of about 60%-65%.
The data is passed to train_test_split (test_size=0.3 and so on) to get x_train, x_test, y_train, y_test for training and testing on the supervised data. Now, after all this, I have a new set of data, say xnew.
How do I use the model to predict the y values for this new data?
Where should I feed this xnew data for it to give me y?
The model:
model = Sequential()
model.add(Dense(10, input_shape=(4,), activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(101, activation='softmax'))

from keras.optimizers import Adam
model.compile(Adam(lr=.01), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=20, epochs=40, shuffle=True, verbose=2)

pred = model.predict(x_test, batch_size=10, verbose=2)
for i in pred:
    print(i)
When you have trained your model you can use model.save('your_model_name.h5') to save it. Then you can load it again using model = load_model('your_model_name.h5'). From there you can use model.predict(xnew), or perhaps model.predict_classes(xnew) if you have made a classifier. I suggest that you also look at the Model API.
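A minimal sketch of that workflow (the file name is just a placeholder):

from keras.models import load_model

model.save('my_model.h5')          # after training
model = load_model('my_model.h5')  # later, possibly in another script

pred = model.predict(xnew)         # class probabilities, shape (n_samples, 101)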
For a multi-class classifier model with a softmax output layer you can train the model using vectors like these:
x => [....] #some vector
y => [0,0,0,1,0,0,0,...]
where y is a one-hot vector indicating the probability of each category.
To predict, given some x, call y = model.predict(x); you will get a probability vector like [0.1, 0.05, 0.5, ....]. You then simply need to find the index with the maximum probability, e.g. category = numpy.argmax(y).
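Put together, a small sketch of that prediction step (xnew stands for any new input shaped like the training data):

import numpy as np

y = model.predict(xnew)           # probability vectors, shape (n_samples, n_classes)
category = np.argmax(y, axis=-1)  # index of the most probable class per sample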
I'm trying to combine two outputs that are produced by the same network, which makes predictions on a 4-class task and a 10-class task. I then combine these outputs into a length-14 array which I use as my final target.
While this seems to work, the predictions are always for a single class, so it produces a probability distribution that only selects 1 of the 14 options instead of 2. What I actually need it to do is provide 2 predictions, one for each task, and I want this all to be produced by the same model.
inputs = Input(shape=(100, 100), name='input')
lstm = LSTM(128)(inputs)
output1 = Dense(4, activation='softmax', name='output1')(lstm)
output2 = Dense(10, activation='softmax', name='output2')(lstm)
output3 = concatenate([output1, output2])
model = Model(inputs=[inputs], outputs=[output3])
My issue here is determining an appropriate loss function and method of prediction. For prediction I can simply grab the output of each softmax layer, but I'm unsure how to set up the loss functions so that both outputs get trained.
Any ideas?
Thanks a lot
You don't need to concatenate the outputs, your model can have two outputs:
inputs = Input(shape=(100, 100), name='input')
lstm = LSTM(128)(inputs)
output1 = Dense(4, activation='softmax', name='output1')(lstm)
output2 = Dense(10, activation='softmax', name='output2')(lstm)
model = Model(inputs=[inputs], outputs=[output1, output2])
Then to train this model, you typically use two losses that are weighted to produce a single loss:
model.compile(optimizer='sgd',
              loss=['categorical_crossentropy', 'categorical_crossentropy'],
              loss_weights=[0.2, 0.8])
Just make sure to format your data right, as each input sample now corresponds to two labeled outputs. For more information check the Functional API Guide.
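A minimal training sketch, assuming y4 and y10 are one-hot label arrays for the 4-class and 10-class tasks:

model.fit(x, [y4, y10], epochs=10, batch_size=32)

# prediction returns one array per output
pred4, pred10 = model.predict(x)
class4 = pred4.argmax(axis=-1)    # winner of the 4-class task
class10 = pred10.argmax(axis=-1)  # winner of the 10-class task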