Only the first output parameter is properly estimated during training of a multi-output regression net; the second and subsequent parameters merely seem to track the first. It looks as if the ground truth for the second output parameter is never used during training. How do I shape a tf.data.Dataset and feed it into model.fit() so that the second output parameter is trained?
import tensorflow as tf
import pandas as pd
from tensorflow import keras
from keras import layers
#create dataset from csv
file = pd.read_csv('minimalDataset.csv', skipinitialspace=True)
input = file["input"].values
output1 = file["output1"].values
output2 = file["output2"].values
dataset = tf.data.Dataset.from_tensor_slices((input, (output1, output2))).batch(4)
#create multi output regression net
input_layer = keras.Input(shape=(1,))
x = layers.Dense(20, activation="relu")(input_layer)
x = layers.Dense(60, activation="relu")(x)
output_layer = layers.Dense(2)(x)
model = keras.Model(input_layer, output_layer)
model.compile(optimizer="adam", loss="mean_squared_error")
#train model and make prediction (deliberately overfitting to illustrate problem)
model.fit(dataset, epochs=500)
prediction = model.predict(dataset)
minimalDataset.csv and predictions:
input  output1  output2  prediction_output1  prediction_output2
0      -1       1        -0.989956           -0.989964
1      2        0        1.834444            1.845085
2      0        2        0.640249            0.596099
3      1        -1       0.621426            0.646796
If I create two independent dense final layers, the second parameter is learned accurately, but I get two losses:
output_layer = (layers.Dense(1)(x), layers.Dense(1)(x))
Note: I want to use tf.data.Dataset because I build a 20k-element image/CSV dataset with it and do per-element transformations as preprocessing.
tf.data.Dataset.from_tensor_slices() slices along the first dimension, so the two label columns have to be combined into a single (N, 2) tensor before slicing; otherwise each element does not carry a length-2 target for the Dense(2) head. Transposing the stacked outputs achieves this:
dataset = tf.data.Dataset.from_tensor_slices((tf.transpose(input), tf.transpose([output1, output2]))).batch(4)
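Equivalently, a minimal sketch that stacks the two label columns into a single (N, 2) array before slicing (same effect as the transpose above):
import numpy as np
# Each dataset element now pairs one input scalar with a length-2 target,
# so Dense(2) receives ground truth for both outputs.
labels = np.stack([output1, output2], axis=-1)  # shape (N, 2)
dataset = tf.data.Dataset.from_tensor_slices((input, labels)).batch(4)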
Related
I am interested in using a neural network to estimate the parameters of a linear regression. To do this I am creating a network that makes a two-parameter prediction, and I am trying to write a custom loss function that measures how well those two parameters serve as the slope and intercept in a logistic regression model, using a third dataset as the predictor in that logistic regression.
So I have a matrix of predictors X, with dimensions 10,000 by 20, and a binary outcome variable y. Additionally, I have a 10,000-observation vector linear_predictor that I want to use in the custom loss function to evaluate the two outputs of the model.
import numpy as np
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense
import tensorflow as tf
# create some dummy data
X = np.random.rand(10_000, 20)
y = (np.random.rand(10_000) > 0.8).astype(int)
linear_predictor = np.random.rand(10_000)
# define custom loss function: y_pred[:, 0] is the slope, y_pred[:, 1] the intercept
def CustomLoss(y_true, y_pred, input_):
    # y_estim = y_pred[:, 0] * input_ + y_pred[:, 1]  # numpy-style indexing, also tried
    y_estim = tf.gather(y_pred, 0, axis=1) * input_ + tf.gather(y_pred, 1, axis=1)
    return tf.keras.losses.BinaryCrossentropy(from_logits=True)(y_true, y_estim)
# create inputs to model
lp_input = Input(shape=linear_predictor.shape)
X_input = Input(shape=X.shape)
y_input = Input(shape=y.shape)
# create network
hidden1 = Dense(32, activation='relu')(X_input)
hidden2 = Dense(8, activation='relu')(hidden1)
output = Dense(2, activation='linear')(hidden2)
model = Model([y_input, X_input, lp_input], output)
# add loss function
model.add_loss(CustomLoss(y_input, output, lp_input))
# fit model
model.fit(x=X_input, y=y_input, epochs=3)
However, I am unable to get the CustomLoss function to work. Something is going wrong when subsetting the model's two-parameter output to use one parameter as the slope and the other as the intercept.
The error I am getting is:
ValueError: Exception encountered when calling layer "tf.math.multiply_1" (type TFOpLambda).
Dimensions must be equal, but are 2 and 10000 for '{{node tf.math.multiply_1/Mul}} = Mul[T=DT_FLOAT](
Placeholder, Placeholder_1)' with input shapes: [?,2], [?,10000].
Call arguments received by layer "tf.math.multiply_1" (type TFOpLambda):
• x=tf.Tensor(shape=(None, 2), dtype=float32)
• y=tf.Tensor(shape=(None, 10000), dtype=float32)
• name=None
This suggests that the variable y_pred is not being subset, even though I have tried using the method recommended here with numpy-like indexing (y_pred[:1]) as well as the gather_nd method here, among others.
I think this should be possible, any help is appreciated.
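(A note on those shapes: Keras Input(shape=...) expects the per-sample shape and adds the batch dimension itself, so shape=X.shape bakes all 10,000 rows into the graph; that is where the [?, 10000] placeholder comes from, and it is why y_pred ends up 3-D, so y_pred[:, 0] is not a per-sample slope. A sketch of per-sample declarations, assuming that is the intent:)
lp_input = Input(shape=(1,))   # one linear-predictor value per observation
X_input = Input(shape=(20,))   # 20 predictors per observation
y_input = Input(shape=(1,))    # one binary outcome per observation
# With these shapes the model output is (batch, 2); y_pred[:, 0:1] * lp_input
# is then (batch, 1), and fit is given data arrays rather than symbolic tensors:
# model.fit(x=[y.reshape(-1, 1), X, linear_predictor.reshape(-1, 1)], epochs=3)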
I'm trying to test an LSTM model on the following time series:
As you can see it is stationary and periodic (not that this matters, but it should be pretty easy for a neural net to pick up). This is in fact a coordinate of a simple pendulum vs time.
The steps for preprocessing are the following:
1) Scale the array using MinMaxScaler.
2) Build sliding windows so the model predicts x[t] from x[t-1] up to x[t-5].
import numpy as np
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
X = scaler.fit_transform(x.reshape(-1, 1))
lookback = 5
features = 1
# each sample holds the previous 5 points; the label is the point that follows
model_input, labels = [], []
for i in range(X.shape[0] - lookback):
    model_input.append(X[i:i+lookback])
    labels.append(X[i+lookback])
model_input = np.asarray(model_input)
labels = np.asarray(labels)
model_input.shape, labels.shape
which returns ((495, 5, 1), (495, 1)); this makes sense because my t has 500 steps.
Then I build and train the model:
from keras.models import Sequential
from keras.layers import LSTM, Dense

#train on the first 400 steps, predict on the next 100
train_in, train_out = model_input[:400], labels[:400]
test_out = labels[400:]
model = Sequential()
model.add(LSTM(64, input_shape=(lookback, features)))  # input_shape is (timesteps, features); the batch dimension is implicit
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
#train
model.fit(train_in, train_out, epochs=30)
Finally, I want to test my model. I don't see the point of calling predict directly on the held-out inputs here. Instead, I want to use the last 5 coordinates of the training set to generate a prediction for the first step of the testing set, then feed that prediction back in as input to predict the next position, and so on...
Here is the code:
#now we make predictions
preds = []
#to make the first prediction on the test set, we start with the last window of the training set
preds_input = train_in[-1:]
for i in range(test_out.shape[0]):
    #the next step is the prediction on preds_input
    next_step = model.predict(preds_input, verbose=0)
    #append next_step to preds
    preds.append(next_step)
    #slide the window: append next_step and drop the oldest value so the shape stays (1, 5, 1)
    preds_input = np.append(preds_input, next_step.reshape(1, 1, 1), axis=1)
    preds_input = preds_input[:, 1:, :]
I then rescaled the predictions and the testing data using inverse_transform and plotted the results.
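That step looks roughly like this (a sketch; the variable names follow from the code above):
preds = np.asarray(preds).reshape(-1, 1)            # one prediction per test step, shape (n, 1)
preds_rescaled = scaler.inverse_transform(preds)
test_rescaled = scaler.inverse_transform(test_out)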
This is what I got
I'm not able to understand why my model performed so poorly. The pattern is simple and it should've been able to pick it up. Any help would be great!
I got the following data sample:
[1,2,1,4,5],[1,2,1,4,5],[0,2,7,0,1] with a label of [1,0,1]
....
[1,9,1,4,5],[1,5,1,4,5],[0,7,7,0,1] with a label of [0,1,1]
I can't train it on a single series like [1,2,1,4,5] with a label of 1 or 0, since the whole row carries meaningful context, so all 15 input digits should be inferred together.
It's not your typical classification, and it doesn't seem like a regression problem either. Also, the data is not imagery; it's taken from a scientific domain.
Obviously I am feeding the data to the net as a flat 15-node input:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(units=16, input_shape=scaled_train_samples[0].shape, activation='relu'),
    Dense(units=32, activation='relu'),
    Dense(units=3, activation='???'),
])
Which activation output function would be ideal in such case?
I would recommend giving the network 3 separate outputs. Since the data can affect all 3 "sub-labels", the network only branches apart at the classification layer; if you want, you can add more layers to each specific branch.
I'm assuming each "sub-label" is a binary classification, which is why I chose sigmoid (it returns a value between 0 and 1, so a larger number means the network favours class 1 over class 0).
To do this, you would have to change to the Functional API like this:
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

visible = Input(shape=scaled_train_samples[0].shape)
model = Dense(16, activation='relu')(visible)
model = Dense(32, activation='relu')(model)
model = Dense(16, activation='relu')(model)
# one sigmoid head per sub-label
out1 = Dense(units=1, activation='sigmoid', name='OUT1')(model)
out2 = Dense(units=1, activation='sigmoid', name='OUT2')(model)
out3 = Dense(units=1, activation='sigmoid', name='OUT3')(model)
finalModel = Model(inputs=visible, outputs=[out1, out2, out3])
optimizer = Adam(learning_rate=.0001)
losses = {
    'OUT1': 'binary_crossentropy',
    'OUT2': 'binary_crossentropy',
    'OUT3': 'binary_crossentropy',
}
finalModel.compile(optimizer=optimizer, loss=losses,
                   metrics={'OUT1': 'accuracy', 'OUT2': 'accuracy', 'OUT3': 'accuracy'})
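To train it, fit then expects one label array per named output (a sketch; y1, y2 and y3 are hypothetical names for the three binary sub-label columns, each of shape (num_samples,)):
finalModel.fit(scaled_train_samples,
               {'OUT1': y1, 'OUT2': y2, 'OUT3': y3},
               epochs=20, batch_size=32)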
I have trained a regression model that approximates the weights for the equation:
Y = R+B+G
For this, I provide pre-determined values of R, B, G and Y as training data, and after training, the model is successfully able to predict the value of Y for given values of R, B and G. I used a neural network with 3 inputs, one dense hidden layer with 2 neurons, and an output layer with a single neuron.
hidden = tf.keras.layers.Dense(units=2, input_shape=[3])
output = tf.keras.layers.Dense(units=1)
But I need to implement the inverse of this, i.e. train a model that takes in a value of Y and predicts the values of R, B and G that correspond to that value of Y.
I have just learnt that regression is all about a single output, so I am unable to think of a solution or a path to one.
Kindly help.
(P.S. Is it possible to use the model that I have already trained to do this? I mean, once the weights have been determined for R, B and G, can the model be manipulated to use those weights to map Y to R, B and G?)
Here is an example to get you started on solving your problem with a neural network in TensorFlow.
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# dummy data: one input (Y) and three regression targets (R, B, G)
X = np.random.random(size=(100, 1))
y = np.random.randint(0, 100, size=(100, 3)).astype(float)  # Regression

input1 = Input(shape=(1,))
l1 = Dense(10, activation='relu')(input1)
l2 = Dense(50, activation='relu')(l1)
l3 = Dense(50, activation='relu')(l2)
out = Dense(3)(l3)  # three linear outputs, one per predicted component
model = Model(inputs=input1, outputs=[out])
model.compile(
    optimizer='adam',
    loss=['mean_squared_error']
)
history = model.fit(X, [y], epochs=10, batch_size=64)
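Once trained, a quick usage sketch: feed a single Y value and read off the three predicted components (the query value here is arbitrary):
y_query = np.array([[0.5]])     # one Y value, shape (1, 1)
r_b_g = model.predict(y_query)  # shape (1, 3): predicted R, B, G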
I want to train a binary classifier using Keras and my training data is of shape (2000,2,128) and labels of shape (2000,) as Numpy arrays.
The idea is that each training example packs two embeddings together in a single array, and the label marks whether they are the same or different, using 0 or 1 respectively.
The training data looks like:
[[[0 1 2 ....128][129.....256]][[1 2 3 ...128][9 9 3 5...]].....]
and the labels looks like [1 1 0 0 1 1 0 0..].
Here is the code:
import keras
from keras.layers import Input, Dense
from keras.models import Model
frst_input = Input(shape=(128,), name='frst_input')
scnd_input = Input(shape=(128,), name='scnd_input')
x = keras.layers.concatenate([frst_input, scnd_input])
x = Dense(128, activation='relu')(x)
x = Dense(1, activation='softmax')(x)
model = Model(inputs=[frst_input, scnd_input], outputs=[x])
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[0.2], metrics=['accuracy'])
I am getting the following error while running this code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[ 0.07124118, -0.02316936, -0.12737238, ..., 0.15822273,
0.00129827, -0.02457245],
[ 0.15869428, -0.0570458 , -0.10459555, ..., 0.0968155 ,
0.0183982 , -0.077924...
How can I resolve this issue? Is my approach of training the classifier with two inputs correct?
Well, you have two options here:
1) Reshape the training data to (2000, 128*2) and define only one input layer:
X_train = X_train.reshape(-1, 128*2)

inp = Input(shape=(128*2,))
x = Dense(128, activation='relu')(inp)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[inp], outputs=[x])
2) Define two input layers, as you have already done, and pass a list of two input arrays when calling fit method:
# assuming X_train have a shape of `(2000, 2, 128)` as you suggested
model.fit([X_train[:,0], X_train[:,1]], y_train, ...)
Further, since you are doing binary classification here, you need to use sigmoid as the activation of the last layer (using softmax in this case would always output 1, since softmax normalizes the outputs so that they sum to one, and here there is only a single output).
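For completeness, a minimal compile-and-fit sketch for option 1 (X_train and y_train as described in the question; the epoch and batch-size values are arbitrary):
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train.reshape(-1, 128*2), y_train, epochs=10, batch_size=32)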