Error in reverse scaling outputs predicted by an LSTM RNN - python

I used an LSTM model to predict the future open price of a stock. The data was preprocessed and the model was built and trained without any errors, and I used StandardScaler to scale down the values in the DataFrame. But while retrieving the predictions from the model, when I used the scaler.inverse_transform() method it gave the following error.
ValueError: non-broadcastable output operand with shape (59,1) doesn't match the broadcast shape (59,4)
The complete code is in a Jupyter notebook too big to show directly, so I have uploaded it to a git repository.

This is because the model is predicting output with shape (59, 1), but your scaler was fit on a (251, 4) DataFrame. Either fit a new scaler on data shaped like your y values, or change your model's dense layer to output 4 values instead of 1.
The scaler will only accept data with the same number of features it was fit on when you call scaler.inverse_transform.
Old Code - Shape (n,1)
trainY.append(df_for_training_scaled[i + n_future - 1:i + n_future, 0])
Updated Code - Shape (n,4) - use all 4 outputs
trainY.append(df_for_training_scaled[i + n_future - 1:i + n_future,:])
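If you keep the 1-dimensional output instead, a minimal sketch of the first option is to fit a separate scaler on just the target column (the column name 'Open' is assumed here for illustration):
from sklearn.preprocessing import StandardScaler

feature_scaler = StandardScaler()
target_scaler = StandardScaler()
scaled_features = feature_scaler.fit_transform(df_for_training)  # shape (251, 4)
scaled_target = target_scaler.fit_transform(df_for_training[['Open']])  # shape (251, 1)
# ...train as before; a (59, 1) prediction can then be inverted directly:
# real_prices = target_scaler.inverse_transform(predictions)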

Normally you'd be scaling the independent variables (features), since differences in scale can affect model calculations, but the dependent variable that you're trying to predict is normally left untouched. There's usually no reason to re-scale the dependent variable, and scaling it makes results much harder to interpret.
The first line of documentation of StandardScaler class even specifies as much:
Standardize features by removing the mean and scaling to unit variance
You can optionally also scale labels, but once again this is not normally required.
So what I'd do in your place (assuming your original DataFrame contains 3 independent variables and 1 target variable) is this:
X = some_df.iloc[:, :3].values
y = some_df.iloc[:, 3].values
scaler = StandardScaler()
X = scaler.fit_transform(X)
# And then goes everything as usual
Now, when you go to predict values, you simply need to transform your input with the scaler in the same way it was done before.
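For example (X_new here stands for a hypothetical batch of new observations):
X_new_scaled = scaler.transform(X_new)  # reuse the scaler fitted on the training features
predictions = model.predict(X_new_scaled)  # outputs stay in the original target units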
The better way, though, would be to add a Normalization layer to your model as a pre-processing step. This way you just feed raw data into your estimator and it handles all the nitty-gritty for you. Similarly, you won't need to normalize data when generating predictions; the model will do everything for you. You could add something like:
from tensorflow.keras.layers.experimental.preprocessing import Normalization
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras import Model
# Here's your raw (non-normalized) X data
X = some_df.iloc[:, :3].values
norm = Normalization()
norm.adapt(X)
# Input takes the per-sample shape; the batch dimension is implicit
preprocess = Sequential([
    Input(shape=(3,)),
    norm
])
# Now finally, when you build your actual model you add the
# pre-processing step at the beginning
inp = Input(shape=(3,))
x = preprocess(inp)
x = Dense(64)(x)
x = Dense(128)(x)
x = Dense(1)(x)
model = Model(inputs=inp, outputs=x)
Here the pre-processing step is part of the model itself, so once you do that you can just feed it raw data without any additional transformations.
This is what it will do:
# Skipping the imports as they are the same as above + numpy
X = np.array([[1, 2, 3], [10, 20, 40], [100, 200, 400]])
norm = Normalization()
norm.adapt(X)
preprocess = Sequential([
    Input(shape=(3,)),
    norm
])
x_new = preprocess(X)
print(x_new)
Out: tf.Tensor(
[[-0.80538726 -0.80538726 -0.807901  ]
 [-0.60404044 -0.60404044 -0.6012719 ]
 [ 1.4094278   1.4094278   1.4091729 ]], shape=(3, 3), dtype=float32)
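You can verify the first column by hand; Normalization standardizes each feature using the mean and (population) standard deviation computed by adapt():
import numpy as np

col = np.array([1., 10., 100.])
print((col - col.mean()) / col.std())  # [-0.80538727 -0.60404045  1.40942772]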

Related

Subset model outputs in custom loss function in tensorflow/keras

I am interested in using a neural network to estimate the parameters of a linear regression. To do this I am creating a network that makes a two-parameter prediction, and I am trying to write a custom loss function that will determine how well the two parameters do as a slope and intercept in a logistic regression model, using a third dataset as a predictor in the logistic regression.
So I have a matrix of predictors X, with dimensions 10,000 by 20, and a binary outcome variable y. Additionally, I have 10,000 observations in linear_predictor that I want to use in the custom loss function to evaluate the two outputs of the model.
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense
# create some dummy data
X = np.random.rand(10_000, 20)
y = (np.random.rand(10_000) > 0.8).astype(int)
linear_predictor = np.random.rand(10_000)
# define custom loss function
def CustomLoss(y_true, y_pred, input_):
    y_estim = y_pred[:, 0] * input_ + y_pred[:, 1]  # numpy-style indexing attempt
    y_estim = tf.gather(y_pred, 0, axis=1) * input_ + tf.gather(y_pred, 1, axis=1)  # tf.gather attempt; overwrites the line above
    return tf.keras.losses.BinaryCrossentropy(from_logits=True)(y_true, y_estim)
# create inputs to model
lp_input = Input(shape=linear_predictor.shape)
X_input = Input(shape=X.shape)
y_input = Input(shape=y.shape)
# create network
hidden1 = Dense(32, activation='relu')(X_input)
hidden2 = Dense(8, activation='relu')(hidden1)
output = Dense(2, activation='linear')(hidden2)
model = Model([y_input, X_input, lp_input], output)
# add loss function
model.add_loss(CustomLoss(y_input, output, lp_input))
# fit model
model.fit(x=X_input, y=y_input, epochs=3)
However, I am unable to get the CustomLoss function to work. Something is going wrong when subsetting the model's two-parameter output to get one parameter to use as the slope and the other as the intercept.
The error I am getting is:
ValueError: Exception encountered when calling layer "tf.math.multiply_1" (type TFOpLambda).
Dimensions must be equal, but are 2 and 10000 for '{{node tf.math.multiply_1/Mul}} = Mul[T=DT_FLOAT](
Placeholder, Placeholder_1)' with input shapes: [?,2], [?,10000].
Call arguments received by layer "tf.math.multiply_1" (type TFOpLambda):
• x=tf.Tensor(shape=(None, 2), dtype=float32)
• y=tf.Tensor(shape=(None, 10000), dtype=float32)
• name=None
This suggests that the variable y_pred is not being subset, even though I have tried the method recommended here with numpy-like indexing (y_pred[:1]) as well as the gather_nd method here, among others.
I think this should be possible; any help is appreciated.
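For reference, the subsetting itself works as expected on concrete tensors, so the error likely comes from the Input definitions: Input(shape=X.shape) declares each individual sample to have shape (10000, 20), which is why the loss sees a (None, 10000) tensor for the linear predictor. A minimal check of the indexing with made-up values:
import tensorflow as tf

y_pred = tf.constant([[0.5, 1.0], [2.0, -1.0]])  # shape (batch, 2)
lp = tf.constant([10.0, 100.0])  # shape (batch,)
y_estim = y_pred[:, 0] * lp + y_pred[:, 1]
print(y_estim)  # tf.Tensor([  6. 199.], shape=(2,), dtype=float32)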

Determining the Right Shape for an RNN with an Integer Sequence Target

I am trying to build an RNN-based model in Tensorflow that takes a sequence of categorical values as an input, and a sequence of categorical values as the output.
For example, if I have sequence of 30 values, the first 25 would be the training data, and the last 5 would be the target. Imagine the data is something like a person pressing keys on a computer keyboard and recording their key presses over time.
I've tried to feed the training data and targets into this model in different shapes, and I always get an error that indicates the data is in the wrong shape.
I've included a code sample that should run and demonstrate what I'm trying to do and the failure I'm seeing.
In the code sample, I've used windows for batches. So if there are 90 values in the sequence, the first 25 values would be the training data for the first batch, and the next 5 values would be the target. The next batch would be the next 30 values (25 training values, 5 target values).
import numpy as np
import tensorflow as tf
from tensorflow import keras
num_categories = 20
data_sequence = np.random.choice(num_categories, 10000)
def create_target(batch):
    X = tf.cast(batch[:, :-5][:, :, None], tf.float32)
    Y = batch[:, -5:][:, :, None]
    return X, Y

def add_windows(data):
    data = tf.data.Dataset.from_tensor_slices(data)
    return data.window(20, shift=1, drop_remainder=True)
dataset = tf.data.Dataset.from_tensor_slices(data_sequence)
dataset = dataset.window(30, drop_remainder=True)
dataset = dataset.flat_map(lambda x: x.batch(30))
dataset = dataset.batch(5)
dataset = dataset.map(create_target)
model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(num_categories, activation="softmax"))
])
optimizer = keras.optimizers.Adam()
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer)
model.fit(dataset, epochs=1)
The error I get when I run the above code is
Node: 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
logits and labels must have the same first dimension, got logits shape [125,20] and labels shape [25]
I've also tried the following model, but the errors are similar.
model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(20),
    keras.layers.Dense(num_categories, activation="softmax")
])
Does anybody have any recommendations about what I need to do to get this working?
Thanks.
I figured out the issue. The size of the time dimension needs to be the same for the training data and the target.
If you look at my original example code, the training data has these shapes
X.shape = (1, 25, 1)
Y.shape = (1, 5, 1)
To fix it, the time dimension should be the same.
X.shape = (1, 15, 1)
Y.shape = (1, 15, 1)
Here is the updated function that will let the model train. Note that all I did was update the array sizes so they are equally sized. The value of 15 is used because the original array length is 30.
def create_target(batch):
    X = tf.cast(batch[:, :-15][:, :, None], tf.float32)
    Y = batch[:, -15:][:, :, None]
    return X, Y
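As a quick sanity check (using the pipeline above with dataset.batch(5)), the training data and targets now agree on the time dimension:
for X, Y in dataset.take(1):
    print(X.shape, Y.shape)  # (5, 15, 1) (5, 15, 1)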

How to apply Keras Normalization to a ParallelMapDataset without making it eager?

I am training a Tensorflow Keras CNN over images, too much training data to fit into memory. I've got a tf.Dataset preprocessing pipeline that reads the images from HDF5 files using a dataset.map() pipeline step. Now I'm trying to normalize the numeric image data to 0 mean and unit variance.
I'm following this example from this guide, except that I have that .map() in there:
def load_features_from_hdf5(filename):
    spec = tf.TensorSpec(feature_shape, dtype=tf.dtypes.float32, name=None)
    dataset = tfio.IODataset.from_hdf5(filename, "/features", spec=spec)  # returns a Dataset
    feature = dataset.get_single_element()
    feature.set_shape(feature_shape)
    return feature

train_x = tf.data.Dataset.from_tensor_slices(filenames).map(load_features_from_hdf5, num_parallel_calls=tf.data.AUTOTUNE)
normalizer = tf.keras.layers.Normalization(axis=None)
normalizer.adapt(train_x.take(1000))
train_x_normalized = normalizer(train_x) # <-- ValueError
adapt() successfully computes the mean and variance from the dataset. But when I try to actually apply normalization of values on the exact same dataset, it errors while trying to convert my ParallelMapDataset to an EagerTensor.
ValueError: Attempt to convert a value (<ParallelMapDataset shapes: (41, 682, 1), types: tf.float32>) with an unsupported type (<class 'tensorflow.python.data.ops.dataset_ops.ParallelMapDataset'>) to a Tensor.
How can I get this working? Since the data is so large, I wouldn't think I want to make anything eager until training starts. Should I make the normalization an explicit pipeline step on the Dataset? Or an explicit layer on the model itself? (If the latter, how can I bring the mean and variance values from training time to inference time in another process?)
You could try something like this:
import tensorflow as tf
# Create dummy data
train_x = tf.data.Dataset.from_tensor_slices((tf.random.normal((100, 28, 28, 3)), tf.random.normal((100, 1)))).batch(10)
normalizer = tf.keras.layers.Normalization(axis=None)
# Adapt
normalizer.adapt(train_x.map(lambda x, y: x))
# Apply to images
train_x_normalized = train_x.map(lambda x, y: (normalizer(x), y))
Example:
for x, y in train_x_normalized.take(1):
    print(tf.reduce_mean(x), tf.math.reduce_variance(x))
tf.Tensor(0.00930768, shape=(), dtype=float32) tf.Tensor(1.0023469, shape=(), dtype=float32)
Or, as you mentioned in your question, you can use the normalization layer as part of your model.
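If you go the model route, note that the adapted layer's mean and variance are stored as layer weights, so they are saved and restored together with the model; that also covers bringing the statistics from training time to inference time in another process. A minimal sketch, reusing the normalizer adapted above (the path 'my_model' is just an example):
inputs = tf.keras.Input(shape=(28, 28, 3))
x = normalizer(inputs)  # adapted Normalization layer; its mean/variance are layer weights
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.save('my_model')  # the adapted statistics travel with the saved model
reloaded = tf.keras.models.load_model('my_model')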

How to print out the tensor values of a specific layer

I wish to examine the values of a tensor after a mask is applied to it.
Here is a truncated part of the model. I let temp = x so that later I can print temp to check the exact values.
This is a 4-class classification model using acoustic features. Assume I have data of shape (1000, 50, 136) as (batch, timesteps, features).
The objective is to check whether the model is studying the features timestep by timestep. In other words, we wish to confirm the model is learning from slices like the red rectangle in the picture. Logically, that is how a Keras LSTM layer works, but the confusion matrix produced changes considerably when a parameter changes (e.g. Dense units). The validation accuracy stays at 45%, so we would like to visualize the model.
The proposed idea is to print out the first step of the first batch and print out the input to the model. If they are the same, then the model is learning in the right way (all 136 features of one timestep at once) instead of all 50 timesteps of a single feature at once.
input_feature = Input(shape=(X_train.shape[1],X_train.shape[2]))
x = Masking(mask_value=0)(input_feature)
temp = x
x = Dense(Dense_unit,kernel_regularizer=l2(dense_reg), activation='relu')(x)
I have tried tf.print(), which brought me AttributeError: 'Tensor' object has no attribute '_datatype_enum'.
I also tried the approach from Get output from a non final keras model layer, suggested by Lescurel:
model2 = Model(inputs=[input_attention, input_feature], outputs=model.get_layer('masking')).output
print(model2.predict(X_test))
AttributeError: 'Masking' object has no attribute 'op'
You want the output after masking. Lescurel's link in the comment shows how to do that, and so does this link to GitHub. You need to make a new model that takes as input the input of your model and as output the output of the masking layer. I tested it with some made-up code derived from your snippets.
I tested it with some made-up code derived from your snippets.
import numpy as np
from keras import Input
from keras.layers import Masking, Dense
from keras.regularizers import l2
from keras.models import Sequential, Model

X_train = np.random.rand(4, 3, 2)
Dense_unit = 1
dense_reg = 0.01

mdl = Sequential()
mdl.add(Input(shape=(X_train.shape[1], X_train.shape[2]), name='input_feature'))
mdl.add(Masking(mask_value=0, name='masking'))
mdl.add(Dense(Dense_unit, kernel_regularizer=l2(dense_reg), activation='relu', name='output_feature'))
mdl.summary()

mdl2mask = Model(inputs=mdl.input, outputs=mdl.get_layer("masking").output)
maskoutput = mdl2mask.predict(X_train)
mdloutput = mdl.predict(X_train)
print(maskoutput)  # output after/of masking
print(mdloutput)   # output of mdl
print(maskoutput.shape)  # (4, 3, 2): masking keeps the shape of the layer before (the input here)
print(mdloutput.shape)   # (4, 3, 1): shape of the output of the dense layer

Passing non-tensor parameters to a Keras model during training / using tensors for indexing

I'm trying to train a Keras model that incorporates data augmentation in the model itself. The input to the model are images of different classes, and the model is supposed to generate an augmentation model for each class which should be used for the augmentation process. My code roughly looks like this:
from keras.models import Model
from keras.layers import Input
...further imports...
def get_main_model(input_shape, n_classes):
    encoder_model = get_encoder_model()
    input = Input(input_shape, name="input")
    label_input = Input((1,), name="label_input")
    aug_models = [get_augmentation_model() for i in range(n_classes)]
    augmentation = aug_models[label_input](input)
    x = encoder_model(input)
    y = encoder_model(augmentation)
    model = Model(inputs=[input, label_input], outputs=[x, y])
    model.add_loss(custom_loss_function(x, y))
    return model
I would then like to pass batches of data through the model which consist of an array of images (passed to input) and a corresponding array of labels (passed to label_input). However, this doesn't work since whatever is input into label_input is converted to a tensor by Tensorflow and can't be used for indexing in the following. What I've tried is the following:
augmentation = aug_models[int(label_input)](input) --> doesn't work because label_input is a tensor
augmentation = aug_models[tf.make_ndarray(label_input)](input) --> casting doesn't work (I guess because label_input is a symbolic tensor)
tf.gather(aug_models, label_input) --> doesn't work because the result of the operation is a Keras model instance that Tensorflow tries to cast into a tensor (which obviously fails)
Is there any kind of trick in Tensorflow that would enable me to pass a parameter to the model during training that is not converted to a tensor or a different way in which I could tell the model which augmentation model to select? Thanks in advance!
To apply a different augmentation to each element of the input tensor (e.g. conditioned on label_input), you will need to:
First, compute each possible augmentation for each element of the batch.
Second, select the desired augmentations according to the label.
Indexing is unfortunately impossible because both the input and label_input tensors are multi-dimensional. (If you were to apply the same augmentation to every element of the batch, you could instead use a conditional TensorFlow statement such as tf.case.)
Here is a minimal working example showing how you can achieve this:
input = tf.ones((3, 1)) # Shape=(bs, 1)
label_input = tf.constant([3, 2, 1]) # Shape=(bs,)
aug_models = [lambda x: x, lambda x: x * 2, lambda x: x * 3, lambda x: x * 4]
nb_classes = len(aug_models)
augmented_data = tf.stack([aug_model(input) for aug_model in aug_models]) # Shape=(nb_classes, bs, 1)
selector = tf.transpose(tf.one_hot(label_input, depth=nb_classes)) # Shape=(nb_classes, bs)
augmentation = tf.reduce_sum(selector[..., None] * augmented_data, axis=0) # Shape=(bs, 1)
print(augmentation)
# prints:
# tf.Tensor(
# [[4.]
# [3.]
# [2.]], shape=(3, 1), dtype=float32)
NOTE: You might need to wrap these operations into a Keras Lambda layer.
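For completeness, a rough sketch (not tested against your actual augmentation models) of wrapping this selection into a Lambda layer inside get_main_model; aug_models and n_classes are the names from your code, and the augmented outputs are assumed to have a known static rank:
from tensorflow.keras.layers import Lambda

def select_augmentation(args):
    x, label = args  # x: (bs, ...), label: (bs, 1)
    indices = tf.cast(tf.reshape(label, [-1]), tf.int32)  # (bs,)
    augmented = tf.stack([m(x) for m in aug_models], axis=1)  # (bs, n_classes, ...)
    mask = tf.one_hot(indices, depth=n_classes)  # (bs, n_classes)
    # append singleton axes so the mask broadcasts over the data dimensions
    mask = tf.reshape(mask, [-1, n_classes] + [1] * (augmented.shape.rank - 2))
    return tf.reduce_sum(mask * augmented, axis=1)  # (bs, ...)

augmentation = Lambda(select_augmentation)([input, label_input])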
