I have a keras model here which looks as follows:
As you can see, an intent (four classes) is predicted and each word of the sentence is tagged (a choice of 10 classes). I'm now struggling with model.fit and the y_train data preparation. If I shape it as follows, everything works, but it doesn't feel correct, as the left output will have the same shape as the right output.
x = np.array(df_ic.message)
y = np.zeros((df_ic.message.size,2,85))
Can anyone help/suggest how to best shape the train data, i.e. y?
Thanks a lot,
Martin
You can create a Keras model with 2 outputs and provide (y1, y2) to it.
As an example, please see https://github.com/ageron/handson-ml3/blob/main/10_neural_nets_with_keras.ipynb
Search for "Adding an auxiliary output for regularization":
"...
model = tf.keras.Model(inputs=[input_wide, input_deep],
                       outputs=[output, aux_output])
...
history = model.fit(
    (X_train_wide, X_train_deep), (y_train, y_train), epochs=20,
    validation_data=((X_valid_wide, X_valid_deep), (y_valid, y_valid))
)
..."
This is explained in detail in the book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition" by Aurélien Géron; see pages 333-335.
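For the intent-plus-tagging model in the question above, a minimal sketch of such a two-output model might look as follows (the layer sizes, sequence length, and vocabulary size are illustrative assumptions, not the asker's actual model):

import numpy as np
import tensorflow as tf

max_len, vocab_size = 20, 1000  # assumed sequence length and vocabulary size

inputs = tf.keras.Input(shape=(max_len,))
x = tf.keras.layers.Embedding(vocab_size, 64)(inputs)
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True))(x)

# one output head per task, each with its own natural shape
pooled = tf.keras.layers.GlobalMaxPooling1D()(x)
intent_out = tf.keras.layers.Dense(4, activation='softmax', name='intent')(pooled)  # (batch, 4)
tag_out = tf.keras.layers.Dense(10, activation='softmax', name='tags')(x)           # (batch, max_len, 10)

model = tf.keras.Model(inputs, [intent_out, tag_out])
model.compile(optimizer='adam',
              loss={'intent': 'categorical_crossentropy',
                    'tags': 'categorical_crossentropy'})

# y is then a pair of arrays with different shapes, not one zero-padded array
n = 32
x_train = np.zeros((n, max_len))
y_intent = np.zeros((n, 4))          # one-hot intent labels
y_tags = np.zeros((n, max_len, 10))  # one-hot tag labels per word
model.fit(x_train, (y_intent, y_tags), epochs=1, verbose=0)

The point is that each head receives its own target array of the natural shape, so y no longer has to be forced into a single (n, 2, 85) block.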
Say I have an MLPClassifier from scikit-learn, some training data X_train (n rows) and y_train (n rows), and a weights vector (length n).
How would I go about weighting the classifier's input, assuming that the weights vector sums to 1?
Say:
C = MLPClassifier(hidden_layer_sizes=hiddenlayers, activation=activation,
                  solver='adam', max_iter=epoch,
                  learning_rate_init=0.001, learning_rate='adaptive', verbose=False)
defines the classifier.
Sp['weights']   <-- a pandas.Series
defines the weights to apply to
x_train   <-- a pandas.DataFrame
when fitting the model with
C.fit(Sp, y_train)
Note that I have already attempted this by putting the weights column at the start of Sp, followed by the x_train columns and data. I am unsure whether this is the right approach, but anything would be helpful.
Do note that I don't want to move to Keras; my program didn't work right using Keras, but if there is a solution in Keras, do mention that as well.
Thanks in advance!
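To my knowledge, MLPClassifier.fit does not accept a sample_weight argument, so a weights column prepended to Sp would just be treated as an extra feature. Since the question leaves the door open to Keras, here is a minimal sketch of how per-sample weights are passed there (the layer sizes and the binary-classification setup are assumptions for illustration):

from tensorflow import keras

# stand-ins for the question's data
X = x_train.to_numpy()
y = y_train.to_numpy()
w = Sp['weights'].to_numpy()  # the length-n weights vector

model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# sample_weight scales each row's contribution to the loss
model.fit(X, y, sample_weight=w, epochs=10)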
I'm currently working with a time series dataset of 46 rows of meteorological measurements, taken approximately every 3 hours over one week. My explanatory variables (X) comprise 26 variables, some with different units of measurement (degrees, millimeters, g/m3, etc.). My variable to explain (y) consists of a single variable, temperature.
My goal is to predict the temperature (y) over a 12h-24h window using the full set of variables (X).
For that I used Keras (TensorFlow) in Python, with an MLP regressor model:
X = df_forcast_cap.loc[:, ~df_forcast_cap.columns.str.startswith('l')]
X = X.drop(['temperature_Y'],axis=1)
y = df_forcast_cap['temperature_Y']
y = pd.DataFrame(data=y)
# normalize the dataset X
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit_transform(X)
normalized = scaler.transform(X)
# normalize the dataset y
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit_transform(y)
normalized = scaler.transform(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# define base model
def norm_model():
    # create model
    model = Sequential()
    model.add(Dense(26, input_dim=26, kernel_initializer='normal', activation='relu'))  # 26 is the number of neurons
    #model.add(Dense(6, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# evaluate model with standardized dataset
estimator = KerasRegressor(build_fn=norm_model, epochs=100, batch_size=5, verbose=1)
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(estimator, X, y, cv=kfold)
print(results)
[-0.00454741 -0.00323181 -0.00345096 -0.00847261 -0.00390925 -0.00334816
-0.00239754 -0.00681044 -0.02098541 -0.00140129]
# invert predictions
X_train = scaler.inverse_transform(X_train)
y_train = scaler.inverse_transform(y_train)
X_test = scaler.inverse_transform(X_test)
y_test = scaler.inverse_transform(y_test)
results = scaler.inverse_transform(results)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))
Results: -0.01 (0.01) MSE
(1) I read that cross-validation is not suited to time series prediction. So, I'm wondering which other techniques exist and which one is better adapted to time series.
(2) Secondly, I decided to normalize my data because my X dataset is composed of different metrics (degrees, millimeters, g/m3, etc.) and my variable to explain, y, is in degrees. This way, I know I have to deal with a more complicated interpretation of the MSE, because its result won't be in the same unit as my y variable. But for the next step of my study I need to save the y values predicted by the MLP model, and I need these values to be in degrees. So, I tried to invert the normalization, but without success: when I print my results, the predicted values are still in normalized form (see my code above). Does anyone see my mistake(s)?
The model that you present above looks at a single instance of 26 measurements to make a prediction. From your description it seems that you would like to make predictions from a sequence of these measurements. I'm not sure I fully understood the description, but I'll assume that you have a sequence of 46 measurements, each with 26 values that you believe should be good predictors of the temperature. If that is the case, the input shape of your model should be (46, 26). The 46 here is called time_steps; 26 is the number of features.
For a time series you need to select a model design. There are two approaches: a recurrent network or a convolutional network (or a mixture of the two). A convolutional network is typically used to detect patterns in the input data that may be located anywhere in the data. For instance, suppose you want to detect a given shape in an image: convolutional networks are a good starting point. Recurrent networks update their internal state after each time step. They can detect patterns as well as a convolutional network can, but you can think of them as being less position-independent.
A simple example of a convolutional approach:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.models import Sequential, Model

average_tmp = 0.0

model = Sequential([
    InputLayer(input_shape=(46, 26)),
    Conv1D(16, 4),
    Conv1D(32, 4),
    Conv1D(64, 2),
    Conv1D(128, 4),
    MaxPooling1D(),
    Flatten(),
    Dense(256, activation='relu'),
    Dense(1, bias_initializer=keras.initializers.Constant(average_tmp)),
])
model.compile('adam', 'mse')
model.summary()
A mixed approach would replace the `Flatten` layer above with an LSTM node, as sketched below. That would probably be a reasonable starting point for experimenting.
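A sketch of that mixed variant (the exact layer sizes are an assumption, not a tuned model):

from tensorflow.keras.layers import Conv1D, Dense, InputLayer, LSTM, MaxPooling1D
from tensorflow.keras.models import Sequential

mixed_model = Sequential([
    InputLayer(input_shape=(46, 26)),
    Conv1D(16, 4),
    Conv1D(32, 4),
    MaxPooling1D(),
    LSTM(64),  # replaces Flatten: reads the sequence and returns its final state
    Dense(256, activation='relu'),
    Dense(1),
])
mixed_model.compile('adam', 'mse')
mixed_model.summary()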
(1) I read that cross-validation is not suited to time series prediction. So, I'm wondering which other techniques exist and which one is better adapted to time series.
Cross-validation is a technique that is very well suited to this problem. If you try the example model above, I can almost guarantee that it will overfit your dataset very significantly. Cross-validation can help you determine the right regularisation parameters for your model in order to avoid overfitting.
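If the concern is leakage from shuffled folds, one time-series-aware variant (my suggestion, not something the answer above names) is scikit-learn's TimeSeriesSplit, where each validation fold comes strictly after its training fold:

from sklearn.model_selection import TimeSeriesSplit, cross_val_score

tscv = TimeSeriesSplit(n_splits=5)  # folds preserve temporal order
results = cross_val_score(estimator, X, y, cv=tscv)

This drops in for the KFold object in the question's code.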
Examples of regularisation techniques that you probably want to consider (a combined sketch follows this list):
Saving the model weights at the epoch with the lowest validation score.
Dropout and/or BatchNormalization.
Kernel regularisation.
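A sketch combining the three, applied to the question's 26-feature MLP (the regularisation strengths and file name are placeholder assumptions):

from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.regularizers import l2

reg_model = Sequential([
    Dense(26, input_dim=26, activation='relu', kernel_regularizer=l2(1e-4)),  # kernel regularisation
    BatchNormalization(),
    Dropout(0.3),
    Dense(1),
])
reg_model.compile(loss='mean_squared_error', optimizer='adam')

# keep only the weights from the epoch with the lowest validation loss
checkpoint = ModelCheckpoint('best_weights.h5', monitor='val_loss',
                             save_best_only=True, save_weights_only=True)
# reg_model.fit(X_train, y_train, validation_split=0.2, epochs=100, callbacks=[checkpoint])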
(2) Secondly, I decided to normalize my data because my X dataset is composed of different metrics (degrees, millimeters, g/m3, etc.) and my variable to explain, y, is in degrees.
Good call. It will save your model the training cycles it would otherwise spend discovering the bias at very high values, starting from the random initialisation.
This way, I know I have to deal with a more complicated interpretation of the MSE, because its result won't be in the same unit as my y variable.
This is orthogonal. The inputs are not assumed to be in the same unit as y. In a DNN we assume that we can create a combination of linear transformations of weights (plus non-linear activations). That carries no implicit assumption about units.
But for the next step of my study I need to save the y values predicted by the MLP model, and I need these values to be in degrees. So, I tried to invert the normalization, but without success: when I print my results, the predicted values are still in normalized form (see my code above). Does anyone see my mistake(s)?
scaler.inverse_transform(results) should do the trick.
It doesn't make sense to inverse-transform the inputs X_ and y_. And it would probably help you keep your code straight not to use the same variable name for both the X and y scalers.
It is also possible to refrain from scaling Y. If you choose to do so, I'd suggest that you initialise the output layer bias with the mean of the Ys.
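A minimal sketch of that separation, with one scaler per role (the variable names are mine):

from sklearn.preprocessing import MinMaxScaler

x_scaler = MinMaxScaler(feature_range=(0, 1))
y_scaler = MinMaxScaler(feature_range=(0, 1))

X_norm = x_scaler.fit_transform(X)  # fit and transform in one step
y_norm = y_scaler.fit_transform(y)

# ... train on X_norm / y_norm ...

# predictions come back in the normalized scale, so invert with the y scaler only
y_pred_norm = estimator.predict(X_norm)
y_pred_degrees = y_scaler.inverse_transform(y_pred_norm.reshape(-1, 1))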
I am quite new to machine learning and RNNs.
I have a small problem as follows: the problem has 4 independent input variables (say (x_1, x_2, x_3, x_4)) and a vector output (say y_out, with 100x1 elements). I want to build an LSTM model to predict the new y_out for any input set (x_1new, x_2new, x_3new, x_4new).
Let's say I build a sample dataset of 50000 lines, each line consisting of [x_1i, x_2i, x_3i, x_4i, y_outi]. So my dataset has the dimensions 50000x(4+100).
I don't know how to configure my LSTM model. Any suggestion would be appreciated.
I have already tried NN models such as backpropagation, with very poor prediction results.
Thank you
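One possible configuration, sketched under the assumption that the four inputs are treated as a length-4 sequence with one feature per step (the layer sizes are placeholders):

import numpy as np
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import Sequential

# dataset of shape (50000, 104): 4 inputs followed by the 100-element output
data = np.random.rand(50000, 104)  # placeholder for the real dataset
X = data[:, :4].reshape(-1, 4, 1)  # (samples, time_steps=4, features=1)
y = data[:, 4:]                    # (samples, 100)

model = Sequential([
    LSTM(64, input_shape=(4, 1)),
    Dense(100),  # one unit per element of y_out
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=10, batch_size=64)

That said, since the four inputs are described as independent rather than sequential, a plain Dense network over a (samples, 4) input may be the more natural baseline to compare against.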
I'm currently trying to build a program that classifies the Quora (question pair) dataset: is a pair duplicate or not? I got the accuracy and loss based on the real y, but I don't know how to obtain the output (the predicted y). Can anyone help me?
The output should be 1 or 0 (binary class).
This is the sentence-merger code; the training process uses an LSTM:
merged = RNN(EMBED_HIDDEN_SIZE)(merged)
merged = layers.Dropout(dropoutp)(merged)
preds = layers.Dense(answer_size, activation='sigmoid')(merged)
model = Model([questiona, questionb], preds)
rmsprop = keras.optimizers.rmsprop(lr=lrn)
model.summary()
You can get the predictions by passing the test data to the predict function:
predictions = model.predict(X)
Link to the docs
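Since the final layer is a sigmoid, predict returns probabilities in [0, 1]; to get the 1-or-0 class the question asks for, threshold them at 0.5 (the test-array names below are assumptions):

probabilities = model.predict([X_test_a, X_test_b])   # two inputs, matching Model([questiona, questionb], preds)
predicted_classes = (probabilities > 0.5).astype(int)  # 1 = duplicate, 0 = not duplicate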
I have a dataset spanning hundreds of values regarding temperature. Obviously, in meteorology, it is helpful to predict what future values will be based on the past.
I have the following stateful model, built in Keras:
look_back = 1
model = Sequential()
model.add(LSTM(32, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(10):
    model.fit(trainX, trainY, epochs=4, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
# make predictions
trainPredict = model.predict(trainX, batch_size=batch_size)
I have successfully been able to train and test the model on my dataset with reasonable results, but I am struggling to understand what is required to predict the next, say, 20 points in the dataset. Obviously, these 20 points are outside the dataset, and they have yet to "occur".
I would appreciate anything that would be of help; I feel like I am missing some simple functionality in Keras.
Thank you.
I feel like I am missing some simple functionality in Keras.
You have all you need right there. To obtain predictions on new data you have to use model.predict() again, but on the desired range. This depends on how your data looks.
Let's assume your time series trainX had events with x ranging over [0, 100].
Then to predict the next 20 events you want to call predict() on values 101 to 120, something like:
futureData = np.array(range(101, 121), dtype=float).reshape(-1, 1, 1)  # [101, 102, ..., 120], shaped (samples, look_back, features)
futurePred = model.predict(futureData, batch_size=batch_size)  # batch_size must match the stateful model's batch_input_shape
Again, this depends on how your "next 20" events look. If your bin size were instead 0.1 (100, 100.1, 100.2, ...), you should evaluate the prediction accordingly.
You may also like to check this page, where they give examples and explain more about time series in Keras with RNNs, if you are interested.
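If instead the model's inputs are past temperatures rather than a time index, a common pattern (my sketch, not part of the answer above, and assuming look_back = 1 and batch_size = 1) is to feed each prediction back in as the next input:

current_input = trainX[-1:].reshape(1, 1, 1)  # last observed value, shaped (batch, look_back, features)
future = []
for _ in range(20):
    next_value = model.predict(current_input, batch_size=1)  # one step ahead
    future.append(float(next_value[0, 0]))
    current_input = next_value.reshape(1, 1, 1)  # the prediction becomes the next input
print(future)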