I am getting the error below when fitting an LSTM model.
The data has three columns:
Sentence (input)
Value (output)
Label (output)
I am using TensorFlow 2.3.0. I tried 2.0.0 as suggested, but I ended up with dependency errors.
Please help me resolve the error quoted below:
"ValueError: Data cardinality is ambiguous:
x sizes: 720
y sizes: 89
Please provide data which shares the same first dimension."
### Create sequence
vocab_size = 20000
tokenizer = Tokenizer(num_words= vocab_size)
tokenizer.fit_on_texts(df['Sentence'])
sequences = tokenizer.texts_to_sequences(df['Sentence'])
data = pad_sequences(sequences, maxlen=100)
le = LabelEncoder()
df['label'] = le.fit_transform(df['label'])
X = df['Sentence']
y = df[['value','label']]
X_train, y_train, X_test, y_test = train_test_split(X, y, test_size = 0.1)
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(X_train)
X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)
vocab_size = len(tokenizer.word_index) + 1
maxlen = 200
X_train = pad_sequences(X_train, padding='post', maxlen=maxlen)
X_test = pad_sequences(X_test, padding='post', maxlen=maxlen)
#print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
model = Sequential()
model.add(Embedding(vocab_size, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Flatten())
model.add(Dense(2, activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3,batch_size=8, validation_split=0.1)
accr = model.evaluate(X_test, y_test)
print('Test set\n Loss: {:0.3f}\n Accuracy: {:0.3f}'.format(accr[0], accr[1]))
Your data has two outputs (the Value and Label columns), but your model has only one output layer, so the targets must be passed as a single array with two columns.
This code works:
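import tensorflow as tf
# Random stand-in data: 100 sequences of 100 token ids and 100 targets with 2 columns each.
X_train = tf.random.uniform([100, 100], 0, 100, dtype=tf.int32)
y_train = tf.random.uniform([100, 2])
model.fit(X_train, y_train, epochs=3, batch_size=8, validation_split=0.1)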
Check the shape of y_train. It should be [num_samples, 2], with the same first dimension as X_train.
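One more thing that may be worth checking, since the error complains about mismatched first dimensions: scikit-learn's train_test_split returns the pieces in the order X_train, X_test, y_train, y_test, whereas the question unpacks them as X_train, y_train, X_test, y_test. The sketch below uses made-up stand-in arrays (800 rows and two target columns are assumptions, not the asker's data) to show the unpacking order and a quick shape check before calling fit.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 800 samples, 2 target columns (value, label).
X = np.arange(800).reshape(800, 1)
y = np.zeros((800, 2))

# train_test_split returns both splits of the first array, then both splits
# of the second: (X_train, X_test, y_train, y_test).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

# Keras needs inputs and targets to share the same first dimension.
assert X_train.shape[0] == y_train.shape[0]
assert X_test.shape[0] == y_test.shape[0]
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

If the names are unpacked in a different order, y_train silently ends up holding a slice of X, which is one way to get differing x and y sizes.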
Related
Dear all: I'm very new to deep learning. I was trying to add a for loop to test all the possible combinations to get the best result. Currently what I have is the following.
def coeff_determination(y_true, y_pred):
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - SS_res / (SS_tot + K.epsilon())
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
x_train = x_train.to_numpy()
x_test = x_test.to_numpy()
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)
(37118, 105)
(37118,)
(15908, 105)
(15908,)
timesteps = 3
features = 35 #this is the number of features
x_train = x_train.reshape((x_train.shape[0], timesteps, features))
x_test = x_test.reshape((x_test.shape[0], timesteps, features))
dense_layers=[0, 1, 2]
layer_sizes=[32, 64, 128]
LSTM_layers=[1,2,3]
for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for LSTM_layer in LSTM_layers:
            NAME = "{}-lstm-{}-nodes-{}-dense-{}".format(LSTM_layer, layer_size, dense_layer, int(time.time()))
            tensorboard = TensorBoard(log_dir=f"LSTM_logs\\{NAME}")
            print(NAME)
            model = Sequential()
            model.add(LSTM(layer_size, input_shape=(x_train.shape[1], x_train.shape[2]), return_sequences=True))
            for i in range(LSTM_layer-1):
                model.add(LSTM(layer_size, input_shape=(x_train.shape[1], x_train.shape[2]), return_sequences=True))
            for i in range(dense_layer):
                model.add(Dense(layer_size))
            model.add(Dense(1))
            model.compile(loss='mae', optimizer='adam', metrics=[coeff_determination])
            epochs = 10
            result = model.fit(x_train, y_train, epochs=epochs, batch_size=72, validation_data=(x_test, y_test), verbose=2, shuffle=False)
However, I got a traceback that says the following:
ValueError: Error when checking target: expected dense_192 to have 3 dimensions, but got array with shape (37118, 1)
and the error occurs in the following line.
---> 19 result = model.fit(x_train, y_train, epochs=epochs, batch_size=72, validation_data=(x_test, y_test), verbose=2, shuffle=False)
Could anyone please kindly give me some hint regarding how to solve the problem. Thanks a lot for your time and support.
Sincerely
Wilson
Use return_sequences=False for your last LSTM layer so that it returns only a vector with the last hidden state, instead of one output per timestep.
Sincerely,
Alexander
more details: How to use return_sequences option and TimeDistributed layer in Keras?
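To make the effect of return_sequences concrete, here is a minimal sketch using tf.keras with the same input shape as the question (3 timesteps, 35 features). The layer width of 32, the batch of 8 random samples, and the helper name build are illustrative assumptions only.

```python
import numpy as np
import tensorflow as tf

timesteps, features = 3, 35  # shapes taken from the question

def build(last_returns_sequences):
    """Hypothetical helper: two stacked LSTMs followed by Dense(1)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, features)),
        # Intermediate LSTM layers must return the full sequence
        # so the next LSTM receives 3D input.
        tf.keras.layers.LSTM(32, return_sequences=True),
        tf.keras.layers.LSTM(32, return_sequences=last_returns_sequences),
        tf.keras.layers.Dense(1),
    ])

x = np.random.rand(8, timesteps, features).astype("float32")

seq_out = build(True).predict(x, verbose=0)   # one value per timestep
vec_out = build(False).predict(x, verbose=0)  # one value per sample

print(seq_out.shape, vec_out.shape)
```

With return_sequences=True on the last LSTM, the Dense layer is applied at every timestep and the output is 3D, which cannot be matched against a target with one value per sample. With return_sequences=False the output has one row per sample.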
I am trying to build a bi-LSTM model in TensorFlow, running in Google Colab. During training the model hits an issue: the last layer reports a shape incompatibility. I wonder whether there is a way to reshape x_train and y_train to fix this problem.
Traceback
ValueError: Shapes (16, 11) and (16, 10) are incompatible
If I change the number of units in my output layer from 10 to 11, there is no error and the model trains. However, I want the output to have 10 units, not 11.
# current output layer (runs without error)
tf.keras.layers.Dense(11, activation='softmax')
# expected output layer (shape incompatibility)
tf.keras.layers.Dense(10, activation='softmax')
BiLSTM Model
def build_model(vocab_size, embedding_dim=64, input_length=30):
    print('\nbuilding the model...\n')
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=(vocab_size + 1), output_dim=embedding_dim, input_length=input_length),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(rnn_units, return_sequences=True, dropout=0.2)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(rnn_units, return_sequences=True, dropout=0.2)),
        tf.keras.layers.GlobalMaxPool1D(),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(64, activation='tanh'),
        # softmax output layer
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    # optimizer & loss
    opt = 'RMSprop' #tf.optimizers.Adam(learning_rate=1e-4)
    loss = 'categorical_crossentropy'
    # Metrics
    metrics = ['accuracy', 'AUC', 'Precision', 'Recall']
    # compile model
    model.compile(optimizer=opt,
                  loss=loss,
                  metrics=metrics)
    model.summary()
    return model
The BATCH_SIZE is set to 16. And the shapes of y_train and x_train are:
x_train.shape
(800, 30)
y_train.shape
(800,)
Training
def train(model, x_train, y_train, x_validation, y_validation,
          epochs, batch_size=32, patience=5,
          verbose=2, monitor_es='accuracy', mode_es='auto', restore=True,
          monitor_mc='val_accuracy', mode_mc='max'):
    print('\ntraining...\n')
    # callback
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor=monitor_es,
                                                      verbose=1, mode=mode_es, restore_best_weights=restore,
                                                      min_delta=1e-3, patience=patience)
    model_checkpoint = tf.keras.callbacks.ModelCheckpoint('tfjsmode.h5', monitor=monitor_mc, mode=mode_mc,
                                                          verbose=1, save_best_only=True)
    # Define Tensorboard as a Keras callback
    tensorboard = TensorBoard(
        log_dir='./logs',
        histogram_freq=1,
        write_images=True
    )
    keras_callbacks = [tensorboard, early_stopping, model_checkpoint]
    # train model
    history = model.fit(x_train, y_train,
                        batch_size=batch_size, epochs=epochs, verbose=verbose,
                        validation_data=(x_validation, y_validation),
                        callbacks=keras_callbacks)
    return history
Preprocessing
def preprocess(x, padding_shape=30):
    return np.array([ord(i.lower()) - ord('a')+1 if not i.isdigit() and i != ' ' else 0 for i in list(x)] + ([0] * (padding_shape - len(x))), dtype=int)

def prepare_dataset(labeldict : dict, test_size=.3, validation_size=.1):
    print('preparing the dataset...\n')
    from sklearn import preprocessing
    # load dataset
    # split dataset (as string into pandas.core.series.Series object)
    x, y = load_clean_dataset()
    x = np.array(list(map(preprocess, x)))
    y = np.array(list(map(lambda x: labeldict[x.replace(' ', '_')], y)))
    print(('y: {}').format(y))
    # create/split train, validation and test and shuffle the data
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=test_size, shuffle=True)
    print(x.max(), x.min())
    x_train_val, x_validation, y_train_val, y_validation = train_test_split(x_train, y_train, test_size=test_size, shuffle=True)
    # pandas.core.series.Series to numpy array
    x_train, y_train = np.array(x_train), np.array(y_train)
    x_validation, y_validation = np.array(x_validation), np.array(y_validation)
    x_test, y_test = np.array(x_test), np.array(y_test)
    x_train_val, y_train_val = np.array(x_train_val), np.array(y_train_val)
    print(('\nx_train: \n{}\n\ny_train: \n{}').format(x_train_val, y_train_val))
    y_train = tf.keras.utils.to_categorical(y, num_classes=10)
    return (x_train, y_train), (x_validation, y_validation), (x_test, y_test), (x_train_val, y_train_val)
It seems your labels are currently integers (i.e. not one-hot encoded vectors). For example, your y seems to look like this:
[0, 1, 8, 9, ....] # a vector of 800 elements
There are two ways to train a model on such data.
Alternative 1 (easiest I guess)
Use sparse_categorical_crossentropy as the loss function of the model
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=metrics)
Alternative 2
Convert your labels to one-hot encoded vectors using
y_onehot = tf.keras.utils.to_categorical(y, num_classes=10)
and then keep the loss of the model as categorical_crossentropy.
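The two alternatives above compute the same quantity and differ only in the label format they expect. As a rough NumPy illustration (the four labels and the random "model outputs" below are made up for the example), integer labels of shape (n,) index directly into the predicted probabilities, while one-hot labels of shape (n, 10) select the same entries by multiplication:

```python
import numpy as np

num_classes = 10
y_int = np.array([0, 1, 8, 9])           # integer labels, shape (4,)
y_onehot = np.eye(num_classes)[y_int]    # one-hot labels, shape (4, 10)

# Hypothetical softmax outputs of a model, shape (4, 10).
rng = np.random.default_rng(0)
logits = rng.normal(size=(len(y_int), num_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Sparse categorical cross-entropy: pick the probability of the true class.
sparse_ce = -np.log(probs[np.arange(len(y_int)), y_int])

# Categorical cross-entropy: sum over classes, weighted by the one-hot target.
dense_ce = -(y_onehot * np.log(probs)).sum(axis=1)

print(y_int.shape, y_onehot.shape)
print(np.allclose(sparse_ce, dense_ce))
```

In both cases the number of output units has to equal the number of classes, and integer labels are expected to lie in the range 0 to num_classes - 1.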
I have a pandas dataframe of features and samples, and a single series with binary category (0 or 1) values. With that I'm trying to train a neural network, but I am getting the error:
TensorFlow incompatible shapes binary classification
Here is a summary of the code:
X_train, X_test, y_train, y_test = train_test_split(df_x, series_y, random_state=1, test_size=0.25)
best_weight_path = 'best_weights.hdf5'
x = df_x.to_numpy()
y = series_y.to_numpy()
numpy_x_train = X_train.to_numpy()
numpy_y_train = y_train.to_numpy()
numpy_x_test = X_test.to_numpy()
numpy_y_test = y_test.to_numpy()
model = Sequential()
model.add(Dense(20, input_dim=x.shape[1], activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, verbose=1, mode='auto')
checkpointer = ModelCheckpoint(filepath=best_weight_path, verbose=0, save_best_only=True)
model.fit(x, y, validation_data=(numpy_x_test, numpy_y_test), callbacks=[monitor, checkpointer], verbose=0, epochs=1000)
ValueError: Shapes (None, 1) and (None, 2) are incompatible
Shouldn't the last dense layer have 2 units as there are two possible outcomes, so where is the shape (None, 1) coming from?
The problem comes from choosing a loss function that matches the format of your labels. You have two possibilities when using softmax in a classification task:
First possibility: if you have a 1D integer-encoded target, you can use sparse_categorical_crossentropy as the loss function (this seems to be your case)
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
n_class = 2
n_features = 100
n_sample = 1000
X = np.random.randint(0,10, (n_sample,n_features))
y = np.random.randint(0,n_class, n_sample)
inp = Input((n_features,))
x = Dense(128, activation='relu')(inp)
out = Dense(n_class, activation='softmax')(x)
model = Model(inp, out)
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
Second possibility: if you have one-hot encoded your target so that it has the 2D shape (n_samples, n_class), you can use categorical_crossentropy
import pandas as pd
n_class = 2
n_features = 100
n_sample = 1000
X = np.random.randint(0,10, (n_sample,n_features))
y = pd.get_dummies(np.random.randint(0,n_class, n_sample)).values
inp = Input((n_features,))
x = Dense(128, activation='relu')(inp)
out = Dense(n_class, activation='softmax')(x)
model = Model(inp, out)
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
I have 62796 training samples (x_train) and 15684 test samples (x_test), and I want to forecast the values that come after them. I would like advice on how to do this with an LSTM in Keras. Here is my code:
...
look_back = 20
train_size = int(len(data) * 0.80)
test_size = len(data) - train_size
train = data[0:train_size]
test = data[train_size:len(data)]
x_train, y_train = create_dataset(train, look_back)
x_test, y_test = create_dataset(test, look_back)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
y_train=np.repeat(y_train.reshape(-1,1), 20, axis=1).reshape(-1,20,1)
y_test=np.repeat(y_test.reshape(-1,1), 20, axis=1).reshape(-1,20,1)
...
model = Sequential()
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(1, return_sequences=True))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])
model.summary()
model.fit(x_train, y_train, epochs=10, batch_size=64)
p = model.predict(x_test)
So predictions = model.predict(x_train) has shape (62796, 20, 1).
I tried this code, following the question "How to use the Keras model to forecast for future dates or events?":
future = []
currentStep = predictions[-20:, :, :] # -20 is last look_back number
for i in range(10):
    currentStep = model.predict(currentStep)
    future.append(currentStep)
With this code, the future result is: (image not preserved)
but the result of p = model.predict(x_test)[:4000] is: (image not preserved)
I want to know how to predict the exact next value, but the difference between the two results is very large. I don't know whether my approach or my code is wrong.
The full source is at https://gist.github.com/Lay4U/654f70bd1fb9c4f7d5bdb21ddcb588ab
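The thread has no accepted answer, so the following is only a sketch of one common approach, not the asker's method. It forecasts iteratively with a single sliding window: predict one step, append the prediction to the window, drop the oldest value, and repeat. The function predict_next is a hypothetical stand-in (it just returns the window mean) so that the example runs without a trained model. With a Keras model like the one in the question, it would presumably be replaced by something like model.predict(window.reshape(1, look_back, 1)) followed by selecting the last timestep.

```python
import numpy as np

look_back = 20  # window length, as in the question

def predict_next(window):
    """Hypothetical stand-in for a trained one-step model: returns the window mean."""
    return float(window.mean())

def forecast(history, steps, look_back):
    """Roll a single window forward, feeding each prediction back in as input."""
    window = np.asarray(history[-look_back:], dtype=float)
    future = []
    for _ in range(steps):
        nxt = predict_next(window)
        future.append(nxt)
        # Slide the window: drop the oldest value, append the new prediction.
        window = np.append(window[1:], nxt)
    return np.array(future)

series = np.arange(100, dtype=float)  # made-up data: 0, 1, ..., 99
future = forecast(series, steps=10, look_back=look_back)
print(future.shape)
```

Note the difference from the loop in the question, which starts from the last 20 samples of predictions (shape (20, 20, 1)) rather than from one window of the last 20 observed values (shape (1, 20, 1)). Errors also accumulate when predictions are fed back as inputs, so a multi-step forecast drifting away from model.predict(x_test), which always sees real data, is to be expected to some degree.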