I'm trying to follow a Kaggle notebook for the BERT model: https://www.kaggle.com/code/ludovicocuoghi/twitter-sentiment-analysis-with-bert-roberta/notebook
At a step near the end I get this error:
ValueError: Shapes (None, 2) and (None, 3) are incompatible
The cell:
history_bert = model.fit([train_input_ids,train_attention_masks], y_train, validation_data=([val_input_ids,val_attention_masks], y_valid), epochs=4, batch_size=32)
I tried many things but didn't find a solution. If someone can enlighten me, it would be very helpful!
The shapes of the different variables:
train_input_ids SHAPE: (640740, 128)
train_attention_masks SHAPE: (640740, 128)
y_train SHAPE: (640740, 2)
val_input_ids SHAPE: (71194, 128)
val_attention_masks SHAPE: (71194, 128)
y_valid SHAPE: (71194, 2)
In my dataset, Sentiment has only 2 values (0 / 1); I don't know whether this has an impact on the error.
I tried switching the order of the variables in the call to fit.
I searched for similar errors on different forums to find a solution.
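One way to read the error, as a hedged sketch: the labels have 2 columns while the message suggests the model's final layer produces 3 values per sample, so the two widths cannot be compared by the loss. The helper name `check_label_width` and the toy labels are assumptions for illustration and do not come from the notebook.

```python
import numpy as np

def check_label_width(y, output_units):
    """Return True when one-hot labels have as many columns as the output layer has units."""
    return y.ndim == 2 and y.shape[1] == output_units

# Toy stand-in for y_train: binary sentiment, one-hot encoded into 2 columns.
labels = np.array([0, 1, 1, 0])
y_train = np.eye(2)[labels]

# Width the final layer would need in order to match these labels.
num_classes = y_train.shape[1]

matches_three = check_label_width(y_train, 3)            # assumed 3-unit output layer
matches_two = check_label_width(y_train, num_classes)    # output layer sized from the labels
```

If this reading is right, sizing the output layer from `y_train.shape[1]` instead of a hard-coded 3 would make the shapes agree.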
So I have some machine learning data split into testing and training data. The data is imported from a csv file and split into training and testing data using a numpy array.
I manage to split the data fine, but when I try to use it in the model I get this error:
ValueError: Input 0 of layer "mobilenetv2_1.00_3998" is incompatible with the layer: expected shape=(None, 3998, 140, 1), found shape=(None, 140, 1)
I have tried to reshape the data to match the input shape of the model. This still doesn't work, and I'm not really sure how to go about it. The data needs to be reshaped, but without changing its values.
The training dataset consists of:
[[ 0.00770334 -1.4224063 -2.4392433 ... 2.1296244 1.7076529
0.2145994 ]
[-0.9572602 -2.1521447 -2.7491045 ... -3.784852 -2.7787943
-1.727039 ]
The testing dataset consists of:
[1. 0. 0. ... 1. 0. 0.]
Shape of the data:
x_train: (3998, 140)
x_test: (1000, 140)
y_train: (3998,)
y_test: (1000,)
The size of each testing and training set:
x_train: 559720
x_test: 140000
y_train: 3998
y_test: 1000
Here is my code:
model = tf.keras.applications.MobileNetV2((3998, 140, 1), classes=10, weights = None)
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
x_train, x_test, y_train, y_test = model_selection.train_test_split(x, y, test_size=0.2, random_state=123)
x_train = x_train.reshape(3998, 140, 1)
x_test = x_test.reshape(1000, 140, 1)
tf.keras.applications.MobileNetV2 is for images only, meaning an input shape of (None, Height, Width, 3), where None is the batch size and 3 is the number of channels. But your training data seems to have a shape of (None, 140), which does not match the required input shape. So try a different model that matches your data shape, and the error should go away.
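As a small sketch of the shape bookkeeping involved: the input shape given to a Keras model describes a single sample and leaves out the batch axis, so the number of samples (3998) should not appear in it. The zero-filled array below is a stand-in for the real `x_train`, and the variable names `x_seq` and `per_sample_shape` are made up for illustration.

```python
import numpy as np

# Stand-in for the real training data: 3998 samples, 140 features each.
x_train = np.zeros((3998, 140), dtype=np.float32)

# Add a trailing channel axis; -1 lets NumPy infer the number of samples.
x_seq = x_train.reshape(-1, 140, 1)

# The per-sample shape is what a model's input shape would have to match.
per_sample_shape = x_seq.shape[1:]
```

With this layout, a model that consumes sequences of length 140 with 1 channel would declare `(140, 1)` as its input shape, and the batch axis is supplied by `fit`.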
Sorry, I can't comment on the question.
What is the shape of your input, excluding the batch dimension?
Is it (3998, 140, 1) or (140, 1)?
If it is (140, 1), I think this part should be:
tf.keras.applications.MobileNetV2((140, 1), classes=10, weights = None)
But if I am correct, the MobileNet input should have 3 dimensions, like (240, 240, 3).
But if the shape of a single sample is (3998, 140, 1), then you should add a batch dimension to it before passing it to the model:
x_train = x_train.reshape(1, 3998, 140, 1)
I am implementing an MTL (multi-task learning) solution for a regression model on a well-known benchmarking dataset for this kind of application (School Dataset from Manash).
I could efficiently train the model using 3 inputs with a different sample size each. More specifically I have 2 datasets with shapes (91, 28) and 1 with shape (212,28), and each one has their own labels with shapes ((91,1), (91,1) & (212,1)) respectively.
I split each dataset for training, validation and testing in the same proportions.
Using the Keras API I coded the following Network Architecture:
Layer (type)                   Output Shape    Param #   Connected to
==================================================================================
school_1_in (InputLayer)       [(None, 28)]    0         []
school_2_in (InputLayer)       [(None, 28)]    0         []
school_3_in (InputLayer)       [(None, 28)]    0         []
concatenate_10 (Concatenate)   (None, 84)      0         ['school_1_in[0][0]',
                                                          'school_2_in[0][0]',
                                                          'school_3_in[0][0]']
dense_41 (Dense)               (None, 16)      1360      ['concatenate_10[0][0]']
dense_42 (Dense)               (None, 8)       136       ['dense_41[0][0]']
dense_43 (Dense)               (None, 4)       36        ['dense_42[0][0]']
dense_44 (Dense)               (None, 4)       36        ['dense_42[0][0]']
dense_45 (Dense)               (None, 4)       36        ['dense_42[0][0]']
school_1_out (Dense)           (None, 1)       5         ['dense_43[0][0]']
school_2_out (Dense)           (None, 1)       5         ['dense_44[0][0]']
school_3_out (Dense)           (None, 1)       5         ['dense_45[0][0]']
==================================================================================
Total params: 1,619
Trainable params: 1,619
Non-trainable params: 0
There are 3 input layers, one for each train split of the datasets, followed by 1 concatenation and 2 shared dense layers that learn a feature representation of the combined input. Then I use 3 task-specific dense layers, one per output, to learn higher-level representations.
Here is the code for the model:
(I saved each train_input and train_output in a dict just for simplicity)
# Modelling - Keras Functional API
# Imports assumed here (tf.keras); the original post does not show them.
from tensorflow.keras import Input, layers, models

# One input per school, each with the same number of features
input_tensor_1 = Input(shape=(train_inputs[0].shape[1],), dtype='int32', name='school_1_in')
input_tensor_2 = Input(shape=(train_inputs[1].shape[1],), dtype='int32', name='school_2_in')
input_tensor_3 = Input(shape=(train_inputs[2].shape[1],), dtype='int32', name='school_3_in')
concatenated = layers.concatenate([input_tensor_1, input_tensor_2, input_tensor_3],
                                  axis=-1)
# Shared layers
shared_layer_1 = layers.Dense(16, activation='relu')(concatenated)
shared_layer_2 = layers.Dense(8, activation='relu')(shared_layer_1)
# Task-specific layers
hidden_1 = layers.Dense(4, activation='relu')(shared_layer_2)
hidden_2 = layers.Dense(4, activation='relu')(shared_layer_2)
hidden_3 = layers.Dense(4, activation='relu')(shared_layer_2)
output_1 = layers.Dense(1, name='school_1_out')(hidden_1)
output_2 = layers.Dense(1, name='school_2_out')(hidden_2)
output_3 = layers.Dense(1, name='school_3_out')(hidden_3)
model = models.Model([input_tensor_1, input_tensor_2, input_tensor_3],
                     [output_1, output_2, output_3])
model.compile(optimizer='adam',
              loss={
                  'school_1_out': 'mse',
                  'school_2_out': 'mse',
                  'school_3_out': 'mse'
              },
              metrics=['mae'])
epochs = 300
model.fit({'school_1_in': train_inputs[0], 'school_2_in': train_inputs[1], 'school_3_in': train_inputs[2]},
          {'school_1_out': train_outputs[0], 'school_2_out': train_outputs[1], 'school_3_out': train_outputs[2]},
          epochs=epochs,
          batch_size=32,
          validation_split=0.2,
          verbose=1)
history_dict = model.history.history
model.summary()
This code runs successfully.
Then I try to evaluate it on the test data for each one and I get the following error:
model.evaluate([test_inputs[0], test_inputs[1], test_inputs[2]],
[test_outputs[0], test_outputs[1], test_outputs[2]])
ValueError: Data cardinality is ambiguous:
x sizes: 19, 19, 43
y sizes: 19, 19, 43
Make sure all arrays contain the same number of samples.
I would understand the error in any other context, but here the core idea is precisely that the number of samples differs between tasks. I have read many papers on MTL with this kind of approach but could not find a code example that implements it correctly.
My question is, how could I evaluate the model on the testing data using these inputs with obviously different number of samples?
And therefore, how is the model even trained in the first place? I understand that the forward pass needs a 3-input/3-output tuple at every step to activate every neuron of the shared layers and calculate the loss for each output in a given batch, but in this case the larger input would have to be trained with missing data on the other two.
I appreciate any help with this issue. Maybe the whole implementation is ill-defined, but having read so many papers that allow this difference in the number of samples, I assume I may be making a silly mistake.
Thank you!
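One possible workaround, sketched under stated assumptions rather than taken from any of the papers mentioned: pad the smaller test sets with zero rows up to the largest sample count, and build a 0/1 weight vector per task that marks which rows are real. The helper names `pad_to` and `align_tasks` are hypothetical, and the random arrays stand in for the real test splits.

```python
import numpy as np

def pad_to(arr, n_rows):
    """Zero-pad a 2-D array along axis 0 until it has n_rows rows."""
    missing = n_rows - arr.shape[0]
    return np.pad(arr, ((0, missing), (0, 0)))

def align_tasks(inputs, outputs):
    """Pad every task to the largest sample count and mark real rows with weight 1."""
    n_max = max(x.shape[0] for x in inputs)
    padded_x = [pad_to(x, n_max) for x in inputs]
    padded_y = [pad_to(y, n_max) for y in outputs]
    # Weight 1.0 for original rows, 0.0 for padding rows.
    weights = [(np.arange(n_max) < x.shape[0]).astype("float32") for x in inputs]
    return padded_x, padded_y, weights

# Stand-ins with the test-set sizes from the error message: 19, 19 and 43 samples.
rng = np.random.default_rng(0)
test_inputs = [rng.normal(size=(n, 28)) for n in (19, 19, 43)]
test_outputs = [rng.normal(size=(n, 1)) for n in (19, 19, 43)]

padded_x, padded_y, weights = align_tasks(test_inputs, test_outputs)
```

The weight vectors are meant to be passed through the `sample_weight` argument of `fit` or `evaluate` so that padded rows do not count toward each output's loss. Whether per-output weights are accepted in this list form depends on the Keras version, so that part is an assumption to verify. Note also that with a concatenated input, each row still mixes samples from all three schools, which padding does not change.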
I am using the DeepCTR (version 0.7.5) Keras library to predict CTR (using DeepFM):
https://deepctr-doc.readthedocs.io/en/latest/deepctr.models.deepfm.html.
Here is a small example of the code to fit the model:
# Imports, then feature preparation...
model = DeepFM(linear_feature_columns, dnn_feature_columns, task='binary')
model.compile(optimizer, loss)
train_model_input = [train_df[name] for name in feature_names]
model.fit(x=train_model_input, y=train_df[TARGET].values, validation_split=0.3)
But when I try the following:
e = shap.DeepExplainer(model, test_df.head(50))
I get the following error:
ValueError: Cannot feed value of shape (50, 17) for Tensor 'x:0', which has shape '(?, 1)'
I looked all over Google and tried playing a lot with the input shapes and the SHAP API, but nothing worked for me.
Additional info:
The model inputs format (17 values) is:
[<tf.Tensor 'x:0' shape=(?, 1) dtype=int32>, <tf.Tensor 'x1:0' shape=(?, 1) dtype=int32>...
And the output is:
<tf.Tensor 'prediction_layer/Reshape:0' shape=(?, 1) dtype=float32>
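The error message says a single (50, 17) array is being fed to an input that expects (?, 1), while the model lists 17 such inputs. A minimal sketch of splitting one 2-D array into a list of 17 column arrays follows. The array `batch` is a stand-in for `test_df.head(50).values`, and whether `shap.DeepExplainer` then works with this DeepFM model has not been checked here.

```python
import numpy as np

n_features = 17

# Stand-in for test_df.head(50).values: 50 rows, one column per model input.
batch = np.arange(50 * n_features).reshape(50, n_features)

# Slice with i:i + 1 (not i) so each piece keeps a trailing axis of size 1.
model_inputs = [batch[:, i:i + 1] for i in range(n_features)]
```

This mirrors how `train_model_input` is built as a list with one entry per feature name in the fitting code.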
I am trying to build an Inception model as described here:
https://towardsdatascience.com/deep-learning-for-time-series-classification-inceptiontime-245703f422db
It all works so far, but when I try to implement the shortcut layer and add the two tensors together I get an error.
Here is my shortcut code:
def shortcut_layer(inputs, z_interception):
    print(inputs.shape)
    inputs = keras.layers.Conv1D(filters=int(z_interception.shape[-1]), kernel_size=1, padding='same', use_bias=False)(inputs)
    print(z_interception.shape[-1])
    print(inputs.shape, z_interception.shape)
    inputs = keras.layers.BatchNormalization()(inputs)
    z = keras.layers.Add()([inputs, z_interception])
    print('zshape: ', z.shape)
    return keras.layers.Activation('relu')(z)
The output is as follows:
(None, 160, 8)
128
(None, 160, 128) (None, 160, 128)
The output is exactly as I expect it to be, but I still get the error:
ValueError: Operands could not be broadcast together with shapes (160, 128) (160, 8)
which doesn't make sense to me as I try to add the two tensors with shape: (None, 160, 128)
I hope someone can help me with this. Thank you in advance.
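To make the shapes concrete, here is a NumPy sketch of what the shortcut computes, leaving out batch normalization: a kernel-size-1 convolution without bias is a matrix product over the channel axis, which projects 8 channels to 128 so the addition is valid. The random arrays and the name `kernel` are assumptions for illustration. The (160, 8) in the error is the shape of the unprojected input, which may mean that some addition in the model receives a tensor that did not go through the projection, although that is only a guess.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 160, 8))          # (batch, time steps, channels)
z_interception = rng.normal(size=(2, 160, 128))
kernel = rng.normal(size=(8, 128))             # weights of a 1x1 convolution, no bias

# A kernel-size-1 convolution mixes channels independently at every time step.
projected = inputs @ kernel

# After the projection the shapes agree, so add and apply ReLU.
out = np.maximum(projected + z_interception, 0.0)

# Adding the unprojected input fails, with the same shape pair as in the error.
try:
    inputs + z_interception
    unprojected_add_ok = True
except ValueError:
    unprojected_add_ok = False
```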
I get this error and I'm not sure how to reshape when one of the dimensions is None.
Exception: Error when checking : expected input_1 to have shape (None, 192) but got array with shape (192, 1)
How do I reshape an array to (None, 192)?
I have the array accuracy with shape (12, 16), and accuracy.reshape(-1) gives (192,). However, this is not (None, 192).
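For reference, a short sketch of the reshapes in question. In a Keras shape, None stands for a batch axis of any length, so a single sample of 192 values would have shape (1, 192). The zero-filled array is a stand-in for the real `accuracy` data.

```python
import numpy as np

accuracy = np.zeros((12, 16))        # stand-in for the real array

flat = accuracy.reshape(-1)          # one axis holding all 192 values
single = accuracy.reshape(1, -1)     # one sample with 192 features
column = accuracy.reshape(-1, 1)     # 192 samples with 1 feature each
```

Here `single` has the layout that (None, 192) describes, while `column` has the (192, 1) layout reported in the error.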
In keras/keras/engine/training.py
def standardize_input_data(data, names, shapes=None,
                           check_batch_dim=True,
                           exception_prefix=''):
    ...
    # check shapes compatibility
    if shapes:
        for i in range(len(names)):
            ...
            for j, (dim, ref_dim) in enumerate(zip(array.shape, shapes[i])):
                if not j and not check_batch_dim:
                    # skip the first axis
                    continue
                if ref_dim:
                    if ref_dim != dim:
                        raise Exception('Error when checking ' + exception_prefix +
                                        ': expected ' + names[i] +
                                        ' to have shape ' + str(shapes[i]) +
                                        ' but got array with shape ' +
                                        str(array.shape))
Comparing that with the error
Error when checking : expected input_1 to have shape (None, 192) but got array with shape (192, 1)
So it is comparing (None, 192) with (192, 1) and skipping the 1st axis; that is, it compares 192 and 1. If the array had shape (n, 192), it would probably pass.
So basically, whatever is generating the (192, 1) shape, as opposed to (1, 192) or a broadcastable (192,), is causing the error.
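The comparison described above can be sketched as a small standalone function. This is a simplified stand-in for the loop in the Keras source, not the library's actual API; the name `shape_matches` and its default for `check_batch_dim` are assumptions chosen to mirror the case in the error.

```python
def shape_matches(array_shape, ref_shape, check_batch_dim=False):
    """Compare shapes axis by axis; a None (or 0) reference dimension accepts any size."""
    for j, (dim, ref_dim) in enumerate(zip(array_shape, ref_shape)):
        if j == 0 and not check_batch_dim:
            # Skip the batch axis.
            continue
        if ref_dim and ref_dim != dim:
            return False
    return True

column_ok = shape_matches((192, 1), (None, 192))   # second axis: 1 vs 192
row_ok = shape_matches((1, 192), (None, 192))      # second axis: 192 vs 192
batch_ok = shape_matches((5, 192), (None, 192))    # any batch size is accepted
```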
I'm adding keras to the tags on the guess that this is the problem module.
Searching other keras tagged SO questions:
Exception: Error when checking model target: expected dense_3 to have shape (None, 1000) but got array with shape (32, 2)
Error: Error when checking model input: expected dense_input_6 to have shape (None, 784) but got array with shape (784L, 1L)
Dimensions not matching in keras LSTM model
Getting shape dimension errors with a simple regression using Keras
Deep autoencoder in Keras converting one dimension to another i
I don't know enough about keras to understand the answers, but there's more to it than simply reshaping your input array.