Unable to define first layer of DNN in keras - python

I am trying to design a multi-class classification neural network.
My train_x dataset contains 23 examples, each with 37 features (dimension: 23*37).
train_y contains the output for each example (dimension: 23*7, i.e. 7 labels/classes). I used one-hot encoding for each example's output.
len(words) is the number of features.
This is my model design:
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=len(words), input_shape=[len(words)]),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(len(labels), activation="softmax")
])
For the optimizer I used Adam, and for the loss function I used sparse categorical cross-entropy.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=['accuracy'])
model.fit(train_x, train_y, epochs=100)
I am getting the following traceback call:
Epoch 1/100
Traceback (most recent call last):
File "main.py", line 83, in <module>
model.fit(train_x, train_y, epochs=100, callbacks=[callbacks])
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 66, in _method_wrapper
return method(self, *args, **kwargs)
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 848, in fit
tmp_logs = train_function(iterator)
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\def_function.py", line 580, in __call__
result = self._call(*args, **kwds)
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\def_function.py", line 644, in _call
return self._stateless_fn(*args, **kwds)
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\function.py", line 2420, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\function.py", line 1661, in _filtered_call
return self._call_flat(
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\function.py", line 1745, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\function.py", line 593, in call
outputs = execute.execute(
File "C:\Users\aaman\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [23,7] and labels shape [161]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at main.py:83) ]] [Op:__inference_train_function_709]
Function call stack:
train_function
I have been searching various sites for two days, but all of them flatten the input data in the first layer. All of them use either grayscale or RGB images as input, which requires the first layer to flatten the data. My input data, however, is already flat.
As far as I understand, the traceback points at the first layer. I may have misunderstood the concepts of units and input_shape and therefore defined them incorrectly.

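Change sparse_categorical_crossentropy to categorical_crossentropy. Your labels are one-hot encoded (shape 23*7), but sparse_categorical_crossentropy expects a single integer class index per example. That is why the error reports labels of shape [161] (23*7 flattened) against logits of shape [23,7]. The first layer is not the cause.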
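The two label formats can be sketched with NumPy alone. This is only an illustration with random stand-in data: the arrays `labels_int`, `train_y` and `train_y_sparse` are made up here and are not the asker's real data.

```python
import numpy as np

num_examples, num_classes = 23, 7

# Stand-in integer labels, one class index per example (hypothetical data).
rng = np.random.default_rng(0)
labels_int = rng.integers(0, num_classes, size=num_examples)

# One-hot encoding as described in the question: shape (23, 7).
train_y = np.eye(num_classes)[labels_int]

# Option 1: keep the one-hot labels and compile with
#           loss="categorical_crossentropy".
# Option 2: keep loss="sparse_categorical_crossentropy" and pass
#           integer class indices instead of one-hot rows.
train_y_sparse = np.argmax(train_y, axis=1)

print(train_y.shape)         # (23, 7)
print(train_y_sparse.shape)  # (23,)
```

Either option should make the label shape agree with what the chosen loss expects; which one to pick is mostly a matter of how the labels are already stored.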

Related

Error while training CNN for text classification in keras "ValueError: Input 0 is incompatible with layer"

I am building a prediction model for sequence data using the Conv1D layer provided by Keras. This is what I did:
def create_cnn_model():
    input_layer = Input(shape=(500,))
    layer = Conv1D(128, 5, activation="relu")(input_layer)
    layer = MaxPooling1D(pool_size=2)(layer)
    layer = Flatten()(layer)
    layer = Dense(128, activation='relu')(layer)
    output_layer = Dense(10, activation='softmax')(layer)
    classifier = Model(input_layer, output_layer)
    classifier.summary()
    classifier.compile(optimizer=optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return classifier
However, I am facing the following error:
Traceback (most recent call last):
File "train.py", line 71, in <module>
classifier = create_cnn_model()
File "train.py", line 60, in create_cnn_model
layer = Conv1D(128,5, activation="relu")(input_layer)
File "C:\Python368\lib\site-packages\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn_wrapper
return func(*args, **kwargs)
File "C:\Python368\lib\site-packages\keras\engine\base_layer.py", line 446, in __call__
self.assert_input_compatibility(inputs)
File "C:\Python368\lib\site-packages\keras\engine\base_layer.py", line 342, in assert_input_compatibility
str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer conv1d_1: expected ndim=3, found ndim=2
I think the input_shape in the first layer is not set up correctly. How should I set it up?
Right, Conv1D layers need 3-dimensional input: (batch, steps, channels).
I am assuming you have a univariate time series with 500 samples.
You need to write a function that splits the time series into windows of n steps.
For example:
x                    y
[t-n,...,t-2,t-1]    t
So you are basically using the last n values to predict the next value in your series.
Then your input array will have the shape [len(x), n, 1].
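The windowing step described above could look roughly like this. It is a minimal sketch: the helper name `make_windows`, the window length `n=20` and the `np.arange` stand-in series are assumptions for illustration, not part of the original answer.

```python
import numpy as np

def make_windows(series, n):
    """Split a 1-D series into (samples, n, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype="float32")
    # Each row holds n consecutive values; the target is the value that follows.
    x = np.stack([series[i:i + n] for i in range(len(series) - n)])
    y = series[n:]
    # Add a trailing channel axis so Conv1D sees (batch, steps, channels).
    return x[..., np.newaxis], y

series = np.arange(500)            # stand-in for the real time series
x, y = make_windows(series, n=20)  # n=20 is an arbitrary choice
print(x.shape, y.shape)            # (480, 20, 1) (480,)
```

With data shaped like this, the input layer would be declared as Input(shape=(n, 1)) rather than Input(shape=(500,)). If instead every example is itself a sequence of 500 values, reshaping the data to (samples, 500, 1) and using Input(shape=(500, 1)) should also satisfy the ndim=3 requirement.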

Dropout and BatchNormalization layers throw TypeError: Incompatible types: <dtype: 'variant'> vs. int32. Value is 1, model works without them

When using custom estimators in TensorFlow 2 with a model that contains BatchNorm or Dropout layers, TF fails while building the graph with the error below. It works just fine when I comment out the Dropout and BatchNorm layers.
The model I use is a simple CNN model with two conv blocks and dense layer at the end:
def build_conv_block(x: Model, filter_map_count: int, name: str):
    x = Conv2D(filter_map_count, (3, 3), name=f'{name}_conv_2d')(x)
    x = BatchNormalization(name=f'{name}_bn')(x)  # <-- error when not commented out
    x = ReLU(name=f'{name}_relu')(x)
    x = MaxPool2D((2, 2), name=f'{name}_max_pool_2d')(x)
    x = Dropout(0.25, name=f'{name}_dropout')(x)  # <-- error when not commented out
    return x

def get_model(params):
    input_image = Input(shape=params.input_shape)
    x = build_conv_block(input_image, filter_map_count=64, name='layer_1')
    x = build_conv_block(x, filter_map_count=128, name='layer_2')
    x = Flatten(name='flatten_conv')(x)
    output_pred = Dense(10, activation='softmax', name='output')(x)
    model = Model(inputs=input_image, outputs=output_pred)
    model.optimizer = Adam(learning_rate=params.learning_rate)
    return model
I have a standard train_op in the model_fn that takes mnist images and labels as input and the class as output:
# Calculate gradients
with tf.GradientTape() as tape:
    y_pred = model(features, training=training)
    loss = tf.losses.categorical_crossentropy(labels, y_pred)
if mode == tf.estimator.ModeKeys.TRAIN:
    gradients = tape.gradient(loss, model.trainable_variables)
    train_op = model.optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
Here's the traceback of the error I get:
Traceback (most recent call last):
File "F:/Projects/python/my_project/train.py", line 38, in <module>
tf.estimator.train_and_evaluate(estimator, train_spec=train_spec, eval_spec=eval_spec)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 473, in train_and_evaluate
return executor.run()
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 613, in run
return self.run_local()
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 714, in run_local
saving_listeners=saving_listeners)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 370, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1160, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1190, in _train_model_default
features, labels, ModeKeys.TRAIN, self.config)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1148, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "F:\Projects\python\my_project\model.py", line 62, in model_fn
gradients = tape.gradient(loss, model.trainable_variables)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\eager\backprop.py", line 1014, in gradient
unconnected_gradients=unconnected_gradients)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\eager\imperative_grad.py", line 76, in imperative_grad
compat.as_str(unconnected_gradients.value))
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\eager\backprop.py", line 138, in _gradient_function
return grad_fn(mock_op, *out_grads)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\ops\cond_v2.py", line 120, in _IfGrad
true_graph, grads, util.unique_grad_fn_name(true_graph.name))
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\ops\cond_v2.py", line 395, in _create_grad_func
func_graph=_CondGradFuncGraph(name, func_graph))
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\framework\func_graph.py", line 915, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\ops\cond_v2.py", line 394, in <lambda>
lambda: _grad_fn(func_graph, grads), [], {},
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\ops\cond_v2.py", line 373, in _grad_fn
src_graph=func_graph)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 550, in _GradientsHelper
gradient_uid)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 175, in _DefaultGradYs
constant_op.constant(1, dtype=y.dtype, name="grad_ys_%d" % i)))
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\framework\constant_op.py", line 227, in constant
allow_broadcast=True)
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\framework\constant_op.py", line 265, in _constant_impl
allow_broadcast=allow_broadcast))
File "F:\Python\envs\tf2\lib\site-packages\tensorflow_core\python\framework\tensor_util.py", line 484, in make_tensor_proto
(dtype, nparray.dtype, values))
TypeError: Incompatible types: <dtype: 'variant'> vs. int32. Value is 1
It looks similar to the error mentioned in TF Issue #31894, but what is described there does not seem to solve this problem. The TypeError does not say much about where and why the error happens, and googling it directly does not help.
Although it may not be obvious from the TypeError (variant vs. int32), a careful look at the traceback shows that the error occurs while computing the gradients:
File "F:\Projects\python\my_project\model.py", line 62, in model_fn
gradients = tape.gradient(loss, model.trainable_variables)
Also, note that the same error occurs even if only one of the two layer types is present. At first glance BatchNormalization and Dropout have little in common, but they are the only layers in the model that behave differently in the train and test phases: dropout does not zero out values in the test phase, and batch norm uses the moving mean and variance during the test phase.
So the problem is narrowed down to using any layer that has a different train/test phase. TensorFlow determines whether training mode is on from the training argument passed to the model.
This problem can be solved by using
y_pred = model(features, training=True)
when finding the gradients i.e. for the training phase and by using
y_pred = model(features, training=False)
otherwise i.e. for predict and eval phases.
Linked: errors where the moving mean is not updated have also been reported, and they can be solved by passing the same argument.
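To see why the training flag matters for such layers, here is a small NumPy sketch of inverted dropout. It is not Keras's implementation, only an illustration; the `dropout` function and its `rng` parameter are hypothetical names introduced for this example.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: random masking when training, identity otherwise."""
    if not training:
        # Eval/predict phase: the input passes through unchanged.
        return x
    # Training phase: each value is dropped with probability `rate`.
    keep = rng.random(x.shape) >= rate
    # Scale the kept values so the expected activation stays the same.
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((2, 4))
train_out = dropout(x, rate=0.25, training=True, rng=rng)   # zeros and 1/0.75
eval_out = dropout(x, rate=0.25, training=False, rng=rng)   # identical to x
```

In the model_fn from the question, one way to get the right flag in every phase (an assumption about how `training` is set there) would be to derive it from the mode, for example training = (mode == tf.estimator.ModeKeys.TRAIN).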

ValueError: A target array with shape (32, 3) was passed for an output of shape (None, 2) while using as loss `binary_crossentropy`. In Keras model

I am trying to ensemble binary pre-trained Keras models into one multi-class model using a voting system. Each binary pre-trained model is trained on a different class. To ensemble the models, I am following this blog.
Here is the code
models = []
for i in os.listdir(model_root):  # load all the models
    print(i)
    filename = model_root + "/" + i
    # load model
    model = load_model(filename, custom_objects={'KerasLayer': hub.KerasLayer})
    models.append(model)
print(len(models))  # 3
# Fit the loaded models to the data and save them in the list fit_models
fit_models = []
steps_per_epoch = image_data.samples // image_data.batch_size
batch_stats = CollectBatchStats()
validation_steps = image_data_val.samples / image_data_val.batch_size
for i in range(len(models)):
    models[i].fit_generator((item for item in image_data), epochs=2,
                            steps_per_epoch=steps_per_epoch,  # callbacks=[batch_stats],
                            validation_data=(item for item in image_data_val),
                            validation_steps=validation_steps, verbose=2)
    fit_models.append(models[i])
Here is the traceback to the error:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Program Files\JetBrains\PyCharm 2019.2\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2019.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/Pawandeep/Desktop/Python projects/ensemble_image.py", line 89, in <module>
validation_data=(item for item in image_data_val), validation_steps=validation_steps, verbose=2)
File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1433, in fit_generator
steps_name='steps_per_epoch')
File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training_generator.py", line 264, in model_iteration
batch_outs = batch_function(*batch_data)
File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1153, in train_on_batch
extract_tensors_from_dataset=True)
File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2692, in _standardize_user_data
y, self._feed_loss_fns, feed_output_shapes)
File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training_utils.py", line 549, in check_loss_and_target_compatibility
' while using as loss `' + loss_name + '`. '
ValueError: A target array with shape (32, 3) was passed for an output of shape (None, 2) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.
Data definition
#define data
image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1 / 255, validation_split=0.20)
IMAGE_SIZE= (224,224)
image_data = image_generator.flow_from_directory(str(data_root), target_size=IMAGE_SIZE, subset='training')
image_data_val = image_generator.flow_from_directory(str(data_root), target_size=IMAGE_SIZE, subset='validation')
My data looks like this:
Image batch shape: (32, 224, 224, 3)
Label batch shape: (32, 3)
I tried to print out the shape of each model in models array. It is
(32, 2)
Now I understand the problem: each of my models is a binary model trained on a single class, so its output has shape (32, 2), while my label batch has shape (32, 3). I want to combine these binary models into an ensemble so that together they act as a multi-class model. Then, based on the predictions of this ensemble, I want to label my dataset. How can I achieve this?

LSTM value error connected to the initializer

I am using Keras to build an LSTM model.
def LSTM_model_1(X_train, Y_train, Dropout, hidden_units):
    model = Sequential()
    model.add(Masking(mask_value=666, input_shape=(X_train.shape[1], X_train.shape[2])))
    model.add(LSTM(hidden_units, activation='tanh', return_sequences=True, dropout=Dropout))
    model.add(LSTM(hidden_units, return_sequences=True))
    model.add(LSTM(hidden_units, return_sequences=True))
    model.add(Dense(Y_train.shape[-1], activation='softmax'))
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['categorical_accuracy'])
    return model
The input data is of shape
X_train.shape=(77,100,34); Y_train.shape=(77,100,7)
The Y data is one-hot encoded. Both input tensors are zero-padded for the last list entry. The padded values in Y_train are 0, so no state gets a value of 1 for the padded end. I use dropout=0 and hidden_units=2, which seem unrelated to the following error.
Unfortunately, I get the following error, which I think is connected with the shape of Y, but I cannot put my finger on it. The error happens when the first LSTM layer is initialized/added.
ValueError: Initializer for variable lstm_58/kernel/ is from inside a
control-flow construct, such as a loop or conditional. When creating a
variable inside a loop or conditional, use a lambda as the
initializer.
If I follow the error, I notice that it comes down to this:
dtype: If set, initial_value will be converted to the given type.
If None, either the datatype will be kept (if initial_value is
a Tensor), or convert_to_tensor will decide.
convert_to_tensor creates an object which is then None and leads to the error. Apparently, the LSTM tries to convert the input into a tensor, but if I look at my input, it is already a tensor.
Does anyone have an idea what went wrong, or how to use a lambda as an initializer? Thanks.
Edit: the stack trace
File "C:\Users\310122653\Documents\GitHub\DNN\build_model.py", line 44, in LSTM_model_1
model.add(LSTM(hidden_units, activation='tanh', return_sequences=True, dropout=Dropout))
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\models.py", line 492, in add
output_tensor = layer(self.outputs[0])
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 499, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\topology.py", line 592, in __call__
self.build(input_shapes[0])
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 461, in build
self.cell.build(step_input_shape)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 1838, in build
constraint=self.kernel_constraint)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\topology.py", line 416, in add_weight
constraint=constraint)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 395, in variable
v = tf.Variable(value, dtype=tf.as_dtype(dtype), name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 235, in __init__
constraint=constraint)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 356, in _init_from_args
"initializer." % name)
The solution, in this case, was to restart the Kernel.
Thanks to Daniel Möller

Keras: input_shape=train_data.shape produces "list index out of range"

I want to use Keras to build a CNN-LSTM network. However, I have trouble finding the right shape for the first layer's input_shape parameter.
My train_data is an ndarray of shape (1433, 32, 32), i.e. 1433 pictures of size 32x32.
As found in this example, I tried using input_shape=train_data.shape[1:], which results in the same error as input_shape=train_data.shape:
IndexError: list index out of range
The relevant code is:
train_data, train_labels = get_training_data()
# train_data = train_data.reshape(train_data.shape + (1,))
model = Sequential()
model.add(TimeDistributed(Conv2D(
    CONV_FILTER_SIZE[0],
    CONV_KERNEL_SIZE,
    activation="relu",
    padding="same"),
    input_shape=train_data.shape[1:]))
All the results I found for this error were produced under different circumstances, not through input_shape. So how do I have to shape my input? Or do I have to look for the error somewhere completely different?
Update:
Complete error:
Traceback (most recent call last):
File "trajecgen_keras.py", line 131, in <module>
tf.app.run()
File "/home/.../lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "trajecgen_keras.py", line 85, in main
input_shape=train_data.shape))
File "/home/.../lib/python3.5/site-packages/keras/models.py", line 467, in add
layer(x)
File "/home/.../lib/python3.5/site-packages/keras/engine/topology.py", line 619, in __call__
output = self.call(inputs, **kwargs)
File "/home/.../lib/python3.5/site-packages/keras/layers/wrappers.py", line 211, in call
y = self.layer.call(inputs, **kwargs)
File "/home/.../lib/python3.5/site-packages/keras/layers/convolutional.py", line 168, in call
dilation_rate=self.dilation_rate)
File "/home/.../lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 3335, in conv2d
data_format=tf_data_format)
File "/home/.../lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 753, in convolution
name=name, data_format=data_format)
File "/home/.../lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 799, in __init__
input_channels_dim = input_shape[num_spatial_dims + 1]
File "/home/../lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 521, in __getitem__
return self._dims[key]
IndexError: list index out of range
When using a TimeDistributed layer combined with a Conv2D layer, it seems that input_shape requires a tuple of length 4 at least: input_shape = (number_of_timesteps, height, width, number_of_channels).
You could try to modify your code like this for example:
model = Sequential()
model.add(TimeDistributed(Conv2D(
    CONV_FILTER_SIZE[0],
    CONV_KERNEL_SIZE,
    activation="relu",
    padding="same"),
    input_shape=(None, 32, 32, 1)))
More info here.
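The training array then has to be 5-dimensional: (sequences, timesteps, height, width, channels). A minimal NumPy sketch of one way to get there, assuming the 1433 images are consecutive frames that can be grouped into fixed-length sequences; the sequence length `timesteps = 10` and the zero-filled `train_data` are placeholders, not values from the question.

```python
import numpy as np

# Stand-in for the 1433 grayscale 32x32 images from the question.
train_data = np.zeros((1433, 32, 32), dtype="float32")

timesteps = 10  # hypothetical sequence length; choose what fits the data

# Add the channel axis: (1433, 32, 32) -> (1433, 32, 32, 1).
frames = train_data[..., np.newaxis]

# Drop leftover frames so the count divides evenly, then group into sequences.
usable = (len(frames) // timesteps) * timesteps
sequences = frames[:usable].reshape(-1, timesteps, 32, 32, 1)

print(sequences.shape)      # (143, 10, 32, 32, 1)
print(sequences.shape[1:])  # (10, 32, 32, 1)
```

Here sequences.shape[1:] is a 4-tuple and so has the form that input_shape needs for TimeDistributed(Conv2D(...)). The labels would have to be grouped the same way, which depends on what each label refers to.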
