Training a CNN model: UnimplementedError: Graph execution error (Python)

I am a bit confused, as I have never encountered this error before. I am trying to train my CNN model on images. Below you can see my code, followed by the error message. As you can see, training starts at epoch 1 and then stops :(
Does anyone have any idea where the problem comes from? Has anyone had a similar error message when training a CNN?
Any help is welcome,
Thanks
history = modelA.fit(train_data,
                     validation_data = test_data,
                     epochs = 60,
                     callbacks = [best_model, reduce_lr, es])
ERROR MESSAGE
Epoch 1/60
---------------------------------------------------------------------------
UnimplementedError Traceback (most recent call last)
<ipython-input-68-4b47ff852a2a> in <module>()
2 validation_data = test_data,
3 epochs = 60,
----> 4 callbacks = [best_model, reduce_lr, es])
1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
53 ctx.ensure_initialized()
54 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 55 inputs, attrs, num_outputs)
56 except core._NotOkStatusException as e:
57 if name is not None:
UnimplementedError: Graph execution error:

Related

Model.fit tensorflow Issue

model.fit(X_train, y_train, batch_size=128, epochs=30)
I am using this and I got this error:
Epoch 1/30
Output exceeds the size limit. Open the full output data in a text editor
UnimplementedError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_1768\4221927022.py in
----> 1 model.fit(X_train, y_train, batch_size=128, epochs=30)
c:\Users\decil\anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
c:\Users\decil\anaconda3\lib\site-packages\tensorflow\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
50 try:
51 ctx.ensure_initialized()
---> 52 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
53 inputs, attrs, num_outputs)
54 except core._NotOkStatusException as e:
UnimplementedError: Graph execution error:
Detected at node 'sequential/Cast' defined at (most recent call last):
File "c:\Users\decil\anaconda3\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\Users\decil\anaconda3\lib\runpy.py", line 87, in _run_code
...
File "c:\Users\decil\anaconda3\lib\site-packages\keras\engine\functional.py", line 762, in _conform_to_reference_input
tensor = tf.cast(tensor, dtype=ref_input.dtype)
Node: 'sequential/Cast'
Cast string to float is not supported
[[{{node sequential/Cast}}]] [Op:__inference_train_function_529]
Please help me with this issue.
I see the problem is here: "Cast string to float is not supported". Basically, you're passing a string (maybe the labels?) where the model expects a number (float). But I don't have enough info to help you any further.
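To make the "Cast string to float" point concrete, here is a minimal sketch of converting string data to numbers before calling model.fit. The sample values and every variable name are hypothetical, since the question does not show how X_train and y_train were built.

```python
# Hypothetical raw data: numeric features read as text, labels as class names.
X_raw = [["1.5", "2.0"], ["0.3", "4.1"], ["2.2", "0.7"]]
y_raw = ["cat", "dog", "cat"]

# Numeric strings can be converted directly to floats.
X_train = [[float(value) for value in row] for row in X_raw]

# Class names need an integer encoding: one index per distinct name.
classes = sorted(set(y_raw))
class_to_index = {name: index for index, name in enumerate(classes)}
y_train = [class_to_index[name] for name in y_raw]

print(X_train)
print(y_train, classes)
```

The resulting lists can be wrapped with np.asarray(..., dtype="float32") before being passed to model.fit. Whether the strings are in the features or in the labels has to be checked on the real data, for example by inspecting the dtype of each array.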

InvalidArgumentError: Graph execution error for MobileNet model

I'm rather new to the world of DL and I'm trying to run a MobileNet model for multi-class classification, but I keep getting this InvalidArgumentError when I run it:
Code snippet:
datagen = ImageDataGenerator(
    rotation_range=0.5,
    zoom_range = 0.5,
    width_shift_range=0.5,
    height_shift_range=0.5,
    horizontal_flip=True,
    vertical_flip=True)
datagen.fit(x_train)
model = applications.mobilenet.MobileNet(weights = "imagenet", include_top=False, input_shape = (150, 150, 3))
for layer in model.layers[:5]:
    layer.trainable = False
#Adding custom Layers
x = model.output
x = Flatten()(x)
x = Dense(512, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(1024, activation="relu")(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation="sigmoid")(x)
model = Model(inputs=model.input, outputs=predictions)
model.compile(loss='categorical_crossentropy',optimizer = "Adam",metrics=['acc'])
history = model.fit(datagen.flow(x_train,y_train, batch_size=50),
                    epochs = 20, validation_data = (x_val,y_val), steps_per_epoch=80)
The error that I get:
WARNING:tensorflow:`input_shape` is undefined or non-square, or `rows` is not in [128, 160, 192, 224]. Weights for input shape (224, 224) will be loaded as the default.
Epoch 1/20
2022-10-20 21:32:06.755928: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
Cell In [20], line 18
15 model = Model(inputs=model.input, outputs=predictions)
16 model.compile(loss='binary_crossentropy',optimizer = "Adam",metrics=['acc'])
---> 18 history = model.fit(datagen.flow(x_train,y_train, batch_size=50),
19 epochs = 20, validation_data = (x_val,y_val), steps_per_epoch=80)
File ~/miniforge3/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File ~/miniforge3/lib/python3.10/site-packages/tensorflow/python/eager/execute.py:54, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
52 try:
53 ctx.ensure_initialized()
---> 54 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
55 inputs, attrs, num_outputs)
56 except core._NotOkStatusException as e:
57 if name is not None:
InvalidArgumentError: Graph execution error:
Does anyone know how to solve this error? I'm not sure whether it has to do with the size of my input images.
My TensorFlow version is 2.10.0 and I'm running my code in VS Code on an M2 Mac.
Thank you!!
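The error text above is cut off after "Graph execution error:", so the actual cause cannot be read from it. One mismatch is visible in the snippet, though: the question describes multi-class classification, but the model ends in a single sigmoid unit. With categorical_crossentropy, labels are normally one-hot encoded with one column per class, and the last layer has the same number of units (usually with softmax). A minimal sketch of that encoding, assuming integer class labels; the one_hot helper and the sample labels are made up for illustration:

```python
def one_hot(labels, num_classes):
    """Return one row per label, with a 1.0 in the column of its class."""
    return [[1.0 if column == label else 0.0 for column in range(num_classes)]
            for label in labels]

# Hypothetical integer labels for a 3-class problem.
y = [0, 2, 1]
num_classes = max(y) + 1

y_one_hot = one_hot(y, num_classes)

# The final Dense layer would then need num_classes units to match this width.
print(num_classes, y_one_hot)
```

In a Keras setup, tf.keras.utils.to_categorical performs the same conversion. Whether this is what triggers the error here is an assumption; comparing y_train.shape with model.output_shape is a quick way to check.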

InvalidArgumentError: Graph execution error: Detected at node

My idea is to train a collaborative filter model for arts. I'm trying to train my model like this:
def utils_plot_keras_training(training):
    metrics = [k for k in training.history.keys() if ("loss" not in k) and ("val" not in k)]
    fig, ax = plt.subplots(nrows=1, ncols=2, sharey=True, figsize=(15,3))
    ax[0].set(title="Training")
    ax11 = ax[0].twinx()
    ax[0].plot(training.history['loss'], color='black')
    ax[0].set_xlabel('Epochs')
    ax[0].set_ylabel('Loss', color='black')
    for metric in metrics:
        ax11.plot(training.history[metric], label=metric)
    ax11.set_ylabel("Score", color='steelblue')
    ax11.legend()
    ax[1].set(title="Validation")
    ax22 = ax[1].twinx()
    ax[1].plot(training.history['val_loss'], color='black')
    ax[1].set_xlabel('Epochs')
    ax[1].set_ylabel('Loss', color='black')
    for metric in metrics:
        ax22.plot(training.history['val_'+metric], label=metric)
    ax22.set_ylabel("Score", color="steelblue")
    plt.show()
training = model.fit(x=[train["user_id"], train["art_id"]], y=train["y"],
                     epochs=100, batch_size=128, shuffle=True, verbose=0, validation_split=0.3)
model = training.model
utils_plot_keras_training(training)
And I get the following error:
--------------------------------------------------------------------------- InvalidArgumentError Traceback (most recent call
last) Input In [30], in <cell line: 2>()
1 # train
----> 2 training = model.fit(x=[train["user_id"], train["art_id"]], y=train["y"],
3 epochs=100, shuffle=True, verbose=0, validation_split=0.3)
4 model = training.model
5 utils_plot_keras_training(training)
File
~\DataspellProjects\Arts\venv\lib\site-packages\keras\utils\traceback_utils.py:67,
in filter_traceback..error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.traceback)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
File
~\DataspellProjects\Arts\venv\lib\site-packages\tensorflow\python\eager\execute.py:54,
in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
52 try:
53 ctx.ensure_initialized()
---> 54 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
55 inputs, attrs, num_outputs)
56 except core._NotOkStatusException as e:
57 if name is not None:
InvalidArgumentError: Graph execution error:
Detected at node 'CollaborativeFiltering/xusers_emb/embedding_lookup'
defined at (most recent call last):
File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals) ........
Node: 'CollaborativeFiltering/xusers_emb/embedding_lookup'
indices[28,0] = 1000 is not in [0, 1000) [[{{node
CollaborativeFiltering/xusers_emb/embedding_lookup}}]]
[Op:__inference_test_function_2209]
Any thoughts on how to resolve it? Full code and datasets are here: Github.
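The message "indices[28,0] = 1000 is not in [0, 1000)" says the xusers_emb layer was asked for row 1000 of a table that only has rows 0 to 999. A minimal sketch of two common ways out, assuming the IDs are plain integers; the sample user_ids and all variable names are hypothetical and not taken from the linked code:

```python
# Hypothetical user IDs; the real ones would come from train["user_id"].
user_ids = [1, 7, 1000, 7, 42]

# Option 1: size the table so the largest ID fits.
# An Embedding with input_dim=N accepts indices 0..N-1, so ID 1000 needs N >= 1001.
min_input_dim = max(user_ids) + 1

# Option 2: remap the IDs to contiguous indices 0..n_users-1 and
# size the table by the number of distinct users instead.
unique_ids = sorted(set(user_ids))
id_to_index = {uid: index for index, uid in enumerate(unique_ids)}
user_idx = [id_to_index[uid] for uid in user_ids]
n_users = len(unique_ids)

print(min_input_dim, n_users, user_idx)
```

The same check applies to art_id and its embedding. Because the failing op is __inference_test_function, the out-of-range ID appears to sit in the validation split, so the mapping should be built from all IDs, not only the training rows.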

Matrix size-incompatible: In[0]: [1,501760], In[1]: [25088,1024]

I am trying to use a pre-trained VGG16 network for classification. For this I wrote the following code:
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense, Dropout
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
flat1 = Flatten()(base_model.outputs)
class1 = Dense(1024, activation='relu')(flat1)
drop1 = Dropout(0.5)(class1)
class2 = Dense(512, activation='relu')(drop1)
drop2 = Dropout(0.5)(class2)
class3 = Dense(256, activation='relu')(drop2)
drop3 = Dropout(0.5)(class3)
class4 = Dense(128, activation='relu')(drop3)
drop4 = Dropout(0.5)(class4)
output = Dense(4, activation='softmax')(drop4)
model = Model(inputs=base_model.inputs, outputs=output)
model.summary()
But when I try to fit the model using:
history = model.fit_generator(train_generator, steps_per_epoch=70, epochs=40, validation_data=validation_generator, validation_steps=20, verbose=0)
I get this error, which I am unable to understand:
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-26-11baa76f8cd2> in <module>()
8 #mc = ModelCheckpoint('vgg16_end_to_end.hdf5', monitor='val_loss', mode='min', save_best_only=True)
9
---> 10 history = model.fit_generator(train_generator, steps_per_epoch=70, epochs=40, validation_data=validation_generator, validation_steps=20, verbose=0)
9 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:
InvalidArgumentError: Matrix size-incompatible: In[0]: [1,501760], In[1]: [25088,1024]
[[node dense_8/MatMul (defined at /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_5022]
Function call stack:
keras_scratch_graph
I have been training with this model for the last few days and it was running fine. What suddenly happened? Please help, I'm going crazy.
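The two shapes in the message can be unpacked with a little arithmetic. VGG16 without its top layers maps a 224x224x3 image to a 7x7x512 feature map, which is where the 25088 rows of the first Dense kernel come from. A short sketch of the numbers; the implied batch size is an inference from the error text, since the generator settings are not shown:

```python
# Feature map of VGG16 (include_top=False) for a 224x224x3 input: 7 x 7 x 512.
features_per_image = 7 * 7 * 512

# Width of the tensor that actually reached dense_8/MatMul, from In[0]: [1,501760].
observed_width = 501760

# If the width is an exact multiple of the per-image size, several images
# were flattened into a single row instead of one row per image.
implied_batch, remainder = divmod(observed_width, features_per_image)

print(features_per_image, implied_batch, remainder)
```

A remainder of zero is consistent with the batch axis having been folded into the feature axis. One possible cause, stated as an assumption, is that Flatten received base_model.outputs (a Python list holding one tensor) rather than base_model.output (the tensor itself), so the list was treated as an extra leading dimension. Passing base_model.output to Flatten and checking model.summary() for a (None, 25088) flatten output would confirm or rule this out.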

AttributeError: 'OwnedIterator' object has no attribute '_get_trainable_state'

I want to use quantization-aware training (QAT) to quantize EfficientNet and run it
on Google Colaboratory.
I use its GPU (Tesla K80).
So I switched from keras to tensorflow.keras to fit the QAT API.
But there is a problem when I train the model (before using QAT).
# model compilation
model_final.compile(loss='categorical_crossentropy',
                    optimizer=tf.keras.optimizers.Adam(0.0001),
                    metrics=['accuracy', acc_top5])
mcp_save = tf.keras.callbacks.ModelCheckpoint('EnetB7_CIFAR10_TL.h5', save_best_only=True, monitor='val_acc')
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_acc', factor=0.5, patience=2, verbose=1,)
model_final.fit(x_train, y_train,
                batch_size=32,
                epochs=10,
                validation_split=0.1,
                callbacks=[mcp_save, reduce_lr],
                shuffle=True,
                verbose=1)
The error is AttributeError: 'OwnedIterator' object has no attribute '_get_trainable_state'.
I don't understand why, since I was able to run the same code on Colab twice without any changes.
Full traceback:
Epoch 1/10
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-11-199bf2f08514> in <module>()
14 callbacks=[mcp_save, reduce_lr],
15 shuffle=True,
---> 16 verbose=1)
10 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
966 options=autograph.ConversionOptions(
967 recursive=True,
--> 968 optional_features=autograph_options,
969 user_requested=True,
970 ))
AttributeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:576 _reset_compile_cache *
self._compiled_trainable_state = self._get_trainable_state()
AttributeError: 'OwnedIterator' object has no attribute '_get_trainable_state'
