I have used NSGA-Net neural architecture search to generate and train several architectures. I am trying to generate PGD adversarial examples using my trained PyTorch models. I tried using both Adversarial Robustness Toolbox 1.3 (ART) and torchattacks 2.4 but I get the same error.
These few lines of code describe the main functionality of my code and what I am trying to achieve here:
# net is my trained NSGA-Net PyTorch model
# Defining PGA attack
pgd_attack = PGD(net, eps=4 / 255, alpha=2 / 255, steps=3)
# Creating adversarial examples using validation data and the defined PGD attack
for images, labels in valid_data:
images = pgd_attack(images, labels).cuda()
outputs = net(images)
So here is what the error generally looks like:
Traceback (most recent call last):
File "torch-attacks.py", line 296, in <module>
main()
File "torch-attacks.py", line 254, in main
images = pgd_attack(images, labels).cuda()
File "\Anaconda3\envs\GPU\lib\site-packages\torchattacks\attack.py", line 114, in __call__
images = self.forward(*input, **kwargs)
File "\Anaconda3\envs\GPU\lib\site-packages\torchattacks\attacks\pgd.py", line 57, in forward
outputs = self.model(adv_images)
File "\envs\GPU\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "\codes\NSGA\nsga-net\models\macro_models.py", line 79, in forward
x = self.gap(self.model(x))
File "\Anaconda3\envs\GPU\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "\Anaconda3\envs\GPU\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "\Anaconda3\envs\GPU\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "\codes\NSGA\nsga-net\models\macro_decoder.py", line 978, in forward
x = self.first_conv(x)
File "\Anaconda3\envs\GPU\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "\Anaconda3\envs\GPU\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)
File "\Anaconda3\envs\GPU\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected object of scalar type Float but got scalar type Double for argument #2 'weight' in call to _thnn_conv2d_forward
I have used the same the code with a simple PyTorch model and it worked but I am using NSGA-Net so I haven't designed the model myself. I also tried using .float() on both the model and inputs and still got the same error.
Keep in mind that I only have access to the following files:
torch-attacks.py
macro_models.py
macro_decoder.py
You should convert images to the desired type (images.float() in your case). Labels must not be converted to any floating type. They are allowed to be either int or long tensors.
Related
I am trying to debug my tensorflow code that suddenly produces a NaN loss after about 30 epochs. You may find my specific problem and things I tried in this SO question.
I monitored the weights of all layers for each mini-batch during training and found that the weights suddenly jump to NaN although all weight values were less than 1 during the previous iteration (I have set kernel_constraint max_norm to 1). This makes it very hard to figure out which operation is the culprit.
Pytorch has a cool debugging method torch.autograd.detect_anomaly that produces an error at any backward computation that produces NaN value and shows the traceback. This makes it easy to debug the code.
Is there something similar in TensorFlow? If not can you suggest a method to debug this?
There is indeed a similar debugging tool in tensorflow. See tf.debugging.check_numerics.
This can be used to track the tensors that produce inf or nan values during training. As soon as such value is found, tensorflow produces an InvalidArgumentError.
tf.debugging.check_numerics(LayerN, "LayerN is producing nans!")
If the tensor LayerN has nans, you would get an error like that:
Traceback (most recent call last):
File "trainer.py", line 506, in <module>
worker.train_model()
File "trainer.py", line 211, in train_model
l, tmae = train_step(*batch)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
result = self._call(*args, **kwds)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 855, in _call
return self._stateless_fn(*args, **kwds) # pylint: disable=not-callable
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 2943, in __call__
filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 560, in call
ctx=ctx)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: LayerN is producing nans! : Tensor had NaN values
I'm trying to train a convolutional neural network using Keras/Tensorflow. My model compiles correctly, but as soon as training begins the following error is returned:
Using TensorFlow backend.
Epoch 1/3
Traceback (most recent call last):
File "./main.py", line 17, in <module>
history = CNN.fit(TrainImages, TrainMasks, epochs = 3)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1239, in fit
validation_freq=validation_freq)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 196, in fit_loop
outs = fit_function(ins_batch)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py", line 3727, in __call__
outputs = self._graph_fn(*converted_inputs)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1551, in __call__
return self._call_impl(args, kwargs)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1591, in _call_impl
return self._call_flat(args, self.captured_inputs, cancellation_manager)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
ctx=ctx)
File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.**InvalidArgumentError: BiasGrad requires tensor size <= int32 max**
[[node gradients/conv2d_22/BiasAdd_grad/BiasAddGrad (defined at /home/tomhalmos/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_5496]
Function call stack:
keras_scratch_graph
Happy to provide any further details if the above is not sufficient.
The bounds checking is on the number of elements in a tensor. The size is limited to 2.147 billion values (int32).
Take your image size (h x v) times the sample batch size. Multiply that by the number of channels in your operation (such as Conv2D). The place where you get a count larger than 2.1e9 is the guilty operation. There is no solution that I can see other than reducing one of those numbers.
I change my job on GPU and it's work well.
In short, i have 2 trained models, one trained on 2 classes, the other on 3 classes.
My code loads a model, loads an image, and predicts a classification result.
finetune_model = tf.keras.models.load_model(modelPath)
model = load_model(my_file)
img = image.load_img(img_path, target_size=(img_width, img_height))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
The model file is of .h5 type.
When loading the 2-class trained model, it works fine. When i try to load the 3-class trained model, i get the title error, Traceback is below :
File "C:/Users/x/PycharmProjects/y/Learning_python.py", line 23, in <module>
dope = Prediction('Three_Classes','./1.JPEG')
File "C:\Users\x\PycharmProjects\Car_Damage_Detection_Project\Predict.py", line 37, in Prediction
model = load_model(my_file)
File "C:\Users\x\Miniconda3\envs\y\lib\site-packages\keras\engine\saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "C:\Users\RonShvartzburd\Miniconda3\envs\y\lib\site-packages\keras\engine\saving.py", line 225, in _deserialize_model
model = model_from_config(model_config, custom_objects=custom_objects)
File "C:\Users\RonShvartzburd\Miniconda3\envs\y\lib\site-packages\keras\engine\saving.py", line 458, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "C:\Users\RonShvartzburd\Miniconda3\envs\y\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
printable_module_name='layer')
File "C:\Users\x\Miniconda3\envs\y\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
list(custom_objects.items())))
File "C:\Users\x\Miniconda3\envs\y\lib\site-packages\keras\engine\network.py", line 1032, in from_config
process_node(layer, node_data)
File "C:\Users\x\Miniconda3\envs\y\lib\site-packages\keras\engine\network.py", line 991, in process_node
layer(unpack_singleton(input_tensors), **kwargs)
File "C:\Users\x\Miniconda3\envs\y\lib\site-packages\keras\engine\base_layer.py", line 431, in __call__
self.build(unpack_singleton(input_shapes))
File "C:\Users\x\Miniconda3\envs\y\lib\site-packages\keras\layers\normalization.py", line 94, in build
dim = input_shape[self.axis]
TypeError: tuple indices must be integers or slices, not list
What exactly is different between the two models? both were build and trained the same way, except the class definition. How can i go about with this issue? Thanks.
Link provided to the Git repository containing the file where the models were created, namely - modelTraining.py
https://github.com/lepilmen/Car-Damage-Detection
Your inputs must be numpy ndarrays.
After conversing with #Geeocode ,
I retrained the model again with the same code, and the new model did not produce an error.
Perhaps something happened to the previous model, and it messed up the input layer.
He reproduced a new model with 3 images, 1 per class, and also couldn't recreate the problem.
Meaning it's solved. Thanks for all the time spend helping me.
I am new to CNN and machine learning in general and have been trying to follow Image Classification tutorial of TensorFlow.
Now, the Google Colab can be found here. I've followed the this official tutorial of TensorFlow. And I've changed it slightly so it saves the model as h5 instead of tf format so I can use the Keras' model.predict_classes.
Now, I have the model trained and the model reloaded from the saved model alright. But I'm repeatedly getting list index out of range error whenever I am trying to predict the image which I am doing so:
def predict():
image = tf.io.read_file('target.jpeg')
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, [224, 224])
print(model.predict_classes(image)[0])
target.jpeg is one of the images I took from the flowers_photos dataset on which the model is trained.
The Traceback is:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in predict
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py", line 319, in predict_classes
proba = self.predict(x, batch_size=batch_size, verbose=verbose)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 821, in predict
use_multiprocessing=use_multiprocessing)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 712, in predict
callbacks=callbacks)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 187, in model_iteration
f = _make_execution_function(model, mode)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 555, in _make_execution_function
return model._make_execution_function(mode)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 2037, in _make_execution_function
self._make_predict_function()
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 2027, in _make_predict_function
**kwargs)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3544, in function
return EagerExecutionFunction(inputs, outputs, updates=updates, name=name)
File "/home/amitjoki/.local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3468, in __init__
self.outputs[0] = array_ops.identity(self.outputs[0])
IndexError: list index out of range
I searched extensively but couldn't get any solution. It would be helpful if anyone can point me in the direction of getting this up and running.
All the predict functions in Keras expect batch of inputs. Therefore, since you are doing prediction on one single image, you need to add an axis at the beginning of image tensor to represent the batch axis:
image = tf.expand_dims(image, axis=0) # the shape would be (1, 224, 224, 3)
print(model.predict_classes(image)[0])
I have a trained model which I am loading using CNTK.load_model() function. I was looking at the MNIST Tutorial on the CNTK git repo as reference for model evaluation code. I have created a data reader (which is a MinibatchSource object) and trying to run model.eval(mb) where mb = minibatch_source.next_minibatch(...) (Similar to this answer)
But, I'm getting the following error message
Traceback (most recent call last):
File "LID_test.py", line 162, in <module>
test_and_evaluate()
File "LID_test.py", line 159, in test_and_evaluate
predictions = model.eval(mb)
File "/home/t-asbahe/anaconda3/envs/cntk-py35/lib/python3.5/site-packages/cntk/ops/functions.py", line 228, in eval
_, output_map = self.forward(arguments, self.outputs, device=device, as_numpy=as_numpy)
File "/home/t-asbahe/anaconda3/envs/cntk-py35/lib/python3.5/site-packages/cntk/utils/swig_helper.py", line 62, in wrapper
result = f(*args, **kwds)
File "/home/t-asbahe/anaconda3/envs/cntk-py35/lib/python3.5/site-packages/cntk/ops/functions.py", line 354, in forward
None, device)
File "/home/t-asbahe/anaconda3/envs/cntk-py35/lib/python3.5/site-packages/cntk/utils/__init__.py", line 393, in sanitize_var_map
if len(arguments) < len(op_arguments):
TypeError: object of type 'Variable' has no len()
I have no input_variable named 'Variable' in my model and I don't see any reason to get this error.
P.S.: My inputs are sparse inputs (one-hots)
You have a few options:
Pass a set of data as numpy array (instance in CNTK 202 tutorial) where onehot data is passed in as a numpy array.
pred = model.eval({model.arguments[0]:[onehot]})
Read the minibatch data and pass it to the eval function
eval_input_map = { input : reader_eval.streams.features }
eval_data = reader_eval.next_minibatch(eval_minibatch_size,
input_map = eval_input_map)
mydata = eval_data[input].value
predicted= model.eval(mydata)