I'm trying to create a simple convolution from a 224 * 224 * 3 image down to a 56 * 56 * 3 tensor, which I want to compare to another image.
For that purpose I create a composite reader:
scale = ImageDeserializer.scale(width=224,
                                height=224,
                                channels=3,
                                scale_mode="pad",
                                pad_value=114,
                                interpolations='linear')
scale2 = ImageDeserializer.scale(width=56,
                                 height=56,
                                 channels=3,
                                 scale_mode="pad",
                                 pad_value=114,
                                 interpolations='linear')
image_source = ImageDeserializer(os.path.join(path, "images_map.txt"))
image_source.ignore_labels()
image_source.map_features('features', [scale])
mask_source = ImageDeserializer(os.path.join(path, "images_mask_map.txt"))
mask_source.ignore_labels()
mask_source.map_features('mask', [scale2])
return MinibatchSource([image_source, mask_source])
and with that reader I create an input map:
input_map = {
    input_var: reader_train["features"],
    input_var_mask: reader_train["mask"]
}
The CNN looks like this:
conv1 = cntk.layers.Convolution((5, 5), filterdims[0], pad=True, activation=cntk.ops.relu)(input_var)
maxpool1 = cntk.layers.MaxPooling((2, 2), (2, 2))(conv1)
conv2 = cntk.layers.Convolution((4, 4), filterdims[1], pad=True, activation=cntk.ops.relu)(maxpool1)
maxpool2 = cntk.layers.MaxPooling((2, 2), (2, 2))(conv2)
conv3 = cntk.layers.Convolution((4, 4), 3, pad=True, activation=cntk.ops.relu)(maxpool2)
return conv3  # shape is (3, 56, 56): two 2x2 poolings halve 224 -> 112 -> 56; conv3 = z in the error function
with inputs
input_var = cntk.ops.input_variable((3, 224, 224), np.float32)
input_var_mask = cntk.ops.input_variable((3, 56, 56), np.float32)
and the error function
f2 = cntk.ops.element_times(cntk.ops.constant(0.00390625), input_var_mask, name="f2")
err = cntk.ops.reshape(cntk.ops.minus(z, f2), (56 * 56 * 3))
sq_err = cntk.ops.element_times(err, err)
mse = cntk.ops.reduce_mean(sq_err)
rmse_loss = cntk.ops.sqrt(mse)
rmse_eval = cntk.ops.sqrt(mse)
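(As an aside, CNTK also ships a built-in squared_error op; a minimal sketch of an equivalent RMSE, reusing z and f2 from above and assuming squared_error sums the squared differences per sample, so it is normalized by the element count:)
import numpy as np
from cntk.losses import squared_error
import cntk

# Hedged alternative to the manual construction above: squared_error sums
# the squared differences, so divide by the number of elements for a mean.
n_elems = cntk.ops.constant(float(56 * 56 * 3), dtype=np.float32)
mse_alt = cntk.ops.element_divide(squared_error(z, f2), n_elems)
rmse_alt = cntk.ops.sqrt(mse_alt)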
When I try to train it, everything works fine until
data = reader_train.next_minibatch(min(minibatch_size, epoch_size - sample_count), input_map=input_map) # fetch minibatch.
trainer.train_minibatch(data) # Error as in title
where I get the following error:
cuDNN failure 7: CUDNN_STATUS_MAPPING_ERROR ; GPU=0 ; hostname=STEPHENPC
train_minibatch_overload_for_minibatchdata
return _cntk_py.Trainer_train_minibatch_overload_for_minibatchdata(self, *args)
RuntimeError: cuDNN failure 7: CUDNN_STATUS_MAPPING_ERROR ; GPU=0 ; hostname=STEPHENPC ; expr=err
cudaStreamDestroy failed (PrefetchGPUDataTransferer dtor): an illegal memory access was encountered (cuda error 77)
Can someone help me and tell me the cause of this error?
Thanks in advance.
According to other projects that have run into this issue (see for example here), this may indicate a bug in the cuDNN library. CNTK does not use texture memory, so it is best to report this problem to NVIDIA.
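One way to narrow it down is to rerun the same script on CPU; if training then succeeds, the failure is isolated to the cuDNN/GPU path. A minimal sketch, assuming the CNTK 2.x device API:
import cntk

# Force CPU execution before any graph is built or evaluated.
cntk.device.try_set_default_device(cntk.device.cpu())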
I am trying to construct a model that looks like this.
Notice that the output shape of the padding layer is 1 * 48 * 48 * 32, while the input shape to the padding layer is 1 * 48 * 48 * 16. Which type of padding operation does that?
My code:
prelu3 = tf.keras.layers.PReLU(shared_axes = [1, 2])(add2)
deptconv3 = tf.keras.layers.DepthwiseConv2D(3, strides=(2, 2), padding='same')(prelu3)
conv4 = tf.keras.layers.Conv2D(32, 1, strides=(1, 1), padding='same')(deptconv3)
maxpool1 = tf.keras.layers.MaxPool2D()(prelu3)
pad1 = tf.keras.layers.ZeroPadding2D(padding=(1, 1))(maxpool1)  # This is the padding layer where the problem lies.
This is the part of my code that tries to replicate that block. However, I get a model that looks like this.
Am I missing something here or am I using the wrong layer?
By default, Keras ZeroPadding2D takes in:
Input shape: a 4D tensor with shape (batch_size, rows, cols, channels).
Output shape: (batch_size, padded_rows, padded_cols, channels).
Please have a look at the zero_padding2d layer docs in Keras.
In that respect, you are trying to double what is getting treated as the channel axis here.
Your input looks like (batch, x, y, z) and you want (batch, x, y, 2*z).
Why do you want zero-padding to double your z? I would rather suggest you use a Dense layer, like
tf.keras.layers.Dense(32)(maxpool1)
That would increase the z dimension from 16 to 32.
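A minimal sketch of that suggestion (with illustrative shapes matching your 48 * 48 * 16 block): Dense acts only on the last axis, so it maps 16 channels to 32 while leaving the spatial dimensions untouched.
import tensorflow as tf

x = tf.keras.layers.Input(shape=(48, 48, 16))
y = tf.keras.layers.Dense(32)(x)  # Dense applies to the last axis only
print(y.shape)  # (None, 48, 48, 32)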
Edited:
I found something that can help you.
tf.keras.layers.ZeroPadding2D(
    padding=(0, 8), data_format="channels_first"
)(maxpool1)
What this does is treat your y and z as the spatial (x, y) axes and x as the channel axis, and pad (0, 8) around (y, z) to give (y, 32).
Demo:
import tensorflow as tf
input_shape = (4, 28, 28, 3)
x = tf.keras.layers.Input(shape=input_shape[1:])
y = tf.keras.layers.Conv2D(16, 3, activation='relu', dilation_rate=2, input_shape=input_shape[1:])(x)
x = tf.keras.layers.ZeroPadding2D(
    padding=(0, 8), data_format="channels_first"
)(y)
print(y.shape, x.shape)
(None, 24, 24, 16) (None, 24, 24, 32)
Using the Keras API, I am trying to write MobileNetV3 as explained in this article: https://arxiv.org/pdf/1905.02244.pdf with the architecture described in this picture:
For that, I need to implement the bottleneck_blocks from the previous article https://arxiv.org/pdf/1801.04381.pdf. See the image for the architecture:
I managed to glue together the initial and final Conv layers:
from tensorflow.keras.layers import Input, Conv2D, Add, AvgPool2D, UpSampling2D
first_input = Input(shape=(256, 256, 3))
first_conv = Conv2D(16, 3, strides=2, name="FirstConv2d", padding="same")(first_input)
bneck1 = add_bottleneck_block(first_conv, 16, 16)
bneck2 = add_bottleneck_block(bneck1, 64, 24, strides=2)
# ... Skipping all the other bottleneck blocks for simplicity
lastBneck = add_bottleneck_block(second2LastBneck, 960, 160, bneck_depth=5)
middleConv = Conv2D(160, 1, strides=1, name="MiddleConv")(lastBneck)
pool7 = AvgPool2D(7, strides=1, padding='same', name="7x7Pool")(middleConv)
SecondLastConv = Conv2D(1280, 1, strides=1, name="SecondLastConv")(pool7)
lastConv = Conv2D(3,1, strides=1, name="lastConv1x1")(SecondLastConv)
upScale = UpSampling2D(2)(lastConv) # This layer is application specific for my training.
v3 = tf.keras.models.Model(inputs=[first_input], outputs=upScale)
v3.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy(),)
v3.summary()
Where the bottleneck_block is given in the next snippet of code (modified from https://towardsdatascience.com/mobilenetv2-inverted-residuals-and-linear-bottlenecks-8a4362f4ffd5)
def bottleneck_block(x, expand=64, squeeze=16, strides=1, bneck_depth=3):
    """
    Bottleneck block with activation and batch normalization commented out,
    since I don't believe they are the issue in my problem
    """
    m = tf.keras.layers.Conv2D(expand, (1, 1), strides=1)(x)
    # m = tf.keras.layers.BatchNormalization()(m)
    # m = tf.keras.layers.Activation('relu6')(m)
    m = tf.keras.layers.DepthwiseConv2D(bneck_depth, padding='same', strides=strides)(m)
    # m = tf.keras.layers.BatchNormalization()(m)
    # m = Activation('relu6')(m)
    m = tf.keras.layers.Conv2D(squeeze, (1, 1), strides=1)(m)
    # m = tf.keras.layers.BatchNormalization()(m)
    return tf.keras.layers.Add()([m, x])
However, in bneck2 I get the following error:
ValueError: Operands could not be broadcast together with shapes (16, 16, 24) (128, 128, 16)
I know the error means the dimensions of the inputs and outputs are off, but I don't know how to fix it to structure the network as MobileNetV3.
What am I missing here?
For reference, here is source code in the tensorflow repo for the same network: https://github.com/tensorflow/models/blob/a174bf5b1db0e2c1e04697ff5aae5182bd1c60e7/research/slim/nets/mobilenet/mobilenet_v3.py#L130
The solution is to modify the bottleneck_block as described in the V3 authors' repo:
import tensorflow as tf

def bottleneck_block(x, expand=64, squeeze=16, strides=1, bneck_depth=3, se=False):
    """
    se stands for squeeze_excite
    """
    m = tf.keras.layers.Conv2D(expand, (1, 1), strides=1)(x)
    m = tf.keras.layers.BatchNormalization()(m)
    # m = tf.keras.layers.Activation('relu6')(m)
    m = tf.keras.layers.DepthwiseConv2D(bneck_depth, padding='same', strides=strides)(m)
    m = tf.keras.layers.BatchNormalization()(m)
    # m = Activation('relu6')(m)
    if se:
        m = squeeze_excite_block(m, ratio=4)
    m = tf.keras.layers.Conv2D(squeeze, (1, 1), strides=1, padding='same')(m)
    m = tf.keras.layers.BatchNormalization()(m)
    if (
        # stride check enforces that we don't add residuals when spatial
        # dimensions are None
        strides == 1 and
        # depth matches
        m.get_shape().as_list()[3] == x.get_shape().as_list()[3]
    ):
        m = tf.keras.layers.Add()([m, x])
    return m
The check on depth and stride prevents the error I initially got when adding two tensors whose dimensions do not match.
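A quick usage check with shapes like the ones that originally failed (illustrative only): with strides=2 and a channel change, the residual Add is now skipped instead of raising an error.
import tensorflow as tf

x = tf.keras.layers.Input(shape=(128, 128, 16))
out = bottleneck_block(x, expand=64, squeeze=24, strides=2)
print(out.shape)  # (None, 64, 64, 24): stride != 1 and depth differs, so no Add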
In your bottleneck layers there are Add() ops.
Now, Add expects two tensors with the same shape. But, as you have skipped so many layers, when the line tf.keras.layers.Add()([m, x]) is run, m and x have different dimensions.
So either design a smaller network with fewer layers, or implement all of the intermediate layers.
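A minimal sketch of the constraint (illustrative shapes): Add only works when the operand shapes match, and a 1x1 projection convolution on the shortcut is one common way to make them match.
import tensorflow as tf

inp = tf.keras.layers.Input(shape=(128, 128, 16))
branch = tf.keras.layers.Conv2D(24, 3, strides=2, padding='same')(inp)  # (None, 64, 64, 24)
# tf.keras.layers.Add()([branch, inp])  # ValueError: (64, 64, 24) vs (128, 128, 16)
shortcut = tf.keras.layers.Conv2D(24, 1, strides=2)(inp)  # project to (None, 64, 64, 24)
out = tf.keras.layers.Add()([branch, shortcut])  # shapes match, Add succeeds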
This is a summary of my model.
My model is basically similar to a convolutional network.
I want my model to work regardless of the width of the input, so the width appears as None.
I also want to attach a decoder to my model.
However, when I attach the decoder, an error occurs. (If I don't attach the decoder, the program works fine.)
This is my decoder part (below):
if args.decoder == True:
    decoder = ConvCapsuleLayer(kernel_size=args.kernel_size, num_capsule=4, num_atoms=1, strides=1, padding='same',
                               routings=1)(conv_cap)
    _, H, W, C, A = decoder.get_shape()
    y = layers.Input(shape=(n_class,))
    masked_by_y = Mask()([decoder, y])
    masked = Mask()(decoder)

def shared_decoder(mask_layer):
    recon_1 = layers.Conv2DTranspose(4, (5, 5), strides=(2, 2), padding='same', kernel_initializer='he_normal', name='decoder_1', activation='relu')(mask_layer)
    recon_2 = layers.Conv2DTranspose(8, (5, 5), strides=(2, 2), padding='same', kernel_initializer='he_normal', name='decoder_2', activation='relu')(recon_1)
    recon_3 = layers.Conv2DTranspose(1, (1, 1), strides=(1, 1), padding='same', kernel_initializer='he_normal', name='decoder_3', activation='linear')(recon_2)
    return recon_3

if args.decoder == True:
    train_model = models.Model(inputs=[x, y], outputs=[out_seg, shared_decoder(masked_by_y)])  # [x: image, y: mask] // [out_seg: length, reconstruction output]
    eval_model = models.Model(x, [out_seg, shared_decoder(masked)])
else:
    train_model = models.Model(inputs=x, outputs=out_seg)
    eval_model = models.Model(inputs=x, outputs=out_seg)
return train_model, eval_model
mask_1 is my Mask layer.
If a label is given, only the channel of that label is returned (masked_by_y).
If a label is not given, this layer returns only the channel with the largest sum of element values in conv_capsule_layer_1 (masked).
The shape of conv_capsule_layer_1 is (batch_size = None, height = 50, width = None, num_channel = 4, 1).
That is, the mask layer returns the channel having the largest sum of element values among the four channels.
Then Conv2DTranspose is used on the returned value (the output of the mask layer) to bring it back to the size of the original input.
However, the following error occurs
InvalidArgumentError (see above for traceback): Only one input size may be -1, not both 0 and 2
[[Node: mask_1/Reshape_1 = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mask_1/boolean_mask/Gather, mask_1/Reshape_1/shape)]]
How can I keep the width variable without using -1? I already tried non_zero_masked = K.reshape(non_zero, [-1, masked.shape[1], masked.shape[2], 1]).
This is the call function in my Mask layer:
def call(self, inputs, **kwargs):
    if type(inputs) is list:  # true label is provided with shape = [None, n_classes], i.e. one-hot code.
        assert len(inputs) == 2
        inputs, mask = inputs
        inputs = K.squeeze(inputs, axis=-1)  # [batch, input_height, input_width, num_cap, num_atom] -> [batch, input_height, input_width, num_cap]
    else:  # if no true label, mask by the max length of capsules. Mainly used for prediction
        inputs = K.squeeze(inputs, axis=-1)  # [batch, input_height, input_width, num_cap]
        x = K.softmax(K.sqrt(K.sum(K.square(inputs), axis=(1, 2)) + K.epsilon()))  # x: [batch, 4]
        mask = K.one_hot(indices=K.argmax(x, 1), num_classes=x.get_shape().as_list()[1])  # mask: [batch, 4]
    expand_mask = K.reshape(mask, [-1, 1, 1, mask.shape[1]])  # [batch_size, 1, 1, num_class]
    masked = inputs * expand_mask
    non_zero = tf.boolean_mask(masked, tf.not_equal(masked, 0))
    non_zero_masked = K.reshape(non_zero, [-1, masked.shape[1], -1, 1])
    return non_zero_masked
Does anybody know why this error is happening? How can I solve it?
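For what it's worth, the error itself comes from tf.reshape allowing at most one inferred (-1) dimension. A minimal sketch, assuming the dynamic width can be read with tf.shape at runtime instead of a second -1:
import tensorflow as tf

x = tf.zeros([2, 50, 7, 4])
# tf.reshape(x, [-1, 50, -1, 1])   # InvalidArgumentError: only one -1 is allowed
w = tf.shape(x)[2]                 # read the dynamic width at runtime
y = tf.reshape(x, [-1, 50, w, 1])  # one -1 remains for reshape to infer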
I just want to do scalar multiplication using the inputs:
int_input = Input(shape=(1,), name='depth')
int_sigmoid = (Activation('sigmoid')(depthInput))
imageInput = Input(shape=(100, 100, 1), name='image')
imageInputNormalized = BatchNormalization()(imageInput)
con1 = Conv2D(64, (2, 2), padding='same', name='con1')(Activation('relu')(imageInputNormalized))
mp1 = MaxPooling2D(pool_size=2)(con1)
con2 = Conv2D(128, (2, 2), padding='same', name='con2')(Activation('relu')(mp1))
l1 = Lambda(lambda x: x ** depthSigmoid)(con2)
I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [20,50,50,128] vs. [20,1]
Your code sample seems to be incomplete, or possibly I'm missing where a number of variables are defined. I would consider adding more of your code.
What I do notice in your last line is that you do not perform scalar multiplication. ** is not used for scalar multiplication; that might need to be changed to *.
Hope this helps and good luck solving your problem!
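If the goal is to scale each feature map by a per-sample scalar, a minimal sketch (illustrative shapes, not your full model) is to reshape the scalar so broadcasting handles the multiplication:
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda, Reshape

feat = Input(shape=(50, 50, 128))
scalar = Input(shape=(1,))
s = Reshape((1, 1, 1))(scalar)  # (batch, 1, 1, 1) broadcasts over the feature map
scaled = Lambda(lambda t: t[0] * t[1])([feat, s])  # elementwise scalar multiply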
I want to implement this neural network using Keras:
When trying to run the following commands:
model_All_CNN = Sequential()
# input: 126*126 images with 3 channels -> (126, 126, 3) tensors.
# this applies 320 convolution filters of size 2*2 each.
act = keras.layers.advanced_activations.LeakyReLU(alpha=0.3)
# This returns a tensor
#input_img = Input(shape=(3 ,126, 126))
conv1 = model_All_CNN.add(Conv2D(320, (2, 2), strides=(1, 1), activation=act, input_shape=(128, 128, 3)))
model_All_CNN.add(Activation(act))
I am getting the following error:
File "/Users/M.I.T/anaconda/envs/Mulimediaassignment/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py",
line 2856, in conv2d
x = tf.nn.convolution(
AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'convolution'
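(Separately from the version mismatch in the traceback, which usually means the installed TensorFlow is too old for this Keras release, advanced activations such as LeakyReLU are normally added as standalone layers rather than passed as the activation argument. A minimal sketch of that pattern:)
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers.advanced_activations import LeakyReLU

model = Sequential()
model.add(Conv2D(320, (2, 2), strides=(1, 1), input_shape=(128, 128, 3)))
model.add(LeakyReLU(alpha=0.3))  # advanced activation as its own layer, not activation=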