So I am trying to implement a custom function using Lambda layers in Keras (Tensorflow backend).
I want to convert the input tensor into a NumPy array to perform my function. However, I cannot run tensor.eval() as it throws an error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,960,960,1]
This is my code:
def tensor2np(tensor):
    return tensor.eval(session=K.get_session())

def np2tensor(arr):  # renamed from `np` to avoid shadowing the numpy module
    return tf.convert_to_tensor(arr.reshape((1, 480, 480, 3)))

def calculate_dwt1(tensor):
    np_input = tensor2np(tensor)
    coeff = pywt.wavedec2(np_input[0, :, :, 0], 'db1', level=1)
    return np2tensor(np.dstack((coeff[1][0], coeff[1][1], coeff[1][2])))
def network():
    input = Input(shape=(960, 960, 1), dtype='float32')
    conv1 = Convolution2D(64, (3, 3), activation='relu', padding='same')(input)
    conv1 = Convolution2D(64, (3, 3), activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D((2, 2), strides=(2, 2))(conv1)
    conv2 = Convolution2D(128, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = Convolution2D(128, (3, 3), activation='relu', padding='same')(conv2)
    lambda1 = Lambda(calculate_dwt1)(input)
    me = merge((lambda1, conv2), mode='concat', concat_axis=3)
..
..
Or is there any way I can get the result of the custom function at runtime, convert it to a tensor, and feed it into my network?
Basically, I'm trying to implement this model architecture.
As it is, you're asking your network to backpropagate through (a) the array → tensor transformation and (b) a black-box function that operates on arrays. It's no surprise that it's unable to do that. You will need to rewrite your custom function using standard (or custom) TF/Keras operations, and have it applied to tensor objects. Then, and only then, will it be able to propagate gradients backwards and values forward.
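For what it's worth, a level-1 'db1' (Haar) DWT is one of the cases that can be expressed with standard ops, since analysis with an orthogonal wavelet is just a stride-2 convolution. Below is a minimal sketch under that assumption; the filter construction and output ordering are mine and may differ in sign or order from pywt's conventions, so verify against pywt.wavedec2 before relying on it.

import numpy as np
import tensorflow as tf

# Haar analysis filter pair (assumed convention).
lo = np.array([1.0, 1.0]) / np.sqrt(2.0)
hi = np.array([1.0, -1.0]) / np.sqrt(2.0)
# Outer products give the three 2x2 detail filters (LH, HL, HH),
# stacked into a conv kernel of shape (2, 2, 1, 3).
kernels = np.stack([np.outer(lo, hi), np.outer(hi, lo), np.outer(hi, hi)], axis=-1)
kernels = kernels[:, :, np.newaxis, :].astype(np.float32)

def haar_details(x):
    # x: (batch, H, W, 1) -> (batch, H/2, W/2, 3), differentiable end to end.
    return tf.nn.conv2d(x, tf.constant(kernels), strides=[1, 2, 2, 1], padding='VALID')

# Drop-in replacement for the Lambda above:
# lambda1 = Lambda(haar_details)(input)

With a (?, 960, 960, 1) input this produces a (?, 480, 480, 3) tensor, matching the shape the original np2tensor was reshaping to.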
If you want to use a pure Python function as a TensorFlow operation, you can use tf.py_func.
In your case, you want to use a custom Python function in place of built-in operations. TensorFlow's built-in operations are symbolic and compiled before execution, and TensorFlow optimizes the given cost function using their gradients. Since your custom function's gradient is unknown, TensorFlow cannot optimize through it.
You have two options: either define your custom function in a more symbolic way, so that TF's automatic differentiation can handle it, or provide your pure Python function's gradient externally, like this.
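As a minimal sketch of the tf.py_func route (TF1 API; my_np_func is a hypothetical stand-in for the NumPy computation): note that tf.py_func has no gradient registered by default, so layers upstream of it still cannot be trained unless you supply one.

import numpy as np
import tensorflow as tf

def my_np_func(x):
    # Arbitrary NumPy computation, executed eagerly at graph run time.
    return np.square(x).astype(np.float32)

x = tf.placeholder(tf.float32, shape=[None, 4])
# Wrap the Python function as a graph op; the last argument declares the output dtype.
y = tf.py_func(my_np_func, [x], tf.float32)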
Related
Is it possible to access pre-activation tensors in a Keras Model? For example, given this model:
import tensorflow as tf
image_ = tf.keras.Input(shape=[224, 224, 3], batch_size=1)
vgg19 = tf.keras.applications.VGG19(include_top=False, weights='imagenet', input_tensor=image_, input_shape=image_.shape[1:], pooling=None)
the usual way to access layers is:
intermediate_layer_model = tf.keras.models.Model(inputs=image_, outputs=[vgg19.get_layer('block1_conv2').output])
intermediate_layer_model.summary()
This gives the ReLU outputs for a layer, while I would like the ReLU inputs. I tried doing this:
graph = tf.function(vgg19, [tf.TensorSpec.from_tensor(image_)]).get_concrete_function().graph
outputs = [graph.get_tensor_by_name(tname) for tname in [
'vgg19/block4_conv3/BiasAdd:0',
'vgg19/block4_conv4/BiasAdd:0',
'vgg19/block5_conv1/BiasAdd:0'
]]
intermediate_layer_model = tf.keras.models.Model(inputs=image_, outputs=outputs)
intermediate_layer_model.summary()
but I get the error
ValueError: Unknown graph. Aborting.
The only workaround I've found is to edit the model file to manually expose the intermediates, turning every layer like this:
x = layers.Conv2D(256, (3, 3), activation="relu", padding="same", name="block3_conv1")(x)
into 2 layers where the 1st one can be accessed before activations:
x = layers.Conv2D(256, (3, 3), activation=None, padding="same", name="block3_conv1")(x)
x = layers.ReLU(name="block3_conv1_relu")(x)
Is there a way to access pre-activation tensors in a Model without essentially editing the TensorFlow 2 source code, or reverting to TensorFlow 1, which had full flexibility for accessing intermediates?
There is a way to access pre-activation layers for pretrained Keras models using TF version 2.7.0. Here's how to access two intermediate pre-activation outputs from VGG19 in a single forward pass.
Initialize the VGG19 model. We can omit the top layers to avoid loading unnecessary parameters into memory.
vgg19 = tf.keras.applications.VGG19(
    include_top=False,
    weights="imagenet"
)
This is the important part: create a deep copy of the intermediate layer from which you would like the features, change the activation of the conv layer to linear (i.e. no activation), rename the layer (otherwise two layers in the model would share the same name, which raises errors), and finally pass the output of the previous layer through the copied conv layer.
from copy import deepcopy

# for more intermediate features, wrap a loop around this to avoid copy-paste
b5c4_layer = deepcopy(vgg19.get_layer("block5_conv4"))
b5c4_layer.activation = tf.keras.activations.linear
b5c4_layer._name = b5c4_layer.name + "_preact"
b5c4_preact_output = b5c4_layer(vgg19.get_layer("block5_conv3").output)

b2c2_layer = deepcopy(vgg19.get_layer("block2_conv2"))
b2c2_layer.activation = tf.keras.activations.linear
b2c2_layer._name = b2c2_layer.name + "_preact"
b2c2_preact_output = b2c2_layer(vgg19.get_layer("block2_conv1").output)
Finally, get the outputs and check that they equal the post-activation outputs once we apply the ReLU activation.
vgg19_features = Model(vgg19.input, [b2c2_preact_output, b5c4_preact_output])
vgg19_features_control = Model(vgg19.input,
                               [vgg19.get_layer("block2_conv2").output,
                                vgg19.get_layer("block5_conv4").output])

b2c2_preact, b5c4_preact = vgg19_features(tf.keras.applications.vgg19.preprocess_input(img))
b2c2, b5c4 = vgg19_features_control(tf.keras.applications.vgg19.preprocess_input(img))

print(np.allclose(tf.keras.activations.relu(b2c2_preact).numpy(), b2c2.numpy()))
print(np.allclose(tf.keras.activations.relu(b5c4_preact).numpy(), b5c4.numpy()))
True
True
Here's a visualization similar to Fig. 6 of Wang et al. to see the effect in the feature space.
[Figure: input image and the resulting pre-activation feature maps]
To get the output of each layer, you have to define a Keras function and evaluate it for each layer.
Please refer to the code shown below:
from tensorflow.keras import backend as K
inp = model.input # input
outputs = [layer.output for layer in model.layers] # all layer outputs
functors = [K.function([inp], [out]) for out in outputs] # evaluation functions
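Then, assuming test_input is a NumPy batch matching the model's input shape (a hypothetical example input), all layers can be evaluated in one pass:

import numpy as np

test_input = np.random.random((1,) + model.input_shape[1:])  # example batch
layer_outs = [func([test_input]) for func in functors]
for layer, out in zip(model.layers, layer_outs):
    print(layer.name, out[0].shape)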
For more details on this, please refer to this SO answer.
I'm wondering what the currently available options are for simulating BatchNorm folding during quantization-aware training in TensorFlow 2. TensorFlow 1 has the tf.contrib.quantize.create_training_graph function, which inserts FakeQuantization layers into the graph and takes care of simulating batch normalization folding (according to this white paper).
Tensorflow 2 has a tutorial on how to use quantization in their recently adopted tf.keras API, but they don't mention anything about batch normalization. I tried the following simple example with a BatchNorm layer:
import tensorflow as tf
import tensorflow_model_optimization as tfmo
from tensorflow.keras import layers as l

model = tf.keras.Sequential([
    l.Conv2D(32, 5, padding='same', activation='relu', input_shape=input_shape),
    l.MaxPooling2D((2, 2), (2, 2), padding='same'),
    l.Conv2D(64, 5, padding='same', activation='relu'),
    l.BatchNormalization(),  # BN!
    l.MaxPooling2D((2, 2), (2, 2), padding='same'),
    l.Flatten(),
    l.Dense(1024, activation='relu'),
    l.Dropout(0.4),
    l.Dense(num_classes),
    l.Softmax(),
])
model = tfmo.quantization.keras.quantize_model(model)
It however gives the following exception:
RuntimeError: Layer batch_normalization:<class 'tensorflow.python.keras.layers.normalization.BatchNormalization'> is not supported. You can quantize this layer by passing a `tfmot.quantization.keras.QuantizeConfig` instance to the `quantize_annotate_layer` API.
which indicates that TF does not know what to do with it.
I also saw this related topic, where they apply tf.contrib.quantize.create_training_graph to a Keras-constructed model. However, they don't use BatchNorm layers, so I'm not sure this will work.
So what are the options for using this BatchNorm folding feature in TF2? Can this be done from the keras API, or should I switch back to the TensorFlow 1 API and define a graph the old way?
If you add BatchNormalization before the activation, you will not have issues with quantization. Note: quantization of BatchNormalization is supported only if the layer comes directly after a Conv2D layer.
https://www.tensorflow.org/model_optimization/guide/quantization/training
# Change this:
l.Conv2D(64, 5, padding='same', activation='relu'),
l.BatchNormalization(),  # BN!

# to this:
l.Conv2D(64, 5, padding='same'),
l.BatchNormalization(),
l.Activation('relu'),

# Another way of declaring the same:
o = Conv2D(512, (3, 3), padding='valid', data_format=IMAGE_ORDERING)(o)
o = BatchNormalization()(o)
o = Activation('relu')(o)
You should apply the quantization annotation as described in the instructions. I think you can handle BatchNorm now like this:
class DefaultBNQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):

    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return [tfmot.quantization.keras.quantizers.MovingAverageQuantizer(
            num_bits=8, per_axis=False, symmetric=False, narrow_range=False)]

    def get_config(self):
        return {}
If you still want to quantize the layer's weights, change the return of get_weights_and_quantizers to return [(layer.weights[i], LastValueQuantizer(num_bits=8, symmetric=True, narrow_range=False, per_axis=False)) for i in range(2)]. Then, in set_quantize_weights, assign the quantizers back to gamma, beta, ... according to the indices of the returned list above. However, I do not encourage this, as it will surely harm accuracy; BN should act as activation quantization only.
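For completeness, here is a hedged sketch of how such a custom QuantizeConfig is typically wired in, following the annotate/apply pattern from the tfmot guide (the surrounding layers are placeholders, not the questioner's model):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
annotate_model = tfmot.quantization.keras.quantize_annotate_model

# Annotate only the BatchNorm layer with the custom config above.
annotated = annotate_model(tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, 5, padding='same', input_shape=(28, 28, 1)),
    annotate_layer(tf.keras.layers.BatchNormalization(), DefaultBNQuantizeConfig()),
    tf.keras.layers.Activation('relu'),
]))

# quantize_scope lets quantize_apply deserialize the custom config class.
with tfmot.quantization.keras.quantize_scope(
        {'DefaultBNQuantizeConfig': DefaultBNQuantizeConfig}):
    quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated)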
The result you get would look like this (ResNet50): [figure omitted]
I am currently using TensorFlow version 1.14.
In the code below, I am trying to create a dummy model that takes 2 inputs and produces 2 outputs, with all weights set to ones and biases to zeros (a single-layer perceptron). I am defining a custom loss function that computes the Jacobian of the output layer with respect to the input layer.
def my_loss(y_pred, y_true):
    jacobian_tf = jacobian_tensorflow3(sim.output, sim.input)
    loss = tf.abs(tf.linalg.det(jacobian_tf))
    return K.mean(loss)

def jacobian_tensorflow3(x, y, verbose=False):
    jacobian_matrix = []
    it = tqdm(range(ndim)) if verbose else range(ndim)
    for o in it:
        grad_func = tf.gradients(x[:, o], y)
        jacobian_matrix.append(grad_func[0])
    jacobian_matrix = tf.stack(jacobian_matrix)
    jacobian_matrix1 = tf.transpose(jacobian_matrix, perm=[1, 0, 2])
    return jacobian_matrix1
sim = Sequential()
sim.add(Dense(2, kernel_initializer='ones', bias_initializer='zeros', activation='linear', input_dim=2))
sim.compile(optimizer='adam', loss=my_loss)
sim.fit(B, np.random.random(B.shape), batch_size=100, epochs=2)
The model compiles without issues and the Jacobian computation itself gives a result, but when I run sim.fit I get the following error:
ValueError: Variable <tf.Variable 'dense_14/bias:0' shape=(2,) dtype=float32> has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
I have been stuck at this step for a long time and am not able to proceed. Any help/suggestions would be appreciated.
I want to write some custom Keras Layers and do some advanced calculations in the layer, for example with Numpy, Scikit, OpenCV...
I know there are some math functions in keras.backend that can operate on tensors, but I need some more advanced functions.
However, I have no clue how to implement this correctly; I get the error message:
You must feed a value for placeholder tensor 'input_1' with dtype float and shape [...]
Here is my custom layer:
class MyCustomLayer(Layer):

    def __init__(self, **kwargs):
        super(MyCustomLayer, self).__init__(**kwargs)

    def call(self, inputs):
        """
        How to implement this correctly in Keras?
        """
        nparray = K.eval(inputs)  # <-- does not work
        # do some calculations here with nparray,
        # for example with Numpy, Scipy, Scikit, OpenCV...
        result = K.variable(nparray, dtype='float32')
        return result

    def compute_output_shape(self, input_shape):
        output_shape = tuple([input_shape[0], 256, input_shape[3]])
        return output_shape  # (batch, 256, channels)
The error appears here in this dummy model:
inputs = Input(shape=(96, 96, 3))
x = MyCustomLayer()(inputs)
x = Flatten()(x)
x = Activation("relu")(x)
x = Dense(1)(x)
predictions = Activation("sigmoid")(x)
model = Model(inputs=inputs, outputs=predictions)
Thanks for all hints...
TL;DR: You should not mix NumPy inside Keras layers. Keras uses TensorFlow underneath because it has to track all the computations to be able to compute the gradients in the backward phase.
If you dig into TensorFlow, you will see that it covers almost all of the NumPy functionality (and even extends it), and if I remember correctly, TensorFlow functionality can be accessed through the Keras backend (K).
What are the advanced calculations/functions you need?
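For illustration, here is a minimal sketch of keeping a layer's call fully symbolic via the backend (the specific ops are placeholders for whatever your layer actually needs):

from keras.layers import Layer
from keras import backend as K

class MySymbolicLayer(Layer):
    def call(self, inputs):
        # Backend ops replace their NumPy look-alikes (np.clip -> K.clip,
        # np.mean -> K.mean), so the computation stays in the graph and
        # gradients keep flowing.
        x = K.clip(inputs, 0.0, 1.0)
        return K.mean(x, axis=1, keepdims=True)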
I think this kind of processing should be applied before the model, because the process does not contain variables, so it cannot be optimized.
K.eval(inputs) does not work because you are trying to evaluate a placeholder, not a variable; a placeholder has no value to evaluate until it is fed. If you want to get values, you should feed the placeholder, or you can split the tensor into a list of tensors one by one with tf.unstack():
nparray = tf.unstack(tf.unstack(tf.unstack(inputs,96,0),96,0),3,0)
Your call function is also wrong, because it returns a variable; you should return a constant:
result = K.constant(nparray, dtype='float32')
return result
Is there any advantage in using tf.nn.* over tf.layers.*?
Most of the examples in the doc use tf.nn.conv2d, for instance, but it is not clear why they do so.
As GBY mentioned, they use the same implementation.
There is a slight difference in the parameters.
For tf.nn.conv2d:
filter: A Tensor. Must have the same type as input. A 4-D tensor of shape [filter_height, filter_width, in_channels, out_channels]
For tf.layers.conv2d:
filters: Integer, the dimensionality of the output space (i.e. the number of filters in the convolution).
I would use tf.nn.conv2d when loading a pretrained model (example code: https://github.com/ry/tensorflow-vgg16), and tf.layers.conv2d for a model trained from scratch.
For convolution, they are the same. More precisely, tf.layers.conv2d (actually _Conv) uses tf.nn.convolution as the backend. You can follow the call chain: tf.layers.conv2d > Conv2D > Conv2D.apply() > _Conv > _Conv.apply() > _Layer.apply() > _Layer.__call__() > _Conv.call() > nn.convolution() ...
As others mentioned, the parameters are different, especially the "filter(s)" argument. tf.nn.conv2d takes a tensor as the filter, which means you can specify weight decay (or maybe other properties) like the following from the cifar10 code. (Whether you want/need weight decay in a conv layer is another question.)
kernel = _variable_with_weight_decay('weights',
                                     shape=[5, 5, 3, 64],
                                     stddev=5e-2,
                                     wd=0.0)
conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
I'm not quite sure how to set weight decay in tf.layers.conv2d, since it only takes an integer for filters. Judging from the full signature below, kernel_regularizer looks like the intended hook.
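A hedged sketch of that (TF1-era API; images and the regularization scale are placeholders):

import tensorflow as tf

# L2 weight decay attached via kernel_regularizer; the resulting penalties
# are collected in tf.GraphKeys.REGULARIZATION_LOSSES and must be added to
# the training loss explicitly.
conv = tf.layers.conv2d(
    images, filters=64, kernel_size=5, padding='same',
    kernel_regularizer=tf.contrib.layers.l2_regularizer(5e-4))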
On the other hand, tf.layers.conv2d handles the activation and bias automatically, while you have to write additional code for these if you use tf.nn.conv2d.
All of these other replies talk about how the parameters are different, but actually the main difference between tf.nn and tf.layers conv2d is that for tf.nn you need to create your own filter tensor and pass it in. This filter needs to have the shape [kernel_height, kernel_width, in_channels, num_filters].
Essentially, tf.nn is lower level than tf.layers. Unfortunately, this answer is no longer applicable, as tf.layers is obsolete.
DIFFERENCES IN PARAMETERS:
Using tf.layers.* in code:
# Convolution Layer with 32 filters and a kernel size of 5
conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)
# Max Pooling (down-sampling) with strides of 2 and kernel size of 2
conv1 = tf.layers.max_pooling2d(conv1, 2, 2)
Using tf.nn.* in code:
(Notice we need to pass weights and biases as additional parameters.)
strides = 1
# Weights matrix looks like: [kernel_size(=5), kernel_size(=5), input_channels(=3), filters(=32)]
weights = tf.Variable(tf.truncated_normal([5, 5, 3, 32], stddev=0.1))  # example initialization
# Similarly, bias looks like [filters(=32)]
bias = tf.Variable(tf.zeros([32]))  # example initialization
out = tf.nn.conv2d(input, weights, padding="SAME", strides=[1, strides, strides, 1])
out = tf.nn.bias_add(out, bias)
out = tf.nn.relu(out)
Take a look here: tensorflow > tf.layers.conv2d
and here: tensorflow > conv2d
As you can see the arguments to the layers version are:
tf.layers.conv2d(inputs, filters, kernel_size, strides=(1, 1), padding='valid', data_format='channels_last', dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=None, bias_initializer=tf.zeros_initializer(), kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, trainable=True, name=None, reuse=None)
and the nn version:
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)
I think you can choose the one with the options you want/need/like!