Dynamic switching of dropout in Keras/Tensorflow - python

I am building a reinforcement learning algorithm in Tensorflow and I would like to be able to dynamically turn dropout off and then on within a single call to session.run().
Rationale: I need to (1) do a forward pass without dropout to calculate the targets, and (2) do a training step with the generated targets. If I execute these two steps in different calls to session.run(), everything is OK. But I would like to do it with a single call to session.run() (using tf.stop_gradient(targets)).
After trying several solutions without much success, I landed on a solution where I replace the learning_phase placeholder used by Keras with a variable (since placeholders are tensors and do not allow assignments) and use a custom layer to set that variable to True or False as desired. This solution is shown in the code below. Getting the value of either m1 or m2 separately (e.g., running sess.run(m1, feed_dict={input_ph: np.ones((1,1))})) works as expected without error. However, getting the value of m3, or getting the values of m1 and m2 simultaneously, sometimes works and sometimes does not (and the error message is uninformative).
Do you know what I am doing wrong, or a better way to do what I want?
EDIT: The code shows a toy example. In reality I have a single model and I need to run two forward passes (one with dropout off and the other with dropout on) and one backward pass. And I want to do all of this without returning to Python.
from tensorflow.keras.layers import Dropout, Dense, Input, Layer
from tensorflow.python.keras import backend as K
from tensorflow.keras import Model
import tensorflow as tf
import numpy as np
class DropoutSwitchLayer(Layer):
    def __init__(self, stateful=True, **kwargs):
        self.stateful = stateful
        self.supports_masking = True
        super(DropoutSwitchLayer, self).__init__(**kwargs)
    def build(self, input_shape):
        self.lph = tf.Variable(True, dtype=tf.bool, name="lph", trainable=False)
        K._GRAPH_LEARNING_PHASES[tf.get_default_graph()] = self.lph
        super(DropoutSwitchLayer, self).build(input_shape)
    def call(self, inputs, mask=None):
        data_input, training = inputs
        op = self.lph.assign(training[0], use_locking=True)
        # ugly trick here to make the layer work
        data_input = data_input + tf.multiply(tf.cast(op, dtype=tf.float32), 0.0)
        return data_input
    def compute_output_shape(self, input_shape):
        return input_shape[0]
dropout_on = np.array([True], dtype=np.bool)
dropout_off = np.array([False], dtype=np.bool)
input_ph = tf.placeholder(tf.float32, shape=(None, 1))
drop = Input(shape=(), dtype=tf.bool)
input = Input(shape=(1,))
h = DropoutSwitchLayer()([input, drop])
h = Dense(1)(h)
h = Dropout(0.5)(h)
o = Dense(1)(h)
m = Model(inputs=[input, drop], outputs=o)
m1 = m([input_ph, dropout_on])
m2 = m([input_ph, dropout_off])
m3 = m([m2, dropout_on])
sess = tf.Session()
K.set_session(sess)
sess.run(tf.global_variables_initializer())
EDIT 2: Daniel Möller's solution below works when using a Dropout layer, but what if using dropout inside an LSTM layer?
input = Input(shape=(1,))
h = Dense(1)(input)
h = RepeatVector(2)(h)
h = LSTM(1, dropout=0.5, recurrent_dropout=0.5)(h)
o = Dense(1)(h)

Why not make a single continuous model?
#layers
inputs = Input(shape=(1,))
dense1 = Dense(1)
dense2 = Dense(1)
#no drop pass:
h = dense1(inputs)
o = dense2(h)
#optionally:
o = Lambda(lambda x: K.stop_gradient(x))(o)
#drop pass:
h = dense1(o)
h = Dropout(.5)(h)
h = dense2(h)
modelOnlyFinalOutput = Model(inputs,h)
modelOnlyNonDrop = Model(inputs,o)
modelBothOutputs = Model(inputs, [o,h])
Choose one of these models for training, for example:
modelBothOutputs.fit(x_train, y_train) #where y_train = [targets1, targets2] when using both outputs

It turns out Keras supports, out of the box, what I want to do. Using the training argument in the call to the Dropout/LSTM layer, in combination with Daniel Möller's approach to build the model (thanks!), does the trick.
In the code below (just a toy example), o1 and o3 should be equal to each other and different from o2.
from tensorflow.keras.layers import Dropout, Dense, Input, Lambda, Layer, Add, RepeatVector, LSTM
from tensorflow.python.keras import backend as K
from tensorflow.keras import Model
import tensorflow as tf
import numpy as np
repeat = RepeatVector(2)
lstm = LSTM(1, dropout=0.5, recurrent_dropout=0.5)
#Forward pass with dropout disabled
next_state = tf.placeholder(tf.float32, shape=(None, 1), name='next_state')
h = repeat(next_state)
# Use training to disable dropout
o1 = lstm(h, training=False)
target1 = tf.stop_gradient(o1)
#Forward pass with dropout enabled
state = tf.placeholder(tf.float32, shape=(None, 1), name='state')
h = repeat(state)
o2 = lstm(h, training=True)
target2 = tf.stop_gradient(o2)
#Forward pass with dropout disabled
ph3 = tf.placeholder(tf.float32, shape=(None, 1), name='ph3')
h = repeat(ph3)
o3 = lstm(h, training=False)
loss = target1 + target2 - o3
opt = tf.train.GradientDescentOptimizer(0.1)
train = opt.minimize(loss)
sess = tf.Session()
K.set_session(sess)
sess.run(tf.global_variables_initializer())
data = np.ones((1,1))
sess.run([o1, o2, o3], feed_dict={next_state:data, state:data, ph3:data})
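To illustrate what the training flag controls, here is a minimal plain-Python sketch of inverted dropout. It is not Keras code: the dropout helper and its arguments are hypothetical names, and it only assumes the common convention that active dropout zeroes each value with probability rate and scales the survivors by 1/(1 - rate), while inactive dropout passes values through unchanged.

```python
import random

def dropout(values, rate, training, rng):
    """Inverted dropout on a list of floats (illustrative helper, not a Keras API)."""
    if not training:
        # Inference mode: dropout is the identity.
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    # Training mode: drop each value with probability `rate`, rescale the rest.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0]

o1 = dropout(x, rate=0.5, training=False, rng=rng)  # dropout disabled
o2 = dropout(x, rate=0.5, training=True, rng=rng)   # dropout enabled
o3 = dropout(x, rate=0.5, training=False, rng=rng)  # dropout disabled again
```

With dropout disabled, o1 and o3 are identical copies of the input. With dropout enabled, every nonzero input is either zeroed or doubled, so o2 differs from both.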

How about this:
class CustomDropout(tf.keras.layers.Layer):
    def __init__(self):
        super(CustomDropout, self).__init__()
        self.dropout1= Dropout(0.5)
        self.dropout2= Dropout(0.1)
    def call(self, inputs):
        if xxx:
            return self.dropout1(inputs)
        else:
            return self.dropout2(inputs)

Related

Simple Neural Network in Pytorch with 3 inputs (Numerical Values)

I'm having a hard time setting up a neural network; most of the examples I find use images. My problem has 3 inputs, each of size N x M, where N is the number of samples and M is the number of features. I have a separate CSV file with a 1 x N binary target (0 or 1).
The network I'm trying to configure should have two hidden layers with 100 and 50 neurons, respectively, using a sigmoid activation function and cross-entropy to check performance. The result should be a single probability output.
Please help.
EDIT:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.autograd as autograd
import torch.nn.functional as F
#from torch.autograd import Variable
import pandas as pd
# Import Data
Input1 = pd.read_csv(r'...')
Input2 = pd.read_csv(r'...')
Input3 = pd.read_csv(r'...')
Target = pd.read_csv(r'...' )
# Convert to Tensor
Input1_tensor = torch.tensor(Input1.to_numpy()).float()
Input2_tensor = torch.tensor(Input2.to_numpy()).float()
Input3_tensor = torch.tensor(Input3.to_numpy()).float()
Target_tensor = torch.tensor(Target.to_numpy()).float()
# Transpose to have signal as columns instead of rows
input1 = Input1_tensor
input2 = Input2_tensor
input3 = Input3_tensor
y = Target_tensor
# Define the model
class Net(nn.Module):
    def __init__(self, num_inputs, hidden1_size, hidden2_size, num_classes):
        # Initialize super class
        super(Net, self).__init__()
        #self.criterion = nn.CrossEntropyLoss()
        # Add hidden layer
        self.layer1 = nn.Linear(num_inputs,hidden1_size)
        # Activation
        self.sigmoid = torch.nn.Sigmoid()
        # Add output layer
        self.layer2 = nn.Linear(hidden1_size,hidden2_size)
        # Activation
        self.sigmoid2 = torch.nn.Sigmoid()
        self.layer3 = nn.Linear(hidden2_size, num_classes)
    def forward(self, x1, x2, x3):
        # implement the forward pass
        in1 = self.layer1(x1)
        in2 = self.layer1(x2)
        in3 = self.layer1(x3)
        xyz = torch.cat((in1,in2,in3),1)
        return xyz
# Define loss function
loss_function = nn.CrossEntropyLoss()
# Define optimizer
optimizer = optim.SGD(model.parameters(), lr=1e-4)
for t in range(num_epochs):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(input1, input2, input3)
    # Compute and print loss
    loss = loss_function(y_pred, y)
    print(t, loss.item())
    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    # Calculate gradient using backward pass
    loss.backward()
    # Update model parameters (weights)
    optimizer.step()
Here I am getting the error
RuntimeError: 0D or 1D target tensor expected, multi-target not supported
on the line loss = loss_function(y_pred, y), where y_pred is [20000,375] and y is [20000,1].
You can use PyTorch, a Python library for deep learning and neural networks, and define the network with code like the following:
import torch
from torch import nn
class network(nn.Module):
    def __init__(self, M):
        # M is the dimension of the input features
        super(network, self).__init__()
        self.layer1 = nn.Linear(M, 100)
        self.layer2 = nn.Linear(100, 50)
        self.out = nn.Linear(50, 1)
    def forward(self, x):
        # sigmoid after each layer, as requested in the question
        h = torch.sigmoid(self.layer1(x))
        h = torch.sigmoid(self.layer2(h))
        return torch.sigmoid(self.out(h))
You can then refer to the PyTorch documentation to write the rest of the training code.
Edit:
As for the RuntimeError, you can squeeze the target tensor with y.squeeze(). This removes the redundant dimension from the tensor, e.g. [20000,1] -> [20000].
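Since the stated goal is a single probability output compared against a 0/1 target, one loss that fits that setup is binary cross-entropy. The following is a plain-Python sketch of the formula only, not PyTorch code; the function name and sample values are made up for illustration. In PyTorch, nn.BCELoss plays this role and expects probabilities and float targets of the same shape.

```python
import math

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

probs = [0.9, 0.2, 0.5]   # hypothetical sigmoid outputs, one per sample
targets = [1, 0, 1]       # matching binary labels
loss = binary_cross_entropy(probs, targets)
print(loss)
```

Confident, correct predictions contribute close to zero, while a prediction of 0.5 contributes log(2) regardless of the label.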

Graph disconnected: cannot obtain value for tensor Tensor("input_5:0", shape=(None, None, None, 128), dtype=float32) at layer "input_5"

I am trying to implement a tensorflow model (encoder decoder like) in which I train initially with a small number of layers, and append the model with more layers after training. I thought it would be easiest to create the layers as Models as I intend on setting various layers to trainable = False at points and thought it'd be easiest this way.
The following code is a simple demonstration of an error I'm getting.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.models import Model
from tensorflow.keras.layers import concatenate, Input
from tensorflow.keras.layers import MaxPool2D, UpSampling2D, ReLU
from tensorflow.keras.layers import BatchNormalization
def conv_block(x, filters, kernel_size=(3,3), padding="same", strides=1):
    c = Conv2D(filters, kernel_size, padding=padding, strides=strides)(x)
    c = ReLU()(c)
    c=BatchNormalization()(c)
    c = Conv2D(filters, kernel_size, padding=padding, strides=strides)(c)
    c = ReLU()(c)
    c=BatchNormalization()(c)
    return c
def down_block(x, filters, kernel_size=(3,3), padding="same", strides=1):
    c = conv_block(x, filters, kernel_size = kernel_size,
                   padding = padding, strides = strides)
    p = MaxPool2D((2,2))(c)
    return c,p
def up_block(x, skip, filters, kernel_size=(3,3), padding="same", strides=1):
    us = UpSampling2D((2,2))(x)
    concat = concatenate([us, skip])
    c = conv_block(concat, filters, kernel_size = kernel_size,
                   padding = padding, strides = strides)
    return c
def create_base_model():
    inner_input = Input((None,None,128))
    bn = conv_block(inner_input,128)
    inner_model = Model(inputs=inner_input,outputs=bn)
    return inner_model
def create_downblock_model():
    model_input = Input((None,None,128))
    c,p = down_block(model_input, 128)
    down_model = Model(inputs = model_input, outputs = [c,p])
    return down_model
def create_upblock_model():
    input_u = Input((None,None,128))
    input_c = Input((None,None,128))
    u = up_block(input_u, input_c, 128)
    up_model = Model(inputs=[input_u,input_c], outputs = u)
    return up_model
bn_model = create_base_model()
# 1ST METHOD - This works
down_model1 = create_downblock_model()
up_model1 = create_upblock_model()
x = bn_model(down_model1.output[-1])
x = up_model1([x,down_model1.output[0]])
inner_model = Model(inputs=down_model1.input, outputs=x)
# 2ND METHOD - This doesn't work
down_model2 = create_downblock_model()
up_model2 = create_upblock_model()
x = down_model2(down_model1.output[-1])
x = bn_model(x[-1])
x = up_model2([x,down_model2.output[0]])
x = up_model1([x,down_model1.output[0]])
inner_model = Model(inputs=down_model1.input, outputs=x)
The second method raises the following error:
Graph disconnected: cannot obtain value for tensor Tensor("input_5:0", shape=(None, None, None, 128), dtype=float32) at layer "input_5". The following previous layers were accessed without issue: ['input_2', 'conv2d_2', 're_lu_2']
down_model2 contains the layer input_5, so I assume the issue is with the line x = down_model2(down_model1.output[-1]). Similar questions suggest the problem may be that down_model1.output[-1] isn't an input layer. However, I don't understand why my first method works fine while the same approach fails when I try to incorporate two down blocks. In the first method I use down_model1.output[-1] as the input when defining a new model without any problem, so why doesn't it work in the second method?
I'm using tensorflow2.1.
Apologies if I'm overlooking something simple but I can't understand why this isn't working. Cheers
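The problem is caused by x = up_model2([x,down_model2.output[0]]), the third-to-last line. down_model2.output refers to the output tensors of down_model2's own internal graph (fed by its own Input layer), not to the tensors produced by calling down_model2 on down_model1.output[-1]. Reuse the result of that call instead by changing the last block of code to: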
down_model2_output = down_model2(down_model1.output[-1])
x = bn_model(down_model2_output[-1])
x = up_model2([x,down_model2_output[0]])
x = up_model1([x,down_model1.output[0]])
inner_model = Model(inputs=down_model1.input, outputs=x)

Changing MobileNet Dropout After Loading

I am working on a transfer learning problem. When I create a new model from just the MobileNet, I set a dropout.
base_model = MobileNet(weights='imagenet', include_top=False, input_shape=(200,200,3), dropout=.15)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(10, activation='softmax')(x)
I save models as I train using model_checkpoint_callback. As I train I find where overfitting is happening and adjust the amount of frozen layers and the learning rate. Can I also adjust dropout when I save a loaded model again?
I saw this answer but there are no actual dropout layers in Mobilenet, so this
for layer in model.layers:
    if hasattr(layer, 'rate'):
        print(layer.name)
        layer.rate = 0.5
doesn't do anything.
In the past, you had to clone the model for the new dropout to take. I haven't tried it recently.
# This code allows you to change the dropout
# Load model from .json
model.load_weights(filenameToModelWeights) # Load weights
model.layers[-2].rate = 0.04 # layer[-2] is my dropout layer, rate is dropout attribute
model = keras.models.clone_model(model) # If I do not clone, the new rate is never used. Weights are re-initialized by cloning.
model.load_weights(filenameToModelWeights) # Load weights
model.predict(x)
credit to
http://www.gergltd.com/home/2018/03/changing-dropout-on-the-fly-during-training-time-test-time-in-keras/
If the model doesn't have dropout layers to begin with, as with Keras's pretrained MobileNet, you'll have to add them yourself. Here's one way you could do it.
For adding a single layer:
def insert_single_layer_in_model(model, layer_name, new_layer):
    layers = [l for l in model.layers]
    x = layers[0].output
    for i in range(1, len(layers)):
        x = layers[i](x)
        # add layer afterward
        if layers[i].name == layer_name:
            x = new_layer(x)
    new_model = Model(inputs=layers[0].input, outputs=x)
    return new_model
For systematically adding a layer:
def insert_layers_in_model(model, layer_common_name, new_layer):
    import re
    layers = [l for l in model.layers]
    x = layers[0].output
    layer_config = new_layer.get_config()
    base_name = layer_config['name']
    layer_class = type(new_layer)
    for i in range(1, len(layers)):
        x = layers[i](x)
        match = re.match(".+" + layer_common_name + "+", layers[i].name)
        # add layer afterward
        if match:
            layer_config['name'] = base_name + "_" + str(i) # no duplicate names, could be done different
            layer_copy = layer_class.from_config(layer_config)
            x = layer_copy(x)
    new_model = Model(inputs=layers[0].input, outputs=x)
    return new_model
Run like this
import tensorflow as tf
from tensorflow.keras.applications.mobilenet import MobileNet
from tensorflow.keras.layers import Dropout
from tensorflow.keras.models import Model
base_model = MobileNet(weights='imagenet', include_top=False, input_shape=(192, 192, 3), dropout=.15)
dropout_layer = Dropout(0.5)
# add single layer after last dropout
mobile_net_with_dropout = insert_single_layer_in_model(base_model, "conv_pw_13_bn", dropout_layer)
# systematically add layers after any batchnorm layer
mobile_net_with_multi_dropout = insert_layers_in_model(base_model, "bn", dropout_layer)
By the way, you should absolutely experiment, but it's unlikely you want additional regularization on top of batchnorm for a small net like mobilenet.
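To see which layers the name filter in insert_layers_in_model would pick up, the regular expression can be tried on its own. The sketch below uses only the standard library; the helper name is made up, and the layer names are hypothetical examples in the style of MobileNet's naming scheme rather than a listing taken from a real model.

```python
import re

def matches_common_name(layer_name, layer_common_name):
    # Same pattern as in insert_layers_in_model: at least one character,
    # then the common name (its last character may repeat).
    return re.match(".+" + layer_common_name + "+", layer_name) is not None

# Hypothetical layer names, for illustration only.
names = ["conv1", "conv1_bn", "conv1_relu", "conv_dw_1", "conv_dw_1_bn", "conv_pw_13_bn"]
selected = [n for n in names if matches_common_name(n, "bn")]
print(selected)
```

Note that re.match anchors only at the start of the string, so a name that merely contains the common name somewhere after its first character (for example "a_bn_b") would also match. If only a suffix should count, re.fullmatch or a trailing "$" may be a better fit.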

Keras custom layer to Conv2D input channels error, ValueError: number of input channels does not match corresponding dimension of filter, 50 != 3200

I am trying to create a model with Normalized cross correlation custom layer, code taken from here
from keras import backend as K
from keras.layers import Conv2D, MaxPooling2D, Dense, Input, Flatten
from keras.models import Model, Sequential
from keras.engine import InputSpec, Layer
from keras import regularizers
from keras.optimizers import SGD, Adam
from keras.utils.conv_utils import conv_output_length
from keras import activations
import numpy as np
class Normalized_Correlation_Layer(Layer):
    # create a class inherited from keras.engine.Layer.
    def __init__(self, patch_size=(5, 5),
                 dim_ordering='tf',
                 border_mode='same',
                 stride=(1, 1),
                 activation=None,
                 **kwargs):
        if border_mode != 'same':
            raise ValueError('Invalid border mode for Correlation Layer '
                             '(only "same" is supported as of now):', border_mode)
        self.kernel_size = patch_size
        self.subsample = stride
        self.dim_ordering = dim_ordering
        self.border_mode = border_mode
        self.activation = activations.get(activation)
        super(Normalized_Correlation_Layer, self).__init__(**kwargs)
    def compute_output_shape(self, input_shape):
        return(input_shape[0][0], input_shape[0][1], input_shape[0][2], self.kernel_size[0] * input_shape[0][2]*input_shape[0][-1])
    def get_config(self):
        config = {'patch_size': self.kernel_size,
                  'activation': self.activation.__name__,
                  'border_mode': self.border_mode,
                  'stride': self.subsample,
                  'dim_ordering': self.dim_ordering}
        base_config = super(Correlation_Layer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
    def call(self, x, mask=None):
        input_1, input_2 = x
        stride_row, stride_col = self.subsample
        inp_shape = input_1._keras_shape
        output_shape = self.compute_output_shape([inp_shape, inp_shape])
        padding_row = (int(self.kernel_size[0] / 2),int(self.kernel_size[0] / 2))
        padding_col = (int(self.kernel_size[1] / 2),int(self.kernel_size[1] / 2))
        input_1 = K.spatial_2d_padding(input_1, padding =(padding_row,padding_col))
        input_2 = K.spatial_2d_padding(input_2, padding = ((padding_row[0]*2, padding_row[1]*2),padding_col))
        output_row = output_shape[1]
        output_col = output_shape[2]
        output = []
        for k in range(inp_shape[-1]):
            xc_1 = []
            xc_2 = []
            # print("here")
            for i in range(padding_row[0]):
                for j in range(output_col):
                    xc_2.append(K.reshape(input_2[:, i:i+self.kernel_size[0], j:j+self.kernel_size[1], k],
                                          (-1, 1,self.kernel_size[0]*self.kernel_size[1])))
            for i in range(output_row):
                slice_row = slice(i, i + self.kernel_size[0])
                slice_row2 = slice(i + padding_row[0], i +self.kernel_size[0] + padding_row[0])
                # print("dfg")
                for j in range(output_col):
                    slice_col = slice(j, j + self.kernel_size[1])
                    xc_2.append(K.reshape(input_2[:, slice_row2, slice_col, k],
                                          (-1, 1,self.kernel_size[0]*self.kernel_size[1])))
                    xc_1.append(K.reshape(input_1[:, slice_row, slice_col, k],
                                          (-1, 1,self.kernel_size[0]*self.kernel_size[1])))
            for i in range(output_row, output_row+padding_row[1]):
                for j in range(output_col):
                    xc_2.append(K.reshape(input_2[:, i:i+ self.kernel_size[0], j:j+self.kernel_size[1], k],
                                          (-1, 1,self.kernel_size[0]*self.kernel_size[1])))
            xc_1_aggregate = K.concatenate(xc_1, axis=1)
            xc_1_mean = K.mean(xc_1_aggregate, axis=-1, keepdims=True)
            xc_1_std = K.std(xc_1_aggregate, axis=-1, keepdims=True)
            xc_1_aggregate = (xc_1_aggregate - xc_1_mean) / xc_1_std
            xc_2_aggregate = K.concatenate(xc_2, axis=1)
            xc_2_mean = K.mean(xc_2_aggregate, axis=-1, keepdims=True)
            xc_2_std = K.std(xc_2_aggregate, axis=-1, keepdims=True)
            xc_2_aggregate = (xc_2_aggregate - xc_2_mean) / xc_2_std
            xc_1_aggregate = K.permute_dimensions(xc_1_aggregate, (0, 2, 1))
            block = []
            len_xc_1= len(xc_1)
            print("asdf")
            for i in range(len_xc_1):
                #This for loop is to compute the product of a given patch of feature map 1 and the feature maps on which it is supposed to
                sl1 = slice(int(i/inp_shape[2])*inp_shape[2],
                            int(i/inp_shape[2])*inp_shape[2]+inp_shape[2]*self.kernel_size[0])
                #This calculates which are the patches of feature map 2 to be considered for a given patch of first feature map.
                block.append(K.reshape(K.batch_dot(xc_2_aggregate[:,sl1,:],
                                                   xc_1_aggregate[:,:,i]),(-1,1,1,inp_shape[2] *self.kernel_size[0])))
            block = K.concatenate(block, axis=1)
            # print("zxcv")
            block= K.reshape(block,(-1,output_row,output_col,inp_shape[2] *self.kernel_size[0]))
            output.append(block)
        output = self.activation(output)
        print(output)
        return output
My model is a combination of cross correlation and Conv2D layers,
dt = 'float32'
def create_model():
    ip = keras.layers.Input((50,50, 1))
    ncx1_1 = Normalized_Correlation_Layer(patch_size=(1, 1))([ip,ip])
    ncn1_1 = keras.layers.Conv2D(64, (1,1), activation = 'relu', dtype=dt)(ip)
    ncn2_1 = keras.layers.Conv2D(64, (1,1), activation = 'relu', dtype=dt)(ncx1_1)
    ncx2_1 = Normalized_Correlation_Layer(patch_size=(1, 1),dtype=dt)([ncn1_1,ncn2_1])
    # ncx2_1 = keras.layers.Reshape((50, 50, 3200))(ncx2_1)
    # Problem occurs here
    ncn3 = keras.layers.Conv2D(filters=64,kernel_size=(1,1), activation = 'relu', dtype=dt)(ncx2_1)
    ncn4 = keras.layers.Conv2D(12, (1,1), activation = 'sigmoid', dtype=dt)(ncn3)
    model = keras.models.Model(ip,ncn4)
    return model
The model is created successfully up to the last cross-correlation layer, but I get an error at the ncn3 layer:
ValueError: number of input channels does not match corresponding dimension of filter, 50 != 3200
The output shape of the ncx2_1 layer is reported as (?, 50, 50, 50), both when I print ncx2_1.shape and when I print the outputs returned from the layer's call function ([<tf.Tensor 'normalized__correlation__layer_4/Reshape_10000:0' shape=(?, 50, 50, 50) dtype=float32>]).
But the model summary shows it as (?,50,50,3200) when I create the model only up to that layer, i.e. model = keras.models.Model(ip,ncx2_1).
When I reshape the layer output using ncx2_1 = keras.layers.Reshape((50, 50, 3200))(ncx2_1), I can create the model successfully, but when I try to fit the data on it, I get:
InvalidArgumentError: Input to reshape is a tensor with 6250000 values, but the requested shape has 400000000
[[node reshape_1/Reshape (defined at /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1781) ]]
[[node loss/mul (defined at /usr/local/lib/python3.6/dist-packages/keras/engine/training.py:865) ]]
Here, my batch size is 50, so a (B,H,W,C) tensor of shape (50,50,50,50) has 6250000 values, whereas one of shape (50,50,50,3200) has 400000000. That means the output of the cross-correlation layer actually has 50 channels.
I am either interpreting this wrong or I have made a mistake somewhere which I would like to know about.
I am using keras 2.1.2 with tensorflow 1.13.1 (That was the version in which the custom layer was written and I was getting other problems with latest version)
I am also using a custom generator if that is needed info and calling fit using md.fit_generator(train_gen,verbose=1). I can also add any other detail necessary.
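The numbers in the two error messages can be checked with a few lines of plain Python. The sketch below only redoes the arithmetic from the shapes quoted in the question and from the formula in compute_output_shape; the variable names are illustrative.

```python
from math import prod

batch, height, width = 50, 50, 50
patch_rows = 1      # patch_size=(1, 1) in the second correlation layer
in_channels = 64    # channels of ncn1_1 / ncn2_1 (Conv2D with 64 filters)

# Channel count declared by compute_output_shape:
# kernel_size[0] * input_shape[0][2] * input_shape[0][-1]
declared_channels = patch_rows * width * in_channels

# Channel count of the tensor reported by ncx2_1.shape
reported_channels = 50

actual_values = prod((batch, height, width, reported_channels))
requested_values = prod((batch, height, width, declared_channels))

print(declared_channels)   # 3200
print(actual_values)       # 6250000
print(requested_values)    # 400000000
```

The declared channel count (3200) is exactly 64 times the reported one (50), which matches the number of input channels that call iterates over.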

How to get Keras activations?

I'm not sure how to modify my code to get Keras activations. I've seen conflicting examples of K.function() inputs and am not sure if I'm getting outputs per layer or activations.
Here is my code
activity = 'Downstairs'
layer = 1
seg_x = create_segments_and_labels(df[df['ActivityEncoded']==mapping[activity]],TIME_PERIODS,STEP_DISTANCE,LABEL)[0]
get_layer_output = K.function([model_m.layers[0].input],[model_m.layers[layer].output])
layer_output = get_layer_output([seg_x])[0]
try:
    ax = sns.heatmap(layer_output[0].transpose(),cbar=True,cbar_kws={'label':'Activation'})
except:
    ax = sns.heatmap(layer_output.transpose(),cbar=True,cbar_kws={'label':'Activation','rotate':180})
ax.set_xlabel('Kernel',fontsize=30)
ax.set_yticks(range(0,len(layer_output[0][0])+1,10))
ax.set_yticklabels(range(0,len(layer_output[0][0])+1,10))
ax.set_xticks(range(0,len(layer_output[0])+1,5))
ax.set_xticklabels(range(0,len(layer_output[0])+1,5))
ax.set_ylabel('Filter',fontsize=30)
ax.xaxis.labelpad = 10
ax.set_title('Filter vs. Kernel\n(Layer=' + model_m.layers[layer].name + ')(Activity=' + activity + ')',fontsize=35)
Suggestions here on Stack Overflow do it the same way I do:
Keras, How to get the output of each layer?
Example 4 adds K's learning phase to the mix, but my output is still the same.
https://www.programcreek.com/python/example/93732/keras.backend.function
Am I getting outputs or activations? The documentation implies I might need layers.activations, but I haven't made that work.
Both my code and the code passing in the learning phase produce this heatmap:
https://imgur.com/a/5fI6N0B
For layers defined as e.g. Dense(activation='relu'), layer.output will fetch the (relu) activations. To get a layer's pre-activations, you'll need to set activation=None (i.e. 'linear') and follow the layer with a separate Activation layer. Example below.
from keras.layers import Input, Dense, Activation
from keras.models import Model
import numpy as np
import matplotlib.pyplot as plt
import keras.backend as K
ipt = Input(shape=(8,))
x = Dense(10, activation=None)(ipt)
x = Activation('relu')(x)
out = Dense(1, activation='sigmoid')(x)
model = Model(ipt, out)
model.compile('adam', 'binary_crossentropy')
X = np.random.randn(16, 8)
outs1 = get_layer_outputs(model, model.layers[1], X, 1) # Dense
outs2 = get_layer_outputs(model, model.layers[2], X, 1) # Activation
plt.hist(np.ndarray.flatten(outs1), bins=200); plt.show()
plt.hist(np.ndarray.flatten(outs2), bins=200); plt.show()
Function used:
def get_layer_outputs(model, layer, input_data, learning_phase=1):
    layer_fn = K.function([model.input, K.learning_phase()], layer.output)
    return layer_fn([input_data, learning_phase])
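The distinction between pre-activations and activations can also be seen without Keras. The following plain-Python sketch computes both for a tiny dense layer; the helper names, weights, and inputs are made up for illustration and do not correspond to any Keras API.

```python
def dense_pre_activation(x, weights, bias):
    # z_j = sum_i x_i * w_ji + b_j, with one weight row per output unit
    return [sum(xi * w for xi, w in zip(x, row)) + b for row, b in zip(weights, bias)]

def relu(z):
    # Element-wise max(0, z)
    return [max(0.0, v) for v in z]

x = [1.0, -2.0]
weights = [[0.5, -0.25], [-1.0, 1.0]]  # two output units
bias = [0.0, 0.5]

pre = dense_pre_activation(x, weights, bias)   # what Dense(activation=None) would output
post = relu(pre)                               # what the following Activation('relu') would output
print(pre, post)
```

Here the second unit is negative before the activation and clipped to zero after it, which is the kind of difference the two histograms in the answer are meant to reveal.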
