How to plot keras activation functions in a notebook - python

I wanted to plot all Keras activation functions, but some of them are not working. For example, linear throws an error:
AttributeError: 'Series' object has no attribute 'eval'
which is weird. How can I plot the rest of my activation functions?
import numpy as np
import pandas as pd
from keras import activations
from keras import backend as K

points = 100
zeros = np.zeros((points,1))
df = pd.DataFrame({"activation": np.linspace(-1.2,1.2,points)})
df["softmax"] = K.eval(activations.elu(df["activation"]))
#df["linear"] = K.eval(activations.linear(df["activation"]))
df["tanh"] = K.eval(activations.tanh(df["activation"]))
df["sigmoid"] = K.eval(activations.sigmoid(df["activation"]))
df["relu"] = K.eval(activations.relu(df["activation"]))
#df["hard_sigmoid"] = K.eval(activations.hard_sigmoid(df["activation"]))
#df["exponential"] = K.eval(activations.exponential(df["activation"]))
df["softsign"] = K.eval(activations.softsign(df["activation"]))
df["softplus"] = K.eval(activations.softplus(df["activation"]))
#df["selu"] = K.eval(activations.selu(df["activation"]))
df["elu"] = K.eval(activations.elu(df["activation"]))
df.plot(x="activation", figsize=(15,15))

That's because the linear activation returns the input without any modifications:
def linear(x):
    """Linear (i.e. identity) activation function.
    """
    return x
Since you are passing a Pandas Series as input, the same Pandas Series will be returned and therefore you don't need to use K.eval():
df["linear"] = activations.linear(df["activation"])
As for the selu activation, you need to reshape the input to (n_samples, n_output):
df["selu"] = K.eval(activations.selu(df["activation"].values.reshape(-1,1)))
And as for the hard_sigmoid activation, its input should be explicitly a Tensor which you can create using K.variable():
df["hard_sigmoid"] = K.eval(activations.hard_sigmoid(K.variable(df["activation"].values)))
Further, the exponential activation works as you have written it, so there is no need for modifications.
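As a side note, and as an assumption about your environment rather than part of the answer above: if you run TensorFlow 2.x with eager execution, you can evaluate every activation directly with .numpy() and skip K.eval altogether. A minimal sketch:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import activations

points = 100
x = tf.constant(np.linspace(-1.2, 1.2, points), dtype=tf.float32)
df = pd.DataFrame({"activation": x.numpy()})

# every listed activation accepts a 1-D float tensor eagerly
for name in ["linear", "tanh", "sigmoid", "relu", "softsign", "softplus",
             "elu", "selu", "hard_sigmoid", "exponential"]:
    df[name] = getattr(activations, name)(x).numpy()

df.plot(x="activation", figsize=(15, 15))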


Keras won't broadcast-multiply the model output with a mask designed for the entire mini batch

I have a data generator that produces batches of input data (X) and targets (Y), and also a mask (batch_mask) to be applied to the model output (the same mask applies to all the datapoints in the batch; there are different masks for different batches, and the data generator takes care of this).
As a result, the first dimension of batch_mask could have shape 1 or batch_size (by repeating the same mask along the first dimension batch_size times). I was expecting Keras to let me use either, and I wanted to simply create masks having a shape of 1 on the first dimension.
However, when I tried this, I got the error:
ValueError: Data cardinality is ambiguous:
x sizes: 128, 1
y sizes: 128
Make sure all arrays contain the same number of samples.
Why won't Keras broadcast along the first dimension? It seems like this should not be complicated.
Here's some minimal example code to observe this behavior:
import tensorflow.keras as tfk
import numpy as np
#######################
# 1. model definition #
#######################
# model parameters
nfeatures_in = 6
target_size = 8
# model inputs
input = tfk.layers.Input(nfeatures_in)
input_mask = tfk.layers.Input(target_size)
# model graph
out = tfk.layers.Dense(target_size)(input)
out_masked = tfk.layers.Multiply()((out,input_mask)) # multiply all model outputs in the batch by the same mask
model = tfk.Model(inputs=(input, input_mask), outputs=out_masked)
##########################
# 2. dummy data creation #
##########################
batch_size = 32
# create the mask for the batch
zeros_vector = np.zeros((1,target_size)) # "batch_size"==1
zeros_vector[0,:6] = 1
batch_mask = zeros_vector
# dummy data creation
X = np.random.randn(batch_size, 6)
Y = np.random.randn(batch_size, target_size)*batch_mask # the target is masked by design in each batch
############################
# 3. compile model and fit #
############################
model.compile(optimizer="Adam", loss="mse")
model.fit((X, batch_mask), Y, batch_size=batch_size)
I know I could make this work by either:
repeating the mask to make the first dimension of batch_mask be the size of the first dimension of X (instead of 1).
using pure tensorflow (but I feel like broadcasting along the batch dimension should not be a problem for Keras).
How can I make this work with Keras?
Thank you!
You can create an IdentityLayer that receives the batch_mask as an external parameter and returns it as a tensor.
import tensorflow as tf  # needed for tf.convert_to_tensor below

class IdentityLayer(tfk.layers.Layer):
    def __init__(self, my_mask, **kwargs):
        super(IdentityLayer, self).__init__()
        self.my_mask = my_mask

    def call(self, _):
        my_mask = tf.convert_to_tensor(self.my_mask, dtype=tf.float32)
        return my_mask

    def get_config(self):
        config = super().get_config()
        config.update({
            "my_mask": self.my_mask,
        })
        return config
The usage of IdentityLayer in a model is straightforward:
# model inputs
input = tfk.layers.Input(nfeatures_in)
input_mask = IdentityLayer(batch_mask)(input)
# model graph
out = tfk.layers.Dense(target_size)(input)
out_masked = tfk.layers.Multiply()((out,input_mask))
model = tfk.Model(inputs=input, outputs=out_masked)
Where batch_mask is a numpy array created as you reported:
zeros_vector = np.zeros((1,target_size)) # "batch_size"==1
zeros_vector[0,:6] = 1
batch_mask = zeros_vector
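Since the mask now enters the graph through IdentityLayer rather than as a Keras Input, the model has a single input. A hedged sketch of how compiling and fitting would then look, reusing X, Y and batch_size from the question:
model.compile(optimizer="Adam", loss="mse")
model.fit(X, Y, batch_size=batch_size)  # no need to pass batch_mask as an input
preds = model.predict(X)                # the mask is baked into the graph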
The solution is to (properly) use a DataGenerator.
See the gist with the working code: https://gist.github.com/iranroman/2aaecf5b5621051df6b1b6b5394e5ef3
Thank you @Marco Cerliani for the discussion that led to figuring out the solution.

Keras. Initializing a bidirectional LSTM. Passing word embeddings

In the implementation I am using, the LSTM is initialized in the following way:
l_lstm = Bidirectional(LSTM(64, return_sequences=True))(embedded_sequences)
What I don't really understand (and it might be because of my lack of experience with Python in general) is the notation l_lstm = Bidirectional(LSTM(...))(embedded_sequences).
I don't get what I am passing embedded_sequences to. It is not a parameter of LSTM(), but it also does not seem to be an argument of Bidirectional(), since it stands separately.
Here is the __init__ of Bidirectional from the source code:
def __init__(self, layer, merge_mode='concat', weights=None, **kwargs):
    if merge_mode not in ['sum', 'mul', 'ave', 'concat', None]:
        raise ValueError('Invalid merge mode. '
                         'Merge mode should be one of '
                         '{"sum", "mul", "ave", "concat", None}')
    self.forward_layer = copy.copy(layer)
    config = layer.get_config()
    config['go_backwards'] = not config['go_backwards']
    self.backward_layer = layer.__class__.from_config(config)
    self.forward_layer.name = 'forward_' + self.forward_layer.name
    self.backward_layer.name = 'backward_' + self.backward_layer.name
    self.merge_mode = merge_mode
    if weights:
        nw = len(weights)
        self.forward_layer.initial_weights = weights[:nw // 2]
        self.backward_layer.initial_weights = weights[nw // 2:]
    self.stateful = layer.stateful
    self.return_sequences = layer.return_sequences
    self.return_state = layer.return_state
    self.supports_masking = True
    self._trainable = True
    super(Bidirectional, self).__init__(layer, **kwargs)
    self.input_spec = layer.input_spec
    self._num_constants = None
Let's try to break down what is going on:
You start with LSTM(...), which creates an LSTM layer. Layers in Keras are callable, which means you can use them like functions. For example, lstm = LSTM(...) and then lstm(some_input) will call the LSTM on the given input tensor.
Bidirectional(...) wraps any RNN layer and returns another layer that, when called, applies the wrapped layer in both directions. So l_lstm = Bidirectional(LSTM(...)) is a layer that, when called with some input, will apply the LSTM in both directions. Note: Bidirectional creates a copy of the passed LSTM layer, so the forward and backward passes use different LSTMs.
Finally, when you call Bidirectional(LSTM(...))(embedded_sequences), the bidirectional layer takes the input sequences, passes them to the wrapped LSTMs in both directions, collects their outputs and concatenates them.
To understand more about layers and their callable nature, you can look at the functional API guide of the documentation.
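To make the two-step, callable nature of layers concrete, here is a minimal sketch; the vocabulary and embedding sizes are arbitrary assumptions, not values from the original post:
from tensorflow.keras.layers import Input, Embedding, LSTM, Bidirectional
from tensorflow.keras.models import Model

inp = Input(shape=(None,), dtype='int32')          # variable-length sequences of token ids
embedded_sequences = Embedding(10000, 100)(inp)    # create the Embedding layer, then call it

bi_lstm = Bidirectional(LSTM(64, return_sequences=True))  # step 1: create the layer object
l_lstm = bi_lstm(embedded_sequences)                      # step 2: call it on a tensor

model = Model(inp, l_lstm)
model.summary()  # last dimension is 128: forward and backward outputs concatenated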

Bool value of Tensor with more than one value is ambiguous in Pytorch

I want to create a model in PyTorch, but I can't compute the loss.
It always returns Bool value of Tensor with more than one value is ambiguous.
Actually, when I run the example code below, it works:
loss = CrossEntropyLoss()
input = torch.randn(8, 5)
input
target = torch.empty(8,dtype=torch.long).random_(5)
target
output = loss(input, target)
Here is my code:
################################################################################
##
##
import torch
from torch.nn import Conv2d, MaxPool2d, Linear, CrossEntropyLoss, MultiLabelSoftMarginLoss
from torch.nn.functional import relu, conv2d, max_pool2d, linear, softmax
from torch.optim import adadelta
##
##
## Train
Train = {}
Train["Image"] = torch.rand(2000, 3, 76, 76)
Train["Variable"] = torch.rand(2000, 6)
Train["Label"] = torch.empty(2000, dtype=torch.long).random_(2)
##
##
## Valid
Valid = {}
Valid["Image"] = torch.rand(150, 3, 76, 76)
Valid["Variable"] = torch.rand(150, 6)
Valid["Label"] = torch.empty(150, dtype=torch.long).random_(2)
################################################################################
##
##
## Model
ImageTerm = Train["Image"]
VariableTerm = Train["Variable"]
Pip = Conv2d(in_channels=3, out_channels=32, kernel_size=(3,3), stride=1, padding=0)(ImageTerm)
Pip = MaxPool2d(kernel_size=(2,2), stride=None, padding=0)(Pip)
Pip = Conv2d(in_channels=32, out_channels=64, kernel_size=(3,3), stride=1, padding=0)(Pip)
Pip = MaxPool2d(kernel_size=(2,2), stride=None, padding=0)(Pip)
Pip = Pip.view(2000, -1)
Pip = torch.cat([Pip, VariableTerm], 1)
Pip = Linear(in_features=18502, out_features=1000 , bias=True)(Pip)
Pip = Linear(in_features=1000, out_features=2 , bias=True)(Pip)
##
##
## Loss
Loss = CrossEntropyLoss(Pip, Train["Label"])
The error occurs on Loss = CrossEntropyLoss(Pip, Train["Label"]).
Thanks.
In your minimal example, you create an object "loss" of the class "CrossEntropyLoss". This object is able to compute your loss as
loss(input, target)
However, in your actual code, you try to create the object "Loss", while passing Pip and the labels to the "CrossEntropyLoss" class constructor.
Instead, try the following:
loss = CrossEntropyLoss()
loss(Pip, Train["Label"])
Edit (explanation of the error message): The error message Bool value of Tensor with more than one value is ambiguous appears when you try to cast a tensor into a bool value. This happens most commonly when passing the tensor to an if condition, e.g.
input = torch.randn(8, 5)
if input:
some_code()
The second argument of the CrossEntropyLoss class constructor expects a boolean. Thus, in the line
Loss = CrossEntropyLoss(Pip, Train["Label"])
the constructor will at some point try to use the passed tensor Train["Label"] as a boolean, which throws the mentioned error message.
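A minimal, self-contained sketch of the instantiate-then-call pattern (the shapes here are placeholders, not the question's data):
import torch
from torch.nn import CrossEntropyLoss

criterion = CrossEntropyLoss()                        # instantiate the loss module (no data here)
logits = torch.randn(8, 5, requires_grad=True)        # (batch, num_classes)
labels = torch.empty(8, dtype=torch.long).random_(5)  # class indices in [0, 5)
loss = criterion(logits, labels)                      # call the module with (input, target)
loss.backward()                                       # gradients flow back to logits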
You cannot use the class CrossEntropyLoss directly on your data. You should instantiate it before using it.
original code:
loss = CrossEntropyLoss(Pip, Train["Label"])
should be replaced by:
loss = CrossEntropyLoss()
loss(Pip, Train["Label"])
First instantiate the loss:
L = CrossEntropyLoss()
Then compute the loss:
L(y_pred, y_true)
This will fix the error.
If you landed on this page because pyplot is not displaying your tensor image properly, use plt.imshow() instead of plt.show().
For example, instead of
plt.show(images[0].permute(1,2,0))
use
plt.imshow(images[0].permute(1,2,0))

How to create Keras model with optional inputs

I'm looking for a way to create a Keras model with optional inputs. In raw TensorFlow, you can create placeholders with optional inputs as follows:
import numpy as np
import tensorflow as tf

def main():
    required_input = tf.placeholder(
        tf.float32,
        shape=(None, 2),
        name='required_input')

    default_optional_input = tf.random_uniform(
        shape=(tf.shape(required_input)[0], 3))

    optional_input = tf.placeholder_with_default(
        default_optional_input,
        shape=(None, 3),
        name='optional_input')

    output = tf.concat((required_input, optional_input), axis=-1)

    with tf.Session() as session:
        with_optional_input_output_np = session.run(output, feed_dict={
            required_input: np.random.uniform(size=(4, 2)),
            optional_input: np.random.uniform(size=(4, 3)),
        })
        print(f"with optional input: {with_optional_input_output_np}")

        without_optional_input_output_np = session.run(output, feed_dict={
            required_input: np.random.uniform(size=(4, 2)),
        })
        print(f"without optional input: {without_optional_input_output_np}")

if __name__ == '__main__':
    main()
In a similar fashion, I would like to be able to have optional inputs for my Keras model. It seems like the tensor argument of keras.layers.Input.__init__ might be what I'm looking for, but it doesn't work the way I was expecting (i.e. the same way as tf.placeholder_with_default shown above). Here's an example that breaks:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

def create_model(output_size):
    required_input = tf.keras.layers.Input(
        shape=(13, ), dtype='float32', name='required_input')

    batch_size = tf.shape(required_input)[0]

    def sample_optional_input(inputs, batch_size=None):
        base_distribution = tfp.distributions.MultivariateNormalDiag(
            loc=tf.zeros(output_size),
            scale_diag=tf.ones(output_size),
            name='sample_optional_input')
        return base_distribution.sample(batch_size)

    default_optional_input = tf.keras.layers.Lambda(
        sample_optional_input,
        arguments={'batch_size': batch_size}
    )(None)

    optional_input = tf.keras.layers.Input(
        shape=(output_size, ),
        dtype='float32',
        name='optional_input',
        tensor=default_optional_input)

    concat = tf.keras.layers.Concatenate(axis=-1)(
        [required_input, optional_input])
    dense = tf.keras.layers.Dense(
        output_size, activation='relu')(concat)

    model = tf.keras.Model(
        inputs=[required_input, optional_input],
        outputs=[dense])

    return model

def main():
    model = create_model(output_size=3)

    required_input_np = np.random.normal(size=(4, 13))
    outputs_np = model.predict({'required_input': required_input_np})
    print(f"outputs_np: {outputs_np}")

    required_input = tf.random_normal(shape=(4, 13))
    outputs = model({'required_input': required_input})
    print(f"outputs: {outputs}")

if __name__ == '__main__':
    main()
The first call to the model.predict seems to give correct output, but for some reason, the direct call to model fails with the following error:
ValueError: Layer model expects 2 inputs, but it received 1 input tensors. Inputs received: []
Can the tensor argument in Input.__init__ be used to implement optional inputs for Keras model as in my example above? If yes, what should I change in my example to make it run correctly? If not, what is the expected way of creating optional inputs in Keras?
I really don't think it's possible without workarounds. Keras was not meant for that.
But, noticing that you are using two different session.run commands for each case, it seems that it should be easy to do it with two models. One model uses the optional input, the other doesn't. You choose which one to use the same way you choose which session.run() to call.
That said, you can use Input(tensor=...) or simply create the optional input inside a Lambda layer. Both things are fine. But don't use Input(shape=..., tensor=...), these are redundant arguments and sometimes Keras does not deal well with redundancies like this.
Ideally, keep all operations inside Lambda layers, even the tf.shape operation.
That said:
required_input = tf.keras.layers.Input(
    shape=(13, ), dtype='float32', name='required_input')
#needs the input for the case you want to pass it:
optional_input_when_used = tf.keras.layers.Input(shape=(output_size,))
#operations should be inside Lambda layers
batch_size = Lambda(lambda x: tf.shape(x)[0])(required_input)
#updated for using the batch size coming from lambda
#you didn't use "inputs" anywhere in this function
def sample_optional_input(batch_size):
    base_distribution = tfp.distributions.MultivariateNormalDiag(
        loc=tf.zeros(output_size),
        scale_diag=tf.ones(output_size),
        name='sample_optional_input')
    return base_distribution.sample(batch_size)
#updated for using the batch size as input
default_optional_input = tf.keras.layers.Lambda(sample_optional_input)(batch_size)
#let's skip the concat for now - notice I'm not "using" this layer yet
dense_layer = tf.keras.layers.Dense(output_size, activation='relu')
#you could create the rest of the model here if it's big, so you don't create it twice
#(check the final section of this answer)
Model using passed input:
concat_when_used = tf.keras.layers.Concatenate(axis=-1)(
    [required_input, optional_input_when_used]
)
dense_when_used = dense_layer(concat_when_used)
#or final_part_of_the_model(concat_when_used)
model_when_used = Model([required_input, optional_input_when_used], dense_when_used)
Model not using the optional input:
concat_not_used = tf.keras.layers.Concatenate(axis=-1)(
    [required_input, default_optional_input]
)
dense_not_used = dense_layer(concat_not_used)
#or final_part_of_the_model(concat_not_used)
model_not_used = Model(required_input, dense_not_used)
It's ok to create two models like this and choose one to use (both models share the final layers, so they will always be trained together)
Now, at the point where you would choose which session.run to call, you instead choose which model to use:
model_when_used.predict([x1, x2])
model_when_used.fit([x1,x2], y)
model_not_used.predict(x)
model_not_used.fit(x, y)
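Because dense_layer (or the shared final part below) is the same Python object in both models, its weights are shared. As a hedged sketch, with x and y as above:
model_when_used.compile(optimizer="adam", loss="mse")
model_not_used.compile(optimizer="adam", loss="mse")

# training either variant updates the shared Dense weights, so the other variant benefits too
model_not_used.fit(x, y, epochs=1)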
How to create a shared final part?
If your final part is big, you will not want to call everything twice to create two models. In this case, create a final model first:
input_for_final = Input(shape_after_concat)
out = Dense(....)(input_for_final)
out = Dense(....)(out)
out = Dense(....)(out)
.......
final_part_of_the_model = Model(input_for_final, out)
Then use this final part in the two models above:
dense_when_used = final_part_of_the_model(concat_when_used)
dense_not_used = final_part_of_the_model(concat_not_used)

Merging tensor rowwise with a vector in keras

I was hoping to implement a variation of PointNet (https://arxiv.org/pdf/1612.00593.pdf) in Keras, but I'm having trouble repeating the context vector (g) a variable number of times so that I can Concatenate it rowwise with a previous layer that lacks context (pre). I tried Repeat() and keras.backend.Tile().
input = Input(shape=(None,3))
x = TimeDistributed(Dense(128, activation = 'relu'))(input)
pre = TimeDistributed(Dense(256, activation = 'relu'))(x)
g = GlobalMaxPooling1D()(pre)
x = Lambda(merge_on_single, output_shape=(None,512))([pre,g])
print(x.shape)
This is the Lambda definition I came up with:
def merge_on_single(v):
    # v[0] is the variable-length tensor, v[1] is the single vector
    return Concatenate()([K.repeat(v[1], K.get_variable_shape(v[0])), v[0]])
However the following error occurs:
TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [int32, , int32] that don't all match.
UPDATE:
So I was able to get the layers to not give errors by doing the following:
input = Input(shape=(None,3))
num_point = K.placeholder(input.get_shape()[1].value, dtype=tf.int32)
#first global feature layer
x = TimeDistributed(Dense(512, activation = 'relu'))(input)
x = TimeDistributed(Dense(256, activation = 'relu'))(x)
g = GlobalMaxPooling1D()(x)
g = K.reshape(g,(-1,1,256))
g = K.tile(x, [1,num_point,1])
concat_feat = K.concatenate([x, g])
but now, I get the following error:
AttributeError: 'Tensor' object has no attribute '_keras_history'
I suspect the culprit is K.get_variable_shape(v[0]). Because the length dimension of v[0] is variable, its static shape contains None, and that None is the blank entry in the [int32, , int32] type list from your error: the 'Pack' op that assembles those values wants all of its inputs to be of the same type.
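Building on that diagnosis, one way to make the merge work is to keep every operation inside the Lambda and use the dynamic shape (K.shape) instead of the static one. This is a sketch under those assumptions, not the poster's final code:
from tensorflow.keras.layers import Input, Dense, TimeDistributed, GlobalMaxPooling1D, Lambda
from tensorflow.keras import backend as K

def merge_on_single(v):
    pre, g = v                               # pre: (batch, n_points, 256), g: (batch, 256)
    g = K.expand_dims(g, axis=1)             # -> (batch, 1, 256)
    g = K.tile(g, [1, K.shape(pre)[1], 1])   # repeat along the dynamic point dimension
    return K.concatenate([pre, g], axis=-1)  # -> (batch, n_points, 512)

inp = Input(shape=(None, 3))
x = TimeDistributed(Dense(128, activation='relu'))(inp)
pre = TimeDistributed(Dense(256, activation='relu'))(x)
g = GlobalMaxPooling1D()(pre)
x = Lambda(merge_on_single, output_shape=(None, 512))([pre, g])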
