Add extra output to existing Chainer network - python

Let's say I create a simple fully connected network:
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Sequential
model = Sequential(
    L.Linear(n_in, n_hidden),
    F.relu,
    L.Linear(n_hidden, n_hidden),
    F.relu,
    L.Linear(n_hidden, n_out)
)
# Compute the forward pass
y = model(x)
I want to train this model with n_out outputs, then, after it is trained, to add extra outputs before fine-tuning the network.
I have found ways to remove the last layer in order to retrain a new last layer, however this is not what I want: I want to keep the weights of the existing outputs. The weights of the new outputs would be initialized randomly.

How about introducing an additional linear layer L.Linear(n_hidden, n_extra_out) (without removing any of the existing ones), where n_extra_out is the number of additional outputs? You can then extract the output of the last F.relu (you might want to consider replacing the Sequential object with an instance of a chainer.Chain implementation for this, similar to this example: https://github.com/chainer/chainer/blob/master/examples/mnist/train_mnist.py#L16) and pass it as input to both your pretrained last linear layer and the new layer. The two outputs can then be concatenated using F.concat.
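A minimal sketch of that idea as a chainer.Chain (the class name, n_extra_out, and the link names l1/l2/l3/l_extra are illustrative; you would copy the pretrained parameters into l1, l2 and l3, e.g. with copyparams, before fine-tuning):
import chainer
import chainer.functions as F
import chainer.links as L

class ExtendedModel(chainer.Chain):
    """Same body as the pretrained network, plus an extra output head."""
    def __init__(self, n_in, n_hidden, n_out, n_extra_out):
        super(ExtendedModel, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(n_in, n_hidden)
            self.l2 = L.Linear(n_hidden, n_hidden)
            self.l3 = L.Linear(n_hidden, n_out)             # pretrained head
            self.l_extra = L.Linear(n_hidden, n_extra_out)  # new head, random init

    def __call__(self, x):
        h = F.relu(self.l1(x))
        h = F.relu(self.l2(h))
        # Feed the last hidden representation to both heads and concatenate.
        return F.concat((self.l3(h), self.l_extra(h)), axis=1)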

Related

Is it possible to create a model in Keras Functional API without an input layer?

I would like to create a model consisting of 2 convolutional, one flatten, and one dense layer in Keras. This would be a model with shared weights, so without any predefined input layer.
It is possible to do this the sequential way:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu))
model.add(tf.keras.layers.Conv2D(20,3,2,'valid',activation=tf.nn.relu))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(200,activation=tf.nn.relu))
However, using the Functional API produces a TypeError:
model2 = tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu)
model2 = tf.keras.layers.Conv2D(20,3,2,'valid',activation=tf.nn.relu)(model2)
model2 = tf.keras.layers.Flatten()(model2)
model2 = tf.keras.layers.Dense(200,activation=tf.nn.relu)(model2)
Error:
TypeError: Inputs to a layer should be tensors. Got: <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7fb060598100>
Is it impossible to do this way, or am I missing something?
The Keras Sequential API is designed to be easier to use, and as a result is less flexible than the Functional API. The benefit of this is that the input 'layer' shape can be inferred automatically from the shape of the data you pass to the model. The downside is that this easier-to-use model is simplified, so you can't do things like use multiple inputs.
From the keras docs:
A Sequential model is not appropriate when:
Your model has multiple inputs or multiple outputs
Any of your layers has multiple inputs or multiple outputs
You need to do layer sharing
You want non-linear topology (e.g. a residual connection, a multi-branch model)
The Functional API is designed to be more flexible (e.g. multiple inputs), and so it doesn't make any sort of automatic inference for you, hence the error. You must explicitly pass an input layer in this case. For your use case it might seem odd that it doesn't automatically infer the shape, but when you consider the wider use-case scenario it makes sense.
So the second scenario should be:
model2 = tf.keras.layers.Input((10,3,2)) # specified input layer
model2 = tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu)(model2)
model2 = tf.keras.layers.Conv2D(20,3,2,'valid',activation=tf.nn.relu)(model2)
model2 = tf.keras.layers.Flatten()(model2)
model2 = tf.keras.layers.Dense(200,activation=tf.nn.relu)(model2)
Update
If you want to create two separate models and join them together, you should use the Functional API, and then, due to its constraints, you must use input layers. So you could do something like:
import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense, concatenate, Conv2D
from tensorflow.keras.models import Model
input1 = Input((10,3,2))
model1 = Dense(200,activation=tf.nn.relu)(input1)
input2 = Input((10,3,2))
model2 = Dense(200,activation=tf.nn.relu)(input2)
merged = concatenate([model1, model2])
merged = Conv2D(10,3,2,'valid',activation=tf.nn.relu)(merged)
merged = Flatten()(merged)
merged = Dense(200,activation=tf.nn.relu)(merged)
model = Model(inputs=[input1, input2], outputs=merged)
Above we have two separate inputs and then two Dense layers. You can build these separate branches however you want; then, to merge them and pass them through a convolutional layer, you use a tf.keras.layers.concatenate layer, and you can continue the joint model from there. Wrapping the whole thing inside a Model object then gives you access to training and inference methods like fit/predict etc.
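Training and inference then work like any other Keras model, except that fit/predict take a list of arrays, one per Input layer. A minimal sketch using the model built above, with random dummy data (shapes and hyperparameters are just illustrative):
import numpy as np

# Two input arrays, matching the two Input((10, 3, 2)) layers above.
x1 = np.random.random((32, 10, 3, 2))
x2 = np.random.random((32, 10, 3, 2))
y = np.random.random((32, 200))  # matches the final Dense(200)

model.compile(optimizer='adam', loss='mse')
model.fit([x1, x2], y, epochs=1)
preds = model.predict([x1, x2])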
The linking in Keras works by propagating tensors through the layers. So in your second example, at the beginning model2 is an instance of a keras.layers.Layer and not a tf.Tensor; that is why you get the error.
Input creates a tensor, which can then be used to link the layers. So unless there is a specific reason not to, you just add one:
model2 = tf.keras.layers.Input((10,3,2))
model2 = tf.keras.layers.Conv2D(10,3,2,'valid',activation=tf.nn.relu)(model2)

ConvLSTM2D for one-to-many network

I want to use a ConvLSTM2D layer for a multi-output regression model. One image should be the input, and depending on the image a certain number of values should be the output, padded by zeros. My question is: what function should I use to feed the same image as input at every time step?
If I'm using
import keras.backend as K
K.tile(input, number_timesteps)
I'm getting the error:
AttributeError: 'Tensor' object has no attribute '_keras_history'.
Is there any other way to solve this or do I have to input the same image multiple times?
All keras tensors in a model must be produced by a Layer.
When you use backend functions, you're not using layers.
You can use Lambda layers to wrap custom and backend functions:
tiledOutputs = Lambda(lambda x: K.tile(x, number_timesteps))(imageInputs)
Or add the layer to a sequential model:
model.add(Lambda(lambda x: K.tile(x, number_timesteps)))
But you're probably looking for K.stack([x]*number_timesteps, axis=1).
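For instance, a minimal sketch of the K.stack approach wrapped in a Lambda (the image shape, number_timesteps, and the ConvLSTM2D hyperparameters are illustrative):
import keras.backend as K
from keras.layers import Input, Lambda, ConvLSTM2D
from keras.models import Model

number_timesteps = 5

image = Input(shape=(64, 64, 3))  # a single image
# Repeat the same image along a new time axis:
# (batch, 64, 64, 3) -> (batch, number_timesteps, 64, 64, 3)
repeated = Lambda(lambda x: K.stack([x] * number_timesteps, axis=1))(image)
outputs = ConvLSTM2D(filters=16, kernel_size=3, return_sequences=True)(repeated)
model = Model(image, outputs)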

Keras Concatenate Layers: Difference between different types of concatenate functions

I just recently started playing around with Keras and got into making custom layers. However, I am rather confused by the many different types of layers with slightly different names but with the same functionality.
For example, there are 3 different forms of the concatenate function from https://keras.io/layers/merge/ and https://www.tensorflow.org/api_docs/python/tf/keras/backend/concatenate
keras.layers.Concatenate(axis=-1)
keras.layers.concatenate(inputs, axis=-1)
tf.keras.backend.concatenate()
I know the second one is used in the Functional API, but what is the difference between the three? The documentation seems a bit unclear on this.
Also, for the third one, I have seen code that does the following. Why must there be the line setting ._keras_shape after the concatenation?
# Concatenate the summed atom and bond features
atoms_bonds_features = K.concatenate([atoms, summed_bond_features], axis=-1)
# Compute fingerprint
atoms_bonds_features._keras_shape = (None, max_atoms, num_atom_features + num_bond_features)
Lastly, under keras.layers, there always seem to be two duplicates, for example Add() and add(), and so on.
First, the backend: tf.keras.backend.concatenate()
Backend functions are supposed to be used "inside" layers. You'd only use this in Lambda layers, custom layers, custom loss functions, custom metrics, etc.
It works directly on "tensors".
It's not the right choice if you're not going deep into customizing. (And it was a bad choice in your example code -- see details at the end.)
If you dive deep into keras code, you will notice that the Concatenate layer uses this function internally:
import keras.backend as K
class Concatenate(_Merge):
    #blablabla
    def _merge_function(self, inputs):
        return K.concatenate(inputs, axis=self.axis)
    #blablabla
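So if you do need the backend function in model-building code, the rule above means wrapping it in a Lambda layer; a small illustrative sketch:
import keras.backend as K
from keras.layers import Input, Lambda

t1 = Input(shape=(4,))
t2 = Input(shape=(4,))
# The backend call lives inside a Lambda, so Keras still sees a proper layer.
joined = Lambda(lambda ts: K.concatenate(ts, axis=-1))([t1, t2])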
Then, the Layer: keras.layers.Concatenate(axis=-1)
As with any other Keras layer, you instantiate it and call it on tensors.
Pretty straightforward:
#in a functional API model:
inputTensor1 = Input(shape) #or some tensor coming out of any other layer
inputTensor2 = Input(shape2) #or some tensor coming out of any other layer
#first parentheses are creating an instance of the layer
#second parentheses are "calling" the layer on the input tensors
outputTensor = keras.layers.Concatenate(axis=someAxis)([inputTensor1, inputTensor2])
This is not suited for sequential models, unless the previous layer outputs a list (this is possible but not common).
Finally, the concatenate function from the layers module: keras.layers.concatenate(inputs, axis=-1)
This is not a layer. This is a function that will return the tensor produced by an internal Concatenate layer.
The code is simple:
def concatenate(inputs, axis=-1, **kwargs):
    #blablabla
    return Concatenate(axis=axis, **kwargs)(inputs)
Older functions
In Keras 1, people had functions that were meant to receive "layers" as input and return an output "layer". Their names were related to the word merge.
But since Keras 2 doesn't mention or document these, I'd probably avoid using them, and if old code is found, I'd probably update it to proper Keras 2 code.
Why the _keras_shape word?
This backend function was not supposed to be used in high-level code. The coder should have used a Concatenate layer.
atoms_bonds_features = Concatenate(axis=-1)([atoms, summed_bond_features])
#just this line is perfect
Keras layers add the _keras_shape property to all their output tensors, and Keras uses this property for inferring the shapes of the entire model.
If you use any backend function "outside" a layer or loss/metric, your output tensor will lack this property and an error will appear saying that _keras_shape doesn't exist.
The coder is creating a bad workaround by adding the property manually, when it should have been added by a proper Keras layer. (This may work now, but in case of Keras updates this code will break, while proper code will remain OK.)
Keras historically supports two different interfaces for its layers: the newer functional one and the older one, which requires model.add() calls, hence the two different functions.
As for TF's concatenate(): it does not do everything that is required for Keras to work, hence the additional call to set the ._keras_shape variable correctly and not upset Keras, which expects that variable to have a particular value.

How to access weight variables in Keras layers in tensor form for clip_by_weight?

I'm implementing WGAN and need to clip weight variables.
I'm currently using TensorFlow with Keras as the high-level API, building layers with Keras to avoid the manual creation and initialization of variables.
The problem is that WGAN needs to clip weight variables. This can be done using tf.clip_by_value(x, v0, v1) once I have those weight variable tensors, but I don't know how to get them safely.
One possible solution may be using tf.get_collection() to get all trainable variables. But I don't know how to get only the weight variables without the bias variables.
Another solution is layer.get_weights(), but it returns numpy arrays; although I can clip them with numpy APIs and set them using layer.set_weights(), this may require CPU-GPU cooperation and may not be a good choice, since the clip operation needs to be performed on each training step.
The only way I know is to access them directly using their exact variable names, which I can get from the lower-level TF APIs or from TensorBoard, but this may not be safe since the naming rules of Keras are not guaranteed to be stable.
Is there any clean way to perform clip_by_value only on those Ws with Tensorflow and Keras?
You can use the constraints class (here) to implement new constraints on the parameters.
Here is how you can easily implement clip on weights and use it in your model.
from keras.constraints import Constraint
from keras import backend as K

class WeightClip(Constraint):
    '''Clips the weights incident to each hidden unit to be inside a range.'''
    def __init__(self, c=2):
        self.c = c

    def __call__(self, p):
        return K.clip(p, -self.c, self.c)

    def get_config(self):
        return {'name': self.__class__.__name__,
                'c': self.c}

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(30, input_dim=100, W_constraint=WeightClip(2)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop')

X = np.random.random((1000, 100))
Y = np.random.random((1000, 1))
model.fit(X, Y)
I have tested that the above code runs, but not the validity of the constraint. You can do so by getting the model weights after training using model.get_weights() or model.layers[idx].get_weights() and checking whether they abide by the constraint.
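For example, a quick (illustrative) sanity check on the constrained layer after training:
# The first Dense layer's kernel was constrained to [-2, 2] by WeightClip(2).
W = model.layers[0].get_weights()[0]
assert np.all(np.abs(W) <= 2.0), "weights escaped the clip range"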
Note: the constraint is not added to all the model weights, but just to the weights of the specific layer where it is used. Also, W_constraint adds the constraint to the W param and b_constraint to the b (bias) param.

Tensorflow: What's the difference between the use of tf.map_fn() or tf.nn.dynamic_rnn() to apply layers before an LSTM?

This question is about coding strategy using Tensorflow. I would like to create a small classifier network made of:
1: an input
2: a simple fully connected layer (W*x+B)
3: an LSTM layer
4: a softmax layer
5: an output
In TensorFlow, to use the class tf.nn.dynamic_rnn(), we need to feed a batch of sequences to the network. So far, it works perfectly (I love this library).
But as I want to apply a simple layer to each feature vector of my sequences (the 2nd layer in my description), I'm wondering:
Do I precede my LSTM layer with this simple layer and pass both to the tf.nn.dynamic_rnn() operation...
OR
Do I use the function tf.map_fn() twice (once to unpack batches, once to unpack sequences), which, if I understood well, is able to unpack my sequences and apply a layer to each feature row?
Normally, both should give me the same result? If that's the case, which should I use?
Thank you for your time!
I recently encountered a similar scenario, where I'd like to chain recurrent and non-recurrent layers.
Do I precede my LSTM layer with this simple layer and pass both to the tf.nn.dynamic_rnn() operation...
This won't work. The function dynamic_rnn expects a cell as its first argument. A cell is a class that inherits from tf.nn.rnn_cell.RNNCell. Additionally, the second input argument to dynamic_rnn should be a tensor with at least 3 dimensions, where the first two dimensions are batch and time (time_major=False) or time and batch (time_major=True).
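For reference, a minimal (illustrative) call matching those expectations, in the TF 1.x-era API used here:
import tensorflow as tf

# Batch-major input: (batch, time, features); shapes are illustrative.
inputs = tf.placeholder(tf.float32, [None, 10, 8])
cell = tf.nn.rnn_cell.LSTMCell(num_units=64)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)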
Do I use the function tf.map_fn() twice (once to unpack batches, once to unpack sequences), which, if I understood well, is able to unpack my sequences and apply a layer to each feature row?
This might work, but it doesn't appear to me to be an efficient or clean solution. Firstly, it should not be necessary to 'unpack batches', as you presumably want to perform some operation on batches of features and time steps, where each observation in a batch is independent of the others.
My solution to this particular problem was to create a sub-class of tf.nn.rnn_cell.RNNCell. In my case I wanted a simple feedforward layer that would iterate over all of the time steps and that could be used in dynamic_rnn:
import tensorflow as tf

class FeedforwardCell(tf.nn.rnn_cell.RNNCell):
    """A stateless feedforward cell that can be used with MultiRNNCell."""

    def __init__(self, num_units, activation=tf.tanh, dtype=tf.float32):
        self._num_units = num_units
        self._activation = activation
        # Store a dummy state to make dynamic_rnn happy.
        self.dummy = tf.constant([[0.0]], dtype=dtype)

    @property
    def state_size(self):
        return 1

    @property
    def output_size(self):
        return self._num_units

    def zero_state(self, batch_size, dtype):
        return self.dummy

    def __call__(self, inputs, state, scope=None):
        """Basic feedforward: output = activation(W * input)."""
        with tf.variable_scope(scope or type(self).__name__):  # "FeedforwardCell"
            output = self._activation(tf.nn.rnn_cell._linear(
                [inputs], self._num_units, True))
        return output, self.dummy
An instance of this class can be passed, in a list with "normal" RNN cells, to a tf.nn.rnn_cell.MultiRNNCell initializer. The resulting object instance can be passed as the cell input argument to dynamic_rnn.
Important to note: dynamic_rnn expects that a recurrent cell returns a state when called. I therefore use dummy in FeedforwardCell as a fake state variable.
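An illustrative usage sketch, stacking the feedforward cell with an ordinary LSTM cell (input shapes and num_units values are arbitrary):
inputs = tf.placeholder(tf.float32, [None, 10, 8])
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(
    [FeedforwardCell(num_units=64), tf.nn.rnn_cell.LSTMCell(num_units=64)])
outputs, state = tf.nn.dynamic_rnn(stacked_cell, inputs, dtype=tf.float32)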
My solution might not be the smoothest or best way to chain recurrent and non-recurrent layers together. I'd be interested in hearing from other Tensorflow users about their suggestions.
Edit
If you choose to use the sequence_length input argument of dynamic_rnn, then state_size should be self._num_units and the dummy state should have shape [batch_size, self.state_size]. In other words, the state cannot be a scalar. Note that bidirectional_dynamic_rnn requires that the sequence_length argument is not None, whereas dynamic_rnn does not have this requirement. (This is weakly documented in the TF documentation.)
