Suppose there's a model defined as a class like this:
class SimpleAutoencoder(object):
    def __init__(self, x):
        self.x = x
        self.input_dim = 92
        self.latent_dim = 10
        self.build_model()

    def build_model(self):
        latent = tf.contrib.layers.fully_connected(self.x,
                                                   self.latent_dim,
                                                   scope='latent',
                                                   activation_fn=tf.nn.relu)
        self.x_hat = tf.contrib.layers.fully_connected(latent,
                                                       self.input_dim,
                                                       scope='output',
                                                       activation_fn=tf.nn.sigmoid)
        self.loss = tf.losses.mean_squared_error(self.x, self.x_hat)
        self.train_op = tf.train.AdamOptimizer().minimize(self.loss)
You train it using an input pipeline for feeding your data:
...
x = iterator.get_next()
model = SimpleAutoencoder(x)
...
## train and save it to disk
Now, when using a placeholder for self.x while building the model, I can give it a name and easily access the input variable when I restore the model for inference. But with the input pipeline, x is not a variable, constant, or placeholder, so I cannot give it a proper name. How can I inject new data into x and feed it through the graph?
Even though the training works, I suspect I am doing something wrong, as the code looks really ugly to me (the part where the pipeline output is passed to the init function).
Please help me with this one! Thank you!
You could get x's name using x.name,
or you could rename x to a name of your taste using x = tf.identity(x, name='my_name').
(With these two solutions, you can feed your values using the name of the tensor, even if x is not a placeholder:
sess.run(my_ops, feed_dict={tensor_name: tensor_value})
)
Or you could replace the entire input pipeline with a placeholder (explained here for the opposite problem, replacing a placeholder with a Dataset input).
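For instance, a minimal sketch of the first two options at inference time (the tensor name 'model_input' and the activation tensor name 'output/Sigmoid:0' are assumptions for illustration; check your graph for the actual names):
# while building: give the pipeline output a stable name
x = tf.identity(iterator.get_next(), name='model_input')
model = SimpleAutoencoder(x)

# later, after restoring the graph for inference
graph = tf.get_default_graph()
x = graph.get_tensor_by_name('model_input:0')
x_hat = graph.get_tensor_by_name('output/Sigmoid:0')
reconstruction = sess.run(x_hat, feed_dict={x: new_data})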
Compare the following code snippets. I implemented a simple Keras model like this:
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input((10, 2))
x = layers.Flatten()(inp)
x = layers.Dense(5)(x)
m = models.Model(inputs=inp, outputs=x)
For one reason or another, I need to have my model written in an object-oriented way. So no problem, it's easy to reimplement that as:
class MyModel(tf.keras.Model):
    def __init__(self, inp_shape, out_size=5):
        super(MyModel, self).__init__()
        self.inp = layers.InputLayer(input_shape=inp_shape)
        self.flatten = layers.Flatten()
        self.dense = layers.Dense(out_size)

    def call(self, a):
        x = self.inp(a)
        x = self.flatten(x)
        x = self.dense(x)
        return x
However, in the second case, when I try to run:
m = MyModel((10,2))
m.summary()
I get:
ValueError: This model has not yet been built. Build the model first by calling `build()` or calling `fit()` with some data, or specify an `input_shape` argument in the first layer(s) for automatic build.
I don't quite get why. Shouldn't the two be equivalent?
The reason is that when you create an instance of this model, you are only creating its layers, not its graph. In short, the output of layer 1 is not yet connected to layer 2, because the layers are entirely separate attributes of the class; only when you call the model do those separate attributes combine to form the graph.
When you define a model in tf.keras with the subclassed API, you need to build the model first by calling build() or by running the model on some data.
m = MyModel((10, 2))
m.build(input_shape=(None, 10, 2))  # <-- build the model (the leading None is the batch dimension)
m.summary()
That said, you also don't need to define self.inp when building the model with the subclassed API. Note that .summary() may not look right to you for a subclassed model; you may need to check this instead.
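Alternatively, a minimal sketch of building the model by running it on some dummy data instead of calling build():
m = MyModel((10, 2))
m(tf.zeros((1, 10, 2)))  # one forward pass on a dummy batch builds the model
m.summary()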
I am trying to apply one idea proposed by Rusu et al. in https://arxiv.org/pdf/1511.06295.pdf, which consists in training a NN while changing the output layer according to the class of the input, i.e., provided that we know the id of the input, we pick the corresponding output layer. This way, all the hidden layers would be trained with all the data, but each output layer would only be trained with its corresponding type of input data.
This is meant to achieve good results in a transfer learning framework.
How can I implement this "change of the last layer" in TensorFlow 2.0?
If you use model subclassing, you can define your own forward pass.
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.block_1 = BlockA()
        self.block_2 = BlockB()
        self.global_pool = layers.GlobalAveragePooling2D()
        self.classifier = layers.Dense(num_classes)

    def call(self, inputs):
        if condition:
            x = self.block_1(inputs)
        else:
            x = self.block_2(inputs)
        x = self.global_pool(x)
        return self.classifier(x)
You'll still have the backprop part to figure out, but I think it's fairly easy if you use a multi-output model and train all your "last layers" at the same time.
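For instance, a minimal sketch of one possible implementation (the backbone, the head count, and the idea of passing a task id alongside the input are illustrative assumptions, not the paper's exact setup); it assumes eager execution so a plain Python index can select the head:
import tensorflow as tf
from tensorflow.keras import layers

class MultiHeadModel(tf.keras.Model):
    def __init__(self, num_heads, num_classes):
        super(MultiHeadModel, self).__init__()
        # shared hidden layers, trained on all of the data
        self.backbone = tf.keras.Sequential([
            layers.Dense(64, activation='relu'),
            layers.Dense(64, activation='relu'),
        ])
        # one output layer ("head") per type of input data
        self.heads = [layers.Dense(num_classes) for _ in range(num_heads)]

    def call(self, x, head_id=0):
        x = self.backbone(x)
        return self.heads[head_id](x)  # pick the head for this task

model = MultiHeadModel(num_heads=3, num_classes=10)
logits = model(tf.random.normal((8, 32)), head_id=1)  # route this batch through head 1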
I have a CNN, and I want to fetch the value of some intermediate layer corresponding to some key from the state dict.
How could this be done?
Thanks.
I think you need to create a new class that redefines the forward pass through a given model. However, you will most likely need to adapt the code to the architecture of your model. You can find an example here:
import torch
from torch.autograd import Variable
from torchvision import models

class extract_layers():
    def __init__(self, model, target_layer):
        self.model = model
        self.target_layer = target_layer

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        module = self.model._modules[self.target_layer]

        # get output of the desired layer
        features = module(x)

        # get output of the whole model
        x = self.model(x)

        return x, features

model = models.vgg19(pretrained=True)
target_layer = 'features'

extractor = extract_layers(model, target_layer)
image = Variable(torch.randn(1, 3, 224, 224))
x, features = extractor(image)
In this case, I am using the predefined vgg19 network from the PyTorch model zoo. Its layers are structured in two modules: features for the convolutional part and classifier for the fully-connected part. Since features wraps all the convolutional layers of the network, this case is straightforward. If your architecture has several layers with different names, you will need to store their outputs using something similar to this:
for name, module in self.model._modules.items():
    x = module(x)  # forward the module individually
    if name in self.target_layer:
        features = x  # store the output of the desired layer
Also, keep in mind that you need to reshape the output of the layer that connects the convolutional part to the fully-connected one. This should be easy to do if you know the name of that layer.
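For instance, a minimal sketch of that reshape inside the loop above (it assumes the fully-connected part is the module named 'classifier', as in vgg19):
for name, module in self.model._modules.items():
    if name == 'classifier':
        x = x.view(x.size(0), -1)  # flatten before the fully-connected part
    x = module(x)
    if name in self.target_layer:
        features = x  # store the output of the desired layer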
There is a function tf.get_variable('name') which allows one to "implicitly" pass parameters into a function, like:
def function(sess, feed):
    with tf.variable_scope('training', reuse=True):
        cost = tf.get_variable('cost')
        value = sess.run(cost, feed_dict=feed)
        # other statements
But what if one wants to pass a tf.placeholder into the function? Is there a similar mechanism for placeholders, i.e., something like tf.get_placeholder()?
def function(sess, cost, X_train, y_train):
    # Note: this is NOT valid TF code
    with tf.variable_scope('training', reuse=True):
        features = tf.get_placeholder('features')
        labels = tf.get_placeholder('labels')
    feed = {features: X_train, labels: y_train}
    value = sess.run(cost, feed_dict=feed)
    print('Cost: %s' % value)
Or does it not make much sense to do this, and is it better to just construct the placeholders inside the function?
Placeholders are just... placeholders. It's pointless "getting" a placeholder as if it had some sort of state (that's what get_variable does: it returns a variable in its current state).
Just use the same Python variable everywhere.
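For instance, a minimal sketch of just passing the placeholders around as ordinary Python variables (the signature mirrors the function from the question):
def function(sess, cost, features, labels, X_train, y_train):
    feed = {features: X_train, labels: y_train}
    value = sess.run(cost, feed_dict=feed)
    print('Cost: %s' % value)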
Also, if you don't want to pass a Python variable around because your method signature becomes ugly, you can exploit the fact that you're building a graph, and the graph itself contains the information about the declared placeholders.
You can do something like:
# define your placeholder
a = tf.placeholder(tf.float32, name="asd")

# then, when you need it, fetch it from the graph
graph = tf.get_default_graph()
placeholder = graph.get_tensor_by_name("asd:0")
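Applied to the function from the question, a minimal sketch (it assumes the placeholders were created with name='features' and name='labels'):
def function(sess, cost, X_train, y_train):
    graph = tf.get_default_graph()
    features = graph.get_tensor_by_name('features:0')
    labels = graph.get_tensor_by_name('labels:0')
    feed = {features: X_train, labels: y_train}
    value = sess.run(cost, feed_dict=feed)
    print('Cost: %s' % value)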
Aside from the fact that you should not need this if you are working in the same script, you can do that by getting the tensor by name, as in Tensorflow: How to get a tensor by name?
For instance
p = tf.placeholder(tf.float32)
p2 = tf.get_default_graph().get_tensor_by_name(p.name)
assert p == p2
I'm trying to get into tensorflow, setting up a network and then feeding data to it. For some reason I end up with the error message ValueError: setting an array element with a sequence. I made a minimal example of what I'm trying to do:
import tensorflow as tf

K = 10

lchild = tf.placeholder(tf.float32, shape=(K))
rchild = tf.placeholder(tf.float32, shape=(K))
parent = tf.nn.tanh(tf.add(lchild, rchild))

input = [ tf.Variable(tf.random_normal([K])),
          tf.Variable(tf.random_normal([K])) ]

with tf.Session() as sess:
    print(sess.run([parent], feed_dict={ lchild: input[0], rchild: input[1] }))
Basically, I'm setting up a network with placeholders and a sequence of input embeddings that I want to learn, and then I try to run the network, feeding the input embeddings into it. From what I can tell by searching for the error message, there might be something wrong with my feed_dict, but I can't see any obvious mismatches in, e.g., dimensionality.
So, what did I miss, or how did I get this completely backwards?
EDIT: I've edited the above to clarify that the input represents embeddings that need to be learned. I guess the question can be asked more sharply as: Is it possible to use placeholders for parameters?
The inputs should be numpy arrays.
So, instead of tf.Variable(tf.random_normal([K])), simply write np.random.randn(K) and everything should work as expected.
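For instance, a minimal corrected version of the snippet from the question (assuming numpy is imported as np):
import numpy as np

inputs = [np.random.randn(K), np.random.randn(K)]

with tf.Session() as sess:
    print(sess.run([parent], feed_dict={lchild: inputs[0], rchild: inputs[1]}))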
EDIT (The question was clarified after my answer):
It is possible to use placeholders as parameters but in a slightly different way. For example:
lchild = tf.placeholder(tf.float32, shape=(K))
rchild = tf.placeholder(tf.float32, shape=(K))
parent = tf.nn.tanh(tf.add(lchild, rchild))
loss = <some loss that depends on the parent tensor or lchild/rchild>

# Compute gradients with respect to the input variables
grads = tf.gradients(loss, [lchild, rchild])

inputs = [np.random.randn(K), np.random.randn(K)]
for i in range(<number of iterations>):
    np_grads = sess.run(grads, feed_dict={lchild: inputs[0], rchild: inputs[1]})
    inputs[0] -= 0.1 * np_grads[0]
    inputs[1] -= 0.1 * np_grads[1]
This is not, however, the best or easiest way to do it. The main problem is that at every iteration you need to copy numpy arrays in and out of the session (which is potentially running on a different device, like a GPU).
Placeholders are generally used to feed data external to the model (like texts or images). The way to solve it using TensorFlow utilities would be something like:
lchild = tf.Variable(tf.random_normal([K]))
rchild = tf.Variable(tf.random_normal([K]))
parent = tf.nn.tanh(tf.add(lchild, rchild))
loss = <some loss that depends on the parent tensor or lchild/rchild>
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

for i in range(<number of iterations>):
    sess.run(train_op)

# Retrieve the weights back to numpy:
np_lchild = sess.run(lchild)