Description
I wrap a TensorFlow model with a loss function in a model_fn() for a tf.estimator.Estimator instantiation.
Various optimizers (e.g. tf.train.MomentumOptimizer, or tf.train.AdagradOptimizer) create Read(_x) operations for all trainable variables in the model (gamma, beta of batch normalization, kernels of convolutions, ...) when optimizer.apply_gradients() is called.
Specifically, they are named as Read_<num>/ReadVariableOp, and are of type tf.Operation.
The problem is that these operations are created outside of any tf.variable_scope or tf.name_scope, at the root of the tf.Graph.
This totally messes up readability in TensorBoard:
The red frame is the actual model. The blue frame encapsulates a tiny fraction of all 300+ read functions.
Question
Is there a way to wrap all Read_<num> operations in something like a tf.name_scope()? Or is there another way to programmatically (i.e. not by clicking with the mouse) remove them from TensorBoard?
What have I tried
Wrap the call to apply_gradients() like this:
with tf.name_scope('apply_gradients_to_ns'):
    with tf.variable_scope('apply_gradients_to_vs'):
        minimize_op = optimizer.apply_gradients(
            grads_and_vars=grads_and_vars,
            global_step=tf.train.get_or_create_global_step(),
            name='apply_gradients_to_name'
        )
This had no effect: the scope of the Read_<num> operations is still not influenced.
Trace the creation of these operations:
In tensorflow.python.training.slot_creator.py, line 179, create_zeros_slot(..., colocate_with_primary=True) triggers a call to get_variable(..., use_resource=None) in tensorflow.python.ops.variable_scope.py, line 1298, with use_resource=True.
However, I don't want to mess around in the source code of TensorFlow.
Also, I conclude that this behavior is intended and that I am just using it incorrectly.
How should it be used?
Try different distribution strategies: OneDeviceStrategy, MirroredStrategy.
Both produce a similar effect.
The only difference is that the MirroredStrategy creates group_deps_<num> variables.
Code to reproduce this effect, a more detailed description, and two more screenshots can be found in this repository: https://github.com/patzm/tf-estimator-distribute-so
I'm trying to use tfd.TransformedDistribution to apply a chain of bijectors to modify a bivariate Gaussian distribution, and I'm getting the error noted above ("AttributeError: Tensor.name is meaningless when eager execution is enabled."). I'm using TensorFlow 2.0 (Python) and TensorFlow Probability 0.9.0 in a Jupyter Notebook hosted in a Chrome browser, version 94.0.4606.61. The call that appears to provoke the error is this:
x_dist = tfd.TransformedDistribution(z, chain_of_bijectors)
Some of the chained bijectors have been subclassed using naming conventions similar to what is shown below, but the error happens even when I use a single bijector (i.e., even one derived directly from TensorFlow's library of bijectors). The bijectors appear to work normally (with no errors) when used in a scrutinized sequence that resembles the chain.
Example code snippet of a typical subclassed bijector:
class MyBijector(tfb.Bijector):
    def __init__(self, validate_args=False, name='my_bijector'):
        super(MyBijector, self).__init__(
            validate_args=validate_args,
            forward_min_event_ndims=0,
            name=name
        )
To resolve the error, I have tried different variations of the subclass names (for the two __init__ calls), and removing the names altogether. (The fact that the same error occurs even when a single, non-subclassed bijector is used in the function call seems to suggest the issue is not really with the names of the bijectors.) I also tried disabling eager execution (which seems unnecessary). When eager execution was disabled, the code ran normally until the same call, and then it produced a different error related to the chain of bijectors: "ValueError: 'chain_of_[...string of mostly bijector names omitted here...]/forward/add:0' is not a valid scope name".
Can anyone explain the cause of the AttributeError and how to fix it? If eager execution must be disabled to run this code, how can I fix the ValueError? Thanks!
Never mind. I figured out the problem: in the function call listed above ("x_dist = tfd.TransformedDistribution(z, chain_of_bijectors)"), z was a sample from an underlying distribution rather than the distribution itself, which caused the error. The error went away once I passed z as an actual distribution object, rather than as a sample from such an object.
I'm trying to use TensorFlow from an IPython notebook. I've created a function that defines a placeholder and a variable. Since I'm a TensorFlow newbie, I did not initialize the variable properly and got an error saying I did not initialize a placeholder.
I have two cells, one with the function and one with a function call. No matter how much I fix the function (and rerun both cells, of course), I keep getting initialization errors even after I fix the bug.
The only way to get it to work is to restart the kernel, which pretty much defeats the purpose of a notebook; I could just write a Python script.
This is mostly speculation without seeing your code, but from what I read I think I know what is going wrong.
When using TensorFlow inside a notebook, you have to be especially careful not to confuse graph-building code with evaluation code. You should define the computational graph only once, at the beginning. Executing functions that define the graph again just builds another subgraph (this probably also applies to your function that defines the placeholder and variables). The tf.global_variables_initializer operation should also be executed only once.
It is crucial to understand that the notebook cannot handle the TensorFlow graph dynamically, because Python does not actually control TensorFlow variables. Python in this case is just a meta-language for defining the graph and initiating computations.
So in the notebook, after building the graph exactly once, you can only call functions that wrap TensorFlow graph evaluation code; you cannot re-run graph-building code without resetting the kernel. Examples of such methods, which only evaluate an existing graph, are session.run, other tf.Session methods, or similar evaluation methods like tensor.eval.
So, to make it clear: there is no way to change an already built graph without rebuilding it, which in this case requires resetting the kernel, unless you just build new subgraphs over and over again (and initialize the new variables), but that will at some point use up all available memory.
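To illustrate the point about re-running graph-building code, here is a minimal sketch. It assumes the TF1-style graph API reached through tf.compat.v1; build_model, x, and w are hypothetical names, not the asker's code. It only shows that a second call adds a second copy of the ops to the same graph instead of replacing the first.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API (assumption: TF 1.13+ or TF 2.x)


def build_model():
    """Hypothetical graph-building function: each call adds new ops."""
    x = tf1.placeholder(tf.float32, shape=[None], name="x")
    w = tf1.Variable(2.0, name="w")
    return x, x * w


graph = tf.Graph()
with graph.as_default():
    # First execution of the "notebook cell".
    x_first, _ = build_model()
    ops_after_first = len(graph.get_operations())

    # Re-running the cell does not replace anything; it appends a new subgraph.
    x_second, _ = build_model()
    ops_after_second = len(graph.get_operations())

# The second placeholder gets a uniquified name, and the op count grows.
print(x_first.name, x_second.name)
print(ops_after_first, ops_after_second)
```

A session that feeds only the newest placeholder leaves the older copies in the graph, which is why stale initialization errors can persist after the function has been fixed.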
I am trying to register a python function and its gradient as a tensorflow operation.
I found many useful examples e.g.:
Write Custom Python-Based Gradient Function for an Operation? (without C++ Implementation)
https://programtalk.com/python-examples/tensorflow.python.framework.function.Defun/
Nonetheless I would like to register attributes in the operation and use these attributes in the gradient definition by calling op.get_attr('attr_name').
Is this possible without going down to C implementation?
Could you give me an example?
Unfortunately I don't believe it is possible to add attributes without using a C++ implementation of the operation. One feature that may help though is that you can define 'private' attributes by prepending an underscore to the start. I'm not sure if this is well documented or what the long-term guarantees are, but you can try setting '_my_attr_name' and you should be able to retrieve it later.
I am currently reading the source code of the slim library, which is based on TensorFlow, and it uses the values argument of the variable_scope method a lot, like here.
From the API page I can see:
This context manager validates that the (optional) values are from the same graph, ensures that graph is the default graph, and pushes a name scope and a variable scope.
My question is: are the variables passed in values only checked for being from the same graph? What are the use cases for this, and why would someone need it?
The variable_scope parameter helps ensure uniqueness of variables and reuse of variables where desired.
Yes, if you create two or more different computation graphs, they won't necessarily share the same variable scope; however, there are ways to share them across graphs, so the option is there.
The primary use case for variable scope is RNNs, where many of the weights are tied and reused. That's one reason someone would need it. The other main reason it's there is to ensure that you reuse the same variables when you explicitly mean to, and not by accident. (In distributed settings this can become a concern.)
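The weight-tying use case can be sketched as follows. This is a minimal illustration under assumptions: it uses the TF1-style API through tf.compat.v1, and the scope name "cell", the variable name "w", and the cell function are hypothetical. It does not exercise the values argument from the question; it only shows how a variable scope with reuse makes two calls share one variable.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API (assumption: TF 1.13+ or TF 2.x)


def cell(inputs):
    """Hypothetical recurrent step that reuses one weight matrix."""
    # AUTO_REUSE creates "cell/w" on the first call and returns it afterwards.
    with tf1.variable_scope("cell", reuse=tf1.AUTO_REUSE):
        w = tf1.get_variable(
            "w", shape=[3, 3], initializer=tf1.zeros_initializer()
        )
    return tf.matmul(inputs, w), w


graph = tf.Graph()
with graph.as_default():
    x = tf1.placeholder(tf.float32, shape=[None, 3])
    h1, w1 = cell(x)   # first step creates the variable
    h2, w2 = cell(h1)  # second step reuses it (tied weights)
    n_trainable = len(tf1.trainable_variables())

print(w1.name, w2.name, n_trainable)
```

Without reuse, the second get_variable call with the same name would raise an error rather than silently creating a second set of weights, which is the "not by accident" guarantee mentioned above.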
I want to run parameter studies on different Modelica building libraries (Buildings, IDEAS) with Python, for example changing the infiltration rate.
I tried: simulateModel and simulateExtendedModel(..."zone.n50", [value])
My questions: Why is it not possible to translate the model and then change the parameter? I get: Warning: Setting zone.n50 has no effect in model. After translation you can only set literal start-values and non-evaluated parameters.
It is also not possible to run simulateExtendedModel. When I go to the command line in Dymola and enter zone.n50, I get the value that I defined in Python, but in the result file (and the plotted variable) it is always the default n50 value. So my question is: how can I change values before running (and translating?) the simulation?
The value for the parameter is also not visible in the variable browser.
Kind regards
It might be a structural parameter; these are evaluated as well. It should work if you explicitly set Evaluate=false for the parameter that you want to study.
Is it not visible in the variable browser or is it just greyed out and constant? If it is not visible at all you should check if it is protected.
Some parameters cannot be changed after compilation, even with Evaluate=false. This is the case for parameters that influence the structure of the model, for example parameters that influence a discretization scheme and therefore influence the number of equations.
Changing such parameters requires recompiling the model. You can still do this in a parametric study, though; I think you can use ModelicaRes to achieve this (http://kdavies4.github.io/ModelicaRes/modelicares.exps.html)
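One way to recompile for each value is to put the parameter into the model name as a Modelica modifier, so that every run is translated with its own value. The sketch below only builds those strings in plain Python. The model name MyLib.Examples.OfficeZone is hypothetical, and passing such a string as the model argument of Dymola's simulateModel is an assumption about your setup, so check it against your Dymola version.

```python
def format_value(value):
    """Render a Python value as a Modelica literal (numbers and booleans only)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return repr(value)


def with_modifiers(model_name, modifiers):
    """Build e.g. 'Pkg.Model(zone.n50=3.0)' from a name and a dict of modifiers."""
    if not modifiers:
        return model_name
    parts = ", ".join(
        f"{name}={format_value(value)}" for name, value in modifiers.items()
    )
    return f"{model_name}({parts})"


# Hypothetical model name; zone.n50 is the parameter from the question.
model = "MyLib.Examples.OfficeZone"
n50_values = [1.5, 3.0, 6.0]
problems = [with_modifiers(model, {"zone.n50": n50}) for n50 in n50_values]

for problem in problems:
    # Assumption: each string would be handed to the simulator here, e.g.
    # dymola.simulateModel(problem, stopTime=..., resultFile=...)
    print(problem)
```

Because the modifier is applied before translation, this approach should also cover structural or evaluated parameters, at the cost of one translation per run.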