Following the TensorFlow documentation I read here, I have tried to call minimize on the Adam optimizer.
optimizer = tf.compat.v1.train.AdamOptimizer
print("Using AdamOptimizer...")
train_step = optimizer.minimize(loss, global_step = global_step,var_list = [process_image])
But this code raises an error about the 'loss' argument, even though I have passed it. I think it may be due to using TensorFlow 2?
Do you have a loss tensor called loss?
In that case, you could try writing simply:
optimizer.minimize(loss=loss, ...)
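One more thing worth checking, assuming the error is a TypeError saying that 'loss' is missing: in the question's snippet the optimizer is assigned as the class tf.compat.v1.train.AdamOptimizer without parentheses, so minimize is called on the class rather than on an instance. The pure-Python sketch below uses a hypothetical ToyOptimizer stand-in (not TensorFlow) to show how that shifts the arguments, so that the value passed as the loss is consumed as `self`.

```python
class ToyOptimizer:
    """Hypothetical stand-in for an optimizer class; this is not TensorFlow."""

    def __init__(self, learning_rate=0.001):
        self.learning_rate = learning_rate

    def minimize(self, loss, var_list=None):
        return "minimizing {} over {}".format(loss, var_list)


def call_minimize(optimizer):
    """Call minimize the way the question does: loss passed positionally."""
    try:
        return optimizer.minimize("my_loss", var_list=["w"])
    except TypeError as err:
        return "TypeError: {}".format(err)


# Class assigned without parentheses: "my_loss" is bound to `self`,
# so Python reports that `loss` was never supplied.
result_class = call_minimize(ToyOptimizer)

# Instance created with parentheses: the same call works.
result_instance = call_minimize(ToyOptimizer())

print(result_class)
print(result_instance)
```

If this is what is happening, creating an instance first, for example tf.compat.v1.train.AdamOptimizer(), and calling minimize on that instance would be the thing to try.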
According to keras.io:
Once the model is created, you can config the model with losses and
metrics with model.compile().
But this explanation does not say enough about what exactly compiling a model does.
From the documentation: "Configures the model for training."
Personally, I wouldn't call it compile, because what it does has nothing to do with compilation in the computer science sense, and having to think about machine learning and compilation at the same time is confusing and overwhelming.
It's just a method that does configuration:
It just stores the arguments you pass it: optimizer, loss function, metrics, eager execution. You can run it multiple times; each call simply overwrites the settings from the previous one.
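To make the "it only stores settings" point concrete, here is a minimal sketch with a hypothetical ToyModel class. It is not the Keras implementation and all names in it are made up for illustration; it only mimics the overwrite-on-each-call behaviour described above.

```python
class ToyModel:
    """Hypothetical stand-in for a Keras-style model; not the real API."""

    def __init__(self):
        self.optimizer = None
        self.loss = None
        self.metrics = []
        self.is_compiled = False

    def compile(self, optimizer, loss, metrics=None):
        # Nothing is compiled here: the call only records the settings.
        self.optimizer = optimizer
        self.loss = loss
        self.metrics = list(metrics or [])
        self.is_compiled = True


model = ToyModel()
model.compile(optimizer="rmsprop", loss="mse", metrics=["acc"])
first_settings = (model.optimizer, model.loss, model.metrics)

# Calling compile again replaces every setting from the first call.
model.compile(optimizer="adam", loss="categorical_crossentropy")
second_settings = (model.optimizer, model.loss, model.metrics)

print(first_settings)
print(second_settings)
```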
My suggestion to the developers of TensorFlow would be to rename it to configure in the short term, and perhaps in the future (not that important) to move to one setter per configuration argument (or to use the factory/builder pattern).
Here's the code for it:
    base_layer.keras_api_gauge.get_cell('compile').set(True)
    with self.distribute_strategy.scope():
      if 'experimental_steps_per_execution' in kwargs:
        logging.warn('The argument `steps_per_execution` is no longer '
                     'experimental. Pass `steps_per_execution` instead of '
                     '`experimental_steps_per_execution`.')
        if not steps_per_execution:
          steps_per_execution = kwargs.pop('experimental_steps_per_execution')
      self._validate_compile(optimizer, metrics, **kwargs)
      self._run_eagerly = run_eagerly
      self.optimizer = self._get_optimizer(optimizer)
      self.compiled_loss = compile_utils.LossesContainer(
          loss, loss_weights, output_names=self.output_names)
      self.compiled_metrics = compile_utils.MetricsContainer(
          metrics, weighted_metrics, output_names=self.output_names)
      self._configure_steps_per_execution(steps_per_execution or 1)
      # Initializes attrs that are reset each time `compile` is called.
      self._reset_compile_cache()
      self._is_compiled = True
      self.loss = loss or {}  # Backwards compat.
model.compile configures how your model will be trained. Your weights need to be optimized, and compile specifies how: the 'optimizer' argument selects the algorithm that later updates the weights during training so that the loss decreases and, ideally, the accuracy increases. The optimizer is just one of the input parameters.
model.compile(
    optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics='acc'
)
These are the main inputs. You can find more details in the TensorFlow documentation at the link below:
https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile
I have been working with a sequential Keras model in TensorFlow and have come across an odd behavior: if I compile the model with the loss given as a function, the result is different than if the loss is given as a string.
The network (convolutional in nature) is defined as such:
model = tf.keras.Sequential()
add_model_layers(model)  # Can provide if needed
adam_opt = tf.train.AdamOptimizer(learning_rate=0.001,
                                  beta1=0.9,
                                  beta2=0.999)
The network is then compiled:
loss_param1 = tf.keras.losses.categorical_crossentropy
loss_param2 = "categorical_crossentropy"
model.compile(optimizer=adam_opt,
              loss=loss_param,  # append 1 or 2 to pick loss_param1 or loss_param2
              metrics=[tf.keras.metrics.categorical_accuracy])
After which it is trained:
# Records are tf.data.Dataset based on TFRecord files
model.fit(train_records,
          epochs=400,
          use_multiprocessing=False,
          validation_data=validation_records)
# And for completeness, tested
model.evaluate(test_records)
If the loss parameter of model.compile is loss_param1, it begins training with a high loss value (in my case, after an epoch or 5 around 192). On the other hand, if loss_param2 is used, training begins at a much lower loss (around 41).
Does anyone know why this would be occurring?
(As an additional note, I am also running into a similar issue where I get a different result if the metric is given as a string. However, if use_multiprocessing is True in model.fit, the effect is reversed.)
Here is an example in PyTorch:
optimizer = optim.Adam([modifier_var], lr=0.0005)
And here in TensorFlow:
self.train = self.optimizer.minimize(self.loss, var_list=[self.modifier])
But Chainer's optimizers can only be used on a 'Link'. How can I apply an Optimizer to a Variable in Chainer?
In short, there is no way to directly assign a chainer.Variable (or even a chainer.Parameter) to a chainer.Optimizer.
What follows is a more detailed explanation.
First, I re-define Variable and Parameter to avoid confusion.
Variable is (1) torch.Tensor in PyTorch v0.4, (2) torch.autograd.Variable in PyTorch v0.3, and (3) chainer.Variable in Chainer v4.
A Variable is an object that holds two tensors: .data and .grad. That is the only requirement, so a Variable is not necessarily a learnable parameter, i.e. a target of the optimizer.
In both libraries there is another class, Parameter, which is similar to but not the same as Variable. Parameter is torch.nn.Parameter in PyTorch and chainer.Parameter in Chainer.
Parameter must be a learnable parameter and should be optimized.
Therefore, there should be no reason to register a Variable (as opposed to a Parameter) with an Optimizer (although PyTorch allows registering a Variable with an Optimizer, this is just for backward compatibility).
Second, in PyTorch torch.optim.Optimizer directly optimizes Parameter, but in Chainer chainer.Optimizer DOES NOT optimize Parameter: instead, chainer.UpdateRule does. The Optimizer just registers UpdateRules to the Parameters in a Link.
Therefore, it is only natural that chainer.Optimizer does not receive Parameter as an argument, because it is just a "delivery-man" for UpdateRule.
If you want to attach a different UpdateRule to each Parameter, you should directly create an instance of an UpdateRule subclass and attach it to the Parameter.
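The "delivery-man" design can be sketched in plain Python. This is a toy model under my own assumptions, not Chainer's actual classes: ToyParameter, ToySGDRule, ToyLink and ToyOptimizer are hypothetical names, and the rule is plain gradient descent. It only illustrates that the optimizer hands a rule to each parameter and that the rule, not the optimizer, performs the update.

```python
class ToySGDRule:
    """Hypothetical update rule: plain gradient descent with its own rate."""

    def __init__(self, lr):
        self.lr = lr

    def update(self, param):
        param.data -= self.lr * param.grad


class ToyParameter:
    """Hypothetical learnable value holding data, a gradient and a rule."""

    def __init__(self, data):
        self.data = data
        self.grad = 0.0
        self.update_rule = None

    def update(self):
        # The parameter delegates the actual update to its attached rule.
        if self.update_rule is not None:
            self.update_rule.update(self)


class ToyLink:
    """Hypothetical container of named parameters."""

    def __init__(self, **params):
        self._params = params

    def params(self):
        return list(self._params.values())


class ToyOptimizer:
    """Only delivers rules to parameters; the rules do the updating."""

    def __init__(self, lr):
        self.lr = lr
        self.target = None

    def setup(self, link):
        self.target = link
        for param in link.params():
            param.update_rule = ToySGDRule(self.lr)

    def update(self):
        for param in self.target.params():
            param.update()


w = ToyParameter(1.0)
b = ToyParameter(1.0)

optimizer = ToyOptimizer(lr=0.1)
optimizer.setup(ToyLink(w=w, b=b))

# Attach a different rule directly to one parameter.
b.update_rule = ToySGDRule(lr=0.5)

w.grad = 2.0
b.grad = 2.0
optimizer.update()
print(w.data, b.data)
```

Here w is updated with the rule delivered by the optimizer, while b uses the rule attached to it by hand.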
Below is an example that learns a regression task with the MyChain MLP model, using the Adam optimizer in Chainer.
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Chain

# Prepare your model (neural network) as `Link` or `Chain`
class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__(
            l1=L.Linear(None, 30),
            l2=L.Linear(None, 30),
            l3=L.Linear(None, 1)
        )

    def __call__(self, x):
        h = self.l1(x)
        h = self.l2(F.sigmoid(h))
        return self.l3(F.sigmoid(h))

model = MyChain()

# Then you can instantiate optimizer
optimizer = chainer.optimizers.Adam()

# Register model to optimizer (to indicate which parameter to update)
optimizer.setup(model)

# Calculate loss, and update parameter as follows.
def lossfun(x, y):
    loss = F.mean_squared_error(model(x), y)
    return loss

# x and y are the training inputs and targets, assumed to be defined elsewhere.
# This iteration is "training", to fit the model to the desired function.
for i in range(300):
    optimizer.update(lossfun, x, y)
In summary, you need to set up the optimizer with the model; after that you can use the update function to calculate the loss and update the model's parameters.
The above code comes from here
There is also another way to write training code, using the Trainer module. For more detailed Chainer tutorials, please refer to the links below:
chainer-handson
deep-learning-tutorial-with-chainer
I'm using TensorFlow to do a gradient descent classification.
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
Here, cost is the cost function that I have used in the optimization.
After launching the Graph in the Session, the Graph can be fed as:
sess.run(train_op, feed_dict)
With this, all the variables in the cost function are updated in order to minimize the cost.
Here is my question: how can I update only some of the variables in the cost function when training? Is there a way to convert created variables into constants or something similar?
There are several good answers elsewhere; this subject should already be closed:
stackoverflow
Quora
Just to save another click for people landing here:
The minimize function of the tensorflow optimizer takes a var_list argument for that purpose:
first_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                     "scope/prefix/for/first/vars")
first_train_op = optimizer.minimize(cost, var_list=first_train_vars)
second_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                      "scope/prefix/for/second/vars")
second_train_op = optimizer.minimize(cost, var_list=second_train_vars)
I took it as is from mrry
To get the list of names you should use in place of "scope/prefix/for/second/vars", you can use:
tf.get_default_graph().get_collection_ref(tf.GraphKeys.TRAINABLE_VARIABLES)
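The effect of var_list can be illustrated without TensorFlow. The sketch below is a toy gradient descent in plain Python, with a made-up cost over two variables and a hypothetical minimize_step helper; it only shows the idea that variables left out of var_list are treated as constants during the update.

```python
def cost(values):
    """Toy cost with its minimum at a=3, b=-1."""
    return (values["a"] - 3.0) ** 2 + (values["b"] + 1.0) ** 2


def gradients(values):
    """Analytic gradient of the toy cost with respect to each variable."""
    return {"a": 2.0 * (values["a"] - 3.0), "b": 2.0 * (values["b"] + 1.0)}


def minimize_step(values, var_list, learning_rate=0.1):
    """One gradient descent step that only touches the names in var_list."""
    grads = gradients(values)
    for name in var_list:
        values[name] -= learning_rate * grads[name]


values = {"a": 0.0, "b": 0.0}
for _ in range(200):
    minimize_step(values, var_list=["a"])  # b is never updated

print(values, cost(values))
```

Only a moves toward its optimum; b keeps its initial value, so the cost cannot drop below the contribution of b.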
I am experimenting with some simple models in tensorflow, including one that looks very similar to the first MNIST for ML Beginners example, but with a somewhat larger dimensionality. I am able to use the gradient descent optimizer with no problems, getting good enough convergence. When I try to use the ADAM optimizer, I get errors like this:
tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value Variable_21/Adam
[[Node: Adam_2/update_Variable_21/ApplyAdam = ApplyAdam[T=DT_FLOAT, use_locking=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Variable_21, Variable_21/Adam, Variable_21/Adam_1, beta1_power_2, beta2_power_2, Adam_2/learning_rate, Adam_2/beta1, Adam_2/beta2, Adam_2/epsilon, gradients_11/add_10_grad/tuple/control_dependency_1)]]
where the specific variable that complains about being uninitialized changes depending on the run. What does this error mean? And what does it suggest is wrong? It seems to occur regardless of the learning rate I use.
The AdamOptimizer class creates additional variables, called "slots", to hold values for the "m" and "v" accumulators.
See the source here if you're curious, it's actually quite readable:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/adam.py#L39
Other optimizers, such as Momentum and Adagrad, use slots too.
These variables must be initialized before you can train a model.
The normal way to initialize variables is to call tf.initialize_all_variables() (renamed tf.global_variables_initializer() in later TensorFlow 1.x releases), which adds ops to initialize the variables present in the graph when it is called.
(Aside: despite its name, initialize_all_variables() does not initialize anything; it only adds ops that will initialize the variables when run.)
What you must do is call initialize_all_variables() after you have added the optimizer:
# ...build your model...
# Add the optimizer
train_op = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Add the ops to initialize variables. These will include
# the optimizer slots added by AdamOptimizer().
init_op = tf.initialize_all_variables()
# Launch the graph in a session
sess = tf.Session()
# Actually initialize the variables
sess.run(init_op)
# Now train your model
for ...:
    sess.run(train_op)
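To see why Adam needs extra state that plain gradient descent does not, here is a minimal pure-Python sketch of the Adam update for scalar variables. It is not TensorFlow's implementation: ToyAdam, create_slots and apply_gradients are hypothetical names, and the "m" and "v" dictionaries merely play the role of the slot variables described above. Using the optimizer before its slots exist fails, which loosely mirrors the uninitialized-value error.

```python
import math


class ToyAdam:
    """Minimal Adam for scalar variables; a sketch, not TensorFlow's code."""

    def __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
        self.learning_rate = learning_rate
        self.beta1 = beta1
        self.beta2 = beta2
        self.epsilon = epsilon
        self.slots = {}  # per-variable "m" and "v" accumulators
        self.step = 0

    def create_slots(self, names):
        # Analogue of initializing the extra variables the optimizer adds.
        for name in names:
            self.slots[name] = {"m": 0.0, "v": 0.0}

    def apply_gradients(self, values, grads):
        step = self.step + 1
        for name, grad in grads.items():
            slot = self.slots[name]  # KeyError if the slots were never created
            slot["m"] = self.beta1 * slot["m"] + (1 - self.beta1) * grad
            slot["v"] = self.beta2 * slot["v"] + (1 - self.beta2) * grad * grad
            m_hat = slot["m"] / (1 - self.beta1 ** step)  # bias correction
            v_hat = slot["v"] / (1 - self.beta2 ** step)
            values[name] -= self.learning_rate * m_hat / (math.sqrt(v_hat) + self.epsilon)
        self.step = step


values = {"w": 5.0}
optimizer = ToyAdam(learning_rate=0.1)

# Training before the slots exist fails.
try:
    optimizer.apply_gradients(values, {"w": 2.0 * values["w"]})
    failed_without_slots = False
except KeyError:
    failed_without_slots = True

# After creating the slots, minimizing w**2 works.
optimizer.create_slots(values)
for _ in range(500):
    optimizer.apply_gradients(values, {"w": 2.0 * values["w"]})

print(failed_without_slots, values["w"])
```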
FailedPreconditionError: Attempting to use uninitialized value is one of the most frequent errors related to TensorFlow. From the official documentation for FailedPreconditionError:
This exception is most commonly raised when running an operation that
reads a tf.Variable before it has been initialized.
In your case the error even says which variable was not initialized: Attempting to use uninitialized value Variable_1. One of the TF tutorials explains a lot about variables, including their creation, initialization, saving, and loading.
Basically, to initialize variables you have 3 options:
initialize all global variables with tf.global_variables_initializer()
initialize the variables you care about with tf.variables_initializer(list_of_vars). Notice that you can use this function to mimic global_variables_initializer: tf.variables_initializer(tf.global_variables())
initialize only one variable with var_name.initializer
I almost always use the first approach. Remember that the initializer must be run inside a session, so you will get something like this:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
If you are curious and want more information about variables, read this documentation to learn how to use report_uninitialized_variables and check is_variable_initialized.
You need to run tf.global_variables_initializer() in your session, like this:
init = tf.global_variables_initializer()
sess.run(init)
A full example is available in this great tutorial:
https://www.tensorflow.org/get_started/mnist/mechanics
Run init after creating the AdamOptimizer, and do not define or run init before it:
sess.run(tf.initialize_all_variables())
or
sess.run(tf.global_variables_initializer())
I was having a similar problem. (No problems training with the GradientDescent optimizer, but an error was raised when switching to the Adam optimizer, or any other optimizer with its own variables.)
Changing to an interactive session solved this problem for me.
sess = tf.Session()
into
sess = tf.InteractiveSession()