I wrote a simple script to calculate the golden ratio from 1, 2, 5. Is there a way to produce a visual of the actual graph structure through tensorflow (possibly with the aid of matplotlib or networkx)? The computation graph in tensorflow is pretty similar to a factor graph, so I was wondering:
How can an image of the graph structure be generated through tensorflow?
In the example below, C_1, C_2, and C_3 would be individual nodes; C_1 would feed into the tf.sqrt operation, followed by the operation that brings them together. Maybe the graph structure (nodes, edges) can be imported into networkx? I see that the tensor objects have a graph attribute, but I haven't found out how to actually use it for imaging purposes.
#!/usr/bin/python
import tensorflow as tf
C_1 = tf.constant(5.0)
C_2 = tf.constant(1.0)
C_3 = tf.constant(2.0)
golden_ratio = (tf.sqrt(C_1) + C_2)/C_3
sess = tf.Session()
print sess.run(golden_ratio) #1.61803
sess.close()
This is exactly what tensorboard was created for. You need to slightly modify your code to store the information about your graph.
import tensorflow as tf
C_1 = tf.constant(5.0)
C_2 = tf.constant(1.0)
C_3 = tf.constant(2.0)
golden_ratio = (tf.sqrt(C_1) + C_2)/C_3
with tf.Session() as sess:
    writer = tf.summary.FileWriter('logs', sess.graph)
    print sess.run(golden_ratio)
writer.close()
This will create a logs folder with event files in your working directory. After this you should run tensorboard from your command line with tensorboard --logdir="logs" and navigate to the url it gives you (http://127.0.0.1:6006). In your browser, go to the GRAPHS tab and enjoy your graph.
You will use TB a lot if you are going to do anything with TF, so it makes sense to learn more about it from the official tutorials and from this video.
You can get an image of the graph using Tensorboard. You need to edit your code to output the graph, and then you can launch tensorboard and see it. See, in particular, TensorBoard: Graph Visualization. You create a SummaryWriter and pass it the sess.graph_def. The graph def will be written to the log directory.
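As for the networkx idea from the question: the graph structure is also accessible programmatically, so you can walk the operations and build a networkx DiGraph yourself. Below is a minimal sketch, assuming TF 1.x with networkx and matplotlib installed; the helper name plot_tf_graph is just illustrative:
import networkx as nx
import matplotlib.pyplot as plt

def plot_tf_graph(tf_graph):
    # one networkx node per TF operation, one edge per tensor feeding an operation
    nx_graph = nx.DiGraph()
    for op in tf_graph.get_operations():
        nx_graph.add_node(op.name)
        for inp in op.inputs:
            nx_graph.add_edge(inp.op.name, op.name)
    nx.draw(nx_graph, with_labels=True, node_size=1500, font_size=8)
    plt.show()

# e.g. for the golden ratio example above:
# plot_tf_graph(golden_ratio.graph)
This only gives a rough structural picture; TensorBoard remains the richer option for large graphs.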
Related
I'm quite familiar with TensorFlow 1.x and I'm considering switching to TensorFlow 2 for an upcoming project. I'm having some trouble understanding how to write scalars to TensorBoard logs with eager execution, using a custom training loop.
Problem description
In tf1 you would create some summary ops (one op for each thing you would want to store), which you would then merge into a single op, run that merged op inside a session, and then write the result to a file using a FileWriter object. Assuming sess is our tf.Session(), an example of how this worked can be seen below:
# While defining our computation graph, define summary ops:
# ... some ops ...
tf.summary.scalar('scalar_1', scalar_1)
# ... some more ops ...
tf.summary.scalar('scalar_2', scalar_2)
# ... etc.
# Merge all these summaries into a single op:
merged = tf.summary.merge_all()
# Define a FileWriter (i.e. an object that writes summaries to files):
writer = tf.summary.FileWriter(log_dir, sess.graph)
# Inside the training loop run the op and write the results to a file:
for i in range(num_iters):
    summary, ... = sess.run([merged, ...], ...)
    writer.add_summary(summary, i)
The problem is that sessions don't exist anymore in tf2, and I would prefer not to disable eager execution to make this work. The official documentation is written for tf1, and all references I can find suggest using the Tensorboard keras callback. However, as far as I know, this only works if you train the model through model.fit(...) and not through a custom training loop.
What I've tried
The tf1 version of the tf.summary functions, outside of a session. Obviously any combination of these fails, as FileWriters, merge ops, etc. don't even exist in tf2.
This medium post states that there has been a "cleanup" in some tensorflow APIs including tf.summary(). They suggest importing from tensorflow.python.ops.summary_ops_v2, which doesn't seem to work. This implies using record_summaries_every_n_global_steps; more on this later.
A series of other posts (1, 2, 3) suggest using tf.contrib.summary and tf.contrib.FileWriter. However, tf.contrib has been removed from the core TensorFlow repository and build process.
A TensorFlow v2 showcase from the official repo, which again uses the tf.contrib summaries along with the record_summaries_every_n_global_steps mentioned previously. I couldn't make this work either (even without using the contrib library).
tl;dr
My questions are:
Is there a way to properly use tf.summary in TensorFlow 2?
If not, is there another way to write TensorBoard logs in TensorFlow 2, when using a custom training loop (not model.fit())?
Yes, there is a simpler and more elegant way to use summaries in TensorFlow v2.
First, create a file writer that stores the logs (e.g. in a directory named log_dir):
writer = tf.summary.create_file_writer(log_dir)
Anywhere you want to write something to the log file (e.g. a scalar) use your good old tf.summary.scalar inside a context created by the writer. Suppose you want to store the value of scalar_1 for step i:
with writer.as_default():
    tf.summary.scalar('scalar_1', scalar_1, step=i)
You can open as many of these contexts as you like inside or outside of your training loop.
Example:
# create the file writer object
writer = tf.summary.create_file_writer(log_dir)
for i, (x, y) in enumerate(train_set):
    with tf.GradientTape() as tape:
        y_ = model(x)
        loss = loss_func(y, y_)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    # write the loss value
    with writer.as_default():
        tf.summary.scalar('training loss', loss, step=i+1)
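A small follow-up note: the file writer buffers summaries, so if you inspect the logs while (or right after) the loop runs, it can help to flush explicitly before launching TensorBoard on log_dir:
# summaries are buffered; flush them to disk before inspecting the logs
writer.flush()   # equivalently: tf.summary.flush(writer)
# then, from the command line:  tensorboard --logdir <log_dir>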
I'm after advice on how to debug what Tensorflow is struggling with when it hangs.
I have a multi-layer CNN which hangs when global_variables_initializer() is run in the session. I am getting no errors or messages in the console output.
Is there an intelligent way of debugging what Tensorflow is struggling with when it hangs, instead of repeatedly commenting out the lines of code that build the graph and re-running to see where it hangs? Would the TensorFlow debugger (tfdbg) help? What options do I have?
Ideally it would be great to just break the current execution and look at a stack trace or similar to see where execution is hanging during the init.
I'm currently running Tensorflow 0.12.1 with Python 3 inside a Jupyter notebook.
I managed to solve the problem. The tip from @amo-ej1 to run the code in a regular file was a step in the correct direction. This uncovered that the tensorflow process was killing itself off with a SIGKILL and returning an error code of 137.
I tried the Tensorflow Debugger (tfdbg), though this did not provide any further details, as the problem was that the graph did not initialize. I started to think the graph structure was incorrect, so I dumped out the graph structure using:
tf.summary.FileWriter('./logs/traing_graph', graph)
I then used Tensorboard to inspect the resultant graph structure data dumped out to that directory, and found that the tensor dimensions of the fully connected layer were wrong, having a width of 15 million!
It turned out that one of the configurable parameters of the graph was incorrect: it was picking up the dimension of the layer-2 tensor shape incorrectly, due to incorrectly addressing the previous layer's tf.shape property, and this exploded the dimensions of the graph.
There were no OOM error messages in /var/log/system.log, so I am unsure why the graph initialisation caused the python tensorflow script process to die.
I fixed the dimensions of the graph and graph initialization worked just fine!
My top tip is to visualise your graph with Tensorboard before initialisation and training, to quickly check that the resultant graph structure you coded is what you expected it to be. You will probably save yourself a lot of time! :-)
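To make that tip concrete, here is a minimal sketch of dumping the graph right after construction, before any initialisation or training (the directory name is just an example):
import tensorflow as tf

# ... build the graph (layers, loss, optimizer) ...

# write the graph definition out *before* running tf.global_variables_initializer(),
# so it can be inspected in TensorBoard's GRAPHS tab first
writer = tf.summary.FileWriter('./logs/graph_check', tf.get_default_graph())
writer.close()
# then: tensorboard --logdir ./logs/graph_check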
A common methodology for debugging tensorflow is to replace the placeholders and/or variables with numpy arrays wrapped in tf.constant. When you do so, you can actually examine the logic of your code by setting breakpoints and seeing numbers in a "pythonic" way, not just tensors. It will be much easier to help you if you post your code here, but here is a dummy example:
with tf.name_scope('scope_name'):
    ### This block is for debug only
    import numpy as np
    batch_size = 20
    sess = tf.Session()
    sess.run(tf.tables_initializer())
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    ### End of first debug block

    ## Replacing placeholders for debug - uncomment the placeholders and comment out the numpy arrays for production mode
    const_a = tf.constant((np.random.rand(batch_size, 26) > 0.85).astype(int), dtype=tf.float32)
    const_b = tf.constant(np.random.randint(0, 20, batch_size * 26).reshape((batch_size, 26)), dtype=tf.float32)
    # real_a_placeholder = tf.log(input_placeholder_dict[A_DATA])
    # real_b_placeholder = tf.log(input_placeholder_dict[B_DATA])

    # dummy operation
    c = const_a - const_b

    # selecting top k - in the sanity check you can see here that you actually get the top items and top values
    top_k = 5
    top_k_values, top_k_indices = tf.nn.top_k(c, k=top_k, sorted=True, name="top_k")

    ## Replacing variables for debug - uncomment the variables and comment out the numpy arrays for production mode
Now, run your code with breakpoints and you have 2 options to see the values in the debugger:
1. sess.run(placeholder_name)
2. you can use eval - variable_name.eval(session=sess)
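For example, with the dummy block above, inspecting the top-k result at a breakpoint could look like this:
# option 1: run the tensors through the session
values, indices = sess.run([top_k_values, top_k_indices])
# option 2: evaluate a single tensor directly
values = top_k_values.eval(session=sess)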
I'm doing the tutorial from the TensorFlow community Git repository at https://github.com/BinRoot/TensorFlow-Book/blob/master/ch02_basics/Concept08_TensorBoard.ipynb
When running tensorboard --logdir=path/to/logs in the command prompt I get: Starting TensorBoard b'47' at http://0.0.0.0:6006.
Then when I go to the browser and look at the board, it says "No scalar data was found". I am not sure what I am missing.
Copy of the code as I have it in my Python script:
import tensorflow as tf
import numpy as np
raw_data = np.random.normal(10, 1, 100)
alpha = tf.constant(0.05)
curr_value = tf.placeholder(tf.float32)
prev_avg = tf.Variable(0.)
update_avg = alpha * curr_value + (1 - alpha) * prev_avg
avg_hist = tf.summary.scalar("running_average", update_avg)
value_hist = tf.summary.scalar("incoming_values", curr_value)
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs")
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(len(raw_data)):
        summary_str, curr_avg = sess.run([merged, update_avg], feed_dict={curr_value: raw_data[i]})
        sess.run(tf.assign(prev_avg, curr_avg))
        print(raw_data[i], curr_avg)
        writer.add_summary(summary_str, i)
Tensorboard has a known issue with paths on Windows.
To summarize, tensorboard's --logdir can take a path, such as --logdir=/my/path, but the user can also specify a name for each of one or several comma-separated paths, such as --logdir=foo:/my/path1,bar:/my/path2.
The problem is that this naming system does not play nice with Windows drive names. When specifying --logdir=C:\my\path, how does tensorboard know that C: is a drive letter and not a name for the path that follows? Well, it doesn't, and you end up with a nice tensorboard webpage showing no summaries at all.
The solution is either to omit the drive letter and make sure you start from the correct drive, or somewhat more robustly, to always provide a path name, as in --logdir foo:"C:\My path\to my logs".
UPDATE
Since TF 1.5, tensorboard recognizes Windows drives and no longer treats them as labels.
Do not use an absolute path, like "--logdir=path/to/logs". Try a shorter path, like "--logdir=path"; it works for my code.
I cannot understand the Tensorflow system.
First, I wrote
#coding:UTF-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
const1 = tf.constant(2)
const2 = tf.constant(3)
add_op = tf.add(const1,const2)
with tf.Session() as sess:
    result = sess.run(add_op)
    print(result)
and it printed out 5.
Second, I wrote
#coding:UTF-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
const1 = tf.constant(2)
const2 = tf.constant(3)
add_op = tf.add(const1,const2)
print(add_op)
and it printed out Tensor("Add:0", shape=(), dtype=int32).
I cannot understand this system.
I use Python and other languages, so I thought the tf.add() method would simply perform the addition. However, in the case of Tensorflow, it seems different.
Why is this part
with tf.Session() as sess:
    result = sess.run(add_op)
    print(result)
necessary?
What function does this part serve?
I would suggest reading the official Getting Started with TensorFlow guide to get to know the core concepts of the library, such as the one which seems to be the problem here:
Every TensorFlow program consists of two parts:
Building the computational graph.
Running the computational graph.
Now, what is a "computational graph"? In TensorFlow, you specify a series of operations which are executed on your input. This series of operations is your "computational graph". To understand that, let's look at some examples:
Simple addition: let's look at your example, your code is
const1 = tf.constant(2)
const2 = tf.constant(3)
add_op = tf.add(const1,const2)
This creates two constant nodes in the graph, and another node which adds them. Graphically, the two constant nodes feed into the add node.
To make it a little bit more complex, let's say you have an input x and want to add a constant to it. Then your code would be:
const1 = tf.constant(2)
x = tf.placeholder(tf.float32)
add_op = tf.add(const1,x)
and your graph now has the placeholder x and const1 feeding into the add node.
In both examples, this was the first part of the program. So far, we only defined how our computational graph should look, i.e. what inputs we have, what outputs, and all calculations needed.
But: no calculations have been done so far! In the second example, you don't even know what your input x is - only that it will be a float32.
If you have a GPU, you'll notice that TensorFlow hasn't even touched the GPU yet. Even if you have a huge neural network with millions of training images, this step runs in milliseconds, as no "real" work has to be done.
Now comes part two: running the graph we defined above. Here's where the work happens!
We fire up TensorFlow by creating a tf.Session, and then we can run anything by calling sess.run().
with tf.Session() as sess:
    result = sess.run(add_op)
    print(result)
In the second example, we now have to tell TensorFlow what our value x should be:
with tf.Session() as sess:
    result = sess.run(add_op, {x: 5.0})
    print(result)
tl;dr: every TensorFlow program has two parts: 1. building a computational graph, and 2. running this graph. With tf.add you only define the graph, but no addition is performed yet. To run this graph, use sess.run() as in your first piece of code.
I launch tensorboard with tensorboard --logdir=/home/vagrant/notebook
At tensorboard:6006 > GRAPH, it says "No graph definition files were found."
To store a graph, create a tf.python.training.summary_io.SummaryWriter and pass the graph either via the constructor, or by calling its add_graph() method.
import tensorflow as tf
sess = tf.Session()
writer = tf.python.training.summary_io.SummaryWriter("/home/vagrant/notebook", sess.graph_def)
However, the page is still empty. How can I start playing with tensorboard?
current tensorboard
result wanted
An editable, empty graph to which nodes can be added.
update
It seems like tensorboard is unable to create a graph to which you can add nodes, drag and edit, etc. (I am confused by the official video).
Running https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py and then tensorboard --logdir=/home/vagrant/notebook/data lets me view the graph.
However, it seems like tensorboard only provides the ability to view summaries; there is nothing much beyond that to make it stand out.
TensorBoard is a tool for visualizing the TensorFlow graph and analyzing recorded metrics during training and inference. The graph is created using the Python API, then written out using the tf.train.SummaryWriter.add_graph() method. When you load the file written by the SummaryWriter into TensorBoard, you can see the graph that was saved, and interactively explore it.
However, TensorBoard is not a tool for building the graph itself. It does not have any support for adding nodes to the graph.
Starting from the following code example, I can add one line as shown below:
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession() #define a session
# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype("float32")
y_data = x_data * 0.1 + 0.3
# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but Tensorflow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b
# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
# Before starting, initialize the variables. We will 'run' this first.
init = tf.initialize_all_variables()
# Launch the graph.
sess = tf.Session()
sess.run(init)
#### ----> ADD THIS LINE <---- ####
writer = tf.train.SummaryWriter("/tmp/test", sess.graph)
# Fit the line.
for step in xrange(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))
# Learns best fit is W: [0.1], b: [0.3]
And then run tensorboard from the command line, pointing to the appropriate directory. This shows a complete call for the SummaryWriter. It is important to note the following things:
The SummaryWriter is passed the Session's graph, and so must be created after the Session (or InteractiveSession) is created.
That Session may be created early in the program, but when the graph is passed to the SummaryWriter, the graph as it exists at that point is written to the file that TensorBoard will use.
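If you keep adding ops after creating the writer, you can also attach the graph later via the writer's add_graph() method; a minimal sketch (depending on your TF version, add_graph may expect sess.graph or sess.graph_def):
writer = tf.train.SummaryWriter("/tmp/test")   # created early, without a graph
# ... define more ops ...
writer.add_graph(sess.graph)   # write out the graph as it exists now
writer.close()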
On this page, there is a very simple piece of code that you can use to test your installation: http://tensorflow.org/get_started
I included this line
tf.train.write_graph(sess.graph_def, '/home/daniel/Documents/Projetos/Prorum/ProgramasEmPython/TestingTensorFlow/fileGraph', 'graph.pbtxt')
After this "sess.run(init)"
This will generate a file that you have to upload to the "TensorBoard".
In order to open TensorBoard, supposing that it is installed on your computer (it must be if you installed TensorFlow with pip), I used the Ubuntu terminal and wrote:
tensorboard --logdir nameOfDirectory
Then, you should open your browser in Port 6006:
http://localhost:6006/
This will open TensorBoard. I went to the "Graph" menu and uploaded the file, which generated a figure of the graph.
So, what I have done is transfer the model I created in Python to TensorBoard. I believe that it is possible to create an empty one if no model is created (only the session is initiated). However, I am not sure if you are able to change this directly in TensorBoard.
I have answered this question before in Portuguese, with more details for Brazilian users. Maybe it can be useful for other people: http://prorum.com/index.php/1843/recentemente-plataforma-aprendizagem-primeira-impressao
I solved it on Windows with:
file_writer = tf.summary.FileWriter("output", sess.graph)
For that "output" directory, I opened a command prompt on Windows and typed:
tensorboard --logdir="C:\Users\kiran\machine Learning\output"
My mistake was on that line.
The graphs in TensorBoard do not show up if you are using Firefox. You have to install Chrome.
result wanted
An empty graph that can add nodes, editable.
I think you will find the Orange tool useful. It allows you to drag and drop various nodes and implement algorithms via a GUI.
I had to use
python -m tensorflow.tensorboard --logdir="C:\tmp\tensorflow\.."
Somehow tensorboard --logdir didn't work.
My environment:
OS: Windows 7, Python 3.5, and Tensorflow 1.1.0