I used the "TensorFlow for Poets" tutorial a few years ago to create an image classifier. It works very well and I've been using it regularly ever since.
Today I tried to migrate my image classifier to a new Docker environment, but that environment runs TensorFlow 2, so my script breaks.
Can anyone help me upgrade this well-known tutorial script to TensorFlow 2?
directory = '/imageFolder'
# Tensorflow labels
label_lines = [line.rstrip() for line in tf.gfile.GFile('/tf_files/retrained_labels.txt')]
# Unpersists graph from file
with tf.gfile.FastGFile('/tf_files/retrained_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    # Count the folders
    def fcount(path, map = {}):
        count = 0
        for f in os.listdir(path):
            child = os.path.join(path, f)
            if os.path.isdir(child):
                child_count = fcount(child, map)
                count += child_count + 1 # unless include self
        map[path] = count
        return count
    map = {}
    totalDirectories = fcount(directory, map)
    # Walk the directory
    for dirpath, dirnames, filenames in os.walk(directory):
        splicedDirpath = dirpath[len(directory):]
        print "Processing ", splicedDirpath
        counter = 0
        for name in filenames:
            if name.lower().endswith(('.jpg', '.jpeg', '.tiff')):
                print name
                image_data = tf.gfile.FastGFile(os.path.join(dirpath, name), 'rb').read()
                predictions = sess.run(softmax_tensor, \
                    {'DecodeJpeg/contents:0': image_data})
                # Sort to show labels of first prediction in order of confidence
                top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
                firstElt = top_k[0];
                for node_id in top_k:
                    human_string = label_lines[node_id]
                    score = predictions[0][node_id]
An updated version of TensorFlow for Poets is already available in this notebook.
If you need to migrate your code from TensorFlow 1.x to TensorFlow 2.x, you can use the compat module of TensorFlow 2.x: import tensorflow.compat.v1 and call tf.compat.v1.disable_v2_behavior().
You can also use the upgrade script provided by TensorFlow (tf_upgrade_v2) to assist with the migration; it also reports the parts of the code that need manual changes.
The guide to using the automatic upgrade script is in this link.
A more in-depth guide to migrating your code from TensorFlow 1.x to 2.x is in this link.
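As a rough illustration of the compat approach, here is a minimal sketch of how the script from the question might look on TensorFlow 2. It assumes the frozen graph still exposes the 'final_result:0' and 'DecodeJpeg/contents:0' tensors used in the question. The function names find_images, rank_labels and classify_folder are made up for this example and are not part of the tutorial. I have not run it against the original retrained graph, so treat it as a starting point rather than a drop-in replacement.

```python
import os

# Same extensions the question filters on.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.tiff')


def find_images(directory, extensions=IMAGE_EXTENSIONS):
    """Return sorted paths of all files under `directory` with a matching extension."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            if name.lower().endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def rank_labels(scores, label_lines):
    """Pair each label with its score, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(label_lines[i], float(scores[i])) for i in order]


def classify_folder(directory, graph_path, labels_path):
    """Classify every image under `directory` with a TF1 frozen graph on TF2."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    with tf.io.gfile.GFile(labels_path, 'r') as f:
        label_lines = [line.rstrip() for line in f]

    # Load the frozen GraphDef produced by the TF1 retraining script.
    graph_def = tf.GraphDef()
    with tf.io.gfile.GFile(graph_path, 'rb') as f:
        graph_def.ParseFromString(f.read())

    results = {}
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        with tf.Session(graph=graph) as sess:
            # Tensor names are taken from the question; adjust if your graph differs.
            softmax_tensor = graph.get_tensor_by_name('final_result:0')
            for path in find_images(directory):
                with tf.io.gfile.GFile(path, 'rb') as f:
                    image_data = f.read()
                predictions = sess.run(
                    softmax_tensor, {'DecodeJpeg/contents:0': image_data})
                results[path] = rank_labels(predictions[0], label_lines)
    return results
```

With the paths from the question, you would call classify_folder('/imageFolder', '/tf_files/retrained_graph.pb', '/tf_files/retrained_labels.txt') and get back a dict that maps each image path to its labels sorted by confidence.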
Related
I want to convert a Keras .h5 file to a TensorFlow .pb file, but something seems to be wrong with the resulting .pb file. My code is as follows:
network_eval = model.vggvox_resnet2d_icassp(input_dim=params['dim'],
                                            num_class=params['n_classes'],
                                            mode='eval', args=args)
path = 'XXX'
name = 'XXX.pb'
network_eval.load_weights(os.path.join(args.resume), by_name=True) # load model
# I use parts of the keras_to_tensorflow util, see https://github.com/amir-abdi/keras_to_tensorflow
orig_output_node_names = [node.op.name for node in network_eval.outputs]
# I do not change the output_nodes_prefix, so
converted_output_node_names = orig_output_node_names
sess = K.get_session()
constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), converted_output_node_names)
graph_io.write_graph(constant_graph, path, name, as_text=False)
The .pb file was generated successfully, but the outputs predicted for the test file with the .pb model differ from those of the original .h5 Keras model. The test code is shown below:
# using .h5 model
for spec in specs: # specs is the sliced magnitude spectrum of a .wav file for predicting
    spec = np.expand_dims(np.expand_dims(spec, 0), -1)
    v_1 = network_eval.predict(spec)
    print(v_1)
# using .pb model
sess = tf.Session()
with gfile.FastGFile('XXX.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    sess.graph.as_default()
    tf.import_graph_def(graph_def, name='')
sess.run(tf.global_variables_initializer())
# 'lambda_1/l2_normalize' is the name of the last layer of the network
# I got this name by printing network_eval.output.name
op = sess.graph.get_tensor_by_name('lambda_1/l2_normalize:0')
x = sess.graph.get_tensor_by_name('input:0')
for spec in specs:
    spec = np.expand_dims(np.expand_dims(spec, 0), -1)
    v_2 = sess.run(op, feed_dict={x: spec, K.learning_phase(): 0})
    print(v_2)
As I said above, the printed results v_1 and v_2 are quite different even though they have the same shape. This confuses me, and I don't know which step went wrong. Can anyone help? I would be very grateful.
I've been trying to train a CNN written with the TensorFlow implementation of Keras. Training appears to get stuck when it hits the first epoch, although nvidia-smi shows that my GPUs are still using memory. No error messages or tracebacks are printed to the terminal either, which makes debugging a little tricky. I've also written this code using TF estimators and datasets, and that network didn't train when I left it overnight. I therefore don't think this is just a case of leaving the code to run for longer. It's probably something I've done, but it may also be due to an (allegedly fixed) bug described in the second link below.
At the moment, I'm also trying to track the training process using the "verbose" argument of model.fit() to see if anything is happening, but nothing appears in the terminal. Other people who hit this problem still seem to get a progress bar.
I've also tried logging with TensorBoard and saving model checkpoints. No checkpoints are being saved, and it looks like no TensorBoard graphs are being saved either.
Any ideas on what might be causing this?
Can't get past first epoch -- just hangs [Keras Transfer Learning Inception]
Keras fit freezes at the end of the first epoch
import os
import tensorflow as tf
from tensorflow import keras
import cv2
import numpy as np
from tensorflow.python.framework.graph_util import convert_variables_to_constants
from tensorflow.python.keras import backend as K
cwd = os.getcwd()
log_dir = cwd + "/Keras_Model/"
callbacks = [keras.callbacks.ModelCheckpoint(filepath="./Checkpoints/weights.{epoch:02d}-{val_loss:.2f}.hdf5"),
             keras.callbacks.TensorBoard(log_dir="./logs")]
def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    TAKEN FROM HERE: https://stackoverflow.com/questions/45466020/how-to-export-keras-h5-to-tensorflow-pb
    Freezes the state of a session into a pruned computation graph. Used later to save model as TF pb file.
    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    #param session The TensorFlow session to be frozen.
    #param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    #param output_names Names of the relevant graph outputs.
    #param clear_devices Remove the device directives from the graph for better portability.
    #return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                      output_names, freeze_var_names)
        return frozen_graph
### IMPORT TRAINING IMAGES AS NUMPY ARRAY ###
t_dir = cwd + "/data-1/training/"
e_dir = cwd + "/data-1/evaluation"
xtrain = []
ytrain = []
print(" - Collating training data and labels... - ")
for subdir, dirs, files in os.walk(t_dir):
    for f in files:
        img = os.path.join(subdir, f)
        x = cv2.imread(img) # --> Produces 8-bit tensor from image file.
        y = int(img.split("/")[-2]) - 1 # --> Get label from file path.
        xtrain.append(x)
        ytrain.append(y)
data = np.asarray(xtrain)
print(" - Training data collated. - ")
labels = np.asarray(ytrain)
print(" - Training labels collated. - ")
### IMPORT EVALUATION IMAGES AS TF ITERATOR ###
xeval = []
yeval = []
print(" - Collating validation data and labels... - ")
for subdir, dirs, files in os.walk(e_dir):
    for f in files:
        img = os.path.join(subdir, f)
        x = cv2.imread(img) # --> Produces 8-bit tensor from image file.
        y = int(img.split("/")[-2]) - 1 # --> Get label from file path.
        xeval.append(x)
        yeval.append(y)
val_data = np.asarray(xeval)
print(" - Validation data collated. - ")
val_labels = np.asarray(yeval)
print(" - Validation labels collated. - ")
### CREATE MODEL ###
model = keras.Sequential()
model.add(keras.layers.Conv2D(filters=32, kernel_size=5, strides=1, padding="same", data_format = "channels_last", activation="relu", input_shape= (480,640,3)))
model.add(keras.layers.GlobalMaxPool2D(data_format = "channels_last"))
model.add(keras.layers.Dense(64, activation="relu"))
model.add(keras.layers.Dropout(0.4)) # --> Change dropout rate here.
model.add(keras.layers.Dense(8, activation="softmax"))
model.compile(optimizer=tf.train.AdamOptimizer(0.001), # --> Choose learning rate here.
              loss=keras.losses.sparse_categorical_crossentropy,
              metrics=[keras.metrics.categorical_accuracy])
print(" - Model created... - ")
print(" - Model Summary - ")
model.summary() # --> Print model summary.
### TRAIN AND EVALUATE MODEL ###
print(" - Training model... - ")
model.fit(data, labels, epochs = 5, batch_size=32, callbacks=callbacks, validation_data=(val_data, val_labels), verbose = 2)
print(" - Model trained! - ")
### SAVE MODEL AS H5 AND PB FILES ###
model.save("./Keras_Model/model.h5", save_format="h5")
print(" - Saved model as h5. - ")
frozen_graph = freeze_session(K.get_session(), output_names=[out.op.name for out in model.outputs])
tf.train.write_graph(frozen_graph, "./Tensorflow_Model/", "model.pb", as_text=False)
print(" - Saved model as pb. - ")
print(" - Clearing session. - ")
keras.clear_session()
I can also provide the version where I use TF datasets and estimators, or anything else that would help. Apologies if I've left anything obvious out; I've just started using SO.
UPDATE: I went home last night and ran this script on my computer - it seems to work so clearly this is not a usage issue, but probably either a problem with TF itself or the way it's been configured on our server. It's a bit bizarre because TF was working at some point previously, but what can you do. Cheers all.
I am trying to generate an eight-bit quantized graph for a custom LSTM model using TransformGraph. The graph import works fine if I only apply quantize_weights. Once quantize_nodes is also applied, importing fails with the error below:
ValueError: Specified colocation to an op that does not exist during import: lstm1/lstm1/BasicLSTMCellZeroState/zeros in lstm1/lstm1/cond/Switch_2
The code snippet I am using for quantizing is listed below:
from tensorflow.tools.graph_transforms import TransformGraph
import tensorflow as tf
input_names = ["inp/X"]
output_names = ["out/Softmax"]
#transforms = ["quantize_weights", "quantize_nodes"]
#transforms = ["quantize_weights"]
transforms = ["add_default_attributes",
              "strip_unused_nodes",
              "remove_nodes(op=Identity, op=CheckNumerics)",
              #"fold_constants(ignore_errors=true)",
              "fold_batch_norms",
              "fold_old_batch_norms",
              "quantize_weights",
              "quantize_nodes",
              "sort_by_execution_order"]
#output_graph_path="/tmp/fixed.pb"
output_graph_path="/tmp/output_graph.pb"
with tf.Graph().as_default():
    output_graph_def = tf.GraphDef()
    with tf.Session() as sess:
        with open(output_graph_path, "rb") as f:
            output_graph_def.ParseFromString(f.read())
            _ = tf.import_graph_def(output_graph_def, name="")
        transformed_graph_def = TransformGraph(output_graph_def, input_names,
                                               output_names, transforms)
        tf.train.write_graph(transformed_graph_def, '/tmp', 'quantized.pb', as_text=False)
I also tried using quantize_graph.py, which always resulted in a KeyError as in https://github.com/tensorflow/tensorflow/issues/8025. I believe this code is no longer maintained. Can anyone please point out how to debug this issue?
I went through the TensorFlow for Poets tutorial and then classified my own images. This was all done in the TF Docker container provided. The model has a validation accuracy in the low to mid 90s. A separate file makes predictions for new images (below).
I copied the files 'retrained_labels_corn.txt' and 'retrained_graph_corn.pb', together with the file holding the code below, to a directory (and changed the file paths) to see if I could make predictions outside the Docker container. I made sure to pass a valid image path as the system argument, but it always predicts the same class with a probability above 97%. When I do the same thing inside the Docker container, everything works fine. I even tried pointing the labeling script at the exact same files that the Docker container uses, and I get the same result: it always predicts one class with a high degree of certainty.
What did I do wrong?
import tensorflow as tf, sys
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("/tf_files/retrained_labels_corn.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/tf_files/retrained_graph_corn.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor, \
             {'DecodeJpeg/contents:0': image_data})
    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))
I am on Ubuntu version 16.04 and TF 0.10
I am trying to classify a set of photos (<10000) using Tensorflow Inception v3.
I'm using the TensorFlow Docker installation (CPU binary image plus source code) on a MacBook Air with 4 GB of DDR3 RAM.
I've written a python script to spawn a subprocess to classify each image:
"""
classifier_spawner.py
Runs custom_classify_image.py in subprocess
"""
if __name__ == "__main__":
    import sqlite3
    import subprocess as sub
    from ast import literal_eval
    conn = sqlite3.connect("database_name.db")
    c = conn.cursor()
    c.execute("SELECT * FROM images")
    images = c.fetchall()
    for image in images:
        p = sub.Popen(["python", "custom_classify_image.py", "--image_file=image_dir/" + str(image[0]) + ".jpg"], stdout=sub.PIPE, stderr=sub.PIPE)
        output, errors = p.communicate()
        categories = literal_eval(output)
        for cat in categories:
            c.execute("""INSERT OR IGNORE INTO classification (image_id, class_id, probability) VALUES(?,?,?)""", (str(image[0]), str(cat[0]), float(cat[1])))
        conn.commit()
    print("Classification complete, exiting.")
    conn.close()
I have amended run_inference_on_image(image) in classify_image.py as provided with TensorFlow to print a python list of tuples of the top 5 classifications, as shown.
"""My custom run_inference_on_image(image) """
def run_inference_on_image(image):
    """Runs inference on an image.
    Args:
      image: Image file name.
    Returns:
      Nothing
    """
    if not tf.gfile.Exists(image):
        tf.logging.fatal('File does not exist %s', image)
    image_data = tf.gfile.FastGFile(image, 'rb').read()
    # Creates graph from saved GraphDef.
    """Creates a graph from saved GraphDef file and returns a saver."""
    # Creates graph from saved graph_def.pb.
    with tf.gfile.FastGFile(os.path.join(
            FLAGS.model_dir, 'classify_image_graph_def.pb'), 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(graph_def, name='')
    with tf.Session() as sess:
        # Some useful tensors:
        # 'softmax:0': A tensor containing the normalized prediction across
        #   1000 labels.
        # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
        #   float description of the image.
        # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
        #   encoding of the image.
        # Runs the softmax tensor by feeding the image_data as input to the graph.
        softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
        predictions = sess.run(softmax_tensor,
                               {'DecodeJpeg/contents:0': image_data})
        predictions = np.squeeze(predictions)
        # Creates node ID --> English string lookup.
        node_lookup = NodeLookup()
        top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
        top_k_tup_list = []
        for node_id in top_k:
            human_string = node_lookup.id_to_string(node_id)
            score = predictions[node_id]
            top_k_tup_list.append((human_string, score))
        sess.close()
    print(top_k_tup_list)
My issue is that when I run classifier_spawner.py, the memory used by each subprocess is not freed after the subprocess has completed, and after ~10 classifications my disk is full (after writing ~15 GB to disk). I then have to delete my virtual machine.
I don't understand why this is happening. Shouldn't the virtual memory used by the subprocess be freed after (1) the TensorFlow session has ended and (2) the process has exited?
Thanks in advance, if you need any clarification please let me know.